Some Linux kernel hardening tools can be integrated with Kubernetes to control how Pods and containers interact with the host operating system. For example, we can restrict Pods from creating files or executing programs. This article will discuss two of these tools: AppArmor and seccomp. In the following subsections, we will use these tools to restrict some operations between Kubernetes and the host operating system.

AppArmor
AppArmor is a security kernel module for Linux that offers fine-grained access control for programs running on Linux systems. An AppArmor profile consists of rules that specify what a program is allowed or not allowed to do.
An AppArmor profile is loaded on a node and activated in one of two modes. The first mode is called complain. In this mode, AppArmor doesn't block anything. It only logs the actions that would violate the profile, which is useful for discovering what a Pod actually does. The second mode is enforce. When a profile is loaded in this mode, AppArmor will actively prevent the Pod from doing anything that the profile does not allow. Because profiles are loaded per node, an AppArmor profile must be loaded on every worker node where the Pod can be scheduled.
Applying an AppArmor profile to a Pod
Let’s create an AppArmor profile that denies all writes to disk and apply this profile to a Pod. On a worker node, create a file called k8s-deny-write with the following content:
#include <tunables/global>
profile k8s-deny-write flags=(attach_disconnected) {
  #include <abstractions/base>
  file,
  # Deny all file writes.
  deny /** w,
}
To activate an AppArmor profile, use the apparmor_parser command. By default, the profile is loaded in enforce mode. Use the -C flag to load it in complain mode.
>sudo apparmor_parser ./k8s-deny-write
To verify that the profile is loaded, run the aa-status command:
>sudo aa-status
apparmor module is loaded.
56 profiles are loaded.
52 profiles are in enforce mode.
...
k8s-deny-write
...
0 processes are unconfined but have a profile defined.
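When the same check has to be repeated on several nodes, it can help to parse the aa-status output instead of reading it by eye. The following is a minimal sketch, assuming the usual text layout in which each "N profiles are in <mode> mode." summary line is followed by indented profile names. The function name and the sample text are made up for this illustration.

```python
import re

# Matches summary lines such as "52 profiles are in enforce mode."
SECTION = re.compile(r"^\d+ profiles are in (\w+) mode\.$")

def profiles_by_mode(aa_status_output: str) -> dict:
    """Group profile names by mode from the text printed by aa-status."""
    modes = {}
    current = None
    for raw in aa_status_output.splitlines():
        if not raw.strip():
            continue
        if not raw[0].isspace():
            # Unindented lines are summary lines; only some open a section.
            match = SECTION.match(raw)
            current = match.group(1) if match else None
            if current is not None:
                modes[current] = []
        elif current is not None:
            # Indented lines are profile names belonging to the open section.
            modes[current].append(raw.strip())
    return modes

# Shortened, hypothetical sample of aa-status output.
SAMPLE = """apparmor module is loaded.
2 profiles are loaded.
2 profiles are in enforce mode.
   /usr/bin/man
   k8s-deny-write
0 profiles are in complain mode.
1 processes have profiles defined.
"""

loaded = profiles_by_mode(SAMPLE)
print("k8s-deny-write" in loaded["enforce"])  # True
```

The text layout of aa-status may differ between AppArmor versions, so treat the parsing rules as assumptions to verify on your own nodes.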
Using the following manifest file, we'll create a Pod that outputs a simple message:
apiVersion: v1
kind: Pod
metadata:
  name: hello-apparmor
spec:
  containers:
  - name: hello
    image: busybox:1.28
    command: [ "sh", "-c", "while true; do echo 'Hello AppArmor!' > /tmp/hello && cat /tmp/hello; sleep 10; done" ]
After creating the Pod, its logs show that the message was written:
>kubectl logs hello-apparmor -f
Hello AppArmor!
Hello AppArmor!
Note: Prior to Kubernetes v1.30, AppArmor was specified through annotations. Check documentation for previous versions at https://kubernetes.io/docs/tutorials/security/apparmor/
Now, we will configure the AppArmor profile in the Pod manifest file using the securityContext. The new manifest file is the following:
apiVersion: v1
kind: Pod
metadata:
  name: hello-apparmor
spec:
  securityContext:
    appArmorProfile:
      type: Localhost
      localhostProfile: k8s-deny-write
  containers:
  - name: hello
    image: busybox:1.28
    command: [ "sh", "-c", "while true; do echo 'Hello AppArmor!' > /tmp/hello && cat /tmp/hello; sleep 10; done" ]
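If you are migrating manifests written for older clusters, the annotation values mentioned in the note above map onto the appArmorProfile field in a regular way. The sketch below illustrates that mapping for the three annotation values documented by Kubernetes (runtime/default, unconfined and localhost/<profile>). The function name is hypothetical and this is not an official migration tool.

```python
def annotation_to_field(value: str) -> dict:
    """Translate a legacy AppArmor annotation value into an appArmorProfile dict.

    The legacy annotation key had the form
    container.apparmor.security.beta.kubernetes.io/<container name>.
    """
    if value == "runtime/default":
        return {"type": "RuntimeDefault"}
    if value == "unconfined":
        return {"type": "Unconfined"}
    prefix = "localhost/"
    if value.startswith(prefix) and len(value) > len(prefix):
        return {"type": "Localhost", "localhostProfile": value[len(prefix):]}
    raise ValueError(f"unsupported AppArmor annotation value: {value!r}")

print(annotation_to_field("localhost/k8s-deny-write"))
# {'type': 'Localhost', 'localhostProfile': 'k8s-deny-write'}
```

The resulting dictionary has the same shape as the appArmorProfile block in the manifest above.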
Now, delete the current Pod, apply the new manifest file and check the logs again:
>kubectl delete pods hello-apparmor
pod "hello-apparmor" deleted
>kubectl apply -f apparmor.yaml
pod/hello-apparmor created
>kubectl logs hello-apparmor -f
sh: can't create /tmp/hello: Permission denied
sh: can't create /tmp/hello: Permission denied
sh: can't create /tmp/hello: Permission denied
The hello-apparmor Pod is unable to create files because of the AppArmor profile. You can also verify that the profile was applied by executing:
>kubectl exec hello-apparmor -- cat /proc/1/attr/current
k8s-deny-write (enforce)
Here we can see that the k8s-deny-write profile was applied and it’s in enforce mode.
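The label printed by /proc/1/attr/current has a simple shape: the profile name followed by the mode in parentheses, or a bare word such as unconfined when no profile applies. A small sketch of splitting it for use in an automated check follows. The function name is an assumption made for this example.

```python
def parse_apparmor_label(label: str) -> tuple:
    """Split a label like 'k8s-deny-write (enforce)' into (name, mode)."""
    label = label.strip()
    if label.endswith(")") and " (" in label:
        # Drop the closing parenthesis, then split on the last " (".
        name, mode = label[:-1].rsplit(" (", 1)
        return name, mode
    # Labels such as "unconfined" carry no mode.
    return label, None

print(parse_apparmor_label("k8s-deny-write (enforce)"))
# ('k8s-deny-write', 'enforce')
```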
To finalise this AppArmor section, let me show what happens if we configure a Pod with a profile that hasn’t been loaded by creating the following Pod:
>kubectl create -f /dev/stdin <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: hello-apparmor-2
spec:
  securityContext:
    appArmorProfile:
      type: Localhost
      localhostProfile: k8s-apparmor-example-allow-write
  containers:
  - name: hello
    image: busybox:1.28
    command: [ "sh", "-c", "echo 'Hello AppArmor!' && sleep 1h" ]
EOF
pod/hello-apparmor-2 created
Verify the status of the Pod:
>kubectl get pods hello-apparmor-2
NAME               READY   STATUS                 RESTARTS   AGE
hello-apparmor-2   0/1     CreateContainerError   0          12s
Let’s use kubectl describe to investigate the error with the Pod:
>kubectl describe pods hello-apparmor-2
...
Warning Failed 79s (x12 over 3m15s) kubelet Error: failed to get container spec opts: failed to generate apparmor spec opts: apparmor profile not found k8s-apparmor-example-allow-write
...
In the Events section, we can clearly see that the AppArmor profile was not found. Therefore, a Pod will not start if it’s configured to use a profile that hasn’t been loaded on the node.
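One way to catch this situation before applying a manifest is to list the Localhost profiles a Pod refers to and compare them with the profiles known to be loaded on the nodes. The following is only a sketch: it works on a manifest already parsed into a Python dictionary, the helper name is hypothetical, and the set of loaded profiles is supplied by hand.

```python
def localhost_apparmor_profiles(pod: dict) -> set:
    """Collect the Localhost AppArmor profile names referenced by a Pod."""
    spec = pod.get("spec") or {}
    # The profile can be set at Pod level and overridden per container.
    contexts = [spec.get("securityContext") or {}]
    for key in ("initContainers", "containers"):
        for container in spec.get(key) or []:
            contexts.append(container.get("securityContext") or {})
    names = set()
    for context in contexts:
        profile = context.get("appArmorProfile") or {}
        if profile.get("type") == "Localhost":
            names.add(profile["localhostProfile"])
    return names

pod = {
    "spec": {
        "securityContext": {
            "appArmorProfile": {
                "type": "Localhost",
                "localhostProfile": "k8s-apparmor-example-allow-write",
            }
        },
        "containers": [{"name": "hello", "image": "busybox:1.28"}],
    }
}

loaded_on_node = {"k8s-deny-write"}  # assumed result of checking the node
missing = localhost_apparmor_profiles(pod) - loaded_on_node
print(missing)  # {'k8s-apparmor-example-allow-write'}
```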
Seccomp
Seccomp is a Linux kernel feature that allows you to restrict the system calls that applications can make. This is useful for enhancing the security of applications by limiting their ability to interact with the underlying operating system, thus reducing the potential attack surface. Seccomp operates by defining a filter that specifies which system calls are allowed and which are denied.
In Kubernetes, a seccomp profile can be attached to a Pod or to an individual container, so that the workload can make only the system calls it needs for its operation.
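To make the idea of a filter concrete, the following toy model shows how a profile with a default action and a list of per-syscall rules decides what happens to a call. It is a simplified illustration, not how the kernel implements seccomp: real profiles can also match on syscall arguments, and the example profile below is hypothetical.

```python
def action_for(profile: dict, syscall: str) -> str:
    """Return the action a seccomp-style profile assigns to a system call."""
    for rule in profile.get("syscalls", []):
        if syscall in rule.get("names", []):
            return rule["action"]
    # No rule matched, so the profile-wide default applies.
    return profile["defaultAction"]

# Hypothetical allow-list profile written only for this illustration.
example_profile = {
    "defaultAction": "SCMP_ACT_ERRNO",
    "syscalls": [
        {"names": ["read", "write", "exit_group"], "action": "SCMP_ACT_ALLOW"},
    ],
}

print(action_for(example_profile, "write"))  # SCMP_ACT_ALLOW
print(action_for(example_profile, "mount"))  # SCMP_ACT_ERRNO
```

A profile that contains only a default action, with no rules, applies that action to every system call.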
Creating example seccomp profiles
We will download three example seccomp profiles that we will use to test our Pods. These files must be present on every node where the Pods can run. The first profile is audit.json, which logs all system calls of a process. The second is violation.json, which does not allow any system call. The third is fine-grained.json, which allows only the system calls listed in its "action": "SCMP_ACT_ALLOW" block. To download the profiles and save them in the directory /var/lib/kubelet/seccomp/seccomp_profiles, execute the following commands:
>sudo mkdir -p /var/lib/kubelet/seccomp/seccomp_profiles
>cd /var/lib/kubelet/seccomp
>sudo curl -L -o seccomp_profiles/audit.json https://k8s.io/examples/pods/security/seccomp/profiles/audit.json
>sudo curl -L -o seccomp_profiles/violation.json https://k8s.io/examples/pods/security/seccomp/profiles/violation.json
>sudo curl -L -o seccomp_profiles/fine-grained.json https://k8s.io/examples/pods/security/seccomp/profiles/fine-grained.json
>ls seccomp_profiles/
audit.json fine-grained.json violation.json
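The directory matters because the localhostProfile value used in the Pod manifests is a relative path, which the kubelet resolves against its seccomp directory. The sketch below shows that mapping, assuming the default kubelet root directory of /var/lib/kubelet (it can be configured differently). The function name is made up for this example.

```python
import posixpath

# Default kubelet root directory; an assumption, since it is configurable.
KUBELET_ROOT = "/var/lib/kubelet"

def seccomp_profile_path(localhost_profile: str, kubelet_root: str = KUBELET_ROOT) -> str:
    """Map a localhostProfile value to the file expected on the node."""
    if posixpath.isabs(localhost_profile):
        raise ValueError("localhostProfile must be a relative path")
    return posixpath.join(kubelet_root, "seccomp", localhost_profile)

print(seccomp_profile_path("seccomp_profiles/audit.json"))
# /var/lib/kubelet/seccomp/seccomp_profiles/audit.json
```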
Creating a Pod that logs all system calls
Configure the audit.json profile in the .spec.securityContext field of a Pod. Here is a Pod manifest file that will use the seccomp profile:
apiVersion: v1
kind: Pod
metadata:
  name: audit-pod
  labels:
    app: audit-pod
spec:
  securityContext:
    seccompProfile:
      type: Localhost
      localhostProfile: seccomp_profiles/audit.json
  containers:
  - name: test-container
    image: hashicorp/http-echo:1.0
    args:
    - "-text=just made some syscalls!"
    securityContext:
      allowPrivilegeEscalation: false
Let’s create the Pod:
>kubectl apply -f audit.yaml
pod/audit-pod created
This profile doesn’t block any action, so the Pod will run without any problem. Now, log into the worker node where the Pod is running and check the syslog for the system calls that audit-pod is making:
>sudo tail -f /var/log/syslog | grep 'http-echo'
Jul 23 22:09:48 5600791c132c kernel: [ 2070.311535] audit: type=1326 audit(1721772588.695:700): auid=4294967295 uid=65532 gid=65532 ses=4294967295 subj=cri-containerd.apparmor.d pid=16021 comm="http-echo" exe="/http-echo" sig=0 arch=c000003e syscall=35 compat=0 ip=0x4685d7 code=0x7ffc0000
Jul 23 22:09:48 5600791c132c kernel: [ 2070.311662] audit: type=1326 audit(1721772588.695:701): auid=4294967295 uid=65532 gid=65532 ses=4294967295 subj=cri-containerd.apparmor.d pid=16021 comm="http-echo" exe="/http-echo" sig=0 arch=c000003e syscall=202 compat=0 ip=0x468ba3 code=0x7ffc0000
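The syscall field in these lines is a number, not a name. On x86_64 (arch=c000003e), 35 is nanosleep and 202 is futex. The sketch below extracts the number from a log line and looks it up in a small table that covers only these two calls. The function and table names are assumptions, and the sample line is a shortened copy of the first log line above.

```python
import re

# Tiny excerpt of the x86_64 system call table, covering only the two
# numbers seen in the log lines above.
X86_64_SYSCALLS = {35: "nanosleep", 202: "futex"}

# key=value pairs, where the value is either quoted or a run of non-spaces.
AUDIT_FIELD = re.compile(r'(\w+)=("[^"]*"|\S+)')

def syscall_of(line: str) -> tuple:
    """Return (number, name) for the syscall recorded in a seccomp audit line."""
    fields = {key: value.strip('"') for key, value in AUDIT_FIELD.findall(line)}
    number = int(fields["syscall"])
    return number, X86_64_SYSCALLS.get(number, "unknown")

sample = (
    'audit: type=1326 audit(1721772588.695:700): uid=65532 pid=16021 '
    'comm="http-echo" exe="/http-echo" sig=0 arch=c000003e syscall=35 '
    'compat=0 ip=0x4685d7 code=0x7ffc0000'
)
print(syscall_of(sample))  # (35, 'nanosleep')
```

For a full translation you would need the complete syscall table for the node's architecture, since the numbers differ between architectures.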
Creating a Pod with a profile that does not allow any system call
The violation.json profile does not allow any system call, so a Pod configured with it should fail to start. The following Pod uses the violation.json profile:
apiVersion: v1
kind: Pod
metadata:
  name: violation-pod
  labels:
    app: violation-pod
spec:
  securityContext:
    seccompProfile:
      type: Localhost
      localhostProfile: seccomp_profiles/violation.json
  containers:
  - name: test-container
    image: hashicorp/http-echo:1.0
    args:
    - "-text=just made some syscalls!"
    securityContext:
      allowPrivilegeEscalation: false
Now, we will create the Pod and verify its status:
>kubectl apply -f violation.yaml
pod/violation-pod created
>kubectl get pods violation-pod
NAME            READY   STATUS              RESTARTS     AGE
violation-pod   0/1     RunContainerError   2 (1s ago)   26s
Looking in the syslog, we will see an error indicating that the Pod is unable to start a process:
>sudo tail -f /var/log/syslog | grep 'http-echo'
Jul 23 22:28:14 5600791c132c kubelet[769]: E0723 22:28:14.220131 769 kuberuntime_manager.go:1256] container &Container{Name:test-container,Image:hashicorp/http-echo:1.0,Command:[],Args:[-text=just made some syscalls!],WorkingDir:,Ports:[]ContainerPort{},Env:[]EnvVar{},Resources:ResourceRequirements{Limits:ResourceList{},Requests:ResourceList{},Claims:[]ResourceClaim{},},VolumeMounts:[]VolumeMount{VolumeMount{Name:kube-api-access-lmsz5,ReadOnly:true,MountPath:/var/run/secrets/kubernetes.io/serviceaccount,SubPath:,MountPropagation:nil,SubPathExpr:,RecursiveReadOnly:nil,},},LivenessProbe:nil,ReadinessProbe:nil,Lifecycle:nil,TerminationMessagePath:/dev/termination-log,ImagePullPolicy:IfNotPresent,SecurityContext:&SecurityContext{Capabilities:nil,Privileged:nil,SELinuxOptions:nil,RunAsUser:nil,RunAsNonRoot:nil,ReadOnlyRootFilesystem:nil,AllowPrivilegeEscalation:*false,RunAsGroup:nil,ProcMount:nil,WindowsOptions:nil,SeccompProfile:nil,AppArmorProfile:nil,},Stdin:false,StdinOnce:false,TTY:false,EnvFrom:[]EnvFromSource{},TerminationMessagePolicy:File,VolumeDevices:[]VolumeDevice{},StartupProbe:nil,ResizePolicy:[]ContainerResizePolicy{},RestartPolicy:nil,} start failed in pod violation-pod_default(9743bb58-804b-40c8-9775-babfa69e80d1): RunContainerError: failed to start containerd task "28a0542f2cb20f3747de7518a1bafeeb0c470bd0cbac07482f4bca82f1b32279": cannot start a stopped process: unknown
As expected, the seccomp profile prevented the Pod from executing any system call.
Creating a Pod with a profile that allows the required system calls
For the image http-echo to run successfully, it needs to have permission to perform some system calls. The fine-grained.json profile allows the required system calls. The following is a Pod using the fine-grained.json profile:
apiVersion: v1
kind: Pod
metadata:
  name: fine-grained-pod
  labels:
    app: fine-grained-pod
spec:
  securityContext:
    seccompProfile:
      type: Localhost
      localhostProfile: seccomp_profiles/fine-grained.json
  containers:
  - name: test-container
    image: hashicorp/http-echo:1.0
    args:
    - "-text=just made some syscalls!"
    securityContext:
      allowPrivilegeEscalation: false
Apply the Pod and check its status:
>kubectl apply -f fine-grained.yaml
pod/fine-grained-pod created
>kubectl get pods fine-grained-pod
NAME               READY   STATUS    RESTARTS   AGE
fine-grained-pod   1/1     Running   0          67s
As the profile allows all system calls necessary for the image, there is no error log in the syslog and the Pod is running successfully.
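A profile of this kind can be drafted from a list of system call names, for example the ones observed while running the workload under the audit profile. The following sketch builds such an allow-list in the JSON layout used by the example profiles (defaultAction, architectures, syscalls). The function name and the short list of syscalls are hypothetical, and a real workload such as http-echo needs many more calls than shown here.

```python
import json

def build_allowlist_profile(syscall_names, architectures=("SCMP_ARCH_X86_64",)) -> dict:
    """Build a seccomp profile that denies everything except the given calls."""
    return {
        # Any system call not listed below fails with an error.
        "defaultAction": "SCMP_ACT_ERRNO",
        "architectures": list(architectures),
        "syscalls": [
            {"names": sorted(set(syscall_names)), "action": "SCMP_ACT_ALLOW"},
        ],
    }

# Hypothetical input; duplicates are removed and names are sorted.
profile = build_allowlist_profile(["futex", "nanosleep", "futex"])
print(json.dumps(profile, indent=2))
```

A profile generated this way should be tested against the workload before relying on it, because a missing system call only shows up when the corresponding code path runs.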
Summary
Using Linux kernel hardening tools to protect the host operating system improves isolation and reduces the attack surface of your Kubernetes cluster. Some points to remember:
· Before Kubernetes v1.30, AppArmor profiles were configured in the Pod using annotations. From v1.30 onwards, they are configured in the securityContext fields, as seccomp profiles have been since v1.19.
· AppArmor and seccomp profiles must be present on every node where the Pods can be scheduled.
· Review the Kubernetes documentation about using AppArmor and seccomp available at https://kubernetes.io/docs/tutorials/security/