Securing the Kubernetes Host Operating System

Rafael Natali
Feb 19

Protecting the Kubernetes host from the containers running on it is a crucial part of Kubernetes security. If an attacker manages to compromise a container or a Pod, there are several attack paths they could exploit to compromise the host. If the host operating system is breached, the attacker can target all the Pods and applications running on that node, use the node to attack other nodes in the cluster, and eventually reach other systems in your network. The following sections cover the settings you need to understand to secure the host operating system.

Operating System Namespaces

Containers rely on operating system namespaces for isolation. These are distinct from Kubernetes namespaces: in this context, namespaces refer to a Linux kernel feature that container technologies use to isolate a container from other containers running on the same machine and from the host operating system.

On a Kubernetes node, normal applications running directly on the host operating system use the host namespaces. Containers, on the other hand, run in a completely separate container namespace. This setup isolates containers from one another and from the host operating system.

Figure 1.1 - Host operating system namespaces

As a result of this configuration, if a container is compromised, the attacker's activities are limited to the container namespace. This isolation restricts the attacker's ability to compromise the host or other containers.

This background matters because you can configure Pods to use the host namespaces instead of the isolated container namespace. You might want to do that if you have a container that needs to interact directly with the host operating system. However, it should only be done when absolutely necessary because of the security risks involved. If that container is compromised, the attacker has more ability to interact with components of the host operating system, because the isolation of the container namespace is no longer present.

This is the Pod configuration that allows it to interact with the host operating system:

apiVersion: v1
kind: Pod
metadata:
  name: host-pod
spec:
  hostIPC: true
  hostNetwork: true
  hostPID: true
  containers:
  - name: nginx
    image: nginx

When spec.hostIPC is set to true, the containers in the Pod use the host's inter-process communication (IPC) namespace. IPC is a Linux feature that allows processes to communicate with each other. Usually, containers use a separate IPC namespace, so they cannot use these mechanisms to communicate with processes on the host operating system. The spec.hostNetwork setting does the same for the network namespace: the Pod shares the host's network interfaces and ports. The spec.hostPID setting makes the containers use the host's process ID namespace, so they can see the processes running on the host. Each of these settings instructs the containers to use the host namespace in that particular area. All three default to false, so if you don't specify them, the containers use the isolated namespaces.
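If a workload genuinely needs one host namespace, it does not have to take all three. The manifest below is a minimal sketch, assuming a hypothetical Pod that only needs the host network: it enables hostNetwork and states the other two settings explicitly as false, which is also their default. The Pod name is illustrative.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: host-network-only-pod   # hypothetical name for illustration
spec:
  hostNetwork: true   # share the host's network namespace (the only one assumed to be needed)
  hostPID: false      # default: keep the isolated process ID namespace
  hostIPC: false      # default: keep the isolated IPC namespace
  containers:
  - name: nginx
    image: nginx
```

Writing the false values out is optional, but it makes the intent visible to anyone reviewing the manifest.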

Avoiding Pods running in privileged mode

Another concept to be aware of when protecting the host operating system is privileged mode. Privileged mode provides the broadest possible level of permissions: a privileged container can escalate privileges and access host-level resources, much like a process running directly on the host.

The following diagram shows a non-privileged and a privileged container running on the same host. If the non-privileged container is compromised, the attacker is limited in what they can do. A compromised privileged container, on the other hand, gives the attacker access to those host-level resources.

Figure 1.2 - Privileged and non-privileged containers

To create a privileged container, we set the field spec.containers[*].securityContext.privileged to true in the Pod definition:

apiVersion: v1
kind: Pod
metadata:
  name: privileged-pod
spec:
  containers:
  - name: nginx
    image: nginx
    securityContext:
      privileged: true
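For comparison, here is a sketch of the opposite configuration, which keeps the container non-privileged and also blocks privilege escalation. It assumes the same nginx image as above, and the Pod name is hypothetical. The allowPrivilegeEscalation field belongs to the container securityContext and controls whether a process can gain more privileges than its parent process.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: non-privileged-pod   # hypothetical name for illustration
spec:
  containers:
  - name: nginx
    image: nginx
    securityContext:
      privileged: false                # the default, stated explicitly
      allowPrivilegeEscalation: false  # processes cannot gain more privileges than their parent
```

According to the Kubernetes documentation, allowPrivilegeEscalation is always effectively true for a privileged container, so the two settings only make sense together in this non-privileged form.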

Studying Linux capabilities

A third concept to study is Linux capabilities. With Linux capabilities, you can give specific privileges to a process without granting all the privileges of the root user. By default, containers run with a default set of capabilities defined by the container runtime, which is sufficient for most scenarios. Following the principle of least privilege, it's advisable to drop all capabilities in the Pod definition and keep only those that are necessary. Granting a Pod more permissions than it requires could expose your host operating system to security threats.

To explicitly drop all capabilities of a container, set spec.containers[*].securityContext.capabilities.drop to ALL:

apiVersion: v1
kind: Pod
metadata:
  name: capabilities-pod
spec:
  containers:
  - image: busybox
    name: busybox
    command:
      - sleep
      - "3600"
    securityContext:
      capabilities:
        drop:
          - ALL
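When a container does need a specific capability, it can be added back after dropping everything else. The following sketch assumes a hypothetical workload that must bind to a port below 1024 and therefore needs NET_BIND_SERVICE. The Pod name and the choice of capability are assumptions for illustration only.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: minimal-capabilities-pod   # hypothetical name for illustration
spec:
  containers:
  - image: busybox
    name: busybox
    command:
      - sleep
      - "3600"
    securityContext:
      capabilities:
        drop:
          - ALL                # start from no capabilities at all
        add:
          - NET_BIND_SERVICE   # assumed requirement: bind to ports below 1024
```

Note that Kubernetes capability names are written without the CAP_ prefix used in the Linux man pages.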

Running containers as non-root users

The final concept to review is the user a container runs as. Containers may run as any Linux user. A container that runs as the root user (runAsUser: 0) can, if compromised, likely be used to gain root access to the host operating system. Again, with root access to the host, an attacker can take control of the entire Kubernetes cluster or reach other systems in the company.

This is the Pod configuration to mitigate the risk of running containers as root:

apiVersion: v1
kind: Pod
metadata:
  name: non-root-pod
spec:
  containers:
  - image: busybox
    name: busybox
    command:
      - sleep
      - "3600"
    securityContext:
      runAsNonRoot: true
      runAsUser: 1000
      runAsGroup: 1000

With the field spec.containers[*].securityContext.runAsNonRoot set to true, the container is required to run as a non-root user: if it would run as UID 0, it is not started. The other two fields, spec.containers[*].securityContext.runAsUser and spec.containers[*].securityContext.runAsGroup, define the user ID and group ID that the container's process runs with. To make sure that the application runs as a non-root user, in addition to the configuration in the Pod definition, set the user and group in the Dockerfile when building the Docker image for the application.
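The same fields can also be set once at the Pod level, under spec.securityContext, so that they apply to every container in the Pod. A value set on an individual container overrides the Pod-level one. A minimal sketch, assuming the same busybox workload and a hypothetical Pod name:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: non-root-pod-level   # hypothetical name for illustration
spec:
  securityContext:           # Pod-level: applies to all containers below
    runAsNonRoot: true
    runAsUser: 1000
    runAsGroup: 1000
  containers:
  - image: busybox
    name: busybox
    command:
      - sleep
      - "3600"
```

This can be convenient for Pods with several containers, since the non-root requirement does not have to be repeated for each one.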

Summary

Securing and isolating the host operating system is paramount to guarantee not only the Kubernetes cluster security but also the security of your company. Some points to remember:

Beware of Pod settings like hostIPC, hostPID, and hostNetwork, which remove namespace isolation
Avoid privileged mode, which is enabled when the field spec.containers[*].securityContext.privileged is set to true
Drop all Linux capabilities in the Pod definition unless they are necessary
Run containers as non-root users

Review the official documentation at https://kubernetes.io/docs/tasks/configure-pod-container/security-context/ and https://kubernetes.io/docs/concepts/security/pod-security-standards/ for more information.
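The Pod Security Standards linked above can also be applied per namespace through labels, so that the settings discussed in this article are checked automatically. The manifest below is a sketch only, assuming a cluster where the built-in Pod Security Admission controller is available and a hypothetical namespace called my-app:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: my-app   # hypothetical namespace name
  labels:
    # Reject Pods that violate the baseline profile
    pod-security.kubernetes.io/enforce: baseline
    # Only warn about Pods that fall short of the stricter restricted profile
    pod-security.kubernetes.io/warn: restricted
```

With enforce set to baseline, Pods that use host namespaces or privileged mode are rejected in this namespace. The restricted profile additionally covers running as non-root and dropping all capabilities; here it only produces a warning.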
