adamflott.com

Backend Technical Interview Questions - Virtualization

Description: Questions and answers to ask in an interview on virtualization
Authored: 2022-08-04;
Permalink: https://adamflott.com/interviewing/virtualization/
categories : interviewing;
tags : virtualization;

[1] What is the difference between a container and a VM?

From OS-level_virtualization

On Unix-like operating systems, this feature can be seen as an advanced implementation of the standard chroot mechanism, which changes the apparent root folder for the current running process and its children. In addition to isolation mechanisms, the kernel often provides resource-management features to limit the impact of one container's activities on other containers. Linux containers are all based on the virtualization, isolation, and resource management mechanisms provided by the Linux kernel, notably Linux namespaces and cgroups.

[1] What is the purpose of a hypervisor?

From Hypervisor

A hypervisor is computer software, firmware or hardware that allows partitioning the resources of a CPU among multiple operating systems or independent programs.

[1] What roles do KVM, QEMU, and LibVirt serve in a running VM?

From Kernel-based_Virtual_Machine

Kernel-based Virtual Machine (KVM) is a virtualization module in the Linux kernel that allows the kernel to function as a hypervisor.

QEMU is a free and open-source emulator. It emulates the machine's processor through dynamic binary translation and provides a set of different hardware and device models for the machine, enabling it to run a variety of guest operating systems. It can interoperate with Kernel-based Virtual Machine (KVM) to run virtual machines at near-native speed. QEMU can also do emulation for user-level processes, allowing applications compiled for one architecture to run on another.

libvirt is an open-source API, daemon and management tool for managing platform virtualization. It can be used to manage KVM, Xen, VMware ESXi, QEMU and other virtualization technologies. These APIs are widely used in the orchestration layer of hypervisors in the development of a cloud-based solution.

[1] How do you identify if the Linux kernel supports virtualization on hardware?

$ grep -E 'svm|vmx' /proc/cpuinfo

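vmx is the flag for Intel processors; svm is the flag for AMD processors. No output means the CPU does not expose hardware virtualization, or that it is disabled in firmware.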
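The same check can be scripted. Here is a minimal sketch in Python, assuming the x86 layout of /proc/cpuinfo where each logical CPU has a "flags" line; the function name virtualization_flags is made up for illustration.

```python
def virtualization_flags(cpuinfo_text: str) -> set:
    """Return the hardware virtualization flags (vmx, svm) found in cpuinfo text."""
    wanted = {"vmx", "svm"}
    found = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Only look at "flags" lines so a model name containing "vmx" cannot match.
        if sep and key.strip() == "flags":
            found.update(flag for flag in value.split() if flag in wanted)
    return found
```

On a Linux host you would pass it the contents of /proc/cpuinfo, for example virtualization_flags(open("/proc/cpuinfo").read()). An empty set corresponds to grep printing nothing.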

[2] What are some reasons you might not want to run containers?

strict security requirements
requiring dedicated hardware resources
the application is already a self-contained process that runs on the host OS, so a container does not ease deployment
wanting to rely on the host's security updates

[2] You are designing a stateful application. Would you choose virtualization or a container to run it?

[3] How are containers implemented in Linux?

From https://www.ianlewis.org/en/container-runtimes-part-1-introduction-container-r

Containers are implemented using Linux namespaces and cgroups. Namespaces let you virtualize system resources, like the file system or networking, for each container. Cgroups provide a way to limit the amount of resources like CPU and memory that each container can use. At the lowest level, container runtimes are responsible for setting up these namespaces and cgroups for containers, and then running commands inside those namespaces and cgroups. Low-level runtimes support using these operating system features.

[2] What are Linux namespaces?

From https://en.wikipedia.org/wiki/Linux_namespaces

Namespaces are a feature of the Linux kernel that partitions kernel resources such that one set of processes sees one set of resources while another set of processes sees a different set of resources. The feature works by having the same namespace for a set of resources and processes, but those namespaces refer to distinct resources. Resources may exist in multiple spaces. Examples of such resources are process IDs, hostnames, user IDs, file names, and some names associated with network access, and interprocess communication.
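One way to see this in practice is to look at /proc/<pid>/ns, where each entry is a symlink whose target identifies the namespace instance the process belongs to. The sketch below is only an illustration; the helper names namespace_ids and shared_namespaces are my own, and the directory is a parameter so it can be pointed at any process.

```python
import os


def namespace_ids(ns_dir: str = "/proc/self/ns") -> dict:
    """Map each namespace type to its identifier, e.g. {"uts": "uts:[4026531838]"}."""
    ids = {}
    for name in sorted(os.listdir(ns_dir)):
        # Each entry is a symlink; its target names the namespace instance.
        ids[name] = os.readlink(os.path.join(ns_dir, name))
    return ids


def shared_namespaces(a: dict, b: dict) -> list:
    """Return the namespace types for which two processes see the same instance."""
    return sorted(name for name in a if name in b and a[name] == b[name])
```

Two processes are in the same namespace of a given type when the identifiers match, so comparing namespace_ids("/proc/1/ns") with namespace_ids() from inside a container would typically show few or no shared entries. Reading another process's ns directory may require elevated privileges.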

[3] List the Linux namespace types

From https://en.wikipedia.org/wiki/Linux_namespaces

Since kernel version 5.6, there are 8 kinds of namespaces.

Mount (mnt)

Mount namespaces control mount points.

Process ID (pid)

The PID namespace provides processes with an independent set of process IDs (PIDs) from other namespaces. PID namespaces are nested, meaning when a new process is created it will have a PID for each namespace from its current namespace up to the initial PID namespace. Hence the initial PID namespace is able to see all processes, albeit with different PIDs than other namespaces will see processes with.
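The nesting is visible in the NSpid line of /proc/<pid>/status (available since Linux 4.1), which lists the process's ID in each PID namespace it belongs to, outermost first. A small sketch of reading it, with nested_pids as a made-up helper name:

```python
def nested_pids(status_text: str) -> list:
    """Return the PIDs from the NSpid line, outermost namespace first."""
    for line in status_text.splitlines():
        if line.startswith("NSpid:"):
            # Fields after the label are one PID per nested PID namespace.
            return [int(field) for field in line.split()[1:]]
    # Older kernels do not expose NSpid at all.
    return []
```

For a process that is PID 7 inside a container, the host's view of its status file might contain a line such as "NSpid: 12345 7". The numbers here are invented for illustration.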

Network (net)

Network namespaces virtualize the network stack. On creation a network namespace contains only a loopback interface.

Each network interface (physical or virtual) is present in exactly 1 namespace and can be moved between namespaces.

Each namespace will have a private set of IP addresses, its own routing table, socket listing, connection tracking table, firewall, and other network-related resources.

Destroying a network namespace destroys any virtual interfaces within it and moves any physical interfaces within it back to the initial network namespace.
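To check which interfaces the current network namespace contains, one option is to read /proc/net/dev, which lists only the interfaces visible to the reading process. This is a rough sketch assuming the usual layout of two header lines followed by one "name: counters" line per interface; interface_names is a hypothetical helper.

```python
def interface_names(net_dev_text: str) -> list:
    """Return the interface names listed in /proc/net/dev style text."""
    names = []
    # The first two lines are column headers, not interfaces.
    for line in net_dev_text.splitlines()[2:]:
        name, sep, _counters = line.partition(":")
        if sep:
            names.append(name.strip())
    return names
```

Run inside a freshly created network namespace, this would be expected to return just ["lo"], matching the description above.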

Interprocess Communication (ipc)

IPC namespaces isolate processes from SysV style inter-process communication. This prevents processes in different IPC namespaces from using, for example, the SHM family of functions to establish a range of shared memory between the two processes. Instead each process will be able to use the same identifiers for a shared memory region and produce two such distinct regions.

UTS (uts)

UTS (UNIX Time-Sharing) namespaces allow a single system to appear to have different host and domain names to different processes.

User ID (user)

User namespaces are a feature to provide both privilege isolation and user identification segregation across multiple sets of processes.

Control group (cgroup)

The cgroup namespace type hides the identity of the control group of which the process is a member.
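The effect shows up in /proc/<pid>/cgroup, where each line has the form hierarchy-ID:controller-list:cgroup-path. Inside its own cgroup namespace a process typically sees a path relative to the namespace root (often just "/") rather than the full host path. Below is a sketch of parsing that file; parse_proc_cgroup is a name chosen for this example.

```python
def parse_proc_cgroup(text: str) -> list:
    """Parse /proc/<pid>/cgroup text into a list of dicts, one per line."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split at most twice so colons inside the path are preserved.
        hierarchy_id, controllers, path = line.split(":", 2)
        entries.append({
            "hierarchy_id": int(hierarchy_id),
            # cgroup v2 lines have an empty controller list ("0::/path").
            "controllers": controllers.split(",") if controllers else [],
            "path": path,
        })
    return entries
```

On a cgroup v2 system there is a single entry with hierarchy ID 0; on cgroup v1 there is one entry per mounted hierarchy.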

Time (time)

The time namespace allows processes to see different system times in a way similar to the UTS namespace.

[2] What are Linux cgroups?

From https://en.wikipedia.org/wiki/Cgroups

cgroups (abbreviated from control groups) is a Linux kernel feature that limits, accounts for, and isolates the resource usage (CPU, memory, disk I/O, network, etc.) of a collection of processes.

Features:

Resource limiting - groups can be set to not exceed a configured memory limit, which also includes the file system cache
Prioritization - some groups may get a larger share of CPU utilization or disk I/O throughput
Accounting - measures a group's resource usage, which may be used, for example, for billing purposes
Control - freezing groups of processes, their checkpointing and restarting
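As a concrete example of resource limiting, cgroup v2 exposes limits as small text files in each group's directory: cpu.max holds "$MAX $PERIOD" (with "max" meaning no limit) and memory.max holds either a byte count or "max". The sketch below assumes cgroup v2 and only parses the file contents; the function names are my own.

```python
def parse_cpu_max(text: str):
    """Return the CPU limit as a number of CPUs, or None if unlimited."""
    fields = text.split()
    quota = fields[0]
    # The kernel's default period is 100000 microseconds.
    period = int(fields[1]) if len(fields) > 1 else 100000
    if quota == "max":
        return None
    return int(quota) / period


def parse_memory_max(text: str):
    """Return the memory limit in bytes, or None if unlimited."""
    value = text.strip()
    return None if value == "max" else int(value)
```

A container limited to half a CPU and 512 MiB of memory would be expected to show "50000 100000" in cpu.max and "536870912" in memory.max. The exact directory under /sys/fs/cgroup depends on the container runtime.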