KVM Architecture
KVM transforms the Linux kernel into a type-1 hypervisor by leveraging hardware virtualization extensions. Understanding KVM's architecture is essential for tuning performance and troubleshooting effectively.
Core Components
KVM Kernel Module
The KVM kernel modules provide the core virtualization infrastructure: kvm.ko contains the architecture-independent code, while kvm-intel.ko or kvm-amd.ko supplies the vendor-specific support for Intel VT-x or AMD-V, respectively. Both the generic module and one vendor module must be loaded.
Hardware Requirements
- CPU Extensions: Intel VT-x or AMD-V required
- EPT/NPT: Extended/Nested Page Tables for memory virtualization
- IOMMU: VT-d/AMD-Vi for device passthrough
- Check support (vmx = Intel VT-x, svm = AMD-V):
grep -E 'vmx|svm' /proc/cpuinfo
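Beyond the CPU flags, a working KVM host also needs the modules loaded and the /dev/kvm device node present. A fuller host-side check (run on the host; output depends on the machine):

```shell
# Count logical CPUs advertising VT-x (vmx) or AMD-V (svm);
# a nonzero count confirms hardware support.
grep -cE 'vmx|svm' /proc/cpuinfo

# Confirm the modules are loaded: kvm plus kvm_intel or kvm_amd.
lsmod | grep -w kvm

# /dev/kvm must exist and be readable/writable by the user
# (or group, typically "kvm") that launches VMs.
ls -l /dev/kvm
```

If /dev/kvm is missing despite the CPU flags being present, virtualization is usually disabled in the BIOS/UEFI firmware.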
Memory Management
Memory Virtualization
KVM uses EPT (Intel) or NPT (AMD) for hardware-assisted memory virtualization. Guest addresses are translated in two stages: the guest's own page tables map guest-virtual to guest-physical addresses, then EPT/NPT maps guest-physical to host-physical, eliminating the software-maintained shadow page tables that older hypervisors required.
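Whether the host CPU supports these second-stage page tables can be confirmed from its feature flags:

```shell
# 'ept' appears in /proc/cpuinfo on Intel CPUs with Extended Page
# Tables; 'npt' on AMD CPUs with Nested Page Tables.
grep -woE 'ept|npt' /proc/cpuinfo | sort -u
```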
Memory Overcommit Techniques
- KSM (Kernel Samepage Merging): Deduplicates identical memory pages across VMs
- Transparent Huge Pages: Uses 2MB pages to reduce TLB pressure
- Memory Ballooning: virtio-balloon driver reclaims unused guest memory
- zswap: Compressed in-memory cache for swap pages, reducing swap I/O under memory pressure
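KSM, for example, is controlled entirely through sysfs. A minimal sketch (root required; the paths are the standard kernel KSM interface, and the tuning value is illustrative):

```shell
# Enable the KSM scanner thread.
echo 1 > /sys/kernel/mm/ksm/run

# Scan more pages per wake-up for faster deduplication
# (at the cost of CPU time; 1000 is an illustrative value).
echo 1000 > /sys/kernel/mm/ksm/pages_to_scan

# Monitor effectiveness: pages_sharing counts deduplicated pages,
# pages_shared the unique pages backing them; a high
# pages_sharing/pages_shared ratio means KSM is paying off.
grep . /sys/kernel/mm/ksm/pages_sharing /sys/kernel/mm/ksm/pages_shared
```

Note that KSM only merges pages an application has marked mergeable (QEMU does this by default), and it trades CPU cycles for memory.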
CPU Virtualization
vCPU Scheduling
Each vCPU runs as a kernel thread on the host. The Linux CFS scheduler handles vCPU scheduling, allowing integration with cgroups for resource control.
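Because each vCPU is an ordinary host thread, standard process tools can inspect them. A sketch, assuming a running domain named myvm:

```shell
# Find the QEMU process for the VM (name "myvm" is an assumption).
pid=$(pgrep -f "qemu.*myvm" | head -1)

# List its threads; vCPU threads are named like "CPU 0/KVM",
# alongside the main emulator and I/O threads.
ps -T -p "$pid" -o spid,comm
```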
CPU Pinning
Pin vCPUs to specific physical CPUs for consistent performance:
virsh vcpupin myvm 0 0-3
virsh vcpupin myvm 1 4-7
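The same pinning can be made persistent in the domain XML (edited with virsh edit myvm); this snippet mirrors the two commands above:

```xml
<!-- Domain XML fragment: pin vCPU 0 to host CPUs 0-3
     and vCPU 1 to host CPUs 4-7. -->
<cputune>
  <vcpupin vcpu="0" cpuset="0-3"/>
  <vcpupin vcpu="1" cpuset="4-7"/>
</cputune>
```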
I/O Virtualization
virtio Framework
virtio provides paravirtualized device drivers that avoid the cost of emulating real hardware. Guests running virtio drivers achieve near-native I/O performance, far beyond fully emulated devices such as e1000 NICs or IDE disks.
virtio Devices
- virtio-net: Network interface (10Gbps+ throughput possible)
- virtio-blk: Block storage (lower CPU overhead than virtio-scsi)
- virtio-scsi: SCSI controller (supports more devices, TRIM/discard)
- virtio-gpu: Graphics acceleration
- virtio-rng: Entropy source for better randomness
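A typical libvirt domain fragment wiring in virtio-blk storage and a virtio-net interface might look like the following (the image path and bridge name br0 are placeholders):

```xml
<!-- virtio-blk disk: the guest sees /dev/vda. -->
<disk type="file" device="disk">
  <driver name="qemu" type="qcow2"/>
  <source file="/var/lib/libvirt/images/myvm.qcow2"/>
  <target dev="vda" bus="virtio"/>
</disk>

<!-- virtio-net NIC attached to an existing host bridge. -->
<interface type="bridge">
  <source bridge="br0"/>
  <model type="virtio"/>
</interface>
```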
Device Passthrough (VFIO)
VFIO allows assigning PCI devices directly to VMs with IOMMU protection. Common use cases include GPU passthrough for machine learning or gaming.
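A minimal sketch of binding a device to vfio-pci through sysfs, assuming a hypothetical PCI address 0000:01:00.0 (root required, and every device in the same IOMMU group must be assigned together):

```shell
# 1. Inspect IOMMU groups to see which devices are isolated together.
find /sys/kernel/iommu_groups/ -type l | sort

# 2. Tell the kernel to prefer vfio-pci for this device,
#    detach it from its current driver, and re-probe.
echo vfio-pci > /sys/bus/pci/devices/0000:01:00.0/driver_override
echo 0000:01:00.0 > /sys/bus/pci/devices/0000:01:00.0/driver/unbind
echo 0000:01:00.0 > /sys/bus/pci/drivers_probe
```

libvirt performs this rebinding automatically when a domain uses a managed `<hostdev>` entry; the manual sequence is mainly useful for debugging and for persistent binding at boot.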
Storage Architecture
Storage Backend Options
- Raw disk images: Maximum performance, but no snapshots, compression, or thin provisioning
- qcow2: Copy-on-write, snapshots, compression, thin provisioning
- LVM volumes: Native snapshots, thin provisioning
- Ceph RBD: Distributed block storage
- NFS/iSCSI: Network-based shared storage
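Creating and inspecting a qcow2 backend is done with qemu-img (the image path is illustrative):

```shell
# Create a 20 GiB thin-provisioned qcow2 image; space is
# allocated on the host only as the guest writes data.
qemu-img create -f qcow2 /var/lib/libvirt/images/myvm.qcow2 20G

# Inspect it: "virtual size" is the guest-visible capacity,
# "disk size" the space actually consumed on the host.
qemu-img info /var/lib/libvirt/images/myvm.qcow2
```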
Networking Architecture
Network Modes
- Bridge: VMs on same network as host, full L2 connectivity
- NAT: VMs behind NAT, libvirt manages DHCP and routing
- Macvtap: Direct attachment to a physical interface without a software bridge (note: host-to-guest traffic is blocked in the default bridge mode)
- SR-IOV: Hardware VF assignment for maximum throughput
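Bridge mode can be set up with iproute2. A minimal sketch assuming interface names br0 and eth0 (root required; enslaving eth0 interrupts its connectivity until addressing is moved to the bridge):

```shell
# Create a Linux bridge and bring it up.
ip link add br0 type bridge
ip link set br0 up

# Enslave the physical NIC; the host's IP configuration
# should then be moved from eth0 to br0.
ip link set eth0 master br0
```

VMs then attach to br0 via a libvirt `<interface type="bridge">` definition and appear as peers on the physical L2 segment.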
Performance Considerations
- Use virtio drivers for all devices in production
- Enable huge pages for memory-intensive workloads
- Match the guest CPU topology to the host and keep vCPUs and memory on one NUMA node to avoid cross-node penalties
- Use multiqueue virtio-net for high-bandwidth networking
- Consider PCI passthrough for GPUs and high-performance NICs
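The huge-page recommendation above reduces to simple arithmetic: reserve enough 2 MiB pages to back the guest's RAM. A sketch, assuming a hypothetical 8 GiB guest (the sysfs write requires root and is shown commented out):

```shell
guest_mb=8192        # assumed guest memory size in MiB
hugepage_kb=2048     # 2 MiB huge pages

# Pages needed = guest memory / huge page size.
pages=$(( guest_mb * 1024 / hugepage_kb ))
echo "$pages"        # → 4096

# Reserve them on the host (root required):
# echo "$pages" > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
```

The domain must also opt in with a `<memoryBacking><hugepages/></memoryBacking>` element in its libvirt XML for the reservation to be used.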