Chapter 13: Virtualization & Cloud Computing — Introduction to Digital Computation

The Problem Virtualization Solves

Before virtualization, the standard practice was to dedicate a physical server to a single application. The reasoning was sound: mixing workloads on one machine meant that one runaway process could starve everything else of resources, and a crashed application could destabilize the entire server. Isolation required physical separation. The unintended consequence was data centers full of mostly-idle machines — servers running at 8%, 12%, 15% CPU utilization around the clock, each drawing full power and generating full heat regardless of how little they were actually doing.

The economics were painful. A server might cost $10,000 to purchase and $3,000 per year in power and cooling to operate, yet spend 90% of its time waiting for work. Provisioning a new server meant submitting a purchase request, waiting weeks for delivery, physically racking and cabling the hardware, installing and configuring an OS, and finally deploying the application — a process that could take a month. IT teams could not respond to business needs at the speed business moved. Virtualization solved both problems at once.

Virtual Machines

A virtual machine (VM) is a software-defined computer — a complete simulated environment that includes virtual CPU cores, virtual RAM, virtual storage, and virtual network interfaces. From the operating system's perspective, the VM looks like real hardware. It boots, runs an OS, executes applications, and crashes, just like a physical machine. The critical difference is that it shares the underlying physical hardware with other VMs, all running simultaneously on the same server.

The software layer that makes this possible is the hypervisor. The hypervisor sits between the physical hardware and the virtual machines and is responsible for creating VMs, allocating physical resources (CPU time, RAM pages, storage I/O) among them, and enforcing strict isolation so that one VM cannot read or interfere with another's memory or processes. A physical server with 64 CPU cores and 512 GB of RAM might comfortably run 20–40 VMs, each believing it has exclusive access to its assigned slice. What used to require 20 physical servers now fits on 2, with utilization rates well above 70%.
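The resource bookkeeping described above can be illustrated with a toy placement routine. This is a minimal sketch under stated assumptions, not how any real hypervisor schedules work: the Host and VM classes and the place_first_fit function are hypothetical names, and cores and RAM are treated as fixed reservations with no overcommit.

```python
from dataclasses import dataclass, field


@dataclass
class VM:
    name: str
    cores: int
    ram_gb: int


@dataclass
class Host:
    name: str
    cores: int
    ram_gb: int
    vms: list = field(default_factory=list)

    def free_cores(self) -> int:
        return self.cores - sum(vm.cores for vm in self.vms)

    def free_ram_gb(self) -> int:
        return self.ram_gb - sum(vm.ram_gb for vm in self.vms)

    def can_fit(self, vm: VM) -> bool:
        return vm.cores <= self.free_cores() and vm.ram_gb <= self.free_ram_gb()


def place_first_fit(vms, hosts):
    """Assign each VM to the first host with enough free cores and RAM."""
    unplaced = []
    for vm in vms:
        for host in hosts:
            if host.can_fit(vm):
                host.vms.append(vm)
                break
        else:
            # No host had room for this VM.
            unplaced.append(vm)
    return unplaced


# Two hosts sized like the example in the text: 64 cores, 512 GB RAM each.
hosts = [Host("host-a", 64, 512), Host("host-b", 64, 512)]
# Twenty hypothetical VMs, each reserving 4 cores and 16 GB (assumed sizes).
vms = [VM(f"vm-{i:02d}", cores=4, ram_gb=16) for i in range(20)]

unplaced = place_first_fit(vms, hosts)
for host in hosts:
    print(host.name, len(host.vms), "VMs,", host.free_cores(), "cores free")
```

With these assumed sizes the first host fills on cores before RAM, and the remaining VMs spill onto the second host. Real platforms usually let VMs share cores, which is why actual consolidation ratios can be higher than a strict reservation model suggests.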

Type 1 vs. Type 2 Hypervisors

Hypervisors come in two architectures. A Type 1 (bare-metal) hypervisor runs directly on the physical hardware, with no host operating system underneath it. The hypervisor itself is effectively the OS of the physical machine, and the virtual machines run on top of it. VMware ESXi, Microsoft Hyper-V, and the Linux KVM kernel module are commonly cited examples. With direct hardware access and no intermediary, Type 1 hypervisors are fast and efficient, and they are the standard choice for production data centers and cloud infrastructure.

A Type 2 (hosted) hypervisor runs as an application inside a conventional host operating system. Windows or macOS runs first; then VMware Workstation, Oracle VirtualBox, or Parallels Desktop runs on top of it, hosting VMs inside an application window. This is convenient for development and testing (a developer can run a Linux VM on their Mac to test a deployment), but the extra layer of the host OS adds overhead, so hosted hypervisors are generally not used for production workloads.

Snapshots and Templates

Two VM features with no physical-server equivalent change how IT operates. A snapshot captures the complete state of a VM at a point in time: disk contents, RAM contents, CPU state, and configuration. Before applying a risky patch or making a major configuration change, you take a snapshot. If something goes wrong, you revert to the snapshot in minutes. With physical servers, rolling back meant restoring from a backup that might be hours or days old, so any work done since that backup was lost. With VMs, you return to the exact state you captured moments before the change.
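The snapshot-and-revert workflow can be mimicked in a few lines. The sketch below is only an analogy: ToyVM is a hypothetical class whose entire state is a dictionary, whereas a real hypervisor captures disk blocks and memory pages rather than Python objects.

```python
import copy


class ToyVM:
    """A stand-in for a VM whose whole state is one dictionary."""

    def __init__(self, name):
        self.name = name
        self.state = {
            "disk": {"os_version": "1.0"},
            "ram": {},
            "config": {"cpus": 2},
        }
        self._snapshots = {}

    def take_snapshot(self, label):
        # Deep copy so later changes to the live state do not alter the snapshot.
        self._snapshots[label] = copy.deepcopy(self.state)

    def revert(self, label):
        # Restore a copy, so the snapshot itself can be reused later.
        self.state = copy.deepcopy(self._snapshots[label])


vm = ToyVM("app-server")
vm.take_snapshot("before-patch")

# Apply a "risky patch" that turns out to be bad.
vm.state["disk"]["os_version"] = "2.0-broken"
vm.state["config"]["cpus"] = 0

vm.revert("before-patch")
print(vm.state["disk"]["os_version"])  # prints 1.0
```

The deep copy is the important design choice here: a shallow copy would share the nested dictionaries, and the "bad patch" would silently corrupt the snapshot as well.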

A template is a snapshot used as a master image. Instead of building a new server from scratch every time, you build one "golden image" — OS installed, hardened, configured to standard — and clone VMs from it on demand. Provisioning a new server drops from weeks to minutes: clone the template, adjust the hostname and IP, and the VM is ready. This is the operational model that makes cloud computing possible at scale.
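The clone-then-personalize step can be sketched as follows. Everything here is illustrative: GOLDEN_IMAGE, its fields, the package names, and the clone_from_template function are assumptions made for the example, not the format of any real virtualization platform.

```python
import copy

# Hypothetical golden image: OS installed, hardened, configured to standard.
GOLDEN_IMAGE = {
    "os": "linux",
    "hardened": True,
    "packages": ["monitoring-agent", "ssh-server"],
    "hostname": None,
    "ip": None,
}


def clone_from_template(template, hostname, ip):
    """Return a new VM definition copied from the template, then personalized."""
    vm = copy.deepcopy(template)  # the clone must not share lists with the template
    vm["hostname"] = hostname
    vm["ip"] = ip
    return vm


web1 = clone_from_template(GOLDEN_IMAGE, "web-01", "10.0.0.11")
web2 = clone_from_template(GOLDEN_IMAGE, "web-02", "10.0.0.12")

# Changing one clone leaves the template and the other clone untouched.
web1["packages"].append("nginx")
print(web2["packages"])
print(GOLDEN_IMAGE["packages"])
```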

Containers

Virtual machines virtualize hardware — each VM gets a full OS. Containers virtualize the operating system instead. Multiple containers run on the same OS kernel, each isolated from the others by the kernel's namespace and control group features. Where a VM packages a complete OS plus an application, a container packages only the application and its dependencies — the runtime, libraries, and configuration files the application needs. Nothing more.

The result is dramatically lighter. A VM might consume 2–4 GB of disk space and take 60–90 seconds to boot. A container image might be 50–200 MB and start in under a second. A physical server that could comfortably run 20 VMs can run hundreds of containers. Docker is the dominant container platform. A Docker image is a read-only, layered blueprint — the application code, its runtime, and all its dependencies baked together. A running instance of an image is a container. Because the image captures the entire environment, a container that works on a developer's laptop will run identically in production. "It works on my machine" is the oldest excuse in software development; containers largely eliminate it.
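The idea of a read-only, layered image with a container running on top can be loosely modeled with dictionaries. This is an analogy rather than how Docker stores images on disk: the layer contents and the start_container function are hypothetical, and each "layer" is simply a mapping from file path to contents.

```python
from collections import ChainMap

# Each layer maps file paths to contents; later layers sit on top of earlier ones.
base_layer = {"/etc/os-release": "minimal-linux", "/usr/bin/python": "python 3.x"}
deps_layer = {"/app/requirements.txt": "flask"}
app_layer = {
    "/app/main.py": "print('hello')",
    "/etc/os-release": "minimal-linux (patched)",
}

image_layers = [base_layer, deps_layer, app_layer]  # bottom to top


def start_container(layers):
    """Return a container filesystem view: a writable layer over shared image layers."""
    writable = {}
    # ChainMap searches left to right, so the newest layer goes first.
    return ChainMap(writable, *reversed(layers))


c1 = start_container(image_layers)
c2 = start_container(image_layers)

c1["/tmp/scratch"] = "only in c1"   # writes land in c1's own writable layer
print(c1["/etc/os-release"])        # the top image layer wins
print("/tmp/scratch" in c2)         # False: c2 never sees c1's write
```

Both containers read from the same image layers, which is one reason many containers can start from a single image without duplicating it.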

The tradeoff for this efficiency is reduced isolation. Containers share the host OS kernel, so a kernel-level vulnerability could theoretically be exploited across all containers on a host. VMs provide stronger isolation because each has its own kernel — a vulnerability in one VM cannot directly affect others. In practice, organizations use both: containers for application workloads where density and speed matter, VMs (or VMs running containers) where stronger isolation is required, such as multi-tenant environments or sensitive data processing.

Kubernetes. When an organization runs hundreds or thousands of containers across many servers, manually deciding which container runs on which server becomes unmanageable. Kubernetes (K8s) is the dominant container orchestration platform — it automates deployment, scaling, load balancing, and self-healing of containerized applications across a cluster of servers. When a container crashes, Kubernetes restarts it automatically. When traffic spikes, Kubernetes spins up more containers. You describe the desired state; Kubernetes continuously works to achieve it.
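The "describe the desired state and let the system converge" idea can be shown with a toy reconciliation loop. This sketch does not use the Kubernetes API; reconcile and apply are hypothetical functions, and the cluster is reduced to a dictionary of replica counts per application.

```python
def reconcile(desired, running):
    """Return the actions needed to move `running` toward `desired`.

    Both arguments map an application name to a replica count.
    """
    actions = []
    for app, want in desired.items():
        have = running.get(app, 0)
        if have < want:
            actions.append(("start", app, want - have))
        elif have > want:
            actions.append(("stop", app, have - want))
    # Anything running that is no longer desired gets stopped.
    for app, have in running.items():
        if app not in desired and have > 0:
            actions.append(("stop", app, have))
    return actions


def apply(actions, running):
    """Apply actions to a copy of the running state and return it."""
    new_state = dict(running)
    for verb, app, count in actions:
        delta = count if verb == "start" else -count
        new_state[app] = new_state.get(app, 0) + delta
    return new_state


desired = {"web": 3, "worker": 2}
running = {"web": 3, "worker": 2}

running["web"] = 1            # two web containers crash
actions = reconcile(desired, running)
print(actions)                # [('start', 'web', 2)]
running = apply(actions, running)

desired["web"] = 6            # traffic spike: the target replica count is raised
running = apply(reconcile(desired, running), running)
print(running)                # {'web': 6, 'worker': 2}
```

Self-healing and scaling are the same operation in this model: in both cases the loop only compares desired and actual state and closes the gap.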

Cloud Computing

Cloud computing delivers computing resources — compute, storage, networking, databases, AI services, and more — over the internet, on demand, with pay-as-you-go pricing. Instead of buying servers and operating them in your own data center, you rent capacity from a provider's infrastructure and pay only for what you use, for as long as you use it. Shut down a VM at the end of the day and you stop paying. Spin up 500 servers for a two-hour batch job and pay for two hours of 500 servers.
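The pay-as-you-go arithmetic is simple enough to write down. In the sketch below, the hourly rate is an assumed figure chosen only to make the numbers easy to read; it is not a quoted price from any provider, and on_demand_cost is a hypothetical helper.

```python
def on_demand_cost(instances, hours, rate_per_instance_hour):
    """Cost of running `instances` machines for `hours` at a flat hourly rate."""
    return instances * hours * rate_per_instance_hour


RATE = 0.10  # assumed price in dollars per instance-hour, for illustration only

# The two-hour, 500-server batch job from the text.
batch_job = on_demand_cost(instances=500, hours=2, rate_per_instance_hour=RATE)

# The same fleet left running for a 30-day month.
always_on = on_demand_cost(instances=500, hours=24 * 30, rate_per_instance_hour=RATE)

print(f"Two-hour batch job: ${batch_job:,.2f}")
print(f"Same fleet left running for 30 days: ${always_on:,.2f}")
```

The ratio between the two results is what matters: the bill tracks instance-hours, so shutting resources down when the job finishes is what makes the model economical.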

The three dominant public cloud providers, Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP), collectively operate hundreds of data centers across dozens of geographic regions worldwide. AWS, the pioneer (launched 2006), holds the largest market share and the widest range of services. Azure, preferred by enterprises already invested in Microsoft's ecosystem (Active Directory, Microsoft 365, Windows Server), is the second largest. GCP, while third in market share, is widely regarded as particularly strong in data analytics, AI/ML infrastructure, and Kubernetes (Google built Kubernetes and open-sourced it). Many large enterprises use more than one of the three.

The Three Service Models

Imagine you want to run a web application. The most extreme version of doing everything yourself is to buy a server, install an OS, install a web server, deploy your code, and answer your phone at 3 AM when it crashes. The most extreme version of doing nothing yourself is to log into someone else's pre-built application and use it. In between those two extremes is a spectrum, and cloud providers have given the points on that spectrum names.

The categorization comes down to how much of the technology stack the provider manages versus how much the customer manages — the shared responsibility model.

IaaS — Infrastructure as a Service

With IaaS, the provider supplies the raw infrastructure: virtual machines, storage, and networking. You manage everything above that: the operating system, the runtime, the application, and the data. IaaS is the most flexible model and carries the most responsibility. If a security patch is released for your OS, patching it is your job. If your application has a bug, debugging it is your job. IaaS gives you a VM that behaves like a physical server you rented remotely, and what you do with it is up to you. Examples: AWS EC2, Azure Virtual Machines, Google Compute Engine.

PaaS — Platform as a Service

With PaaS, the provider manages the OS and runtime environment in addition to the infrastructure. You deploy your application code and manage your data — the platform handles the rest. You don't choose an OS version, apply patches, or configure web servers. You push code; it runs. PaaS is the right choice when you want to focus on building the application rather than operating the infrastructure that runs it. Examples: Azure App Service, Google App Engine, AWS Elastic Beanstalk, Heroku.

SaaS — Software as a Service

With SaaS, the provider runs and manages the entire application. You log in and use it — there's no code to deploy, no OS to patch, no runtime to configure. SaaS is what most business users interact with every day without realizing it's cloud computing. Examples: Microsoft 365 (email, Word, Excel, Teams), Google Workspace (Gmail, Docs, Drive), Salesforce (CRM), Zoom, Slack. If you use any of these, you are a SaaS customer.

Most organizations use all three models simultaneously. A company might use Microsoft 365 for email (SaaS), Azure App Service to host their customer-facing web application (PaaS), and Azure Virtual Machines for legacy applications that need OS-level control (IaaS). The widget below visualizes exactly who is responsible for what at each layer of the stack.
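The division of responsibility across the three service models can be expressed as a small lookup. This is a simplified sketch: the five-layer stack and the names LAYERS, PROVIDER_STARTS_AT, and who_manages are assumptions for illustration, and the treatment of data as the customer's responsibility under SaaS follows the commonly published shared responsibility model rather than anything provider-specific.

```python
# Stack layers from top (closest to the user) to bottom (closest to hardware).
LAYERS = ["data", "application", "runtime", "operating system", "infrastructure"]

# The topmost layer the provider manages in each model; the provider also
# manages every layer below it.
PROVIDER_STARTS_AT = {
    "IaaS": "infrastructure",
    "PaaS": "runtime",
    "SaaS": "application",
}


def who_manages(model, layer):
    """Return 'provider' or 'customer' for a layer under a service model."""
    boundary = LAYERS.index(PROVIDER_STARTS_AT[model])
    return "provider" if LAYERS.index(layer) >= boundary else "customer"


def always_customer():
    """Layers the customer manages under every service model."""
    return [
        layer
        for layer in LAYERS
        if all(who_manages(m, layer) == "customer" for m in PROVIDER_STARTS_AT)
    ]


for model in PROVIDER_STARTS_AT:
    row = ", ".join(f"{layer}: {who_manages(model, layer)}" for layer in LAYERS)
    print(f"{model:5} {row}")
print(always_customer())
```

Real provider documentation splits the stack more finely (identity, network controls, encryption, and so on), so treat this as a map of the general pattern rather than a compliance reference.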

Deployment Models

Beyond the service model (what is provided), organizations also choose a deployment model (where it runs and who controls the environment).

A public cloud is infrastructure operated by a third-party provider and shared across many organizations simultaneously. AWS, Azure, and GCP are all public clouds. The multi-tenant nature — your VMs run on the same physical hardware as other customers' VMs — is what makes public cloud economical, but it also means you have no visibility into or control over the underlying hardware. The provider's isolation and security controls protect you from other tenants.

A private cloud is cloud infrastructure dedicated exclusively to a single organization. It may run on-premises (in the organization's own data center) or in a colocation facility, but it provides the same self-service provisioning and automation that public cloud does — just for one organization's exclusive use. Private clouds sacrifice the economies of scale of public cloud in exchange for complete control over the environment.

A hybrid cloud connects private and public infrastructure, allowing workloads to move between them. Sensitive data and regulated workloads stay on-premises or in the private cloud; development environments, test systems, and burst capacity spin up in the public cloud on demand. Hybrid is the reality for most large enterprises: they have years of investment in on-premises infrastructure that isn't going away, but they use public cloud for new projects and flexible capacity.

The Business Case

Capital expenditure vs. operational expenditure. Traditional IT spending is CapEx: you buy servers every three to five years, depreciate them on the balance sheet, and replace them when they age out. Cloud shifts this to OpEx: a monthly bill, like a utility. Finance departments often prefer OpEx because it doesn't require large upfront approvals and scales directly with business activity rather than being a lump sum purchased in anticipation of future growth.

Elasticity. A retailer's IT demand on Black Friday is 10 times their average daily load. With physical servers, they face a choice: buy enough hardware to handle the peak (which sits 90% idle the rest of the year) or be underpowered during their most important sales period. Cloud eliminates this dilemma — scale to 10× capacity for 12 hours, pay for 12 hours of 10× capacity, then scale back down. The same principle applies to any workload with variable demand: batch processing, machine learning training, product launches, seasonal spikes.
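The Black Friday trade-off can be checked with a short calculation in server-hours. The 10× multiplier and 12-hour spike come from the text; the baseline fleet of 10 servers is an assumed figure, and the result does not depend on it.

```python
HOURS_PER_YEAR = 24 * 365

baseline_servers = 10      # assumed everyday fleet size
peak_multiplier = 10       # Black Friday load from the text: 10x
peak_hours = 12            # duration of the spike from the text

peak_servers = baseline_servers * peak_multiplier

# Option 1: own enough hardware for the peak, all year.
fixed_server_hours = peak_servers * HOURS_PER_YEAR

# Option 2: run the baseline all year and rent the extra only during the spike.
elastic_server_hours = (
    baseline_servers * HOURS_PER_YEAR
    + (peak_servers - baseline_servers) * peak_hours
)

# Share of the peak-sized fleet's capacity that is actually needed.
utilization_of_fixed_fleet = elastic_server_hours / fixed_server_hours

print(fixed_server_hours, elastic_server_hours)
print(f"{utilization_of_fixed_fleet:.1%}")
```

The utilization figure comes out at roughly one tenth, which matches the text's claim that peak-sized hardware sits about 90% idle for the rest of the year.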

No hardware lifecycle. Physical servers fail, age out of warranty, and require periodic replacement — a planning and capital cycle that repeats every three to five years. Cloud customers offload this entirely. When AWS upgrades their hardware, you automatically benefit the next time you provision an instance. There is no procurement process, no racking, no cabling, and no e-waste to dispose of.

Global reach and managed services. AWS operates 33+ geographic regions. Deploying infrastructure in Tokyo instead of Chicago is a configuration choice, not a six-month project. Cloud providers also offer hundreds of managed services — fully operated databases, message queues, AI model APIs, identity systems, and more — that organizations can adopt without building or operating the underlying infrastructure. A team that would take six months to build and harden a production database can be using a managed cloud database that day.

Cloud cost management. Cloud's pay-as-you-go model is a double-edged sword. Unused resources that aren't shut down keep billing. Over-provisioned instances waste money quietly. Most organizations that move to cloud without disciplined cost governance are surprised by their bills. Cloud providers offer cost management tools (AWS Cost Explorer, Azure Cost Management) and organizations adopt practices like tagging resources by team and project, setting budgets and alerts, and regularly reviewing and right-sizing instances. "Cloud is cheaper" is only true if someone is actively managing the spend.
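The governance practices listed above (tagging, budgets, right-sizing reviews) can be sketched against a made-up billing export. The records, budgets, and CPU threshold below are all invented for illustration, and the functions are hypothetical; real tools such as AWS Cost Explorer or Azure Cost Management expose this kind of data through their own interfaces.

```python
from collections import defaultdict

# Hypothetical billing export: one record per resource for the month.
resources = [
    {"id": "vm-1", "monthly_cost": 310.0, "tags": {"team": "web"}, "avg_cpu": 0.55},
    {"id": "vm-2", "monthly_cost": 620.0, "tags": {"team": "web"}, "avg_cpu": 0.04},
    {"id": "db-1", "monthly_cost": 900.0, "tags": {"team": "data"}, "avg_cpu": 0.40},
    {"id": "vm-9", "monthly_cost": 150.0, "tags": {}, "avg_cpu": 0.01},
]
budgets = {"web": 800.0, "data": 1000.0}  # assumed monthly budgets per team
IDLE_CPU_THRESHOLD = 0.05                 # assumed cutoff for "probably idle"


def spend_by_team(records):
    """Total monthly cost per team tag; untagged resources are grouped together."""
    totals = defaultdict(float)
    for r in records:
        totals[r["tags"].get("team", "untagged")] += r["monthly_cost"]
    return dict(totals)


def over_budget(totals, limits):
    """Teams whose spend exceeds their budget, sorted by name."""
    return sorted(
        team for team, spent in totals.items()
        if team in limits and spent > limits[team]
    )


def right_sizing_candidates(records):
    """Resources whose average CPU use is below the idle threshold."""
    return [r["id"] for r in records if r["avg_cpu"] < IDLE_CPU_THRESHOLD]


totals = spend_by_team(resources)
print(totals)
print(over_budget(totals, budgets))
print(right_sizing_candidates(resources))
```

The "untagged" bucket is worth watching in practice: spend that cannot be attributed to a team is spend that nobody is accountable for reviewing.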

Chapter 14 follows a single request — a login, a click, a page load — all the way through the stack, from the keypress to the cloud VM that answers it.

Chapter 13 Quiz

1. What is the primary function of a hypervisor?

2. A developer uses Oracle VirtualBox on their Windows laptop to run a Linux VM for testing. What type of hypervisor is VirtualBox?

3. Before applying a major OS patch to a production VM, a sysadmin takes a snapshot. Why?

4. How do containers differ from virtual machines in the most fundamental way?

5. A company uses Microsoft 365 for email, Teams, and Word. Which cloud service model best describes Microsoft 365?

6. A startup wants to deploy a web application without managing OS patches, web server configuration, or runtime updates. Which service model fits?

7. In the shared responsibility model, which layer is ALWAYS the customer's responsibility regardless of whether they use IaaS, PaaS, or SaaS?

8. A retailer scales their cloud infrastructure to 10× normal capacity for Black Friday, then scales back to normal the next day. This cloud characteristic is called:

9. A hospital keeps patient records on its own private cloud for compliance, but runs development and test environments on AWS. This is an example of:

10. Moving from buying servers every five years to paying a monthly cloud bill shifts IT spending from CapEx to ______.