Virtualization and Cloud

Private vs. public cloud, VMware vs. Azure/AWS/GCP, and where each workload belongs.

The Hypervisor

A hypervisor is the software layer that makes virtualization work. It sits between the physical hardware and the virtual machines (VMs), allocating CPU, RAM, storage, and network resources to each VM. From each VM's perspective, it is running on dedicated hardware: the guest operating system sees only the virtual devices the hypervisor presents to it.
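
To make the allocation idea concrete, here is a toy model in Python. It is a sketch under simplifying assumptions, not how ESXi or any real hypervisor is implemented: the `Host` class and its `allocate` method are hypothetical names, and the model ignores overcommitment, storage, and networking, which real hypervisors all handle.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """Toy physical host: a fixed pool of CPU cores and RAM (hypothetical model)."""
    cpu_cores: int
    ram_gb: int
    # Maps VM name -> (cores, ram_gb) currently assigned to it.
    vms: dict = field(default_factory=dict)

    def free(self) -> tuple:
        """Return (free cores, free RAM in GB) after existing VM allocations."""
        used_cpu = sum(cpu for cpu, _ in self.vms.values())
        used_ram = sum(ram for _, ram in self.vms.values())
        return self.cpu_cores - used_cpu, self.ram_gb - used_ram

    def allocate(self, name: str, cpu: int, ram_gb: int) -> bool:
        """Assign resources to a new VM, refusing if the host cannot fit it."""
        free_cpu, free_ram = self.free()
        if cpu > free_cpu or ram_gb > free_ram:
            return False
        self.vms[name] = (cpu, ram_gb)
        return True


host = Host(cpu_cores=16, ram_gb=64)
host.allocate("web", cpu=4, ram_gb=16)
host.allocate("db", cpu=8, ram_gb=32)
print(host.free())  # remaining (cores, RAM) on the host
```

Each VM only ever sees its own slice; the bookkeeping of who gets what lives entirely in the host layer.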

Two types:

Type 1 (bare metal): Runs directly on the hardware, with no general-purpose host OS underneath. Examples: VMware ESXi, Microsoft Hyper-V, Xen. Lower overhead and better performance than hosted hypervisors; the standard choice for production environments.

Type 2 (hosted): Runs as an application on top of an existing OS. Examples: VMware Fusion, VirtualBox, Parallels. Convenient for development and testing; not typically used for production infrastructure.

VMware: On-Premises Virtualization

VMware vSphere (ESXi + vCenter) is the dominant enterprise on-premises virtualization platform. It provides the full stack: hypervisor, management console, live VM migration between hosts (vMotion), high availability, distributed resource scheduling, and storage virtualization.

Where VMware fits best:

Workloads that must remain on-premises for compliance or latency reasons
Workloads that need predictable performance and full hardware control
Environments with significant Dell/HPE/Cisco infrastructure already in place
Workloads where software licensing prohibits or penalizes cloud deployment

The post-Broadcom reality: Broadcom's acquisition of VMware, completed in November 2023, changed the economics significantly. Perpetual licenses are no longer sold, and the subscription licensing that replaced them costs considerably more for many organizations. This is accelerating cloud migration conversations among customers who previously had no financial incentive to move.

AWS, Azure, and GCP: Public Cloud

The hyperscalers offer virtualization as a service. You rent compute rather than own hardware — no capital expense, no hardware refresh cycles, elastic scale.

AWS: Most mature platform, broadest service portfolio. Right for complex application architectures and teams with AWS expertise.

Azure: Best integration with Microsoft-centric environments — Active Directory, SQL Server, .NET. Microsoft licensing incentives often favor running Windows workloads in Azure.

GCP: Best for data analytics, machine learning, and containerized workloads. Kubernetes originated at Google; GKE remains the benchmark managed Kubernetes offering.

The Real Decision: Workload by Workload

The on-premises vs. cloud debate is often framed as binary. It isn't. The right question is: where does this specific workload run best?

On-premises is right when:

The workload is latency-sensitive (direct-attached storage for CAD or video work, high-frequency queries)
Compliance requires data to reside in a specific location
The workload is stable and predictable: owned on-premises capacity is often cheaper than cloud for steady-state compute
You already have capital invested in hardware, along with licensed software that runs on it

Public cloud is right when:

You need to scale rapidly and unpredictably
Geographic redundancy and disaster recovery are requirements
The workload is variable — cloud pay-per-use makes sense
You want to eliminate hardware refresh cycles and capital expense
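
The steady-state versus variable distinction in the two lists can be turned into a rough break-even calculation. The sketch below is illustrative only: every figure in the usage example (hourly rate, hardware price, lifetime, operating cost) is a hypothetical placeholder, and the model ignores reserved-instance discounts, staffing, and data transfer charges.

```python
HOURS_PER_MONTH = 730.0  # average hours in a month


def monthly_cloud_cost(hourly_rate: float, utilization: float) -> float:
    """Pay-per-use cost: billed only for the fraction of hours the VM runs."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return hourly_rate * HOURS_PER_MONTH * utilization


def monthly_on_prem_cost(hardware_cost: float, lifetime_months: int,
                         monthly_opex: float) -> float:
    """Owned capacity: hardware amortized over its life, plus power/space/support."""
    return hardware_cost / lifetime_months + monthly_opex


def break_even_utilization(hourly_rate: float, hardware_cost: float,
                           lifetime_months: int, monthly_opex: float) -> float:
    """Utilization above which owned capacity is cheaper than pay-per-use."""
    on_prem = monthly_on_prem_cost(hardware_cost, lifetime_months, monthly_opex)
    return on_prem / (hourly_rate * HOURS_PER_MONTH)


# Hypothetical inputs: $0.50/hour cloud VM vs. a $6,000 server kept for
# 60 months with $46/month in operating costs.
u = break_even_utilization(hourly_rate=0.50, hardware_cost=6000,
                           lifetime_months=60, monthly_opex=46)
print(f"{u:.0%}")  # prints 40%
```

With these made-up numbers, a workload that runs more than about 40% of the time is cheaper on owned hardware, while one that runs less is cheaper on pay-per-use. The crossover point moves with real pricing, so treat the shape of the calculation as the takeaway, not the number.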

Many organizations end up hybrid: on-premises for core infrastructure, latency-sensitive workloads, and predictable capacity; cloud for burst capacity, disaster recovery (DR), and applications that benefit from managed services.
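
The workload-by-workload criteria above can be written down as a simple rule-based helper. This is a sketch of one possible encoding, not a formal methodology: the `Workload` fields and the `suggest_placement` function are hypothetical names, and treating an on-site data requirement as a hard constraint is an assumption made for this example.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Placement-relevant traits of one workload (hypothetical model)."""
    name: str
    latency_sensitive: bool = False
    must_stay_on_site: bool = False        # compliance pins data to your facility
    steady_state: bool = False             # stable, predictable demand
    license_penalizes_cloud: bool = False
    unpredictable_scale: bool = False
    variable_load: bool = False
    needs_geo_redundancy: bool = False


def suggest_placement(w: Workload) -> str:
    """Return a coarse suggestion based on the criteria in the two lists."""
    # Assumption: an on-site data requirement overrides everything else.
    if w.must_stay_on_site:
        return "on-premises"
    on_prem_signals = sum([w.latency_sensitive, w.steady_state,
                           w.license_penalizes_cloud])
    cloud_signals = sum([w.unpredictable_scale, w.variable_load,
                         w.needs_geo_redundancy])
    if on_prem_signals and cloud_signals:
        return "hybrid"
    if on_prem_signals:
        return "on-premises"
    if cloud_signals:
        return "public cloud"
    return "no strong signal"


for w in [
    Workload("cad-storage", latency_sensitive=True, steady_state=True),
    Workload("campaign-site", variable_load=True, unpredictable_scale=True),
    Workload("erp", steady_state=True, needs_geo_redundancy=True),
]:
    print(w.name, "->", suggest_placement(w))
```

A real assessment weighs cost, staffing, and risk as well; the point is only that the decision is made per workload, not once for the whole environment.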

Azure Site Recovery is worth mentioning specifically: it replicates on-premises VMs to Azure, providing geographically redundant disaster recovery for roughly $25 per VM per month plus storage costs. For organizations with existing on-premises VMware that want to reduce their footprint, it provides failover to Azure and failback to on-premises, geo-redundancy, and a credible exit ramp from on-premises, all at a cost that is difficult to match with dedicated DR hardware.
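
A back-of-the-envelope estimate for that DR cost follows directly from the per-VM figure. In the sketch below, the $25 default comes from the paragraph above; the storage rate is a hypothetical placeholder that must be replaced with current pricing for your region and storage tier, and the function name is made up for illustration.

```python
def estimate_dr_monthly_cost(vm_count: int, replicated_gb: float,
                             storage_rate_per_gb: float,
                             per_vm_rate: float = 25.0) -> float:
    """Rough monthly DR cost: per-VM protection fee plus replicated storage.

    per_vm_rate defaults to the approximate $25/VM/month quoted in the text.
    storage_rate_per_gb has no default on purpose: look up the real rate.
    """
    if vm_count < 0 or replicated_gb < 0 or storage_rate_per_gb < 0:
        raise ValueError("inputs must be non-negative")
    return vm_count * per_vm_rate + replicated_gb * storage_rate_per_gb


# Hypothetical environment: 40 protected VMs, 8 TB replicated,
# and an assumed (not quoted) storage rate of $0.05 per GB per month.
cost = estimate_dr_monthly_cost(vm_count=40, replicated_gb=8000,
                                storage_rate_per_gb=0.05)
print(f"${cost:,.0f} per month")
```

Compute charges during an actual failover or test failover are not included, so the estimate covers the standby state only.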

GPU VMs, Local AI, and Agentic Workloads

One of the most interesting emerging use cases for on-premises and cloud VMs is AI inference. GPU-equipped VMs, available from all three major cloud providers and deployable on-premises with NVIDIA hardware, can host large language models on infrastructure you control, without sending data to external APIs. For organizations with data privacy requirements, or that want to run automated AI workflows at scale, self-hosted inference changes the economics: the cost becomes GPU capacity rather than per-request API fees.

Separately: tools like Claude Code make VM capacity significantly more useful for development and automation. Spinning up a sandboxed VM, having an agent build and test something inside it, then discarding the VM is a workflow that requires available compute capacity — and rewards organizations that have it.
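
The create, use, discard pattern described above maps naturally onto a context manager. The sketch below uses an in-memory stand-in instead of a real hypervisor or cloud API: `InMemoryProvider`, its methods, and the `ephemeral_vm` helper are all hypothetical names, and a real version would call whatever provisioning API your platform offers.

```python
import contextlib
import itertools


class InMemoryProvider:
    """Stand-in for a real VM provisioning API (hypothetical, for illustration)."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.active = set()  # IDs of VMs that currently exist

    def create_vm(self, image: str) -> str:
        vm_id = f"vm-{next(self._ids)}"
        self.active.add(vm_id)
        return vm_id

    def run(self, vm_id: str, command: str) -> str:
        if vm_id not in self.active:
            raise RuntimeError(f"{vm_id} does not exist")
        return f"[{vm_id}] ran: {command}"

    def destroy_vm(self, vm_id: str) -> None:
        self.active.discard(vm_id)


@contextlib.contextmanager
def ephemeral_vm(provider: InMemoryProvider, image: str):
    """Create a sandbox VM and guarantee it is destroyed, even on error."""
    vm_id = provider.create_vm(image)
    try:
        yield vm_id
    finally:
        provider.destroy_vm(vm_id)


provider = InMemoryProvider()
with ephemeral_vm(provider, image="sandbox-image") as vm:
    # An agent would build and test inside the VM here.
    print(provider.run(vm, "run-tests"))
print(len(provider.active))  # prints 0: nothing is left running
```

The `finally` clause is the important part: a failed build or a crashed agent still releases the capacity, which keeps a pool of sandbox VMs from slowly filling with leftovers.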

For real-world examples of infrastructure decisions, see our case studies.
