gcp-compute
Installation
SKILL.md
GCP Compute Engine
Deploy, manage, and scale Compute Engine virtual machines on Google Cloud Platform.
When to Use
- Deploying web servers, application backends, or batch-processing workloads on GCP
- Running workloads that need full OS-level control (unlike Cloud Run or App Engine)
- Creating managed instance groups for auto-healing and auto-scaling behind a load balancer
- Provisioning GPU-attached VMs for ML training or rendering pipelines
- Cost-optimizing non-critical workloads with preemptible or spot VMs
Prerequisites
- Google Cloud SDK (
gcloud) installed and authenticated - A GCP project with the Compute Engine API enabled
- IAM role
roles/compute.adminor scoped roles for instance management
gcloud auth list
gcloud config set project $PROJECT_ID
gcloud services enable compute.googleapis.com
Machine Types Reference
| Family | Example | vCPUs | Memory | Use Case |
|---|---|---|---|---|
| E2 | e2-micro | 0.25 | 1 GB | Dev/test, microservices |
| E2 | e2-medium | 1 | 4 GB | Light web servers |
| N2 | n2-standard-4 | 4 | 16 GB | General-purpose production |
| N2 | n2-highmem-8 | 8 | 64 GB | In-memory caches, databases |
| C2 | c2-standard-16 | 16 | 64 GB | Compute-intensive, HPC |
# List machine types available in a zone
gcloud compute machine-types list --zones=us-central1-a --filter="name~'e2-'"
# Create a custom machine type (6 vCPUs, 24 GB RAM)
gcloud compute instances create custom-vm \
--custom-cpu=6 --custom-memory=24GB \
--zone=us-central1-a \
--image-family=debian-12 --image-project=debian-cloud
Create an Instance
# Production instance with shielded VM and startup script
gcloud compute instances create web-server \
--machine-type=e2-medium \
--zone=us-central1-a \
--image-family=debian-12 \
--image-project=debian-cloud \
--boot-disk-size=20GB \
--boot-disk-type=pd-balanced \
--tags=http-server,https-server \
--labels=env=production,team=backend \
--metadata=enable-oslogin=TRUE \
--shielded-secure-boot \
--shielded-vtpm \
--shielded-integrity-monitoring
# Instance with a startup script and service account
gcloud compute instances create app-server \
--machine-type=e2-standard-2 \
--zone=us-central1-a \
--image-family=ubuntu-2204-lts \
--image-project=ubuntu-os-cloud \
--boot-disk-size=50GB \
--metadata-from-file=startup-script=startup.sh \
--service-account=app-sa@${PROJECT_ID}.iam.gserviceaccount.com \
--scopes=cloud-platform
# Instance with an additional data disk
gcloud compute instances create db-server \
--machine-type=n2-highmem-4 \
--zone=us-central1-a \
--image-family=debian-12 --image-project=debian-cloud \
--boot-disk-size=20GB \
--create-disk=name=data-disk,size=200GB,type=pd-ssd,auto-delete=no
Startup Script Example
#!/bin/bash
# startup.sh - runs on first boot and every reboot
set -euo pipefail
apt-get update && apt-get install -y nginx
systemctl enable nginx && systemctl start nginx
curl -X PUT -H "Metadata-Flavor: Google" \
"http://metadata.google.internal/computeMetadata/v1/instance/guest-attributes/startup/status" \
-d "complete"
Instance Templates and Managed Instance Groups
# Create an instance template
gcloud compute instance-templates create web-template \
--machine-type=e2-medium \
--image-family=debian-12 --image-project=debian-cloud \
--boot-disk-size=20GB --tags=http-server \
--metadata-from-file=startup-script=startup.sh
# Create a regional managed instance group (MIG) with health check
gcloud compute health-checks create http http-health-check \
--port=80 --request-path=/healthz \
--check-interval=10s --timeout=5s \
--healthy-threshold=2 --unhealthy-threshold=3
gcloud compute instance-groups managed create web-mig \
--template=web-template --size=3 \
--region=us-central1 \
--health-check=http-health-check --initial-delay=120
# Configure autoscaling
gcloud compute instance-groups managed set-autoscaling web-mig \
--region=us-central1 \
--min-num-replicas=2 --max-num-replicas=10 \
--target-cpu-utilization=0.65 --cool-down-period=90
# Rolling update to a new template
gcloud compute instance-groups managed rolling-action start-update web-mig \
--version=template=web-template-v2 \
--region=us-central1 --max-surge=3 --max-unavailable=0
Preemptible and Spot VMs
# Spot VM (recommended over legacy preemptible)
gcloud compute instances create spot-worker \
--machine-type=n2-standard-8 \
--zone=us-central1-a \
--image-family=debian-12 --image-project=debian-cloud \
--provisioning-model=SPOT \
--instance-termination-action=STOP
# Spot instance template for batch MIG
gcloud compute instance-templates create batch-template \
--machine-type=n2-standard-4 \
--image-family=debian-12 --image-project=debian-cloud \
--provisioning-model=SPOT \
--instance-termination-action=DELETE
Snapshots and Images
# Create a snapshot
gcloud compute disks snapshot web-server \
--zone=us-central1-a \
--snapshot-names=web-server-snap-$(date +%Y%m%d)
# Scheduled snapshot policy
gcloud compute resource-policies create snapshot-schedule daily-backup \
--region=us-central1 --max-retention-days=14 \
--daily-schedule --start-time=03:00
gcloud compute disks add-resource-policies web-server \
--zone=us-central1-a --resource-policies=daily-backup
# Create a custom image from an instance
gcloud compute instances stop web-server --zone=us-central1-a
gcloud compute images create web-golden-image \
--source-disk=web-server --source-disk-zone=us-central1-a \
--family=web-server --labels=version=v1
Terraform Configuration
resource "google_compute_instance" "web" {
name = "web-server"
machine_type = "e2-medium"
zone = "us-central1-a"
tags = ["http-server", "https-server"]
boot_disk {
initialize_params {
image = "debian-cloud/debian-12"
size = 20
type = "pd-balanced"
}
}
network_interface {
subnetwork = google_compute_subnetwork.main.id
access_config {}
}
metadata_startup_script = file("${path.module}/startup.sh")
service_account {
email = google_service_account.app.email
scopes = ["cloud-platform"]
}
shielded_instance_config {
enable_secure_boot = true
enable_vtpm = true
enable_integrity_monitoring = true
}
}
resource "google_compute_instance_template" "web" {
name_prefix = "web-"
machine_type = "e2-medium"
disk {
source_image = "debian-cloud/debian-12"
auto_delete = true
boot = true
disk_size_gb = 20
}
network_interface {
subnetwork = google_compute_subnetwork.main.id
}
lifecycle { create_before_destroy = true }
}
resource "google_compute_region_instance_group_manager" "web" {
name = "web-mig"
base_instance_name = "web"
region = "us-central1"
version {
instance_template = google_compute_instance_template.web.id
}
target_size = 3
named_port { name = "http"; port = 80 }
auto_healing_policies {
health_check = google_compute_health_check.http.id
initial_delay_sec = 120
}
}
resource "google_compute_region_autoscaler" "web" {
name = "web-autoscaler"
region = "us-central1"
target = google_compute_region_instance_group_manager.web.id
autoscaling_policy {
min_replicas = 2
max_replicas = 10
cooldown_period = 90
cpu_utilization { target = 0.65 }
}
}
Common Operations
# SSH into an instance
gcloud compute ssh web-server --zone=us-central1-a
# List all instances with status
gcloud compute instances list \
--format="table(name,zone,status,machineType.basename())"
# Stop / start / resize
gcloud compute instances stop web-server --zone=us-central1-a
gcloud compute instances set-machine-type web-server \
--machine-type=e2-standard-4 --zone=us-central1-a
gcloud compute instances start web-server --zone=us-central1-a
# View serial port output (debug startup scripts)
gcloud compute instances get-serial-port-output web-server --zone=us-central1-a
Troubleshooting
| Symptom | Cause | Fix |
|---|---|---|
| Instance stuck in STAGING | Quota exceeded or resource unavailable | Check quota with gcloud compute project-info describe; try another zone |
| Startup script not running | Syntax errors or wrong metadata key | Check serial output; ensure key is startup-script not startup_script |
| Cannot SSH | Firewall blocks port 22 or OS Login misconfigured | Add firewall rule for tcp:22; verify enable-oslogin metadata |
| Preempted too often | Zone resource pressure | Use Spot VM with STOP action; spread across zones in a MIG |
| Disk out of space | Boot disk too small | Use gcloud compute disks resize; enable --storage-auto-increase for data disks |
| MIG not healing | Health check misconfigured or initial delay too short | Verify health check path returns 200; increase --initial-delay |
Related Skills
- gcp-networking - VPC, firewall rules, and load balancers for Compute Engine
- terraform-gcp - Provision Compute Engine resources with Infrastructure as Code
- gcp-gke - When workloads are better suited for containers than VMs
- gcp-cloud-sql - Managed databases that Compute Engine applications connect to
Weekly Installs
35
Repository
bagelhole/devop…t-skillsGitHub Stars
18
First Seen
4 days ago
Security Audits