vm-snapshot-restore
/vm-snapshot-restore Skill
Restore virtual machines from snapshots in OpenShift Virtualization. CRITICAL: This operation replaces current VM state with snapshot data. ALL changes since the snapshot will be LOST.
Implementation Note: This skill uses generic Kubernetes resource tools (resources_create_or_update) to create VirtualMachineRestore resources. Dedicated restore tools do not currently exist in the openshift-virtualization MCP server.
Prerequisites
Required MCP Server: openshift-virtualization (OpenShift MCP Server)
Required MCP Tools:
- resources_create_or_update (from openshift-virtualization) - Create VirtualMachineRestore
- resources_get (from openshift-virtualization) - Verify VM/snapshot exists, monitor restore
- vm_lifecycle (from openshift-virtualization) - Stop VM if running
Required Environment Variables:
- KUBECONFIG - Path to Kubernetes configuration file with cluster access
Required Cluster Setup:
- OpenShift cluster (>= 4.19)
- OpenShift Virtualization operator installed
- ServiceAccount with RBAC permissions to create VirtualMachineRestore resources
When to Use This Skill
Trigger this skill when:
- User wants to restore a VM to a previous state
- User wants to recover from failed changes/upgrades
- User explicitly requests snapshot restore
User phrases that trigger this skill:
- "Restore VM api-server from snapshot snapshot-20240115"
- "Roll back database-01 to pre-upgrade snapshot"
- "Recover VM web-server from backup"
Do NOT use this skill when:
- User wants to create snapshots → Use vm-snapshot-create skill
- User wants to list snapshots → Use vm-snapshot-list skill
- User wants to clone a VM → Use vm-clone skill
Workflow
Step 1: Gather Restore Information
Required Information from User:
- VM Name - VM to restore
- Namespace - Namespace where VM exists
- Snapshot Name - Snapshot to restore from
If any information is missing, ask for it.
Step 2: Verify VM Exists
MCP Tool: resources_get (from openshift-virtualization)
Parameters:
{
  "apiVersion": "kubevirt.io/v1",
  "kind": "VirtualMachine",
  "namespace": "<namespace>",
  "name": "<vm-name>"
}
Error Handling:
- If VM not found → Report error
- If permission denied → Report RBAC error
Step 3: Check VM Running State
From the VM resource in Step 2, check status.printableStatus.
If VM is Running:
⚠️ VM Must Be Stopped Before Restore
**VM**: `<vm-name>` (namespace: `<namespace>`)
**Status**: Running
**Safety Requirement**: VMs must be stopped before restore to prevent data corruption.
**Options:**
1. "stop-and-restore" - Stop the VM first, then restore from snapshot
2. "cancel" - Cancel restore operation
How would you like to proceed?
Wait for user response.
- If "stop-and-restore" → Stop VM using vm_lifecycle, then continue
- If "cancel" → Stop workflow
Step 4: Verify Snapshot Exists
MCP Tool: resources_get (from openshift-virtualization)
Parameters:
{
  "apiVersion": "snapshot.kubevirt.io/v1beta1",
  "kind": "VirtualMachineSnapshot",
  "namespace": "<namespace>",
  "name": "<snapshot-name>"
}
If snapshot not found:
❌ Snapshot Not Found
**Snapshot**: `<snapshot-name>` does not exist in namespace `<namespace>`.
**To list available snapshots:**
"List snapshots for VM <vm-name>"
Restore operation cancelled.
STOP workflow.
Extract snapshot details:
- metadata.creationTimestamp - Creation time
- status.phase - Must be "Succeeded"
- status.readyToUse - Must be true
- spec.source.name - Verify it matches the VM name
If the snapshot is not ready (status.phase is not "Succeeded" or status.readyToUse is not true):
❌ Snapshot Not Ready
**Snapshot**: `<snapshot-name>`
**Status**: <status.phase>
**Ready to Use**: <status.readyToUse>
Snapshot is not ready for restore. Only snapshots with "Succeeded" phase and readyToUse=true can be used.
Restore operation cancelled.
STOP workflow.
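The readiness checks in this step can be expressed as a small function. The following is a minimal Python sketch, not part of the skill itself: snapshot_problems is a hypothetical helper name, and it assumes the VirtualMachineSnapshot returned by resources_get has already been parsed into a dictionary.

```python
from typing import Any


def snapshot_problems(snapshot: dict[str, Any], vm_name: str) -> list[str]:
    """Return the reasons a snapshot cannot be restored; empty list means usable.

    Hypothetical helper: `snapshot` is assumed to be the parsed
    VirtualMachineSnapshot resource.
    """
    status = snapshot.get("status") or {}
    source = (snapshot.get("spec") or {}).get("source") or {}
    problems = []
    # status.phase must be exactly "Succeeded"
    if status.get("phase") != "Succeeded":
        problems.append(f"phase is {status.get('phase')!r}, expected 'Succeeded'")
    # status.readyToUse must be the boolean true, not merely truthy
    if status.get("readyToUse") is not True:
        problems.append("readyToUse is not true")
    # spec.source.name must point at the VM being restored
    if source.get("name") != vm_name:
        problems.append(f"snapshot source is {source.get('name')!r}, not {vm_name!r}")
    return problems
```

An empty result means the workflow may continue to Step 5; any entry means the restore should be cancelled with the corresponding message.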
Step 5: Present Restore Preview and Get Typed Confirmation
CRITICAL: User must type the snapshot name to confirm.
## 🔴 VM RESTORE - Data Loss Warning
**⚠️ THIS WILL REPLACE CURRENT VM STATE WITH SNAPSHOT DATA ⚠️**
### What Will Happen
**VM to Restore**: `<vm-name>` (namespace: `<namespace>`)
**Snapshot to Restore From**: `<snapshot-name>`
**Current VM State** (WILL BE LOST):
- **Last Modified**: <current-timestamp>
- **Changes Since Snapshot**: ALL changes made after <snapshot-creation-timestamp> WILL BE PERMANENTLY LOST
**Snapshot State** (WILL BE RESTORED):
- **Created**: <snapshot-creation-timestamp>
- **Age**: <snapshot-age>
**Time Range of Data Loss**:
- **⚠️ ALL CHANGES in the last <time-diff> WILL BE LOST ⚠️**
### What Will Be Restored
- ✓ VM configuration (from snapshot time)
- ✓ Disk data (from snapshot time)
### What Will Be Lost
- ✗ **ALL disk changes** made after <snapshot-creation-timestamp>
- ✗ **ALL configuration changes** made after <snapshot-creation-timestamp>
---
**⚠️ CRITICAL: This restore is permanent. Current VM state cannot be recovered unless you create a snapshot now.**
**To proceed with restore, type the snapshot name exactly as shown:**
Type `<snapshot-name>` to confirm: _____
Wait for user to type the snapshot name.
Validation:
- Compare user input with snapshot name (case-sensitive, exact match)
- If match: Proceed to Step 6
- If mismatch: Cancel operation
On mismatch:
❌ Confirmation Failed
**You typed**: `<user-input>`
**Expected**: `<snapshot-name>`
Names do not match. Restore cancelled for safety.
Operation cancelled. Current VM state preserved.
STOP workflow.
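The <snapshot-age> and <time-diff> placeholders in the preview can be derived from the snapshot's metadata.creationTimestamp. The sketch below is one possible way to do that in Python. The helper name snapshot_age and the "6h 30m" output style (borrowed from the Example Usage section) are assumptions, and the timestamp is assumed to be an RFC 3339 UTC string such as 2026-02-18T10:00:00Z.

```python
from datetime import datetime, timezone


def snapshot_age(creation_timestamp: str, now: datetime) -> str:
    """Format the time elapsed since the snapshot, e.g. '6h 30m'.

    Hypothetical helper. `now` must be timezone-aware so it can be
    compared with the parsed timestamp.
    """
    # Replace a trailing "Z" so fromisoformat also works on Python < 3.11.
    created = datetime.fromisoformat(creation_timestamp.replace("Z", "+00:00"))
    # Clamp at zero in case of clock skew between client and cluster.
    total_minutes = max(0, int((now - created).total_seconds() // 60))
    days, remainder = divmod(total_minutes, 24 * 60)
    hours, minutes = divmod(remainder, 60)
    parts = []
    if days:
        parts.append(f"{days}d")
    if days or hours:
        parts.append(f"{hours}h")
    parts.append(f"{minutes}m")
    return " ".join(parts)
```

In practice the caller would pass datetime.now(timezone.utc) as now; taking it as a parameter keeps the function deterministic.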
Step 6: Final Confirmation Before Restore
After typed verification succeeds, ask for final explicit confirmation.
## ✓ Typed Verification Passed
**Confirmation received for snapshot**: `<snapshot-name>`
### Ready to Restore
**VM**: `<vm-name>` (namespace: `<namespace>`)
**From Snapshot**: `<snapshot-name>`
**Impact**:
- Current VM state will be replaced with snapshot state
- All changes in the last <time-diff> will be permanently lost
---
**Proceed with VM restore? This action cannot be undone.**
- Type "yes" to execute restore
- Type "cancel" to abort
Your choice: _____
Wait for user response.
Handle response:
- If "yes" → Proceed to Step 7 (execute restore)
- If "cancel", "no", "wait", or anything else → Cancel operation
On cancellation:
Restore operation cancelled by user. Current VM state preserved.
STOP workflow.
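The two confirmation gates from Steps 5 and 6 can be summarized in code. This is an illustrative Python sketch only. The function names are hypothetical, and treating only the exact lowercase string "yes" as approval is an assumption based on "anything else → Cancel operation".

```python
def typed_name_matches(user_input: str, snapshot_name: str) -> bool:
    """Step 5 gate: case-sensitive, exact match with no trimming of whitespace."""
    return user_input == snapshot_name


def final_answer_is_yes(answer: str) -> bool:
    """Step 6 gate: only "yes" proceeds; "cancel", "no", "wait", etc. cancel."""
    return answer == "yes"


def may_execute_restore(typed_name: str, answer: str, snapshot_name: str) -> bool:
    """Both gates must pass before Step 7 is allowed to run."""
    return typed_name_matches(typed_name, snapshot_name) and final_answer_is_yes(answer)
```

Keeping the comparison strict means a stray space or a capital letter cancels the restore, which is the safer failure mode for a destructive operation.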
Step 7: Execute Restore
ONLY PROCEED AFTER:
- ✓ VM verified (exists, stopped)
- ✓ Snapshot verified (exists, ready)
- ✓ User typed snapshot name correctly
- ✓ User confirmed "yes"
MCP Tool: resources_create_or_update (from openshift-virtualization)
Construct VirtualMachineRestore YAML:
apiVersion: snapshot.kubevirt.io/v1beta1
kind: VirtualMachineRestore
metadata:
  name: <restore-name>
  namespace: <namespace>
spec:
  target:
    apiGroup: kubevirt.io
    kind: VirtualMachine
    name: <vm-name>
  virtualMachineSnapshotName: <snapshot-name>
Generate restore name:
- Format: restore-<vm-name>-<timestamp>
- Example: restore-database-01-20260218-143500
Parameters:
{
  "resource": "apiVersion: snapshot.kubevirt.io/v1beta1\nkind: VirtualMachineRestore\nmetadata:\n  name: <restore-name>\n  namespace: <namespace>\nspec:\n  target:\n    apiGroup: kubevirt.io\n    kind: VirtualMachine\n    name: <vm-name>\n  virtualMachineSnapshotName: <snapshot-name>"
}
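As an illustration of how the restore name and the resource parameter fit together, here is a minimal Python sketch. The helper names restore_name and restore_parameters are hypothetical, and the timestamp layout YYYYMMDD-HHMMSS is inferred from the example name above. The skill does not say which time zone the timestamp uses, so the caller decides.

```python
from datetime import datetime


def restore_name(vm_name: str, now: datetime) -> str:
    """Build restore-<vm-name>-<timestamp> (timestamp layout is an assumption)."""
    return f"restore-{vm_name}-{now.strftime('%Y%m%d-%H%M%S')}"


def restore_parameters(
    vm_name: str, namespace: str, snapshot_name: str, now: datetime
) -> dict[str, str]:
    """Return the parameters for resources_create_or_update.

    The YAML is assembled line by line so the nesting under spec.target
    stays explicit. Inputs are assumed to be valid Kubernetes names, so
    no YAML quoting is applied.
    """
    manifest = "\n".join(
        [
            "apiVersion: snapshot.kubevirt.io/v1beta1",
            "kind: VirtualMachineRestore",
            "metadata:",
            f"  name: {restore_name(vm_name, now)}",
            f"  namespace: {namespace}",
            "spec:",
            "  target:",
            "    apiGroup: kubevirt.io",
            "    kind: VirtualMachine",
            f"    name: {vm_name}",
            f"  virtualMachineSnapshotName: {snapshot_name}",
        ]
    )
    return {"resource": manifest}
```

Note that virtualMachineSnapshotName sits directly under spec, next to target, not inside it.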
Report progress:
🔄 Restoring VM from snapshot...
⏳ This may take several minutes...
Step 8: Monitor Restore Progress
Use resources_get to monitor VirtualMachineRestore status.
Check status.complete:
- true → Restore completed
- false → Restore in progress
Wait up to 10 minutes for restore to complete.
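The monitoring loop can be sketched as a poll with a deadline. The Python below is a hedged illustration, not the skill's implementation: wait_for_restore and fetch_restore are hypothetical names, fetch_restore stands in for a resources_get call that returns the parsed VirtualMachineRestore, and the 10-second poll interval is an assumption (only the 10-minute limit comes from this step).

```python
import time
from typing import Any, Callable


def wait_for_restore(
    fetch_restore: Callable[[], dict[str, Any]],
    timeout_s: float = 600.0,   # 10 minutes, as stated in this step
    interval_s: float = 10.0,   # assumed poll interval
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until status.complete is true; return False if the deadline passes."""
    deadline = clock() + timeout_s
    while True:
        restore = fetch_restore()
        # status may be absent right after creation, so default to an empty dict
        if (restore.get("status") or {}).get("complete") is True:
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

A False result means the restore did not complete within the limit. It does not by itself mean the restore failed, so status.conditions should be inspected before reporting.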
Step 9: Report Restore Results
On success:
## ✓ VM Restored Successfully
**VM**: `<vm-name>` (namespace: `<namespace>`)
**Restored From**: Snapshot `<snapshot-name>`
### Restore Details
- **Snapshot Created**: <snapshot-creation-timestamp>
- **Restore Completed**: <current-timestamp>
- **VM Status**: Stopped (ready to start)
### Data Loss Confirmation
- ⚠️ All changes made after <snapshot-creation-timestamp> have been lost
### Next Steps
**To start the restored VM:**
"Start VM <vm-name> in namespace <namespace>"
On failure:
## ❌ VM Restore Failed
**Error**: <error-message>
**VM**: `<vm-name>`
**Snapshot**: `<snapshot-name>`
**Current VM State**: UNKNOWN - may be partially restored or unchanged
**CRITICAL**: Do not start VM until restore issue is resolved
**Recovery Options:**
1. Try restore again after resolving the error
2. Restore from a different snapshot
3. Contact cluster admin for investigation
Dependencies
Required MCP Servers
- openshift-virtualization - OpenShift MCP server with kubevirt toolset
Required MCP Tools
- resources_create_or_update (from openshift-virtualization) - Create VirtualMachineRestore
- resources_get (from openshift-virtualization) - Verify and monitor
- vm_lifecycle (from openshift-virtualization) - Stop VM if running
Related Skills
- vm-snapshot-list - List snapshots before restore
- vm-snapshot-create - Create snapshots before risky operations
- vm-snapshot-delete - Delete old snapshots
- vm-lifecycle-manager - Start VM after restore
Critical: Human-in-the-Loop Requirements
IMPORTANT: This skill performs DESTRUCTIVE operations. You MUST:
- Before Restoring Snapshots (CRITICAL - Data Loss Risk)
  - REQUIRE VM to be stopped first if currently running
  - Display what will be lost (current VM state since snapshot)
  - Show snapshot details (creation time, age)
  - Require typed confirmation - user must type snapshot name exactly
  - Ask: "Proceed with restore? This will replace current VM state. (yes/cancel)"
  - Wait for explicit "yes"
- Never Auto-Execute
  - NEVER restore without user confirmation
  - NEVER restore to running VMs without stopping first
  - NEVER skip typed verification for restore operations
Why This Matters:
- Data Loss on Restore: Restoring replaces current VM state - all changes since snapshot are PERMANENTLY LOST
- No Undo: Restore cannot be reversed - current data cannot be recovered
- Typed Confirmation: Prevents accidental restores to wrong snapshots
Common Issues
Issue 1: Restore Fails - Insufficient Storage Capacity
Error: "Failed to restore: insufficient storage capacity" or "PVC provisioning failed"
Cause: The namespace doesn't have enough storage quota or the storage backend is full.
Solution:
- Check namespace storage quota: resources_list with kind="ResourceQuota"
- Check PVC status: resources_list for PersistentVolumeClaims
- Delete unnecessary snapshots: Use vm-snapshot-delete skill
- Request quota increase: Contact cluster admin
- Retry restore once storage is available
Issue 2: Restore Stuck in Progress
Error: VirtualMachineRestore status shows complete: false for an extended period
Cause: The storage backend is slow, the snapshot is corrupted, or there's a CSI driver issue.
Solution:
- Check VirtualMachineRestore status.conditions for detailed error messages
- Verify snapshot is "Succeeded": Use vm-snapshot-list skill
- Wait longer: Large VMs may take 10+ minutes to restore
- Cancel and retry: Delete VirtualMachineRestore resource and try again
- Try alternative snapshot if restore continues to fail
- Check CSI driver logs (requires cluster admin access)
Issue 3: User Typed Wrong Snapshot Name
Error: "Confirmation Failed - Names do not match"
Cause: User made a typo when typing the snapshot name for confirmation.
Solution:
- Retry with exact name: Copy-paste the snapshot name from the previous message
- Check capitalization: Snapshot names are case-sensitive
- Check for spaces: Ensure no leading/trailing spaces in the typed name
- Use vm-snapshot-list skill to see all available snapshots and verify the correct name
This safety mechanism is intentional - it prevents accidental data loss from restoring to the wrong snapshot.
Security Considerations
- Data Protection: Restore requires TWO confirmations (typed + yes) to prevent accidental data loss
- VM Stop Requirement: VMs must be stopped before restore
- Typed Verification: User must type exact snapshot name
- RBAC Enforcement: Requires permissions for VirtualMachineRestore resources
- Namespace Isolation: Restore scoped to namespace boundaries
- Audit Trail: All restore operations logged in Kubernetes API audit logs
- No Rollback: Restore is permanent - current state cannot be recovered after restore
Example Usage
User: "The database upgrade failed. Restore database-01 from snapshot pre-upgrade-snapshot"
Agent: "Which namespace is database-01 in?"
User: "production"
Agent: [Verifies VM - Running status detected] "⚠️ VM Must Be Stopped Before Restore. Options: 1) stop-and-restore 2) cancel"
User: "stop-and-restore"
Agent: [Stops VM, verifies snapshot exists] "🔴 VM RESTORE - Data Loss Warning
- VM: database-01 (production)
- Snapshot: pre-upgrade-snapshot (created 2026-02-18 10:00, age 6h 30m)
- ⚠️ ALL CHANGES in last 6h 30m WILL BE LOST
Type pre-upgrade-snapshot to confirm: _____"
User: "pre-upgrade-snapshot"
Agent: "✓ Typed Verification Passed. Proceed with restore? (yes/cancel)"
User: "yes"
Agent: "🔄 Restoring VM from snapshot... ⏳ This may take several minutes..." [Monitors progress] "✓ VM Restored Successfully
- VM: database-01 (production)
- Restored From: pre-upgrade-snapshot (created 2026-02-18 10:00)
- Status: Stopped (ready to start)
- ⚠️ All changes after 2026-02-18 10:00 have been lost
To start: 'Start VM database-01 in namespace production'"