Schema Cleaner — JSON Schema Normalization

Clean JSON Schemas for optimal LLM tool-calling compatibility across different providers.

Overview

Different LLM providers support different subsets of JSON Schema. This skill provides the following transformations:

  • Provider-Specific Cleaning: Remove keywords unsupported by each provider
  • Reference Resolution: Inline $ref entries from $defs and definitions
  • Union Flattening: Convert anyOf/oneOf with literals into enum
  • Nullable Handling: Strip nullable variants from unions and type arrays
  • Const Conversion: Convert const to single-value enum
  • Circular Detection: Detect and safely handle circular references
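To make the union flattening and const conversion steps concrete, here is a minimal sketch of how a single schema node could be rewritten. The helper name `flattenLiterals` is hypothetical and is not part of the schema-cleaner API; the actual implementation may differ, for example in how it infers `type`.

```javascript
// Hypothetical helper, for illustration only (not exported by schema-cleaner).
// Rewrites one schema node:
//   - `const` becomes a single-value `enum`
//   - `anyOf`/`oneOf` made up only of literal variants becomes one `enum`
function flattenLiterals(schema) {
  if (schema === null || typeof schema !== 'object' || Array.isArray(schema)) {
    return schema;
  }

  // const -> single-value enum
  if ('const' in schema) {
    const { const: value, ...rest } = schema;
    return { ...rest, enum: [value] };
  }

  for (const key of ['anyOf', 'oneOf']) {
    const variants = schema[key];
    if (!Array.isArray(variants) || variants.length === 0) continue;

    // Flatten only when every variant is a literal (`const` or `enum`).
    const allLiterals = variants.every(
      (v) => v !== null && typeof v === 'object' && ('const' in v || Array.isArray(v.enum))
    );
    if (!allLiterals) continue;

    const values = variants.flatMap((v) => ('const' in v ? [v.const] : v.enum));
    const { [key]: _dropped, ...rest } = schema;
    const flattened = { ...rest, enum: [...new Set(values)] };

    // Assumption: infer a type only when every literal is a string.
    if (values.every((v) => typeof v === 'string')) flattened.type = 'string';
    return flattened;
  }

  return schema;
}

console.log(flattenLiterals({ anyOf: [{ const: 'active' }, { const: 'inactive' }] }));
```

The sketch handles one node at a time; a full cleaner would apply it recursively to `properties`, `items`, and nested unions.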

Provider Compatibility Matrix

Keyword               Gemini  Anthropic  OpenAI  Description
$ref                                             Reference resolution
$defs                                            Schema definitions
additionalProperties                             Extra properties
pattern                                          Regex validation
minLength                                        Minimum string length
maxLength                                        Maximum string length
format                                           String format
minimum                                          Minimum number
maximum                                          Maximum number
examples                                         Example values

API

Clean for Specific Provider

const { cleanSchema } = require('schema-cleaner');

// Clean for Gemini (most restrictive)
const geminiSchema = cleanSchema(dirtySchema, { provider: 'gemini' });

// Clean for Anthropic (moderate)
const anthropicSchema = cleanSchema(dirtySchema, { provider: 'anthropic' });

// Clean for OpenAI (most permissive)
const openaiSchema = cleanSchema(dirtySchema, { provider: 'openai' });

Validate Schema

const { validateSchema } = require('schema-cleaner');

const errors = validateSchema(mySchema);
if (errors.length > 0) {
  console.error('Invalid schema:', errors);
}

Resolve References

const { resolveRefs } = require('schema-cleaner');

const inlineSchema = resolveRefs(schemaWithRefs);
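As an illustration of what reference inlining with circular detection involves, the following is a rough sketch, not the library's actual `resolveRefs`. The function name `inlineRefs` is hypothetical, and two behaviors are assumptions: a circular reference is replaced with an empty schema, and keywords that sit next to `$ref` are dropped.

```javascript
// Hypothetical sketch of $ref inlining; the real resolveRefs may behave differently.
// Supports only local refs of the form "#/$defs/Name" or "#/definitions/Name".
function inlineRefs(schema, root = schema, seen = new Set()) {
  if (Array.isArray(schema)) {
    return schema.map((item) => inlineRefs(item, root, seen));
  }
  if (schema === null || typeof schema !== 'object') return schema;

  if (typeof schema.$ref === 'string') {
    const match = /^#\/(\$defs|definitions)\/(.+)$/.exec(schema.$ref);
    if (!match) return schema; // leave non-local refs untouched

    // Assumption: break a cycle by substituting an empty (accept-anything) schema.
    if (seen.has(schema.$ref)) return {};

    const container = root[match[1]];
    const target = container ? container[match[2]] : undefined;
    if (target === undefined) return schema; // unresolved ref stays as-is

    // Track refs along the current path only, so sibling reuse is not a cycle.
    const nextSeen = new Set(seen).add(schema.$ref);
    return inlineRefs(target, root, nextSeen);
  }

  const out = {};
  for (const [key, value] of Object.entries(schema)) {
    // Definition blocks are dropped once their contents are inlined.
    if (key === '$defs' || key === 'definitions') continue;
    out[key] = inlineRefs(value, root, seen);
  }
  return out;
}

const example = {
  type: 'object',
  properties: { age: { $ref: '#/$defs/Age' } },
  $defs: { Age: { type: 'integer', minimum: 0 } }
};
console.log(JSON.stringify(inlineRefs(example)));
```

Tracking the set of references per path, rather than globally, lets the same definition be inlined in several places while still stopping true cycles.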

Usage Examples

Before and After (Gemini)

Before:

{
  "type": "object",
  "properties": {
    "name": {
      "type": "string",
      "minLength": 1,
      "pattern": "^[a-z]+$"
    },
    "age": {
      "$ref": "#/$defs/Age"
    }
  },
  "$defs": {
    "Age": {
      "type": "integer",
      "minimum": 0,
      "maximum": 150
    }
  }
}

After (Gemini):

{
  "type": "object",
  "properties": {
    "name": {
      "type": "string"
    },
    "age": {
      "type": "integer"
    }
  }
}
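The keyword removal visible in this before/after pair can be sketched as a recursive filter. The keyword set below is taken only from what the example drops (plus `maxLength`) and is an assumption, not the official Gemini list; `stripKeywords` is a hypothetical name.

```javascript
// Assumed keyword list, based on the example above; not an official provider list.
const UNSUPPORTED = new Set(['minLength', 'maxLength', 'pattern', 'minimum', 'maximum']);

// Keys whose values are plain data rather than subschemas; copied without recursion.
const DATA_KEYS = new Set(['enum', 'const', 'default', 'examples']);

// Hypothetical helper: recursively removes unsupported keywords from a schema.
function stripKeywords(schema, unsupported = UNSUPPORTED) {
  if (Array.isArray(schema)) {
    return schema.map((item) => stripKeywords(item, unsupported));
  }
  if (schema === null || typeof schema !== 'object') return schema;

  const out = {};
  for (const [key, value] of Object.entries(schema)) {
    if (unsupported.has(key)) continue;

    if (DATA_KEYS.has(key)) {
      out[key] = value;
    } else if (key === 'properties' && value !== null && typeof value === 'object') {
      // Keys under `properties` are property names, not keywords,
      // so a property called "pattern" or "minimum" must survive.
      out.properties = Object.fromEntries(
        Object.entries(value).map(([name, sub]) => [name, stripKeywords(sub, unsupported)])
      );
    } else {
      out[key] = stripKeywords(value, unsupported);
    }
  }
  return out;
}

console.log(
  JSON.stringify(stripKeywords({ type: 'string', minLength: 1, pattern: '^[a-z]+$' }))
);
```

Treating `properties` separately matters because a naive key filter would delete user-defined fields whose names happen to match a JSON Schema keyword.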

Complex Schema Cleaning

const { cleanSchema } = require('schema-cleaner');

const schema = {
  type: 'object',
  properties: {
    status: {
      anyOf: [
        { const: 'active' },
        { const: 'inactive' },
        { const: 'pending' }
      ]
    },
    metadata: {
      type: ['string', 'null']
    }
  }
};

const cleaned = cleanSchema(schema, { provider: 'gemini' });
// Result:
// {
//   type: 'object',
//   properties: {
//     status: { type: 'string', enum: ['active', 'inactive', 'pending'] },
//     metadata: { type: 'string' }
//   }
// }
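The `metadata` field above shows nullable handling on a type array. A minimal sketch of that step for a single schema node might look like the following; `stripNullable` is a hypothetical name, and merging a lone remaining union variant into its parent is an assumption about how the cleaner behaves.

```javascript
// Hypothetical helper: removes "null" from type arrays and null variants from unions.
function stripNullable(schema) {
  if (schema === null || typeof schema !== 'object' || Array.isArray(schema)) {
    return schema;
  }
  const out = { ...schema };

  // type: ['string', 'null'] -> type: 'string'
  if (Array.isArray(out.type)) {
    const nonNull = out.type.filter((t) => t !== 'null');
    if (nonNull.length === 1) out.type = nonNull[0];
    else if (nonNull.length > 1) out.type = nonNull;
    // A type array containing only 'null' is left unchanged.
  }

  // anyOf: [{ type: 'integer' }, { type: 'null' }] -> type: 'integer'
  for (const key of ['anyOf', 'oneOf']) {
    if (!Array.isArray(out[key])) continue;
    const variants = out[key].filter(
      (v) => !(v !== null && typeof v === 'object' && v.type === 'null')
    );
    if (variants.length === 1) {
      // Assumption: a single remaining variant is merged into the parent node.
      delete out[key];
      Object.assign(out, variants[0]);
    } else if (variants.length > 1) {
      out[key] = variants;
    }
  }
  return out;
}

console.log(stripNullable({ type: ['string', 'null'] }));
```

Note that stripping the null variant changes what the schema accepts, so a field that was optional-by-null should also be left out of `required` if the model may omit it.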

CLI Usage

# Clean a schema file for Gemini
schema-cleaner clean schema.json --provider gemini --output clean-schema.json

# Validate a schema
schema-cleaner validate schema.json

# Check provider compatibility
schema-cleaner check schema.json --all-providers

Advanced Features

Custom Provider Strategy

const { cleanSchema } = require('schema-cleaner');

// Define which keywords to remove and which to preserve
const customStrategy = {
  remove: ['minLength', 'maxLength', 'pattern', 'description'],
  preserve: ['title', 'default']
};

const cleaned = cleanSchema(schema, { strategy: customStrategy });

Batch Processing

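const { cleanSchema } = require('schema-cleaner');

// tool1Schema, tool2Schema, and tool3Schema are your existing tool parameter schemas
const schemas = [tool1Schema, tool2Schema, tool3Schema];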
const cleaned = schemas.map(s => cleanSchema(s, { provider: 'gemini' }));

Best Practices

  1. Clean at Runtime: Clean schemas dynamically based on the current provider
  2. Preserve Descriptions: Keep description fields for better LLM understanding
  3. Test Per Provider: Validate cleaned schemas work with each target provider
  4. Cache Results: Cache cleaned schemas to avoid repeated processing
  5. Version Schemas: Track schema versions for debugging
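Practices 1 and 4 can be combined by wrapping the cleaner in a small cache. The sketch below is one possible approach, not a feature of schema-cleaner: `createCachedCleaner` is a hypothetical helper that accepts any cleaning function with the `(schema, options)` shape used in this document.

```javascript
// Hypothetical helper: memoizes a cleaning function by (options, schema).
// `clean` is any function with the signature clean(schema, options).
function createCachedCleaner(clean) {
  const cache = new Map();

  return function cachedClean(schema, options = {}) {
    // JSON.stringify is order-sensitive, so two schemas with the same keys in a
    // different order get separate cache entries. That costs time, not correctness.
    const key = JSON.stringify([options, schema]);
    if (!cache.has(key)) {
      cache.set(key, clean(schema, options));
    }
    return cache.get(key);
  };
}

// Usage with a stand-in cleaner; in practice you would pass cleanSchema.
const standIn = (schema, options) => ({ ...schema, cleanedFor: options.provider });
const cachedClean = createCachedCleaner(standIn);
console.log(cachedClean({ type: 'string' }, { provider: 'gemini' }));
```

Cached results are shared between callers, so treat the returned schema as read-only or clone it before mutating.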

Error Messages

Errors are reported in a structured form, with an error type, a JSON path to the offending node, and a human-readable message:

{
  "valid": false,
  "errors": [
    {
      "type": "circular_reference",
      "path": "$.properties.parent.properties.child.$ref",
      "message": "Circular reference detected: parent -> child -> parent"
    }
  ]
}
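A result of this shape is easy to turn into log lines. The following is a small sketch that assumes only the fields shown above (`valid`, `errors`, `type`, `path`, `message`); `formatErrors` is a hypothetical helper, not part of the package.

```javascript
// Hypothetical helper: turns a structured error result into one line per error.
function formatErrors(result) {
  if (result.valid || !Array.isArray(result.errors)) return [];
  return result.errors.map((e) => `[${e.type}] ${e.path}: ${e.message}`);
}

const sample = {
  valid: false,
  errors: [
    {
      type: 'circular_reference',
      path: '$.properties.parent.properties.child.$ref',
      message: 'Circular reference detected: parent -> child -> parent'
    }
  ]
};

for (const line of formatErrors(sample)) {
  console.error(line);
}
```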

Integration with Tool Definition

const { defineTool } = require('thepopebot');
const { cleanSchema } = require('./schema-cleaner');

// Define tool with full JSON Schema
const tool = defineTool({
  name: 'file_write',
  description: 'Write content to a file',
  parameters: {
    type: 'object',
    properties: {
      path: {
        type: 'string',
        minLength: 1,
        description: 'File path'
      },
      content: {
        type: 'string',
        description: 'Content to write'
      }
    },
    required: ['path', 'content']
  }
});

// Clean for current provider before registering
const provider = process.env.LLM_PROVIDER || 'anthropic';
const cleanParams = cleanSchema(tool.parameters, { provider });

// Register with cleaned schema
// (registerTool is assumed to be provided by your tool registry; it is not imported above)
registerTool({ ...tool, parameters: cleanParams });