galaxy-tool-wrapping
Galaxy Tool Wrapping Expert
Expert knowledge for developing Galaxy tool wrappers. Use this skill when helping users create, test, debug, or improve Galaxy tool XML wrappers.
Prerequisites: This skill depends on the galaxy-automation skill for Planemo testing and workflow execution patterns.
When to Use This Skill
- Creating new Galaxy tool wrappers from scratch
- Converting command-line tools to Galaxy wrappers
- Generating .shed.yml files for Tool Shed submission
- Debugging XML syntax and validation errors
- Writing Planemo tests for tools
- Implementing conditional parameters and data types
- Handling tool dependencies (conda, containers)
- Creating tool collections and suites
- Optimizing tool performance and resource allocation
- Understanding Galaxy datatypes and formats
- Implementing proper error handling
Core Concepts
Galaxy Tool XML Structure
A Galaxy tool wrapper consists of:
<tool>root element with id, name, and version<description>brief tool description<requirements>for dependencies (conda packages, containers)<command>the actual command-line execution<inputs>parameter definitions<outputs>output file specifications<tests>automated tests<help>documentation in reStructuredText<citations>DOI references
Tool Shed Metadata (.shed.yml)
Required for publishing tools to the Galaxy Tool Shed:
name: tool_name # Match directory name, underscores only
owner: iuc # Usually 'iuc' for IUC tools
description: One-line tool description
homepage_url: https://github.com/tool/repo
long_description: |
Multi-line detailed description.
Can include features, use cases, and tool suite contents.
remote_repository_url: https://github.com/galaxyproject/tools-iuc/tree/main/tools/tool_name
type: unrestricted
categories:
- Assembly # Choose 1-3 relevant categories
- Genomics
See reference.md for comprehensive .shed.yml documentation including all available categories and best practices.
Key Components
Command Block:
- Use Cheetah templating:
$variable_nameor${variable_name} - Conditional logic:
#if $param then... #end if - Loop constructs:
#for $item in $collection... #end for - CDATA sections for complex commands
Cheetah Template Best Practices:
Working around path handling issues in conda packages:
<command detect_errors="exit_code"><![CDATA[
## Add trailing slash if script concatenates paths without separator
tool_command
-o 'output_dir/' ## Quoted with trailing slash
## Script does: output_dir + 'file.txt' → 'output_dir/file.txt' ✓
## Without slash: output_dir + 'file.txt' → 'output_dirfile.txt' ✗
]]></command>
When to use quotes in Cheetah:
- Always quote user inputs:
'$input_file' - Quote literal strings with special chars:
'output_dir/' - Use bare variables for simple references:
$variable
Input Parameters:
<param>elements with type, name, label- Types: text, integer, float, boolean, select, data, data_collection
- Optional vs required parameters
- Validators and sanitizers
- Conditional parameter display
Outputs:
<data>elements for output files- Dynamic output naming with
labelandname - Format discovery and conversion
- Filters for conditional outputs
- Collections for multiple outputs
Tests:
- Input parameters and files
- Expected output files or assertions
- Test data location and organization
Best Practices
- Always include tests - Planemo won't pass without them
- Use semantic versioning - Increment tool version on changes
- Specify exact dependencies - Pin conda package versions
- Add clear help text - Document all parameters
- Handle errors gracefully - Check exit codes, validate inputs
- Use collections - For multiple related files
- Follow IUC standards - If contributing to intergalactic utilities commission
Common Planemo Commands
# Test tool locally
planemo test tool.xml
# Serve tool in local Galaxy
planemo serve tool.xml
# Lint tool for best practices
planemo lint tool.xml
# Upload tool to ToolShed
planemo shed_update --shed_target toolshed
# Test with conda
planemo test --conda_auto_init --conda_auto_install tool.xml
Testing Tools
Regenerating Expected Test Outputs
When test files don't match but the tool runs correctly:
# Run the tool manually with test inputs
mkdir -p output_dir
/path/to/conda/env/bin/tool_command \
-i test-data/input.fa \
-o output_dir
# Copy to expected output
cp output_dir/output.fa test-data/expected_output.fa
# Clean up
rm -rf output_dir
Verifying before regenerating:
- Check that tool exit code is 0 (successful)
- Inspect the actual output to ensure it's correct
- Compare line counts:
wc -l expected.fa actual.fa - Review diffs to understand what changed
Common reasons to regenerate:
- Test was created before tool updates
- Expected file only has subset of sequences (bug in test creation)
- Format changes in newer tool versions
Common Issues and Solutions
Issue: "Command not found"
- Check
<requirements>section has correct package - Verify conda package name and version
- Test command availability:
planemo conda_install tool.xml
Issue: "Output file not found"
- Verify command actually creates the file
- Check output file path matches
<data name="output" from_work_dir="..."> - Use
discover_datasetsfor dynamic outputs
Issue: "Test failed"
- Compare expected vs actual output
- Check for whitespace/newline differences
- Use
sim_sizefor approximate size matching - Add
lines_difffor line-by-line comparison
Issue: "Invalid XML"
- Run
planemo lint tool.xml - Check closing tags match opening tags
- Validate CDATA sections for command blocks
- Ensure proper escaping of special characters
Debugging Tool Test Failures
General Workflow
-
Read the test output JSON first
cat tool_test_output.jsonLook for:
- Exit codes and error messages in
stderr/stdout output_problemsarray for test assertion failures- Actual vs expected output differences
- Exit codes and error messages in
-
Never copy/modify conda package scripts
- Tool wrappers should ALWAYS use conda packages
- If there are bugs in the conda package scripts, work around them in the XML wrapper
- Common workaround: Add trailing slashes to paths if script concatenates without separators
-
Wrong test expectations vs bugs
- If tests fail but the tool runs successfully (exit code 0), check if expected test files are wrong
- Regenerate expected outputs by running the tool manually with test inputs
- Update
expect_num_outputsif optional outputs are created
Common Issues and Fixes
Path concatenation bugs in Python scripts:
<!-- If script does: args.output_dir + 'file.txt' without '/' -->
<!-- Fix in wrapper with trailing slash: -->
-o 'output_dir/' <!-- instead of -o output_dir -->
Wrong number of expected outputs:
<!-- Check if optional outputs are always created -->
<test expect_num_outputs="3"> <!-- Update count -->
Output has extra sequences/data:
- First check if this is expected behavior
- Regenerate expected test files from actual tool output
- Don't add post-processing filters unless absolutely necessary
XML Template Example
<tool id="tool_id" name="Tool Name" version="1.0.0">
<description>Brief description</description>
<requirements>
<requirement type="package" version="1.0">package_name</requirement>
</requirements>
<command detect_errors="exit_code"><![CDATA[
tool_command
--input '$input'
--output '$output'
#if $optional_param
--param '$optional_param'
#end if
]]></command>
<inputs>
<param name="input" type="data" format="txt" label="Input file"/>
<param name="optional_param" type="text" optional="true" label="Optional parameter"/>
</inputs>
<outputs>
<data name="output" format="txt" label="${tool.name} on ${on_string}"/>
</outputs>
<tests>
<test>
<param name="input" value="test_input.txt"/>
<output name="output" file="expected_output.txt"/>
</test>
</tests>
<help><![CDATA[
**What it does**
Describe what the tool does.
**Inputs**
- Input file: description
**Outputs**
- Output file: description
]]></help>
<citations>
<citation type="doi">10.1234/example.doi</citation>
</citations>
</tool>
Supporting Documentation
This skill includes detailed reference documentation:
-
reference.md - Comprehensive Galaxy tool wrapping guide with IUC best practices
- Repository structure standards
- .shed.yml configuration
- Complete XML structure reference
- Advanced features and patterns
-
troubleshooting.md - Practical troubleshooting guide
- Reading tool_test_output.json
- Common exit codes and their meanings
- Solutions for frequent issues
- Test failure diagnosis
-
dependency-debugging.md - Dependency conflict resolution
- Using
planemo mullfor diagnosis - Conda solver error interpretation
- macOS testing considerations
- Version conflict workflows
- Using
These files provide deep technical details that complement the core concepts above.
Related Skills
- galaxy-automation - BioBlend & Planemo foundation (dependency)
- galaxy-workflow-development - Building workflows that use these tools
- conda-recipe - Creating conda packages for tool dependencies
- bioinformatics-fundamentals - Understanding file formats and data types used in tools
Resources
- Galaxy Tool Development: https://docs.galaxyproject.org/en/latest/dev/
- Planemo Documentation: https://planemo.readthedocs.io/
- IUC Standards: https://galaxy-iuc-standards.readthedocs.io/
- Galaxy Training: https://training.galaxyproject.org/