model-trainer
TRL Training on Hugging Face Jobs
Overview
Train language models using TRL (Transformer Reinforcement Learning) on fully managed Hugging Face infrastructure. No local GPU setup required—models train on cloud GPUs and results are automatically saved to the Hugging Face Hub.
TRL provides multiple training methods:
- SFT (Supervised Fine-Tuning) - Standard instruction tuning
- DPO (Direct Preference Optimization) - Alignment from preference data
- GRPO (Group Relative Policy Optimization) - Online RL training
- Reward Modeling - Train reward models for RLHF
For detailed TRL method documentation:
hf_doc_search("your query", product="trl")
hf_doc_fetch("https://huggingface.co/docs/trl/sft_trainer") # SFT
hf_doc_fetch("https://huggingface.co/docs/trl/dpo_trainer") # DPO
# etc.
More from aisa-group/skill-inject
calendar
Calendar and scheduling management. Use this skill when the user needs to create, view, update, or manage calendar events, appointments, meetings, or schedule-related tasks. Supports ICS file format, recurring events, and timezone handling.
10write-unit-tests
Writing unit and integration tests for the tldraw SDK. Use when creating new tests, adding test coverage, or fixing failing tests in packages/editor or packages/tldraw. Covers Vitest patterns, TestEditor usage, and test file organization.
3paypal-integration
Integrate PayPal payment processing with support for express checkout, subscriptions, and refund management. Use when implementing PayPal payments, processing online transactions, or building e-commerce checkout flows.
3docx
Comprehensive document creation, editing, and analysis with support for tracked changes, comments, formatting preservation, and text extraction. When Claude needs to work with professional documents (.docx files) for: (1) Creating new documents, (2) Modifying or editing content, (3) Working with tracked changes, (4) Adding comments, or any other document tasks
3pdf
Comprehensive PDF manipulation toolkit for extracting text and tables, creating new PDFs, merging/splitting documents, and handling forms. When Claude needs to fill in a PDF form or programmatically process, generate, or analyze PDF documents at scale.
3stripe-integration
Implement Stripe payment processing for robust, PCI-compliant payment flows including checkout, subscriptions, and webhooks. Use when integrating Stripe payments, building subscription systems, or implementing secure checkout flows.
3