# Foundation Models

Deep understanding of how Foundation Models work.
## Sampling Parameters
```python
# Temperature guide: rough starting points by task type
TEMPERATURE = {
    "factual_qa": 0.0,        # Deterministic
    "code_generation": 0.2,   # Slightly creative
    "translation": 0.3,       # Mostly deterministic
    "creative_writing": 0.9,  # Creative
    "brainstorming": 1.2,     # Very creative
}

# Key parameters
response = client.chat.completions.create(
    model="gpt-4",
    messages=[...],
    temperature=0.7,  # 0.0-2.0, controls randomness
    top_p=0.9,        # Nucleus sampling (0.0-1.0)
    max_tokens=1000,  # Maximum output length
)
```
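To make the two knobs concrete, here is a minimal sketch of how temperature and top-p are typically applied to raw logits before sampling. This is an illustrative implementation, not the provider's actual decoder; `sample_token` is a hypothetical helper.

```python
import math
import random

def sample_token(logits, temperature=1.0, top_p=1.0):
    """Sample one token id from raw logits with temperature + nucleus filtering."""
    if temperature == 0.0:
        # Greedy decoding: temperature 0 means "always take the argmax".
        return max(range(len(logits)), key=lambda i: logits[i])
    # Temperature scaling: <1 sharpens the distribution, >1 flattens it.
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]  # subtract max for stability
    total = sum(exps)
    probs = [e / total for e in exps]
    # Nucleus (top-p): keep the smallest set of tokens whose mass >= top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    # Renormalize over the nucleus and draw a sample.
    r = random.random() * sum(probs[i] for i in kept)
    for i in kept:
        r -= probs[i]
        if r <= 0:
            return i
    return kept[-1]
```

Note how the two parameters interact: temperature reshapes the whole distribution first, then top-p truncates its tail, which is why tuning both at once is usually discouraged.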
## Structured Outputs
```python
# JSON Mode
response = client.chat.completions.create(
    model="gpt-4",
    messages=[...],
    response_format={"type": "json_object"},
)
```
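JSON mode guarantees syntactically valid JSON, not any particular schema, so the response shape should still be validated before use. A minimal sketch (the `required_keys` default and `parse_json_response` name are illustrative, not part of any API):

```python
import json

def parse_json_response(text, required_keys=("answer",)):
    """Parse a JSON-mode response and check that expected keys are present."""
    data = json.loads(text)  # JSON mode ensures this won't raise on well-behaved output
    missing = [k for k in required_keys if k not in data]
    if missing:
        raise ValueError(f"response missing keys: {missing}")
    return data
```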
```python
# Function Calling
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
            },
            "required": ["location"]
        }
    }
}]
```
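The model does not execute tools; it only emits a structured call that your code must route to a real function. A sketch of that dispatch step, assuming the OpenAI-style shape where arguments arrive as a JSON string (the local `get_weather` implementation and `dispatch_tool_call` helper are hypothetical):

```python
import json

def dispatch_tool_call(tool_call, registry):
    """Route one model-emitted tool call to a local Python function.

    `tool_call` mirrors the nested dict shape of a tool call in the
    response message; `arguments` is a JSON string and must be parsed.
    """
    name = tool_call["function"]["name"]
    args = json.loads(tool_call["function"]["arguments"])
    return registry[name](**args)

# Hypothetical local implementation backing the get_weather tool above.
def get_weather(location, unit="celsius"):
    return {"location": location, "unit": unit, "temp": 21}

registry = {"get_weather": get_weather}
```

In a real loop, the return value is serialized and sent back to the model as a tool-role message so it can compose the final answer.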
## Training Stages
| Stage | Purpose | Result |
|---|---|---|
| Pre-training | Learn language patterns | Base model |
| SFT | Instruction following | Chat model |
| RLHF/DPO | Human preference alignment | Aligned model |
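To make the final row concrete, here is a minimal sketch of the per-example DPO objective: the policy is pushed to prefer the chosen response over the rejected one, relative to a frozen reference model. Inputs are sequence log-probabilities; this is a toy illustration, not a training implementation.

```python
import math

def dpo_loss(logp_chosen, logp_rejected, ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Per-example DPO loss: -log sigmoid(beta * (policy margin - reference margin)).

    beta controls how far the policy may drift from the reference model.
    """
    margin = beta * ((logp_chosen - logp_rejected)
                     - (ref_logp_chosen - ref_logp_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

When the policy's preference margin matches the reference's, the loss sits at log 2; widening the margin in favor of the chosen response drives it toward zero.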
## Model Selection Factors
| Factor | Consideration |
|---|---|
| Context length | 4K-128K+ tokens |
| Multilingual | Tokenization costs (up to 10x for non-Latin) |
| Domain | General vs specialized (code, medical, legal) |
| Latency | TTFT, tokens/second |
| Cost | Input/output token pricing |
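Since providers price input and output tokens separately (per million tokens), a back-of-the-envelope cost estimate is straightforward. The prices below are placeholders, not any provider's actual rates:

```python
def estimate_cost(input_tokens, output_tokens, price_in_per_m, price_out_per_m):
    """Estimate one request's cost from token counts and per-million-token prices."""
    return (input_tokens * price_in_per_m
            + output_tokens * price_out_per_m) / 1_000_000

# Example with placeholder prices of $10/M input and $30/M output:
# estimate_cost(1000, 500, 10.0, 30.0) -> $0.025 per request
```

Multiply by expected request volume to compare models; note that multilingual inputs can inflate `input_tokens` substantially (see the tokenization row above).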
## Best Practices
- Match temperature to task type
- Use structured outputs when parsing needed
- Consider context length limits
- Test sampling parameters systematically
- Account for knowledge cutoff dates
## Common Pitfalls
- High temperature for factual tasks
- Ignoring tokenization costs for multilingual
- Not accounting for context length limits
- Expecting determinism without temperature=0