mlb-category-state-analyzer
MLB Category State Analyzer
Example
Scenario: Week 3, K L's Boomers (Team 5) vs. Los Doyers. Wednesday AM (mid-week). 4 scoring days remain.
Raw matchup scores (pulled from Yahoo matchup page):
| Cat | Us | Opp | Margin | Games left (us/opp) |
|---|---|---|---|---|
| R | 28 | 31 | -3 | 26 / 22 |
| HR | 9 | 7 | +2 | 26 / 22 |
| RBI | 30 | 29 | +1 | 26 / 22 |
| SB | 4 | 6 | -2 | 26 / 22 |
| OBP | .342 (82 PA) | .336 (78 PA) | +.006 | 26 / 22 GP |
| K | 42 | 38 | +4 | 9 SP starts / 7 SP starts |
| ERA | 3.80 (21 IP) | 4.12 (19 IP) | -0.32 (better) | 9 / 7 |
| WHIP | 1.18 (21 IP) | 1.25 (19 IP) | -0.07 (better) | 9 / 7 |
| QS | 2 | 1 | +1 | 9 / 7 |
| SV | 3 | 5 | -2 | ~8 RP days / ~8 RP days |
Projections built for the sim (rest-of-week {mean, stddev} — see resources/methodology.md):
| Cat | Our projection | Opp projection |
|---|---|---|
| R | final 52 ± 9 | final 57 ± 8 |
| HR | final 15 ± 3.5 | final 13 ± 3.2 |
| RBI | final 52 ± 9 | final 55 ± 8 |
| SB | final 6 ± 2.3 | final 10 ± 2.5 |
| OBP | .346 ± .015 | .341 ± .014 |
| K | final 96 ± 11 | final 85 ± 10 |
| ERA | 3.88 ± 0.40 | 4.05 ± 0.45 |
| WHIP | 1.20 ± 0.07 | 1.25 ± 0.08 |
| QS | final 6.1 ± 1.5 | final 3.8 ± 1.4 |
| SV | final 4.8 ± 1.4 | final 7.7 ± 1.5 |
Delegate to matchup-win-probability-sim with:
- `cat_list = [R, HR, RBI, SB, OBP, K, ERA, WHIP, QS, SV]`
- `cat_inverse_list = [ERA, WHIP]`
- `cat_win_threshold = 6`
- `our_per_cat_projection` / `opp_per_cat_projection` from the table above
- `sim_mode = "monte_carlo"`, `n_simulations = 10000`, `random_seed = 42`
Sim output (consumed by this skill):
- `matchup_win_probability = 0.58`
- `per_cat_win_probability`: R 0.36, HR 0.65, RBI 0.42, SB 0.16, OBP 0.60, K 0.74, ERA 0.62, WHIP 0.68, QS 0.85, SV 0.10
- `expected_cats_won = 5.18`
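As a quick consistency check on the sim output, the expected number of categories won should equal the sum of the per-category win probabilities (linearity of expectation). The sketch below assumes any tie handling is already folded into each per-category probability; the variable names simply mirror the sim's output fields.

```python
# Per-category win probabilities copied from the sim output above.
per_cat_win_probability = {
    "R": 0.36, "HR": 0.65, "RBI": 0.42, "SB": 0.16, "OBP": 0.60,
    "K": 0.74, "ERA": 0.62, "WHIP": 0.68, "QS": 0.85, "SV": 0.10,
}

# Linearity of expectation: E[cats won] = sum of P(win cat) over all cats.
# Assumption: ties are already accounted for inside each probability.
expected_cats_won = round(sum(per_cat_win_probability.values()), 2)

print(expected_cats_won)  # 5.18, matching the sim's expected_cats_won
```

If the two numbers disagree by more than rounding, the sim output was probably captured incompletely.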
Per-cat signals (derived here from sim output + baseball state — see resources/methodology.md):
| Cat | Position (from state) | Pressure (state + pace) | Reachability (= round(100 × p_cat)) | Punt Score (= f(1 − p_cat) + volatility) | Verdict |
|---|---|---|---|---|---|
| R | losing | 72 | 36 | 44 | push (contested) |
| HR | winning | 48 | 65 | 21 | maintain |
| RBI | winning (thin) | 65 | 42 | 35 | push |
| SB | losing | 55 | 16 | 58 | evaluate punt |
| OBP | winning (thin) | 70 | 60 | 24 | push |
| K | winning | 55 | 74 | 16 | push |
| ERA | winning | 62 | 62 | 23 | push |
| WHIP | winning | 60 | 68 | 19 | maintain |
| QS | winning | 78 | 85 | 9 | push hard |
| SV | losing | 38 | 10 | 84 | punt |
Overall recommendation: Push 6, maintain 2, punt 2. Matchup win prob 58% (neutral favorite).
- Push (6): HR, OBP, K, ERA, QS (each has `per_cat_win_probability` ≥ 0.60), plus RBI as the contested-but-reachable 6th.
- Maintain (2): WHIP (locked-ish), R (reachability lowish but not a true punt).
- Punt (2): SB (p = 0.16, low reach) and SV (p = 0.10 + volatility bonus → punt score 84).
Downstream implications for other agents:
- Lineup optimizer: `matchup_win_probability = 0.58` → neutral-to-favorite, standard daily_quality optimization (no variance tilt).
- Waiver analyst: prioritize SP (QS, K, ERA), OBP-heavy bats; not closers or speed specialists.
- Streaming strategist: every QS-capable SP starts; skip any 5-inning risk arm.
Workflow
Copy this checklist and track progress:
MLB Category State Analysis Progress:
- [ ] Step 1: Pull current matchup scores from Yahoo
- [ ] Step 2: Count remaining games/PAs/IP for both rosters
- [ ] Step 3: Build per-cat projection dicts ({mean, stddev}) for both rosters
- [ ] Step 4: Delegate to matchup-win-probability-sim (pass cat_list, projections, threshold=6, inverse=[ERA,WHIP])
- [ ] Step 5: Derive cat_position (from state), cat_pressure, cat_reachability, cat_punt_score from sim output
- [ ] Step 6: Rank cats and emit push/maintain/punt plan
- [ ] Step 7: Write signal file with YAML frontmatter (include matchup_win_probability from sim)
Step 1: Pull current matchup scores
Web-fetch the Yahoo matchup page: https://baseball.fantasysports.yahoo.com/b1/23756/5/matchup?week=N. Extract current totals for both teams in each of the 10 cats. For ratio cats (OBP, ERA, WHIP), also capture the denominator (PAs for OBP, IP for ERA/WHIP). This is required — you cannot build a ratio-cat projection without the volume underlying the ratio.
- 5 batting cats: R, HR, RBI, SB, OBP (+ at-bats / plate-appearances)
- 5 pitching cats: K, ERA, WHIP, QS, SV (+ innings pitched)
- Source URL cited in signal file
See resources/methodology.md for scrape procedure and fallback if Yahoo is unreachable.
Step 2: Count remaining games/PAs/IP
For each roster, count the number of MLB games its players will play for the rest of the scoring period, and project PAs (hitters) and IP (pitchers).
- Hitter games remaining: sum of (each rostered hitter's team games × probability they start)
- Pitcher starts remaining: number of scheduled SP starts for the rest of the week per roster
- Reliever days remaining: days × eligible RPs (for SV projection)
- Volume imbalance: if one team has meaningfully more games, that will show up directly in the projection means (and so in `per_cat_win_probability`)
Use MLB.com schedules + probable pitcher grids. See resources/methodology.md.
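The volume counts above reduce to simple sums. A minimal sketch, assuming a hypothetical `Hitter` record with a per-game start probability (the class, field names, and sample roster are illustrative, not part of this skill's actual data model):

```python
from dataclasses import dataclass


@dataclass
class Hitter:
    name: str
    team_games_remaining: int  # MLB games the hitter's team plays before the week ends
    start_probability: float   # chance the hitter is in the lineup for any given game


def expected_hitter_games(hitters: list[Hitter]) -> float:
    """Sum of (team games remaining x probability of starting) over the roster."""
    return sum(h.team_games_remaining * h.start_probability for h in hitters)


def reliever_days(days_remaining: int, eligible_rps: int) -> int:
    """Reliever days remaining = days x eligible RPs (input to the SV projection)."""
    return days_remaining * eligible_rps


# Hypothetical three-hitter roster, for illustration only.
roster = [
    Hitter("Hitter A", team_games_remaining=4, start_probability=0.95),
    Hitter("Hitter B", team_games_remaining=4, start_probability=0.60),
    Hitter("Hitter C", team_games_remaining=3, start_probability=1.00),
]
hitter_games = expected_hitter_games(roster)  # 3.8 + 2.4 + 3.0
rp_days = reliever_days(days_remaining=4, eligible_rps=2)
```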
Step 3: Build per-cat projection dicts
For each team, build a dict {cat: {mean, stddev}} where mean is the projected final (or remaining, consistently used across both teams — pick one convention) and stddev reflects uncertainty given remaining volume.
- Counting cats (R, HR, RBI, SB, K, QS, SV): `mean = current_total + Σ(per-player per-game rate × games remaining × daily_quality)`. `stddev ≈ 0.35 × expected_remaining` as a default CV.
- Ratio cats (OBP, ERA, WHIP): `mean = (current_ratio × current_volume + projected_remaining_ratio × remaining_volume) / total_volume`. `stddev ≈ σ_per_obs / sqrt(total_volume)`, which shrinks as total IP/PA grows.
- Both dicts have identical keys and the exact league `cat_list`.
- Use OBP (not AVG) and `qs_probability` (not W) from upstream `mlb-player-analyzer` signals (see Guardrails).
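The two projection formulas above can be sketched as small helpers. This is an illustration under stated assumptions, not the skill's actual implementation: the function names are hypothetical, it uses the "projected final" convention, and the remaining-week inputs in the usage lines (24 more runs, a .350 OBP over 100 more PA, a per-PA sigma of 0.47) are made-up values.

```python
import math


def counting_projection(current_total: float, expected_remaining: float,
                        cv: float = 0.35) -> dict:
    """Projected final for a counting cat; stddev uses the default CV of 0.35."""
    return {
        "mean": current_total + expected_remaining,
        "stddev": cv * expected_remaining,
    }


def ratio_projection(current_ratio: float, current_volume: float,
                     remaining_ratio: float, remaining_volume: float,
                     sigma_per_obs: float) -> dict:
    """Volume-weighted projected final for a ratio cat (OBP by PA, ERA/WHIP by IP)."""
    total_volume = current_volume + remaining_volume
    mean = (current_ratio * current_volume
            + remaining_ratio * remaining_volume) / total_volume
    # Uncertainty shrinks with the square root of total volume.
    return {"mean": mean, "stddev": sigma_per_obs / math.sqrt(total_volume)}


# Current R total of 28 from the example; 24 expected remaining runs is hypothetical.
r_proj = counting_projection(current_total=28, expected_remaining=24)

# Current OBP of .342 over 82 PA from the example; the rest is hypothetical.
obp_proj = ratio_projection(current_ratio=0.342, current_volume=82,
                            remaining_ratio=0.350, remaining_volume=100,
                            sigma_per_obs=0.47)
```

Whichever convention is chosen (final or remaining), both teams' dicts should be built by the same helper so the sim compares like with like.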
Step 4: Delegate to matchup-win-probability-sim
Invoke the sibling skill with a well-formed input payload:
inputs to matchup-win-probability-sim:
cat_list: [R, HR, RBI, SB, OBP, K, ERA, WHIP, QS, SV]
cat_inverse_list: [ERA, WHIP]
cat_win_threshold: 6
our_per_cat_projection: <dict from Step 3>
opp_per_cat_projection: <dict from Step 3>
sim_mode: "monte_carlo"
n_simulations: 10000
random_seed: 42
tie_rule: "half"
outputs consumed:
matchup_win_probability (float in [0,1])
per_cat_win_probability (dict[cat, float])
expected_cats_won (float)
variance_estimate (float)
- All 10 cats present in both projection dicts
- `cat_inverse_list = [ERA, WHIP]` (lower-is-better)
- `cat_win_threshold = 6` (Yahoo 10-cat majority)
- Seed passed for reproducibility
- Sim output fields captured and stored for Step 5
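The checks above are easy to enforce before the payload leaves this skill. A minimal sketch, assuming the payload is passed as a plain dict; `build_sim_payload` and `to_projection` are hypothetical helper names, and how the sibling skill is actually invoked is outside this example. The projection values are the ones from the Example tables.

```python
CAT_LIST = ["R", "HR", "RBI", "SB", "OBP", "K", "ERA", "WHIP", "QS", "SV"]


def to_projection(pairs: dict) -> dict:
    """Turn {cat: (mean, stddev)} into the {cat: {mean, stddev}} shape from Step 3."""
    return {cat: {"mean": m, "stddev": s} for cat, (m, s) in pairs.items()}


def build_sim_payload(ours: dict, opp: dict, random_seed: int = 42) -> dict:
    """Validate both projection dicts, then assemble the sim input payload."""
    for label, proj in (("our", ours), ("opp", opp)):
        if set(proj) != set(CAT_LIST):
            raise ValueError(f"{label} projection must contain exactly the league cat_list")
        for cat, p in proj.items():
            if set(p) != {"mean", "stddev"} or p["stddev"] < 0:
                raise ValueError(f"{label} projection for {cat} is malformed")
    return {
        "cat_list": CAT_LIST,
        "cat_inverse_list": ["ERA", "WHIP"],
        "cat_win_threshold": 6,
        "our_per_cat_projection": ours,
        "opp_per_cat_projection": opp,
        "sim_mode": "monte_carlo",
        "n_simulations": 10000,
        "random_seed": random_seed,
        "tie_rule": "half",
    }


ours = to_projection({"R": (52, 9), "HR": (15, 3.5), "RBI": (52, 9), "SB": (6, 2.3),
                      "OBP": (0.346, 0.015), "K": (96, 11), "ERA": (3.88, 0.40),
                      "WHIP": (1.20, 0.07), "QS": (6.1, 1.5), "SV": (4.8, 1.4)})
opp = to_projection({"R": (57, 8), "HR": (13, 3.2), "RBI": (55, 8), "SB": (10, 2.5),
                     "OBP": (0.341, 0.014), "K": (85, 10), "ERA": (4.05, 0.45),
                     "WHIP": (1.25, 0.08), "QS": (3.8, 1.4), "SV": (7.7, 1.5)})
payload = build_sim_payload(ours, opp)
```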
Step 5: Derive per-cat signals from sim output + state
Apply the formulas in resources/methodology.md. The sim owns the probability math; this skill owns the baseball-state interpretation.
- `cat_position` ∈ {winning, tied, losing}: computed locally from current totals (not from the sim). Ratio-cat direction is handled (OBP higher = winning; ERA/WHIP lower = winning).
- `cat_pressure` (0–100): simple arithmetic from position + close-margin + volume-edge + locked-in flags. See the pressure formula in Quick Reference.
- `cat_reachability` (0–100): `round(100 × per_cat_win_probability[cat])`, taken directly from the sim.
- `cat_punt_score` (0–100): `(100 × (1 − per_cat_win_probability[cat])) × 0.6 + 30 × is_volatile + 20 × below_min_threshold − 10 × has_spillover`, clamped to [0, 100].
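The two signals that need no tuning, position and reachability, can be sketched in a few lines. The function names mirror the signal names; treating exactly equal totals as "tied" is an assumption of this sketch.

```python
INVERSE_CATS = {"ERA", "WHIP"}  # lower is better


def cat_position(cat: str, ours: float, opp: float) -> str:
    """Return 'winning', 'tied', or 'losing' from current totals only (no sim input)."""
    if ours == opp:
        return "tied"
    ahead = ours < opp if cat in INVERSE_CATS else ours > opp
    return "winning" if ahead else "losing"


def cat_reachability(p_cat: float) -> int:
    """Reachability is the sim's per-cat win probability scaled to 0-100."""
    return round(100 * p_cat)


# Values from the Example: ERA 3.80 vs 4.12, R 28 vs 31, sim p(R) = 0.36.
era_position = cat_position("ERA", 3.80, 4.12)
r_position = cat_position("R", 28, 31)
r_reach = cat_reachability(0.36)
```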
Step 6: Rank and emit plan
Rank all 10 cats by cat_pressure × cat_reachability / 100:
- Top 6: push — mark these as priority for waivers, streams, starts
- Middle 2: maintain — hold position, don't overspend
- Bottom 2: evaluate punt. If `cat_punt_score > 60`, confirm punt; otherwise hold
Goal in H2H Cats is 6-of-10. A defensible plan is "push 6, concede up to 4." See resources/template.md for the output signal format.
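The ranking rule can be applied mechanically. The sketch below feeds in the pressure, reachability, and punt-score columns from the Example table; `build_plan` and its return shape are hypothetical. A purely mechanical ranking like this can differ from the verdicts shown in the Example table, so treat its output as a starting point for the plan.

```python
# cat: (cat_pressure, cat_reachability, cat_punt_score), copied from the Example table.
EXAMPLE_SIGNALS = {
    "R": (72, 36, 44), "HR": (48, 65, 21), "RBI": (65, 42, 35), "SB": (55, 16, 58),
    "OBP": (70, 60, 24), "K": (55, 74, 16), "ERA": (62, 62, 23), "WHIP": (60, 68, 19),
    "QS": (78, 85, 9), "SV": (38, 10, 84),
}


def build_plan(signals: dict, punt_cutoff: int = 60) -> dict:
    """Rank cats by pressure x reachability / 100, then split 6 / 2 / 2."""
    ranked = sorted(signals,
                    key=lambda c: signals[c][0] * signals[c][1] / 100,
                    reverse=True)
    push, maintain, bottom = ranked[:6], ranked[6:8], ranked[8:]
    # Bottom two are only confirmed punts when the punt score clears the cutoff.
    punt = [c for c in bottom if signals[c][2] > punt_cutoff]
    hold = [c for c in bottom if signals[c][2] <= punt_cutoff]
    return {"push": push, "maintain": maintain, "punt": punt, "hold": hold}


plan = build_plan(EXAMPLE_SIGNALS)
```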
Step 7: Write signal file
Write to signals/YYYY-MM-DD-cat-state.md with YAML frontmatter (type: cat-state). Include matchup_win_probability from the sim as a top-level field. Validate with mlb-signal-emitter before persisting.
- All 10 cats present with all 4 signals each
- `matchup_win_probability` and `expected_cats_won` recorded in frontmatter
- `sim_meta` block (sim_mode, n_simulations, random_seed) recorded for reproducibility
- Confidence reflects data quality (lower if Yahoo scrape was partial)
- `source_urls` includes Yahoo matchup page + MLB.com schedule pages + a reference to the sim skill
- Red-team findings noted (e.g., "Opp has a two-start ace coming that could flip K + ERA + WHIP all at once")
Validate output using resources/evaluators/rubric_mlb_category_state_analyzer.json. Minimum: average score of 3.5 or above.
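A minimal sketch of assembling the frontmatter and checking the required fields before handing the file to `mlb-signal-emitter`. It uses only the standard library and handles just the shapes this schema needs (scalars, one nested dict, one list); `render_frontmatter` and `missing_fields` are hypothetical helpers, and the date and confidence values are placeholders for illustration.

```python
REQUIRED_FIELDS = ["type", "date", "emitted_by", "week", "matchup_opponent",
                   "scoring_days_remaining", "matchup_win_probability",
                   "expected_cats_won", "sim_meta", "synthesis_confidence",
                   "source_urls"]


def missing_fields(fields: dict) -> list:
    """Names of required frontmatter fields that are absent."""
    return [name for name in REQUIRED_FIELDS if name not in fields]


def render_frontmatter(fields: dict) -> str:
    """Render a flat dict (plus one-level dicts and lists) as YAML frontmatter."""
    lines = ["---"]
    for key, value in fields.items():
        if isinstance(value, dict):
            lines.append(f"{key}:")
            lines.extend(f"  {k}: {v}" for k, v in value.items())
        elif isinstance(value, list):
            lines.append(f"{key}:")
            lines.extend(f"  - {item}" for item in value)
        else:
            lines.append(f"{key}: {value}")
    lines.append("---")
    return "\n".join(lines)


fields = {
    "type": "cat-state",
    "date": "2025-04-16",  # placeholder date, not from the source
    "emitted_by": "mlb-category-state-analyzer",
    "week": 3,
    "matchup_opponent": "Los Doyers",
    "scoring_days_remaining": 4,
    "matchup_win_probability": 0.58,
    "expected_cats_won": 5.18,
    "sim_meta": {"sim_mode": "monte_carlo", "n_simulations": 10000, "random_seed": 42},
    "synthesis_confidence": 0.8,  # placeholder confidence
    "source_urls": ["https://baseball.fantasysports.yahoo.com/b1/23756/5/matchup?week=3"],
}
assert not missing_fields(fields)
frontmatter = render_frontmatter(fields)
```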
Common Patterns
Pattern 1: Balanced mid-week state
- Typical Wednesday AM state: 3-4 cats already locked, 3-4 close, 2-3 volatile.
- Action: push the close cats hardest, coast the locked wins, ignore locked losses.
- The sim's per-cat probs already reflect this: cats with `p ∈ [0.40, 0.65]` are the contested ones.
Pattern 2: Volume-imbalanced matchup
- We have 30 hitter games left, opp has 22. Our counting-cat projection means rise; the sim's `per_cat_win_probability` for R/HR/RBI/SB rises accordingly.
- Action: stack the lineup (fewer off-days, prefer teams playing doubleheaders), bid on streamers. Pressure boost comes from the volume-edge flag, reachability boost comes automatically from the sim.
Pattern 3: Two-start ace incoming (us or them)
- One pitcher's two-start week can swing K, ERA, WHIP, QS simultaneously.
- Encode this in the projection dict: their expected IP and K rise, ERA/WHIP means improve (toward their ERA/WHIP), QS mean rises by ~0.45 per expected QS-quality start.
- The sim then shows 4 pitching cats moving together in `per_cat_win_probability` deltas.
Pattern 4: Save-category volatility
- SVs are low-frequency; one walkoff blown save flips the category.
- In the projection dict, use a low mean (≤ 2.5/week per locked closer) and moderate stddev (≥ 1.2). The sim will naturally report `per_cat_win_probability` near 0.1–0.25 when behind by 2+.
- The +30 volatility bonus in `cat_punt_score` (applied here, not in the sim) pushes SV to punt when sim reachability agrees.
Pattern 5: Ratio-cat "freeze"
- Late in the week, if opp is far below the IP/PA minimum (e.g., has 9 IP on Friday with no more starts), their ratio cats are locked at whatever they have.
- Encode by setting opp ratio-cat stddev near zero and their mean at a punitive-or-forfeited value. The sim then returns `per_cat_win_probability ≈ 1.0` for those cats.
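One way to encode the freeze before calling the sim, sketched under assumptions: the 20 IP minimum is the "usually 20 IP" figure from Guardrails, the punitive means (99.0) are arbitrary placeholders rather than values defined by this skill, and `freeze_ratio_cats` is a hypothetical helper.

```python
MIN_IP = 20.0  # assumption: weekly minimum from league config ("usually 20 IP")


def freeze_ratio_cats(projection: dict, projected_total_ip: float,
                      forfeit_means: dict, min_ip: float = MIN_IP,
                      frozen_stddev: float = 1e-6) -> dict:
    """Return a copy with ratio cats pinned to punitive means if IP falls short."""
    if projected_total_ip >= min_ip:
        return projection  # minimum will be met; leave the projection untouched
    frozen = dict(projection)  # shallow copy so the caller's dict is not mutated
    for cat, mean in forfeit_means.items():
        frozen[cat] = {"mean": mean, "stddev": frozen_stddev}
    return frozen


opp_ratio = {"ERA": {"mean": 4.05, "stddev": 0.45},
             "WHIP": {"mean": 1.25, "stddev": 0.08}}
# Opp is stuck at 9 IP with no more starts, as in the pattern above.
frozen_opp = freeze_ratio_cats(opp_ratio, projected_total_ip=9.0,
                               forfeit_means={"ERA": 99.0, "WHIP": 99.0})
```

A tiny positive stddev is used instead of exactly zero in case the sim rejects degenerate distributions; that behavior is not documented here, so adjust to whatever the sim accepts.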
Guardrails
- Never compute OBP/ERA/WHIP from rates alone; always include volume (PA/IP). A .400 OBP in 10 PAs is not better than .342 in 82 PAs. The projection-dict mean/stddev for ratio cats must come from the weighted-average formula; the sim takes those as truth.
- QS is the #1 category, not Wins. This league uses Quality Starts (6+ IP, ≤3 ER). A 5-inning outing scores zero. When projecting remaining QS, multiply each SP start by its QS probability (from `mlb-player-analyzer`'s `qs_probability` signal); don't just count scheduled starts.
- OBP is the #5 category, not AVG. Walks count. When projecting OBP contribution, use players' OBP (not AVG). A high-BB, low-AVG player like Juan Soto is worth more in this league than his raw hit rate suggests.
- SV is volatile: trust the punt when signals agree. Unlike counting batting cats, a 2-save deficit with 3 days left has low `per_cat_win_probability` regardless of roster. Don't fight for saves if the closer role on your roster isn't locked (check `save_role_certainty` < 70 → automatic punt candidate). The volatility bonus in `cat_punt_score` is applied here, not in the sim; the sim returns raw probability.
- `cat_reachability` comes from the sim; don't recompute. This is a delegation. If the sim returns `per_cat_win_probability[R] = 0.36`, then `cat_reachability[R] = 36`. Do not apply z-score shortcuts or best/worst-case buckets here; those lived in the old heuristic and are now owned by the sim skill.
- Locked-in cats get pressure adjustments, not zero. A locked-in win still has `cat_pressure ≈ 40` (it's banked). A locked-in loss still has `cat_pressure ≈ 20` (stop investing). Don't set them to zero; downstream agents use non-zero values to decide bench vs. drop.
- Ratio cats need the minimum-IP/PA rule. Yahoo enforces minimums for pitcher ratio cats (usually 20 IP for the week). If either roster is tracking below the minimum late in the week, the ratio cat may become an automatic loss. Encode this in the projection dict (stddev → 0, mean → punitive) before calling the sim, AND add the +20 `below_min_threshold` term to `cat_punt_score`.
- Never re-derive upstream signals. `qs_probability`, `sb_opportunity`, `obp_contribution`, `save_role_certainty` come from `mlb-player-analyzer`. Read them from the signal directory; do not recompute.
- Always pass a `random_seed` to the sim. Without it, two runs of this skill produce slightly different `cat_reachability` values, which will confuse downstream agents doing diff comparisons. Default seed: `42`.
Quick Reference
Where the math lives now:
| Signal | Owner | Formula |
|---|---|---|
| `cat_position` | this skill | enum from current totals (ratio-direction aware) |
| `cat_pressure` | this skill | baseline 50 + 20 × close + 15 × vol-edge − 10 × locked_win − 30 × locked_loss |
| `cat_reachability` | delegated to `matchup-win-probability-sim` | = round(100 × per_cat_win_probability[cat]) |
| `cat_punt_score` | this skill (uses sim output) | (100 × (1 − p_cat)) × 0.6 + 30 × volatile + 20 × below_min − 10 × spillover |
| `matchup_win_probability` | delegated to `matchup-win-probability-sim` | Monte Carlo P(cats_won ≥ 6) |
cat_pressure =
50 # neutral baseline
+ 20 × (is_close_margin: deficit/lead ≤ 10% of total)
+ 15 × (opponent_volume_exhausted: we have more games left)
- 10 × (locked_in_win)
- 30 × (locked_in_loss)
clamp(0, 100)
cat_reachability = round(100 × per_cat_win_probability[cat]) # from sim
cat_punt_score =
(100 - cat_reachability) × 0.6 # base: if we can't reach, consider punting
+ 30 × (cat is traditionally volatile: SV)
+ 20 × (below min-PA/IP threshold)
- 10 × (cat has spillover: K→QS, OBP→R, HR→R+RBI)
clamp(0, 100)
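The two formulas above translate directly into code. A minimal sketch with boolean flags as inputs; rounding the punt score to an integer is an assumption made here so that results line up with the integer scores in the Example table.

```python
def clamp(value: float, lo: float = 0, hi: float = 100) -> float:
    return max(lo, min(hi, value))


def cat_pressure(is_close_margin: bool, volume_edge: bool,
                 locked_in_win: bool, locked_in_loss: bool) -> float:
    """Neutral baseline of 50, adjusted by the four state flags, clamped to 0-100."""
    score = (50
             + 20 * is_close_margin
             + 15 * volume_edge
             - 10 * locked_in_win
             - 30 * locked_in_loss)
    return clamp(score)


def cat_punt_score(cat_reachability: int, is_volatile: bool,
                   below_min_threshold: bool, has_spillover: bool) -> int:
    """Base term from unreachability plus volatility / minimum / spillover adjustments."""
    score = ((100 - cat_reachability) * 0.6
             + 30 * is_volatile
             + 20 * below_min_threshold
             - 10 * has_spillover)
    return round(clamp(score))  # rounding is an assumption of this sketch


# SV from the Example: reachability 10, volatile cat -> 54 + 30 = 84.
sv_punt = cat_punt_score(10, is_volatile=True, below_min_threshold=False,
                         has_spillover=False)
# Locked-in win and loss land at the 40 and 20 levels quoted in Guardrails.
locked_win_pressure = cat_pressure(False, False, True, False)
locked_loss_pressure = cat_pressure(False, False, False, True)
```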
League constants (from context/league-config.md):
- 10 cats: R, HR, RBI, SB, OBP, K, ERA, WHIP, QS, SV
- Inverse cats: ERA, WHIP (lower-is-better; passed as `cat_inverse_list` to the sim)
- OBP (not AVG): walks matter
- QS (not W): 6+ IP with ≤3 ER
- H2H Cats, goal = win 6+ of 10 each week (`cat_win_threshold = 6`)
- Daily lineup lock; weekly matchup rolls Mon-Sun
Signal file output schema (from context/frameworks/signal-framework.md):
---
type: cat-state
date: YYYY-MM-DD
emitted_by: mlb-category-state-analyzer
week: N
matchup_opponent: <team name>
scoring_days_remaining: N
matchup_win_probability: 0.58 # from matchup-win-probability-sim
expected_cats_won: 5.18 # from matchup-win-probability-sim
sim_meta:
sim_mode: monte_carlo
n_simulations: 10000
random_seed: 42
synthesis_confidence: 0.0-1.0
source_urls:
- https://baseball.fantasysports.yahoo.com/b1/23756/5/matchup?week=N
---
Body: per-cat table + overall push/maintain/punt recommendation + red-team findings.
Thresholds used downstream:
| Agent | Threshold | Effect |
|---|---|---|
| Waiver analyst | `cat_pressure ≥ 60` | Prioritize targets that fill that cat |
| Streaming strategist | `cat_pressure` (ERA/WHIP) `< 30` | Allow riskier streamers (we're punting) |
| Lineup optimizer | `matchup_win_probability < 0.4` / `> 0.6` | Variance-seek as underdog / damp as favorite |
| Trade analyzer | weights `trade_cat_delta` | Multiplied by `cat_pressure / 50` |
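How a downstream agent might read these thresholds, as a rough sketch. The function names and the mode labels (`variance_seek`, `variance_damp`, `neutral`) are hypothetical; only the cutoffs come from the table.

```python
def lineup_variance_mode(matchup_win_probability: float) -> str:
    """Underdogs (< 0.4) seek variance, favorites (> 0.6) damp it, otherwise neutral."""
    if matchup_win_probability < 0.4:
        return "variance_seek"
    if matchup_win_probability > 0.6:
        return "variance_damp"
    return "neutral"


def waiver_priority_cats(pressure_by_cat: dict) -> list:
    """Cats the waiver analyst should prioritize: cat_pressure >= 60."""
    return [cat for cat, pressure in pressure_by_cat.items() if pressure >= 60]


def trade_weight(cat_pressure: float) -> float:
    """Multiplier applied to trade_cat_delta for a category."""
    return cat_pressure / 50


# Pressure column from the Example table.
pressure = {"R": 72, "HR": 48, "RBI": 65, "SB": 55, "OBP": 70,
            "K": 55, "ERA": 62, "WHIP": 60, "QS": 78, "SV": 38}
mode = lineup_variance_mode(0.58)          # 0.58 sits between the two cutoffs
priority = waiver_priority_cats(pressure)
qs_weight = trade_weight(pressure["QS"])
```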
Key resources:
- resources/template.md: Output signal file format, per-cat table, sim-integration worked example
- resources/methodology.md: Yahoo scrape procedure, remaining-games projection, building per-cat projection dicts, sim-integration formulas
- resources/evaluators/rubric_mlb_category_state_analyzer.json: Evaluator rubric
- Sibling skill: `matchup-win-probability-sim`, which owns per-cat and matchup-level win-probability math via Monte Carlo / Poisson-binomial
Inputs required:
- Current matchup scores (10 cats, both teams, with volume for ratio cats)
- Roster IDs for both teams
- Remaining MLB schedule through Sunday
- Upstream signals: `qs_probability`, `save_role_certainty`, `obp_contribution`, `sb_opportunity`, `daily_quality`
- League config (cats list, min-IP/PA thresholds, `cat_win_threshold`)
Outputs produced:
- `signals/YYYY-MM-DD-cat-state.md`: signal file with 10-cat table, overall plan, `matchup_win_probability`, confidence, source URLs
- `cat_position`, `cat_pressure`, `cat_reachability`, `cat_punt_score` per cat
- Overall "push N, maintain M, punt P" recommendation (N + M + P = 10, target N ≥ 6)
- `matchup_win_probability` (from sim delegate) for lineup-optimizer variance decisions