# app-reviews
Fetch and filter mobile app store reviews so you can analyze them directly.
## What this skill does
Two scripts. They are tools, not analysis:

- `fetch.mjs` — pulls reviews from Google Play or the Apple App Store into a local SQLite DB
- `evaluate.mjs` — drops junk (reviews with too little analyzable text), scores the rest by a language-neutral formula (substance + crowd validation + rating + dev reply), and emits the top N as JSON on stdout
The strategy is intentionally language-agnostic — no keyword regex, no per-language tuning. It works on English, CJK, RTL scripts, etc. uniformly. Topical categorization (pricing complaints vs. quality complaints vs. ...) is the LLM's job, not the script's.
Analysis itself is your job. Read the JSON, look for patterns, draw conclusions in the conversation. Do not call out to another LLM for this.
## Where data lives
Reviews and the product registry live in a project-local `.app-reviews/` directory:

- `products.json` — registry mapping canonical names to app IDs (user-curated)
- `reviews.db` — SQLite, auto-created on first fetch
- `.gitignore` — auto-written with `*` if the project is a git repo, so the dir is ignored
The data directory is resolved in this order:

1. `--data-dir <path>` CLI flag
2. `APP_REVIEWS_DATA_DIR` env var
3. The nearest existing `.app-reviews/` directory walking up from CWD
4. `<git_root>/.app-reviews/` if CWD is inside a git repo
5. `<CWD>/.app-reviews/` as a last resort
Run from a project root (or a subdirectory of one) and the data lands in that project. To share a registry across projects, set `APP_REVIEWS_DATA_DIR=~/.claude/data/app-reviews` (or anywhere else) in your shell profile.
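The resolution order above can be sketched as follows. This is an illustration only: `resolveDataDir` is a hypothetical helper, not the scripts' actual internals, and the real code may differ.

```javascript
import fs from "node:fs";
import path from "node:path";

// Sketch of the data-dir resolution order: CLI flag, env var, nearest
// existing .app-reviews/ walking up from CWD, git root, then CWD.
function resolveDataDir({ cliFlag, env = process.env, cwd = process.cwd(), gitRoot = null }) {
  if (cliFlag) return cliFlag;                                    // 1. --data-dir <path>
  if (env.APP_REVIEWS_DATA_DIR) return env.APP_REVIEWS_DATA_DIR;  // 2. env var
  for (let dir = cwd; ; dir = path.dirname(dir)) {                // 3. walk up from CWD
    const candidate = path.join(dir, ".app-reviews");
    if (fs.existsSync(candidate)) return candidate;
    if (dir === path.dirname(dir)) break;                         // hit filesystem root
  }
  if (gitRoot) return path.join(gitRoot, ".app-reviews");         // 4. <git_root>/.app-reviews
  return path.join(cwd, ".app-reviews");                          // 5. last resort
}
```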
The first time you run fetch or evaluate, if `products.json` does not exist, the script creates an empty template and exits 1 with a message. The user must populate it before the skill can do anything useful.
`products.json` format:

```json
{
  "tipsy": {
    "aliases": ["tt", "tipsyturbo"],
    "play": "com.tipsyturbo.app",
    "ios": "1234567890",
    "default_country": "us"
  }
}
```
- The top-level key is the canonical name. It is what the DB stores in `product_key`.
- `aliases` are alternate names. Lookup is case-insensitive. Aliases can be in any language since this file is private to the user.
- `play` and `ios` are the Google Play package name and Apple App Store numeric ID. Either may be omitted if the product is not on that platform.
- `default_country` is optional; defaults to `us`. There is intentionally no `default_lang`: Play requires `--lang` on every fetch (see Workflow), and iOS does not use a language parameter.
A reference `products.example.json` ships in this skill's directory.
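The lookup rules above (case-insensitive match on the canonical key or any alias) can be sketched like this. `resolveProduct` is hypothetical; the real scripts implement this internally.

```javascript
// Sketch: resolve a colloquial name against a products.json-shaped registry.
// Returns the canonical key, or null if nothing matches (ask the user, never guess).
function resolveProduct(registry, name) {
  const needle = name.toLowerCase();
  for (const [key, entry] of Object.entries(registry)) {
    if (key.toLowerCase() === needle) return key;
    if ((entry.aliases ?? []).some((a) => a.toLowerCase() === needle)) return key;
  }
  return null;
}
```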
## Workflow
When the user asks you to analyze app reviews:
1. **Resolve the colloquial name to a canonical product.** The scripts handle this for you, but if you want to inspect first, read the active `.app-reviews/products.json` (run any script with `--help` if you need a reminder of which directory it picks). Never invent app IDs. If the name doesn't resolve, ask the user; do not guess.

2. **Resolve `(country, hl)` for Play before fetching.** Google Play's reviews endpoint filters by `hl` (host language), not `gl`. Omitting `hl` returns a global English fallback set with no error — silently wrong data. The fetch script therefore requires `--lang` for `--platform play` and will refuse to run without it.

   Before calling fetch, derive `hl` from the target country yourself, using your general knowledge of which language is spoken there:

   - tw → `zh-TW`, hk → `zh-HK`, jp → `ja`, kr → `ko`
   - br → `pt-BR`, mx → `es-MX`, de → `de`, fr → `fr`, it → `it`
   - us / au / sg → `en`

   For multi-locale countries (CA, IN, CH, BE), ask the user which locale they want — do not pick a default. A wrong `hl` returns a misleading non-empty result, not an error.

   For `--platform ios`, pass only `--country`. Apple's reviews endpoint ignores any language parameter; the script rejects `--lang` for ios as a usage error.

   To check what data is already in the local DB for a given app+market before fetching, query `reviews.db` directly (e.g. `SELECT country, COUNT(*) FROM app_reviews WHERE product_key=? AND platform=? GROUP BY country`). There's no separate cache file — the DB is the source of truth.
3. **Fetch reviews.** Run one invocation per platform you want:

   ```
   node <skill_dir>/scripts/fetch.mjs --product <canonical> --platform play --country tw --lang zh-TW --limit 1000
   node <skill_dir>/scripts/fetch.mjs --product <canonical> --platform ios --country tw --limit 100
   ```

   Read stderr for progress (the first line prints the resolved data dir); stdout is silent on success.

   iOS soft cap: the iOS fetcher refuses `--limit > 100` unless `--force` is passed. Apple's reviews API is heavily rate-limited; pulling more than ~100 in one country routinely triggers minutes of 429 backoffs and often still fails. Default to `--limit 100` for iOS — the most recent reviews are usually enough to surface the dominant complaints. If you genuinely need more (e.g. the user explicitly asks for an exhaustive pull, or 100 isn't surfacing what you need), tell the user it'll be slow and add `--force`.
4. **Evaluate and read the JSON.** Default to a 90-day time window for "what are users saying" / "user feedback" / "recent reviews" type asks:

   ```
   node <skill_dir>/scripts/evaluate.mjs --product <canonical> --days 90
   ```

   Stdout is a JSON array of high-signal reviews. Parse it and analyze.

   When to override `--days 90`:

   - User explicitly asks for historical / overall reception → omit `--days` (the score's recency decay still mildly favors recent ones, but old impactful reviews can compete)
   - User mentions a specific release date → use `--since YYYY-MM-DD` with that date
   - User wants very fresh signal ("this week", "the last update") → `--days 7` or `--days 30`
   - Low-volume product where 90 days yields too few reviews → widen the window or drop `--days`

   `--since` and `--days` are mutually exclusive.
`<skill_dir>` is the directory containing this SKILL.md.
## evaluate output
Reviews are first filtered by a junk floor (drops near-empty reviews like single emojis or one-word ratings — controlled by `--min-bytes`, default 15), then ranked by a language-neutral score that favors substantive text, `helpful_count`, negative ratings, dev replies, and recency. The top `--top` (default 300) are emitted. See `_lib/signals.mjs` for the exact formula.
`--since YYYY-MM-DD` is a hard cutoff applied before ranking — use it when you know a release date and only want feedback on the current build.
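As an illustration of how those signals might combine, here is a sketch. The real formula lives in `_lib/signals.mjs` and almost certainly differs; every name and weight below is a placeholder.

```javascript
// Illustrative scoring sketch only — NOT the formula in _lib/signals.mjs.
function scoreReview(r, now = Date.now()) {
  const substance = Math.log1p(r.meaningful_bytes);            // favors substantive text
  const crowd = Math.log1p(r.helpful_count);                   // crowd validation
  const severity = r.rating == null ? 0 : (5 - r.rating) / 4;  // 1-star outranks 5-star
  const devReply = r.reply_content ? 1 : 0;                    // a dev reply marks impact
  const ageDays = (now - Date.parse(r.reviewed_at)) / 86_400_000;
  const recency = Math.exp(-ageDays / 180);                    // mild recency decay
  return substance + crowd + severity + devReply + recency;    // placeholder weights
}
```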
JSON output schema — each element of the array:

| Field | Type | Notes |
|---|---|---|
| `review_key` | string | Unique key `<platform>:<app_id>:<country>:<review_id>` |
| `platform` | `"play"` or `"ios"` | |
| `country` | string | Lowercase 2-letter code |
| `lang` | string | Lowercase. Set on Play rows (e.g. `zh-tw`); empty string on iOS rows, since Apple's endpoint does not filter by language. |
| `rating` | int 1-5 or null | |
| `title` | string or null | iOS reviews have titles; Play reviews don't |
| `content` | string or null | The body |
| `helpful_count` | int | Up-vote / thumbs-up count |
| `reviewed_at` | ISO 8601 string | |
| `app_version` | string or null | |
| `reply_content` | string or null | Developer's reply, if any (Play only) |
| `meaningful_bytes` | int | Substance metric used by floor and score |
| `score` | number | Sort key (descending). Formula above. |
Sort order in the array is descending by score. Bucketing by topic (pricing, quality, etc.) is your job once you read the JSON — the script does not pre-categorize.
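Once the array is parsed, a quick mechanical grouping pass can help before topical bucketing. For example, grouping by `app_version` (field name from the schema above; the helper itself is hypothetical, not part of the skill) surfaces version-specific regressions:

```javascript
// Sketch: bucket evaluate output by app_version to spot releases that
// attract disproportionate feedback.
function byVersion(reviews) {
  const buckets = new Map();
  for (const r of reviews) {
    const key = r.app_version ?? "unknown";
    if (!buckets.has(key)) buckets.set(key, []);
    buckets.get(key).push(r);
  }
  return buckets;
}
```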
## Adding a new alias
If the user refers to an existing product by a name not in the registry (e.g. they said "the drinking app" and you figured out they meant `tipsy`), edit the active `.app-reviews/products.json` and append to the relevant `aliases` array. Edit the user's data file — never edit the skill's `products.example.json`.
## Error handling
| stderr signature | Meaning | Action |
|---|---|---|
| `created empty products.json at ...` | First-time setup | Tell the user to populate it; exit |
| `unknown product "X". Known products: ...` | Name didn't match | Pick from the listed products, or ask the user |
| `product "X" has no <play\|ios> app id configured` | Platform not configured | Skip that platform or ask the user to add it |
| `Google Play rejected the reviews request with PlayGatewayError` | Google rate-limited or blocked | Do not retry. Report to the user; suggest waiting |
| `Apple App Store reviews request failed with HTTP <status>` | Apple endpoint returned non-2xx (rare outside 429) | Report to the user |
| `Apple App Store reviews request failed: 429 retries exhausted` | Apple rate-limited beyond the script's backoff budget | Do not retry immediately. Wait, or try a different country |
## When the upstream API drifts
Both Play and Apple endpoints are reverse-engineered. They occasionally change shape without notice.
If you see any of these, read `references/repairing-scrapers.md` before assuming a skill bug:

- A fetch error ending with `See references/repairing-scrapers.md`
- A `health:summary` in fetch's stderr showing a key field at near-zero non-null count (e.g. `content=0/100` after a non-trivial fetch)
- evaluate output that looks structurally wrong (e.g. rating all null, content mostly empty)
You diagnose and propose a unified diff; you do not apply it. The user applies it.
## What NOT to do
- Do not invent app IDs or product names. Lookup is deterministic via `products.json`. If a name isn't there, ask the user.
- Do not pipe `fetch` stderr into analysis. It's progress reporting, not data.
- Do not re-run `evaluate` with different `--top` values to "get more". Set `--top` once. The selection is deterministic given the inputs.
- Do not analyze the raw `app_reviews` table directly. Always go through `evaluate` so noise is filtered. (If you really need raw data for a specific deep-dive, fine, but the default path is evaluate-then-analyze.)
- Do not auto-retry on `PlayGatewayError`. Retries trigger more aggressive rate-limiting. Wait.
- Do not commit `.app-reviews/` to any repo. It's user-private state. The skill auto-writes a `.gitignore` inside the directory when it detects a git repo, but verify before committing.