ecommerce
Ecommerce — Product & Market Research via chrome-bridge MCP
This skill is for research, not shopping. Extract prices, variants, reviews, ratings, and seller info to compare offerings across stores. Do not add to cart, check out, mutate wishlists, message sellers, or submit reviews. If the user phrases a task in a way that requires a state-changing action, confirm explicitly before doing it.
Builds on data-extraction (extractor priority order) and navigation (multi-tab orchestration). Read those first; this skill specializes them for product pages.
How this skill is laid out
Each capability has a focused JS file under scripts/. To use one:
1. Read the script file to load it as a string.
2. Pass the contents as the code argument to mcp__chrome-bridge__execute_script.
3. The script returns a structured object; see the header comment at the top of each file for the exact return shape.
Generic scripts (pdp_dom_fallback.js, listing_cards.js, reviews_extract.js, seller_info.js) carry generic selectors — adapt the right-hand side per site after one inspection pass.
Capability map
| Task | Pre-condition | Script | Notes |
|---|---|---|---|
| PDP: meta tag fallback | page_schema() returned [] | scripts/pdp_meta_tags.js | OG / product:* meta tags. |
| PDP: find inline JSON island | meta tags also empty | scripts/json_island_finder.js | Looks for __NEXT_DATA__ first, then any <script type="application/json"> with "price". Drill into the result; don't return the whole blob. |
| PDP: last-resort DOM fallback | schema + meta + JSON islands all empty | scripts/pdp_dom_fallback.js | Generic h1 / .price / breadcrumb; adapt per site. |
| Listing cards (search / category) | listing rendered | scripts/listing_cards.js | Sponsored slots filtered. Generic selectors. |
| Infinite-scroll step | listing with .product-card items | scripts/infinite_scroll_step.js | Returns current count; loop until plateau. |
| Reviews: scroll into view | on a PDP | scripts/reviews_scroll.js | Wait ~2s, then extract. |
| Reviews: extract up to 20 | after reviews_scroll.js + 2s | scripts/reviews_extract.js | Generic selectors. |
| Price string parser | schema lacks priceCurrency | scripts/price_normalize.js | Returns { symbol, numeric }. Symbols: A-Z, $, ₱, ฿, ₫, S$, RM, Rp. |
| Lazada full PDP read | on Lazada PDP, waited ~5s | scripts/lazada_module_data.js | Reads window.__moduleData__.data.root.fields, because Lazada's JSON-LD lacks price/rating. |
| Lazada compare slim summary | on Lazada PDP, waited ~5-6s | scripts/lazada_compare_summary.js | Trimmed { price, currency, rating, review_count } for cross-store rows. |
| Marketplace seller card | on a marketplace PDP | scripts/seller_info.js | Generic selectors. Adapt per platform. |
Decision tree on a product page
1. page_schema(): most modern stores embed full JSON-LD Product. One call, often everything you need.
2. Inline <script> JSON islands: __NEXT_DATA__, store-specific blobs (__moduleData__ on Lazada, redux state on some Shopify themes). Pull via json_island_finder.js.
3. Custom DOM extractor: when neither of the above carries the field you need (price badge, stock pill, breadcrumb category).
Skip page_article() on PDPs — it's tuned for editorial content and returns junk on product pages.
Pattern: Product detail page (PDP)
Schema-first. The Product shape carries 80% of what you need:
navigate(url='https://www.example.com/p/widget-9000')
# wait ~3s for ready (poll document.readyState if SPA)
page_schema()
# → { schemas: [{ "@type": "Product",
# name, brand, sku, mpn, gtin13, image,
# offers: { "@type": "Offer", price, priceCurrency, availability, priceValidUntil },
# aggregateRating: { ratingValue, reviewCount } }] }
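The "poll document.readyState" step can be sketched as a small polling helper. This is a minimal sketch, not one of the skill's scripts: wait_until_ready and its get_ready_state callable are hypothetical names, and the callable stands in for whatever tool call returns document.readyState for the tab.

```python
import time
from typing import Callable


def wait_until_ready(
    get_ready_state: Callable[[], str],
    timeout_s: float = 10.0,
    interval_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until the page reports readyState == 'complete' or the timeout passes."""
    deadline = clock() + timeout_s
    while True:
        if get_ready_state() == "complete":
            return True
        if clock() >= deadline:
            return False  # caller decides: wait longer or fall through a tier
        sleep(interval_s)


# Stand-in for the real tool call (hypothetical): a scripted sequence of states.
_states = iter(["loading", "interactive", "complete"])
ready = wait_until_ready(lambda: next(_states), sleep=lambda _s: None)
```

In practice get_ready_state would wrap an execute_script call that evaluates document.readyState. On heavy SPAs "complete" can arrive before the product data is rendered, so treat a True result as a lower bound and keep the fixed waits suggested elsewhere in this skill.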
When schema is missing, fall through:
# Tier 2: meta tags
execute_script(code=<contents of scripts/pdp_meta_tags.js>)
# Tier 3: inline JSON island
blob = execute_script(code=<contents of scripts/json_island_finder.js>)
# Drill into blob.data; don't return the whole thing.
# Tier 4: DOM scrape (last resort)
execute_script(code=<contents of scripts/pdp_dom_fallback.js>)
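The tier order above can also be expressed as a merge loop that stops as soon as the required fields are filled. This is a sketch under assumptions: extract_with_fallback is a hypothetical helper, each tier is a zero-argument callable standing in for a tool call, and the field names (name, price, currency) are illustrative rather than the scripts' documented return shape.

```python
from typing import Any, Callable, Dict, List, Sequence, Tuple

Extractor = Callable[[], Dict[str, Any]]


def _present(value: Any) -> bool:
    """Treat None and empty containers/strings as missing; 0 counts as present."""
    return value not in (None, "", [], {})


def extract_with_fallback(
    tiers: Sequence[Tuple[str, Extractor]],
    required: Sequence[str] = ("name", "price", "currency"),
) -> Tuple[Dict[str, Any], List[str]]:
    merged: Dict[str, Any] = {}
    used: List[str] = []
    for label, fetch in tiers:
        if all(_present(merged.get(key)) for key in required):
            break  # everything needed is filled; skip the costlier tiers
        data = fetch() or {}
        # Earlier tiers win: only fill fields that are still missing.
        fresh = {k: v for k, v in data.items() if _present(v) and k not in merged}
        if fresh:
            merged.update(fresh)
            used.append(label)
    return merged, used


calls: List[str] = []


def tier(label: str, payload: Dict[str, Any]) -> Tuple[str, Extractor]:
    def fetch() -> Dict[str, Any]:
        calls.append(label)  # record which tiers actually ran
        return payload
    return label, fetch


product, used = extract_with_fallback([
    tier("schema", {}),
    tier("meta_tags", {"name": "Widget 9000", "price": None}),
    tier("json_island", {"price": "12.50", "currency": "MYR"}),
    tier("dom_fallback", {"name": "should not be needed"}),
])
```

Recording which tiers contributed makes it easy to report where each field came from, which matters when a DOM-scraped price is less trustworthy than a schema one.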
Pattern: Search results / category listings
Enumerate cards, not raw <a> tags — the card scope lets you keep title/price/rating together:
execute_script(code=<contents of scripts/listing_cards.js>)
Adapt the card selector inside the script if your site doesn't use [data-product-id], .product-card, li.product-item.
Pattern: Pagination
URL-driven (preferred — parallelizable across tabs):
# 5 pages in parallel
ids = [tabs_create(url=f'https://store.com/cat/widgets?page={n}').id for n in range(1, 6)]
# wait ~4s for all to render
results = [execute_script(tab_id=i, code=<contents of scripts/listing_cards.js>) for i in ids]
for i in ids: tabs_close(i)
"Load more" / infinite scroll (sequential — same tab):
prev_count = 0
for _ in range(10):  # cap iterations
    n = execute_script(code=<contents of scripts/infinite_scroll_step.js>)
    if n == prev_count: break  # nothing new loaded
    prev_count = n
    # wait ~2s for the next batch to render
Stop when item count plateaus or you hit a sane cap. Don't scroll forever.
Pattern: Reviews & ratings
Aggregate first — usually free in page_schema().aggregateRating. For per-review enumeration, the review block is almost always lazy-loaded; scroll it into view, then extract:
execute_script(code=<contents of scripts/reviews_scroll.js>)
# wait ~2s
execute_script(code=<contents of scripts/reviews_extract.js>)
If reviews are paginated, dom_click the "Next" button between batches; don't infinite-scroll the global page.
Pattern: Price normalization
Always pair the numeric price with its currency. Currencies you'll see in the SE Asia / global mix:
| Symbol | ISO | Notes |
|---|---|---|
| RM | MYR | Lazada/Shopee MY |
| S$ | SGD | Singapore stores |
| ₱ | PHP | Philippines |
| Rp | IDR | Indonesia (no decimals usually) |
| ฿ | THB | Thailand |
| ₫ | VND | Vietnam (no decimals) |
| $ | ambiguous | Use priceCurrency from schema if available, otherwise infer from TLD |
Prefer offers.priceCurrency from JSON-LD over symbol-sniffing. When you must parse a string:
execute_script(code=<contents of scripts/price_normalize.js>)
Sale vs original: look for a strikethrough sibling (<s>, <del>, .original-price, .was-price). Capture both — the discount % is meaningful research signal.
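For post-processing price strings outside the page, the symbol table and the sale-vs-original rule can be sketched in Python. This is not the contents of price_normalize.js: parse_price and discount_pct are hypothetical helpers, and the separator handling is a heuristic (a lone separator followed by exactly three digits is read as thousands grouping), so flag uncertain results rather than trusting them blindly.

```python
import re
from typing import Optional, Tuple

# Symbol -> ISO code, from the table above. "$" is left out on purpose: it is ambiguous.
SYMBOL_TO_ISO = {"RM": "MYR", "S$": "SGD", "₱": "PHP", "Rp": "IDR", "฿": "THB", "₫": "VND"}


def _to_number(digits: str) -> float:
    last_dot, last_comma = digits.rfind("."), digits.rfind(",")
    if last_dot != -1 and last_comma != -1:
        # Both kinds present: whichever comes last is the decimal separator.
        decimal_sep = "." if last_dot > last_comma else ","
    else:
        sep = "." if last_dot != -1 else ("," if last_comma != -1 else "")
        tail = digits.rsplit(sep, 1)[-1] if sep else ""
        grouping = not sep or digits.count(sep) > 1 or len(tail) == 3
        decimal_sep = "" if grouping else sep
    whole, frac = digits.rsplit(decimal_sep, 1) if decimal_sep else (digits, "")
    whole = whole.replace(".", "").replace(",", "")
    return float(f"{whole}.{frac}") if frac else float(whole)


def parse_price(text: str) -> Optional[Tuple[Optional[str], float]]:
    """Return (iso_currency_or_None, amount), or None when no number is found."""
    by_length = sorted(SYMBOL_TO_ISO.items(), key=lambda kv: -len(kv[0]))
    currency = next((iso for sym, iso in by_length if sym in text), None)
    match = re.search(r"\d[\d.,]*", text)
    if match is None:
        return None
    return currency, _to_number(match.group(0).rstrip(".,"))


def discount_pct(original: float, sale: float) -> Optional[float]:
    """Percent off, one decimal place; None when the pair is not a real discount."""
    if original <= 0 or sale > original:
        return None
    return round((original - sale) / original * 100, 1)
```

A currency of None means the string carried no unambiguous symbol; fall back to offers.priceCurrency or the TLD and say so in the output.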
Pattern: Variant inspection (read-only)
Don't click each variant in a loop. On Shopee/Lazada that's a logged interaction (rate-limit + analytics noise). The full variant matrix usually lives in the page's inline JSON:
# Check page_schema first — many stores expose offers[] as an array of variants
page_schema()
# → offers: [ { sku, color, size, price, availability }, ... ]
If schema only carries one offer, dig into the inline JSON island (json_island_finder.js) and pull the variant matrix from there. Only fall back to clicking swatches if both are empty, and even then click each at most once.
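Turning a schema Product into variant rows, and deciding when to fall back to the inline JSON island, might look like the sketch below. The helper names are hypothetical, and the field list simply mirrors the offers[] shape shown above; real stores may name variant attributes differently.

```python
from typing import Any, Dict, List

VARIANT_FIELDS = ("sku", "color", "size", "price", "availability")


def variant_rows(product: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten Product.offers into one row per variant; missing fields become None."""
    offers = product.get("offers") or []
    if isinstance(offers, dict):
        offers = [offers]  # a single Offer object instead of an array
    return [
        {field: offer.get(field) for field in VARIANT_FIELDS}
        for offer in offers
        if isinstance(offer, dict)
    ]


def needs_json_island(product: Dict[str, Any]) -> bool:
    """True when schema exposes at most one offer, so the matrix must come from inline JSON."""
    return len(variant_rows(product)) <= 1


# Illustrative data only, not taken from a real store.
sample = {
    "@type": "Product",
    "offers": [
        {"sku": "W9-RED-M", "color": "red", "size": "M", "price": "12.50", "availability": "InStock"},
        {"sku": "W9-BLU-L", "color": "blue", "size": "L", "price": "13.00"},
    ],
}
rows = variant_rows(sample)
```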
Pattern: Cross-store price comparison
Multi-tab parallel — see navigation for tab orchestration mechanics. See the worked Shopee + Lazada example below for the exact flow.
Pattern: Seller / merchant info
For marketplace research (not retail-store research), the seller card matters as much as the product:
execute_script(code=<contents of scripts/seller_info.js>)
Selectors vary per platform — adapt the right-hand side after one inspection pass.
Anti-bot for ecommerce
Symptoms:
- page_schema() returns [] and document.title is "Just a moment…" (check via execute_script("(()=>document.title)()")) → Cloudflare interstitial.
- Page asks you to "press and hold" a button → CAPTCHA. Stop and tell the user; don't try to defeat it.
- You get redirected to a different country code (shopee.com.my → shopee.sg) → geo / IP routing. Note it; the data you collect is for the wrong locale.
- HTTP 429 / 503 from navigate → rate-limited. Back off, don't retry immediately.
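The symptom checks above can be folded into one classifier that runs after every navigation. This is a sketch: classify_block is a hypothetical helper, its inputs are values you would already have collected (title, URLs, optional status and page text), and treating any host mismatch as a geo redirect is a simplifying assumption.

```python
from typing import Optional
from urllib.parse import urlparse


def _host(url: str) -> str:
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def classify_block(
    title: str,
    requested_url: str,
    landed_url: str,
    status: Optional[int] = None,
    body_text: str = "",
) -> Optional[str]:
    """Return a symptom label, or None when the page looks normal."""
    if status in (429, 503):
        return "rate_limited"  # back off, don't retry immediately
    if title.strip().lower().startswith("just a moment"):
        return "cloudflare_interstitial"
    if "press and hold" in body_text.lower():
        return "captcha"  # stop and tell the user
    if _host(requested_url) != _host(landed_url):
        return "geo_redirect"  # data is for the landing locale, not the requested one
    return None
```

Any non-None label is a reason to pause the run and report, not to retry harder.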
Mitigations:
- Slower cadence than social. 10–15s between PDPs, not 3–5s. Marketplaces are tougher than social on burst access.
- Human-shaped flow beats deep-linking. Search → click result is less suspicious than 50 direct PDP loads in 30s.
- Stay on one bridge per store run — switching IPs mid-session is itself a flag.
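One way to implement the cadence is a jittered delay that doubles after each consecutive rate-limit response. The 10–15s range comes from the guidance above; the doubling and the 300s cap are assumptions of this sketch, and next_delay is a hypothetical helper.

```python
import random
from typing import Optional, Tuple


def next_delay(
    rate_limit_hits: int = 0,
    base: Tuple[float, float] = (10.0, 15.0),
    cap_s: float = 300.0,
    rng: Optional[random.Random] = None,
) -> float:
    """Seconds to wait before the next PDP load."""
    if rng is None:
        rng = random.Random()
    delay = rng.uniform(*base)  # jitter so requests are not evenly spaced
    return min(cap_s, delay * 2 ** rate_limit_hits)
```

Reset rate_limit_hits to 0 after the first successful load so the run returns to the normal cadence.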
Platform: Shopee
URL shape: https://shopee.<tld>/<title>-i.<shop_id>.<item_id> — the trailing i.<shop>.<item> is the canonical product key.
- page_schema() returns a complete Product schema with offers.price, offers.priceCurrency, aggregateRating, and a nested offers.seller (with the seller's own aggregateRating, ratingCount). One call gets you the headline price + product rating + seller rating, the cleanest of the SE Asia marketplaces. Use it.
- Heavy SPA. Wait ≥5s after navigate before extracting; poll document.readyState for safety.
- Region-sticky: shopee.com.my from a non-MY egress IP may render a redirect or a different DOM. If your bridge tunnel egresses from the wrong region, the prices and currency will silently be for that other region; verify priceCurrency against the TLD you requested.
- The schema's brand field is often empty (Shopee mostly hosts third-party sellers, not brand-managed listings). Don't treat empty brand as an extraction failure.
- The "Add to cart", "Chat now", "Buy now", "Like" buttons are off-limits. Don't dom_click them under any phrasing.
Platform: Lazada
URL shape: https://www.lazada.<tld>/products/<slug>-i<item_id>.html.
- JSON-LD Product is embedded but incomplete: it returns name, brand, sku, mpn, image, and offers.availability (as AggregateOffer), but no price, no priceCurrency, no aggregateRating. For those fields you must read window.__moduleData__ via scripts/lazada_module_data.js. Don't assume page_schema() alone is sufficient on Lazada.
- The variant matrix is Object.keys(f.skuInfos), where f is window.__moduleData__.data.root.fields; each entry has its own price, quantity, image. To enumerate all SKUs with their prices, iterate the keys (the script returns the active SKU only; for the full matrix, modify the script to map over all keys).
- Other useful field roots under f: seller, specifications, Breadcrumb, warranties, deliveryOptionsInfo. Probe with Object.keys(f) if you need something not listed.
- The skuBase path some older templates used is gone; don't reach for it.
- LazMall badge on the seller card = vetted brand store; meaningful research signal for "is this an authorized seller".
- Less aggressive bot detection than Shopee, but the same 10–15s cadence is wise.
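The URL shapes given for Shopee and Lazada can be turned into a product-key parser, which is handy for deduplicating rows across runs. This is a sketch that follows only the two shapes documented here and assumes the IDs are numeric; product_key is a hypothetical helper, and URLs with any other shape return None.

```python
import re
from typing import Dict, Optional
from urllib.parse import urlparse

SHOPEE_KEY = re.compile(r"-i\.(\d+)\.(\d+)$")   # ...-i.<shop_id>.<item_id>
LAZADA_KEY = re.compile(r"-i(\d+)\.html$")      # ...-i<item_id>.html


def product_key(url: str) -> Optional[Dict[str, str]]:
    """Extract the canonical product key from a Shopee or Lazada PDP URL."""
    parsed = urlparse(url)
    host = f".{(parsed.hostname or '').lower()}"
    if ".shopee." in host:
        m = SHOPEE_KEY.search(parsed.path)
        return {"store": "shopee", "shop_id": m.group(1), "item_id": m.group(2)} if m else None
    if ".lazada." in host:
        m = LAZADA_KEY.search(parsed.path)
        return {"store": "lazada", "item_id": m.group(1)} if m else None
    return None
```

Query strings are ignored because only the path is matched, so tracking parameters do not change the key.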
Critical rules
- Read-only. No add-to-cart, no checkout, no wishlist, no review submission, no following sellers, no "chat with seller". If the user asks for any of these, confirm explicitly first.
- Schema-first. Only fall through when page_schema() returns [] or lacks the field you need; don't burn execute_script calls when JSON-LD is right there.
- Don't click variant buttons in a loop. Pull the matrix from inline JSON.
- Stop on CAPTCHA. Tell the user; don't try to defeat it.
- Slower cadence than social. 10–15s between PDPs.
- Always pair price with currency — never report a bare number.
- Close tabs when done.
Common failures
| Signal | Cause | Fix |
|---|---|---|
| page_schema() returns [] | Site has no JSON-LD or it's CSR-injected late | Wait 3–5s more; fall through to pdp_meta_tags.js / json_island_finder.js |
| Price field present but no currency | Symbol-only render, schema missing priceCurrency | Infer from TLD or the symbol map; flag uncertainty in output |
| Variant matrix only has the selected SKU | Schema only exposes the active offer | Pull from __NEXT_DATA__ (json_island_finder.js) or __moduleData__ (lazada_module_data.js) instead |
| Sponsored cards leak into listing extract | Card selector matched ads | listing_cards.js already filters [data-ad] / "Sponsored"; broaden the filter inside the script if your platform uses different markers |
| __moduleData__ is undefined on Lazada | Template variant: different namespace | Object.keys(window).filter(k => k.includes('Data')) to find it |
| Lazada page_schema() returned but offers.price is missing | Lazada uses AggregateOffer without price; full price/rating live in __moduleData__ | Run lazada_module_data.js |
| Lazada f.skuBase is undefined | Older skill paths reference skuBase; current Lazada uses skuInfos keyed by skuId | lazada_module_data.js already uses skuInfos |
| Shopee Product.brand is empty string | Shopee mostly hosts third-party listings without brand metadata | Treat as expected, not a failure; brand often inferable from title |
| Reviews list is empty even after scroll | Reviews behind a "Show reviews" tab | dom_click the reviews tab first, then run reviews_scroll.js |
| navigate returns but document.title is "Just a moment…" | Cloudflare interstitial | Stop; tell the user; do not retry in a tight loop |
| Geo-redirected to a different TLD | IP egresses from another country | Note the actual landing URL; the data is for that locale, not the requested one |
| Numeric price parses as NaN | Comma-as-decimal locale (e.g. 1.299,00) | Detect locale by symbol/TLD; swap separators before parseFloat |
| Whole __NEXT_DATA__ returned and context blows up | Returned the raw blob | Drill into it server-side and return only the fields you need |
Example: compare a product across Shopee MY and Lazada MY
Shopee yields everything from page_schema; Lazada needs a second call into __moduleData__ for price + rating.
shopee_id = tabs_create(url='https://shopee.com.my/<title>-i.<shop>.<item>').id
lazada_id = tabs_create(url='https://www.lazada.com.my/products/<slug>-i<item>.html').id
# wait ~6s for both SPAs
# --- Shopee: schema-only is enough ---
s = page_schema(tab_id=shopee_id)
sp = next((x for x in (s.get('schemas') or []) if x.get('@type') == 'Product'), {})
sp_offers = sp.get('offers') or {}
sp_rating = sp.get('aggregateRating') or {}
shopee_summary = {
    'title': sp.get('name'),
    'price': sp_offers.get('price'),
    'currency': sp_offers.get('priceCurrency'),
    'rating': sp_rating.get('ratingValue'),
    # aggregateRating may carry ratingCount or reviewCount; accept either
    'review_count': sp_rating.get('ratingCount') or sp_rating.get('reviewCount'),
}
# --- Lazada: schema for title/brand, __moduleData__ for price + rating ---
l = page_schema(tab_id=lazada_id)
lp = next((x for x in (l.get('schemas') or []) if x.get('@type') == 'Product'), {})
lazada_extra = execute_script(tab_id=lazada_id, code=<contents of scripts/lazada_compare_summary.js>)
lazada_summary = {
    'title': lp.get('name'),
    'price': lazada_extra.get('price'),
    'currency': 'MYR' if lazada_extra.get('currency') == 'RM' else lazada_extra.get('currency'),
    'rating': lazada_extra.get('rating'),
    'review_count': lazada_extra.get('review_count'),
}
result = {'shopee': shopee_summary, 'lazada': lazada_summary}
tabs_close(shopee_id); tabs_close(lazada_id)
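Once both summaries exist, the comparison itself is a small pure function. The sketch below assumes prices are numbers or plain numeric strings and refuses to compare rows whose currencies differ, in line with the rule that a price is never reported without its currency. cheapest is a hypothetical helper, and the sample figures are illustrative, not real listings.

```python
from typing import Any, Dict, Optional


def cheapest(result: Dict[str, Dict[str, Any]]) -> Optional[Dict[str, Any]]:
    """Pick the lowest-priced store, or None when rows are not comparable."""
    priced = {
        store: row
        for store, row in result.items()
        if row.get("price") is not None and row.get("currency")
    }
    currencies = {row["currency"] for row in priced.values()}
    if len(priced) < 2 or len(currencies) != 1:
        return None  # missing data or mixed currencies: report rows separately
    ranked = sorted(priced.items(), key=lambda kv: float(kv[1]["price"]))
    (low_store, low), (_, high) = ranked[0], ranked[-1]
    return {
        "store": low_store,
        "price": float(low["price"]),
        "currency": low["currency"],
        "saving": round(float(high["price"]) - float(low["price"]), 2),
    }


sample_result = {
    "shopee": {"title": "Widget 9000", "price": "129.00", "currency": "MYR"},
    "lazada": {"title": "Widget 9000", "price": 135.5, "currency": "MYR"},
}
best = cheapest(sample_result)
```

A None result is itself worth reporting: it usually means a geo redirect changed one store's currency or an extraction tier came back empty.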