skills/aradotso/trending-skills/opencli-rs-web-scraper

opencli-rs-web-scraper

SKILL.md

opencli-rs Web Scraper & Site Fetcher

Skill by ara.so — Daily 2026 Skills collection.

opencli-rs is a single 4.7MB Rust binary that fetches real-time data from 55+ websites (Twitter/X, Reddit, YouTube, HackerNews, Bilibili, Zhihu, Xiaohongshu, and more) with one command. It reuses your browser session via a Chrome extension, supports AI-native API discovery, controls Electron desktop apps (Cursor, ChatGPT, Notion), and passes through to external CLIs (gh, docker, kubectl). Up to 12x faster and 10x less memory than the Node.js original.


Installation

One-line (macOS / Linux)

curl -fsSL https://raw.githubusercontent.com/nashsu/opencli-rs/main/scripts/install.sh | sh

Windows (PowerShell)

Invoke-WebRequest -Uri "https://github.com/nashsu/opencli-rs/releases/latest/download/opencli-rs-x86_64-pc-windows-msvc.zip" -OutFile opencli-rs.zip
Expand-Archive opencli-rs.zip -DestinationPath .
Move-Item opencli-rs.exe "$env:LOCALAPPDATA\Microsoft\WindowsApps\"

Build from Source

git clone https://github.com/nashsu/opencli-rs.git
cd opencli-rs
cargo build --release
cp target/release/opencli-rs /usr/local/bin/

Chrome Extension (required for Browser-mode commands)

  1. Download opencli-rs-chrome-extension.zip from GitHub Releases
  2. Extract to any directory
  3. Open chrome://extensions → enable Developer modeLoad unpacked → select the extracted folder
  4. Extension auto-connects to the opencli-rs daemon

AI Agent Skill Install

npx skills add https://github.com/nashsu/opencli-rs-skill

Command Modes

Mode Requirement Examples
Public Nothing — calls public APIs directly hackernews, devto, arxiv, wikipedia
Browser Chrome + extension running twitter, bilibili, reddit, zhihu
Desktop Target Electron app running cursor, chatgpt, notion, discord

Key Commands

Global Flags

opencli-rs --help                        # list all commands
opencli-rs <site> --help                 # site-specific help
opencli-rs <site> <cmd> --format json    # output: table | json | yaml | csv | markdown
opencli-rs <site> <cmd> --limit 20       # limit results
opencli-rs doctor                        # run diagnostics
opencli-rs completion bash >> ~/.bashrc  # shell completions (bash/zsh/fish)

Public Commands (no browser needed)

# Hacker News
opencli-rs hackernews top --limit 10
opencli-rs hackernews search "rust async" --limit 5
opencli-rs hackernews user pg

# arXiv
opencli-rs arxiv search "large language models" --limit 10
opencli-rs arxiv paper 2303.08774

# Wikipedia
opencli-rs wikipedia summary "Rust programming language"
opencli-rs wikipedia search "memory safety"
opencli-rs wikipedia random

# Stack Overflow
opencli-rs stackoverflow hot
opencli-rs stackoverflow search "tokio async runtime"

# Dev.to
opencli-rs devto top
opencli-rs devto tag rust

# Linux.do
opencli-rs linux-do hot
opencli-rs linux-do search "Rust"

Browser Commands (Chrome extension required)

# Twitter/X
opencli-rs twitter search "rust lang" --limit 10
opencli-rs twitter trending
opencli-rs twitter timeline
opencli-rs twitter bookmarks
opencli-rs twitter profile elonmusk
opencli-rs twitter post "Hello from opencli-rs!"
opencli-rs twitter follow rustlang

# Bilibili
opencli-rs bilibili hot --limit 20
opencli-rs bilibili search "Rust教程"
opencli-rs bilibili ranking
opencli-rs bilibili feed
opencli-rs bilibili download <video_url>

# Reddit
opencli-rs reddit frontpage
opencli-rs reddit popular
opencli-rs reddit subreddit rust
opencli-rs reddit search "async await"
opencli-rs reddit upvote <post_id>

# Zhihu
opencli-rs zhihu hot
opencli-rs zhihu search "Rust 内存安全"
opencli-rs zhihu question 12345

# Xiaohongshu
opencli-rs xiaohongshu search "旅行攻略"
opencli-rs xiaohongshu feed
opencli-rs xiaohongshu user <user_id>

# YouTube
opencli-rs youtube search "rust tutorial"
opencli-rs youtube video <video_id>
opencli-rs youtube transcript <video_id>

# Weibo
opencli-rs weibo hot
opencli-rs weibo search "科技"

# Douban
opencli-rs douban top250
opencli-rs douban search "三体"
opencli-rs douban movie-hot

# Medium
opencli-rs medium search "rust programming"
opencli-rs medium feed
opencli-rs medium user graydon_hoare

# Xueqiu (stock)
opencli-rs xueqiu hot-stock
opencli-rs xueqiu search "茅台"
opencli-rs xueqiu stock SH600519

Desktop App Control (Electron)

# Cursor IDE
opencli-rs cursor status
opencli-rs cursor send "Refactor this function to use async/await"
opencli-rs cursor read
opencli-rs cursor ask "What does this code do?"
opencli-rs cursor screenshot
opencli-rs cursor extract-code
opencli-rs cursor model

# ChatGPT Desktop
opencli-rs chatgpt new
opencli-rs chatgpt send "Explain Rust lifetimes"
opencli-rs chatgpt read

# Notion
opencli-rs notion search "project notes"
opencli-rs notion read <page_id>
opencli-rs notion write <page_id> "New content"
opencli-rs notion sidebar

# Discord Desktop
opencli-rs discord-app channels
opencli-rs discord-app send "Hello team"
opencli-rs discord-app read

External CLI Passthrough

# GitHub CLI
opencli-rs gh repo list
opencli-rs gh pr list
opencli-rs gh issue create --title "Bug report"

# Docker
opencli-rs docker ps
opencli-rs docker images
opencli-rs docker logs my-container

# Kubernetes
opencli-rs kubectl get pods
opencli-rs kubectl get services -n production
opencli-rs kubectl logs <pod-name>

AI Discovery Commands

# Explore a website's API surface
opencli-rs explore https://example.com

# Auto-detect authentication strategies
opencli-rs cascade https://api.example.com/data

# Auto-generate an adapter (YAML pipeline)
opencli-rs generate https://example.com --goal "hot posts"

# Synthesize adapter from discovered API
opencli-rs synthesize https://news.example.com

Output Formats

# Default: ASCII table
opencli-rs hackernews top --limit 5

# JSON — great for piping to jq
opencli-rs hackernews top --limit 5 --format json | jq '.[].title'

# YAML
opencli-rs hackernews top --format yaml

# CSV — for spreadsheets
opencli-rs hackernews top --format csv > hn_top.csv

# Markdown — for docs
opencli-rs hackernews top --format markdown

Declarative YAML Pipeline (Custom Adapters)

Add new site adapters without writing Rust code. Create a YAML file describing the scraping pipeline:

# ~/.config/opencli-rs/adapters/my-site.yaml
name: my-site
description: Fetch top posts from my-site
base_url: https://api.my-site.com
commands:
  top:
    description: Get top posts
    endpoint: /v1/posts/top
    method: GET
    params:
      limit:
        flag: --limit
        default: 10
        query_param: count
    response:
      items_path: $.data.posts
      fields:
        - name: title
          path: $.title
        - name: url
          path: $.url
        - name: score
          path: $.points
        - name: author
          path: $.author.name
# Use the custom adapter
opencli-rs my-site top --limit 20
opencli-rs my-site top --format json

Real-World Patterns

Pipe JSON output to jq for filtering

# Get only titles from HN top stories
opencli-rs hackernews top --limit 20 --format json | jq -r '.[].title'

# Get Twitter trending topics as plain list
opencli-rs twitter trending --format json | jq -r '.[].name'

# Find Bilibili videos with >1M views
opencli-rs bilibili ranking --format json | jq '[.[] | select(.view > 1000000)]'

Shell scripting

#!/bin/bash
# Daily digest script

echo "=== HackerNews Top 5 ==="
opencli-rs hackernews top --limit 5 --format table

echo ""
echo "=== Bilibili Trending ==="
opencli-rs bilibili hot --limit 5 --format table

echo ""
echo "=== Zhihu Hot ==="
opencli-rs zhihu hot --limit 5 --format table

Export to file

opencli-rs reddit popular --format csv > reddit_$(date +%Y%m%d).csv
opencli-rs hackernews top --format json > hn_top.json

Use in Rust project via subprocess

use std::process::Command;
use serde_json::Value;

fn fetch_hn_top(limit: u32) -> anyhow::Result<Vec<Value>> {
    let output = Command::new("opencli-rs")
        .args(["hackernews", "top", "--limit", &limit.to_string(), "--format", "json"])
        .output()?;

    let json: Vec<Value> = serde_json::from_slice(&output.stdout)?;
    Ok(json)
}

fn fetch_twitter_search(query: &str) -> anyhow::Result<Vec<Value>> {
    let output = Command::new("opencli-rs")
        .args(["twitter", "search", query, "--format", "json"])
        .output()?;

    if !output.status.success() {
        let err = String::from_utf8_lossy(&output.stderr);
        anyhow::bail!("opencli-rs error: {}", err);
    }

    let json: Vec<Value> = serde_json::from_slice(&output.stdout)?;
    Ok(json)
}

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let stories = fetch_hn_top(10)?;
    for story in &stories {
        println!("{}", story["title"].as_str().unwrap_or(""));
    }
    Ok(())
}

AI Agent integration (AGENT.md / .cursorrules)

# AGENT.md

## Available Tools
Run `opencli-rs list` to discover all available commands.

### Web Data Fetching
- `opencli-rs hackernews top --format json` — HN top stories
- `opencli-rs twitter search "<query>" --format json` — Twitter search
- `opencli-rs arxiv search "<topic>" --format json` — Research papers
- `opencli-rs reddit subreddit <name> --format json` — Subreddit posts

### Local CLI Tools
- `opencli-rs gh <args>` — GitHub operations
- `opencli-rs docker <args>` — Docker operations
- `opencli-rs kubectl <args>` — Kubernetes operations

Register custom tools: `opencli-rs register mycli`

Configuration

Config file location

~/.config/opencli-rs/config.toml   # macOS / Linux
%APPDATA%\opencli-rs\config.toml   # Windows

Custom adapter directory

# ~/.config/opencli-rs/config.toml
adapter_dir = "~/.config/opencli-rs/adapters"
default_format = "table"
default_limit = 20

Register a custom external CLI

# Register your own CLI tool for passthrough
opencli-rs register mycli

# Now use it via opencli-rs
opencli-rs mycli --help
opencli-rs mycli some-command --flag value

Troubleshooting

Run diagnostics first

opencli-rs doctor

Chrome extension not connecting

# Check extension is loaded at chrome://extensions
# Verify Developer mode is ON
# Reload the extension after reinstalling opencli-rs
# Check daemon is running:
opencli-rs doctor

Browser command returns empty / auth error

  • Open Chrome and ensure you are logged in to the target site
  • The extension reuses your existing browser session — no tokens needed
  • Try refreshing the target site tab, then retry the command

Binary not found after install

# Verify install location
which opencli-rs
ls /usr/local/bin/opencli-rs

# Add to PATH if missing
export PATH="/usr/local/bin:$PATH"
echo 'export PATH="/usr/local/bin:$PATH"' >> ~/.zshrc

Build errors from source

# Ensure Rust toolchain is up to date
rustup update stable
rustup target add aarch64-apple-darwin   # macOS Apple Silicon
cargo build --release

Command slow or timing out

  • Browser commands require Chrome + extension; without it they will hang
  • Run opencli-rs doctor to verify extension connectivity
  • Public commands (hackernews, arxiv, etc.) need no browser and run in ~1-2s

Windows path issues

# Verify binary location
Get-Command opencli-rs
# If not found, ensure the directory is in $env:PATH
$env:PATH += ";$env:LOCALAPPDATA\Microsoft\WindowsApps"

Supported Sites Reference

Category Sites
Tech News HackerNews, Dev.to, Lobsters, Linux-do
Social Twitter/X, Reddit, Facebook, Instagram, TikTok, Jike
Video YouTube, Bilibili, Weixin
Chinese Zhihu, Xiaohongshu, Weibo, Douban, Xueqiu, Weread, Sinablog, Sinafinance
Research arXiv, Hugging Face
Finance Yahoo Finance, Barchart, Xueqiu
Jobs Boss, LinkedIn
Reading Medium, Substack, Wikipedia, BBC, Bloomberg, Reuters
Shopping Steam, SMZDM, Ctrip, Coupang
AI/Desktop Cursor, ChatGPT, Codex, Doubao, ChatWise, Notion, Discord
Podcast Apple Podcasts, Xiaoyuzhou
External CLI gh, docker, kubectl, obsidian, readwise, gws
Weekly Installs
16
GitHub Stars
11
First Seen
1 day ago
Installed on
opencode16
gemini-cli16
deepagents16
antigravity16
github-copilot16
codex16