Paper Digest

Single-paragraph summaries optimized for social sharing. Insight over information.

Structure

  1. Context: What's the problem?
  2. Insight: What did they realize that others missed?
  3. Solution: How does insight → method? (should feel natural)
  4. Evidence: Concrete comparison showing it works

Then: Implication line + 📎 arXiv link

Key Rules

  • Explain like reader is smart but unfamiliar with the domain
  • Use concrete examples/analogies (e.g., "acts as a garbage bin" >> "focuses on a specific token")
  • Show cause-and-effect chains explicitly
  • Compare/contrast with alternatives ("X failed while Y succeeded")
  • Bold 2-4 key concepts
  • Match user's language (Korean/English)

Example

Input: arXiv 2601.15380

Output:

A Transformer's attention decides which tokens to look at and how much; this paper reinterprets softmax attention as the solution to an optimization problem called **Entropic Optimal Transport (EOT)**. The insight this perspective yields: attention computation hides an implicit uniform prior that says "every position is equally important." Why is that a problem? LLMs show the attention sink phenomenon, where the first token receives enormous attention regardless of its meaning. Softmax must output probabilities that sum to 1, so when a query has no token it genuinely needs, it still needs somewhere to "dump" attention; under a uniform prior, implementing this forces the first token's key vector to also carry the structural signal "I am the garbage bin", wasting the representational capacity of a key that should express only semantic content. Because the EOT interpretation exposed the problem, the fix follows naturally: replace the uniform prior with a learnable one. The paper's proposed GOAT separates "the baseline importance of each position" into its own learnable term, so key vectors carry pure semantics while the prior handles position. In experiments, existing methods collapsed sharply beyond the training length, while GOAT maintained retrieval performance even on long contexts.

Implication: The EOT perspective exposes attention's hidden assumption and opens up the design freedom to change it: attention sink is a byproduct of the uniform prior, and modeling the prior explicitly resolves it.

📎 https://arxiv.org/abs/2601.15380
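
For context on the idea described in the example above, here is a minimal numpy sketch of the contrast between standard softmax attention (implicit uniform prior over positions) and attention with a learnable per-position log-prior added to the logits. The function names and the `log_prior` term are illustrative assumptions for this sketch only, not the paper's actual GOAT formulation.

```python
# Minimal sketch: uniform-prior softmax attention vs. a learnable
# per-position log-prior that carries "baseline importance" so key
# vectors can stay purely semantic. Illustration only; not GOAT itself.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v, log_prior=None):
    """q: (Lq, d); k, v: (Lk, d); log_prior: (Lk,) or None (= uniform prior)."""
    scores = q @ k.T / np.sqrt(q.shape[-1])   # content-based logits
    if log_prior is not None:
        scores = scores + log_prior            # per-position baseline importance
    return softmax(scores, axis=-1) @ v

rng = np.random.default_rng(0)
q, k, v = rng.normal(size=(4, 8)), rng.normal(size=(6, 8)), rng.normal(size=(6, 8))
uniform_out = attention(q, k, v)               # implicit uniform prior
learned_log_prior = rng.normal(size=6)         # would be trained in practice
prior_out = attention(q, k, v, log_prior=learned_log_prior)
```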

Avoid

  • Jargon without intuition
  • Findings without comparison to alternatives
  • Method description without motivation (only "here is what they did", never "why they did it")

Multiple Papers

When summarizing multiple papers:

  • Lead with the unifying theme/problem
  • Contrast what each paper realized differently
  • Synthesize implications across papers

Language

Match the user's language (Korean/English). Maintain the same insight-first structure regardless of language.
