{
  "@context": "https://schema.org",
  "@type": "BlogPosting",
  "headline": "The RRF Top-n Playbook: How to Get Cited by ChatGPT’s Web Mode (k≈60)",
  "description": "The RRF Top-n Playbook: How to Get Cited by ChatGPT’s Web Mode (k≈60)",
  "datePublished": "2025-09-11T00:00:00.000Z",
  "dateModified": "2025-11-02T00:00:00.000Z",
  "url": "https://metehan.ai/blog/the-rrf-top-n-playbook-how-to-get-cited-by-chatgpts-web-mode/",
  "category": "featured-research",
  "tags": [],
  "image": "/wp-content/uploads/2025/09/RRF-1.png",
  "wordCount": 1770,
  "readTime": "9 min",
  "articleBody": "*I've been quiet for two weeks, and I appreciate all the questions you've sent. Thank you for your patience and interest. I'm excited to share that I joined [AEOVision](https://aeovision.ai), an AI search optimization company, and I'm thrilled to be working with our new clients. Thanks to everyone who reached out with congratulations and support. This post is trying to align with RAG, mostly.*\n\n> This post examines a mathematical approach for Reciprocal Rank Fusion (RRF), it's experimental work that I'm still refining. **While the theoretical top-60 threshold is most suitable for RRF calculations, ChatGPT doesn't return a strict 60 results every time. In my testing with the AYIMA plugin, I've observed it returning anywhere from 38 to 65 results depending on the query, it's highly dynamic.** (Shared AYIMA results at the end) I'm sharing my notes and research to get more ideas from our lovely community.\n\nGiven this variability, targeting the top 30 seems more promising and achievable. Any competent SEO professional can realistically target top-30 positions. For the calculations and examples in this post, I've used positions beyond 100 to account for RRF scenarios, including LLMs' deep research features. When LLMs engage their deep research functionality, they're fetching well over 60 results dynamically. This post attempts to present a practical approach that accounts for these real-world variations. And it's important to find LLMs' subqueries.\n\n--EXPERIMENTAL--\n\n*If LLMs fuse multiple SERP slices, your goal isn't just \"domain authority.\" It's simple: **make the fused top-60**. Here's a predictive formula, thresholds, and a step-by-step plan you can run on your own SERP scrapes to force inclusion.*\n\n## TL;DR\n\n- ChatGPT-style web agents often fuse **multiple sub-queries** with **Reciprocal Rank Fusion (RRF)** using **k≈60**. 
[I wrote about RRF here.](https://metehan.ai/blog/chatgpt-is-using-reciprocal-rank-fusion-rrf/)\n- Theory: If your URL's **fused score** hits **τ = 0.020**, you almost always land in the **final top-60 citation pool**. // IF ChatGPT retrieves at least 60 results.\n- Practically: **show up in ≥2 lists inside top-40** or **≥3 lists inside top-90**. // For the Deep Research feature\n- Targeting the top 30 mostly works.\n- Topic clusters are essential.\n\n## 1) The Fusion Math I'm Working On\n\nI'm working with my clients on their prompts to understand how to be cited \"almost\" every time and how to land at the top of the reranking process. Assume the browser agent runs **M** sub-queries (call it a query fan-out) and fuses with **RRF**:\n\n$latex S(d)=\\sum_{i \\in I(d)} \\frac{w_i}{k + r_i(d)} \\quad \\text{with } k=60,\\; w_i \\approx 1$\n\n- $latex r_i(d)$ = your page's **1-based rank** in sub-query $latex i$ (∞ if absent)\n- $latex k=60$ dampens rank gaps; beyond rank ~60 the contribution is small\n- $latex S(d)$ = your fused score used to sort pages; the **top 60** are kept/citable\n\nBecause you don't know competitors in advance, set a **safe inclusion threshold** $latex \\tau$ that beats common patterns:\n\n- A page **once at #1** → $latex 1/(60+1)=0.01639$\n- A page **twice, at #40 & #50** → $latex 1/100+1/110\\approx 0.01909$\n\nA reliable fixed target is:\n\n$latex \\boxed{\\tau = 0.020}$\n\nHit $latex S(d)\\geq 0.020$ and you're very likely in the fused **top-60** across typical fan-outs.\n\n***Considerations:** users' personal embeddings, semiotics (for prompting), fewer than 60 results, and the reranking process; citations' rankings change dynamically even if you send the same prompt. 
I'm getting better results if I target the top 30.*\n\n## 2) Your \"Be-Cited\" Rule (Plug-and-Play)\n\nYou're in the citation pool if:\n\n$latex \\boxed{\\sum_{i \\in I(d)} \\frac{1}{60 + r_i(d)} \\;\\;\\geq\\;\\; 0.020}$\n\nReading this with equal weights (if we also want to target the deep research feature, which fetches more than 60 results):\n\n- **2× top-40**: $latex 2/(60+40)=2/100=0.020$ ✅\n- **3× top-90**: $latex 3/(60+90)=3/150=0.020$ ✅\n- **1× #1 + 1× top-80**: $latex 1/61+1/140\\approx 0.0235$ ✅\n- **4× top-140**: $latex 4/(60+140)=4/200=0.020$ ✅\n\nUniform \"top-R\" shortcut:\n\n$latex m\\geq\\lceil\\tau (60+R)\\rceil$\n\n- Can reach **R=40**? → need **m=2** sub-queries\n- Only **R=90**? → need **m=3**\n- Best **R=140**? → need **m=4**\n\n**Bottom line:** **Appear in ≥2 lists inside top-40** or **≥3 lists inside top-90** and you've crossed the line.\n\n## 3) How to Operationalize (End-to-End)\n\n### A) Feed Your Topic Clusters (What LLMs Likely Hit)\n\nCreate **8–16** sub-queries around the head intent (adapt per niche):\n\n- Head + year (\"best X 2025\"), Head + \"top\", Head + \"review(s)\"\n- Head + \"price(s)\", Head + \"compare\"/\"vs\", Head + \"near me\" (local)\n- Singular/plural/entity variants; modifiers (cheap, premium, eco, beginner, pro)\n- PAA-shaped questions (\"What is…\", \"How to choose…\", \"Is X worth it…\")\n\n### B) Scrape Top-N for Each Sub-Query\n\nPull the **top 60–100**. Log your **rank $latex r_i$** (∞ if absent). Deeper ranks won't move the needle.\n\n### C) Compute Your Fused Score\n\nFor your URL $latex d$:\n\n$latex S(d)=\\sum_{i:\\, r_i < \\infty} \\frac{1}{60 + r_i}$\n\n### D) If $latex S(d) < 0.020$, Close the Gap\n\nEither:\n\n- Lift **two variants to top-40**, or\n- Lift **three variants to top-90**, or\n- Pair **one #1–#3** with **one ≤80**\n\nThis is a tiny greedy knapsack: fix the **cheapest lifts** first (where you're already close).\n\n## 4) On-Page \"Rank-in-Multiple-Lists\" Engineering (Fast Wins)\n\nYou're not optimizing \"authority\" here; you're optimizing **multi-presence** across the fan-out.\n\n1. 
**Title/H1 & lede n-grams.** Mirror dominant SERP n-grams across your chosen variants (e.g., \"best ___ in 2025\", \"top ___ for beginners\"). This helps win **several** lists at once.\n2. **Subhead packing.** Add compact sections aligned to each variant: - `H2: Best ___ for Beginners (2025)` - `H2: ___ Price & Value` - `H2: ___ vs Alternatives` - `H2: How to Choose ___` They give search engines precise anchors without bloat.\n3. **PAA-style answer blocks.** One-paragraph, definition-style blocks that **exactly** match question forms (\"Is ___ worth it?\", \"How much does ___ cost?\"). These capture Q&A variants.\n4. **Year & freshness tokens.** Use the current year in the **title, H1, lede, and table captions** where natural. Rotate minor content (tables, FAQs) routinely.\n5. **Edge snippets.** Add a small **comparison table** and **pricing table**. These often surface for \"vs\" and \"price\" variants—two quick extra lists.\n\n## 5) Minimal Implementation (Sheet or Code)\n\nGiven ranks $latex r_i$ for sub-queries $latex i=1..M$:\n\n```python\nranks = [12, 38, None, 74]  # your 1-based rank per sub-query (None = absent)\nS = sum(1 / (60 + r) for r in ranks if r is not None)\nIN_TOP_60 = S >= 0.020\n# If False, lift the easiest variants until S >= 0.020\n```\n\n**Greedy lift helper (human-in-the-loop):** Sort variants by how close you are to the cutoffs (40, 90, 140). Improve those first, then recompute $latex S$.\n\n## Quick Reference (k=60, τ=0.020)\n\n| Your appearances | Max rank (each) | Guaranteed $latex S$ |\n| --- | --- | --- |\n| 2× | ≤ 40 | 0.0200 |\n| 2× | ≤ 45 | 0.01905 *(add a tiny extra)* |\n| 2× | ≤ 50 | 0.01818 *(one more small appearance)* |\n| 3× | ≤ 90 | 0.0200 |\n| 4× | ≤ 140 | 0.0200 |\n| 1× #1 + 1× ≤ 80 | — | ≈0.0235 |\n\n> Prefer dynamic targets? Set $latex \\tau$ to the **actual 60th fused score** you observe in your own SERP cloud. 
The rule is unchanged: **hit $latex S(d)\\geq\\tau$**.\n\n## Bottom Line\n\n- At citation time, LLMs **select from the fused top-K**—they're not evaluating your topical authority.\n- Engineer **multi-presence**: make your page appear across **multiple sub-queries** so that\n\n$latex \\sum \\frac{1}{60+r_i} \\;\\;\\geq\\;\\; 0.020$\n\n- The most efficient universal target: **2 appearances inside top-40** (or **3 inside top-90**). Do that, and you will likely land in the **final citation list**.\n\n## ChatGPT & AYIMA Experiment\n\n![](/wp-content/uploads/2025/09/ayima-rrf.png)\n\nUsing the AYIMA ChatGPT extension to track RAG sources, I tested how many web results ChatGPT actually retrieves for different queries. The results were eye-opening for anyone hoping to get AI citations from lower search rankings.\n\n## The Raw Data for Some Broad Queries\n\nHere's what ChatGPT retrieved for the example broad queries below:\n\n1. **\"latest ai news\"** → 65 sources\n2. **\"latest tech news\"** → 52 sources\n3. **\"latest sport news\"** → 38 sources\n4. **\"trending sneaker brands 2025\"** → 60 sources\n\n**ChatGPT retrieved 38-65 total sources, varying by query type.**\n\n## The Reality Check for Lower Rankings\n\nIf you're ranking around position 60 in search results, here's the brutal truth:\n\n### Best Case Scenario\n\nFor queries retrieving 60+ sources:\n\n- If it's a single query pulling all 60+ results, rank 60 barely makes the cut\n- You'd be literally the last or near-last result\n- Your RRF score: ≈0.00833 (minimal)\n\n## Why This Matters for RRF Scoring\n\nRRF (Reciprocal Rank Fusion) only scores what it retrieves. If you're not in the retrieval window, you get zero score. 
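This retrieval-window arithmetic can be sketched in a few lines of Python (a minimal illustration assuming k=60 and the τ = 0.020 target from earlier; `rrf_score` is a hypothetical helper for checking your own scrapes, not anyone's production pipeline):

```python
# Assumed constants from this post: RRF damping k = 60, inclusion target tau = 0.020.
K = 60
TAU = 0.020

def rrf_score(ranks):
    """Fused score: sum 1/(K + r) over the sub-query lists where the page
    appears. None marks a list the page was absent from (contributes 0)."""
    return sum(1 / (K + r) for r in ranks if r is not None)

print(rrf_score([60]))             # retrieved dead last in one list: 1/120 ≈ 0.00833
print(rrf_score([None]))           # not retrieved at all: score is 0
print(rrf_score([40, 40]) >= TAU)  # two top-40 appearances cross the threshold
```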
No score means no chance of citations.\n\nThe math is simple:\n\n- Retrieved at rank 60: 1/(60+60) = 0.00833 (terrible, but something)\n- Not retrieved: 0 (game over)\n\n## The Pattern I Observed\n\nDifferent query types get different retrieval depths:\n\n- **Broad topics** (sports): Least retrieval (38)\n- **Standard topics** (tech): Moderate retrieval (52)\n- **Specific queries** (sneakers): Higher retrieval (60)\n- **Technical topics** (AI): Maximum retrieval (65)\n\n*I wonder if ChatGPT has some boost multipliers for some topics, just like [Perplexity](https://metehan.ai/blog/perplexity-ai-seo-59-ranking-patterns/).*\n\nBut even with maximum retrieval, rank 60 is either invisible or dead last.\n\n## What Actually Works\n\nForget trying to win from rank 60 everywhere. Instead:\n\n### Find Your Strong Spots\n\nIdentify 2-3 query variations where you rank in the top 20:\n\n- Long-tail keywords\n- Specific niches\n- Less competitive angles\n\n### Do the Math, It's Fun\n\nTo reach a competitive RRF score of 0.02:\n\n- You need just 2 appearances at rank 15: 2 × 0.0133 = 0.0266 ✓\n- Or 3 appearances at rank 20: 3 × 0.0125 = 0.0375 ✓\n\n### Build Topic Clusters\n\nCreate multiple pages targeting different angles. Some will rank high enough to be retrieved, while others won't. The cumulative effect is what counts.\n\n## The Bottom Line\n\n**Rank 60 is effectively invisible to ChatGPT's web search.**\n\nWith only 38-65 sources retrieved per search, and those likely split across multiple queries, you need to rank in roughly the **top 10-20** to have any chance of being included in AI-generated responses.\n\nThe winning strategy isn't improving from rank 60 to rank 55. It's finding specific queries where you can crack the top 20.\n\n## Takeaway for SEOs and Content Creators\n\nStop obsessing over marginal improvements in average rankings. 
Start identifying and dominating specific query variations where you can achieve top-20 positions.\n\nIn the age of AI search, it's better to rank #10 for three specific queries than #60 for everything.\n\n*Methodology: Tests were conducted using the AYIMA ChatGPT extension to track RAG source retrieval. Results show the total sources retrieved across all queries ChatGPT runs for each search task. The actual query structure (single vs. multiple) cannot be definitively determined, but multiple queries are most likely, based on standard IR practice.*\n\n## What next?\n\nI'm working on LLMs' reranking process and plan to publish a new post.\n\nUseful sources:\n\n1. [http://cormack.uwaterloo.ca/cormacksigir09-rrf.pdf#:~:text=results%20of%2030%20configurations%20of,particular%20sets%20were%20selected%20because](http://cormack.uwaterloo.ca/cormacksigir09-rrf.pdf#:~:text=results%20of%2030%20configurations%20of,particular%20sets%20were%20selected%20because)\n2. [https://www.adelean.com/en/blog/20250417_hybrid_reranking/](https://www.adelean.com/en/blog/20250417_hybrid_reranking/)\n3. [https://medium.com/@danushidk507/rag-vii-reranking-with-rrf-d8a13dba96de](https://medium.com/@danushidk507/rag-vii-reranking-with-rrf-d8a13dba96de)",
  "author": {
    "@type": "Person",
    "name": "Metehan Yesilyurt",
    "url": "https://metehan.ai",
    "sameAs": [
      "https://x.com/metehan777",
      "https://www.linkedin.com/in/metehanyesilyurt",
      "https://github.com/metehan777"
    ]
  },
  "publisher": {
    "@type": "Person",
    "name": "Metehan Yesilyurt",
    "url": "https://metehan.ai"
  },
  "alternateFormat": {
    "html": "https://metehan.ai/blog/the-rrf-top-n-playbook-how-to-get-cited-by-chatgpts-web-mode/",
    "json": "https://metehan.ai/api/post/the-rrf-top-n-playbook-how-to-get-cited-by-chatgpts-web-mode.json",
    "rss": "https://metehan.ai/rss.xml"
  }
}