{
  "@context": "https://schema.org",
  "@type": "BlogPosting",
  "headline": "I Found It in the Code, Science Proved It in the Lab: The Recency Bias That's Reshaping AI Search",
  "description": "I Found It in the Code, Science Proved It in the Lab: The Recency Bias That's Reshaping AI Search",
  "datePublished": "2025-10-03T00:00:00.000Z",
  "dateModified": "2025-11-02T00:00:00.000Z",
  "url": "https://metehan.ai/blog/i-found-it-in-the-code-science-proved-it-in-the-lab-the-recency-bias-thats-reshaping-ai-search/",
  "category": "featured-research",
  "tags": [],
  "image": "/wp-content/uploads/2025/10/temp-chat-2025-seo-guide.png",
  "wordCount": 936,
  "readTime": "5 min",
  "articleBody": "**In August 2025, I found something on ChatGPT's configuration files and identified a single line of code that explained many things:**\n\n```\nuse_freshness_scoring_profile: true\n\n```\n\nI wrote then: \"ChatGPT actively prioritizes recent content over older material. Regular content updates aren't just good practice; they're essential for ChatGPT visibility.\" Here is the August 2025 post -> [https://metehan.ai/blog/chatgpt-5-search-configuration/](https://metehan.ai/blog/chatgpt-5-search-configuration/)\n\n**Today, I can tell you exactly how much this matters, because researchers just quantified it.**\n\n![](/wp-content/uploads/2025/10/arxiv-recency-bias-llms.png)\n\nA team from Waseda University [published a great study testing seven major AI models](https://arxiv.org/abs/2509.11353) (GPT-4o, GPT-4, GPT-3.5, LLaMA-3 8B/70B, and Qwen-2.5 7B/72B). They added artificial publication dates to search results and measured what happened.\n\nThe results validate everything I found in that configuration file and the numbers are interesting than I thought.\n\n## What I Found vs. What They Proved\n\n### My Discovery: The Configuration\n\nLooking at ChatGPT's actual production settings, I found:\n\n```\nreranker_model: \"ret-rr-skysight-v3\"\nuse_freshness_scoring_profile: true\nenable_query_intent: true\nvocabulary_search_enabled: true\n\n```\n\n**My conclusion:** \"That comprehensive guide you wrote in 2022? It might be losing ground to newer content, even if yours is more detailed.\"\n\n### Their Proof: The Numbers\n\nThe researchers took passages from TREC 2021 and 2022 test collections, added fake publication dates (nothing else changed same text, same quality), and watched AI models rerank them.\n\n**Every. Single. Model. Fell. For. 
It.**\n\nHere's what happened:\n\n| Metric | Best Case | Worst Case |\n| --- | --- | --- |\n| **Average year shift in top-10** | +0.82 years (Qwen2.5-72B) | +4.78 years (LLaMA3-8B) |\n| **Largest single position jump** | 61 ranks (Qwen2.5-7B) | 95 ranks (GPT-3.5-turbo) |\n| **Preference reversals** | 8.25% (Qwen2.5-72B) | 25.23% (LLaMA3-8B) |\n\n**Translation:**\n\n- Your top-10 results can shift nearly 5 years newer just from timestamps\n- Individual pieces of content can jump up to 95 positions\n- In the worst case, about 1 in 4 preference decisions flips based solely on dates\n\n## The \"Seesaw Effect\": How Your Rankings Get Destroyed\n\nThe research revealed something fascinating they call the **\"seesaw pattern\"**, and it perfectly explains what that freshness scoring profile actually does.\n\nImagine your search results as a seesaw with a pivot point in the middle:\n\n### Top 40 Positions: Systematically Younger\n\n**What happens:** Content with recent dates (real or fake) consistently climbs here\n\n**By the numbers:**\n\n- Ranks 1-10: +0.8 to +4.8 years fresher (all models, both datasets)\n- Ranks 11-20: +0.2 to +0.9 years fresher (statistically significant)\n- Ranks 21-40: Still positive shifts, smaller magnitude\n\n**What this means:** Even if you rank #1 based on content quality, a newer piece with worse content can overtake you.\n\n### ⚖️ Ranks 41-60: The Pivot Point\n\n**What happens:** Minimal movement, acts as the fulcrum\n\n**By the numbers:**\n\n- Some slight positive shifts in the 41-50 band\n- Some slight negative shifts in the 51-60 band\n- Mostly not statistically significant\n\n**What this means:** This is the \"neutral zone\" where freshness matters least.\n\n### Bottom 40 Positions (Ranks 61-100): Systematically Older\n\n**What happens:** Older-dated content sinks here, even when equally relevant\n\n**By the numbers:**\n\n- Ranks 61-70: -0.4 to -1.0 years older\n- Ranks 71-80: -0.6 to -1.2 years older\n- Ranks 81-90: -0.7 to -1.7 years older\n- Ranks 91-100: -0.5 to -2.0 years older (most 
dramatic)\n\n**What this means:** Older authoritative content gets systematically buried.\n\n## Real-World Impact: Three Scenarios\n\n### Scenario 1: Medical Content\n\n**What should happen:** A landmark 2018 study with 10,000 participants and peer review should rank highly.\n\n**What actually happens:** A preliminary 2024 blog post with a 50-person sample and no peer review ranks higher just because it's newer.\n\n**The numbers:** The 2018 study could drop 40-60 positions purely from its date.\n\n### Scenario 2: Technical Documentation\n\n**What should happen:** The definitive 2020 guide with 5,000 verifications and community vetting should be authoritative.\n\n**What actually happens:** A 2024 unverified blog post ranks higher.\n\n**The numbers:** Up to a 25% chance the AI \"prefers\" the newer, worse content.\n\n### Scenario 3: Academic Research\n\n**What should happen:** Foundational papers from 2015-2020 should remain authoritative reference material.\n\n**What actually happens:** Recent commentary pieces with no original research rank higher.\n\n**The numbers:** The top 10 can shift 1-5 years newer, systematically demoting classics.\n\n## The Configuration + Research = Complete Picture\n\nLet me show you how my configuration discovery and their research fit together:\n\n### 1. The Reranker (`ret-rr-skysight-v3`)\n\n**What I found:** ChatGPT uses a sophisticated reranking model that processes search results post-retrieval.\n\n**What research adds:** This isn't unique to ChatGPT: **all listwise rerankers** tested exhibit this bias. It's architectural, not implementation-specific.\n\n**New insight:** The Skysight-v3 model likely has temporal bias **built into its training**, not just as a configuration parameter.\n\n### 2. Freshness Scoring\n\n**What I found:** `use_freshness_scoring_profile: true` is always on.\n\n**What research adds:** The measured effect is a shift of roughly 1 to 5 years toward newer dates in the top results.\n\n**New insight:** This isn't a minor ranking signal. 
It's **dominant enough to override content quality signals**.\n\n### 3. Query Intent Detection\n\n**What I found:** `enable_query_intent: true` means ChatGPT analyzes what you're actually trying to accomplish.\n\n**What research adds:** Intent detection **doesn't adjust for temporal appropriateness**. Historical queries get the same freshness bias as news queries.\n\n**New insight:** A query like \"causes of World War I\" shouldn't prioritize 2024 content, but it does. The intent detection isn't temporally aware.\n\n### 4. Vocabulary Search\n\n**What I found:** `vocabulary_search_enabled: true` with fine-grained filtering rewards technical terminology.\n\n**What research adds:** Even content with **perfect vocabulary** loses to newer content with **worse vocabulary** up to 25% of the time.\n\n**New insight:** Technical accuracy\n\n- Original configuration analysis: [Inside ChatGPT's GPT 5 Search Configuration](https://metehan.ai/blog/chatgpt-5-search-configuration/)\n- Academic research: [\"Do Large Language Models Favor Recent Content? A Study on Recency Bias in LLM-Based Reranking\" by Fang et al., Waseda University, 2025](https://arxiv.org/abs/2509.11353)",
  "author": {
    "@type": "Person",
    "name": "Metehan Yesilyurt",
    "url": "https://metehan.ai",
    "sameAs": [
      "https://x.com/metehan777",
      "https://www.linkedin.com/in/metehanyesilyurt",
      "https://github.com/metehan777"
    ]
  },
  "publisher": {
    "@type": "Person",
    "name": "Metehan Yesilyurt",
    "url": "https://metehan.ai"
  },
  "alternateFormat": {
    "html": "https://metehan.ai/blog/i-found-it-in-the-code-science-proved-it-in-the-lab-the-recency-bias-thats-reshaping-ai-search/",
    "json": "https://metehan.ai/api/post/i-found-it-in-the-code-science-proved-it-in-the-lab-the-recency-bias-thats-reshaping-ai-search.json",
    "rss": "https://metehan.ai/rss.xml"
  }
}