{
  "@context": "https://schema.org",
  "@type": "BlogPosting",
  "headline": "Inside ChatGPT's GPT 5 Search: What the Configuration Files Reveal About How It Ranks Your Content",
  "description": "Inside ChatGPT's GPT 5 Search: What the Configuration Files Reveal About How It Ranks Your Content",
  "datePublished": "2025-08-20T00:00:00.000Z",
  "dateModified": "2025-11-02T00:00:00.000Z",
  "url": "https://metehan.ai/blog/chatgpt-5-search-configuration/",
  "category": "featured-research",
  "tags": [],
  "image": "/wp-content/uploads/2025/08/reranker.png",
  "wordCount": 1318,
  "readTime": "7 min",
  "articleBody": "*An analysis of actual ChatGPT configuration settings and what they mean for content creators. Many probabilities...*\n\nIf you've ever wondered how ChatGPT decides which websites to reference when answering questions, we've got some fascinating insights. Recent configuration data from ChatGPT's production environment reveals the exact settings that govern how it searches, retrieves, and ranks web content.\n\n> *Previously, I wrote about [Reciprocal Rank Fusion.](https://metehan.ai/blog/chatgpt-is-using-reciprocal-rank-fusion-rrf/)*\n\nNo speculation, no guesswork, just the actual configuration parameters that determine whether your content makes it into ChatGPT's responses. You can see it yourself in the source code.\n\nJust visit any past chat window, click the right and \"View Source Code\"\n\nCTRL/Command + F -> \"rerank\"\n\n![](/wp-content/uploads/2025/08/chatgpt-reranker.png)\n\n## The Reranking Model: ret-rr-skysight-v3\n\nAt the heart of ChatGPT's retrieval system is a reranking model with the cryptic name `ret-rr-skysight-v3`. This isn't just a simple search algorithm; it's a sophisticated post-processing layer that takes initial search results and completely reorders them based on quality signals.\n\n```\nreranker_model: \"ret-rr-skysight-v3\"\n\n```\n\nThis single line of configuration confirms what many suspected: ChatGPT doesn't just grab the first search results it finds. Instead, it retrieves a larger set of potential sources and then applies this reranker to identify the most relevant and authoritative content.\n\n## Freshness Is King: The Scoring Profile\n\nPerhaps the most significant finding for content creators is this setting:\n\n```\nuse_freshness_scoring_profile: true\n\n```\n\nThis can confirm that ChatGPT actively prioritizes recent content over older material. It's not just looking at publication dates, it's using a dedicated \"freshness scoring profile\" to weight newer information more heavily. Isn't it? 
What do you think?\n\nFor website owners, this can be crucial: that comprehensive guide you wrote in 2022? It might be losing ground to newer content, even if yours is more detailed. Regular content updates aren't just good practice; they're essential for ChatGPT visibility.\n\nHere is another piece of evidence for freshness: the retrieval system prompt routes questions about recent events to the `web` tool.\n\n***enable_source_specific_search_params**: `retrieval_additional_system_prompt`*\n\n***The user may have connected sources. If they have, you can assist the user by searching over documents from their connected sources, using the file_search tool. For example, this may include documents from their Google Drive, or files from their Dropbox. The exact sources (if any) will be mentioned to you in a follow-up message.*\n\n*Use the file_search tool to assist users when their request may be related to information from connected sources, such as questions about their projects, plans, documents, or schedules, BUT ONLY IF IT IS CLEAR THAT the user's query requires it; if ambiguous, and especially if asking about something that is clearly common knowledge, or better answerable from a different tool, DO NOT SEARCH SOURCES. Use the `web` tool instead when the user asks about recent events / fresh information, or asks about news etc. Conversely, if the user's query clearly expects you to reference / read some non-public resource, it is likely that they are expecting you to search connectors.*\n\n*Note that the file_search tool allows you to search through the connected soures, and interact with the results. However, you do not have the ability to _exhaustively_ list documents from the corpus and you should inform the user you cannot help with such requests. Examples of requests you should refuse are 'What are the names of all my documents?' or 'What are the files that need improvement?'*\n\n*IMPORTANT: Your answers, when relating to information from connected sources, must be detailed, in multiple sections (with headings) and paragraphs. 
You MUST use Markdown syntax in these, and include a significant level of detail, covering ALL key facts. However, do not repeat yourself. Remember that you can call file_search more than once before responding to the user if necessary to gather all information.*\n\n***Capabilities limitations**:*\n\n*- You do not have the ability to exhaustively list documents from the corpus.*\n\n*- You also cannot access to any folders information and you should inform the user you cannot help with folder-level related request. Examples of requests you should refuse are 'What are the names of all my documents?' or 'What are the files that need improvement?' or 'What are the files in folder X?'.*\n\n*- Also, you cannot directly write the file back to Google Drive.*\n\n*- For Google Sheets or CSV file analysis: If a user requests analysis of spreadsheet files that were previously retrieved - do NOT simulate the data, either extract the real data fully or ask the users to upload the files directly into the chat to proceed with advanced analysis.*\n\n*- You cannot monitor file changes in Google Drive or other connectors. Do not offer to do so.**: `enable_dynamic_prompt`*\n\n## The Multi-Layer Filtering System\n\nThe configuration reveals a sophisticated filtering pipeline with multiple checkpoints:\n\n```\nenable_query_intent: true\nenable_source_filtering: true\nenable_mimetype_filtering: true\nvocabulary_search_enabled: true\nuse_coarse_grained_filters_for_vocabulary_search: false\n\n```\n\nLet's break down what each of these means:\n\n### Query Intent Detection\n\nWith `enable_query_intent: true`, ChatGPT analyzes what the user is actually trying to accomplish. 
It's not just matching keywords; it's understanding whether someone wants a definition, a how-to guide, a comparison, or something else entirely.\n\n### Vocabulary Search: The Domain Expert Advantage\n\nHere's where it gets interesting:\n\n```\nvocabulary_search_enabled: true\nuse_coarse_grained_filters_for_vocabulary_search: false\n\n```\n\nChatGPT may use vocabulary-aware searching, and since the coarse-grained flag is `false`, its filters are probably fine-grained. If so, it recognizes and prioritizes domain-specific terminology. Sites that consistently use proper industry terminology and define their terms have a built-in advantage.\n\n## The Mystery Settings: What I Don't Know\n\nInterestingly, one relevance feature is explicitly disabled:\n\n```\nuse_relevance_lmp: false\n\n```\n\nI don't know what \"LMP\" stands for, but we know ChatGPT has chosen NOT to use it. This suggests the system relies on other relevance signals, possibly more traditional information retrieval methods combined with the neural reranker.\n\nSimilarly, these features are enabled but their exact purpose remains unclear:\n\n```\nenable_mclick_urls: true\nenable_mclick_dates: true\nuse_light_weight_scoring_for_slurm_tenants: true\nenable_source_specific_search_params: true\n\n```\n\nThe \"mclick\" features might relate to multi-click behavior or tracking how users interact with multiple sources. Or just mobile clicks?\n\n**What about slurm?**\n\nThere is another configuration in ChatGPT.\n\n*enabledConnectors: [*\n*\"gdrive_action_connector\",*\n*\"slurm_dropbox\", //  Full reranking and scoring\n2. **Connected personal/work accounts** -> Lightweight scoring\n\n**The results/citations even change dynamically if users connect their own sources!**\n\n## What This Means for Your Content Strategy\n\nBased on these confirmed settings, here's what actually matters:\n\n### 1. Update Frequency Beats Static Perfection\n\nThat freshness scoring profile isn't optional; it's always on. 
Even perfect content grows stale in ChatGPT's eyes.\n\n### 2. Intent Alignment Is Critical\n\nWith query intent detection active, your content needs to clearly signal what type of information it provides. A product comparison should look and read like a comparison, not a blog post pretending to be one.\n\n### 3. Technical Vocabulary Matters\n\nThe vocabulary search system appears to reward proper use of industry terminology.\n\n### 4. The Reranker Changes A Lot\n\nInitial search visibility isn't enough. The `ret-rr-skysight-v3` reranker will reshuffle everything based on quality signals we can only partially understand. Focus on comprehensive, authoritative content that would survive any reordering.\n\n## The Configuration Doesn't Lie or...?\n\nThese are actual production settings from ChatGPT's retrieval system. Every `true` and `false` in this configuration can affect whether your content appears in AI-generated responses.\n\nThe most striking revelation? The complexity of the filtering and ranking pipeline. This isn't a simple search engine; it's a multi-stage retrieval system with intent detection, vocabulary analysis, freshness scoring, source filtering, and neural reranking all working in concert.\n\nFor content creators, the message is clear: optimize for substance, freshness, and clarity. The configuration shows ChatGPT is looking for recent, relevant, technically accurate content that clearly matches user intent.\n\nGaming this system would require fooling multiple independent filters and a sophisticated neural reranker. Instead, focus on what the configuration implicitly rewards: becoming the most current, comprehensive, and authoritative source in your niche.\n\n*Note: This analysis is based on configuration data from a ChatGPT Plus user session in August 2025. 
Settings may vary by user type, region, or over time as OpenAI updates its systems.*\n\nSee a comprehensive list here:\n\n[https://github.com/metehan777/chatgpt-5-configuration-analysis](https://github.com/metehan777/chatgpt-5-configuration-analysis)",
  "author": {
    "@type": "Person",
    "name": "Metehan Yesilyurt",
    "url": "https://metehan.ai",
    "sameAs": [
      "https://x.com/metehan777",
      "https://www.linkedin.com/in/metehanyesilyurt",
      "https://github.com/metehan777"
    ]
  },
  "publisher": {
    "@type": "Person",
    "name": "Metehan Yesilyurt",
    "url": "https://metehan.ai"
  },
  "alternateFormat": {
    "html": "https://metehan.ai/blog/chatgpt-5-search-configuration/",
    "json": "https://metehan.ai/api/post/chatgpt-5-search-configuration.json",
    "rss": "https://metehan.ai/rss.xml"
  }
}