Intercom Fin works well when the documentation underneath it is accurate. When the documentation is stale, Fin works against you. It retrieves confidently, synthesizes fluently, and delivers wrong answers at scale. There is no warning label when it does. Support teams that have run Fin for more than three months without a documentation discipline almost always notice the same pattern: resolution rate plateaus, certain query clusters keep failing, and the failures are weirdly specific. Wrong navigation paths. Deprecated feature names. Steps that used to work but no longer do. The pattern is not a model problem. It is a documentation decay problem.
How Intercom Fin actually works
Intercom Fin uses retrieval-augmented generation (RAG). When a customer sends a message, Fin searches your connected knowledge base for relevant articles, then formulates a response grounded in what it finds. It is designed not to draw on general LLM knowledge or invent information beyond your help center content. It retrieves from your articles and synthesizes from that.
This architecture is deliberate and correct for business use. RAG keeps Fin's answers grounded in your product's specific information, makes hallucinations unlikely, and lets you control what Fin knows by controlling the knowledge base. Intercom reports a hallucination rate below 1%, meaning Fin almost never invents information that is not in the source material. That is a genuine achievement and the right starting point for an enterprise AI agent.
The catch is that Fin's accuracy ceiling is set entirely by the accuracy of your articles. Fin cannot tell whether an article is current or eight months out of date. It retrieves the most semantically relevant content and synthesizes from it. If the most relevant article is wrong, Fin delivers that wrong answer with the same confidence as a correct one. This is not a hallucination in the technical sense. Fin accurately represented what the article said. The article was wrong. That distinction matters because the fixes are completely different: you cannot solve a documentation accuracy problem by upgrading the AI model. You solve it by fixing the documentation.
What sets Fin's accuracy ceiling
Three variables determine how well Fin performs on any given query. The first is coverage: if there is no article on the topic, Fin cannot answer it. This is a straightforward gap to identify and address. The second is clarity: if the article is poorly structured or ambiguous, Fin may retrieve it but extract the wrong section. The third is accuracy: if the article exists and is well structured but contains outdated steps, Fin delivers those outdated steps.
Most teams invest in coverage and clarity. The accuracy variable is the one that drives the most failures in established deployments, because teams write new articles regularly but update old ones rarely. Intercom's own support team ran a full audit of more than 700 articles before enabling Fin internally. That is not a small number. It is a direct signal of how seriously documentation quality affects AI performance when you operate at scale, and of how much decay accumulates in a real knowledge base over time.
The Knowledge-Centered Service methodology from the Service Innovation Library benchmarks knowledge article useful life at roughly six months. For teams shipping weekly, that lifespan is much shorter in practice. A feature rename can invalidate multiple articles in a single sprint without anyone flagging the change to the docs team. The support content quality problem is not a failure of effort. It is a structural gap between how software is built and how documentation is maintained.
How stale documentation produces confident wrong answers
The failure mode is concrete. Your product had a settings panel called "Account Settings." Your engineering team split it into two panels: "Profile" and "Billing." Fourteen help articles reference "Account Settings." Nobody flagged those articles for the docs team when the change shipped. There was no process for it.
A customer asks Fin: "How do I update my payment method?" Fin retrieves the billing article, which begins: "Go to Account Settings, then select Billing." Fin tells the customer exactly that. The customer searches for "Account Settings" in your product. It no longer exists. They fail. They escalate to a human agent, now frustrated because they already tried and wasted time.
Fin did nothing wrong by its own logic. It retrieved an accurate representation of the article's content. The article described a UI that no longer exists. This is not a model failure. It is a documentation quality failure, and it compounds with every release that touches the product without triggering an article update. The knowledge base accuracy problem multiplies with every sprint.
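The "Account Settings" scenario above can be turned into a simple check. This is a minimal sketch, assuming you can export article titles and bodies as plain text and that someone maintains a list of retired UI labels. The rename map, the function name, and the sample articles are all hypothetical, and none of this is part of Fin or Intercom.

```python
import re

# Hypothetical rename map: retired UI label -> labels that replaced it.
RETIRED_LABELS = {"Account Settings": ["Profile", "Billing"]}


def find_stale_references(articles, retired_labels=RETIRED_LABELS):
    """Return (article_title, retired_label, replacements) for every match."""
    hits = []
    for title, body in articles.items():
        for label, replacements in retired_labels.items():
            # Whole-phrase, case-insensitive match on the retired label.
            pattern = r"\b" + re.escape(label) + r"\b"
            if re.search(pattern, body, re.IGNORECASE):
                hits.append((title, label, replacements))
    return hits


# Sample data for illustration only.
articles = {
    "Update your payment method": "Go to Account Settings, then select Billing.",
    "Invite a teammate": "Open the Team page and click Invite.",
}

for title, label, replacements in find_stale_references(articles):
    print(f"{title!r} still references {label!r}; now split into: {', '.join(replacements)}")
```

A text scan like this only works if someone records the rename in the first place, which is exactly the step that tends to be missing.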
Intercom's own troubleshooting documentation acknowledges a related version of this: when a knowledge base article conflicts with a configured procedure, Fin defaults to the article. Outdated articles do not just produce wrong answers. They actively block the correct procedures from running. A team that set up a refund procedure but left an old refund article in place will see Fin cite the article and bypass the procedure entirely.
How fast does accuracy degrade at shipping speed?
Documentation decay is a function of release cadence multiplied by article coverage. For a team shipping once a week with 200 articles covering UI-dependent workflows, the math works against you. A conservative estimate: each weekly release touches three to five UI elements. Each UI element change potentially invalidates one to three articles. Over a quarter, that is 12 releases touching 36 to 60 elements, producing between 36 and 180 article-level inaccuracies across a knowledge base that may have 200 articles total. Several of those inaccuracies can land on the same article, so the number of distinct wrong articles is lower, but the drift is large relative to the size of the knowledge base.
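The quarterly estimate is plain multiplication, and it helps to rerun it with your own numbers. A minimal sketch using the figures from the paragraph above; the function and parameter names are made up for illustration.

```python
def quarterly_invalidations(releases=12, elements_per_release=(3, 5),
                            articles_per_element=(1, 3)):
    """Return (low, high) bounds on article-level inaccuracies per quarter."""
    low = releases * elements_per_release[0] * articles_per_element[0]
    high = releases * elements_per_release[1] * articles_per_element[1]
    return low, high


low, high = quarterly_invalidations()
print(f"Elements touched per quarter: {12 * 3} to {12 * 5}")
print(f"Article-level inaccuracies per quarter: {low} to {high}")
```

These are counts of invalidation events, not distinct articles. The same article can be hit by more than one change, so the number of distinct wrong articles will be lower than the upper bound.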
Intercom's documentation for Fin optimization recommends a monthly review of articles that have gone unchanged for six months or more. But the real problem for fast-shipping teams is not the articles that have not been touched in six months. It is the articles that were accurate three weeks ago and are now wrong because a developer renamed a button. Manual review cadences cannot catch that. One documented case in the Intercom community showed a 70% Fin failure rate on password reset queries traced entirely to a single outdated article describing a flow that had been redesigned two sprints earlier.
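The blind spot of an age-based review is easy to see in a sketch. The example below assumes each article carries a last-updated date. The data shape, the 182-day threshold, and the sample articles are assumptions for illustration, not an Intercom export format.

```python
from datetime import date, timedelta


def overdue_for_review(articles, today, max_age_days=182):
    """Return titles of articles last updated at least max_age_days ago."""
    cutoff = today - timedelta(days=max_age_days)
    return [a["title"] for a in articles if a["updated"] <= cutoff]


# Sample data for illustration only.
articles = [
    {"title": "Export a report", "updated": date(2023, 11, 1)},
    # Updated three weeks ago, but the flow it describes was redesigned since.
    {"title": "Reset your password", "updated": date(2024, 6, 9)},
]

print(overdue_for_review(articles, today=date(2024, 6, 30)))
```

Only the older article is flagged. The password reset article passes the age check even though it is the one producing wrong answers, because the filter looks at when the article last changed, not at whether the product changed underneath it.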
The SuperOffice 2023 customer service benchmarks found that customers who fail at self-service are significantly more expensive to serve when they escalate. They arrive already frustrated and with partial context that agents have to untangle. Running Fin on stale documentation is not just inefficient. It actively increases the cost of each failure, because failed AI interactions create more agitated customers than no AI interaction at all.
Why the fix is structural, not a review sprint
The typical response to documentation accuracy problems is a review sprint: pull the top articles by Fin involvement rate, have someone walk through each one against the live product, update what is wrong. This works once. It does not solve the problem over time, because the product keeps shipping and the review cadence never keeps pace with the release cycle.
Intercom's own Content Gap Suggestions feature illustrates the reactive limit clearly. The tool identifies what has already failed: queries Fin could not resolve. Then it suggests content fixes. But it cannot tell you which articles are about to go stale because a pull request merged yesterday. It identifies yesterday's failures, not tomorrow's. For teams shipping weekly, that means the knowledge base is always at least one release behind the optimization signal. The gap between code and documentation never closes. It only gets measured after it causes failures.
The structural fix is to connect documentation to the release cycle at the source. That means recording documentation in a format that knows what it is referencing in the product. Screenshot-based documentation records the visual state of the UI at one moment in time. When the product changes, the screenshot is wrong, but nothing in the documentation system knows it. There is no link between the frozen image and the code element that changed. The only way to find the inaccuracy is a manual review. As covered in "Why AI chatbots give wrong answers," this is the core architectural problem that breaks AI retrieval at scale: the knowledge base is built on data that has no connection to its own accuracy.
What GitHub Sync does differently
HappyRecorder, HappySupport's Chrome extension, records documentation differently. Instead of capturing screenshots, it captures DOM selectors and CSS metadata at each step: references to the actual code elements the user is interacting with, not a frozen image of how they looked at recording time. When a developer renames a button or reorganizes a navigation panel, the selector reference in the guide points to something that has changed. The system knows this before anyone has to manually find it.
HappyAgent, the GitHub Sync layer, monitors the repository for those selector changes. When a pull request modifies a CSS class or DOM element referenced in a guide, HappyAgent detects the change, maps it to the affected articles, and flags them for review. The connection between the code change and the documentation impact is automated, not manual. An engineer ships a rename. HappyAgent surfaces the three articles that referenced the old label. The support team reviews and confirms. The knowledge base stays current at release cadence, not at review sprint cadence.
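The mapping step, from changed selectors to affected articles, can be illustrated in a few lines. This is a hypothetical sketch of the general idea, not HappyAgent's implementation: the data structures, selector names, and function name are all invented, and extracting the changed selectors from a pull request is assumed to have happened already.

```python
def affected_guides(guide_selectors, changed_selectors):
    """Map each guide to the changed selectors it references, if any."""
    changed = set(changed_selectors)
    flagged = {}
    for guide, selectors in guide_selectors.items():
        overlap = sorted(changed & set(selectors))
        if overlap:
            flagged[guide] = overlap
    return flagged


# Hypothetical index: which selectors each recorded guide depends on.
guide_selectors = {
    "Update your payment method": ["#account-settings-nav", ".billing-form__save"],
    "Change your display name": ["#account-settings-nav", ".profile-form__name"],
    "Invite a teammate": [".team-page__invite"],
}

# Hypothetical result of inspecting a merged pull request.
changed_selectors = ["#account-settings-nav"]

for guide, selectors in affected_guides(guide_selectors, changed_selectors).items():
    print(f"Review needed: {guide} (references {', '.join(selectors)})")
```

The point of the sketch is the lookup direction. Because each guide stores what it references, a code change can be traced forward to the articles it affects, which a screenshot cannot support.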
This is what Intercom's Content Gap Suggestions cannot do: tell you a guide is about to break before it breaks. GitHub Sync works on the upstream source of documentation decay rather than the downstream symptoms. The failure never reaches Fin's retrieval layer because it is caught at the point of change.
For teams currently running Fin on a screenshot-based knowledge base, the practical sequence is straightforward. Audit the articles Fin uses most against the live product first. Fix the highest-traffic inaccuracies. Then switch the recording method for new articles to one that captures code metadata rather than pixels. The GitHub Sync documentation guide covers how HappyAgent handles the continuous monitoring piece once the knowledge base is rebuilt with selector-aware content.
Measuring Fin accuracy before and after documentation fixes
Intercom's Fin Optimize feature gives teams the data needed to quantify the documentation problem. Look at three metrics in sequence: Fin's involvement rate (the percentage of conversations where Fin engages), the resolution rate per involved conversation, and the per-article resolution rate for the articles Fin references most.
Articles with high involvement and low resolution rate are the documentation accuracy failures in plain sight. They are the articles Fin retrieves frequently but that do not produce resolved conversations. Most of the time, the reason is inaccurate content: the article describes something that no longer works as described. These are not missing articles. They are wrong articles, and wrong articles are harder to catch than missing ones because they look fine until someone follows the steps.
A practical before-and-after benchmark: identify the ten articles with the highest involvement and lowest resolution rate. Fix each one against the live product. Rerun Fin on the same query set. Intercom's own knowledge management guidance reports that teams can reach an 80% resolution rate through refined knowledge management. The ceiling is not the model. It is the documentation quality.
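Picking the ten articles to fix first is a filter and a sort. A minimal sketch, assuming you have per-article counts of involved and resolved conversations in some exported form. The field names, the involvement threshold, and the sample numbers are hypothetical and do not reflect Intercom's reporting schema.

```python
def review_queue(stats, min_involved=50, limit=10):
    """Rank frequently used articles by resolution rate, worst first."""
    threshold = max(min_involved, 1)  # also guards against division by zero
    busy = [s for s in stats if s["involved"] >= threshold]
    ranked = sorted(
        busy,
        key=lambda s: (s["resolved"] / s["involved"], -s["involved"]),
    )
    return [(s["title"], round(s["resolved"] / s["involved"], 2))
            for s in ranked[:limit]]


# Sample data for illustration only.
stats = [
    {"title": "Update your payment method", "involved": 420, "resolved": 105},
    {"title": "Reset your password", "involved": 380, "resolved": 114},
    {"title": "Export a report", "involved": 300, "resolved": 255},
    {"title": "Rotate an API key", "involved": 12, "resolved": 2},
]

for title, rate in review_queue(stats):
    print(f"{rate:.0%}  {title}")
```

The involvement threshold keeps rarely used articles from crowding the list on the strength of a handful of conversations. Saving the queue before the fixes gives you the baseline to compare against afterward.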
The goal is not to tune the AI. It is to eliminate the documentation accuracy gap that the AI is faithfully reproducing. Once that gap closes, Fin's performance improves without any changes to model configuration. The bottleneck was always the knowledge base. Recognizing that is the first step. Building a process that keeps the knowledge base accurate automatically is what makes the improvement permanent.
If you want to audit your full knowledge base before connecting it to any AI system, the help center content audit guide covers exactly what to check and in what order. HappyRecorder creates guides with code-level references so articles know what they are pointing to. HappyAgent watches the codebase and flags changes when they happen. The combination keeps Fin's knowledge base current at shipping speed, without relying on anyone remembering to check after every release.







