The news: AI-powered search tools, including the large language models (LLMs) used by Perplexity and other AI companies, increasingly deliver unreliable data. Users have reported inaccurate market statistics and financial figures sourced from questionable summaries instead of verified documents like SEC 10-Ks, per The Register.
“Model collapse” is when LLMs trained on data generated by other LLMs (rather than original human data) gradually degrade in quality over time, much as a photocopy of a photocopy fades and blurs with each successive copy. In the case of AI, this eventually leads to erroneous responses, per research from Nature.
Brand marketers are worried—35% cite concerns about the reliability of generative AI (genAI), particularly hallucinations, as the greatest challenge to using the tech in marketing, per Econsultancy.
The problem: Training data is drying up, forcing AI to recycle old content as firms scramble for fresh sources and smarter methods.
Model collapse makes outputs more repetitive, less accurate, and sometimes outright incorrect with each generation, even as the models sound increasingly fluent and confident.
- Errors compound across successive model generations, distorting data distributions and causing "irreversible defects" in performance (the sketch after this list shows the compounding in miniature).
- The final result? “The model becomes poisoned with its own projection of reality,” per Nature.
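To make the mechanism concrete, here is a minimal toy sketch (our illustration, not the Nature study's actual methodology): each "model" is just a Gaussian fitted to samples drawn from the previous generation's model, with function names like `fit` and `generate` and the small sample size chosen purely for demonstration.

```python
# Toy model-collapse sketch: each generation "trains" (fits a Gaussian)
# on synthetic samples from the previous generation, never on fresh
# human data. Small samples make the compounding error visible fast.
# Illustrative analogy only -- not the Nature study's methodology.
import random
import statistics

random.seed(1)

def fit(samples):
    # "Training": estimate the distribution from the data.
    return statistics.mean(samples), statistics.stdev(samples)

def generate(mean, std, n):
    # "Inference": emit synthetic data from the fitted model.
    return [random.gauss(mean, std) for _ in range(n)]

data = generate(0.0, 1.0, 25)  # generation 0: original human data

for gen in range(1, 16):
    mean, std = fit(data)
    data = generate(mean, std, 25)  # train the next model on model output
    print(f"gen {gen:2d}: mean={mean:+.3f}  std={std:.3f}")
```

Each fit misestimates the spread slightly; because the next generation only ever sees the previous fit's output, those misestimates compound rather than average out. On a typical run the fitted std drifts well away from the true 1.0 and the distribution's tails vanish, which is the "irreversible defect" in miniature.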
Mixing synthetic data with fresh human data can slow model collapse, according to TechTarget, but the scarcity and rising cost of human-generated training data could discourage firms from including it.
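A quick extension of the toy sketch above (same assumed Gaussian setup; the `final_std` helper and the mixing fractions are ours, for illustration) shows why mixing matters: anchoring even a modest share of each generation's training set to fresh human data keeps the fitted distribution near the truth.

```python
# Extending the toy sketch: mix `human_fraction` fresh human samples
# into each generation's training data and see how far the fitted
# spread drifts from the true std of 1.0 after 30 generations.
import random
import statistics

random.seed(0)

def final_std(generations=30, n=25, human_fraction=0.0):
    mean, std = 0.0, 1.0  # start at the true distribution
    for _ in range(generations):
        n_human = round(n * human_fraction)
        synthetic = [random.gauss(mean, std) for _ in range(n - n_human)]
        human = [random.gauss(0.0, 1.0) for _ in range(n_human)]  # fresh data
        mean, std = (statistics.mean(synthetic + human),
                     statistics.stdev(synthetic + human))
    return std

for frac in (0.0, 0.1, 0.5):
    runs = [final_std(human_fraction=frac) for _ in range(200)]
    print(f"{frac:4.0%} human data -> avg final std: {statistics.mean(runs):.3f}")
```

On typical runs, the all-synthetic pipeline's spread shrinks noticeably while even a small fraction of fresh data holds it close to 1.0: synthetic drift pulls the estimate away, fresh data pulls it back. That mean-reverting pull is why mixing slows collapse rather than stopping it outright.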
Solutions for marketers: Relying solely on AI output could lead to data disasters. Here are some ways marketers can get ahead of any potential problems:
- Agencies need to pressure-test genAI tools and ask vendors tough questions about how models were trained. Proactive assessment, especially for critical client or industry information, could bolster accuracy and relevance.
- Publishers and analysts should flag AI summaries citing outdated or nonexistent sources to avoid bad data creeping into planning, forecasting, and campaign messaging.
- Marketers dealing with financial, health, and regulatory content should exercise tighter oversight: one false stat can trigger bad decisions, legal trouble, or brand damage.
Our take: Audit your AI stack. Don’t rely on outputs alone—verify the provenance of AI-generated data and prioritize tools trained on verified, high-integrity sources.
Vet vendors based on transparency, update cycles, and data hygiene. If you’re using AI for decision-making, demand traceable accountability—because “good enough” answers can quickly become costly mistakes.