The news: Shortly after launch, users tricked Gap’s chatbot into discussing intimacy products, sex toys, and other topics well outside its intended scope, according to a report by The Information.
How it happened: Sierra, the AI startup powering the chatbot, blamed a “bad actor” who since last week “has been attempting to maliciously jailbreak over a dozen” of its clients’ AI agents, the company’s head of communications, Rachel Whetstone, told The Information.
- The company’s built-in abuse detection system intercepted all of the attempts except the one involving Gap, which was missed because the agent’s guardrails had been “inadvertently misconfigured,” Whetstone said (see the sketch after this list for what such a topic guardrail can look like).
- Those guardrails have since been correctly configured, and the chatbot remains accessible to users.
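Sierra hasn’t published details of how its guardrails or abuse detection work, but the general pattern in brand chatbots is a topic filter that screens messages before, and often after, the underlying model responds. Below is a minimal, hypothetical Python sketch of that pattern; the topic list, the function names `is_on_topic` and `guarded_reply`, and the refusal message are all illustrative assumptions, not Sierra’s actual system, and production guardrails typically rely on model-based classifiers rather than keyword matching.

```python
# Hypothetical sketch of a topic guardrail for a brand chatbot.
# Everything here (topic list, function names, refusal text) is an
# assumption for illustration; Sierra's real system is not public.

BLOCKED_TOPICS = {
    # Topics a retail-support agent should refuse to discuss.
    "politics", "medical advice", "adult products",
}

REFUSAL = "Sorry, I can only help with questions about our products and orders."

def is_on_topic(message: str) -> bool:
    """Crude pre-filter: reject messages that mention a blocked topic.

    Real deployments usually layer a classifier- or LLM-based check on top
    of (or instead of) keyword matching, since keyword lists are easy to
    evade with rephrasing, which is roughly what a jailbreak exploits.
    """
    lowered = message.lower()
    return not any(topic in lowered for topic in BLOCKED_TOPICS)

def guarded_reply(message: str, agent_reply_fn) -> str:
    """Run the guardrail before and after the underlying agent."""
    if not is_on_topic(message):
        return REFUSAL
    reply = agent_reply_fn(message)
    # Output-side check: a disabled or misconfigured post-check is one way
    # off-topic content can slip through even when the input looks benign.
    if not is_on_topic(reply):
        return REFUSAL
    return reply

# Example with a stand-in for the real model call (echoes the input):
print(guarded_reply("Tell me about adult products", lambda m: m))
# -> prints the refusal message
```

In a setup like this, a misconfiguration as simple as deploying the wrong topic list or skipping the output-side check for one client’s agent would let off-topic conversations through, which is consistent with Sierra’s description of Gap’s guardrails having been “inadvertently misconfigured.”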
Why it matters: The episode illustrates the risks AI chatbots can pose to brand safety and reputation. Sierra fixed the issue with Gap’s agent quickly and with relatively little fallout (so far), but the incident underscores the need for brands and their AI partners to stay vigilant as bad actors devise increasingly sophisticated ways of jailbreaking LLMs.
The relative ease with which users circumvented Gap’s topic restrictions shows that brands need to vet AI vendors carefully and confirm that all necessary safeguards are in place before such features go live.