The news: OpenAI’s new “ChatGPT agent,” which started rolling out last week, goes beyond chatbots by acting as an autonomous “digital worker,” per TechCrunch.
Available to OpenAI’s Pro, Team, and Enterprise subscribers for $200/month per user, the agent operates software, browses websites, fills out forms, and creates documents within a secure sandbox, potentially rivaling tools like Microsoft Office.
The next evolution of agentic AI: The feature combines ChatGPT’s conversational interface, Operator’s web browsing capability, and Deep Research’s information aggregation. OpenAI describes it as a digital worker with its own computer.
ChatGPT agent can integrate with apps (Gmail, GitHub) and log into websites, enhancing research and task execution. It can also generate full PowerPoint decks and Excel-style spreadsheets inside ChatGPT, letting users bypass Word, Excel, and PowerPoint and chipping away at Office’s productivity lock-in.
But there are limitations: To edit agent-generated documents, users have to refine them through follow-up prompts, export the content to other suites, or enable ChatGPT connectors.
Collaboration features are also unavailable in ChatGPT agent at this time.
Testing challenges: While ChatGPT agent can handle complex tasks much like a human personal assistant, some early reviewers note that it can be slow or buggy.
- The Verge found that the agent can misinterpret user intent or complete only part of a workflow before getting stuck, requiring human intervention to move forward.
- It also fell short on real-world transactions, such as adding items to a shopping cart or booking appointments, since those actions take place in a sandbox environment rather than on real accounts.
Our take: ChatGPT agent offers a glimpse of how quickly generative AI (genAI) is evolving toward reshaping digital workflows by combining complex systems, but it still has plenty of room for improvement. Now that OpenAI has set a benchmark for performance and pricing, expect competitors to roll out similar autonomous tools.
Next steps:
- As AI companies combine their models into autonomous tools, marketers and researchers should pilot agents on repeatable, low-risk tasks like generating decks or summarizing reports.
- Maintain human oversight, track time saved, and evaluate ROI and output quality against legacy tools like Microsoft Office to determine whether agents are a viable replacement.