Behind the Numbers: Next-Gen AI: From Assistants, to Autonomous Agents, and Beyond

On today’s podcast episode, we discuss the many definitions of an “AI agent”, why they’re so hard to build right, and what comes next. Join Senior Director of Podcasts and host Marcus Johnson, Analyst Jacob Bourne, and Vice President of GenAI Dan Van Dyke. Listen everywhere and watch on YouTube and Spotify.

Subscribe to the “Behind the Numbers” podcast on Apple Podcasts, Spotify, Pandora, Stitcher, YouTube, Podbean or wherever you listen to podcasts. Follow us on Instagram.

 

Kinective Media by United Airlines is the airline industry’s first traveler media network, using insights from travel behaviors to connect customers to personalized, real-time advertising, content, experiences, and offers from leading brands. Kinective Media’s platform allows marketers to reach travelers across a wide range of channels including United's award-winning mobile app and inflight entertainment screens. Kinective Media enhances the travel experience for millions of United customers and drives greater loyalty among United MileagePlus® members through customized offers and experiences. For more information, visit https://kinectivemedia.com to get started today.

Transcript

 

Marcus:

This episode is made possible by Kinective Media by United Airlines. Kinective Media by United Airlines is redefining traveler media with a world-first omnichannel network. From in-flight to online and in-app, experience best-in-class tech helping brands engage travelers where it matters most. Are you ready to make an impact? Of course you are. Discover more at kinectivemedia.com. That's Kinective with a K.

Hey, gang, it's Friday, April 18th, somehow. Dan, Jacob, and listeners, welcome to Behind the Numbers, an eMarketer video podcast made possible by Kinective Media by United Airlines. I'm Marcus. Today we'll be discussing what's ahead for AI. Join me for that conversation, we have two people, let's meet them right now.

We start with our VP of gen AI based in New York, it's Dan van Dyke.

Dan Van Dyke:

Thanks for having me, Marcus.

Marcus:

Yes, sir. Of course. We're also joined by our technology analyst on the other coast, out in California in the Bay, it's Jacob Bourne.

Jacob Bourne:

Thanks for having me today, Marcus.

Marcus:

Absolutely, absolutely. All right. Today's fact. Right before we hit record, Dan said, "I've got one for you." I wasted three hours today on my one. What I said to Dan is we're going to compete. Jacob's going to ref. Dan, you can go first.

Dan Van Dyke:

Okay. Today I learned that sloths, my favorite animals, will only use the bathroom, I'm going to say that broadly, once a week. They'll come down from their trees-

Marcus:

Once a week?

Dan Van Dyke:

... and they excrete one-third of their body weight. One-third. They take the time to dig a hole and then they bury it. They risk their lives in the process and nobody knows why. That is my fact of the day.

Jacob Bourne:

Interesting. Slow metabolism, I guess.

Dan Van Dyke:

Yeah, I guess so.

Jacob Bourne:

Marcus, I think this is the second time that sloths have come up-

Marcus:

It is.

Jacob Bourne:

... in the fact of the day.

Dan Van Dyke:

Really?

Jacob Bourne:

Yeah.

Marcus:

Yeah, I think before I was talking about how they can hold their breath for 40 minutes.

Dan Van Dyke:

What?

Marcus:

I don't know why you would need that. Maybe for a little deep sea diving on the Great Barrier Reef.

Jacob Bourne:

Is that true?

Dan Van Dyke:

They do swim.

Marcus:

For 40 minutes, yeah.

Jacob Bourne:

Holy cow.

Marcus:

Yeah, a shocking amount. I think it's one of the most, maybe even the most, by an animal.

How did this come about, Dan?

Dan Van Dyke:

Oh, I was just on Reddit.

Marcus:

Okay. That's how it happens. All right, cool. That is a good one. I've got one for you. Jacob, don't be swayed, but I did invite you on the show. Denmark holds the Guinness World Record for the oldest continuous use of their national flag. Since 1625, they've been using the same flag.

I went down a rabbit hole of flags because I'm cool. A lot of flags look very similar, I found out, which is another fact about flags. Chad's and Romania's are literally identical. Romania's flag came first by 100 years, all right, Chad, so you stole theirs. Senegal and Mali have the same one, but Senegal's has a little star in the middle. Indonesia and Monaco both have two horizontal stripes, red over white, but their dimensions differ. New Zealand and Australia are the same, but the stars are different colors. Venezuela, Ecuador, and Colombia are yellow, blue, red horizontal bars, but have different emblems in the middle. Two more for you. Luxembourg and the Netherlands are red, white, blue lines, but the blues are slightly different shades. Slovenia, Russia, and Slovakia are all white, blue, and red horizontal bars, but with different coats of arms.

Dan Van Dyke:

Two reactions. One, that's 42 facts. Two, remarkable that you could say all that without stuttering. I'm very in awe.

Marcus:

I practiced. [inaudible 00:03:44].

Dan Van Dyke:

Who's the winner?

Marcus:

Come on, Jacob.

Jacob Bourne:

The sloth, anything about animals is memorable.

Marcus:

Ah, you made good points. Of course he won!

Jacob Bourne:

Marcus, yours made me think something for the first time, which is that we don't really think outside the box much with flags, do we?

Marcus:

Yeah, not at all.

Jacob Bourne:

Not a lot of creativity there.

Marcus:

Not at all, no. They're all grouped together, so I guess regions change, but the flags stay quite similar based on whereabouts the countries are that are coming up with them. But yeah, not a ton of creativity. Dan absolutely wins of course. It wasn't even close. Anyway, today's real topic, the dawn of AI agents, and also the AI native company.

All right, gents. "Everyone's talking about AI agents, barely anyone knows what they are," writes Isabelle Bousquette of The Wall Street Journal. She notes that "AI agents are broadly understood to be systems that can take some action on behalf of humans," like buying groceries or making restaurant reservations. But in some cases, the question of what constitutes an action is blurry.

Dan, I'll start with you. What is an AI agent?

Dan Van Dyke:

Yeah. I actually had an education on this recently. I was talking to a vendor, one of those AI-native companies, getting a demo from them.

Marcus:

Okay.

Dan Van Dyke:

I used the term agents wrong. The person on the other end of the phone, or Zoom, politely told me that really, there's a whole spectrum, from the chatbots like ChatGPT, to workflows that are rigidly orchestrated, but a little bit more robust than a chatbot, to what fits into the term agent in the classical sense. Agent means an AI-based tool that can take action based on a predefined task with autonomy and use tools. That's what defines an agent.

But, Jacob, you've actually written on the subject, so I'm curious if that gels with your definition?

Jacob Bourne:

It does. I think, well, first of all, it's a buzzword at this point. Your story, Dan, is relevant because these are technical terms that become commercialized and become part of the consumer marketplace, and then it takes on new meaning.

Dan Van Dyke:

Yeah.

Jacob Bourne:

But I think distinguishing between gen AI chatbots or gen AI tools and agents, I think it's really about the level of autonomy. With a chatbot, you have to prompt it for every small task. With AI agents, it can take an action without that need for step-by-step prompting. It can do things in the background that you didn't necessarily tell it to, but it's all geared towards the goal that you want, essentially.

Marcus:

Okay. You said you used it wrong, Dan. Different, maybe? It feels like if you ask 100 people, you get 101 responses, even if you are talking technical terms.

One from Tom Coshow, senior director analyst at Gartner, says, "Does the AI make a decision and does the AI agent take action? Software needs to reason for itself and make decisions based on contextual knowledge to be a true agent."

There's another quote here from Robert Blumofe, CTO at Akamai Technologies: "Many use cases today," he said, "resemble assistive agents rather than autonomous agents, requiring direction from a human user before taking action and narrowly focused on individual use cases." He does say an assistive agent is a bit of an oxymoron, because an agent's supposed to do it for you.

What do we think of those variations of definitions?

Dan Van Dyke:

I think it reflects the fact that the goalposts are shifting for what constitutes an agent. For now, the threshold is defined by the level of autonomy and access to tools. But the capabilities of the baseline, so what you can get within ChatGPT, really resemble a lot of the characteristics that you were describing, Marcus, in that ChatGPT can decide to search the web based off the request that you ask it. It can invoke different tools, like image generation. Does that constitute an agent?

As nice as it would be to be able to come up with a crisp and specific definition for what constitutes an agent, it is a murky term and the definitions are changing over time.

Marcus:

Yeah.

Jacob Bourne:

Yeah.

Marcus:

Jacob, there are levels to this, right? With autonomous driving, I've referenced this quite a lot, there are I think six levels, from zero to five.

Jacob Bourne:

Yeah.

Marcus:

Various levels of autonomous cars. I'm surprised that agents don't have something similar, because Belle Lin, who writes for the Journal, was saying, "An AI agent can perform simple tasks, like ordering office supplies." Eventually, some enterprises want to get them to handle financial transactions and hiring new workers, but that's quite a variation in difficulty.

Jacob Bourne:

Yeah, I think that's a great analogy you're making there, or a comparison anyway, with the autonomous vehicles. I think the difference here is that autonomous vehicles are doing a very specific task: driving your car. With AI in general, it's potentially anything, anything that a human could do. At least, that's the vision with artificial general intelligence.

I think what this really highlights here is there's a bit of a disconnect between the vision for the AI sector, AI companies building this, and where the technology currently is at. The vision is boundless automation, essentially. Artificial general intelligence that can do anything a human can do, I think that's the vision. But it's far from getting there, so a lot of these terms become incremental steps towards what the ultimate goal is. If we think about the initial agents that launched, like OpenAI's Tasks for example, very limited automation, very limited capabilities, but we're still calling them agents. I think they're really incremental steps towards where we'll eventually get, which is that you have agents that can really handle very complex tasks that a human would do. Really, it means people giving up a lot of that micro-decision making to the AI that's really operating fully in the background.

Marcus:

It's quite ironic, actually. Zoe Weinberg, a venture capital investor was saying, "It's ironic to see a term that started out describing human agency being used to talk about its opposite technology [inaudible 00:10:30] with little to no human oversight."

Dan, we were talking before this recording about this quote from Erin Griffith of The New York Times. She says, "After AI agents comes agentic AI." How are they different?

Dan Van Dyke:

I don't know if I agree it's what comes after.

Marcus:

Oh, interesting. Okay.

Dan Van Dyke:

At least, according to my definition. We've already touched on how subjective and non-uniform those are. The way that I define agentic AI is as an umbrella term that encompasses both agents in the true sense of the word. Jacob, you were talking about Tasks. OpenAI has also released Operator, which can browse the internet on a user's behalf and do things like attempt to book a flight. Or deep research, which can write a research report by browsing tons of sources. Those are true agents.

But there's a middle ground that's above what ChatGPT can do, but below the capabilities of a true agent. An example of that would be I'm building a lot of workflows to assist our research team in gathering content from Feedly, curating it, and writing what are known as research blog posts, which is an internal tool. Although it's tons of large language models strung together, and although there's a high degree of prompting and complexity in this workflow, I wouldn't call it an agent. I would say that it fits within the realm of agentic AI.

But to your question of what's next? I would say multi-agent workflows is the thing that's next.

Jacob Bourne:

Yeah.

Dan Van Dyke:

What that means is: think deep research, which can write reports, meets Operator. It writes a report that ends up triggering 10 Operators to go out and accomplish different tasks, all in service of a user's request.

Jacob Bourne:

Yeah.

Dan Van Dyke:

It's starting to build to an organization all working in unison towards a common goal.

Jacob Bourne:

Yeah. Yeah, I 100% agree with that. I think it's really about these different types of AI, with different skills in and of themselves, coming together to accomplish more. I think the next step is also AI agents that can anticipate users' needs, so you hardly need to do any prompting. It knows what you're going to need in the future and is already working on it in the background.

I think though, for daily purposes, ultimately already we're seeing that these terms get used interchangeably. I imagine that, again, the deeper meaning or the technical meaning probably will get lost eventually.

Marcus:

Yeah, yeah. I liked what you said, Dan, this umbrella term. Dr. Andrew Ng, a prominent AI researcher, was saying, "There's a gray zone. Agentic is an umbrella term and covers some tech that wasn't strictly an agent, but had agent-like qualities."

We talked about some of these agents, at least one. I think, Dan, you mentioned OpenAI, maybe Jacob, OpenAI Tasks. Who else do we have? What are some other examples of popular AI agents at the moment?

Jacob Bourne:

There's all kinds from most of the tech giants, the leading AI companies. Amazon has its Bedrock agents through the cloud. Google has its Vertex AI Agent Builder. Google also has Agentspace, which just announced that its agent now has coding capabilities, so autonomous coding.

Marcus:

Okay.

Jacob Bourne:

Dan mentioned Operator from OpenAI. Oracle has a clinical AI agent for healthcare. Nvidia has Agentic AI Blueprints, which allows organizations to create their own custom agents. Salesforce has Agentforce. The list goes on and on. There are also more industry-specific agentic platforms as well.

Marcus:

Okay. Are they interoperable, Jacob? Can they speak to each other? Dan was talking about a multi-agent world. Is that within the umbrella of Google, the ecosystem of Amazon, or do they talk to each other across companies?

Jacob Bourne:

Well, I think that that's part of the vision and I think they're working towards interoperability.

Marcus:

Okay.

Jacob Bourne:

But I would say that we're not quite there yet.

Marcus:

Okay.

Dan Van Dyke:

Two recent steps have brought us closer, and I guess recent is a stretch for the first one. First, the introduction of MCP. MCP stands for Model Context Protocol, which was released by Anthropic, the creator of Claude. Model Context Protocol is simply a way for agents to be able to access tools. Think accessing GitHub repos, or accessing databases, or Zapier for automations. It's a simple way that is very elegant and is becoming the mainstream accepted standard to connect AI to all these other things that exist on the open internet, or even local files on a user's computer if they give it access. Then secondly, the new A2A (agent-to-agent) protocol, which was released by Google, aims to complement the capabilities of MCP by allowing agents to communicate with other agents in a common language.

That vision of interoperability is starting to come a little bit more into focus. But the reality is quite fragmented, as you were painting a picture of, Jacob, where everybody wants to be the de facto home for agents. I think we're inevitably headed towards consolidation as a provider starts to emerge at the forefront. But for the moment, it's just getting increasingly crowded, competitive, and fragmented.
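The pattern Dan describes, a model requesting tool calls and a protocol routing them to registered tools, can be sketched in plain Python. This is an illustrative sketch only, not the actual MCP SDK or its API; the tool names and `dispatch` function here are hypothetical stand-ins.

```python
# Illustrative sketch of what a tool protocol standardizes: a registry of
# tools plus a dispatcher that routes a model-requested call to the right
# tool. These tool names and functions are hypothetical, not real MCP calls.

def search_web(query: str) -> str:
    # Stand-in for a real web-search tool.
    return f"results for {query}"

def read_file(path: str) -> str:
    # Stand-in for a local-file tool the user has granted access to.
    return f"contents of {path}"

TOOLS = {"search_web": search_web, "read_file": read_file}

def dispatch(tool_name: str, argument: str) -> str:
    """Route a tool call requested by the model to the matching tool."""
    if tool_name not in TOOLS:
        raise ValueError(f"unknown tool: {tool_name}")
    return TOOLS[tool_name](argument)

print(dispatch("search_web", "AI agents"))  # prints: results for AI agents
```

What a real protocol adds on top of this toy dispatcher is exactly the interoperability being discussed: a common wire format so any agent can discover and call any server's tools without bespoke glue code.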

Marcus:

Yeah. A lot of options out there, in terms of different agents that you can choose from. OpenAI, an artificial intelligence company, released a platform that lets companies create their own AI bots for completing tasks, such as customer service and financial analysis. Belle Lin at The Wall Street Journal was noting this.

Dan, we talked before, I think it was last week. You'd said to me that part of this conversation, maybe a part that's not being discussed as much, is that AI agents are hard to build right. What did you mean by that?

Dan Van Dyke:

They're very easy to build, period. You could spin together a prototype in a couple of hours if you're technical enough, and it'll be really impressive. It's once you try to push that into production, to fulfill a real need you have in an organization or to build something that would be client-facing, that you start to encounter difficulties. That's why the eval process is actually the most crucial part in measuring the efficacy of agents. Where a lot of people will get hung up is they'll realize that, for a particular task, what they really need is 95% accuracy to meet the baseline that they have with people. An AI agent, right out of the box, will maybe get to 80% accuracy. But that last 15% is actually 80% of the effort.

What I was describing in some of the workflows that I've strung together to assist our research team actually turned into a very protracted process of figuring out evaluations and pushing new iterations out to the research team. Then having them come back to me and realize, "Oh, yeah, I didn't ask for this feature and it's actually crucial." Then doing that again and again. It's through no fault of the research team, it's just you don't know what to ask for until you're actually deploying these in the real world and seeing where they fail.

It's a much more difficult process than it looks like on its face to take something from very promising POC to something that's actually in production and starting to create value. Which is not to say it's impossible. In fact, the thing that I am describing for the research team, they're really positive about it, it's really useful right now. But I'm already looking into new capabilities that would make it even more useful. It's definitely a journey, and easy to get sucked up in the hype and think that it's going to be fast. It isn't.
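The eval loop Dan outlines, scoring an agent's outputs against a gold-standard set and comparing its accuracy to a human baseline, could look roughly like this. The toy agent, test cases, and the 95% threshold are hypothetical placeholders for whatever a real team would measure.

```python
# A minimal sketch of an eval harness: run the agent over a gold-standard
# set of (input, expected) cases and compare accuracy to the human baseline.
# The toy agent, cases, and threshold below are hypothetical.

HUMAN_BASELINE = 0.95  # the accuracy bar the agent must meet

def run_evals(agent, cases):
    """Return the fraction of cases where the agent's output matches."""
    passed = sum(1 for prompt, expected in cases if agent(prompt) == expected)
    return passed / len(cases)

def toy_agent(prompt):
    # Placeholder: a real agent would call an LLM with tools here.
    return prompt.upper()

cases = [("ok", "OK"), ("fail", "FAIL"), ("done", "DONE")]
accuracy = run_evals(toy_agent, cases)
print(f"accuracy: {accuracy:.0%}, meets baseline: {accuracy >= HUMAN_BASELINE}")
```

Rerunning a harness like this after every iteration is what turns the "push it out, hear what broke, fix it" cycle Dan describes into something measurable.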

Marcus:

Yeah.

Jacob Bourne:

Yeah. Yeah. Just to add to that, I think we all know about the issue with chatbots hallucinating. It's well documented with lots of examples. But the risk there is that, okay, you have something problematic in a chat box, an output that's erroneous or problematic in some other way. But when you have AI agents that are potentially making transactions online, if they get stuff wrong, then the stakes are a bit higher. I think that makes it difficult on the technical level, in terms of putting in safeguards to reduce the likelihood of that happening. But also, just deploying it commercially knowing that there's that risk there I think makes it difficult.

Marcus:

Yeah. Yeah, it's not something that happens overnight. Greg Shoemaker, Adecco Group senior VP of ops in AI, had a good quote saying, "Companies should approach agents less as a tech deployment and more as the development of digital workers that need to be onboarded and trained."

Dan, you mentioned a word which I thought was interesting, which was ... I think you said something to the effect of technical ability.

Dan Van Dyke:

Evals, maybe?

Marcus:

Well, just the fact that you have to have some kind of a technical understanding of how these things work. I'm wondering if that's part of the problem is that this is hard. OpenAI was even saying, "To use its AI agent building platform, enterprise developers still need to have a comprehensive technical background." How proficient do you have to be with AI to be able to build one of these agents and build it right, to your point, Dan?

Dan Van Dyke:

Well, I've been covering the AI space for maybe eight or so years, but primarily from a financial services lens, as formerly the head of financial services research within eMarketer. More recently, I would say two-and-a-half years ago with the advent of ChatGPT, if I'm getting that timeline right, I started to focus more and more of my work on AI, to the point where it's now 100%, and started to build POCs and applications, which has transitioned into AI becoming my full-time focus.

Over the course of that amount of time, say two-and-a-half years, I've gotten to the point where now I feel proficient enough that, yes, I could build a POC. Yes, I could do evals that would help get something into production. In fact, I've done those things. But it did take years. That time is spent figuring out things like how do you set up a GitHub account, and what is the importance of not hard-coding environment variables into repos that you push into production. All these arcane terms really have real-world consequences if you're talking about an application that you're building and putting out into the world, which would otherwise become a mess of spaghetti and quickly be attacked by hackers.
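The hygiene practice Dan flags, keeping secrets out of the repo by reading them from the environment at runtime, looks roughly like this in Python. The variable name `MY_SERVICE_API_KEY` is hypothetical.

```python
# Sketch of the practice Dan mentions: read secrets from the environment
# at runtime rather than hard-coding them in files you push to a repo.
# The variable name MY_SERVICE_API_KEY is a hypothetical example.
import os

def get_api_key() -> str:
    """Fetch the key from the environment; fail loudly if it's missing."""
    key = os.environ.get("MY_SERVICE_API_KEY")
    if key is None:
        raise RuntimeError("MY_SERVICE_API_KEY is not set")
    return key

# BAD:  API_KEY = "sk-live-abc123"   # ends up in version control
# GOOD: API_KEY = get_api_key()      # the secret stays out of the repo
```

In practice the value would come from a local `.env` file excluded from version control, or from the deployment platform's secret store.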

Marcus:

Yeah.

Jacob Bourne:

Yeah.

Dan Van Dyke:

You become a cautionary tale. I think it is well put that there is a learning curve that you still have to overcome, but that learning curve is rapidly dropping as tools like Claude 3.7 become more effective helpers. That's led to the emergence of something called vibe coding.

Marcus:

Right.

Dan Van Dyke:

Which is somebody like me is just describing, "Here's what I want." I'm proficient enough that I could describe, "Here's the platform I want you to use, here's what I want you to avoid." I can guide it every now and then. It's just a lot of I get errors and then I'm saying, "Help me fix this error. I'm going to feed you documentation."

Jacob Bourne:

Yeah.

Dan Van Dyke:

Which helps, but I don't want to overstate it. It's quite often frustrating, mind-numbing work. Hopefully, increasingly less so in time.

Jacob Bourne:

Yeah.

Marcus:

But it's a great example of how someone in-house can learn it. You don't have to externally hire someone who studied it, has a PhD in it, and has been at a company doing it for 20 years. Actually, someone in-house has an understanding of internal processes, what the company needs, and a relationship with the people at the company as well, so there's an argument to be made that maybe that's better, Jacob.

Jacob Bourne:

Right. Yeah, just to note too, I think things are changing. Just yesterday, Google Cloud announced its new no-code agent designer, which launched specifically to tackle this problem of how non-technical people can take advantage of developing their own agents. I think this is something we're going to see more of to meet that need.

Marcus:

Listen to this one, gents. AI agent adoption, it seems as though it's been extremely limited so far. I have one data point from Mr. Coshow, who I mentioned earlier from Gartner. He was saying just 6% of 3,400 people in a recent Gartner webinar on the subject said that their company had deployed agents. Just 6%. There's an argument to be made, "Look, you've deployed one." There's also an argument to be made about, "Yes, but did you deploy it well? How advanced is it?" Dan, you were saying you can do it, but they're hard to do right.

Dan, let's start with you for this one. What do you see the next couple of months looking like for agents, for agent deployment? In, I'd say, yeah, we're only in April, so maybe I should just say 2025 because in a few months, as I was saying before the show, it's Christmas.

Dan Van Dyke:

I think by the end of the year, you'll probably get into the low teens, up to 20% adoption if you reran that same study.

Marcus:

Interesting.

Dan Van Dyke:

If I had to guess.

Marcus:

Yeah.

Dan Van Dyke:

That will be as a result of more companies releasing agentic platforms so that the developer workforce who is eager to build these tools can go out and build on permissioned, secure platforms that they're already using.

Additionally, you'll start to see a trickle of folks internally, I'm thinking about very advanced AI users that we have within eMarketer like Henry Powderly, for instance, going out and starting to pick up skills and build their own tools. I think we'll see a convergence as both groups start to build more agents and I'm excited to see that continue to grow into 2026.

Marcus:

Yeah. Jacob?

Jacob Bourne:

I agree with Dan's forecast there. Marcus, I think that that 6% number, it does seem low, especially since there was other data that indicated the adoption was higher. I think the issue here goes back to what we're saying about what constitutes an AI agent.

Marcus:

Yeah.

Jacob Bourne:

The lower number points to the fact that I think the adoption of true agents is very low. But I think there is definitely a lot more adoption of AI assistants that are getting called agents, which pushes the data up a bit in terms of adoption.

Marcus:

Yeah.

Jacob Bourne:

We're going to continue to see this, in terms of okay, are you actually using an agent or not? But as the technology gets better and we achieve a higher level of automation, then I think it'll become more clear over time.

Marcus:

Yeah. Dan mentioned Henry Powderly. He was on with Gadjo Sevilla, and they both talked about using AI at work. A two-part episode, or series if you will. I think it was March 31st and April 4th that those episodes came out, so check those out.

That's all we have time for for today's episode, unfortunately. Thank you so much to my guests for hanging out with me today. Thank you first to Jacob.

Jacob Bourne:

Thanks for having me today, Marcus. Appreciate it.

Marcus:

Yes, sir. Thank you of course to Dan.

Dan Van Dyke:

Thank you.

Marcus:

Absolutely. Thank you to the whole editing crew, Victoria, John Lance, and Danny. Stuart runs the team and Sophie does our social media. Thanks to everyone for listening in to Behind the Numbers, an eMarketer video podcast made possible by Kinective Media by United Airlines. We'll be back on Monday. Happiest of weekends.
