Uncovering Wild Inconsistencies

Over the last couple of months, the team at BehindLogin has assessed 20 banking chatbots through real, logged-in customer journeys as part of our Chatbot Experience Benchmark 2026.

When reviewing the research, we uncovered some wild inconsistencies.

Some chatbots sound intelligent but can’t complete meaningful tasks. Others can complete tasks yet still feel frustrating to use. Many sit inside logged-in banking apps with access to rich customer context, yet respond as though they know almost nothing about the person they are speaking to.

After analysing hundreds of interactions across five assessment areas, a few reflections stood out.

Also a reminder that customers are rarely looking for a chatbot!

This article offers a taster of the full report, now available for download.

1. Beware the market hype

Only one provider achieved an Exceptional Experience in the benchmark.

That doesn’t mean the market lacks investment or ambition. Quite the opposite: as we see in our Radar updates, the volume of AI announcements, assistant launches and roadmap discussions is enormous.

But behind the login, most experiences could be described as advanced tooling rather than genuinely intelligent assistance.

The market is nowhere near as mature as the industry narrative suggests, and there’s a significant gap between positioning and execution.

2. Context is king

Data Use & Personalisation was the lowest-scoring category across the benchmark.

A number of chatbots behaved like they had never met the customer before.

These systems sit inside logged-in banking apps. They can see spending patterns, products, behaviours, balances and transaction history. Yet many interactions still produced generic, context-free responses. Many are still standalone rather than integrated into the wider experience.

On review, we observed that the providers pulling ahead weren’t necessarily the most conversational. They were simply the ones using customer context in genuinely useful ways.

3. Task completion doesn’t always = a good experience

One of the more interesting findings was that some providers scored relatively well for Action & Task Completion while still delivering weak Interaction Experiences overall.

In other words, the chatbot technically “worked” but the experience still felt robotic, fragmented or effortful.

Users don’t pause to separate:

the AI capability
the UX
the language
the flow

They simply experience it all as one interaction.

And ultimately, they just want to get something done.

4. Many chatbots are still redirection tools

A recurring pattern across the benchmark was the tendency to redirect rather than resolve.

Many chatbots still rely heavily on:

links
FAQs
help articles
handoffs
signposting to other channels

Let’s remember that many customers will arrive in chat after failing elsewhere. They’re already in a moment of uncertainty, friction or urgency.

At that point, opening another door rarely feels helpful.

The strongest experiences reduced effort by keeping users within the flow and helping them complete the task there and then.

5. Trust is defined in the small moments

Conversational AI plays a key role in building trust, and that trust is earned across a number of small moments.

Every interaction either:

reassures
clarifies
reduces effort

…or it slowly erodes confidence and trust.

In some instances we saw personal data displayed in the wrong context, which immediately felt careless. We also spotted generic responses, poor handoffs, repeated information and failures to understand intent.

Whilst these things may seem small individually, collectively they shape how competent and trustworthy a brand feels. Where trust is hard earnt and easily lost, we saw some experiences that were arguably more damaging than valuable.

So it comes back to the classic mantra of using tech for good: not to make a brand appear flashy and intelligent, but to create experiences that make customers feel understood, supported and confident at the moments that matter.

A reminder that customers are rarely looking for a chatbot

Banking apps aren’t alone in moving toward a future where conversational interfaces play a much larger role. And irrespective of sector, context is king.

It’s always worth remembering that customers are rarely looking for a chatbot. They’re typically looking for their problem to be understood and resolved.

So, bold marketing claims aside, the providers that focus on thoughtful, well-executed outcomes are likely to come out on top.

Download the Chatbot Experience Benchmark

The Chatbot Experience Benchmark is a vital resource for teams who are designing and building their own conversational experiences.

Access the full diagnostics of each provider behind-the-login
Explore a showcase of Best-In-Class conversational experiences
Discover a blueprint for chatbot evolution with our Maturity Assessment Framework
