Quality Booster

What's Left of the Tester When AI Takes Over 80 Percent?

The real changes to the tester profession happen more quietly than the headlines suggest. An honest stocktake of tasks, fears, and the question of who we are tomorrow.

Andi

Test Manager

When I write to fellow testers these days — usually in Teams or as a Slack DM, between two meetings — two sentences keep coming up. One gets typed out fast: “I just can’t keep up anymore.” The other comes later in the day, often disconnected from whatever we were just discussing, and after a few seconds of the typing indicator starting and stopping before it lands: “I sometimes wonder if they’ll still need me in two years.”

Both sentences are honest, both are understandable, and both miss the point. That’s not meant as reassurance — the shift happening right now is real. But it looks different from the headlines.

What’s Actually Shifting

The sober numbers first, then what sits behind them.

Capgemini’s World Quality Report 2025-26 says: 75 percent of organisations see AI-powered testing as a pivotal part of their strategy. Sixteen percent have actually scaled it. Meaning: it’s everywhere in the slide decks, but rarely running in any orderly way in the engine room. If your everyday feeling is that “everyone talks, few deliver” — you’re right.

What I see concretely: test creation is accelerating. AI tools generate test skeletons from user stories in minutes — covering maybe 60-80 percent of what I’d write myself. Self-healing locators reduce UI test maintenance by 70 to 80 percent. Across the SDLC, the phases are blurring: what used to be a clean handoff between requirements, development, and test is becoming a loop. Discrete phases are turning into continuous flow with strategic checkpoints.
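To make “self-healing” less magical: the core idea is a ranked list of selectors per logical element, with the test falling back down the list when the primary one breaks. A minimal sketch in Python with Selenium — the element name and fallback selectors are invented for illustration; real tools learn these fingerprints instead of hard-coding them:

# The core of the self-healing idea, reduced to a fallback chain.
# Element names and selectors here are hypothetical; real tools
# fingerprint elements and relearn selectors instead of hard-coding them.
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.by import By

FALLBACKS = {
    "checkout_button": [
        (By.ID, "checkout"),                              # primary
        (By.CSS_SELECTOR, "[data-testid='checkout']"),    # stable test hook
        (By.XPATH, "//button[contains(., 'Checkout')]"),  # last resort: visible text
    ],
}

def find_with_healing(driver, logical_name):
    """Try each known selector for a logical element; report when we heal."""
    last_error = None
    for by, value in FALLBACKS[logical_name]:
        try:
            element = driver.find_element(by, value)
            if (by, value) != FALLBACKS[logical_name][0]:
                print(f"healed '{logical_name}' via {by}={value!r}")
            return element
        except NoSuchElementException as exc:
            last_error = exc
    raise last_error

The maintenance reduction comes from exactly this move: the test references a logical name, and the binding to the DOM becomes replaceable data instead of brittle code.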

That isn’t “SDLC is dead”. It’s “SDLC looks different now”. Anyone who still treats testing as a sequential step after development has been working against the current for two years.

The Real Fears — Not the Ones in the Headlines

The clickbait stat is familiar by now: 74 percent of IT professionals fear their skills will become obsolete. Sixty-nine percent fear being replaced. I’ve been reading those numbers in every other LinkedIn post for the past eighteen months.

The sentences I actually hear are different — and more concrete:

“I’m losing track.” That’s the falling-behind fear. It’s not “AI replaces me tomorrow”, it’s “while I’m getting up to speed on one tool, three new ones come out that do everything better”. On the Ministry of Testing forums this concern now appears more often than the replacement question.

“I no longer understand how the output is produced.” Black-box anxiety. The AI generates a test, I click “Run”, it goes green — but I can’t say with certainty why it does what it does. That gnaws at a profession in which understanding is part of the DNA.

“Thinking is getting harder.” An HBR study called this phenomenon “AI Brain Fry”: GenAI power users report a 45 percent higher burnout rate. Not because they work more — but because they’re constantly evaluating somebody else’s suggestions without going through the thinking process themselves. Evaluating without having done the thinking wears you out in a different way than the thinking itself does.

“Who am I if the AI writes the tests?” That’s the most uncomfortable question, because it isn’t a tool question. Anyone who has tied their identity to a task list — and we have a few of those in testing — gets restless when the list shrinks.

What Stays Human — And Why That Shouldn’t Be Comforting

The standard answer in every other article is: “Exploratory testing, domain knowledge, judgment — that stays human.” True. But at that level of generality, it doesn’t help anyone.

Getting more specific:

  • Risk assessment in business context. What does a bug mean for regulated systems? For compliance? For an audit? An AI can produce a list of “critical / non-critical”. But it doesn’t weigh the risks against stakeholder politics, against release pressures, against what the contract actually says.
  • Edge-case gut feeling. That comes from experience, not from training data. If you’ve seen five times that a particular class of bug always shows up in the same component, you look there first. That isn’t a heuristic — it’s pattern recognition with history.
  • Stakeholder translation. The same defect sounds like “technical debt” to a developer, “velocity risk” to a PO, “reporting obligation” to compliance. No AI does that translation — and it’s often what makes the difference in the end.

And then there’s the thing that’s hard to describe and that I still consider the most important point.

The Crying Kids in the Back

Picture a family on a summer trip in their EV. Two kids in the back, whining for the past hour. The battery is down to its last few percent. The parents have aimed for the next fast-charging station — range just barely enough, cutting it close. They arrive, grab the cable, plug in. The station blinks briefly, throws an error, the session aborts. Second attempt. Another error. Restart the app, third try, another error. By now the kids are crying, the mother is on the phone with support, the father is trying the next station over — and there too, something doesn’t work.

That scene is what I think about when someone says: “Well, AI tests that automatically now.”

An AI can test the charging session. It can detect the error code, measure response time, even simulate a retry. What it cannot do: prioritise this scenario, because it doesn’t know what this family is feeling. It knows no pressure, no exhaustion, no crying children. It knows codes and states, but it knows nothing of desperation on a hot summer day.

I test E2E paths for charging infrastructure. When I find a bug, the question is never just “does the component work?” The question is “which human experiences this, in which situation, with which consequences?” That translation from technical behaviour into human reality — that is what defines our role. And it doesn’t get less important when the AI writes the tests. The opposite.
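What that translation can look like where it meets the code: below is a hedged pytest sketch of the scenario above. The charge_point fixture, its API, and the user_impact marker are invented stand-ins, not a real SDK. The point is that the retry logic and the priority metadata encode the family, not just the protocol:

# Hedged sketch of an E2E charging-session test. The charge_point
# fixture, its methods, and the user_impact marker are hypothetical.
import time
import pytest

MAX_RETRIES = 3  # mirrors what a stranded driver would actually attempt

@pytest.mark.user_impact("family with near-empty battery and no alternative nearby")
def test_charging_session_survives_transient_station_error(charge_point):
    """A failed first handshake must not end the trip: retry like a human would."""
    last_error = None
    for attempt in range(1, MAX_RETRIES + 1):
        session = charge_point.start_session(connector=1)
        if session.state == "CHARGING":
            break
        last_error = session.error_code
        time.sleep(2 ** attempt)  # back off between attempts, like re-plugging
    else:
        pytest.fail(
            f"no charge after {MAX_RETRIES} attempts, last error: {last_error}. "
            "For the family at the station this means: stranded."
        )
    assert session.delivered_kwh() > 0

An AI can generate the loop and the assertion. The marker line, naming the situation and the consequences, is the part only someone who has pictured the back seat will write.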

How the Role Is Actually Shifting

A simple picture for what’s moving where:

quadrantChart
    title Tester Tasks - Where AI Helps, Where Humans Stay
    x-axis Routine --> Strategic
    y-axis Low Consequence --> High Consequence
    quadrant-1 Human decides
    quadrant-2 Human prioritises
    quadrant-3 AI automates
    quadrant-4 AI assists, human reviews
    Test skeleton from user story: [0.18, 0.30]
    Selector healing: [0.12, 0.22]
    Regression run: [0.20, 0.45]
    Edge-case exploration: [0.78, 0.62]
    Risk prioritisation: [0.85, 0.82]
    Release-gate decision: [0.88, 0.90]
    Stakeholder translation: [0.70, 0.75]
    Test data generation: [0.30, 0.35]
Routine plus low consequence reliably moves to AI: test skeletons, selector repair, standard regressions. Strategic plus high consequence stays human: risk assessment, release decisions, translation between stakeholders. In between, a new zone is forming: AI output that needs human review — the actual workplace for many testers in the coming years.
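Read as a decision rule, the chart is almost embarrassingly simple. A toy version in Python — the 0.5 thresholds and the example scores are illustrative assumptions, not measured values:

# Toy routing rule behind the quadrant chart. Thresholds and scores
# are illustrative assumptions, not measured values.
def route(task: str, strategic_score: float, consequence: float) -> str:
    """Map a task's position in the quadrant to who handles it."""
    strategic = strategic_score >= 0.5
    high_stakes = consequence >= 0.5
    if strategic and high_stakes:
        return f"{task}: human decides"
    if not strategic and not high_stakes:
        return f"{task}: AI automates"
    if strategic:
        return f"{task}: AI assists, human reviews"
    return f"{task}: human prioritises"

print(route("Selector healing", 0.12, 0.22))       # AI automates
print(route("Release-gate decision", 0.88, 0.90))  # human decides

The hard part in practice is not the rule; it is agreeing, per task, on honest scores.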

What this means for new roles tends to sound grand at conference talks: “Agentic Test Automation Architect”, “AI Output QA Analyst”, “Continuous Quality Engineer”. Hear those titles and you think: buzzwords. In practice it means something more modest and more tangible: someone who no longer writes every test by hand but orchestrates test agents. Someone who no longer finds every bug themselves but evaluates which bugs in the AI stream are even real bugs. Someone who no longer just monitors the system but also what the AI suggestions do to the system.

Test Managers: What Has to Be Decided Now

For test managers the centre of gravity shifts even more visibly. The operational question “Who tests what by when?” recedes. In its place, questions arrive that weren’t on the table two years ago:

  • Who is liable when an AI-generated test has signed off a critical feature — and the bug only surfaces in production?
  • For which decisions is human-in-the-loop mandatory, and for which is it merely a formality?
  • How do we document what an agent did when nobody is watching anymore? (One minimal answer is sketched after this list.)
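The documentation question is the most tractable of the three, so here is a minimal sketch: an append-only JSON-lines audit trail for agent actions. The field names and the payload hash are assumptions about what an auditor would want, not a standard:

# Minimal agent audit trail as JSON lines. Field names and the payload
# hash are assumptions about what an auditor would want, not a standard.
import hashlib
import json
import time

def audit(log_path: str, agent: str, action: str, payload: dict) -> None:
    """Append one tamper-evident record per agent action."""
    record = {
        "ts": time.time(),
        "agent": agent,
        "action": action,
        "payload_sha256": hashlib.sha256(
            json.dumps(payload, sort_keys=True).encode()
        ).hexdigest(),
        "payload": payload,
    }
    with open(log_path, "a") as f:
        f.write(json.dumps(record) + "\n")

audit("agent_audit.jsonl", "test-gen-agent", "generated_tests",
      {"story": "EVSE-1234", "tests_created": 12, "coverage_claimed": 0.74})

None of this settles liability. But it turns “nobody was watching” into “here is exactly what happened”, which is where every liability conversation starts.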

Those are governance questions, and they’ll get more uncomfortable in the coming months. Forrester names culture, not tools, as the biggest barrier to shift-left: developers see testing as a QA matter, QA fears that shift-left eliminates their role. That very conflict escalates when AI gets added in — because now it’s even less clear who is actually responsible when things go wrong.

My impression: the test managers who’ll be most valuable in the next two years are the ones who actively mediate this conflict — not the ones who avoid it. If you’re looking for a practical entry point on how to introduce this in a team, the 90-Day Plan walks through the workflow I used myself.

How I Deal With It

I’m not here to deliver a finished answer. But three things have proven workable for me:

First: Don’t tie your identity to your task list. If you say “I’m a tester because I write tests”, you’ll get nervous when the typing falls away. If you say “I’m a tester because I make sure software does what people need”, the compass holds, even when the tool changes. Sounds like a coaching line, but it works in practice.

Second: Stay deliberately sceptical without refusing outright. I test the tools myself — Stack Finder, prompt generator, AI code assistants. With every one I’ve done things that impressed me, and things that made me question my profession. Both matter equally. If you only collect the good experiences, you become hype-prone. If you only collect the bad ones, you miss the shift.

Third: Don’t lose the human anchor. With every tool hype I think back to that family at the broken charging station. If what I’m doing doesn’t help that family get on with their journey, it isn’t important — no matter how elegant the AI pipeline looks.

The question isn’t whether testers will still be around in two years. We will be, because someone has to carry the responsibility no model can carry. The question is as what. And that’s what we’re deciding right now — day by day, tool by tool, test by test.

If you want to dig deeper into the topics I touched on here: the Complete Guide sorts strategy, tools, and prompt engineering by role. And if you happen to be at the point of not knowing where to start — that’s where you’ll find the quick path for your role.
