
The AI Consulting Engagement That Went Wrong: What I Learned

I've been building systems that have to survive contact with reality for over thirty years. I founded adoption.com in 1995, back when most people thought the internet was a fad for academics. I've run orphanages in Ethiopia, Kenya, and Haiti. I've managed operations across seven countries where the power goes out, the supply chains collapse, and you still have to get 200 children fed by 6 pm. If there's one thing that kind of operational history teaches you, it's this: the gap between what you planned and what actually happens is where everything important occurs.

I know what failure looks like. And I'm here to tell you about one that humbled me.

What follows is a teaching case, drawn from patterns I've studied in published AI implementation research and from thirty years of my own hard lessons building systems that couldn't fail. I'm presenting it as a narrative because the pattern is important, not because I'm claiming it as my own client story.

[Image: Focused AI consulting meeting where a workflow diagram reveals hidden assumptions]

The Setup: A Confident Scope, a Confident Client

The client, which I'll call Meridian (not their real name), was a regional professional services firm, roughly 60 employees, about $8 million in annual revenue. They'd heard the AI hype, watched their competitors start experimenting, and decided it was time to move. They came to me asking for help building an "AI-powered client intake and routing system." Their pain point was real and specific: their intake coordinator was spending 60 to 70 percent of her time manually triaging incoming inquiries, assigning them to the right team member, and scheduling initial consultations. They wanted that automated.

That's a clean problem. I've worked with messier ones. I scoped the project in three phases: data audit and process mapping, system design and build, and deployment with staff training. Total timeline: fourteen weeks. Total budget: $47,000.

We shook hands. Everyone was confident.

Here's what I didn't know yet: confidence is often the most dangerous thing in the room.

Where the Assumptions Hid

[Image: Stakeholder miscommunication table comparing what leaders said with what users meant]

I assumed their intake data was structured. It wasn't.

Meridian had used three different intake systems over the previous six years: a legacy CRM that nobody fully understood, a shared Google Form that had been modified 11 times, and a phone log that lived in a physical binder on the coordinator's desk. When we started the data audit in week two, we found that approximately 40 percent of historical intake records were incomplete, duplicated, or categorized using labels that had changed meaning over time.

This matters because AI systems don't tolerate ambiguous data gracefully. Gartner has predicted that, through 2026, organizations will abandon 60 percent of AI projects that lack AI-ready data [1]. I knew that statistic. I've cited it in presentations. What I didn't do was treat it as a concrete threat to this specific project from day one.
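
A first-pass audit for the three problems described above (incomplete, duplicated, and stale-label records) can be sketched in a few lines of Python. Everything in this sketch is illustrative: the field names, the sample rows, and the set of labels treated as current are assumptions, not Meridian's actual data or tooling.

```python
from collections import Counter

# Hypothetical intake records merged from the three sources.
records = [
    {"name": "A. Lee", "email": "lee@example.com", "category": "tax"},
    {"name": "A. Lee", "email": "lee@example.com", "category": "tax"},       # duplicate
    {"name": "B. Cho", "email": "", "category": "audit"},                     # incomplete
    {"name": "C. Diaz", "email": "diaz@example.com", "category": "general"},  # retired label
    {"name": "D. Roy", "email": "roy@example.com", "category": "advisory"},
]

# Assumed: the labels still in use today, and the fields every record needs.
CURRENT_LABELS = {"tax", "audit", "advisory"}
REQUIRED_FIELDS = ("name", "email", "category")


def audit(rows):
    """Count records that are incomplete, duplicated, or carry a stale label."""
    issues = Counter()
    flagged = 0
    seen = set()
    for row in rows:
        problems = []
        if any(not row.get(field) for field in REQUIRED_FIELDS):
            problems.append("incomplete")
        key = (row.get("name", "").lower(), row.get("email", "").lower())
        if key in seen:
            problems.append("duplicate")
        seen.add(key)
        if row.get("category") and row["category"] not in CURRENT_LABELS:
            problems.append("stale_label")
        issues.update(problems)
        flagged += bool(problems)  # count each problem record once
    return {"total": len(rows), "flagged": flagged, "issues": dict(issues)}


report = audit(records)
print(report)
```

Running a check like this before scoping puts a number on data risk early, rather than leaving it to surface during the build.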

I also assumed that when the CEO said "automate the routing," he and the intake coordinator meant the same thing. They didn't.

The CEO meant: have AI make the routing decision with no human in the loop, and send the coordinator's time somewhere more valuable.

The intake coordinator meant: have AI suggest a routing decision that she reviews, approves, and can override before any client gets contacted.

These are not small differences. One produces a fully automated system with no human review. The other produces a decision-support tool with a human checkpoint. They have different technical architectures, different error tolerances, different training requirements, and they create entirely different organizational dynamics for the person whose job changes most.

I had documented the project goal as "automate intake routing." Both of them read that sentence and heard something different. I didn't discover this until week five, when I demonstrated the first prototype to both of them in the same room.

The coordinator went pale. The CEO looked pleased. The silence that followed told me everything.

What Had to Be Rebuilt

[Image: Whiteboard covered with workflow arrows, roles, and decision points during an AI rebuild]

We had built toward the CEO's vision. That meant we'd designed a system with a confidence threshold: if the AI's routing confidence score exceeded 80 percent, it would route automatically and send the notification to both the client and the assigned team member. The coordinator would get a summary log, not a decision queue.
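
In code, that first design reduces to a single branch. The sketch below is a minimal illustration with hypothetical names (`Classification`, `route_v1`, `summary_log`). What the prototype did with inquiries below the threshold is not described here, so the fallback branch is an assumption.

```python
from dataclasses import dataclass

AUTO_ROUTE_THRESHOLD = 0.80  # the 80 percent cutoff described above


@dataclass
class Classification:
    inquiry_id: str
    team_member: str   # hypothetical field: who the model thinks should take it
    confidence: float  # the model's routing confidence, from 0.0 to 1.0


def route_v1(result: Classification, summary_log: list) -> str:
    """First prototype: above the threshold, route without human review."""
    if result.confidence > AUTO_ROUTE_THRESHOLD:
        # Client and team member would be notified here. The coordinator
        # only sees the decision afterwards, in the summary log.
        summary_log.append((result.inquiry_id, result.team_member, result.confidence))
        return "auto_routed"
    # Assumption: anything at or below the threshold falls back to a person.
    return "needs_manual_triage"


log = []
print(route_v1(Classification("inq-1", "tax-team", 0.93), log))
print(route_v1(Classification("inq-2", "audit-team", 0.61), log))
```

Nothing in the auto-routed path waits for a person. The log entry is written after the decision has already been made.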

When the coordinator saw that demo, her first question was: "So I find out after the fact?"

Her second question was: "What happens when it's wrong?"

Both questions were excellent. Neither had a satisfying answer in the system we'd built.

The rebuild took three weeks and contributed to a delivery that ran four weeks past our original timeline. We redesigned the workflow to include a human review queue for all routing decisions, with a 24-hour SLA for the coordinator to approve or redirect. The AI still did the classification and scoring work, which saved the coordinator significant cognitive load. But the final send didn't happen without her eyes on it.
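
The redesigned flow can be sketched the same way. This is an illustrative outline only, assuming hypothetical names (`PendingRouting`, `review`) and an arbitrary example timestamp. The point it shows is that the model's suggestion and the coordinator's final decision are separate fields, and that nothing is final until she sets the second one.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

REVIEW_SLA = timedelta(hours=24)  # the coordinator's approve-or-redirect window


@dataclass
class PendingRouting:
    inquiry_id: str
    suggested_team: str               # the AI's suggestion
    confidence: float
    queued_at: datetime
    final_team: Optional[str] = None  # set only by the coordinator

    def overdue(self, now: datetime) -> bool:
        """True if the item is still undecided after the 24-hour SLA."""
        return self.final_team is None and now - self.queued_at > REVIEW_SLA


def review(item: PendingRouting, redirect_to: Optional[str] = None) -> str:
    """Coordinator approves the suggestion or redirects it; returns the team to notify."""
    item.final_team = redirect_to or item.suggested_team
    return item.final_team


start = datetime(2025, 1, 6, 9, 0)  # arbitrary example timestamp
item = PendingRouting("inq-1", "tax-team", 0.93, queued_at=start)
print(item.overdue(start + timedelta(hours=25)))  # undecided after 25 hours
print(review(item, redirect_to="advisory-team"))  # coordinator overrides the model
print(item.overdue(start + timedelta(hours=25)))  # decided, so no longer overdue
```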

Was that the right system? Yes, actually. It was better than what we'd originally built. But we'd wasted three weeks and significant client goodwill getting there.

Why This Is an Industry-Wide Pattern, Not Just My Mistake

I want to be clear: what happened to me at Meridian isn't unusual. It's the norm.

RAND Corporation published research in 2024 based on interviews with 65 data scientists and engineers, each with at least five years of AI and machine learning experience. They identified five root causes of AI project failure. The first one, the one that topped the list, was this: stakeholders often misunderstand or miscommunicate what problem needs to be solved using AI [2].

Not bad data. Not weak models. Not insufficient computing power. Miscommunication about the problem itself.

By some estimates, more than 80 percent of AI projects fail, twice the rate of failure for comparable IT projects that don't involve AI [2]. The MIT NANDA Lab's 2025 State of AI in Business report reviewed more than 300 publicly disclosed AI implementations and found that 95 percent of generative AI pilots failed to deliver measurable profit-and-loss impact [3]. S&P Global Market Intelligence's 2025 Voice of the Enterprise survey found that 42 percent of companies abandoned most of their AI initiatives that year, up from just 17 percent the year before [4].

McKinsey's November 2025 Global AI Survey, drawing on nearly 2,000 respondents across 105 countries, found that 88 percent of organizations now use AI in at least one business function. Only 39 percent can point to any measurable effect on the bottom line [5].

Read that again. Eighty-eight percent adoption. Thirty-nine percent impact.

That gap doesn't exist because the technology failed. It exists because the organizational and human infrastructure around the technology failed. Pertama Partners synthesized the industry research and concluded that approximately 70 percent of AI project failures are driven by organizational rather than technical causes [6].

I was a contributor to that statistic at Meridian. I'm not comfortable with that. And I didn't want to be a contributor to it again.

What My Client Eventually Got

After the rebuild, Meridian's intake system worked. The coordinator's manual triage time dropped from roughly 65 percent of her day to under 20 percent. The AI handled the classification and confidence scoring. She handled the final routing decision and client communication. The response time to new inquiries dropped from an average of 2.3 days to same-day.

The CEO got the efficiency gains he was looking for. The coordinator got a tool that respected her expertise and kept her in control of client relationships. I got a chastening reminder that "automate" means different things to different people in the same organization.

We finished four weeks late. The client paid the full contract value because the final system delivered everything the scope promised and more. But those four weeks cost me in subcontractor hours I'd already committed to other projects, and they cost the client in staff time managing the disruption.

More importantly, those four weeks should never have happened.

The Assumption Gap Is the Most Dangerous Moment in Any AI Project

Here's what I've come to believe, and what I tell every client I work with now: the most dangerous moment in any AI implementation isn't when the model is wrong. It's when everyone in the room thinks they understand each other.

That moment of shared confidence, when the CEO nods, the consultant nods, and the end user nods, is precisely when the hidden assumption has the most room to do damage. Nobody challenges a consensus.

WorkOS analyzed dozens of enterprise AI deployments and found that the organizations that actually got results were the ones that invested in extensive pre-work before touching any technology: mapping current workflows in granular detail, identifying every person whose role would change, and defining success metrics in concrete, measurable terms before writing a line of code [7].

That's the lesson I missed at Meridian. I did a process map. I didn't do a role-impact map. I didn't sit with the intake coordinator for a full day and watch her work, ask her what she would hate to lose control of, and build that answer into the architecture from the start.

I was operating on the CEO's definition of the problem. The coordinator's definition was equally valid and entirely different, and she was the person who'd actually use the system every day.

Gartner's research on why GenAI projects fail identifies "inadequate problem definition" as a leading cause, and specifically calls out the failure to include end users in the scoping process as one of the most common mistakes [8]. Gartner also predicted in July 2024 that 30 percent of generative AI projects would be abandoned after proof of concept by end of 2025, largely due to unclear business value and poor alignment between what was built and what stakeholders actually needed [8].

I believe that number. I've seen it from the inside.

The Process Change That Came Out of This

[Image: Decision Definition Document framework for clarifying AI project ownership and success]

I rebuilt my scoping process after Meridian. Every engagement I run now includes what I call a "Definition Workshop" before any technical work begins. It takes four to six hours, and it involves everyone who will be touched by the system: the person who commissioned it, the person who will use it daily, and anyone in between who has a meaningful stake in the output.

The workshop has one primary deliverable: a written "decision definition document" that answers six questions in concrete, specific terms.

One. What decision does this AI system make or support? (Not "automate routing." Instead: "The AI classifies incoming inquiries into one of seven routing categories and assigns a confidence score. The intake coordinator reviews all assignments before any client communication is sent.")

Two. Who makes the final call when the AI is uncertain or wrong? (Name the person, define the threshold, describe the process.)

Three. What does "wrong" look like, and what happens when the system gets it wrong? (Build the error scenario before you build the system.)

Four. What does the end user need to trust this system? (Not believe in it, not tolerate it: trust it. These are different things.)

Five. How will we know, 90 days after launch, whether this worked? (Specific metric, specific baseline, specific target.)

Six. What does a successful outcome mean to each stakeholder? (Ask this in writing. The answers will differ. That's the point.)
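
One way to keep those six answers from drifting back into vague language is to store them as a structured record that can be checked for blanks. The sketch below is a hypothetical illustration, not a prescribed template: the class name, field names, and sample answers are assumptions, with the sample wording drawn from the Meridian case.

```python
from dataclasses import dataclass, fields


@dataclass
class DecisionDefinition:
    decision: str               # One: what the AI decides or supports
    final_authority: str        # Two: who makes the call when the AI is uncertain or wrong
    error_scenario: str         # Three: what "wrong" looks like and what happens next
    trust_requirements: str     # Four: what the end user needs in order to trust it
    success_metric: str         # Five: metric, baseline, and target at 90 days
    stakeholder_outcomes: dict  # Six: stakeholder -> their own definition of success

    def unanswered(self) -> list:
        """Names of questions left blank; design waits until this is empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


doc = DecisionDefinition(
    decision=(
        "The AI classifies incoming inquiries into one of seven routing "
        "categories and assigns a confidence score. The intake coordinator "
        "reviews all assignments before any client communication is sent."
    ),
    final_authority="Intake coordinator",
    error_scenario="",      # left blank on purpose to show the check
    trust_requirements="",  # left blank on purpose to show the check
    success_metric="Response time to new inquiries: baseline 2.3 days, target same day",
    stakeholder_outcomes={
        "CEO": "Coordinator's time moves to more valuable work",
        "Intake coordinator": "Reviews and can override every routing suggestion",
    },
)
print(doc.unanswered())
```

A non-empty result means the workshop is not finished.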

I share this document with every stakeholder before we proceed to design. If there's a disagreement, I want it in the workshop, not in week five of a fourteen-week project.

This process adds roughly half a day to upfront engagement time. It emerged from studying how AI implementations fail in published research and from my own experience building complex operational systems where skipping definition work always cost more on the back end than it saved on the front.

What This Means If You're Thinking About an AI Engagement

I'm not a technologist who learned business. I'm a business operator who mastered technology. I've spent thirty years building systems that actually had to work: systems for placing children in homes, for tracking medical supplies in active crisis zones, for running financial operations across seven currencies and four different regulatory environments.

What I know from all of that is this: the technology is almost never the hard part. The hard part is the humans. The hard part is the gap between what someone says they want and what they actually need. The hard part is the person whose job changes without anyone asking them what they'd hate to lose.

AI makes all of this harder, not easier, because AI amplifies existing assumptions. A biased training dataset doesn't produce a biased result occasionally. It produces a biased result at scale, reliably, thousands of times a day. Amazon learned that when they scrapped a recruiting AI that had been penalizing resumes from women's colleges because it was trained on ten years of male-dominated hiring data [9]. The model was doing exactly what it was trained to do. The problem was what it was trained on.

The Optum/UnitedHealth prior authorization AI that came under class-action scrutiny in 2023 allegedly had a 90 percent error rate in certain decision categories. According to the allegations, it was deployed anyway, because it reduced costs on the back end. The people it affected most were elderly patients appealing coverage denials [10]. If those allegations hold, that's a system that worked perfectly as an optimization engine and failed catastrophically as a healthcare tool. Someone in the room knew the problem. It wasn't surfaced in the definition phase.

I'm not telling you this to frighten you away from AI. I'm telling you this because the organizations that get real results from AI are the ones that treat it as an organizational change problem first and a technology problem second. They invest in the definition work, include the people whose roles will change, and build the error scenarios before they build the system.

My engagement at Meridian produced a better outcome than we'd originally designed, precisely because the coordinator's objections forced us to confront what we'd actually built versus what we should have built. I got lucky that she spoke up in that first demo. In many organizations, she wouldn't have. She'd have nodded, said it looked fine, and then quietly worked around the system for the next two years.

That's not a successful AI implementation. That's an expensive, underused tool with a human workaround built invisibly next to it.

A Direct Word on What to Look for in Any AI Engagement

If you're evaluating an AI consulting engagement, here's what I'd ask any consultant, including me:

How do you define the problem before you design the solution? If the answer is vague, or if they skip to tools and timelines before they've described a structured definition process, that's a warning sign.

Who do you include in the scoping process? If the answer is "the decision-maker" and stops there, that's a problem. The people who will use the system daily have knowledge the decision-maker doesn't have. That knowledge belongs in the definition.

What's your process when the model is wrong? Every AI system will be wrong sometimes. If a consultant hasn't planned the error scenario before building the system, they're planning to discover it in production with your clients or your employees on the receiving end.

How will we measure success 90 days after launch? If they can't answer this with specific metrics tied to your specific context, you're buying a pilot, not a system.

I learned these questions at Meridian. They cost me four weeks and a fair amount of humility. I'd rather you learn them here.

The AI industry is full of vendors and consultants who'll tell you what AI can do. The harder question, the one that actually determines whether you'll be in the 39 percent who see real impact or the 61 percent who don't, is whether anyone is doing the unglamorous work of defining the problem precisely, including the people most affected, and building the error scenarios before the system launches.

That's the work. It's not as exciting as the demo. But it's what actually determines whether your AI investment produces anything worth measuring.

Sources

[1] Gartner, "Lack of AI-Ready Data Puts AI Projects at Risk," https://www.gartner.com/en/newsroom/press-releases/2025-02-26-lack-of-ai-ready-data-puts-ai-projects-at-risk, 2025

[2] RAND Corporation, "The Root Causes of Failure for Artificial Intelligence Projects and How They Can Succeed: Avoiding the Anti-Patterns of AI," James Ryseff, Brandon De Bruhl, Sydne J. Newberry, https://www.rand.org/pubs/research_reports/RRA2680-1.html, 2024

[3] MIT NANDA Lab / State of AI in Business 2025 report, as reported by Fortune, "MIT report: 95% of generative AI pilots at companies are failing," https://fortune.com/2025/08/18/mit-report-95-percent-generative-ai-pilots-at-companies-failing-cfo/, 2025

[4] S&P Global Market Intelligence, "Voice of the Enterprise: AI and Machine Learning Survey," as cited in WorkOS, "Why Most Enterprise AI Projects Fail and the Patterns That Actually Work," https://workos.com/blog/why-most-enterprise-ai-projects-fail-patterns-that-work, 2025

[5] McKinsey and Company, "The State of AI: Global Survey 2025," https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai, November 2025

[6] Pertama Partners, "15 AI Project Failures and How to Avoid Them," Michael Lansdowne Hauge, https://www.pertamapartners.com/insights/ai-project-failure-case-studies, 2025

[7] WorkOS, "Why Most Enterprise AI Projects Fail and the Patterns That Actually Work," https://workos.com/blog/why-most-enterprise-ai-projects-fail-patterns-that-work, 2025

[8] Gartner, "Gartner Predicts 30% of Generative AI Projects Will Be Abandoned After Proof of Concept By End of 2025," https://www.gartner.com/en/newsroom/press-releases/2024-07-29-gartner-predicts-30-percent-of-generative-ai-projects-will-be-abandoned-after-proof-of-concept-by-end-of-2025, July 2024

[9] Reuters / Amazon recruiting AI case, as cited in Pertama Partners, "15 AI Project Failures and How to Avoid Them," https://www.pertamapartners.com/insights/ai-project-failure-case-studies, 2025

[10] Pertama Partners, "Case Study 13: Optum/UnitedHealth Prior Authorization AI (2023)," https://www.pertamapartners.com/insights/ai-project-failure-case-studies, 2025