Stop Building AI Workflows Over Broken Processes

engineering-culture ai leadership testing

The Demo That Made Me Uncomfortable

At a recent sprint demo, a team showed off a new AI-powered workflow. It works like this:

  1. Add a label to a Jira ticket
  2. An N8N workflow fires via webhook
  3. The ticket description gets fed to an LLM
  4. The LLM generates test cases
  5. Test cases get posted as a comment on a cloned QA ticket
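The whole pipeline can be sketched in a few lines. Everything here is a hypothetical stand-in (the function names, the ticket ID, the stubbed Jira and LLM calls are mine, not the team's actual N8N configuration), but it makes the key dependency visible: the LLM only ever sees the ticket description.

```python
# Minimal sketch of the demoed pipeline, with stubbed services.
# fetch_ticket, generate_test_cases, and post_comment are hypothetical
# stand-ins for the real Jira/N8N/LLM integrations.

def fetch_ticket(ticket_id):
    # Stub: in the real workflow this is a Jira API call.
    return {"id": ticket_id, "description": "User can reset password via email"}

def generate_test_cases(description):
    # Stub: in the real workflow this is an LLM call.
    # Note: output quality is bounded entirely by the description's accuracy.
    return [f"Verify: {description}",
            f"Verify failure handling for: {description}"]

def post_comment(ticket_id, body):
    # Stub: in the real workflow this posts to a cloned QA ticket.
    return {"ticket": ticket_id, "comment": body}

def on_label_added(ticket_id):
    """Webhook handler: label added -> fetch ticket -> LLM -> QA comment."""
    ticket = fetch_ticket(ticket_id)
    cases = generate_test_cases(ticket["description"])
    return post_comment(ticket_id, "\n".join(cases))

result = on_label_added("PROJ-123")
```

Notice that nothing in `on_label_added` can compensate for a wrong description; the stub makes the garbage-in, garbage-out coupling explicit.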

The demo went well. People were impressed. I sat there thinking about all the ways it was going to fail.

The Problem Underneath

Within minutes of the demo, someone asked the obvious question: "This assumes the ticket description is accurate, right?"

The answer was yes. And that's the problem.

Ticket descriptions are frequently incomplete, vague, or outright wrong. Engineers regularly build something different from what's in the ticket. PRs get merged with no summary. Stories arrive from product with poor technical requirements that are never updated.

So now we have an AI generating test cases from descriptions that don't reflect what was actually built. The QA team reviews these generated test cases, fixes them manually, and then writes the tests they were going to write anyway.

We automated the wrong thing.

Garbage In, AI Out

There's a pattern I keep seeing: a team identifies a real problem, skips past the root cause, and builds an AI workflow to treat the symptom.

The root cause here is that ticket descriptions are bad. The fix is to make them better -- write acceptance criteria, require technical context before moving to development, break stories into tasks with clear scope.

That's boring. That's a process conversation. That's a manager telling an engineer "your ticket needs a description before it moves to in progress." Nobody gets to demo that at sprint review.

An N8N workflow with an LLM in the middle? That's a demo. That gets applause. And then it quietly produces unreliable output that someone still has to verify by hand.

When AI Workflows Make Sense

I'm not anti-AI. I use LLM tools daily. The difference is where you point them.

LLMs work well when:

  • The input is reliable (code, structured data, well-defined schemas)
  • The output is verified automatically (tests that either pass or fail)
  • The failure mode is cheap (a suggestion you ignore, not a test suite you trust)
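The third bullet, the cheap failure mode, is the one worth sketching. The pattern is to gate every LLM suggestion behind an automatic check, so a bad suggestion costs a rejection rather than a broken artifact someone trusts. The verifier below is a deliberately trivial stand-in (it only checks that candidate Python parses); a real setup would run the actual test suite.

```python
def passes_check(candidate_code):
    # Stub verifier: in practice this would run the real test suite.
    # Here we only check that the candidate parses as Python.
    try:
        compile(candidate_code, "<candidate>", "exec")
        return True
    except SyntaxError:
        return False

def accept_suggestion(candidate_code):
    """Gate an LLM suggestion behind an automatic check.

    Rejection is the cheap failure mode: a discarded suggestion,
    not a test suite you have to trust."""
    return candidate_code if passes_check(candidate_code) else None
```

The point of the gate is structural: no human has to review the rejected output, which is exactly the property the Jira workflow lacks.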

LLMs work poorly when:

  • The input is garbage (vague ticket descriptions)
  • The output requires human review anyway (QA manually checking every generated test case)
  • The workflow replaces accountability (nobody has to write good tickets because the AI will figure it out)

The Real Fix Is Boring

The test case generation problem has a simple solution: write better tickets. Require acceptance criteria. Have engineers add technical context before marking a ticket ready for QA. Review the ticket description when the PR is opened, not after it's merged.

I've written before about how tests serve as institutional memory -- they encode what the system actually does, not what someone remembers it doing. Good ticket descriptions serve the same purpose. Both are boring. Both are cheap. Both outlast the engineer who wrote them. AI-generated test cases from bad tickets do none of these things.

The fix costs nothing. No N8N license, no LLM API calls, no workflow maintenance. Just a conversation about standards and someone willing to enforce them.

But that requires organizational will. It means telling people their work isn't good enough. It means a manager having an uncomfortable conversation. It means slowing down for a sprint to speed up for a quarter.

Nobody gets to demo that.

The Deeper Problem

Every AI workflow built on top of a broken process is a vote against fixing the process. It tells the organization: this dysfunction is permanent, so let's build around it.

Over time, the process gets worse because there's less pressure to fix it. The AI handles it, right? Except it doesn't -- it just moves the failure downstream where it's harder to see and harder to debug.

The engineers who could fix the root cause stop trying because they've been told, implicitly, that the AI solution is the path forward. The org loses the muscle to improve itself.

That's the real cost. Not the N8N subscription or the LLM tokens. It's the slow erosion of the instinct to fix things properly.

Build AI on Solid Ground

If you're going to build AI workflows, build them on top of processes that already work. Automate the tedious parts of a healthy system. Don't use AI to paper over a broken one.

And if someone proposes an AI workflow, ask one question first: what would we need to fix so that this workflow isn't necessary?

If the answer is "nothing, we just want to go faster" -- great, build it. If the answer is "well, the ticket descriptions would need to be better" -- fix the tickets first.

The boring fix is almost always the right one.