
Good intentions, bad outcomes
A podcast about challenges and practices you might encounter in the workplace... things that were intended well, but have outcomes that aren't so great. In most cases, the organizations aren't even aware of how bad the outcomes are.
Every episode we discuss a situation that has something wrong with it: the what, the why, and what can be done to address it.
When CI/CD Pipelines Enable Bad Decisions
What happens when your deployment pipeline consistently shows failing tests, but your team needs to keep shipping? In this episode, Gino and Wayne share a real-world example of how one organization created an "amber state", where code compiles but tests fail, and treated it as "good enough" for deployment.
Wayne and Gino explore the dangerous psychology behind this common workplace pattern: when feedback becomes untrustworthy, teams stop listening to it entirely. The result? Quality silently erodes while everyone pretends the system is working.
In this episode, we discuss:
- How the amber light anti-pattern destroys test discipline
- Why "defects vs bugs" classifications create similar problems
- The connection between failing CI/CD and "almost done" Kanban columns
- When radical solutions (like deleting your entire test suite) make sense
- How good intentions around "unblocking" teams lead to quality debt
Timestamps:
0:00 - The Amber Light Anti-Pattern Exposed
1:45 - Why 90% Failed Tests Became "Good Enough"
4:00 - The Hidden Psychology of Ignoring Quality
5:14 - Defects vs Bugs: Another Dangerous Distinction
6:41 - "Almost Done" Columns: The Kanban Version
8:21 - The Nuclear Option: Delete Your Tests
Key insight: If you don't trust your feedback system enough to act on it, remove it entirely rather than work around it.
This is a shorter episode focusing on one powerful anti-pattern that many development teams will recognize. Whether you're dealing with flaky tests, unreliable builds, or teams that have learned to ignore quality signals, this conversation offers both diagnosis and cure.
Got a workplace situation where good intentions led to bad outcomes? We'd love to hear about it for a future episode.
Contact us at feedback@goodintentionsbadoutcomes.org
0:02 Hi, welcome to Good Intentions and Bad Outcomes, a podcast about challenges and practices that you might encounter in the workplace, where good intentions have unintended and often unnoticed consequences.
0:15 I'm Gino.
0:17 And I'm Wayne.
0:19 Fantastic. Welcome, Wayne. Every episode we share a situation that we or one of our listeners have seen in the workplace and that is challenging in some way. We discuss the what and the why, and we try to come up with alternatives on the spot.
0:32 So let's get started. I brought a topic for today. I've seen this situation multiple times in the past, but I'm going to call out a specific example, where the delivery pipeline automatically picked up changes from the source code management system, built them, and ran the automated unit tests and integration tests.
0:57 So whatever tests they had ran automatically, and the pipeline then provided feedback to the developers about the state of their build.
1:07 And in this particular organization, or that team, there were actually two steps: one was compilation, and the other one was the tests, and the tests consistently failed.
1:20 So 90% of the time, if not more, the tests were red, but the compilation was green.
1:28 So instead of really going back and trying to fix this entire thing, they asked: how can we deal with that? They introduced an amber state, and an amber state meant the code compiles, but the tests don't pass.
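In code terms, the gate they built amounts to something like the following minimal sketch. The names and structure here are hypothetical, assumed for illustration rather than taken from the team's actual pipeline:

```python
from dataclasses import dataclass
from enum import Enum


class BuildState(Enum):
    RED = "does not compile"
    AMBER = "compiles, but tests fail"
    GREEN = "compiles, all tests pass"


@dataclass
class PipelineRun:
    compiled: bool
    tests_passed: bool

    def state(self) -> BuildState:
        # Classify the run: compilation failures stay red, but test
        # failures are demoted to amber instead of blocking.
        if not self.compiled:
            return BuildState.RED
        if not self.tests_passed:
            return BuildState.AMBER
        return BuildState.GREEN


def may_deploy(run: PipelineRun) -> bool:
    # The anti-pattern in one line: amber counts as "good enough",
    # so failing tests no longer block a deployment.
    return run.state() in (BuildState.AMBER, BuildState.GREEN)


# With tests red 90% of the time, almost every run still ships:
print(may_deploy(PipelineRun(compiled=True, tests_passed=False)))  # True
```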
1:45 Now, Wayne, tell me, what's the writing on the wall here?
1:52 Well, I can see where this is going. I think I see what they're trying to do, but of course, the unintended consequence is staring me right in the face: you still don't have working code. And it may look better than it is, so that would be the main problem I can see.
2:10 I would even go further than that, right? What they want is to not be blocked. The reality in their case is that they moved ahead and deployed this version anyway, and they blamed it on the fact that their tests were not stable: there were stability issues, they said, rather than real quality issues.
2:35 Now, my question, of course, is: if you have stability issues, is that not a quality issue? Whether it's the quality of your code or the quality of your tests I'll leave open; that depends on the situation.
2:47 But regardless, the feedback that they received, they didn't trust. And if you don't trust the feedback you receive, well, then there's no point in getting that feedback. Then make sure that you get feedback that you can trust.
3:04 Now what they essentially said, because that's what it came down to, right, was: let's strip all the tests from our results. Let's consider amber good enough, and let's deploy whatever actually builds to an amber state.
3:22 Now, the unintended consequence, of course, was that nobody paid attention to the failing tests anymore.
3:31 Sure.
3:32 And that's what you get. Before, the red build was staring you in the face: oh man, it's annoying that this thing is red, but OK, go deploy anyway, because we believe there's no quality issue, the code runs and does exactly what it needs to do, and it's just our tests that are failing.
3:54 If that is taken away and suddenly an amber result or an amber state is good enough, then who's going to pay attention to actually fixing the tests and making sure that your delivery pipeline generates a result that you're all confident in, right?
4:12 A result that says, in black and white if you will, that this is absolutely a successful build that we can deploy with full confidence.
4:29 For me, the bigger problem, more than the idea that we might not be entirely confident, is that you're actually discouraging your teams from doing something about it.
4:47 Right, right. And of course, quality is the thing that unfortunately gets sacrificed. Yeah.
5:01 And then what are you going to do about it? You end up with a poor-quality product, and you've got more problems than your system was designed to let pass through.
5:11 It reminds me of two things. One time I saw a team that made a distinction between a defect and a bug.
5:21 And one of the reasons they did that, similar to the green and amber lights, was that bugs were OK, but defects needed to be fixed.
5:32 And so when there was a problem reported and they put a bug in the backlog, it was one of those things: well, we can deploy anyway, even if there are bugs, but we want to make sure we catch the defects.
5:46 The idea was that they were trying to get software out there, but the unintended consequence was that a whole bunch of quality issues remained unattended.
5:57 Yeah, and I agree that it's the same thing. You're trying to hide real problems, and as a result, you're no longer paying attention to them, which is the unintended consequence.
6:09 Now, I'm trying to figure out what the difference is between a defect and a bug, so I'm really curious which characteristics decided whether something was a bug or a defect. If you remember, you definitely have to explain that to me.
6:27 Yeah, yeah. There wasn't a whole lot of difference that I could see anyway. I think it was mostly semantic, but it reminds me of a second situation, and this one isn't actually part of the build pipeline.
6:41 But the process overall, I've seen lots of teams do this before where they have a board, let's say a scrumboard or a kanban board, and they'll have a done column.
6:51 And for whatever reason, they can't get their pieces of work all the way to the done column. So they might create an almost done column, or maybe they have a multi-layered definition of done: done level 1, done level 2, done level 3.
7:07 And basically all they're doing is the same thing as adding an amber light in there. We got this far, but we couldn't get the rest of the way. The problem is, of course, that you need to get the rest of the way and you're ignoring that need.
7:17 Yeah, exactly.
7:19 So in this particular case, how should we address this? What should we do differently? Do you have any suggestions?
7:28 I definitely do, Wayne. The first thing that pops to mind, I'll throw one out there first: if you've got tests that are failing and they're giving you an inaccurate view of the state of your code, then of course you want to go back and fix the tests. Maybe the testing practice itself has become so convoluted that you've got a brittle test suite that's hard to maintain.
7:59 It may need some work, but you basically want to fix the system that's alerting you to problems so that it really does alert you to problems.
8:08 Yeah, I would even go further. I'm a bit more confrontational when it comes to that, and I've actually advised a client to do exactly this: get rid of your tests.
8:21 Nobody is looking at them. Nobody is willing to invest to improve them. Get rid of them. Don't give yourself the false impression that you have tested code because you do not, right?
8:34 So get rid of that false impression. Suck it up and deal with the fact that you don't have any tests; don't run them. But make sure that you start building up a test suite that is stable and that you actually listen to.
8:49 If you don't listen to the results of your tests, get rid of them. It's as simple as that, right? That would be my advice. And then once you have green, keep it green.
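As a sketch, that "keep it green" advice collapses the deploy decision back to a strict binary gate. The names below are illustrative assumptions, not details from the client discussed in the episode:

```python
def may_deploy(compiled: bool, failing_tests: int) -> bool:
    # Strict gate: anything short of "compiles and every remaining
    # test passes" blocks the deploy. No amber, no exceptions.
    return compiled and failing_tests == 0


# Day one after deleting the untrusted suite: zero tests, honestly green.
print(may_deploy(compiled=True, failing_tests=0))   # True
# The first trusted test you add back is a real, blocking signal.
print(may_deploy(compiled=True, failing_tests=1))   # False
```

The point of the sketch is the asymmetry: an empty suite makes no claim it cannot back up, whereas an amber state claims coverage that nobody trusts.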
9:03 It almost puts me in mind of documentation. We like to build up documentation because we think it tells us useful information, but if it's not maintained and it's not accurate, then bad documentation is even worse than no documentation.
9:15 So you might as well just throw it away because it's useless. In fact, it's misleading.
9:19 That's right. Anyway, that's the advice I would give specifically for this, unless you want to invest in making your tests green, but clearly, if that's what you wanted to do, you would have done it before.
9:34 Right.
9:35 Anyway, let's make this a short one. Why not? We don't always need to talk for 15 minutes.
9:43 So that's it for today. I hope you enjoyed our conversation, however short it was. If you have a situation at work, in the past or right now, that you feel has unintended bad outcomes, please let us know.
9:54 We might discuss it in one of our next sessions. So thank you again, Wayne, and I'm looking forward to talking to you again.
10:02 Thanks, Gino.
10:03 Bye.