Use Case #16: Deep Bug Investigation
How Claude traced a bug through multiple layers without giving up.
William Welsh
Author
"It works most of the time."
The worst kind of bug.
The Symptom
User clicks "Generate Report." Usually works. Sometimes returns empty. No error. No pattern I could identify.
The Investigation
I asked Claude to investigate systematically. What followed was 43 minutes of methodical debugging.
Minutes 1-5: Reproduce - Claude tried the feature 10 times. Failed 3 times. Confirmed: intermittent, roughly 30% failure rate.
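A minimal sketch of what a reproduction harness like this might look like, in Python. The original test loop is not shown, so everything here is an assumption: flaky_generate_report is a hypothetical stand-in that returns an empty string about 30% of the time, and measure_failure_rate is an illustrative helper. The one detail taken from the symptom is that an empty result counts as a failure, because no error is ever raised.

```python
import random
from typing import Callable


def measure_failure_rate(action: Callable[[], str], runs: int) -> float:
    """Run `action` `runs` times; an empty result counts as a failure."""
    failures = sum(1 for _ in range(runs) if action() == "")
    return failures / runs


def flaky_generate_report(rng: random.Random) -> str:
    # Hypothetical stand-in for the real feature: silently returns an
    # empty report roughly 30% of the time, with no exception raised.
    return "" if rng.random() < 0.3 else "report body"


rng = random.Random(0)  # seeded so the sketch is repeatable
rate = measure_failure_rate(lambda: flaky_generate_report(rng), runs=1000)
print(f"observed failure rate: {rate:.1%}")
```

Ten runs, as in the investigation, are enough to confirm the bug is intermittent. A larger run count gives a steadier estimate of the rate.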
Minutes 6-15: Trace the Flow - Report generation touched 12 files across the UI component, API route, service layer, database queries, PDF generator, and email sender. Claude mapped the entire flow.
Minutes 16-25: Instrument - Added logging at each step. Ran 10 more tests. Found: failures happened when step 3 (data fetch) and step 4 (template load) completed in the opposite order from the passing runs.
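The instrumentation step could be sketched roughly as follows, using Python's asyncio. This is not the logging that was actually added. The step names, the delays, and the traced and run_once helpers are all assumptions chosen to show the idea: record the order in which concurrent steps finish, then compare that order between passing and failing runs.

```python
import asyncio


async def traced(name: str, delay: float, order: list[str]) -> None:
    """Simulate one async step and record when it completes."""
    await asyncio.sleep(delay)  # stands in for real I/O
    order.append(name)


async def run_once(data_delay: float, template_delay: float) -> list[str]:
    """Run both steps concurrently and return their completion order."""
    order: list[str] = []
    await asyncio.gather(
        traced("data_fetch", data_delay, order),
        traced("template_load", template_delay, order),
    )
    return order


# Typical run: the template finishes first.
print(asyncio.run(run_once(data_delay=0.05, template_delay=0.01)))
# Run under load: the data finishes first.
print(asyncio.run(run_once(data_delay=0.01, template_delay=0.05)))
```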
Minutes 26-35: Root Cause - The code assumed the template loaded before the data. Both were async. Usually the template was faster. But under load, the data sometimes won. When the data arrived first, the code tried to render with an undefined template. Silent failure: an empty report and no error.
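A small sketch of how a race like this can produce an empty result without any error. The real ContentEngine code is not shown, so the ReportJob class, its method names, the template string, and the way the missing template is swallowed are all hypothetical. The shape of the bug is the point: rendering is triggered by the arrival of the data, not by both inputs being ready.

```python
import asyncio
from typing import Optional


class ReportJob:
    def __init__(self) -> None:
        self.template: Optional[str] = None  # unset until load finishes
        self.output = ""

    async def load_template(self, delay: float) -> None:
        await asyncio.sleep(delay)
        self.template = "Report: {count} rows"

    async def fetch_data_then_render(self, delay: float) -> None:
        await asyncio.sleep(delay)
        rows = [1, 2, 3]
        # Bug: renders as soon as the data arrives and assumes the
        # template is already loaded. A missing template yields an
        # empty report instead of an error (assumed behavior).
        self.output = (
            self.template.format(count=len(rows)) if self.template else ""
        )


async def generate(template_delay: float, data_delay: float) -> str:
    job = ReportJob()
    await asyncio.gather(
        job.load_template(template_delay),
        job.fetch_data_then_render(data_delay),
    )
    return job.output


print(repr(asyncio.run(generate(template_delay=0.01, data_delay=0.05))))
print(repr(asyncio.run(generate(template_delay=0.05, data_delay=0.01))))
```

The first call models the usual case and produces a report. The second models the run under load, where the data wins the race and the output is empty.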
Minutes 36-40: Fix - Added proper await ordering. Both async calls now complete before rendering starts.
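One common way to express a fix like this in Python is asyncio.gather, which returns only after every awaitable passed to it has completed. The sketch below assumes that approach. The actual change is not shown in this post, and load_template, fetch_data, and generate_report are hypothetical names.

```python
import asyncio


async def load_template(delay: float) -> str:
    await asyncio.sleep(delay)  # stands in for reading the template
    return "Report: {count} rows"


async def fetch_data(delay: float) -> list[int]:
    await asyncio.sleep(delay)  # stands in for the database query
    return [1, 2, 3]


async def generate_report(template_delay: float, data_delay: float) -> str:
    # Both calls still run concurrently, but rendering starts only
    # after both results are available, whichever finishes first.
    template, rows = await asyncio.gather(
        load_template(template_delay),
        fetch_data(data_delay),
    )
    return template.format(count=len(rows))


print(asyncio.run(generate_report(template_delay=0.05, data_delay=0.01)))
```

The output no longer depends on completion order, and the two calls are not forced to run one after the other.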
Minutes 41-43: Verify - Ran 20 tests. Zero failures.
What Made This Work
No giving up. Claude didn't stop at "it's intermittent." It kept digging.
Systematic approach. Map the flow, instrument it, find the pattern, fix the root cause.
Statistical validation. Not "it seems to work" but "20/20 successful after fix."
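A quick calculation shows why 20 clean runs is meaningful. It assumes the pre-fix failure rate really was about 30% (an estimate from only 10 runs) and that runs are independent. chance_of_clean_run is an illustrative helper name.

```python
def chance_of_clean_run(failure_rate: float, runs: int) -> float:
    """Probability that `runs` independent attempts all succeed."""
    return (1 - failure_rate) ** runs


# If the bug were still present at a 30% failure rate, how likely
# would 20 consecutive passes be by luck alone?
p = chance_of_clean_run(failure_rate=0.3, runs=20)
print(f"{p:.2%}")  # about 0.08%
```

Under those assumptions, 20 straight passes with the bug still present would happen less than once in a thousand tries.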
The Lesson
Intermittent bugs have deterministic causes. The randomness is in the environment (timing, load, order), not the bug itself. Find what varies.
This was in the ContentEngine report generator, January 2026.
William Welsh
Building AI-powered systems and sharing what I learn along the way. Founder at Tech Integration Labs.