I Finally Shipped Something Real — And AI Helped Me Build It Right

A going series on AI here at cantechit.com

A year ago, if you’d told me I’d be sitting here announcing my first open-source GitHub project — actual firmware, C code, a real license, a public repo with a proper README — I’d have looked at you like you had two heads. Not because I doubted you were well-intentioned, but because the thing you were describing was simply not in my universe of possible actions. That was a world for “real developers.”

A year ago now feels like five years ago.
That’s not hyperbole, and I don’t think it’s just me. The pace at which the tools have changed what a technically-minded IT pro can build — and ship — is genuinely hard to overstate. I want to tell you the real story of how I got here, what I learned, and what I think it means. Not the highlight reel. The whole thing.

The Project: LilySPOT

Let me start with the what, because it matters.
LilySPOT is now public. It’s open-source firmware — written in C, built on Espressif’s ESP-IDF framework — that turns a LilyGO T-A7670G board into a reliable, battery-powered mobile WiFi hotspot. The board is an ESP32-WROVER with an onboard A7670G LTE CAT1 modem, and with LilySPOT flashed onto it you get WiFi SoftAP bridged over 4G, a full web admin UI, OTA updates with automatic rollback, WireGuard VPN (compiled in, off until you want it), optional GPS with a stratum-1 NTP service, data usage tracking, battery and solar monitoring, band locking with an anti-lockout state machine, and more.

It’s v0.9.0. It runs on real hardware. The VPN has been verified end-to-end on that hardware. I’m proud of it.
I did not write ANY of the code. My agents did. But I built this, and I’ll explain what I mean by that.

How I Actually Built It: The Multi-Agent Workflow

n earlier posts in this series I wrote about experimenting with multi-agent AI workflows. LilySPOT is where I stopped experimenting and started productionizing.

The architecture I landed on has four distinct roles. A Manager (Claude Opus) is the general contractor — it holds the big picture, sequences work, briefs me, and gates merges. A Planner (also Opus) takes a GitHub Issue and produces a concrete implementation plan: module design, memory budgets, API surfaces, risk flags. A Coder (Claude Sonnet) implements exactly the approved plan and opens a pull request. A Tester (Sonnet) verifies the PR against the acceptance criteria independently, without touching the code — just reports pass or fail with evidence.

Each of those agents works in an isolated git worktree on its own branch. They literally cannot step on each other. And each runs at the model appropriate for the job — Opus for the thinking work, Sonnet for the execution work. That model pinning matters: you don’t want to pay Opus rates for a mechanical build-matrix check, and you don’t want Haiku reasoning through an architecture trade-off.

The whole thing is governed by a CLAUDE.md file at the root of the project. This is the single most important infrastructure investment I made.

CLAUDE.md Is Your Law Book — Don’t Skip It

Here’s what I learned that I wish I’d understood at the start: every agent reads CLAUDE.md when it spins up. It’s not a suggestion file. It’s the constitution. If your rules live in your head, they stay in your head. If they live in CLAUDE.md, every agent instance — every new conversation, every new worktree, every fresh spin-up — starts from the same foundation.

For LilySPOT, I codified nine Design Laws. Things like: every feature must be toggleable and off by default; flash is the hard ceiling (and on a 4MB device with dual OTA slots, that ceiling is real); the web admin binds to the LAN interface only, never the carrier side; no state may be created that the user can’t escape from. That last one came from a real incident — an agent produced a band-lock implementation that could have permanently wedged the device with no recovery path. Because “no lockouts” was in the law book, the fix was a matter of pointing at the rule and requiring it be honoured.

The rules in CLAUDE.md don’t just prevent bad outcomes. They create trust. When I know the law book is there and current, I can give an agent more latitude. When it’s missing or vague, I’m micromanaging every decision. The leverage is enormous.

Agents Drift. Supervision Is the Job.

I want to be direct about something, because I think the AI discourse sometimes glosses over it: agents drift. Not every session, not maliciously — but the three-role model (plan, code, test) only holds if someone is watching the seams.

More than once I’d hand off to what was supposed to be the Coder, and it would quietly start doing planning. Or the Tester would identify a problem and just… fix it, crossing the lane it was supposed to stay in. Or an agent would helpfully add a feature that wasn’t in the Issue, reasoning that it was “obvious” and “small.” Every one of those drift events had to be caught and corrected.

Sometimes, it actually added in something useful, or found something I didn’t think of – and that’s good, but just like an over eager junior developer, sometimes it didn’t think, or thought too much.

This is not a complaint about the tools. It’s a description of the actual job. The human-in-the-loop role isn’t ceremonial — it’s the role that notices when the workflow is no longer following the workflow. Supervision is the leverage point. Get comfortable with it.

The agents are genuinely capable. But they are not self-governing. The system only maintains its integrity because a human is reading the outputs, checking the seams, and redirecting when it drifts. That human was me, and it took real attention.

The GitHub Learning Curve Was Real

I’m going to say something that I suspect a lot of people in my position feel but don’t say out loud: I had to get genuinely comfortable with GitHub.

Not just “I use GitHub to push code” comfortable. I mean pull requests, branch protection, Issues as the single source of truth, conventional commit formats, worktrees, merge strategies, PR review gates, release tagging — the whole ecosystem. For this project, I built a rule that every piece of work must trace back to an Issue. No Issue, no work. That discipline forced me to understand what GitHub Issues actually are and how to use them well.
The question I kept coming back to was: how do I properly build and release a project on GitHub? Not how do I use git. How do you build something that looks, to the outside world, like a professional project? How do releases work? What goes in a README? What does a tag mean? When do you call something v0.9.0 versus v1.0.0?

I don’t have a CS degree. I came up through IT. These things were not handed to me — I had to go get them. The agents helped, honestly. When I had to think carefully about branching strategy or release gates because I was writing it into a governing document, it forced me to actually understand it rather than just muddle through.

The State of the Nation: We Are in the Middle of Something

Let me step back and say something about the broader moment, because I think it deserves a real argument and not just cheerleading.

Agentic AI building is not a gimmick. What is happening right now is that a meaningful class of people — technically literate, with domain knowledge and good judgment, but without formal software engineering training — are crossing a threshold into genuine production. Not “I made a script.” Not “I built a thing for myself.” Shipped, licensed, public projects.

The precondition for that wasn’t just capable AI. It was capable AI plus good workflow design. The agents are tools. The multi-agent architecture, the governing document, the GitHub discipline, the human supervision — those are the scaffolding that turns capable-but-untamed AI into a reliable construction process. Without the scaffolding, you get drift, hallucinated features, unchecked technical debt, and projects that look done but aren’t.

The people who figure out the scaffolding are the ones who will be building real things. The people who treat AI as a magic answer machine will hit walls, and they’ll hit them repeatedly. That’s the honest state of the nation.

I’ve talked to people who tried agentic building once, had an agent do something confusing, and concluded the whole thing was overhyped. That’s like concluding power tools are overhyped because you picked up a circular saw without knowing how to use it. The tool isn’t the ceiling. The workflow is.

I Still Might Make a Fool of Myself

Here’s the self-aware part: I’m announcing a public project on GitHub knowing full well that real embedded systems engineers are going to look at it. People who’ve written bare-metal C for longer than I’ve been in IT. People who will have opinions about my UART handling, my PSRAM allocation strategy, my lwIP configuration.

I might have gotten things wrong. Almost certainly have, somewhere. The throughput on this device tops out around 512 kbps because the cellular data runs over PPP on a serial UART — I say that right in the README. That’s not a bug, it’s physics, but someone is going to read it and have feelings.

I went into this knowing I might make a fool of myself. I’ve decided that’s fine. The alternative — never shipping, always polishing privately, always waiting until I’m “ready” — is worse. LilySPOT is v0.9.0 for a reason. It’s honest about what it is and what it isn’t. I think that’s more respectable than pretending. I have more ideas for this thing, but let’s get it out there.

Sharing It with the Org — Build Your Vibe

The last thing I want to say is about a course I recently taught at work, called Build Your Vibe.

I’d been on this journey for a while, building privately, and I finally felt like I understood enough about this new way of working to share it — what the agentic workflow actually looks like in practice, how to govern it, what the genuine leverage points are.

It felt great to bring that into the organization. Not as a pitch for any particular tool, but as an honest account of what it looks like when someone who isn’t a “developer” decides to build something real and figures out how to actually do it. The reaction was good. People are hungry for it.
I think that’s the version of this conversation that matters most: not “AI will replace developers” (it won’t, and the people who think it will have never tried to supervise an agent through a hard architectural decision), but “here is a new set of skills that expands what you can build.” Those skills are learnable. I learned them. So can you.

cantechit – Technology and IT BLOG

RAW Opinion of an IT Professional – and a bit of Rally Racing.

I Finally Shipped Something Real — And AI Helped Me Build It Right

The Project: LilySPOT

How I Actually Built It: The Multi-Agent Workflow

CLAUDE.md Is Your Law Book — Don’t Skip It

Agents Drift. Supervision Is the Job.

The GitHub Learning Curve Was Real

The State of the Nation: We Are in the Middle of Something

I Still Might Make a Fool of Myself

Sharing It with the Org — Build Your Vibe

Leave a comment Cancel reply

The Project: LilySPOT

How I Actually Built It: The Multi-Agent Workflow

CLAUDE.md Is Your Law Book — Don’t Skip It

Agents Drift. Supervision Is the Job.

The GitHub Learning Curve Was Real

The State of the Nation: We Are in the Middle of Something

I Still Might Make a Fool of Myself

Sharing It with the Org — Build Your Vibe

Share this:

Related

Leave a comment Cancel reply