Building in Public · Gaurav Vijaywargia · March 13, 2026 · 6 min read

The Inside Story of AiGov: Reflections on Building an AI Governance Platform

Enough theory. This is the practice — what it actually looks like to design, build, and ship an AI-native platform inside a large enterprise. The architecture, the bugs, the cloud surprises, and the things that slowed me down that had nothing to do with code.

The problem I was solving

Technostica had no centralized way to track AI initiatives. Teams were adopting tools independently — marketing signing up for AI copywriters, engineering wiring up code-gen APIs, finance uploading spreadsheets to analytics platforms. Nobody knew what was approved, what was redundant, or what was risky.

The existing process was an email distribution list called “AI Usage” — a dozen-plus members from legal, security, and various other teams all on one thread. Everyone was answering questions, nobody knew who owned what, the mailbox was drowning in noise, and there was zero tracking. Which initiatives did we approve? Why? Who reviewed them? The answers required head-scratching, inbox archaeology, and a lot of “I think someone replied to that back in June.”

I set out to build something different: a platform where submitting an AI initiative feels as easy as having a conversation, reviews route automatically based on risk, and the entire lifecycle is tracked in one place. That platform became AiGov.

01

Spec-driven development

Development is easy when you know exactly what you're building.

Before I wrote a single line of code, I wrote a full product spec — every screen, every workflow state, every edge case, every role and permission. I call this Spec-Driven Development, and it's the single biggest reason AiGov shipped fast.

The spec defined a three-stage workflow state machine with eleven statuses, automatic transitions, and role-gated permissions. Six roles mapped to every allowed action. Security reviews trigger automatically for cloud deployments handling sensitive data. Legal reviews trigger for new vendors. By the time I sat down to code, it was a translation exercise — not a design exercise.
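A workflow like this can be captured in a small, declarative transition map. Here is a minimal sketch in TypeScript; the status and role names are illustrative placeholders, not AiGov's actual eleven statuses or six roles:

```typescript
// Role-gated state machine: every allowed transition is data, not scattered ifs.
// Statuses and roles below are illustrative, not AiGov's real ones.
type Status = "draft" | "submitted" | "security_review" | "approved" | "rejected";
type Role = "submitter" | "security_reviewer" | "admin";

// Each entry names the statuses it connects and which roles may trigger it.
const transitions: { from: Status; to: Status; roles: Role[] }[] = [
  { from: "draft", to: "submitted", roles: ["submitter"] },
  { from: "submitted", to: "security_review", roles: ["admin"] },
  { from: "security_review", to: "approved", roles: ["security_reviewer"] },
  { from: "security_review", to: "rejected", roles: ["security_reviewer"] },
];

function canTransition(from: Status, to: Status, role: Role): boolean {
  return transitions.some(
    (t) => t.from === from && t.to === to && t.roles.includes(role)
  );
}
```

The payoff of writing this in the spec first: the table of transitions is reviewable by non-engineers before any code exists, and the implementation is a direct translation.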

The lesson: Most projects feel hard because people are simultaneously designing and building. Separate those two activities. The code writes itself after that — with far fewer wrong turns and "wait, what should this actually do?" moments.

02

Not a chatbot — an AI agent that takes real actions

The difference between AI-sprinkled and AI-native.

Most enterprise AI features are chatbots bolted onto existing workflows — you ask a question, get an answer, and still do everything yourself. AiGov's intake is different. The AI doesn't just answer questions — it searches the tech inventory, researches vendors on the web, updates records in real time, routes submissions, triggers reviews, and notifies the right people. One conversation replaces a form.

This is the tool-use agent pattern. The AI has five registered tools — search_tech_inventory, web_search, update_title, submit_initiative, and resolve_initiative. It runs in a loop (up to ten turns), deciding on its own when to search, when to ask follow-ups, and when it has enough to submit.
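The shape of that loop is simple enough to sketch. This is a minimal TypeScript illustration, not AiGov's actual code: the tool names match the post, but the implementations and the model call are stubbed placeholders.

```typescript
// Tool-use loop sketch: each turn, the model either calls a registered tool
// or returns a final reply. Tool bodies here are stubs for illustration.
type ToolCall = { tool: string; args: Record<string, unknown> };
type ModelTurn = { toolCall?: ToolCall; reply?: string };

const tools: Record<string, (args: Record<string, unknown>) => string> = {
  search_tech_inventory: () => "found: GitHub Copilot",
  web_search: () => "Cursor pricing: ...",
  update_title: () => "title updated",
  submit_initiative: () => "initiative submitted",
  resolve_initiative: () => "initiative resolved",
};

function runAgent(nextTurn: (history: string[]) => ModelTurn, maxTurns = 10): string {
  const history: string[] = [];
  for (let i = 0; i < maxTurns; i++) {
    const turn = nextTurn(history);
    if (turn.reply) return turn.reply; // the model decided it is done
    const call = turn.toolCall!;
    const impl = tools[call.tool]; // the registry IS the allowlist
    if (!impl) {
      history.push(`error: no such tool ${call.tool}`);
      continue; // an unregistered tool simply cannot run
    }
    history.push(`${call.tool}: ${impl(call.args)}`);
  }
  return "turn limit reached";
}
```

Note the security property falls out of the structure: there is no "delete initiative" entry in the registry, so even a misbehaving model turn can't execute one.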

What a real interaction looks like

User: "I want to use Cursor for AI-assisted coding."

Agent: updates title → searches inventory → searches "code assistant" → web-searches Cursor pricing

Agent: "I found GitHub Copilot already in your org's inventory. Cursor is separate — here's the pricing. Want me to connect you to the Copilot owner, or submit Cursor as a new initiative?"

Four tool calls. Three database queries. One web search. One recommendation. Under ten seconds. Why tool use over RAG? RAG is read-only Q&A. AiGov needed the AI to do things: write to the database, trigger workflows, send notifications. And the tool registry doubles as an allowlist — with no "delete initiative" tool registered, the agent can't even try.

03

Production bugs that humbled me

Development is easy. Production is humbling.

Chatbot submissions skipped security reviews

The review trigger logic was only wired into the manual form path. Initiatives with PII were sailing past security via the chatbot. Lesson: two entry points to the same workflow means two sets of integration tests.
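The fix is structural: both entry points must call one shared trigger function. A minimal sketch, with illustrative field names (hasPII, deployment, newVendor are my stand-ins, not AiGov's schema):

```typescript
// One shared review-trigger function that every entry point must call.
// Field names are illustrative placeholders.
type Initiative = {
  hasPII: boolean;
  deployment: "cloud" | "on_prem";
  newVendor: boolean;
};

function requiredReviews(init: Initiative): string[] {
  const reviews: string[] = [];
  // Security review for cloud deployments handling sensitive data
  if (init.deployment === "cloud" && init.hasPII) reviews.push("security");
  // Legal review for new vendors
  if (init.newVendor) reviews.push("legal");
  return reviews;
}

// The form handler and the chatbot's submit tool both delegate here,
// so neither path can silently skip a review.
function submitFromForm(init: Initiative): string[] {
  return requiredReviews(init);
}
function submitFromChatbot(init: Initiative): string[] {
  return requiredReviews(init);
}
```

An integration test asserting that both paths produce identical review lists would have caught this bug before a user did.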

Rate limiting counted per-tenant, not per-user

One chatty user in a 200-person department would lock out everyone else. Two-line fix — but only caught by a real user report. Lesson: test rate limits with multiple mock users, not just one.
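The fix really is about two lines: include the user in the limiter key. A naive fixed-window sketch for illustration (the real limiter would also reset counts per window):

```typescript
// Fixed-window counter sketch. The one-line difference between the bug and
// the fix is whether the key includes the user ID.
const counts = new Map<string, number>();
const LIMIT = 5;

function allow(tenantId: string, userId: string): boolean {
  const key = `${tenantId}:${userId}`; // buggy version keyed on tenantId alone
  const n = (counts.get(key) ?? 0) + 1;
  counts.set(key, n);
  return n <= LIMIT;
}
```

With the tenant-only key, five requests from one user exhaust the budget for all 200 colleagues; with the composite key, each user gets their own window.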

SSE streams dying behind Cloud Run

Streams dropped silently after 30 seconds of no text output during long tool executions. Fix: SSE keepalive pings every 15 seconds. Lesson: test streaming behind real infrastructure, not just localhost.
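A sketch of the keepalive fix, assuming a Node-style SSE handler (the sink interface and handler shape are illustrative, not AiGov's route code). SSE comment lines, which start with a colon, are ignored by EventSource clients but count as traffic for proxies in between:

```typescript
// Minimal SSE sink abstraction so the keepalive logic is testable.
interface SseSink {
  writeHead(status: number, headers: Record<string, string>): void;
  write(chunk: string): void;
}

function startSse(sink: SseSink, intervalMs = 15_000): { ping: () => void; stop: () => void } {
  sink.writeHead(200, {
    "Content-Type": "text/event-stream",
    "Cache-Control": "no-cache",
  });
  // ":"-prefixed lines are SSE comments: invisible to the client,
  // but they keep the connection alive through Cloud Run's proxy.
  const ping = () => sink.write(": keepalive\n\n");
  const timer = setInterval(ping, intervalMs);
  return { ping, stop: () => clearInterval(timer) };
}
```

The 15-second interval sits comfortably under the 30-second silence window where the drops occurred.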

The pattern: All three bugs had the same shape — I tested features in isolation but not the integration points. The spec told me what should happen. Tests should verify that it happens through every path.

04

What actually slowed me down (hint: not the code)

The bottleneck is almost never the technology.

Blocker: wait time
Entra ID app registration + SSO setup: 3 days
Custom URL (portal.aigov.technostica.net): 5 days
Cloud SQL provisioning + VPC networking: 2 days
Cloud Build IAM + Secret Manager access: 3 days
DNS propagation + SSL certificate: 1 day
Total non-coding wait time: ~14 days

Fourteen days. Not debugging. Not designing. Waiting for tickets, permissions, and infrastructure. Then the cloud surprises: default service accounts missing token creator permissions, Cloud Run in a different VPC than the database, the pg_trgm extension not enabled by default. None of these was technically hard; all were the kind of problem you don't know about until you hit it.

The lesson: Add two weeks for infrastructure and access requests. File them on day one, before you write any code. Parallelize the bureaucracy with the development.

05

The mid-flight Supabase rip-out

When your prototype stack doesn't survive enterprise requirements.

I built the prototype on Supabase — auth, database, storage, all in one. Then enterprise requirements landed: Microsoft SSO is mandatory, data must live in our GCP project, storage goes through approved GCS buckets. So I ripped it all out. Mid-flight. While the product was already being used.

Before (Supabase) → After (enterprise stack)
Supabase Auth + Google OAuth → Auth.js v5 + Microsoft Entra ID
@supabase/supabase-js → Drizzle ORM + Cloud SQL
Supabase Storage → Google Cloud Storage
Supabase Realtime → Polling-based notifications

The spec saved me again. Because I'd defined the system in terms of capabilities — not implementations — the migration was surgical. Swap the adapter, keep the behavior. Start with the boring enterprise stack from day one if you can.
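"Swap the adapter, keep the behavior" is easiest to see in code. A minimal sketch of the capability seam, with illustrative names rather than AiGov's actual module layout:

```typescript
// Callers depend on a capability, not a vendor. Swapping Supabase Storage
// for GCS then touches only the adapter, never the application code.
interface FileStorage {
  put(path: string, data: string): void;
  get(path: string): string | undefined;
}

// Stand-in adapter for testing; real adapters would wrap the Supabase or
// GCS client behind this same interface.
class InMemoryStorage implements FileStorage {
  private files = new Map<string, string>();
  put(path: string, data: string): void {
    this.files.set(path, data);
  }
  get(path: string): string | undefined {
    return this.files.get(path);
  }
}

// Application code never names the vendor.
function saveAttachment(storage: FileStorage, id: string, body: string): void {
  storage.put(`initiatives/${id}.txt`, body);
}
```

Because the spec described "store an attachment" rather than "call Supabase Storage," the migration was a one-adapter change per capability.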

Ship the thing

AiGov is proof that the ideas work in practice: agentic AI that takes real actions, governance that's fast enough to compete with a credit card, and a workflow engine that routes reviews automatically based on risk. The hardest parts weren't the code — they were the Entra ID ticket, the DNS request, and the production bugs that only appear behind a load balancer.

Write the spec. File the infra tickets on day zero. Deploy to the real environment immediately. Test every path. And don't just talk about AI-native products — build one.

See AiGov in action

The platform I built to govern AI at work — from conversational intake to enterprise deployment tracking.