# Shipping doany.ai: A 90-Day Engineering Launch Recap

*By the doany.ai Engineering Team — April 2026*

## The Starting Line

Three months ago, doany.ai was a Figma file and a SQLite prototype. Today it serves 14,000 active users running AI-powered skill workflows across design, code review, and content generation. This is the story of how we got from zero to launch — the architecture bets that paid off, the ones that didn't, and what we learned shipping under pressure.

## Architecture Decisions That Shaped Everything

### Betting on Edge-First

We chose Cloudflare Workers over traditional server infrastructure from day one. The latency win was immediate — p95 response times dropped from 320ms to 47ms for our skill metadata API. But Workers' 128MB memory ceiling forced us to rethink how we handle large language model context windows.

Our solution: a two-tier architecture. Workers handle routing, auth, and lightweight JSON transforms. Heavy inference calls fan out to GPU-backed containers on Fly.io, streamed back through Workers via Server-Sent Events. The cold-start penalty on Fly was brutal at first (8-12 seconds), but pre-warming pools brought it under 800ms.
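The Worker side of this relay is JavaScript in production, but the framing logic is the same in any language. Here is a minimal, hypothetical Python sketch of the streaming leg: formatting inference chunks as `text/event-stream` frames and relaying them to the client. The `relay`/`sse_frame` names and the `token`/`done` event types are illustrative assumptions, not our actual API.

```python
from typing import Iterable, Iterator, Optional

def sse_frame(data: str, event: Optional[str] = None) -> str:
    """Format one Server-Sent Events frame (text/event-stream)."""
    lines = []
    if event:
        lines.append(f"event: {event}")
    # Multi-line payloads need one 'data:' field per line.
    lines.extend(f"data: {line}" for line in data.splitlines() or [""])
    return "\n".join(lines) + "\n\n"

def relay(chunks: Iterable[str]) -> Iterator[str]:
    """Relay inference chunks from the GPU backend as SSE frames."""
    for chunk in chunks:
        yield sse_frame(chunk, event="token")
    yield sse_frame("[DONE]", event="done")
```

Because each frame is flushed as soon as the backend produces it, the client sees tokens immediately even though the heavy inference runs on a different machine.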

### The Pydantic Wall

Every data boundary in our pipeline speaks Pydantic v2. Skill definitions, evaluation plans, benchmark results, site models — all strictly typed. Early on, this felt like over-engineering. By week six, when we refactored the entire evaluation pipeline, the type checker caught 23 breaking changes before a single test ran. The "Pydantic wall" became our most trusted safety net.
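To make the idea concrete, here is a hypothetical skill-definition model in the style we use at every boundary. The field names (`name`, `version`, `steps`) are illustrative, not our real schema; the point is that malformed data fails loudly at the edge instead of deep inside the pipeline.

```python
from pydantic import BaseModel, Field, ValidationError

class SkillDefinition(BaseModel):
    """Hypothetical shape of a skill bundle's manifest."""
    name: str = Field(min_length=1)
    version: str
    steps: list[str] = Field(min_length=1)  # pipeline steps, in order

# A well-formed definition validates cleanly...
skill = SkillDefinition.model_validate(
    {"name": "code-review", "version": "1.2.0", "steps": ["lint", "review"]}
)

# ...and a malformed one is rejected at the boundary.
try:
    SkillDefinition.model_validate({"name": "", "version": "1.0.0", "steps": []})
except ValidationError as exc:
    print(f"caught {exc.error_count()} errors")
```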

### Content-Addressed Bundles

We hash every skill bundle with SHA-256 and store outputs content-addressed. This means:
- Deterministic cache invalidation (no TTLs, no stale data)
- Any pipeline step can be re-run without corrupting downstream artifacts
- Rollbacks are instant — just point to a previous hash

The tradeoff: storage costs rose roughly 40%, because content-addressed artifacts are immutable — every version is retained rather than overwritten. Worth it for the operational simplicity.
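The scheme can be sketched in a few lines. This is an illustrative in-memory stand-in (our production store is object storage), but it shows the two properties that matter: identical content always maps to the same key, and writes are idempotent.

```python
import hashlib
import json

class ContentStore:
    """Minimal in-memory content-addressed store (illustrative only)."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def put(self, artifact: dict) -> str:
        # Canonical JSON so the same content always hashes the same.
        blob = json.dumps(artifact, sort_keys=True).encode()
        digest = hashlib.sha256(blob).hexdigest()
        self._blobs[digest] = blob  # idempotent: same content, same key
        return digest

    def get(self, digest: str) -> dict:
        return json.loads(self._blobs[digest])

store = ContentStore()
h1 = store.put({"skill": "summarize", "version": 3})
h2 = store.put({"version": 3, "skill": "summarize"})  # key order irrelevant
assert h1 == h2  # deterministic addressing -- no TTLs needed
```

A rollback in this model really is just "point to a previous hash": the old artifact is still in the store under its old key.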

## What Broke (And How We Fixed It)

### The WebSocket Meltdown (Week 4)

Our real-time collaboration layer used WebSockets over Workers. At 200 concurrent connections, everything was fine. At 2,000, Cloudflare's connection limits turned our dashboard into a graveyard. We migrated to Durable Objects with a pub/sub fan-out pattern in 72 hours. The migration was messy — three engineers, zero sleep — but connection capacity jumped to 50,000+.

### The LLM Cost Spiral (Week 7)

We shipped without prompt caching. Our evaluation pipeline called Claude for every single benchmark run, even when the skill definition hadn't changed. Monthly API costs hit $18,000 before we noticed. The fix: content-addressed prompt caching keyed on skill bundle hash + model version. Costs dropped 73% overnight.
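The caching fix is simple once the bundles are already content-addressed. A minimal sketch, with a hypothetical `PromptCache` wrapper standing in for our pipeline code and a lambda standing in for the actual Claude API call:

```python
import hashlib
from typing import Callable

def bundle_hash(bundle: bytes) -> str:
    return hashlib.sha256(bundle).hexdigest()

class PromptCache:
    """Cache eval results keyed on (skill bundle hash, model version)."""

    def __init__(self, run_eval: Callable[[bytes, str], str]) -> None:
        self._run_eval = run_eval
        self._cache: dict[tuple[str, str], str] = {}
        self.misses = 0

    def evaluate(self, bundle: bytes, model: str) -> str:
        key = (bundle_hash(bundle), model)
        if key not in self._cache:
            self.misses += 1  # only a changed bundle or model costs money
            self._cache[key] = self._run_eval(bundle, model)
        return self._cache[key]

# Identical bundle + model never hits the API twice.
cache = PromptCache(lambda bundle, model: f"result from {model}")
cache.evaluate(b"skill-v1", "claude-3")
cache.evaluate(b"skill-v1", "claude-3")    # cache hit, no API call
cache.evaluate(b"skill-v1", "claude-3.5")  # new model version -> miss
assert cache.misses == 2
```

Keying on the model version as well as the bundle hash matters: a model upgrade must invalidate cached results, or benchmarks silently report numbers from the old model.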

### Database Migration Gone Wrong (Week 9)

A NOT NULL column added to the skills table without a default value failed against the existing rows and took down the API for 22 minutes. We now run all migrations through a shadow database first, and every migration includes a rollback script tested in CI.
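The safe version of that migration is the standard expand-backfill-contract pattern, sketched here against SQLite for brevity (the final tightening step is engine-dependent; the `category` column is a made-up example, not our real schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE skills (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO skills (name) VALUES ('code-review'), ('design')")

# Step 1: add the column as NULLable -- safe against existing rows.
conn.execute("ALTER TABLE skills ADD COLUMN category TEXT")

# Step 2: backfill in a separate migration.
conn.execute("UPDATE skills SET category = 'general' WHERE category IS NULL")

# Step 3 (engine-dependent): only now tighten the constraint, e.g. in Postgres:
#   ALTER TABLE skills ALTER COLUMN category SET NOT NULL;

remaining = conn.execute(
    "SELECT COUNT(*) FROM skills WHERE category IS NULL"
).fetchone()[0]
assert remaining == 0  # constraint can now be enforced without failing rows
```

Splitting the change into three deploys means no single step can fail against live data, and each step has an obvious rollback.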

## By the Numbers

| Metric | Launch Day | Today |
|--------|-----------|-------|
| Active users | 0 | 14,200 |
| Skills in catalog | 12 | 847 |
| API p95 latency | 320ms | 47ms |
| Pipeline steps | 5 | 11 |
| Deploy frequency | weekly | 4x daily |
| Uptime (30d) | — | 99.94% |

## The Team

Eight engineers shipped this in 90 days. We operated in two-week sprints with daily async standups over Slack. No project managers — engineers owned their pipeline steps end to end. Code review turnaround averaged 4 hours. Every PR required at least one approval and passing CI.

## What's Next

We're investing in three areas for Q2:
1. **Multi-modal skill evaluation** — benchmarking skills that produce images, audio, and video
2. **Federated skill registry** — letting teams publish private skills that integrate with the public catalog
3. **Streaming evaluation** — real-time benchmark progress instead of batch-and-wait

The first 90 days proved the architecture. The next 90 will prove the platform.

---

*doany.ai is an open platform for discovering and running AI agent skills. Follow our engineering blog for deep dives into the systems behind the platform.*
