50by50
Day 3 / 132

Day 3 in retrospect: the day the calculators actually compute

Wrote the handoff, handed the keyboard over, came back to two working engines and 3,500 words of supporting prose. The day the spoke pattern became real.

A note on this post before we start: Day 3 was a deliberate agent-execution day. I wrote a long handoff at noon — the most concrete brief I’ve prepared in this sprint — and then handed the keyboard to the coding agent. The recap below is the honest account of what shipped, in the order it shipped, with git timestamps as the only timeline I’m willing to invent dates from.

Day 1 was me writing all morning, the agent writing all afternoon, and a launch by midnight. Day 2 was a parallel-pair scaffold day with five working hours and a rebuild. Day 3 is the third pattern: a handoff day, where my work is the brief and the work-product is what the agent ships from it.

If that sounds like delegation theater, fair. I want to argue against that reading in the next 1,500 words. The argument is going to be: the quality of an agent-execution day is the quality of the handoff, and the quality of the handoff is itself work — work that nobody can do for you, and work that compounds the brief you’ll write tomorrow and the day after.


~12:00 PM — The handoff goes in

The handoff I wrote at noon ran longer than I expected: standing rules, coding standards inlined from Pinecone (eleven records, verbatim), the Tier 1 / Tier 2 / Tier 3 priority list with explicit “Friday-launch-blocking” tagging on the two big engines, the decisions already made that I didn’t want the agent to relitigate, repo-specific quirks (Vite version pin, prettier ignore, eslint import-order strictness, the subpath-export pattern), and two questions the agent would need me-the-user to answer later in the day (GA4 measurement IDs, Search Console verification path).

The reason it ran long is the reason agent-execution days only work for me: every sentence of context I leave out of the brief becomes a wrong choice the agent makes confidently. Inlining the standards matters because a “follow our coding standards” instruction is interpreted as “follow your training” if the actual standards aren’t pasted into the conversation. The decisions-already-made section matters because an agent will otherwise spend cycles second-guessing decisions you’ve already settled, in good faith, like a colleague who skipped a meeting.

The handoff also forced me to look at the day’s structure honestly. Tier 1 was two engines and a CLI. Tier 2 was 3,500 words of article prose. Tier 3 was the recap, GA4/SC wiring, and TrustCore Media polish. I would not get all of that in one day. The handoff named which would slip if the day ran short.

That’s the work of the handoff: ranking, naming the gate, and being honest about the budget.


12:25 PM — Status flip, recap queued

I committed f790024 — Day 2 status flip from “in progress” to “done” and a queue note for Day 3. Last commit of the user-typing portion of the day. Two minutes later, the agent’s first commit landed.

12:27 PM — Commit 00fcb3f: LaunchCost calc engine math + tests

Twenty-five tests passing, pure TypeScript module, no I/O, validated inputs at the boundary per the standards’ cs-05 rule. The math the calculator promises: five inputs (team size, stack, paid distribution, office mode, legal setup) become a {low, mid, high} range with a per-line breakdown that explains every dollar. The defaults come from BLS wage data, public SaaS list prices, Apple’s $99 developer fee, and three public Delaware LLC filings — all citable in the methodology page.

The thing I want to flag about this commit: the rule that the calculation module has no I/O, no DOM, no fetch is what made the next four commits cheap. The same pure module ended up powering the homepage UI (in a <script> block) and the CLI (in a Node entry point) with zero drift between the two. That’s a cs-02 pristine-code dividend, paid in commits two and three.
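To make the no-I/O rule concrete, here’s a minimal sketch of what a pure calc module of this shape looks like. Everything here is hypothetical — the real field names, line items, and dollar defaults live in the repo and the methodology page, and most inputs are omitted for brevity — but the structure (validate at the boundary, compute, return a range plus breakdown, touch nothing outside) is the point:

```typescript
// Hypothetical sketch of the pure-module shape — illustrative names and
// numbers only; the real defaults come from BLS data and list prices.
type CalcInput = {
  teamSize: 1 | 2 | 3;
  stack: "managed" | "serverless" | "diy";
  paidDistribution: "none" | "light" | "heavy";
  officeMode: "remote" | "coworking" | "leased";
  legalSetup: "none" | "llc";
};

type LineItem = { label: string; low: number; mid: number; high: number };
type CalcResult = { low: number; mid: number; high: number; breakdown: LineItem[] };

// cs-05 style: validate at the boundary, throw on bad input. No I/O,
// no DOM, no fetch anywhere in the module.
export function calculate(input: CalcInput): CalcResult {
  if (![1, 2, 3].includes(input.teamSize)) {
    throw new Error(`teamSize must be 1, 2, or 3 (got ${input.teamSize})`);
  }
  // Only three of the five input-driven line items shown here.
  const breakdown: LineItem[] = [
    { label: "founder time", low: 0, mid: 31200 * input.teamSize, high: 62400 * input.teamSize },
    { label: "hosting", low: 240, mid: 2640, high: 6000 },
    { label: "legal/accounting/domain", low: 200, mid: 1540, high: 3000 },
  ];
  const sum = (k: "low" | "mid" | "high") =>
    breakdown.reduce((total, item) => total + item[k], 0);
  return { low: sum("low"), mid: sum("mid"), high: sum("high"), breakdown };
}
```

Because nothing in the module reaches outside itself, the same import works from a build script, a browser bundle, or a Node CLI with identical answers.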

12:32 PM — Commit d99100c: wire the calc into the homepage

This is the commit where the LaunchCost spoke stops being scaffolding. SSR baseline computed at build time from the default input. A client script that reads every form select, runs calculate() on change, and patches the result card + breakdown in place — three event listeners and a render function. No framework, no state library, no hydration overhead.

Manual hand-check during verification: switch team to 3, paid distribution to heavy, office to leased. Mid jumps from $42,180 to $139,780. The math hand-confirms: $93,600 founder time + $2,640 hosting + $18,000 leased office + $24,000 heavy paid + $1,540 legal/accounting/domain. That’s the kind of verification I trust more than a test — eyes on the actual dollars.

The interpretation line also got a small repair in this commit. The original generator produced “shipping a saas us build” which read awkwardly. New version: “shipping a US SaaS” / “shipping a mobile app” / “shipping a marketplace.” Three lines of code, an order-of-magnitude improvement in how the result reads aloud.
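A three-line fix like that is usually just swapping string concatenation for a lookup table. A hypothetical version (the real generator’s names and template differ):

```typescript
// Hypothetical interpretation-line fix: a lookup table of readable
// labels instead of naive word reordering ("a saas us build").
const STACK_LABELS: Record<string, string> = {
  saas: "a US SaaS",
  mobile: "a mobile app",
  marketplace: "a marketplace",
};

export function interpret(stack: string, mid: number): string {
  const label = STACK_LABELS[stack] ?? "a product";
  return `Shipping ${label} at this configuration lands around $${mid.toLocaleString("en-US")} in year one.`;
}
```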

12:33 PM — Commit 49e9ef9: the CLI

cs-08 in the coding standards says every feature gets a CLI. Five minutes later, LaunchCost has one. npx tsx scripts/calc-cli.ts --json emits machine-readable JSON; --team 99 exits 2 with [calc-cli] --team must be 1, 2, or 3 (got "99"). Fix: --team 2. on stderr.

The CLI matters more than its size suggests. It means the calculator is now scriptable for agents — Paperclip, autoresearch, anything that wants to ask “what does a 3-person mobile team’s year-one look like” without parsing HTML. It’s also the artifact that proves the math module’s purity is real: same import, same answer, in a context the homepage never reaches.
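The error shape quoted above — exit 2, the bad value echoed back, a concrete fix hint — falls out of a small pure validator plus a thin wrapper. A sketch, with hypothetical names (the real script is scripts/calc-cli.ts):

```typescript
// Returns null when the flag is valid, or the full stderr message when
// not. The thin CLI wrapper prints the message and calls process.exit(2),
// keeping the validator itself pure and testable.
export function validateTeam(raw: string): string | null {
  const team = Number(raw);
  if (!Number.isInteger(team) || team < 1 || team > 3) {
    return `[calc-cli] --team must be 1, 2, or 3 (got "${raw}"). Fix: --team 2.`;
  }
  return null;
}
```

Keeping process.exit out of the validator means the agent-facing error contract can be covered by the same test suite as the math.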

12:37 PM — Commit 0589961: Parent Care checklist engine + 30 tests

The bigger of the two engines. Twelve baseline questions across four domains (daily living, medical, finance, support network), each answer carrying a concern weight from 0 (no concern) to 3 (urgent). Five conditional follow-ups that surface inline when a base answer flags a specific risk — for example, “needs daily help bathing” triggers “have they fallen in the last six months.” The follow-ups don’t contribute to the readiness score; they sharpen the agency-call question packet only.

The four-band readiness scoring lands at: stable (0-25% concern), watch (26-50%), plan (51-75%), urgent (76-100%). The agency-call question packet generates only from concerns at level 2 or higher — no generic checklist dump, only what’s relevant to the family answering. That choice matters because the alternative — handing every reader the same twelve questions whether their parent is stable or in crisis — is the kind of thing that makes a “checklist tool” feel like content. The packet should be earned by the answers.

The tests cover band boundaries, conditional follow-up triggering, follow-up dedupe, agency-packet filtering, per-domain isolation, and the cs-05 validation path. All 30 pass.
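The scoring described above can be sketched in a few lines. This is a hypothetical reconstruction from the numbers in the post — twelve questions, max weight 3, the four band cutoffs, and the level-2+ packet filter — not the repo’s actual code:

```typescript
// Hypothetical sketch of the four-band readiness scoring. Concern
// percent is total weight over max possible (12 questions x weight 3).
type Band = "stable" | "watch" | "plan" | "urgent";

export function concernPercent(weights: number[]): number {
  const total = weights.reduce((sum, w) => sum + w, 0);
  return Math.round((total / (weights.length * 3)) * 100);
}

// Cutoffs per the post: stable 0-25%, watch 26-50%, plan 51-75%,
// urgent 76-100%.
export function band(percent: number): Band {
  if (percent <= 25) return "stable";
  if (percent <= 50) return "watch";
  if (percent <= 75) return "plan";
  return "urgent";
}

// Agency-call packet: only questions answered at concern level 2 or
// higher make it in — returns the indexes of the flagged questions.
export function packetQuestions(weights: number[]): number[] {
  return weights.map((w, i) => (w >= 2 ? i : -1)).filter((i) => i !== -1);
}
```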

12:41 PM — Commit 1e4b780: Parent Care UI + PDF export

The checklist now renders end-to-end. Twelve numbered radio sections, three per domain. The result panel updates live (no submit button) — band, per-domain percent bars with accent-colored fills, flagged-concern summary in plain English, and the agency packet appearing only when there’s something to put in it.

Conditional follow-ups render inline under the parent question that triggered them, with a dedicated accent-bordered card so the relationship is visible without scrolling. Toggling the parent answer back to a low-concern choice clears the follow-up automatically — verified by simulating that exact toggle and watching the follow-up disappear and the concern count drop from 5 to 4.

PDF export uses window.print() with a print-only stylesheet instead of a PDF library. Modern browsers expose “Save as PDF” in the native print dialog, so the user gets a downloadable artifact entirely on-device, with no JavaScript dependency on a 200KB PDF library. The print stylesheet inlines its own Source Sans / serif typography and matches the design-system feel.
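The whole pattern is roughly one print-only media query — hide the interactive chrome, show the result. Selectors here are hypothetical:

```css
/* Hypothetical print stylesheet for the window.print() export pattern:
   hide nav and the question form, print only the result panel. */
@media print {
  nav, .question-form, .no-print { display: none; }
  .result-panel { display: block; margin: 0; }
  body { font-family: "Source Sans 3", Georgia, serif; color: #000; }
}
```

The browser’s native “Save as PDF” does the rest, which is what keeps the export on-device and dependency-free.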

Privacy posture made explicit in the result panel copy: “Answers stay on your device. This entire page runs on your device.” No analytics fire on the answers themselves. The methodology block calls out specifically: there is no server collecting the responses; the PDF is generated locally and downloaded directly.

The Parent Care hero also got reframed in this commit. Day 2’s copy described the page as “twelve questions to ask any home-care agency” — accurate for the supporting article at /agency-questions, but wrong for the homepage. The homepage IS the readiness checklist. New copy: “Should we get outside help yet? Twelve questions that turn into your agency-call packet.” That’s what the page actually does.

12:45 PM — Commit 3e102cf: 3,500 words of article prose

All four supporting article pages — the nanocluster’s /hosting-costs, /stack-tiers, /agency-questions, /independent-vs-agency — went from STUB scaffolding to finished prose in a single commit.

The LaunchCost articles came in opinionated. Hosting Costs walks the five cost categories with real dollar ranges per category and names the egress trap explicitly (“100GB/month on AWS S3 is $9 in egress, which seems trivial until you multiply by 12 and remember it’s per region”). Stack Tiers compares Managed (Vercel + Supabase + Stripe) to Serverless (Cloudflare Workers + D1 + R2) to DIY (Hetzner + Postgres + nginx) with cost shapes named in dollars, then offers four diagnostic questions to pick a tier on purpose instead of by accident.

The Parent Care articles came in with the same opinionated voice but a softer hand. Agency Questions ships twelve specific questions across four domains (who shows up, training, money, supervision) with the answer shape to expect (“the honest turnover rate from a good agency is 30-60%”) and three red-flag patterns that should make a family keep shopping. Independent vs Agency walks five structural differences (taxes, sick coverage, supervision, exit friction, real loaded cost) and ends on one diagnostic question — “how much operational risk can your family absorb?” — that’s the same question the readiness checklist answers from the other direction.

Both Parent Care articles carry the same disclaimer the methodology box does: nothing on the page is medical, legal, or tax advice for a specific situation. The reviewer names remain placeholder pending final consent to publish — the body copy says so explicitly now, in plain text, instead of relying on a STUB comment that a reader wouldn’t see.


~12:46 PM — Where Day 3 ended

Seven commits between 12:25 and 12:45. Friday is no longer at risk; both spokes have working engines, both have finished prose, both have an end-to-end path from open-the-page to download-your-result.

What didn’t ship today:

  • GA4 measurement IDs + Search Console verification. I left those open in the handoff specifically because I didn’t have the IDs to paste in yet. If they arrive Thursday, they’re a 30-minute wire-up — three site-level src/lib/site.ts edits and a TXT record on the apex.
  • TrustCore Media polish. Still on trustcore-media.vercel.app; the media.50by50.dev DNS attach, automated curation pipeline doc, and on-Hub roadmap link slipped to Thursday.
  • Reviewer consent finalized. The four reviewer names in the Parent Care methodology box are placeholder, called out as such in the article body. Consent calls are queued for this week.

Friday’s launch is now a smoke-test-and-ship day, not a cramming day. That was the actual point of the noon handoff.


The thing I want to say about agent-execution days

I’m aware that “I wrote a handoff and the agent shipped” reads, to some people, like I didn’t do the day. I want to argue against that reading without being defensive about it.

The handoff is the work. It’s not a memo and it’s not a wishlist; it’s the spec — the rank order of priorities, the standards inline, the decisions-already-made that prevent a tool from making them again wrong, the questions I owe me-the-user before mid-afternoon, and the honest “what slips if it slips” budget. That spec is the only thing that determines whether the day’s commits are useful or wasted. I can outsource the typing. I cannot outsource the brief.

The day also taught me something I want to remember: a quality brief is itself a compounding asset. The Day 3 handoff has decisions in it that came from Day 1 and Day 2 (the nanocluster pattern, the warm-paper aesthetic, the privacy posture, the cs-08 CLI mandate, the verify-gate-before-commit discipline) and the next handoff will have decisions in it that came from today (the print-via-window.print() pattern for client PDF, the result-panel anchor pattern, the no-fabricated-reviewer-names rule the agent surfaced). Each day’s brief is shorter and sharper than the day before, because the standing rules grow and the things I have to re-explain shrink.

That’s the version of “shipping fast” I want to be testing in this sprint. Not “ship lots of code per hour” — anyone can do that, briefly. The thing I want is “ship the right code per hour because the brief is right.” Day 3 was the first day where that pattern showed up clearly enough for me to name.


Day 3 of 132. 129 days remaining. Two spokes ready for Friday’s launch.

Tomorrow: GA4 + Search Console if the IDs are in, TrustCore Media polish, the reviewer-consent calls, and the Friday-launch dry run.

