The half I left out of my parallel worktrees post
A follow-up on the parallel worktree setup: what I built to populate each session with realistic data, and the staging comparison that saved me from shipping something quietly wrong.
A few weeks ago I wrote about how I run parallel AI coding sessions on the same Laravel project, each one isolated in its own git worktree with its own Docker stack, its own database, and its own localhost port. That post was about the infrastructure: how to spin up three independent development environments from a single repository in about ninety seconds. What I didn't admit in that post, and what prompted this follow-up, is that the infrastructure on its own was not quite enough. It solved a real problem, but it left a quieter one untouched.
The empty worktree
The quieter problem was that each freshly spawned worktree was a ghost town. Running migrate:fresh --seed would populate the database with countries, question templates, rating scales, and a handful of starter users, and that was all. Structurally, it was a complete application. Functionally, every page beyond the admin panel was an empty table. There were no filled-out forms, no assets, no incidents, no remediation plans, and no way to tell whether a visual change to the audit register was actually working without first clicking through ten minutes of form-filling to generate something for the register to display. That rather defeats the point of being able to spin up three sessions in ninety seconds.
What I really needed was a seeder that produced a populated application. Not just the reference data that every install needs — the lookup tables and the question templates — but the kind of transactional data that a real tenant would accumulate after a week of actual use. Finished assessments. Audits with ratings and the remediation plans those ratings imply. A handful of incidents. Assets with real map coordinates so the location map would actually render. The obvious move was to build Laravel model factories for every domain model I cared about and then compose them inside a single DemoDataSeeder that stitched them together into a believable scenario.
Factories and a seeder
That is what I set out to do, and it went more or less the way you would expect. Working with Claude, I built the factories in dependency order — starting with the simple lookup tables, then users and organizations, then the structure of the forms, then the form instances, then the assets, then the responses, and finally the remediation plans and incidents that hang off everything else. Each factory got a small smoke test that asserted it could create a single instance and persist it without anything blowing up. The repository's test count went from eighty-nine to one hundred and nine, and every page I cared about in the application started showing data again. The dashboards had numbers. The audit register had two completed audits. The asset register had twelve assets, four of them with map pins. The application looked like a functioning product for the first time in my worktrees.
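Each of those smoke tests was deliberately small. A minimal sketch of the shape they all shared, using Asset as the example (the class and test names here are illustrative, not lifted from the repository):

use App\Models\Asset;
use Illuminate\Foundation\Testing\RefreshDatabase;
use Tests\TestCase;

class AssetFactoryTest extends TestCase
{
    use RefreshDatabase;

    // Smoke test: the factory can build one Asset and persist it without blowing up.
    public function test_factory_creates_and_persists_an_asset(): void
    {
        $asset = Asset::factory()->create();

        $this->assertDatabaseHas('assets', ['id' => $asset->id]);
    }
}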
The shape of the result I was aiming for was a single method chain that would do most of the setup work for me — something like this:
User::factory()->organization()->create();
One line, and behind the scenes it would create a new organization account, attach the fifteen rows of reference data that every real organization in this application needs in order to function, and leave me with an object I could then hang assets, audits, and responses off inside the demo seeder. Getting a single line to do that much useful work is really what all twenty or so factories were for.
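A helper like that doesn't need to be magic. Here is a minimal sketch of how such an organization() method can be wired up on the user factory; every model name, count, and relationship in it is a stand-in for the real thing:

// In database/factories/UserFactory.php. Model names, counts, and the
// belongsTo relationship are illustrative, not the app's actual schema.
public function organization(): static
{
    return $this->afterCreating(function (User $user) {
        $organization = Organization::factory()->create();
        $user->organization()->associate($organization)->save();

        // The reference rows every real organization needs in order to function.
        RatingScale::factory()->count(5)->for($organization)->create();
        QuestionTemplate::factory()->count(10)->for($organization)->create();
    });
}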
One small design decision is worth mentioning, because it turned out to make a disproportionate difference to how pleasant the environment was to use. My original plan had been to create separate demo organizations — Acme Corp, Beta Industries — each with their own fictitious admin users, so that you could log in as demo-orgadmin@example.com and see their data. I prototyped this, and then I threw it away. Remembering a new set of credentials every time you spin up a worktree is friction, and friction compounds when you are spinning up worktrees all day long. The better approach was to hang the demo data off the Test Organization that already existed in the bootstrap seeders, so that the familiar orgadmin@example.com account — the one I use everywhere else — would quietly find itself in front of a fully populated application with no new accounts, no new passwords, and no mental overhead at all.
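Inside the seeder, that decision looks something like this; the name-based lookup and the withMapCoordinates() state are placeholders for whatever the real code keys off:

// In DemoDataSeeder: reuse the organization the bootstrap seeders already
// created, so orgadmin@example.com sees the demo data with no new logins.
$organization = Organization::where('name', 'Test Organization')->firstOrFail();

// Twelve assets, four of them with coordinates so the location map renders.
Asset::factory()->count(8)->for($organization)->create();
Asset::factory()->count(4)->withMapCoordinates()->for($organization)->create();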
The final piece was the environment guard. I wanted the seeder to run on every environment that wasn't production — local development for me, but also our staging server, where the team needed a populated application for the same reasons I did. One line inside DatabaseSeeder was enough:
if (! app()->environment('production')) {
    $this->call(DemoDataSeeder::class);
}
A normal deployment runs php artisan migrate, which doesn't touch seeders at all, so shipping this code to staging was entirely safe. The demo data only appears when someone explicitly runs migrate:fresh --seed, which is exactly when — and only when — you want it to.
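Side by side, the two paths are:

php artisan migrate                # normal deploy: runs migrations only, never seeders
php artisan migrate:fresh --seed   # explicit opt-in: drops everything, reseeds, demo data appears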
One more check, before I merged
At this point I thought I was done. The factories existed, the seeder ran, the application was populated, and the test suite was green. I was ready to open the merge request and move on. Before I did, though, I wanted to do one more check — the kind of check that doesn't feel necessary until you have been bitten by skipping it. I wanted to compare what my seeder was actually putting into the database against what a real user had been writing into our staging environment over the previous few weeks. Not the shape of the data, not the row counts, but the exact column values, row by row and field by field.
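One way to run that comparison, sketched with an assumed second database connection named staging and a placeholder table and row selection:

use Illuminate\Support\Facades\DB;

// Pull the same logical row from staging and from the seeded local database,
// then report every column whose value differs. 'staging' is an assumed
// connection name; audit_responses and the row selection are placeholders.
$stagingRow = (array) DB::connection('staging')
    ->table('audit_responses')->orderBy('id')->first();
$seededRow = (array) DB::table('audit_responses')->orderBy('id')->first();

foreach ($stagingRow as $column => $stagingValue) {
    $seededValue = $seededRow[$column] ?? null;

    if ($stagingValue !== $seededValue) {
        echo sprintf(
            "%s: staging=%s seeded=%s\n",
            $column,
            var_export($stagingValue, true),
            var_export($seededValue, true)
        );
    }
}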
I am very glad I did, because my seeder was lying to me, and the lies were subtle enough that no automated test would ever have caught them.
A column called user_email
The first mismatch I found is still, several days later, my favourite. My seeder was populating a column called user_email on internal audit responses with, reasonably enough, the email address of the user who had submitted the response. The real application, it turned out, writes the literal string '0' into that column for internal users, and nothing in the schema, nothing in the column name, and nothing in any reasonable person's mental model would ever have hinted at this.
What my seeder wrote: user_email = 'orgadmin@example.com'
What production writes: user_email = '0'
It was one of those small pieces of behaviour that only exists because a controller, at some point years ago, needed a sentinel value and nobody ever revisited the decision. My seeder was not wrong in a way that would break anything — the application was perfectly happy with 'orgadmin@example.com' sitting in that column. It was wrong in a way that made the seeded data look slightly nicer than the real data, and that, I am now convinced, is the worst kind of wrong, because it makes local development quietly drift away from production reality without anybody noticing.
There were ten of these mismatches in total, and every single one of them traced back to the same underlying pattern. The controllers in this application write their data using raw SQL INSERT statements that only touch a specific subset of columns — nine, in some cases, out of a table with thirty — and every other column in the row is left sitting at whatever default the migration originally set, which was almost always NULL. A client ID my seeder was cheerfully populating. A progress percentage my seeder was calculating. A type field my seeder was deriving from the form. None of these fields were actually written by the application in production. All of them had, quietly and without any particular announcement, become columns that the schema promised but the code simply never touched.
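The pattern in miniature, with illustrative table and column names:

use Illuminate\Support\Facades\DB;

// What the controller actually writes: a raw INSERT naming a handful of
// columns out of thirty, including the '0' sentinel for internal users.
// The variables here are placeholders for the controller's request data.
DB::insert(
    'insert into audit_responses
        (form_instance_id, question_id, answer, user_email, created_at)
     values (?, ?, ?, ?, ?)',
    [$instanceId, $questionId, $answer, '0', now()]
);

// client_id, progress_percentage, type and the rest are never mentioned,
// so they stay at the migration default: almost always NULL.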
Correcting the seeder, once I understood the pattern, was the easy part. I went back through the DemoDataSeeder and stopped setting any column that the application itself did not set, and the staging diff came back clean. The harder question — and I think the reason this whole detour is worth writing about at all — is how I would ever have caught any of this without the staging comparison. I do not think I would have. Every automated test I could have written would have been reading the same schema I was writing to, which means the tests would have agreed with the seeder that everything was in order. The only authoritative source of truth was the production code path itself — the actual controller writing the actual INSERT statement, and the actual staging database that had been touched by that code path for weeks on end. Anything less than that would have missed it.
What I actually learned
This, I think, is the lesson that matters most, and it is specifically a lesson about working with an AI collaborator on a legacy codebase. Claude was a very capable partner throughout this work. It built the factories in the right order. It wrote smoke tests without being asked. It raised sensible questions about ownership columns and prerequisite chains and the difference between assessment responses and audit responses, and it caught several mistakes I would otherwise have made on my own. What it could not do — and I do not think any AI collaborator could reasonably have done — was know that user_email = '0' was the right value. That knowledge does not live in the schema, it does not live in the column names, it does not live in the tests, and it does not live in any piece of documentation I had put into the conversation. It lives only in the line of controller code that writes the INSERT, and in the staging database that reflects the consequences of that line running against real users over time. If I had not gone looking for it, neither of us would have found it.
So the moral of this follow-up, for me, is a slightly adjusted version of the one I ended the first post with. That first post was about teaching the AI about the environment — making sure it understood that a worktree was not an ordinary branch, and making sure it would not run cleanup commands that would kill the session. This one is about something adjacent, and a little more uncomfortable. It is about accepting that on a legacy codebase, an AI collaborator can only ever be as correct as the ground truth you show it.
The schema is not ground truth. The code that actually runs in production is.
If you are going to let the AI write your seed data, you have to be willing to compare what it wrote against a database that was shaped by that production code — and you have to be willing to do the comparison yourself, because nothing else will do it for you.
Where things stand now
The parallel worktrees, for what it is worth, are now as genuinely useful as I had hoped they would be when I wrote the first post. I can spin up three sessions at once, and every single one of them boots into an application that looks and behaves like a real product. The infrastructure, it turns out, was the easy half of the problem. The data — and the small, uncomfortable process of making sure the data wasn't quietly lying to me — was the other half, the half I didn't write about the first time round. This is me writing about it now.
Related: How I Run Parallel AI Coding Sessions on the Same Laravel Project — the first post in this pair, where the whole parallel-worktree setup got built in the first place.