You can only connect the dots looking backwards.
A list of the work — essays, projects, questions, influences.
Essays
- The immersive room is a listening environment
  We keep measuring whether immersive demos teach more. They don't. That's the wrong question.
- The apprenticeship was a curriculum
  Junior labor was the curriculum. AI eliminated the labor and didn't replace the curriculum.
- The author is still the author
  Use AI to draft, refine, brainstorm, code — fine. The moment your name is on it, every word is yours. 'GPT wrote it' does not transfer accountability. It just makes you a person who admits they didn't read what they sent.
- Dataset vs data product
  A dataset is a file. A data product is a contract about what questions it can answer, who owns the answers, and what happens when those answers change.
- Is the AI-to-AI email loop the world we want?
  When my assistant writes to your assistant, what is the email actually for? Asymmetric drafting is one thing. Both sides automated is another, and the medium starts to lie about whose attention is being spent.
- What 20 correlations taught me about consumer health data
  Twenty typed correlations, a sample-size gate per phase, and one confounder that broke half of them. The half that survived are the only ones I trust to surface to a user.
- I thought AI could write the app. My team humbled me.
  Code generation has gotten very good. Naming the right abstraction has not. Most of what makes a system maintainable still happens before any code is written, and that part is not getting easier.
- Cycle as a confounder
  Half the species moves through a four-phase hormonal cycle. Most consumer health correlations don't condition on which phase the data was collected in. Most of those correlations are wrong, or right for the wrong reason.
- Zero-to-low sample generation
  A handful of real records, a regulator who won't release more, and a system that has to be tested anyway. Most synthetic data writing is for a different problem.
Notes
- Why I love synthetic data
  It's the rare medium where breadth, creativity, and technical chops compound instead of trading off.
- Taste is the rare skill now
  Producing is cheap. Choosing is hard. The résumé that didn't get rewarded before is the one that does now.
- Synthetic data quality is not a number
  There's no general score, the absence is structural, and the test that matters is the one you run on your own use case.
- Real AI strategies are about subtraction
  If you can't name what you'll stop doing once the AI works, you don't have a strategy. You have a wishlist.
Open Questions
- Are static synthetic datasets actually useful to share?
  If a synthetic dataset is tuned to a specific fact pattern, what survives when someone else picks it up for their own?
- How do you evaluate synthetic data when there's no blanket metric?
  "Looks realistic" is the lazy proxy. Real quality is conditional on the question you're trying to answer with it.
- What does an honest eval look like for a system that learns from feedback?
  Once the loop closes, last month's benchmark is part of the training distribution. So what are we measuring?
- Why do synthetic data programs stall?
  The technique works. The pilots succeed. Then the program flatlines. The reasons rarely have to do with the data.
- What counts as an agent?
  When does a script become a colleague, and who is responsible for what it does?
- When does giving an agent a tool help, and when does it leak responsibility?
  Each tool is also a place where the human who reviews the output can no longer follow what happened.
- Is data quality a property of the data, or of the question being asked of it?
  The same dataset is clean for one decision and useless for another. We act like quality lives in the table.
- Whose evidence counts in consumer health?
  n=1 is dismissed and n=10,000 is overclaimed. The honest middle is small and unfashionable.
- What does it take for AI to actually ship inside a regulated function?
  Most pilots stall not because the model is wrong but because no one will sign the memo.
- How do you staff an AI team that ships to non-technical users?
  Half the team is wrong for the model work. The other half is wrong for the operator conversations. The shape that works is rare.
Projects
Roles
Influences
- Seeing Like a State — James C. Scott
  Legibility comes at a cost, and the cost is usually borne by whoever the system can't see.
- The Book of Why — Judea Pearl
  Causal language as a missing layer. The reason 'controlling for X' isn't always honest.
- Superforecasting — Tetlock & Gardner
  Calibration as a habit, not a talent. Most of what makes a forecaster honest is updating in small increments and out loud.
- Thinking, Fast and Slow — Daniel Kahneman
  The book that taught me to distrust my own first read of a chart.
- The Visual Display of Quantitative Information — Edward Tufte
  Every chart is also an ethical artifact. The question is whether the reader can tell what's been hidden.