Mary Fung
note, May 9, 2026

Why I love synthetic data

It's the rare medium where breadth, creativity, and technical chops compound instead of trading off.

Synthetic data is the part of this work I genuinely love, and I think it's because of how my career happens to be shaped. I've worked across enough firms, groups, and functions over the years to have a quiet library in my head — what the supply chain person actually needs to see, what the risk officer is looking for, what the clinical lead is allergic to, what would make the head of operations lean forward. For most of my career that library lived as slides and conversation. Synthetic data is the medium where it finally gets to leave my head as something you can poke at.

Most people frame synthetic data as a privacy workaround or a volume hack. Both true, both small. For me, the bigger thing is that it's the medium where cross-domain pattern recognition becomes a buildable artifact. You don't have to wait for someone's real data to land before you can show them what a system would do with it. You can build the scenario that the real operating environment would never have given you in time, and let the person who'd have to live with the system actually walk into it.

This is the part that rewards exactly the shape of practitioner I happen to be. Generalist enough to recognize what matters in domain after domain. Specific enough to know how to populate it. Technical enough to build it. None of those would be unusual on their own. The rare thing is the medium where all three compound instead of trading off, and synthetic data is that medium.

It also pairs naturally with immersive demonstration work — you can populate scenarios that real data couldn't have populated in the time available, in front of the people whose decisions you're actually trying to inform.

Most of what I've found genuinely fun to build in the last few years has been downstream of this one quiet shift.
