Picture this. You’re an individual contributor at a company that requires you to write “code” to get your job done. I’m trying to cast a wide net here: you could be a full-stack data scientist at Stitch Fix creating models that plug back into the business, or a software engineer at a startup writing product features. Basically, you are anyone who has to develop some “software” and, through that work, the business somehow moves forward. In general, it is easy to get started and deliver value to the business, since things are relatively simple at first. But consistently delivering value over time is hard. You can easily reach terminal velocity and end up spending all your time keeping your prior efforts running, or fighting their details to expand and do more, rather than moving your business forward. So how do you prevent this? At some point you need to start building abstractions that reduce maintenance costs and increase your development velocity. This is, after all, what all the big tech companies do internally. Together, these abstractions form a platform, i.e. something you build on top of. Now, building good platforms isn’t that straightforward, especially as businesses grow and scale.
I was lucky enough to spend the last six years focusing on “engineering for data science” and learning to build great platforms for the world-class data science team at Stitch Fix. During this time, I saw lots of platform successes and failures firsthand. Now, there is plenty of material available on what types of platforms have been built (see any big tech company’s blog) and on how to think about building a software product (e.g. building an MVP), but very little on how to start a platform and build it out. In this post I will synthesize my major learnings about how to build platforms into five lessons. My hope is that these five lessons will come in handy for anyone trying to build a platform, especially in the data/ML space.