Here at IC, the main technical challenge we face is adding up lots of numbers in a dynamic and performant way. The advent of streaming means that modern royalty accounting involves millions of individual transactions, and even the most optimized database queries start suffering from performance issues. We’ve spent the last couple of months building architecture to handle this, and I’m excited to share what we’ve come up with. It relies on two main pieces: materialized views in our PostgreSQL database, and a change data capture (CDC) event stream in Kafka that triggers refreshes of those materialized views.
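To make the shape of that second piece concrete, here’s a minimal sketch of the refresh trigger. All the table and view names are hypothetical, and the real system consumes Debezium-style CDC events from Kafka and issues `REFRESH MATERIALIZED VIEW` statements against Postgres; this just simulates both ends in memory.

```python
# Map source tables to the materialized views that depend on them
# (hypothetical names, for illustration only).
VIEW_DEPENDENCIES = {
    "transactions": ["transaction_sums_mv"],
    "payee_splits": ["payee_summary_mv"],
}

def views_to_refresh(cdc_event):
    """Given a CDC change event, return the materialized views to refresh."""
    table = cdc_event["source"]["table"]
    return VIEW_DEPENDENCIES.get(table, [])

def handle_event(cdc_event, refreshed):
    # In production this step would execute, per view:
    #   REFRESH MATERIALIZED VIEW CONCURRENTLY <view>;
    for view in views_to_refresh(cdc_event):
        refreshed.append(view)

# A simulated change event for an insert into `transactions`.
event = {"op": "c", "source": {"table": "transactions"}}
refreshed = []
handle_event(event, refreshed)
print(refreshed)  # ['transaction_sums_mv']
```

The key property is that only the views downstream of the changed table get refreshed, rather than invalidating everything an account touches.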
The heart of our software is what we call a payee summary, a report which calculates how much a payee in our system is owed along with a breakdown of how we got to that number. This report is generated using a fairly complex and hand-optimized SQL query. Traditionally, royalty accounting is done a round at a time, with the final balance carried over and used as the starting point for the next round. This can make changes and corrections to previously closed rounds a painful affair, and if there’s one thing we’ve learned in our time doing royalties it’s that there are always changes. To avoid this, our payee summaries are calculated using every transaction from the beginning of an account’s history, no matter which round we’re reporting on. This makes corrections a breeze, but the price we pay is that as an account’s transaction history grows, our performance suffers.
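A toy illustration (not our actual SQL, and with hypothetical field names) shows why computing from the full history makes corrections painless: a fix to a long-closed round is just another transaction, and the next summary picks it up with no carried-balance bookkeeping.

```python
from decimal import Decimal

# Hypothetical transaction records for one payee across two rounds.
transactions = [
    {"round": 1, "payee": "alice", "amount": Decimal("100.00")},
    {"round": 2, "payee": "alice", "amount": Decimal("50.00")},
]

def payee_summary(txns, payee, up_to_round):
    """Balance owed through `up_to_round`, computed from all of history."""
    return sum(
        (t["amount"] for t in txns
         if t["payee"] == payee and t["round"] <= up_to_round),
        Decimal("0"),
    )

print(payee_summary(transactions, "alice", 2))  # 150.00

# A correction to round 1, long after it closed: just append it.
transactions.append({"round": 1, "payee": "alice", "amount": Decimal("-10.00")})
print(payee_summary(transactions, "alice", 2))  # 140.00
```

The trade-off described above is visible here too: every summary scans the whole history, so the cost grows with the account rather than with the round.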
Our initial attempts at mitigating these performance issues were fairly vanilla: caching the result of the query, and invalidating it whenever anything in an account changed. This worked out OK for a while, but had a couple of key downsides: slow first-time responses, and small changes to an account forcing a recalculation of every single payee summary. So we started rethinking our approach, with a lot of help from Martin Kleppmann’s invaluable book Designing Data-Intensive Applications. We came to the conclusion that we could break up our single large query into individual pieces that could be cached and refreshed separately. The most expensive piece, the aggregated transaction sums, doesn’t change that often, mostly when users import new data. The other, more dynamic pieces, like payee splits or which contract to use for a set of transactions, are fairly cheap to compute.
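The split described above can be sketched in a few lines. This is a simplified model rather than our production code, and all the names and shapes are hypothetical: the expensive aggregate is cached per account and invalidated only when new data is imported, while the cheap dynamic pieces (here, a payee’s split) are applied fresh on every request.

```python
# Cache for the expensive piece, keyed by account.
_sum_cache = {}

def transaction_sums(account_id, txns):
    """Expensive aggregate: cached until the account imports new data."""
    if account_id not in _sum_cache:
        _sum_cache[account_id] = sum(txns)
    return _sum_cache[account_id]

def invalidate_sums(account_id):
    """Called only on data imports, not on every account change."""
    _sum_cache.pop(account_id, None)

def payee_summary(account_id, txns, split):
    # Cheap dynamic piece: apply the payee's split on top of the
    # cached aggregate.
    return transaction_sums(account_id, txns) * split

total = payee_summary("acct-1", [100, 50], 0.5)    # 75.0
# Changing a payee's split does NOT recompute the expensive sum;
# it reuses the cached aggregate.
total2 = payee_summary("acct-1", [100, 50], 0.25)  # 37.5
```

Compared with our original cache-the-whole-query approach, a change to a split or a contract now touches only the cheap layer, and the expensive aggregate is recomputed only when an import actually invalidates it.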