Saving Millions on Logging: Finding Relevant Savings

Submitted by Style Pass on 2023-01-27 20:30:06

In tech companies, Cost of Goods Sold is a key business metric driven in large part by the efficiency of software architectures. Saving money always sounds like a great idea, but it does not always take priority over features and growth, nor is it straightforward. At HubSpot, our relatively new Backend Performance team is tasked with improving the runtime and cost performance of our backend software. In this two-part blog series, we will look at a structured method we use for approaching cost savings work and demonstrate how we apply it at HubSpot to save millions on the storage costs of our application logs.

The first phase of cost savings work is discovery. We need to know how much each of our software systems costs. The foundation for cost data is often a cloud provider like Amazon Web Services (AWS). Cloud providers generally supply detailed cost data for the resources you use. In simpler systems, this may be enough to start piecing together cost categorizations.

Categorizing Costs

At HubSpot, our backend microservices are deployed using a custom Mesos layer called Singularity on top of AWS EC2 hosts. Any given EC2 host may be running several different deployable applications at any time. We also run our own database servers via Kubernetes instead of using cloud-hosted databases. All of this virtualization makes it hard to correlate the cost of a single EC2 instance with the cost of a specific application. To address this challenge, we have built an internal library that correlates applications to AWS resources by intercepting samples of application network calls to track usage of resources like S3, AWS Lambda, our internally hosted databases, and more. Tying all this data together, we are able to aggregate the costs of applications and databases, as well as attribute database costs to the applications that use them.
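To make the attribution idea concrete, here is a minimal sketch of splitting the cost of shared resources across applications in proportion to sampled usage. It is not our internal library: the function name `attribute_costs`, the shape of the inputs, the resource and application names, and the proportional-split rule are all assumptions chosen for illustration.

```python
from collections import defaultdict


def attribute_costs(resource_costs, usage_samples):
    """Split each resource's cost across applications by sampled usage.

    resource_costs: dict mapping a resource name to its cost (hypothetical).
    usage_samples: iterable of (application, resource, sample_count) tuples,
        standing in for sampled network calls (hypothetical format).
    """
    # Count sampled calls per resource, per application.
    per_resource = defaultdict(lambda: defaultdict(int))
    for app, resource, count in usage_samples:
        per_resource[resource][app] += count

    app_costs = defaultdict(float)
    for resource, cost in resource_costs.items():
        apps = per_resource.get(resource)
        total = sum(apps.values()) if apps else 0
        if total == 0:
            # No samples seen for this resource: keep its cost visible
            # rather than silently dropping it.
            app_costs["unattributed"] += cost
            continue
        for app, count in apps.items():
            # Proportional split: an app's share of samples is its share of cost.
            app_costs[app] += cost * count / total
    return dict(app_costs)


# Illustrative numbers only, not real cost data.
costs = attribute_costs(
    {"db-a": 100.0, "s3-b": 50.0, "idle-host": 10.0},
    [("crm", "db-a", 3), ("email", "db-a", 1), ("crm", "s3-b", 5)],
)
```

In this sketch, a resource with no sampled traffic lands in an `unattributed` bucket, so the attributed totals still sum to the full bill. A real system would likely weight samples by request size or duration rather than by raw call counts.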
