At Flipkart, over a billion recommendations are served to users every day. These recommendations help millions of users discover products and stores and guide them on their purchase journey. A performant recommendations system is crucial for a seamless shopping experience.
After adding many features over time, we recently took up the challenge of improving the system's performance. After exhausting a series of optimisations based on profiling the application and addressing multiple bottlenecks, we identified an opportunity to merge two services into a unified service.
As a result, Recommendations can now handle 3x more content on 20% less hardware. In this post, we share some lessons we learned along the way and cover two optimisations:
Service (which handles Pt. 1 above) understands user intent. It spends most of its time waiting for responses from other services, and its garbage pressure comes from short-lived objects. This service is less sensitive to GC pause times and fits the “most objects die young” assumption behind generational garbage collectors. We run this service on Java 8 with the G1 garbage collector (G1GC) and an 8 GB heap.
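As a rough illustration of the setup described above, the sketch below reports which collectors a running JVM has registered and how large its heap may grow. It assumes the JVM was started with flags along the lines of `-XX:+UseG1GC -Xms8g -Xmx8g`; the exact flags used in production are not stated in this post, and the class name `HeapConfigCheck` is hypothetical.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the Recommendations codebase.
final class HeapConfigCheck {

    /** Names of the collectors registered by the running JVM. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    /** Maximum heap the JVM will attempt to use, in MiB. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Assumed launch: java -XX:+UseG1GC -Xms8g -Xmx8g HeapConfigCheck
        System.out.println("Collectors: " + collectorNames());
        System.out.println("Max heap (MiB): " + maxHeapMiB());
    }
}
```

With G1 enabled, the collector names are expected to mention G1; the reported maximum heap can be slightly lower than the `-Xmx` value, depending on the JVM.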
Core does most of the heavy lifting and handles Pt. 2 and 3 above. It spends most of its time ranking, and its garbage pressure comes from a mix of short-lived objects created during candidate creation and long-lived objects in Machine Learning (ML) models. This service is extremely sensitive to GC pause times: any STW (Stop The World) GC pause adds directly to the overall latency and takes CPU time away from ranking. We run this service on Java 8 with the G1 garbage collector (G1GC) and a 16 GB heap.
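To make the cost of GC on a latency-sensitive path concrete, here is a minimal sketch that measures how much collection time accumulates while a piece of work runs, using the standard `java.lang.management` API. The class `GcOverheadProbe` and the allocation loop are hypothetical stand-ins for candidate creation, not the actual ranking code. Note that `getCollectionTime()` reports approximate accumulated collection time, which is related to, but not identical to, STW pause time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical probe, not part of the Recommendations codebase.
final class GcOverheadProbe {

    /** Approximate accumulated collection time (ms) across all collectors. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** Collection time (ms) that accumulated while the given work ran. */
    static long gcMillisDuring(Runnable work) {
        long before = totalGcMillis();
        work.run();
        return totalGcMillis() - before;
    }

    public static void main(String[] args) {
        long gcMs = gcMillisDuring(() -> {
            long checksum = 0;
            // Stand-in for candidate creation: many small, short-lived arrays.
            for (int i = 0; i < 2_000_000; i++) {
                int[] candidate = new int[64];
                candidate[0] = i;
                checksum += candidate[0];
            }
            System.out.println("checksum: " + checksum);
        });
        System.out.println("GC time during work (ms): " + gcMs);
    }
}
```

The measurement is process-wide, so on a busy server it also includes collections triggered by other threads. The JIT compiler may also optimise away some of the allocations in a toy loop like this one, so the reported number can be zero.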