LLMs, structured outputs and data generation


LLMs are in the spotlight right now: from the largest companies to the lone hacker on Twitter, everyone's talking about them. While we are all still learning what these models can and can't do effectively, it's undeniable that they have had, and will keep having, a big impact on how people approach software development. Recently, I decided to take OpenAI's GPT-3.5 Turbo API for a spin, to tackle a problem that I believe is fairly common: generating data on the fly based on already existing samples, and performing what I'll call "data retro-fitting".
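To make that idea concrete, here is a rough sketch of what such a request could look like. Everything in it is illustrative rather than taken from the post: the sample recipes, the prompt wording, the field names, and the `build_messages` helper are all my assumptions.

```python
import json

# Illustrative existing samples the model should imitate (field names assumed).
SAMPLES = [
    {"title": "Pancakes", "ingredients": ["flour", "eggs", "milk"]},
    {"title": "French Toast", "ingredients": ["bread", "eggs", "cinnamon"]},
]

def build_messages(samples):
    """Build a chat prompt asking for one new recipe as a JSON object."""
    return [
        {"role": "system", "content": "You generate recipes as JSON objects."},
        {
            "role": "user",
            "content": (
                "Here are existing recipes:\n"
                + json.dumps(samples)
                + "\nGenerate one new recipe with the same fields and a title "
                  "not present in the samples. Respond with a single JSON object."
            ),
        },
    ]

# The actual call would look roughly like this (requires the `openai`
# package and an API key, so it is not executed here):
#
#   from openai import OpenAI
#   client = OpenAI()
#   resp = client.chat.completions.create(
#       model="gpt-3.5-turbo",
#       response_format={"type": "json_object"},  # JSON mode
#       messages=build_messages(SAMPLES),
#   )
#   new_recipe = json.loads(resp.choices[0].message.content)
```

Keeping the prompt construction in a small helper like this makes it easy to swap in different sample sets when back-filling (retro-fitting) existing records.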

This experiment was all done in the context of a side project I have been working on for a little while, so, obviously, stakes are super low, mistakes are free to make, and data criticality is essentially non-existent. However, I feel like there are some really cool takeaways from this exercise, which is why I thought this would be worthy of a post.

The context is a super simple app that shows recipes to a user and lets them search for recipes by title (titles are unique, as we don't want repeated recipes in the app). Initially, this started very small: I simply used an in-memory SQLite DB, which I seeded via a JSON file:
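The original seeding code isn't shown here, but a minimal sketch of that setup might look like the following; the table schema, field names, and seed data are my assumptions.

```python
import json
import sqlite3

# In the real app the seed would come from a JSON file,
# e.g. json.load(open("seed.json")); a literal string stands in here.
seed = json.loads('[{"title": "Pancakes", "instructions": "Mix and fry."}]')

conn = sqlite3.connect(":memory:")
# Titles are unique: no repeated recipes in the app.
conn.execute("CREATE TABLE recipes (title TEXT PRIMARY KEY, instructions TEXT)")
conn.executemany(
    "INSERT INTO recipes (title, instructions) VALUES (:title, :instructions)",
    seed,
)
conn.commit()

# Searching by title is then a simple indexed lookup.
row = conn.execute(
    "SELECT instructions FROM recipes WHERE title = ?", ("Pancakes",)
).fetchone()
```

The `PRIMARY KEY` on `title` enforces the no-duplicates rule at the database level, so a bad seed file fails loudly at insert time instead of silently producing repeated recipes.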
