Running a Local Vision Language Model with LM Studio to sort out my screenshot mess

Running local language models has become increasingly easy over the last year. This is in large part thanks to the work of Georgi Gerganov on llama.cpp, which is used under the hood by a growing number of UI tools like LM Studio.

Recently, LM Studio added support for running models on Macs with Apple Silicon via mlx-engine, which uses Apple's MLX library to accelerate inference on that hardware. The most recent release of LM Studio also adds two very exciting features:

In addition to these two features, LM Studio already had support for Structured Outputs (using Outlines). In this blog post I'll show how these new features can be used for a perfect local VLM task: sorting out my chaotic desktop 😅.
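
To give a feel for what structured output looks like here, below is a minimal sketch against LM Studio's OpenAI-compatible local server (default `http://localhost:1234/v1`), using the `openai` Python client and an OpenAI-style `json_schema` response format. The model name is a placeholder for whichever model you have loaded; treat the exact schema shape as an assumption to check against the LM Studio docs for your version.

```python
# Hedged sketch: constrain a local model's output to a JSON schema via
# LM Studio's OpenAI-compatible server. Assumes the server is running
# at the default address with a model loaded.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

# JSON schema restricting the answer to a single category field.
schema = {
    "name": "screenshot_label",
    "strict": True,
    "schema": {
        "type": "object",
        "properties": {
            "category": {
                "type": "string",
                "enum": ["code", "webpage", "video", "meme", "other"],
            }
        },
        "required": ["category"],
    },
}

response = client.chat.completions.create(
    model="local-model",  # placeholder: whichever model LM Studio has loaded
    messages=[
        {
            "role": "user",
            "content": "Classify this screenshot description: a terminal full of Python tracebacks.",
        }
    ],
    response_format={"type": "json_schema", "json_schema": schema},
)
print(response.choices[0].message.content)  # e.g. {"category": "code"}
```

Because the output is forced into the schema, the result can be parsed with `json.loads` and used directly to route files, with no regex scraping of free-form model text.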

I have hundreds of screenshots on my desktop. Some of them are screenshots of code, some are screenshots of webpages, some are screenshots of videos, some are screenshots of stupid memes, etc.
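
Putting the pieces together, here is a rough sketch of the loop I have in mind: base64-encode each screenshot, send it to a locally loaded vision model through the same OpenAI-compatible endpoint, and ask for one of the categories above. Again, the model name is a placeholder, and the image-message format shown is the OpenAI-style vision payload that LM Studio accepts for VLMs.

```python
# Hedged sketch: classify desktop screenshots with a local VLM served
# by LM Studio. Assumes a vision-capable model is loaded and the
# server is at the default address.
import base64
from pathlib import Path

from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")


def classify_screenshot(path: Path) -> str:
    # Encode the image as a base64 data URI, the format the
    # OpenAI-style vision message expects.
    image_b64 = base64.b64encode(path.read_bytes()).decode("utf-8")
    response = client.chat.completions.create(
        model="local-vlm",  # placeholder for whichever VLM is loaded
        messages=[
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": "What kind of screenshot is this? "
                        "Answer with one of: code, webpage, video, meme, other.",
                    },
                    {
                        "type": "image_url",
                        "image_url": {"url": f"data:image/png;base64,{image_b64}"},
                    },
                ],
            }
        ],
    )
    return response.choices[0].message.content


# Walk the desktop and print a label for each PNG screenshot.
for shot in Path("~/Desktop").expanduser().glob("*.png"):
    print(shot.name, "->", classify_screenshot(shot))
```

From there it's a small step to combine this with the structured-output schema from earlier and move each file into a folder named after its category.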
