The large language models (LLMs) that have increasingly taken over the tech world are not “cheap” in many ways. The most prominent LLMs, such as GPT-4, reportedly cost some $100 million to build. That figure reflects the legal costs of licensing training data; the computational power required to train models with billions or even trillions of parameters; the energy and water needed to keep that computation running; and the many engineers needed to develop and maintain the training algorithms that run cycle after cycle as the machine “learns.”
What if a researcher needs to perform a specialized task that a machine could do more efficiently, but lacks access to a large institution, such as Washington University in St. Louis, that offers generative AI tools? Or say a parent wants to prepare their child for a difficult test and needs to generate many worked examples of complicated math problems?
Building one’s own LLM is an onerous prospect, not only because of the aforementioned costs, but also because using big models such as GPT-4 and Llama 3.1 off the shelf may not be immediately suited to the complex logical and mathematical reasoning the task requires.