Understanding AI | Lee Robinson

2024-12-02 17:00:06

I wanted to better understand how AI models are created. Not to become an expert, but to gain an appreciation for the abstractions I use every day.

This post will highlight what I’ve learned so far. It’s written for other engineers who are new to topics like neural networks, deep learning, and transformers.

Traditional software is deterministic. Given the same input, running the program again produces the same output. A developer has explicitly written code to handle each case.

Machine learning teaches software to recognize patterns from data. Given the same input, you might not get the same output each time². AI models like GPT (from OpenAI), Claude (from Anthropic), and Gemini (from Google) are “trained” on a large chunk of internet documents. These models learn patterns during training.
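To make that contrast concrete, here is a minimal Python sketch. Everything in it is made up for illustration: `shipping_cost` stands in for hand-written rules, and `NEXT_WORD_PROBS` is a hypothetical table of probabilities a trained model might assign to the next word. Real models learn these numbers from data and are far more complex, so treat this as a toy, not as how GPT, Claude, or Gemini actually work.

```python
import random


def shipping_cost(weight_kg: float) -> float:
    """Traditional software: a developer wrote an explicit rule for each case."""
    if weight_kg <= 1:
        return 5.0
    return 5.0 + 2.0 * (weight_kg - 1)


# Hypothetical probabilities for the word that follows "The cat sat on the".
# In a real model these come from training, not from a hand-written table.
NEXT_WORD_PROBS = {"mat": 0.6, "sofa": 0.3, "roof": 0.1}


def sample_next_word(rng: random.Random) -> str:
    """Pick the next word at random, weighted by its probability."""
    words = list(NEXT_WORD_PROBS)
    weights = list(NEXT_WORD_PROBS.values())
    return rng.choices(words, weights=weights, k=1)[0]


if __name__ == "__main__":
    # Same input, same output, every time.
    print(shipping_cost(3), shipping_cost(3))

    # Same "prompt", but the sampled word can differ from run to run.
    rng = random.Random()
    print([sample_next_word(rng) for _ in range(5)])
```

Calling `shipping_cost(3)` always returns the same number. Calling `sample_next_word` repeatedly returns "mat" most of the time, but not always, which is the kind of variation you see when you send the same prompt to a model twice.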

Then, there’s an API or chat interface where you can talk to the model. Based on some input, it can predict and generate sentences, images, or audio as output. You can think about machine learning as a subset of the broader AI category.