Acting as Claude's Research Helper in AI

Submitted by
Style Pass
2024-12-27 07:00:03

Usually, when I use LLMs such as Claude 3.5 Sonnet or ChatGPT's O1, I am trying to learn more or to apply known techniques or known theory/practice that comes from published results and literature, as well as the vast corpus of examples and discussions that are included in the training sets of these models.

But I occasionally like to speculate or dream a bit with these models. What's so stimulating to me about doing this is that these models have been exposed to an unbelievably vast amount of pure math. Not only does this include all the classic papers and textbooks across all areas of pure and applied math, but even the extremely dense reference works such as the Stacks Project, which currently exceeds 7,000 pages of incredibly dense and hard math (see this PDF to get an idea of what I'm talking about).

Now, I don't know how many human beings in the entire world have really read and understood even 10% of this work, but I highly doubt it's more than a couple thousand at most. And of those people, how many have also gone that deep into other areas of higher math, such as probability theory? And then, what is the Venn diagram of people who know all that stuff cold but who are also interested in and knowledgeable about the state of the art in AI/LLMs and the latest developments in model architectures, training algorithms like Adam, enhanced context window sizes, model distillation, etc.? It can't be more than a few hundred people at most, and is probably more like a few dozen.
