With that, we gain the ability to send HTTP requests, serialize & deserialize JSON, and to handle errors without cursing. We’re ready to write s

Judging Code - by Thorsten Ball - Register Spill

submited by
Style Pass
2025-01-22 20:30:02

With that, we gain the ability to send HTTP requests, serialize & deserialize JSON, and to handle errors without cursing. We’re ready to write some code.

It’s the final two pieces in this little demonstration and this is also where some audience participation is allowed, but to keep things simple, how about this:

I’ve used LLMs-as-Judges quite a bit in the past few weeks at work and seeing LLMs work like that, be reliable like that, be a fuzzy-to-non-fuzzy adapter — it made me reconsider what I thought LLMs were useful for.

Reliable? Yes. The temperature is 0 and even if I ask Claude ten times, it will very likely produce the same thing, as long as all inputs stay the same:

Seeing LLMs work like that made me think of all the questions I had in the past about data, about code, about text, that were very hard to answer in code but so easy to express in prose: does this page show the sign-in button? does this function call that one? is that thing hidden and that one extended? is this documented? is there commented-out code in here?

Leave a Comment