Programming note: Technical Chops is back from the August break. From this week on, posts will be delivered on a Friday morning.
One of the challenges with Large Language Models (LLMs) is that they are still, quite often, incorrect. Worse still, they are confidently incorrect. They are, to put it politely, bullshitters.
In critical environments, we build on the assumption that software, the hardware it runs on, or the hardware that feeds it data can be wrong.
Airliners are an acute example – they don’t just have one sensor for a given metric, they often have at least three. By using the data from three sensors, their software can use a technique called Triple Modular Redundancy (TMR) to identify when one of the sensors is providing bad data. If one airspeed sensor starts saying that the plane is going too slowly, but the other two sensors say everything is normal, TMR allows the plane to effectively take the “votes” of the two normal sensors as the correct speed. This is especially critical when the plane can take actions by itself based on those inputs; the lack of this kind of redundancy was a contributory factor to the Boeing 737 Max disasters in 2018 and 2019.
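To make the voting idea concrete, here is a minimal sketch in Python. It is an illustration only, not how any real avionics system is written: the function name `tmr_vote`, the `tolerance` parameter, and the airspeed figures are all made up for this example. It takes the median of the three readings as the voted value, flags any sensor that strays too far from it, and refuses to answer if no two sensors agree.

```python
from statistics import median


def tmr_vote(readings, tolerance):
    """Vote across exactly three sensor readings.

    Returns (voted_value, faulty_indices). The voted value is the median,
    which always sits with the majority when two sensors agree. `tolerance`
    is a hypothetical threshold for how far a reading may stray from the
    voted value before the sensor is flagged.
    """
    if len(readings) != 3:
        raise ValueError("TMR needs exactly three readings")

    voted = median(readings)
    faulty = [i for i, r in enumerate(readings) if abs(r - voted) > tolerance]

    # If two sensors disagree with the median, there is no majority to trust.
    if len(faulty) > 1:
        raise RuntimeError("no two sensors agree; cannot vote")

    return voted, faulty


# Hypothetical airspeeds in knots: the third sensor reads far too slow.
print(tmr_vote([250.0, 251.0, 120.0], tolerance=5.0))  # (250.0, [2])
```

The important design choice is the last check: when the sensors cannot outvote a bad reading, the sketch raises an error rather than guessing, so that something else (a pilot, a fallback mode) has to decide.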
Thankfully, I can’t think of any good uses for LLMs in critical environments. Whilst there have been cases where lawyers have cited fake cases created by ChatGPT and faced the resultant wrath of a judge, most of the time the stakes are your own, or your product’s, credibility.