
Instant Apply


Over the last year and a half of developing features that users didn’t like, finding out they didn’t like them, and then trying again, we’ve built a better idea of what will and won’t fly. Among the biggest predictors is what I call “UX risk”. [0]

In our internally-formed UX lingo, this basically means the amount of energy that the user can expect to lose from the LLM messing up, something like

\[ \text{UX risk} = p \cdot (t + a) \]

where \(p\) is the probability that the LLM fails, \(t\) is the time from user request to response, and \(a\) is any additional “annoyance” caused in the case of failure. [1]

Consider some examples: Autocomplete is a great feature because it quickly predicts small pieces of text and is unobtrusive when wrong. This means that \(p\), \(t\), and \(a\) are all small. On the other hand, OpenAI’s new model, o1-preview, is impressive but doesn’t provide a great UX because it takes a long time and doesn’t stream (\(t\) is large), and it is so far quite difficult to guess which tasks it will be uniquely suited to solving (\(p\) is large).
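To make the comparison concrete, here is a minimal sketch of the formula applied to those two examples. The `ux_risk` function and every number below are illustrative assumptions, not measurements from any real product:

```python
def ux_risk(p: float, t: float, a: float) -> float:
    """Expected energy lost to an LLM failure:
    p -- probability that the LLM fails
    t -- time (seconds) from user request to response
    a -- additional "annoyance" on failure, in time-equivalent seconds
    """
    return p * (t + a)

# Autocomplete: wrong fairly often, but fast and easy to ignore.
autocomplete = ux_risk(p=0.5, t=0.3, a=0.2)

# o1-preview-style request: slow, no streaming, hard to predict success.
long_request = ux_risk(p=0.4, t=30.0, a=15.0)

print(f"autocomplete risk ~ {autocomplete:.2f}s")   # ~0.25s per attempt
print(f"long-request risk ~ {long_request:.2f}s")   # ~18s per attempt
```

Under these made-up numbers, each autocomplete attempt risks a fraction of a second while each long-running request risks tens of seconds, which matches the intuition that the latter feels far more costly to try.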

When users learn that a feature is risky (high latency, low probability of success), they stop using it, and unfortunately (but understandably) their trust is hard to win back. This isn’t entirely new as a concept, but it’s certainly more pronounced in LLM-centric products.
