My opinion on “5” == 5 – Win Vector LLC

Submitted by Style Pass on 2021-07-28 16:00:09

Every programmer should have an opinion on what the outcome of an expression like "5" == 5 should be, and perhaps even a guess as to what the answer is in their most familiar programming language.

Some SQL engines reject this comparison outright with a type error. That is a nice, safe, early error that can prevent a lot of confusing data-processing bugs down the line. This may be clearer in a related example: SELECT "5" IN (1, 2, 3).

In this context: it is likely the user meant SELECT 5 IN (1, 2, 3), i.e. to check if an integer was in a given set of integers. And it is unlikely the user actually meant SELECT "5" IN (1, 2, 3). The second form can never be true in SQL, by simple type inspection, so it is useful to have it disallowed.
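The same early-error idea can be sketched outside SQL. The following is a minimal Python illustration, not any database's actual implementation: strict_in is a hypothetical helper, introduced only for this example, that refuses to test membership when the value and the candidates have different types.

```python
def strict_in(value, candidates):
    """Hypothetical helper: membership test that rejects mixed types early."""
    candidates = list(candidates)
    for candidate in candidates:
        # Fail loudly instead of quietly answering False for a comparison
        # that can never succeed.
        if type(candidate) is not type(value):
            raise TypeError(
                f"cannot compare {type(value).__name__} "
                f"with {type(candidate).__name__}"
            )
    return value in candidates


int_result = strict_in(5, [1, 2, 3])  # same types: an ordinary membership test

try:
    strict_in("5", [1, 2, 3])  # mixed types: rejected before any comparison
    mixed_outcome = "no error"
except TypeError as err:
    mixed_outcome = f"TypeError: {err}"

print(int_result)
print(mixed_outcome)
```

Under these assumptions the integer query simply answers the question asked, while the string query is flagged as a probable mistake at the point where it was written.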

In Python, "5" == 5 turns out to be False, which is useful and understandable in a general-purpose programming language. Python deals with heterogeneous lists and sets, so it isn’t obvious that "5" == 5 or "5" in {1, 2, 3} are nonsense expressions given the language context.
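The Python behavior described above can be checked directly. This short snippet uses only built-in operators; the variable names are chosen for illustration.

```python
# A str and an int compare without raising; they are simply unequal.
str_equals_int = "5" == 5

# Membership in a set of ints is likewise False rather than an error.
str_in_int_set = "5" in {1, 2, 3}

# Heterogeneous containers are legal, so mixed-type comparisons are routine.
mixed = [5, "5"]
str_in_mixed = "5" in mixed

print(str_equals_int)   # False
print(str_in_int_set)   # False
print(str_in_mixed)     # True
```

Because a list such as [5, "5"] is perfectly valid, a mixed-type equality test has a sensible meaning in Python, and False is the natural answer.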

However, I can’t defend R’s return value for "5" == 5. This turns out to be TRUE. One can, of course, guess at what sort of implicit casting is supporting this, but R isn’t a language where strings and numbers are generally equivalent. Yet we have 5 %in% c("5") evaluating to TRUE. R mostly enforces homogeneous types, but it does so by quiet implicit type conversion (a buggy gift that keeps on giving). A well-informed R user expects c(5, "5") to be a vector of strings; I am less convinced many expect "5" == 5 to evaluate to TRUE.
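To make the implicit conversion concrete, here is a rough Python sketch of the kind of coercion that would produce R's answer. It is an emulation under stated assumptions, not R's implementation: r_like_equal is a hypothetical function, and Python's str() is assumed as a stand-in for R's number-to-string conversion (the two format some numbers differently, for example floating-point values).

```python
def r_like_equal(a, b):
    """Hypothetical sketch: if either side is a string, compare both as strings."""
    if isinstance(a, str) or isinstance(b, str):
        # Quiet implicit conversion: the non-string operand becomes text.
        return str(a) == str(b)
    return a == b


coerced_equal = r_like_equal("5", 5)      # 5 becomes "5", so the texts match
textual_only = r_like_equal("5.0", 5)     # "5.0" vs "5": text, not numbers
plain_numbers = r_like_equal(5, 5)        # no strings involved

print(coerced_equal)   # True
print(textual_only)    # False
print(plain_numbers)   # True
```

The sketch shows why the result surprises: once the number has quietly become a string, the comparison is textual, so the TRUE result comes from matching characters rather than from any numeric notion of equality.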
