Unicode Roman Numerals and Screen Readers

submitted by
Style Pass
2023-03-15 13:00:05

Most people with a grasp of the interplay between English and Latin would read "In Hamlet, Act Ⅳ, Scene Ⅸ" aloud as "In Hamlet, Act four, scene nine". And they'd be right! But screen-readers - computer programs which convert text into speech - often get this wrong.

Why? Well, because I didn't just type "Uppercase Letter i, Uppercase Letter v". Instead, I used the Unicode symbol for the Roman numeral 4 - Ⅳ. And, it turns out, lots of screen-readers have a problem with those characters.
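To make the difference concrete, here is a minimal Python sketch using the standard `unicodedata` module. The `describe` helper is a name made up for this illustration; it simply lists the code point and official Unicode name of each character in a string.

```python
import unicodedata


def describe(text):
    """Return (code point, Unicode name) pairs for each character in text.

    Hypothetical helper for illustration only.
    """
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


numeral = "\u2163"  # Ⅳ, a single Roman numeral character
letters = "IV"      # two ordinary Latin letters

print(describe(numeral))  # one entry: ROMAN NUMERAL FOUR
print(describe(letters))  # two entries: LATIN CAPITAL LETTER I, LATIN CAPITAL LETTER V
```

The two strings can look identical on screen, but software that processes the text sees one unfamiliar symbol in the first case and two familiar letters in the second.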

Unicode contains Roman numerals for 1 to 10, plus the compound numbers 11 and 12, as well as 50, 100, 500, and 1000 - in a variety of forms, including uppercase and lowercase.
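If you want to see the set for yourself, the sketch below walks the code points U+2160 to U+217F, which hold the uppercase and lowercase numerals, and prints each character with its name and numeric value. The function name `roman_numeral_table` is my own invention, and the sketch deliberately skips the handful of archaic numerals that follow this range.

```python
import unicodedata


def roman_numeral_table():
    """List (character, name, value) for U+2160..U+217F.

    Hypothetical helper: covers the 16 uppercase and 16 lowercase numerals.
    """
    table = []
    for cp in range(0x2160, 0x2180):
        ch = chr(cp)
        table.append((ch, unicodedata.name(ch), int(unicodedata.numeric(ch))))
    return table


for ch, name, value in roman_numeral_table():
    print(f"U+{ord(ch):04X} {ch} {name} = {value}")
```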

Why does Unicode contain these numbers which, to most people, are just squashed-together Latin letters? As ever with Unicode, it is a mix of legacy and practicality.

As the Unicode Standard puts it: "Roman Numerals. For most purposes, it is preferable to compose the Roman numerals from sequences of the appropriate Latin letters. However, the uppercase and lowercase variants of the Roman numerals through 12, plus L, C, D, and M, have been encoded for compatibility with East Asian standards. Unlike sequences of Latin letters, these symbols remain upright in vertical layout. Additionally, in certain locales, compact date formats use Roman numerals for the month, but may expect the use of a single character."
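Because these are compatibility characters, each one carries a compatibility decomposition into plain Latin letters, and Unicode's NFKC normalisation applies it. The sketch below is one possible way to follow the "compose from Latin letters" advice before publishing text. The function name `expand_roman_numerals` is hypothetical, and the range check is my own choice so that other compatibility characters (ligatures, full-width forms and so on) are left alone. Whether a particular screen-reader then says "four" or spells out the letters is not something this code can guarantee.

```python
import unicodedata


def expand_roman_numerals(text):
    """Replace characters in U+2160..U+217F with their Latin-letter equivalents.

    Hypothetical helper: only the Roman numeral range is normalised;
    every other character is passed through unchanged.
    """
    return "".join(
        unicodedata.normalize("NFKC", ch) if 0x2160 <= ord(ch) <= 0x217F else ch
        for ch in text
    )


print(expand_roman_numerals("In Hamlet, Act \u2163, Scene \u2168"))
```

Running NFKC over the whole string would also work, but it would rewrite every other compatibility character too, which may be more than you want.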
