Suppose that you give me two ASCII strings having the same number of characters. I wish to compute efficiently the number of matching characters (same

Counting the number of matching characters in two ASCII strings

submited by
Style Pass
2021-05-21 15:00:03

Suppose that you give me two ASCII strings having the same number of characters. I wish to compute efficiently the number of matching characters (same position, same character). E.g., the strings ‘012c’ and ‘021c’ have two matching characters (‘0’ and ‘c’).

There is nothing wrong with this code. An optimizing compiler can auto-vectorize this code so that it will do far fewer than one instruction per byte, given long enough strings.

However, it does appear that the routine looks at every character, one by one. So it looks like you are loading two values, then you are comparing and then incrementing a counter, for each character. So it might compile to over 5 instructions per character (prior to auto-vectorization).

What you can do instead is load the data in blocks of 8 bytes, into 64-bit integers as in the following code. Do not be mislead by the apparently expensive memcpy calls: an optimizing compiler will turn these function calls into a single load instruction.

Leave a Comment