[ELI5] Why does autocorrect insist that the first letter of a misspelled word is more important than the rest of it?

926 viewsOtherTechnology

For example, if I spell “umportant”, it’s easy for us to recognise that it’s supposed to be “important”, but autocorrect insists that it’s something like “umbrella”, or I guess more logically “unimportant”, even though “important” is only 1 correction away.

​

These are real examples from my phone (Samsung Galaxy):

​

Wuick gets the suggestions Wicked, Which, Wucky, Whickham, Whicker, Wick, Wickets, Wicket, and Wickham. None of which are “Quick”, what I intended to write.

​

Nrown gets the suggestions Now, Nr own, Noon, Nowhere, Nr owner, Nr owns, and Nr owners. None of which are “Brown”.

​

Dence gets the suggestions Dance, December, Denied, Dancers, Decent, Dense, Dench, and Deuce. None of which are “Fence”.

​

It’s bothered me for years that it never ever picks up on a misspelt first letter.

​

Edit: I tried “umportant”, and it actually comes with 0 suggestions. Not umbrella, not unimportant, not even “important”. But “inportant” and “ikportant” and even “iqportant” are all recognised as “important”.

In: Technology

14 Answers

Anonymous 0 Comments

Many auto-correct systems and word suggesters use a data structure known as a trie that you can visualize as a tree. At the very top is an empty value (i.e., no characters). Then, it has a branch for every letter so that there’s “a”, “b,” … “z.” A branch then extends from every letter for all letters that could make a possible word. So A would have B and C and most if not all letters as branches extending from it. Q, on the other hand, would mainly lead to U.

Let’s say you start typing the beginning of “umbrella.” First, you’d traverse the trie from the empty value to the U. From there, you could find branches to other letters that might form a word. You could traverse to N because of words like “undo” or to Z because of words like “uzi.”

Now, let’s say you accidentally type “umport.” “Ump” is valid because it could lead to “umpire,” but I’m pretty sure that “umpo” doesn’t lead to any words, so it has no branches. It looks at the last possible valid path and returns the first word it finds, which might be umpire.

Each implementation could be different leading to slightly different outcomes. Some don’t use tries at all, but many do.

You are viewing 1 out of 14 answers, click here to view all answers.