[ELI5] Why does autocorrect insist that the first letter of a misspelled word is more important than the rest of it?


For example, if I spell “umportant”, it’s easy for us to recognise that it’s supposed to be “important”, but autocorrect insists that it’s something like “umbrella”, or I guess more logically “unimportant”, even though “important” is only 1 correction away.

These are real examples from my phone (Samsung Galaxy):

Wuick gets the suggestions Wicked, Which, Wucky, Whickham, Whicker, Wick, Wickets, Wicket, and Wickham. None of these are “Quick”, which is what I intended to write.

Nrown gets the suggestions Now, Nr own, Noon, Nowhere, Nr owner, Nr owns, and Nr owners. None of these are “Brown”.

Dence gets the suggestions Dance, December, Denied, Dancers, Decent, Dense, Dench, and Deuce. None of these are “Fence”.

It’s bothered me for years that it never ever picks up on a misspelt first letter.

Edit: I tried “umportant”, and it actually comes up with 0 suggestions. Not umbrella, not unimportant, not even “important”. But “inportant” and “ikportant” and even “iqportant” are all recognised as “important”.


14 Answers

Anonymous

There are different metrics for measuring the ‘distance’ between two strings, for example Levenshtein distance (https://en.m.wikipedia.org/wiki/Levenshtein_distance) or Damerau–Levenshtein distance (https://en.m.wikipedia.org/wiki/Damerau–Levenshtein_distance), which also counts swapping two adjacent letters as a single edit. Autocorrect systems often rank their suggestions with something like this, and different systems may weight the edits differently. As another commenter mentioned, they may assume that you at least got the first letter right in a long word, and so be less likely to suggest corrections that change it.
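
To make that concrete, here is a rough Python sketch of the two distances, plus a made-up ranking rule that adds a penalty when a candidate changes the first letter. The `first_letter_penalty` parameter and the `rank` helper are my own assumptions for illustration, not how any real keyboard is known to work, and the second function is the simpler "optimal string alignment" variant of Damerau–Levenshtein.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute
        prev = cur
    return prev[-1]


def osa_distance(a: str, b: str) -> int:
    """Like Levenshtein, but an adjacent swap ("teh" -> "the") costs 1."""
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i
    for j in range(len(b) + 1):
        d[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = a[i - 1] != b[j - 1]
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[len(a)][len(b)]


def rank(typed: str, candidates: list[str], first_letter_penalty: float = 2.0) -> list[str]:
    """Hypothetical ranking: edit distance plus a penalty for changing letter 1."""
    def score(cand: str) -> float:
        penalty = first_letter_penalty if typed[:1] != cand[:1] else 0.0
        return levenshtein(typed, cand) + penalty
    return sorted(candidates, key=score)


print(rank("wuick", ["quick", "wick", "wicked"]))
print(rank("wuick", ["quick", "wick", "wicked"], first_letter_penalty=0.0))
```

With the penalty switched on, “wick” outranks “quick” even though both are one edit away from “wuick”, which is the kind of behaviour the question describes.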

There are also other, very different models you might use, like trying to help people who are guessing at a phonetic spelling. The system might have a database of words that are frequently misspelled phonetically and look for potential matches to those as well. On a cellphone specifically, it might also take your keyboard layout into account: if people tend to miss letters by drifting to one side more than the other, that could change the weighting it gives to each possible substitution.
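
A keyboard-aware weighting could look something like the sketch below. Everything here is an assumption on my part: the approximate QWERTY key positions, the 1.5 "neighbour" radius and the 0.5 cost for hitting a neighbouring key are invented numbers, and it treats left and right misses the same rather than modelling a bias to one side.

```python
import math

# Approximate QWERTY geometry: each row is shifted right a bit (assumed offsets).
ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]
OFFSETS = [0.0, 0.5, 1.5]
KEY_POS = {
    ch: (col + off, row)
    for row, (keys, off) in enumerate(zip(ROWS, OFFSETS))
    for col, ch in enumerate(keys)
}


def substitution_cost(a: str, b: str) -> float:
    """Cheaper substitution when the two keys sit next to each other."""
    if a == b:
        return 0.0
    if a not in KEY_POS or b not in KEY_POS:
        return 1.0
    (x1, y1), (x2, y2) = KEY_POS[a], KEY_POS[b]
    return 0.5 if math.hypot(x1 - x2, y1 - y2) < 1.5 else 1.0


def keyboard_distance(typed: str, word: str) -> float:
    """Edit distance where substitutions use the keyboard-aware cost."""
    prev = [float(j) for j in range(len(word) + 1)]
    for i, ct in enumerate(typed, 1):
        cur = [float(i)]
        for j, cw in enumerate(word, 1):
            cur.append(min(prev[j] + 1.0,                           # extra key typed
                           cur[j - 1] + 1.0,                        # key missed
                           prev[j - 1] + substitution_cost(ct, cw)))  # wrong key
        prev = cur
    return prev[-1]


for candidate in ["brown", "crown", "grown"]:
    print(candidate, keyboard_distance("nrown", candidate))
```

Under these assumptions “brown” scores lower than “crown” or “grown” for “nrown”, because N sits right next to B.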

A lot of more modern systems also take context into account to some degree, perhaps using some kind of bigram or trigram model that looks at the preceding one or two words to guess how likely each suggestion is to be correct. I can tell my phone is at least sort of doing this: ‘the quick nrown’ suggests “brown”, but ‘turn that nrown’ doesn’t make any suggestion as I type, perhaps because its model considers none of the plausible substitutions a likely fit.
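
Here is a toy version of that idea, as a guess at the general shape rather than anything a real phone does. The bigram counts are invented for the example, and `min_count` is a hypothetical cutoff below which no suggestion is shown.

```python
from collections import Counter

# Invented counts of (previous word, word) pairs; a real model would learn
# these from a large amount of text.
BIGRAM_COUNTS = Counter({
    ("quick", "brown"): 50,
    ("quick", "crown"): 1,
    ("that", "crown"): 2,
    ("that", "brown"): 1,
})


def suggest(prev_word, candidates, min_count=5):
    """Pick the candidate seen most often after prev_word, or None if
    even the best one is too rare to be worth suggesting."""
    best = max(candidates, key=lambda w: BIGRAM_COUNTS[(prev_word, w)])
    if BIGRAM_COUNTS[(prev_word, best)] < min_count:
        return None
    return best


candidates = ["brown", "crown", "grown"]  # words close to "nrown"
print(suggest("quick", candidates))
print(suggest("that", candidates))
```

With these made-up numbers, ‘quick nrown’ gets “brown” and ‘that nrown’ gets nothing, which matches what I see on my phone.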

Usually the ones on smartphones also have some kind of adaptive model that tries to learn how *you* in particular misspell things, and that adds words it doesn’t know about once you tell it to accept them.
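
As a very simplified sketch of what "adaptive" could mean, the class below remembers which correction you picked for each typo and which unknown words you told it to keep. The `UserModel` class and its method names are hypothetical; real keyboards are certainly more involved.

```python
from collections import Counter, defaultdict


class UserModel:
    """Per-user memory of accepted words and chosen corrections."""

    def __init__(self):
        self.custom_words = set()
        self.corrections = defaultdict(Counter)

    def learn_word(self, word):
        # The user said this "misspelling" is actually a word they use.
        self.custom_words.add(word.lower())

    def record_correction(self, typed, chosen):
        # The user typed `typed` and then picked `chosen` from the suggestions.
        self.corrections[typed.lower()][chosen.lower()] += 1

    def suggest(self, typed):
        typed = typed.lower()
        if typed in self.custom_words:
            return None  # leave the user's own words alone
        history = self.corrections.get(typed)
        if history:
            return history.most_common(1)[0][0]  # most frequently chosen fix
        return None


model = UserModel()
model.record_correction("wuick", "quick")
model.record_correction("wuick", "quick")
model.record_correction("wuick", "wick")
model.learn_word("Whickham")
print(model.suggest("wuick"))
print(model.suggest("whickham"))
```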

I have an iPhone, and I get very different results from yours with those misspellings in isolation. “umportant” suggests important (and nothing else), “wuick” suggests quick (-> Buick -> wick), “nrown” suggests brown (-> crown -> grown), and “dence” suggests fence. However, it gives very different suggestions in the middle of a sentence (‘dence’ usually becomes either “dance” or “Denise”).
