Words are lowercased and broken into trigrams. Similarity is defined by the number of trigrams two strings have in common. Heuristics:
- Words are padded with blanks (2 at the beginning and 1 at the end) so that characters at word boundaries appear in as many trigrams as interior characters.
- An artificial trigram composed of '#' and the first letter of the word is added to give extra weight to a matching first character.
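
The steps above can be sketched in Python as follows. This is a minimal illustration, not the actual implementation: the function names `trigrams` and `similarity` are hypothetical, words are assumed to be runs of `\w` characters, the artificial first-letter token is assumed to be simply '#' followed by the letter (the exact layout is not specified above), and similarity is taken as the raw count of distinct shared trigrams.

```python
import re


def trigrams(text):
    """Return the set of trigrams for every word in `text` (illustrative helper)."""
    grams = set()
    # Lowercase first, then split into words (assumption: a word is a run of \w chars).
    for word in re.findall(r"\w+", text.lower()):
        # Pad with 2 blanks at the beginning and 1 at the end.
        padded = "  " + word + " "
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
        # Artificial first-letter token; its exact form is an assumption here.
        grams.add("#" + word[0])
    return grams


def similarity(a, b):
    """Count the distinct trigrams shared by `a` and `b`."""
    return len(trigrams(a) & trigrams(b))
```

For example, `similarity("cat", "car")` counts the shared leading trigrams `"  c"` and `" ca"` plus the first-letter token `"#c"`, while `similarity("cat", "dog")` finds nothing in common.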