Fitting a Spell Checker into 64 kB

English has an estimated one million unique words, far more than would fit in the 64 kB of memory available to the spell-checking program on early Unix machines.
To work within this limit, engineers squeezed the dictionary down to 25,000 words with an affix-stripping algorithm and used a Bloom filter for lookups.
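As a rough illustration of the lookup idea, here is a minimal Bloom filter sketch in Python. It is not the original Unix code: the class name, the bit-array size, the number of hash functions, and the use of SHA-256 with double hashing are all assumptions chosen for readability. A Bloom filter may report a word as present when it is not (a false positive), but it never misses a word that was added.

```python
import hashlib


class BloomFilter:
    """Hypothetical minimal Bloom filter; all parameters are illustrative."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One bit per slot, packed into bytes.
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, word):
        # Double hashing: derive all bit positions from two 64-bit
        # slices of a single SHA-256 digest (an assumption, not the
        # hash functions used in the 1970s program).
        digest = hashlib.sha256(word.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, word):
        for pos in self._positions(word):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, word):
        # True only if every bit for this word is set.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(word)
        )


bloom = BloomFilter(num_bits=8192, num_hashes=4)
for stem in ["spell", "check", "unix"]:
    bloom.add(stem)

print("spell" in bloom)  # True: added words are always found
```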
This wasn’t sufficient, so they expanded the dictionary using hash compression, followed by hash differences, and a special compression method that came close to theoretical limits.
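The hash-difference idea can be sketched as follows: hash each word to a fixed-width integer, sort the hashes, and store only the gaps between neighbours, which are much smaller numbers than the hashes themselves and so compress well. This is an illustrative sketch, not the historical implementation. The 27-bit width, the SHA-256-based hash, and all function names are assumptions, and the final step of packing the gaps into a compact bit code is left out.

```python
import bisect
import hashlib

HASH_BITS = 27  # assumed hash width, for illustration only


def word_hash(word, bits=HASH_BITS):
    # Stand-in hash: truncate a SHA-256 digest to `bits` bits.
    digest = hashlib.sha256(word.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % (1 << bits)


def encode_differences(words):
    # Sort the unique hashes, then keep only the gap to the previous one.
    hashes = sorted({word_hash(w) for w in words})
    diffs = []
    prev = 0
    for h in hashes:
        diffs.append(h - prev)
        prev = h
    return diffs


def decode_differences(diffs):
    # A running sum of the gaps restores the sorted hash list.
    hashes = []
    total = 0
    for d in diffs:
        total += d
        hashes.append(total)
    return hashes


def contains(hashes, word):
    # Binary search in the decoded, sorted hash list.
    target = word_hash(word)
    i = bisect.bisect_left(hashes, target)
    return i < len(hashes) and hashes[i] == target


words = ["spell", "check", "unix", "memory"]
diffs = encode_differences(words)
hashes = decode_differences(diffs)
print(all(contains(hashes, w) for w in words))  # True
```

Because only hashes are stored, a misspelled word whose hash collides with a dictionary word would be accepted; a wider hash makes this rarer at the cost of more bits per word.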
While today’s computers have vastly more storage than 1970s machines, the spirit of innovation lives on in modern text compression developments such as large language models.
And they didn’t have to waste time on bullsh*t management concepts like “managed languages,” FuSi, MISRA, and SCRUM.
