OpenOffice All Languages Wordlist
(As it seems like no one did this before... I tried to)
One of the coolest things about OpenOffice (IMHO) is its huge spell-checking database, available for 92 languages. It contains not only the "most popular" words, but also some "mutation rules" (affix rules), which describe the generic word-formation algorithm for each language. This opens up great possibilities for a dictionary attack.
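To give a rough idea of how such rules turn one stem into several words, here is a minimal Python sketch. The `SuffixRule` class and the three rules are made up for illustration (loosely modeled on English past-tense formation). They are not copied from any real .aff file, and real affix files also support prefixes, cross-products and more.

```python
import re
from dataclasses import dataclass


@dataclass
class SuffixRule:
    """Hypothetical, simplified stand-in for one suffix rule of an affix file."""
    strip: str      # characters removed from the end of the stem ("" for none)
    add: str        # characters appended after stripping
    condition: str  # regex that the end of the stem must match


def apply_suffix(stem, rule):
    """Return the derived word, or None if the rule does not apply to this stem."""
    if not stem.endswith(rule.strip):
        return None
    if not re.search(f"(?:{rule.condition})$", stem):
        return None
    base = stem[: len(stem) - len(rule.strip)]
    return base + rule.add


# Illustrative rules only (assumption): walk -> walked, try -> tried, create -> created
RULES = [
    SuffixRule(strip="", add="ed", condition="[^ey]"),
    SuffixRule(strip="y", add="ied", condition="[^aeiou]y"),
    SuffixRule(strip="", add="d", condition="e"),
]


def expand(stem):
    """Return the stem followed by every form the rules above can derive from it."""
    forms = [stem]
    for rule in RULES:
        form = apply_suffix(stem, rule)
        if form is not None:
            forms.append(form)
    return forms
```

With these toy rules, `expand("try")` gives `["try", "tried"]`. A real dictionary applies hundreds of such rules per language, which is why the expanded lists get so much bigger than the .dic files.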
So, all I did was pick the "All Language Pack" from OpenOffice.org and expand it (i.e. use the affix rules to generate a list of all the words a dictionary recognizes) with the unmunch utility from Hunspell. As unmunch behaved a bit... weirdly with some .aff files (namely: ar, eu_ES, gl_ES, he_IL, hu_HU, lt_LT, mn_MN, qu_BO, se), I had to use the non-expanded .dic files for those instead.
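The expansion step could be scripted roughly like this. It is only a sketch, not the exact procedure used for this list. It assumes `unmunch` is on the PATH and is called as `unmunch dic_file aff_file`, and that the files are UTF-8 (many dictionaries declare a different encoding in the `SET` line of their .aff file, so pass that in). The function names are my own, and the "fall back whenever unmunch fails" logic is a simplification: above, the problematic languages were picked by hand.

```python
import subprocess


def read_plain_dic(dic_path, encoding="utf-8"):
    """Read a .dic file without expansion: skip the count line, drop the /FLAGS part."""
    with open(dic_path, encoding=encoding, errors="replace") as fh:
        lines = fh.read().splitlines()
    # The first line of a .dic file normally holds the (approximate) entry count.
    if lines and lines[0].strip().isdigit():
        lines = lines[1:]
    words = []
    for line in lines:
        entry = line.split("\t")[0].strip()   # ignore tab-separated morphological fields
        word = entry.split("/", 1)[0]         # "walk/DG" -> "walk"
        if word:
            words.append(word)
    return words


def expand_dictionary(dic_path, aff_path, encoding="utf-8", timeout=3600):
    """Expand a dictionary with unmunch; fall back to the plain .dic if that fails."""
    try:
        result = subprocess.run(
            ["unmunch", str(dic_path), str(aff_path)],
            capture_output=True,
            check=True,
            timeout=timeout,
        )
    except (OSError, subprocess.SubprocessError):
        # unmunch is missing, crashed, or timed out: use the non-expanded word list.
        return read_plain_dic(dic_path, encoding)
    text = result.stdout.decode(encoding, errors="replace")
    return [word for word in text.splitlines() if word]
```

For example, `expand_dictionary("en_US.dic", "en_US.aff")` (hypothetical file names) would return the expanded word list, or the bare stems from the .dic if unmunch could not be run.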
This generated a set of lists with 127,153,335 unique words, totaling 1,932.2 MB.
And finally, here it is, for all your security testing purposes, packed with 7-Zip into a 167.3 MB file:
- The Pirate Bay (torrent)
- Mininova (torrent)

stas » November 3, 2008 » 14:06




