OpenOffice All Languages Wordlist

(As it seems like no one has done this before... I tried to. :-) )

One of the coolest things about OpenOffice (IMHO) is its huge spell-checking database, available for 92 languages. It contains not only the "most popular" words, but also "mutation rules" (affix rules) that describe the generic word-formation algorithm for each language. This opens up great possibilities for a dictionary attack.
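To make the idea of "mutation rules" a bit more concrete, here is a heavily simplified Python sketch of how suffix rules could expand a stem. It loosely mirrors the strip/add/condition fields of Hunspell-style suffix rules, but the SuffixRule class, the expand function and the sample rules are all made up for illustration; this is not the real Hunspell algorithm, nor the content of any real dictionary.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class SuffixRule:
    """Toy stand-in for one suffix rule (hypothetical helper, not Hunspell's)."""
    strip: str      # characters removed from the end of the stem ("" = none)
    add: str        # characters appended after stripping
    condition: str  # regex the end of the stem must match ("." = anything)


def expand(stem, rules):
    """Return the stem plus every form produced by the rules that apply."""
    forms = {stem}
    for rule in rules:
        # The condition is anchored at the end of the stem.
        if not re.search(rule.condition + "$", stem):
            continue
        if rule.strip and not stem.endswith(rule.strip):
            continue
        base = stem[: len(stem) - len(rule.strip)] if rule.strip else stem
        forms.add(base + rule.add)
    return forms


# Sample rules invented for this example (English-like past tense).
PAST_TENSE = [
    SuffixRule(strip="", add="d", condition="e"),
    SuffixRule(strip="y", add="ied", condition="[^aeiou]y"),
    SuffixRule(strip="", add="ed", condition="[^ey]"),
]

if __name__ == "__main__":
    for word in ("create", "try", "walk"):
        print(word, sorted(expand(word, PAST_TENSE)))
```

A real affix file holds many such rules per language (prefixes too, and rules that combine), which is why a few thousand stems can expand into millions of candidate words.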

So, all I did was pick an "All Language Pack" from OpenOffice.org and expand it (i.e. use the affix rules to generate a list of all recognized words of a dictionary) with the unmunch utility from Hunspell. As unmunch behaved a bit... weirdly with some .aff files (namely: ar, eu_ES, gl_ES, he_IL, hu_HU, lt_LT, mn_MN, qu_BO, se), I had to use the non-expanded .dic files for those languages instead. This generated a set of lists with 127,153,335 unique words, which added up to 1,932.2 MB.
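The fallback to non-expanded .dic files and the final merge into unique words could look roughly like the Python sketch below. It assumes the usual .dic layout (a first line with an entry count, then one entry per line with optional flags after a slash). The function names and the UTF-8 default are assumptions for illustration; real dictionaries declare their encoding in the .aff file, and escaped slashes inside words are not handled here.

```python
def read_dic_words(path, encoding="utf-8"):
    """Yield bare words from a non-expanded .dic file, dropping affix flags."""
    with open(path, encoding=encoding, errors="replace") as handle:
        for number, line in enumerate(handle):
            entry = line.strip()
            if not entry:
                continue
            # The first line normally holds the entry count, not a word.
            if number == 0 and entry.isdigit():
                continue
            # Keep only the part before the flags ("/") or any tab-separated
            # extra fields. Simplification: "\/" escapes are ignored.
            word = entry.split("/", 1)[0].split("\t", 1)[0].strip()
            if word:
                yield word


def merge_wordlists(paths, encoding="utf-8"):
    """Collect the unique words from several .dic-style files into one set."""
    unique = set()
    for path in paths:
        unique.update(read_dic_words(path, encoding))
    return unique


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file names): python merge.py ar.dic he_IL.dic
    words = merge_wordlists(sys.argv[1:])
    print(len(words), "unique words")
```

An in-memory set is fine for a handful of dictionaries; for a list in the range of a hundred million words, an on-disk sort with duplicate removal is probably the more practical route.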

And finally, here it is, for all your security testing purposes, packed with 7-Zip into a 167.3 MB file:

Have a lot of phun :D


stas » November 3, 2008 » 14:06

I heard that Mininova is going to close soon and we won't be able to use it anymore, any thoughts?

Damian (not verified) » February 5, 2009 » 11:47
