database

OpenOffice All Languages Wordlist

(It seems no one had done this before, so I gave it a try.)

One of the coolest things about OpenOffice (IMHO) is its huge spell-checking database, available for 92 languages. It contains not only the "most popular" words but also "mutation rules" (affix rules) that describe the generic word-formation algorithm for each language. This opens up great possibilities for dictionary attacks.

So, all I did was pick the "All Language Pack" from OpenOffice.org and expand it (i.e. use the affix rules to generate a list of all the words a dictionary recognizes) with the unmunch utility from Hunspell. As unmunch behaved a bit... weirdly with some .aff files (namely ar, eu_ES, gl_ES, he_IL, hu_HU, lt_LT, mn_MN, qu_BO, se), I had to use the non-expanded .dic files for those instead. This generated a set of lists with 127,153,335 unique words, adding up to 1,932.2 MB.
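
To illustrate what "expanding" a dictionary means, here is a minimal Python sketch of suffix-rule expansion. It is a simplified model and not a reimplementation of unmunch: the RULES table and the expand() function are made up for this example, the two flags are only loosely modeled on English, and real .aff files have more rule types (prefixes, cross-products, compounding).

```python
import re

# Hypothetical, simplified suffix rules: flag -> list of (strip, add, condition).
# "condition" is a regex fragment that must match the end of the stem.
RULES = {
    "S": [("", "s", "[^sxy]"), ("y", "ies", "[^aeiou]y"), ("", "es", "[sx]")],
    "D": [("", "ed", "[^ey]"), ("", "d", "e")],
}

def expand(stem, flags, rules=RULES):
    """Return the stem plus every form its suffix rules generate, sorted."""
    words = {stem}
    for flag in flags:
        for strip, add, condition in rules.get(flag, []):
            # Skip rules whose condition does not match the end of the stem.
            if not re.search(condition + "$", stem):
                continue
            # Remove the "strip" part (if any) before appending the suffix.
            base = stem[:-len(strip)] if strip else stem
            words.add(base + add)
    return sorted(words)

if __name__ == "__main__":
    for stem, flags in [("walk", "SD"), ("try", "S"), ("box", "S")]:
        print(stem, "->", ", ".join(expand(stem, flags)))
```

A .dic entry such as walk/SD therefore stands for several words at once, which is why the expanded lists are so much larger than the original dictionaries.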

And finally, here it is, for all your security testing purposes, packed with 7-Zip into a 167.3 MB file:

Have a lot of phun!

stas » November 3, 2008 » 14:06

The Pirate Bay un-SSL

Theory

Recently, The Pirate Bay started offering SSL encryption on their server. This means that your ISP can no longer tell which torrent you are downloading, right? Wrong.
HTTPS is of little use for protecting static, public content. By static, I mean the .torrent file itself: it is always the same. By public, I mean that no authentication is needed to fetch the content: it's the same for everyone, crawlers included.
So, one could easily index (a portion of) The Pirate Bay torrent database by Content-Length. Then, one could intercept encrypted traffic between machines on one's own network and the torrents.thepiratebay.org server. Knowing both the (encrypted) request and response lengths, it is possible to get a quite reliable list of matches from the previously indexed torrent list.
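
The matching idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the INDEX entries are invented, the function name candidates() is hypothetical, and the observed response size is assumed to be the plaintext length (headers plus body) with any TLS record overhead already subtracted.

```python
from bisect import bisect_left, bisect_right

# Hypothetical index: (Content-Length in bytes, torrent name), sorted by length.
INDEX = sorted([
    (454, "tiny.torrent"),
    (14312, "movie-a.torrent"),
    (14356, "movie-b.torrent"),
    (91601, "series.torrent"),
])
LENGTHS = [length for length, _ in INDEX]

def candidates(res_size, min_header, max_header):
    """List torrents whose size fits the observed response size.

    The body length must lie between res_size - max_header and
    res_size - min_header, because the header length is only known
    to fall within [min_header, max_header].
    """
    lo = res_size - max_header
    hi = res_size - min_header
    first = bisect_left(LENGTHS, lo)
    last = bisect_right(LENGTHS, hi)
    return [name for _, name in INDEX[first:last]]

if __name__ == "__main__":
    print(candidates(14712, 380, 420))  # narrow header window
    print(candidates(14712, 350, 450))  # wider window, more matches
```

The narrower the header window, the fewer candidates remain, which is why knowing the typical header size matters so much.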

Practice

Don't try this at work, or you might hurt yourself!

  1. Use Wireshark to capture some torrent downloads. Torrents are hosted on a separate server, which makes the task even easier. Just use the following capture filter: "tcp and port 443 and host torrents.thepiratebay.org"
  2. Now, just go with the stream: use "Follow TCP Stream" on the packet you suspect belongs to the torrent download. This will create another filter, such as "(ip.addr eq 192.168.0.10 and ip.addr eq 83.140.176.156) and (tcp.port eq 2157 and tcp.port eq 443)".
  3. Save the displayed stream to a separate file (pcap1.pcap sounds nice).
  4. Now, use my quick & dirty TPB-TLSlen.pl Perl script to get the request/response lengths:
    perl TPB-TLSlen.pl pcap1.pcap
    Yeah, I know, it is nasty. It only supports TLS, and it simply calls tshark (the command-line version of Wireshark) and parses its output.
  5. Now, just paste the REQ and RES values into the form below.
    (note that the REQ value is optional; setting it to 0 simply ignores the request size when matching)
Note that you can fine-tune the maximum and minimum header sizes. For the response, the headers are almost the same all the time; the only things that vary are the decimal representations of the file length and the age. (Un)fortunately, the request headers do vary between browsers and referring pages. However, knowing the request size still helps a bit, especially if the torrent's filename is very long.

Precision

The following size distribution chart was generated using the database with ~165K torrents:

[Figure: torrent size distribution]

The most common torrent size is ~14 KB, and it's easy to figure out that such torrents correspond to the commonly shared 700 MB files.
There's also a major peak at 454-byte torrents. However, bigger torrents are less common, so size detection becomes more precise for them. The average "distance" between torrent sizes is ~44 bytes (at least for the sample I've collected). So, adding a cookie of random size, up to 128 bytes, will disrupt size matching a lot. Disrupting the request size is even easier: the longest torrent URI I've found was 150 bytes. Thus, padding every request URI to 150 characters is enough to make the requests indistinguishable by URI length. Putting the pieces together (the padding add-ons are the random query string appended to the request URI and the p cookie set in the response):
GET /4319199/[a4e]Ghost_in_the_Shell_TV_01-26.4319199.TPB.torrent?nVM2UGfcG533un4ym70eT29r0WwBLYdmFCNN+UTV/hiJ7EAXdFU5KfdWHpkB5lXaCmITsACKOPVyjmpbaOB+CrI5 HTTP/1.1
Host: torrents.thepiratebay.org
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.1) Gecko/2008070208 Firefox/3.0.1
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: https://thepiratebay.org/recent
Cookie: language=pt_BR; country=BR; PHPSESSID=ad6cb7e414c8dc88e0c2444f6215165a

HTTP/1.1 200 OK
Content-Type: application/x-bittorrent
Etag: "2198642509"
Last-Modified: Mon, 28 Jul 2008 22:28:59 GMT
Server: lighttpd
Content-Length: 91601
Date: Mon, 28 Jul 2008 22:37:56 GMT
X-Varnish: 108010229 107999438
Age: 253
Via: 1.1 varnish
Connection: keep-alive
Set-Cookie: p=68eOfxOC7JwBYcMe1RJWC4Z5PV/lJzqJORW8KROPMH9zQhszSjFnRp2tsNWEoyabWAloneUaozMxYtx4hoM9MZUKE/7wGzC3ZKLEZdppG4og3W; expires=Mon, 28-Jul-2008 22:37:56 GMT; path=/; domain=torrents.thepiratebay.org

(binary torrent data)
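
Both padding tricks are easy to prototype. The Python sketch below is only an illustration, not the patch itself: pad_uri() and random_cookie() are hypothetical helper names, the cookie name p is borrowed from the example above, and the 150 and 128 limits are the figures quoted in this post.

```python
import secrets
import string

# URL-safe characters used for the random filler.
ALPHABET = string.ascii_letters + string.digits

def random_text(length):
    """Return `length` random URL-safe characters."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

def pad_uri(uri, target=150):
    """Append a random query string so that the URI is exactly `target` long."""
    missing = target - len(uri) - 1  # one character is taken by the "?"
    if missing < 0:
        return uri  # already at (or beyond) the target length
    return uri + "?" + random_text(missing)

def random_cookie(max_len=128):
    """Build a cookie whose value has a random length from 1 to max_len."""
    return "p=" + random_text(secrets.randbelow(max_len) + 1)

if __name__ == "__main__":
    uri = "/4319199/[a4e]Ghost_in_the_Shell_TV_01-26.4319199.TPB.torrent"
    print(len(pad_uri(uri)), pad_uri(uri))
    print(random_cookie())
```

Every padded request has the same URI length, while every response grows by a different, unpredictable amount.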

Solution

  1. Use constant padding in the .torrent files. This muddles things a bit, but is still ineffective. Its only advantage is that it doesn't require touching the server.
  2. Patch the lighttpd server so that it sends a non-persistent cookie of random size.

Thanks

Encrypted session data

[Interactive form: SSL size, minimum header length and maximum header length, each entered for the request and for the response]

Possible matches

[Result table: The Pirate Bay URL, strlen(URI), torrent size]
Torrents indexed: 1009993

stas » July 31, 2008 » 11:05
database » hack » music » network » perl » php » video » web

CEP/CPF/CNPJ Form


(try filling in the fields above; the information is updated instantly)


This project is the successor to CEP-2-City. It is an online form for Brazilian postal codes (CEP) and taxpayer numbers (CPF for individuals, CNPJ for companies) that:

  • Checks the validity of a CPF number
  • Checks the validity of a CNPJ number
  • Derives the full address (city/state/neighborhood/street) from the CEP
  • Derives the region's DDD area code from the CEP
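
The post does not show how the form validates numbers, so the following Python sketch is an assumption about the approach: it implements the standard CPF check-digit rule (two modulo-11 check digits), and the function name cpf_is_valid() is made up for this example.

```python
import re

def cpf_is_valid(cpf):
    """Check the two trailing check digits of a CPF number."""
    digits = [int(c) for c in re.sub(r"\D", "", cpf)]
    # A CPF has 11 digits; sequences like 111.111.111-11 are rejected too.
    if len(digits) != 11 or len(set(digits)) == 1:
        return False
    for n in (9, 10):
        # Weights run from n + 1 down to 2 over the first n digits.
        total = sum(d * w for d, w in zip(digits[:n], range(n + 1, 1, -1)))
        check = (total * 10) % 11 % 10  # a remainder of 10 counts as 0
        if check != digits[n]:
            return False
    return True

if __name__ == "__main__":
    for sample in ("529.982.247-25", "529.982.247-26", "111.111.111-11"):
        print(sample, cpf_is_valid(sample))
```

CNPJ validation follows the same modulo-11 idea with a different weight sequence; it is left out here to keep the sketch short.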

The database is compiled from several sources. If the street information is missing, only the city is returned. The database interface was implemented in Perl and PHP, and can be accessed via CGI, Flash or AJAX. The lookup is extremely efficient and does not require MySQL. The database is about 60 MB in size, and performance reaches thousands of queries per second.

So, here is a simple, flexible and effective solution for customer registration forms. I have already used this system in a survey I conducted, and I can assure you it saved me a lot of time. To run a query, just access the following URL (substituting the CEP accordingly): http://sysd.org/brloc/brloc.php?cep=05437000, and process the returned string with the parse_str() function (in PHP).
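
For readers who do not use PHP, the same two steps can be sketched in Python. Only the URL pattern comes from this post; the field names in the sample response (uf, cidade, ddd) are guesses made for illustration, and the sketch deliberately parses a hard-coded string instead of contacting the server.

```python
import re
from urllib.parse import parse_qsl

BASE_URL = "http://sysd.org/brloc/brloc.php"

def query_url(cep):
    """Build the lookup URL, keeping only the digits of the CEP."""
    return BASE_URL + "?cep=" + re.sub(r"\D", "", cep)

def parse_response(body):
    """Decode a URL-encoded response, as PHP's parse_str() would."""
    return dict(parse_qsl(body))

if __name__ == "__main__":
    print(query_url("05437-000"))
    # Hypothetical response body; the real field names may differ.
    print(parse_response("uf=SP&cidade=Sao+Paulo&ddd=11"))
```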

If you are interested in the database itself, get in touch!

stas » August 15, 2007 » 20:31

Geolizer HTTP stats

Sample Geolizer output (fragment)

About Geolizer

This is an enhanced version of the popular Webalizer HTTP server statistics generator. Its main feature is the ability to determine a visitor's country from their IP address. The default Webalizer method is to extract the host suffix from a reverse DNS lookup (obtained directly from the log files, or by the webazolver program if the HTTP server doesn't reverse-resolve client IPs), which is slow and imprecise (for example, a Brazilian host could resolve to a .com name). Geolizer relies on the GeoIP library API to do the same thing. Thus, no DNS queries are required, and the results are much more precise. Geolizer also has some additional features: it displays file sizes in a human-readable form (bytes/KB/MB/GB/TB) instead of the default kilobytes. It also compiles under MinGW/MSYS now, so you can process your UN*X log files on your Windows box. And, finally, Geolizer features some nice eye candy: country flags!

Beware: Geolizer also has some bad features (read "bugs"). For example, webazolver no longer works, and already-resolved hosts aren't handled well. Want to see what it looks like? Take a look at some sample statistics! Or see who else uses Geolizer to produce their server stats.
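
As an illustration of the human-readable size display mentioned above, here is a small Python sketch. It is not Geolizer's actual code (Geolizer is a patched Webalizer, written in C); the function name human_size() and the two-decimal rounding are assumptions.

```python
UNITS = ["bytes", "KB", "MB", "GB", "TB"]

def human_size(num_bytes):
    """Format a byte count using the largest unit that keeps it below 1024."""
    size = float(num_bytes)
    unit = UNITS[0]
    for unit in UNITS:
        # Stop at the first unit that fits, or at TB if nothing larger exists.
        if size < 1024 or unit == UNITS[-1]:
            break
        size /= 1024
    if unit == "bytes":
        return f"{int(size)} bytes"
    return f"{size:.2f} {unit}"

if __name__ == "__main__":
    for value in (512, 1536, 5 * 1024 ** 3):
        print(value, "->", human_size(value))
```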

Tips

  • The country flag pictures can be downloaded from http://flags.blogpotato.de/. Just download world.small.zip and special.small.zip and unzip them into the flags/ subdirectory of your HTML output path.
  • You can enhance your Webalizer further (allowing it to identify more user agents, referrers and search engines than usual) with the extended configuration files provided by Enric Naval, available at http://griho.udl.es/webalizer/.
  • It is possible to use multiple configuration files with Webalizer. Just specify them on the command line:
    webalizer -c common.conf -c user_stas.conf
  • Why not also try AWStats and WebDruid?



stas » January 16, 2007 » 14:00

CEP-2-City

A Perl module that returns the name of the (Brazilian) municipality for a given CEP. Example:

#!/usr/bin/perl
use strict;
use warnings;
use CEP;

# initialize
my $cep = new CEP;

# $city will be a reference to an array holding state/city
my $city = $cep->city ('12.437-660'); # only the numeric digits (0-9) are processed

if ($city) {
    # $str will be a string in City/STATE format
    my $str = CEP::city_string ($city);
    # strip the accents and print in upper case
    printf "this CEP belongs to [%s]\n", uc CEP::normalize ($str);
} else {
    print "CEP not found\n";
}

exit;

As the example above shows, this is an object-oriented module. The CEP object instance is created with new CEP. During initialization, an array holding the sorted list of CEP ranges is built (this can take a while, so it is a good idea to create a single instance and reuse it indefinitely). The method that performs the binary search for the CEP and returns the municipality name is city(). Its only parameter is the CEP number itself. Only the numeric digits are taken into account. city() returns a reference to an array containing the state name and the city name, in that order. The module also includes a city_string() subroutine, which takes the array returned by city() and returns a string in "City name/STATE" format. And, finally, when accents are not wanted, they can be filtered out with the normalize() function, which takes a string with accents and returns it without them.
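
The lookup described above (a binary search over sorted CEP ranges, followed by accent stripping) can be sketched in Python as follows. This is an illustration, not a port of CEP.pm: the function names mirror the Perl ones for readability, and the three ranges in RANGES are illustrative placeholders, not data taken from the module.

```python
import bisect
import re
import unicodedata

# Illustrative placeholder ranges: (first CEP, last CEP, state, city),
# sorted by the first CEP. Not real postal data.
RANGES = [
    (1000000, 5999999, "SP", "São Paulo"),
    (20000000, 23799999, "RJ", "Rio de Janeiro"),
    (70000000, 72799999, "DF", "Brasília"),
]
STARTS = [first for first, _, _, _ in RANGES]

def city(cep):
    """Return (state, city) for a CEP, or None if no range contains it."""
    digits = re.sub(r"\D", "", cep)  # only the digits 0-9 are considered
    if len(digits) != 8:
        return None
    number = int(digits)
    # Binary search for the last range that starts at or below the CEP.
    i = bisect.bisect_right(STARTS, number) - 1
    if i < 0:
        return None
    _, last, state, name = RANGES[i]
    return (state, name) if number <= last else None

def city_string(entry):
    """Format a (state, city) pair as "City/STATE"."""
    state, name = entry
    return f"{name}/{state}"

def normalize(text):
    """Strip accents by dropping combining marks after NFD decomposition."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))

if __name__ == "__main__":
    found = city("05.437-000")
    if found:
        print(normalize(city_string(found)).upper())
    else:
        print("CEP not found")
```

Building STARTS once and reusing it matches the advice above about creating a single instance: the sorted array is the expensive part, while each lookup is a cheap binary search.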

P.S. - watch out for the line endings of the CEP.pm file! If the __DATA__ segment is saved with CRLF, the module will behave strangely on UN*X systems!
P.P.S. - This project now has a successor (which can also return street/neighborhood information and even the regional DDD area code)! Check it out!

stas » January 3, 2007 » 17:55

X-Plane key binding

Here I compiled the keyboard mapping for several versions of X-Plane. I wrote a tiny Perl program that simply formats the keys/X-Plane.txt file distributed with X-Plane into a comprehensible HTML table:

(for X-Plane v6.30, take a look here)

stas » May 10, 2006 » 01:51

ASCII code explorer

Designed to be the freakin' best ASCII table viewer for the DOS platform, LOL!!!

ASC.EXE screenshot

It reads the console font bitmaps directly from the BIOS and magnifies them 16 times (the bitmaps are also shown in hex form). The user can navigate the character map with the mouse or the cursor keys. For every character, its ASCII code is displayed in decimal, hexadecimal and binary. One can also build strings of ASCII characters, just like in Windows' "Character Map". The foreground/background colors of the character, the magnified character and the character string are also editable through the GUI (there are 16 colors available for the background, instead of the default 8). It's pretty useless today, but it helped me a lot in developing my older programs. By the way, this ASCII explorer was written in QuickBASIC 4.5...
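
The per-character display can be mimicked in a few lines of Python. This is a rough sketch, not a port of ASC.EXE: describe() and render() are made-up names, and the glyph rows in the usage example are an invented pattern, not bytes read from a real BIOS font.

```python
def describe(char):
    """Return the character code in decimal, hexadecimal and binary."""
    code = ord(char)
    return code, format(code, "02X"), format(code, "08b")

def render(rows):
    """Turn 8-pixel-wide bitmap rows (one byte each) into text lines."""
    return [format(row, "08b").replace("0", ".").replace("1", "#")
            for row in rows]

if __name__ == "__main__":
    print(describe("A"))
    # Invented sample pattern, one byte per pixel row.
    for line in render([0x18, 0x3C, 0x66, 0x7E, 0x66]):
        print(line)
```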

stas » April 20, 2006 » 01:52

GibCounter QW stats

GibCounter is a tiny yet quite useful game statistics generator for QuakeWorld games. It works by parsing the frag*.log files generated by the QuakeWorld game server. If your server doesn't generate such log files by default, you can enable this feature by starting the server as follows:
qw-server +set fraglogfile 1
Of course, you can also edit your server's .cfg files to enable frag logging. You should run GibCounter on the same machine (and, on UN*X systems, as the same user) that runs the game server. If it's a QuakeForge server, GibCounter will locate the log files automatically at $HOME/.quakeforge/qw. For other ports of the QuakeWorld server, or a QuakeForge server running as a different user, you can specify the location of the qw directory manually. To do that, simply pass the directory as an argument to GibCounter (using your favorite command-line shell):
perl gibcounter.pl /home/qserver/.quake/qw
GibCounter writes the generated HTML code directly to STDOUT. So, if you're going to add it to your server's crontab, don't forget to redirect the output to a file! For example, the following crontab line will regenerate the GibCounter game stats every 30 minutes and make them available at the URL http://yourserver.com/~youruser/gibcounter.html (file paths and the crontab format may differ on your system, so ask your system administrator if unsure):
0,30 * * * * perl $HOME/gibcounter.pl > $HOME/www/gibcounter.html
GibCounter is also highly themeable: the CSS style of almost every element can be changed in the gibcounter.css file. Of course, you can also edit the HTML template, which is contained in the Perl source itself.
The game statistics page generated by GibCounter is self-explanatory. At the top, it shows the player ranking, sorted by (guess what?!) the frags they scored. Frags are computed as kills minus suicides. GibCounter also computes how many times each player was killed by others. At the bottom of the generated page, some totals are shown. These include the period of time the stats cover, the top fragger (the best player) and the most fragged (the worst player). Please note that players with default nicknames, such as "unnamed" or "user-#", are automatically excluded from processing, simply to avoid useless information bloat (otherwise many different players would be rated as one, with a very high kill/death count)!
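The scoring rules above can be sketched in Python. This is not GibCounter's code (which is Perl), and it rests on two assumptions: that each frag log line has the form \killer\victim\, and that "user-#" means "user-" followed by a number. The function name tally() is made up for this example.

```python
import re
from collections import Counter

# Default nicknames are skipped; "user-<number>" is an assumed reading of "user-#".
DEFAULT_NICK = re.compile(r"^(unnamed|user-\d+)$")

def tally(lines):
    """Return (ranking, deaths): frags per player and deaths by others."""
    kills, suicides, deaths = Counter(), Counter(), Counter()
    for line in lines:
        # Assumed line format: \killer\victim\ -> ['', killer, victim, '']
        parts = line.strip().split("\\")
        if len(parts) != 4:
            continue
        killer, victim = parts[1], parts[2]
        if DEFAULT_NICK.match(killer) or DEFAULT_NICK.match(victim):
            continue
        if killer == victim:
            suicides[killer] += 1
        else:
            kills[killer] += 1
            deaths[victim] += 1
    players = set(kills) | set(suicides) | set(deaths)
    # Frags are kills minus suicides, as described above.
    frags = {p: kills[p] - suicides[p] for p in players}
    ranking = sorted(frags.items(), key=lambda item: (-item[1], item[0]))
    return ranking, deaths

if __name__ == "__main__":
    sample = ["\\stas\\bob\\", "\\stas\\bob\\", "\\bob\\bob\\", "\\unnamed\\stas\\"]
    ranking, deaths = tally(sample)
    print(ranking, dict(deaths))
```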
By the way, GibCounter preserves the colorization of the graphical font in players' nicknames (the Quake console can print some ASCII characters in white, orange, gold and brown) and translates all symbols into readable ASCII.
So, after all, what does a GibCounter-generated page look like?! See for yourself: here are some example stats. Also, feel free to modify the program itself to fit your own needs!

stas » April 20, 2006 » 01:48

reg3dit

This one looks and feels like the popular "Microsoft ® Registry Editor" (a.k.a. regedit.exe), specifically the one that comes with a default Win2k installation.
It has only one significant difference: it will never show you the following message box when started:

"Registry editing has been disabled by your administrator."

This restriction is supposed to save users from themselves. Well, if you've successfully located an override (like mine), I hope you really know what you're doing! My regedit clone ignores the administrator's restriction, which consists of the following registry patch:
REGEDIT4

[HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Policies\System]
"DisableRegistryTools"=dword:00000001
Then, you can use reg3dit to make all the changes you need (note that on Windows NT/2k/XP and later, some keys will still give you "Access denied", as these OSes use per-user security policies). For example, you can undo that DisableRegistryTools setting and simply go back to using the default regedit.exe.

P.S. - reg3dit has nothing to do with the leaked Win2k source!!! I created it on my own.

stas » April 20, 2006 » 01:16