AOL query data
Here is the AOL data: http://www.gregsadetsky.com/aol-data/ (439 MB)
Site1: http://www.atrus.org/hosted/AOL-data.tgz
Site2: http://aolsearchlogs.cloudsites.com/AOL-data.tgz
Site3: http://sexygeeks.be/AOL-data.tgz
PhantomJS
PhantomJS is a headless browser that can be fully scripted using JavaScript. It is well suited to headless testing of web-based applications, site scraping, page capture, SVG rendering, PDF conversion, and many other uses.
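As a rough illustration of what "scripted using JavaScript" looks like for page capture, here is a minimal sketch. It uses the PhantomJS `webpage` module (`create`, `open`, `render`) and `phantom.exit()`. The target URL and the `outputNameFor` helper are hypothetical, made up for this example. The `typeof phantom` guard is also an assumption of this sketch, so the helper can be loaded outside PhantomJS without error.

```javascript
// Hypothetical helper (not part of PhantomJS): derive a PNG file name from a URL.
function outputNameFor(url) {
  return url
    .replace(/^https?:\/\//, '')      // drop the scheme
    .replace(/[^a-zA-Z0-9]+/g, '_')   // collapse other characters to "_"
    .replace(/^_+|_+$/g, '') + '.png'; // trim stray underscores, add extension
}

// Only run the capture when executed inside PhantomJS,
// where the global `phantom` object exists.
if (typeof phantom !== 'undefined') {
  var page = require('webpage').create();
  var url = 'http://example.com/'; // placeholder URL for illustration

  page.open(url, function (status) {
    if (status === 'success') {
      // Save a screenshot of the loaded page to the working directory.
      page.render(outputNameFor(url));
    } else {
      console.log('Failed to load ' + url);
    }
    phantom.exit(); // always exit, otherwise PhantomJS keeps running
  });
}
```

Saved as, say, capture.js, this would be run with `phantomjs capture.js`. PhantomJS picks the output format from the file extension passed to `render`, so changing the helper to produce a `.pdf` name should give a PDF instead.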
Free Python learning
(1) http://www.swaroopch.com/notes/Python
(2) http://learnpythonthehardway.org/
(3) http://en.wikibooks.org/wiki/Non-Programmer's_Tutorial_for_Python_2.6
(4) http://en.wikibooks.org/wiki/Python_Programming
(5) http://docs.python.org/tutorial/index.html
(6) http://www.greenteapress.com/thinkpython/thinkpython.html