I'm looking at the Bad Data Handbook via Safari Books and was wondering whether the data used in the examples is available on a site somewhere.
Thanks for your interest in the Bad Data Handbook. There doesn't appear to be a dedicated site for the data used in the book, but I did find this at the end of chapter 6, on page 93:
"All of my examples have used NLTK, Python’s Natural Language ToolKit, which you can find at http://nltk.org/. I also train all my models using the scripts I created in nltk-trainer at https://github.com/japerk/nltk-trainer. To learn how to do text classification and sentiment analysis with NLTK yourself, I wrote a series of posts on my blog, starting with http://bit.ly/X9sqWR. And for those who want to go beyond basic text classification, take a look at scikit-learn, which is implementing all the latest and greatest machine learning algorithms in Python: http://scikit-learn.org/stable/. For Java people, there is Apache’s OpenNLP project at http://opennlp.apache.org/, and a commercial library called LingPipe, available at http://alias-i.com/lingpipe/."
The author has also provided a link to his website on oreilly.com, which can be found here: http://www.oreillynet.com/pub/au/1910. You can contact him there to ask whether the data is available.
Customer Service Representative