Probabilistic Programming and Bayesian Methods for Hackers is an open-source online book. The book is developed with IPython, so it can be read in a variety of formats: on the web, as a PDF, or locally with IPython installed.
Contributions are welcome via the book's GitHub repository (or you can email the authors).
This is the first IPython project I have really looked at, and IPython looks very promising.
Are you confused about what Hadoop is? What about HBase, Pig, or Hive? Well, this link will help you out.
Hadoop Toolbox: When to Use What | SmartData Collective.
It provides a nice, short explanation of the following terms:
Recently, both NYU and Columbia launched academic programs in data science. Well, another school in New York City is entering the mix. The City University of New York (CUNY) is now offering an online master's degree in data analytics. If you would like more information, there will be an online information session on May 22.
This looks to be a great webinar! It is today.
Webinar on Data Science, May 14.
Jason and Jeremy Kolb of Applied Data Labs recently released a new book, Secrets of the Big Data Revolution. As of today, it is free on Amazon. I have just started it, and it is good so far. It is only free for a limited time.
This spring, Harvard University ran a data science course. Technically, the name of the course was Stat 221: Statistical Computing and Visualization. The course recently finished, and all of the lecture slides are available.
The slides contain a lot of useful information, and they show one possible layout for a data science course.
If you are looking for public data, Enigma.io is a new startup just for you. Enigma searches for, finds, and connects public data in a variety of formats. The data is then linked and made accessible. Watch the video below for more details.