Coursera is offering the course Mining of Massive Datasets from Stanford University. This is a popular course at Stanford and goes along with the book by the same name. The FREE course starts September 29, 2014, and runs for 7 weeks. The prerequisites are some SQL, algorithms, and data structures knowledge.
Thanks to David Trower for the tip on this course.
Jenna Dutcher, community relations manager for the datascience@berkeley online master’s program, interviewed more than 40 thought leaders to answer this one simple question: What is big data? (Full disclosure: I was honored to be asked to provide a definition for the list.)
The answers are quite diverse and definitely worth reading.
I thought Hal Varian, Chief Economist at Google, provided one of the simplest and best definitions.
Big data means data that cannot fit easily into a standard relational database.
See the full list of What is Big Data?
Which definition is your favorite? How would you define big data?
My previous list of Colleges with Data Science Degrees has grown very large, and numerous people have requested the ability to sort and/or filter. Thus, I built a new list. It is available at: Data Science Colleges. As far as I know, this is the most comprehensive list of data science programs available. Here are some of the features it offers:
- Over 200 Programs
- Certificate, Bachelor's, Master's, and Doctorate programs included
- Sort and Filter Programs
- US and International
- Program Name
- Online Programs
- Ability to download the raw data as CSV or JSON
Yes, you read that last one correctly. All the data is freely available for you. If you do use the data for something, I would love to know and potentially blog about it.
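For anyone who wants to work with the download, here is a minimal Python sketch of sorting and filtering the CSV. Note that the column names (`school`, `program`, `degree`, `online`) and the sample rows are placeholders I made up for illustration; the real file's layout may differ, so adjust the names to match whatever the download actually contains.

```python
import csv
import io

# Hypothetical sample standing in for the downloaded CSV. The column
# names and rows are placeholders, not the real data.
SAMPLE_CSV = """school,program,degree,online
Example University A,Data Science,Masters,yes
Example University B,Analytics,Certificate,no
Example University C,Data Science,Doctorate,no
Example University D,Business Analytics,Masters,no
"""


def load_programs(text):
    """Parse CSV text into a list of dicts, one per program."""
    return list(csv.DictReader(io.StringIO(text)))


def filter_programs(programs, degree=None, online=None):
    """Keep programs matching the given degree and/or online flag."""
    result = []
    for p in programs:
        if degree is not None and p["degree"] != degree:
            continue
        if online is not None and (p["online"] == "yes") != online:
            continue
        result.append(p)
    return result


programs = load_programs(SAMPLE_CSV)

# Filter to one degree level, then sort by school name.
masters = sorted(
    filter_programs(programs, degree="Masters"),
    key=lambda p: p["school"],
)
for p in masters:
    print(p["school"], "-", p["program"])
```

To use the real download, you could replace `SAMPLE_CSV` with the contents of the file (for example, `open("programs.csv").read()`, where the filename is again just a placeholder).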
The list will continue to evolve. If you find any broken links or missing programs, please leave a comment. Also, please leave a comment if you can think of ways to improve the list.
DataKind, the organization dedicated to matching data scientists with data that matters, has just launched 5 new global partners. The five cities are:
- San Francisco
- Washington D.C.
Read the official announcement on DataKind’s Blog – Announcing New Chapters.
Deep learning is the hottest topic in all of data science right now. Adam Gibson, cofounder of Blix.io, has created an open source deep learning library for Java named DeepLearning4j. For those curious, DeepLearning4j is open sourced on GitHub.
Below is a video of Adam introducing deep learning and DeepLearning4j. Also, if you are interested in learning more about deep learning, here are a couple more very helpful links.
The CMU Machine Learning Summer School was a 2-week intensive course focused on machine learning for big data. Some of the top academics in machine learning gave presentations. Most of the videos are fairly long (around 1 hour each), but a whole lot of material is covered.
All the CMU Machine Learning Summer School Videos are on YouTube.
Here is one lecture by Alex Smola on Scalable Machine Learning.
Mode Analytics, a recently launched site for collaborative data science in the cloud, has published an excellent tutorial for learning SQL.
The tutorial is named SQL School.
This is one of the best SQL tutorials I have seen. Plus, it has the huge added advantage of not requiring you to set up your own database first (the data is already available). Setting up your own database can be a bit overwhelming when you are first learning. So, if you are looking to learn SQL, now is a great time to start.