Deep Learning in Java

Deep learning is one of the hottest topics in data science right now. Adam Gibson, cofounder of, has created an open source deep learning library for Java named DeepLearning4j. For those curious, the DeepLearning4j source code is available on GitHub.

Below is a video of Adam introducing deep learning and DeepLearning4j. Also, if you are interested in learning more about deep learning, here are a couple more very helpful links.

CMU Machine Learning Summer School Videos

The summer school was a 2-week intensive course focused on machine learning for big data. Some of the top academics in machine learning gave presentations. Most of the videos are fairly long (around 1 hour each), but they cover a lot of material.

All of the CMU Machine Learning Summer School videos are on YouTube.

Here is one lecture by Alex Smola on Scalable Machine Learning.

Want to Learn SQL? Here is a Great Tutorial!

Mode Analytics, a recently launched site for collaborative data science in the cloud, has published an excellent tutorial for learning SQL.

The tutorial is named SQL School.

This is one of the best SQL tutorials I have seen. Plus, it has the huge added advantage of not requiring you to set up your own database first (the data is already available). Setting up your own database can be a bit overwhelming when you are first learning. So, if you are looking to learn SQL, now is a great time to start.

Stanford Releases Large Network Datasets

Stanford University has just released a collection of large network datasets. When I say network data, I am referring to networks in the mathematical sense (think of a collection of nodes and edges). Here are just a few of the available categories.

  • Citation Networks
  • Road Networks
  • Web graphs
  • Social Networks such as Twitter
  • and many more

If you are looking to study network data, or just want some practice analyzing big data, this just might be a good place to start.
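
To make the "nodes and edges" idea concrete, here is a minimal Python sketch that builds a network from an edge list and counts each node's connections. It assumes a plain-text format with one "source target" pair per line and "#" marking comment lines. The sample data and the function names parse_edge_list and degrees are made up for illustration, so check the format of any real dataset before relying on this.

```python
from collections import defaultdict


def parse_edge_list(lines):
    """Build an undirected adjacency map from 'source target' lines.

    Assumption: blank lines and lines starting with '#' carry no edges.
    """
    adjacency = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        source, target = line.split()[:2]
        # Record the edge in both directions (undirected network).
        adjacency[source].add(target)
        adjacency[target].add(source)
    return dict(adjacency)


def degrees(adjacency):
    """Return the number of neighbors for every node."""
    return {node: len(neighbors) for node, neighbors in adjacency.items()}


# Hypothetical toy network, not taken from any real dataset.
sample = ["# toy graph", "a b", "a c", "b c", "c d"]
graph = parse_edge_list(sample)
print(degrees(graph))
```

For a real file, the same function could be fed the lines of an open file handle. A directed network such as a citation graph would record each edge in one direction only.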

An Organization for Open Data and Healthcare

Health Data Consortium is an advocacy group focused on helping the healthcare industry respond to the availability of health data. They are currently focused on innovation and the uses of open health data.

Healthcare is currently undergoing some radical changes, and data science is going to play a key role in its future. It is great to see the medical field building an official group to define the practice. I hope other industries will follow the lead of the medical field and begin forming their own groups around open data. I am eager to see how the Health Data Consortium progresses over the coming months and years.