SRK.AI

Are you a self-made data scientist? How did you do it?

Original Answer

My original answer to the Quora question "Are you a self-made data scientist? How did you do it?" was deleted under Quora's moderation policies. But I come across this question quite often from people trying to get into data science, so I am copy-pasting the original answer, written in 2016, here.

Please bear in mind that this is an old answer and some aspects may no longer be relevant.

I am a Mechanical Engineering graduate who had no prior knowledge of data science, or for that matter even coding, when I left college six years ago. Now I work as a Lead Data Scientist at a reputed firm and am also one of the top 25 Kaggle Data Scientists in the world.

Though I do not have a formal background in CS, Statistics or Maths, I have had a passion for crunching numbers and finding patterns right from my school days. I think anyone with a strong passion for patterns and numbers, coupled with the right amount of hard work, can become a self-made data scientist. Here is my path:

MOOC Courses

MOOCs played a major role and were the first step in my learning path. The courses that helped me understand the basic concepts are

Some other nice online courses which I came across are

  • Data Science by Harvard Extension - This is a very good course for people wanting to learn the concepts using Python.
  • Data Science and Engineering using Apache Spark by edX - This is a very useful course for people starting with big data analytics.
  • Learning from Data by Caltech - This covers the basic concepts of machine learning.
  • Neural Networks for Machine Learning by Coursera - Interested in knowing about the new kid in town (deep learning)? This course is the perfect place for that, taught by none other than Geoff Hinton himself.

Once I got a fair understanding of the DS concepts from these courses, I was itching to use them somewhere. I was looking for ways to test these theoretical skills. That is when I came across DS / ML competitions.

DS / ML Competitions

I came to know about Kaggle when I was searching for datasets to apply what I had learned. I thought I could ace the competitions easily since I had a good understanding of the basic concepts. Poor me was not aware that hands-on work is a different ball game from theory.

I started doing competitions on Kaggle but ended up in the bottom half of the table in spite of all the hard work. So once the competitions were over, I started looking at how others had solved the problems, using the Kaggle forums and blog. This is where most of my learning took place, and still does.

It also helped me hone my structured thinking in approaching DS problems, and let me work on real-world datasets from different domains, each one challenging in its own way. Working deeper on these problems, I learned something new every time, which helped me improve further.

Starting out directly with Kaggle competitions might be daunting these days since the level of competition is quite high. So one can first work on data science problems on other platforms like Analytics Vidhya Hackathons, Crowdanalytix, Driven Data etc. to gain some confidence before trying Kaggle.

Other Sources

Apart from MOOCs and DS competitions, two important sources that helped me with my learning and understanding of this space are

I follow these two blogs to update my knowledge and to keep myself up to date with advancements in the field. Other resources which I found helpful are

Hope this helps other budding self-made data scientists!

Update in 2021:

Several new courses, hackathon platforms and blogs have come up since the original answer. I am listing some of them which were / are helpful for my learning.

MOOC Courses

Some more good MOOC courses for beginners are:

DS / ML Competition platforms

Some of the well known platforms for data science hackathons are

Blogs

Some more good blogs are

  • Towards Data Science - a Medium publication sharing concepts, ideas and code about data science
  • Blog by Jay Alammar - a good blog that helps us understand machine learning concepts through visualizations
  • Machine Learning Mastery - a very good blog by Jason Brownlee which helps to understand the concepts with Python code
  • Applying ML - blog focussing on applying machine learning in organizations. A good place to learn more about end-to-end machine learning
  • Blog by Eugene Yan - a good blog on how to design, develop, build and operate machine learning systems at scale