The Complete Roadmap for AI, Data Scientist Aspirants

Tek Raj Awasthi
9 min read · Aug 22, 2020

The AI and ML explosion is going to be a game changer. Data Science and Artificial Intelligence are two of the most important technologies in the world today. Data Science makes use of Artificial Intelligence in its operations, but it does not completely represent AI.

Hello there, I’m Tek Raj Awasthi. Currently, I’m working as an Associate Data Engineer at Bungee Technology. You can reach me on LinkedIn, GitHub, Twitter, or Gmail.

Because AI and Data Science are among the most complex domains in IT and Computer Engineering, you can’t get started with them after learning just one, two, or three skills.

So, first I’ll make clear how AI, ML, Data Science, and Deep Learning relate to each other. I won’t define each of them in this article, as you can find those definitions anywhere on the internet. Instead, I’ll guide you on how to start your journey towards AI and Data Science.

At the end of this article, I’ll mention some amazing AI and ML products and projects that have brought (or are going to bring) a revolution in technology. They should motivate you further in this field.

AI is a vast field. Its main sub-fields are Machine Learning, Robotics, and Natural Language Processing, and ML in turn has its own sub-domain, Deep Learning. The heart of all these is Machine Learning: AI is the broad goal, and ML is how most of it is actually implemented in code today. You can understand the AI field more clearly from the figure below.

fig. AI and its sub-fields

And here the roadmap begins.

  1. The first step towards AI and Data Science is not programming, not maths, not even science, but preparing a mindset. Why? Because many students choose the AI and Machine Learning field because of the hype, even though they have no interest in maths. Yes, choosing the hot things in life, a cute girlfriend or a handsome boyfriend, is okay; we can have our choice 😁😂. But the problem is that we don’t think about what’s inside; we get started without any vision. AI, ML, and Data Science won’t be easy unless you make your Maths, Data Structures, and Algorithms strong. You don’t have to be a pro in mathematics to be an AI or ML developer (unless you are going to be a researcher in this field). Just learn the necessary things well. Lots of students give up once they realize how much mathematics matters and that they aren’t good at it. So, make sure you prepare these things well in the 1st, 2nd, and 3rd years of your Engineering or IT studies.
  2. The next step is Python programming. In most engineering and IT colleges, Python is taught in the 3rd year, and some colleges don’t include it in the curriculum at all, while Engineering Mathematics starts from the first year itself. So, I have put Python Programming and Mathematics at the same level: you can start with either one. You can even learn Python on your own in parallel with mathematics.
fig. Complete Roadmap to be a Data Scientist (it took me 1 hour to draw 😂)

Please don’t go into AI and ML without making your programming, Data Structures, and Algorithms strong. You’ll see the importance of these things later. I have clearly mentioned what to learn in Python and Mathematics.

3. The next thing, which is at the heart of the programming field, is Open Source. Get familiar with Git, GitHub, and Jupyter Notebooks because these are among the most used tools.

4. The next step is to learn databases. Everyone is busy learning R or Python for Data Science, but without a database, Data Science is meaningless. Relational databases are used the most. MySQL and PostgreSQL are good relational database choices, and cloud data warehouses such as Amazon Redshift and BigQuery are queried with SQL as well. SQL is the most widely used query language when writing scripts and working with databases as part of ML or data pipelines.

But don’t worry about these heavy words. If you are a beginner, start with SQL and MySQL. These are easy to learn and fundamental too. Later, you will learn more things one by one.
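To give a feel for what "working with a database from a script" looks like, here is a minimal sketch. It assumes SQLite (through Python's built-in sqlite3 module) as a stand-in for MySQL, only because it needs no server setup; the `scores` table and its rows are made up for illustration. The SQL itself would look almost the same in MySQL or PostgreSQL.

```python
import sqlite3

# In-memory SQLite database: a stand-in for MySQL/PostgreSQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical table of students and their exam scores.
cur.execute("CREATE TABLE scores (name TEXT, subject TEXT, score REAL)")
cur.executemany(
    "INSERT INTO scores VALUES (?, ?, ?)",
    [
        ("Asha", "maths", 82.0),
        ("Asha", "python", 91.0),
        ("Bikram", "maths", 67.0),
        ("Bikram", "python", 75.0),
    ],
)
conn.commit()

# Aggregate with plain SQL: average score per student, highest first.
cur.execute(
    "SELECT name, AVG(score) AS avg_score "
    "FROM scores GROUP BY name ORDER BY avg_score DESC"
)
averages = cur.fetchall()
print(averages)
conn.close()
```

The `?` placeholders let the database driver handle quoting, which is a good habit to build from the start.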

5. After this, the next thing to learn is Data Wrangling, which is the first and most important step in any Data Science work. Data Wrangling means collecting data, cleaning data, and exploring data.
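The three parts of data wrangling (collect, clean, explore) can be sketched in a few lines. This is only an illustration using Python's standard library; the CSV content and column names are invented, and in real projects a library such as pandas is the more usual tool for this job.

```python
import csv
import io
from statistics import mean

# Collect: a hypothetical CSV export with messy values (inline so the sketch is self-contained).
raw_csv = """name,age,city
Asha,23,Kathmandu
Bikram,,Pokhara
 chandra ,31,kathmandu
Asha,23,Kathmandu
"""

rows = list(csv.DictReader(io.StringIO(raw_csv)))

# Clean: trim whitespace, normalise capitalisation, parse numbers, drop duplicates.
cleaned = []
seen = set()
for row in rows:
    name = row["name"].strip().title()
    city = row["city"].strip().title()
    age = int(row["age"]) if row["age"].strip() else None  # keep missing ages as None
    key = (name, age, city)
    if key not in seen:
        seen.add(key)
        cleaned.append({"name": name, "age": age, "city": city})

# Explore: simple summary statistics over the cleaned rows.
known_ages = [r["age"] for r in cleaned if r["age"] is not None]
print(len(cleaned), "rows after cleaning")
print("mean age:", mean(known_ages))
print("cities:", sorted({r["city"] for r in cleaned}))
```

Notice that the missing age is kept as `None` instead of being guessed; how to handle missing values is a decision you make per project.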

6. Then learn Data Visualization, which is needed throughout Data Science, mainly in Machine Learning and Data Analysis. Just learn the basics for now: most of the visualization you need can be learned with Matplotlib, and the rest you’ll pick up while working on projects. At the same time, you can learn theoretical AI (the concept of AI agents, searching algorithms, the fields of AI, and more) so that you are clear on AI before you step into Machine Learning.
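As a taste of the searching algorithms covered in theoretical AI, here is a minimal sketch of breadth-first search (BFS), one of the classic uninformed search methods. The graph and the function name `bfs_shortest_path` are made up for this example.

```python
from collections import deque

def bfs_shortest_path(graph, start, goal):
    """Return the path from start to goal with the fewest edges, or None."""
    queue = deque([[start]])   # each queue entry is a whole path
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbour in graph.get(node, []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None

# Hypothetical map: each key lists the places directly reachable from it.
graph = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D", "E"],
    "D": ["F"],
    "E": ["F"],
    "F": [],
}
path = bfs_shortest_path(graph, "A", "F")
print(path)
```

Because BFS explores level by level, the first path that reaches the goal is guaranteed to use the fewest steps.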

7. Then comes the heart of AI and Data Science: Machine Learning, which is the application of AI. Machine Learning is about making machines learn from training data and examples so that they can learn what to do, like a human. An ML model is fed a dataset, which is a collection of data, and is trained on it to find, analyze, and predict things. ML is used everywhere: in AI, in NLP, in Deep Learning, in Computer Vision.

And here Mathematics comes into play. Machine Learning is mostly about Differential Calculus, Linear Algebra, Statistics, and Probability. These are used in the AI and Data Science field along with programming languages like Python and R (even C++ is used in some places). Python is consistently ranked among the most popular programming languages in the world, and the AI and Data Science field mainly uses Python. So, you’ll implement all these mathematical concepts and algorithms using Python or R. In Machine Learning, you’ll use optimized Python libraries like scikit-learn (sklearn), TensorFlow, Keras, and PyTorch while implementing ML projects. Learn the basics of these libraries for now; you’ll learn more day by day while using them in projects. (Contact me for the best TensorFlow tutorials for beginners, with theory and programming. They’re free, but it’s not possible to attach them here.)
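To show where the calculus actually appears, here is a minimal sketch of fitting a straight line with gradient descent in plain Python. The data, the function name `fit_line`, and the learning rate are assumptions chosen for illustration; in a real project a library such as scikit-learn would do this for you.

```python
# Hypothetical training data that follows y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

def fit_line(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y = w*x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Partial derivatives of MSE = (1/n) * sum((w*x + b - y)^2).
        grad_w = (2.0 / n) * sum((w * x + b - y) * x for x, y in zip(xs, ys))
        grad_b = (2.0 / n) * sum((w * x + b - y) for x, y in zip(xs, ys))
        # Step against the gradient to reduce the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

w, b = fit_line(xs, ys)
print(round(w, 3), round(b, 3))
```

The two `grad_` lines are the differential calculus, the sums over the data are the statistics, and with many features the same update is written with vectors and matrices, which is where linear algebra comes in.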

8. Deep Learning is a sub-field of ML. It uses Artificial Neural Networks, which work with very large amounts of data. But before going into Deep Learning, learn Computer Vision first so that you can later apply Deep Learning to it. Image recognition, image classification, and face and object recognition and detection (including in video) all come under Computer Vision, and Machine Learning and Deep Learning are applied to them. You’ll use the Python library OpenCV in computer vision, so learn this library in this step.
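A core operation behind much of computer vision is sliding a small filter (kernel) over an image. Below is a minimal pure-Python sketch of that idea; the tiny "image", the kernel, and the function name `convolve2d` are invented for illustration, and this is not how OpenCV is called. Strictly speaking the loop computes a cross-correlation (the kernel is not flipped), which is what most deep learning libraries call convolution.

```python
def convolve2d(image, kernel):
    """Slide a small kernel over a 2D grid (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output

# Hypothetical 4x4 grayscale image: dark left half (0), bright right half (9).
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]
# Kernel that responds to a dark-to-bright change from left to right.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]
edges = convolve2d(image, vertical_edge)
print(edges)
```

The output is large only in the middle column, exactly where the image changes from dark to bright, so the filter has "detected" the vertical edge.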

9. I have already told you when and why to use Deep Learning in the previous step. Deep Learning is an amazing thing; you’ll just love it. You may even forget Machine Learning algorithms after getting addicted to Artificial Neural Networks. But classical ML algorithms and Deep Neural Networks have different applications, as you’ll see later.
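To make "Artificial Neural Network" less abstract, here is a minimal sketch of a two-layer network computing XOR, a function that a single linear unit cannot represent. The weights are hand-picked for this illustration; a real network learns its weights from data during training, usually with a framework such as TensorFlow or PyTorch.

```python
def step(x):
    """Threshold activation: fire (1) when the weighted input is positive."""
    return 1 if x > 0 else 0

def neuron(inputs, weights, bias):
    """One artificial neuron: weighted sum plus bias, then activation."""
    return step(sum(i * w for i, w in zip(inputs, weights)) + bias)

def xor_network(a, b):
    """Two hidden neurons feeding one output neuron; weights are hand-picked."""
    h_or = neuron([a, b], [1, 1], -0.5)    # fires if a OR b
    h_and = neuron([a, b], [1, 1], -1.5)   # fires only if a AND b
    return neuron([h_or, h_and], [1, -1], -0.5)  # OR but not AND

outputs = [xor_network(a, b) for a in (0, 1) for b in (0, 1)]
print(outputs)
```

Stacking layers like this is what lets neural networks model patterns that simpler models miss, and "deep" learning simply means many such layers.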

10. The next thing you have to learn is Natural Language Processing. It uses Machine Learning and Deep Neural Networks. NLP is bringing revolutions day by day. So, finally, you are an AI Engineer after 10 steps.
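A common first step in NLP is turning text into numbers that an ML model can use. Here is a minimal sketch of a bag-of-words representation using only the standard library; the two sentences and the helper names `tokenize` and `bag_of_words` are made up for this example.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def bag_of_words(text, vocabulary):
    """Count how often each vocabulary word occurs in the text."""
    counts = Counter(tokenize(text))
    return [counts[word] for word in vocabulary]

# Hypothetical mini-corpus.
docs = ["I love machine learning", "Machine learning loves data, data, data!"]
vocabulary = sorted({tok for doc in docs for tok in tokenize(doc)})
vectors = [bag_of_words(doc, vocabulary) for doc in docs]
print(vocabulary)
print(vectors)
```

Each document becomes a vector of word counts, which can then be fed to the ML algorithms from step 7.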

11. The next big thing in the data field apart from AI is Big Data, which you may have guessed already, as this is the era of the internet and enormous amounts of data are produced every second. Big data is a field that treats ways to analyze, systematically extract information from, or otherwise deal with data sets that are too large or complex to be dealt with by traditional data-processing application software.

Big Data Engineers work closely with Machine Learning and AI engineers. Big data involves many tools and libraries; two of the best known are Apache Hadoop and Apache Spark. Both are open-source frameworks for big data processing, with some key differences. So you have to learn these things to work with Big Data.
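The programming model that Hadoop made popular is MapReduce: split the data into chunks, process each chunk independently (map), group the results by key (shuffle), and combine them (reduce). Here is a minimal single-process sketch of that idea in plain Python. It is not the Hadoop or Spark API, and the text chunks and function names are invented for illustration.

```python
from collections import defaultdict

def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]

def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Reduce: combine the values for each key into a single result."""
    return {key: sum(values) for key, values in grouped.items()}

# Hypothetical chunks; on a real cluster each would live on a different machine.
chunks = ["big data big ideas", "data beats opinions", "big clusters"]
pairs = [pair for chunk in chunks for pair in map_phase(chunk)]
word_counts = reduce_phase(shuffle_phase(pairs))
print(word_counts)
```

Because each chunk is mapped independently, the map step can run on many machines at once, which is what makes the approach scale to data that does not fit on one computer.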

Finally, you are a complete Data Scientist after learning and working with all these things.

The game-changing applications of AI that exist or are about to land in the market are:

  1. OpenAI’s GPT-3 is an autoregressive language model that uses deep learning to produce human-like text. It is the third-generation language prediction model in the GPT-n series created by OpenAI, a San Francisco-based artificial intelligence research laboratory. GPT-3 is expected to reduce (but not completely replace) the work of software engineers, as it can generate programming code from a user’s instructions.
  2. Tesla’s self-driving car uses a combination of sensors, cameras, radar, and artificial intelligence (AI) to travel between destinations without a human operator.
  3. IBM’s Deep Blue supercomputer defeated world chess champion Garry Kasparov in May 1997. Two decades later, DeepMind said the difference between its AlphaZero and competing chess engines is that its machine-learning approach is given no human input apart from the basic rules of chess.
  4. AlphaGo goes where no machine has gone before. Gameplay has long been a chosen method for demonstrating the abilities of thinking machines, and the trend continued to make headlines in 2016 when AlphaGo, created by DeepMind (now a Google subsidiary), defeated world Go champion Lee Sedol by 4 games to 1 in a five-game match.
  5. Facebook’s TransCoder is an AI source-to-source compiler that translates code between C++, Java, and Python.
  6. Microsoft Math uses optical character recognition (OCR) for handwriting to extract a math equation from a student’s photo of their notes.
  7. Microsoft’s Pix2Story uses Natural Language Processing (NLP) for storytelling. AI scans a picture, applies a writing style, and generates a story — demonstrating how AI can drive creativity.
  8. Microsoft’s Sketch2Code converts hand-drawn sketches to HTML prototypes. Designers share ideas on a whiteboard, then changes are shown instantly in the browser, helping improve collaboration between the designer, developer, and customer.
  9. Microsoft’s Celebs Like Me uses facial recognition to match the user’s photo to similar-looking celebrities. Powered by a Deep Neural Net (DNN) model, it was trained using Bing Satori Knowledge Graph and Bing Image Graph.

And many more. Later I’ll publish learning materials, the best tutorials, and project ideas in this field if you need them.

Thank you for reading.

If you need proper guidance in the AI and Data Science field, you can connect with me on LinkedIn. Soon, we are planning to start a Data Science Community (not finalized yet) to mentor and guide learners and developers in this field throughout their journey, with proper learning materials and project guidance provided step by step (following this roadmap).

We are planning this initiative because I myself had a very bad experience as a beginner in this domain: I didn’t find a mentor in time who could guide me on how to go ahead. It took me around 1 year to get a clear idea of the roadmap. In the beginning, I was randomly learning AI and Machine Learning, and randomly doing Python programming. I could not figure out how to do the programming in AI and ML projects, when and where maths is used, or when and where to use Deep Learning and Computer Vision. And it’s not only my experience. Many beginners go through the same thing, because AI and Data Science are a combination of Maths, Science (robotics), and Programming, and I couldn’t find articles or videos that guide you through everything in this much depth.

That’s why this field is challenging, and with all the hype and competition you’ll go the way of the dinosaurs if you don’t build a strong foundation. Remember, AI and ML are not a piece of cake: even the tech and finance industries haven’t fully succeeded in deploying AI, ML, and Robotics as of now, because moving to automation is challenging, and game-changing too.

Thank you. Happy Learning. Stay blessed.

Never give up on your goals without trying all possible ways. Else you just don’t want to make it happen.

Tek Raj Awasthi

Data Engineer @Bungee Tech | AI & Machine Learning Developer & Researcher(self) | Positivity content Writer