Data Scientist Checklist

Preparing Data Science Arsenal

  • Scripting Language: Python
  • Data Visualization: Matplotlib, Pandas built-in visualization, Seaborn, Plotly and Cufflinks, Geographical Plotting, Tableau/D3.js
  • Data Analysis and Manipulation: NumPy, Pandas
  • Linear & Logistic Regression
  • Bias-Variance Trade-off
  • K Nearest Neighbors
  • Decision Trees & Random Forest
  • Support Vector Machines
  • K Means Clustering
  • Principal Component Analysis
  • Recommender Systems
  • Natural Language Processing
  • SQL / Relational Database
  • Big Data, Spark & Hadoop
  • Neural Networks & Deep Learning
  • AWS
  • Web Scraping: Beautiful Soup, Scrapy, HTML/CSS

Trust me, this list is daunting for me as well. Maybe that is why I keep calling this journey a challenge. I put this list together after thorough research of the field. These are the tools, concepts, and libraries I plan to dive into during this 100-day challenge. For every topic, I will share resource material and exercises where possible.



Comments

  1. Hello sir, I am really interested in doing data science. I have been searching for the required materials and concepts. Fortunately, I read your blog.
    Can you please post the schedule and resource materials for the remaining days? I would be thankful to you, sir.




Who am I?

Hey everyone! Abdul Samad here. I am a Computer Engineering student at GIKI, Pakistan. Focused, driven, realistically optimistic, and an aspiring Data Scientist: that best describes me. I am hoping to graduate this June. I will keep my intro short. The purpose of this blog is to document my journey of becoming a Data Scientist. I am up against a challenge: a 100-day challenge to become a Data Scientist. Is it even possible? Let's find out together :D