Data Scientist in 100 days

Posts

Day 4 : Pandas and Jupyter Notebook

The name sounds funny, but Pandas is one of the most useful Python libraries for data science. Most of a data scientist's time is spent cleaning and manipulating data. Pandas provides useful data structures such as Series and DataFrame. These are easy to manipulate, and with them the possibilities are limitless. Installing Pandas with pip: just type this in your terminal/shell: pip install pandas Kudos. Now you have pandas installed. Next -> Start exploring pandas. Resource: Pandas for beginners. The next thing we are going to use is Jupyter Notebook. The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text. Uses include: data cleaning and transformation, numerical simulation, statistical modeling, data visualization, machine learning, and much more (Jupyter.org). This link will guide you through using Jupyter Notebooks. after you ar...
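To make Series and DataFrame a little more concrete, here is a minimal sketch of the kind of cleaning and manipulation mentioned in the Day 4 post. The column names and values are made up for illustration and are not from the original post; the code assumes pandas is already installed with pip.

```python
import pandas as pd

# A Series is a one-dimensional labeled array.
temperatures = pd.Series([21.5, 23.0, 19.8], index=["Mon", "Tue", "Wed"])

# A DataFrame is a table whose columns share one index.
# The cities and sales figures below are hypothetical sample data.
df = pd.DataFrame(
    {
        "city": ["Lahore", "Karachi", "Lahore", "Islamabad"],
        "sales": [120.0, None, 95.0, 130.0],
    }
)

# A typical cleaning step: fill the missing value with the column mean.
df["sales"] = df["sales"].fillna(df["sales"].mean())

# A typical manipulation step: total sales per city.
totals = df.groupby("city")["sales"].sum()

print(temperatures["Tue"])
print(totals)
```

You can paste this into a Jupyter Notebook cell or run it as a plain script; either way it is only a starting point for exploring pandas.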

Day 3: Using Pip and Numpy

Q: Why is Python so popular? Ans: A good community and wonderful packages for literally everything. Q: What is pip? Ans: To install and manage those wonderful packages in Python we use a tool called pip. If you have Python 2 >=2.7.9 or Python 3 >=3.4 installed from python.org, you will already have pip and setuptools, but will need to upgrade to the latest version. On Linux or macOS: pip install -U pip setuptools On Windows: python -m pip install -U pip setuptools NumPy: The first package that we are going to install is NumPy. Consider it a core requirement for learning data science. Just type this in your terminal/shell: pip install numpy And that's it. You have successfully installed the numpy package. One of the best resources to get started with NumPy is this video by DataCamp. Enjoy
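Once pip install numpy has finished, a quick way to check the Day 3 installation is to import the package and try a few array operations. This is a minimal sketch with made-up sample values, not something taken from the linked video.

```python
import numpy as np

# Printing the version confirms that the package can be imported.
print(np.__version__)

# Arrays support element-wise arithmetic without explicit loops.
heights_cm = np.array([160.0, 172.5, 181.0])  # hypothetical sample values
heights_m = heights_cm / 100

# Reshape a range of six integers into a 2x3 matrix
# and sum down each column.
matrix = np.arange(6).reshape(2, 3)
column_sums = matrix.sum(axis=0)

print(heights_m)
print(column_sums)
```

If the import fails, the package was most likely installed for a different Python interpreter than the one running the script.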

Day 2 : Starting with Python For Data Science

Setting up the environment: First of all you need to install Python. Download Python from here. Now that we have installed Python on our computer, we need to get an IDE for writing code. Personally I prefer PyCharm by JetBrains. You may also use Atom or another editor. Download PyCharm Community Edition from here. If you know programming, this part shouldn't be very difficult for you. You can skim through the notes, get familiar with the syntax, and even write a few snippets of code to get comfortable. However, if you do not know programming in any other language, I would suggest that you first complete a programming course before going ahead. If you want to build your basics and understanding of computer science and programming, I would suggest completing the CS50 course by Harvard. Read the Python online guide, click here (part 1 to 4.8 is enough). This should give you a basic understanding of Python. I my s...
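If you want a first snippet to type into PyCharm while getting comfortable with the syntax, something like the sketch below is enough. The function name and the numbers are purely illustrative and do not come from the guide linked in the Day 2 post.

```python
def describe(numbers):
    """Return the count, total, and mean of a list of numbers."""
    total = 0
    for n in numbers:  # a basic for loop
        total += n
    # Conditional expression guards against dividing by zero.
    mean = total / len(numbers) if numbers else 0.0
    return {"count": len(numbers), "total": total, "mean": mean}


summary = describe([3, 5, 10])
print(summary)
```

It touches functions, loops, conditionals, and dictionaries, which is roughly the ground a beginner needs before moving on to data science packages.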

Day 1 : Gather the resources

Hey there! Fortunately, we are starting at a time of the year when the Black Friday sale is on. Udemy is offering huge discounts on online courses. I would highly encourage you to buy the course stated below. Just last week it was priced at about 195 Pounds; right now you can get it for 10 Pounds. Trust me, it's worth it. I am not affiliated with Udemy, so I am not getting any money out of it. I don't care if you buy it or get it from somewhere else 😜. It's totally up to you. Python-for-data-science-and-machine-learning-bootcamp. Building Machine Learning Systems with Python by Richert and Coelho is also a very good resource, especially for beginners. I will be using both of these resources during this journey. PyCharm IDE: I would also suggest that you install the PyCharm IDE. It is created by JetBrains specifically for Python. You can get the Community Edition for free. Kudos.

Data Science Is An Amalgamation

Data is useless unless it is processed and some useful information is extracted from it. For the same reason, a data scientist should not only be able to clean and extract data; they should produce useful information out of heaps of data. With that being said, another proposition that you might have heard but may not have completely understood is that "Data Science is a multidisciplinary field". To be effective, one has to juggle many subjects. It employs techniques and theories drawn from many fields within the broad areas of Statistics and Computer Science, in particular from the sub-domains of Machine Learning, Classification, Databases, Visualization and Big Data technologies. Data scientists use their data and analytical ability to find and interpret rich data sources; manage large amounts of data despite hardware, software, and bandwidth constraints; merge data sources; ensure consist...

Who am I?

Hey everyone! Abdul Samad here. I am a Computer Engineering student at GIKI, Pakistan. Focused, driven, realistically optimistic, and an aspiring Data Scientist: that best describes me. I am hoping to graduate this June. I will keep my intro short. The purpose of this blog is to document my journey of becoming a Data Scientist. I am up against a challenge: a 100-day challenge to become a Data Scientist. Is it even possible??? Let's find out together :D