Course Description
Data Science is the study of the generalizable extraction of knowledge from data. Being a data scientist requires an integrated skill set spanning mathematics, statistics, machine learning, databases and other branches of computer science along with a good understanding of the craft of problem formulation to engineer effective solutions. This course will introduce students to this rapidly growing field and equip them with some of its basic principles and tools as well as its general mindset. Students will learn concepts, techniques and tools they need to deal with various facets of data science practice, including data collection and integration, exploratory data analysis, predictive modelling, descriptive modelling, data product creation, evaluation, and effective communication. The focus in the treatment of these topics will be on breadth, rather than depth, and emphasis will be placed on integration and synthesis of concepts and their application to solving problems.
Learning Outcomes
 Describe what Data Science is and the skill sets needed to be a data scientist.
 Explain in basic terms what Statistical Inference means. Identify probability distributions commonly used as foundations for statistical modelling. Fit a model to data.
 Use Python to carry out basic data analytics and machine learning applications.
 Describe the basics of AI and some Machine Learning algorithms.
 Understand data visualization tools, with special emphasis on Hadoop.
Target Audience
The course is suitable for upper-level undergraduate students in computer science, mathematics, or any science stream.
Prerequisites
Students are expected to have basic knowledge of mathematics and programming logic. If you are interested in taking the course but are not sure whether you have the right background, talk to the instructors. You may still be allowed to take the course if you are willing to put in the extra effort to fill in any gaps.
Course work
The course consists of:
 Lectures
 Problem Solving
 Assignment/Case Study submissions
Topics and course outline:
1. Introduction to Data Science: What is Data Science?
 Data Science Components
 Data Mining and Data Warehousing
 Big Data and Data Science hype
 The Data Science Process
 Data Science and Ethical Issues
 Discussions on privacy, security, ethics
 A look back at Data Science
 Data Science Job Roles in present era
 Next-generation data scientists
 Tools for Data Science
2. Probability and Statistics for Data Science
 Population and samples
 Statistical Inference: modelling, probability distributions, fitting a model
 Descriptive Statistics
 Probability Distributions
 Inferential Statistics through hypothesis tests
 Permutation & Randomization Test
 Correlation and Regression
 ANOVA
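As an illustration of the permutation test listed above, here is a minimal sketch in plain Python. The function name `permutation_test`, the sample values, and the number of shuffles are assumptions made for this example and are not part of the course material. It estimates a two-sided p-value for a difference in means by repeatedly shuffling the pooled observations.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=5000, seed=0):
    """Estimate a two-sided p-value for a difference in group means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Under the null hypothesis the group labels are interchangeable,
        # so reshuffle the pooled data and split it again.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    return extreme / n_permutations


# Hypothetical measurements for two groups (illustrative values only).
scores_a = [12.1, 11.8, 12.5, 12.0, 11.9, 12.3]
scores_b = [15.2, 14.9, 15.5, 15.1, 14.8, 15.3]

p_value = permutation_test(scores_a, scores_b)
print(f"Estimated p-value: {p_value:.4f}")
```

A small p-value suggests that a difference as large as the observed one would rarely arise from random relabelling alone.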
3. Python for Data Science
 Introduction to Python in Data Science
 Environment setup/ Working with Google Colab
 Basic Data types
 Data Structures: arrays, strings, lists, dictionaries.
 Data Operations: slicing and string operations.
 Iteration and Conditional Constructs.
 Overview of Python libraries
 Familiarization with important Python libraries: NumPy, pandas, Matplotlib.
 Exploratory Analysis in Python.
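The basic constructs in this unit (data structures, slicing, string operations, iteration, and conditionals) can be sketched in a few lines of standard Python. All names and values below are hypothetical and chosen only for illustration.

```python
# Basic data structures (illustrative values only).
course = "Data Science"                     # string
marks = [72, 85, 64, 91, 58]                # list
student = {"name": "Asha", "marks": marks}  # dictionary

# Slicing works on both lists and strings.
first_three = marks[:3]
last_two = marks[-2:]
first_word = course[:4]

# Common string operations.
upper = course.upper()
words = course.split()

# Iteration combined with a conditional construct.
grades = []
for m in marks:
    if m >= 80:
        grades.append("A")
    elif m >= 60:
        grades.append("B")
    else:
        grades.append("C")

average = sum(student["marks"]) / len(student["marks"])
print(first_three, last_two, first_word, upper, words, grades, average)
```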
4. Artificial Intelligence and Machine Learning
 Introduction to Artificial Neural Network
 Biological Neurons vs Artificial Neurons
 Learning theory in ML
 Supervised Learning Classification: KNN and SVM
 Unsupervised Learning Clustering: K-means
 Reinforcement Learning
 Introduction to Deep neural networks
 DNN types
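To make the KNN classifier named in this unit concrete, here is a minimal sketch using only the Python standard library. The function `knn_predict`, the toy points, and the labels are assumptions made for this example. A practical course exercise would more likely use a library implementation.

```python
from collections import Counter
from math import dist  # Euclidean distance, available in Python 3.8+


def knn_predict(train_points, train_labels, query, k=3):
    """Predict a label for `query` by majority vote among its k nearest points."""
    # Sort the training examples by their distance to the query point.
    neighbours = sorted(
        zip(train_points, train_labels),
        key=lambda pair: dist(pair[0], query),
    )
    top_labels = [label for _, label in neighbours[:k]]
    return Counter(top_labels).most_common(1)[0][0]


# Hypothetical two-dimensional training data forming two clusters.
points = [(1.0, 1.1), (1.2, 0.9), (0.8, 1.0), (5.0, 5.2), (5.1, 4.9), (4.8, 5.0)]
labels = ["low", "low", "low", "high", "high", "high"]

prediction_low = knn_predict(points, labels, (1.1, 1.0))
prediction_high = knn_predict(points, labels, (4.9, 5.1))
print(prediction_low, prediction_high)
```

The choice of k trades off sensitivity to noise (small k) against blurring of class boundaries (large k). The value 3 here is arbitrary.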
5. Introduction to Data Analytics and Data Visualization Tools
 Introduction to Big Data
 Applications of Big Data
 Data Analytics (DA)
 Steps in Data Analytics
 Challenges for Big Data Analytics
 Ideas and tools for Data Visualization: with special reference to Hadoop.
