Milav Dabgar
Author
Experienced lecturer in the electrical and electronic manufacturing industry. Skilled in Embedded Systems, Image Processing, Data Science, MATLAB, Python, STM32. Strong education professional with a Master’s degree in Communication Systems Engineering from L.D. College of Engineering - Ahmedabad.

GUJARAT TECHNOLOGICAL UNIVERSITY (GTU)

Competency-focused Outcome-based Green Curriculum-2021 (COGC-2021)

Semester -IV

Course Title: Fundamentals of Machine Learning

(Course Code: 4341603)

| Diploma programme in which this course is offered | Semester in which offered |
| --- | --- |
| Information Technology | 4th semester |

1. RATIONALE

Machine learning focuses on the use of data and algorithms to learn in a way similar to how humans learn. To solve current problems in the IT domain, it is important to understand the need for machine learning and to apply machine learning methods efficiently. Every student of Information Technology must therefore understand the blueprints of machine learning approaches and be able to apply learning methods to available datasets. This course helps students build core competencies in understanding machine learning approaches, so that they can design and train machine learning models for various use cases.

2. COMPETENCY

The purpose of this course is to help the student to attain the following industry identified competency through various teaching-learning experiences:

● Develop appropriate machine learning algorithms for problem solving

3. COURSE OUTCOMES (COs)

The practical exercises, the underpinning knowledge, and the relevant soft skills associated with this competency are to be developed in the student to display the following COs:

The student will develop the underpinning knowledge and the programming skills needed to implement various applications in the Python programming language, so as to attain the following course outcomes:

  • a) To understand the need of machine learning for various problem solving.
  • b) Prepare a machine learning model and learn the evaluation methods.
  • c) Evaluate various supervised learning algorithms using appropriate dataset.
  • d) Evaluate various unsupervised learning algorithms using appropriate dataset.
  • e) To understand the use of various existing machine learning libraries.

4. TEACHING AND EXAMINATION SCHEME

Teaching Scheme (in hours): L, T, P. Examination Scheme (in marks): Theory CA and ESE, Practical CA and ESE.

| L | T | P | Total Credits C (L+T/2+P/2) | Theory Marks CA | Theory Marks ESE | Practical Marks CA | Practical Marks ESE | Total Marks |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 3 | - | 4 | 5 | 30 | 70 | 25 | 25 | 150 |

(*): Out of 30 marks under the theory CA, 10 marks are for assessment of the micro-project to facilitate integration of COs, and the remaining 20 marks are the average of 2 tests taken during the semester to assess attainment of the cognitive domain UOs required for the attainment of the COs.

Legends: L - Lecture; T - Tutorial/Teacher Guided Theory Practice; P - Practical; C - Credit; CA - Continuous Assessment; ESE - End Semester Examination.

5. SUGGESTED PRACTICAL EXERCISES

The following practical outcomes (PrOs) are the subcomponents of the COs. These PrOs need to be attained to achieve the COs.

| S. No. | Practical Outcomes (PrOs) | Unit No. | Approx. Hrs. required |
| --- | --- | --- | --- |
| 1 | Numerical computing with Python (NumPy, Matplotlib) | I | 04 |
| 2 | Introduction to Pandas for data import and export (Excel, CSV etc.) | VI | 04 |
| 3 | Basic introduction to Scikit-learn | VI | 04 |
| 4 | Implement the Find-S concept learning algorithm that finds the most specific hypothesis that is consistent with the given training data. Conditions: the hypothesis can only be a conjunction (AND) of literals. Literals are either attributes or their negations. | II | 06 |
| 5 | Import the Pima Indian diabetes data. Apply SelectKBest and chi2 for feature selection. Identify the best features. | II | 04 |
| 6 | Write a program to learn a decision tree and use it to predict class labels of test data. Training and test data will be explicitly provided by the instructor. Tree pruning should not be performed. | III | 04 |
| 7 | ML project using the music.csv dataset listed below this table. a. Store the file as music.csv and import it into Python using pandas. b. Prepare the data by splitting it into input (age, gender) and output (genre) data sets. c. Use the decision tree model from sklearn to predict the genre for various age groups (e.g. a male of age 21 likes HipHop whereas a female of age 22 likes Dance). d. Calculate the accuracy of the model. e. Vary the training and test sizes to check the different accuracy values the model achieves. | III | 06 |
| 8 | Write a program that uses K-nearest neighbor to predict class labels of test data. Training and test data must be provided explicitly. | IV | 04 |
| 9 | Import vgsales.csv from the Kaggle platform. a. Find rows and columns in the dataset. b. Find basic information regarding the dataset using the describe command. c. Find values using the values command. | IV | 04 |
| 10 | Project on regression. a. Import home_data.csv from Kaggle using pandas. b. Understand the data by running the head, info and describe commands. c. Plot the price of the house with respect to area using the matplotlib library. d. Apply a linear regression model to predict the price of the house. | IV | 06 |
| 11 | Write a program to cluster a set of points using K-means. Training and test data must be provided explicitly. | V | 04 |
| 12 | Import the Iris dataset. a. Find rows and columns using the shape command. b. Print the first 30 instances using the head command. e. Plot the univariate graphs (box plot and histograms). f. Plot the multivariate plot (scatter matrix). g. Split the data to train the model on 80% of the data values. h. Apply K-NN and K-means clustering to check accuracy and decide which is better. | V | 06 |
| Total | | | 56 |

Dataset for PrO 7 (music.csv):

| age | gender | genre |
| --- | --- | --- |
| 20 | | HipHop |
| 23 | | HipHop |
| 25 | | HipHop |
| 26 | 1 | Jazz |
| 29 | 1 | Jazz |
| 30 | 1 | |
| 31 | 1 | Classical |
| 33 | 1 | Classical |
| 37 | | Classical |
| 20 | | Dance |
| 21 | | Dance |
| 25 | | Dance |
| 26 | | Acoustic |
| 27 | | Acoustic |
| 30 | | Acoustic |
| 31 | | Classical |
| 34 | | Classical |
| 35 | | Classical |
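
PrO 4 constrains Find-S to hypotheses that are conjunctions of literals, where each literal is an attribute or its negation. A minimal sketch of that idea for boolean attributes follows. The function names, the attribute meanings (sunny, warm, windy) and the training rows are hypothetical and only illustrate the procedure; they are not taken from the curriculum.

```python
from typing import List, Optional, Sequence, Tuple

# Per-attribute constraint in the conjunction:
#   True  -> the literal "attribute"
#   False -> the literal "NOT attribute"
#   None  -> the attribute has been dropped from the conjunction
Hypothesis = List[Optional[bool]]
Example = Tuple[Sequence[bool], bool]  # (attribute values, class label)


def find_s(examples: Sequence[Example]) -> Optional[Hypothesis]:
    """Return the most specific conjunction covering all positive examples.

    Returns None when there is no positive example, i.e. the hypothesis
    that rejects every instance.
    """
    hypothesis: Optional[Hypothesis] = None
    for attributes, label in examples:
        if not label:
            continue  # Find-S ignores negative examples
        if hypothesis is None:
            # First positive example: keep one literal per attribute.
            hypothesis = list(attributes)
            continue
        for i, value in enumerate(attributes):
            # Drop any literal that this positive example contradicts.
            if hypothesis[i] is not None and hypothesis[i] != value:
                hypothesis[i] = None
    return hypothesis


def covers(hypothesis: Optional[Hypothesis], attributes: Sequence[bool]) -> bool:
    """True if the conjunction classifies the instance as positive."""
    if hypothesis is None:
        return False
    return all(c is None or c == v for c, v in zip(hypothesis, attributes))


# Hypothetical training data: (sunny, warm, windy) -> enjoys sport?
examples = [
    ((True, True, False), True),
    ((True, True, True), True),
    ((False, True, True), False),
]
hypothesis = find_s(examples)
print(hypothesis)  # [True, True, None], i.e. "sunny AND warm"
```

Because Find-S never looks at negative examples, it is worth checking afterwards that `covers` returns False for each of them; that only holds when the target concept is itself expressible as such a conjunction.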

Note

  • i. More Practical Exercises can be designed and offered by the respective course teacher to develop the industry relevant skills/outcomes to match the COs. The above table is only a suggestive list.
  • ii. The following are some sample ‘Process’ and ‘Product’ related skills (more may be added/deleted depending on the course) that occur in the above listed Practical Exercises of this course and that are embedded in the COs and ultimately the competency.
| S. No. | Sample Performance Indicators for the PrOs | Weightage in % |
| --- | --- | --- |
| 1 | Using the existing Python libraries through Python Jupyter notebook. | 25 |
| 2 | Use Python to read a dataset and modify it as per requirement. | 20 |
| 3 | Selecting an appropriate machine learning method. | 25 |
| 4 | Train and test the model by importing an existing data set. | 10 |
| 5 | Making predictions and improving learning parameters as well as accuracy. | 20 |
| Total | | 100 |
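
Indicators 4 and 5 (train and test a model, then make predictions and measure accuracy) can be illustrated with the K-nearest neighbor classifier from PrO 8, where training and test data are provided explicitly. The sketch below uses only the standard library so that it runs without extra installs; in the lab, scikit-learn's `KNeighborsClassifier` would typically be used instead. All data points and labels are hypothetical.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_X: Sequence[Point], train_y: Sequence[str],
                query: Point, k: int = 3) -> str:
    """Predict the label of `query` by majority vote among its k nearest neighbours."""
    nearest = sorted(zip(train_X, train_y),
                     key=lambda pair: euclidean(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical, explicitly provided training and test data.
train_X: List[Point] = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5),
                        (6.0, 6.0), (6.5, 7.0), (7.0, 6.5)]
train_y = ["A", "A", "A", "B", "B", "B"]
test_X: List[Point] = [(1.2, 1.4), (6.8, 6.2), (2.2, 2.0)]
test_y = ["A", "B", "A"]

predictions = [knn_predict(train_X, train_y, q, k=3) for q in test_X]
accuracy = sum(p == t for p, t in zip(predictions, test_y)) / len(test_y)
print(predictions, accuracy)
```

Changing `k` or the train/test split and re-running the accuracy calculation is one simple way to practise the "improve learning parameters" part of indicator 5.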

6. MAJOR EQUIPMENT/ INSTRUMENTS REQUIRED

The major equipment listed below, with broad specifications for the PrOs, is a guide for administrators procuring it, so that practicals are uniform in all institutions across the state.

| S. No. | Equipment Name with Broad Specifications | PrO. No. |
| --- | --- | --- |
| 1 | Computer system with operating system: Windows 7 or higher Ver., macOS, and Linux, with 4GB or higher RAM, Python versions: 2.7.X, 3.6.X | All |
| 2 | Python IDEs and Code Editors, Open Source: Anaconda Navigator | |

7. AFFECTIVE DOMAIN OUTCOMES

The following sample Affective Domain Outcomes (ADOs) are embedded in many of the above-mentioned COs and PrOs. More could be added to fulfill the development of this competency.

  • a) Work as a Data scientist.
  • b) Follow ethical practices.

The ADOs are best developed through the laboratory/field based exercises. Moreover, the level of achievement of the ADOs according to Krathwohl’s ‘Affective Domain Taxonomy’ should gradually increase as planned below:

  • i. ‘Valuing Level’ in 1st year.
  • ii. ‘Organization Level’ in 2nd year.
  • iii. ‘Characterization Level’ in 3rd year.

9. UNDERPINNING THEORY

Only the major Underpinning Theory is formulated as higher-level UOs of Revised Bloom’s taxonomy so that the development of the COs and competency is not missed by the students and teachers. If required, more such higher-level UOs could be included by the course teacher to focus on the attainment of COs and competency.

| Unit | Unit Outcomes (UOs) (4 to 6 UOs at Application and above level) | Topics and Sub-topics |
| --- | --- | --- |
| Unit - I Introduction to machine learning | 1.1 Describe basic concept of machine learning and its applications | 1.1.1 Overview of Human Learning and Machine Learning; 1.1.2 Types of Machine Learning; 1.1.3 Applications of Machine Learning; 1.1.4 Tools and Technology for Machine Learning |
| Unit - II Preparing to Model | 2.1 Describe different types of Machine learning Activities. 2.2 Explain types of data and data preprocessing. | 2.1.1 Machine Learning activities; 2.1.2 Types of data in Machine Learning; 2.1.3 Structures of data; 2.1.4 Data quality and remediation; 2.1.5 Data Pre-Processing (Dimensionality reduction, Feature subset selection) |
| Unit - III Modeling and Evaluation | 3.1 Select a machine learning model. 3.2 Train the model for supervised learning. 3.3 Evaluate the prepared model. | 3.1.1 Selecting a Model (Predictive/Descriptive); 3.2.1 Training a Model for supervised learning (Holdout method, K-fold Cross-validation method); 3.3.1 Model representation and interpretability; 3.3.2 Evaluating performance of a model (Confusion Matrix); 3.3.3 Improving Performance of a model |
| Unit - IV Supervised Learning - Classification and Regression | 4.1 Describe supervised learning. 4.2 Explain classification Algorithms. 4.3 Explain Regression. | 4.1.1 Introduction to supervised learning; 4.1.2 Classification Model; 4.1.3 Learning steps; 4.2.1 Classification Algorithms (k-Nearest Neighbor (kNN), Support Vector Machines); 4.3.1 Regression (Simple linear regression, Multiple linear regression, Logistic regression) |
| Unit - V Unsupervised Learning | 5.1 Explain unsupervised learning. 5.2 Describe Clustering. 5.3 Describe pattern finding using association rule. | 5.1.1 Supervised vs. Unsupervised Learning; 5.1.2 Applications of unsupervised learning; 5.2.1 Clustering (K-means Clustering algorithm); 5.3.1 Finding pattern using Association Rule (Apriori algorithm) |
| Unit - VI Python libraries for Machine learning | 6.1 Explain the use of existing machine learning Python libraries. | 6.1.1 Pandas; 6.1.2 Numpy; 6.1.3 Matplotlib; 6.1.4 Scikit-learn |
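
Unit III names K-fold cross-validation for training and the confusion matrix for evaluation. The following is a minimal standard-library sketch of both ideas; the helper names, the sample count and the label lists are hypothetical. In practice the data is usually shuffled before splitting, and scikit-learn provides `KFold` and `confusion_matrix` for the same purpose.

```python
from typing import Dict, Iterator, List, Sequence, Tuple


def k_fold_indices(n_samples: int, k: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) for k contiguous folds."""
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)  # spread the remainder over the first folds
        stop = start + size
        test_idx = list(range(start, stop))
        train_idx = list(range(0, start)) + list(range(stop, n_samples))
        yield train_idx, test_idx
        start = stop


def confusion_counts(actual: Sequence[int], predicted: Sequence[int],
                     positive: int = 1) -> Dict[str, int]:
    """Count true/false positives and negatives for a binary problem."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["TP" if a == positive else "FP"] += 1
        else:
            counts["FN" if a == positive else "TN"] += 1
    return counts


# Each sample lands in the test fold exactly once across the k rounds.
folds = list(k_fold_indices(n_samples=10, k=3))
print([test for _, test in folds])

# Hypothetical true labels and model predictions.
actual = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 0, 1, 0, 1, 0, 1, 0]
counts = confusion_counts(actual, predicted, positive=1)
accuracy = (counts["TP"] + counts["TN"]) / len(actual)
print(counts, accuracy)
```

The holdout method from the same unit corresponds to using a single (train, test) pair instead of rotating through all k folds.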

Note: The UOs need to be formulated at the ‘Application Level’ and above of Revised Bloom’s Taxonomy to accelerate the attainment of the COs and the competency.
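
As an illustration of an application-level task for Unit V, the K-means clustering algorithm can be sketched in a few lines. This version is an assumption-laden sketch: it uses the first k points as initial centroids so that the result is reproducible (random initialisation is more common), and the points are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_centroid(point: Sequence[float], centroids: Sequence[Point]) -> int:
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)),
               key=lambda c: squared_distance(point, centroids[c]))


def k_means(points: Sequence[Point], k: int,
            max_iter: int = 100) -> Tuple[List[Point], List[int]]:
    """Cluster `points` into k groups; returns (centroids, assignments)."""
    centroids: List[Point] = [tuple(p) for p in points[:k]]  # deterministic init (assumption)
    assignments: List[int] = []
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest centroid.
        new_assignments = [nearest_centroid(p, centroids) for p in points]
        if new_assignments == assignments:
            break  # converged: no point changed cluster
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(coord) / len(members)
                                     for coord in zip(*members))
    return centroids, assignments


# Hypothetical training points forming two visible groups.
points = [(1.0, 1.0), (8.0, 8.0), (1.0, 2.0), (2.0, 1.0), (8.0, 9.0), (9.0, 8.0)]
centroids, assignments = k_means(points, k=2)
print(centroids, assignments)

# A new (test) point is assigned to the nearest learned centroid.
test_cluster = nearest_centroid((7.5, 8.5), centroids)
```

Unlike the supervised examples in Unit IV, no labels are supplied here; the cluster numbers are arbitrary identifiers chosen by the algorithm.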

10. SUGGESTED SPECIFICATION TABLE FOR QUESTION PAPER DESIGN

| Unit No. | Unit Title | Teaching Hours | Theory Marks: R Level | Theory Marks: U Level | Theory Marks: A Level | Total Marks |
| --- | --- | --- | --- | --- | --- | --- |
| I | Introduction to machine learning | 04 | 02 | 04 | 02 | 8 |
| II | Preparing to Model | 06 | 02 | 06 | 04 | 12 |
| III | Modeling and Evaluation | 08 | 02 | 04 | 06 | 12 |
| IV | Supervised Learning - Classification and Regression | 10 | 02 | 04 | 08 | 14 |
| V | Unsupervised Learning | 08 | 02 | 06 | 04 | 12 |
| VI | Python libraries for Machine learning | 06 | 04 | 04 | 04 | 12 |
| Total | | 42 | 14 | 28 | 28 | 70 |

Legends: R = Remember, U = Understand, A = Apply and above (Revised Bloom’s taxonomy)

Note: This specification table provides general guidelines to assist students in their learning, teachers in their teaching, and question paper designers/setters in formulating test items/questions to assess the attainment of the UOs. The actual distribution of marks at different taxonomy levels (of R, U and A) in the question paper may vary slightly from the above table.

11. SUGGESTED STUDENT ACTIVITIES

Other than the classroom and laboratory learning, the following are suggested student-related co-curricular activities which can be undertaken to accelerate the attainment of the various outcomes in this course. Students should conduct the following activities in groups and prepare reports of about 5 pages for each activity. They should also collect/record physical evidence for their portfolio, which will be useful for their placement interviews:

  • a) Explore different data repositories and register for ML based competitions on platforms like Kaggle.
  • b) Undertake micro-projects in teams.
  • c) Give a seminar on any relevant topics.
  • d) Collect various sensor data from smart phones and apply machine learning approach.

12. SUGGESTED SPECIAL INSTRUCTIONAL STRATEGIES (if any)

These are sample strategies, which the teacher can use to accelerate the attainment of the various outcomes in this course:

  • a) Massive open online courses (MOOCs) may be used to teach various topics/subtopics.
  • b) Guide student(s) in undertaking micro-projects.
  • c) ‘L’ in section No. 4 means different types of teaching methods that are to be employed by teachers to develop the outcomes.
  • d) About 20% of the topics/sub-topics which are relatively simpler or descriptive in nature are to be given to the students for self-learning, but are to be assessed using different assessment methods.
  • e) With respect to section No. 11, teachers need to create opportunities and provisions for co-curricular activities.
  • f) Guide students in the use of open source Python editors.

13. SUGGESTED MICRO-PROJECTS

Only one micro-project is planned to be undertaken by a student, and it needs to be assigned to him/her at the beginning of the semester. In the first four semesters, the micro-projects are group-based. However, in the fifth and sixth semesters, a micro-project should preferably be undertaken individually to build up the skill and confidence in every student to become a problem solver so that s/he contributes to the projects of the industry. In special situations where groups have to be formed for micro-projects, the number of students in the group should not exceed three.

The micro-project could be industry application based, internet-based, workshop-based, laboratory-based or field-based. Each micro-project should encompass two or more COs, which are in fact an integration of PrOs, UOs and ADOs. Each student will have to maintain a dated work diary consisting of individual contributions to the project work and give a seminar presentation of it before submission. The total duration of the micro-project should not be less than 16 (sixteen) student engagement hours during the course. The student ought to submit a micro-project by the end of the semester to develop the industry oriented COs.

A suggestive list of micro-projects is given here. This has to match the competency and the COs. Similar micro-projects could be added by the concerned course teacher:

  • Project idea 1: BigMart Sales Prediction: BigMart sales dataset consists of 2013 sales data for 1559 products across 10 different outlets in different cities. The goal of the BigMart sales prediction ML project is to build a regression model to predict the sales of each of 1559 products for the following year in each of the 10 different BigMart outlets.
  • Project idea 2: Stock Price Prediction using machine learning is the process of predicting the future value of a stock traded on a stock exchange for reaping profits. With multiple factors involved in predicting stock prices, it is challenging to predict stock prices with high accuracy, and this is where machine learning plays a vital role.
  • Project idea 3: Data from a leading music service can be used to build a better music recommendation system.
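
Project idea 1 calls for a regression model, and the simplest instance is the simple linear regression listed in Unit IV. The sketch below fits a line by ordinary least squares using only the standard library. The function name and the tiny data set (chosen so that the points lie exactly on a line) are hypothetical and are not drawn from the BigMart data.

```python
from typing import Sequence, Tuple


def fit_simple_linear_regression(x: Sequence[float],
                                 y: Sequence[float]) -> Tuple[float, float]:
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope = covariance(x, y) / variance(x); the 1/n factors cancel.
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical feature and target values lying on y = 2x + 1.
feature = [1.0, 2.0, 3.0, 4.0]
target = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_simple_linear_regression(feature, target)
prediction = slope * 5.0 + intercept  # predict the target for an unseen feature value
print(slope, intercept, prediction)
```

Real sales data has many input columns, so a micro-project would more likely use multiple linear regression, for example scikit-learn's `LinearRegression`, after the preprocessing steps covered in Unit II.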

14. SUGGESTED LEARNING RESOURCES

| S. No. | Title of Book | Author | Publication with place, year and ISBN |
| --- | --- | --- | --- |
| 1 | Machine Learning: Step-by-Step Guide To Implement Machine Learning Algorithms with Python | Rudolph Russell | Rudolph Russell Publications |
| 2 | Machine Learning | Saikat Dutt, S. Chandramouli, Das | Pearson |
| 3 | Machine Learning with Python Cookbook: Practical Solutions from Preprocessing to Deep Learning | Chris Albon | O’Reilly Media, Inc. |

15. SOFTWARE/LEARNING WEBSITES

16. PO-COMPETENCY-CO MAPPING

Semester IV: Fundamentals of Machine Learning (Course Code: 4341603), POs and PSOs

| Competency & Course Outcomes | PO 1 Basic & Discipline specific knowledge | PO 2 Problem Analysis | PO 3 Design/ development of solutions | PO 4 Engineering Tools, Experimentation & Testing | PO 5 Engineering practices for society, sustainability & environment | PO 6 Project Management | PO 7 Life-long learning |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Competency: Develop a machine learning model to solve real world problems. | | | | | | | |
| CO a) To understand the need of machine learning for various problem solving. | 3 | 2 | 3 | 2 | - | - | 3 |
| CO b) Prepare a machine learning model and learn the evaluation methods. | 3 | 2 | 3 | 2 | - | 2 | 3 |
| CO c) Evaluate various supervised learning algorithms using appropriate dataset. | 3 | 3 | 3 | 3 | - | 3 | 3 |
| CO d) Evaluate various unsupervised learning algorithms using appropriate dataset. | 3 | 3 | 3 | 3 | - | 3 | 3 |
| CO e) To understand the use of various existing machine learning libraries. | 3 | 2 | 3 | 3 | - | - | 3 |

Legend: ‘3’ for high, ‘2’ for medium, ‘1’ for low or ‘-’ for the relevant correlation of each competency, CO, with PO/PSO

17. COURSE CURRICULUM DEVELOPMENT COMMITTEE

GTU Resource Persons

| Sr. No. | Name and Designation | Institute | Email |
| --- | --- | --- | --- |
| 1 | Mr. Sunil K. Paryani - Head (IT) | Government Polytechnic Ahmedabad | mailtosunil9@gmail.com |
| 2 | Ms. Hiral R. Patel - Lect. (IT) | Government Polytechnic Gandhinagar | hiralit@gmail.com |
| 3 | Mr. Pramod K. Tripathi - Lect. (IT) | Government Polytechnic Gandhinagar | csharp.pramod@gmail.com |