Course Syllabus
Course Description and Objectives
MAT555E is a graduate-level course that aims to introduce commonly used statistical methods for inference and prediction problems in data analysis. The course harmonizes statistical theory and data analysis through examples. The objectives of the course are:
- To provide the fundamental mathematical, statistical, and computational concepts behind supervised and unsupervised statistical learning methods and algorithms for inference and prediction.
- To provide extensions of these methods to high-dimensional settings.
- To provide applications of these methods to real-life data sets.
- To provide the implementation of these methods in Python.
Course Type
This is a graduate-level elective course open to all graduate students at ITU.
Course Credits
3 local credits.
Course Prerequisites
Since the course also touches on the mathematical and statistical theory behind the methods and uses Python for implementation, this course requires the following background:
- Knowledge of linear algebra, probability, statistics, and optimization,
- Familiarity with Python’s NumPy, pandas, Matplotlib, seaborn, statsmodels, and scikit-learn libraries,
- Familiarity with at least one computational document environment such as Jupyter Notebook, Google Colab, Visual Studio Code, or RStudio Quarto, and
- Familiarity with Git commands and the GitHub interface.
Class Schedule
CRN 14267: Tuesdays between 14:30 and 17:30 at OBL3 (Computer Lab).
Course Logistics
All course-related announcements will be made through Ninova. Lecture materials (lecture slides, code scripts, assignments, etc.) will be uploaded to Ninova and posted on the course's GitHub organization. Students are expected to bring their own portable computers to class.
Course Workload
1 midterm exam, 1 group-based paper presentation with a written report, 1 final exam, and in-class performance.
Course Tentative Plan
We will closely follow the weekly schedule given below. However, weekly class schedules are subject to change depending on the progress we make as a class.
Week 1. Introduction. Framing a learning problem. Exploratory data analysis.
Week 2. Simple linear regression.
Week 3. Multiple linear regression.
Week 4. Introduction to classification. Logistic regression. Multinomial logistic regression.
Week 5. Naive Bayes. K-nearest neighbors.
Week 6. Linear discriminant analysis. Quadratic discriminant analysis.
Week 7. Cross-validation. Unsupervised pre-processing. Grid search and hyper-parameter tuning.
Week 8. ITU Fall Break.
Week 9. Model assessment and selection. Regularization methods for regression and classification problems. Ridge regression and lasso. Extensions to non-convex penalties.
Week 10. Moving beyond linearity. Polynomial regression. Regression splines.
Week 11. Tree-based methods. Bagging, random forests, and boosting.
Week 12. Support vector machines.
Week 13. Unsupervised learning. Principal component analysis. Factor analysis.
Week 14. Clustering methods.
Week 15. Final review and examples.
Student Learning Outcomes
A student who completes this course successfully is expected, immediately following the course and/or a few months after it:
- To be fluent in the fundamental concepts and principles behind supervised and unsupervised statistical learning methods,
- To be able to identify which method(s) might be suitable for analyzing specific real-life data sets,
- To be familiar with Python's scikit-learn library, and
- To be prepared for more advanced coursework or scientific research in machine learning and related fields.
Textbook
All lecture materials.
Recommended Primary Bibliography
Students are encouraged to consult the following sources on their own:
- Hastie, T., Tibshirani, R., and Friedman, J.H. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. New York: Springer. [Hard copy available at ITU Mustafa Inan Library with CALL #Q325.5 .H37 2009] [Available online at https://hastie.su.domains/ElemStatLearn/].
- James, G., Witten, D., Hastie, T., and Tibshirani, R. (2021). An Introduction to Statistical Learning: With Applications in R. New York: Springer. [Available online at https://www.statlearning.com/].
- Fan, J., Li, R., Zhang, C.H., and Zou, H. (2020). Statistical Foundations of Data Science. Chapman and Hall/CRC.
- Deisenroth, M.P., Faisal, A.A., and Ong, C. S. (2020). Mathematics for Machine Learning. Cambridge University Press. [Available online at https://mml-book.github.io/].
- VanderPlas, J. (2016). Python Data Science Handbook: Essential Tools for Working with Data. O’Reilly Media, Inc. [Available online at https://jakevdp.github.io/PythonDataScienceHandbook/].
- Müller, A.C., and Guido, S. (2016). Introduction to Machine Learning with Python: A Guide for Data Scientists. O’Reilly Media, Inc. [Available online at https://github.com/amueller/introduction_to_ml_with_python].
Supplementary Readings
- Murphy, K.P. (2022). Probabilistic Machine Learning: An Introduction. MIT Press. [Available online at https://probml.github.io/pml-book/book1.html].
- Bishop, C.M. (2006). Pattern Recognition and Machine Learning. New York: Springer. [Hard copy available at ITU Mechanical Eng. Library with CALL #Q327 .B52 2006].
Off-Campus Access to the ITU Library E-sources
Remote access to library e-sources is possible with a library account. Users without a library account should apply for library registration at Library register. After applying the web configuration given at Proxy once on your computer, you will be able to access ITU Library e-sources.
Selected Important Dates
For the official ITU Fall 2022 academic calendar, please visit:
Here are some selected important dates in the Fall 2022 semester:
September 19, 2022: First day of classes.
September 19-23, 2022: Add-drop week.
October 29, 2022: Republic Day of Turkey (Saturday).
November 7-11, 2022: ITU Fall Break (No classes).
December 30, 2022: Last day of classes.
January 1, 2023: New Year's Day (Sunday).
January 2-15, 2023: Final exam week.
I also honor other national and religious holidays. Students who need flexibility for individual work that overlaps with these days can inform me.
Course Policies
Please read the information below as a reference for how this class will be conducted.
Grading Policy
Assessment Method | Contribution to Final Grade |
---|---|
In-class performance | 10% |
Midterm exam | 30% |
Paper presentation | 30% |
Final exam | 30% |
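As a rough illustration of how the weights in the table combine, here is a minimal Python sketch. The function name `weighted_grade`, the component keys, the 0-100 scale, and the example scores are all assumptions made for illustration; they are not part of the official grading procedure.

```python
# Weights taken from the grading table (as fractions of the final grade).
WEIGHTS = {
    "in_class": 0.10,
    "midterm": 0.30,
    "presentation": 0.30,
    "final": 0.30,
}


def weighted_grade(scores):
    """Combine component scores (assumed to be on a 0-100 scale) into a weighted total."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, for illustration only.
example = {"in_class": 90, "midterm": 70, "presentation": 80, "final": 60}
print(round(weighted_grade(example), 2))
```

How the weighted total maps to a letter grade is not specified here, so the sketch stops at the numeric total.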
Midterm date and coverage
The midterm will be on November 17th, 2022, between 17:30 and 20:30. The midterm will cover everything we have covered up to that week. The main aim of the midterm is to assess whether you are able to frame a data analysis problem, implement it, and report the results. The midterm will be a hands-on, open-book exam. For that reason, you must bring your own portable computer to the exam.
Group-based paper presentation (with report submission) date and coverage
The group-based paper presentations will be on December 15th, 2022, between 17:30 and 20:30.
When the semester starts, I urge you to visit Proceedings of Machine Learning Research and pick a paper published within the past three years (2020, 2021, or 2022) at AISTATS, ICML, or NeurIPS. I would especially like to hear about papers on explainable AI (XAI), but feel free to select a topic of your own interest.
I anticipate that the content of these papers may be very heavy compared to our class topics. However, the main aim of the group-based paper presentation (along with the report submission) is to assess whether you are able to read and understand a recent research problem and suggest an improvement (e.g., mathematical or computational) as an extension of the paper. Groups can have at most 2 members, and each presentation lasts 30 minutes (25 min. talk + 5 min. Q&A).
I also found the suggestions of Prof. Pranav Rajpurkar on “How to find good research ideas” listed here useful:
https://docs.google.com/document/d/15pnUpD47S6mAM-g4fwQvc2klYIb-GKgWex1oOlmNjvg/edit
Final exam date and coverage
The final exam date will be announced by ITU SIS later in December. The final exam will cover everything we have covered throughout the semester, with more advanced problems than the midterm. The final exam will be a hands-on, open-book exam. For that reason, you must bring your own portable computer to the exam.
Final Exam Attendance Policy
There is no VF rule for taking the final exam.
Make-Up Exam Policy
Students who miss either the midterm or the final exam due to a health problem can take a make-up exam, provided they have a valid medical report issued on the exam day. The medical report must be submitted within two days of its expiration. There will be NO make-up for missed in-class activities.
Class Attendance Policy
Students must attend at least 70% of classes and are responsible for managing their own absences.
Participation Policy
Students are expected to ask and answer questions, participate in in-class activities, and show interest and engagement in the class.
E-mail Policy
Please:
- Use a proper descriptive subject line (which may consist of the course number MAT555E followed by a short phrase summarizing the subject of your e-mail).
- Start your e-mail with a proper greeting, introduce yourself (give your name), then state your problem as briefly as possible.
- Finally, use a proper closing and sign off with your name.
Feel free to send me e-mails, but please give me enough time to get back to you.
Academic Honesty Policy
At every stage of the academic life, every ITU student is responsible for obeying the academic honesty policy of ITU stated below:
https://odek.itu.edu.tr/en/code-of-honor/ethics-in-university-life.
Equity, Diversity, and Inclusion
In this class, I am committed to respecting cultural and individual differences and diversity, including but not limited to age, disability, ethnicity, gender, gender identity, language, national origin, race, religion, culture, and socioeconomic status, and I acknowledge the value of these differences.
Students with Special Needs
I truly care that every student in my class feels equally involved. If you are a student with special needs, please let me know how we can adjust the course environment, materials, and assessment methods in accordance with your needs. You are also invited to contact the office of students with special needs at: