SICOM 3A and Master SIGMA program: Statistical/Machine learning course
News
Due to last-minute constraints, the ML lab of Monday, October 3 will exceptionally be held over Zoom (login details sent by email).
Homework for Monday, October 10
- read the lesson (slides) on support vector machines:
  - on a first reading you can skip slides 10 to 18 (on constrained convex optimization),
  - the introduction to random forests (appendix, slides 43 to 47) is also optional
- prepare your questions for the course/lab session!
Lab4 instructions
- Lab4 statement is here
- upload at the end of the session your lab 4 short report in the chamilo assignment task (pdf file from your editor, or scanned pdf file of a handwritten paper; code, figures or graphics are not required)
Homework for Monday, October 3
- read the lesson (slides) on lasso and logistic regression: read up to the end of the slides
- prepare your questions for the course/lab session!
Lab3 instructions
- Lab3 statement is here
- upload at the end of the session your lab 3 short report in the chamilo assignment task (pdf file from your editor, or scanned pdf file of a handwritten paper; code, figures or graphics are not required)
Homework for Friday, September 30
- read the lesson (slides) on linear models: read up to ridge regression, slide 23
- prepare your questions for the course/lab session!
Lab2 instructions
- Lab2 statement is here
- upload at the end of the session your lab 2 short report in the chamilo assignment task (pdf file from your editor, or scanned pdf file of a handwritten paper; code, figures or graphics are not required)
Homework for Monday, September 26
- read the lesson (slides) on generative models: discriminant analysis + naïve Bayes
- prepare your questions for the course/lab session!
Lab1 instructions
- Lab1 is scheduled on Monday 19 (13:30 - M253 Minatec for IMMAC, and 15:45 - M253 Minatec for EEH students)
- Statement is here
- upload your lab 1 short report in the chamilo assignment task (pdf file from your editor, or scanned pdf file of a handwritten paper; code, figures or graphics are not required)
Homework before the first lab on Monday, September 19
- read and run the introduction notebooks N1_Linear_Classification.ipynb and N2_Polynomial_Classification_Model_Complexity.ipynb
- answer the questions of the notebook exercises and upload your answers (pdf file from your editor, or scanned pdf file of a handwritten sheet) to chamilo in the assignment tool (those and only those who do not yet have an agalan account can send them to me by email):
  - only text explanations are required, no need to copy/paste figures or graphics!
  - your answers must not exceed half an A4 page
The first course session will take place on Monday afternoon, September 12, at Minatec Z306.
Welcome to the Statistical Learning course!
You will find in this gitlab repository the necessary material for the teaching of Machine Learning:
- course materials for the lessons (slides)
- examples and exercises for the labs in the form of Jupyter Python notebooks (.ipynb files) and/or via online applications
- quiz: online tool Socrative, room MLSICOM
These resources will be updated as the sessions progress.
How to use the notebooks?
The examples and exercises use Python 3.x with scikit-learn and also TensorFlow. These are two of the most widely used machine learning packages.
Jupyter notebooks (.ipynb files) are programs containing both code cells (Python, in our case) and markdown text cells for the narrative side. These notebooks are often used to explore and analyze data. They are run with a jupyter-notebook or jupyter-lab application, which is accessed through a web browser.
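To make the code cell / markdown cell distinction concrete, here is a minimal sketch showing that a notebook file is a JSON document listing its cells. The tiny notebook below is hand-written for illustration (it is not one of the course notebooks), and the helper name `count_cells` is hypothetical.

```python
import json

# A hand-written, minimal notebook in the nbformat 4 JSON layout
# (illustrative content only, not a course notebook).
notebook_json = """
{
  "nbformat": 4,
  "nbformat_minor": 5,
  "metadata": {},
  "cells": [
    {"cell_type": "markdown", "metadata": {}, "source": ["# A title cell"]},
    {"cell_type": "code", "metadata": {}, "execution_count": null,
     "outputs": [], "source": ["print('hello')"]}
  ]
}
"""


def count_cells(notebook_text):
    """Return a dict mapping each cell type to its number of cells."""
    notebook = json.loads(notebook_text)
    counts = {}
    for cell in notebook["cells"]:
        cell_type = cell["cell_type"]
        counts[cell_type] = counts.get(cell_type, 0) + 1
    return counts


print(count_cells(notebook_json))  # {'markdown': 1, 'code': 1}
```

The same function should work on a real notebook by passing it the text of an .ipynb file, for example `count_cells(open(path, encoding="utf-8").read())` with `path` pointing to a notebook you downloaded.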
In order to run them you have several possibilities:
- Download the notebooks to run them on your machine. This requires a Python environment (version > 3.3) with the Jupyter notebook and scikit-learn packages. We recommend installing them via the Anaconda distribution, which installs all the necessary dependencies directly.
Or
- Use the mybinder service and links to run them interactively and remotely (online):
(open the link and wait a few seconds for the environment to load).
Warning: Binder is meant for ephemeral interactive coding, meaning that your own modifications, code and results will be lost when your user session automatically shuts down (basically after 10 minutes of inactivity).
Or
- Use a jupyterhub online service:
  - we recommend UGA's service, jupyterhub.u-ga.fr, so that you can run your notebooks on UGA's computation server while saving your modifications and results. It is also useful for launching a background computation (log in with your Agalan account; requires uploading your notebooks and data to the server).
  - alternatively you can use an equivalent jupyterhub service, for example the one from Google, namely google-colab, which allows you to run and save your notebooks and to share editing with several collaborators (requires a Google account and uploading your notebooks and data to your Drive)
Note: You will also find among the notebooks an introduction to Python: notebooks/0_python_in_a_nutshell
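If you choose the first option (running the notebooks on your own machine), a quick check like the following sketch may help confirm that your environment is ready. The import names are assumptions: scikit-learn is imported as `sklearn`, and the classic Jupyter notebook application as `notebook`. The helper names `missing_packages` and `python_is_recent_enough` are hypothetical.

```python
import importlib.util
import sys

# Assumed top-level import names for the packages mentioned above.
REQUIRED = ["sklearn", "notebook"]


def missing_packages(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_is_recent_enough(version_info=sys.version_info, minimum=(3, 3)):
    """Check that the (major, minor) version is strictly above the minimum."""
    return tuple(version_info[:2]) > minimum


if __name__ == "__main__":
    print("Python version OK:", python_is_recent_enough())
    print("Missing packages:", missing_packages(REQUIRED))
```

An empty list of missing packages suggests the notebooks can be opened locally; add `"tensorflow"` to `REQUIRED` if you also plan to run the TensorFlow examples.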
Miscellaneous remarks on the materials
- The slides are designed to be self-sufficient (even if the narrative side is often limited by the format).
- In addition to the slides and bibliographical/web references, we generally propose links or videos (at the beginning or end of the slides) specific to the concepts presented. These lists are of course not exhaustive, and you will find throughout the web many resources, often pedagogical. Feel free to do your own research.