Trustworthy Machine Learning (MATH80630): Fall 2022

This project will be worth 60% of your final grade. You must work in teams of one or two.

Grading Scheme

Project proposal (5%)

Clarity/Relevance of problem statement: 2%
Relevance of proposed approach: 1%
Experiment setup (dataset): 1%
Members of the team: 1%

Project Mid Report (10%)

Clarity/Relevance of problem statement: 1%
Description of approach: 2%
Discussion of relationship to previous work and references: 2%
Research questions: 3%
Design of experiments: 2%

Project Final Report (35%)

Clarity/Relevance of problem statement and description of approach: 8%
Discussion of relationship to previous work and references: 3%
Design and execution of experiments: 10%
Figures/Tables/Writing: easily readable, properly labeled, informative: 5%
Discussion of the limitations, relations to all aspects of trustworthy ML fields, and societal impacts: 3%
Code: 5% (you need to send your code via email to the instructor)
Individual report: 1%

Project Presentation (10%)

Clarity of presentation: 3%
Slide or Poster quality: 2%
Correctness: 2%
Answers to questions: 3%

Timeline

Project Proposal, due: September 30, 2022 (by the end of the day EDT). The head of the team uploads the PDF of the proposal to Gradescope.
Project meeting, September 30, 2022
Mid group report, due: October 21, 2022 (by the end of the day EDT). The head of the team uploads the PDF of the mid group report to Gradescope.
Project Presentation, due: December 9, 2022. The head of the team uploads the PDF of your poster/slides to Gradescope.
In-class Presentation, on December 9, 2022.
Final group report, due: December 20, 2022 (by the end of the day EDT). The head of the team uploads the PDF of the final group report to Gradescope.
Final individual report, due: December 20, 2022 (by the end of the day EDT). Each team member uploads the PDF of their own final individual report to Gradescope.

Goals

The aim of this project is to allow you to learn about machine learning by trying to solve a task with it.

First, select a question that can be answered using machine learning. I expect that your question will be about a model/algorithm or about an application. Then design a study that will try to answer your question. Your study must have an element of novelty. For example, the novelty could be an extension or a variation of an existing algorithm, or results of an existing method on a new dataset.

Your study should involve reading and understanding some background material. Your study must involve running some experiments. You are free to use (or not) any of the tools or models we have seen in class.

Alternatively: You could decide to participate in this open challenge: ML Reproducibility Challenge 2022. Let me know as soon as possible if you are interested in this.

Project Proposal: (1 upload per team) Please submit a one-page summary of your proposed research question and study to Gradescope. I will meet with each group to discuss study plans during the lecture of September 30. I will send you a schedule the day before. We will probably only have about 15 minutes so please make sure that your study plan is clear and precise. You may also include questions that you would like us to discuss at the end of the document.

The group report: (1 upload per team) Your report must contain a description of the question you are trying to answer, a clear description of the model/algorithm you are studying, a survey of related work with proper references, an empirical section that reports your results, and a conclusion that summarizes your findings and (if pertinent) highlights possible future directions of investigation. Your report should be no longer than 10 pages (plus references) for individuals or 13 pages (plus references) for teams of two. You must format your submission using the NeurIPS 2022 LaTeX style file, which includes a “preprint” option for non-anonymous preprints.

The individual report: (1 upload per student) You will also submit a brief individual report (at most one page), which will: (1) describe the parts of the project you worked on (which machine learning methods you applied, which preprocessing steps you performed on the data, which parts of the term paper you wrote, who you worked with on what parts, etc.) and what parts of the project your teammates worked on; and (2) describe what you learned from the project. The purpose of the individual report is to facilitate fair grading and to allow the instructor to understand what you learned from the project.

Some advice:

Be selective! Don’t choose a project that has nothing to do with trustworthy machine learning. Don’t investigate an algorithm that has a high chance of failing or being un-implementable. Don’t attack a problem that is irrelevant, ill-defined or unsolvable. Spend most of your time doing the main project and not related things such as data collection.
Be honest! You are not being marked on how good the results are. It doesn’t matter if your method is worse than the ones you compare to provided you implemented it properly. What matters is that you try something sensible and clearly describe the problem, your method, what you did, and what the results were.
Be modest! Don’t pick a project that is way too hard. Usually, if you select the simplest thing you can think of to try, and do it carefully, it will take much longer than you think.
Be careful! Don’t do foolish things like test on your training data, set parameters by cheating, compare unfairly against other methods, include plots with unlabeled axes, use undefined symbols in equations, etc. Do sensible cross-checks like running your algorithms several times, leaving out small parts of your data, adding a few noisy points, etc. to make sure everything still works reasonably well. Make lots of pictures along the way.
Learn! The point of the project is to give you a chance to “test drive” the process of doing trustworthy machine learning. Consider this an opportunity to learn how to write code to run large experiments, make nice figures, lay out readable equations, describe your work concisely to a smart but uninitiated reader, etc.
Have fun! If you pick something you think is cool, that will make getting it to work less painful and writing up your results less boring.
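
To make the "Be careful!" advice concrete, here is a minimal sketch of an experiment loop that keeps the test set disjoint from the training data and repeats the run over several random seeds. Everything in it is an illustrative assumption rather than a course requirement: the synthetic 1-D data, the nearest-centroid "model", and the function names (make_data, split, fit_centroids, accuracy, run_once, run_many) are hypothetical placeholders for your own dataset and method.

```python
import random
import statistics


def make_data(n, rng):
    """Synthetic 1-D binary data (assumption): class 0 centred at 0.0, class 1 at 2.0."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        x = rng.gauss(2.0 * label, 1.0)
        data.append((x, label))
    return data


def split(data, test_fraction, rng):
    """Shuffle a copy of the data and hold out a disjoint test set."""
    shuffled = data[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]  # (train, test)


def fit_centroids(train):
    """Placeholder model: the mean of x per class, computed on training data only."""
    return {
        label: statistics.mean(x for x, y in train if y == label)
        for label in (0, 1)
    }


def accuracy(centroids, examples):
    """Fraction of examples whose nearest centroid matches their label."""
    correct = sum(
        1 for x, y in examples
        if min(centroids, key=lambda c: abs(x - centroids[c])) == y
    )
    return correct / len(examples)


def run_once(seed, n=400, test_fraction=0.25):
    """One full experiment: generate data, split, fit on train, score on test."""
    rng = random.Random(seed)  # a seeded generator makes the run reproducible
    train, test = split(make_data(n, rng), test_fraction, rng)
    centroids = fit_centroids(train)
    return accuracy(centroids, test)


def run_many(seeds):
    """Repeat the experiment over several seeds; return mean and standard deviation."""
    scores = [run_once(s) for s in seeds]
    return statistics.mean(scores), statistics.stdev(scores)


if __name__ == "__main__":
    mean_acc, std_acc = run_many(range(5))
    print(f"test accuracy over 5 seeds: {mean_acc:.3f} +/- {std_acc:.3f}")
```

Reporting a mean and a spread over seeds, rather than a single number, is one simple way to check that a result is not an artifact of one lucky split.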

Class projects