CS492(D): Diffusion Models and Their Applications
Minhyuk Sung, KAIST, Fall 2024
Time & Location
Time: Mon/Wed 1:00 p.m. - 2:15 p.m. (KST)
Location: Zoom / N1 Rm 201.
Description
Recent breakthroughs in generative AI have amazed people with the unprecedented quality of generated images and videos, as exemplified by Sora, Midjourney, Stable Diffusion, and many others. These advances have been achieved using diffusion models, which have become the standard technique for generative modeling. Diffusion models offer numerous advantages, including high-quality outputs and capabilities in conditional generation, personalization, zero-shot manipulation, and knowledge distillation.
In this course, we will discuss the theoretical foundations and practical applications of diffusion models. While the goal is to cover both theory and practice, the focus will be on gaining hands-on experience by implementing diffusion model techniques in programming assignments and solving real-world problems in the course project. There will be no midterm or final exams.
Course Staff
Instructor: Minhyuk Sung (mhsung@kaist.ac.kr)
Course Assistants:
- Yuseung Lee (phillip0701@kaist.ac.kr)
- Jaihoon Kim (jh27kim@kaist.ac.kr)
- Seungwoo Yoo (dreamy1534@kaist.ac.kr)
- Juil Koo (63days@kaist.ac.kr)
Prerequisites
This course is designed for students with a fundamental understanding of deep learning and experience using PyTorch.
Grading
- Programming Assignments: 45%
- Project: 45%
- In-Class Participation: 10%
Paper List
Useful Resources
- SIGGRAPH 2024 Course: Diffusion Models for Visual Content Generation
- CVPR 2023 Tutorial: Denoising Diffusion Models: A Generative Learning Big Bang
- "Generative Modeling by Estimating Gradients of the Data Distribution", Yang Song.
- "What are Diffusion Models?", Lilian Weng.
- "Understanding Diffusion Models: A Unified Perspective", Calvin Luo.
- "Tutorial on Diffusion Models for Imaging and Vision", Stanley H. Chan.
- "Step-by-Step Diffusion: An Elementary Tutorial", Preetum Nakkiran, Arwen Bradley, Hattie Zhou, and Madhu Advani.
Important Dates
ALL ASSIGNMENTS ARE DUE 23:59 KST.
(Subject to Change)
- 1st Programming Assignment: Due Sep 29 (Sun)
- 2nd Programming Assignment: Due Oct 9 (Wed)
- 3rd Programming Assignment: Due Oct 21 (Mon)
- 4th Programming Assignment: Due Nov 5 (Tue)
- 5th Programming Assignment: Due Nov 18 (Mon)
- 6th Programming Assignment: Due Dec 6 (Fri)
- 7th Programming Assignment: Due Dec 13 (Fri)
- Project Proposal: Due Oct 19 (Sat)
- Project Interim Report: Due Nov 9 (Sat)
- Project Early Reporting: Due Nov 22 (Fri)
- Project Submission: Due Nov 30 (Sat)
Schedule
(Subject to Change)
Week | Mon | Topic | Wed | Topic |
---|---|---|---|---|
1 | Sep 02 | Course Introduction (Slides) | Sep 04 | Introduction to Generative Models / GAN / VAE (Slides, Recording) |
2 | Sep 09 | DDPM 1 (Slides, Recording) | Sep 11 | DDPM 2 (Slides, Recording); Assignment 1 Session (Slides) |
3 | Sep 16 | No Class (Chuseok) | Sep 18 | No Class (Chuseok) |
4 | Sep 23 | DDIM 1 (Slides, Recording) | Sep 25 | DDIM 2 / CFG (Slides, Recording); Assignment 2 Session (Slides) |
5 | Sep 30 | CFG / Latent Diffusion / ControlNet / LoRA (Slides, Recording) | Oct 02 | No Class (Substitution of Hangul Day) |
6 | Oct 07 | Zero-Shot Applications (Slides, Recording); Assignment 3 Session (Slides) | Oct 10 (Thu) 4:00pm KST | Guest Lecture 1: Or Patashnik, Ph.D. Student at Tel-Aviv University (Recording) |
7 | Oct 14 | DDIM Inversion / Score Distillation 1 (Slides, Recording) | Oct 16 | Score Distillation 2 (Slides, Recording); Assignment 4 Session (Slides) |
8 | Oct 21 | No Class (Midterm Week) | Oct 23 | No Class (Midterm Week) |
9 | Oct 28 | Diffusion Synchronization (Slides, Recording) | Oct 30 | Assignment 5 Session (Slides) |
10 | Nov 04 | Inverse Problems 1 (Slides, Recording) | Nov 06 | Inverse Problems 2 (Slides, Recording); Project Orientation Session |
11 | Nov 11 | Probability Flow ODE / DPM-Solver (Slides, Recording) | Nov 13 | Assignment 6 Session (Slides) |
12 | Nov 18 | Flow Matching 1 (Slides, Recording) | Nov 20 | Flow Matching 2 (Slides, Recording); Assignment 7 Session (Slides) |
13 | Nov 25 | DiT / Applications / Future of Generative Models | Nov 27 | Guest Lecture 2: Jiaming Song, Chief Scientist at Luma AI |
14 | Dec 02 | Project Presentations 1 | Dec 04 | Project Presentations 2 |
15 | Dec 09 | No Class (Conference Trip) | Dec 11 | No Class (Conference Trip) |
16 | Dec 16 | No Class (Final Week) | Dec 18 | No Class (Final Week) |
AI Coding Assistant Tool Policy
You are allowed (and even encouraged) to use AI coding assistant tools, such as ChatGPT, Copilot, Codex, and Code Intelligence, for your programming assignments and projects. Using AI coding assistant tools will not be considered plagiarism. However, it is still strictly prohibited to directly copy code from the Internet or from someone else. Doing so will lead to a score of zero and a report to the university.