CS492(D): Diffusion Models and Their Applications
Minhyuk Sung, KAIST, Fall 2024
Time & Location
Time: Mon/Wed 1:00 p.m. - 2:15 p.m. (KST)
Location: Zoom / N1 Rm 201.
Description
Recent breakthroughs in generative AI have amazed people with the unprecedented quality of generated images and videos, as exemplified by Sora, Midjourney, Stable Diffusion, and many others. These advancements have been achieved using diffusion models, which have become the new standard technique for generative modeling. Diffusion models offer numerous advantages, including superior output quality and capabilities in conditional generation, personalization, zero-shot manipulation, and knowledge distillation.
In this course, we will discuss the theoretical foundations and practical applications of diffusion models. While the goal is to cover both theory and practice, the focus will be on gaining hands-on experience by implementing diffusion model techniques in programming assignments and solving real-world problems in the course project. There will be no midterm or final exams.
Course Staff
Instructor: Minhyuk Sung (mhsung@kaist.ac.kr)
Course Assistants:
- Yuseung Lee (phillip0701@kaist.ac.kr)
- Jaihoon Kim (jh27kim@kaist.ac.kr)
- Seungwoo Yoo (dreamy1534@kaist.ac.kr)
- Juil Koo (63days@kaist.ac.kr)
Prerequisites
This course is designed for students with a fundamental understanding of deep learning and experience using PyTorch.
Grading
- Programming Assignments: 45%
- Project: 45%
- In-Class Participation: 10%
Paper List
Useful Resources
- SIGGRAPH 2024 Course: Diffusion Models for Visual Content Generation
- CVPR 2023 Tutorial: Denoising Diffusion Models: A Generative Learning Big Bang
- "Generative Modeling by Estimating Gradients of the Data Distribution", Yang Song.
- "What are Diffusion Models?", Lilian Weng.
- "Understanding Diffusion Models: A Unified Perspective", Calvin Luo.
- "Tutorial on Diffusion Models for Imaging and Vision", Stanley H. Chan.
- "Step-by-Step Diffusion: An Elementary Tutorial", Preetum Nakkiran, Arwen Bradley, Hattie Zhou, and Madhu Advani.
Important Dates
Each programming assignment is due two weeks after the assignment session.
ALL ASSIGNMENTS ARE DUE 23:59 KST.
(Subject to Change)
- 1st Programming Assignment: Due Sep 29 (Sun)
- 2nd Programming Assignment: Due Oct 9 (Wed)
- Project Proposal: Due Oct 19 (Sat)
- Project Interim Report: Due Nov 9 (Sat)
- Project Submission: Due Nov 30 (Sat)
Schedule
(Subject to Change)
Week | Mon | Topic | Wed | Topic |
---|---|---|---|---|
1 | Sep 02 | Course Introduction (Slides) | Sep 04 | Introduction to Generative Models / GAN / VAE (Slides, Recording) |
2 | Sep 09 | DDPM 1 (Slides, Recording) | Sep 11 | DDPM 2 (Slides, Recording); Assignment 1 Session (Slides) |
3 | Sep 16 | No Class (Chuseok) | Sep 18 | No Class (Chuseok) |
4 | Sep 23 | DDIM 1 (Slides, Recording) | Sep 25 | DDIM 2 / CFG (Slides); Assignment 2 Session (Slides); Recording |
5 | Sep 30 | CFG / Latent Diffusion / ControlNet / LoRA (Slides, Recording) | Oct 02 | No Class (Substitution of Hangul Day) |
6 | Oct 07 | Assignment 3 Session | Oct 10 (Thu) 4:00 p.m. KST | Guest Lecture 1: Or Patashnik, Ph.D. Student at Tel-Aviv University |
7 | Oct 14 | Inverse Problem / Knowledge Distillation | Oct 16 | Assignment 4 Session |
8 | Oct 21 | No Class (Midterm Week) | Oct 23 | No Class (Midterm Week) |
9 | Oct 28 | Project Introduction | Oct 30 | Diffusion Synchronization |
10 | Nov 04 | Assignment 5 Session | Nov 06 | SDE/ODE Solvers |
11 | Nov 11 | Assignment 6 Session | Nov 13 | No Class (Break) |
12 | Nov 18 | Consistency Model / Flow-Based Models | Nov 20 | Assignment 7 Session |
13 | Nov 25 | DiT / Applications / Future of Generative Models | Nov 27 | Guest Lecture 2: Jiaming Song, Chief Scientist at Luma AI |
14 | Dec 02 | Project Presentations 1 | Dec 04 | Project Presentations 2 |
15 | Dec 09 | No Class (Conference Trip) | Dec 11 | No Class (Conference Trip) |
16 | Dec 16 | No Class (Final Week) | Dec 18 | No Class (Final Week) |
AI Coding Assistant Tool Policy
You are allowed (and even encouraged) to use AI coding assistant tools, such as ChatGPT, Copilot, Codex, and Code Intelligence, for your programming assignments and projects. Using these tools will not be considered plagiarism. However, directly copying code from the Internet or from someone else is still strictly prohibited. Doing so will result in a score of zero and a report to the university.