Diffusion models explained. How does OpenAI's GLIDE work?

Published 2022-03-23
Diffusion models beat GANs at image synthesis, and GLIDE generates images from text descriptions, surpassing even DALL-E in photorealism! Check out this video to learn how diffusion models work. Enjoy the visuals!
SPONSOR: Weights & Biases 👉 wandb.me/ai-coffee-break

❓ Check out our daily #MachineLearning Quiz Questions: youtube.com/c/AICoffeeBreak/community
➡️ AI Coffee Break Merch! 🛍️ aicoffeebreak.creator-spring....

Recommended videos:
📺 DALL-E video:    • OpenAI's DALL-E explained. How GPT-3 ...  
📺 GAN explained video:    • GANs explained | Generative Adversari...  
📺 CLIP video:    • OpenAI’s CLIP explained! | Examples, ...  

Papers:
📜 GLIDE paper: Nichol, Alex, Prafulla Dhariwal, Aditya Ramesh, Pranav Shyam, Pamela Mishkin, Bob McGrew, Ilya Sutskever, and Mark Chen. "GLIDE: Towards photorealistic image generation and editing with text-guided diffusion models." arXiv preprint arXiv:2112.10741 (2021). arxiv.org/abs/2112.10741
🔗 GLIDE mini, demo: huggingface.co/spaces/valhalla/glide-text2im
📜 Diffusion models for image generation: Dhariwal, Prafulla, and Alexander Nichol. "Diffusion models beat GANs on image synthesis." Advances in Neural Information Processing Systems 34 (2021). arxiv.org/abs/2105.05233
📜 Original diffusion models paper: Sohl-Dickstein, Jascha, Eric Weiss, Niru Maheswaranathan, and Surya Ganguli. "Deep unsupervised learning using nonequilibrium thermodynamics." In International Conference on Machine Learning, pp. 2256-2265. PMLR, 2015. arxiv.org/abs/1503.03585
🔗 Check out this awesome blogpost by Lilian Weng: lilianweng.github.io/lil-log/2021/07/11/diffusion-…
🔗 Flow-based models: lilianweng.github.io/lil-log/2018/10/13/flow-based…
🔗 DALL-E blog post: openai.com/blog/dall-e/
💻 If interested in the basic code of diffusion models, here is a wonderful annotated diffusion model from 🤗: huggingface.co/blog/annotated-diffusion
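
For a first taste of what that basic code looks like, here is a minimal sketch of the forward (noising) half of a diffusion model in plain Python. It uses the closed form x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * noise. The function names, the linear beta schedule with its start and end values, and the four-pixel toy "image" are assumptions made for illustration, not code from the video or from the papers above.

```python
import math
import random

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Noise variances beta_1..beta_T, linearly spaced (an assumed schedule)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def q_sample(x0, t, alpha_bars, rng):
    """Sample x_t from q(x_t | x_0) in closed form; t is a 0-based step index."""
    signal = math.sqrt(alpha_bars[t])
    noise_scale = math.sqrt(1.0 - alpha_bars[t])
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    x_t = [signal * x + noise_scale * e for x, e in zip(x0, noise)]
    return x_t, noise

rng = random.Random(0)            # fixed seed so runs are repeatable
betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)
x0 = [1.0, -1.0, 0.5, 0.0]        # hypothetical toy "image" of four pixel values
x_early, _ = q_sample(x0, 10, alpha_bars, rng)    # still close to x0
x_late, _ = q_sample(x0, 999, alpha_bars, rng)    # almost pure Gaussian noise
```

During training, a network would be asked to predict the returned `noise` from `x_t` and `t`. That learned denoiser is what the reverse (generation) process relies on, and it is not part of this sketch.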

Outline:
00:00 Diffusion models are cool
00:33 Weights & Biases (Sponsor)
01:51 4 types of generative models (in 2022)
05:13 Diffusion models explained
08:27 Why are diffusion models good at photorealism? – Diffusion models beat GANs
10:36 GLIDE explained
12:16 Classifier-guided diffusion, CLIP-guided diffusion
13:56 Classifier-free guidance
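
The classifier-free guidance step from the last chapter boils down to one line of arithmetic: run the noise predictor once with the text caption and once without it, then push the prediction further in the direction of the caption. A minimal sketch, assuming the two noise predictions are already available as plain lists of floats (the function name and the toy numbers are hypothetical, and no real model is called here):

```python
def classifier_free_guidance(eps_cond, eps_uncond, guidance_scale):
    """Combine conditional and unconditional noise predictions.

    eps_hat = eps_uncond + guidance_scale * (eps_cond - eps_uncond)
    A scale of 1 returns the conditional prediction unchanged;
    larger scales follow the text more strongly.
    """
    return [u + guidance_scale * (c - u) for c, u in zip(eps_cond, eps_uncond)]

# Toy numbers standing in for the outputs of a denoising network.
eps_with_caption = [1.0, 2.0]
eps_without_caption = [0.0, 1.0]
guided = classifier_free_guidance(eps_with_caption, eps_without_caption, 3.0)
```

The guided prediction then replaces the plain model output at every denoising step, so no separate classifier or CLIP model is needed at sampling time.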

Thanks to our Patrons who support us in Tier 2, 3, 4: 🙏
Don Rosenthal, Dres. Trost GbR, banana.dev -- Kyle Morris, Joel Ang, Julián Salazar, Edvard Grødem

▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
🔥 Optionally, buy us a coffee to help with our Coffee Bean production! ☕
Patreon: www.patreon.com/AICoffeeBreak
Ko-fi: ko-fi.com/aicoffeebreak
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀

------------------------------------
🔗 Links:
AICoffeeBreakQuiz: youtube.com/c/AICoffeeBreak/community
Twitter: twitter.com/AICoffeeBreak
Reddit: www.reddit.com/r/AICoffeeBreak/
YouTube: youtube.com/AICoffeeBreak

#AICoffeeBreak #MsCoffeeBean #MachineLearning #AI #research

Video contains the rock emoji designed by OpenMoji – the open-source emoji and icon project. License: CC BY-SA 4.0

Music 🎵 : Tell Me That I Can't (Instr

All Comments (21)
  • @Mrbits01
    As I was about to go and generate the avocado armchair, I heard you say no avocado armchair. My disappointment is immeasurable and my day is ruined.
  • @AICoffeeBreak
    Sorry, the upload seems buggy. Re-uploading did not help. I'll wait to see if this gets better over time. Did you try turning it off and on again? 🤖
  • @LecrazyMaffe
    This video offers one of the best explanations for classifier-free guidance.
  • @CristianGarcia
    Something not stated in the video is that Diffusion Models are WAY easier to train than GANs. Although it requires you to code the forward and backward diffusion procedures, training is rather stable which is more gratifying. Might release a tutorial on training diffusion models on a toy-ish dataset in the near future :)
  • @alfcnz
    Nice high-level summary. Thanks!
  • @ElieAtik
    This is the only video that goes into how OpenAI used text/tokens in combination with the diffusion model in order to achieve such results. That was very helpful.
  • @tylerk3130
    Thank you for the first effective high-level explanation of Diffusion I've found. Truly, I do not know how I went so long in this space not knowing about your channel.
  • @OP-yw3ws
    You explained the CFG so well. I was trying to wrap my head around it for a while!
  • @jonahturner2969
    Love your channel! Cat videos get millions of views. Your videos might get in the thousands of views, but they have a huge impact by explaining high level concepts to people who can actually use them. Please keep up your exceptional work
  • I'm here to speculate Ms Coffee Bean knew the existence of DALLE 2... Convenient timing...
  • @phizc
    Wow what a difference a few months make. Dall-E 2 in April, Midjourney in July, and Stable Diffusion in August. Hi from the future 😊.
  • @samanthaqiu3416
    I love Yannic, but boy do I like your articulate presentation? I think I do
  • @r00t257
    love your video so much! lots of helpful intuition 🌻🌻💮Thanks ms. coffee bean a lot
  • @balcaenpunch
    At @3:55, in "227" the two "2s" written differently - I have never seen someone else other than myself do this! Cheers, Letitia. Great video.
  • @tripzero0
    I finally understand diffusion! (Not really but moreso than before)