Diffusion models explained. How does OpenAI's GLIDE work?
87,606 views
Published 2022-03-23
SPONSOR: Weights & Biases 👉 wandb.me/ai-coffee-break
❓ Check out our daily #MachineLearning Quiz Questions: youtube.com/c/AICoffeeBreak/community
➡️ AI Coffee Break Merch! 🛍️ aicoffeebreak.creator-spring....
Recommended videos:
📺 DALL-E video: • OpenAI's DALL-E explained. How GPT-3 ...
📺 GAN explained video: • GANs explained | Generative Adversari...
📺 CLIP video: • OpenAI’s CLIP explained! | Examples, ...
Papers:
📜 GLIDE paper: Nichol, Alex, Prafulla Dhariwal, Aditya Ramesh, Pranav Shyam, Pamela Mishkin, Bob McGrew, Ilya Sutskever, and Mark Chen. "Glide: Towards photorealistic image generation and editing with text-guided diffusion models." arXiv preprint arXiv:2112.10741 (2021). arxiv.org/abs/2112.10741
🔗 GLIDE mini, demo: huggingface.co/spaces/valhalla/glide-text2im
📜 Diffusion models for image generation: Dhariwal, Prafulla, and Alexander Nichol. "Diffusion models beat GANs on image synthesis." Advances in Neural Information Processing Systems 34 (2021). arxiv.org/abs/2105.05233
📜 Original diffusion models paper: Sohl-Dickstein, Jascha, Eric Weiss, Niru Maheswaranathan, and Surya Ganguli. "Deep unsupervised learning using nonequilibrium thermodynamics." In International Conference on Machine Learning, pp. 2256-2265. PMLR, 2015. arxiv.org/abs/1503.03585
🔗 Check out this awesome blogpost by Lilian Weng: lilianweng.github.io/lil-log/2021/07/11/diffusion-…
🔗 Flow-based models: lilianweng.github.io/lil-log/2018/10/13/flow-based…
🔗 DALL-E blog post: openai.com/blog/dall-e/
💻 If interested in the basic code of diffusion models, here is a wonderful annotated diffusion model from 🤗: huggingface.co/blog/annotated-diffusion
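For readers who want a feel for that basic code before opening the annotated post, here is a minimal NumPy sketch of the forward (noising) half of a diffusion model. It is not taken from the video or from GLIDE. The linear schedule with 1000 steps and betas from 1e-4 to 0.02 is an assumption borrowed from common DDPM setups, and the function names `make_alpha_bar` and `forward_diffuse` are hypothetical.

```python
import numpy as np

def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for an assumed linear noise schedule."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def forward_diffuse(x0, t, alpha_bar, rng):
    """Sample x_t from q(x_t | x_0) in closed form; returns (x_t, noise)."""
    noise = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise
    return x_t, noise

rng = np.random.default_rng(0)
alpha_bar = make_alpha_bar()
x0 = rng.uniform(-1.0, 1.0, size=(8, 8))  # toy stand-in for an image

# Early step: mostly signal. Last step: almost pure Gaussian noise.
x_early, noise_early = forward_diffuse(x0, 10, alpha_bar, rng)
x_late, noise_late = forward_diffuse(x0, 999, alpha_bar, rng)
```

In a typical training loop, a network would be asked to predict `noise` from `x_t` and `t`; that network is left out here to keep the sketch short.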
Outline:
00:00 Diffusion models are cool
00:33 Weights & Biases (Sponsor)
01:51 4 types of generative models (in 2022)
05:13 Diffusion models explained
08:27 Why are diffusion models good at photorealism? – Diffusion models beat GANs
10:36 GLIDE explained
12:16 Classifier-guided diffusion, CLIP-guided diffusion
13:56 Classifier-free guidance
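As a companion to the last outline item, classifier-free guidance can be sketched in a few lines. The combination rule below follows the form given in the GLIDE paper: start from the unconditional noise prediction and move further in the direction of the caption-conditioned one. Everything else is an assumption for illustration: `toy_model` is a hypothetical stand-in for a trained noise-prediction network, and the guidance scale of 3.0 is an arbitrary example value.

```python
import numpy as np

def classifier_free_guidance(eps_uncond, eps_cond, guidance_scale):
    """Extrapolate from the unconditional prediction toward the conditional one."""
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)

def toy_model(x_t, caption=None):
    """Hypothetical stand-in for a noise-prediction network (not a real model).

    A caption of None plays the role of the empty caption.
    """
    shift = 0.0 if caption is None else 0.5
    return 0.1 * x_t + shift

x_t = np.zeros((2, 2))                # toy noisy input
eps_uncond = toy_model(x_t)           # prediction without the caption
eps_cond = toy_model(x_t, "a corgi")  # prediction with the caption
guided = classifier_free_guidance(eps_uncond, eps_cond, guidance_scale=3.0)
```

A scale of 1.0 reproduces the plain conditional prediction; larger values push samples harder toward the caption, usually at some cost in diversity.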
Thanks to our Patrons who support us in Tier 2, 3, 4: 🙏
Don Rosenthal, Dres. Trost GbR, banana.dev -- Kyle Morris, Joel Ang, Julián Salazar, Edvard Grødem
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
🔥 Optionally, buy us a coffee to help with our Coffee Bean production! ☕
Patreon: www.patreon.com/AICoffeeBreak
Ko-fi: ko-fi.com/aicoffeebreak
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
------------------------------------
🔗 Links:
AICoffeeBreakQuiz: youtube.com/c/AICoffeeBreak/community
Twitter: twitter.com/AICoffeeBreak
Reddit: www.reddit.com/r/AICoffeeBreak/
YouTube: youtube.com/AICoffeeBreak
#AICoffeeBreak #MsCoffeeBean #MachineLearning #AI #research
Video contains the rock emoji designed by OpenMoji – the open-source emoji and icon project. License: CC BY-SA 4.0
Music 🎵 : Tell Me That I Can't (Instr
All Comments (21)
-
As I was about to go and generate the avocado armchair, I heard you say no avocado armchair. My disappointment is immeasurable and my day is ruined.
-
Amazing production quality! Here we go!!
-
Sorry, the upload seems buggy. Re-uploading did not help. I'll wait to see if this gets better over time. Did you try turning it off and on again? 🤖
-
This video offers one of the best explanations for classifier-free guidance.
-
Something not stated in the video is that Diffusion Models are WAY easier to train than GANs. Although they require you to code the forward and backward diffusion procedures, training is rather stable, which is more gratifying. Might release a tutorial on training diffusion models on a toy-ish dataset in the near future :)
-
Nice high-level summary. Thanks!
-
This is the only video that goes into how OpenAI used text/tokens in combination with the diffusion model in order to achieve such results. That was very helpful.
-
Thank you for the first effective high-level explanation of Diffusion I've found. Truly, I do not know how I went so long in this space not knowing about your channel.
-
I was waiting for this, Letitia. Love your channel, thank you!
-
You explained the CFG so well. I was trying to wrap my head around it for a while!
-
Love your channel! Cat videos get millions of views. Your videos might only get thousands of views, but they have a huge impact by explaining high-level concepts to people who can actually use them. Please keep up your exceptional work.
-
This was soo informative. And the humour was spot on!
-
Great explanation. Thank you.
-
I'm here to speculate that Ms. Coffee Bean knew about the existence of DALL-E 2... Convenient timing...
-
Wow, what a difference a few months make. DALL-E 2 in April, Midjourney in July, and Stable Diffusion in August. Hi from the future 😊.
-
I love Yannic, but boy do I like your articulate presentation? I think I do
-
Just found your channel yesterday and I'm loving it! Way to go !
-
Love your video so much! Lots of helpful intuition 🌻🌻💮 Thanks a lot, Ms. Coffee Bean.
-
At 3:55, in "227", the two "2"s are written differently. I have never seen anyone other than myself do this! Cheers, Letitia. Great video.
-
I finally understand diffusion! (Not really, but more so than before.)