Understanding AI from the nuts and bolts

Shared 2024-02-13
Brandon Rohrer, who obtained his Ph.D. from MIT, is driven to understand algorithms ALL the way down to their nuts and bolts, so he can make them accessible to everyone by explaining them the way HE himself would have wanted to learn them!

Please support us on Patreon for loads of exclusive content and private Discord:
patreon.com/mlst
discord.gg/aNPkGUQtc5 (public Discord)
twitter.com/MLStreetTalk

Brandon's career has seen him in Principal-level roles at Microsoft and Facebook. An educator at heart, he also shares his knowledge through detailed tutorials, courses, and his forthcoming book, "How to Train Your Robot."

Pod version: podcasters.spotify.com/pod/show/machinelearningstr…

TOC:
00:00:00 - Intro to Brandon
00:01:48 - RLHF
00:02:22 - Limitations of transformers
00:08:36 - Agency - we are all GPTs
00:10:20 - BPE / representation bias
00:13:13 - LLM adherents - true believers
00:17:55 - Brandon's Teaching
00:21:03 - ML vs Real World / Robotics
00:31:12 - Reward shaping
00:38:21 - No true Scotsman - when do we accept capabilities as real
00:40:03 - Externalism
00:44:16 - Building flexible robots
00:46:50 - Is reward enough
00:55:43 - Optimisation curse
00:59:28 - Collective intelligence
01:03:04 - Intelligence, Creativity + ChatGPT
01:26:32 - Transformers

Brandon's links:
github.com/brohrer
youtube.com/@brandonrohrer
www.linkedin.com/in/brohrer/

How transformers work:
e2eml.school/transformers

Brandon's End-to-End Machine Learning school courses, posts, and tutorials:
e2eml.school/

Free course:
end-to-end-machine-learning.teachable.com/p/comple…

Blog: e2eml.school/blog.html

Ziptie: Learning Useful Features [Brandon Rohrer]
www.brandonrohrer.com/ziptie

Comments (21)
  • @foxabilo
    I don't think I've seen so much cope this year. This pair will still be saying LLMs can't do X after they are better at everything than any human on earth. "But it's still just matrix multiplications, it can't smell"...yet.
  • @jd.8019
    I think this is a fantastic episode. I know I'm personally biased, as I was the fan who was mentioned at the end of the talk, but something to keep in mind is that Brandon has been doing this for years: the better part of two decades. As mentioned many times, the nuts and bolts are his jam, and if you are someone thinking about going into ML/data science for a career or school, this is an example of someone who has been treating ML as a long-term project, in addition to applying those frameworks to real-world problems/applications. Additionally, he's been trying to synthesize and democratize that core knowledge into bite-sized packets of information that are very understandable, even for those with a limited amount of formal mathematical training.
    Personally, I don't think I've seen an MLST episode where either Tim or his guest(s) have spent more time smiling/laughing while staying on topic. That enthusiasm is so infectious! These are complicated and serious topics, but that injection of light-hearted, fun exploration makes them seem approachable. Here's a question for us to think about: did anyone feel lost during this episode? Now, pick another MLST episode at random and ask yourself the same question. Not many of the people who appear on this channel (experts in this field, mind you) could honestly say that there wasn't a part of almost any randomly chosen episode where they experienced that confusion (of varying degree, of course).
    With that thought out of the way, I feel I could present this episode to my mother or a high school student, and they, for the most part, would be able to follow along reasonably well and have a blast while doing so; that says something! So, major kudos to Tim for a fantastic episode, and kudos to Brandon for being a part of it!
  • @Seehart
    Some good points, but I have to say I disagree with the consensus expressed in this conversation. It would be nice to include someone with the counter-hypothesis, because without that, it devolves into a strawman. LLMs as described, out of the box, are analogous to a genius with a neurological defect that prevents reflection prior to saying the first thing that comes to mind. Various LLM cluster models have addressed that weakness. Tools such as tree of thought have greatly improved the ability to solve problems which, when done by a human, require intelligence. If you want to know if these systems can reason, I recommend starting with the paper "GPT-4 can't reason". Then follow up with any of several papers and videos that utterly debunk the paper. (Edit: accidentally clicked send too soon)
  • @Self-Duality
    “So as soon as you take an action, you break your model.”
  • @paxdriver
    I love sharing this channel's new releases. Love your work, Tim.
  • @_tnk_
    Love this episode! Brandon is super eloquent and the topics discussed were illuminating.
  • The story of the robot arm reaching under the table is exactly what humans do when they hack bugs in games. Like the old ex-army gunner back in the 80s whom my managers challenged with an artillery game problem. He immediately loaded the maximum charge, shot the shell directly through the mountain, then went back to his job muttering "Stupid game" as he departed.
  • @JD-jl4yy
    16:06 You can also strip a human brain down to its neurons and find no consciousness. Unless this guy has solved the hard problem of consciousness, his statement doesn't really mean anything...
  • @XOPOIIIO
    ChatGPT is not designed to be boring; it's boring because its reward function is to predict the next token, and it's easier to predict the next token when the text is more mundane and obvious.
  • I don't think these guys know how close we are to AGI God.
  • 13:12 "I would include other sequences of other types of inputs". Yes, LLMs, are limited to texts, or linear sequence of tokens. We also think in pictures, at least 2D (a projection of 3D we see), and we can infer the 3D and have a mental model of that, even with time dimension (4D spacetime), but most of us are very bad at thinking about 4D spatial, or 5D etc. since we never can see it. But my point is, while we CAN traverse e.g. 2D images (or matrices) sequentially, pixel by pixel across, then line by line, that's not at all how we do it, so are linear sequences limiting? To be fair, we do NOT actually see the whole picture, the eyes jump around (saccades) and your mind gives the illusion you're seeing it whole, so maybe images, and thoughts in general are represented linearly? When you play music, it's linear in a sense, but for each instrument, or e.g. for each finger that plays the piano. So again, sequences are ok, at least parallel sequences. How do the text-to-image (or to-video) models work? The input is linear, but the output is 2D, and we also have the reverse process covered. Those diffusion processes start with pure random noise but I've not understood yet how each successive step to a clear picture can work, does it work linearally, because of the linear prompt? [Sort of in a way of a painter's brush.]
  • I love the quarantine vibe. The remote podcasts today give me nostalgic feelings.
  • Great interview! Brandon is one of the very top ML popularizers out there. Huge fan!
  • Rohrer describes our ability to excessively imbue things with agency. Rohrer is guilty of making this very mistake in his assumptions about human intelligence and understanding. Language models are certainly not humans, but humans are almost certainly more similar to language models than Rohrer would like to think. Much of human learning and processing doesn't play out via formal or symbolic systems, rather, we are also very skilled statistical learners. Rohrer seems immune to decades of research coming from Connectionism. There is considerable evidence that we are capable of learning not just surface features, but also abstract category structures and models via statistical learning. The extent to which these processes underlie human cognition is an open, empirical question. I do agree with Rohrer that our definitions of intelligence have been too anthropocentric. We need not use human intelligence as a lens for viewing a model. If we choose to use that lens, as Rohrer has done here, we should do so in an informed way.
  • @bgtyhnmju7
    Great chat. Brandon - you're a cool guy. I enjoy your thoughts on things, and your general enthusiasm. Cheers dudes.
  • @lionardo
    You can represent any type of input with symbols. What is your view on hybrid systems like neuro-symbolic ANNs and HD computing with ANNs?
  • @fteoOpty64
    I love the description of a car in terms of the personality it seems to possess. Yes, most who are into cars have had such an experience. In fact, I used to race mine on the race track. There were days when I did not feel like pushing very hard, but the car seemed to "want it". The result was an unexpectedly enjoyable and surprisingly exciting drive. It was the pleasant surprise I needed at the time.
  • LLMs used for coding can sometimes be dangerous and lead you down the completely wrong pathway to doing something. I asked ChatGPT, with some back and forth, to write some code to read a .wav file and run a very basic dynamic range compressor on the samples. It had no concept of the fact that .wav files are going to be PCM blocks and just assumed that the samples would be []float64. JetBrains AI Assistant was much more helpful, in my experience, and knew that you would need a library to decode the PCM blocks (and directed me to the most popular one!). It's a bit of a niche subject, but it was rather alarming to me.
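
For readers curious what that task involves, here is a minimal sketch of the compressor half of it in Go. It assumes the .wav file's PCM blocks have already been decoded by a proper WAV library (the comment above doesn't name the one the JetBrains assistant suggested, so none is used here) into []float64 samples normalized to [-1, 1]; the threshold and ratio values are placeholders.

```go
package main

import (
	"fmt"
	"math"
)

// compress applies a very basic hard-knee dynamic range compressor:
// any sample whose magnitude exceeds threshold has the excess above
// the threshold reduced by the given ratio.
func compress(samples []float64, threshold, ratio float64) {
	for i, s := range samples {
		if a := math.Abs(s); a > threshold {
			samples[i] = math.Copysign(threshold+(a-threshold)/ratio, s)
		}
	}
}

func main() {
	// Stand-in for samples decoded from a .wav file's PCM data and
	// normalized to [-1, 1]. Treating raw file bytes as floats, as the
	// commenter describes ChatGPT doing, would not produce this.
	samples := []float64{0.10, 0.60, -0.95, 0.30}
	compress(samples, 0.5, 4.0) // 4:1 compression above a 0.5 threshold
	fmt.Println(samples)        // [0.1 0.525 -0.6125 0.3]
}
```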