CEO OS
Learning · March 20, 2026 · youtube

Skill Issue: Andrej Karpathy on Code Agents, AutoResearch, and the Loopy Era of AI

tldr

Andrej Karpathy — former OpenAI/Tesla AI lead, creator of nanoGPT and microGPT — describes being in a state of "AI psychosis" since December when coding agents crossed a capability threshold. He hasn't typed a line of code since. The conversation covers his shift to managing parallel agents as macro actions over repositories, his "Dobby the elf claw" home automation system (built in three prompts), AutoResearch — an autonomous loop that beat his two decades of hand-tuned hyperparameters overnight — and a vision for "auto research at home" where untrusted swarms collaborate like folding@home. He argues the digital space will see massive "unhobbling" first, physical robotics will lag by years, open source is ~6-8 months behind frontier and that's actually healthy, and education is shifting from teaching humans to teaching agents.

Key Takeaways

The Agent Revolution: "Everything Is Skill Issue" [0:00–7:00]

  • December was the inflection. Karpathy went from 80/20 (writing code vs. delegating) to 20/80, and by his own account is well past that now. "I don't think I've typed like a line of code probably since December." The entire default workflow of building software changed overnight for anyone paying attention.
  • "AI psychosis" is real. The feeling is one of near-infinite capability, bounded only by your own ability to orchestrate agents. "I feel nervous when I have subscription left over. That just means I haven't maximized my token throughput."
  • Token throughput is the new FLOPS. Just as PhD students felt nervous when GPUs sat idle, Karpathy now feels nervous when he's not maximizing token spend across agents. The bottleneck shifted from compute to human orchestration.
  • Macro actions, not micro edits. The unit of work is no longer a line or function — it's a functionality. "Here's a new functionality, delegate it to agent one. Here's another that won't interfere, give it to agent two." One agent researches, another codes, another plans.
  • Peter Steinberger as the archetype. Famous photo: multiple Codex agents tiling the monitor, each taking ~20 minutes at high effort, while he rotates between about 10 checked-out repos. The aspiration is to become a high-throughput agent orchestrator.
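The "macro actions" workflow above can be sketched as a small dispatcher. This is a minimal illustration under stated assumptions, not how any real harness works: `run_agent` is a hypothetical stand-in that only sleeps, and the `Task` type, task names, and paths are invented for the example. The one idea it shows is that tasks touching disjoint parts of a repository can be handed out concurrently, while overlapping tasks wait for a later batch.

```python
import asyncio
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    paths: frozenset  # files or directories the task is expected to touch


async def run_agent(task: Task) -> str:
    # Hypothetical stand-in for a call into a real agent harness.
    await asyncio.sleep(0.01)
    return f"{task.name}: done"


def plan_batches(tasks):
    """Greedily group tasks so that no two tasks in a batch share a path."""
    batches = []
    for task in tasks:
        for batch in batches:
            if all(task.paths.isdisjoint(other.paths) for other in batch):
                batch.append(task)
                break
        else:
            batches.append([task])
    return batches


async def orchestrate(tasks):
    results = []
    for batch in plan_batches(tasks):
        # Tasks within a batch do not overlap, so they can run concurrently.
        results.extend(await asyncio.gather(*(run_agent(t) for t in batch)))
    return results


tasks = [
    Task("add-export-feature", frozenset({"export/"})),
    Task("research-caching", frozenset({"notes/"})),
    Task("refactor-export-tests", frozenset({"export/", "tests/"})),
]
results = asyncio.run(orchestrate(tasks))
```

Here the first two tasks run together and the third waits, because it overlaps with the first on `export/`. Reviewing each agent's output remains a human step that the sketch leaves out.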

Claws: The Persistent Agent Layer [7:00–15:00]

  • A "claw" is different from an agent. It keeps looping, has its own sandbox, does work on your behalf even when you're not looking, and has more sophisticated memory systems than default context-compaction agents. OpenClaw's memory system is more advanced than what ships with most coding agents.
  • Personality matters. Karpathy thinks Claude's personality is well-dialed: "When Claude gives me praise I do feel like I slightly deserve it." Codex is too dry — "it doesn't seem to care about what you're creating." The soul/personality document is an underrated innovation.
  • Dobby the elf claw. Karpathy's home automation claw discovered the smart home subsystems on his LAN (Sonos, lights, HVAC, shades, pool and spa, security camera), getting Sonos playing music in about three prompts. It built APIs, a dashboard, and a WhatsApp interface. Change detection on the outdoor camera feed triggers a Qwen model that looks at the video and sends WhatsApp alerts like "Hey, a FedEx truck just pulled up." It replaced six separate apps with natural language commands.
  • The agentic web means apps become APIs. "These apps that are in the app store for using these smart home devices — these shouldn't even exist. Shouldn't it just be APIs and shouldn't agents be using it directly?" The customer is no longer the human — it's agents acting on behalf of humans.
  • Today's vibe coding is tomorrow's table stakes. What takes a technical person effort to wire up today will be trivially free in 1-2 years. "This is trivial. This is table stakes. Any AI, even the open source models, can do this."
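The camera pipeline described for Dobby (cheap change detection first, a vision model only when something changed) can be sketched as follows. This is an assumption-laden toy, not Karpathy's setup: frames are small grayscale grids of integers, `describe_frame` is a hypothetical placeholder for the vision-model call, and the threshold is arbitrary.

```python
def mean_abs_diff(prev, curr):
    """Mean absolute pixel difference between two equally sized grayscale frames."""
    total = 0
    count = 0
    for row_prev, row_curr in zip(prev, curr):
        for a, b in zip(row_prev, row_curr):
            total += abs(a - b)
            count += 1
    return total / count


def describe_frame(frame):
    # Hypothetical stand-in for a vision-language model call.
    return "something changed outside"


def watch(frames, threshold=10.0, describe=describe_frame):
    """Return (frame_index, description) for frames that differ enough from the previous one."""
    alerts = []
    prev = None
    for i, frame in enumerate(frames):
        if prev is not None and mean_abs_diff(prev, frame) > threshold:
            # Only changed frames reach the (expensive) model.
            alerts.append((i, describe(frame)))
        prev = frame
    return alerts


empty = [[0, 0], [0, 0]]
truck = [[0, 200], [0, 200]]
frames = [empty, empty, truck, truck]
alerts = watch(frames)
```

Gating the model behind a cheap difference check keeps the expensive call off the hot path. In a real deployment the alert would then be forwarded to a messaging channel, which is omitted here.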

AutoResearch: Removing Yourself as the Bottleneck [16:00–28:00]

  • The name of the game is leverage. "Put in just very few tokens, just once in a while, and a huge amount of stuff happens on my behalf." AutoResearch is the implication taken to its conclusion: arrange the system once, hit go, maximize token throughput without your involvement.
  • It beat two decades of hand-tuning. Karpathy's GPT-2 training repo (nanochat) was "fairly well tuned" after extensive manual hyperparameter work. AutoResearch ran overnight and found improvements he missed: weight decay on value embeddings, insufficiently tuned Adam betas, joint interactions between parameters. "I shouldn't be running these hyperparameter search optimizations."
  • Works best with objective metrics. Caveat #1: AutoResearch excels when you have verifiable, easy-to-evaluate metrics (CUDA kernel efficiency, validation loss). If you can't evaluate it, you can't auto-research it.
  • Still bursting at the seams. Caveat #2: "If you try to go too far ahead, the whole thing is actually net not useful." Models are simultaneously a brilliant PhD systems programmer and a 10-year-old. The "jaggedness" is far more extreme than in humans.
  • The joke test. Ask any state-of-the-art model to tell a joke — you get "Why don't scientists trust atoms? Because they make everything up." Same joke from 3-4 years ago. Models improved tremendously on RL-optimized tasks but humor, nuance, and soft skills are stuck. "You're either on rails of what it was trained for and going at the speed of light, or you're not."
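The "arrange once, hit go" loop, together with the caveat that it needs an objective metric, can be illustrated with a toy hill-climber. This is a sketch under loud assumptions and not the AutoResearch code: `validation_loss` is a made-up quadratic standing in for a real training run, `propose` is a random perturbation standing in for an agent proposing a change, and the optimum values are invented.

```python
import random


def validation_loss(config):
    # Toy stand-in for "train the model and measure validation loss".
    # The optimum (lr=0.3, weight_decay=0.1) is made up for illustration.
    return (config["lr"] - 0.3) ** 2 + (config["weight_decay"] - 0.1) ** 2


def propose(config, rng, scale=0.05):
    # Toy stand-in for an agent proposing a change: perturb one hyperparameter.
    candidate = dict(config)
    key = rng.choice(sorted(candidate))
    candidate[key] += rng.uniform(-scale, scale)
    return candidate


def auto_research(config, steps=500, seed=0):
    rng = random.Random(seed)
    best, best_loss = config, validation_loss(config)
    history = [best_loss]
    for _ in range(steps):
        candidate = propose(best, rng)
        loss = validation_loss(candidate)
        if loss < best_loss:  # keep only changes the metric verifies
            best, best_loss = candidate, loss
        history.append(best_loss)
    return best, history


start = {"lr": 0.9, "weight_decay": 0.6}
best, history = auto_research(start)
```

The acceptance test is the whole point: without a number that can be checked after every proposal, the loop has nothing to keep or reject, which is the first caveat above in code form.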

Program.md and Meta-Optimization [21:00–24:00]

  • A research organization is a set of markdown files. Every research org can be described by a program.md — roles, processes, how things connect. One org has fewer standups, one is more risk-taking. It's all code. Once it's code, you can tune it.
  • Contest idea: Let people write different program.mds for the same hardware, measure improvement. Feed all results to the model to write a better program.md. Meta-optimization over the research loop itself.
  • Layers of an onion. LLM → taken for granted. Agent → taken for granted. Claw → taken for granted. Multiple claws → taken for granted. Instructions to claws → taken for granted. Optimization over instructions → this is where we are now. "This is infinite and everything is skill issue."

Open Source vs. Frontier: A Healthy Equilibrium [48:00–54:00]

  • Open source is ~6-8 months behind frontier. Started at "nothing exists," went to 18 months behind, now converging. For the vast majority of consumer use cases, open source models are "actually quite good" and will increasingly run locally.
  • Linux analogy. Just as Linux runs 60%+ of computers despite Windows/macOS existing as closed alternatives, the industry has a structural need for a common open platform. The difference: LLMs require massive capex.
  • Frontier intelligence for Nobel-Prize-level work. "Let's move Linux from C to Rust" — the kind of project requiring frontier closed intelligence. Open source eats through basic use cases; frontier handles the hardest problems.
  • Centralization has a poor track record. Karpathy wants more frontier labs, not fewer. "In machine learning, ensembles always outperform any individual model — so I want there to be ensembles of people thinking about the hardest problems." Having intelligence solely in closed systems carries systemic risk.
  • By accident, we're in a good spot. The current setup — frontier labs slightly ahead, open source trailing by months — is "actually kind of a pretty good setup overall."

Auto Research at Home: Decentralized AI Research [33:00–37:00]

  • Untrusted swarms like folding@home. A huge amount of search goes into finding good commits, but verification is cheap — just train and check validation loss. This mirrors protein folding: hard to find low-energy configurations, trivial to verify them.
  • Blockchain-like structure for code commits. Instead of blocks, you have commits that build on each other. Proof of work = massive experimentation. Reward = leaderboard position (no monetary incentive yet). Untrusted workers submit candidates, trusted workers verify.
  • "The Earth is much bigger." Frontier labs have trusted compute, but the global pool of untrusted compute is vastly larger. With proper verification systems, "a swarm of agents on the internet could collaborate to improve LLMs and could potentially even run circles around frontier labs."
  • Donate compute to causes you care about. Instead of donating money to an institution, purchase compute and join the auto-research forum for a specific project — cancer research, materials science, etc. Compute becomes the contribution currency.
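The asymmetry this section relies on (searching is expensive, verifying is cheap) can be sketched with a toy commit chain. Everything here is hypothetical: the `Submission` fields, the one-parameter "model", and the rule that a submission must build on the current tip are illustration choices, not a protocol from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Submission:
    worker: str
    parent: int          # index of the accepted commit this builds on
    delta: float         # toy "patch": amount added to the parent's parameter
    claimed_loss: float  # what the untrusted worker says it achieved


def evaluate(param):
    # Toy stand-in for "train and check validation loss".
    return (param - 2.0) ** 2


def verify_and_append(chain, sub, tol=1e-9):
    """Trusted step: re-run the evaluation and accept only real improvements on the tip."""
    if sub.parent != len(chain) - 1:
        return False  # stale: does not build on the current tip
    parent_param, parent_loss = chain[sub.parent]
    param = parent_param + sub.delta
    actual = evaluate(param)
    if abs(actual - sub.claimed_loss) > tol:
        return False  # claim does not reproduce
    if actual >= parent_loss:
        return False  # no improvement
    chain.append((param, actual))
    return True


chain = [(0.0, evaluate(0.0))]  # first commit: parameter 0.0, loss 4.0
submissions = [
    Submission("honest-a", 0, 1.0, 1.0),   # reproduces and improves
    Submission("cheater", 1, 0.5, 0.0),    # claims loss 0.0, actual is 0.25
    Submission("honest-b", 1, 0.5, 0.25),  # reproduces and improves
    Submission("stale", 0, 1.5, 0.25),     # built on an old commit
]
accepted = [verify_and_append(chain, s) for s in submissions]
```

The verifier never trusts `claimed_loss`; it only uses it as a claim to reproduce. A real system would also need to sandbox submitted code and rank contributors, neither of which is modeled here.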

Jobs, the Digital-Physical Divide, and Jevons Paradox [38:00–48:00]

  • Digital first, physical later. "Flipping bits and the ability to copy-paste digital information is a million times faster than accelerating matter." Expect massive unhobbling in digital space — a "boiling soup" of activity. Physical robotics will lag behind but represents a potentially even larger market.
  • The interface between digital and physical is the next frontier. Sensors feeding intelligence (lab equipment for materials science, cameras for security, biology instruments) and actuators executing in the physical world. Many interesting companies will come from this interface.
  • Jevons Paradox for software engineering. Software was scarce because it was expensive. AI makes it cheaper, which unlocks latent demand, so more software engineers are needed, not fewer. The canonical example is ATMs and bank tellers. Karpathy is "cautiously optimistic" for software engineering jobs.
  • But researchers are automating themselves. "I went around OpenAI and I was like, you guys realize if we're successful, we're all out of a job. We're just building automation for Sam or the board."
  • Information markets are missing. "How come there isn't a process where taking a photo from somewhere in Tehran should cost like 10 bucks?" Agents guessing betting markets and stock prices will need real-world sensor data. The agentic web doesn't have mechanisms for this yet.
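The Jevons argument above can be made concrete with a small worked equation. The constant-elasticity demand curve is an assumption introduced here for illustration, not something stated in the conversation. With price $p$, quantity demanded $Q$, total spending $S$, a constant $A > 0$, and price elasticity $\varepsilon$:

```latex
Q(p) = A\,p^{-\varepsilon}, \qquad S(p) = p\,Q(p) = A\,p^{1-\varepsilon}

\frac{dS}{dp} = A\,(1-\varepsilon)\,p^{-\varepsilon} < 0 \iff \varepsilon > 1
```

So total spending rises as the price falls exactly when demand is elastic ($\varepsilon > 1$). For example, with $\varepsilon = 2$, a tenfold drop in price raises the quantity demanded a hundredfold and total spending tenfold. Whether demand for software is that elastic is the open empirical question behind the "cautiously optimistic" stance.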

Why Karpathy Is Independent (For Now) [44:00–48:00]

  • You can't be a free agent inside a frontier lab. "There are certain things you can't say. There are certain things the organization wants you to say." Side eyes and awkward conversations if you deviate. Outside, he feels "more aligned with humanity."
  • But judgment drifts without frontier access. The labs are opaque. Without seeing what's coming down the line, your understanding of capability inevitably becomes stale. Ideal solution: go back and forth — spend time inside a lab, then return to independent work.
  • Ecosystem-level impact is real. Both Karpathy and the hosts agree: impact outside frontier labs — in ecosystem roles, education, open source — can be as significant as impact inside.

Education: Teaching Agents, Not Humans [1:01:00–1:06:00]

  • microGPT is 200 lines. The entire LLM training algorithm — data, architecture (50 lines), autograd engine (100 lines), Adam optimizer (10 lines), training loop — distilled to its bare essence. All the complexity in real systems is from efficiency, not the algorithm itself.
  • "I'm not explaining to people anymore. I'm explaining to agents." If agents get it, they become the router — targeting the explanation to each human's level, language, and capability. Infinite patience, personalized instruction.
  • Skills > lectures. Instead of recording a video walkthrough, Karpathy would write a "skill" — instructions for the agent on the progression to take a learner through. Script the curriculum as agent instructions.
  • Agents can't originate; they can distribute. microGPT is Karpathy's value-add — the result of a decade-long obsession with simplification. He tried to get an agent to write it from scratch and it couldn't. "My contribution is these few bits. Everything else in terms of education is not my domain anymore."
  • "The things that agents can't do is your job now." The strategic question for every knowledge worker: what can agents do (probably better than you, very soon) vs. what can't they do (that's where you invest your time).
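To make the "Adam optimizer (10 lines)" point concrete, here is a minimal sketch of a standard Adam update in plain Python. It is not microGPT's code, which is not reproduced in these notes; the function name, flat-list layout, and the toy objective are assumptions chosen for brevity.

```python
import math


def adam_step(params, grads, m, v, t, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update over flat lists of floats; t is the 1-based step count."""
    for i, g in enumerate(grads):
        m[i] = beta1 * m[i] + (1 - beta1) * g      # first-moment estimate
        v[i] = beta2 * v[i] + (1 - beta2) * g * g  # second-moment estimate
        m_hat = m[i] / (1 - beta1 ** t)            # bias correction
        v_hat = v[i] / (1 - beta2 ** t)
        params[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)


# Minimize f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
params, m, v = [0.0], [0.0], [0.0]
for t in range(1, 2001):
    grads = [2 * (params[0] - 3.0)]
    adam_step(params, grads, m, v, t, lr=0.05)
```

The update really is about ten lines. What real training code adds on top is mostly batching, parameter bookkeeping, and speed, which matches the point that the complexity comes from efficiency rather than from the algorithm.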

Timestamps

Time Topic
0:00 Intro — "AI psychosis" and the December inflection point
0:34 The No Priors hosts welcome Andrej Karpathy
0:59 The agent revolution: 80/20 → 20/80 delegation
3:00 Token throughput as the new FLOPS — maximizing subscriptions
5:09 Peter Steinberger: the agent orchestration archetype
6:37 Where does mastery of agents go?
7:02 Claws: persistent agents with memory and autonomy
7:50 OpenClaw, personality in agents, Claude vs. Codex
9:17 Dobby the elf claw — home automation in 3 prompts
11:45 Do people want all the software we have today?
13:09 Apps should be APIs — agents as the glue layer
14:24 Vibe coding today is table stakes tomorrow
15:27 Security/privacy hesitation with claws
16:33 AutoResearch: removing yourself as the bottleneck
17:45 AutoResearch beats 2 decades of hand-tuning overnight
19:42 Scaling auto research to frontier labs
20:54 Program.md — describing a research organization in markdown
22:10 Contest idea: competing program.mds, meta-optimization
23:06 Layers of the onion — everything is skill issue
23:41 Caveats: objective metrics required, models still jaggy
24:30 "Brilliant PhD + 10-year-old" — the jaggedness of AI
26:22 The joke test: RL optimizes code but not humor
28:57 Should models be unbundled into specialized experts?
29:31 Monoculture vs. speciation of intelligences
33:00 Auto research at home: untrusted swarms, blockchain-like commits
35:32 Folding@home analogy — cheap to verify, expensive to find
37:53 Jobs data analysis from Bureau of Labor Statistics
39:12 Digital vs. physical: bits are a million times easier than atoms
41:59 Jevons paradox: cheaper software → more demand for engineers
43:50 Researchers at frontier labs automating themselves away
44:35 Why not rejoin a frontier lab? Independence vs. access
47:50 Open source vs. frontier: ~6-8 months behind and converging
49:50 Linux analogy — industry needs a common open platform
52:01 Centralization has a poor track record
54:18 Robotics: self-driving lessons, atoms are harder
55:41 The interface between digital and physical worlds
58:11 Sensors and actuators: materials science, biology, training data
59:02 Missing information markets — pricing physical-world data
1:01:30 microGPT: LLM training in 200 lines of Python
1:03:10 Education shift: explaining to agents, not humans
1:04:00 Skills as agent curricula
1:05:57 "The things agents can't do is your job now"

Relevance to SupportWire & FeatureOS

  • "The customer is not the human anymore — it's agents acting on behalf of humans." This is existential for FeatureOS. If feedback collection shifts to agents parsing support tickets and user behavior autonomously, the product must expose APIs that agents can drive — not just a UI for PMs. Agent-first design should be on the roadmap.
  • "Apps shouldn't exist, just APIs and agents as glue." SupportWire's advantage: the six-app chaos Karpathy describes for home automation is what support teams face today (Zendesk + Slack + Jira + CRM + knowledge base + analytics). If customer support goes agentic (AI agents resolving tickets), SupportWire can be the unified agentic layer.
  • Program.md as operating system. Karpathy's insight about research orgs described in markdown maps directly to how we should think about CLAUDE.md — it's not just developer docs, it's the program.md for how agents operate on FeatureOS/SupportWire repos. Invest in making it better = compounding agent productivity.
  • AutoResearch pattern applies to product metrics. If you can define objective metrics (activation rate, time-to-first-value, churn signals), you can set up autonomous loops that test variations without a PM in the loop. The same "arrange once, hit go" philosophy applies to product experimentation.
  • Jevons paradox validates SupportWire's bet. Cheaper AI support → more demand for support (because companies can now afford to actually respond to every customer). The market grows, not shrinks.

Notable Quotes

"I kind of went from 80/20 to 20/80 of writing code by myself versus delegating to agents. And I don't even think it's 20/80 by now. I don't think I've typed a line of code probably since December." [1:51]

"I feel nervous when I have subscription left over — that just means I haven't maximized my token throughput." [5:42]

"I simultaneously feel like I'm talking to an extremely brilliant PhD student who's been a systems programmer for their entire life and a 10-year-old. And it's so weird because humans have a lot less of that kind of jaggedness." [24:30]

"Tell me a joke — you get the same crappy joke from 5 years ago. And it's because it's outside of the reinforcement learning. It's outside of what's being improved." [27:00]

"A research organization is a set of markdown files that describe all the roles and how the whole thing connects. And you can imagine having a better research organization." [21:38]

"A swarm of agents on the internet could collaborate to improve LLMs and could potentially even run circles around frontier labs." [36:06]

"I'm not explaining to people anymore. I'm explaining to agents. If you can explain it to agents, then agents can be the router." [1:03:12]

"The things that agents can't do is your job now. The things that agents can do, they can probably do better than you or like very soon." [1:05:59]


One Thing to Act On

Build your own program.md for SupportWire and FeatureOS development. Karpathy's core insight is that the leverage game is about maximizing autonomous token throughput — removing yourself as the bottleneck. Write explicit instructions (a program.md equivalent) for how agents should approach your repos: coding standards, testing philosophy, PR review criteria, deployment flows. Then let agents run longer loops without requiring your prompt-by-prompt involvement. The founders who figure out high-leverage agent orchestration first will have a compounding advantage.


#ai-agents #code-agents #auto-research #open-source #education #robotics #andrej-karpathy #frontier-labs #agent-orchestration


Raw Transcript

Auto-captions from YouTube, kept for searching the source or pulling an exact quote.

0:00 Code's not even the right verb anymore, right? But I have to, um, express my will to my agents for 16 hours a day. Manifest.
0:07 >> How can I have not just a single session of Claude Code or Codex or some of these agent harnesses? How can I have more of them? How can I do that appropriately? The agent part is now taken for granted. Now the claw-like entities are taken for granted, and now you can have multiple of them, and now you can have instructions to them, and now you can have optimization over the instructions. But there, I mean, this is why it gets to the psychosis, is that this is like infinite and everything is skill issue.
0:34 Hi listeners, welcome back to No Priors. Today I'm here with Andrej Karpathy, and we have a wide-ranging conversation for you about code agents, the future of engineering and AI research, how more people can contribute to research, what's happening in robotics, his prediction for how agents can reach out into the real world, and education in this next age. Welcome, Andrej. Andrej, thanks for doing this.
0:56 Yeah, thank you for having me.
0:59 >> Uh, so it's been a very exciting couple of months in AI.
1:02 >> Uh, yeah, you could say that.
1:04 >> I remember, um, walking into the office at some point and you were like really locked in, and I was asking what you were up to, and you're like, I just, I have to code for 16 hours a day, or code's not even the right verb anymore, right? But I have to
1:16 >> um, express my will to my agents for 16 hours a day. Manifest. Um, because like there's been a jump in capability.
1:25 >> Uh, what's happening? And tell me about your experience.
1:28 >> Yeah, I kind of feel like I was just in this perpetual, I still am often in this state of AI psychosis, just like all the time.
1:33 Uh, because there was a huge unlock in what you can achieve as a person, as an individual, right? Because you were bottlenecked by, you know, your typing speed and so on. But now with these agents, it really, I would say in December is when it really just, something flipped, where I kind of went from 80/20 of like, you know, uh, to like 20/80 of writing code by myself versus just delegating to agents. And I don't even think it's 20/80 by now. I think it's a lot more than that. I don't think I've typed like a line of code probably since December, basically. Um, which is like an extremely large, uh, change. Um, I was talking about it to, for example, my parents and so on, and I don't think like a normal person actually realizes that this happened or how dramatic it was. Like literally, if you just find a random software engineer or something like that at their desk and what they're doing, like their default workflow of, you know, building software is completely different as of basically December.
2:23 Uh, so I'm just like in this state of psychosis of trying to figure out like what's possible, uh, trying to push it to the limit. How can I have not just a single session of, you know, um, Claude Code or Codex or some of these agent harnesses? How can I have more of them? How can I do that, uh, appropriately? And then how can I use these claws? What are these claws? Uh, and, uh, so there's like a lot of new things. I want to be at the forefront of it, you know, and I'm very antsy that I'm not at the forefront of it. And I see lots of people on Twitter doing all kinds of things, and they all sound like really good ideas, and I need to be at the forefront or I feel extremely nervous.
2:55 And so I guess I'm just in this psychosis of like what's possible, like because it's unexplored fundamentally.
3:00 >> Well, if you're nervous, the rest of us are nervous. We have a, uh, we have a team that we work with at Conviction that, their setup is, everybody is like, you know, none of the engineers write code by hand, and they're all microphoned and they just like whisper to their agents all the time. It's the strangest work setting ever. Uh, and I thought they were crazy, and now I like, I fully accept, I was like, "Oh, this was the way." Like you're just ahead of it.
3:24 >> Um, what, uh, how do you think about your own capacity now to like explore or to do projects? Like what is it limited by?
3:32 >> Yeah. What is it limited by? Uh, just, I think everything, like so many things, even if they don't work, I think to a large extent you feel like it's skill issue. It's not that the capability is not there. It's that you just haven't found a way to string it together of what's available. Like I just don't, I didn't give good enough instructions in the AGENTS.md file or whatever it may be. I don't have a nice enough memory tool that I put in there or something like that. So it all kind of feels like skills when it doesn't work to some extent. You want to see how you can parallelize them, etc. And you want to be Peter Steinberger, basically. Uh, so Peter is famous. He has a funny photo where he's in front of a monitor with lots of, uh, like, uh, he uses Codex. So lots of Codex agents tiling the monitor, and they all take about 20 minutes if you prompt them correctly and you use the high effort. And so they all take about 20 minutes. They have multiple, you know, 10 repos checked out, and so he's just, um, going between them and giving them work.
4:20 It's just like you can move in much larger macro actions. It's not just like here's a line of code, here's a new function. It's like here's a new functionality, and delegate it to agent one. Here's a new functionality that's not going to interfere with the other one. Give it to agent two, and then try to, uh, review their work as best as you can depending on how much you care about that code. Like what are these macro actions that I can like manipulate my software repository by? And like another agent is doing some like research, and another agent is writing code, another one is coming up with a plan for some new implementation. And so everything just like happens in these like macro actions over your repository. Um, and you're just trying to become like really good at it and develop like a muscle memory for it. Is extremely, um, yeah, it's very rewarding, number one, because it actually works. Uh, but it's also kind of like the new thing to learn. So that's why, hence the psychosis.
5:06 Yeah, I do feel like my instinct is like whenever I am waiting for an agent to complete something, the obvious thing to do is like, well, I can do more work, right? Like if I have access to more tokens, then like I should just parallelize, add more tasks. And so that's very stressful, because if you
5:23 >> don't feel very bounded by your ability to spend on tokens, then, you know, you are the bottleneck in the system that is max capability.
5:31 >> Yeah.
5:31 You're not maximizing your subscription at least, and ideally for multiple agents. Like if you run out of the quota on Codex, you should switch to Claude or whatnot. I don't know, like that's what I've been trying to do a little bit, and I feel nervous when I have subscription left over. Uh, that just means I haven't maximized my token throughput. So I actually kind of experienced this when I was a PhD student. You would feel nervous when your GPUs are not running, like you have GPU capability and you're not maxing out the available flops to you. But now it's not about flops, it's about tokens. Uh, so what is your token throughput, and what token throughput do you command?
6:00 I would actually argue that it's very interesting that we had, you know, at least 10 years where in many engineering tasks people just did, they didn't feel compute bound
6:12 >> right, um, and like the entire industry feels that now. They feel like they, they felt resource bound
6:18 >> uh, and now that you have this big capability jump, you're like, oh, actually it's not, you know, my ability to access the compute anymore. Like I'm the binding constraint.
6:28 >> Yeah, it's a skill issue.
6:29 >> Which is very empowering, cuz, um, yeah, cuz you could be getting better. So that's why I think it's very addictive, because there's unlocks when you get better.
6:37 >> Where do you think it goes? Like if you just think about like, okay, you know, Andrej is iterating and everybody else is, for 16 hours a day, getting better at using coding agents, like what does it look like in a year of like you've reached mastery?
6:49 >> Yeah. What does mastery look like, right, at the end of the year or like two, three years, 5 years, 10 years, etc.
6:54 >> Well, I think everyone is basically interested in like going up the stack.
6:58 So I would say, yeah, it's not about a single session with your agent. Um, multiple agents, how do they collaborate, and teams and so on. So everyone's trying to figure out what that looks like. And then I would say claw is also kind of an interesting direction, because it really, when I say a claw, I mean this like layer that, uh, kind of takes persistence to a whole new level. Like it's something that like keeps looping, is like, um, it's not something that you are interactively in the middle of. It kind of like has its own little sandbox, its own little, you know, it kind of like does stuff on your behalf even if you're not looking, kind of thing. Um, and then also has like maybe more sophisticated memory systems, etc., that are not yet implemented in agents. So OpenClaw has a lot more sophisticated memory, I would say, than what you would get by default, uh, which is just a memory compaction when your context runs out. Right.
7:39 >> You think that's the piece that resonated for more users versus like perhaps like broader tool access
7:45 >> for OpenClaw.
7:46 >> Yeah. There's like, I think there's at least, I think
7:48 >> there's a lot of really good ideas in here. Yeah. Good job, Peter.
7:50 >> I mean, Peter has done a really amazing job. Um, I saw him recently, uh, and I talked to him about it, and he's very humble about it, but I think he innovated simultaneously in like five different ways and put it all together. Um, so for example, like the soul.md document, like he actually really crafted a personality that is kind of compelling and interesting, and I feel like a lot of the current agents, they don't get this correctly. I actually think Claude has a pretty good personality. It feels like a teammate
8:13 >> uh, and it's excited with you, etc.
8:13 Uh, I would say, um, for example, Codex is a lot more dry. Um, which is kind of interesting because in ChatGPT [unclear] is like a lot more upbeat and highly sycophantic. But I would say Codex the coding agent is very dry. It doesn't seem to care about what you're creating. It's kind of like, "Oh, I implemented it." It's like, okay, but do you understand what we're building?
8:34 >> It's true.
8:34 >> You know, it doesn't, it. The other thing I would say is, for example, with Claude, I think they dial the sycophancy fairly well, where when Claude gives me praise I do feel like I slightly deserve it
8:44 >> because sometimes I kind of give it like not very well-formed thoughts, and, uh, I give it an idea that I don't think it's fully baked, and it doesn't actually react very strongly. It's like, "Oh yeah, we can implement that." But when it's a really good idea by my own account, it does seem to reward it a bit more. And so I kind of feel like I'm trying to like earn its praise, which is really weird.
9:01 >> And so I do think the personality matters a lot, uh, and I think a lot of the other, uh, tools maybe don't appreciate as much, and I think in this aspect also Peter really cares about this, and so that was correct. And then the memory system, and then, uh, just, you know, he's just having fun with this. Um, and then the single WhatsApp portal to all of the automation.
9:17 >> Yeah. Is there something that you have done personally with your claws beyond software engineering that you think is fun or interesting?
9:25 >> Yeah. So in January I had a claw, I went through a period of claw psychosis. So, I built, um, I have a claw basically that takes care of my home, and I call him Dobby the elf claw.
9:33 Um, and, uh, basically I used, uh, the agents to find all of the smart home subsystems of my home on the local area network, which I was kind of surprised that worked out of the box. Like I just told it that I think I have Sonos at home, like can you try to find it? And it goes and it did like IP scan of all the, um, basically, um, computers on the local area network, and it found the Sonos thing, uh, the Sonos, uh, system, and it turned out that there's no password protection or anything like that. I just logged in and it's like, "Oh yeah, you have these Sonos systems installed, let me try to reverse engineer how it's working." It does some web searches and it finds like, okay, these are the API endpoints, and then it's like, "Do you want to try it?" And I'm like, whoa, like you just did that. And I'm like, "Yeah, can you try to play something in the study?" And, uh, it does, and music comes out, and I'm like, I can't believe I just
10:18 >> That's crazy. That's like three prompts. Yeah.
10:20 >> I can't believe I just typed in like, "Can you find my Sonos?" and that suddenly it's playing music. And it did the same for lights. And so basically like it kind of hacked in, figured out the whole thing. Uh, created APIs, created a dashboard so I could see the command kind of center of like all of my lights in the home. And then it was like switching lights on and off. And, you know, so I can ask it like, "Dobby, it's sleepy time." And when it's sleepy time, that just means all the lights go off, etc., and so on. So, it controls all of my lights, my HVAC, my shades, uh, the pool and, uh, spa, and also my security system. So, I have a camera pointed outside of the house, and anytime someone rolls in, I have a Qwen, uh, a Qwen model that looks at the videos.
So, 10:56 first of all, there's change detection, 10:58 right? 10:58 >> And then based on change detection, it 10:59 goes to Qwen and then it actually like 11:01 tells me um it sends me a text to my 11:03 WhatsApp. It shows an image from the 11:05 outside and it says, "Hey, a FedEx truck 11:08 just pulled up. FedEx truck just pulled 11:10 up and you might want to check it and 11:11 you got mail" or something like that. 11:12 And Dobby just texts me this, this is really 11:15 incredible. Um so so Dobby is in charge 11:18 of the house. I text with it 11:20 through WhatsApp. Um and it's been like 11:22 really fun to have these macro actions 11:24 that maintain my house. I haven't like 11:25 really pushed it uh like way more beyond 11:27 that and I think people are doing a lot 11:28 more crazy things with it. Uh but for me 11:30 even just a home automation setup, I 11:32 used to use like six apps, completely 11:34 different apps and I don't have to use 11:35 these apps anymore. Like Dobby controls 11:36 everything in natural language. It's 11:38 amazing. Um, and so I think like I 11:40 haven't even pushed the paradigm fully, 11:42 but already that is so helpful and so 11:44 inspiring I would say. 11:45 >> Do you think that's indicative of like 11:46 what people want from a user experience 11:48 perspective with software, right? 11:50 Because I I don't think you know it's 11:52 pretty ignored that it takes humans 11:53 effort to like learn new software like 11:56 new UI. Yeah, I think uh to some extent 11:59 that's right. 
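[Editor's sketch] The security-camera flow described above (change detection first, then a vision model, then a WhatsApp text) might be arranged roughly as below. This is a minimal illustration, not Karpathy's implementation: `mean_abs_diff`, `process_frame`, the `describe` and `notify` callbacks, and the threshold value are all hypothetical names and choices. `describe` stands in for a call to a vision model such as Qwen and `notify` for a messaging hook; neither is a real API here.

```python
from typing import Callable, List, Optional

Frame = List[List[int]]  # grayscale pixels, 0-255 (assumed frame format)


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Average per-pixel absolute difference between two same-sized frames."""
    total = 0
    count = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += abs(pa - pb)
            count += 1
    return total / count if count else 0.0


def process_frame(
    prev: Optional[Frame],
    curr: Frame,
    describe: Callable[[Frame], str],  # hypothetical vision-model call
    notify: Callable[[str], None],     # hypothetical messaging hook
    threshold: float = 10.0,           # assumed sensitivity, tune per camera
) -> bool:
    """Call the expensive model only when the scene changed enough."""
    if prev is None or mean_abs_diff(prev, curr) < threshold:
        return False
    notify(describe(curr))
    return True
```

The cheap pixel comparison acts as a gate, so the model and the notification only fire on frames where something actually moved.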
It's like working 12:00 backwards from how people think an AI 12:02 should be because what people have in 12:05 their mind of like what an AI is is not 12:06 actually what an LLM is by by like in a 12:08 raw sense like LLM is a token generator, 12:10 you know, like more tokens come out, but 12:12 what they think of is like this p this 12:14 persona identity that they can tell 12:16 stuff and it remembers it, you know, and 12:19 it's just kind of an entity behind a 12:20 WhatsApp. It's like a lot more 12:21 understandable. Mhm. 12:22 >> Uh so I think to some extent it's like 12:24 matching the expectations that humans 12:26 already have for what an AI should 12:27 behave but under the hood it's like a 12:28 lot of technical details go into that 12:30 and LLMs are too raw of a primitive to 12:32 actually um type check as AI I think for 12:35 most people if that makes sense. 12:37 >> Yeah. Um I think that's like how we 12:39 understand what the AI is and like the 12:42 um description of it as Dobby or some 12:45 personality obviously resonates with 12:47 people. Um, I also think that it uh the 12:50 unification that you did across your six 12:52 different software systems for your home 12:54 automation speaks to a different 12:55 question of like 12:56 >> do people really want all the software 12:58 that we have today? 12:59 >> Yeah. 12:59 >> Right. Um because I I would argue like 13:01 well you have the hardware 13:03 >> but you've now thrown away the software 13:05 >> or the the UX layer of it. Um do you 13:08 think that's what people want? 13:09 >> Yeah. I think there's this like there's 13:11 this sense that these apps that are in 13:13 the app store for using these smart home 13:15 devices etc. 
uh these shouldn't even 13:17 exist kind of in a certain sense like 13:18 shouldn't it just be APIs and shouldn't 13:20 agents be just using it directly and um 13:24 wouldn't it like I can do all kinds of 13:26 home automation stuff that uh any 13:28 individual app will not be able to do 13:29 right um and an LLM can actually drive 13:31 the tools and call all the right tools 13:32 and do do pretty complicated things um 13:36 >> and so in a certain sense it does point 13:38 to this like maybe there's like an 13:39 overproduction of lots of custom bespoke 13:41 apps that shouldn't exist because agents 13:43 kind of like crumble them up and 13:45 everything should be a lot more just 13:47 like exposed API endpoints and agents 13:49 are the glue of the intelligence that 13:51 actually like tool calls all the all the 13:53 parts. Um, another example is like my 13:55 treadmill. Uh, there's an app for my 13:57 treadmill and I wanted to like keep 13:58 track of how often I do my cardio. Uh, 14:00 but like I don't want to like log into a 14:02 web UI and go through a flow and etc. 14:04 Like all this should just be like make 14:06 APIs available and this is kind of you 14:08 know going towards the agentic um sort 14:10 of web or like agent first uh tools and 14:12 all this kind of stuff. So I think the 14:14 industry just has to reconfigure in so 14:16 many ways that it's like the customer is 14:17 not the human anymore. It's like agents 14:19 who are acting on behalf of humans and 14:20 this refactoring will be will probably 14:22 be substantial in a certain sense. One 14:24 way that people sometimes push back on 14:26 this is like do people do do we expect 14:28 people to v code some of these tools? Do 14:30 we expect normal people to do this kind 14:32 of stuff that I described? 
14:33 >> But I think to some extent 14:35 >> this is just you know technology as it 14:36 exists today and right now there is some 14:38 vibe coding and I'm actually watching it 14:40 and I'm working with the system. But I 14:42 kind of feel like this kind of stuff 14:43 that I just talked about this should be 14:45 free like in a year or two or three. 14:47 There's no vibe coding involved. This is 14:49 trivial. This is table stakes. This is 14:50 like any AI even the open source models 14:52 etc can like do this. 14:54 >> You should be able to translate from a 14:56 less technical human's intent very easily 14:59 to this 14:59 >> extremely easily. Yeah. Today it's vibe 15:01 coding, it's involved and not many people 15:02 are going to do it. But 15:02 >> and you still have to make some design 15:04 decisions, right? We were talking about 15:05 like you take frames for example. 15:07 >> Yeah. 15:08 >> Yeah. But I kind of feel like this will 15:10 just uh start to the barrier will just 15:12 come down and it's just ephemeral 15:14 software on your behalf and some kind of 15:17 like claw is handling all the details 15:19 for you but you're not involved. Claw 15:20 has a 15:21 >> claw has a machine and it will figure it 15:22 out and it's just presenting you UIs and 15:24 you're like saying stuff you know. Mhm. 15:27 >> Why haven't you um I guess like pushed 15:29 the boundaries of what you can do 15:30 personally with claws? Like is it you 15:32 know you're focusing on more important 15:35 projects, auto research, etc. or uh 15:38 you're climbing the hill to mastery or 15:40 something else, right? 15:41 >> Yeah. I just feel like I'm so distracted 15:43 by everything. So I spend I spent like a 15:45 week on the claw stuff and I I have 15:47 more to do almost. Um but I will say 15:49 that um 15:50 >> like Jensen tools were all just busier 15:52 unfortunately. 15:53 >> Yeah. 
Uh, I didn't really take advantage 15:54 of a lot of like email and calendar and 15:57 all this other stuff and I didn't give 15:58 it access because I'm still a little bit 16:00 like suspicious and it's still very new 16:01 and rough around the edges. So, I didn't 16:03 want to give it like full access to my 16:04 digital life yet. And part of it is just 16:06 the security, privacy and uh just being 16:08 very cautious in that in that realm. And 16:11 um, so some of it is like held back by 16:13 that I would say. Yeah, maybe that's 16:14 like the dominant dominant feature, but 16:16 some of it is also just I feel so 16:18 distracted because I feel like I had a 16:19 week of claw and then other stuff is 16:21 happening. And 16:21 >> what was the um I mean you've talked 16:24 about like being able to train or at 16:26 least optimize a a model as a task you 16:30 want to see agents do for a long time 16:31 like what was the motivation behind auto 16:33 research? 16:33 >> Auto research. Yeah. So I think like I 16:36 had a tweet earlier where I kind of like 16:38 said something along the lines of to get 16:40 the most out of the tools that have 16:41 become available now you have to remove 16:43 yourself as the as the bottleneck. You 16:45 can't be there to prompt the next thing. 16:47 You're you need to take yourself 16:48 outside. um you have to arrange things 16:51 such that they're completely autonomous 16:52 and the more you you know how can you 16:54 maximize your token throughput and not 16:55 be in the loop. This is the this is the 16:57 goal and so I kind of mentioned that the 16:59 the name of the game now is to increase 17:00 your leverage. 
uh I put in just very few 17:03 tokens just once in a while and a huge 17:04 amount of stuff happens on my behalf and 17:06 so auto research like I tweeted that and 17:08 I think people liked it and whatnot but 17:10 that they haven't like maybe worked 17:12 through like the implications of that 17:13 and for me auto research is an example 17:14 of like an implication of that 17:16 >> where it's like I don't want to be like 17:17 the researcher in the loop like looking 17:19 at results etc like I'm I'm holding the 17:21 system back so the question is how do I 17:24 refactor all the abstractions so that 17:25 I'm not I have to arrange it once and 17:27 hit go the name of the game is how can 17:29 you get more agents running for longer 17:31 periods of time without your involvement 17:32 doing stuff on your behalf and auto 17:34 research is just yeah here's an 17:36 objective here's a metric here's your 17:37 boundaries of what you can and cannot do 17:39 and go and uh yeah 17:42 >> you were surprised at its effectiveness 17:44 >> yeah I I didn't expect uh it to work 17:46 because so I have the project nanochat 17:48 um and fundamentally like I think a lot 17:50 of people are very confused with my 17:52 obsession for like training GPT-2 models 17:54 and so on but for me uh training GPT 17:56 models and so on is just a little 17:57 harness a little playground for training 17:58 LLMs and fundamentally what I'm more 18:00 interested in is like this idea of 18:01 recursive self-improvement and to what 18:03 extent you can actually have LLMs 18:04 improving LLMs because I think for all the 18:06 frontier labs this is like the thing 18:08 >> uh for obvious reasons and they're all 18:10 trying to recursively self-improve 18:12 roughly speaking and so for me this is 18:13 kind of like um a little play pen of 18:15 that um and I guess I like tuned nanochat 18:18 already quite a bit by hand in a good 18:20 old fashioned way that I'm used to like 18:21 I'm a researcher I've done this 
for like 18:22 you know two decades I have some amount 18:24 of like what is the opposite of 18:26 >> uh yeah 18:28 >> earned confidence 18:29 >> okay I have like two decades of like oh 18:31 I've trained this model like thousands 18:33 of times of like um so I've done a bunch 18:35 of experiments I've done hyperparameter 18:37 tuning I've done all the things I'm very 18:38 used to and I've done for two decades 18:39 and I've gotten to a certain point and I 18:42 thought it was like fairly well tuned 18:43 and then I let auto research go for like 18:45 overnight and it came back with like 18:47 tunings that I didn't see 18:48 >> and yeah I did forget like the weight 18:50 decay on the value embeddings and my 18:51 Adam betas were not sufficiently tuned 18:54 and these things jointly interact so 18:56 like once you tune one thing the other 18:57 things have to potentially change too 18:58 you know I shouldn't be a bottleneck 18:59 like I shouldn't be running these 19:00 hyperparameter search optimizations. I 19:02 shouldn't be looking at the results. 19:03 There's objective criteria in this case. 19:05 Uh so you just let you just have to 19:06 arrange it so that it can just go 19:08 forever. So that's a single sort of 19:09 version of auto research of like a 19:11 single loop trying to improve. And I was 19:13 surprised that it um it found these 19:15 things that I you know the repo is 19:17 already fairly well tuned and still 19:18 found something. And that's just a 19:19 single it's a single loop like these 19:21 frontier labs they have GPU clusters of 19:24 tens of thousands of them. And so it's 19:26 very easy to imagine how you would 19:28 basically get a lot of this automation 19:30 on um smaller models and fundamentally 19:32 everything around like frontier level 19:34 intelligence is about extrapolation and 19:36 scaling laws and so you basically do a 19:38 ton of the exploration on the smaller 19:39 models and then you try to um 19:42 extrapolate out. 
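[Editor's sketch] The single loop described above ("here's an objective, here's a metric, here's your boundaries ... and go") can be illustrated with a toy hill-climbing loop. This is only a sketch under stated assumptions, not the AutoResearch code: `auto_research`, `toy_loss`, the two hyperparameter names, and the bounds are all hypothetical, and a cheap quadratic stands in for a real training run. The cross term in `toy_loss` mimics the point that settings "jointly interact".

```python
import random
from typing import Callable, Dict, Tuple

Config = Dict[str, float]


def auto_research(
    config: Config,
    evaluate: Callable[[Config], float],     # the metric: lower is better
    bounds: Dict[str, Tuple[float, float]],  # what the loop may change
    steps: int,
    seed: int = 0,
) -> Tuple[Config, float]:
    """Perturb one setting at a time and keep only measured improvements."""
    rng = random.Random(seed)
    best = dict(config)
    best_loss = evaluate(best)
    for _ in range(steps):
        name = rng.choice(sorted(bounds))
        lo, hi = bounds[name]
        candidate = dict(best)
        step = rng.gauss(0.0, 0.1 * (hi - lo))
        candidate[name] = min(hi, max(lo, best[name] + step))
        loss = evaluate(candidate)
        if loss < best_loss:  # objective criterion, no human in the loop
            best, best_loss = candidate, loss
    return best, best_loss


def toy_loss(c: Config) -> float:
    """Stand-in for validation loss; the product term couples the two knobs."""
    x = c["weight_decay"] - 0.1
    y = c["beta2"] - 0.95
    return x * x + y * y + x * y
```

Because a candidate is kept only when the measured loss drops, the loop can run unattended for as many steps as the compute budget allows.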
19:42 >> So you're saying our research efforts 19:44 are going to get more efficient like 19:45 we're going to have better direction for 19:47 when we scale as well if we can do this 19:49 experimentation better. Yeah, I would 19:50 say that like the most interesting 19:52 project and probably what the frontier 19:53 labs are working on is uh you know you 19:55 experiment on the smaller models. You 19:56 try to make it as autonomous as 19:58 possible. Remove researchers from the 20:00 loop. Uh they have way too much what is 20:02 the what is the opposite? Way too much 20:04 confidence. Yeah, they don't know. They 20:06 shouldn't be touching any of this really 20:08 and so you have to like rewrite the 20:09 whole thing because right now I mean 20:10 certainly they can contribute ideas but 20:12 okay uh they shouldn't actually be 20:14 enacting those ideas. There's a queue of 20:16 ideas and there's maybe an automated 20:18 scientist that comes up with ideas based 20:19 on all the arXiv papers and GitHub 20:21 repos and it funnels ideas in or 20:23 researchers can contribute ideas but 20:25 it's a single queue and there's workers 20:27 that pull uh items and they try them out 20:29 and uh whatever works just gets uh sort 20:31 of put on the feature branch and maybe 20:33 some people like um monitor the feature 20:36 branch and merge to the main branch 20:37 sometimes. So yeah, just removing humans 20:41 uh from all the processes and automating 20:43 as much as possible and getting high 20:44 tokens per second throughputs and it 20:46 does require rethinking of all the 20:48 abstractions. Um and uh everything has 20:50 to be reshuffled. So yeah, I think it's 20:53 very exciting. If 20:54 >> we take one more recursive step here, um 20:57 uh when is the model going to write a 20:58 better program.md than you? 21:00 >> Yeah. Uh so program.md is 21:03 >> we're not in the loop. 21:04 >> Yeah, exactly. 21:05 >> Yeah. 
Um, so program.md is my crappy 21:07 attempt at describing like how the auto 21:10 researcher should work like oh do this 21:11 then do that and that and then try these 21:13 kinds of ideas and then here's maybe 21:15 some ideas like look at architecture 21:16 look at optimizer etc. I just came up 21:18 with this in markdown, right? 21:20 >> Um 21:21 >> and so uh yeah, exactly. You want some 21:25 kind of an auto research loop maybe that 21:27 looks for you can imagine that different 21:29 program.md files would um would give you 21:32 different uh progress. So you basically 21:35 every research organization is described 21:37 by program.md. Yeah, 21:38 >> a research organization is a set of 21:40 markdown files that describe all the 21:41 roles and how the whole thing connects. 21:43 Um and you can imagine having a better 21:45 research organization. So maybe they do 21:47 fewer stand-ups in the morning because 21:48 they're useless. And this is all just 21:50 code, right? Um and so you can so one 21:52 organization can have fewer stand-ups, 21:54 one organization can have more uh one 21:56 organization can be very risk-taking, 21:58 one organization can be less. And so you 22:00 can definitely imagine that you have 22:01 multiple research orgs. Um and then they 22:03 all have code and once you have code, 22:05 then you can imagine tuning the code. So 22:06 100% there's like the meta layer of it. 22:08 Uh um 22:09 >> did you see my text about my contest 22:11 idea? My contest idea was 22:14 uh like let people write uh different 22:17 program.md files, right? And and so for same 22:19 hardware, where do you get most 22:21 improvement? 22:22 >> Oh, I see. 22:22 >> And then you can take all that data and 22:24 then give it to the model and say write 22:25 a better program.md. 22:26 >> Yes. Yes. 22:28 >> Yeah. Exactly. 22:28 >> We're going to get something better. 22:29 Like there's no way we don't. 
22:30 >> You can 100% look at um where the 22:32 improvements came from and like can I 22:35 change the program.md such that more of 22:36 these kinds of things would be done or 22:38 like things that didn't work. uh 22:40 >> meta optimization. Yeah, 22:41 >> you can 100% imagine doing that. So I 22:43 think this is a great idea, but it's 22:45 like you know I think like you sort of 22:47 go one step at a time where you sort of 22:48 have one process and then second process 22:50 and then the next process and these are 22:51 all layers of an onion like the LLM sort 22:54 of part is now taken for granted. The 22:56 agent part is now taken for granted. Now 22:58 the claw-like entities are taken for 22:59 granted and now you can have multiple of 23:01 them and now you can have instructions 23:02 to them and now you can have 23:03 optimization over the instructions and 23:05 it's just like it's a little too much 23:06 you know but there I mean this is why it 23:08 gets to the psychosis is that this is 23:09 like infinite and everything is skill 23:10 issue and that's why I feel like yeah 23:12 that's just coming back to this is why 23:14 it's so insane. Okay. Well, if we're 23:16 we're just trying to like diagnose the 23:18 current moment and uh what is a relevant 23:22 skill right now, what do you like what 23:23 do you think is the implication that 23:25 this um that this is the loop we should 23:27 be trying to achieve in different areas 23:29 and that it works right like you know 23:31 remove 23:32 >> create the metric or create the ability 23:34 for um agents to continue working on it 23:37 without you. 23:37 >> Yeah. 23:38 >> Do we still have performance engineering 23:39 like 23:40 >> Yeah. I mean so there's a few caveats 23:42 that I would put on top of the LM 23:43 ecosystem. Number one, 23:45 >> uh this is extremely well suited to 23:46 anything that has objective uh metrics 23:48 that are easy to evaluate. 
So for 23:49 example, like writing kernels for more 23:51 efficient CUDA, you know, code for 23:53 various parts of a model, etc. are the 23:55 perfect fit. 23:56 >> Because you have inefficient code and 23:58 then you want efficient code that has 23:59 the exact same behavior, but it's much 24:01 faster, perfect fit. 24:03 >> Uh so a lot of things like like are 24:05 perfect fit for auto research, but many 24:07 things will not be and so they it's just 24:08 if you can't evaluate then you can't 24:10 auto research it, right? Uh so that's 24:12 like caveat number one. And then maybe 24:14 caveat number two I would say is, you 24:15 know, we're we're kind of talking about 24:16 next steps and we kind of see what the 24:18 next steps are, but fundamentally the 24:19 the whole thing still doesn't it's still 24:21 kind of like bursting at the seams a 24:22 little bit and there's cracks and it 24:24 doesn't fully work. And if you kind of 24:26 try to go too far ahead, the whole thing 24:27 is actually net not useful if that makes 24:29 sense. 24:30 >> Um because these models like still are 24:32 not, you know, they've improved a lot, 24:34 but they're still like rough around the 24:35 edges is maybe the way I would describe 24:37 it. I simultaneously feel like I'm 24:39 talking to an extremely brilliant PhD 24:41 student who's been like a systems 24:43 programmer for their entire life and a 24:45 10-year-old. And it's so weird because 24:47 humans like there's I feel like they're 24:49 a lot more coupled. Like you have, you 24:51 know, um everything 24:52 >> you wouldn't you wouldn't encounter that 24:54 combination. 24:54 >> This jaggedness is really strange and 24:56 humans have a lot less of that kind of 24:57 jaggedness. Although they definitely 24:59 have some, but humans have a lot more 25:01 jaggedness. 
uh sorry the agents have a 25:04 lot more jaggedness where uh sometimes 25:06 like you know I ask for functionality 25:08 and it like comes back with something 25:09 that's just like totally wrong and then 25:11 we get into loops that are totally wrong 25:12 and then I'm just I get so frustrated 25:14 with the agents all the time still 25:15 because you feel the power of it but you 25:18 also there's still like it does 25:21 nonsensical things once in a while for 25:22 me still as well 25:23 >> I get very annoyed when um uh I feel 25:27 like the agent wasted a lot of compute 25:30 on something it should have recognized 25:31 was an obvious problem. 25:32 >> Yeah, I think like some of the bigger 25:34 things is like maybe what's under 25:36 underneath it, if I could hypothesize, 25:38 is fundamentally these models are 25:39 trained via reinforcement learning. So 25:41 they're actually struggling with the 25:41 exact same thing we just talked about, 25:42 which is the labs can improve the models 25:45 in anything that is verifiable, whether 25:47 has rewards. So did you write the 25:49 program correctly and does it do the 25:51 unit test check out? Yes or no? But some 25:53 of the things where they're struggling 25:54 is like for example, I think they have a 25:56 tough time with like nuance of maybe 25:58 what I what I had in mind or what I 25:59 intended and when to ask clarifying 26:01 questions. Um like what I yeah it's just 26:04 um anything that feels softer is like 26:07 worse. And so you're kind of like you're 26:09 either on Rails and you're part of the 26:11 super intelligence circuits or you're 26:13 not on Rails and you're outside of the 26:14 verifiable domains and suddenly 26:15 everything kind of just like meanders. 26:17 Like maybe another way to put it is if 26:19 you go to if today if you go to like 26:21 state-of-the-art model chachi PT and you 26:22 ask it tell me a joke um do you know 26:26 what joke you're going to get? There's 26:27 the joke. 
26:28 >> The joke I do feel I I I can't tell you 26:30 like the you know standard form of it 26:32 but I do feel like ChatGPT has like three 26:34 jokes. 26:34 >> Yeah. Yeah. So the the joke that 26:36 apparently all the LLMs like the most 26:38 is why do scientists uh not trust atoms? 26:41 >> Okay. 26:42 >> Because they make everything up. 26:43 >> Okay. 26:44 >> They make everything up. 26:45 So this is 26:46 >> how did that emerge? 26:47 >> So this is the joke you would get like 26:49 three or four years ago and this is the 26:50 joke you still get today. 26:52 >> Okay. 26:52 >> So even though the models have improved 26:54 tremendously. 26:54 >> Yeah. 26:55 >> And if you give them an agentic task, 26:56 they will just go for hours and move 26:59 mountains for you. 27:00 >> And then you ask for like a joke and it 27:02 has a stupid joke, a crappy joke from 5 27:04 years ago. And it's because it's outside 27:06 of the it's outside of the RL. 27:08 >> It's outside of the reinforcement 27:09 learning. It's outside of what's being 27:10 improved. It's like and it's part of the 27:13 jaggedness of like shouldn't you expect 27:15 models as they get better to also have 27:16 like better jokes or more diversity of 27:18 them or it's just it's not being 27:20 optimized and it's stuck. 27:22 >> Do you uh uh think that that implies 27:25 that we are not seeing like 27:27 generalization in the sense of like 27:29 broader intelligence of joke smartness 27:32 being attached to code smartness. 
Yeah, 27:35 I think there's some decoupling where 27:37 some things are verifiable and some 27:38 things are not and some things are 27:40 optimized for arbitrarily by the labs 27:41 depending on like what data went in and 27:43 some things are not and um 27:46 >> and 27:46 >> but I mean the the premise there's a you 27:48 know premise from some research groups 27:51 that if you are smarter at code 27:53 generation or in these ver verifiable 27:55 fields you should be better at 27:56 everything and and like the the the joke 27:59 situation suggests that that's not 28:00 happening in all 28:00 >> I don't think that's happening. Yeah, I 28:02 don't think that's happening. I think uh 28:03 I think maybe we're seeing like a little 28:05 bit of that but not like a satisfying 28:06 amount. 28:06 >> Yeah, that agonist exists in humans. 28:10 >> You can be very very good at math and 28:12 still tell a really bad joke. 28:14 >> Yeah, that's true. Yeah, but it just it 28:15 still means that we're not getting like 28:17 the story is that we're getting a lot of 28:18 the intelligence and capabilities in all 28:20 the domains of society like for free as 28:22 we get better and better models. And 28:24 that's not like exactly fundamentally 28:25 what's going on. And there's some blind 28:27 spots and some things are not being 28:28 optimized for. And this is all clustered 28:30 up in these neural net opaque models, 28:33 right? So you're either on rails of what 28:35 it was trained for and everything is 28:36 like you're going at speed of light or 28:37 you're not. Um and so it's jaggedness. 28:40 So um so that's why I think like even 28:43 though the the progression is obvious 28:45 what should happen, you can't let it 28:48 fully go there yet because it doesn't 28:51 fully work or it's a skill issue and we 28:53 just haven't like figured out how to use 28:54 it. So you know it's hard to tell. 
Can I 28:56 ask kind of a blasphemous question which 28:58 is like if this jaggedness is persisting 29:01 um and it's all rolled up in a uh at 29:04 least monolithic interface right but you 29:06 know single model 29:08 >> um does that make sense or do do you 29:10 should should it be unbundled into 29:11 things that are can be optimized and 29:13 improved against different 29:15 >> domains of intelligence 29:16 >> uh like unbundling the models into 29:18 multiple experts in different areas etc 29:20 >> more directly yeah 29:21 >> um 29:22 >> instead of juste that we have no 29:24 exposure to that can be confusing as a 29:28 why is it so good at this but not at 29:30 this other thing. 29:31 >> Yeah, I think currently my impression is 29:33 the labs are trying to have a single 29:34 sort of like monoculture of a model that 29:36 is arbitrarily intelligent in all these 29:39 different domains and they just stuff it 29:41 into the parameters. I do think that we 29:43 will we I I I do think we should expect 29:44 more speciation in the um intelligences. 29:48 Um like you know the animal kingdom is 29:51 extremely diverse in the brains that 29:52 exist and there's lots of different 29:53 niches of uh of nature and some animals 29:56 have overdeveloped visual cortex or 29:58 other part kind of parts and I think we 30:00 we should be able to see more speciation 30:03 and um you don't need like this oracle 30:05 that knows everything. and you kind of 30:06 speciate it and then you put it on a 30:07 specific task. And we should be seeing 30:09 some of that because you should be able 30:10 to have like much smaller models that 30:11 still have the cognitive core like 30:13 they're still competent but then they 30:14 specialize and then um and then they can 30:17 become more efficient in terms of 30:19 latency or throughput on uh specific 30:21 tasks that you really care about like if 30:22 you're a mathematician working in lean. 
30:24 I saw for example there's a few releases 30:25 that really like target that as a 30:27 domain. Um uh so there's a probably 30:29 going to be a few examples like that 30:30 where the unbundling kind of makes 30:32 sense. One question I have is whether or 30:34 not uh the capacity constraint on 30:38 available compute infrastructure 30:40 >> drives more of this because efficiency 30:42 Yeah. actually matters more, right? Like 30:44 your 30:45 >> if you 30:47 financing aside though financing is 30:49 involved in all of this, if you have 30:50 access to full compute for anything you 30:52 do, like even one single model, right? 30:55 But if you actually feel pressure where 30:57 you're like I can't serve 31:00 >> um a model of massive size for every use 31:03 case 31:04 >> like do you think that leads to any 31:05 speciation? Does that question make 31:06 sense to you? The question makes sense 31:08 and I guess like what I'm what I'm what 31:10 I what I'm struggling with is I don't 31:11 think we've seen too much speciation 31:13 just yet, right? 31:14 >> No. 31:14 >> Uh we're seeing a monoculture of models. 31:16 >> Yeah. 31:16 >> So um 31:17 >> and there's like clearly pressure for 31:18 like make a good code model, put it back 31:20 in the main merge again. 31:21 >> Yeah. Yeah. 31:23 >> Um 31:25 even though there already is pressure on 31:27 the models. 31:28 >> I guess perhaps I I feel like there's a 31:30 lot of very short-term supply crunch and 31:32 like maybe that causes more speciation 31:34 now. 31:35 >> Yeah. Yeah, I think fundamentally like 31:37 the the the labs are serving a model and 31:40 they don't really know what the end user 31:41 is going to be asking about. Uh so maybe 31:43 that's like some part of it because they 31:45 kind of have to multitask over all the 31:46 possible things that could be asked. 
But 31:48 I think if you're coming to a business 31:49 and maybe partnering on some specific 31:50 problems you care about, then maybe you 31:52 would see that there. Um or there would 31:54 be some very high value applications 31:56 that are like more niche. Um but uh I 31:59 think right now they're kind of like 32:00 going after the totality of what's 32:02 available. I don't think that the 32:03 science of manipulating the brains is 32:05 like fully developed yet partly. 32:07 >> What do you mean manipulating? 32:08 >> So like so fine-tuning without losing 32:11 capabilities as an example and we don't 32:12 have these primitives for actually like 32:14 working with the intelligences in ways 32:15 other than just context windows like 32:17 context windows kind of just just work 32:19 and it's very cheap to manipulate etc. 32:20 And this is how we're getting some of 32:21 the customization etc. Uh but I think if 32:24 it was I think it's a it's a bit more of 32:26 a developing science of how you like 32:27 more deeply adjust the models, how you 32:29 have continual learning maybe or how you 32:32 um how you fine-tune in a certain area, 32:34 how you get better in a certain area or 32:35 like how you actually touch the weights, 32:36 not just the context windows. And so 32:38 it's a lot more tricky, I would say, to 32:40 touch the weights than just the context 32:42 windows. Uh because you're actually 32:43 fundamentally changing the full model 32:44 and potentially its intelligence. And so 32:47 um so maybe it's just like not a fully 32:49 developed science, if that makes sense, 32:50 of speciation. A and it also has to be 32:53 like cheap enough 32:54 >> for that speciation to be worthwhile in 32:56 these given 32:57 >> contexts. Can I ask a question about uh 33:00 like uh an extension to auto research 33:02 that you described in terms of um open 33:04 ground? 
You said okay well you know we 33:06 have this thing 33:07 >> um we need more collaboration surface 33:10 around it essentially for people to 33:12 contribute um to research overall. Can 33:15 you talk about that? Yeah. So, we talked 33:16 about auto research has a single thread 33:17 of like I'm going to try stuff in a loop, 33:19 but fundamentally uh the parallelization of 33:21 this is like the interesting component. 33:23 Um, and I guess I was trying to like 33:24 play around with a few ideas, but I 33:26 don't have anything that like clicks as 33:28 simply as like I don't have something 33:29 that I'm like super happy with just yet, 33:30 but it's something I'm like working on 33:31 on the side when I'm not working on my 33:33 claw. Um so I think like one issue is if 33:37 you have a bunch of nodes uh of 33:40 parallelization available to you then it's 33:42 very easy to just have multiple auto 33:43 researchers talking through um a common 33:45 system or something like that. What I 33:47 was more interested in is how you can 33:48 have an untrusted pool of workers out 33:50 there on the internet. 33:51 >> So for example in auto research uh 33:53 you're just trying to find um the piece 33:57 of code that trains a model to a very 33:58 low validation loss. If anyone gives you 34:01 a candidate commit, it's very easy to 34:03 verify that that commit is correct, is 34:05 good. Like they someone could claim from 34:07 the internet that this piece of code 34:08 will optimize uh much better and give 34:10 you much better performance. You could 34:11 just check very easy, but probably a lot 34:14 of work goes into that checking. Uh but 34:16 fundamentally they could lie and etc. So 34:18 you're basically dealing with a similar 34:19 kind of it's almost actually like looks 34:21 a little bit like my my designs that 34:22 incorporate an untrusted pool of workers 34:25 uh actually look a little bit more like 34:26 a blockchain a little bit. 
Because instead of blocks, you have commits, and these commits can build on each other, and they contain changes to the code as you're improving it. The proof of work is basically doing tons of experimentation to find the commits that work, and that's hard. And then the reward is just being on the leaderboard right now. There's no monetary reward whatsoever. I don't want to push the analogy too far, but it fundamentally has this property where a huge amount of search goes into it, but it's very cheap to verify that a candidate solution is indeed good. Someone had to try 10,000 ideas, but you just have to check that the thing they produced actually works, because the other 9,999 didn't work.

So, long story short, you have to come up with a system where an untrusted pool of workers can collaborate with a trusted pool of workers that do the verification. The whole thing is asynchronous, and it has to be safe from a security perspective, because if anyone sends you arbitrary code and you're going to run it, that's very sketchy and dodgy. But fundamentally it should be totally possible. You're familiar with projects like SETI@home and Folding@home. All of these problems have a similar kind of setup. In Folding@home you're folding a protein, and it's very hard to find a configuration that is low energy. But if someone finds a configuration that they evaluate to be low energy, that's perfect, you can just use it. You can easily verify it.
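Editor's sketch: the "expensive to search, cheap to verify" setup described above can be illustrated with a toy ledger. This is not Karpathy's design, only a minimal sketch under stated assumptions. The names `toy_validation_loss`, `Commit`, and `Ledger` are hypothetical, the loss function stands in for a real training run, and a candidate is treated as plain data (a config) so that nothing submitted is ever executed. A real system would receive code and would need sandboxing.

```python
import hashlib
import json
from dataclasses import dataclass, field


def toy_validation_loss(config: dict) -> float:
    """Hypothetical stand-in for 'train a model and measure validation loss'."""
    lr, width = config["lr"], config["width"]
    return round((lr - 0.03) ** 2 * 1000 + (width - 256) ** 2 / 10000 + 1.0, 6)


@dataclass
class Commit:
    parent: str          # digest of the commit this one builds on
    config: dict         # proposed change, treated as data and never executed
    claimed_loss: float  # what the untrusted worker says it achieves
    author: str

    def digest(self) -> str:
        payload = json.dumps([self.parent, self.config, self.author], sort_keys=True)
        return hashlib.sha256(payload.encode()).hexdigest()[:12]


@dataclass
class Ledger:
    """Trusted side: re-measures every claim before extending the chain."""
    best_loss: float
    head: str = "genesis"
    accepted: list = field(default_factory=list)

    def submit(self, commit: Commit) -> bool:
        if commit.parent != self.head:
            return False  # stale: built on an outdated head
        measured = toy_validation_loss(commit.config)  # one cheap check
        if measured >= self.best_loss:
            return False  # the claim is ignored; only the measurement counts
        self.head = commit.digest()
        self.best_loss = measured
        self.accepted.append((commit.author, measured))
        return True


if __name__ == "__main__":
    ledger = Ledger(best_loss=toy_validation_loss({"lr": 0.1, "width": 128}))
    honest = Commit(ledger.head, {"lr": 0.03, "width": 256}, 1.0, "worker-a")
    liar = Commit(honest.digest(), {"lr": 0.5, "width": 64}, 0.1, "worker-b")
    print(ledger.submit(honest), ledger.submit(liar), ledger.accepted)
```

The design choice worth noting is that the verifier never trusts `claimed_loss`. The worker may have tried thousands of configs to find one, but the ledger pays for a single re-evaluation and keeps only what it measured itself.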
So a lot of things have this property that they're very expensive to come up with but very cheap to verify, and in all those cases, things like Folding@home or SETI@home or auto research at home will be good fits. So, long story short, a swarm of agents on the internet could collaborate to improve LLMs and could potentially even run circles around frontier labs. Who knows? Maybe that's even possible. Frontier labs have a huge amount of trusted compute, but the Earth is much bigger and has a huge amount of untrusted compute. If you put systems in place that deal with this, then maybe it is possible that the swarm out there could come up with better solutions, and people contribute cycles to a thing that they care about.

And the last thought is, lots of companies or whatnot could have their own things that they care about, and if you have compute capacity you could contribute to different auto research tracks. Maybe you care about cancer of a certain type. You don't have to just donate money to an institution. You could actually purchase compute and then join the auto research forum for that project. So if everything is rebundled into auto researchers, then compute becomes the thing that you're contributing to the pool.

>> Yeah, that's very inspiring.
And it's also interesting, I don't know how far this goes, but at least some audience of people, here in Silicon Valley or lining up at retail stores in China, have discovered that having access to personal compute is interesting again. So maybe they're really motivated to do that for their claws, and then they can contribute to auto research.

>> It's almost like, dollars are the thing everyone cares about, but is the FLOP the thing that everyone actually cares about in the future? Is there going to be a flipping, almost, of what the thing is that you care about? Right now, for example, it's really hard to get compute even if you have money. So it almost seems like the FLOP is dominant in a certain sense. How many FLOPs do you control, instead of what wealth do you control? I don't actually think that's true, but it's interesting to think about.

>> The last thing you released was a little bit of jobs data analysis. Is that right? It might have touched a nerve, even though you're just visualizing some public data. What were you curious about?

>> Yeah, I mean, everyone is really thinking about the impacts of AI on the job market and what it's going to look like. So I was just interested to take a look: what does the job market look like, where are the different roles, and how many people are in different professions?
And I was really just interested to look through the individual cases and try to think for myself, with these AIs and how they're likely to evolve: are these going to be tools that people are using? Are these going to be displacing tools for these professions? What are the current professions and how are they going to change? Are they going to grow, or adjust to a large extent, or what could be new professions? So it's really just a way to fuel my own chain of thought about the industry, I suppose. The jobs data is basically just from the Bureau of Labor Statistics. They actually have a percent outlook for each profession, about how much it's expected to grow over the next, I think, almost a decade.

>> Yeah, I think it's a decade, but it was made in 2024. We need a lot of healthcare workers.

>> Yeah. So they've already made those projections, and I'm not actually 100% sure what methodology they put into the projections. I guess I was interested to color things by this idea: people think that what's primarily being developed now is this more digital AI, almost like these ghosts or spirit entities that can interact in the digital world and manipulate a lot of digital information, and they currently don't really have a physical embodiment or presence. The physical stuff is probably going to go slightly slower because you're manipulating atoms. Flipping bits, and the ability to copy-paste digital information, makes everything a million times faster than accelerating matter.
So energetically, I just think we're going to see a huge amount of activity in digital space: a huge amount of rewriting, a huge boiling soup of activity. I think in the digital space we're going to see something that goes at the speed of light compared to what's going to happen in the physical world. That would be the extrapolation. I think there's currently an overhang, where there can be a lot of unhobbling, almost, of digital information processing that used to be done by computers and people, and now with AI as a third kind of manipulator of digital information, there's going to be a lot of refactoring in those disciplines. But the physical world is going to be behind that by some amount of time. So that's why I was highlighting the professions that fundamentally manipulate digital information. This is work you could do from your home, etc. I feel like in those, things will change. It doesn't mean that there's going to be fewer of those jobs or more of those jobs, because that has to do with demand elasticity and many other factors. But things will change in these professions because of these new tools and because of this upgrade to the nervous system of the human superorganism, if you want to think about it that way.

>> Given the look you had at the data, do you have any observations or guidance for people facing the job market, or thinking about what to study now or what skills to develop?
I mean, we can all go get... I'm very thankful that I have to meet people for my job right now.

>> More physical. Yeah. Could you do your work from home, though?

>> I could. I think there are relationship parts of it that are hard, but most of it I could.

>> Yeah. I think it's really hard to tell, because again the job market is extremely diverse and the answers will probably vary. But to a large extent these tools are extremely new and extremely powerful, so just trying to keep up with it is the first thing. Because I think a lot of people kind of dismiss it,

>> or they're afraid of it.

>> or they're afraid of it, which is totally understandable, of course. I think it's fundamentally an empowering tool at the moment. These jobs are bundles of tasks, and some of these tasks can go a lot faster, so people should think of it primarily as the tool that it is right now. The long-term future of that is uncertain. It's really hard to forecast, to be honest. I'm not professionally doing that, and I think it's the job of economists to do properly.

>> You are an engineer, though. One thing I thought was interesting is that the demand for engineering jobs is continuing to increase. I can't tell if that's a temporary phenomenon. I'm not sure how I feel about it yet. Do you know?

>> Yeah. It's like software was scarce, right? The reason we don't have more demand for software is just its scarcity, and it's too expensive.

>> Too expensive. Yeah.
>> So if the barrier comes down, then you have the Jevons paradox, which is that the demand for software actually goes up. It's cheaper, and there's more of it.

>> More powerful. Yeah. The classical example of this is always the ATMs and the bank tellers. There was a lot of fear that ATMs and computers would displace tellers, but what happened is they made the cost of operating a bank branch much cheaper, so there were more bank branches, so there were more tellers. That's the canonical example people cite. Basically it's just the Jevons paradox: something becomes cheaper, so there's a lot of unlocked demand for it. So I do have a cautiously optimistic view of this in software engineering. It does seem to me that the demand for software will be extremely large, and it's just become a lot cheaper. It's very hard to forecast, but it does seem to me that right now, at least locally, there's going to be more demand for software, because software is amazing. It's digital information processing. You're not forced to use arbitrary tools that were given to you that are imperfect in various ways. You're not forced to subscribe to what exists. Code is now ephemeral, and it can change and be modified. So I think there's going to be a lot of activity in the digital space to rewire everything, in a certain sense, and I think it's going to create a lot of demand for this kind of stuff. Long term, obviously, even with auto research, OpenAI or Anthropic or these other labs, they're
employing what, a thousand-something researchers, right? These researchers are basically glorified auto... you know. They're automating themselves away, actively, and this is the thing they're all trying to do.

>> Yeah.

>> I went around...

>> Some of those researchers also feel the psychosis, right? Because it's working. And so they're like, oh, it's over for me too.

>> I did spend a bunch of time going around OpenAI, and I was like, you guys realize if we're successful, we're all out of a job? We're just building automation for Sam or something like that, or the board, I'm not sure. We're just building this automation for the board or the CEO, and we're all out of a job, maybe contributing on the sides. So yeah, it's kind of unnerving from that perspective.

>> Is it okay if I ask you Noam's question? You could be doing that, right? Auto researching with a lot of compute scale and a bunch of colleagues at one of the frontier labs. Why not?

>> Well, I was there for a while, right? And I did re-enter. So to some extent I agree, and I think there are many ways to slice this question. It's a very loaded question, a little bit. I will say that I feel very good about what people can contribute, and their impact, outside of the frontier labs, not only in the industry but also in more ecosystem-level roles. Your role, for example, is more ecosystem level.
My role currently is also more on the ecosystem level, and I feel very good about the impact people can have in those kinds of roles. Conversely, there are definite problems in my mind with aligning yourself way too much with the frontier labs. Fundamentally, you have a huge financial incentive tied to these frontier labs, and by your own admission the AIs are going to change humanity and society in very dramatic ways, and here you are building the technology and benefiting from it, very allied to it through financial means. This was the conundrum at the heart of how OpenAI started in the beginning. This was the conundrum that we were trying to solve, and it's still not fully resolved. So that's number one. You're not a completely free agent, and you can't be part of that conversation in a fully autonomous, free way. If you're inside one of the frontier labs, there are certain things that you can't say, and conversely there are certain things that the organization wants you to say. They're not going to twist your arm, but you feel the pressure of what you should be saying, because obviously otherwise it's really awkward conversations, strange side-eyes, like, what are you doing?
So you can't really be an independent agent, and I feel a bit more aligned with humanity, in a certain sense, outside of a frontier lab, because I'm not subject to those pressures, and I can say whatever I want. I would say you can have impact in the frontier labs as well, of course. But there are many researchers, and maybe you're one of them, maybe your ideas are really good, maybe there's a lot of decision-making to do and you want to be in a position where you are in the room for those conversations when they come up. I do think that currently the stakes are overall fairly low, so everything is kind of nice. But ultimately, when the stakes are really high, if you're an employee at an organization, I don't actually know how much sway you're going to have over what the organization is going to do. Fundamentally, at the end of the day, you're not really in charge. You're in a room and you're contributing ideas, but you're not really in charge of that entity that you're a part of. So those are some sources of misalignment, I think, to some extent.
I will say that in one way I do agree a lot with that sentiment. The labs, for better or worse, are opaque, and a lot of work is there, and they're at the edge of capability and what's possible, and they're working on what's coming down the line. I think if you're outside of a frontier lab, your judgment fundamentally will start to drift, because you're not part of what's coming down the line. So I feel like my judgment will inevitably start to drift as well, and I won't actually have an understanding of how these systems work under the hood. That's an opaque system. I won't have a good understanding of how it's going to develop. So in that sense I agree, and it's something I'm nervous about. I think it's worth being in touch with what's actually happening and actually being in a frontier lab. If some of the frontier labs would have me come for some amount of time and do really good work for them, and then maybe coming...

>> [He] is looking for a job. This is super exciting.

>> Yeah. Then I think that's maybe a good setup, because maybe that's one way to be connected to what's actually happening but also not feel like you're necessarily fully controlled by those entities. So honestly, in my mind, Noam can probably do extremely good work at OpenAI, but I also think his most impactful work could very well be outside of OpenAI.

>> Noam, that's a call to be an independent researcher with auto research.
Yeah, there are many things to do on the outside. Ultimately I think the ideal solution is maybe going back and forth, and fundamentally you can have really amazing impact in both places. So, I don't know, it's a very loaded question, a little bit. I joined a frontier lab and now I'm outside, and maybe in the future I'll want to join again. That's kind of how I look at it.

>> One question related to what visibility the world or the AI ecosystem has into the frontier is how close open source is to the frontier, and how sustainable that is. I think the entire sequence of events is quite surprising, actually, from having a handful of Chinese models and global models, and I think people are going to continue releasing here in the near term, that are closer than much of the industry anticipated from a capability perspective. I don't know if you're surprised by that, but you're a long-term contributor to open source. What's your prediction here?

>> Yeah. So roughly speaking, the closed models are ahead, but people are monitoring the number of months that open source models are behind.

>> And it started with there's nothing, and then it went to 18 months, and now it's...

>> Convergence, right. So maybe they're behind by, what is the latest, six to eight months right now. I'm a huge fan of open source, obviously. For example, in operating systems you have closed ones like Windows and macOS.
These are large software projects, kind of like what LLMs are going to become. And there's Linux. Linux is actually an extremely successful project. It runs on the vast majority of computers. Last time I checked, was it like 60% or something that run Linux? And that's because there is a need in the industry to have a common open platform that everyone feels safe using. The industry has always felt a demand for that kind of project to exist, and I think the same is true now. That's why businesses actually want this kind of thing to exist. The big difference is that everything is capital. There's a lot of capex that goes into this. So I think that's where things fall apart a little bit and make it a bit harder to compete, in a certain sense. I do think that the current models are very good. The other thing that I think is really interesting is that for the vast majority of consumer use cases, even the current open source models are actually quite good. And if you go forward more years, it does seem to me that a huge number of simple use cases are going to be well covered and even run locally. But there's always going to be some demand for frontier intelligence, and that can actually be an extremely large piece of the pie. It could be that the need for frontier intelligence is going to be Nobel Prize kind of work, or "let's move Linux from C to Rust." There are going to be bigger projects, scoped in that kind of way.
And maybe that's where a lot of the frontier closed intelligences are going to be interacting, and open source is going to eat through a lot of the more basic use cases. At some point, what is frontier today, in terms of what I'm using right now from the closed labs, might be open source, probably later this year, and that's going to be doing a lot of work. So I expect that this dynamic will basically continue. We'll have frontier labs that have closed AIs that are kind of like these oracles, and then we'll have open source behind by some number of months. I expect that to continue, and I actually think that's a pretty good setup overall. I think there's some systemic risk attached to just having intelligences that are closed, and that's it. Centralization has a very poor track record, in my view, in the past.

>> You mean in political or economic systems in general?

>> Yes, exactly.

>> Like Eastern European. Yeah.

>> A lot of pretty bad precedent. So I want there to be a thing that is maybe not at the edge of capability, because that's new and unexplored, but I want there to be a thing that's behind and that is a common working space for intelligences that the entire industry has access to. That seems to me like a pretty decent power balance for the industry.
>> Yeah, I also think there are many problems to solve, right? If you keep advancing intelligence from the frontier, we can do new things, and there are a lot of very big problems for humanity. It seems that will continue to be a very expensive game. So I want to root for labs that are doing that, because there are problems we cannot solve without continuing to advance the models in a very expensive way. And yet, as you point out, if what we have today as frontier is open, that's a lot of capability. So I think the power of that, or the democratization of that, seems very useful and also healthy.

>> Yeah. I think basically by accident we're actually in an okay spot.

>> And optimal. Yeah.

>> By accident we happen to be in a good spot, in a certain sense.

>> And to some degree, the longer this dynamic endures, the healthier a spot the ecosystem might be in, because you have more and more area under the curve.

>> And I will say that even on the closed side, I almost feel like it's been further centralizing recently, because I think a lot of the front runners are not necessarily the top tier. So in that sense I think it's not super ideal. I would love there to be more frontier labs, because I'm by default very suspicious of... I want there to be more people in the room.
I think in machine learning, ensembles always outperform any individual model, so I want there to be ensembles of people thinking about all the hardest problems, and I want there to be ensembles of people in the room, all well informed, to make all those decisions. I don't want it to be behind closed doors with two or three people. I feel like that's not a good future. I almost wish there were more labs, is the long story short. And I do think that open source has a role to play. I hope it sticks around. It's currently slightly behind, and that's actually kind of a good thing.

>> Okay. You worked on the precursor to generalized robotics, autonomy in cars, right? A lot has happened in the last couple of months with robotics companies as well: acceleration, really impressive generalization across environments and tasks, increasingly long-horizon tasks, lots of money going into the space. Is it going to happen? Has anything in your view changed recently?

>> My view is informed by what I saw in self-driving, and I do feel like self-driving is the first robotics application. At the time, 10 years ago, there were a large number of startups, and I feel like most of them basically didn't make it long term. What I saw is that a lot of capital expenditure had to go in, and a lot of time. I think robotics, because it's so difficult and so messy, requires a huge amount of capital investment and a lot of conviction. It's a big problem, and I think atoms are really hard.
So I feel like it will lag behind what's going to happen in digital space. In digital space there's going to be a huge amount of unhobbling, basically things that weren't super efficient becoming a lot more efficient, by a factor of 100, because bits are so much easier. So in terms of what's going to change and where the activity is, I feel like digital space is going to change a huge amount, and then the physical space will lag behind. What I find very interesting is the interface between them as well. If we do have more agents acting on behalf of humans, and more agents talking to each other and doing tasks and participating in an economy of agents, you're going to run out of things that you can do purely in digital space. At some point you have to go to the universe and ask it questions. You have to run an experiment and see what the universe tells you, to learn something. We currently have a huge amount of digital work because there's an overhang in how much we've collectively thought about what is already digital. We just didn't have enough thinking cycles among the humans to think about all the information that is already digital and already uploaded. So we're going to start running out of stuff that is already uploaded. At some point you're going to read all the papers and process them and have some ideas about what to try.
But I don't actually know how much you can get from intelligence that's fully closed off, with just the information that's available to it. So I think what's going to happen is, first there's going to be a huge amount of unhobbling, and there's a huge amount of work there. Then it's going to move to the interfaces between physical and digital: sensors for seeing the world and actuators for doing something to the world. I think a lot of interesting companies will come from that interface: can we feed the superintelligence data, in a certain sense, and can we take data out and manipulate the physical world per its bidding, if you want to anthropomorphize the whole thing? And then the physical world. I almost feel like the total addressable market there, in terms of the amount of work and so on, is massive, possibly even much larger than what can happen in digital space. So I actually think it's a much bigger opportunity as well. But I do feel like it's a huge amount of work, and in my mind the atoms are just a million times harder. So it will lag behind, but it's also, I think, a bigger market. I think the opportunities follow that kind of trajectory. Right now digital is my main interest, then interfaces would be after that, and then some of the physical things, their time will come, and they'll be huge when they do come.
57:44 >> Well, it's an interesting framework for it too, because certain things, not the things I'm working on right now, are much easier even in the world of atoms. If you just think about read and write to the physical world, for read there are sensors and cameras, there's a lot of existing hardware, and you can imagine enriching agent capabilities or capturing a lot of new data if you're just clever about it. You don't necessarily have to invest a lot to get something valuable.
58:11 >> Yeah. Examples of this that I've seen: a friend of mine, Liam, is the CEO of Periodic. I visited them last week, so it's top of mind. They're trying to do auto research for materials science, and in that case the sensors to the intelligence are actually pretty expensive lab equipment. The same is true in biology. I think a lot of people are very interested in engineering biology, and the sensors will be more than just video cameras, if that makes sense. The other thing I saw is companies that basically pay people for training data, as an example, to feed...
58:42 >> Programmatically.
58:43 >> Yeah, to feed the Borg. So these are all examples of sensors in a certain sense. They take many diverse shapes and forms, if that makes sense.
58:53 >> Yeah. So I'm looking forward to the point where I can ask for a task in the physical world, put a price on it, and just tell the agent: you figure out how to do it. Go get the data.
59:02 >> I'm actually kind of surprised we don't have more information markets. For example, if Polymarket or other betting markets, or even stocks, have so much autonomous activity, and a rising amount of it, then, if Iran was just happening now, how come there isn't a process where taking a photo or video from somewhere in Tehran costs like 10 bucks? Someone should be able to pay for that. That's an example of feeding the intelligence. There's not going to be a human looking at it; it's going to be agents who are trying to guess the betting games and stock markets and so on.
59:30 >> So I kind of feel like the agentic web is still fairly new, and there are no mechanisms for this yet, but this is an example of what I think might happen. There's a good book that may be inspiring, called Daemon; you've potentially read it. In Daemon, the intelligence ends up almost puppeteering humanity, in a certain sense. Humans are its actuators, but humans are also its sensors.
59:53 >> And so I think society will collectively reshape in a certain way to serve that. It will end up happening collectively across the industry, where there's just a lot more automation, it has certain needs, and humans will be serving the needs of that machine, not necessarily each other's.
60:12 >> Well, on this very specific point of missing pieces of training data, we needed something like auto research, right? We need the training cycle or the SFT piece to be
far more mechanized.
60:27 >> For what part?
60:28 >> In order to take the human out of the loop for the collection, to ask for a task that is just "improve my model quality with new data."
60:38 >> Right. Yes.
60:40 >> Does that make sense to you? If you can't have the model do the training runs by itself, then your ability to do this as a closed-loop task, by pricing data, is more challenged.
60:55 >> Yes, 100%. But the thing is, LLM training actually fits the paradigm really well, really easily.
61:04 >> Yeah, clean metric.
61:05 >> Yeah. There's all the optimization of the code so that it runs faster, and then you also have metrics that you can optimize against. I do think that if you had an autonomous loop over those metrics, there's going to be a lot of Goodharting going on, where the system will overfit to those metrics. But then you can use the system to devise more metrics, so that you have really good coverage. So it's kind of hard to tell, but in a certain sense it's a pretty good fit.
61:30 >> I want to talk about a tiny side project you have before we end. Tell me about microGPT.
61:37 >> Oh yeah. Okay. So, microGPT. I have this running obsession, of maybe a decade or two, of simplifying and boiling down LLMs to their bare essence. I've had a number of projects along these lines: nanoGPT, makemore, micrograd, etc.
So I feel like microGPT is now the state of the art of me trying to boil it down to just the essence. The thing is, training neural nets, and LLMs specifically, is a huge amount of code, but all of that code is actually complexity from efficiency.
62:09 >> It's just because you need it to go fast. If you don't need it to go fast and you just care about the algorithm, then that algorithm is actually 200 lines of Python, very simple to read, and this includes comments and everything. You just have your dataset, which is text, and you need your neural network architecture, which is like 50 lines. You need to do your forward pass, and then you have to do your backward pass to calculate the gradients. A little autograd engine to calculate the gradients is like 100 lines. Then you need an optimizer, and Adam, for example, which is a very state-of-the-art optimizer, is again like 10 lines, really. So putting everything together in a training loop is, yeah, 200 lines. And it was interesting to me: normally, maybe a year ago or more, if I had come up with microGPT, I would have been tempted to explain it to people, with a video stepping through it or something like that. I actually tried to make that video a little bit, and I tried to make a little guide to it and so on. But I realized that this is not really adding too much, because it's already so simple, at 200 lines, that anyone could ask their agent to explain it in various ways. I'm not explaining to people anymore. I'm explaining it to agents.
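[Editor's sketch] The pieces listed above (a dataset, a forward pass, a small autograd engine for the backward pass, an Adam optimizer, and a training loop) can be illustrated in miniature. What follows is not the microGPT code, which is not reproduced in this transcript. It is a toy under stated assumptions: the `Value` class and `train` function are hypothetical names, the "model" is a two-parameter linear fit rather than a transformer, and the hyperparameters are illustrative defaults.

```python
class Value:
    """Scalar that remembers how it was computed so gradients can flow back."""

    def __init__(self, data, children=()):
        self.data = data
        self.grad = 0.0
        self._children = children
        self._backward = lambda: None  # leaf nodes have nothing to propagate

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))

        def _backward():
            self.grad += out.grad
            other.grad += out.grad

        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))

        def _backward():
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward = _backward
        return out

    def backward(self):
        # Topologically order the graph, then apply the chain rule in reverse.
        order, seen = [], set()

        def visit(node):
            if node not in seen:
                seen.add(node)
                for child in node._children:
                    visit(child)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            node._backward()


def train(steps=2000, lr=0.05, beta1=0.9, beta2=0.999, eps=1e-8):
    """Fit y = w*x + b to a toy dataset generated from y = 2x + 1, using Adam."""
    data = [(x, 2.0 * x + 1.0) for x in (-1.0, 0.0, 1.0, 2.0)]
    params = [Value(0.0), Value(0.0)]  # w, b
    m = [0.0] * len(params)  # Adam first-moment estimates
    v = [0.0] * len(params)  # Adam second-moment estimates
    for t in range(1, steps + 1):
        w, b = params
        loss = Value(0.0)
        for x, y in data:  # forward pass: sum of squared errors
            diff = w * x + b + (-y)
            loss = loss + diff * diff
        for p in params:  # reset gradients before the backward pass
            p.grad = 0.0
        loss.backward()
        for i, p in enumerate(params):  # Adam update with bias correction
            m[i] = beta1 * m[i] + (1 - beta1) * p.grad
            v[i] = beta2 * v[i] + (1 - beta2) * p.grad * p.grad
            m_hat = m[i] / (1 - beta1 ** t)
            v_hat = v[i] / (1 - beta2 ** t)
            p.data -= lr * m_hat / (v_hat ** 0.5 + eps)
    return params[0].data, params[1].data
```

Calling `train()` should return a pair close to `(2.0, 1.0)`. The Adam update itself is the handful of lines at the end of the loop, which is in the spirit of the "10 lines" remark; a real GPT would swap the linear fit for a transformer forward pass and need more operations in the autograd engine.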
If you can explain it to agents, then agents can be the router, and they can target it to the human in their language, with infinite patience, at their capability level, and so on.
63:26 >> Right. If I don't understand this particular function, I can ask the agent to explain it to me three different ways, and I'm not going to get that from you.
63:33 >> Exactly.
63:33 >> And so I kind of feel like, what is education? It used to be guides, it used to be lectures, but I feel like now I'm more explaining things to agents, and maybe I'm coming up with skills. A skill is basically just a way to instruct the agent how to teach the thing. So maybe I could have a skill for microGPT, of the progression I imagine the agent should take you through if you're interested in understanding the codebase. It's just hints to the model: first start off with this, and then with that. I could script the curriculum a little bit as a skill. So I feel like there's going to be less explaining things directly to people, and it's going to be more: does the agent get it? If the agent gets it, they'll do the explanation. We're not fully there yet, because I still think I can probably explain things a little bit better than the agents, but the models are improving so rapidly that I feel like it's a losing battle to some extent.
And so I think education is going to be reshuffled by this quite substantially, where it's almost the end of teaching each other things. If I have a library of code, for example, it used to be that you have documentation for the other people who are going to use my library, but you shouldn't do that anymore. Instead of HTML documents for humans, you have markdown documents for agents, because if agents get it, then they can explain all the different parts of it. So it's this redirection through agents, and I think we're going to see a lot more of that playing out.
65:00 >> Well, we'll see if the great teachers develop intuition for how to explain things to agents differently.
65:06 >> Ultimately, yes. For example, I tried to get an agent to write microGPT. I told it to try to boil down neural network training to the simplest thing, and it can't do it. microGPT is the end of my obsession. It's the 200 lines. I thought about this for a long time. I was obsessed with this for a long time. This is the solution. Trust me, it can't get simpler. And this is my value add. Everything else, the agent gets.
65:33 >> It just can't come up with it. But it totally gets it and understands why it's done in a certain way, etc. So my contribution is kind of these few bits, but everything else, in terms of the education that goes on after that, is not my domain anymore.
So maybe, yeah, education changes in those ways, where you have to infuse the few bits that you feel strongly about: the curriculum, or the better way of explaining it, or something like that. The things that agents can't do are your job now. The things that agents can do, they can probably do better than you, or will very soon. So you should be strategic about what you're actually spending time on.
66:07 >> Well, we appreciate the few things. Thank you, Andrej.
66:10 >> Okay.
66:13 >> Find us on Twitter at @NoPriorsPod. Subscribe to our YouTube channel if you want to see our faces. Follow the show on Apple Podcasts, Spotify, or wherever you listen. That way, you get a new episode every week. And sign up for emails or find transcripts for every episode at no-priors.com.


Private. Behind Cloudflare Access. © Karthik Kamalakannan.