Today's Top Episodes

Dataflow Computing for AI Inference with Kunle Olukotun - #751
Duration: 00:56:37
October 14, 2025
- Reconfigurable dataflow architectures map the dataflow graphs of AI algorithms directly onto hardware, a paradigm shift from traditional instruction-based computing: rather than fetching instructions, the hardware is reconfigured to execute a specific computational graph.
- This architecture excels at fast, energy-efficient inference for large language models by minimizing memory bandwidth usage and maximizing hardware utilization, for example by fusing entire model components so intermediate results never leave the chip (see the sketch after this list).
- The approach supports dynamic AI systems and agentic workflows through capabilities like rapid model switching and specialized compiler infrastructure, pushing toward systems that adapt more fluidly to changing computational demands.
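
To make the fusion idea concrete, here is a minimal software sketch (plain NumPy, not the actual reconfigurable-dataflow toolchain discussed in the episode): a model is a graph of operators, and a fusion pass composes adjacent operators into one kernel so intermediates never round-trip through memory.

```python
import numpy as np

# Toy dataflow graph: each node is (name, fn); edges form a linear chain here.
# A real reconfigurable dataflow compiler maps such graphs onto spatial
# hardware; this only illustrates the fusion idea in software.
graph = [
    ("matmul", lambda x, W=np.random.randn(64, 64): x @ W),
    ("bias",   lambda x, b=np.random.randn(64): x + b),
    ("gelu",   lambda x: 0.5 * x * (1 + np.tanh(0.7978845608 * (x + 0.044715 * x**3)))),
]

def run_unfused(x):
    # Each op materializes its output (a memory round-trip on real hardware).
    for _, fn in graph:
        x = fn(x)
    return x

def fuse(ops):
    # Compose the whole chain into one kernel: intermediates stay "on chip".
    def fused(x):
        for _, fn in ops:
            x = fn(x)
        return x
    return fused

fused_kernel = fuse(graph)
x = np.random.randn(8, 64)
assert np.allclose(run_unfused(x), fused_kernel(x))
```

On real dataflow hardware the composition is spatial rather than sequential: the fused operators are laid out on the chip and data streams through them, which is where the bandwidth savings come from.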

Recurrence and Attention for Long-Context Transformers with Jacob Buckman - #750
Duration: 00:57:23
October 7, 2025
- The discussion highlights context length as a crucial axis of scale for AI models, distinct from simply increasing parameter count or dataset size.
- A key architectural innovation discussed is "retention," which combines the strengths of recurrent models (linear cost in context length) with those of attention (parallelization and hardware efficiency) by using a chunked algorithm, sketched after this list.
- The conversation emphasizes the need for a balanced ratio of FLOPs spent on parameter-based versus state-based computation, arguing that an imbalance can hinder compute optimality.
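
Here is a minimal NumPy sketch of that chunked algorithm, using unnormalized linear attention as a stand-in (a production retention layer adds decay, normalization, and feature maps, all omitted here):

```python
import numpy as np

def chunked_linear_attention(Q, K, V, chunk=64):
    """Causal linear attention o_t = sum_{s<=t} (q_t . k_s) v_s, in chunks.

    Within a chunk: a parallel, attention-like matmul (hardware-friendly).
    Across chunks: a recurrent state S = sum_s k_s v_s^T, so total cost is
    linear in sequence length. A sketch of the chunked idea only.
    """
    T, d = Q.shape
    S = np.zeros((d, d))
    out = np.zeros_like(V)
    for start in range(0, T, chunk):
        q, k, v = Q[start:start+chunk], K[start:start+chunk], V[start:start+chunk]
        inter = q @ S                         # recurrent: all previous chunks
        intra = np.tril(q @ k.T) @ v          # parallel: causal within the chunk
        out[start:start+chunk] = inter + intra
        S += k.T @ v                          # update carried state
    return out

# Reference check against the plain O(T^2) causal form.
T, d = 256, 32
rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(3, T, d))
ref = np.tril(Q @ K.T) @ V
assert np.allclose(chunked_linear_attention(Q, K, V), ref)
```

The chunk size is the knob between the two regimes: chunk = 1 recovers a pure recurrence, chunk = T recovers fully parallel attention.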

The Decentralized Future of Private AI with Illia Polosukhin - #749
Duration: 01:04:03
September 30, 2025
- Privacy regulations and data taxes are making user data an increasing liability for application developers, driving demand for platforms where developers never directly handle that data.
- NEAR AI addresses the challenges model developers face, particularly the risk of model leakage, through decentralized confidential machine learning: secure enclaves and encryption protect both model weights and user data (a simplified sketch follows this list).
- The podcast also discussed the shift from generic user feedback to more specialized, verifiable training data, highlighting the need for curation and the opportunity for models to have their own tokens that distribute revenue to data contributors.
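
As a toy illustration of the confidential-ML flow (emphatically not NEAR AI's protocol: a symmetric Fernet key stands in for keys that a real TEE would seal and release only after remote attestation):

```python
import io
import numpy as np
from cryptography.fernet import Fernet

# One key, held only by the (simulated) enclave. In a real deployment the
# developer and user would encrypt to an attested enclave public key instead.
enclave_key = Fernet.generate_key()
enclave = Fernet(enclave_key)

# Model developer encrypts weights before shipping them to the platform.
W = np.random.randn(4, 4)
buf = io.BytesIO(); np.save(buf, W)
encrypted_weights = enclave.encrypt(buf.getvalue())

# User encrypts their input; the platform operator never sees plaintext.
x = np.random.randn(4)
buf = io.BytesIO(); np.save(buf, x)
encrypted_input = enclave.encrypt(buf.getvalue())

def enclave_infer(enc_w, enc_x):
    # Only code running inside the attested enclave holds the key, so
    # plaintext weights and inputs exist only inside this function.
    w = np.load(io.BytesIO(enclave.decrypt(enc_w)))
    x = np.load(io.BytesIO(enclave.decrypt(enc_x)))
    return x @ w

print(enclave_infer(encrypted_weights, encrypted_input))
```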

Inside Nano Banana 🍌 and the Future of Vision-Language Models with Oliver Wang - #748
Duration: 01:03:39
September 23, 2025
- Gemini 2.5 Flash Image (Nano Banana) excels due to its integration with Gemini, leveraging world knowledge for better prompt understanding and autonomous operation.
- A key factor in high-fidelity image editing is the combination of model architecture and data, not one or the other; together they preserve detail and enforce consistency across edits.
- While image models today primarily serve creative professionals, they are moving toward broader applications such as information seeking and visual components for complex problem solving.

Is It Time to Rethink LLM Pre-Training? with Aditi Raghunathan - #747
Duration: 00:58:26
September 16, 2025
- The podcast explores the limitations of relying solely on benchmark performance to evaluate large language models, highlighting how models that excel on specific datasets can still fail in real-world deployment.
- Research suggests that continually training models on more data can paradoxically make them harder to adapt through fine-tuning, potentially due to overfitting and brittleness.
- The conversation dives into the challenge of "unlearning" harmful information from LLMs, proposing a novel approach called "memorization sinks" that isolates and disentangles memorized knowledge within the model architecture so specific data can be selectively removed (a toy sketch follows this list).
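
A toy sketch of the sink idea, under the assumption that memorization is routed into per-document neuron blocks; the class and shapes below are invented for illustration and are not the actual method from the episode:

```python
import numpy as np

class MemorizationSinkLayer:
    """Toy sketch of 'memorization sinks': during training, each document
    activates a shared neuron block plus its own small 'sink' block, so
    document-specific memorization concentrates in the sink. Unlearning a
    document then just zeroes its sink weights."""

    def __init__(self, d_in, d_shared, d_sink, n_docs, seed=0):
        rng = np.random.default_rng(seed)
        self.W_shared = rng.normal(scale=0.1, size=(d_in, d_shared))
        # One private sink block per training document.
        self.W_sink = rng.normal(scale=0.1, size=(n_docs, d_in, d_sink))

    def forward(self, x, doc_id):
        shared = np.maximum(x @ self.W_shared, 0)      # generalizing path
        sink = np.maximum(x @ self.W_sink[doc_id], 0)  # memorizing path
        return np.concatenate([shared, sink], axis=-1)

    def unlearn(self, doc_id):
        # Selective removal: only this document's memorization is destroyed.
        self.W_sink[doc_id] = 0.0

layer = MemorizationSinkLayer(d_in=16, d_shared=32, d_sink=4, n_docs=10)
x = np.random.randn(16)
before = layer.forward(x, doc_id=3)
layer.unlearn(doc_id=3)
after = layer.forward(x, doc_id=3)
# The shared path is untouched; only the sink activations are gone.
assert np.allclose(before[:32], after[:32]) and np.allclose(after[32:], 0)
```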

Building an Immune System for AI Generated Software with Animesh Koratana - #746
Duration: 01:05:11
September 9, 2025
- PlayerZero focuses on what happens to code after it leaves the agentic developer, including software verification and fixing.
- The company aims to build an "immune system" for software, creating long-term memory through scenarios and code simulations that help it understand the codebase and its relationship to reality.
- Animesh believes AI integration will elevate human involvement to higher-level judgment, emphasizing that the world needs more software, not fewer developers.

Autoformalization and Verifiable Superintelligence with Christian Szegedy - #745
Duration: 01:11:48
September 2, 2025
- Christian believes AI will soon achieve "mathematical superintelligence," exceeding human capabilities in specific domains with demonstrably testable results.
- A key focus of Christian's current research is autoformalization: converting mathematical knowledge into formal languages that AI systems can verify and build upon (a small example follows this list).
- Christian emphasizes the importance of formal verification in AI development to guarantee that outputs meet their specifications, mitigating the risk of AI subversion or unintended behavior, a concern central to AI safety.
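
To make autoformalization concrete, here is the informal claim "the sum of two even numbers is even" rendered as a machine-checkable Lean 4 theorem (assuming Mathlib; this example is mine, not from the episode):

```lean
import Mathlib

-- Informal statement: "the sum of two even numbers is even."
-- An autoformalization system turns prose like that into a theorem
-- the proof checker can verify; `Even m` unfolds to `∃ r, m = r + r`.
theorem even_add_even (m n : ℕ) (hm : Even m) (hn : Even n) :
    Even (m + n) := by
  obtain ⟨a, ha⟩ := hm   -- m = a + a
  obtain ⟨b, hb⟩ := hn   -- n = b + b
  exact ⟨a + b, by omega⟩
```

Mathlib already ships this fact as `Even.add`; the point of autoformalization is to produce and verify statements like this automatically, at the scale of the whole mathematical literature.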

Genie 3: A New Frontier for World Models with Jack Parker-Holder and Shlomi Fruchter - #743
Duration: 01:01:01
August 19, 2025
- Genie 3 represents a significant leap in world model technology, achieving roughly a 100x improvement over its predecessors by pushing the limits across dimensions like generation quality, resolution, interaction duration, and frame generation speed.
- The project evolved from Genie 1 and 2 by prioritizing real-time interaction, letting the model respond to user actions as they occur, a key design decision that shaped the entire system architecture.
- A key limitation of Genie 3 is the lack of complex multi-agent interaction within generated worlds, though future work may focus on teaching agents to interact with humans in visually realistic, embodied worlds.

Closing the Loop Between AI Training and Inference with Lin Qiao - #742
Duration: 01:01:11
August 12, 2025
- The "fast iteration experimentation loop" needs to combine both training and inference, as product AB testing is the ultimate judge of whether a model investment is successful.
- A key lesson learned from PyTorch is that a cohesive system is needed to ensure training and inference alignment for quick cross-deployment, and the inference system for experimentation should be the same as that for large-scale production.
- The most exciting trend is making AI model customization accessible to all developers, so they can leverage their production data in a closed loop to gain a competitive edge.
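
The closed loop in the first bullet bottoms out in a statistical gate. A minimal sketch, assuming a binary product metric like task success (the function, counts, and threshold are illustrative, not the episode's tooling):

```python
import math

def ab_test_promote(succ_a, n_a, succ_b, n_b, alpha=0.05):
    """Two-proportion z-test: promote candidate model B over baseline A only
    if B's product metric is significantly higher. A toy gate for the
    train -> deploy -> A/B -> retrain loop."""
    p_a, p_b = succ_a / n_a, succ_b / n_b
    p_pool = (succ_a + succ_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # One-sided p-value via the standard normal CDF.
    p_value = 1 - 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return p_value < alpha, z, p_value

# Baseline A: 480/4000 successes; fine-tuned candidate B: 565/4000.
promote, z, p = ab_test_promote(480, 4000, 565, 4000)
print(f"promote={promote} z={z:.2f} p={p:.4f}")  # promote=True, z≈2.82
```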

Context Engineering for Productive AI Agents with Filip Kozera - #741
Duration: 00:46:01
July 29, 2025
- Wordware simplifies agent creation by letting users define agent tasks in natural-language documents, which are then executed by ReAct-style agents (a minimal loop is sketched after this list).
- The discussion highlighted the importance of incorporating human feedback into agent reflection loops to handle situations where the agent lacks knowledge or requires creativity.
- A key challenge is balancing data access and privacy, specifically how AI agents can leverage user data in silos like Slack and Notion without compromising data ownership.
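
A minimal sketch of the ReAct pattern (reason, act, observe), with a scripted stand-in for the model call so it runs as-is; none of this is Wordware's actual implementation:

```python
# Tools the agent may call; both are stubs for illustration.
TOOLS = {
    "search": lambda q: f"3 results for '{q}' (stubbed)",
    "calculate": lambda expr: str(eval(expr, {"__builtins__": {}})),
}

def fake_llm(history):
    """Stand-in for a model call: returns (thought, action, argument)."""
    if not any("Observation" in h for h in history):
        return ("I should compute the value.", "calculate", "6 * 7")
    return ("I have what I need.", "final", "The answer is 42.")

def react_agent(task, max_steps=5):
    history = [f"Task: {task}"]
    for _ in range(max_steps):
        thought, action, arg = fake_llm(history)
        history.append(f"Thought: {thought}")
        if action == "final":
            history.append(f"Answer: {arg}")
            return arg, history
        observation = TOOLS[action](arg)   # act, then feed the result back
        history.append(f"Action: {action}({arg!r}) -> Observation: {observation}")
    return None, history

answer, trace = react_agent("What is 6 times 7?")
print("\n".join(trace))
```

The natural-language document defining the task would, in a system like the one discussed, supply the prompt and tool list that the loop above hardcodes.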