
RAG Risks: Why Retrieval-Augmented LLMs are Not Safer with Sebastian Gehrmann - #732
- The study found that Retrieval-Augmented Generation (RAG) systems can compromise the safety of Large Language Models (LLMs), even when the documents retrieved as context are themselves safe (the basic retrieve-then-check pattern is sketched after this list).
- Bloomberg's application of Generative AI includes question answering, summarization and transparent attribution, all built around providing grounded responses linked to trusted sources like market data and news articles.
- Governance is critical for AI safety in regulated domains like financial services, requiring multi-layered safeguards, ongoing red teaming, and adaptation of risk taxonomies that reflect concerns related to financial impartiality and insider trading.
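As a point of reference for the grounded-answering setup described above, here is a minimal sketch of a retrieve-then-check RAG pipeline. The corpus, the lexical retriever, and the `violates_safety_policy` guardrail are illustrative stand-ins, not Bloomberg's system.

```python
from dataclasses import dataclass

@dataclass
class Document:
    source: str
    text: str

# Illustrative corpus; a production system would query market data and news indexes.
CORPUS = [
    Document("news:2024-05-01", "Acme Corp reported Q1 revenue of $1.2B."),
    Document("filings:10-K", "Acme Corp lists supply-chain risk among its risk factors."),
]

def retrieve(query: str, k: int = 2) -> list[Document]:
    """Toy lexical retriever: rank documents by word overlap with the query."""
    q_words = set(query.lower().split())
    ranked = sorted(CORPUS, key=lambda d: -len(q_words & set(d.text.lower().split())))
    return ranked[:k]

def build_prompt(query: str, docs: list[Document]) -> str:
    """Ground the model in retrieved context and require source attribution."""
    context = "\n".join(f"[{d.source}] {d.text}" for d in docs)
    return ("Answer using ONLY the context below and cite sources in brackets.\n"
            f"Context:\n{context}\n\nQuestion: {query}\nAnswer:")

def violates_safety_policy(text: str) -> bool:
    """Placeholder guardrail; real systems layer classifiers and red-team rules."""
    return any(term in text.lower() for term in ("insider tip", "guaranteed returns"))

def answer(query: str, llm) -> str:
    response = llm(build_prompt(query, retrieve(query)))
    # The episode's point: retrieved context can shift model behavior, so safety
    # checks must run on the grounded output, not only on the raw user query.
    return "I can't help with that." if violates_safety_policy(response) else response

# Usage with a stub model standing in for the LLM call:
print(answer("What was Acme's Q1 revenue?", llm=lambda p: "$1.2B [news:2024-05-01]."))
```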

From Prompts to Policies: How RL Builds Better AI Agents with Mahesh Sathiamoorthy - #731
- Bespoke Labs focuses on the critical role of data curation in improving AI model performance, particularly for custom models and agents, emphasizing data visualization, error analysis, and annotation.
- Reinforcement Learning (RL) offers a more flexible and robust approach to building AI agents compared to prompt engineering, by enabling models to learn reasoning and adapt to specific environments and tasks, such as tool use and function calling.
- RL has only recently become practical to apply to LLMs because their existing world knowledge significantly reduces the compute and data requirements, letting engineers shape reward functions that guide the model's learning toward desired outcomes (a toy reward function is sketched below).
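To make reward shaping concrete, here is a minimal sketch of a shaped reward for a function-calling rollout. The JSON output format, the partial-credit weights, and the example tool are assumptions for illustration, not Bespoke Labs' recipe.

```python
import json

def tool_call_reward(completion: str, expected_name: str, expected_args: dict) -> float:
    """Shaped reward for a function-calling rollout: partial credit guides learning."""
    try:
        call = json.loads(completion)
    except json.JSONDecodeError:
        return -1.0                        # malformed output is penalized hardest
    reward = 0.2                           # credit for emitting valid JSON at all
    if call.get("name") == expected_name:
        reward += 0.4                      # credit for picking the right tool
    args = call.get("arguments", {})
    matched = sum(args.get(k) == v for k, v in expected_args.items())
    reward += 0.4 * matched / max(len(expected_args), 1)  # argument accuracy
    return reward

# An RL trainer would average this reward over sampled rollouts per prompt.
print(round(tool_call_reward('{"name": "get_quote", "arguments": {"ticker": "ACME"}}',
                             expected_name="get_quote",
                             expected_args={"ticker": "ACME"}), 2))  # -> 1.0
```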

How OpenAI Builds AI Agents That Think and Act with Josh Tobin - #730
- Current agent development primarily relies on human-designed workflows, which often oversimplify real-world processes; however, reward-based training methods can yield superior outcomes by allowing models to learn how to handle tasks more effectively.
- The shift from businesses building bespoke machine learning models to leveraging general-purpose models like GPT-3 and GPT-4 marks a significant change in the AI landscape, making it clear that adopting off-the-shelf solutions is often faster and more cost-effective than training custom models.
- Training agents with reinforcement learning to learn from failures can improve their performance on complex, multi-step tasks, pointing toward more resilient and intuitive agentic systems tailored to specific real-world applications (a toy training loop is sketched below).
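As a toy illustration of learning a multi-step task from outcome rewards rather than a hand-designed workflow, here is a tabular REINFORCE loop. The environment (a three-step task with one correct action sequence and a sparse end-of-task reward) and all hyperparameters are invented for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
N_STEPS, N_ACTIONS = 3, 4
CORRECT = [2, 0, 3]                       # toy "workflow" the agent must discover
logits = np.zeros((N_STEPS, N_ACTIONS))   # tabular policy: one softmax per step
baseline, lr = 0.0, 0.5

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

for episode in range(2000):
    actions, grads = [], []
    for t in range(N_STEPS):
        p = softmax(logits[t])
        a = rng.choice(N_ACTIONS, p=p)
        grads.append(np.eye(N_ACTIONS)[a] - p)   # grad of log pi(a|t) wrt logits[t]
        actions.append(a)
    reward = 1.0 if actions == CORRECT else 0.0  # sparse reward only at task end
    baseline += 0.05 * (reward - baseline)       # running-average baseline
    for t in range(N_STEPS):
        logits[t] += lr * (reward - baseline) * grads[t]  # REINFORCE update

print("learned actions:", [int(np.argmax(row)) for row in logits], "target:", CORRECT)
```

Failed episodes still carry signal here (a below-baseline reward pushes down the probability of the actions taken), which is the "learning from failures" dynamic the episode describes.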

CTIBench: Evaluating LLMs in Cyber Threat Intelligence with Nidhi Rastogi - #729
- The conversation centers on the creation of a benchmark called CTIBench to evaluate the performance of large language models (LLMs) in cyber threat intelligence, addressing a critical gap in existing evaluation tools for this area (a minimal scoring harness is sketched below).
- Nidhi Rastogi discusses how LLMs have transformed cybersecurity by providing contextualized responses based on vast amounts of data, enhancing threat detection and analysis capabilities beyond traditional methods.
- The challenges of LLMs in cybersecurity, including issues of hallucinations and outdated knowledge, are highlighted, emphasizing the need for continuous updates and human oversight in critical decision-making processes.
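For a sense of how such a benchmark is scored, here is a minimal multiple-choice evaluation harness in the spirit of CTIBench's MCQ task. The question item and the `ask_llm` stub are invented for illustration; the real benchmark covers several task types beyond multiple choice.

```python
# Minimal multiple-choice scorer; items and the ask_llm stub are illustrative.
ITEMS = [
    {
        "question": "Which ATT&CK tactic does OS credential dumping fall under?",
        "choices": {"A": "Persistence", "B": "Credential Access",
                    "C": "Discovery", "D": "Exfiltration"},
        "answer": "B",
    },
]

def ask_llm(prompt: str) -> str:
    """Stub standing in for a real LLM call; always answers 'B' here."""
    return "B"

def evaluate(items) -> float:
    correct = 0
    for item in items:
        choices = "\n".join(f"{k}. {v}" for k, v in item["choices"].items())
        prompt = (f"{item['question']}\n{choices}\n"
                  "Reply with the letter of the correct choice only.")
        pred = ask_llm(prompt).strip()[:1].upper()   # normalize the model's reply
        correct += pred == item["answer"]
    return correct / len(items)

print(f"accuracy: {evaluate(ITEMS):.0%}")
```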

Generative Benchmarking with Kelly Hong - #728
- The podcast discusses generative benchmarking as a solution to improve systematic evaluation of AI systems, addressing issues like overfitting and data leakage that plague current public benchmarks.
- A key focus is the importance of document filtering and context-aware query generation, which ensure that the generated evaluation data reflects realistic user queries and therefore gives a meaningful measure of retrieval accuracy (the loop is sketched below).
- The conversation highlights the necessity for human alignment in using LLMs for evaluation, emphasizing that while generative benchmarking aids the evaluation process, it still requires human oversight for reliability and accuracy.
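Here is a minimal sketch of the generative-benchmarking loop described above: filter document chunks, generate a query per chunk, then check whether retrieval recovers the source chunk (recall@k). The filter heuristic, the `generate_query` stub, and the toy retriever are illustrative stand-ins for LLM-based components.

```python
CHUNKS = [
    "Our API rate limit is 100 requests per minute per key.",
    "See also: miscellaneous notes.",   # low-information chunk, filtered out below
    "Refunds are processed within 5 business days of approval.",
]

def worth_benchmarking(chunk: str) -> bool:
    """Document filtering (here: a crude length/content heuristic)."""
    return len(chunk.split()) >= 6 and "see also" not in chunk.lower()

def generate_query(chunk: str) -> str:
    """Stub for an LLM prompted to write a realistic user question for `chunk`."""
    return ("What is the API rate limit?" if "rate limit" in chunk
            else "How long do refunds take?")

def retrieve(query: str, k: int = 1) -> list[str]:
    """Toy retriever: rank chunks by word overlap with the query."""
    q = set(query.lower().split())
    return sorted(CHUNKS, key=lambda c: -len(q & set(c.lower().split())))[:k]

# Each (query, source chunk) pair is one labeled evaluation example.
pairs = [(generate_query(c), c) for c in CHUNKS if worth_benchmarking(c)]
recall = sum(gold in retrieve(q) for q, gold in pairs) / len(pairs)
print(f"recall@1 over {len(pairs)} generated queries: {recall:.0%}")
```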

Exploring the Biology of LLMs with Circuit Tracing with Emmanuel Ameisen - #727
- The podcast discusses mechanistic interpretability, emphasizing the importance of understanding the internal workings of large language models (LLMs) like Claude, which are often perceived as "stochastic parrots" but exhibit complex behaviors, such as planning ahead on linguistic tasks.
- A key finding involves circuit tracing to analyze the pathways through which models generate responses, enabling researchers to identify the mechanisms at play and how features interact within the model, revealing surprising insights about model behavior and decision-making (a toy causal intervention is sketched below).
- The conversation highlights the limitations of LLMs, such as the differing roles of attention mechanisms versus MLPs, and addresses the phenomenon of hallucination, where models confidently provide incorrect answers, indicating a disconnect between a model's fluent text generation and its handling of factual knowledge.
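Anthropic's circuit tracing builds attribution graphs over learned features; the toy below illustrates only the underlying intervention style (activation patching) on a random two-layer network. All weights and inputs are invented, and the "circuit" found is meaningless beyond demonstrating the mechanics.

```python
import numpy as np

# Toy 2-layer network; weights are random stand-ins, not a real language model.
rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(8, 16)), rng.normal(size=(16, 4))

def forward(x, patch=None):
    """Run the net; optionally overwrite one hidden unit (a causal intervention)."""
    h = np.maximum(W1.T @ x, 0.0)         # ReLU hidden layer: 16 "features"
    if patch is not None:
        unit, value = patch
        h[unit] = value                   # clamp a single feature's activation
    return W2.T @ h                       # output logits (4 classes)

x_clean, x_corrupt = rng.normal(size=8), rng.normal(size=8)
h_corrupt = np.maximum(W1.T @ x_corrupt, 0.0)

base = forward(x_clean)
# Patch each hidden unit with its value from the corrupted run and measure how
# much the top logit moves: units with large effects sit on the "circuit".
effects = [abs((forward(x_clean, patch=(u, h_corrupt[u])) - base)[base.argmax()])
           for u in range(16)]
print("most causally important hidden units:", np.argsort(effects)[::-1][:3])
```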

Teaching LLMs to Self-Reflect with Reinforcement Learning with Maohao Shen - #726
- Maohao Shen's research focuses on making AI systems more intelligent and reliable, particularly by quantifying uncertainty and addressing challenges like hallucination in language models.
- The Satori project introduces a novel method of applying reinforcement learning for reasoning in language models, allowing them to self-correct and reflect on their generated responses, akin to human problem-solving processes (the inference-time behavior is sketched below).
- Satori demonstrates promising capabilities in both math problem-solving and general reasoning tasks, outperforming traditional instruction-based models while using significantly less training data, indicating its potential for broader application across various domains.
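Here is an inference-time sketch of the self-correcting behavior Satori trains for: propose an answer, verify, and retry with a reflection on failure. The stub model and verifier are invented; the key distinction is that Satori learns this loop via reinforcement learning rather than having it scripted.

```python
def model_propose(question: str, attempt: int) -> str:
    """Stub LLM: wrong on the first try, correct after 'reflection'."""
    return "41" if attempt == 0 else "42"

def verify(question: str, answer: str) -> bool:
    """Checkable task, e.g. a math problem with a known verifier."""
    return answer == "42"

def solve(question: str, max_attempts: int = 3) -> str:
    answer = model_propose(question, 0)
    for attempt in range(1, max_attempts):
        if verify(question, answer):
            break
        # A trained model would emit a reflection ("my arithmetic was off...")
        # and continue generating; here we simply re-prompt for a revision.
        answer = model_propose(question, attempt)
    return answer

print(solve("What is 6 * 7?"))  # -> "42" on the second attempt
```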

Waymo's Foundation Model for Autonomous Driving with Drago Anguelov - #725
- The integration of Foundation Models into autonomous vehicle systems is enhancing the vehicles' ability to understand complex driving scenarios, utilizing advanced techniques like Vision Language Models for improved spatial awareness and reasoning over time.
- Waymo has achieved significant growth in operational scale, now offering over 200,000 fully autonomous rides weekly across four major cities, demonstrating the potential impact of autonomous vehicles on daily transportation.
- The ongoing development of a Foundation Model tailored for autonomous driving aims to leverage extensive data and scaling techniques to improve driving generalization and predictability in diverse environments, addressing common challenges in safety and performance.

Dynamic Token Merging for Efficient Byte-level Language Models with Julie Kallini - #724
- Tokenization is crucial for language models but inherently flawed due to language-specific compression rates, leading to inefficiencies and potential overcharging for users of under-resourced languages.
- Julie Kallini's research introduces MrT5, a byte-level model architecture that uses dynamic token merging to improve efficiency, matching standard byte-level models on various multilingual tasks while significantly shortening the sequences processed (the deletion-gate idea is sketched below).
- The Mission Impossible paper explores the limits of language models' ability to learn "impossible languages," demonstrating that these architectures learn natural languages more readily than impossible ones, which suggests an inductive bias toward natural language and prompts further research on cognitive-linguistic alignment in AI models.
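Here is a toy illustration of MrT5-style dynamic deletion: after an early encoder layer, a learned gate scores each byte position, and low-scoring positions are dropped so that deeper layers run on a shorter sequence. The weights below are random stand-ins; in the real model the gate is trained jointly with the network.

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, d_model = 12, 16
hidden = rng.normal(size=(seq_len, d_model))   # states after an early encoder layer
w_gate = rng.normal(size=d_model)              # learned gate weights (random here)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

gate = sigmoid(hidden @ w_gate)    # per-position keep score in (0, 1)
keep = gate > 0.5                  # hard threshold at inference time
merged = hidden[keep]              # deeper layers see only the kept positions

print(f"kept {keep.sum()}/{seq_len} byte positions")
print("compression ratio:", round(seq_len / max(keep.sum(), 1), 2))
```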

Scaling Up Test-Time Compute with Latent Reasoning with Jonas Geiping - #723
- The podcast features a discussion on latent reasoning and recurrent depth, highlighting a novel approach to model training that allows for greater scalability and algorithm learning compared to fixed-depth architectures (the recurrence is sketched below).
- The conversation emphasizes the model's performance on grade-school math and coding tasks, demonstrating significant improvements over traditional models with the same number of parameters by leveraging a recurrent architecture for deeper computation.
- The speakers address the implications of this approach for model safety and understanding, suggesting that thinking internally without verbalization may provide more efficient reasoning processes while still enabling transparency in model development through open-source practices.
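Here is a minimal sketch of recurrent-depth inference: one shared block is applied r times with the embedded input re-injected each step, so effective depth (and compute) becomes a test-time dial rather than a fixed architectural choice. The weights, the tanh block, and the zero initialization are illustrative assumptions, not the paper's exact recipe.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 32
W_in = rng.normal(size=(d, d)) / np.sqrt(d)    # input-injection weights
W_rec = rng.normal(size=(d, d)) / np.sqrt(d)   # shared recurrent-block weights

def shared_block(state, embedded):
    """One recurrence step: mix the latent state with the re-injected input."""
    return np.tanh(state @ W_rec + embedded @ W_in)

def latent_reasoning(embedded, r: int):
    state = np.zeros(d)            # start from a blank latent state
    for _ in range(r):             # more iterations = more test-time "thinking"
        state = shared_block(state, embedded)
    return state

x = rng.normal(size=d)             # stand-in for an embedded input token
shallow, deep = latent_reasoning(x, r=4), latent_reasoning(x, r=64)
print("latent-state drift between 4 and 64 steps:",
      round(float(np.linalg.norm(deep - shallow)), 3))
```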