Overview
Large language models (LLMs) have demonstrated impressive capabilities in generating coherent and contextually relevant text. However, they face inherent limitations: their knowledge is frozen at the training cutoff, they can hallucinate information, and they struggle to provide verifiable citations for their outputs. Retrieval-augmented generation (RAG) has emerged as a powerful approach to address these limitations by connecting LLMs to external knowledge sources.
Current RAG systems, however, often treat retrieval and generation as separate components with minimal integration. Retrievers operate independently of the generation process, and language models passively consume whatever content is retrieved without actively guiding the information-seeking process. This separation limits these systems’ ability to handle complex information needs that require multi-step reasoning and targeted information gathering.
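As a concrete point of reference, the sketch below shows the decoupled pipeline this critique targets: a retriever scores documents once against the initial question, and the model consumes whatever comes back. All names here (`Retriever`, `generate`, `answer`) are illustrative stand-ins, not a particular library's API.

```python
from typing import List

class Retriever:
    """Toy retriever: lexical-overlap scoring stands in for a real
    vector index. Purely illustrative."""
    def __init__(self, corpus: List[str]):
        self.corpus = corpus

    def top_k(self, query: str, k: int = 3) -> List[str]:
        q = set(query.lower().split())
        ranked = sorted(self.corpus,
                        key=lambda d: -len(q & set(d.lower().split())))
        return ranked[:k]

def generate(prompt: str) -> str:
    # Stand-in for an LLM call.
    return f"[completion conditioned on {len(prompt)} prompt chars]"

def answer(question: str, retriever: Retriever) -> str:
    # One-shot retrieval: the retriever never sees the model's
    # intermediate reasoning, and the model cannot ask for more evidence.
    docs = retriever.top_k(question)
    prompt = "\n\n".join(docs) + "\n\nQ: " + question
    return generate(prompt)
```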
Our research focuses on deeply integrating retrieval capabilities with language models—creating systems where the model actively guides the retrieval process, reasons effectively over multiple sources, and maintains coherent understanding across extended interactions.
Key Research Challenges
Active Information Seeking
Current systems retrieve information for the initial query without adapting their search strategy to what they learn along the way. How can we train models to actively formulate effective retrieval queries and incorporate the returned information across multiple steps?
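A minimal sketch of such a loop, assuming a retriever with a `top_k` method and an `llm` callable that follows an `ANSWER:`/`QUERY:` reply convention (both are assumptions for illustration, not an established protocol):

```python
def seek(question, retriever, llm, max_steps=4):
    """Iteratively reformulate queries until the model can answer."""
    evidence, query = [], question
    for _ in range(max_steps):
        evidence.extend(retriever.top_k(query))
        step = llm(f"Evidence so far: {evidence}\n"
                   f"Either reply ANSWER:<final answer to '{question}'> "
                   f"or QUERY:<a sharper follow-up query>.")
        if step.startswith("ANSWER:"):
            return step[len("ANSWER:"):]
        query = step[len("QUERY:"):]  # model-proposed follow-up query
    # Budget exhausted: answer from whatever was gathered.
    return llm(f"Answer '{question}' using only: {evidence}")
```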
Cross-Document Reasoning
Complex questions often require synthesizing information across multiple documents with potentially conflicting information. How can systems effectively reason across diverse sources while maintaining factual accuracy?
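One simple way to make cross-document synthesis explicit is to answer against each source independently and reconcile afterwards. The sketch below does exactly that, with `llm` a stand-in callable and the conflict handling deliberately naive:

```python
from collections import Counter, namedtuple

Doc = namedtuple("Doc", "source text")  # hypothetical document shape

def synthesize(question, docs, llm):
    # First ask for an answer grounded in each document alone...
    answers = [(d.source,
                llm(f"Using only this passage: {d.text}\nQ: {question}"))
               for d in docs]
    # ...then check whether the sources agree before committing.
    tally = Counter(ans for _, ans in answers)
    if len(tally) > 1:
        # Conflicting evidence: surface the disagreement with its
        # sources rather than silently picking one side.
        return {"status": "conflict", "per_source": answers}
    return {"status": "agreement", "answer": next(iter(tally))}
```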
Knowledge Persistence
Systems often fail to maintain consistent understanding throughout extended interactions, forgetting previously retrieved information or contradicting earlier statements. How can we create architectures that maintain coherent belief states over time?
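A toy version of such a belief state, which refuses silent overwrites so later turns cannot contradict earlier ones; the flat subject-to-value store is an illustrative simplification of what a real architecture would need:

```python
class BeliefStore:
    """Toy belief state: retrieved facts keyed by subject, with a
    contradiction check before each update."""
    def __init__(self):
        self.facts = {}  # subject -> (value, source)

    def assert_fact(self, subject, value, source):
        prior = self.facts.get(subject)
        if prior and prior[0] != value:
            # Surface the conflict rather than silently overwriting,
            # so later generations stay consistent with earlier turns.
            raise ValueError(f"{subject}: {value!r} ({source}) "
                             f"contradicts {prior[0]!r} ({prior[1]})")
        self.facts[subject] = (value, source)
```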
Resource Efficiency
Naively retrieving and processing large amounts of text for every query is computationally expensive. How can systems learn to retrieve only when necessary and focus on the most valuable information?
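One hedged sketch of retrieval gating: skip the retriever entirely when a calibrated self-confidence estimate clears a threshold. The `llm_confidence` estimator and the 0.8 cutoff are assumptions for illustration, not established values:

```python
def maybe_retrieve(question, llm_confidence, retriever, threshold=0.8):
    # Skip retrieval when the model's self-estimated confidence in its
    # parametric answer clears the threshold.
    conf = llm_confidence(question)   # e.g. a calibrated P(correct)
    if conf >= threshold:
        return []                     # answer from parametric knowledge
    return retriever.top_k(question)  # pay the retrieval cost only when needed
```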
Source Attribution and Factuality
Generated responses often blend retrieved information with model-generated content without clear attribution. How can systems maintain provenance through the generation process and distinguish between retrieved facts and inferences?
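A minimal sketch of provenance-preserving prompting: every retrieved chunk carries a stable tag the model is instructed to cite, and unsupported inferences must be marked as such. The `[Sn]` convention and the `Chunk` shape are illustrative assumptions:

```python
from collections import namedtuple

Chunk = namedtuple("Chunk", "text source")  # hypothetical chunk shape

def build_prompt(question: str, chunks: list) -> str:
    # Tag each retrieved chunk with a stable ID so the model can cite it.
    tagged = [f"[S{i}] {c.text} (from {c.source})"
              for i, c in enumerate(chunks, 1)]
    return ("Answer using only the sources below. Cite each factual "
            "claim with its [Sn] tag, and mark any unsupported "
            "inference explicitly.\n\n"
            + "\n".join(tagged)
            + f"\n\nQ: {question}")
```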
Research Questions
Our work in this pillar explores several interrelated research questions:
- What training approaches could teach language models to formulate effective retrieval queries and incorporate returned information?
- How might systems implement iterative retrieval and reasoning loops that progressively build understanding of complex topics?
- What memory architectures could help models maintain coherent understanding throughout extended interactions while incorporating new information?
- How can systems effectively manage their attention across multiple retrieved documents to focus on the most relevant information?
- What mechanisms would enable models to attribute information to specific sources and maintain provenance through the generation process?
- How might systems learn to identify when retrieval is necessary versus when they can rely on parametric knowledge?
- What evaluation frameworks can effectively measure both factual grounding and coherent reasoning in retrieval-augmented systems?
Broader Directions
Our research in this pillar pursues several complementary directions:
Reinforcement Learning for Retrieval Optimization
Developing frameworks that use reinforcement learning to teach models effective information-seeking behaviors, including when and how to retrieve information.
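One plausible reward shape for such a framework, assuming episode-level feedback: pay for a correct answer, charge for each retrieval call and each token of retrieved context the policy chose to read. The coefficients here are illustrative, not tuned values:

```python
def retrieval_reward(answer_correct: bool, n_retrievals: int,
                     n_tokens_read: int,
                     cost_per_call: float = 0.05,
                     cost_per_token: float = 1e-4) -> float:
    # Reward correctness; penalize every retrieval call and every
    # token of retrieved context, so the policy learns to retrieve
    # only when the expected accuracy gain justifies the cost.
    reward = 1.0 if answer_correct else 0.0
    reward -= cost_per_call * n_retrievals
    reward -= cost_per_token * n_tokens_read
    return reward
```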
Iterative Retrieval-Reasoning Frameworks
Creating systems that alternate between retrieval and reasoning steps, progressively building understanding of complex topics through multi-step exploration.
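Extending the earlier query-reformulation sketch, this variant makes the reasoning step explicit: the model alternates between drafting a sub-question and retrieving targeted evidence for it, accumulating notes until it declares itself done. The `DONE:`/`NEXT:` reply protocol is an illustrative assumption:

```python
def explore(question, retriever, llm, max_rounds=5):
    notes = []
    for _ in range(max_rounds):
        # Reason over everything gathered so far...
        step = llm(f"Question: {question}\nNotes so far: {notes}\n"
                   "Reply DONE:<answer> or NEXT:<sub-question to resolve>.")
        if step.startswith("DONE:"):
            return step[len("DONE:"):]
        # ...then retrieve targeted evidence for the identified gap.
        sub_question = step[len("NEXT:"):]
        notes.append((sub_question, retriever.top_k(sub_question, k=2)))
    return llm(f"Give a best-effort answer to '{question}' from: {notes}")
```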
Memory-Augmented Architectures
Building hierarchical memory structures that help models distinguish between and appropriately update different types of information—factual knowledge, user preferences, conversation history, etc.
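A toy three-tier layout of such a memory, where each tier gets a different update rule. The tier names and rules are illustrative assumptions, not a fixed design:

```python
from dataclasses import dataclass, field

@dataclass
class HierarchicalMemory:
    """Toy three-tier memory with per-tier update semantics."""
    facts: dict = field(default_factory=dict)        # slow-changing world knowledge
    preferences: dict = field(default_factory=dict)  # per-user, latest value wins
    history: list = field(default_factory=list)      # append-only conversation log

    def remember_fact(self, key, value):
        self.facts[key] = value          # verified facts replace stale ones

    def remember_preference(self, key, value):
        self.preferences[key] = value    # overwrite: only the latest matters

    def log_turn(self, turn):
        self.history.append(turn)        # never rewritten, only appended
```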
Source-Aware Generation
Developing training methods that enhance models’ ability to attribute information to specific sources and maintain provenance through the generation process.
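One way to pose attribution as a supervised fine-tuning signal is to interleave citation tags with the target text, so the model learns to emit provenance while it generates. The record schema below is an assumption for illustration:

```python
# Hypothetical fine-tuning record: inputs carry tagged sources, and the
# target cites a tag after each claim it supports.
example = {
    "input": "[S1] The Amazon rainforest spans roughly 5.5 million km2. "
             "(env-report)\n"
             "[S2] It stretches across nine countries. (atlas-entry)\n\n"
             "Q: How large is the Amazon rainforest, and where is it?",
    "target": "The Amazon rainforest covers about 5.5 million km2 [S1] "
              "and stretches across nine countries [S2].",
}
```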
Cross-Source Consistency Verification
Creating mechanisms for identifying and resolving conflicts between different information sources based on source credibility and evidence strength.
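A deliberately simple sketch of credibility-weighted conflict resolution: each candidate value accumulates score from the credibility and support of the sources asserting it, and the highest-scoring value wins. Both the scoring rule and the credibility inputs are assumptions:

```python
def resolve(claims):
    """Pick among conflicting claims by credibility-weighted vote.

    `claims` is a list of (value, source_credibility, n_supporting_docs)
    tuples; the linear scoring rule is a deliberately simple illustration.
    """
    scores = {}
    for value, credibility, support in claims:
        scores[value] = scores.get(value, 0.0) + credibility * support
    best = max(scores, key=scores.get)
    return best, scores

# Example: two highly credible documents outvote three weak ones.
# resolve([("A", 0.9, 2), ("B", 0.4, 3)]) -> ("A", {"A": 1.8, "B": 1.2})
```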
By advancing research in these directions, we aim to create systems that don’t just access information but actively reason about what information they need and how to find it—enabling more accurate, transparent, and effective knowledge-intensive applications.