Quick News Spot

s3: The new RAG framework that trains search agents with minimal data - RocketNews




Researchers at the University of Illinois Urbana-Champaign have introduced s3, an open-source framework designed to build retrieval-augmented generation (RAG) systems more efficiently than current methods.

s3 can benefit developers creating real-world large language model (LLM) applications, as it simplifies and reduces the cost of creating retriever models within RAG architectures.

RAG retrieval

The effectiveness of any RAG system hinges on the quality of its retrieval component. In their paper, the researchers categorize the evolution of RAG approaches into three distinct phases.

"Classic RAG" systems rely on static retrieval methods with fixed queries, where retrieval quality is disconnected from the ultimate generation performance. These architectures struggle with queries requiring contextual or multi-hop reasoning.

A subsequent phase, dubbed "Pre-RL-Zero," introduces more active LLM participation during inference. These techniques involve multi-turn interactions that interleave query generation, retrieval, and reasoning. However, they typically depend on zero-shot prompting and lack trainable components that could optimize retrieval through direct outcome signals.

The most recent phase, "RL-Zero," leverages reinforcement learning (RL) to train models to act as search agents, improving through outcome-based feedback like answer correctness. An example is Search-R1, which trains the model to interleave reasoning with search queries and retrieved context.

Despite their advancements, existing RL-Zero approaches often optimize retrieval using search-centric metrics that ignore downstream utility. Moreover, they require fine-tuning the LLM, which is costly and error-prone. By entangling retrieval with generation, they limit real search utility and compatibility with frozen or proprietary models.

Figure: Different types of RAG (Source: arXiv)

As the researchers put it, "This motivates a shift toward a modular framework where search and generation are cleanly separated, and optimization focuses purely on search quality with respect to downstream utility."

s3

The s3 framework addresses this challenge with a model-agnostic approach. The main idea is to train a search agent with structured, multi-turn access to external knowledge. This search agent improves the quality of the retrieval stage without affecting the LLM that generates the final answer.

In s3, a dedicated searcher LLM iteratively interacts with a search engine. It generates queries based on the prompt, retrieves relevant documents, selects a useful subset of evidence, and decides whether to continue searching for more information. Once the search concludes, a separate, frozen generator LLM consumes this accumulated evidence to produce the final answer.
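The loop described above can be sketched roughly as follows. This is only an illustration of the control flow (query, retrieve, select, decide whether to continue, then hand off to a frozen generator), not the researchers' implementation. The toy corpus, the keyword-overlap `search_engine`, the `Searcher` class and its methods, and `frozen_generator` are all hypothetical stand-ins; in s3 the searcher and generator would be LLMs and the search engine a real retriever.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical toy corpus standing in for an external search index.
CORPUS = [
    "s3 trains a searcher model while the generator stays frozen.",
    "Classic RAG retrieves once with a fixed query.",
    "Search-R1 interleaves reasoning with search queries.",
]


def _words(text: str) -> set:
    """Lowercase whitespace tokenization (deliberately naive)."""
    return set(text.lower().split())


def search_engine(query: str, k: int = 2) -> List[str]:
    """Toy retriever: rank documents by word overlap with the query."""
    q = _words(query)
    ranked = sorted(CORPUS, key=lambda d: len(q & _words(d)), reverse=True)
    return ranked[:k]


@dataclass
class Searcher:
    """Rule-based placeholder for the trainable searcher LLM."""
    max_turns: int = 3

    def generate_query(self, question: str, evidence: List[str]) -> str:
        # A trained searcher would write a new query; here we just append
        # the most recently selected passage to the question.
        return question if not evidence else f"{question} {evidence[-1]}"

    def select(self, question: str, docs: List[str], evidence: List[str]) -> List[str]:
        # Keep unseen documents that share at least one word with the question.
        q = _words(question)
        return [d for d in docs if d not in evidence and q & _words(d)]

    def should_continue(self, newly_selected: List[str]) -> bool:
        # Stop as soon as a turn adds no new evidence.
        return bool(newly_selected)


def run_search(question: str, searcher: Searcher,
               search_fn: Callable[[str], List[str]]) -> List[str]:
    """Multi-turn search loop that accumulates selected evidence."""
    evidence: List[str] = []
    for _ in range(searcher.max_turns):
        query = searcher.generate_query(question, evidence)
        docs = search_fn(query)
        chosen = searcher.select(question, docs, evidence)
        evidence.extend(chosen)
        if not searcher.should_continue(chosen):
            break
    return evidence


def frozen_generator(question: str, evidence: List[str]) -> str:
    """Stand-in for the frozen generator LLM; it is never updated."""
    context = "\n".join(evidence)
    return f"Question: {question}\nContext:\n{context}"


if __name__ == "__main__":
    question = "How does s3 train the searcher"
    evidence = run_search(question, Searcher(), search_engine)
    print(frozen_generator(question, evidence))
```

The point of the separation is visible in the structure: only the `Searcher` would need training, while `frozen_generator` could be swapped for any model, including a proprietary one, without touching the search loop.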

Figure: s3 framework (Source: ...)
