Enhanced Self-RAG: Multi-hop Reasoning with Adaptive Hybrid Retrieval for Knowledge-Intensive Question Answering

Knowledge-intensive question answering is among the most demanding tasks in natural language processing, as it requires systems to retrieve, process, and integrate information from extensive external knowledge sources to produce accurate and factual answers. While Large Language Models (LLMs) demonstrate remarkable abilities, they face issues such as hallucination, reliance on outdated information, and reasoning processes that lack transparency and traceability [4]. Although these models possess exceptional parametric knowledge, they often generate factual inaccuracies when faced with questions that require exact, current information or intricate multi-step reasoning across various knowledge sources [10].
Retrieval-Augmented Generation (RAG) has emerged as a promising solution: by integrating knowledge from external databases, it improves the precision and trustworthiness of generation, especially for knowledge-intensive tasks, and it enables continual knowledge updates as well as the incorporation of domain-specific information [4]. To overcome the shortcomings of purely parametric models, the basic RAG framework proposed by Lewis et al. (2020) [5] augments language models with a differentiable retrieval mechanism that draws on explicit non-parametric memory. This approach has shown considerable improvements in factual accuracy and has been widely adopted across areas such as question answering, summarization, and knowledge-based tasks [9].
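For readers unfamiliar with the pattern, the retrieve-then-generate loop at the heart of basic RAG can be summarized in a few lines. The following is a minimal sketch only; the `embed`, `vector_index`, and `llm_generate` helpers are hypothetical placeholders standing in for an embedding model, a vector store, and an LLM, not the API of any particular library.

```python
def rag_answer(query: str, vector_index, llm_generate, embed, k: int = 5) -> str:
    """Minimal retrieve-then-generate sketch of the basic RAG pattern [5]."""
    # Dense retrieval: score passages by similarity to the query embedding
    # (the explicit non-parametric memory).
    query_vec = embed(query)
    passages = vector_index.search(query_vec, top_k=k)

    # Condition generation on the retrieved evidence.
    context = "\n\n".join(p.text for p in passages)
    prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    return llm_generate(prompt)
```

Note that this baseline always retrieves a fixed number of passages for every query, which is exactly the rigidity discussed next.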
However, traditional RAG systems suffer from several critical limitations that constrain their effectiveness in complex reasoning scenarios. Indiscriminately incorporating a fixed number of retrieved passages, regardless of whether retrieval is necessary or the passages are relevant, diminishes LM versatility and can lead to unhelpful responses [1]. These systems typically employ static retrieval strategies that cannot adaptively determine when external knowledge is needed, often resulting in unnecessary computational overhead or the incorporation of irrelevant information that may confuse the generation process [2][7].
To address these limitations, Asai et al. (2023) introduced Self-RAG (Self-Reflective Retrieval-Augmented Generation) [1], a framework that enables language models to retrieve pertinent knowledge selectively and on demand, and to critically assess both retrieved passages and their own generations using learned reflection tokens. By training models to make adaptive retrieval decisions and to engage in self-criticism, Self-RAG improves response quality and factual accuracy. Through its mechanisms for retrieval necessity prediction, relevance assessment, and output quality evaluation, the framework performs strongly on a variety of knowledge-intensive tasks.
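The control flow implied by these three mechanisms can be sketched as follows. This is an illustrative outline, not the authors' implementation: the `predicts_*` and `score_*` helpers are hypothetical stand-ins for the learned critic behavior that Self-RAG encodes in reflection tokens (retrieval necessity, ISREL, ISSUP, ISUSE) [1].

```python
def self_rag_step(query: str, retriever, lm) -> str:
    """Sketch of one Self-RAG-style generation step with reflection."""
    # Retrieval-necessity prediction: answer directly if the model decides
    # external knowledge is not needed for this query.
    if not lm.predicts_retrieve(query):
        return lm.generate(query)

    candidates = []
    for passage in retriever.search(query, top_k=5):
        # Relevance assessment (ISREL analogue): skip irrelevant passages.
        if not lm.predicts_relevant(query, passage):
            continue
        answer = lm.generate(query, context=passage)
        # Output-quality evaluation: combine critic scores for whether the
        # answer is supported by the passage (ISSUP) and useful (ISUSE).
        score = lm.score_supported(answer, passage) + lm.score_useful(answer)
        candidates.append((score, answer))

    # Keep the generation the critic ranks highest; fall back to a
    # retrieval-free answer if no relevant passage was found.
    return max(candidates)[1] if candidates else lm.generate(query)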
However, Self-RAG still struggles in multi-hop reasoning scenarios, where answers must combine information from multiple sources through complex inferential chains. Multi-hop question answering (QA) is a challenging task that requires complex reasoning and inference across different sources to predict an accurate answer [3]. Current implementations of Self-RAG fail to: (1) determine the best retrieval strategy for each step of a multi-hop query, (2) maintain coherence within and across multiple cycles of retrieval and generation, and (3) balance information retrieval across knowledge sources when hybrid approaches are necessary. These limitations are most evident in knowledge-intensive tasks that require sophisticated reasoning patterns such as comparative analysis, causal inference, and temporal reasoning [8][10].
This paper presents Enhanced Self-RAG, a novel framework that overcomes these limitations through adaptive hybrid retrieval mechanisms tailored to multi-hop reasoning in knowledge-intensive question answering. Our method builds upon the Self-RAG paradigm by integrating: (1) dynamic retrieval strategy selection that adapts to the reasoning complexity of each query; (2) multi-source knowledge integration that efficiently combines structured and unstructured data; and (3) improved reflection mechanisms that assess the coherence of the reasoning chain across multiple hops, as sketched below. Through extensive experiments on challenging multi-hop QA benchmarks, we show that Enhanced Self-RAG achieves notable improvements in accuracy and reasoning quality over current methods.
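To make the three contributions concrete, the following sketch shows how they compose into a single multi-hop loop. All names here (`predict_strategy_weight`, `coherence_score`, `next_sub_query`, and so on) are illustrative assumptions for exposition, not the actual API of our implementation; the full system is described in the remainder of the paper.

```python
def enhanced_self_rag(question, dense, sparse, lm, max_hops=4, tau=0.5):
    """Sketch of the adaptive hybrid multi-hop loop (illustrative only)."""
    chain = []  # accumulated (sub_query, evidence, partial_answer) steps
    sub_query = question
    for _ in range(max_hops):
        # (1) Dynamic strategy selection: weight dense vs. sparse retrieval
        #     by the predicted complexity of the current reasoning step.
        alpha = lm.predict_strategy_weight(sub_query, chain)  # in [0, 1]

        # (2) Multi-source integration: fuse scores from both retrievers
        #     (passages assumed hashable, e.g. plain strings).
        scored = {}
        for passage, s in dense.search(sub_query):
            scored[passage] = scored.get(passage, 0.0) + alpha * s
        for passage, s in sparse.search(sub_query):
            scored[passage] = scored.get(passage, 0.0) + (1 - alpha) * s
        if not scored:
            break
        evidence = max(scored, key=scored.get)

        partial = lm.generate(sub_query, context=evidence)
        # (3) Chain-level reflection: reject steps that break coherence
        #     with the hops accumulated so far.
        if lm.coherence_score(chain, partial) < tau:
            break
        chain.append((sub_query, evidence, partial))
        if lm.is_final_answer(question, chain):
            return partial
        sub_query = lm.next_sub_query(question, chain)
    return lm.synthesize(question, chain)
```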
You can download the complete draft of the paper here: https://files.rochiey.dev/s/APqtCyBpZXiwTHZ
Code for testing and reproducing the experiments: https://github.com/rochiey/ehanced-self-rag