Dynamic Retrieval Strategies: Enhancing Question Answering Across Query Types
Authors:
(1) Soyeong Jeong, School of Computing;
(2) Jinheon Baek, Graduate School of AI;
(3) Sukmin Cho, School of Computing;
(4) Sung Ju Hwang, Korea Advanced Institute of Science and Technology;
(5) Jong C. Park, School of Computing.
Table of Links
Abstract and 1 Introduction
2 Related Work
3 Method and 3.1 Preliminaries
3.2 Adaptive-RAG: Adaptive Retrieval-Augmented Generation
4 Experimental Setups and 4.1 Datasets
4.2 Models and 4.3 Evaluation Metrics
4.4 Implementation Details
5 Experimental Results and Analyses
6 Conclusion, Limitations, Ethics Statement, Acknowledgements, and References
A Additional Experimental Setups
B Additional Experimental Results
Open-domain QA Open-domain question answering (QA) is the task of accurately answering a query by identifying documents relevant to it from a source collection and then interpreting them to provide an answer (Chen et al., 2017). With the advent of LLMs, which have superior reasoning capabilities thanks to their billion-scale parameters (Wei et al., 2022), open-domain QA approaches have shifted toward leveraging the capabilities of LLMs as the reader, as well as taking advantage of the retrieved external documents (Cho et al., 2023).
Multi-hop QA Multi-hop QA is an extension of conventional QA that additionally requires a system to comprehensively aggregate and reason over multiple documents (often iteratively) to answer more complex queries (Trivedi et al., 2022a; Yang et al., 2018). To handle multi-hop queries, the approach of iteratively accessing both LLMs and the retriever is generally used. Specifically, Khattab et al. (2022), Press et al. (2023), and Pereira et al. (2023), among others, proposed decomposing a multi-hop query into simpler single-hop queries, repeatedly accessing LLMs and the retriever to solve these sub-queries, and merging their solutions to formulate the complete answer. In contrast to such decomposition-based approaches, other recent studies, such as Yao et al. (2023) and Trivedi et al. (2023), explore interleaving chain-of-thought reasoning (Wei et al., 2022) with document retrieval. In addition, Jiang et al. (2023) proposed retrieving new documents repeatedly whenever the tokens in the generated sentences have low confidence. However, the aforementioned approaches overlook the fact that real-world queries span a wide range of complexities. It would therefore be largely inefficient to access LLMs and retrievers repeatedly for every query, since some queries may be simple enough to handle with a single retrieval step or even with the LLM alone.
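To make the iterative access pattern concrete, here is a minimal sketch of a decomposition-based multi-hop loop of the kind described above, assuming toy `llm`, `retrieve`, and `decompose` placeholders; it is an illustration, not any cited system's implementation.

```python
# Minimal sketch of a decomposition-based multi-hop loop.
# `llm`, `retrieve`, and `decompose` are illustrative placeholders.

def llm(prompt: str) -> str:
    """Placeholder for a call to an off-the-shelf LLM."""
    return "stub answer to: " + prompt[:40]

def retrieve(query: str, k: int = 2) -> list[str]:
    """Placeholder retriever over an external document collection."""
    return [f"document {i} relevant to '{query}'" for i in range(k)]

def decompose(query: str) -> list[str]:
    """Ask the LLM to split a multi-hop query into single-hop sub-queries."""
    raw = llm(f"Decompose into sub-questions, one per line: {query}")
    return [line.strip() for line in raw.splitlines() if line.strip()]

def multi_hop_qa(query: str) -> str:
    sub_answers: list[str] = []
    for sub_q in decompose(query):
        docs = retrieve(sub_q)                   # access the retriever for each sub-query
        context = "\n".join(docs + sub_answers)  # earlier sub-answers feed later steps
        sub_answers.append(llm(f"Context:\n{context}\nQuestion: {sub_q}"))
    # Merge the sub-answers to formulate the complete answer to the original query.
    return llm("Facts:\n" + "\n".join(sub_answers) + f"\nNow answer: {query}")

print(multi_hop_qa("Where was the director of the film Oldboy born?"))
```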
Adaptive retrieval To handle queries of varying complexities, adaptive retrieval strategies aim to decide whether or not to retrieve documents based on the complexity of each query. In this vein, Mallen et al. (2023) proposed determining the complexity level of a query from the frequency of its entities, and using the retrieval module only when that frequency falls below a threshold. However, this approach, which focuses only on the binary decision of whether or not to retrieve, may not be sufficient for more complex queries that require multiple reasoning steps. In addition, Qi et al. (2021) proposed a method that performs a fixed set of operations (retrieving, reading, and reranking) multiple times until the answer to the given query is derived, built on conventional LMs such as BERT. However, in contrast to an adaptive approach that determines the query complexity in advance and adapts the operational behavior of any off-the-shelf LLM accordingly, this method not only applies the same fixed operations to every query regardless of its complexity but also requires additional specific training of the LMs. Concurrently with our work, Asai et al. (2024) suggested training a sophisticated model to retrieve, critique, and generate text dynamically. However, we argue that all the aforementioned adaptive retrieval approaches that rely on a single model may be suboptimal in handling queries with a range of different complexities, since they tend to be either too simple or too complex for all input queries, which calls for a new approach that can select the most suitable strategy for retrieval-augmented LLMs.
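As a rough illustration of the binary retrieve-or-not decision attributed above to Mallen et al. (2023), the sketch below consults the retriever only when a query entity is infrequent; the frequency table, threshold value, and helper names are assumptions for the example, not the original implementation.

```python
# Sketch of a binary retrieve-or-not decision based on entity popularity.
# Frequency counts, threshold, and stubs are illustrative assumptions.

ENTITY_FREQUENCY = {"France": 2_500_000, "Tuvalu": 8_000}   # toy popularity counts
FREQUENCY_THRESHOLD = 100_000

def llm(prompt: str) -> str:
    """Placeholder for an off-the-shelf LLM call."""
    return "stub answer for: " + prompt[:60]

def retrieve(query: str) -> list[str]:
    """Placeholder retriever over an external corpus."""
    return [f"document relevant to '{query}'"]

def answer(query: str, entity: str) -> str:
    # Retrieve only when the entity is rare enough that the LLM's
    # parametric memory is unlikely to cover it.
    if ENTITY_FREQUENCY.get(entity, 0) < FREQUENCY_THRESHOLD:
        docs = retrieve(query)
        return llm(f"Documents: {docs}\nQuestion: {query}")
    return llm(query)   # popular entity: answer from the LLM alone

print(answer("What is the capital of Tuvalu?", "Tuvalu"))
```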
3 Method
In this section, we describe our approach to adapting retrieval-augmented LLMs by determining the complexity of the query in advance and then selecting the most suitable strategy for retrieval-augmented LLMs.
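At a high level, this amounts to routing each query to one of several answering strategies according to a pre-assessed complexity label. The sketch below illustrates that dispatch; the `classify_complexity` heuristic, the A/B/C labels, and the strategy stubs are simplifying assumptions for illustration, with the actual classifier and strategies described in the following subsections.

```python
# High-level sketch of selecting an answering strategy from a pre-assessed
# query complexity. Labels, heuristic, and strategy stubs are assumptions.

def no_retrieval_qa(query: str) -> str:
    return f"answer from the LLM alone for {query!r}"

def single_step_qa(query: str) -> str:
    return f"answer after one retrieval step for {query!r}"

def multi_step_qa(query: str) -> str:
    return f"answer after iterative retrieval and reasoning for {query!r}"

def classify_complexity(query: str) -> str:
    """Toy stand-in for a classifier that predicts query complexity in advance."""
    n_terms = len(query.split())
    return "A" if n_terms < 6 else ("B" if n_terms < 12 else "C")

STRATEGIES = {
    "A": no_retrieval_qa,   # simple query: the LLM itself may suffice
    "B": single_step_qa,    # moderate query: a single retrieval step
    "C": multi_step_qa,     # complex query: repeated retrieval and reasoning
}

def adaptive_qa(query: str) -> str:
    label = classify_complexity(query)   # complexity is decided before answering
    return STRATEGIES[label](query)

print(adaptive_qa("Who wrote Hamlet?"))
print(adaptive_qa("Which award did the composer of the film Oldboy's soundtrack receive?"))
```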
3.1 Preliminaries
We begin with preliminaries, formally introducing the different strategies for retrieval-augmented LLMs.
Single-step Approach for QA To handle the aforementioned scenarios in which the LLM may struggle with queries it cannot answer on its own, we can leverage external knowledge d that contains useful information for the query, retrieved from an external knowledge source D, which can be an encyclopedia (e.g., Wikipedia) consisting of a large collection of documents. Specifically, to obtain such d from D, a retrieval model is required, which returns documents based on their relevance to the given query q. This process can be formulated as follows: d = Retriever(q; D), where Retriever is a retrieval model and d ∈ D. Here, we can use any off-the-shelf retriever (Robertson et al., 1994; Karpukhin et al., 2020).
This process allows the LLM to access external information in d, which can provide supplementary context that the LLM's internal knowledge lacks, and which can subsequently improve the accuracy and factuality of retrieval-augmented LLMs for QA.
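The single-step formulation above, d = Retriever(q; D) followed by prompting the LLM with both the query and the retrieved documents, can be sketched as follows; the toy word-overlap retriever and the `llm` stub are stand-ins for an off-the-shelf retriever (e.g., BM25 or DPR) and a real LLM.

```python
# Sketch of single-step retrieval-augmented QA: d = Retriever(q; D),
# then the LLM answers using the query together with the retrieved documents.
# The word-overlap retriever and `llm` stub are illustrative stand-ins.

def retriever(q: str, D: list[str], k: int = 2) -> list[str]:
    """Rank documents in D by word overlap with the query q and return the top k."""
    q_terms = set(q.lower().split())
    ranked = sorted(D, key=lambda doc: len(q_terms & set(doc.lower().split())),
                    reverse=True)
    return ranked[:k]

def llm(prompt: str) -> str:
    """Placeholder for an off-the-shelf LLM call."""
    return "stub answer for: " + prompt[:60]

D = [
    "Paris is the capital of France.",
    "The Eiffel Tower was completed in 1889.",
    "Mount Everest is the highest mountain on Earth.",
]

q = "When was the Eiffel Tower completed?"
d = retriever(q, D)                            # d = Retriever(q; D), with d drawn from D
print(llm(f"Documents: {d}\nQuestion: {q}"))   # augment the LLM input with d
```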
Multi-step Approach for QA While the aforementioned single-step approach offers significant improvements over the no-retrieval approach for queries that require external knowledge, it faces notable limitations, especially when handling complex queries that require synthesizing information from multiple source documents and reasoning over them. This is where a multi-step approach of retrieval and reasoning for QA becomes necessary.
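A minimal sketch of such a multi-step loop follows, assuming a simple fixed step budget and a context that accumulates previously retrieved documents and intermediate answers (cf. the footnote below on how implementations vary); the stubs and stopping rule are illustrative assumptions.

```python
# Sketch of a multi-step approach: retrieve and reason repeatedly, carrying
# forward a context of prior documents and intermediate answers.
# Stubs, step budget, and stopping rule are assumptions for illustration.

def retrieve(query: str) -> list[str]:
    """Placeholder retriever over an external corpus."""
    return [f"document relevant to '{query}'"]

def llm(prompt: str) -> str:
    """Placeholder LLM call returning an intermediate (or final) answer."""
    return f"answer derived from: {prompt[:40]}..."

def multi_step_qa(query: str, max_steps: int = 3) -> str:
    context: list[str] = []   # may hold none, some, or all previous documents and answers
    answer = ""
    for step in range(max_steps):
        # Retrieve with the original query first, then with the latest intermediate answer.
        docs = retrieve(query if step == 0 else answer)
        context.extend(docs)
        answer = llm(f"Context: {context}\nQuestion: {query}")
        context.append(answer)
        if "final" in answer.lower():   # assumed stopping condition
            break
    return answer

print(multi_step_qa("Which country is the birthplace of the director of Oldboy?"))
```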
[1] It should be noted that the implementations of LLM and Retriever vary across different multi-step approaches (Trivedi et al., 2023; Press et al., 2023). Therefore, the context c_i may incorporate none, some, or all of the previous documents and answers.