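Training the Retriever: The retriever model is trained to encode both queries and documents into a shared vector space, so that a query and the documents relevant to it end up close together. The document vectors are typically stored in a vector database for fast lookup.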
Retrieving Documents: For a given query, the retriever model encodes the query into a vector and retrieves the top-k most similar documents from the corpus based on vector similarity.
Training the Generator: The generator is trained on a dataset where the inputs consist of the query and the retrieved documents, and the outputs are the desired responses. This training helps the generator learn to use the context provided by the retrieved documents to produce accurate and relevant responses.
Generating Responses: During inference, for a given query, the retriever first fetches the top-k relevant documents. These documents are then fed into the generator along with the query. The generator produces a response based on the combined input of the query and the retrieved documents.
Integration and Optimization: The retriever and generator are integrated into a single pipeline where the output of the retriever feeds directly into the generator. In this phase, the retriever and generator can also be trained jointly to optimize overall system performance.
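
The inference path described in these steps (encode the query, fetch the top-k documents, pass them to the generator along with the query) can be sketched in a few lines of Python. This is a minimal illustration under loud assumptions, not a real implementation: embed is a toy bag-of-words stand-in for a trained retriever encoder, build_generator_input is a hypothetical helper that only assembles the text a generator would receive, and no actual generator model is called.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a trained encoder (assumption): word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, corpus, k=2):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(corpus, key=lambda doc: cosine(q, embed(doc)), reverse=True)
    return ranked[:k]


def build_generator_input(query, docs):
    """Hypothetical helper: combine retrieved documents and the query."""
    context = "\n".join(f"[{i + 1}] {d}" for i, d in enumerate(docs))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


corpus = [
    "The retriever encodes queries and documents as vectors.",
    "The generator produces a response from the query and context.",
    "Bananas are rich in potassium.",
]
query = "How does the retriever encode documents?"

top_docs = retrieve(query, corpus, k=2)
prompt = build_generator_input(query, top_docs)
print(prompt)
```

In a real system the encoder would be a trained neural model, the sorted scan would be replaced by a vector index lookup, and the assembled prompt would be passed to the trained generator.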