Developing a RAG-Based Q&A Solution for Deltek: A Collaboration with AWS GenAIIC
Generative AI for Question Answering on Government Solicitation Documents
This blog post, co-written by Kevin Plexico and Shakun Vohra from Deltek, covers Retrieval Augmented Generation (RAG) and its application to question answering (Q&A) over documents. RAG uses large language models (LLMs) to let users query documents in natural language, which makes it useful for applications such as customer support chatbots, legal research assistants, and healthcare advisors.
The post gives an overview of a custom solution developed by the AWS Generative AI Innovation Center (GenAIIC) for Deltek, a globally recognized provider of industry-specific software and information solutions for project-based businesses in the government contracting and professional services sectors. Deltek serves over 30,000 clients and worked with the AWS GenAIIC team to build a RAG-based solution for Q&A on single and multiple government solicitation documents.
The solution uses AWS services including Amazon Textract, Amazon OpenSearch Service, and Amazon Bedrock. Amazon Bedrock offers a choice of high-performing foundation models (FMs) and LLMs from leading AI companies, which lets Deltek improve question answering across diverse document formats while maintaining security, privacy, and responsible AI practices.
RAG optimizes LLM output by letting the model reference an authoritative knowledge base outside its training data, which addresses problems such as outdated or generic answers. The post walks through two stages. During data ingestion, text and tables are extracted from the documents, embedding vectors are generated for the extracted content, and the results are indexed in OpenSearch Service. During Q&A, the LLM uses the retrieved content to answer user queries in natural language.
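The ingestion and Q&A steps can be sketched in a few lines of Python. This is only an illustration of the general RAG flow, not the solution built for Deltek: the embed function is a toy bag-of-words stand-in for a real embedding model, an in-memory list stands in for the OpenSearch Service index, and the final LLM call is left out so that only prompt assembly is shown. All function names here are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a sparse bag-of-words vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def ingest(chunks):
    """Ingestion: embed each extracted chunk and store it in the index."""
    return [{"text": c, "vector": embed(c)} for c in chunks]


def retrieve(index, question, k=2):
    """Q&A, step 1: return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda e: cosine(q, e["vector"]), reverse=True)
    return [e["text"] for e in ranked[:k]]


def build_prompt(question, passages):
    """Q&A, step 2: assemble the prompt that would be sent to the LLM."""
    context = "\n\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Made-up example chunks, as if extracted from a solicitation document.
index = ingest([
    "Proposals are due on March 3.",
    "The contract covers janitorial services.",
    "Questions must be submitted by email.",
])
question = "When are proposals due?"
prompt = build_prompt(question, retrieve(index, question, k=1))
```

In a real deployment the similarity search would run inside the vector index rather than in Python, and the assembled prompt would be passed to the chosen model.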
The main challenge discussed in the post is applying RAG to Q&A across multiple related documents, where the system must account for how the documents relate in time so that answers stay accurate and relevant. Three features are key to the system's performance: section-aware chunking, table-to-CSV transformation, and metadata added to the index. In Deltek’s evaluations, the system reached 96% overall accuracy.
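A minimal sketch of these three features follows. It rests on several assumptions that the post summary does not confirm: that sections start with numbered headings such as "2.1 Schedule", that tables arrive as lists of rows, and that the metadata stored with each chunk is the document name and date. The function names and the sample file name are hypothetical.

```python
import csv
import io
import re

# Assumed heading format: a section number followed by a title, e.g. "2.1 Schedule".
# This simple pattern would also match body lines that start with a number.
HEADING = re.compile(r"^(\d+(?:\.\d+)*)\s+(\S.*)$")


def chunk_by_section(text, doc_name, doc_date):
    """Split text at numbered headings and attach metadata to each chunk."""
    sections = [{"section": "preamble", "title": "", "lines": []}]
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = HEADING.match(line)
        if match:
            # A heading starts a new chunk, so chunks never straddle sections.
            sections.append(
                {"section": match.group(1), "title": match.group(2), "lines": []}
            )
        else:
            sections[-1]["lines"].append(line)
    return [
        {
            "document": doc_name,  # assumed metadata field
            "date": doc_date,      # assumed metadata field, useful for ordering
            "section": s["section"],
            "title": s["title"],
            "text": " ".join(s["lines"]),
        }
        for s in sections
        if s["lines"] or s["title"]
    ]


def table_to_csv(rows):
    """Serialize a table (list of rows) as CSV text, quoting where needed."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


sample = (
    "1 Scope\nJanitorial services.\n"
    "2 Schedule\nProposals due March 3.\nQuestions due Feb 1."
)
chunks = chunk_by_section(sample, "rfp.pdf", "2024-01-15")
table = table_to_csv([["Item", "Due date"], ["Proposal, final", "March 3"]])
```

Storing a date with every chunk is one way to let the Q&A step prefer or compare documents by time, and CSV keeps a table's row and column structure in a form that can be placed directly into a prompt.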
The post concludes by emphasizing the collaboration between Deltek and the AWS GenAIIC team in building a generative AI solution that streamlines the review of complex proposal documents. Deltek continues to refine the solution, for example by expanding support for additional file formats and optimizing the data ingestion pipelines, to meet specific business requirements.
Overall, the post shows how generative AI can be applied in practice to question answering on government solicitation documents, and how a RAG architecture enables efficient and accurate interaction with textual data.