How PDI Developed a Robust Enterprise-Grade RAG System for AI Applications on AWS

Transforming Enterprise Knowledge Accessibility: The PDIQ Solution

Introduction to PDI Technologies

Challenges in Knowledge Accessibility

Overview of PDI Intelligence Query (PDIQ)

Solution Architecture

Process Flow

Crawlers

Handling Images

Document Processing

Outcomes and Next Steps

Conclusion

About the Authors

PDI Technologies: Revolutionizing Knowledge Access with PDI Intelligence Query

PDI Technologies is a global leader in the convenience retail and petroleum wholesale industries, with 40 years of experience helping clients worldwide improve profitability and operational efficiency. The company has built PDI Intelligence Query (PDIQ), an AI-powered assistant designed to streamline knowledge access within the organization.

The Challenge: Fragmented Knowledge Management

Despite PDI’s experience, one internal problem persisted: information was fragmented across multiple systems, including websites, Confluence pages, SharePoint sites, and various other data sources. Internal teams struggled to retrieve and use this information effectively, and growing demand for AI-driven insights made the problem worse.

To address this, PDI Technologies developed PDIQ, a tool that consolidates company knowledge and makes it accessible through a chat interface. PDIQ is engineered to overcome several challenges:

  • Content Extraction: Automatically pulling data from diverse platforms with various authentication requirements.
  • Model Flexibility: Facilitating the selection and application of the most suitable Large Language Models (LLMs) for varied processing needs.
  • Semantic Processing: Indexing content for contextual and meaningful retrieval.
  • Knowledge Refresh: Keeping information up-to-date through scheduled crawling.
  • Enterprise-Specific Context: Ensuring AI interactions are relevant to specific business scenarios.

Solution Architecture: How PDIQ Works

An Overview of PDIQ’s Architecture

PDIQ is built from several managed services on Amazon Web Services (AWS). Its key components are:

  • Scheduler: Managed by Amazon EventBridge, it executes the crawling schedule.
  • Crawlers: Powered by AWS Lambda, these collect data from various sources including web pages, Confluence, Azure DevOps, and SharePoint.
  • Data Storage: Information is stored in Amazon S3 and pertinent metadata is cataloged in Amazon DynamoDB.
  • Notification Services: Amazon SNS and Amazon SQS facilitate communication and queue management among the different services.
  • Embedding Generation: Amazon Bedrock provides access to foundation models for processing data, while Amazon Aurora stores the vector embeddings for retrieval.
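
To make the hand-offs between these components concrete, here is a minimal sketch in Python. It is not PDI's implementation and makes no AWS SDK calls: the class `InMemoryPipeline`, its methods, and the key layout are hypothetical, and plain dictionaries and a deque stand in for Amazon S3, Amazon DynamoDB, and Amazon SNS/SQS. Skipping unchanged content by comparing hashes is also an assumption; the article only says that crawls run on a schedule.

```python
import hashlib
from collections import deque
from dataclasses import dataclass, field


@dataclass
class InMemoryPipeline:
    """Toy stand-ins for the managed services (assumptions, not AWS SDK calls)."""

    object_store: dict = field(default_factory=dict)    # stands in for Amazon S3
    metadata_table: dict = field(default_factory=dict)  # stands in for Amazon DynamoDB
    queue: deque = field(default_factory=deque)         # stands in for Amazon SNS/SQS

    def crawl(self, source_id: str, fetch) -> bool:
        """Run one scheduled crawl; return True if new or changed content was queued."""
        content = fetch(source_id)
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        previous = self.metadata_table.get(source_id)
        if previous is not None and previous["sha256"] == digest:
            return False  # assumption: unchanged content is not reprocessed
        key = f"raw/{source_id}"  # hypothetical object key layout
        self.object_store[key] = content
        self.metadata_table[source_id] = {"key": key, "sha256": digest}
        self.queue.append({"source_id": source_id, "key": key})
        return True

    def drain(self, process) -> list:
        """Hand each queued message's stored content to the downstream processor."""
        results = []
        while self.queue:
            message = self.queue.popleft()
            results.append(process(self.object_store[message["key"]]))
        return results
```

In this sketch the scheduler's role is simply whatever calls `crawl` on a timer, and `drain` plays the part of the embedding stage that consumes queued documents.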

Ensuring Security with a Zero-Trust Model

PDIQ embraces a zero-trust security model to safeguard sensitive information. There are distinct access controls for administrators and end-users:

  • Administrators manage crawlers and data through configured user groups and encrypted credentials.
  • End-users access knowledge bases based on validated group permissions, enhancing security without compromising on usability.
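
The end-user rule above can be illustrated with a small deny-by-default check. This is a sketch under stated assumptions, not PDIQ's actual authorization code: `KnowledgeBase`, `accessible_knowledge_bases`, and the group and knowledge base names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnowledgeBase:
    name: str
    allowed_groups: frozenset  # groups permitted to query this knowledge base


def accessible_knowledge_bases(user_groups, knowledge_bases):
    """Return the names of knowledge bases the user may query.

    Deny by default: a knowledge base is visible only when the user's
    validated groups intersect its allow list.
    """
    groups = frozenset(user_groups)
    return [kb.name for kb in knowledge_bases if groups & kb.allowed_groups]


# Hypothetical catalog used only for illustration.
catalog = [
    KnowledgeBase("support-runbooks", frozenset({"support", "engineering"})),
    KnowledgeBase("hr-policies", frozenset({"hr"})),
]
print(accessible_knowledge_bases({"support"}, catalog))  # ['support-runbooks']
print(accessible_knowledge_bases(set(), catalog))        # []
```

A user with no validated groups sees nothing, which is the behavior a zero-trust model calls for.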

Step-by-Step Process Flow

PDIQ processes content in the following stages:

Data Collection via Crawlers

Crawlers, customizable by administrators, are the backbone of data collection. They support various configurations to target specific information sources, ensuring a comprehensive knowledge base.

Types of Crawlers:

  • Web Crawler: Uses Puppeteer to render pages in a headless browser and converts the resulting HTML to Markdown, capturing full context and relationships.
  • Confluence Crawler: Extracts page content while preserving hierarchy and relationships.
  • Azure DevOps Crawler: Aggregates information about codebases and project documentation.
  • SharePoint Crawler: Uses the Microsoft Graph API to pull documents and maintain version histories.
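
As an illustration of the HTML-to-Markdown step in the web crawler, the sketch below converts a small subset of HTML using only Python's standard library. It is a stand-in, not the Puppeteer-based crawler described above: it assumes the page has already been rendered to an HTML string, and the names `MarkdownExtractor` and `html_to_markdown` are hypothetical. It handles only headings, paragraphs, list items, and links.

```python
import re
from html.parser import HTMLParser


class MarkdownExtractor(HTMLParser):
    """Convert a small subset of HTML (h1-h3, p, ul/ol, li, a) to Markdown."""

    HEADINGS = {"h1": "# ", "h2": "## ", "h3": "### "}

    def __init__(self):
        super().__init__()
        self.parts = []
        self.href = None

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS:
            self.parts.append("\n\n" + self.HEADINGS[tag])
        elif tag == "p":
            self.parts.append("\n\n")
        elif tag in ("ul", "ol"):
            self.parts.append("\n")
        elif tag == "li":
            self.parts.append("\n- ")  # simplification: ordered lists also get "-"
        elif tag == "a":
            self.href = dict(attrs).get("href")
            self.parts.append("[")

    def handle_endtag(self, tag):
        if tag == "a":
            self.parts.append(f"]({self.href or ''})")
            self.href = None

    def handle_data(self, data):
        if data.strip():  # skip whitespace-only text between tags
            self.parts.append(re.sub(r"\s+", " ", data))


def html_to_markdown(html: str) -> str:
    parser = MarkdownExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip()
```

Keeping headings and links in the output is what lets later stages preserve document structure and relationships rather than working from flat text.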

Image Handling and Document Processing

Images extracted from data sources are stored in Amazon S3, with metadata tags ensuring easy reference. Image captions are generated to enhance searchability and are linked back to the original documents.
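
One way to link captions back to their documents is to replace each image reference in the crawled Markdown with its caption and record where the image is stored. The sketch below assumes Markdown input; the function `inline_captions`, the `caption_for` callback (a stand-in for whatever model generates captions), and the object key layout are hypothetical and not taken from the article.

```python
import re

# Matches Markdown images such as ![alt text](path/to/image.png)
IMAGE_PATTERN = re.compile(r"!\[(?P<alt>[^\]]*)\]\((?P<src>[^)\s]+)\)")


def inline_captions(markdown: str, doc_id: str, caption_for) -> tuple[str, list[dict]]:
    """Replace each Markdown image with its caption and record storage metadata."""
    records = []

    def replace(match):
        src = match.group("src")
        caption = caption_for(src)  # stand-in for a caption-generating model
        key = f"images/{doc_id}/{len(records)}"  # hypothetical object key layout
        records.append(
            {"key": key, "source_document": doc_id, "original_src": src, "caption": caption}
        )
        return f"[Image: {caption}]({key})"

    return IMAGE_PATTERN.sub(replace, markdown), records
```

The returned records play the role of the metadata tags: each one ties a stored image and its caption to the document it came from.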

The document processing phase generates vector embeddings through a series of steps: captioning images, breaking documents into chunks, summarizing content, and creating embeddings. This multi-step approach enriches each document's context and improves retrieval effectiveness.
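
The chunk, summarize, and embed steps can be sketched as follows. This is an illustration under stated assumptions rather than PDIQ's pipeline: `toy_embedding` is a hashed bag-of-words stand-in for a real embedding model, `summarize` is a caller-supplied callback, the chunk sizes are arbitrary, and prepending the summary to each chunk before embedding is an assumed way of "enriching" context that the article does not confirm.

```python
import hashlib
import math


def chunk_words(text: str, size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into windows of `size` words that overlap by `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    words = text.split()
    if not words:
        return []
    step = size - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, stop, step)]


def toy_embedding(text: str, dims: int = 64) -> list[float]:
    """Hashed bag-of-words vector, L2-normalized. A stand-in for a real model."""
    vector = [0.0] * dims
    for word in text.lower().split():
        digest = hashlib.sha256(word.encode("utf-8")).digest()
        vector[int.from_bytes(digest[:4], "big") % dims] += 1.0
    norm = math.sqrt(sum(v * v for v in vector)) or 1.0
    return [v / norm for v in vector]


def build_records(doc_id: str, text: str, summarize, embed=toy_embedding, **chunk_args) -> list[dict]:
    """Summarize once, then embed each chunk together with the document summary."""
    summary = summarize(text)
    records = []
    for index, chunk in enumerate(chunk_words(text, **chunk_args)):
        enriched = f"{summary}\n\n{chunk}"  # assumption: summary is prepended to each chunk
        records.append(
            {"doc_id": doc_id, "chunk_index": index, "text": chunk, "embedding": embed(enriched)}
        )
    return records
```

Overlapping windows reduce the chance that a sentence split across a chunk boundary becomes unretrievable, at the cost of storing some text twice.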

Achieving Business Outcomes

With this architecture in place, PDI Technologies reports the following benefits:

  • Efficiency Boost: Support teams resolve inquiries faster, leading to quicker customer responses.
  • Increased Customer Satisfaction: Accurate and relevant information strengthens customer relationships.
  • Cost Reduction: Automation reduces operational overhead and allows staff to focus on complex issues.
  • Business Flexibility: The solution is adaptable for various business units without extensive redesigns.

Future Enhancements

As PDI continues to evolve PDIQ, plans are underway for additional enhancements, including:

  • New crawlers for additional data sources.
  • Multilingual support for global operations.
  • Advanced document understanding features.

Conclusion

With PDIQ, PDI Technologies has built an AI-driven assistant that makes internal knowledge easier to find and improves operational efficiency. By building on AWS’s scalable services, PDIQ balances performance, cost, and security. PDI plans to keep extending the solution as its knowledge sources and user base grow.


About the Authors

Samit Kumbhani is a Senior Solutions Architect at AWS, focusing on scalable cloud solutions. His diverse interests include cricket and traveling.

Jhorlin De Armas leads AI-driven platform design at PDI Technologies and specializes in serverless architectures.

David Mbonu is a Senior Solutions Architect at AWS with extensive experience in enterprise solutions and a focus on AI/ML innovations.
