Leveraging Azure OpenAI and Cognitive Search for Enterprise AI Applications

The availability of OpenAI models on Azure makes it possible to integrate ChatGPT with existing enterprise data through Azure OpenAI and Azure AI Search (formerly Azure Cognitive Search) in .NET (or any front-end) applications. These services enhance business operations with advanced Natural Language Processing (NLP), allowing organizations to build efficient Enterprise Knowledge Management and Customer Support systems. This post discusses how to leverage .NET and Azure services such as Azure OpenAI, Azure Cognitive Search, and ChatGPT to create intelligent AI-powered applications.

Some of the use cases for a ChatGPT-based chatbot on enterprise data are:

  1. Enterprise Knowledge Management: Companies can create AI-powered knowledge bases that quickly summarize and retrieve information from large document repositories, making it easier for employees to find what they need.
  2. Customer Support: AI-driven chatbots can handle customer queries, resolve issues, and provide detailed product information, significantly reducing the workload on human agents.

Technical Architecture for Enterprise Data with Azure OpenAI and AI Search

The diagram below represents the architecture of an Azure-based AI chatbot system, showing the interaction between the user, API Management, Azure Cognitive Search, Azure OpenAI, and various data stores.

  1. Indexing the Knowledge Base: Azure Cognitive Search indexes the knowledge base from various data sources (e.g., SharePoint, SQL, MySQL).
  2. User Login: A user logs into the Azure chat web app through the public network.
  3. Access Token Generation: After a successful login, an access token is generated and sent with the user’s prompt to the API Management system.
  4. Token Validation: API Management validates the access token using Azure Active Directory (AAD).
  5. Prompt Forwarding: Once the token is validated, API Management forwards the user’s prompt to the back-end API.
  6. Cached Response Retrieval: The API checks Azure Redis Cache for an existing response to the prompt, which avoids a call to Azure OpenAI and reduces token consumption. If a cached response is found, it is returned directly.
  7. Azure Cognitive Search Query: If there is no cached response, the prompt is sent to Azure Cognitive Search to query the indexed knowledge base.
  8. OpenAI Response Generation: Azure OpenAI processes the prompt together with the data retrieved from Cognitive Search and generates a response grounded in the knowledge base.
  9. Response Delivery: The final response (from either the cache or the AI) is sent back to the Azure chat web app for user interaction.
  10. Monitoring: Azure Monitor tracks and logs the system’s performance and activity across the network and services.


Azure Cognitive Search: To build a ChatGPT-like feature on Azure, the initial step is indexing the knowledge base, which may be composed of documents in various formats. These knowledge bases might be stored in different data sources, such as a SharePoint library, SQL databases, or Azure Data Lake. Azure Cognitive Search is well suited for this task: it supports indexing from various source systems, including SharePoint, Cosmos DB, SQL, Azure Data Lake, and Azure Storage accounts, so documents from these sources can be indexed and enriched. Once the knowledge bases are indexed, they can be queried using Azure Search querying, similar to how searches are conducted in SharePoint. Additionally, the index can be set to update automatically as the data changes, with refreshes scheduled to run daily, hourly, or on demand.
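
To make the query side concrete, here is a minimal sketch using the Azure.Search.Documents SDK. The service endpoint, index name, API key, and the "title" and "content" field names are placeholders for your own index:

```csharp
using System;
using Azure;
using Azure.Search.Documents;
using Azure.Search.Documents.Models;

// Query an existing index for the documents most relevant to a user prompt.
var searchClient = new SearchClient(
    new Uri("https://<your-search-service>.search.windows.net"),
    "knowledge-base-index",
    new AzureKeyCredential("<search-api-key>"));

var options = new SearchOptions { Size = 3 };   // top 3 matches to use as context
options.Select.Add("title");
options.Select.Add("content");

SearchResults<SearchDocument> results =
    await searchClient.SearchAsync<SearchDocument>("how do I reset my password", options);

await foreach (SearchResult<SearchDocument> result in results.GetResultsAsync())
{
    Console.WriteLine(result.Document["title"]);  // feed "content" to the model as context
}
```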

Azure OpenAI: Azure OpenAI provides the ChatGPT models used to generate responses based on user queries. It supports the Retrieval-Augmented Generation (RAG) pattern, which enhances response accuracy by incorporating relevant information from a knowledge base. Here’s how it works: when a user submits a query, the system first queries the knowledge base to retrieve relevant documents. These documents are then used as context for generating a response. The back end uses Semantic Kernel to create and execute prompts, which translate user queries into search terms and generate contextually accurate responses based on the retrieved information.
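
As a rough illustration of the RAG flow, the sketch below uses Semantic Kernel to inject retrieved passages into the prompt. The deployment name, endpoint, and key are placeholders, and retrievedContext stands in for the passages returned by the Cognitive Search query above:

```csharp
using System;
using Microsoft.SemanticKernel;

// Build a kernel backed by an Azure OpenAI chat deployment (names are placeholders).
var kernel = Kernel.CreateBuilder()
    .AddAzureOpenAIChatCompletion(
        deploymentName: "gpt-35-turbo",
        endpoint: "https://<your-openai-resource>.openai.azure.com/",
        apiKey: "<openai-api-key>")
    .Build();

string retrievedContext = "...passages returned by Azure Cognitive Search...";
string userQuestion = "How do I reset my account password?";

// Ground the model: instruct it to answer only from the supplied context.
string prompt = $"""
    Answer the question using only the context below.
    If the answer is not in the context, say you don't know.

    Context:
    {retrievedContext}

    Question: {userQuestion}
    """;

FunctionResult answer = await kernel.InvokePromptAsync(prompt);
Console.WriteLine(answer.GetValue<string>());
```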

Azure Redis Cache: Azure Redis Cache is used to store and quickly retrieve responses to frequently asked or similar queries. By caching these responses, the service reduces latency, providing faster answers to users. Additionally, it helps manage costs by decreasing the number of requests that need to be sent to the Azure OpenAI Service, as cached responses can be served instead.
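
A minimal cache-aside sketch with the StackExchange.Redis client, keyed on a hash of the normalized prompt (the connection string is a placeholder):

```csharp
using System;
using System.Security.Cryptography;
using System.Text;
using StackExchange.Redis;

ConnectionMultiplexer redis =
    await ConnectionMultiplexer.ConnectAsync("<redis-connection-string>");
IDatabase cache = redis.GetDatabase();

// Normalize and hash the prompt so equivalent prompts map to the same key.
string prompt = "How do I reset my account password?";
string cacheKey = "chat:" + Convert.ToHexString(
    SHA256.HashData(Encoding.UTF8.GetBytes(prompt.Trim().ToLowerInvariant())));

RedisValue cached = await cache.StringGetAsync(cacheKey);
if (cached.HasValue)
{
    Console.WriteLine(cached);   // cache hit: skip the Azure OpenAI call entirely
}
else
{
    string response = "...response generated by Azure OpenAI...";
    await cache.StringSetAsync(cacheKey, response, TimeSpan.FromHours(24));  // expire daily
}
```

Note that exact-key caching only catches repeated identical (or identically normalized) prompts; matching semantically similar queries would require comparing embeddings rather than string keys.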

API Management: API Management serves as the gateway for all API requests from the Azure Chat Web App. It handles key tasks such as token validation, ensuring secure access, and routing requests to the back-end API within the private virtual network. Additionally, API Management can enforce rate limiting and throttling to control traffic, optimize performance, and prevent overloading of back-end services.
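
For illustration, an inbound APIM policy along these lines could handle both the token validation and the throttling described above; {{tenant-id}} and {{backend-api-client-id}} are placeholder named values, not values from this project:

```xml
<inbound>
    <!-- Reject requests whose AAD-issued access token does not validate -->
    <validate-jwt header-name="Authorization" failed-validation-httpcode="401">
        <openid-config url="https://login.microsoftonline.com/{{tenant-id}}/v2.0/.well-known/openid-configuration" />
        <audiences>
            <audience>api://{{backend-api-client-id}}</audience>
        </audiences>
    </validate-jwt>
    <!-- Throttle each caller (keyed here by IP) to 100 calls per minute -->
    <rate-limit-by-key calls="100" renewal-period="60"
                       counter-key="@(context.Request.IpAddress)" />
</inbound>
```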

Function App (API): The API orchestrates interactions among various services. When a request is received from the Chat App, the API first checks Redis Cache to see if a response for that prompt already exists. If it does, the cached response is immediately returned to the Chat App, reducing latency and costs. If the response isn’t in the cache, the API sends the query to Azure Cognitive Search to retrieve relevant indexed data. This data, along with the user’s prompt, is then sent to the Azure OpenAI Service, which generates the response. Finally, the API returns the response to the Chat App and caches it for future use.
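
Putting the pieces together, the orchestration can be sketched as below. SearchKnowledgeBaseAsync and GenerateAnswerAsync are hypothetical stand-ins for the Cognitive Search and Semantic Kernel calls shown earlier, injected here as delegates:

```csharp
using System;
using System.Threading.Tasks;
using StackExchange.Redis;

public class ChatOrchestrator
{
    private readonly IDatabase _cache;
    private readonly Func<string, Task<string>> _searchKnowledgeBaseAsync;    // hypothetical helper
    private readonly Func<string, string, Task<string>> _generateAnswerAsync; // hypothetical helper

    public ChatOrchestrator(
        IDatabase cache,
        Func<string, Task<string>> searchKnowledgeBaseAsync,
        Func<string, string, Task<string>> generateAnswerAsync)
    {
        _cache = cache;
        _searchKnowledgeBaseAsync = searchKnowledgeBaseAsync;
        _generateAnswerAsync = generateAnswerAsync;
    }

    public async Task<string> GetChatResponseAsync(string prompt)
    {
        string cacheKey = "chat:" + prompt.Trim().ToLowerInvariant();

        RedisValue cached = await _cache.StringGetAsync(cacheKey);
        if (cached.HasValue)
            return cached.ToString();                                 // 1. serve from cache

        string context = await _searchKnowledgeBaseAsync(prompt);     // 2. retrieve indexed passages
        string answer = await _generateAnswerAsync(prompt, context);  // 3. generate grounded reply

        await _cache.StringSetAsync(cacheKey, answer, TimeSpan.FromHours(24)); // 4. cache for reuse
        return answer;
    }
}
```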

Azure Monitor: Used for monitoring the health and performance of the application, ensuring reliability and quick response to issues.
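
Assuming requests are logged to Application Insights, a Kusto query like the following in Azure Monitor’s Logs blade surfaces API traffic and failure rates (the table and column names are the Application Insights defaults):

```kusto
// API requests over the last 24 hours, bucketed hourly and split by result code
requests
| where timestamp > ago(24h)
| summarize count() by resultCode, bin(timestamp, 1h)
| order by timestamp asc
```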

The output of the chatbot: search outcome for Enterprise Data with Azure OpenAI and AI Search.

Data security should be a top priority in any AI application architecture.
Authentication: To protect sensitive information, we implement Azure AD authentication for the chat app, ensuring that user identities and access are securely managed.
Managed Identity: Managed Identity is used for Azure resource authentication, eliminating the need for hardcoded credentials and minimizing the risk of credential exposure (a sketch follows below).
Network Security: All resources are contained within a Virtual Network (VNET), utilizing Private Endpoints to secure network traffic, disable public internet exposure, and prevent unauthorized access. Azure API Management (APIM) V2 further enhances security by ensuring that only authorized requests reach the API endpoints.
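
As a sketch of the Managed Identity approach, DefaultAzureCredential from the Azure.Identity package resolves to the app’s managed identity when running in Azure (and to your developer credentials locally), so no keys appear in configuration; the endpoint and index name are placeholders:

```csharp
using System;
using Azure.Identity;
using Azure.Search.Documents;

// Authenticate to Azure Cognitive Search without any stored secret.
var credential = new DefaultAzureCredential();

var searchClient = new SearchClient(
    new Uri("https://<your-search-service>.search.windows.net"),
    "knowledge-base-index",
    credential);
```

The identity still needs an appropriate RBAC role on the target resource (for example, Search Index Data Reader for query access).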

Network security for Enterprise Data with Azure OpenAI and AI Search


This layered security approach ensures that your data remains protected at every step.

Deploying applications with this architecture involves costs for the following services:

  • Azure OpenAI Service: Charges are based on the number of tokens processed.
  • Azure Cognitive Search: Pricing varies depending on the storage capacity and query volume.
  • Azure Storage Account: Costs are determined by the amount of data stored.
  • Azure Redis Cache: Charges depend on the size of the cache and the number of operations.

Building intelligent applications using Azure services allows you to harness the full potential of AI, creating solutions that are powerful, secure, and scalable. By understanding the architecture, supported data sources, limitations, cost implications, and security considerations, you can design and deploy applications that not only meet business needs but also enhance user experiences.