Path to AI: Better Business Intelligence Through AI-Powered Assistants
Overview
Continuing our series ‘Path to AI’, where we explore ways to make AI more accessible for businesses of all sizes, I want to share how we plan to leverage ChatGPT-like large language models (LLMs) to create AI-Powered Data Assistants. This suite of generative AI chatbots helps product and business teams get insights into our data in a more conversational manner, complementing existing dashboards and reports. Our goal is to provide easy, on-demand, flexible access to our data and to enhance our business intelligence capabilities through AI.
This article outlines how we’re developing a proof of concept (POC) for a suite of AI-Powered Assistants using Snowflake’s Cortex AI and Notebooks. These assistants will address our business teams’ pain points by providing quicker, more flexible access to our data.
The Vision
As our company shifts toward a more product-driven culture, the need for rapid and flexible access to data has become increasingly important. Our AI data assistants are designed to meet this challenge, enabling self-service data exploration and reporting. While traditional BI tools like Tableau have laid a strong foundation, anyone involved in analytics or data teams knows that once a dashboard is launched, the flood of questions from the business begins. This often creates a bottleneck as we scramble to incorporate all the necessary answers into those dashboards, and as dashboards grow to absorb every new question, their complexity can overwhelm users and make it difficult to extract meaningful insights.
Our new AI assistants change the game by providing immediate, conversational access to data. This empowers users to derive insights at their own pace, dive deep into the data, and even explore those intriguing data rabbit holes if they choose. Our goal is to build a data assistant for each data set (email marketing metrics, customer care metrics, product metrics, finance, etc.); the moonshot goal is to then combine them all into a single, solid data set that can answer any question about our company’s data.
User Interface and Features
Our AI-Powered Data Assistants are designed with user experience in mind, ensuring that accessing data is as intuitive and straightforward as possible. Users log into our Snowflake account to engage with these chatbots. Upon entry, they are greeted with a brief overview of what the assistant can help with, including the types of data it accesses and the kinds of questions it can answer. This sets the stage for a seamless interaction right from the start.
Key Features:
- Guided Introduction: Users receive a concise description of the assistant’s capabilities, outlining the specific data available and the types of inquiries they can make. This helps users quickly understand how to maximize their interaction with the assistant.
- User Information Storage: The system collects the name and department of each user. This information is valuable for follow-ups, ensuring that we can reach out if further clarification or assistance is needed.
- Dynamic Responses: The assistant delivers answers in a clear, tabular format that is not only searchable but also downloadable as CSV files. This feature allows users to easily manipulate and analyze the data further as needed.
- Visualizations: Wherever applicable, the assistant provides line graphs and bar graphs to visually represent data, making it easier for users to grasp trends and insights at a glance.
- SQL Query Transparency: Each time the assistant retrieves information, it generates and displays the SQL query used to fetch the data. This transparency allows users to verify the query if they wish, fostering trust and understanding of the data retrieval process.
- Customer Feedback Mechanism: To continually improve the user experience, we’ve incorporated a feedback system featuring thumbs up and thumbs down options. This allows users to quickly indicate whether the information provided was helpful, facilitating ongoing refinement of the assistant’s performance.
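To make two of these features concrete, here is a minimal Python sketch of how a tabular answer can be serialized to a downloadable CSV and how a thumbs-up/down rating, together with the user's name and department, might be recorded. The function names and record fields are illustrative, not our production code:

```python
import csv
import io


def rows_to_csv(columns, rows):
    """Serialize a query result (header + rows) into a CSV string for download."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(columns)
    writer.writerows(rows)
    return buf.getvalue()


def record_feedback(log, user, department, question, helpful):
    """Append one thumbs-up/down rating so analysts can follow up later."""
    log.append({
        "user": user,
        "department": department,
        "question": question,
        "helpful": helpful,  # True = thumbs up, False = thumbs down
    })
    return log
```

In the Streamlit app, a string like the one `rows_to_csv` returns is what gets handed to the download widget, and the feedback records feed the refinement loop described above.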
The user interface itself is built as a Streamlit app, which offers a clean and responsive design. Moreover, the flexibility of Streamlit allows us to enhance and customize these features using Python, providing room for further improvements that I’ll delve into in future articles. With these thoughtful design elements and features, our AI-Powered Data Assistants aim to create an engaging and efficient data exploration experience.
Multi-Agent Architecture
To ensure the effectiveness and reliability of our AI-Powered Data Assistants, we developed a sophisticated multi-agent architecture. This setup consists of three key components working together seamlessly:
- Primary AI Agent: This is the interface users interact with directly. It processes user queries and retrieves relevant data based on the input provided. The primary agent is designed to understand natural language and respond in a conversational manner, making data exploration intuitive and user-friendly.
- Secondary Validation Agent: Acting as a safeguard, this agent re-runs the same prompts submitted to the primary AI agent. By independently validating the responses, it ensures that the information provided is consistent and accurate. This additional layer of verification is crucial for maintaining the integrity of the data insights.
- Python Script for Comparison: A dedicated Python script analyzes the outputs from both the primary and secondary agents. By comparing the two sets of answers, it validates the accuracy and quality of the responses. This script also builds heuristics around the findings, helping to continuously improve the assistant’s performance and reliability over time.
In addition to these components, we store chat history as part of our essential “human-in-the-loop” framework. This allows our analysts and data teams to monitor interactions and fine-tune the models as needed. By leveraging this multi-agent architecture, we ensure that our AI-Powered Data Assistants deliver high-quality insights while also providing the flexibility and adaptability necessary for ongoing improvement.
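Storing chat history for human review can be as simple as appending each turn to a JSON Lines file (or, in our case, a Snowflake table). The sketch below assumes a hypothetical record shape; the field names are illustrative:

```python
import json
import time


def log_turn(path, user, prompt, sql, answer_rows):
    """Append one chat turn to a JSON Lines file for human-in-the-loop review."""
    record = {
        "ts": time.time(),          # when the turn happened
        "user": user,               # who asked
        "prompt": prompt,           # the natural-language question
        "sql": sql,                 # the generated SQL, for auditing
        "rows_returned": len(answer_rows),
    }
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
```

Analysts can then scan these records to spot recurring failure patterns and feed corrections back into the models.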
Challenges
Of course, it’s not all sunshine and rainbows. As we dive into this initiative, we are navigating several challenges, particularly in applying Retrieval Augmented Generation (RAG) and semantic modeling techniques to get the desired results. Achieving consistent results from LLMs is a work in progress, and we’re actively learning how to optimize these methods.
One of our primary challenges is addressing the issues of hallucinations — instances where the AI generates incorrect or nonsensical information — and ensuring accuracy. Accuracy is our biggest focus because we want our business users to trust the data coming from these assistants. To build this trust, we are training the models to refrain from answering questions when they are uncertain, effectively prioritizing reliability over volume.
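One way to encode "refrain from answering when uncertain" is an abstention guard that only returns a result when the primary and validation agents agree. This is a minimal sketch of that idea, not our exact implementation:

```python
def guarded_answer(primary_rows, secondary_rows, min_rows=1):
    """Return an answer only when both agents agree; otherwise abstain.

    Prioritizes reliability over volume: disagreement between the two
    agents is treated as a possible sign of hallucination.
    """
    if primary_rows != secondary_rows:
        return None  # abstain rather than risk a wrong answer
    if len(primary_rows) < min_rows:
        return None  # nothing trustworthy to report
    return primary_rows
```

When the guard returns `None`, the assistant tells the user it cannot answer confidently, rather than presenting an unverified result.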
In the coming articles, I will dive deeper into the technical aspects of these challenges, the strategies we’re implementing to mitigate them, and how we’re continuously evolving our approach to enhance the performance and reliability of our AI-Powered Data Assistants.
Resources