What Retailers and Restaurants Must Know About AI Hallucination and Model Drift
Understanding AI Limitations: The Hallucination Phenomenon and Model Drift
Artificial intelligence, particularly large language models (LLMs) such as ChatGPT, Claude, or Gemini, has made significant strides in generating fluent, coherent text across a wide range of topics. However, these models are not without limitations. Two key challenges are their tendency to "hallucinate," producing plausible-sounding yet incorrect information, and "model drift," where the accuracy of predictions degrades over time.
Illustrating Hallucinations with Examples
Consider these statements:
- Chocolate is made from cocoa butter.
- Before I worked for Bluedot, I was an educator.
- Blue moons are blue.
The common thread among these statements is that they are all hallucinations produced by a large language model. In reality, chocolate is made from cocoa beans, with cocoa butter as just one possible ingredient in the recipe. While I do consider myself a lifelong learner, I was not an educator prior to joining Bluedot. And blue moons are not blue: a blue moon occurs when there is a second full moon in a given calendar month, or a third full moon in a season containing four full moons.
Real-World Observations of AI Hallucinations
AI hallucinations are a significant issue faced by the companies developing these technologies. After reading multiple research articles and discussing the issue with experts, it is clear that AI can produce incorrect answers even while public perception treats it as a source of truth. Output quality can range from highly accurate responses to complete nonsense or, worse, partially correct answers that mislead. It is imperative that those who offer AI-powered knowledge and those who consume it understand that hallucinations do occur, especially when context is missing.
What is a Hallucination?
In the context of LLMs, hallucinations are outputs that deviate from facts or logical consistency. There are several types of hallucinations, including:
- Sentence Contradiction: Occurs when the model generates a sentence that contradicts a previous sentence. For example, "The sky is blue today" followed by "The sky is green today."
- Prompt Contradiction: This happens when the generated sentence contradicts the input prompt. For instance, asking for a positive restaurant review and receiving, "The food was terrible and the service was rude."
- Factual Contradiction: Outright factual errors, such as stating, "CO2 is the chemical formula for water."
- Nonsensical or Irrelevant Information: Sometimes the model includes information that has no logical place in the context. Ask, "What are the key factors for a successful quick service restaurant store?" and you might get, "The key factors for a successful QSR store are the color of the walls and the number of clouds in the sky."
Why Do Hallucinations Happen?
Understanding why LLMs hallucinate involves exploring the complex mechanisms behind their output generation. Here are a few common causes:
- Data Quality: LLMs are trained on vast amounts of text data, which may contain errors, biases, or inconsistencies. Even reliable sources can have inaccuracies, and not all topics or domains are covered comprehensively in the training data.
- Generation Method: The methods used to generate text introduce their own biases and trade-offs. For instance, sampling may prioritize high-probability but generic words over low-probability but specific ones, much like broad match versus exact match keywords in search engine optimization (see the sketch after this list).
- Input Context: The information provided in the input prompt guides the model's output. For example, asking, "Who or what is man in the middle?" might yield "a type of cyberattack," "a song by ABBA," or "a song by the Bee Gees," depending on the context established earlier in the conversation.
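To make the generation-method trade-off concrete, here is a minimal Python sketch of temperature-scaled sampling. Everything in it is hypothetical: the candidate tokens and their logit scores are invented, not taken from any real model. It simply shows how a low temperature concentrates probability on the high-scoring, generic words, while a higher temperature gives specific, lower-probability words more of a chance.

```python
import numpy as np

# A minimal sketch of temperature-scaled sampling. The tokens and logit
# scores below are invented for illustration; they are not from any
# real model's vocabulary.
tokens = ["food", "menu", "croissant", "tagine"]
logits = np.array([4.0, 3.5, 1.0, 0.5])  # generic words score higher

def sampling_distribution(logits, temperature):
    """Softmax with temperature: low values concentrate probability on
    high-scoring (often generic) tokens; high values spread it toward
    low-scoring (often more specific) ones."""
    scaled = logits / temperature
    scaled = scaled - scaled.max()  # for numerical stability
    probs = np.exp(scaled)
    return probs / probs.sum()

for t in (0.5, 1.0, 1.5):
    dist = sampling_distribution(logits, t)
    summary = ", ".join(f"{tok}={p:.2f}" for tok, p in zip(tokens, dist))
    print(f"temperature={t}: {summary}")
```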
Understanding Machine Learning Model Drift
Fundamentals of Model Drift
In machine learning, we assume that the data and logic used for training mimic real-world behaviors and actions. This assumption is crucial for our predictions. Unfortunately, past behavior cannot always predict future behavior; as the saying goes, the only constant is change. So what happens when machine learning models cannot keep pace with these changes? The training data becomes obsolete, leading to a phenomenon known as model drift.
At its most fundamental level, machine learning is a set of algorithms that predict or take future action based on observations of past behavior. This works best when the same action occurs thousands (if not millions) of times in a given environment. Take the act of posting a check to an account: machine learning can help automate the process because it learns to read the check amount and credit that amount to the right account holder. Model drift would be minimal in a straightforward scenario such as this. Real-world examples, however, are often more contextual, complex, and nuanced. Models predicting trend-driven behavior tend to drift faster than those handling stable, well-defined tasks.
Defining Model Drift
Machine learning model drift means that, over time, a model's predictions become less accurate. This drift can carry significant business and opportunity costs.
For instance, consider a user searching for "summer dresses" on an eCommerce website. The AI-powered product search system was trained on historical data, including a previous season's trends, customer preferences, and popular styles. Over time, however, fashion trends change. As a result, searches for "summer dresses" now surface outdated or less relevant options, leading to missed opportunities and, ultimately, lower revenue.
Some examples of different types of drift that might impact retailers include:
- Changing Customer Preference Drift: Historical data is no longer representative of evolving trends, as in the "summer dresses" example.
- Concept Drift Due to Competitors: Changes in the underlying relationship between input data and the predicted outcome. For example, a competitor offers significant discounts on a group of products, leading to a decrease in average basket size (ABS). This can result in inaccurate sales forecasting if the change is not properly categorized as a one-time event or an evergreen one.
- Feature Drift: The characteristics or factors that influence customer purchasing decisions may no longer be accurately captured by the model. For instance, suppose a retailer introduces a new loyalty program that significantly changes customer behavior by offering personalized discounts and rewards based on individual shopping habits. As a result, purchasing decisions may be driven more by loyalty program status and personalized offers than by age or location.
Causes and Detection of Drift
Model drift can be triggered by real changes in the data or by data integrity issues. Detecting drift involves monitoring performance and data distribution:
- Performance Monitoring: Continuously tracking and evaluating the performance of machine learning models in production. This can include continuous evaluation, real-time observability, metric tracking, drift detection techniques, and alerts and triggers, among others.
- Data Drift Monitoring: Comparing the distribution of incoming data against the training data using distribution comparisons, statistical tests, and proxy metrics (see the sketch after this list).
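As one concrete way to do distribution comparison, the sketch below applies a two-sample Kolmogorov-Smirnov test from SciPy to a single incoming feature against its training baseline. The basket-size figures are synthetic, and the 0.05 significance threshold is an assumption, chosen here only as a common convention.

```python
import numpy as np
from scipy.stats import ks_2samp

# A minimal sketch of data drift monitoring on a single feature
# (a hypothetical average basket size, in dollars). Both samples are
# synthetic; the 0.05 threshold is a common convention, not a rule.
rng = np.random.default_rng(42)
training_baseline = rng.normal(loc=38.0, scale=6.0, size=5_000)
incoming_window = rng.normal(loc=33.0, scale=6.0, size=1_000)  # mean has shifted

statistic, p_value = ks_2samp(training_baseline, incoming_window)
if p_value < 0.05:
    print(f"Drift suspected (KS={statistic:.3f}, p={p_value:.4g}); "
          "route to drift analytics and consider retraining.")
else:
    print("No significant distribution shift detected.")
```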
Furthermore, addressing model drift involves:
- Data Integrity Checks: Ensure data quality and correct any pipeline issues.
- Drift Analytics: Analyze data distribution changes and their impact on model performance.
- Model Retraining: Update or retrain models on new data to maintain accuracy, as sketched below.
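To tie detection and retraining together, here is a minimal scikit-learn sketch under invented assumptions: a hypothetical "trend" feature whose relationship to sales flips between the training window and a recent one (concept drift). Refitting on the recent window restores accuracy; a real pipeline would of course validate on held-out data before promoting the new model.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error

# A minimal retraining sketch. Features, weights, and data are invented:
# the relationship between the "trend" feature and sales flips between
# the training window and the recent window.
rng = np.random.default_rng(0)

def make_window(n, trend_weight):
    X = rng.normal(size=(n, 2))  # columns: [price_index, trend_score]
    y = 3.0 * X[:, 0] + trend_weight * X[:, 1] + rng.normal(scale=0.1, size=n)
    return X, y

X_old, y_old = make_window(2_000, trend_weight=1.0)   # training-era behavior
X_new, y_new = make_window(2_000, trend_weight=-2.0)  # preferences shifted

stale = LinearRegression().fit(X_old, y_old)
fresh = LinearRegression().fit(X_new, y_new)  # retrain on the recent window

# A production pipeline would evaluate on a held-out slice of the new data.
print(f"MAE before retraining: {mean_absolute_error(y_new, stale.predict(X_new)):.2f}")
print(f"MAE after retraining:  {mean_absolute_error(y_new, fresh.predict(X_new)):.2f}")
```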
The Impact of AI Hallucination and Drift in eCommerce and Food & Beverage
Product and Search Discovery
In the eCommerce sector, AI-driven product and search discovery systems are crucial for providing customers with relevant product recommendations and search results. Hallucinations and model drift can significantly impact these systems.
- Hallucinations: When a search or recommendation system hallucinates, it might suggest irrelevant or non-existent products. For instance, a customer searching for "organic shampoo" might be shown results for "organic food" or "synthetic shampoo." Similarly, an AI-driven food recommender asked for cold drinks might suggest a salad because it is categorized as a cold item, leading to a poor user experience and potentially lost sales.
- Model Drift: Over time, customer preferences and market trends change. If a product recommendation system is not regularly updated to reflect these changes, it may continue to recommend outdated or less relevant products.
Conversational AI
Conversational AI systems, such as chatbots and virtual assistants, play a vital role in customer service and engagement in retail. Hallucinations and model drift in these systems can lead to misunderstandings and poor customer experiences.
- Hallucinations: A chatbot might provide incorrect information about product availability, shipping times, or return policies. For example, a customer inquiring about the return policy for a specific item might be misinformed, leading to dissatisfaction and potential loss of customer trust.
- Model Drift: As conversational AI systems interact with users over time, the language and preferences of users can evolve. If the AI does not adapt to these changes, it might provide outdated responses or fail to understand new slang or terminology. This can result in a less effective customer service experience and decreased user satisfaction.
Embracing AI in Retail and Restaurants
AI is currently in a hype cycle, much like the early days of the internet, and large language models will sometimes hallucinate and drift, producing unexpected results. However, understanding the causes and employing strategies to minimize these occurrences allows business leaders to harness AI's true potential.
At Bluedot, now a Rezolve AI company, we maintain systems and fail-safes to manage hallucinations and model drift, ensuring continued, relevant accuracy. In the retail sector, addressing these challenges is crucial for maintaining effective product discovery, conversational AI, and customer service. Moreover, understanding the limitations and pitfalls of AI will help you navigate toward better results than your competition.
As powerful as AI is, humans need to be part of the feedback loop. Getting value from AI requires being intentional about your goals as well as understanding its limitations.
P.S. The image is courtesy of the AI image generator DALL-E. After multiple inputs and iterations, it's still a work in progress.