Anomaly Detection in Survey Data: Spotting the Unexpected in E-Commerce

Enia Xhakaj, Data Scientist, Fairing
Reliable survey data depends on flawless delivery. We rebuilt our anomaly detection system to ensure every survey appears on your Thank You and Order Confirmation pages, where multiple scripts can interfere with survey delivery. With Fairing's industry-leading response rates of over 50%, each missed survey is a lost opportunity to capture valuable zero-party data. Our new system quickly identifies when surveys stop appearing, letting us address technical issues before they impact your data collection.
The Objective: Identifying Unusual Activity
Our goal is to detect unexpected trends in survey views. By monitoring survey engagement, we can spot abrupt changes, such as a sharp drop in views, that may signal a problem. This proactive approach allows us to address issues before they escalate, ensuring a smooth customer experience.
The Analysis: Preparing the Data
To effectively detect anomalies, we first need to prepare and analyze the data. Our approach operates at two levels:
Website-Level Views: This provides a big-picture understanding of survey activity across an entire website. It’s useful for identifying overall trends and patterns.
Question-Level Views: This zooms in on specific survey questions, particularly high-priority ones (e.g., “rank 0” questions). It’s ideal for digging deeper into areas that generate significant engagement.
To ensure a complete and accurate analysis, we address two key challenges:
Filling the Gaps: If a minute has no survey views, we add a placeholder with “0 views” to maintain a continuous timeline.
Grouping by Time: To reduce noise, we group data into intervals (e.g., 30 minutes) and summarize activity within each period.
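As a rough illustration of these two preparation steps, the sketch below fills missing minutes with zero and sums views into fixed intervals. The function name `fill_and_bucket`, the dictionary input format, and the sample timestamps are assumptions made for the example, not a description of Fairing's pipeline.

```python
from datetime import datetime, timedelta


def fill_and_bucket(minute_counts, start, end, bucket_minutes=30):
    """Return sorted (bucket_start, total_views) pairs covering [start, end).

    minute_counts maps a minute-resolution datetime to the views recorded
    in that minute; minutes missing from the dict count as 0 views.
    """
    bucket = timedelta(minutes=bucket_minutes)
    totals = {}
    t = start
    while t < end:
        # Filling the gaps: a minute with no record contributes an explicit 0.
        views = minute_counts.get(t, 0)
        # Grouping by time: floor the minute to the start of its interval.
        bucket_start = start + ((t - start) // bucket) * bucket
        totals[bucket_start] = totals.get(bucket_start, 0) + views
        t += timedelta(minutes=1)
    return sorted(totals.items())


# Hypothetical sample: three minutes with views inside a two-hour window.
start = datetime(2024, 1, 1, 9, 0)
minute_counts = {
    datetime(2024, 1, 1, 9, 5): 3,
    datetime(2024, 1, 1, 9, 40): 2,
    datetime(2024, 1, 1, 10, 10): 4,
}
buckets = fill_and_bucket(minute_counts, start, start + timedelta(hours=2))
for bucket_start, views in buckets:
    print(bucket_start.strftime("%H:%M"), views)
```

The last interval (10:30) has no recorded views, yet it still appears in the output with a total of 0. Keeping such empty intervals is what lets a later check notice that views have stopped.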
Detecting Anomalies: What Counts as Unusual?
An anomaly occurs when survey activity deviates significantly from expected patterns. For example, a sudden drop in views during a typically busy period could indicate a technical issue, while a spike might suggest a product drop or a problem with the survey itself.
Here’s how we identify anomalies:
Detect Zero-View Periods: If a website or question usually has more than five views per interval but suddenly has zero, we flag it as unusual. This detector also activates for new websites with insufficient historical data.
Account for Trends: Survey data often follows natural trends, such as increased activity during holidays or drops after major sales events like Black Friday. To focus on unexpected changes, we remove these trends using a moving average.
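The zero-view check from the list above can be sketched in a few lines. This version assumes that "usually" means the median of past intervals; the function name and the handling of an empty history are likewise choices made for the example. Because it needs only a handful of past intervals, a check of this kind can run on websites that lack the longer baseline a statistical comparison requires.

```python
from statistics import median


def is_zero_view_anomaly(history, current, min_typical=5):
    """Flag an interval with zero views when past intervals were usually busy.

    history: view counts for earlier intervals (same interval length).
    current: view count for the interval being checked.
    min_typical: the "more than five views" rule from the text.
    """
    if current != 0:
        return False
    if not history:
        # Assumption: with no history at all there is no "usual" level yet.
        return False
    # Assumption: "usually" is interpreted as the median of past intervals.
    return median(history) > min_typical


print(is_zero_view_anomaly([8, 12, 9, 11], 0))  # busy site goes silent
print(is_zero_view_anomaly([0, 1, 2, 0], 0))    # quiet site stays quiet
```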
How We Remove Trends
Calculate the trend using a moving average.
Subtract the trend from the original data to create a “detrended” version.
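The two steps above can be sketched as follows. The article does not say which kind of moving average is used, so this example assumes a trailing window (each point is averaged with the points before it); the window length is a hypothetical parameter.

```python
def moving_average(values, window):
    """Trailing moving average over at most `window` points.

    The first few points are averaged over however many values exist so far.
    """
    trend = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            # Drop the value that just slid out of the window.
            running -= values[i - window]
        trend.append(running / min(i + 1, window))
    return trend


def detrend(values, window):
    """Subtract the moving-average trend from the original series."""
    trend = moving_average(values, window)
    return [v - t for v, t in zip(values, trend)]


# Hypothetical series with a steady upward trend.
views = [10, 12, 14, 16]
print(moving_average(views, window=2))
print(detrend(views, window=2))
```

After detrending, the steadily rising series becomes nearly flat, so a later comparison reacts to departures from the trend rather than to the trend itself.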
Identifying Anomalies: Comparing Today to the Past
To determine whether today’s activity is unusual, we compare it to historical patterns using two statistical methods:
Median Absolute Deviation (MAD): This method measures how far today’s value sits from the historical median, relative to the typical distance of past values from that median. Because medians are barely affected by extreme values, it stays reliable even when the history contains outliers.
Standard Deviation (STD): This measures how far today’s activity deviates from the historical average, in units of the historical standard deviation. It’s particularly useful for identifying large deviations.
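A minimal sketch of both statistics, using only Python's standard library. The function names are hypothetical, the sample standard deviation is an assumption (the population version would work similarly), and the treatment of a perfectly flat history is a choice made for the example.

```python
import math
from statistics import mean, median, stdev


def _scaled(deviation, spread):
    """Express a deviation in units of spread, guarding against zero spread."""
    if spread == 0:
        # Flat history: any change at all is infinitely many "spreads" away.
        return 0.0 if deviation == 0 else math.copysign(math.inf, deviation)
    return deviation / spread


def mad_score(history, today):
    """Distance from the historical median, in units of the MAD."""
    med = median(history)
    mad = median(abs(x - med) for x in history)
    return _scaled(today - med, mad)


def std_score(history, today):
    """Distance from the historical mean, in sample standard deviations."""
    return _scaled(today - mean(history), stdev(history))


# Hypothetical history of (detrended) interval views, then a very low value.
history = [10, 12, 11, 13, 12, 11, 12]
print(mad_score(history, 2))
print(std_score(history, 2))
```

Negative scores mean today is below the historical center; the further below zero, the more unusual the drop.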
Flagging Anomalies: Taking Action
If today’s activity falls below the thresholds for both MAD and STD, it’s flagged as an anomaly. Requiring both methods to agree reduces false alarms and helps teams act quickly to investigate and resolve unexpected changes.
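The flagging rule can be expressed as a single condition. In this sketch the inputs are assumed to be scores of the form "distance from the historical center divided by spread", and the threshold values of -3.0 are placeholders; the article does not state the thresholds actually used.

```python
def is_low_anomaly(mad_score, std_score, mad_threshold=-3.0, std_threshold=-3.0):
    """Flag only when BOTH scores fall below their (hypothetical) thresholds."""
    return mad_score < mad_threshold and std_score < std_threshold


# Hypothetical (mad_score, std_score) pairs for three intervals.
examples = {
    "both agree": (-10.0, -9.8),
    "only MAD fires": (-4.0, -1.0),
    "spike, not a drop": (4.0, 5.0),
}
for label, (m, s) in examples.items():
    print(label, is_low_anomaly(m, s))
```

Only the first case is flagged. A drop seen by one method alone is not reported, and because the rule looks only below the thresholds, a spike is not reported either.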
Conclusion
Survey data is a goldmine of insights, but only if we know how to interpret it. By leveraging anomaly detection techniques, we can quickly spot and respond to unusual activity, ensuring a seamless customer experience and driving long-term success. Whether it’s a sudden drop to zero views or a slower, unexpected lull in activity, catching the anomaly early keeps your data collection on track.