Data quality is vital for CMOs who need to generate insights, with 89% of buyers agreeing it’s their number one priority (source: Greenbook, 2023 GRIT Insights Practice Report).
A recent industry quality pledge is gaining momentum as more buyers, sellers, and providers understand that this issue must be fixed.
As Jane Frost, CEO of the Market Research Society, says: “Fraudulent activity is becoming increasingly sophisticated, particularly in online research. It poses a significant risk to our sector’s future.”
Data quality should be a hygiene factor that insights buyers can rely on, but rampant fraud has persisted for years. And it’s a strangely undiscussed topic across the industry. Panel fraud is looking like the ad fraud or click farm of this decade – and is becoming industrialised, fast.
What’s worse, as fraud rates increase, so do the distortions in the resulting data.
In this piece, we reveal what’s going on with fraud, and how Kantar can solve it with AI and other advanced solutions.
Globally, there are three big industry challenges that affect panels:
1. The fight for eyeballs - how do we compete for panellists’ precious time?
2. Increasing data privacy compliance requirements: GDPR is different from CCPA, for example.
3. Increasing levels of online fraud. ‘Reconciliation rates’ - the percentage of samples that are rejected for being low quality - have increased by around 300% over the last three years, and clients are rejecting up to 40% of data post field.
Panel owners must deal with each of these three factors intelligently and strategically.
1. Fight for eyeballs
This starts with how we treat panellists: not as a commodity but as a precious resource. We constantly look at ways to refine how we ask our questions, shorten the length of interview (LoI) and increase gamification. We know our panellists as people by responding to their questions and treating them well. We match each panellist to surveys via our unique survey matching algorithm, so the right people take the right surveys, at the right pace. This helps reduce dropouts and screenouts and translates to 175% more completed surveys than the industry average.
By combining respectful treatment of respondents with advanced panel technology, we find our panellists are happy and engaged. They rate our app 4.2 on Trustpilot and leave positive reviews such as: “The energy is positive, and I have learnt so much from the online surveys while my bank account keeps smiling too!”
2. Increasing data privacy compliance requirements
Kantar takes a leading role in industry discussions and working groups (for example, ESOMAR). We also have an in-house specialist team who constantly monitor privacy and consent regulations, and ensure we have the right technical solutions for capturing, storing and deleting data.
In China, for example, we have a specific PIPL-compliant sample management platform for CAC-approved data collection, offering a series of market-specific optimisations. It’s fully contained within Chinese cyberspace and grants programmatic access to our wholly owned WeChat mobile panel of 1.5m harder-to-reach people. We also have multiple layers of fraud prevention and quality checks that ensure each WeChat account links to a real and unique bank account. Hashed IDs and survey links are secured with MD5 hashing and a Wave Secret, to mitigate ghost completes and fraudulent responses by hackers.
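As a rough illustration of how a hashed, secret-keyed survey link can help block ghost completes, the sketch below signs a completion redirect and checks the signature when the respondent returns. It is a generic pattern, not Kantar’s implementation: the function names, URL, query parameters and the use of HMAC are all assumptions, and MD5 appears only because the text names it (a new design would more likely use SHA-256).

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical per-wave secret shared by the panel platform and survey host.
WAVE_SECRET = b"example-wave-secret"


def sign(respondent_hash: str, wave_id: str, secret: bytes = WAVE_SECRET) -> str:
    """Return a keyed MD5 digest over the respondent hash and wave id."""
    message = f"{respondent_hash}|{wave_id}".encode("utf-8")
    return hmac.new(secret, message, hashlib.md5).hexdigest()


def build_complete_link(base_url: str, respondent_hash: str, wave_id: str) -> str:
    """Build the redirect a respondent follows after finishing a survey."""
    params = {
        "rid": respondent_hash,
        "wave": wave_id,
        "sig": sign(respondent_hash, wave_id),
    }
    return f"{base_url}?{urlencode(params)}"


def is_genuine_complete(link: str, secret: bytes = WAVE_SECRET) -> bool:
    """Accept a complete only if its signature matches the link's contents."""
    query = parse_qs(urlparse(link).query)
    try:
        rid, wave, sig = query["rid"][0], query["wave"][0], query["sig"][0]
    except KeyError:
        return False  # a required parameter is missing
    expected = sign(rid, wave, secret)
    # Constant-time comparison avoids leaking how much of the digest matched.
    return hmac.compare_digest(sig.encode("utf-8"), expected.encode("utf-8"))
```

A redirect that was forged or edited without knowledge of the secret fails the check, so it can be discarded before it is counted as a complete.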
3. Increasing levels of online fraud
Over two-thirds of data quality flags (69%) are attributed to different types of fraud. That 69% breaks down as 41% from international hackers, 13% from known bots, 7% from ghost completes (where a respondent appears to have completed a survey but no data is collected because they set up redirected links), and 8% from duplicates (where a respondent completes a survey multiple times, usually after setting up many fraudulent accounts pretending to be various demographics).
To ensure the highest data quality, we have segmented fraud into three types:
• Disengaged panellists: they multi-task or straightline their way through surveys, so the accuracy of their answers is in question. The impact on data integrity is low to moderate. These panellists need guidance and behaviour monitoring. Excluding them from certain studies may be necessary.
• Dishonest panellists: they lie about who they are and complete more surveys to earn rewards faster. The impact on data integrity is moderate to high.
• Fraudulent panellists: they act on their own, or in a group, to hack surveys and earn rewards in bulk – the new click farms, if you like. This is serious fraud at volume, with a high impact on data integrity.
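To make the disengaged case concrete, here is a minimal sketch of one common way to spot straightlining in a grid question: measure the longest run of identical consecutive answers. It is a simple heuristic for illustration only, not a description of Kantar’s tooling; the function names and the 0.8 threshold are assumptions.

```python
from typing import Sequence


def longest_identical_run(answers: Sequence[int]) -> int:
    """Length of the longest run of consecutive identical answers."""
    longest = current = 0
    previous = None
    for answer in answers:
        current = current + 1 if answer == previous else 1
        longest = max(longest, current)
        previous = answer
    return longest


def is_straightlining(answers: Sequence[int], max_run_share: float = 0.8) -> bool:
    """Flag a grid question when one repeated answer dominates it.

    max_run_share is a hypothetical threshold, not an industry standard.
    """
    if len(answers) < 4:  # too few items to judge fairly
        return False
    return longest_identical_run(answers) / len(answers) >= max_run_share
```

A flag like this would typically feed guidance or monitoring rather than automatic exclusion, since a genuine respondent can legitimately give the same answer to several items.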
What is Kantar doing to combat each of these types of fraud? And how are we using market-leading AI/GenAI tools to combat them?
• We promote good survey design: Survey quality hinges on design, length, and user experience. Even the most engaged participants can lose interest if these factors aren’t considered.
• We prevent disengaged mistakes: Some panellists give inconsistent answers due to misunderstanding, and some aren’t who they claim to be, but not all flagged issues stem from active deception. Some are innocent mistakes, and not all flagged actions harm data integrity. We want to be inclusive of all genuine participants. So, we give panellists training, and a chance to improve their behaviour, if needed.
• We define quality: It’s subjective, so we use objective metrics. Recognising various levels of poor quality and different contributing factors is key. Kantar’s Profiles division amplifies its 20+ years of deep panel expertise with tech and AI to do this in real time via Qubed AI, its proprietary anti-fraud tool. Qubed AI is powered by five deep neural networks (in other words, advanced machine learning) and is trained daily on more than 60 million events. It processes over 300 features for each survey session to automatically score it and return a verdict and suggested action on whether or not a panellist is fraudulent, within milliseconds – something a human (and other anti-fraud tech) simply could not do.
• We use GenAI with Qubed Open-End Validation: our proprietary ChatGPT-based open-ended evaluator scores open-end responses from panellists across multiple dimensions. Factors we detect include relevance to the question being asked, originality, completeness, language, plagiarised answers, use of PII, slang, use of abbreviations, as well as profanity, racism, gibberish, and ChatGPT-generated answers. For more on how Kantar’s Qubed Open-End Validation fights fraud, see this previous piece we published called Transforming Panels: How is Kantar using LLMs to improve panel responses?
• We have introduced Qubed Facial Verification: Kantar’s latest step forward in fighting survey fraud has been the integration of Realeyes Verify into our Qubed AI. Verify is a lightweight facial verification technology trained on a unique webcam dataset of 17m consenting survey sessions. It lets us quickly identify when bad actors attempt to join our Premium Panels.
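One of the open-end dimensions listed above, copied or plagiarised answers, can be approximated without any AI at all. The sketch below groups respondents whose open-end text is identical after basic normalisation. It is a deliberately simple stand-in covering a single dimension, not the LLM-based evaluator described above; the function names and the `min_words` cut-off are assumptions.

```python
import re
from collections import defaultdict


def normalise(text: str) -> str:
    """Lowercase, drop punctuation and collapse whitespace."""
    text = re.sub(r"[^\w\s]", "", text.lower())
    return " ".join(text.split())


def find_copied_answers(
    responses: dict[str, str], min_words: int = 4
) -> dict[str, list[str]]:
    """Map each repeated answer to the respondent IDs that submitted it.

    Short answers such as "yes" are ignored because honest respondents
    often give them independently. min_words is a hypothetical cut-off.
    """
    by_answer = defaultdict(list)
    for respondent_id, text in responses.items():
        key = normalise(text)
        if len(key.split()) >= min_words:
            by_answer[key].append(respondent_id)
    # Keep only answers shared by more than one respondent.
    return {answer: ids for answer, ids in by_answer.items() if len(ids) > 1}
```

Exact matching like this only catches verbatim copies. Paraphrased or machine-generated answers need the kind of semantic scoring that a language model provides.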
CMOs and Insights leaders need to understand how their panel partners are prioritising data quality, and to be assured that their panel partners are providing timely and accurate data, unsullied by fraudulent responses.
As the entire industry embraces quality through the Quality Pledge and other means, Kantar is well positioned to continue its leadership role in eliminating fraud and returning greater confidence to the consumer data industry, through the intelligent use of AI.