Measuring the Effectiveness of Consumer Assistance Chatbots in the Philippines
The Challenge
As digital financial services expand rapidly around the world, financial regulators are searching for policy levers to monitor consumer protection risks effectively. As customer interactions shift to automated channels, chatbots have emerged as both a customer service tool for financial institutions and a potential regulatory monitoring mechanism. In the Philippines, for instance, both the Bangko Sentral ng Pilipinas (BSP) and financial service providers have increasingly deployed AI-driven chatbots to handle customer inquiries and complaint resolution. Chatbot usage has grown significantly elsewhere as well: in the United States, roughly 37 percent of consumers engaged with bank chatbots in 2022, a trend likely driven by cost savings for financial institutions.1
However, despite this growth in usage, it remains unclear how effective chatbots are at providing accurate and helpful responses. Existing IPA evidence from the Philippines suggests that chatbots can provide faster, real-time support for basic inquiries but are less effective at resolving complex issues. Additionally, chatbots increasingly rely on generative AI technology, which introduces new risks such as "hallucinations" that can lead to consumers receiving incorrect information. Poorly designed chatbots may fail to address consumer concerns, diminishing trust in financial institutions. Can innovative monitoring tools help regulators assess chatbot performance and identify areas for improvement?
The Research
IPA researchers are conducting a pilot study to monitor the performance of consumer assistance chatbots used by the BSP and financial institutions. Researchers will design and deploy an automated auditing tool that uses robotic process automation and generative AI to simulate realistic conversations with chatbots, calibrated to different scenarios and consumer personas. This will allow the tool to test not only simple informational queries but also the more complex problems that consumers commonly raise when seeking redress.
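To make the idea of auditing across scenarios and personas concrete, the following is a minimal sketch in Python, not the researchers' actual tool. Everything in it is an assumption for illustration: the `Persona` and `Scenario` structures, the keyword-based pass check, and the `stub_chatbot` function standing in for a real chatbot. The tool described above would generate messages with generative AI and drive real chat interfaces through robotic process automation; this sketch substitutes fixed message templates and a local stub.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Persona:
    """Hypothetical simulated consumer; `style` is a prefix shaping the message tone."""
    name: str
    style: str


@dataclass(frozen=True)
class Scenario:
    """Hypothetical test case; `required_keywords` is an assumed proxy for a helpful reply."""
    name: str
    opening_message: str
    required_keywords: Tuple[str, ...]


@dataclass
class AuditResult:
    persona: str
    scenario: str
    reply: str
    passed: bool


def render_message(persona: Persona, scenario: Scenario) -> str:
    # Fixed template; a real tool would use generative AI to vary the phrasing.
    return f"{persona.style} {scenario.opening_message}".strip()


def run_audit(chatbot: Callable[[str], str],
              personas: List[Persona],
              scenarios: List[Scenario]) -> List[AuditResult]:
    """Send one message per persona-scenario pair and score each reply."""
    results = []
    for persona, scenario in product(personas, scenarios):
        reply = chatbot(render_message(persona, scenario))
        passed = all(k.lower() in reply.lower() for k in scenario.required_keywords)
        results.append(AuditResult(persona.name, scenario.name, reply, passed))
    return results


def stub_chatbot(message: str) -> str:
    # Stand-in for a real chatbot: handles a simple query, fails on a complaint.
    if "fee" in message.lower():
        return "Our transfer fee is listed in the fee schedule on our website."
    return "Sorry, I did not understand. Please rephrase."


PERSONAS = [Persona("formal", "Good day."), Persona("casual", "hi,")]
SCENARIOS = [
    Scenario("fee_inquiry", "What is the fee for a bank transfer?", ("fee",)),
    Scenario("unauthorized_transaction",
             "I see a transaction I did not make. How do I file a complaint?",
             ("complaint",)),
]

results = run_audit(stub_chatbot, PERSONAS, SCENARIOS)
for r in results:
    print(r.persona, r.scenario, "PASS" if r.passed else "FAIL")
```

In this toy setup the stub answers the simple fee question but fails the redress scenario for both personas, which mirrors the gap between basic and complex queries that the audit is meant to surface. A keyword check is a deliberately crude scoring rule; a real audit would need a more careful assessment of accuracy and helpfulness.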
Once developed, the system will enable regulators to quickly and repeatedly assess chatbot quality in a scalable way. Because deployment costs are low, the tool can generate high-frequency monitoring data that helps identify where chatbots are failing and how performance changes over time. This capability is particularly important as institutions transition from simpler rules-based chatbots to more advanced generative AI models, which offer greater functionality but also carry heightened risks such as hallucinated responses and bias or discrimination. Continuous auditing can help regulators stay ahead of these emerging risks and target improvements where they matter most.
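As a rough illustration of how repeated audits could become monitoring data, the sketch below computes failure rates by audit date and scenario. The record layout, the `failure_rates` helper, and the sample records are all hypothetical placeholders chosen for this example; they are not data or results from the study.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical record layout: (audit_date, scenario, passed)
Record = Tuple[str, str, bool]


def failure_rates(records: Iterable[Record]) -> Dict[Tuple[str, str], float]:
    """Return the share of failed audit conversations per (date, scenario)."""
    totals = defaultdict(int)
    failures = defaultdict(int)
    for audit_date, scenario, passed in records:
        key = (audit_date, scenario)
        totals[key] += 1
        if not passed:
            failures[key] += 1
    return {key: failures[key] / totals[key] for key in totals}


# Made-up placeholder records for illustration only.
SAMPLE = [
    ("2025-01-06", "fee_inquiry", True),
    ("2025-01-06", "complaint", False),
    ("2025-01-06", "complaint", False),
    ("2025-01-13", "fee_inquiry", True),
    ("2025-01-13", "complaint", True),
    ("2025-01-13", "complaint", False),
]

rates = failure_rates(SAMPLE)
for (audit_date, scenario), rate in sorted(rates.items()):
    print(audit_date, scenario, f"{rate:.0%}")
```

Grouping by both date and scenario is what lets a regulator see where a chatbot is failing and whether that changes between audit rounds, for example after an institution switches from a rules-based chatbot to a generative AI model.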
Results
Results will be available in 2026.
Sources
1. CFPB, “Chatbots in consumer finance: Issue spotlight,” June 2023, https://files.consumerfinance.gov/f/documents/cfpb_chatbot-issue-spotlight_2023-06.pdf