Harnessing AI for Good Requires a Systems Approach

The promise of Artificial Intelligence to transform development outcomes has captured attention across the global development sector. From predicting disease outbreaks to optimizing agricultural yields, the potential applications are endless. But if decades of technology interventions in low- and middle-income countries have taught us anything, it's that potential alone doesn't translate into lasting impact. Whether innovations succeed or fail depends on how we integrate new tools into existing systems and how those tools support organizations working to reduce poverty.

At Innovations for Poverty Action (IPA), we've spent more than 20 years testing what works in poverty alleviation. One pattern has become clear: one-off technological innovations, however promising, rarely deliver sustained change on their own. The graveyard of development technology is littered with pilot projects that showed initial promise but never scaled, mobile apps that launched with fanfare but sat unused, and digital platforms that collapsed once funding dried up.

IPA's own research illustrates this. We evaluated Peru's $180 million investment in deploying laptops to 318 primary schools and found null or negative effects on learning outcomes. We tested ICT training for over 1,100 Kenyan youth and saw no increase in employment or earnings when training wasn't connected to job market systems. We measured mobile money services across three countries and documented transaction failure rates as high as 39 percent, despite functional technology. The pattern is consistent: potential alone doesn't create impact.

The difference between failure and progress isn't the sophistication of the technology. It's whether the intervention is built into a system strong enough to sustain it.

This is why IPA's approach to AI focuses on strengthening systems. We generate evidence at each stage of an intervention's lifecycle: exploration, pilot, test, transition, and scale. Lasting improvements require working within government, co-designing solutions, building local capacity, improving data infrastructure, and ensuring new tools integrate into existing decision-making processes. AI that ignores these fundamentals may look impressive, but it won't deliver results that last.

Matching algorithms with action

Consider education systems across developing countries. In the Philippines, approximately 40 percent of students who enter Grade 1 leave school by Grade 10. At the tertiary level, the national dropout rate is 39 percent, with some regions reaching 93 percent. These numbers represent millions of young people whose futures are constrained by incomplete education.

Through our Embedded Evidence Lab program, IPA supports the Philippines' Department of Education (DepEd) in using machine learning to predict which students are at highest risk of dropping out. But this isn't simply about deploying an algorithm. The project addresses weaknesses in the education system's data infrastructure, improves enrollment record-keeping, and trains government staff to interpret and act on model outputs. We're co-developing systems with DepEd that the government will own and operate long-term, not building external tools that require ongoing IPA support.

A prediction is only valuable if it enables action. Without strengthening the system's capacity to respond through better data, trained personnel, and integrated workflows, even perfect predictions accomplish nothing. Our two-decade partnership with DepEd, supporting their national "Research O'clock" forum and strengthening M&E systems, exemplifies this approach. The goal isn't an impressive technical demonstration. It's building the government's capacity to reduce dropout rates through proactive support for vulnerable students.
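To make the idea concrete, here is a minimal sketch of a dropout-risk model of the kind described above. Everything in it is illustrative: the features (attendance rate, grade average, school transfers), the synthetic records, and the thresholds are assumptions for the example, not DepEd's actual data schema or model.

```python
import random
from sklearn.linear_model import LogisticRegression

random.seed(0)

def make_record():
    """Generate one synthetic student record (features, dropout label)."""
    attendance = random.uniform(0.5, 1.0)   # share of days attended
    grades = random.uniform(60, 100)        # average grade
    transfers = random.randint(0, 3)        # number of school transfers
    # Synthetic ground truth: low attendance/grades and transfers raise risk
    risk = (1 - attendance) * 2 + (100 - grades) / 100 + transfers * 0.2
    dropped_out = 1 if risk + random.gauss(0, 0.3) > 1.2 else 0
    return [attendance, grades, transfers], dropped_out

data = [make_record() for _ in range(500)]
X = [features for features, _ in data]
y = [label for _, label in data]

model = LogisticRegression(max_iter=1000).fit(X, y)

# Rank students by predicted risk so staff can prioritize outreach
risks = model.predict_proba(X)[:, 1]
highest_risk = sorted(range(len(risks)), key=lambda i: risks[i], reverse=True)[:10]
print(f"Flagged {len(highest_risk)} students for follow-up")
```

The model itself is the easy part; the ranked list at the end only matters if someone is trained and resourced to follow up with the students it flags, which is the systems work the article describes.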

Bringing better information and stronger infrastructure together

The same thinking guides our consumer protection work in digital financial services. Across low- and middle-income countries, mobile money, digital credit, and microinsurance are expanding access for previously excluded populations. But this promise depends on consumers understanding the true costs and risks of different products, information that's often scattered, inconsistent, or deliberately obscured.

IPA's Consumer Protection Research Initiative uses AI to scrape and analyze data from financial institutions across 18 countries, transforming fragmented pricing information into structured, comparable data. We've published this data through interactive Digital Financial Services Pricing visualizations. But the AI is only one component.

In the Philippines, we've partnered with Bangko Sentral ng Pilipinas (the Central Bank), using AI tools for auditing transparency and redress while co-developing reporting templates that will become standard regulatory requirements. In Nigeria, IPA's Evidence Lab with the Central Bank uses social media scraping to monitor consumer complaints and market conduct. In Kenya, our work with the Competition Authority applied machine learning to large-scale market data, informing consumer protection policies affecting tens of millions. In Uganda, the Communications Commission used predictive modeling and natural language processing to analyze complaint patterns, and has since institutionalized our monitoring approach into their ongoing Consumer Affairs reports. We're not handing recommendations to governments. We're building regulatory infrastructure together that partners will own and operate after our involvement ends.
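The complaint-monitoring work described above can be pictured with a toy example. The sketch below tags complaints by simple keyword matching and tallies the categories; the complaint texts, category names, and keyword lists are all invented for illustration, and a production system would use far richer language models than string matching.

```python
from collections import Counter

# Invented example complaints, standing in for scraped social media posts
complaints = [
    "charged hidden fee on mobile money transfer",
    "loan repayment deducted twice from my account",
    "cannot reach customer service about failed transaction",
    "unexpected fee when cashing out",
]

# Hypothetical category keywords a regulator might track
categories = {
    "fees": ["fee", "charge", "charged"],
    "service": ["customer service", "support"],
    "transactions": ["failed", "deducted", "transfer"],
}

def categorize(text):
    """Return all categories whose keywords appear in the complaint."""
    matched = [name for name, keywords in categories.items()
               if any(kw in text for kw in keywords)]
    return matched or ["other"]

counts = Counter(cat for c in complaints for cat in categorize(c))
print(counts.most_common())  # e.g. transactions and fees dominate
```

Even this crude tally shows the shape of the output a regulator acts on: a running count of which problems consumers report most, feeding into reports like Uganda's Consumer Affairs monitoring.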

This work spans the full pathway from evidence to scale: AI that extracts pricing data, products co-developed with regulators, randomized trials testing whether transparency changes consumer decisions (such as our credit comparison study with Mexico's central bank), measured impact on financial inclusion, and regulatory systems that sustain consumer protection after IPA's role transitions.

IPA is exploring similar approaches to help governments make better use of their own research. We're working with partners across Africa and Latin America to create searchable knowledge systems of government research, data, and evaluation reports. Rather than leaving valuable evidence scattered across hard drives and filing cabinets, an AI-powered search system would let policymakers quickly find relevant research, identify patterns across studies, and draw on accumulated knowledge when making decisions.

This speaks to a broader challenge: the problem often isn't lack of evidence but inability to access it when decisions are made. Governments in low-resource settings conduct substantial research, but institutional memory is fragile, staff turnover is high, and finding the right evidence at the right moment is difficult. AI-powered knowledge systems can address this, but only if designed for government constraints, built through partnerships that transfer capacity to government teams, and integrated into existing workflows.
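A searchable evidence index can be sketched in a few lines. The version below builds an inverted index over document summaries and ranks results by query-term overlap; the document ids and texts are invented, and a real system would use semantic search rather than exact word matching.

```python
from collections import defaultdict

# Invented stand-ins for government research summaries
documents = {
    "eval-2019": "randomized evaluation of school feeding on attendance",
    "survey-2021": "household survey on mobile money adoption and fees",
    "report-2022": "dropout patterns in primary schools by region",
}

# Inverted index: term -> set of document ids containing it
index = defaultdict(set)
for doc_id, text in documents.items():
    for term in text.lower().split():
        index[term].add(doc_id)

def search(query):
    """Rank documents by how many query terms they contain."""
    scores = defaultdict(int)
    for term in query.lower().split():
        for doc_id in index.get(term, ()):
            scores[doc_id] += 1
    return sorted(scores, key=scores.get, reverse=True)

results = search("school dropout attendance")
print(results)  # studies mentioning the query terms rank first
```

The design point is the one the article makes: the hard part isn't the retrieval code, it's getting the scattered reports digitized, maintained, and wired into the moments when officials actually make decisions.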

Why AI must be grounded in evidence

As AI capabilities advance, development practitioners and policymakers face important choices about how to engage. The temptation to chase impressive demonstrations is strong. But experience suggests that sustainable impact requires patient work: co-design, strengthening data infrastructure, building local capacity, integrating tools into existing processes, and supporting governments to use evidence more effectively.

The development sector has made important advances in AI evaluation. The AI Evaluation Framework developed by the Center for Global Development, The Agency Fund, and J-PAL provides a four-level approach: assessing whether AI models work, whether products engage users, whether interventions change behavior, and whether development outcomes improve. For more than 20 years, IPA has contributed to this evidence base, and we continue to adapt our methods for AI-powered interventions. In Tanzania, for example, we're running a randomized evaluation testing whether female business owners benefit from AI-powered business advice delivered via interactive voice response.

But IPA's contribution goes beyond individual evaluations. We bring together sector expertise, research and policy teams, and country offices across Africa, Asia, and Latin America, enabling us to translate AI evaluations into locally relevant benchmarks. This allows us to build evaluation into systems: strengthening data-driven decision-making, institutional capacity, and infrastructure that help solutions scale beyond pilots.

IPA's Stage-Based Learning approach generates insights at each phase, from exploration through scale, measuring not just whether technology works but whether implementation systems can reliably produce outcomes. This methodology draws on two decades of evidence documenting what fails and what succeeds when built on strong foundations.

The organizations that successfully harness AI for development will treat technology as one component of systems strengthening, invest in capacity building alongside technical deployment, and measure success not by algorithmic performance but by whether governments become more effective and development outcomes more equitable.