Leading with Language: the rise of LLMs in nonprofit AI initiatives
At the recent “AI for Global Development Accelerator” event co-hosted by The Agency Fund and Project Tech4Dev, I had the opportunity to hear about the AI initiatives of seven nonprofits across Education, Health, and Agriculture. The use-cases presented were at different stages of maturity and deployment, but the promise was clear: AI was helping these nonprofits reach many more users at lower cost, while enhancing user engagement and decision making. Later, I attempted a classification of these efforts by primary use-case, primary goal, and primary user. (See the table in the Annexure below.)
All the AI use-cases presented were, without exception, LLM-based.
This isn’t surprising, given the amount of attention and funding Gen AI-based solutions are attracting. (The shift has been so total that few bother to specify ‘Gen AI’ anymore—it’s all simply called ‘AI’ now.)
Wadhwani AI’s 2023 annual report is a useful source for analysing this trend towards Gen AI-based solutions. The report lists projects using machine learning techniques that predate ChatGPT – for example:
Cough Against TB: The solution analyzes cough sounds and symptoms to identify presumptive pulmonary TB. It involves audio analysis (using signal processing and machine learning classification) and symptom analysis.
Screening of Diabetic Retinopathy: Uses “computer vision” to detect and grade diabetic retinopathy from retinal images.
Detection Of Radiological Feature In Chest X-ray: Analyzes chest X-rays to detect signs of pulmonary TB, silicosis, and other lung diseases.
Such solutions rely on domain-specific datasets, often labeled by experts, and their specialised nature leads to higher accuracy. They can’t (yet?) be replaced by LLM-based approaches, despite advances like GPT-vision. The drawback, of course, is that they are more expensive to build and maintain.
But the annual report also lists some projects using “traditional” AI approaches where, going forward, LLMs would clearly add value. For example:
Media Disease Surveillance (MDS): The solution uses AI to scan media for unusual health alerts in real time to identify potential disease outbreaks. In 2023 it was primarily using machine learning techniques for information classification. The ability of LLMs to process context and nuances in language can lead to more accurate identification of potential health alerts compared to traditional keyword-based approaches.
Oral Reading Fluency (ORF): Assessing oral reading fluency from voice recordings was done using Automatic Speech Recognition (ASR) techniques. LLMs excel at understanding context and language patterns, which allows them to correct errors made by traditional ASR systems, especially in complex or ambiguous speech.
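The gap between keyword matching and contextual understanding can be made concrete. Below is a minimal sketch (the keyword list and news snippets are hypothetical illustrations, not drawn from MDS): a naive keyword filter flags every snippet that contains an outbreak term, so negations and figurative uses produce false positives that a context-aware model could catch.

```python
# Hypothetical sketch: a naive keyword-based health-alert filter,
# the kind of traditional approach that context-aware LLMs improve on.

OUTBREAK_KEYWORDS = {"cholera", "outbreak", "epidemic", "dengue"}

def keyword_flag(text: str) -> bool:
    """Flag a news snippet if it contains any outbreak keyword."""
    words = {w.strip(".,!?").lower() for w in text.split()}
    return not OUTBREAK_KEYWORDS.isdisjoint(words)

reports = [
    "Officials confirm a cholera outbreak in the coastal district.",
    "No new dengue cases were reported this week.",      # negation
    "The festival drew an epidemic of selfie-takers.",   # metaphor
]

flags = [keyword_flag(r) for r in reports]
# All three snippets are flagged, though only the first is a real alert --
# exactly the nuance (negation, metaphor) that keyword matching misses.
```

The same blind spot applies to the ASR example: a word-level match can be “correct” while the sentence-level meaning is wrong, which is where language-model post-correction helps.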
The report contains further examples of this nature. So already in 2023 it was becoming clear that LLMs would soon be taking centre-stage. Several factors explain this shift.
Lower entry barrier. Off-the-shelf Gen AI capabilities can now be adapted to nonprofit use-cases with relatively little initial R&D, lowering the barrier to entry.
Better access, broader reach/scale, higher efficiency, personalisation. LLM-based solutions offer a more personalised and localised interaction with end-users, often through familiar channels like SMS or WhatsApp. Unlike traditional AI approaches, which are often deployed to assist “specialists”, LLM-based solutions enable the “unskilled end-user” or the “semi-skilled intermediary” (highlighted in the table below). And they are highly cost-effective at large scale.
Flexibility and versatility. General-purpose LLMs enable rapid deployment across multiple domains and use-cases, giving them a Swiss Army knife character for a range of development challenges. This single-model approach also simplifies maintenance over time.
Tech momentum and funding. The sheer momentum behind Gen AI plays a significant role. With major players like OpenAI (a collaborator in the aforementioned accelerator) and Google supporting Gen AI-based nonprofit initiatives, the slant towards Gen AI use-cases is easy to understand.
Challenges with non-LLM approaches. These “traditional” approaches often require more effort, expertise, and infrastructure to implement. For instance, computer vision techniques or predictive models typically depend on high-quality, domain-specific datasets that are hard to come by in low-resource settings. Collecting, cleaning, and annotating this data is time-intensive and expensive—especially for nonprofits without large R&D budgets. Further, many non-LLM techniques demand closer collaboration with domain experts during development and deployment. Building a model to detect silicosis from X-rays or to anticipate drought patterns requires deep partnerships with health professionals or climate scientists. In contrast, LLMs, especially those already trained by large platforms, offer an enticing shortcut: a general-purpose tool that can be adapted quickly using prompt engineering or lightweight fine-tuning.
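The “adapted quickly using prompt engineering” route can be as simple as wrapping a general-purpose model in a domain-specific template. A hedged sketch (the template, field names, and example values are illustrative; no particular provider API is assumed):

```python
# Illustrative sketch of prompt-based adaptation: the domain knowledge
# lives in a reusable template, not in model weights or labeled datasets.

PROMPT_TEMPLATE = """You are an agricultural extension assistant.
Answer in simple language a smallholder farmer can act on.
Region: {region}
Crop: {crop}
Question: {question}
Answer:"""

def build_prompt(region: str, crop: str, question: str) -> str:
    """Fill the domain template; the result could be sent to any LLM API."""
    return PROMPT_TEMPLATE.format(region=region, crop=crop, question=question)

prompt = build_prompt("Bihar", "wheat", "When should I irrigate after sowing?")
```

Contrast this with the X-ray or drought examples above, where no amount of templating substitutes for labeled domain data and expert collaboration.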
It’s not that these traditional AI approaches are disappearing – they’re sometimes even being integrated alongside LLMs. For instance, Rocket Learning uses image recognition to auto-grade homework photos at scale and “predictive modeling to identify and retain at-risk students.”
What’s different is that language has taken center stage in many interventions post-2023. The ability to have a dialogue – to ask questions, clarify doubts, receive step-by-step guidance – covers a wide range of human needs in development sector related areas.
LLMs of course come with a well-known risk of inconsistent behaviour, but as Han Chieng Sha (a keynote speaker at the event) notes, this unpredictability can be managed:
Application developers I met were clear eyed that AI technology can behave inconsistently—generative AI models rarely meet intended behaviors “out of the box.” The critical question for developers isn’t whether the technology deviates from the designer’s intent, but whether their mitigation strategies sufficiently ensure quality outcomes, particularly for vulnerable populations.
But not all development sector use-cases intersect with language, and there are other AI approaches that might be better suited for specific problems. Analysing geospatial datasets and using predictive modeling for disaster risk assessment (or to detect crop diseases), using machine learning to detect specific conditions from medical X-rays, building specialised models for drug discovery – these are a few common examples. By focusing heavily on LLMs, do we risk neglecting these potentially impactful, albeit less hyped, technologies?
I asked GPT-4o what it thought about this risk. Following a long explanation, it ended with this:
LLMs are like charismatic generalists — they’re flexible, good at talking, and make excellent demos. But quietly in the background, we also need the domain experts: the vision models that spot rust fungus, the time-series systems that know when cholera spikes, and the optimizers that get vaccines to remote clinics.
If we only fund the generalists, we risk weakening the entire AI team. The social sector can’t afford to get distracted by what’s shiny — it has to stay obsessed with what works.
That’s a bit too dramatic and judgemental for my taste. What strikes me most about the current landscape is that language has become the default modality for innovation. Other approaches are receiving less attention, in part because they’re harder to build, demo, deploy, or scale quickly. Over time, will this reinforce a narrow kind of tech-centric thinking that prizes conversational UX and rapid deployment over deeper programmatic integration?
It’s an interesting space to watch. I’ll be waiting for Wadhwani AI’s 2024 – and then 2025 – annual report.
ANNEXURE
Jacaranda Health
Primary AI Use-Case: Nudging/Guiding Users – Operates an AI-powered SMS health navigator (“PROMPTS”) that engages expecting mothers with two-way messaging and personalized advice, while triaging urgent queries to human nurses.
Primary Goal: Improved Decision Making – Empowers timely care-seeking and adherence to medical advice, leading to better maternal and newborn health outcomes.
Primary Users: Unskilled End-user – New and expecting mothers directly receive and interact with the SMS-based service.

Rocket Learning
Primary AI Use-Cases: Nudging/Guiding Users; Tutoring – Personalised nudging for parents and an AI-powered assistant for Anganwadi workers.
Primary Goal: Improved User Engagement – Keeps parents and children in learning through personalized, localized content and feedback, driving better participation and habit formation.
Primary Users: Unskilled End-user; Semi-skilled Intermediary – Parents of preschoolers in low-income communities, with content support for semi-skilled Anganwadi workers.

Precision Development (PxD)
Primary AI Use-Case: Nudging/Guiding Users – Delivers customized mobile advisories to smallholder farmers (e.g. SMS or voice calls) with agricultural tips tailored by data analysis and machine learning to local conditions.
Primary Goal: Improved Decision Making – Enables data-driven farming practices (e.g. optimal planting, input use) that improve yields, profits, and climate resilience for farmers.
Primary Users: Unskilled End-user – Smallholder farmers with minimal formal training, receiving direct farming guidance on basic mobile phones.

Digital Green
Primary AI Use-Case: Nudging/Guiding Users – Building an AI assistant (“Farmer.Chat”) to support agricultural extension by answering farmers’ questions in local languages and providing on-demand advice drawn from a vast library of farming knowledge.
Primary Goal: Cost Saving – Achieves far more cost-effective scale in agricultural extension (10× cost reduction) while boosting farmers’ productivity and incomes.
Primary Users: Semi-skilled Intermediary – Frontline agriculture extension agents use the assistant to help farmers (with plans to expand direct access to smallholder farmers themselves).

Reach Digital Health
Primary AI Use-Case: Nudging/Guiding Users – Supports pregnant women and new mothers through WhatsApp/SMS.
Primary Goal: Improved User Engagement – LLM-driven adaptation to user context will increase engagement, satisfaction, and proactive healthy behaviour management.
Primary Users: Unskilled End-user – Pregnant women and new mothers engage via WhatsApp/SMS.

Youth Impact
Primary AI Use-Case: Tutoring Users – Integrating generative AI into a phone-based tutoring program (“ConnectEd”) to create a voice-interactive tutor that can conduct one-on-one math and literacy lessons and automate scheduling of sessions.
Primary Goal: Cost Saving – Delivers personalized tutoring at scale for a fraction of the cost of human tutors, enabling low-cost, widespread remediation of learning loss among students.
Primary Users: Unskilled End-user – Primary school children (often in resource-poor settings) who lack access to quality tutoring, now reached via simple mobile phones through AI voice calls.

Noora Health
Primary AI Use-Case: Nudging/Guiding Users – Runs a Remote Engagement Service (RES) that follows up with family caregivers via mobile (WhatsApp chatbots, automated calls) to reinforce hospital training, answer FAQs with AI-assisted responses, and flag high-risk cases for prompt intervention.
Primary Goal: Improved Decision Making – Equips family caregivers with the knowledge and confidence to make life-saving care decisions at home (e.g. recognizing danger signs and seeking timely medical help).
Primary Users: Unskilled End-user – Family caregivers of patients (e.g. new mothers, post-surgery patients’ relatives) who typically have no formal medical training, supported by the chatbot in local languages.