How Do We Improve AI When Sharing Our Data Is So Dangerous?
Women’s health AI will never work without our data. But in a post-Dobbs world, the same data that could improve our care could also be used to prosecute us.
I get asked this question constantly. From clinic directors, from SRH advocates, from people running Title X programs who are trying to figure out whether to adopt AI tools and what it means for the people they serve.
They already feel the tension. They already know both sides of it. They want to know what I think, and whether there’s a way through.
We need our data in the system. Without it, AI will never work for us. And not just because the training data is incomplete, though it is: 90% of clinical trials don’t report sex-specific outcomes, and the conditions driving the biggest share of the women’s health burden received less than 1% of research funding between 2019 and 2023.
The problem is that women’s absence doesn’t stop at the training data. It cascades through every layer of how AI systems work. I wrote about this in detail in my last piece, “You Can’t Fix AI’s Reproductive Health Problem With Better Data.” If you haven’t read it, start there, because it’s the foundation for what I’m about to argue here. The short version: the training data doesn’t include women, so the model’s built-in worldview doesn’t reflect women’s health reality. The safety tuning was calibrated without that reality, so it treats reproductive health as dangerous territory requiring maximum hedging. The retrieval layer pulls from an internet flooded with crisis pregnancy center content and post-Dobbs legal confusion. And the generation layer produces outputs that don’t reflect women’s experience because none of the layers it draws from ever did. Every layer is broken by the same absence. That’s why we need our data in the system: not to fix one layer, but because the absence compounds at every step.
But the same data that could save us could also be used against us.
Not hypothetically. Not in some dystopian future. Right now. In states where prosecutors can use your search history to build a case around a pregnancy outcome. Where your period tracking app may disclose your data to law enforcement on request. Where a teenager in Nebraska was convicted after prosecutors subpoenaed Facebook messages between her and her mother about ending a pregnancy. Where a woman in Mississippi was charged with second-degree murder after a stillbirth because prosecutors pointed to her online search history as evidence.
The people I work with in the SRH field are navigating this every day, and they deserve more than the usual reassurances: “use end-to-end encryption,” “anonymize the data,” “get better consent forms.” Those are real tools, but they don’t solve the structural problem. Encryption doesn’t help when a court orders you to hand over the decryption key. Anonymization doesn’t hold up when location data, timestamps, and search histories can be cross-referenced to re-identify individuals. Consent forms don’t protect you when the law changes after you’ve already shared.
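To make the anonymization failure concrete, here is a toy sketch in Python of a linkage attack. Every value in it is fabricated for illustration, but the mechanic is real: no name is ever shared, and a single join on quasi-identifiers still singles a person out.

```python
import pandas as pd

# Illustrative linkage attack: an "anonymized" health export (no names)
# joined against a commercially available location/ad dataset on
# quasi-identifiers. All rows here are fabricated for illustration.
anonymized_health = pd.DataFrame({
    "zip3": ["787", "787", "604"],
    "birth_year": [1996, 1988, 1996],
    "visit_date": ["2024-03-11", "2024-03-11", "2024-05-02"],
    "visit_type": ["clinic_A", "clinic_A", "clinic_B"],
})
broker_location_data = pd.DataFrame({
    "device_id": ["ad-9f31", "ad-22c7"],
    "zip3": ["787", "604"],
    "birth_year": [1996, 1996],
    "seen_near": ["clinic_A", "clinic_B"],
    "seen_date": ["2024-03-11", "2024-05-02"],
})

# No name or patient ID is shared, yet zip prefix + birth year + date
# + place is often enough to isolate one person.
reidentified = anonymized_health.merge(
    broker_location_data,
    left_on=["zip3", "birth_year", "visit_date", "visit_type"],
    right_on=["zip3", "birth_year", "seen_date", "seen_near"],
)
print(reidentified[["device_id", "visit_type", "visit_date"]])
```

This is why "we removed the names" is not a privacy guarantee. It's a speed bump.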
So here’s the question this piece is trying to answer: what would it actually take to get our data into the system safely? And is “safely” even possible under current legal conditions?
The Trap
For women seeking reproductive health care in the United States right now, sharing health data carries real, material risk. Reproductive health data can be subpoenaed. It can be used as evidence in a criminal prosecution. It can be sold to data brokers who sell it to anyone willing to pay. The legal infrastructure for this existed before Dobbs v. Jackson Women’s Health Organization. What Dobbs did was overturn the constitutional right to abortion, enabling states to criminalize reproductive care and making every piece of reproductive health data a potential liability. The surveillance architecture was already built. Dobbs gave it teeth.
Refuse to share our data, and the AI that’s increasingly making decisions about our health care will never see us accurately. Share our data, and we hand prosecutors, law enforcement, data brokers, and tech companies the raw material to surveil, profile, and criminalize reproductive decisions.
That’s not a technology problem you can solve with a better algorithm. It’s a structural problem about who has power over health information and what they’re legally allowed to do with it.
Who Gets to Be Brave With Their Data?
This is the part of the conversation most people skip, and it’s the part we can’t afford to.
Not everyone faces the same risk in sharing their health data. A white woman with private insurance in Massachusetts tracking her endometriosis symptoms on an app faces a fundamentally different data risk calculation than a Black woman in Texas who needs reproductive care she can’t legally access in her state. The first woman’s period tracking data is unlikely to be subpoenaed. The second woman’s data (her search history, her location pings near a clinic, her text messages, her app activity) could be used to build a criminal case. The threat isn’t the medical condition. The threat is the digital trail that documents every step of seeking care: the Google search, the telehealth appointment, the pharmacy pickup, the follow-up visit that didn’t happen.
BIPOC and low-income people are disproportionately surveilled and disproportionately targeted in reproductive health prosecutions. The legal advocacy group If/When/How documented at least 61 investigations or arrests for allegedly self-managed abortion between 2000 and 2020, overwhelmingly targeting communities of color. A scoping review of women’s mobile health apps (period trackers, fertility monitors, pregnancy apps, the whole category) found that 87% shared user data with third parties, and only 52% even requested consent from users before collecting their data.
So when we talk about “contributing your data to improve women’s health AI,” we need to ask: Who bears the legal risk? Who absorbs the consequences if the data is subpoenaed? And who reaps the benefit of the improved models?
Right now, the people with the most privilege to share safely are the people whose health outcomes are already best represented in the datasets. And the people whose data is most urgently needed (Black women, Indigenous women, trans and nonbinary people, immigrants, women in the Global South) are the ones for whom sharing carries the greatest danger. Any framework that doesn’t name this dynamic and design around it is building another system that extracts from the most vulnerable and delivers to the most privileged.
Catherine D’Ignazio and Lauren Klein frame this directly in Data Feminism: they argue for “data justice” over “data ethics” because ethics frameworks tend to focus on fairness within existing power structures, while justice frameworks ask who those structures serve and how to change them. But even Data Feminism doesn’t specifically address the reproductive health surveillance question, which may be the most extreme version of the paradox: the people who most need to be counted are the people for whom being counted is most dangerous.
What If We Owned Our Own Data?
Some of the most creative thinking on this problem is coming from outside the health field entirely.
Alex “Sandy” Pentland at MIT has been developing frameworks for what he calls data cooperatives: collective structures where individuals retain ownership of their personal data, pool it for shared benefit, and have fiduciary representation in how it gets used. (Full disclosure: Pentland taught parts of my AI coursework at MIT, and his thinking on data governance has shaped how I approach this problem.)
The core idea: you should have the same rights over your data that you have over your body and your money. You should be able to see what’s been collected, decide who accesses it, derive value from it, and revoke access when the terms change. Not just health data. All of it. Every data trail you leave.
Pentland’s “New Deal on Data” proposes that data be treated as a personal asset that can be shared for collective benefit, but only under conditions the individual controls. He and his collaborators at MIT have modeled data cooperatives on credit unions: not-for-profit institutions owned by their members, chartered to manage members’ digital data and to represent their interests legally and financially. Credit unions, in other words, for all of your personal data, not just your money.
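To make the member-control idea concrete, here is a minimal sketch. Everything in it (the class names, the grant structure) is my own illustration, not Pentland's code or any real cooperative's system. It just shows what "see it, grant it, revoke it" looks like as a data structure.

```python
from dataclasses import dataclass, field
from datetime import datetime

# Minimal sketch of member-controlled access in a data cooperative:
# members can inspect what is held about them, grant access under
# explicit terms, and revoke it when the terms change.
@dataclass
class AccessGrant:
    requester: str      # who wants the data (e.g., a research consortium)
    purpose: str        # the stated use the member agreed to
    expires: datetime   # grants are time-bounded by default
    revoked: bool = False

@dataclass
class MemberRecord:
    member_id: str
    data: dict = field(default_factory=dict)
    grants: list = field(default_factory=list)

    def what_is_held(self):
        """Members can always see what's been collected about them."""
        return list(self.data.keys())

    def grant(self, requester, purpose, expires):
        self.grants.append(AccessGrant(requester, purpose, expires))

    def revoke(self, requester):
        """Revocation applies immediately to all of that requester's grants."""
        for g in self.grants:
            if g.requester == requester:
                g.revoked = True

    def can_access(self, requester, now=None):
        now = now or datetime.now()
        return any(
            g.requester == requester and not g.revoked and g.expires > now
            for g in self.grants
        )

member = MemberRecord("m-001", data={"cycle_log": "...", "visit_notes": "..."})
member.grant("research_pool", "aggregate fibroid study", datetime(2026, 12, 31))
print(member.what_is_held())               # ['cycle_log', 'visit_notes']
print(member.can_access("research_pool"))  # True, until revoked or expired
member.revoke("research_pool")
print(member.can_access("research_pool"))  # False
```

Note what the sketch cannot do: revoke() means nothing against a subpoena. That gap is exactly where the next question lives.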
It’s a compelling framework. But it immediately raises a question Pentland’s work, grounded in tech-economy framing, doesn’t fully address:
What does a data cooperative look like when the data itself could be evidence in a criminal case?
A reproductive health data cooperative would need to be designed not just for privacy, but for legal immunity. Not just for individual control, but for collective defense. And it would need governance structures built by the communities most at risk, not by the institutions most eager to collect.
This connects to a broader question about compensation. Your data is already generating enormous value: period tracking apps monetize your cycle, health platforms sell aggregate insights to pharmaceutical companies and insurers, and OpenAI’s ChatGPT Health (which is not covered by HIPAA) is asking 40 million weekly health users to upload their medical records into a system also exploring advertising as a revenue model. Some argue people should be paid directly for their data. But paying individuals for health data creates incentives that fall hardest on the most economically vulnerable. The woman who needs $50 for groceries will share data that the woman with a full pantry won’t. In a landscape where reproductive data can be weaponized, that incentive becomes a trap: trading legal risk for economic survival.
Paying individuals for their data doesn’t redistribute power. It just puts a price tag on the extraction. The approaches that actually shift who controls reproductive health data look different.
Approaches We Should Be Funding
Federated learning. AI models don’t need to centralize your data to learn from it. Federated learning trains algorithms across decentralized datasets: your data stays on your device or within your clinic’s system, the model comes to you, learns the patterns, and leaves without taking the raw information. There are legitimate concerns about what can be reverse-engineered from model updates, but it’s a fundamentally different architecture than “upload your medical records to OpenAI.”
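Here is a toy sketch of the core loop, federated averaging, using an invented linear model and synthetic "clinic" data. It is illustrative, not production code, and the variable names are mine:

```python
import numpy as np

# Toy federated averaging (FedAvg) sketch: each clinic computes a model
# update on its own data; only updated weights leave the clinic,
# never the raw records. All clinic data here is synthetic.
rng = np.random.default_rng(0)

def local_update(weights, X, y, lr=0.1, epochs=5):
    """One clinic's local training: gradient steps on a linear model."""
    w = weights.copy()
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)  # mean-squared-error gradient
        w -= lr * grad
    return w

# Three clinics with locally held data (never pooled centrally).
true_w = np.array([0.5, -1.2])
clinics = []
for _ in range(3):
    X = rng.normal(size=(200, 2))
    y = X @ true_w + rng.normal(scale=0.1, size=200)
    clinics.append((X, y))

global_w = np.zeros(2)
for _ in range(20):
    # The model travels to each clinic; the raw data stays put.
    local_ws = [local_update(global_w, X, y) for X, y in clinics]
    # The server averages the updates, weighted by local sample count.
    sizes = np.array([len(y) for _, y in clinics])
    global_w = np.average(local_ws, axis=0, weights=sizes)

print("learned weights:", global_w)  # approaches true_w without centralizing data
```

Real deployments layer on protections like secure aggregation precisely because of the reverse-engineering concern above. The sketch only shows the architectural point: the learning travels, the records don't.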
Community data sovereignty. Indigenous data sovereignty movements have been leading this thinking for decades. The CARE Principles for Indigenous Data Governance (Collective benefit, Authority to control, Responsibility, Ethics) assert that communities should govern the data about them, not just consent to its collection. What would it look like if SRH organizations operated as data stewards for the communities they serve, accountable to those communities rather than to funders, researchers, or tech companies?
Legal sanctuary for reproductive health data. Before June 2025, the Biden administration’s HIPAA Privacy Rule modification offered some federal protection, prohibiting disclosure of protected health information related to lawful reproductive care for criminal investigations. A federal court in Texas vacated that rule, and the current administration did not appeal. California, Connecticut, Illinois, Massachusetts, New Jersey, New Mexico, New York, and Washington, D.C. have all passed their own reproductive health data privacy laws, and Washington state’s My Health, My Data Act created a novel framework for consumer health data. But this patchwork is fragile and already under legal challenge: Texas AG Ken Paxton sued a New York doctor in December 2024, directly testing New York’s shield laws. What if we pushed for a framework that treats reproductive health data the way attorney-client privilege treats legal communications: categorically protected, not subject to subpoena, not accessible to law enforcement without an extraordinarily high bar?
Differential privacy and synthetic data. Differential privacy adds statistical noise to datasets so that overall patterns are preserved but individual data points cannot be extracted. Apple and the U.S. Census Bureau both use it, though researchers found the Census implementation introduced disproportionate errors for rural and non-white populations: a cautionary tale about equity in privacy-preserving methods. Synthetic data (which I discussed in my previous piece) offers another route: training AI on data that preserves the statistical patterns of real records without corresponding to any actual person. Both tools have real limitations, and both inherit the biases of their source data. But they deserve development specifically for reproductive health applications.
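Here is what the "statistical noise" actually looks like, in a minimal sketch of the Laplace mechanism applied to a simple count. The epsilon values and patient count are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(7)

def dp_count(values, epsilon):
    """Release a count with epsilon-differential privacy via the
    Laplace mechanism. A count query has sensitivity 1: adding or
    removing one person changes the true count by at most 1."""
    true_count = len(values)
    noise = rng.laplace(loc=0.0, scale=1.0 / epsilon)
    return true_count + noise

# Illustrative: a clinic reports how many patients had a given outcome.
patients = list(range(1_042))  # stand-in for individual records

for eps in (0.1, 1.0, 10.0):
    print(f"epsilon={eps:>4}: noisy count = {dp_count(patients, eps):.1f}")
# Smaller epsilon -> more noise -> stronger privacy, and vice versa.
# The Census equity findings cited above come from exactly this
# trade-off: noise that is negligible for large populations can swamp
# small ones, which is where rural and non-white counts were distorted.
```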
What the Movement Needs to Do
Stop treating data protection and data inclusion as separate policy agendas. They are one fight. Every advocate pushing for better women’s health datasets should be simultaneously pushing for legal protections that ensure that data cannot be weaponized. You cannot build the dataset and figure out the safeguards later. Later is how people get prosecuted.
Get SRH expertise into AI governance, and build AI literacy inside SRH organizations. This runs in two directions. Most AI governance conversations happen in rooms full of technologists, corporate lawyers, and academics. The people who understand what happens when a prosecutor subpoenas a clinic’s patient records (the clinic directors, the legal advocates, the community health workers) are almost never there. They need to be there with decision-making power. Organizations like If/When/How and the Center for Reproductive Rights have the legal expertise. SRH field leaders have the operational knowledge. AI governance needs both.
But the reverse is equally urgent: most SRH leaders don’t yet have the AI literacy to know what questions to ask, what governance means for their organizations, their staff, and the communities they serve, or what’s at stake in the technical decisions being made right now. Understanding what training data is, how algorithms make decisions, what “consent” actually means in a data pipeline, what your vendor’s privacy policy does and doesn’t protect: these aren’t optional technical details. They’re the difference between an organization that can protect its community and one that unknowingly exposes it. AI literacy is a survival skill for reproductive health organizations. The ones that build it now will shape how this technology enters their communities. The ones that don’t will have it done to them.
Name the origins of the framework we’re using. The reproductive justice framework (the insistence that reproductive rights cannot be separated from economic security, freedom from violence, and the right to parent) was created in 1994 by 12 Black women, including Loretta Ross, who went on to co-found the SisterSong Women of Color Reproductive Justice Collective in 1997 with 16 women-of-color-led organizations. They built this framework specifically because mainstream reproductive rights discourse centered the experiences of white women. When we say “center the people most harmed,” we are drawing on a Black-led intellectual and political tradition. That should inform how we build data governance: with Black women, Indigenous women, and women of color as architects, not beneficiaries of someone else’s design.
Fund the infrastructure. Federated learning for reproductive health. Synthetic data for underrepresented populations. Legal frameworks for data sanctuary. Community-governed data trusts. These aren’t theoretical luxuries. They’re infrastructure we need now. And they require dedicated investment from foundations, policymakers, and technologists who understand that this is a civil rights issue, not just a computer science problem.
What’s Missing, and What I’m Still Thinking About
The global dimension. Most of my framing here is U.S.-centric, because the post-Dobbs legal landscape drives the most acute version of this paradox. But 54% of the global women’s health burden is in low- and middle-income countries, and the dynamics there are different. Researchers have documented how data-sharing mandates can perpetuate neocolonial extraction: raw data collected from communities in low- and middle-income countries, then analyzed and published by researchers in wealthy nations who receive the career benefits, the grant funding, and the publications, while the communities that generated the data see little or no return.
The disability justice lens. Disabled women face compounding barriers in health data systems, from inaccessible data collection tools to medical records that conflate disability with pathology. I haven’t explored how disability justice frameworks should inform reproductive health data governance, and they should.
The limits of feminist frameworks. D’Ignazio and Klein’s Data Feminism is the most developed framework I’ve seen for thinking about power and data. But as some critics have noted, even Data Feminism can risk flattening the specific intellectual contributions of Black women by absorbing them into a broader “intersectional” framework without fully crediting their origins (a concern scholars like Nikol G. Alexander-Floyd have raised about intersectionality discourse more broadly). And no framework can substitute for the material conditions of safety. You can have the most beautifully designed data cooperative in the world, and it won’t matter if the legal environment allows prosecutors to subpoena its records.
Holding the Tension
We’ve spent decades fighting for women to be included in clinical trials, in research funding, in the data that shapes how health systems see us. Not just AI. The entire health care infrastructure: insurance algorithms, drug development pipelines, the diagnostic criteria your doctor was trained on. That fight is urgent.
But we’re also living in a moment where the data point that teaches an algorithm to recognize a dangerous pregnancy complication in a Black woman in Georgia could, if it includes identifying information and reaches the wrong hands, become evidence in a prosecution. That’s the legal reality in states where pregnancy outcomes can trigger criminal investigation.
I don’t think the answer is to retreat from data collection. The cost of invisibility is measured in years of untreated pain, missed heart attacks, and maternal deaths that better algorithms could prevent.
And I don’t think the answer is to share without safeguards, trusting institutions that have repeatedly prioritized extraction over protection.
The answer is something we haven’t fully built yet. Something that looks like collective ownership, legal sanctuary, technical innovation, and political power, all woven together. Something that says: yes, count us. But on our terms. Under our governance. With protections that match the stakes.
The reproductive health movement, rooted in Black women’s leadership, forged in fights over bodily autonomy, practiced at holding impossible tensions, is the right community to lead this design. Not because nobody else could, but because nobody else has the combination of political analysis, lived experience, and operational knowledge that this problem demands.
We are not going to let them build the future of health AI without us. But we are also not going to hand them the weapon and call it progress.
Both things are true. And the space between them is where the real work lives.
What do you think? How should we navigate this? I want to hear from providers, patients, advocates, technologists, and anyone sitting with this tension. Reply to this email or find me on LinkedIn. This is a conversation the movement needs to have, and this newsletter exists to start it.
The Body is the Interface is AI strategy for the people building reproductive justice. If this piece resonated, share it with your network and subscribe so you don’t miss what’s next.
About the Author
Lyndsay Sanborn is the founder of Frame + Forge, an AI strategy studio built for the reproductive health movement. She has spent 25+ years building programs, shaping strategy, and designing products across the SRH ecosystem, from direct service organizations like Planned Parenthood, to federal programs including HRSA, Title X Family Planning, and Ryan White, to movement partners like Power to Decide, SisterSong, Upstream USA, HealthHIV, and many others. She holds an MIT xPRO certification in AI strategy, leadership and product innovation and a master’s in health policy and administration. She publishes The Body is the Interface because she believes the people closest to the work of reproductive justice should be the ones shaping how AI enters it.
Sources & Further Reading
WEF & McKinsey Health Institute. “Blueprint to Close the Women’s Health Gap: How to Improve Lives and Economies for All.” January 2025.
World Economic Forum. “Building Global Infrastructure for Women’s Health Data (WHIT).” January 2026.
Stateline/Pew. “Data Privacy After Dobbs: Is Period Tracking Safe?” July 2024.
VICE News. “Period-Tracking Apps Won’t Say Whether They’ll Hand Your Data Over to Cops.” 2022.
PMC. “Missed Period? The Significance of Period-Tracking Applications in a Post-Roe America.” 2023.
MIT IDE. “Q&A with Sandy Pentland: Data Cooperatives.” 2020.
Harvard Business Review. “With Big Data Comes Big Responsibility” (Sandy Pentland). 2014.
Hardjono & Pentland. “Data Cooperatives: Towards a Foundation for Decentralized Personal Data Management.” 2019.
TIME. “Is Giving ChatGPT Health Your Medical Records a Good Idea?” January 2026.
CNBC. “OpenAI Launches ChatGPT Health.” January 2026.
PMC. “Privacy, Data Sharing, and Data Security Policies of Women’s mHealth Apps: Scoping Review.” 2022.
D’Ignazio, Catherine & Klein, Lauren F. Data Feminism. MIT Press, 2020.
Global Indigenous Data Alliance. “CARE Principles for Indigenous Data Governance.”
Center for Reproductive Rights, If/When/How, & National Partnership for Women & Families. “Reproductive Health and Data Privacy After Roe.” 2025.
Network for Public Health Law. “Reproductive Health Data Privacy: What Now?” September 2025.
IAPP. “The State of US Reproductive Privacy in 2025.”
PMC. “Differential Privacy for Public Health Data.” 2021.
Apple. “Learning with Privacy at Scale” (Differential Privacy).
Santos-Lozada et al. “How Differential Privacy Will Affect Our Understanding of Health Disparities.” PNAS, 2020.
PMC. “The 2020 US Census Differential Privacy Method Introduces Disproportionate Discrepancies for Rural and Non-White Populations.” 2024.
Frontiers in Big Data. “Big Data and AI for Gender Equality in Health.” 2024.
PMC. “What Constitutes Equitable Data Sharing in Global Health Research?” 2023.
SisterSong Women of Color Reproductive Justice Collective. “Reproductive Justice.”


