
Chapter Four: Dr. John Snow and the First Health Data Trust

“Trust and accountability: above all else, these are the pillars of public health.”

– Laurie Garrett

Dr. John Snow, Medical Data Detective

Recap: Our story so far

Over the first three articles in this series, we have discussed the desirability of replacing the Web’s current SEAMs cycles (surveil, extract, analyze, and manipulate), with a new HAACS ethos (enhancing human autonomy/agency, via computational systems). Article one highlighted the dynamics of fostering large-scale systems change in our social institutions. Article two introduced us to Carla, a typical Web user, and how “Userhood” and the SEAMs cycles reduce her agency in the digital world. Article three described how entities acting as personal digital fiduciaries, along with tech tools like Personal AIs, could greatly empower Carla and other Web end users. In this article, we will look at another fiduciary-based institution: the data trust. 

With Carla receiving the unwelcome news that she has tested positive for COVID-19, a “contact tracer” calls to ask questions about her recent activities. Carla separately wants to donate her health data to trustworthy institutions for beneficial medical research, but she is unclear how to proceed. As we will see, a health data trust could be an effective and accountable way for Carla to share her somatic data, so that it winds up in the hands of accredited researchers and medical professionals – and not SEAMs-based data brokers.

Tracing the infectious contacts

An infectious pathogen in the air. A deadly pandemic on the rise. Fear rampant in homes, in the streets. Medical professionals frantically searching for answers. For at least some, the promise of contact tracing as a way to flatten the infection curve.

The time is early September 1854. The place is London, England. The plague is cholera. And the dedicated contact tracer is named John Snow, a young medical doctor. In a neighborhood of the working poor in central London (now Soho), some 600 residents have perished from cholera in a single week. While other medical experts of his day, including the General Board of Health, are convinced that the disease is transmitted as an airborne “miasma” among the “squalid” poor, Snow believes the pathogen actually spreads via contaminated water.

Snow confirms his then-heretical theory by meticulously assembling evidence of infection, including reconstructing and tracking the daily movements of infected patients in and around the neighborhood. By observing the data clusters, Snow determines that everyone affected has one thing in common: they all retrieved drinking water from a hand pump well, located on Broad Street.

After assembling his evidence, Dr. Snow decided to try to alert the singularly named “Board of Guardians of the Poor.”  This local group of landowning white men was charged with maintaining some semblance of order and health in a neighborhood of the economically challenged.

On the evening of September 7th, Snow brought his findings to the Board’s outwardly skeptical Sanitary Committee. As Snow’s contemporaneous biographer relates it: “The vestrymen of St. James were sitting in solemn consultation on the causes of the visitation.” Without warning, “a stranger had asked in modest speech for a brief hearing.” Dr. Snow explained his case, advising removal of the Broad Street water pump handle as “the grand prescription.” The vestry was “incredulous,” but the next day agreed as a precaution to remove the water pump’s handle, rendering it inoperable. The epidemic, already ebbing, was quelled.

The Broad Street water pump

Snow’s findings at the Broad Street pump, and in other cholera outbreaks, helped usher in “the sanitation movement.” Over several decades, cities and towns around the world constructed greatly improved sewage drainage and water purification systems. For contributing to such huge systemic changes, Snow has been touted by many as the father of epidemiology. In one survey of medical professionals, he came out as the “greatest doctor of all time,” edging out the legendary Hippocrates (460-370 BC). Despite such massive improvements globally, cholera still kills some 100,000 people every year, mainly in developing countries without safe drinking water and sanitation.

Medical data, and well-earned trust

Dr. Snow personally attended to many of the pandemic’s unfortunate victims. His biographer spoke of Snow’s “untiring zeal … how he laboured, and at what cost, and at what risk…. He laid aside, as much as possible the emoluments of practice, even by early rising and late taking rest…. Wherever cholera was visitant, there was he in the midst.” Steven Johnson observes that “the fearlessness of the act still astonishes.” In his house-to-house inquiries, Snow spoke with the dying, and the relatives of the dead.

People in hugely crowded private residences. Strangers in the streets. Hospital patients. Prison inmates. St. James Workhouse laborers. He also interrogated those who (like workers at the Lion Brewery) for some reason had not fallen ill.

Dr. Snow managed to convince a wide range of people to confide sensitive information. Names and addresses. Eating and drinking history, regular habits, known physical contacts, daily whereabouts. Many patients and families undoubtedly were desperate. Some likely knew they were doomed, and yet (or perhaps therefore) shared their confidences anyway. People placed their trust in Snow, as a potential healer, a medical sleuth, and an attentive listener.

Dr. Snow recorded his data points diligently in a ledger, eventually displaying them on a London street map. There, each black bar stood for one cholera death. Snow’s map became the public repository of crucial empirical data, reflecting genuine local knowledge gathered and sifted from the source.

John Snow’s “ghost map”

While John Snow was not the first contact tracer in history, in many ways he has become the most consequential. He was a true native of the Broad Street area, which “gave him both awareness of how the neighborhood actually worked and it gave him a credibility with the residents, on whose intimate knowledge of the outbreak Snow’s inquiry depended.”

Reema Patel of the Ada Lovelace Institute finds relevance and inspiration in John Snow’s career. In particular, as Patel explains, Snow recognized that the ethical and swift use of relevant and sensitive medical data can help save lives. Dr. Snow in effect became a trusted repository of such data, a role which he then extended for socially beneficial uses that would have far-reaching impact.

Tracing a pandemic, in 2020

A new outbreak, a new pathogen, airborne transmission – and yet, in 2020, the medical detective work largely is the same. Gather the evidence, look for patterns, and take swift action.

Today, contact tracing is the first and often most powerful tool in tracing the origins of a disease, as well as implementing appropriate social isolation guidelines. The public health worker typically takes a two-step approach, first conducting a detailed contact tracing interview with an infected individual, and then asking the individual to stay at home and self-isolate. 

The U.S. Centers for Disease Control and Prevention (CDC) provides specific principles, strategies, and detailed scripts for how public health workers should engage to notify individuals of exposure. These principles include ensuring confidentiality, demonstrating ethical and professional conduct, creating a “judgment-free zone,” showing cultural humility, asking open-ended questions, using reflective listening techniques, and addressing concerns that arise during the conversation. Each principle is intended to allow public health workers to “build and maintain trust with clients and contacts.”

The usefulness of sharing data

To many, contact tracing constitutes an invasion of personal privacy, justified because of the larger public health benefits. The real-life work of contact tracers demonstrates that “the contagion of fear” underlies many of their efforts to contact people and persuade them to cooperate. Still, as recent experience in the Apache nation shows, intense contact tracing, conducted in person by well-known and trusted community members, can overcome that fear and yield valuable, actionable information.

A recent report from the Aapti Institute establishes some of the many societal benefits from sharing one’s personal data. In particular:

Sharing data is instrumental in the development of knowledge, research, innovation, and cooperation that help us better understand our society, economy, and polity. Some data, therefore, is rightly regarded as a shared resource that benefits society at large. 

Under the right governance structures – the “rules of the road” – human autonomy and agency can be promoted through the sharing of personal data. As contemporary technology markets increasingly extract value from aggregated data, Martin Tisne of Luminate explains that there is a pressing need to create institutions that protect both individual- and collective-level data rights. Where benefits and harms alike are collective, collective governance mechanisms can fill a gap that lies beyond the more individualized online needs served by the personal digital fiduciary discussed in Article Three.

Contact tracing as an app: where is the governance?

During the COVID-19 pandemic, many of us have become aware of the concept of contact tracing, in the context of “exposure notification” software applications. The best known is the mobile app produced by Google and Apple. In their joint announcement, the companies discussed the desirability of gaining access to certain pools of health data for purposes of medical research, while avoiding the downside of capturing and transmitting highly sensitive information about individuals.

It is apparent that certain contact tracing apps are superior to others, in terms of protecting sensitive identifying information and medical data. Beyond the efficacy of specific technology implementations, however, looms the larger concern about trust and accountability. Most of these health-related apps are operating in the field without the actual “back end” of abiding governance structures. 

Attempting to embed the rules into a software application’s terms of service, or its privacy and data protection practices, is a fraught proposition. As so many of us know, Web end users already face the prospect of “notice and consent” fatigue. Most websites and apps now ask us to review and assess details of their policies, including how much data is collected, of what kind, for how long, and under what terms. Asking for similar user engagement for even more sensitive health-related sites and apps may well exacerbate the burden.

Laudable efforts by outside bodies to create guiding principles pose similar challenges for end users. For example, a recent MIT working draft report identifies 13 different principles that should be applied to contact tracing applications. Similarly, the Data Rights for Exposure Notification project lists some 14 different rights that should apply to affected individuals. As with online consent requirements, these well-meaning attempts to bolster human agency instead may mire ordinary Web users in yet further confusion.

Ultimately, while the power of contact tracing apps can complement human efforts, conceiving of software as the primary solution neglects the role and value of human institutions, as well as the necessity of trust for adopting such personal tools. Perhaps not surprisingly, a recent poll of US smartphone users found that fully half would refuse to use the new Google/Apple contact tracing app. As Laurie Garrett concludes in her magisterial Betrayal of Trust: The Collapse of Global Public Health, it is up to public health to “bring the world toward a sense of singular community in which the health of each one member rises or falls with the health of all others.”

The challenges involved in developing and adopting connected technologies highlight the need to integrate the human element. We need entities operating under certain established legal/ethical principles, setting the standards under which our software applications will be created and deployed. When crises inevitably arise, we don’t just need (more) technology; we require trustworthy tech governance.

Following the (legal) trust

The data trust is one legal instrument increasingly being discussed as a means to bridge that governance gap. The concepts of both the fiduciary (discussed in the previous article on digital fiduciaries) and the trust have their roots in the English courts of equity. There, medieval judges first began handing down what could be considered “gap-filler” judgments, to provide supplicants with remedies where adhering to the strict letter of the law would otherwise lead to unjust outcomes. As a result, the law of equity is premised on applying ethical principles, a premise reflected in the ways that both fiduciary and trust law doctrines have evolved over time.

Equity law in action: The Court of Chancery, London, in the early 19th Century.

The basic concept of a trust is that a “settlor” places property or some other object of value into control of a “trustee,” which in turn manages that asset for the benefit of a “beneficiary.”   The trustee is granted its authority to hold and make decisions about the assets by a trust charter. The trustee is obliged to operate under strict fiduciary standards, including a duty of loyalty towards the beneficiary.

The data trust essentially extends the legal trust concept to treating data as the protected asset. While there is no one clear definition of a data trust, Anouk Ruhaak helpfully explains:

A data trust is a structure whereby data is placed under the control of a board of trustees with a fiduciary responsibility to look after the interests of the beneficiaries — you, me, society. Using them offers all of us the chance of a greater say in how our data is collected, accessed and used by others. This goes further than limiting data collecting and access to protect our privacy; it promotes the beneficial use of data, and ensures these benefits are widely felt across society. In a sense, data trusts are to the data economy what trade unions are to the labour economy.

Sylvie Delacroix and Neil Lawrence have written a seminal article on what they call the “bottom-up” data trust. More “top-down” legal and regulatory structures typically rely on express or implied contracts to delineate parties’ rights and obligations. The notice-and-consent process applied to Web users by so many websites and apps is one such example. By contrast, the “bottom-up” model focuses on individuals using a legal trust for specified functions, such as establishing verified access by third parties to their data. To Delacroix and Lawrence, this latter version would utilize legal safeguards from trust law, and operate with a clear purpose rooted in public, social, or charitable benefits.

The Open Data Institute (ODI) has identified six basic characteristics of the data trust: a clear purpose, a legal structure (including trustors, trustees with fiduciary duties, and beneficiaries), rights and duties over stewarded data, a defined decision-making process, a description of how benefits are shared, and sustainable funding.

Importantly, as ODI explains, the motives for establishing such a governance structure can be good (distributing the benefits of data more equitably), or bad (avoiding data protection obligations).

Little is firmly settled about the notion of bringing data within trust law, and the legal landscape remains uncertain. For starters, civil law-based countries (such as China, Japan, and much of Europe) rely more on codified statutes. These systems may not readily accommodate the judge-made common law more prevalent in the United States, Canada, and India. Questions also remain about whether and how a data access and collection regime can be superimposed onto actual legal trust arrangements. Defining and apportioning the rights and duties among the three parties (the settlor, the trustee, and the beneficiary) has also proven challenging.

Nonetheless, many see potentially enormous value in how a data trust can empower individuals to exercise their data rights collectively. In Canada, Element AI has conducted important research on data trusts. In India, the Aapti Institute has examined the data trust as part of its ongoing work on defining data stewardship. The Mozilla Foundation also has embarked on a data stewardship initiative, which includes launching a “data futures lab” to explore and sponsor various data trust and fiduciary models. As these and other projects unfold, open questions around the viability of data trusts hopefully will be resolved.

One promising model: health care data trusts

Data about a person’s medical status can be a highly useful societal resource. Work at MIT on “building the new economy,” for example, includes proposing a detailed technical architecture to facilitate collecting and sharing health data. By themselves, however, such impressive technology implementations lack a companion means of governance to create, build, and hold actual trust and accountability.

Delacroix and Lawrence posit that one particularly apt use case for the “bottom up” data trust is the donation, pooling, and exchange of personal medical information. Element AI also observes that the mission of societal benefit from medical research could be the driving factor in the success of a health data trust.

Despite these seeming benefits, and as with the data trust model more generally, there are few explicit health data trusts operating today. Those organizations operating in the medical/health care space and utilizing aggregated patient data tend to be organized as non-profits, and utilize ordinary contracts rather than trusts. 

There are still some noteworthy examples. As far back as 2011, the U.S. National Academy of Sciences reported on a variety of projects to implement health data trusts.  Johns Hopkins Medicine today has in place a Data Trust that provides the infrastructure, processes, and procedures for accessing patient data. In the UK, the Royal Free London NHS Foundation Trust is a healthcare provider that includes a trust board selected by a council of governors to manage the trust. 

While not formally a data trust, the UK Biobank has a detailed ethics and governance framework for collecting and utilizing patient data, under what they term a “committed data stewardship” model. Similarly, the Idaho Health Data Exchange is a statewide data stewardship platform for pooling and accessing patient data. The Structural Genomics Consortium is a collaborative community of scientists that places its genomic research in the public domain, and includes an “open science trust agreement” to assign trustee obligations. Finally, one for-profit entity taking on some quasi-data trust obligations is a digital identity management company called Yoti. This firm operates under a stringent ethical framework, which covers its set of COVID-19 testing applications. Yoti also is spearheading the development of a draft code for sharing medical data.

Benefits and limitations of the data trust

In theory at least, the data trust model can provide important benefits when applied to health data, especially as compared with the quasi-contractual “notice and consent” model employed by most websites and apps. Notionally, the data trust relationship itself – rather than the underlying software and hardware, or its one-sided use policies – carries the crucial human elements.

Specific advantages for a health data trust can include:

Perhaps most importantly, the presence of these types of up-front safeguards should result in increased human trust, and the subsequent sharing of more relevant data than would be the case under prevailing top-down consent approaches.

Nonetheless, today the sharing of health data typically happens via ordinary bilateral contracts between parties, not a legal data trust. What is hindering uptake of this governance mechanism?  

Possible reasons could include:

Rooting the data trust in a new “PEP” framework


As noted above, part of the challenge with adopting the data trust model is uncertainty about the specific rights and duties that should apply. For example, the practice of contact tracing derives its ethos from the world of public health. That ethos encompasses protecting the individual’s confidentiality, while gathering pertinent information and driving towards positive social outcomes. In this way, public health differs from the practice of direct medical care, with its emphasis on treating individual patients. Taking care of the well-being of an entire community requires a different mindset.

How then should fiduciary/trust-like duties play out in the online practices of a health data trust?  

One suggestion is to apply lessons from the actual physician-patient relationship. In functional terms, the physician-patient dynamic contains crucial elements to form a fiduciary relationship. 

These include:


As we saw in Article Three with the “PEP” (protect/enhance/promote) model, these kinds of power dynamics suggest that the individual entrustor (here the patient) would benefit from applying four different types of fiduciary law-based duties to the entrusted party (here the physician). These include (1) the general tort-like duty to do no harm, (2) the fiduciary duty of prudent conduct, (3) the “thin” loyalty duty of having no conflicts, and (4) the “thick” loyalty duty of promoting the entrustor’s best interests. The PEP model matches up these fiduciary obligations with specific actions carried out on behalf of the client.


Importantly, this framework can be extended to apply as well to legal trusts serving collective social interests. A PEP model for data trusts would differ in its particulars in several ways from one devised for personal digital fiduciaries and their clients. The use cases and clientele are collective (rather than individual); the scope is limited to data as an aggregated resource (rather than a more expansive digital relationship), and the form is a trust, with a board of trustees, trustors, and beneficiaries (as opposed to the simpler fiduciary binary of a trustor and trustee). Still, in theory the same three protect/enhance/promote action tiers, aligned with the four common law-derived duties, can be applied across the board.

A good governance/PEP model mashup

Sushant Kumar recently pointed out the compelling need for technology safeguards to govern COVID-related apps. He proposes three “good governance” pillars: transparency, accountability, and fairness. Interestingly, these same principles would play out well in a collective data trust-oriented PEP model:

As this exercise shows, the PEP model can be refined and extended to apply it to different types of collective data trusts.

A highly complementary role for personal digital fiduciaries

The digital fiduciary concept introduced in Article Three can help spur adoption of the data trust model. In particular, a personal digital fiduciary can help its clients explore the brave new world of data trusts, make decisions about utilizing particular ones, and negotiate/reach agreement on their behalf. One or more data trusts, dedicated to different use cases, can provide someone like Carla the means of using her own data to express her solidarity with a larger community. 

This more relational, rather than transactional, approach also lessens the cognitive load on ordinary people, and greatly bolsters the value of trust-based relationships.

Combining the personal digital fiduciary and collective data trust models also provides another way to accomplish the extension of human rights into the digital world. By improving on our human-based governance regimes, perhaps we can yet achieve the aspirational D>=A formula (digital rights equal to, if not exceeding, analog rights), for individuals and communities alike.

Some recommended courses of action

Within each jurisdiction of law, there are a number of ways that the data trust concept can become a reality and be successfully adopted.

Options include:

Conclusion: Celebrating a missing pump handle

Some 170 years after the fact, John Snow’s actions on Broad Street, along with the arc of his career, still manage to impress. Shifting seamlessly from doctor to sociologist to statistician, Snow was a true systems thinker, well before the term came into being. His persistent, evidence-based challenges to conventional wisdom remain refreshing today. His many roles bear highlighting as well – from initial theorizing, to carefully collecting and analyzing details of lived lives as relevant data, to testing his theory. Steven Johnson opines that Snow’s careful questioning of the neighborhood residents about whether they had been drinking the well water may in itself have diminished the spread of the epidemic, sparing additional lives.

Snow’s systematic collection of the necessary data, based largely on gaining patients’ trust, involved honing the investigator’s tool of contact tracing. The resulting street-level map, as Johnson puts it, was “a direct reflection of the ordinary lives of the ordinary people who made up the neighborhood…. A neighborhood representing itself, turning its own patterns into a deeper truth.” John Snow was that rare individual who was able, and willing, to persevere to uncover and demonstrate that truth.

Most importantly, Dr. Snow subsequently made use of the data he gathered and analyzed to further the public good – hence the missing pump handle. His work led him to propose prompt action to save human lives. Perhaps this focus on socially beneficial outcomes made him history’s first health data trust: tapping into the societal value of shared local data, under conditions of respecting and protecting confidences and building trust, as a means of making a difference. Snow’s human qualities stand as testament to the kind of institutions we should want to build to govern our digital world.

In London today, John Snow has a pub named for him, just steps from where the Broad Street pump once stood – an ironic testament to a lifelong teetotaler. As we wrestle with data governance models for our future, perhaps a more apt living tribute would be something like the “John Snow Health Data Trust.”

Next article: Civic Data Trusts, plus Symmetrical Interfaces

In Article Five, we will round out our examination of fiduciaries-based digital institutions in the context of the so-called “smart” city. We will explore how the civic data trust, partnered with symmetrical “IoT” interfaces, provides another mechanism for Web users to gain digital rights, in this case while participating in their daily physical environments.