Thursday, October 18, 2018

The End of Privacy As We've Known It

I have been revised! The news came just the other day in an email from the testing company informing me that my DNA profile has been revised in light of the large amount of new data they have recently processed, which now allows them to refine my ancestral portrait based on the DNA sample I sent them last spring. And now for the results: instead of being of 96% European Ashkenazic heritage, 2% Sephardic, 1% South-East Asian (a true mystery) and 1% of indistinct origin (whatever that meant exactly), my DNA profile has now been revised to yield the completely un-startling result that, genetically speaking (as well as by disposition, worldview, and appearance), I am of 100% Ashkenazic/European origin. Was I surprised? Not very! And yet…I had come to like the idea of having some weirdly inexplicable Sri Lankan blood in me somewhere, something that, at the very least, could have turned into a good short story. I suppose I’ll get over it. I might as well!

Joan took the test too and received similarly expected results. I suppose most people do. But, of course, not all do. I wrote to you last year about the remarkable way that a woman from Chicago discovered that her (apparently) 100% Irish Catholic father had started out in life as a 100% Jewish baby boy who was sent home with the wrong set of parents, and whose real parents (i.e., the woman who gave birth to him and his biological father) took home the (actually) Irish Catholic baby, who grew up to be a Jewish man from the Bronx and the patriarch of a large, complicated Jewish family. (If you find that confusing, you can revisit that letter by clicking here.) There, I mused aloud about the malleable boundaries of identity, about what it means to be who we are, and about what that means with respect to the ultimate definition of Jewishness or, for that matter, any kind of identity deemed to inhere in an individual at birth. To my great surprise, I actually received an email from the woman with the Jewish Irish Catholic father in response to what I wrote about her case, and I was very gratified indeed by her generous appraisal of what I had to say about her situation and her father’s.

You have to be a serious genealogist to take advantage of most of what these online DNA sites offer. When I visit the website, for example, I can see the names of more than a dozen people who the site says are “almost definitely” my fourth or fifth cousins. (Fifth cousins are people, one of whose thirty-two great-great-great-grandparents was a sibling of one of the other person’s thirty-two great-great-great-grandparents.) I’ll have to upgrade my membership if I actually want to contact any of them, but I haven’t taken that step. Nor do I think I will in the future. (In all fairness, they’ve also dangled the names of two second cousins to see if I’ll take the bait. So far, I’ve resisted.) But it turns out that there is a lot more to all of this than learning the names of theoretical cousins possibly descended from theoretical siblings who lived in the eighteenth century.
One of the side developments of all this DNA testing is the discovery some men have made, not of distant cousins, but of children inadvertently fathered somewhere along the way and in any number of different ways. (This phenomenon, which will only become more common in the coming years, has touched one family in our congregation and it has touched my own family as well. Those two stories were different in detail, but identical in terms of result…and, although both appear to be having happy endings, there are surely people out there whose entire lives have been or will be turned upside down by this kind of unanticipated revelation.) Another has to do with the forensic use of these data banks to solve crimes long consigned to the “cold case” bin and only now becoming solvable in the wake of the proliferation of these online DNA banks. You may recall reading about the arrest of the man police accuse of being the so-called “Golden State Killer,” a violent criminal considered likely to be responsible for fifty rapes and a dozen murders committed between 1976 and 1986, whose identity was only revealed to the authorities after they uploaded DNA taken from the crime scenes to one of these genealogy sites. (To read more about that specific case, click here. Making that specific case more interesting is the fact that although the suspect did not personally offer his DNA to any of the online testing sites, a few of his relatives did…and matching the crime-scene DNA to their profiles led to the arrest of the sole individual to whom they were all related.)

But the specific issue I want to write about this week has to do neither with the discovery of unknown offspring nor the solution of cold-case crimes. Instead, I’d like to write about an issue that feels as though it has the potential to dwarf both those issues in terms of the impact it could conceivably have on society.
To date, about fifteen million people have consciously and intentionally sent in samples of their DNA for analysis to the best-known testing sites. Another couple of million have signed up at a few less well-known sites. We are, therefore, talking about far fewer than 10% of American citizens, but the implications of this phenomenon are far greater than the numbers suggest. Just this week, a study co-written by Yaniv Erlich, Tal Shor, Itsik Pe’er, and Shai Carmi was published in the journal Science that suggested just how important this whole phenomenon is…and how it will soon affect the lives of millions of people who themselves have not sent in their DNA for analysis.

To date, about sixty percent of Americans of North European descent—Brits, Germans, Poles, Danes, Swedes, etc.—can be identified through these databases regardless of whether they have personally sent in their DNA for analysis. And that number is only the beginning: within two or three years, the authors of the Science essay imagine that a full ninety percent of Americans whose families originate in northern Europe will be identifiable through their DNA even if they themselves have not personally contributed any DNA sample.
To me, that sounded unbelievable. It’s one thing, after all, for my page to say that mitchKK (whoever he is) and I are “highly likely” to be second cousins. (I think we probably are cousins, by the way: the second K matches the odd way my great-grandparents spelled their last name, so I’m guessing one of his grandfathers must have been one of my grandmother’s brothers.) But that only sounds plausible because we both contributed samples of our DNA and so opened ourselves up to being identified as each other’s relative. But how could this possibly work with people who specifically have not contributed their DNA? That’s what I set myself to trying to figure out.

I’m not sure I understand the Science article entirely correctly. (To try for yourself, click here.) But as far as I can understand, the whole thing has to do with third cousins because, it turns out, the way the tests work is precisely to identify people whose DNA samples match closely enough for them to be third cousins, i.e., the great-grandchildren of siblings. Most of us apparently have about 800 people in the world whose DNA matches ours to that extent. And if just one of those people is in the database, then someone who truly knows what he or she is doing can combine that match with other public records to find a trail to a sought-after individual even if that person has not personally contributed DNA of his or her own. This does not bode well for people who value their privacy.
The authors of the Science article chose thirty DNA test results at random from the GEDmatch database and then, by analyzing that data and using public information available to all, they were able to identify third cousins of about 60% of the people whose DNA they had selected for study. (GEDmatch, with only a million customers, is significantly smaller than its competitors but was amenable to allowing the experiment to proceed.) In an article describing the experiment published in the New York Times this week (click here), Heather Murphy quoted Yaniv Erlich, one of the authors of the Science article, as saying that, “to identify an individual of any ancestry background, all that is needed is a database containing two percent of the target population.” That stopped me in my tracks.

Is that really possible? Graham Coop, a genetics professor at the University of California, Davis, who is cited in the Times article, thinks so and is quoted as saying that “society is not far from being able to identify 90 percent of people through the DNA of their cousins in genealogical databases.” In my opinion, anyone who doesn’t find that both startling and seriously unsettling probably hasn’t thought the matter through carefully enough!
I’ve been sensitive for a long time to the slow erosion of personal privacy in our American culture. For most of us, that thought conjures up almost funny images of some drone at the NSA poring over trillions of emails that could not possibly be of interest to anyone other than the person to whom they were sent. But the thought that society seems to be blundering almost unawares into a future in which personal privacy is a thing of the past and the fullness of an individual’s genetic heritage is suddenly a matter of public record, regardless of whether that individual has or hasn’t chosen to become part of the digital quarry from which amateurs like myself presumed such data could only be mined, seems to me to be far beyond something reasonably referenced as a quirky innovation of the digital age. The right to personal privacy in life (to live free without the oversight of others and without their interference) is one of the fundamental privileges of citizens in a democracy. That we appear to be on the verge of losing control over that foundational right is yet another sign of just how out of control things are as we barrel into the future only vaguely aware of what we ourselves have wrought.

