Heart Palpitations? Beware of Dr. Google

24 Jun 2015

Googling a cure for chest pain might put your personal information more at risk than you think, according to new research by Tim Libert of the Annenberg School for Communication.

Libert, a doctoral student who specializes in computer security and Internet governance, looked at a little-known side effect of health-related Web searches: the amount of private health information divulged to third parties, like advertisers and data brokers, when people investigate their medical concerns online.

“I took the names of about 2,000 common diseases or treatments and I found the top 50 search results for each of those, so that gave me about 80,000 Web pages,” Libert explains. He found that “about 90 percent” of them collect and distribute personal information to outside parties.

Even Web pages run by the federal Centers for Disease Control and Prevention (CDC), which might seem less likely to commercialize your health data than businesses like Target or WebMD, aren’t necessarily safe.

“On the CDC’s HIV page, third-party requests are made to the servers of Facebook, Pinterest, [and] Twitter,” Libert explains. Essentially, the social-media buttons at the bottom of the page—“Recommend,” “Tweet,” “Pin it”—enable those companies to collect information about people visiting the page. “It is unlikely that many users would understand that the presence of these buttons indicates that their data is sent to these companies,” Libert notes.
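For readers who want to see what a "third-party request" looks like in practice, here is a minimal Python sketch. It is not Libert's tool or method. It scans a page's HTML for scripts, images, and frames, then lists the hosts that do not belong to the site being visited. The function names, the sample page, and the `example.gov` address are all hypothetical, and the "last two labels" rule for matching domains is a rough assumption that would misjudge addresses such as `example.co.uk`.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class ResourceCollector(HTMLParser):
    """Collect URLs a browser would typically fetch while rendering a page."""

    # Tags whose src attribute normally triggers a request on page load.
    SRC_TAGS = {"script", "img", "iframe"}

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        src = dict(attrs).get("src")
        if tag in self.SRC_TAGS and src:
            # Resolve relative and protocol-relative URLs against the page URL.
            self.urls.append(urljoin(self.page_url, src))


def base_domain(host):
    # Rough assumption: the last two labels identify the site owner.
    return ".".join(host.lower().split(".")[-2:])


def third_party_hosts(page_url, html):
    """Return the sorted hosts contacted by the page that are not the page's own domain."""
    collector = ResourceCollector(page_url)
    collector.feed(html)
    first_party = base_domain(urlparse(page_url).hostname or "")
    hosts = set()
    for url in collector.urls:
        host = urlparse(url).hostname
        if host and base_domain(host) != first_party:
            hosts.add(host)
    return sorted(hosts)


# Hypothetical page with social-media widgets, for illustration only.
SAMPLE_HTML = """
<html><body>
  <img src="/images/logo.png">
  <script src="https://connect.facebook.net/en_US/sdk.js"></script>
  <script src="https://platform.twitter.com/widgets.js"></script>
  <script src="//assets.pinterest.com/js/pinit.js"></script>
</body></html>
"""

print(third_party_hosts("https://www.example.gov/health/topic.html", SAMPLE_HTML))
```

The site's own logo is skipped because it shares the page's domain. Each remaining host is one the browser contacts simply by loading the page, whether or not the visitor ever clicks a button.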

What’s more, Facebook “Like” buttons record activity “even if you don’t use Facebook,” Libert says. “They actually can keep profiles on people who’ve never signed up for Facebook, never used Facebook. Which is kind of weird.”

Yet social-media firms aren’t the only ones recording your virtual movements. Google had its fingers in some 78 percent of the Web pages Libert analyzed, including the CDC page. Data broker comScore came in second, tracking 38 percent of the Web pages, while Facebook was third with 31 percent.
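Figures like these are shares of pages, not shares of a single pie, which is why they can add up to more than 100 percent: one page can send data to several companies at once. The short Python sketch below shows the arithmetic on made-up data. The page names and company lists are hypothetical and are not drawn from Libert's study.

```python
from collections import Counter


def tracker_prevalence(pages):
    """Return {company: percent of pages that sent at least one request to it}."""
    if not pages:
        return {}
    counts = Counter()
    for companies in pages.values():
        # Count each company once per page, however many requests it received.
        counts.update(set(companies))
    total = len(pages)
    # most_common() orders companies from most to least widespread.
    return {company: 100 * n / total for company, n in counts.most_common()}


# Hypothetical toy data: which companies each page contacted.
pages = {
    "page-a": ["Google", "Facebook", "Google"],
    "page-b": ["Google", "comScore"],
    "page-c": ["Google"],
    "page-d": [],
}

for company, pct in tracker_prevalence(pages).items():
    print(f"{company}: {pct:.0f}% of pages")
```

In this toy set, Google appears on three of four pages, while Facebook and comScore appear on one each, so the shares total 125 percent.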

Libert himself was surprised by his findings. “I thought there was going to be a lot of tracking, but there was actually a little bit more than I even thought. The thing that really surprised me was how much tracking Google had … I didn’t think any one single company would have that much data. A single company has the ability to record the Web activity of a huge number of individuals seeking sensitive health-related information, without their knowledge or consent.”

He adds that such data can give companies the power to discriminate against people who are ill. Some Web surfers could be “segregated into data silos of undesirables” and “excluded from favorable offers and prices.” In addition, poorly protected health information can be lost, stolen, or potentially used to identify individuals against their will. As Joseph Turow C’72 ASC’73 Gr’76, the Robert Lewis Shayon Professor at the Annenberg School, testified before a Senate committee in 2010, data brokers are sophisticated enough to be able to link “anonymous” medical and pharmacy insurance claims information with postal clusters of three-to-eight homes, and sell that information on the open market [“Phantom Privacy,” Sept|Oct 2010].

The American public is uneasy about mysterious third parties gleaning personal information, but doesn’t know exactly how much of it is being leaked. Surveys in the 2000s found that 70 percent of people were “nervous that Websites had information about them,” and 85 percent of Internet users in poor health were concerned that Websites would share their data. However, as recently as 2012, only 36 percent of survey respondents knew advertisers could legally trace their health-related online searches.

Although it may be hard for individuals to protect their health data from “highly motivated and well-funded corporations with cutting-edge technologies,” as Libert puts it, there are steps people can take.

“One thing that helps a lot is preventatively installing two browser plug-ins: one is called Ghostery, and there’s another one called AdBlock Plus,” he says. “They don’t catch everything, but they catch a lot of stuff, and they have a little alert [that] will tell you who is tracking on the page.”

He concedes that there are merits to the advertising model that relies on this sort of tracking. “Advertisements did give us WebMD, and WebMD has a lot of good information. [But] I think the government should have a role in deciding how this information can be used and how long you’re allowed to keep it … Right now, you can do anything you want.”

As for government and NGO Websites, “there’s no excuse to have this type of tracking,” he says. “If you’re WebMD, part of your business is that you advertise to people. But if you’re the CDC, or if you’re AIDS.org or one of these, you get your money either from taxes or from charitable donations … In cases where you’re leaking information but there’s zero justification for it, you should change that tomorrow.”

Libert also suggests that the federal government could work within existing laws to regulate how businesses collect health data, citing an extension of the Children’s Online Privacy Protection Act (COPPA) as a possible solution. “It’s actually illegal to keep information on people younger than 13 without parental notice. So we already have [businesses] knowing they can’t collect information on kids. We could add into that, ‘or sick people.’”

He adds that HIPAA, the federal law that protects certain information people share with their medical providers and insurers, could potentially be extended to cover portions of the virtual realm.

Libert acknowledges that these solutions are not perfect. “But there’s enough that if Congress got serious, they have enough to work with.”

In addition to advocating for more extensive Internet health privacy laws, Libert says he’s also interested in exploring how health-related search data might be “used for good.”

“The advertisers have all this data, but I would be much happier if medical researchers had the data,” he remarks. “Because maybe then they could see that, ‘Oh, there’s diseases happening in this town,’ or ‘We see people who search for this symptom end up searching for this other symptom two weeks later.’ Maybe that could start to inform medicine and medical researchers.

“The data itself is useful for really good things,” he says. “But it’s not being used that way right now.”

—Phoebe Low C’17

2 Responses

Paul Van Dorpe

July 8, 2015 at 3:06 am

Based on the reference from this article, I installed Ghostery into my Firefox browser and reloaded this article. The add-on found (and blocked) 4 different web trackers that thepenngazette.com was using.

Penn Gazette

July 8, 2015 at 4:03 pm

Our website uses a Gravatar plugin to allow a user’s Gravatar image to show with their comment, and three Google products: Google Fonts, Google Custom Search, and Google Analytics.
