Jeremy Gillula, PhD (BS ’06) was halfway through a PhD in computer science at Stanford when he discovered that his true calling lay in digital privacy. “I realized that what kept me up at night wasn’t, ‘Oh my God, will my robot crash and burn?’” says Gillula, who earned his doctorate for work on robotics and machine learning. “It was, ‘Oh my God, I just read this article about some company that is doing terrible things with respect to users’ privacy.’”
That epiphany led to six years at the Electronic Frontier Foundation, a nonprofit dedicated to defending civil liberties in the digital world. Today, Gillula employs his technical expertise and privacy chops as a staff privacy engineer at Google. “My primary responsibility is to look at products from a privacy angle and say, ‘Is this the right thing in terms of privacy for our users? If not, how can we make it better?’” he says.
Here, Gillula talks from his personal perspective as a privacy professional about the latest advances in consumer data management, and what the future of digital privacy might look like.
Consumers today have access to everything from cookie blockers to incognito browsers. Has digital privacy improved in recent years?
Jeremy Gillula: We are light years ahead of where we were a few years ago. People have started to realize just how important privacy is, and, as a result, companies that serve people have realized how important it is to get privacy right. In particular, they have realized that privacy is a selling point—that people will actually make decisions in terms of what products or services they use based on privacy.
Consumer pressure and government regulation are pushing in the direction of greater control over personal data. Yet at the same time, we’re generating more of that data than ever before, and companies are better able to mine it via machine learning.
JG: I wouldn’t necessarily say that those things are strictly at odds. There are techniques now for doing machine learning without actually sending the raw data up to a central server. That’s called federated learning, and it’s a wonderful example of a technique where these things aren’t in tension: You can keep your data locally on your device, you don’t have to share it with some central party, and yet you can still get some benefit from it.
How does that work?
JG: You have a bunch of different users on different devices, and you have an algorithm that runs locally on each device. The algorithm basically says, “Okay, based on your local data, here are the changes I would make to some machine-learned model.” Those changes can be combined with everyone else’s changes to update the model, which can then be shipped back out to individual users. Gboard, the Android keyboard, uses federated learning. We definitely don’t want to collect the stuff that people are typing on their phones. But we can learn locally on your device and send up only the changes to the model that describe how people use language so that we can better predict the next word they are likely to type. That way we don’t have to look at the raw data of what you are typing, but we can still create a good model that will be beneficial for everybody.
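The flow Gillula describes can be sketched in a few lines of Python. This is a toy illustration of federated averaging on a simple linear model, not any production API: `local_update` and `aggregate` are invented names, and the gradient math is deliberately minimal.

```python
# Toy federated-averaging sketch: each device computes a weight delta
# from its own data; the server only ever sees the deltas, never the data.

def local_update(weights, data, lr=0.1):
    """Compute a model delta on one user's device using local data only."""
    delta = [0.0] * len(weights)
    for x, y in data:  # (feature vector, target) pairs held on the device
        pred = sum(w * xi for w, xi in zip(weights, x))
        err = pred - y
        for i, xi in enumerate(x):
            delta[i] -= lr * err * xi  # gradient step, accumulated locally
    return delta  # only this delta leaves the device

def aggregate(weights, deltas):
    """Server averages everyone's deltas into the shared model."""
    n = len(deltas)
    return [w + sum(d[i] for d in deltas) / n for i, w in enumerate(weights)]

# Two devices train on private data the server never sees.
model = [0.0, 0.0]
device_a = [([1.0, 0.0], 1.0)]
device_b = [([0.0, 1.0], 2.0)]
deltas = [local_update(model, device_a), local_update(model, device_b)]
model = aggregate(model, deltas)  # updated model ships back to all users
```

The key property is in the return values: `local_update` emits only a delta, so the raw `(x, y)` pairs stay on the device, which is the point of the technique.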
What other privacy-related changes are on the horizon?
JG: The days of the third-party cookie are numbered. A third-party cookie is essentially a pseudonymous ID made up of random numbers that are assigned to a browser. It’s collected on any website that has embedded content from the third party that created it, which is usually an advertiser. That advertiser can then tell if a user has visited any of those websites. A lot of browsers have already deprecated third-party cookies. That means advertisers are going to have to find a different way to target their ads.
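The tracking mechanism Gillula outlines can be made concrete with a small simulation. Everything here is invented for illustration (the `AdNetwork` class, the site names); it just models a third party whose embedded content sees the same pseudonymous cookie on every site.

```python
# Simulates cross-site tracking via a third-party cookie: one random ID,
# sent to the same ad network from every site that embeds its content.
import secrets

class AdNetwork:
    """Plays the third party whose content is embedded on many sites."""
    def __init__(self):
        self.visit_log = {}  # pseudonymous ID -> list of sites seen

    def serve_embed(self, cookie, site):
        if cookie is None:
            # First contact: assign a random pseudonymous ID.
            cookie = secrets.token_hex(8)
        self.visit_log.setdefault(cookie, []).append(site)
        return cookie  # the browser stores this as a third-party cookie

tracker = AdNetwork()
cookie = None
for site in ["news.example", "shoes.example", "jazz.example"]:
    # Each first-party site embeds the ad network's content,
    # so the browser sends the same cookie every time.
    cookie = tracker.serve_embed(cookie, site)

# The ad network now sees the user's cross-site history under one ID.
print(tracker.visit_log[cookie])
# -> ['news.example', 'shoes.example', 'jazz.example']
```

Deprecating third-party cookies breaks exactly this linkage: without the shared ID, the three visits can no longer be joined into one profile.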
What might that look like?
JG: The Chrome browser has a few different proposals to do that by learning locally in your browser—to look at the sorts of sites you visit, and then say, “Okay, based on these types of sites, I think that you’re interested in shoes or gardening or jazz.”
It’s going from, “I have to watch where this person goes and guess what they’re interested in,” to just asking the browser to do that and saying, “Look, I don’t actually care where they’ve been, I just care what topics they’re interested in so I can send them an ad that they will feel is relevant.”
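The local inference Gillula describes can be sketched as a toy classifier. The topics, keyword lists, and function names below are made up for the example; the point is only the data flow, in which browsing history stays local and just coarse topics are exposed.

```python
# Toy in-browser topic inference: classify visited pages locally and
# expose only coarse interest topics, never the browsing history itself.

TOPIC_KEYWORDS = {
    "shoes": {"sneaker", "boots", "footwear"},
    "gardening": {"seeds", "compost", "perennials"},
    "jazz": {"saxophone", "bebop", "swing"},
}

def infer_topics(visited_pages, threshold=2):
    """Count keyword hits per topic; only topics over the threshold leave the browser."""
    counts = {topic: 0 for topic in TOPIC_KEYWORDS}
    for page_text in visited_pages:  # stays on the device
        words = set(page_text.lower().split())
        for topic, keywords in TOPIC_KEYWORDS.items():
            if words & keywords:
                counts[topic] += 1
    # An advertiser would see only this short list, not the pages.
    return sorted(t for t, c in counts.items() if c >= threshold)

history = [
    "new sneaker releases and boots on sale",
    "best footwear for trail running",
    "learn bebop saxophone licks",
]
print(infer_topics(history))
# -> ['shoes']
```

Note the threshold: a single jazz-related page is not enough to surface "jazz," which mirrors the idea of reporting only stable, coarse interests rather than individual visits.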
Unlike those notices that ask you to set your cookie preferences, it seems like these privacy protections would be invisible.
JG: The goal of modern privacy technology is to become invisible to people who don’t want to think about it, while still giving options to people who do want to directly control their privacy settings. It’s a terrible cognitive load on people when they have to constantly make privacy decisions. It would be much better if we could get them into a good default so they wouldn’t have to worry about it.