The era of big data is already upon us. Organizations far and wide are constantly collecting information on who we are, what we do and how we live. That may seem scary to most people, but to Dr. DJ Patil it provides an opportunity to do good. The White House agreed, and named him earlier this year as the nation’s first Chief Data Scientist. It is an apt title, since Patil himself is credited with coining the term “data scientist” to refer to people researching data and extracting useful insight from the numbers. We had him on the show to explain his new role as well as the balance between useful statistics and security for the American people.
Kai Ryssdal: We’ve been collecting data for years and years and years, but to the layperson out there, it seems like we are only now absorbing the lessons of big data and what we can do with it. What took us so long as a society to figure this out?
DJ Patil: One is making sure that the data people are at the table. You know, the joke that we often say is the most important thing we can learn in life is through watching Star Trek, and so there’s Spock on the bridge. Spock is on the bridge, he’s not in some way-back office. So in the modern boardroom or in policy conversations, why isn’t there a Spock in the room? Who helps us figure out context, interpretation, all these different ideas? That shift, as people have started to put the Spock on the bridge, said we could use data in this novel way, we could do more things with this, and then it’s like, what more can we do with that? Oh, we can build something.
Kai Ryssdal: So how does this apply in government policy terms, then? Day in and day out, when you go to your office, what do you do?
DJ Patil: The responsibility of the chief data scientist is to responsibly unleash the power of data to benefit all Americans. So what does that look like? Number one is asking what type of data do we have, how do we secure it, how do we make it safe, but then how do we return it back to the public? And so one of the best examples is all the incredible data that is being released on Data.gov: 180,000 data sets that people are using in all sorts of manners.
Kai Ryssdal: You understand, though, that the American public by and large hears “government” and “data” and they think, “NSA, and they’re reading my emails and monitoring my phones,” and they don’t see it as a good thing.
DJ Patil: Well, the most important part of this is the question of what does it mean to responsibly unleash the data, and what we have to do is to be constantly having a conversation about what that means. Because there is a tremendous amount of data that we always do want the government to have. In the case of health care, for example, that is data that benefits everybody. If everyone is able to have a safe way to provide their data, we’re gonna be able to learn so much about the type of chronic conditions that are typically undiscovered. There’s all sorts of undiagnosed genetic diseases that we can’t see at this point because we just don’t have enough people contributing to the data.