Earlier this week, participants in The Eric & Wendy Schmidt Data Science for Social Good Fellowship presented their work at 1871, a tech incubator in Chicago.

There is a growing interest in “big data,” the science of collecting large amounts of information and making sense of it.

For a long time, this was the province of companies. Why? Because they have collected so much data – about what we buy, what we spend, and what ads we respond to, among other things. But the field is evolving. In 2008 and 2012, data science became a major part of political campaigns. Now, governments and nonprofits are beginning to see it as a way to tackle social problems.

After a decade digging through corporate data, Rayid Ghani wanted to do something different. “I decided one day just to arbitrarily pick a date to quit,” he says.

Ghani, an expert on information retrieval and machine learning, gave himself three months to find a job, a position with the potential for what he calls “social impact.” A few weeks later, he became the chief scientist on President Obama’s re-election campaign. “It was a fairly random detour,” he says. Ghani and his team did pioneering work on fundraising and analytics.

For now, Ghani is done with politics, but he says that detour was eye-opening. It showed him that data science has other applications. Today, Ghani teaches at the University of Chicago, where he runs The Eric & Wendy Schmidt Data Science for Social Good Fellowship. “The reason I started the program was really looking back at myself and saying, 'I would have loved to be in a program like that.'”

Rayid Ghani directs The Eric & Wendy Schmidt Data Science for Social Good Fellowship at the University of Chicago.

Forty-eight of the world’s brightest data scientists – students from more than 30 universities – have spent 12 weeks in Chicago. They have worked with community organizations, nongovernmental organizations and governments to tackle social problems, including maternal mortality and homelessness.

“They had no idea that those problems have data, and that their skills can be helpful in solving those problems,” Ghani says.

The fellows set up shop in an unfinished office space in downtown Chicago and started programming in Python and R. Those are the hammer and hacksaw in a data scientist’s toolbox. The fellows made themselves at home. In the kitchen, behind a stack of boxes, there is coffee and a half-eaten pie, and tucked in a corner, there is a ping-pong table.

The outside groups (the Sunlight Foundation, WikiEnergy and Chicago Public Schools, to name a few) gave the fellows two things: problems to tackle and data. Lots and lots of data.

This week, participants in the program gathered at 1871, a tech incubator high above the Chicago River, to show off what they accomplished. One group devised a new way for the World Bank to look for corporate collusion in development projects. Right now, that organization has to rely on whistleblowers. The fellows found a way for the Bank to flag contracts where corporate collusion is most likely to occur.

Another group helped the Chicago Department of Public Health to pinpoint homes where children are at the highest risk of lead poisoning. Right now, the Department has a list of tens of thousands of housing units where kids could be at risk. The fellows narrowed that down considerably, creating a model that suggests the Department could focus its attention on 378 units where the risk is highest. That would take just two months, and it would cost the city less than $200,000.

After the presentations, the data scientists fielded questions. Sam Zhang, who worked on a way to improve outreach to Americans who don’t have health insurance, said he and his fellow data scientists had a lot to learn about the organizations they partnered with. “I came, and I didn’t know what Medicare, Medicaid, what the health care subsidies were,” he admitted. “The first few weeks were 'print out a bunch of papers about it, and just read.'”

And those organizations, he says, had a lot to learn about big data: what it can do, and what someone trained in analytics or econometrics can do with it. Zhang went to Swarthmore College, and he acknowledged he could have wound up in Silicon Valley. “The thing is Rayid is trying to steal us away from the people optimizing ad click,” he said. “And I think he’s pretty successful.”

Some fellows will return to Ph.D. programs, and a handful will remain in Chicago, where they will continue to work on these projects. Ghani’s hope is all of them will leave with a new sense of what data can do.

Follow David Gura at @davidgura