By: Gautam Narula
Every day, Internet users generate millions of gigabytes of data. Every time someone clicks a link, visits a website, uses an app, or makes a phone call, data is created. These actions are tracked, recorded, and added to increasingly large datasets. The creation of these truly massive datasets and the newfound ability to analyze them, thanks to cheap hardware and improved algorithms, have led observers to call this phenomenon “Big Data.” The consulting firm McKinsey called Big Data the “next frontier for innovation, competition, and productivity.”
Technology companies have been well aware of Big Data and its implications for the past decade. Google and Facebook comb their billions of users’ actions to personalize ads and search results. Netflix and Amazon analyze users’ browsing and reviewing histories to offer personalized recommendations on new products. Palantir Technologies, a Silicon Valley startup, has worked with the CIA and the FBI to analyze and integrate intelligence data to disrupt terrorist networks and combat credit card fraud. However, 2012 was the first time that large-scale data analytics entered the political arena, leading some observers to dub it the “Year of Big Data.”
As people share increasing amounts of personal information online through their social networks, purchasing history, and browsing habits, companies and political campaigns alike will learn much more about them. Nate Silver, a statistician blogging for the New York Times, used aggregated data analysis to consistently predict Barack Obama’s electoral victory to be a near certainty, drawing the ire of political pundits pushing “too close to call” narratives in opinion columns and nightly talk shows. Joe Scarborough, a conservative political commentator with MSNBC, criticized Silver’s analysis, asserting that “anybody that thinks that this race is anything but a tossup right now is such an ideologue, they should be kept away from typewriters, computers, laptops and microphones for the next 10 days, because they’re jokes.” Indirectly referencing the furor over Silver’s predictions in his New York Times column, David Brooks wrote, “If there’s one thing we know, it’s that even experts with fancy computer models are terrible at predicting human behavior.” But come Election Day, Silver correctly predicted the winner of all fifty states and the District of Columbia.
The 2012 election also saw both the Obama and Romney campaigns using complex data analysis algorithms operating on massive voter databases. These algorithms integrated data from multiple sources to create targeted, personalized ad campaigns and identify swing voters most susceptible to campaign advertising. The campaigns spent tens of millions of dollars hiring teams of data scientists to merge offline and online data to create sophisticated profiles of potential voters: how likely a voter was to donate, how often an individual talked to friends about politics, what messaging would be most effective to persuade an indecisive voter, and more.
This is just the tip of the iceberg. In the midst of the recent debate on gun control, some have suggested creating a massive database of all gun and ammunition purchases, with data algorithms identifying unusual patterns in purchase history and flagging individuals most likely to attack others with guns. As Republicans and Democrats battle over deficits and budget cuts, data mining may play a vital role in reducing fraud and waste in government expenditures. From education to healthcare to counterterrorism, Big Data promises to transform many of the issues currently debated on Capitol Hill.
The greatest obstacle to the rise of data crunchers is concern about privacy and the amount of information corporations, campaigns, and governments have on individuals. Target was caught in a public relations nightmare when its algorithms sent pregnancy-related coupons to a teenage girl’s house before she had told her father about the pregnancy. The Federal Trade Commission now requires Facebook to undergo privacy audits for the next two decades after the social network provided user data to third parties for advertising and app development. At the same time that the Obama administration was pushing for greater online privacy protections, the Obama campaign was collecting vast amounts of data about potential voters through the very same methods the administration was trying to impede. As data collectors become increasingly aggressive in collecting information, policymakers face the difficult task of respecting privacy rights without choking off the power of Big Data.
The politics of the future will be a much more precise affair: more evidence-based, more targeted, and more invasive. In the coming years, the advanced techniques employed by the Obama and Romney campaigns will trickle down to statewide and local races. Elections will be decided not on eloquent speeches or slick photo ops, but on the volume of data gathered. Political pundits will eventually have to incorporate rigorous metrics into their analysis, or risk embarrassment and ridicule from the Nate Silvers of the world. Welcome to the age of Big Data.