Big Data


"For me, the technological definitions (like "too big to fit in an Excel spreadsheet" or "too big to hold in memory") are important, but aren't really the main point. Big data for me is data at a scale and scope that changes in some fundamental way (not just at the margins) the range of solutions that can be considered when people and organizations face a complex problem. Different solutions, not just 'more, better.'" --Steven Weber, School of Information, UC Berkeley, datascience@berkeley

"These new resources represent a new kind of data that will enable transformative research on demographic and economic change and the spatial organization of society." --Steven Ruggles, Demography, 2014

"Big data" is a term that has seen increasing use across many disciplines, under a wide variety of definitions. The phrase was added to the Oxford English Dictionary in 2013: "data of a very large size, typically to the extent that its manipulation and management present significant logistical challenges; (also) the branch of computing involving such data."

Gil Press, a contributing writer for Forbes, outlines the origins of big data and surveys a variety of popular definitions. One such definition comes from academic and business analyst Tom Davenport: "the broad range of new and massive data types that have appeared over the last decade or so." Press's article on definitions also points to a longer list of big data definitions from leaders in academia and industry, compiled by the School of Information at UC Berkeley.

Nature published several insightful pieces in a special issue dedicated to big data. Clifford Lynch, executive director of the Coalition for Networked Information and a professor at UC Berkeley's School of Information, contributed an opinion piece called "How do your data grow?" In it, Dr. Lynch explains the important role that preservation institutions play in making the most of big data, citing NIH's National Center for Biotechnology Information, our own Inter-university Consortium for Political and Social Research, and the Protein Data Bank as examples of stable archive collections.

Many experts in data-driven disciplines seem to agree that, if properly handled, big data has the power to revolutionize the way we understand our increasingly digital world.

Past Big Data Workshops and Symposia

Below is a brief list of past meetings concerning big data and innovations in data science. In addition, DSDR provides Examples of Big Data Initiatives and Funding Projects at various universities, companies, and organizations. These lists are by no means exhaustive; they offer a brief introduction to the kinds of initiatives that are underway.

  • The Networking and Information Technology Research and Development (NITRD) Program "provides a framework in which many Federal agencies come together to coordinate their networking and information technology (IT) research and development (R&D) efforts" (NITRD website). The 2015 NITRD Big Data Strategic Initiative Workshop was hosted at Georgetown University on January 23, 2015. This workshop brought together an interdisciplinary core of leaders to work on creating a Federal Big Data Research Agenda. Links to videos of the workshop can be found on the website.

  • Georgetown University and the White House sponsored a big data workshop at Georgetown in June 2014. The workshop was titled "Improving Government Performance in the Era of Big Data: Opportunities and Challenges for Federal Agencies." Speakers focused on the opportunities and challenges ahead for federal agencies in light of the increasing availability of massive data sets. Videos of the panel discussions and more information about the speakers can be found online.

  • The Michigan Institute for Data Science (MIDAS) at the University of Michigan hosted its annual symposium in April 2014, bringing together computational and data science experts to discuss innovative methodologies and applications of big data.

References

Allen, Corey. 2015. "How Big Data Can Improve Healthcare." UBC News, January 8. Retrieved March 11, 2015.

"big data." The Oxford English Dictionary. 2015. Retrieved March 11, 2015.

Boyd, Danah, and Kate Crawford. 2012. "Critical Questions for Big Data." Information, Communication and Society 15(5): 662-679. doi: 10.1080/1369118X.2012.678878.

"Community Cleverness Required." 2008. Nature 455(7209): 1. doi: 10.1038/455001a.

Graham-Rowe, Duncan. 2008. "Big Data: The Next Google." Nature 455(7209): 8-9. doi: 10.1038/455008a.

Dutcher, Jenna. 2014. "What Is Big Data?" Berkeley School of Information. datascience@berkeley Blog. Retrieved March 11, 2015.

Executive Office of the President. 2014. Big Data: Seizing Opportunities, Preserving Values.

Lynch, Clifford. 2008. "How Do Your Data Grow?" Nature 455(7209): 28-29. doi: 10.1038/455028a.

Press, Gil. 2013. "A Very Short History of Big Data." Forbes Technology, May 9. Retrieved March 11, 2015.

Press, Gil. 2014. "12 Big Data Definitions: What's Yours?" Forbes Technology, September 3. Retrieved March 11, 2015.

Ruggles, Steven. 2014. "Big Microdata for Population Research." Demography 51(1): 287-297.

Waldrop, Mitch. 2008. "Wikiomics." Nature 455(7209): 22-25. doi: 10.1038/455022a.