Data Science


Early detection, early understanding, and early decision are pivotal to efforts to prevent or mitigate major crises, including outbreaks of mass atrocity. A knowledge system is necessary to anticipate these crises. Much of the work to build this system requires engagement of seasoned human expertise in contextual issues. A necessary component of this system, however, is a technological capacity for scanning, gathering, sifting, analyzing and portraying data from the myriad sites now available and searchable by computer applications. This proposal speaks to the potential for developing this capacity for early crisis warning at Harvard University.

The world community can anticipate a near future marked by massive natural disasters, many impelled by climate change, and severe large-scale wars and refugee crises. The great challenge as these crises evolve is to mobilize the skills, manpower, and goods and services to promote effective early intervention. That mobilization requires multi-national political will.

Yet behind that challenge of political will leading to effective mobilization of resources lies the problem of effective knowledge. All reasonable actors, at all levels of governance, now recognize that moving swiftly and productively to forestall these major crises requires knowing what is happening to whom, in what locations, and on what escalating time frame. Crafting a comprehensive strategy for early warning and early intervention demands a curated body of knowledge, organized so as to create the capacity for complex ascertainment.

In the last 10 years, the political and technological groundwork has been laid to support this grand idea. The disaster, genocide, and conflict communities have created a common language and common understandings about the roles of prevention, early warning, and early intervention. The disaster, conflict, and atrocity escalation scenarios are well understood at both the small-group and population levels.

What is missing is the ability to obtain relevant information within an actionable time frame. Crisis mapping, with all its bells and whistles, has evolved to enlist the skills and energies of crisis responders and technological experts. It has now become a basic descriptive tool used by governments and major international agencies seeking to assess their reach and outstanding gaps in any disaster or conflict response operation, primarily at the population level. Crisis mapping, however, is deployed only after a major crisis has broken out.

To date, the potential of Big Data to support this capacity for early as well as complex ascertainment has not been realized. The task is to marshal and organize the massive amounts of information and data (now existing in many places and formats, much of it not digitized) into structured and searchable information that allows an end user to analyze the urgency and leverage points in a specific crisis escalation scenario within a time frame in which action will make a difference.

Burden of War Database / Syrian Database

Early Signal for Mass Atrocity Platform (ESMAP)

Building on Dr. Jennifer Leaning’s extensive research in humanitarian emergencies, this project seeks to build prediction models for signaling mass atrocities by monitoring hate speech and other indicators in traditional and modern communication channels.

ESMAP will be populated with data from traditional sources (news reports, texts, television, and radio) as well as from newer channels such as social media platforms, including Facebook and Twitter, and will be used to generate tools that map escalation in violence against targeted communities. The project will focus particularly on state-sponsored or state-aided violence.
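
As a purely illustrative sketch of what one such monitoring tool might look like, the Python below counts, week by week, the share of collected messages that contain a term from a watchlist and flags weeks in which that share jumps well above its earlier average. Everything in it is an assumption made for illustration: the placeholder watchlist, the function names, the whitespace tokenization, and the doubling threshold are hypothetical and are not drawn from the project itself, which would rely on expert-curated indicators and far richer models.

```python
from collections import Counter
from datetime import date

# Hypothetical placeholder watchlist; a real lexicon would be curated by experts.
WATCHLIST = {"flagged_term_a", "flagged_term_b"}


def weekly_flagged_share(messages):
    """Map (ISO year, ISO week) to the share of messages containing a watchlist term.

    `messages` is an iterable of (datetime.date, text) pairs.
    """
    totals, flagged = Counter(), Counter()
    for day, text in messages:
        iso = day.isocalendar()
        week = (iso[0], iso[1])
        totals[week] += 1
        # Naive whitespace tokenization, used here only to keep the sketch short.
        if WATCHLIST & set(text.lower().split()):
            flagged[week] += 1
    return {week: flagged[week] / totals[week] for week in sorted(totals)}


def escalating_weeks(shares, factor=2.0):
    """Return weeks whose share is nonzero and at least `factor` times
    the mean share of all earlier weeks. The first week is never flagged."""
    alerts, history = [], []
    for week, share in shares.items():
        if history:
            baseline = sum(history) / len(history)
            if share > 0 and share >= factor * baseline:
                alerts.append(week)
        history.append(share)
    return alerts


if __name__ == "__main__":
    sample = [
        (date(2024, 1, 1), "ordinary news report"),
        (date(2024, 1, 2), "flagged_term_a appears here"),
        (date(2024, 1, 3), "another ordinary report"),
        (date(2024, 1, 4), "ordinary again"),
        (date(2024, 1, 8), "flagged_term_a and flagged_term_b"),
        (date(2024, 1, 9), "flagged_term_b repeated"),
        (date(2024, 1, 10), "ordinary"),
        (date(2024, 1, 11), "flagged_term_a"),
    ]
    print(escalating_weeks(weekly_flagged_share(sample)))
```

A simple ratio against the running average is used here only because it is easy to inspect; it says nothing about context, intent, or source reliability, which is where the human expertise described above would remain essential.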

Given the exponential growth of (big) data in recent decades, this project will require an interdisciplinary application of tools from data science, computational modelling, and the humanities to develop and test its prediction algorithms.