A New Global Dataset of Conflict Zones: Using Machine Learning for Mapping Armed Conflict

IDE Research Columns

Kyosuke KIKUTA
Institute of Developing Economies, JETRO

December 2022

How can we map the zones of armed conflicts? Kikuta (2021) applied a machine learning method that flexibly accounts for the various shapes of conflict zones and made a new open dataset, wzone, available for all countries for 1989–2019. The new dataset helps us better answer important questions about armed conflicts, such as what their environmental impacts are. In fact, while Daskin and Pringle (2018) suggest that armed conflicts have devastating impacts on biodiversity, I do not find such a large impact with the wzone dataset. This finding implies that we need to understand “where” armed conflicts occur to answer “why” they occur. Moreover, the new method can potentially be applied to other mapping exercises, such as mapping crime zones and the spread of infectious diseases.

Even as we approach the second quarter of the 21st century, the world is still not free from armed conflict. In 2021, about 120,000 people were killed in 277 armed conflicts in 59 countries. Conflicts continue in Ukraine, Syria, Afghanistan, Myanmar, and the Democratic Republic of the Congo, and they have spread to nearly all regions of the world (Sundberg and Melander 2013). To examine these conflicts, researchers have compiled global datasets of conflict events, including the Uppsala Conflict Data Program Georeferenced Event Dataset (UCDP GED) and the Armed Conflict Location and Event Data (ACLED). It may therefore appear that we already have a good understanding of where conflicts happen, a question that must be answered to understand their consequences. It turns out, however, that answering the “where” question is harder than one might think.

Why We Cannot See Conflict Zones

So what is the problem? It stems from the fact that the existing datasets identify only the locations of conflict events. The UCDP GED, for instance, reports an event that occurred on July 31, 2017, “at least 20 killed in Shiite mosque attack in Afghanistan's Herat” (Sundberg and Melander 2013), along with the latitude and longitude of the mosque. This sounds like very precise information. However, it does not tell us which locations were not affected by the event. If a person was 1 cm away from the mosque, can we say that they were unaffected by the conflict? Probably not. What if they were 1 m away from the mosque? What about 10 m? 100 m? 1 km? Without answering these questions, we have only a partial understanding of the conflict zone. We know the “point” locations of conflict events, but we do not know the “zonal” areas of the conflict. This gap matters because the effects of conflicts may not be contained in a point location. Conflicts, for instance, can damage the natural environment over broader areas.

Researchers have not been silent on this problem. The UCDP, for instance, provides a dataset called “UCDP GED Polygons” (Croicu and Sundberg 2012). The basic idea is straightforward: we gather all events for a certain conflict, connect the outer edges of the event locations, and thus map a conflict zone (see the left square of Figure 1). However, the problem is not that simple. Consider the right square of Figure 1, where the events are polarized into two locations on the edges. If we simply connected the outer edges, we would mistakenly include the central area, where no event is reported. The simple method can thus give a misleading picture of a conflict.

Figure 1. Stylized Examples of UCDP GED Polygons

Source: The Author
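
The pitfall in the right square of Figure 1 can be reproduced with a few lines of code. The following Python sketch uses a convex hull as a stand-in for “connecting the outer edges” of the event locations. That choice, the toy coordinates, and the function names (`convex_hull`, `inside_hull`) are assumptions made for illustration and are not taken from the UCDP GED Polygons codebook.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def inside_hull(hull, q):
    """True if q lies inside or on the boundary of a counter-clockwise convex polygon."""
    n = len(hull)
    return all(cross(hull[i], hull[(i + 1) % n], q) >= 0 for i in range(n))


# Hypothetical event locations, polarized into two groups on opposite edges.
west = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
east = [(9.0, 0.0), (9.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
events = west + east

hull = convex_hull(events)

# A central location where no event was reported.
center = (5.0, 0.5)
center_in_zone = inside_hull(hull, center)
nearest_event_distance = min(
    ((x - center[0]) ** 2 + (y - center[1]) ** 2) ** 0.5 for x, y in events
)

print("hull vertices:", hull)
print("center counted as conflict zone:", center_in_zone)
print("distance from center to nearest event:", round(nearest_event_distance, 2))
```

In this toy example the hull is the full 10-by-1 rectangle, so the central point is classified as part of the conflict zone even though the nearest reported event is about four units away.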

Let Us Use Machine Learning to Solve the Problem

In one article (Kikuta 2022), I address these problems by using a machine learning technique, the so-called “one-class support vector machine” (OC-SVM), which allows for more flexible mapping. Although the method involves many technicalities (I ask interested readers to refer to the article for details), the idea is actually simple. The method draws zones that include as many event locations as possible while minimizing the overall size of the zone. If we were to use a large conflict zone (say, a zone covering the entire square in Figure 1), the zone would include all of the points but also many redundant areas. By contrast, if we were to use a tiny zone (say, a tiny circle that includes only one point in Figure 1), the zone would have no redundant areas but would omit all of the points except one. The method strikes a balance between these two extremes and gives an “optimal” conflict zone. Moreover, the method can draw any shape, whether it is linear, curvy, or separated into multiple parts. In Kikuta (2022), I show that the method indeed provides more accurate zones than existing methods, including UCDP GED Polygons and even methods from other academic fields.
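
To make the idea concrete, the sketch below fits a one-class SVM to polarized toy data of the kind shown in the right square of Figure 1, using scikit-learn's `OneClassSVM`. It is only an illustration of the technique, not the implementation behind wzone: the planar coordinates, the helper `make_cluster`, and the parameter values (`gamma=1.0`, `nu=0.1`) are assumptions chosen for the example, and real event data would need tuning and proper handling of longitude and latitude.

```python
import numpy as np
from sklearn.svm import OneClassSVM


def make_cluster(cx, cy, half_width=0.5, steps=5):
    """Hypothetical helper: a small square grid of event locations around (cx, cy)."""
    offsets = np.linspace(-half_width, half_width, steps)
    return np.array([(cx + dx, cy + dy) for dx in offsets for dy in offsets])


# Toy events polarized into two groups, far apart from each other.
events = np.vstack([make_cluster(0.0, 0.0), make_cluster(10.0, 0.0)])

# Illustrative parameters (assumptions, not values from the article):
# nu bounds the share of events allowed to fall outside the zone,
# gamma sets how tightly the RBF kernel wraps the boundary around events.
model = OneClassSVM(kernel="rbf", gamma=1.0, nu=0.1).fit(events)

# Probe the middle of each group and the empty area between them.
probes = np.array([[0.0, 0.0], [10.0, 0.0], [5.0, 0.0]])
labels = model.predict(probes)  # +1 = inside the estimated zone, -1 = outside
in_zone = labels == 1

for point, flag in zip(probes.tolist(), in_zone.tolist()):
    print(point, "inside zone" if flag else "outside zone")
```

Here `nu` governs the trade-off described above: it caps the share of events that may be left outside the zone, while the kernel width `gamma` controls how tightly the boundary wraps around the events. With these settings the two groups form separate zones and the empty area between them is left out.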

I applied the method to the UCDP GED to create a new global dataset of conflict zones, wzone. The data cover each day from 1989 to 2019 worldwide. Even better, the dataset is free and open for public use (Kikuta 2021; it can be downloaded from Harvard Dataverse). The following video shows a timelapse of the conflict zones. As you can see, the conflicts differ greatly in their sizes, shapes, and durations. Unlike previous methods, the new method accommodates various shapes of conflict zones and provides a basis for further analysis. (Can you identify the very large conflict zone appearing from 2012? It reflects international terrorist activity by the Islamic State, which is not confined to a particular geographic zone.)

Movie: Zones of Armed Conflicts, 1989-2018 (version 3.0)
(Created by the Author)

The New Data Make a Difference

I hope that readers agree on the usefulness of the new dataset, wzone, which provides a better picture of armed conflicts. A remaining question is whether the new data can change our understanding of conflicts. To answer this question, I re-analyze a study published in Nature. Daskin and Pringle (2018) examined whether armed conflicts harmed biodiversity. Using the UCDP GED Polygons, they found that, compared to areas outside of conflict zones, the conflict zones showed a sharp decrease in mammal populations. Their analysis indicated that five years of armed conflict had eradicated over 99% of the initial populations. It would appear that armed conflicts have had devastating impacts on biodiversity. Their findings were so shocking that many media outlets, such as The New York Times, The Economist, and The Washington Post, cited the article.

However, it turns out that their findings are built on a weak foundation. Once I replace the UCDP GED Polygons with wzone and replicate the analysis, armed conflicts no longer have as drastic an impact as reported in the study. In the new estimate, five years of armed conflict decrease the mammal population by 41%. Although this is still large, the estimate is also highly uncertain: the effect could be as devastating as total extinction, but armed conflicts could also double the size of the mammal population. Statistically, we cannot say which is true. As seen in this and other analyses in Kikuta (2022), it is crucial to understand where armed conflict happens in order to discuss its impacts. The dataset of conflict zones, for instance, allows us to quantify the damage to biodiversity, forests, and economic activities arising from conflict.

The new method and data have various other potential uses as well. For instance, even though many governments provide travel advisories, these tend to be based on qualitative observations and are available only at the district level. Incorporating quantitative data into those advisories can provide more fine-grained information and also ease the administrative burden of manually collecting the data. Furthermore, the method can be applied to any data with locational information (i.e., longitude and latitude). It can, for instance, be used to map crime zones, the market areas of retailers, and zones of infectious diseases. I hope that this article sheds new light on the geography of armed conflicts and, more broadly, on zoning methods in the social sciences.

References

Croicu, Mihai Catalin, and Ralph Sundberg. 2012. UCDP GED Conflict Polygons Dataset Codebook Version 1.1 (Dataset). Uppsala: Uppsala University. https://ucdp.uu.se/downloads/old/polygon/ucdp-ged-polygons-v-1-1-codebook.pdf

Daskin, Joshua H., and Robert M. Pringle. 2018. “Warfare and Wildlife Declines in Africa’s Protected Areas.” Nature 553 (7688): 328–32. https://doi.org/10.1038/nature25194

Kikuta, Kyosuke. 2021. WzoneData: Zones of Armed Conflicts (Dataset). Cambridge, MA: Harvard Dataverse. https://doi.org/10.7910/DVN/PUWJEU

Kikuta, Kyosuke. 2022. “A New Geography of Civil War: A Machine Learning Approach to Measuring the Zones of Armed Conflicts.” Political Science Research and Methods 10 (1): 97–115. https://doi.org/10.1017/psrm.2020.16

Sundberg, Ralph, and Erik Melander. 2013. “Introducing the UCDP Georeferenced Event Dataset.” Journal of Peace Research 50 (4): 523–32. https://doi.org/10.1177/0022343313484347

Author’s Profile

Kyosuke Kikuta is an Associate Senior Research Fellow at the Institute of Developing Economies (Ph.D. in Government, University of Texas at Austin). His research interest is in the quantitative analysis of conflict. His articles have been published or are forthcoming in the Journal of Politics, International Organization, Journal of Conflict Resolution, and Political Science Research and Methods, among others.
