Decoding the EM-DAT Natural Disaster Dataset Hierarchy

A couple weeks ago, I authored an article on Getting Started with the International Disaster Database (EM-DAT) using Python and Pandas.

While that article briefly reviewed the hierarchical fashion in which EM-DAT organizes natural disaster types, today I want to:

  1. Explore the EM-DAT hierarchy in depth
  2. Provide Python and Pandas code to understand the natural disaster hierarchy
  3. Demonstrate how to quickly filter natural disaster types in EM-DAT

By the end of this article, you’ll have a strong understanding of how the EM-DAT database organizes natural disasters in a hierarchical manner.

Download the code and EM-DAT dataset

For the sake of simplicity and reproducibility, I am using the Kaggle-hosted version of the EM-DAT dataset, which is freely available to download and use.

If you instead use the official EM-DAT version from CRED (which requires registration), the results of running my code may look slightly different.

Keep these potential discrepancies in mind when running your own experiments.

A quick review of natural disaster hierarchy in the EM-DAT database

Figure: The EM-DAT dataset organizes natural disasters into a hierarchy (image credit)

The EM-DAT database catalogs over 25,000 mass disasters from the year 1900 to the present day, including a total of 58 unique disaster types (e.g., flood, hurricane, tornado, etc.).

As I mentioned in my introductory article on the EM-DAT dataset, EM-DAT organizes natural disasters in a hierarchical fashion, making it (theoretically) easier for data scientists to navigate the dataset.

I say “easier” because working with EM-DAT has a bit of a learning curve, one that can only be overcome by exploring the data.

The hierarchical structure of EM-DAT allows you to drill down into natural disaster types based on the following five columns:

  1. Disaster Group (typically ignored since all rows have the same value, “Natural”)
  2. Disaster Subgroup
  3. Disaster Type
  4. Disaster Subtype
  5. Disaster Subsubtype
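As a rough sketch of how these five columns relate, the snippet below counts the distinct values at each level. It uses a small, hypothetical toy dataframe (the rows and the `HIERARCHY_COLUMNS` name are my own stand-ins, not part of EM-DAT), so the counts are illustrative only:

```python
import pandas as pd

# the five hierarchy columns, ordered from broadest to most specific
HIERARCHY_COLUMNS = [
    "Disaster Group",
    "Disaster Subgroup",
    "Disaster Type",
    "Disaster Subtype",
    "Disaster Subsubtype",
]

# hypothetical toy rows standing in for the real EM-DAT dataframe
toy_df = pd.DataFrame(
    [
        ["Natural", "Meteorological", "Storm", "Convective storm", "Tornado"],
        ["Natural", "Meteorological", "Storm", "Tropical cyclone", None],
        ["Natural", "Hydrological", "Flood", None, None],
        ["Natural", "Hydrological", "Landslide", "Avalanche", None],
    ],
    columns=HIERARCHY_COLUMNS,
)

# count the distinct, non-missing values at each level of the hierarchy
levels = toy_df[HIERARCHY_COLUMNS].nunique()
print(levels)
```

Running the same `nunique()` call on the real dataframe should show a single value for "Disaster Group" and progressively more values as you move down the hierarchy.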

The best way to fully comprehend this hierarchical structure is with a series of examples.

To start, we can load the EM-DAT dataset from disk:

# import the necessary packages
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import os

# specify the path to the EM-DAT dataset
emdat_dataset_path = os.path.join(
    "natural-disasters-data",
    "em-dat",
    "EMDAT_1900-2021_NatDis.csv"
)

# load the EM-DAT natural disasters dataset from disk
df = pd.read_csv(emdat_dataset_path)
df.tail()

Our dataframe is now ready for analysis:

| | Dis No | Year | Seq | Disaster Group | Disaster Subgroup | Disaster Type | Disaster Subtype | Disaster Subsubtype | Event Name | Entry Criteria | Country | ISO | Region | Continent | Location | Origin | Associated Dis | Associated Dis2 | OFDA Response | Appeal | Declaration | Aid Contribution | Dis Mag Value | Dis Mag Scale | Latitude | Longitude | Local Time | River Basin | Start Year | Start Month | Start Day | End Year | End Month | End Day | Total Deaths | No Injured | No Affected | No Homeless | Total Affected | Reconstruction Costs ('000 US$) | Insured Damages ('000 US$) | Total Damages ('000 US$) | CPI |
| 15822 | 2020-0031-ZMB | 2020 | 31 | Natural | Hydrological | Flood | NaN | NaN | NaN | Affected | Zambia | ZMB | Eastern Africa | Africa | Gwembe, Siavonga, Mambwe and Lumezi districts | Heavy rains | NaN | NaN | NaN | NaN | NaN | NaN | NaN | Km2 | NaN | NaN | NaN | NaN | 2020 | 1.0 | NaN | 2020 | 1.0 | NaN | NaN | NaN | 1500.0 | NaN | 1500.0 | NaN | NaN | NaN | NaN |
| 15823 | 2020-0110-ZMB | 2020 | 110 | Natural | Hydrological | Flood | NaN | NaN | NaN | Affected | Zambia | ZMB | Eastern Africa | Africa | Samfya, Mushindamo, Nakonde districts (Luapula province) | Heavy rains | NaN | NaN | NaN | NaN | NaN | NaN | NaN | Km2 | NaN | NaN | NaN | NaN | 2020 | 3.0 | 20.0 | 2020 | 3.0 | 26.0 | NaN | NaN | 700000.0 | NaN | 700000.0 | NaN | NaN | NaN | NaN |
| 15824 | 2021-0036-ZWE | 2021 | 36 | Natural | Meteorological | Storm | Tropical cyclone | NaN | Tropical cyclone 'Eloise' | Kill | Zimbabwe | ZWE | Eastern Africa | Africa | Eswatini | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | Kph | NaN | NaN | NaN | NaN | 2021 | 1.0 | 23.0 | 2021 | 1.0 | 23.0 | 3.0 | NaN | 1745.0 | NaN | 1745.0 | NaN | NaN | NaN | NaN |
| 15825 | 2020-0131-TLS | 2020 | 131 | Natural | Hydrological | Flood | Riverine flood | NaN | NaN | Affected | Timor-Leste | TLS | South-Eastern Asia | Asia | Cristo Rei, Nain Feto, Dom Aleixo, and Vera Cruz (Dili municipality) | Heavy rains | NaN | NaN | NaN | NaN | NaN | NaN | NaN | Km2 | NaN | NaN | NaN | NaN | 2020 | 3.0 | 13.0 | 2020 | 3.0 | 13.0 | 3.0 | 7.0 | 9124.0 | NaN | 9131.0 | NaN | NaN | 20000.0 | NaN |
| 15826 | 2020-0362-SSD | 2020 | 362 | Natural | Hydrological | Flood | NaN | NaN | NaN | Affected | South Sudan | SSD | Northern Africa | Africa | Bor South, Twic East, Duk, Ayod Countie (Jonglei); Renk county (Eastern Nile); Pochallla county (Pibor); Lakes, Unity, Upper Nile, Warra, Western Equatoria, Central Equatoria, Northern Bahr-el-Ghazal | Heavy rains | NaN | NaN | NaN | NaN | NaN | NaN | NaN | Km2 | NaN | NaN | NaN | White Nile, Akobo River | 2020 | 7.0 | NaN | 2020 | 12.0 | NaN | NaN | NaN | 1042000.0 | NaN | 1042000.0 | NaN | NaN | NaN | NaN |

Let’s now move on to exploring the hierarchical structure of the EM-DAT dataset.

Disaster Group

Figure: EM-DAT: Disaster Group (image credit)

The base of the EM-DAT hierarchy starts with the “Disaster Group” column:

# display the disaster groups
df["Disaster Group"].unique()

However, this is an uninteresting place to start since this column has only a single value (i.e., “Natural”):

array(['Natural'], dtype=object)

For this reason, and for all practical purposes, we typically consider the “Disaster Subgroup” column to be our starting point of the EM-DAT hierarchy.

Disaster Subgroup

The following code snippet allows us to explore all possible “Disaster Subgroups” in EM-DAT:

# display the natural disaster subgroups
df["Disaster Subgroup"].unique()

Which gives us:

array(['Climatological', 'Geophysical', 'Meteorological', 'Hydrological',
       'Biological', 'Extra-terrestrial'], dtype=object)

Sorted alphabetically, the six Disaster Subgroups are:

  1. Biological
  2. Climatological
  3. Extra-terrestrial (yes, EM-DAT includes a single data point on extra-terrestrial natural disasters)
  4. Geophysical
  5. Hydrological
  6. Meteorological
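To see how many events fall under each subgroup (rather than just which subgroups exist), `value_counts()` is a natural companion to `unique()`. The sketch below uses a hypothetical toy Series in place of `df["Disaster Subgroup"]`, so the numbers are made up for illustration:

```python
import pandas as pd

# hypothetical toy values standing in for df["Disaster Subgroup"]
toy_subgroups = pd.Series(
    [
        "Meteorological",
        "Hydrological",
        "Meteorological",
        "Geophysical",
        "Meteorological",
        "Extra-terrestrial",
    ],
    name="Disaster Subgroup",
)

# count the rows under each subgroup, most frequent first
subgroup_counts = toy_subgroups.value_counts()
print(subgroup_counts)
```

On the real dataset, the same call would be `df["Disaster Subgroup"].value_counts()`.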

For example, we can grab all “Meteorological” natural disasters from the EM-DAT dataset using the following code:

# grab all rows that are part of the 'meteorological' disaster subgroup
df_meteo = df[df["Disaster Subgroup"] == "Meteorological"]
df_meteo.tail()

As our output dataframe shows, we’ve successfully filtered all “Meteorological” events:

| | Dis No | Year | Seq | Disaster Group | Disaster Subgroup | Disaster Type | Disaster Subtype | Disaster Subsubtype | Event Name | Entry Criteria | Country | ISO | Region | Continent | Location | Origin | Associated Dis | Associated Dis2 | OFDA Response | Appeal | Declaration | Aid Contribution | Dis Mag Value | Dis Mag Scale | Latitude | Longitude | Local Time | River Basin | Start Year | Start Month | Start Day | End Year | End Month | End Day | Total Deaths | No Injured | No Affected | No Homeless | Total Affected | Reconstruction Costs ('000 US$) | Insured Damages ('000 US$) | Total Damages ('000 US$) | CPI |
| 15807 | 2020-0425-VNM | 2020 | 425 | Natural | Meteorological | Storm | Tropical cyclone | NaN | Tropical storm 'Nangka' (Nika) | Waiting | Viet Nam | VNM | South-Eastern Asia | Asia | Nam Dinh, Ninh Bình, Thanh Hóa provinces | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 85.0 | Kph | NaN | NaN | NaN | NaN | 2020 | 10.0 | 13.0 | 2020 | 10.0 | 14.0 | 2.0 | NaN | 67855.0 | 2925.0 | 70780.0 | NaN | NaN | NaN | NaN |
| 15808 | 2020-0462-VNM | 2020 | 462 | Natural | Meteorological | Storm | Tropical cyclone | NaN | Tropical storm 'Noul' (Leon) | Kill | Viet Nam | VNM | South-Eastern Asia | Asia | Da Nang | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 85.0 | Kph | NaN | NaN | NaN | NaN | 2020 | 9.0 | 18.0 | 2020 | 9.0 | 21.0 | 6.0 | NaN | 125000.0 | NaN | 125000.0 | NaN | NaN | 33000.0 | NaN |
| 15809 | 2020-0558-VNM | 2020 | 558 | Natural | Meteorological | Storm | Tropical cyclone | NaN | Tropical depression 'Vicky' (Krovanh) | Affected | Viet Nam | VNM | South-Eastern Asia | Asia | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | Kph | NaN | NaN | NaN | NaN | 2020 | 12.0 | 21.0 | 2020 | 12.0 | 21.0 | 1.0 | 4.0 | NaN | NaN | 4.0 | NaN | NaN | NaN | NaN |
| 15810 | 2020-0132-VUT | 2020 | 132 | Natural | Meteorological | Storm | Tropical cyclone | NaN | Cyclone 'Harold' | -- | Vanuatu | VUT | Melanesia | Oceania | Pentecost, Espiritu Santo, Penama, Sanma, Malampa, Shefa, Torba | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | Kph | NaN | NaN | NaN | NaN | 2020 | 4.0 | 4.0 | 2020 | 4.0 | 5.0 | 5.0 | NaN | 83837.0 | NaN | 83837.0 | NaN | NaN | NaN | NaN |
| 15824 | 2021-0036-ZWE | 2021 | 36 | Natural | Meteorological | Storm | Tropical cyclone | NaN | Tropical cyclone 'Eloise' | Kill | Zimbabwe | ZWE | Eastern Africa | Africa | Eswatini | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | Kph | NaN | NaN | NaN | NaN | 2021 | 1.0 | 23.0 | 2021 | 1.0 | 23.0 | 3.0 | NaN | 1745.0 | NaN | 1745.0 | NaN | NaN | NaN | NaN |

Disaster Type

Nested under the “Disaster Subgroup”, the “Disaster Type” offers a more specific classification of the natural disaster.

Let’s explore the “Disaster Type” for all “Meteorological” events:

# display all natural disaster types for "meteorological" events
df_meteo["Disaster Type"].unique()

Here is the output:

array(['Storm', 'Extreme temperature', 'Fog'], dtype=object)

We see there are three meteorological Disaster Types:

  1. Extreme temperature
  2. Fog
  3. Storm

To examine all “Storm” Disaster Types from the “Meteorological” Disaster Subgroup, we can use the following snippet:

# grab all rows that are part of the 'storm' disaster type
df_storm = df_meteo[df_meteo["Disaster Type"] == "Storm"]
df_storm.tail()

Notice how the following dataframe only includes “Storm” events:

| | Dis No | Year | Seq | Disaster Group | Disaster Subgroup | Disaster Type | Disaster Subtype | Disaster Subsubtype | Event Name | Entry Criteria | Country | ISO | Region | Continent | Location | Origin | Associated Dis | Associated Dis2 | OFDA Response | Appeal | Declaration | Aid Contribution | Dis Mag Value | Dis Mag Scale | Latitude | Longitude | Local Time | River Basin | Start Year | Start Month | Start Day | End Year | End Month | End Day | Total Deaths | No Injured | No Affected | No Homeless | Total Affected | Reconstruction Costs ('000 US$) | Insured Damages ('000 US$) | Total Damages ('000 US$) | CPI |
| 15807 | 2020-0425-VNM | 2020 | 425 | Natural | Meteorological | Storm | Tropical cyclone | NaN | Tropical storm 'Nangka' (Nika) | Waiting | Viet Nam | VNM | South-Eastern Asia | Asia | Nam Dinh, Ninh Bình, Thanh Hóa provinces | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 85.0 | Kph | NaN | NaN | NaN | NaN | 2020 | 10.0 | 13.0 | 2020 | 10.0 | 14.0 | 2.0 | NaN | 67855.0 | 2925.0 | 70780.0 | NaN | NaN | NaN | NaN |
| 15808 | 2020-0462-VNM | 2020 | 462 | Natural | Meteorological | Storm | Tropical cyclone | NaN | Tropical storm 'Noul' (Leon) | Kill | Viet Nam | VNM | South-Eastern Asia | Asia | Da Nang | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 85.0 | Kph | NaN | NaN | NaN | NaN | 2020 | 9.0 | 18.0 | 2020 | 9.0 | 21.0 | 6.0 | NaN | 125000.0 | NaN | 125000.0 | NaN | NaN | 33000.0 | NaN |
| 15809 | 2020-0558-VNM | 2020 | 558 | Natural | Meteorological | Storm | Tropical cyclone | NaN | Tropical depression 'Vicky' (Krovanh) | Affected | Viet Nam | VNM | South-Eastern Asia | Asia | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | Kph | NaN | NaN | NaN | NaN | 2020 | 12.0 | 21.0 | 2020 | 12.0 | 21.0 | 1.0 | 4.0 | NaN | NaN | 4.0 | NaN | NaN | NaN | NaN |
| 15810 | 2020-0132-VUT | 2020 | 132 | Natural | Meteorological | Storm | Tropical cyclone | NaN | Cyclone 'Harold' | -- | Vanuatu | VUT | Melanesia | Oceania | Pentecost, Espiritu Santo, Penama, Sanma, Malampa, Shefa, Torba | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | Kph | NaN | NaN | NaN | NaN | 2020 | 4.0 | 4.0 | 2020 | 4.0 | 5.0 | 5.0 | NaN | 83837.0 | NaN | 83837.0 | NaN | NaN | NaN | NaN |
| 15824 | 2021-0036-ZWE | 2021 | 36 | Natural | Meteorological | Storm | Tropical cyclone | NaN | Tropical cyclone 'Eloise' | Kill | Zimbabwe | ZWE | Eastern Africa | Africa | Eswatini | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | Kph | NaN | NaN | NaN | NaN | 2021 | 1.0 | 23.0 | 2021 | 1.0 | 23.0 | 3.0 | NaN | 1745.0 | NaN | 1745.0 | NaN | NaN | NaN | NaN |
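The two-step filter used so far (subgroup first, then type) can also be written as a single combined boolean mask. The sketch below checks that both routes select the same rows; the toy dataframe and the `toy_`-prefixed names are hypothetical stand-ins for the real EM-DAT data:

```python
import pandas as pd

# hypothetical toy rows standing in for the real EM-DAT dataframe
toy_df = pd.DataFrame(
    {
        "Disaster Subgroup": [
            "Meteorological",
            "Meteorological",
            "Hydrological",
            "Meteorological",
        ],
        "Disaster Type": ["Storm", "Fog", "Flood", "Storm"],
    }
)

# two-step filter: walk down the hierarchy one level at a time
toy_meteo = toy_df[toy_df["Disaster Subgroup"] == "Meteorological"]
toy_storm_two_step = toy_meteo[toy_meteo["Disaster Type"] == "Storm"]

# one-step filter: combine both conditions with the element-wise & operator
mask = (toy_df["Disaster Subgroup"] == "Meteorological") & (
    toy_df["Disaster Type"] == "Storm"
)
toy_storm_one_step = toy_df[mask]

# both approaches select exactly the same rows
print(toy_storm_one_step.equals(toy_storm_two_step))
```

Note the parentheses around each comparison: `&` binds more tightly than `==` in Python, so leaving them out raises an error or produces the wrong mask.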

Disaster Subtype

The Disaster Subtype provides an even more detailed breakdown of the Disaster Type.

Let’s take “Storm” as our starting “Disaster Type”:

# display all natural disaster subtypes for "storm" events
df_storm["Disaster Subtype"].unique()

Which gives us the following output:

array(['Tropical cyclone', 'Convective storm', nan,
       'Extra-tropical storm'], dtype=object)

We see there are three named Disaster Subtypes when we start with “Storm” as our root “Disaster Type”, plus NaN for rows with no subtype:

  1. Convective storm
  2. Extra-tropical storm
  3. Tropical cyclone
  4. NaN (no further categorization)

Any row with a value of NaN is not categorized beyond the “Disaster Type”.
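One practical consequence: missing values never match an equality test, so the uncategorized rows need `isna()` instead of `==`. Here is a minimal sketch on a hypothetical toy dataframe (the rows and the `toy_storms` name are assumptions for illustration):

```python
import pandas as pd

# hypothetical toy rows standing in for df_storm
toy_storms = pd.DataFrame(
    {
        "Disaster Type": ["Storm", "Storm", "Storm", "Storm"],
        "Disaster Subtype": [
            "Tropical cyclone",
            "Convective storm",
            None,  # no further categorization
            "Extra-tropical storm",
        ],
    }
)

# comparing against the string "NaN" matches nothing, since the value is missing
eq_match = toy_storms[toy_storms["Disaster Subtype"] == "NaN"]

# isna() is the reliable way to select rows without a subtype
uncategorized = toy_storms[toy_storms["Disaster Subtype"].isna()]

# dropna=False keeps the missing values in the tally
counts = toy_storms["Disaster Subtype"].value_counts(dropna=False)
print(len(eq_match), len(uncategorized))
print(counts)
```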

Let’s now filter on all “Convective storm” samples:

# grab all rows that are part of the 'convective storm' disaster subtype
df_convective = df_storm[df_storm["Disaster Subtype"] == "Convective storm"]
df_convective.tail()

Which gives us the following dataframe:

| | Dis No | Year | Seq | Disaster Group | Disaster Subgroup | Disaster Type | Disaster Subtype | Disaster Subsubtype | Event Name | Entry Criteria | Country | ISO | Region | Continent | Location | Origin | Associated Dis | Associated Dis2 | OFDA Response | Appeal | Declaration | Aid Contribution | Dis Mag Value | Dis Mag Scale | Latitude | Longitude | Local Time | River Basin | Start Year | Start Month | Start Day | End Year | End Month | End Day | Total Deaths | No Injured | No Affected | No Homeless | Total Affected | Reconstruction Costs ('000 US$) | Insured Damages ('000 US$) | Total Damages ('000 US$) | CPI |
| 15786 | 2020-0167-USA | 2020 | 167 | Natural | Meteorological | Storm | Convective storm | Tornado | NaN | Waiting | United States of America (the) | USA | Northern America | Americas | Texas, Oklahoma, Louisiana, Mississippi, Alabama, Georgia, Florida, Virginia | NaN | Flood | NaN | NaN | NaN | NaN | NaN | NaN | Kph | NaN | NaN | NaN | NaN | 2020 | 4.0 | 21.0 | 2020 | 4.0 | 24.0 | 3.0 | 31.0 | NaN | NaN | 31.0 | NaN | NaN | 1400000.0 | NaN |
| 15787 | 2020-0011-USA | 2020 | 11 | Natural | Meteorological | Storm | Convective storm | Severe storm | NaN | Kill | United States of America (the) | USA | Northern America | Americas | Texas, Oklahoma, Missouri, Arkansas, Louisiana, Mississippi, Alabama, Tennessee, Kentucky, Georgia states | NaN | Flood | NaN | NaN | NaN | NaN | NaN | NaN | Kph | NaN | NaN | NaN | NaN | 2020 | 1.0 | 10.0 | 2020 | 1.0 | 12.0 | 10.0 | NaN | NaN | NaN | NaN | NaN | NaN | 1200000.0 | NaN |
| 15791 | 2020-0165-VNM | 2020 | 165 | Natural | Meteorological | Storm | Convective storm | Lightning/Thunderstorms | NaN | Affected | Viet Nam | VNM | South-Eastern Asia | Asia | Ha Giang, Son La, Yen Bai, Lao Cai, and Quang Binh Provinces | NaN | Flood | NaN | NaN | NaN | NaN | NaN | NaN | Kph | NaN | NaN | NaN | NaN | 2020 | 4.0 | 22.0 | 2020 | 4.0 | 27.0 | 3.0 | 13.0 | 30000.0 | NaN | 30013.0 | NaN | NaN | NaN | NaN |
| 15798 | 2020-0082-USA | 2020 | 82 | Natural | Meteorological | Storm | Convective storm | Tornado | NaN | Waiting | United States of America (the) | USA | Northern America | Americas | Nashville (Tennessee), Kentucky, Missouri, Mississippi, Georgia, Texas Oklahoma, Illinois, Indiana, Ohio, Arkansas, West Virginia, Pennsylvania | NaN | NaN | NaN | NaN | NaN | Yes | NaN | NaN | Kph | NaN | NaN | NaN | NaN | 2020 | 3.0 | 2.0 | 2020 | 3.0 | 5.0 | 25.0 | 300.0 | 12000.0 | NaN | 12300.0 | NaN | NaN | 2500000.0 | NaN |
| 15799 | 2020-0582-USA | 2020 | 582 | Natural | Meteorological | Storm | Convective storm | Severe storm | NaN | SigDam | United States of America (the) | USA | Northern America | Americas | Missouri, Oklahoma, Texas, Illinois, Indiana, Ohio, Arkansas, Kentucky, Tennessee, West Virginia, Pennnsylvania | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | Kph | NaN | NaN | NaN | NaN | 2020 | 3.0 | 27.0 | 2020 | 3.0 | 28.0 | NaN | NaN | NaN | NaN | NaN | NaN | 2200000.0 | 2900000.0 | NaN |

Now, only convective storms are included in the output.

However, if you examine the “Disaster Subsubtype” column, you’ll see further categorization of the natural disaster, including “Tornado”, “Severe storm”, “Lightning/Thunderstorms”, etc.

Disaster Subsubtype

The most granular level of classification in the EM-DAT hierarchy is the Disaster Subsubtype.

Let’s start with “Convective storm” as the “Disaster Subtype” and determine all possible “Disaster Subsubtype” values:

# display all natural disaster subsubtypes for "convective storm" events
df_convective["Disaster Subsubtype"].unique()

The output of the code follows:

array(['Tornado', 'Hail', 'Severe storm', 'Winter storm/Blizzard',
       'Lightning/Thunderstorms', nan, 'Sand/Dust storm', 'Rain',
       'Storm/Surge', 'Derecho'], dtype=object)

Which tells us there are nine named Disaster Subsubtypes for “Convective storm”, plus NaN for rows with no further categorization:

  1. Derecho
  2. Hail
  3. Lightning/Thunderstorms
  4. Rain
  5. Sand/Dust storm
  6. Severe storm
  7. Storm/Surge
  8. Tornado
  9. Winter storm/Blizzard
  10. NaN (no further categorization)
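The alphabetical lists in this article can be produced directly rather than by hand. The sketch below drops the missing values before sorting (mixing strings and NaN in `sorted()` would raise a `TypeError`) and checks for them separately. The toy Series is a hypothetical stand-in for `df_convective["Disaster Subsubtype"]`:

```python
import pandas as pd

# hypothetical toy values standing in for df_convective["Disaster Subsubtype"]
toy_subsubtypes = pd.Series(
    ["Tornado", "Hail", None, "Derecho", "Tornado"],
    name="Disaster Subsubtype",
)

# drop missing values first, then sort the distinct labels alphabetically
named = sorted(toy_subsubtypes.dropna().unique())

# separately record whether any rows lack a subsubtype
has_missing = toy_subsubtypes.isna().any()

print(named)
print(has_missing)
```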

As a final example, let’s grab all rows where the “Disaster Subsubtype” is “Tornado”:

# grab all rows that are part of the 'tornado' disaster subsubtype
df_tornado = df_convective[df_convective["Disaster Subsubtype"] == "Tornado"]
df_tornado.tail()

And sure enough, we’ve now filtered only the tornado events from EM-DAT:

| | Dis No | Year | Seq | Disaster Group | Disaster Subgroup | Disaster Type | Disaster Subtype | Disaster Subsubtype | Event Name | Entry Criteria | Country | ISO | Region | Continent | Location | Origin | Associated Dis | Associated Dis2 | OFDA Response | Appeal | Declaration | Aid Contribution | Dis Mag Value | Dis Mag Scale | Latitude | Longitude | Local Time | River Basin | Start Year | Start Month | Start Day | End Year | End Month | End Day | Total Deaths | No Injured | No Affected | No Homeless | Total Affected | Reconstruction Costs ('000 US$) | Insured Damages ('000 US$) | Total Damages ('000 US$) | CPI |
| 15298 | 2019-0081-USA | 2019 | 81 | Natural | Meteorological | Storm | Convective storm | Tornado | NaN | Kill | United States of America (the) | USA | Northern America | Americas | Alabama, Georgia, South Carolina, Florida, Mississippi, | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | Kph | NaN | NaN | NaN | NaN | 2019 | 3.0 | 3.0 | 2019 | 3.0 | 4.0 | 28.0 | 90.0 | NaN | NaN | 90.0 | NaN | 140000.0 | 190000.0 | 100.0 |
| 15780 | 2020-0190-USA | 2020 | 190 | Natural | Meteorological | Storm | Convective storm | Tornado | NaN | SigDam | United States of America (the) | USA | Northern America | Americas | Illinois, Iowa, Wisconsin, Michigan, Indiana, Ohio, Kentucky, Arkansas, Tennessee, Missouri | NaN | Hail | NaN | NaN | NaN | NaN | NaN | NaN | Kph | NaN | NaN | NaN | NaN | 2020 | 4.0 | 6.0 | 2020 | 4.0 | 9.0 | NaN | NaN | NaN | NaN | NaN | NaN | 2200000.0 | 2900000.0 | NaN |
| 15785 | 2020-0148-USA | 2020 | 148 | Natural | Meteorological | Storm | Convective storm | Tornado | NaN | Kill | United States of America (the) | USA | Northern America | Americas | Louisiana, Texas, Mississippi, South Carolina, Georgia, Tennessee, Arkansas, North Carolina, Alabama | NaN | Flood | NaN | NaN | NaN | NaN | NaN | 160.0 | Kph | NaN | NaN | NaN | NaN | 2020 | 4.0 | 10.0 | 2020 | 4.0 | 14.0 | 38.0 | 200.0 | NaN | NaN | 200.0 | NaN | 2600000.0 | 3500000.0 | NaN |
| 15786 | 2020-0167-USA | 2020 | 167 | Natural | Meteorological | Storm | Convective storm | Tornado | NaN | Waiting | United States of America (the) | USA | Northern America | Americas | Texas, Oklahoma, Louisiana, Mississippi, Alabama, Georgia, Florida, Virginia | NaN | Flood | NaN | NaN | NaN | NaN | NaN | NaN | Kph | NaN | NaN | NaN | NaN | 2020 | 4.0 | 21.0 | 2020 | 4.0 | 24.0 | 3.0 | 31.0 | NaN | NaN | 31.0 | NaN | NaN | 1400000.0 | NaN |
| 15798 | 2020-0082-USA | 2020 | 82 | Natural | Meteorological | Storm | Convective storm | Tornado | NaN | Waiting | United States of America (the) | USA | Northern America | Americas | Nashville (Tennessee), Kentucky, Missouri, Mississippi, Georgia, Texas Oklahoma, Illinois, Indiana, Ohio, Arkansas, West Virginia, Pennsylvania | NaN | NaN | NaN | NaN | NaN | Yes | NaN | NaN | Kph | NaN | NaN | NaN | NaN | 2020 | 3.0 | 2.0 | 2020 | 3.0 | 5.0 | 25.0 | 300.0 | 12000.0 | NaN | 12300.0 | NaN | NaN | 2500000.0 | NaN |

An easy way to filter natural disaster types in EM-DAT

The above sections provided code snippets demonstrating how the EM-DAT hierarchy is organized.

However, since we are using Pandas, we can filter directly on an individual column instead of navigating the entire hierarchy.

The benefit of filtering directly on a column is that it requires only a single line of code.

For example, let’s grab all “Avalanche” events, which requires us to filter on the “Disaster Subtype” column:

# find all avalanches in the EM-DAT dataset by filtering *directly* on the
# Disaster Subtype of the original dataframe
df_avalanche = df[df["Disaster Subtype"] == "Avalanche"]
df_avalanche.tail()

And now we have a dataframe consisting of just the “Avalanche” events:

| | Dis No | Year | Seq | Disaster Group | Disaster Subgroup | Disaster Type | Disaster Subtype | Disaster Subsubtype | Event Name | Entry Criteria | Country | ISO | Region | Continent | Location | Origin | Associated Dis | Associated Dis2 | OFDA Response | Appeal | Declaration | Aid Contribution | Dis Mag Value | Dis Mag Scale | Latitude | Longitude | Local Time | River Basin | Start Year | Start Month | Start Day | End Year | End Month | End Day | Total Deaths | No Injured | No Affected | No Homeless | Total Affected | Reconstruction Costs ('000 US$) | Insured Damages ('000 US$) | Total Damages ('000 US$) | CPI |
| 14580 | 2017-0466-MNG | 2017 | 466 | Natural | Hydrological | Landslide | Avalanche | NaN | NaN | Kill | Mongolia | MNG | Eastern Asia | Asia | Otgontenger mountain (Khangai mountain range, Zavkhan province) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2017 | 10.0 | 22.0 | 2017 | 10.0 | 22.0 | 17.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 95.878166 |
| 14625 | 2017-0034-TJK | 2017 | 34 | Natural | Hydrological | Landslide | Avalanche | NaN | NaN | Waiting | Tajikistan | TJK | Central Asia | Asia | Pamir region. Road between Douchanbe (Tadshikistan territories) and Khodjent (Sogd), Gorno-Badakhshan region (East) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2017 | 1.0 | 27.0 | 2017 | 1.0 | 28.0 | 13.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 95.878166 |
| 15517 | 2020-0063-AFG | 2020 | 63 | Natural | Hydrological | Landslide | Avalanche | NaN | NaN | Kill | Afghanistan | AFG | Southern Asia | Asia | Daykundi Province | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2020 | 2.0 | 13.0 | 2020 | 2.0 | 14.0 | 22.0 | 10.0 | NaN | 250.0 | 260.0 | NaN | NaN | NaN | NaN |
| 15625 | 2020-0574-IRN | 2020 | 574 | Natural | Hydrological | Landslide | Avalanche | NaN | NaN | Kill | Iran (Islamic Republic of) | IRN | Southern Asia | Asia | Darabad mountains | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2020 | 12.0 | 25.0 | 2020 | 12.0 | 25.0 | 12.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 15736 | 2020-0044-TUR | 2020 | 44 | Natural | Hydrological | Landslide | Avalanche | NaN | NaN | Kill | Turkey | TUR | Western Asia | Asia | Bahçesaray and Çatak districts (Van province) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2020 | 2.0 | 4.0 | 2020 | 2.0 | 5.0 | 41.0 | 84.0 | NaN | NaN | 84.0 | NaN | NaN | NaN | NaN |

I suggest using this approach if you need to filter on a specific natural disaster type in EM-DAT.

I assume you have enough Pandas knowledge to have spotted this shortcut already, but I’m including it for the sake of completeness.
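The same direct-filter idea extends to several labels at once with `isin()`, or to a label combined with another column. A minimal sketch on a hypothetical toy dataframe (the rows, the `wanted` list, and the 2020 cutoff are all illustrative assumptions):

```python
import pandas as pd

# hypothetical toy rows standing in for the real EM-DAT dataframe
toy_df = pd.DataFrame(
    {
        "Disaster Subtype": [
            "Avalanche",
            "Tropical cyclone",
            "Riverine flood",
            "Avalanche",
        ],
        "Start Year": [2017, 2020, 2020, 2020],
    }
)

# keep rows whose subtype matches any label in the list
wanted = ["Avalanche", "Tropical cyclone"]
toy_selected = toy_df[toy_df["Disaster Subtype"].isin(wanted)]

# combine a subtype filter with a condition on another column
toy_recent_avalanches = toy_df[
    (toy_df["Disaster Subtype"] == "Avalanche") & (toy_df["Start Year"] >= 2020)
]

print(toy_selected)
print(toy_recent_avalanches)
```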

Comments on the EM-DAT natural disaster hierarchy

The hierarchical structure of the EM-DAT dataset facilitates a systematic approach to drilling down into natural disaster types.

As a data scientist, I can utilize either a top-down or bottom-up approach in my analysis of natural disasters.

Furthermore, using Pandas and column-based indexing, drilling down into natural disaster types is trivially easy.

Note: While the above code snippets explored meteorological events, the same approach can be utilized for other events in the EM-DAT dataset, including biological, climatological, etc.
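To make the top-down and bottom-up ideas concrete, here is one possible sketch. Top-down, it counts events along every path through the hierarchy; bottom-up, it starts from a specific label and recovers its parent categories. The toy dataframe, the `LEVELS` list, and the `"(none)"` placeholder for missing levels are my own assumptions, not part of EM-DAT:

```python
import pandas as pd

# hierarchy levels below "Disaster Group", broadest first
LEVELS = [
    "Disaster Subgroup",
    "Disaster Type",
    "Disaster Subtype",
    "Disaster Subsubtype",
]

# hypothetical toy rows standing in for the real EM-DAT dataframe
toy_df = pd.DataFrame(
    [
        ["Meteorological", "Storm", "Convective storm", "Tornado"],
        ["Meteorological", "Storm", "Convective storm", "Tornado"],
        ["Meteorological", "Storm", "Tropical cyclone", None],
        ["Hydrological", "Landslide", "Avalanche", None],
    ],
    columns=LEVELS,
)

# top-down: count events along each full path through the hierarchy,
# using a placeholder so rows with missing levels are not dropped
path_counts = toy_df[LEVELS].fillna("(none)").groupby(LEVELS).size()
print(path_counts)

# bottom-up: start from a subsubtype and look up its parent categories
tornado_parents = toy_df.loc[
    toy_df["Disaster Subsubtype"] == "Tornado", LEVELS[:-1]
].drop_duplicates()
print(tornado_parents)
```

Swapping the toy dataframe for the real `df` should give a compact overview of which paths through the hierarchy actually occur in the data.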

Takeaways

  1. EM-DAT Database Exploration: We took a deep dive into the hierarchical fashion in which EM-DAT organizes natural disaster types. This structure enables a systematic approach for drilling down into specific disaster types.
  2. Hierarchical Column Structure: The EM-DAT hierarchy is based on five columns: Disaster Group, Disaster Subgroup, Disaster Type, Disaster Subtype, and Disaster Subsubtype.
  3. Using Python and Pandas for Data Analysis: The article provided Python and Pandas code snippets to help you navigate the EM-DAT dataset hierarchy.
  4. Filtering Made Easy with Pandas: While the hierarchy can seem complex, using Pandas, one can easily filter specific disaster types with single lines of code. For instance, extracting all “Avalanche” events requires just a direct filter on the “Disaster Subtype” column.
  5. Versatility in Analysis Approach: With the EM-DAT’s hierarchical structure, you can adopt either a top-down or bottom-up approach to analyze natural disasters.
  6. Disclaimer for Data Scientists: Remember, while this article showcased meteorological events, the same analytical methods apply for other event types in the EM-DAT dataset, such as biological, climatological, etc.

Understanding the EM-DAT hierarchy, and how to effectively navigate it using Pandas, will equip you with a robust toolkit to explore the vast data on natural disasters in the EM-DAT database.

Citation information

Adrian Rosebrock. “Decoding the EM-DAT Natural Disaster Dataset Hierarchy”, NaturalDisasters.ai, 2023, https://naturaldisasters.ai/posts/em-dat-dataset-hierarchy-explained/.

@incollection{ARosebrock_EMDATDatasetHierarchy,
    author = {Adrian Rosebrock},
    title = {Decoding the EM-DAT Natural Disaster Dataset Hierarchy},
    booktitle = {NaturalDisasters.ai},
    year = {2023},
    url = {https://naturaldisasters.ai/posts/em-dat-dataset-hierarchy-explained/},
}

AI generated content disclaimer: I’ve used a sprinkling of AI magic in this blog post, namely in the “Takeaways” section, where I used AI to create a concise summary of this article. Don’t fret, my human eyeballs have read and edited every word of the AI generated content, so rest assured, what you’re reading is as accurate as I possibly can make it. If there are any discrepancies or inaccuracies in the post, it’s my fault, not that of our machine assistants.

Header photo by Joshua Earle on Unsplash