Newcastle University
Contreras et al_2023_26112019 Albania earthquake_Albanian_SA-TA.xlsx (130.74 kB)

Sentiment and topic analysis of LastQuake app users' comments - 26th November 2019 Albania earthquake

posted on 2023-03-27, 10:06 authored by Enes Veliu, Diana Contreras, Laure Fallou, Rémy Bossu, Matthieu Landès

This dataset contains the sentiment and topic analysis (supervised classification) of comments posted by LastQuake app users about the 26th November 2019 Albania earthquake. LastQuake is a crowdsourcing-based earthquake information app that allows eyewitnesses to share information about the earthquake they felt, which is combined with seismic data. The app was developed by the European Mediterranean Seismological Centre (EMSC). The dataset contains the following attributes:

- Eq_t0: Origin time (UTC) of the intensity report.

- Intensity: report of intensity felt (before leaving a comment, users must leave a report).

- Epidist: distance of the comment from the event, in kilometres.

- Device: tool from which the comment was left, i.e. desktop, mobile or app.

- Comment in Albanian: original comments posted by LastQuake app users in Albanian.

- Translation to English: original comments in Albanian translated to English by Dr Enes Veliu (Native speaker).

- Sentiment analysis (SA): classification of the comment into a polarity, i.e. positive, negative, neutral or irrelevant.

- Topic (TA): classification of the comment into a specific topic, i.e. building damages, distress, emergency response, governance, injured and casualties, intensity, preparedness, seismic information, solidarity messages, tsunami, urban facilities or unrelated.


Collecting data after an earthquake is essential to determine the phenomenon's impact on the population and the built environment. To determine this impact, data must be collected on the number of injured people and casualties and on damaged buildings and infrastructure. In 2018, the European Mediterranean Seismological Centre (EMSC) launched a multichannel rapid information system comprising websites, a Twitter quakebot, and a smartphone app for global earthquake eyewitnesses: the LastQuake app. This app collects reports from users that can help provide rapid situational awareness. However, text data collected through crowdsourcing platforms such as the LastQuake app are unstructured. Therefore, natural language processing (NLP) techniques such as sentiment and topic analysis are necessary to extract meaningful information. Sentiment analysis, also called opinion mining, is the field of study that classifies people's opinions, expressed in written text, into a specific polarity, i.e. positive, negative or neutral. Topic analysis is another NLP technique that extracts meaning from text by identifying recurrent themes or topics.

On and after the 26th November 2019 earthquake in Albania, the LastQuake app recorded 28,220 reports from users. For the current analysis, we took a sample of the comments written in Albanian and posted on the day of the earthquake: 1,678 comments (6% of the recorded reports). Comments were translated into English and classified into polarities and topics defined for previous earthquakes based on similar datasets. The most frequent polarity detected in comments from LastQuake app users was negative (51%), followed at a distance by positive and neutral. The most frequent topic in users' comments was intensity (36%), followed by distress (32%) and seismic information (17%). Unfortunately, unrelated comments with inappropriate language represented 5% of the comments in the sample.
The most frequent polarity and topic detected were expected, given that the comments report on a disaster and that the LastQuake app was developed to report intensity. The remarkable finding is the high number of comments reporting distress, expressed with a positive polarity as prayers or with a negative polarity as cursing. Comments expressing distress far outnumber those that contain seismic information, emergency response actions, or reports of building damages or injured and casualties. This finding allows us to conclude that preparedness among the population in Albania needs to improve, at the individual or community level, to face the aftermath of an earthquake and probable aftershocks.
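
The percentage shares reported above can be recomputed from the classified comments by counting labels. The following is a minimal Python sketch under stated assumptions, not the authors' procedure: the function name `percentage_shares`, the keys `sentiment` and `topic`, and the four example rows are hypothetical placeholders and do not come from the spreadsheet. Only the figures 1678 and 28,220 are taken from the description.

```python
from collections import Counter


def percentage_shares(labels):
    """Return {label: share in percent, 1 decimal}, most frequent label first."""
    # Normalise case and whitespace so "Negative" and "negative " count together.
    cleaned = [str(x).strip().lower() for x in labels
               if x is not None and str(x).strip()]
    counts = Counter(cleaned)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: round(100 * n / total, 1)
            for label, n in counts.most_common()}


# Hypothetical rows for illustration only; NOT taken from the dataset.
rows = [
    {"sentiment": "Negative", "topic": "Intensity"},
    {"sentiment": "negative", "topic": "Distress"},
    {"sentiment": "Positive", "topic": "Distress"},
    {"sentiment": "Neutral", "topic": "Seismic information"},
]

sentiment_shares = percentage_shares(r["sentiment"] for r in rows)
topic_shares = percentage_shares(r["topic"] for r in rows)

# Share of the sample among all reports, using the figures in the description.
sample_share = round(100 * 1678 / 28220, 1)

print(sentiment_shares)
print(topic_shares)
print(sample_share)
```

With the real file, the label columns could first be loaded, for example with `pandas.read_excel`, and passed to the same function; the actual column headers should be checked in the spreadsheet before use.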