Supervised polarity and topic classification of LastQuake app user’s pictures with comments – Zagreb 2020 earthquake

posted on 28.05.2021, 16:09 by Diana Contreras Mojica, Sean Wilkinson, Laure Fallou, Matthieu Landès, Ivan Tomljenovich, Rémy Bossu, Nipun Balan, Philip James

This database contains the sentiment analysis (SA) and supervised topic classification of the comments posted with pictures by LastQuake app users in relation to the 22nd March 2020 Zagreb earthquake. The LastQuake app is a crowdsourcing-based earthquake information app that allows eyewitnesses to share information about the earthquakes they felt and combines these reports with seismic data. The app was developed by the European-Mediterranean Seismological Centre (EMSC). The attributes and data contained in the database are:

- eq_evid : Number of the earthquake the comment is associated with

- eq_mag : Magnitude of the event

- eq_t0 : Origin time (UTC)

- intensity: Felt report intensity (users must submit a felt report before leaving a comment)

- epidist : Distance between the event and the location of the comment, in km

- dt : Response time, measured from the origin time of the associated event, in seconds

- rate_pos : Number of positive ratings *

- rate_neg : Number of negative ratings *

- device : Device from which the comment was left (desktop, mobile or app)

- comm_valid : 0 or 1, depending on whether the comment was validated. Comments are invalidated when they are considered inappropriate (violence, insults, ...)

- language: Original language in which the comment was written

- polarity: Polarity class assigned to the comment, i.e. positive, negative or neutral

- topic: Topic assigned to the comment (building damage and intensity)

- comment: Comment posted by the LastQuake app user, translated into English
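
The attribute list above can be read as a record schema. The following is a minimal sketch of how such records might be loaded and summarised in Python, under several assumptions that the description does not confirm: that the data are available as CSV with a header row using the attribute names above, that comm_valid = 1 means the comment was validated, and that numeric fields parse as plain numbers. The names CommentRecord, parse_records, polarity_counts and SAMPLE_CSV are hypothetical, and the sample rows are invented placeholders, not values from the database.

```python
import csv
import io
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass
class CommentRecord:
    eq_evid: str       # earthquake identifier
    eq_mag: float      # magnitude of the event
    eq_t0: str         # origin time (UTC); kept as text, format not specified
    intensity: float   # felt report intensity
    epidist: float     # distance from the event, in km
    dt: float          # response time from origin time, in seconds
    rate_pos: int      # number of positive ratings
    rate_neg: int      # number of negative ratings
    device: str        # desktop, mobile or app
    comm_valid: bool   # ASSUMPTION: 1 means the comment was validated
    language: str      # original language of the comment
    polarity: str      # positive, negative or neutral
    topic: str         # assigned topic
    comment: str       # comment translated into English


def parse_records(csv_text: str) -> List[CommentRecord]:
    """Parse CSV text whose header uses the attribute names listed above."""
    reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
    records = []
    for row in reader:
        row = {key.strip(): value.strip() for key, value in row.items()}
        records.append(CommentRecord(
            eq_evid=row["eq_evid"],
            eq_mag=float(row["eq_mag"]),
            eq_t0=row["eq_t0"],
            intensity=float(row["intensity"]),
            epidist=float(row["epidist"]),
            dt=float(row["dt"]),
            rate_pos=int(row["rate_pos"]),
            rate_neg=int(row["rate_neg"]),
            device=row["device"],
            comm_valid=row["comm_valid"] == "1",
            language=row["language"],
            polarity=row["polarity"],
            topic=row["topic"],
            comment=row["comment"],
        ))
    return records


def polarity_counts(records: List[CommentRecord]) -> Counter:
    """Count polarity labels among validated comments only."""
    return Counter(r.polarity.lower() for r in records if r.comm_valid)


# Invented placeholder rows for illustration only (not real data).
SAMPLE_CSV = """eq_evid,eq_mag,eq_t0,intensity,epidist,dt,rate_pos,rate_neg,device,comm_valid,language,polarity,topic,comment
1,5.0,t0-placeholder,6,3.2,540,2,0,app,1,hr,negative,building damage,Example comment A
1,5.0,t0-placeholder,4,12.5,1260,0,0,app,1,en,neutral,intensity,Example comment B
1,5.0,t0-placeholder,5,7.0,900,0,1,mobile,0,hr,negative,intensity,Example comment C
"""

if __name__ == "__main__":
    records = parse_records(SAMPLE_CSV)
    print(polarity_counts(records))
```

Keeping eq_t0 as text avoids guessing a timestamp format; it can be converted once the actual format of the file is known.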

The LastQuake app received 31,911 intensity reports with comments from its users. These comments were treated as text data, and it was possible to translate 31,403 of them (98%). Users included 361 pictures with their comments; after data processing, 314 of these pictures (87%) were selected for damage assessment. Some intensity reports from LastQuake app users include only comments, some include only pictures, and some include both. This database contains the classification of only those intensity reports that include both pictures and comments: 45 in total. The supervised or unsupervised classification of all the comments posted by LastQuake app users about the 22nd March 2020 Zagreb earthquake will be published in a separate database in the future.
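
As a quick check, the two percentages quoted above can be reproduced from the counts given in the text. This is only an illustrative sketch; the helper name share is hypothetical.

```python
def share(part: int, total: int) -> int:
    """Return part/total as a percentage rounded to the nearest integer."""
    return round(100 * part / total)


# Counts taken from the description above.
translated_pct = share(31_403, 31_911)  # translated comments out of all comments
selected_pct = share(314, 361)          # pictures selected out of all pictures

print(translated_pct, selected_pct)  # prints "98 87"
```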


Engineering and Physical Sciences Research Council (EPSRC)