This text describes the data presented in the paper: Applying Computational Analysis to Textual Data from the Wild: a Feminist Perspective

========================
Introductory information
========================

Files included in the data deposit (include a short description of what data are contained):

1) User reviews from FeedFinder, together with the associated Index of Multiple Deprivation (IMD) data (Text_review_IMD_Decile.csv)
2) Detailed breakdown of IMD data for each review (venue_review_data_detailed.csv)
3) LDA topic weightings for each review, together with the hand coding for privacy and designated areas (ff_data_LDA_topic_weighting_Priv_DA_Coding.csv)
4) Original outputs from the topic models produced using the MALLET implementation of LDA for 30 topics, which are described in the paper and used for the analysis:
   Original input - malletQuintFF.txt
   Mallet composition - malletQuintFF.mallet_composition30.txt
   Mallet keys - malletQuintFF.mallet_keys30.txt

Key words used to describe the data: FeedFinder, breastfeeding, public breastfeeding, geolocated, IMD, Index of Multiple Deprivation

==========================
Methodological information
==========================

A brief method description: what the data is, how and why it was collected or created, and how it was processed:

Instruments, hardware and software used:
The data was collected via the FeedFinder mobile application. The data is volunteered by users who interact with the app to leave reviews about their experiences of breastfeeding in public spaces. Data from the app is stored in an SQL database, from which the data presented here was extracted before being cross-referenced with the IMD data associated with the postcodes of the venue locations. The resulting data was processed using the MALLET implementation of LDA.

Date(s) of data collection:
3444 total reviews were created by 606 users for 2535 venues over 43 months from May 2013 to January 2017.

Geographic coverage of data:
Only UK data was used for this study because a UK postcode was required to obtain the associated IMD data.

Data validation (how was the data checked, proofed and cleaned):
The initial step was to prepare the data for analysis and to integrate our VGI data, in particular the review comments, with the associated data from the IMD. Using the geocoded data associated with each review venue, the data was cross-referenced with the IMD. The IMD decile for each venue location was established using the Lower-layer Super Output Areas (LSOA), neighbourhoods with populations of around 1500, and IMD rank data, to provide contextual information on the relative deprivation of the location. This allowed the textual data to be analysed in dialogue with an officially created dataset on contextual factors of deprivation. 2727 user reviews were included in the final dataset for analysis (717 entries were excluded because they featured no comment data or because a postcode match was not generated, resulting in no associated IMD data).

Overview of secondary data, if used:
Information on the IMD is available online at: https://www.gov.uk/government/statistics/english-indices-of-deprivation-2015

=========================
Data-specific information
=========================

Definitions of names, labels, acronyms or specialist terminology used for variables, records and their values:

Explanation of weighting and grossing variables:

Outline any missing data:

=======
Contact
=======

Please contact rdm@ncl.ac.uk for further information
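
Note on the data validation step:
The cross-referencing and exclusion steps described under "Data validation" can be sketched roughly as follows. This is an illustrative sketch only, not the code used to produce the deposit. The field names (comment, postcode), the postcode-to-rank lookup and the example values are hypothetical, and the decile rule assumes the 32,844 LSOAs ranked in the English IMD 2015 are split into ten near-equal groups, with decile 1 the most deprived.

```python
# Number of LSOAs ranked in the English IMD 2015 (rank 1 = most deprived).
TOTAL_LSOAS = 32844


def imd_decile(rank, total=TOTAL_LSOAS):
    """Map an IMD rank to a decile (1 = most deprived 10%, 10 = least)."""
    if not 1 <= rank <= total:
        raise ValueError(f"rank {rank} is outside 1..{total}")
    # Integer ceiling of rank * 10 / total, avoiding floating-point rounding.
    return (rank * 10 + total - 1) // total


def attach_imd(reviews, postcode_to_rank):
    """Attach IMD rank and decile to each review.

    `reviews` is a list of dicts with hypothetical keys "comment" and
    "postcode"; `postcode_to_rank` is a hypothetical lookup from a
    normalised postcode to the IMD rank of the LSOA containing it.
    Reviews with no comment or no postcode match are excluded.
    Returns a (kept, excluded) pair of lists.
    """
    kept, excluded = [], []
    for review in reviews:
        comment = (review.get("comment") or "").strip()
        postcode = (review.get("postcode") or "").replace(" ", "").upper()
        rank = postcode_to_rank.get(postcode)
        if not comment or rank is None:
            excluded.append(review)
            continue
        kept.append({**review, "imd_rank": rank, "imd_decile": imd_decile(rank)})
    return kept, excluded


if __name__ == "__main__":
    # Made-up example rows; these are not taken from the deposited files.
    sample_reviews = [
        {"comment": "Friendly staff", "postcode": "ab1 2cd"},
        {"comment": "", "postcode": "AB1 2CD"},           # no comment
        {"comment": "Quiet corner", "postcode": "ZZ9 9ZZ"},  # no match
    ]
    sample_lookup = {"AB12CD": 16422}
    kept, excluded = attach_imd(sample_reviews, sample_lookup)
    print(len(kept), "kept;", len(excluded), "excluded")
```

In practice the lookup would normally go from postcode to LSOA code and then from LSOA code to IMD rank; the two steps are collapsed into one dictionary here to keep the sketch short.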