READ ME

This text describes the data presented in the paper: A Social Network Analysis of Articles on Social Network Analysis

========================
Introductory information
========================
Files included in the data deposit (include a short description of what data are contained): 
1) node_list_20181022.csv: list of the references in the paper 
2) edge_list_20181022.csv: list of the citations between the references in the paper

Explain the relationship between multiple data sets, if required: the two lists together fully determine the graph that represents the citation network of the references in the paper. The node list gives the vertices and the edge list gives the directed edges.
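
As a minimal sketch of how the two lists determine the graph, the Python snippet below builds an adjacency list from a node list and an edge list and then checks that the result is acyclic. It assumes both files have a header row with the column names given in this README. The inline sample text and the function names (read_nodes, read_edges, build_graph, is_acyclic) are hypothetical and only for illustration; they are not part of the deposit.

```python
import csv
import io
from collections import deque

# Hypothetical sample text in the layout described in this README.
# Only the citation hks12 -> abfx08 is taken from the deposit itself.
NODE_CSV = "citing\nhks12\nabfx08\n"
EDGE_CSV = "citing,cited\nhks12,abfx08\n"


def read_nodes(text):
    """Return the identifiers in the single 'citing' column of the node list."""
    return [row["citing"] for row in csv.DictReader(io.StringIO(text))]


def read_edges(text):
    """Return (citing, cited) pairs from the edge list."""
    reader = csv.DictReader(io.StringIO(text))
    return [(row["citing"], row["cited"]) for row in reader]


def build_graph(nodes, edges):
    """Map each article to the list of articles it cites."""
    graph = {node: [] for node in nodes}
    for citing, cited in edges:
        # Every identifier in the edge list should also be in the node list.
        if citing not in graph or cited not in graph:
            raise ValueError(f"unknown identifier in edge {citing} -> {cited}")
        graph[citing].append(cited)
    return graph


def is_acyclic(graph):
    """Kahn's algorithm: the graph is acyclic if every node can be removed."""
    indegree = {node: 0 for node in graph}
    for targets in graph.values():
        for target in targets:
            indegree[target] += 1
    queue = deque(node for node, degree in indegree.items() if degree == 0)
    removed = 0
    while queue:
        node = queue.popleft()
        removed += 1
        for target in graph[node]:
            indegree[target] -= 1
            if indegree[target] == 0:
                queue.append(target)
    return removed == len(graph)


graph = build_graph(read_nodes(NODE_CSV), read_edges(EDGE_CSV))
```

To run this on the deposited files, the sample strings could be replaced by the contents of node_list_20181022.csv and edge_list_20181022.csv, for example via open(...).read().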

Key words used to describe the data: directed acyclic graph; citation network; identifiers


========================== 
Methodological information
==========================
A brief method description of what the data is, how and why it was collected or created, and how it was processed: The data contains the identifiers of 135 articles on social network analysis (SNA) and the citations between them. It was created in order to construct the citation network of these articles, which can be analysed by statistical methods to better understand SNA as a research area. It was collected manually from the reference lists / bibliographies of these 135 articles.

Instruments, hardware and software used: the open-source spreadsheet software LibreOffice Calc was used for recording the data

Date(s) of data collection: 2018-04-11 to 2018-06-28

Geographic coverage of data: not applicable

Data validation (how was the data checked, proofed and cleaned): The references were checked manually and thoroughly. For example, if article A cites article B in its reference list, and article B had already been recorded as cited by other articles, the metadata of article B was checked for consistency across all articles citing it.

Overview of secondary data, if used: none


=========================
Data-specific information
=========================
Definitions of names, labels, acronyms or specialist terminology used for variables, records and their values:
1) node_list_20181022.csv has a single column/variable, 'citing'. Each row contains the identifier of one of the 135 articles mentioned above. These identifiers are similar to, but not the same as, those used in the alpha bibliography style: each is essentially a concatenation of the first letters of the authors' last names and the publication year.
2) edge_list_20181022.csv has two columns/variables, 'citing' and 'cited'. The only values allowed are the identifiers listed in node_list_20181022.csv. If article A cites article B, there is a row in which A appears under 'citing' and B under 'cited'.
3) As an illustration, Hunter, Krivitsky and Schweinberger (2012) cites Airoldi, Blei, Fienberg and Xing (2008). Therefore, their respective identifiers 'hks12' and 'abfx08' appear in node_list_20181022.csv, while there is one row in edge_list_20181022.csv where 'hks12' and 'abfx08' appear under the two variables 'citing' and 'cited', respectively.
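
The identifier scheme described above can be sketched in Python as follows. This is only an illustration inferred from the two examples in this README: it assumes lower-case initials and a two-digit year, and the function name make_identifier is hypothetical. The identifiers in the deposit were recorded manually and may deviate from this rule, for instance for single-author articles or where two articles would otherwise share an identifier.

```python
def make_identifier(last_names, year):
    """Concatenate the lower-cased first letters of the authors' last names
    with the last two digits of the publication year (assumed convention)."""
    initials = "".join(name[0].lower() for name in last_names)
    return f"{initials}{year % 100:02d}"


# The two articles used as an illustration in this README.
citing = make_identifier(["Hunter", "Krivitsky", "Schweinberger"], 2012)
cited = make_identifier(["Airoldi", "Blei", "Fienberg", "Xing"], 2008)

# The corresponding row of the edge list, in (citing, cited) order.
edge_row = (citing, cited)
```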

Explanation of weighting and grossing variables: none

Outline any missing data: none


=======
Contact
=======
Please contact rdm@ncl.ac.uk for further information