XGI-DATA

This is a repository of openly available hypergraph datasets in JSON format, together with documentation that describes each dataset in more detail. The datasets are hosted in the XGI Community on Zenodo, and a table of statistics can be found on Read The Docs. A rudimentary inspection script is also provided for checking that datasets are in the proper format. The repository is loosely inspired by Datasheets for Datasets by Gebru et al.

Overview of the xgi-data format

The xgi-data format for hypergraph datasets is a JSON object with the following top-level tags:

hypergraph-data: This tag accesses the attributes of the entire hypergraph dataset such as the authors or dataset name.
node-data: This tag accesses the nodes of the hypergraph and their associated properties as a dictionary where the keys are node IDs and the corresponding values are dictionaries. If a node doesn't have any properties, the associated dictionary is empty.
- name: This tag accesses the node's name if there is one that is different from the ID specified in the hyperedges.
- Other tags are user-specified based on the particular attributes provided by the dataset.
edge-data: This tag accesses the hyperedges of the hypergraph and their associated attributes.
- name: This tag accesses the edge's name if one is provided.
- timestamp: This tag specifies the time associated with the hyperedge, if one is given. All times are stored in ISO 8601 format.
- Other tags are user-specified based on the particular attributes provided by the dataset.
edge-dict: This tag accesses the edge IDs and the corresponding nodes which participate in that hyperedge.

All IDs are strings but can be converted to other types if desired.
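The layout described above can be illustrated with a small sketch. The dataset below is a made-up toy example, not one of the hosted datasets, and it assumes that each entry of edge-dict stores its member nodes as a JSON list; only the four top-level tags and the name and timestamp attributes come from the description above.

```python
import json
from datetime import datetime

# Hypothetical toy dataset using the four top-level tags described above.
# All names, IDs, and values here are invented for illustration.
toy = {
    "hypergraph-data": {"name": "toy-example"},
    "node-data": {
        "1": {"name": "alice"},
        "2": {},  # a node without properties has an empty dictionary
        "3": {},
    },
    "edge-data": {
        "0": {"timestamp": "2020-01-01T00:00:00"},
        "1": {},
    },
    # Assumption: member nodes are stored as a list of node IDs.
    "edge-dict": {
        "0": ["1", "2"],
        "1": ["1", "2", "3"],
    },
}

# Round-trip through JSON text, as if the dataset had been read from a file.
loaded = json.loads(json.dumps(toy))

# All IDs are strings; convert them to integers if that is more convenient.
edges = {
    int(edge_id): {int(node_id) for node_id in members}
    for edge_id, members in loaded["edge-dict"].items()
}

# ISO 8601 timestamps can be parsed with the standard library.
t0 = datetime.fromisoformat(loaded["edge-data"]["0"]["timestamp"])
```

Converting IDs only makes sense when every ID in a dataset is numeric; otherwise the string form should be kept.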

Data sets available on xgi-data

Currently available data sets are:

These datasets can be loaded with xgi using the following lines:

import xgi
H = xgi.load_xgi_data("<dataset_name>")

where <dataset_name> is chosen from the list above.

These datasets have been taken from the following sources:

Repository Description

index.json is a dictionary of the datasets that are currently available on xgi-data and the URL where each one is hosted. The code folder contains the scripts used to convert hypergraph datasets into a more standard format, as well as the JSON inspection script. This code can be adapted to convert datasets that are not yet part of xgi-data into the xgi-data format.
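As a rough illustration of how such an index could be used, the sketch below looks up the URL for a dataset name. The exact layout of index.json is not described here, so the sketch assumes that each dataset name maps to a dictionary with a "url" key; the dataset names, the example.org URLs, and the dataset_url helper are all hypothetical.

```python
import json

# Hypothetical excerpt of an index; the real index.json may be laid out
# differently. Assumption: dataset name -> {"url": ...}.
index_text = """
{
  "dataset-a": {"url": "https://example.org/dataset-a.json"},
  "dataset-b": {"url": "https://example.org/dataset-b.json"}
}
"""
index = json.loads(index_text)


def dataset_url(index, name):
    """Return the hosting URL for a dataset name (hypothetical helper)."""
    if name not in index:
        raise ValueError(f"unknown dataset {name!r}; available: {sorted(index)}")
    return index[name]["url"]


available = sorted(index)
url = dataset_url(index, "dataset-a")
```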

Checking dataset format

To check if a file has the xgi-data format, run the following command:

python inspect_json.py filepath.json
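For readers who want a sense of what a format check might involve, here is a minimal sketch. It is not the repository's inspect_json.py, whose actual checks are not described here; it only tests for the four top-level tags listed in the format overview, and it assumes that edge members are stored as lists of string IDs. The check_xgi_format function is a hypothetical name.

```python
import json

# The four top-level tags from the format overview.
REQUIRED_TAGS = ("hypergraph-data", "node-data", "edge-data", "edge-dict")


def check_xgi_format(data):
    """Return a list of problems; an empty list means these checks passed.

    Hypothetical helper, not the repository's inspection script.
    """
    if not isinstance(data, dict):
        return ["top-level JSON value is not an object"]

    problems = []
    for tag in REQUIRED_TAGS:
        if tag not in data:
            problems.append(f"missing tag: {tag}")
        elif not isinstance(data[tag], dict):
            problems.append(f"{tag} is not a dictionary")
    if problems:
        return problems  # skip the deeper checks if the skeleton is wrong

    for node_id, attrs in data["node-data"].items():
        if not isinstance(attrs, dict):
            problems.append(f"node {node_id}: attributes are not a dictionary")

    # Assumption: each edge lists its member nodes as string IDs.
    for edge_id, members in data["edge-dict"].items():
        if not isinstance(members, list):
            problems.append(f"edge {edge_id}: members are not a list")
        elif not all(isinstance(n, str) for n in members):
            problems.append(f"edge {edge_id}: some node IDs are not strings")
    return problems


good = json.loads(
    '{"hypergraph-data": {}, "node-data": {"1": {}, "2": {}},'
    ' "edge-data": {}, "edge-dict": {"0": ["1", "2"]}}'
)
bad = json.loads('{"node-data": {"1": {}}, "edge-dict": {"0": [1]}}')

good_report = check_xgi_format(good)
bad_report = check_xgi_format(bad)
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a file at once.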

Funding

The XGI-DATA package has been supported by NSF Grant 2121905, "HNDS-I: Using Hypergraphs to Study Spreading Processes in Complex Social Networks".
