easygraph.datasets.coauthor module
CoauthorCS Dataset
This dataset contains a co-authorship network of authors who submitted papers to the CS category. Each node represents an author, and each edge represents a co-authorship between two authors. Node features are bag-of-words representations of the keywords in the author's papers. The task is node classification, with each label indicating the author's primary field of study.
Statistics:
- Nodes: 18333
- Edges: 81894
- Feature Dim: 6805
- Classes: 15
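To make the bag-of-words feature description concrete, here is a minimal sketch of how such a vector could be built from an author's keywords. The function name `bag_of_words`, the toy vocabulary, and the keyword list are all hypothetical and chosen for illustration. The page does not say whether the dataset stores counts or binary indicators, so using counts here is an assumption.

```python
from collections import Counter


def bag_of_words(keywords, vocabulary):
    """Return a count vector over `vocabulary` for the given keyword list."""
    counts = Counter(keywords)
    # One entry per vocabulary term, in vocabulary order.
    # Keywords that are not in the vocabulary are ignored.
    return [counts[term] for term in vocabulary]


# Hypothetical toy vocabulary; the real dataset has 6805 feature dimensions.
vocabulary = ["graph", "learning", "network", "database"]
author_keywords = ["graph", "network", "graph", "compilers"]

vector = bag_of_words(author_keywords, vocabulary)
print(vector)  # [2, 0, 1, 0]
```

Each position in the vector corresponds to one vocabulary term, so every author gets a vector of the same length regardless of how many keywords they have.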
Source: dmlc/dgl
- class easygraph.datasets.coauthor.CoauthorCSDataset(raw_dir=None, force_reload=False, verbose=True, transform=None)
Bases: EasyGraphBuiltinDataset
CoauthorCS co-authorship network dataset.
Nodes are authors, and edges indicate co-authorship relationships. Each node has a bag-of-words feature vector and a label denoting the primary research field.
- Parameters:
raw_dir (str, optional) – Directory to store the raw downloaded files. Default: None
force_reload (bool, optional) – Whether to re-download and process the dataset. Default: False
verbose (bool, optional) – Whether to print detailed processing logs. Default: True
transform (callable, optional) – Transform to apply to the graph on access. Default: None
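As an illustration of what a `transform` callable might look like, the sketch below adds a derived attribute to every node. It assumes the transform receives the graph object and returns it, and that `g.nodes` behaves like a mapping from node to attribute dictionary, as the `g.nodes[0]['feat']` access in this page's example suggests. The function name `add_nonzero_count`, the attribute name `nonzero_count`, and the `_ToyGraph` stand-in are hypothetical and not part of EasyGraph.

```python
def add_nonzero_count(g):
    """Hypothetical transform: store how many feature entries are non-zero."""
    for node in g.nodes:
        attrs = g.nodes[node]
        # Plain Python loop for clarity, not speed.
        attrs["nonzero_count"] = sum(1 for value in attrs["feat"] if value != 0)
    # Assumption: the dataset expects the transform to return the graph.
    return g


class _ToyGraph:
    """Stand-in for a graph object; it only mimics the `nodes` mapping."""

    def __init__(self, nodes):
        self.nodes = nodes


toy = _ToyGraph({
    0: {"feat": [2, 0, 1], "label": 1},
    1: {"feat": [0, 0, 0], "label": 0},
})
toy = add_nonzero_count(toy)
print(toy.nodes[0]["nonzero_count"])  # 2
```

Under the same assumptions, such a function would be passed as `CoauthorCSDataset(transform=add_nonzero_count)` and applied each time the graph is accessed.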
Examples
>>> from easygraph.datasets import CoauthorCSDataset
>>> dataset = CoauthorCSDataset()
>>> g = dataset[0]
>>> print("Nodes:", g.number_of_nodes())
>>> print("Edges:", g.number_of_edges())
>>> print("Feature shape:", g.nodes[0]['feat'].shape)
>>> print("Label:", g.nodes[0]['label'])
>>> print("Number of classes:", dataset.num_classes)
- property num_classes