easygraph.datasets.hypergraph.cocitation module
- class easygraph.datasets.hypergraph.cocitation.CocitationCiteseer(data_root: str | None = None)
Bases: BaseData
The Co-citation Citeseer dataset is a citation network dataset for the vertex classification task. For more details, see the HyperGCN paper.
The content of the Co-citation Citeseer dataset includes the following:
- num_classes: The number of classes: \(6\).
- num_vertices: The number of vertices: \(3,327\).
- num_edges: The number of edges: \(1,079\).
- dim_features: The dimension of features: \(3,703\).
- features: The vertex feature matrix. torch.Tensor with size \((3,327 \times 3,703)\).
- edge_list: The edge list. List with length \(1,079\).
- labels: The label list. torch.LongTensor with size \((3,327, )\).
- train_mask: The train mask. torch.BoolTensor with size \((3,327, )\).
- val_mask: The validation mask. torch.BoolTensor with size \((3,327, )\).
- test_mask: The test mask. torch.BoolTensor with size \((3,327, )\).
- Parameters:
data_root (str, optional) – The directory in which the data is stored. If set to None, the dataset is downloaded automatically from the server and saved into the default directory ~/.dhg/datasets/. Defaults to None.
- Attributes:
content: Return the content of the dataset.
Methods
fetch_files(files): Download and check the files if they do not exist.
needs_to_load(item_name): Return whether the item_name of the dataset needs to be loaded.
raw(key): Return the item key of the dataset in its un-preprocessed format.
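The documented sizes can serve as a quick sanity check after loading. The following is a minimal sketch, not the library's own validation routine: `find_mismatches`, `load_citeseer`, and `EXPECTED_CITESEER` are hypothetical names introduced here, and lookup by item name (`data[key]`) is an assumption about how the dataset object exposes its content. The check itself runs on a plain stand-in dictionary so that it works offline.

```python
# Scalar sizes documented for Co-citation Citeseer.
EXPECTED_CITESEER = {
    "num_classes": 6,
    "num_vertices": 3327,
    "num_edges": 1079,
    "dim_features": 3703,
}


def find_mismatches(data, expected):
    """Return {key: (found, expected)} for every documented scalar that disagrees."""
    mismatches = {}
    for key, want in expected.items():
        found = data[key]  # ASSUMPTION: the dataset supports lookup by item name
        if found != want:
            mismatches[key] = (found, want)
    return mismatches


def load_citeseer(data_root=None):
    """Build the dataset object; with data_root=None the files are auto-downloaded."""
    # Imported lazily so find_mismatches can be used without EasyGraph installed.
    from easygraph.datasets.hypergraph.cocitation import CocitationCiteseer

    return CocitationCiteseer(data_root=data_root)


# Stand-in mapping with the documented values, so the check runs offline.
stand_in = dict(EXPECTED_CITESEER)
print(find_mismatches(stand_in, EXPECTED_CITESEER))  # prints {}
```

With EasyGraph installed and network access available, `find_mismatches(load_citeseer(), EXPECTED_CITESEER)` would be the corresponding call, provided the lookup assumption holds.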
- class easygraph.datasets.hypergraph.cocitation.CocitationCora(data_root: str | None = None)
Bases: BaseData
The Co-citation Cora dataset is a citation network dataset for the vertex classification task. For more details, see the HyperGCN paper.
The content of the Co-citation Cora dataset includes the following:
- num_classes: The number of classes: \(7\).
- num_vertices: The number of vertices: \(2,708\).
- num_edges: The number of edges: \(1,579\).
- dim_features: The dimension of features: \(1,433\).
- features: The vertex feature matrix. torch.Tensor with size \((2,708 \times 1,433)\).
- edge_list: The edge list. List with length \(1,579\).
- labels: The label list. torch.LongTensor with size \((2,708, )\).
- train_mask: The train mask. torch.BoolTensor with size \((2,708, )\).
- val_mask: The validation mask. torch.BoolTensor with size \((2,708, )\).
- test_mask: The test mask. torch.BoolTensor with size \((2,708, )\).
- Parameters:
data_root (str, optional) – The directory in which the data is stored. If set to None, the dataset is downloaded automatically from the server and saved into the default directory ~/.dhg/datasets/. Defaults to None.
- Attributes:
content: Return the content of the dataset.
Methods
fetch_files(files): Download and check the files if they do not exist.
needs_to_load(item_name): Return whether the item_name of the dataset needs to be loaded.
raw(key): Return the item key of the dataset in its un-preprocessed format.
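The train, validation, and test masks each carry one boolean per vertex, so a metric can be restricted to a single split by filtering on the mask. The sketch below illustrates this with plain Python lists standing in for the label and mask tensors; `masked_accuracy` is a hypothetical helper, not part of EasyGraph, and the toy values are made up for illustration.

```python
def masked_accuracy(predictions, labels, mask):
    """Fraction of correct predictions among the vertices where mask is true."""
    # Keep only (prediction, label) pairs whose mask entry is true.
    selected = [
        (int(p), int(y))
        for p, y, m in zip(predictions, labels, mask)
        if bool(m)
    ]
    if not selected:
        raise ValueError("mask selects no vertices")
    correct = sum(1 for p, y in selected if p == y)
    return correct / len(selected)


# Toy data: four vertices, of which three belong to the evaluated split.
predictions = [0, 1, 2, 1]
labels = [0, 1, 1, 1]
test_mask = [True, False, True, True]

print(round(masked_accuracy(predictions, labels, test_mask), 3))  # prints 0.667
```

With real tensors, boolean indexing such as `labels[test_mask]` is the more idiomatic and faster route in PyTorch; the loop form is used here only to keep the example dependency-free.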
- class easygraph.datasets.hypergraph.cocitation.CocitationPubmed(data_root: str | None = None)
Bases: BaseData
The Co-citation PubMed dataset is a citation network dataset for the vertex classification task. For more details, see the HyperGCN paper.
The content of the Co-citation PubMed dataset includes the following:
- num_classes: The number of classes: \(3\).
- num_vertices: The number of vertices: \(19,717\).
- num_edges: The number of edges: \(7,963\).
- dim_features: The dimension of features: \(500\).
- features: The vertex feature matrix. torch.Tensor with size \((19,717 \times 500)\).
- edge_list: The edge list. List with length \(7,963\).
- labels: The label list. torch.LongTensor with size \((19,717, )\).
- train_mask: The train mask. torch.BoolTensor with size \((19,717, )\).
- val_mask: The validation mask. torch.BoolTensor with size \((19,717, )\).
- test_mask: The test mask. torch.BoolTensor with size \((19,717, )\).
- Parameters:
data_root (str, optional) – The directory in which the data is stored. If set to None, the dataset is downloaded automatically from the server and saved into the default directory ~/.dhg/datasets/. Defaults to None.
- Attributes:
content: Return the content of the dataset.
Methods
fetch_files(files): Download and check the files if they do not exist.
needs_to_load(item_name): Return whether the item_name of the dataset needs to be loaded.
raw(key): Return the item key of the dataset in its un-preprocessed format.
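Because these are hypergraph datasets, basic statistics over `edge_list` are often a useful first look at the data. The sketch below assumes that each entry of the edge list is a sequence of vertex indices forming one hyperedge; this layout is an assumption, not something the listing above states. `hyperedge_size_histogram` and `vertex_degrees` are hypothetical helpers, and the toy edge list is invented for illustration.

```python
from collections import Counter


def hyperedge_size_histogram(edge_list):
    """Count how many hyperedges contain each number of distinct vertices."""
    # ASSUMPTION: every entry is an iterable of vertex indices.
    return Counter(len(set(edge)) for edge in edge_list)


def vertex_degrees(edge_list, num_vertices):
    """Number of hyperedges each vertex belongs to (repeats in one edge count once)."""
    degrees = [0] * num_vertices
    for edge in edge_list:
        for vertex in set(edge):
            degrees[vertex] += 1
    return degrees


# Toy hypergraph with 5 vertices and 3 hyperedges.
toy_edges = [(0, 1, 2), (1, 3), (2, 3, 4)]

print(sorted(hyperedge_size_histogram(toy_edges).items()))  # prints [(2, 1), (3, 2)]
print(vertex_degrees(toy_edges, 5))  # prints [1, 2, 2, 2, 1]
```

Vertices with degree zero in the second result would be papers that appear in no hyperedge, which is worth knowing before training a model that propagates information only along hyperedges.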