easygraph.datasets.web_google module
Web-Google Dataset
This dataset is a web graph of web pages and their hyperlink structure, released by Google in 2002 as part of the Google Programming Contest.
Each node represents a web page, and a directed edge from u to v indicates a hyperlink from page u to page v.
Statistics:
- Nodes: 875713
- Edges: 5105039
- Features: None
- Labels: None
Reference: J. Leskovec, A. Rajaraman, J. Ullman, “Mining of Massive Datasets.” Dataset from SNAP: https://snap.stanford.edu/data/web-Google.html
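As a rough illustration of the directed-edge convention described above, the following sketch parses a few lines in the SNAP edge-list layout (header comments starting with `#`, then one whitespace-separated `FromNodeId ToNodeId` pair per line) into a successor map. The function name `parse_snap_edge_list` and the sample lines are hypothetical and chosen for illustration only; this is not how EasyGraph itself loads the dataset.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def parse_snap_edge_list(lines: Iterable[str]) -> Dict[int, List[int]]:
    """Build a successor map {u: [v, ...]} from SNAP-style edge-list lines.

    Hypothetical helper for illustration; not part of EasyGraph.
    """
    successors: Dict[int, List[int]] = defaultdict(list)
    for line in lines:
        line = line.strip()
        # Skip blank lines and '#' header comments.
        if not line or line.startswith("#"):
            continue
        u, v = (int(tok) for tok in line.split()[:2])
        successors[u].append(v)       # directed: page u links to page v
        successors.setdefault(v, [])  # register v as a node even with no out-links
    return dict(successors)


# Tiny made-up sample in the same layout (not real Web-Google data).
sample = [
    "# FromNodeId\tToNodeId",
    "0\t1",
    "0\t2",
    "2\t0",
]
graph = parse_snap_edge_list(sample)
num_nodes = len(graph)
num_edges = sum(len(targets) for targets in graph.values())
print("Nodes:", num_nodes)
print("Edges:", num_edges)
```

Note that the edge 0 → 2 and the edge 2 → 0 are counted separately, since each hyperlink is directed.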
- class easygraph.datasets.web_google.WebGoogleDataset(raw_dir=None, force_reload=False, verbose=True, transform=None)
Bases: EasyGraphBuiltinDataset
Web-Google hyperlink network dataset.
- Parameters:
raw_dir (str, optional) – Directory to store the raw downloaded files. Default: None
force_reload (bool, optional) – Whether to re-download and process the dataset. Default: False
verbose (bool, optional) – Whether to print detailed processing logs. Default: True
transform (callable, optional) – Optional transform to apply on the graph. Default: None
Examples
>>> from easygraph.datasets import WebGoogleDataset
>>> dataset = WebGoogleDataset()
>>> g = dataset[0]
>>> print("Nodes:", g.number_of_nodes())
>>> print("Edges:", g.number_of_edges())