amazon module
This module provides a function for loading an Amazon dataset.
The dataset consists of reviews for products in six categories. The list of categories is defined in CATEGORIES. If you pass one category, or a list of categories chosen from that list, to load(), the function loads only the reviews for products that belong to the given categories.
This package also provides a helper function, print_state(), which outputs the state of a graph object.
To use both functions, the graph object must implement the graph interface.
The following table shows the number of reviewers for each rating score:
Rating score | Number of reviewers |
---|---|
1.0 | 26754 |
2.0 | 16964 |
3.0 | 20294 |
4.0 | 57011 |
5.0 | 148373 |
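As a quick sanity check on the table above, the following sketch computes the total count, the share of each rating score, and the count-weighted mean score. The counts are copied from the table; the variable names are illustrative and are not part of the module.

```python
# Counts copied from the table above: rating score -> number of reviewers.
rating_counts = {
    1.0: 26754,
    2.0: 16964,
    3.0: 20294,
    4.0: 57011,
    5.0: 148373,
}

total = sum(rating_counts.values())

# Fraction of the total that each rating score accounts for.
shares = {score: count / total for score, count in rating_counts.items()}

# Count-weighted mean of the rating scores.
mean_rating = sum(score * count for score, count in rating_counts.items()) / total

for score in sorted(rating_counts):
    print(f"{score:.1f}: {rating_counts[score]:>6} ({shares[score]:.1%})")
print(f"total: {total}, mean rating: {mean_rating:.2f}")
```

The distribution is skewed toward high scores: more than half of the counts fall on 5.0, which is worth keeping in mind when interpreting anomaly scores computed from this dataset.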
amazon.CATEGORIES = ['cameras', 'laptops', 'mobilephone', 'tablets', 'TVs', 'video_surveillance']
Categories available in this dataset.
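Because load() accepts either one category or a list of categories, it can help to normalize and validate the selection before loading. The sketch below is one possible way to do that; normalize_categories is a hypothetical helper, not a function of this module, and treating None as "every category" is an assumption made for the example.

```python
# Copied from the module documentation above.
CATEGORIES = ['cameras', 'laptops', 'mobilephone', 'tablets', 'TVs', 'video_surveillance']


def normalize_categories(categories=None):
    """Return a validated list of category names (hypothetical helper).

    Accepts None (assumed here to mean every category), a single name,
    or an iterable of names.
    """
    if categories is None:
        return list(CATEGORIES)
    if isinstance(categories, str):
        categories = [categories]
    selected = list(categories)
    # Names are compared case-sensitively, so 'tvs' does not match 'TVs'.
    unknown = [c for c in selected if c not in CATEGORIES]
    if unknown:
        raise ValueError(f"unknown categories: {unknown}")
    return selected


print(normalize_categories("cameras"))
print(normalize_categories(["laptops", "TVs"]))
```

Validating up front turns a misspelled category into an immediate error instead of a silently empty or partial load.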
amazon.load(graph, categories=None)
Load the Amazon dataset into a given graph object.
The graph object must implement the graph interface.
If a list of categories is given, only reviews that belong to one of the given categories are added to the graph.
Parameters:
- graph – an instance of a bipartite graph.
- categories – optional; a category or a list of categories chosen from CATEGORIES.
Returns: The graph instance graph.
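The category filtering described above can be illustrated with a toy stand-in. Everything in this sketch is an assumption made for illustration: ToyGraph, its method names (new_reviewer, new_product, add_review), the load_toy function, and the records are all hypothetical, and the real graph interface may differ. It only mirrors the documented behavior: reviews outside the given categories are skipped, and the graph is returned.

```python
from collections import namedtuple

Review = namedtuple("Review", "reviewer product rating")


class ToyGraph:
    """Toy bipartite graph; the method names are assumed, not taken from this page."""

    def __init__(self):
        self.reviewers = set()
        self.products = set()
        self.reviews = []

    def new_reviewer(self, name):
        self.reviewers.add(name)
        return name

    def new_product(self, name):
        self.products.add(name)
        return name

    def add_review(self, reviewer, product, rating):
        review = Review(reviewer, product, rating)
        self.reviews.append(review)
        return review


# Made-up records: (category, reviewer ID, product ID, rating).
TOY_RECORDS = [
    ("cameras", "r1", "p1", 5.0),
    ("laptops", "r2", "p2", 3.0),
    ("cameras", "r2", "p1", 4.0),
]


def load_toy(graph, records, categories=None):
    """Add records to graph, keeping only the given categories (hypothetical)."""
    if isinstance(categories, str):
        categories = [categories]
    for category, reviewer_id, product_id, rating in records:
        if categories is not None and category not in categories:
            continue  # Skip reviews outside the requested categories.
        reviewer = graph.new_reviewer(reviewer_id)
        product = graph.new_product(product_id)
        graph.add_review(reviewer, product, rating)
    return graph


graph = load_toy(ToyGraph(), TOY_RECORDS, categories=["cameras"])
print(len(graph.reviews), sorted(graph.reviewers), sorted(graph.products))
```

With the real module, the equivalent call would be amazon.load(graph, categories=["cameras"]), where graph is an object that implements the actual graph interface.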
amazon.print_state(g, i, output=sys.stdout)
Print the current state of a given graph.
This function outputs the current state of a graph as a set of JSON objects. Graph objects must have two properties, reviewers and products, which return the collections of reviewers and products, respectively. See the graph interface for more information.
In this output format, each line represents a reviewer or a product object.
Reviewer objects are defined as
{
  "iteration": <the iteration number given as i>,
  "reviewer": {
    "reviewer_id": <Reviewer's ID>,
    "score": <Anomalous score of the reviewer>
  }
}
Product objects are defined as
{
  "iteration": <the iteration number given as i>,
  "reviewer": {
    "product_id": <Product's ID>,
    "sumarry": <Summary of the reviews for the product>
  }
}
Parameters:
- g – Graph instance.
- i – Iteration number.
- output – A writable object (default: sys.stdout).
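Since each output line is a standalone JSON object, the output can be read back line by line. The sketch below parses a made-up sample that follows the object definitions above verbatim; parse_state and the sample values are hypothetical. Records are classified by the ID field inside the payload (reviewer_id or product_id) rather than by the outer key, so the parser does not depend on how the payload is wrapped.

```python
import io
import json

# Made-up sample in the documented one-object-per-line format.
SAMPLE_OUTPUT = "\n".join([
    json.dumps({"iteration": 0, "reviewer": {"reviewer_id": "r1", "score": 0.8}}),
    json.dumps({"iteration": 0, "reviewer": {"reviewer_id": "r2", "score": 0.1}}),
    json.dumps({"iteration": 0, "reviewer": {"product_id": "p1", "sumarry": 4.5}}),
]) + "\n"


def parse_state(stream):
    """Split print_state-style lines into reviewer and product records."""
    reviewers, products = [], []
    for line in stream:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        iteration = record.pop("iteration")
        # Whatever key wraps the payload, classify by the ID field inside it.
        for payload in record.values():
            if "reviewer_id" in payload:
                reviewers.append((iteration, payload["reviewer_id"], payload["score"]))
            elif "product_id" in payload:
                products.append((iteration, payload["product_id"], payload))
    return reviewers, products


reviewers, products = parse_state(io.StringIO(SAMPLE_OUTPUT))
print(reviewers)
print([product_id for _, product_id, _ in products])
```

In practice, any writable object can be passed as output to print_state (for example an io.StringIO or an open file), and the captured text can then be fed to a parser like this one.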