ria package

Implement the Repeated Improvement Analysis (RIA) and Mutually Reinforcing Analysis (MRA).

We introduced RIA in the 7th Forum on Data Engineering and Information Management (DEIM 2015) and MRA in the 22nd International Conference on Database and Expert Systems Applications (DEXA 2011).

This package also provides another analysis algorithm, One, which employs one anomalous index proposed by Ee-Peng Lim et al. in the 19th International Conference on Information and Knowledge Management (CIKM 2010), and its extended version, OneSum.

The top-level module of this package provides four constructor functions, each of which creates a review graph implementing one of the four algorithms above.

ria.mra_graph()[source]

Create a review graph providing MRA algorithm.

Returns:A review graph.
ria.one_graph()[source]

Create a review graph providing One algorithm.

Returns:A review graph.
ria.one_sum_graph()[source]

Create a review graph providing OneSum algorithm.

Returns:A review graph.
ria.ria_graph(alpha)[source]

Create a review graph providing RIA algorithm with a parameter alpha.

Parameters:alpha – Parameter.
Returns:A review graph.

Submodules

ria.bipartite module

Provide classes for review mining algorithms based on a bipartite model.

In particular, the following three classes form the core of the algorithm.

BipartiteGraph
Representing the bipartite graph model. It provides basic methods to handle the graph and an update method to compute new anomalous scores for reviewers and new summaries for products.
Reviewer
Representing a reviewer. It is modeled as a node in the bipartite graph. Each reviewer has an anomalous score.
Product
Representing a product. It is also modeled as a node in the bipartite graph. Each product has a summary score of the given review scores.

The bipartite model we use has two kinds of nodes: Reviewer and Product. A reviewer and a product are connected when the reviewer has reviewed the product. Each edge is directed and has a label representing the score the reviewer has given to the product.

In the bipartite graph model, both reviewers and products have scores. Each reviewer has an anomalous score which represents how anomalous the reviewer is. Each product has a summary of the reviews the product has received.

Here is a sample of the bipartite graph.

digraph bipartite {
   graph [label="Bipartite graph model.", rankdir = LR];
   "r1" [label="Reviewer 1\n(anomalous: 0.1)"];
   "r2" [label="Reviewer 2\n(anomalous: 0.9)"];
   "r3" [label="Reviewer 3\n(anomalous: 0.5)"];
   "p1" [label="Product 1\n(summary: 0.3)"];
   "p2" [label="Product 2\n(summary: 0.8)"];
   "r1" -> "p1" [label="0.3"];
   "r1" -> "p2" [label="0.9"];
   "r2" -> "p2" [label="0.1"];
   "r3" -> "p2" [label="0.5"];
 }
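To make the model concrete, the sample graph can also be written down as plain data. The following is a minimal sketch that does not use the ria package at all; the dictionary layout and the helper names products_of and reviewers_of are assumptions made for illustration only.

```python
# Review scores from the sample graph, keyed by (reviewer, product).
# This layout is illustrative only; it is not how ria stores the graph.
reviews = {
    ("r1", "p1"): 0.3,
    ("r1", "p2"): 0.9,
    ("r2", "p2"): 0.1,
    ("r3", "p2"): 0.5,
}


def products_of(reviewer):
    """Return the products the given reviewer has reviewed, sorted by name."""
    return sorted(p for (r, p) in reviews if r == reviewer)


def reviewers_of(product):
    """Return the reviewers who have reviewed the given product, sorted by name."""
    return sorted(r for (r, p) in reviews if p == product)


print(products_of("r1"))   # ['p1', 'p2']
print(reviewers_of("p2"))  # ['r1', 'r2', 'r3']
```

The two helpers play roughly the role that retrieve_products() and retrieve_reviewers() play on the graph class.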

This module defines the bipartite graph itself (bipartite.BipartiteGraph) and two kinds of nodes, bipartite.Reviewer and bipartite.Product.

There are also many variations of the bipartite graph.

class ria.bipartite.BipartiteGraph(summary=<class 'review.scalar.AverageSummary'>, alpha=1, credibility=<class 'ria.credibility.WeightedCredibility'>, reviewer=<class 'ria.bipartite.Reviewer'>, product=<class 'ria.bipartite.Product'>)[source]

Bases: object

Bipartite graph model for review data mining.

Parameters:
  • summary – specify summary type class, default value is AverageSummary.
  • alpha – used to compute weight of anomalous scores, default value is 1.
  • credibility – credibility class to be used in this graph. (Default: ria.credibility.WeightedCredibility)
  • reviewer – Class of reviewers.
  • product – Class of products.
alpha

Parameter.

graph

Graph object of networkx.

reviewers

Collection of reviewers.

products

Collection of products.

credibility

Credibility object.

add_review(reviewer, product, review, date=None)[source]

Add a new review from a given reviewer to a given product.

Parameters:
  • reviewer – an instance of Reviewer.
  • product – an instance of Product.
  • review – a float value.
  • date – date the review was issued.
Returns:

the added new review object.

Raises:

TypeError – when the given reviewer and product are not instances of the reviewer and product classes specified when this graph was constructed.

dump_credibilities(output)[source]

Dump credibilities of all products.

Parameters:output – a writable object.
new_product(name)[source]

Create a new product.

Parameters:name – name of the new product.
Returns:A new product instance.
new_reviewer(name, anomalous=None)[source]

Create a new reviewer.

Parameters:
  • name – name of the new reviewer.
  • anomalous – initial anomalous score. (default: None)
Returns:

A new reviewer instance.

retrieve_products(*args)[source]

Retrieve products reviewed by a given reviewer.

Parameters:reviewer – A reviewer.
Returns:A list of products which the reviewer reviews.
Raises:TypeError – when the given reviewer is not an instance of the reviewer class specified when this graph was constructed.
retrieve_review(*args)[source]

Retrieve the review that the given reviewer gave to the given product.

Parameters:
  • reviewer – An instance of Reviewer.
  • product – An instance of Product.
Returns:

A review object.

Raises:
  • TypeError – when the given reviewer and product are not instances of the reviewer and product classes specified when this graph was constructed.
  • KeyError – when the reviewer has not reviewed the product.
retrieve_reviewers(*args)[source]

Retrieve reviewers who reviewed a given product.

Parameters:product – A product specifying reviewers.
Returns:A list of reviewers who review the product.
Raises:TypeError – when the given product is not an instance of the product class specified when this graph was constructed.
to_pydot()[source]

Convert this graph to PyDot object.

Returns:PyDot object representing this graph.
update()[source]

Update reviewers’ anomalous scores and products’ summaries.

Returns:maximum absolute difference between old summary and new one, and old anomalous score and new one.
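Because update() returns the largest change it made, a caller can repeat it until that change becomes small. The following is a minimal sketch of such a loop, not code from the package: the function name run_until_converged, the threshold and max_iterations parameters, and the stand-in class HalvingGraph are all assumptions for illustration. Any object with an update() method that returns a number would work in place of the stand-in.

```python
def run_until_converged(graph, threshold=1e-5, max_iterations=100):
    """Call graph.update() until the returned difference falls below threshold.

    Returns the number of update() calls that were made.
    """
    for iteration in range(1, max_iterations + 1):
        diff = graph.update()
        if diff < threshold:
            return iteration
    return max_iterations


class HalvingGraph:
    """Stand-in for a review graph; each update() reports half the previous difference."""

    def __init__(self):
        self.diff = 1.0

    def update(self):
        self.diff /= 2
        return self.diff


# With the package installed, a graph created by one of the constructor
# functions (for example ria.ria_graph(alpha)) could be passed in instead.
iterations = run_until_converged(HalvingGraph(), threshold=0.01)
print(iterations)  # 7
```

The iteration cap guards against inputs for which the difference never drops below the threshold.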
class ria.bipartite.Product(graph, name=None, summary_cls=<class 'review.scalar.AverageSummary'>)[source]

Bases: ria.bipartite._Node

A node class representing Product.

Parameters:
  • graph – An instance of BipartiteGraph representing the parent graph.
  • name – Name of this node. (default: None)
  • summary_cls – Specify summary type. (default: AverageSummary)
summary

Summary of reviews for this product.

Initial summary is computed by

\[\frac{1}{|R|} \sum_{r \in R} \mbox{review}(r),\]

where \(\mbox{review}(r)\) means review from reviewer \(r\).

update_summary(w)[source]

Update summary.

The new summary is a weighted average of reviews i.e.

\[\frac{\sum_{r \in R} \mbox{weight}(r) \times \mbox{review}(r)} {\sum_{r \in R} \mbox{weight}(r)},\]

where \(R\) is a set of reviewers reviewing this product, \(\mbox{review}(r)\) and \(\mbox{weight}(r)\) are the review and weight of the reviewer \(r\), respectively.

Parameters:w – A weight function.
Returns:absolute difference between old summary and updated one.
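The weighted average above can be checked by hand with a few lines of plain Python. This is a sketch of the formula only, under the assumption that reviews and weights are available as dictionaries keyed by reviewer name; the function name weighted_summary and the weight values are hypothetical and not part of the package.

```python
def weighted_summary(reviews, weights):
    """Weighted average of review scores.

    Both arguments are dicts keyed by reviewer name.
    """
    total_weight = sum(weights[r] for r in reviews)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(weights[r] * reviews[r] for r in reviews) / total_weight


# Reviews of Product 2 from the sample graph, with hypothetical weights.
reviews = {"r1": 0.9, "r2": 0.1, "r3": 0.5}
weights = {"r1": 2.0, "r2": 1.0, "r3": 1.0}
summary = weighted_summary(reviews, weights)
print(round(summary, 6))  # 0.6
```

With equal weights the result reduces to the plain average used for the initial summary.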
class ria.bipartite.Reviewer(graph, credibility, name=None, anomalous=None)[source]

Bases: ria.bipartite._Node

A node class representing Reviewer.

Parameters:
  • graph – an instance of BipartiteGraph representing the parent graph.
  • credibility – an instance of credibility.Credibility to be used to update scores.
  • name – name of this node. (default: None)
  • anomalous – initial anomalous score. (default: None)
anomalous_score

Anomalous score of this reviewer.

Initial anomalous score is \(1 / |R|\) where \(R\) is a set of reviewers.

update_anomalous_score()[source]

Update anomalous score.

New anomalous score is a weighted average of differences between current summary and reviews. The weights come from credibilities.

Therefore, the new anomalous score of reviewer \(r\) is

\[{\rm anomalous}(r) = \frac{ \sum_{p \in P} {\rm credibility}(p)| {\rm review}(r, p)-{\rm summary}(p)| }{ \sum_{p \in P} {\rm credibility}(p) }\]

where \(P\) is a set of products reviewed by reviewer \(r\), review(\(r\), \(p\)) is the rating reviewer \(r\) posted to product \(p\), and summary(\(p\)) and credibility(\(p\)) are the summary and credibility of product \(p\), respectively.

Returns:absolute difference between old anomalous score and updated one.
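The formula above is a credibility-weighted mean of absolute deviations, which the following sketch spells out in plain Python. It is an illustration of the formula rather than the package's implementation: the function name anomalous_score, the dictionary inputs, and the credibility values are assumptions.

```python
def anomalous_score(reviews, summaries, credibilities):
    """Credibility-weighted mean absolute deviation of a reviewer's ratings.

    All three arguments are dicts keyed by product name.
    """
    total = sum(credibilities[p] for p in reviews)
    if total == 0:
        raise ValueError("credibilities must not sum to zero")
    weighted = sum(
        credibilities[p] * abs(reviews[p] - summaries[p]) for p in reviews
    )
    return weighted / total


# Reviewer 1 from the sample graph; the credibility values are hypothetical.
reviews = {"p1": 0.3, "p2": 0.9}
summaries = {"p1": 0.3, "p2": 0.8}
credibilities = {"p1": 1.0, "p2": 1.0}
score = anomalous_score(reviews, summaries, credibilities)
print(round(score, 6))  # 0.05
```

A reviewer whose ratings always match the product summaries gets a score of 0 under this formula.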

ria.bipartite_sum module

Provide a bipartite graph which implements OneSum algorithm.

The bipartite graph implemented in this module uses normalized summations for updated anomalous scores.

class ria.bipartite_sum.BipartiteGraph(**kwargs)[source]

Bases: ria.bipartite.BipartiteGraph

Bipartite Graph implementing OneSum algorithm.

This graph employs a normalized summation of deviation times credibility as the updated anomalous score for each reviewer.

The constructor receives the same arguments as ria.bipartite.BipartiteGraph, but the reviewer argument is ignored since this graph uses ria.bipartite_sum.Reviewer instead.

update()[source]

Update reviewers’ anomalous scores and products’ summaries.

The update consists of two steps:

Step1 (updating summaries):
Update summaries of products with anomalous scores of reviewers and the weight function. The weight is calculated in the same manner as in ria.bipartite.BipartiteGraph.
Step2 (updating anomalous scores):
Update the anomalous score of each reviewer by computing the summation of deviation times credibility. See Reviewer.update_anomalous_score() for more details. After that, the updated anomalous scores are normalized so that every value is in \([0, 1]\).
Returns:
maximum absolute difference between old summary and new one, and old anomalous score and new one. This value is not normalized and thus it may be greater than the actual normalized difference.
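The text above does not say which normalization is used in Step 2. One common way to bring a set of scores into \([0, 1]\) is min-max scaling, sketched below. Treat it as an assumption about what such a step could look like, not as the package's behavior; the function name min_max_normalize and the sample values are hypothetical.

```python
def min_max_normalize(scores):
    """Rescale scores linearly so the smallest becomes 0 and the largest 1."""
    low, high = min(scores.values()), max(scores.values())
    if high == low:
        # All scores are equal; mapping them to 0 is an arbitrary choice.
        return {name: 0.0 for name in scores}
    return {name: (value - low) / (high - low) for name, value in scores.items()}


# Hypothetical raw sums for three reviewers before normalization.
raw = {"r1": -0.2, "r2": 1.3, "r3": 0.4}
normalized = min_max_normalize(raw)
print({name: round(value, 6) for name, value in normalized.items()})
```

Whatever scheme is used, rescaling after the fact explains why the returned difference can exceed the difference between the normalized scores.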
class ria.bipartite_sum.Reviewer(graph, credibility, name=None, anomalous=None)[source]

Bases: ria.bipartite.Reviewer

Reviewer which uses normalized summations for updated anomalous scores.

This reviewer will update its anomalous score by computing summation of partial anomalous scores instead of using a weighted average.

update_anomalous_score()[source]

Update anomalous score.

New anomalous score is the summation of weighted differences between current summary and reviews. The weights come from credibilities.

Therefore, the new anomalous score is defined as

\[{\rm anomalous}(r) = \sum_{p \in P} \mbox{review}(p) \times \mbox{credibility}(p) - 0.5\]

where \(P\) is a set of products reviewed by this reviewer, review(\(p\)) and credibility(\(p\)) are review and credibility of product \(p\), respectively.

Returns:absolute difference between old anomalous score and updated one.

ria.credibility module

Defines functor classes computing credibility.

Credibility is a function-like class which has a __call__ method. This method receives only one argument, an instance of ria.bipartite.Product, and returns a credibility value.

This module has a helper base class GraphBasedCredibility which provides two helper functions traversing a bipartite graph.

The credibilities defined in this module are UniformCredibility and WeightedCredibility.

class ria.credibility.GraphBasedCredibility(g)[source]

Bases: object

Abstract class of credibility using a Bipartite graph.

Parameters:g – A bipartite graph instance.

This class provides two helper methods; reviewers() and review_score().

review_score(reviewer, product)[source]

Find a review score from a given reviewer to a product.

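Parameters:
  • reviewer – An instance of ria.bipartite.Reviewer.
  • product – An instance of ria.bipartite.Product.
Returns: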

A review object representing the review from the reviewer to the product.

reviewers(product)[source]

Find reviewers who have reviewed a given product.

Parameters:product – An instance of ria.bipartite.Product.
Returns:A list of reviewers who have reviewed the product.
class ria.credibility.UniformCredibility(*unused_args)[source]

Bases: object

Uniform credibility assigns 1 for every product.

Formally, this credibility is defined by

\[{\rm cred}(p) = 1,\]

where p is a product.

Uniform credibility does not use any arguments to construct.
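The function-like protocol described for this module is small enough to sketch in full. The class below is a hypothetical stand-in named ConstantCredibility, written to mirror the documented behavior (constructor arguments ignored, every product mapped to 1); it is not the package's source.

```python
class ConstantCredibility:
    """Function-like object that returns the same credibility for every product."""

    def __init__(self, *unused_args):
        # Accept and ignore any constructor arguments, for example a graph.
        pass

    def __call__(self, product):
        # The product is ignored because the credibility is uniform.
        return 1.0


cred = ConstantCredibility()
print(cred("any product"))  # 1.0
```

Accepting and discarding constructor arguments lets such a class be swapped in wherever a graph-based credibility class is expected.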

class ria.credibility.WeightedCredibility(g)[source]

Bases: ria.credibility.GraphBasedCredibility

Credibility using unbiased variance of review scores.

Parameters:g – an instance of bipartite graph.

The credibility computed by this class is defined by

\[\begin{split}{\rm cred}(p) = \begin{cases} 0.5 \quad \mbox{if} \; N_{p} = 1, \\ \frac{\log N_{p}}{\sigma^{2} + 1} \quad \mbox{otherwise}, \end{cases}\end{split}\]

where \(N_{p}\) is the number of reviews for the product p and \(\sigma^{2}\) is the unbiased variance of review scores. The unbiased variance is defined by

\[\sigma^{2} = \frac{1}{N_{p} - 1} \sum_{r \in R} \left( {\rm review}(r, p) - \frac{1}{N_{p}}\sum_{r' \in R} {\rm review}(r', p) \right)^{2},\]

where \({\rm review}(r, p)\) is a review from reviewer r to product p, and \(R\) is the set of reviewers who have reviewed p.
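The definition above translates directly into a few lines using the standard library. This sketch assumes the logarithm is the natural logarithm, which the text does not state, and takes the review scores as a plain list; the function name weighted_credibility is hypothetical.

```python
import math
import statistics


def weighted_credibility(scores):
    """Credibility of a product, given the list of review scores it received."""
    n = len(scores)
    if n == 0:
        raise ValueError("a product needs at least one review")
    if n == 1:
        return 0.5
    # statistics.variance computes the unbiased (sample) variance.
    # math.log is the natural logarithm; the base is an assumption here.
    return math.log(n) / (statistics.variance(scores) + 1)


# Reviews of Product 2 and Product 1 from the sample graph.
print(round(weighted_credibility([0.9, 0.1, 0.5]), 4))
print(weighted_credibility([0.3]))  # 0.5
```

Under this definition, credibility grows with the number of reviews and shrinks as the reviews disagree more.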

ria.one module

Provide a review graph which implements the One algorithm.

In the One algorithm, scores may be updated only once. Thus, the review graph defined in this module overrides ria.bipartite.BipartiteGraph.update() so that it takes effect only the first time it is called.

class ria.one.BipartiteGraph(**kwargs)[source]

Bases: ria.bipartite.BipartiteGraph

Bipartite graph implementing One algorithm.

updated

Whether update() has been called. If True, that method does nothing.

update()[source]

Update reviewers’ anomalous scores and products’ summaries.

Returns:maximum absolute difference between old summary and new one, and old anomalous score and new one.
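The once-only behavior can be illustrated with a small guard around any object that has an update() method. The sketch below uses a wrapper rather than the subclass this module describes, and the class names OnceOnly and CountingGraph are hypothetical. Returning 0 from later calls is also an assumption, since the text only says that the method does nothing once updated is True.

```python
class OnceOnly:
    """Wrap an object with update() so that only the first call has an effect."""

    def __init__(self, graph):
        self.graph = graph
        self.updated = False

    def update(self):
        if self.updated:
            # Returning 0 here is an assumption of this sketch.
            return 0
        self.updated = True
        return self.graph.update()


class CountingGraph:
    """Stand-in graph that counts how often update() is called."""

    def __init__(self):
        self.calls = 0

    def update(self):
        self.calls += 1
        return 0.25


inner = CountingGraph()
graph = OnceOnly(inner)
first = graph.update()
second = graph.update()
print(first, second, inner.calls)  # 0.25 0 1
```

A convergence loop that stops when the returned difference is small would therefore end right after the second call.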