fraud_eagle package

An implementation of the Fraud Eagle algorithm.

The algorithm was introduced by Leman Akoglu et al. at ICWSM 2013.

Submodules

fraud_eagle.constants module

Defines constants used in the Fraud Eagle package.

fraud_eagle.constants.BAD = 'bad'

Constant representing the bad label for products.

fraud_eagle.constants.FRAUD = 'fraud'

Constant representing the fraud label for users.

fraud_eagle.constants.GOOD = 'good'

Constant representing the good label for products.

fraud_eagle.constants.HONEST = 'honest'

Constant representing the honest label for users.

fraud_eagle.constants.MINUS = 'minus'

Constant representing a “-” review.

fraud_eagle.constants.PLUS = 'plus'

Constant representing a “+” review.

fraud_eagle.graph module

Provides a bipartite graph class implementing the Fraud Eagle algorithm.

fraud_eagle.graph.LOGGER = <logging.Logger object>

Logging object.

class fraud_eagle.graph.Product(graph, name)[source]

Bases: fraud_eagle.graph._Node

Product node in ReviewGraph.

Each product has a summary of its ratings. In Fraud Eagle, we use the weighted average of the ratings given to the product as the summary. The weights are the anomalous scores of the reviewers.

Thus, letting \(r_{i}\) be the rating given by the \(i\)-th reviewer and \(a_{i}\) be the anomalous score of the \(i\)-th reviewer, the summary of the product is defined as

\[\frac{\sum_{i}a_{i}r_{i}}{\sum_{i}a_{i}}\]
summary

Summary of ratings given to this product.
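The weighted average above can be sketched in a few lines. This is only an illustration, not the package's code: the name weighted_summary and its arguments are hypothetical, and the sketch assumes the anomalous scores do not sum to zero.

```python
def weighted_summary(ratings, scores):
    """Average the ratings, weighting each by its reviewer's anomalous score."""
    if len(ratings) != len(scores):
        raise ValueError("ratings and scores must have the same length")
    total = sum(scores)
    if total == 0:
        raise ValueError("the anomalous scores must not sum to zero")
    return sum(a * r for a, r in zip(scores, ratings)) / total


# Two reviewers: ratings 1.0 and 0.0, anomalous scores 0.25 and 0.75.
print(weighted_summary([1.0, 0.0], [0.25, 0.75]))
```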

class fraud_eagle.graph.Review(rating)[source]

Bases: object

Review represents an edge in the bipartite graph.

A review is an edge in the bipartite graph connecting a user to a product that the user reviews. The review has a score that the user gives to the product. Additionally, in Fraud Eagle, each review has two message functions: the message from the user to the product, and vice versa. Each message function accepts only two input values. For example, the message from the user to the product takes one of {good, bad}.

To implement these message functions, the Review class maintains four values, one for each function and each input. It also provides the message functions as methods.

A review also has a rating score given by a user to a product. We assume this score is normalized to \([0, 1]\). Fraud Eagle treats the score as a binary value, i.e. + or -. To implement this, we use a threshold of 0.5 to decide whether each rating belongs to the + group or the - group, and the evaluation property returns the corresponding label. In other words, for a review r,

\[\begin{split}r.evaluation = \begin{cases} PLUS \quad (r.rating \geq 0.5) \\ MINUS \quad (otherwise) \end{cases}\end{split}\]
rating

the normalized rating of this review.

evaluation

Returns a label of this review.

If the rating is greater than or equal to \(0.5\), PLUS is returned. Otherwise, MINUS is returned.
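The thresholding rule can be sketched as follows. The constants mirror the values documented in fraud_eagle.constants, while the standalone function evaluation and its threshold argument are hypothetical and exist only for illustration.

```python
PLUS = "plus"
MINUS = "minus"


def evaluation(rating, threshold=0.5):
    """Map a rating normalized to [0, 1] onto the PLUS or MINUS label."""
    return PLUS if rating >= threshold else MINUS


print(evaluation(0.5))   # a rating exactly at the threshold counts as PLUS
print(evaluation(0.49))
```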

product_to_user(label)[source]

Message function from the product to the user associated with this review.

The argument label must be one of {HONEST, FRAUD}.

This method returns the logarithm of the value of the message function for a given label.

Parameters:label – label of the user.
Returns:the logarithm of \(m_{p\rightarrow u}(label)\), where \(u\) and \(p\) are the user and the product, respectively.
Raises:ValueError – if the given label isn’t one of {HONEST, FRAUD}.
update_product_to_user(label, value)[source]

Update product-to-user message value.

The argument label must be one of {HONEST, FRAUD}.

Note that this method doesn’t normalize any given values.

Parameters:
  • label – user label,
  • value – new message value.
Raises:

ValueError – if the given label isn’t one of {HONEST, FRAUD}.

update_user_to_product(label, value)[source]

Update user-to-product message value.

The argument label must be one of {GOOD, BAD}.

Note that this method doesn’t normalize any given values.

Parameters:
  • label – product label,
  • value – new message value.
Raises:

ValueError – if the given label isn’t one of {GOOD, BAD}.

user_to_product(label)[source]

Message function from the user to the product associated with this review.

The argument label must be one of {GOOD, BAD}.

This method returns the logarithm of the value of the message function for a given label.

Parameters:label – label of the product.
Returns:the logarithm of \(m_{u\rightarrow p}(label)\), where \(u\) and \(p\) are the user and the product, respectively.
Raises:ValueError – if the given label isn’t one of {GOOD, BAD}.
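The message interface of Review described above (four stored values, updates that do not normalize, and getters that return logarithms) can be pictured with the following stand-in. ReviewSketch is a hypothetical class written for illustration, not the package's implementation; in particular, initializing every message to 1 is an assumption.

```python
import math

HONEST, FRAUD = "honest", "fraud"
GOOD, BAD = "good", "bad"


class ReviewSketch:
    """Hypothetical stand-in for Review that stores four message values."""

    def __init__(self, rating):
        self.rating = rating
        # One value per message function and per input label (assumed start: 1).
        self._user_to_product = {GOOD: 1.0, BAD: 1.0}
        self._product_to_user = {HONEST: 1.0, FRAUD: 1.0}

    @staticmethod
    def _check(label, table):
        if label not in table:
            raise ValueError("invalid label: {!r}".format(label))

    def update_user_to_product(self, label, value):
        self._check(label, self._user_to_product)
        self._user_to_product[label] = value  # stored as given, not normalized

    def update_product_to_user(self, label, value):
        self._check(label, self._product_to_user)
        self._product_to_user[label] = value  # stored as given, not normalized

    def user_to_product(self, label):
        self._check(label, self._user_to_product)
        return math.log(self._user_to_product[label])

    def product_to_user(self, label):
        self._check(label, self._product_to_user)
        return math.log(self._product_to_user[label])


review = ReviewSketch(0.8)
review.update_product_to_user(FRAUD, 0.5)
print(review.product_to_user(FRAUD))  # logarithm of the stored value 0.5
```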
class fraud_eagle.graph.ReviewGraph(epsilon)[source]

Bases: object

A bipartite graph modeling the relationships between reviewers and products.

graph

Graph object of networkx.

reviewers

A collection of reviewers.

products

A collection of products.

epsilon

Hyperparameter \(\epsilon\) of the likelihood; see fraud_eagle.likelihood.psi().

add_review(reviewer, product, rating, _time=None)[source]

Add a review from a given reviewer to a product.

Parameters:
  • reviewer – reviewer of the review,
  • product – product of the review,
  • rating – rating score of the review.
Returns:

a new review.

new_product(name)[source]

Create a new product and add it to this graph.

Parameters:name – name of the new product.
Returns:a new product.
new_reviewer(name, anomalous=None)[source]

Create a new reviewer and add it to this graph.

Parameters:
  • name – name of the new reviewer,
  • anomalous – default anomalous score (not used in this method).
Returns:

a new reviewer.

prod_message_from_products(reviewer, product, ulabel)[source]

Compute the product of the messages sent to a reviewer, excluding the message from a given product.

This helper function computes the logarithm of the following product of messages:

\[\prod_{Y_{k} \in \cal{N}_{i} \cap \cal{Y}^{\cal{P}}/product} m_{k \rightarrow i}(y_{i}),\]

where \(\cal{N}_{i} \cap \cal{Y}^{\cal{P}}/product\) is the set of products the given reviewer reviews, excluding the given product, and \(y_{i}\) is a user label, one of {HONEST, FRAUD}.

If product is None, the product of all messages sent to the reviewer is computed.

Parameters:
  • reviewer – reviewer object,
  • product – product object, can be None,
  • ulabel – user label.
Returns:

the logarithm of the product defined above.
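Because the message getters return logarithms, the logarithm of a product of messages is simply a sum of log-messages. The sketch below illustrates the exclusion rule with a plain dictionary; log_prod_excluding and the sender keys "p1", "p2", "p3" are hypothetical and do not reflect how the package stores messages.

```python
import math


def log_prod_excluding(log_messages, excluded=None):
    """Sum log-messages keyed by sender, skipping `excluded` when it is given."""
    return sum(value for sender, value in log_messages.items()
               if sender != excluded)


# Log-messages sent to one reviewer by three products, for a fixed user label.
log_messages = {
    "p1": math.log(0.5),
    "p2": math.log(0.2),
    "p3": math.log(0.4),
}

# Excluding p2 leaves log(0.5 * 0.4); passing None keeps all three messages.
print(log_prod_excluding(log_messages, "p2"))
print(log_prod_excluding(log_messages))
```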

prod_message_from_users(reviewer, product, plabel)[source]

Compute the product of the messages sent to a product, excluding the message from a given reviewer.

This helper function computes the logarithm of the following product of messages:

\[\prod_{Y_{k} \in \cal{N}_{j} \cap \cal{Y}^{\cal{U}}/user} m_{k\rightarrow j}(y_{j}),\]

where \(\cal{N}_{j} \cap \cal{Y}^{\cal{U}}/user\) is the set of reviewers who review the given product, excluding the given reviewer, and \(y_{j}\) is a product label, one of {GOOD, BAD}.

If reviewer is None, the product of all messages sent to the product is computed.

Parameters:
  • reviewer – Reviewer, can be None,
  • product – Product,
  • plabel – product label.
Returns:

the logarithm of the product defined above.

retrieve_products(*args)[source]

Retrieve products a given reviewer reviews.

Parameters:reviewer – Reviewer.
Returns:a collection of products the given reviewer reviews.
Raises:ValueError – if the given reviewer isn’t an instance of Reviewer.
retrieve_review(*args)[source]

Retrieve a review a given reviewer posts to a given product.

Parameters:
  • reviewer – Reviewer,
  • product – Product.
Returns:

a review associated with the given reviewer and product.

Raises:

ValueError – if the given reviewer isn’t an instance of Reviewer or the given product isn’t an instance of Product.

retrieve_reviewers(*args)[source]

Retrieve reviewers who review a given product.

Parameters:product – Product.
Returns:a collection of reviewers who review the product.
Raises:ValueError – if the given product isn’t an instance of Product.
update()[source]

Update reviewers’ anomalous scores and products’ summaries.

For each user \(u\), update the messages to every product \(p\) the user reviews. The message function \(m_{u\rightarrow p}\) takes one argument, i.e. the label of the receiving product. The label is one of {good, bad}. Therefore, we need to compute the updated \(m_{u\rightarrow p}(good)\) and \(m_{u\rightarrow p}(bad)\).

The updated messages are defined as

\[m_{u\rightarrow p}(y_{j}) \leftarrow \alpha_{1} \sum_{y_{i} \in \cal{L}_{\cal{U}}} \psi_{ij}^{s}(y_{i}, y_{j}) \phi^{\cal{U}}_{i}(y_{i}) \prod_{Y_{k} \in \cal{N}_{i} \cap \cal{Y}^{\cal{P}}/p} m_{k \rightarrow i}(y_{i}),\]

where \(y_{j} \in \{good, bad\}\), and \(\cal{N}_{i} \cap \cal{Y}^{\cal{P}}/p\) is the set of products the user \(u\) reviews, excluding product \(p\).
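As a worked example of the user-to-product update, the sketch below evaluates the sum for one edge. It is not the package's code: updated_user_to_product, psi_table, and incoming are hypothetical names, the prior is fixed at 2 as documented for fraud_eagle.prior, and \(\alpha_{1}\) is assumed to normalize the two message values so that they sum to 1.

```python
HONEST, FRAUD = "honest", "fraud"
GOOD, BAD = "good", "bad"


def updated_user_to_product(psi_table, incoming, prior=2.0):
    """Return normalized m_{u->p}(good) and m_{u->p}(bad) for one edge.

    psi_table maps (user_label, product_label) to the likelihood for the
    review on this edge; incoming maps each user label to the product of
    the messages received from the user's other products.
    """
    raw = {}
    for plabel in (GOOD, BAD):
        raw[plabel] = sum(
            psi_table[(ulabel, plabel)] * prior * incoming[ulabel]
            for ulabel in (HONEST, FRAUD)
        )
    total = sum(raw.values())  # alpha_1 is taken to be 1 / total (assumption)
    return {plabel: value / total for plabel, value in raw.items()}


# Likelihoods for a "+" review with epsilon = 0.1.
psi_plus = {
    (HONEST, GOOD): 0.9, (HONEST, BAD): 0.1,
    (FRAUD, GOOD): 0.2, (FRAUD, BAD): 0.8,
}
message = updated_user_to_product(psi_plus, {HONEST: 0.6, FRAUD: 0.4})
print(message)
```

The product-to-user update has the same shape, with the roles of the user and product labels swapped.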

For each product \(p\), update the messages to every user \(u\) who reviews the product. The message function \(m_{p\rightarrow u}\) takes one argument, i.e. the label of the receiving user. The label is one of {honest, fraud}. Thus, we need to compute the updated \(m_{p\rightarrow u}(honest)\) and \(m_{p\rightarrow u}(fraud)\).

The updated messages are defined as

\[m_{p\rightarrow u}(y_{i}) \leftarrow \alpha_{3} \sum_{y_{j} \in \cal{L}_{\cal{P}}} \psi_{ij}^{s}(y_{i}, y_{j}) \phi^{\cal{P}}_{j}(y_{j}) \prod_{Y_{k} \in \cal{N}_{j} \cap \cal{Y}^{\cal{U}}/u} m_{k\rightarrow j}(y_{j}),\]

where \(y_{i} \in \{honest, fraud\}\), and \(\cal{N}_{j} \cap \cal{Y}^{\cal{U}}/u\) is the set of users who review the product \(p\), excluding user \(u\).

This method runs one iteration of updates for both reviewers (i.e. users) and products. It returns the maximum difference between an old message value and the associated new message value. You can stop iterating when this difference becomes sufficiently small.

Returns:maximum difference between an old message value and its updated new value.
class fraud_eagle.graph.Reviewer(graph, name)[source]

Bases: fraud_eagle.graph._Node

Reviewer node in ReviewGraph.

Each reviewer has an anomalous_score property. In Fraud Eagle, we use the belief that this reviewer is a fraud reviewer as the anomalous score.

The belief is defined as

\[b(y_{i}) = \alpha_{2} \phi^{\cal{U}}_{i}(y_{i}) \prod_{Y_{j} \in \cal{N}_{i} \cap \cal{Y}_{\cal{P}}} m_{j \rightarrow i}(y_{i}),\]

where \(y_{i}\) is a user label, one of {honest, fraud}, and \(\cal{N}_{i} \cap \cal{Y}_{\cal{P}}\) is the set of products this reviewer reviews. \(\alpha_{2}\) is a normalizing constant chosen so that \(b(honest) + b(fraud) = 1\).

Thus, we use \(b(fraud)\) as the anomalous score.

anomalous_score

Anomalous score of this reviewer.
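Since the priors and messages are handled as logarithms, the normalization by \(\alpha_{2}\) can be carried out in log space. The sketch below is one way to do this and is not taken from the package: fraud_belief is a hypothetical helper whose inputs are the unnormalized log-beliefs for the two labels, and subtracting the larger value before exponentiating is a choice made here for numerical stability.

```python
import math


def fraud_belief(log_b_honest, log_b_fraud):
    """Normalize two unnormalized log-beliefs and return b(fraud)."""
    shift = max(log_b_honest, log_b_fraud)  # avoids underflow in exp()
    honest = math.exp(log_b_honest - shift)
    fraud = math.exp(log_b_fraud - shift)
    return fraud / (honest + fraud)


# Unnormalized beliefs 3 (honest) and 1 (fraud) give b(fraud) = 1 / 4.
print(fraud_belief(math.log(3.0), math.log(1.0)))
```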

fraud_eagle.likelihood module

Define likelihood functions.

This module defines a likelihood of a pair of user and product. See psi() for the detailed definition of the likelihood.

fraud_eagle.likelihood.psi(user, product, review, epsilon)[source]

Likelihood of a pair of user and product.

The likelihood depends on the review the user gives the product. The review is one of {+, -}. We define constants representing “+” and “-”, so the review is one of {PLUS, MINUS}. The remaining argument, epsilon, is a given parameter.

The likelihood \(\psi_{ij}^{s}\), where \(i\) and \(j\) are the indexes of the user and the product, respectively, and \(s\) is a review, i.e. \(s \in \{+, -\}\), is given by the following tables.

If the review is PLUS,

review: +      | Product: Good      | Product: Bad
User: Honest   | \(1 - \epsilon\)   | \(\epsilon\)
User: Fraud    | \(2\epsilon\)      | \(1 - 2\epsilon\)

If the review is MINUS,

review: -      | Product: Good      | Product: Bad
User: Honest   | \(\epsilon\)       | \(1 - \epsilon\)
User: Fraud    | \(1 - 2\epsilon\)  | \(2\epsilon\)
Parameters:
  • user – user label, which must be one of {HONEST, FRAUD}.
  • product – product label, which must be one of {GOOD, BAD}.
  • review – review label, which must be one of {PLUS, MINUS}.
  • epsilon – a float parameter in \([0,1]\).
Returns:

Float value representing a likelihood of the given values.
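The two tables translate directly into a lookup. The function below is a sketch written from the tables, not the package's implementation: psi_sketch is a hypothetical name, the string constants mirror the values documented in fraud_eagle.constants, and raising ValueError for unknown labels is an assumption.

```python
HONEST, FRAUD = "honest", "fraud"
GOOD, BAD = "good", "bad"
PLUS, MINUS = "plus", "minus"


def psi_sketch(user, product, review, epsilon):
    """Look up the likelihood of (user label, product label) for a review."""
    plus_table = {
        (HONEST, GOOD): 1 - epsilon, (HONEST, BAD): epsilon,
        (FRAUD, GOOD): 2 * epsilon, (FRAUD, BAD): 1 - 2 * epsilon,
    }
    minus_table = {
        (HONEST, GOOD): epsilon, (HONEST, BAD): 1 - epsilon,
        (FRAUD, GOOD): 1 - 2 * epsilon, (FRAUD, BAD): 2 * epsilon,
    }
    tables = {PLUS: plus_table, MINUS: minus_table}
    try:
        return tables[review][(user, product)]
    except KeyError:
        raise ValueError("invalid label among: {!r}, {!r}, {!r}".format(
            user, product, review))


print(psi_sketch(HONEST, GOOD, PLUS, 0.1))
print(psi_sketch(FRAUD, GOOD, MINUS, 0.1))
```

In both tables each row sums to 1, so for a fixed user label and review the entries form a distribution over the product labels.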

fraud_eagle.prior module

Define prior beliefs of users and products.

fraud_eagle.prior.phi_p(product)[source]

Logarithm of a prior belief of a product.

The definition is

\[\phi_{j}^{\cal{P}}: \cal{L}_{\cal{P}} \rightarrow \mathbb{R}_{\geq 0},\]

where \(\cal{P}\) is a set of product nodes, \(\cal{L}_{\cal{P}}\) is a set of product labels, and \(\mathbb{R}_{\geq 0}\) is the set of real numbers greater than or equal to \(0\).

The implementation of this mapping is given as

\[\phi_{j}^{\cal{P}}(y_{j}) \leftarrow \|\cal{L}_{\cal{P}}\|.\]

Since \(\cal{L}_{\cal{P}}\) is given as {good, bad}, the mapping returns \(2\) regardless of the given product.

This function returns the logarithm of such \(\phi_{j}\), i.e. \(\log(\phi_{j}(p))\) for any product \(p\).

Parameters:product – Product object.
Returns:The logarithm of the prior belief of the label of the given product. It returns \(\log 2\) regardless of the given product.
fraud_eagle.prior.phi_u(user)[source]

Logarithm of a prior belief of a user.

The definition is

\[\phi_{i}^{\cal{U}}: \cal{L}_{\cal{U}} \rightarrow \mathbb{R}_{\geq 0},\]

where \(\cal{U}\) is a set of user nodes, \(\cal{L}_{\cal{U}}\) is a set of user labels, and \(\mathbb{R}_{\geq 0}\) is the set of real numbers greater than or equal to \(0\).

The implementation of this mapping is given as

\[\phi_{i}^{\cal{U}}(y_{i}) \leftarrow \|\cal{L}_{\cal{U}}\|.\]

Since \(\cal{L}_{\cal{U}}\) is given as {honest, fraud}, the mapping returns \(\phi_{i} = 2\) for any user.

This function returns the logarithm of such \(\phi_{i}\), i.e. \(\log(\phi_{i}(u))\) for any user \(u\).

Parameters:user – User object.
Returns:The logarithm of the prior belief of the label of the given user. It returns \(\log 2\) regardless of the given user.
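Both priors reduce to the logarithm of the number of labels. The sketch below restates that under hypothetical names (phi_u_sketch, phi_p_sketch, and the two label tuples); it is written from the description above rather than from the package source.

```python
import math

USER_LABELS = ("honest", "fraud")
PRODUCT_LABELS = ("good", "bad")


def phi_u_sketch(_user):
    """Log-prior of a user label; the same for every user."""
    return math.log(len(USER_LABELS))


def phi_p_sketch(_product):
    """Log-prior of a product label; the same for every product."""
    return math.log(len(PRODUCT_LABELS))


print(phi_u_sketch("any user") == phi_p_sketch("any product") == math.log(2))
```

Because the prior takes the same value for every label, it cancels whenever beliefs or messages are normalized.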