fraud_eagle package

An implementation of the Fraud Eagle algorithm.

This algorithm was introduced by Leman Akoglu et al. at ICWSM 2013.

class fraud_eagle.ReviewGraph(epsilon: float)[source]

Bases: object

A bipartite graph modeling the relationships between reviewers and products.

Parameters:: epsilon – a hyperparameter in (0, 0.5).

add_review(reviewer: Reviewer, product: Product, rating: float, *_args: Any, **_kwargs: Any) → Review[source]

Add a review from a given reviewer to a product.

Parameters:

reviewer – reviewer of the review,
product – product of the review,
rating – rating score of the review.

Returns:

a new review.

new_product(name: str) → Product[source]

Create a new product and add it to this graph.

Parameters:: name – name of the new product.
Returns:: a new product.

new_reviewer(name: str, *_args: Any, **_kwargs: Any) → Reviewer[source]

Create a new reviewer and add it to this graph.

Parameters:: name – name of the new reviewer.
Returns:: a new reviewer.

prod_message_from_all_products(reviewer: Reviewer, u_label: UserLabel) → float[source]

Compute the product of all messages sent to a reviewer.

This helper function computes the logarithm of the product of messages

\[\prod_{Y_{k} \in \cal{N}_{i} \cap \cal{Y}^{\cal{P}}} m_{k \rightarrow i}(y_{i}),\]

where \(\cal{N}_{i} \cap \cal{Y}^{\cal{P}}\) is the set of products the given reviewer reviews, and \(y_{i}\) is a user label, one of {HONEST, FRAUD}.

No product is excluded: every message sent to the reviewer is included.

Parameters:

reviewer – reviewer object,
u_label – user label.

Returns:

a logarithm of the product defined above.

prod_message_from_all_users(product: Product, p_label: ProductLabel) → float[source]

Compute the product of all messages sent to a product.

This helper function computes the logarithm of the product of messages

\[\prod_{Y_{k} \in \cal{N}_{j} \cap \cal{Y}^{\cal{U}}} m_{k\rightarrow j}(y_{j}),\]

where \(\cal{N}_{j} \cap \cal{Y}^{\cal{U}}\) is the set of reviewers who review the given product, and \(y_{j}\) is a product label, one of {GOOD, BAD}.

No reviewer is excluded: every message sent to the product is included.

Parameters:

product – Product,
p_label – product label

Returns:

a logarithm of the product defined above.

prod_message_from_products(reviewer: Reviewer, product: Optional[Product], u_label: UserLabel) → float[source]

Compute the product of messages sent to a reviewer, excluding the message from one product.

This helper function computes the logarithm of the product of messages

\[\prod_{Y_{k} \in \cal{N}_{i} \cap \cal{Y}^{\cal{P}}/product} m_{k \rightarrow i}(y_{i}),\]

where \(\cal{N}_{i} \cap \cal{Y}^{\cal{P}}/product\) is the set of products the given reviewer reviews, excluding the given product, and \(y_{i}\) is a user label, one of {HONEST, FRAUD}.

If product is None, no product is excluded and the product of all messages sent to the reviewer is computed.

Parameters:

reviewer – reviewer object,
product – product object, can be None,
u_label – user label.

Returns:

a logarithm of the product defined above.
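
Since the helper returns a logarithm, the product above can be evaluated as a sum of log-messages. The following is a minimal sketch of that idea, not the package's implementation; the function name log_prod_messages and the dictionary of log-messages keyed by product name are assumptions made for illustration.

```python
import math
from typing import Hashable, Mapping, Optional


def log_prod_messages(
    log_messages: Mapping[Hashable, float],
    excluded: Optional[Hashable] = None,
) -> float:
    """Return the log of the product of messages, skipping `excluded` if given.

    `log_messages` maps each neighbor (hypothetical keys) to the logarithm of
    the message it sends. A product of values is a sum of their logarithms.
    """
    return sum(value for key, value in log_messages.items() if key != excluded)


# Three products send messages 0.5, 0.4 and 0.8 to one reviewer.
log_msgs = {"p1": math.log(0.5), "p2": math.log(0.4), "p3": math.log(0.8)}
total = log_prod_messages(log_msgs)                    # log(0.5 * 0.4 * 0.8)
without_p2 = log_prod_messages(log_msgs, excluded="p2")  # log(0.5 * 0.8)
```

Passing excluded=None mirrors the documented behavior for product=None: nothing is skipped.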

prod_message_from_users(reviewer: Optional[Reviewer], product: Product, p_label: ProductLabel) → float[source]

Compute the product of messages sent to a product, excluding the message from one reviewer.

This helper function computes the logarithm of the product of messages

\[\prod_{Y_{k} \in \cal{N}_{j} \cap \cal{Y}^{\cal{U}}/user} m_{k\rightarrow j}(y_{j}),\]

where \(\cal{N}_{j} \cap \cal{Y}^{\cal{U}}/user\) is the set of reviewers who review the given product, excluding the given reviewer, and \(y_{j}\) is a product label, one of {GOOD, BAD}.

If reviewer is None, no reviewer is excluded and the product of all messages sent to the product is computed.

Parameters:

reviewer – Reviewer, can be None,
product – Product,
p_label – product label

Returns:

a logarithm of the product defined above.

retrieve_products(reviewer: Reviewer) → list[fraud_eagle.graph.Product][source]

Retrieve products a given reviewer reviews.

Parameters:: reviewer – Reviewer.
Returns:: a collection of products the given reviewer reviews.

retrieve_review(reviewer: Reviewer, product: Product) → Review[source]

Retrieve a review a given reviewer posts to a given product.

Parameters:

reviewer – Reviewer,
product – Product,

Returns:

the review associated with the given reviewer and product.

retrieve_reviewers(product: Product) → list[fraud_eagle.graph.Reviewer][source]

Retrieve the reviewers who review a given product.

Parameters:: product – Product.
Returns:: a collection of reviewers who review the product.

update() → float[source]

Update reviewers’ anomalous scores and products’ summaries.

For each user \(u\), update the messages to every product \(p\) the user reviews. The message function \(m_{u\rightarrow p}\) takes one argument, the label of the receiving product, which is one of {good, bad}. Therefore, we need to compute the updated \(m_{u\rightarrow p}(good)\) and \(m_{u\rightarrow p}(bad)\).

The updated messages are defined as

\[m_{u\rightarrow p}(y_{j}) \leftarrow \alpha_{1} \sum_{y_{i} \in \cal{L}_{\cal{U}}} \psi_{ij}^{s}(y_{i}, y_{j}) \phi^{\cal{U}}_{i}(y_{i}) \prod_{Y_{k} \in \cal{N}_{i} \cap \cal{Y}^{\cal{P}}/p} m_{k \rightarrow i}(y_{i}),\]

where \(y_{j} \in \{good, bad\}\), and \(\cal{N}_{i} \cap \cal{Y}^{\cal{P}}/p\) is the set of products the user \(u\) reviews, excluding product \(p\).
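
The user-to-product update above can be sketched in log space as follows. This is an illustrative sketch, not the package's code: the names logsumexp and user_to_product_update, the string labels, and the dictionary layout are assumptions, and \(\alpha_{1}\) is assumed to normalize the two messages so that they sum to one. The likelihood values are taken from the "+" review table of the likelihood module.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def user_to_product_update(psi, log_phi, log_incoming):
    """Return normalized log messages m_{u->p}(y_j) for y_j in {good, bad}.

    psi[(y_i, y_j)] is the likelihood for the observed review sign,
    log_phi is the log prior of the user, and log_incoming[y_i] is the
    log of the product of messages from the user's other products.
    """
    unnormalized = {
        y_j: logsumexp([
            math.log(psi[(y_i, y_j)]) + log_phi + log_incoming[y_i]
            for y_i in ("honest", "fraud")
        ])
        for y_j in ("good", "bad")
    }
    # Assumed role of alpha_1: make the two messages sum to one.
    log_norm = logsumexp(list(unnormalized.values()))
    return {y_j: value - log_norm for y_j, value in unnormalized.items()}


epsilon = 0.1
psi_plus = {
    ("honest", "good"): 1 - epsilon, ("honest", "bad"): epsilon,
    ("fraud", "good"): 2 * epsilon, ("fraud", "bad"): 1 - 2 * epsilon,
}
# The user has no other products, so the incoming product is 1 (log 0.0).
messages = user_to_product_update(
    psi_plus, math.log(2), {"honest": 0.0, "fraud": 0.0}
)
```

With these inputs the unnormalized messages are 2.2 for good and 1.8 for bad, so the normalized values are 0.55 and 0.45.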

For each product \(p\), update the messages to every user \(u\) who reviews the product. The message function \(m_{p\rightarrow u}\) takes one argument, the label of the receiving user, which is one of {honest, fraud}. Thus, we need to compute the updated \(m_{p\rightarrow u}(honest)\) and \(m_{p\rightarrow u}(fraud)\).

The updated messages are defined as

\[m_{p\rightarrow u}(y_{i}) \leftarrow \alpha_{3} \sum_{y_{j} \in \cal{L}_{\cal{P}}} \psi_{ij}^{s}(y_{i}, y_{j}) \phi^{\cal{P}}_{j}(y_{j}) \prod_{Y_{k} \in \cal{N}_{j} \cap \cal{Y}^{\cal{U}}/u} m_{k\rightarrow j}(y_{j}),\]

where \(y_{i} \in \{honest, fraud\}\), and \(\cal{N}_{j} \cap \cal{Y}^{\cal{U}}/u\) is the set of users who review the product \(p\), excluding user \(u\).

This method runs one iteration of the update for both users (reviewers) and products. It returns the maximum difference between an old message value and its new value. You can stop iterating once this difference becomes sufficiently small.

Returns:: maximum difference between an old message value and its updated new value.
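
The stopping rule described above can be sketched as a small driver loop. The helper name run_until_converged, the threshold, and the iteration cap are assumptions for illustration; the only behavior taken from this documentation is that update() returns the maximum change in message values.

```python
from typing import Callable


def run_until_converged(
    update: Callable[[], float],
    threshold: float = 1e-6,
    max_iterations: int = 100,
) -> int:
    """Call `update` until its returned gap drops below `threshold`.

    Returns the number of iterations that were run. The cap guards against
    message passing that never settles.
    """
    for iteration in range(1, max_iterations + 1):
        if update() < threshold:
            return iteration
    return max_iterations


# Stand-in for ReviewGraph.update: a callable returning shrinking gaps.
gaps = iter([0.5, 0.1, 0.01, 1e-8])
iterations = run_until_converged(lambda: next(gaps))
```

With a real graph, the bound method graph.update could be passed in place of the stand-in callable.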

epsilon: Final[float]: Hyper parameter.

graph: Final[DiGraph]: Graph object of networkx.

products: Final[list[fraud_eagle.graph.Product]]: A collection of products.

reviewers: Final[list[fraud_eagle.graph.Reviewer]]: A collection of reviewers.

Submodules

fraud_eagle.graph module

Provide a bipartite graph class implementing the Fraud Eagle algorithm.

class fraud_eagle.graph.Node(graph: ReviewGraph, name: str)[source]

Bases: object

Define a node of the bipartite graph model.

Each node holds a reference to its graph object and has a name; both are required to create a node.

Parameters:

graph – reference of the parent graph.
name – name of this node.

graph: Final[ReviewGraph]: Reference of the parent graph.

name: Final[str]: Name of this node.

class fraud_eagle.graph.Product(graph: ReviewGraph, name: str)[source]

Bases: Node

Product node in ReviewGraph.

Each product has a summary of its ratings. In Fraud Eagle, we use the weighted average of the ratings given to the product as the summary. The weights are the anomalous scores of the reviewers.

Thus, letting \(r_{i}\) be the rating given by the \(i\)-th reviewer, and \(a_{i}\) be the anomalous score of the \(i\)-th reviewer, the summary of the product is defined as

\[\frac{\sum_{i}a_{i}r_{i}}{\sum_{i}a_{i}}\]
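
As a worked example of the formula as written, the sketch below computes the weighted average for three reviews. The function name weighted_summary and the sample numbers are made up for illustration and are not part of the package.

```python
from typing import Sequence


def weighted_summary(ratings: Sequence[float], scores: Sequence[float]) -> float:
    """Return sum(a_i * r_i) / sum(a_i) for ratings r_i and weights a_i."""
    total_weight = sum(scores)
    if total_weight == 0:
        raise ValueError("the weights must not sum to zero")
    return sum(a * r for a, r in zip(scores, ratings)) / total_weight


ratings = [1.0, 0.5, 0.0]   # r_i, normalized ratings
scores = [0.2, 0.3, 0.5]    # a_i, anomalous scores of the reviewers
summary = weighted_summary(ratings, scores)
# (0.2 * 1.0 + 0.3 * 0.5 + 0.5 * 0.0) / (0.2 + 0.3 + 0.5) = 0.35
```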

Parameters:

graph – reference of the parent graph.
name – name of this node.

property summary: float: Summary of ratings given to this product.

class fraud_eagle.graph.Review(rating: float)[source]

Bases: object

Review represents an edge in the bipartite graph.

A review is an edge in the bipartite graph connecting a user to a product the user reviews. The review carries the score the user gives to the product. Additionally, in Fraud Eagle, each review has two message functions: one from the user to the product, and one from the product to the user. Each message function takes one of two label values as input. For example, the message from the user to the product takes a label in {good, bad}.

To implement those message functions, this class maintains four values, one for each function and each input label, and also exposes the message functions as methods.

A review also has the rating score a user gives to a product. We assume this score is normalized to \([0, 1]\). Fraud Eagle treats this score as a binary value, i.e. + or -. To implement this, we use a threshold of 0.5 to decide whether each rating belongs to the + group or the - group, and the evaluation property returns this label. In other words, for a review r,

\[\begin{split}r.evaluation = \begin{cases} PLUS \quad (r.rating \geq 0.5) \\ MINUS \quad (otherwise) \end{cases}\end{split}\]

Parameters:: rating – the normalized rating of this review.

product_to_user(label: UserLabel) → float[source]

Message function from the product to the user associated with this review.

The argument label must be one of the {UserLabel.HONEST, UserLabel.FRAUD}.

This method returns the logarithm of the value of the message function for a given label.

Parameters:: label – label of the user.
Returns:: the logarithm of \(m_{p\rightarrow u}(label)\), where \(u\) and \(p\) are the user and the product, respectively.

update_product_to_user(label: UserLabel, value: float) → None[source]

Update product-to-user message value.

The argument label must be one of the {UserLabel.HONEST, UserLabel.FRAUD}.

Note that this method doesn’t normalize any given values.

Parameters:

label – user label,
value – new message value.

update_user_to_product(label: ProductLabel, value: float) → None[source]

Update user-to-product message value.

The argument label must be one of the {ProductLabel.GOOD, ProductLabel.BAD}.

Note that this method doesn’t normalize any given values.

Parameters:

label – product label,
value – new message value.

user_to_product(label: ProductLabel) → float[source]

Message function from the user to the product associated with this review.

The argument label must be one of the {ProductLabel.GOOD, ProductLabel.BAD}.

This method returns the logarithm of the value of the message function for a given label.

Parameters:: label – label of the product.
Returns:: the logarithm of \(m_{u\rightarrow p}(label)\), where \(u\) and \(p\) are the user and the product, respectively.

property evaluation: ReviewLabel

Returns the label of this review.

If the rating is greater than or equal to \(0.5\), ReviewLabel.PLUS is returned. Otherwise, ReviewLabel.MINUS is returned.
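
The thresholding rule can be sketched as below. The enum here is a local stand-in that mirrors the documented ReviewLabel values so the snippet runs on its own; the function name evaluate is hypothetical.

```python
from enum import Enum


class ReviewLabel(Enum):
    """Local stand-in mirroring fraud_eagle.labels.ReviewLabel."""
    PLUS = 1
    MINUS = 2


def evaluate(rating: float, threshold: float = 0.5) -> ReviewLabel:
    """Map a normalized rating in [0, 1] to a binary review label."""
    return ReviewLabel.PLUS if rating >= threshold else ReviewLabel.MINUS


at_threshold = evaluate(0.5)   # the boundary value counts as PLUS
below = evaluate(0.49)
```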

rating: Final[float]: The normalized rating of this review.

class fraud_eagle.graph.ReviewGraph(epsilon: float)[source]

Bases: object

A bipartite graph modeling the relationships between reviewers and products.

Parameters:: epsilon – a hyperparameter in (0, 0.5).

add_review(reviewer: Reviewer, product: Product, rating: float, *_args: Any, **_kwargs: Any) → Review[source]

Add a review from a given reviewer to a product.

Parameters:

reviewer – reviewer of the review,
product – product of the review,
rating – rating score of the review.

Returns:

a new review.

new_product(name: str) → Product[source]

Create a new product and add it to this graph.

Parameters:: name – name of the new product.
Returns:: a new product.

new_reviewer(name: str, *_args: Any, **_kwargs: Any) → Reviewer[source]

Create a new reviewer and add it to this graph.

Parameters:: name – name of the new reviewer.
Returns:: a new reviewer.

prod_message_from_all_products(reviewer: Reviewer, u_label: UserLabel) → float[source]

Compute the product of all messages sent to a reviewer.

This helper function computes the logarithm of the product of messages

\[\prod_{Y_{k} \in \cal{N}_{i} \cap \cal{Y}^{\cal{P}}} m_{k \rightarrow i}(y_{i}),\]

where \(\cal{N}_{i} \cap \cal{Y}^{\cal{P}}\) is the set of products the given reviewer reviews, and \(y_{i}\) is a user label, one of {HONEST, FRAUD}.

No product is excluded: every message sent to the reviewer is included.

Parameters:

reviewer – reviewer object,
u_label – user label.

Returns:

a logarithm of the product defined above.

prod_message_from_all_users(product: Product, p_label: ProductLabel) → float[source]

Compute the product of all messages sent to a product.

This helper function computes the logarithm of the product of messages

\[\prod_{Y_{k} \in \cal{N}_{j} \cap \cal{Y}^{\cal{U}}} m_{k\rightarrow j}(y_{j}),\]

where \(\cal{N}_{j} \cap \cal{Y}^{\cal{U}}\) is the set of reviewers who review the given product, and \(y_{j}\) is a product label, one of {GOOD, BAD}.

No reviewer is excluded: every message sent to the product is included.

Parameters:

product – Product,
p_label – product label

Returns:

a logarithm of the product defined above.

prod_message_from_products(reviewer: Reviewer, product: Optional[Product], u_label: UserLabel) → float[source]

Compute the product of messages sent to a reviewer, excluding the message from one product.

This helper function computes the logarithm of the product of messages

\[\prod_{Y_{k} \in \cal{N}_{i} \cap \cal{Y}^{\cal{P}}/product} m_{k \rightarrow i}(y_{i}),\]

where \(\cal{N}_{i} \cap \cal{Y}^{\cal{P}}/product\) is the set of products the given reviewer reviews, excluding the given product, and \(y_{i}\) is a user label, one of {HONEST, FRAUD}.

If product is None, no product is excluded and the product of all messages sent to the reviewer is computed.

Parameters:

reviewer – reviewer object,
product – product object, can be None,
u_label – user label.

Returns:

a logarithm of the product defined above.

prod_message_from_users(reviewer: Optional[Reviewer], product: Product, p_label: ProductLabel) → float[source]

Compute the product of messages sent to a product, excluding the message from one reviewer.

This helper function computes the logarithm of the product of messages

\[\prod_{Y_{k} \in \cal{N}_{j} \cap \cal{Y}^{\cal{U}}/user} m_{k\rightarrow j}(y_{j}),\]

where \(\cal{N}_{j} \cap \cal{Y}^{\cal{U}}/user\) is the set of reviewers who review the given product, excluding the given reviewer, and \(y_{j}\) is a product label, one of {GOOD, BAD}.

If reviewer is None, no reviewer is excluded and the product of all messages sent to the product is computed.

Parameters:

reviewer – Reviewer, can be None,
product – Product,
p_label – product label

Returns:

a logarithm of the product defined above.

retrieve_products(reviewer: Reviewer) → list[fraud_eagle.graph.Product][source]

Retrieve products a given reviewer reviews.

Parameters:: reviewer – Reviewer.
Returns:: a collection of products the given reviewer reviews.

retrieve_review(reviewer: Reviewer, product: Product) → Review[source]

Retrieve a review a given reviewer posts to a given product.

Parameters:

reviewer – Reviewer,
product – Product,

Returns:

the review associated with the given reviewer and product.

retrieve_reviewers(product: Product) → list[fraud_eagle.graph.Reviewer][source]

Retrieve the reviewers who review a given product.

Parameters:: product – Product.
Returns:: a collection of reviewers who review the product.

update() → float[source]

Update reviewers’ anomalous scores and products’ summaries.

For each user \(u\), update the messages to every product \(p\) the user reviews. The message function \(m_{u\rightarrow p}\) takes one argument, the label of the receiving product, which is one of {good, bad}. Therefore, we need to compute the updated \(m_{u\rightarrow p}(good)\) and \(m_{u\rightarrow p}(bad)\).

The updated messages are defined as

\[m_{u\rightarrow p}(y_{j}) \leftarrow \alpha_{1} \sum_{y_{i} \in \cal{L}_{\cal{U}}} \psi_{ij}^{s}(y_{i}, y_{j}) \phi^{\cal{U}}_{i}(y_{i}) \prod_{Y_{k} \in \cal{N}_{i} \cap \cal{Y}^{\cal{P}}/p} m_{k \rightarrow i}(y_{i}),\]

where \(y_{j} \in \{good, bad\}\), and \(\cal{N}_{i} \cap \cal{Y}^{\cal{P}}/p\) is the set of products the user \(u\) reviews, excluding product \(p\).

For each product \(p\), update the messages to every user \(u\) who reviews the product. The message function \(m_{p\rightarrow u}\) takes one argument, the label of the receiving user, which is one of {honest, fraud}. Thus, we need to compute the updated \(m_{p\rightarrow u}(honest)\) and \(m_{p\rightarrow u}(fraud)\).

The updated messages are defined as

\[m_{p\rightarrow u}(y_{i}) \leftarrow \alpha_{3} \sum_{y_{j} \in \cal{L}_{\cal{P}}} \psi_{ij}^{s}(y_{i}, y_{j}) \phi^{\cal{P}}_{j}(y_{j}) \prod_{Y_{k} \in \cal{N}_{j} \cap \cal{Y}^{\cal{U}}/u} m_{k\rightarrow j}(y_{j}),\]

where \(y_{i} \in \{honest, fraud\}\), and \(\cal{N}_{j} \cap \cal{Y}^{\cal{U}}/u\) is the set of users who review the product \(p\), excluding user \(u\).

This method runs one iteration of the update for both users (reviewers) and products. It returns the maximum difference between an old message value and its new value. You can stop iterating once this difference becomes sufficiently small.

Returns:: maximum difference between an old message value and its updated new value.

epsilon: Final[float]: Hyper parameter.

graph: Final[DiGraph]: Graph object of networkx.

products: Final[list[fraud_eagle.graph.Product]]: A collection of products.

reviewers: Final[list[fraud_eagle.graph.Reviewer]]: A collection of reviewers.

class fraud_eagle.graph.Reviewer(graph: ReviewGraph, name: str)[source]

Bases: Node

Reviewer node in ReviewGraph.

Each reviewer has an anomalous_score property. In Fraud Eagle, we use the belief that this reviewer is a fraud reviewer as the anomalous score.

The belief is defined as

\[b(y_{i}) = \alpha_{2} \phi^{\cal{U}}_{i}(y_{i}) \prod_{Y_{j} \in \cal{N}_{i} \cap \cal{Y}_{\cal{P}}} m_{j \rightarrow i}(y_{i}),\]

where \(y_{i}\) is a user label, one of {honest, fraud}, and \(\cal{N}_{i} \cap \cal{Y}_{\cal{P}}\) is the set of products this reviewer reviews. \(\alpha_{2}\) is a normalization constant chosen so that \(b(honest) + b(fraud) = 1\).

Thus, we use \(b(fraud)\) as the anomalous score.
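
The normalization can be sketched in log space as follows. This is a hedged illustration rather than the package's implementation: the function name fraud_belief and its list-based arguments are assumptions, and the message values are invented for the example.

```python
import math
from typing import Sequence


def fraud_belief(
    log_phi: float,
    log_msgs_honest: Sequence[float],
    log_msgs_fraud: Sequence[float],
) -> float:
    """Return b(fraud), normalized so that b(honest) + b(fraud) = 1.

    Each list holds the log-messages m_{j->i}(label) from the products the
    reviewer reviews; log_phi is the log prior, identical for both labels.
    """
    log_honest = log_phi + sum(log_msgs_honest)
    log_fraud = log_phi + sum(log_msgs_fraud)
    # Subtract the larger term before exponentiating for numerical stability.
    peak = max(log_honest, log_fraud)
    log_norm = peak + math.log(
        math.exp(log_honest - peak) + math.exp(log_fraud - peak)
    )
    return math.exp(log_fraud - log_norm)


# Two products: messages for honest are 0.6 and 0.5, for fraud 0.4 and 0.5.
score = fraud_belief(
    math.log(2),
    [math.log(0.6), math.log(0.5)],
    [math.log(0.4), math.log(0.5)],
)
# Unnormalized beliefs are 2 * 0.3 and 2 * 0.2, so b(fraud) = 0.4 / 1.0 = 0.4.
```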

Parameters:

graph – reference of the parent graph.
name – name of this node.

property anomalous_score: float: Anomalous score of this reviewer.

fraud_eagle.labels module

Define constants used in the Fraud Eagle package.

class fraud_eagle.labels.ProductLabel(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)[source]

Bases: Enum

Product label.

BAD: Final = 2: Constant representing the bad label for products.

GOOD: Final = 1: Constant representing the good label for products.

class fraud_eagle.labels.ReviewLabel(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)[source]

Bases: Enum

Review label.

MINUS: Final = 2: Constant representing “-” review.

PLUS: Final = 1: Constant representing “+” review.

class fraud_eagle.labels.UserLabel(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)[source]

Bases: Enum

User label.

FRAUD: Final = 2: Constant representing the fraud label for users.

HONEST: Final = 1: Constant representing the honest label for users.

fraud_eagle.likelihood module

Define likelihood functions.

This module defines a likelihood of a pair of user and product. See psi() for the detailed definition of the likelihood.

fraud_eagle.likelihood.psi(u_label: UserLabel, p_label: ProductLabel, r_label: ReviewLabel, epsilon: float) → float[source]

Likelihood of a pair of user and product.

The likelihood depends on the review the user gives the product. The review is one of {+, -}. Constants are defined to represent “+” and “-”, so the review is one of {PLUS, MINUS}. epsilon is a given parameter.

The likelihood \(\psi_{ij}^{s}\), where \(i\) and \(j\) are the indexes of the user and the product, respectively, and \(s\) is a review, i.e. \(s \in \{+, -\}\), is given by the following tables.

If the review is PLUS,

Review: +	Product: Good	Product: Bad
User: Honest	\(1 - \epsilon\)	\(\epsilon\)
User: Fraud	\(2\epsilon\)	\(1 - 2\epsilon\)

If the review is MINUS,

Review: -	Product: Good	Product: Bad
User: Honest	\(\epsilon\)	\(1 - \epsilon\)
User: Fraud	\(1 - 2\epsilon\)	\(2\epsilon\)

Parameters:

u_label – user label, which must be one of {UserLabel.HONEST, UserLabel.FRAUD}.
p_label – product label, which must be one of {ProductLabel.GOOD, ProductLabel.BAD}.
r_label – review label, which must be one of {ReviewLabel.PLUS, ReviewLabel.MINUS}.
epsilon – a float parameter in \([0,1]\).

Returns:

Float value representing a likelihood of the given values.
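
The two tables can be condensed into one lookup, since each entry of the MINUS table is one minus the matching entry of the PLUS table. The sketch below is illustrative only: the enums are local stand-ins mirroring the documented labels, and psi_sketch is a hypothetical name, not the package's psi().

```python
from enum import Enum


class UserLabel(Enum):
    HONEST = 1
    FRAUD = 2


class ProductLabel(Enum):
    GOOD = 1
    BAD = 2


class ReviewLabel(Enum):
    PLUS = 1
    MINUS = 2


def psi_sketch(
    u_label: UserLabel,
    p_label: ProductLabel,
    r_label: ReviewLabel,
    epsilon: float,
) -> float:
    """Look up the likelihood from the PLUS and MINUS tables above."""
    # Entry of the PLUS table for this user and product label.
    if u_label is UserLabel.HONEST:
        plus = 1 - epsilon if p_label is ProductLabel.GOOD else epsilon
    else:
        plus = 2 * epsilon if p_label is ProductLabel.GOOD else 1 - 2 * epsilon
    # The MINUS table is the complement of the PLUS table.
    return plus if r_label is ReviewLabel.PLUS else 1 - plus


honest_good_plus = psi_sketch(
    UserLabel.HONEST, ProductLabel.GOOD, ReviewLabel.PLUS, 0.1
)
fraud_bad_minus = psi_sketch(
    UserLabel.FRAUD, ProductLabel.BAD, ReviewLabel.MINUS, 0.1
)
```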

fraud_eagle.prior module

Define prior beliefs of users and products.

fraud_eagle.prior.phi_p(_p_label: ProductLabel) → float[source]

Logarithm of a prior belief of a product.

The definition is

\[\phi_{j}^{\cal{P}}: \cal{L}_{\cal{P}} \rightarrow \mathbb{R}_{\geq 0},\]

where \(\cal{P}\) is the set of product nodes, \(\cal{L}_{\cal{P}}\) is the set of product labels, and \(\mathbb{R}_{\geq 0}\) is the set of real numbers greater than or equal to \(0\).

The implementation of this mapping is given as

\[\phi_{j}^{\cal{P}}(y_{j}) \leftarrow \|\cal{L}_{\cal{P}}\|.\]

Since \(\cal{L}_{\cal{P}}\) is {good, bad}, the mapping returns \(2\) regardless of the given product.

This function returns the logarithm of such \(\phi_{j}\), i.e. \(\log(\phi_{j}(p))\) for any product \(p\).

Parameters:: _p_label – Product label.
Returns:: The logarithm of the prior belief of the label of the given product. It is always \(\log 2\), regardless of the given product.

fraud_eagle.prior.phi_u(_u_label: UserLabel) → float[source]

Logarithm of a prior belief of a user.

The definition is

\[\phi_{i}^{\cal{U}}: \cal{L}_{\cal{U}} \rightarrow \mathbb{R}_{\geq 0},\]

where \(\cal{U}\) is the set of user nodes, \(\cal{L}_{\cal{U}}\) is the set of user labels, and \(\mathbb{R}_{\geq 0}\) is the set of real numbers greater than or equal to \(0\).

The implementation of this mapping is given as

\[\phi_{i}^{\cal{U}}(y_{i}) \leftarrow \|\cal{L}_{\cal{U}}\|.\]

Since \(\cal{L}_{\cal{U}}\) is {honest, fraud}, the mapping returns \(\phi_{i} = 2\) for any user.

This function returns the logarithm of such \(\phi_{i}\), i.e. \(\log(\phi_{i}(u))\) for any user \(u\).

Parameters:: _u_label – User label.
Returns:: The logarithm of the prior belief of the label of the given user. It is always \(\log 2\), regardless of the given user.