Collaborative Filtering

Collaborative filtering is a popular subtopic within Clustering. The form covered here, usually called association rule mining, looks at how small sets of variables change together.

Consider two binary variables, xa and xb.

If xb = 1 more often when xa = 1 than it does overall, then xa => xb is an association rule.

Association rules are typically represented as LHS (antecedent) => RHS (consequent).
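
As a minimal sketch of the condition above, the following Python compares how often xb = 1 overall with how often xb = 1 when xa = 1. The two lists are hypothetical data invented for illustration, with one entry per observation; the helper name rate is also an assumption, not part of any library.

```python
# Hypothetical data: one entry per observation, 1 = event present, 0 = absent.
xa = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
xb = [1, 1, 1, 0, 0, 1, 0, 0, 1, 0]


def rate(values):
    """Fraction of entries equal to 1."""
    return sum(values) / len(values)


# Baseline: how often xb = 1 across all observations.
baseline_rate = rate(xb)

# Conditional: how often xb = 1 among the observations where xa = 1.
xb_given_xa = [b for a, b in zip(xa, xb) if a == 1]
conditional_rate = rate(xb_given_xa)

# xa => xb is a candidate rule when the conditional rate exceeds the baseline.
is_candidate_rule = conditional_rate > baseline_rate
print(baseline_rate, conditional_rate, is_candidate_rule)
```

With this made-up data, xb = 1 in half of all observations but in four of the five observations where xa = 1, so xa => xb qualifies as a candidate rule.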

Collaborative Filtering Metrics

Whenever you look at a collaborative filtering problem, such as a market basket analysis, three key metrics to examine are:

Support: the proportion of observations in which an event occurs, i.e. (# of times the event occurs) / (# of observations).

Confidence: supp(LHS and RHS) / supp(LHS), the probability of RHS given LHS.

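Lift: supp(LHS and RHS) / [supp(LHS) * supp(RHS)], the factor by which the probability of RHS changes given that LHS occurs. Equivalently, it is the confidence divided by supp(RHS); a lift of 1 means LHS and RHS occur independently.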
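
The three metrics can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a reference implementation: the baskets are invented, each basket is modeled as a set of item names, and the function names support, confidence, and lift are chosen here for readability.

```python
# Hypothetical market baskets: each set is one observation (transaction).
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread"},
    {"milk"},
    {"butter", "milk"},
    {"bread", "butter"},
    {"milk"},
    {"bread", "butter", "milk"},
]


def support(items, baskets):
    """Proportion of baskets that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in baskets) / len(baskets)


def confidence(lhs, rhs, baskets):
    """supp(LHS and RHS) / supp(LHS): estimated probability of RHS given LHS."""
    return support(set(lhs) | set(rhs), baskets) / support(lhs, baskets)


def lift(lhs, rhs, baskets):
    """supp(LHS and RHS) / (supp(LHS) * supp(RHS))."""
    joint = support(set(lhs) | set(rhs), baskets)
    return joint / (support(lhs, baskets) * support(rhs, baskets))


# Evaluate the rule bread => butter.
lhs, rhs = {"bread"}, {"butter"}
supp_rule = support(lhs | rhs, baskets)
conf_rule = confidence(lhs, rhs, baskets)
lift_rule = lift(lhs, rhs, baskets)
print(supp_rule, conf_rule, lift_rule)
```

In this toy data, bread and butter appear together in 4 of 8 baskets (support 0.5), butter appears in 4 of the 5 baskets containing bread (confidence 0.8), and the lift is 0.5 / (0.625 * 0.625) = 1.28. The sketch does not guard against an LHS or RHS that never occurs, which would cause a division by zero.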

We’re always looking for rules with high lift (well above 1), which indicates that observing the LHS tells us something about the RHS that its baseline frequency alone does not.
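
One way to look for high-lift rules is to score every candidate rule and sort by lift. The sketch below does this for single-item rules only, using invented baskets and a hypothetical support helper; real market basket data would normally call for a pruning algorithm such as Apriori rather than brute-force enumeration.

```python
from itertools import permutations

# Hypothetical market baskets, invented for illustration.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread"},
    {"milk"},
    {"butter", "milk"},
    {"bread", "butter"},
    {"milk"},
    {"bread", "butter", "milk"},
]


def support(items):
    """Proportion of baskets containing every item in `items`."""
    return sum(set(items) <= basket for basket in baskets) / len(baskets)


# All distinct items, in a fixed order so the enumeration is reproducible.
items = sorted(set().union(*baskets))

# Score every single-item rule lhs => rhs by its lift.
rules = []
for lhs, rhs in permutations(items, 2):
    joint = support({lhs, rhs})
    rule_lift = joint / (support({lhs}) * support({rhs}))
    rules.append((lhs, rhs, rule_lift))

# Highest lift first; lift > 1 means RHS is more likely than its baseline when LHS occurs.
rules.sort(key=lambda rule: rule[2], reverse=True)
for lhs, rhs, rule_lift in rules:
    print(f"{lhs} => {rhs}: lift = {rule_lift:.2f}")
```

Because the lift formula is symmetric in LHS and RHS, bread => butter and butter => bread receive the same lift; confidence is what distinguishes the two directions of a rule.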