A popular subtopic within Clustering is collaborative filtering, which looks at how small sets of variables change together.
Consider two binary variables, xa and xb.
If xb = 1 more often when xa = 1, then xa => xb is an association rule.
Association rules are typically represented as LHS (antecedent) => RHS (consequent).
Collaborative Filtering Metrics
Whenever you look at a collaborative filtering problem, such as a market basket analysis, three key metrics to examine are:
Support: the proportion of observations in which an event occurs, e.g. (number of observations containing the event) / (total number of observations).
Confidence: supp(LHS and RHS) / supp(LHS), the probability of the RHS given the LHS.
Lift: supp(LHS and RHS) / [supp(LHS) * supp(RHS)], equivalently confidence / supp(RHS). This is the factor by which observing the LHS changes the probability of the RHS; a lift of 1 means the two sides are independent.
We’re looking for rules with lift well above 1, which indicates that what we observe in the LHS tells us something about the RHS beyond its baseline frequency.
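To make the three metrics concrete, here is a minimal Python sketch that computes them for a single rule. The basket data and the function names (`support`, `confidence`, `lift`) are made up for illustration and do not come from any particular library; each basket is treated as a set of items.

```python
from typing import FrozenSet, Iterable, List

# Hypothetical toy baskets; items and counts are invented for illustration.
TRANSACTIONS: List[FrozenSet[str]] = [
    frozenset({"bread", "butter"}),
    frozenset({"bread", "butter", "milk"}),
    frozenset({"bread"}),
    frozenset({"milk"}),
    frozenset({"butter", "milk"}),
]


def support(itemset: Iterable[str], transactions: List[FrozenSet[str]]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(lhs: Iterable[str], rhs: Iterable[str],
               transactions: List[FrozenSet[str]]) -> float:
    """supp(LHS and RHS) / supp(LHS). Assumes the LHS occurs at least once."""
    both = frozenset(lhs) | frozenset(rhs)
    return support(both, transactions) / support(lhs, transactions)


def lift(lhs: Iterable[str], rhs: Iterable[str],
         transactions: List[FrozenSet[str]]) -> float:
    """confidence(LHS => RHS) / supp(RHS). Assumes the RHS occurs at least once."""
    return confidence(lhs, rhs, transactions) / support(rhs, transactions)


if __name__ == "__main__":
    lhs, rhs = {"bread"}, {"butter"}
    print(f"support    = {support(lhs | rhs, TRANSACTIONS):.3f}")      # 0.400
    print(f"confidence = {confidence(lhs, rhs, TRANSACTIONS):.3f}")    # 0.667
    print(f"lift       = {lift(lhs, rhs, TRANSACTIONS):.3f}")          # 1.111
```

In this toy data, bread => butter has a lift slightly above 1, so buying bread only mildly raises the chance of buying butter. With so few baskets the number is not meaningful by itself; on real data, support is usually checked alongside lift so that a rule is not based on a handful of observations.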