Measuring the likelihood that two or more factor levels (categories) appear together in an observation (row). You could imagine that Aidan would want to know how likely it is that particular beers are purchased on the same bill…
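library(arules)    ## apriori(), quality(), inspect()
library(magrittr)  ## %>% pipe
## titanic.raw is assumed to be loaded already
rules.surv <- titanic.raw %>% apriori(
  control = list(verbose=FALSE),
  parameter = list(minlen=2, supp=0.005, conf=0.8),
  ## only the Survived items may appear on the rhs; everything else goes on the lhs
  appearance = list(rhs=c("Survived=No", "Survived=Yes"), default="lhs"))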
## keep three decimal places
quality(rules.surv) <- rules.surv %>% quality() %>% round(digits=3)
## sort rules by lift
rules.surv.sorted <- rules.surv %>% sort(by="lift")
## ----inspect rules-------------------------------------------------------
rules.surv.sorted %>% inspect() ## print rules
Which gives you a nice output (top rules by lift):
| lhs | div | rhs | support | confidence | lift |
| --- | --- | --- | --- | --- | --- |
| Class=2nd, Age=Child | => | Survived=Yes | 0.011 | 1.000 | 3.096 |
| Class=2nd, Sex=Female, Age=Child | => | Survived=Yes | 0.006 | 1.000 | 3.096 |
| Class=1st, Sex=Female | => | Survived=Yes | 0.064 | 0.972 | 3.010 |
support -> Fraction of transactions/obs that contain both LHS and RHS
confidence -> Fraction of the transactions/obs containing LHS that also contain RHS
lift -> Confidence divided by the overall fraction of transactions/obs that contain RHS. A lift greater than 1 indicates that LHS and RHS appear together more often than would be expected if they were independent; a lift smaller than 1 indicates that they appear together less often than expected
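To make the three measures concrete, here is a minimal sketch that recomputes them by hand for the rule {Class=2nd, Age=Child} => {Survived=Yes}. It assumes that titanic.raw holds the same observations as R's built-in Titanic contingency table (an assumption, not something stated above), and the variable names are made up for illustration.

```r
## Built-in 4-way contingency table: Class x Sex x Age x Survived
## (assumed here to contain the same observations as titanic.raw)
counts <- as.data.frame(Titanic)  # columns: Class, Sex, Age, Survived, Freq
n <- sum(counts$Freq)             # total number of observations

## logical masks selecting the table cells that match each side of the rule
lhs <- counts$Class == "2nd" & counts$Age == "Child"
rhs <- counts$Survived == "Yes"

supp_lhs  <- sum(counts$Freq[lhs]) / n        # fraction of obs containing LHS
supp_rhs  <- sum(counts$Freq[rhs]) / n        # fraction of obs containing RHS
supp_rule <- sum(counts$Freq[lhs & rhs]) / n  # support: LHS and RHS together

confidence <- supp_rule / supp_lhs   # share of LHS obs that also contain RHS
lift       <- confidence / supp_rhs  # confidence relative to the RHS base rate

print(round(c(support = supp_rule, confidence = confidence, lift = lift), 3))
```

If the assumption about the data holds, the printed values should line up with the support, confidence and lift that inspect() reports for the same rule.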
One can use association rules to predict/model future combinations…