A Short Introduction to Frequent Pattern Mining

Carsten Grimm

We are drowning in information and starving for knowledge. -- Rutherford D. Rogers

As this quote nicely summarizes, we have huge amounts of data at hand today, but only rarely an explanation for it. Frequent pattern mining helps to find (quantitatively) significant peculiarities in discrete data and serves as a preprocessing step for extracting "rules" or "laws" (knowledge) that can be inferred from the given data.

Examples of such peculiarities are items that are often bought together in a shop (Market Basket Analysis), neurons in the brain that often "fire" together in response to stimuli (Neuron Spike Data Analysis), (sub)molecules that frequently appear in compounds which prevent cells from being infected by a certain virus (Drug Design / Molecular Fragment Mining), and many more.

The talk will cover the basic ideas of Frequent Itemset Mining, including the Eclat algorithm [Zaki, Parthasarathy, Ogihara, and Li 1997], as well as several generalization steps towards an algorithm for Molecular Fragment Mining, thereby outlining a general frequent pattern mining framework.
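
To give a rough idea of what frequent itemset mining computes, the following is a minimal Python sketch in the spirit of Eclat: each item is mapped to the set of transactions that contain it, and larger itemsets are built depth-first by intersecting these sets. It is a simplified illustration, not the implementation presented in the talk. The function name eclat, the absolute min_support threshold, and the example baskets are assumptions made for this sketch.

```python
from typing import Dict, Iterable, List, Set, Tuple


def eclat(transactions: Iterable[Iterable[str]], min_support: int) -> Dict[Tuple[str, ...], int]:
    """Return every itemset contained in at least min_support transactions.

    Simplified Eclat-style sketch: vertical layout plus depth-first
    extension by intersecting transaction-id sets.
    """
    # Vertical representation: item -> ids of the transactions containing it.
    tidsets: Dict[str, Set[int]] = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets.setdefault(item, set()).add(tid)

    frequent: Dict[Tuple[str, ...], int] = {}

    def extend(prefix: Tuple[str, ...], candidates: List[Tuple[str, Set[int]]]) -> None:
        for i, (item, tids) in enumerate(candidates):
            itemset = prefix + (item,)
            frequent[itemset] = len(tids)  # support = size of the tid set
            # Combine only with items later in the order, so that each
            # itemset is generated exactly once.
            suffix = []
            for other, other_tids in candidates[i + 1:]:
                common = tids & other_tids
                if len(common) >= min_support:
                    suffix.append((other, common))
            if suffix:
                extend(itemset, suffix)

    # Start from the frequent single items, in a fixed (alphabetical) order.
    start = sorted(
        ((item, tids) for item, tids in tidsets.items() if len(tids) >= min_support),
        key=lambda pair: pair[0],
    )
    extend((), start)
    return frequent


# Hypothetical market baskets, chosen only for illustration.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
    {"bread", "butter", "milk"},
]
result = eclat(baskets, min_support=3)
for itemset, support in sorted(result.items()):
    print(itemset, support)
```

In this toy data set every single item and every pair of items occurs in at least three baskets, whereas all three items together occur in only two, so the triple is not reported as frequent.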