Class: Data Mining and Text Mining, University of Illinois at Chicago (Fall 2023)
1st year of Master's Degree in Computer Science coursework
The goal of the assignment was to implement a modified version of the GSP algorothm (Generalized Sequential Pattern algorithm) allowing to assign each item a different minimum support value. This generalization of the GSP mining algorithm takes as input a collection of sequences, where each sequence is defined as a list of ordered itemsets, and a parameter specification file containing the different minsup values for the items. The output is a collection of frequent sequential patterns, organized by length.
- main.py contains the full code to run the algorithm;
- data.txt contains sample input data;
- para.txt contains the minsup values settings for each item;
- results.txt contains sample output data.