- Create 10 items commonly sold at Amazon, K-mart, or any other supermarket (e.g., diapers, clothes).
- Create a database of 20 transactions, each containing some of these items. The information can be stored in a file or in a DBMS (e.g., NJIT ORACLE or MySQL).
- Repeat the database-creation step to create 4 additional, different databases, each containing 20 transactions. Using the Apriori algorithm, generate and print all the association rules and the input transactions for each of the 5 transactional databases you created. Support and confidence must be user-specified parameters, so the output should show different rules for different databases and for different support/confidence values. Make sure to show results for multiple support and confidence settings. The items and transactions must be clear and easy to identify.

Important Notes:

- The purpose of this project is to help you understand the algorithm; therefore, it must be “your own” implementation of the algorithm. If you use any existing package for the algorithm from Python, R, Matlab, Guava, etc., you will lose points. For example, some software provides a one-line function for this algorithm; do not use it in your implementation, but you can use it to verify your work.
- Of course, you can use helper libraries such as pandas and numpy to help you develop the algorithm.
- Do not share or copy code from your peers or other resources. Your task is to implement the algorithm from scratch.
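
For reference, the two user-specified parameters can be illustrated with a small sketch. It shows only how support and confidence are computed and how the thresholds change the printed rules. It is deliberately not an Apriori implementation: it brute-forces rules between single items and does no level-wise candidate generation or pruning. The toy transactions and the names `support`, `confidence`, and `pair_rules` are assumptions made up for this illustration.

```python
from itertools import combinations

# Hypothetical toy data: five transactions over made-up items.
transactions = [
    {"diapers", "milk", "bread"},
    {"diapers", "milk"},
    {"milk", "bread"},
    {"diapers", "bread"},
    {"diapers", "milk", "bread", "eggs"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """support(antecedent and consequent together) / support(antecedent)."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / base

def pair_rules(transactions, min_support, min_confidence):
    """Brute-force rules lhs -> rhs between single items (not Apriori)."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        for lhs, rhs in ((a, b), (b, a)):
            s = support({lhs, rhs}, transactions)
            c = confidence({lhs}, {rhs}, transactions)
            if s >= min_support and c >= min_confidence:
                rules.append((lhs, rhs, s, c))
    return rules

if __name__ == "__main__":
    # Different thresholds yield different rule sets for the same data.
    for min_sup, min_conf in [(0.5, 0.7), (0.1, 0.9)]:
        print(f"min_support={min_sup}, min_confidence={min_conf}")
        for lhs, rhs, s, c in pair_rules(transactions, min_sup, min_conf):
            print(f"  {lhs} -> {rhs}  support={s:.2f}  confidence={c:.2f}")
```

Running the script with the two threshold pairs prints different rule lists for the same transactions, which is the kind of contrast the output for each database should make visible.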