DIC Implementation


The DIC algorithm has been implemented as dic.java.

Note: The DIC implementation given here may not produce accurate output for small databases (fewer than 100 transactions). To get accurate output for such databases, choose a step size M > 4.

Download the following files:

  1. dic.java: The DIC algorithm.
  2. config.txt: Consists of four lines:
            1.  Number of items
            2.  Number of transactions
            3.  Minimum support as a percentage, e.g. 20 represents 20% minsupp
            4.  Size of step M for the DIC algorithm. This line is ignored by the Apriori algorithm.
  3. transa.txt: Contains the transaction database as an n x m table, with n rows and m columns. Each row represents a transaction. Columns are separated by a space and represent items. A 1 indicates that an item is present in the transaction and a 0 indicates that it is not. The sample file has 10000 lines (transactions) with values for 8 items on each line.

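The two input formats described above can be parsed with a few lines of Java. The following is a minimal sketch and is not taken from dic.java: the class InputFormatSketch, the nested Config type and both method names are hypothetical, and the sketch assumes the file contents have already been read into strings.

```java
import java.util.List;

// Hypothetical helper for illustration; not part of dic.java.
class InputFormatSketch {
    // Holds the four values of config.txt in file order.
    static final class Config {
        final int numItems;
        final int numTransactions;
        final int minSupPercent;
        final int stepM;

        Config(int numItems, int numTransactions, int minSupPercent, int stepM) {
            this.numItems = numItems;
            this.numTransactions = numTransactions;
            this.minSupPercent = minSupPercent;
            this.stepM = stepM;
        }
    }

    // Parses the four lines of config.txt.
    static Config parseConfig(List<String> lines) {
        if (lines.size() < 4) {
            throw new IllegalArgumentException("config.txt needs four lines");
        }
        return new Config(
            Integer.parseInt(lines.get(0).trim()),
            Integer.parseInt(lines.get(1).trim()),
            Integer.parseInt(lines.get(2).trim()),
            Integer.parseInt(lines.get(3).trim()));
    }

    // Converts one transa.txt row such as "1 0 1 1 0" into presence flags,
    // where flag i is true if item i+1 occurs in the transaction.
    static boolean[] parseRow(String line, int numItems) {
        String[] columns = line.trim().split("\\s+");
        if (columns.length != numItems) {
            throw new IllegalArgumentException("expected " + numItems + " columns");
        }
        boolean[] present = new boolean[numItems];
        for (int i = 0; i < numItems; i++) {
            present[i] = columns[i].equals("1");
        }
        return present;
    }
}
```

Rejecting rows with the wrong number of columns is a design choice of this sketch; it catches a mismatch between config.txt and transa.txt early.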
Compile the .java file:

        hercules[1]% javac -deprecation dic.java

          Note: dic.java uses a deprecated API.  Recompile with "-deprecation" for details.
 

Change config.txt and transa.txt to represent the database and criteria to be tested.

Run the program:

        hercules[2]% java dic
 

Example

We use the example database from Apriori Itemset Generation. The minsupp is 40%, so an itemset must occur in at least 2 of the 5 transactions to be frequent.
 

TID A B C D E
T1 1 1 1 0 0
T2 1 1 1 1 1
T3 1 0 1 1 0
T4 1 0 1 1 1
T5 1 1 1 1 0

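The file transa.txt contains a row for each of the five transactions and a column for each of the five items. Items A through E correspond to columns 1 through 5, which is how they are numbered in the program output.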
 

1 1 1 0 0
1 1 1 1 1
1 0 1 1 0
1 0 1 1 1
1 1 1 1 0
transa.txt

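config.txt: Here we use 5 as the size of step M for the DIC algorithm. The four lines are, in order, the number of items (5), the number of transactions (5), the minsupp (40%) and the step size M (5).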
 

5
5
40
5
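To illustrate what the step size M in the last line of config.txt controls, here is a simplified sketch of dynamic itemset counting. It is not the code in dic.java and may differ from it in important ways. The class DicSketch, its method names and the bitmask encoding of transactions (bit i set means item i+1 is present) are assumptions made for this illustration, and the sketch is only meant for small item counts. After every M transactions it checks which itemsets have reached the minimum count and starts counting any superset whose immediate subsets are all frequent so far.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Illustrative sketch only; this is not the code in dic.java.
class DicSketch {
    // db holds one bitmask per transaction (bit i set = item i+1 present).
    // Returns the frequent itemsets as bitmasks in ascending numeric order.
    static Set<Integer> run(int[] db, int numItems, int minCount, int m) {
        if (db.length == 0 || m < 1) {
            throw new IllegalArgumentException("need at least one transaction and M >= 1");
        }
        Map<Integer, Integer> count = new HashMap<>(); // matches found so far
        Map<Integer, Integer> seen = new HashMap<>();  // transactions read since counting began
        Set<Integer> active = new TreeSet<>();         // itemsets still being counted
        Set<Integer> frequent = new TreeSet<>();       // itemsets whose count reached minCount
        for (int i = 0; i < numItems; i++) {
            int single = 1 << i;
            count.put(single, 0);
            seen.put(single, 0);
            active.add(single);
        }
        int pos = 0;
        while (!active.isEmpty()) {
            // Read the next M transactions, wrapping around at the end of the database.
            for (int step = 0; step < m; step++) {
                int t = db[pos];
                pos = (pos + 1) % db.length;
                for (int s : active) {
                    if (seen.get(s) == db.length) continue; // already saw every transaction once
                    if ((t & s) == s) count.merge(s, 1, Integer::sum);
                    seen.merge(s, 1, Integer::sum);
                }
            }
            // Checkpoint: record newly frequent itemsets and start new candidates.
            List<Integer> started = new ArrayList<>();
            List<Integer> finished = new ArrayList<>();
            for (int s : active) {
                if (count.get(s) >= minCount && frequent.add(s)) {
                    for (int i = 0; i < numItems; i++) {
                        int sup = s | (1 << i);
                        if (sup != s && !count.containsKey(sup) && allSubsetsFrequent(sup, frequent)) {
                            count.put(sup, 0);
                            seen.put(sup, 0);
                            started.add(sup);
                        }
                    }
                }
                if (seen.get(s) == db.length) finished.add(s);
            }
            active.removeAll(finished);
            active.addAll(started);
        }
        return frequent;
    }

    // True if every subset obtained by dropping one item is already frequent.
    static boolean allSubsetsFrequent(int itemset, Set<Integer> frequent) {
        for (int rest = itemset; rest != 0; rest &= rest - 1) {
            int bit = Integer.lowestOneBit(rest);
            if (!frequent.contains(itemset & ~bit)) return false;
        }
        return true;
    }
}
```

For the example database, the rows of transa.txt encode as the bitmasks 7, 31, 13, 29 and 15, so the configuration above corresponds to the call DicSketch.run(new int[] {7, 31, 13, 29, 15}, 5, 2, 5). Because this sketch counts every itemset over exactly one full pass of the database, its result does not depend on M; M only changes how early new candidates start being counted.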

Output:
 

hercules[67]% java apriori

Algorithm apriori starting now.....

Press 'C' to change the default configuration and transaction files
or any other key to continue.

Input configuration: 5 items, 5 transactions, minsup = 40%

Frequent 1-itemsets:
[1, 2, 3, 4, 5]
Frequent 2-itemsets:
[1 2, 1 3, 1 4, 1 5, 2 3, 2 4, 3 4, 3 5, 4 5]
Frequent 3-itemsets:
[1 2 3, 1 2 4, 1 3 4, 1 3 5, 1 4 5, 2 3 4, 3 4 5]
Frequent 4-itemsets:
[1 2 3 4, 1 3 4 5]

Execution time is: 0 seconds.
hercules[68]%

Execution of dic.java

These are the same results we obtained earlier when working through the Apriori algorithm by hand.
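The listing can also be cross-checked without either algorithm by counting the support of every possible itemset directly. The sketch below does this for the five example transactions. The class SupportCheckSketch and its methods are hypothetical names, and the sketch assumes that an itemset is frequent when its support is greater than or equal to the minsupp percentage.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical checker for illustration; independent of dic.java.
class SupportCheckSketch {
    // The five example transactions; column i holds item i+1.
    static final int[][] DB = {
        {1, 1, 1, 0, 0},
        {1, 1, 1, 1, 1},
        {1, 0, 1, 1, 0},
        {1, 0, 1, 1, 1},
        {1, 1, 1, 1, 0}
    };

    // Counts the transactions that contain every item selected by the bitmask.
    static int support(int mask) {
        int count = 0;
        for (int[] row : DB) {
            boolean containsAll = true;
            for (int i = 0; i < row.length; i++) {
                if ((mask & (1 << i)) != 0 && row[i] == 0) containsAll = false;
            }
            if (containsAll) count++;
        }
        return count;
    }

    // Lists all k-itemsets whose support is at least minSupPercent of the database.
    static List<String> frequent(int k, int minSupPercent) {
        int numItems = DB[0].length;
        List<String> result = new ArrayList<>();
        for (int mask = 1; mask < (1 << numItems); mask++) {
            if (Integer.bitCount(mask) != k) continue;
            if (support(mask) * 100 < minSupPercent * DB.length) continue;
            StringBuilder name = new StringBuilder();
            for (int i = 0; i < numItems; i++) {
                if ((mask & (1 << i)) != 0) {
                    if (name.length() > 0) name.append(' ');
                    name.append(i + 1);
                }
            }
            result.add(name.toString());
        }
        // With single-digit item numbers, string order matches the listing order.
        Collections.sort(result);
        return result;
    }

    public static void main(String[] args) {
        for (int k = 1; k <= 5; k++) {
            System.out.println("Frequent " + k + "-itemsets: " + frequent(k, 40));
        }
    }
}
```

Enumerating all subsets is only practical for a handful of items, which is why it serves here as a check rather than as a replacement for Apriori or DIC. For k = 5 the list is empty, since the full itemset occurs in only one transaction.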