You can compile the code yourself from scratch as follows:
cd /code
javac *.java multibinning/business/*.java multibinning/data/*.java
You can then create the .jar file as follows:
jar cmf mainclass.txt ipd.jar *.class multibinning/business/*.class multibinning/data/*.class
You run IPD as follows:
java -jar ipd.jar
where for the IPD methods you'll need to specify the following parameters:
Parameter | Meaning |
---|---|
-FILE_INPUT | name of input file |
-FILE_CP_OUTPUT | name of output file for cut points per dimension |
-FILE_RUNTIME_OUTPUT | name of output file for runtime in microseconds |
-FILE_DATA_OUTPUT | name of output file for discretized data in .arff format |
-NUM_ROWS | number of data points |
-NUM_MEASURE_COLS | number of numeric dimensions |
-NUM_CAT_CONTEXT_COLS | number of categorical dimensions |
-MAX_VAL | maximum value, used for normalization |
-METHOD | method used (0 for IPD_opt, 2 for IPD_gr) |
For example,
java -jar ipd.jar -FILE_INPUT example/simple.csv -FILE_CP_OUTPUT cuts.txt
-FILE_RUNTIME_OUTPUT runtime.txt -FILE_DATA_OUTPUT out.txt -NUM_ROWS 209
-NUM_MEASURE_COLS 2 -NUM_CAT_CONTEXT_COLS 0 -MAX_VAL 1.0 -METHOD 0
the output files of which have been included in /example.
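
If you want to launch IPD from your own Java code, the call above can be assembled programmatically. The following is a minimal sketch, not part of the IPD distribution: the class `IpdCommand` and its methods `build` and `exampleParams` are hypothetical names introduced here, and the parameter values are simply those of the example above. It only builds the argument list; the commented-out `ProcessBuilder` line shows how it could be run, assuming `ipd.jar` and the input file exist in the working directory.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of IPD: assembles the command line
// documented above from a map of parameter names to values.
final class IpdCommand {

    // Returns ["java", "-jar", jarPath, "-KEY1", "value1", ...] in map order.
    static List<String> build(String jarPath, Map<String, String> params) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-jar");
        cmd.add(jarPath);
        for (Map.Entry<String, String> e : params.entrySet()) {
            cmd.add("-" + e.getKey());
            cmd.add(e.getValue());
        }
        return cmd;
    }

    // The parameter values of the example call in this README.
    static Map<String, String> exampleParams() {
        Map<String, String> p = new LinkedHashMap<>(); // keeps insertion order
        p.put("FILE_INPUT", "example/simple.csv");
        p.put("FILE_CP_OUTPUT", "cuts.txt");
        p.put("FILE_RUNTIME_OUTPUT", "runtime.txt");
        p.put("FILE_DATA_OUTPUT", "out.txt");
        p.put("NUM_ROWS", "209");
        p.put("NUM_MEASURE_COLS", "2");
        p.put("NUM_CAT_CONTEXT_COLS", "0");
        p.put("MAX_VAL", "1.0");
        p.put("METHOD", "0"); // 0 for IPD_opt, 2 for IPD_gr
        return p;
    }

    public static void main(String[] args) {
        List<String> cmd = build("ipd.jar", exampleParams());
        System.out.println(String.join(" ", cmd));
        // To actually run IPD (assumes ipd.jar and the input file exist):
        // int exit = new ProcessBuilder(cmd).inheritIO().start().waitFor();
    }
}
```

Passing the arguments as a list rather than one concatenated string avoids quoting problems when file names contain spaces.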
Nguyen, H-V, Müller, E, Vreeken, J & Böhm, K. Unsupervised Interaction-Preserving Discretization of Multivariate Data. Data Mining and Knowledge Discovery, vol. 28(5), pp. 1366-1397, Springer, 2014.
All synthetic and benchmark data sets used in Nguyen et al. (2014) can be found in /data. The benchmark datasets were drawn from the UCI Machine Learning Repository. The PAMAP data sets can be found at the PAMAP website.
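
When running IPD on one of these data sets, -NUM_ROWS and the column counts have to be supplied by hand. The sketch below shows one way to count them. It is not part of IPD, and it assumes a plain comma-separated file with one data point per line and no header row, which this README does not guarantee for every file. The class name `CsvShape` is hypothetical. It reports only the total number of columns; splitting that into -NUM_MEASURE_COLS and -NUM_CAT_CONTEXT_COLS still requires knowing which columns are numeric.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of IPD: counts rows and columns of a
// comma-separated file (assumed: no header, one data point per line).
final class CsvShape {
    final int rows;
    final int cols;

    CsvShape(int rows, int cols) {
        this.rows = rows;
        this.cols = cols;
    }

    static CsvShape of(Reader source) {
        int rows = 0;
        int cols = 0;
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // skip blank lines
                }
                int n = line.split(",", -1).length; // -1 keeps trailing empty fields
                if (rows == 0) {
                    cols = n;
                } else if (n != cols) {
                    throw new IllegalArgumentException(
                        "data row " + (rows + 1) + " has " + n + " fields, expected " + cols);
                }
                rows++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new CsvShape(rows, cols);
    }

    public static void main(String[] args) throws IOException {
        // With no argument, fall back to a tiny in-memory sample.
        Reader r = args.length > 0
            ? Files.newBufferedReader(Paths.get(args[0]))
            : new StringReader("0.1,0.2\n0.3,0.4\n0.5,0.6\n");
        CsvShape s = of(r);
        System.out.println("-NUM_ROWS " + s.rows + " (total columns: " + s.cols + ")");
    }
}
```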
The IPD code contains a wide spectrum of discretisation methods that are currently not documented. Two such methods, Supervised Univariate Discretisation, and PCA-based Discretisation, require the Weka framework.
While these are enabled in the provided jar file, they have been disabled in the source code to ease compilation and general use.
Most of the IPD code was written by Hoang Vu Nguyen. Kailash Budhathoki fixed bugs and helped make the code easier to use.