gadelrab/Exception_Enriched_Rules

This is the implementation of the paper "Exception-Enriched Rule Learning from Knowledge Graphs".

Requirements:

  1. Maven 2
  2. Java 8
  3. For development, IntelliJ (recommended)

Prepare Project:

  1. Run the setup scripts:

sh scripts/local_libs.sh

sh scripts/download_large_data.sh

  2. Import the project as a Maven project into IntelliJ, if needed.

Installation:

To install the project, run:

mvn compile

mvn package -Dmaven.test.skip=true

mvn install -Dmaven.test.skip=true

Running

To mine rules with or without exceptions, use mine_rules.sh with the following options:

usage: mine_rules.sh
 -cPM,--cautious-materialization <arg>    Use partial materialization cautiously; <arg> is the minimum support for exceptions
 -de                                     Decode the output
 -dPM,--Debug_materialization <file>     debug Materialization file
 -en                                     Encode the input
 -ex                                     Mine the output
 -exMinSup <EXCEPTION_MIN_SUPP_RATIO>    Exception Minimum support for the rule
 -expOnly                                Output rules with exceptions only
 -exRank,--exception_ranking <order>     Exception ranking method(LIFT|PNCONF|SUPP|CONF|PNCONV|PNJACC)
 -f1,--first_filter                      first filter based on size (4 body atoms at most, 1 head and Max conf)
 -f2,--second_filter                     Second filter based on type hierarchy
 -i,--input <file>                       Input file in the form of RDF or integer transactions
 -m,--mapping_file <file>                Mapping RDF to Integer
 -maxConf <MAX_CONF_RATIO>               Maximum Confidence for the rule (default=1.0)
 -minConf <MIN_CONF_RATIO>               Minimum Confidence for the rule (default=0.001)
 -minS <MIN_SUPP_RATIO>                  Minimum support for the rule(default=0.0001)
 -o,--output <file>                      Output file
 -oDLV,--output_DLV                      Export rules as DLV
 -oDLV_CONFLICT,--export_DLVConflict     Export rules to count conflict to file
 -oPrASP,--output_PrASP                  Export rules as PrASP
 -pm,--materialization                   Use partial materialization
 -PMo,--materialization_order            Materialize with order. Only useful with Materialization
 -s,--sorting <order>                    Output sorting(CONF|HEAD|BODY|LIFT|HEAD_CONF|HEAD_LIFT|NEW_LIFT|CONV)
 -stats,--export_statistics              Export statistics to file
 -w,--weighted_transactions              Count transactions with weights. Only useful with Materialization

Output sorting Methods

HEAD: Sort according to the head predicates (useful for grouping)

BODY: Sort according to the rule body.

CONF: Association Rules Confidence (Original horn rule confidence is used)

LIFT: Association Rules Lift measurement (Original horn rule confidence is used)

HEAD_CONF: Sort according to head then confidence.

HEAD_LIFT: Sort according to head then lift.

NEW_LIFT: Revised Rules Lift

CONV: Conviction measurement (www3.di.uminho.pt/~pja/ps/conviction.pdf).
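
As an illustration of the measures behind these sorting options, the sketch below computes support, confidence, lift, and conviction from raw counts using the standard association-rule definitions. The function name and its arguments are hypothetical and do not come from this repository, and the tool itself may compute these values differently (NEW_LIFT, for instance, is not covered here).

```python
import math


def rule_measures(n_body, n_head, n_both, n_total):
    """Standard measures for a rule "body => head" (illustrative helper).

    n_body  : number of transactions satisfying the body (must be > 0)
    n_head  : number of transactions satisfying the head (must be > 0)
    n_both  : number of transactions satisfying body and head
    n_total : total number of transactions (must be > 0)
    """
    support = n_both / n_total
    confidence = n_both / n_body
    head_prob = n_head / n_total
    lift = confidence / head_prob
    # Conviction is unbounded when the rule never fails.
    if confidence == 1.0:
        conviction = math.inf
    else:
        conviction = (1.0 - head_prob) / (1.0 - confidence)
    return {
        "support": support,
        "confidence": confidence,
        "lift": lift,
        "conviction": conviction,
    }


# Toy counts, made up for illustration only.
measures = rule_measures(n_body=100, n_head=200, n_both=80, n_total=1000)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

A lift or conviction above 1 indicates that the body and head co-occur more often than they would if they were independent.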

Exception ranking Methods

SUPP: Used in the naive approach. Only considers the increase in support.

LIFT: Only considers the increase in lift.

CONF: Only considers the increase in confidence.

PNCONF: Used in partial materialization. Considers the increase of average confidence of positive and negative predictions.

PNCONV: Used in partial materialization. Considers the increase of average conviction of positive and negative predictions.

PNJACC: Used in partial materialization. Considers the increase of average Jaccard Coefficient of positive and negative predictions.
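
To make the idea behind the PN measures concrete, here is a minimal sketch of an averaged positive/negative confidence over sets of entities. It assumes the positive rule is "head <- body, not exception" and the negative rule is "not head <- body, exception"; the function names and toy data are hypothetical, and the actual ranking code in this project may differ.

```python
def confidence(body, head):
    """Fraction of the entities matching the body that also match the head."""
    return len(body & head) / len(body) if body else 0.0


def pn_confidence(body, head, exception):
    """Average confidence of the positive and the negative rule.

    positive rule (assumed): head     <- body, not exception
    negative rule (assumed): not head <- body, exception
    """
    pos_body = body - exception   # entities the revised rule still covers
    neg_body = body & exception   # entities blocked by the exception
    pos_conf = confidence(pos_body, head)
    # Confidence of predicting "not head" on the exceptional entities.
    neg_conf = len(neg_body - head) / len(neg_body) if neg_body else 0.0
    return (pos_conf + neg_conf) / 2


# Toy data, made up for illustration: entities are plain strings.
body = {"a", "b", "c", "d", "e"}   # entities matching the rule body
head = {"a", "b", "c"}             # entities for which the head holds
exception = {"d", "e", "f"}        # entities with the candidate exception

base = confidence(body, head)
revised = pn_confidence(body, head, exception)
print(f"original confidence: {base:.2f}, PN confidence: {revised:.2f}")
```

In this toy case the exception explains every entity for which the original rule fails, so the averaged confidence is higher than the confidence of the original rule.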

Running Experiments:

To run the YAGO experiments:

sh run_experiment.sh <sorting_Type[CONF|HEAD|BODY|LIFT|HEAD_CONF|HEAD_LIFT|NEW_LIFT|CONV]> <RM[LIFT|SUPP|CONF|CONV|JACC]>

To run the IMDB experiments:

sh run_IMDB_experiment.sh <sorting_Type[CONF|HEAD|BODY|LIFT|HEAD_CONF|HEAD_LIFT|NEW_LIFT|CONV]> <RM[LIFT|SUPP|CONF|CONV|JACC]>

Note: edit the paths inside the scripts so that they point to your facts_to_mine.tsv file.

Other Important Scripts:

To convert the KB from RDF to different formats:

rdf2int.sh <required conversion [SPMF|DLV_SAFE|PrASP]> <input file path> <output prefix>

Ex: sh assemble/bin/rdf2int.sh DLV_SAFE /GW/D5data-5/gadelrab/imdb/facts_to_mine_imdb.tsv /GW/D5data-5/gadelrab/imdb/in/facts_to_mine_imdb

SPMF: outputs the transactional KB as numbers (1, 2, 3, ...) for the projected predicates.

DLV_SAFE: outputs a unary encoding in the format p1234t(s1234o) for the projected predicates.

PrASP: outputs PrASP format without encoding, for example isMarriedToScientist(X). (Good for viewing, but causes problems with PrASP.)

A mapping is generated whenever the output is encoded.
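
The following sketch shows one way an integer encoding of this kind could work. It assumes that each subject becomes one transaction and each (predicate, object) pair becomes one integer item, and it prints the result as space-separated integers, one transaction per line. The function name, the triples, and the exact layout are assumptions for illustration; the output of rdf2int.sh may differ.

```python
from collections import defaultdict


def encode_transactions(triples):
    """Turn (subject, predicate, object) triples into integer transactions.

    Returns (transactions, mapping), where mapping sends each
    (predicate, object) pair to its integer item id, starting at 1.
    """
    mapping = {}
    per_subject = defaultdict(set)
    for subject, predicate, obj in triples:
        # Reuse the id of a known pair, otherwise assign the next free one.
        item = mapping.setdefault((predicate, obj), len(mapping) + 1)
        per_subject[subject].add(item)
    transactions = {s: sorted(items) for s, items in per_subject.items()}
    return transactions, mapping


# Hypothetical facts, made up for illustration only.
triples = [
    ("alice", "isMarriedTo", "bob"),
    ("alice", "type", "Scientist"),
    ("carol", "type", "Scientist"),
]

transactions, mapping = encode_transactions(triples)
lines = [" ".join(str(i) for i in items) for items in transactions.values()]
print("\n".join(lines))
```

The returned mapping plays the role of the mapping file: it is what allows the mined integer rules to be decoded back into readable predicates.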

P.S: Other scripts to be added

References

This is an implementation of the paper:

Exception-enriched Rule Learning from Knowledge Graphs. Mohamed Gad-Elrab, Daria Stepanova, Jacopo Urbani, Gerhard Weikum. In 15th International Semantic Web Conference (ISWC 2016), pp. 234-251. Springer, 2016.
