diff --git a/README.md b/README.md
index e167b14..e187932 100644
--- a/README.md
+++ b/README.md
@@ -4,7 +4,7 @@ Cuong Xuan Chu, Simon Razniewski, Gerhard Weikum. (WSDM 2020)
 
 Project website: https://www.mpi-inf.mpg.de/yago-naga/entyfi
 
-# Dependencies
+## Dependencies
 - Maven, Java 1.8
 - Python2 for mention detection.
   - cPickle
@@ -23,12 +23,12 @@ Project website: https://www.mpi-inf.mpg.de/yago-naga/entyfi
   - ast, pulp
 - Pretrained word embeddings: "wget http://nlp.stanford.edu/data/glove.840B.300d.zip".
 
-# Required Data
+## Required Data
 You need to download required data which include background knowledge bases of all reference universes, pretrained models for fictional typing module and data for reference universe ranking.
 
 All data can be found at: http://people.mpi-inf.mpg.de/~cxchu/entyfi/
 
-# Configuration
+## Configuration
 To run typing, you need to set some paths in several files:
 - ultrafile/resources/constant.py
   - GLOVE_VEC=path to pretrained word embedding (glove)
@@ -41,7 +41,7 @@ To run typing, you need to set some paths in several files:
   - ATTENTION_MODEL=path to pretrained model of fictional typing module --- attentionModel (downloaded data)
   - TERMATRIX=path to universe-term matrix for reference universe ranking --- universe-termmatrix (downloaded data)
 
-# How to Run
+## How to Run
 - Build: ./build.sh
 - Run typing: ./run.sh heap-size typing.ENTYFI input-file output-file
 
@@ -50,6 +50,6 @@ To run typing, you need to set some paths in several files:
 
 Other parameters like topK reference universes or topK types returned by ILP can be defined in class typing/ENTYFI.java
 
-# Notes
+## Notes
 - For mention detection, to improve efficiency, we use the technique from the paper: https://arxiv.org/abs/1603.01360
 