Extracts the meaning of a corpus of natural-language text. Read the wiki for more information.
The author "Li Bingdong" is the same person as "Bingdong Li". The difference arose because git configured the author name automatically.
This is not yet a Python package, so do not try to install it.
The files directly in the directory include:
Creates a large random corpus from NLTK's Brown corpus (nltk.corpus.brown) for testing.
Defines all the classes (e.g. class Entity, class Pointer ...) that contain the different elements.
Extends nltk by wrapping some of its functions that are used regularly.
DEPRECATED Extracts Entities and their Relationships.
DEPRECATED Runs the extract function in SemanticExtraction.py multiple times and writes a test log as a plain text file.
Some basic utilities that extend both Python's native data structures and the structures in SemanticExtraction.py.
A pickled file containing a list of English nouns. It is pickled to save loading time and is generated by the generator file "GenerateNouns.py".
DEPRECATED Another pickled file, containing the Stanford POS tagger. Originally conceived as a workaround to avoid making every user download the Stanford POS tagger library and add it to CLASSPATH; deprecated after this approach turned out not to work.
Like nouns, a pickled file. Contains a dictionary of English words with their POS tags. Only used by is_gerund.
If you have security concerns about the pickled files, regenerate them yourself using the files in generators. Which generator produces which file should be self-evident from the names.
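The regenerate-then-load workflow for the pickled files can be sketched roughly as follows. This is only an illustration using the standard library: the helper names (`generate_nouns`, `save_pickle`, `load_pickle`), the output file name `nouns.pickle`, and the rule "a tag starting with NN marks a noun" are assumptions made for the example, not the actual logic of GenerateNouns.py.

```python
import pickle
import tempfile
from pathlib import Path


def generate_nouns(words_with_tags):
    """Hypothetical stand-in for a generator script: keep the words tagged as nouns."""
    # Assumption: Penn Treebank style tags, where noun tags start with "NN".
    return sorted({word.lower() for word, tag in words_with_tags if tag.startswith("NN")})


def save_pickle(obj, path):
    """Write obj to path as a pickle file."""
    with open(path, "wb") as fh:
        pickle.dump(obj, fh, protocol=pickle.HIGHEST_PROTOCOL)


def load_pickle(path):
    """Load a pickle file; only do this for files you generated or otherwise trust."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


# Tiny hand-written sample instead of a real tagged corpus.
tagged = [("Dog", "NN"), ("runs", "VBZ"), ("cats", "NNS"), ("dog", "NN")]
nouns = generate_nouns(tagged)

# Hypothetical output location; a temporary directory keeps the example self-contained.
path = Path(tempfile.mkdtemp()) / "nouns.pickle"
save_pickle(nouns, path)
restored = load_pickle(path)
```

Regenerating the file locally means the only pickle you ever load is one you produced yourself, which sidesteps the risk of unpickling untrusted data.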