Learning to generate questions from text.
Blog posts on this project:
Link1 : https://software.intel.com/en-us/articles/using-natural-language-processing-for-smart-question-generation
Link2 : http://dynamichub.in/aditya/sqg/
- Sentence Selection: This module selects topically important sentences from a text document.
- Gap Selection: This module uses the Stanford Parser to extract NPs (noun phrases) and ADJPs (adjective phrases) from the important sentences as candidate gaps.
- Question Formation: This module generates actual questions from the fill-in-the-blank questions. It uses the NLTK parser and grammar syntax rules to do so.
- Question Classification: This module classifies question quality with a pre-trained SVM classifier (trained only for fill-in-the-blank questions).
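
To illustrate the idea behind gap selection and fill-in-the-blank questions, here is a minimal sketch. It is not the project's implementation: the function name `make_blank_questions` and the `BLANK` marker are hypothetical, and the candidate gaps are passed in by hand rather than extracted with the Stanford Parser.

```python
BLANK = "_____"  # hypothetical marker for the removed phrase


def make_blank_questions(sentence, candidate_gaps):
    """Build one fill-in-the-blank question per candidate gap phrase.

    `candidate_gaps` stands in for the NP/ADJP phrases that a parser
    would extract; phrases not found in the sentence are skipped.
    """
    questions = []
    for gap in candidate_gaps:
        if gap and gap in sentence:
            # Replace only the first occurrence of the phrase.
            questions.append({
                "question": sentence.replace(gap, BLANK, 1),
                "answer": gap,
            })
    return questions
```

Each returned item pairs the blanked sentence with the phrase that was removed, which is the form a later quality classifier could score.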
- Install Python 2.7 on your system.
- Clone the repository and install the requirements:
git clone https://github.com/adityasarvaiya/Automatic_Question_Generation.git
cd Automatic_Question_Generation
pip install -r requirements.txt
- If you have a problem with the dotenv package, uninstall dotenv and install python-dotenv.
- Install NLTK and download the required data:
pip install nltk
python
import nltk
nltk.download("punkt")
nltk.download("stopwords")
nltk.download("averaged_perceptron_tagger")
- Create a folder to host all the Stanford models, e.g.
mkdir /your-path-to-stanford-models/stanford-models
- Download the Stanford Parser, unzip it, and:
  - Move `stanford-parser.jar` to the Stanford models folder, e.g. `/your-path-to-stanford-models/stanford-models/stanford-parser.jar`.
  - Move `stanford-parser-x-x-x-models.jar` to the Stanford models folder.
  - Unzip `stanford-parser-x-x-x-models.jar` and move `/edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz` to `stanford-models/`.
- Download Stanford NER, unzip it, and:
  - Move `stanford-ner.jar` to the Stanford models folder.
  - Move `stanford-ner-x-x-x.jar` to the Stanford models folder (e.g. 3.7.0).
  - Move `/classifiers/english.all.3class.distsim.crf.ser.gz` to the Stanford models folder.
The Stanford models folder should look like this:
- stanford-models/
| - stanford-parser.jar
| - stanford-parser-x-x-x-models.jar
| - englishPCFG.ser.gz
| - stanford-ner.jar
| - stanford-ner-x-x-x.jar
| - english.all.3class.distsim.crf.ser.gz
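
A quick way to confirm the folder matches the listing above is to check it from Python. This is a minimal sketch, not part of the project: the function name `missing_model_files` is hypothetical, and the `*` wildcard is assumed to stand in for the version part written as `x-x-x`.

```python
import fnmatch
import os

# File name patterns taken from the folder listing above; `*` replaces
# the version placeholder x-x-x (an assumption of this sketch).
EXPECTED_PATTERNS = [
    "stanford-parser.jar",
    "stanford-parser-*-models.jar",
    "englishPCFG.ser.gz",
    "stanford-ner.jar",
    "stanford-ner-*.jar",
    "english.all.3class.distsim.crf.ser.gz",
]


def missing_model_files(models_dir, patterns=EXPECTED_PATTERNS):
    """Return the patterns that match no file in `models_dir`."""
    present = os.listdir(models_dir)
    return [p for p in patterns if not fnmatch.filter(present, p)]
```

An empty list means every expected file was found; otherwise the returned patterns name what still needs to be moved into the folder.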
Create an environment variable file for configuration in the project root with touch .env, then add the following settings to it:
SENTENCE_RATIO = 0.05 #The threshold of important sentences
STANFORD_JARS=/path-to-your-stanford-models/stanford-models/
STANFORD_PARSER_CLASSPATH=/path-to-your-stanford-models/stanford-models/stanford-parser-x.x.x-models.jar
STANFORD_NER_CLASSPATH=/path-to-your-stanford-models/stanford-models/stanford-ner.jar

| ID | Variable Name | Variable Location | Use |
|---|---|---|---|
| 1 | SENTENCE_RATIO | .env file | Controls the ratio of sentences selected from the given text. Range [0,1] |
| 2 | len(entities) > 7 | aqg/utils/gap_selection line 58 | Eliminates any sentence with more than 7 entities |
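
To make the two settings in the table concrete, here is a minimal sketch of how they might be applied. It is not the project's code: the function names, the rounding up with `math.ceil`, and the assumption that sentences arrive already ranked and paired with their entity lists are all illustrative.

```python
import math

MAX_ENTITIES = 7  # mirrors the `len(entities) > 7` check in the table


def select_top_sentences(ranked_sentences, sentence_ratio):
    """Keep the top share of sentences for a ratio in [0, 1].

    Assumes `ranked_sentences` is sorted from most to least important.
    Rounding up is an assumption of this sketch.
    """
    if not 0.0 <= sentence_ratio <= 1.0:
        raise ValueError("SENTENCE_RATIO must be in the range [0, 1]")
    keep = int(math.ceil(sentence_ratio * len(ranked_sentences)))
    return ranked_sentences[:keep]


def drop_entity_heavy(sentences_with_entities, max_entities=MAX_ENTITIES):
    """Discard (sentence, entities) pairs with more than `max_entities` entities."""
    return [
        (sentence, entities)
        for sentence, entities in sentences_with_entities
        if len(entities) <= max_entities
    ]
```

A larger `SENTENCE_RATIO` keeps more of the text as question candidates, while the entity limit removes sentences that are likely too dense to blank out cleanly.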
Project PDF: https://github.com/adityasarvaiya/Automatic_Question_Generation/blob/master/project.pdf