Features in Context

This repo contains code for predicting semantic features in context.

First, clone the repo over ssh:

git clone git@github.com:gchronis/features_in_context.git

Using a saved model to predict features

obtain a model save file, which is named something like model.plsr.buchanan.allbuthomonyms.5k.300components.500max_iters
make a directory in the top level of the repo called trained_models

mkdir trained_models

move the save file to this directory

mv <model save file> ./trained_models

navigate to the notebooks directory

cd notebooks

launch jupyter

jupyter notebook

open the examine_features_in_context notebook
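For scripted setups, the directory and move steps above could also be done from Python. This is only a sketch: `stage_model` is a hypothetical helper, not part of the repo, and the demonstration runs in a temporary directory with a placeholder file instead of a real save file.

```python
from pathlib import Path
import shutil
import tempfile


def stage_model(save_file: Path, repo_root: Path) -> Path:
    """Move a model save file into <repo_root>/trained_models and return its new path."""
    target_dir = repo_root / "trained_models"
    target_dir.mkdir(exist_ok=True)  # same effect as `mkdir trained_models`
    destination = target_dir / save_file.name
    shutil.move(str(save_file), str(destination))  # same effect as `mv <file> ./trained_models`
    return destination


# Demonstration in a throwaway directory, so no real checkout is touched.
with tempfile.TemporaryDirectory() as tmp:
    repo_root = Path(tmp) / "features_in_context"
    repo_root.mkdir()
    download = Path(tmp) / "model.plsr.buchanan.allbuthomonyms.5k.300components.500max_iters"
    download.write_bytes(b"placeholder")  # stands in for a real save file
    staged = stage_model(download, repo_root)
    staged_ok = staged.exists() and staged.parent.name == "trained_models"
    print(staged_ok)
```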

Training a model from scratch

To train a model from scratch, use the script classifier_main.py, for example:

python3 classifier_main.py --model=plsr --allbuthomonyms --k_fold=4 --layer=8 --clusters=1 --embedding_type=glove --train_data=mc_rae_real --plsr_n_components=100 --plsr_max_iter=500

or

python3 classifier_main.py --train_data=buchanan --allbuthomonyms --embedding_type=glove --model=ffnn --layer=8 --clusters=1 --epochs=50 --dropout=0.0 --lr=1e-4 --hidden_size=300 --save_path='trained_models/model.ffnn.buchanan.glove.50epochs.0.0dropout.lr1e-4.hsize300'

or

python3 classifier_main.py --train_data=binder --seed=42 --embedding_type=bert --model=modabs --layer=8 --clusters=5 --mu1=1 --mu2=0.1 --mu3=1e-07 --mu4=5 --nnk=3 --save_path='trained_models/main_82b2e_00003_3_clusters=5,embedding_type=bert,model=modabs,mu2=0.1,mu3=1e-07,mu4=5,nnk=3,train_data=binder_2022-10-13_20-50-01'
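When launching several runs, it may be easier to assemble these command lines programmatically than to type them by hand. The sketch below rebuilds the first example command from a dictionary of options. `build_command` is a hypothetical helper, not something the repo provides, and the option names are copied from the examples above.

```python
import shlex


def build_command(options, flags=()):
    """Turn option/value pairs and bare flags into an argument list for classifier_main.py."""
    command = ["python3", "classifier_main.py"]
    command += [f"--{name}={value}" for name, value in options.items()]
    command += [f"--{flag}" for flag in flags]
    return command


plsr_run = build_command(
    {
        "model": "plsr",
        "k_fold": 4,
        "layer": 8,
        "clusters": 1,
        "embedding_type": "glove",
        "train_data": "mc_rae_real",
        "plsr_n_components": 100,
        "plsr_max_iter": 500,
    },
    flags=["allbuthomonyms"],
)

# shlex.join quotes each argument, so the printed line can be pasted into a shell.
print(shlex.join(plsr_run))
# To actually launch the run from the repo root: subprocess.run(plsr_run, check=True)
```

Passing the list form to `subprocess.run` avoids shell quoting problems with save paths that contain commas or equals signs, as in the third example.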

Non-optional arguments

argument	explanation
--train_data=x	x can be mc_rae_real, buchanan, or binder (mcrae is also implemented; it uses the lemmatized, normalized Buchanan version of the McRae norms)
--model=x	`plsr`, `ffnn`, or `modabs` (a.k.a. label propagation)
--embedding_type=x	`glove` or `bert`
--layer=x	layer of the BERT embedding to use (embeddings are currently available for layers 8 and 11, but more can be made)
--clusters=x	number of clusters in multiprototype embeddings (1 or 5; if using glove, this is always 1)
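The constraints in this table can be checked before starting a long run. The following is a minimal sketch of such a check with `argparse`. It is a hypothetical stand-in that mirrors the table, not the parser actually used in classifier_main.py, which may accept other values or apply different rules.

```python
import argparse


def make_parser():
    """Hypothetical parser mirroring the documented required arguments."""
    parser = argparse.ArgumentParser(prog="classifier_main.py")
    parser.add_argument("--train_data", required=True,
                        choices=["mc_rae_real", "buchanan", "binder", "mcrae"])
    parser.add_argument("--model", required=True, choices=["plsr", "ffnn", "modabs"])
    parser.add_argument("--embedding_type", required=True, choices=["glove", "bert"])
    parser.add_argument("--layer", required=True, type=int)
    parser.add_argument("--clusters", required=True, type=int, choices=[1, 5])
    return parser


def check_args(argv):
    """Parse argv and enforce the rule that glove embeddings use a single cluster."""
    parser = make_parser()
    args = parser.parse_args(argv)
    if args.embedding_type == "glove" and args.clusters != 1:
        parser.error("--clusters must be 1 when --embedding_type=glove")
    return args


args = check_args([
    "--train_data=binder", "--model=modabs", "--embedding_type=bert",
    "--layer=8", "--clusters=5",
])
print(args.model, args.embedding_type, args.clusters)
```

With `--embedding_type=glove --clusters=5`, `parser.error` prints a usage message and exits instead of returning.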

Other options

argument	explanation
--k_fold=n	do k-fold cross-validation with n folds; reports average metrics over all folds
--allbuthomonyms	trains on all words in the test set except for homonyms
--save_path=str	str is the path to save the trained model to; defaults to not saving; the path should point into `./trained_models`
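To make "average metrics over all folds" concrete, here is a small sketch of k-fold cross-validation in plain Python. How the repo actually splits its data (shuffling, seeding, fold assignment) is not described here, so the interleaved split, the word list, and the placeholder metric below are all assumptions for illustration.

```python
from statistics import mean


def k_fold_splits(items, n_folds):
    """Yield (train, test) lists; each item lands in exactly one test fold."""
    for fold in range(n_folds):
        test = items[fold::n_folds]  # every n-th item, starting at offset `fold`
        train = [item for i, item in enumerate(items) if i % n_folds != fold]
        yield train, test


def cross_validate(items, n_folds, evaluate):
    """Average the metric returned by evaluate(train, test) over all folds."""
    return mean(evaluate(train, test) for train, test in k_fold_splits(items, n_folds))


words = ["apple", "bank", "cat", "dog", "eel", "fig", "goat", "hen"]

# Placeholder metric: the fraction of words held out for testing in each fold.
held_out_fraction = cross_validate(words, 4, lambda train, test: len(test) / len(words))
print(held_out_fraction)
```

In a real run, `evaluate` would train a model on `train` and return its score on `test`; the averaging step stays the same.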

Model-specific arguments

PLSR

argument	explanation
--plsr_n_components=x	number of PLSR components
--plsr_max_iter=x	maximum number of iterations

FFNN

argument	explanation
--epochs=n	integer number of training epochs
--dropout=n	dropout (float between 0 and 1, e.g. 0.2)
--lr=1e-4	learning rate
--hidden_size=300	size (number of units) of the hidden layers
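The FFNN example command above encodes these hyperparameters in its `--save_path`. A small helper can keep such names consistent across runs. This is a sketch under the assumption that the naming pattern of that one example is the convention to follow; `ffnn_save_path` is a hypothetical function, not part of the repo.

```python
def ffnn_save_path(train_data, embedding_type, epochs, dropout, lr, hidden_size,
                   directory="trained_models"):
    """Build a save path in the style of the FFNN example command.

    `lr` is passed as a string so it appears exactly as typed on the command line.
    """
    name = (f"model.ffnn.{train_data}.{embedding_type}."
            f"{epochs}epochs.{dropout}dropout.lr{lr}.hsize{hidden_size}")
    return f"{directory}/{name}"


path = ffnn_save_path("buchanan", "glove", epochs=50, dropout=0.0, lr="1e-4", hidden_size=300)
print(path)
```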

Label Propagation

argument	explanation
--mu1=1	-
--mu2=0.1	-
--mu3=1e-07	-
--mu4=5	-
--nnk=3	-
