
timespacce/math_exp_bert

About

Google's Bidirectional Encoder Representations from Transformers (BERT) - a precursor to large language models such as GPT-3 - implemented for multi-GPU and TPU execution and trained on 105M sequences (27B tokens).
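
The repository page itself shows no code, but since the project advertises multi-GPU and TPU support, a minimal sketch of how that device selection typically looks in TensorFlow 2 may help orient readers. This is an illustration only: the helper name `make_strategy` and the `use_tpu`/`tpu_address` parameters are assumptions, not part of this repository.

```python
import tensorflow as tf

def make_strategy(use_tpu: bool, tpu_address: str = "") -> tf.distribute.Strategy:
    """Return a distribution strategy: TPUStrategy for TPUs,
    MirroredStrategy for synchronous multi-GPU training.
    (Illustrative helper; not taken from this repository.)"""
    if use_tpu:
        # Connect to the TPU cluster and initialize its runtime before use.
        resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu=tpu_address)
        tf.config.experimental_connect_to_cluster(resolver)
        tf.tpu.experimental.initialize_tpu_system(resolver)
        return tf.distribute.TPUStrategy(resolver)
    # MirroredStrategy replicates variables across all visible GPUs
    # and falls back to CPU when none are present.
    return tf.distribute.MirroredStrategy()

strategy = make_strategy(use_tpu=False)
with strategy.scope():
    # Variables created inside the scope are mirrored across replicas;
    # a real run would build the BERT encoder here instead of this stub.
    model = tf.keras.Sequential([tf.keras.layers.Dense(8)])
```

The same training loop can then run unchanged on a single GPU, a multi-GPU host, or a TPU pod, which is the usual reason a BERT codebase is written against `tf.distribute` rather than hard-coding a device.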
