-
VoiceStar Public
VoiceStar: Robust, Duration-controllable TTS that can Extrapolate
-
VoiceCraft Public
Zero-Shot Speech Editing and Text-to-Speech in the Wild
-
PromptingWhisper Public
Prompting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation
-
word-discovery Public
Word Discovery in Visually Grounded, Self-Supervised Speech Models
-
syllable-discovery Public
Syllable Segmentation and Cross-Lingual Generalization in a Visually Grounded, Self-Supervised Speech Model
-
FaST-VGS-Family Public
Transformer-based visually grounded speech models
-
vqwordseg Public
Forked from kamperh/vqwordseg
Unsupervised phone and word segmentation using dynamic programming on self-supervised VQ features.
Jupyter Notebook MIT License Updated Jun 19, 2022
-
MAE-AST-Public Public
Forked from AlanBaade/MAE-AST-Public
Public Code for the paper MAE-AST: Masked Autoencoding Audio Spectrogram Transformer
Python Updated Jun 9, 2022
-
moment_detr Public
Forked from jayleicn/moment_detr
[NeurIPS 2021] Moment-DETR code and QVHighlights dataset
-
HERO_Video_Feature_Extractor Public
Forked from linjieli222/HERO_Video_Feature_Extractor
Video Feature Extraction Code for EMNLP 2020 paper "HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training"
Python MIT License Updated Apr 12, 2022
-
zerospeech2021_baseline Public
Forked from kamperh/zerospeech2021_baseline
BERT and LSTM baseline models of the ZeroSpeech Challenge 2021
Python Updated Feb 22, 2022
-
academicpages Public template
GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes
JavaScript MIT License Updated Apr 6, 2021
-
para-nmt-50m Public
Forked from jwieting/para-nmt-50m
Pre-trained models and code and data to train and use models from "Pushing the Limits of Paraphrastic Sentence Embeddings with Millions of Machine Translations"
Python Updated Nov 30, 2017

