Work in Progress
This repository contains the code accompanying the paper F-Actor: Controllable Conversational Behaviour in Full-Duplex Models.
Spoken conversational systems require more than accurate speech generation to have human-like conversations: to feel natural and engaging, they must produce conversational behaviour that adapts dynamically to the context. Current spoken conversational systems, however, rarely allow such customization, limiting their naturalness and usability. In this work, we present the first open, instruction-following full-duplex conversational speech model that can be trained efficiently under typical academic resource constraints. By keeping the audio encoder frozen and finetuning only the language model, our model requires just 2,000 hours of data, without relying on large-scale pretraining or multi-stage optimization. The model can follow explicit instructions to control speaker voice, conversation topic, conversational behaviour (e.g., backchanneling and interruptions), and dialogue initiation. We propose a single-stage training protocol and systematically analyze design choices. Both the model and training code will be released to enable reproducible research on controllable full-duplex speech systems.
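The training recipe described above (frozen audio encoder, finetuned language model) boils down to a standard PyTorch pattern. The following is a minimal sketch with placeholder modules, not the actual F-Actor implementation:

```python
import torch

# Minimal sketch of the training strategy described above: the audio
# encoder stays frozen and only the language model receives gradients.
# `audio_encoder` and `language_model` are hypothetical stand-ins, not
# the actual F-Actor classes.
audio_encoder = torch.nn.Conv1d(1, 256, kernel_size=400, stride=320)  # placeholder encoder
language_model = torch.nn.TransformerEncoder(
    torch.nn.TransformerEncoderLayer(d_model=256, nhead=4, batch_first=True),
    num_layers=2,
)  # placeholder LM

# Freeze every encoder parameter so the optimizer never updates it.
for param in audio_encoder.parameters():
    param.requires_grad = False

# Only the trainable (LM) parameters are handed to the optimizer.
optimizer = torch.optim.AdamW(
    (p for p in language_model.parameters() if p.requires_grad), lr=1e-4
)
```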
Models and datasets will be released on HuggingFace soon.
- 🤗 Dataset (Behavior-SD, NanoCodec): https://huggingface.co/datasets/maikezu/f-actor-behavior-sd-nanocodec
- 🤗 Dataset (Behavior-SD, Mimi): https://huggingface.co/datasets/maikezu/f-actor-behavior-sd-mimi
## Installation

```bash
conda create -n factor python=3.10
conda activate factor
cd f-actor
pip install .
```
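To verify the installation, a quick sanity check (this assumes the repo installs PyTorch as a dependency; the check itself is generic):

```python
# Quick post-install sanity check (assumes PyTorch is pulled in as a
# dependency of this repo).
import torch

print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
```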
## Training

Example training scripts are located in `scripts/train`.

1. Adapt an example script from `scripts/train` to your needs. More parameters can be found in `arguments.py`.
2. Run the training:

```bash
bash scripts/train/your-train-script.sh
```
## Inference

Example inference scripts for generating dialogues using two instances of the model and prompts from Behavior-SD can be found in `scripts/inference_eval`. If you would like to run inference with F-Actor from HuggingFace, please refer to `scripts/inference_eval/inference_nanocodec_special_tokens.sh`.

1. Adapt an example inference script from `scripts/inference_eval` to your needs.
2. Run inference:

```bash
bash scripts/inference_eval/your-inference-script.sh
```

The generated dialogues will be stored in the output directory specified in the script.
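The real generation logic lives in the scripts above. Purely as an illustration of the two-instance setup, a schematic loop might look like the following, where `FActorStub` and its `step` method are hypothetical placeholders:

```python
# Schematic illustration only: two model instances exchange audio-token
# chunks step by step. The real protocol is implemented in
# scripts/inference_eval; `FActorStub` and `step` are placeholders.
class FActorStub:
    def step(self, incoming_tokens: list[int]) -> list[int]:
        # A real model would consume the other speaker's latest tokens
        # and emit its own next chunk (speech or silence).
        return []

model_a, model_b = FActorStub(), FActorStub()
tokens_a: list[int] = []
tokens_b: list[int] = []

for _ in range(100):  # fixed number of generation steps for the sketch
    # Each instance conditions on what the other produced last,
    # so both streams advance in lockstep.
    tokens_a = model_a.step(tokens_b)
    tokens_b = model_b.step(tokens_a)
```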
## Evaluation

To run the same evaluation metrics as reported in the paper:

1. Add the path of your model to the evaluation script in `scripts/inference_eval`, and add the output directory that was used during inference.
2. Run:

```bash
bash scripts/inference_eval/eval.sh
```
## Generating Custom Dialogues

You can generate custom dialogues using the script `training/inference_example.py`. Before running the script, configure the following options at the bottom of the file:

- Speaker selection (determines the voice used for each character). Four example speaker voices from the original Behavior-SD are provided; select any two for your dialogue:
  - Tom (`tom.wav`)
  - Brian (`brian.wav`)
  - Gweneth (`gweneth.wav`)
  - Rebeka (`rebeka.wav`)
- Starting speaker (which speaker begins the conversation)
- Narrative context (background or setup for the dialogue)
An example configuration can be found in `training/inference_example.py`; a rough sketch is shown below.
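As an illustration of what such a configuration might look like (the variable names here are hypothetical; consult the actual file for the real ones):

```python
# Hypothetical sketch of the options configured at the bottom of
# training/inference_example.py; the real variable names may differ.
speaker_1 = "tom.wav"       # first speaker voice (from the four examples above)
speaker_2 = "gweneth.wav"   # second speaker voice
starting_speaker = 1        # which speaker begins the conversation
narrative_context = (
    "Two old friends run into each other at a train station and catch up."
)
```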
To run the script, use:

```bash
python training/inference_example.py
```

Example dialogues generated with F-Actor can be found in the `example_dialogues` folder.
## Citation

If you use this work, please cite:

```bibtex
@misc{züfle2026factorcontrollableconversationalbehaviour,
      title={F-Actor: Controllable Conversational Behaviour in Full-Duplex Models},
      author={Maike Züfle and Ondrej Klejch and Nicholas Sanders and Jan Niehues and Alexandra Birch and Tsz Kin Lam},
      year={2026},
      eprint={2601.11329},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2601.11329},
}
```