Physically Accurate Audio Datasets

Audio AI systems should be trained on data that reflects real acoustic behavior, not simplified approximations. Real rooms exhibit diffraction, modal structure, phase interaction, and spatial complexity. Your training data should too. Treble Datasets provide immediate access to curated spatial impulse responses generated with our hybrid wave-based and geometrical acoustics engine. Physics-grounded, cloud-scalable, and ready for modern ML workflows.

Curated Audio and Acoustic Datasets for Audio AI

Lower Word Error Rate
Speech recognition achieves 30% lower WER when the upstream speech-enhancement model is trained on Treble datasets.
Real-World Robustness
Physics-based spatial audio designed to hold up in real-world conditions.
No Simulation Overhead
Access large-scale IR data without running your own simulations.

Controlled Acoustic Variation at Scale

Treble Datasets are organized into controlled room-volume categories to ensure balanced variation and solver fidelity across scales. This enables robust training and validation across fundamentally different acoustic regimes. Each category is summarized below, followed by a short note on what the listed Ambisonics order means in channel terms.

Small Rooms

  • 50 to 180 cubic meters
  • 10,000 impulse responses
  • Transition frequency up to 2 kHz
  • Ambisonics order 16
  • Strong modal behavior and low frequency room effects

    Designed for near-field speech applications, consumer electronics, and compact acoustic environments.

Medium Rooms

  • 90 to 390 cubic meters
  • 10,000 impulse responses
  • Transition frequency up to 1.5 kHz
  • Ambisonics order 16
  • Balanced modal and reflective behavior

    Suitable for meeting rooms, classrooms, and mid-scale architectural spaces.

Large Rooms

  • 300 to 1600 cubic meters
  • 10,000 impulse responses
  • Transition frequency up to 1 kHz
  • Ambisonics order 16
  • Increased reflection complexity and spatial diffusion

    Optimized for performance spaces, public environments, and large-volume acoustic modeling.
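
The category lists above all specify Ambisonics order 16 for the spatial receivers. Assuming full three-dimensional (periphonic) Ambisonics, which is the usual convention although the normalization and channel ordering are not stated here, an order-N receiver carries (N + 1)^2 channels:

    (N + 1)^2 = (16 + 1)^2 = 289 channels per spatial receiver

That channel count is worth keeping in mind when budgeting storage and when rendering receivers down to a specific microphone array.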

Real-World Environment Coverage

Treble Datasets span diverse real-world environments relevant to modern Audio AI.

Current environment categories include:

  • Meeting Rooms
  • Restaurants and Bars
  • Apartments
  • Bathrooms
  • Living Rooms
  • Bedrooms
  • Virtual Test Rooms
  • Trains
  • Car Cabins

These environments span domestic, commercial, transportation, and controlled virtual spaces, enabling training and validation across realistic acoustic scenarios.

Each dataset includes:
  • Wave-based modeling of low- to mid-frequency behavior, including diffraction and room modes
  • Phased geometrical acoustics for accurate high-frequency reflections
  • Spatial receivers, so the dataset is ready to render into your specific microphone array geometry
  • A notebook that shows how to work with the dataset (a minimal usage sketch follows below)
  • Controlled variation across room size, geometry, materials, and source configurations

    This is not approximated room acoustics. These are accurate, full-bandwidth hybrid simulations designed for Audio AI training and validation.
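
As a minimal sketch of how such a dataset can feed an Audio AI pipeline, the Python snippet below convolves a dry (anechoic) speech clip with a multichannel spatial impulse response to produce reverberant multichannel training audio. The file names, channel layout, and sample rates are hypothetical placeholders; the actual file format and any array-rendering utilities are covered in the notebook that ships with each dataset.

    # Minimal sketch (hypothetical file names and layout): turn dry speech plus a
    # multichannel spatial impulse response into reverberant training audio.
    import numpy as np
    import soundfile as sf
    from scipy.signal import fftconvolve, resample_poly

    # Hypothetical inputs: a mono dry speech clip and a spatial IR whose channels
    # belong to one receiver (e.g. an already-rendered microphone array).
    speech, fs_speech = sf.read("dry_speech.wav")        # shape: (T,)
    ir, fs_ir = sf.read("spatial_ir_multichannel.wav")   # shape: (L, C)

    # Bring the speech to the IR sample rate if the two differ.
    if fs_speech != fs_ir:
        speech = resample_poly(speech, fs_ir, fs_speech)

    # Convolve the dry speech with every IR channel: one reverberant signal per channel.
    wet = np.stack(
        [fftconvolve(speech, ir[:, c]) for c in range(ir.shape[1])],
        axis=-1,
    )                                                    # shape: (T + L - 1, C)

    # Normalize to avoid clipping and write out one training example.
    wet /= np.max(np.abs(wet)) + 1e-9
    sf.write("reverberant_speech.wav", wet, fs_ir)

Decoding an order-16 Ambisonics receiver to a specific microphone array additionally requires the array geometry and a set of decoding filters; that rendering step is outside the scope of this sketch.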

Speech recognition with 30% lower WER

When the Right Training Data Is the Only Difference

Training the same multichannel speech-enhancement model on hybrid wave-based simulation data generated with Treble reduced the word error rate by 30% compared with training on typical open-source simulation data. The model and inference setup were identical.

The only difference was the acoustic training data.

Access Treble Datasets

Get access to physics-based spatial audio datasets built for scalable Audio AI development.

Tell us about your use case and data requirements. Our team will provide access details, pricing, and technical documentation.