Audiolab September 2025 Updates

Audiolab on RunDiffusion is now a full AI audio workstation in the cloud. Use preloaded voices, train your own models, generate soundtracks, transcribe audio, and transfer instrument timbres using advanced AI models. No installation needed.
Audiolab on RunDiffusion has grown into a complete cloud-based AI audio suite, supporting voice cloning, music generation, transcription, timbre transfer, and more. Whether you're making original voices, transforming instrument tones, or generating ambient audio from prompts, Audiolab now covers it in a single cloud workspace.

Here’s your updated deep dive into all available tools and features as of September 2025.


Quick Summary of What's Inside

Tab | Purpose
Process | Use preloaded and trained voice models instantly
Train RVC | Train your own voice model from recordings
Music | Generate music/audio using multiple AI engines
TTS | Convert text to speech with dozens of models
Transcribe | Turn speech/audio into text
WaveTransfer | Perform instrument timbre transfer

Process: We’ve Added Voices!

One of the most requested features is finally here: preloaded voice models.

No need to train a model to get started: just select a persona and generate. These character-driven voices let you experiment with different tones, styles, and moods.

Included voice personas:

  • Crimson Legacy – Deep and poetic with assertive presence
  • Electric Indigo – Raw, emotional, and charged with energy
  • Smoky Spirit – Gritty, with a raspy vintage edge
  • Sunset Amber – Warm, mellow, and laid-back
  • Mystic Onyx – Unpredictable and eclectic
  • Velvet Violet – Rich, soulful, and full of nuance

These are well suited to voice testing, prototyping, or making music with AI vocals.


Train RVC: Build Your Own Voice Model

In the Train RVC tab, you can upload your own voice dataset and train a custom voice conversion model.

Key features:

  • Upload ~30–60 minutes of audio or input audio URLs
  • Optional vocal separation
  • Adjustable training epochs (2–4000) and batch size (1–40)
  • Index building for smoother inference

This tool is perfect for content creators, musicians, or voice actors wanting a personal voice model in the cloud.
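
Before starting a long training run, it can help to sanity-check your plan against the ranges listed above. The following is a minimal sketch only: the class name, field names, and helper are hypothetical and are not part of Audiolab, which exposes these settings through its interface rather than through code. Only the numeric ranges come from the feature list.

```python
from dataclasses import dataclass

# Ranges quoted in the Train RVC feature list; everything else here is hypothetical.
EPOCH_RANGE = (2, 4000)
BATCH_SIZE_RANGE = (1, 40)
SUGGESTED_MINUTES = (30, 60)


@dataclass
class RvcTrainingPlan:
    """Illustrative container for the settings you intend to enter in the tab."""
    dataset_minutes: float
    epochs: int
    batch_size: int

    def problems(self) -> list[str]:
        """Return human-readable issues; an empty list means the plan fits the ranges."""
        issues = []
        if not EPOCH_RANGE[0] <= self.epochs <= EPOCH_RANGE[1]:
            issues.append(f"epochs must be between {EPOCH_RANGE[0]} and {EPOCH_RANGE[1]}")
        if not BATCH_SIZE_RANGE[0] <= self.batch_size <= BATCH_SIZE_RANGE[1]:
            issues.append(
                f"batch size must be between {BATCH_SIZE_RANGE[0]} and {BATCH_SIZE_RANGE[1]}"
            )
        if not SUGGESTED_MINUTES[0] <= self.dataset_minutes <= SUGGESTED_MINUTES[1]:
            issues.append("dataset is outside the suggested 30 to 60 minutes of audio")
        return issues


if __name__ == "__main__":
    plan = RvcTrainingPlan(dataset_minutes=45, epochs=200, batch_size=8)
    print(plan.problems() or "plan fits the documented ranges")
```

The dataset length is treated as a suggestion rather than a hard limit, since the list gives it as an approximate figure.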


Music: Text-to-Audio with Multiple Engines

The Music tab now includes several model backends for generating music and sound from text prompts.

Stable Audio

  • Generate ambient textures, instrumentals, or sound effects
  • Controls: duration, inference steps, seed, variation count
  • Negative prompting support

ACE-Step

  • Generate structured music and songs up to 4 minutes long
  • Supports lyrics input and LoRA models (e.g., RapMachine)
  • Base model: ACE-Step-v1-3.5B
  • Controls: duration, seed, advanced generation parameters

YuE

  • Create full tracks with vocals and instrumentation
  • Input genre tags, structured lyrics ([verse], [chorus])
  • Supports optional reference audio
  • Parameters: token limits, segments, batch size, seed, CUDA index

Use these to score video content, generate creative soundscapes, or experiment with generative music ideas.
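
If you draft YuE lyrics outside the app, a small helper can keep the section tags consistent before you paste the text in. This is a sketch under assumptions: the function name is hypothetical, and the exact layout Audiolab expects (space-separated genre tags, blank lines between sections) is assumed here rather than confirmed by this article. Only the `[verse]` and `[chorus]` tag style comes from the list above.

```python
def build_yue_inputs(genre_tags, sections):
    """Assemble a genre line and tagged lyrics text.

    genre_tags: list of strings such as ["pop", "upbeat"].
    sections: list of (label, lines) pairs, e.g. ("verse", ["line one", "line two"]).
    Returns (genre_line, lyrics). The output layout is an assumption.
    """
    # Drop empty tags and join the rest with single spaces.
    genre_line = " ".join(tag.strip() for tag in genre_tags if tag.strip())

    blocks = []
    for label, lines in sections:
        # Each section starts with its tag in square brackets, e.g. [verse].
        body = "\n".join(line.strip() for line in lines if line.strip())
        blocks.append(f"[{label.strip().lower()}]\n{body}")

    # Separate sections with a blank line so each tag stands out.
    lyrics = "\n\n".join(blocks)
    return genre_line, lyrics


if __name__ == "__main__":
    genre, lyrics = build_yue_inputs(
        ["pop", "upbeat"],
        [("verse", ["City lights are calling", "We run until the morning"]),
         ("chorus", ["Hold on, hold on"])],
    )
    print(genre)
    print(lyrics)
```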


TTS (Text-to-Speech): Huge Model Library

The TTS tab includes a wide range of voice synthesis models and customization options:

Model highlights include:

  • DIA – Clean dialog synthesis
  • Zonos – Emotionally rich voice with optional style tags
  • Tacotron2, Glow-TTS, Speedy-Speech, VITS, FastPitch, etc.

Features:

  • Voice cloning from reference audio
  • Speaker tagging for multi-speaker text (e.g., [S1], [S2])
  • Language selection and speed control

Great for podcasting, video narration, dialogue generation, or storytelling.
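
For multi-speaker scripts, the speaker tags can be generated from a list of dialogue turns instead of being typed by hand. The sketch below is illustrative only: the function name is hypothetical, and joining turns with single spaces is an assumption about what the TTS tab accepts. The `[S1]`, `[S2]` tag style is the one mentioned in the feature list.

```python
def tag_dialogue(turns):
    """Convert (speaker_name, text) pairs into one string with [S1], [S2], ... tags.

    Speakers are numbered in order of first appearance, so the same person
    always gets the same tag.
    """
    speaker_ids = {}
    parts = []
    for speaker, text in turns:
        if speaker not in speaker_ids:
            speaker_ids[speaker] = len(speaker_ids) + 1
        parts.append(f"[S{speaker_ids[speaker]}] {text.strip()}")
    # Assumption: turns are separated by a single space on one line.
    return " ".join(parts)


if __name__ == "__main__":
    script = [
        ("Host", "Welcome back to the show."),
        ("Guest", "Thanks for having me."),
        ("Host", "Let's get started."),
    ]
    print(tag_dialogue(script))
```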


Transcribe: WhisperX + Whisper in the Cloud

Upload an audio file and get clean, speaker-separated transcriptions in just minutes.

Powered by:

  • WhisperX (recommended)
  • OpenAI Whisper

Features:

  • Speaker diarization (multi-speaker labeling)
  • Timestamps with word alignment
  • Batch size and compute control

Ideal for interviews, meetings, subtitle generation, or audio cleanup.
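
For subtitle generation, timestamped segments can be turned into an SRT file with a few lines of code. This is a minimal sketch, assuming each segment is a dictionary with `start` and `end` times in seconds, a `text` field, and an optional `speaker` label. That shape resembles what WhisperX-style tools produce, but the export format Audiolab actually provides is not described in this article, so treat the input structure and function names as assumptions.

```python
def format_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Build SRT text from segment dicts (assumed keys: start, end, text, speaker)."""
    entries = []
    for index, seg in enumerate(segments, start=1):
        text = seg["text"].strip()
        speaker = seg.get("speaker")
        if speaker:
            # Prefix the diarization label so each subtitle shows who is speaking.
            text = f"{speaker}: {text}"
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        entries.append(f"{index}\n{start} --> {end}\n{text}")
    # SRT entries are separated by a blank line.
    return "\n\n".join(entries) + "\n"


if __name__ == "__main__":
    demo = [
        {"start": 0.0, "end": 1.5, "text": " Hello there.", "speaker": "SPEAKER_00"},
        {"start": 1.5, "end": 3.25, "text": "Hi, good to see you.", "speaker": "SPEAKER_01"},
    ]
    print(segments_to_srt(demo))
```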


WaveTransfer: Timbre Transfer with Diffusion

WaveTransfer transfers the timbre of one instrument to another while preserving the original musical structure.

Two fully supported modes:

Inference

  • Apply a trained timbre model to new audio
  • Select project
  • Control noise steps (10–1000)
  • Use chunked decoding for smoother generation

Train

  • Train a new model to learn the timbre of your chosen instrument
  • Input training data and project details
  • Two-phase training: model and schedule network
  • Start, cancel, and monitor training sessions

Perfect for composers, remix artists, and sound designers exploring new tonal blends.


How to Get Started

  1. Log in to RunDiffusion
  2. Go to Open Source Apps
  3. Select Audiolab from the left panel
  4. Click Launch to start your session
  5. Begin using any of the 6 tabs above

No installations. No local setup. Everything runs in the cloud.

Why Use Audiolab on RunDiffusion?

  • Hosted on high-performance GPUs
  • Built for creators: no coding required
  • Offers both fast prototyping and deep customization
  • Save time with preloaded voices and one-click workflows

Whether you’re a content creator, voice artist, developer, or musician, Audiolab offers a modular toolset to accelerate your audio projects.


About the author
Adam Stewart
