[Arxiv] [Code]

We introduce WavCraft, an AI-empowered assistant that leverages large language models (LLMs) to edit audio content following human instructions. Specifically, WavCraft prompts LLMs to decompose a user's request into several tasks and tackles each task with the corresponding expert model. By combining in-context learning with a set of expert models, WavCraft enriches audio content with more details and rationales, making it easier for users to control audio quality. WavCraft can also cooperate with humans through dialogue and can even create audio content without specific user guidance. Experiments demonstrate that WavCraft outperforms existing methods, especially when only a local region of an audio clip needs editing. Moreover, WavCraft can follow complex instructions to edit and even create audio content on top of input recordings, which further meets the practical demands of audio producers.
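The decompose-then-dispatch idea in the abstract can be illustrated with a minimal sketch. This is not WavCraft's actual code: the `Task` type, the `EXPERTS` registry, and the two toy "experts" are hypothetical names invented for illustration, audio is modelled as a plain list of samples, and the plan is written by hand where a real system would obtain it from an LLM.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Assumption: audio is a plain list of float samples, to keep the sketch dependency-free.
Audio = List[float]


@dataclass
class Task:
    """One step of a plan (hypothetical structure, not WavCraft's real format)."""
    name: str   # key of the expert to call
    args: dict  # keyword arguments passed to that expert


def add_sound(audio: Audio, *, sound: Audio, start: int) -> Audio:
    """Toy 'addition' expert: mix `sound` into `audio` from sample index `start`."""
    # Pad with silence if the added sound runs past the end of the clip.
    out = list(audio) + [0.0] * max(0, start + len(sound) - len(audio))
    for i, sample in enumerate(sound):
        out[start + i] += sample
    return out


def drop_segment(audio: Audio, *, start: int, end: int) -> Audio:
    """Toy 'removal' expert: silence samples in [start, end).

    A stand-in only; a real system would use a separation or inpainting model.
    """
    return [0.0 if start <= i < end else s for i, s in enumerate(audio)]


# Hypothetical registry mapping task names to expert functions.
EXPERTS: Dict[str, Callable[..., Audio]] = {
    "add": add_sound,
    "drop": drop_segment,
}


def run_plan(audio: Audio, plan: List[Task]) -> Audio:
    """Apply each task in order, feeding the output of one expert into the next."""
    for task in plan:
        audio = EXPERTS[task.name](audio, **task.args)
    return audio


# Hand-written plan standing in for the LLM's decomposition of an instruction.
plan = [
    Task("add", {"sound": [0.5, 0.5], "start": 0}),
    Task("drop", {"start": 2, "end": 4}),
]
edited = run_plan([1.0, 1.0, 1.0, 1.0], plan)
```

Keeping the experts behind a name-to-function registry means the planner only has to emit task names and arguments, which is one plausible way to let a text-only model drive audio tools.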

<aside> 📖 Basic features

Advanced features

Basic features

We present case studies comparing WavCraft with SOTA end-to-end audio editing and generation models. For audio editing, we evaluated (a) SDEdit, (b) AUDIT, and (c) WavCraft. For text-to-audio generation, we evaluated (a) AudioLDM, (b) Tangle, (c) WavJourney, and (d) WavCraft.


Addition

Instruction: add a bell in the beginning

Input:

Machine gun, while bell in the beginning_input.wav

SDEdit:

Machine gun, while bell in the beginning_sdedit.wav

AUDIT:

Machine gun, while bell in the beginning_audit.wav

WavCraft:

Machine gun, while bell in the beginning_wavcraft.wav


Removal

Instruction: drop a short firework explosion in the end

Input:

Vehicle horn, car horn, honking, while fireworks in the end.wav

AUDIT:

Vehicle horn, car horn, honking, while fireworks in the end_audit.wav

SDEdit:

Vehicle horn, car horn, honking, while fireworks in the end_sedit.wav

WavCraft:

Vehicle horn, car horn, honking, while fireworks in the end_output.wav


Replacement

Instruction: replace wind instrument with drum kit