Virtual Analog (VA) modeling aims to simulate the behavior of hardware circuits algorithmically in order to replicate their tone digitally. A Dynamic Range Compressor (DRC) is an audio processing module that controls the dynamics of a track by reducing the volume of loud sounds and amplifying quiet ones, which is essential in music production. In recent years, neural-network-based VA modeling has shown great potential for producing high-fidelity models. However, because existing data is limited in quantity and diversity, the ability of these models to generalize across parameter settings and input sounds remains limited. To tackle this problem, we present Diff-SSL-G-Comp, the first large-scale and diverse dataset for modeling the SSL 500 G-Bus Compressor. Specifically, we manually collected 175 unmastered songs from the Cambridge Multitrack Library and recorded the compressed audio under 220 parameter combinations, resulting in an extensive 2528-hour dataset with diverse genres, instruments, tempos, and keys. Moreover, to facilitate the use of the dataset, we conducted benchmark experiments on various open-source black-box and grey-box models, as well as white-box plugins. We also conducted ablation studies on different data subsets to illustrate the benefit of improved data diversity and quantity. The dataset and demos are on our project page: http://www.yichenggu.com/DiffSSLGComp/.
Diff-SSL-G-Comp is constructed by processing 175 unmastered songs from the Cambridge Multitrack Library with 220 different parameter combinations. It comprises 2258 hours of processed data with diverse genres, instruments, tempos, and keys, as illustrated below.
The figure below compares the acoustic and semantic diversity of Diff-SSL-G-Comp with that of existing datasets, which are sourced mainly from noise and test signals. The more scattered cluster representing the real-world recordings shows that our dataset covers a richer range of acoustic characteristics and semantic content than the existing datasets.
To convey the diversity and quality of the dataset, we provide a few sampled examples below for preview.
In this section, we demonstrate the virtual analog modeling performance of representative black-box and grey-box models trained on Diff-SSL-G-Comp, alongside samples generated by white-box commercial plugins.
All the experimental checkpoints demonstrated in the paper can be found on our Google Drive, and all the samples generated by the commercial plugins can be found on the Hugging Face dataset page. Model configurations and usage guides are also attached. We strongly recommend that researchers run these experiments themselves, since the differences between compression models are usually hard to hear.
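Because audible differences between compression models are subtle, an objective measure can complement listening when reproducing the experiments. The sketch below computes the error-to-signal ratio (ESR), a metric commonly used in virtual analog modeling. It is an illustrative assumption rather than the evaluation code used for this dataset: the function name `error_to_signal_ratio` is hypothetical, and the sine-wave signals are synthetic stand-ins for a hardware recording and a model output.

```python
import numpy as np


def error_to_signal_ratio(target, prediction, eps=1e-12):
    """Return the energy of the error divided by the energy of the target.

    `target` is the reference (e.g. hardware-compressed) audio and
    `prediction` is the model output; both must have the same shape.
    """
    target = np.asarray(target, dtype=np.float64)
    prediction = np.asarray(prediction, dtype=np.float64)
    if target.shape != prediction.shape:
        raise ValueError("target and prediction must have the same shape")
    error_energy = np.sum((target - prediction) ** 2)
    signal_energy = np.sum(target ** 2)
    # eps guards against division by zero for an all-silent target.
    return float(error_energy / (signal_energy + eps))


# Synthetic stand-ins (assumption): one second of a 440 Hz sine at 44.1 kHz.
sample_rate = 44100
t = np.arange(sample_rate) / sample_rate
reference = 0.5 * np.sin(2.0 * np.pi * 440.0 * t)

# A perfect model reproduces the reference exactly.
esr_perfect = error_to_signal_ratio(reference, reference.copy())

# A model whose output is 10% too quiet leaves 1% of the energy as error.
esr_scaled = error_to_signal_ratio(reference, 0.9 * reference)

print(f"ESR (perfect match): {esr_perfect:.6f}")
print(f"ESR (10% gain error): {esr_scaled:.6f}")
```

A lower ESR means the prediction is closer to the reference; a value of 0 indicates an exact match. To apply this to real audio, replace the synthetic arrays with time-aligned samples loaded from the reference and model-output files.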