Stars
Code for SuDoRm-Rf networks for efficient audio source separation. SuDoRm-Rf stands for SUccessive DOwnsampling and Resampling of Multi-Resolution Features which enables a more efficient way of sep…
An open-source pipeline for training natural language understanding models
The most powerful local music generation model that outperforms almost all commercial alternatives, supporting Mac, AMD, Intel, and CUDA devices.
Audio processing using a PyTorch 1D convolutional network
YSDA course in Speech Processing.
Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for deep learning.
Benchmarks of popular audio I/O packages
A lightweight yet powerful audio-to-MIDI converter with pitch bend detection
Fine-tune Stable Audio Open with DiT ControlNet.
Metrics for evaluating music and audio generative models – with a focus on long-form, full-band, and stereo generations.
Di♪♪Rhythm: Blazingly Fast and Embarrassingly Simple End-to-End Full-Length Song Generation with Latent Diffusion
Flexible LoRA Implementation to use with stable-audio-tools
Simple implementation of diffusion forcing https://arxiv.org/abs/2407.01392
A family of state-of-the-art Transformer-based audio codecs for low-bitrate high-quality audio coding.
Text and image to video generation: Kandinsky 4.0 (2024)
The source code and pre-trained model of the paper "On the Preparation and Validation of a Large-scale Dataset"
AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation
My implementation of object moving from DragonDiffusion https://arxiv.org/pdf/2307.02421.pdf
A technical report on convolution arithmetic in the context of deep learning
👋 Motion detection using the doppler effect



