Blog(1)
Deep learning has substantially advanced speech separation. The current leading methods build on the time-domain audio separation network (TasNet) [1]. TasNet replaces the fixed time-frequency (T-F) transform with a learnable encoder and decoder. It takes waveform input, reconstructs the sources directly, and computes a time-domain loss with utterance-level permutation invariant training (uPIT). Several approaches have been proposed on top of the TasNet framework, including Conv-TasNet [2], the dual-path recurrent neural network (DPRNN) [3], the dual-path Transformer network (DPTNet) [4], the RNN-free Transformer-based network SepFormer [5], and Sandglasset [6], a self-attentive network with a novel sandglass shape.
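To make the uPIT idea concrete, here is a minimal sketch of an utterance-level permutation invariant loss in plain Python. It is not taken from any of the cited papers' code. It assumes the time-domain loss is negative scale-invariant SNR (SI-SNR), a common choice for TasNet-style models, and the function names `si_snr` and `upit_loss` are hypothetical. The loss tries every assignment of estimated sources to reference sources over the whole utterance and keeps the best one.

```python
import itertools
import math


def si_snr(estimate, target, eps=1e-8):
    """Scale-invariant SNR in dB between two equal-length waveforms.

    Assumption: SI-SNR is used as the time-domain objective; the text
    above only says "time-domain loss".
    """
    n = len(target)
    # Remove the mean so the measure ignores any DC offset.
    est_mean = sum(estimate) / n
    tgt_mean = sum(target) / n
    est = [e - est_mean for e in estimate]
    tgt = [t - tgt_mean for t in target]
    # Project the estimate onto the target to get the scaled target part.
    dot = sum(e * t for e, t in zip(est, tgt))
    tgt_energy = sum(t * t for t in tgt) + eps
    scale = dot / tgt_energy
    s_target = [scale * t for t in tgt]
    # Everything in the estimate that is not along the target is noise.
    noise = [e - s for e, s in zip(est, s_target)]
    signal_energy = sum(s * s for s in s_target)
    noise_energy = sum(x * x for x in noise)
    return 10.0 * math.log10((signal_energy + eps) / (noise_energy + eps))


def upit_loss(estimates, targets):
    """Return (loss, best_perm) for one utterance.

    loss is the negative mean SI-SNR under the best speaker assignment;
    best_perm[i] is the index of the target matched to estimates[i].
    """
    num_sources = len(targets)
    best_loss, best_perm = float("inf"), None
    # Exhaustive search over assignments; fine for 2 or 3 speakers.
    for perm in itertools.permutations(range(num_sources)):
        mean_snr = sum(
            si_snr(estimates[i], targets[perm[i]]) for i in range(num_sources)
        ) / num_sources
        if -mean_snr < best_loss:
            best_loss, best_perm = -mean_snr, perm
    return best_loss, best_perm


# Toy usage: the estimates are the targets in swapped order and rescaled.
targets = [[0.0, 1.0, 0.0, -1.0], [1.0, -1.0, 1.0, -1.0]]
estimates = [[2.0, -2.0, 2.0, -2.0], [0.0, 0.5, 0.0, -0.5]]
loss, perm = upit_loss(estimates, targets)
print(perm, loss < 0)
```

Because the assignment is chosen once per utterance rather than per frame, each output channel stays attached to the same speaker throughout. SI-SNR ignores amplitude scaling, which is why the rescaled estimates in the toy example still match their targets.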
Publications(1)
TFPSNET: TIME-FREQUENCY DOMAIN PATH SCANNING NETWORK FOR SPEECH SEPARATION
Author: Yang Lei, Wei Liu
Published: International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
Date: 2022-05-22
News(1)
Deep learning has substantially advanced speech separation. The current leading methods build on the time-domain audio separation network (TasNet) [1]. TasNet replaces the fixed time-frequency (T-F) transform with a learnable encoder and decoder.