ESC: Efficient Speech Coding with Cross-Scale Residual Vector Quantized Transformers has been accepted to EMNLP 2024.