Speech compression
Contact: Sara Grassi
Speech signal compression, which encodes the signal into fewer bits
and decodes it again for playback, has many applications, including
fixed and mobile digital telephony, transmission over packet networks
(Internet), videoconferencing, radio and television, and various
storage applications.
Narrowband speech is defined as speech band-limited in continuous
time to the frequency band [300 - 3400 Hz] and then sampled at 8 kHz.
It is usually called telephone-band speech and is used by the second
generation of mobile telephony (2G; in Europe, GSM phase 2).
Narrowband speech coding yields a reconstructed speech signal of
sufficient quality for telephony. However, the reconstructed signal
does not sound natural, and may sound metallic or even synthetic. It
is sometimes difficult to distinguish two fricative consonants such
as "S" and "F" in such a signal.
Wideband speech is defined as speech band-limited in continuous time
to the frequency band [50 - 7000 Hz] and then sampled at 16 kHz. The
benefit of processing a wideband speech signal is a reconstructed
signal that is clearer, more natural and more intelligible than with
narrowband speech.
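The narrowband and wideband definitions can be checked with a little arithmetic. The sketch below verifies that each band lies below half its sampling rate and computes the bit rate of the uncoded signal. The figure of 16 bits per sample is an assumption made for illustration (it is not stated on this page), and the names `BANDS`, `fits_below_nyquist` and `raw_bit_rate` are hypothetical.

```python
# Band edges and sampling rates as given in the text.
BANDS = {
    "narrowband": {"low_hz": 300, "high_hz": 3400, "fs_hz": 8000},
    "wideband": {"low_hz": 50, "high_hz": 7000, "fs_hz": 16000},
}

# Assumption for illustration only: uniform 16-bit samples before coding.
BITS_PER_SAMPLE = 16


def fits_below_nyquist(high_hz, fs_hz):
    """True if the upper band edge lies below half the sampling rate."""
    return high_hz < fs_hz / 2


def raw_bit_rate(fs_hz, bits_per_sample=BITS_PER_SAMPLE):
    """Bit rate in bit/s of the sampled signal before any compression."""
    return fs_hz * bits_per_sample


for name, band in BANDS.items():
    ok = fits_below_nyquist(band["high_hz"], band["fs_hz"])
    rate = raw_bit_rate(band["fs_hz"])
    print(f"{name}: below Nyquist = {ok}, raw rate = {rate / 1000:.0f} kbit/s")
```

Under that assumption, doubling the sampling rate doubles the raw bit rate, so the compression ratio achieved by the coder matters even more for wideband speech.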
There is a great diversity of speech coding techniques. Waveform
coders use, for example, Pulse Code Modulation (PCM) or Adaptive
Differential PCM (ADPCM). These coders give an excellent quality of
the reconstructed signal but generally achieve only a low compression
ratio. Other techniques, such as those based on CELP (Code Excited
Linear Prediction), use an analysis-by-synthesis coding scheme and
are based on models of speech production and perception. Such coders
give a reconstructed signal of good quality at a higher compression
ratio.
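The analysis-by-synthesis idea can be illustrated with a toy codebook search: each candidate excitation is passed through a linear-prediction synthesis filter, and the encoder keeps the candidate (with its best gain) whose synthesized output is closest to the target frame. This is only a minimal sketch under stated assumptions, not the ESPLAB coder: the filter coefficients, frame length, codebook size and function names are all hypothetical, the codebook is random, and real CELP coders add perceptual weighting, an adaptive codebook and quantization, which are omitted here.

```python
import random


def synthesize(excitation, lpc):
    """All-pole synthesis filter: s[n] = e[n] + sum_k lpc[k] * s[n-1-k]."""
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(lpc):
            if n - 1 - k >= 0:
                acc += a * out[n - 1 - k]
        out.append(acc)
    return out


def search_codebook(target, codebook, lpc):
    """Return (index, gain, error) of the codevector whose synthesized,
    gain-scaled output has the smallest squared error against target."""
    best = None
    for idx, code in enumerate(codebook):
        y = synthesize(code, lpc)
        energy = sum(v * v for v in y)
        if energy == 0.0:
            continue
        # Least-squares optimal gain for this candidate.
        gain = sum(t * v for t, v in zip(target, y)) / energy
        err = sum((t - gain * v) ** 2 for t, v in zip(target, y))
        if best is None or err < best[2]:
            best = (idx, gain, err)
    return best


rng = random.Random(0)
lpc = [1.2, -0.5]   # hypothetical stable 2nd-order predictor coefficients
frame_len = 40      # hypothetical frame length in samples
codebook = [[rng.gauss(0.0, 1.0) for _ in range(frame_len)] for _ in range(64)]

# Build a target frame that the toy coder can represent exactly.
target = [0.7 * v for v in synthesize(codebook[17], lpc)]

index, gain, err = search_codebook(target, codebook, lpc)
print(index, round(gain, 3))
```

In this sketch only the codebook index and the gain would need to be transmitted for the frame, rather than the samples themselves, which is where the higher compression ratio of such schemes comes from.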
ESPLAB has long worked on the software and hardware implementation of
various narrowband speech coding functions. With the spread of
multimedia technologies and the introduction of services for the 3rd
and 4th generations of mobile telephony, wideband speech coding has
become a prominent research field, and ESPLAB has developed a
proprietary multi-rate wideband speech coder.
Example of current development:
Wideband speech coding for storing compressed speech in a portable
device, so that it can be decompressed remotely on a PC. The
decompressed speech is then used for automatic transcription into
text.
Publications:
- Giuseppina Biundo-Lotito, Codage à débit variable de la parole en bande élargie, PhD Thesis, University of Neuchâtel, IMT, June 2003.
- G. Biundo, M. Ansorge, F. Pellandini and P.-A. Farine, "Perceptual
weighting for ACELP wideband speech coder", in Proc. of 4th COST 276
Workshop on information and knowledge management for integrated media
communication, pp. 105-110, Bordeaux, France, March-April 2003.
- G. Biundo, S. Grassi, M. Ansorge, F. Pellandini
and P.-A. Farine, "Design techniques for spectral quantization in
wideband speech coding", in Proc. of 3rd COST 276 Workshop on
information and knowledge management for integrated media communication
(CD-ROM), Budapest, Hungary, Oct. 2002.
- G. Biundo, S. Grassi, M. Ansorge and F.
Pellandini, "Spectral quantization for wideband speech coding", in
Proc. of 1st COST 276 Workshop on information and knowledge management
for integrated media communication (CD-ROM), Leganés (Madrid), Spain,
Nov. 2001.
- S. Grassi, M. Matthey, G. Biundo, A. Dufaux, M.
Ansorge, F. Pellandini, "A Low-Complexity Variable Bit Rate Speech
Coder for Portable Storage Applications", Proc. of Signal Processing
2000 Workshop, Poznan, Poland, Oct. 13th, 2000, pp. 7-12.
- S. Grassi, M. Ansorge, F. Pellandini, "LSP
Calculation Methods for Applications to Speech Coding", Proc. of the
Signal Processing 99 Workshop, Poznan, Poland, October 8, 1999, pp.
61-66.
- S. Grassi, M. Ansorge, F. Pellandini, "An
Algorithm for Fast Direct Calculation of Quantized LSP Parameters",
Proc. of the Fifth Bayona Workshop on Emerging Technologies in
Telecommunications, COST 254, Bayona, Spain, September 6-8, 1999, pp.
158-162.
- S. Grassi, Optimized implementation of speech
processing algorithms, thesis published by the Faculty of Science of
the University of Neuchâtel, Neuchâtel, 1998.