Open Source

An explicit goal of TC-STAR is the development and release of open source software for Spoken Language Translation and Text to Speech Synthesis. This page presents open source software that was developed and made available by TC-STAR partners.

The UPC Voice Conversion Toolkit provides two different methods for performing voice conversion. It was developed at the TALP Research Center of the Universitat Politecnica de Catalunya. The first method is a C/C++ tool based on the linear prediction (LPC) model. Classification and regression trees (CARTs) are used to split the acoustic space into several classes based on phonetic features. For each class, a linear regression is applied to transform the line spectral frequency (LSF) coefficients using Gaussian mixture models (GMMs). Then the appropriate residual is selected from the residuals found in the training data, based on the similarity between their associated LSFs and the transformed LSFs. Code is provided to perform all of these steps, and sample scripts are provided to help automate the whole process.
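The residual selection step can be pictured as a nearest-neighbour lookup. The following Python sketch is not the toolkit's C/C++ code: the function name `select_residual`, the data layout, and the use of Euclidean distance as the similarity measure are all assumptions made for illustration.

```python
import math


def select_residual(transformed_lsf, training_pairs):
    """Pick the training residual whose LSF vector is closest to transformed_lsf.

    training_pairs is a list of (lsf_vector, residual) tuples taken from the
    training data. Euclidean distance stands in for "similarity" here; the
    toolkit may use a different measure (assumption).
    """
    if not training_pairs:
        raise ValueError("training_pairs must not be empty")
    # min() scans every stored pair and keeps the one with the smallest distance.
    _, best_residual = min(
        training_pairs,
        key=lambda pair: math.dist(pair[0], transformed_lsf),
    )
    return best_residual


# Hypothetical usage: two stored frames, residuals represented by labels.
pairs = [([0.10, 0.20], "residual_a"), ([0.50, 0.60], "residual_b")]
chosen = select_residual([0.45, 0.62], pairs)
```

A linear scan is enough for a sketch. A large training set would probably call for an indexed search structure.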

The second method is a Matlab tool based on the harmonic/stochastic model (HSM). This model is used to analyze, modify and synthesize the speech signals. The voice conversion method is based on Gaussian mixture models (GMMs), which can be trained from parallel or non-parallel corpora. The non-parallel training procedure is suitable for cross-lingual applications because it handles only acoustic parameters. The harmonic component of the signals is converted using the trained transformation function, and the stochastic component is predicted from the converted harmonic component. Unvoiced frames are not modified. The pitch is also adapted to the target speaker by means of a linear transformation of log-f0 based on the means and variances of the source and target speakers.
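The pitch adaptation can be illustrated with the usual mean and variance normalisation of log-f0. The Python sketch below is not the toolkit's Matlab code: the function names, the convention that f0 = 0 marks an unvoiced frame, and the exact form of the mapping are assumptions.

```python
import math
import statistics


def log_f0_stats(f0_values):
    """Return (mean, standard deviation) of log-f0 over voiced frames (f0 > 0)."""
    logs = [math.log(f) for f in f0_values if f > 0]
    return statistics.fmean(logs), statistics.pstdev(logs)


def convert_f0(f0_source, src_stats, tgt_stats):
    """Map source f0 values so their log-f0 mean and variance match the target.

    Assumed mapping: log f0' = mu_t + (sigma_t / sigma_s) * (log f0 - mu_s).
    Frames with f0 <= 0 are treated as unvoiced and returned as 0.0.
    """
    mu_s, sigma_s = src_stats
    mu_t, sigma_t = tgt_stats
    if sigma_s == 0:
        raise ValueError("source log-f0 standard deviation must be non-zero")
    converted = []
    for f in f0_source:
        if f <= 0:
            converted.append(0.0)  # unvoiced frame: nothing to convert
        else:
            log_f = mu_t + (sigma_t / sigma_s) * (math.log(f) - mu_s)
            converted.append(math.exp(log_f))
    return converted


# Hypothetical usage: statistics would normally come from log_f0_stats()
# applied to each speaker's training data.
source_stats = (math.log(100.0), 0.1)
target_stats = (math.log(200.0), 0.2)
result = convert_f0([100.0, 0.0], source_stats, target_stats)
```

Working in the log domain keeps the converted f0 positive and makes the scaling behave like a ratio of pitch intervals.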

The UPC Voice Conversion Toolkit can be downloaded from the project homepage.