Personal Tech Blog

02 Apr 2018

Introduction

This post records, for my own reference, the tools to install for various text-to-speech tasks.

Merlin

Merlin is a neural network based speech synthesis system developed at the Centre for Speech Technology Research, University of Edinburgh. It is a toolkit for building Deep Neural Network models for statistical parametric speech synthesis. It must be used in combination with a front-end text processor (e.g., Festival) and a vocoder (e.g., STRAIGHT or WORLD).

Merlin should be installed on Linux. The installation steps are shown below:

  1. Installing Dependencies:
    • Install csh, realpath, autotools-dev, and automake with the command: "sudo apt-get install csh realpath autotools-dev automake"
    • Python 2 and the Python libraries numpy, scipy, matplotlib, lxml, theano, and bandmat are required. Anaconda can manage Python and all of these packages; the details can be found here and in the official docs. (Note: set up a conda environment and install the dependencies into that environment.) bandmat may not be found in conda's default repositories, so install it with pip inside the activated environment: "source activate myenv", then "pip install bandmat".
  2. Run "source activate myenv" and switch the working directory to merlin.
  3. Run "./tools/compile_tools.sh" to compile the tools.
  4. Detailed installation instructions can be found here and in the official Merlin docs.
  5. Common Issues:
    • When running the Merlin demo in our Python environment, Theano reports that to use MKL 2018 with Theano you MUST set "MKL_THREADING_LAYER=GNU" in your environment.

      Solution: run "source deactivate", then "conda install --name myenv mkl=2017". This downgrades the mkl, mkl-service, numpy, and scipy packages in the environment. Then run "source activate myenv" and "pip install numpy --upgrade" to reinstall numpy.
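
For the MKL issue above, the error message itself names a lighter option than downgrading: set the MKL_THREADING_LAYER variable before Theano starts. The lines below are a minimal sketch of that, assuming a bash-like shell. Whether this alone clears the error on a given setup is an assumption, not something confirmed in this post.

```shell
# Set the variable named in the Theano/MKL error message for this shell session.
export MKL_THREADING_LAYER=GNU

# Check that child processes (for example, the Python interpreter that
# imports Theano) inherit the value.
sh -c 'echo "MKL_THREADING_LAYER=$MKL_THREADING_LAYER"'
```

The export only lasts for the current shell session, so it has to be repeated in each new terminal (or added to a shell startup file) before running the demo.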