
PyTorch environment setup

Install Miniconda (see the official installation guide), then create and activate the environment:

$ bash Miniconda3-latest-Linux-x86_64.sh
$ conda env create -f environment.yml
$ source activate tdnn

PyTorch scripts for training TDNN and TDNN-F neural networks.

Data you need (recipe: mini_librispeech)

Run a Kaldi example script, but only the data-preparation stages (skip training):

$ cd egs/mini_librispeech/s5/
$ ./run.sh 

The files you need are feats.scp, alis.scp, and valid_uttlist. After running the Kaldi script, you can find (or create) them as follows:

  • data/train_clean_5_sp_hires/feats.scp
  • exp/nnet3/<example>/egs/valid_uttlist - valid_uttlist is written into the egs directory after running the training script, or you can create your own
  • alis.scp - create it with the following command:

    $ ali-to-pdf exp/tri3b_ali_train_clean_5_sp/final.mdl 'ark: gunzip -c exp/tri3b_ali_train_clean_5_sp/ali.*.gz |' ark,scp:alis.ark,alis.scp
    

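feats.scp and alis.scp are plain Kaldi script files: each line maps an utterance ID to a data location. If you want to inspect them from Python, a minimal parser is enough (read_scp is a hypothetical helper written for illustration, not part of this repo; the example paths are made up):

```python
def read_scp(lines):
    """Parse Kaldi .scp lines into {utterance_id: location}."""
    table = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # each line is "<utt-id> <location>"; location may contain spaces
        utt, loc = line.split(maxsplit=1)
        table[utt] = loc
    return table

# works on an open file object or any iterable of lines
example = [
    "utt1 exp/ali/alis.ark:14",
    "utt2 exp/ali/alis.ark:1407",
]
print(read_scp(example)["utt2"])  # exp/ali/alis.ark:1407
```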
What to configure?

To change the model, edit the model.py script manually. You also need to set the correct number of outputs in the last layer; that number is reported by this command:

$ tree-info exp/tri3b_ali_train_clean_5_sp/tree | grep num-pdfs
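If you prefer to pick up num-pdfs programmatically rather than copying it by hand, a small parser over the captured tree-info output is enough (parse_num_pdfs is a hypothetical helper and the value 2536 is only an illustrative example):

```python
def parse_num_pdfs(tree_info_output: str) -> int:
    """Extract the num-pdfs value from captured `tree-info` stdout."""
    for line in tree_info_output.splitlines():
        if line.startswith("num-pdfs"):
            # line looks like "num-pdfs <integer>"
            return int(line.split()[1])
    raise ValueError("num-pdfs not found in tree-info output")

# hypothetical tree-info output
example = "num-pdfs 2536\ncontext-width 1\ncentral-position 0"
print(parse_num_pdfs(example))  # 2536
```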

After that, you may want to set some arguments in args.py or pass them on the command line to the train.py script. Also make sure the paths to the files mentioned earlier are set correctly.

Important: context_size in args.py depends on the number of TDNN(F) layers in model.py.
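The reason for that dependency: each TDNN(F) layer splices a few frames around the current one, so the total input context grows with the number of stacked layers. A sketch of the arithmetic, assuming every layer uses symmetric (-1, 0, 1) splicing (total_context is an illustrative helper, not a function from args.py):

```python
def total_context(layer_offsets):
    """Total (left, right) input context of stacked TDNN layers.

    Each layer's splicing offsets add up across the stack:
    left  = sum of -min(offsets) over layers,
    right = sum of  max(offsets) over layers.
    """
    left = sum(-min(offsets) for offsets in layer_offsets)
    right = sum(max(offsets) for offsets in layer_offsets)
    return left, right

# 5 layers, each splicing frames (-1, 0, 1) relative to the current frame
print(total_context([(-1, 0, 1)] * 5))  # (5, 5)
```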

How to run?

Training run:

$ mkdir model
$ python train.py -n tdnn

# generate probabilities for decoding; set the path to the dev set first
$ python generate_probs.py -n tdnn

For more information about decoding, look into the decode/ folder.

Or look at the scripts for MetaCentrum in the METACENTRUM_QUEUE/ folder.

Results (updating...):

The PyTorch architecture closely follows Kaldi's. Both trainings were done without iVectors and without skip connections. Kaldi training is much faster than the PyTorch implementation (roughly 10x).

| TDNN | WER |
|---|---|
| Kaldi (5 layers) | 19.12% |
| PyTorch (5 layers) | 19.81% |

| TDNN-F | WER |
|---|---|
| Kaldi (5 layers, with ortho compute) | 18.9% |
| PyTorch (5 layers, with ortho compute) | 18.66% |
| PyTorch (5 layers, without ortho compute) | 18.72% |

| TDNN-F | WER |
|---|---|
| PyTorch (5 layers) | 18.66% |
| PyTorch (7 layers) | 19.06% |

| TDNN-F | WER |
|---|---|
| PyTorch (7 layers) | 19.06% |
| PyTorch (7 layers, skip-connections small-to-large) | 19.43% |