First, we need to install Kaldi into the working directory.
    git clone https://github.com/kaldi-asr/kaldi
We will use VoxCeleb1 for the demo experiment.
The dataset has to be downloaded from its website.
Several files need to be prepared, and they have to follow the Kaldi data directory format:
    wav.scp : utt-id utt-path
We use the make_voxceleb1_v2.pl script for this, starting from the recipe's prepare directory:
    cd asv-subtools/recipe/voxceleb/prepare
If we would rather prepare those files directly with shell commands, we can do:
    mkdir -p data/voxceleb1_train
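The rest of the shell-based preparation might look like the sketch below. It is not the original script: it assumes the dataset is unpacked as `<root>/wav/<speaker>/<video>/<segment>.wav`, that paths contain no spaces, and that utterance IDs take the form `<speaker>-<video>-<segment>`. The function name `prepare_voxceleb_dir` is made up for illustration.

```shell
# Sketch only: build Kaldi-style wav.scp, utt2spk and spk2utt.
# Assumed layout: <root>/wav/<speaker>/<video>/<segment>.wav (no spaces in paths).
# prepare_voxceleb_dir is a hypothetical helper, not part of Kaldi or ASV-Subtools.
prepare_voxceleb_dir() {
  root="$1"
  out="$2"
  mkdir -p "$out"

  # One line per utterance: "<utt-id> <speaker-id> <path>", sorted in the C locale.
  find "$root/wav" -type f -name '*.wav' |
    awk -F/ '{ seg = $NF; sub(/\.wav$/, "", seg)
               print $(NF-2) "-" $(NF-1) "-" seg, $(NF-2), $0 }' |
    LC_ALL=C sort -k1,1 > "$out/.utts.tmp"

  # wav.scp: utt-id utt-path
  awk '{ print $1, $3 }' "$out/.utts.tmp" > "$out/wav.scp"
  # utt2spk: utt-id speaker-id
  awk '{ print $1, $2 }' "$out/.utts.tmp" > "$out/utt2spk"
  # spk2utt: speaker-id followed by all of that speaker's utt-ids
  awk '{ utts[$2] = utts[$2] " " $1 }
       END { for (s in utts) print s utts[s] }' "$out/utt2spk" |
    LC_ALL=C sort -k1,1 > "$out/spk2utt"

  rm -f "$out/.utts.tmp"
}

# Example call (the dataset path is a placeholder):
# prepare_voxceleb_dir /path/to/voxceleb1 data/voxceleb1_train
```

Prefixing each utterance ID with the speaker ID keeps the utterance and speaker orderings consistent, which Kaldi's sorted data files rely on. The last `awk` step does the same job as Kaldi's `utils/utt2spk_to_spk2utt.pl`.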
Now we can filter the files with fix_data_dir.sh, which sorts them and drops utterances that are missing from any of them:
    subtools/kaldi/utils/fix_data_dir.sh
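To see what the filtering would remove, a small check like the one below can be run first. It is a sketch with a made-up helper name, `list_unmatched_utts`, and it only compares `wav.scp` against `utt2spk`. The commented call at the end assumes the data directory created earlier is passed as the single argument, which is how Kaldi's `fix_data_dir.sh` is normally invoked.

```shell
# Sketch only: list utterance IDs found in just one of wav.scp and utt2spk.
# list_unmatched_utts is a hypothetical helper, not part of Kaldi or ASV-Subtools.
list_unmatched_utts() {
  dir="$1"
  awk 'FILENAME == ARGV[1] { in_wav[$1] = 1; next }
       { in_u2s[$1] = 1 }
       END {
         for (u in in_wav) if (!(u in in_u2s)) print u
         for (u in in_u2s) if (!(u in in_wav)) print u
       }' "$dir/wav.scp" "$dir/utt2spk" | LC_ALL=C sort
}

# Example usage, assuming the data directory from the previous step:
# list_unmatched_utts data/voxceleb1_train
# subtools/kaldi/utils/fix_data_dir.sh data/voxceleb1_train
```

An empty result means the two files already agree on the set of utterances, so the fix step should only need to sort them.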