cat ...txt | tr '[:upper:]' '[:lower:]' | sort | uniq -c | sort -nr ### the "r" in "-nr" reverses the numeric sort, so the largest counts come first
Print how many times each letter occurs, ordered by frequency.
125 b
100 a
31 c
22 d
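As a small sketch of the pipeline above, the snippet below feeds it a made-up input with one letter per line (the letters are purely illustrative, not from a real file). tr lowercases everything, sort groups identical lines, uniq -c counts each group, and sort -nr puts the largest count first.

```shell
# Hypothetical sample input: one letter per line, in mixed case.
counts=$(printf '%s\n' b A a B b c \
    | tr '[:upper:]' '[:lower:]' \
    | sort \
    | uniq -c \
    | sort -nr)

# Each output line is "<count> <letter>", highest count first.
echo "$counts"
```

With this sample, b is counted 3 times, a twice, and c once.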
Using egrep to read a column:
There is a .lab speech file, which is labelled as follows.
Here the first column is the timing, the second is the frequency, and the third is the label:
0.1213 123 y
0.1232 111 uw
0.2113 110 eh
.............
To read everything in the third column, we use egrep:
egrep -h -o "[a-z]{1,2}$" *.lab
### match one or two lower-case letters; $ means they must occur at the end of the line (-o prints only the matched part, -h omits the file names)
Piping that output through sort | uniq -c | sort -nr, as in the first command, prints each phone's frequency in descending order:
121 y
120 uw
110 eh
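The whole flow can be sketched end to end on throwaway data. The two .lab files below are made up for illustration, and grep -E is used as the equivalent spelling of egrep, which newer GNU grep versions flag as obsolescent.

```shell
# Create two hypothetical .lab files in a temporary directory.
workdir=$(mktemp -d)
printf '%s\n' '0.1213 123 y' '0.1232 111 uw' '0.2113 110 eh' > "$workdir/a.lab"
printf '%s\n' '0.1300 120 y' '0.1410 115 uw' '0.2200 118 y' > "$workdir/b.lab"

# Extract the trailing label of every line, then count and rank the labels.
phone_counts=$(grep -E -h -o '[a-z]{1,2}$' "$workdir"/*.lab \
    | sort \
    | uniq -c \
    | sort -nr)
echo "$phone_counts"

# Clean up the temporary files.
rm -r "$workdir"
```

Here y appears 3 times, uw twice, and eh once. If the labels can be longer than two letters, awk '{print $3}' is another way to pick out the third column.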
ls | wc -l
Count how many entries (files and subdirectories) are in the current directory.
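A quick sketch of the difference between counting everything and counting only regular files, run in a temporary directory with made-up file names. The find variant assumes a find that supports -maxdepth (GNU and BSD find both do).

```shell
# Build a throwaway directory with three files and one subdirectory.
workdir=$(mktemp -d)
touch "$workdir/a.txt" "$workdir/b.txt" "$workdir/c.wav"
mkdir "$workdir/sub"

# ls | wc -l counts every visible entry, including the subdirectory.
entries=$(ls "$workdir" | wc -l)

# find -type f counts regular files only (top level, no recursion).
files_only=$(find "$workdir" -maxdepth 1 -type f | wc -l)

echo "entries: $entries, regular files: $files_only"
rm -r "$workdir"
```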
rm -rf ./
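Meant to delete the current directory. With -rf the removal is recursive and no warning or confirmation will occur. Note that rm refuses to remove . and .. themselves, so to actually delete a directory, give its path instead.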
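Because rm -rf never asks, it can help to try it in a scratch directory first. The sketch below uses made-up directory names and the ${var:?} expansion, which aborts the command if the variable is unset or empty, so a typo cannot turn the path into something unintended.

```shell
# Build a throwaway directory tree to delete.
scratch=$(mktemp -d)
mkdir -p "$scratch/old_run/logs"
touch "$scratch/old_run/logs/a.log"

# ${target:?...} stops with an error if target is empty or unset.
target="$scratch/old_run"
rm -rf -- "${target:?target is empty}"

# Confirm the directory is gone.
if [ -e "$target" ]; then status="still there"; else status="deleted"; fi
echo "$status"

rmdir "$scratch"
```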
cat ./.../*.txt
Print the contents of all the .txt files in that directory.
cat ./.../*.txt > ./text
Write the contents of all the .txt files into the file ./text.
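A minimal sketch of the redirection, using two made-up .txt files in a temporary directory. The output file has no .txt extension, so the *.txt glob does not pick it up as one of its own inputs. Keep in mind that > overwrites the target file, while >> appends to it.

```shell
# Two hypothetical input files.
workdir=$(mktemp -d)
printf 'first\n' > "$workdir/a.txt"
printf 'second\n' > "$workdir/b.txt"

# Concatenate them (in glob order) into a single file named "text".
cat "$workdir"/*.txt > "$workdir/text"

merged=$(cat "$workdir/text")
echo "$merged"
rm -r "$workdir"
```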
python3 ./.../..py > ./text
Write the output of the .py script to the file ./text.
file .wav : Identify the type of the .wav file (for audio, typically the container and encoding details rather than the size)
Use mv to change the file name:
mv ./../../.py ./../../.py
We can use mv (move) to change a file's name.
which ...
Check where … is, i.e. print the path of the … executable.
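As a small illustration, the snippet below looks up sh, which should exist on any Unix-like system. It uses command -v, the lookup built into POSIX shells, as a stand-in for which, since which is an external program that is not installed everywhere.

```shell
# Look up the full path of the sh executable.
sh_path=$(command -v sh)
echo "sh is at: $sh_path"
```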
ll -lh
Check all the files' sizes (ll is a common alias for ls -l and is not defined on every system; ls -lh is the unaliased form).
If there is a space at the beginning of a file name, we can simply delete it:
sed 's|^ ||'
Add a "_" in the middle of a file name, e.g. turn SPKID 09912 into SPKID_09912; g means globally, i.e. replace every match on the line rather than only the first:
sed 's| |_|g'
Or, replacing only the space that follows SPKID:
sed 's|SPKID |SPKID_|'
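sed only transforms text, so to rename actual files the new name has to be passed to mv. The loop below is one possible sketch, run on made-up file names in a temporary directory: it strips a leading space and turns the remaining spaces into underscores.

```shell
# Hypothetical files whose names start with a space and contain a space.
workdir=$(mktemp -d)
touch "$workdir/ SPKID 09912.wav" "$workdir/ SPKID 09913.wav"

for path in "$workdir"/*.wav; do
    old=$(basename "$path")
    # First drop the leading space, then replace every remaining space.
    new=$(printf '%s\n' "$old" | sed 's|^ ||; s| |_|g')
    # Rename only when the name actually changes.
    [ "$old" = "$new" ] || mv -- "$path" "$workdir/$new"
done

renamed=$(ls "$workdir")
echo "$renamed"
rm -r "$workdir"
```

After the loop the directory holds SPKID_09912.wav and SPKID_09913.wav.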
Join two files line by line:
paste -d ' ' wav.scp wav_id > tmp.txt
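To see what paste does, here is a sketch with two made-up two-line files (their contents are purely illustrative). Line N of the first file is joined to line N of the second, separated by the single space given with -d.

```shell
# Hypothetical inputs: one path per line and one ID per line.
workdir=$(mktemp -d)
printf '%s\n' '/audio/a.wav' '/audio/b.wav' > "$workdir/wav.scp"
printf '%s\n' 'id001' 'id002' > "$workdir/wav_id"

# Glue the files together column-wise, using a space as the delimiter.
paste -d ' ' "$workdir/wav.scp" "$workdir/wav_id" > "$workdir/tmp.txt"

joined=$(cat "$workdir/tmp.txt")
echo "$joined"
rm -r "$workdir"
```

The first output line is "/audio/a.wav id001". The two files should have the same number of lines, otherwise the shorter one contributes empty fields.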
Drop the lines that contain a particular string with grep -v:
pip freeze | grep -v "@ the things you want to remove" > requirements.txt
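grep -v removes whole lines that match, not just the matching words. The sketch below imitates this on a made-up stand-in for pip freeze output (the package names and versions are illustrative only) and drops the entries that point to a local path via " @ ".

```shell
# Hypothetical stand-in for the output of pip freeze.
freeze_output=$(printf '%s\n' \
    'numpy==1.26.4' \
    'mypkg @ file:///tmp/build/mypkg' \
    'scipy==1.11.4')

# Keep only the lines that do NOT contain " @ ".
kept=$(printf '%s\n' "$freeze_output" | grep -v ' @ ')
echo "$kept"
```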
To prepare such files in a more scripted way, we can do:
mkdir -p data/voxceleb1_train
# get all the .wav file paths, e.g. /data/voxceleb1/dev/id1231/...wav
find /data/voxceleb1/dev -name "*.wav" > data/voxceleb1_train/temp.lst
# generate the wav.scp, e.g. id1231 data/voxceleb1/dev/id1231/...wav
# 1st: use split to cut the whole line into the array "a" on "/"
# 2nd: cut a[8] on "." and save the pieces into "b"
# note: the indices 6, 7 and 8 depend on how deep the real paths are; adjust them to your layout
awk '{split($0, a, "/"); split(a[8], b, "."); print a[6]"-"a[7]"-"b[1], $1}' data/voxceleb1_train/temp.lst > data/voxceleb1_train/wav.scp
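To check the awk indices before running this on a full list, it helps to push a single path through the same program. The path below is hypothetical and was chosen so that the speaker ID, session folder, and file name land in fields 6, 7, and 8. Since the path starts with "/", a[1] is the empty string before the first slash.

```shell
# Hypothetical path: a[6]=id1231, a[7]=abc123, a[8]=00001.wav
sample_path='/export/data/voxceleb1/dev/id1231/abc123/00001.wav'

# Same awk program as above, applied to one line.
entry=$(printf '%s\n' "$sample_path" \
    | awk '{split($0, a, "/"); split(a[8], b, "."); print a[6]"-"a[7]"-"b[1], $1}')
echo "$entry"
```

This prints the utterance key id1231-abc123-00001 followed by the full path. If the key looks wrong, the real paths have a different depth and the indices need shifting.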