speech_recognition
Public Member Functions
def __init__(self, csv_path, srs_path, dataset)
def csv_check(self)
def spk2gender(self)
def spk2utt(self)
def text(self)
def utt2spk(self)
def wavscp(self)
Public Attributes
csv_data_root
csv_delimiter
csv_path
flag
index
srs_path
srs_path_data
srs_path_data_dataset
total_files
Class DataPreparation takes 3 input arguments.
Args:
    csv_path (str): Absolute path of the CSV metadata file
    srs_path (str): Absolute path of the root directory of the Speech Recognition System
    dataset (str): Type of dataset
Definition at line 48 of file prepare.py.
def prepare.DataPreparation.__init__(self, csv_path, srs_path, dataset)
DataPreparation class constructor
Definition at line 57 of file prepare.py.
def prepare.DataPreparation.csv_check(self)
csv_check performs the preliminary checks before the required data files are constructed. The following flow has been established:
1. Read the CSV file to find the delimiter
2. Read the header of the CSV file
3. Check whether the header contains the following fields and store their indices:
   a. SPEAKER_ID
   b. WAV_PATH (relative to the CSV file)
   c. transcription
   d. GENDER
   e. UTTERANCE_ID
4. Read every row of the CSV file and check that all the wav paths exist
5. Check whether the number of wav files equals the number of transcriptions
Definition at line 108 of file prepare.py.
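The csv_check flow described above could be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the code in prepare.py: the names `check_csv`, `guess_delimiter` and `REQUIRED_FIELDS` are hypothetical, the delimiter heuristic (take the candidate character that occurs most often in the header line) is an assumption because the documentation does not say how the delimiter is found, and counting non-empty transcription cells for step 5 is likewise an assumption.

```python
import csv
import os

# Header fields listed in the csv_check description.
REQUIRED_FIELDS = ["SPEAKER_ID", "WAV_PATH", "transcription", "GENDER", "UTTERANCE_ID"]


def guess_delimiter(header_line, candidates=(",", ";", "\t", "|")):
    """Hypothetical heuristic: the candidate seen most often in the header."""
    return max(candidates, key=header_line.count)


def check_csv(csv_path):
    """Return (delimiter, field indices, number of rows) or raise on a failed check."""
    # WAV_PATH entries are relative to the directory holding the CSV file.
    csv_data_root = os.path.dirname(os.path.abspath(csv_path))
    with open(csv_path, newline="") as handle:
        # Step 1: find the delimiter from the first line, then rewind.
        delimiter = guess_delimiter(handle.readline())
        handle.seek(0)
        reader = csv.reader(handle, delimiter=delimiter)

        # Steps 2 and 3: read the header and store the index of each field.
        header = next(reader)
        missing = [name for name in REQUIRED_FIELDS if name not in header]
        if missing:
            raise ValueError("missing header fields: " + ", ".join(missing))
        index = {name: header.index(name) for name in REQUIRED_FIELDS}

        # Step 4: every wav path must exist on disk.
        wav_count = 0
        transcription_count = 0
        for row in reader:
            wav_path = os.path.join(csv_data_root, row[index["WAV_PATH"]])
            if not os.path.isfile(wav_path):
                raise FileNotFoundError(wav_path)
            wav_count += 1
            if row[index["transcription"]].strip():
                transcription_count += 1

    # Step 5: the two counts must agree.
    if wav_count != transcription_count:
        raise ValueError("number of wav files and transcriptions differ")
    return delimiter, index, wav_count
```

Raising an exception on the first failed check is a design choice made for the sketch; the documented class also exposes a `flag` attribute, whose role is not described here.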
def prepare.DataPreparation.spk2gender(self)
spk2gender prepares the file 'spk2gender' in the DATASET directory. The following flow has been established:
1. Read the CSV file
2. Read the first row (first speaker) and extract the gender details
3. Search for a new speaker and extract its gender details
4. Write an output file where each line has the structure: <SPEAKER_ID><Tab_space><GENDER>
Definition at line 246 of file prepare.py.
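As a minimal sketch of the spk2gender flow described above, assuming the CSV has already passed csv_check: the gender is recorded the first time each speaker appears, and one tab-separated line is produced per speaker. The function name `spk2gender_lines` is hypothetical and the sketch returns the lines instead of writing them to the DATASET directory.

```python
import csv
import io


def spk2gender_lines(csv_file, delimiter=","):
    """Return one '<SPEAKER_ID>\\t<GENDER>' line per speaker, in order of first appearance."""
    reader = csv.DictReader(csv_file, delimiter=delimiter)
    genders = {}
    for row in reader:
        # Only the first row seen for a speaker sets that speaker's gender.
        genders.setdefault(row["SPEAKER_ID"], row["GENDER"])
    return [f"{speaker_id}\t{gender}" for speaker_id, gender in genders.items()]


# Illustrative data only; not taken from any real dataset.
sample = io.StringIO(
    "SPEAKER_ID,GENDER,UTTERANCE_ID\n"
    "S1,m,1\n"
    "S1,m,2\n"
    "S2,f,1\n"
)
lines = spk2gender_lines(sample)
```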
def prepare.DataPreparation.spk2utt(self)
spk2utt prepares the file 'spk2utt' in the DATASET directory. The following flow has been established:
1. Read 'utt2spk' from the DATASET directory (if missing, create it)
2. From each row extract:
   a. FILE_ID
   b. SPEAKER_ID
3. Write an output file where each line has the structure: <SPEAKER_ID> <FILE_ID_1> <FILE_ID_2> ... <FILE_ID_END>
Definition at line 369 of file prepare.py.
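The grouping step of spk2utt can be illustrated with a short sketch. It assumes the 'utt2spk' lines are already available as strings of the form `<FILE_ID><Tab_space><SPEAKER_ID>`; the function name `spk2utt_lines` is hypothetical, and creating a missing 'utt2spk' file is left out.

```python
def spk2utt_lines(utt2spk_lines):
    """Invert utt2spk lines into '<SPEAKER_ID> <FILE_ID_1> <FILE_ID_2> ...' lines."""
    by_speaker = {}
    for line in utt2spk_lines:
        # Each input line is '<FILE_ID>\t<SPEAKER_ID>'.
        file_id, speaker_id = line.rstrip("\n").split("\t")
        by_speaker.setdefault(speaker_id, []).append(file_id)
    # Speakers keep their order of first appearance; file ids keep input order.
    return [" ".join([speaker_id] + file_ids) for speaker_id, file_ids in by_speaker.items()]


# Illustrative input only.
grouped = spk2utt_lines(["S1U1\tS1\n", "S1U2\tS1\n", "S2U1\tS2\n"])
```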
def prepare.DataPreparation.text(self)
text prepares the file 'text' in the DATASET directory. The following flow has been established:
1. Read the CSV file
2. From each row extract:
   a. SPEAKER_ID
   b. UTTERANCE_ID
   c. TRANSCRIPTION
3. Make FILE_ID "<SPEAKER_ID>U<UTTERANCE_ID>"
4. Write an output file where each line has the structure: <FILE_ID><Tab_space><TRANSCRIPTION>
Definition at line 209 of file prepare.py.
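A rough sketch of the text flow described above, writing to any file-like object. The helper names `make_file_id` and `write_text` are hypothetical, and the transcription column is assumed to be named `transcription`, as in the header list given for csv_check.

```python
import csv
import io


def make_file_id(speaker_id, utterance_id):
    """Build the FILE_ID '<SPEAKER_ID>U<UTTERANCE_ID>'."""
    return f"{speaker_id}U{utterance_id}"


def write_text(csv_file, out_file, delimiter=","):
    """Write one '<FILE_ID>\\t<TRANSCRIPTION>' line per CSV row to out_file."""
    reader = csv.DictReader(csv_file, delimiter=delimiter)
    for row in reader:
        file_id = make_file_id(row["SPEAKER_ID"], row["UTTERANCE_ID"])
        out_file.write(f"{file_id}\t{row['transcription']}\n")


# Illustrative data only.
source = io.StringIO(
    "SPEAKER_ID,UTTERANCE_ID,transcription\n"
    "S1,1,hello world\n"
    "S2,4,good morning\n"
)
target = io.StringIO()
write_text(source, target)
```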
def prepare.DataPreparation.utt2spk(self)
utt2spk prepares the file 'utt2spk' in the DATASET directory. The following flow has been established:
1. Read the CSV file
2. From each row extract:
   a. SPEAKER_ID
   b. UTTERANCE_ID
3. Make FILE_ID "<SPEAKER_ID>U<UTTERANCE_ID>"
4. Write an output file where each line has the structure: <FILE_ID><Tab_space><SPEAKER_ID>
Definition at line 334 of file prepare.py.
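The utt2spk mapping can be sketched on rows that have already been parsed into dictionaries. The function name `utt2spk_lines` and the row representation are assumptions made for illustration; only the FILE_ID pattern and the line structure come from the description above.

```python
def utt2spk_lines(rows):
    """Return '<FILE_ID>\\t<SPEAKER_ID>' lines for rows holding SPEAKER_ID and UTTERANCE_ID."""
    lines = []
    for row in rows:
        speaker_id = row["SPEAKER_ID"]
        # FILE_ID follows the documented pattern '<SPEAKER_ID>U<UTTERANCE_ID>'.
        file_id = f"{speaker_id}U{row['UTTERANCE_ID']}"
        lines.append(f"{file_id}\t{speaker_id}")
    return lines


# Illustrative rows only.
rows = [
    {"SPEAKER_ID": "S1", "UTTERANCE_ID": "1"},
    {"SPEAKER_ID": "S1", "UTTERANCE_ID": "2"},
    {"SPEAKER_ID": "S2", "UTTERANCE_ID": "1"},
]
mapping = utt2spk_lines(rows)
```

Because every FILE_ID starts with its SPEAKER_ID, the resulting lines group naturally by speaker whenever they are sorted.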
def prepare.DataPreparation.wavscp(self)
wavscp prepares the file 'wav.scp' in the DATASET directory. The following flow has been established:
1. Read the CSV file
2. From each row extract:
   a. SPEAKER_ID
   b. UTTERANCE_ID
   c. WAV_PATH (relative to the CSV file)
3. Make FILE_ID "<SPEAKER_ID>U<UTTERANCE_ID>"
4. Make FILE_PATH "<CSV_DATA_ROOT>/<WAV_PATH>"
5. Write an output file where each line has the structure: <FILE_ID><Tab_space><FILE_PATH>
Definition at line 295 of file prepare.py.
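For wavscp, the step worth illustrating is how FILE_PATH is composed from the CSV data root and the relative WAV_PATH. The sketch below assumes forward-slash paths and uses a hypothetical function name, `wavscp_line`; stripping a trailing slash from the root is an added safeguard, not something the documentation states.

```python
def wavscp_line(csv_data_root, row):
    """Return the '<FILE_ID>\\t<FILE_PATH>' line for one CSV row."""
    file_id = f"{row['SPEAKER_ID']}U{row['UTTERANCE_ID']}"
    # FILE_PATH is '<CSV_DATA_ROOT>/<WAV_PATH>'; avoid a doubled slash.
    file_path = f"{csv_data_root.rstrip('/')}/{row['WAV_PATH']}"
    return f"{file_id}\t{file_path}"


# Illustrative values only; the directory is made up.
example = wavscp_line(
    "/data/corpus",
    {"SPEAKER_ID": "S1", "UTTERANCE_ID": "7", "WAV_PATH": "wav/a.wav"},
)
```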
prepare.DataPreparation.csv_data_root
Definition at line 68 of file prepare.py.
prepare.DataPreparation.csv_delimiter
Definition at line 128 of file prepare.py.
prepare.DataPreparation.csv_path
Definition at line 67 of file prepare.py.
prepare.DataPreparation.flag
Definition at line 70 of file prepare.py.
prepare.DataPreparation.index
Definition at line 71 of file prepare.py.
prepare.DataPreparation.srs_path
Definition at line 86 of file prepare.py.
prepare.DataPreparation.srs_path_data
Definition at line 87 of file prepare.py.
prepare.DataPreparation.srs_path_data_dataset
Definition at line 88 of file prepare.py.
prepare.DataPreparation.total_files
Definition at line 177 of file prepare.py.