In Weeks 9 and 10, I continued to work on the speech recognition system while working in parallel on the dialect identification task.
For the speech recognition task, I finished training the Time Delay Neural Network (TDNN) locally and left the training process running on Case HPC. The TDNN uses LDA-based i-vectors and MFCCs.
The next step will be to adapt the model to the development data in dialectal Arabic.
For the dialect identification task, I implemented the Siamese neural network and the dialect enrollment algorithm. The authors reported that an interpolation parameter value of 0.83 achieved the best accuracy. For training, the authors used random sampling to form the training examples, where each example consists of the i-vector of an utterance, the dialect model for a certain dialect, and a label indicating whether the two are a match or a mismatch.
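To make the random sampling step concrete, here is a minimal sketch of how such training examples could be formed. It is not the authors' code: the function name `sample_training_examples`, the dialect codes, and the data layout (i-vectors as plain lists of floats, dialect models as a dictionary keyed by dialect name) are all assumptions made for illustration, and the dialect models are taken as given rather than produced by the enrollment algorithm.

```python
import random

def sample_training_examples(ivectors, labels, dialect_models, n_examples, seed=0):
    """Randomly pair utterance i-vectors with dialect models.

    ivectors:       list of i-vectors, one per utterance (hypothetical layout)
    labels:         list of dialect names, aligned with ivectors
    dialect_models: dict mapping dialect name -> model vector (assumed precomputed)
    Returns a list of (ivector, dialect_model, target) tuples, where target is
    1 for a match and 0 for a mismatch.
    """
    rng = random.Random(seed)
    dialects = sorted(dialect_models)
    examples = []
    for _ in range(n_examples):
        i = rng.randrange(len(ivectors))   # pick a random utterance
        d = rng.choice(dialects)           # pick a random dialect model
        target = 1 if labels[i] == d else 0
        examples.append((ivectors[i], dialect_models[d], target))
    return examples

# Toy usage with made-up 3-dimensional vectors and two placeholder dialect codes.
toy_ivectors = [[0.1, 0.2, 0.3], [0.9, 0.8, 0.7], [0.2, 0.1, 0.4], [0.8, 0.9, 0.6]]
toy_labels = ["EGY", "GLF", "EGY", "GLF"]
toy_models = {"EGY": [0.15, 0.15, 0.35], "GLF": [0.85, 0.85, 0.65]}
toy_examples = sample_training_examples(toy_ivectors, toy_labels, toy_models, n_examples=10)
```

Sampling the dialect uniformly, as done here, yields more mismatches than matches once there are more than two dialects; whether the authors balance the two classes is not stated in this report.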
The next step will be to implement length normalization and recursive whitening.