
As stated in the title, I am currently researching the use of k-NN, either as an alternative or as a supplement to the DTW algorithm, for keyword spotting based on MFCCs. I have read through various answers on this forum (mainly Speech recognition using MFCC and DTW(Dynamic Time Warping)?) and I would like to know if anyone could point me to further research or implementations which employ k-NN.

    If you find any of the answers useful, please mark it as accepted, so that question is answered and can help other people. – jojeck Jun 09 '17 at 14:46

2 Answers


If you want to use k-NN in conjunction with DTW, it's quite easy. Assuming that you have a speech corpus of keywords, you should do the following at the training stage:

  • Record more than one utterance for each keyword.
  • Extract the MFCCs for all the recordings and store them - these are effectively your "model data" (training samples); see the sketch after this list.
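A minimal sketch of this training stage in Python, assuming the librosa library for MFCC extraction; the file names, keyword labels, and the training_data layout are hypothetical and only illustrate the idea of storing one MFCC matrix per utterance:

    # Training-stage sketch: extract and store one MFCC matrix per utterance.
    # NOTE: librosa and the file names below are assumptions for illustration.
    import librosa

    def extract_mfcc(path, n_mfcc=13):
        """Return the MFCC matrix (n_mfcc x n_frames) for one recording."""
        signal, sr = librosa.load(path, sr=None)  # keep the file's own sample rate
        return librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc)

    # The "model data": MFCC matrices grouped by keyword (class).
    training_data = {
        "yes": [extract_mfcc(p) for p in ("yes_01.wav", "yes_02.wav", "yes_03.wav")],
        "no":  [extract_mfcc(p) for p in ("no_01.wav", "no_02.wav", "no_03.wav")],
    }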

Classification consists of:

  • Extract the MFCC matrix for the test sample.
  • Calculate the DTW distance between the test sample and each and every MFCC matrix that was extracted at the training stage.
  • Store the final distance for each comparison. You should now have a vector of total DTW distances (of length equal to the number of training examples):

    • test vs. class1_sample1: 10.2
    • test vs. class1_sample2: 20.3
    • test vs. class1_sample3: 15.1
    • test vs. class2_sample1: 3.1
    • test vs. class2_sample2: 2.5
    • ...
  • For each of the classes (keywords), sort its distances from smallest to largest.
  • Pick the k smallest distances for each class and average them. These averages are the per-class scores.
  • Predict the keyword by choosing the class with the smallest average distance. A sketch of this procedure is shown after this list.
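A minimal sketch of this classification stage, continuing from the training sketch above. The plain dynamic-programming DTW and the helper names dtw_distance/classify are my own illustrative choices, not part of the original answer; an optimised DTW library could be substituted:

    # Classification-stage sketch: k-NN over DTW distances, as described above.
    # `training_data` is assumed to come from the training sketch.
    import numpy as np
    from scipy.spatial.distance import cdist

    def dtw_distance(mfcc_a, mfcc_b):
        """Total DTW alignment cost between two MFCC matrices (frames as columns)."""
        cost = cdist(mfcc_a.T, mfcc_b.T, metric="euclidean")  # frame-to-frame costs
        n, m = cost.shape
        acc = np.full((n + 1, m + 1), np.inf)
        acc[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                acc[i, j] = cost[i - 1, j - 1] + min(acc[i - 1, j],      # insertion
                                                     acc[i, j - 1],      # deletion
                                                     acc[i - 1, j - 1])  # match
        return acc[n, m]

    def classify(test_mfcc, training_data, k=3):
        """Average the k smallest DTW distances per class; predict the minimum."""
        scores = {}
        for keyword, samples in training_data.items():
            distances = sorted(dtw_distance(test_mfcc, m) for m in samples)
            scores[keyword] = float(np.mean(distances[:k]))
        return min(scores, key=scores.get)

    # Usage: predicted = classify(extract_mfcc("unknown.wav"), training_data, k=3)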
jojeck

You say: "I would like to know if anyone could direct me to somewhere I can find more research or implementations which employ the k-NN".

k-NN with DTW is used in many hundreds of papers. Many of them cite [a], so if you search for papers that cite [a], you will find plenty of research and implementations.

In addition, the most cited (and award-winning!) paper on k-NN with DTW in the last decade is [b], which has many examples (and nice videos).

[a] http://www.cs.ucr.edu/~eamonn/time_series_data/
[b] http://www.cs.ucr.edu/~eamonn/UCRsuite.html

eamonn