
I'm quite new to machine learning/data mining and I'm struggling to find the right approach for my problem. I would appreciate some guidance on, or criticism of, my proposed solution. In particular, is there a better or simpler algorithm for this problem?

The Problem

I have a number of features that describe a particular type (label) of wave (frame of audio) at a predetermined level 'v'. I want to be able to identify the level of an unknown wave and distinguish it from other types of waves that fall under the same higher level category.

Assumptions

  1. A group in a test set should be in increasing order of level v
  2. The type of wave in the group should be the same and known

Proposed Solution

Stage one: Level Selection

  1. For a given type of wave compute the features at each level for N number of samples

  2. For each level, calculate the mean/median of each feature over the N samples to create a feature vector for each level.

  3. Normalise the feature set by subtracting the empirical mean and dividing by the standard deviation.

  4. Take the Euclidean/Manhattan distance of an incoming vector with the feature set and choose the closest level.

  5. For a group with assigned levels, compare levels with neighbours and report negative differences (should be ascending) or large differences.
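
The five steps of stage one can be sketched roughly as follows. This is only a minimal illustration using the Python standard library, not a definitive implementation: the function names, the toy feature values (two made-up features per wave), and the `max_jump` threshold are all assumptions, and the scaling here divides by the per-feature standard deviation.

```python
import math
from statistics import mean, pstdev

def level_prototypes(samples_by_level):
    """Step 2: average the N feature vectors recorded at each level."""
    return {lvl: [mean(col) for col in zip(*vecs)]
            for lvl, vecs in samples_by_level.items()}

def fit_scaler(samples_by_level):
    """Step 3: per-feature mean and standard deviation over all training vectors."""
    all_vecs = [v for vecs in samples_by_level.values() for v in vecs]
    mu = [mean(col) for col in zip(*all_vecs)]
    sd = [pstdev(col) or 1.0 for col in zip(*all_vecs)]  # avoid dividing by zero
    return mu, sd

def scale(vec, mu, sd):
    return [(x - m) / s for x, m, s in zip(vec, mu, sd)]

def closest_level(vec, prototypes, mu, sd):
    """Step 4: level whose scaled prototype is nearest in Euclidean distance."""
    z = scale(vec, mu, sd)
    return min(prototypes,
               key=lambda lvl: math.dist(z, scale(prototypes[lvl], mu, sd)))

def order_violations(levels, max_jump=2):
    """Step 5: positions where the level drops or rises by more than max_jump."""
    return [i for i in range(1, len(levels))
            if levels[i] < levels[i - 1] or levels[i] - levels[i - 1] > max_jump]

# Hypothetical training data: level -> list of feature vectors (N = 2 per level).
training = {
    1: [[100.0, 900.0], [102.0, 950.0]],
    2: [[110.0, 1200.0], [112.0, 1250.0]],
    3: [[120.0, 1500.0], [122.0, 1550.0]],
}
prototypes = level_prototypes(training)
mu, sd = fit_scaler(training)

# A test group that is supposed to be in ascending order of level.
group = [[101.0, 930.0], [121.0, 1510.0], [111.0, 1230.0]]
assigned = [closest_level(v, prototypes, mu, sd) for v in group]
flagged = order_violations(assigned, max_jump=1)
print(assigned, flagged)
```

Swapping `math.dist` for a sum of absolute differences would give the Manhattan variant mentioned in step 4.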

Stage two: type selection

  1. Take the Euclidean/Manhattan distance between an incoming vector and the feature set for each type, either at a specific level or across all levels, and choose the closest type.
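
A rough sketch of this type-selection step, assuming the prototypes have already been normalised as in stage one. The type names ("snare", "tom"), the prototype values, and the function name `closest_type` are hypothetical and only serve the illustration.

```python
import math

def closest_type(vec, prototypes_by_type, level=None):
    """Return the (type, level) of the nearest prototype.

    prototypes_by_type maps type -> {level: feature vector}. If `level` is
    given, only prototypes at that level are compared; otherwise every level
    of every type is searched.
    """
    candidates = [
        (math.dist(vec, proto), wave_type, lvl)
        for wave_type, by_level in prototypes_by_type.items()
        for lvl, proto in by_level.items()
        if level is None or lvl == level
    ]
    _, wave_type, lvl = min(candidates)
    return wave_type, lvl

# Hypothetical, already-normalised prototypes for two wave types.
prototypes_by_type = {
    "snare": {1: [0.0, 0.0], 2: [1.0, 1.0]},
    "tom":   {1: [0.0, 3.0], 2: [1.0, 4.0]},
}

at_level_2 = closest_type([0.9, 3.8], prototypes_by_type, level=2)
any_level = closest_type([0.1, 0.2], prototypes_by_type)
print(at_level_2, any_level)
```

Searching across all levels also returns a level estimate, so the two stages could in principle be run in one pass.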

Extension of Problem

Features evolve over time as well as level

Proposed Solution

Repeat the stages of the above solution for each frame.

Thanks for any help

*Update: I cannot guarantee that the levels v are equivalent across the data; I can only guarantee that the order is increasing. For example, sample A may have 5 levels v = 1,...,5 that correspond to {1,...,5}, while sample B has 10 levels v = 1,2,...,10 that correspond to {.5,1,1.5,...,10}. How can I capture this without knowing the relationship between levels, and how can I identify samples that do not follow this pattern? Please let me know if this is not clear.
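
One way to illustrate the point of this update: a rank-based measure depends only on the order of the levels, not on their unknown spacing. The sketch below is an assumption on my part rather than part of the proposed solution. It computes a Spearman rank correlation between a single feature and the within-group index, and lists positions where the feature drops. The feature values are invented.

```python
def ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def pearson(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def spearman(a, b):
    """Pearson correlation of the ranks: unchanged by any increasing rescaling."""
    return pearson(ranks(a), ranks(b))

def drops(feature):
    """Positions where a feature expected to increase is lower than its predecessor."""
    return [i for i in range(1, len(feature)) if feature[i] < feature[i - 1]]

index = [1, 2, 3, 4, 5]                 # what is known: the order only
hidden = [0.5, 1.0, 4.0, 9.0, 10.0]     # hypothetical true levels, unknown in practice
clean = [100.0, 104.0, 109.0, 115.0, 122.0]
odd = [100.0, 104.0, 95.0, 115.0, 122.0]  # a "soft hit" in the middle

rho_index = spearman(clean, index)
rho_hidden = spearman(clean, hidden)
rho_odd = spearman(odd, index)
print(rho_index, rho_hidden, rho_odd, drops(odd))
```

Because `rho_index` and `rho_hidden` agree, a per-group rank score like this can be compared across groups even when the level scales differ.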

melinnde

2 Answers

1
  • To link feature vectors with levels 'v', you can use Gaussian processes (GPs).

  • Assuming the 'v' levels are determined by the feature vectors, I can't see why you need to model each group separately... Adding the location of the frame within the audio segment should also not be necessary.

  • If the GP model fails then one can try adding an element to the feature vector describing the location of the audio frame in the group.
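
To make the GP suggestion concrete, here is a minimal from-scratch sketch of GP regression from feature vectors to a level, using only the standard library. It is an illustration under assumptions, not a recommended implementation: the RBF kernel, the fixed `length_scale` and `noise` values, the function names, and the one-dimensional toy data are all made up, and a real analysis would normally use an established GP library with fitted hyperparameters. It also treats the level as a comparable number across all waves, which is exactly the point debated in the comments.

```python
import math

def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel between two feature vectors."""
    return math.exp(-math.dist(a, b) ** 2 / (2 * length_scale ** 2))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def gp_fit(X, y, noise=1e-3, length_scale=1.0):
    """Precompute alpha = (K + noise * I)^-1 (y - mean(y))."""
    y_mean = sum(y) / len(y)
    K = [[rbf(a, b, length_scale) + (noise if i == j else 0.0)
          for j, b in enumerate(X)] for i, a in enumerate(X)]
    alpha = solve(K, [t - y_mean for t in y])
    return X, alpha, y_mean, length_scale

def gp_predict(model, x_new):
    """Posterior mean of the level at a new feature vector."""
    X, alpha, y_mean, length_scale = model
    return y_mean + sum(a * rbf(x_new, x, length_scale) for a, x in zip(alpha, X))

# Hypothetical training data: one scaled feature per wave and its level 'v'.
X = [[0.0], [1.0], [2.0], [3.0]]
v = [1.0, 2.0, 3.0, 4.0]
model = gp_fit(X, v)

between = gp_predict(model, [1.5])   # a wave between levels 2 and 3
at_train = gp_predict(model, [1.0])  # a wave matching a training point
print(between, at_train)
```

Only the posterior mean is computed here; the predictive variance, which a full GP also provides, is omitted to keep the sketch short.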

Good luck.

Hasan
  • Because each group was recorded separately, and the levels v just correspond to the index of each sample in a group. All I know is that they are in the correct order. So if group A had 88 samples and B had 66, it would be incorrect to say that sample 66 in group A is the same as sample 66 in group B, or to assume that sample 66, the maximum in group B, is equivalent to sample 88 in group A. Thanks again – melinnde Aug 29 '13 at 16:34
  • Looks like we are not on the same page here. Please provide details on the 'v' values. What are they? And what is your feature vector? – Hasan Aug 29 '13 at 16:37
  • OK, it may be clearer with an example. Group A is a drum that was hit 88 times from soft to hard. Group B is a different brand of drum hit 66 times from soft to hard. V then describes the index of what is believed perceptually to be the correct order of soft to hard. So I'm looking to identify this pattern of soft to hard with my features, e.g. the fundamental frequency of the drum should increase when the drum is hit harder due to the change in tension of the skin. Other features include spectral centroid, BFCCs, irregularity etc. So it is likely that V1 for group A and V2 for group B fit to the same line. – melinnde Aug 29 '13 at 16:49
  • OK. Using your earlier example: "A" has 88 samples and "B" has 66. After I compute the values of B, which can range from 1-88, I sort them to get the correct index. A higher value should still indicate a harder hit. – Hasan Aug 29 '13 at 17:18
  • I don't think a higher value should indicate a harder hit; for example, there could be a greater number of soft hits in group A. If we shorten the example to 5 hits and 3 hits and rate them on a scale of 1 to 10, group A could be [1 2 7 9 10] and group B could be [1 7 9], but all I know is [1 2 3 4 5] and [1 2 3]. So would it make sense to create a model for each group and then perform linear regression to relate the models? – melinnde Aug 29 '13 at 17:30
  • I still don't know what you are looking for. "V then describes the index of what is believed perceptually as the correct order of soft to hard". You can get that in the example you mentioned. – Hasan Aug 29 '13 at 18:17
  • I'm looking for the relationship between my features and soft-to-hard hits. I have groups of data that are organised in the correct order. For new data, I want to be able to predict the order in which it should be, compared to my examples, so that I can identify irregularities. What I don't understand is this: in order to generate the model I need to know the output for a given feature vector, but I don't know how the outputs of each group relate, other than that each group is organised in the correct order for that group. – melinnde Aug 29 '13 at 18:33
  • How do I train on the data to get a group-independent level 'v'? Irregularities could include: the new data has a soft hit in the middle, one sample was hit in the wrong place, or someone coughs during a recording. – melinnde Aug 29 '13 at 19:12
  • Assume that you are given data as a random collection of waves (skipping the order criteria), coming from all the groups you have. Divide them into two groups. Use one for training and the other for testing. The identification is wave based not group based. Does your training work for this case? I'm assuming that it will. – Hasan Aug 29 '13 at 19:14
  • Yes, OK, but how do I train it to get over the problem of the unknown relationship between the groups? For example, taking only one feature: given a vector x0 = [1 3 6] for which you know the output y0 = [2 6 12], a model can be created. If you have another vector x1 = [1 5 9] with output y1 = [4 20 36], from what I understand you are saying to train with x = [1 1 5 3 6 9] and y = [2 4 6 12 20 36], but I can't do that because I can't assume that y0 and y1 are on the same scale. – melinnde Aug 29 '13 at 19:29
  • "I can't move this to chat because of low reputation, please move it there." What I am saying is this: f1 f2 f3 f4 = output 1, f1 f2 f3 f4 = output 2, and so on. Each set of features f1 f2 f3 f4 describes a single wave, and the output is the corresponding drum level. If you can do this, you are skipping the notion of groups altogether. – Hasan Aug 29 '13 at 19:34
  • "I also can't move this to chat because of low reputation." What I'm saying is that I don't have drum level as a measurement. – melinnde Aug 29 '13 at 19:42
  • By the drum level I meant the level 'v' you have for training data. – Hasan Aug 29 '13 at 19:46
  • Yes, but this is only an index and is only a valid output within a group. It's like saying one group was measured in metres and the other was measured in inches, except that the conversion is unknown. All that is known is that within each group the waves are in increasing order. – melinnde Aug 29 '13 at 19:57
  • Is it a function of the input features? If yes then it is valid across groups as long as you are using the same features. – Hasan Aug 29 '13 at 20:00
  • It may be, and that's the aim of the analysis: to identify the evolution using the feature vectors. But this number is not equal across the groups, so I would need to normalise it somehow. – melinnde Aug 29 '13 at 20:08
  • Well, this is a good point to stop. You can experiment and see the relationship between the input features for a vector representing a wave (random, ignoring group concepts). This should work. Normalization is probably handled by the regression operators. If not, you can make modifications later on. – Hasan Aug 29 '13 at 20:14
0
  • You need to be certain that your feature vector is capturing the properties you are looking for. Using the mean and median for dimension reduction (i.e. extracting the features) is not a good choice unless your problem is really simple. Better choices are PCA, ICA, and SVMs. Capturing the features in the frequency domain can also be a great tool if your signal is sparse in the FFT or DCT domain.
Hasan
  • Thanks for your response. I intended to use the mean/median across the training set rather than on the feature vector. The problem is now a little more difficult, as I can't guarantee that the levels across the training set are equal; I can only guarantee that they are increasing. Any thoughts on this? – melinnde Aug 29 '13 at 11:44
  • Regression between the feature vectors and the levels. Regression with Gaussian processes (http://www.gaussianprocess.org) can be a good choice – Hasan Aug 29 '13 at 11:59
  • I'm not sure I fully understand. So for a given group of levels and feature vectors, a GP can model the features w.r.t. the levels, so if I come along with a new feature vector I can deduce a level. I then do the same for another group of the same type. How do I combine these models so that they describe the same type of pattern, given that I don't know how the levels in each group are related and that the number of levels may vary? Do I do an almost recursive regression, or have I completely lost the plot? – melinnde Aug 29 '13 at 15:44