I am trying to reproduce a piece of research that involves machine learning. In it, the researchers used both feature selection and feature reduction before classifying with Gaussian classifiers.
My question is as follows. Say I have 3 classes, and from a total of (say) 10 features I select the top 3 features for each class. The selected features are, for example:
Class 1: F1 F2 F9
Class 2: F3 F4 F9
Class 3: F1 F5 F10
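To make the mismatch concrete, here is a small sketch of the example above. The dictionary name and the idea of taking the union are only my illustration, not something from the paper, and I am not sure the union is the right way to get a common feature set.

```python
# Hypothetical per-class selections, copied from the example above.
selected = {
    "Class 1": {"F1", "F2", "F9"},
    "Class 2": {"F3", "F4", "F9"},
    "Class 3": {"F1", "F5", "F10"},
}

# Features shared by every class (empty here, which is the problem).
common = set.intersection(*selected.values())

# Union of all per-class selections: one possible shared feature set.
union = sorted(set.union(*selected.values()), key=lambda f: int(f[1:]))

print(common)  # set()
print(union)   # ['F1', 'F2', 'F3', 'F4', 'F5', 'F9', 'F10']
```

So no feature is common to all three classes, and the union already contains 7 of the 10 original features.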
Principal component analysis (PCA) and linear discriminant analysis (LDA) both work on the complete dataset, or at least on datasets in which all classes have the same features. How do I perform feature reduction on a set like this and then train the classifier?
Here is the link for the paper: Speaker Dependent Audio Visual Emotion Recognition
The following is an excerpt from the paper:
The top 40 visual features were selected with Plus l-Take Away r algorithm using Bhattacharyya distance as a criterion function. The PCA and LDA were then applied to the selected feature set and finally single component Gaussian classifier was used for classification.
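For reference, this is roughly how I picture the pipeline in the excerpt, as a sketch only. It assumes scikit-learn and synthetic data. `SelectKBest` with an ANOVA F-score stands in for the Plus l-Take Away r search with Bhattacharyya distance, `QuadraticDiscriminantAnalysis` stands in for the single-component Gaussian classifier (it fits one Gaussian per class), and all sizes are made up. Note that it selects one feature set for all classes, which is exactly the part I am unsure about.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import (
    LinearDiscriminantAnalysis,
    QuadraticDiscriminantAnalysis,
)
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.pipeline import make_pipeline

# Synthetic stand-in data: 3 classes, 10 features (assumed sizes).
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5,
    n_redundant=0, n_classes=3, random_state=0,
)

model = make_pipeline(
    # Stand-in for Plus l-Take Away r: keep 6 features for ALL classes.
    SelectKBest(f_classif, k=6),
    # Unsupervised reduction of the selected features.
    PCA(n_components=4),
    # Supervised projection; at most n_classes - 1 = 2 components.
    LinearDiscriminantAnalysis(n_components=2),
    # One full-covariance Gaussian per class, classified by Bayes' rule.
    QuadraticDiscriminantAnalysis(),
)

model.fit(X, y)
predictions = model.predict(X)
print(predictions[:10])
```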