- Data-adaptive features. Using CSM, one can define parallel signal-processing
branches, eliminating the need to “put all the eggs in one basket”, as is often done
when committing to a fixed FFT size, model order, or feature type. Instead of being
constrained by a single choice of features, CSM effectively allows the
data to choose the most suitable path to the output.
- Dimension reduction. Through the use of parallel signal-processing branches,
more information can be brought to bear on the problem without increasing the feature dimension.
- Information maximization. Using CSM maximizes the information content of the features
without the need to specify a task or target function.
The concept of information maximization is explained in Section 3.4.
- Reversibility. Just as reversible physical processes are the most efficient,
CSM maintains a closer link to the input data, making it less likely that
critical information is lost in feature extraction.
CSM provides a “return path” to the input data. By reconstructing
the input data, the most appropriate feature set can be chosen.
Alternatively, by providing a likelihood function referenced to the input data,
statistical tests can be performed across feature sets, making it
possible to select features by likelihood comparison.
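The idea of comparing feature sets through a likelihood referenced to the input data can be illustrated with a minimal toy sketch. This is not the CSM machinery itself; it only assumes invertible scalar feature maps, so the change-of-variables rule log p_x(x) = log p_z(f(x)) + log|f'(x)| puts every candidate feature's fitted density on the common input-data scale, where the likelihoods are directly comparable:

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy positive-valued data: log-normal, so a log feature should fit best.
x = rng.lognormal(mean=0.0, sigma=0.5, size=2000)

def input_loglik(x, f, log_abs_fprime):
    """Log-likelihood referenced to the input data for an invertible
    feature map f: fit a Gaussian to z = f(x), then map the density
    back via log p_x(x) = log p_z(f(x)) + log|f'(x)|."""
    z = f(x)
    mu, sigma = z.mean(), z.std()
    log_pz = -0.5 * np.log(2 * np.pi * sigma**2) - (z - mu) ** 2 / (2 * sigma**2)
    return np.sum(log_pz + log_abs_fprime(x))

# Two candidate feature branches, scored on the same input-data scale.
ll_log = input_loglik(x, np.log, lambda x: -np.log(x))             # f(x) = log x
ll_id = input_loglik(x, lambda x: x, lambda x: np.zeros_like(x))   # f(x) = x

best = "log" if ll_log > ll_id else "identity"
```

Because the data are log-normal, the Gaussian model in the log-feature space matches the true density exactly, so `ll_log` exceeds `ll_id`; without the Jacobian term, the two likelihoods would live in different spaces and the comparison would be meaningless.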
- When CSM is applied to a neural network architecture,
it results in a projected belief network (PBN). A PBN is simultaneously
a generative and a discriminative network, and can attain the best properties of both types of networks.
In fact, it has been demonstrated that a PBN trained with both discriminative and generative
cost functions can compete with fully discriminative classifiers [1].