NUWC UNCLASS transient data
        Office sounds. Version 1.3

	Paul M Baggenstoss
	p.m.baggenstoss@ieee.org
	Dec 29, 2014

** Difference from version 1.2: 
   added 4 new classes: "dry", "caps", "cd", "clips"


All data are sampled at 32000 Hz.
Version 1.3 has colored noise added to mask
differences in recording conditions between classes.
This is the order of classes in data.mat:

 1. penny     : penny dropped on sheet of paper on the table 
 2. quart     : quarter dropped on sheet of paper on the table 
 3. book      : books dropped on table
 4. coins     : jingle coins in hand (3 pennies, 3 quarters, 2 dimes)
 5. golf      : bounce golf ball on table 
 6. keys      : drop a set of keys (same keys as 'jing')
 7. pret      : grabbing a bag of pretzels 
 8. stapl     : stapling a sheet of paper
 9. bot       : placing a bottle on the table
10. door      : opening door
11. jing      : jingle a set of keys (same keys as 'keys')
12. paper     : ripping paper
13. hangup    : hanging up phone 
14. sciss     : cut paper with scissors 
15. stix      : drop coffee stir sticks into a cup
16. cup       : drop skittles into a cup (same cup)
17. skit      : drop handful of skittles on table
18. 2skit     : drop 2 skittles on table
19. spoon     : drop 1 spoon and 1 fork on table
20. pens      : drop 3 pens on table : mech pencil, ball-point pen, sharpie
--------- new classes in version 1.3 : --------------------------------
21. dry       : drop 4 "Expo" brand dry-erase markers on table
22. caps      : drop handful of plastic ball-point pen caps on table
23. cd        : drop one CD on table
24. clips     : drop handful of wooden clothesline clips on table
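
For scripting, the class order above can be kept as a lookup table.
The following is a minimal Python sketch only. The names CLASS_NAMES and
class_name are illustrative and not part of the dataset, and nothing is
assumed here about how variables are laid out inside data.mat.

```python
# Short class labels in the order listed above (1-based class index).
CLASS_NAMES = [
    "penny", "quart", "book", "coins", "golf", "keys", "pret", "stapl",
    "bot", "door", "jing", "paper", "hangup", "sciss", "stix", "cup",
    "skit", "2skit", "spoon", "pens",
    # classes added in version 1.3
    "dry", "caps", "cd", "clips",
]


def class_name(index):
    """Return the short label for a 1-based class index (1..24)."""
    if not 1 <= index <= len(CLASS_NAMES):
        raise ValueError("class index must be in 1..%d" % len(CLASS_NAMES))
    return CLASS_NAMES[index - 1]
```

For example, class_name(11) gives "jing" and class_name(21) gives "dry".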

Data holdouts for experimental results.  Divide data into three sets:

SET1 samples 1-34   each class
SET2 samples 35-68  each class
SET3 samples 69-102 each class

Standard : train on SET1,SET2 and test on SET3; train on SET1,SET3 and test on SET2;
and train on SET2,SET3 and test on SET1. Report combined results.

Low training data: train on SET1 and test on SET2,SET3; train on SET2 and test on SET1,SET3;
and train on SET3 and test on SET1,SET2.  Report combined results.
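
The two rotations described above can be written out as sample-number lists.
This is a sketch under stated assumptions: the names SETS and folds and the
protocol strings "standard" and "low" are made up for illustration, and the
sample numbers are the 1-based numbers within each class given above.

```python
# 1-based sample numbers within each class, as listed above.
SETS = {
    "SET1": range(1, 35),    # samples 1-34
    "SET2": range(35, 69),   # samples 35-68
    "SET3": range(69, 103),  # samples 69-102
}


def folds(protocol):
    """Yield (train_samples, test_samples) lists for 'standard' or 'low'."""
    names = sorted(SETS)
    for held in names:
        one = list(SETS[held])
        two = [s for n in names if n != held for s in SETS[n]]
        if protocol == "standard":   # train on two sets, test on the third
            yield two, one
        elif protocol == "low":      # train on one set, test on the other two
            yield one, two
        else:
            raise ValueError("protocol must be 'standard' or 'low'")
```

Each protocol yields three folds. Over the three standard folds every sample
is tested once; over the three low-training folds every sample is tested twice.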


Performance benchmarks

---------------------------------------------------------------------------
1.  P. Baggenstoss 12-29-2014  : SVM (128-FFT-PCA features)

SVM: linear kernel, using the SVM-Light toolkit http://svmlight.joachims.org/
          (Thorsten Joachims, Cornell)

Features: Take the straight FFT of the time series (full 16K samples),
then the magnitude, then the log.  Gather this feature over the training set
and do an SVD analysis, keeping the top 128 singular vectors.  Project the data
onto these to get a 128-dim feature.
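
The feature steps above might look roughly like the following NumPy sketch.
It is not the original benchmark code. Assumptions not stated in this file:
the full two-sided FFT is kept, a small floor is added before the log, no
mean is removed before the SVD, and the function names are hypothetical.

```python
import numpy as np


def log_spectrum(x):
    """Log magnitude of the full-length FFT of each row of x."""
    mag = np.abs(np.fft.fft(x, axis=-1))
    return np.log(mag + 1e-12)   # small floor (an assumption) avoids log(0)


def fit_basis(train_feats, n_keep=128):
    """Top right-singular vectors of the training feature matrix."""
    # Rows are training samples, columns are FFT bins; no centering (assumed).
    _, _, vt = np.linalg.svd(train_feats, full_matrices=False)
    return vt[:n_keep].T          # shape (n_bins, n_keep)


def project(feats, basis):
    """Project log-spectrum features onto the retained singular vectors."""
    return feats @ basis
```

The basis is fitted on training data only and then applied unchanged to the
test data: project(log_spectrum(test_x), basis).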

Training on 1/2  (51 samples) and testing on 1/2   :   3.15%  (77/2448)
Training on 1/3  (34 samples) and testing on 2/3   :   3.98%  (195/4896)
---------------------------------------------------------------------------
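
As a check on the counts in the benchmark above: 2448 equals 24 classes times
102 samples, so each recording is tested once, and 4896 is twice that, so each
recording is tested twice. A short Python sketch of the arithmetic follows;
the variable and function names are illustrative only.

```python
N_CLASSES = 24
N_PER_CLASS = 102
N_TOTAL = N_CLASSES * N_PER_CLASS   # every recording counted once


def rate_percent(count, trials):
    """Percentage of trials, rounded to two decimals."""
    return round(100.0 * count / trials, 2)


first = rate_percent(77, N_TOTAL)        # first benchmark line
second = rate_percent(195, 2 * N_TOTAL)  # second benchmark line
```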