From Chasm Software Wiki
You may be able to improve CHASM's performance by applying the feature selection protocol used in Original CHASM.
Step 1. Assemble your feature selection set.
BuildClassifier -m MutationTable -o ClassifierName -p
See CHASM_Tutorial for details.
This will generate a new directory containing three files:
drivers.tmps, passengers.tmps, AllFeatures.list
- We recommend that you split drivers.tmps into two randomly partitioned files of driver mutations. Name one of these files drivers_fs.tmps and the other drivers.tmps (the name of the original file before you split it). Use drivers_fs.tmps for feature selection; the new drivers.tmps will be used to train your classifier at a later step. Keeping the two sets separate helps avoid overfitting the classifier.
- Rename (DO NOT COPY) passengers.tmps to passengers_fs.tmps, but do not split it into two parts. A new passengers.tmps will be generated when you train your classifier.
- You should now have two additional files, passengers_fs.tmps and drivers_fs.tmps.
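The split and rename described in Step 1 can be sketched in Python as follows. This is a minimal sketch, not part of CHASM: the function name split_drivers, the seed parameter, and the even 50/50 split are assumptions (the protocol only asks for two random partitions), and it assumes that each non-empty line of drivers.tmps holds one mutation and that the file has no header line.

```python
import random
from pathlib import Path


def split_drivers(classifier_dir, seed=0):
    """Randomly split drivers.tmps in two and rename passengers.tmps.

    Hypothetical helper: assumes one mutation per non-empty line and
    no header line in drivers.tmps.
    """
    d = Path(classifier_dir)
    lines = [ln for ln in (d / "drivers.tmps").read_text().splitlines()
             if ln.strip()]

    # Shuffle with a fixed seed so the partition is reproducible.
    rng = random.Random(seed)
    rng.shuffle(lines)

    # Assumption: an even split; the protocol does not fix the proportion.
    half = len(lines) // 2
    fs_part, train_part = lines[:half], lines[half:]

    # drivers_fs.tmps is used for feature selection; the rewritten
    # drivers.tmps is kept for training the classifier later.
    (d / "drivers_fs.tmps").write_text("\n".join(fs_part) + "\n")
    (d / "drivers.tmps").write_text("\n".join(train_part) + "\n")

    # Rename (do not copy) the passenger file.
    (d / "passengers.tmps").rename(d / "passengers_fs.tmps")
    return len(fs_part), len(train_part)
```

Calling split_drivers on the directory created by BuildClassifier would leave drivers.tmps, drivers_fs.tmps, passengers_fs.tmps and AllFeatures.list in place. Back up the original drivers.tmps first if you may want to repeat the split with a different seed.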
Step 2. Compute all features available in SNVBox for the mutations in passengers_fs.tmps and drivers_fs.tmps as described in SNVBox_Tutorial#Preparing the requisite files (ignore Step 1) and SNVBox_Tutorial#Retrieving Features.
- The AllFeatures.list file includes all 86 features available in SNVBox and can be used as the feature list file in this step.
- The output of this step is two files in ARFF format, one for the driver mutations and one for the passenger mutations.
Step 3. Use a package such as WEKA to select the most informative features. Original CHASM used feature selection based on mutual information with class labels and a feature threshold of 0.001 bits. WEKA and many other packages accept ARFF format, but you will have to concatenate the driver and passenger ARFF files and add a class identification column.
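The concatenation and the mutual information score can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the code used by Original CHASM or by WEKA: the helper names, the attribute name class, and the labels driver and passenger are hypothetical; the ARFF files are assumed to be dense (not sparse) and to declare identical attributes in the same order; and the mutual information is a simple plug-in estimate over equal-width bins, which may differ from the estimator Original CHASM used.

```python
import math
from collections import Counter


def read_arff(path):
    """Return (header_lines, data_rows) for a dense ARFF file."""
    header, rows, in_data = [], [], False
    with open(path) as fh:
        for raw in fh:
            line = raw.strip()
            if not line or line.startswith("%"):  # skip blanks and comments
                continue
            if in_data:
                rows.append(line)
            elif line.lower().startswith("@data"):
                in_data = True
            else:
                header.append(line)
    return header, rows


def merge_arff(driver_path, passenger_path, out_path):
    """Concatenate two ARFF files and append a nominal class column."""
    header, driver_rows = read_arff(driver_path)
    p_header, passenger_rows = read_arff(passenger_path)

    def attrs(lines):
        return [ln for ln in lines if ln.lower().startswith("@attribute")]

    if attrs(header) != attrs(p_header):
        raise ValueError("driver and passenger ARFF attributes differ")
    with open(out_path, "w") as out:
        out.write("\n".join(header) + "\n")
        # Hypothetical attribute name and labels for the class column.
        out.write("@attribute class {driver,passenger}\n@data\n")
        for row in driver_rows:
            out.write(row + ",driver\n")
        for row in passenger_rows:
            out.write(row + ",passenger\n")
    return len(driver_rows), len(passenger_rows)


def mutual_information_bits(values, labels, bins=10):
    """Plug-in estimate of I(feature; class) in bits, equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # constant feature: a single bin
    binned = [min(int((v - lo) / width), bins - 1) for v in values]
    n = len(values)
    joint = Counter(zip(binned, labels))
    px, py = Counter(binned), Counter(labels)
    # Sum of p(x,y) * log2(p(x,y) / (p(x) * p(y))) over observed cells.
    return sum((c / n) * math.log2(c * n / (px[x] * py[y]))
               for (x, y), c in joint.items())
```

With scores like these, a feature would be kept when its estimated mutual information with the class label reaches the 0.001-bit threshold. The merged file can instead be loaded directly into WEKA or a similar package.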
Step 4. Train your classifier as described in CHASM_Tutorial#Training the Classifier, using only the features selected in Step 3. YOU MUST USE THE SAME MutationTable AND ClassifierName AS IN STEP 1.
Custom Context Tables
For information on constructing custom context tables in CHASM read this.