CHASM archive

From Chasm Software Wiki

Revision as of 20:58, 5 March 2014 by WikiSysop (Talk | contribs)


Previous Versions of CHASM

  • CHASM 1.0.8

CHASM 64-bit (MD5 Checksum: 758cd727d5472ba403d0fd7a46938086 )

CHASM 32-bit (MD5 Checksum: b5186220cbc3fdfabe351e8246cdb673 )

Classifier Pack (1.0.8) (MD5 Checksum: 4ed7dc1d865c76f30d42319ec3a3bd50 )

  • CHASM 1.0.7

Updated list of precomputed classifiers. The list now includes classifiers for 5 new tissue types (Kidney, Bladder, Head and Neck, Lung Adenocarcinoma and Lung Squamous Cell) and an updated classifier for AML.

CHASM 64-bit (MD5 Checksum: 6e0671ede6f88c12d26ad595e25cde74 )

CHASM 32-bit (MD5 Checksum: 8542b192af28e982bd21aadadc6c7dec )

Classifier Pack (1.0.7) (MD5 Checksum: 6eca44ed6d983756dfccbffebf910d77 )

  • CHASM BugFix 1.0.6

Fixed a minor bug in FDR bin calculation that was causing some p-values to fall under a more conservative FDR bin. P-value reporting was altered to provide 4 significant digits. Classifiers available through the CRAVAT website are now packaged with CHASM, under BuiltClassifiers.

CHASM 64-bit (MD5 Checksum: 0dc8215f89b56f4db1b3661c192bd2e4 )

CHASM 32-bit (MD5 Checksum: 308f1cbf1af2e69d371ec27cc24459f4 )

Classifier Pack (1.0.6) (MD5 Checksum: ae502d58e3545f19549aa26aad0089a6 )

  • CHASM 1.0.6

For reproducibility, this version was altered to take the first missense mapping in any case where a nucleotide substitution maps to multiple missense mutations.

CHASM 64-bit (MD5 Checksum: f6ab4d1188849de0164cd550d31ed491 )

CHASM 32-bit (MD5 Checksum: e814cf82d1f2a8a3e0591826d0f9f7c0 )

  • CHASM 1.0.5 BugFix

This version fixed the bug in snvGetGenomic. The only difference between versions 1.0.5 BugFix and 1.0.6 is in the handling of nucleotide substitutions that map to multiple possible missense substitutions. Multiple mappings can occur in two situations: 1) The codon onto which a nucleotide substitution is mapped is split across an exon boundary, in which case alternative splicing can result in multiple possible mappings. 2) In rare cases, transcripts are predicted to overlap the genomic coordinate on both the positive and negative DNA strands. In this version (1.0.5 BugFix), a mapping is selected at random, which can lead to different mappings for a subset of nucleotide substitutions across snvGetGenomic runs.

CHASM 64-bit (MD5 Checksum: a7248aeca1c7202b39daf5e883f3ee9e )

CHASM 32-bit (MD5 Checksum: 5408c402d4332eeca01fe8ad9bb350ee )
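
The reproducibility issue described for 1.0.5 BugFix can be illustrated with a minimal sketch. It is not CHASM code: the function names and the example mappings are hypothetical, and it only contrasts a random choice among candidate mappings with the "take the first mapping" rule adopted in 1.0.6.

```python
import random

def pick_first(mappings):
    """Deterministic rule: always return the first candidate mapping."""
    return mappings[0]

def pick_random(mappings, rng=random):
    """Random rule: the result can differ from one run to the next."""
    return rng.choice(mappings)

# Hypothetical candidates for one nucleotide substitution
# (placeholder transcript names and amino acid changes).
candidates = [("transcript_A", "R100H"), ("transcript_B", "R95H")]

first = pick_first(candidates)       # same answer on every run
sampled = pick_random(candidates)    # may vary between runs
```

With the deterministic rule, repeated runs on the same input give the same missense mapping, which is what makes downstream scores reproducible.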

  • CHASM 1.0.5

This version of CHASM had a bug in snvGetGenomic. Do not use this version if you are submitting your mutations in genomic coordinates. The bug affects scoring of these mutations only. It does not affect classifier training or scoring of mutations submitted in protein coordinates.

CHASM 64-bit (MD5 Checksum: 18b2508c984ced9a5a63ab77069dd9b7 )

CHASM 32-bit (MD5 Checksum: 648e4545981da6ac2ea33c1b135e7173 )

Important: If you are using CHASM pre-bug fix version 1.0.5 and want to upgrade to CHASM version 1.0.6 or 1.0.7, please do the following:

Back up configuration files (chasm_classifiers.conf, snv_box.conf)
Back up BuiltClassifiers directory
Back up any modifications to ClassifierPack (custom features lists, passenger frequency tables or training sets)

Run tar -xvf on the downloaded archive in the directory where CHASM is installed
Restore any backed up files
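
The backup step above could be scripted along these lines. This is a sketch under stated assumptions: the helper name back_up_chasm is hypothetical, and it assumes the configuration files and the BuiltClassifiers directory sit directly under the installation directory, which may not match your layout. ClassifierPack modifications are not covered and would need to be added by hand.

```python
import shutil
from pathlib import Path

# Names taken from the upgrade notes; their location is an assumption.
CONFIG_FILES = ("chasm_classifiers.conf", "snv_box.conf")
BACKUP_DIRS = ("BuiltClassifiers",)

def back_up_chasm(install_dir, backup_dir):
    """Copy config files and BuiltClassifiers from install_dir to backup_dir.

    Items that are not found directly under install_dir are skipped.
    Returns the list of backup paths that were written.
    """
    install_dir, backup_dir = Path(install_dir), Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in CONFIG_FILES:
        source = install_dir / name
        if source.is_file():
            shutil.copy2(source, backup_dir / name)
            copied.append(backup_dir / name)
    for name in BACKUP_DIRS:
        source = install_dir / name
        if source.is_dir():
            # dirs_exist_ok requires Python 3.8 or newer.
            shutil.copytree(source, backup_dir / name, dirs_exist_ok=True)
            copied.append(backup_dir / name)
    return copied
```

Restoring after the upgrade is the same copy in the opposite direction.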

  • CHASM 1.0.4

This version of CHASM used the Waffles machine learning library to build classifiers and did not yet support genomic coordinates.

CHASM 64-bit (MD5 Checksum: 7a7c346d54cbf4f98a65d0bdce37794a)

CHASM 32-bit (MD5 Checksum: 983bd09268995616b4d7b70a8bf0c23d)

ClassifierPack (MD5 Checksum: 54a5288a0c134bc9dc84f116f6f65ae5)
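
The MD5 checksums listed with each download can be compared against a downloaded file before installing. The following is a minimal sketch using Python's standard hashlib module; the function names and the file name in the usage comment are hypothetical.

```python
import hashlib

def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at path, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_checksum(path, expected_md5):
    """Compare a file's MD5 with a checksum copied from this page."""
    return md5_of_file(path) == expected_md5.strip().lower()

# Usage (hypothetical file name for the downloaded archive):
# matches_checksum("chasm_download.tar.gz", "758cd727d5472ba403d0fd7a46938086")
```

A mismatch usually indicates an incomplete or corrupted download, so the file should be fetched again before use.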

Backwards Compatibility

The Random Forest engine used by CHASM was changed between releases 1.0.4 and 1.0.5. A new forest can be trained using old Train.arff files.

1) Training a classifier using an existing arff file

  • Ensure Arff format is correct
    • If class labels are included, the arff header description should now include class weights: @attribute CLASS {driver (1), passenger (1)}. Here the (1) indicates that a class weight of 1 is assigned to both drivers and passengers (equal weighting). CHASM will automatically assign equal weights.
    • Lines in the data section of the arff file cannot exceed 1024 characters. An & can be used to split lines that would otherwise be longer than 1024 characters. snvGetTranscript will return the proper arff formatting if the transcripts and mutations from the existing arff are used as input. The feature file should be edited as needed to ensure that the same feature values are used.
    • Missing values should not occur in arff files generated from the SNVBox database. If the arff used for classifier training does have missing values, they must be represented by '?'.
  • Run Trainer on the arff file
 ./Trainer -f train.arff -d <directory to store trained classifier> -s <seed> -n <mtry parameter; default = floor(sqrt(number of features used))> -r <number of trees; default = 500>
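
The format rules above can be checked before running Trainer. The sketch below is not part of CHASM; the function name is hypothetical, it treats an empty comma-separated field as a missing value that should have been written as '?', and it does not interpret & line continuations.

```python
import re

MAX_LINE_LENGTH = 1024  # limit stated for lines in the data section

def check_training_arff(lines):
    """Return a list of problems found in the lines of a training arff file."""
    problems = []
    in_data = False
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        stripped = line.strip()
        if not in_data:
            if re.match(r"@attribute\s+class\b", stripped, re.IGNORECASE):
                # Expect weights such as {driver (1), passenger (1)}.
                if not re.search(r"\(\s*\d+(\.\d+)?\s*\)", stripped):
                    problems.append(f"line {number}: CLASS attribute has no class weights")
            elif stripped.lower() == "@data":
                in_data = True
            continue
        if not stripped or stripped.startswith("%"):
            continue  # skip blank lines and arff comments
        if len(line) > MAX_LINE_LENGTH:
            problems.append(f"line {number}: longer than {MAX_LINE_LENGTH} characters")
        if any(field.strip() == "" for field in line.split(",")):
            problems.append(f"line {number}: empty field, missing values must be '?'")
    return problems
```

An empty result means none of the three checks fired; it does not guarantee that Trainer will accept the file.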

2) Score an existing arff file with a new classifier

  • Ensure Arff format is correct
    • There should be no class column in the file
    • Lines cannot exceed 1024 characters. Long lines can be split using & to comply.
    • Missing values should be represented by '?'
    • The features and feature ordering in the arff file must exactly match the features in the arff file used to train the classifier
  • Run Chasm to score the mutations in the arff file
 ./Chasm -c <directory storing the trained classifier> -i <mutations to score (in arff format)> -o <file to store output>

Note: This only returns scores. No p-values or FDRs will be calculated
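
The requirement that features and their ordering match the training arff can be checked by comparing the two headers. This is a rough sketch with hypothetical function names; it assumes attribute names contain no spaces or quoting and that the class attribute is named CLASS, as in the header example given for training files.

```python
def attribute_names(lines):
    """Return attribute names in header order, stopping at @data."""
    names = []
    for raw in lines:
        parts = raw.strip().split(None, 2)
        if len(parts) >= 2 and parts[0].lower() == "@attribute":
            names.append(parts[1])
        elif raw.strip().lower() == "@data":
            break
    return names

def features_match(train_lines, score_lines, class_name="CLASS"):
    """True if the scoring arff lists the training features in the same order.

    The class attribute is dropped from the training header, and a scoring
    header that still contains a class attribute is rejected.
    """
    wanted = class_name.upper()
    train = [n for n in attribute_names(train_lines) if n.upper() != wanted]
    score = attribute_names(score_lines)
    if any(n.upper() == wanted for n in score):
        return False
    return train == score
```

Both arguments are lists of lines, for example the result of calling readlines() on each open file.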

2) Alternative:

  • Use RunChasm to score mutations while simultaneously estimating p-values and FDRs
    • First check the format of Null.arff
    • Second, run Chasm to get scores for Null mutations: ./Chasm -c classifierDir -i Null.arff -o Null.classified
    • Third, run RunChasm to score mutations and calculate p-values and FDRs (script will look for Null.classified in the classifier directory):
./RunChasm classifierDir mutationFile (note that mutationFile should be in raw format, not in arff format) 
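
To show why scores for the Null mutations are needed, here is a generic sketch of how an empirical p-value and a Benjamini-Hochberg FDR estimate can be derived from a null score distribution. It is not RunChasm's implementation: the function names are hypothetical, and it assumes that lower scores are more driver-like, so a p-value is the fraction of null scores at or below the observed score.

```python
from bisect import bisect_right

def empirical_p_values(scores, null_scores):
    """Fraction of null scores less than or equal to each observed score."""
    ordered = sorted(null_scores)
    if not ordered:
        raise ValueError("null_scores must not be empty")
    total = len(ordered)
    return [bisect_right(ordered, score) / total for score in scores]

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted values, returned in the input order."""
    count = len(p_values)
    order = sorted(range(count), key=lambda i: p_values[i])
    adjusted = [0.0] * count
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(count, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * count / rank)
        adjusted[index] = running_min
    return adjusted
```

The resolution of an empirical p-value is limited by the size of the null set, so a larger set of Null mutations allows smaller p-values to be distinguished.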

Previous Versions of VEST

  • VEST 1.1.0

VEST 64-bit (MD5 Checksum: 1ca60beae60ef3d57d21b863b620dc0a )

VEST 32-bit (MD5 Checksum: 9b64eda3b77d102b79ad1f3717231848 )

Previous Versions of SNVBox Feature Retrieval Tool

  • SNVGet 2.0 (used by CRAVAT 2.0, CHASM 1.0.8, and VEST 1.1.0)

SNVGet 2.0 (MD5 Checksum: 4520d637794ae808f57142a9d8f21e5a )

Previous Versions of SNVBox MySQL Database

  • SNVBox MySQL Database 2.0.0

SNVBox MySQL Database 2.0.0 (MD5 Checksum: 6125760183629a6c3018dbd168634c2a )

  • SNVBox MySQL Database 1.0.2

SNVBox MySQL Database 1.0.2 (MD5 Checksum: 581fe35e26760c0d5c6d241ff98fcece )

  • SNVBox MySQL Database 1.0.1

SNVBox MySQL Database 1.0.1 (MD5 Checksum: 094e0e807d8dc1c5d404f923dc1dc546 )

To upgrade from SNVBox MySQL Database 1.0.1 to SNVBox MySQL Database 1.0.2, both BugFix and upgrade should be applied to SNVBox MySQL Database 1.0.1.

  • SNVBox MySQL Database 1.0.1 BugFix

We fixed a bug with the disulfide bridge annotation feature in the Uniprot_features table.

Priority Low: This fix will only affect mutation scores for a very small number of mutations.

1. Download SNVBox MySQL Database 1.0.1 BugFix (MD5 Checksum: c2765f2676c44d6193d6d5d371f2aeb8 )

2. Run the following commands:

gunzip SNVBox_BugFix_1.0.1_Uniprot_features.sql.gz
mysql -u user --password=password dbname < SNVBox_BugFix_1.0.1_Uniprot_features.sql

  • SNVBox MySQL Database 1.0.1 to 1.0.2 upgrade

We improved the coverage of the mapping between genomic and transcript coordinates.

Priority High: This upgrade will significantly increase the number of codons for which features are available.

1. Download SNVBox MySQL Database 1.0.1 to 1.0.2 patch (MD5 Checksum: e7cd5b7a065ce59b8e93e8252d035851 )

2. Run the following commands:

mysql -u user --password=password dbname <