												CRAVAT is a web server with a simple interface for performing cancer-related analysis of variants. To cite CRAVAT, please use the CRAVAT article listed below. CRAVAT currently employs three analysis tools: CHASM, SNVGet, and VEST.
												For more information on these tools, refer to the Analysis Tools chapter. To learn how to use CRAVAT, refer to the How to Use chapter. To learn how to interpret the reports produced by CRAVAT, refer to the Output Report chapter.
													
													To cite CRAVAT, please use the following literature:
													 
														Douville C, Carter H, Kim R, Niknafs N, Diekhans M, Stenson PD, Cooper DN, Ryan M, Karchin R (2013). 
															CRAVAT: Cancer-Related Analysis of VAriants Toolkit 
															Bioinformatics, 29(5):647-648.
														 
													To cite CHASM, please use the following literature:
													 
														Carter H, Chen S, Isik L, Tyekucheva S, Velculescu VE, Kinzler KW, Vogelstein B, Karchin R (2009)
															Cancer-specific high-throughput annotation of somatic mutations: computational prediction of driver missense mutations
															Cancer Res, 69(16):6660-7.
														 
													To cite VEST, please use the following literature:
													 
														Douville C, Masica DL, Stenson PD, Cooper DN, Gygax DM, Kim R, Ryan M, Karchin R (2015)
														    Assessing the Pathogenicity of Insertion and Deletion Variants with the Variant Effect Scoring Tool (VEST-indel)
														    Human Mutation, doi: 10.1002/humu.22911.
														Carter H, Douville C, Stenson P, Cooper D, Karchin R (2013) 
															Identifying Mendelian disease genes with the Variant Effect Scoring Tool
															BMC Genomics, 14(Suppl 3):S3.
														 
													To cite SNVBox, please use the following literature:
													 
														Wong WC, Kim D, Carter H, Diekhans M, Ryan M, Karchin R (2011). 
															CHASM and SNVBox: toolkit for detecting biologically important single nucleotide mutations in cancer 
															Bioinformatics, 27(15):2147-2148.
														 
													
													CRAVAT currently employs three analysis tools, CHASM, SNVGet, and VEST:
													 
														
															
																CHASM (Cancer-specific High-throughput Annotation of Somatic Mutations) is a method that 
																predicts the functional significance of somatic missense variants observed in the genomes 
																of cancer cells, allowing variants to be prioritized in subsequent functional studies, based on the 
																probability that they confer increased fitness to a cancer cell. CHASM uses a machine learning method called 
																Random Forest to distinguish between driver and passenger somatic missense variation. The Random Forest is 
																trained on a positive class of drivers curated from the COSMIC database and a negative class of passengers, 
																generated in silico, according to passenger base substitution frequencies estimated for a specific tumor type. 
																Each variant is represented by a list of features, including amino acid substitution properties, 
																alignment-based estimates of conservation at the variant position, predicted local structure and annotations 
																from the UniProt Knowledgebase. Only missense mutations are analyzed by CHASM.  
																For more information on CHASM, please visit 
																http://wiki.chasmsoftware.org and
																refer to the articles cited above.
															
															
																VEST is a method that predicts the functional effect of a variant. 
																The classifier and null distribution for VEST were updated on November 12, 2012, 
																so VEST results obtained before November 13, 2012 may differ from those obtained after that date.
																For more information on VEST, please visit http://wiki.chasmsoftware.org 
																and refer to the VEST literature cited above.
															
															
																SNVGet retrieves selected predictive features for a variant. Features can be broadly 
																categorized into 3 types:
																	Amino Acid Substitution features
																	Protein-based position-specific features
																	Exon-specific features
																Only missense mutations are analyzed by SNVGet.
																For more information on SNVBox (database made with SNVGet), please visit 
																http://wiki.chasmsoftware.org and
																refer to the SNVBox article cited above.
														    
															
														 
													
													
														
															CRAVAT provides user account functionality. You can create a user account, retrieve or change your password, see the status of your jobs, and
															retrieve their results through the "My Jobs" page. Your username is your email.
														 Create a CRAVAT Account: 
															There are two ways to create a CRAVAT account:
															  
																When you submit a job for the first time, CRAVAT will create an account with your email and 
																    a temporary password and this account information will be sent to you as a part of the result 
																    notification email.
																
																	A CRAVAT account can be created by clicking "Log-In" > "Create an account" on the top menu.
																Retrieve Your Username
																Your username is your email.
																Retrieve Your Password
																If you forgot your password, click "Log-In" > "Forgot password?", type your username (your email) and click "Submit". A temporary password will be sent to you.
																Change Your Password
																To change your password, first log in, and then click "My Profile" > "Change password". In the "Change Password" pop-up window, type your current password, your new password, and again your new password. Click "Submit".
																My Jobs Page
																After having logged in, click "My Jobs" on the top menu to open the My Jobs page in a new browser tab.
														This page shows your past and current jobs and their parameters and status (success, fail, running, and in-queue).
														By clicking "Here" in the "Result file" column, you can download the result files through this My Jobs page conveniently.  
													
													
																		
																		
																		
														
														Choose an analysis type:
														 
															
																Cancer driver analysis: This analysis predicts whether the submitted variants 
																are cancer drivers or not.
															
																Pathogenicity analysis: This analysis predicts whether the submitted variants
																will have any pathogenic effect on their translated proteins or not.
															
																Gene annotation only: This analysis provides GeneCards and PubMed information on
																the genes containing the submitted variants.
															 
														When an analysis type is chosen, the options for analysis programs appear. 
														Multiple analysis programs can be chosen. If any of the chosen programs needs a cancer tissue type
														to be specified, a list box for selecting the cancer type also appears. 
														Currently, the following tissue types can be chosen in CRAVAT.
															 
																
																	| Name | Full name | Source | Date |  
															    	| Bladder | Bladder Urothelial Carcinoma | BLCA (TCGA) | Jun 2013 |  
															    	| Blood-Lymphocyte | Chronic Lymphocytic Leukemia | CLL (ICGC) | Mar 2013 |  
															    	| Blood-Myeloid | Acute Myeloid Leukemia | LAML (TCGA) | Jun 2013 |  
															    	| Brain-Cerebellum | Medulloblastoma | MB (mixed source) | Dec 2010 |  
															    	| Brain-Glioblastoma-Multiforme | Glioblastoma Multiforme | GBM (TCGA) | Jun 2013 |
															    	| Brain-Lower-Grade-Glioma | Brain Lower Grade Glioma | LGG (TCGA) | Jun 2013 |
															    	| Breast | Breast Invasive Carcinoma | BRCA (TCGA) | Jun 2012 |
															    	| Cervix | Cervical Squamous Cell Carcinoma and Endocervical Adenocarcinoma | CESC (TCGA) | Jun 2013 |
															    	| Colon | Colon Adenocarcinoma | COAD (TCGA) | Jun 2013 |
															    	| Head and Neck | Head and Neck Squamous Cell Carcinoma | HNSC (TCGA) | Jun 2013 |
															    	| Kidney-Chromophobe | Kidney Chromophobe | KICH (TCGA) | Jun 2013 |
															    	| Kidney-Clear-Cell | Kidney Renal Clear Cell Carcinoma | KIRC (TCGA) | Jun 2013 |
															    	| Kidney-Papillary-Cell | Kidney Renal Papillary Cell Carcinoma | KIRP (TCGA) | Jun 2013 |
															    	| Liver-Nonviral | Hepatocellular Carcinoma (Secondary to Alcohol and Adiposity) | HCCA (ICGC) | Mar 2013 |
															    	| Liver-Viral | Hepatocellular Carcinoma (Viral) | HCCV (ICGC) | Mar 2013 |
															    	| Lung-Adenocarcinoma | Lung Adenocarcinoma | LUAD (TCGA) | Jun 2013 |
															    	| Lung-Squamous Cell | Lung Squamous Cell Carcinoma | LUSC (TCGA) | Jun 2013 |
															    	| Melanoma | Melanoma | ML (Yardena Samuels lab) | Dec 2011 |
															    	| Other | General purpose | OV (TCGA) | Jun 2013 |
															    	| Ovary | Ovarian Serous Cystadenocarcinoma | OV (TCGA) | Jun 2013 |
															    	| Pancreas | Pancreatic Cancer | PNCC (ICGC) | Mar 2013 |
															    	| Prostate-Adenocarcinoma | Prostate Adenocarcinoma | PRAD (TCGA) | Jun 2013 |
															    	| Rectum | Rectum Adenocarcinoma | READ (TCGA) | Jun 2013 |
															    	| Skin | Skin Cutaneous Melanoma | SKCM (TCGA) | Jun 2013 |
															    	| Stomach | Stomach Adenocarcinoma | STAD (TCGA) | Jun 2013 |
															    	| Thyroid | Thyroid Carcinoma | THCA (TCGA) | Jun 2013 |
															    	| Uterus | Uterine Corpus Endometrioid Carcinoma | UCEC (TCGA) | Jun 2013 |
														Lastly, check "Include gene annotation" if you want the result email to include 
														the GeneCards and PubMed annotation of the genes containing the submitted variants.
														
													 
														
														Enter your email address (not needed if you have logged in). If you want to receive a machine-readable, tab-separated
														text version of the CRAVAT analysis report in addition to the default Microsoft Excel version,
														check "Include text reports for machine processing". Then click "SUBMIT". When all the analyses
														are complete, an email with the reports will be sent to you. If you have logged in, you can check the status and history of your jobs on the 'My Jobs' page, where you can also
														download your result by clicking 'Here' in the 'Result file' column.
														
													 
														
														With CRAVAT's RESTful web service, you can submit jobs and check their status without using a browser.
														 
															
																Job submission via POST
																URL: http://www.cravat.us/CRAVAT/rest/service/submit
																Method: POST
																Consumes: Multipart/form-data
																Produces: a JSON object, notable fields of which are as follows.
																	status: "submitted" for successful job submission, "submissonfailed" for an error in the job submission
																	errormsg: If there was any error during the job submission, the error message is written here.
																	jobid: The Job ID of the submitted job. This job ID can be used to check the status of the job later using "status" method which is explained below.
																Form data parameters (* = essential parameters):
																	analyses: "CHASM", "SnvGet", "VEST", "CHASM;VEST", "CHASM;SnvGet", "VEST;SnvGet", or "CHASM;VEST;SnvGet"
																	chasmclassifier: classifier name for CHASM analysis
																	*email: email of the submitter
																	functionalannotation: "on" or "off". GeneCards and PubMed annotation.
																	hg18: "on" or "off". Input mutations are in hg18 coordinates or not.
																	*inputfile: Input mutation file. This is from the file input element in the POST form.
																	mupitinput: "on" or "off". MuPIT input format returned or not.
																	tsvreport: "on" or "off". Text format reports returned or not.
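
The POST submission described above can be sketched in Python with only the standard library. This is a minimal illustration, not an official client: the helper names `encode_multipart` and `build_submit_request` and the upload filename `input.txt` are hypothetical, while the URL and form field names are taken from the parameter list above.

```python
import urllib.request
import uuid

SUBMIT_URL = "http://www.cravat.us/CRAVAT/rest/service/submit"


def encode_multipart(fields, file_field, filename, file_bytes):
    """Encode text fields plus one file as multipart/form-data.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append((
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        ).encode("utf-8"))
    parts.append((
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{file_field}"; filename="{filename}"\r\n'
        "Content-Type: text/plain\r\n\r\n"
    ).encode("utf-8") + file_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_submit_request(email, input_bytes, analyses="CHASM;VEST", **options):
    """Build (but do not send) the POST request for a job submission.

    `options` may carry the optional parameters listed above,
    e.g. tsvreport="on" or hg18="off".
    """
    fields = {"email": email, "analyses": analyses, **options}
    # "input.txt" is an arbitrary filename chosen for this sketch.
    body, content_type = encode_multipart(fields, "inputfile", "input.txt", input_bytes)
    return urllib.request.Request(
        SUBMIT_URL, data=body, headers={"Content-Type": content_type}, method="POST"
    )
```

To send the request, pass it to `urllib.request.urlopen` and decode the response with `json.load`; the returned object should then carry the `status`, `errormsg`, and `jobid` fields described above.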
																Job submission via GET
																URL: http://www.cravat.us/CRAVAT/rest/service/submit
																Method: GET
																Produces: a JSON object, notable fields of which are as follows.
																	status: "submitted" for successful job submission, "submissonfailed" for an error in the job submission
																	errormsg: If there was any error during the job submission, the error message is written here.
																	jobid: The Job ID of the submitted job. This job ID can be used to check the status of the job later using "status" method which is explained below.
																Query parameters (* = essential parameters):
																	analyses: "CHASM", "SnvGet", "VEST", "CHASM;SnvGet", or "VEST;SnvGet"
																	chasmclassifier: classifier name for CHASM analysis
																	*email: email of the submitter
																	functionalannotation: "on" or "off". GeneCards and PubMed annotation.
																	hg18: "on" or "off". Input mutations are in hg18 coordinates or not.
																	*mutations: a string with mutations, the format of which is the same as described in the "Input" section above.
																	mupitinput: "on" or "off". MuPIT input format returned or not.
																	tsvreport: "on" or "off". Text format reports returned or not.
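
For the GET variant, the query string can be assembled with `urllib.parse.urlencode`, which percent-encodes characters such as `;` and `@`. The function name `build_submit_url` is hypothetical, and whether the server expects encoded or raw separators is an assumption of this sketch; the parameter names come from the list above.

```python
from urllib.parse import urlencode

SUBMIT_URL = "http://www.cravat.us/CRAVAT/rest/service/submit"


def build_submit_url(email, mutations, analyses="VEST", **options):
    """Return the GET submission URL for the given essential and optional parameters.

    `mutations` must already be in the format described in the "Input" section;
    this helper does not validate it.
    """
    params = {"email": email, "mutations": mutations, "analyses": analyses, **options}
    return SUBMIT_URL + "?" + urlencode(params)
```

Opening the resulting URL (for example with `urllib.request.urlopen`) should return the same JSON fields as the POST method.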
																Job status checking
																URL: http://www.cravat.us/CRAVAT/rest/service/status
																Method: GET
																Produces: a JSON object, notable fields of which are as follows.
																	status: "running" for still running, "success" for successful completion, "jobfailed" for failed
																	errormsg: Error message if the job failed.
																	resultfileurl: If the job completed successfully, the URL of the result file.
																Query parameters (* = essential parameter):
																	*jobid: The job ID to query.
																Example: http://www.cravat.us/CRAVAT/rest/service/status?jobid=test@20140204_102423
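
A polling loop over the status method might look like the sketch below. The function names, the polling interval, and the poll limit are assumptions made for illustration; only the URL, the `jobid` parameter, and the `status`, `errormsg`, and `resultfileurl` fields come from the description above.

```python
import json
import time
import urllib.request
from urllib.parse import urlencode

STATUS_URL = "http://www.cravat.us/CRAVAT/rest/service/status"


def fetch_status(jobid):
    """Query the status method and return the decoded JSON object."""
    with urllib.request.urlopen(STATUS_URL + "?" + urlencode({"jobid": jobid})) as resp:
        return json.load(resp)


def interpret_status(payload):
    """Return (done, result_url). Raise if the job failed or the status is unknown."""
    status = payload.get("status")
    if status == "running":
        return False, None
    if status == "success":
        return True, payload.get("resultfileurl")
    if status == "jobfailed":
        raise RuntimeError(payload.get("errormsg") or "job failed")
    raise ValueError(f"unexpected status: {status!r}")


def wait_for_job(jobid, fetch=fetch_status, interval=30.0, max_polls=120):
    """Poll until the job finishes and return the result file URL."""
    for _ in range(max_polls):
        done, result_url = interpret_status(fetch(jobid))
        if done:
            return result_url
        time.sleep(interval)
    raise TimeoutError(f"job {jobid} did not finish after {max_polls} polls")
```

The `fetch` argument is injectable so the loop can be exercised without network access, for example with a stub that returns canned JSON objects.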
															
															Single variant Web API
															URL: http://www.cravat.us/CRAVAT/rest/service/query
															Method: GET
															Produces: a JSON object, notable fields of which are as follows.
																Chromosome: Chromosome of the variant
																Position: Position of the variant
																Strand: DNA strand on which the variant is located
																Reference base: Base(s) at the variant position in the reference genome (hg18 or hg19)
																Alternate base: Sequence of the variant
																Hugo symbol: Gene symbol from HUGO in which the variant resides
																Sequence ontology transcript: Transcript used to get the most severe Sequence Ontology. If more than one transcript has the most severe sequence ontology, the longest RefSeq transcript (if not, the longest Ensembl one, or the longest CCDS one, in this order) is chosen.
																Protein sequence change: Protein sequence change for the Sequence ontology column
																Sequence ontology: Sequence Ontology annotation. See Sequence Ontology section below. When more than one sequence ontology is found due to multiple transcript mapping, the most severe consequence is reported, according to the order of FI, FD, SG, SS, SL, II, ID, CS, MS, and SY.
																Sequence ontology all transcripts: Sequence ontology for each transcript mapped to the variant position. An asterisk is assigned to the transcript that was used to get the most severe sequence ontology.
																ExAC total allele frequency: Total allele frequency from ExAC
																ExAC allele frequency (African/African American): ExAC allele frequency in African and African American population
																ExAC allele frequency (Latino): ExAC allele frequency in Latino population
																ExAC allele frequency (East Asia): ExAC allele frequency in East Asian population
																ExAC allele frequency (Finish): ExAC allele frequency in Finnish population
																ExAC allele frequency (Non-Finnish European): ExAC allele frequency in Non-Finnish European population
																ExAC allele frequency (Other): ExAC allele frequency in Other population
																ExAC allele frequency (South Asian): ExAC allele frequency in South Asian population
																1000 Genomes allele frequency: Allele frequency from the 1000 Genomes project
																ESP6500 allele frequency (European American): Allele frequency in the European American population, from ESP6500
																ESP6500 allele frequency (African American): Allele frequency in the African American population, from ESP6500
																Transcript in COSMIC: COSMIC Transcript that is mapped to the input variant
																Protein sequence change in COSMIC: Protein sequence change caused by the variant, according to COSMIC
																Occurrences in COSMIC [exact nucleotide change]: How many times the variant is observed in COSMIC
																Occurrences in COSMIC by primary sites [exact nucleotide change]: How many times the mutation is observed in COSMIC, grouped by primary sites
																Mappability Warning: Warning codes for whether the mutation's mapping is reliable or not. See Mappability section below.
																Driver Genes: Cancer driver gene hits (oncogenes and tumor suppressor genes) according to Vogelstein et al.
																TARGET: TARGET drug association DB hits
																dbSNP: dbSNP record which has the mutation
															Query parameters (* = essential parameter):
																*mutation: The chromosome, position, strand direction, reference base and alternate base of the variant separated by underscores (chromosome_position_strand_refBase_altBase)
															Example: http://www.cravat.us/CRAVAT/rest/service/query?mutation=chr22_30421786_+_A_T
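
The underscore-separated `mutation` parameter can be built programmatically, as in this small sketch. The helper names are hypothetical. The strand sign is left unescaped because the example URL above passes `+` literally; that the server requires this form is an assumption.

```python
from urllib.parse import quote

QUERY_URL = "http://www.cravat.us/CRAVAT/rest/service/query"


def mutation_key(chrom, position, strand, ref_base, alt_base):
    """Join the five variant fields as chromosome_position_strand_refBase_altBase."""
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    return "_".join([chrom, str(position), strand, ref_base, alt_base])


def build_query_url(chrom, position, strand, ref_base, alt_base):
    """Return the single-variant query URL, keeping '+' literal as in the example."""
    key = mutation_key(chrom, position, strand, ref_base, alt_base)
    return QUERY_URL + "?mutation=" + quote(key, safe="+")
```

Calling `build_query_url("chr22", 30421786, "+", "A", "T")` reproduces the example URL shown above.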
													
													
														
														Upon a successful submission and analysis, you will receive a link to your results via email. 
														If you have logged in, you can also check the status and history of your jobs on the 'My Jobs' page and 
														download your result by clicking 'Here' in the 'Result file' column. The results will be available for 30 days from the date of submission. They are 
														delivered as one zip-compressed file containing several report files, including an MS Excel format spreadsheet and optional tab-separated text files.
														There are three levels of analysis: variant, codon, and gene. The spreadsheet has each level as a tab, and the tab-separated text files have 
														each level as a separate .tsv file. The SNVGet analysis result also appears as a separate tab or file. The result of the analysis at each level is shown as a table, and the columns of each table are explained below.
														 
															
															
																
																	
																		| Column | Meaning |  
																		| Input line number | Line number from the input file |  
																		| ID | Unique ID of a mutation input line |  
																		| Chromosome | Chromosome of the mutation |  
																		| Position | Position of the mutation |  
																		| Strand | DNA strand on which the mutation is located |  
																		| Reference base(s) | Base(s) at the mutation position in the reference genome (hg18 or hg19) |  
																		| Alternate base(s) | Sequence of the mutation |  
																		| Sample ID | ID of the sample from which the mutation was observed |  
																		| HUGO symbol | Gene symbol from HUGO in which the mutation resides |  
																		| Sequence ontology | Sequence Ontology annotation. See Sequence Ontology section below. When more than one sequence ontology is found due to multiple transcript mapping, the most severe consequence is reported, according to the order of FI, FD, SG, SS, SL, II, ID, CS, MS, and SY. |  
																		| Protein sequence change | Protein sequence change for the Sequence ontology column. |  
																		| QUAL | Phred-scaled quality score for the assertion made in the alternate bases. This column appears only with a VCF-format input. |  
																		| FILTER | PASS if the mutation position passed all filters. Otherwise, a semicolon-separated list of codes for filters that fail (e.g. "q10;s50"). This column appears only with a VCF-format input. |  
																		| Zygosity | "hom" or "het" depending on whether the alternate allele is present on both chromosomes or only one of them, respectively. This column appears only with a VCF-format input. |  
																		| CHASM cancer driver p-value (missense) | Empirically-derived p-value of the CHASM cancer driver score. Only missense mutations are considered. |  
																		| CHASM cancer driver FDR (missense) | Benjamini-Hochberg false discovery rate. Only missense mutations are considered. |  
																		| VEST pathogenicity p-value (non-silent) | Empirically-derived p-value of the VEST pathogenicity score. Only non-silent mutations are considered. |  
																		| VEST pathogenicity FDR (non-silent) | Benjamini-Hochberg false discovery rate. Only non-silent mutations are considered. |  
																		| Mappability Warning | Warning codes for whether the mutation's mapping is reliable or not. See Mappability section below. |  
																		| Driver Genes | Cancer driver gene hits (oncogenes and tumor suppressor genes) according to Vogelstein et al. |  
																		| TARGET | TARGET drug association DB hits |  
																		| dbSNP | dbSNP record which has the mutation |  
																		| 1000 Genomes allele frequency | Allele frequency from the 1000 Genomes project |  
																		| ESP6500 allele frequency (average) | Average allele frequency from ESP6500 |  
																		| ExAC total allele frequency | Total allele frequency from ExAC |  
																		| Occurrences in COSMIC by primary sites [exact nucleotide change] | How many times the mutation is observed in COSMIC, grouped by primary sites |  
																		| Number of samples in study having the exact nucleotide change | Number of samples in study having the exact nucleotide change |  
																		| MuPIT Link | If the mutation falls on a known protein structure or a homology model (see here), it can be visualized with MuPIT by clicking the link in this column. |  
																		| GeneCards summary | Information on the gene containing the mutation, pulled from GeneCards |  
																		| Number of retrieved articles from PubMed | Number of records retrieved from PubMed, using the name of the gene which contains the mutation and "cancer" as keywords. First, the keywords are searched in MeSH terms. If nothing is found, the titles and abstracts of the literature are searched. If still nothing is found, the keywords are searched without restriction on where they appear. |  
																		| PubMed search term | Link to the PubMed search result with the mutation's gene name and "cancer" as keywords |  
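
The FDR columns above are described as Benjamini-Hochberg false discovery rates. For readers unfamiliar with the procedure, the following is a generic sketch of the standard Benjamini-Hochberg adjustment. It is not taken from CRAVAT's code, and the function name is hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted values, in the same order as the input.

    Each p-value at rank r (1-based, ascending) is scaled by m / r, and a
    running minimum from the largest rank downward keeps the result monotone.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A variant is then typically called significant when its adjusted value falls below a chosen FDR threshold; the threshold itself is a user decision and is not prescribed here.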
															
															
																
																	
																		| Column | Meaning |  
																		| Input line number | Line number from the input file |  
																		| ID | Unique ID of a mutation input line |  
																		| Chromosome | Chromosome of the mutation |  
																		| Position | Position of the mutation |  
																		| Strand | DNA strand on which the mutation is located |  
																		| Reference base(s) | Base(s) at the mutation position in the reference genome (hg18 or hg19) |  
																		| Alternate base(s) | Sequence of the mutation |  
																		| Sample ID | ID of the sample from which the mutation was observed |  
																		| HUGO symbol | Gene symbol from HUGO in which the mutation resides |  
																		| Sequence ontology | Sequence Ontology annotation. See Sequence Ontology section below. When more than one sequence ontology is found due to multiple transcript mapping, the most severe consequence is reported, according to the order of FI, FD, SG, SS, SL, II, ID, CS, MS, and SY. |  
																		| Sequence ontology transcript | Transcript used to get the most severe Sequence Ontology. If more than one transcript has the most severe sequence ontology, the longest RefSeq transcript (if not, the longest Ensembl one, or the longest CCDS one, in this order) is chosen. |  
																		| Sequence ontology transcript strand | The strand (+ or -) of the transcript used to get the sequence ontology |  
																		| Protein sequence change | Protein sequence change for the Sequence ontology column |  
																		| Sequence ontology all transcripts | Sequence ontology for each transcript mapped to the variant position. An asterisk is assigned to the transcript that was used to get the most severe sequence ontology. |  
																		| CHASM cancer driver score transcript | Transcript used to get the CHASM cancer driver score |  
																		| Cancer missense driver score (1 - CHASM score) | 1 - CHASM cancer driver score. Closer to 1 means that the mutation is more likely a cancer driver. |  
																		| CHASM cancer driver p-value (missense) | Empirically-derived p-value of the CHASM cancer driver score. Only missense mutations are considered. |  
																		| CHASM cancer driver FDR (missense) | Benjamini-Hochberg false discovery rate. Only missense mutations are considered. |  
																		| Cancer missense driver score of all transcripts | Cancer missense driver score (1 - CHASM score) and p-value of each transcript that has mapping to the input variant. Format is Transcript:Protein sequence change(Cancer missense driver score:CHASM cancer driver p-value). An asterisk is assigned to the transcript that has the highest cancer missense driver score. |  
																		| VEST pathogenicity score transcript | Transcript used to get VEST pathogenicity score |  
																		| VEST pathogenicity score (missense) | VEST pathogenicity score for missense variants |  
																		| VEST pathogenicity score (frameshift indels) | VEST pathogenicity score for frameshift indels |  
																		| VEST pathogenicity score (inframe indels) | VEST pathogenicity score for inframe indels |  
																		| VEST pathogenicity score (stop-gain) | VEST pathogenicity score for stop-gain variants |  
																		| VEST pathogenicity score (stop-loss) | VEST pathogenicity score for stop-loss variants |  
																		| VEST pathogenicity score (splice site) | VEST pathogenicity score for splice site variants |  
																		| VEST pathogenicity score and p-value of all transcripts (non-silent) | VEST pathogenicity score and p-value of each transcript that has mapping to the input variant. Format is Transcript:Protein sequence change(VEST pathogenicity score:VEST pathogenicity p-value). An asterisk is assigned to the transcript that has the highest VEST pathogenicity score. |  
																		| ESP6500 allele frequency (European American) | Allele frequency in the European American population, from ESP6500 |  
																		| ESP6500 allele frequency (African American) | Allele frequency in the African American population, from ESP6500 |  
																		| ExAC allele frequency (Latino) | ExAC allele frequency in Latino population |  
																		| ExAC allele frequency (African/African American) | ExAC allele frequency in African and African American population |  
																		| ExAC allele frequency (East Asian) | ExAC allele frequency in East Asian population |  
																		| ExAC allele frequency (Finnish) | ExAC allele frequency in Finnish population |  
																		| ExAC allele frequency (Non-Finnish European) | ExAC allele frequency in Non-Finnish European population |  
																		| ExAC allele frequency (Other) | ExAC allele frequency in Other population |  
																		| ExAC allele frequency (South Asian) | ExAC allele frequency in South Asian population |  
																		| Transcript in COSMIC | COSMIC Transcript that is mapped to the input variant |  
																		| Protein sequence change in COSMIC | Protein sequence change caused by the variant, according to COSMIC |  
																		| Occurrences in COSMIC [exact nucleotide change] | How many times the variant is observed in COSMIC |  
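
The "all transcripts" columns above use the format Transcript:Protein sequence change(score:p-value). A single entry of that form could be parsed as sketched below. Several details are assumptions, not documented behavior: that the asterisk, when present, precedes the transcript name, and that scores and p-values are plain decimal numbers. How multiple entries are separated within a cell is not stated in the table, so this sketch handles one entry at a time. The transcript name in the usage note is made up.

```python
import re

# Assumed layout: optional leading "*", transcript, ":", protein change, "(score:pvalue)".
ENTRY_PATTERN = re.compile(
    r"^(?P<star>\*?)(?P<transcript>[^:(]+):(?P<change>[^()]*)"
    r"\((?P<score>[^:()]+):(?P<pvalue>[^()]+)\)$"
)


def parse_transcript_entry(text):
    """Parse one 'Transcript:Change(score:p-value)' entry into a dict."""
    match = ENTRY_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized entry: {text!r}")
    return {
        "transcript": match["transcript"],
        "protein_change": match["change"],
        "score": float(match["score"]),
        "pvalue": float(match["pvalue"]),
        "selected": match["star"] == "*",  # True if the entry carries the asterisk
    }
```

For instance, `parse_transcript_entry("*NM_000000.1:A100T(0.812:0.0042)")` yields a dict whose `selected` flag is true and whose `score` is 0.812.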
															
															Non-coding regions are regions in the genome that are not in a protein coding portion of a gene. This includes UTR, intron, non-coding RNA, and intergenic regions.
															
															 
																
																	
																		| Column | Meaning |  
																		| Input line number | Line number from the input file |  
																		| ID | Unique ID of a mutation input line |  
																		| Chromosome | Chromosome of the mutation |  
																		| Position | Position of the mutation |  
																		| Strand | DNA strand on which the mutation is located |  
																		| Reference base(s) | Base(s) at the mutation position in the reference genome (hg18 or hg19) |  
																		| Alternate base(s) | Sequence of the mutation |  
																		| Sample ID | ID of the sample from which the mutation was observed |  
																		| HUGO symbol | Gene symbol from HUGO in which the mutation resides |  
																		| Sequence ontology | Sequence Ontology annotation. See Sequence Ontology section below. When more than one sequence ontology is found due to multiple transcript mapping, the most severe consequence is reported, according to the order of FI, FD, SG, SS, SL, II, ID, CS, MS, and SY. |  
																		| QUAL | Phred-scaled quality score for the assertion made in the alternate bases. This column appears only with a VCF-format input. |  
																		| FILTER | PASS if the mutation position passed all filters. Otherwise, a semicolon-separated list of codes for filters that fail (e.g. "q10;s50"). This column appears only with a VCF-format input. |  
																		| Zygosity | "hom" or "het" depending on whether the alternate allele is present on both chromosomes or only one of them, respectively. This column appears only with a VCF-format input. |  
																		| Mappability Warning | Warning codes for whether the mutation's mapping is reliable or not. See Mappability section below. |  
																		| dbSNP | dbSNP record which has the mutation |  
																		| 1000 Genomes allele frequency | Allele frequency from the 1000 Genomes project |  
																		| ESP6500 allele frequency (average) | Average allele frequency from ESP6500 |  
																		| ExAC total allele frequency | Total allele frequency from ExAC |  
																		| Occurrences in COSMIC by primary sites [exact nucleotide change] | How many times the mutation is observed in COSMIC, grouped by primary sites |  
																		| Number of samples in study having the exact nucleotide change | Number of samples in study having the exact nucleotide change |  
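
The Sequence ontology rows state that, when several transcripts give different annotations, the most severe one is reported according to the order FI, FD, SG, SS, SL, II, ID, CS, MS, SY. That selection rule can be expressed as a short sketch; the function name is hypothetical, and skipping unknown codes is a choice made here for illustration.

```python
# Severity order as given in the column description, most severe first.
SEVERITY_ORDER = ["FI", "FD", "SG", "SS", "SL", "II", "ID", "CS", "MS", "SY"]
SEVERITY_RANK = {code: rank for rank, code in enumerate(SEVERITY_ORDER)}


def most_severe(codes):
    """Return the most severe code among `codes`, or None if none is recognized."""
    known = [code for code in codes if code in SEVERITY_RANK]
    if not known:
        return None
    return min(known, key=SEVERITY_RANK.__getitem__)
```

Given annotations `["MS", "SG", "SY"]` from three transcripts, the function returns `"SG"`, since SG precedes MS and SY in the stated order.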
															
															
																
																	
																		| Column | Meaning |  
																		| HUGO Symbol | HUGO symbol of the gene in which the mutation resides |  
																		| Sequence ontology | Sequence Ontology annotation. See Sequence Ontology section below. |  
																		| Cancer missense driver score (1-CHASM score) | Strongest CHASM cancer driver score among the variants found in the gene. 
																		    The closer the score is to 1, the more cancer-driving the gene's top-scoring variant is predicted to be. |  
																		| VEST pathogenicity score (non-silent) | Highest VEST pathogenicity score among the variants found in the gene.
																			The closer the score is to 1, the more pathogenic the gene's top-scoring variant is predicted to be. |  
																		| VEST pathogenicity composite p value (non-silent) | Composite p-value based on Stouffer's Z-score method |  
																		| VEST pathogenicity FDR (non-silent) | Composite FDR based on Stouffer's Z-score method |  
																		| Driver Genes | Cancer driver gene hits (oncogenes and tumor suppressor genes) according to Vogelstein et al. |  
																		| TARGET | TARGET drug association DB hits |  
																		| Occurrences in COSMIC [gene mutated] | How many times any mutation in the gene is observed in COSMIC |  
																		| Occurrences in COSMIC by primary sites [gene mutated] | How many times any mutation in the gene is observed in COSMIC, grouped by primary sites |  
																		| Number of samples in study having the gene mutated | Number of samples in study having the gene mutated |  
																		| MuPIT Link | If the mutations in the gene fall on a known protein structure, they can be visualized with MuPIT by clicking the link in this column. |  
																		| GeneCards summary (from http://www.genecards.org) | Information on the gene containing the mutation, pulled from GeneCards |  
																		| Number of retrieved articles from PubMed | Number of records retrieved from PubMed, using the name of the gene that contains the mutation and "cancer" as keywords. The keywords are first searched in MeSH terms. If nothing is found, the titles and abstracts of the literature are searched. If still nothing is found, the keywords are searched without restriction on where they appear. |  
																		| PubMed search term | Link to the PubMed search result with the mutation's gene name and "cancer" as keywords |  
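
The composite p-value column is described as based on Stouffer's Z-score method. As a rough illustration of that method, the sketch below combines one-sided p-values with equal weights. It is not CRAVAT's code: the function name `stouffer_combined_p`, the equal weighting, and the clamping constant `eps` are all assumptions made for this example.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution, mean 0 and sd 1


def stouffer_combined_p(p_values, eps=1e-12):
    """Combine one-sided p-values with the unweighted Stouffer Z-score method.

    Each p-value is converted to a Z-score, the Z-scores are summed and
    divided by sqrt(k), and the result is converted back to a p-value.
    `eps` clamps p-values away from 0 and 1 so the inverse CDF is defined
    (an assumption of this sketch, not something stated in the table).
    """
    if not p_values:
        raise ValueError("at least one p-value is required")
    z_scores = []
    for p in p_values:
        p = min(max(p, eps), 1.0 - eps)
        z_scores.append(_STD_NORMAL.inv_cdf(1.0 - p))
    combined_z = sum(z_scores) / sqrt(len(z_scores))
    return 1.0 - _STD_NORMAL.cdf(combined_z)


if __name__ == "__main__":
    # Hypothetical per-variant p-values for one gene.
    print(stouffer_combined_p([0.01, 0.04, 0.20]))
```

With a single p-value the function returns that p-value unchanged (up to floating-point error), and several small p-values combine into a smaller composite value.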
															
															
																
																	
																		| Column | Meaning |  
																		| Input line number | Line number from the input file |  
																		| ID | Unique ID of a variant input line |  
																		| Chromosome | Chromosome of the mutation |  
																		| Position | Position of the mutation |  
																		| Strand | DNA strand on which the mutation lies |  
																		| Reference base(s) | Base(s) at the mutation position in the reference genome (hg18 or hg19) |  
																		| Alternate base(s) | Sequence of the mutation |  
																		| Sample ID | ID of the sample from which the mutation was observed |  
																		| HUGO Symbol | HUGO symbol of the gene in which the mutation resides |  
																		| Sequence ontology | Sequence Ontology annotation. See Sequence Ontology section below. |  
																		| Sequence ontology transcript | Transcript used to determine the most severe
																    		sequence ontology. 
																			If more than one transcript has the most severe sequence ontology, 
																			the longest RefSeq transcript is chosen. If there is no RefSeq transcript, the longest Ensembl transcript 
																			is chosen, and failing that, the longest CCDS transcript. |  
																		| Protein sequence change | Position and amino acid changed by the variant, in the representative transcript |  
															To understand the other columns of the SNVBox analysis result table, please refer to this document for a comprehensive explanation.
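
The rule for choosing the "Sequence ontology transcript" can be sketched as follows. This is a minimal illustration under stated assumptions, not CRAVAT's implementation: the `Transcript` class, its field names, and the function `representative_transcript` are hypothetical, and "longest" is taken here to mean the largest value of a single `length` field, since the table does not say which length is compared.

```python
from dataclasses import dataclass

# Severity order from the Sequence ontology column, most severe first.
SEVERITY_ORDER = ["FI", "FD", "SG", "SS", "SL", "II", "ID", "CS", "MS", "SY"]
RANK = {code: rank for rank, code in enumerate(SEVERITY_ORDER)}

# Source preference stated for the Sequence ontology transcript column.
SOURCE_PRIORITY = ("RefSeq", "Ensembl", "CCDS")


@dataclass(frozen=True)
class Transcript:
    """Hypothetical record for one transcript a mutation maps to."""
    accession: str
    source: str    # "RefSeq", "Ensembl" or "CCDS"
    length: int    # transcript length; the unit is an assumption
    so_code: str   # sequence ontology code of the mutation on this transcript


def representative_transcript(transcripts):
    """Pick the transcript that would be reported for a mutation.

    Step 1: keep only transcripts with the most severe sequence ontology.
    Step 2: among those, take the longest RefSeq transcript; if there is
    none, the longest Ensembl one; if there is none, the longest CCDS one.
    """
    ranked = [t for t in transcripts if t.so_code in RANK]
    if not ranked:
        return None
    worst = min(RANK[t.so_code] for t in ranked)
    candidates = [t for t in ranked if RANK[t.so_code] == worst]
    for source in SOURCE_PRIORITY:
        from_source = [t for t in candidates if t.source == source]
        if from_source:
            return max(from_source, key=lambda t: t.length)
    return None
```

Note that the severity filter is applied before the source preference, so a RefSeq transcript with a milder consequence does not outrank an Ensembl transcript with the most severe one.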
														 
														
															
															
																
																	
																		| Code | Meaning |  
																		| SY | Synonymous Variant |  
																		| SL | Stop Lost |  
																		| SG | Stop Gained |  
																		| MS | Missense Variant |  
																		| II | Inframe Insertion |  
																		| FI | Frameshift Insertion |  
																		| ID | Inframe Deletion |  
																		| FD | Frameshift Deletion |  
																		| CS | Complex Substitution |  
															The source of Sequence Ontology terms is here.
														 
															
															
																
																	
																		| Code | Meaning |  
																		| A75 | The hg19 reference genome has more than one location matching the 75-mer sequence at the query position |  
																		| ACR | ACRO1 (Human acromeric satellite) |  
																		| ALC | ALR/Alpha |  
																		| BSR | Beta satellite repeat/beta |  
																		| CAT | (CATTC)n |  
																		| CHM | Chromosome M |  
																		| CNR | Centromeric Repeat |  
																		| GAA | (GAATG)n |  
																		| GAG | (GAGTG)n |  
																		| HMI | High artifact island |  
																		| LMI | Low artifact island |  
																		| LSU | Large subunit rRNA Hsa |  
																		| snR | Small nuclear RNA |  
																		| SSU | Small subunit rRNA Hsa |  
																		| STL | Satellite repeat |  
																		| TAR | TAR1 |  
																		| TII | HSATII (Human satellite II DNA) |  
																		| TLM | Telomeric repeat |  
															The source of the mappability tags is here.
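
When post-processing a report, the mappability codes above can be translated back into their meanings with a simple lookup. The sketch below is illustrative only: the function name `describe_warnings` is hypothetical, and it takes the codes as an already-split list because the delimiter used in the Mappability Warning column is not stated here.

```python
# Code-to-meaning table copied from the mappability table above.
MAPPABILITY_CODES = {
    "A75": "The hg19 reference genome has more than 1 location with the "
           "75 mer sequence from the query position",
    "ACR": "ACRO1 (Human acromeric satellite)",
    "ALC": "ALR/Alpha",
    "BSR": "Beta satellite repeat/beta",
    "CAT": "(CATTC)n",
    "CHM": "Chromosome M",
    "CNR": "Centromeric Repeat",
    "GAA": "(GAATG)n",
    "GAG": "(GAGTG)n",
    "HMI": "High artifact island",
    "LMI": "Low artifact island",
    "LSU": "Large subunit rRNA Hsa",
    "snR": "Small nuclear RNA",
    "SSU": "Small subunit rRNA Hsa",
    "STL": "Satellite repeat",
    "TAR": "TAR1",
    "TII": "HSATII (Human satellite II DNA)",
    "TLM": "Telomeric repeat",
}


def describe_warnings(codes):
    """Return (code, meaning) pairs for a list of mappability codes.

    Codes are case-sensitive ("snR" is mixed case in the table).
    Codes missing from the table are reported as "Unknown code".
    """
    return [(code, MAPPABILITY_CODES.get(code, "Unknown code")) for code in codes]


if __name__ == "__main__":
    for code, meaning in describe_warnings(["CHM", "TLM"]):
        print(f"{code}: {meaning}")
```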
														 |  |