Variant Call Format - Wikipedia

The Variant Call Format or VCF is a standard text file format used in bioinformatics for storing gene sequence or DNA sequence variations. The format was developed in 2010 for the 1000 Genomes Project and has since been used by other large-scale genotyping and DNA sequencing projects.[1][2] VCF is a common output format for variant calling programs due to its relative simplicity and scalability.[3][4] Many tools have been developed for editing and manipulating VCF files, including VCFtools, which was released in conjunction with the VCF format in 2011, and BCFtools, which was included as part of SAMtools until being split into an independent package in 2014.[1][5]

The standard is currently at version 4.5,[6][7] although the 1000 Genomes Project has developed its own specification for structural variations such as duplications, which are not easily accommodated by the existing schema.[8]

Additional file formats have been developed based on VCF, including genomic VCF (gVCF). gVCF is an extended format which includes additional information about "blocks" that match the reference and their qualities.[9][10]
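
As a rough illustration of the reference "blocks" described above, the following Python sketch parses two hand-written, tab-separated records and checks which one looks like a gVCF reference block. It rests on several assumptions that are not stated in this article: that the eight fixed VCF columns are CHROM, POS, ID, REF, ALT, QUAL, FILTER and INFO; that a reference block is marked by an END key in the INFO column together with only symbolic ALT alleles such as <NON_REF> (a convention used by some variant callers, and not the only one); and that the sample records, function names and values are purely hypothetical.

```python
# Minimal sketch, not a full VCF/gVCF parser. Header lines, sample columns
# and most of the specification are deliberately ignored.

# Hypothetical records written for illustration only (tab-separated).
SAMPLE_LINES = [
    "chr1\t100\t.\tA\tG,<NON_REF>\t50\tPASS\tDP=30",
    "chr1\t101\t.\tT\t<NON_REF>\t.\tPASS\tEND=150",
]

# The eight fixed columns of a VCF data line.
VCF_COLUMNS = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def parse_info(info_field):
    """Split the semicolon-separated INFO column into a dict.

    Keys without a value (flags) are mapped to True.
    """
    info = {}
    if info_field == ".":
        return info
    for item in info_field.split(";"):
        key, sep, value = item.partition("=")
        info[key] = value if sep else True
    return info


def parse_record(line):
    """Parse the fixed columns of one data line into a dict."""
    fields = line.rstrip("\n").split("\t")
    record = dict(zip(VCF_COLUMNS, fields[:8]))
    record["POS"] = int(record["POS"])
    record["ALT"] = record["ALT"].split(",")
    record["INFO"] = parse_info(record["INFO"])
    return record


def is_reference_block(record):
    """Assumed convention: END is present and every ALT allele is symbolic."""
    only_symbolic = all(
        alt.startswith("<") and alt.endswith(">") for alt in record["ALT"]
    )
    return "END" in record["INFO"] and only_symbolic


def block_length(record):
    """Number of reference positions covered, counting POS and END inclusively."""
    return int(record["INFO"]["END"]) - record["POS"] + 1


records = [parse_record(line) for line in SAMPLE_LINES]
for rec in records:
    kind = "reference block" if is_reference_block(rec) else "variant site"
    print(rec["CHROM"], rec["POS"], kind)
```

Under these assumptions the first record is treated as an ordinary variant site and the second as a block spanning positions 101 through 150. Real files should be read with an established library or tool such as BCFtools rather than ad hoc string splitting.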
