The phase 3 structural variant dataset

The 1000 Genomes Project SV group produced an expanded dataset of structural variation for the individuals in phase 3 of the 1000 Genomes Project.

The VCF files for the SV dataset in GRCh37 coordinates can be found on the FTP site. The directory that holds them contains a README which explains the contents of the VCF files and supporting information, and provides a complete list of the differences between the 1000 Genomes Project Consortium Phase 3 paper and the Structural Variation Consortium Companion paper.
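As a quick way to get an overview of a structural variant VCF, the records can be tallied by variant class. The sketch below is a minimal example, not part of the project's own tooling. It assumes the records carry the `SVTYPE` INFO key that the VCF specification reserves for structural variants; the README remains the authoritative description of the fields actually used. The function name `count_sv_types` and the example records are made up for illustration.

```python
from collections import Counter


def count_sv_types(vcf_lines):
    """Count records per SVTYPE value in an iterable of VCF text lines."""
    counts = Counter()
    for line in vcf_lines:
        line = line.rstrip("\n")
        # Skip blank lines, meta-information lines (##) and the column header (#CHROM).
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record
        # INFO is the eighth column: semicolon-separated KEY=VALUE pairs or bare flags.
        info = {}
        for entry in fields[7].split(";"):
            key, _, value = entry.partition("=")
            info[key] = value
        # Records without an SVTYPE key are grouped under "UNKNOWN" (an assumption of this sketch).
        counts[info.get("SVTYPE", "UNKNOWN")] += 1
    return counts


# Made-up records for illustration only; they are not taken from the dataset.
example_lines = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t10000\tsv_example_1\tA\t<DEL>\t.\tPASS\tSVTYPE=DEL;END=12000",
    "1\t50000\tsv_example_2\tC\t<DUP>\t.\tPASS\tSVTYPE=DUP;END=58000",
    "2\t70000\tsv_example_3\tG\t<DEL>\t.\tPASS\tIMPRECISE;SVTYPE=DEL;END=70500",
]

sv_counts = count_sv_types(example_lines)
print(dict(sv_counts))
```

For a downloaded, gzip-compressed VCF, the same function can be given an open file handle, for example `gzip.open(path, "rt")` from the Python standard library, where `path` is wherever the file was saved locally.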

The 1000 Genomes Structural Variation dataset was built and validated using several different raw datasets, listed below with their archive accessions and locations on the FTP site.

| Type | Archive accession | Data on the FTP site |
|---|---|---|
| Short-read Illumina WGS sequencing | * | phase3/20130502.phase3.analysis.sequence.index |
| Complete Genomics WGS sequencing | * | phase3/20130725.phase3.cg_sra.index |
| PCR-free Illumina WGS sequencing | SRP047053 | release/20130502/../high_.._alignments/20141118_high_coverage.alignment.index |
| Moleculo WGS NA12878 | | phase3/integrated_sv_map/supporting/NA12878/moleculo |
| PacBio sequencing NA12878 | SRX638310 | phase3/integrated_sv_map/supporting/NA12878/pacbio |
| PacBio sequencing CHM1 | SRX533609 | phase3/integrated_sv_map/supporting/CHM1 |
| Agilent 1M aCGH microarray | GSE70188 | phase3/integrated_sv_map/supporting/acgh/ |
| Illumina Omni2.5 microarray | | release/20130502/supporting/hd_genotype_chip/ |
| Affymetrix SNP Array 6.0 | | release/20130502/supporting/hd_genotype_chip/coriell_affy6_intensities/ |
| Targeted PacBio sequencing | ERS661321, ERS661355, ERS661356, ERS661358+ | |
| Targeted MinION sequencing | ERS661358, ERS661406+ | |

The phase 3 structural variants are also available mapped to GRCh38 coordinates on the FTP site.