The 1000 Genomes Project SV group produced an expanded dataset of structural variation for the individuals in phase 3 of the 1000 Genomes Project.
The VCF files for the SV dataset, in GRCh37 coordinates, are available at ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/phase3/integrated_sv_map/. This directory contains a README that explains the contents of the VCF files and the supporting information, and provides a complete list of the differences between the 1000 Genomes Project Consortium Phase 3 paper and the Structural Variation Consortium Companion paper.
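As a minimal sketch of how a record from one of these VCF files might be inspected, the Python function below pulls the position and the `SVTYPE` and `END` INFO keys out of a single data line. `SVTYPE` and `END` are reserved INFO keys for structural variants in the VCF specification; whether every record in these files carries them is an assumption to check against the README. The function name `parse_sv_record` and the example line are hypothetical and are not taken from the dataset.

```python
def parse_sv_record(line):
    """Parse one tab-separated VCF data line into a small dict.

    Assumes the standard VCF column order: CHROM, POS, ID, REF, ALT,
    QUAL, FILTER, INFO, followed by optional genotype columns.
    """
    fields = line.rstrip("\n").split("\t")
    chrom, pos, var_id, ref, alt = fields[:5]

    # INFO is a semicolon-separated list of KEY=VALUE pairs and bare flags.
    info = {}
    for entry in fields[7].split(";"):
        key, sep, value = entry.partition("=")
        info[key] = value if sep else True

    end = info.get("END")
    return {
        "chrom": chrom,
        "start": int(pos),
        "id": var_id,
        "alt": alt,
        "svtype": info.get("SVTYPE"),
        # END is only converted when it was given as KEY=VALUE.
        "end": int(end) if isinstance(end, str) else None,
    }


# Hypothetical record for illustration only, not copied from the dataset.
example = "1\t1000\tEXAMPLE_DEL_1\tA\t<DEL>\t.\tPASS\tSVTYPE=DEL;END=2000"
record = parse_sv_record(example)
print(record["svtype"], record["start"], record["end"])
```

For real analyses an established VCF parser is likely a better choice than hand-rolled splitting; this sketch only illustrates where the SV-specific fields sit in a record.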
The 1000 Genomes Structural Variation dataset was built and validated using several different raw datasets, listed in the table below.
Type | Archive accession | Data on the FTP site |
---|---|---|
Short-read Illumina WGS sequencing | * | phase3/20130502.phase3.analysis.sequence.index |
Complete Genomics WGS sequencing | * | phase3/20130725.phase3.cg_sra.index |
PCR-free Illumina WGS sequencing | SRP047053 | release/20130502/../high_.._alignments/20141118_high_coverage.alignment.index |
Moleculo WGS NA12878 | | phase3/integrated_sv_map/supporting/NA12878/moleculo |
PacBio sequencing NA12878 | SRX638310 | phase3/integrated_sv_map/supporting/NA12878/pacbio |
PacBio sequencing CHM1 | SRX533609 | phase3/integrated_sv_map/supporting/CHM1 |
Agilent 1M aCGH microarray | GSE70188 | phase3/integrated_sv_map/supporting/acgh/ |
Illumina Omni2.5 microarray | | release/20130502/supporting/hd_genotype_chip/ |
Affymetrix SNP Array 6.0 | | release/20130502/supporting/hd_genotype_chip/coriell_affy6_intensities/ |
Targeted PacBio sequencing | ERS661321, ERS661355, ERS661356, ERS661358+ | |
Targeted MinION sequencing | ERS661358, ERS661406+ | |
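
The FTP paths in the table are given relative to the root of the project FTP site. A minimal sketch of turning them into full URLs follows, assuming the root is `ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/` (the prefix of the directory URLs quoted in this document). The helper name `ftp_url` is hypothetical.

```python
# Assumed FTP root, taken from the prefix of the directory URLs in this document.
FTP_ROOT = "ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/"


def ftp_url(relative_path):
    """Join a table path such as 'phase3/...' onto the assumed FTP root."""
    return FTP_ROOT + relative_path.lstrip("/")


print(ftp_url("phase3/integrated_sv_map/supporting/acgh/"))
```

Paths that are abbreviated in the table cannot be resolved this way and need to be looked up on the FTP site itself.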
The phase 3 structural variants are also available mapped to GRCh38 coordinates in the FTP directory ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/phase3/integrated_sv_map/supporting/GRCh38_positions/.