How do I find the familial relationships between the individuals?

While the main outputs of the 1000 Genomes Project’s phase three work focused on 2,504 unrelated individuals, we also hold data from related samples. Frequently, these are trios (parents and child), with some families also including further generations.

For the 1000 Genomes Project phase three analysis, the relationships between the samples are recorded in a .ped pedigree file on our FTP site. This is based on both the known relationships and analysis of the data generated for this work.

In some instances, analysis of a given set of data may suggest different relationships from those originally recorded. This can happen for several reasons, such as an error in recording the relationships or an accidental sample swap during data generation. Where such concerns exist for a data set, this is reflected in the pedigree file. Because separate pedigree files accompany analyses of different data sets, they may occasionally list different relationships or concerns from each other, although such cases are very rare.

In general, to understand the relationships between large sets of samples, consulting the accompanying pedigree file is the best approach.
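
As a rough illustration of consulting a pedigree file programmatically, the Python sketch below lists parent-child trios. It assumes the common tab-separated PED layout, in which the first four columns are family ID, individual ID, paternal ID and maternal ID, and a parent ID of "0" means the parent is not recorded. The sample IDs, the header names and the `find_trios` helper are hypothetical and are not taken from our files, so check the column layout of the pedigree file you download before relying on it.

```python
import csv
import io

# Hypothetical excerpt in the assumed PED layout (tab-separated):
# family ID, individual ID, paternal ID, maternal ID, sex, phenotype.
# A parent ID of "0" is assumed to mean "parent not recorded".
SAMPLE_PED = (
    "Family ID\tIndividual ID\tPaternal ID\tMaternal ID\tGender\tPhenotype\n"
    "FAM1\tCHILD1\tDAD1\tMOM1\t1\t0\n"
    "FAM1\tDAD1\t0\t0\t1\t0\n"
    "FAM1\tMOM1\t0\t0\t2\t0\n"
    "FAM2\tSOLO1\t0\t0\t2\t0\n"
)


def find_trios(handle):
    """Return (child, father, mother) tuples for rows that list both parents."""
    trios = []
    for row in csv.reader(handle, delimiter="\t"):
        # Skip short rows, comment lines and the (assumed) header row.
        if len(row) < 4 or row[0].startswith("#") or row[0] == "Family ID":
            continue
        _family, child, father, mother = row[:4]
        if father != "0" and mother != "0":
            trios.append((child, father, mother))
    return trios


trios = find_trios(io.StringIO(SAMPLE_PED))
print(trios)
```

To run this against a downloaded pedigree file, pass an open file handle to `find_trios` in place of the in-memory sample. Any relationship concerns noted in the file would need to be read from its additional columns, which this sketch ignores.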

Relationship information is also recorded in our data portal for all samples. In addition, Coriell and CEPH hold sample relationship information along with the cell lines.
