1000G-ONT image


The 1000 Genomes Project ONT Sequencing Consortium

The 1000 Genomes Project ONT Sequencing Consortium (1KGP-ONT) is building on the landmark work done by the 1000 Genomes Project (1KGP), which began in 2008 as a collaborative initiative to establish a database of normal human genetic variation by sequencing the genomes of over a thousand healthy individuals from diverse ancestries. In the end, the 1KGP study sequenced over 3,000 genomes using short-read sequencing, and it continues to provide invaluable insights into human genetic diversity.

The 1KGP-ONT kicked off on Thursday, June 30, 2022. Funding permitting, we hope to perform long-read sequencing of all 1KGP samples, which are available as DNA or cell lines from the NHGRI's Sample Repository for Human Genetic Research housed at the Coriell Institute for Medical Research. To obtain high-coverage, high-quality long-read assemblies, we are isolating high molecular weight DNA directly from cell culture of 1KGP cell lines obtained from Coriell.

The goal of the 1KGP-ONT Consortium is to identify a broader spectrum of genomic variation than is possible using short-read sequencing so that we may further improve our understanding of human genetic disease. This dataset is already enabling us to better understand normal patterns of human structural variation, identify variation in difficult-to-map regions of the genome, and study repeat expansions and methylation patterns. Analysis of the first 100 genomes sequenced in this project is available as a preprint on medRxiv.


The People

We are a collaborative group of researchers from around the world interested in leveraging long-read sequencing to better understand the normal patterns of structural variation, methylation, and repeat expansion in the population so we can more effectively identify missing disease-causing variation in individuals. The project is led by Danny Miller and Evan Eichler at the University of Washington, and cell culture and DNA extraction are performed in the Miller and Eichler labs. Sequencing is performed at the University of Washington, the New York Genome Center, and Stanford University. Individuals and institutions contributing to this work are listed below.

Zachery Anderson
UNIVERSITY OF WASHINGTON
Anna O Basile
NEW YORK GENOME CENTER
Wayne E Clarke
NEW YORK GENOME CENTER
André Corvelo
NEW YORK GENOME CENTER
Nikhita Damaraju
UNIVERSITY OF WASHINGTON
Harriet Dashnow
UNIVERSITY OF UTAH / UNIVERSITY OF COLORADO SCHOOL OF MEDICINE
Wouter De Coster
UNIVERSITY OF ANTWERP
Evan E Eichler
UNIVERSITY OF WASHINGTON
Erik Garrison
UNIVERSITY OF TENNESSEE HEALTH SCIENCE CENTER
Sophia B Gibson
UNIVERSITY OF WASHINGTON
Joy Goffena
UNIVERSITY OF WASHINGTON
Claudia Gonzaga-Jauregui
UNIVERSIDAD NACIONAL AUTÓNOMA DE MÉXICO
Sara Goodwin
COLD SPRING HARBOR LABORATORY
Andrea Guarracino
UNIVERSITY OF TENNESSEE HEALTH SCIENCE CENTER
Jonas A Gustafson
UNIVERSITY OF WASHINGTON
Adrienne Helland
NEW YORK GENOME CENTER
Kendra Hoekzema
UNIVERSITY OF WASHINGTON
Miten Jain
NORTHEASTERN UNIVERSITY
Tanner D Jensen
STANFORD UNIVERSITY
Mikhail Kolmogorov
NATIONAL CANCER INSTITUTE, NIH
Qiuhui Li
JOHNS HOPKINS UNIVERSITY
Matthew Loose
UNIVERSITY OF NOTTINGHAM
W Richard McCombie
COLD SPRING HARBOR LABORATORY
Richard N McLaughlin Jr
PACIFIC NORTHWEST RESEARCH INSTITUTE / UNIVERSITY OF WASHINGTON
Angela L Miller
UNIVERSITY OF WASHINGTON
Danny E Miller
UNIVERSITY OF WASHINGTON
Stephen B Montgomery
STANFORD UNIVERSITY
Rajeeva Lochan Musunuri
NEW YORK GENOME CENTER
Nathan D Olson
NATIONAL INSTITUTE OF STANDARDS AND TECHNOLOGY
Cate R Paschal
SEATTLE CHILDREN'S HOSPITAL
Karynne E Patterson
UNIVERSITY OF WASHINGTON
Catherine E Reeves
NEW YORK GENOME CENTER
Mahler Revsine
JOHNS HOPKINS UNIVERSITY
Phillip A Richmond
ALAMYA HEALTH
Esther Robb
STANFORD UNIVERSITY
Michael C Schatz
JOHNS HOPKINS UNIVERSITY
Fritz J Sedlazeck
BAYLOR COLLEGE OF MEDICINE, RICE UNIVERSITY
Maisha Sinha
UNIVERSITY OF WASHINGTON
Anthony A Snead
NEW YORK UNIVERSITY
Sophie HR Storz
UNIVERSITY OF WASHINGTON
David Twesigomwe
UNIVERSITY OF THE WITWATERSRAND
Rachel A Ungar
STANFORD UNIVERSITY
Sydney A Ward
UNIVERSITY OF WASHINGTON
Lei Yang
PACIFIC NORTHWEST RESEARCH INSTITUTE
Christina Zakarian
UNIVERSITY OF WASHINGTON
Miranda PG Zalusky
UNIVERSITY OF WASHINGTON
Michael C Zody
NEW YORK GENOME CENTER
Justin M Zook
NATIONAL INSTITUTE OF STANDARDS AND TECHNOLOGY

The Technology

Long-read sequencing is being performed on the Oxford Nanopore platform. This technology works by measuring changes in current as single-stranded DNA or RNA molecules pass through a protein pore. The first 100 samples were sequenced on the R9.4.1 pore, and subsequent samples will be sequenced using the R10 chemistry.


The Data

The 1KGP-ONT Consortium is committed to publicly releasing data as they are generated, after basecalling and standard QC.

Analysis of the first 100 genomes sequenced in this project is available as a preprint on medRxiv.

The initial 100 samples:

  • Represent all 5 superpopulations and 19 subpopulations
  • Have yielded an average sequence read N50 of 54 kbp and 37x depth of coverage
  • Have identified ~24,500 high-confidence structural variants per genome
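
For readers unfamiliar with the metrics above, the following is a minimal sketch of how a read N50 and an approximate depth of coverage can be computed from a list of read lengths. It is an illustration only, not the consortium's QC pipeline: the function names `read_n50` and `mean_depth` are hypothetical, and the depth estimate simply divides total sequenced bases by an assumed genome size rather than using aligned bases.

```python
def read_n50(lengths):
    """Return the read N50 for a list of read lengths (in bases).

    The N50 is the length L such that reads of length >= L together
    contain at least half of all sequenced bases.
    """
    if not lengths:
        raise ValueError("at least one read length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest read down, accumulating bases until
    # half of the total yield has been covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def mean_depth(lengths, genome_size):
    """Approximate depth of coverage as total bases / genome size.

    genome_size is an assumption supplied by the caller (roughly
    3.1e9 bases for a human genome); a real estimate would be
    based on aligned bases instead.
    """
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return sum(lengths) / genome_size


if __name__ == "__main__":
    toy_lengths = [2, 3, 4, 5, 6]  # toy values, not project data
    print(read_n50(toy_lengths))
    print(mean_depth(toy_lengths, genome_size=10))
```

Because the N50 is weighted by bases rather than by read count, a small number of very long reads can raise it substantially, which is why it is commonly reported for long-read data alongside depth.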

Raw sequencing data, processed data, and summary data can be found here.


Join Us

The consortium is open to all. Please contact Danny Miller if you would like to join the consortium and be added to the Slack group.