TY - JOUR
T1 - A novel canine reference genome resolves genomic architecture and uncovers transcript complexity
AU - Wang, Chao
AU - Wallerman, Ola
AU - Arendt, Maja Louise
AU - Sundström, Elisabeth
AU - Karlsson, Åsa
AU - Nordin, Jessika
AU - Mäkeläinen, Suvi
AU - Pielberg, Gerli Rosengren
AU - Hanson, Jeanette
AU - Ohlsson, Åsa
AU - Saellström, Sara
AU - Rönnberg, Henrik
AU - Ljungvall, Ingrid
AU - Häggström, Jens
AU - Bergström, Tomas F.
AU - Hedhammar, Åke
AU - Meadows, Jennifer R.S.
AU - Lindblad-Toh, Kerstin
PY - 2021
Y1 - 2021
AB - We present GSD_1.0, a high-quality domestic dog reference genome with chromosome-length scaffolds and contiguity increased 55-fold over CanFam3.1. Annotation with generated and existing long- and short-read RNA-seq, miRNA-seq and ATAC-seq revealed that 32.1% of lifted-over CanFam3.1 gaps harboured previously hidden functional elements, including promoters, genes and miRNAs, in GSD_1.0. A catalogue of canine “dark” regions was made to facilitate mapping rescue. Alignment in these regions is difficult, but we demonstrate that they harbour trait-associated variation. Key genomic regions were completed, including the Dog Leucocyte Antigen (DLA), T Cell Receptor (TCR) and 366 COSMIC cancer genes. 10x linked-read sequencing of 27 dogs (19 breeds) uncovered 22.1 million SNPs, indels and larger structural variants. Subsequent intersection with protein-coding genes showed that 1.4% of these could directly influence gene products, and so provide a source of normal or aberrant phenotypic modifications.
U2 - 10.1038/s42003-021-01698-x
DO - 10.1038/s42003-021-01698-x
M3 - Journal article
C2 - 33568770
AN - SCOPUS:85100970715
VL - 4
JO - Communications Biology
JF - Communications Biology
SN - 2399-3642
M1 - 185
ER -