Office of Research and Development Publications

Bringing the fathead minnow into the genomic era


Saari, T., Dan Villeneuve, G. Ankley, F. Burns, A. Cogburn, S. Deschamps, R. Jackson, R. Hoke, AND A. Schroeder. Bringing the fathead minnow into the genomic era. Midwest SETAC, Madison, WI, March 14 - 16, 2016.




The fathead minnow is a well-established ecotoxicological model organism that has been widely used for regulatory ecotoxicity testing and research for more than half a century. Although a large amount of molecular information has been gathered on the fathead minnow over the years, the lack of genomic sequence data has limited its utility for certain applications. To address this limitation, high-throughput Illumina sequencing technology was employed to sequence the fathead minnow genome. Approximately 100X coverage was achieved by sequencing several paired-end read libraries with differing insert sizes. Two draft genome assemblies were generated, one with SOAPdenovo and one with String Graph Assembler (SGA). When the two were compared, the SOAPdenovo assembly had the higher scaffold N50 (60.4 kbp versus 15.4 kbp) and also performed better in a Core Eukaryotic Genes Mapping Approach (CEGMA) analysis, mapping 91% versus 67% of genes. This assembly was therefore selected for further development and annotation. The foundation for genome annotation was generated using AUGUSTUS, an ab initio method for gene prediction. A total of 43,345 potential coding sequences were predicted on the genome assembly. These predicted sequences were translated to peptides and queried in a BLAST search against all vertebrates; 28,290 of them corresponded to zebrafish peptides and 5,242 produced no significant alignments. Additional types of sequence data have also been layered onto the fathead minnow genome assembly to provide evidence of gene structures and other sequence elements. To this end, 240,000 fathead minnow expressed sequence tags (ESTs) and nearly 7,000 full-length zebrafish coding sequences (CDSs) were aligned to the genome assembly, with 73% and 38% aligning successfully, respectively.
A fathead minnow genome browser that provides accessible, visual integration of these sequence datasets into a cohesive knowledge base is being developed. Completion of this work will provide a valuable resource for future ecotoxicology studies using the fathead minnow.
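
For readers unfamiliar with the scaffold N50 statistic used above to compare the two assemblies, the following is a minimal sketch of how it is commonly computed: scaffolds are sorted from longest to shortest, and N50 is the length of the scaffold at which the running total first reaches half of the total assembly length. The function name `n50` and the example lengths are illustrative assumptions; they are not taken from the fathead minnow assemblies or from the authors' pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths (in bp).

    Hypothetical helper for illustration only; not the authors' code.
    """
    if not lengths:
        raise ValueError("need at least one scaffold length")
    # Sort longest first so the largest scaffolds are counted first.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first scaffold that pushes the running total to at least
        # half of the assembly length defines N50.
        if running >= half_total:
            return length


# Made-up scaffold lengths, not data from the abstract.
example_lengths = [80, 70, 50, 40, 30, 20]
print(n50(example_lengths))
```

A higher N50 means that half of the assembled sequence sits in longer scaffolds, which is why it is often read as a rough indicator of assembly contiguity alongside completeness checks such as CEGMA.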

Record Details:

Product Published Date: 03/16/2016
Record Last Revised: 03/21/2016
OMB Category: Other
Record ID: 311435