Page Manager: Webmaster
Last update: 9/11/2012 3:13 PM

University of Gothenburg, Sweden

Improved software detection and extraction of ITS1 and ITS2 from ribosomal ITS sequences of fungi and other eukaryotes for analysis of environmental sequencing data

Journal article
Authors Johan Bengtsson-Palme
Martin Ryberg
Martin Hartmann
Sara Branco
Zheng Wang
Anna Godhe
Pierre De Wit
Marisol Sánchez-García
Ingo Ebersberger
Filipe de Sousa
Anthony S. Amend
Ari Jumpponen
Martin Unterseher
Erik Kristiansson
Kessy Abarenkov
Yann Bertrand
Kemal Sanli
Karl Martin Eriksson
Unni Vik
Vilmar Veldre
R. Henrik Nilsson
Published in Methods in Ecology and Evolution
Volume 4
Issue 10
Pages 914-919
ISSN 2041-210X
Publication year 2013
Published at Institute of Neuroscience and Physiology, Department of Physiology
Department of Mathematical Sciences, Mathematical Statistics
Department of Biological and Environmental Sciences
Language en
Keywords Fungi, molecular ecology, next-generation sequencing, Perl, ribosomal DNA
Subject categories Computer Science, Bioinformatics (Computational Biology), Microbiology, Botany, Zoology, Bioinformatics and Systems Biology, Ecology, Biological Systematics, Soil biology, Environmental Sciences related to Agriculture and Land-use


The nuclear ribosomal internal transcribed spacer (ITS) region is the primary choice for molecular identification of fungi. Its two highly variable spacers (ITS1 and ITS2) are usually species specific, whereas the intercalary 5.8S gene is highly conserved. For sequence clustering and BLAST searches, it is often advantageous to rely on either one of the variable spacers but not the conserved 5.8S gene. Identifying and extracting ITS1 and ITS2 from large taxonomic and environmental data sets is, however, often difficult, and many ITS sequences are incorrectly delimited in the public sequence databases.

We introduce ITSx, a Perl-based software tool to extract ITS1, 5.8S and ITS2 – as well as full-length ITS sequences – from both Sanger and high-throughput sequencing data sets. ITSx uses hidden Markov models computed from large alignments of a total of 20 groups of eukaryotes, including fungi, metazoans and plants, and the sequence extraction is based on the predicted positions of the ribosomal genes in the sequences.

ITSx has a very high proportion of true-positive extractions and a low proportion of false-positive extractions. Additionally, process parallelization permits expedient analyses of very large data sets, such as a one-million-sequence amplicon pyrosequencing data set. ITSx is rich in features and written to be easily incorporated into automated sequence analysis pipelines.

ITSx paves the way for more sensitive BLAST searches and sequence clustering operations for the ITS region in eukaryotes. The software also permits elimination of non-ITS sequences from any data set. This is particularly useful for amplicon-based next-generation sequencing data sets, where insidious non-target sequences are often found among the target sequences. Such non-target sequences are difficult to find by other means and would contribute noise to diversity estimates if left in the data set.
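
The abstract states that extraction is based on the predicted positions of the ribosomal genes. The idea can be illustrated with a minimal Python sketch. This is not ITSx's code (ITSx is written in Perl), and it does not run any hidden Markov models: the function name, its parameters and the toy sequence are all hypothetical, and the gene boundaries are simply assumed to have been predicted already. Coordinates are taken to be 0-based and half-open.

```python
from typing import Dict, Optional


def extract_its_regions(
    seq: str,
    ssu_end: Optional[int],    # assumed: predicted end of the upstream SSU (18S) gene
    s58_start: Optional[int],  # assumed: predicted start of the 5.8S gene
    s58_end: Optional[int],    # assumed: predicted end of the 5.8S gene
    lsu_start: Optional[int],  # assumed: predicted start of the downstream LSU gene
) -> Dict[str, str]:
    """Slice ITS1, 5.8S, ITS2 and full ITS out of seq from gene boundaries.

    A boundary of None means the gene was not detected; any region that
    depends on a missing or inconsistent boundary is left out of the result.
    """
    regions: Dict[str, str] = {}
    # ITS1 lies between the end of SSU and the start of 5.8S.
    if ssu_end is not None and s58_start is not None and ssu_end < s58_start:
        regions["ITS1"] = seq[ssu_end:s58_start]
    # The conserved 5.8S gene itself.
    if s58_start is not None and s58_end is not None and s58_start < s58_end:
        regions["5.8S"] = seq[s58_start:s58_end]
    # ITS2 lies between the end of 5.8S and the start of LSU.
    if s58_end is not None and lsu_start is not None and s58_end < lsu_start:
        regions["ITS2"] = seq[s58_end:lsu_start]
    # Full-length ITS spans ITS1, 5.8S and ITS2.
    if ssu_end is not None and lsu_start is not None and ssu_end < lsu_start:
        regions["full_ITS"] = seq[ssu_end:lsu_start]
    return regions


# Toy sequence: SSU tail, ITS1, 5.8S, ITS2, LSU head (made-up bases).
toy = "GGGG" + "ATATA" + "CCCC" + "TATATA" + "GGG"
regions = extract_its_regions(toy, ssu_end=4, s58_start=9, s58_end=13, lsu_start=19)

# A read that ends before LSU: only ITS1 and 5.8S can be delimited.
partial = extract_its_regions(toy[:13], ssu_end=4, s58_start=9, s58_end=13, lsu_start=None)
```

In this sketch a read in which no flanking gene is found yields an empty result, which mirrors in a simplified way how the absence of predicted ribosomal genes could be used to flag non-ITS sequences.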
