Non-coding RNA annotation
If the group or consortium in charge of a sequencing project has annotated ncRNA genes alongside its protein-coding gene predictions, these annotations are imported, either through GFF import or through INSDC import.
If no ncRNA annotations are available, we run our own ncRNA prediction pipelines.
Ensembl Genomes prediction pipelines
For all ncRNA genes except tRNA and rRNA genes, models are predicted by aligning the genomic sequence against Rfam sequences using BLASTN. The BLAST hits are then used to seed Infernal searches of the aligned regions with the corresponding Rfam covariance models. This reduces the search space, since scanning the entire genome with every Rfam covariance model would be extremely CPU-intensive.
See Burge S.W. et al. (2013) Rfam 11.0: 10 years of RNA families. Nucl. Acids Res. 41 D226-D232.
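The seeding step can be pictured as interval merging: each BLAST hit is padded, and overlapping hits for the same Rfam family are collapsed into a single region to pass to the covariance model search. The sketch below is illustrative only and is not the pipeline's actual code. The `Hit` record, the `seed_regions` function, the padding value, and the example accessions and coordinates are all assumptions made for the example.

```python
from collections import defaultdict
from typing import NamedTuple


class Hit(NamedTuple):
    """One BLASTN hit (hypothetical record layout, 1-based inclusive coordinates)."""
    family: str   # Rfam accession of the family the query sequence belongs to
    seq_id: str   # genomic sequence that was hit
    start: int
    end: int


def seed_regions(hits, padding=100):
    """Merge padded hits into non-overlapping regions per (family, sequence).

    The padding value is an assumption; regions are clamped at position 1
    but not at the sequence end, which a real pipeline would also need.
    """
    grouped = defaultdict(list)
    for hit in hits:
        lo, hi = sorted((hit.start, hit.end))  # minus-strand hits have start > end
        grouped[(hit.family, hit.seq_id)].append((max(1, lo - padding), hi + padding))

    regions = {}
    for key, intervals in grouped.items():
        intervals.sort()
        merged = [intervals[0]]
        for lo, hi in intervals[1:]:
            last_lo, last_hi = merged[-1]
            if lo <= last_hi:
                # Overlaps the previous region: extend it.
                merged[-1] = (last_lo, max(last_hi, hi))
            else:
                merged.append((lo, hi))
        regions[key] = merged
    return regions


# Made-up hits for illustration.
hits = [
    Hit("RF00026", "chr1", 1000, 1070),
    Hit("RF00026", "chr1", 1150, 1050),
    Hit("RF00026", "chr1", 5000, 5080),
    Hit("RF00017", "chr1", 50, 160),
]
regions = seed_regions(hits, padding=100)
```

Each resulting region would then be searched only with the covariance model of its own family, rather than scanning the whole genome with every model.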
tRNA genes are predicted using tRNAscan-SE version 1.23, configured for the appropriate superregnum (domain of life).
See Lowe T.M. and Eddy S.R. (1997) tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucl. Acids Res. 25 955-964.
rRNA genes are predicted using RNAmmer version 1.2, configured for the appropriate superregnum (domain of life).
See Lagesen K. et al. (2007) RNAmmer: consistent and rapid annotation of ribosomal RNA genes. Nucl. Acids Res. 35 3100-3108.
Non-coding RNA biotype
The following non-coding RNA gene types are annotated, along with pseudogenes.
Note that Rfam contains many more families, but those classified as motifs (e.g. the SECIS element motif, RF00031) are filtered out by our ncRNA gene prediction pipeline.
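A minimal sketch of such a filter is shown below. It is not the pipeline's actual code: the exclusion set, the `drop_motif_hits` function, and the dictionary layout of a hit are assumptions made for illustration. Only RF00031 is taken from the text above.

```python
# Hypothetical exclusion list; RF00031 (SECIS element) is the example given in the text.
MOTIF_ACCESSIONS = {"RF00031"}


def drop_motif_hits(hits, motif_accessions=MOTIF_ACCESSIONS):
    """Keep only hits whose Rfam accession is not in the motif exclusion set."""
    return [hit for hit in hits if hit["rfam_acc"] not in motif_accessions]


# Made-up hits for illustration.
hits = [
    {"rfam_acc": "RF00031", "seq_id": "chr1", "start": 200, "end": 260},
    {"rfam_acc": "RF00026", "seq_id": "chr1", "start": 900, "end": 1005},
]
kept = drop_motif_hits(hits)
```

In this example only the RF00026 hit survives, so no gene model would be produced for the SECIS element.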