We present a catalogue of 52,075 chicken genes enriched in long non-coding RNAs (LNC), built by extending the Ensembl reference annotation (v94, Gallus_gallus-5.0 assembly) with LNC from three public databases (NCBI, NONCODE, ALDB), new genes from the FR-AgENCODE annotation (Foissac et al., 2019), and genes modeled in this work from 364 RNA-seq samples. Relative to the Ensembl v94 reference, the catalogue grows from 4,643 LNC and 18,346 protein-coding genes (PCG) to 30,084 LNC and 19,545 PCG. The resulting GTF annotation file v94-gg5 is available here.

These LNC were classified relative to their closest PCG (resulting LNC classification here). Gene expression values and tissue specificity were computed across 25 tissues (resulting expression here and here).
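As an illustration of how a tissue-specificity score can be derived from per-tissue expression values, the sketch below computes the tau index, a commonly used measure that ranges from 0 (evenly expressed in all tissues) to 1 (expressed in a single tissue). This is an assumption made for the example: this page does not state which specificity metric was used, and the function name `tau` and the input values are hypothetical.

```python
def tau(expression):
    """Tau tissue-specificity index for one gene.

    `expression` holds one non-negative, normalised expression value
    per tissue (e.g. TPM). Assumption: this metric is illustrative and
    is not necessarily the one used to build the catalogue.
    """
    n = len(expression)
    if n < 2:
        raise ValueError("need at least two tissues")
    peak = max(expression)
    if peak <= 0:
        return None  # gene not expressed anywhere: index is undefined
    # Average distance of each tissue from the most expressing tissue.
    return sum(1 - x / peak for x in expression) / (n - 1)


# Hypothetical expression profiles over four tissues.
print(tau([5.0, 5.0, 5.0, 5.0]))  # evenly expressed
print(tau([0.0, 0.0, 0.0, 8.0]))  # restricted to one tissue
```

With 25 tissues, each gene would contribute a list of 25 values, and genes with a score close to 1 would be reported as tissue-specific.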

We also generated an extended version of the Ensembl v100 annotation associated with the Gallus gallus 6 reference genome (GRCg6a), adding gene models that do not overlap any gene of the Ensembl v100 annotation. The resulting GTF annotation file v100-GRCg6a is available here. It contains the 24,356 genes of the Ensembl v100 annotation (16,878 PCG, 5,506 LNC and 1,972 others) extended by 18,994 LNC gene models.
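The gene counts quoted for the GTF files can be checked with a short script. The sketch below tallies `gene` features by biotype, assuming the file follows the Ensembl GTF convention of a `gene_biotype "..."` attribute in the ninth column. Whether the extended files keep that attribute, carry a `gene` feature line for every gene, or use the biotype labels shown here is an assumption; the function name and the demo records are hypothetical.

```python
import re
from collections import Counter

# Ensembl-style attribute, e.g.: gene_biotype "protein_coding";
BIOTYPE_RE = re.compile(r'gene_biotype "([^"]+)"')


def count_gene_biotypes(lines):
    """Count GTF `gene` features per biotype from an iterable of lines."""
    counts = Counter()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header comments and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "gene":
            continue  # keep gene-level records only
        match = BIOTYPE_RE.search(fields[8])
        counts[match.group(1) if match else "unknown"] += 1
    return counts


# Hypothetical records, for illustration only.
demo = [
    "#!genome-build GRCg6a\n",
    '1\tensembl\tgene\t100\t900\t.\t+\t.\tgene_id "G1"; gene_biotype "protein_coding";\n',
    '1\tensembl\texon\t100\t300\t.\t+\t.\tgene_id "G1"; gene_biotype "protein_coding";\n',
    '1\tmodel\tgene\t2000\t2600\t.\t-\t.\tgene_id "G2"; gene_biotype "lncRNA";\n',
    '2\tmodel\tgene\t50\t400\t.\t+\t.\tgene_id "G3";\n',
]
print(count_gene_biotypes(demo))
```

To run it on a downloaded annotation, pass an open file handle instead of the demo list, for example `count_gene_biotypes(open(path))` where `path` points to the local GTF file. If every gene has a `gene` feature line, the v100-GRCg6a total would be expected to equal 24,356 + 18,994 = 43,350.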

Reference: Jehl et al., 2020 (to come soon)

French National Institute for Agricultural Research

Functional Annotation of Animal Genomes