Deciphering the genome structure of Theobroma cacao (W107)

2011 
Theobroma cacao L. is a diploid fruit tree species that originated in the South American rainforests and constitutes an important source of income for farmers in tropical countries. We produced a high-quality draft genome sequence corresponding to 16.7X genome coverage of a Criollo genotype. The assembly corresponds to 76% of the estimated genome size of the T. cacao genotype B97-61/B2 (430 Mbp). This assembly appears to cover a very large proportion of the euchromatin of the T. cacao genome, allowing 97.8% of the unigene resource (38,737 unigenes assembled from 715,457 EST sequences) to be recovered in the genome assembly. Annotation revealed 28,798 protein-coding genes, of which 82% could be anchored on a high-density genetic map. Only 20% of the genome consisted of transposable elements, a significantly lower percentage than in other genomes of similar size. This first cocoa genome sequence supported several genome analyses that revealed specific expansion of some gene families. Comparative mapping of genes involved in disease resistance and quality traits, localised in the genome sequence, and of QTLs related to these traits highlighted several co-localisations between the QTLs and candidate genes potentially involved in variation of useful cocoa traits. This genome sequence will facilitate a better understanding of trait elaboration and will accelerate T. cacao breeding through efficient marker-assisted selection and exploitation of genetic resources. A genome browser provides free access to the cocoa sequence data at the following website: http://cocoagendb.cirad.fr. (Full text)