Quantifying the taxonomic bias in enzymology.

2021 
The ongoing biotechnological revolution is rooted in our knowledge of enzymes. However, metagenomics is showing how little we know about Earth's enzyme repertoire. Deep sequencing has revolutionized our view of the tree of life. The genomes of newly discovered organisms are replete with novel sequences, emphasizing the trove of enzyme structures and functions waiting to be explored by biochemists. Here, we sought to draw attention to the vastness of the "enzymatic dark matter" within the tree of life by placing enzymological knowledge in the context of phylogeny. We used kinetic parameters from the BRaunschweig ENzyme DAtabase (BRENDA) as our proxy for enzymological knowledge. Mapping 12,677 BRENDA entries onto the phylogenetic tree revealed that 55% of these data were from eukaryotes, even though eukaryotes are the least diverse part of the tree. At the next taxonomic level, only four of 18 archaeal phyla and 24 of 111 bacterial phyla are represented in the BRENDA dataset. One phylum, the Proteobacteria, accounts for over half of all bacterial entries. Similarly, the supergroup Amorphea, which includes animals and fungi, contains over half the data on eukaryotes. Many major taxonomic groups are notable for their complete absence from BRENDA, including the ultra-diverse bacterial Candidate Phyla Radiation. At the species level, five mammals (including human) contribute 15% of BRENDA entries. The taxonomic bias in enzymology is strong, but in the era of gene synthesis we now have the tools to address it. Doing so promises to enrich our biochemical understanding of life and uncover powerful new biocatalysts.
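The phylum-level counts quoted in the abstract can be turned into coverage percentages with a few lines of arithmetic. The sketch below is only an illustration and is not the authors' analysis code; it uses no data beyond the two counts stated above, and the names `PHYLA` and `coverage_percent` are hypothetical.

```python
# Phylum-level counts quoted in the abstract:
# (phyla represented in the BRENDA dataset, total phyla considered).
PHYLA = {
    "Archaea": (4, 18),
    "Bacteria": (24, 111),
}


def coverage_percent(represented: int, total: int) -> float:
    """Return the percentage of phyla that have at least one entry."""
    return 100.0 * represented / total


# Round to one decimal place for reporting.
coverage = {
    domain: round(coverage_percent(represented, total), 1)
    for domain, (represented, total) in PHYLA.items()
}

for domain, pct in coverage.items():
    print(f"{domain}: {pct}% of phyla represented")
```

On these counts, roughly a fifth of the phyla in each prokaryotic domain have any kinetic data in the dataset.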