Overview

The CEM (Co-evolution based enzyme mining) website has been running since February 2022.

We provide more than 200,000 microbial genomes, which can be downloaded from the NCBI database; microbial whole-genome data selected according to the OGT data collated by Martin K. M. Engqvist et al.; and a filamentous fungus database (https://pub.fungalgenomics.ca/). For each mined candidate thermophilic enzyme, we report information about the microbial genome from which it is derived. A mining run takes no more than ten minutes. Using this enzyme mining platform, we have discovered many thermophilic enzymes, all of which show high thermostability.

We provide three enzyme mining methods, which are based on the microbial optimum growth temperatures, the thermophilic fungus genomes, and the structural RNA GC content.

  • Microbial optimum growth temperatures (OGT): Our website filtered 224799 microbial whole genomes according to the OGT data collated by Martin K. M. Engqvist et al. From these, 1910 microbial genomes with OGT greater than 42°C, or 1672 genomes with OGT greater than 50°C, were selected to establish the database.
  • Thermophilic fungus genomes: Starting from a filamentous fungus database (https://pub.fungalgenomics.ca/), our website selected the organisms whose names contain the word ‘thermo’, together with fungi previously reported as thermophilic, to form a thermophilic fungus database.
  • The structural RNA GC content: Since microbial OGT is positively correlated with the GC content of structural RNAs, our website identified 15271 microbial genomes in the NCBI database for which tRNA annotation and GC content calculation could be performed, and which can be further analyzed by ORF prediction. Across all microbes, the distribution of the average tRNA GC content is centered at 58%. Taking this into account, we selected the microbial genomes with tRNA GC content higher than 61% or 62.5% as the database for thermophilic enzyme mining.
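
The GC-content selection described above can be sketched in a few lines of Python. This is a minimal illustration, not the website's actual pipeline: the function names, the genome labels, and the choice to average per-tRNA GC percentages (rather than pooling all bases) are assumptions made for the example. The 61% threshold comes from the text above.

```python
def gc_content(seq):
    """Return the GC percentage (0-100) of one nucleotide sequence."""
    seq = seq.strip().upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def mean_trna_gc(trna_seqs):
    """Average the per-tRNA GC percentages of one genome (assumed definition)."""
    if not trna_seqs:
        raise ValueError("no tRNA sequences for this genome")
    return sum(gc_content(s) for s in trna_seqs) / len(trna_seqs)


def select_genomes(trna_by_genome, threshold=61.0):
    """Keep genomes whose mean tRNA GC content is strictly above the threshold."""
    selected = {}
    for accession, seqs in trna_by_genome.items():
        mean_gc = mean_trna_gc(seqs)
        if mean_gc > threshold:
            selected[accession] = mean_gc
    return selected


# Hypothetical toy input: genome label -> annotated tRNA sequences.
toy = {"genome_A": ["GGCC", "GGCA"], "genome_B": ["AUAU"]}
print(select_genomes(toy, threshold=61.0))
```

Passing `threshold=62.5` applies the stricter of the two cutoffs mentioned above.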

 

How to use the website?

  1. You need to first fill in your Job Name and email address, and then verify your email address before using the enzyme mining service.

    Note: Your email address must be valid; otherwise it cannot be verified.

  2. After verifying the email address, enter the amino acid sequence of an enzyme in FASTA format in the search interface to start the rapid mining of thermophilic enzymes.

    Note: Only a single amino acid sequence in FASTA format is allowed. Nucleotide sequences, special characters, and multiple amino acid sequences are not allowed.

  3. Choose a method for thermophilic enzyme mining.

    Note: Only one of the three mining methods can be selected per job; the methods cannot be combined.

  4. Click Submit. The information about the mined candidate thermophilic enzymes is delivered as a report in .tsv format, which can be downloaded directly from the website.
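
Before submitting, the input rules in step 2 can be checked locally. The sketch below is a hypothetical pre-check, not the validation the website itself runs: the function name, the 20-letter residue alphabet, and the rule that a sequence made only of A, C, G, T, U, or N is treated as nucleotide are all assumptions.

```python
# The 20 standard amino acid one-letter codes (assumed accepted alphabet).
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")
# Letters that, if they are the only ones present, suggest a nucleotide sequence.
NUCLEOTIDE_LETTERS = set("ACGTUN")


def parse_single_protein_fasta(text):
    """Return (header, sequence) for exactly one protein record, else raise ValueError."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("input must start with a FASTA header line ('>')")
    if any(ln.startswith(">") for ln in lines[1:]):
        raise ValueError("only one sequence is allowed")
    seq = "".join(lines[1:]).upper()
    if not seq:
        raise ValueError("the record contains no sequence")
    if set(seq) <= NUCLEOTIDE_LETTERS:
        raise ValueError("sequence looks like nucleotides, not amino acids")
    bad = sorted(set(seq) - VALID_RESIDUES)
    if bad:
        raise ValueError("unsupported characters: " + "".join(bad))
    return lines[0][1:], seq


# Hypothetical example record.
header, seq = parse_single_protein_fasta(">enzyme1 test\nMKTAYIAK\nQRQISFVK\n")
print(header, len(seq))
```

A failed check raises `ValueError` with a message naming the rule that was violated, so the input can be corrected before a job is submitted.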

 

About the report

The report provides information about the mined thermophilic enzymes.

Report for the enzyme mining method based on microbial optimum growth temperature:

  • Column 1. Genome Acc. No.: GenBank assembly accession
  • Columns 2 and 3. Score and E-value: from NCBI PSI-BLAST.
  • Column 4. Protein ID and Annotation
  • Column 5. Protein sequence
  • Column 6. Protein size

Report for the enzyme mining method based on structural RNA GC content:

  • Column 1. Genome Acc. No.: GenBank assembly accession
  • Column 2. GC content of tRNA
  • Columns 3 and 4. Score and E-value: from NCBI PSI-BLAST.
  • Column 5. Protein ID and Annotation
  • Column 6. Protein sequence
  • Column 7. Protein size

Report for the enzyme mining method based on thermophilic fungus genomes:

  • Column 1. Genome Acc. No.: GenBank assembly accession
  • Column 2. Score and E-value: from NCBI PSI-BLAST.
  • Column 3. Protein ID and Annotation
  • Column 4. Protein sequence
  • Column 5. Protein size
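
A downloaded report can be loaded with Python's standard `csv` module. The sketch below takes the column listings above literally; the short column labels, the method keys (`ogt`, `gc`, `fungus`), and the assumption that the file has no header row are hypothetical. In particular, the fungus report is read as five columns, with score and E-value kept together as one unparsed field, exactly as listed. The accession in the example is a placeholder, not a real record.

```python
import csv
import io

# Hypothetical labels for the columns listed above, one layout per mining method.
REPORT_COLUMNS = {
    "ogt": ["genome_acc", "score", "evalue",
            "protein_id_annotation", "sequence", "size"],
    "gc": ["genome_acc", "trna_gc", "score", "evalue",
           "protein_id_annotation", "sequence", "size"],
    "fungus": ["genome_acc", "score_evalue",
               "protein_id_annotation", "sequence", "size"],
}


def read_report(handle, method):
    """Read a tab-separated report into a list of dicts keyed by column label."""
    columns = REPORT_COLUMNS[method]
    rows = []
    for fields in csv.reader(handle, delimiter="\t"):
        if len(fields) != len(columns):
            continue  # skip blank or malformed lines
        rows.append(dict(zip(columns, fields)))
    return rows


# Placeholder line in the OGT layout (assumes no header row).
sample = "GCA_000000000.1\t250\t1e-50\tprot1 hypothetical protein\tMKT\t3\n"
rows = read_report(io.StringIO(sample), "ogt")
print(rows[0]["genome_acc"], rows[0]["evalue"])
```

For a file on disk, open it with `open(path, newline="")` and pass the handle to `read_report`. All values are kept as strings; in the OGT and GC layouts, convert with `float(row["evalue"])` before sorting hits by E-value.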