Abstract

The first version of the website (Co-evolution based enzyme mining) has been running since February 2022, and it was upgraded to the second version (Co-evolution based extremophilic proteins mining) between 2023 and 2024.

The CEM website now offers more extremophilic features (filtered from various databases and prediction tools) and more proteomes (downloaded from the NCBI database). The input parameters and the display, interactivity, and redundancy removal of the results have also been comprehensively optimized, making it easier for users in fields such as biotechnology and synthetic biology to obtain and screen suitable extremophilic proteins.

You can obtain an informative, interactive mining result within 30 minutes. The composition and statistics of the databases, the mining procedure, and the interpretation of the mining result are described below.

We wish you a pleasant experience and hope you find the extremophilic proteins you are looking for! 😀

 

Overview of each database

We provide 6 extremophilic features and 20 related databases, which are based on microbial optimum growth temperature, optimum pH, halotolerance classification, and so on. (Click the leftmost column to view more information.)

ID | Extremophilic Feature | Domain | .faa Count | Seq Count | Predicted OGT Mean (°C)

 


How to carry out a mining job?

  1. In the mining interface, first fill in your Job Name and email address.

    Note: Use a valid email address; otherwise it cannot be verified.

  2. After filling in the job information, enter the amino acid sequence of an enzyme in FASTA or TXT format.

    Note: Only one amino acid sequence is allowed. Nucleotide sequences, special characters, and multiple amino acid sequences are not allowed.

  3. Then, select an extremophilic feature and a database for extremophilic protein mining.

    Note: Relevant information can be viewed above.

  4. Finally, click Submit. You will be directed to either a unique interactive results page or an email verification page.

    Note: You can bookmark the results page or copy/download the corresponding results in either Excel or PNG format.

  5. Optional: You can choose other sequence alignment algorithms and customize some key parameters.

    Note: If you have any questions, please feel free to contact us.
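
The input rules in step 2 can be checked locally before submitting. The following is a minimal sketch in Python, not the website's own validation code: the function name `check_single_protein`, the 20-letter residue alphabet, and the nucleotide heuristic (a sequence made up only of A, C, G, T, U, N) are all assumptions made for illustration.

```python
# Hypothetical pre-submission check; not part of the CEM website itself.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")  # assumed alphabet: 20 standard residues
NUCLEOTIDES = set("ACGTUN")                # assumed nucleotide letters


def check_single_protein(text: str) -> str:
    """Return the cleaned sequence, or raise ValueError describing the problem."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]

    # More than one FASTA header means more than one sequence.
    headers = [ln for ln in lines if ln.startswith(">")]
    if len(headers) > 1:
        raise ValueError("multiple sequences found; only one is allowed")

    # Join the remaining lines into one uppercase sequence.
    seq = "".join(ln for ln in lines if not ln.startswith(">")).upper()
    if not seq:
        raise ValueError("no sequence found")

    # Anything outside the residue alphabet counts as a special character.
    bad = sorted(set(seq) - AMINO_ACIDS)
    if bad:
        raise ValueError(f"unexpected characters: {''.join(bad)}")

    # Heuristic only: a sequence built purely from nucleotide letters
    # is treated as a nucleotide sequence.
    if set(seq) <= NUCLEOTIDES:
        raise ValueError("input looks like a nucleotide sequence")

    return seq


print(check_single_protein(">my_enzyme\nMKTAYIAK\nQRQISFVK\n"))
```

A plain TXT input without a header line passes through the same function, because the header is optional in this sketch.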

 

About the mining result

The mining result page can provide you with rich information about the mined extremophilic proteins.

The entries in the tables and interactive charts include the input sequence (seq-input, provided by the user), the result sequences (seq-n, such as seq-1, seq-2, ...), and the consensus sequence (seq-consensus, obtained by multiple sequence alignment with gap characters removed).
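
To illustrate how a consensus sequence can be derived from an alignment and then stripped of gaps, here is a minimal sketch in Python. The exact consensus rule used by the website is not stated here, so the per-column majority vote, the function name `consensus`, and the `-` gap character are assumptions.

```python
from collections import Counter


def consensus(aligned: list[str], gap: str = "-") -> str:
    """Majority-vote consensus of equal-length aligned sequences, gaps removed."""
    if not aligned:
        raise ValueError("no sequences given")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must all have the same length")

    residues = []
    for column in zip(*aligned):
        # Most frequent symbol in this alignment column (assumed rule).
        symbol, _count = Counter(column).most_common(1)[0]
        # Columns where the gap wins are dropped from the consensus.
        if symbol != gap:
            residues.append(symbol)
    return "".join(residues)


# Toy alignment, made up for illustration only.
print(consensus(["MK-TA", "MKVTA", "MR-TA"]))  # prints "MKTA"
```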

The contents are described below:

Part1: Input Parameter

Index Name Description

Part2: Overview of Sequence Results

Index Name Description

Part3: Properties of Sequences

Index Name Description

Part4: Sequence Clustering

Clustering is based on the sequence properties listed in Part 3.

You can choose a reasonable number of clusters (Clusters Num) based on the scores from the four clustering evaluation methods, and then pick one or more sequences from each cluster for further bioinformatics analysis or wet-lab experiments.

① Hierarchical Clustering Situation

Index Name Description

② Clustering Evaluation

Index Name Description

 

About the prediction result

The prediction result page can provide you with rich information about the input protein sequence, including its properties, enzyme kinetic parameters, and the probabilities of 13 extremophilic features.

The contents are described below:

Part1: Properties of Sequences

Index Name Description

Part2: Classification probability of sequences

The 13 predictors were trained on datasets covering 3 domains and 6 extremophilic features (consistent with the mining task), constructed based on the concept of co-evolution, together with their corresponding negative datasets.

Each predictor outputs a probability. By default, a value above 50% indicates that the sequence has the corresponding extremophilic feature.

Users can adjust the probability threshold through the Threshold Bar in the lower right corner, and download the Excel results through the save button in the upper right corner.
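
The effect of the probability threshold can be sketched in a few lines of Python. This is only an illustration of the thresholding step, not the website's code: the function name `features_above`, the predictor names, and the probability values are all made up.

```python
def features_above(probabilities: dict[str, float], threshold: float = 0.5) -> list[str]:
    """Return the predictors whose probability is strictly above the threshold."""
    return [name for name, p in probabilities.items() if p > threshold]


# Hypothetical predictor names and probabilities, for illustration only.
probs = {"predictor_a": 0.91, "predictor_b": 0.50, "predictor_c": 0.12}

print(features_above(probs))        # default threshold of 50%
print(features_above(probs, 0.10))  # a lower, user-chosen threshold
```

With the default threshold only `predictor_a` is reported, because 0.50 is not strictly above 50%. Lowering the threshold to 0.10 reports all three.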