Generate a cluster centered on a given sequence

 

Sequence name:
Sequence(aa):
Taxa group:
Redundancy filter:
Identity(%) cutoff:
Overlap(%) cutoff:




Choosing parameters
  • Redundancy filter
    Searches against representative sequences at a given redundancy level, expressed as percent identity. For example, a redundancy filter of 70% retrieves only hits that share less than 70% identity with one another.
  • Identity(%) cutoff
    Values can range from 20 to 100. Sequences are gathered only if their percent identity, with respect to the input protein, is higher than the selected cutoff. A low identity cutoff yields a larger and more diverse cluster.
  • Overlap(%) cutoff
    Values can range from 20 to 100. Sequences are gathered only if their percent coverage, with respect to the input protein sequence, is higher than the selected cutoff. A low value retrieves both complete and incomplete sequence hits, and also yields a more diverse cluster, because the identity (%) cutoff is then applied to any sequence segment whose coverage exceeds the overlap cutoff.
  • Input sequence
    The input sequence can be either a complete sequence or a fragment corresponding to a particular domain. Depending on this choice, the identity and overlap parameters may need adjusting. This can be done after the search step, based on how many sequences satisfy the input criteria.

    Read more in the user guide
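
To illustrate how the identity and overlap cutoffs interact, here is a minimal Python sketch. It is not the server's implementation: the Hit class, its field names, the select_hits function and the example values are all hypothetical, and treating a hit exactly at the cutoff as accepted is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One search hit. All field names are hypothetical, for illustration only."""
    name: str
    identity: float  # percent identity with respect to the input sequence
    overlap: float   # percent coverage with respect to the input sequence


def check_cutoff(value, label):
    # Both cutoffs are documented as accepting values between 20 and 100.
    if not 20 <= value <= 100:
        raise ValueError(f"{label} must be between 20 and 100, got {value}")


def select_hits(hits, identity_cutoff, overlap_cutoff):
    """Keep the hits that satisfy both cutoffs (assumed inclusive)."""
    check_cutoff(identity_cutoff, "identity cutoff")
    check_cutoff(overlap_cutoff, "overlap cutoff")
    return [
        h for h in hits
        if h.identity >= identity_cutoff and h.overlap >= overlap_cutoff
    ]


# Made-up hits: a close full-length homolog, a distant full-length
# homolog, and a close but partial (fragment) match.
hits = [
    Hit("full_length_close", identity=85, overlap=98),
    Hit("full_length_distant", identity=35, overlap=95),
    Hit("fragment_close", identity=90, overlap=40),
]

strict = select_hits(hits, identity_cutoff=70, overlap_cutoff=80)
loose = select_hits(hits, identity_cutoff=30, overlap_cutoff=30)

print([h.name for h in strict])  # only the close full-length hit
print([h.name for h in loose])   # all three hits
```

With the strict settings only the close, full-length hit is kept. Lowering both cutoffs also admits the distant homolog and the fragment, which is the larger and more diverse cluster described above.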