Frequently Asked Questions
What are the main advantages of FuncLib?
- FuncLib designs combinations of highly epistatic active-site mutations that might be inaccessible to natural and laboratory evolution, or might require high-throughput screening or selection methods.
- FuncLib does not require high-throughput screens; we recommend testing the top 50 ranked designs.
- FuncLib can be used without a ligand or transition-state analogue to obtain a repertoire of functionally diverse enzymes that can be screened for a substrate of interest.
- If the position of the target ligand or transition-state analogue in the active site is known, it can be included in the calculations. See here.
What type of proteins can be submitted to FuncLib?
FuncLib calculations work best on soluble, single-domain proteins. We are currently developing FuncLib versions for membrane proteins and binding interfaces.
We highly recommend working with a PROSS-stabilized variant of your protein.
Can FuncLib be used without a structure?
FuncLib must receive an X-ray structure as input.
If your protein does not have a solved X-ray structure, you can use a homology model you trust as the input structure. The sequence homology between the protein of interest and the input structure must be at least 40%.
A homologous structure can be found using HHpred.
The model for your protein sequence can then be generated with SWISS-MODEL.
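As a rough pre-check of the 40% threshold, the sketch below computes percent identity from a pairwise alignment. It assumes that the threshold refers to percent sequence identity and that identity is counted over the columns where both sequences have a residue; FuncLib's exact definition is not stated here, so treat the number as an estimate. The function name `percent_identity` is hypothetical, and the alignment itself must come from an external tool.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Assumption: both inputs are rows of the same pairwise alignment,
    with '-' marking gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1

    return 100.0 * matches / compared if compared else 0.0


# Toy alignment: 4 identical residues out of 5 gap-free columns.
identity = percent_identity("MKT-LV", "MRTALV")
print(f"{identity:.1f}% identity")
```

A value well above 40% suggests the template is worth modelling on; values near the threshold deserve a closer look at the alignment around the active site.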
Can I submit a structure with missing density (missing residues/loops)?
Yes, you may submit a structure that has some missing residues. However, you will need to provide the full sequence of the protein to avoid biasing the PSSM (see here for details).
Can I submit an NMR structure?
FuncLib does not support NMR structures. If you edit the NMR-based PDB file so that it looks as if it were based on a crystal structure (with every atom appearing only once), you may use this structure.
However, we strongly recommend avoiding NMR structures, as they are typically not accurate enough for this kind of calculation.
How long does a FuncLib calculation take?
FuncLib consists of two steps, each of which may take a day or two. After the first step is done, you will be sent an email with a link for entering additional parameters that control the second step. When the second step is done, you will be sent an email with the results.
What do the result files include?
When results are ready you will get an email with a zip file attached. It will include a file named ReadMe.txt with a detailed explanation of the results. Please read the ReadMe.txt file carefully.
A shorter description of the results:
- An Excel file with the clustered designs sorted by Rosetta score (in Rosetta Energy Units).
- Design clustering: we select the best designs (in terms of predicted Rosetta score) that differ from one another by at least N mutations (by default, N=2).
- A zip file with the structures of the top 50 ranked designs.
- A text file with the sequence space, i.e. the allowed residues in each position.
- A text file with the parameters you used to create the run.
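The design clustering listed above can be illustrated with a small greedy selection. This is only a sketch of the idea, not FuncLib's actual implementation: it assumes that lower (more negative) Rosetta scores are better, that designs are compared only at the diversified positions, and that a design is kept when it differs from every previously kept design by at least N mutations. The function names and the toy data are hypothetical.

```python
def count_differences(seq_a: str, seq_b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


def cluster_designs(designs, min_diff=2):
    """Greedily keep the best-scoring designs that are mutually distinct.

    designs: list of (residues_at_diversified_positions, rosetta_score).
    Assumption: a lower Rosetta score is better.
    """
    kept = []
    for residues, score in sorted(designs, key=lambda d: d[1]):
        if all(count_differences(residues, k[0]) >= min_diff for k in kept):
            kept.append((residues, score))
    return kept


# Toy data: four designs over four diversified positions.
designs = [
    ("ACDE", -10.0),
    ("ACDF", -9.5),   # only 1 difference from ACDE, so it is dropped
    ("GCDF", -9.0),
    ("AWYE", -8.0),
]
for residues, score in cluster_designs(designs, min_diff=2):
    print(residues, score)
```

With N=2, a design that is a single point mutation away from a better-scoring design is never reported, which keeps the ranked list diverse.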
How do I interpret the variant names?
One of the result files is the sequence space. This file contains a list of the allowed amino acids at each diversified position (the WT identity is placed first).
For example:
106A ICHLM
132A FL
271A LIR
217A ML
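In the other output files, the name of each mutant refers to the sequence space: each diversified position is represented by 2 digits, which give the place of the chosen amino acid in that position's list of allowed residues (01 is the WT identity). The positions appear in the same order as in the sequence space file.
For the sequence space shown above, a mutant with the name 04010202 has the following mutations: I106L, 132 is not mutated (numbered 01), L271I and M217L.
A mutant whose name contains only "01" pairs (e.g. 0101010101 for five diversified positions) is the WT.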
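The naming scheme can be decoded mechanically. The sketch below parses the example sequence space shown above and translates a variant name into a list of mutations. The function names are hypothetical, and the code assumes single-character chain identifiers and that each two-digit code is a 1-based index into the allowed residues of the corresponding position.

```python
SEQUENCE_SPACE = """\
106A ICHLM
132A FL
271A LIR
217A ML
"""


def parse_sequence_space(text):
    """Return a list of (position, allowed_residues), in file order."""
    space = []
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        position, allowed = line.split()
        space.append((position, allowed))
    return space


def decode_variant(name, space):
    """Translate a name such as '04010202' into mutations such as 'I106L'."""
    if len(name) != 2 * len(space):
        raise ValueError("name length does not match the sequence space")

    mutations = []
    for i, (position, allowed) in enumerate(space):
        index = int(name[2 * i:2 * i + 2])  # 1-based index into `allowed`
        if not 1 <= index <= len(allowed):
            raise ValueError(f"code {index:02d} is out of range at {position}")
        wild_type = allowed[0]              # the WT identity is listed first
        chosen = allowed[index - 1]
        if chosen != wild_type:
            residue_number = position[:-1]  # drop the chain letter, e.g. 'A'
            mutations.append(f"{wild_type}{residue_number}{chosen}")
    return mutations


space = parse_sequence_space(SEQUENCE_SPACE)
print(decode_variant("04010202", space))  # ['I106L', 'L271I', 'M217L']
print(decode_variant("01010101", space))  # [] (the WT)
```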
How should I proceed towards experimental validation?
- We strongly recommend ordering the full genes rather than inserting mutations one by one.
- Select the designs for experimental testing: we recommend ordering the top 50 designs, but you can order more or fewer variants according to your screening capacity.
- For each selected design, copy the WT protein's sequence and change the relevant positions according to the variant's name. After generating all the sequences, align them to the original WT sequence, check again that the mutations are in the correct places, and verify that there are no gaps.
- Structures with missing densities: the sequences in the attached Top50.fa file do not include the residues that are missing from the structure. If your structure has missing densities, you should align each mutant sequence to the full sequence of your protein and fill in the missing residues.
- Back-translate the amino acid sequences to DNA sequences using the DNAWorks or EMBOSS websites.
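The step of editing the WT sequence according to a variant's mutations can be scripted, which also catches numbering mistakes early. The sketch below is a minimal example under stated assumptions: mutations are written as WT residue, residue number, new residue (e.g. I106L), the residue numbering is contiguous, and `first_residue_number` (a hypothetical parameter) is the number of the first residue in the sequence you pass in. If your numbering has gaps or insertions, map positions through an alignment instead.

```python
import re

MUTATION_PATTERN = re.compile(r"([A-Z])(\d+)([A-Z])")


def apply_mutations(wt_sequence, mutations, first_residue_number=1):
    """Return the WT sequence with the given point mutations applied.

    Raises ValueError if a mutation is malformed, out of range, or if the
    stated WT residue does not match the sequence (a sign of a numbering
    mismatch).
    """
    residues = list(wt_sequence)
    for mutation in mutations:
        match = MUTATION_PATTERN.fullmatch(mutation)
        if match is None:
            raise ValueError(f"cannot parse mutation {mutation!r}")
        wild_type, number, new = match.group(1), int(match.group(2)), match.group(3)

        index = number - first_residue_number
        if not 0 <= index < len(residues):
            raise ValueError(f"{mutation}: position is outside the sequence")
        if residues[index] != wild_type:
            raise ValueError(
                f"{mutation}: expected {wild_type}, found {residues[index]}"
            )
        residues[index] = new
    return "".join(residues)


# Toy example: an 8-residue sequence numbered from 1.
wt = "MKTAYIAK"
mutant = apply_mutations(wt, ["K2R", "I6L"])
print(mutant)                    # MRTAYLAK
print(len(mutant) == len(wt))    # point mutations never introduce gaps
```

Checking the WT residue before substituting it is the scripted equivalent of aligning each design to the WT sequence and confirming that the mutations are in the right place.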
How should I proceed when receiving an error email?
Errors are most often caused by incorrect user input.
The error email lists all the parameters that you submitted to FuncLib. Go over each parameter and make sure it is correct.
Below is a list of common user errors:
- PDB id
  - The desired PDB id contains the letter 'o' but you typed the digit 0, or vice versa.
  - Wrong id: copy the id reported in the email and paste it into the RCSB website. Make sure it is your protein of interest.
  - The id is of an NMR structure. FuncLib is not compatible with NMR structures.
  - The PDB file contains negative residue numbers. FuncLib is incompatible with negative residue numbers.
  - One of the residues has more than one conformation. Edit the file manually and keep only one conformation.
  - (rare) The PDB file contains a residue for which one or more of the backbone atoms are missing. Note that an entirely missing residue is not a problem. However, if some atoms of a residue exist, then all of its backbone atoms should also exist. To solve this problem, remove all the remaining atom lines of that residue (i.e., turn it into a missing density).
- Chain identifier to design
  - Missing identifier: one of the chain identifiers used does not appear in the PDB file.
  - Wrong identifier: the identifier is not of a protein chain but of a DNA or RNA chain.
  - Wrong identifier: the identifier is of a protein chain, but of the wrong one, and is incompatible with the rest of the parameters.
  - Numerical identifier: FuncLib is incompatible with numerical identifiers (1, 2...). You can change the identifiers to letters and resubmit the query using the upload files option.
  - You included a chain containing only non-amino acid residues (e.g. ligands and ions). Such chains should not be specified here. Instead, if you'd like to keep a ligand or ion during the simulation, use the ligands or ions to keep during simulations options.
- Amino acid positions to diversify
  - One or more of the positions (a pair of residue number and chain, e.g. 106A) does not exist in the PDB file.
  - Typos.
  - A missing comma, leading to the merge of 2 numbers.
  - Using a numbering system from a paper or from a database that does not match the numbering system in the PDB file.
  - Trying to specify a position for which there is missing X-ray density.
  - Specifying a number of a nonnative amino acid that is
  - Some residue positions appear in both Amino acid positions to diversify and Essential amino acid residues.
  - Missing chain: one of the residues you specified is located on a chain that is not specified in the chain identifier option.
  - Using the amino acid instead of the chain: the list should be of residue positions and chains, not of positions and amino acid letters. For example, if you want to specify a valine at residue 56 on chain B, you should specify 56B and not 56V.
  We suggest opening the PDB file in PyMOL, then going over all the position numbers you entered here and making sure you can select them on the designed chain in PyMOL.
- Essential amino acid residues
  - Non-protein residues (e.g. ligands, co-factors) cannot be specified in this option. If needed, specify the protein residues in contact with them.
  - Do not include here any of the amino acid positions to diversify, as they should be mutated. The essential amino acid residues will be kept fixed during simulations.
  - See also the errors listed under Amino acid positions to diversify.
- Ligand
  - Make sure you used a pair of the residue number and the chain (e.g. 1X).
  - Do not use the three-letter name of the ligand in the PDB file.
  - Single-atom ligands cannot be used in this option. For further information see here.
  - Do not designate a ligand and an ion in the same residue number. Please separate them into different residues.
  - See also the errors listed under Amino acid positions to diversify.
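Several of the PDB-related errors listed above can be caught before submission with a short script. The sketch below is an illustrative pre-check, not part of FuncLib: it reads the fixed-column ATOM records of a PDB file and reports negative residue numbers, residues with alternate conformations, residues with an incomplete backbone (N, CA, C, O), and requested positions (e.g. 106A) that are not present. The function name `check_pdb` and the message texts are hypothetical; HETATM records and insertion codes are ignored for simplicity.

```python
BACKBONE_ATOMS = {"N", "CA", "C", "O"}


def check_pdb(pdb_text, positions):
    """Return a list of human-readable problems found in the ATOM records.

    positions: strings such as '106A' (residue number followed by chain).
    """
    atoms = {}              # (chain, residue_number) -> set of atom names
    with_altloc = set()     # residues that carry alternate-location flags

    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        atom_name = line[12:16].strip()   # columns 13-16
        altloc = line[16]                 # column 17
        chain = line[21]                  # column 22
        number = int(line[22:26])         # columns 23-26
        key = (chain, number)
        atoms.setdefault(key, set()).add(atom_name)
        if altloc != " ":
            with_altloc.add(key)

    problems = []
    for (chain, number), names in sorted(atoms.items()):
        label = f"{number}{chain}"
        if number < 0:
            problems.append(f"{label}: negative residue number")
        if (chain, number) in with_altloc:
            problems.append(f"{label}: more than one conformation")
        if not BACKBONE_ATOMS <= names:
            problems.append(f"{label}: incomplete backbone")

    for position in positions:
        key = (position[-1], int(position[:-1]))
        if key not in atoms:
            problems.append(f"{position}: position not found")
    return problems
```

For example, `check_pdb(open("my_protein.pdb").read(), ["106A", "132A"])` (with a file name of your own) returns an empty list when none of these issues is detected. It complements, rather than replaces, the visual check in PyMOL.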
For any further questions and troubleshooting, please contact rosalie.lipsh@weizmann.ac.il
Good Luck!