-
Notifications
You must be signed in to change notification settings - Fork 114
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding pdb_addter #112
base: master
Are you sure you want to change the base?
Adding pdb_addter #112
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @andrewsb8
First of all thanks for coming interacting with pdb-tools
I really enjoyed the fact that you followed very well the structure of our code and project. You really got into it 😉
I am adding some review comments. Cheers!
pdbtools/pdb_addter.py
Outdated
prev_res = res_id | ||
res_counter += 1 | ||
if res_counter - min(residue_range) != 0 and (res_counter - min(residue_range)) % step == 0 and int(line[22:26]) in residue_range: | ||
yield "TER\n" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use os.linesep
. Likely we don't have that in all tools but it is something we need to change. so let's start here
pdbtools/pdb_addter.py
Outdated
|
||
|
||
if __name__ == '__main__': | ||
main() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to chain pdb_addter
with pdb_chainbow
using your example, test/data/addter_test.pdb
and I have this error:
· python pdbtools/pdb_addter.py -1::10 tests/data/addter_test.pdb | python pdbtools/pdb_chainbows.py
Traceback (most recent call last):
File "pdbtools/pdb_chainbows.py", line 159, in <module>
main()
File "pdbtools/pdb_chainbows.py", line 138, in main
for lineno, line in enumerate(new_pdb):
File "pdbtools/pdb_chainbows.py", line 106, in run
chain_map[line[21]] = curchain
IndexError: string index out of range
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've figured out the reason for this. Even without using pdb_add_manual_ter
here, the test pdb has a TER line (but not a full TER entry). This is common in pdb files output from Gromacs. These files will look like this at the end:
.
.
ATOM ...
ATOM ...
TER
ENDMDL
So pdb files output from Gromacs will break pdb_chainbows
just like pdb_add_manual_ter
does (before the changes of course I am making of course 😄).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First of all, thanks for the contribution and effort in writing this tool! It's very consistent with our style.
Besides a few minor points in the code, my main concern is the name of the tool and its purpose. I'd understand addter
as a tool that would add TER statements at the right places. We've had discussions about such a tool in the past, but it wasn't very productive because it's not a simple task. The PDB format spec doesn't really tell you when to add TERs appropriately. I'm afraid that this tool would produce "non-standard" PDBs, which is fine to be honest, but then we should change the name to make it a little more obvious that that is the case, particularly, since the default mode of action of the tool is to add a TER
after every residue.
What about pdb_add_manual_ter
? A bit longer, but at least more specific.
Another example, adds
|
Another comment, |
That could be a proper name. As @JoaoRodrigues said, |
Thanks for the feedback! The name change makes sense. Having ability to manually place TER records will be helpful ultimately given the other scripts' purposes (based on my understanding). Ex: chainbows requires TER records, tidy requires Chain identifiers. Some tools don't always give Chain IDs (CHARMM-GUI, Gromacs if Chain IDs aren't provided and no TER records already in place, VMD in some cases, etc). So this tool could be complicated but would help complete the suite as some of the tools need one or the other to already exist. I have some questions and comments as well:
Otherwise, I will get to those changes, write tests for a smaller pdb file and dummy.pdb (based on your feedback to the above suggestion), and make the names appropriate. Thanks for your time. Brian |
Hi, you can stay on the same branch, no need to do a new branch,
i don't understand, can you rephrase? Can it be that
pdb-tools should not raise errors (I think). these are very simple tools that should do the job when the PDB follows (more or less) the right formatting rules. if the PDB is "crazy" or has "errors incompatible with the tool", I am not sure if it should be the responsibility of Yes please, use the smallest possible reproducible case for PDB tests. Cheers! |
I'll have to get back to you about If not throwing an error, the function could mimic that of |
Sounds good so far. Give it a try on the review comments and your considerations. We will review it again after that. Best, and we are very happy to have you interacting with |
Hi all, I've made some fairly large changes to the branch pdb_addter. I'll outline some of the largest here:
Let me know what you think or if you run into any issues!
|
Hey all, I know that it's end of term and holiday season. Just wanted to follow up because I had actually forgotten about this until a minute ago 😅. Hope the end of the term goes well for you both and that you enjoy your holiday break! Brian |
Hi, @andrewsb8 sorry for the late reply. These have been busy days on all fronts. My apologies. I will review this PR today/tomorrow and let you know |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @andrewsb8
I have been reviewing your PR. The idea remains very valuable. However still some minor issues.
- I noticed the residue range the tool expects is the natural order of the residues (1, 2, 3, ...) regardless of the residue numbering in the PDB. This is awkward because would force the users to calculate the residues they see in the PDB and the ones they want. Or use
pdb_reres
twice to correct for that, for example:pdb_reres -1 | pdb_add_manual_ter | pdb_reres -X
again. I think it is not difficult to implement such that the residue rangepdb_add_manual_ter
expects is the residues of the actual PDB. I found it more natural. - I think the tests are very loose still. Please add assert statements to confirm the actual number and position of the
TER
lines in the output files. The best would be to assert for the actualTER
line.
Regarding number 1
. I think other members of the team should give their opinion as well.
Apart from that, everything looks working. Once this is set, before merging I will clean a bit your implementation, so to homogenize with the other tools.
Sorry for these demands, but we need to keep it so 😉
Thanks for all your contributions!
Happy new year! Thanks for the feedback. Have a couple of questions based on it.
So the
There the second entry is changed but everything after the Brian |
Hi Brian, thanks for your cheerful words. We had great holidays, we hope you too :-) All the points you are rising are definitively very tricky and I feel they are getting out of the PDB standard format because of the added complexity. The point you raise for 1) is a good one. Maybe it is best to leave it as "per residue" instead of the actual residue numbers of what you want. Sometimes I think what we if the tool should be add ters at specific residue/chains, for example Regarding 3), I was rereading again the whole conversation here, definitively @JoaoRodrigues raise a good point. Could you point us to a real-world example (PDBID) and would be your perfect outcome so we can investigate further?
Could so send us again these examples? (transfer.sh) Cheers, |
I will make it so the atom numbers are fixed, no problem! As for examples of what I'm looking for, I'm rarely working with a single pdb id and sometimes not one at all. I have three examples of some current projects with large systems:
Before finding this project, I've been modifying these files largely manually which has been very time intensive. |
I have updated
Let me know what you think of my changes or of my prior comment. Best, Brian |
Hi @andrewsb8 , |
Hi,
Wrote a script to add TER lines in user defined ways similar to pdb_delres.py. This would be a good functionality for multiple reasons:
I include pdbtools/pdb_addter.py, tests/test_pdb_addter.py, and tests/data/addter_test.pdb where all 13 unit tests in test_pdb_addter.py pass.
Let me know what you think,
Brian