Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mistake in regex string of ion_or_solid_comp_object leads to example string formula "Na[+]" failing #4231

Open
IsaacGentle opened this issue Dec 20, 2024 · 1 comment · May be fixed by #4233
Labels

Comments

@IsaacGentle
Copy link

Python version

Python 3.12.3

Pymatgen version

2024.11.13

Operating system version

Windows 10

Current behavior

The following code leads to an error (ValueError: + is an invalid formula!)

from pymatgen.analysis.pourbaix_diagram import ion_or_solid_comp_object
fails = ion_or_solid_comp_object('Na[+]')

Even though the definition of ion_or_solid_comp_object states that the argument string should be formatted as 'Na[+]'

def ion_or_solid_comp_object(formula):
    """Get an Ion or Composition object given a formula.

    Args:
        formula: String formula. Eg. of ion: NaOH(aq), Na[+];
            Eg. of solid: Fe2O3(s), Fe(s), Na2O

    Returns:
        Composition/Ion object
    """
    if re.match(r"\[([^\[\]]+)\]|\(aq\)", formula):
        comp_obj = Ion.from_formula(formula)
    elif re.search(r"\(s\)", formula):
        comp_obj = Composition(formula[:-3])
    else:
        comp_obj = Composition(formula)
    return comp_obj

In it's current form, the string needs to be '[+]Na' in order for it to work:

from pymatgen.analysis.pourbaix_diagram import ion_or_solid_comp_object
works = ion_or_solid_comp_object('[+]Na')

Expected Behavior

If 'Na[+]' is the desired string format, the definition of ion_or_solid_comp_object should be modified

L398 of pourbaix_diagram.py is currently:

if re.match(r"\[([^\[\]]+)\]|\(aq\)", formula):

https://github.com/materialsproject/pymatgen/blob/da607e86f9ce8aec942067c6c1a4fda6e04915dd/src/pymatgen/analysis/pourbaix_diagram.py#L398C18-L398C19

I suggest replacement with

if re.match(r".*\[([^\[\]]+)\]|\(aq\)", formula):

I am not super familiar with git workflows and pull requests so I have sent this as an issue. Please advise if you think this is something that should be fixed and if you wish me to do a pull request :)

Minimal example

from pymatgen.analysis.pourbaix_diagram import ion_or_solid_comp_object

works = ion_or_solid_comp_object('[+]Na')
fails = ion_or_solid_comp_object('Na[+]')

Relevant files to reproduce this bug

No response

@DanielYang59
Copy link
Contributor

DanielYang59 commented Dec 23, 2024

Hi @IsaacGentle thanks for reporting and I could confirm this issue. For a quick guide on making contributions to a project, you could checkout the guide here.

I could confirm you fix indeed work, but things looks more complex that this as some other formulas still don't work (for example Cl[-]).

Also the unit test for this helper method seems missing, so I might take over from here and fix this :) Thanks again for reporting and investigating.


A small tip, to share a code block, you could use the permalink instead of copying the code block.

Another small tip, to show differences in code, you could add diff, which would look like:

- if re.match(r"\[([^\[\]]+)\]|\(aq\)", formula):
+ if re.match(r".*\[([^\[\]]+)\]|\(aq\)", formula):

Update: looks like ion_or_solid_comp_object is not used across the code base at all, not sure about its use case, and it looks like a private helper method to me

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants