-
Notifications
You must be signed in to change notification settings - Fork 77
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Find recursively "pure input" cells #142
Comments
The script in #119 seems to go in the right direction, although, as already noted, does not work with ranges. In my case, it correctly returns >>> get_one_line_formula(xl_model, "'[test.xlsx]WS'!M15")
'SUM((0.22 / 0.6):N14)' With few changes and an helper function import re
def expand_range(range_str,xl_model):
'''
Convert a string representing an Excel range like "'[test.xlsx]WS'!B3:D8" into a list of strings representing all single Excel cells of the range, filtering out cells unused in Excel document model `xl_model`.
'''
match = re.match(r"'([^']+)'\!([A-Z]+)(\d+):([A-Z]+)(\d+)", range_str)
if match:
sheet_name, col_start, row_start, col_end, row_end = match.groups()
col_start_num = ord(col_start) - 65
col_end_num = ord(col_end) - 65
num_rows = int(row_end) - int(row_start) + 1
cell_list = ["'{}'!{}{}".format(sheet_name, chr(65 + col), row)
for row in range(int(row_start), int(row_start) + num_rows)
for col in range(col_start_num, col_end_num + 1)]
# cells really used in Excel document are only those in xl_model.dsp.dmap.pred dictionary keys
return [c for c in cell_list if c in xl_model.dsp.dmap.pred]
else:
return None
import functools
import schedula as sh
@functools.lru_cache()
def pure_input_cells(xl_model, node):
'''
Knowing the model of an Excel document, find recursively the formula expressed in terms of "pure input" cells of a given formula.
'''
rl = expand_range(node,xl_model) # list of cells of a range, or None if argument is not a range
if rl is not None:
# even if in Excel parameters are separated by ';', in `formulas` model the separation character is ','
return ','.join(pure_input_cells(xl_model, rc) for rc in rl if rc in xl_model.dsp.dmap.pred)
try:
formula = next(iter(xl_model.dsp.dmap.pred[node]))
except StopIteration:
return node
for k in sorted(xl_model.dsp.dmap.pred[formula], key=lambda x: (len(x), x), reverse=True):
if not isinstance(k, sh.Token):
formula = formula.replace(k, pure_input_cells(xl_model, k))
return formula[1:] I was able to make it returns >>> pure_input_cells(xl_model, "'[test.xlsx]WS'!M15")
"SUM(('[test.xlsx]WS'!K10 / '[test.xlsx]WS'!H10),'[test.xlsx]WS'!M11,('[test.xlsx]WS'!H12 * '[test.xlsx]WS'!K12))" which express the formula of the input cell in terms of "pure input" cells expanding the range. I hope a similar functionality will be added to the library core. |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
I'm trying to disclose the computing structure of an excel file in order to reproduce and perform those computation from code. Basically I'd like to have a function able to find all the dependencies of a cell:
Once you know all the "pure input" cells, computing the value of a formula cell is easy with
calculate
method:The initilization is easy,
formulas
does a great job at creating a model and all information seem to be there:When the
inputs
method of a cell returnsNone
(as far as I understand) the cell does NOT contain a formula (like "pure input" cells), but as soon as I look at a cell containing a formulaor
I could not understand if and how I can use
formulas
methods to:Thanks in advance for your attention.
The text was updated successfully, but these errors were encountered: