SnakeMake parser #64

Closed
ihh opened this issue Jun 29, 2018 · 6 comments
Comments

@ihh
Member

ihh commented Jun 29, 2018

Implement a parser to convert SnakeMake rules into Prolog rules, as the current GNU makefile parser does for GNU makefiles (and the trivial Prolog parser does for Prolog).

Implement by changing the shell to Python, via #54.

@ihh
Member Author

ihh commented Jun 29, 2018

Here is the SnakeMake grammar, from snakemake.readthedocs.io

snakemake  = statement | rule | include | workdir
rule       = "rule" (identifier | "") ":" ruleparams
include    = "include:" stringliteral
workdir    = "workdir:" stringliteral
ni         = NEWLINE INDENT
ruleparams = [ni input] [ni output] [ni params] [ni message] [ni threads] [ni (run | shell)] NEWLINE snakemake
input      = "input" ":" parameter_list
output     = "output" ":" parameter_list
params     = "params" ":" parameter_list
log        = "log" ":" parameter_list
benchmark  = "benchmark" ":" statement
message    = "message" ":" stringliteral
threads    = "threads" ":" integer
resources  = "resources" ":" parameter_list
version    = "version" ":" statement
run        = "run" ":" ni statement
shell      = "shell" ":" stringliteral

@cmungall
Member

Hmm,

"while all not defined non-terminals map to their Python equivalents"

This seems non-trivial: we would need to implement a Python parser and execution engine, or else link into the Python API?

@ihh
Member Author

ihh commented Jun 29, 2018

To do it 100% correctly you would need a Python parser, yes. The Python grammar is here, and it wouldn't be impossible. I don't think you need a Python execution engine, other than running the Python executable itself, and that's just equivalent to changing the GNU Make SHELL pseudo-environment variable as in issue #54.

A half-working/subset implementation might be even easier, though. If you look at the examples, it looks like the interweaving of Python with SnakeMake is relatively straightforward in many cases. The input and output are Python parameter lists which can be referenced in the Python rule; that seems like the trickiest thing, on first glance. The demarcation of the Python code itself might not be too hard because you just use indentation.
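As a rough sketch of the indentation idea, the following Python splits a Snakefile-like text into rules and sections by looking only at indentation, without parsing any of the embedded Python. The names `split_rules`, `RULE_RE` and `SECTION_RE` are made up for this illustration; this is not how SnakeMake or biomake actually does it, and it assumes rules start at column 0 and section keywords sit on their own lines.

```python
import re
import textwrap

# Hypothetical patterns: "rule name:" at column 0, "keyword:" inside a rule.
RULE_RE = re.compile(r"^rule(?:\s+(\w+))?\s*:\s*$")
SECTION_RE = re.compile(r"^(\w+)\s*:\s*(.*)$")


def split_rules(text):
    """Return {rule_name: {section_name: raw_body_text}}.

    Section bodies are kept as raw strings; nothing is evaluated.
    """
    rules = {}
    rule = None          # sections of the rule currently being read
    section = None       # name of the section currently being read
    section_indent = 0   # indentation of that section's keyword line
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        indent = len(line) - len(line.lstrip())
        if indent == 0:
            # A column-0 line either opens a rule or is top-level Python.
            m = RULE_RE.match(stripped)
            rule = rules.setdefault(m.group(1) or "", {}) if m else None
            section = None
            continue
        if rule is None:
            continue  # indented line that belongs to top-level Python
        m = SECTION_RE.match(stripped)
        if m and (section is None or indent <= section_indent):
            # A keyword at section level starts a new section.
            section, section_indent = m.group(1), indent
            rule[section] = [m.group(2)] if m.group(2) else []
        elif section is not None:
            # Anything indented deeper is part of the current section body.
            rule[section].append(line)
    return {
        name: {sec: textwrap.dedent("\n".join(body)) for sec, body in secs.items()}
        for name, secs in rules.items()
    }


example = """
rule merge_assemblies:
    input:
        'assembly/assemblies.txt'
    output:
        'assembly/merged/merged.gtf', dir='assembly/merged'
    shell:
        'cuffmerge -o {output.dir} -s {REF} {input}'
"""
parsed = split_rules(example)
```

Each section body comes back as an uninterpreted string, so a `run` body could be handed to the Python executable unchanged, while `input` and `output` would still need a parameter-list parser of some kind.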

Here are examples of what I mean by the parameter lists being referenced in the rules:

rule compose_merge:
    input:
        expand('assembly/{sample}/transcripts.gtf', sample=SAMPLES)
    output:
        txt='assembly/assemblies.txt'
    run:
        with open(output.txt, 'w') as out:
            print(*input, sep="\n", file=out)


rule merge_assemblies:
    input:
        'assembly/assemblies.txt'
    output:
        'assembly/merged/merged.gtf', dir='assembly/merged'
    shell:
        'cuffmerge -o {output.dir} -s {REF} {input}'
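To make the referencing concrete, here is a minimal Python sketch of how `{input}` and `{output.dir}` in the `shell` string above could be filled in. The `Files` class and `render_shell` function are hypothetical helpers written for this comment, not SnakeMake's own classes, and the value `ref.fa` for `REF` is a placeholder. The sketch assumes that named items also appear in the positional listing and that a bare `{input}` expands to a space-separated list.

```python
class Files(list):
    """Hypothetical parameter list: positional items, plus named items
    that are reachable as attributes (e.g. output.dir)."""

    def __init__(self, *positional, **named):
        # Assumption: named items are also part of the positional listing.
        super().__init__(list(positional) + list(named.values()))
        self.__dict__.update(named)

    def __str__(self):
        # A bare {input} placeholder renders as a space-separated list.
        return " ".join(self)


def render_shell(template, **namespace):
    """Fill {name} and {name.attr} placeholders using str.format."""
    return template.format(**namespace)


inp = Files('assembly/assemblies.txt')
out = Files('assembly/merged/merged.gtf', dir='assembly/merged')
cmd = render_shell(
    'cuffmerge -o {output.dir} -s {REF} {input}',
    input=inp,
    output=out,
    REF='ref.fa',  # placeholder value, not from the original Snakefile
)
```

`str.format` already resolves attribute access inside placeholders, so the only work here is giving the parameter lists a sensible string form and named attributes.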

Note that we don't even have full GNU Makefile compatibility (nor are we ever likely to). I think a better target is the proportion of people's makefiles in practice that biomake can be used with. And actually I think it's hard to confidently state a lower bound for this number until a prototype is out there and people are giving feedback. Implementing a simplified subset of SnakeMake syntax that covers (say) 10% of SnakeMake files, along with the 80% (or whatever) of GNU Make that we also currently cover, wouldn't be so bad.

@ihh
Member Author

ihh commented Jun 29, 2018

I guess there's also this kind of thing...

CLASS1  = '101 102'.split()
CLASS2  = '103 104'.split()
SAMPLES = CLASS1 + CLASS2

I think this could be implemented using $(shell ...): https://www.gnu.org/software/make/manual/html_node/Shell-Function.html
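A sketch of the same idea in Python terms: hand the top-level assignments to the Python executable and read the resulting values back, much as `$(shell ...)` captures a command's output. The helper name `eval_assignments` and the use of JSON to pass values back are assumptions made for this illustration only, and it only works for values that JSON can represent (strings, numbers, lists, dicts).

```python
import json
import subprocess
import sys


def eval_assignments(source, names):
    """Run `source` in a fresh Python process and return {name: value}
    for the requested variable names."""
    # Append a line that dumps the requested globals as JSON on stdout.
    script = (
        source
        + "\nimport json\n"
        + "print(json.dumps({n: globals()[n] for n in %r}))\n" % list(names)
    )
    done = subprocess.run(
        [sys.executable, "-c", script],
        capture_output=True,
        text=True,
        check=True,
    )
    # Only the last line is ours, in case the source itself printed something.
    return json.loads(done.stdout.splitlines()[-1])


assignments = """
CLASS1  = '101 102'.split()
CLASS2  = '103 104'.split()
SAMPLES = CLASS1 + CLASS2
"""
values = eval_assignments(assignments, ["SAMPLES"])
```

Running the interpreter as a subprocess keeps the calling side free of any Python parsing: it only needs to know which lines are assignments and which variable names it wants back.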

@ihh
Member Author

ihh commented Jun 29, 2018

I think the subset that is within reach is one that makes no use of Python outside of rules and variable assignments (which we can think of as analogous to shell-executed code in Make). The question is how many SnakeMake users that would satisfy, versus how many it would frustrate.

ihh added the wontfix label Oct 1, 2018
@ihh
Member Author

ihh commented Oct 1, 2018

Shelving for now. Prolog->Python seems like a stretch...

ihh closed this as completed Oct 1, 2018