Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GHC plugin update #260

Draft
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

ulysses4ever
Copy link
Collaborator

@ulysses4ever ulysses4ever commented Jun 12, 2024

Remaining loose ends:

  • [kinda] how will the resulting executable find the libgibbon_rts_ng.so part of the RTS that we create from the Rust code?

    • ATM we stick -rpath pointing to the RTS to the client's Cabal file. This is probably not too bad. But is there a chance to do better? E.g.could we link the ng RTS statically instead?
  • plugin (Haskell code specifically) should not reference the RTS via an absolute path

    • how to make sure that the RTS is compiled? I added a call to compileRTS from plugin but it doesn't seem to have any effect 🤔
  • where to stick the gib_init call to initialize the RTS?

    • [BLOCKER] currently, to circumvent the issue of not calling it, a bulk of code in RTS is commented out. Very bad! This is maybe the last point that should be solved before we can declare a reasonable state that we can merge.

@ulysses4ever
Copy link
Collaborator Author

This is still a draft

@ulysses4ever ulysses4ever marked this pull request as draft June 12, 2024 14:06
@ulysses4ever ulysses4ever force-pushed the ap/ghc-integration-update branch from 19b799d to 096b6ac Compare June 12, 2024 14:12
gibbon-compiler/src/Gibbon/Passes/Codegen.hs Outdated Show resolved Hide resolved
Comment on lines -750 to +754
if (gib_addr_in_nursery(footer_ptr)) {
old_chunk_in_nursery = true;
GibNurseryChunkFooter oldsize = *(GibNurseryChunkFooter *) footer_ptr;
newsize = oldsize * 2;
} else {
// if (gib_addr_in_nursery(footer_ptr)) {
// old_chunk_in_nursery = true;
// GibNurseryChunkFooter oldsize = *(GibNurseryChunkFooter *) footer_ptr;
// newsize = oldsize * 2;
// } else {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like the second major thing, but it's a pity I don't understand the motivation of this change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The nursery is initialized in the main function when it calls "gib_init". Without initialization, gib_addr_in_nursery will segfault. We need to find a way to call "gib_init" function in a different way.


-- (5) Link the .o file in the module.
let objfile = replaceExtension fp ".o"
let rtsfile = "/home/ckoparka/chai/tree-velocity/gibbon-rts/build/gibbon_rts.o"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've hardcoded various paths in a few places in the code, those all need to made relative to the "GIBBONDIR" environment variable.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I missed this one it seems, thank you!

It's easy in the Haskell code. The issue is that some of the paths need to go into the .cabal file, where you can't rely on GIBBONDIR, I think.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the best way to relativize this path, do you have an idea? My hope is that RTS can be a part of the gibbon Haskell package (as it used to be once, I think), and the plugin could depend on it (via extra-libraries or something).

Do you think that we need to use .o files, or .a would be acceptable. Because I suppose .a is what Cabal will expose as an "extra-library".

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correction: instead of extra-libraries read extra-bundled-libraries.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess, none of the *-libraries fields really work because we need to recompile RTS at run time rather than compile time. Need to think more...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm starting to think that since plugin can't work without Gibbon at all and it's not much code, it should be just a part of the Gibbon package. In that case it could find RTS via data-files. What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies, I've been having a super busy week. I just tried to merge my commit and saw this isn't a branch on this repo but on your clone. Avoiding creating a PR for a PR :D You've already cleaned up most of the things, the only additional changes I have are : (1) using GIBBONDIR in the Plugin.hs file in the spot you missed, and (2) using (isLibrary) in Codegen.hs to decide whether to codegen the main expression.

My hope is that RTS can be a part of the gibbon Haskell package (as it used to be once, I think)

We didn't have this problem before because the old RTS was linked using copy-paste :D We had rts.c file that would be added as a prefix to every generated C program.

I guess, none of the *-libraries fields really work because we need to recompile RTS at run time rather than compile time. Need to think more...

Just to double check if I got this right, "run time" means the time that the plugin runs right? So every time a Haskell program uses liftPacked and the plugin kicks in? We don't need to recompile the RTS every time here. We'd want to compile the RTS just once, keep the .o/.so files in a known location, and then link the compiled Haskell program with it.

I think there might be a way avoid the cabal file altogether. We were using it to dynamically link against the .so file that the Rust RTS produces. Maybe there's a different way to tackle this? So the plugin is already inserting these .o files into the GHC compilation pipeline. Maybe we can a find a way to set rpath here too?

I didn't get a chance to try this out, but another idea was to use Setup.hs to set extra-bundled-libraries: with the right path in gibbon-plugin.cabal and then to only use extra-libraries: in the client (here gibbon-examples.cabal) and things should work out? Then we don't have to specify any sort of library path in the client's cabal file, because the appropriate .o/.so files should be present in the same directory as the gibbon-plugin library would?

I'm starting to think that since plugin can't work without Gibbon at all and it's not much code, it should be just a part of the Gibbon package. In that case it could find RTS via data-files. What do you think?

Yeah, that's okay. But we would still need to put a relative path to the .o/.so files here or is that avoidable?

Copy link
Collaborator Author

@ulysses4ever ulysses4ever Jun 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the reply! There are many questions, and it'd be great to untangle them. Here's an approximate list.

First, something completely orthogonal (as I realize now): I don't like that Gibbon depends on $GIBBONDIR. I think, as a Haskell app, it could get away with using only Paths_gibbon-paths. The reason I don't like the env var is that I don't see a clean way to cabal install gibbon in that case: you'd need to learn where the Gibbon source is stored by Cabal. This is not the way you're supposed to use Cabal :-) Before I thought that switching away from $GIBBONDIR might make the life of the plugin easier, but after reading your message and thinking more, I think it's probably doable in any case, so I probably need to shelve this issue for now.

  • I thought that data-files will be enough to package the RTS in a reasonable way, but it seems there are catches here.
    1. You mention dynamic library for the Rust part of the RTS.
    2. Currently, the RTS is compiled "in place" meaning that the binary artifacts produced get stored under the same directory. This can break if the Gibbon package installed in a read-only filesystem. E.g. Nix store... I think the RTS artifacts should be generated in the current directory, just like the .c files for the source and alike. This is hopefully not too hard to achieve, but more work than just adding data-files field and switching from $GIBBONDIR to Paths_gibbon in the Haskell source of Gibbon.

Imagine I'm compiling a plugin client with Cabal.

  1. Cabal needs to find the plugin and Gibbon-the-library. This is easy is everything is packaged as one or more Haskell packages although there are several options. Gibbon and the plugin may be:
    a. two separate Haskell packages (as it is now)
    b. one package (the plugin is just a module of the Gibbon package) -- this may or may not simplify things for the rest.
    c. plugin may be a public sublibrary of the gibbon package (see Cabal docs) -- not sure what this could buy us (in theory, this is a cleaner interface).

  2. The plugin needs to find the Gibbon RTS -- currently, we can assume it's done via $GIBBONDIR. In that case, it's not important what we choose in the previous step, hopefully.
    a. Currently, the plugin assumes the RTS has been compiled it seems. That probably should be changed: I don't see why this assumption would be true in general.

  3. At run time, the client needs to find the Rust part of the RTS that is compiled into a dynamic library. I am not a big fan of dynamic libraries. Can we not use a static library and deal with it in the previous step? Also, how does Gibbon-the-compiler find the .so now? Can't this approach be extended to the plugin somehow?..

    a. At plugin client's run time, I hope you don't need to compile anything. When I said "run time" before, I meant plugin's run time, so the client's compile time. Anyway, I think we're on the same page about this point.

Anything I'm missing? Any comments?

@ulysses4ever ulysses4ever force-pushed the ap/ghc-integration-update branch from 096b6ac to 2db208c Compare June 20, 2024 19:33
@@ -252,8 +253,17 @@ data Mode = ToParse -- ^ Parse and then stop
| RunMPL -- ^ Compile to SML & compile with MPL & run
| Bench Var -- ^ Benchmark a particular function applied to the packed data within an input file.
| BenchInput FilePath -- ^ Hardcode the input file to the benchmark in the C code.
| Library Var -- ^ Compile as a library, with its main entry point given.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What was the idea for the "entry point" field of the library mode, @ckoparkar ? I don't see it used anywhere. Also, what's an entry point to a library anyway? I thought "entry point" is only applicable to executables. I'm really bad at these things...

…o luck

at least, the .o is fetched in a reasonable way now

this also tries to compile RTS from the plugin, but local experiments
doesn't seem to show it having any effect, unfortunately
@ulysses4ever
Copy link
Collaborator Author

All right, I get to the point when it says it can't find libgibbon_rts_ng.so when trying to run the example. Now, we need to think harder... But let me repeat the penultimate question: does it have to be an .so or could we stick to static? Not that I can guarantee you that we can solve it in a nice way in the static case, but that would be one less moving piece, which could make it easier to progress...

Also, I added a call to compileRTS inside the plugin, as I said I intended. It doesn't seem to work: the binary files don't appear after the plugin ran 😭😭😭

Caveats:

- I'm back to absoulute paths in the cabal file. I reconsidered the
  issue of absolute paths in .cabal files a little: if it's a client's
  .cabal file, than it's acceptable, albeit not great, I think. We'll
  see if this is avoidable. With the shared lib in RTS, I'm not sure it is.

- I had to get back to once used
  "-optl-Wl,--allow-multiple-definition". We need to investigate why we
  get "multiple defintions" and how to avoid this.

- I had to simplify the example client package to only have an exe rather than exe+lib.
  I assume this can be resolved.
@ulysses4ever
Copy link
Collaborator Author

It works! but it's ugly... the key was "-optl-Wl,-rpath" I think. 92e90f9

Caveats:

  • I'm back to absoulute paths in the cabal file. I reconsidered the
    issue of absolute paths in .cabal files a little: if it's a client's
    .cabal file, than it's acceptable, albeit not great, I think. We'll
    see if this is avoidable. With the shared lib in RTS, I'm not sure it is.

  • I had to get back to once used
    "-optl-Wl,--allow-multiple-definition". We need to investigate why we
    get "multiple defintions" and how to avoid this.

  • I had to simplify the example client package to only have an exe rather than exe+lib.
    I assume this can be resolved.

@ulysses4ever
Copy link
Collaborator Author

@ckoparkar do you thoughts what is the best place to stick the call to gib_init?

@ulysses4ever ulysses4ever marked this pull request as ready for review August 2, 2024 20:17
@ulysses4ever ulysses4ever marked this pull request as draft August 2, 2024 20:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants