Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

linux.c: don't use pread64 #2048

Closed
wants to merge 2 commits into from
Closed

linux.c: don't use pread64 #2048

wants to merge 2 commits into from

Conversation

orbea
Copy link
Contributor

@orbea orbea commented Feb 26, 2024

Starting with musl-1.2.4 all LFS64 APIs have been removed since musl is always 64-bit and yara now fails with implicit function declarations for pread64.

However configure.ac already sets AC_SYS_LARGEFILE so changing them all to pread will work with both glibc and musl.

Copy link

google-cla bot commented Feb 26, 2024

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@plusvic
Copy link
Member

plusvic commented Feb 26, 2024

This was explicitely changed in #2005. @vthib, could you take a look a this PR?

@vthib
Copy link
Contributor

vthib commented Feb 27, 2024

Yes, as explained in #2005, this change would break the "glibc, x86, LFS not enabled" case.

There is a bit in musl explaining a bit the differences with glibc here: https://wiki.musl-libc.org/faq#Q:-Do-I-need-to-define-%3Ccode%3E_LARGEFILE64_SOURCE%3C/code%3E-to-get-64bit-%3Ccode%3Eoff_t%3C/code%3E?

From what I can gather, we have:

The pread64 function is in posix for support of large files, but musl seems to want to deprecate this. This leaves us in a bit of an weird situation:

  • in the end, musl will only expose pread(..., off_t offset), with off_t=uint64_t. We could fix it right now by defining _LARGEFILE64_SOURCE but they do say they plan to remove this in the future
  • but on glibc x86, using pread(... off_t offset) will break process scanning unless -D_FILE_OFFSET_BITS=64 is set.

As detailed in #2005, the issue isn't so much when compiling yara using autotools (since it should set this properly), but rather when using other tools that compile yara, such as yara-rust. If it breaks, it is completely silent as it will still compile, and it can be very annoying to debug.

I wonder if the best solution couldn't be something like this:

  • Merge this commit, ie move back to using pread. This makes musl work.
  • Add a #define _FILE_OFFSET_BITS=64 in proc/linux.c before including unistd.h, to properly include the 64 bits version of those syscalls. This change avoids depending on an autotools option, makes the code easier to build and more local. This is basically what using pread64 was, but we can no longer do that due to the musl change.
  • Possibly add a static assert checking that sizeof(off_t) == 8, to make sure that it does not compile. This ensures we never get to the previous situation where it compiles properly, but it silently casts the value and breaks process scanning. If the second point is not done, that would allow yara-rust or others to no longer be able to compile yara unless it properly defines _FILE_OFFSET_BITS=64

@orbea
Copy link
Contributor Author

orbea commented Feb 27, 2024

Maybe I'm missing something, but shouldn't AC_SYS_LARGEFILE already add the required defines to config.h?

@orbea
Copy link
Contributor Author

orbea commented Feb 27, 2024

Maybe I'm missing something

Yes, it seems yara doesn't use AC_CONFIG_HEADERS so there is no config.h, but unfortunately adding it is not as trivial as I hoped due to how AC_DEFINE is used makes AC_CONFIG_HEADERS unhappy.

@orbea
Copy link
Contributor Author

orbea commented Feb 27, 2024

I force pushed with two new commits, the first to add a description to AC_DEFINE as shown in the autoconf documentation which seems required with AC_CONFIG_HEADERS and another to add AC_CONFIG_HEADERSso that pread should work on 32-bit systems too since AC_SYS_LARGEFILE is already used.

This needs testing to be sure config.h is not missing anywhere important.

@orbea orbea force-pushed the musl-1.2.4 branch 3 times, most recently from 839d5ee to 7547f5f Compare February 27, 2024 14:06
@vthib
Copy link
Contributor

vthib commented Feb 27, 2024

Adding a config.h does not solve the points i raised about avoiding the use of the 32-bits pread function when autotools is not used.

@plusvic any opinions on the changes i suggested?

@orbea
Copy link
Contributor Author

orbea commented Feb 27, 2024

Adding a config.h does not solve the points i raised about avoiding the use of the 32-bits pread function when autotools is not used.

Is it possible to test it? Check out the documentation for AC_SYS_LARGEFILE.

https://www.gnu.org/software/autoconf/manual/autoconf-2.67/html_node/System-Services.html

— Macro: AC_SYS_LARGEFILE

    Arrange for 64-bit file offsets, known as [large-file support](http://www.unix-systems.org/version2/whatsnew/lfs20mar.html). On some hosts, one must use special compiler options to build programs that can access large files. Append any such options to the output variable CC. Define _FILE_OFFSET_BITS and _LARGE_FILES if necessary.

    Large-file support can be disabled by configuring with the --disable-largefile option.

    If you use this macro, check that your program works even when off_t is wider than long int, since this is common when large-file support is enabled. For example, it is not correct to print an arbitrary off_t value X with printf ("%ld", (long int) X).

    The LFS introduced the fseeko and ftello functions to replace their C counterparts fseek and ftell that do not use off_t. Take care to use AC_FUNC_FSEEKO to make their prototypes available when using them and large-file support is enabled.

These defines get added to config.h which is now included at the start of the relevant file, I think this is accomplishing what you are suggesting?

@vthib
Copy link
Contributor

vthib commented Feb 27, 2024

The issue that I raised is that if autotools are not used, it is very easy to forget to define _FILE_OFFSET_BITS, and this can lead to invalid process scanning.

The config.h is generated by autotools, so this actually reinforces the coupling with autotools. That would force people that do not use autotools (like yara-rust) to create a config.h by hand to still be able to build, and there is still the possibility of forgetting to define _FILE_OFFSET_BITS

@orbea
Copy link
Contributor Author

orbea commented Feb 27, 2024

Thanks for elaborating, I understand your concern now and yes another build system would have to replicate the defines and autoconf tests. I am not certain if setting _FILE_OFFSET_BITS unconditionally is correct? See this meson issue. mesonbuild/meson#3519

Unfortunately this is not something easily solved outside of using autoconf regardless...

@vthib
Copy link
Contributor

vthib commented Feb 27, 2024

Regarding your link, it also leads to this: jarun/nnn@db8b618

And defining it in proc/linux.c would mean this would also impact linux builds, and no other targets. This feels very safe and recommended, both by musl and glibc

@plusvic
Copy link
Member

plusvic commented Feb 28, 2024

I prefer not having to deal with a config.h file, and go with @vthib's proposal.

Starting with musl-1.2.4 all LFS64 APIs have been removed since musl is
always 64-bit and yara now fails with implicit function declarations for
pread64.
@orbea
Copy link
Contributor Author

orbea commented Feb 28, 2024

I prefer not having to deal with a config.h file, and go with @vthib's proposal.

Its your call, I removed the config.h commit and went with @vthib's proposal. I left the AC_DEFINE commit since it might save some debugging in the future in case autoconf ever becomes more pedantic about that, but I can remove it too if you wish?

Copy link
Contributor

@vthib vthib left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the change. There is still an issue but moving the define should be enough.

I still think adding a static assert would be best to add a safety net. _Static_assert is C11 though, i don't know if this is an issue. There are tricks to have static asserts without depending on C11, but I don't know where @plusvic stands on this addition or not

@@ -46,6 +46,8 @@ SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#include <yara/proc.h>
#include <yara/strutils.h>

#define _FILE_OFFSET_BITS 64
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not work, it must be defined before any include for it to have an effect.

This can be confirmed by:

  • removing the AC_SYS_LARGEFILE
  • adding a _Static_assert(sizeof(off_t) == 8);
  • compiling with -m32

the static assert fails, indicating the scanning would be invalid.

That's why i was suggesting adding a static assert, it is very easy to break the fix without realizing it.

@@ -145,7 +145,7 @@ AC_ARG_ENABLE([profiling],
AC_ARG_WITH([debug-verbose],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: Since the _FILE_OFFSET_BITS=64 value is defined, we could remove AC_SYS_LARGEFILE from the configure.ac file now

@alyssais alyssais mentioned this pull request Mar 1, 2024
47 tasks
plusvic added a commit that referenced this pull request Apr 7, 2024
See also PR #2048.
@plusvic
Copy link
Member

plusvic commented Apr 7, 2024

Fixed in 833a580

@plusvic plusvic closed this Apr 7, 2024
@orbea
Copy link
Contributor Author

orbea commented Apr 7, 2024

Thank you! And sorry for forgetting about this PR, I had other things distracting me and lost track of it...

@orbea orbea deleted the musl-1.2.4 branch April 7, 2024 16:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants