Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-44950: [C++] Bump minimum CMake version to 3.25 #44989

Draft
wants to merge 30 commits into
base: main
Choose a base branch
from

Conversation

raulcd
Copy link
Member

@raulcd raulcd commented Dec 10, 2024

This is currently under development. I am only exercising CI to find out what things would require to be updated
This is still being decided and a decision hasn't been finalized yet.

Rationale for this change

We want to upgrade our CMake version to 3.25 as discussed on the ML:
https://lists.apache.org/thread/h8jp16ktrj11fmjmjhlg6xvkvv9wzvjk

What changes are included in this PR?

TBD

Are these changes tested?

Yes, via CI.

Are there any user-facing changes?

Yes, the minimum CMake version to be used to build Arrow is bumped to 3.25.
This PR includes breaking changes to build systems.

Copy link

⚠️ GitHub issue #44950 has been automatically assigned in GitHub to PR creator.

@raulcd
Copy link
Member Author

raulcd commented Dec 10, 2024

I am unsure on how to fix the remaining failures for R.
For the gcc 12 job I am unsure why it is failing and for the Windows C++ RTools 40 ucrt64 it seems we install CMake from MINGW here but I am not sure if this is necessary or can be updated.
https://github.com/apache/arrow/blob/main/ci/scripts/PKGBUILD#L39

Of course this is the initial CI (we also have to update all the extended CI jobs for crossbow).

@h-vetinari
Copy link
Contributor

For the gcc 12 job I am unsure why it is failing

Updating the CMake lower bound will flip the default of several policies from legacy to new; sounds like you might be relying on legacy behaviour there in some way (once you figure out which policy is at fault, there's usually a migration path to keep the old behaviour)

@raulcd
Copy link
Member Author

raulcd commented Dec 11, 2024

ok, it seems R forces the builds to use the CMake provided on the images:

**** Not using cmake found at /bin/cmake
Error in .make_numeric_version(x, strict, .standard_regexps()$valid_numeric_version) : 
  invalid non-character version specification 'x' (type: double)
Calls: build_libarrow ... as.numeric_version -> numeric_version -> .make_numeric_version
Execution halted

@jonkeane @assignUser @amoeba will this be an issue for CRAN? Are we somehow forced to the CMake version on those images?

root@03abb5b759ba:/# /bin/cmake --version
cmake version 3.22.1

CMake suite maintained and supported by Kitware (kitware.com/cmake).

@assignUser
Copy link
Member

assignUser commented Dec 11, 2024

I went through the logs of our recent checks on cran and only one is using a version < 3.25 and that seems more incidental then purposely as it's the r-odrel arm64 but the intel version has 3.26

So I don't think we should be forced to use that cmake version, additionally we have a function that fetches current cmake if an unsuited version is found but apparently there is an issue with it as seen above. IIRC there was a change to numeric version in one of the las R Versions that is causing this? I'll have a look.

@nealrichardson
Copy link
Member

It's possible the version comparison error is from this: https://github.com/apache/arrow/pull/44989/files#diff-935746c34b16289a07b0d9bf7642dbd268b18059b6187f7cdec7c464be47a3deL731-L743

cmake_version <- function(cmd = "cmake") {
  tryCatch(
    {
      raw_version <- system(paste(cmd, "--version"), intern = TRUE, ignore.stderr = TRUE)
      pat <- ".* ([0-9\\.]+).*?"
      which_line <- grep(pat, raw_version)
      package_version(sub(pat, "\\1", raw_version[which_line]))
    },
    error = function(e) {
      return(0)
    }
  )
}

The error case should probably return("0")

@nealrichardson
Copy link
Member

Two other places in the R nixlibs.R script worth updating:

@pitrou
Copy link
Member

pitrou commented Dec 11, 2024

The error case should probably return("0")

It would probably be more forward-looking to avoid the error entirely. Why does the function fail parsing the CMake version? Can we add cmake --version somewhere in the GH workflow?

@nealrichardson
Copy link
Member

nealrichardson commented Dec 11, 2024

The error case should probably return("0")

It would probably be more forward-looking to avoid the error entirely. Why does the function fail parsing the CMake version? Can we add cmake --version somewhere in the GH workflow?

I could be remembering wrong, but I believe the function is used to check for cmake of a certain version, and this is to be robust to where cmake may not be installed or not found at the path provided. It does not emit an error, it traps it.

This might not be where the error is coming from that was observed in CI, I was just browsing the source to see where you might get a numeric version error. Looking again, and reading the output it produced, I think we're hitting this: https://github.com/apache/arrow/pull/44989/files#diff-935746c34b16289a07b0d9bf7642dbd268b18059b6187f7cdec7c464be47a3deL718

      } else {
        # Keep trying
        lg("Not using cmake found at %s", path, .indent = "****")
        if (found_version > 0) {
          lg("Version >= %s required; found %s", version_required, found_version, .indent = "*****")
        } else {

should be found_version > "0". We must not have been running into this before because the "found_version" was always sufficient if found, and if it wasn't found, we were returning a numeric 0 which works in this comparison.

(To be clear, we need to fix both this and the return("0") above.)

.env Outdated
@@ -54,6 +54,7 @@ UBUNTU=22.04

# Default versions for various dependencies
CLANG_TOOLS=14
CMAKE=3.25.0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible (and also would it be advisable...) to call this CMAKE_VERSION instead? One of the R failures I suspect might be due to us looking to CMAKE for a path to the cmake executable itself at

Sys.getenv("CMAKE"),
(though I don't have a full traceback I'm not 100% certain on that)

Copy link
Member

@jonkeane jonkeane Dec 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aaah I didn't see the previous comments on the PR (sorry, I must have had a tab open and didn't refresh!) but I suspect this here is the answer to:

It would probably be more forward-looking to avoid the error entirely. Why does the function fail parsing the CMake version?

@github-actions github-actions bot added awaiting changes Awaiting changes and removed awaiting committer review Awaiting committer review labels Dec 12, 2024
@github-actions github-actions bot removed the awaiting changes Awaiting changes label Dec 13, 2024
@nealrichardson
Copy link
Member

We support more versions of R than CRAN actively checks on

Do we still want to do that? How is our R support policy decided?

We generally follow the tidyverse version support policy, which is the most recent 5 versions. Many enterprise users of R don't upgrade versions eagerly so they can be stuck on older versions longer than CRAN's testing window.

If there is an easy way to get new enough cmake into the R windows-cpp job so that we can keep building with the R 4.0 toolchain, that would be ideal. I don't see why we couldn't do that.

@nealrichardson
Copy link
Member

One more reference to 3.16 to update: https://github.com/apache/arrow/blob/main/r/DESCRIPTION#L31

@raulcd
Copy link
Member Author

raulcd commented Dec 19, 2024

After a bunch of trial and error, I seem to be stuck with the following issue if I override the CMake version to be used to either a manually 3.25.0 installed or using the latest available (3.31) on msys2:

-- Found hdfs.h at: D:/a/arrow/arrow/cpp/thirdparty/hadoop/include/hdfs.h
CMake Error at C:/rtools40/ucrt64/lib/cmake/AWSSDK/AWSSDKConfig.cmake:119 (while):
  while() given incorrect arguments:

    "NOT" "TEMP_NAME" "STREQUAL"

  Unknown arguments specified
Call Stack (most recent call first):
  cmake_modules/FindAWSSDKAlt.cmake:36 (find_package)
  cmake_modules/ThirdpartyToolchain.cmake:294 (find_package)
  cmake_modules/ThirdpartyToolchain.cmake:5389 (resolve_dependency)
  CMakeLists.txt:546 (include)

I am trying some things, like using an updated aws-sdk-cpp version, but I am unsure where to go from there

@nealrichardson
Copy link
Member

The aws-sdk-cpp it's using is coming from https://github.com/r-windows/rtools-packages/tree/master/mingw-w64-aws-sdk-cpp

If you need to update it, I think you'll have to drop it from the dependencies that get downloaded in our PKGBUILD and make it dependency source=BUNDLED. Doesn't look like we're able to make any updates happen to that repo.

cc @assignUser, in case you wanted to start pushing on the MXE direction again as an alternative 🙃

@amoeba
Copy link
Member

amoeba commented Dec 20, 2024

I spent some time trying to build aws-sdk-cpp under Rtools40 and got stuck building aws-c-io. I put my progress in a gist which includes my steps, a patch for aws-c-common, and my build log for aws-c-io.

@amoeba
Copy link
Member

amoeba commented Dec 23, 2024

Update on fixing the Rtools40 build: I've created a set of patches and the awssdk source build works now so I think the approach you were taking @raulcd will work. I'm running into an issue where the patches I record don't re-apply when I rebuild so I'm working on figuring that out now.

@amoeba
Copy link
Member

amoeba commented Dec 23, 2024

I figured out my issue above and now have what looks like a working set of patches so we can build the AWS SDK under Rtools40. My changes are over on raulcd#90. Still needs some work but hopefully this helps unblock this PR.

@raulcd
Copy link
Member Author

raulcd commented Dec 23, 2024

Thanks @amoeba for the patches! I had to tweak a little but I did make some more progress. The R / AMD64 Windows C++ RTools 40 ucrt64 is now successful with bundled AWS but the related R / AMD64 Windows R release job, which validates the built package, is currently failing when trying to install the new Arrow package. It seems to be missing some symbols on the new bundled libraries. Anyone has an idea?

g++ -shared -s -static-libgcc -o arrow.dll tmp.def RTasks.o altrep.o array.o array_to_vector.o arraydata.o arrowExports.o bridge.o buffer.o chunkedarray.o compression.o compute-exec.o compute.o config.o csv.o dataset.o datatype.o expression.o extension-impl.o feather.o field.o filesystem.o io.o json.o memorypool.o message.o parquet.o r_to_arrow.o recordbatch.o recordbatchreader.o recordbatchwriter.o safe-call-into-r-impl.o scalar.o schema.o symbols.o table.o threadpool.o type_infer.o -L../windows/arrow-18.1.0.9000/lib-13.3.0/x64 -L../windows/arrow-18.1.0.9000/lib/x64-ucrt -larrow_dataset -larrow_acero -lparquet -larrow -larrow_bundled_dependencies -lutf8proc -lthrift -lsnappy -lz -lzstd -llz4 -lbz2 -lbrotlienc -lbrotlidec -lbrotlicommon -lole32 -lbcrypt -lpsapi -lcrypto -lcrypt32 -lre2 -luserenv -lversion -lws2_32 -lbcrypt -lwininet -lwinhttp -lcurl -lnormaliz -lssh2 -lgdi32 -lssl -lcrypto -lcrypt32 -lwldap32 -lz -lws2_32 -lnghttp2 -ldbghelp -LC:/rtools44/x86_64-w64-mingw32.static.posix/lib/x64 -LC:/rtools44/x86_64-w64-mingw32.static.posix/lib -LC:/R/bin/x64 -lR
C:\rtools44\x86_64-w64-mingw32.static.posix\bin/ld.exe: ../windows/arrow-18.1.0.9000/lib/x64-ucrt/libarrow_bundled_dependencies.a(file.c.obj):(.text+0x1110): undefined reference to `__imp_PathFileExistsW'
C:\rtools44\x86_64-w64-mingw32.static.posix\bin/ld.exe: ../windows/arrow-18.1.0.9000/lib/x64-ucrt/libarrow_bundled_dependencies.a(secure_channel_tls_handler.c.obj):(.text+0xcb): undefined reference to `QueryContextAttributesA'
C:\rtools44\x86_64-w64-mingw32.static.posix\bin/ld.exe: ../windows/arrow-18.1.0.9000/lib/x64-ucrt/libarrow_bundled_dependencies.a(secure_channel_tls_handler.c.obj):(.text+0x694): undefined reference to `QueryContextAttributesA'
C:\rtools44\x86_64-w64-mingw32.static.posix\bin/ld.exe: ../windows/arrow-18.1.0.9000/lib/x64-ucrt/libarrow_bundled_dependencies.a(secure_channel_tls_handler.c.obj):(.text+0xb6a): undefined reference to `DecryptMessage'
C:\rtools44\x86_64-w64-mingw32.static.posix\bin/ld.exe: ../windows/arrow-18.1.0.9000/lib/x64-ucrt/libarrow_bundled_dependencies.a(secure_channel_tls_handler.c.obj):(.text+0x100e): undefined reference to `QueryContextAttributesA'
C:\rtools44\x86_64-w64-mingw32.static.posix\bin/ld.exe: ../windows/arrow-18.1.0.9000/lib/x64-ucrt/libarrow_bundled_dependencies.a(secure_channel_tls_handler.c.obj):(.text+0x12bd): undefined reference to `EncryptMessage'
C:\rtools44\x86_64-w64-mingw32.static.posix\bin/ld.exe: ../windows/arrow-18.1.0.9000/lib/x64-ucrt/libarrow_bundled_dependencies.a(secure_channel_tls_handler.c.obj):(.text+0x1869): undefined reference to `__imp_ApplyControlToken'
C:\rtools44\x86_64-w64-mingw32.static.posix\bin/ld.exe: ../windows/arrow-18.1.0.9000/lib/x64-ucrt/libarrow_bundled_dependencies.a(secure_channel_tls_handler.c.obj):(.text+0x192a): undefined reference to `InitializeSecurityContextA'
C:\rtools44\x86_64-w64-mingw32.static.posix\bin/ld.exe: ../windows/arrow-18.1.0.9000/lib/x64-ucrt/libarrow_bundled_dependencies.a(secure_channel_tls_handler.c.obj):(.text+0x1ab3): undefined reference to `__imp_DeleteSecurityContext'
C:\rtools44\x86_64-w64-mingw32.static.posix\bin/ld.exe: ../windows/arrow-18.1.0.9000/lib/x64-ucrt/libarrow_bundled_dependencies.a(secure_channel_tls_handler.c.obj):(.text+0x1ae2): undefined reference to `__imp_DeleteSecurityContext'
C:\rtools44\x86_64-w64-mingw32.static.posix\bin/ld.exe: ../windows/arrow-18.1.0.9000/lib/x64-ucrt/libarrow_bundled_dependencies.a(secure_channel_tls_handler.c.obj):(.text+0x1bc6): undefined reference to `AcquireCredentialsHandleA'
C:\rtools44\x86_64-w64-mingw32.static.posix\bin/ld.exe: ../windows/arrow-18.1.0.9000/lib/x64-ucrt/libarrow_bundled_dependencies.a(secure_channel_tls_handler.c.obj):(.text+0x274b): undefined reference to `__imp_AcceptSecurityContext'
C:\rtools44\x86_64-w64-mingw32.static.posix\bin/ld.exe: ../windows/arrow-18.1.0.9000/lib/x64-ucrt/libarrow_bundled_dependencies.a(secure_channel_tls_handler.c.obj):(.text+0x27c5): undefined reference to `FreeContextBuffer'
C:\rtools44\x86_64-w64-mingw32.static.posix\bin/ld.exe: ../windows/arrow-18.1.0.9000/lib/x64-ucrt/libarrow_bundled_dependencies.a(secure_channel_tls_handler.c.obj):(.text+0x2a0f): undefined reference to `FreeContextBuffer'
C:\rtools44\x86_64-w64-mingw32.static.posix\bin/ld.exe: ../windows/arrow-18.1.0.9000/lib/x64-ucrt/libarrow_bundled_dependencies.a(secure_channel_tls_handler.c.obj):(.text+0x2be0): undefined reference to `InitializeSecurityContextA'
C:\rtools44\x86_64-w64-mingw32.static.posix\bin/ld.exe: ../windows/arrow-18.1.0.9000/lib/x64-ucrt/libarrow_bundled_dependencies.a(secure_channel_tls_handler.c.obj):(.text+0x2c56): undefined reference to `FreeContextBuffer'
C:\rtools44\x86_64-w64-mingw32.static.posix\bin/ld.exe: ../windows/arrow-18.1.0.9000/lib/x64-ucrt/libarrow_bundled_dependencies.a(secure_channel_tls_handler.c.obj):(.text+0x2e59): undefined reference to `FreeContextBuffer'
C:\rtools44\x86_64-w64-mingw32.static.posix\bin/ld.exe: ../windows/arrow-18.1.0.9000/lib/x64-ucrt/libarrow_bundled_dependencies.a(secure_channel_tls_handler.c.obj):(.text+0x2fa2): undefined reference to `__imp_AcceptSecurityContext'
C:\rtools44\x86_64-w64-mingw32.static.posix\bin/ld.exe: ../windows/arrow-18.1.0.9000/lib/x64-ucrt/libarrow_bundled_dependencies.a(secure_channel_tls_handler.c.obj):(.text+0x309b): undefined reference to `FreeContextBuffer'
C:\rtools44\x86_64-w64-mingw32.static.posix\bin/ld.exe: ../windows/arrow-18.1.0.9000/lib/x64-ucrt/libarrow_bundled_dependencies.a(secure_channel_tls_handler.c.obj):(.text+0x31e3): undefined reference to `QueryContextAttributesA'
C:\rtools44\x86_64-w64-mingw32.static.posix\bin/ld.exe: ../windows/arrow-18.1.0.9000/lib/x64-ucrt/libarrow_bundled_dependencies.a(secure_channel_tls_handler.c.obj):(.text+0x342a): undefined reference to `FreeContextBuffer'
C:\rtools44\x86_64-w64-mingw32.static.posix\bin/ld.exe: ../windows/arrow-18.1.0.9000/lib/x64-ucrt/libarrow_bundled_dependencies.a(secure_channel_tls_handler.c.obj):(.text+0x3447): undefined reference to `QueryContextAttributesA'
C:\rtools44\x86_64-w64-mingw32.static.posix\bin/ld.exe: ../windows/arrow-18.1.0.9000/lib/x64-ucrt/libarrow_bundled_dependencies.a(secure_channel_tls_handler.c.obj):(.text+0x381c): undefined reference to `InitializeSecurityContextA'
C:\rtools44\x86_64-w64-mingw32.static.posix\bin/ld.exe: ../windows/arrow-18.1.0.9000/lib/x64-ucrt/libarrow_bundled_dependencies.a(secure_channel_tls_handler.c.obj):(.text+0x3913): undefined reference to `FreeContextBuffer'
C:\rtools44\x86_64-w64-mingw32.static.posix\bin/ld.exe: ../windows/arrow-18.1.0.9000/lib/x64-ucrt/libarrow_bundled_dependencies.a(secure_channel_tls_handler.c.obj):(.text+0x3a9d): undefined reference to `QueryContextAttributesA'
C:\rtools44\x86_64-w64-mingw32.static.posix\bin/ld.exe: ../windows/arrow-18.1.0.9000/lib/x64-ucrt/libarrow_bundled_dependencies.a(secure_channel_tls_handler.c.obj):(.text+0x3c9a): undefined reference to `FreeContextBuffer'
C:\rtools44\x86_64-w64-mingw32.static.posix\bin/ld.exe: ../windows/arrow-18.1.0.9000/lib/x64-ucrt/libarrow_bundled_dependencies.a(secure_channel_tls_handler.c.obj):(.text+0x3cb7): undefined reference to `QueryContextAttributesA'
C:\rtools44\x86_64-w64-mingw32.static.posix\bin/ld.exe: ../windows/arrow-18.1.0.9000/lib/x64-ucrt/libarrow_bundled_dependencies.a(windows_pki_utils.c.obj):(.text+0x1825): undefined reference to `NCryptOpenStorageProvider'
C:\rtools44\x86_64-w64-mingw32.static.posix\bin/ld.exe: ../windows/arrow-18.1.0.9000/lib/x64-ucrt/libarrow_bundled_dependencies.a(windows_pki_utils.c.obj):(.text+0x1884): undefined reference to `NCryptImportKey'
C:\rtools44\x86_64-w64-mingw32.static.posix\bin/ld.exe: ../windows/arrow-18.1.0.9000/lib/x64-ucrt/libarrow_bundled_dependencies.a(windows_pki_utils.c.obj):(.text+0x1bed): undefined reference to `NCryptFreeObject'
C:\rtools44\x86_64-w64-mingw32.static.posix\bin/ld.exe: ../windows/arrow-18.1.0.9000/lib/x64-ucrt/libarrow_bundled_dependencies.a(windows_pki_utils.c.obj):(.text+0x1bf7): undefined reference to `NCryptFreeObject'
collect2.exe: error: ld returned 1 exit status

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants