SDK: OMPT Support (#22)
* Ability to select alternative compiler per file

Implementation of ompt interface to rocprofiler SDK. task_create and task_schedule are not supported.

Misc updates

Update OpenMP target sample

- samples/ompt -> samples/openmp_target
- fix sample test of openmp-target
- reorganize files

Rework OpenMP implementation

Minor OpenMP implementation cleanup

Rename samples/openmp_target CMake targets

Add tests/bin/openmp

- OpenMP target test app in tests/bin/openmp/target

Format samples/openmp_target CMakeLists.txt

Misc lib/rocprofiler-sdk/openmp cleanup

- fix includes
- convert_arg

Update openmp.def.cpp

- tweak includes
- remove lots of temporary variables

Update samples

- common::get_callback_id_names() -> common::get_callback_tracing_names()
- add kernel dispatch, memory copy, scratch memory buffered tracing to openmp target sample

Fix code object operation names

- add "CODE_OBJECT_" prefix

Update include/rocprofiler-sdk/openmp/api_id.h

- remove spurious comment

Miscellaneous openmp updates

- similar API for openmp_begin and openmp_end
- move implementations of ompt callbacks to openmp.cpp
- ompt_{thread_begin,thread_end,parallel_begin,parallel_end}_callbacks are openmp_events

[SWDEV-484495] Fix int truncation in CSV output (#1098)

CSV output truncates doubles to ints when it shouldn't. Derived metrics
are (mostly) doubles and lose precision (or become worthless) if treated
as an int. Converted these to double to match the format we return from
rocprof-sdk.

Co-authored-by: Benjamin Welton <[email protected]>
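The truncation described in the CSV fix above can be illustrated with a minimal sketch. The helper names `format_truncated` and `format_preserved` are hypothetical and are not taken from the rocprofiler-sdk sources; they only contrast writing a derived metric through an integer cast with writing it as a double.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical helper: writes a metric after casting it to an integer,
// the kind of conversion that discards the fractional part.
inline std::string format_truncated(double value)
{
    std::ostringstream os;
    os << static_cast<std::int64_t>(value);
    return os.str();
}

// Hypothetical helper: writes the metric as a double so the fractional
// part survives in the CSV field.
inline std::string format_preserved(double value)
{
    std::ostringstream os;
    os << value;
    return os.str();
}
```

A derived metric such as a ratio below 1 collapses to `0` in the first form, which is why the commit message calls the truncated values worthless.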

Update limit for max counter records in rocprof-tool (#1073)

A fixed-size std::array is used to store counter records in the rocprofiler SDK. This limit was exceeded in SWDEV-484742. Raise the limit to 512 to make hitting it again less likely.
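As a rough illustration of why a fixed-size `std::array` imposes a hard limit, the sketch below shows a fixed-capacity store that rejects records once it is full. The names `counter_record` and `record_store` are assumptions made for this example; the actual record type and storage in rocprof-tool are not shown in this commit.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical record type; the real SDK record layout is not shown here.
struct counter_record
{
    std::uint64_t id;
    double        value;
};

// Fixed-capacity storage backed by std::array. The capacity is a
// compile-time constant, so inserting past it must be rejected explicitly.
template <std::size_t Capacity>
class record_store
{
public:
    // Returns false instead of writing out of bounds when the store is full.
    bool push(const counter_record& rec)
    {
        if(m_size >= Capacity) return false;
        m_data[m_size++] = rec;
        return true;
    }

    std::size_t                  size() const { return m_size; }
    static constexpr std::size_t capacity() { return Capacity; }

private:
    std::array<counter_record, Capacity> m_data{};
    std::size_t                          m_size = 0;
};
```

Raising the template argument (for example `record_store<512>`) only moves the limit; a workload with more records than the capacity still needs the rejection path.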

adding proxy ompt_data_t * arguments

fixes for proxy pointers

- Implement proxy ompt_data_t* pointers for clients
- Add ompt_data_t* arguments back to callback API
- Modify openmp sample to illustrate use of proxy pointers
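The commit does not show how the proxy pointers are implemented. One possible reading, sketched below under that assumption, is that the layer keeps the runtime-owned slot for itself and hands each client a separate slot of the same shape. `data_t` is a local stand-in with the same layout as `ompt_data_t` (a union of a 64-bit value and a pointer), and `proxy_table` is a hypothetical name.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <unordered_map>

// Local stand-in with the same shape as ompt_data_t, defined here so the
// sketch compiles without omp-tools.h.
union data_t
{
    std::uint64_t value;
    void*         ptr;
};

// Hypothetical table mapping each runtime-owned slot to a proxy slot that
// is handed to the client. Not thread-safe; a real tool would need locking.
class proxy_table
{
public:
    // Returns the proxy for a runtime slot, creating a zeroed one on first use.
    data_t* get(const data_t* runtime_slot)
    {
        auto& entry = m_proxies[runtime_slot];
        if(!entry)
        {
            entry        = std::make_unique<data_t>();
            entry->value = 0;
        }
        return entry.get();
    }

    // Drops the proxy when the runtime object (e.g. a parallel region) ends.
    void release(const data_t* runtime_slot) { m_proxies.erase(runtime_slot); }

    std::size_t size() const { return m_proxies.size(); }

private:
    std::unordered_map<const data_t*, std::unique_ptr<data_t>> m_proxies;
};
```

With this arrangement a client can store its own value through the proxy without overwriting whatever the layer keeps in the runtime-owned slot.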

formatting

SWDEV-467350: Skipping tool counter iteration for unsupported hardware (#1083)

Fixing some accumulate metrics (#1089)

* Fixing some accumulate metrics

* Fixing some more accumulate metrics

---------

Co-authored-by: Benjamin Welton <[email protected]>

updating rocprofv3 help options (#1113)

* updating rocprofv3 help options

* updating CHANGELOG

Fixing installed package tests in CI (#1119)

* Fixing installed package tests in CI

* Formatted rocprofv3.py with black formatter

SWDEV-488948: PC Sampling - Correlation class to provide some thread safety. Adding multithread tests. (#1112)

* SWDEV-488948: PC Sampling - Correlation class to provide some thread safety. Adding multithread tests.

* Update source/lib/rocprofiler-sdk/pc_sampling/parser/correlation.hpp

Co-authored-by: Vladimir Indic <[email protected]>

* Update source/lib/rocprofiler-sdk/pc_sampling/parser/correlation.hpp

Co-authored-by: Vladimir Indic <[email protected]>

* Adding backlog for codeobj changes

* Formatting

* Update source/lib/rocprofiler-sdk/pc_sampling/code_object.hpp

Co-authored-by: Vladimir Indic <[email protected]>

* Update source/lib/rocprofiler-sdk/pc_sampling/code_object.hpp

Co-authored-by: Vladimir Indic <[email protected]>

---------

Co-authored-by: Vladimir Indic <[email protected]>

SWDEV-487621: Fixes for metric definitions (#1118)

* Fixes for metric definitions

* Removing gfx8

* Update changelog

* Fixing unit tests

* Small fixes

* Fix for write size

Fix PSDB change (#1120)

Reverts change to `source/include/rocprofiler-sdk/callback_tracing.h`
from commit 9b2ece7

clang-18 build fix for RCCL (#1123)

Removes ambiguity on const usage, which clang-18 complains about
(preventing build with warn error).

mem copy direction field update (#1124)

Adding Node-id for debugging with log level trace (#1090)

fix botched rebase

Per Jonathan to remove -rdynamic warning so CI will continue

pedantic formatting

Correct the package name of rocprofiler-sdk (#1126)

* Correct the package name of rocprofiler-sdk

The ROCm version (for example, 60300) was missing from the package name
and has been added.

* Use cmake cache string while setting the variable for ROCm Version

* correct the cmake-format

---------

Co-authored-by: Ranjith Ramakrishnan <[email protected]>

Fixing kokkosp tool library packaging (#1121)

* Fixing kokkosp tool library packaging

* Update source/lib/rocprofiler-sdk-tool/kokkosp/CMakeLists.txt

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Update CMakeLists.txt

* Update CMakeLists.txt

* Component Requirement in CPack

* Adding package dependency

* Update CMakeLists.txt

* Update rocprofiler_config_packaging.cmake

* Fix rocprofiler-sdk-tool-kokkosp BUILD/INSTALL RPATH

- CMAKE_INSTALL_LIBDIR doesn't help

* Add BUILD/INSTALL RPATH to rocprofv3-trigger-list-metrics

- fixes packaging issues

* Update packaging

- core depends on rocprofiler-sdk-roctx
- add CPACK_DEBIAN_PACKAGE_SHLIBDEPS_PRIVATE_DIRS to resolve inter-package dependencies

* Fix package depends version format

* Improve tests/rocprofv3/summary/validate logging

* Update CI workflow

- prioritize roctx package in Install Packages step

* Remove setting <package-name>_VERSION in config.cmake.in

- this is automatically handled by existence of <package-name>-config-version.cmake

* Update rocprofiler-sdk-config.cmake

- relax find_package versioning requirements to same major and minor version

* Update rocprofiler-sdk-config.cmake

- relax find_package versioning requirements (remove EXACT, specify range)

* Tweak CI workflow

* Update perfetto_reader.py

- better handle failure to load trace processor

* Misc cleanup for config packaging

* Update config packaging

* Update config packaging

* Revert perfetto for core-rpm packages

* Revert perfetto for core-rpm packages

- perfetto < 0.9.0

* Tweak tests/rocprofv3/summary/validate.py

- reorder some checks

---------

Co-authored-by: Ammar Elwazir <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Jonathan R. Madsen <[email protected]>

Clang Warning Fixes (#1131)

Builds prevented on clang-18

Adding start and end timestamp columns in csv (#1128)

* Adding start and end timestamp columns in csv

* Adding assert check for the counter timestamps

---------

Co-authored-by: Gopesh Bhardwaj <[email protected]>

rocprofv3: docs and help menu updates (#1129)

* doc updates

* Correcting ROCtx information

* Making ROCTx string consistent

* missing occurrence

Renamed agent profiling service to device counting service (#1132)

* Renamed agent profiling service to device counting service

Name more aptly represents what agent profiling did (device wide
counter collection). Conversion of existing user code can be
performed by the following find/sed command:

find . -type f -exec sed -i 's/rocprofiler_agent_profile_callback_t/rocprofiler_device_counting_service_callback_t/g; s/rocprofiler_configure_agent_profile_counting_service/rocprofiler_configure_device_counting_service/g; s/agent_profile.h/device_counting_service.h/g; s/rocprofiler_sample_agent_profile_counting_service/rocprofiler_sample_device_counting_service/g' {} +

* Converted dispatch profile to dispatch counting service

* Debug for functional counters test

* Minor changes for CI

* Minor fix

* More fixes for CI

* Update evaluate_ast.cpp

---------

Co-authored-by: Benjamin Welton <[email protected]>
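The four substitutions in the find/sed command from the rename commit above can be sketched as a plain string transformation. `replace_all` and `apply_renames` are hypothetical helper names. Unlike sed, which treats the `.` in `agent_profile.h` as "any character", this sketch matches the header name literally.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Replaces every occurrence of `from` with `to`, scanning left to right.
inline void replace_all(std::string& text, const std::string& from, const std::string& to)
{
    for(std::size_t pos = text.find(from); pos != std::string::npos;
        pos             = text.find(from, pos + to.size()))
    {
        text.replace(pos, from.size(), to);
    }
}

// Applies the four substitutions from the find/sed command, in the same order.
inline std::string apply_renames(std::string text)
{
    static const std::pair<std::string, std::string> renames[] = {
        {"rocprofiler_agent_profile_callback_t",
         "rocprofiler_device_counting_service_callback_t"},
        {"rocprofiler_configure_agent_profile_counting_service",
         "rocprofiler_configure_device_counting_service"},
        {"agent_profile.h", "device_counting_service.h"},
        {"rocprofiler_sample_agent_profile_counting_service",
         "rocprofiler_sample_device_counting_service"},
    };

    for(const auto& itr : renames)
        replace_all(text, itr.first, itr.second);
    return text;
}
```

For example, `apply_renames("#include <rocprofiler-sdk/agent_profile.h>")` yields the include line for `device_counting_service.h`.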

Testing updated RPM dockers (#1136)

* Testing updated RPM dockers

* Trying to fix PSDB for test package dependency

Agent Profiling Fixes for Broken/Improper API Usage (#1122)

Prevents multiple setups of agent profiling on the same agent.

Fixes agent read context to only read agents that were set up.

Prevent copy of agent profiling internal data struct and reset
hsa_signal on move to prevent inadvertent delete.
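The "prevent copy, reset on move" pattern described above can be sketched as a move-only owner. This is a generic illustration, not the SDK's code: `signal_handle`, `destroy_signal`, and `signal_owner` are hypothetical stand-ins used so the example builds without the HSA headers, and a zero handle is assumed to mean "no signal".

```cpp
#include <utility>

// Hypothetical stand-in for a signal handle; zero means "no signal".
struct signal_handle
{
    unsigned long long value;
};

// Counts destroy calls so the effect of the move can be observed.
inline int& destroy_count()
{
    static int count = 0;
    return count;
}

inline void destroy_signal(signal_handle) { ++destroy_count(); }

class signal_owner
{
public:
    explicit signal_owner(signal_handle h) : m_handle(h) {}
    ~signal_owner() { reset(); }

    // Copying is disabled: two owners of one handle would destroy it twice.
    signal_owner(const signal_owner&) = delete;
    signal_owner& operator=(const signal_owner&) = delete;

    // Moving transfers the handle and clears the source so that the
    // source's destructor does not destroy a handle it no longer owns.
    signal_owner(signal_owner&& rhs) noexcept : m_handle(rhs.m_handle) { rhs.m_handle = {}; }

    signal_owner& operator=(signal_owner&& rhs) noexcept
    {
        if(this != &rhs)
        {
            reset();
            m_handle     = rhs.m_handle;
            rhs.m_handle = {};
        }
        return *this;
    }

    bool valid() const { return m_handle.value != 0; }

private:
    void reset()
    {
        if(valid()) destroy_signal(m_handle);
        m_handle = {};
    }

    signal_handle m_handle{};
};
```

Without the reset in the move constructor, both the moved-from and moved-to objects would call `destroy_signal` on the same handle.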

Simplifying PR template (#1139)

delete unused files

added arguments to some OMPT buffer records

* Fix cmake issues

Remove rocprofiler_ompt_finalize_tool

- a public API function is not necessary: should just finalize rocprofiler-sdk

Fix duplicate ROCPROFILER_{BUFFER,CALLBACK}_TRACING_KIND_STRING

Add lib/rocprofiler-sdk/ompt.hpp

- declares rocprofiler::sdk::finalize_ompt

Remove change to tests/rocprofv3/summary/conftest.py

Add set_fini_status(1) back to registration.cpp

Deleted unneeded files

Incorporate OpenMP code and sample

Fix merge issues with amd-staging

Add push_correlation_id for OpenMP tasking; improve debuggability

fixup bad merge

* Suppress OpenMP data race

* Fix openmp_target sample

* Enum and struct name changes + source code reorg

- remove mix of ompt and openmp
  - opted for ompt
- changes made for consistency
  - ompt_api -> ompt
  - openmp_api -> ompt
  - OPENMP -> OMPT

* Update tests and more renaming

- dest_device_num -> dst_device_num
- src_addr -> src_address
- dest_addr -> dst_address
- remove info_type::begin
- require OMP_TARGET_OFFLOAD

* Update openmp-target test/sample env and labels

* Formatting

* Tweaks to cmake for openmp target

- Disable for thread sanitizers due to preloading issue

* OpenMP target cmake updates

- remove gfx1010 (fails on mi300)
- OPENMP_GPU_TARGETS

* Remove device_unload and target_map_emi support

- these are never supported by AMD OpenMP compilers

* Update CI workflow

- exclude openmp-target tests from navi3 and vega20

---------

Co-authored-by: Larry Meadows <[email protected]>
Co-authored-by: Jonathan R. Madsen <[email protected]>
3 people authored Dec 6, 2024
1 parent a579c70 commit 00c46fd
Showing 54 changed files with 4,258 additions and 500 deletions.
86 changes: 44 additions & 42 deletions .github/workflows/continuous_integration.yml
@@ -23,21 +23,27 @@ env:
  ROCM_PATH: "/opt/rocm"
  GPU_TARGETS: "gfx900 gfx906 gfx908 gfx90a gfx940 gfx941 gfx942 gfx1030 gfx1100 gfx1101 gfx1102"
  PATH: "/usr/bin:$PATH"
- TEMP_EXCLUDED_TESTS: "test-page-migration|app-abort"
- EXCLUDED_TESTS: ".*pc-sampling.*|.*pc_sampling.*"
+ navi3_EXCLUDE_TESTS_REGEX: "^(test-page-migration-(execute|validate)|rocprofv3-test-(execute|validate)-app-abort)$"
+ vega20_EXCLUDE_TESTS_REGEX: "^(test-page-migration-(execute|validate)|rocprofv3-test-(execute|validate)-app-abort)$"
+ mi200_EXCLUDE_TESTS_REGEX: "^(test-page-migration-(execute|validate)|rocprofv3-test-(execute|validate)-app-abort)$"
+ mi300_EXCLUDE_TESTS_REGEX: "^(test-page-migration-(execute|validate)|rocprofv3-test-(execute|validate)-app-abort)$"
+ navi3_EXCLUDE_LABEL_REGEX: "^(pc-sampling|openmp-target)$"
+ vega20_EXCLUDE_LABEL_REGEX: "^(pc-sampling|openmp-target)$"
+ mi200_EXCLUDE_LABEL_REGEX: "^(openmp-target)$"
+ mi300_EXCLUDE_LABEL_REGEX: "^(pc-sampling)$"

  jobs:
  core-deb:
  # See: https://docs.github.com/en/free-pro-team@latest/actions/learn-github-actions/managing-complex-workflows#using-a-build-matrix
  strategy:
  fail-fast: false
  matrix:
- runner: ['vega20-emu', 'mi300-emu']
+ runner: ['vega20', 'mi300']
  os: ['ubuntu-22.04']
  build-type: ['RelWithDebInfo']
  ci-flags: ['--linter clang-tidy']

- runs-on: ${{ matrix.runner }}-runner-set
+ runs-on: ${{ matrix.runner }}-emu-runner-set

  # define this for containers
  env:
# define this for containers
env:
@@ -72,7 +78,6 @@ jobs:
  if: ${{ contains(matrix.runner, 'mi200') }}
  shell: bash
  run: |
- echo "EXCLUDED_TESTS=''" >> $GITHUB_ENV
  echo 'ROCPROFILER_PC_SAMPLING_BETA_ENABLED=1' >> $GITHUB_ENV
  - name: Configure, Build, and Test
@@ -94,8 +99,8 @@ jobs:
  -DCPACK_PACKAGING_INSTALL_PREFIX="$(realpath /opt/rocm)"
  -DPython3_EXECUTABLE=$(which python3)
  --
- -LE "${EXCLUDED_TESTS}"
- -E "${{ env.TEMP_EXCLUDED_TESTS }}"
+ -LE "${${{ matrix.runner }}_EXCLUDE_LABEL_REGEX}"
+ -E "${${{ matrix.runner }}_EXCLUDE_TESTS_REGEX}"

  - name: Install
  if: ${{ contains(matrix.runner, env.CORE_EXT_RUNNER) }}
@@ -119,8 +124,8 @@ jobs:
  export LD_LIBRARY_PATH=/opt/rocprofiler-sdk/lib:${LD_LIBRARY_PATH}
  cmake --build build-samples --target all --parallel 16
  cmake --build build-tests --target all --parallel 16
- ctest --test-dir build-samples -LE "${EXCLUDED_TESTS}" -E "${{ env.TEMP_EXCLUDED_TESTS }}" --output-on-failure
- ctest --test-dir build-tests -LE "${EXCLUDED_TESTS}" -E "${{ env.TEMP_EXCLUDED_TESTS }}" --output-on-failure
+ ctest --test-dir build-samples -LE "${${{ matrix.runner }}_EXCLUDE_LABEL_REGEX}" -E "${${{ matrix.runner }}_EXCLUDE_TESTS_REGEX}" --output-on-failure
+ ctest --test-dir build-tests -LE "${${{ matrix.runner }}_EXCLUDE_LABEL_REGEX}" -E "${${{ matrix.runner }}_EXCLUDE_TESTS_REGEX}" --output-on-failure
  - name: Install Packages
  if: ${{ contains(matrix.runner, env.CORE_EXT_RUNNER) }}
@@ -142,8 +147,8 @@ jobs:
  CMAKE_PREFIX_PATH=/opt/rocm cmake -B build-tests-deb /opt/rocm/share/rocprofiler-sdk/tests
  cmake --build build-samples-deb --target all --parallel 16
  cmake --build build-tests-deb --target all --parallel 16
- ctest --test-dir build-samples-deb -LE "${EXCLUDED_TESTS}" -E "${{ env.TEMP_EXCLUDED_TESTS }}" --output-on-failure
- ctest --test-dir build-tests-deb -LE "${EXCLUDED_TESTS}" -E "${{ env.TEMP_EXCLUDED_TESTS }}" --output-on-failure
+ ctest --test-dir build-samples-deb -LE "${${{ matrix.runner }}_EXCLUDE_LABEL_REGEX}" -E "${${{ matrix.runner }}_EXCLUDE_TESTS_REGEX}" --output-on-failure
+ ctest --test-dir build-tests-deb -LE "${${{ matrix.runner }}_EXCLUDE_LABEL_REGEX}" -E "${${{ matrix.runner }}_EXCLUDE_TESTS_REGEX}" --output-on-failure
  - name: Archive production artifacts
  if: ${{ contains(matrix.runner, env.CORE_EXT_RUNNER) }}
@@ -160,7 +165,7 @@ jobs:
  strategy:
  fail-fast: false
  matrix:
- runner: ['mi300-emu']
+ runner: ['mi300']
  os: ['rhel-emu', 'sles-emu']
  build-type: ['RelWithDebInfo']
  ci-flags: ['--linter clang-tidy']
@@ -196,7 +201,6 @@ jobs:
  if: ${{ contains(matrix.runner, 'mi200') }}
  shell: bash
  run: |
- echo "EXCLUDED_TESTS=''" >> $GITHUB_ENV
  echo 'ROCPROFILER_PC_SAMPLING_BETA_ENABLED=1' >> $GITHUB_ENV
  - name: Configure, Build, and Test
@@ -218,8 +222,8 @@ jobs:
  -DCPACK_PACKAGING_INSTALL_PREFIX="$(realpath /opt/rocm)"
  -DPython3_EXECUTABLE=$(which python3)
  --
- -LE "${EXCLUDED_TESTS}"
- -E "${{ env.TEMP_EXCLUDED_TESTS }}"
+ -LE "${${{ matrix.runner }}_EXCLUDE_LABEL_REGEX}"
+ -E "${${{ matrix.runner }}_EXCLUDE_TESTS_REGEX}"

  - name: Install
  if: ${{ contains(matrix.runner, env.CORE_EXT_RUNNER) }}
@@ -243,8 +247,8 @@ jobs:
  export LD_LIBRARY_PATH=/opt/rocprofiler-sdk/lib:${LD_LIBRARY_PATH}
  cmake --build build-samples --target all --parallel 16
  cmake --build build-tests --target all --parallel 16
- ctest --test-dir build-samples -LE "${EXCLUDED_TESTS}" -E "${{ env.TEMP_EXCLUDED_TESTS }}" --output-on-failure
- ctest --test-dir build-tests -LE "${EXCLUDED_TESTS}" -E "${{ env.TEMP_EXCLUDED_TESTS }}" --output-on-failure
+ ctest --test-dir build-samples -LE "${${{ matrix.runner }}_EXCLUDE_LABEL_REGEX}" -E "${${{ matrix.runner }}_EXCLUDE_TESTS_REGEX}" --output-on-failure
+ ctest --test-dir build-tests -LE "${${{ matrix.runner }}_EXCLUDE_LABEL_REGEX}" -E "${${{ matrix.runner }}_EXCLUDE_TESTS_REGEX}" --output-on-failure
  - name: Install Packages
  if: ${{ contains(matrix.runner, env.CORE_EXT_RUNNER) }}
@@ -266,8 +270,8 @@ jobs:
  CMAKE_PREFIX_PATH=/opt/rocm cmake -B build-tests-deb /opt/rocm/share/rocprofiler-sdk/tests
  cmake --build build-samples-deb --target all --parallel 16
  cmake --build build-tests-deb --target all --parallel 16
- ctest --test-dir build-samples-deb -LE "${EXCLUDED_TESTS}" -E "${{ env.TEMP_EXCLUDED_TESTS }}" --output-on-failure
- ctest --test-dir build-tests-deb -LE "${EXCLUDED_TESTS}" -E "${{ env.TEMP_EXCLUDED_TESTS }}" --output-on-failure
+ ctest --test-dir build-samples-deb -LE "${${{ matrix.runner }}_EXCLUDE_LABEL_REGEX}" -E "${${{ matrix.runner }}_EXCLUDE_TESTS_REGEX}" --output-on-failure
+ ctest --test-dir build-tests-deb -LE "${${{ matrix.runner }}_EXCLUDE_LABEL_REGEX}" -E "${${{ matrix.runner }}_EXCLUDE_TESTS_REGEX}" --output-on-failure
  - name: Archive production artifacts
  if: ${{ contains(matrix.runner, env.CORE_EXT_RUNNER) }}
@@ -284,11 +288,11 @@ jobs:
  strategy:
  # fail-fast: false
  matrix:
- runner: ['mi200-emu']
+ runner: ['mi200']
  os: ['ubuntu-22.04']
  build-type: ['Release']

- runs-on: ${{ matrix.runner }}-runner-set
+ runs-on: ${{ matrix.runner }}-emu-runner-set

  # define this for containers
  env:
@@ -361,7 +365,6 @@ jobs:
  if: ${{ contains(matrix.runner, 'mi200') }}
  shell: bash
  run: |
- echo "EXCLUDED_TESTS=''" >> $GITHUB_ENV
  echo 'ROCPROFILER_PC_SAMPLING_BETA_ENABLED=1' >> $GITHUB_ENV
  - name: Configure, Build, and Test (Total Code Coverage)
@@ -379,8 +382,8 @@ jobs:
  -DCMAKE_BUILD_TYPE=${{ matrix.build-type }}
  -DPython3_EXECUTABLE=$(which python3)
  --
- -LE "${EXCLUDED_TESTS}"
- -E "${{ env.TEMP_EXCLUDED_TESTS }}"
+ -LE "${${{ matrix.runner }}_EXCLUDE_LABEL_REGEX}"
+ -E "${${{ matrix.runner }}_EXCLUDE_TESTS_REGEX}"

  - name: Configure, Build, and Test (Tests Code Coverage)
  timeout-minutes: 30
@@ -398,8 +401,8 @@ jobs:
  -DCMAKE_BUILD_TYPE=${{ matrix.build-type }}
  -DPython3_EXECUTABLE=$(which python3)
  --
- -LE "${EXCLUDED_TESTS}"
- -E "${{ env.TEMP_EXCLUDED_TESTS }}"
+ -LE "${${{ matrix.runner }}_EXCLUDE_LABEL_REGEX}"
+ -E "${${{ matrix.runner }}_EXCLUDE_TESTS_REGEX}"

  - name: Configure, Build, and Test (Samples Code Coverage)
  timeout-minutes: 30
@@ -417,8 +420,8 @@ jobs:
  -DCMAKE_BUILD_TYPE=${{ matrix.build-type }}
  -DPython3_EXECUTABLE=$(which python3)
  --
- -LE "${EXCLUDED_TESTS}"
- -E "${{ env.TEMP_EXCLUDED_TESTS }}"
+ -LE "${${{ matrix.runner }}_EXCLUDE_LABEL_REGEX}"
+ -E "${${{ matrix.runner }}_EXCLUDE_TESTS_REGEX}"

  - name: Save XML Code Coverage
  id: save-coverage
@@ -536,7 +539,7 @@ jobs:
  # - unittests
  # - integration-tests
  #
- ctest -N -LE 'samples|tests' -E "${{ env.TEMP_EXCLUDED_TESTS }}" -O ctest.mislabeled.log
+ ctest -N -LE 'samples|tests' -E "${${{ matrix.runner }}_EXCLUDE_TESTS_REGEX}" -O ctest.mislabeled.log
  grep 'Total Tests: 0' ctest.mislabeled.log
  #
  # if following fails, then there is overlap between the labels.
@@ -561,22 +564,22 @@ jobs:
  strategy:
  fail-fast: false
  matrix:
- runner: ['vega20-emu', 'navi3-emu', 'mi300-emu']
+ runner: ['vega20', 'navi3', 'mi300']
  sanitizer: ['AddressSanitizer', 'ThreadSanitizer', 'LeakSanitizer', 'UndefinedBehaviorSanitizer']
  os: ['ubuntu-22.04']
  build-type: ['RelWithDebInfo']
  exclude:
- - { runner: 'vega20-emu', sanitizer: 'ThreadSanitizer' }
- - { runner: 'vega20-emu', sanitizer: 'AddressSanitizer' }
- - { runner: 'vega20-emu', sanitizer: 'UndefinedBehaviorSanitizer' }
- - { runner: 'mi300-emu', sanitizer: 'AddressSanitizer' }
- - { runner: 'mi300-emu', sanitizer: 'LeakSanitizer' }
- - { runner: 'navi3-emu', sanitizer: 'LeakSanitizer' }
- - { runner: 'navi3-emu', sanitizer: 'ThreadSanitizer' }
- - { runner: 'navi3-emu', sanitizer: 'UndefinedBehaviorSanitizer' }
+ - { runner: 'vega20', sanitizer: 'ThreadSanitizer' }
+ - { runner: 'vega20', sanitizer: 'AddressSanitizer' }
+ - { runner: 'vega20', sanitizer: 'UndefinedBehaviorSanitizer' }
+ - { runner: 'mi300', sanitizer: 'AddressSanitizer' }
+ - { runner: 'mi300', sanitizer: 'LeakSanitizer' }
+ - { runner: 'navi3', sanitizer: 'LeakSanitizer' }
+ - { runner: 'navi3', sanitizer: 'ThreadSanitizer' }
+ - { runner: 'navi3', sanitizer: 'UndefinedBehaviorSanitizer' }

  if: ${{ contains(github.event_name, 'pull_request') }}
- runs-on: ${{ matrix.runner }}-runner-set
+ runs-on: ${{ matrix.runner }}-emu-runner-set

  # define this for containers
  env:
@@ -611,7 +614,6 @@ jobs:
  if: ${{ contains(matrix.runner, 'mi200') }}
  shell: bash
  run: |
- echo "EXCLUDED_TESTS=''" >> $GITHUB_ENV
  echo 'ROCPROFILER_PC_SAMPLING_BETA_ENABLED=1' >> $GITHUB_ENV
  - name: Configure, Build, and Test
@@ -630,5 +632,5 @@ jobs:
  -DCMAKE_INSTALL_PREFIX="${{ env.ROCM_PATH }}"
  -DPython3_EXECUTABLE=$(which python3)
  --
- -LE "${EXCLUDED_TESTS}"
- -E "${{ env.TEMP_EXCLUDED_TESTS }}"
+ -LE "${${{ matrix.runner }}_EXCLUDE_LABEL_REGEX}"
+ -E "${${{ matrix.runner }}_EXCLUDE_TESTS_REGEX}"
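The workflow change above replaces one global exclusion list with per-runner patterns: once GitHub Actions substitutes `matrix.runner`, the shell sees a variable name such as `mi300_EXCLUDE_LABEL_REGEX`. A minimal sketch of that lookup follows, with the label patterns copied from the `env` block. The function names are hypothetical, and `std::regex` is used only as a stand-in for CTest's own regular-expression matching, which these simple anchored patterns should not distinguish.

```cpp
#include <map>
#include <regex>
#include <string>

// Label exclusion patterns keyed by runner, copied from the workflow's env block.
inline const std::map<std::string, std::string>& exclude_label_regex()
{
    static const std::map<std::string, std::string> table = {
        {"navi3", "^(pc-sampling|openmp-target)$"},
        {"vega20", "^(pc-sampling|openmp-target)$"},
        {"mi200", "^(openmp-target)$"},
        {"mi300", "^(pc-sampling)$"},
    };
    return table;
}

// Returns true when a test carrying `label` would be excluded on `runner`.
// Unknown runners exclude nothing in this sketch.
inline bool is_label_excluded(const std::string& runner, const std::string& label)
{
    const auto& table = exclude_label_regex();
    auto        itr   = table.find(runner);
    if(itr == table.end()) return false;
    return std::regex_search(label, std::regex{itr->second});
}
```

Under these patterns, tests labeled `openmp-target` still run on mi300 while those labeled `pc-sampling` do not, and the reverse holds on mi200.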
1 change: 0 additions & 1 deletion cmake/rocprofiler_config_install_roctx.cmake
@@ -60,7 +60,6 @@ set(${PACKAGE_NAME}_BUILD_TREE
  set(PROJECT_BUILD_TREE_TARGETS
  ${SDK_PACKAGE_NAME}::${PACKAGE_NAME}-shared-library
  ${SDK_PACKAGE_NAME}::${SDK_PACKAGE_NAME}-headers
- ${SDK_PACKAGE_NAME}::${SDK_PACKAGE_NAME}-build-flags
  ${SDK_PACKAGE_NAME}::${SDK_PACKAGE_NAME}-stack-protector)

  configure_file(
1 change: 1 addition & 0 deletions samples/CMakeLists.txt
@@ -33,3 +33,4 @@ add_subdirectory(code_object_isa_decode)
  add_subdirectory(advanced_thread_trace)
  add_subdirectory(external_correlation_id_request)
  add_subdirectory(pc_sampling)
+ add_subdirectory(openmp_target)
2 changes: 1 addition & 1 deletion samples/api_callback_tracing/client.cpp
@@ -189,7 +189,7 @@ tool_init(rocprofiler_client_finalize_t fini_func, void* tool_data)

  call_stack_v->emplace_back(source_location{__FUNCTION__, __FILE__, __LINE__, ""});

- callback_name_info name_info = common::get_callback_id_names();
+ callback_name_info name_info = common::get_callback_tracing_names();

  for(const auto& itr : name_info)
  {
2 changes: 1 addition & 1 deletion samples/common/name_info.hpp
@@ -48,7 +48,7 @@ get_buffer_tracing_names()
  }

  inline auto
- get_callback_id_names()
+ get_callback_tracing_names()
  {
  return rocprofiler::sdk::get_callback_tracing_names();
  }
95 changes: 95 additions & 0 deletions samples/openmp_target/CMakeLists.txt
@@ -0,0 +1,95 @@
#
#
#
cmake_minimum_required(VERSION 3.21.0 FATAL_ERROR)

if(NOT OMP_TARGET_COMPILER)
find_program(
amdclangpp_EXECUTABLE
NAMES amdclang++
HINTS ${ROCM_PATH} ENV ROCM_PATH /opt/rocm
PATHS ${ROCM_PATH} ENV ROCM_PATH /opt/rocm
PATH_SUFFIXES bin llvm/bin NO_CACHE)
mark_as_advanced(amdclangpp_EXECUTABLE)

if(amdclangpp_EXECUTABLE)
set(OMP_TARGET_COMPILER
"${amdclangpp_EXECUTABLE}"
CACHE FILEPATH "")
endif()
endif()

project(rocprofiler-sdk-samples-openmp-target LANGUAGES CXX)

find_package(rocprofiler-sdk REQUIRED)

add_library(openmp-target-sample-client SHARED)
target_sources(openmp-target-sample-client PRIVATE client.cpp client.hpp)
target_link_libraries(
openmp-target-sample-client
PRIVATE rocprofiler-sdk::rocprofiler-sdk rocprofiler-sdk::samples-build-flags
rocprofiler-sdk::samples-common-library)

set(DEFAULT_GPU_TARGETS
"gfx906"
"gfx908"
"gfx90a"
"gfx940"
"gfx941"
"gfx942"
"gfx1100"
"gfx1101"
"gfx1102")

set(OPENMP_GPU_TARGETS
"${DEFAULT_GPU_TARGETS}"
CACHE STRING "GPU targets to compile for")

if(ROCPROFILER_MEMCHECK STREQUAL "ThreadSanitizer")
set(IS_THREAD_SANITIZER ON)
else()
set(IS_THREAD_SANITIZER OFF)
endif()

find_package(Threads REQUIRED)
find_package(rocprofiler-sdk-roctx REQUIRED)

add_executable(openmp-target-sample)
target_sources(openmp-target-sample PRIVATE main.cpp)
target_link_libraries(
openmp-target-sample PRIVATE Threads::Threads
rocprofiler-sdk-roctx::rocprofiler-sdk-roctx)
target_compile_options(openmp-target-sample PRIVATE -fopenmp)
target_link_options(openmp-target-sample PRIVATE -fopenmp)

foreach(_TARGET ${OPENMP_GPU_TARGETS})
target_compile_options(openmp-target-sample PRIVATE --offload-arch=${_TARGET})
target_link_options(openmp-target-sample PRIVATE --offload-arch=${_TARGET})
endforeach()

include(rocprofiler-sdk-custom-compilation)
rocprofiler_sdk_custom_compilation(TARGET openmp-target-sample
COMPILER ${OMP_TARGET_COMPILER})

rocprofiler_samples_get_preload_env(PRELOAD_ENV openmp-target-sample-client)
rocprofiler_samples_get_ld_library_path_env(
LIBRARY_PATH_ENV rocprofiler-sdk-roctx::rocprofiler-sdk-roctx-shared-library)

set(openmp-target-sample-env
${PRELOAD_ENV} ${LIBRARY_PATH_ENV} "OMP_NUM_THREADS=2" "OMP_TARGET_OFFLOAD=mandatory"
"OMP_DISPLAY_ENV=1" "ROCR_VISIBLE_DEVICES=0")

add_test(NAME openmp-target-sample COMMAND $<TARGET_FILE:openmp-target-sample>)

set_tests_properties(
openmp-target-sample
PROPERTIES TIMEOUT
45
LABELS
"samples;openmp-target"
ENVIRONMENT
"${openmp-target-sample-env}"
FAIL_REGULAR_EXPRESSION
"${ROCPROFILER_DEFAULT_FAIL_REGEX}"
DISABLED
"${IS_THREAD_SANITIZER}")
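The sample's `main.cpp` is not shown on this page. For orientation only, the sketch below shows the kind of OpenMP target region that a target built with `-fopenmp` and `--offload-arch=<target>` (the options set in the CMakeLists.txt above) can offload. `offload_add` is a hypothetical function, not the sample's code, and it assumes both inputs have the same length.

```cpp
#include <cstddef>
#include <vector>

// Adds two arrays element-wise inside an OpenMP target region.
// Without OpenMP enabled the pragma is ignored and the loop runs on the host.
inline std::vector<double> offload_add(const std::vector<double>& a,
                                       const std::vector<double>& b)
{
    const std::size_t   n = a.size();
    std::vector<double> c(n, 0.0);

    // Raw pointers are used because map clauses operate on array sections.
    const double* pa = a.data();
    const double* pb = b.data();
    double*       pc = c.data();

#pragma omp target teams distribute parallel for map(to : pa[0:n], pb[0:n]) map(from : pc[0:n])
    for(std::size_t i = 0; i < n; ++i)
    {
        pc[i] = pa[i] + pb[i];
    }

    return c;
}
```

With `OMP_TARGET_OFFLOAD=mandatory`, as set in the sample's test environment, the OpenMP runtime is expected to fail rather than fall back to the host when no usable device is available.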