Commit

Disable PC sampling service if counter collection service is configured (#899)
vlaindic authored Jun 10, 2024
1 parent f5753d3 commit 211ee21
Showing 4 changed files with 394 additions and 2 deletions.
6 changes: 5 additions & 1 deletion source/include/rocprofiler-sdk/pc_sampling.h
@@ -90,6 +90,9 @@ ROCPROFILER_EXTERN_C_INIT
*
* Constraint4: PC sampling feature is not available within the ROCgdb.
*
+ * Constraint5: PC sampling service cannot be used simultaneously with
+ * counter collection service.
+ *
* @param [in] context_id - id of the context used for starting/stopping PC sampling service
* @param [in] agent_id - id of the agent on which caller tries using PC sampling capability
* @param [in] method - the type of PC sampling the caller tries to use on the agent.
@@ -105,7 +108,8 @@ ROCPROFILER_EXTERN_C_INIT
* @retval ::ROCPROFILER_STATUS_ERROR_INCOMPATIBLE_KERNEL the amdgpu driver installed on the system
* does not support the PC sampling feature
* @retval ::ROCPROFILER_STATUS_ERROR a general error caused by the amdgpu driver
- *
+ * @retval ::ROCPROFILER_STATUS_ERROR_CONTEXT_CONFLICT counter collection service already
+ * setup in the context
*/
rocprofiler_status_t ROCPROFILER_API
rocprofiler_configure_pc_sampling_service(rocprofiler_context_id_t context_id,
14 changes: 14 additions & 0 deletions source/lib/rocprofiler-sdk/pc_sampling/service.cpp
@@ -162,6 +162,20 @@ configure_pc_sampling_service(context::context* ctx,
uint64_t interval,
rocprofiler_buffer_id_t buffer_id)
{
+ // FIXME: PC Sampling cannot be used simultaneously with counter collection.
+ // PC sampling requires clock gating to be disabled on MI2xx and MI3xx,
+ // otherwise a weird GPU hang might appear and a machine must be rebooted.
+ // Current implementation of (dispatch) counter collection service assumes disabling
+ // the clock gating before dispatching a kernel and reenabling the clock gating
+ // after kernel completion. Consequently, if PC sampling is active, (dispatch)
+ // counter collection service can enable clock gating and hang might appear.
+ // As a workaround, PC sampling and (dispatch) counter collection service
+ // cannot coexist in the same context.
+ if(ctx->counter_collection || ctx->agent_counter_collection)
+ {
+     return ROCPROFILER_STATUS_ERROR_CONTEXT_CONFLICT;
+ }
+
if(!ctx->pc_sampler)
{
ctx->pc_sampler = std::make_unique<context::pc_sampling_service>();
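
The guard added in this hunk can be illustrated in isolation. The following is a minimal sketch, not the SDK's implementation: `status`, `counter_service`, `pc_sampling_service`, `context`, and `configure_pc_sampling` are hypothetical stand-ins that only mirror the member names visible in the diff (`counter_collection`, `agent_counter_collection`, `pc_sampler`). It covers the same direction as the hunk, namely rejecting PC sampling when a counter collection service is already present in the context.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-ins for the SDK's internal types (assumption: the real
// types are richer; only the fields referenced by the guard are modeled).
enum class status
{
    success,
    error_context_conflict
};

struct counter_service
{};

struct pc_sampling_service
{};

struct context
{
    std::unique_ptr<counter_service>     counter_collection;
    std::unique_ptr<counter_service>     agent_counter_collection;
    std::unique_ptr<pc_sampling_service> pc_sampler;
};

// Mirrors the check in the hunk: refuse to configure PC sampling when either
// counter collection service already exists in this context, and leave the
// context untouched in that case.
status
configure_pc_sampling(context& ctx)
{
    if(ctx.counter_collection || ctx.agent_counter_collection)
    {
        return status::error_context_conflict;
    }

    if(!ctx.pc_sampler)
    {
        ctx.pc_sampler = std::make_unique<pc_sampling_service>();
    }
    return status::success;
}
```

Because the check runs before `pc_sampler` is allocated, a rejected call leaves the context unchanged, so a caller in this sketch can inspect the returned status and decide how to proceed without any cleanup.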
@@ -5,7 +5,7 @@ include(GoogleTest)
set(ROCPROFILER_LIB_PC_SAMPLING_TEST_SOURCES
configure_service.cpp cid_manager.cpp
# samples_processing.cpp
- query_configuration.cpp)
+ pc_sampling_vs_counter_collection.cpp query_configuration.cpp)
set(ROCPROFILER_LIB_PC_SAMPLING_TEST_HEADERS pc_sampling_internals.hpp)

add_executable(pcs-test)