forked from ofiwg/libfabric
-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
prov/sharp: Add mocks for SHARP #6
Draft
ldorau
wants to merge
10
commits into
pmem:main
Choose a base branch
from
ldorau:prov-sharp-Add-mocks-for-SHARP
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Draft
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
coll_cq implementation can be reused by other collective providers. Signed-off-by: Tomasz Gromadzki <[email protected]>
…er is available Signed-off-by: Tomasz Gromadzki <[email protected]>
…initialization It is rxm provider responsability to initialize collective offload provider's fabric. Otherwise collective offload functionality will not be available Signed-off-by: Tomasz Gromadzki <[email protected]>
…IDER FI_OFFLOAD_PROVIDER environment variable shall be set to offload provider name to instruct libcabric to setup and use particular provider. Signed-off-by: Tomasz Gromadzki <[email protected]>
Peer provider must create peer_eq for offload provider, to allow offload provider reporting events to peer provider. Signed-off-by: Tomasz Gromadzki <[email protected]>
Offload provider may execute collective operations via util_coll provider. It must call fi_join() operation to get struct mc required for collective operations. It can only call fi_join() on it's peer provider (e.g. rxm). FI_PEER flag is used to inform peer provider to coll fi_join() operation for util_coll_ep Signed-off-by: Tomasz Gromadzki <[email protected]>
offload_coll_mask value is calculated based on the actual offload capabilities confirmed by fi_query_collective(). Signed-off-by: Tomasz Gromadzki <[email protected]>
Signed-off-by: Tomasz Gromadzki <[email protected]>
This patch requires the original SHARP header in /usr/include/mellanox/sharp.h Signed-off-by: Lukasz Dorau <[email protected]>
Signed-off-by: Lukasz Dorau <[email protected]>
grom72
pushed a commit
that referenced
this pull request
Mar 24, 2023
If a posted receive matches with a saved receive, we may need to increment the rx counter. Set the rx counter increment callback to match that of the posted receive. This fixes an assert in xnet_cntr_inc() accessing a NULL cntr_inc function pointer. Program received signal SIGABRT, Aborted. 0x0000155552d4d37f in raise () from /lib64/libc.so.6 #0 0x0000155552d4d37f in raise () from /lib64/libc.so.6 #1 0x0000155552d37db5 in abort () from /lib64/libc.so.6 #2 0x0000155552d37c89 in __assert_fail_base.cold.0 () from /lib64/libc.so.6 #3 0x0000155552d45a76 in __assert_fail () from /lib64/libc.so.6 #4 0x00001555522967f9 in xnet_cntr_inc (ep=0x6e4c70, xfer_entry=0x6f7a30) at prov/tcp/src/xnet_cq.c:347 #5 0x0000155552296836 in xnet_report_cntr_success (ep=0x6e4c70, cq=0x6ca930, xfer_entry=0x6f7a30) at prov/tcp/src/xnet_cq.c:354 #6 0x000015555229970d in xnet_complete_saved (saved_entry=0x6f7a30) at prov/tcp/src/xnet_progress.c:153 #7 0x0000155552299961 in xnet_recv_saved (saved_entry=0x6f7a30, rx_entry=0x6f7840) at prov/tcp/src/xnet_progress.c:188 #8 0x00001555522946f8 in xnet_srx_tag (srx=0x6dd1c0, recv_entry=0x6f7840) at prov/tcp/src/xnet_srx.c:445 #9 0x0000155552294bb1 in xnet_srx_trecv (ep_fid=0x6dd1c0, buf=0x6990c4, len=4, desc=0x0, src_addr=0, tag=21474836494, ignore=3458764513820540928, context=0x7ffffffeb180) at prov/tcp/src/xnet_srx.c:558 ofiwg#10 0x000015555228f60e in fi_trecv (ep=0x6dd1c0, buf=0x6990c4, len=4, desc=0x0, src_addr=0, tag=21474836494, ignore=3458764513820540928, context=0x7ffffffeb180) at ./include/rdma/fi_tagged.h:91 ofiwg#11 0x00001555522900a7 in xnet_rdm_trecv (ep_fid=0x6d9fe0, buf=0x6990c4, len=4, desc=0x0, src_addr=0, tag=21474836494, ignore=3458764513820540928, context=0x7ffffffeb180) at prov/tcp/src/xnet_rdm.c:212 Signed-off-by: Sean Hefty <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This patch requires the original SHARP header in /usr/include/mellanox/sharp.h
Signed-off-by: Lukasz Dorau [email protected]
Requires:
This change is