Some questions about how to obtain cqe from strding RQ. #1

Open
Lcc-code opened this issue Jun 30, 2024 · 4 comments
Labels: good first issue, question

Comments

Lcc-code commented Jun 30, 2024

Hello, I'm sorry to bother you.

I am a student from China and recently saw your code. I would like to ask you a question:

I saw that you previously tried to obtain the CQEs of a Striding RQ using ibv_poll_cq. Does ibv_poll_cq return the CQEs correctly for you? On my machine it keeps reporting IBV_WC_LOC_PROT_ERR.

Additionally, the way your current code obtains the CQEs looks somewhat unusual to me, and I am very confused :<

Lcc-code changed the title from "Some questions about how to obtain cqe from string RQ." to "Some questions about how to obtain cqe from strding RQ." on Jun 30, 2024

Lcc-code (Author) commented

Hi,
when I use the "post_wq_recvs" function, ibv_poll_cq succeeds!
I will keep trying to understand how CQEs are obtained for a Striding RQ, which may be a bit difficult.
Thank you very much.

AjayBrahmakshatriya (Contributor) commented

Hi @Lcc-code,

Thanks for taking the time to use NetBlocks. What you have asked is a very interesting question, and it relates to how MLX5's Striding RQs work. As far as I understand, rdma-core doesn't support Striding RQs. If you use ibv_poll_cq, it works fine initially, but it fails when you try to read a burst of packets. The method of reading directly from the buffers via the mlx5_cqe64 entries lets you read a large number of posted packets, whereas with ibv_poll_cq you miss packets if more than 128 are enqueued.
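
To illustrate what reading an mlx5_cqe64 directly involves, here is a minimal sketch of decoding the fields a Striding RQ completion carries. It is not the NetBlocks code. The bit layout (filler flag in bit 31 of byte_cnt, consumed strides in bits 16-29, packet length in the low 16 bits, and wqe_counter used as the stride index) is an assumption based on how open-source mlx5 drivers treat multi-packet RQ completions, so check it against your device documentation and driver headers. The names `StridingCqeInfo`, `load_be32`, `decode_byte_cnt` and `packet_offset` are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical result type for illustration; not part of rdma-core or NetBlocks.
struct StridingCqeInfo {
    bool filler;            // completion only reports padding strides, no packet
    std::uint16_t strides;  // number of strides consumed by this completion
    std::uint16_t length;   // packet length in bytes
};

// ASSUMED field layout of the 32-bit byte_cnt word for a Striding RQ CQE.
constexpr std::uint32_t kFillerMask     = 0x80000000u;
constexpr std::uint32_t kStrideNumMask  = 0x3fff0000u;
constexpr unsigned      kStrideNumShift = 16;
constexpr std::uint32_t kLenMask        = 0x0000ffffu;

// CQE fields are stored big-endian; assemble the value byte by byte so the
// result is correct regardless of host endianness.
inline std::uint32_t load_be32(const unsigned char *p) {
    return (std::uint32_t{p[0]} << 24) | (std::uint32_t{p[1]} << 16) |
           (std::uint32_t{p[2]} << 8)  |  std::uint32_t{p[3]};
}

// Split a host-order byte_cnt word into its three assumed sub-fields.
constexpr StridingCqeInfo decode_byte_cnt(std::uint32_t byte_cnt_host) {
    return StridingCqeInfo{
        (byte_cnt_host & kFillerMask) != 0,
        static_cast<std::uint16_t>((byte_cnt_host & kStrideNumMask) >> kStrideNumShift),
        static_cast<std::uint16_t>(byte_cnt_host & kLenMask)};
}

// Byte offset of the packet inside the multi-packet receive buffer, given the
// stride index (assumed to come from wqe_counter) and the configured stride size.
constexpr std::size_t packet_offset(std::uint16_t stride_index, std::size_t stride_size) {
    return std::size_t{stride_index} * stride_size;
}
```

With this layout, a completion is skipped when `filler` is set; otherwise the packet starts at `packet_offset(...)` inside the buffer identified by wqe_id and spans `length` bytes across `strides` strides.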

AjayBrahmakshatriya (Contributor) commented

Also, just to add, I use this modified version of rdma-core instead of the default one: https://github.com/AjayBrahmakshatriya/rdma-core/tree/mlx5-fix-mprq-wq-post-recv

This has been retrofitted to support striding RQs.

AjayBrahmakshatriya self-assigned this on Jul 16, 2024
AjayBrahmakshatriya added the "good first issue" and "question" labels on Jul 16, 2024

Lcc-code (Author) commented

Hi, thank you for your reply.
I'm posting receives by writing the doorbell record directly, like this:
std::atomic<std::uint32_t> *dbrec = reinterpret_cast<std::atomic<std::uint32_t> *>(dma_context->rwq->dbrec);
dbrec->store(htobe32(dma_context->sge_idx & 0xffff), std::memory_order_release);

However, I ran into a problem: this method works fine with a single thread, but with two or more threads, cqe->wqe_id and cqe->wqe_counter return incorrect values once a thread has received more packets than the receive queue depth. This does not happen if I start a separate process to receive packets instead.

Do you see a similar problem when multithreading with the current protocol stack? Have you run into anything like this before, and how could I track it down?

Thanks again for your reply!
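
One possible explanation, which nothing in this thread confirms, is that several threads advance the same receive index and write the same doorbell record without synchronization. Below is a minimal sketch of serializing that step with a mutex. `RecvQueue`, `post_recvs` and `to_be32` are hypothetical stand-ins, and `dbrec` here is an ordinary atomic modelling the doorbell record rather than device memory. Only the `& 0xffff` masking and the release store mirror the snippet above.

```cpp
#include <atomic>
#include <cstdint>
#include <cstring>
#include <mutex>

// Convert a host-order value to big-endian representation, portably.
inline std::uint32_t to_be32(std::uint32_t v) {
    const unsigned char bytes[4] = {
        static_cast<unsigned char>(v >> 24), static_cast<unsigned char>(v >> 16),
        static_cast<unsigned char>(v >> 8),  static_cast<unsigned char>(v)};
    std::uint32_t out;
    std::memcpy(&out, bytes, sizeof(out));
    return out;
}

// Hypothetical stand-in for one receive queue shared by several threads.
struct RecvQueue {
    std::mutex lock;                      // guards `posted` and the doorbell write
    std::uint32_t posted = 0;             // free-running count of posted WQEs
    std::atomic<std::uint32_t> dbrec{0};  // models the doorbell record (big-endian)
};

// Post `count` receive WQEs and publish the new producer index.
// Holding the lock makes "advance index, then write doorbell" a single step
// from the point of view of other threads, so the doorbell never goes backwards.
inline void post_recvs(RecvQueue &q, std::uint32_t count) {
    std::lock_guard<std::mutex> guard(q.lock);
    // (A real implementation would fill the WQEs in the ring here.)
    q.posted += count;
    // Only the low 16 bits are published; the release store orders the WQE
    // writes before the doorbell update.
    q.dbrec.store(to_be32(q.posted & 0xffff), std::memory_order_release);
}
```

An alternative that avoids the lock is to give each thread its own receive queue and completion queue, which may be close to what happens when a separate process is used.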

2 participants