OSDI '16 - FaSST: Fast, Scalable and Simple Distributed Transactions with Two-Sided (RDMA) Datagram RPCs
FaSST shares queue pairs (QPs) among threads, which lowers CPU efficiency and performance because the threads contend for the locks that protect each shared QP.
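The contention described above can be illustrated with a minimal sketch. It does not use any real RDMA or verbs API: `SimulatedQP`, `post_send`, and `run_shared_qp` are hypothetical names introduced only for illustration, and the QP is modeled as a plain list guarded by a lock. The point is only that every thread must take the same lock before posting, so posts from different threads are serialized.

```python
import threading


class SimulatedQP:
    """Hypothetical stand-in for a queue pair; not a real RDMA/verbs object."""

    def __init__(self):
        self.lock = threading.Lock()  # every sharing thread contends on this lock
        self.posted = []              # records (thread_id, payload) in post order

    def post_send(self, thread_id, payload):
        # Only one thread at a time may post to the shared QP.
        with self.lock:
            self.posted.append((thread_id, payload))


def run_shared_qp(n_threads=4, n_requests=1000):
    """Have n_threads post n_requests each to one shared QP; return the post log."""
    qp = SimulatedQP()

    def worker(thread_id):
        for i in range(n_requests):
            qp.post_send(thread_id, i)

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return qp.posted
```

In this sketch the lock keeps the shared state consistent, but the time each thread spends waiting for it is the CPU cost that the text attributes to QP sharing.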
An alternative is a dedicated proxy thread that manages all send and receive requests. However, handing requests to and from the proxy thread increases latency. Furthermore, it is difficult to saturate the full network bandwidth with a single thread. Moreover, the proxy solution is not transparent to the underlying RDMA libraries.
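The proxy-thread design can be sketched as follows, again without any real RDMA API. `ProxyThread`, `send`, `shutdown`, and `run_with_proxy` are hypothetical names, and the QP is modeled as a list that only the proxy thread touches. Worker threads hand requests over through a queue and wait for completion, which is where the extra latency comes from, and a single consumer thread handles all traffic.

```python
import queue
import threading

_STOP = object()  # sentinel telling the proxy thread to exit


class ProxyThread:
    """Hypothetical proxy that alone owns the (simulated) QP."""

    def __init__(self):
        self.requests = queue.Queue()
        self.posted = []  # touched only by the proxy thread, so no QP lock is needed
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def _run(self):
        while True:
            item = self.requests.get()
            if item is _STOP:
                break
            thread_id, payload, done = item
            self.posted.append((thread_id, payload))  # stands in for posting a send
            done.set()  # signal completion back to the requesting worker

    def send(self, thread_id, payload):
        # Handoff to the proxy plus the wait for completion is the added latency.
        done = threading.Event()
        self.requests.put((thread_id, payload, done))
        done.wait()

    def shutdown(self):
        self.requests.put(_STOP)
        self._thread.join()


def run_with_proxy(n_threads=4, n_requests=100):
    """Route every worker's requests through one proxy thread; return the post log."""
    proxy = ProxyThread()

    def worker(thread_id):
        for i in range(n_requests):
            proxy.send(thread_id, i)

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    proxy.shutdown()
    return proxy.posted
```

Note that the workers must call `proxy.send` instead of posting directly, which mirrors the point that the proxy approach is not transparent to code written against the RDMA library.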