Set IB_QP_CREATE_BLOCK_MULTICAST_LOOPBACK for IPoIB's UD QPs if
supported by the underlying device.  This improves bandwidth by up
to 39% when sending multicast packets with IPoIB, and improves CPU
usage by 12%.

Since create_flags can now carry more than one flag, OR in
IB_QP_CREATE_IPOIB_UD_LSO instead of assigning it.

Signed-off-by: Ron Livne <ronli@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
        init_attr.recv_cq = priv->recv_cq;
 
        if (priv->hca_caps & IB_DEVICE_UD_TSO)
-               init_attr.create_flags = IB_QP_CREATE_IPOIB_UD_LSO;
+               init_attr.create_flags |= IB_QP_CREATE_IPOIB_UD_LSO;
+
+       if (priv->hca_caps & IB_DEVICE_BLOCK_MULTICAST_LOOPBACK)
+               init_attr.create_flags |= IB_QP_CREATE_BLOCK_MULTICAST_LOOPBACK;
 
        if (dev->features & NETIF_F_SG)
                init_attr.cap.max_send_sge = MAX_SKB_FRAGS + 1;
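
The flag handling in the hunk above can be sketched as a standalone C program. This is only an illustration: the DEMO_* constants and demo_qp_create_flags() are hypothetical stand-ins with made-up bit values, not the kernel's definitions from <rdma/ib_verbs.h>. It shows why each capability test ORs its flag into create_flags, so that a device reporting both capabilities ends up with both flags set.

```c
#include <assert.h>

/* Hypothetical device capability bits (stand-ins for IB_DEVICE_*). */
enum {
	DEMO_DEVICE_UD_TSO                   = 1 << 0,
	DEMO_DEVICE_BLOCK_MULTICAST_LOOPBACK = 1 << 1,
};

/* Hypothetical QP creation flags (stand-ins for IB_QP_CREATE_*). */
enum {
	DEMO_QP_CREATE_IPOIB_UD_LSO             = 1 << 0,
	DEMO_QP_CREATE_BLOCK_MULTICAST_LOOPBACK = 1 << 1,
};

/*
 * Build the creation flags from the device capabilities.  Each test
 * ORs its flag in, so the result does not depend on the order of the
 * tests and no earlier flag is overwritten.
 */
static unsigned int demo_qp_create_flags(unsigned int hca_caps)
{
	unsigned int create_flags = 0;

	if (hca_caps & DEMO_DEVICE_UD_TSO)
		create_flags |= DEMO_QP_CREATE_IPOIB_UD_LSO;

	if (hca_caps & DEMO_DEVICE_BLOCK_MULTICAST_LOOPBACK)
		create_flags |= DEMO_QP_CREATE_BLOCK_MULTICAST_LOOPBACK;

	return create_flags;
}

/* Usage example: a device with both capabilities gets both flags. */
static void demo_self_check(void)
{
	unsigned int both = DEMO_DEVICE_UD_TSO |
			    DEMO_DEVICE_BLOCK_MULTICAST_LOOPBACK;

	assert(demo_qp_create_flags(0) == 0);
	assert(demo_qp_create_flags(both) ==
	       (DEMO_QP_CREATE_IPOIB_UD_LSO |
		DEMO_QP_CREATE_BLOCK_MULTICAST_LOOPBACK));
}
```

With plain assignment in the second test, the LSO flag set by the first test would be lost on a device that supports both features.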