net: netif_tx_queue_stopped too expensive

netif_tx_queue_stopped(txq) is most of the time false.

Yet its cost is very expensive on SMP.

static inline int netif_tx_queue_stopped(const struct netdev_queue *dev_queue)
{
	return test_bit(__QUEUE_STATE_XOFF, &dev_queue->state);
}

I saw this on oprofile hunting and bnx2 driver bnx2_tx_int().

We probably should split "struct netdev_queue" in two parts, one
being read mostly.

__netif_tx_lock() touches _xmit_lock & xmit_lock_owner, these
deserve a separate cache line.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index f8c3619..505a3c6 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -447,12 +447,18 @@
 };
 
 struct netdev_queue {
+/*
+ * read mostly part
+ */
 	struct net_device	*dev;
 	struct Qdisc		*qdisc;
 	unsigned long		state;
-	spinlock_t		_xmit_lock;
-	int			xmit_lock_owner;
 	struct Qdisc		*qdisc_sleeping;
+/*
+ * write mostly part
+ */
+	spinlock_t		_xmit_lock ____cacheline_aligned_in_smp;
+	int			xmit_lock_owner;
 } ____cacheline_aligned_in_smp;