Use netdev_alloc_skb since it handles any NUMA node memory localtion issues
and sets skb->dev. Since device driver was not setting skb->dev, I bet
filter rules based on device would not work.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Any negative return value from start_xmit is interpreted as NETDEV_TX_LOCK
which is not what this driver wants. It should return 0 (NETDEV_TX_OK)
when it consumes a packet.
Also, use skb_padto() as the generic way to pad small frames.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
This patch works around the MWI bug on the DC21143 rev 65 Tulip by
ensuring that the receive buffers don't end on a cache line boundary
(as documented in the errata).
This patch is required for the MIPS based Cobalt Qube/RaQ as
supporting the extra PCI commands seems to reduce the chance of a hard
lockup between the Tulip and the PCI bridge.
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Jeff Garzik [Thu, 17 Apr 2008 00:37:24 +0000 (20:37 -0400)]
[netdrvr] tulip/winbond-840: don't let tulip.h symbol stomp ours
winbond-840 shares tulip.h with the tulip driver, because they share
many (but not all) of the same register definitions.
This is useful for the register definitions, but not helpful when it
comes to symbols that are shared among the tulip driver's C modules,
but not meant to be shared outside that one driver.
Thus, PKT_BUF_SZ is a symbol internal to tulip, but it was intruding
upon a similar symbol in winbond-840's namespace. This was not a
problem as long as the two symbols had the same value, but upcoming
patches result in differing symbol values.
Andy Fleming [Thu, 10 Apr 2008 00:38:13 +0000 (19:38 -0500)]
phy: Change mii_bus id field to a string
Having the id field be an int was making more complex bus topologies
excessively difficult. For now, just convert it to a string, and
change all instances of "bus->id = val" to
snprintf(id, MII_BUS_ID_LEN, "%x", val).
Signed-off-by: Andy Fleming <afleming@freescale.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Dai Haruki [Thu, 10 Apr 2008 00:37:51 +0000 (19:37 -0500)]
gianfar: Support NAPI for TX Frames
Poll the completed TX frames in gfar_poll(). This prevents the tx
completion interrupt from interfering with processing of received
frames.
We also disable hardware rx coalescing when NAPI is enabled.
Signed-off-by: Dai Haruki <dai.haruki@freescale.com> Signed-off-by: Andy Fleming <afleming@freescale.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
This patch revises the logging for link informations of spidernet.
- The link down message is too verbose because auto-negotiation timeout
occurs periodically while an ethernet cable is not connected.
- We want to see the link result, and we think it should be displayed.
In addition to the value of GHIINT0STS, spidernet interrupt handler
should check the values of GHIINT1STS/GHIINT2STS registers at the
beginning of spider_net_interrupt() so as not to drop error
interrupts.
GHIINT1STS/GHIINT2STS registers indicates some of erroneous conditions
in spidernet, and a few bits of GHIINT0STS register reflects these
conditions. But GHIINT0MSK masks these bits, so you should check these
conditions by reading GHIINT1STS/GHIINT2STS registers directly.
- unmask GDAINVAINT. There is an operation to do by spidernet
interrupt handler.
- mask some interrupts. There are no operations in the interrupt handler.
In error and warning printks, always report the netdevice name
instead of the port index (the latter has no meaning when there
are multiple mv643xx_eth silicon blocks in the system.)
mv643xx_eth: identify ports by struct mv643xx_private *
Instead of identifying individual mv643xx ethernet ports by only
their port number, identify them by their struct mv643xx_private *,
as just a port number has no meaning when there are multiple
mv643xx_eth silicon blocks in the system.
- Remove unused MV643XX_DEFAULT_[RT]X_QUEUE_SIZE definitions.
- Remove ETH_TARGET enum -- it isn't used anywhere in the driver,
and isn't even valid for non-mv643xx chip models, as those use
different MBUS target IDs.
- Clean up comment and control flow in mv643xx_eth_change_mtu().
- Use mp->dev instead of mp->mii.dev in mv643xx_eth_tx_timeout_task().
- Make mv643xx_eth_free_tx_descs() static.
- Remove overzealous NULL check in mv643xx_eth_start_xmit().
- Use symbolic NETDEV_TX_* constants in mv643xx_eth_start_xmit().
John Heffner [Tue, 15 Apr 2008 22:26:39 +0000 (15:26 -0700)]
[TCP]: Increase the max_burst threshold from 3 to tp->reordering.
This change is necessary to allow cwnd to grow during persistent
reordering. Cwnd moderation is applied when in the disorder state
and an ack that fills the hole comes in. If the hole was greater
than 3 packets, but less than tp->reordering, cwnd will shrink when
it should not have.
Signed-off-by: John Heffner <jheffner@napa.(none)> Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Wed, 16 Apr 2008 09:17:42 +0000 (02:17 -0700)]
[NET]: Do not allocate unneeded memory for dev->priv alignment.
The alloc_netdev_mq() tries to produce 32-bytes alignment for both
the net_device itself and its private data. The second alignment is
achieved by adding the NETDEV_ALIGN_CONST to the whole size of
the memory to be allocated.
However, for those devices that do not need the private area, this
addition just makes the net_device weight 1024 + 32 = 1068 bytes,
i.e. consume twice as much memory.
Since loopback device is such (sizeof_priv == 0 for it), and each
net namespace creates one, this can save a noticeable amount of
memory for kernel with net namespaces turned on.
After this set the lo device is actually allocated from a size-1024
kmem cache on i386 box even with NETPOLL and WIRELESS_EXT turned on.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
[NETNS]: Add netns refcnt debug for network devices.
dev_set_net is called for
- just allocated devices
- devices moving from one namespace to another
release_net has proper check inside to distinguish these cases.
Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
[NETNS]: Add netns refcnt debug for kernel sockets.
Protocol control sockets and netlink kernel sockets should not prevent the
namespace stop request. They are initialized and disposed in a special way by
sk_change_net/sk_release_kernel.
Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Wed, 16 Apr 2008 08:23:02 +0000 (01:23 -0700)]
[IP6TUNNEL]: Make the fallback tunnel device per-net.
All the code, that reference it already has the ip6_tnl_net pointer,
so s/ip6_fb_tnl_dev/ip6n->fb_tnl_dev/ and move creation/releasing
code into net init/exit ops.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Wed, 16 Apr 2008 08:09:44 +0000 (01:09 -0700)]
[GRE]: Use proper net in hash-lookup functions.
This is the part#2 of the patch #2 - get the proper net for
these functions. This change in a separate patch in order not
to get lost in a large previous patch.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Wed, 16 Apr 2008 07:55:06 +0000 (00:55 -0700)]
[VLAN]: Allow vlan devices registration in net namespaces.
This one is similar to what I've done for TUN - set the proper
net after device allocation and clean VLANs on net exit (use the
rtnl_kill_links helper finally).
Plus, drop explicit init_net usage and net != &init_net checks.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Wed, 16 Apr 2008 07:48:04 +0000 (00:48 -0700)]
[VLAN]: Tag vlan_group_device with net device, not ifindex.
Currently vlan group is searched using one key - the ifindex.
We'll have to lookup the vlan_group by two keys - ifindex and
net. Turning the vlan_group lookup key to struct net_device
pointer will make this process easier.
Besides, this will eliminate one more place in the networking,
that assumes that indexes are unique in the kernel.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Wed, 16 Apr 2008 07:41:53 +0000 (00:41 -0700)]
[TUN]: Allow to register tun devices in namespace.
This is basically means that a net is set for a new device, but
actually also involves two more steps:
1. mark the tun device as "local", i.e. do not allow for it to
move across namespaces.
This is done so, since tun device is most often associated to some
file (and thus to some process) and moving the device alone is not
valid while keeping the file and the process outside. The need in
ability to move a detached persistent device is to be investigated
later.
2. get the tun device's net when tun becomes attached and put one
when it becomes detached.
This is needed to handle the case when a task owning the tun dies,
but a files lives for some more time - in this case we must not
allow for net to be freed, since its exit hook will spoil that file's
private data by unregistering the tun from under tun_chr_close.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Wed, 16 Apr 2008 07:41:16 +0000 (00:41 -0700)]
[TUN]: Make the tun_dev_list per-net.
Remove the static tun_dev_list and replace its occurrences in
driver with per-net one.
It is used in two places - in tun_set_iff and tun_cleanup. In
the first case it's legal to use current net_ns. In the cleanup
call - move the loop, that unregisters all devices in net exit
hook.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Wed, 16 Apr 2008 07:40:46 +0000 (00:40 -0700)]
[TUN]: Introduce the tun_net structure and init/exit net ops.
This is the first step in making tuntap devices work in net
namespaces. The structure mentioned is pointed by generic
net pointer with tun_net_id id, and tun driver fills one on
its load. It will contain only the tun devices list.
So declare this structure and introduce net init and exit hooks.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Needed can only be more strict than what was checked by the
earlier common case check for non-tail skbs, thus
cwnd_len <= needed will never match in that case anyway.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, it is not possible to read/write to an eeprom larger than
128k in size because the buffer used for temporarily storing the
eeprom contents is allocated using kmalloc. kmalloc can only allocate
a maximum of 128k depending on architecture.
Modified ethtool_get/set_eeprom to only allocate a page of memory and
then copy the eeprom a page at a time.
Updated original patch as per suggestions from Joe Perches.
Signed-off-by: Mandeep Singh Baines <msb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Oliver Hartkopp [Wed, 16 Apr 2008 02:29:14 +0000 (19:29 -0700)]
CAN: use hrtimers in can-bcm protocol
Make use of hrtimers to support high resolution capabilities, when
provided by the system clocksource.
The conversion to hrtimers additionally discovered and solved an
unlikely race condition that has been reproduced under (unrealistic)
massive receive load, which can only be produced on vcan software devices.
[ Fix printf format warnings on 64-bit -DaveM ]
Signed-off-by: Oliver Hartkopp <oliver@hartkopp.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Allan Stephens [Wed, 16 Apr 2008 02:04:54 +0000 (19:04 -0700)]
[TIPC]: Enhance validation of format on incoming messages
This patch ensures that TIPC properly handles incoming messages
that have incorrect or unexpected formats. Most significantly,
it now ensures that each sl_buff has at least as much data as
the message header indicates it should, and that the entire
message header is stored contiguously; this prevents TIPC from
accidentally accessing memory that is not part of the sk_buff.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Allan Stephens [Wed, 16 Apr 2008 02:03:23 +0000 (19:03 -0700)]
[TIPC]: Force linearization of non-linear sk_buffs
This patch allows TIPC to process incoming messages that are
stored in a fragmented sk_buff, by forcing the linearization
of any such messages it receives.
Note: This is an interim solution to allow TIPC to operate with
Ethernet devices that generate non-linear buffers (such as the
gianfar driver), until such time as the rest of TIPC is enhanced
to handle sk_buffs with multiple data areas.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Allan Stephens [Wed, 16 Apr 2008 02:02:30 +0000 (19:02 -0700)]
[TIPC]: Use fast buffer cloning to improve performance
This patch causes TIPC to allocate fast clonable sk_buffs,
rather than standard ones. This speeds up the cloning
operation done by the link code each time a message is sent
off-node.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Tue, 15 Apr 2008 07:36:08 +0000 (00:36 -0700)]
[NETNS]: The generic per-net pointers.
Add the elastic array of void * pointer to the struct net.
The access rules are simple:
1. register the ops with register_pernet_gen_device to get
the id of your private pointer
2. call net_assign_generic() to put the private data on the
struct net (most preferably this should be done in the
->init callback of the ops registered)
3. do not store any private reference on the net_generic array;
4. do not change this pointer while the net is alive;
5. use the net_generic() to get the pointer.
When adding a new pointer, I copy the old array, replace it
with a new one and schedule the old for kfree after an RCU
grace period.
Since the net_generic explores the net->gen array inside rcu
read section and once set the net->gen->ptr[x] pointer never
changes, this grants us a safe access to generic pointers.
Quoting Paul: "... RCU is protecting -only- the net_generic
structure that net_generic() is traversing, and the [pointer]
returned by net_generic() is protected by a reference counter
in the upper-level struct net."
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Tue, 15 Apr 2008 07:35:23 +0000 (00:35 -0700)]
[NETNS]: The net-subsys IDs generator.
To make some per-net generic pointers, we need some way to address
them, i.e. - IDs. This is simple IDA-based IDs generator for pernet
subsystems.
Addressing questions about potential checkpoint/restart problems:
these IDs are "lite-offsets" within the net structure and are by no
means supposed to be exported to the userspace.
Since it will be used in the nearest future by devices only (tun,
vlan, tunnels, bridge, etc), I make it resemble the functionality
of register_pernet_device().
The new ids is stored in the *id pointer _before_ calling the init
callback to make this id available in this callback.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Adrian Bunk [Tue, 15 Apr 2008 07:30:16 +0000 (00:30 -0700)]
[ISDN] include/linux/isdn.h: remove dead code
This patch remove the usage of a nonexisting kconfig variable.
Reported-by: Robert P. J. Day <rpjday@crashcourse.ca> Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Adrian Bunk [Tue, 15 Apr 2008 07:29:24 +0000 (00:29 -0700)]
[IRDA]: Remove irlan_eth_send_gratuitous_arp()
Even kernel 2.2.26 (sic) already contains the
#undef CONFIG_IRLAN_SEND_GRATUITOUS_ARP
with the comment "but for some reason the machine crashes if you use DHCP".
Either someone finally looks into this or it's simply time to remove
this dead code.
Reported-by: Robert P. J. Day <rpjday@crashcourse.ca> Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Allan Stephens [Tue, 15 Apr 2008 07:22:02 +0000 (00:22 -0700)]
[TIPC]: Overhaul of socket locking logic
This patch modifies TIPC's socket code to follow the same approach
used by other protocols. This change eliminates the need for a
mutex in the TIPC-specific portion of the socket protocol data
structure -- in its place, the standard Linux socket backlog queue
and associated locking routines are utilized. These changes fix
a long-standing receive queue bug on SMP systems, and also enable
individual read and write threads to utilize a socket without
unnecessarily interfering with each other.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Allan Stephens [Tue, 15 Apr 2008 07:15:50 +0000 (00:15 -0700)]
[TIPC]: Correct "off by 1" error in socket queue limit enforcement
This patch fixes a bug that allowed TIPC to queue 1 more message
than allowed by the socket receive queue threshold limits. The
patch also improves the threshold code's logic and naming to help
prevent this sort of error from recurring in the future.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>