pilppa.org Git - linux-2.6-omap-h63xx.git/log

Arnd Bergmann [Tue, 20 May 2008 17:16:33 +0000 (19:16 +0200)]

pm_qos_params: BKL pushdown

[jmc: added <linux/smp_lock.h>]

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>

Arnd Bergmann [Tue, 20 May 2008 17:16:31 +0000 (19:16 +0200)]

parisc-kernel-perf: BKL pushdown

Signed-off-by: Arnd Bergmann <arnd@arndb.de>

Arnd Bergmann [Tue, 20 May 2008 17:16:30 +0000 (19:16 +0200)]

parisc-eisa_eeprom: BKL pushdown

Signed-off-by: Arnd Bergmann <arnd@arndb.de>

Arnd Bergmann [Tue, 20 May 2008 17:16:29 +0000 (19:16 +0200)]

openprom: BKL pushdown

Signed-off-by: Arnd Bergmann <arnd@arndb.de>

Arnd Bergmann [Tue, 20 May 2008 17:16:28 +0000 (19:16 +0200)]

ocfs2-stack_user: BKL pushdown

Signed-off-by: Arnd Bergmann <arnd@arndb.de>

Arnd Bergmann [Tue, 20 May 2008 17:16:26 +0000 (19:16 +0200)]

nvram: BKL pushdown

Signed-off-by: Arnd Bergmann <arnd@arndb.de>

Arnd Bergmann [Tue, 20 May 2008 17:16:24 +0000 (19:16 +0200)]

net-tun: BKL pushdown

Signed-off-by: Arnd Bergmann <arnd@arndb.de>

Arnd Bergmann [Tue, 20 May 2008 17:16:23 +0000 (19:16 +0200)]

mwave-mwavedd: BKL pushdown

Signed-off-by: Arnd Bergmann <arnd@arndb.de>

Arnd Bergmann [Tue, 20 May 2008 17:16:22 +0000 (19:16 +0200)]

mvme16x-rtc: BKL pushdown

Signed-off-by: Arnd Bergmann <arnd@arndb.de>

Arnd Bergmann [Tue, 20 May 2008 17:16:20 +0000 (19:16 +0200)]

mousedev: BKL pushdown

Signed-off-by: Arnd Bergmann <arnd@arndb.de>

Arnd Bergmann [Tue, 20 May 2008 17:16:18 +0000 (19:16 +0200)]

monwriter: BKL pushdown

Signed-off-by: Arnd Bergmann <arnd@arndb.de>

Arnd Bergmann [Tue, 20 May 2008 17:16:17 +0000 (19:16 +0200)]

monreader: BKL pushdown

[jmc: added <linux/smp_lock.h>]

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>

Tom Tucker [Wed, 28 May 2008 20:14:02 +0000 (15:14 -0500)]

svcrdma: Change WR context get/put to use the kmem cache

Change the WR context pool to be shared across mount points. This
reduces the RDMA transport memory footprint significantly since
idle mounts don't consume WR context memory.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>

Tom Tucker [Wed, 28 May 2008 19:05:54 +0000 (14:05 -0500)]

svcrdma: Create a kmem cache for the WR contexts

Create a kmem cache to hold WR contexts. Next we will convert
the WR context get and put services to use this kmem cache.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>

Tom Tucker [Wed, 28 May 2008 18:57:05 +0000 (13:57 -0500)]

svcrdma: Add flush_scheduled_work to module exit function

Make certain all transports pending free are flushed from the wq
before unloading the module.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>

Tom Tucker [Tue, 20 May 2008 00:00:24 +0000 (19:00 -0500)]

svcrdma: Limit ORD based on client's advertised IRD

When adapters have differing IRD limits, the RDMA transport will fail to
connect properly. The RDMA transport should use the client's advertised
inbound read limit when computing its outbound read limit. For iWARP
transports, there is currently no standard for exchanging IRD/ORD
during connection establishment so the 'responder_resources' field in the
connect event is the local device's limit. The RDMA transport can be
configured to use a smaller ORD by writing the desired number to the
/proc/sys/sunrpc/svc_rdma/max_outbound_read_requests file.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
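
The limit described in this message amounts to taking the smallest of the candidate values. A minimal sketch of that idea, assuming hypothetical names (svc_ord_limit, client_ird, device_max_ord, sysctl_max_ord) that are not taken from the patch itself:

```c
#include <assert.h>

/* Hypothetical helper, not from the patch: smaller of two limits. */
static unsigned int min_uint(unsigned int a, unsigned int b)
{
    return a < b ? a : b;
}

/*
 * Sketch of the idea only: the outbound read depth (ORD) the server uses
 * is capped by the client's advertised inbound read depth (IRD), by the
 * local device limit, and by the administrator's configured maximum.
 */
unsigned int svc_ord_limit(unsigned int client_ird,
                           unsigned int device_max_ord,
                           unsigned int sysctl_max_ord)
{
    assert(device_max_ord > 0); /* a device with no read depth is a caller bug */
    return min_uint(client_ird, min_uint(device_max_ord, sysctl_max_ord));
}
```

With a client IRD of 4, a device limit of 16 and a configured maximum of 8, the sketch yields 4; if the client advertised 32 instead, the configured maximum of 8 would win.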

Tom Tucker [Mon, 19 May 2008 15:17:09 +0000 (10:17 -0500)]

svcrdma: Remove unused wait q from svcrdma_xprt structure

The sc_read_wait queue head is no longer used. Remove it.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>

Tom Tucker [Wed, 28 May 2008 18:20:24 +0000 (13:20 -0500)]

svcrdma: Remove unneeded spin locks from __svc_rdma_free

At the time __svc_rdma_free is called, we are guaranteed that all references
to this transport are gone. There is, therefore, no need to protect the
resource lists with a spin lock.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>

Tom Tucker [Wed, 28 May 2008 18:17:44 +0000 (13:17 -0500)]

svcrdma: Add dma map count and WARN_ON

Add a dma map count in order to verify that all DMA mapping resources
have been freed when the transport is closed.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
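
The counting scheme can be pictured as a balance that must return to zero. A minimal sketch, assuming a hypothetical struct xprt_sketch and helper names; the kernel code would use WARN_ON() where this sketch simply returns a flag:

```c
#include <stdbool.h>

/* Hypothetical stand-in for the transport; only the counter matters here. */
struct xprt_sketch {
    int dma_used; /* number of mappings currently outstanding */
};

/* Record one new DMA mapping. */
void sketch_dma_map(struct xprt_sketch *x)
{
    x->dma_used++;
}

/* Record that one DMA mapping has been released. */
void sketch_dma_unmap(struct xprt_sketch *x)
{
    x->dma_used--;
}

/* At close time a non-zero count means a mapping was leaked (or over-released). */
bool sketch_close_leaks(const struct xprt_sketch *x)
{
    return x->dma_used != 0;
}
```

Two maps followed by one unmap leave the count at 1, so sketch_close_leaks() reports a leak until the second unmap happens.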

Tom Tucker [Wed, 28 May 2008 17:08:48 +0000 (12:08 -0500)]

svcrdma: Move the DMA unmap logic to the CQ handler

Separate DMA unmap from context destruction and perform DMA unmapping
in the SQ/RQ CQ reap functions. This is necessary to support software
based RDMA implementations that actually copy the data in their
ib_dma_unmap callback functions and architectures that don't have
cache coherent I/O busses.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>

Tom Tucker [Tue, 27 May 2008 22:03:14 +0000 (17:03 -0500)]

svcrdma: Use reply and chunk map for RDMA_READ processing

Modify the RDMA_READ processing to use the reply and chunk list mapping data
types. Also add a special purpose 'hdr_count' field in the context to hold
the header page count instead of overloading the SGE length field and
corrupting the DMA map length.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>

Tom Tucker [Wed, 2 Jul 2008 19:56:13 +0000 (14:56 -0500)]

svcrdma: Use RPC reply map for RDMA_WRITE processing

Use the new svc_rdma_req_map data type for mapping the client side memory
to the server side memory. Move the DMA mapping to the context pointed to
by each WR individually so that it is unmapped after the WR completes.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>

Tom Tucker [Wed, 28 May 2008 18:54:04 +0000 (13:54 -0500)]

svcrdma: Add a type for keeping NFS RPC mapping

Create a new data structure to hold the remote client address space
to local server address space mapping.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>

Benny Halevy [Wed, 2 Jul 2008 08:14:22 +0000 (11:14 +0300)]

nfsd: nfs4 minorversion decoder vectors

Have separate vectors of operation decoders for each minorversion.
Obsolete ops in newer minorversions have a default implementation returning
nfserr_opnotsupp.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
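
A per-minorversion decoder vector is essentially a two-dimensional table of function pointers. The sketch below is an illustration under stated assumptions, not the nfsd code: the operation names, the status enum values, and the choice of which op is obsolete are all hypothetical.

```c
#include <stddef.h>

/* Illustrative status codes; the names echo the commit text, the values are arbitrary. */
enum sketch_status {
    SK_OK = 0,
    SK_OPNOTSUPP,
    SK_OP_ILLEGAL,
    SK_MINOR_VERS_MISMATCH
};

/* Three made-up operations. */
enum sketch_op { OP_A, OP_B, OP_C, OP_COUNT };

#define SKETCH_MINORVERSIONS 2

typedef enum sketch_status (*decoder_fn)(void);

static enum sketch_status decode_ok(void)      { return SK_OK; }
static enum sketch_status decode_notsupp(void) { return SK_OPNOTSUPP; }

/* One vector per minorversion; OP_B is treated as obsolete in minorversion 1. */
static const decoder_fn decoders[SKETCH_MINORVERSIONS][OP_COUNT] = {
    { decode_ok, decode_ok,      decode_ok },
    { decode_ok, decode_notsupp, decode_ok },
};

/* Pick the vector by minorversion, then the decoder by operation number. */
enum sketch_status sketch_decode(unsigned int minorversion, unsigned int op)
{
    if (minorversion >= SKETCH_MINORVERSIONS)
        return SK_MINOR_VERS_MISMATCH;
    if (op >= OP_COUNT)
        return SK_OP_ILLEGAL;
    return decoders[minorversion][op]();
}
```

The point of the layout is that an obsolete operation needs no special case in the dispatcher: its slot in the newer vector simply points at the default decoder.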

Benny Halevy [Wed, 2 Jul 2008 08:14:01 +0000 (11:14 +0300)]

nfsd: unsupported nfs4 ops should fail with nfserr_opnotsupp

nfserr_opnotsupp should be returned for unsupported nfs4 ops
rather than nfserr_op_illegal.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

Benny Halevy [Wed, 2 Jul 2008 08:13:41 +0000 (11:13 +0300)]

nfsd: tabulate nfs4 xdr decoding functions

In preparation for minorversion 1

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

Benny Halevy [Wed, 2 Jul 2008 08:13:18 +0000 (11:13 +0300)]

nfsd: return nfserr_minor_vers_mismatch when compound minorversion != 0

Check minorversion once before decoding any operation and reject with
nfserr_minor_vers_mismatch if != 0 (this still happens in nfsd4_proc_compound).
In this case return a zero length resultdata array as required by RFC3530.

minorversion 1 processing will have its own vector of decoders.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

Larry.Finger@lwfinger.net [Mon, 30 Jun 2008 15:39:49 +0000 (10:39 -0500)]

p54: Add quality output to iwlist and iwconfig

The p54 driver family reports a quality of 0 in iwconfig and iwlist
output. This patch calculates a quality number as the rssi expressed as
a percentage of the maximum signal value of 127.

Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
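
The calculation can be sketched in a few lines. The helper name and the clamping step are assumptions made for illustration; only the maximum of 127 comes from the commit message.

```c
/* Maximum signal value named in the commit message. */
#define SKETCH_MAX_SIGNAL 127

/* Hypothetical helper: express rssi as an integer percentage of the maximum. */
unsigned int sketch_quality_percent(unsigned int rssi)
{
    if (rssi > SKETCH_MAX_SIGNAL)
        rssi = SKETCH_MAX_SIGNAL; /* clamp so the result never exceeds 100 */
    return rssi * 100 / SKETCH_MAX_SIGNAL;
}
```

Integer division truncates, so an rssi of 64 maps to 50 rather than 50.39.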

Johannes Berg [Mon, 30 Jun 2008 13:10:46 +0000 (15:10 +0200)]

mac80211: rework debug settings and make debugging safer

This patch reworks the mac80211 debug settings making them more focused
and adding help text for those that didn't have one. It also removes a
number of printks that can be triggered remotely and add no value, e.g.
"too short deauthentication frame received - ignoring".

If somebody really needs to debug that they should just add a monitor
interface and look at the frames in wireshark.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

Johannes Berg [Mon, 30 Jun 2008 13:10:45 +0000 (15:10 +0200)]

mac80211: get rid of function pointers in RX path

This changes the RX path to no longer use function pointers for
RX handlers but rather invoke them directly. If debugging is
enabled, mark the RX handlers noinline because otherwise they
all get inlined into ieee80211_invoke_rx_handlers() which makes
it harder to see where a bug is.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

Johannes Berg [Mon, 30 Jun 2008 13:10:44 +0000 (15:10 +0200)]

mac80211: get rid of function pointers in TX path

This changes the TX path to no longer use function pointers for
TX handlers but rather invoke them directly. If debugging is
enabled, mark the TX handlers noinline because otherwise they
all get inlined into invoke_tx_handlers() which makes it harder
to see where a bug is.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

Zhu Yi [Mon, 30 Jun 2008 09:23:32 +0000 (17:23 +0800)]

iwl3945: remove RFKILL_STATE_HARD_BLOCKED warnings

The patch fixes the compile warning of "enumeration value
‘RFKILL_STATE_HARD_BLOCKED’ not handled in switch".

Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

Zhu Yi [Mon, 30 Jun 2008 09:23:31 +0000 (17:23 +0800)]

iwlwifi: fix iwl4965 temperature callback calibration issue

The patch fixes the temperature calibration issue introduced by the
patch "iwlwifi: move RX stats to core, and move temperature to handler".
It also removes the second parameter "stats" since it is already copied
to priv->statistics.

Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

Stefanik Gábor [Mon, 30 Jun 2008 09:23:30 +0000 (17:23 +0800)]

iwlwifi: enable packet injection for iwl3945

This patch enables packet injection on iwl3945 devices. Tested with
packetspammer and aireplay-ng.

Signed-off-by: Gábor Stefanik <netrolller.3d@gmail.com>
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

Rick Farrington [Mon, 30 Jun 2008 09:23:28 +0000 (17:23 +0800)]

iwlwifi: fix incorrect monitor mode operation

This patch fixes monitor mode operation for iwlwifi.

Problems addressed:
1. when monitor mode was enabled, multiple, overlapped calls were being
made to 'iwl3945_bg_set_monitor'

2. when monitor mode was disabled (via the configure_filter callback),
the driver was still enabling monitor mode

3. when monitor mode was enabled, the selected channel was not set
(eg. 'iwconfig wlanx mode monitor channel n' DID NOT SET channel 'n'
when packet capture was subsequently enabled)

Signed-off-by: Rick Farrington <rickdic@hotmail.com>
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

andrey@cozybit.com [Tue, 1 Jul 2008 18:43:53 +0000 (11:43 -0700)]

libertas: support USB persistence on suspend/resume (resend)

Handle .reset_resume() so that libertas can survive suspend/resume without
reloading the firmware.

Signed-off-by: Andrey Yurovsky <andrey@cozybit.com>
Acked-by: Deepak Saxena <dsaxena@laptop.org>
Acked-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

Zhu Yi [Tue, 1 Jul 2008 01:20:34 +0000 (09:20 +0800)]

iwlwifi: drop skb silently for Tx request in monitor mode

This patch fixes the problem of mac80211 repeatedly resubmitting SKBs
when a Tx request cannot be met in monitor mode.

Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

Rick Farrington [Tue, 1 Jul 2008 01:20:33 +0000 (09:20 +0800)]

iwlwifi: fix incorrect 5GHz rates reported in monitor mode

This patch fixes the rates reported in monitor mode operation
(Wireshark) for iwlwifi.

Previously, packets with rates of 6M..24M would be reported
incorrectly and packets with rates of 36M..54M would not be passed
up the stack.

Signed-off-by: Rick Farrington <rickdic@hotmail.com>
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

Alex Chiang [Wed, 2 Jul 2008 02:02:23 +0000 (20:02 -0600)]

PCI: acpiphp: cleanup notify handler on all root bridges

During the development of the physical PCI slot patch series, Gary Hade
kept on reporting strange oopses due to interactions between pci_slot
and acpiphp.

http://lkml.org/lkml/2007/11/28/319

find_root_bridges() unconditionally installs
handle_hotplug_event_bridge() as an ACPI_SYSTEM_NOTIFY handler for all
root bridges.

However, during module cleanup, remove_bridge() will remove the
notify handler only if the root bridge had a hot-pluggable slot directly
underneath. That is:

root bridge -> hotplug slot

But, if the topology looks like either of the following:

root bridge -> non-hotplug slot
root bridge -> p2p bridge -> hotplug slot

Then we currently do not remove the notify handler from that root
bridge.

This can cause a kernel oops if we modprobe acpiphp later and it gets
loaded somewhere else in memory. If the root bridge then receives a
hotplug event, it will then attempt to call a stale, non-existent notify
handler and we blow up.

Many thanks go to Gary Hade for his persistent debugging efforts.

Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Gary Hade <garyhade@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

Benjamin Li [Wed, 2 Jul 2008 17:59:04 +0000 (10:59 -0700)]

PCI: Limit VPD read/write lengths for Broadcom 5706, 5708, 5709 rev.

For Broadcom 5706, 5708, 5709 rev. A NICs, any read beyond the
VPD end tag will hang the device. This problem was initially
observed when a VPD entry was created in sysfs
('/sys/bus/pci/devices/<id>/vpd'). A read to this sysfs entry
will dump 32k of data. Reading a full 32k will cause an access
beyond the VPD end tag causing the device to hang. Once the device
is hung, the bnx2 driver will not be able to reset the device.
We believe that it is legal to read beyond the end tag and
therefore the solution is to limit the read/write length.

A majority of this patch is from Matthew Wilcox who gave code for
reworking the PCI VPD size information. A PCI quirk is added for the
Broadcom NICs to limit the reads and writes.

Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
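
Limiting the read/write length boils down to clamping each request against a per-device size. A minimal sketch, assuming a hypothetical helper name and a length value supplied by the caller; the commit message does not state the actual limit used by the quirk:

```c
#include <stddef.h>

/*
 * Hypothetical helper: clamp a (pos, count) request so it never reaches past
 * vpd_len, the length a quirk has recorded for the device.
 * Returns the number of bytes that may safely be transferred.
 */
size_t sketch_vpd_clamp(size_t pos, size_t count, size_t vpd_len)
{
    if (pos >= vpd_len)
        return 0;               /* request starts at or beyond the limit */
    if (count > vpd_len - pos)
        count = vpd_len - pos;  /* trim the tail of the request */
    return count;
}
```

With an illustrative limit of 128 bytes, a 32k read from offset 0 is trimmed to 128 bytes, and a read starting at offset 128 transfers nothing.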

Mauro Carvalho Chehab [Wed, 2 Jul 2008 14:03:33 +0000 (11:03 -0300)]

V4L/DVB (8178): uvc: Fix compilation breakage for the other drivers, if uvc is selected

UVC makefile defines obj as:
obj-$(CONFIG_USB_VIDEO_CLASS) := uvcvideo.o
Instead of:
obj-$(CONFIG_USB_VIDEO_CLASS) += uvcvideo.o

Due to that, if uvc is selected, all obj-y or obj-m that were added to
compilation were forgotten. This breaks a proper kernel build.

Acked-by: Laurent Pinchart <laurent.pinchart@skynet.be>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>

Santwona Behera [Wed, 2 Jul 2008 10:49:11 +0000 (03:49 -0700)]

niu: Add support for rx flow hash configuration.

Implemented ethtool callback functions for configuring receive flow
hashing in the niu driver.

Signed-off-by: Santwona Behera <santwona.behera@sun.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Santwona Behera [Wed, 2 Jul 2008 10:47:41 +0000 (03:47 -0700)]

netdev: Add support for rx flow hash configuration, using ethtool.

Added new interfaces to ethtool to configure receive network flow
distribution across multiple rx rings using hashing.

Signed-off-by: Santwona Behera <santwona.behera@sun.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Haavard Skinnemoen [Sun, 24 Feb 2008 12:51:38 +0000 (13:51 +0100)]

avr32: Power Management support ("standby" and "mem" modes)

Implement Standby support. In this mode, we'll suspend all drivers,
put the SDRAM in self-refresh mode and switch off the HSB bus
("frozen" mode.)

Implement Suspend-to-mem support. In this mode, we suspend all
drivers, put the SDRAM into self-refresh mode and switch off all
internal clocks except the 32 kHz oscillator ("stop" mode.)

The lowest-level suspend code runs from a small portion of SRAM
allocated at startup time. This gets rid of a small potential race
with the SDRAM where we might try to enter self-refresh mode in the
middle of an icache burst. We also relocate all interrupt and
exception handlers to SRAM during the small window when we enter and
exit the low-power modes.

We don't need to do any special tricks to start and stop the PLL. The
main clock is automatically gated by hardware until the PLL is stable.

Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>

Haavard Skinnemoen [Sun, 24 Feb 2008 13:26:03 +0000 (14:26 +0100)]

avr32: Add system device for the internal interrupt controller (intc)

This makes the intc show up in sysfs (probably not very useful), and
allows us to easily add suspend/resume support later.

Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>

Haavard Skinnemoen [Wed, 5 Mar 2008 09:00:28 +0000 (10:00 +0100)]

avr32: Add simple SRAM allocator

Add SRAM allocator for avr32, which is just a thin wrapper around
genalloc.

Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>

Haavard Skinnemoen [Wed, 5 Mar 2008 14:08:27 +0000 (15:08 +0100)]

avr32: Enable SDRAMC clock at startup

The SDRAM controller needs a clock in order to respond to our
commands, and suspend doesn't work very well without the SDRAM in
self-refresh mode.

Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>

Haavard Skinnemoen [Mon, 30 Jun 2008 08:54:31 +0000 (10:54 +0200)]

rtc-at32ap700x: Enable wakeup

Call device_init_wakeup() to signal that the RTC is capable of waking
the system. This is needed for rtcwake to work.

Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
Cc: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>

Haavard Skinnemoen [Tue, 4 Mar 2008 12:39:29 +0000 (13:39 +0100)]

macb: Basic suspend/resume support

This implements suspend and resume callbacks for the macb driver. We may
have to do some more to gracefully shut the MAC down, but this at least
prevents the macb from waking the system when hooked up to a busy
network.

Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: Patrice Vilchez <patrice.vilchez@rfo.atmel.com>
Cc: Nicolas FERRE <nicolas.ferre@rfo.atmel.com>

Haavard Skinnemoen [Fri, 14 Mar 2008 13:54:13 +0000 (14:54 +0100)]

atmel_serial: Drain console TX shifter before suspending

Funny things may happen if we stop the USART clock before the shifter is
empty. Prevent this from happening by waiting until the shifter is
completely drained before allowing suspend to continue.

Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
Cc: Andrew Victor <linux@maxim.org.za>

Haavard Skinnemoen [Sun, 24 Feb 2008 13:34:45 +0000 (14:34 +0100)]

atmel_serial: Fix build on avr32 with CONFIG_PM enabled

AVR32 doesn't have at91_suspend_entering_slow_clock(). Just assume the
clock will keep running for now.

David has a better solution for this, but this works for now. Leaving
the USART clock running won't prevent the PM code from entering deep
power-down modes anyway.

Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
Cc: David Brownell <david-b@pacbell.net>
Cc: Andrew Victor <linux@maxim.org.za>

Haavard Skinnemoen [Mon, 14 Jan 2008 22:35:32 +0000 (23:35 +0100)]

avr32: Use a quicklist for PTE allocation as well

Using a quicklist to allocate PTEs might be slightly faster than using
the page allocator directly since we might avoid zeroing the page
after each allocation.

Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>

Haavard Skinnemoen [Mon, 14 Jan 2008 22:33:44 +0000 (23:33 +0100)]

avr32: Use a quicklist for PGD allocation

Use a quicklist to allocate process PGDs. This is expected to be
slightly faster since we need to copy entries from swapper_pg_dir,
which can stay around for pages on the PGD quick list.

Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>

Haavard Skinnemoen [Mon, 14 Jan 2008 22:11:26 +0000 (23:11 +0100)]

avr32: Cover the kernel page tables in the user PGDs

Expand the per-process PGDs so that they cover the kernel virtual
memory area as well. This simplifies the TLB miss handler fastpath
since it doesn't have to check for kernel addresses anymore.

If a TLB miss happens on a kernel address and a second-level page
table can't be found, we check swapper_pg_dir and copy the PGD entry
into the user PGD if it can be found there.

Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
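
The fallback described in the second paragraph can be modelled in plain C. This is a sketch under stated assumptions, not the avr32 miss handler: the table size, the type names, and the function name are hypothetical, and the real code works on hardware page-table entries rather than ordinary pointers.

```c
#include <stddef.h>

#define SKETCH_PTRS_PER_PGD 8   /* tiny table for illustration only */

/* Placeholder for a second-level page table. */
typedef struct { int dummy; } sketch_pte_table;

/*
 * Hypothetical miss path: look in the process table first; on a miss, fall
 * back to the kernel reference table and cache the entry in the process table.
 * Returns NULL when neither table has a second-level table for the index.
 */
sketch_pte_table *sketch_lookup(sketch_pte_table **user_pgd,
                                sketch_pte_table *const *kernel_pgd,
                                size_t idx)
{
    if (idx >= SKETCH_PTRS_PER_PGD)
        return NULL;
    if (user_pgd[idx] == NULL && kernel_pgd[idx] != NULL)
        user_pgd[idx] = kernel_pgd[idx];   /* copy the entry once */
    return user_pgd[idx];
}
```

After the first lookup of a kernel index, the entry lives in the per-process table, so later lookups take the fast path without consulting the reference table.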

Haavard Skinnemoen [Mon, 14 Jan 2008 21:15:05 +0000 (22:15 +0100)]

avr32: Store virtual addresses in the PGD

Instead of storing physical addresses along with page flags in the
PGD, store virtual addresses and use NULL to indicate a not present
second-level page table. A non-page-aligned page table indicates a bad
PMD.

This simplifies the TLB miss handler since it no longer has to check
the Present bit and no longer has to convert the PGD entry from
physical to virtual address. Instead, it has to check for a NULL
entry, which is slightly cheaper than either.

Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
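
The three cases named in this message (no table, bad entry, usable table) can be told apart with two cheap checks when the slot holds a plain pointer. A minimal sketch; the 4096-byte page size and all identifiers are assumptions for illustration:

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u   /* assumed page size for this sketch */

enum sketch_pgd_state { PGD_NONE, PGD_BAD, PGD_OK };

/* Classify a PGD slot that stores a plain pointer to the second-level table. */
enum sketch_pgd_state sketch_pgd_classify(const void *entry)
{
    if (entry == NULL)
        return PGD_NONE;                        /* no second-level table */
    if ((uintptr_t)entry % SKETCH_PAGE_SIZE != 0)
        return PGD_BAD;                         /* not page aligned */
    return PGD_OK;
}
```

Only the NULL test needs to sit on the fast path; the alignment test is the kind of sanity check that can stay in slower, defensive code.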

Haavard Skinnemoen [Mon, 14 Jan 2008 21:16:50 +0000 (22:16 +0100)]

avr32: Remove useless zeroing of swapper_pg_dir at startup

swapper_pg_dir is stored in .bss, so it must already be zeroed out
when we get there.

Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>

Haavard Skinnemoen [Tue, 15 May 2007 13:06:41 +0000 (15:06 +0200)]

avr32: Clean up and optimize the TLB operations

This and the following patches aim to optimize the code dealing with
page tables and TLB operations. Each patch reduces the time it takes
to gzip a 16 MB file slightly, but I expect things like fork() and
mmap() will improve somewhat more.

This patch deals with the low-level TLB operations:

  * Remove unused _TLBEHI_I define
  * Use gcc builtins instead of inline assembly
  * Remove a few unnecessary pipeline flushes and nops
  * Introduce NR_TLB_ENTRIES define and use it instead of hardcoding it
    to 32 in a few places throughout the code.
  * Use sysreg bitops instead of hardcoded shifts and masks
  * Make a few needlessly global functions static

Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>

Milan Broz [Wed, 2 Jul 2008 08:34:28 +0000 (09:34 +0100)]

dm crypt: use cond_resched

Add cond_resched() to prevent monopolising CPU when processing large bios.

dm-crypt processes encryption of bios in sector units. If the bio request
is big it can spend a long time in the encryption call.

Signed-off-by: Milan Broz <mbroz@redhat.com>
Tested-by: Yan Li <elliot.li.tech@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
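
The pattern is a per-sector loop with a scheduling point in each iteration. The sketch below is a userspace analogue only: sched_yield() stands in for cond_resched() (which, unlike sched_yield(), reschedules only when needed), the XOR transform is a placeholder for the cipher, and the 512-byte sector size and all function names are assumptions.

```c
#include <sched.h>
#include <stddef.h>

#define SKETCH_SECTOR_SIZE 512u   /* assumed sector size */

/* Placeholder transform standing in for the real per-sector cipher call. */
static void sketch_crypt_sector(unsigned char *sector)
{
    for (size_t i = 0; i < SKETCH_SECTOR_SIZE; i++)
        sector[i] ^= 0xA5;
}

/*
 * Process a buffer one sector at a time, yielding between sectors.
 * Returns the number of whole sectors processed; a trailing partial
 * sector is left untouched.
 */
size_t sketch_crypt_buffer(unsigned char *buf, size_t len)
{
    size_t done = 0;

    while (len - done >= SKETCH_SECTOR_SIZE) {
        sketch_crypt_sector(buf + done);
        done += SKETCH_SECTOR_SIZE;
        sched_yield();   /* userspace stand-in for cond_resched() */
    }
    return done / SKETCH_SECTOR_SIZE;
}
```

Placing the scheduling point inside the loop bounds the time spent without yielding to one sector's worth of work, however large the request is.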

Vlad Yasevich [Wed, 2 Jul 2008 03:06:22 +0000 (20:06 -0700)]

sctp: Mark GET_PEER|LOCAL_ADDR_OLD deprecated.

Socket options SCTP_GET_PEER_ADDR_OLD, SCTP_GET_PEER_ADDR_NUM_OLD,
SCTP_GET_LOCAL_ADDR_OLD, and SCTP_GET_PEER_LOCAL_ADDR_NUM_OLD
have been replaced by newer versions since 2005. It's time
to officially deprecate them and schedule them for removal.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Patrick McHardy [Wed, 2 Jul 2008 02:59:37 +0000 (19:59 -0700)]

net: fib_rules: fix error code for unsupported families

The errno code returned must be negative.

Fixes "RTNETLINK answers: Unknown error 18446744073709551519".

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
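
The odd number in the quoted message is what a positive code looks like after it has been negated and then read as an unsigned 64-bit value. A minimal sketch, assuming that user space negates the code it receives before printing it; the function name is hypothetical:

```c
#include <stdint.h>

/*
 * Sketch: negate whatever code arrives and reinterpret the result as an
 * unsigned 64-bit number. A correctly negative code such as -97 comes out
 * as the small positive errno 97; a positive 97 wraps around to a huge value.
 */
uint64_t sketch_errno_seen_by_user(int64_t code_from_kernel)
{
    return (uint64_t)(-code_from_kernel);
}
```

Passing 97 yields 18446744073709551519, which is 2^64 - 97 and matches the number in the error text, while passing -97 yields 97.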

Wang Chen [Wed, 2 Jul 2008 02:57:19 +0000 (19:57 -0700)]

netdevice: Fix wrong string handle in kernel command line parsing

v1->v2: Use strlcpy() to ensure s[i].name is null-terminated.

1. In netdev_boot_setup_add(), a long name will leak.
   ex. : dev=21,0x1234,0x1234,0x2345,eth123456789verylongname.........
2. In netdev_boot_setup_check(), mismatch will happen if s[i].name
   is a substring of dev->name.
   ex. : dev=...eth1 dev=...eth11

[ With feedback from Ben Hutchings. ]

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
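
Both problems listed in this message can be reproduced in a few lines of userspace C. This is a sketch, not the kernel code: snprintf() is used as a portable stand-in for strlcpy(), and the buffer size and function names are assumptions.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_IFNAMSIZ 16   /* assumed buffer size for this sketch */

/* Bounded copy that always NUL-terminates, in the spirit of strlcpy(). */
void sketch_copy_name(char dst[SKETCH_IFNAMSIZ], const char *src)
{
    snprintf(dst, SKETCH_IFNAMSIZ, "%s", src);
}

/* Prefix comparison: wrongly treats "eth1" as a match for "eth11". */
bool sketch_match_prefix(const char *stored, const char *devname)
{
    return strncmp(stored, devname, strlen(stored)) == 0;
}

/* Whole-string comparison: "eth1" matches only "eth1". */
bool sketch_match_exact(const char *stored, const char *devname)
{
    return strcmp(stored, devname) == 0;
}
```

An over-long name is truncated to 15 characters plus the terminator, and only the exact comparison keeps the eth1 settings from being applied to eth11.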

Wang Chen [Wed, 2 Jul 2008 02:55:40 +0000 (19:55 -0700)]

net: Typo in sk_filter() comment

Parameter "needlock" no longer exists.

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Wang Chen [Wed, 2 Jul 2008 02:55:09 +0000 (19:55 -0700)]

netlink: Unneeded local variable

We already have a variable, which has the same capability.

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Patrick McHardy [Wed, 2 Jul 2008 02:53:09 +0000 (19:53 -0700)]

net-sched: fix filter destruction in atm/hfsc qdisc destruction

Filters need to be destroyed before beginning to destroy classes
since the destination class needs to still be alive to unbind the
filter.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

Patrick McHardy [Wed, 2 Jul 2008 02:52:38 +0000 (19:52 -0700)]

net-sched: change tcf_destroy_chain() to clear start of filter list

Pass double tcf_proto pointers to tcf_destroy_chain() to make it
clear the start of the filter list for more consistency.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
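
Passing a pointer to the list head lets the destroy routine reset the caller's pointer as it frees the nodes. A minimal sketch of that idiom with a hypothetical node type; it does not reproduce struct tcf_proto or the real tcf_destroy_chain() signature:

```c
#include <stdlib.h>

/* Hypothetical list node standing in for a filter. */
struct sketch_filter {
    struct sketch_filter *next;
};

/* Push a new node on the front of the list; returns 0 on success, -1 on allocation failure. */
int sketch_filter_add(struct sketch_filter **head)
{
    struct sketch_filter *f = malloc(sizeof(*f));

    if (f == NULL)
        return -1;
    f->next = *head;
    *head = f;
    return 0;
}

/*
 * Free every node. Taking a pointer to the head lets the function leave the
 * caller's list pointer NULL instead of dangling. Returns the number freed.
 */
int sketch_destroy_chain(struct sketch_filter **head)
{
    int freed = 0;

    while (*head != NULL) {
        struct sketch_filter *f = *head;

        *head = f->next;
        free(f);
        freed++;
    }
    return freed;
}
```

With a single-pointer interface the caller would have to remember to clear its own head pointer after the call; here that happens as a side effect of the loop.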

Stephen Hemminger [Wed, 2 Jul 2008 02:29:07 +0000 (19:29 -0700)]

icmp: fix units for ratelimit

Convert the sysctl values for icmp ratelimit to use milliseconds instead
of jiffies which is based on kernel configured HZ.
Internal kernel jiffies are not a proper unit for any userspace API.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
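
Why jiffies make a poor userspace unit is easy to see with a small conversion. The helper below is a sketch with a hypothetical name; it is not the kernel's own conversion routine.

```c
#include <assert.h>

/* Convert a tick count to milliseconds for a given tick rate (HZ). */
unsigned long sketch_ticks_to_ms(unsigned long ticks, unsigned long hz)
{
    assert(hz > 0);   /* a zero tick rate would divide by zero */
    return ticks * 1000UL / hz;
}
```

The same sysctl value of 100 ticks means 1000 ms on a kernel built with HZ=100, 400 ms with HZ=250, and 100 ms with HZ=1000, whereas a value given in milliseconds means the same thing on all three.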

Stephen Hemminger [Wed, 2 Jul 2008 00:22:48 +0000 (17:22 -0700)]

ipv4: fix sysctl documentation of time related values

These sysctl values are time related and all use the same routine
(proc_dointvec_jiffies) that internally converts from seconds to jiffies.
The code is fine, the documentation is just wrong.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Tim Yamin [Tue, 17 Jun 2008 08:33:14 +0000 (09:33 +0100)]

powerpc/mpc5200: Fix lite5200b suspend/resume

Suspend/resume ("echo mem > /sys/power/state") does not work with
vanilla kernels -- the system does not suspend correctly and just
hangs. This patch fixes this so suspend/resume works:

1) of_iomap does not map the whole 0xC000 of the MPC5200 immr so
saving registers does not work.
2) PCI registers need to be saved and restored.

Signed-off-by: Tim Yamin <plasm@roo.me.uk>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>

John Linn [Tue, 1 Jul 2008 17:52:41 +0000 (10:52 -0700)]

powerpc/legacy_serial: Bail if reg-offset/shift properties are present

The legacy serial driver does not work with an 8250 type UART that is
described in the device tree with the reg-offset and reg-shift
properties. This change makes legacy_serial ignore these devices.

Signed-off-by: John Linn <john.linn@xilinx.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>

Wolfram Sang [Tue, 1 Jul 2008 20:38:18 +0000 (22:38 +0200)]

i2c: Fix bad hint about irqs in i2c.h

i2c.h mentions -1 as a not-issued irq. This false hint was taken by
of_i2c and caused crashes. Don't give any advice as 'no irq' is not
consistent across all architectures yet and it is not needed internally
by the i2c-core.

Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: Jean Delvare <khali@linux-fr.org>

Ben Dooks [Tue, 1 Jul 2008 20:38:18 +0000 (22:38 +0200)]

i2c: Documentation: fix device matching description

The matching process described for new style clients in
Documentation/i2c/writing-clients is classed as out-of-date
as it requires the presence of an .id_table entry in the
driver's i2c_driver entry.

Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Jean Delvare <khali@linux-fr.org>

John Linn [Tue, 1 Jul 2008 20:17:18 +0000 (14:17 -0600)]

powerpc/bootwrapper: update for initrd with simpleImage

This change to the makefile corrects the build of a simpleImage with initrd.

Signed-off-by: John Linn <john.linn@xilinx>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>

Miklos Szeredi [Tue, 1 Jul 2008 13:38:35 +0000 (15:38 +0200)]

nfsd: clean up mnt_want_write calls

Multiple mnt_want_write() calls in the switch statement look really
ugly.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Acked-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

Ben Hutchings [Tue, 1 Jul 2008 16:18:17 +0000 (17:18 +0100)]

PCI: Restrict VPD read permission to root

Some PCI devices will lock up if we attempt to read from VPD addresses
beyond some device-dependent limit. Until we can identify these
devices and adjust the file size accordingly, only let root read VPD
through sysfs to prevent a DoS by normal users.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

Ingo Molnar [Tue, 1 Jul 2008 15:35:06 +0000 (17:35 +0200)]

fix "ftrace: store mcount address in rec->ip"

Alexander Beregalov reported this build failure:

$ make CROSS_COMPILE=sparc64-unknown-linux-gnu- image modules && sudo make modules_install
  CHK     include/linux/version.h
  CHK     include/linux/utsrelease.h
  CALL    scripts/checksyscalls.sh
  CHK     include/linux/compile.h
dnsdomainname: Unknown host
  CC      arch/sparc64/kernel/sparc64_ksyms.o
arch/sparc64/kernel/sparc64_ksyms.c:116: error: '_mcount' undeclared here (not in a function)
cc1: warnings being treated as errors
arch/sparc64/kernel/sparc64_ksyms.c:116: error: type defaults to 'int' in declaration of '_mcount'

And bisected it back to:

| commit 395a59d0f8e86bb39cd700c3d185d30c670bb958
| Author: Abhishek Sagar <sagar.abhishek@gmail.com>
| Date:   Sat Jun 21 23:47:27 2008 +0530
|
|     ftrace: store mcount address in rec->ip

The mcount prototype is only available under CONFIG_FTRACE;
extend it to CONFIG_MCOUNT as well.

Reported-and-bisected-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

Vegard Nossum [Tue, 1 Jul 2008 13:38:13 +0000 (15:38 +0200)]

x86: small unifications of address printing

'man 3 printf' tells me that %p should be printed as if by %#x, but
this is not true for the kernel, which does not use the '0x' prefix
for the %p conversion specifier.

A small cast to (void *) is also prettier than #ifdef/#else/#endif.
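
As a rough userspace illustration of the point above (not the kernel code
itself), the sketch below emulates the kernel's %p output of the time: hex
digits zero-padded to the pointer width, with no '0x' prefix. The helper
name format_kernel_ptr() is hypothetical. Because the width is derived from
sizeof(void *), one code path covers 32-bit and 64-bit builds without any
#ifdef.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: format an address as hex digits, zero-padded to
 * the pointer width and without a "0x" prefix.
 * Returns the number of characters written, as snprintf() does.
 */
static int format_kernel_ptr(char *buf, size_t len, const void *ptr)
{
	int width = (int)(2 * sizeof(void *));	/* 8 on 32-bit, 16 on 64-bit */

	return snprintf(buf, len, "%0*jx", width, (uintmax_t)(uintptr_t)ptr);
}
```

Casting an integer address to a pointer type once, at the call site, is
what lets the same format string serve both word sizes.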

Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

Ben Dooks [Tue, 1 Jul 2008 10:59:43 +0000 (11:59 +0100)]

I2C: S3C2410: Add MODULE_ALIAS() for s3c2440 device.

Add a MODULE_ALIAS() statement for the i2c-s3c2410 controller
to ensure that it can be autoloaded on the S3C2440 systems that
we support.

Signed-off-by: Ben Dooks <ben-linux@fluff.org>

Ben Dooks [Tue, 1 Jul 2008 10:59:42 +0000 (11:59 +0100)]

I2C: S3C2410: Fixup error codes returned from a transfer.

The driver should be returning -ENXIO for transfers that do not
pass the initial address byte stage.

Also makes small tidy-ups to the driver comments in this area.

Signed-off-by: Ben Dooks <ben-linux@fluff.org>

Ben Dooks [Tue, 1 Jul 2008 10:59:41 +0000 (11:59 +0100)]

I2C: S3C2410: Check ACK on byte transmission

We should check for the reception of an ACK after transmitting each
data byte. The address send has been correctly checking this, but the
data write byte state should have also been checking for these failures.

As part of the same fix, we remove the ACK checking from the receive
path where it should not have been checking for an ACK which our hardware
was sending.
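
The rule described above can be modelled outside the driver. The sketch
below is not the i2c-s3c2410 code: the function name and the acked[] array
(index 0 is the address byte, later entries are data bytes) are hypothetical,
and the use of -EIO for an unacknowledged data byte is an assumption made
only for this example. It shows the transmit side checking for an ACK after
every byte it sends.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical model of a master write.
 * acked[0] is the slave's response to the address byte,
 * acked[1..data_len] are its responses to each data byte.
 * Returns the number of data bytes written or a negative errno.
 */
static int write_with_ack_check(const bool *acked, size_t data_len)
{
	size_t i;

	if (!acked[0])			/* address byte not ACKed */
		return -ENXIO;

	for (i = 0; i < data_len; i++) {
		if (!acked[i + 1])	/* data byte not ACKed by the slave */
			return -EIO;	/* assumed error code for this sketch */
	}
	return (int)data_len;
}
```

No equivalent check appears for reads because, on the receive path, the ACK
is generated by the master itself.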

Signed-off-by: Ben Dooks <ben-linux@fluff.org>

Roland McGrath [Mon, 30 Jun 2008 21:02:41 +0000 (14:02 -0700)]

x86 ptrace: fix PTRACE_GETFPXREGS error

ptrace has always returned only -EIO for all failures to access
registers.  The user_regset calls are allowed to return a more
meaningful variety of errors.  The REGSET_XFP calls use -ENODEV
for !cpu_has_fxsr hardware.  Make ptrace return the traditional
-EIO instead of the error code from the user_regset call.
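
A minimal sketch of that mapping, assuming a hypothetical wrapper around
the regset call (the function name is not from the patch): any failure is
collapsed to -EIO, while success is passed through.

```c
#include <errno.h>

/*
 * Hypothetical wrapper: ptrace callers traditionally see only -EIO,
 * so any negative regset result (for example -ENODEV) is mapped to it.
 */
static int ptrace_regset_result(int regset_ret)
{
	return regset_ret < 0 ? -EIO : regset_ret;
}
```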

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>

H. Peter Anvin [Mon, 30 Jun 2008 22:42:47 +0000 (15:42 -0700)]

x86: fix regression: boot failure on AMD Elan TS-5500

Jeremy Fitzhardinge wrote:
>
> Maybe it really does require the far jump immediately after setting PE
> in cr0...
>
> Hm, I don't remember this paragraph being in vol 3a, section 8.9.1
> before.  Is it a recent addition?
>
>    Random failures can occur if other instructions exist between steps
>    3 and 4 above.  Failures will be readily seen in some situations,
>    such as when instructions that reference memory are inserted between
>    steps 3 and 4 while in system management mode.
>

I don't remember that, either.

Signed-off-by: Ingo Molnar <mingo@elte.hu>

Pekka Paalanen [Mon, 30 Jun 2008 20:48:37 +0000 (23:48 +0300)]

mmiotrace broken in linux-next (8-bit writes only)

The moment mmiotrace is enabled, I hit a NULL deref in:

IP: [<ffffffff80256e71>] __trace_special+0x17c/0x23a
Call Trace:
[<ffffffff802573cc>] ftrace_special+0x6f/0x9a
[<ffffffff8023e3e4>] down+0x19/0x4a
[<ffffffff80228adc>] acquire_console_sem+0x42/0x58
[<ffffffff8035d273>] con_flush_chars+0x28/0x43
[<ffffffff80354a70>] write_chan+0x22e/0x334
[<ffffffff802244e9>] ? default_wake_function+0x0/0xf
[<ffffffff8035236d>] tty_write+0x195/0x228
[<ffffffff80354842>] ? write_chan+0x0/0x334
[<ffffffff8027c23a>] vfs_write+0xae/0x137
[<ffffffff8027c6e3>] sys_write+0x47/0x70
[<ffffffff8020b1db>] system_call_after_swapgs+0x7b/0x80

which means 'entry' in __trace_special() is NULL.

[ mingo@elte.hu: that ftrace_special() was a leftover. ]

Signed-off-by: Pekka Paalanen <pq@iki.fi>
Cc: Steven Rostedt <srostedt@redhat.com>
Cc: proski@gnu.org
Cc: "Vegard Nossum" <vegard.nossum@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

Gautham R Shenoy [Fri, 27 Jun 2008 04:47:38 +0000 (10:17 +0530)]

rcu: fix hotplug vs rcu race

Dhaval Giani reported this warning during cpu hotplug stress-tests:

| On running kernel compiles in parallel with cpu hotplug:
|
| WARNING: at arch/x86/kernel/smp.c:118
| native_smp_send_reschedule+0x21/0x36()
| Modules linked in:
| Pid: 27483, comm: cc1 Not tainted 2.6.26-rc7 #1
| [...]
|  [<c0110355>] native_smp_send_reschedule+0x21/0x36
|  [<c014fe8f>] force_quiescent_state+0x47/0x57
|  [<c014fef0>] call_rcu+0x51/0x6d
|  [<c01713b3>] __fput+0x130/0x158
|  [<c0171231>] fput+0x17/0x19
|  [<c016fd99>] filp_close+0x4d/0x57
|  [<c016fdff>] sys_close+0x5c/0x97

IMHO the warning is a spurious one.

cpu_online_map is updated by the _cpu_down() using stop_machine_run().
Since force_quiescent_state is invoked from irqs disabled section,
stop_machine_run() won't be executing while a cpu is executing
force_quiescent_state(). Hence the cpu_online_map is stable while we're
in the irq disabled section.

However, a cpu might have been offlined _just_ before we disabled irqs
while entering force_quiescent_state(). And rcu subsystem might not yet
have handled the CPU_DEAD notification, leading to the offlined cpu's
bit being set in the rcp->cpumask.

Hence cpumask = (rcp->cpumask & cpu_online_map) to prevent sending
smp_reschedule() to an offlined CPU.
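
The masking step can be illustrated with plain bit operations. This is a
userspace sketch, not the RCU code: cpumask_sketch_t is a hypothetical
stand-in for cpumask_t that holds one bit per CPU.

```c
#include <stdint.h>

/* Hypothetical stand-in for cpumask_t: bit N set means CPU N. */
typedef uint64_t cpumask_sketch_t;

/*
 * Only CPUs that RCU is still waiting on AND that are currently online
 * are eligible for a reschedule IPI.
 */
static cpumask_sketch_t resched_targets(cpumask_sketch_t rcp_cpumask,
					cpumask_sketch_t cpu_online_map)
{
	return rcp_cpumask & cpu_online_map;
}
```

With CPUs 0-3 in rcp->cpumask and CPU 3 already offline, the result
contains only CPUs 0-2, so the stale bit never reaches the IPI loop.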

Here's the timeline:

CPU_A                                           CPU_B
--------------------------------------------------------------
cpu_down():                                     .
.                                               .
.                                               .
stop_machine(): /* disables preemption,         .
                 * and irqs */                  .
.                                               .
.                                               .
take_cpu_down();                                .
.                                               .
.                                               .
.                                               .
cpu_disable(); /* this removes cpu              .
                * from cpu_online_map           .
                */                              .
.                                               .
.                                               .
restart_machine(); /* enables irqs */           .
------WINDOW DURING WHICH rcp->cpumask is stale ---------------
.                                               call_rcu();
.                                               /* disables irqs here */
.                                               .force_quiescent_state();
.CPU_DEAD:                                      .for_each_cpu(rcp->cpumask)
.                                               .   smp_send_reschedule();
.                                               .
.                                               .   WARN_ON() for offlined CPU!
.
.
.
rcu_cpu_notify:
.
-------- WINDOW ENDS ------------------------------------------
rcu_offline_cpu() /* Which calls cpu_quiet()
                   * which removes
                   * cpu from rcp->cpumask.
                   */

If a new batch was started just before calling stop_machine_run(), the
"to-be-offlined" cpu is still present in rcp->cpumask.

During a cpu-offline, from take_cpu_down(), we queue an rt-prio idle
task as the next task to be picked by the scheduler. We also call
cpu_disable() which will disable any further interrupts and remove the
cpu's bit from the cpu_online_map.

Once the stop_machine_run() successfully calls take_cpu_down(), it calls
schedule(). That's the last time a schedule is called on the offlined
cpu, and hence the last time when rdp->passed_quiesc will be set to 1
through rcu_qsctr_inc().

But cpu_quiet() will be called on this cpu only when the next
RCU_SOFTIRQ occurs on this CPU. So at this time, the offlined CPU
is still set in rcp->cpumask.

Now coming back to the idle_task which truly offlines the CPU, it does
check for a pending RCU and raises the softirq, since it will find
rdp->passed_quiesc to be 0 in this case. However, since the cpu is
offline I am not sure if the softirq will trigger on the CPU.

Even if it doesn't the rcu_offline_cpu() will find that rcp->completed
is not the same as rcp->cur, which means that our cpu could be holding
up the grace period progression. Hence we call cpu_quiet() and move
ahead.

But because of the window explained in the timeline, we could still have
a call_rcu() before the RCU subsystem executes its CPU_DEAD
notification, and we send smp_send_reschedule() to an offlined cpu while
trying to force the quiescent states. The appended patch adds comments
and prevents checking for offlined cpu every time.

Reported-by: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
Acked-by: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Cc: Dipankar Sarma <dipankar@in.ibm.com>
Cc: laijs@cn.fujitsu.com
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

Johannes Weiner [Tue, 1 Jul 2008 07:12:04 +0000 (09:12 +0200)]

softlockup: fix watchdog task wakeup frequency

The print_timestamp can never be bigger than the touch_timestamp; at
most it can be equal. And if it is, the second check, for
touch_timestamp + 1 being bigger than print_timestamp, is always true, too.

The check for equality is sufficient as we proceed in one-second-steps
and are at least one second away from the last print-out if we have
another timestamp.

Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

Jens Axboe [Tue, 1 Jul 2008 07:07:34 +0000 (09:07 +0200)]

Properly notify block layer of sync writes

fsync_buffers_list() and sync_dirty_buffer() both issue async writes and
then immediately wait on them. Conceptually, that makes them sync writes
and we should treat them as such so that the IO schedulers can handle
them appropriately.

This patch fixes a write starvation issue that Lin Ming reported, where
kjournald is stuck for more than 2 minutes because of a large amount of
synchronous IO in the system:

INFO: task kjournald:20558 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kjournald D ffff810010820978 6712 20558 2
ffff81022ddb1d10 0000000000000046 ffff81022e7baa10 ffffffff803ba6f2
ffff81022ecd0000 ffff8101e6dc9160 ffff81022ecd0348 000000008048b6cb
0000000000000086 ffff81022c4e8d30 0000000000000000 ffffffff80247537
Call Trace:
[<ffffffff803ba6f2>] kobject_get+0x12/0x17
[<ffffffff80247537>] getnstimeofday+0x2f/0x83
[<ffffffff8029c1ac>] sync_buffer+0x0/0x3f
[<ffffffff8066d195>] io_schedule+0x5d/0x9f
[<ffffffff8029c1e7>] sync_buffer+0x3b/0x3f
[<ffffffff8066d3f0>] __wait_on_bit+0x40/0x6f
[<ffffffff8029c1ac>] sync_buffer+0x0/0x3f
[<ffffffff8066d48b>] out_of_line_wait_on_bit+0x6c/0x78
[<ffffffff80243909>] wake_bit_function+0x0/0x23
[<ffffffff8029e3ad>] sync_dirty_buffer+0x98/0xcb
[<ffffffff8030056b>] journal_commit_transaction+0x97d/0xcb6
[<ffffffff8023a676>] lock_timer_base+0x26/0x4b
[<ffffffff8030300a>] kjournald+0xc1/0x1fb
[<ffffffff802438db>] autoremove_wake_function+0x0/0x2e
[<ffffffff80302f49>] kjournald+0x0/0x1fb
[<ffffffff802437bb>] kthread+0x47/0x74
[<ffffffff8022de51>] schedule_tail+0x28/0x5d
[<ffffffff8020cac8>] child_rip+0xa/0x12
[<ffffffff80243774>] kthread+0x0/0x74
[<ffffffff8020cabe>] child_rip+0x0/0x12

Lin Ming confirms that this patch fixes the issue. I've run tests with
it for the past week and no ill effects have been observed, so I'm
proposing it for inclusion into 2.6.26.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

Divyesh Shah [Mon, 16 Jun 2008 16:37:08 +0000 (18:37 +0200)]

block: Fix the starving writes bug in the anticipatory IO scheduler

AS scheduler alternates between issuing read and write batches. It does
the batch switch only after all requests from the previous batch are
completed.

When switching to a write batch, if there is an on-going read request,
it waits for its completion and indicates its intention of switching by
setting ad->changed_batch and the new direction but does not update the
batch_expire_time for the new write batch which it does in the case of
no previous pending requests.
On completion of the read request, it sees that we were waiting for the
switch and schedules work for kblockd right away and resets the
ad->changed_batch flag.
Now when kblockd enters dispatch_request where it is expected to pick
up a write request, it in turn ends the write batch because the
batch_expire_time was not updated and still shows the expire timestamp
for the previous batch.

This results in the write starvation for all the cases where there is
the intention for switching to a write batch, but there is a previous
in-flight read request and the batch gets reverted to a read_batch
right away.

This also holds true in the reverse case (switching from a write batch
to a read batch with an in-flight write request).
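
The failure mode can be reduced to a toy model. The sketch below is not the
anticipatory scheduler code: struct batch_sketch, start_batch() and
batch_expired() are hypothetical names, and the refresh flag only models
whether the expiry timestamp is updated when a new batch begins.

```c
#include <stdbool.h>

/* Hypothetical model of a batch with an absolute expiry time. */
struct batch_sketch {
	unsigned long expires;	/* time at which the current batch ends */
};

/*
 * Begin a new batch at time `now`. When `refresh` is false the expiry
 * is left untouched, which models the bug: it still describes the
 * previous batch.
 */
static void start_batch(struct batch_sketch *b, unsigned long now,
			unsigned long batch_len, bool refresh)
{
	if (refresh)
		b->expires = now + batch_len;
}

/* A batch is over once the current time reaches its expiry. */
static bool batch_expired(const struct batch_sketch *b, unsigned long now)
{
	return now >= b->expires;
}
```

In this model a write batch started at time 150 with a stale expiry of 100
is reported as expired before it dispatches anything, which is the
starvation described above.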

I've checked that this bug exists on 2.6.11, 2.6.18, 2.6.24 and
linux-2.6-block git HEAD. I've tested the fix on x86 platforms with
SCSI drives where the driver asks for the next request while a current
request is in-flight.

This patch is based off linux-2.6-block git HEAD.

Bug reproduction:
A simple scenario which reproduces this bug is:
- dd if=/dev/hda3 of=/dev/null &
- lilo
   The lilo takes forever to complete.

This can also be reproduced fairly easily with the earlier dd and
another test program doing msync().

The example test program below should print out a message after every
iteration but it simply hangs forever. With this bugfix it makes forward
progress.

====
Example test program using msync() (thanks to suleiman AT google DOT com)

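#define _GNU_SOURCE             /* for O_NOATIME */
#include <err.h>
#include <fcntl.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>

/* Read the x86 time-stamp counter (works on 32-bit and 64-bit x86). */
static inline uint64_t
rdtsc(void)
{
         uint32_t lo, hi;

         __asm __volatile("rdtsc" : "=a" (lo), "=d" (hi));
         return (((uint64_t)hi << 32) | lo);
}

int
main(int argc, char **argv)
{
         struct stat st;
         uint64_t e, s, t;
         char *p;
         long i;
         int fd;

         if (argc < 2) {
                 printf("Usage: %s <file>\n", argv[0]);
                 return (1);
         }

         if ((fd = open(argv[1], O_RDWR | O_NOATIME)) < 0)
                 err(1, "open");

         if (fstat(fd, &st) < 0)
                 err(1, "fstat");

         p = mmap(NULL, st.st_size, PROT_READ | PROT_WRITE,
                  MAP_SHARED, fd, 0);
         if (p == MAP_FAILED)
                 err(1, "mmap");

         t = 0;
         for (i = 0; i < 1000; i++) {
                 /* Dirty the first page, then write it back synchronously. */
                 *p = 0;
                 msync(p, 4096, MS_SYNC);
                 /* Time how long the next store to the page takes. */
                 s = rdtsc();
                 *p = 0;
                 __asm __volatile(""::: "memory");
                 e = rdtsc();
                 if (argc > 2)
                         printf("%ld: %" PRIu64 " cycles %jd %jd\n",
                                i, e - s, (intmax_t)s, (intmax_t)e);
                 t += e - s;
         }
         printf("average time: %" PRIu64 " cycles\n", t / 1000);
         return (0);
}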

Cc: <stable@kernel.org>
Acked-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

Thomas Gleixner [Tue, 1 Jul 2008 06:56:32 +0000 (08:56 +0200)]

x86: fix NODES_SHIFT Kconfig range

commit 4323838215184f5a2f081e0d17b8d60731b03164
x86: change size of node ids from u8 to s16

set the range for NODES_SHIFT to 1..15.

The possible range is 1..9

Fixes Bugzilla #10726

Reported-by: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Raistlin [Thu, 26 Jun 2008 16:54:09 +0000 (18:54 +0200)]

sched: fix divide error when trying to configure rt_period to zero

Here is another little Oops we found while configuring invalid values
via cgroups:

echo 0 > /dev/cgroups/0/cpu.rt_period_us
or
echo 4294967296 > /dev/cgroups/0/cpu.rt_period_us

[  205.509825] divide error: 0000 [#1]
[  205.510151] Modules linked in:
[  205.510151]
[  205.510151] Pid: 2339, comm: bash Not tainted (2.6.26-rc8 #33)
[  205.510151] EIP: 0060:[<c030c6ef>] EFLAGS: 00000293 CPU: 0
[  205.510151] EIP is at div64_u64+0x5f/0x70
[  205.510151] EAX: 0000389f EBX: 00000000 ECX: 00000000 EDX: 00000000
[  205.510151] ESI: d9800000 EDI: 00000000 EBP: c6cede60 ESP: c6cede50
[  205.510151]  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
[  205.510151] Process bash (pid: 2339, ti=c6cec000 task=c79be370 task.ti=c6cec000)
[  205.510151] Stack: d9800000 0000389f c05971a0 d9800000 c6cedeb4 c0214dbd 00000000 00000000
[  205.510151]        c6cede88 c0242bd8 c05377c0 c7a41b40 00000000 00000000 00000000 c05971a0
[  205.510151]        c780ed20 c7508494 c7a41b40 00000000 00000002 c6cedebc c05971a0 ffffffea
[  205.510151] Call Trace:
[  205.510151]  [<c0214dbd>] ? __rt_schedulable+0x1cd/0x240
[  205.510151]  [<c0242bd8>] ? cgroup_file_open+0x18/0xe0
[  205.510151]  [<c0214fe4>] ? tg_set_bandwidth+0xa4/0xf0
[  205.510151]  [<c0215066>] ? sched_group_set_rt_period+0x36/0x50
[  205.510151]  [<c021508e>] ? cpu_rt_period_write_uint+0xe/0x10
[  205.510151]  [<c0242dc5>] ? cgroup_file_write+0x125/0x160
[  205.510151]  [<c0232c15>] ? hrtimer_interrupt+0x155/0x190
[  205.510151]  [<c02f047f>] ? security_file_permission+0xf/0x20
[  205.510151]  [<c0277ad8>] ? rw_verify_area+0x48/0xc0
[  205.510151]  [<c0283744>] ? dupfd+0x104/0x130
[  205.510151]  [<c027838c>] ? vfs_write+0x9c/0x160
[  205.510151]  [<c0242ca0>] ? cgroup_file_write+0x0/0x160
[  205.510151]  [<c027850d>] ? sys_write+0x3d/0x70
[  205.510151]  [<c0203019>] ? sysenter_past_esp+0x6a/0x91
[  205.510151]  =======================
[  205.510151] Code: 0f 45 de 31 f6 0f ad d0 d3 ea f6 c1 20 0f 45 c2 0f 45 d6 89 45 f0 89 55 f4 8b 55 f4 31 c9 8b 45 f0 39 d3 89 c6 77 08 89 d0 31 d2 <f7> f3 89 c1 83 c4 08 89 f0 f7 f3 89 ca 5b 5e 5d c3 55 89 e5 56
[  205.510151] EIP: [<c030c6ef>] div64_u64+0x5f/0x70 SS:ESP 0068:c6cede50

The attached patch solves the issue for me.

I'm checking as soon as possible for the period not being zero since, if
it is, going ahead is useless. This way we also save a mutex_lock() and
a read_lock() wrt doing it inside tg_set_bandwidth() or
__rt_schedulable().
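
A minimal sketch of such an early check, under stated assumptions: the
function name is hypothetical, and the 32-bit parameter models the guess
that writing 4294967296 ends up as zero because it is truncated to 32 bits
before the period is used as a divisor.

```c
#include <errno.h>
#include <stdint.h>

#define NSEC_PER_USEC_SKETCH 1000ULL	/* local constant for this sketch */

/*
 * Hypothetical setter: reject a zero period before taking any locks
 * or reaching a division by the period.
 */
static int set_rt_period_sketch(uint32_t rt_period_us, uint64_t *period_ns)
{
	uint64_t ns = (uint64_t)rt_period_us * NSEC_PER_USEC_SKETCH;

	if (ns == 0)
		return -EINVAL;

	*period_ns = ns;
	return 0;
}
```

Doing the test first means the invalid write fails with -EINVAL instead of
reaching div64_u64() with a zero divisor.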

Signed-off-by: Dario Faggioli <raistlin@linux.it>
Signed-off-by: Michael Trimarchi <trimarchimichael@yahoo.it>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

Michael Neuling [Tue, 1 Jul 2008 04:01:39 +0000 (14:01 +1000)]

powerpc: Update for VSX core file and ptrace

This correctly hooks the VSX dump into Roland McGrath's core file
infrastructure. It adds the VSX dump information as an additional elf
note in the core file (after talking more to the tool chain/gdb guys).
This also ensures the formats are consistent between signals, ptrace
and core files.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>

Michael Neuling [Tue, 1 Jul 2008 04:01:39 +0000 (14:01 +1000)]

powerpc: Fix compile error for CONFIG_VSX

Fix compile error when CONFIG_VSX is enabled.

arch/powerpc/kernel/signal_64.c: In function 'restore_sigcontext':
arch/powerpc/kernel/signal_64.c:241: error: 'i' undeclared (first use in this function)

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>

Eric B Munson [Mon, 30 Jun 2008 16:12:13 +0000 (02:12 +1000)]

powerpc: Keep 3 high personality bytes across exec

Currently when a 32 bit process is exec'd on a powerpc 64 bit host the
value in the top three bytes of the personality is clobbered.  This patch
adds a check in the SET_PERSONALITY macro that will carry all the
values in the top three bytes across the exec.

These three bytes currently carry flags to disable address randomisation,
limit the address space, force zeroing of an mmapped page, etc.  Should an
application set any of these bits they will be maintained and honoured on
a homogeneous environment but discarded and ignored on a heterogeneous
environment.  So if an application requires all mmapped pages to be initialised
to zero and a wrapper is used to setup the personality and exec the target,
these flags will remain set on an all 32 or all 64 bit environment, but they
will be lost in the exec on a mixed 32/64 bit environment.  Losing these bits
means that the same application would behave differently in different
environments.  Tested on a POWER5+ machine with 64bit kernel and a mixed
64/32 bit user space.
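
The idea can be sketched with plain bit masking. This is an illustration,
not the SET_PERSONALITY macro itself: the function name is hypothetical and
the constants are local copies (suffixed _SKETCH) of values from
<linux/personality.h>.

```c
#include <stdint.h>

/* Local copies of <linux/personality.h> values, for illustration only. */
#define PER_MASK_SKETCH			0x000000ffu	/* low byte: execution domain */
#define PER_LINUX32_SKETCH		0x00000008u
#define ADDR_NO_RANDOMIZE_SKETCH	0x00040000u	/* one of the flag bits */

/*
 * Replace only the low personality byte and keep the flag bits held in
 * the top three bytes.
 */
static uint32_t set_personality_keep_flags(uint32_t current_pers,
					   uint32_t new_domain)
{
	return (current_pers & ~PER_MASK_SKETCH) |
	       (new_domain & PER_MASK_SKETCH);
}
```

A process that had disabled address randomisation therefore still has that
flag set after switching to the 32-bit personality.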

Signed-off-by: Eric B Munson <ebmunson@us.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>

H. Peter Anvin [Mon, 30 Jun 2008 21:42:18 +0000 (14:42 -0700)]

x86: remove arbitrary ELF section limit in i386 relocatable kernel

Impact: build failure in maximal configurations

The 32-bit x86 relocatable kernel requires an auxiliary host program
to process the relocations. This program had a hard-coded arbitrary
limit of 100 ELF sections. Instead of a hard-coded limit, allocate
the structures dynamically.
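
A minimal sketch of the approach, assuming a hypothetical per-section
record (the struct and function names are not taken from the host program):
the table is sized from the section count read at run time rather than from
a fixed array bound.

```c
#include <stdlib.h>

/* Hypothetical per-section bookkeeping record. */
struct section_sketch {
	unsigned long offset;
	unsigned long size;
};

/*
 * Allocate a zeroed table with one entry per ELF section.
 * The caller checks for NULL and frees the table when done.
 */
static struct section_sketch *alloc_sections(size_t shnum)
{
	return calloc(shnum, sizeof(struct section_sketch));
}
```

With this shape a build that produces, say, 250 sections needs no source
change, only a larger allocation.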

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>

Bart Van Assche [Sat, 28 Jun 2008 06:51:35 +0000 (16:51 +1000)]

powerpc: Make sure that include/asm-powerpc/spinlock.h does not trigger compilation warnings

When compiling kernel modules for ppc that include <linux/spinlock.h>,
gcc prints a warning message every time it encounters a function
declaration where the inline keyword appears after the return type.
This makes sure that the order of the inline keyword and the return
type is as gcc expects it. Additionally, the __inline__ keyword is
replaced by inline, as checkpatch expects.
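
The ordering issue looks roughly like this. The function below is a
hypothetical example, not one of the spinlock.h helpers; the commented-out
line shows the order that draws the warning.

```c
/*
 * Order that draws the warning: the inline keyword after the return type.
 *
 *     static int __inline__ is_locked_sketch(int slock);
 *
 * Preferred order: storage class, then inline, then the return type.
 */
static inline int is_locked_sketch(int slock)
{
	return slock != 0;
}
```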

Signed-off-by: Bart Van Assche <bart.vanassche@gmail.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>

Stephen Rothwell [Fri, 27 Jun 2008 06:18:27 +0000 (16:18 +1000)]

powerpc: Explicitly copy elements of pt_regs

Gcc 4.3 produced this warning:

arch/powerpc/kernel/signal_64.c: In function 'restore_sigcontext':
arch/powerpc/kernel/signal_64.c:161: warning: array subscript is above array bounds

This is caused by us copying to aliases of elements of the pt_regs
structure. Make those explicit.

This adds one extra __get_user and unrolls a loop.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>

Bernhard Walle [Thu, 26 Jun 2008 17:02:15 +0000 (03:02 +1000)]

powerpc: Remove experimental status of kdump on 64-bit powerpc

This removes the experimental status of kdump on PPC64. kdump has been
available on PPC64 for more than a year and it has proven to be stable.

Signed-off-by: Bernhard Walle <bwalle@suse.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>

Andy Whitcroft [Thu, 26 Jun 2008 09:55:58 +0000 (19:55 +1000)]

powerpc: Add 64 bit version of huge_ptep_set_wrprotect

The implementation of huge_ptep_set_wrprotect() directly calls
ptep_set_wrprotect() to mark a hugepte write protected.  However this
call is not appropriate on ppc64 kernels as this is a small page only
implementation.  This can lead to the hash not being flushed correctly
when a mapping is being converted to COW, allowing processes to continue
using the original copy.

Currently huge_ptep_set_wrprotect() unconditionally calls
ptep_set_wrprotect().  This is fine on ppc32 kernels as this call is
generic.  On 64 bit this is implemented as:

pte_update(mm, addr, ptep, _PAGE_RW, 0);

On ppc64 this last parameter is the page size and is passed directly on
to hpte_need_flush():

hpte_need_flush(mm, addr, ptep, old, huge);

And this directly affects the page size we pass to flush_hash_page():

flush_hash_page(vaddr, rpte, psize, ssize, 0);

As this changes the way the hash is calculated we will flush the wrong
pages, potentially leaving live hashes to the original page.

Move the definition of huge_ptep_set_wrprotect() to the 32/64 bit specific
headers.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>

Andrew Lewis [Thu, 26 Jun 2008 09:29:05 +0000 (19:29 +1000)]

powerpc: Prevent memory corruption due to cache invalidation of unaligned DMA buffer

On PowerPC processors with non-coherent cache architectures the DMA
subsystem calls invalidate_dcache_range() before performing a DMA read
operation.  If the address and length of the DMA buffer are not aligned
to a cache-line boundary this can result in memory outside of the DMA
buffer being invalidated in the cache.  If this memory has an
uncommitted store then the data will be lost and a subsequent read of
that address will result in an old value being returned from main memory.

Only when the DMA buffer starts on a cache-line boundary and is an exact
multiple of the cache-line size can invalidate_dcache_range() be called,
otherwise flush_dcache_range() must be called.  flush_dcache_range()
will first flush uncommitted writes, and then invalidate the cache.
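
The alignment rule can be written as a small predicate. This is a sketch
only: the function name is hypothetical and the cache-line size is passed
in as a parameter rather than taken from any kernel constant.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Returns true when a buffer may simply be invalidated: it starts on a
 * cache-line boundary and its length is a whole number of cache lines.
 * Otherwise the caller must flush (write back, then invalidate) so that
 * neighbouring data sharing the first or last line is not lost.
 */
static bool can_invalidate_only(uintptr_t start, size_t len, size_t line)
{
	return (start % line == 0) && (len % line == 0);
}
```

For a 32-byte line, a 64-byte buffer at 0x1000 qualifies, while the same
buffer at 0x1004 or a 60-byte buffer at 0x1000 does not.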

Signed-off-by: Andrew Lewis <andrew-lewis at netspace.net.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>

Kumar Gala [Thu, 26 Jun 2008 08:58:11 +0000 (18:58 +1000)]

powerpc/bootwrapper: Pad .dtb by default

Since most bootloaders or wrappers tend to update or add some information
to the .dtb they are handed, they need some working space to do that in.

By default add 1K of padding via a default setting of DTS_FLAGS.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>

Michael Neuling [Wed, 25 Jun 2008 04:07:18 +0000 (14:07 +1000)]

powerpc: Add CONFIG_VSX config option

Add CONFIG_VSX config build option. Must compile with POWER4, FPU and ALTIVEC.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>

Michael Neuling [Wed, 25 Jun 2008 04:07:18 +0000 (14:07 +1000)]

powerpc: Add VSX context save/restore, ptrace and signal support

This patch extends the floating point save and restore code to use the
VSX load/stores when VSX is available. This will make FP context
save/restore marginally slower on FP only code, when VSX is available,
as it has to load/store 128bits rather than just 64bits.

Mixing FP, VMX and VSX code will get consistent architected state.

The signals interface is extended to enable access to VSR 0-31
doubleword 1 after discussions with tool chain maintainers. Backward
compatibility is maintained.

The ptrace interface is also extended to allow access to VSR 0-31 full
registers.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
