pilppa.org Git - linux-2.6-omap-h63xx.git/log

Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx:
  ioat: fix 'ack' handling, driver must ensure that 'ack' is zero
  dmaengine: fix sparse warning
  fsldma: do not cleanup descriptors in hardirq context
  dmaengine: add driver for Freescale MPC85xx DMA controller

scsi: missing add of padded bytes to io completion byte count

Original patch from Tejun Heo <htejun@gmail.com> but should use ->extra_len
and not ->data_len, as we would then overshoot the original request size.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched-devel

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched-devel:
  sched: revert load_balance_monitor() changes

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86:
  x86/xen: fix DomU boot problem
  x86: not set node to cpu_to_node if the node is not online
  x86, i387: fix ptrace leakage using init_fpu()

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm:
  x86: disable KVM for Voyager and friends
  KVM: VMX: Avoid rearranging switched guest msrs while they are loaded
  KVM: MMU: Fix race when instantiating a shadow pte
  KVM: Route irq 0 to vcpu 0 exclusively
  KVM: Avoid infinite-frequency local apic timer
  KVM: make MMU_DEBUG compile again
  KVM: move alloc_apic_access_page() outside of non-preemptable region
  KVM: SVM: fix Windows XP 64 bit installation crash
  KVM: remove the usage of the mmap_sem for the protection of the memory slots.
  KVM: emulate access to MSR_IA32_MCG_CTL
  KVM: Make the supported cpuid list a host property rather than a vm property
  KVM: Fix kvm_arch_vcpu_ioctl_set_sregs so that set_cr0 works properly
  KVM: SVM: set NM intercept when enabling CR0.TS in the guest
  KVM: SVM: Fix lazy FPU switching

ioat: fix 'ack' handling, driver must ensure that 'ack' is zero

Initialize 'ack' to zero in case the descriptor has been recycled.

Prevents "kernel BUG at crypto/async_tx/async_xor.c:185!"

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Cc: stable@kernel.org

dmaengine: fix sparse warning

include/linux/dmaengine.h:364:2: warning: returning void-valued expression

Signed-off-by: Dan Williams <dan.j.williams@intel.com>

fsldma: do not cleanup descriptors in hardirq context

"Cleaning" descriptors involves calling pending callbacks and clients
assume that their callback will only ever happen in softirq context.
Delay cleanup to the tasklet.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Zhang Wei <wei.zhang@freescale.com>

dmaengine: add driver for Freescale MPC85xx DMA controller

The driver implements DMA engine API for Freescale MPC85xx DMA controller,
which could be used by devices in the silicon. The driver supports the
Basic mode of Freescale MPC85xx DMA controller. The MPC85xx processors
supported include MPC8540/60, MPC8555, MPC8548, MPC8641 and so on.

The MPC83xx processors (MPC8349, MPC8360) are also supported.

[kamalesh@linux.vnet.ibm.com: build fix]
[dan.j.williams@intel.com: merge mm fixes, rebase on async_tx-2.6.25]
Signed-off-by: Zhang Wei <wei.zhang@freescale.com>
Signed-off-by: Ebony Zhu <ebony.zhu@freescale.com>
Acked-by: Kumar Gala <galak@gate.crashing.org>
Cc: Shannon Nelson <shannon.nelson@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>

sched: revert load_balance_monitor() changes

The following commits cause a number of regressions:

  commit 58e2d4ca581167c2a079f4ee02be2f0bc52e8729
  Author: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
  Date:   Fri Jan 25 21:08:00 2008 +0100
  sched: group scheduling, change how cpu load is calculated

  commit 6b2d7700266b9402e12824e11e0099ae6a4a6a79
  Author: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
  Date:   Fri Jan 25 21:08:00 2008 +0100
  sched: group scheduler, fix fairness of cpu bandwidth allocation for task groups

Namely:
- very frequent wakeups on SMP, reported by PowerTop users.
- cacheline trashing on (large) SMP
- some latencies larger than 500ms

While there is a mergeable patch to fix the latter, the former issues
are not fixable in a manner suitable for .25 (we're at -rc3 now).

Hence we revert them and try again in v2.6.26.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Tested-by: Alexey Zaytsev <alexey.zaytsev@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86/xen: fix DomU boot problem

Construct Xen guest e820 map with a hole between 640K-1M.

It's pure luck that Xen kernels have gotten away with it in the past.

The patch below seems like the right thing to do. It certainly boots in
a domU without the DMI problem (without any of the other related patches
such as Alexander's).
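
As a rough illustration of the map described above, the sketch below builds
a RAM map that leaves the 640K-1M range unclaimed. The struct, the constants
and build_guest_map() are hypothetical stand-ins chosen for this example;
they are not the kernel's e820 API or the code from the patch.

```c
#include <stddef.h>
#include <stdint.h>

#define LOW_MEM_END    0xA0000ULL   /* 640K: start of the legacy hole */
#define HIGH_MEM_START 0x100000ULL  /* 1M: end of the legacy hole */

/* Hypothetical map entry; not the kernel's struct e820entry. */
struct mem_region {
	uint64_t start;
	uint64_t size;
};

/*
 * Fill 'map' with the RAM regions of a guest that owns 'mem_end' bytes,
 * skipping the 640K-1M range. Returns the number of entries written
 * (0, 1 or 2).
 */
static size_t build_guest_map(struct mem_region map[2], uint64_t mem_end)
{
	size_t n = 0;

	if (mem_end == 0)
		return 0;

	/* RAM below the hole, clipped to the amount of memory we have. */
	map[n].start = 0;
	map[n].size = mem_end < LOW_MEM_END ? mem_end : LOW_MEM_END;
	n++;

	/* RAM above the hole, if the guest is large enough to reach it. */
	if (mem_end > HIGH_MEM_START) {
		map[n].start = HIGH_MEM_START;
		map[n].size = mem_end - HIGH_MEM_START;
		n++;
	}
	return n;
}
```

With 128MB of guest memory this yields two entries, [0, 640K) and
[1M, 128MB), so nothing in between is reported as usable RAM.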

Signed-off-by: Ian Campbell <ijc@hellion.org.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Tested-by: Mark McLoughlin <markmc@redhat.com>
Acked-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Thomas Gleixner <tglx@linutronix.de>

x86: not set node to cpu_to_node if the node is not online

Resolve a boot problem reported by Mel Gorman:

http://lkml.org/lkml/2008/2/13/404

init_cpu_to_node uses the cpu->apic mapping (from MADT or mptable) and
the apic->node mapping (from SRAT or AMD config space with k8_bus_64.c)
to build the cpu->node mapping; later, identify_cpu overwrites it
again (with nearby_node...).

This patch checks whether the node is online and, if it is not, leaves
the cpu_node map untouched. The cpu_node map thus only points to online
nodes before identify_cpu runs, which prevents possible errors.

Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Thomas Gleixner <tglx@linutronix.de>

x86, i387: fix ptrace leakage using init_fpu()

This bug got introduced by the recent i387 merge:

  commit 4421011120b2304e5c248ae4165a2704588aedf1
  Author: Roland McGrath <roland@redhat.com>
  Date:   Wed Jan 30 13:31:50 2008 +0100

      x86: x86 i387 user_regset

Current usage of unlazy_fpu() in ptrace specific routines is wrong.
unlazy_fpu() will not init fpu if the task never used math. So the
ptrace calls can expose the parent task's FPU data in some cases.

Replace it with init_fpu(), which will init the math state if the
task never used math before.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Thomas Gleixner <tglx@linutronix.de>

Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block

* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  block: fix blkdev_issue_flush() not detecting and passing EOPNOTSUPP back
  block: fix shadowed variable warning in blk-map.c
  block: remove extern on function definition
  cciss: remove READ_AHEAD define and use block layer defaults
  make cdrom.c:check_for_audio_disc() static
  block/genhd.c: proper externs
  unexport blk_rq_map_user_iov
  unexport blk_{get,put}_queue
  block/genhd.c: cleanups
  proper prototype for blk_dev_init()
  block/blk-tag.c should #include "blk.h"
  Fix DMA access of block device in 64-bit kernel on some non-x86 systems with 4GB or upper 4GB memory
  block: separate out padding from alignment
  block: restore the meaning of rq->data_len to the true data length
  resubmit: cciss: procfs updates to display info about many
  splice: only return -EAGAIN if there's hope of more data
  block: fix kernel-docbook parameters and files

m68k{,nommu}: Wire up new timerfd syscalls

m68k{,nommu}: Wire up the new timerfd syscalls, which were introduced in
commit 4d672e7ac79b5ec5cdc90e450823441e20464691 ("timerfd: new timerfd API").

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

m68knommu: fix fec driver interrupt races

The FEC driver has a common interrupt handler for all interrupt event
types. It is raised on a number of distinct interrupt vectors.
This handler can't be re-entered while processing an interrupt, so
make sure all requested vectors are flagged as IRQF_DISABLED.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

m68knommu: declare do_IRQ()

Need a declaration of do_IRQ for the 68328 interrupt handling code.
It is common to all m68knommu targets, so a common declaration makes
sense.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

m68knommu: remove duplicate hw_tick() code

Remove duplicate hw_tick() function from 68328 timers code.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

m68knommu: update defconfig

Update the m68knommu defconfig.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Build fix for drivers/s390/char/defkeymap.c

Commit 5ce2087ed0eb424e0889bdc9102727f65d2ecdde (Fix default compose
table initialization) left a trailing quote.

CC drivers/s390/char/defkeymap.o
drivers/s390/char/defkeymap.c:155: error: missing terminating ' character
drivers/s390/char/defkeymap.c:156: error: syntax error before ';' token
make[3]: *** [drivers/s390/char/defkeymap.o] Error 1

Fix that.

Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
  [CIFS] remove unused variable
  [CIFS] consolidate duplicate code in posix/unix inode handling
  [CIFS] fix build break when proc disabled
  [CIFS] factoring out common code in get_inode_info functions
  [CIFS] fix prepath conversion when server supports posix paths
  [CIFS] Only convert / when server does not support posix paths
  [CIFS] Fix mixed case name in structure dfs_info3_param
  [CIFS] fixup prefixpaths which contain multiple path components
  [CIFS] fix typo
  [CIFS] patch to fix incorrect encoding of number of aces on set mode
  [CIFS] Fix typo in quota operations
  [CIFS] clean up some hard to read ifdefs
  [CIFS] reduce checkpatch warnings
  [CIFS] fix warning in cifs_spnego.c

freezer vs stopped or traced

This changes the "freezer" code used by suspend/hibernate in its treatment
of tasks in TASK_STOPPED (job control stop) and TASK_TRACED (ptrace) states.

As I understand it, the intent of the "freezer" is to hold all tasks
from doing anything significant.  For this purpose, TASK_STOPPED and
TASK_TRACED are "frozen enough".  It's possible the tasks might resume
from ptrace calls (if the tracer were unfrozen) or from signals
(including ones that could come via timer interrupts, etc).  But this
doesn't matter as long as they quickly block again while "freezing" is
in effect.  Some minor adjustments to the signal.c code make sure that
try_to_freeze() very shortly follows all wakeups from both kinds of
stop.  This lets the freezer code safely leave stopped tasks unmolested.

Changing this fixes the longstanding bug of seeing after resuming from
suspend/hibernate your shell report "[1] Stopped" and the like for all
your jobs stopped by ^Z et al, as if you had freshly fg'd and ^Z'd them.
It also removes from the freezer the arcane special case treatment for
ptrace'd tasks, which relied on intimate knowledge of ptrace internals.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

x86: disable KVM for Voyager and friends

Most classic Pentiums don't have hardware virtualization extensions,
and building kvm with Voyager, Visual Workstation, or NUMAQ
generates spurious failures.

Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>

KVM: VMX: Avoid rearranging switched guest msrs while they are loaded

KVM tries to run as much as possible with the guest msrs loaded instead of
host msrs, since switching msrs is very expensive.  It also tries to minimize
the number of msrs switched according to the guest mode; for example,
MSR_LSTAR is needed only by long mode guests.  This optimization is done by
setup_msrs().

However, we must not change which msrs are switched while we are running with
guest msr state:

- switch to guest msr state
- call setup_msrs(), removing some msrs from the list
- switch to host msr state, leaving a few guest msrs loaded

An easy way to trigger this is to kexec an x86_64 linux guest.  Early during
setup, the guest will switch EFER to not include SCE.  KVM will stop saving
MSR_LSTAR, and on the next msr switch it will leave the guest LSTAR loaded.
The next host syscall will end up in a random location in the kernel.

Fix by reloading the host msrs before changing the msr list.

Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: MMU: Fix race when instantiating a shadow pte

For improved concurrency, the guest walk is performed concurrently with other
vcpus.  This means that we need to revalidate the guest ptes once we have
write-protected the guest page tables, at which point they can no longer be
modified.

The current code attempts to avoid this check if the shadow page table is not
new, on the assumption that if it has existed before, the guest could not have
modified the pte without the shadow lock.  However the assumption is incorrect,
as the racing vcpu could have modified the pte, then instantiated the shadow
page, before our vcpu regains control:

  vcpu0        vcpu1

  fault
  walk pte

               modify pte
               fault in same pagetable
               instantiate shadow page

  lookup shadow page
  conclude it is old
  instantiate spte based on stale guest pte

We could do something clever with generation counters, but a test run by
Marcelo suggests this is unnecessary and we can just do the revalidation
unconditionally.  The pte will be in the processor cache and the check can
be quite fast.

Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Route irq 0 to vcpu 0 exclusively

Some Linux versions allow the timer interrupt to be processed by more than
one cpu, leading to hangs due to tsc instability. Work around the issue
by only dispatching the interrupt to vcpu 0.

Problem analyzed (and patch tested) by Sheng Yang.

Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Avoid infinite-frequency local apic timer

If the local apic initial count is zero, don't start an hrtimer with infinite
frequency, which would lock up the host.
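
A minimal sketch of such a guard, assuming the period is derived from the
initial count, a divide value and a tick length. The helper name and its
parameters are hypothetical and are not taken from the KVM source:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: compute the timer period in nanoseconds from the
 * guest-programmed initial count. Returns false when the period would be
 * zero, in which case the caller should not arm the timer at all.
 */
static bool apic_timer_period_ns(uint32_t initial_count, uint32_t divide,
				 uint64_t ns_per_tick, uint64_t *period_ns)
{
	uint64_t period = (uint64_t)initial_count * divide * ns_per_tick;

	/* A zero period would make a periodic timer fire back to back. */
	if (period == 0)
		return false;

	*period_ns = period;
	return true;
}
```

A caller would arm the timer only when the helper returns true and simply
leave it stopped otherwise.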

Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: make MMU_DEBUG compile again

The cr3 variable is now inside the vcpu->arch structure.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: move alloc_apic_access_page() outside of non-preemptable region

alloc_apic_access_page() can sleep, while vmx_vcpu_setup is called
inside a non preemptable region. Move it after put_cpu().

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: SVM: fix Windows XP 64 bit installation crash

During installation, Windows XP 64 bit wants to access the DEBUGCTL and the
last branch record (LBR) MSRs. Not allowing this in KVM causes the installation
to crash. This patch allows access to these MSRs and fixes the issue.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Markus Rechberger <markus.rechberger@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: remove the usage of the mmap_sem for the protection of the memory slots.

This patch replaces the mmap_sem lock for the memory slots with a new
kvm private lock. It is needed because until now there were cases where
kvm accessed user memory while holding the mmap semaphore.

Signed-off-by: Izik Eidus <izike@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

atmel_tc clocksource/clockevent code

Clocksource and clockevent device based on the Atmel TC blocks.

The clockevent device handles both periodic and oneshot modes, so this
enables NO_HZ and high res timers on some platforms that previously
couldn't use those mechanisms.

This works on both AVR32 and AT91 chips, given relevant patches for
tclib support (always) and clockevents (or else this will only look
like a higher precision clocksource).  It's an updated and modularized
version of an AT91-only patch that has circulated for some time now.

Changes relative to the original patch:
  * Update to use new tclib API
  * Replace open-coded do-while loop using goto with a real do-while loop
  * Minor irq handler optimization: Load register base address from
    dev_id instead of a global variable.
  * Aggressively turn off clocks when the clockevent isn't being used
  * Include the clockevent code on AT91RM9200 as well. The rating is
    lower than the System Timer, so the clock will usually stay off.
  * Don't assume that the number of clocks is always equal to the
    number of irqs.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>

atmel_tc library

Create <linux/atmel_tc.h> based on <asm-arm/arch-at91/at91-tc.h> and the
at91sam9263 and at32ap7000 datasheets.  Most AT91 and AT32 SOCs have one
or two of these TC blocks, which include three 16-bit timers that can be
interconnected in various ways.

These TC blocks can be used for external interfacing (such as PWM and
measurement), or used as somewhat quirky sixteen-bit timers.

Changes relative to the original version:
  * Drop unneeded inclusion of <linux/mutex.h>
  * Support an arbitrary number of TC blocks
  * Return a struct with information about a TC block from
    atmel_tc_alloc() instead of using a combination of return values
    and "out" parameters.
  * ioremap() the I/O registers on allocation
  * Look up clocks and irqs for all channels
  * Add "name" parameter to atmel_tc_alloc() and use this when
    requesting the iomem resource.
  * Check if the platform provided the necessary resources at probe()
    time instead of when the TCB is allocated.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>

block: fix blkdev_issue_flush() not detecting and passing EOPNOTSUPP back

This is important to e.g. dm, which tries to decide whether to stop using
barriers or not.

Tested as working by Anders Henke <anders.henke@1und1.de>

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

block: fix shadowed variable warning in blk-map.c

Introduced between 2.6.25-rc2 and -rc3
block/blk-map.c:154:14: warning: symbol 'bio' shadows an earlier one
block/blk-map.c:110:13: originally declared here

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

block: remove extern on function definition

Introduced between 2.6.25-rc2 and -rc3
block/blk-settings.c:319:12: warning: function 'blk_queue_dma_drain' with external linkage has definition

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

cciss: remove READ_AHEAD define and use block layer defaults

This patch removes the #define READ_AHEAD 1024 from the driver and uses the
block layer defaults, instead. We have found that under certain workloads
the setting can cause a disk connected to the e200 controller to go offline.
If the disk hiccups the link may try to downshift but the controller is
never notified that the link successfully completed the renegotiation.
We've also found that performance using the block layer default of 32 pages
was on par with the 1024 setting. We tried setting it to zero at one time
based on info from our firmware guys but that killed performance. Turns out
we were talking about 2 different read ahead settings.
Please consider this for inclusion.

Signed-off-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

make cdrom.c:check_for_audio_disc() static

This patch makes the needlessly global check_for_audio_disc() static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

block/genhd.c: proper externs

This patch adds proper externs for two structs in include/linux/genhd.h

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

unexport blk_rq_map_user_iov

This patch removes the unused export of blk_rq_map_user_iov.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

unexport blk_{get,put}_queue

This patch removes the unused exports of blk_{get,put}_queue.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

block/genhd.c: cleanups

This patch contains the following cleanups:
- make the needlessly global struct disk_type static
- #if 0 the unused genhd_media_change_notify()

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

proper prototype for blk_dev_init()

This patch adds a proper prototype for blk_dev_init() in block/blk.h

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

block/blk-tag.c should #include "blk.h"

Every file should include the headers containing the externs for its
global functions (in this case for __blk_queue_free_tags()).

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

Fix DMA access of block device in 64-bit kernel on some non-x86 systems with 4GB or upper 4GB memory

On some non-x86 systems with 4GB of memory or more,
we need to increase the range of addresses that can be
used for direct DMA in a 64-bit kernel.

Signed-off-by: Yang Shi <yang.shi@windriver.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

block: separate out padding from alignment

Block layer alignment was used for two different purposes - memory
alignment and padding.  This causes problems in lower layers because
drivers which only require memory alignment end up with an adjusted
rq->data_len.  Separate out padding such that padding occurs iff the
driver explicitly requests it.

Tomo: restore the code to update bio in blk_rq_map_user
      introduced by the commit 40b01b9bbdf51ae543a04744283bf2d56c4a6afa
      according to padding alignment.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

block: restore the meaning of rq->data_len to the true data length

The meaning of rq->data_len was changed to the length of an allocated
buffer from the true data length. It breaks SG_IO friends and
bsg. This patch restores the meaning of rq->data_len to the true data
length and adds rq->extra_len to store an extended length (due to
drain buffer and padding).
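
The split between the two lengths can be sketched as follows. The struct and
both helpers are hypothetical and only borrow the field names used in the
text above; the padding rule (round up to a power-of-two boundary given as a
mask) is an assumption made for the example, not a copy of the block layer
code.

```c
#include <stdint.h>

/* Hypothetical request; field names follow the commit text only. */
struct request_sketch {
	uint32_t data_len;   /* true data length requested by the caller */
	uint32_t extra_len;  /* bytes added for padding or a drain buffer */
};

/*
 * Record in extra_len how many bytes are needed to pad data_len up to
 * the next multiple of (pad_mask + 1). pad_mask is assumed to be a
 * power of two minus one. data_len itself is left untouched.
 */
static void apply_padding(struct request_sketch *rq, uint32_t pad_mask)
{
	uint32_t rem = rq->data_len & pad_mask;

	rq->extra_len = rem ? pad_mask + 1 - rem : 0;
}

/* Length of the buffer that is actually handed to the hardware. */
static uint32_t transfer_len(const struct request_sketch *rq)
{
	return rq->data_len + rq->extra_len;
}
```

For a 510-byte request and a 4-byte padding boundary, data_len stays 510,
extra_len becomes 2, and the transfer length is 512. Callers that care about
the true data length keep reading data_len.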

This patch also removes the code to update bio in blk_rq_map_user
introduced by the commit 40b01b9bbdf51ae543a04744283bf2d56c4a6afa.
The commit adjusts bio according to memory alignment
(queue_dma_alignment). However, memory alignment is NOT padding
alignment. This adjustment also breaks SG_IO friends and bsg. Padding
alignment needs to be fixed in a proper way (by a separate patch).

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <axboe@carl.home.kernel.dk>

resubmit: cciss: procfs updates to display info about many volumes

This patch allows us to display information about all of the logical volumes
configured on a particular controller without stepping on memory even when
there are many volumes (128 or more) configured.
Please consider this for inclusion.

Signed-off-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

splice: only return -EAGAIN if there's hope of more data

sys_tee() is currently a bit eager in returning -EAGAIN: it may do so
even if there is no chance of any more data becoming available. So
improve the logic and only return -EAGAIN if we have an attached writer
to the input pipe.
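
The decision for an empty input pipe can be sketched like this. The function
name, its parameters and its return convention are hypothetical and only
model the rule stated above; this is not the splice code itself.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical decision helper for an empty input pipe.
 *   0       : no writer is attached, so no data can ever arrive
 *   -EAGAIN : a writer exists and the caller asked not to block
 *   1       : a writer exists and the caller may sleep until data arrives
 */
static int empty_input_result(unsigned int writers, bool nonblock)
{
	if (writers == 0)
		return 0;
	if (nonblock)
		return -EAGAIN;
	return 1;
}
```

The point of the ordering is that the writer check comes first, so a
non-blocking caller only sees -EAGAIN when retrying can actually help.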

Reported by Johann Felix Soden <johfel@gmx.de> and
Patrick McManus <mcmanus@ducksong.com>.

Tested-by: Johann Felix Soden <johfel@users.sourceforge.net>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

block: fix kernel-docbook parameters and files

kernel-doc for block/:
- add missing parameters
- fix one function's parameter list (remove blank line)
- add 2 source files to docbook for non-exported kernel-doc functions

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

UBI: mtd/ubi/vtbl.c: fix memory leak

This patch fixes a memory leak introduced by commit
4ccf8cffa963c7b5bdc6d455ea9417084ee49aa8 and spotted by the Coverity
checker.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>

UBI: fix sparse errors in ubi.h

In C, signed 1-bit bitfields can only take the values 0 and -1, but only 0
and 1 are ever assigned in current code. Make them unsigned bitfields.
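
A small user-space sketch of the difference, with hypothetical struct and
function names. Storing 1 into the signed field is an out-of-range
conversion whose result is implementation-defined; GCC and Clang are
expected to read it back as -1, while the unsigned field always reads
back as 1.

```c
/* Hypothetical one-bit flags, one signed and one unsigned. */
struct signed_flag {
	signed int bit : 1;    /* representable values: 0 and -1 */
};

struct unsigned_flag {
	unsigned int bit : 1;  /* representable values: 0 and 1 */
};

/* Store v in the signed bitfield and return what reads back. */
static int roundtrip_signed(int v)
{
	struct signed_flag f = { 0 };

	f.bit = v;
	return f.bit;
}

/* Store v in the unsigned bitfield and return what reads back. */
static int roundtrip_unsigned(unsigned int v)
{
	struct unsigned_flag f = { 0 };

	f.bit = v;
	return (int)f.bit;
}
```

Code that tests such a flag with `== 1` therefore only behaves as intended
with the unsigned variant.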

Fixes the (repeated) sparse errors:
drivers/mtd/ubi/ubi.h:220:15: error: dubious one-bit signed bitfield
drivers/mtd/ubi/ubi.h:221:17: error: dubious one-bit signed bitfield
drivers/mtd/ubi/ubi.h:222:18: error: dubious one-bit signed bitfield
drivers/mtd/ubi/ubi.h:223:16: error: dubious one-bit signed bitfield
drivers/mtd/ubi/ubi.h:224:20: error: dubious one-bit signed bitfield

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Cc: Artem Bityutskiy <dedekind@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>

UBI: fix error message

Make it print "UBI error: cannot attach mtd4"
instead of "UBI error: cannot attach 4"

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>

UBI: silence warning

drivers/mtd/ubi/vmt.c: In function `ubi_create_volume':
drivers/mtd/ubi/vmt.c:379: warning: statement with no effect

Signed-off-by: S.Çağlar Onur <caglar@pardus.org.tr>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>

[AF_KEY]: Dump SA/SP entries non-atomically

Stop dumping of entries when af_key socket receive queue is getting
full and continue it later when there is more room again.

This fixes dumping of large databases. Currently the entries not
fitting into the receive queue are just dropped (including the
end-of-dump message) which can confuse applications.

Signed-off-by: Timo Teras <timo.teras@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

[TIPC]: Convert tsock->sem in a mutex

The semaphore tsock->sem is used as a mutex; convert it to the mutex API.

Signed-off-by: Matthias Kaehlcke <matthias@kaehlcke.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETNS][IPV6] rt6_stats - make the stats per network namespace

The rt6_stats is now per namespace with this patch. It is allocated
when a network namespace is created and freed when the network
namespace exits and references are relative to the network namespace.

Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETNS][IPV6] rt6_stats - dynamically allocate the routes statistics

This patch allocates the rt6_stats struct dynamically when the fib6 is
initialized. That provides the ability to create several instances of
this structure for the network namespaces.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETNS][IPV6] fib6_rules - handle several network namespaces

The fib6_rules_ops is moved to the network namespace structure. All
references are changed to be relative to it.

Each time a network namespace is created a new fib6_rules_ops is
allocated, initialized and stored into the network namespace
structure.

The common part of the fib rules is namespace aware, so it is quite
easy to retrieve the network namespace from the rules and use it in
the different callbacks.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETNS][IPV6] fib6 rule - dynamic allocation of the rules struct ops

The fib6_rules_ops structure is dynamically allocated, which makes it
possible to have several instances of it, one per network namespace.

The global static fib6_rules_ops structure is renamed to
fib6_rules_ops_template in order to quickly memcopy it for the
structure initialization.
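
The template pattern can be sketched in user-space C as follows. All types,
fields and function names here are hypothetical, and malloc()/memcpy() stand
in for whatever allocator the kernel code uses.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; not the kernel's fib rules or netns types. */
struct rules_ops_sketch {
	int family;
	int rule_size;
	void *owner_ns;  /* back pointer to the owning namespace */
};

struct netns_sketch {
	struct rules_ops_sketch *rules_ops;
};

/* Shared read-only template; the values are illustrative only. */
static const struct rules_ops_sketch rules_ops_template = {
	.family = 10,
	.rule_size = 64,
};

/* Give 'ns' its own copy of the template. Returns 0, or -1 on failure. */
static int netns_rules_init(struct netns_sketch *ns)
{
	struct rules_ops_sketch *ops = malloc(sizeof(*ops));

	if (!ops)
		return -1;

	memcpy(ops, &rules_ops_template, sizeof(*ops));
	ops->owner_ns = ns;
	ns->rules_ops = ops;
	return 0;
}

/* Release the per-namespace copy when the namespace goes away. */
static void netns_rules_exit(struct netns_sketch *ns)
{
	free(ns->rules_ops);
	ns->rules_ops = NULL;
}
```

Each namespace ends up with a private, writable copy, while the template
itself is never modified.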

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETNS][IPV6] ip6_fib - clean node use namespace

The fib6_clean_node function should have the network namespace it is
working on. The fib6_cleaner_t structure is extended with the network
namespace field to be passed to the fib6_clean_node function.

The different functions calling the fib6_clean_node function are
extended with the netns parameter when needed to propagate the netns
pointer.

Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETNS][IPV6] ip6_fib - gc timer per namespace

Move the timer initialization to network namespace creation and
store the network namespace in the timer argument.

That enables multiple timers (one per network namespace) to do garbage
collection.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETNS][IPV6] ip6_fib - dynamically allocate gc-timer

The ip6_fib_timer gc timer is dynamically allocated and initialized in
the ip6 fib init function. There are no more references to a static
global variable. That will allow multiple instances of the garbage
collecting timer, one per namespace.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETNS][IPV6] ip6_fib - add net to gc timer parameter

The fib tables are now relative to the network namespace. When the
garbage collector timer expires, we must have a network namespace
parameter in order to retrieve the tables. For now this is the
init_net, but we should be able to have a timer per namespace and use
the timer callback parameter to pass the network namespace from the
expired timer.

The timer callback, fib6_run_gc, is called synchronously by some
functions and asynchronously when the timer expires.

When the timer expires, the delay passed to fib6_run_gc is always
zero. So I changed fib6_run_gc from a timer callback into a function
called by the timer callback, and added a new timer callback whose
only job is to retrieve the network namespace from the timer's data
argument and call fib6_run_gc with a zero expiry time and that
network namespace. That makes the code cleaner for the fib6_run_gc
callers.
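
The split between the worker function and the thin timer callback can be
sketched as below. The types and names are hypothetical, and uintptr_t
stands in for the timer's opaque data argument; this is not the ip6_fib
code.

```c
#include <stdint.h>

/* Hypothetical per-namespace state touched by garbage collection. */
struct netns_sketch {
	int id;
	unsigned long last_gc_expire;
	int gc_runs;
};

/*
 * Worker: callable directly by synchronous users, which pass their own
 * expiry value and namespace.
 */
static void run_gc(unsigned long expires, struct netns_sketch *ns)
{
	ns->last_gc_expire = expires;
	ns->gc_runs++;
}

/*
 * Timer callback: recovers the namespace from the opaque timer argument
 * and always runs the worker with a zero expiry.
 */
static void gc_timer_cb(uintptr_t data)
{
	struct netns_sketch *ns = (struct netns_sketch *)data;

	run_gc(0, ns);
}
```

Synchronous callers use run_gc() with explicit arguments, and only the timer
path needs to know how the namespace is packed into the callback argument.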

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETNS][IPV6] ip6_fib - fib6_clean_all handle several network namespaces

The function fib6_clean_all takes the network namespace as a
parameter. That makes it possible to flush the routes related to a
specific network namespace.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETNS][IPV6] ip6_fib - make it per network namespace

The fib tables for ipv6 are moved to the network namespace structure.
All references to them are made relative to the network namespace.

All external calls to the ip6_fib functions taking the network
namespace parameter are made using the init_net variable, so the
ip6_fib engine is ready for the namespaces but the callers are not yet.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETNS][IPV6] ip6_fib - dynamically allocate the fib tables

This patch changes the fib6 tables to be dynamically allocated. That
provides the ability to make several instances of them when a new
network namespace is created.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

[IPV6] MCAST: Use standard path for sending MLD/MLDv2 messages.

This changes the path for sending MLD/MLDv2 messages
from dev_queue_xmit() to the standard dst_output().

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

[IPV6]: Make ndisc_dst_alloc() common for later use.

For later use, this patch renames ndisc_dst_alloc()
(and related functions/structures) to icmp6_dst_alloc()
(and so on). This patch also removes its unused
function-pointer argument.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

[IPV6]: Make ndisc_flow_init() common for later use.

For later use, this patch renames ndisc_flow_init() to
icmpv6_flow_init() and puts it in a common place.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

[IPV6] ADDRCONF: Convert ipv6_get_saddr() to ipv6_dev_get_saddr().

Since most users of ipv6_get_saddr() pass non-NULL as
dst argument, use ipv6_dev_get_saddr() directly.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

[NET] NEIGHBOUR: Remove unpopular neigh_is_connected().

neigh_is_connected() is not popular at all, and its only user,
drivers/net/cxgb3/l2t.c:t3_l2t_update(), also has the raw (expanded)
expression. Let's expand it and remove the inline function.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

[IPV6]: Use htonl() instead of __constant_htonl() where applicable.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

[IPV6] SYSCTL: complete initialization for sysctl table in subsystem code.

Move initialization bits for subsystem sysctl tables to
appropriate functions.
- route
- icmp

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

[IPV6]: Move packet_type{} related bits to af_inet6.c.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

[IPV6] UDP: Rename IPv6 UDP files.

Rename net/ipv6/udp.c to net/ipv6/udp_ipv6.c
Rename net/ipv6/udplite.c to net/ipv6/udplite_ipv6.c.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

[IPV4] UDP: Move IPv4-specific bits to other file.

Move IPv4-specific UDP bits from net/ipv4/udp.c into (new) net/ipv4/udp_ipv4.c.
Rename net/ipv4/udplite.c to net/ipv4/udplite_ipv4.c.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

[IPV4]: Fix size description of CONFIG_INET.

CONFIG_INET now enlarges the kernel by about 400KB, not 140KB.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

[UDP]: Allow users to configure UDP-Lite.

Let's give users an option for disabling UDP-Lite (~4K).

old:
|    text    data     bss     dec     hex filename
|  286498   12432    6072 305002   4a76a net/ipv4/built-in.o
|  193830    8192    3204 205226   321aa net/ipv6/ipv6.o

new (without UDP-Lite):
|    text    data     bss     dec     hex filename
|  284086   12136    5432 301654   49a56 net/ipv4/built-in.o
|  191835    7832    3076 202743   317f7 net/ipv6/ipv6.o

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

[TCP]: Add IPv6 support to TCP SYN cookies

Updated to incorporate Eric's suggestion of using a per cpu buffer
rather than allocating on the stack. Just a two line change, but will
resend in its entirety.

Signed-off-by: Glenn Griffin <ggriffin.kernel@gmail.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

[TCP]: lower stack usage in cookie_hash() function

400 bytes allocated on the stack might be a little bit too much. Using
a per_cpu var is more friendly.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

[PPPOL2TP]: Add missing sock_put() in pppol2tp_tunnel_closeall()

Every skb removed from session->reorder_q needs sock_put().

Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Acked-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[PPPOL2TP]: Add missing sock_put() in pppol2tp_recv_dequeue()

Every skb removed from session->reorder_q needs sock_put().

Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Acked-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'master' of /pub/scm/linux/kernel/git/torvalds/linux-2.6

[Blackfin] arch: fix atomic and32/xor32 comments and ENDPROC markings

Signed-off-by: Mike Frysinger <vapier.adi@gmail.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>

[Blackfin] arch: fix bug - allow SDH driver to be used as module

Signed-off-by: Michael Hennerich <michael.hennerich@analog.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>

[Blackfin] arch: add new timerfd syscalls to kill the missing syscalls warning

Signed-off-by: Bryan Wu <cooloney@kernel.org>

[PATCH] fs/ocfs2/aops.c: Correct use of ! and &

In commit e6bafba5b4765a5a252f1b8d31cbf6d2459da337, a bug was fixed that
involved converting !x & y to !(x & y). The code below shows the same
pattern, and thus should perhaps be fixed in the same way.

This is not tested and clearly changes the semantics, so it is only
something to consider.
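
A small illustration of why the two forms differ: in C, unary ! binds
tighter than binary &, so !x & y is parsed as (!x) & y. The flag name
and function names below are hypothetical and are not taken from
fs/ocfs2/aops.c.

```c
#include <stdbool.h>

#define FLAG_DIRTY 0x4u /* hypothetical flag bit */

/*
 * What "!flags & FLAG_DIRTY" really computes. The parentheses only make
 * the compiler's parse explicit: (!flags) is 0 or 1, and neither value
 * has bit 0x4 set, so the result is always false.
 */
bool dirty_clear_buggy(unsigned int flags)
{
	return (!flags) & FLAG_DIRTY;
}

/* The intended test: true when the FLAG_DIRTY bit is not set. */
bool dirty_clear_fixed(unsigned int flags)
{
	return !(flags & FLAG_DIRTY);
}
```

The buggy form is only meaningful when the mask is 1; for any other
single-bit mask it is constant false.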

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

[2.6 patch] ocfs2: make dlm_do_assert_master() static

This patch makes the needlessly global dlm_do_assert_master() static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

[2.6 patch] make ocfs2_downconvert_thread() static

This patch makes the needlessly global ocfs2_downconvert_thread()
static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

[2.6 patch] fs/ocfs2/: possible cleanups

This patch contains the following cleanups that are now possible:
- make the following needlessly global functions static:
  - dlmglue.c:ocfs2_process_blocked_lock()
  - heartbeat.c:ocfs2_node_map_init()
- #if 0 the following unused global function plus support functions:
  - heartbeat.c:ocfs2_node_map_is_only()

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

[PATCH] ocfs2: le*_add_cpu conversion

Replace all occurrences of:
  little_endian_variable = cpu_to_leX(leX_to_cpu(little_endian_variable) +
                                      expression_in_cpu_byteorder);
with:
  leX_add_cpu(&little_endian_variable, expression_in_cpu_byteorder);

Generated with a semantic patch.
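
For illustration, a userspace sketch of the 32-bit case. The le32
typedef and the byte-wise conversion helpers are assumptions written for
this example; they only mimic the behavior of the kernel helpers of the
same name and are not the kernel implementation.

```c
#include <stdint.h>

typedef uint32_t le32; /* hypothetical stand-in for the kernel's __le32 */

/* Read a little-endian value byte by byte, independent of host order. */
uint32_t le32_to_cpu(le32 v)
{
	const unsigned char *b = (const unsigned char *)&v;

	return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Store a host value as little-endian bytes. */
le32 cpu_to_le32(uint32_t v)
{
	le32 out;
	unsigned char *b = (unsigned char *)&out;

	b[0] = (unsigned char)(v & 0xff);
	b[1] = (unsigned char)((v >> 8) & 0xff);
	b[2] = (unsigned char)((v >> 16) & 0xff);
	b[3] = (unsigned char)((v >> 24) & 0xff);
	return out;
}

/* The helper form: add a host-order value to a little-endian variable. */
void le32_add_cpu(le32 *var, uint32_t val)
{
	*var = cpu_to_le32(le32_to_cpu(*var) + val);
}
```

Both spellings produce the same stored bytes; the helper names the
variable once instead of twice.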

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

ocfs2: Fix writeout in ocfs2_data_convert_worker()

Commit f1f540688eae66c274ff1c1133b5d9c687b28f58 "optimized"
ocfs2_data_convert_worker() to "only do work for regular files".
Unfortunately, I left out a '!', which caused it to *skip* regular files.
This was hidden from testing until recently because the default data
journaling mode (data=ordered) doesn't exercise this code.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>

ocfs2: Enable localalloc for local mounts

Commit 2fbe8d1ebe004425b4f7b8bba345623d2280be82 disabled localalloc
for local mounts. This caused issues as ocfs2 uses localalloc to
provide write locality. This patch enables localalloc for local mounts.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

[SPARC]: Fix link errors with gcc-4.3

Reported by Adrian Bunk.

Just like in changeset a3f9985843b674cbcb58f39fab8416675e7ab842
("[SPARC64]: Move kernel unaligned trap handlers into assembler
file.") we have to move the assembler bits into a separate
asm file because as far as the compiler is concerned
these inline bits we're doing in unaligned.c are unreachable.

Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'slab-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/christoph/vm

* 'slab-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/christoph/vm:
  slub: fix possible NULL pointer dereference
  slub: Add kmalloc_large_node() to support kmalloc_node fallback
  slub: look up object from the freelist once
  slub: Fix up comments
  slub: Rearrange #ifdef CONFIG_SLUB_DEBUG in calculate_sizes()
  slub: Remove BUG_ON() from ksize and omit checks for !SLUB_DEBUG
  slub: Use the objsize from the kmem_cache_cpu structure
  slub: Remove useless checks in alloc_debug_processing
  slub: Remove objsize check in kmem_cache_flags()
  slub: rename slab_objects to show_slab_objects
  Revert "unique end pointer" patch
  slab: avoid double initialization & do initialization in 1 place

exit_notify: fix kill_orphaned_pgrp() usage with mt exit

1. exit_notify() always calls kill_orphaned_pgrp(). This is wrong, we
should do this only when the whole process exits.

2. exit_notify() uses "current" as "ignored_task", obviously wrong.
Use ->group_leader instead.

Test case:

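#include <pthread.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <unistd.h>

void hup(int sig)
{
	printf("HUP received\n");
}

void *tfunc(void *arg)
{
	sleep(2);
	printf("sub-thread exited\n");
	return NULL;
}

int main(int argc, char *argv[])
{
	/* child: stop itself so that it becomes a stopped job in the pgrp */
	if (!fork()) {
		signal(SIGHUP, hup);
		kill(getpid(), SIGSTOP);
		exit(0);
	}

	pthread_t thr;
	pthread_create(&thr, NULL, tfunc, NULL);

	sleep(1);
	printf("main thread exited\n");
	/* exit only the main thread; the sub-thread keeps the process alive */
	syscall(__NR_exit, 0);

	return 0;
}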

Output without this patch:

main thread exited
HUP received
Hangup

With this patch the output is:

main thread exited
sub-thread exited
HUP received

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

will_become_orphaned_pgrp: partially fix insufficient ->exit_state check

p->exit_state != 0 doesn't mean this process is dead, it may have
sub-threads.  Change the code to use "p->exit_state && thread_group_empty(p)"
instead.
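
A toy model of the old and new checks. struct task and its
other_threads field are hypothetical simplifications; the kernel's
thread_group_empty() inspects the task's thread list rather than a
counter.

```c
#include <stdbool.h>

/* Hypothetical, heavily simplified stand-in for a task. */
struct task {
	int exit_state;    /* nonzero once this thread has exited */
	int other_threads; /* threads in the group besides this one */
};

/* Simplified stand-in for the kernel's thread_group_empty(). */
bool thread_group_empty(const struct task *p)
{
	return p->other_threads == 0;
}

/* Old check: treats an exited leader as a dead process. */
bool process_dead_old(const struct task *p)
{
	return p->exit_state != 0;
}

/* New check: the leader has exited and no other thread remains. */
bool process_dead_new(const struct task *p)
{
	return p->exit_state && thread_group_empty(p);
}
```

For a leader that has exited while one sub-thread is still running, the
old check reports the process as dead and the new check does not.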

Without this patch, ^Z doesn't deliver SIGTSTP to the foreground process
if the main thread has exited.

However, the new check is not perfect either.  There is a window after
exit_notify() drops tasklist and before release_task().  Suppose that
the last (non-leader) thread exits.  This means that the entire group
exits, but thread_group_empty() is not true yet.

As Eric pointed out, is_global_init() is wrong as well, but I did not
dare to do other changes.

Just for the record, has_stopped_jobs() is absolutely wrong too.  But we
can't fix it now, we should first fix SIGNAL_STOP_STOPPED issues.

Even with this patch ^Z doesn't play well with the dead main thread.
The task is stopped correctly but do_wait(WSTOPPED) won't see it.  This
is another unrelated issue, will be (hopefully) fixed separately.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

introduce kill_orphaned_pgrp() helper

Factor out the common code in reparent_thread() and exit_notify().

No functional changes.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Fix default compose table initialization

Oddly enough, unsigned int c = '\300'; puts a "negative" value in c, not
0300... This fixes the default unicode compose table by using integers
instead of character constants.
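
A short sketch of the effect. The function names are hypothetical.
Whether plain char is signed is implementation-defined, so the helper
char_is_signed() reports which case applies on the build platform.

```c
#include <limits.h>

/*
 * '\300' has the value of a char with bit pattern 0xC0, converted to
 * int. Where char is signed that is -64, which wraps to a very large
 * number when stored in an unsigned int.
 */
unsigned int compose_from_char_constant(void)
{
	return '\300';
}

/* An integer constant is always 0300 (192), whatever char looks like. */
unsigned int compose_from_integer(void)
{
	return 0300;
}

/* Nonzero when plain char is a signed type on this platform. */
int char_is_signed(void)
{
	return CHAR_MIN < 0;
}
```

On platforms where char is unsigned both functions return 192, which is
why the problem only shows up on some architectures.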

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>