pilppa.org Git - linux-2.6-omap-h63xx.git/log

serial: fix enable_irq_wake/disable_irq_wake imbalance in serial_core.c

enable_irq_wake() and disable_irq_wake() need to be balanced. However,
serial_core.c calls these for different conditions during the suspend and
resume functions...

This is causing a regular WARN_ON() as found at
http://www.kerneloops.org/search.php?search=set_irq_wake

This patch makes the conditions for triggering the _wake enable/disable
sequence identical.
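
The balance requirement can be sketched with a small userspace model. Every name below is hypothetical, and the single boolean stands in for whatever condition serial_core.c actually tests; the point is only that suspend and resume consult the same predicate.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the wake depth tracked per interrupt line. */
struct model_irq {
    int wake_depth;   /* enables not yet matched by a disable */
    int warnings;     /* disables that arrived with nothing to undo */
};

void model_enable_irq_wake(struct model_irq *irq)
{
    irq->wake_depth++;
}

void model_disable_irq_wake(struct model_irq *irq)
{
    if (irq->wake_depth == 0) {
        irq->warnings++;        /* the real kernel warns at this point */
        return;
    }
    irq->wake_depth--;
}

/* Single predicate used by both paths, so the calls always pair up. */
bool model_wake_wanted(bool device_may_wakeup)
{
    return device_may_wakeup;
}

void model_suspend(struct model_irq *irq, bool device_may_wakeup)
{
    if (model_wake_wanted(device_may_wakeup))
        model_enable_irq_wake(irq);
}

void model_resume(struct model_irq *irq, bool device_may_wakeup)
{
    if (model_wake_wanted(device_may_wakeup))
        model_disable_irq_wake(irq);
    assert(irq->wake_depth >= 0);
}
```

If the two paths tested different conditions, a resume could call the disable side without a preceding enable, which is the imbalance the warning reports.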

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

proc: proc_get_inode() should get module only once

Any file under /proc/net opened more than once leaked the refcounter
on the module it belongs to.

The problem is that module_get is called each time a file is opened, while
module_put is called only when the /proc inode is destroyed. So drop the
extra module reference when we are dealing with an already initialised inode.
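
A minimal userspace model of the refcount rule described above. The struct and function names are hypothetical and are not the /proc code itself; the increment and decrement stand in for module_get and module_put.

```c
#include <assert.h>
#include <stdbool.h>

struct model_module {
    int refcount;
};

struct model_inode {
    bool initialised;
    struct model_module *owner;
};

/* Called on every open of the same file. */
void model_get_inode(struct model_inode *inode, struct model_module *mod)
{
    mod->refcount++;            /* stands in for module_get */
    if (inode->initialised) {
        mod->refcount--;        /* inode already holds a reference: drop the extra */
        return;
    }
    inode->initialised = true;
    inode->owner = mod;
}

/* Called once, when the inode goes away. */
void model_destroy_inode(struct model_inode *inode)
{
    assert(inode->initialised);
    inode->owner->refcount--;   /* stands in for module_put */
    inode->initialised = false;
}
```

With this rule, any number of opens leaves exactly one reference, and the single put at inode destruction brings the count back to zero.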

Addresses http://bugzilla.kernel.org/show_bug.cgi?id=10737

Signed-off-by: Denis V. Lunev <den@openvz.org>
Cc: David Miller <davem@davemloft.net>
Cc: Patrick McHardy <kaber@trash.net>
Acked-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Robert Olsson <robert.olsson@its.uu.se>
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Reported-by: Roland Kletzing <devzero@web.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

brd: don't show ramdisks in /proc/partitions

In 2.6.25, ramdisk devices show up in /proc/partitions, which is a
behaviour change from the old rd.c.  Add GENHD_FL_SUPPRESS_PARTITION_INFO,
which was present in rd.c.

No kernel prior to 2.6.25 displayed ramdisks in /proc/partitions.  Since
many userspace tools use information from /proc/partitions, some of them
may now behave incorrectly (I haven't tested any, though).  For example,
before 2.6.25 /proc/partitions was empty if the kernel detected no block
devices such as hard disks.  Now all 16 ramdisks are always visible there.
Some software may rely on such information (I mean, on an empty
/proc/partitions).

There was a quite similar situation back in 2004, and ramdisks were then
excluded from the listing again.  That's why I called this a regression
(maybe a bit unfortunate).  See this patch for info:
http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.3-rc2/2.6.3-rc2-mm1/broken-out/nbd-proc-partitions-fix.patch

I also think that someone somewhere (a long time ago) excluded ramdisks from
/proc/partitions for good reasons.  It is possible that this new
"feature" is harmless, but I think it is more likely that someone
will say "hey, /proc/partitions has changed, now my software doesn't work"
than "hey, where did my new 2.6.25 feature go".  nbd devices are also
excluded, maybe for the very same (unknown to me) reasons.

Signed-off-by: Marcin Krol <hawk@pld-linux.org>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ip2: fix crashes on load/unload

This doesn't need to be two modules, and making it one cleans up the
problem.

Signed-off-by: Alan Cox <alan@redhat.com>
Cc: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

gpiolib: fix off by one errors

The last gpio belonging to a chip is chip->base + chip->ngpios - 1. Some
places in the code, but not all, forgot the critical minus one.
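
The inclusive upper bound can be illustrated with a small model. The struct below is a hypothetical stand-in, not the gpiolib chip structure.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a chip that owns a contiguous GPIO range. */
struct model_gpio_chip {
    int base;    /* first GPIO number on the chip */
    int ngpio;   /* how many GPIOs the chip provides */
};

/* Last GPIO number that still belongs to the chip. */
int model_last_gpio(const struct model_gpio_chip *chip)
{
    assert(chip->ngpio > 0);
    return chip->base + chip->ngpio - 1;
}

/* Membership test using the inclusive upper bound. */
bool model_chip_owns(const struct model_gpio_chip *chip, int gpio)
{
    return gpio >= chip->base && gpio <= model_last_gpio(chip);
}
```

For a chip with base 8 and 8 lines, the last GPIO is 15; number 16 already belongs to whatever comes next.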

Signed-off-by: Trent Piepho <xyzzy@speakeasy.org>
Acked-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

gpio: mcp23s08 debug fix

The return value of mcp23s08_read_regs() can only be evaluated when signed
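
A sketch of why the signedness matters, assuming a reader that reports failure with a negative code. The function below is a hypothetical stand-in for mcp23s08_read_regs(), and the 16-bit unsigned storage type is an assumption chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical reader: negative on failure, zero on success. */
int model_read_regs(bool fail)
{
    return fail ? -5 : 0;
}

/* Storing the result in an unsigned variable hides the error. */
bool model_error_seen_unsigned(bool fail)
{
    uint16_t stored = (uint16_t)model_read_regs(fail);
    int value;

    assert(sizeof(int) > sizeof(uint16_t));
    value = stored;             /* now in the range 0..65535 */
    return value < 0;           /* can never be true */
}

/* Storing it in a signed variable keeps the error visible. */
bool model_error_seen_signed(bool fail)
{
    int stored = model_read_regs(fail);

    return stored < 0;
}
```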

Signed-off-by: Roel Kluin <12o3l@tiscali.nl>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

gpio: pca953x driver handles pca9554 too

Teach drivers/gpio/pca953x.c about PCA9554, another compatible chip.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

signals: fix sigqueue_free() vs __exit_signal() race

__exit_signal() does flush_sigqueue(tsk->pending) outside of ->siglock.
This can race with another thread doing sigqueue_free(): we can free the
same SIGQUEUE_PREALLOC sigqueue twice or corrupt the pending->list.

Note that even sys_exit_group() can trigger this race, not only
sys_timer_delete().

Move the callsite of flush_sigqueue(tsk->pending) under ->siglock.

This patch doesn't touch flush_sigqueue(->shared_pending) below, it is
called when there are no other threads which can play with signals, and
sigqueue_free() can't be used outside of our thread group.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Acked-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

md: restart recovery cleanly after device failure.

When we get any IO error during a recovery (rebuilding a spare), we abort
the recovery and restart it.

For RAID6 (and multi-drive RAID1) it may not be best to restart at the
beginning: when multiple failures can be tolerated, the recovery may be
able to continue and re-doing all that has already been done doesn't make
sense.

We already have the infrastructure to record where a recovery is up to
and restart from there, but it is not being used properly.
This is because:
  - We sometimes abort with MD_RECOVERY_ERR rather than just MD_RECOVERY_INTR,
    which causes the recovery not to be checkpointed.
  - We remove spares and then re-add them, which loses important state
    information.

The distinction between MD_RECOVERY_ERR and MD_RECOVERY_INTR really isn't
needed.  If there is an error, the relevant drive will be marked as
Faulty, and that is enough to ensure correct handling of the error.  So we
first remove MD_RECOVERY_ERR, changing some of the uses of it to
MD_RECOVERY_INTR.

Then we cause the attempt to remove a non-faulty device from an array to
fail (unless recovery is impossible as the array is too degraded).  Then
when remove_and_add_spares attempts to remove the devices on which
recovery can continue, it will fail, they will remain in place, and
recovery will continue on them as desired.

Issue:  If we are halfway through rebuilding a spare and another drive
fails, and a new spare is immediately available,  do we want to:
1/ complete the current rebuild, then go back and rebuild the new spare or
2/ restart the rebuild from the start and rebuild both devices in
    parallel.

Both options can be argued for.  The code currently takes option 2 as
  a/ this requires least code change
  b/ this results in a minimally-degraded array in minimal time.

Cc: "Eivind Sarto" <ivan@kasenna.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

md: allow parallel resync of md-devices.

In some configurations, a raid6 resync can be limited by CPU speed
(calculating P and Q and moving data) rather than by device speed. In
these cases there is nothing to be gained by serialising resync of arrays
that share a device, and doing the resync in parallel can provide benefit.
So add a sysfs tunable to flag an array as being allowed to resync in
parallel with other arrays that use (a different part of) the same device.

Signed-off-by: Bernd Schubert <bs@q-leap.de>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

md: notify userspace on 'stop' events

This additional notification to 'array_state' is needed to allow the
monitor application to learn about stop events via sysfs. The
sysfs_notify("sync_action") call that comes at the end of do_md_stop()
(via md_new_event) is insufficient since the 'sync_action' attribute has
been removed by this point.

(Seems like a sysfs-notify-on-removal patch is a better fix. Currently
removal updates the event count but does not wake up waiters)

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

md: notify userspace on 'write-pending' changes to array_state

When an array enters write pending, 'array_state' changes, so we must be
sure to sysfs_notify.

Also, when waiting for user-space to acknowledge 'write-pending' by
marking the metadata as dirty, we don't want to wait for MD_CHANGE_DEVS to
be cleared as that might not happen. So explicitly test for the bits that
we are really interested in.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

md: raid1: Fix restoration of bio between failed read and write.

When performing a "recovery" or "check" pass on a RAID1 array, we read
from each device and possibly, if there is a difference or a read error,
write back to some devices.

We use the same 'bio' for both read and write, resetting various fields
between the two operations.

We forgot to reset bv_offset and bv_len however. These are often left
unchanged, but in the case where there is an IO error one or two sectors
into a page, they are changed.

This results in correctable errors not being corrected properly. It does
not result in any data corruption.

Cc: "Fairbanks, David" <David.Fairbanks@stratus.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

md: md: raid5 rate limit error printk

Last night we had scsi problems and a hardware raid unit was offlined
during heavy i/o. While this happened we got, for about 3 minutes, a huge
number of messages like these:

Apr 12 03:36:07 pfs1n14 kernel: [197510.696595] raid5:md7: read error not correctable (sector 2993096568 on sdj2).

I guess the high error rate is responsible for not scheduling other events:
during this time the system was not pingable, and in the end other
devices also ran into scsi command timeouts, causing problems on these
unrelated devices as well.
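
One common way to limit such a message flood is a burst-per-window check. The sketch below is a generic userspace model with hypothetical names and arbitrary tick units; it is not the kernel's rate-limiting helper and does not reflect the limits this patch uses.

```c
#include <assert.h>
#include <stdbool.h>

struct model_ratelimit {
    unsigned long interval;    /* window length, in arbitrary ticks */
    int burst;                 /* messages allowed per window */
    unsigned long begin;       /* tick at which the current window began */
    int printed;               /* messages let through in this window */
    unsigned long suppressed;  /* messages dropped so far */
};

/* Returns true if a message may be printed at tick 'now'. */
bool model_ratelimit_ok(struct model_ratelimit *rl, unsigned long now)
{
    assert(rl->interval > 0 && rl->burst > 0);
    if (now - rl->begin >= rl->interval) {
        rl->begin = now;       /* start a fresh window */
        rl->printed = 0;
    }
    if (rl->printed < rl->burst) {
        rl->printed++;
        return true;
    }
    rl->suppressed++;
    return false;
}
```

A caller would wrap the error message in `if (model_ratelimit_ok(&rl, now))`, so a storm of identical read errors costs a counter increment instead of a console write.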

Signed-off-by: Bernd Schubert <bernd-schubert@gmx.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

md: kill file_path wrapper

Kill the trivial and rather pointless file_path wrapper around d_path.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

md: proper extern for mdp_major

This patch adds a proper extern for mdp_major in include/linux/raid/md.h

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

md: fix possible oops when removing a bitmap from an active array

It is possible to add a write-intent bitmap to an active array, or remove
the bitmap that is there.

When we do this, we 'quiesce' the array, which causes make_request to
block in "wait_barrier()".

However we are sampling the value of "mddev->bitmap" before the
wait_barrier call, and using it afterwards. This can result in using a
bitmap structure that has been freed.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: fix atomic_t overflow in vm

The atomic_t type is 32bit but a 64bit system can have more than 2^32
pages of virtual address space available. Without this we overflow on
ludicrously large mappings
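
The wraparound can be shown with plain integers. This is a userspace sketch with hypothetical names: the first function keeps only what a 32-bit counter can hold, the second uses a 64-bit counter as one possible wider replacement.

```c
#include <assert.h>
#include <stdint.h>

/* What a 32-bit counter retains after counting 'pages' pages. */
uint32_t model_counter32(uint64_t pages)
{
    return (uint32_t)pages;     /* upper bits are lost */
}

/* What a 64-bit counter retains for the same count. */
int64_t model_counter64(uint64_t pages)
{
    assert(pages <= (uint64_t)INT64_MAX);
    return (int64_t)pages;
}
```

With 4 KiB pages (an assumption for the arithmetic), 2^32 pages is 16 TiB of address space, so a mapping of 2^32 + 3 pages reads back as 3 pages from the narrow counter.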

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

types.h: don't expose struct ustat to userspace

<linux/types.h> can't be used together with <sys/ustat.h> because they
both define struct ustat:

    $ cat test.c
    #include <sys/ustat.h>
    #include <linux/types.h>
    $ gcc -c test.c
    In file included from test.c:2:
    /usr/include/linux/types.h:165: error: redefinition of 'struct ustat'

This was reported to Debian a while ago, but seems to have been
lost in cat fighting: http://bugs.debian.org/429064

Signed-off-by: maximilian attems <max@stro.at>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

serial: support for InstaShield IS-400 four port RS-232 PCI card

Add support for the InstaShield IS-400 four port RS-232 PCI card.

Signed-off-by: Ignacio García Pérez <iggarpe@t2i.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fix parenthesis in include/asm-mips/mach-au1x00/au1000.h

Parenthesis fix in include/asm-mips/mach-au1x00/au1000.h

Signed-off-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fix parenthesis in include/asm-mips/gic.h

Parenthesis fix in include/asm-mips/gic.h

Signed-off-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ibmaem: new driver for power/energy/temp meters in IBM System X hardware

This driver reads IBM Active Energy Manager energy/temperature/power
sensors on IBM System X hardware.

[akpm@linux-foundation.org: fix printk warnings]
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Cc: "Mark M. Hoffman" <mhoffman@lightlink.com>
Cc: Corey Minyard <minyard@acm.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

i5k_amb: support Intel 5400 chipset

Minor rework to support the Intel 5400 chipset.

Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Cc: "Mark M. Hoffman" <mhoffman@lightlink.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

hdaps: invert the axes for HDAPS on Lenovo R61i ThinkPads

Cc: "Mark M. Hoffman" <mhoffman@lightlink.com>
Cc: Dmitry Torokhov <dtor@mail.ru>
Cc: Jiri Kosina <jikos@jikos.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ntfs: le*_add_cpu conversion

Replace all:

    little_endian_variable = cpu_to_leX(leX_to_cpu(little_endian_variable) +
                                        expression_in_cpu_byteorder);

with:

    leX_add_cpu(&little_endian_variable, expression_in_cpu_byteorder);

Generated with a semantic patch.
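
What the helper does can be sketched in portable userspace C for the 32-bit case. The `model_` names and the byte-array type are hypothetical; the kernel's own helpers operate on its little-endian integer types instead.

```c
#include <assert.h>
#include <stdint.h>

/* A 32-bit value stored in little-endian byte order. */
typedef struct {
    uint8_t b[4];
} model_le32;

uint32_t model_le32_to_cpu(model_le32 v)
{
    return (uint32_t)v.b[0] | (uint32_t)v.b[1] << 8 |
           (uint32_t)v.b[2] << 16 | (uint32_t)v.b[3] << 24;
}

model_le32 model_cpu_to_le32(uint32_t x)
{
    model_le32 v = { { (uint8_t)x, (uint8_t)(x >> 8),
                       (uint8_t)(x >> 16), (uint8_t)(x >> 24) } };
    return v;
}

/* Add a CPU-order value to a little-endian variable in place. */
void model_le32_add_cpu(model_le32 *var, uint32_t val)
{
    assert(var != 0);
    *var = model_cpu_to_le32(model_le32_to_cpu(*var) + val);
}
```

The one-call form names the variable once, which removes the chance of converting one variable and storing into another.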

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Acked-by: Anton Altaparmakov <aia21@cantab.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: don't drop a partial page in a zone's memory map size

In a zone's present pages number, account for all pages occupied by the
memory map, including a partial page.
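
The difference is the usual one between truncating and rounding-up division. In this sketch the 4096-byte page size, the function names, and the per-page entry sizes used in the example are all assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096UL

/* Pages needed for the memory map if a trailing partial page is dropped. */
unsigned long model_memmap_pages_truncated(unsigned long spanned,
                                           size_t entry_size)
{
    assert(entry_size > 0);
    return spanned * entry_size / MODEL_PAGE_SIZE;
}

/* Pages needed when a trailing partial page counts as a whole page. */
unsigned long model_memmap_pages_rounded_up(unsigned long spanned,
                                            size_t entry_size)
{
    unsigned long bytes;

    assert(entry_size > 0);
    bytes = spanned * entry_size;
    return (bytes + MODEL_PAGE_SIZE - 1) / MODEL_PAGE_SIZE;
}
```

For 1000 spanned pages at 56 bytes per entry, the map is 56000 bytes: 13 full pages plus a partial one, so 14 pages are really occupied.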

Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

MAINTAINERS: add util-linux-ng package

(akpm: we often deal with util-linux and I (at least) can never remember
where they hang out).

Signed-off-by: Karel Zak <kzak@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ecryptfs: fix missed mutex_unlock

Cc: Michael Halcrow <mhalcrow@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fuse: fix bdi naming conflict

Fuse allocates a separate bdi for each filesystem, and registers them
in sysfs with "MAJOR:MINOR" of sb->s_dev (st_dev). This works fine for
anon devices normally used by fuse, but can conflict with an already
registered BDI for "fuseblk" filesystems, where sb->s_dev represents a
real block device. In particular this happens if a non-partitioned
device is being mounted.

Fix by registering with a different name for "fuseblk" filesystems.

Thanks to Ioan Ionita for the bug report.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reported-by: Ioan Ionita <opslynx@gmail.com>
Tested-by: Ioan Ionita <opslynx@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: allow pfnmap ->fault()s

Take out an assertion to allow ->fault handlers to service PFNMAP regions.
This is required to reimplement .nopfn handlers with .fault handlers and
subsequently remove nopfn.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Acked-by: Jes Sorensen <jes@sgi.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

posix-timers: print RT watchdog message

It's useful to detect which process is killed by RT watchdog.

Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

[ALSA] hda - Added support for Foxconn P35AX-S mainboard

Added IDs for the Foxconn P35AX-S mainboard to patch_realtek.c, so
that ALC883_6ST_DIG is used by default.

Signed-off-by: Travis Place <wishie@wishie.net>
Signed-off-by: Takashi Iwai <tiwai@suse.de>

[ALSA] hda - Fix COEF and EAPD in ALC889 auto-configuration mode

Fix the missing COEF and EAPD initialization in ALC889 auto-configuration
mode.

Signed-off-by: Takashi Iwai <tiwai@suse.de>

[ALSA] hda - Fix noise on VT1708 codec

We get quite noisy output on the right channel on VT1708 codec
when 24bit samples are used. Suppress the 24bit support until any
real fix is found.
https://bugzilla.novell.com/show_bug.cgi?id=390473

Signed-off-by: Takashi Iwai <tiwai@suse.de>

[ALSA] hda - Add model for ASUS P5K-E/WIFI-AP

Added a config table entry for the ASUS P5K-E/WIFI-AP mainboard (ID
1043:8227) to use AD1988_6STACK_DIG

Signed-off-by: Travis Place <wishie@wishie.net>
Signed-off-by: Takashi Iwai <tiwai@suse.de>

x86: fixup the fallout of the bitops changes

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

x86: rename the i8259_32/64.c leftovers to irqinit_32/64.c

The leftovers of the i8259 unification have nothing to do with i8259
at all. They contain interrupt init code and the i8259_xx name is just
misleading now.

Rename them to irqinit_32/64.c

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

x86: make init_ISA_irqs() static

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86: i8259: cleanup codingstyle

Signed-off-by: Pavel Machek <pavel@suse.cz>
Cc: macro@ds2.pg.gda.pl
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

x86: i8259.c: remove trivial ifdefs

Remove #ifdefs where the only difference is formatting of comments.

Signed-off-by: Pavel Machek <pavel@suse.cz>
Cc: macro@ds2.pg.gda.pl
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

x86: i8259.c: remove #ifdefs around includes

Remove #ifdefs around includes; including too much should always be
safe.

Signed-off-by: Pavel Machek <pavel@suse.cz>
Cc: macro@ds2.pg.gda.pl
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

x86: automatical unification of i8259.c

Make conversion of i8259 very mechanical -- i8259 was generated by
diff -D, with too different parts left in i8259_32 and
i8259_64.c. Only "by hand" changes were removal of #ifdef from middle
of the comment (prevented compilation) and removal of one static to
allow splitting into files.

Of course, it will need some cleanups now, and those will follow.

Signed-off-by: Pavel Machek <pavel@suse.cz>

x86: make init_ISA_irqs() static

Moved to i8259 branch to avoid conflicts.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86: rename the i8259_32/64.c leftovers to initirq_32/64.c

The leftovers of the i8259 unification have nothing to do with i8259
at all. They contain interrupt init code and the i8259_xx name is just
misleading now.

Rename them to initirq_32/64.c

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

sysprof: make it depend on X86

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

PCI: use dev_to_node in pci_call_probe

to make sure we get an online node.

Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

x86 mmiotrace: page level is unsigned

Fixes some sparse warnings.

Signed-off-by: Pekka Paalanen <pq@iki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

mmiotrace: rename kmmio_probe::user_data to :private.

Signed-off-by: Pekka Paalanen <pq@iki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86 mmiotrace: use resource_size_t for phys addresses

Signed-off-by: Pekka Paalanen <pq@iki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86 mmiotrace: fix page-unaligned ioremaps

mmiotrace_ioremap() expects to receive the original unaligned map phys address
and size. Also fix {un,}register_kmmio_probe() to deal properly with
unaligned size.
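
Why an unaligned address and size need care can be seen from how a byte range maps onto pages. This is a userspace sketch with hypothetical names and an assumed 4096-byte page size; it is not the kmmio code.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096ULL
#define MODEL_PAGE_MASK (~(MODEL_PAGE_SIZE - 1))

struct model_range {
    uint64_t start;   /* page-aligned start address */
    uint64_t size;    /* whole number of pages, in bytes */
};

/* Smallest page-aligned range covering [phys, phys + size). */
struct model_range model_page_cover(uint64_t phys, uint64_t size)
{
    struct model_range r;
    uint64_t end;

    assert(size > 0);
    end = phys + size;
    r.start = phys & MODEL_PAGE_MASK;
    r.size = ((end + MODEL_PAGE_SIZE - 1) & MODEL_PAGE_MASK) - r.start;
    return r;
}
```

A 0x200-byte mapping at 0xfe001f00 crosses a page boundary, so it covers two pages; rounding the size alone up to one page would miss the second.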

Signed-off-by: Pekka Paalanen <pq@iki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

mmiotrace: code style cleanups

From c2da03771e29159627c5c7b9509ec70bce9f91ee Mon Sep 17 00:00:00 2001
From: Pekka Paalanen <pq@iki.fi>
Date: Mon, 28 Apr 2008 21:25:22 +0300

Signed-off-by: Pekka Paalanen <pq@iki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

doc: update mmiotrace doc to current status

Signed-off-by: Pekka Paalanen <pq@iki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

mmiotrace: do not print bogus pid for maps either

Signed-off-by: Pekka Paalanen <pq@iki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

mmiotrace: print overrun counts

Signed-off-by: Pekka Paalanen <pq@iki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

mmiotrace: print header using the read hook.

Now the header is printed only for `trace_pipe' file.

Signed-off-by: Pekka Paalanen <pq@iki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86 mmiotrace: dynamically disable non-boot CPUs

From 8979ee55cb6a429c4edd72ebec2244b849f6a79a Mon Sep 17 00:00:00 2001
From: Pekka Paalanen <pq@iki.fi>
Date: Sat, 12 Apr 2008 00:18:57 +0300

Mmiotrace is not reliable with multiple CPUs and may
miss events. Drop to single CPU when mmiotrace is activated.

Signed-off-by: Pekka Paalanen <pq@iki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86/mmiotrace: uses/depends on PCI

Don't try to build mmiotrace when CONFIG_PCI=n.

next-20080416/kernel/trace/trace_mmiotrace.c: In function 'mmio_print_pcidev':
next-20080416/kernel/trace/trace_mmiotrace.c:62: error: implicit declaration of function 'pci_dev_driver'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

mmiotrace: fix printk format

Fix gcc printk format warnings:

next-20080415/arch/x86/mm/mmio-mod.c: In function 'print_pte':
next-20080415/arch/x86/mm/mmio-mod.c:154: warning: format '%lx' expects type 'long unsigned int', but argument 3 has type 'pteval_t'
next-20080415/arch/x86/mm/mmio-mod.c:154: warning: format '%lx' expects type 'long unsigned int', but argument 4 has type 'pteval_t'
next-20080415/arch/x86/mm/mmio-mod.c: At top level:
next-20080415/arch/x86/mm/mmio-mod.c:403: warning: 'downed_cpus' defined but not used

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

mmiotrace: add user documentation

Signed-off-by: Pekka Paalanen <pq@iki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

x86 mmiotrace: Do not print bogus pid

Non-zero pid indicates the MMIO access originated in user space.
We do not catch that kind of access yet, so always print zero for now.

Signed-off-by: Pekka Paalanen <pq@iki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

x86 mmiotrace: remove ISA_trace parameter.

This had become a no-op.

Signed-off-by: Pekka Paalanen <pq@iki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

x86 mmiotrace: move files into arch/x86/mm/.

Signed-off-by: Pekka Paalanen <pq@iki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

mmiotrace: cleanup

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

mmiotrace: ftrace fix

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

ftrace: mmiotrace update, #2

another weekend, another patch. This should apply on top of my previous patch
from March 23rd.

Summary of changes:
- Print PCI device list in output header
- work around recursive probe hits on SMP
- refactor dis/arm_kmmio_fault_page() and add check for page levels
- remove un/reference_kmmio(), the die notifier hook is registered
permanently into the list
- explicitly check for single stepping in die notifier callback

I have tested this version on my UP Athlon64 desktop with Nouveau, and
SMP Core 2 Duo laptop with the proprietary nvidia driver. Both systems
are 64-bit. One previously unknown bug crept into daylight: the ftrace
framework's output routines print the first entry last after buffer has
wrapped around.

The most important regressions compared to non-ftrace mmiotrace at this
time are:
- failure of trace_pipe file
- illegal lines in output file
- unaware of losing data due to buffer full

Personally I'd like to see these three solved before submitting to
mainline. Other issues may come up once we know when we lose events.

Signed-off-by: Pekka Paalanen <pq@iki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

ftrace: mmiotrace, updates

here is a patch that makes mmiotrace work almost well within the tracing
framework. The patch applies on top of my previous patch. I have my own
output formatting in place now.

Summary of changes:
- fix the NULL dereference that was due to not calling tracing_reset()
- add print_line() callback into struct tracer
- implement print_line() for mmiotrace, producing up-to-spec text
- add my output header, but that is not really called in the right place
- rewrote the main structs in mmiotrace
- added two new trace entry types: TRACE_MMIO_RW and TRACE_MMIO_MAP
- made some functions in trace.c non-static
- check current==NULL in tracing_generic_entry_update()
- fix(?) comparison in trace_seq_printf()

Things seem to work fine except for a few issues. Markers (text lines injected
into mmiotrace log) are missing; I did not feel like hacking them in before we
have variable length entries. My output header is printed only for 'trace'
file, but not 'trace_pipe'. For some reason, despite my quick fix,
iter->trace is NULL in print_trace_line() when called from 'trace_pipe'
file, which means I don't get proper output formatting.

I only tried by loading nouveau.ko, which just detects the card, and that
is traced fine. I didn't try further. Map, two reads and unmap. Works
perfectly.

I am missing the information about overflows; I'd prefer to have a
counter for lost events. I didn't try, but I guess currently there is no
way of knowing when it overflows?

So, not too far from being fully operational, it seems :-)
And looking at the diffstat, there also is some 700-900 lines of user space
code that just became obsolete.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

ftrace: add mmiotrace plugin

On Sat, 22 Mar 2008 13:07:47 +0100
Ingo Molnar <mingo@elte.hu> wrote:

> > > i'd suggest the following: pull x86.git and sched-devel.git into a
> > > single tree [the two will combine without rejects]. Then try to add a
> > > kernel/tracing/trace_mmiotrace.c ftrace plugin. The trace_sysprof.c
> > > plugin might be a good example.
> >
> > I did this and now I have mmiotrace enabled/disabled via the tracing
> > framework (what do we call this, since ftrace is one of the tracers?).
>
> cool! could you send the patches for that? (even if they are not fully
> functional yet)

Patch attached in the end. Nice to see how much code disappeared. I tried
to mark all the features I had to break with XXX-comments.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

x86: mmiotrace, preview 2

Kconfig.debug, Makefile and testmmiotrace.c style fixes.
Use real mutex instead of mutex.
Fix failure path in register probe func.
kmmio: RCU read-locked over single stepping.
Generate mapping id's.
Make mmio-mod.c built-in and rewrite its locking.
Add debugfs file to enable/disable mmiotracing.
kmmio: use irqsave spinlocks.
Lots of cleanups in mmio-mod.c
Marker file moved from /proc into debugfs.
Call mmiotrace entrypoints directly from ioremap.c.

Signed-off-by: Pekka Paalanen <pq@iki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

x86: mmiotrace full patch, preview 1

kmmio.c handles the list of mmio probes with callbacks, list of traced
pages, and attaching into the page fault handler and die notifier. It
arms, traps and disarms the given pages; this is the core of mmiotrace.

mmio-mod.c is a user interface, hooking into ioremap functions and
registering the mmio probes. It also decodes the required information
from trapped mmio accesses via the pre and post callbacks in each probe.
Currently, hooking into ioremap functions works by redefining the symbols
of the target (binary) kernel module, so that it calls the traced
versions of the functions.

The most notable changes done since the last discussion are:
- kmmio.c is a built-in, not part of the module
- direct call from fault.c to kmmio.c, removing all dynamic hooks
- prepare for unregistering probes at any time
- make kmmio re-initializable and accessible to more than one user
- rewrite kmmio locking to remove all spinlocks from page fault path

Can I abuse call_rcu() like I do in kmmio.c:unregister_kmmio_probe()
or is there a better way?

The function called via call_rcu() itself calls call_rcu() again,
will this work or break? There I need a second grace period for RCU
after the first grace period for page faults.

Mmiotrace itself (mmio-mod.c) is still a module, I am going to attack
that next. At some point I will start looking into how to make mmiotrace
a tracer component of ftrace (thanks for the hint, Ingo). Ftrace should
make the user space part of mmiotracing as simple as
'cat /debug/trace/mmio > dump.txt'.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

x86 mmiotrace: Use percpu instead of arrays.

Signed-off-by: Pekka Paalanen <pq@iki.fi>
Cc: Eric Dumazet <dada1@cosmosbay.com>
Cc: pq@iki.fi
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

x86: explicit call to mmiotrace in do_page_fault()

The custom page fault handler list is replaced with a single function
pointer. All related functions and variables are renamed for
mmiotrace.

Signed-off-by: Pekka Paalanen <pq@iki.fi>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: pq@iki.fi
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

x86 mmiotrace: comment about user space ABI

Signed-off-by: Pekka Paalanen <pq@iki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

x86 mmiotrace: fix relay-buffer-full flag for SMP

Relay has per-cpu buffers, but mmiotrace was using only a single flag
for detecting buffer full/not-full transitions. The new code makes
this per-cpu and actually counts missed events.
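
The per-CPU bookkeeping described above can be sketched in plain user-space
C. This is only an illustration, not the code from the patch: the names
drop_state, note_log_attempt() and NR_CPUS_SKETCH are hypothetical, and a
simple array stands in for real per-cpu variables.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS_SKETCH 4	/* hypothetical CPU count for this sketch */

/* One record per CPU, mirroring the fact that relay buffers are per-CPU. */
struct cpu_drop_state {
	bool full;		/* is this CPU's buffer currently full? */
	unsigned long missed;	/* events dropped since it became full */
};

static struct cpu_drop_state drop_state[NR_CPUS_SKETCH];

/*
 * Record the outcome of one logging attempt on 'cpu'.
 * While the buffer has no room, the event is counted as missed.
 * On the full -> not-full transition, the number of missed events is
 * returned once and the counter is reset; otherwise 0 is returned.
 */
static unsigned long note_log_attempt(size_t cpu, bool buffer_has_room)
{
	struct cpu_drop_state *st;
	unsigned long reported = 0;

	assert(cpu < NR_CPUS_SKETCH);
	st = &drop_state[cpu];

	if (!buffer_has_room) {
		st->full = true;
		st->missed++;
	} else if (st->full) {
		reported = st->missed;
		st->full = false;
		st->missed = 0;
	}
	return reported;
}
```

With a single shared flag, a full buffer on one CPU would hide or falsely
signal transitions on another; keeping the state per CPU avoids that.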

Signed-off-by: Pekka Paalanen <pq@iki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

x86 mmiotrace: use lookup_address()

Use lookup_address() from pageattr.c instead of doing the same
manually. Also had to EXPORT_SYMBOL_GPL(lookup_address) to make this
work for modules. This also fixes "undefined symbol 'init_mm'"
compile error for x86_32.

Signed-off-by: Pekka Paalanen <pq@iki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

x86: mmiotrace - trace memory mapped IO

Mmiotrace is a tool for trapping memory mapped IO (MMIO) accesses within
the kernel. It is used for debugging and especially for reverse
engineering evil binary drivers.

Mmiotrace works by wrapping the ioremap family of kernel functions and
marking the returned pages as not present. Access to the IO memory
triggers a page fault, which will be handled by mmiotrace's custom page
fault handler. This will single-step the faulted instruction with the
MMIO page marked as present. Access logs are directed to user space via
relay and debug_fs.
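
The arm / fault / single-step / re-arm cycle described above can be modelled
as a small state machine. This is a user-space sketch under stated
assumptions, not the kmmio implementation: struct traced_page and the
function names are hypothetical, and the 'present' field merely stands in
for the page table present bit.

```c
#include <assert.h>

enum page_state {
	PAGE_ARMED,	/* marked not present, next access will fault */
	PAGE_STEPPING	/* marked present while one instruction runs */
};

struct traced_page {
	enum page_state state;
	int present;		/* models the PTE present bit */
	unsigned long hits;	/* number of trapped accesses */
};

/* Arm the page: clear "present" so that any access faults. */
static void arm_page(struct traced_page *p)
{
	p->present = 0;
	p->state = PAGE_ARMED;
}

/*
 * Page fault on a traced page: log the access, make the page present
 * and let the faulting instruction be single-stepped.
 * Returns 1 if the fault was handled here, 0 if it is not ours.
 */
static int on_page_fault(struct traced_page *p)
{
	if (p->state != PAGE_ARMED || p->present)
		return 0;
	p->hits++;
	p->present = 1;
	p->state = PAGE_STEPPING;
	return 1;
}

/* Single-step trap after the instruction completed: re-arm the page. */
static int on_single_step_done(struct traced_page *p)
{
	if (p->state != PAGE_STEPPING)
		return 0;
	arm_page(p);
	return 1;
}
```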

This page fault approach is necessary, because binary drivers have
readl/writel etc. calls inlined, which makes them extremely difficult to
trap with e.g. kprobes.

This patch depends on the custom page fault handlers patch.

Signed-off-by: Pekka Paalanen <pq@iki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

stacktrace: don't crash on invalid stack trace structs

This patch makes the stacktrace printout code warn when the entries
pointer is unset, rather than crash when trying to access it, to make
it a bit more robust.

I was saving a stacktrace into an skb and forgot to copy it across skb
copies... I have since fixed the code, but it would have been easier
had the kernel not crashed in an interrupt.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

ftrace/sysprof: don't trace the user stack if we are a kernel thread.

Check that current->mm is non-NULL before attempting to trace the user
stack.

Also take depth of the kernel stack into account when comparing
against sample_max_depth.
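
A rough user-space sketch of the two checks, for illustration only. The
struct task_model type and user_frames_allowed() are hypothetical names, and
the exact way the patch combines the kernel depth with sample_max_depth is
an assumption here (the kernel frames are simply subtracted from the budget).

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the task: mm is NULL for kernel threads. */
struct task_model {
	void *mm;
};

/*
 * Return how many user stack frames may still be recorded, given that
 * 'kernel_depth' frames were already recorded and the total budget is
 * 'max_depth'. Kernel threads have no user stack, so they get 0.
 */
static size_t user_frames_allowed(const struct task_model *task,
				  size_t kernel_depth, size_t max_depth)
{
	if (task->mm == NULL)
		return 0;	/* kernel thread: no user stack to walk */
	if (kernel_depth >= max_depth)
		return 0;	/* budget already used by kernel frames */
	return max_depth - kernel_depth;
}
```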

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

sysprof: update copyrights

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

sysprof: kernel trace

add kernel backtracing to the sysprof tracer.

change the format of the data, so that type=0 means
beginning of stack trace, 1 means kernel address, 2 means user
address, and 3 means end of trace.
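
The record layout above could be encoded roughly as follows. Only the four
type values come from this message; the struct, the function name, the
kernel-before-user ordering and the zero address in the begin/end markers
are assumptions made for this sketch.

```c
#include <assert.h>
#include <stddef.h>

enum sample_type {
	SAMPLE_BEGIN  = 0,	/* beginning of stack trace */
	SAMPLE_KERNEL = 1,	/* kernel address */
	SAMPLE_USER   = 2,	/* user address */
	SAMPLE_END    = 3	/* end of trace */
};

struct sample_entry {
	enum sample_type type;
	unsigned long addr;	/* unused (0) for begin/end markers here */
};

/*
 * Write one complete trace into 'out': begin marker, kernel addresses,
 * user addresses, end marker. Returns the number of entries written,
 * or 0 if 'out' (capacity 'cap') is too small for the whole trace.
 */
static size_t encode_trace(struct sample_entry *out, size_t cap,
			   const unsigned long *kaddrs, size_t nk,
			   const unsigned long *uaddrs, size_t nu)
{
	size_t n = 0, i;

	if (cap < nk + nu + 2)
		return 0;

	out[n].type = SAMPLE_BEGIN;
	out[n++].addr = 0;
	for (i = 0; i < nk; i++) {
		out[n].type = SAMPLE_KERNEL;
		out[n++].addr = kaddrs[i];
	}
	for (i = 0; i < nu; i++) {
		out[n].type = SAMPLE_USER;
		out[n++].addr = uaddrs[i];
	}
	out[n].type = SAMPLE_END;
	out[n++].addr = 0;
	return n;
}
```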

EIP addresses are no longer distinguished from return addresses,
mostly because sysprof userspace doesn't make use of it. It may be
worthwhile adding this back in though, just in case it becomes
interesting.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

ftrace: make it more available in the Kconfig

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

ftrace: fix __trace_special()

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

ftrace: remove notrace

Remove the notrace annotations. The build logic takes care of that.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

ftrace: sysprof updates

make the sample period configurable.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

ftrace: sysprof fix

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

ftrace: make sysprof dependent on x86 for now

that's the only tested platform for now. If there's interest we
can make it generic easily.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

ftrace: sysprof plugin improvement

add sample maximum depth.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

ftrace: sysprof-plugin, add self-tests

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

ftrace: extend sysprof plugin some more

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

ftrace: extend sysprof plugin

add per CPU hrtimers.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

ftrace: add sysprof plugin

very first baby version.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

[SCSI] fix intermittent oops in scsi_bus_uevent

Reported-by: Sitsofe Wheeler <sitsofe@yahoo.com>
> BUG: unable to handle kernel paging request at e6f17fac
> IP: [<c02604d6>] scsi_bus_uevent+0x1/0x17
> *pde = 2714b163 *pte = 26f17160
> Oops: 0000 [#1] DEBUG_PAGEALLOC
> last sysfs file:
>
> Pid:  1, comm: swapper Not tainted (2.6.26-rc2-next-20080516skw #30)
> EIP: 0060:[<c02604d6>] EFLAGS: 00010282 CPU: 0
> EIP is at scsi_bus_uevent+0x1/0x17
> EAX: e6f18014 EBX: e6f18014 ECX: c02604d5 EDX: e7173000
> ESI: e7173000 EDI: e7173000 EBP: e7851ca0 ESP: e7851c90
>  DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068

The problem is caused by:

commit b0ed43360fdca227048d88a08290365cb681c1a8
Author: Hannes Reinecke <hare@suse.de>
Date:   Tue Mar 18 14:32:28 2008 +0100

    [SCSI] add scsi_host and scsi_target to scsi_bus

which added scsi_bus_type to the struct scsi_target device.  This
causes both the scsi_device and scsi_target to fire scsi_bus_uevents.
However, the actual scsi_bus_uevent() call assumes blindly that it's
a struct scsi_device.  Check for this and return immediately if it
isn't.
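
The shape of the check can be shown with a user-space model. All types and
names below are simplified stand-ins rather than the kernel's definitions,
and the return value of the sketch (1 if a type was emitted, 0 if the device
was skipped) is chosen for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-ins for the driver-model types. */
struct device_type {
	const char *name;
};

struct device {
	const struct device_type *type;
};

static const struct device_type sdev_type_model = { "scsi_device" };
static const struct device_type starget_type_model = { "scsi_target" };

/* A device that embeds struct device, like struct scsi_device does. */
struct scsi_device_model {
	int inquiry_type;
	struct device dev;
};

#define container_of_model(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/*
 * Both device kinds sit on the same bus, so the callback sees both.
 * Only convert back to the containing structure after checking that
 * the device really is of the expected type; otherwise return early.
 */
static int bus_uevent_model(struct device *dev, int *emitted_type)
{
	struct scsi_device_model *sdev;

	if (dev->type != &sdev_type_model)
		return 0;	/* not a scsi_device: nothing to add */

	sdev = container_of_model(dev, struct scsi_device_model, dev);
	*emitted_type = sdev->inquiry_type;
	return 1;
}
```

Without the type check, container_of on a target's embedded device would
yield a pointer to memory that is not a scsi_device at all, which matches
the paging-request oops quoted above.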

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

ftrace: add have dynamic ftrace config for archs

Now that ftrace is being ported to other architectures, it has become
apparent that DYNAMIC_FTRACE is dependent on whether or not that
architecture implements dynamic ftrace. FTRACE itself may be ported to
an architecture without porting dynamic ftrace.

This patch adds HAVE_DYNAMIC_FTRACE to allow architectures to port ftrace
without also having to port the dynamic aspect.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

ftrace: use the new kbuild CFLAGS_REMOVE for x86/kernel directory

This patch removes the Makefile turd and uses the nice CFLAGS_REMOVE macro
in the x86/kernel directory.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

ftrace: use the new kbuild CFLAGS_REMOVE for lib directory

This patch removes the Makefile turd and uses the nice CFLAGS_REMOVE macro
in the lib directory.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

ftrace: use the new kbuild CFLAGS_REMOVE for kernel directory

This patch removes the Makefile turd and uses the nice CFLAGS_REMOVE macro
in the kernel directory.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>