Rainer Weikusat [Sat, 28 Jun 2008 02:34:18 +0000 (19:34 -0700)]
af_unix: fix 'poll for write'/connected DGRAM sockets
For n:1 'datagram connections' (eg /dev/log), the unix_dgram_sendmsg
routine implements a form of receiver-imposed flow control by
comparing the length of the receive queue of the 'peer socket' with
the max_ack_backlog value stored in the corresponding sock structure,
either blocking the thread which caused the send-routine to be called
or returning EAGAIN. This routine is used by both SOCK_DGRAM and
SOCK_SEQPACKET sockets. The poll-implementation for these socket types
is datagram_poll from core/datagram.c. A socket is deemed to be
writeable by this routine when the memory presently consumed by
datagrams owned by it is less than the configured socket send buffer
size. This is always wrong for PF_UNIX non-stream sockets connected to
server sockets dealing with (potentially) multiple clients if the
abovementioned receive queue is currently considered to be full.
'poll' will then return, indicating that the socket is writeable, but
a subsequent write result in EAGAIN, effectively causing an (usual)
application to 'poll for writeability by repeated send request with
O_NONBLOCK set' until it has consumed its time quantum.
The change below uses a suitably modified variant of the datagram_poll
routines for both type of PF_UNIX sockets, which tests if the
recv-queue of the peer a socket is connected to is presently
considered to be 'full' as part of the 'is this socket
writeable'-checking code. The socket being polled is additionally
put onto the peer_wait wait queue associated with its peer, because the
unix_dgram_recvmsg routine does a wake up on this queue after a
datagram was received and the 'other wakeup call' is done implicitly
as part of skb destruction, meaning, a process blocked in poll
because of a full peer receive queue could otherwise sleep forever
if no datagram owned by its socket was already sitting on this queue.
Among this change is a small (inline) helper routine named
'unix_recvq_full', which consolidates the actual testing code (in three
different places) into a single location.
Signed-off-by: Rainer Weikusat <rweikusat@mssgmbh.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Octavian Purdila [Sat, 28 Jun 2008 00:27:21 +0000 (17:27 -0700)]
tcp: fix for splice receive when used with software LRO
If an skb has nr_frags set to zero but its frag_list is not empty (as
it can happen if software LRO is enabled), and a previous
tcp_read_sock has consumed the linear part of the skb, then
__skb_splice_bits:
(a) incorrectly reports an error and
(b) forgets to update the offset to account for the linear part
Any of the two problems will cause the subsequent __skb_splice_bits
call (the one that handles the frag_list skbs) to either skip data,
or, if the unadjusted offset is greater then the size of the next skb
in the frag_list, make tcp_splice_read loop forever.
Signed-off-by: Octavian Purdila <opurdila@ixiacom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
tcp: calculate tcp_mem based on low memory instead of all memory
The tcp_mem array which contains limits on the total amount of memory
used by TCP sockets is calculated based on nr_all_pages. On a 32 bits
x86 system, we should base this on the number of lowmem pages.
Signed-off-by: Miquel van Smoorenburg <miquels@cistron.nl> Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Williams [Fri, 27 Jun 2008 23:16:30 +0000 (09:16 +1000)]
md: rationalize raid5 function names
From: Dan Williams <dan.j.williams@intel.com>
Commit a4456856 refactored some of the deep code paths in raid5.c into separate
functions. The names chosen at the time do not consistently indicate what is
going to happen to the stripe. So, update the names, and since a stripe is a
cache element use cache semantics like fill, dirty, and clean.
(also, fix up the indentation in fetch_block5)
Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Neil Brown <neilb@suse.de>
Dan Williams [Fri, 27 Jun 2008 22:32:09 +0000 (08:32 +1000)]
md: handle operation chaining in raid5_run_ops
From: Dan Williams <dan.j.williams@intel.com>
Neil said:
> At the end of ops_run_compute5 you have:
> /* ack now if postxor is not set to be run */
> if (tx && !test_bit(STRIPE_OP_POSTXOR, &s->ops_run))
> async_tx_ack(tx);
>
> It looks odd having that test there. Would it fit in raid5_run_ops
> better?
The intended global interpretation is that raid5_run_ops can build a chain
of xor and memcpy operations. When MD registers the compute-xor it tells
async_tx to keep the operation handle around so that another item in the
dependency chain can be submitted. If we are just computing a block to
satisfy a read then we can terminate the chain immediately. raid5_run_ops
gives a better context for this test since it cares about the entire chain.
Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Neil Brown <neilb@suse.de>
Dan Williams [Fri, 27 Jun 2008 22:32:06 +0000 (08:32 +1000)]
md: replace R5_WantPrexor with R5_WantDrain, add 'prexor' reconstruct_states
From: Dan Williams <dan.j.williams@intel.com>
Currently ops_run_biodrain and other locations have extra logic to determine
which blocks are processed in the prexor and non-prexor cases. This can be
eliminated if handle_write_operations5 flags the blocks to be processed in all
cases via R5_Wantdrain. The presence of the prexor operation is tracked in
sh->reconstruct_state.
Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Neil Brown <neilb@suse.de>
Dan Williams [Fri, 27 Jun 2008 22:32:05 +0000 (08:32 +1000)]
md: replace STRIPE_OP_{BIODRAIN,PREXOR,POSTXOR} with 'reconstruct_states'
From: Dan Williams <dan.j.williams@intel.com>
Track the state of reconstruct operations (recalculating the parity block
usually due to incoming writes, or as part of array expansion) Reduces the
scope of the STRIPE_OP_{BIODRAIN,PREXOR,POSTXOR} flags to only tracking whether
a reconstruct operation has been requested via the ops_request field of struct
stripe_head_state.
This is the final step in the removal of ops.{pending,ack,complete,count}, i.e.
the STRIPE_OP_{BIODRAIN,PREXOR,POSTXOR} flags only request an operation and do
not track the state of the operation.
Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Neil Brown <neilb@suse.de>
Dan Williams [Fri, 27 Jun 2008 22:32:03 +0000 (08:32 +1000)]
md: replace STRIPE_OP_COMPUTE_BLK with STRIPE_COMPUTE_RUN
From: Dan Williams <dan.j.williams@intel.com>
Track the state of compute operations (recalculating a block from all the other
blocks in a stripe) with a state flag. Reduces the scope of the
STRIPE_OP_COMPUTE_BLK flag to only tracking whether a compute operation has
been requested via the ops_request field of struct stripe_head_state.
Note, the compute operation that is performed in the course of doing a 'repair'
operation (check the parity block, recalculate it and write it back if the
check result is not zero) is tracked separately with the 'check_state'
variable. Compute operations are held off while a 'check' is in progress, and
moving this check out to handle_issuing_new_read_requests5 the helper routine
__handle_issuing_new_read_requests5 can be simplified.
This is another step towards the removal of ops.{pending,ack,complete,count},
i.e. STRIPE_OP_COMPUTE_BLK only requests an operation and does not track the
state of the operation.
Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Neil Brown <neilb@suse.de>
Dan Williams [Fri, 27 Jun 2008 22:31:58 +0000 (08:31 +1000)]
md: replace STRIPE_OP_BIOFILL with STRIPE_BIOFILL_RUN
From: Dan Williams <dan.j.williams@intel.com>
Track the state of read operations (copying data from the stripe cache to bio
buffers outside the lock) with a state flag. Reduce the scope of the
STRIPE_OP_BIOFILL flag to only tracking whether a biofill operation has been
requested via the ops_request field of struct stripe_head_state.
This is another step towards the removal of ops.{pending,ack,complete,count},
i.e. STRIPE_OP_BIOFILL only requests an operation and does not track the state
of the operation.
Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Neil Brown <neilb@suse.de>
Dan Williams [Fri, 27 Jun 2008 22:31:57 +0000 (08:31 +1000)]
md: replace STRIPE_OP_CHECK with 'check_states'
From: Dan Williams <dan.j.williams@intel.com>
The STRIPE_OP_* flags record the state of stripe operations which are
performed outside the stripe lock. Their use in indicating which
operations need to be run is straightforward; however, interpolating what
the next state of the stripe should be based on a given combination of
these flags is not straightforward, and has led to bugs. An easier to read
implementation with minimal degrees of freedom is needed.
Towards this goal, this patch introduces explicit states to replace what was
previously interpolated from the STRIPE_OP_* flags. For now this only converts
the handle_parity_checks5 path, removing a user of the
ops.{pending,ack,complete,count} fields of struct stripe_operations.
This conversion also found a remaining issue with the current code. There is
a small window for a drive to fail between when we schedule a repair and when
the parity calculation for that repair completes. When this happens we will
writeback to 'failed_num' when we really want to write back to 'pd_idx'.
Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Neil Brown <neilb@suse.de>
Dan Williams [Fri, 27 Jun 2008 22:31:50 +0000 (08:31 +1000)]
md: kill STRIPE_OP_MOD_DMA in raid5 offload
From: Dan Williams <dan.j.williams@intel.com>
This micro-optimization allowed the raid code to skip a re-read of the
parity block after checking parity. It took advantage of the fact that
xor-offload-engines have their own internal result buffer and can check
parity without writing to memory. Remove it for the following reasons:
1/ It is a layering violation for MD to need to manage the DMA and
non-DMA paths within async_xor_zero_sum
2/ Bad precedent to toggle the 'ops' flags outside the lock
3/ Hard to realize a performance gain as reads will not need an updated
parity block and writes will dirty it anyways.
Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Neil Brown <neilb@suse.de>
Chris Webb [Fri, 27 Jun 2008 22:31:46 +0000 (08:31 +1000)]
Support changing rdev size on running arrays.
From: Chris Webb <chris@arachsys.com>
Allow /sys/block/mdX/md/rdY/size to change on running arrays, moving the
superblock if necessary for this metadata version. We prevent the available
space from shrinking to less than the used size, and allow it to be set to zero
to fill all the available space on the underlying device.
Signed-off-by: Chris Webb <chris@arachsys.com> Signed-off-by: Neil Brown <neilb@suse.de>
Neil Brown [Fri, 27 Jun 2008 22:31:43 +0000 (08:31 +1000)]
Make sure all changes to md/degraded are notified.
When a device fails, when a spare is activated, when
an array is reshaped, or when an array is started,
the extent to which the array is degraded can change.
Neil Brown [Fri, 27 Jun 2008 22:31:41 +0000 (08:31 +1000)]
Make sure all changes to md/sync_action are notified.
When the 'resync' thread starts or stops, when we explicitly
set sync_action, or when we determine that there is definitely nothing
to do, we notify sync_action.
To stop "sync_action" from occasionally showing the wrong value,
we introduce a new flags - MD_RECOVERY_RECOVER - to say that a
recovery is probably needed or happening, and we make sure
that we set MD_RECOVERY_RUNNING before clearing MD_RECOVERY_NEEDED.
Neil Brown [Fri, 27 Jun 2008 22:31:36 +0000 (08:31 +1000)]
Make sure all changes to md/array_state are notified.
Changes in md/array_state could be of interest to a monitoring
program. So make sure all changes trigger a notification.
Exceptions:
changing active_idle to active is not reported because it
is frequent and not interesting.
changing active to active_idle is only reported on arrays
with externally managed metadata, as it is not interesting
otherwise.
Neil Brown [Fri, 27 Jun 2008 22:31:33 +0000 (08:31 +1000)]
rationalise return value for ->hot_add_disk method.
For all array types but linear, ->hot_add_disk returns 1 on
success, 0 on failure.
For linear, it returns 0 on success and -errno on failure.
This doesn't cause a functional problem because the ->hot_add_disk
function of linear is used quite differently to the others.
However it is confusing.
So convert all to return 0 for success or -errno on failure
and fix call sites to match.
Neil Brown [Fri, 27 Jun 2008 22:31:29 +0000 (08:31 +1000)]
Enable setting of 'offset' and 'size' of a hot-added spare.
offset_store and rdev_size_store allow control of the region of a
device which is to be using in an md/raid array.
They only allow these values to be set when an array is being assembled,
as changing them on an active array could be dangerous.
However when adding a spare device to an array, we might need to
set the offset and size before starting recovery. So allow
these values to be set also if "->raid_disk < 0" which indicates that
the device is still a spare.
Neil Brown [Fri, 27 Jun 2008 22:31:27 +0000 (08:31 +1000)]
Don't try to make md arrays dirty if that is not meaningful.
Arrays personalities such as 'raid0' and 'linear' have no redundancy,
and so marking them as 'clean' or 'dirty' is not meaningful.
So always allow write requests without requiring a superblock update.
Such arrays types are detected by ->sync_request being NULL. If it is
not possible to send a sync request we don't need a 'dirty' flag because
all a dirty flag does is trigger some sync_requests.
Neil Brown [Fri, 27 Jun 2008 22:31:26 +0000 (08:31 +1000)]
Close race in md_probe
There is a possible race in md_probe. If two threads call md_probe
for the same device, then one could exit (having checked that
->gendisk exists) before the other has called kobject_init_and_add,
thus returning an incomplete kobj which will cause problems when
we try to add children to it.
So extend the range of protection of disks_mutex slightly to
avoid this possibility.
Neil Brown [Fri, 27 Jun 2008 22:31:24 +0000 (08:31 +1000)]
Allow setting start point for requested check/repair
This makes it possible to just resync a small part of an array.
e.g. if a drive reports that it has questionable sectors,
a 'repair' of just the region covering those sectors will
cause them to be read and, if there is an error, re-written
with correct data.
Neil Brown [Fri, 27 Jun 2008 22:31:22 +0000 (08:31 +1000)]
Improve setting of "events_cleared" for write-intent bitmaps.
When an array is degraded, bits in the write-intent bitmap are not
cleared, so that if the missing device is re-added, it can be synced
by only updated those parts of the device that have changed since
it was removed.
The enable this a 'events_cleared' value is stored. It is the event
counter for the array the last time that any bits were cleared.
Sometimes - if a device disappears from an array while it is 'clean' -
the events_cleared value gets updated incorrectly (there are subtle
ordering issues between updateing events in the main metadata and the
bitmap metadata) resulting in the missing device appearing to require
a full resync when it is re-added.
With this patch, we update events_cleared precisely when we are about
to clear a bit in the bitmap. We record events_cleared when we clear
the bit internally, and copy that to the superblock which is written
out before the bit on storage. This makes it more "obviously correct".
We also need to update events_cleared when the event_count is going
backwards (as happens on a dirty->clean transition of a non-degraded
array).
Thanks to Mike Snitzer for identifying this problem and testing early
"fixes".
Cc: "Mike Snitzer" <snitzer@gmail.com> Signed-off-by: Neil Brown <neilb@suse.de>
Neil Brown [Fri, 27 Jun 2008 22:31:17 +0000 (08:31 +1000)]
Fix error paths if md_probe fails.
md_probe can fail (e.g. alloc_disk could fail) without
returning an error (as it alway returns NULL).
So when we call mddev_find immediately afterwards, we need
to check that md_probe actually succeeded. This means checking
that mdev->gendisk is non-NULL.
cc: <stable@kernel.org> Cc: Dave Jones <davej@redhat.com> Signed-off-by: Neil Brown <neilb@suse.de>
Neil Brown [Fri, 27 Jun 2008 22:31:14 +0000 (08:31 +1000)]
Don't acknowlege that stripe-expand is complete until it really is.
We shouldn't acknowledge that a stripe has been expanded (When
reshaping a raid5 by adding a device) until the moved data has
actually been written out. However we are currently
acknowledging (by calling md_done_sync) when the POST_XOR
is complete and before the write.
So track in s.locked whether there are pending writes, and don't
call md_done_sync yet if there are.
Note: we all set R5_LOCKED on devices which are are about to
read from. This probably isn't technically necessary, but is
usually done when writing a block, and justifies the use of
s.locked here.
This bug can lead to a crash if an array is stopped while an reshape
is in progress.
Cc: <stable@kernel.org> Signed-off-by: Neil Brown <neilb@suse.de>
Neil Brown [Fri, 27 Jun 2008 22:30:52 +0000 (08:30 +1000)]
Ensure interrupted recovery completed properly (v1 metadata plus bitmap)
If, while assembling an array, we find a device which is not fully
in-sync with the array, it is important to set the "fullsync" flags.
This is an exact analog to the setting of this flag in hot_add_disk
methods.
Currently, only v1.x metadata supports having devices in an array
which are not fully in-sync (it keep track of how in sync they are).
The 'fullsync' flag only makes a difference when a write-intent bitmap
is being used. In this case it tells recovery to ignore the bitmap
and recovery all blocks.
This fix is already in place for raid1, but not raid5/6 or raid10.
So without this fix, a raid1 ir raid4/5/6 array with version 1.x
metadata and a write intent bitmaps, that is stopped in the middle
of a recovery, will appear to complete the recovery instantly
after it is reassembled, but the recovery will not be correct.
If you might have an array like that, issueing
echo repair > /sys/block/mdXX/md/sync_action
will make sure recovery completes properly.
Cc: <stable@kernel.org> Signed-off-by: Neil Brown <neilb@suse.de>
H. Peter Anvin [Fri, 27 Jun 2008 20:23:00 +0000 (13:23 -0700)]
x86: setup: issue a null command after enabling A20 via KBC
Apparently, DOS and possibly other legacy operating systems issued a
null command to the keyboard controller after toggling A20,
specifically "pulse output pins" with no output pins specified. This
was presumably done for synchronization reasons. This has made it
into at least the UHCI spec, and it has been found to cause
compatibility problems when "legacy USB" is enabled (which it almost
always is) to not have this byte sent.
It is *NOT* clear if any of these compatibility problems has any
effect on Linux. However, for maximum compatibility, issue this null
command after togging A20 through the KBC.
Kill pcibios_update_resource(), replacing it with pci_update_resource().
pci_update_resource() uses pcibios_resource_to_bus() to convert a
resource to a device BAR - the transformation should be exactly the
same as the transformation used for the PCI bridges.
pci_update_resource "knows" about 64-bit BARs, but doesn't attempt to
set the high 32-bits to anything non-zero - currently no architecture
attempts to do something different. If anyone cares, please fix; I'm
going to reflect current behaviour for the time being.
Ivan pointed out the following architectures need to examine their
pcibios_update_resource() implementation - they should make sure that
this new implementation does the right thing. #warning's have been
added where appropriate.
ia64
mips
mips64
This cset also includes a fix for the problem reported by AKPM where
64-bit arch compilers complain about the resource mask being placed
in a u32.
<-- snip -->
This patch removes the unused pcibios_update_resource() functions the
kernel gained since, from FRV, m68k, mips & sh architectures.
Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: David Howells <dhowells@redhat.com> Acked-by: Greg Ungerer <gerg@uclinux.org> Acked-by: Paul Mundt <lethal@linux-sh.org> Acked-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
David Howells [Fri, 27 Jun 2008 12:23:20 +0000 (13:23 +0100)]
PCI: fix pci_setup_device()'s sprinting into a const buffer
Make pci_setup_device() write the bus ID directly into the allotted storage,
rather than using pci_name() as the address as that now returns a const
pointer.
Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Kenji Kaneshige [Thu, 26 Jun 2008 11:06:24 +0000 (20:06 +0900)]
pciehp: use get_service_data
Current pciehp driver saves its private data pointer into pci_dev
structure using pci_set_drvdata()/pci_get_drvdata(). But because
pciehp is not a pci device driver but a PCI Express service driver, it
should save its private data pointer into pcie_device structure using
set_service_data()/get_service_data().
Currently, pciehp driver enables command completed interrupt as follows.
(1) Don't enable at initialization.
(2) Enable command completed interrupt whenever pciehp issues a
command, if the command doesn't attempt to disable the interrupt.
(3) Disable command completed interrupt at driver unloading.
Once we enable command completed interrupt, we don't need to re-enable
it for every command. So we can simplify above steps as follows:
(1) Enable command completed interrupt at initialization.
(2) No special sequence for command completed interrupt.
(3) Disable command completed interrupt at driver unloading.
Kenji Kaneshige [Fri, 20 Jun 2008 03:07:08 +0000 (12:07 +0900)]
pciehp: fix interrupt initialization
Current pciehp driver's intialization sequence is as follows:
(1) initialize controller data structure
(2) install interrupt handler
(3) enable software notification
(4) initialize controller specific slot data structure
(5) initialize generic slot data structure and register it to pci hotplug core
The interrupt handler of pciehp assumes that controller specific slot
data structure is already initialized. However, it is installed at (2)
before initializing controller specific slot data structure at
(4). Because of this, pciehp driver cannot handle the following cases
properly.
- If devices that shares IRQ with pciehp raise interrupts between (2) and (4).
- If hotplug events (e.g. MRL open) happen between (3) and (4).
We already have a workaround for this problem ("pciehp: fix NULL
dereference in interrupt handler: dbd79aed1aea2bece0bf43cc2ff3b2f9baf48a08).
But we still need fundamental fix.
This patch fix the problem by changing the initilization sequence as follows:
(1) initialize controller data structure
(2) initialize controller specific slot data structure
(3) install interrupt handler
(4) enable software notification
(5) initialize generic slot data structure and register it to pci hotplug core
Richard Sharpe [Wed, 25 Jun 2008 02:11:13 +0000 (19:11 -0700)]
firewire: fw-sbp2: fix parsing of logical unit directories
There is a small off-by-one bug in firewire-sbp2. This causes problems
when a device exports multiple LUN Directories. I found it when trying
to talk to a SONY DVD Jukebox.
Signed-off-by: Richard Sharpe <realrichardsharpe@gmail.com> Acked-by: Kristian Høgsberg <krh@redhat.com> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> (op. order, changelog)
mac80211: fix an oops in several failure paths in key allocation
This patch fixes an oops in several failure paths in key allocation. This
Oops occurs when freeing a key that has not been linked yet, so the
key->sdata is not set.
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Ivo van Doorn [Wed, 25 Jun 2008 19:27:00 +0000 (21:27 +0200)]
rt2x00: Fix lock dependency errror
This fixes a circular locking dependency in the workqueue handling.
The interface work task uses the mac80211 function
ieee80211_iterate_active_interfaces() which grabs the RTNL lock.
However when the interface is brough down, this happens under the RTNL
lock as well, this causes problems because mac80211 will flush the workqueue
during the ifdown event. This causes mac80211 to wait until the driver has
completed all work which can't finish because it is waiting on the RTNL lock.
This is fixed by moving rt2x00 workqueue tasks on a different workqueue,
this workqueue can be flushed when the ieee80211_hw structure is removed
by the driver (when the driver is unloaded) which does not happen under the
RTNL lock.
Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Vegard Nossum [Fri, 27 Jun 2008 16:06:54 +0000 (18:06 +0200)]
backtrace: replace timer with tasklet + completions
On qemu, the backtrace would show up _after_ the "end of backtrace
testing" message.
This patch changes it to use completions instead, which will guarantee
that no such race exists.
Cc: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Vegard Nossum [Fri, 27 Jun 2008 16:04:48 +0000 (18:04 +0200)]
stacktrace: add saved stack traces to backtrace self-test
This patch adds saved stack-traces to the backtrace suite of self-tests.
Note that we don't depend on or unconditionally enable CONFIG_STACKTRACE
because not all architectures may have it (and we still want to enable the
other tests for those architectures).
Cc: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Kill the special exception handler sections .tlbx.ex.text,
.tlbr.ex.text, tlbw.ex.text and .scall.text. Use .org instead to place
the handlers at the required offsets from EVBA.
Vegard Nossum [Fri, 27 Jun 2008 15:22:17 +0000 (17:22 +0200)]
x86: don't destroy %rbp on kernel-mode faults
From the code:
"B stepping K8s sometimes report an truncated RIP for IRET exceptions
returning to compat mode. Check for these here too."
The code then proceeds to truncate the upper 32 bits of %rbp. This means
that when do_page_fault() is finally called, its prologue,
do_page_fault:
push %rbp
movl %rsp, %rbp
will put the truncated base pointer on the stack. This means that the
stack tracer will not be able to follow the base-pointer changes and
will see all subsequent stack frames as unreliable.
This patch changes the code to use a different register (%rcx) for the
checking and leaves %rbp untouched.
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Ingo Molnar <mingo@elte.hu>
CC [M] drivers/net/wireless/b43/rfkill.o
drivers/net/wireless/b43/rfkill.c: In function ‘b43_rfkill_soft_toggle’:
drivers/net/wireless/b43/rfkill.c:90: warning: enumeration value ‘RFKILL_STATE_HARD_BLOCKED’ not handled in switch
CC [M] drivers/net/wireless/b43legacy/rfkill.o
drivers/net/wireless/b43legacy/rfkill.c: In function ‘b43legacy_rfkill_soft_toggle’:
drivers/net/wireless/b43legacy/rfkill.c:92: warning: enumeration value ‘RFKILL_STATE_HARD_BLOCKED’ not handled in switch
CC [M] drivers/net/wireless/iwlwifi/iwl-rfkill.o
drivers/net/wireless/iwlwifi/iwl-rfkill.c: In function ‘iwl_rfkill_soft_rf_kill’:
drivers/net/wireless/iwlwifi/iwl-rfkill.c:56: warning: enumeration value ‘RFKILL_STATE_HARD_BLOCKED’ not handled in switch
Also handle RFKILL_STATE_{ON,OFF} -> RFKILL_STATE_{UNBLOCKED,SOFT_BLOCKED}
conversion since I'm already here...
Signed-off-by: John W. Linville <linville@tuxdriver.com>
David Brownell [Wed, 18 Jun 2008 09:31:43 +0000 (02:31 -0700)]
at32ap700x spi: enable pullups on MISO
This is a minor tweak to the at32ap700x pin configuration for the SPI
input pin (MISO), enabling the on-chip weak pullup (typical 190K) to
(a) ensure a fixed data value for missing or input-only slaves;
(b) prevent power waste associated with inputs floating near VDDIO/2.
Atmel's boards have no external pullup or pulldown on these pins, so
it's unlikely other boards would address these issues with external
pulldowns. Were there trouble, board-specific code could turn off
the relevant pullup(s).
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
avr32: Add pin configuration choice to LCDC peripheral
This patch lets the board code choose which pin out to use for the LCD
interface.
On AT32AP7000 the LCDC is wired to two sets of pins, which lets the user
choose between dual ethernet and 32-bit EBI. For the ATNGW100 board it
is vital to have the choice to select the alternative pinout since this
pinout is routed to the external headers.
Update ATSTK1002 and ATSTK1004 to use the new interface.
David Brownell [Tue, 10 Jun 2008 11:55:52 +0000 (13:55 +0200)]
avr32: minor GPIO handling updates
On the odd chance some code uses a pin as a GPIO IRQ without calling
gpio_request() or gpio_direction_input(), the debug dump should still
show its pin status.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
John W. Linville [Fri, 27 Jun 2008 13:31:29 +0000 (09:31 -0400)]
ath5k: remove now unused variable declared in ath5k_tx
CC [M] drivers/net/wireless/ath5k/base.o
drivers/net/wireless/ath5k/base.c: In function ‘ath5k_tx’:
drivers/net/wireless/ath5k/base.c:2598: warning: unused variable ‘info’
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Johannes Berg [Wed, 25 Jun 2008 11:36:27 +0000 (14:36 +0300)]
mac80211: fix tx fragmentation
This patch fixes TX fragmentation caused by
tx handlers reordering and 'tx info to cb' patches
Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Johannes Berg [Thu, 26 Jun 2008 17:59:56 +0000 (19:59 +0200)]
mac80211: make workqueue freezable
This patch makes the mac80211 workqueue freezable making it
interact a bit better with system suspend and not try to ping
the AP while the hardware is down.
This doesn't really help with implementing proper suspend in
any way but makes some bad things trigger less.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Randy Dunlap [Wed, 25 Jun 2008 04:02:46 +0000 (21:02 -0700)]
iwlwifi: fix build for CONFIG_INPUT=n
Fix iwlwifi so that it builds cleanly with CONFIG_INPUT=n.
Also free the input device on exit.
drivers/built-in.o: In function `iwl_rfkill_unregister':
(.text+0xbf430): undefined reference to `input_unregister_device'
drivers/built-in.o: In function `iwl_rfkill_init':
(.text+0xbf51c): undefined reference to `input_allocate_device'
drivers/built-in.o: In function `iwl_rfkill_init':
(.text+0xbf5bf): undefined reference to `input_register_device'
drivers/built-in.o: In function `iwl_rfkill_init':
(.text+0xbf5e9): undefined reference to `input_free_device'
net/built-in.o: In function `rfkill_disconnect':
rfkill-input.c:(.text+0xe71e1): undefined reference to `input_close_device'
rfkill-input.c:(.text+0xe71e9): undefined reference to `input_unregister_handle'
net/built-in.o: In function `rfkill_connect':
rfkill-input.c:(.text+0xe723e): undefined reference to `input_register_handle'
rfkill-input.c:(.text+0xe724d): undefined reference to `input_open_device'
rfkill-input.c:(.text+0xe725c): undefined reference to `input_unregister_handle'
net/built-in.o: In function `rfkill_handler_init':
rfkill-input.c:(.init.text+0x36ec): undefined reference to `input_register_handler'
net/built-in.o: In function `rfkill_handler_exit':
rfkill-input.c:(.exit.text+0x112c): undefined reference to `input_unregister_handler'
make[1]: *** [.tmp_vmlinux1] Error 1
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Ivo van Doorn [Tue, 24 Jun 2008 17:23:36 +0000 (19:23 +0200)]
mac80211: Add RTNL warning for workqueue
The workqueue provided by mac80211 should not be used for
scheduled tasks that acquire the RTNL lock. This could be done
when the driver uses the function ieee80211_iterate_active_interfaces()
within the scheduled work. Such behavior will end in locking
dependencies problems when an interface is being removed.
This patch will add a notification about the RTNL locking and
the mac80211 workqueue to prevent driver developers from
blindly using it.
Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
This patch updates the authentication method upon giwencode ioctl.
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Acked-by: Dan Williams <dcbw@redhat.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
mac80211: don't return -EINVAL upon iwconfig wlan0 rts auto
This patch avoids returning -EINVAL upon iwconfig wlan0 rts auto. If
rts->fixed is 0, then we should choose a default value instead of failing.
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Michael Buesch [Tue, 24 Jun 2008 10:22:05 +0000 (12:22 +0200)]
b43: Fix PIO skb clobber
This fixes a clobber of the skb that was introduced by the
tx_control->cb conversion patches.
This bug causes a crash when the skb destructor is invoked. That happens
on skb_orphan or skb_kfree.
Signed-off-by: Michael Buesch <mb@bu3sch.de> Signed-off-by: John W. Linville <linville@tuxdriver.com>
mac80211: Let drivers have access to TKIP key offets for TX and RX MIC
Some drivers may want to to use the TKIP key offsets for TX and RX
MIC so lets move this out. Lets also clear up a bit how this is used
internally in mac80211.
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Ivo van Doorn [Fri, 20 Jun 2008 20:10:53 +0000 (22:10 +0200)]
rt2x00: Remove duplicate deinitialization
When rt2x00queue_alloc_rxskbs() fails rt2x00queue_unitialize()
will be called which will free all rxskb. So we don't need
to do this in the rt2x00queue_alloc_rxskb() function as well.
rt2x00queue_free_skb() unmaps the DMA but doesn't clear the
allocation flag. Since the code is copied from rt2x00queue_unmap_skb()
anyway (and that function does clear the flag) we might as well
use that function directly.
Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Michael Buesch [Fri, 20 Jun 2008 16:02:08 +0000 (18:02 +0200)]
b43: Add debugfs firmware debugging knob
This adds a firmware debugging knob to debugfs.
With this knob it's possible to enable advanced runtime firmware
checks.
For now it only implements one sanity check for the mac-suspend.
In future there'll probably be more.
If CONFIG_B43_DEBUG is disabled, these checks will collapse to nothing.
Signed-off-by: Michael Buesch <mb@bu3sch.de> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Michael Buesch [Fri, 20 Jun 2008 15:44:02 +0000 (17:44 +0200)]
b43: Add simple firmware watchdog
This adds a simple firmware watchdog for the opensource firmware.
This will check every 15 seconds, if the firmware zeroed out the watchdog
register. The firmware will do this in its eventloop.
Signed-off-by: Michael Buesch <mb@bu3sch.de> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Michael Buesch [Fri, 20 Jun 2008 09:50:29 +0000 (11:50 +0200)]
ssb, b43, b43legacy, b44: Rewrite SSB DMA API
This is a rewrite of the DMA API for SSB devices.
This is needed, because the old (non-existing) "API" made too many bad
assumptions on the API of the host-bus (PCI).
This introduces an almost complete SSB-DMA-API that maps to the lowlevel
bus-API based on the bustype.
Signed-off-by: Michael Buesch <mb@bu3sch.de> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Martin Koegler [Wed, 28 May 2008 18:50:07 +0000 (20:50 +0200)]
avr32: Fix sigaltstack behaviour
A signal handler should be able to change the signal stack used for the
next signal by altering the ucontext_t passed as a parameter to the
handler. This does not currently work on avr32 since it doesn't update
the in-kernel signal context from the ucontext_t upon signal handler
return.
Fix it by adding a call to do_sigaltstack() from sys_rt_sigreturn(),
bringing it in line with most other architectures.
Some non-PCI drivers need the PCI variant of the DMA mapping API.
Include <asm-generic/pci-dma-compat.h> to provide this through the
non-PCI DMA mapping API.
Peter Zijlstra [Fri, 27 Jun 2008 11:41:38 +0000 (13:41 +0200)]
sched: incremental effective_load()
Increase the accuracy of the effective_load values.
Not only consider the current increment (as per the attempted wakeup), but
also consider the delta between when we last adjusted the shares and the
current situation.
Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Cc: Mike Galbraith <efault@gmx.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Fri, 27 Jun 2008 11:41:35 +0000 (13:41 +0200)]
sched: update shares on wakeup
We found that the affine wakeup code needs rather accurate load figures
to be effective. The trouble is that updating the load figures is fairly
expensive with group scheduling. Therefore ratelimit the updating.
Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Cc: Mike Galbraith <efault@gmx.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Fri, 27 Jun 2008 11:41:34 +0000 (13:41 +0200)]
sched: fix shares boost logic
In case the domain is empty, pretend there is a single task on each cpu, so
that together with the boost logic we end up giving 1/n shares to each
cpu.
Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Cc: Mike Galbraith <efault@gmx.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Fri, 27 Jun 2008 11:41:30 +0000 (13:41 +0200)]
sched: fix task_h_load()
Currently task_h_load() computes the load of a task and uses that to either
subtract it from the total, or add to it.
However, removing or adding a task need not have any effect on the total load
at all. Imagine adding a task to a group that is local to one cpu - in that
case the total load of that cpu is unaffected.
So properly compute addition/removal:
s_i = S * rw_i / \Sum_j rw_j
s'_i = S * (rw_i + wl) / (\Sum_j rw_j + wg)
then s'_i - s_i gives the change in load.
Where s_i is the shares for cpu i, S the group weight, rw_i the runqueue weight
for that cpu, wl the weight we add (subtract) and wg the weight contribution to
the runqueue.
Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Cc: Mike Galbraith <efault@gmx.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>