Wendy Cheng [Thu, 29 Nov 2007 22:56:51 +0000 (17:56 -0500)]
[GFS2] Remove lock methods for lock_nolock protocol
GFS2 supports two modes of locking - lock_nolock for single node filesystem
and lock_dlm for cluster mode locking. The gfs2 lock methods are removed from
file operation table for lock_nolock protocol. This would allow VFS to handle
posix lock and flock logics just like other in-tree filesystems without
duplication.
Signed-off-by: S. Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Ryan O'Hara [Wed, 21 Nov 2007 17:54:54 +0000 (11:54 -0600)]
[GFS2] remove unnecessary permission checks
Remove read/write permission() checks from xattr operations.
VFS layer is already handling permission for xattrs via the
xattr_permission() call, so there is no need for gfs2 to
check permissions. Futhermore, using permission() for SELinux
xattrs ops is incorrect.
Signed-off-by: Ryan O'Hara <rohara@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
The issue is indeed UP vs SMP and it is totally random.
spin_is_locked() is a bad assertion because there is no correct answer on UP.
on UP spin_is_locked() has to return either one value or another, always.
This means that in my setup I am lucky enough to trigger the issue and your you
are lucky enough not to.
the patch in attachment removes the bogus calls to BUG_ON and according to David
(in CC and thanks for the long explanation on the problem) we can rely upon
things like lockdep to find problem that might be trying to catch.
Signed-off-by: Fabio M. Di Nitto <fabbione@ubuntu.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
David Teigland [Thu, 15 Nov 2007 15:01:13 +0000 (09:01 -0600)]
[GFS2] tidy up error message
Print error with log_error() to be consistent with others.
Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Fabio M. Di Nitto <fabbione@ubuntu.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
[GFS2] Check for installation of mount helpers for DLM mounts
The patch is a fix to abort mount if the mount.gfs* and possible
umount.* are missing from /sbin.
While we do what we can to guarantee that they are installed properly in
userland (CVS HEAD), we want to make sure that mount still aborts properly.
The only sign of missing helpers is that lock_dlm will receive no mount options
at all. According to David the problem does not exist for lock_nolock as the
helpers are not required.
The patch has been tested for both gfs and gfs2 and it works as expected. The
lack of mount.gfs* will generate an error that is propagated to mount:
oot@node1:~# mount -t gfs2 /dev/nbd2 /mnt/
mount: wrong fs type, bad option, bad superblock on /dev/nbd2,
missing codepage or helper program, or other error
In some cases useful info is found in syslog - try
dmesg | tail or so
[ 3513.303346] GFS2: fsid=: Trying to join cluster "lock_dlm", "gutsy:gfs2"
[ 3513.304546] DLM/GFS2/GFS ERROR: (u)mount helpers are not installed properly!
[ 3513.306290] GFS2: fsid=: can't mount proto=lock_dlm, table=gutsy:gfs2, hostdata=
You might want to notice that it will also avoid mount to hang or fail silently
or with strange errors that will require the cluster to reboot/restart before
you can actually mount the filesystem again.
Signed-off-by: Fabio M. Di Nitto <fabbione@ubuntu.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
We only care about the content of the jindex in two cases,
one is when we mount the fs and the other is when we need
to recover another journal. In both cases we have to update
the jindex anyway, so there is no point in updating it
periodically between times, so this removes it to simplify
gfs2_logd.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
[GFS2] Use atomic_t for journal free blocks counter
This patch changes the counter which keeps track of the free
blocks in the journal to an atomic_t in preparation for the
following patch which will update the log reservation code.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
The only reason for adding glocks to the journal was to keep track
of which locks required a log flush prior to release. We add a
flag to the glock to allow this check to be made in a simpler way.
This reduces the size of a glock (by 12 bytes on i386, 24 on x86_64)
and means that we can avoid extra work during the journal flush.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
David Teigland [Wed, 7 Nov 2007 15:03:56 +0000 (09:03 -0600)]
[GFS2] check kthread_should_stop when waiting
Use wait_event_interruptible() in the lock_dlm thread instead
of an open coded equivalent, and include a kthread_should_stop()
check in the wait test so we don't miss a kthread_stop().
Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Bob Peterson [Fri, 2 Nov 2007 14:37:15 +0000 (09:37 -0500)]
[GFS2] Given device ID rather than s_id in "id" sysfs file
This patch changes the /sys/fs/gfs2/<s_id>/id file to give the device
id "major:minor" rather than the s_id. That enables gfs2_tool to
match devices properly (by id, not name) when locating the tuning files.
Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
The HIF_MUTEX and HIF_PROMOTE flags were set on the glock holders
depending upon which of the two waiters lists they were going to
be queued upon. They were then tested when the holders were taken
off the lists to ensure that the right type of holder was being
dequeued.
Since we are already using separate lists, there doesn't seem a
lot of point having these flags as well, and since setting them
and testing them is in the fast path for locking and unlocking
glock, this patch removes them.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Previously we were doing (write data, wait for data, write metadata, wait
for metadata). After this patch we so (write metadata, write data, wait for
data, wait for metadata) which should be more efficient.
Also I noticed that the drop_bh and xmote_bh functions were almost
identical. In fact the only difference was a single test, and that
test is such that in the drop_bh case, it would always evaluate to
the correct result. As such we can use the xmote_bh functions in
all the places where we were using the drop_bh function and remove
the drop_bh functions.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
This call to reclaim glocks is not needed, and in particular we don't want it
in the fast path for locking glocks. The limit was entirely arbitrary anyway
and we can't expect users to adjust things like this, the remaining code will
do the right thing on its own.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
[GFS2] Don't hold page lock when starting transaction
This is an addendum to the new AOPs work which moves the point
at which we take the page lock so that we don't get it until
the last possible moment. This resolves a conflict between
starting transactions and the page lock.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
This patch resolves a lock ordering issue where we had been getting
a transaction lock in the wrong order with respect to the page lock.
By using writepages rather than just writepage, it is then possible
to start a transaction before locking the page, and thus matching the
locking order elsewhere in the code.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
This patch splits gfs2_writepage into separate functions for each of
the three cases: writeback, ordered and journalled. As a result
it becomes a lot easier to see what each one is doing. The common
code is moved into gfs2_writepage_common.
This fixes a performance bug where we were doing more work than
strictly required in the ordered write case.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Just like ext3 we now have three sets of address space operations
to cover the cases of writeback, ordered and journalled data
writes. This means that the individual operations can now become
less complicated as we are able to remove some of the tests for
file data mode from the code.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
This adds a function "gfs2_is_writeback()" along the lines of the
existing "gfs2_is_jdata()" in order to clean up the code and make
the various tests for the inode mode more obvious. It also fixes
the PageChecked() logic where we were resetting the flag too early
in the case of an error path.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
The i_cache was designed to keep references to the indirect blocks
used during block mapping so that they didn't have to be looked
up continually. The idea failed because there are too many places
where the i_cache needs to be freed, and this has in the past been
the cause of many bugs.
In addition there was no performance benefit being gained since the
disk blocks in question were cached anyway. So this patch removes
it in order to simplify the code to prepare for other changes which
would otherwise have had to add further support for this feature.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
This cleans up the mmap() code path for GFS2 by implementing the
page_mkwrite function for GFS2. We are thus able to use the
generic filemap_fault function for our ->fault() implementation.
This now means that shared writable mappings will be much more
efficiently shared across the cluster if there is a reasonable
proportion of read activity (the greater proportion, the better).
As a side effect, it also reduces the size of the code, removes
special cases from readpage and readpages, and makes the code
path easier to follow.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
As requested by Christoph, this patch cleans up GFS2's internal
read function so that it no longer uses the do_generic_mapping_read
function. This function is obsolete and GFS2 is the last user of it.
As a side effect the internal read code gets smaller and easier
to read and gfs2_readpage is split into two. One function has the locking
and the other function has the rest of the logic.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Christoph Hellwig <hch@infradead.org>
Wendy Cheng [Fri, 5 Oct 2007 04:27:58 +0000 (00:27 -0400)]
[GFS2] Handle multiple glock demote requests
Fix a race condition where multiple glock demote requests are sent to
a node back-to-back. This patch does a check inside handle_callback()
to see whether a demote request is in progress. If true, it sets a flag
to make sure run_queue() will loop again to handle the new request,
instead of erronously setting gl_demote_state to a different state.
Signed-off-by: S. Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
David Brownell pointed out a mismatch in the avr32 extint code:
> I noticed a small glitch that's not fixed by this patch: the
> initial type is falling edge, but IRQ_TYPE_NONE is mapped to
> IRQ_TYPE_LEVEL_LOW. Potentially surprising.
Fix it by setting the initial type (and handler) to low level,
matching the meaning of IRQ_TYPE_NONE.
Change the NMI handler to use the die notifier chain to signal anyone
who cares. Add a simple "nmi debugger" which hooks into this chain and
that may dump registers, task state, etc. when it happens.
[AVR32] Kconfig: Choose daughterboard instead of CPU
Remove the CPU selection menu and instead let it be selected by the
board or daughterboard option. Add daughterboard selection for
ATSTK1000 (this was previously determined based on CPU type.)
Reduce the ridiculous amount of #ifdef clutter in atstk1002.c a bit by
moving all the extdac stuff into its own function and providing an
empty stub for the case when it isn't wanted.
These are derivatives of the AT32AP7000 chip, which means that most of
the code stays the same. Rename a few files, functions, definitions
and config symbols to reflect that they apply to all AP700x chips, and
exclude some platform devices from chips where they aren't present.
[AVR32] Provide more CPU information in /proc/cpuinfo and dmesg
Add the following fields to /proc/cpuinfo:
* chip type and revision (from the JTAG chip id)
* cpu MHz (from clk_get_rate())
* features (from the CONFIG0 register)
Also rename "cpu family" to "cpu arch" and "cpu type" to "cpu core" to
remove some ambiguity.
Show chip type and revision at bootup, and clarify that the other
kinds of IDs that we're already printing are for the cpu core and
architecture. Rename "AP7000" to "AP7" since that's the name of the
core.
Keep track of processes being debugged (including the kernel itself)
and turn the OCD system on and off as appropriate. Since enabling
debugging turns off some optimizations in the CPU core, this fixes the
issue that enabling KProbes support or simply running a program under
gdbserver will reduce system performance significantly until the next
reboot.
The CPU performance will still be reduced for all processes while a
process is being debugged, but this is a lot better than reducing the
performance forever.
ptrace: Call arch_ptrace_attach() when request=PTRACE_TRACEME
arch_ptrace_attach() is a hook that allows the architecture to do
book-keeping after a ptrace attach. This patch adds a call to this
hook when handling a PTRACE_TRACEME request as well.
Currently only one architecture, m32r, implements this hook. When
called, it initializes a number of debug trap slots in the ptraced
task's thread struct, and it looks to me like this is the right thing
to do after a PTRACE_TRACEME request as well, not only after
PTRACE_ATTACH. Please correct me if I'm wrong.
I want to use this hook on AVR32 to turn the debugging hardware on
when a process is actually being debugged and keep it off otherwise.
To be able to do this, I need to intercept PTRACE_TRACEME and
PTRACE_ATTACH, as well as PTRACE_DETACH and thread exit. The latter
two can be handled by existing hooks.
dma_alloc_coherent wants to split pages after allocation in order to
reduce the memory footprint. This does not work well with GFP_COMP
pages, so drop this flag before allocation.
Dave Young [Tue, 22 Jan 2008 07:14:18 +0000 (15:14 +0800)]
spi: use class iteration api
Convert to use the class iteration api.
Signed-off-by: Dave Young <hidave.darkstar@gmail.com> Acked-by: David Brownell <david-b@pacbell.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Dave Young [Tue, 22 Jan 2008 06:01:34 +0000 (14:01 +0800)]
scsi: use class iteration api
Convert to use the class iteration api.
Signed-off-by: Dave Young <hidave.darkstar@gmail.com> Cc: James Bottomley <James.Bottomley@hansenpartnership.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Dave Young [Tue, 22 Jan 2008 05:58:22 +0000 (13:58 +0800)]
power supply : use class iteration api
Convert to use the class iteration api.
Signed-off-by: Dave Young <hidave.darkstar@gmail.com> Cc: Anton Vorontsov <cbou@mail.ru> Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Dave Young [Tue, 22 Jan 2008 05:56:32 +0000 (13:56 +0800)]
ieee1394: use class iteration api
Convert to use the class iteration api.
Signed-off-by: Dave Young <hidave.darkstar@gmail.com> Cc: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Cornelia Huck [Mon, 21 Jan 2008 15:09:44 +0000 (16:09 +0100)]
Driver core: Cleanup get_device_parent() in device_add() and device_move()
Make setup_parent() void as get_device_parent() will always return
either a valid kobject or NULL.
Introduce cleanup_glue_dir() to drop reference grabbed on "glue"
directory by get_device_parent(). Use it for cleanup in device_move()
and device_add() on errors.
This should fix the refcounting problem reported in
http://marc.info/?l=linux-kernel&m=120052487909200&w=2
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Cc: Dave Young <hidave.darkstar@gmail.com> Cc: Gabor Gombas <gombasg@sztaki.hu> Cc: Tejun Heo <htejun@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Marcel Holtmann <marcel@holtmann.org> Cc: David Miller <davem@davemloft.net> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Randy Dunlap [Mon, 31 Dec 2007 18:05:43 +0000 (10:05 -0800)]
driver core: fix build with SYSFS=n
When SYSFS=n and MODULES=y, build ends with:
linux-2.6.24-rc6-mm1/drivers/base/module.c: In function 'module_add_driver':
linux-2.6.24-rc6-mm1/drivers/base/module.c:49: error: 'module_kset' undeclared (first use in this function)
make[3]: *** [drivers/base/module.o] Error 1
Below is one possible fix.
Build-tested with all 4 config combinations of SYSFS & MODULES.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Randy Dunlap [Mon, 31 Dec 2007 18:05:34 +0000 (10:05 -0800)]
sysfs: make SYSFS_DEPRECATED depend on SYSFS
Make SYSFS_DEPRECATED depend on SYSFS since files that check
CONFIG_SYSFS_DEPRECATED don't check for CONFIG_SYSFS first.
Also don't prompt user about SYSFS_DEPRECATED if SYSFS=n.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Kay Sievers [Thu, 20 Dec 2007 01:09:39 +0000 (02:09 +0100)]
Driver core: change sysdev classes to use dynamic kobject names
All kobjects require a dynamically allocated name now. We no longer
need to keep track if the name is statically assigned, we can just
unconditionally free() all kobject names on cleanup.
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Now that kobjects properly clean up their name structures, no matter if
they have a release function or not, we can drop this empty module
kobject release function too (it was needed prior to this because of the
way we handled static kobject names, we based the fact that if a release
function was present, then we could safely free the name string, now we
are more smart about things and only free names we have previously set.)
Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Kay Sievers [Wed, 19 Dec 2007 00:40:42 +0000 (01:40 +0100)]
Kobject: auto-cleanup on final unref
We save the current state in the object itself, so we can do proper
cleanup when the last reference is dropped.
If the initial reference is dropped, the object will be removed from
sysfs if needed, if an "add" event was sent, "remove" will be send, and
the allocated resources are released.
This allows us to clean up some driver core usage as well as allowing us
to do other such changes to the rest of the kernel.
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The function is no longer used by anyone in the kernel, and it prevents
the proper sending of the kobject uevent after the needed files are set
up by the caller. kobject_init_and_add() can be used in its place.
Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Kobject: remove kobject_init() as no one uses it anymore
The old kobject_init() function is on longer in use, so let us remove it
from the public scope (kset mess in the kobject.c file still uses it,
but that can be cleaned up later very simply.)
Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Kobject: remove kobject_add() as no one uses it anymore
The old kobject_add() function is on longer in use, so let us remove it
from the public scope (kset mess in the kobject.c file still uses it,
but that can be cleaned up later very simply.)
Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Kay Sievers [Mon, 21 May 2007 20:08:01 +0000 (22:08 +0200)]
Driver core: convert block from raw kobjects to core devices
This moves the block devices to /sys/class/block. It will create a
flat list of all block devices, with the disks and partitions in one
directory. For compatibility /sys/block is created and contains symlinks
to the disks.
Alan Stern [Mon, 19 Nov 2007 15:53:40 +0000 (10:53 -0500)]
Kobject: drop child->parent ref at unregistration
This patch (as1015) reverts changes that were made to the driver core
about four years ago. The intent back then was to avoid certain kinds
of invalid memory accesses by leaving kernel objects allocated as long
as any of their children were still allocated. The original and
correct approach was to wait only as long as any children were still
_registered_; that's what this patch reinstates.
This fixes a problem in the SCSI core made visible by the class_device
to regular device conversion: A reference loop (scsi_device holds
reference to request_queue, which is the child of a gendisk, which is
the child of the scsi_device) prevents the data structures from being
released, even though they are deregistered okay.
It's possible that this change will cause a few bugs to surface,
things that have been hidden for several years. They can be fixed
easily enough by having the child device take an explicit reference to
the parent whenever needed.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Kay Sievers [Thu, 29 Nov 2007 22:46:11 +0000 (23:46 +0100)]
sysfs: fix /sys/module/*/holders after sysfs logic change
Sysfs symlinks now require fully registered kobjects as a target,
otherwise the call to create a symlink will fail. Here we register
the kobject before we request the symlink in the holders directory.
Kay Sievers [Wed, 21 Nov 2007 16:29:15 +0000 (17:29 +0100)]
Driver core: fix class glue dir cleanup logic
We should remove the glue directory between the class and the bus
device _after_ we sent out the 'remove' event for the device, otherwise
the parent relationship is no longer valid, and composing the path
with deleted sysfs entries will not work.
Cc: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Alan Stern [Fri, 16 Nov 2007 16:57:28 +0000 (11:57 -0500)]
Driver core: fix race in __device_release_driver
This patch (as1013) was suggested by David Woodhouse; it fixes a race
in the driver core. If a device is unregistered at the same time as
its driver is unloaded, the driver's code pages may be unmapped while
the remove method is still running. The calls to get_driver() and
put_driver() were intended to prevent this, but they don't work if the
driver's module count has already dropped to 0.
Instead, the patch keeps the device on the driver's list until after
the remove method has returned. This forces the necessary
synchronization to occur.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Driver core: move the static kobject out of struct driver
This patch removes the kobject, and a few other driver-core-only fields
out of struct driver and into the driver core only. Now drivers can be
safely create on the stack or statically (like they currently are.)
Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Driver core: move the driver specific module code into the driver core
The module driver specific code should belong in the driver core, not in
the kernel/ directory. So move this code. This is done in preparation
for some struct device_driver rework that should be confined to the
driver core code only.
This also lets us keep from exporting these functions, as no external
code should ever be calling it.
Thanks to Andrew Morton for the !CONFIG_MODULES fix.
Driver: add driver_add_kobj for looney iseries_veth driver
The iseries driver wants to hang kobjects off of its driver, so, to
preserve backwards compatibility, we need to add a call to the driver
core to allow future changes to work properly.
Hopefully no one uses this function in the future and the iseries_veth
driver authors come to their senses so I can remove this hack...
Cc: Dave Larson <larson1@us.ibm.com> Cc: Santiago Leon <santil@us.ibm.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Infiniband: make ipath driver use default driver groups.
Make the ipath driver use the new driver functions so that it does not
touch the sysfs portion of the driver structure.
We also remove the redundant symlink from the device back to the driver,
as it is already in the sysfs tree. Any userspace tools should be using
the standard symlink, not some driver specific one.
Cc: Roland Dreier <rdreier@cisco.com> Cc: Bryan O'Sullivan <bryan.osullivan@qlogic.com> Cc: Arthur Jones <arthur.jones@qlogic.com> Cc: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>