pilppa.org Git - linux-2.6-omap-h63xx.git/log

svc: Remove unnecessary call to svc_sock_enqueue

The svc_tcp_accept function calls svc_sock_enqueue after setting the
SK_CONN bit. This doesn't actually do anything because the SK_BUSY bit
is still set. The call is unnecessary anyway because the generic code in
svc_recv calls svc_sock_received after calling the accept function.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

svc: Add xpo_accept transport function

Previously, the accept logic looked into the socket state to determine
whether to call accept or recv when data-ready was indicated on an endpoint.
Since some transports don't use sockets, this logic now uses a flag
bit (SK_LISTENER) to identify listening endpoints. A transport function
(xpo_accept) allows each transport to define its own accept processing.
A transport's initialization logic is reponsible for setting the
SK_LISTENER bit. I didn't see any way to do this in transport independent
logic since the passive side of a UDP connection doesn't listen and
always recv's.

In the svc_recv function, if the SK_LISTENER bit is set, the transport
xpo_accept function is called to handle accept processing.

Note that all functions are defined even if they don't make sense
for a given transport. For example, accept doesn't mean anything for
UDP. The function is defined anyway and bug checks if called. The
UDP transport should never set the SK_LISTENER bit.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

svc: Move close processing to a single place

Close handling was duplicated in the UDP and TCP recvfrom
methods. This code has been moved to the transport independent
svc_recv function.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

svc: Add a transport function that checks for write space

In order to avoid blocking a service thread, the receive side checks
to see if there is sufficient write space to reply to the request.
Each transport has a different mechanism for determining if there is
enough write space to reply.

The code that checked for write space was coupled with code that
checked for CLOSE and CONN. These checks have been broken out into
separate statements to make the code easier to read.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

svc: Add xpo_prep_reply_hdr

Some transports add fields to the RPC header for replies, e.g. the TCP
record length. This function is called when preparing the reply header
to allow each transport to add whatever fields it requires.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

svc: Add per-transport delete functions

Add transport specific xpo_detach and xpo_free functions. The xpo_detach
function causes the transport to stop delivering data-ready events
and enqueing the transport for I/O.

The xpo_free function frees all resources associated with the particular
transport instance.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

svc: Add transport specific xpo_release function

The svc_sock_release function releases pages allocated to a thread. For
UDP this frees the receive skb. For RDMA it will post a receive WR
and bump the client credit count.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

svc: Move sk_sendto and sk_recvfrom to svc_xprt_class

The sk_sendto and sk_recvfrom are function pointers that allow svc_sock
to be used for both UDP and TCP. Move these function pointers to the
svc_xprt_ops structure.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

svc: Add a max payload value to the transport

The svc_max_payload function currently looks at the socket type
to determine the max payload. Add a max payload value to svc_xprt_class
so it can be returned directly.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

svc: Change the svc_sock in the rqstp structure to a transport

The rqstp structure contains a pointer to the transport for the
RPC request. This functionaly trivial patch adds an unamed union
with pointers to both svc_sock and svc_xprt. Ultimately the
union will be removed and only the rq_xprt field will remain. This
allows incrementally extracting transport independent interfaces without
one gigundo patch.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

svc: Make svc_sock the tcp/udp transport

Make TCP and UDP svc_sock transports, and register them
with the svc transport core.

A transport type (svc_sock) has an svc_xprt as its first member,
and calls svc_xprt_init to initialize this field.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

svc: Add an svc transport class

The transport class (svc_xprt_class) represents a type of transport, e.g.
udp, tcp, rdma. A transport class has a unique name and a set of transport
operations kept in the svc_xprt_ops structure.

A transport class can be dynamically registered and unregisterd. The
svc_xprt_class represents the module that implements the transport
type and keeps reference counts on the module to avoid unloading while
there are active users.

The endpoint (svc_xprt) is a generic, transport independent endpoint that can
be used to send and receive data for an RPC service. It inherits it's
operations from the transport class.

A transport driver module registers and unregisters itself with svc sunrpc
by calling svc_reg_xprt_class, and svc_unreg_xprt_class respectively.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

svcrpc: ensure gss DESTROY tokens free contexts from cache

If we don't do this then we'll end up with a pointless unusable context
sitting in the cache until the time the original context would have
expired.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

Leak in nlmsvc_testlock for async GETFL case

Fix nlm_block leak for the case of supplied blocking lock info.

Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

nfsd4: clean up access_valid, deny_valid checks.

Document these checks a little better and inline, as suggested by Neil
Brown (note both functions have two callers). Remove an obviously bogus
check while we're there (checking whether unsigned value is negative).

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Cc: Neil Brown <neilb@suse.de>

sunrpc: gss: simplify rsi_parse logic

Make an obvious simplification that removes a few lines and some
unnecessary indentation; no change in behavior.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

knfsd: change mailing list for nfsd in MAINTAINERS

nfs@lists.sourceforge.net is being decommissioned.

Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

nfsd: fix rsi_cache reference count leak

For some reason we haven't been put()'ing the reference count here.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

nfsd: allow root to set uid and gid on create

The server silently ignores attempts to set the uid and gid on create.
Based on the comment, this appears to have been done to prevent some
overly-clever IRIX client from causing itself problems.

Perhaps we should remove that hack completely. For now, at least, it
makes sense to allow root (when no_root_squash is set) to set uid and
gid.

While we're there, since nfsd_create and nfsd_create_v3 share the same
logic, pull that out into a separate function. And spell out the
individual modifications of ia_valid instead of doing them both at once
inside a conditional.

Thanks to Roger Willcocks <roger@filmlight.ltd.uk> for the bug report
and original patch on which this is based.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

lockd: fix a leak in nlmsvc_testlock asynchronous request handling

Without the patch, there is a leakage of nlmblock structure refcount
that holds a reference nlmfile structure, that holds a reference to
struct file, when async GETFL is used (-EINPROGRESS return from
file_ops->lock()), and also in some error cases.

Fix up a style nit while we're here.

Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

nfsd: Allow AIX client to read dir containing mountpoints

This patch addresses a compatibility issue with a Linux NFS server and
AIX NFS client.

I have exported /export as fsid=0 with sec=krb5:krb5i
I have mount --bind /home onto /export/home
I have exported /export/home with sec=krb5i

The AIX client mounts / -o sec=krb5:krb5i onto /mnt

If I do an ls /mnt, the AIX client gets a permission error. Looking at
the network traceIwe see a READDIR looking for attributes
FATTR4_RDATTR_ERROR and FATTR4_MOUNTED_ON_FILEID. The response gives a
NFS4ERR_WRONGSEC which the AIX client is not expecting.

Since the AIX client is only asking for an attribute that is an
attribute of the parent file system (pseudo root in my example), it
seems reasonable that there should not be an error.

In discussing this issue with Bruce Fields, I initially proposed
ignoring the error in nfsd4_encode_dirent_fattr() if all that was being
asked for was FATTR4_RDATTR_ERROR and FATTR4_MOUNTED_ON_FILEID, however,
Bruce suggested that we avoid calling cross_mnt() if only these
attributes are requested.

The following patch implements bypassing cross_mnt() if only
FATTR4_RDATTR_ERROR and FATTR4_MOUNTED_ON_FILEID are called. Since there
is some complexity in the code in nfsd4_encode_fattr(), I didn't want to
duplicate code (and introduce a maintenance nightmare), so I added a
parameter to nfsd4_encode_fattr() that indicates whether it should
ignore cross mounts and simply fill in the attribute using the passed in
dentry as opposed to it's parent.

Signed-off-by: Frank Filz <ffilzlnx@us.ibm.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

nfsd4: fix bad seqid on lock request incompatible with open mode

The failure to return a stateowner from nfs4_preprocess_seqid_op() means
in the case where a lock request is of a type incompatible with an open
(due to, e.g., an application attempting a write lock on a file open for
read), means that fs/nfsd/nfs4xdr.c:ENCODE_SEQID_OP_TAIL() never bumps
the seqid as it should. The client, attempting to close the file
afterwards, then gets an (incorrect) bad sequence id error. Worse, this
prevents the open file from ever being closed, so we leak state.

Thanks to Benny Halevy and Trond Myklebust for analysis, and to Steven
Wilton for the report and extensive data-gathering.

Cc: Benny Halevy <bhalevy@panasas.com>
Cc: Steven Wilton <steven.wilton@team.eftel.com.au>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

lockd: fix reference count leaks in async locking case

In a number of places where we wish only to translate nlm_drop_reply to
rpc_drop_reply errors we instead return early with rpc_drop_reply,
skipping some important end-of-function cleanup.

This results in reference count leaks when lockd is doing posix locking
on GFS2.

Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

nfsd4: recognize callback channel failure earlier

When the callback channel fails, we inform the client of that by
returning a cb_path_down error the next time it tries to renew its
lease.

If we wait most of a lease period before deciding that a callback has
failed and that the callback channel is down, then we decrease the
chances that the client will find out in time to do anything about it.

So, mark the channel down as soon as we recognize that an rpc has
failed. However, continue trying to recall delegations anyway, in hopes
it will come back up. This will prevent more delegations from being
given out, and ensure cb_path_down is returned to renew calls earlier,
while still making the best effort to deliver recalls of existing
delegations.

Also fix a couple comments and remove a dprink that doesn't seem likely
to be useful.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

nfsd4: miscellaneous nfs4state.c style fixes

Fix various minor style violations.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

nfsd4: make current_clientid local

Declare this variable in the one function where it's used, and clean up
some minor style problems.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

nfsd: fix encode_entryplus_baggage() indentation

Fix bizarre indentation.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

nfsd4: kill unneeded cl_confirm check

We generate a unique cl_confirm for every new client; so if we've
already checked that this cl_confirm agrees with the cl_confirm of
unconf, then we already know that it does not agree with the cl_confirm
of conf.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

nfsd4: remove unnecessary cl_verifier check from setclientid_confirm

Again, the only way conf and unconf can have the same clientid is if
they were created in the "probable callback update" case of setclientid,
in which case we already know that the cl_verifier fields must agree.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

nfsd4: kill unnecessary same_name() in setclientid_confirm

If conf and unconf are both found in the lookup by cl_clientid, then
they share the same cl_clientid. We always create a unique new
cl_clientid field when creating a new client--the only exception is the
"probable callback update" case in setclientid, where we copy the old
cl_clientid from another clientid with the same name.

Therefore two clients with the same cl_client field also always share
the same cl_name field, and a couple of the checks here are redundant.

Thanks to Simon Holm Thøgersen for a compile fix.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Cc: Simon Holm Thøgersen <odie@cs.aau.dk>

nfsd: uniquify cl_confirm values

Using a counter instead of the nanoseconds value seems more likely to
produce a unique cl_confirm.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

nfsd: eliminate final bogus case from setclientid logic

We're supposed to generate a different cl_confirm verifier for each new
client, so these to cl_confirm values should never be the same.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

nfsd4: kill some unneeded setclientid comments

Most of these comments just summarize the code.

The matching of code to the cases described in the RFC may still be
useful, though; add specific section references to make that easier to
follow. Also update references to the outdated RFC 3010.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

nfsd: minor fs/nfsd/auth.h cleanup

While we're here, let's remove the redundant (and now wrong) pathname in
the comment, and the #ifdef __KERNEL__'s.

Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

nfsd: move nfsd/auth.h into fs/nfsd

This header is used only in a few places in fs/nfsd, so there seems to
be little point to having it in include/. (Thanks to Robert Day for
pointing this out.)

Cc: Robert P. J. Day <rpjday@crashcourse.ca>
Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

knfsd: allow cache_register to return error on failure

Newer server features such as nfsv4 and gss depend on proc to work, so a
failure to initialize the proc files they need should be treated as
fatal.

Thanks to Andrew Morton for style fix and compile fix in case where
CONFIG_NFSD_V4 is undefined.

Cc: Andrew Morton <akpm@linux-foundation.org>
Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

nfsd: move cache proc (un)registration to separate function

Just some minor cleanup.

Also I don't see much point in trying to register further proc entries
if initial entries fail; so just stop trying in that case.

Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

nfsd: fail init on /proc/fs/nfs/exports creation failure

I assume the reason failure of creation was ignored here was just to
continue support embedded systems that want nfsd but not proc.

However, in cases where proc is supported it would be clearer to fail
entirely than to come up with some features disabled.

Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

nfsd: select CONFIG_PROC_FS in nfsv4 and gss server cases

The server depends on upcalls under /proc to support nfsv4 and gss.

Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

knfsd: cache unregistration needn't return error

There's really nothing much the caller can do if cache unregistration
fails. And indeed, all any caller does in this case is print an error
and continue. So just return void and move the printk's inside
cache_unregister.

Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

nfsd: fail module init on reply cache init failure

If the reply cache initialization fails due to a kmalloc failure,
currently we try to soldier on with a reduced (or nonexistant) reply
cache.

Better to just fail immediately: the failure is then much easier to
understand and debug, and it could save us complexity in some later
code. (But actually, it doesn't help currently because the cache is
also turned off in some odd failure cases; we should probably find a
better way to handle those failure cases some day.)

Fix some minor style problems while we're at it, and rename
nfsd_cache_init() to remove the need for a comment describing it.

Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

nfsd: cleanup nfsd module initialization cleanup

Handle the failure case here with something closer to the standard
kernel style.

Doesn't really matter for now, but I'd like to add a few more failure
cases, and then this'll help.

Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

knfsd: cleanup nfsd4 properly on module init failure

We forgot to shut down the nfs4 state and idmapping code in this case.

Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

nfsd: Fix handling of negative lengths in read_buf()

The length "nbytes" passed into read_buf should never be negative, but
we check only for too-large values of "nbytes", not for too-small
values.  Make nbytes unsigned, so it's clear that the former tests are
sufficient.  (Despite this read_buf() currently correctly returns an xdr
error in the case of a negative length, thanks to an unsigned
comparison with size_of() and bounds-checking in kmalloc().  This seems
very fragile, though.)

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

knfsd: fix cache.c comment

The path here must be left over from some earlier draft; fix it. And do
some more minor cleanup while we're there.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

NFSD: Path name length signage in nfsd request argument structures

Clean up: For consistency, store the length of path name strings in nfsd
argument structures as unsigned integers.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Acked-By: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

NFSD: Fix mixed sign comparison in nfs3svc_decode_symlinkargs

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Acked-By: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

NFSD: Use unsigned length argument for decode_pathname

Clean up: path name lengths are unsigned on the wire, negative lengths
are not meaningful natively either.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Acked-By: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

NFSD: Adjust filename length argument of nfsd_lookup

Clean up: adjust the sign of the length argument of nfsd_lookup and
nfsd_lookup_dentry, for consistency with recent changes. NFSD version
4 callers already pass an unsigned file name length.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Acked-By: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

NFSD: File name length signage in nfsd request argument structures

Clean up: For consistency, store the length of file name strings in nfsd
argument structures as unsigned integers. This matches the XDR routines
and client argument structures for the same operation types.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Acked-By: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

NFSD: Use unsigned length argument for decode_filename

Clean up: file name lengths are unsigned on the wire, negative lengths
are not meaningful natively either.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Acked-By: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

NLM: Fix sign of length of NLM variable length strings

According to The Open Group's NLM specification, NLM callers are variable
length strings. XDR variable length strings use an unsigned 32 bit length.
And internally, negative string lengths are not meaningful for the Linux
NLM implementation.

Clean up: Make nlm_lock.len and nlm_reboot.len unsigned integers. This
makes the sign of NLM string lengths consistent with the sign of xdr_netobj
lengths.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Acked-By: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

SUNRPC: Use unsigned string lengths in xdr_decode_string_inplace

XDR strings, opaques, and net objects should all use unsigned lengths.
To wit, RFC 4506 says:

4.2.  Unsigned Integer

   An XDR unsigned integer is a 32-bit datum that encodes a non-negative
   integer in the range [0,4294967295].

...

4.11.  String

   The standard defines a string of n (numbered 0 through n-1) ASCII
   bytes to be the number n encoded as an unsigned integer (as described
   above), and followed by the n bytes of the string.

After this patch, xdr_decode_string_inplace now matches the other XDR
string and array helpers that take a string length argument.  See:

xdr_encode_opaque_fixed, xdr_encode_opaque, xdr_encode_array

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Acked-By: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

SUNRPC: Prevent length underflow in read_flush()

Make sure we compare an unsigned length to an unsigned count in
read_flush().

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

knfsd: fix broken length check in nfs4idmap.c

Obviously at some point we thought "error" represented the length when
positive. This appears to be a long-standing typo.

Thanks to Prasad Potluri <pvp@us.ibm.com> for finding the problem and
proposing an earlier version of this patch.

Cc: Steve French <smfltc@us.ibm.com>
Cc: Prasad V Potluri <pvp@us.ibm.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

nfsd: Fix inconsistent assignment

Dereferenced pointer "dentry" without checking and assigned to inode
in the declaration.

(We could just delete the NULL checks that follow instead, as we never
get to the encode function in this particular case. But it takes a
little detective work to verify that fact, so it's probably safer to
leave the checks in place.)

Cc: Steve French <smfltc@us.ibm.com>
Signed-off-by: Prasad V Potluri <pvp@us.ibm.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

nfsd: move callback rpc_client creation into separate thread

The whole reason to move this callback-channel probe into a separate
thread was because (for now) we don't have an easy way to create the
rpc_client asynchronously. But I forgot to move the rpc_create() to the
spawned thread. Doh! Fix that.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

nfsd4: probe callback channel only once

Our callback code doesn't actually handle concurrent attempts to probe
the callback channel. Some rethinking of the locking may be required.
However, we can also just move the callback probing to this case. Since
this is the only time a client is "confirmed" (and since that can only
happen once in the lifetime of a client), this ensures we only probe
once.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

Merge branch 'audit.b46' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current

* 'audit.b46' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current:
  [AUDIT] Add uid, gid fields to ANOM_PROMISCUOUS message
  [AUDIT] ratelimit printk messages audit
  [patch 2/2] audit: complement va_copy with va_end()
  [patch 1/2] kernel/audit.c: warning fix
  [AUDIT] create context if auditing was ever enabled
  [AUDIT] clean up audit_receive_msg()
  [AUDIT] make audit=0 really stop audit messages
  [AUDIT] break large execve argument logging into smaller messages
  [AUDIT] include audit type in audit message when using printk
  [AUDIT] do not panic on exclude messages in audit_log_pid_context()
  [AUDIT] Add End of Event record
  [AUDIT] add session id to audit messages
  [AUDIT] collect uid, loginuid, and comm in OBJ_PID records
  [AUDIT] return EINTR not ERESTART*
  [PATCH] get rid of loginuid races
  [PATCH] switch audit_get_loginuid() to task_struct *

Merge git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86

* git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86:
  x86: avoid section mismatch involving arch_register_cpu
  x86: fixes for lookup_address args
  x86: fix sparse warnings in cpu/common.c
  x86: make early_console static in early_printk.c
  x86: remove unneeded round_up
  x86: fix section mismatch warning in kernel/pci-calgary
  x86: fix section mismatch warning in acpi/boot.c
  x86: fix section mismatch warnings when referencing notifiers
  x86: silence section mismatch warning in smpboot_64.c
  x86: fix comments in vmlinux_64.lds
  x86_64: make bootmap_start page align v6
  x86_64: add debug name for early_res

Merge git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched

* git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched:
  latencytop: Change Kconfig dependency.
  futex: Add bitset conditional wait/wakeup functionality
  futex: Remove warn on in return fixup path
  x86: replace LOCK_PREFIX in futex.h
  tick-sched: add more debug information
  timekeeping: update xtime_cache when time(zone) changes
  hrtimer: fix hrtimer_init_sleeper() users

ath5k: fix section mismatch warning

Signed-off-by: John W. Linville <linville@tuxdriver.com>

mac80211: fix initialisation error path

The error handling in ieee80211_init() is broken when any of
the built-in rate control algorithms fail to initialise, fix
it and rename the error labels.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

mac80211 rate control: fix section mismatch

When the rate control algorithms are built-in, their exit
functions can be called from mac80211's init function so
they cannot be marked __exit.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Acked-by: Stefano Brivio <stefano.brivio@polimi.it>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

iwlwifi: fix merge sequence: exit on error before state change

The intention behind the original patch:
"iwlwifi: fix possible read attempt on ucode that is not available"
was to exit before any state is changed. Due to its submission directly
to 2.6.24 it was not clear how this relates to the latest iwlwifi work.

This patch does exactly the same as the previous patch, just earlier to
prevent any state from being changed if there is an error.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

iwlwifi: remove ieee80211 types from iwl-helpers.h

This patch removes IEEE80211_STYPE_BACK_REQ and
IEEE80211_STYPE_BACK defines from iwl-helpers.h. These are already
defined in include/linux/ieee80211.h

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

iwlwifi: Fix MIMO PS mode

This patch setups correctly MIMO PS mode flags

Signed-off-by: Guy Cohen <guy.cohen@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

iwlwifi: fix sparse warning in iwl 3945

This patch fixes a sparse warning over iwl3945_add_radiotap function by
turning it static

Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Acked-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

mac80211: make alignment warning optional

Driver authors should be aware of the alignment requirements, but
not everybody cares about the warning. This patch makes it depend
on a new Kconfig symbol MAC80211_DEBUG_PACKET_ALIGNMENT which can
be enabled regardless of MAC80211_DEBUG and is recommended for
driver authors (only). This also restricts the warning to data
packets so other packets need not be realigned to not trigger the
warning.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

[AUDIT] Add uid, gid fields to ANOM_PROMISCUOUS message

Changes the ANOM_PROMISCUOUS message to include uid and gid fields,
making it consistent with other AUDIT_ANOM_ messages and in the
format the userspace is expecting.

Signed-off-by: Klaus Heinrich Kiwi <klausk@br.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>

[AUDIT] ratelimit printk messages audit

some printk messages from the audit system can become excessive. This
patch ratelimits those messages. It was found that messages, such as
the audit backlog lost printk message could flood the logs to the point
that a machine could take an nmi watchdog hit or otherwise become
unresponsive.

Signed-off-by: Eric Paris <eparis@redhat.com>

[patch 2/2] audit: complement va_copy with va_end()

Complement va_copy() with va_end().

Signed-off-by: Richard Knutsson <ricknu-0@student.ltu.se>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

[patch 1/2] kernel/audit.c: warning fix

kernel/audit.c: In function 'audit_log_start':
kernel/audit.c:1133: warning: 'serial' may be used uninitialized in this function

Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

[AUDIT] create context if auditing was ever enabled

Disabling audit at runtime by auditctl doesn't mean that we can
stop allocating contexts for new processes; we don't want to miss them
when that sucker is reenabled.

(based on work from Al Viro in the RHEL kernel series)

Signed-off-by: Eric Paris <eparis@redhat.com>

[AUDIT] clean up audit_receive_msg()

generally clean up audit_receive_msg() don't free random memory if
selinux_sid_to_string fails for some reason. Move generic auditing
to a helper function

Signed-off-by: Eric Paris <eparis@redhat.com>

[AUDIT] make audit=0 really stop audit messages

Some audit messages (namely configuration changes) are still emitted even if
the audit subsystem has been explicitly disabled. This patch turns those
messages off as well.

Signed-off-by: Eric Paris <eparis@redhat.com>

[AUDIT] break large execve argument logging into smaller messages

execve arguments can be quite large. There is no limit on the number of
arguments and a 4G limit on the size of an argument.

this patch prints those aruguments in bite sized pieces. a userspace size
limitation of 8k was discovered so this keeps messages around 7.5k

single arguments larger than 7.5k in length are split into multiple records
and can be identified as aX[Y]=

Signed-off-by: Eric Paris <eparis@redhat.com>

[AUDIT] include audit type in audit message when using printk

Currently audit drops the audit type when an audit message goes through
printk instead of the audit deamon. This is a minor annoyance in
that the audit type is no longer part of the message and the information
the audit type conveys needs to be carried in, or derived from the
message data.

The attached patch includes the type number as part of the printk.
Admittedly it isn't the type name that the audit deamon provides but I
think this is better than dropping the type completely.

Signed-pff-by: John Johansen <jjohansen@suse.de>
Signed-off-by: Eric Paris <eparis@redhat.com>

[AUDIT] do not panic on exclude messages in audit_log_pid_context()

If we fail to get an ab in audit_log_pid_context this may be due to an exclude
rule rather than a memory allocation failure. If it was due to a memory
allocation failue we would have already paniced and no need to do it again.

Signed-off-by: Eric Paris <eparis@redhat.com>

[AUDIT] Add End of Event record

This patch adds an end of event record type. It will be sent by the kernel as
the last record when a multi-record event is triggered. This will aid realtime
analysis programs since they will now reliably know they have the last record
to complete an event. The audit daemon filters this and will not write it to
disk.

Signed-off-by: Steve Grubb <sgrubb redhat com>
Signed-off-by: Eric Paris <eparis@redhat.com>

[AUDIT] add session id to audit messages

In order to correlate audit records to an individual login add a session
id.  This is incremented every time a user logs in and is included in
almost all messages which currently output the auid.  The field is
labeled ses=  or oses=

Signed-off-by: Eric Paris <eparis@redhat.com>

[AUDIT] collect uid, loginuid, and comm in OBJ_PID records

Add uid, loginuid, and comm collection to OBJ_PID records. This just
gives users a little more information about the task that received a
signal. pid is rather meaningless after the fact, and even though comm
isn't great we can't collect exe reasonably on this code path for
performance reasons.

Signed-off-by: Eric Paris <eparis@redhat.com>

[AUDIT] return EINTR not ERESTART*

The syscall exit code will change ERESTART* kernel internal return codes
to EINTR if it does not restart the syscall. Since we collect the audit
info before that point we should fix those in the audit log as well.

Signed-off-by: Eric Paris <eparis@redhat.com>

[PATCH] get rid of loginuid races

Keeping loginuid in audit_context is racy and results in messier
code. Taken to task_struct, out of the way of ->audit_context
changes.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

[PATCH] switch audit_get_loginuid() to task_struct *

all callers pass something->audit_context

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

pci: allow multiple calls to pcim_enable_device()

There's no reason not to allow multiple calls to pcim_enable_device().
Calls after the first one can simply be noop. All PCI resources will
be released when the initial pcim_enable_device() resource is
released.

This allows more flexibility to managed PCI users.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

Blackfin pata-bf54x driver: fix compiling bug - no ata_port struct in struct ata_device any more

Cc: Sonic Zhang <sonic.zhang@analog.com>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

Blackfin pata-bf54x driver: should cover all possible interrupt sources

Signed-off-by: Sonic Zhang <sonic.zhang@analog.com>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

Blackfin pata-bf54x driver: Add debug information

Signed-off-by: Robin Getz <robin.getz@analog.com>
Signed-off-by: Michael Hennerich <michael.hennerich@analog.com>
Signed-off-by: Sonic Zhang <sonic.zhang@analog.com>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

Blackfin pata-bf54x driver: Remove obsolete PM function

Signed-off-by: Sonic Zhang <sonic.zhang@analog.com>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

pata_sl82c105: dual channel support

Use qc_defer to serialize the two channels

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

ata_piix.c: make piix_merge_scr() static

piix_merge_scr() can become static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

sata_nv: fix for completion handling

This patch is based on an original patch from Kuan Luo of NVIDIA,
posted under subject "fixed a bug of adma in rhel4u5 with HDS7250SASUN500G".
His description follows. I've reworked it a bit to avoid some unnecessary
repeated checks but it should be functionally identical.

"The patch is to solve the error message "ata1: CPB flags CMD err,
flags=0x11" when testing HDS7250SASUN500G in rhel4u5.
I tested this hd in 2.6.24-rc7 which needed to remove the mask in
blacklist to run the ncq and the same error also showed up.

I traced the  bug and found that the interrupt finished a command (for
example, tag=0) when the driver got that adma status is
NV_ADMA_STAT_DONE  and  cpb->resp_flags is NV_CPB_RESP_DONE.
However, For this hd, the drive maybe didn't clear bit 0 at this moment.
It meaned the hardware  had not completely finished the command.
If at the same time  the driver freed the command(tag 0) and sended
another command (tag 0), the error happened.

The notifier register is 32-bit register containing notifier value.
Value is bit vector containing one bit per tag number (0-31) in
corresponding bit positions (bit 0 is for tag 0, etc). When bit is set
then ADMA indicates that command with corresponding tag number completed
execution.

So i added the check notifier code. Sometimes i saw that the notifier
reg set some bits  , but the adma status set NV_ADMA_STAT_CMD_COMPLETE
,not NV_ADMA_STAT_DONE. So i added the NV_ADMA_STAT_CMD_COMPLETE check
code."

Signed-off-by: Robert Hancock <hancockr@shaw.ca>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

sata_mv: Remove PCI dependency

The integrated SATA controller is connected directly to the SoC's
internal bus, not via PCI interface. this patch removes the dependency
on the PCI interface.

Signed-off-by: Saeed Bishara <saeed@marvell.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

x86: avoid section mismatch involving arch_register_cpu

Avoid section mismatch involving arch_register_cpu.

Marking arch_register_cpu as __init and removing the export
for non-hotplug-cpu configurations makes the following warning
go away:

Section mismatch in reference from the function
arch_register_cpu() to the function .devinit.text:register_cpu()
The function arch_register_cpu() references
the function __devinit register_cpu().
This is often because arch_register_cpu lacks a __devinit
annotation or the annotation of register_cpu is wrong.

The only external user of arch_register_cpu in the tree is
in drivers/acpi/processor_core.c where it is guarded by
ACPI_HOTPLUG_CPU (which depends on HOTPLUG_CPU).

Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
CC: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86: fixes for lookup_address args

Signedness mismatches in level argument.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86: fix sparse warnings in cpu/common.c

The casts will always be needed, may as well make them the right
signedness.  The ebx variables can easily be unsigned, may as well.

arch/x86/kernel/cpu/common.c:261:21: warning: incorrect type in argument 2 (different signedness)
arch/x86/kernel/cpu/common.c:261:21:    expected unsigned int *eax
arch/x86/kernel/cpu/common.c:261:21:    got int *<noident>
arch/x86/kernel/cpu/common.c:262:9: warning: incorrect type in argument 3 (different signedness)
arch/x86/kernel/cpu/common.c:262:9:    expected unsigned int *ebx
arch/x86/kernel/cpu/common.c:262:9:    got int *<noident>
arch/x86/kernel/cpu/common.c:263:9: warning: incorrect type in argument 4 (different signedness)
arch/x86/kernel/cpu/common.c:263:9:    expected unsigned int *ecx
arch/x86/kernel/cpu/common.c:263:9:    got int *<noident>
arch/x86/kernel/cpu/common.c:264:9: warning: incorrect type in argument 5 (different signedness)
arch/x86/kernel/cpu/common.c:264:9:    expected unsigned int *edx
arch/x86/kernel/cpu/common.c:264:9:    got int *<noident>
arch/x86/kernel/cpu/common.c:293:30: warning: incorrect type in argument 3 (different signedness)
arch/x86/kernel/cpu/common.c:293:30:    expected unsigned int *ebx
arch/x86/kernel/cpu/common.c:293:30:    got int *<noident>
arch/x86/kernel/cpu/common.c:350:22: warning: incorrect type in argument 2 (different signedness)
arch/x86/kernel/cpu/common.c:350:22:    expected unsigned int *eax
arch/x86/kernel/cpu/common.c:350:22:    got int *<noident>
arch/x86/kernel/cpu/common.c:351:10: warning: incorrect type in argument 3 (different signedness)
arch/x86/kernel/cpu/common.c:351:10:    expected unsigned int *ebx
arch/x86/kernel/cpu/common.c:351:10:    got int *<noident>
arch/x86/kernel/cpu/common.c:352:10: warning: incorrect type in argument 4 (different signedness)
arch/x86/kernel/cpu/common.c:352:10:    expected unsigned int *ecx
arch/x86/kernel/cpu/common.c:352:10:    got int *<noident>
arch/x86/kernel/cpu/common.c:353:10: warning: incorrect type in argument 5 (different signedness)
arch/x86/kernel/cpu/common.c:353:10:    expected unsigned int *edx
arch/x86/kernel/cpu/common.c:353:10:    got int *<noident>
arch/x86/kernel/cpu/common.c:362:30: warning: incorrect type in argument 3 (different signedness)
arch/x86/kernel/cpu/common.c:362:30:    expected unsigned int *ebx
arch/x86/kernel/cpu/common.c:362:30:    got int *<noident>

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86: make early_console static in early_printk.c

Not necessary to expose it, also fixes sparse warning.

arch/x86/kernel/early_printk.c:196:16: warning: symbol 'early_console' was not declared. Should it be static?

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86: remove unneeded round_up

Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86: fix section mismatch warning in kernel/pci-calgary

Fix following warning:
WARNING: arch/x86/kernel/built-in.o(.text+0x1eb41): Section mismatch in reference from the function calgary_handle_quirks() to the function .init.text:calgary_set_split_completion_timeout()

calgary_handle_quirks() are only called at
__init time (in calgary_init_one() via handle_quirks ops).
So annotate this function and the sister function __init.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>