Here's an additional patch that contains the cleanups I did to John's
timer patches. It does the following:
- uses C99 initializers
- makes the timer list static
- adds better documentation to the timer function structure
- makes the timer init function return 0 on success
- NULL terminates the list of timers to make further patches
easier.
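A minimal sketch of the shape these cleanups give the timer list. Every name here (struct timer_ops_sketch, timers_sketch, select_timer_sketch and the two init functions) is hypothetical, and the real timer function structure has more members than shown:

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for the timer function structure. */
struct timer_ops_sketch {
	const char *name;
	int (*init)(void);	/* returns 0 on success, non-zero on failure */
};

static int tsc_init_sketch(void) { return -1; }	/* pretend no TSC is present */
static int pit_init_sketch(void) { return 0; }	/* pretend the PIT always works */

/* C99 designated initializers: members are named, so order does not matter. */
static struct timer_ops_sketch timer_tsc_sketch = {
	.name = "tsc",
	.init = tsc_init_sketch,
};

static struct timer_ops_sketch timer_pit_sketch = {
	.name = "pit",
	.init = pit_init_sketch,
};

/* Static, NULL-terminated list: a later patch adds a timer with one line. */
static struct timer_ops_sketch *timers_sketch[] = {
	&timer_tsc_sketch,
	&timer_pit_sketch,
	NULL,
};

/* Return the first timer whose init() reports success, or NULL if none does. */
struct timer_ops_sketch *select_timer_sketch(void)
{
	size_t i;

	for (i = 0; timers_sketch[i] != NULL; i++)
		if (timers_sketch[i]->init() == 0)
			return timers_sketch[i];
	return NULL;
}
```

Because the loop stops at the NULL entry, no separate element count has to be kept in sync when the list grows.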
Trond Myklebust [Thu, 10 Oct 2002 05:50:19 +0000 (22:50 -0700)]
[PATCH] Fix NFS locking over TCP
The 2.5.x RPC code is currently broken in that it demands that all
tasks that call xprt_create_proto() in order to open a TCP socket must
have CAP_NET_BIND_SERVICE capabilities, and must bind to a privileged
port.
This breaks the NLM locking code and its use of the call_bind() RPC
portmapper lookup feature.
This patch allows the built-in portmapper client to use unbound TCP
sockets if the user does not have the necessary capabilities.
Trond Myklebust [Thu, 10 Oct 2002 05:49:40 +0000 (22:49 -0700)]
[PATCH] A basic NFSv4 client for 2.5.x
This is a nontrivial change to the NFS client.
In this patch, we finish modifying the async READ path so that it is
version-agnostic. We define a new nfs_rpc_op ->read_setup(), and move
the v2- and v3-specific code in nfs_read_rpcsetup() there. We also
have to change nfs_readpage_result() so that the 'count' of bytes
read is a parameter. The extra parameter means that it can no longer
be ->tk_exit(). Instead, it is called from a version-specific ->tk_exit()
routine which is set in ->read_setup().
The upshot of all this is that the version-specific part of the
async READ path has been encapsulated in a new nfs_rpc_op
->read_setup(), and NFSv4 can share the logic for asynchronous
READs with NFSv2 and v3.
Trond Myklebust [Thu, 10 Oct 2002 05:49:07 +0000 (22:49 -0700)]
[PATCH] A basic NFSv4 client for 2.5.x
This is a nontrivial change to the NFS client.
Synchronous READ operations are currently done via the ->read() nfs_rpc_op.
Therefore, the synchronous READ path can easily be adapted for NFSv4. On
the other hand, the asynchronous READ path contains several NFSv3-specific
features, which make it difficult to adapt for NFSv4.
In this patch and the next, we modify the async READ path to be
version-agnostic. This patch just changes the 'struct nfs_read_data'
so that the v2- and v3-specific parts are moved into a private area,
with room for a v4-specific part in parallel. None of the logic is
changed.
Trond Myklebust [Thu, 10 Oct 2002 05:48:22 +0000 (22:48 -0700)]
[PATCH] A basic NFSv4 client for 2.5.x
This is a nontrivial change to the NFS client.
NFSv4 defines a new file attribute, change_attr. This is a per-file
opaque quantity returned by the server, whose value is required to
change whenever the file is modified. If it exists, we want to use
it for all cache consistency checks in nfs_refresh_inode(). Some
operations also return a "pre-operation" value of the change_attr;
we want to take this into account too.
First, define flags
NFS_ATTR_FATTR_V4 - indicates that the 'struct nfs_fattr' is an
NFSv4 fattr, so the change_attr field is valid
NFS_ATTR_PRE_CHANGE - indicates that the server returned a pre-operation
change_attr, so the pre_change_attr field is valid
Second, change nfs_refresh_inode() so that the caches are invalidated
if there is a change_attr mismatch. Exception: If the pre_change_attr
tells us that the mismatch was caused by our operation, then do not
invalidate the caches.
This patch should leave the logic in nfs_refresh_inode() unchanged
if neither of the new flags is set.
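The check described above could look roughly like the sketch below. The flag names come from the description, but their values, the trimmed-down struct fattr_sketch and the helper change_attr_invalidates() are all hypothetical. It also assumes that "caused by our operation" means the pre-operation change_attr equals the value the client had cached:

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag names are from the patch description; the values are made up. */
#define NFS_ATTR_FATTR_V4	0x1
#define NFS_ATTR_PRE_CHANGE	0x2

/* Hypothetical, trimmed-down stand-in for 'struct nfs_fattr'. */
struct fattr_sketch {
	unsigned int valid;
	uint64_t change_attr;
	uint64_t pre_change_attr;
};

/* 'cached' is the change_attr the client remembered from an earlier reply. */
bool change_attr_invalidates(uint64_t cached, const struct fattr_sketch *f)
{
	if (!(f->valid & NFS_ATTR_FATTR_V4))
		return false;	/* no change_attr: the older checks apply instead */
	if (f->change_attr == cached)
		return false;	/* nothing changed on the server */
	if ((f->valid & NFS_ATTR_PRE_CHANGE) && f->pre_change_attr == cached)
		return false;	/* assumed: the mismatch is our own operation */
	return true;		/* someone else modified the file */
}
```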
Trond Myklebust [Thu, 10 Oct 2002 05:48:08 +0000 (22:48 -0700)]
[PATCH] A basic NFSv4 client for 2.5.x
In NFSv4, an fsid is a 64-bit major number together with a 64-bit
minor number. In previous versions, an fsid is a single number.
This patch changes 'struct nfs_fattr' accordingly.
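One way to hold both forms of fsid side by side is sketched here. The layout and all names (struct fsid_sketch, is_v4, major_nr, minor_nr, fsid_equal) are assumptions for illustration, not the fields the patch actually adds:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout; the real 'struct nfs_fattr' change may differ. */
struct fsid_sketch {
	bool is_v4;
	union {
		uint64_t nfs3;			/* v2/v3: a single number */
		struct {
			uint64_t major_nr;	/* v4: 64-bit major */
			uint64_t minor_nr;	/* v4: 64-bit minor */
		} nfs4;
	} u;
};

/* Two fsids match only if they are the same kind and every part is equal. */
bool fsid_equal(const struct fsid_sketch *a, const struct fsid_sketch *b)
{
	if (a->is_v4 != b->is_v4)
		return false;
	if (a->is_v4)
		return a->u.nfs4.major_nr == b->u.nfs4.major_nr &&
		       a->u.nfs4.minor_nr == b->u.nfs4.minor_nr;
	return a->u.nfs3 == b->u.nfs3;
}
```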
Trond Myklebust [Thu, 10 Oct 2002 05:47:02 +0000 (22:47 -0700)]
[PATCH] A basic NFSv4 client for 2.5.x
This patch changes the interface of the ->setattr() nfs_rpc_op
so that its first argument is a dentry instead of an inode.
[Explanation: The dentry is required because in NFSv4, we may
need to OPEN the file before doing the SETATTR. (This is
required if the file size is changed as part of the setattr.)
Opening the file requires making use of the containing
directory's inode.]
Trond Myklebust [Thu, 10 Oct 2002 05:46:48 +0000 (22:46 -0700)]
[PATCH] A basic NFSv4 client for 2.5.x
This patch changes the interface of the ->readdir() nfs_rpc_op
so that its first argument is a dentry instead of an inode.
[Explanation: The dentry is required because in NFSv4, we need
to make use of the _parent_ directory's inode. This is because
NFSv4 servers no longer return an entry for ".." in the READDIR
response, so the client kernel needs to fake this entry, inode
number and all.]
Trond Myklebust [Thu, 10 Oct 2002 05:45:49 +0000 (22:45 -0700)]
[PATCH] A basic NFSv4 client for 2.5.x
Instantiate a new file, include/linux/nfs4.h, which contains
constants and typedef's for the NFSv4 protocol (by analogy with
include/linux/nfs2.h and include/linux/nfs3.h).
Also #include this file in a few places where it will be needed
later.
Paul Mackerras [Thu, 10 Oct 2002 05:39:20 +0000 (22:39 -0700)]
[PATCH] adjust PPC sysctls
This patch takes out the unused KERN_PPC_ZEROPAGED sysctl, and
restricts the KERN_PPC_POWERSAVE_NAP and KERN_PPC_L2CR sysctls to be
present only on those PPC processors where they are useful. This
patch only affects PPC.
Paul Mackerras [Thu, 10 Oct 2002 05:38:51 +0000 (22:38 -0700)]
[PATCH] add PCI device ID for Motorola MPC107
This patch adds the PCI device ID for the Motorola MPC107 host bridge.
The entry is already in the list at pciids.sf.net but isn't in the
kernel pci_ids.h file yet. Please apply this to your tree.
Olaf Dietsche [Thu, 10 Oct 2002 05:31:57 +0000 (22:31 -0700)]
[PATCH] 2.5.40: fix chmod/chown on procfs
This patch allows changing the uid, gid and mode of files and directories
located in procfs.
Without this patch, a change to uid, gid or mode lasts only as long as the
file is open. As soon as you close the file, it reverts to its
default, which is usually root:root and read-only.
Jens Axboe [Thu, 10 Oct 2002 05:30:17 +0000 (22:30 -0700)]
[PATCH] excessive stack usage in cdrom
The CD-ROM code puts struct cdrom_changer_info on the stack in a few
places. This is a bad idea since it's big (a bit over 1kb). This patch
allocates it dynamically instead.
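The pattern is the usual one of swapping a large local variable for an allocation that is freed before returning. The sketch below is a userspace illustration using malloc(), where kernel code would use kmalloc() and kfree(). The struct and function names are hypothetical stand-ins, not the cdrom driver's own:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a structure of a bit over 1 kB. */
struct changer_info_sketch {
	int curslot;
	char slots[1200];
};

/* Returns the current slot, or -1 if the allocation fails. */
int read_current_slot_sketch(void)
{
	struct changer_info_sketch *info;
	int slot;

	info = malloc(sizeof(*info));	/* heap, not the (small) stack */
	if (info == NULL)
		return -1;		/* kernel code would return -ENOMEM */

	memset(info, 0, sizeof(*info));
	info->curslot = 3;		/* pretend the drive filled this in */
	slot = info->curslot;

	free(info);			/* every exit path must release it */
	return slot;
}
```

The cost is one extra failure path per call site, which is why the allocation result is checked before use.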
John Stultz [Thu, 10 Oct 2002 04:39:10 +0000 (21:39 -0700)]
[PATCH] linux-2.5.41_cyclone-timer_B2
To demonstrate how new time-sources are added to my
timer-changes patch, here is my current version of my cyclone-timer
patch for 2.5.41. This uses the infrastructure set up in the
timer-changes_A4 patch set to add the cyclone counter (found on IBM
Summit Based hardware) as a time-source.
The current code is not enabled as it also depends on James
Cleverdon's 2.5 summit patch, however it illustrates how cleanly new
time-sources can be added.
This is part 2 of 3 of my timer-change patch. Part 2 is just a
bulk move of code out of time.c and into timer_pit.c and timer_tsc.c. No
code is changed, only moved.
Please note, this code will not compile without the final third
part of this patch collection. This was done for readability alone.
The i386 time.c code is turning into a mess. We've got multiple
functions that do the same thing, only with different hardware, all
surrounded by #ifdefs and even more difficult-to-follow #ifndefs. George
Anzinger is introducing a new ACPIpm time source, I'm going to attempt
to add the cyclone counter as a time source, and in the future there
will be HPET to deal with. These will not go in cleanly together as
things are now.
Inspired by suggestions from Alan, this collection of patches
tries to clean up time.c by breaking out the PIT and TSC specific parts
into their own files. Additionally the patch creates an abstract
interface to use these existing time sources, and makes it easier
to add future time sources.
It introduces "struct timer_ops" which gives the time code a
clear interface to use these different time sources. It also allows for
clearer conditional compilation of these various time sources.
This first patch (part 1 of 3) provides the infrastructure
needed via the timer_ops structure, as well as the select_timer()
function for choosing the best available timer.
Andrew Morton [Thu, 10 Oct 2002 04:05:33 +0000 (21:05 -0700)]
[PATCH] remove the sched_yield from the ext3 fsync path
The changed sched_yield() semantics have made ext3's transaction
batching terribly slow.
Apparently a schedule() fixes that, although it probably breaks
transaction batching.
This patch largely fixes my complaints about the new scheduler being
extremely sluggish to interactive applications. Evidently those
applications were calling fsync() and were spending extremely long
periods in sched_yield().
Andrew Morton [Thu, 10 Oct 2002 04:05:19 +0000 (21:05 -0700)]
[PATCH] remove radix_tree_reserve()
From Hugh Dickins.
radix_tree_reserve() exists solely for the tmpfs move_to_swap_cache()
and move_from_swap_cache() functions, and yet they don't need it: there
is no problem in the one page being simultaneously listed in two radix
trees (while both locks are held). Use radix_tree_insert(), and remove
radix_tree_reserve(); also removed a few blank lines.
Andrew Morton [Thu, 10 Oct 2002 04:04:43 +0000 (21:04 -0700)]
[PATCH] fix the raw driver
Fix the raw driver by tricking it into performing O_DIRECT IO against
the bound blockdev.
- rewrite the i_mapping for /dev/raw/raw0 to point at the same thing
as bdev->bd_inode->i_mapping. We've performed a bdget() against the
blockdev, which should pin it for the correct lifetime.
- set the O_DIRECT bit on the caller's file->flags.
Andrew Morton [Thu, 10 Oct 2002 04:04:24 +0000 (21:04 -0700)]
[PATCH] move_one_page atomicity fix
The atomicity fix for move_one_page() was not quite right.
We only do the page_table_present() test if CONFIG_HIGHPTE=y. Which is
fine, but even with CONFIG_HIGHPTE=n, the pte mapping functions still
do an inc_preempt_count() due to their unconditional kmap_atomic(). So
we get a might_sleep() warning.
The warning is actually bogus, because those pte's are always in
direct-mapped memory.
So hm. Three fixes suggest themselves:
1: Run the page_table_present() test if CONFIG_HIGHMEM.
Rejected: penalises non-pte_highmem setups
2: Make kmap_atomic() not do inc_preempt_count() if the page was
direct mapped.
Rejected: I don't think we want kmap_atomic side effects to be
varying according to the page which was passed.
3: Change the pte mapping functions so they don't run kmap_atomic at
all if CONFIG_HIGHPTE=n
This is what I did. And guess what? For CONFIG_HIGHMEM=y,
CONFIG_HIGHPTE=n this patch shrinks the kernel by 5 kbytes. Because
kmap_atomic is inlined.
Andrew Morton [Thu, 10 Oct 2002 04:03:56 +0000 (21:03 -0700)]
[PATCH] mremap use-after-free bugfix
I have invented a new software development methodology! You send an
email to Hugh saying "I don't have the foggiest idea why this guy's
kernel is oopsing" and next morning, you get a patch! I shall patent
this.
Since 2.5.3, move_vma() has been passing a freed vma into
move_page_tables(). Fix it to move back to the previous vma in the
list if we're about to delete this one.
Thanks to Morten Helgesen for patient reporting, diagnosis and testing.
David Jeffery [Wed, 9 Oct 2002 08:04:30 +0000 (01:04 -0700)]
[PATCH] ips driver 6/6
2 bug fixes for scsi pass through
When talking directly to scsi devices, the driver would
sometimes get two things wrong. We could set too short
of a timeout. Or, we could confuse the adapter by having
non-zero values in certain fields which we shouldn't have
been using. This patch corrects these problems.
David Jeffery [Wed, 9 Oct 2002 08:04:01 +0000 (01:04 -0700)]
[PATCH] ips driver 5/6
2 minor bug fixes.
The first section makes sure we limit the size of the
sense_buffer copy to the target buffer's size so that
we don't overflow the sense_buffer.
The other sections remove some pointer arithmetic that
is wrong on 64-bit machines due to padding. Instead, just
call the pci_map functions on the buffer.
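The first fix amounts to clamping the copy length to the destination size. A minimal sketch, with a hypothetical helper name and signature that are not taken from the ips driver:

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy at most dst_size bytes of sense data into dst.
 * Returns the number of bytes actually copied.
 */
size_t copy_sense_sketch(void *dst, size_t dst_size,
			 const void *src, size_t src_len)
{
	size_t n = src_len < dst_size ? src_len : dst_size;

	memcpy(dst, src, n);	/* never writes past the end of dst */
	return n;
}
```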
David Jeffery [Wed, 9 Oct 2002 08:03:42 +0000 (01:03 -0700)]
[PATCH] ips driver 4/6
This is by far the biggest patch. It is a rewrite of the
driver's horrid locking. In addition to the host_lock,
the driver used to have 4 other locks per adapter!
It had a redundant ha_lock and a lock for each of 3
queues. In a few places it also played with atomic bit
setting. And almost all of it was useless as the
host_lock was already held.
This patch cleans up this locking nightmare. The driver
now uses the host_lock exclusively. Only a few places
needed to add calls to lock the host_lock. Most of
this patch is deletion of useless extra locking.
David Jeffery [Wed, 9 Oct 2002 08:03:00 +0000 (01:03 -0700)]
[PATCH] ips driver 2/6
This patch is some simple code consolidation.
A new function ips_abort_init() is created
and consolidates some repeated code that is
used if there is an error during initialization
of the adapter.
David Jeffery [Wed, 9 Oct 2002 08:02:33 +0000 (01:02 -0700)]
[PATCH] ips driver 1/6
This removes several unused header includes and allows
the driver to compile by no longer trying to include
<linux/tqueue.h>. You may have already gotten a patch
to remove tqueue.h from someone else.
This patch also corrects the spelling of my last name
in the MAINTAINERS file. You'd think I'd be used to
seeing it spelled wrong by now.
Robert Love [Wed, 9 Oct 2002 06:06:28 +0000 (23:06 -0700)]
[PATCH] fix preempt_count overflow with brlocks
Now that brlocks loop over NR_CPUS, on SMP every br_lock/br_unlock
results in the acquire/release of 32 locks. This incs/decs the
preempt_count by 32.
Since we only have 7 bits now for actually storing the lock depth, we
can nest only 3 locks deep. I doubt we ever acquire three brlocks
concurrently, but it is still a concern.
Attached patch disables/enables preemption explicitly once and only
once for each lock/unlock. This is also an optimization as it
removes 31 incs, decs, and conditionals. :)
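The nesting limit follows from simple arithmetic, worked through below. The constants are the figures quoted in the description (32 locks per brlock, 7 bits of depth); the macro and function names are hypothetical:

```c
/* Figures from the description above; the names are made up. */
#define LOCKS_PER_BRLOCK	32	/* one lock per CPU slot */
#define DEPTH_BITS		7
#define MAX_DEPTH		((1 << DEPTH_BITS) - 1)	/* 127 */

/* Before: every br_lock adds 32 to the count, so 127 / 32 = 3 levels fit. */
int max_brlock_nesting_before(void)
{
	return MAX_DEPTH / LOCKS_PER_BRLOCK;
}

/* After: every br_lock adds exactly 1, so the full depth is available. */
int max_brlock_nesting_after(void)
{
	return MAX_DEPTH / 1;
}
```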
Alan Cox [Wed, 9 Oct 2002 05:58:38 +0000 (22:58 -0700)]
[PATCH] 3c501 for 2.5
Not much here, just some tidying/checking. This driver can't alas use NAPI
in 2.5. Note however it has no panics or BUG()s so appears to meet the
carrier grade guidelines ;)
- Clarified authors so I get the mail not Donald
- Added missing MODULE_ bits
- Moved junk into 3c501.h
Patrick Mochel [Wed, 9 Oct 2002 03:52:46 +0000 (20:52 -0700)]
IDE: register ide driver for all ide drives; not just for disk drives.
This adds
struct device_driver gen_driver;
to ide_driver_t, which is filled in with necessary fields when an ide
driver calls ide_register_driver(). That then registers the driver with
the driver model core.
As a result, this gives us the following output in driverfs:
# tree -d /sys/bus/ide/drivers/
/sys/bus/ide/drivers/
|-- ide-cdrom
`-- ide-disk
The suspend/resume callbacks in ide-disk.c have been temporarily
disabled until the ide core implements generic methods which forward
the calls to the drive drivers.
Fix 3270 console reboot loop. Recognize 3270 control unit type 3174.
Fix tubfs kmallocs. Dynamically get 3270 input buffer. Get bootup colors
right on 3270 console
Vojtech Pavlik [Wed, 9 Oct 2002 14:49:06 +0000 (16:49 +0200)]
Don't try to enable extra keys on IBM/Chicony keyboards as this upsets
several notebook keyboards. Until we find a better way to detect
whom we are talking to, we rely on the kernel command line. Use
atkbd_set=4 to gain access to the extra keys.
Switch over ATM code to initcalls and reorder the makefile so
that link order inside atm is the same. I've also cleaned up
the makefile a bit while at it.
I didn't fix the existing compilation problems in the drivers (cli &
friends) and the broken le/be firmware selection for the fore200e cards
(kbuild breakage) though.
The bios geometry is almost useless, except for fdisk to try to write
an MSDOS partition table that is vaguely compatible with one written by
other operating systems.
If the size of the disc would overflow a ten-bit cylinder number, then all
bets are off anyway. So fake it by casting the true disc capacity to a
smaller type (than u64), so that we avoid 64-bit division on 32-bit
platforms. If the disc is small enough that the number of cylinders is
correct, then this has no effect; otherwise, the number-of-cylinders we
report is bogus, but you can't use an MSDOS-format partition table on
such a drive anyway --- use the EFI GPT or the LDM partitioning, which
use 64-bit offsets internally.
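A sketch of the truncating cast, assuming a 32-bit type as the "smaller type" and a hypothetical helper name. The caller is assumed to pass non-zero heads and sectors:

```c
#include <stdint.h>

/*
 * Fake a BIOS cylinder count without a 64-bit division.
 * heads and sectors must be non-zero.
 */
unsigned int bios_cylinders_sketch(uint64_t capacity,
				   unsigned int heads, unsigned int sectors)
{
	/* Deliberate truncation: huge discs get a bogus (but harmless) value. */
	uint32_t cap32 = (uint32_t)capacity;

	return cap32 / (heads * sectors);	/* plain 32-bit division */
}
```

For a disc of 1028160 sectors with 255 heads and 63 sectors per track, the result is the true cylinder count of 64. Once the capacity exceeds 32 bits the high bits are discarded and the answer is no longer meaningful, as the description says.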
Andrew Morton [Wed, 9 Oct 2002 01:37:37 +0000 (18:37 -0700)]
[PATCH] 64-bit sector_t - filesystems
From Peter Chubb
Filesystem migration to possibly 64-bit sector_t:
- bmap() now takes and returns a sector_t to allow filesystems
(e.g., JFS, XFS) that are 64-bit clean to deal with large files
- buffer handling now 64-bit clean
Enable 64-bit sector_t on IA32 and PPC.
kiobufs takes sector_t array, not array of long.
Fix blkmtd.c to deal in such an array.
Miscellaneous fixes for 64-bit sector_t.
- missed printk formats
- ide_floppy_do_request had incorrect signature
- in blkmtd.c there was a pointer used to
manipulate an array to be used by kiobuf --
it was unsigned long, needed to be sector_t
Andrew Morton [Wed, 9 Oct 2002 01:37:30 +0000 (18:37 -0700)]
[PATCH] 64-bit sector_t - driver changes
From Peter Chubb
Compaq Smart array sector_t cleanup: prepare for possible 64-bit sector_t
Clean up loop device to allow huge backing files.
MD transition to 64-bit sector_t.
- Hold sizes and offsets as sector_t not int;
- use 64-bit arithmetic if necessary to map block-in-raid to zone
and block-in-zone
Andrew Morton [Wed, 9 Oct 2002 01:37:24 +0000 (18:37 -0700)]
[PATCH] 64-bit sector_t - printk changes and sector_t cleanup
From Peter Chubb
printk changes: A sector_t can be either 64 or 32 bits, so cast it to a
printable type that is at least as large as 64-bits on all platforms
(i.e., cast to unsigned long long and use a %llu format)
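The cast can be illustrated in userspace with snprintf() standing in for printk(). The typedef sector_t_sketch is a hypothetical stand-in for a sector_t that happens to be 32 bits wide; the same call is correct unchanged if it is widened to 64 bits:

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in: sector_t may be 32 or 64 bits depending on config. */
typedef uint32_t sector_t_sketch;

/* Format a sector number portably; returns what snprintf() returns. */
int format_sector_sketch(char *buf, size_t len, sector_t_sketch sector)
{
	/* The cast makes the argument match %llu whatever sector_t's width. */
	return snprintf(buf, len, "sector %llu", (unsigned long long)sector);
}
```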
Transition to 64-bit sector_t: fix isofs_get_blocks by converting the
(possibly 64-bit) arg to a long.
SCSI 64-bit sector_t cleanup: capacity now stored as sector_t; make
sure that the READ_CAPACITY command doesn't sign-extend its returned
value; avoid 64-bit division when printing size in MB.
Still to do:
- 16-byte SCSI commands
- Individual scsi drivers.
Andrew Morton [Wed, 9 Oct 2002 01:37:17 +0000 (18:37 -0700)]
[PATCH] 64-bit sector_t - various driver changes
Peter's code works for me, and the 40-odd people who download
the -mm patches. Anton has tested it on ppc64 and I presume that
Peter has tested it on ia64. I use gcc-2.91.66 and others use
later compilers. I expect that any remaining problems will
mainly be caught by the compiler. And compiler bugs can be
detected by turning off the option in config and seeing if things
get better.
From Peter Chubb
- do_request() function takes sector_t not unsigned long as the
block number to operate on.
- Various casts to long where the underlying device can never get
big enough to warrant a 64-bit sector offset.
- Cast sector_t to unsigned long long when printing.
Vojtech Pavlik [Tue, 8 Oct 2002 19:36:32 +0000 (21:36 +0200)]
Make i8042.c even less picky about detecting an AUX port because of
broken chipsets that don't support the LOOP command or report failure
on the TEST command. Hopefully this won't screw any old 386/486
systems without the AUX port.