Andrew Morton [Sun, 16 Mar 2003 15:22:39 +0000 (07:22 -0800)]
[PATCH] Fix memleak in e100 driver
Patch from Oleg Drokin <green@linuxhacker.ru>
There is a memory leak in the Intel e100 driver, in both 2.4 and 2.5:
e100_ethtool_gstrings does not free the "strings" variable if it cannot
copy it to userspace.
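The shape of this kind of fix can be sketched in userspace C. This is a minimal illustration, not the e100 code: fake_copy_out() is a hypothetical stand-in for copy_to_user(), and the live_allocs counter exists only so the leak can be observed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static int live_allocs;     /* buffers allocated and not yet freed */

static void *tracked_alloc(size_t n)
{
    void *p = malloc(n);
    if (p)
        live_allocs++;
    return p;
}

static void tracked_free(void *p)
{
    if (p) {
        live_allocs--;
        free(p);
    }
}

/* Hypothetical stand-in for copy_to_user(): 0 on success, nonzero on failure. */
static int fake_copy_out(void *dst, const void *src, size_t n, int fail)
{
    if (fail)
        return 1;
    memcpy(dst, src, n);
    return 0;
}

static int get_strings(char *user_buf, size_t len, int fail_copy)
{
    int err = 0;
    char *strings = tracked_alloc(len);

    if (!strings)
        return -1;
    memset(strings, 'x', len);

    /* On a failed copy, record the error and fall through to the free
     * below instead of returning immediately with the buffer still held. */
    if (fake_copy_out(user_buf, strings, len, fail_copy))
        err = -2;

    tracked_free(strings);
    return err;
}
```

Both the success and the failure path leave live_allocs at zero, which is the property the fix restores.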
Russell King [Sun, 16 Mar 2003 23:28:51 +0000 (23:28 +0000)]
[PCI] pci-15: Fix setup-bus.c resource sizing.
Patch from Ivan Kokshaysky
This fixes a long-standing typo ('size' instead of 'r_size') which causes
an overestimate of the bridge memory ranges calculated in pbus_size_mem().
For example, if we have a device with one 1Mb and one 2Mb memory range
behind the bridge, calculated size and alignment of the bridge memory
window will be 4Mb and 2Mb respectively, while the correct values are
3Mb and 1Mb.
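The arithmetic in the example can be checked with a small userspace sketch. It is only loosely modelled on the idea of summing per-resource sizes by alignment order; it is not the kernel's pbus_size_mem(), and struct window, size_mem_window() and MAX_ORDER are names made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MB ((uint64_t)1 << 20)
#define MAX_ORDER 12            /* sketch limit: ranges up to 2 GB */

struct window {
    uint64_t size;              /* total size of the bridge window */
    uint64_t align;             /* alignment the window start needs */
};

static uint64_t align_up(uint64_t v, uint64_t a)
{
    return (v + a - 1) & ~(a - 1);
}

/* sizes[]: power-of-two ranges of at least 1 MB, each aligned to its own size. */
static struct window size_mem_window(const uint64_t *sizes, size_t n)
{
    uint64_t per_order[MAX_ORDER] = {0};
    uint64_t total = 0, placed = 0, min_align = 0;
    int max_order = -1;
    struct window w;

    for (size_t i = 0; i < n; i++) {
        uint64_t r_size = sizes[i];     /* per-resource size, not the running total */
        int order = 0;

        while (order < MAX_ORDER - 1 && (MB << order) < r_size)
            order++;
        total += r_size;
        per_order[order] += r_size;
        if (order > max_order)
            max_order = order;
    }

    /* Walk from small to large ranges and find the smallest start
     * alignment that still lets every range sit on its own boundary. */
    for (int order = 0; order <= max_order; order++) {
        uint64_t align1 = MB << order;

        if (placed == 0)
            min_align = align1;
        else if (align_up(placed + min_align, min_align) < align1)
            min_align = align1 >> 1;
        placed += per_order[order];
    }

    w.size = min_align ? align_up(total, min_align) : 0;
    w.align = min_align;
    return w;
}
```

For one 1Mb and one 2Mb range this yields a 3Mb window aligned to 1Mb, matching the corrected values quoted above.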
Russell King [Sun, 16 Mar 2003 23:21:31 +0000 (23:21 +0000)]
[PCI] pci-13: unuse pci_do_scan_bus()
In an attempt to "unuse" pci_do_scan_bus() so it can be eventually
killed, make pci_scan_bus_parented() call the new pci_scan_child_bus()
and pci_bus_add_devices(). The only remaining callers are the
hotplug drivers.
Eventually, pci_bus_add_devices() will be removed from this function -
it is intended that architectures should call this after they have
done any setups and fixups to the scanned bus.
It is legal to call pci_bus_add_devices() on a bus which has already
had this function called, so architectures could update today.
Kill pcibios_update_resource(), replacing it with pci_update_resource().
pci_update_resource() uses pcibios_resource_to_bus() to convert a
resource to a device BAR - the transformation should be exactly the
same as the transformation used for the PCI bridges.
pci_update_resource "knows" about 64-bit BARs, but doesn't attempt to
set the high 32-bits to anything non-zero - currently no architecture
attempts to do something different. If anyone cares, please fix; I'm
going to reflect current behaviour for the time being.
Ivan pointed out the following architectures need to examine their
pcibios_update_resource() implementation - they should make sure that
this new implementation does the right thing. #warning's have been
added where appropriate.
ia64
mips
mips64
This cset also includes a fix for the problem reported by AKPM where
64-bit arch compilers complain about the resource mask being placed
in a u32.
Russell King [Sun, 16 Mar 2003 21:56:53 +0000 (21:56 +0000)]
[PCI] pci-8: pci_resource_to_bus()
Convert pcibios_fixup_pbus_ranges() into something more generic, namely
pcibios_resource_to_bus() - we are really trying to convert resources
to something to program into bus registers for bridge windows, and in
fact, PCI device BARs.
This is necessary since some architectures, namely Alpha, ARM and PARISC
have an offset between PCI addressing and host-based addressing, so
resources need to be adjusted when read or when written back to the bus.
We provide a generic version in asm-generic/pci.h, which most
architectures use.
This patch finds the following architectures with something to
consider:
- ppc, ppc64
adjusts resources for devices, but not buses.
This is inconsistent, and leads to improperly
programmed windows/BARs.
PPC people (Anton) have a replacement PCI resource implementation
which should do the right thing.
Russell King [Sun, 16 Mar 2003 21:42:43 +0000 (21:42 +0000)]
[PCI] pci-7: Remove second argument to pcibios_update_resource()
Patch from Ivan Kokshaysky
remove the "parent" or "root" second argument to
pcibios_update_resource(). This highlights the following
architectures doing something wrong in their implementation:
Russell King [Sun, 16 Mar 2003 21:33:30 +0000 (21:33 +0000)]
[PCI] pci-6 - Fix scanning of non-zero functions
Fix breakage in pci-3 - we scanned all functions if function 0 was not
present. This causes some host bridges to lock up when scanning devfn
255 on PPC machines.
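The intended scanning rule can be sketched as follows. This is an illustrative userspace model, not the kernel's scan code: struct fake_bus, fake_probe() and scan_slot() are hypothetical, and the multifunction flag is passed in rather than read from a header register.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical probe callback: nonzero if the given devfn responds. */
typedef int (*probe_fn)(unsigned devfn, void *ctx);

/* Fake bus for the sketch: one presence bit per devfn, plus a probe counter. */
struct fake_bus {
    uint8_t present[32];        /* present[slot] bit fn */
    unsigned probes;
};

static int fake_probe(unsigned devfn, void *ctx)
{
    struct fake_bus *b = ctx;

    b->probes++;
    return (b->present[devfn >> 3] >> (devfn & 7)) & 1;
}

/* Returns the number of functions found in one slot. */
static int scan_slot(unsigned slot, int multifunction, probe_fn probe, void *ctx)
{
    unsigned base = slot << 3;
    int found;

    if (!probe(base, ctx))
        return 0;               /* function 0 absent: leave functions 1..7 alone */
    found = 1;
    if (!multifunction)
        return found;
    for (unsigned fn = 1; fn < 8; fn++)
        if (probe(base + fn, ctx))
            found++;
    return found;
}
```

With function 0 missing in slot 31, only devfn 248 is probed and devfn 255 is never touched.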
Ingo Molnar [Sun, 16 Mar 2003 13:35:21 +0000 (05:35 -0800)]
[PATCH] sched-2.5.64-bk10-D0
This removes/fixes a few whitespaces and removes the MAX_PRIO setting in
the init task path which is unnecessary and which might even lead to
bugs - MAX_PRIO is outside the valid range and technically the init
thread is not an idle thread yet at this point.
Ingo Molnar [Sun, 16 Mar 2003 13:35:14 +0000 (05:35 -0800)]
[PATCH] sched-2.5.64-bk10-C4
This fixes a fundamental (and long-standing) bug in the sleep-average
estimator which is the root cause of the "contest process_load" problems
reported by Mike Galbraith and Andrew Morton, and which problem is
addressed by Mike's patch.
The bug is the following: the sleep_time code in activate_task()
over-estimates the true sleep time by 0.5 jiffies on average (0.5 msecs
on recent 2.5 kernels). Furthermore, for highly context-switch
intensive and CPU-intensive workloads it means a constant 1 jiffy
over-estimation. This turns the balance of giving and removing ticks
and nils the effect of the CPU busy-tick, catapulting the task(s) to
highly interactive status - while in reality they are constantly burning
CPU time.
The fix is to round down sleep_time, not to round it up. This slightly
under-estimates the sleep time, but this is not a real problem, any task
with a sleep time in the 1 jiffy range will see timekeeping granularity
artifacts from various parts of the kernel anyway. We could use rdtsc
to estimate the sleep time, but i think that's unnecessary overhead.
The fixups in Mike's scheduler patch (which is in -mm8) basically work
around this bug. The patch below definitely fixes the contest-load
starvation bug, but it remains to be seen what other effects it has on
interactivity. In any case, this bug in the estimator is real and if
there's any other interactivity problem around then we need to deal with
it on top of this patch.
This bug has been in the O(1) scheduler from day 1 on basically, so i'm
quite hopeful that a number of interactivity complaints are fixed by
this patch.
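The effect of the rounding direction can be illustrated with a small sketch. It assumes a 1 ms tick and a sleep duration known in microseconds; the scheduler itself works from jiffies timestamps, so this shows the arithmetic only, and both helper names are made up.

```c
#include <assert.h>

#define USEC_PER_TICK 1000UL    /* assumption: 1 ms tick */

/* Rounding up credits a full tick for any sleep, however short. */
static unsigned long sleep_ticks_round_up(unsigned long slept_usec)
{
    return (slept_usec + USEC_PER_TICK - 1) / USEC_PER_TICK;
}

/* Rounding down credits only whole ticks actually slept. */
static unsigned long sleep_ticks_round_down(unsigned long slept_usec)
{
    return slept_usec / USEC_PER_TICK;
}
```

A task that sleeps 100 usec between bursts of CPU use earns one tick of sleep credit per switch when rounding up and none when rounding down, which is why the rounded-up estimate can make a CPU hog look interactive.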
Douglas Gilbert [Sun, 16 Mar 2003 03:05:30 +0000 (21:05 -0600)]
scsi_debug version 1.68 mark III
Changelog since version 1.68 mark II:
- merge Mike Anderson's probe() cleanup
- num_devs is now "per host"
- num_devs is sysfs writeable
- add slave_alloc skeleton code
So to simulate 154 disks (for example) one might use:
# modprobe scsi_debug add_host=11 num_devs=14
With max_luns at its default value of 2, 14 is the
maximum number of devices per host scsi_debug will
respond to (i.e. 7 targets, each with 2 lus).
make's line continuation without explicit backslashes is a mystery
to me, and in this case, vmlinux got linked, but the linker command
was not written to the screen. Works again now.
Randy Dunlap [Sat, 15 Mar 2003 09:19:32 +0000 (01:19 -0800)]
[PATCH] update filesystems config. menu
This is Robert PJ Day's patch that updates the filesystems
config menu. It had become a bit ad hoc (jumbled:) and this
patch attempts to arrange it more logically.
Russell King [Sat, 15 Mar 2003 10:37:42 +0000 (10:37 +0000)]
[TTY] Register tty devclass before use.
Register the tty devclass with sysfs before tty drivers initialise -
sysfs requires structures to be registered before use. This is
required for the previous serial csets, as well as any drivers which
are initialising using __initcall() or module_init().
Douglas Gilbert [Fri, 14 Mar 2003 12:56:34 +0000 (06:56 -0600)]
[PATCH] sg version 3.5.28 for lk 2.5.64
Changelog:
- remove hosts, host_strs and host_hdr from sg's
procfs interface **
- add sysfs interface for allow_dio, def_reserved_size
and version ***
- switch boot time and module parameters to Rusty's
moduleparam.h interface. This means, for example,
the boot time "sg_def_reserved_size" parameter
changes to "sg.def_reserved_size".
** Christoph moved the host listing functionality into
a more central sysfs position (i.e. not dependent on
sg). However scsi_debug is the only LLD that I can
get to post any "host" info under the new arrangement.
Should devices, device_strs and device_hdrs also be
moved out of sg's procfs interface?
*** I find sg's "debug" in its procfs interface very
useful for debugging (sg itself amongst other things).
However it does not seem suitable for sysfs. Should
it move?
Randy Dunlap [Fri, 14 Mar 2003 12:54:22 +0000 (06:54 -0600)]
[PATCH] reduce stack in qlogicfc.c
This is a start on reducing the stack usage in qlogicfc.c::
isp2x00_make_portdb(). I think that the stack reduction portion
of it is fine, but I'm concerned about the function returning
early due to kmalloc() failure, without making the port database.
[reduces stack from 0xc38 to 0x34 bytes (P4 UP, gcc 2.96)]
Can anyone suggest way(s) to have the isp2x00_make_portdb() function
called over and over again until it gets its job done?
Or does anyone even still use this driver?
Douglas Gilbert [Fri, 14 Mar 2003 12:48:23 +0000 (06:48 -0600)]
[PATCH] scsi_debug in 2.5.64
Here is a second attempt to patch scsi_debug in
2.5.64 . My reference version was out of sync in
my previous posting.
Changelog:
- add recovered error injection (this is a bit
different to the patch proposed by Kurt Garloff)
- fix flakiness in scsi_cmnd::result when errors
are being injected
- fix medium error injection
- make "every_nth" writeable in sysfs
- small re-arrangement of error flags in "opts"
- clean up some of the naming
Updated http://www.torque.net/sg/sdebug25.html
This patch does not include Mike Anderson's sysfs
probe() cleanup.
In 2.5.64 scsi error handling is flaky and sysfs is
especially flaky in 2.5.64-bk3. I'll send some findings
to the list when things stabilize a bit. [For anyone
who is bored, try following the tortured sequence
of scsi commands generated by the block/sd/mid-level
layers in response to a persistent medium error.]
Mark Haverkamp [Fri, 14 Mar 2003 12:45:16 +0000 (06:45 -0600)]
[PATCH] aacraid driver for 2.5
This changes the cmd_per_lun element of the aacraid Scsi_Host_Template
to 1. The larger number is not needed and exceeds the depth limit for
scsi_adjust_queue_depth. Also updated struct initializers.
Alex Tomas [Fri, 14 Mar 2003 12:43:43 +0000 (06:43 -0600)]
[PATCH] Re: hot scsi disk resize
Hi!
Here is a new version of the patch. All procfs-related stuff has been removed.
One may rescan the device size by writing something to /sysfs/.../<scsi device>/rescan:
root@zefir:~# echo 1 >/sysfs/bus/scsi/devices/0\:0\:1\:0/rescan
root@zefir:~# dmesg
scsi0:A:1:0: Tagged Queuing enabled. Depth 64
scsi: host 0 channel 0 id 1 lun16384 has a LUN larger than allowed by the host adapter
SCSI device sda: 2097152 512-byte hdwr sectors (1074 MB)
SCSI device sda: drive cache: write through
sda: unknown partition table
Attached scsi disk sda at scsi0, channel 0, id 1, lun 0
SCSI device sda: 125829120 512-byte hdwr sectors (64425 MB)
root@zefir:~#
Willem Riede [Fri, 14 Mar 2003 12:41:24 +0000 (06:41 -0600)]
fix jiffies compare warning in osst
On 2003.03.11 14:13 Christoph Hellwig wrote:
>
> --- 1.39/drivers/scsi/osst.c Sun Feb 2 17:50:23 2003
> +++ edited/drivers/scsi/osst.c Mon Mar 10 14:35:46 2003
> @@ -777,7 +777,7 @@
> #define OSST_POLL_PER_SEC 10
> static int osst_wait_frame(OS_Scsi_Tape * STp, Scsi_Request ** aSRpnt, int curr, int minlast, int to)
> {
> - long startwait = jiffies;
> + unsigned long startwait = jiffies;
> char * name = tape_name(STp);
> #if DEBUG
> char notyetprinted = 1;
> @@ -1288,7 +1288,7 @@
> int logical_blk_num = ntohl(STp->buffer->aux->logical_blk_num)
> - (nframes + pending - 1) * blks_per_frame;
> char * name = tape_name(STp);
> - long startwait = jiffies;
> + unsigned long startwait = jiffies;
> #if DEBUG
> int dbg = debugging;
> #endif
> @@ -1477,7 +1477,7 @@
> int expected = 0;
> int attempts = 1000 / skip;
> int flag = 1;
> - long startwait = jiffies;
> + unsigned long startwait = jiffies;
> #if DEBUG
> int dbg = debugging;
> #endif
> -
There are five functions that use jiffies. You fixed three of them.
If this change is done (and that's fine with me) it should be done
with this patch:
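For context on the warning itself (this is not the patch referred to above, which is not reproduced in this log): a free-running tick counter such as jiffies is unsigned, and keeping the start value in an unsigned long lets elapsed-time arithmetic survive wraparound. A minimal sketch, with ticks_since() and tick_after() as illustrative names:

```c
#include <assert.h>
#include <limits.h>

/* Elapsed ticks between two samples of a free-running unsigned counter.
 * Unsigned subtraction is defined modulo ULONG_MAX + 1, so the result
 * stays correct across a single wrap. */
static unsigned long ticks_since(unsigned long start, unsigned long now)
{
    return now - start;
}

/* Nonzero if a is later than b, assuming the two samples are less than
 * half the counter range apart. The cast to long relies on the usual
 * two's-complement conversion. */
static int tick_after(unsigned long a, unsigned long b)
{
    return (long)(b - a) < 0;
}
```

A start value taken shortly before the counter wraps still gives the right elapsed count afterwards, whereas comparing the raw values directly would not.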
Luben Tuikov [Fri, 14 Mar 2003 12:35:45 +0000 (06:35 -0600)]
scsi_softirq queue is now list_head, eliminate bh_next
The following patch gets rid of softscsi_data struct and
array for the more manageable
static struct list_head done_q[NR_CPUS] __cacheline_aligned;
Thus, scsi_cmnd::bh_next is eliminated, since it was used only
in the scsi softirq processing code.
The comments are updated.
80 chars per line for the affected functions: scsi_done()
and scsi_softirq().
Eliminated is the double loop in scsi_softirq() -- this is
better handled in do_softirq() and gives the system a ``breather''.
(There are pros and cons for either side and if you guys
think that it was better with the double loop, I'll change it and
resubmit the patch.)
[PATCH] fix possible NULL pointer dereference in scsi_scan.c
If the sdev allocation fails and q is non-null we could dereference
sdev->request_queue. While at it reformat the function to use
goto-based cleanup - that's much easier to parse.
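A goto-based cleanup of this general shape might look like the sketch below. It is not the scsi_scan.c function: the struct names, alloc_or_fail() and the fail_stage knob are hypothetical and exist only to exercise the error paths.

```c
#include <assert.h>
#include <stdlib.h>

struct queue_sketch { int depth; };
struct device_sketch { struct queue_sketch *request_queue; };

static int live;    /* objects currently allocated */

static void *alloc_or_fail(size_t n, int fail)
{
    void *p = fail ? NULL : calloc(1, n);

    if (p)
        live++;
    return p;
}

static void release(void *p)
{
    if (p) {
        live--;
        free(p);
    }
}

/* fail_stage: 0 = no failure, 1 = device allocation fails,
 * 2 = queue allocation fails. */
static struct device_sketch *alloc_device(int fail_stage)
{
    struct device_sketch *sdev;

    sdev = alloc_or_fail(sizeof(*sdev), fail_stage == 1);
    if (!sdev)
        goto out;               /* nothing to undo; sdev must not be touched */

    sdev->request_queue = alloc_or_fail(sizeof(*sdev->request_queue),
                                        fail_stage == 2);
    if (!sdev->request_queue)
        goto out_free_dev;

    return sdev;

out_free_dev:
    release(sdev);
out:
    return NULL;
}

static void free_device(struct device_sketch *sdev)
{
    release(sdev->request_queue);
    release(sdev);
}
```

Each label undoes exactly what had succeeded before the jump, so the path taken when the device allocation fails never reads through the NULL pointer.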
Neil Brown [Fri, 14 Mar 2003 10:12:01 +0000 (02:12 -0800)]
[PATCH] kNFSd: Introduce CROSSMNT flag for knfsd
Now that we have working up-calls to userspace,
CROSSMNT makes sense.
If CROSSMNT is set for an export, and we get a
LOOKUP which crosses a mountpoint, we initiate an
upcall to find out if and how that filesystem
is exported.
Neil Brown [Fri, 14 Mar 2003 10:11:42 +0000 (02:11 -0800)]
[PATCH] kNFSd: Fix deadlock problem in lockd.
nlmsvc_lock calls nlmsvc_create_block with file->f_sema
held.
nlmsvc_create_block calls nlmclnt_lookup_host which might
call nlm_gc_hosts which might, eventually, try to claim
file->f_sema for the same file -> deadlock.
nlmsvc_create_block does not need any protection under
any lock as lockd is single-threaded and _create_block
only plays with internal data structures.
So we release the f_sema before calling in, and make sure
it gets claimed again afterwards.
knfsd needs to disable soft interrupts when calling
csum_partial_copy_to_xdr().
At the moment there's a nasty conflict between the RPC server and
client. The problem arises when you get to xdr_partial_copy_from_skb()
(and the kmap_atomic()); the RPC client can end up calling the same
function from a ->data_ready() soft interrupt, and corrupt any data
the knfsd process may have copied.
Neil Brown [Fri, 14 Mar 2003 10:09:26 +0000 (02:09 -0800)]
[PATCH] md: Add new superblock format for md
Superblock format '1' resolves a number of issues with
superblock format '0'.
It is more dense and can support many more sub-devices.
It does not contain un-needed redundancy.
It adds a few new useful fields.
Neil Brown [Fri, 14 Mar 2003 10:09:20 +0000 (02:09 -0800)]
[PATCH] md: Allow md to select between superblock formats
The code to understand a specific superblock format is
already highly localised in md. This patch defines a
user-space interface for selecting which superblock format
to use, and obeys that selection.
Md currently has a concept of 3 version numbers:
A major version number
A minor version number
A patch version number
There historically seems to be some confusion about whether
these refer to a version of the superblock layout,
or a version of the software.
We will now define that:
the "major_version" defines the superblock handler.
'0' is the current superblock format. All new formats
will need new numbers.
the "minor_version" can specify minor variations in the
superblock, such as different location on the device
the "patch_version" will be used to indicate new extensions
to the software. patch_version=1 will mean multiple superblock
support.
A superblock version number is selected by specifying major_version
in SET_ARRAY_INFO ioctl.
This patch:
Updates Documentation/md.txt with details of new interface.
Generalises desc_nr handling and makes sure that an array never
has two devices with the same desc_nr.
makes sure mddev->major_version is always valid and is 0 by default.
uses mddev->major_version to select superblock handlers.
Modifies set_array_info to just record version number if raid_disks==0
Makes sure max_disks is always set correctly.
Determines device size when reading superblock, or a hot-add/add-new.
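Selecting a handler by major_version can be pictured as a table lookup. The sketch below is an assumption about the general shape only: struct sb_handler, the handler names and select_sb_handler() are illustrative and are not taken from md.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical handler record; a real one would carry load/validate/sync hooks. */
struct sb_handler {
    const char *name;
};

static const struct sb_handler sb_handlers[] = {
    { "format-0" },     /* index 0: the existing superblock format */
    { "format-1" },     /* index 1: the new superblock format */
};

/* Returns the handler for a major_version, or NULL if none is registered. */
static const struct sb_handler *select_sb_handler(int major_version)
{
    size_t count = sizeof(sb_handlers) / sizeof(sb_handlers[0]);

    if (major_version < 0 || (size_t)major_version >= count)
        return NULL;
    return &sb_handlers[major_version];
}
```

Rejecting unknown numbers up front is one way to keep the selected version always valid.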
Neil Brown [Fri, 14 Mar 2003 10:09:13 +0000 (02:09 -0800)]
[PATCH] md: Allow components of MD raid array to have data start at offset from start of device.
Normally the data stored on a component of a RAID array is stored
from the start of the device. This patch allows a per-device
data_offset so the data can start elsewhere. This will allow
RAID arrays where the metadata is at the head of the device
rather than the tail.
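The address translation this implies is a single addition, sketched here with a bounds check. struct component, dev_sectors and map_sector() are illustrative names, not md's.

```c
#include <assert.h>
#include <stdint.h>

struct component {
    uint64_t data_offset;   /* first sector holding array data */
    uint64_t dev_sectors;   /* total sectors on the component device */
};

/* Translate a sector within the component's data area to a device sector.
 * Returns 0 on success, -1 if the sector falls outside the device. */
static int map_sector(const struct component *c, uint64_t array_sector,
                      uint64_t *dev_sector)
{
    if (c->data_offset > c->dev_sectors ||
        array_sector >= c->dev_sectors - c->data_offset)
        return -1;
    *dev_sector = c->data_offset + array_sector;
    return 0;
}
```

With data_offset set to 0 the mapping is the identity, which corresponds to the existing layout where data starts at the head of the device.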
Neil Brown [Fri, 14 Mar 2003 10:09:06 +0000 (02:09 -0800)]
[PATCH] md: Fix bad interaction between sync checkpointing and recovery
Md devices (raid1/raid5) can resync or recover.
They are similar but importantly different.
resync happens after an unclean shutdown
recovery happens when a failed drive is being replaced by a hot spare.
The sync-checkpoint code confused the two somewhat and this causes
problems.
This patch makes sure "recovery_cp" only relates to resync,
not recovery.
It also fixes a small problem with recording spares in
the superblock.
Neil Brown [Fri, 14 Mar 2003 10:09:01 +0000 (02:09 -0800)]
[PATCH] md: Fulltime delayed 'safe_mode' for md
From: Angus Sawyer <angus.sawyer@dsl.pipex.com>
If there are no writes for 20 milliseconds, write out superblock
to mark array as clean. Write out superblock with
dirty flag before allowing any further write to succeed.
If an md thread gets signaled with SIGKILL, reduce the
delay to 0.
Also tidy up some printk's and make sure writing the
superblock isn't noisy.
Neil Brown [Fri, 14 Mar 2003 10:08:54 +0000 (02:08 -0800)]
[PATCH] md: Remove md_recoveryd thread for md
The md_recoveryd thread is responsible for initiating and cleaning
up resync threads.
This job can be equally well done by the per-array threads
for those arrays which might need it.
So the mdrecoveryd thread is gone and the core code that
it ran is now run by raid5d, raid1d or multipathd.
We add an MD_RECOVERY_NEEDED flag so those daemons don't have
to bother trying to lock the md array unless it is likely
that something needs to be done.
Also modify the names of all threads to include the number of
the md device.
Neil Brown [Fri, 14 Mar 2003 10:08:21 +0000 (02:08 -0800)]
[PATCH] md: Convert /proc/mdstat to use seq_file
From: Angus Sawyer <angus.sawyer@dsl.pipex.com>
Mainly a straightforward conversion of sprintf -> seq_printf. seq_start and
seq_next modelled on /proc/partitions. locking/ref counting as for
ITERATE_MDDEV.
Neil Brown [Fri, 14 Mar 2003 10:08:13 +0000 (02:08 -0800)]
[PATCH] md: Missing mddev_put in md resync code
Whenever an ITERATE_MDDEV loop is exited abnormally
we need to mddev_put the current mddev. There was
one point in md_do_sync where we didn't, so use counts
became wrong.
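The rule can be sketched with a plain loop that pins each entry while visiting it. This is a model of the pattern, not the ITERATE_MDDEV macro: struct array_sketch, get_ref(), put_ref() and find_wanted() are hypothetical.

```c
#include <assert.h>

struct array_sketch {
    int refs;       /* use count */
    int wanted;     /* nonzero if this is the entry being searched for */
};

static void get_ref(struct array_sketch *a) { a->refs++; }
static void put_ref(struct array_sketch *a) { a->refs--; }

/* Returns the index of the first wanted entry, or -1 if there is none. */
static int find_wanted(struct array_sketch *arrays, int n)
{
    for (int i = 0; i < n; i++) {
        get_ref(&arrays[i]);            /* the iteration pins the current entry */
        if (arrays[i].wanted) {
            put_ref(&arrays[i]);        /* early exit: drop the pin ourselves */
            return i;
        }
        put_ref(&arrays[i]);            /* normal advance drops the pin */
    }
    return -1;
}
```

Without the put on the early-exit path, the entry at which the loop stopped would keep an extra reference forever.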
Jeff Garzik [Thu, 13 Mar 2003 13:51:25 +0000 (08:51 -0500)]
[hw_random] fixes and cleanups
* s/Via/VIA/
* allow multiple simultaneous open(2)s of the chrdev. This allows
us to eliminate some code, without modifying the core code (rng_dev_read)
at all.
* s/__exit// in ->cleanup ops, to eliminate link error
Jeff Garzik [Thu, 13 Mar 2003 13:47:27 +0000 (08:47 -0500)]
[ia32] cpu capabilities cleanups and additions
* Add support for new Centaur(VIA) and Intel cpuid feature bits,
expanding the x86_capability array by two.
* (cleanup) Move cpu setup for newer Via C3 cpus into its own
function, init_c3()
* Add support for RNG control msr on VIA Nehemiah
* export X86_FEATURE_XSTORE and cpu_has_xstore macros so that
kernel code may easily test for cpu support of the new
"xstore" instruction.
Jeff Garzik [Thu, 13 Mar 2003 13:24:13 +0000 (08:24 -0500)]
[hw_random] update amd768_rng driver to be modular; add Intel support
Take Alan's amd768_rng driver, recently renamed to hw_random.c,
and convert its very simple structure to support multiple
types of hardware RNG. Integrate Intel i8xx (ICH) RNG support.
Paul Mackerras [Thu, 13 Mar 2003 09:28:56 +0000 (20:28 +1100)]
PPC32: Add a thread-pointer argument to the clone syscall, make a prepare_to_copy().
The thread-pointer argument gets copied to R2 in the child in copy_thread() if
the CLONE_SETTLS flag is set. Adding a prepare_to_copy simplifies the copy_thread
logic since we don't have to do the extra copy of fpu/altivec state to the child.
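The conditional register copy described above can be sketched in plain C. struct regs_sketch and copy_thread_sketch() are illustrative; the flag value below matches the usual Linux CLONE_SETTLS definition but is given a sketch-local name and should be treated as an assumption here.

```c
#include <assert.h>

#define SKETCH_CLONE_SETTLS 0x00080000UL    /* assumed value of CLONE_SETTLS */

struct regs_sketch {
    unsigned long gpr[32];      /* general purpose registers r0..r31 */
};

/* The child starts from a copy of the parent's registers; the thread
 * pointer goes into r2 only when the caller asked for it. */
static void copy_thread_sketch(struct regs_sketch *child,
                               const struct regs_sketch *parent,
                               unsigned long clone_flags,
                               unsigned long tls)
{
    *child = *parent;
    if (clone_flags & SKETCH_CLONE_SETTLS)
        child->gpr[2] = tls;
}
```

Without the flag, the child simply inherits the parent's r2.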