The locking in some OSS modules is really lousy.
Because save_flags/cli/restore_flags could be used recursively, the
programmers pushed the locking too far down into the lower level.
On ISA cards the register sets are usually multiplexed, so
you had to write to an address latch and then access the data port
in an "atomic" manner.
I suggest removing the locking from ad_read/ad_write +
ad_{enter|leave}_MCE and taking the locks wherever the functions
are called. I hope the attached patch does that correctly.
Yes, I don't like all the timeout loops while holding the locks:
there is a high chance that a CPU is spinning in interrupt context :(
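A minimal user-space sketch of the suggested layering, not the OSS driver code: the low-level accessors take no lock, and the caller holds one lock across the whole latch-then-data sequence. The names struct codec, codec_read, codec_write and codec_set_bits are hypothetical, and a C11 atomic_flag stands in for the driver's spinlock.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical model of a multiplexed ISA codec:
 * one address latch, one data port, 32 indexed registers. */
struct codec {
    atomic_flag lock;   /* stands in for the driver's spinlock */
    uint8_t latch;      /* register index last written to the address port */
    uint8_t regs[32];   /* registers reached through the data port */
};

static void codec_lock(struct codec *c)
{
    while (atomic_flag_test_and_set(&c->lock))
        ;   /* spin until the flag is released */
}

static void codec_unlock(struct codec *c)
{
    atomic_flag_clear(&c->lock);
}

/* Low-level accessors take no lock; the caller must hold c->lock. */
static uint8_t codec_read(struct codec *c, uint8_t reg)
{
    c->latch = reg & 0x1f;      /* step 1: select the register */
    return c->regs[c->latch];   /* step 2: read the data port */
}

static void codec_write(struct codec *c, uint8_t reg, uint8_t val)
{
    c->latch = reg & 0x1f;      /* step 1: select the register */
    c->regs[c->latch] = val;    /* step 2: write the data port */
}

/* Caller-level locking: the whole read-modify-write is one critical
 * section, so no other user can move the latch in between. */
static void codec_set_bits(struct codec *c, uint8_t reg, uint8_t bits)
{
    codec_lock(c);
    codec_write(c, reg, codec_read(c, reg) | bits);
    codec_unlock(c);
}
```

With the lock taken at this level, a read-modify-write needs one lock/unlock pair instead of two, and the accessors no longer depend on recursive save_flags/cli behaviour.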
Andrew Morton [Wed, 12 Feb 2003 05:08:24 +0000 (21:08 -0800)]
[PATCH] fix fadvise64() return type
Patch from: David Mosberger <davidm@napali.hpl.hp.com>
Please remember to declare the return-type of syscall stubs as "long".
On 64-bit platforms, it's generally necessary to ensure that the
entire 64-bit return value is valid (and can be checked against
negative values).
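As an illustration only, the following sketch models why the stub's return type matters. It assumes the worst case for a 32-bit "int" return, namely that the upper half of the 64-bit return register is not sign-extended; whether that happens depends on the platform ABI. The helper names are hypothetical.

```c
#include <stdint.h>

/* Models the 64-bit return register when the stub is declared "int":
 * only the low 32 bits are defined, here modelled as zero-extension. */
static uint64_t reg_from_int32(int32_t ret)
{
    return (uint64_t)(uint32_t)ret;
}

/* Models the same register when the stub returns a 64-bit "long":
 * the value is sign-extended across all 64 bits. */
static uint64_t reg_from_int64(int64_t ret)
{
    return (uint64_t)ret;
}

/* The kind of check a syscall exit path makes on the full register. */
static int looks_like_error(uint64_t reg)
{
    return (int64_t)reg < 0;
}
```

Under this model an error such as -22 returned through the 32-bit path no longer looks negative to the 64-bit check, while the same value returned as a 64-bit quantity does.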
Andrew Morton [Wed, 12 Feb 2003 05:07:17 +0000 (21:07 -0800)]
[PATCH] EATA driver fix
This driver is calling down into scsi_register with local interrupts
disabled. scsi_register performs blocking allocations, starts kernel
threads, etc. Slab debugging gets offended by someone performing blocking
operations with local interrupts disabled.
Andrew Morton [Wed, 12 Feb 2003 05:06:47 +0000 (21:06 -0800)]
[PATCH] sunrpc dcache cleanup
Patch from Dipankar Sarma <dipankar@in.ibm.com>
All fs should be using dcache APIs to manipulate dcache hash lists. This is
in line with the dcache cleanup patch (dcache_rcu-1) from Maneesh that Linus
accepted. This seems like a reasonable cleanup. One change, though: we don't
need to grab dcache_lock while deleting dentries from the private list;
__d_drop() should suffice here.
Andrew Morton [Wed, 12 Feb 2003 05:06:21 +0000 (21:06 -0800)]
[PATCH] disassociate_ctty SMP fix
Patch from Rik van Riel <riel@conectiva.com.br>
The following patch, against today's BK tree, fixes a small
SMP race in disassociate_ctty. This function gets called
from do_exit, without the BKL held.
However, it sets the *tty variable before grabbing the BKL,
then makes decisions on what the variable was set to before
the lock was grabbed, despite the fact that another process
could modify its ->tty pointer in this same function.
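The shape of such a fix can be sketched in user space as follows. This is not the kernel's disassociate_ctty; struct tty, struct task and detach_tty are hypothetical, and a C11 atomic_flag stands in for the BKL. The point is only that the pointer is sampled after the lock is taken.

```c
#include <stdatomic.h>
#include <stddef.h>

struct tty  { int id; };
struct task { struct tty *tty; };

/* Stands in for the big kernel lock. */
static atomic_flag big_lock = ATOMIC_FLAG_INIT;

static void big_lock_acquire(void)
{
    while (atomic_flag_test_and_set(&big_lock))
        ;   /* spin until the flag is released */
}

static void big_lock_release(void)
{
    atomic_flag_clear(&big_lock);
}

/* Returns the tty that was detached, or NULL if there was none. */
static struct tty *detach_tty(struct task *t)
{
    struct tty *tty;

    big_lock_acquire();
    tty = t->tty;       /* sample the pointer only once the lock is held */
    if (tty)
        t->tty = NULL;  /* decision and update use the same snapshot */
    big_lock_release();
    return tty;
}
```

Reading t->tty before big_lock_acquire() would reintroduce the race described above, because another task could change the pointer between the read and the decision.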
Andrew Morton [Wed, 12 Feb 2003 05:04:29 +0000 (21:04 -0800)]
[PATCH] genhd warnings fix
I have a whole bunch of silly compile warning fixes here, arising from
building the kernel for a 64-bit target. Some are trivial, some are genuine
printk bugs.
Assuming dev_t is unsigned generates a warning on ppc64. Cast it.
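A small user-space sketch of the idea, using snprintf in place of printk: since the width of dev_t differs between targets, the value is cast to a type whose format specifier is known. The function name format_dev is hypothetical.

```c
#include <stdio.h>
#include <sys/types.h>

/* Formats a device number portably: the explicit cast makes the
 * argument match "%lu" regardless of how dev_t is defined. */
static int format_dev(char *buf, size_t len, dev_t dev)
{
    return snprintf(buf, len, "dev %lu", (unsigned long)dev);
}
```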
Dave Jones [Wed, 12 Feb 2003 19:23:16 +0000 (18:23 -0100)]
[CPUFREQ] add support for cpufreq governors.
More bits from Dominik.
Most cpufreq drivers (in fact, all except one, longrun) or even most
cpu frequency scaling algorithms only allow the CPU to be set to one
frequency. In order to offer dynamic frequency scaling, the cpufreq
core must be able to tell these drivers a "target frequency". So
these specific drivers will be transformed to offer a "->target"
call instead of the existing "->setpolicy" call. For "longrun", all
stays the same, though.
How to decide what frequency within the CPUfreq policy should be used?
That's done using "cpufreq governors". Two are already in this patch
-- they're the already existing "powersave" and "performance" which
set the frequency statically to the lowest or highest frequency,
respectively. At least two more such governors will be ready for
addition in the near future, but likely many more as there are various
different theories and models about dynamic frequency scaling
around. Using such a generic interface as cpufreq offers to scaling
governors, these can be tested extensively, and the best one can be
selected for each specific use.
Basically, it's the following flow graph:
CPU can be set to switch independently  |      CPU can only be set
      within specific "limits"          |    to specific frequencies
                          "CPUfreq policy"
          consists of frequency limits (policy->{min,max})
                and CPUfreq governor to be used
                     /                    \
                    /                      \
                   /                  the cpufreq governor decides
                  /                   (dynamically or statically)
                 /                    what target_freq to set within
                /                     the limits of policy->{min,max}
               /                           \
              /                             \
 Using the ->setpolicy call,        Using the ->target call,
     the limits and the              the frequency closest
      "policy" is set.               to target_freq is set.
                                     It is assured that it
                                     is within policy->{min,max}
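The "->target" side of the flow can be sketched as below. This is an illustrative model, not the cpufreq core's API: struct policy, gov_powersave, gov_performance and set_target are hypothetical names, and the frequency table is an assumption about how a driver might list its supported frequencies.

```c
/* Hypothetical policy: frequency limits plus the governor to use. */
struct policy {
    unsigned int min;   /* lower limit, kHz */
    unsigned int max;   /* upper limit, kHz */
    unsigned int (*governor)(const struct policy *p);
};

/* The two static governors: always the lowest or highest frequency. */
static unsigned int gov_powersave(const struct policy *p)
{
    return p->min;
}

static unsigned int gov_performance(const struct policy *p)
{
    return p->max;
}

/* Driver side of the "->target" path: ask the governor for a target,
 * then pick the supported frequency closest to it that still lies
 * within policy->{min,max}. Returns 0 if no entry is within limits. */
static unsigned int set_target(const struct policy *p,
                               const unsigned int *table, int n)
{
    unsigned int want = p->governor(p);
    unsigned int best = 0, best_dist = 0;
    int i, found = 0;

    for (i = 0; i < n; i++) {
        unsigned int f = table[i];
        unsigned int dist;

        if (f < p->min || f > p->max)
            continue;   /* never leave the policy limits */
        dist = f > want ? f - want : want - f;
        if (!found || dist < best_dist) {
            best = f;
            best_dist = dist;
            found = 1;
        }
    }
    return best;
}
```

A dynamic governor would plug into the same function pointer and return a different target on each call; the clamping to the policy limits stays in one place.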
Pavel Machek [Wed, 12 Feb 2003 04:50:16 +0000 (20:50 -0800)]
[PATCH] Fix stack handling in acpi_wakeup.S
This fixes stack handling in acpi_wakeup.S, and makes the stack smaller so
that the wakeup code actually fits inside the memory allocated for it. Plus
someone renamed .L1432 to something meaningful.
Stephen Rothwell [Tue, 11 Feb 2003 13:37:16 +0000 (05:37 -0800)]
[PATCH] x86_64 compatibility layer update
Andi has asked that I send these straightforward compatibility patches
to you and he will fix up any merge problems later. These are the
outstanding patches for x86_64 against 2.5.60.
Stephen Rothwell [Tue, 11 Feb 2003 13:37:08 +0000 (05:37 -0800)]
[PATCH] parisc compatibility layer update
At Linux Conf AU, Willy asked me to send any further parisc compatibility
changes directly to you, so this is what I have outstanding. Basically,
it is just the uses of compat_sigset_t that seemed to have been missed in
the previous merges.
Andi Kleen [Tue, 11 Feb 2003 13:20:56 +0000 (05:20 -0800)]
[PATCH] x86-64 merge
This brings the x86-64 port up to date with 2.5.60. Unfortunately I cannot
test too much because I constantly get deadlocks in exit/wait in initscripts
on SMP bootup. The kernel seems to still lose a lot of SIGCHLD. 2.5.59/SMP
had the same problem. Uniprocessor and SMP kernels on UP seem to work.
This patch only touches x86-64 specific files. It requires a few simple
changes to arch independent files that I will send separately.
- Fixed a lot of obsolete/misleading configure help texts.
- Remove old bootblock disk loader and support fdimage target for syslinux
  instead (H. Peter Anvin)
- Fix potential fpu signal restore problem on 32bit emulation.
- Merge with 2.5.60 i386 (hugetlbfs, acpi etc.)
- Some fixes for local apic disabled mode.
- Beginnings of S3 ACPI wakeup from real-mode (not working yet, don't use)
- Beginnings of NUMA/CONFIG_DISCONTIGMEM support for AMD K8 (work in progress,
  port from 2.4): clean up memory mapping at bootup, generalize bootmem etc.
- Fix 64bit GS base reload problem and reenable (Karsten Keil)
- Fix race with vmalloc accesses from interrupt handlers disturbing page fault/
  similar race for the debug handler (thanks to Andrew Morton)
- Merge cpu access primitives with i386
- Revert to private module list for now because putting modules
  into vmlist triggered too many problems.
- Some cleanups, removal of unneeded code.
- Let early __get_free_pages see consistent pda
- Preempt disabled for now because it is too broken right now
- Signal handler fixes
- Fix do_gettimeofday to be completely lockless and reenable vsyscalls
- Optimize context switch path a bit (should be ported to i386)
- Get thread_info via stack for better code
- Don't leak pmd pages
- Clean up hardcoded task stack sizes.
Linus Torvalds [Tue, 11 Feb 2003 03:05:34 +0000 (19:05 -0800)]
If we set TIF_SIGPENDING for SIGCONT, we have to wake up any sleeping
tasks (even if we don't otherwise need to wake anything up), since
otherwise later signals would see that signals are already pending and
wouldn't cause wakeups.
Andrew Morton [Tue, 11 Feb 2003 01:16:52 +0000 (17:16 -0800)]
[PATCH] sched_init enables interrupts too early
wake_up_forked_process() unconditionally enables interrupts. It is called
from sched_init(). Enabling interrupts that early makes Anton's ppc64
machine lock up.
David S. Miller [Mon, 10 Feb 2003 20:27:26 +0000 (12:27 -0800)]
[SIGNAL]: Allow more platforms to use generic get_signal_to_deliver.
The few platforms that cannot use the generic
get_signal_to_deliver implementation cannot do
so because they do special things for ptraced
children. This can be easily avoided and thus
all of the signal handling code duplication can
be eliminated.
This is the first part, which adds a platform hook
right before the parent of the ptraced child is woken.
Data can be passed in via a cookie argument.
The next part will be dealing with platforms
that need to muck with breakpoints in the child
in this same code block.
Andrew Morton [Mon, 10 Feb 2003 15:37:12 +0000 (07:37 -0800)]
[PATCH] fix current->user->processes leak
Patch from: Eric Lammerts <eric@lammerts.org>
Every time you do a loop mount, a kernel thread is started (those
processes are called "loop0", "loop1", etc.). The problem is that when
it starts, it's counted as one of your processes. Then, it's
changed to be a root-owned process without correcting that count.
Patch below fixes the problem. It moves the bookkeeping of changing
current->user to a new function switch_uid() (which is now also used
by exec_usermodehelper() in kmod.c). The patch is tested.
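The bookkeeping described above can be sketched as follows. This is a simplified user-space model, not the kernel's switch_uid(): struct user, struct task and switch_user are hypothetical, and reference counting and locking are left out.

```c
/* Hypothetical per-user accounting record. */
struct user {
    int uid;
    int processes;   /* number of processes charged to this user */
};

struct task {
    struct user *user;
};

/* Moves a task to a new owner and keeps both process counts in step,
 * so the old owner is not left charged for a task it no longer owns. */
static void switch_user(struct task *t, struct user *new_user)
{
    struct user *old_user = t->user;

    new_user->processes++;   /* charge the new owner */
    old_user->processes--;   /* release the old owner's charge */
    t->user = new_user;
}
```

Keeping the increment, the decrement and the pointer update in one helper is what prevents the leak: a caller that only reassigns t->user leaves the old count one too high.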
Andrew Morton [Mon, 10 Feb 2003 15:37:03 +0000 (07:37 -0800)]
[PATCH] remove the buffer_head mempool
mempools have the wrong semantics for use by buffer_heads. The problem
scenario:
- Process A calls mempool_alloc(), asking for a buffer_head, and ends up
  sleeping because none is available.
- While process A sleeps, process B frees up a ton of memory.
That's it. There is no longer any memory pressure, so nobody frees any
buffer_heads, so process A does not get woken up. I managed to trigger this
in some testing recently.
One approach would be to use a schedule_timeout(2) in mempool_alloc().
Anyway, the importance of buffer_head allocation was lessened when swapout
stopped using them, so let's just drop the mempool out of it for now.
Andrew Morton [Mon, 10 Feb 2003 15:36:23 +0000 (07:36 -0800)]
[PATCH] nforce2 IDE support for the amd74xx driver
Patch from James Curbo <phoenix@sandwich.net>
The amd74xx IDE driver in 2.5.59 has support for the nforce IDE controller,
but not explicitly for the nforce2 IDE controller (which has a different PCI
ID, which is in the kernel already). I'm not sure if the nforce and nforce2
controllers are identical, but I made a small patch that made the amd74xx
driver recognize the nforce2 IDE, and it boots for me, seems to work fine, as
my drives were tuned to their highest transfer rate automatically (udma5).
I don't know if this patch is proper or correct, but it Works for Me [tm].
Patch is attached.
Andrew Morton [Mon, 10 Feb 2003 15:36:07 +0000 (07:36 -0800)]
[PATCH] DAC960 Stanford Checker fix
Patch from Dave Olien <dmo@osdl.org>
This was found by the Stanford Checker.
The LogicalDeviceNumber bad range test was changed from > to >=.
I also replaced a couple of panic() calls with error messages,
since panic-ing seemed a little extreme.
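The off-by-one can be illustrated with a generic bounds check. The limit MAX_LOGICAL_DRIVES and the function name are hypothetical and are not taken from the DAC960 driver.

```c
#define MAX_LOGICAL_DRIVES 32   /* hypothetical table size */

/* Valid indices are 0 .. MAX_LOGICAL_DRIVES - 1, so the limit itself
 * must be rejected: the bad-range test is ">=", not ">". */
static int logical_drive_valid(int n)
{
    if (n < 0 || n >= MAX_LOGICAL_DRIVES)
        return 0;
    return 1;
}
```

With ">" instead of ">=", an index equal to the table size would pass the check and address one element past the end of the table.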
Andrew Morton [Mon, 10 Feb 2003 15:36:00 +0000 (07:36 -0800)]
[PATCH] ext3: Remove journal_try_start()
journal_try_start() is a function which attempts, without blocking, to open a JBD
transaction handle. It was added a long time ago when there were concerns
that ext3_writepage() could block kswapd for too long.
It was never clearly necessary.
So the patch throws it all away and just calls the blocking journal_start()
from ext3_writepage().
> like with cmd_ld in scripts/Makefile.lib having possibility to add
> customflags with cmd_objcopy would be nice. When building a
> ROMKernel I'd like to use:
> OBJCOPYFLAGS_rompiggydata := --remove-section=.text
> OBJCOPYFLAGS_$(MODEL)piggytext := --only-section=.text
> Appended below is a small patch to the top-level makefile; it
> -- replaces a call to $(shell/echo/sed) with $(subst) and adds a
> comment
> -- fixes some typos.
When the user selects CONFIG_MODVERSIONS but doesn't build anything
modular, the post-processing step does nothing (right, as there is
nothing to be done), but it also gave an error, which it shouldn't.
Some versions of sed seem to think \w, as in word, doesn't include
digits, which breaks the build with CONFIG_MODVERSIONS. So we
just use the more compatible [<space><tab>]*.
The problem is that if the buffer was locked on entry to this code sequence
(due to in-progress I/O), ll_rw_block() will neither wait nor start new I/O. So
this code will wait on the _old_ I/O, and will then continue execution,
leaving the buffer dirty.
It turns out that all callers were only writing one buffer, and they were all
waiting on that writeout. So I added a new sync_dirty_buffer() function: