git.neil.brown.name Git - history.git/log

]> git.neil.brown.name Git - history.git/log

projects / history.git / log

summary | shortlog | log | commit | commitdiff | tree
first ⋅ prev ⋅ next

commit | commitdiff | tree

Bernardo Innocenti [Tue, 8 Jul 2003 09:19:51 +0000 (02:19 -0700)]

[PATCH] Fix problem introduced by do_div() patch

- export the __div64_32 symbol for modules;

- add likely() to the fast path (divisor>>32 == 0);

- add __attribute__((pure)) to __div64_32() prototype so
the compiler knows global memory isn't clobbered;

- avoid building __div64_32() on 64bit architectures.

commit | commitdiff | tree

Pavel Machek [Tue, 8 Jul 2003 09:18:05 +0000 (02:18 -0700)]

[PATCH] Fix thinko in acpi

commit | commitdiff | tree

Linus Torvalds [Tue, 8 Jul 2003 05:15:27 +0000 (22:15 -0700)]

Avoid deadlocking on thread shutdown after a vfork.

commit | commitdiff | tree

Hirofumi Ogawa [Tue, 8 Jul 2003 02:55:47 +0000 (19:55 -0700)]

[PATCH] FAT maintainership

commit | commitdiff | tree

David S. Miller [Mon, 7 Jul 2003 18:00:08 +0000 (11:00 -0700)]

Merge davem@nuts.ninka.net:/home/davem/src/BK/sparc-2.5
into kernel.bkbits.net:/home/davem/sparc-2.5

commit | commitdiff | tree

David S. Miller [Mon, 7 Jul 2003 19:15:57 +0000 (12:15 -0700)]

[SPARC64]: Use kstat_this_cpu where possible.

commit | commitdiff | tree

David S. Miller [Mon, 7 Jul 2003 18:37:57 +0000 (11:37 -0700)]

[SPARC64]: Kill all irq_cpustat_t except __softirq_pending.

commit | commitdiff | tree

David S. Miller [Mon, 7 Jul 2003 17:03:03 +0000 (10:03 -0700)]

[SPARC64]: Move raid xor into library assembler file.

commit | commitdiff | tree

Greg Ungerer [Mon, 7 Jul 2003 16:22:14 +0000 (09:22 -0700)]

[PATCH] clean module_exit in m68knommu serial drivers

Remove un-used commented module_exit functions from m68knommu
ColdFire and 68328 serial drivers. These drivers currently cannot
be configured as modules, and they have no exit functions.

commit | commitdiff | tree

Greg Ungerer [Mon, 7 Jul 2003 16:21:47 +0000 (09:21 -0700)]

[PATCH] fix security_initcall in m68knommu linker script

Global SECURITY_INIT macro cannot be used inside .init section
for m68knommu linker script. It is a complete section of its own,
need to just list the components individually.

commit | commitdiff | tree

Greg Ungerer [Mon, 7 Jul 2003 16:21:25 +0000 (09:21 -0700)]

[PATCH] conditional ROMfs copy for NETtel/5307 board

Conditionally copy the ROMfs filesystem on the NETtel/5307
target board only if using a ROMfs.

commit | commitdiff | tree

Greg Ungerer [Mon, 7 Jul 2003 16:21:02 +0000 (09:21 -0700)]

[PATCH] DragenEngine interrupt handler to use irqreturn_t

DragenEngine setup code updates:

- Change interrupt handler return type to irqreturn_t
- Allow configure time setting of boot parameters
- Clean up warnings

commit | commitdiff | tree

Greg Ungerer [Mon, 7 Jul 2003 16:20:47 +0000 (09:20 -0700)]

[PATCH] conditional ROMfs copy for SecureEdgeMP3/5307 board

Conditionally copy the ROMfs filesystem on the SecureEdgeMP3/5307
target board only if using a ROMfs.

commit | commitdiff | tree

Greg Ungerer [Mon, 7 Jul 2003 16:20:38 +0000 (09:20 -0700)]

[PATCH] 68328 DragenEngine configure updates

Configuration updates for 68328 DragenEngine board. Fix up name
so that it is "DragenEngine" and clean up eeprom read.

commit | commitdiff | tree

David S. Miller [Mon, 7 Jul 2003 15:19:41 +0000 (08:19 -0700)]

Merge nuts.ninka.net:/home/davem/src/BK/sparcwork-2.5
into nuts.ninka.net:/home/davem/src/BK/sparc-2.5

commit | commitdiff | tree

Trond Myklebust [Mon, 7 Jul 2003 09:14:10 +0000 (02:14 -0700)]

[PATCH] make create() follow symlinks again

The intent patches broke behaviour w.r.t. following symlinks when
doing an open() with file creation. The problem occurs in open_namei()
because the LOOKUP_PARENT flag is no longer set when we do the call to
follow_link().

commit | commitdiff | tree

Ulrich Drepper [Mon, 7 Jul 2003 08:31:32 +0000 (01:31 -0700)]

[PATCH] tgkill patch for safe inter-thread signals

This is the updated versions of the patch Ingo sent some time ago to
implement a new tgkill() syscall which specifies the target thread
without any possibility of ambiguity or thread ID wrap races, by passing
in both the thread group _and_ the thread ID as the arguments.

This is really needed since many/most people still run with limited PID
ranges (maybe due to legacy apps breaking) and the PID reuse can cause
problems.

commit | commitdiff | tree

Pavel Machek [Mon, 7 Jul 2003 06:34:25 +0000 (23:34 -0700)]

[PATCH] suspend SMP-kernel with one CPU

This allows suspend to work on UP machines, even if the kernel
is compiled for SMP.

commit | commitdiff | tree

Ian Molton [Mon, 7 Jul 2003 06:29:28 +0000 (23:29 -0700)]

[PATCH] ARM26 architecture update

commit | commitdiff | tree

Bruno Ducrot [Mon, 7 Jul 2003 06:04:55 +0000 (23:04 -0700)]

[PATCH] powernow-k7 typo fix

Due to a typo in powernow-k7.c, the value which correspond
to the CPU core multiplicator and the VID value are swapped
when we go down to up in frequency step.

commit | commitdiff | tree

Paul Mackerras [Mon, 7 Jul 2003 06:04:09 +0000 (23:04 -0700)]

[PATCH] Compile fix and cleanup for macserial driver

This adds a declaration that the macserial driver needs in order to
compile correctly, and removes some old SERIAL_DO_RESTART junk which
isn't used (SERIAL_DO_RESTART is never defined in this driver) and which
I think is incorrect anyway, since it looks to me like it would
potentially return an ERESTARTSYS error without a signal pending.

commit | commitdiff | tree

Rusty Russell [Mon, 7 Jul 2003 06:01:50 +0000 (23:01 -0700)]

[PATCH] switch_mm and enter_lazy_tlb: remove cpu arg

switch_mm and enter_lazy_tlb take a CPU arg, which is always
smp_processor_id(). This is misleading, and pointless if they use
per-cpu variables or other optimizations. gcc will eliminate
redundant smp_processor_id() (in inline functions) anyway.

This removes that arg from all the architectures.

commit | commitdiff | tree

Rusty Russell [Mon, 7 Jul 2003 06:01:42 +0000 (23:01 -0700)]

[PATCH] Make kstat_this_cpu in terms of __get_cpu_var and use it

kstat_this_cpu() is defined in terms of per_cpu instead of __get_cpu_var.

This patch changes that, and uses it everywhere appropriate. The sched.c
change puts it in a local variable, which helps gcc generate better code.

commit | commitdiff | tree

Greg Ungerer [Mon, 7 Jul 2003 06:00:33 +0000 (23:00 -0700)]

[PATCH] remove 68360 specific trap init call

No longer need the 68360 specific trap init call. The generic
interrupt/trap code is now setup to do this itself.

commit | commitdiff | tree

Greg Ungerer [Mon, 7 Jul 2003 06:00:27 +0000 (23:00 -0700)]

[PATCH] define raw read/write for m68knommu io access

Define the raw read and write access macros for m68knommu.
These rae use by MTD drivers in particular.

commit | commitdiff | tree

Greg Ungerer [Mon, 7 Jul 2003 06:00:19 +0000 (23:00 -0700)]

[PATCH] cleanup show_process_blocks() for non-mmu targets

Clean up show_process_blocks() loop for non-mmu targets.

commit | commitdiff | tree

Greg Ungerer [Mon, 7 Jul 2003 06:00:12 +0000 (23:00 -0700)]

[PATCH] define shared lib limits for flat loader

This patch includes the last peices of the flat laoder shared
library support. Define the shared lib limit and implement a
flag for doing kernel level tracing.

commit | commitdiff | tree

Greg Ungerer [Mon, 7 Jul 2003 06:00:04 +0000 (23:00 -0700)]

[PATCH] .no .romvec section for DragonEngine/68328 target

A couple of minor fixes for the 68328 interrupt setup code.

- don't define the .romvec section for DragonEngine build
- print newline at end of spurious interrupt count in show_interrupts()

commit | commitdiff | tree

Ingo Molnar [Mon, 7 Jul 2003 05:51:59 +0000 (22:51 -0700)]

[PATCH] Double unlock in BSD accounting speedup patch

doh - double unlock in the acct-is-on path. Noticed by Aneesh Kumar K.V
<aneesh.kumar@digital.com>

commit | commitdiff | tree

Linus Torvalds [Sun, 6 Jul 2003 16:16:36 +0000 (09:16 -0700)]

Merge bk://cifs.bkbits.net/linux-2.5cifs
into home.osdl.org:/home/torvalds/v2.5/linux

commit | commitdiff | tree

Steve French [Sun, 6 Jul 2003 15:36:41 +0000 (08:36 -0700)]

Fix statfs failure due to invalid value for ffree

commit | commitdiff | tree

Greg Ungerer [Sun, 6 Jul 2003 12:58:51 +0000 (05:58 -0700)]

[PATCH] conditional ROMfs copy for Cleopatra/5307 board

Conditionally copy the ROMfs filesystem on the Cleopatra/5307
target board only if using a ROMfs.

commit | commitdiff | tree

Andrew Morton [Sun, 6 Jul 2003 12:41:34 +0000 (05:41 -0700)]

[PATCH] BSD accounting speedup

From: Ingo Molnar <mingo@elte.hu>

Most distributions turn on process accounting - but even the common
'accounting is off' case is horrible SMP-scalability-wise: it accesses a
global spinlock during every sys_exit() call, which bounces like mad on SMP
(and NUMA) systems.

(i also got rid of the unused return code.)

commit | commitdiff | tree

Andrew Morton [Sun, 6 Jul 2003 12:41:27 +0000 (05:41 -0700)]

[PATCH] display bootserver in /proc/net/pnp

From: "lode leroy" <lode_leroy@hotmail.com>

I would like to submit a trivial enhancement to display the ip address of
the bootserver in /proc/net/pnp

This aids me in developing a diskless linux root image to know where it
comes from...

commit | commitdiff | tree

Andrew Morton [Sun, 6 Jul 2003 12:41:19 +0000 (05:41 -0700)]

[PATCH] Module autoloading for quota

From: Jan Kara <jack@suse.cz>

This implements autoloading of quota modules.

commit | commitdiff | tree

Andrew Morton [Sun, 6 Jul 2003 12:41:12 +0000 (05:41 -0700)]

[PATCH] xattr: fine-grained locking

From: Andreas Gruenbacher <agruen@suse.de>

This patch removes the dependency on i_sem in the getxattr and
listxattr iops of ext2 and ext3. In addition, the global ext[23]_xattr
semaphores go away. Instead of i_sem and the global semaphore, mutual
exclusion is now ensured by per-inode xattr semaphores, and by locking
the buffers before modifying them. The detailed locking strategy is
described in comments in fs/ext[23]/xattr.c.

Due to this change it is no longer necessary to take i_sem in
ext[23]_permission() for retrieving acls, so the
ext[23]_permission_locked() functions go away.

Additionally, the patch fixes a race condition in ext[23]_permission:
Accessing inode->i_acl was protected by the BKL in 2.4; in 2.5 there no
longer is such protection. Instead, inode->i_acl (and inode->i_default_acl)
are now accessed under inode->i_lock. (This could be replaced by RCU in
the future.)

In the ext3 extended attribute code, an new uglines results from locking
at the buffer head level: The buffer lock must be held between testing
if an xattr block can be modified and the actual modification to prevent
races from happening. Before a block can be modified,
ext3_journal_get_write_access() must be called. But this requies an unlocked
buffer, so I call ext3_journal_get_write_access() before locking the
buffer. If it turns out that the buffer cannot be modified,
journal_release_buffer() is called. Calling ext3_journal_get_write_access
after the test but while the buffer is still locked would be much better.

commit | commitdiff | tree

Andrew Morton [Sun, 6 Jul 2003 12:41:05 +0000 (05:41 -0700)]

[PATCH] xattrr: preparation for fine-grained locking

From: Andreas Gruenbacher <agruen@suse.de>

Andrew Morton found that there is lock contention between extended
attribute operations (like reading ACLs, which `ls -l' needs to do)
and other operations on the same files. This is due to the fact that
all extended attribute syscalls take inode->i_sem before calling into
the filesystem code.

To fix this problem, this patch no longer takes inode->i_sem in the
getxattr and listxattr syscalls, and moves the lock taking code into
the file systems. (Another patch improves the locking strategy in
ext2 and ext3.)

commit | commitdiff | tree

Andrew Morton [Sun, 6 Jul 2003 12:40:57 +0000 (05:40 -0700)]

[PATCH] xattr: update-in-place optimisation

From: Andreas Gruenbacher <agruen@suse.de>

It is common to update extended attributes without changing the value's
length. This patch optimizes this case. In addition to that, the current
code tries to recognize early when extended attribute blocks become
empty. This optimization is not of significant value, so this patch
removes it, and moves the empty block test further down.

commit | commitdiff | tree

Andrew Morton [Sun, 6 Jul 2003 12:40:50 +0000 (05:40 -0700)]

[PATCH] xattr: blockdev inode selection fix

From: Andreas Gruenbacher <agruen@suse.de>

The inode->i_bdev field is not the same as inode->i_sb->s_bdev or bh->b_bdev.
We must compare inode->i_sb->s_bdev with bh->b_bdev, or else equal extended
attribute block will not be found.

commit | commitdiff | tree

Andrew Morton [Sun, 6 Jul 2003 12:40:42 +0000 (05:40 -0700)]

[PATCH] xattr: cleanups

From: From: Andreas Gruenbacher <agruen@suse.de>

* Various minor cleanups and simplifications in the extended attributes
  and acl code.

* Use a smarter shortcut rule in ext[23]_permission(): If the mask
  contains permissions that are not also contained in the group
  file mode permission bits, those permissions can never be granted by
  an acl. (The previous shortcut rule was more coarse.)

commit | commitdiff | tree

Andrew Morton [Sun, 6 Jul 2003 12:40:35 +0000 (05:40 -0700)]

[PATCH] proc_attr_lookup() fix

From: Daniele Belluci <bellucda@tiscali.it>

proc_attr_lookup() was missed out in Trond's conversion. (It is behind
CONFIG_SECURITY).

commit | commitdiff | tree

Andrew Morton [Sun, 6 Jul 2003 12:40:27 +0000 (05:40 -0700)]

[PATCH] breadahead() tweaks

- use ll_rw_block().

- use READA

- export it to modules.

commit | commitdiff | tree

Andrew Morton [Sun, 6 Jul 2003 12:40:21 +0000 (05:40 -0700)]

[PATCH] misc fixes

- xfs printk warning fix (dev_t is ulong on ppc64)

- unused var in serial_remove() (Daniele Bellucci <bellucda@tiscali.it>)

commit | commitdiff | tree

Andrew Morton [Sun, 6 Jul 2003 12:40:14 +0000 (05:40 -0700)]

[PATCH] use task_cpu() not ->thread_info->cpu in sched.c

From: Mikael Pettersson <mikpe@csd.uu.se>

This patch fixes two p->thread_info->cpu occurrences in kernel/sched.c to
use the task_cpu(p) macro instead, which is optimised on UP. Although one
of the occurrences is under #ifdef CONFIG_SMP, it's bad style to use the
raw non-optimisable form in non-arch code.

commit | commitdiff | tree

Linus Torvalds [Sun, 6 Jul 2003 12:28:36 +0000 (05:28 -0700)]

Fix several broken macros to get the "private" field of a seq-file
in the networking code.

From YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

commit | commitdiff | tree

Greg Ungerer [Sun, 6 Jul 2003 12:25:03 +0000 (05:25 -0700)]

[PATCH] flat loader v850 specific support abstracted

Architecture specific flat loader code for v850 moved into its
own v850 flat.h header. This patch also adds supporti for a number
of relocation cases that need to be handled at laod time.

Most of this code is originally from Miles Bader <miles@gnu.org>.

commit | commitdiff | tree

Greg Ungerer [Sun, 6 Jul 2003 12:24:52 +0000 (05:24 -0700)]

[PATCH] flat loader m68knommu specific support abstracted

Architecture specific flat loader code for m68knommu moved into its
own m68knommu flat.h header. Part of the shared library flat loader
update.

commit | commitdiff | tree

Greg Ungerer [Sun, 6 Jul 2003 12:24:20 +0000 (05:24 -0700)]

[PATCH] flat loader H8/300 specific support abstracted

Architecture specific flat loader code for H8/300 moved into its
own H8/300 flat.h header.

commit | commitdiff | tree

Greg Ungerer [Sun, 6 Jul 2003 12:24:13 +0000 (05:24 -0700)]

[PATCH] shared library support for MMUless binfmt_flat loader

This patch adds shared library support to the MMU application
loader, binfmt_flat. This is not new, it is a forward port from the
same support in 2.4.x kernels with MMUless support, and has been
running for well over a year now. The code support is conditionally
compiled on CONFIG_BINFMT_FLAT_SHARED. This change also abstracts
a bit more architecture dependent code into the separate flat.h
includes.

Basically relocations within an application also carry a tag to
identify what they refer too (this code or which shared library).
This is patched as before at load/run-time with an appropriate
address.

commit | commitdiff | tree

Steve French [Sun, 6 Jul 2003 11:24:48 +0000 (04:24 -0700)]

Signing fixes part 4 of 4

commit | commitdiff | tree

Greg Ungerer [Sun, 6 Jul 2003 10:20:56 +0000 (03:20 -0700)]

[PATCH] simplify access_ok() for all m68knommu targets

Unify access_ok for all m68knommu targets. All targets use the
common linker script and have common end symbols. So now we can
just use a simple check.

commit | commitdiff | tree

Greg Ungerer [Sun, 6 Jul 2003 10:20:31 +0000 (03:20 -0700)]

[PATCH] remove unused register from clobber list in down_trylock()

Remove "%d0" register from clobber list of down_trylock() for
m68knommu. It is not used by the asm code here at all.

commit | commitdiff | tree

Greg Ungerer [Sun, 6 Jul 2003 10:20:14 +0000 (03:20 -0700)]

[PATCH] force PAGE_SIZE to be an unsigned long

Force PAGE_SIZE for the m68knommu architecture to be an unsigned long.
This makes it consistent with all other architectures and cleans up
a load of compiler warnings.

commit | commitdiff | tree

Greg Ungerer [Sun, 6 Jul 2003 10:20:00 +0000 (03:20 -0700)]

[PATCH] conditional ROMfs copy for Motorola M5307C3 board

Conditionally copy the ROMfs filesystem on the Motorola M5307C3
target board only if using a ROMfs.

commit | commitdiff | tree

Greg Ungerer [Sun, 6 Jul 2003 10:19:51 +0000 (03:19 -0700)]

[PATCH] selection of boot parameters at configure time for Motorola 5282 targets

Allow setting boot time parameters at configuration for Motorola
5282 targets.

commit | commitdiff | tree

Linus Torvalds [Sun, 6 Jul 2003 06:23:55 +0000 (23:23 -0700)]

Simplify and speed up mmap read-around handling

This improves cold-cache program startup noticeably for me, and
simplifies the read-ahead logic at the same time. The rules for
read-ahead are:

- if the vma is marked random, we just do the regular one-page case.
   Obvious.

- if the vma is marked "linear access", we use the regular readahead
   code. No change in behaviour there (well, we also only consider it a
   _miss_ if it was marked linear access - the "readahead" and
   "readaround"  things are now totally independent of each other)

- otherwise, we look at how many hits/misses we've had for this
   particular file open for mmap, and if we've had noticeably more
   misses than hits, we don't bother with read-around.

In particular, this means that the "real" read-ahead logic literally
only needs to worry about finding sequential accesses, and does not
have to worry about the common executable mmap access patthers that
have very different behaviour.

Some constant tweaking may be a good idea.

commit | commitdiff | tree

Ingo Molnar [Sun, 6 Jul 2003 05:58:33 +0000 (22:58 -0700)]

[PATCH] another timer overflow thing

in add_timer_internal() we simply leave the timer pending forever if the
expiry is in more than 0xffffffff jiffies. This means more than 48 days on
eg. ia64 - which is not an unrealistic timeout. IIRC crond is happy to use
extremely large timeouts.

It's better to time out early (if you can call 48 days "early") than to
not time out at all.

commit | commitdiff | tree

Bernardo Innocenti [Sun, 6 Jul 2003 05:58:25 +0000 (22:58 -0700)]

[PATCH] Fix do_div() for all architectures

This offers a generic do_div64() that actually does the right thing,
unlike some architectures that "optimized" the 64-by-32 divide into
just a 32-bit divide.

Both ppc and sh were already providing an assembly optimized
__div64_32().  I called my function the same, so that their optimized
versions will automatically override mine in lib.a.

I've only tested extensively on m68knommu (uClinux) and made
sure generated code is reasonably short. Should be ok also on
parisc, since it's the same algorithm they were using before.

- add generic C implementations of the do_div() for 32bit and 64bit
   archs in asm-generic/div64.h;

- add generic library support function __div64_32() to handle the
   full 64/32 case on 32bit archs;

- kill multiple copies of generic do_div() in architecture
   specific subdirs. Most copies were either buggy or not doing
   what they were supposed to do;

- ensure all surviving instances of do_div() have their parameters
   correctly parenthesized to avoid funny side-effects;

commit | commitdiff | tree

Paul Fulghum [Sun, 6 Jul 2003 05:58:17 +0000 (22:58 -0700)]

[PATCH] synclink_cs.c update

Fix arbitration between net open and tty open.

Cleanup missed bits of CUA device removal changes.

commit | commitdiff | tree

Paul Fulghum [Sun, 6 Jul 2003 05:58:09 +0000 (22:58 -0700)]

[PATCH] synclinkmp.c update

Fix arbitration between net open and tty open.

Clean up unused locals resulting from latest tty changes.

commit | commitdiff | tree

Paul Fulghum [Sun, 6 Jul 2003 05:58:02 +0000 (22:58 -0700)]

[PATCH] synclink.c update

Fix arbitration between net open and tty open.

Cleanup unused local resulting from latest tty changes.

commit | commitdiff | tree

Benjamin Herrenschmidt [Sun, 6 Jul 2003 05:33:43 +0000 (22:33 -0700)]

[PATCH] fix IDE init oops on PowerMac

From Mikael Petterson:

  Booting kernel 2.5.74 on a PowerMac with CONFIG_BLK_DEV_IDE_PMAC=y
  results in an oops during IDE init, and the box then reboots.

  The patch below updates drivers/ide/ppc/pmac.c to also set up the
  hwif->ide_dma_queued_off and hwif->ide_dma_queued_on function
  pointers, which fixes the oops. Tested on my ancient PM4400.

commit | commitdiff | tree

Pavel Machek [Sun, 6 Jul 2003 05:33:35 +0000 (22:33 -0700)]

[PATCH] New maintainter for nbd

I no longer have the time/interest in nbd, and Paul agreed to take it
over.

commit | commitdiff | tree

Anton Blanchard [Sun, 6 Jul 2003 03:39:12 +0000 (20:39 -0700)]

[PATCH] enable device mapper in compat layer

The compat ioctls for device mapper were not being enabled due to an
incorrect config option.

commit | commitdiff | tree

Andrew Morton [Sat, 5 Jul 2003 11:08:19 +0000 (04:08 -0700)]

[PATCH] Improve mmap readaround

This tweaks the mmap read-ahead behaviour so that the prefaulting
is largely pointless.

- double the minimum readaround chunksize in page_cache_readaround().

- when a seek is detected, collapse the window more slowly.

commit | commitdiff | tree

Steve French [Sat, 5 Jul 2003 04:19:58 +0000 (21:19 -0700)]

Signing fixes part 3

commit | commitdiff | tree

Krzysztof Halasa [Sat, 5 Jul 2003 03:17:46 +0000 (20:17 -0700)]

[PATCH] C99 initializers in hdlc_generic.c

commit | commitdiff | tree

Andrew Morton [Sat, 5 Jul 2003 02:38:27 +0000 (19:38 -0700)]

[PATCH] i2o_scsi build fix

i2o_scsi.c now needs pci.h.

commit | commitdiff | tree

Andrew Morton [Sat, 5 Jul 2003 02:38:20 +0000 (19:38 -0700)]

[PATCH] fix rfcomm oops

From: ilmari@ilmari.org (Dagfinn Ilmari Mannsaker)

It turns out that net/bluetooth/rfcomm/sock.c (and
net/bluetooth/hci_sock.c) had been left out when net_proto_family gained an
owner field, here's a patch that fixes them both.

commit | commitdiff | tree

Andrew Morton [Sat, 5 Jul 2003 02:38:13 +0000 (19:38 -0700)]

[PATCH] MTD build fix for old gcc's

From: junkio@cox.net

Sigh. Is there a gcc option to tell it to not accept this incompatible C99
extension?

commit | commitdiff | tree

Andrew Morton [Sat, 5 Jul 2003 02:38:06 +0000 (19:38 -0700)]

[PATCH] fix current->user->__count leak

From: Arvind Kandhare <arvind.kan@wipro.com>

When switch_uid is called, the reference count of the new user is
incremented twice.  I think the increment in the switch_uid is done because
of the reparent_to_init() function which does not increase the __count for
root user.

But if switch_uid is called from any other function, the reference count is
already incremented by the caller by calling alloc_uid for the new user.
Hence the count is incremented twice.  The user struct will not be deleted
even when there are no processes holding a reference count for it.  This
does not cause any problem currently because nothing is dependent on timely
deletion of the user struct.

commit | commitdiff | tree

Andrew Morton [Sat, 5 Jul 2003 02:38:00 +0000 (19:38 -0700)]

[PATCH] epoll: microoptimisations

From: Davide Libenzi <davidel@xmailserver.org>

- Inline eventpoll_release() so that __fput() does not need to call in
epoll code if the file itself is not registered inside an epoll fd

- Add <linux/types.h> inclusion due __u32 and __u64 usage

- Fix debug printf that would otherwise panic if enabled with the new
epoll code

commit | commitdiff | tree

Andrew Morton [Sat, 5 Jul 2003 02:37:54 +0000 (19:37 -0700)]

[PATCH] bootmem.c cleanups

From: Davide Libenzi <davidel@xmailserver.org>

- Remove a couple of impossible debug checks (unsigneds cannot be
negative!)

- If __alloc_bootmem_core() fails with a goal and unaligned node_boot_start
it'll loop fovever.

commit | commitdiff | tree

Andrew Morton [Sat, 5 Jul 2003 02:37:46 +0000 (19:37 -0700)]

[PATCH] after exec_mmap(), exec cannot fail

If de_thread() fails in flush_old_exec() then we try to fail the execve().

That is a bad move, because exec_mmap() has already switched the current
process over to the new mm.  The new process is not yet sufficiently set up
to handle the error and the kernel doublefaults and dies.  exec_mmap() is the
point of no return.

Change flush_old_exec() to call de_thread() before running exec_mmap() so the
execing program sees the error.  I added fault injection to both de_thread()
and exec_mmap() - everything now survives OK.

commit | commitdiff | tree

Andrew Morton [Sat, 5 Jul 2003 02:37:40 +0000 (19:37 -0700)]

[PATCH] block allocation comments

From: Nick Piggin <piggin@cyberone.com.au>

Add some comments to the request allocation code.

commit | commitdiff | tree

Andrew Morton [Sat, 5 Jul 2003 02:37:34 +0000 (19:37 -0700)]

[PATCH] get_io_context fixes

- pass gfp_flags to get_io_context(): not all callers are forced to use
  GFP_ATOMIC().

- fix locking in get_io_context(): bump the refcount whilein the exclusive
  region.

- don't go oops in get_io_context() if the kmalloc failed.

- in as_get_io_context(): fail the whole thing if we were unable to
  allocate the AS-specific part.

- as_remove_queued_request() cleanup

commit | commitdiff | tree

Andrew Morton [Sat, 5 Jul 2003 02:37:26 +0000 (19:37 -0700)]

[PATCH] block request batching

From: Nick Piggin <piggin@cyberone.com.au>

The following patch gets batching working how it should be.

After a process is woken up, it is allowed to allocate up to 32 requests
for 20ms.  It does not stop other processes submitting requests if it isn't
submitting though.  This should allow less context switches, and allow
batches of requests from each process to be sent to the io scheduler
instead of 1 request from each process.

tiobench sequential writes are more than tripled, random writes are nearly
doubled over mm1.  In earlier tests I generally saw better CPU efficiency
but it doesn't show here.  There is still debug to be taken out.  Its also
only on UP.

                                Avg     Maximum     Lat%   Lat%   CPU
Identifier    Rate  (CPU%)  Latency   Latency     >2s    >10s   Eff
------------------- ------ --------- ---------- ------- ------ ----
-2.5.71-mm1   11.13 3.783%    46.10    24668.01   0.84   0.02   294
+2.5.71-mm1   13.21 4.489%    37.37     5691.66   0.76   0.00   294

Random Reads
------------------- ------ --------- ---------- ------- ------ ----
-2.5.71-mm1    0.97 0.582%   519.86     6444.66  11.93   0.00   167
+2.5.71-mm1    1.01 0.604%   484.59     6604.93  10.73   0.00   167

Sequential Writes
------------------- ------ --------- ---------- ------- ------ ----
-2.5.71-mm1    4.85 4.456%    77.80    99359.39   0.18   0.13   109
+2.5.71-mm1   14.11 14.19%    10.07    22805.47   0.09   0.04    99

Random Writes
------------------- ------ --------- ---------- ------- ------ ----
-2.5.71-mm1    0.46 0.371%    14.48     6173.90   0.23   0.00   125
+2.5.71-mm1    0.86 0.744%    24.08     8753.66   0.31   0.00   115

It decreases context switch rate on IBM's 8-way on ext2 tiobench 64 threads
from ~2500/s to ~140/s on their regression tests.

commit | commitdiff | tree

Andrew Morton [Sat, 5 Jul 2003 02:37:19 +0000 (19:37 -0700)]

[PATCH] generic io contexts

From: Nick Piggin <piggin@cyberone.com.au>

Generalise the AS-specific per-process IO context so that other IO schedulers
could use it.

commit | commitdiff | tree

Andrew Morton [Sat, 5 Jul 2003 02:37:12 +0000 (19:37 -0700)]

[PATCH] block batching fairness

From: Nick Piggin <piggin@cyberone.com.au>

This patch fixes the request batching fairness/starvation issue.  Its not
clear what is going on with 2.4, but it seems that its a problem around this
area.

Anyway, previously:

* request queue fills up
* process 1 calls get_request, sleeps
* a couple of requests are freed
* process 2 calls get_request, proceeds
* a couple of requests are freed
* process 2 calls get_request...

Now as unlikely as it seems, it could be a problem.  Its a fairness problem
that process 2 can skip ahead of process 1 anyway.

With the patch:

* request queue fills up
* any process calling get_request will sleep
* once the queue gets below the batch watermark, processes
  start being worken, and may allocate.

This patch includes Chris Mason's fix to only clear queue_full when all tasks
have been woken.  Previously I think starvation and unfairness could still
occur.

With this change to the blk-fair-batches patch, Chris is showing some much
improved numbers for 2.4 - 170 ms max wait vs 2700ms without blk-fair-batches
for a dbench 90 run.  He didn't indicate how much difference his patch alone
made, but it is an important fix I think.

commit | commitdiff | tree

Andrew Morton [Sat, 5 Jul 2003 02:37:05 +0000 (19:37 -0700)]

[PATCH] handle OOM in get_request_wait()

From: Nick Piggin <piggin@cyberone.com.au>

If there are no requess in flight against the target device and
get_request() fails, nothing will wake us up. Fix.

commit | commitdiff | tree

Andrew Morton [Sat, 5 Jul 2003 02:36:59 +0000 (19:36 -0700)]

[PATCH] allow the IO scheduler to pass an allocation hint to

From: Nick Piggin <piggin@cyberone.com.au>

This patch implements a hint so that AS can tell the request allocator to
allocate a request even if there are none left (the accounting is quite
flexible and easily handles overallocations).

elv_may_queue semantics have changed from "the elevator does _not_ want
another request allocated" to "the elevator _insists_ that another request is
allocated". I couldn't see any harm ;)

Now in practice, AS will only allow _1_ request over the limit, because as
soon as the request is sent to AS, it stops anticipating.

commit | commitdiff | tree

Andrew Morton [Sat, 5 Jul 2003 02:36:51 +0000 (19:36 -0700)]

[PATCH] blk_congestion_wait threshold cleanup

From: Nick Piggin <piggin@cyberone.com.au>

Now that we are counting requests (not requests free), this patch changes
the congested & batch watermarks to be more logical. Also a minor fix to
the sysfs code.

commit | commitdiff | tree

Andrew Morton [Sat, 5 Jul 2003 02:36:44 +0000 (19:36 -0700)]

[PATCH] per queue nr_requests

From: Nick Piggin <piggin@cyberone.com.au>

This gets rid of the global queue_nr_requests and usage of BLKDEV_MAX_RQ
(the latter is now only used to set the queues' defaults).

The queue depth becomes per-queue, controlled by a sysfs entry.

commit | commitdiff | tree

Andrew Morton [Sat, 5 Jul 2003 02:36:37 +0000 (19:36 -0700)]

[PATCH] Use kblockd for running request queues

Using keventd for running request_fns is risky because keventd itself can
block on disk I/O. Use the new kblockd kernel threads for the generic
unplugging.

commit | commitdiff | tree

Andrew Morton [Sat, 5 Jul 2003 02:36:30 +0000 (19:36 -0700)]

[PATCH] anticipatory I/O scheduler

From: Nick Piggin <piggin@cyberone.com.au>

This is the core anticipatory IO scheduler.  There are nearly 100 changesets
in this and five months work.  I really cannot describe it fully here.

Major points:

- It works by recognising that reads are dependent: we don't know where the
  next read will occur, but it's probably close-by the previous one.  So once
  a read has completed we leave the disk idle, anticipating that a request
  for a nearby read will come in.

- There is read batching and write batching logic.

  - when we're servicing a batch of writes we will refuse to seek away
    for a read for some tens of milliseconds.  Then the write stream is
    preempted.

  - when we're servicing a batch of reads (via anticipation) we'll do
    that for some tens of milliseconds, then preempt.

- There are request deadlines, for latency and fairness.
  The oldest outstanding request is examined at regular intervals. If
  this request is older than a specific deadline, it will be the next
  one dispatched. This gives a good fairness heuristic while being simple
  because processes tend to have localised IO.

Just about all of the rest of the complexity involves an array of fixups
which prevent most of teh obvious failure modes with anticipation: trying to
not leave the disk head pointlessly idle.  Some of these algorithms are:

- Process tracking.  If the process whose read we are anticipating submits
  a write, abandon anticipation.

- Process exit tracking.  If the process whose read we are anticipating
  exits, abandon anticipation.

- Process IO history.  We accumulate statistical info on the process's
  recent IO patterns to aid in making decisions about how long to anticipate
  new reads.

  Currently thinktime and seek distance are tracked. Thinktime is the
  time between when a process's last request has completed and when it
  submits another one. Seek distance is simply the number of sectors
  between each read request. If either statistic becomes too high, the
  it isn't anticipated that the process will submit another read.

The above all means that we need a per-process "io context".  This is a fully
refcounted structure.  In this patch it is AS-only.  later we generalise it a
little so other IO schedulers could use the same framework.

- Requests are grouped as synchronous and asynchronous whereas deadline
  scheduler groups requests as reads and writes. This can provide better
  sync write performance, and may give better responsiveness with journalling
  filesystems (although we haven't done that yet).

  We currently detect synchronous writes by nastily setting PF_SYNCWRITE in
  current->flags.  The plan is to remove this later, and to propagate the
  sync hint from writeback_contol.sync_mode into bio->bi_flags thence into
  request->flags.  Once that is done, direct-io needs to set the BIO sync
  hint as well.

- There is also quite a bit of complexity gone into bashing TCQ into
  submission. Timing for a read batch is not started until the first read
  request actually completes. A read batch also does not start until all
  outstanding writes have completed.

AS is the default IO scheduler.  deadline may be chosen by booting with
"elevator=deadline".

There are a few reasons for retaining deadline:

- AS is often slower than deadline in random IO loads with large TCQ
  windows. The usual real world task here is OLTP database loads.

- deadline is presumably more stable.

- deadline is much simpler.

The tunable per-queue entries under /sys/block/*/iosched/ are all in
milliseconds:

* read_expire

  Controls how long until a request becomes "expired".

  It also controls the interval between which expired requests are served,
  so set to 50, a request might take anywhere < 100ms to be serviced _if_ it
  is the next on the expired list.

  Obviously it can't make the disk go faster.  Result is basically the
  timeslice a reader gets in the presence of other IO.  100*((seek time /
  read_expire) + 1) is very roughly the % streaming read efficiency your disk
  should get in the presence of multiple readers.

* read_batch_expire

  Controls how much time a batch of reads is given before pending writes
  are served.  Higher value is more efficient.  Shouldn't really be below
  read_expire.

* write_ versions of the above

* antic_expire

  Controls the maximum amount of time we can anticipate a good read before
  giving up.  Many other factors may cause anticipation to be stopped early,
  or some processes will not be "anticipated" at all.  Should be a bit higher
  for big seek time devices though not a linear correspondance - most
  processes have only a few ms thinktime.

commit | commitdiff | tree

Andrew Morton [Sat, 5 Jul 2003 02:36:23 +0000 (19:36 -0700)]

[PATCH] elevator completion API

From: Nick Piggin <piggin@cyberone.com.au>

Introduces an elevator_completed_req() callback with which the generic
queueing layer may tell an IO scheduler that a particualr request has
finished.

commit | commitdiff | tree

Andrew Morton [Sat, 5 Jul 2003 02:36:16 +0000 (19:36 -0700)]

[PATCH] elv_may_queue() API function

Introduces the elv_may_queue() predicate with which the IO scheduler may tell
the generic request layer that we may add another request to this queue.

It is used by the CFQ elevator.

commit | commitdiff | tree

Andrew Morton [Sat, 5 Jul 2003 02:36:09 +0000 (19:36 -0700)]

[PATCH] Create `kblockd' workqueue

keventd is inappropriate for running block request queues because keventd
itself can get blocked on disk I/O. Via call_usermodehelper()'s vfork and,
presumably, GFP_KERNEL allocations.

So create a new gang of kernel threads whose mandate is for running low-level
disk operations. It must ever block on disk IO, so any memory allocations
should be GFP_NOIO.

We mainly use it for running unplug operations from interrupt context.

commit | commitdiff | tree

Andrew Morton [Sat, 5 Jul 2003 02:36:03 +0000 (19:36 -0700)]

[PATCH] bring back the batch_requests function

From: Nick Piggin <piggin@cyberone.com.au>

The batch_requests function got lost during the merge of the dynamic request
allocation patch.

We need it for the anticipatory scheduler - when the number of threads
exceeds the number of requests, the anticipated-upon task will undesirably
sleep in get_request_wait().

And apparently some block devices which use small requests need it so they
string a decent number together.

Jens has acked this patch.

commit | commitdiff | tree

Andrew Morton [Sat, 5 Jul 2003 02:35:55 +0000 (19:35 -0700)]

[PATCH] ipc semaphore optimization

From: "Chen, Kenneth W" <kenneth.w.chen@intel.com>

This patch proposes a performance fix for the current IPC semaphore
implementation.

There are two shortcoming in the current implementation:
try_atomic_semop() was called two times to wake up a blocked process,
once from the update_queue() (executed from the process that wakes up
the sleeping process) and once in the retry part of the blocked process
(executed from the block process that gets woken up).

A second issue is that when several sleeping processes that are eligible
for wake up, they woke up in daisy chain formation and each one in turn
to wake up next process in line.  However, every time when a process
wakes up, it start scans the wait queue from the beginning, not from
where it was last scanned.  This causes large number of unnecessary
scanning of the wait queue under a situation of deep wait queue.
Blocked processes come and go, but chances are there are still quite a
few blocked processes sit at the beginning of that queue.

What we are proposing here is to merge the portion of the code in the
bottom part of sys_semtimedop() (code that gets executed when a sleeping
process gets woken up) into update_queue() function.  The benefit is two
folds: (1) is to reduce redundant calls to try_atomic_semop() and (2) to
increase efficiency of finding eligible processes to wake up and higher
concurrency for multiple wake-ups.

We have measured that this patch improves throughput for a large
application significantly on a industry standard benchmark.

This patch is relative to 2.5.72.  Any feedback is very much
appreciated.

Some kernel profile data attached:

  Kernel profile before optimization:
  -----------------------------------------------
                0.05    0.14   40805/529060      sys_semop [133]
                0.55    1.73  488255/529060      ia64_ret_from_syscall
[2]
[52]     2.5    0.59    1.88  529060         sys_semtimedop [52]
                0.05    0.83  477766/817966      schedule_timeout [62]
                0.34    0.46  529064/989340      update_queue [61]
                0.14    0.00 1006740/6473086     try_atomic_semop [75]
                0.06    0.00  529060/989336      ipcperms [149]
  -----------------------------------------------

                0.30    0.40  460276/989340      semctl_main [68]
                0.34    0.46  529064/989340      sys_semtimedop [52]
[61]     1.5    0.64    0.87  989340         update_queue [61]
                0.75    0.00 5466346/6473086     try_atomic_semop [75]
                0.01    0.11  477676/576698      wake_up_process [146]
  -----------------------------------------------
                0.14    0.00 1006740/6473086     sys_semtimedop [52]
                0.75    0.00 5466346/6473086     update_queue [61]
[75]     0.9    0.89    0.00 6473086         try_atomic_semop [75]
  -----------------------------------------------

  Kernel profile with optimization:

  -----------------------------------------------
                0.03    0.05   26139/503178      sys_semop [155]
                0.46    0.92  477039/503178      ia64_ret_from_syscall
[2]
[61]     1.2    0.48    0.97  503178         sys_semtimedop [61]
                0.04    0.79  470724/784394      schedule_timeout [62]
                0.05    0.00  503178/3301773     try_atomic_semop [109]
                0.05    0.00  503178/930934      ipcperms [149]
                0.00    0.03   32454/460210      update_queue [99]
  -----------------------------------------------
                0.00    0.03   32454/460210      sys_semtimedop [61]
                0.06    0.36  427756/460210      semctl_main [75]
[99]     0.4    0.06    0.39  460210         update_queue [99]
                0.30    0.00 2798595/3301773     try_atomic_semop [109]
                0.00    0.09  470630/614097      wake_up_process [146]
  -----------------------------------------------
                0.05    0.00  503178/3301773     sys_semtimedop [61]
                0.30    0.00 2798595/3301773     update_queue [99]
[109]    0.3    0.35    0.00 3301773         try_atomic_semop [109]
  -----------------------------------------------=20

Both number of function calls to try_atomic_semop() and update_queue()
are reduced by 50% as a result of the merge.  Execution time of
sys_semtimedop is reduced because of the reduction in the low level
functions.

commit | commitdiff | tree

Andrew Morton [Sat, 5 Jul 2003 02:35:49 +0000 (19:35 -0700)]

[PATCH] PCI domain scanning fix

From: Matthew Wilcox <willy@debian.org>

ppc64 oopses on boot because pci_scan_bus_parented() is unexpectedly
returning NULL. Change pci_scan_bus_parented() to correctly handle
overlapping PCI bus numbers on different domains.

commit | commitdiff | tree

Linus Torvalds [Sat, 5 Jul 2003 02:22:49 +0000 (19:22 -0700)]

Merge bk://ppc.bkbits.net/for-linus-ppc
into home.osdl.org:/home/torvalds/v2.5/linux

commit | commitdiff | tree

Paul Mackerras [Sat, 5 Jul 2003 23:13:11 +0000 (09:13 +1000)]

Merge samba.org:/home/paulus/kernel/linux-2.5
into samba.org:/home/paulus/kernel/for-linus-ppc

commit | commitdiff | tree

Ulrich Drepper [Fri, 4 Jul 2003 11:57:37 +0000 (04:57 -0700)]

[PATCH] wrong pid in siginfo_t

If a signal is sent via kill() or tkill() the kernel fills in the wrong
PID value in the siginfo_t structure (obviously only if the handler has
SA_SIGINFO set).

POSIX specifies the the si_pid field is filled with the process ID, and
in Linux parlance that's the "thread group" ID, not the thread ID.

commit | commitdiff | tree

Linus Torvalds [Fri, 4 Jul 2003 10:53:27 +0000 (03:53 -0700)]

When forcing through a signal for some thread-synchronous
event (ie SIGSEGV, SIGFPE etc that happens as a result of a
trap as opposed to an external event), if the signal is
blocked we will not invoce a signal handler, we will just
kill the thread with the signal.

This is equivalent to what we do in the SIG_IGN case: you
cannot ignore or block synchronous signals, and if you try,
we'll just have to kill you.

We don't want to handle endless recursive faults, which the
old behaviour easily led to if the stack was bad, for example.

commit | commitdiff | tree

Steve French [Fri, 4 Jul 2003 10:52:02 +0000 (03:52 -0700)]

Merge bk://linux.bkbits.net/linux-2.5
into hostme.bitkeeper.com:/repos/c/cifs/linux-2.5cifs

commit | commitdiff | tree

Linus Torvalds [Fri, 4 Jul 2003 10:24:32 +0000 (03:24 -0700)]

Go back to defaulting to 6-byte commands for MODE SENSE,
since some drivers seem to be unhappy about the 10-byte
version.

The subsystem configuration can override this (eg USB or
ide-scsi).

commit | commitdiff | tree

Marc Zyngier [Fri, 4 Jul 2003 10:00:47 +0000 (03:00 -0700)]

[PATCH] EISA: avoid unnecessary probing

- By default, do not try to probe the bus if the mainboard does not
seems to support EISA (allow this behaviour to be changed through a
command-line option).

commit | commitdiff | tree

Marc Zyngier [Fri, 4 Jul 2003 10:00:39 +0000 (03:00 -0700)]

[PATCH] EISA: PCI-EISA dma_mask

- Use parent bridge device dma_mask as default for each discovered
device.

commit | commitdiff | tree

Marc Zyngier [Fri, 4 Jul 2003 10:00:33 +0000 (03:00 -0700)]

[PATCH] EISA: PA-RISC changes

- Probe the right number of EISA slots on PA-RISC. No more, no less.

Unnamed repository; edit this file 'description' to name the repository.