Rusty Russell [Tue, 3 Sep 2002 15:03:36 +0000 (08:03 -0700)]
[PATCH] list_t removal
This removes list_t, which is a gratuitous typedef for a "struct
list_head". Unless there is good reason, the kernel doesn't usually
typedef, as typedefs cannot be predeclared unlike structs.
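
To illustrate the predeclaration point, here is a minimal sketch in plain C. The names struct watcher, list_init and list_is_empty are hypothetical and are not taken from the patch; only struct list_head mirrors the kernel type.

```c
#include <assert.h>
#include <stddef.h>

/* A struct tag can be forward-declared: this line is all a header needs
 * in order to pass around pointers to the type. */
struct list_head;

struct watcher {			/* hypothetical user of the list type */
	struct list_head *queue;	/* fine: pointer to an incomplete type */
};

/* A typedef name such as list_t offers no equivalent of the line above;
 * users need the header that defines it. */

/* The full definition, as the list header would provide it. */
struct list_head {
	struct list_head *next, *prev;
};

/* Make `head` an empty circular list. */
static void list_init(struct list_head *head)
{
	assert(head != NULL);
	head->next = head;
	head->prev = head;
}

/* An empty list is one whose head points back at itself. */
static int list_is_empty(const struct list_head *head)
{
	return head->next == head;
}
```

struct watcher compiles before the full definition of struct list_head is visible, which is the property the typedef cannot offer.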
This makes daemonize() call reparent_to_init() itself, as long
suggested for 2.5, and fixes the callers so they don't call it again.
Also fixes callers which set current->tty to NULL themselves (also
no longer necessary).
RELOC_HIDE got miscompiled on gcc3.1/x86-64 in the access to softirq.c's per
cpu variables. This fixes the problem.
Clearly to hide the relocation the addition needs to be done after the
value obfuscation, not before.
I don't know if it triggers on other architectures (x86-64 is especially
stressful here because it has negative kernel addresses), but it seems like the
right thing to do.
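
A sketch of the ordering described above, assuming GCC or Clang (statement expressions, __typeof__, inline asm). It follows the general shape of the macro and is not copied from the patch; hidden_elem is a hypothetical helper added only to exercise it.

```c
#include <assert.h>
#include <stddef.h>

/* The empty asm makes the compiler forget where __ptr came from.
 * Only after that obfuscation is the offset added. */
#define RELOC_HIDE(ptr, off)					\
	({ unsigned long __ptr;					\
	   __asm__ ("" : "=r" (__ptr) : "0" (ptr));		\
	   (__typeof__(ptr)) (__ptr + (off)); })

/* Hypothetical helper: address of element `idx`, reached via RELOC_HIDE. */
static int *hidden_elem(int *base, unsigned long idx)
{
	assert(base != NULL);
	return RELOC_HIDE(base, idx * sizeof(int));
}
```

Doing the addition inside the asm input instead would let the compiler fold the offset into the relocation before the value is hidden, which is the situation the message describes.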
In sd.c we call MODE SENSE (6) in order to find out whether the
device is write protected. The info we need is in byte 2, the
header of the MODE SENSE answer, but in the request we have to
specify (i) what page(s) we want, and (ii) how many bytes we want.
Long ago we asked for 12 bytes from page 1 (Daniel Roche, 1.3.35).
Matthew Dharm made this 8 bytes from page 3F (all pages), patch-2.4.0-test8.
In patch-2.4.10 the 8 was increased to 255.
I found on the one hand devices that only react to page 0
(the vendor page), and return an error for page 3F.
And on the other hand devices that are unable to handle requests
for more bytes than they actually have.
So, it seems that the cautious way to ask for MODE SENSE data is
to first ask for the header only, see how much is available,
and then ask for everything.
The patch below first separates out the MODE SENSE call,
and then tries it three times: on all pages (3F), only the first
four bytes; on the vendor page (0), only the first four bytes;
on all pages (3F), 255 bytes.
This should be at least as robust as our current code.
I tried it on 8 SCSI devices (of which 2 fail under 2.5.33)
and found no problems.
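
The three-step probing can be sketched in userspace C as follows. This is an illustration under stated assumptions, not the sd.c code: mode_sense_fn, read_write_protect and vendor_only_device are hypothetical names, and the device is simulated by a callback.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical transport hook: issue MODE SENSE (6) for `page`, asking for
 * `len` bytes into `buf`. Returns 0 on success, nonzero on error. */
typedef int (*mode_sense_fn)(int page, unsigned char *buf, size_t len);

/* Try the three requests in the order given above and read the
 * write-protect bit (bit 7 of header byte 2).
 * Returns 1 if write protected, 0 if not, -1 if every attempt failed. */
static int read_write_protect(mode_sense_fn mode_sense)
{
	static const struct { int page; size_t len; } attempts[] = {
		{ 0x3F, 4 },	/* all pages, header only */
		{ 0x00, 4 },	/* vendor page, header only */
		{ 0x3F, 255 },	/* all pages, 255 bytes */
	};
	unsigned char buf[255];
	size_t i;

	assert(mode_sense != NULL);
	for (i = 0; i < sizeof(attempts) / sizeof(attempts[0]); i++) {
		memset(buf, 0, sizeof(buf));
		if (mode_sense(attempts[i].page, buf, attempts[i].len) == 0)
			return (buf[2] & 0x80) ? 1 : 0;
	}
	return -1;
}

/* Simulated device that rejects page 3F and only answers page 0. */
static int vendor_only_device(int page, unsigned char *buf, size_t len)
{
	if (page != 0x00 || len < 4)
		return -1;
	buf[0] = 3;	/* mode data length */
	buf[2] = 0x80;	/* write protected */
	return 0;
}
```

With the simulated device, the first attempt fails and the second succeeds, so the 255-byte request is never sent.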
Andrew Morton [Tue, 3 Sep 2002 12:34:07 +0000 (05:34 -0700)]
[PATCH] discontigmem support for ia32 NUMA
- All the support macros which assume a linear mem_map[] have been
wrapped in !CONFIG_DISCONTIGMEM. pfn_to_page, page_to_pfn,
page_to_phys, pmd_page, kern_addr_valid.
- Move some initialisation macros into setup.h so they can be used in
the i386 discontig.c (INITRD_START, INITRD_SIZE).
Andrew Morton [Tue, 3 Sep 2002 12:33:56 +0000 (05:33 -0700)]
[PATCH] reorganise setup_arch() for ia32 discontigmem
This restructures setup_arch() for i386 to make it easier to include the
i386 numa changes (for CONFIG_DISCONTIGMEM) I've been working on. It
also makes setup_arch() easier to read. A version of this patch is in
the 2.4 aa tree.
This does not depend on the other patches I'm submitting today, but my
discontigmem patch does depend on this one.
I've tested this patch on the following configurations: UP, SMP, SMP
PAE, multiquad, multiquad PAE.
Andrew Morton [Tue, 3 Sep 2002 12:33:51 +0000 (05:33 -0700)]
[PATCH] convert node/zone_start_paddr to pfns
I've had ia32-discontigmem under test for a month, uneventfully. Possibly
because I don't have a machine to test it on....
A major part of this work is a general move to convert the low-level
memory management to consistently use pageframe numbers. It's a bit
schizo at present..
This patch was written by Martin Bligh. A version of this patch is in
the 2.4 aa tree.
It changes the unsigned longs node_start_paddr and zone_start_paddr to
page frame numbers. This is necessary because a PAE address is 36 bits
and cannot be represented in an unsigned long.
- The per-node physical memory start address node_start_paddr becomes
a pfn, node_start_pfn.
- The per-zone physical memory start address zone_start_paddr becomes
a pfn, zone_start_pfn.
- free_area_init_node() takes a pfn rather than a physical address.
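
A small worked example of why a pfn fits where a PAE physical address does not. It assumes 4 KiB pages and models the 32-bit unsigned long of i386 with uint32_t; paddr_to_pfn and pfn_to_paddr are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12	/* 4 KiB pages */

/* Convert a physical address (up to 36 bits under PAE) to a page frame
 * number. 36 - 12 = 24 significant bits, so the result fits in 32 bits. */
static uint32_t paddr_to_pfn(uint64_t paddr)
{
	assert(paddr < (UINT64_C(1) << 36));
	return (uint32_t)(paddr >> PAGE_SHIFT);
}

/* Recover the page-aligned physical address from a pfn. */
static uint64_t pfn_to_paddr(uint32_t pfn)
{
	return (uint64_t)pfn << PAGE_SHIFT;
}
```

For a node starting at 33 GiB (0x840000000), truncating the address to 32 bits loses the top bits, while the pfn 0x840000 round-trips exactly.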
Patricia has tested this patch on the following configurations: UP,
SMP, SMP PAE, multiquad, multiquad PAE, multiquad DISCONTIGMEM,
multiquad DISCONTIGMEM PAE.
Robert Love [Tue, 3 Sep 2002 05:43:11 +0000 (22:43 -0700)]
[PATCH] bad: schedule() with irqs disabled!
OK, Linus, you are right... there are enough instances of this we are
not going to find them all (although I suspect Andrew's slab.c fixes
will cover most of the cases). Further, I think we should actually
purposely call preempt_schedule() in certain cases after interrupt
reenable to check for reschedules...
Let's just make it a rule "no preemption if interrupts are off" and
enforce that.
James Morris [Tue, 3 Sep 2002 05:40:11 +0000 (22:40 -0700)]
[PATCH] sigio/sigurg cleanup for 2.5.32
This is a cleanup of the sigio/sigurg code.
Summary:
o Removed sk->proc, SIGURG now sent via vfs, credentials checked
during delivery.
o SIOCSPGRP etc. ioctls use vfs, and work now for SIGIO as well
as SIGURG.
o Removed socket fcntl code.
o Consolidate lsm file_set_fowner() hooks.
o Fixed fowner race.
o Fixed associated mainline memory leak in fcntl_dirnotify().
Fix floppy driver end_request() handling - it used to do insane
contortions instead of just calling "end_that_request_first()" with
the proper sector count.
Major partial request completion boo-boo in the bio layer.
This was _bad_. Major floppy corruption, and possibly the reason
for other block device corruption for any driver that generated
partial results for a block device request.
kbuild: Remove CONFIG_DEVFS_FS from arch/ia64/config.in
Defining CONFIG_DEVFS_FS in two different places does not work correctly.
Nobody objected on linux-ia64, so I just removed the statements
unconditionally enabling devfs on IA64/SNI.
Ray Lee [Sun, 1 Sep 2002 08:58:09 +0000 (01:58 -0700)]
[PATCH] Re: 2.5.33 PNPBIOS does not compile
I don't know if the current form is harmless or not, but it is
definitely incorrect. The patch below corrects the compile failure, as
well as the multi-statement macro defines used in bare if statements;
please apply.
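
For readers unfamiliar with the macro problem mentioned above, a generic sketch follows. The macro and struct names are hypothetical and are not the PNPBIOS ones.

```c
#include <assert.h>
#include <stddef.h>

struct state { int count; int flags; };	/* hypothetical */

/* Problematic form, shown only as a comment:
 *     #define RESET_BAD(s)  (s)->count = 0; (s)->flags = 0
 * After a bare `if (x) RESET_BAD(s);` only the first assignment is
 * conditional; the second always runs. */

/* Usual fix: wrap the body so the macro expands to exactly one
 * statement that still takes a trailing semicolon. */
#define RESET_STATE(s)	do { (s)->count = 0; (s)->flags = 0; } while (0)

static void maybe_reset(struct state *s, int cond)
{
	assert(s != NULL);
	if (cond)
		RESET_STATE(s);	/* both assignments are guarded */
	else
		s->flags |= 1;
}
```

Because the wrapped macro is a single statement, it also works with an else branch, as in maybe_reset().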
[PATCH] remove BUG_ON(p->ptrace) in release_task()
The BUG_ON(p->ptrace) will be called if the CLONE_DETACH process is
traced. This patch removes BUG_ON(p->ptrace), and also removes a
workaround for it in sys_wait4().
David S. Miller [Sat, 31 Aug 2002 15:36:40 +0000 (08:36 -0700)]
[SPARC]: Fix build breakage.
- Update for PCI config space access api changes
- Handle new do_fork args
- Delete SET/CLEAR TID handling from copy_thread
- Update arch/sparc64/defconfig
Luca Barbieri [Sat, 31 Aug 2002 03:21:42 +0000 (20:21 -0700)]
[PATCH] Fix panic if pnpbios is enabled and speed up its check in
This fixes the pnpbios CS check to check for the correct values (it
wasn't up to date with the various GDT reshuffles), moves it inside the
kernel mode check, modifies it so that it takes fewer instructions and
marks it with unlikely().
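
As a rough illustration of the last two points, the sketch below shows an unlikely()-style branch hint and one common way to shorten a range check. The selector values and in_special_segment are hypothetical, and the single-comparison trick is an assumption about how such a check can be shortened, not necessarily what the patch does.

```c
#include <assert.h>

/* Branch hint for GCC/Clang: the condition is expected to be false. */
#define unlikely(x)	__builtin_expect(!!(x), 0)

/* Hypothetical selector range; the real values depend on the GDT layout. */
enum { SEL_LOW = 0x40, SEL_HIGH = 0x60 };

/* Return 1 if `cs` lies in [SEL_LOW, SEL_HIGH]. Values below SEL_LOW wrap
 * to a large unsigned number, so one comparison covers both bounds. */
static int in_special_segment(unsigned int cs)
{
	if (unlikely(cs - SEL_LOW <= (unsigned int)(SEL_HIGH - SEL_LOW)))
		return 1;
	return 0;
}
```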
Note that the 2.5.32 version of this check will cause the kernel to
always panic since it checks for the kernel segments and will thus
decide to jump to the pnpbios fault handler without being in pnpbios.
pnpbios_core.c instead seems to use the correct values.
Harald Welte [Fri, 30 Aug 2002 10:51:47 +0000 (03:51 -0700)]
[NETFILTER]: Fix OOPS in ipt_ULOG
- If an iptables rule is using --ulog-nlgroup with a
value greater than 4, we crash
- Remove a memory leak when someone throws packets at
the ulog netlink socket
Neil Brown [Fri, 30 Aug 2002 09:00:00 +0000 (02:00 -0700)]
[PATCH] PATCH - kNFSd - More small fixes for TCP nfsd
sk_inuse should be bigger than "char" as we can
have more than 255 server threads. Due to the way the count
is used, this is unlikely to actually cause a problem, but it
should nonetheless be fixed.
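
A small demonstration of the counter width issue, under assumptions: the old field is modelled as unsigned char and the new one as int, neither of which is confirmed by the message, and the struct and function names are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

struct sock_narrow { unsigned char inuse; };	/* assumed old width */
struct sock_wide   { int inuse; };		/* assumed new width */

/* Take `n` references on both counters and return the narrow one. */
static unsigned int take_refs(struct sock_narrow *a, struct sock_wide *b, int n)
{
	int i;

	assert(a != NULL && b != NULL && n >= 0);
	for (i = 0; i < n; i++) {
		a->inuse++;	/* wraps modulo UCHAR_MAX + 1 */
		b->inuse++;
	}
	return a->inuse;
}
```

With more references than the narrow type can count, the narrow counter wraps around while the wide one keeps the true value.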
Also, two printks generate more noise than we would like,
so turn them into dprintk (debugging printk).
Chuck Lever [Fri, 30 Aug 2002 08:58:33 +0000 (01:58 -0700)]
[PATCH] sock_writeable not appropriate for TCP sockets, for 2.5.32
sock_writeable determines whether there is space in a socket's output
buffer. socket write_space callbacks use it to determine whether to wake
up those that are waiting for more output buffer space.
however, sock_writeable is not appropriate for TCP sockets. because the
RPC layer's write_space callback uses it for TCP sockets, the RPC layer
hammers on sock_sendmsg with dozens of write requests that are only a few
hundred bytes long when it is trying to send a large write RPC request.
this patch adds logic to the RPC layer's write_space callback that
properly handles TCP sockets.
Chuck Lever [Fri, 30 Aug 2002 08:58:29 +0000 (01:58 -0700)]
[PATCH] prevent oops in xprt_lock_write, against 2.5.32
when several RPC requests want to reconnect a TCP transport socket at
once, xprt_lock_write serializes the tasks to prevent multiple socket
connects. however, TCP connects are always done by a RPC child task that
has no request slot. xprt_lock_write can oops if there is no request slot
allocated to the invoking RPC task. reviewed and accepted by Trond.
the xprt_lock_write changes are not yet in 2.4, so this patch does not
apply to 2.4.
Ingo Molnar [Fri, 30 Aug 2002 08:56:01 +0000 (01:56 -0700)]
[PATCH] scheduler fixes, 2.5.32-BK
This adds two scheduler related fixes:
- changes the migration code to use struct completion. Andrew pointed out
that there might be a small window where the up() touches the
semaphore while the waiting task goes on and frees its stack. And
completion is more suited for this kind of stuff anyway.
- removes two unneeded exports, pointed out by Andrew.
Ingo Molnar [Fri, 30 Aug 2002 08:55:56 +0000 (01:55 -0700)]
[PATCH] clone-cleanup 2.5.32-BK
This moves CLONE_SETTID and CLONE_CLEARTID handling into kernel/fork.c,
where it belongs. [the CLONE_SETTLS is x86-specific and thus remains in
the per-arch process.c] This makes support for these two new flags much
easier: architectures only have to pass in the user_tid pointer.
Andrew Morton [Fri, 30 Aug 2002 08:49:31 +0000 (01:49 -0700)]
[PATCH] O_DIRECT for ext3
O_DIRECT support for ext3.
It works OK in all journalling modes.
Updates to the file metadata and inode are journalled as usual.
If the system crashes during an appending O_DIRECT write then journal
recovery will truncate the written-to file back to the length which it
had on entry to that write.
If the system crashes during a file overwrite to existing blocks then
the file contents will be an unknown mixture of old and new.
If the system crashes during a file overwrite which instantiates new
blocks in the middle of the file then there is a possibility of
uninitialised disk blocks being present in the file post-recovery.