Pavel Roskin [Mon, 19 Apr 2004 09:04:06 +0000 (05:04 -0400)]
[PATCH] Tulip endianess fix
My tulip ethernet card doesn't work on Blue&White G3 PowerMac with Linux
2.6.5-rc2. The card is shown by lspci as
01:03.0 Ethernet controller: Linksys Network Everywhere Fast Ethernet
10/100 model NC100 (rev 11)
The kernel detects it as "ADMtek Comet rev 17".
The MAC address reported by the kernel looked obviously wrong. Also, I
could only ping the system successfully if the interface was in promiscuous
mode (running Ethereal).
Those two symptoms indicated two different problems - one for reading the
MAC address from the card on module load (tulip_init_one), and the other
for writing the address to the card when the interface was brought up
(tulip_up). I have fixed both, and here's the explanation:
tulip_init_one:
When reading the first 4 bytes of the address, inl() returns the same data
to the CPU on all platforms, interpreting the data from the lowest port
address as the least significant byte. In other words, I/O is little
endian on all platforms; it's the memory that differs across platforms.
We want to write the data to memory preserving little-endianness of the
PCI bus. To force little endian write to the memory, the data should be
converted to the little endian format.
When reading the remaining 2 bytes, the CPU gets them in 2 least
significant bytes. To write those 2 bytes to the memory in a 16-bit
operation, they should be byte-swapped for the 16-bit operation.
tulip_up:
The first 4 bytes are processed correctly, but the code is confusing.
Reading from memory needs conversion to CPU format, while writing to I/O
ports doesn't. So I replaced cpu_to_le32() with le32_to_cpu().
The second 2 bytes are read in a 16-bit memory operation, so they should
be passed to le16_to_cpu() rather than cpu_to_le32() to make them CPU
independent and suitable for outl().
All those conversions do nothing on little-endian machines, so they should
not be affected.
The patch has been tested. The driver is working fine. ping is OK, ssh
is OK, X11 over ssh is OK. Even netconsole is working fine.
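The byte-order reasoning above can be tried in user space. This is only a sketch under stated assumptions: to_le32() and to_le16() are hypothetical stand-ins for the kernel's cpu_to_le32() and cpu_to_le16(), store_mac() is not a function from the driver, and the register values used with it are made up rather than read from a Comet chip.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for cpu_to_le32(): returns a value whose bytes in
 * memory are in little-endian order on any host. */
static uint32_t to_le32(uint32_t v)
{
    uint8_t b[4] = { (uint8_t)v, (uint8_t)(v >> 8),
                     (uint8_t)(v >> 16), (uint8_t)(v >> 24) };
    uint32_t out;

    memcpy(&out, b, sizeof(out));
    return out;
}

/* Hypothetical stand-in for cpu_to_le16(). */
static uint16_t to_le16(uint16_t v)
{
    uint8_t b[2] = { (uint8_t)v, (uint8_t)(v >> 8) };
    uint16_t out;

    memcpy(&out, b, sizeof(out));
    return out;
}

/* Store a 6-byte MAC address from two register values as the CPU sees
 * them: lo carries address bytes 0..3 (byte 0 least significant), hi
 * carries bytes 4..5 in its two least significant bytes. */
void store_mac(uint8_t mac[6], uint32_t lo, uint32_t hi)
{
    uint32_t w = to_le32(lo);
    uint16_t h = to_le16((uint16_t)hi);

    assert(mac != NULL);
    memcpy(mac, &w, sizeof(w));      /* 32-bit store of bytes 0..3 */
    memcpy(mac + 4, &h, sizeof(h));  /* 16-bit store of bytes 4..5 */
}
```

On a little-endian host both helpers return their argument unchanged, which matches the remark that the conversions do nothing there.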
Russell King [Mon, 19 Apr 2004 08:50:20 +0000 (04:50 -0400)]
[PATCH] fix arm/etherh.c
On Tue, Apr 13, 2004 at 02:35:40PM -0400, Jeff Garzik wrote:
> Russell,
>
> Would you be willing to provide an updated diff of this?
I didn't particularly like the PRIV() method implemented previously -
gcc appears to want to avoid some optimisations if it's an inline
function rather than a macro.
Also, 'ei_local' may look unused in some functions, but it's your
typical hidden-use-in-a-macro crap which 8390 likes.
In systems with mixed network cards and all drivers compiled into
the kernel, the PCI device (eth0) will get probed first, before the ISA.
The problem is that the ISA device can mistakenly try to probe
for eth0, and the ISA driver will not detect the failure
until it goes to call register_netdevice. Not all drivers have
perfect error unwind code.
This patch short circuits the device probe, so it won't bother
looking for devices that already are registered.
Adrian Bunk [Mon, 19 Apr 2004 08:43:04 +0000 (04:43 -0400)]
[PATCH] fix warning in drivers/net/tulip/timer.c
I get the following warning in 2.6.5-mm6 and 2.6.6-rc1:
<-- snip -->
...
CC drivers/net/tulip/timer.o
drivers/net/tulip/timer.c: In function `comet_timer':
drivers/net/tulip/timer.c:156: warning: unused variable `ioaddr'
...
<-- snip -->
Since the
[netdrvr tulip] add MII support for Comet chips
patch has removed the only use of this variable, the fix is simple:
Chris Wright [Mon, 19 Apr 2004 08:26:30 +0000 (04:26 -0400)]
[PATCH] wan sdla: fix probable security hole
> [BUG] minor
> /home/kash/linux/linux-2.6.5/drivers/net/wan/sdla.c:1206:sdla_xfer:
> ERROR:TAINT: 1201:1206:Passing unbounded user value "(mem).len" as arg 0
> to function "kmalloc", which uses it unsafely in model
> [SOURCE_MODEL=(lib,copy_from_user,user,taintscalar)]
> [SINK_MODEL=(lib,kmalloc,user,trustingsink)] [MINOR] [PATH=] [Also
> used at, line 1219 in argument 0 to function "kmalloc"]
> static int sdla_xfer(struct net_device *dev, struct sdla_mem *info, int
> read)
> {
> struct sdla_mem mem;
> char *temp;
>
> Start --->
> if(copy_from_user(&mem, info, sizeof(mem)))
> return -EFAULT;
>
> if (read)
> {
> Error --->
> temp = kmalloc(mem.len, GFP_KERNEL);
> if (!temp)
> return(-ENOMEM);
> sdla_read(dev, mem.addr, temp, mem.len);
Hrm, I believe you could use this to read 128k of kernel memory.
sdla_read() takes len as a short, whereas mem.len is an int. So,
if mem.len == 0x20000, the allocation could still succeed. When cast
to short, len will be 0x0, causing the read loop to copy nothing into
the buffer. At least it's protected by a capable() check. I don't
know what the proper upper bound is for this hardware, or how much it's
used/cared about. A simple memset() is the trivial fix.
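The truncation and the memset() remedy can be sketched in user space. Everything here is an assumption made for illustration: fake_read() is a hypothetical stand-in for sdla_read() that only shares its short length parameter, xfer_read() is not the driver's function, and malloc() plays the role of kmalloc().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for sdla_read(): like the routine discussed
 * above, it takes the length as a short. It marks every byte it writes
 * with 0xAB. */
void fake_read(char *buf, short len)
{
    assert(buf != NULL);
    if (len > 0)
        memset(buf, 0xAB, (size_t)len);
}

/* Allocate a transfer buffer and zero it before reading, so any bytes
 * the read never reaches cannot expose stale heap contents. */
char *xfer_read(int len)
{
    char *temp;

    if (len <= 0)
        return NULL;
    temp = malloc((size_t)len);
    if (!temp)
        return NULL;
    memset(temp, 0, (size_t)len);   /* the simple fix */
    /* Narrowing int to short is implementation-defined; common
     * compilers keep the low 16 bits, so 0x20000 becomes 0. */
    fake_read(temp, (short)len);
    return temp;
}
```

With len == 0x20000 the read copies nothing on such compilers, and the caller gets zeroes back instead of whatever the allocator left in the buffer. Bounding len against a hardware limit would be a stricter fix, but that limit is not known here.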
Andrew Morton [Mon, 19 Apr 2004 05:06:30 +0000 (22:06 -0700)]
[PATCH] From: David Gibson <david@gibson.dropbear.id.au>
hugepage_vma() is both misleadingly named and unnecessary. On most archs it
always returns NULL, and on IA64 the vma it returns is never used. The
function's real purpose is to determine whether the address it is passed is a
special hugepage address which must be looked up in hugepage pagetables,
rather than being looked up in the normal pagetables (which might have
specially marked hugepage PMDs or PTEs).
This patch kills off hugepage_vma() and folds the logic it really needs into
follow_huge_addr(). That now returns a (page *) if called on a special
hugepage address, and an error encoded with ERR_PTR otherwise. This also
requires tweaking the IA64 code to check that the hugepage PTE is present in
follow_huge_addr() - previously this was guaranteed, since it was only called
if the address was in an existing hugepage VMA, and hugepages are always
prefaulted.
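For readers unfamiliar with the ERR_PTR convention, here is a user-space imitation. It is a sketch, not the kernel's err.h: the sketch_* names, the errno limit of 4095, the address boundary, and the choice of -EINVAL are all assumptions made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define SKETCH_MAX_ERRNO 4095   /* assumed limit for this sketch */

/* Encode a negative errno value in the top of the pointer range. */
static inline void *sketch_err_ptr(long err)
{
    assert(err < 0 && err >= -SKETCH_MAX_ERRNO);
    return (void *)(intptr_t)err;
}

/* Recover the errno value from an encoded pointer. */
static inline long sketch_ptr_err(const void *p)
{
    return (long)(intptr_t)p;
}

/* True if the pointer falls in the range reserved for errors. */
static inline int sketch_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

struct page { int id; };
static struct page sketch_huge_page = { 42 };

/* Hypothetical lookup: addresses at or above a made-up boundary count
 * as special hugepage addresses and yield a page; anything else yields
 * an encoded error, so callers fall back to the normal pagetables. */
struct page *sketch_follow_huge_addr(unsigned long addr)
{
    const unsigned long region_start = 0x80000000UL;

    if (addr < region_start)
        return sketch_err_ptr(-EINVAL);
    return &sketch_huge_page;
}
```

A caller tests the result with sketch_is_err() before dereferencing it, which removes the need for a separate predicate such as hugepage_vma().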
Andrew Morton [Mon, 19 Apr 2004 05:06:16 +0000 (22:06 -0700)]
[PATCH] Fix default value for commit interval for older reiserfs filesystems.
From: Bart Samwel <bart@samwel.tk>
The reiserfs patch that adds support for "commit=0" saves the default max
commit age in a variable when the fs is originally mounted, so that it can
later restore it. Unfortunately it makes some mistakes with that:
- The default is not saved when the original mount has a commit=NNN option.
- The default is not correctly saved for older reiserfs filesystems, where
the default was not stored on disk.
Andrew Morton [Mon, 19 Apr 2004 05:05:51 +0000 (22:05 -0700)]
[PATCH] Increase number of dynamic inodes in procfs
From: Nathan Lynch <nathanl@austin.ibm.com>
On some larger ppc64 configurations /proc/device-tree is exhausting procfs'
dynamic (non-pid) inode range (16K). This patch makes the dynamic inode
range 0xf0000000-0xffffffff and changes the inode number allocator to use
the idr.c allocator for the first-fit allocations.
A few weeks ago, Pavel and I agreed that PF_IOTHREAD should be renamed to
PF_NOFREEZE. This reflects the fact that some threads so marked aren't
actually used for IO while suspending, but simply shouldn't be frozen.
This patch, against 2.6.5 vanilla, applies that change. In the
refrigerator calls, the actual value doesn't matter (so long as it's
non-zero) and it makes more sense to use PF_FREEZE so I've used that.
From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This patch adds the posix message queue syscalls to ppc32 and 64 and fixes
our implementation of compat copy siginfo to 32 bits userland which wasn't
using the si_code but still doing a switch/case on the signal number.
Russell King [Sun, 18 Apr 2004 23:32:07 +0000 (00:32 +0100)]
[ARM] Clean up ARM includes
This removes a number of unnecessary includes from the ARM specific
files throughout the kernel. Most notably asm/pgalloc.h is
needlessly included in several places. There were some places
including it as a means to get at the cache flushing functions,
so this has been corrected.
This is my brown paper bag day, I sent you the wrong patch for
fixing the deadlock in rtas.c, here's one to apply on top of current
bk that fixes build.
Reduce the locking coverage of the oft-used j_list_lock: the per-bh
jbd_lock_bh_state() gives us sufficient locking of buffer_head and
journal_head internals.
Andrew Morton [Sun, 18 Apr 2004 03:55:18 +0000 (20:55 -0700)]
[PATCH] rmap: nonlinear truncation
From: Hugh Dickins <hugh@veritas.com>
The earlier changes introducing PageAnon left truncated pages mapped into
nonlinear vmas unswappable. Once we go to object-based rmap, it's
impossible to find where file page is mapped once page->mapping cleared:
switching them to anonymous is odd, and breaks strict commit accounting.
So now handle truncation of nonlinear vmas correctly. And factor in
Daniel's cluster filesystem needs while we're there: when invalidating
local cache, we do want to unmap shared pages from all mms, but we do not
want to discard private COWed modifications of those pages (which
truncation discards to satisfy the SIGBUS semantics demanded by specs).
Drew from Daniel's patch (LKML 2 Mar 04), but didn't always follow it;
fewer name changes, but still some - "unmap" rather than "invalidate".
zap_page_range is not exported, safe to give it and all the too-many layers
an extra zap_details arg, in normal cases just NULL.
Given details, zap_pte_range checks page mapping or index to skip anon or
untruncated pages. I didn't realize before implementing, that in nonlinear
case, it should set a file pte when truncating - otherwise linear pages
might appear in place of SIGBUS. I suspect this implies that ->populate
functions ought to set file ptes beyond EOF instead of failing, but haven't
changed them as yet.
To avoid making yet another copy of that ugly linear pgidx test, added
inline function linear_page_index (to pagemap.h to get PAGE_CACHE_SIZE,
though as usual things don't really work if it differs from PAGE_SIZE).
Ooh, I thought I'd removed ___add_to_page_cache last time, do so now.
unmap_page_range static, shift its hugepage check up into sole caller
unmap_vmas. Killed "killme" debug from unmap_vmas, not seen it trigger.
unmap_mapping_range is exported without restriction: I'm one of those who
believe it should be generally available. But I'm wrongly placed to decide
that, probably just sob quietly to myself if _GPL added later.
Andrew Morton [Sun, 18 Apr 2004 03:55:06 +0000 (20:55 -0700)]
[PATCH] rmap: swap_unplug page
From: Hugh Dickins <hugh@veritas.com>
Good example of "swapper_space considered harmful": swap_unplug_io_fn was
originally designed for calling via swapper_space.backing_dev_info; but
that way it loses track of which device is to be unplugged, so had to
unplug all swap devices. But now sync_page tests SwapCache anyway, can
call swap_unplug_io_fn with page, which leads direct to the device.
Reverted -mc4's CONFIG_SWAP=n fix, just add another NOTHING for it.
Reverted -mc3's editorial adjustments to swap_backing_dev_info and
swapper_space initializations: they document the few fields which are
actually used now, as comment above them says (sound of slapped wrist).
Andrew Morton [Sun, 18 Apr 2004 03:54:52 +0000 (20:54 -0700)]
[PATCH] rmap: flush_dcache revisited
From: Hugh Dickins <hugh@veritas.com>
One of the callers of flush_dcache_page is do_generic_mapping_read, where
file is read without i_sem and without page lock: concurrent truncation may
at any moment remove page from cache, NULLing ->mapping, making
flush_dcache_page liable to oops. Put result of page_mapping in a local
variable and apply mapping_mapped to that (if we were to check for NULL
within mapping_mapped, it's unclear whether to say yes or no).
parisc and arm do have other locking unsafety in their i_mmap(_shared)
searching, but that's a larger issue to be dealt with down the line.
Andrew Morton [Sun, 18 Apr 2004 03:54:27 +0000 (20:54 -0700)]
[PATCH] Fix unix module
From: Rusty Russell <rusty@rustcorp.com.au>
# lsmod
Module Size Used by
1 26060 6
#
The compiler #define's unix to 1, and we use -DKBUILD_MODNAME=unix. We used to
#undef unix at the top of af_unix.c, but now that the name is inserted by
modpost, that doesn't help.
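The mechanism can be reproduced with a few lines of preprocessor code. This is a sketch: SKETCH_MODNAME stands in for -DKBUILD_MODNAME=unix, and whether the first function yields "1" depends on the compiler predefining unix, which GCC does in its GNU dialects on Unix-like targets but not in strict ISO mode.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_STR(x)  #x
#define SKETCH_XSTR(x) SKETCH_STR(x)   /* expand macros in x, then stringify */

#define SKETCH_MODNAME unix   /* plays the role of -DKBUILD_MODNAME=unix */

/* Expanded while `unix` may still be a predefined macro: the name can
 * come out as "1". */
const char *name_before(void)
{
    return SKETCH_XSTR(SKETCH_MODNAME);
}

#undef unix

/* Expanded after the #undef: the name comes out as "unix". */
const char *name_after(void)
{
    return SKETCH_XSTR(SKETCH_MODNAME);
}
```

The #undef only helps code expanded after it in the same file, which is why it stopped helping once the name was generated elsewhere.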
Andrew Morton [Sun, 18 Apr 2004 03:54:15 +0000 (20:54 -0700)]
[PATCH] ppc64: Fix CPU hot unplug deadlock
From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
My RTAS locking fixes incorrectly added a spinlock around the function used
to stop a CPU. That function never returns, so the lock becomes stale.
The correct fix is to disable interrupts instead (the RTAS params being
per-CPU, this should be safe enough).
Russell King [Sat, 17 Apr 2004 23:19:03 +0000 (00:19 +0100)]
[ARM] Add detailed documentation concerning ARM page tables
This adds detailed documentation concerning how we map the Linux
page table structure onto the hardware tables on ARM. In addition,
it also adds documentation describing how we emulate the "dirty"
and "young" or "accessed" page table bits.
This should be of interest to Linux MM developers.
It occurred to me that if vma and new_vma are one and the same, then
vma_relink_file will not do a good job of linking it after itself - in
that pretty unlikely case when move_page_tables fails.
And more generally, whenever copy_vma's vma_merge succeeds, we have no
guarantee that old vma comes before new_vma in the i_mmap lists, as we
need to satisfy Rajesh's point: that ordering is only guaranteed in the
newly allocated case.
We have to abandon the ordering method when/if we move from lists to
prio_trees, so this patch switches to the less glamorous use of
i_shared_sem exclusion, as in my prio_tree mremap.
Pavel Roskin [Sat, 17 Apr 2004 10:41:18 +0000 (11:41 +0100)]
[PCMCIA] Conversion to module_param
Patch from: Pavel Roskin
As it turns out, mixing MODULE_PARM and module_param in one module is
wrong. The parameters specified in module_param are ignored. I've just
posted a patch to LKML that will detect this condition and warn about it.
The new debugging code used the new-style module_param, which means that
all instances of MODULE_PARM should be converted. The attached patch does
that.
An additional bonus is that module_param_array provides the number of
array elements. This allowed me to change tcic.c and i82365.c to use
this number for IRQ list. This change was tested with i82365. If
"irq_list" is not specified, irq_list_count is 0.
I set all permissions to 0444 to be safe. I think we have no secrets
from the users regarding those parameters. If some parameters can be
changed safely at the runtime, the permissions could be changed to 0644.
I didn't examine how safe (and how useful) it would be, so it's 0444 for
now.
Andrew Morton [Sat, 17 Apr 2004 10:32:46 +0000 (03:32 -0700)]
[PATCH] ARM-related ptep_to_address() fix
From: William Lee Irwin III <wli@holomorphy.com>
rmk mentioned that ARM was borked as the relation, assumed by generic rmap,
PTRS_PER_PTE*sizeof(pte_t) == PAGE_SIZE, fails to hold. The following
patch, developed jointly with him (or depending on POV, by him with me
acting as codemonkey), is reported to resolve the issue.
Specifically, while ARM dedicates an entire PAGE_SIZE -sized block of
memory to each PTE table, the PTE table itself only spans half that, the
remainder being dedicated to hardware-interpreted structures. As the
hardware structure must be contiguous, wider ptes can't be used. So the
core-visible PTE table only spans PAGE_SIZE/2 bytes, violating the
assumption. This corrects masking and scaling done in ptep_to_address().
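The arithmetic behind the broken assumption can be worked through with illustrative numbers. This is a sketch, not the patched ptep_to_address(): the constants (4096-byte pages, 4-byte PTEs, 512 entries per table) and both function names are assumptions chosen so that the table spans PAGE_SIZE/2.

```c
#include <assert.h>

/* Illustrative constants; assumptions, not values taken from the patch. */
#define SK_PAGE_SIZE    4096UL
#define SK_PTE_SIZE     4UL     /* sizeof(pte_t) */
#define SK_PTRS_PER_PTE 512UL   /* table spans 512 * 4 = PAGE_SIZE / 2 */

/* Scaling that assumes PTRS_PER_PTE * sizeof(pte_t) == PAGE_SIZE:
 * mask with the page size, multiply the byte offset by PTRS_PER_PTE. */
unsigned long offset_assuming_full_page(unsigned long ptep)
{
    return (ptep & (SK_PAGE_SIZE - 1)) * SK_PTRS_PER_PTE;
}

/* Scaling that uses the real table span and entry size: mask with the
 * span, turn the byte offset into an entry index, then scale by the
 * page size. */
unsigned long offset_using_table_span(unsigned long ptep)
{
    unsigned long span = SK_PTRS_PER_PTE * SK_PTE_SIZE;
    unsigned long index;

    assert(span <= SK_PAGE_SIZE);
    index = (ptep & (span - 1)) / SK_PTE_SIZE;
    return index * SK_PAGE_SIZE;
}
```

For the fourth entry of a table (byte offset 12), the first function gives 12 * 512 = 6144, while the second gives 3 * 4096 = 12288, the offset that entry actually maps.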
Andrew Morton [Sat, 17 Apr 2004 10:29:12 +0000 (03:29 -0700)]
[PATCH] Fix bogus get_page() calls in hugepage code
From: David Gibson <david@gibson.dropbear.id.au>
Some versions of follow_huge_addr() and follow_huge_pmd() are doing a
get_page() on the target page. They shouldn't: follow_page() returns an
unpinned page and it is the caller's responsibility to pin the page (if
desired) before dropping page_table_lock.
Andrew Morton [Sat, 17 Apr 2004 10:28:13 +0000 (03:28 -0700)]
[PATCH] reiserfs: fsync() speedup
From: Chris Mason <mason@suse.com>
Updates the reiserfs-logging improvements to use schedule_timeout instead of
yield when letting the transaction grow a little before forcing a commit for
fsync/O_SYNC/O_DIRECT.
Also, when one process forces a transaction to end and plans on doing the
commit (like fsync), it sets a flag on the transaction so the journal code
knows not to bother kicking the journal work queue.
queue_delayed_work is used so that if we get a bunch of tiny transactions
ended quickly, we aren't constantly kicking the work queue.
These significantly improve reiserfs performance during fsync heavy
workloads.
Andrew Morton [Sat, 17 Apr 2004 10:28:02 +0000 (03:28 -0700)]
[PATCH] Add "commit=0" to reiserfs
From: Bart Samwel <bart@samwel.tk>
Add support for value 0 to the commit option of reiserfs. A value of 0 means
"restore to the default value". For the maximum commit age, this default value is
normally read from the journal; this patch adds an extra variable to cache
the default value for the maximum commit age.
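The caching idea can be sketched as follows. The struct and function names are hypothetical and do not come from the reiserfs source; the sketch only shows a default being remembered at mount time and restored when commit=0 is given.

```c
#include <assert.h>

/* Hypothetical journal state for the sketch. */
struct sk_journal {
    unsigned int max_commit_age;         /* value currently in effect */
    unsigned int default_max_commit_age; /* cached default for commit=0 */
};

/* At mount time, remember the default before any option is applied. */
void sk_journal_init(struct sk_journal *j, unsigned int default_age)
{
    assert(j);
    j->default_max_commit_age = default_age;
    j->max_commit_age = default_age;
}

/* Apply a commit=N option: 0 restores the cached default, any other
 * value overrides the maximum commit age. */
void sk_set_commit(struct sk_journal *j, unsigned int commit)
{
    assert(j);
    j->max_commit_age = commit ? commit : j->default_max_commit_age;
}
```

Caching the default unconditionally in the init step, rather than only when no commit option is present, keeps it available for a later commit=0.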
Andrew Morton [Sat, 17 Apr 2004 10:27:51 +0000 (03:27 -0700)]
[PATCH] ppc64: hugepage cleanup
From: David Gibson <david@gibson.dropbear.id.au>
This is a small cleanup to the PPC64 hugepage code. It removes an
unhelpful function, removing some studlyCaps in the process. It was
originally this way to match the normal page path, but that has all been
rewritten since.
Andrew Morton [Sat, 17 Apr 2004 10:27:40 +0000 (03:27 -0700)]
[PATCH] Fix mq 32-bit compatibility
From: Jakub Jelinek <jakub@redhat.com>
The first change removes just a useless put_user (si_int and si_ptr are
part of the same union, si_ptr is on all arches covering whole union), the
rest is fixes for signal handling of SI_MESGQ.
From: Andros: Implement server-side reboot recovery (server now handles
open and lock reclaims). Not completely to spec: we don't yet store the
state in stable storage that would be required to recover correctly in
certain situations.
Andrew Morton [Sat, 17 Apr 2004 10:26:39 +0000 (03:26 -0700)]
[PATCH] kNFSdv4: Set credentials properly when putrootfh is used
From: NeilBrown <neilb@cse.unsw.edu.au>
The credentials (uid/gid) of a process are set when a filehandle is
verified. Nfsv4 allows requests without an explicit filehandle (instead,
an implicit 'root' filehandle) so we must make sure the credentials are set
for these requests too.
From: "J. Bruce Fields" <bfields@fieldses.org>
From: Andros: added a call to nfsd_setuser in nfsd4_putrootfh so that nfsd
runs as the rpc->cred user.
Andrew Morton [Sat, 17 Apr 2004 10:26:28 +0000 (03:26 -0700)]
[PATCH] kNFSdv4: Improve how locking copes with replays
From: NeilBrown <neilb@cse.unsw.edu.au>
From: "J. Bruce Fields" <bfields@fieldses.org>
From: Andros: Hold state_lock longer so the stateowner doesn't disappear
out from under us before we get the chance to encode the replay. Don't
attempt to save replay if we failed to find a stateowner.
Andrew Morton [Sat, 17 Apr 2004 10:25:57 +0000 (03:25 -0700)]
[PATCH] kNFSdv4: Keep state to allow replays for 'close' to work.
From: NeilBrown <neilb@cse.unsw.edu.au>
From: "J. Bruce Fields" <bfields@fieldses.org>
From: Andros: Idea is to keep around a list of openowners recently released
by closes, and make sure they stay around long enough so that replays still
work.
Andrew Morton [Sat, 17 Apr 2004 10:25:10 +0000 (03:25 -0700)]
[PATCH] dm: avoid ioctl buffer overrun
From: Kevin Corry <kevcorry@us.ibm.com>
dm-ioctl.c::retrieve_status(): Prevent overrunning the ioctl buffer by making
sure we don't call the target status routine with a buffer size limit of
zero. [Kevin Corry, Alasdair Kergon]
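The kind of guard described can be sketched in user space. None of this is the dm-ioctl.c code: sk_target_status(), sk_retrieve_status() and SK_BUFFER_FULL are hypothetical names, and the status routine is written to misbehave on a zero limit only to show why the caller checks first.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SK_BUFFER_FULL 1   /* hypothetical "out of space" result */

/* Hypothetical target status routine: writes a NUL-terminated string of
 * at most maxlen bytes and, like many such routines, assumes maxlen is
 * at least 1. */
static void sk_target_status(char *out, size_t maxlen)
{
    const char *s = "ok";
    size_t n = strlen(s);

    if (n > maxlen - 1)     /* would wrap around if maxlen were 0 */
        n = maxlen - 1;
    memcpy(out, s, n);
    out[n] = '\0';
}

/* Append one target's status to buf, which already holds `used` bytes
 * out of `len`. The status routine is never called with a zero limit. */
int sk_retrieve_status(char *buf, size_t len, size_t used)
{
    size_t remaining;

    assert(buf != NULL && used <= len);
    remaining = len - used;
    if (remaining == 0)
        return SK_BUFFER_FULL;
    sk_target_status(buf + used, remaining);
    return 0;
}
```

Reporting a full buffer lets the caller retry with more space instead of having the status routine write past the end.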
Andrew Morton [Sat, 17 Apr 2004 10:24:54 +0000 (03:24 -0700)]
[PATCH] dm: fix a comment
From: Kevin Corry <kevcorry@us.ibm.com>
Clarify the comment regarding the "next" field in struct dm_target_spec. The
"next" field has different behavior if you're performing a DM_TABLE_STATUS
command than it does if you're performing a DM_TABLE_LOAD command.
See populate_table() and retrieve_status() in drivers/md/dm-ioctl.c for more
details on how this field is used.