From: Linus Torvalds
Date: Fri, 23 Nov 2007 20:17:04 +0000 (-0500)
Subject: Linux 2.1.127pre2
X-Git-Tag: 2.1.127pre2
X-Git-Url: http://git.neil.brown.name/?a=commitdiff_plain;h=a93be80365075acdced695d8fe641e3a4b6c9536;p=history.git

Linux 2.1.127pre2

I just found a case that could certainly result in endless page faults,
and an endless stream of __get_free_page() calls. It's been there
forever, and I basically thought it could never happen, but thinking
about it some more it can happen a lot more easily than I thought.

The problem is that the page fault handling code will give up if it
cannot allocate a page table entry. We have code in place to handle the
final page allocation failure, but the "mid-way" failures just failed,
and caused the page fault to be done over and over again. More
importantly, this could happen from kernel mode when a system call was
trying to fill in a user page, in which case it wouldn't even be
interruptible.

It's really unlikely to happen (because the page tables tend to be set
up already), but I suspect it can be triggered by execve'ing a new
process which is not going to have any existing page tables. Even then
we're likely to have old pages available (the ones we free'd from the
previous process), but at least it doesn't sound impossible that this
could be a problem.

I've not seen this behaviour myself, but it could have caused Andrea's
problems, especially the harder to find ones. Andrea, can you check this
patch (against clean 2.1.126) out and see if it makes any difference to
your testing? (Right now it does the wrong error code: it will cause a
SIGSEGV instead of a SIGBUS when we run out of memory, but that's a
small detail).

Essentially, instead of trying to call "oom()" and sending a signal
(which doesn't work for kernel level accesses anyway), the code returns
the proper return value from handle_mm_fault(), which allows the caller
to do the right thing (which can include following the exception
tables).
That way we can handle the case of running out of memory from a kernel
mode access too.. (This is also why the fault gets the wrong signal - I
didn't bother to fix up the x86 fault handler all that much ;)

Btw, the reason I'm sending out these patches in emails instead of just
putting them on ftp.kernel.org is that the machine has had disk problems
for the last week, and finally gave up completely last Friday or so. So
ftp.kernel.org is down until we have a new raid array or the old one
magically recovers.

Sorry about the spamming.

		Linus
---

diff --git a/arch/i386/boot/tools/build.c b/arch/i386/boot/tools/build.c
index b5faf665c29c..25f7835206f5 100644
--- a/arch/i386/boot/tools/build.c
+++ b/arch/i386/boot/tools/build.c
@@ -151,7 +151,7 @@ int main(int argc, char ** argv)
 	if (setup_sectors < SETUP_SECTS)
 		setup_sectors = SETUP_SECTS;
 	fprintf(stderr, "Setup is %d bytes.\n", i);
-	memset(buf, sizeof(buf), 0);
+	memset(buf, 0, sizeof(buf));
 	while (i < setup_sectors * 512) {
 		c = setup_sectors * 512 - i;
 		if (c > sizeof(buf))
diff --git a/arch/i386/mm/fault.c b/arch/i386/mm/fault.c
index 8c4c6f7f41d9..a95a7329f0e3 100644
--- a/arch/i386/mm/fault.c
+++ b/arch/i386/mm/fault.c
@@ -156,7 +156,14 @@ good_area:
 		if (!(vma->vm_flags & (VM_READ | VM_EXEC)))
 			goto bad_area;
 	}
-	handle_mm_fault(tsk, vma, address, write);
+
+	/*
+	 * If for any reason at all we couldn't handle the fault,
+	 * make sure we exit gracefully rather than endlessly redo
+	 * the fault.
+	 */
+	if (!handle_mm_fault(tsk, vma, address, write))
+		goto bad_area;

 	/*
 	 * Did it hit the DOS screen memory VA from vm86 mode?
diff --git a/arch/mips/sgi/kernel/indy_sc.c b/arch/mips/sgi/kernel/indy_sc.c
index ddac5581685f..5ead0cba6aca 100644
--- a/arch/mips/sgi/kernel/indy_sc.c
+++ b/arch/mips/sgi/kernel/indy_sc.c
@@ -5,7 +5,6 @@
  * Copyright (C) 1997 Ralf Baechle (ralf@gnu.org),
  * derived from r4xx0.c by David S. Miller (dm@engr.sgi.com).
 */
-#include
 #include
 #include
 #include
diff --git a/arch/mips/sgi/kernel/setup.c b/arch/mips/sgi/kernel/setup.c
index 51eca15b0bf7..36cf97cd5213 100644
--- a/arch/mips/sgi/kernel/setup.c
+++ b/arch/mips/sgi/kernel/setup.c
@@ -5,6 +5,7 @@
  * Copyright (C) 1996 David S. Miller (dm@engr.sgi.com)
  * Copyright (C) 1997, 1998 Ralf Baechle (ralf@gnu.org)
 */
+#include
 #include
 #include
 #include
diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index c8959b0de644..ea81b7c99690 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -27,7 +27,6 @@
 #include

-#include
 #include
 #include
 #include
diff --git a/drivers/misc/parport_pc.c b/drivers/misc/parport_pc.c
index 7654c7cb0a80..fece4c0b2fcc 100644
--- a/drivers/misc/parport_pc.c
+++ b/drivers/misc/parport_pc.c
@@ -7,6 +7,8 @@
  * Andrea Arcangeli
  *
  * based on work by Grant Guenther and Phil Blundell.
+ *
+ * Cleaned up include files - Russell King
 */

/* This driver should work with any hardware that is broadly compatible
@@ -557,12 +559,12 @@ static int programmable_irq_support(struct parport *pb)
 static int irq_probe_ECP(struct parport *pb)
 {
 	int irqs, i;
-
+
 	sti();
 	irqs = probe_irq_on();
-	parport_pc_write_econtrol(pb, 0x00); /* Reset FIFO */
-	parport_pc_write_econtrol(pb, 0xd0); /* TEST FIFO + nErrIntrEn */
+	parport_pc_write_econtrol(pb, 0x00);	/* Reset FIFO */
+	parport_pc_write_econtrol(pb, 0xd0);	/* TEST FIFO + nErrIntrEn */

 	/* If Full FIFO sure that WriteIntrThresold is generated */
 	for (i=0; i < 1024 && !(parport_pc_read_econtrol(pb) & 0x02) ; i++)
diff --git a/drivers/misc/parport_procfs.c b/drivers/misc/parport_procfs.c
index 758015bd1f15..dd92e895728a 100644
--- a/drivers/misc/parport_procfs.c
+++ b/drivers/misc/parport_procfs.c
@@ -8,16 +8,11 @@
  *
  * based on work by Grant Guenther
  * and Philip Blundell
+ *
+ * Cleaned up include files - Russell King
 */

-#include
-#include
-#include
-#include
-#include
-#include
-#include
-
+#include
 #include
 #include
 #include
@@ -26,6 +21,11 @@
 #include
 #include
 #include
+#include
+
+#include
+#include
+#include

 struct proc_dir_entry *base = NULL;
diff --git a/drivers/misc/parport_share.c b/drivers/misc/parport_share.c
index 01123bc7bcb5..3e9b7ace163d 100644
--- a/drivers/misc/parport_share.c
+++ b/drivers/misc/parport_share.c
@@ -105,7 +105,7 @@ struct parport *parport_register_port(unsigned long base, int irq, int dma,
 	tmp->ops = ops;
 	tmp->number = portnum;
 	memset (&tmp->probe_info, 0, sizeof (struct parport_device_info));
-	spin_lock_init(&tmp->cad_lock);
+	tmp->cad_lock = RW_LOCK_UNLOCKED;
 	spin_lock_init(&tmp->waitlist_lock);
 	spin_lock_init(&tmp->pardevice_lock);
diff --git a/drivers/net/3c509.c b/drivers/net/3c509.c
index 081770e8c10c..58a44182e105 100644
--- a/drivers/net/3c509.c
+++ b/drivers/net/3c509.c
@@ -34,6 +34,7 @@
 	v1.10 4/21/97 Fixed module code so that multiple cards may be detected,
 		other cleanups. -djb
 	Andrea Arcangeli: Upgraded to Donald Becker's version 1.12.
+	Rick Payne: Fixed SMP race condition
 */

 static char *version = "3c509.c:1.12 6/4/97 becker@cesdis.gsfc.nasa.gov\n";
@@ -59,6 +60,7 @@ static char *version = "3c509.c:1.12 6/4/97 becker@cesdis.gsfc.nasa.gov\n";
 #include
 #include	/* for udelay() */
+#include

 #include
 #include
@@ -122,6 +124,7 @@ enum RxFilter {
 struct el3_private {
 	struct enet_statistics stats;
 	struct device *next_dev;
+	spinlock_t lock;
 	/* skb send-queue */
 	int head, size;
 	struct sk_buff *queue[SKB_QUEUE_SIZE];
@@ -401,6 +404,9 @@ el3_open(struct device *dev)
 	outw(RxReset, ioaddr + EL3_CMD);
 	outw(SetStatusEnb | 0x00, ioaddr + EL3_CMD);

+	/* Set the spinlock before grabbing IRQ!
+	 */
+	((struct el3_private *)dev->priv)->lock = (spinlock_t) SPIN_LOCK_UNLOCKED;
+
 	if (request_irq(dev->irq, &el3_interrupt, 0, "3c509", dev)) {
 		return -EAGAIN;
 	}
@@ -520,6 +526,11 @@ el3_start_xmit(struct sk_buff *skb, struct device *dev)
 	if (test_and_set_bit(0, (void*)&dev->tbusy) != 0)
 		printk("%s: Transmitter access conflict.\n", dev->name);
 	else {
+		unsigned long flags;
+
+		/* Spin on the lock, until we're clear of an IRQ */
+		spin_lock_irqsave(&lp->lock, flags);
+
 		/* Put out the doubleword header... */
 		outw(skb->len, ioaddr + TX_FIFO);
 		outw(0x00, ioaddr + TX_FIFO);
@@ -536,6 +547,8 @@ el3_start_xmit(struct sk_buff *skb, struct device *dev)
 		} else
 			/* Interrupt us when the FIFO has room for max-sized packet. */
 			outw(SetTxThreshold + 1536, ioaddr + EL3_CMD);
+
+		spin_unlock_irqrestore(&lp->lock, flags);
 	}

 	dev_kfree_skb (skb);
@@ -560,6 +573,7 @@ static void
 el3_interrupt(int irq, void *dev_id, struct pt_regs *regs)
 {
 	struct device *dev = (struct device *)dev_id;
+	struct el3_private *lp;
 	int ioaddr, status;
 	int i = INTR_WORK;
@@ -568,6 +582,9 @@ el3_interrupt(int irq, void *dev_id, struct pt_regs *regs)
 		return;
 	}

+	lp = (struct el3_private *)dev->priv;
+	spin_lock(&lp->lock);
+
 	if (dev->interrupt)
 		printk("%s: Re-entering the interrupt handler.\n", dev->name);
 	dev->interrupt = 1;
@@ -629,7 +646,7 @@ el3_interrupt(int irq, void *dev_id, struct pt_regs *regs)
 		printk("%s: exiting interrupt, status %4.4x.\n", dev->name,
 			   inw(ioaddr + EL3_STATUS));
 	}
-
+	spin_unlock(&lp->lock);
 	dev->interrupt = 0;
 	return;
 }
diff --git a/drivers/net/ne2.c b/drivers/net/ne2.c
index 8903f9698d7c..429699c34610 100644
--- a/drivers/net/ne2.c
+++ b/drivers/net/ne2.c
@@ -55,7 +55,6 @@ static const char *version = "ne2.c:v0.90 Oct 14 1998 David Weinehall \n";

 #include
-#include
 #include
 #include
diff --git a/drivers/scsi/ChangeLog.ncr53c8xx b/drivers/scsi/ChangeLog.ncr53c8xx
index ff1716e5ab0b..bd302921fc21 100644
--- a/drivers/scsi/ChangeLog.ncr53c8xx
+++ b/drivers/scsi/ChangeLog.ncr53c8xx
@@ -1,3 +1,26 @@
+Wed Oct 21 21:00 1998 Gerard Roudier (groudier@club-internet.fr)
+	* revision 3.1a
+	- Changes from Eddie Dost for Sparc and Alpha:
+	  ioremap/iounmap support for Sparc.
+	  pcivtophys changed to bus_dvma_to_phys.
+	- Add the 53c876 description to the chip table. This is only usefull
+	  for printing the right name of the controller.
+	- DEL-441 Item 2 work-around for the 53c876 rev <= 5 (0x15).
+	- Add additionnal checking of INQUIRY data:
+	  Check INQUIRY data received length is at least 7. Byte 7 of
+	  inquiry data contains device features bits and the driver might
+	  be confused by garbage. Also check peripheral qualifier.
+	- Cleanup of the SCSI tasks management:
+	  Remove the special case for 32 tags. Now the driver only uses the
+	  scheme that allows up to 64 tags per LUN.
+	  Merge some code from the 896 driver.
+	  Use a 1,3,5,...MAXTAGS*2+1 tag numbering. Previous driver could
+	  use any tag number from 1 to 253 and some non conformant devices
+	  might have problems with large tag numbers.
+	- 'no_sync' changed to 'no_disc' in the README file. This is an old
+	  and trivial mistake that seems to demonstrate the README file is
+	  not often read. :)
+
 Sun Oct 4 14:00 1998 Gerard Roudier (groudier@club-internet.fr)
 	* revision 3.0i
 	- Cosmetic changes for sparc (but not for the driver) that needs
diff --git a/drivers/scsi/README.ncr53c8xx b/drivers/scsi/README.ncr53c8xx
index a02da1df9f1f..eb11ccb39fb2 100644
--- a/drivers/scsi/README.ncr53c8xx
+++ b/drivers/scsi/README.ncr53c8xx
@@ -4,7 +4,7 @@ Written by Gerard Roudier
 21 Rue Carnot
 95170 DEUIL LA BARRE - FRANCE

-27 June 1998
+18 October 1998
===============================================================================

 1. Introduction
@@ -21,7 +21,7 @@ Written by Gerard Roudier
       8.4 Set order type for tagged command
       8.5 Set debug mode
       8.6 Clear profile counters
-      8.7 Set flag (no_sync)
+      8.7 Set flag (no_disc)
       8.8 Set verbose level
 9. Configuration parameters
 10. Boot setup commands
@@ -424,7 +424,7 @@ Available commands:
     The "clearprof" command allows you to clear these counters at any time.

-8.7 Set flag (no_sync)
+8.7 Set flag (no_disc)

     setflag

@@ -432,11 +432,11 @@ Available commands:
     For the moment, only one flag is available:

-        no_sync:   not allow target to disconnect.
+        no_disc:   not allow target to disconnect.

     Do not specify any flag in order to reset the flag. For example:
     - setflag 4
-      will reset no_sync flag for target 4, so will allow it disconnections.
+      will reset no_disc flag for target 4, so will allow it disconnections.
     - setflag all
       will allow disconnection for all devices on the SCSI bus.
@@ -1067,7 +1067,7 @@ Try to enable one feature at a time with control commands. For example:
       Will enable fast synchronous data transfer negotiation for all targets.
     - echo "setflag 3" >/proc/scsi/ncr53c8xx/0
-      Will reset flags (no_sync) for target 3, and so will allow it to disconnect
+      Will reset flags (no_disc) for target 3, and so will allow it to disconnect
       the SCSI Bus.
     - echo "settags 3 8" >/proc/scsi/ncr53c8xx/0
diff --git a/drivers/scsi/ncr53c8xx.c b/drivers/scsi/ncr53c8xx.c
index c4b42f13796c..5f632df09644 100644
--- a/drivers/scsi/ncr53c8xx.c
+++ b/drivers/scsi/ncr53c8xx.c
@@ -73,7 +73,7 @@
 */

/*
-**	October 4 1998, version 3.0i
+**	October 21 1998, version 3.1a
**
**	Supported SCSI-II features:
**	    Synchronous negotiation
@@ -169,9 +169,11 @@
 #endif

/*
-**	Define the BSD style u_int32 type
+**	Define the BSD style u_int32 and u_int64 type.
+**	Are in fact u_int32_t and u_int64_t :-)
*/
 typedef u32 u_int32;
+typedef u64 u_int64;

 #include "ncr53c8xx.h"

@@ -366,25 +368,14 @@ static inline struct xpt_quehead *xpt_remque_tail(struct xpt_quehead *head)
 #define NO_TAG	(255)

/*
-**	For more than 32 TAGS support, we do some address calculation
-**	from the SCRIPTS using 2 additionnal SCR_COPY's and a fiew
-**	bit handling on 64 bit integers. For these reasons, support for
-**	32 up to 64 TAGS is compiled conditionnaly.
+**	Choose appropriate type for tag bitmap.
*/
-
-#if SCSI_NCR_MAX_TAGS <= 32
-struct nlink {
-	ncrcmd	l_cmd;
-	ncrcmd	l_paddr;
-};
+#if SCSI_NCR_MAX_TAGS > 32
+typedef u_int64 tagmap_t;
 #else
-struct nlink {
-	ncrcmd	l_paddr;
-};
-typedef u64 u_int64;
+typedef u_int32 tagmap_t;
 #endif
-
/*
**	Number of targets supported by the driver.
**	n permits target numbers 0..n-1.
@@ -583,16 +574,12 @@ static spinlock_t driver_lock;
 #define iounmap vfree
 #endif

-#ifdef __sparc__
+#if defined (__sparc__)
 #include
-#define remap_pci_mem(base, size) ((vm_offset_t) __va(base))
-#define unmap_pci_mem(vaddr, size)
-#define pcivtophys(p) ((p) & pci_dvma_mask)
-#else
-#if defined(__alpha__)
-#define pcivtophys(p) ((p) & 0xfffffffful)
+#elif defined (__alpha__)
+#define bus_dvma_to_mem(p) ((p) & 0xfffffffful)
 #else
-#define pcivtophys(p) (p)
+#define bus_dvma_to_mem(p) (p)
 #endif

 #ifndef NCR_IOMAPPED
@@ -615,7 +602,6 @@ static void unmap_pci_mem(vm_offset_t vaddr, u_long size)
 	iounmap((void *) (vaddr & PAGE_MASK));
 }
 #endif /* !NCR_IOMAPPED */
-#endif /* __sparc__ */

/*
**	Insert a delay in micro-seconds and milli-seconds.
@@ -1488,8 +1474,8 @@ struct lcb {
	**	64 possible tags.
	**----------------------------------------------------------------
	*/
-	struct nlink	jump_ccb_0;	/* Default table if no tags	*/
-	struct nlink	*jump_ccb;	/* Virtual address		*/
+	u_int32		jump_ccb_0;	/* Default table if no tags	*/
+	u_int32		*jump_ccb;	/* Virtual address		*/

	/*----------------------------------------------------------------
	**	CCB queue management.
@@ -1514,11 +1500,7 @@ struct lcb {
	*/
	u_char		ia_tag;		/* Allocation index		*/
	u_char		if_tag;		/* Freeing index		*/
-#if SCSI_NCR_MAX_TAGS <= 32
-	u_char cb_tags[32];	/* Circular tags buffer	*/
-#else
-	u_char cb_tags[64];	/* Circular tags buffer	*/
-#endif
+	u_char cb_tags[SCSI_NCR_MAX_TAGS];	/* Circular tags buffer	*/
	u_char		usetags;	/* Command queuing is active	*/
	u_char		maxtags;	/* Max nr of tags asked by user	*/
	u_char		numtags;	/* Current number of tags	*/
@@ -1528,14 +1510,13 @@ struct lcb {
	**	QUEUE FULL control and ORDERED tag control.
	**----------------------------------------------------------------
	*/
+	/*----------------------------------------------------------------
+	**	QUEUE FULL and ORDERED tag control.
+	**----------------------------------------------------------------
+	*/
	u_short		num_good;	/* Nr of GOOD since QUEUE FULL	*/
-#if SCSI_NCR_MAX_TAGS <= 32
-	u_int		tags_umap;	/* Used tags bitmap		*/
-	u_int		tags_smap;	/* Tags in use at 'tag_stime'	*/
-#else
-	u_int64		tags_umap;	/* Used tags bitmap		*/
-	u_int64		tags_smap;	/* Tags in use at 'tag_stime'	*/
-#endif
+	tagmap_t	tags_umap;	/* Used tags bitmap		*/
+	tagmap_t	tags_smap;	/* Tags in use at 'tag_stime'	*/
	u_long		tags_stime;	/* Last time we set smap=umap	*/
	ccb_p		held_ccb;	/* CCB held for QUEUE FULL	*/
 };
@@ -2065,18 +2046,10 @@ struct script {
	ncrcmd	loadpos1	[  4];
 #endif
	ncrcmd	resel_lun	[  6];
-#if SCSI_NCR_MAX_TAGS <= 32
-	ncrcmd	resel_tag	[  8];
-#else
	ncrcmd	resel_tag	[  6];
	ncrcmd	jump_to_nexus	[  4];
	ncrcmd	nexus_indirect	[  4];
-#endif
-#if SCSI_NCR_MAX_TAGS <= 32
-	ncrcmd	resel_notag	[  4];
-#else
	ncrcmd	resel_notag	[  4];
-#endif
	ncrcmd	data_in		[MAX_SCATTERL * 4];
	ncrcmd	data_in2	[  4];
	ncrcmd	data_out	[MAX_SCATTERL * 4];
@@ -2987,18 +2960,12 @@ static struct script script0 __initdata = {
	/*
	**	Read the TAG from the SIDL.
	**	Still an aggressive optimization. ;-)
+	**	Compute the CCB indirect jump address which
+	**	is (#TAG*2 & 0xfc) due to tag numbering using
+	**	1,3,5..MAXTAGS*2+1 actual values.
	*/
-	SCR_FROM_REG (sidl),
-		0,
-	/*
-	**	JUMP indirectly to the restart point of the CCB.
-	*/
-#if SCSI_NCR_MAX_TAGS <= 32
-	SCR_SFBR_REG (temp, SCR_AND, 0xf8),
+	SCR_REG_SFBR (sidl, SCR_SHL, 0),
		0,
-	SCR_RETURN,
-		0,
-#else
	SCR_SFBR_REG (temp, SCR_AND, 0xfc),
		0,
}/*-------------------------< JUMP_TO_NEXUS >-------------------*/,{
@@ -3011,7 +2978,6 @@ static struct script script0 __initdata = {
		RADDR (temp),
	SCR_RETURN,
		0,
-#endif
}/*-------------------------< RESEL_NOTAG >-------------------*/,{
	/*
	**	No tag expected.
	*/
	SCR_MOVE_ABS (1) ^ SCR_MSG_IN,
		NADDR (msgin),
-#if SCSI_NCR_MAX_TAGS <= 32
-	SCR_RETURN,
-		0,
-#else
	SCR_JUMP,
		PADDR (jump_to_nexus),
-#endif
}/*-------------------------< DATA_IN >--------------------*/,{
	/*
	**	Because the size depends on the
@@ -3907,7 +3868,7 @@ static void ncr_script_copy_and_bind (ncb_p np, ncrcmd *src, ncrcmd *dst, int le
		switch (old & RELOC_MASK) {
		case RELOC_REGISTER:
			new = (old & ~RELOC_MASK)
-				+ pcivtophys(np->paddr);
+				+ bus_dvma_to_mem(np->paddr);
			break;
		case RELOC_LABEL:
			new = (old & ~RELOC_MASK) + np->p_script;
@@ -4654,7 +4615,7 @@ printk(KERN_INFO "ncr53c%s-%d: rev=0x%02x, base=0x%lx, io_port=0x%lx, irq=%d\n",
	np->scripth	= np->scripth0;
	np->p_scripth	= vtophys(np->scripth);

-	np->p_script	= (np->paddr2) ? pcivtophys(np->paddr2) : vtophys(np->script0);
+	np->p_script	= (np->paddr2) ? bus_dvma_to_mem(np->paddr2) : vtophys(np->script0);

	ncr_script_copy_and_bind (np, (ncrcmd *) &script0, (ncrcmd *) np->script0, sizeof(struct script));
	ncr_script_copy_and_bind (np, (ncrcmd *) &scripth0, (ncrcmd *) np->scripth0, sizeof(struct scripth));
@@ -5063,12 +5024,12 @@ int ncr_queue_command (ncb_p np, Scsi_Cmnd *cmd)
			}
		}
		msgptr[msglen++] = order;
-#if SCSI_NCR_MAX_TAGS <= 32
-		msgptr[msglen++] = (cp->tag << 3) + 1;
-#else
-		msgptr[msglen++] = (cp->tag << 2) + 1;
-#endif
-
+		/*
+		**	Actual tags are numbered 1,3,5,..2*MAXTAGS+1,
+		**	since we may have to deal with devices that have
+		**	problems with #TAG 0 or too great #TAG numbers.
+		*/
+		msgptr[msglen++] = (cp->tag << 1) + 1;
	}

	switch (nego) {
@@ -5316,7 +5277,7 @@ static void ncr_start_next_ccb(ncb_p np, lcb_p lp, int maxn)
		++lp->queuedccbs;
		cp = xpt_que_entry(qp, struct ccb, link_ccbq);
		xpt_insque_tail(qp, &lp->busy_ccbq);
-		lp->jump_ccb[cp->tag == NO_TAG ? 0 : cp->tag].l_paddr =
+		lp->jump_ccb[cp->tag == NO_TAG ? 0 : cp->tag] =
			cpu_to_scr(CCB_PHYS (cp, restart));
		ncr_put_start_queue(np, cp);
	}
@@ -5705,7 +5666,7 @@ static int ncr_detach(ncb_p np)
 #ifdef DEBUG_NCR53C8XX
			printk("%s: freeing lp (%lx)\n", ncr_name(np), (u_long) lp);
 #endif
-			if (lp->maxnxs > 1)
+			if (lp->jump_ccb != &lp->jump_ccb_0)
				m_free(lp->jump_ccb, 256);
			m_free(lp, sizeof(*lp));
		}
@@ -5861,9 +5822,10 @@ void ncr_complete (ncb_p np, ccb_p cp)
		/*
		**	On standard INQUIRY response (EVPD and CmDt
		**	not set), setup logical unit according to
-		**	announced capabilities.
+		**	announced capabilities (we need the 1rst 7 bytes).
		*/
-		if (cmd->cmnd[0] == 0x12 && !(cmd->cmnd[1] & 0x3)) {
+		if (cmd->cmnd[0] == 0x12 && !(cmd->cmnd[1] & 0x3) &&
+		    cmd->cmnd[4] >= 7) {
			ncr_setup_lcb (np, cmd->target, cmd->lun,
				       (char *) cmd->request_buffer);
		}
@@ -6218,6 +6180,14 @@ void ncr_init (ncb_p np, int reset, char * msg, u_long code)
		np->scsi_mode = INB (nc_stest4) & SMODE;
	}

+	/*
+	**	DEL 441 - 53C876 Rev 5 - Part Number 609-0392787/2788 - ITEM 2.
+	**	Disable overlapped arbitration.
+	*/
+	if (np->device_id == PCI_DEVICE_ID_NCR_53C875 &&
+	    np->revision_id >= 0x10 && np->revision_id <= 0x15)
+		OUTB (nc_ctest0, (1<<5));
+
	/*
	**	Fill in target structure.
	**	Reinitialize usrsync.
@@ -7778,7 +7748,7 @@ void ncr_int_sir (ncb_p np)
		**	We just assume lun=0, 1 CCB, no tag.
		*/
		if (tp->lp[0]) {
-			OUTL (nc_dsp, scr_to_cpu(tp->lp[0]->jump_ccb[0].l_paddr));
+			OUTL (nc_dsp, scr_to_cpu(tp->lp[0]->jump_ccb[0]));
			return;
		}
	case SIR_RESEL_BAD_TARGET:	/* Will send a TARGET RESET message */
@@ -8307,17 +8277,9 @@ static ccb_p ncr_get_ccb (ncb_p np, u_char tn, u_char ln)
	if (lp) {
		if (tag != NO_TAG) {
			++lp->ia_tag;
-#if SCSI_NCR_MAX_TAGS <= 32
-			if (lp->ia_tag == 32)
-#else
-			if (lp->ia_tag == 64)
-#endif
+			if (lp->ia_tag == SCSI_NCR_MAX_TAGS)
				lp->ia_tag = 0;
-#if SCSI_NCR_MAX_TAGS <= 32
-			lp->tags_umap |= (1u << tag);
-#else
-			lp->tags_umap |= (((u_int64) 1) << tag);
-#endif
+			lp->tags_umap |= (((tagmap_t) 1) << tag);
		}
	}
@@ -8363,22 +8325,14 @@ static void ncr_free_ccb (ncb_p np, ccb_p cp)
	if (lp) {
		if (cp->tag != NO_TAG) {
			lp->cb_tags[lp->if_tag++] = cp->tag;
-#if SCSI_NCR_MAX_TAGS <= 32
-			if (lp->if_tag == 32)
-#else
-			if (lp->if_tag == 64)
-#endif
+			if (lp->if_tag == SCSI_NCR_MAX_TAGS)
				lp->if_tag = 0;
-#if SCSI_NCR_MAX_TAGS <= 32
-			lp->tags_umap &= ~(1u << cp->tag);
-#else
-			lp->tags_umap &= ~(((u_int64) 1) << cp->tag);
-#endif
+			lp->tags_umap &= ~(((tagmap_t) 1) << cp->tag);
			lp->tags_smap &= lp->tags_umap;
-			lp->jump_ccb[cp->tag].l_paddr =
+			lp->jump_ccb[cp->tag] =
				cpu_to_scr(NCB_SCRIPTH_PHYS(np, bad_i_t_l_q));
		} else {
-			lp->jump_ccb[0].l_paddr =
+			lp->jump_ccb[0] =
				cpu_to_scr(NCB_SCRIPTH_PHYS(np, bad_i_t_l));
		}
	}
@@ -8412,7 +8366,7 @@ static void ncr_free_ccb (ncb_p np, ccb_p cp)

 #define ncr_reg_bus_addr(r) \
-	(pcivtophys(np->paddr) + offsetof (struct ncr_reg, r))
+	(bus_dvma_to_mem(np->paddr) + offsetof (struct ncr_reg, r))

/*------------------------------------------------------------------------
**	Initialize the fixed part of a CCB structure.
@@ -8578,28 +8532,6 @@ static void ncr_init_tcb (ncb_p np, u_char tn)
 }

-/*------------------------------------------------------------------------
-**	Reselection JUMP table initialisation.
-**------------------------------------------------------------------------
-**	The SCRIPTS processor jumps on reselection to the entry
-**	corresponding to the CCB using the tag as offset.
-**------------------------------------------------------------------------
-*/
-static void ncr_setup_jump_ccb(ncb_p np, lcb_p lp)
-{
-	int i;
-
-	lp->p_jump_ccb = cpu_to_scr(vtophys(lp->jump_ccb));
-	for (i = 0 ; i < lp->maxnxs ; i++) {
-#if SCSI_NCR_MAX_TAGS <= 32
-		lp->jump_ccb[i].l_cmd = cpu_to_scr(SCR_JUMP);
-#endif
-		lp->jump_ccb[i].l_paddr =
-			cpu_to_scr(NCB_SCRIPTH_PHYS (np, bad_i_t_l_q));
-		lp->cb_tags[i] = i;
-	}
-}
-
/*------------------------------------------------------------------------
**	Lun control block allocation and initialization.
**------------------------------------------------------------------------
@@ -8649,12 +8581,12 @@ static lcb_p ncr_alloc_lcb (ncb_p np, u_char tn, u_char ln)
	xpt_que_init(&lp->skip_ccbq);

	/*
-	**	Set max CCBs to 1 and use the default jump table
-	**	by default.
+	**	Set max CCBs to 1 and use the default 1 entry
+	**	jump table by default.
	*/
-	lp->maxnxs = 1;
-	lp->jump_ccb = &lp->jump_ccb_0;
-	ncr_setup_jump_ccb(np, lp);
+	lp->maxnxs = 1;
+	lp->jump_ccb = &lp->jump_ccb_0;
+	lp->p_jump_ccb = cpu_to_scr(vtophys(lp->jump_ccb));

	/*
	**	Initilialyze the reselect script:
@@ -8732,6 +8664,13 @@ static lcb_p ncr_setup_lcb (ncb_p np, u_char tn, u_char ln, u_char *inq_data)
	if ((inq_data[2] & 0x7) >= 2 && (inq_data[3] & 0xf) == 2)
		inq_byte7 = inq_data[7];

+	/*
+	**	Throw away announced LUN capabilities if we are told
+	**	that there is no real device supported by the logical unit.
+	*/
+	if ((inq_data[0] & 0xe0) > 0x20 || (inq_data[0] & 0x1f) == 0x1f)
+		inq_byte7 &= (INQ7_SYNC | INQ7_WIDE16);
+
	/*
	**	If user is wanting SYNC, force this feature.
	*/
@@ -8751,19 +8690,21 @@ static lcb_p ncr_setup_lcb (ncb_p np, u_char tn, u_char ln, u_char *inq_data)
	**	If unit supports tagged commands, allocate the
	**	CCB JUMP table if not yet.
	*/
-	if ((inq_byte7 & INQ7_QUEUE) && lp->maxnxs < 2) {
-		struct nlink *jumps;
-		jumps = m_alloc(256, 8);
-		if (!jumps)
+	if ((inq_byte7 & INQ7_QUEUE) && lp->jump_ccb == &lp->jump_ccb_0) {
+		int i;
+		lp->jump_ccb = m_alloc(256, 8);
+		if (!lp->jump_ccb) {
+			lp->jump_ccb = &lp->jump_ccb_0;
			goto fail;
-#if SCSI_NCR_MAX_TAGS <= 32
-		lp->maxnxs = 32;
-#else
-		lp->maxnxs = 64;
-#endif
-		lp->jump_ccb = jumps;
-		ncr_setup_jump_ccb(np, lp);
-		lp->tags_stime = jiffies;
+		}
+		lp->p_jump_ccb = cpu_to_scr(vtophys(lp->jump_ccb));
+		for (i = 0 ; i < 64 ; i++)
+			lp->jump_ccb[i] =
+				cpu_to_scr(NCB_SCRIPTH_PHYS (np, bad_i_t_l_q));
+		for (i = 0 ; i < SCSI_NCR_MAX_TAGS ; i++)
+			lp->cb_tags[i] = i;
+		lp->maxnxs = SCSI_NCR_MAX_TAGS;
+		lp->tags_stime = jiffies;
	}

	/*
diff --git a/drivers/scsi/ncr53c8xx.h b/drivers/scsi/ncr53c8xx.h
index af07ab58972e..93b17aa6f07e 100644
--- a/drivers/scsi/ncr53c8xx.h
+++ b/drivers/scsi/ncr53c8xx.h
@@ -45,7 +45,7 @@
/*
**	Name and revision of the driver
*/
-#define SCSI_NCR_DRIVER_NAME	"ncr53c8xx - revision 3.0i"
+#define SCSI_NCR_DRIVER_NAME	"ncr53c8xx - revision 3.1a"

/*
**	Check supported Linux versions
@@ -468,7 +468,10 @@ typedef struct {
{PCI_DEVICE_ID_NCR_53C875, 0x01, "875",  6, 16, 5,			\
FE_WIDE|FE_ULTRA|FE_CLK80|FE_CACHE0_SET|FE_BOF|FE_DFS|FE_LDSTR|FE_PFEN|FE_RAM}\
,									\
-{PCI_DEVICE_ID_NCR_53C875, 0xff, "875",  6, 16, 5,			\
+{PCI_DEVICE_ID_NCR_53C875, 0x0f, "875",  6, 16, 5,			\
+FE_WIDE|FE_ULTRA|FE_DBLR|FE_CACHE0_SET|FE_BOF|FE_DFS|FE_LDSTR|FE_PFEN|FE_RAM}\
+,									\
+{PCI_DEVICE_ID_NCR_53C875, 0xff, "876",  6, 16, 5,			\
FE_WIDE|FE_ULTRA|FE_DBLR|FE_CACHE0_SET|FE_BOF|FE_DFS|FE_LDSTR|FE_PFEN|FE_RAM}\
,									\
{PCI_DEVICE_ID_NCR_53C875J,0xff, "875J", 6, 16, 5,			\
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index ededdce3b70f..3991abdcd32f 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -65,7 +65,8 @@
 #define SD_MINOR_NUMBER(i)	((i) & 255)
 #define MKDEV_SD_PARTITION(i)	MKDEV(SD_MAJOR_NUMBER(i), (i) & 255)
 #define MKDEV_SD(index)		MKDEV_SD_PARTITION((index) << 4)
-#define N_USED_SD_MAJORS	((sd_template.dev_max + SCSI_DISKS_PER_MAJOR - 1) / SCSI_DISKS_PER_MAJOR)
+#define N_USED_SCSI_DISKS	(sd_template.dev_max + SCSI_DISKS_PER_MAJOR - 1)
+#define N_USED_SD_MAJORS	(N_USED_SCSI_DISKS / SCSI_DISKS_PER_MAJOR)

 #define MAX_RETRIES 5
@@ -1765,7 +1766,7 @@ void cleanup_module( void)
	scsi_unregister_module(MODULE_SCSI_DEV, &sd_template);

	for (i=0; i <= sd_template.dev_max / SCSI_DISKS_PER_MAJOR; i++)
-	  unregister_blkdev(SD_MAJOR(i),"sd");
+		unregister_blkdev(SD_MAJOR(i),"sd");

	sd_registered--;

	if( rscsi_disks != NULL )
@@ -1783,13 +1784,13 @@ void cleanup_module( void)

		for (sdgd = gendisk_head; sdgd; sdgd = sdgd->next)
		{
-			if (sdgd->next >= sd_gendisks && sdgd->next <= LAST_SD_GENDISK)
+			if (sdgd->next >= sd_gendisks && sdgd->next <= LAST_SD_GENDISK.max_nr)
				removed++, sdgd->next = sdgd->next->next;
			else
				sdgd = sdgd->next;
		}
-		if (removed != N_USED_SCSI_DISKS)
+		if (removed != N_USED_SD_MAJORS)
			printk("%s %d sd_gendisks in disk chain",
-				removed > N_USED_SCSI_DISKS ? "total" : "just", removed);
+				removed > N_USED_SD_MAJORS ? "total" : "just", removed);
	}
diff --git a/drivers/sound/es1370.c b/drivers/sound/es1370.c
index 81a3b6081216..2a5f5b9d643f 100644
--- a/drivers/sound/es1370.c
+++ b/drivers/sound/es1370.c
@@ -101,6 +101,7 @@

/*****************************************************************************/

+#include
 #include
 #include
 #include
diff --git a/fs/namei.c b/fs/namei.c
index ec3403abc754..154bb6c946fe 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -626,7 +626,7 @@ struct dentry * open_namei(const char * pathname, int flag, int mode)
	if (!inode)
		goto exit;

-	error = -EACCES;
+	error = -ELOOP;
	if (S_ISLNK(inode->i_mode))
		goto exit;
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 4e67fdf33fd2..b8f7c3944ff8 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -14,6 +14,7 @@
 * Copyright (C) 1995, 1996, 1997 Olaf Kirch
 */

+#include
 #include
 #include
 #include
diff --git a/fs/nls/nls_base.c b/fs/nls/nls_base.c
index afc219838736..9654c8ec2577 100644
--- a/fs/nls/nls_base.c
+++ b/fs/nls/nls_base.c
@@ -49,7 +49,6 @@ utf8_mbtowc(__u16 *p, const __u8 *s, int n)
	int c0, c, nc;
	struct utf8_table *t;

-	printk("utf8_mbtowc\n");
	nc = 0;
	c0 = *s;
	l = c0;
@@ -80,11 +79,9 @@ utf8_mbstowcs(__u16 *pwcs, const __u8 *s, int n)
	const __u8 *ip;
	int size;

-	printk("\nutf8_mbstowcs: n=%d\n", n);
	op = pwcs;
	ip = s;
	while (*ip && n > 0) {
-		printk(" %02x", *ip);
		if (*ip & 0x80) {
			size = utf8_mbtowc(op, ip, n);
			if (size == -1) {
diff --git a/fs/select.c b/fs/select.c
index 7ab56a4b8b82..3d10fc2b2fd1 100644
--- a/fs/select.c
+++ b/fs/select.c
@@ -130,22 +130,20 @@ int do_select(int n, fd_set_buffer *fds, unsigned long timeout)
	int retval;
	int i;

-	lock_kernel();
-
	wait = NULL;
	current->timeout = timeout;
	if (timeout) {
-		struct poll_table_entry *entry = (struct poll_table_entry *)
-			__get_free_page(GFP_KERNEL);
-		if (!entry) {
-			retval = -ENOMEM;
-			goto out_nowait;
-		}
+		struct poll_table_entry *entry = (struct poll_table_entry *) __get_free_page(GFP_KERNEL);
+		if (!entry)
+			return -ENOMEM;
+
		wait_table.nr = 0;
		wait_table.entry = entry;
		wait = &wait_table;
	}
+
+	lock_kernel();
+
	retval = max_select_fd(n, fds);
	if (retval < 0)
		goto out;
diff --git a/include/asm-mips/floppy.h b/include/asm-mips/floppy.h
index e42b1c924ef0..9ba75332e449 100644
--- a/include/asm-mips/floppy.h
+++ b/include/asm-mips/floppy.h
@@ -11,7 +11,6 @@
 #ifndef __ASM_MIPS_FLOPPY_H
 #define __ASM_MIPS_FLOPPY_H

-#include
 #include
 #include
 #include
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 18abb15fc0fa..fdc9de6cb0a7 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -272,7 +272,7 @@ extern int remap_page_range(unsigned long from, unsigned long to, unsigned long
 extern int zeromap_page_range(unsigned long from, unsigned long size, pgprot_t prot);

 extern void vmtruncate(struct inode * inode, unsigned long offset);
-extern void handle_mm_fault(struct task_struct *tsk,struct vm_area_struct *vma, unsigned long address, int write_access);
+extern int handle_mm_fault(struct task_struct *tsk,struct vm_area_struct *vma, unsigned long address, int write_access);
 extern void make_pages_present(unsigned long addr, unsigned long end);

 extern int pgt_cache_water[2];
@@ -329,18 +329,11 @@ extern void put_cached_page(unsigned long);
*/
 extern int free_memory_available(void);
 extern struct task_struct * kswapd_task;
-
-extern inline void kswapd_notify(unsigned int gfp_mask)
-{
-	if (kswapd_task) {
-		wake_up_process(kswapd_task);
-		if (gfp_mask & __GFP_WAIT) {
-			current->policy |= SCHED_YIELD;
-			schedule();
-		}
-	}
-}
-
+#define wakeup_kswapd() do {					\
+	if (kswapd_task->state & TASK_INTERRUPTIBLE)		\
+		wake_up_process(kswapd_task);			\
+} while (0)
+
/* vma is the first one with  address < vma->vm_end,
 * and even  address < vma->vm_start. Have to extend vma.
 */
 static inline int expand_stack(struct vm_area_struct * vma, unsigned long address)
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 9af62e4a2f9b..58f6fdd31df2 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -83,7 +83,6 @@ static inline void remove_page_from_hash_queue(struct page * page)
 static inline void __add_page_to_hash_queue(struct page * page, struct page **p)
 {
	page_cache_size++;
-	set_bit(PG_referenced, &page->flags);
	page->age = PAGE_AGE_VALUE;
	if((page->next_hash = *p) != NULL)
		(*p)->pprev_hash = &page->next_hash;
diff --git a/include/linux/parport.h b/include/linux/parport.h
index ac6dbb3e9cac..3adbc5ab3d58 100644
--- a/include/linux/parport.h
+++ b/include/linux/parport.h
@@ -208,7 +208,7 @@ struct parport {
	int number;		/* port index - the `n' in `parportn' */
	spinlock_t pardevice_lock;
	spinlock_t waitlist_lock;
-	spinlock_t cad_lock;
+	rwlock_t cad_lock;
 };

/* parport_register_port registers a new parallel port at the given address (if
diff --git a/mm/filemap.c b/mm/filemap.c
index a5638bf306f1..47c45dada9f1 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -293,7 +293,7 @@ static inline void add_to_page_cache(struct page * page,
	struct page **hash)
 {
	atomic_inc(&page->count);
-	page->flags &= ~((1 << PG_uptodate) | (1 << PG_error));
+	page->flags = (page->flags & ~((1 << PG_uptodate) | (1 << PG_error))) | (1 << PG_referenced);
	page->offset = offset;
	add_page_to_inode_queue(inode, page);
	__add_page_to_hash_queue(page, hash);
@@ -328,7 +328,6 @@ static unsigned long try_to_read_ahead(struct file * file,
		 */
		page = mem_map + MAP_NR(page_cache);
		add_to_page_cache(page, inode, offset, hash);
-		set_bit(PG_referenced, &page->flags);
		inode->i_op->readpage(file, page);
		page_cache = 0;
	}
diff --git a/mm/memory.c b/mm/memory.c
index 19fe2aad85c9..b4534aca8668 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -629,7 +629,7 @@ unsigned long put_dirty_page(struct task_struct * tsk, unsigned long page, unsig
 * change only once the write actually happens. This avoids a few races,
 * and potentially makes it more efficient.
 */
-static void do_wp_page(struct task_struct * tsk, struct vm_area_struct * vma,
+static int do_wp_page(struct task_struct * tsk, struct vm_area_struct * vma,
	unsigned long address, pte_t *page_table)
 {
	pte_t pte;
@@ -665,30 +665,31 @@ static void do_wp_page(struct task_struct * tsk, struct vm_area_struct * vma,
			set_pte(page_table, pte_mkwrite(pte_mkdirty(mk_pte(new_page, vma->vm_page_prot))));
			free_page(old_page);
			flush_tlb_page(vma, address);
-			return;
+			return 1;
		}
		flush_cache_page(vma, address);
		set_pte(page_table, BAD_PAGE);
		flush_tlb_page(vma, address);
		free_page(old_page);
		oom(tsk);
-		return;
+		return 0;
	}
	if (PageSwapCache(page_map))
		delete_from_swap_cache(page_map);
	flush_cache_page(vma, address);
	set_pte(page_table, pte_mkdirty(pte_mkwrite(pte)));
	flush_tlb_page(vma, address);
+end_wp_page:
	if (new_page)
		free_page(new_page);
-	return;
+	return 1;
+
bad_wp_page:
	printk("do_wp_page: bogus page at address %08lx (%08lx)\n",address,old_page);
	send_sig(SIGKILL, tsk, 1);
-end_wp_page:
	if (new_page)
		free_page(new_page);
-	return;
+	return 0;
 }

/*
@@ -777,30 +778,50 @@ void vmtruncate(struct inode * inode, unsigned long offset)
 }

-static inline void do_swap_page(struct task_struct * tsk,
+static int do_swap_page(struct task_struct * tsk,
	struct vm_area_struct * vma, unsigned long address,
	pte_t * page_table, pte_t entry, int write_access)
 {
-	pte_t page;
-
+	lock_kernel();
	if (!vma->vm_ops || !vma->vm_ops->swapin) {
		swap_in(tsk, vma, page_table, pte_val(entry), write_access);
		flush_page_to_ram(pte_page(*page_table));
-		return;
+	} else {
+		pte_t page = vma->vm_ops->swapin(vma, address - vma->vm_start + vma->vm_offset, pte_val(entry));
+		if (pte_val(*page_table) != pte_val(entry)) {
+			free_page(pte_page(page));
+		} else {
+			if (atomic_read(&mem_map[MAP_NR(pte_page(page))].count) > 1 &&
+			    !(vma->vm_flags & VM_SHARED))
+				page = pte_wrprotect(page);
+			++vma->vm_mm->rss;
++tsk->maj_flt; + flush_page_to_ram(pte_page(page)); + set_pte(page_table, page); + } } - page = vma->vm_ops->swapin(vma, address - vma->vm_start + vma->vm_offset, pte_val(entry)); - if (pte_val(*page_table) != pte_val(entry)) { - free_page(pte_page(page)); - return; + unlock_kernel(); + return 1; +} + +/* + * This only needs the MM semaphore + */ +static int do_anonymous_page(struct task_struct * tsk, struct vm_area_struct * vma, pte_t *page_table, int write_access) +{ + pte_t entry = pte_wrprotect(mk_pte(ZERO_PAGE, vma->vm_page_prot)); + if (write_access) { + unsigned long page = __get_free_page(GFP_KERNEL); + if (!page) + return 0; + clear_page(page); + entry = pte_mkwrite(pte_mkdirty(mk_pte(page, vma->vm_page_prot))); + vma->vm_mm->rss++; + tsk->min_flt++; + flush_page_to_ram(page); } - if (atomic_read(&mem_map[MAP_NR(pte_page(page))].count) > 1 && - !(vma->vm_flags & VM_SHARED)) - page = pte_wrprotect(page); - ++vma->vm_mm->rss; - ++tsk->maj_flt; - flush_page_to_ram(pte_page(page)); - set_pte(page_table, page); - return; + put_page(page_table, entry); + return 1; } /* @@ -811,26 +832,33 @@ static inline void do_swap_page(struct task_struct * tsk, * * As this is called only for pages that do not currently exist, we * do not need to flush old virtual caches or the TLB. + * + * This is called with the MM semaphore held, but without the kernel + * lock. 
*/ -static void do_no_page(struct task_struct * tsk, struct vm_area_struct * vma, - unsigned long address, int write_access, pte_t *page_table, pte_t entry) +static int do_no_page(struct task_struct * tsk, struct vm_area_struct * vma, + unsigned long address, int write_access, pte_t *page_table) { unsigned long page; + pte_t entry; - if (!pte_none(entry)) - goto swap_page; - address &= PAGE_MASK; if (!vma->vm_ops || !vma->vm_ops->nopage) - goto anonymous_page; + return do_anonymous_page(tsk, vma, page_table, write_access); + /* * The third argument is "no_share", which tells the low-level code * to copy, not share the page even if sharing is possible. It's - * essentially an early COW detection + * essentially an early COW detection. + * + * We need to grab the kernel lock for this.. */ - page = vma->vm_ops->nopage(vma, address, + lock_kernel(); + page = vma->vm_ops->nopage(vma, address & PAGE_MASK, (vma->vm_flags & VM_SHARED)?0:write_access); + unlock_kernel(); if (!page) - goto sigbus; + return 0; + ++tsk->maj_flt; ++vma->vm_mm->rss; /* @@ -852,32 +880,7 @@ static void do_no_page(struct task_struct * tsk, struct vm_area_struct * vma, entry = pte_wrprotect(entry); put_page(page_table, entry); /* no need to invalidate: a not-present page shouldn't be cached */ - return; - -anonymous_page: - entry = pte_wrprotect(mk_pte(ZERO_PAGE, vma->vm_page_prot)); - if (write_access) { - unsigned long page = __get_free_page(GFP_KERNEL); - if (!page) - goto sigbus; - clear_page(page); - entry = pte_mkwrite(pte_mkdirty(mk_pte(page, vma->vm_page_prot))); - vma->vm_mm->rss++; - tsk->min_flt++; - flush_page_to_ram(page); - } - put_page(page_table, entry); - return; - -sigbus: - force_sig(SIGBUS, current); - put_page(page_table, BAD_PAGE); - /* no need to invalidate, wasn't present */ - return; - -swap_page: - do_swap_page(tsk, vma, address, page_table, entry, write_access); - return; + return 1; } /* @@ -889,54 +892,54 @@ swap_page: * with external mmu caches can use to update those 
(ie the Sparc or * PowerPC hashed page tables that act as extended TLBs). */ -static inline void handle_pte_fault(struct task_struct *tsk, +static inline int handle_pte_fault(struct task_struct *tsk, struct vm_area_struct * vma, unsigned long address, int write_access, pte_t * pte) { pte_t entry = *pte; if (!pte_present(entry)) { - do_no_page(tsk, vma, address, write_access, pte, entry); - return; + if (pte_none(entry)) + return do_no_page(tsk, vma, address, write_access, pte); + return do_swap_page(tsk, vma, address, pte, entry, write_access); } + entry = pte_mkyoung(entry); set_pte(pte, entry); flush_tlb_page(vma, address); if (!write_access) - return; + return 1; + if (pte_write(entry)) { entry = pte_mkdirty(entry); set_pte(pte, entry); flush_tlb_page(vma, address); - return; + return 1; } - do_wp_page(tsk, vma, address, pte); + return do_wp_page(tsk, vma, address, pte); } /* * By the time we get here, we already hold the mm semaphore */ -void handle_mm_fault(struct task_struct *tsk, struct vm_area_struct * vma, +int handle_mm_fault(struct task_struct *tsk, struct vm_area_struct * vma, unsigned long address, int write_access) { pgd_t *pgd; pmd_t *pmd; - pte_t *pte; pgd = pgd_offset(vma->vm_mm, address); pmd = pmd_alloc(pgd, address); - if (!pmd) - goto no_memory; - pte = pte_alloc(pmd, address); - if (!pte) - goto no_memory; - lock_kernel(); - handle_pte_fault(tsk, vma, address, write_access, pte); - unlock_kernel(); - update_mmu_cache(vma, address, *pte); - return; -no_memory: - oom(tsk); + if (pmd) { + pte_t * pte = pte_alloc(pmd, address); + if (pte) { + if (handle_pte_fault(tsk, vma, address, write_access, pte)) { + update_mmu_cache(vma, address, *pte); + return 1; + } + } + } + return 0; } /* diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 4340911f840c..71ea82b6a1bf 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -269,11 +269,16 @@ unsigned long __get_free_pages(int gfp_mask, unsigned long order) /* * If we failed to find anything, we'll return 
NULL, but we'll - * wake up kswapd _now_ and even wait for it synchronously if - * we can.. This way we'll at least make some forward progress + * wake up kswapd _now_ and even yield to it if we can.. + * This way we'll at least make some forward progress * over time. */ - kswapd_notify(gfp_mask); + wakeup_kswapd(); + if (gfp_mask & __GFP_WAIT) { + current->policy |= SCHED_YIELD; + schedule(); + } + nopage: return 0; } diff --git a/mm/vmscan.c b/mm/vmscan.c index 64263c79e3ee..1812e4bc3506 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -118,8 +118,13 @@ static inline int try_to_swap_out(struct task_struct * tsk, struct vm_area_struc } if (pte_young(pte)) { + /* + * Transfer the "accessed" bit from the page + * tables to the global page map. + */ set_pte(page_table, pte_mkold(pte)); - touch_page(page_map); + set_bit(PG_referenced, &page_map->flags); + /* * We should test here to see if we want to recover any * swap cache page here. We do this if the page seeing @@ -132,10 +137,6 @@ static inline int try_to_swap_out(struct task_struct * tsk, struct vm_area_struc return 0; } - age_page(page_map); - if (page_map->age) - return 0; - if (pte_dirty(pte)) { if (vma->vm_ops && vma->vm_ops->swapout) { pid_t pid = tsk->pid; @@ -305,8 +306,9 @@ static inline int swap_out_pgd(struct task_struct * tsk, struct vm_area_struct * } static int swap_out_vma(struct task_struct * tsk, struct vm_area_struct * vma, - pgd_t *pgdir, unsigned long start, int gfp_mask) + unsigned long address, int gfp_mask) { + pgd_t *pgdir; unsigned long end; /* Don't swap out areas like shared memory which have their @@ -314,12 +316,14 @@ static int swap_out_vma(struct task_struct * tsk, struct vm_area_struct * vma, if (vma->vm_flags & (VM_SHM | VM_LOCKED)) return 0; + pgdir = pgd_offset(tsk->mm, address); + end = vma->vm_end; - while (start < end) { - int result = swap_out_pgd(tsk, vma, pgdir, start, end, gfp_mask); + while (address < end) { + int result = swap_out_pgd(tsk, vma, pgdir, address, end, 
gfp_mask); if (result) return result; - start = (start + PGDIR_SIZE) & PGDIR_MASK; + address = (address + PGDIR_SIZE) & PGDIR_MASK; pgdir++; } return 0; @@ -339,22 +343,23 @@ static int swap_out_process(struct task_struct * p, int gfp_mask) * Find the proper vm-area */ vma = find_vma(p->mm, address); - if (!vma) { - p->swap_address = 0; - return 0; + if (vma) { + if (address < vma->vm_start) + address = vma->vm_start; + + for (;;) { + int result = swap_out_vma(p, vma, address, gfp_mask); + if (result) + return result; + vma = vma->vm_next; + if (!vma) + break; + address = vma->vm_start; + } } - if (address < vma->vm_start) - address = vma->vm_start; - for (;;) { - int result = swap_out_vma(p, vma, pgd_offset(p->mm, address), address, gfp_mask); - if (result) - return result; - vma = vma->vm_next; - if (!vma) - break; - address = vma->vm_start; - } + /* We didn't find anything for the process */ + p->swap_cnt = 0; p->swap_address = 0; return 0; } @@ -415,20 +420,12 @@ static int swap_out(unsigned int priority, int gfp_mask) } pbest->swap_cnt--; - switch (swap_out_process(pbest, gfp_mask)) { - case 0: - /* - * Clear swap_cnt so we don't look at this task - * again until we've tried all of the others. - * (We didn't block, so the task is still here.) - */ - pbest->swap_cnt = 0; - break; - case 1: + /* + * Nonzero means we cleared out something, but only "1" means + * that we actually free'd up a page as a result. + */ + if (swap_out_process(pbest, gfp_mask) == 1) return 1; - default: - break; - }; } out: return 0; @@ -540,7 +537,7 @@ int kswapd(void *unused) init_swap_timer(); kswapd_task = current; while (1) { - int tries; + unsigned long start_time; current->state = TASK_INTERRUPTIBLE; flush_signals(current); @@ -548,36 +545,12 @@ int kswapd(void *unused) schedule(); swapstats.wakeups++; - /* - * Do the background pageout: be - * more aggressive if we're really - * low on free memory. - * - * We try page_daemon.tries_base times, divided by - * an 'urgency factor'. 
In practice this will mean - * a value of pager_daemon.tries_base / 8 or 4 = 64 - * or 128 pages at a time. - * This gives us 64 (or 128) * 4k * 4 (times/sec) = - * 1 (or 2) MB/s swapping bandwidth in low-priority - * background paging. This number rises to 8 MB/s - * when the priority is highest (but then we'll be - * woken up more often and the rate will be even - * higher). - */ - tries = pager_daemon.tries_base; - tries >>= 4*free_memory_available(); - + start_time = jiffies; do { do_try_to_free_page(0); - /* - * Syncing large chunks is faster than swapping - * synchronously (less head movement). -- Rik. - */ - if (atomic_read(&nr_async_pages) >= pager_daemon.swap_cluster) - run_task_queue(&tq_disk); if (free_memory_available() > 1) break; - } while (--tries > 0); + } while (jiffies != start_time); } /* As if we could ever get here - maybe we want to make this killable */ kswapd_task = NULL; diff --git a/scripts/Configure b/scripts/Configure index d01564bee554..a2439560b6fd 100644 --- a/scripts/Configure +++ b/scripts/Configure @@ -53,6 +53,9 @@ # # 090398 Axel Boldt (boldt@math.ucsb.edu) - allow for empty lines in help # texts. +# +# 102598 Michael Chastain (mec@shout.net) - put temporary files in +# current directory, not in /tmp. # # Make sure we're really running bash. @@ -506,9 +509,9 @@ if [ -f $DEFAULTS ]; then echo "# Using defaults found in" $DEFAULTS echo "#" . $DEFAULTS - sed -e 's/# \(.*\) is not.*/\1=n/' < $DEFAULTS > /tmp/conf.$$ - . /tmp/conf.$$ - rm /tmp/conf.$$ + sed -e 's/# \(.*\) is not.*/\1=n/' < $DEFAULTS > .config-is-not.$$ + . .config-is-not.$$ + rm .config-is-not.$$ else echo "#" echo "# No defaults found"
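The core of the patch is the new calling convention: the fault helpers return 1 on success and 0 on allocation failure, and the architecture fault handler turns a 0 into its "bad_area" path instead of retrying the faulting instruction forever. A minimal userspace sketch of that convention, where every name and the `fail_alloc` knob are hypothetical stand-ins, not kernel code:

```c
#include <stdlib.h>

/* Userspace model: each fault helper returns 1 on success, 0 on
 * failure, and the "arch" handler maps 0 to a bad_area outcome
 * instead of silently retrying the fault. */

static int fail_alloc;                  /* test knob: force allocation failure */

static void *get_free_page(void)
{
	return fail_alloc ? NULL : calloc(1, 4096);
}

/* Stands in for do_anonymous_page(): a write fault needs a fresh
 * zeroed page; if none is available, report failure upward rather
 * than invoking an oom() handler and refaulting forever. */
static int anonymous_fault(void **pte, int write_access)
{
	if (write_access) {
		void *page = get_free_page();
		if (!page)
			return 0;
		*pte = page;
	}
	return 1;
}

/* Stands in for the arch fault handler's new check:
 * if (!handle_mm_fault(...)) goto bad_area; */
static const char *arch_fault(void **pte, int write_access)
{
	return anonymous_fault(pte, write_access) ? "ok" : "bad_area";
}
```

Because failure now propagates as a return value rather than a signal, the same path also works for kernel-mode accesses, where the caller can follow the exception tables instead of delivering a signal.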
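The page allocator's failure path changes in the same spirit: instead of kswapd_notify(), __get_free_pages() now always wakes kswapd and, when the caller passed __GFP_WAIT and may sleep, yields the CPU once before returning NULL. A rough userspace model of that tail, with hypothetical counters standing in for the real wakeup and schedule() calls:

```c
/* Userspace model of the tail of __get_free_pages() after both the
 * free lists and try_to_free_pages() have come up empty. */

#define __GFP_WAIT 0x01                 /* caller is allowed to sleep */

static int kswapd_wakeups;
static int yields;

static void wakeup_kswapd(void) { kswapd_wakeups++; }

/* models: current->policy |= SCHED_YIELD; schedule(); */
static void yield_once(void) { yields++; }

static unsigned long alloc_failed(int gfp_mask)
{
	wakeup_kswapd();                /* always kick the daemon */
	if (gfp_mask & __GFP_WAIT)
		yield_once();           /* let kswapd run before we give up */
	return 0;                       /* NULL: caller must handle failure */
}
```

The design point is that the allocator still fails fast, but a sleeping caller donates its timeslice so kswapd can make forward progress before the caller sees the NULL.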