Linux 2.2.12
author Alan Cox <alan@lxorguk.ukuu.org.uk>
Fri, 23 Nov 2007 20:19:35 +0000 (15:19 -0500)
committer Alan Cox <alan@lxorguk.ukuu.org.uk>
Fri, 23 Nov 2007 20:19:35 +0000 (15:19 -0500)
Platforms:Alpha (see notes), PowerPC, Sparc, X86

Introduction
Linux 2.2.12 is the latest update to the Linux kernel tree. It fixes the memory
leak bug in the 2.2.11 kernel. In addition it updates various drivers and the
platform-specific support. The out-of-the-box tree supports the Alpha, PPC,
Sparc and X86 platforms. MIPS is mostly merged, but MIPS users should still
obtain the platform-specific tree; it is hoped MIPS and PowerPC will soon be
fully merged. ARM and M680x0 users should get their platform-specific tree.

Known Bugs
On the Alpha platform we know the new maths code currently fails some glibc
maths checks. The Alpha port people are looking into this. Expect patches soon.

Compilers
This code is intended to build with gcc 2.7.2 and egcs 1.1.2. It is known that
not all of it builds correctly on x86 CPUs with gcc 2.95. As far as we know
these are Linux issues, not gcc issues. Fixes for gcc 2.95 to gcc 3.0 may go
into Linux 2.2 in time. You should therefore not use gcc 2.95 to build stable
kernels for the moment.

Binary Compatibility
Linux 2.2.12 changes a few internal system structures. You may need to rebuild
a few third party modules such as pcmcia-cs when upgrading from older kernels
to this one.

Security Notes
The TCP fixes in the 2.2.12 kernel for the memory leak and the Solaris food
fight are the only security updates. You can obtain them separately from 2.2.12
via the 2.2.11 release notes. Linux 2.2.11 with the errata is believed to be as
secure as 2.2.12 unless you are trying to use strictly enforced capability
sets, in which case you may wish to apply the fs/proc/array.c patch from 2.2.12
to get precisely the same security.

Architecture Updates

Alpha
    Further changes have been made to the maths emulation support.
    A bug where the floppy drive may be unusable for alternating periods of
     49.7 days has been fixed.
    The Symbios cache test should now pass and SCSI should now work properly.
i386
    Optimisations for the IDT Winchip.
    Identify and report the AMD Athlon.
    Fix a crash on boot with the AMD Athlon.
MIPS
    Fix a timeout scheduling error in the dz driver.
PowerPC
    All the PPC changes should now be merged.
Sparc
    A problem with the viking MMU code has been fixed.
    A small Sparc64 kernel_thread change.

Core Updates

File Handles
    The kernel now supports large numbers of file handles per process.
    The default remains unchanged but can be raised by processes.
Memory Limits
    Certain parts of the kernel didn't correctly interpret RLIM_INFINITY
     and enforced 2Gig limits.
Mlock
    Munlock was checking for CAP_IPC_LOCK; the capability should only be
     required to lock memory, not to unlock it.
Quota
    Fixed a pair of accounting errors in the quota code.

Driver Updates

Computone Intelliport 2
    A driver for this card under Linux has been included.
DAC960
    The DAC960 driver has been updated.
ESS Solo
    An experimental driver for this PCI sound card is now included.
Iomega Buz
    A Zoran ZR36067 driver for video capture including MJPEG capture is
     now included.
     This works with the Iomega buz but does not yet support the LML33.
ISDN
    The ISDN fax patches have been merged.
    The hisax driver now passes certification with some ELSA cards.
    Fix a buffer headroom issue with compression and ISDN ppp.
MAD16
    The MAD16 driver now defaults to not enabling its on-board CD port.
    This avoids problems caused by users being unaware that the default
     could interfere with other drivers.
Multitech ISI driver
    Support for PCI interrupt sharing is now included.
PCWD Watchdog
    Revision A boards reported their status incorrectly.
Soundblaster
    A case where IRQ 0 may be erroneously freed has been fixed.
VisWs Sound
    The SGI visual workstation onboard audio is now supported.
VisWs Video
    The SGI visual workstation onboard video driver has been improved.

File System Updates

Welsh Language
    ISO 8859-14 (The Celtic languages) is now supported for UTF8 translations.

Miscellaneous Updates

ChangeLog
    The Changelog has been updated to reflect newer tools.
Documentation
    Various documents have been updated.

Network Updates

Alteon AceNIC
    Small changes have been made to reduce its interrupt load and increase
     performance further.
Interphase 5526
    This fibre channel chipset is now supported under Linux.
RTL8139
    A sign handling bug has been fixed that might have caused memory leakage.
SB1000
    The errata patch for the SB1000 has been folded into the 2.2.12 kernel.
    This driver is now functional.
SiS900
    This driver has been updated further.

SCSI Updates

PAS-16
    The module now allows you to set the I/O and IRQ.
Symbios controller
    The symbios 53C876 revision 32 is now supported.

Security Updates

/proc/kcore
    The RAWIO capability is now needed to access /proc/kcore.
Memory leak from TCP
    This is the nasty bug fixed in the 2.2.11 errata. The fix is also in 2.2.12.
Solaris food fight
    This TCP fringe case has been fixed.
Tightened capabilities
    We have tightened the capabilities needed for setting frame buffer bases
     to include RAWIO.

91 files changed:
CREDITS
Documentation/Configure.help
Documentation/README.DAC960 [new file with mode: 0644]
Documentation/networking/CREDITS.ipvs [deleted file]
Documentation/networking/ChangeLog.ipvs [deleted file]
Documentation/networking/README.ipvs [deleted file]
arch/alpha/kernel/alpha_ksyms.c
arch/alpha/kernel/core_mcpcia.c
arch/alpha/kernel/process.c
arch/alpha/kernel/setup.c
arch/i386/defconfig
arch/i386/kernel/mtrr.c
arch/i386/mm/init.c
arch/sparc64/kernel/ioctl32.c
drivers/block/Config.in
drivers/block/DAC960.c
drivers/block/DAC960.h
drivers/block/Makefile
drivers/block/cpqarray.h
drivers/block/genhd.c
drivers/block/hsm.c [deleted file]
drivers/block/linear.c
drivers/block/linear.h [new file with mode: 0644]
drivers/block/ll_rw_blk.c
drivers/block/md.c
drivers/block/raid0.c
drivers/block/raid1.c
drivers/block/raid5.c
drivers/block/translucent.c [deleted file]
drivers/block/xor.c [deleted file]
drivers/cdrom/sonycd535.c
drivers/char/bttv.c
drivers/char/buz.c
drivers/char/dz.c
drivers/char/generic_serial.c
drivers/char/planb.c
drivers/isdn/isdn_ppp.c
drivers/net/sis900.c
drivers/scsi/aha152x.c
drivers/sound/sb_ess.c
fs/autofs/root.c
fs/block_dev.c
fs/buffer.c
fs/dquot.c
fs/fat/inode.c
fs/select.c
include/asm-alpha/core_cia.h
include/asm-alpha/md.h [new file with mode: 0644]
include/asm-i386/md.h [new file with mode: 0644]
include/asm-m68k/md.h [new file with mode: 0644]
include/asm-ppc/md.h [new file with mode: 0644]
include/asm-sparc/md.h [new file with mode: 0644]
include/asm-sparc64/md.h [new file with mode: 0644]
include/linux/blkdev.h
include/linux/ip_masq.h
include/linux/md.h [new file with mode: 0644]
include/linux/raid/hsm.h [deleted file]
include/linux/raid/hsm_p.h [deleted file]
include/linux/raid/linear.h [deleted file]
include/linux/raid/md.h [deleted file]
include/linux/raid/md_compatible.h [deleted file]
include/linux/raid/md_k.h [deleted file]
include/linux/raid/md_p.h [deleted file]
include/linux/raid/md_u.h [deleted file]
include/linux/raid/raid0.h [deleted file]
include/linux/raid/raid1.h [deleted file]
include/linux/raid/raid5.h [deleted file]
include/linux/raid/translucent.h [deleted file]
include/linux/raid/xor.h [deleted file]
include/linux/raid0.h [new file with mode: 0644]
include/linux/raid1.h [new file with mode: 0644]
include/linux/raid5.h [new file with mode: 0644]
include/linux/sysctl.h
include/net/ip_masq.h
include/net/ip_vs.h [deleted file]
init/main.c
net/ipv4/Config.in
net/ipv4/Makefile
net/ipv4/arp.c
net/ipv4/ip_input.c
net/ipv4/ip_masq.c
net/ipv4/ip_masq_autofw.c
net/ipv4/ip_masq_mfw.c
net/ipv4/ip_masq_portfw.c
net/ipv4/ip_masq_user.c
net/ipv4/ip_vs.c [deleted file]
net/ipv4/ip_vs_pcc.c [deleted file]
net/ipv4/ip_vs_rr.c [deleted file]
net/ipv4/ip_vs_wlc.c [deleted file]
net/ipv4/ip_vs_wrr.c [deleted file]
sound/solo1 [deleted file]

diff --git a/CREDITS b/CREDITS
index 91045d6a4f41118d557098aa7c1516fc7bcaf606..f6879c094fb0b6a37633963df04c24a4ecb4d246 100644 (file)
--- a/CREDITS
+++ b/CREDITS
@@ -995,6 +995,15 @@ S: Tallak 95
 S: 8103 Rein
 S: Austria
 
+N: Jan Kara
+E: jack@atrey.karlin.mff.cuni.cz
+D: Quota fixes for 2.2 kernel
+D: Few other fixes in filesystem area (isofs, loopback)
+W: http://atrey.karlin.mff.cuni.cz/~jack/
+S: Krosenska' 543
+S: 181 00 Praha 8
+S: Czech Republic
+
 N: Jan "Yenya" Kasprzak
 E: kas@fi.muni.cz
 D: Author of the COSA/SRP sync serial board driver.
diff --git a/Documentation/Configure.help b/Documentation/Configure.help
index 0efe43182e959b60000a47a78311bb4a66e585a2..a00ade642fde8d4ee3a0b45b75edd8fcc87744ca 100644 (file)
@@ -956,13 +956,6 @@ CONFIG_BLK_DEV_MD
 
   If unsure, say N.
 
-Autodetect RAID partitions
-CONFIG_AUTODETECT_RAID
-  This feature lets the kernel detect RAID partitions on bootup.
-  An autodetect RAID partition is a normal partition with partition
-  type 0xfd. Use this if you want to boot RAID devices, or want to
-  run them automatically.
-
 Linear (append) mode
 CONFIG_MD_LINEAR
   If you say Y here, then your multiple devices driver will be able to
@@ -1042,21 +1035,6 @@ CONFIG_MD_RAID5
 
   If unsure, say Y.
 
-Translucent Block Device Support (EXPERIMENTAL)
-CONFIG_MD_TRANSLUCENT
-  DO NOT USE THIS STUFF YET!
-
-  currently there is only a placeholder there as the implementation
-  is not yet usable.
-
-Logical Volume Manager support (EXPERIMENTAL)
-CONFIG_MD_LVM
-  DO NOT USE THIS STUFF YET!
-
-  i have released this so people can comment on the architecture,
-  but user-space tools are still unusable so there is nothing much
-  you can do with this.
-
 Boot support (linear, striped)
 CONFIG_MD_BOOT
   To boot with an initial linear or striped md device you have to
@@ -2665,76 +2643,6 @@ CONFIG_IP_MASQUERADE_MFW
   The module will be called ip_masq_markfw.o. If you want to compile
   it as a module, say M here and read Documentation/modules.txt.
 
-IP: masquerading virtual server support
-CONFIG_IP_MASQUERADE_VS
-  IP Virtual Server support will let you build a virtual server
-  based on cluster of two or more real servers. This option must
-  be enabled for at least one of the clustered computers that will
-  take care of intercepting incomming connections to the virtual IP
-  and scheduling them to real servers.
-  Three request dispatching techniques are implemented, they are 
-  virtual server via NAT, virtual server via tunneling  and virtual
-  server via direct routing. The round-robin scheduling, the weighted
-  round-robin secheduling, or the weighted least-connection scheduling
-  algorithm can be used to choose which server the connection is 
-  directed to, thus load balancing can be achieved among the servers.
-  For more information and its administration program, please visit
-  the following URL:
-       http://proxy.iinchina.net/~wensong/ippfvs/
-  If you want this, say Y.
-
-IP masquerading VS table size (the Nth power of 2)
-CONFIG_IP_MASQUERADE_VS_TAB_BITS
-  Using a big IP masquerading hash table for virtual server will greatly
-  reduce conflicts in the masquerading hash table when there are
-  thousands of active connections.
-  Note the table size must be power of 2. The table size will be the
-  value of 2 to the your input number power. For example, the default
-  number is 12, so the table size is 4096. Don't input the number too
-  small, otherwise you will lose performance on it.
-  You can adapt the table size yourself, according to your virtual
-  server application. It is good to set the table size larger than
-  the number of connections per second multiplying average lasting time
-  of connection in the table. For example, your virtual server gets
-  20 connections per second, the connection lasts for 200 seconds in
-  average in the masquerading table, the table size should be larger
-  than 20x200, it is good to set the table size 4096 (2**12).
-
-IPVS: round-robin scheduling
-CONFIG_IP_MASQUERADE_VS_RR
-  The robin-robin scheduling algorithm simply directs network
-  connections to different real servers in a round-robin manner.
-  If you want to compile it in kernel, say Y. If you want to compile
-  it as a module, say M here and read Documentation/modules.txt.
-
-IPVS: weighted round-robin scheduling
-CONFIG_IP_MASQUERADE_VS_WRR
-  The weighted robin-robin scheduling algorithm directs network
-  connections to different real servers based on server weights
-  in a round-robin manner. Servers with higher weights receive
-  new connections first than those with less weights, and servers
-  with higher weights get more connections than those with less
-  weights and servers with equal weights get equal connections.
-  If you want to compile it in kernel, say Y. If you want to compile
-  it as a module, say M here and read Documentation/modules.txt.
-
-IPVS: weighted least-connection scheduling
-CONFIG_IP_MASQUERADE_VS_WLC
-  The weighted least-connection scheduling algorithm directs network
-  connections to the server with the least number of alive connections
-  dividing the server weight.
-  If you want to compile it in kernel, say Y. If you want to compile
-  it as a module, say M here and read Documentation/modules.txt.
-
-IPVS: persistent client connection scheduling
-CONFIG_IP_MASQUERADE_VS_PCC
-  The persistent client connection feature means that after a client
-  establishs a connection to the selected server, all connections
-  from the same client will be directed to the same server in a
-  specified period.
-  If you want to compile it in kernel, say Y. If you want to compile
-  it as a module, say M here and read Documentation/modules.txt.
-
 IP: always defragment (required for masquerading)
 CONFIG_IP_ALWAYS_DEFRAG
   If you say Y here, then all incoming fragments (parts of IP packets
diff --git a/Documentation/README.DAC960 b/Documentation/README.DAC960
new file mode 100644 (file)
index 0000000..2a80042
--- /dev/null
+++ b/Documentation/README.DAC960
@@ -0,0 +1,717 @@
+          Mylex DAC960/DAC1100 PCI RAID Controller Driver for Linux
+
+                       Version 2.2.4 for Linux 2.2.11
+                       Version 2.0.4 for Linux 2.0.37
+
+                             PRODUCTION RELEASE
+
+                               23 August 1999
+
+                              Leonard N. Zubkoff
+                              Dandelion Digital
+                              lnz@dandelion.com
+
+        Copyright 1998-1999 by Leonard N. Zubkoff <lnz@dandelion.com>
+
+
+                                INTRODUCTION
+
+Mylex, Inc. designs and manufactures a variety of high performance PCI RAID
+controllers.  Mylex Corporation is located at 34551 Ardenwood Blvd., Fremont,
+California 94555, USA and can be reached at 510/796-6100 or on the World Wide
+Web at http://www.mylex.com.  Mylex RAID Technical Support can be reached by
+electronic mail at support@mylex.com (for eXtremeRAID 1100 and older DAC960
+models) or techsup@mylex.com (for AcceleRAID models), by voice at 510/608-2400,
+or by FAX at 510/745-7715.  Contact information for offices in Europe and Japan
+is available on the Web site.
+
+The latest information on Linux support for DAC960 PCI RAID Controllers, as
+well as the most recent release of this driver, will always be available from
+my Linux Home Page at URL "http://www.dandelion.com/Linux/".  The Linux DAC960
+driver supports all current DAC960 PCI family controllers including the
+AcceleRAID models, as well as the eXtremeRAID 1100; see below for a complete
+list.  For simplicity, in most places this documentation refers to DAC960
+generically rather than explicitly listing all the models.
+
+Bug reports should be sent via electronic mail to "lnz@dandelion.com".  Please
+include with the bug report the complete configuration messages reported by the
+driver at startup, along with any subsequent system messages relevant to the
+controller's operation, and a detailed description of your system's hardware
+configuration.
+
+Please consult the DAC960 RAID controller documentation for detailed
+information regarding installation and configuration of the controllers.  This
+document primarily provides information specific to the Linux DAC960 support.
+
+
+                               DRIVER FEATURES
+
+The DAC960 RAID controllers are supported solely as high performance RAID
+controllers, not as interfaces to arbitrary SCSI devices.  The Linux DAC960
+driver operates at the block device level, the same level as the SCSI and IDE
+drivers.  Unlike other RAID controllers currently supported on Linux, the
+DAC960 driver is not dependent on the SCSI subsystem, and hence avoids all the
+complexity and unnecessary code that would be associated with an implementation
+as a SCSI driver.  The DAC960 driver is designed for as high a performance as
+possible with no compromises or extra code for compatibility with lower
+performance devices.  The DAC960 driver includes extensive error logging and
+online configuration management capabilities.  Except for initial configuration
+of the controller and adding new disk drives, most everything can be handled
+from Linux while the system is operational.
+
+The DAC960 driver is architected to support up to 8 controllers per system.
+Each DAC960 controller can support up to 15 disk drives per channel, for a
+maximum of 45 drives on a three channel controller.  The drives installed on a
+controller are divided into one or more "Drive Groups", and then each Drive
+Group is subdivided further into 1 to 32 "Logical Drives".  Each Logical Drive
+has a specific RAID Level and caching policy associated with it, and it appears
+to Linux as a single block device.  Logical Drives are further subdivided into
+up to 7 partitions through the normal Linux and PC disk partitioning schemes.
+Logical Drives are also known as "System Drives", and Drive Groups are also
+called "Packs".  Both terms are in use in the Mylex documentation; I have
+chosen to standardize on the more generic "Logical Drive" and "Drive Group".
+
+DAC960 RAID disk devices are named in the style of the Device File System
+(DEVFS).  The device corresponding to Logical Drive D on Controller C is
+referred to as /dev/rd/cCdD, and the partitions are called /dev/rd/cCdDp1
+through /dev/rd/cCdDp7.  For example, partition 3 of Logical Drive 5 on
+Controller 2 is referred to as /dev/rd/c2d5p3.  Note that unlike with SCSI
+disks the device names will not change in the event of a disk drive failure.
+The DAC960 driver is assigned major numbers 48 - 55 with one major number per
+controller.  The 8 bits of minor number are divided into 5 bits for the Logical
+Drive and 3 bits for the partition.
+
+
+                SUPPORTED DAC960/DAC1100 PCI RAID CONTROLLERS
+
+The following list comprises the supported DAC960 and DAC1100 PCI RAID
+Controllers as of the date of this document.  It is recommended that anyone
+purchasing a Mylex PCI RAID Controller not in the following table contact the
+author beforehand to verify that it is or will be supported.
+
+eXtremeRAID 1100 (DAC1164P)
+           3 Wide Ultra-2/LVD SCSI channels
+           233MHz StrongARM SA 110 Processor
+           64 Bit PCI (backward compatible with 32 Bit PCI slots)
+           16MB/32MB/64MB Parity SDRAM Memory with Battery Backup
+
+AcceleRAID 250 (DAC960PTL1)
+           Uses onboard Symbios SCSI chips on certain motherboards
+           Also includes one onboard Wide Ultra-2/LVD SCSI Channel
+           66MHz Intel i960RD RISC Processor
+           4MB/8MB/16MB/32MB/64MB/128MB ECC EDO Memory
+
+AcceleRAID 200 (DAC960PTL0)
+           Uses onboard Symbios SCSI chips on certain motherboards
+           Includes no onboard SCSI Channels
+           66MHz Intel i960RD RISC Processor
+           4MB/8MB/16MB/32MB/64MB/128MB ECC EDO Memory
+
+AcceleRAID 150 (DAC960PRL)
+           Uses onboard Symbios SCSI chips on certain motherboards
+           Also includes one onboard Wide Ultra-2/LVD SCSI Channel
+           33MHz Intel i960RP RISC Processor
+           4MB Parity EDO Memory
+
+DAC960PJ    1/2/3 Wide Ultra SCSI-3 Channels
+           66MHz Intel i960RD RISC Processor
+           4MB/8MB/16MB/32MB/64MB/128MB ECC EDO Memory
+
+DAC960PG    1/2/3 Wide Ultra SCSI-3 Channels
+           33MHz Intel i960RP RISC Processor
+           4MB/8MB ECC EDO Memory
+
+DAC960PU    1/2/3 Wide Ultra SCSI-3 Channels
+           Intel i960CF RISC Processor
+           4MB/8MB EDRAM or 2MB/4MB/8MB/16MB/32MB DRAM Memory
+
+DAC960PD    1/2/3 Wide Fast SCSI-2 Channels
+           Intel i960CF RISC Processor
+           4MB/8MB EDRAM or 2MB/4MB/8MB/16MB/32MB DRAM Memory
+
+DAC960PL    1/2/3 Wide Fast SCSI-2 Channels
+           Intel i960 RISC Processor
+           2MB/4MB/8MB/16MB/32MB DRAM Memory
+
+For the eXtremeRAID 1100, firmware version 5.06-0-52 or above is required.
+
+For the AcceleRAID 250, 200, and 150, firmware version 4.06-0-57 or above is
+required.
+
+For the DAC960PJ and DAC960PG, firmware version 4.06-0-00 or above is required.
+
+For the DAC960PU, DAC960PD, and DAC960PL, firmware version 3.51-0-04 or above
+is required.
+
+Note that earlier revisions of the DAC960PU, DAC960PD, and DAC960PL controllers
+were delivered with version 2.xx firmware.  Version 2.xx firmware is not
+supported by this driver and no support is envisioned.  Contact Mylex RAID
+Technical Support to inquire about upgrading controllers with version 2.xx
+firmware to version 3.51-0-04.  Upgrading to version 3.xx firmware requires
+installation of higher capacity Flash ROM chips, and not all DAC960PD and
+DAC960PL controllers can be upgraded.
+
+Please note that not all SCSI disk drives are suitable for use with DAC960
+controllers, and only particular firmware versions of any given model may
+actually function correctly.  Similarly, not all motherboards have a BIOS that
+properly initializes the AcceleRAID 250, AcceleRAID 200, AcceleRAID 150,
+DAC960PJ, and DAC960PG because the Intel i960RD/RP is a multi-function device.
+If in doubt, contact Mylex RAID Technical Support (support@mylex.com) to verify
+compatibility.  Mylex makes available a hard disk compatibility list by FTP at
+ftp://ftp.mylex.com/pub/dac960/diskcomp.html.
+
+
+                             DRIVER INSTALLATION
+
+This distribution was prepared for Linux kernel version 2.2.11 or 2.0.37.
+
+To install the DAC960 RAID driver, you may use the following commands,
+replacing "/usr/src" with wherever you keep your Linux kernel source tree:
+
+  cd /usr/src
+  tar -xvzf DAC960-2.2.4.tar.gz (or DAC960-2.0.4.tar.gz)
+  mv README.DAC960 linux/Documentation
+  mv DAC960.[ch] linux/drivers/block
+  patch -p0 < DAC960.patch
+  cd linux
+  make config
+  make depend
+  make bzImage (or zImage)
+
+Then install "arch/i386/boot/bzImage" or "arch/i386/boot/zImage" as your
+standard kernel, run lilo if appropriate, and reboot.
+
+To create the necessary devices in /dev, the "make_rd" script included in
+"DAC960-Utilities.tar.gz" from http://www.dandelion.com/Linux/ may be used.
+LILO 21 and FDISK v2.9 include DAC960 support; also included in this archive
+are patches to LILO 20 and FDISK v2.8 that add DAC960 support, along with
+statically linked executables of LILO and FDISK.  This modified version of LILO
+will allow booting from a DAC960 controller and/or mounting the root file
+system from a DAC960.
+
+Red Hat Linux 6.0 and SuSE Linux 6.1 include support for Mylex PCI RAID
+controllers.  Installing directly onto a DAC960 may be problematic from other
+Linux distributions until their installation utilities are updated.
+
+
+                             INSTALLATION NOTES
+
+Before installing Linux or adding DAC960 logical drives to an existing Linux
+system, the controller must first be configured to provide one or more logical
+drives using the BIOS Configuration Utility or DACCF.  Please note that since
+there are only at most 6 usable partitions on each logical drive, systems
+requiring more partitions should subdivide a drive group into multiple logical
+drives, each of which can have up to 6 partitions.  Also, note that with large
+disk arrays it is advisable to enable the 8GB BIOS Geometry (255/63) rather
+than accepting the default 2GB BIOS Geometry (128/32); failing to do so will
+cause the logical drive geometry to have more than 65535 cylinders which will
+make it impossible for FDISK to be used properly.  The 8GB BIOS Geometry can be
+enabled by configuring the DAC960 BIOS, which is accessible via Alt-M during
+the BIOS initialization sequence.
+
+For maximum performance and the most efficient E2FSCK performance, it is
+recommended that EXT2 file systems be built with a 4KB block size and 16 block
+stride to match the DAC960 controller's 64KB default stripe size.  The command
+"mke2fs -b 4096 -R stride=16 <device>" is appropriate.  Unless there will be a
+large number of small files on the file systems, it is also beneficial to add
+the "-i 16384" option to increase the bytes per inode parameter thereby
+reducing the file system metadata.  Finally, on systems that will only be run
+with Linux 2.2 or later kernels it is beneficial to enable sparse superblocks
+with the "-s 1" option.
+
+
+                     DAC960 ANNOUNCEMENTS MAILING LIST
+
+The DAC960 Announcements Mailing List provides a forum for informing Linux
+users of new driver releases and other announcements regarding Linux support
+for DAC960 PCI RAID Controllers.  To join the mailing list, send a message to
+"dac960-announce-request@dandelion.com" with the line "subscribe" in the
+message body.
+
+
+               CONTROLLER CONFIGURATION AND STATUS MONITORING
+
+The DAC960 RAID controllers running firmware 4.06 or above include a Background
+Initialization facility so that system downtime is minimized both for initial
+installation and subsequent configuration of additional storage.  The BIOS
+Configuration Utility (accessible via Alt-R during the BIOS initialization
+sequence) is used to quickly configure the controller, and then the logical
+drives that have been created are available for immediate use even while they
+are still being initialized by the controller.  The primary need for online
+configuration and status monitoring is then to avoid system downtime when disk
+drives fail and must be replaced.  Mylex's online monitoring and configuration
+utilities are being ported to Linux and will become available at some point in
+the future.  Note that with a SAF-TE (SCSI Accessed Fault-Tolerant Enclosure)
+enclosure, the controller is able to rebuild failed drives automatically as
+soon as a drive replacement is made available.
+
+The primary interfaces for controller configuration and status monitoring are
+special files created in the /proc/rd/... hierarchy along with the normal
+system console logging mechanism.  Whenever the system is operating, the DAC960
+driver queries each controller for status information every 10 seconds, and
+checks for additional conditions every 60 seconds.  The initial status of each
+controller is always available for controller N in /proc/rd/cN/initial_status,
+and the current status as of the last status monitoring query is available in
+/proc/rd/cN/current_status.  In addition, status changes are also logged by the
+driver to the system console and will appear in the log files maintained by
+syslog.  The progress of asynchronous rebuild or consistency check operations
+is also available in /proc/rd/cN/current_status, and progress messages are
+logged to the system console at most every 60 seconds.
+
+Starting with the 2.2.3/2.0.3 versions of the driver, the status information
+available in /proc/rd/cN/initial_status and /proc/rd/cN/current_status has been
+augmented to include the vendor, model, revision, and serial number (if
+available) for each physical device found connected to the controller:
+
+***** DAC960 RAID Driver Version 2.2.3 of 19 August 1999 *****
+Copyright 1998-1999 by Leonard N. Zubkoff <lnz@dandelion.com>
+Configuring Mylex DAC960PRL PCI RAID Controller
+  Firmware Version: 4.07-0-07, Channels: 1, Memory Size: 16MB
+  PCI Bus: 1, Device: 4, Function: 1, I/O Address: Unassigned
+  PCI Address: 0xFE300000 mapped at 0xA0800000, IRQ Channel: 21
+  Controller Queue Depth: 128, Maximum Blocks per Command: 128
+  Driver Queue Depth: 127, Maximum Scatter/Gather Segments: 33
+  Stripe Size: 64KB, Segment Size: 8KB, BIOS Geometry: 255/63
+  SAF-TE Enclosure Management Enabled
+  Physical Devices:
+    0:0  Vendor: IBM       Model: DRVS09D           Revision: 0270
+         Serial Number:       68016775HA
+         Disk Status: Online, 17928192 blocks
+    0:1  Vendor: IBM       Model: DRVS09D           Revision: 0270
+         Serial Number:       68004E53HA
+         Disk Status: Online, 17928192 blocks
+    0:2  Vendor: IBM       Model: DRVS09D           Revision: 0270
+         Serial Number:       13013935HA
+         Disk Status: Online, 17928192 blocks
+    0:3  Vendor: IBM       Model: DRVS09D           Revision: 0270
+         Serial Number:       13016897HA
+         Disk Status: Online, 17928192 blocks
+    0:4  Vendor: IBM       Model: DRVS09D           Revision: 0270
+         Serial Number:       68019905HA
+         Disk Status: Online, 17928192 blocks
+    0:5  Vendor: IBM       Model: DRVS09D           Revision: 0270
+         Serial Number:       68012753HA
+         Disk Status: Online, 17928192 blocks
+    0:6  Vendor: ESG-SHV   Model: SCA HSBP M6       Revision: 0.61
+  Logical Drives:
+    /dev/rd/c0d0: RAID-5, Online, 89640960 blocks, Write Thru
+  No Rebuild or Consistency Check in Progress
+
+To simplify the monitoring process for custom software, the special file
+/proc/rd/status returns "OK" when all DAC960 controllers in the system are
+operating normally and no failures have occurred, or "ALERT" if any logical
+drives are offline or critical or any non-standby physical drives are dead.
+
+Configuration commands for controller N are available via the special file
+/proc/rd/cN/user_command.  A human readable command can be written to this
+special file to initiate a configuration operation, and the results of the
+operation can then be read back from the special file in addition to being
+logged to the system console.  The shell command sequence
+
+  echo "<configuration-command>" > /proc/rd/c0/user_command
+  cat /proc/rd/c0/user_command
+
+is typically used to execute configuration commands.  The configuration
+commands are:
+
+  flush-cache
+
+    The "flush-cache" command flushes the controller's cache.  The system
+    automatically flushes the cache at shutdown or if the driver module is
+    unloaded, so this command is only needed to be certain a write back cache
+    is flushed to disk before the system is powered off by a command to a UPS.
+    Note that the flush-cache command also stops an asynchronous rebuild or
+    consistency check, so it should not be used except when the system is being
+    halted.
+
+  kill <channel>:<target-id>
+
+    The "kill" command marks the physical drive <channel>:<target-id> as DEAD.
+    This command is provided primarily for testing, and should not be used
+    during normal system operation.
+
+  make-online <channel>:<target-id>
+
+    The "make-online" command changes the physical drive <channel>:<target-id>
+    from status DEAD to status ONLINE.  In cases where multiple physical drives
+    have been killed simultaneously, this command may be used to bring them
+    back online, after which a consistency check is advisable.
+
+    Warning: make-online should only be used on a dead physical drive that is
+    an active part of a drive group, never on a standby drive.
+
+  make-standby <channel>:<target-id>
+
+    The "make-standby" command changes physical drive <channel>:<target-id>
+    from status DEAD to status STANDBY.  It should only be used in cases where
+    a dead drive was replaced after an automatic rebuild was performed onto a
+    standby drive.  It cannot be used to add a standby drive to the controller
+    configuration if one was not created initially; the BIOS Configuration
+    Utility must be used for that currently.
+
+  rebuild <channel>:<target-id>
+
+    The "rebuild" command initiates an asynchronous rebuild onto physical drive
+    <channel>:<target-id>.  It should only be used when a dead drive has been
+    replaced.
+
+  check-consistency <logical-drive-number>
+
+    The "check-consistency" command initiates an asynchronous consistency check
+    of <logical-drive-number> with automatic restoration.  It can be run at
+    any time to verify the consistency of the redundancy information.
+
+  cancel-rebuild
+  cancel-consistency-check
+
+    The "cancel-rebuild" and "cancel-consistency-check" commands cancel any
+    rebuild or consistency check operations previously initiated.
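
As a concrete illustration, a periodic consistency check of logical drive 0 on controller 0 could be driven by a small wrapper around the echo/cat sequence shown earlier; the drive and controller numbers here are assumptions:

```shell
#!/bin/sh
# Sketch: issue a configuration command and read back its result.
run_user_command() {
    cmdfile="$1"; shift
    echo "$*" > "$cmdfile"
    cat "$cmdfile"    # result string, also logged to the console
}

if [ -w /proc/rd/c0/user_command ]; then
    run_user_command /proc/rd/c0/user_command check-consistency 0
    # A long-running check can be stopped early with:
    #   run_user_command /proc/rd/c0/user_command cancel-consistency-check
fi
```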
+
+
+              EXAMPLE I - DRIVE FAILURE WITHOUT A STANDBY DRIVE
+
+The following annotated logs demonstrate the controller configuration and
+online status monitoring capabilities of the Linux DAC960 Driver.  The test
+configuration comprises 6 1GB Quantum Atlas I disk drives on two channels of a
+DAC960PJ controller.  The physical drives are configured into a single drive
+group without a standby drive, and the drive group has been configured into two
+logical drives, one RAID-5 and one RAID-6.  Note that these logs are from an
+earlier version of the driver and the messages have changed somewhat with newer
+releases, but the functionality remains similar.  First, here is the current
+status of the RAID configuration:
+
+gwynedd:/u/lnz# cat /proc/rd/c0/current_status
+***** DAC960 RAID Driver Version 2.0.0 of 23 March 1999 *****
+Copyright 1998-1999 by Leonard N. Zubkoff <lnz@dandelion.com>
+Configuring Mylex DAC960PJ PCI RAID Controller
+  Firmware Version: 4.06-0-08, Channels: 3, Memory Size: 8MB
+  PCI Bus: 0, Device: 19, Function: 1, I/O Address: Unassigned
+  PCI Address: 0xFD4FC000 mapped at 0x8807000, IRQ Channel: 9
+  Controller Queue Depth: 128, Maximum Blocks per Command: 128
+  Driver Queue Depth: 127, Maximum Scatter/Gather Segments: 33
+  Stripe Size: 64KB, Segment Size: 8KB, BIOS Geometry: 255/63
+  Physical Devices:
+    0:1 - Disk: Online, 2201600 blocks
+    0:2 - Disk: Online, 2201600 blocks
+    0:3 - Disk: Online, 2201600 blocks
+    1:1 - Disk: Online, 2201600 blocks
+    1:2 - Disk: Online, 2201600 blocks
+    1:3 - Disk: Online, 2201600 blocks
+  Logical Drives:
+    /dev/rd/c0d0: RAID-5, Online, 5498880 blocks, Write Thru
+    /dev/rd/c0d1: RAID-6, Online, 3305472 blocks, Write Thru
+  No Rebuild or Consistency Check in Progress
+
+gwynedd:/u/lnz# cat /proc/rd/status
+OK
+
+The above messages indicate that everything is healthy, and /proc/rd/status
+returns "OK" indicating that there are no problems with any DAC960 controller
+in the system.  For demonstration purposes, while I/O is active Physical Drive
+1:1 is now disconnected, simulating a drive failure.  The failure is noted by
+the driver within 10 seconds of the controller's having detected it, and the
+driver logs the following console status messages indicating that Logical
+Drives 0 and 1 are now CRITICAL as a result of Physical Drive 1:1 being DEAD:
+
+DAC960#0: Physical Drive 1:2 Error Log: Sense Key = 6, ASC = 29, ASCQ = 02
+DAC960#0: Physical Drive 1:3 Error Log: Sense Key = 6, ASC = 29, ASCQ = 02
+DAC960#0: Physical Drive 1:1 killed because of timeout on SCSI command
+DAC960#0: Physical Drive 1:1 is now DEAD
+DAC960#0: Logical Drive 0 (/dev/rd/c0d0) is now CRITICAL
+DAC960#0: Logical Drive 1 (/dev/rd/c0d1) is now CRITICAL
+
+The Sense Keys logged here are just Check Condition / Unit Attention conditions
+arising from a SCSI bus reset that is forced by the controller during its error
+recovery procedures.  Concurrently with the above, the driver status available
+from /proc/rd also reflects the drive failure.  The status message in
+/proc/rd/status has changed from "OK" to "ALERT":
+
+gwynedd:/u/lnz# cat /proc/rd/status
+ALERT
+
+and /proc/rd/c0/current_status has been updated:
+
+gwynedd:/u/lnz# cat /proc/rd/c0/current_status
+  ...
+  Physical Devices:
+    0:1 - Disk: Online, 2201600 blocks
+    0:2 - Disk: Online, 2201600 blocks
+    0:3 - Disk: Online, 2201600 blocks
+    1:1 - Disk: Dead, 2201600 blocks
+    1:2 - Disk: Online, 2201600 blocks
+    1:3 - Disk: Online, 2201600 blocks
+  Logical Drives:
+    /dev/rd/c0d0: RAID-5, Critical, 5498880 blocks, Write Thru
+    /dev/rd/c0d1: RAID-6, Critical, 3305472 blocks, Write Thru
+  No Rebuild or Consistency Check in Progress
+
+Since there are no standby drives configured, the system can continue to access
+the logical drives in a performance degraded mode until the failed drive is
+replaced and a rebuild operation completed to restore the redundancy of the
+logical drives.  Once Physical Drive 1:1 is replaced with a properly
+functioning drive, or if the physical drive was killed without having failed
+(e.g., due to electrical problems on the SCSI bus), the user can instruct the
+controller to initiate a rebuild operation onto the newly replaced drive:
+
+gwynedd:/u/lnz# echo "rebuild 1:1" > /proc/rd/c0/user_command
+gwynedd:/u/lnz# cat /proc/rd/c0/user_command
+Rebuild of Physical Drive 1:1 Initiated
+
+The echo command instructs the controller to initiate an asynchronous rebuild
+operation onto Physical Drive 1:1, and the status message that results from the
+operation is then available for reading from /proc/rd/c0/user_command, as well
+as being logged to the console by the driver.
+
+Within 10 seconds of this command the driver logs the initiation of the
+asynchronous rebuild operation:
+
+DAC960#0: Rebuild of Physical Drive 1:1 Initiated
+DAC960#0: Physical Drive 1:1 Error Log: Sense Key = 6, ASC = 29, ASCQ = 01
+DAC960#0: Physical Drive 1:1 is now WRITE-ONLY
+DAC960#0: Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 1% completed
+
+and /proc/rd/c0/current_status is updated:
+
+gwynedd:/u/lnz# cat /proc/rd/c0/current_status
+  ...
+  Physical Devices:
+    0:1 - Disk: Online, 2201600 blocks
+    0:2 - Disk: Online, 2201600 blocks
+    0:3 - Disk: Online, 2201600 blocks
+    1:1 - Disk: Write-Only, 2201600 blocks
+    1:2 - Disk: Online, 2201600 blocks
+    1:3 - Disk: Online, 2201600 blocks
+  Logical Drives:
+    /dev/rd/c0d0: RAID-5, Critical, 5498880 blocks, Write Thru
+    /dev/rd/c0d1: RAID-6, Critical, 3305472 blocks, Write Thru
+  Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 6% completed
+
+As the rebuild progresses, the current status in /proc/rd/c0/current_status is
+updated every 10 seconds:
+
+gwynedd:/u/lnz# cat /proc/rd/c0/current_status
+  ...
+  Physical Devices:
+    0:1 - Disk: Online, 2201600 blocks
+    0:2 - Disk: Online, 2201600 blocks
+    0:3 - Disk: Online, 2201600 blocks
+    1:1 - Disk: Write-Only, 2201600 blocks
+    1:2 - Disk: Online, 2201600 blocks
+    1:3 - Disk: Online, 2201600 blocks
+  Logical Drives:
+    /dev/rd/c0d0: RAID-5, Critical, 5498880 blocks, Write Thru
+    /dev/rd/c0d1: RAID-6, Critical, 3305472 blocks, Write Thru
+  Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 15% completed
+
+and every minute a progress message is logged to the console by the driver:
+
+DAC960#0: Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 32% completed
+DAC960#0: Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 63% completed
+DAC960#0: Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 94% completed
+DAC960#0: Rebuild in Progress: Logical Drive 1 (/dev/rd/c0d1) 94% completed
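
Because progress appears in both the console log and /proc/rd/c0/current_status, a script that must wait for redundancy to be restored can simply poll the status file. A sketch, assuming controller 0; note that matching "in Progress:" (with the colon) distinguishes an active operation from the idle "No Rebuild or Consistency Check in Progress" line:

```shell
#!/bin/sh
# Sketch: block until no rebuild or consistency check is running.
# Active operations report "... in Progress: Logical Drive ..." with a
# colon; the idle message "No Rebuild or Consistency Check in Progress"
# has no colon, so matching "in Progress:" avoids a false positive.
wait_for_rebuild() {
    status_file="$1"
    while grep -q "in Progress:" "$status_file"; do
        sleep 10    # matches the driver's 10-second update interval
    done
}

if [ -r /proc/rd/c0/current_status ]; then
    wait_for_rebuild /proc/rd/c0/current_status
fi
```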
+
+Finally, the rebuild completes successfully.  The driver logs the status of the 
+logical and physical drives and the rebuild completion:
+
+DAC960#0: Rebuild Completed Successfully
+DAC960#0: Physical Drive 1:1 is now ONLINE
+DAC960#0: Logical Drive 0 (/dev/rd/c0d0) is now ONLINE
+DAC960#0: Logical Drive 1 (/dev/rd/c0d1) is now ONLINE
+
+/proc/rd/c0/current_status is updated:
+
+gwynedd:/u/lnz# cat /proc/rd/c0/current_status
+  ...
+  Physical Devices:
+    0:1 - Disk: Online, 2201600 blocks
+    0:2 - Disk: Online, 2201600 blocks
+    0:3 - Disk: Online, 2201600 blocks
+    1:1 - Disk: Online, 2201600 blocks
+    1:2 - Disk: Online, 2201600 blocks
+    1:3 - Disk: Online, 2201600 blocks
+  Logical Drives:
+    /dev/rd/c0d0: RAID-5, Online, 5498880 blocks, Write Thru
+    /dev/rd/c0d1: RAID-6, Online, 3305472 blocks, Write Thru
+  Rebuild Completed Successfully
+
+and /proc/rd/status indicates that everything is healthy once again:
+
+gwynedd:/u/lnz# cat /proc/rd/status
+OK
+
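The single-word OK/ALERT contract of /proc/rd/status makes it convenient for automated monitoring; a minimal watchdog sketch (the polling approach and the message wording are assumptions, not part of the driver):

```shell
#!/bin/sh
# Sketch: report a non-OK DAC960 status; suitable for a cron job.
check_rd_status() {
    status=$(cat "$1")
    if [ "$status" = "OK" ]; then
        return 0
    fi
    echo "DAC960 driver status: $status"
    return 1
}

if [ -r /proc/rd/status ]; then
    check_rd_status /proc/rd/status
fi
```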
+
+               EXAMPLE II - DRIVE FAILURE WITH A STANDBY DRIVE
+
+The following annotated logs demonstrate the controller configuration and
+online status monitoring capabilities of the Linux DAC960 Driver.  The test
+configuration comprises 6 1GB Quantum Atlas I disk drives on two channels of a
+DAC960PJ controller.  The physical drives are configured into a single drive
+group with a standby drive, and the drive group has been configured into two
+logical drives, one RAID-5 and one RAID-6.  Note that these logs are from an
+earlier version of the driver and the messages have changed somewhat with newer
+releases, but the functionality remains similar.  First, here is the current
+status of the RAID configuration:
+
+gwynedd:/u/lnz# cat /proc/rd/c0/current_status
+***** DAC960 RAID Driver Version 2.0.0 of 23 March 1999 *****
+Copyright 1998-1999 by Leonard N. Zubkoff <lnz@dandelion.com>
+Configuring Mylex DAC960PJ PCI RAID Controller
+  Firmware Version: 4.06-0-08, Channels: 3, Memory Size: 8MB
+  PCI Bus: 0, Device: 19, Function: 1, I/O Address: Unassigned
+  PCI Address: 0xFD4FC000 mapped at 0x8807000, IRQ Channel: 9
+  Controller Queue Depth: 128, Maximum Blocks per Command: 128
+  Driver Queue Depth: 127, Maximum Scatter/Gather Segments: 33
+  Stripe Size: 64KB, Segment Size: 8KB, BIOS Geometry: 255/63
+  Physical Devices:
+    0:1 - Disk: Online, 2201600 blocks
+    0:2 - Disk: Online, 2201600 blocks
+    0:3 - Disk: Online, 2201600 blocks
+    1:1 - Disk: Online, 2201600 blocks
+    1:2 - Disk: Online, 2201600 blocks
+    1:3 - Disk: Standby, 2201600 blocks
+  Logical Drives:
+    /dev/rd/c0d0: RAID-5, Online, 4399104 blocks, Write Thru
+    /dev/rd/c0d1: RAID-6, Online, 2754560 blocks, Write Thru
+  No Rebuild or Consistency Check in Progress
+
+gwynedd:/u/lnz# cat /proc/rd/status
+OK
+
+The above messages indicate that everything is healthy, and /proc/rd/status
+returns "OK" indicating that there are no problems with any DAC960 controller
+in the system.  For demonstration purposes, while I/O is active Physical Drive
+1:2 is now disconnected, simulating a drive failure.  The failure is noted by
+the driver within 10 seconds of the controller's having detected it, and the
+driver logs the following console status messages:
+
+DAC960#0: Physical Drive 1:1 Error Log: Sense Key = 6, ASC = 29, ASCQ = 02
+DAC960#0: Physical Drive 1:3 Error Log: Sense Key = 6, ASC = 29, ASCQ = 02
+DAC960#0: Physical Drive 1:2 killed because of timeout on SCSI command
+DAC960#0: Physical Drive 1:2 is now DEAD
+DAC960#0: Physical Drive 1:2 killed because it was removed
+DAC960#0: Logical Drive 0 (/dev/rd/c0d0) is now CRITICAL
+DAC960#0: Logical Drive 1 (/dev/rd/c0d1) is now CRITICAL
+
+Since a standby drive is configured, the controller automatically begins
+rebuilding onto the standby drive:
+
+DAC960#0: Physical Drive 1:3 is now WRITE-ONLY
+DAC960#0: Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 4% completed
+
+Concurrently with the above, the driver status available from /proc/rd also
+reflects the drive failure and automatic rebuild.  The status message in
+/proc/rd/status has changed from "OK" to "ALERT":
+
+gwynedd:/u/lnz# cat /proc/rd/status
+ALERT
+
+and /proc/rd/c0/current_status has been updated:
+
+gwynedd:/u/lnz# cat /proc/rd/c0/current_status
+  ...
+  Physical Devices:
+    0:1 - Disk: Online, 2201600 blocks
+    0:2 - Disk: Online, 2201600 blocks
+    0:3 - Disk: Online, 2201600 blocks
+    1:1 - Disk: Online, 2201600 blocks
+    1:2 - Disk: Dead, 2201600 blocks
+    1:3 - Disk: Write-Only, 2201600 blocks
+  Logical Drives:
+    /dev/rd/c0d0: RAID-5, Critical, 4399104 blocks, Write Thru
+    /dev/rd/c0d1: RAID-6, Critical, 2754560 blocks, Write Thru
+  Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 4% completed
+
+As the rebuild progresses, the current status in /proc/rd/c0/current_status is
+updated every 10 seconds:
+
+gwynedd:/u/lnz# cat /proc/rd/c0/current_status
+  ...
+  Physical Devices:
+    0:1 - Disk: Online, 2201600 blocks
+    0:2 - Disk: Online, 2201600 blocks
+    0:3 - Disk: Online, 2201600 blocks
+    1:1 - Disk: Online, 2201600 blocks
+    1:2 - Disk: Dead, 2201600 blocks
+    1:3 - Disk: Write-Only, 2201600 blocks
+  Logical Drives:
+    /dev/rd/c0d0: RAID-5, Critical, 4399104 blocks, Write Thru
+    /dev/rd/c0d1: RAID-6, Critical, 2754560 blocks, Write Thru
+  Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 40% completed
+
+and every minute a progress message is logged on the console by the driver:
+
+DAC960#0: Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 40% completed
+DAC960#0: Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 76% completed
+DAC960#0: Rebuild in Progress: Logical Drive 1 (/dev/rd/c0d1) 66% completed
+DAC960#0: Rebuild in Progress: Logical Drive 1 (/dev/rd/c0d1) 84% completed
+
+Finally, the rebuild completes successfully.  The driver logs the status of the 
+logical and physical drives and the rebuild completion:
+
+DAC960#0: Rebuild Completed Successfully
+DAC960#0: Physical Drive 1:3 is now ONLINE
+DAC960#0: Logical Drive 0 (/dev/rd/c0d0) is now ONLINE
+DAC960#0: Logical Drive 1 (/dev/rd/c0d1) is now ONLINE
+
+/proc/rd/c0/current_status is updated:
+
+***** DAC960 RAID Driver Version 2.0.0 of 23 March 1999 *****
+Copyright 1998-1999 by Leonard N. Zubkoff <lnz@dandelion.com>
+Configuring Mylex DAC960PJ PCI RAID Controller
+  Firmware Version: 4.06-0-08, Channels: 3, Memory Size: 8MB
+  PCI Bus: 0, Device: 19, Function: 1, I/O Address: Unassigned
+  PCI Address: 0xFD4FC000 mapped at 0x8807000, IRQ Channel: 9
+  Controller Queue Depth: 128, Maximum Blocks per Command: 128
+  Driver Queue Depth: 127, Maximum Scatter/Gather Segments: 33
+  Stripe Size: 64KB, Segment Size: 8KB, BIOS Geometry: 255/63
+  Physical Devices:
+    0:1 - Disk: Online, 2201600 blocks
+    0:2 - Disk: Online, 2201600 blocks
+    0:3 - Disk: Online, 2201600 blocks
+    1:1 - Disk: Online, 2201600 blocks
+    1:2 - Disk: Dead, 2201600 blocks
+    1:3 - Disk: Online, 2201600 blocks
+  Logical Drives:
+    /dev/rd/c0d0: RAID-5, Online, 4399104 blocks, Write Thru
+    /dev/rd/c0d1: RAID-6, Online, 2754560 blocks, Write Thru
+  Rebuild Completed Successfully
+
+and /proc/rd/status indicates that everything is healthy once again:
+
+gwynedd:/u/lnz# cat /proc/rd/status
+OK
+
+Note that the absence of a viable standby drive does not create an "ALERT"
+status.  Once dead Physical Drive 1:2 has been replaced, the controller must be
+told that this has occurred and that the newly replaced drive should become the
+new standby drive:
+
+gwynedd:/u/lnz# echo "make-standby 1:2" > /proc/rd/c0/user_command
+gwynedd:/u/lnz# cat /proc/rd/c0/user_command
+Make Standby of Physical Drive 1:2 Succeeded
+
+The echo command instructs the controller to make Physical Drive 1:2 into a
+standby drive, and the status message that results from the operation is then
+available for reading from /proc/rd/c0/user_command, as well as being logged to
+the console by the driver.  Within 60 seconds of this command the driver logs:
+
+DAC960#0: Physical Drive 1:2 Error Log: Sense Key = 6, ASC = 29, ASCQ = 01
+DAC960#0: Physical Drive 1:2 is now STANDBY
+DAC960#0: Make Standby of Physical Drive 1:2 Succeeded
+
+and /proc/rd/c0/current_status is updated:
+
+gwynedd:/u/lnz# cat /proc/rd/c0/current_status
+  ...
+  Physical Devices:
+    0:1 - Disk: Online, 2201600 blocks
+    0:2 - Disk: Online, 2201600 blocks
+    0:3 - Disk: Online, 2201600 blocks
+    1:1 - Disk: Online, 2201600 blocks
+    1:2 - Disk: Standby, 2201600 blocks
+    1:3 - Disk: Online, 2201600 blocks
+  Logical Drives:
+    /dev/rd/c0d0: RAID-5, Online, 4399104 blocks, Write Thru
+    /dev/rd/c0d1: RAID-6, Online, 2754560 blocks, Write Thru
+  Rebuild Completed Successfully
diff --git a/Documentation/networking/CREDITS.ipvs b/Documentation/networking/CREDITS.ipvs
deleted file mode 100644 (file)
index 95d869e..0000000
+++ /dev/null
@@ -1,26 +0,0 @@
-The contributors of the Linux Virtual Server project are listed
-as follows in alphabetical order.
-
-       Mike Douglas <spike@bayside.net>
-       Virtual Server Logo.
-
-       Matthew Kellett <matthewk@corelcomputer.com>
-       Added the loadable load-balancing module to VS-0.5 patch 
-       for kernel 2.0.
-
-       Peter Kese <peter.kese@ijs.si>
-       Suggested the idea of the local-node feature and provided a
-       local-node prototype patch for VS via tunneling.
-       Ported the VS patch to kernel 2.2 and rewrote the code.
-       Implemented the persistent client connection feature.
-
-       Joseph Mack <mack.joseph@epa.gov>
-       Gave a talk about Linux Virtual Server at LinuxExpo'99.
-
-       Rob Thomas <rob@rpi.net.au>
-       Wrote the "Greased Turkey" document about how to set up a
-       load-sharing server. (a little bit stale, though)
-
-       Wensong Zhang <wensong@iinchina.net>
-       Chief author and developer.
-   
diff --git a/Documentation/networking/ChangeLog.ipvs b/Documentation/networking/ChangeLog.ipvs
deleted file mode 100644 (file)
index 55d5eff..0000000
+++ /dev/null
@@ -1,292 +0,0 @@
-ChangeLog of Virtual Server patch for Linux 2.2
-
-Virtual Server patch for Linux 2.2 - Version 0.7 - July 9, 1999
-
-Changes:
--   Added a separate masq hash table for IPVS.
-
--   Added slow timers to expire masq entries. 
-    Slow timers are checked every second by default. Most overhead
-    of cascading timers is avoided.
-
-    With this new hash table and slow timers, the system can hold a
-    huge number of masq entries, but make sure that you have
-    enough free memory. One masq entry effectively costs 128 bytes of
-    memory (thanks to Alan Cox); if your box holds 1 million masq
-    entries (meaning it can receive 2000 connections per
-    second if the average masq expire time is 500 seconds), make sure
-    that you have 128M of free memory. And thanks to Alan for suggesting
-    the early random drop algorithm for masq entries, which prevents
-    the system from running out of memory; I will design and implement
-    this feature in the near future.
-
--   Fixed the unlocking bug in the ip_vs_del_dest().
-    Thank Ted Pavlic <tpavlic@netwalk.com> for reporting it.
-
-----------------------------------------------------------------------
-
-Virtual Server patch for Linux 2.2 - Version 0.6 - July 1, 1999
-
-Changes:
--   Fixed the overflow bug in the ip_vs_procinfo().
-    Thank Ted Pavlic <tpavlic@netwalk.com> for reporting it.
-
--   Added the functionality to change weight and forwarding
-    (dispatching) method of existing real server.
-    This is useful for load-informed scheduling.
-
--   Added the functionality to change scheduler of virtual service
-    on the fly.
-
--   Reorganized some code and changed names of some functions.
-    This makes the code more readable.
-
-----------------------------------------------------------------------
-
-Virtual Server patch for Linux 2.2 - Version 0.5 - June 22, 1999
-
-Changes:
--   Fixed the bug that LocalNode doesn't work in vs-0.4-2.2.9.
-    Thank Changwon Kim <chwkim@samsung.co.kr> for
-    reporting the bug and pointing me the checksum update
-    problem in the code.
-
-   Some code of VS in the ip_fw_demasquerade was reorganized
-    so that the packets for VS-Tunneling, VS-DRouting and LocalNode
-    skip the checksum update. This makes the code correct and efficient.
-
-
-----------------------------------------------------------------------
-
-Virtual Server patch for Linux 2.2 - Version 0.4 - June 1, 1999
-
-Most of the code was rewritten. The locking and refcnt handling was
-changed. The violation of the "no floats in kernel mode" rule in the
-weighted least-connection scheduling was fixed. This patch is more
-efficient, and should be more stable.
-
-
-----------------------------------------------------------------------
-
-Virtual Server patch for Linux 2.2 - Version 0.1~0.3 - May 1999
-
-Peter Kese <peter.kese@ijs.si> ported the VS patch to kernel 2.2,
-rewrote the code and loadable scheduling modules.
-       
-==========================================================================
-       
-ChangeLog of Virtual Server patch for Linux 2.0
-----------------------------------------------------------------------
-
-Virtual Server Patch for Linux - Version 0.9 - May 1, 1999
-
-Differences with virtual server patch version 0.8:
-
--  Add Virtual Server via Direct Routing
-   This approach was first implemented in IBM's NetDispatcher. All real
-   servers have their loopback alias interface configured with the virtual
-   IP address, the load balancer and the real servers must have one of
-   their interfaces physically linked by a hub/switch. When packets
-   destined for the virtual IP address arrive, the load balancer directly
-   routes them to the real servers; the real servers process the requests
-   and return the reply packets directly to the clients. Compared to the
-   virtual server via IP tunneling approach, this approach doesn't have
-   tunneling overhead (in fact, this overhead is minimal in most situations),
-   but requires that one of the load balancer's interfaces and the real
-   servers' interfaces be on the same physical segment.
-       
-  Add more statistics information
-   The active connection counter and the total connection counter of
-   each real server were added for all the scheduling algorithms.
-
--  Add resetting(zeroing) counters
-   The total connection counters of all real servers can be reset to zero.
-
--  Change some statements in the masq_expire function and the 
-   ip_fw_demasquerade function, so that ip_masq_free_ports won't become
-   an abnormal number after the masquerading entries for virtual server
-   are released.
-
--  Fix the bug of "double unlock on device queue"
-   Remove the unnecessary function call of skb_device_unlock(skb) in the
-   ip_pfvs_encapsule function, which sometimes causes a "kernel: double
-   unlock on device queue" warning in the virtual server via tunneling.
-
-  Many functions of the virtual server patch were split into the
-   linux/net/ipv4/ip_masq_pfvs.c.
-
--  Upgrade ippfvsadm 1.0.2 to ippfvsadm 1.0.3
-   Zeroing counters is supported in the new version. The ippfvsadm 1.0.3
-   can be used for all kernel with different virtual server options
-   without rebuilding the program.
-
---------------------------------------------------------------------
-
-Virtual Server Patch for Linux - Version 0.8 - March 6, 1999
-
-Differences with virtual server patch version 0.7:
-
--  Add virtual FTP server support
-   The original ippfvs via IP tunneling could not be used to
-   build a virtual FTP server, because the real servers could
-   not establish data connections to clients. The code was
-   added to parse the port number in the ftp control data
-   and create the corresponding masquerading entry for the
-   coming data connection.
-   Although the original ippfvs via NAT could be used to build
-   a virtual server, the data connection was established in
-   this way.
-     Real Server port:20  ----> ippfvs: allocate a free masq port
-       ----->  the client port
-   It is not elegant, and it is time-consuming. Now it has been changed
-   as follows:
-     Real Server port:20  ----> ippfvs port: 20 
-       ----> the client port
-
--  Change the port checking order in the ip_fw_demasquerade()
-   If the size of the masquerade hash table is well chosen, checking
-   a masquerading entry in the hash table will require just one
-   hit. This is much more efficient than checking the port for virtual
-   services, and there are at least 3 incoming packets for each
-   connection which require port checking. So, it is efficient
-   to check the masquerading hash table first and then check the
-   port for virtual services.
-
--  Remove a useless statement in the ip_masq_new_pfvs()
-   The useless statement in the ip_masq_new_pfvs function is
-       ip_masq_free_ports[masq_proto_num(proto)]++;
-   which may disturb system.
-
--  Change the header printing of the ip_pfvs_procinfo()
-       
---------------------------------------------------------------------
-
-Virtual Server Patch for Linux - Version 0.7 - February 10, 1999
-
-Differences with virtual server patch version 0.6:
-
-  Fix a bug in detecting the end of a connection for tunneling
-   or NATing to the local node.
-   Since the server replies to the client directly in tunneling or
-   NATing to the local node, the load balancer (LinuxDirector)
-   can only detect a FIN segment. It was a mistake that the masq
-   entry was removed only if FIN segments from both sides were detected,
-   and then the masq entry expired in 15 minutes. For the
-   situation above, the code was changed to make the masq entry
-   expire in TCP_FIN_TIMEOUT (2min) when an incoming FIN segment
-   is detected.
--  Add the patch version printing in the ip_pfvs_procinfo()
-   It would be easy for users and hackers to know which
-   virtual server patch version they are running. Thank
-   Peter Kese <peter.kese@ijs.si> for the suggestion.
-
---------------------------------------------------------------------
-
-Virtual Server Patch for Linux - Version 0.6 - February 2, 1999
-
-Differences with virtual server patch version 0.5:
-
--  Add the local node feature in virtual server.
-   If the local node feature is enabled, the load balancer can 
-   not only redirect the packets of the specified port to the 
-   other servers (remote nodes) to process it, but also can process 
-   the packets locally (local node). Which node is chosen depends on
-   the scheduling algorithms.
-   This local node feature can be used to build a virtual server of
-   a few nodes, for example, 2, 3 or more sites, in which it is a 
-   resource waste if the load balancer is only used to redirect
-   packets. It is wise to direct some packets to the local node to
-   process. This feature can also be used to build distributed
-   identical servers, in which a server that is too busy to handle
-   requests locally can seamlessly forward them to other servers
-   to process.
-   This feature can be applied to both virtual server via NAT and
-   virtual server via IP tunneling.
-   Thank Peter Kese <peter.kese@ijs.si> for idea of "Two node Virtual
-   Server" and his single line patch for virtual server via IP
-   tunneling.
--  Remove a useless function call ip_send_check in the virtual
-   server via IP tunneling code. 
-
---------------------------------------------------------------------
-
-Virtual Server Patch for Linux - Version 0.5 - November 25, 1998
-
-Differences with virtual server patch version 0.4:
-
--  Add the feature of virtual server via IP tunneling.
-   If the ippfvs is enabled using IP tunneling, the load balancer
-   chooses a real server from a cluster based on a scheduling algorithm,
-   encapsulates the packet and forwards it to the chosen server. All real
-   servers are configured with "ifconfig tunl0 <Virtual IP Address> up".
-   When the chosen server receives the encapsulated packet, it decapsulates
-   the packet, processes the request and returns the reply packets
-   directly to the client without passing the load balancer. This can 
-   greatly increase the scalability of virtual server.
--  Fix a bug in the ip_portfw_del() for the weighted RR scheduling.
-   The bug in version 0.4 is when the weighted round-robin scheduling
-   is used, deleting the last rule for a virtual server will report
-   "setsockopt failed: Invalid argument" warning, in fact the last
-   rule is deleted but the gen_scheduling_seq() works on a null list
-   and causes that warning.
--  Add and modify some description for virtual server options in
-   the Linux kernel configuration help texts.
-
---------------------------------------------------------------------
-
-Virtual Server Patch for Linux - Version 0.4 - November 12, 1998
-
-Differences with virtual server patch version 0.3:
-
--  Fix a memory access error bug.
-   The set_serverpointer_null() function was added to scan all existing
-   ip masquerading records for server pointers that point to the
-   specified server and set them null. This is needed when administrators
-   delete a real server or all real servers: pointers to the deleted
-   server must be set null. Otherwise, decreasing the connection
-   counter of the server may cause a memory access error when the connection
-   terminates or times out.
-
---------------------------------------------------------------------
-
-Virtual Server Patch for Linux - Version 0.3 - November 10, 1998
-
-Differences with virtual server patch version 0.2:
-
--  Change the simple round-robin scheduling to the weighted round-robin
-   scheduling. Simple round-robin is a special instance of the weighted
-   round-robin scheduling in which the weights of the servers are the same.
--  The scheduling algorithm, originally called the weighted round-robin
-   scheduling in version 0.2, actually is the weighted least-connection
-   scheduling. So the concept is clarified here.
--  Add the least-connection scheduling algorithm. Although it is a 
-   special instance of the weighted least-connection scheduling algorithm,
-   it is used to avoid dividing the weight in looking up servers when
-   the weights of the servers are the same, so the overhead of scheduling
-   can be minimized in this case.
--  Change the type of the server load variables, curr_load and least_load,
-   from integer to float in the weighted least-connection scheduling.
-   It can make a better load-balancing when the weights specified are high.
--  Merge the original two patches into one. Users have to specify which
-   scheduling algorithm is used, the weighted round-robin scheduling,
-   the least-connection scheduling, or the weighted least-connection
-   scheduling, before rebuilding the kernel.
--  Change the ip_pfvs_proc function to make the output of the port 
-   forwarding & virtual server table more beautiful.
-
---------------------------------------------------------------------
-
-Virtual Server Patch for Linux - Version 0.2 - May 28, 1998
-
-Differences with virtual server patch version 0.1:
-
--  Add the weighted round-robin scheduling patch.
-
---------------------------------------------------------------------
-
-Virtual Server Patch for Linux - Version 0.1 - May 26, 1998
-
--  Implement the infrastructure of virtual server.
--  Implement the simple round-robin scheduling algorithm.
-
---------------------------------------------------------------------
diff --git a/Documentation/networking/README.ipvs b/Documentation/networking/README.ipvs
deleted file mode 100644 (file)
index a7a626b..0000000
+++ /dev/null
@@ -1,93 +0,0 @@
-README of Virtual Server Patch for Linux 2.2.10
---------------------------------------------------------------------
-
-Virtual Server Patch for Linux 2.2.10 - Version 0.7 - July 9, 1999
-
-Copyright (c) 1998,1999 by Wensong Zhang, Peter Kese.
-This is free software.  See below for details.
-
-The ipvs is IP Virtual Server support in the Linux kernel, which can be used
-to build a high-performance and highly available server. Check out the
-Linux Virtual Server Project homepage on the World Wide Web:
-       http://proxy.iinchina.net/~wensong/ippfvs/
-for the most recent information and original sources about ipvs.
-
-We now call the Linux box running ipvs LinuxDirector. Thank 
-Robert Thomas <rob@rpi.net.au> for this name, I love it. :-)
-
-This patch (Version 0.7) is for the Linux kernel 2.2.10. See the ChangeLog
-for how the code has been improved and what new features it has now.
-
-To rebuild a Linux kernel with virtual server support, first get a clean
-copy of the Linux kernel source of the right version and apply the patch
-to the kernel. The commands can be as follows: 
-       cd /usr/src/linux
-       cat <path-name>/ipvs-0.7-2.2.10.patch | patch -p1 
-Then make sure the following kernel compile options at least are selected
-via "make menuconfig" or "make xconfig".
-
-Kernel Compile Options:
-
-Code maturity level options ---
-       [*] Prompt for development and/or incomplete code/drivers
-Networking options ---
-        [*] Network firewalls
-        ....
-        [*] IP: firewalling
-        [*] IP: always defragment (required for masquerading)
-        ....
-        [*] IP: masquerading
-        ....
-        [*] IP: masquerading virtual server support
-       (12) IP masquerading table size (the Nth power of 2)
-       < > IPVS: round-robin scheduling
-       < > IPVS: weighted round-robin scheduling
-       < > IPVS: weighted least-connection scheduling
-       < > IPVS: persistent client connection scheduling
-Note that you can compile scheduling algorithms in kernel or as modules.
-
-Finally, rebuild the kernel. Once you have your kernel properly built, 
-update your system kernel and reboot.
-
-Note that there are three request dispatching techniques existing together
-in the LinuxDirector, and there are also three scheduling algorithms
-implemented. Both the VS via IP Tunneling and the VS via Direct Routing 
-can greatly increase the scalability of virtual server. If the VS-Tunneling
-is selected, it requires that all the servers must be configured with 
-       ifconfig tunl0 <Virtual IP Address> netmask 255.255.255.255
-If the VS-DRouting is chosen, it requires that all servers must be configured
-with the following command:
-       ifconfig lo:0 <Virtual IP Address> netmask 255.255.255.255
-The localnode feature can make that the LinuxDiretor can not only redirect
-packets to other servers, but also process packets locally.
-
-Thanks must go to other contributors, check the CREDITS file to know
-who they are.
-
-There is a mailing list for virtual server. You are welcome to talk about
-building the virtual server kernel, using the virtual server and making
-the virtual server better there. :-) To subscribe, send a message to
-       majordomo@iinchina.net
-with the body of "subscribe linux-virtualserver".
-
-
-Wensong Zhang <wensong@iinchina.net>
-
-
---------------------------------------------------------------------
-
-This program is free software; you can redistribute it and/or modify
-it under the terms of the GNU General Public License as published by
-the Free Software Foundation; either version 2 of the License, or
-(at your option) any later version.
-
-This program is distributed in the hope that it will be useful,
-but WITHOUT ANY WARRANTY; without even the implied warranty of
-MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-GNU General Public License for more details.
-
-You should have received a copy of the GNU General Public License
-along with this program; if not, write to the Free Software
-Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
-
---------------------------------------------------------------------
index 2c22c0c4bfd222fcabd6a60617023e14c01ac263..cf8911e743f8f4ca1d9808ede77d5a88100a23a3 100644 (file)
@@ -14,6 +14,8 @@
 #include <linux/in.h>
 #include <linux/in6.h>
 #include <linux/pci.h>
+#include <linux/tty.h>
+#include <linux/mm.h>
 
 #include <asm/io.h>
 #include <asm/hwrpb.h>
 #include <linux/interrupt.h>
 #include <asm/softirq.h>
 #include <asm/fpu.h>
+#include <asm/irq.h>
+#include <asm/machvec.h>
+#include <asm/pgtable.h>
+#include <asm/semaphore.h>
 
 #define __KERNEL_SYSCALLS__
 #include <asm/unistd.h>
@@ -41,8 +47,14 @@ extern void __remlu (void);
 extern void __divqu (void);
 extern void __remqu (void);
 
+EXPORT_SYMBOL(alpha_mv);
 EXPORT_SYMBOL(local_bh_count);
 EXPORT_SYMBOL(local_irq_count);
+EXPORT_SYMBOL(enable_irq);
+EXPORT_SYMBOL(disable_irq);
+EXPORT_SYMBOL(disable_irq_nosync);
+EXPORT_SYMBOL(screen_info);
+EXPORT_SYMBOL(perf_irq);
 
 /* platform dependent support */
 EXPORT_SYMBOL(_inb);
@@ -76,12 +88,14 @@ EXPORT_SYMBOL(strnlen);
 EXPORT_SYMBOL(strncat);
 EXPORT_SYMBOL(strstr);
 EXPORT_SYMBOL(strtok);
+EXPORT_SYMBOL(strpbrk);
 EXPORT_SYMBOL(strchr);
 EXPORT_SYMBOL(strrchr);
 EXPORT_SYMBOL(memcmp);
 EXPORT_SYMBOL(memmove);
 EXPORT_SYMBOL(__memcpy);
 EXPORT_SYMBOL(__memset);
+EXPORT_SYMBOL(__memsetw);
 EXPORT_SYMBOL(__constant_c_memset);
 
 EXPORT_SYMBOL(dump_thread);
@@ -90,8 +104,8 @@ EXPORT_SYMBOL(hwrpb);
 EXPORT_SYMBOL(wrusp);
 EXPORT_SYMBOL(start_thread);
 EXPORT_SYMBOL(alpha_read_fp_reg);
-EXPORT_SYMBOL(alpha_write_fp_reg);
 EXPORT_SYMBOL(alpha_read_fp_reg_s);
+EXPORT_SYMBOL(alpha_write_fp_reg);
 EXPORT_SYMBOL(alpha_write_fp_reg_s);
 
 /* In-kernel system calls.  */
@@ -117,7 +131,9 @@ EXPORT_SYMBOL(csum_ipv6_magic);
 
 #ifdef CONFIG_MATHEMU_MODULE
 extern long (*alpha_fp_emul_imprecise)(struct pt_regs *, unsigned long);
+extern long (*alpha_fp_emul) (unsigned long pc);
 EXPORT_SYMBOL(alpha_fp_emul_imprecise);
+EXPORT_SYMBOL(alpha_fp_emul);
 #endif
 
 /*
@@ -128,6 +144,44 @@ EXPORT_SYMBOL_NOVERS(__do_clear_user);
 EXPORT_SYMBOL(__strncpy_from_user);
 EXPORT_SYMBOL(__strlen_user);
 
+/*
+ * The following are specially called from the semaphore assembly stubs.
+ */
+EXPORT_SYMBOL_NOVERS(__down_failed);
+EXPORT_SYMBOL_NOVERS(__down_failed_interruptible);
+EXPORT_SYMBOL_NOVERS(__up_wakeup);
+
+/* 
+ * SMP-specific symbols.
+ */
+
+#ifdef __SMP__
+EXPORT_SYMBOL(synchronize_irq);
+EXPORT_SYMBOL(flush_tlb_all);
+EXPORT_SYMBOL(flush_tlb_mm);
+EXPORT_SYMBOL(flush_tlb_page);
+EXPORT_SYMBOL(flush_tlb_range);
+EXPORT_SYMBOL(cpu_data);
+EXPORT_SYMBOL(cpu_number_map);
+EXPORT_SYMBOL(global_bh_lock);
+EXPORT_SYMBOL(global_bh_count);
+EXPORT_SYMBOL(synchronize_bh);
+EXPORT_SYMBOL(global_irq_holder);
+EXPORT_SYMBOL(__global_cli);
+EXPORT_SYMBOL(__global_sti);
+EXPORT_SYMBOL(__global_save_flags);
+EXPORT_SYMBOL(__global_restore_flags);
+#if DEBUG_SPINLOCK
+EXPORT_SYMBOL(spin_unlock);
+EXPORT_SYMBOL(debug_spin_lock);
+EXPORT_SYMBOL(debug_spin_trylock);
+#endif
+#if DEBUG_RWLOCK
+EXPORT_SYMBOL(write_lock);
+EXPORT_SYMBOL(read_lock);
+#endif
+#endif /* __SMP__ */
+
 /*
  * The following are special because they're not called
  * explicitly (the C compiler or assembler generates them in
@@ -147,3 +201,5 @@ EXPORT_SYMBOL_NOVERS(__remq);
 EXPORT_SYMBOL_NOVERS(__remqu);
 EXPORT_SYMBOL_NOVERS(memcpy);
 EXPORT_SYMBOL_NOVERS(memset);
+
+
index 62fc7e6e53e4e8735b0bc898463df2078efaead9..50250befb738d4a6b511770305848d264f9ff613 100644 (file)
@@ -568,65 +568,65 @@ mcpcia_print_uncorrectable(struct el_MCPCIA_uncorrected_frame_mcheck *logout)
 
        /* Print PAL fields */
        for (i = 0; i < 24; i += 2) {
-               printk("\tpal temp[%d-%d]\t\t= %16lx %16lx\n\r",
+               printk("\tpal temp[%d-%d]\t\t= %16lx %16lx\n",
                       i, i+1, frame->paltemp[i], frame->paltemp[i+1]);
        }
        for (i = 0; i < 8; i += 2) {
-               printk("\tshadow[%d-%d]\t\t= %16lx %16lx\n\r",
+               printk("\tshadow[%d-%d]\t\t= %16lx %16lx\n",
                       i, i+1, frame->shadow[i], 
                       frame->shadow[i+1]);
        }
-       printk("\tAddr of excepting instruction\t= %16lx\n\r",
+       printk("\tAddr of excepting instruction\t= %16lx\n",
               frame->exc_addr);
-       printk("\tSummary of arithmetic traps\t= %16lx\n\r",
+       printk("\tSummary of arithmetic traps\t= %16lx\n",
               frame->exc_sum);
-       printk("\tException mask\t\t\t= %16lx\n\r",
+       printk("\tException mask\t\t\t= %16lx\n",
               frame->exc_mask);
-       printk("\tBase address for PALcode\t= %16lx\n\r",
+       printk("\tBase address for PALcode\t= %16lx\n",
               frame->pal_base);
-       printk("\tInterrupt Status Reg\t\t= %16lx\n\r",
+       printk("\tInterrupt Status Reg\t\t= %16lx\n",
               frame->isr);
-       printk("\tCURRENT SETUP OF EV5 IBOX\t= %16lx\n\r",
+       printk("\tCURRENT SETUP OF EV5 IBOX\t= %16lx\n",
               frame->icsr);
-       printk("\tI-CACHE Reg %s parity error\t= %16lx\n\r",
+       printk("\tI-CACHE Reg %s parity error\t= %16lx\n",
               (frame->ic_perr_stat & 0x800L) ? 
               "Data" : "Tag", 
               frame->ic_perr_stat); 
-       printk("\tD-CACHE error Reg\t\t= %16lx\n\r",
+       printk("\tD-CACHE error Reg\t\t= %16lx\n",
               frame->dc_perr_stat);
        if (frame->dc_perr_stat & 0x2) {
                switch (frame->dc_perr_stat & 0x03c) {
                case 8:
-                       printk("\t\tData error in bank 1\n\r");
+                       printk("\t\tData error in bank 1\n");
                        break;
                case 4:
-                       printk("\t\tData error in bank 0\n\r");
+                       printk("\t\tData error in bank 0\n");
                        break;
                case 20:
-                       printk("\t\tTag error in bank 1\n\r");
+                       printk("\t\tTag error in bank 1\n");
                        break;
                case 10:
-                       printk("\t\tTag error in bank 0\n\r");
+                       printk("\t\tTag error in bank 0\n");
                        break;
                }
        }
-       printk("\tEffective VA\t\t\t= %16lx\n\r",
+       printk("\tEffective VA\t\t\t= %16lx\n",
               frame->va);
-       printk("\tReason for D-stream\t\t= %16lx\n\r",
+       printk("\tReason for D-stream\t\t= %16lx\n",
               frame->mm_stat);
-       printk("\tEV5 SCache address\t\t= %16lx\n\r",
+       printk("\tEV5 SCache address\t\t= %16lx\n",
               frame->sc_addr);
-       printk("\tEV5 SCache TAG/Data parity\t= %16lx\n\r",
+       printk("\tEV5 SCache TAG/Data parity\t= %16lx\n",
               frame->sc_stat);
-       printk("\tEV5 BC_TAG_ADDR\t\t\t= %16lx\n\r",
+       printk("\tEV5 BC_TAG_ADDR\t\t\t= %16lx\n",
               frame->bc_tag_addr);
-       printk("\tEV5 EI_ADDR: Phys addr of Xfer\t= %16lx\n\r",
+       printk("\tEV5 EI_ADDR: Phys addr of Xfer\t= %16lx\n",
               frame->ei_addr);
-       printk("\tFill Syndrome\t\t\t= %16lx\n\r",
+       printk("\tFill Syndrome\t\t\t= %16lx\n",
               frame->fill_syndrome);
-       printk("\tEI_STAT reg\t\t\t= %16lx\n\r",
+       printk("\tEI_STAT reg\t\t\t= %16lx\n",
               frame->ei_stat);
-       printk("\tLD_LOCK\t\t\t\t= %16lx\n\r",
+       printk("\tLD_LOCK\t\t\t\t= %16lx\n",
               frame->ld_lock);
 }
 
@@ -657,7 +657,8 @@ mcpcia_machine_check(unsigned long vector, unsigned long la_ptr,
        process_mcheck_info(vector, la_ptr, regs, "MCPCIA",
                            DEBUG_MCHECK, MCPCIA_mcheck_expected[cpu]);
 
-       if (vector != 0x620 && vector != 0x630) {
+       if (vector != 0x620 && vector != 0x630
+           && ! MCPCIA_mcheck_expected[cpu]) {
                mcpcia_print_uncorrectable(mchk_logout);
        }
 
index a89d4c9e3d9de750aa40c92002d45317de591aa9..19109d12e1a12ca385f63de9007b52c7296f95a6 100644 (file)
@@ -88,6 +88,7 @@ cpu_idle(void *unused)
 
                /* Although we are an idle CPU, we do not want to 
                   get into the scheduler unnecessarily.  */
+               barrier();
                if (current->need_resched) {
                        schedule();
                        check_pgt_cache();
index 15715f469dd6ccddb939f879699ef7d14fd917a0..b52d386baf3fd5293811ee22f0b3781eadc4592b 100644 (file)
@@ -340,8 +340,11 @@ find_end_memory(void)
        high = (high + PAGE_SIZE) & (PAGE_MASK*2);
 
        /* Enforce maximum of 2GB even if there is more.  Blah.  */
-       if (high > 0x80000000UL)
+       if (high > 0x80000000UL) {
+               printk("Cropping memory from %luMB to 2048MB\n", high);
                high = 0x80000000UL;
+       }
+
        return PAGE_OFFSET + high;
 }
 
index 453b36aa4b03b9c31f6536745911e1ac08e05838..49f04a8923e2b8ad9d82f493e0002f67353b8ecc 100644 (file)
@@ -93,20 +93,11 @@ CONFIG_IDEDMA_AUTO=y
 #
 # CONFIG_BLK_DEV_LOOP is not set
 # CONFIG_BLK_DEV_NBD is not set
-CONFIG_BLK_DEV_MD=y
-CONFIG_AUTODETECT_RAID=y
-CONFIG_MD_LINEAR=y
-CONFIG_MD_STRIPED=y
-CONFIG_MD_MIRRORING=y
-CONFIG_MD_RAID5=y
-CONFIG_MD_TRANSLUCENT=y
-CONFIG_MD_LVM=y
-CONFIG_MD_BOOT=y
+# CONFIG_BLK_DEV_MD is not set
 # CONFIG_BLK_DEV_RAM is not set
 # CONFIG_BLK_DEV_XD is not set
 CONFIG_PARIDE_PARPORT=y
 # CONFIG_PARIDE is not set
-# CONFIG_BLK_CPQ_DA is not set
 # CONFIG_BLK_DEV_HD is not set
 
 #
index 0d6c177919c9aad1602cd6c5f3aa501dc303dd46..b95c65ed40e1e7436f39188e9c007f6056e506a6 100644 (file)
     19990512   Richard Gooch <rgooch@atnf.csiro.au>
               Minor cleanups.
   v1.35
+    19990812   Zoltan Boszormenyi <zboszor@mol.hu>
+               PRELIMINARY CHANGES!!! ONLY FOR TESTING!!!
+               Rearrange switch() statements so the driver accomodates to
+               the fact that the AMD Athlon handles its MTRRs the same way
+               as Intel does.
+               
+    19990819   Alan Cox <alan@redhat.com>
+              Tested Zoltan's changes on a pre production Athlon - 100%
+              success. Fixed one fall through check to be Intel only.
 */
+
 #include <linux/types.h>
 #include <linux/errno.h>
 #include <linux/sched.h>
 #include <asm/hardirq.h>
 #include "irq.h"
 
-#define MTRR_VERSION            "1.35 (19990512)"
+#define MTRR_VERSION            "1.35a (19990819)"
 
 #define TRUE  1
 #define FALSE 0
@@ -321,6 +331,8 @@ static void set_mtrr_prepare (struct set_mtrr_context *ctxt)
     switch (boot_cpu_data.x86_vendor)
     {
       case X86_VENDOR_AMD:
+       if (boot_cpu_data.x86 >= 6) break; /* Athlon and post-Athlon CPUs */
+       /* else fall through */
       case X86_VENDOR_CENTAUR:
        return;
        /*break;*/
@@ -344,6 +356,7 @@ static void set_mtrr_prepare (struct set_mtrr_context *ctxt)
 
     switch (boot_cpu_data.x86_vendor)
     {
+      case X86_VENDOR_AMD:
       case X86_VENDOR_INTEL:
        /*  Disable MTRRs, and set the default type to uncached  */
        rdmsr (MTRRdefType_MSR, ctxt->deftype_lo, ctxt->deftype_hi);
@@ -365,6 +378,8 @@ static void set_mtrr_done (struct set_mtrr_context *ctxt)
     switch (boot_cpu_data.x86_vendor)
     {
       case X86_VENDOR_AMD:
+       if (boot_cpu_data.x86 >= 6) break; /* Athlon and post-Athlon CPUs */
+       /* else fall through */
       case X86_VENDOR_CENTAUR:
        __restore_flags (ctxt->flags);
        return;
@@ -376,6 +391,7 @@ static void set_mtrr_done (struct set_mtrr_context *ctxt)
     /*  Restore MTRRdefType  */
     switch (boot_cpu_data.x86_vendor)
     {
+      case X86_VENDOR_AMD:
       case X86_VENDOR_INTEL:
        wrmsr (MTRRdefType_MSR, ctxt->deftype_lo, ctxt->deftype_hi);
        break;
@@ -406,6 +422,9 @@ static unsigned int get_num_var_ranges (void)
 
     switch (boot_cpu_data.x86_vendor)
     {
+      case X86_VENDOR_AMD:
+       if (boot_cpu_data.x86 < 6) return 2; /* pre-Athlon CPUs */
+       /* else fall through */
       case X86_VENDOR_INTEL:
        rdmsr (MTRRcap_MSR, config, dummy);
        return (config & 0xff);
@@ -416,9 +435,6 @@ static unsigned int get_num_var_ranges (void)
         /*  and Centaur has 8 MCR's  */
        return 8;
        /*break;*/
-      case X86_VENDOR_AMD:
-       return 2;
-       /*break;*/
     }
     return 0;
 }   /*  End Function get_num_var_ranges  */
@@ -430,12 +446,14 @@ static int have_wrcomb (void)
 
     switch (boot_cpu_data.x86_vendor)
     {
+      case X86_VENDOR_AMD:
+       if (boot_cpu_data.x86 < 6) return 1; /* pre-Athlon CPUs */
+       /* else fall through */
       case X86_VENDOR_INTEL:
        rdmsr (MTRRcap_MSR, config, dummy);
        return (config & (1<<10));
        /*break;*/
       case X86_VENDOR_CYRIX:
-      case X86_VENDOR_AMD:
       case X86_VENDOR_CENTAUR:
        return 1;
        /*break;*/
@@ -1062,9 +1080,23 @@ int mtrr_add (unsigned long base, unsigned long size, unsigned int type,
     if ( !(boot_cpu_data.x86_capability & X86_FEATURE_MTRR) ) return -ENODEV;
     switch (boot_cpu_data.x86_vendor)
     {
+      case X86_VENDOR_AMD:
+       if (boot_cpu_data.x86 < 6) { /* pre-Athlon CPUs */
+         /* Apply the K6 block alignment and size rules
+            In order
+               o Uncached or gathering only
+               o 128K or bigger block
+               o Power of 2 block
+               o base suitably aligned to the power
+           */
+         if (type > MTRR_TYPE_WRCOMB || size < (1 << 17) ||
+             (size & ~(size-1))-size || (base & (size-1)))
+             return -EINVAL;
+         break;
+       } /* else fall through */
       case X86_VENDOR_INTEL:
        /*  For Intel PPro stepping <= 7, must be 4 MiB aligned  */
-       if ( (boot_cpu_data.x86 == 6) && (boot_cpu_data.x86_model == 1) &&
+       if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL && (boot_cpu_data.x86 == 6) && (boot_cpu_data.x86_model == 1) &&
             (boot_cpu_data.x86_mask <= 7) && ( base & ( (1 << 22) - 1 ) ) )
        {
            printk ("mtrr: base(0x%lx) is not 4 MiB aligned\n", base);
@@ -1105,18 +1137,6 @@ int mtrr_add (unsigned long base, unsigned long size, unsigned int type,
            return -EINVAL;
        }
        break;
-      case X86_VENDOR_AMD:
-       /* Apply the K6 block alignment and size rules
-          In order
-             o Uncached or gathering only
-             o 128K or bigger block
-             o Power of 2 block
-             o base suitably aligned to the power
-         */
-       if (type > MTRR_TYPE_WRCOMB || size < (1 << 17) ||
-           (size & ~(size-1))-size || (base & (size-1)))
-           return -EINVAL;
-       break;
       default:
        return -EINVAL;
        /*break;*/
@@ -1657,6 +1677,12 @@ __initfunc(static void mtrr_setup (void))
     printk ("mtrr: v%s Richard Gooch (rgooch@atnf.csiro.au)\n", MTRR_VERSION);
     switch (boot_cpu_data.x86_vendor)
     {
+      case X86_VENDOR_AMD:
+       if (boot_cpu_data.x86 < 6) { /* pre-Athlon CPUs */
+         get_mtrr = amd_get_mtrr;
+         set_mtrr_up = amd_set_mtrr_up;
+         break;
+       } /* else fall through */
       case X86_VENDOR_INTEL:
        get_mtrr = intel_get_mtrr;
        set_mtrr_up = intel_set_mtrr_up;
@@ -1666,10 +1692,6 @@ __initfunc(static void mtrr_setup (void))
        set_mtrr_up = cyrix_set_arr_up;
        get_free_region = cyrix_get_free_region;
        break;
-      case X86_VENDOR_AMD:
-       get_mtrr = amd_get_mtrr;
-       set_mtrr_up = amd_set_mtrr_up;
-       break;
      case X86_VENDOR_CENTAUR:
         get_mtrr = centaur_get_mcr;
         set_mtrr_up = centaur_set_mcr_up;
@@ -1688,6 +1710,8 @@ __initfunc(void mtrr_init_boot_cpu (void))
     mtrr_setup ();
     switch (boot_cpu_data.x86_vendor)
     {
+      case X86_VENDOR_AMD:
+       if (boot_cpu_data.x86 < 6) break; /* pre-Athlon CPUs */
       case X86_VENDOR_INTEL:
        get_mtrr_state (&smp_mtrr_state);
        break;
@@ -1724,6 +1748,9 @@ __initfunc(void mtrr_init_secondary_cpu (void))
     if ( !(boot_cpu_data.x86_capability & X86_FEATURE_MTRR) ) return;
     switch (boot_cpu_data.x86_vendor)
     {
+      case X86_VENDOR_AMD:
+       /* Just for robustness: pre-Athlon CPUs cannot do SMP. */
+       if (boot_cpu_data.x86 < 6) break;
       case X86_VENDOR_INTEL:
        intel_mtrr_init_secondary_cpu ();
        break;
@@ -1749,6 +1776,8 @@ __initfunc(int mtrr_init(void))
 #  ifdef __SMP__
     switch (boot_cpu_data.x86_vendor)
     {
+      case X86_VENDOR_AMD:
+       if (boot_cpu_data.x86 < 6) break; /* pre-Athlon CPUs */
       case X86_VENDOR_INTEL:
        finalize_mtrr_state (&smp_mtrr_state);
        mtrr_state_warn (smp_changes_mask);
index 2f0206b734f3f0cb4578f53e238d8549d3161220..2948069e30b1bc46bd7ec054adb07ab36404a107 100644 (file)
@@ -392,7 +392,6 @@ __initfunc(void mem_init(unsigned long start_mem, unsigned long end_mem))
        int datapages = 0;
        int initpages = 0;
        unsigned long tmp;
-       unsigned long endbase;
 
        end_mem &= PAGE_MASK;
        high_memory = (void *) end_mem;
@@ -420,10 +419,8 @@ __initfunc(void mem_init(unsigned long start_mem, unsigned long end_mem))
         * IBM messed up *AGAIN* in their thinkpad: 0xA0000 -> 0x9F000.
         * They seem to have done something stupid with the floppy
         * controller as well..
-        * The amount of available base memory is in WORD 40:13.
         */
-       endbase = PAGE_OFFSET + ((*(unsigned short *)__va(0x413) * 1024) & PAGE_MASK);
-       while (start_low_mem < endbase) {
+       while (start_low_mem < 0x9f000+PAGE_OFFSET) {
                clear_bit(PG_reserved, &mem_map[MAP_NR(start_low_mem)].flags);
                start_low_mem += PAGE_SIZE;
        }
index bf10d2847f3816f353b73e47f27d3fa7a3608bc7..5caa558b949825571f625149568af686d2d25edf 100644 (file)
@@ -1,4 +1,4 @@
-/* $Id: ioctl32.c,v 1.62.2.2 1999/08/13 18:28:25 davem Exp $
+/* $Id: ioctl32.c,v 1.62.2.1 1999/06/09 04:53:03 davem Exp $
  * ioctl32.c: Conversion between 32bit and 64bit native ioctls.
  *
  * Copyright (C) 1997  Jakub Jelinek  (jj@sunsite.mff.cuni.cz)
@@ -17,7 +17,7 @@
 #include <linux/if.h>
 #include <linux/malloc.h>
 #include <linux/hdreg.h>
-#include <linux/raid/md.h>
+#include <linux/md.h>
 #include <linux/kd.h>
 #include <linux/route.h>
 #include <linux/skbuff.h>
@@ -1992,24 +1992,11 @@ asmlinkage int sys32_ioctl(unsigned int fd, unsigned int cmd, unsigned long arg)
        case BLKRASET:
        
        /* 0x09 */
-       case RAID_VERSION:
-       case GET_ARRAY_INFO:
-       case GET_DISK_INFO:
-       case CLEAR_ARRAY:
-       case ADD_NEW_DISK:
-       case HOT_REMOVE_DISK:
-       case SET_ARRAY_INFO:
-       case SET_DISK_INFO:
-       case WRITE_RAID_INFO:
-       case UNPROTECT_ARRAY:
-       case PROTECT_ARRAY:
-       case HOT_ADD_DISK:
-       case RUN_ARRAY:
-       case START_ARRAY:
-       case STOP_ARRAY:
-       case STOP_ARRAY_RO:
-       case RESTART_ARRAY_RW:
-
+       case REGISTER_DEV:
+       case REGISTER_DEV_NEW:
+       case START_MD:
+       case STOP_MD:
+       
        /* Big K */
        case PIO_FONT:
        case GIO_FONT:
index 55d85c76cad0444c634352fa52d78ee14d244c2c..38d7a475ccf4db905a44d9dd91377b7aa6b00f60 100644 (file)
@@ -101,16 +101,13 @@ if [ "$CONFIG_NET" = "y" ]; then
 fi
 bool 'Multiple devices driver support' CONFIG_BLK_DEV_MD
 if [ "$CONFIG_BLK_DEV_MD" = "y" ]; then
-  bool 'Autodetect RAID partitions' CONFIG_AUTODETECT_RAID
   tristate '   Linear (append) mode' CONFIG_MD_LINEAR
   tristate '   RAID-0 (striping) mode' CONFIG_MD_STRIPED
   tristate '   RAID-1 (mirroring) mode' CONFIG_MD_MIRRORING
   tristate '   RAID-4/RAID-5 mode' CONFIG_MD_RAID5
-  tristate '   Translucent mode' CONFIG_MD_TRANSLUCENT
-  tristate '   Logical Volume Manager support' CONFIG_MD_LVM
-  if [ "$CONFIG_MD_LINEAR" = "y" -o "$CONFIG_MD_STRIPED" = "y" ]; then
-    bool '      Boot support (linear, striped)' CONFIG_MD_BOOT
-  fi
+fi
+if [ "$CONFIG_MD_LINEAR" = "y" -o "$CONFIG_MD_STRIPED" = "y" ]; then
+  bool '      Boot support (linear, striped)' CONFIG_MD_BOOT
 fi
 tristate 'RAM disk support' CONFIG_BLK_DEV_RAM
 if [ "$CONFIG_BLK_DEV_RAM" = "y" ]; then
index acef020c3d202d65f566b3ca7ca9036ef01338bc..5ba9cdbe9e1413300b0f42df9d76e02c746761e3 100644 (file)
@@ -19,8 +19,8 @@
 */
 
 
-#define DAC960_DriverVersion                   "2.2.2"
-#define DAC960_DriverDate                      "3 July 1999"
+#define DAC960_DriverVersion                   "2.2.4"
+#define DAC960_DriverDate                      "23 August 1999"
 
 
 #include <linux/version.h>
@@ -478,6 +478,7 @@ static void DAC960_DetectControllers(DAC960_ControllerType_T ControllerType)
       unsigned long BaseAddress0 = PCI_Device->base_address[0];
       unsigned long BaseAddress1 = PCI_Device->base_address[1];
       unsigned short SubsystemVendorID, SubsystemDeviceID;
+      int CommandIdentifier;
       pci_read_config_word(PCI_Device, PCI_SUBSYSTEM_VENDOR_ID,
                           &SubsystemVendorID);
       pci_read_config_word(PCI_Device, PCI_SUBSYSTEM_ID,
@@ -584,9 +585,15 @@ static void DAC960_DetectControllers(DAC960_ControllerType_T ControllerType)
          break;
        }
       DAC960_ActiveControllerCount++;
-      Controller->Commands[0].Controller = Controller;
-      Controller->Commands[0].Next = NULL;
-      Controller->FreeCommands = &Controller->Commands[0];
+      for (CommandIdentifier = 0;
+          CommandIdentifier < DAC960_MaxChannels;
+          CommandIdentifier++)
+       {
+         Controller->Commands[CommandIdentifier].Controller = Controller;
+         Controller->Commands[CommandIdentifier].Next =
+           Controller->FreeCommands;
+         Controller->FreeCommands = &Controller->Commands[CommandIdentifier];
+       }
       continue;
     Failure:
       if (IO_Address == 0)
@@ -754,14 +761,13 @@ static boolean DAC960_ReadControllerConfiguration(DAC960_Controller_T
 
 
 /*
-  DAC960_ReportControllerConfiguration reports the configuration of
+  DAC960_ReportControllerConfiguration reports the Configuration Information of
   Controller.
 */
 
 static boolean DAC960_ReportControllerConfiguration(DAC960_Controller_T
                                                    *Controller)
 {
-  int LogicalDriveNumber, Channel, TargetID;
   DAC960_Info("Configuring Mylex %s PCI RAID Controller\n",
              Controller, Controller->ModelName);
   DAC960_Info("  Firmware Version: %s, Channels: %d, Memory Size: %dMB\n",
@@ -793,40 +799,199 @@ static boolean DAC960_ReportControllerConfiguration(DAC960_Controller_T
              Controller->GeometryTranslationSectors);
   if (Controller->SAFTE_EnclosureManagementEnabled)
     DAC960_Info("  SAF-TE Enclosure Management Enabled\n", Controller);
+  return true;
+}
+
+
+/*
+  DAC960_ReadDeviceConfiguration reads the Device Configuration Information by
+  requesting the SCSI Inquiry and SCSI Inquiry Unit Serial Number information
+  for each device connected to Controller.
+*/
+
+static boolean DAC960_ReadDeviceConfiguration(DAC960_Controller_T *Controller)
+{
+  DAC960_DCDB_T DCDBs[DAC960_MaxChannels], *DCDB;
+  Semaphore_T Semaphores[DAC960_MaxChannels], *Semaphore;
+  unsigned long ProcessorFlags;
+  int Channel, TargetID;
+  for (TargetID = 0; TargetID < DAC960_MaxTargets; TargetID++)
+    {
+      for (Channel = 0; Channel < Controller->Channels; Channel++)
+       {
+         DAC960_Command_T *Command = &Controller->Commands[Channel];
+         DAC960_SCSI_Inquiry_T *InquiryStandardData =
+           &Controller->InquiryStandardData[Channel][TargetID];
+         InquiryStandardData->PeripheralDeviceType = 0x1F;
+         Semaphore = &Semaphores[Channel];
+         *Semaphore = MUTEX_LOCKED;
+         DCDB = &DCDBs[Channel];
+         DAC960_ClearCommand(Command);
+         Command->CommandType = DAC960_ImmediateCommand;
+         Command->Semaphore = Semaphore;
+         Command->CommandMailbox.Type3.CommandOpcode = DAC960_DCDB;
+         Command->CommandMailbox.Type3.BusAddress = Virtual_to_Bus(DCDB);
+         DCDB->Channel = Channel;
+         DCDB->TargetID = TargetID;
+         DCDB->Direction = DAC960_DCDB_DataTransferDeviceToSystem;
+         DCDB->EarlyStatus = false;
+         DCDB->Timeout = DAC960_DCDB_Timeout_10_seconds;
+         DCDB->NoAutomaticRequestSense = false;
+         DCDB->DisconnectPermitted = true;
+         DCDB->TransferLength = sizeof(DAC960_SCSI_Inquiry_T);
+         DCDB->BusAddress = Virtual_to_Bus(InquiryStandardData);
+         DCDB->CDBLength = 6;
+         DCDB->TransferLengthHigh4 = 0;
+         DCDB->SenseLength = sizeof(DCDB->SenseData);
+         DCDB->CDB[0] = 0x12; /* INQUIRY */
+         DCDB->CDB[1] = 0; /* EVPD = 0 */
+         DCDB->CDB[2] = 0; /* Page Code */
+         DCDB->CDB[3] = 0; /* Reserved */
+         DCDB->CDB[4] = sizeof(DAC960_SCSI_Inquiry_T);
+         DCDB->CDB[5] = 0; /* Control */
+         DAC960_AcquireControllerLock(Controller, &ProcessorFlags);
+         DAC960_QueueCommand(Command);
+         DAC960_ReleaseControllerLock(Controller, &ProcessorFlags);
+       }
+      for (Channel = 0; Channel < Controller->Channels; Channel++)
+       {
+         DAC960_Command_T *Command = &Controller->Commands[Channel];
+         DAC960_SCSI_Inquiry_UnitSerialNumber_T *InquiryUnitSerialNumber =
+           &Controller->InquiryUnitSerialNumber[Channel][TargetID];
+         InquiryUnitSerialNumber->PeripheralDeviceType = 0x1F;
+         Semaphore = &Semaphores[Channel];
+         down(Semaphore);
+         if (Command->CommandStatus != DAC960_NormalCompletion) continue;
+         Command->Semaphore = Semaphore;
+         DCDB = &DCDBs[Channel];
+         DCDB->TransferLength = sizeof(DAC960_SCSI_Inquiry_UnitSerialNumber_T);
+         DCDB->BusAddress = Virtual_to_Bus(InquiryUnitSerialNumber);
+         DCDB->SenseLength = sizeof(DCDB->SenseData);
+         DCDB->CDB[0] = 0x12; /* INQUIRY */
+         DCDB->CDB[1] = 1; /* EVPD = 1 */
+         DCDB->CDB[2] = 0x80; /* Page Code */
+         DCDB->CDB[3] = 0; /* Reserved */
+         DCDB->CDB[4] = sizeof(DAC960_SCSI_Inquiry_UnitSerialNumber_T);
+         DCDB->CDB[5] = 0; /* Control */
+         DAC960_AcquireControllerLock(Controller, &ProcessorFlags);
+         DAC960_QueueCommand(Command);
+         DAC960_ReleaseControllerLock(Controller, &ProcessorFlags);
+         down(Semaphore);
+       }
+    }
+  return true; 
+}
+
+
+/*
+  DAC960_ReportDeviceConfiguration reports the Device Configuration Information
+  of Controller.
+*/
+
+static boolean DAC960_ReportDeviceConfiguration(DAC960_Controller_T *Controller)
+{
+  int LogicalDriveNumber, Channel, TargetID;
   DAC960_Info("  Physical Devices:\n", Controller);
   for (Channel = 0; Channel < Controller->Channels; Channel++)
     for (TargetID = 0; TargetID < DAC960_MaxTargets; TargetID++)
       {
+       DAC960_SCSI_Inquiry_T *InquiryStandardData =
+         &Controller->InquiryStandardData[Channel][TargetID];
+       DAC960_SCSI_Inquiry_UnitSerialNumber_T *InquiryUnitSerialNumber =
+         &Controller->InquiryUnitSerialNumber[Channel][TargetID];
        DAC960_DeviceState_T *DeviceState =
          &Controller->DeviceState[Controller->DeviceStateIndex]
                                  [Channel][TargetID];
-       if (!DeviceState->Present) continue;
-       switch (DeviceState->DeviceType)
+       DAC960_ErrorTable_T *ErrorTable =
+         &Controller->ErrorTable[Controller->ErrorTableIndex];
+       DAC960_ErrorTableEntry_T *ErrorEntry =
+         &ErrorTable->ErrorTableEntries[Channel][TargetID];
+       char Vendor[1+sizeof(InquiryStandardData->VendorIdentification)];
+       char Model[1+sizeof(InquiryStandardData->ProductIdentification)];
+       char Revision[1+sizeof(InquiryStandardData->ProductRevisionLevel)];
+       char SerialNumber[1+sizeof(InquiryUnitSerialNumber
+                                  ->ProductSerialNumber)];
+       int i;
+       if (InquiryStandardData->PeripheralDeviceType == 0x1F) continue;
+       for (i = 0; i < sizeof(Vendor)-1; i++)
+         {
+           unsigned char VendorCharacter =
+             InquiryStandardData->VendorIdentification[i];
+           Vendor[i] = (VendorCharacter >= ' ' && VendorCharacter <= '~'
+                        ? VendorCharacter : ' ');
+         }
+       Vendor[sizeof(Vendor)-1] = '\0';
+       for (i = 0; i < sizeof(Model)-1; i++)
+         {
+           unsigned char ModelCharacter =
+             InquiryStandardData->ProductIdentification[i];
+           Model[i] = (ModelCharacter >= ' ' && ModelCharacter <= '~'
+                       ? ModelCharacter : ' ');
+         }
+       Model[sizeof(Model)-1] = '\0';
+       for (i = 0; i < sizeof(Revision)-1; i++)
+         {
+           unsigned char RevisionCharacter =
+             InquiryStandardData->ProductRevisionLevel[i];
+           Revision[i] = (RevisionCharacter >= ' ' && RevisionCharacter <= '~'
+                          ? RevisionCharacter : ' ');
+         }
+       Revision[sizeof(Revision)-1] = '\0';
+       DAC960_Info("    %d:%d%s Vendor: %s  Model: %s  Revision: %s\n",
+                   Controller, Channel, TargetID, (TargetID < 10 ? " " : ""),
+                   Vendor, Model, Revision);
+       if (InquiryUnitSerialNumber->PeripheralDeviceType != 0x1F)
          {
-         case DAC960_OtherType:
-           DAC960_Info("    %d:%d - Other\n", Controller, Channel, TargetID);
-           break;
-         case DAC960_DiskType:
-           DAC960_Info("    %d:%d - Disk: %s, %d blocks\n", Controller,
-                       Channel, TargetID,
-                       (DeviceState->DeviceState == DAC960_Device_Dead
-                        ? "Dead"
-                        : DeviceState->DeviceState == DAC960_Device_WriteOnly
+           int SerialNumberLength = InquiryUnitSerialNumber->PageLength;
+           if (SerialNumberLength >
+               sizeof(InquiryUnitSerialNumber->ProductSerialNumber))
+             SerialNumberLength =
+               sizeof(InquiryUnitSerialNumber->ProductSerialNumber);
+           for (i = 0; i < SerialNumberLength; i++)
+             {
+               unsigned char SerialNumberCharacter =
+                 InquiryUnitSerialNumber->ProductSerialNumber[i];
+               SerialNumber[i] =
+                 (SerialNumberCharacter >= ' ' && SerialNumberCharacter <= '~'
+                  ? SerialNumberCharacter : ' ');
+             }
+           SerialNumber[SerialNumberLength] = '\0';
+           DAC960_Info("         Serial Number: %s\n",
+                       Controller, SerialNumber);
+         }
+       if (DeviceState->Present && DeviceState->DeviceType == DAC960_DiskType)
+         {
+           if (Controller->DeviceResetCount[Channel][TargetID] > 0)
+             DAC960_Info("         Disk Status: %s, %d blocks, %d resets\n",
+                         Controller,
+                         (DeviceState->DeviceState == DAC960_Device_Dead
+                          ? "Dead"
+                          : DeviceState->DeviceState == DAC960_Device_WriteOnly
+                          ? "Write-Only"
+                          : DeviceState->DeviceState == DAC960_Device_Online
+                          ? "Online" : "Standby"),
+                         DeviceState->DiskSize,
+                         Controller->DeviceResetCount[Channel][TargetID]);
+           else
+             DAC960_Info("         Disk Status: %s, %d blocks\n", Controller,
+                         (DeviceState->DeviceState == DAC960_Device_Dead
+                          ? "Dead"
+                          : DeviceState->DeviceState == DAC960_Device_WriteOnly
                           ? "Write-Only"
                           : DeviceState->DeviceState == DAC960_Device_Online
-                            ? "Online" : "Standby"),
-                       DeviceState->DiskSize);
-           break;
-         case DAC960_SequentialType:
-           DAC960_Info("    %d:%d - Sequential\n", Controller,
-                       Channel, TargetID);
-           break;
-         case DAC960_CDROM_or_WORM_Type:
-           DAC960_Info("    %d:%d - CD-ROM or WORM\n", Controller,
-                       Channel, TargetID);
-           break;
+                          ? "Online" : "Standby"),
+                         DeviceState->DiskSize);
          }
-
+       if (ErrorEntry->ParityErrorCount > 0 ||
+           ErrorEntry->SoftErrorCount > 0 ||
+           ErrorEntry->HardErrorCount > 0 ||
+           ErrorEntry->MiscErrorCount > 0)
+         DAC960_Info("         Errors - Parity: %d, Soft: %d, "
+                     "Hard: %d, Misc: %d\n", Controller,
+                     ErrorEntry->ParityErrorCount,
+                     ErrorEntry->SoftErrorCount,
+                     ErrorEntry->HardErrorCount,
+                     ErrorEntry->MiscErrorCount);
       }
   DAC960_Info("  Logical Drives:\n", Controller);
   for (LogicalDriveNumber = 0;
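The hunk above sanitizes each INQUIRY string field (Vendor, Model, Revision, Serial Number) with the same loop: copy a byte only if it is printable ASCII, otherwise substitute a space, then NUL-terminate. A minimal userspace sketch of that loop (the helper name is ours, not the driver's):

```c
#include <stddef.h>

/* Copy len bytes from src to dst, replacing anything outside the
   printable ASCII range ' '..'~' with a space, and NUL-terminate.
   dst must have room for len+1 bytes. */
static void sanitize_inquiry_field(char *dst, const unsigned char *src,
                                   size_t len)
{
    size_t i;
    for (i = 0; i < len; i++)
        dst[i] = (src[i] >= ' ' && src[i] <= '~') ? src[i] : ' ';
    dst[len] = '\0';
}
```

This matters because INQUIRY fields are fixed-width, space-padded, and not guaranteed to contain only printable bytes, so printing them raw into the kernel log could emit control characters.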
@@ -982,6 +1147,8 @@ static void DAC960_InitializeController(DAC960_Controller_T *Controller)
 {
   if (DAC960_ReadControllerConfiguration(Controller) &&
       DAC960_ReportControllerConfiguration(Controller) &&
+      DAC960_ReadDeviceConfiguration(Controller) &&
+      DAC960_ReportDeviceConfiguration(Controller) &&
       DAC960_RegisterBlockDevice(Controller))
     {
       /*
@@ -1625,7 +1792,7 @@ static void DAC960_ProcessCompletedCommand(DAC960_Command_T *Command)
              Controller->NeedErrorTableInformation = true;
              Controller->NeedDeviceStateInformation = true;
              Controller->DeviceStateChannel = 0;
-             Controller->DeviceStateTargetID = 0;
+             Controller->DeviceStateTargetID = -1;
              Controller->SecondaryMonitoringTime = jiffies;
            }
          if (NewEnquiry->RebuildFlag == DAC960_StandbyRebuildInProgress ||
@@ -1705,13 +1872,17 @@ static void DAC960_ProcessCompletedCommand(DAC960_Command_T *Command)
                                EventLogEntry->TargetID,
                                DAC960_EventMessages[
                                  AdditionalSenseCodeQualifier]);
+             else if (SenseKey == 6 && AdditionalSenseCode == 0x29)
+               {
+                 if (Controller->MonitoringTimerCount > 0)
+                   Controller->DeviceResetCount[EventLogEntry->Channel]
+                                               [EventLogEntry->TargetID]++;
+               }
              else if (!(SenseKey == 0 ||
                         (SenseKey == 2 &&
                          AdditionalSenseCode == 0x04 &&
                          (AdditionalSenseCodeQualifier == 0x01 ||
-                          AdditionalSenseCodeQualifier == 0x02)) ||
-                        (SenseKey == 6 && AdditionalSenseCode == 0x29 &&
-                         Controller->MonitoringTimerCount == 0)))
+                          AdditionalSenseCodeQualifier == 0x02))))
                {
                  DAC960_Critical("Physical Drive %d:%d Error Log: "
                                  "Sense Key = %d, ASC = %02X, ASCQ = %02X\n",
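The rewritten condition in this hunk splits out the reset notification (Sense Key 6, ASC 0x29, now counted in DeviceResetCount once monitoring has started) and then suppresses logging only for benign sense data: "no sense", or the two expected NOT READY variants. A sketch of that remaining filter as a predicate (hypothetical helper name):

```c
/* Return nonzero for sense data the driver deliberately does not log:
   Sense Key 0 (NO SENSE), or Sense Key 2 (NOT READY) with
   ASC 0x04 and ASCQ 0x01 or 0x02 (becoming ready / start required). */
static int sense_is_benign(int sense_key, int asc, int ascq)
{
    if (sense_key == 0)
        return 1;
    if (sense_key == 2 && asc == 0x04 &&
        (ascq == 0x01 || ascq == 0x02))
        return 1;
    return 0;
}
```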
@@ -1793,10 +1964,11 @@ static void DAC960_ProcessCompletedCommand(DAC960_Command_T *Command)
                               : NewDeviceState->DeviceState
                                 == DAC960_Device_Online
                                 ? "ONLINE" : "STANDBY"));
-         if (++Controller->DeviceStateTargetID == DAC960_MaxTargets)
+         if (OldDeviceState->DeviceState == DAC960_Device_Dead &&
+             NewDeviceState->DeviceState != DAC960_Device_Dead)
            {
-             Controller->DeviceStateChannel++;
-             Controller->DeviceStateTargetID = 0;
+             Controller->NeedDeviceInquiryInformation = true;
+             Controller->NeedDeviceSerialNumberInformation = true;
            }
        }
       else if (CommandOpcode == DAC960_GetLogicalDriveInformation)
@@ -1948,6 +2120,76 @@ static void DAC960_ProcessCompletedCommand(DAC960_Command_T *Command)
        }
       if (Controller->NeedDeviceStateInformation)
        {
+         if (Controller->NeedDeviceInquiryInformation)
+           {
+             DAC960_DCDB_T *DCDB = &Controller->MonitoringDCDB;
+             DAC960_SCSI_Inquiry_T *InquiryStandardData =
+               &Controller->InquiryStandardData
+                              [Controller->DeviceStateChannel]
+                              [Controller->DeviceStateTargetID];
+             InquiryStandardData->PeripheralDeviceType = 0x1F;
+             Command->CommandMailbox.Type3.CommandOpcode = DAC960_DCDB;
+             Command->CommandMailbox.Type3.BusAddress = Virtual_to_Bus(DCDB);
+             DCDB->Channel = Controller->DeviceStateChannel;
+             DCDB->TargetID = Controller->DeviceStateTargetID;
+             DCDB->Direction = DAC960_DCDB_DataTransferDeviceToSystem;
+             DCDB->EarlyStatus = false;
+             DCDB->Timeout = DAC960_DCDB_Timeout_10_seconds;
+             DCDB->NoAutomaticRequestSense = false;
+             DCDB->DisconnectPermitted = true;
+             DCDB->TransferLength = sizeof(DAC960_SCSI_Inquiry_T);
+             DCDB->BusAddress = Virtual_to_Bus(InquiryStandardData);
+             DCDB->CDBLength = 6;
+             DCDB->TransferLengthHigh4 = 0;
+             DCDB->SenseLength = sizeof(DCDB->SenseData);
+             DCDB->CDB[0] = 0x12; /* INQUIRY */
+             DCDB->CDB[1] = 0; /* EVPD = 0 */
+             DCDB->CDB[2] = 0; /* Page Code */
+             DCDB->CDB[3] = 0; /* Reserved */
+             DCDB->CDB[4] = sizeof(DAC960_SCSI_Inquiry_T);
+             DCDB->CDB[5] = 0; /* Control */
+             DAC960_QueueCommand(Command);
+             Controller->NeedDeviceInquiryInformation = false;
+             return;
+           }
+         if (Controller->NeedDeviceSerialNumberInformation)
+           {
+             DAC960_DCDB_T *DCDB = &Controller->MonitoringDCDB;
+             DAC960_SCSI_Inquiry_UnitSerialNumber_T *InquiryUnitSerialNumber =
+               &Controller->InquiryUnitSerialNumber
+                              [Controller->DeviceStateChannel]
+                              [Controller->DeviceStateTargetID];
+             InquiryUnitSerialNumber->PeripheralDeviceType = 0x1F;
+             Command->CommandMailbox.Type3.CommandOpcode = DAC960_DCDB;
+             Command->CommandMailbox.Type3.BusAddress = Virtual_to_Bus(DCDB);
+             DCDB->Channel = Controller->DeviceStateChannel;
+             DCDB->TargetID = Controller->DeviceStateTargetID;
+             DCDB->Direction = DAC960_DCDB_DataTransferDeviceToSystem;
+             DCDB->EarlyStatus = false;
+             DCDB->Timeout = DAC960_DCDB_Timeout_10_seconds;
+             DCDB->NoAutomaticRequestSense = false;
+             DCDB->DisconnectPermitted = true;
+             DCDB->TransferLength =
+               sizeof(DAC960_SCSI_Inquiry_UnitSerialNumber_T);
+             DCDB->BusAddress = Virtual_to_Bus(InquiryUnitSerialNumber);
+             DCDB->CDBLength = 6;
+             DCDB->TransferLengthHigh4 = 0;
+             DCDB->SenseLength = sizeof(DCDB->SenseData);
+             DCDB->CDB[0] = 0x12; /* INQUIRY */
+             DCDB->CDB[1] = 1; /* EVPD = 1 */
+             DCDB->CDB[2] = 0x80; /* Page Code */
+             DCDB->CDB[3] = 0; /* Reserved */
+             DCDB->CDB[4] = sizeof(DAC960_SCSI_Inquiry_UnitSerialNumber_T);
+             DCDB->CDB[5] = 0; /* Control */
+             DAC960_QueueCommand(Command);
+             Controller->NeedDeviceSerialNumberInformation = false;
+             return;
+           }
+         if (++Controller->DeviceStateTargetID == DAC960_MaxTargets)
+           {
+             Controller->DeviceStateChannel++;
+             Controller->DeviceStateTargetID = 0;
+           }
          while (Controller->DeviceStateChannel < Controller->Channels)
            {
              DAC960_DeviceState_T *OldDeviceState =
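The two DCDB setups above differ only in the 6-byte INQUIRY CDB they build: EVPD=0 for the standard data, EVPD=1 with page code 0x80 for the Unit Serial Number VPD page. A sketch of the CDB layout (helper name is ours):

```c
/* Fill a 6-byte SCSI INQUIRY CDB.  With evpd == 0 the page code
   must be 0 and the standard data is returned; with evpd != 0 the
   named vital product data page (0x80 = Unit Serial Number) is
   returned instead. */
static void build_inquiry_cdb(unsigned char cdb[6], int evpd,
                              unsigned char page_code,
                              unsigned char alloc_len)
{
    cdb[0] = 0x12;              /* INQUIRY opcode */
    cdb[1] = evpd ? 1 : 0;      /* EVPD bit */
    cdb[2] = evpd ? page_code : 0;
    cdb[3] = 0;                 /* reserved */
    cdb[4] = alloc_len;         /* allocation length */
    cdb[5] = 0;                 /* control */
}
```

Note how each setup routine first stores 0x1F into the reply buffer's PeripheralDeviceType; since 0x1F means "no device present", a command that never completes leaves the slot marked absent, which is exactly what the reporting loop checks.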
@@ -3078,9 +3320,8 @@ static boolean DAC960_ExecuteUserCommand(DAC960_Controller_T *Controller,
   DAC960_ProcReadStatus implements reading /proc/rd/status.
 */
 
-static ssize_t DAC960_ProcReadStatus(char *Page, char **Start,
-                                    off_t Offset, int Count,
-                                    int *EOF, void *Data)
+static int DAC960_ProcReadStatus(char *Page, char **Start, off_t Offset,
+                                int Count, int *EOF, void *Data)
 {
   char *StatusMessage = "OK\n";
   int ControllerNumber, BytesAvailable;
@@ -3107,6 +3348,7 @@ static ssize_t DAC960_ProcReadStatus(char *Page, char **Start,
       *EOF = true;
     }
   if (Count <= 0) return 0;
+  *Start = Page;
   memcpy(Page, &StatusMessage[Offset], Count);
   return Count;
 }
@@ -3116,9 +3358,8 @@ static ssize_t DAC960_ProcReadStatus(char *Page, char **Start,
   DAC960_ProcReadInitialStatus implements reading /proc/rd/cN/initial_status.
 */
 
-static ssize_t DAC960_ProcReadInitialStatus(char *Page, char **Start,
-                                           off_t Offset, int Count,
-                                           int *EOF, void *Data)
+static int DAC960_ProcReadInitialStatus(char *Page, char **Start, off_t Offset,
+                                       int Count, int *EOF, void *Data)
 {
   DAC960_Controller_T *Controller = (DAC960_Controller_T *) Data;
   int BytesAvailable = Controller->InitialStatusLength - Offset;
@@ -3128,6 +3369,7 @@ static ssize_t DAC960_ProcReadInitialStatus(char *Page, char **Start,
       *EOF = true;
     }
   if (Count <= 0) return 0;
+  *Start = Page;
   memcpy(Page, &Controller->InitialStatusBuffer[Offset], Count);
   return Count;
 }
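Each read_proc handler in these hunks uses the same offset/count clamp before the memcpy; the added `*Start = Page` line then tells the 2.2 procfs layer that the returned bytes begin at the start of the page rather than at `Page + Offset`, which is what makes reads at nonzero offsets work. A userspace sketch of the clamp (hypothetical helper):

```c
/* Given a buffer of total_len bytes and a read at (offset, count),
   return how many bytes to copy, setting *eof when the read reaches
   the end of the buffer. */
static int clamp_proc_read(int total_len, long offset, int count, int *eof)
{
    int avail = total_len - (int)offset;
    if (count >= avail) {
        count = avail;
        *eof = 1;
    }
    if (count <= 0)
        return 0;
    return count;
}
```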
@@ -3137,29 +3379,35 @@ static ssize_t DAC960_ProcReadInitialStatus(char *Page, char **Start,
   DAC960_ProcReadCurrentStatus implements reading /proc/rd/cN/current_status.
 */
 
-static ssize_t DAC960_ProcReadCurrentStatus(char *Page, char **Start,
-                                           off_t Offset, int Count,
-                                           int *EOF, void *Data)
+static int DAC960_ProcReadCurrentStatus(char *Page, char **Start, off_t Offset,
+                                       int Count, int *EOF, void *Data)
 {
   DAC960_Controller_T *Controller = (DAC960_Controller_T *) Data;
   int BytesAvailable;
-  Controller->CurrentStatusLength = 0;
-  DAC960_AnnounceDriver(Controller);
-  DAC960_ReportControllerConfiguration(Controller);
-  Controller->CurrentStatusBuffer[Controller->CurrentStatusLength++] = ' ';
-  Controller->CurrentStatusBuffer[Controller->CurrentStatusLength++] = ' ';
-  if (Controller->RebuildProgressLength > 0)
+  if (jiffies != Controller->LastCurrentStatusTime)
     {
-      strcpy(&Controller->CurrentStatusBuffer[Controller->CurrentStatusLength],
-            Controller->RebuildProgressBuffer);
-      Controller->CurrentStatusLength += Controller->RebuildProgressLength;
-    }
-  else
-    {
-      char *StatusMessage = "No Rebuild or Consistency Check in Progress\n";
-      strcpy(&Controller->CurrentStatusBuffer[Controller->CurrentStatusLength],
-            StatusMessage);
-      Controller->CurrentStatusLength += strlen(StatusMessage);
+      Controller->CurrentStatusLength = 0;
+      DAC960_AnnounceDriver(Controller);
+      DAC960_ReportControllerConfiguration(Controller);
+      DAC960_ReportDeviceConfiguration(Controller);
+      Controller->CurrentStatusBuffer[Controller->CurrentStatusLength++] = ' ';
+      Controller->CurrentStatusBuffer[Controller->CurrentStatusLength++] = ' ';
+      if (Controller->RebuildProgressLength > 0)
+       {
+         strcpy(&Controller->CurrentStatusBuffer
+                             [Controller->CurrentStatusLength],
+                Controller->RebuildProgressBuffer);
+         Controller->CurrentStatusLength += Controller->RebuildProgressLength;
+       }
+      else
+       {
+         char *StatusMessage = "No Rebuild or Consistency Check in Progress\n";
+         strcpy(&Controller->CurrentStatusBuffer
+                             [Controller->CurrentStatusLength],
+                StatusMessage);
+         Controller->CurrentStatusLength += strlen(StatusMessage);
+       }
+      Controller->LastCurrentStatusTime = jiffies;
     }
   BytesAvailable = Controller->CurrentStatusLength - Offset;
   if (Count >= BytesAvailable)
@@ -3168,6 +3416,7 @@ static ssize_t DAC960_ProcReadCurrentStatus(char *Page, char **Start,
       *EOF = true;
     }
   if (Count <= 0) return 0;
+  *Start = Page;
   memcpy(Page, &Controller->CurrentStatusBuffer[Offset], Count);
   return Count;
 }
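The `LastCurrentStatusTime` check added above regenerates the status text at most once per clock tick, so a reader issuing several read() calls against /proc/rd/cN/current_status within one tick sees a consistent snapshot instead of a buffer rewritten between reads. The gating logic, sketched in userspace (`now` stands in for jiffies; the helper name is ours):

```c
/* Return 1 (and record the time) if the cached text is stale and the
   caller should rebuild it; return 0 to serve the cached copy. */
static int status_is_stale(unsigned long *last_time, unsigned long now)
{
    if (now == *last_time)
        return 0;
    *last_time = now;
    return 1;
}
```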
@@ -3177,9 +3426,8 @@ static ssize_t DAC960_ProcReadCurrentStatus(char *Page, char **Start,
   DAC960_ProcReadUserCommand implements reading /proc/rd/cN/user_command.
 */
 
-static ssize_t DAC960_ProcReadUserCommand(char *Page, char **Start,
-                                         off_t Offset, int Count,
-                                         int *EOF, void *Data)
+static int DAC960_ProcReadUserCommand(char *Page, char **Start, off_t Offset,
+                                     int Count, int *EOF, void *Data)
 {
   DAC960_Controller_T *Controller = (DAC960_Controller_T *) Data;
   int BytesAvailable = Controller->UserStatusLength - Offset;
@@ -3189,6 +3437,7 @@ static ssize_t DAC960_ProcReadUserCommand(char *Page, char **Start,
       *EOF = true;
     }
   if (Count <= 0) return 0;
+  *Start = Page;
   memcpy(Page, &Controller->UserStatusBuffer[Offset], Count);
   return Count;
 }
@@ -3198,8 +3447,8 @@ static ssize_t DAC960_ProcReadUserCommand(char *Page, char **Start,
   DAC960_ProcWriteUserCommand implements writing /proc/rd/cN/user_command.
 */
 
-static ssize_t DAC960_ProcWriteUserCommand(File_T *File, const char *Buffer,
-                                          unsigned long Count, void *Data)
+static int DAC960_ProcWriteUserCommand(File_T *File, const char *Buffer,
+                                      unsigned long Count, void *Data)
 {
   DAC960_Controller_T *Controller = (DAC960_Controller_T *) Data;
   char CommandBuffer[80];
index 5bea6a759278f2b71b8b1224355de0e2117a1000..a96757db61f2a795b48478dd555f417993c0206b 100644 (file)
@@ -671,6 +671,57 @@ typedef struct DAC960_DCDB
 DAC960_DCDB_T;
 
 
+/*
+  Define the SCSI INQUIRY Standard Data reply structure.
+*/
+
+typedef struct DAC960_SCSI_Inquiry
+{
+  unsigned char PeripheralDeviceType:5;                        /* Byte 0 Bits 0-4 */
+  unsigned char PeripheralQualifier:3;                 /* Byte 0 Bits 5-7 */
+  unsigned char DeviceTypeModifier:7;                  /* Byte 1 Bits 0-6 */
+  boolean RMB:1;                                       /* Byte 1 Bit 7 */
+  unsigned char ANSI_ApprovedVersion:3;                        /* Byte 2 Bits 0-2 */
+  unsigned char ECMA_Version:3;                                /* Byte 2 Bits 3-5 */
+  unsigned char ISO_Version:2;                         /* Byte 2 Bits 6-7 */
+  unsigned char ResponseDataFormat:4;                  /* Byte 3 Bits 0-3 */
+  unsigned char :2;                                    /* Byte 3 Bits 4-5 */
+  boolean TrmIOP:1;                                    /* Byte 3 Bit 6 */
+  boolean AENC:1;                                      /* Byte 3 Bit 7 */
+  unsigned char AdditionalLength;                      /* Byte 4 */
+  unsigned char :8;                                    /* Byte 5 */
+  unsigned char :8;                                    /* Byte 6 */
+  boolean SftRe:1;                                     /* Byte 7 Bit 0 */
+  boolean CmdQue:1;                                    /* Byte 7 Bit 1 */
+  boolean :1;                                          /* Byte 7 Bit 2 */
+  boolean Linked:1;                                    /* Byte 7 Bit 3 */
+  boolean Sync:1;                                      /* Byte 7 Bit 4 */
+  boolean WBus16:1;                                    /* Byte 7 Bit 5 */
+  boolean WBus32:1;                                    /* Byte 7 Bit 6 */
+  boolean RelAdr:1;                                    /* Byte 7 Bit 7 */
+  unsigned char VendorIdentification[8];               /* Bytes 8-15 */
+  unsigned char ProductIdentification[16];             /* Bytes 16-31 */
+  unsigned char ProductRevisionLevel[4];               /* Bytes 32-35 */
+}
+DAC960_SCSI_Inquiry_T;
+
+
+/*
+  Define the SCSI INQUIRY Unit Serial Number reply structure.
+*/
+
+typedef struct DAC960_SCSI_Inquiry_UnitSerialNumber
+{
+  unsigned char PeripheralDeviceType:5;                        /* Byte 0 Bits 0-4 */
+  unsigned char PeripheralQualifier:3;                 /* Byte 0 Bits 5-7 */
+  unsigned char PageCode;                              /* Byte 1 */
+  unsigned char :8;                                    /* Byte 2 */
+  unsigned char PageLength;                            /* Byte 3 */
+  unsigned char ProductSerialNumber[28];               /* Bytes 4 - 31 */
+}
+DAC960_SCSI_Inquiry_UnitSerialNumber_T;
+
+
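The two reply structures above are declared bit-for-bit against the wire format: SCSI-2 standard INQUIRY data is 36 bytes and the Unit Serial Number page as declared here is 32 bytes (4-byte header plus 28 serial bytes). A sketch checking those sizes with replica layouts, using plain `unsigned char` bitfields in place of the driver's one-bit `boolean` and assuming GCC's usual bitfield packing:

```c
/* Replica of DAC960_SCSI_Inquiry_T's layout, one line per byte of
   bitfields; field names abbreviated. */
struct inquiry_replica {
    unsigned char dev_type:5, qualifier:3;          /* byte 0 */
    unsigned char modifier:7, rmb:1;                /* byte 1 */
    unsigned char ansi:3, ecma:3, iso:2;            /* byte 2 */
    unsigned char fmt:4, :2, trmiop:1, aenc:1;      /* byte 3 */
    unsigned char additional_length;                /* byte 4 */
    unsigned char reserved5, reserved6;             /* bytes 5-6 */
    unsigned char flags;                            /* byte 7 (8 flag bits) */
    unsigned char vendor[8];                        /* bytes 8-15 */
    unsigned char product[16];                      /* bytes 16-31 */
    unsigned char revision[4];                      /* bytes 32-35 */
};

/* Replica of DAC960_SCSI_Inquiry_UnitSerialNumber_T's layout. */
struct serial_replica {
    unsigned char dev_type:5, qualifier:3;          /* byte 0 */
    unsigned char page_code;                        /* byte 1 */
    unsigned char reserved;                         /* byte 2 */
    unsigned char page_length;                      /* byte 3 */
    unsigned char serial[28];                       /* bytes 4-31 */
};
```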
 /*
   Define the Scatter/Gather List Type 1 32 Bit Address 32 Bit Byte Count
   structure.
@@ -977,12 +1028,12 @@ static inline void *Bus_to_Virtual(DAC960_BusAddress_T BusAddress)
 
 
 /*
-  Define the Controller Line, Status Buffer, Rebuild Progress, and
-  User Message Sizes.
+  Define the Controller Line Buffer, Status Buffer, Rebuild Progress,
+  and User Message Sizes.
 */
 
 #define DAC960_LineBufferSize                  100
-#define DAC960_StatusBufferSize                        5000
+#define DAC960_StatusBufferSize                        16384
 #define DAC960_RebuildProgressSize             200
 #define DAC960_UserMessageSize                 200
 
@@ -1183,6 +1234,7 @@ typedef struct DAC960_Controller
   unsigned long MonitoringTimerCount;
   unsigned long SecondaryMonitoringTime;
   unsigned long LastProgressReportTime;
+  unsigned long LastCurrentStatusTime;
   boolean DualModeMemoryMailboxInterface;
   boolean SAFTE_EnclosureManagementEnabled;
   boolean ControllerInitialized;
@@ -1190,6 +1242,8 @@ typedef struct DAC960_Controller
   boolean NeedLogicalDriveInformation;
   boolean NeedErrorTableInformation;
   boolean NeedDeviceStateInformation;
+  boolean NeedDeviceInquiryInformation;
+  boolean NeedDeviceSerialNumberInformation;
   boolean NeedRebuildProgress;
   boolean NeedConsistencyCheckProgress;
   boolean EphemeralProgressMessage;
@@ -1209,6 +1263,7 @@ typedef struct DAC960_Controller
   PROC_DirectoryEntry_T CurrentStatusProcEntry;
   PROC_DirectoryEntry_T UserCommandProcEntry;
   WaitQueue_T *CommandWaitQueue;
+  DAC960_DCDB_T MonitoringDCDB;
   DAC960_Enquiry_T Enquiry[2];
   DAC960_ErrorTable_T ErrorTable[2];
   DAC960_EventLogEntry_T EventLogEntry;
@@ -1219,12 +1274,17 @@ typedef struct DAC960_Controller
   DAC960_LogicalDriveState_T LogicalDriveInitialState[DAC960_MaxLogicalDrives];
   DAC960_DeviceState_T DeviceState[2][DAC960_MaxChannels][DAC960_MaxTargets];
   DAC960_Command_T Commands[DAC960_MaxDriverQueueDepth];
+  DAC960_SCSI_Inquiry_T
+    InquiryStandardData[DAC960_MaxChannels][DAC960_MaxTargets];
+  DAC960_SCSI_Inquiry_UnitSerialNumber_T
+    InquiryUnitSerialNumber[DAC960_MaxChannels][DAC960_MaxTargets];
   DiskPartition_T DiskPartitions[DAC960_MinorCount];
   int LogicalDriveUsageCount[DAC960_MaxLogicalDrives];
   int PartitionSizes[DAC960_MinorCount];
   int BlockSizes[DAC960_MinorCount];
   int MaxSectorsPerRequest[DAC960_MinorCount];
   int MaxSegmentsPerRequest[DAC960_MinorCount];
+  int DeviceResetCount[DAC960_MaxChannels][DAC960_MaxTargets];
   boolean DirectCommandActive[DAC960_MaxChannels][DAC960_MaxTargets];
   char InitialStatusBuffer[DAC960_StatusBufferSize];
   char CurrentStatusBuffer[DAC960_StatusBufferSize];
index 9e2b1b078acfd94cf965b1f68692ed34eae068c1..d050b53ac9e402e5779e97762fb1083eb85f3c8b 100644 (file)
@@ -234,14 +234,6 @@ else
   endif
 endif
 
-ifeq ($(CONFIG_BLK_DEV_DAC960),y)
-LX_OBJS += DAC960.o
-else
-  ifeq ($(CONFIG_BLK_DEV_DAC960),m)
-  MX_OBJS += DAC960.o
-  endif
-endif
-
 ifeq ($(CONFIG_BLK_CPQ_DA),y)
 L_OBJS += cpqarray.o
 else
@@ -250,6 +242,14 @@ else
   endif
 endif
 
+ifeq ($(CONFIG_BLK_DEV_DAC960),y)
+LX_OBJS += DAC960.o
+else
+  ifeq ($(CONFIG_BLK_DEV_DAC960),m)
+  MX_OBJS += DAC960.o
+  endif
+endif
+
 ifeq ($(CONFIG_BLK_DEV_MD),y)
 LX_OBJS += md.o
 
@@ -278,31 +278,13 @@ else
 endif
 
 ifeq ($(CONFIG_MD_RAID5),y)
-LX_OBJS += xor.o
 L_OBJS += raid5.o
 else
   ifeq ($(CONFIG_MD_RAID5),m)
-  LX_OBJS += xor.o
   M_OBJS += raid5.o
   endif
 endif
 
-ifeq ($(CONFIG_MD_TRANSLUCENT),y)
-L_OBJS += translucent.o
-else
-  ifeq ($(CONFIG_MD_TRANSLUCENT),m)
-  M_OBJS += translucent.o
-  endif
-endif
-
-ifeq ($(CONFIG_MD_HSM),y)
-L_OBJS += hsm.o
-else
-  ifeq ($(CONFIG_MD_HSM),m)
-  M_OBJS += hsm.o
-  endif
-endif
-
 endif
 
 ifeq ($(CONFIG_BLK_DEV_NBD),y)
index 31e9786f3fa22d11e69f2e2fd4b213eba3e3b7ad..fa51d3dee4021fa66b86b64d5f29c3775fac4781 100644 (file)
@@ -30,6 +30,7 @@
 #include <linux/locks.h>
 #include <linux/malloc.h>
 #include <linux/proc_fs.h>
+#include <linux/md.h>
 #include <linux/timer.h>
 #endif
 
index aff62db8b1a79529d5094e1884826a1b694afe84..c0cb2f8e91cd1bd3cf08a3f0a3ee9b71ad381e72 100644 (file)
@@ -28,7 +28,6 @@
 #include <linux/string.h>
 #include <linux/blk.h>
 #include <linux/init.h>
-#include <linux/raid/md.h>
 
 #include <asm/system.h>
 #include <asm/byteorder.h>
@@ -1467,9 +1466,6 @@ __initfunc(void device_setup(void))
 #endif
        rd_load();
 #endif
-#ifdef CONFIG_BLK_DEV_MD
-       autodetect_raid();
-#endif
 #ifdef CONFIG_MD_BOOT
         md_setup_drive();
 #endif
diff --git a/drivers/block/hsm.c b/drivers/block/hsm.c
deleted file mode 100644 (file)
index 6307a50..0000000
+++ /dev/null
@@ -1,840 +0,0 @@
-/*
-   hsm.c : HSM RAID driver for Linux
-              Copyright (C) 1998 Ingo Molnar
-
-   HSM mode management functions.
-
-   This program is free software; you can redistribute it and/or modify
-   it under the terms of the GNU General Public License as published by
-   the Free Software Foundation; either version 2, or (at your option)
-   any later version.
-   
-   You should have received a copy of the GNU General Public License
-   (for example /usr/src/linux/COPYING); if not, write to the Free
-   Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.  
-*/
-
-#include <linux/module.h>
-
-#include <linux/raid/md.h>
-#include <linux/malloc.h>
-
-#include <linux/raid/hsm.h>
-#include <linux/blk.h>
-
-#define MAJOR_NR MD_MAJOR
-#define MD_DRIVER
-#define MD_PERSONALITY
-
-
-#define DEBUG_HSM 1
-
-#if DEBUG_HSM
-#define dprintk(x,y...) printk(x,##y)
-#else
-#define dprintk(x,y...) do { } while (0)
-#endif
-
-void print_bh(struct buffer_head *bh)
-{
-       dprintk("bh %p: %lx %lx %x %x %lx %p %lx %p %x %p %x %lx\n", bh, 
-               bh->b_blocknr, bh->b_size, bh->b_dev, bh->b_rdev,
-               bh->b_rsector, bh->b_this_page, bh->b_state,
-               bh->b_next_free, bh->b_count, bh->b_data,
-               bh->b_list, bh->b_flushtime
-       );
-}
-
-static int check_bg (pv_t *pv, pv_block_group_t * bg)
-{
-       int i, free = 0;
-
-       dprintk("checking bg ...\n");
-
-       for (i = 0; i < pv->pv_sb->pv_bg_size-1; i++) {
-               if (pv_pptr_free(bg->blocks + i)) {
-                       free++;
-                       if (test_bit(i, bg->used_bitmap)) {
-                               printk("hm, bit %d set?\n", i);
-                       }
-               } else {
-                       if (!test_bit(i, bg->used_bitmap)) {
-                               printk("hm, bit %d not set?\n", i);
-                       }
-               }
-       }
-       dprintk("%d free blocks in bg ...\n", free);
-       return free;
-}
-
-static void get_bg (pv_t *pv, pv_bg_desc_t *desc, int nr)
-{
-       unsigned int bg_pos = nr * pv->pv_sb->pv_bg_size + 2;
-       struct buffer_head *bh;
-
-       dprintk("... getting BG at %u ...\n", bg_pos);
-
-        bh = bread (pv->dev, bg_pos, HSM_BLOCKSIZE);
-       if (!bh) {
-               MD_BUG();
-               return;
-       }
-       desc->bg = (pv_block_group_t *) bh->b_data;
-       desc->free_blocks = check_bg(pv, desc->bg);
-}
-
-static int find_free_block (lv_t *lv, pv_t *pv, pv_bg_desc_t *desc, int nr,
-                               unsigned int lblock, lv_lptr_t * index)
-{
-       int i;
-
-       for (i = 0; i < pv->pv_sb->pv_bg_size-1; i++) {
-               pv_pptr_t * bptr = desc->bg->blocks + i;
-               if (pv_pptr_free(bptr)) {
-                       unsigned int bg_pos = nr * pv->pv_sb->pv_bg_size + 2;
-
-                       if (test_bit(i, desc->bg->used_bitmap)) {
-                               MD_BUG();
-                               continue;
-                       }
-                       bptr->u.used.owner.log_id = lv->log_id;
-                       bptr->u.used.owner.log_index = lblock;
-                       index->data.phys_nr = pv->phys_nr;
-                       index->data.phys_block = bg_pos + i + 1;
-                       set_bit(i, desc->bg->used_bitmap);
-                       desc->free_blocks--;
-                       dprintk(".....free blocks left in bg %p: %d\n",
-                                       desc->bg, desc->free_blocks);
-                       return 0;
-               }
-       }
-       return -ENOSPC;
-}
-
-static int __get_free_block (lv_t *lv, pv_t *pv,
-                                       unsigned int lblock, lv_lptr_t * index)
-{
-       int i;
-
-       dprintk("trying to get free block for lblock %d ...\n", lblock);
-
-       for (i = 0; i < pv->pv_sb->pv_block_groups; i++) {
-               pv_bg_desc_t *desc = pv->bg_array + i;
-
-               dprintk("looking at desc #%d (%p)...\n", i, desc->bg);
-               if (!desc->bg)
-                       get_bg(pv, desc, i);
-
-               if (desc->bg && desc->free_blocks)
-                       return find_free_block(lv, pv, desc, i,
-                                                       lblock, index);
-       }
-       dprintk("hsm: pv %s full!\n", partition_name(pv->dev));
-       return -ENOSPC;
-}
-
-static int get_free_block (lv_t *lv, unsigned int lblock, lv_lptr_t * index)
-{
-       int err;
-
-       if (!lv->free_indices)
-               return -ENOSPC;
-
-       /* fix me */
-       err = __get_free_block(lv, lv->vg->pv_array + 0, lblock, index);
-
-       if (err || !index->data.phys_block) {
-               MD_BUG();
-               return -ENOSPC;
-       }
-
-       lv->free_indices--;
-
-       return 0;
-}
-
-/*
- * fix me: wordsize assumptions ...
- */
-#define INDEX_BITS 8
-#define INDEX_DEPTH (32/INDEX_BITS)
-#define INDEX_MASK ((1<<INDEX_BITS) - 1)
-
-static void print_index_list (lv_t *lv, lv_lptr_t *index)
-{
-       lv_lptr_t *tmp;
-       int i;
-
-       dprintk("... block <%u,%u,%x> [.", index->data.phys_nr,
-               index->data.phys_block, index->cpu_addr);
-
-       tmp = index_child(index);
-       for (i = 0; i < HSM_LPTRS_PER_BLOCK; i++) {
-               if (index_block(lv, tmp))
-                       dprintk("(%d->%d)", i, index_block(lv, tmp));
-               tmp++;
-       }
-       dprintk(".]\n");
-}
-
-static int read_index_group (lv_t *lv, lv_lptr_t *index)
-{
-       lv_lptr_t *index_group, *tmp;
-       struct buffer_head *bh;
-       int i;
-
-       dprintk("reading index group <%s:%d>\n",
-               partition_name(index_dev(lv, index)), index_block(lv, index));
-
-       bh = bread(index_dev(lv, index), index_block(lv, index), HSM_BLOCKSIZE);
-       if (!bh) {
-               MD_BUG();
-               return -EIO;
-       }
-       if (!buffer_uptodate(bh))
-               MD_BUG();
-
-       index_group = (lv_lptr_t *) bh->b_data;
-       tmp = index_group;
-       for (i = 0; i < HSM_LPTRS_PER_BLOCK; i++) {
-               if (index_block(lv, tmp)) {
-                       dprintk("index group has BLOCK %d, non-present.\n", i);
-                       tmp->cpu_addr = 0;
-               }
-               tmp++;
-       }
-       index->cpu_addr = ptr_to_cpuaddr(index_group);
-
-       dprintk("have read index group %p at block %d.\n",
-                               index_group, index_block(lv, index));
-       print_index_list(lv, index);
-
-       return 0;
-}
-
-static int alloc_index_group (lv_t *lv, unsigned int lblock, lv_lptr_t * index)
-{
-       struct buffer_head *bh;
-       lv_lptr_t * index_group;
-       
-       if (get_free_block(lv, lblock, index))
-               return -ENOSPC;
-
-       dprintk("creating block for index group <%s:%d>\n",
-               partition_name(index_dev(lv, index)), index_block(lv, index));
-
-       bh = getblk(index_dev(lv, index),
-                        index_block(lv, index), HSM_BLOCKSIZE);
-
-       index_group = (lv_lptr_t *) bh->b_data;
-       md_clear_page(index_group);
-       mark_buffer_uptodate(bh, 1);
-
-       index->cpu_addr = ptr_to_cpuaddr(index_group);
-
-       dprintk("allocated index group %p at block %d.\n",
-                               index_group, index_block(lv, index));
-       return 0;
-}
-
-static lv_lptr_t * alloc_fixed_index (lv_t *lv, unsigned int lblock)
-{
-       lv_lptr_t * index = index_child(&lv->root_index);
-       int idx, l;
-
-       for (l = INDEX_DEPTH-1; l >= 0; l--) {
-               idx = (lblock >> (INDEX_BITS*l)) & INDEX_MASK;
-               index += idx;
-               if (!l)
-                       break;
-               if (!index_present(index)) {
-                       dprintk("no group, level %u, pos %u\n", l, idx);
-                       if (alloc_index_group(lv, lblock, index))
-                               return NULL;
-               }
-               index = index_child(index);
-       }
-       if (!index_block(lv,index)) {
-               dprintk("no data, pos %u\n", idx);
-               if (get_free_block(lv, lblock, index))
-                       return NULL;
-               return index;
-       }
-       MD_BUG();
-       return index;
-}
-
-static lv_lptr_t * find_index (lv_t *lv, unsigned int lblock)
-{
-       lv_lptr_t * index = index_child(&lv->root_index);
-       int idx, l;
-
-       for (l = INDEX_DEPTH-1; l >= 0; l--) {
-               idx = (lblock >> (INDEX_BITS*l)) & INDEX_MASK;
-               index += idx;
-               if (!l)
-                       break;
-               if (index_free(index))
-                       return NULL;
-               if (!index_present(index))
-                       read_index_group(lv, index);
-               if (!index_present(index)) {
-                       MD_BUG();
-                       return NULL;
-               }
-               index = index_child(index);
-       }
-       if (!index_block(lv,index))
-               return NULL;
-       return index;
-}
-
-static int read_root_index(lv_t *lv)
-{
-       int err;
-       lv_lptr_t *index = &lv->root_index;
-
-       if (!index_block(lv, index)) {
-               printk("LV has no root index yet, creating.\n");
-
-               err = alloc_index_group (lv, 0, index);
-               if (err) {
-                       printk("could not create index group, err:%d\n", err);
-                       return err;
-               }
-               lv->vg->vg_sb->lv_array[lv->log_id].lv_root_idx =
-                                       lv->root_index.data;
-       } else {
-               printk("LV already has a root index.\n");
-               printk("... at <%s:%d>.\n",
-                       partition_name(index_dev(lv, index)),
-                       index_block(lv, index));
-
-               read_index_group(lv, index);
-       }
-       return 0;
-}
-
-static int init_pv(pv_t *pv)
-{
-       struct buffer_head *bh;
-       pv_sb_t *pv_sb;
-
-        bh = bread (pv->dev, 0, HSM_BLOCKSIZE);
-       if (!bh) {
-               MD_BUG();
-               return -1;
-       }
-
-       pv_sb = (pv_sb_t *) bh->b_data;
-       pv->pv_sb = pv_sb;
-
-       if (pv_sb->pv_magic != HSM_PV_SB_MAGIC) {
-               printk("%s is not a PV, has magic %x instead of %x!\n",
-                       partition_name(pv->dev), pv_sb->pv_magic,
-                       HSM_PV_SB_MAGIC);
-               return -1;
-       }
-       printk("%s detected as a valid PV (#%d).\n", partition_name(pv->dev),
-                                                       pv->phys_nr);
-       printk("... created under HSM version %d.%d.%d, at %x.\n",
-           pv_sb->pv_major, pv_sb->pv_minor, pv_sb->pv_patch, pv_sb->pv_ctime);
-       printk("... total # of blocks: %d (%d left unallocated).\n",
-                        pv_sb->pv_total_size, pv_sb->pv_blocks_left);
-
-       printk("... block size: %d bytes.\n", pv_sb->pv_block_size);
-       printk("... block descriptor size: %d bytes.\n", pv_sb->pv_pptr_size);
-       printk("... block group size: %d blocks.\n", pv_sb->pv_bg_size);
-       printk("... # of block groups: %d.\n", pv_sb->pv_block_groups);
-
-       if (pv_sb->pv_block_groups*sizeof(pv_bg_desc_t) > PAGE_SIZE) {
-               MD_BUG();
-               return 1;
-       }
-       pv->bg_array = (pv_bg_desc_t *)__get_free_page(GFP_KERNEL);
-       if (!pv->bg_array) {
-               MD_BUG();
-               return 1;
-       }
-       memset(pv->bg_array, 0, PAGE_SIZE);
-
-       return 0;
-}
-
-static int free_pv(pv_t *pv)
-{
-       struct buffer_head *bh;
-
-       dprintk("freeing PV %d ...\n", pv->phys_nr);
-
-       if (pv->bg_array) {
-               int i;
-
-               dprintk(".... freeing BGs ...\n");
-               for (i = 0; i < pv->pv_sb->pv_block_groups; i++) {
-                       unsigned int bg_pos = i * pv->pv_sb->pv_bg_size + 2;
-                       pv_bg_desc_t *desc = pv->bg_array + i;
-
-                       if (desc->bg) {
-                               dprintk(".... freeing BG %d ...\n", i);
-                               bh = getblk (pv->dev, bg_pos, HSM_BLOCKSIZE);
-                               mark_buffer_dirty(bh, 1);
-                               brelse(bh);
-                               brelse(bh);
-                       }
-               }
-               free_page((unsigned long)pv->bg_array);
-       } else
-               MD_BUG();
-
-        bh = getblk (pv->dev, 0, HSM_BLOCKSIZE);
-       if (!bh) {
-               MD_BUG();
-               return -1;
-       }
-       mark_buffer_dirty(bh, 1);
-       brelse(bh);
-       brelse(bh);
-
-       return 0;
-}
-
-struct semaphore hsm_sem = MUTEX;
-
-#define HSM_SECTORS (HSM_BLOCKSIZE/512)
-
-static int hsm_map (mddev_t *mddev, kdev_t dev, kdev_t *rdev,
-                       unsigned long *rsector, unsigned long bsectors)
-{
-       lv_t *lv = kdev_to_lv(dev);
-       lv_lptr_t *index;
-       unsigned int lblock = *rsector / HSM_SECTORS;
-       unsigned int offset = *rsector % HSM_SECTORS;
-       int err = -EIO;
-
-       if (!lv) {
-               printk("HSM: md%d not a Logical Volume!\n", mdidx(mddev));
-               goto out;
-       }
-       if (offset + bsectors > HSM_SECTORS) {
-               MD_BUG();
-               goto out;
-       }
-       down(&hsm_sem);
-       index = find_index(lv, lblock);
-       if (!index) {
-               printk("no block %u yet ... allocating\n", lblock);
-               index = alloc_fixed_index(lv, lblock);
-       }
-
-       err = 0;
-
-       printk(" %u <%s : %ld(%ld)> -> ", lblock,
-               partition_name(*rdev), *rsector, bsectors);
-
-       *rdev = index_dev(lv, index);
-       *rsector = index_block(lv, index) * HSM_SECTORS + offset;
-
-       printk(" <%s : %ld> %u\n",
-               partition_name(*rdev), *rsector, index_block(lv, index));
-
-       up(&hsm_sem);
-out:
-       return err;
-}
-
-static void free_index (lv_t *lv, lv_lptr_t * index)
-{
-       struct buffer_head *bh;
-
-       printk("trying to get cached block for index group <%s:%d>\n",
-               partition_name(index_dev(lv, index)), index_block(lv, index));
-
-       bh = getblk(index_dev(lv, index), index_block(lv, index),HSM_BLOCKSIZE);
-
-       printk("....FREEING ");
-       print_index_list(lv, index);
-
-       if (bh) {
-               if (!buffer_uptodate(bh))
-                       MD_BUG();
-               if ((lv_lptr_t *)bh->b_data != index_child(index)) {
-                       printk("huh? b_data is %p, index content is %p.\n",
-                               bh->b_data, index_child(index));
-               } else 
-                       printk("good, b_data == index content == %p.\n",
-                               index_child(index));
-               printk("b_count == %d, writing.\n", bh->b_count);
-               mark_buffer_dirty(bh, 1);
-               brelse(bh);
-               brelse(bh);
-               printk("done.\n");
-       } else {
-               printk("FAILED!\n");
-       }
-       print_index_list(lv, index);
-       index_child(index) = NULL;
-}
-
-static void free_index_group (lv_t *lv, int level, lv_lptr_t * index_0)
-{
-       char dots [3*8];
-       lv_lptr_t * index;
-       int i, nr_dots;
-
-       nr_dots = (INDEX_DEPTH-level)*3;
-       memcpy(dots,"...............",nr_dots);
-       dots[nr_dots] = 0;
-
-       dprintk("%s level %d index group block:\n", dots, level);
-
-
-       index = index_0;
-       for (i = 0; i < HSM_LPTRS_PER_BLOCK; i++) {
-               if (index->data.phys_block) {
-                       dprintk("%s block <%u,%u,%x>\n", dots,
-                               index->data.phys_nr,
-                               index->data.phys_block,
-                               index->cpu_addr);
-                       if (level && index_present(index)) {
-                               dprintk("%s==> deeper one level\n", dots);
-                               free_index_group(lv, level-1,
-                                               index_child(index));
-                               dprintk("%s freeing index group block %p ...",
-                                               dots, index_child(index));
-                               free_index(lv, index);
-                       }
-               }
-               index++;
-       }
-       dprintk("%s DONE: level %d index group block.\n", dots, level);
-}
-
-static void free_lv_indextree (lv_t *lv)
-{
-       dprintk("freeing LV %d ...\n", lv->log_id);
-       dprintk("..root index: %p\n", index_child(&lv->root_index));
-       dprintk("..INDEX TREE:\n");
-       free_index_group(lv, INDEX_DEPTH-1, index_child(&lv->root_index));
-       dprintk("..freeing root index %p ...", index_child(&lv->root_index));
-       dprintk("root block <%u,%u,%x>\n", lv->root_index.data.phys_nr,
-               lv->root_index.data.phys_block, lv->root_index.cpu_addr);
-       free_index(lv, &lv->root_index);
-       dprintk("..INDEX TREE done.\n");
-       fsync_dev(lv->vg->pv_array[0].dev); /* fix me */
-       lv->vg->vg_sb->lv_array[lv->log_id].lv_free_indices = lv->free_indices;
-}
-
-static void print_index_group (lv_t *lv, int level, lv_lptr_t * index_0)
-{
-       char dots [3*5];
-       lv_lptr_t * index;
-       int i, nr_dots;
-
-       nr_dots = (INDEX_DEPTH-level)*3;
-       memcpy(dots,"...............",nr_dots);
-       dots[nr_dots] = 0;
-
-       dprintk("%s level %d index group block:\n", dots, level);
-
-
-       for (i = 0; i < HSM_LPTRS_PER_BLOCK; i++) {
-               index = index_0 + i;
-               if (index->data.phys_block) {
-                       dprintk("%s block <%u,%u,%x>\n", dots,
-                               index->data.phys_nr,
-                               index->data.phys_block,
-                               index->cpu_addr);
-                       if (level && index_present(index)) {
-                               dprintk("%s==> deeper one level\n", dots);
-                               print_index_group(lv, level-1,
-                                                       index_child(index));
-                       }
-               }
-       }
-       dprintk("%s DONE: level %d index group block.\n", dots, level);
-}
-
-static void print_lv (lv_t *lv)
-{
-       dprintk("printing LV %d ...\n", lv->log_id);
-       dprintk("..root index: %p\n", index_child(&lv->root_index));
-       dprintk("..INDEX TREE:\n");
-       print_index_group(lv, INDEX_DEPTH-1, index_child(&lv->root_index));
-       dprintk("..INDEX TREE done.\n");
-}
-
-static int map_lv (lv_t *lv)
-{
-       kdev_t dev = lv->dev;
-       unsigned int nr = MINOR(dev);
-       mddev_t *mddev = lv->vg->mddev;
-
-       if (MAJOR(dev) != MD_MAJOR) {
-               MD_BUG();
-               return -1;
-       }
-       if (kdev_to_mddev(dev)) {
-               MD_BUG();
-               return -1;
-       }
-       md_hd_struct[nr].start_sect = 0;
-       md_hd_struct[nr].nr_sects = md_size[mdidx(mddev)] << 1;
-       md_size[nr] = md_size[mdidx(mddev)];
-       add_mddev_mapping(mddev, dev, lv);
-
-       return 0;
-}
-
-static int unmap_lv (lv_t *lv)
-{
-       kdev_t dev = lv->dev;
-       unsigned int nr = MINOR(dev);
-
-       if (MAJOR(dev) != MD_MAJOR) {
-               MD_BUG();
-               return -1;
-       }
-       md_hd_struct[nr].start_sect = 0;
-       md_hd_struct[nr].nr_sects = 0;
-       md_size[nr] = 0;
-       del_mddev_mapping(lv->vg->mddev, dev);
-
-       return 0;
-}
-
-static int init_vg (vg_t *vg)
-{
-       int i;
-       lv_t *lv;
-       kdev_t dev;
-       vg_sb_t *vg_sb;
-       struct buffer_head *bh;
-       lv_descriptor_t *lv_desc;
-
-       /*
-        * fix me: read all PVs and compare the SB
-        */
-        dev = vg->pv_array[0].dev;
-        bh = bread (dev, 1, HSM_BLOCKSIZE);
-       if (!bh) {
-               MD_BUG();
-               return -1;
-       }
-
-       vg_sb = (vg_sb_t *) bh->b_data;
-       vg->vg_sb = vg_sb;
-
-       if (vg_sb->vg_magic != HSM_VG_SB_MAGIC) {
-               printk("%s is not a valid VG, has magic %x instead of %x!\n",
-                       partition_name(dev), vg_sb->vg_magic,
-                       HSM_VG_SB_MAGIC);
-               return -1;
-       }
-
-       vg->nr_lv = 0;
-       for (i = 0; i < HSM_MAX_LVS_PER_VG; i++) {
-               unsigned int id;
-               lv_desc = vg->vg_sb->lv_array + i;
-
-               id = lv_desc->lv_id;
-               if (!id) {
-                       printk("... LV desc %d empty\n", i);
-                       continue;
-               }
-               if (id >= HSM_MAX_LVS_PER_VG) {
-                       MD_BUG();
-                       continue;
-               }
-
-               lv = vg->lv_array + id;
-               if (lv->vg) {
-                       MD_BUG();
-                       continue;
-               }
-               lv->log_id = id;
-               lv->vg = vg;
-               lv->max_indices = lv_desc->lv_max_indices;
-               lv->free_indices = lv_desc->lv_free_indices;
-               lv->root_index.data = lv_desc->lv_root_idx;
-               lv->dev = MKDEV(MD_MAJOR, lv_desc->md_id);
-
-               vg->nr_lv++;
-
-               map_lv(lv);
-               if (read_root_index(lv)) {
-                       vg->nr_lv--;
-                       unmap_lv(lv);
-                       memset(lv, 0, sizeof(*lv));
-               }
-       }
-       if (vg->nr_lv != vg_sb->nr_lvs)
-               MD_BUG();
-
-       return 0;
-}
-
-static int hsm_run (mddev_t *mddev)
-{
-       int i;
-       vg_t *vg;
-       mdk_rdev_t *rdev;
-
-       MOD_INC_USE_COUNT;
-
-       vg = kmalloc (sizeof (*vg), GFP_KERNEL);
-       if (!vg)
-               goto out;
-       memset(vg, 0, sizeof(*vg));
-       mddev->private = vg;
-       vg->mddev = mddev;
-
-       if (md_check_ordering(mddev)) {
-               printk("hsm: disks are not ordered, aborting!\n");
-               goto out;
-       }
-
-       set_blocksize (mddev_to_kdev(mddev), HSM_BLOCKSIZE);
-
-       vg->nr_pv = mddev->nb_dev;
-       ITERATE_RDEV_ORDERED(mddev,rdev,i) {
-               pv_t *pv = vg->pv_array + i;
-
-               pv->dev = rdev->dev;
-               fsync_dev (pv->dev);
-               set_blocksize (pv->dev, HSM_BLOCKSIZE);
-               pv->phys_nr = i;
-               if (init_pv(pv))
-                       goto out;
-       }
-
-       init_vg(vg);
-
-       return 0;
-
-out:
-       if (vg) {
-               kfree(vg);
-               mddev->private = NULL;
-       }
-       MOD_DEC_USE_COUNT;
-
-       return 1;
-}
-
-static int hsm_stop (mddev_t *mddev)
-{
-       lv_t *lv;
-       vg_t *vg;
-       int i;
-
-       vg = mddev_to_vg(mddev);
-
-       for (i = 0; i < HSM_MAX_LVS_PER_VG; i++) {
-               lv = vg->lv_array + i;
-               if (!lv->log_id)
-                       continue;
-               print_lv(lv);
-               free_lv_indextree(lv);
-               unmap_lv(lv);
-       }
-       for (i = 0; i < vg->nr_pv; i++)
-               free_pv(vg->pv_array + i);
-
-       kfree(vg);
-
-       MOD_DEC_USE_COUNT;
-
-       return 0;
-}
-
-
-static int hsm_status (char *page, mddev_t *mddev)
-{
-       int sz = 0, i;
-       lv_t *lv;
-       vg_t *vg;
-
-       vg = mddev_to_vg(mddev);
-
-       for (i = 0; i < HSM_MAX_LVS_PER_VG; i++) {
-               lv = vg->lv_array + i;
-               if (!lv->log_id)
-                       continue;
-               sz += sprintf(page+sz, "<LV%d %d/%d blocks used> ", lv->log_id,
-                       lv->max_indices - lv->free_indices, lv->max_indices);
-       }
-       return sz;
-}
-
-
-static mdk_personality_t hsm_personality=
-{
-       "hsm",
-       hsm_map,
-       NULL,
-       NULL,
-       hsm_run,
-       hsm_stop,
-       hsm_status,
-       NULL,
-       0,
-       NULL,
-       NULL,
-       NULL,
-       NULL
-};
-
-#ifndef MODULE
-
-md__initfunc(void hsm_init (void))
-{
-       register_md_personality (HSM, &hsm_personality);
-}
-
-#else
-
-int init_module (void)
-{
-       return (register_md_personality (HSM, &hsm_personality));
-}
-
-void cleanup_module (void)
-{
-       unregister_md_personality (HSM);
-}
-
-#endif
-
-/*
- * This Linus-trick catches bugs via the linker.
- */
-
-extern void __BUG__in__hsm_dot_c_1(void);
-extern void __BUG__in__hsm_dot_c_2(void);
-extern void __BUG__in__hsm_dot_c_3(void);
-extern void __BUG__in__hsm_dot_c_4(void);
-extern void __BUG__in__hsm_dot_c_5(void);
-extern void __BUG__in__hsm_dot_c_6(void);
-extern void __BUG__in__hsm_dot_c_7(void);
-void bugcatcher (void)
-{
-        if (sizeof(pv_block_group_t) != HSM_BLOCKSIZE)
-                __BUG__in__hsm_dot_c_1();
-        if (sizeof(lv_index_block_t) != HSM_BLOCKSIZE)
-                __BUG__in__hsm_dot_c_2();
-
-        if (sizeof(pv_sb_t) != HSM_BLOCKSIZE)
-                __BUG__in__hsm_dot_c_4();
-        if (sizeof(lv_sb_t) != HSM_BLOCKSIZE)
-                __BUG__in__hsm_dot_c_3();
-       if (sizeof(vg_sb_t) != HSM_BLOCKSIZE)
-                __BUG__in__hsm_dot_c_6();
-
-       if (sizeof(lv_lptr_t) != 16)
-                __BUG__in__hsm_dot_c_5();
-       if (sizeof(pv_pptr_t) != 16)
-                __BUG__in__hsm_dot_c_6();
-}
-
index e60ef217780db67321a52b816314366b6a927b97..b6f72fd6a038fbf46fe733cc12526917865bc8a9 100644 (file)
@@ -1,3 +1,4 @@
+
 /*
    linear.c : Multiple Devices driver for Linux
               Copyright (C) 1994-96 Marc ZYNGIER
 
 #include <linux/module.h>
 
-#include <linux/raid/md.h>
+#include <linux/md.h>
 #include <linux/malloc.h>
+#include <linux/init.h>
 
-#include <linux/raid/linear.h>
+#include "linear.h"
 
 #define MAJOR_NR MD_MAJOR
 #define MD_DRIVER
 #define MD_PERSONALITY
 
-static int linear_run (mddev_t *mddev)
+static int linear_run (int minor, struct md_dev *mddev)
 {
-       linear_conf_t *conf;
-       struct linear_hash *table;
-       mdk_rdev_t *rdev;
-       int size, i, j, nb_zone;
-       unsigned int curr_offset;
-
-       MOD_INC_USE_COUNT;
-
-       conf = kmalloc (sizeof (*conf), GFP_KERNEL);
-       if (!conf)
-               goto out;
-       mddev->private = conf;
-
-       if (md_check_ordering(mddev)) {
-               printk("linear: disks are not ordered, aborting!\n");
-               goto out;
-       }
-       /*
-        * Find the smallest device.
-        */
-
-       conf->smallest = NULL;
-       curr_offset = 0;
-       ITERATE_RDEV_ORDERED(mddev,rdev,j) {
-               dev_info_t *disk = conf->disks + j;
-
-               disk->dev = rdev->dev;
-               disk->size = rdev->size;
-               disk->offset = curr_offset;
-
-               curr_offset += disk->size;
-
-               if (!conf->smallest || (disk->size < conf->smallest->size))
-                       conf->smallest = disk;
-       }
-
-       nb_zone = conf->nr_zones =
-               md_size[mdidx(mddev)] / conf->smallest->size +
-               ((md_size[mdidx(mddev)] % conf->smallest->size) ? 1 : 0);
+  int cur=0, i, size, dev0_size, nb_zone;
+  struct linear_data *data;
+
+  MOD_INC_USE_COUNT;
+
+  mddev->private=kmalloc (sizeof (struct linear_data), GFP_KERNEL);
+  data=(struct linear_data *) mddev->private;
+
+  /*
+     Find out the smallest device. This was previously done
+     at registry time, but since it violates modularity,
+     I moved it here... Any comment ? ;-)
+   */
+
+  data->smallest=mddev->devices;
+  for (i=1; i<mddev->nb_dev; i++)
+    if (data->smallest->size > mddev->devices[i].size)
+      data->smallest=mddev->devices+i;
   
-       conf->hash_table = kmalloc (sizeof (struct linear_hash) * nb_zone,
-                                       GFP_KERNEL);
-       if (!conf->hash_table)
-               goto out;
-
-       /*
-        * Here we generate the linear hash table
-        */
-       table = conf->hash_table;
-       i = 0;
-       size = 0;
-       for (j = 0; j < mddev->nb_dev; j++) {
-               dev_info_t *disk = conf->disks + j;
-
-               if (size < 0) {
-                       table->dev1 = disk;
-                       table++;
-               }
-               size += disk->size;
-
-               while (size) {
-                       table->dev0 = disk;
-                       size -= conf->smallest->size;
-                       if (size < 0)
-                               break;
-                       table->dev1 = NULL;
-                       table++;
-               }
-       }
-       table->dev1 = NULL;
-
-       return 0;
-
-out:
-       if (conf)
-               kfree(conf);
-       MOD_DEC_USE_COUNT;
-       return 1;
+  nb_zone=data->nr_zones=
+    md_size[minor]/data->smallest->size +
+    (md_size[minor]%data->smallest->size ? 1 : 0);
+  
+  data->hash_table=kmalloc (sizeof (struct linear_hash)*nb_zone, GFP_KERNEL);
+
+  size=mddev->devices[cur].size;
+
+  i=0;
+  while (cur<mddev->nb_dev)
+  {
+    data->hash_table[i].dev0=mddev->devices+cur;
+
+    if (size>=data->smallest->size) /* If we completely fill the slot */
+    {
+      data->hash_table[i++].dev1=NULL;
+      size-=data->smallest->size;
+
+      if (!size)
+      {
+       if (++cur==mddev->nb_dev) continue;
+       size=mddev->devices[cur].size;
+      }
+
+      continue;
+    }
+
+    if (++cur==mddev->nb_dev) /* Last dev, set dev1 as NULL */
+    {
+      data->hash_table[i].dev1=NULL;
+      continue;
+    }
+
+    dev0_size=size;            /* Here, we use a 2nd dev to fill the slot */
+    size=mddev->devices[cur].size;
+    data->hash_table[i++].dev1=mddev->devices+cur;
+    size-=(data->smallest->size - dev0_size);
+  }
+
+  return 0;
 }
 
-static int linear_stop (mddev_t *mddev)
+static int linear_stop (int minor, struct md_dev *mddev)
 {
-       linear_conf_t *conf = mddev_to_conf(mddev);
+  struct linear_data *data=(struct linear_data *) mddev->private;
   
-       kfree(conf->hash_table);
-       kfree(conf);
+  kfree (data->hash_table);
+  kfree (data);
 
-       MOD_DEC_USE_COUNT;
+  MOD_DEC_USE_COUNT;
 
-       return 0;
+  return 0;
 }
 
 
-static int linear_map (mddev_t *mddev, kdev_t dev, kdev_t *rdev,
+static int linear_map (struct md_dev *mddev, kdev_t *rdev,
                       unsigned long *rsector, unsigned long size)
 {
-       linear_conf_t *conf = mddev_to_conf(mddev);
-       struct linear_hash *hash;
-       dev_info_t *tmp_dev;
-       long block;
+  struct linear_data *data=(struct linear_data *) mddev->private;
+  struct linear_hash *hash;
+  struct real_dev *tmp_dev;
+  long block;
 
-       block = *rsector >> 1;
-       hash = conf->hash_table + (block / conf->smallest->size);
+  block=*rsector >> 1;
+  hash=data->hash_table+(block/data->smallest->size);
   
-       if (block >= (hash->dev0->size + hash->dev0->offset))
-       {
-               if (!hash->dev1)
-               {
-                       printk ("linear_map : hash->dev1==NULL for block %ld\n",
-                                               block);
-                       return -1;
-               }
-               tmp_dev = hash->dev1;
-       } else
-               tmp_dev = hash->dev0;
+  if (block >= (hash->dev0->size + hash->dev0->offset))
+  {
+    if (!hash->dev1)
+    {
+      printk ("linear_map : hash->dev1==NULL for block %ld\n", block);
+      return (-1);
+    }
+    
+    tmp_dev=hash->dev1;
+  }
+  else
+    tmp_dev=hash->dev0;
     
-       if (block >= (tmp_dev->size + tmp_dev->offset)
-                               || block < tmp_dev->offset)
-       printk ("Block %ld out of bounds on dev %s size %d offset %d\n",
-               block, kdevname(tmp_dev->dev), tmp_dev->size, tmp_dev->offset);
+  if (block >= (tmp_dev->size + tmp_dev->offset) || block < tmp_dev->offset)
+    printk ("Block %ld out of bounds on dev %s size %d offset %d\n",
+           block, kdevname(tmp_dev->dev), tmp_dev->size, tmp_dev->offset);
   
-       *rdev = tmp_dev->dev;
-       *rsector = (block - tmp_dev->offset) << 1;
+  *rdev=tmp_dev->dev;
+  *rsector=(block-(tmp_dev->offset)) << 1;
 
-       return 0;
+  return (0);
 }
 
-static int linear_status (char *page, mddev_t *mddev)
+static int linear_status (char *page, int minor, struct md_dev *mddev)
 {
-       int sz=0;
+  int sz=0;
 
 #undef MD_DEBUG
 #ifdef MD_DEBUG
-       int j;
-       linear_conf_t *conf = mddev_to_conf(mddev);
+  int j;
+  struct linear_data *data=(struct linear_data *) mddev->private;
   
-       sz += sprintf(page+sz, "      ");
-       for (j = 0; j < conf->nr_zones; j++)
-       {
-               sz += sprintf(page+sz, "[%s",
-                       partition_name(conf->hash_table[j].dev0->dev));
-
-               if (conf->hash_table[j].dev1)
-                       sz += sprintf(page+sz, "/%s] ",
-                         partition_name(conf->hash_table[j].dev1->dev));
-               else
-                       sz += sprintf(page+sz, "] ");
-       }
-       sz += sprintf(page+sz, "\n");
+  sz+=sprintf (page+sz, "      ");
+  for (j=0; j<data->nr_zones; j++)
+  {
+    sz+=sprintf (page+sz, "[%s",
+                partition_name (data->hash_table[j].dev0->dev));
+
+    if (data->hash_table[j].dev1)
+      sz+=sprintf (page+sz, "/%s] ",
+                  partition_name(data->hash_table[j].dev1->dev));
+    else
+      sz+=sprintf (page+sz, "] ");
+  }
+
+  sz+=sprintf (page+sz, "\n");
 #endif
-       sz += sprintf(page+sz, " %dk rounding", mddev->param.chunk_size/1024);
-       return sz;
+  sz+=sprintf (page+sz, " %dk rounding", 1<<FACTOR_SHIFT(FACTOR(mddev)));
+  return sz;
 }
 
 
-static mdk_personality_t linear_personality=
+static struct md_personality linear_personality=
 {
-       "linear",
-       linear_map,
-       NULL,
-       NULL,
-       linear_run,
-       linear_stop,
-       linear_status,
-       NULL,
-       0,
-       NULL,
-       NULL,
-       NULL,
-       NULL
+  "linear",
+  linear_map,
+  NULL,
+  NULL,
+  linear_run,
+  linear_stop,
+  linear_status,
+  NULL,                                /* no ioctls */
+  0
 };
 
+
 #ifndef MODULE
 
-md__initfunc(void linear_init (void))
+__initfunc(void linear_init (void))
 {
-       register_md_personality (LINEAR, &linear_personality);
+  register_md_personality (LINEAR, &linear_personality);
 }
 
 #else
 
 int init_module (void)
 {
-       return (register_md_personality (LINEAR, &linear_personality));
+  return (register_md_personality (LINEAR, &linear_personality));
 }
 
 void cleanup_module (void)
 {
-       unregister_md_personality (LINEAR);
+  unregister_md_personality (LINEAR);
 }
 
 #endif
-
diff --git a/drivers/block/linear.h b/drivers/block/linear.h
new file mode 100644 (file)
index 0000000..1146d83
--- /dev/null
@@ -0,0 +1,16 @@
+#ifndef _LINEAR_H
+#define _LINEAR_H
+
+struct linear_hash
+{
+  struct real_dev *dev0, *dev1;
+};
+
+struct linear_data
+{
+  struct linear_hash *hash_table; /* Dynamically allocated */
+  struct real_dev *smallest;
+  int nr_zones;
+};
+
+#endif
index 9e8d8441347134e9f5a44762f924d2675f7a8a6a..f07c528e3b983d447db384f229a6a23834bc25c8 100644 (file)
@@ -21,7 +21,6 @@
 #include <asm/system.h>
 #include <asm/io.h>
 #include <linux/blk.h>
-#include <linux/raid/md.h>
 
 #include <linux/module.h>
 
@@ -51,11 +50,6 @@ DECLARE_TASK_QUEUE(tq_disk);
  */
 spinlock_t io_request_lock = SPIN_LOCK_UNLOCKED;
 
-/*
- * per-major idle-IO detection
- */
-unsigned long io_events[MAX_BLKDEV] = {0, };
-
 /*
  * used to wait on when there are no free requests
  */
@@ -430,8 +424,6 @@ void make_request(int major, int rw, struct buffer_head * bh)
        /* Maybe the above fixes it, and maybe it doesn't boot. Life is interesting */
 
        lock_buffer(bh);
-       if (!buffer_lowprio(bh))
-               io_events[major]++;
 
        if (blk_size[major]) {
                unsigned long maxsector = (blk_size[major][MINOR(bh->b_rdev)] << 1) + 1;
@@ -530,7 +522,7 @@ void make_request(int major, int rw, struct buffer_head * bh)
                 * entry may be busy being processed and we thus can't change it.
                 */
                if (req == blk_dev[major].current_request)
-                       req = req->next;
+                       req = req->next;
                if (!req)
                        break;
                /* fall through */
@@ -689,12 +681,11 @@ void ll_rw_block(int rw, int nr, struct buffer_head * bh[])
                bh[i]->b_rsector=bh[i]->b_blocknr*(bh[i]->b_size >> 9);
 #ifdef CONFIG_BLK_DEV_MD
                if (major==MD_MAJOR &&
-               /* changed v to allow LVM to remap */
-                       md_map (bh[i]->b_rdev, &bh[i]->b_rdev,
-                               &bh[i]->b_rsector, bh[i]->b_size >> 9)) {
-                       printk (KERN_ERR
+                   md_map (MINOR(bh[i]->b_dev), &bh[i]->b_rdev,
+                           &bh[i]->b_rsector, bh[i]->b_size >> 9)) {
+                       printk (KERN_ERR
                                "Bad md_map in ll_rw_block\n");
-                       goto sorry;
+                       goto sorry;
                }
 #endif
        }
@@ -709,10 +700,8 @@ void ll_rw_block(int rw, int nr, struct buffer_head * bh[])
                if (bh[i]) {
                        set_bit(BH_Req, &bh[i]->b_state);
 #ifdef CONFIG_BLK_DEV_MD
-                       /* changed       v  to allow LVM to remap */
-                       if (MAJOR(bh[i]->b_rdev) == MD_MAJOR) {
-                               /* changed for LVM to remap     v */
-                               md_make_request(bh[i], rw);
+                       if (MAJOR(bh[i]->b_dev) == MD_MAJOR) {
+                               md_make_request(MINOR (bh[i]->b_dev), rw, bh[i]);
                                continue;
                        }
 #endif
@@ -795,10 +784,10 @@ __initfunc(int blk_dev_init(void))
 
        for (dev = blk_dev + MAX_BLKDEV; dev-- != blk_dev;) {
                dev->request_fn      = NULL;
-               dev->queue         = NULL;
+               dev->queue           = NULL;
                dev->current_request = NULL;
                dev->plug.rq_status  = RQ_INACTIVE;
-               dev->plug.cmd   = -1;
+               dev->plug.cmd        = -1;
                dev->plug.next       = NULL;
                dev->plug_tq.sync    = 0;
                dev->plug_tq.routine = &unplug_device;
@@ -879,7 +868,7 @@ __initfunc(int blk_dev_init(void))
        sbpcd_init();
 #endif CONFIG_SBPCD
 #ifdef CONFIG_AZTCD
-       aztcd_init();
+        aztcd_init();
 #endif CONFIG_AZTCD
 #ifdef CONFIG_CDU535
        sony535_init();
index ca11dd495ee5aefc0fe54eab095e441e7a048238..77090708e6d13bad93ffd1af16e68e422064047b 100644 (file)
@@ -1,17 +1,21 @@
+
 /*
    md.c : Multiple Devices driver for Linux
-          Copyright (C) 1998 Ingo Molnar
+          Copyright (C) 1994-96 Marc ZYNGIER
+         <zyngier@ufr-info-p7.ibp.fr> or
+         <maz@gloups.fdn.fr>
 
-     completely rewritten, based on the MD driver code from Marc Zyngier
+   A lot of inspiration came from hd.c ...
 
-   Changes:
+   kerneld support by Boris Tobotras <boris@xtalk.msk.su>
+   boot support for linear and striped mode by Harald Hoyer <HarryH@Royal.Net>
 
-   - RAID-1/RAID-5 extensions by Miguel de Icaza, Gadi Oxman, Ingo Molnar
-   - boot support for linear and striped mode by Harald Hoyer <HarryH@Royal.Net>
-   - kerneld support by Boris Tobotras <boris@xtalk.msk.su>
-   - kmod support by: Cyrus Durgin
-   - RAID0 bugfixes: Mark Anthony Lisher <markal@iname.com>
+   RAID-1/RAID-5 extensions by:
+        Ingo Molnar, Miguel de Icaza, Gadi Oxman
 
+   Changes for kmod by:
+       Cyrus Durgin
+   
    This program is free software; you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation; either version 2, or (at your option)
    Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.  
 */
 
-#include <linux/config.h>
-#include <linux/raid/md.h>
-#include <linux/raid/xor.h>
+/*
+ * Current RAID-1,4,5 parallel reconstruction speed limit is 1024 KB/sec, so
+ * the extra system load does not show up that much. Increase it if your
+ * system can take more.
+ */
+#define SPEED_LIMIT 1024
 
+#include <linux/config.h>
+#include <linux/module.h>
+#include <linux/version.h>
+#include <linux/malloc.h>
+#include <linux/mm.h>
+#include <linux/md.h>
+#include <linux/hdreg.h>
+#include <linux/stat.h>
+#include <linux/fs.h>
+#include <linux/proc_fs.h>
+#include <linux/blkdev.h>
+#include <linux/genhd.h>
+#include <linux/smp_lock.h>
 #ifdef CONFIG_KMOD
 #include <linux/kmod.h>
 #endif
+#include <linux/errno.h>
+#include <linux/init.h>
 
 #define __KERNEL_SYSCALLS__
 #include <linux/unistd.h>
 
-#include <asm/unaligned.h>
-
-extern asmlinkage int sys_sched_yield(void);
-extern asmlinkage int sys_setsid(void);
-
-extern unsigned long io_events[MAX_BLKDEV];
-
 #define MAJOR_NR MD_MAJOR
 #define MD_DRIVER
 
 #include <linux/blk.h>
+#include <asm/uaccess.h>
+#include <asm/bitops.h>
+#include <asm/atomic.h>
 
 #ifdef CONFIG_MD_BOOT
-extern kdev_t name_to_kdev_t(char *line) md__init;
+extern kdev_t name_to_kdev_t(char *line) __init;
 #endif
 
-static mdk_personality_t *pers[MAX_PERSONALITY] = {NULL, };
-
-/*
- * these have to be allocated separately because external
- * subsystems want to have a pre-defined structure
- */
-struct hd_struct md_hd_struct[MAX_MD_DEVS];
-static int md_blocksizes[MAX_MD_DEVS];
-static int md_maxreadahead[MAX_MD_DEVS];
-static mdk_thread_t *md_recovery_thread = NULL;
+static struct hd_struct md_hd_struct[MAX_MD_DEV];
+static int md_blocksizes[MAX_MD_DEV];
+int md_maxreadahead[MAX_MD_DEV];
+#if SUPPORT_RECONSTRUCTION
+static struct md_thread *md_sync_thread = NULL;
+#endif /* SUPPORT_RECONSTRUCTION */
 
-int md_size[MAX_MD_DEVS] = {0, };
+int md_size[MAX_MD_DEV]={0, };
 
 static void md_geninit (struct gendisk *);
 
 static struct gendisk md_gendisk=
 {
-       MD_MAJOR,
-       "md",
-       0,
-       1,
-       MAX_MD_DEVS,
-       md_geninit,
-       md_hd_struct,
-       md_size,
-       MAX_MD_DEVS,
-       NULL,
-       NULL
+  MD_MAJOR,
+  "md",
+  0,
+  1,
+  MAX_MD_DEV,
+  md_geninit,
+  md_hd_struct,
+  md_size,
+  MAX_MD_DEV,
+  NULL,
+  NULL
 };
 
-/*
- * Current RAID-1,4,5 parallel reconstruction 'guaranteed speed limit'
- * is 100 KB/sec, so the extra system load does not show up that much.
- * Increase it if you want to have more _guaranteed_ speed. Note that
- * the RAID driver will use the maximum available bandwith if the IO
- * subsystem is idle.
- *
- * you can change it via /proc/sys/dev/speed-limit
- */
-
-static int sysctl_speed_limit = 100;
+static struct md_personality *pers[MAX_PERSONALITY]={NULL, };
+struct md_dev md_dev[MAX_MD_DEV];
 
-static struct ctl_table_header *md_table_header;
+int md_thread(void * arg);
 
-static ctl_table md_table[] = {
-       {DEV_MD_SPEED_LIMIT, "speed-limit",
-        &sysctl_speed_limit, sizeof(int), 0644, NULL, &proc_dointvec},
-       {0}
-};
-
-static ctl_table md_dir_table[] = {
-        {DEV_MD, "md", NULL, 0, 0555, md_table},
-        {0}
-};
+static struct gendisk *find_gendisk (kdev_t dev)
+{
+  struct gendisk *tmp=gendisk_head;
 
-static ctl_table md_root_table[] = {
-        {CTL_DEV, "dev", NULL, 0, 0555, md_dir_table},
-        {0}
-};
+  while (tmp != NULL)
+  {
+    if (tmp->major==MAJOR(dev))
+      return (tmp);
+    
+    tmp=tmp->next;
+  }
 
-static void md_register_sysctl(void)
-{
-        md_table_header = register_sysctl_table(md_root_table, 1);
+  return (NULL);
 }
 
-void md_unregister_sysctl(void)
+char *partition_name (kdev_t dev)
 {
-        unregister_sysctl_table(md_table_header);
-}
+  static char name[40];                /* This should be long
+                                  enough for a device name ! */
+  struct gendisk *hd = find_gendisk (dev);
 
-/*
- * The mapping between kdev and mddev is not necessary a simple
- * one! Eg. HSM uses several sub-devices to implement Logical
- * Volumes. All these sub-devices map to the same mddev.
- */
-dev_mapping_t mddev_map [MAX_MD_DEVS] = { {NULL, 0}, };
+  if (!hd)
+  {
+    sprintf (name, "[dev %s]", kdevname(dev));
+    return (name);
+  }
+
+  return disk_name (hd, MINOR(dev), name);  /* routine in genhd.c */
+}
 
-void add_mddev_mapping (mddev_t * mddev, kdev_t dev, void *data)
+static int legacy_raid_sb (int minor, int pnum)
 {
-       unsigned int minor = MINOR(dev);
+       int i, factor;
 
-       if (MAJOR(dev) != MD_MAJOR) {
-               MD_BUG();
-               return;
-       }
-       if (mddev_map[minor].mddev != NULL) {
-               MD_BUG();
-               return;
-       }
-       mddev_map[minor].mddev = mddev;
-       mddev_map[minor].data = data;
+       factor = 1 << FACTOR_SHIFT(FACTOR((md_dev+minor)));
+
+       /*****
+        * do size and offset calculations.
+        */
+       for (i=0; i<md_dev[minor].nb_dev; i++) {
+               md_dev[minor].devices[i].size &= ~(factor - 1);
+               md_size[minor] += md_dev[minor].devices[i].size;
+               md_dev[minor].devices[i].offset=i ? (md_dev[minor].devices[i-1].offset + 
+                                                       md_dev[minor].devices[i-1].size) : 0;
+       }
+       if (pnum == RAID0 >> PERSONALITY_SHIFT)
+               md_maxreadahead[minor] = MD_DEFAULT_DISK_READAHEAD * md_dev[minor].nb_dev;
+       return 0;
 }
 
-void del_mddev_mapping (mddev_t * mddev, kdev_t dev)
+static void free_sb (struct md_dev *mddev)
 {
-       unsigned int minor = MINOR(dev);
+       int i;
+       struct real_dev *realdev;
 
-       if (MAJOR(dev) != MD_MAJOR) {
-               MD_BUG();
-               return;
+       if (mddev->sb) {
+               free_page((unsigned long) mddev->sb);
+               mddev->sb = NULL;
        }
-       if (mddev_map[minor].mddev != mddev) {
-               MD_BUG();
-               return;
+       for (i = 0; i <mddev->nb_dev; i++) {
+               realdev = mddev->devices + i;
+               if (realdev->sb) {
+                       free_page((unsigned long) realdev->sb);
+                       realdev->sb = NULL;
+               }
        }
-       mddev_map[minor].mddev = NULL;
-       mddev_map[minor].data = NULL;
 }
 
 /*
- * Enables to iterate over all existing md arrays
+ * Check one RAID superblock for generic plausibility
  */
-static MD_LIST_HEAD(all_mddevs);
 
-static mddev_t * alloc_mddev (kdev_t dev)
-{
-       mddev_t * mddev;
+#define BAD_MAGIC KERN_ERR \
+"md: %s: invalid raid superblock magic (%x) on block %u\n"
 
-       if (MAJOR(dev) != MD_MAJOR) {
-               MD_BUG();
-               return 0;
-       }
-       mddev = (mddev_t *) kmalloc(sizeof(*mddev), GFP_KERNEL);
-       if (!mddev)
-               return NULL;
-               
-       memset(mddev, 0, sizeof(*mddev));
-
-       mddev->__minor = MINOR(dev);
-       mddev->reconfig_sem = MUTEX;
-       mddev->recovery_sem = MUTEX;
-       mddev->resync_sem = MUTEX;
-       MD_INIT_LIST_HEAD(&mddev->disks);
-       /*
-        * The 'base' mddev is the one with data NULL.
-        * personalities can create additional mddevs 
-        * if necessary.
-        */
-       add_mddev_mapping(mddev, dev, 0);
-       md_list_add(&mddev->all_mddevs, &all_mddevs);
+#define OUT_OF_MEM KERN_ALERT \
+"md: out of memory.\n"
 
-       return mddev;
-}
+#define NO_DEVICE KERN_ERR \
+"md: disabled device %s\n"
+
+#define SUCCESS 0
+#define FAILURE -1
 
-static void free_mddev (mddev_t *mddev)
+static int analyze_one_sb (struct real_dev * rdev)
 {
-       if (!mddev) {
-               MD_BUG();
-               return;
-       }
+       int ret = FAILURE;
+       struct buffer_head *bh;
+       kdev_t dev = rdev->dev;
+       md_superblock_t *sb;
 
        /*
-        * Make sure nobody else is using this mddev
-        * (careful, we rely on the global kernel lock here)
+        * Read the superblock, it's at the end of the disk
         */
-       while (md_atomic_read(&mddev->resync_sem.count) != 1)
-               schedule();
-       while (md_atomic_read(&mddev->recovery_sem.count) != 1)
-               schedule();
-
-       del_mddev_mapping(mddev, MKDEV(MD_MAJOR, mdidx(mddev)));
-       md_list_del(&mddev->all_mddevs);
-       MD_INIT_LIST_HEAD(&mddev->all_mddevs);
-       kfree(mddev);
-}
-
-
-struct gendisk * find_gendisk (kdev_t dev)
-{
-       struct gendisk *tmp = gendisk_head;
-
-       while (tmp != NULL) {
-               if (tmp->major == MAJOR(dev))
-                       return (tmp);
-               tmp = tmp->next;
-       }
-       return (NULL);
-}
-
-mdk_rdev_t * find_rdev_nr(mddev_t *mddev, int nr)
-{
-       mdk_rdev_t * rdev;
-       struct md_list_head *tmp;
-
-       ITERATE_RDEV(mddev,rdev,tmp) {
-               if (rdev->desc_nr == nr)
-                       return rdev;
-       }
-       return NULL;
-}
+       rdev->sb_offset = MD_NEW_SIZE_BLOCKS (blk_size[MAJOR(dev)][MINOR(dev)]);
+       set_blocksize (dev, MD_SB_BYTES);
+       bh = bread (dev, rdev->sb_offset / MD_SB_BLOCKS, MD_SB_BYTES);
 
-mdk_rdev_t * find_rdev(mddev_t * mddev, kdev_t dev)
-{
-       struct md_list_head *tmp;
-       mdk_rdev_t *rdev;
+       if (bh) {
+               sb = (md_superblock_t *) bh->b_data;
+               if (sb->md_magic != MD_SB_MAGIC) {
+                       printk (BAD_MAGIC, kdevname(dev),
+                                        sb->md_magic, rdev->sb_offset);
+                       goto abort;
+               }
+               rdev->sb = (md_superblock_t *) __get_free_page(GFP_KERNEL);
+               if (!rdev->sb) {
+                       printk (OUT_OF_MEM);
+                       goto abort;
+               }
+               memcpy (rdev->sb, bh->b_data, MD_SB_BYTES);
 
-       ITERATE_RDEV(mddev,rdev,tmp) {
-               if (rdev->dev == dev)
-                       return rdev;
-       }
-       return NULL;
+               rdev->size = sb->size;
+       } else
+               printk (NO_DEVICE,kdevname(rdev->dev));
+       ret = SUCCESS;
+abort:
+       if (bh)
+               brelse (bh);
+       return ret;
 }
 
-static MD_LIST_HEAD(device_names);
+#undef SUCCESS
+#undef FAILURE
 
-char * partition_name (kdev_t dev)
-{
-       struct gendisk *hd;
-       static char nomem [] = "<nomem>";
-       dev_name_t *dname;
-       struct md_list_head *tmp = device_names.next;
-
-       while (tmp != &device_names) {
-               dname = md_list_entry(tmp, dev_name_t, list);
-               if (dname->dev == dev)
-                       return dname->name;
-               tmp = tmp->next;
-       }
+#undef BAD_MAGIC
+#undef OUT_OF_MEM
+#undef NO_DEVICE
 
-       dname = (dev_name_t *) kmalloc(sizeof(*dname), GFP_KERNEL);
+/*
+ * Check a full RAID array for plausibility
+ */
 
-       if (!dname)
-               return nomem;
-       /*
-        * ok, add this new device name to the list
-        */
-       hd = find_gendisk (dev);
+#define INCONSISTENT KERN_ERR \
+"md: superblock inconsistency -- run ckraid\n"
 
-       if (!hd)
-               sprintf (dname->name, "[dev %s]", kdevname(dev));
-       else
-               disk_name (hd, MINOR(dev), dname->name);
+#define OUT_OF_DATE KERN_ERR \
+"md: superblock update time inconsistenty -- using the most recent one\n"
 
-       dname->dev = dev;
-       md_list_add(&dname->list, &device_names);
+#define OLD_VERSION KERN_ALERT \
+"md: %s: unsupported raid array version %d.%d.%d\n"
 
-       return dname->name;
-}
+#define NOT_CLEAN KERN_ERR \
+"md: %s: raid array is not clean -- run ckraid\n"
 
-static unsigned int calc_dev_sboffset (kdev_t dev, mddev_t *mddev,
-                                               int persistent)
-{
-       unsigned int size = 0;
+#define NOT_CLEAN_IGNORE KERN_ERR \
+"md: %s: raid array is not clean -- reconstructing parity\n"
 
-       if (blk_size[MAJOR(dev)])
-               size = blk_size[MAJOR(dev)][MINOR(dev)];
-       if (persistent)
-               size = MD_NEW_SIZE_BLOCKS(size);
-       return size;
-}
+#define UNKNOWN_LEVEL KERN_ERR \
+"md: %s: unsupported raid level %d\n"
 
-static unsigned int calc_dev_size (kdev_t dev, mddev_t *mddev, int persistent)
+static int analyze_sbs (int minor, int pnum)
 {
-       unsigned int size;
+       struct md_dev *mddev = md_dev + minor;
+       int i, N = mddev->nb_dev, out_of_date = 0;
+       struct real_dev * disks = mddev->devices;
+       md_superblock_t *sb, *freshest = NULL;
 
-       size = calc_dev_sboffset(dev, mddev, persistent);
-       if (!mddev->sb) {
-               MD_BUG();
-               return size;
-       }
-       if (mddev->sb->chunk_size)
-               size &= ~(mddev->sb->chunk_size/1024 - 1);
-       return size;
-}
+       /*
+        * RAID-0 and linear don't use a RAID superblock
+        */
+       if (pnum == RAID0 >> PERSONALITY_SHIFT ||
+               pnum == LINEAR >> PERSONALITY_SHIFT)
+                       return legacy_raid_sb (minor, pnum);
 
-/*
- * We check wether all devices are numbered from 0 to nb_dev-1. The
- * order is guaranteed even after device name changes.
- *
- * Some personalities (raid0, linear) use this. Personalities that
- * provide data have to be able to deal with loss of individual
- * disks, so they do their checking themselves.
- */
-int md_check_ordering (mddev_t *mddev)
-{
-       int i, c;
-       mdk_rdev_t *rdev;
-       struct md_list_head *tmp;
+       /*
+        * Verify the RAID superblock on each real device
+        */
+       for (i = 0; i < N; i++)
+               if (analyze_one_sb(disks+i))
+                       goto abort;
 
        /*
-        * First, all devices must be fully functional
+        * The superblock constant part has to be the same
+        * for all disks in the array.
         */
-       ITERATE_RDEV(mddev,rdev,tmp) {
-               if (rdev->faulty) {
-                       printk("md: md%d's device %s faulty, aborting.\n",
-                               mdidx(mddev), partition_name(rdev->dev));
+       sb = NULL;
+       for (i = 0; i < N; i++) {
+               if (!disks[i].sb)
+                       continue;
+               if (!sb) {
+                       sb = disks[i].sb;
+                       continue;
+               }
+               if (memcmp(sb,
+                          disks[i].sb, MD_SB_GENERIC_CONSTANT_WORDS * 4)) {
+                       printk (INCONSISTENT);
                        goto abort;
                }
        }
 
-       c = 0;
-       ITERATE_RDEV(mddev,rdev,tmp) {
-               c++;
-       }
-       if (c != mddev->nb_dev) {
-               MD_BUG();
+       /*
+        * OK, we have all disks and the array is ready to run. Let's
+        * find the freshest superblock, that one will be the superblock
+        * that represents the whole array.
+        */
+       if ((sb = mddev->sb = (md_superblock_t *) __get_free_page (GFP_KERNEL)) == NULL)
                goto abort;
+       freshest = NULL;
+       for (i = 0; i < N; i++) {
+               if (!disks[i].sb)
+                       continue;
+               if (!freshest) {
+                       freshest = disks[i].sb;
+                       continue;
+               }
+               /*
+                * Find the newest superblock version
+                */
+               if (disks[i].sb->utime != freshest->utime) {
+                       out_of_date = 1;
+                       if (disks[i].sb->utime > freshest->utime)
+                               freshest = disks[i].sb;
+               }
        }
-       if (mddev->nb_dev != mddev->sb->raid_disks) {
-               printk("md: md%d, array needs %d disks, has %d, aborting.\n",
-                       mdidx(mddev), mddev->sb->raid_disks, mddev->nb_dev);
+       if (out_of_date)
+               printk(OUT_OF_DATE);
+       memcpy (sb, freshest, sizeof(*freshest));
+
+       /*
+        * Check if we can support this RAID array
+        */
+       if (sb->major_version != MD_MAJOR_VERSION ||
+                       sb->minor_version > MD_MINOR_VERSION) {
+
+               printk (OLD_VERSION, kdevname(MKDEV(MD_MAJOR, minor)),
+                               sb->major_version, sb->minor_version,
+                               sb->patch_version);
                goto abort;
        }
+
        /*
-        * Now the numbering check
+        * We need to add this as a superblock option.
         */
-       for (i = 0; i < mddev->nb_dev; i++) {
-               c = 0;
-               ITERATE_RDEV(mddev,rdev,tmp) {
-                       if (rdev->desc_nr == i)
-                               c++;
-               }
-               if (c == 0) {
-                       printk("md: md%d, missing disk #%d, aborting.\n",
-                               mdidx(mddev), i);
+#if SUPPORT_RECONSTRUCTION
+       if (sb->state != (1 << MD_SB_CLEAN)) {
+               if (sb->level == 1) {
+                       printk (NOT_CLEAN, kdevname(MKDEV(MD_MAJOR, minor)));
                        goto abort;
-               }
-               if (c > 1) {
-                       printk("md: md%d, too many disks #%d, aborting.\n",
-                               mdidx(mddev), i);
+               } else
+                       printk (NOT_CLEAN_IGNORE, kdevname(MKDEV(MD_MAJOR, minor)));
+       }
+#else
+       if (sb->state != (1 << MD_SB_CLEAN)) {
+               printk (NOT_CLEAN, kdevname(MKDEV(MD_MAJOR, minor)));
+               goto abort;
+       }
+#endif /* SUPPORT_RECONSTRUCTION */
+
+       switch (sb->level) {
+               case 1:
+                       md_size[minor] = sb->size;
+                       md_maxreadahead[minor] = MD_DEFAULT_DISK_READAHEAD;
+                       break;
+               case 4:
+               case 5:
+                       md_size[minor] = sb->size * (sb->raid_disks - 1);
+                       md_maxreadahead[minor] = MD_DEFAULT_DISK_READAHEAD * (sb->raid_disks - 1);
+                       break;
+               default:
+                       printk (UNKNOWN_LEVEL, kdevname(MKDEV(MD_MAJOR, minor)),
+                                       sb->level);
                        goto abort;
-               }
        }
        return 0;
 abort:
+       free_sb(mddev);
        return 1;
 }
 
-static unsigned int zoned_raid_size (mddev_t *mddev)
+#undef INCONSISTENT
+#undef OUT_OF_DATE
+#undef OLD_VERSION
+#undef NOT_CLEAN
+#undef OLD_LEVEL
+
+int md_update_sb(int minor)
 {
-       unsigned int mask;
-       mdk_rdev_t * rdev;
-       struct md_list_head *tmp;
+       struct md_dev *mddev = md_dev + minor;
+       struct buffer_head *bh;
+       md_superblock_t *sb = mddev->sb;
+       struct real_dev *realdev;
+       kdev_t dev;
+       int i;
+       u32 sb_offset;
 
-       if (!mddev->sb) {
-               MD_BUG();
-               return -EINVAL;
-       }
-       /*
-        * do size and offset calculations.
-        */
-       mask = ~(mddev->sb->chunk_size/1024 - 1);
-printk("mask %08x\n", mask);
-
-       ITERATE_RDEV(mddev,rdev,tmp) {
-printk(" rdev->size: %d\n", rdev->size);
-               rdev->size &= mask;
-printk(" masked rdev->size: %d\n", rdev->size);
-               md_size[mdidx(mddev)] += rdev->size;
-printk("  new md_size: %d\n", md_size[mdidx(mddev)]);
+       sb->utime = CURRENT_TIME;
+       for (i = 0; i < mddev->nb_dev; i++) {
+               realdev = mddev->devices + i;
+               if (!realdev->sb)
+                       continue;
+               dev = realdev->dev;
+               sb_offset = realdev->sb_offset;
+               set_blocksize(dev, MD_SB_BYTES);
+               printk("md: updating raid superblock on device %s, sb_offset == %u\n", kdevname(dev), sb_offset);
+               bh = getblk(dev, sb_offset / MD_SB_BLOCKS, MD_SB_BYTES);
+               if (bh) {
+                       sb = (md_superblock_t *) bh->b_data;
+                       memcpy(sb, mddev->sb, MD_SB_BYTES);
+                       memcpy(&sb->descriptor, sb->disks + realdev->sb->descriptor.number, MD_SB_DESCRIPTOR_WORDS * 4);
+                       mark_buffer_uptodate(bh, 1);
+                       mark_buffer_dirty(bh, 1);
+                       ll_rw_block(WRITE, 1, &bh);
+                       wait_on_buffer(bh);
+                       bforget(bh);
+                       fsync_dev(dev);
+                       invalidate_buffers(dev);
+               } else
+                       printk(KERN_ERR "md: getblk failed for device %s\n", kdevname(dev));
        }
        return 0;
 }
 
-static void remove_descriptor (mdp_disk_t *disk, mdp_super_t *sb)
+static int do_md_run (int minor, int repart)
 {
-       if (disk_active(disk)) {
-               sb->working_disks--;
-       } else {
-               if (disk_spare(disk)) {
-                       sb->spare_disks--;
-                       sb->working_disks--;
-               } else  {
-                       sb->failed_disks--;
-               }
-       }
-       sb->nr_disks--;
-       disk->major = 0;
-       disk->minor = 0;
-       mark_disk_removed(disk);
-}
-
-#define BAD_MAGIC KERN_ERR \
-"md: invalid raid superblock magic on %s\n"
+  int pnum, i, min, factor, err;
 
-#define BAD_MINOR KERN_ERR \
-"md: %s: invalid raid minor (%x)\n"
+  if (!md_dev[minor].nb_dev)
+    return -EINVAL;
+  
+  if (md_dev[minor].pers)
+    return -EBUSY;
 
-#define OUT_OF_MEM KERN_ALERT \
-"md: out of memory.\n"
+  md_dev[minor].repartition=repart;
+  
+  if ((pnum=PERSONALITY(&md_dev[minor]) >> (PERSONALITY_SHIFT))
+      >= MAX_PERSONALITY)
+    return -EINVAL;
+
+  /* Only RAID-1 and RAID-5 can have MD devices as underlying devices */
+  if (pnum != (RAID1 >> PERSONALITY_SHIFT) && pnum != (RAID5 >> PERSONALITY_SHIFT)){
+         for (i = 0; i < md_dev [minor].nb_dev; i++)
+                 if (MAJOR (md_dev [minor].devices [i].dev) == MD_MAJOR)
+                         return -EINVAL;
+  }
+  if (!pers[pnum])
+  {
+#ifdef CONFIG_KMOD
+    char module_name[80];
+    sprintf (module_name, "md-personality-%d", pnum);
+    request_module (module_name);
+    if (!pers[pnum])
+#endif
+      return -EINVAL;
+  }
+  
+  factor = min = 1 << FACTOR_SHIFT(FACTOR((md_dev+minor)));
+  
+  for (i=0; i<md_dev[minor].nb_dev; i++)
+    if (md_dev[minor].devices[i].size<min)
+    {
+      printk ("Dev %s smaller than %dk, cannot shrink\n",
+             partition_name (md_dev[minor].devices[i].dev), min);
+      return -EINVAL;
+    }
+
+  for (i=0; i<md_dev[minor].nb_dev; i++) {
+    fsync_dev(md_dev[minor].devices[i].dev);
+    invalidate_buffers(md_dev[minor].devices[i].dev);
+  }
+  
+  /* Resize devices according to the factor. It is used to align
+     partitions size on a given chunk size. */
+  md_size[minor]=0;
 
-#define NO_SB KERN_ERR \
-"md: disabled device %s, could not read superblock.\n"
+  /*
+   * Analyze the raid superblock
+   */ 
+  if (analyze_sbs(minor, pnum))
+    return -EINVAL;
 
-#define BAD_CSUM KERN_WARNING \
-"md: invalid superblock checksum on %s\n"
+  md_dev[minor].pers=pers[pnum];
+  
+  if ((err=md_dev[minor].pers->run (minor, md_dev+minor)))
+  {
+    md_dev[minor].pers=NULL;
+    free_sb(md_dev + minor);
+    return (err);
+  }
+
+  if (pnum != RAID0 >> PERSONALITY_SHIFT && pnum != LINEAR >> PERSONALITY_SHIFT)
+  {
+    md_dev[minor].sb->state &= ~(1 << MD_SB_CLEAN);
+    md_update_sb(minor);
+  }
+
+  /* FIXME : We assume here we have blocks
+     that are twice as large as sectors.
+     THIS MAY NOT BE TRUE !!! */
+  md_hd_struct[minor].start_sect=0;
+  md_hd_struct[minor].nr_sects=md_size[minor]<<1;
+  
+  read_ahead[MD_MAJOR] = 128;
+  return (0);
+}
 
-static int alloc_array_sb (mddev_t * mddev)
+static int do_md_stop (int minor, struct inode *inode)
 {
-       if (mddev->sb) {
-               MD_BUG();
-               return 0;
+       int i;
+  
+       if (inode->i_count>1 || md_dev[minor].busy>1) {
+               /*
+                * ioctl : one open channel
+                */
+               printk ("STOP_MD md%x failed : i_count=%d, busy=%d\n",
+                               minor, inode->i_count, md_dev[minor].busy);
+               return -EBUSY;
        }
-
-       mddev->sb = (mdp_super_t *) __get_free_page (GFP_KERNEL);
-       if (!mddev->sb)
-               return -ENOMEM;
-       md_clear_page((unsigned long)mddev->sb);
-       return 0;
+  
+       if (md_dev[minor].pers) {
+               /*
+                * It is safe to call stop here, it only frees private
+                * data. Also, it tells us if a device is unstoppable
+                * (eg. resyncing is in progress)
+                */
+               if (md_dev[minor].pers->stop (minor, md_dev+minor))
+                       return -EBUSY;
+               /*
+                *  The device won't exist anymore -> flush it now
+                */
+               fsync_dev (inode->i_rdev);
+               invalidate_buffers (inode->i_rdev);
+               if (md_dev[minor].sb) {
+                       md_dev[minor].sb->state |= 1 << MD_SB_CLEAN;
+                       md_update_sb(minor);
+               }
+       }
+  
+       /* Remove locks. */
+       if (md_dev[minor].sb)
+       free_sb(md_dev + minor);
+       for (i=0; i<md_dev[minor].nb_dev; i++)
+               clear_inode (md_dev[minor].devices[i].inode);
+
+       md_dev[minor].nb_dev=md_size[minor]=0;
+       md_hd_struct[minor].nr_sects=0;
+       md_dev[minor].pers=NULL;
+  
+       read_ahead[MD_MAJOR] = 128;
+  
+       return (0);
 }
 
-static int alloc_disk_sb (mdk_rdev_t * rdev)
+static int do_md_add (int minor, kdev_t dev)
 {
-       if (rdev->sb)
-               MD_BUG();
+       int i;
+       int hot_add=0;
+       struct real_dev *realdev;
 
-       rdev->sb = (mdp_super_t *) __get_free_page(GFP_KERNEL);
-       if (!rdev->sb) {
-               printk (OUT_OF_MEM);
+       if (md_dev[minor].nb_dev==MAX_REAL)
                return -EINVAL;
-       }
-       md_clear_page((unsigned long)rdev->sb);
 
-       return 0;
-}
+       if (!fs_may_mount (dev))
+               return -EBUSY;
 
-static void free_disk_sb (mdk_rdev_t * rdev)
-{
-       if (rdev->sb) {
-               free_page((unsigned long) rdev->sb);
-               rdev->sb = NULL;
-               rdev->sb_offset = 0;
-               rdev->size = 0;
-       } else {
-               if (!rdev->faulty)
-                       MD_BUG();
+       if (blk_size[MAJOR(dev)] == NULL || blk_size[MAJOR(dev)][MINOR(dev)] == 0) {
+               printk("md_add(): zero device size, huh, bailing out.\n");
+               return -EINVAL;
        }
-}
-
-static void mark_rdev_faulty (mdk_rdev_t * rdev)
-{
-       unsigned long flags;
 
-       if (!rdev) {
-               MD_BUG();
-               return;
+       if (md_dev[minor].pers) {
+               /*
+                * The array is already running, hot-add the drive, or
+                * bail out:
+                */
+               if (!md_dev[minor].pers->hot_add_disk)
+                       return -EBUSY;
+               else
+                       hot_add=1;
        }
-       save_flags(flags);
-       cli();
-       free_disk_sb(rdev);
-       rdev->faulty = 1;
-       restore_flags(flags);
-}
-
-static int read_disk_sb (mdk_rdev_t * rdev)
-{
-       int ret = -EINVAL;
-       struct buffer_head *bh = NULL;
-       kdev_t dev = rdev->dev;
-       mdp_super_t *sb;
-       u32 sb_offset;
 
-       if (!rdev->sb) {
-               MD_BUG();
-               goto abort;
-       }       
-       
        /*
-        * Calculate the position of the superblock,
-        * it's at the end of the disk
+        * Careful. We cannot increase nb_dev for a running array.
         */
-       sb_offset = calc_dev_sboffset(rdev->dev, rdev->mddev, 1);
-       rdev->sb_offset = sb_offset;
-       printk("(read) %s's sb offset: %d", partition_name(dev),
-                                                        sb_offset);
-       fsync_dev(dev);
-       set_blocksize (dev, MD_SB_BYTES);
-       bh = bread (dev, sb_offset / MD_SB_BLOCKS, MD_SB_BYTES);
+       i=md_dev[minor].nb_dev;
+       realdev = &md_dev[minor].devices[i];
+       realdev->dev=dev;
+  
+       /* Lock the device by inserting a dummy inode. This doesn't
+          smell very good, but I need to be consistent with the
+          mount stuff, specially with fs_may_mount. If someone have
+          a better idea, please help ! */
+  
+       realdev->inode=get_empty_inode ();
+       realdev->inode->i_dev=dev;      /* don't care about other fields */
+       insert_inode_hash (realdev->inode);
+  
+       /* Sizes are now rounded at run time */
+  
+/*  md_dev[minor].devices[i].size=gen_real->sizes[MINOR(dev)]; HACKHACK*/
 
-       if (bh) {
-               sb = (mdp_super_t *) bh->b_data;
-               memcpy (rdev->sb, sb, MD_SB_BYTES);
-       } else {
-               printk (NO_SB,partition_name(rdev->dev));
-               goto abort;
-       }
-       printk(" [events: %08lx]\n", (unsigned long)get_unaligned(&rdev->sb->events));
-       ret = 0;
-abort:
-       if (bh)
-               brelse (bh);
-       return ret;
-}
-
-static unsigned int calc_sb_csum (mdp_super_t * sb)
-{
-       unsigned int disk_csum, csum;
-
-       disk_csum = sb->sb_csum;
-       sb->sb_csum = 0;
-       csum = csum_partial((void *)sb, MD_SB_BYTES, 0);
-       sb->sb_csum = disk_csum;
-       return csum;
-}
-
-/*
- * Check one RAID superblock for generic plausibility
- */
-
-static int check_disk_sb (mdk_rdev_t * rdev)
-{
-       mdp_super_t *sb;
-       int ret = -EINVAL;
-
-       sb = rdev->sb;
-       if (!sb) {
-               MD_BUG();
-               goto abort;
-       }
-
-       if (sb->md_magic != MD_SB_MAGIC) {
-               printk (BAD_MAGIC, partition_name(rdev->dev));
-               goto abort;
-       }
-
-       if (sb->md_minor >= MAX_MD_DEVS) {
-               printk (BAD_MINOR, partition_name(rdev->dev),
-                                                       sb->md_minor);
-               goto abort;
-       }
-
-       if (calc_sb_csum(sb) != sb->sb_csum)
-               printk(BAD_CSUM, partition_name(rdev->dev));
-       ret = 0;
-abort:
-       return ret;
-}
-
-static kdev_t dev_unit(kdev_t dev)
-{
-       unsigned int mask;
-       struct gendisk *hd = find_gendisk(dev);
-
-       if (!hd)
-               return 0;
-       mask = ~((1 << hd->minor_shift) - 1);
-
-       return MKDEV(MAJOR(dev), MINOR(dev) & mask);
-}
-
-static mdk_rdev_t * match_dev_unit(mddev_t *mddev, kdev_t dev)
-{
-       struct md_list_head *tmp;
-       mdk_rdev_t *rdev;
-
-       ITERATE_RDEV(mddev,rdev,tmp)
-               if (dev_unit(rdev->dev) == dev_unit(dev))
-                       return rdev;
-
-       return NULL;
-}
-
-static int match_mddev_units(mddev_t *mddev1, mddev_t *mddev2)
-{
-       struct md_list_head *tmp;
-       mdk_rdev_t *rdev;
-
-       ITERATE_RDEV(mddev1,rdev,tmp)
-               if (match_dev_unit(mddev2, rdev->dev))
-                       return 1;
-
-       return 0;
-}
-
-static MD_LIST_HEAD(all_raid_disks);
-static MD_LIST_HEAD(pending_raid_disks);
-
-static void bind_rdev_to_array (mdk_rdev_t * rdev, mddev_t * mddev)
-{
-       mdk_rdev_t *same_pdev;
-
-       if (rdev->mddev) {
-               MD_BUG();
-               return;
-       }
-       same_pdev = match_dev_unit(mddev, rdev->dev);
-       if (same_pdev)
-               printk( KERN_WARNING
-"md%d: WARNING: %s appears to be on the same physical disk as %s. True\n"
-"     protection against single-disk failure might be compromised.\n",
-                       mdidx(mddev), partition_name(rdev->dev),
-                               partition_name(same_pdev->dev));
-
-       md_list_add(&rdev->same_set, &mddev->disks);
-       rdev->mddev = mddev;
-       mddev->nb_dev++;
-       printk("bind<%s,%d>\n", partition_name(rdev->dev), mddev->nb_dev);
-}
-
-static void unbind_rdev_from_array (mdk_rdev_t * rdev)
-{
-       if (!rdev->mddev) {
-               MD_BUG();
-               return;
-       }
-       md_list_del(&rdev->same_set);
-       MD_INIT_LIST_HEAD(&rdev->same_set);
-       rdev->mddev->nb_dev--;
-       printk("unbind<%s,%d>\n", partition_name(rdev->dev),
-                                                rdev->mddev->nb_dev);
-       rdev->mddev = NULL;
-}
-
-/*
- * prevent the device from being mounted, repartitioned or
- * otherwise reused by a RAID array (or any other kernel
- * subsystem), by opening the device. [simply getting an
- * inode is not enough, the SCSI module usage code needs
- * an explicit open() on the device]
- */
-static void lock_rdev (mdk_rdev_t *rdev)
-{
-       int err = 0;
-
-       /*
-        * First insert a dummy inode.
-        */
-       rdev->inode = get_empty_inode();
-       /*
-        * we don't care about any other fields
-        */
-       rdev->inode->i_dev = rdev->inode->i_rdev = rdev->dev;
-       insert_inode_hash(rdev->inode);
-
-       memset(&rdev->filp, 0, sizeof(rdev->filp));
-       rdev->filp.f_mode = 3; /* read write */
-       err = blkdev_open(rdev->inode, &rdev->filp);
-       if (err)
-               printk("blkdev_open() failed: %d\n", err);
-}
-
-static void unlock_rdev (mdk_rdev_t *rdev)
-{
-       blkdev_release(rdev->inode);
-}
-
-static void export_rdev (mdk_rdev_t * rdev)
-{
-       printk("export_rdev(%s)\n",partition_name(rdev->dev));
-       if (rdev->mddev)
-               MD_BUG();
-       unlock_rdev(rdev);
-       free_disk_sb(rdev);
-       md_list_del(&rdev->all);
-       MD_INIT_LIST_HEAD(&rdev->all);
-       if (rdev->pending.next != &rdev->pending) {
-               printk("(%s was pending)\n",partition_name(rdev->dev));
-               md_list_del(&rdev->pending);
-               MD_INIT_LIST_HEAD(&rdev->pending);
-       }
-       rdev->dev = 0;
-       rdev->faulty = 0;
-       kfree(rdev);
-}
-
-static void kick_rdev_from_array (mdk_rdev_t * rdev)
-{
-       unbind_rdev_from_array(rdev);
-       export_rdev(rdev);
-}
-
-static void export_array (mddev_t *mddev)
-{
-       struct md_list_head *tmp;
-       mdk_rdev_t *rdev;
-       mdp_super_t *sb = mddev->sb;
-
-       if (mddev->sb) {
-               mddev->sb = NULL;
-               free_page((unsigned long) sb);
-       }
-
-       ITERATE_RDEV(mddev,rdev,tmp) {
-               if (!rdev->mddev) {
-                       MD_BUG();
-                       continue;
-               }
-               kick_rdev_from_array(rdev);
-       }
-       if (mddev->nb_dev)
-               MD_BUG();
-}
-
-#undef BAD_CSUM
-#undef BAD_MAGIC
-#undef OUT_OF_MEM
-#undef NO_SB
-
-static void print_desc(mdp_disk_t *desc)
-{
-       printk(" DISK<N:%d,%s(%d,%d),R:%d,S:%d>\n", desc->number,
-               partition_name(MKDEV(desc->major,desc->minor)),
-               desc->major,desc->minor,desc->raid_disk,desc->state);
-}
-
-static void print_sb(mdp_super_t *sb)
-{
-       int i;
-
-       printk("  SB: (V:%d.%d.%d) ID:<%08x.%08x.%08x.%08x> CT:%08x\n",
-               sb->major_version, sb->minor_version, sb->patch_version,
-               sb->set_uuid0, sb->set_uuid1, sb->set_uuid2, sb->set_uuid3,
-               sb->ctime);
-       printk("     L%d S%08d ND:%d RD:%d md%d LO:%d CS:%d\n", sb->level,
-               sb->size, sb->nr_disks, sb->raid_disks, sb->md_minor,
-               sb->layout, sb->chunk_size);
-       printk("     UT:%08x ST:%d AD:%d WD:%d FD:%d SD:%d CSUM:%08x E:%08lx\n",
-               sb->utime, sb->state, sb->active_disks, sb->working_disks,
-               sb->failed_disks, sb->spare_disks,
-               sb->sb_csum, (unsigned long)get_unaligned(&sb->events));
-
-       for (i = 0; i < MD_SB_DISKS; i++) {
-               mdp_disk_t *desc;
-
-               desc = sb->disks + i;
-               printk("     D %2d: ", i);
-               print_desc(desc);
-       }
-       printk("     THIS: ");
-       print_desc(&sb->this_disk);
-
-}
-
-static void print_rdev(mdk_rdev_t *rdev)
-{
-       printk(" rdev %s: O:%s, SZ:%08d F:%d DN:%d ",
-               partition_name(rdev->dev), partition_name(rdev->old_dev),
-               rdev->size, rdev->faulty, rdev->desc_nr);
-       if (rdev->sb) {
-               printk("rdev superblock:\n");
-               print_sb(rdev->sb);
-       } else
-               printk("no rdev superblock!\n");
-}
-
-void md_print_devices (void)
-{
-       struct md_list_head *tmp, *tmp2;
-       mdk_rdev_t *rdev;
-       mddev_t *mddev;
-
-       printk("\n");
-       printk("       **********************************\n");
-       printk("       * <COMPLETE RAID STATE PRINTOUT> *\n");
-       printk("       **********************************\n");
-       ITERATE_MDDEV(mddev,tmp) {
-               printk("md%d: ", mdidx(mddev));
-
-               ITERATE_RDEV(mddev,rdev,tmp2)
-                       printk("<%s>", partition_name(rdev->dev));
-
-               if (mddev->sb) {
-                       printk(" array superblock:\n");
-                       print_sb(mddev->sb);
-               } else
-                       printk(" no array superblock.\n");
-
-               ITERATE_RDEV(mddev,rdev,tmp2)
-                       print_rdev(rdev);
-       }
-       printk("       **********************************\n");
-       printk("\n");
-}
-
-static int sb_equal ( mdp_super_t *sb1, mdp_super_t *sb2)
-{
-       int ret;
-       mdp_super_t *tmp1, *tmp2;
-
-       tmp1 = kmalloc(sizeof(*tmp1),GFP_KERNEL);
-       tmp2 = kmalloc(sizeof(*tmp2),GFP_KERNEL);
-
-       if (!tmp1 || !tmp2) {
-               ret = 0;
-               goto abort;
-       }
-
-       *tmp1 = *sb1;
-       *tmp2 = *sb2;
-
-       /*
-        * nr_disks is not constant
-        */
-       tmp1->nr_disks = 0;
-       tmp2->nr_disks = 0;
-
-       if (memcmp(tmp1, tmp2, MD_SB_GENERIC_CONSTANT_WORDS * 4))
-               ret = 0;
-       else
-               ret = 1;
-
-abort:
-       if (tmp1)
-               kfree(tmp1);
-       if (tmp2)
-               kfree(tmp2);
-
-       return ret;
-}
-
-static int uuid_equal(mdk_rdev_t *rdev1, mdk_rdev_t *rdev2)
-{
-       if (    (rdev1->sb->set_uuid0 == rdev2->sb->set_uuid0) &&
-               (rdev1->sb->set_uuid1 == rdev2->sb->set_uuid1) &&
-               (rdev1->sb->set_uuid2 == rdev2->sb->set_uuid2) &&
-               (rdev1->sb->set_uuid3 == rdev2->sb->set_uuid3))
-
-               return 1;
-
-       return 0;
-}
-
-static mdk_rdev_t * find_rdev_all (kdev_t dev)
-{
-       struct md_list_head *tmp;
-       mdk_rdev_t *rdev;
-
-       tmp = all_raid_disks.next;
-       while (tmp != &all_raid_disks) {
-               rdev = md_list_entry(tmp, mdk_rdev_t, all);
-               if (rdev->dev == dev)
-                       return rdev;
-               tmp = tmp->next;
-       }
-       return NULL;
-}
-
-#define GETBLK_FAILED KERN_ERR \
-"md: getblk failed for device %s\n"
-
-static int write_disk_sb(mdk_rdev_t * rdev)
-{
-       struct buffer_head *bh;
-       kdev_t dev;
-       u32 sb_offset, size;
-       mdp_super_t *sb;
-
-       if (!rdev->sb) {
-               MD_BUG();
-               return -1;
-       }
-       if (rdev->faulty) {
-               MD_BUG();
-               return -1;
-       }
-       if (rdev->sb->md_magic != MD_SB_MAGIC) {
-               MD_BUG();
-               return -1;
-       }
-
-       dev = rdev->dev;
-       sb_offset = calc_dev_sboffset(dev, rdev->mddev, 1);
-       if (rdev->sb_offset != sb_offset) {
-               printk("%s's sb offset has changed from %d to %d, skipping\n", partition_name(dev), rdev->sb_offset, sb_offset);
-               goto skip;
-       }
-       /*
-        * If the disk went offline meanwhile and it's just a spare, then
-        * its size has changed to zero silently, and the MD code does
-        * not yet know that it's faulty.
-        */
-       size = calc_dev_size(dev, rdev->mddev, 1);
-       if (size != rdev->size) {
-               printk("%s's size has changed from %d to %d since import, skipping\n", partition_name(dev), rdev->size, size);
-               goto skip;
-       }
-
-       printk("(write) %s's sb offset: %d\n", partition_name(dev), sb_offset);
-       fsync_dev(dev);
-       set_blocksize(dev, MD_SB_BYTES);
-       bh = getblk(dev, sb_offset / MD_SB_BLOCKS, MD_SB_BYTES);
-       if (!bh) {
-               printk(GETBLK_FAILED, partition_name(dev));
-               return 1;
-       }
-       memset(bh->b_data,0,bh->b_size);
-       sb = (mdp_super_t *) bh->b_data;
-       memcpy(sb, rdev->sb, MD_SB_BYTES);
-
-       mark_buffer_uptodate(bh, 1);
-       mark_buffer_dirty(bh, 1);
-       ll_rw_block(WRITE, 1, &bh);
-       wait_on_buffer(bh);
-       brelse(bh);
-       fsync_dev(dev);
-skip:
-       return 0;
-}
-#undef GETBLK_FAILED
-
-static void set_this_disk(mddev_t *mddev, mdk_rdev_t *rdev)
-{
-       int i, ok = 0;
-       mdp_disk_t *desc;
-
-       for (i = 0; i < MD_SB_DISKS; i++) {
-               desc = mddev->sb->disks + i;
-#if 0
-               if (disk_faulty(desc)) {
-                       if (MKDEV(desc->major,desc->minor) == rdev->dev)
-                               ok = 1;
-                       continue;
-               }
-#endif
-               if (MKDEV(desc->major,desc->minor) == rdev->dev) {
-                       rdev->sb->this_disk = *desc;
-                       rdev->desc_nr = desc->number;
-                       ok = 1;
-                       break;
-               }
-       }
-
-       if (!ok) {
-               MD_BUG();
-       }
-}
-
-static int sync_sbs(mddev_t * mddev)
-{
-       mdk_rdev_t *rdev;
-       mdp_super_t *sb;
-        struct md_list_head *tmp;
-
-        ITERATE_RDEV(mddev,rdev,tmp) {
-               if (rdev->faulty)
-                       continue;
-               sb = rdev->sb;
-               *sb = *mddev->sb;
-               set_this_disk(mddev, rdev);
-               sb->sb_csum = calc_sb_csum(sb);
-       }
-       return 0;
-}
-
-int md_update_sb(mddev_t * mddev)
-{
-       int first, err, count = 100;
-        struct md_list_head *tmp;
-       mdk_rdev_t *rdev;
-       __u64 ev;
-
-repeat:
-       mddev->sb->utime = CURRENT_TIME;
-       ev = get_unaligned(&mddev->sb->events);
-       ++ev;
-       put_unaligned(ev,&mddev->sb->events);
-       if (ev == (__u64)0) {
-               /*
-                * oops, this 64-bit counter should never wrap.
-                * Either we are in around ~1 trillion A.C., assuming
-                * 1 reboot per second, or we have a bug:
-                */
-               MD_BUG();
-               --ev;
-               put_unaligned(ev,&mddev->sb->events);
-       }
-       sync_sbs(mddev);
-
-       /*
-        * do not write anything to disk if using
-        * nonpersistent superblocks
-        */
-       if (mddev->sb->not_persistent)
-               return 0;
-
-       printk(KERN_INFO "md: updating md%d RAID superblock on device\n",
-                                       mdidx(mddev));
-
-       first = 1;
-       err = 0;
-        ITERATE_RDEV(mddev,rdev,tmp) {
-               if (!first)
-                       printk(", ");
-               first = 0;
-               if (rdev->faulty)
-                       printk("(skipping faulty ");
-               printk("%s ", partition_name(rdev->dev));
-               if (!rdev->faulty) {
-                       printk("[events: %08lx]",
-                              (unsigned long)get_unaligned(&rdev->sb->events));
-                       err += write_disk_sb(rdev);
-               } else
-                       printk(")\n");
-       }
-       printk(".\n");
-       if (err) {
-               printk("errors occurred during superblock update, repeating\n");
-               if (--count)
-                       goto repeat;
-               printk("excessive errors occurred during superblock update, exiting\n");
-       }
-       return 0;
-}
-
-/*
- * Import a device. If 'on_disk', then sanity check the superblock
- *
- * mark the device faulty if:
- *
- *   - the device is nonexistent (zero size)
- *   - the device has no valid superblock
- *
- * a faulty rdev _never_ has rdev->sb set.
- */
-static int md_import_device (kdev_t newdev, int on_disk)
-{
-       int err;
-       mdk_rdev_t *rdev;
-       unsigned int size;
-
-       if (find_rdev_all(newdev))
-               return -EEXIST;
-
-       rdev = (mdk_rdev_t *) kmalloc(sizeof(*rdev), GFP_KERNEL);
-       if (!rdev) {
-               printk("could not alloc mem for %s!\n", partition_name(newdev));
-               return -ENOMEM;
-       }
-       memset(rdev, 0, sizeof(*rdev));
-
-       if (!fs_may_mount(newdev)) {
-               printk("md: can not import %s, has active inodes!\n",
-                       partition_name(newdev));
-               err = -EBUSY;
-               goto abort_free;
-       }
-
-       if ((err = alloc_disk_sb(rdev)))
-               goto abort_free;
-
-       rdev->dev = newdev;
-       lock_rdev(rdev);
-       rdev->desc_nr = -1;
-       rdev->faulty = 0;
-
-       size = 0;
-       if (blk_size[MAJOR(newdev)])
-               size = blk_size[MAJOR(newdev)][MINOR(newdev)];
-       if (!size) {
-               printk("md: %s has zero size, marking faulty!\n",
-                               partition_name(newdev));
-               err = -EINVAL;
-               goto abort_free;
-       }
-
-       if (on_disk) {
-               if ((err = read_disk_sb(rdev))) {
-                       printk("md: could not read %s's sb, not importing!\n",
-                                       partition_name(newdev));
-                       goto abort_free;
-               }
-               if ((err = check_disk_sb(rdev))) {
-                       printk("md: %s has invalid sb, not importing!\n",
-                                       partition_name(newdev));
-                       goto abort_free;
-               }
-
-               rdev->old_dev = MKDEV(rdev->sb->this_disk.major,
-                                       rdev->sb->this_disk.minor);
-               rdev->desc_nr = rdev->sb->this_disk.number;
-       }
-       md_list_add(&rdev->all, &all_raid_disks);
-       MD_INIT_LIST_HEAD(&rdev->pending);
-
-       if (rdev->faulty && rdev->sb)
-               free_disk_sb(rdev);
-       return 0;
-
-abort_free:
-       if (rdev->sb)
-               free_disk_sb(rdev);
-       kfree(rdev);
-       return err;
-}
-
-/*
- * Check a full RAID array for plausibility
- */
-
-#define INCONSISTENT KERN_ERR \
-"md: fatal superblock inconsistency in %s -- removing from array\n"
-
-#define OUT_OF_DATE KERN_ERR \
-"md: superblock update time inconsistency -- using the most recent one\n"
-
-#define OLD_VERSION KERN_ALERT \
-"md: md%d: unsupported raid array version %d.%d.%d\n"
-
-#define NOT_CLEAN_IGNORE KERN_ERR \
-"md: md%d: raid array is not clean -- starting background reconstruction\n"
-
-#define UNKNOWN_LEVEL KERN_ERR \
-"md: md%d: unsupported raid level %d\n"
-
-static int analyze_sbs (mddev_t * mddev)
-{
-       int out_of_date = 0, i;
-       struct md_list_head *tmp, *tmp2;
-       mdk_rdev_t *rdev, *rdev2, *freshest;
-       mdp_super_t *sb;
-
-       /*
-        * Verify the RAID superblock on each real device
-        */
-       ITERATE_RDEV(mddev,rdev,tmp) {
-               if (rdev->faulty) {
-                       MD_BUG();
-                       goto abort;
-               }
-               if (!rdev->sb) {
-                       MD_BUG();
-                       goto abort;
-               }
-               if (check_disk_sb(rdev))
-                       goto abort;
-       }
-
-       /*
-        * The superblock constant part has to be the same
-        * for all disks in the array.
-        */
-       sb = NULL;
-
-       ITERATE_RDEV(mddev,rdev,tmp) {
-               if (!sb) {
-                       sb = rdev->sb;
-                       continue;
-               }
-               if (!sb_equal(sb, rdev->sb)) {
-                       printk (INCONSISTENT, partition_name(rdev->dev));
-                       kick_rdev_from_array(rdev);
-                       continue;
-               }
-       }
-
-       /*
-        * OK, we have all disks and the array is ready to run. Let's
-        * find the freshest superblock, that one will be the superblock
-        * that represents the whole array.
-        */
-       if (!mddev->sb)
-               if (alloc_array_sb(mddev))
-                       goto abort;
-       sb = mddev->sb;
-       freshest = NULL;
-
-       ITERATE_RDEV(mddev,rdev,tmp) {
-               __u64 ev1, ev2;
-               /*
-                * if the checksum is invalid, use the superblock
-                * only as a last resort. (decrease its age by
-                * one event)
-                */
-               if (calc_sb_csum(rdev->sb) != rdev->sb->sb_csum) {
-                       __u64 ev = get_unaligned(&rdev->sb->events);
-                       if (ev != (__u64)0) {
-                               --ev;
-                               put_unaligned(ev,&rdev->sb->events);
-                       }
-               }
-
-               printk("%s's event counter: %08lx\n", partition_name(rdev->dev),
-                      (unsigned long)get_unaligned(&rdev->sb->events));
-               if (!freshest) {
-                       freshest = rdev;
-                       continue;
-               }
-               /*
-                * Find the newest superblock version
-                */
-               ev1 = get_unaligned(&rdev->sb->events);
-               ev2 = get_unaligned(&freshest->sb->events);
-               if (ev1 != ev2) {
-                       out_of_date = 1;
-                       if (ev1 > ev2)
-                               freshest = rdev;
-               }
-       }
-       if (out_of_date) {
-               printk(OUT_OF_DATE);
-               printk("freshest: %s\n", partition_name(freshest->dev));
-       }
-       memcpy (sb, freshest->sb, sizeof(*sb));
-
-       /*
-        * at this point we have picked the 'best' superblock
-        * from all available superblocks.
-        * now we validate this superblock and kick out possibly
-        * failed disks.
-        */
-       ITERATE_RDEV(mddev,rdev,tmp) {
-               /*
-                * Kick all non-fresh devices faulty
-                */
-               __u64 ev1, ev2;
-               ev1 = get_unaligned(&rdev->sb->events);
-               ev2 = get_unaligned(&sb->events);
-               ++ev1;
-               if (ev1 < ev2) {
-                       printk("md: kicking non-fresh %s from array!\n",
-                                               partition_name(rdev->dev));
-                       kick_rdev_from_array(rdev);
-                       continue;
-               }
-       }
-
-       /*
-        * Fix up changed device names ... but only if this disk has a
-        * recent update time. Also accept disks whose checksum is faulty.
-        */
-       ITERATE_RDEV(mddev,rdev,tmp) {
-               __u64 ev1, ev2, ev3;
-               if (rdev->faulty) { /* REMOVEME */
-                       MD_BUG();
-                       goto abort;
-               }
-               ev1 = get_unaligned(&rdev->sb->events);
-               ev2 = get_unaligned(&sb->events);
-               ev3 = ev2;
-               --ev3;
-               if ((rdev->dev != rdev->old_dev) &&
-                   ((ev1 == ev2) || (ev1 == ev3))) {
-                       mdp_disk_t *desc;
-
-                       printk("md: device name has changed from %s to %s since last import!\n", partition_name(rdev->old_dev), partition_name(rdev->dev));
-                       if (rdev->desc_nr == -1) {
-                               MD_BUG();
-                               goto abort;
-                       }
-                       desc = &sb->disks[rdev->desc_nr];
-                       if (rdev->old_dev != MKDEV(desc->major, desc->minor)) {
-                               MD_BUG();
-                               goto abort;
-                       }
-                       desc->major = MAJOR(rdev->dev);
-                       desc->minor = MINOR(rdev->dev);
-                       desc = &rdev->sb->this_disk;
-                       desc->major = MAJOR(rdev->dev);
-                       desc->minor = MINOR(rdev->dev);
-               }
-       }
-
-       /*
-        * Remove unavailable and faulty devices ...
-        *
-        * note that if an array becomes completely unrunnable due to
-        * missing devices, we do not write the superblock back, so the
-        * administrator has a chance to fix things up. The removal thus
-        * only happens if it's nonfatal to the contents of the array.
-        */
-       for (i = 0; i < MD_SB_DISKS; i++) {
-               int found;
-               mdp_disk_t *desc;
-               kdev_t dev;
-
-               desc = sb->disks + i;
-               dev = MKDEV(desc->major, desc->minor);
-
-               /*
-                * We kick faulty devices/descriptors immediately.
-                */
-               if (disk_faulty(desc)) {
-                       found = 0;
-                       ITERATE_RDEV(mddev,rdev,tmp) {
-                               if (rdev->desc_nr != desc->number)
-                                       continue;
-                               printk("md%d: kicking faulty %s!\n",
-                                       mdidx(mddev),partition_name(rdev->dev));
-                               kick_rdev_from_array(rdev);
-                               found = 1;
-                               break;
-                       }
-                       if (!found) {
-                               if (dev == MKDEV(0,0))
-                                       continue;
-                               printk("md%d: removing former faulty %s!\n",
-                                       mdidx(mddev), partition_name(dev));
-                       }
-                       remove_descriptor(desc, sb);
-                       continue;
-               }
-
-               if (dev == MKDEV(0,0))
-                       continue;
-               /*
-                * Is this device present in the rdev ring?
-                */
-               found = 0;
-               ITERATE_RDEV(mddev,rdev,tmp) {
-                       if (rdev->desc_nr == desc->number) {
-                               found = 1;
-                               break;
-                       }
-               }
-               if (found) 
-                       continue;
-
-               printk("md%d: former device %s is unavailable, removing from array!\n", mdidx(mddev), partition_name(dev));
-               remove_descriptor(desc, sb);
-       }
-
-       /*
-        * Double check whether all devices mentioned in the
-        * superblock are in the rdev ring.
-        */
-       for (i = 0; i < MD_SB_DISKS; i++) {
-               mdp_disk_t *desc;
-               kdev_t dev;
-
-               desc = sb->disks + i;
-               dev = MKDEV(desc->major, desc->minor);
-
-               if (dev == MKDEV(0,0))
-                       continue;
-
-               if (disk_faulty(desc)) {
-                       MD_BUG();
-                       goto abort;
-               }
-
-               rdev = find_rdev(mddev, dev);
-               if (!rdev) {
-                       MD_BUG();
-                       goto abort;
-               }
-       }
-
-       /*
-        * Do a final reality check.
-        */
-       ITERATE_RDEV(mddev,rdev,tmp) {
-               if (rdev->desc_nr == -1) {
-                       MD_BUG();
-                       goto abort;
-               }
-               /*
-                * is the desc_nr unique?
-                */
-               ITERATE_RDEV(mddev,rdev2,tmp2) {
-                       if ((rdev2 != rdev) &&
-                                       (rdev2->desc_nr == rdev->desc_nr)) {
-                               MD_BUG();
-                               goto abort;
-                       }
-               }
-               /*
-                * is the device unique?
-                */
-               ITERATE_RDEV(mddev,rdev2,tmp2) {
-                       if ((rdev2 != rdev) &&
-                                       (rdev2->dev == rdev->dev)) {
-                               MD_BUG();
-                               goto abort;
-                       }
-               }
-       }
-
-       /*
-        * Check if we can support this RAID array
-        */
-       if (sb->major_version != MD_MAJOR_VERSION ||
-                       sb->minor_version > MD_MINOR_VERSION) {
-
-               printk (OLD_VERSION, mdidx(mddev), sb->major_version,
-                               sb->minor_version, sb->patch_version);
-               goto abort;
-       }
-
-       if ((sb->state != (1 << MD_SB_CLEAN)) && ((sb->level == 1) ||
-                       (sb->level == 4) || (sb->level == 5)))
-               printk (NOT_CLEAN_IGNORE, mdidx(mddev));
-
-       return 0;
-abort:
-       return 1;
-}
-
-#undef INCONSISTENT
-#undef OUT_OF_DATE
-#undef OLD_VERSION
-#undef NOT_CLEAN_IGNORE
-
-static int device_size_calculation (mddev_t * mddev)
-{
-       int data_disks = 0, persistent;
-       unsigned int readahead;
-       mdp_super_t *sb = mddev->sb;
-       struct md_list_head *tmp;
-       mdk_rdev_t *rdev;
-
-       /*
-        * Do device size calculation. Bail out if too small.
-        * (we have to do this after having validated chunk_size,
-        * because device size has to be modulo chunk_size)
-        */ 
-       persistent = !mddev->sb->not_persistent;
-       ITERATE_RDEV(mddev,rdev,tmp) {
-               if (rdev->faulty)
-                       continue;
-               if (rdev->size) {
-                       MD_BUG();
-                       continue;
-               }
-               rdev->size = calc_dev_size(rdev->dev, mddev, persistent);
-               if (rdev->size < sb->chunk_size / 1024) {
-                       printk (KERN_WARNING
-                               "Dev %s smaller than chunk_size: %dk < %dk\n",
-                               partition_name(rdev->dev),
-                               rdev->size, sb->chunk_size / 1024);
-                       return -EINVAL;
-               }
-       }
-
-       switch (sb->level) {
-               case -3:
-                       data_disks = 1;
-                       break;
-               case -2:
-                       data_disks = 1;
-                       break;
-               case -1:
-                       zoned_raid_size(mddev);
-                       data_disks = 1;
-                       break;
-               case 0:
-                       zoned_raid_size(mddev);
-                       data_disks = sb->raid_disks;
-                       break;
-               case 1:
-                       data_disks = 1;
-                       break;
-               case 4:
-               case 5:
-                       data_disks = sb->raid_disks-1;
-                       break;
-               default:
-                       printk (UNKNOWN_LEVEL, mdidx(mddev), sb->level);
-                       goto abort;
-       }
-       if (!md_size[mdidx(mddev)])
-               md_size[mdidx(mddev)] = sb->size * data_disks;
-
-       readahead = MD_READAHEAD;
-       if ((sb->level == 0) || (sb->level == 4) || (sb->level == 5)) {
-               readahead = mddev->sb->chunk_size * 4 * data_disks;
-               if (readahead < data_disks * MAX_SECTORS*512*2)
-                       readahead = data_disks * MAX_SECTORS*512*2;
-       } else {
-               if (sb->level == -3)
-                       readahead = 0;
-       }
-       md_maxreadahead[mdidx(mddev)] = readahead;
-
-       printk(KERN_INFO "md%d: max total readahead window set to %dk\n",
-               mdidx(mddev), readahead/1024);
-
-       printk(KERN_INFO
-               "md%d: %d data-disks, max readahead per data-disk: %dk\n",
-                       mdidx(mddev), data_disks, readahead/data_disks/1024);
-       return 0;
-abort:
-       return 1;
-}
-
-
-#define TOO_BIG_CHUNKSIZE KERN_ERR \
-"too big chunk_size: %d > %d\n"
-
-#define TOO_SMALL_CHUNKSIZE KERN_ERR \
-"too small chunk_size: %d < %ld\n"
-
-#define BAD_CHUNKSIZE KERN_ERR \
-"no chunksize specified, see 'man raidtab'\n"
-
-static int do_md_run (mddev_t * mddev)
-{
-       int pnum, err;
-       int chunk_size;
-       struct md_list_head *tmp;
-       mdk_rdev_t *rdev;
-
-
-       if (!mddev->nb_dev) {
-               MD_BUG();
-               return -EINVAL;
-       }
-  
-       if (mddev->pers)
-               return -EBUSY;
-
-       /*
-        * Resize disks to align partition sizes on a given
-        * chunk size.
-        */
-       md_size[mdidx(mddev)] = 0;
-
-       /*
-        * Analyze all RAID superblock(s)
-        */ 
-       if (analyze_sbs(mddev)) {
-               MD_BUG();
-               return -EINVAL;
-       }
-
-       chunk_size = mddev->sb->chunk_size;
-       pnum = level_to_pers(mddev->sb->level);
-
-       mddev->param.chunk_size = chunk_size;
-       mddev->param.personality = pnum;
-
-       if (chunk_size > MAX_CHUNK_SIZE) {
-               printk(TOO_BIG_CHUNKSIZE, chunk_size, MAX_CHUNK_SIZE);
-               return -EINVAL;
-       }
-       /*
-        * chunk-size has to be a power of 2 and a multiple of PAGE_SIZE
-        */
-       if ( (1 << ffz(~chunk_size)) != chunk_size) {
-               MD_BUG();
-               return -EINVAL;
-       }
-       if (chunk_size < PAGE_SIZE) {
-               printk(TOO_SMALL_CHUNKSIZE, chunk_size, PAGE_SIZE);
-               return -EINVAL;
-       }
-
-       if (pnum >= MAX_PERSONALITY) {
-               MD_BUG();
-               return -EINVAL;
-       }
-
-       if ((pnum != RAID1) && (pnum != LINEAR) && !chunk_size) {
-               /*
-                * 'default chunksize' in the old md code used to
-                * be PAGE_SIZE, baaad.
-                * we abort here to be on the safe side. We don't
-                * want to continue the bad practice.
-                */
-               printk(BAD_CHUNKSIZE);
-               return -EINVAL;
-       }
-
-       if (!pers[pnum])
-       {
-#ifdef CONFIG_KMOD
-               char module_name[80];
-               sprintf (module_name, "md-personality-%d", pnum);
-               request_module (module_name);
-               if (!pers[pnum])
-#endif
-                       return -EINVAL;
-       }
-  
-       if (device_size_calculation(mddev))
-               return -EINVAL;
-
-       /*
-        * Drop all container device buffers, from now on
-        * the only valid external interface is through the md
-        * device.
-        */
-       ITERATE_RDEV(mddev,rdev,tmp) {
-               if (rdev->faulty)
-                       continue;
-               fsync_dev(rdev->dev);
-               invalidate_buffers(rdev->dev);
-       }
-  
-       mddev->pers = pers[pnum];
-  
-       err = mddev->pers->run(mddev);
-       if (err) {
-               printk("pers->run() failed ...\n");
-               mddev->pers = NULL;
-               return -EINVAL;
-       }
-
-       mddev->sb->state &= ~(1 << MD_SB_CLEAN);
-       md_update_sb(mddev);
-
-       /*
-        * md_size has units of 1K blocks, which are
-        * twice as large as sectors.
-        */
-       md_hd_struct[mdidx(mddev)].start_sect = 0;
-       md_hd_struct[mdidx(mddev)].nr_sects = md_size[mdidx(mddev)] << 1;
-  
-       read_ahead[MD_MAJOR] = 1024;
-       return (0);
-}
-
-#undef TOO_BIG_CHUNKSIZE
-#undef BAD_CHUNKSIZE
-
-#define OUT(x) do { err = (x); goto out; } while (0)
-
-static int restart_array (mddev_t *mddev)
-{
-       int err = 0;
-       /*
-        * Complain if it has no devices
-        */
-       if (!mddev->nb_dev)
-               OUT(-ENXIO);
-
-       if (mddev->pers) {
-               if (!mddev->ro)
-                       OUT(-EBUSY);
-
-               mddev->ro = 0;
-               set_device_ro(mddev_to_kdev(mddev), 0);
-
-               printk (KERN_INFO
-                       "md%d switched to read-write mode.\n", mdidx(mddev));
-               /*
-                * Kick recovery or resync if necessary
-                */
-               md_recover_arrays();
-               if (mddev->pers->restart_resync)
-                       mddev->pers->restart_resync(mddev);
-       } else
-               err = -EINVAL;
-out:
-       return err;
-}
-
-#define STILL_MOUNTED KERN_WARNING \
-"md: md%d still mounted.\n"
-
-static int do_md_stop (mddev_t * mddev, int ro)
-{
-       int err = 0, resync_interrupted = 0;
-       kdev_t dev = mddev_to_kdev(mddev);
-       if (!ro && !fs_may_mount (dev)) {
-               printk (STILL_MOUNTED, mdidx(mddev));
-               OUT(-EBUSY);
-       }
-  
-       /*
-        * complain if it's already stopped
-        */
-       if (!mddev->nb_dev)
-               OUT(-ENXIO);
-
-       if (mddev->pers) {
-               /*
-                * It is safe to call stop here, it only frees private
-                * data. Also, it tells us if a device is unstoppable
-                * (e.g. resyncing is in progress)
-                */
-               if (mddev->pers->stop_resync)
-                       if (mddev->pers->stop_resync(mddev))
-                               resync_interrupted = 1;
-
-               if (mddev->recovery_running)
-                       md_interrupt_thread(md_recovery_thread);
-
-               /*
-                * This synchronizes with signal delivery to the
-                * resync or reconstruction thread. It also nicely
-                * hangs the process if some reconstruction has not
-                * finished.
-                */
-               down(&mddev->recovery_sem);
-               up(&mddev->recovery_sem);
-
-               /*
-                *  sync and invalidate buffers because we cannot kill the
-                *  main thread with valid IO transfers still around.
-                *  the kernel lock protects us from new requests being
-                *  added after invalidate_buffers().
-                */
-               fsync_dev (mddev_to_kdev(mddev));
-               fsync_dev (dev);
-               invalidate_buffers (dev);
-
-               if (ro) {
-                       if (mddev->ro)
-                               OUT(-ENXIO);
-                       mddev->ro = 1;
-               } else {
-                       if (mddev->ro)
-                               set_device_ro(dev, 0);
-                       if (mddev->pers->stop(mddev)) {
-                               if (mddev->ro)
-                                       set_device_ro(dev, 1);
-                               OUT(-EBUSY);
-                       }
-                       if (mddev->ro)
-                               mddev->ro = 0;
-               }
-               if (mddev->sb) {
-                       /*
-                        * mark it clean only if there was no resync
-                        * interrupted.
-                        */
-                       if (!mddev->recovery_running && !resync_interrupted) {
-                               printk("marking sb clean...\n");
-                               mddev->sb->state |= 1 << MD_SB_CLEAN;
-                       }
-                       md_update_sb(mddev);
-               }
-               if (ro)
-                       set_device_ro(dev, 1);
-       }
-       /*
-        * Free resources if final stop
-        */
-       if (!ro) {
-               export_array(mddev);
-               md_size[mdidx(mddev)] = 0;
-               md_hd_struct[mdidx(mddev)].nr_sects = 0;
-               free_mddev(mddev);
-
-               printk (KERN_INFO "md%d stopped.\n", mdidx(mddev));
-       } else
-               printk (KERN_INFO
-                       "md%d switched to read-only mode.\n", mdidx(mddev));
-out:
-       return err;
-}
-
-#undef OUT
-
-/*
- * We have to safely support old arrays too.
- */
-int detect_old_array (mdp_super_t *sb)
-{
-       if (sb->major_version > 0)
-               return 0;
-       if (sb->minor_version >= 90)
-               return 0;
-
-       return -EINVAL;
-}
-
-
-static void autorun_array (mddev_t *mddev)
-{
-       mdk_rdev_t *rdev;
-        struct md_list_head *tmp;
-       int err;
-
-       if (mddev->disks.prev == &mddev->disks) {
-               MD_BUG();
-               return;
-       }
-
-       printk("running: ");
-
-        ITERATE_RDEV(mddev,rdev,tmp) {
-               printk("<%s>", partition_name(rdev->dev));
-       }
-       printk("\nnow!\n");
-
-       err = do_md_run (mddev);
-       if (err) {
-               printk("do_md_run() returned %d\n", err);
-               /*
-                * prevent the writeback of an unrunnable array
-                */
-               mddev->sb_dirty = 0;
-               do_md_stop (mddev, 0);
-       }
-}
-
-/*
- * let's try to run arrays based on all disks that have arrived
- * until now (those are on the ->pending list).
- *
- * the method: pick the first pending disk, collect all disks with
- * the same UUID, remove all from the pending list and put them into
- * the 'same_array' list. Then order this list based on superblock
- * update time (freshest comes first), kick out 'old' disks and
- * compare superblocks. If everything's fine then run it.
- */
-static void autorun_devices (void)
-{
-       struct md_list_head candidates;
-       struct md_list_head *tmp;
-       mdk_rdev_t *rdev0, *rdev;
-       mddev_t *mddev;
-       kdev_t md_kdev;
-
-
-       printk("autorun ...\n");
-       while (pending_raid_disks.next != &pending_raid_disks) {
-               rdev0 = md_list_entry(pending_raid_disks.next,
-                                        mdk_rdev_t, pending);
-
-               printk("considering %s ...\n", partition_name(rdev0->dev));
-               MD_INIT_LIST_HEAD(&candidates);
-               ITERATE_RDEV_PENDING(rdev,tmp) {
-                       if (uuid_equal(rdev0, rdev)) {
-                               if (!sb_equal(rdev0->sb, rdev->sb)) {
-                                       printk("%s has same UUID as %s, but superblocks differ ...\n", partition_name(rdev->dev), partition_name(rdev0->dev));
-                                       continue;
-                               }
-                               printk("  adding %s ...\n", partition_name(rdev->dev));
-                               md_list_del(&rdev->pending);
-                               md_list_add(&rdev->pending, &candidates);
-                       }
-               }
-               /*
-                * now we have a set of devices, with all of them having
-                * mostly sane superblocks. It's time to allocate the
-                * mddev.
-                */
-               md_kdev = MKDEV(MD_MAJOR, rdev0->sb->md_minor);
-               mddev = kdev_to_mddev(md_kdev);
-               if (mddev) {
-                       printk("md%d already running, cannot run %s\n",
-                                mdidx(mddev), partition_name(rdev0->dev));
-                       ITERATE_RDEV_GENERIC(candidates,pending,rdev,tmp)
-                               export_rdev(rdev);
-                       continue;
-               }
-               mddev = alloc_mddev(md_kdev);
-               printk("created md%d\n", mdidx(mddev));
-               ITERATE_RDEV_GENERIC(candidates,pending,rdev,tmp) {
-                       bind_rdev_to_array(rdev, mddev);
-                       md_list_del(&rdev->pending);
-                       MD_INIT_LIST_HEAD(&rdev->pending);
-               }
-               autorun_array(mddev);
-       }
-       printk("... autorun DONE.\n");
-}
-
-/*
- * import RAID devices based on one partition
- * if possible, the array gets run as well.
- */
-
-#define BAD_VERSION KERN_ERR \
-"md: %s has RAID superblock version 0.%d, autodetect needs v0.90 or higher\n"
-
-#define OUT_OF_MEM KERN_ALERT \
-"md: out of memory.\n"
-
-#define NO_DEVICE KERN_ERR \
-"md: disabled device %s\n"
-
-#define AUTOADD_FAILED KERN_ERR \
-"md: auto-adding devices to md%d FAILED (error %d).\n"
-
-#define AUTOADD_FAILED_USED KERN_ERR \
-"md: cannot auto-add device %s to md%d, already used.\n"
-
-#define AUTORUN_FAILED KERN_ERR \
-"md: auto-running md%d FAILED (error %d).\n"
-
-#define MDDEV_BUSY KERN_ERR \
-"md: cannot auto-add to md%d, already running.\n"
-
-#define AUTOADDING KERN_INFO \
-"md: auto-adding devices to md%d, based on %s's superblock.\n"
-
-#define AUTORUNNING KERN_INFO \
-"md: auto-running md%d.\n"
-
-static int autostart_array (kdev_t startdev)
-{
-       int err = -EINVAL, i;
-       mdp_super_t *sb = NULL;
-       mdk_rdev_t *start_rdev = NULL, *rdev;
-
-       if (md_import_device(startdev, 1)) {
-               printk("could not import %s!\n", partition_name(startdev));
-               goto abort;
-       }
-
-       start_rdev = find_rdev_all(startdev);
-       if (!start_rdev) {
-               MD_BUG();
-               goto abort;
-       }
-       if (start_rdev->faulty) {
-               printk("can not autostart based on faulty %s!\n",
-                                               partition_name(startdev));
-               goto abort;
-       }
-       md_list_add(&start_rdev->pending, &pending_raid_disks);
-
-       sb = start_rdev->sb;
-
-       err = detect_old_array(sb);
-       if (err) {
-               printk("array version is too old to be autostarted, use raidtools 0.90 mkraid --upgrade\nto upgrade the array without data loss!\n");
-               goto abort;
-       }
-
-       for (i = 0; i < MD_SB_DISKS; i++) {
-               mdp_disk_t *desc;
-               kdev_t dev;
-
-               desc = sb->disks + i;
-               dev = MKDEV(desc->major, desc->minor);
-
-               if (dev == MKDEV(0,0))
-                       continue;
-               if (dev == startdev)
-                       continue;
-               if (md_import_device(dev, 1)) {
-                       printk("could not import %s, trying to run array nevertheless.\n", partition_name(dev));
-                       continue;
-               }
-               rdev = find_rdev_all(dev);
-               if (!rdev) {
-                       MD_BUG();
-                       goto abort;
-               }
-               md_list_add(&rdev->pending, &pending_raid_disks);
-       }
-
-       /*
-        * possibly return codes
-        */
-       autorun_devices();
-       return 0;
-
-abort:
-       if (start_rdev)
-               export_rdev(start_rdev);
-       return err;
-}
-
-#undef BAD_VERSION
-#undef OUT_OF_MEM
-#undef NO_DEVICE
-#undef AUTOADD_FAILED_USED
-#undef AUTOADD_FAILED
-#undef AUTORUN_FAILED
-#undef AUTOADDING
-#undef AUTORUNNING
-
-struct {
-       int set;
-       int noautodetect;
-
-} raid_setup_args md__initdata = { 0, 0 };
-
-/*
- * Searches all registered partitions for autorun RAID arrays
- * at boot time.
- */
-md__initfunc(void autodetect_raid(void))
-{
-#ifdef CONFIG_AUTODETECT_RAID
-       struct gendisk *disk;
-       mdk_rdev_t *rdev;
-       int i;
-
-       if (raid_setup_args.noautodetect) {
-               printk(KERN_INFO "skipping autodetection of RAID arrays\n");
-               return;
-       }
-       printk(KERN_INFO "autodetecting RAID arrays\n");
-
-       for (disk = gendisk_head ; disk ; disk = disk->next) {
-               for (i = 0; i < disk->max_p*disk->max_nr; i++) {
-                       kdev_t dev = MKDEV(disk->major,i);
-
-                       if (disk->part[i].type == LINUX_OLD_RAID_PARTITION) {
-                               printk(KERN_ALERT
-"md: %s's partition type has to be changed from type 0x86 to type 0xfd\n"
-"    to maintain interoperability with other OSs! Autodetection support for\n"
-"    type 0x86 will be deleted after some migration timeout. Sorry.\n",
-                                       partition_name(dev));
-                               disk->part[i].type = LINUX_RAID_PARTITION;
-                       }
-                       if (disk->part[i].type != LINUX_RAID_PARTITION)
-                               continue;
-
-                       if (md_import_device(dev,1)) {
-                               printk(KERN_ALERT "could not import %s!\n",
-                                                       partition_name(dev));
-                               continue;
-                       }
-                       /*
-                        * Sanity checks:
-                        */
-                       rdev = find_rdev_all(dev);
-                       if (!rdev) {
-                               MD_BUG();
-                               continue;
-                       }
-                       if (rdev->faulty) {
-                               MD_BUG();
-                               continue;
-                       }
-                       md_list_add(&rdev->pending, &pending_raid_disks);
-               }
-       }
-
-       autorun_devices();
-#endif
-}
-
-static int get_version (void * arg)
-{
-       mdu_version_t ver;
-
-       ver.major = MD_MAJOR_VERSION;
-       ver.minor = MD_MINOR_VERSION;
-       ver.patchlevel = MD_PATCHLEVEL_VERSION;
-
-       if (md_copy_to_user(arg, &ver, sizeof(ver)))
-               return -EFAULT;
-
-       return 0;
-}
-
-#define SET_FROM_SB(x) info.x = mddev->sb->x
-static int get_array_info (mddev_t * mddev, void * arg)
-{
-       mdu_array_info_t info;
-
-       if (!mddev->sb)
-               return -EINVAL;
-
-       SET_FROM_SB(major_version);
-       SET_FROM_SB(minor_version);
-       SET_FROM_SB(patch_version);
-       SET_FROM_SB(ctime);
-       SET_FROM_SB(level);
-       SET_FROM_SB(size);
-       SET_FROM_SB(nr_disks);
-       SET_FROM_SB(raid_disks);
-       SET_FROM_SB(md_minor);
-       SET_FROM_SB(not_persistent);
-
-       SET_FROM_SB(utime);
-       SET_FROM_SB(state);
-       SET_FROM_SB(active_disks);
-       SET_FROM_SB(working_disks);
-       SET_FROM_SB(failed_disks);
-       SET_FROM_SB(spare_disks);
-
-       SET_FROM_SB(layout);
-       SET_FROM_SB(chunk_size);
-
-       if (md_copy_to_user(arg, &info, sizeof(info)))
-               return -EFAULT;
-
-       return 0;
-}
-#undef SET_FROM_SB
-
-#define SET_FROM_SB(x) info.x = mddev->sb->disks[nr].x
-static int get_disk_info (mddev_t * mddev, void * arg)
-{
-       mdu_disk_info_t info;
-       unsigned int nr;
-
-       if (!mddev->sb)
-               return -EINVAL;
-
-       if (md_copy_from_user(&info, arg, sizeof(info)))
-               return -EFAULT;
-
-       nr = info.number;
-       if (nr >= mddev->sb->nr_disks)
-               return -EINVAL;
-
-       SET_FROM_SB(major);
-       SET_FROM_SB(minor);
-       SET_FROM_SB(raid_disk);
-       SET_FROM_SB(state);
-
-       if (md_copy_to_user(arg, &info, sizeof(info)))
-               return -EFAULT;
-
-       return 0;
-}
-#undef SET_FROM_SB
-
-#define SET_SB(x) mddev->sb->disks[nr].x = info.x
-
-static int add_new_disk (mddev_t * mddev, void * arg)
-{
-       int err, size, persistent;
-       mdu_disk_info_t info;
-       mdk_rdev_t *rdev;
-       unsigned int nr;
-       kdev_t dev;
-
-       if (!mddev->sb)
-               return -EINVAL;
-
-       if (md_copy_from_user(&info, arg, sizeof(info)))
-               return -EFAULT;
-
-       nr = info.number;
-       if (nr >= mddev->sb->nr_disks)
-               return -EINVAL;
-
-       dev = MKDEV(info.major,info.minor);
-
-       if (find_rdev_all(dev)) {
-               printk("device %s already used in a RAID array!\n", 
-                               partition_name(dev));
-               return -EBUSY;
-       }
-
-       SET_SB(number);
-       SET_SB(major);
-       SET_SB(minor);
-       SET_SB(raid_disk);
-       SET_SB(state);
-       if ((info.state & (1<<MD_DISK_FAULTY))==0) {
-               err = md_import_device (dev, 0);
-               if (err) {
-                       printk("md: error, md_import_device() returned %d\n", err);
-                       return -EINVAL;
-               }
-               rdev = find_rdev_all(dev);
-               if (!rdev) {
-                       MD_BUG();
-                       return -EINVAL;
-               }
-               rdev->old_dev = dev;
-               rdev->desc_nr = info.number;
-               bind_rdev_to_array(rdev, mddev);
-               persistent = !mddev->sb->not_persistent;
-               if (!persistent)
-                       printk("nonpersistent superblock ...\n");
-               if (!mddev->sb->chunk_size)
-                       printk("no chunksize?\n");
-               size = calc_dev_size(dev, mddev, persistent);
-               rdev->sb_offset = calc_dev_sboffset(dev, mddev, persistent);
-               if (!mddev->sb->size || (mddev->sb->size > size))
-                       mddev->sb->size = size;
-       }
-       /*
-        * sync all other superblocks with the main superblock
-        */
-       sync_sbs(mddev);
-
-       return 0;
-}
-#undef SET_SB
-
-static int hot_remove_disk (mddev_t * mddev, kdev_t dev)
-{
-       int err;
-       mdk_rdev_t *rdev;
-       mdp_disk_t *disk;
-
-       if (!mddev->pers)
-               return -ENODEV;
-
-       printk("trying to remove %s from md%d ... \n",
-               partition_name(dev), mdidx(mddev));
-
-       if (!mddev->pers->diskop) {
-               printk("md%d: personality does not support diskops!\n",
-                                                                mdidx(mddev));
-               return -EINVAL;
-       }
-
-       rdev = find_rdev(mddev, dev);
-       if (!rdev)
-               return -ENXIO;
-
-       if (rdev->desc_nr == -1) {
-               MD_BUG();
-               return -EINVAL;
-       }
-       disk = &mddev->sb->disks[rdev->desc_nr];
-       if (disk_active(disk))
-               goto busy;
-       if (disk_removed(disk)) {
-               MD_BUG();
-               return -EINVAL;
-       }
-       
-       err = mddev->pers->diskop(mddev, &disk, DISKOP_HOT_REMOVE_DISK);
-       if (err == -EBUSY)
-               goto busy;
-       if (err) {
-               MD_BUG();
-               return -EINVAL;
-       }
-
-       remove_descriptor(disk, mddev->sb);
-       kick_rdev_from_array(rdev);
-       mddev->sb_dirty = 1;
-       md_update_sb(mddev);
-
-       return 0;
-busy:
-       printk("cannot remove active disk %s from md%d ... \n",
-               partition_name(dev), mdidx(mddev));
-       return -EBUSY;
-}
-
-static int hot_add_disk (mddev_t * mddev, kdev_t dev)
-{
-       int i, err, persistent;
-       unsigned int size;
-       mdk_rdev_t *rdev;
-       mdp_disk_t *disk;
-
-       if (!mddev->pers)
-               return -ENODEV;
-
-       printk("trying to hot-add %s to md%d ... \n",
-               partition_name(dev), mdidx(mddev));
-
-       if (!mddev->pers->diskop) {
-               printk("md%d: personality does not support diskops!\n",
-                                                                mdidx(mddev));
-               return -EINVAL;
-       }
-
-       persistent = !mddev->sb->not_persistent;
-       size = calc_dev_size(dev, mddev, persistent);
-
-       if (size < mddev->sb->size) {
-               printk("md%d: disk size %d blocks < array size %d\n",
-                               mdidx(mddev), size, mddev->sb->size);
-               return -ENOSPC;
-       }
-
-       rdev = find_rdev(mddev, dev);
-       if (rdev)
-               return -EBUSY;
-
-       err = md_import_device (dev, 0);
-       if (err) {
-               printk("md: error, md_import_device() returned %d\n", err);
-               return -EINVAL;
-       }
-       rdev = find_rdev_all(dev);
-       if (!rdev) {
-               MD_BUG();
-               return -EINVAL;
-       }
-       if (rdev->faulty) {
-               printk("md: can not hot-add faulty %s disk to md%d!\n",
-                               partition_name(dev), mdidx(mddev));
-               err = -EINVAL;
-               goto abort_export;
-       }
-       bind_rdev_to_array(rdev, mddev);
-
-       /*
-        * The rest had better be atomic: disk failures can be
-        * noticed in interrupt context ...
-        */
-       cli();
-       rdev->old_dev = dev;
-       rdev->size = size;
-       rdev->sb_offset = calc_dev_sboffset(dev, mddev, persistent);
-
-       disk = mddev->sb->disks + mddev->sb->raid_disks;
-       for (i = mddev->sb->raid_disks; i < MD_SB_DISKS; i++) {
-               disk = mddev->sb->disks + i;
-
-               if (!disk->major && !disk->minor)
-                       break;
-               if (disk_removed(disk))
-                       break;
-       }
-       if (i == MD_SB_DISKS) {
-               sti();
-               printk("md%d: can not hot-add to full array!\n", mdidx(mddev));
-               err = -EBUSY;
-               goto abort_unbind_export;
-       }
+       realdev->size=blk_size[MAJOR(dev)][MINOR(dev)];
 
-       if (disk_removed(disk)) {
+       if (hot_add) {
+               /*
+                * Check the superblock for consistency.
+                * The personality itself has to check whether it's getting
+                * added with the proper flags.  The personality has to be
+                 * checked too. ;)
+                */
+               if (analyze_one_sb (realdev))
+                       return -EINVAL;
                /*
-                * reuse slot
+                * hot_add has to bump up nb_dev itself
                 */
-               if (disk->number != i) {
-                       sti();
-                       MD_BUG();
-                       err = -EINVAL;
-                       goto abort_unbind_export;
+               if (md_dev[minor].pers->hot_add_disk (&md_dev[minor], dev)) {
+                       /*
+                        * FIXME: here we should free up the inode and stuff
+                        */
+                       printk ("FIXME\n");
+                       return -EINVAL;
                }
-       } else {
-               disk->number = i;
-       }
-
-       disk->raid_disk = disk->number;
-       disk->major = MAJOR(dev);
-       disk->minor = MINOR(dev);
-
-       if (mddev->pers->diskop(mddev, &disk, DISKOP_HOT_ADD_DISK)) {
-               sti();
-               MD_BUG();
-               err = -EINVAL;
-               goto abort_unbind_export;
-       }
-
-       mark_disk_spare(disk);
-       mddev->sb->nr_disks++;
-       mddev->sb->spare_disks++;
-       mddev->sb->working_disks++;
-
-       mddev->sb_dirty = 1;
-
-       sti();
-       md_update_sb(mddev);
-
-       /*
-        * Kick recovery, maybe this spare has to be added to the
-        * array immediately.
-        */
-       md_recover_arrays();
-
-       return 0;
-
-abort_unbind_export:
-       unbind_rdev_from_array(rdev);
-
-abort_export:
-       export_rdev(rdev);
-       return err;
-}
-
-#define SET_SB(x) mddev->sb->x = info.x
-static int set_array_info (mddev_t * mddev, void * arg)
-{
-       mdu_array_info_t info;
-
-       if (mddev->sb) {
-               printk("array md%d already has a superblock!\n", 
-                               mdidx(mddev));
-               return -EBUSY;
-       }
-
-       if (md_copy_from_user(&info, arg, sizeof(info)))
-               return -EFAULT;
-
-       if (alloc_array_sb(mddev))
-               return -ENOMEM;
-
-       mddev->sb->major_version = MD_MAJOR_VERSION;
-       mddev->sb->minor_version = MD_MINOR_VERSION;
-       mddev->sb->patch_version = MD_PATCHLEVEL_VERSION;
-       mddev->sb->ctime = CURRENT_TIME;
-
-       SET_SB(level);
-       SET_SB(size);
-       SET_SB(nr_disks);
-       SET_SB(raid_disks);
-       SET_SB(md_minor);
-       SET_SB(not_persistent);
-
-       SET_SB(state);
-       SET_SB(active_disks);
-       SET_SB(working_disks);
-       SET_SB(failed_disks);
-       SET_SB(spare_disks);
-
-       SET_SB(layout);
-       SET_SB(chunk_size);
-
-       mddev->sb->md_magic = MD_SB_MAGIC;
-
-       /*
-        * Generate a 128 bit UUID
-        */
-       get_random_bytes(&mddev->sb->set_uuid0, 4);
-       get_random_bytes(&mddev->sb->set_uuid1, 4);
-       get_random_bytes(&mddev->sb->set_uuid2, 4);
-       get_random_bytes(&mddev->sb->set_uuid3, 4);
-
-       return 0;
-}
-#undef SET_SB
-
-static int set_disk_info (mddev_t * mddev, void * arg)
-{
-       printk("not yet");
-       return -EINVAL;
-}
-
-static int clear_array (mddev_t * mddev)
-{
-       printk("not yet");
-       return -EINVAL;
-}
-
-static int write_raid_info (mddev_t * mddev)
-{
-       printk("not yet");
-       return -EINVAL;
-}
-
-static int protect_array (mddev_t * mddev)
-{
-       printk("not yet");
-       return -EINVAL;
-}
+       } else
+               md_dev[minor].nb_dev++;
 
-static int unprotect_array (mddev_t * mddev)
-{
-       printk("not yet");
-       return -EINVAL;
+       printk ("REGISTER_DEV %s to md%x done\n", partition_name(dev), minor);
+       return (0);
 }
 
 static int md_ioctl (struct inode *inode, struct file *file,
                      unsigned int cmd, unsigned long arg)
 {
-       unsigned int minor;
-       int err = 0;
-       struct hd_geometry *loc = (struct hd_geometry *) arg;
-       mddev_t *mddev = NULL;
-       kdev_t dev;
-
-       if (!md_capable_admin())
-               return -EACCES;
-
-       dev = inode->i_rdev;
-       minor = MINOR(dev);
-       if (minor >= MAX_MD_DEVS)
-               return -EINVAL;
-
-       /*
-        * Commands dealing with the RAID driver but not any
-        * particular array:
-        */
-       switch (cmd)
-       {
-               case RAID_VERSION:
-                       err = get_version((void *)arg);
-                       goto done;
-
-               case PRINT_RAID_DEBUG:
-                       err = 0;
-                       md_print_devices();
-                       goto done_unlock;
-      
-               case BLKGETSIZE:   /* Return device size */
-                       if (!arg) {
-                               err = -EINVAL;
-                               goto abort;
-                       }
-                       err = md_put_user(md_hd_struct[minor].nr_sects,
-                                               (long *) arg);
-                       goto done;
-
-               case BLKFLSBUF:
-                       fsync_dev(dev);
-                       invalidate_buffers(dev);
-                       goto done;
-
-               case BLKRASET:
-                       if (arg > 0xff) {
-                               err = -EINVAL;
-                               goto abort;
-                       }
-                       read_ahead[MAJOR(dev)] = arg;
-                       goto done;
-    
-               case BLKRAGET:
-                       if (!arg) {
-                               err = -EINVAL;
-                               goto abort;
-                       }
-                       err = md_put_user (read_ahead[
-                               MAJOR(dev)], (long *) arg);
-                       goto done;
-               default:
-       }
-
-       /*
-        * Commands creating/starting a new array:
-        */
-
-       mddev = kdev_to_mddev(dev);
-
-       switch (cmd)
-       {
-               case SET_ARRAY_INFO:
-               case START_ARRAY:
-                       if (mddev) {
-                               printk("array md%d already exists!\n",
-                                                               mdidx(mddev));
-                               err = -EEXIST;
-                               goto abort;
-                       }
-               default:
-       }
-
-       switch (cmd)
-       {
-               case SET_ARRAY_INFO:
-                       mddev = alloc_mddev(dev);
-                       if (!mddev) {
-                               err = -ENOMEM;
-                               goto abort;
-                       }
-                       /*
-                        * alloc_mddev() should possibly self-lock.
-                        */
-                       err = lock_mddev(mddev);
-                       if (err) {
-                               printk("ioctl, reason %d, cmd %d\n",err, cmd);
-                               goto abort;
-                       }
-                       err = set_array_info(mddev, (void *)arg);
-                       goto done_unlock;
-
-               case START_ARRAY:
-                       /*
-                        * possibly make it lock the array ...
-                        */
-                       err = autostart_array((kdev_t)arg);
-                       if (err) {
-                               printk("autostart %s failed!\n",
-                                       partition_name((kdev_t)arg));
-                       }
-                       goto done;
-      
-               default:
-       }
-      
-       /*
-        * Commands querying/configuring an existing array:
-        */
-
-       if (!mddev) {
-               err = -ENODEV;
-               goto abort;
-       }
-       err = lock_mddev(mddev);
-       if (err) {
-               printk("ioctl lock interrupted, reason %d, cmd %d\n",err, cmd);
-               goto abort;
-       }
-
-       /*
-        * Commands even a read-only array can execute:
-        */
-       switch (cmd)
-       {
-               case GET_ARRAY_INFO:
-                       err = get_array_info(mddev, (void *)arg);
-                       goto done_unlock;
-
-               case GET_DISK_INFO:
-                       err = get_disk_info(mddev, (void *)arg);
-                       goto done_unlock;
-      
-               case RESTART_ARRAY_RW:
-                       err = restart_array(mddev);
-                       goto done_unlock;
+  int minor, err;
+  struct hd_geometry *loc = (struct hd_geometry *) arg;
 
-               case STOP_ARRAY:
-                       err = do_md_stop (mddev, 0);
-                       goto done_unlock;
-      
-               case STOP_ARRAY_RO:
-                       err = do_md_stop (mddev, 1);
-                       goto done_unlock;
-      
-       /*
-        * We have a problem here: there is no easy way to give a CHS
-        * virtual geometry. We currently pretend that we have a 2 heads
-        * 4 sectors (with a BIG number of cylinders...). This drives
-        * dosfs just mad... ;-)
-        */
-               case HDIO_GETGEO:
-                       if (!loc) {
-                               err = -EINVAL;
-                               goto abort_unlock;
-                       }
-                       err = md_put_user (2, (char *) &loc->heads);
-                       if (err)
-                               goto abort_unlock;
-                       err = md_put_user (4, (char *) &loc->sectors);
-                       if (err)
-                               goto abort_unlock;
-                       err = md_put_user (md_hd_struct[mdidx(mddev)].nr_sects/8,
-                                               (short *) &loc->cylinders);
-                       if (err)
-                               goto abort_unlock;
-                       err = md_put_user (md_hd_struct[minor].start_sect,
-                                               (long *) &loc->start);
-                       goto done_unlock;
-       }
+  if (!capable(CAP_SYS_ADMIN))
+    return -EACCES;
 
-       /*
-        * The remaining ioctls are changing the state of the
-        * superblock, so we do not allow read-only arrays
-        * here:
-        */
-       if (mddev->ro) {
-               err = -EROFS;
-               goto abort_unlock;
-       }
+  if (((minor=MINOR(inode->i_rdev)) & 0x80) &&
+      (minor & 0x7f) < MAX_PERSONALITY &&
+      pers[minor & 0x7f] &&
+      pers[minor & 0x7f]->ioctl)
+    return (pers[minor & 0x7f]->ioctl (inode, file, cmd, arg));
+  
+  if (minor >= MAX_MD_DEV)
+    return -EINVAL;
 
-       switch (cmd)
-       {
-               case CLEAR_ARRAY:
-                       err = clear_array(mddev);
-                       goto done_unlock;
-      
-               case ADD_NEW_DISK:
-                       err = add_new_disk(mddev, (void *)arg);
-                       goto done_unlock;
-      
-               case HOT_REMOVE_DISK:
-                       err = hot_remove_disk(mddev, (kdev_t)arg);
-                       goto done_unlock;
-      
-               case HOT_ADD_DISK:
-                       err = hot_add_disk(mddev, (kdev_t)arg);
-                       goto done_unlock;
-      
-               case SET_DISK_INFO:
-                       err = set_disk_info(mddev, (void *)arg);
-                       goto done_unlock;
-      
-               case WRITE_RAID_INFO:
-                       err = write_raid_info(mddev);
-                       goto done_unlock;
-      
-               case UNPROTECT_ARRAY:
-                       err = unprotect_array(mddev);
-                       goto done_unlock;
-      
-               case PROTECT_ARRAY:
-                       err = protect_array(mddev);
-                       goto done_unlock;
-      
-               case RUN_ARRAY:
-               {
-                       mdu_param_t param;
+  switch (cmd)
+  {
+    case REGISTER_DEV:
+      return do_md_add (minor, to_kdev_t ((dev_t) arg));
 
-                       err = md_copy_from_user(&param, (mdu_param_t *)arg,
-                                                        sizeof(param));
-                       if (err)
-                               goto abort_unlock;
+    case START_MD:
+      return do_md_run (minor, (int) arg);
 
-                       err = do_md_run (mddev);
-                       /*
-                        * we have to clean up the mess if
-                        * the array cannot be run for some
-                        * reason ...
-                        */
-                       if (err) {
-                               mddev->sb_dirty = 0;
-                               do_md_stop (mddev, 0);
-                       }
-                       goto done_unlock;
-               }
+    case STOP_MD:
+      return do_md_stop (minor, inode);
       
-               default:
-                       printk(KERN_WARNING "%s(pid %d) used obsolete MD ioctl, upgrade your software to use new ioctls.\n", current->comm, current->pid);
-                       err = -EINVAL;
-                       goto abort_unlock;
-       }
-
-done_unlock:
-abort_unlock:
-       if (mddev)
-               unlock_mddev(mddev);
-       else
-               printk("huh11?\n");
+    case BLKGETSIZE:   /* Return device size */
+    if  (!arg)  return -EINVAL;
+    err = put_user (md_hd_struct[MINOR(inode->i_rdev)].nr_sects, (long *) arg);
+    if (err)
+      return err;
+    break;
+
+    case BLKFLSBUF:
+    fsync_dev (inode->i_rdev);
+    invalidate_buffers (inode->i_rdev);
+    break;
+
+    case BLKRASET:
+    if (arg > 0xff)
+      return -EINVAL;
+    read_ahead[MAJOR(inode->i_rdev)] = arg;
+    return 0;
+    
+    case BLKRAGET:
+    if  (!arg)  return -EINVAL;
+    err = put_user (read_ahead[MAJOR(inode->i_rdev)], (long *) arg);
+    if (err)
+      return err;
+    break;
+
+    /* We have a problem here : there is no easy way to give a CHS
+       virtual geometry. We currently pretend that we have a 2 heads
+       4 sectors (with a BIG number of cylinders...). This drives dosfs
+       just mad... ;-) */
+    
+    case HDIO_GETGEO:
+    if (!loc)  return -EINVAL;
+    err = put_user (2, (char *) &loc->heads);
+    if (err)
+      return err;
+    err = put_user (4, (char *) &loc->sectors);
+    if (err)
+      return err;
+    err = put_user (md_hd_struct[minor].nr_sects/8, (short *) &loc->cylinders);
+    if (err)
+      return err;
+    err = put_user (md_hd_struct[MINOR(inode->i_rdev)].start_sect,
+               (long *) &loc->start);
+    if (err)
+      return err;
+    break;
+    
+    RO_IOCTLS(inode->i_rdev,arg);
+    
+    default:
+    return -EINVAL;
+  }
 
-       return err;
-done:
-       if (err)
-               printk("huh12?\n");
-abort:
-       return err;
+  return (0);
 }
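The HDIO_GETGEO branch above fakes a fixed 2-head, 4-sector geometry, so the cylinder count is simply the sector count divided by 8. A minimal userspace sketch of that arithmetic (the struct and function names are invented for illustration, not kernel API):

```c
#include <assert.h>

/* md's fake CHS geometry: 2 heads, 4 sectors per track, so
 * cylinders = total sectors / (2 * 4). */
struct fake_geo {
    unsigned heads;
    unsigned sectors;
    unsigned long cylinders;
};

static struct fake_geo md_fake_geometry(unsigned long nr_sects)
{
    struct fake_geo g = { 2, 4, nr_sects / 8 };
    return g;
}
```

As the original comment notes, this geometry is a pure fiction to satisfy callers that insist on CHS values; only the product heads * sectors * cylinders is meaningful.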
 
-
-#if LINUX_VERSION_CODE < LinuxVersionCode(2,1,0)
-
 static int md_open (struct inode *inode, struct file *file)
 {
-       /*
-        * Always succeed
-        */
-       return (0);
-}
-
-static void md_release (struct inode *inode, struct file *file)
-{
-       sync_dev(inode->i_rdev);
-}
-
-
-static int md_read (struct inode *inode, struct file *file,
-                                               char *buf, int count)
-{
-       mddev_t *mddev = kdev_to_mddev(MD_FILE_TO_INODE(file)->i_rdev);
-
-       if (!mddev || !mddev->pers)
-               return -ENXIO;
+  int minor=MINOR(inode->i_rdev);
 
-       return block_read (inode, file, buf, count);
+  md_dev[minor].busy++;
+  return (0);                  /* Always succeed */
 }
 
-static int md_write (struct inode *inode, struct file *file,
-                                               const char *buf, int count)
-{
-       mddev_t *mddev = kdev_to_mddev(MD_FILE_TO_INODE(file)->i_rdev);
-
-       if (!mddev || !mddev->pers)
-               return -ENXIO;
 
-       return block_write (inode, file, buf, count);
-}
-
-static struct file_operations md_fops=
+static int md_release (struct inode *inode, struct file *file)
 {
-       NULL,
-       md_read,
-       md_write,
-       NULL,
-       NULL,
-       md_ioctl,
-       NULL,
-       md_open,
-       md_release,
-       block_fsync
-};
-
-#else
+  int minor=MINOR(inode->i_rdev);
 
-static int md_open (struct inode *inode, struct file *file)
-{
-       /*
-        * Always succeed
-        */
-       return (0);
+  sync_dev (inode->i_rdev);
+  md_dev[minor].busy--;
+  return 0;
 }
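The `+` side of md_open/md_release above keeps a per-device busy count so that a stop request can refuse to tear down an array that is still open. A single-threaded sketch of that protocol (names invented; the real counter lives in `md_dev[minor]`):

```c
#include <assert.h>

/* Open increments the busy count, release decrements it; a device with
 * busy > 0 must not be stopped. */
struct busy_sketch {
    int busy;
};

static void sketch_open(struct busy_sketch *d)
{
    d->busy++;
}

static void sketch_release(struct busy_sketch *d)
{
    d->busy--;
}
```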
 
-static int md_release (struct inode *inode, struct file *file)
-{
-       sync_dev(inode->i_rdev);
-       return 0;
-}
 
 static ssize_t md_read (struct file *file, char *buf, size_t count,
                        loff_t *ppos)
 {
-       mddev_t *mddev = kdev_to_mddev(MD_FILE_TO_INODE(file)->i_rdev);
+  int minor=MINOR(file->f_dentry->d_inode->i_rdev);
 
-       if (!mddev || !mddev->pers)
-               return -ENXIO;
+  if (!md_dev[minor].pers)     /* Check if device is being run */
+    return -ENXIO;
 
-       return block_read(file, buf, count, ppos);
+  return block_read(file, buf, count, ppos);
 }
 
 static ssize_t md_write (struct file *file, const char *buf,
                         size_t count, loff_t *ppos)
 {
-       mddev_t *mddev = kdev_to_mddev(MD_FILE_TO_INODE(file)->i_rdev);
+  int minor=MINOR(file->f_dentry->d_inode->i_rdev);
 
-       if (!mddev || !mddev->pers)
-               return -ENXIO;
+  if (!md_dev[minor].pers)     /* Check if device is being run */
+    return -ENXIO;
 
-       return block_write(file, buf, count, ppos);
+  return block_write(file, buf, count, ppos);
 }
 
 static struct file_operations md_fops=
 {
-       NULL,
-       md_read,
-       md_write,
-       NULL,
-       NULL,
-       md_ioctl,
-       NULL,
-       md_open,
-       NULL,
-       md_release,
-       block_fsync
+  NULL,
+  md_read,
+  md_write,
+  NULL,
+  NULL,
+  md_ioctl,
+  NULL,
+  md_open,
+  NULL,
+  md_release,
+  block_fsync
 };
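The md_fops table above uses positional initialization, so each NULL and each function pointer must line up with the member order of the era's `struct file_operations`. With C99 designated initializers the same table becomes order-independent; the struct below is a cut-down stand-in for illustration, not the real 2.2 structure:

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-in for an operations table (member set is assumed). */
struct fops_sketch {
    long (*lseek)(void);
    long (*read)(void);
    long (*write)(void);
    long (*ioctl)(void);
    long (*open)(void);
    long (*release)(void);
};

static long stub_read(void)
{
    return 42;
}

/* Every member not named here defaults to NULL, regardless of order. */
static const struct fops_sketch sketch_fops = {
    .read = stub_read,
};
```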
 
-#endif
-
-int md_map (kdev_t dev, kdev_t *rdev,
-                        unsigned long *rsector, unsigned long size)
+int md_map (int minor, kdev_t *rdev, unsigned long *rsector, unsigned long size)
 {
-       int err;
-       mddev_t *mddev = kdev_to_mddev(dev);
-
-       if (!mddev || !mddev->pers) {
-               err = -ENXIO;
-               goto out;
-       }
+  if ((unsigned int) minor >= MAX_MD_DEV)
+  {
+    printk ("Bad md device %d\n", minor);
+    return (-1);
+  }
+  
+  if (!md_dev[minor].pers)
+  {
+    printk ("Oops ! md%d not running, giving up !\n", minor);
+    return (-1);
+  }
 
-       err = mddev->pers->map(mddev, dev, rdev, rsector, size);
-out:
-       return err;
+  return (md_dev[minor].pers->map(md_dev+minor, rdev, rsector, size));
 }
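md_map on the `+` side validates the minor number and the device state before forwarding to the personality's map routine. The cast to `unsigned int` rejects negative minors with a single comparison. A sketch of that guard pattern (`-1` stands in for the kernel's error paths, and the running-state array is invented):

```c
#include <assert.h>

#define SKETCH_MAX_DEV 4

/* Bounds-checked dispatch in the md_map style. */
static int map_sketch(int minor, const int *running)
{
    if ((unsigned int) minor >= SKETCH_MAX_DEV)
        return -1;        /* "Bad md device" */
    if (!running[minor])
        return -1;        /* "md%d not running" */
    return 0;             /* would forward to pers->map() here */
}
```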
   
-int md_make_request (struct buffer_head * bh, int rw)
+int md_make_request (int minor, int rw, struct buffer_head * bh)
 {
-       int err;
-       mddev_t *mddev = kdev_to_mddev(bh->b_dev);
-
-       if (!mddev || !mddev->pers) {
-               err = -ENXIO;
-               goto out;
-       }
-
-       if (mddev->pers->make_request) {
-               if (buffer_locked(bh)) {
-                       err = 0;
-                       goto out;
-               }
+       if (md_dev [minor].pers->make_request) {
+               if (buffer_locked(bh))
+                       return 0;
                set_bit(BH_Lock, &bh->b_state);
                if (rw == WRITE || rw == WRITEA) {
                        if (!buffer_dirty(bh)) {
-                               bh->b_end_io(bh, buffer_uptodate(bh));
-                               err = 0;
-                               goto out;
+                               bh->b_end_io(bh, test_bit(BH_Uptodate, &bh->b_state));
+                               return 0;
                        }
                }
                if (rw == READ || rw == READA) {
                        if (buffer_uptodate(bh)) {
-                               bh->b_end_io(bh, buffer_uptodate(bh));
-                               err = 0;
-                               goto out;
+                               bh->b_end_io(bh, test_bit(BH_Uptodate, &bh->b_state));
+                               return 0;
                        }
                }
-               err = mddev->pers->make_request(mddev, rw, bh);
+               return (md_dev[minor].pers->make_request(md_dev+minor, rw, bh));
        } else {
                make_request (MAJOR(bh->b_rdev), rw, bh);
-               err = 0;
+               return 0;
        }
-out:
-       return err;
 }
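md_make_request completes some requests without touching the disk: a WRITE of a buffer that is not dirty and a READ of a buffer that is already up to date both end immediately via `b_end_io`. The decision table can be sketched as a pure function (the enum values are illustrative, not the kernel's READ/WRITE constants):

```c
#include <assert.h>

enum { SK_READ, SK_WRITE };

/* Returns 1 only when the request must reach the underlying device. */
static int needs_real_io(int rw, int dirty, int uptodate)
{
    if (rw == SK_WRITE && !dirty)
        return 0;        /* clean buffer: nothing to write back */
    if (rw == SK_READ && uptodate)
        return 0;        /* cached data is current: nothing to read */
    return 1;
}
```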
 
 static void do_md_request (void)
 {
-       printk(KERN_ALERT "Got md request, not good...");
-       return;
-}
-
-int md_thread(void * arg)
-{
-       mdk_thread_t *thread = arg;
-
-       md_lock_kernel();
-       exit_mm(current);
-       exit_files(current);
-       exit_fs(current);
-
-       /*
-        * Detach thread
-        */
-       sys_setsid();
-       sprintf(current->comm, thread->name);
-       md_init_signals();
-       md_flush_signals();
-       thread->tsk = current;
-
-       /*
-        * md_thread is a 'system-thread', it's priority should be very
-        * high. We avoid resource deadlocks individually in each
-        * raid personality. (RAID5 does preallocation) We also use RR and
-        * the very same RT priority as kswapd, thus we will never get
-        * into a priority inversion deadlock.
-        *
-        * we definitely have to have equal or higher priority than
-        * bdflush, otherwise bdflush will deadlock if there are too
-        * many dirty RAID5 blocks.
-        */
-       current->policy = SCHED_OTHER;
-       current->priority = 40;
-
-       up(thread->sem);
-
-       for (;;) {
-               cli();
-               if (!test_bit(THREAD_WAKEUP, &thread->flags)) {
-                       if (!thread->run)
-                               break;
-                       interruptible_sleep_on(&thread->wqueue);
-               }
-               sti();
-               clear_bit(THREAD_WAKEUP, &thread->flags);
-               if (thread->run) {
-                       thread->run(thread->data);
-                       run_task_queue(&tq_disk);
-               }
-               if (md_signal_pending(current)) {
-                       printk("%8s(%d) flushing signals.\n", current->comm,
-                               current->pid);
-                       md_flush_signals();
-               }
-       }
-       sti();
-       up(thread->sem);
-       return 0;
+  printk ("Got md request, not good...");
+  return;
 }
 
-void md_wakeup_thread(mdk_thread_t *thread)
+void md_wakeup_thread(struct md_thread *thread)
 {
        set_bit(THREAD_WAKEUP, &thread->flags);
        wake_up(&thread->wqueue);
 }
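md_wakeup_thread sets THREAD_WAKEUP before waking the queue, and the thread clears the bit before doing work, so a wakeup that races with the sleep test in md_thread is not lost. A single-threaded sketch of just the flag protocol (the real kernel uses atomic bit operations; plain C here for illustration):

```c
#include <assert.h>

#define SK_THREAD_WAKEUP 0

static unsigned long sk_flags;

/* Waker side: raise the flag, then (in the kernel) wake the queue. */
static void sk_wakeup(void)
{
    sk_flags |= 1UL << SK_THREAD_WAKEUP;
}

/* Thread side: consume the flag, reporting whether it was pending. */
static int sk_take_wakeup(void)
{
    int pending = (sk_flags >> SK_THREAD_WAKEUP) & 1;
    sk_flags &= ~(1UL << SK_THREAD_WAKEUP);
    return pending;
}
```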
 
-mdk_thread_t *md_register_thread (void (*run) (void *),
-                                               void *data, const char *name)
+struct md_thread *md_register_thread (void (*run) (void *), void *data)
 {
-       mdk_thread_t *thread;
+       struct md_thread *thread = (struct md_thread *)
+               kmalloc(sizeof(struct md_thread), GFP_KERNEL);
        int ret;
        struct semaphore sem = MUTEX_LOCKED;
        
-       thread = (mdk_thread_t *) kmalloc
-                               (sizeof(mdk_thread_t), GFP_KERNEL);
-       if (!thread)
-               return NULL;
+       if (!thread) return NULL;
        
-       memset(thread, 0, sizeof(mdk_thread_t));
+       memset(thread, 0, sizeof(struct md_thread));
        init_waitqueue(&thread->wqueue);
        
        thread->sem = &sem;
        thread->run = run;
        thread->data = data;
-       thread->name = name;
        ret = kernel_thread(md_thread, thread, 0);
        if (ret < 0) {
                kfree(thread);
@@ -2997,405 +836,270 @@ mdk_thread_t *md_register_thread (void (*run) (void *),
        return thread;
 }
 
-void md_interrupt_thread (mdk_thread_t *thread)
-{
-       if (!thread->tsk) {
-               MD_BUG();
-               return;
-       }
-       printk("interrupting MD-thread pid %d\n", thread->tsk->pid);
-       send_sig(SIGKILL, thread->tsk, 1);
-}
-
-void md_unregister_thread (mdk_thread_t *thread)
+void md_unregister_thread (struct md_thread *thread)
 {
        struct semaphore sem = MUTEX_LOCKED;
        
        thread->sem = &sem;
        thread->run = NULL;
-       thread->name = NULL;
-       if (!thread->tsk) {
-               MD_BUG();
-               return;
-       }
-       md_interrupt_thread(thread);
+       if (thread->tsk)
+               printk("Killing md_thread %d %p %s\n",
+                      thread->tsk->pid, thread->tsk, thread->tsk->comm);
+       else
+               printk("Aiee. md_thread has 0 tsk\n");
+       send_sig(SIGKILL, thread->tsk, 1);
+       printk("downing on %p\n", &sem);
        down(&sem);
 }
 
-void md_recover_arrays (void)
-{
-       if (!md_recovery_thread) {
-               MD_BUG();
-               return;
-       }
-       md_wakeup_thread(md_recovery_thread);
-}
-
-
-int md_error (kdev_t dev, kdev_t rdev)
-{
-       mddev_t *mddev = kdev_to_mddev(dev);
-       mdk_rdev_t * rrdev;
-       int rc;
-
-       if (!mddev) {
-               MD_BUG();
-               return 0;
-       }
-       rrdev = find_rdev(mddev, rdev);
-       mark_rdev_faulty(rrdev);
-       /*
-        * if recovery was running, stop it now.
-        */
-       if (mddev->pers->stop_resync)
-               mddev->pers->stop_resync(mddev);
-       if (mddev->pers->error_handler) {
-               rc = mddev->pers->error_handler(mddev, rdev);
-               md_recover_arrays();
-               return rc;
-       }
-#if 0
-       /*
-        * Drop all buffers in the failed array.
-        * _not_. This is called from IRQ handlers ...
-        */
-       invalidate_buffers(rdev);
-#endif
-       return 0;
-}
+#define SHUTDOWN_SIGS   (sigmask(SIGKILL)|sigmask(SIGINT)|sigmask(SIGTERM))
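SHUTDOWN_SIGS above ORs together `sigmask()` bits; in the 2.2 kernel `sigmask(sig)` expands to `(1UL << ((sig) - 1))`. A userspace sketch of the same mask test, using the standard signal numbers:

```c
#include <assert.h>
#include <signal.h>

/* Mirror of the kernel's sigmask() macro. */
#define SK_SIGMASK(s) (1UL << ((s) - 1))

static const unsigned long sk_shutdown_sigs =
    SK_SIGMASK(SIGKILL) | SK_SIGMASK(SIGINT) | SK_SIGMASK(SIGTERM);

static int sk_is_shutdown_sig(int sig)
{
    return (sk_shutdown_sigs & SK_SIGMASK(sig)) != 0;
}
```

md_thread blocks everything except this set with `siginitsetinv`, so only the shutdown signals can interrupt its sleep.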
 
-static int status_unused (char * page)
+int md_thread(void * arg)
 {
-       int sz = 0, i = 0;
-       mdk_rdev_t *rdev;
-       struct md_list_head *tmp;
+       struct md_thread *thread = arg;
 
-       sz += sprintf(page + sz, "unused devices: ");
+       lock_kernel();
+       exit_mm(current);
+       exit_files(current);
+       exit_fs(current);
+       
+       current->session = 1;
+       current->pgrp = 1;
+       sprintf(current->comm, "md_thread");
+       siginitsetinv(&current->blocked, SHUTDOWN_SIGS);
+       thread->tsk = current;
+       up(thread->sem);
 
-       ITERATE_RDEV_ALL(rdev,tmp) {
-               if (!rdev->same_set.next && !rdev->same_set.prev) {
-                       /*
-                        * The device is not yet used by any array.
-                        */
-                       i++;
-                       sz += sprintf(page + sz, "%s ",
-                               partition_name(rdev->dev));
+       for (;;) {
+               cli();
+               if (!test_bit(THREAD_WAKEUP, &thread->flags)) {
+                       do {
+                               spin_lock(&current->sigmask_lock);
+                               flush_signals(current);
+                               spin_unlock(&current->sigmask_lock);
+                               interruptible_sleep_on(&thread->wqueue);
+                               cli();
+                               if (test_bit(THREAD_WAKEUP, &thread->flags))
+                                       break;
+                               if (!thread->run) {
+                                       sti();
+                                       up(thread->sem);
+                                       return 0;
+                               }
+                       } while (signal_pending(current));
+               }
+               sti();
+               clear_bit(THREAD_WAKEUP, &thread->flags);
+               if (thread->run) {
+                       thread->run(thread->data);
+                       run_task_queue(&tq_disk);
                }
        }
-       if (!i)
-               sz += sprintf(page + sz, "<none>");
-
-       sz += sprintf(page + sz, "\n");
-       return sz;
 }
 
+EXPORT_SYMBOL(md_size);
+EXPORT_SYMBOL(md_maxreadahead);
+EXPORT_SYMBOL(register_md_personality);
+EXPORT_SYMBOL(unregister_md_personality);
+EXPORT_SYMBOL(partition_name);
+EXPORT_SYMBOL(md_dev);
+EXPORT_SYMBOL(md_error);
+EXPORT_SYMBOL(md_register_thread);
+EXPORT_SYMBOL(md_unregister_thread);
+EXPORT_SYMBOL(md_update_sb);
+EXPORT_SYMBOL(md_map);
+EXPORT_SYMBOL(md_wakeup_thread);
+EXPORT_SYMBOL(md_do_sync);
 
-static int status_resync (char * page, mddev_t * mddev)
-{
-       int sz = 0;
-       unsigned int blocksize, max_blocks, resync, res, dt, tt, et;
+#ifdef CONFIG_PROC_FS
+static struct proc_dir_entry proc_md = {
+       PROC_MD, 6, "mdstat",
+       S_IFREG | S_IRUGO, 1, 0, 0,
+       0, &proc_array_inode_operations,
+};
+#endif
 
-       resync = mddev->curr_resync;
-       blocksize = blksize_size[MD_MAJOR][mdidx(mddev)];
-       max_blocks = blk_size[MD_MAJOR][mdidx(mddev)] / (blocksize >> 10);
+static void md_geninit (struct gendisk *gdisk)
+{
+  int i;
+  
+  for(i=0;i<MAX_MD_DEV;i++)
+  {
+    md_blocksizes[i] = 1024;
+    md_maxreadahead[i] = MD_DEFAULT_DISK_READAHEAD;
+    md_gendisk.part[i].start_sect=-1; /* avoid partition check */
+    md_gendisk.part[i].nr_sects=0;
+    md_dev[i].pers=NULL;
+  }
+
+  blksize_size[MD_MAJOR] = md_blocksizes;
+  max_readahead[MD_MAJOR] = md_maxreadahead;
 
-       /*
-        * Should not happen.
-        */             
-       if (!max_blocks) {
-               MD_BUG();
-               return 0;
-       }
-       res = resync*100/max_blocks;
-       if (!mddev->recovery_running)
-               /*
-                * true resync
-                */
-               sz += sprintf(page + sz, " resync=%u%%", res);
-       else
-               /*
-                * recovery ...
-                */
-               sz += sprintf(page + sz, " recovery=%u%%", res);
+#ifdef CONFIG_PROC_FS
+  proc_register(&proc_root, &proc_md);
+#endif
+}
 
-       /*
-        * We do not want to overflow, so the order of operands and
-        * the * 100 / 100 trick are important. We do a +1 to be
-        * safe against division by zero. We only estimate anyway.
-        *
-        * dt: time until now
-        * tt: total time
-        * et: estimated finish time
-        */
-       dt = ((jiffies - mddev->resync_start) / HZ);
-       tt = (dt * (max_blocks / (resync/100+1)))/100;
-       if (tt > dt)
-               et = tt - dt;
-       else
-               /*
-                * ignore rounding effects near finish time
-                */
-               et = 0;
-       
-       sz += sprintf(page + sz, " finish=%u.%umin", et / 60, (et % 60)/6);
+int md_error (kdev_t mddev, kdev_t rdev)
+{
+    unsigned int minor = MINOR (mddev);
+    int rc;
 
-       return sz;
+    if (MAJOR(mddev) != MD_MAJOR || minor > MAX_MD_DEV)
+       panic ("md_error gets unknown device\n");
+    if (!md_dev [minor].pers)
+       panic ("md_error gets an error for an unknown device\n");
+    if (md_dev [minor].pers->error_handler) {
+       rc = md_dev [minor].pers->error_handler (md_dev+minor, rdev);
+#if SUPPORT_RECONSTRUCTION
+       md_wakeup_thread(md_sync_thread);
+#endif /* SUPPORT_RECONSTRUCTION */
+       return rc;
+    }
+    return 0;
 }
 
 int get_md_status (char *page)
 {
-       int sz = 0, j, size;
-       struct md_list_head *tmp, *tmp2;
-       mdk_rdev_t *rdev;
-       mddev_t *mddev;
-
-       sz += sprintf(page + sz, "Personalities : ");
-       for (j = 0; j < MAX_PERSONALITY; j++)
-       if (pers[j])
-               sz += sprintf(page+sz, "[%s] ", pers[j]->name);
+  int sz=0, i, j, size;
 
-       sz += sprintf(page+sz, "\n");
+  sz+=sprintf( page+sz, "Personalities : ");
+  for (i=0; i<MAX_PERSONALITY; i++)
+    if (pers[i])
+      sz+=sprintf (page+sz, "[%d %s] ", i, pers[i]->name);
 
+  page[sz-1]='\n';
 
-       sz += sprintf(page+sz, "read_ahead ");
-       if (read_ahead[MD_MAJOR] == INT_MAX)
-               sz += sprintf(page+sz, "not set\n");
-       else
-               sz += sprintf(page+sz, "%d sectors\n", read_ahead[MD_MAJOR]);
+  sz+=sprintf (page+sz, "read_ahead ");
+  if (read_ahead[MD_MAJOR]==INT_MAX)
+    sz+=sprintf (page+sz, "not set\n");
+  else
+    sz+=sprintf (page+sz, "%d sectors\n", read_ahead[MD_MAJOR]);
   
-       ITERATE_MDDEV(mddev,tmp) {
-               sz += sprintf(page + sz, "md%d : %sactive", mdidx(mddev),
-                                               mddev->pers ? "" : "in");
-               if (mddev->pers) {
-                       if (mddev->ro)  
-                               sz += sprintf(page + sz, " (read-only)");
-                       sz += sprintf(page + sz, " %s", mddev->pers->name);
-               }
+  for (i=0; i<MAX_MD_DEV; i++)
+  {
+    sz+=sprintf (page+sz, "md%d : %sactive", i, md_dev[i].pers ? "" : "in");
 
-               size = 0;
-               ITERATE_RDEV(mddev,rdev,tmp2) {
-                       sz += sprintf(page + sz, " %s[%d]",
-                               partition_name(rdev->dev), rdev->desc_nr);
-                       if (rdev->faulty) {
-                               sz += sprintf(page + sz, "(F)");
-                               continue;
-                       }
-                       size += rdev->size;
-               }
+    if (md_dev[i].pers)
+      sz+=sprintf (page+sz, " %s", md_dev[i].pers->name);
 
-               if (mddev->nb_dev) {
-                       if (mddev->pers)
-                               sz += sprintf(page + sz, " %d blocks",
-                                                md_size[mdidx(mddev)]);
-                       else
-                               sz += sprintf(page + sz, " %d blocks", size);
-               }
+    size=0;
+    for (j=0; j<md_dev[i].nb_dev; j++)
+    {
+      sz+=sprintf (page+sz, " %s",
+                  partition_name(md_dev[i].devices[j].dev));
+      size+=md_dev[i].devices[j].size;
+    }
 
-               if (!mddev->pers) {
-                       sz += sprintf(page+sz, "\n");
-                       continue;
-               }
+    if (md_dev[i].nb_dev) {
+      if (md_dev[i].pers)
+        sz+=sprintf (page+sz, " %d blocks", md_size[i]);
+      else
+        sz+=sprintf (page+sz, " %d blocks", size);
+    }
 
-               sz += mddev->pers->status (page+sz, mddev);
+    if (!md_dev[i].pers)
+    {
+      sz+=sprintf (page+sz, "\n");
+      continue;
+    }
 
-               if (mddev->curr_resync)
-                       sz += status_resync (page+sz, mddev);
-               else {
-                       if (md_atomic_read(&mddev->resync_sem.count) != 1)
-                               sz += sprintf(page + sz, " resync=DELAYED");
-               }
-               sz += sprintf(page + sz, "\n");
-       }
-       sz += status_unused (page + sz);
+    if (md_dev[i].pers->max_invalid_dev)
+      sz+=sprintf (page+sz, " maxfault=%ld", MAX_FAULT(md_dev+i));
 
-       return (sz);
+    sz+=md_dev[i].pers->status (page+sz, i, md_dev+i);
+    sz+=sprintf (page+sz, "\n");
+  }
+
+  return (sz);
 }
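get_md_status builds its /proc output by accumulating `sprintf` return values into a running offset, `sz += sprintf(page + sz, ...)`. A stripped-down line builder in the same style (the fixed "md0" label and argument list are simplifications for illustration):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Append one mdstat-style status line into page, returning its length. */
static int sk_status_line(char *page, int active, const char *pers_name)
{
    int sz = 0;

    sz += sprintf(page + sz, "md0 : %sactive", active ? "" : "in");
    if (active)
        sz += sprintf(page + sz, " %s", pers_name);
    sz += sprintf(page + sz, "\n");
    return sz;
}
```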
 
-int register_md_personality (int pnum, mdk_personality_t *p)
+int register_md_personality (int p_num, struct md_personality *p)
 {
-       if (pnum >= MAX_PERSONALITY)
-               return -EINVAL;
+  int i=(p_num >> PERSONALITY_SHIFT);
 
-       if (pers[pnum])
-               return -EBUSY;
+  if (i >= MAX_PERSONALITY)
+    return -EINVAL;
+
+  if (pers[i])
+    return -EBUSY;
   
-       pers[pnum] = p;
-       printk(KERN_INFO "%s personality registered\n", p->name);
-       return 0;
+  pers[i]=p;
+  printk ("%s personality registered\n", p->name);
+  return 0;
 }
 
-int unregister_md_personality (int pnum)
+int unregister_md_personality (int p_num)
 {
-       if (pnum >= MAX_PERSONALITY)
-               return -EINVAL;
+  int i=(p_num >> PERSONALITY_SHIFT);
 
-       printk(KERN_INFO "%s personality unregistered\n", pers[pnum]->name);
-       pers[pnum] = NULL;
-       return 0;
+  if (i >= MAX_PERSONALITY)
+    return -EINVAL;
+
+  printk ("%s personality unregistered\n", pers[i]->name);
+  pers[i]=NULL;
+  return 0;
 } 
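register_md_personality on the `+` side derives the pers[] slot from the personality number with a right shift and bounds-checks it before use. A sketch of that decoding; the shift width of 16 below is an assumed value for illustration only, the real PERSONALITY_SHIFT is defined in the md headers:

```c
#include <assert.h>

#define SK_PERSONALITY_SHIFT 16   /* assumed value, for illustration */
#define SK_MAX_PERSONALITY 8

/* Map a personality number to a table index, or -1 (an -EINVAL
 * stand-in) when it falls outside the table. */
static int sk_personality_index(int p_num)
{
    int i = p_num >> SK_PERSONALITY_SHIFT;

    return (i >= SK_MAX_PERSONALITY) ? -1 : i;
}
```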
 
-static mdp_disk_t *get_spare(mddev_t *mddev)
+static md_descriptor_t *get_spare(struct md_dev *mddev)
 {
-       mdp_super_t *sb = mddev->sb;
-       mdp_disk_t *disk;
-       mdk_rdev_t *rdev;
-       struct md_list_head *tmp;
-
-       ITERATE_RDEV(mddev,rdev,tmp) {
-               if (rdev->faulty)
-                       continue;
-               if (!rdev->sb) {
-                       MD_BUG();
+       int i;
+       md_superblock_t *sb = mddev->sb;
+       md_descriptor_t *descriptor;
+       struct real_dev *realdev;
+       
+       for (i = 0; i < mddev->nb_dev; i++) {
+               realdev = &mddev->devices[i];
+               if (!realdev->sb)
                        continue;
-               }
-               disk = &sb->disks[rdev->desc_nr];
-               if (disk_faulty(disk)) {
-                       MD_BUG();
+               descriptor = &sb->disks[realdev->sb->descriptor.number];
+               if (descriptor->state & (1 << MD_FAULTY_DEVICE))
                        continue;
-               }
-               if (disk_active(disk))
+               if (descriptor->state & (1 << MD_ACTIVE_DEVICE))
                        continue;
-               return disk;
+               return descriptor;
        }
        return NULL;
 }
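get_spare scans the superblock descriptors for a disk that is neither faulty nor active, i.e. a spare. The selection logic can be sketched over an array of state words; the bit numbers below are illustrative stand-ins for MD_FAULTY_DEVICE and MD_ACTIVE_DEVICE:

```c
#include <assert.h>

#define SK_FAULTY 0   /* stand-in for MD_FAULTY_DEVICE */
#define SK_ACTIVE 1   /* stand-in for MD_ACTIVE_DEVICE */

/* Return the index of the first spare (neither faulty nor active),
 * or -1 when none exists. */
static int sk_find_spare(const unsigned *state, int n)
{
    int i;

    for (i = 0; i < n; i++) {
        if (state[i] & (1u << SK_FAULTY))
            continue;
        if (state[i] & (1u << SK_ACTIVE))
            continue;
        return i;
    }
    return -1;
}
```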
 
-static int is_mddev_idle (mddev_t *mddev)
-{
-       mdk_rdev_t * rdev;
-       struct md_list_head *tmp;
-       int idle;
-       unsigned long curr_events;
-
-       idle = 1;
-       ITERATE_RDEV(mddev,rdev,tmp) {
-               curr_events = io_events[MAJOR(rdev->dev)];
-
-               if (curr_events != rdev->last_events) {
-//                     printk("!I(%d)", curr_events-rdev->last_events);
-                       rdev->last_events = curr_events;
-                       idle = 0;
-               }
-       }
-       return idle;
-}
-
 /*
  * parallel resyncing thread. 
+ *
+ * FIXME: - make it abort with a dirty array on mdstop, now it just blocks
+ *        - fix read error handing
  */
 
-/*
- * Determine correct block size for this device.
- */
-unsigned int device_bsize (kdev_t dev)
-{
-       unsigned int i, correct_size;
-
-       correct_size = BLOCK_SIZE;
-       if (blksize_size[MAJOR(dev)]) {
-               i = blksize_size[MAJOR(dev)][MINOR(dev)];
-               if (i)
-                       correct_size = i;
-       }
-
-       return correct_size;
-}
-
-static struct wait_queue *resync_wait = (struct wait_queue *)NULL;
-
-#define RA_ORDER (1)
-#define RA_PAGE_SIZE (PAGE_SIZE*(1<<RA_ORDER))
-#define MAX_NR_BLOCKS (RA_PAGE_SIZE/sizeof(struct buffer_head *))
-
-int md_do_sync(mddev_t *mddev, mdp_disk_t *spare)
+int md_do_sync(struct md_dev *mddev)
 {
-       mddev_t *mddev2;
-        struct buffer_head **bh;
-       unsigned int max_blocks, blocksize, curr_bsize,
-               i, ii, j, k, chunk, window, nr_blocks, err, serialize;
-       kdev_t read_disk = mddev_to_kdev(mddev);
+        struct buffer_head *bh;
+       int max_blocks, blocksize, curr_bsize, percent=1, j;
+       kdev_t read_disk = MKDEV(MD_MAJOR, mddev - md_dev);
        int major = MAJOR(read_disk), minor = MINOR(read_disk);
        unsigned long starttime;
-       int max_read_errors = 2*MAX_NR_BLOCKS,
-                max_write_errors = 2*MAX_NR_BLOCKS;
-       struct md_list_head *tmp;
-
-retry_alloc:
-       bh = (struct buffer_head **) md__get_free_pages(GFP_KERNEL, RA_ORDER);
-       if (!bh) {
-               printk(KERN_ERR
-               "could not alloc bh array for reconstruction ... retrying!\n");
-               goto retry_alloc;
-       }
-
-       err = down_interruptible(&mddev->resync_sem);
-       if (err)
-               goto out_nolock;
-
-recheck:
-       serialize = 0;
-       ITERATE_MDDEV(mddev2,tmp) {
-               if (mddev2 == mddev)
-                       continue;
-               if (mddev2->curr_resync && match_mddev_units(mddev,mddev2)) {
-                       printk(KERN_INFO "md: serializing resync, md%d has overlapping physical units with md%d!\n", mdidx(mddev), mdidx(mddev2));
-                       serialize = 1;
-                       break;
-               }
-       }
-       if (serialize) {
-               interruptible_sleep_on(&resync_wait);
-               if (md_signal_pending(current)) {
-                       md_flush_signals();
-                       err = -EINTR;
-                       goto out;
-               }
-               goto recheck;
-       }
 
-       mddev->curr_resync = 1;
-
-       blocksize = device_bsize(read_disk);
+       blocksize = blksize_size[major][minor];
        max_blocks = blk_size[major][minor] / (blocksize >> 10);
 
-       printk(KERN_INFO "md: syncing RAID array md%d\n", mdidx(mddev));
-       printk(KERN_INFO "md: minimum _guaranteed_ reconstruction speed: %d KB/sec.\n",
-                                               sysctl_speed_limit);
-       printk(KERN_INFO "md: using maximum available idle IO bandwith for reconstruction.\n");
-
-       /*
-        * Resync has low priority.
-        */
-       current->priority = 1;
-
-       is_mddev_idle(mddev); /* this also initializes IO event counters */
-       starttime = jiffies;
-       mddev->resync_start = starttime;
+       printk("... resync log\n");
+       printk(" ....   mddev->nb_dev: %d\n", mddev->nb_dev);
+       printk(" ....   raid array: %s\n", kdevname(read_disk));
+       printk(" ....   max_blocks: %d blocksize: %d\n", max_blocks, blocksize);
+       printk("md: syncing RAID array %s\n", kdevname(read_disk));
 
-       /*
-        * Tune reconstruction:
-        */
-       window = md_maxreadahead[mdidx(mddev)]/1024;
-       nr_blocks = window / (blocksize >> 10);
-       if (!nr_blocks || (nr_blocks > MAX_NR_BLOCKS))
-               nr_blocks = MAX_NR_BLOCKS;
-       printk(KERN_INFO "md: using %dk window.\n",window);
+       mddev->busy++;
 
-       for (j = 0; j < max_blocks; j += nr_blocks) {
+       starttime=jiffies;
+       for (j = 0; j < max_blocks; j++) {
 
-               if (j)
-                       mddev->curr_resync = j;
                /*
                 * Be careful. When someone mounts a non-'blocksize' filesystem
                 * then we get the blocksize changed right under us. Go deal
                 * with it transparently, recalculate 'blocksize', 'j' and
                 * 'max_blocks':
                 */
-               curr_bsize = device_bsize(read_disk);
+               curr_bsize = blksize_size[major][minor];
                if (curr_bsize != blocksize) {
-                       printk(KERN_INFO "md%d: blocksize changed\n",
-                                                               mdidx(mddev));
-retry_read:
+               diff_blocksize:
                        if (curr_bsize > blocksize)
                                /*
                                 * this is safe, rounds downwards.
@@ -3405,384 +1109,114 @@ retry_read:
                                j *= blocksize/curr_bsize;
 
                        blocksize = curr_bsize;
-                       nr_blocks = window / (blocksize >> 10);
-                       if (!nr_blocks || (nr_blocks > MAX_NR_BLOCKS))
-                               nr_blocks = MAX_NR_BLOCKS;
                        max_blocks = blk_size[major][minor] / (blocksize >> 10);
-                       printk("nr_blocks changed to %d (blocksize %d, j %d, max_blocks %d)\n",
-                                       nr_blocks, blocksize, j, max_blocks);
+               }
+               if ((bh = breada (read_disk, j, blocksize, j * blocksize,
+                                       max_blocks * blocksize)) != NULL) {
+                       mark_buffer_dirty(bh, 1);
+                       brelse(bh);
+               } else {
                        /*
-                        * We will retry the current block-group
+                        * FIXME: Ugly, but set_blocksize() isn't safe ...
                         */
-               }
-
-               /*
-                * Cleanup routines expect this
-                */
-               for (k = 0; k < nr_blocks; k++)
-                       bh[k] = NULL;
-
-               chunk = nr_blocks;
-               if (chunk > max_blocks-j)
-                       chunk = max_blocks-j;
+                       curr_bsize = blksize_size[major][minor];
+                       if (curr_bsize != blocksize)
+                               goto diff_blocksize;
 
-               /*
-                * request buffer heads ...
-                */
-               for (i = 0; i < chunk; i++) {
-                       bh[i] = getblk (read_disk, j+i, blocksize);
-                       if (!bh[i])
-                               goto read_error;
-                       if (!buffer_dirty(bh[i]))
-                               mark_buffer_lowprio(bh[i]);
+                       /*
+                        * It's a real read problem. FIXME, handle this
+                        * in a better way.
+                        */
+                       printk ( KERN_ALERT
+                                "read error, stopping reconstruction.\n");
+                       mddev->busy--;
+                       return 1;
                }
 
                /*
-                * read buffer heads ...
-                */
-               ll_rw_block (READ, chunk, bh);
-               run_task_queue(&tq_disk);
-               
-               /*
-                * verify that all of them are OK ...
+                * Let's sleep some if we are faster than our speed limit:
                 */
-               for (i = 0; i < chunk; i++) {
-                       ii = chunk-i-1;
-                       wait_on_buffer(bh[ii]);
-                       if (!buffer_uptodate(bh[ii]))
-                               goto read_error;
-               }
-
-retry_write:
-               for (i = 0; i < chunk; i++)
-                       mark_buffer_dirty_lowprio(bh[i]);
-
-               ll_rw_block(WRITE, chunk, bh);
-               run_task_queue(&tq_disk);
-
-               for (i = 0; i < chunk; i++) {
-                       ii = chunk-i-1;
-                       wait_on_buffer(bh[ii]);
-
-                       if (spare && disk_faulty(spare)) {
-                               for (k = 0; k < chunk; k++)
-                                       brelse(bh[k]);
-                               printk(" <SPARE FAILED!>\n ");
-                               err = -EIO;
-                               goto out;
-                       }
-
-                       if (!buffer_uptodate(bh[ii])) {
-                               curr_bsize = device_bsize(read_disk);
-                               if (curr_bsize != blocksize) {
-                                       printk(KERN_INFO
-                                               "md%d: blocksize changed during write\n",
-                                               mdidx(mddev));
-                                       for (k = 0; k < chunk; k++)
-                                               if (bh[k]) {
-                                                       if (buffer_lowprio(bh[k]))
-                                                               mark_buffer_clean(bh[k]);
-                                                       brelse(bh[k]);
-                                               }
-                                       goto retry_read;
-                               }
-                               printk(" BAD WRITE %8d>\n", j);
-                               /*
-                                * Ouch, write error, retry or bail out.
-                                */
-                               if (max_write_errors) {
-                                       max_write_errors--;
-                                       printk ( KERN_WARNING "md%d: write error while reconstructing, at block %u(%d).\n", mdidx(mddev), j, blocksize);
-                                       goto retry_write;
-                               }
-                               printk ( KERN_ALERT
-                                 "too many write errors, stopping reconstruction.\n");
-                               for (k = 0; k < chunk; k++)
-                                       if (bh[k]) {
-                                               if (buffer_lowprio(bh[k]))
-                                                       mark_buffer_clean(bh[k]);
-                                               brelse(bh[k]);
-                                       }
-                               err = -EIO;
-                               goto out;
-                       }
+               while (blocksize*j/(jiffies-starttime+1)*HZ/1024 > SPEED_LIMIT)
+               {
+                       current->state = TASK_INTERRUPTIBLE;
+                       schedule_timeout(1);
                }
 
                /*
-                * This is the normal 'everything went OK' case
-                * do a 'free-behind' logic, we sure dont need
-                * this buffer if it was the only user.
+                * FIXME: put this status bar thing into /proc
                 */
-               for (i = 0; i < chunk; i++)
-                       cache_drop_behind(bh[i]);
-
-
-               if (md_signal_pending(current)) {
-                       /*
-                        * got a signal, exit.
-                        */
-                       mddev->curr_resync = 0;
-                       printk("md_do_sync() got signal ... exiting\n");
-                       md_flush_signals();
-                       err = -EINTR;
-                       goto out;
+               if (!(j%(max_blocks/100))) {
+                       if (!(percent%10))
+                               printk (" %03d%% done.\n",percent);
+                       else
+                               printk (".");
+                       percent++;
                }
-
-               /*
-                * this loop exits only if either when we are slower than
-                * the 'hard' speed limit, or the system was IO-idle for
-                * a jiffy.
-                * the system might be non-idle CPU-wise, but we only care
-                * about not overloading the IO subsystem. (things like an
-                * e2fsck being done on the RAID array should execute fast)
-                */
-repeat:
-               if (md_need_resched(current))
-                       schedule();
-
-               if ((blocksize/1024)*j/((jiffies-starttime)/HZ + 1) + 1
-                                               > sysctl_speed_limit) {
-                       current->priority = 1;
-
-                       if (!is_mddev_idle(mddev)) {
-                               current->state = TASK_INTERRUPTIBLE;
-                               md_schedule_timeout(HZ/2);
-                               if (!md_signal_pending(current))
-                                       goto repeat;
-                       }
-               } else
-                       current->priority = 40;
        }
        fsync_dev(read_disk);
-       printk(KERN_INFO "md: md%d: sync done.\n",mdidx(mddev));
-       err = 0;
-       /*
-        * this also signals 'finished resyncing' to md_stop
-        */
-out:
-       up(&mddev->resync_sem);
-out_nolock:
-       free_pages((unsigned long)bh, RA_ORDER);
-       mddev->curr_resync = 0;
-       wake_up(&resync_wait);
-       return err;
-
-read_error:
-       /*
-        * set_blocksize() might change the blocksize. This
-        * should not happen often, but it happens when eg.
-        * someone mounts a filesystem that has non-1k
-        * blocksize. set_blocksize() doesnt touch our
-        * buffer, but to avoid aliasing problems we change
-        * our internal blocksize too and retry the read.
-        */
-       curr_bsize = device_bsize(read_disk);
-       if (curr_bsize != blocksize) {
-               printk(KERN_INFO "md%d: blocksize changed during read\n",
-                       mdidx(mddev));
-               for (k = 0; k < chunk; k++)
-                       if (bh[k]) {
-                               if (buffer_lowprio(bh[k]))
-                                       mark_buffer_clean(bh[k]);
-                               brelse(bh[k]);
-                       }
-               goto retry_read;
-       }
-
-       /*
-        * It's a real read problem. We retry and bail out
-        * only if it's excessive.
-        */
-       if (max_read_errors) {
-               max_read_errors--;
-               printk ( KERN_WARNING "md%d: read error while reconstructing, at block %u(%d).\n", mdidx(mddev), j, blocksize);
-               for (k = 0; k < chunk; k++)
-                       if (bh[k]) {
-                               if (buffer_lowprio(bh[k]))
-                                       mark_buffer_clean(bh[k]);
-                               brelse(bh[k]);
-                       }
-               goto retry_read;
-       }
-       printk ( KERN_ALERT "too many read errors, stopping reconstruction.\n");
-       for (k = 0; k < chunk; k++)
-               if (bh[k]) {
-                       if (buffer_lowprio(bh[k]))
-                               mark_buffer_clean(bh[k]);
-                       brelse(bh[k]);
-               }
-       err = -EIO;
-       goto out;
+       printk("md: %s: sync done.\n", kdevname(read_disk));
+       mddev->busy--;
+       return 0;
 }
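The throttle loop restored above (`while (blocksize*j/(jiffies-starttime+1)*HZ/1024 > SPEED_LIMIT)`) converts blocks-synced-per-jiffy into an achieved KB/s rate. A standalone sketch of that arithmetic; the `HZ` and `SPEED_LIMIT` values here are chosen only for illustration, and the function names are invented for the example:

```c
#include <assert.h>

#define HZ 100            /* illustrative clock-tick rate */
#define SPEED_LIMIT 1024  /* KB/s cap, value chosen for the example */

/* KB/s achieved after syncing j blocks of `blocksize` bytes in
 * `elapsed` jiffies; the +1 mirrors the kernel's divide-by-zero guard. */
static unsigned long resync_rate_kbs(unsigned long blocksize,
                                     unsigned long j,
                                     unsigned long elapsed)
{
        return blocksize * j / (elapsed + 1) * HZ / 1024;
}

/* The resync loop sleeps (schedule_timeout(1)) while this is nonzero. */
static int should_throttle(unsigned long blocksize, unsigned long j,
                           unsigned long elapsed)
{
        return resync_rate_kbs(blocksize, j, elapsed) > SPEED_LIMIT;
}
```

Note the integer division order: the rate is truncated at each step, so a resync running just at the limit (here, 1024 one-KB blocks in roughly a second) computes as 1023 KB/s and is not throttled.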
 
-#undef MAX_NR_BLOCKS
-
 /*
- * This is a kernel thread which syncs a spare disk with the active array
+ * This is a kernel thread which syncs a spare disk with the active array
  *
  * the amount of foolproofing might seem to be a tad excessive, but an
  * early (not so error-safe) version of raid1syncd synced the first 0.5 gigs
  * of my root partition with the first 0.5 gigs of my /home partition ... so
  * i'm a bit nervous ;)
  */
-void md_do_recovery (void *data)
+void mdsyncd (void *data)
 {
-       int err;
-       mddev_t *mddev;
-       mdp_super_t *sb;
-       mdp_disk_t *spare;
+       int i;
+       struct md_dev *mddev;
+       md_superblock_t *sb;
+       md_descriptor_t *spare;
        unsigned long flags;
-       struct md_list_head *tmp;
 
-       printk(KERN_INFO "md: recovery thread got woken up ...\n");
-restart:
-       ITERATE_MDDEV(mddev,tmp) {
+       for (i = 0, mddev = md_dev; i < MAX_MD_DEV; i++, mddev++) {
                if ((sb = mddev->sb) == NULL)
                        continue;
-               if (mddev->recovery_running)
-                       continue;
                if (sb->active_disks == sb->raid_disks)
                        continue;
-               if (!sb->spare_disks) {
-                       printk(KERN_ERR "md%d: no spare disk to reconstruct array! -- continuing in degraded mode\n", mdidx(mddev));
+               if (!sb->spare_disks)
                        continue;
-               }
-               /*
-                * now here we get the spare and resync it.
-                */
                if ((spare = get_spare(mddev)) == NULL)
                        continue;
-               printk(KERN_INFO "md%d: resyncing spare disk %s to replace failed disk\n", mdidx(mddev), partition_name(MKDEV(spare->major,spare->minor)));
-               if (!mddev->pers->diskop)
+               if (!mddev->pers->mark_spare)
                        continue;
-               if (mddev->pers->diskop(mddev, &spare, DISKOP_SPARE_WRITE))
+               if (mddev->pers->mark_spare(mddev, spare, SPARE_WRITE))
+                       continue;
+               if (md_do_sync(mddev) || (spare->state & (1 << MD_FAULTY_DEVICE))) {
+                       mddev->pers->mark_spare(mddev, spare, SPARE_INACTIVE);
                        continue;
-               down(&mddev->recovery_sem);
-               mddev->recovery_running = 1;
-               err = md_do_sync(mddev, spare);
-               if (err == -EIO) {
-                       printk(KERN_INFO "md%d: spare disk %s failed, skipping to next spare.\n", mdidx(mddev), partition_name(MKDEV(spare->major,spare->minor)));
-                       if (!disk_faulty(spare)) {
-                               mddev->pers->diskop(mddev,&spare,DISKOP_SPARE_INACTIVE);
-                               mark_disk_faulty(spare);
-                               mark_disk_nonsync(spare);
-                               mark_disk_inactive(spare);
-                               sb->spare_disks--;
-                               sb->working_disks--;
-                               sb->failed_disks++;
-                       }
-               } else
-                       if (disk_faulty(spare))
-                               mddev->pers->diskop(mddev, &spare,
-                                               DISKOP_SPARE_INACTIVE);
-               if (err == -EINTR) {
-                       /*
-                        * Recovery got interrupted ...
-                        * signal back that we have finished using the array.
-                        */
-                       mddev->pers->diskop(mddev, &spare,
-                                                        DISKOP_SPARE_INACTIVE);
-                       up(&mddev->recovery_sem);
-                       /*
-                        * we keep 'recovery_running == 1', so we will not
-                        * start a reconstruction next time around ...
-                        * the stop code will set it to 0 explicitly.
-                        */
-                       goto restart;
-               } else {
-                       mddev->recovery_running = 0;
-                       up(&mddev->recovery_sem);
                }
                save_flags(flags);
                cli();
-               if (!disk_faulty(spare)) {
-                       /*
-                        * the SPARE_ACTIVE diskop possibly changes the
-                        * pointer too
-                        */
-                       mddev->pers->diskop(mddev, &spare, DISKOP_SPARE_ACTIVE);
-                       mark_disk_sync(spare);
-                       mark_disk_active(spare);
-                       sb->active_disks++;
-                       sb->spare_disks--;
-               }
-               restore_flags(flags);
+               mddev->pers->mark_spare(mddev, spare, SPARE_ACTIVE);
+               spare->state |= (1 << MD_SYNC_DEVICE);
+               spare->state |= (1 << MD_ACTIVE_DEVICE);
+               sb->spare_disks--;
+               sb->active_disks++;
                mddev->sb_dirty = 1;
-               md_update_sb(mddev);
-               goto restart;
+               md_update_sb(mddev - md_dev);
+               restore_flags(flags);
        }
-       printk(KERN_INFO "md: recovery thread finished ...\n");
        
 }
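Once md_do_sync() succeeds, mdsyncd() above promotes the spare by setting its sync and active state bits under cli() before adjusting the disk counts. A minimal model of those bit flips; the bit positions below are placeholders, not the real MD_*_DEVICE values from the md headers:

```c
#include <assert.h>

/* Placeholder bit numbers; the real values live in the md headers. */
#define MD_ACTIVE_DEVICE 0
#define MD_SYNC_DEVICE   1
#define MD_FAULTY_DEVICE 2

/* Mirror of the two state updates done under cli() in mdsyncd(). */
static unsigned int activate_spare(unsigned int state)
{
        state |= (1 << MD_SYNC_DEVICE);
        state |= (1 << MD_ACTIVE_DEVICE);
        return state;
}

/* The faulty test that aborts the promotion after md_do_sync(). */
static int is_faulty(unsigned int state)
{
        return (state & (1 << MD_FAULTY_DEVICE)) != 0;
}
```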
 
-int md_notify_reboot(struct notifier_block *this,
-                                       unsigned long code, void *x)
-{
-       struct md_list_head *tmp;
-       mddev_t *mddev;
-
-       if ((code == MD_SYS_DOWN) || (code == MD_SYS_HALT)
-                                 || (code == MD_SYS_POWER_OFF)) {
-
-               printk(KERN_INFO "stopping all md devices.\n");
-
-               ITERATE_MDDEV(mddev,tmp)
-                       do_md_stop (mddev, 1);
-               /*
-                * certain more exotic SCSI devices are known to be
-                * volatile wrt too early system reboots. While the
-                * right place to handle this issue is the given
-                * driver, we do want to have a safe RAID driver ...
-                */
-               md_mdelay(1000*1);
-       }
-       return NOTIFY_DONE;
-}
-
-struct notifier_block md_notifier = {
-       md_notify_reboot,
-       NULL,
-       0
-};
-
-md__initfunc(void raid_setup(char *str, int *ints))
-{
-       char tmpline[100];
-       int len, pos, nr, i;
-
-       len = strlen(str) + 1;
-       nr = 0;
-       pos = 0;
-
-       for (i = 0; i < len; i++) {
-               char c = str[i];
-
-               if (c == ',' || !c) {
-                       tmpline[pos] = 0;
-                       if (!strcmp(tmpline,"noautodetect"))
-                               raid_setup_args.noautodetect = 1;
-                       nr++;
-                       pos = 0;
-                       continue;
-               }
-               tmpline[pos] = c;
-               pos++;
-       }
-       raid_setup_args.set = 1;
-       return;
-}
-
 #ifdef CONFIG_MD_BOOT
 struct {
        int set;
        int ints[100];
        char str[100];
-} md_setup_args md__initdata = {
+} md_setup_args __initdata = {
        0,{0},{0}
 };
 
 /* called from init/main.c */
-md__initfunc(void md_setup(char *str,int *ints))
+__initfunc(void md_setup(char *str,int *ints))
 {
        int i;
        for(i=0;i<=ints[0];i++) {
@@ -3794,24 +1228,21 @@ md__initfunc(void md_setup(char *str,int *ints))
        return;
 }
 
-md__initfunc(void do_md_setup(char *str,int *ints))
+__initfunc(void do_md_setup(char *str,int *ints))
 {
-#if 0
-       int minor, pers, chunk_size, fault;
+       int minor, pers, factor, fault;
        kdev_t dev;
        int i=1;
 
-       printk("i plan to phase this out --mingo\n");
-
        if(ints[0] < 4) {
-               printk (KERN_WARNING "md: Too few Arguments (%d).\n", ints[0]);
+               printk ("md: Too few Arguments (%d).\n", ints[0]);
                return;
        }
    
        minor=ints[i++];
    
-       if ((unsigned int)minor >= MAX_MD_DEVS) {
-               printk (KERN_WARNING "md: Minor device number too high.\n");
+       if (minor >= MAX_MD_DEV) {
+               printk ("md: Minor device number too high.\n");
                return;
        }
 
@@ -3821,20 +1252,18 @@ md__initfunc(void do_md_setup(char *str,int *ints))
        case -1:
 #ifdef CONFIG_MD_LINEAR
                pers = LINEAR;
-               printk (KERN_INFO "md: Setting up md%d as linear device.\n",
-                                                                       minor);
+               printk ("md: Setting up md%d as linear device.\n",minor);
 #else 
-               printk (KERN_WARNING "md: Linear mode not configured." 
+               printk ("md: Linear mode not configured." 
                        "Recompile the kernel with linear mode enabled!\n");
 #endif
                break;
        case 0:
                pers = STRIPED;
 #ifdef CONFIG_MD_STRIPED
-               printk (KERN_INFO "md: Setting up md%d as a striped device.\n",
-                                                               minor);
+               printk ("md: Setting up md%d as a striped device.\n",minor);
 #else 
-               printk (KERN_WARNING "md: Striped mode not configured." 
+               printk ("md: Striped mode not configured." 
                        "Recompile the kernel with striped mode enabled!\n");
 #endif
                break;
@@ -3849,145 +1278,79 @@ md__initfunc(void do_md_setup(char *str,int *ints))
                break;
 */
        default:           
-               printk (KERN_WARNING "md: Unknown or not supported raid level %d.\n", ints[--i]);
+               printk ("md: Unknown or not supported raid level %d.\n", ints[--i]);
                return;
        }
 
-       if (pers) {
+       if(pers) {
 
-               chunk_size = ints[i++]; /* Chunksize  */
-               fault = ints[i++]; /* Faultlevel */
+         factor=ints[i++]; /* Chunksize  */
+         fault =ints[i++]; /* Faultlevel */
    
-               pers = pers | chunk_size | (fault << FAULT_SHIFT);   
+         pers=pers | factor | (fault << FAULT_SHIFT);   
    
-               while( str && (dev = name_to_kdev_t(str))) {
-                       do_md_add (minor, dev);
-                       if((str = strchr (str, ',')) != NULL)
-                               str++;
-               }
+         while( str && (dev = name_to_kdev_t(str))) {
+           do_md_add (minor, dev);
+           if((str = strchr (str, ',')) != NULL)
+             str++;
+         }
 
-               do_md_run (minor, pers);
-               printk (KERN_INFO "md: Loading md%d.\n",minor);
+         do_md_run (minor, pers);
+         printk ("md: Loading md%d.\n",minor);
        }
-#endif
+   
 }
 #endif
 
-void hsm_init (void);
-void translucent_init (void);
 void linear_init (void);
 void raid0_init (void);
 void raid1_init (void);
 void raid5_init (void);
 
-md__initfunc(int md_init (void))
+__initfunc(int md_init (void))
 {
-       static char * name = "mdrecoveryd";
-
-       printk (KERN_INFO "md driver %d.%d.%d MAX_MD_DEVS=%d, MAX_REAL=%d\n",
-                       MD_MAJOR_VERSION, MD_MINOR_VERSION,
-                       MD_PATCHLEVEL_VERSION, MAX_MD_DEVS, MAX_REAL);
-
-       if (register_blkdev (MD_MAJOR, "md", &md_fops))
-       {
-               printk (KERN_ALERT "Unable to get major %d for md\n", MD_MAJOR);
-               return (-1);
-       }
+  printk ("md driver %d.%d.%d MAX_MD_DEV=%d, MAX_REAL=%d\n",
+    MD_MAJOR_VERSION, MD_MINOR_VERSION, MD_PATCHLEVEL_VERSION,
+    MAX_MD_DEV, MAX_REAL);
 
-       blk_dev[MD_MAJOR].request_fn = DEVICE_REQUEST;
-       blk_dev[MD_MAJOR].current_request = NULL;
-       read_ahead[MD_MAJOR] = INT_MAX;
-       md_gendisk.next = gendisk_head;
+  if (register_blkdev (MD_MAJOR, "md", &md_fops))
+  {
+    printk ("Unable to get major %d for md\n", MD_MAJOR);
+    return (-1);
+  }
 
-       gendisk_head = &md_gendisk;
+  blk_dev[MD_MAJOR].request_fn=DEVICE_REQUEST;
+  blk_dev[MD_MAJOR].current_request=NULL;
+  read_ahead[MD_MAJOR]=INT_MAX;
+  memset(md_dev, 0, MAX_MD_DEV * sizeof (struct md_dev));
+  md_gendisk.next=gendisk_head;
 
-       md_recovery_thread = md_register_thread(md_do_recovery, NULL, name);
-       if (!md_recovery_thread)
-               printk(KERN_ALERT "bug: couldn't allocate md_recovery_thread\n");
+  gendisk_head=&md_gendisk;
 
-       md_register_reboot_notifier(&md_notifier);
-       md_register_sysctl();
+#if SUPPORT_RECONSTRUCTION
+  if ((md_sync_thread = md_register_thread(mdsyncd, NULL)) == NULL)
+    printk("md: bug: md_sync_thread == NULL\n");
+#endif /* SUPPORT_RECONSTRUCTION */
 
-#ifdef CONFIG_MD_HSM
-       hsm_init ();
-#endif
-#ifdef CONFIG_MD_TRANSLUCENT
-       translucent_init ();
-#endif
 #ifdef CONFIG_MD_LINEAR
-       linear_init ();
+  linear_init ();
 #endif
 #ifdef CONFIG_MD_STRIPED
-       raid0_init ();
+  raid0_init ();
 #endif
 #ifdef CONFIG_MD_MIRRORING
-       raid1_init ();
+  raid1_init ();
 #endif
 #ifdef CONFIG_MD_RAID5
-       raid5_init ();
-#endif
-#if defined(CONFIG_MD_RAID5) || defined(CONFIG_MD_RAID5_MODULE)
-        /*
-         * pick a XOR routine, runtime.
-         */
-       calibrate_xor_block();
+  raid5_init ();
 #endif
-
-       return (0);
+  return (0);
 }
 
 #ifdef CONFIG_MD_BOOT
-md__initfunc(void md_setup_drive(void))
+__initfunc(void md_setup_drive(void))
 {
        if(md_setup_args.set)
                do_md_setup(md_setup_args.str, md_setup_args.ints);
 }
 #endif
-
-MD_EXPORT_SYMBOL(md_size);
-MD_EXPORT_SYMBOL(register_md_personality);
-MD_EXPORT_SYMBOL(unregister_md_personality);
-MD_EXPORT_SYMBOL(partition_name);
-MD_EXPORT_SYMBOL(md_error);
-MD_EXPORT_SYMBOL(md_recover_arrays);
-MD_EXPORT_SYMBOL(md_register_thread);
-MD_EXPORT_SYMBOL(md_unregister_thread);
-MD_EXPORT_SYMBOL(md_update_sb);
-MD_EXPORT_SYMBOL(md_map);
-MD_EXPORT_SYMBOL(md_wakeup_thread);
-MD_EXPORT_SYMBOL(md_do_sync);
-MD_EXPORT_SYMBOL(md_print_devices);
-MD_EXPORT_SYMBOL(find_rdev_nr);
-MD_EXPORT_SYMBOL(md_check_ordering);
-MD_EXPORT_SYMBOL(md_interrupt_thread);
-MD_EXPORT_SYMBOL(mddev_map);
-
-#ifdef CONFIG_PROC_FS
-static struct proc_dir_entry proc_md = {
-       PROC_MD, 6, "mdstat",
-       S_IFREG | S_IRUGO, 1, 0, 0,
-       0, &proc_array_inode_operations,
-};
-#endif
-
-static void md_geninit (struct gendisk *gdisk)
-{
-       int i;
-  
-       for(i = 0; i < MAX_MD_DEVS; i++) {
-               md_blocksizes[i] = 1024;
-               md_maxreadahead[i] = MD_READAHEAD;
-               md_gendisk.part[i].start_sect = -1; /* avoid partition check */
-               md_gendisk.part[i].nr_sects = 0;
-       }
-
-       printk("md.c: sizeof(mdp_super_t) = %d\n", (int)sizeof(mdp_super_t));
-
-       blksize_size[MD_MAJOR] = md_blocksizes;
-       md_set_global_readahead(md_maxreadahead);
-
-#ifdef CONFIG_PROC_FS
-       proc_register(&proc_root, &proc_md);
-#endif
-}
-
index 5272d7353e61ed56aeaa0a88d2e499a36dc831ca..2e95d34f89b8832295c41688219dc91f361ef841 100644 (file)
@@ -1,3 +1,4 @@
+
 /*
    raid0.c : Multiple Devices driver for Linux
              Copyright (C) 1994-96 Marc ZYNGIER
 */
 
 #include <linux/module.h>
-#include <linux/raid/raid0.h>
+#include <linux/md.h>
+#include <linux/raid0.h>
+#include <linux/vmalloc.h>
 
 #define MAJOR_NR MD_MAJOR
 #define MD_DRIVER
 #define MD_PERSONALITY
 
-static int create_strip_zones (mddev_t *mddev)
+static int create_strip_zones (int minor, struct md_dev *mddev)
 {
-       int i, c, j, j1, j2;
-       int current_offset, curr_zone_offset;
-       raid0_conf_t *conf = mddev_to_conf(mddev);
-       mdk_rdev_t *smallest, *rdev1, *rdev2, *rdev;
-       /*
-        * The number of 'same size groups'
-        */
-       conf->nr_strip_zones = 0;
-       ITERATE_RDEV_ORDERED(mddev,rdev1,j1) {
-               printk("raid0: looking at %s\n", partition_name(rdev1->dev));
-               c = 0;
-               ITERATE_RDEV_ORDERED(mddev,rdev2,j2) {
-                       printk("raid0:   comparing %s(%d) with %s(%d)\n", partition_name(rdev1->dev), rdev1->size, partition_name(rdev2->dev), rdev2->size);
-                       if (rdev2 == rdev1) {
-                               printk("raid0:   END\n");
-                               break;
-                       }
-                       if (rdev2->size == rdev1->size)
-                       {
-                               /*
-                                * Not unique, dont count it as a new
-                                * group
-                                */
-                               printk("raid0:   EQUAL\n");
-                               c = 1;
-                               break;
-                       }
-                       printk("raid0:   NOT EQUAL\n");
-               }
-               if (!c) {
-                       printk("raid0:   ==> UNIQUE\n");
-                       conf->nr_strip_zones++;
-                       printk("raid0: %d zones\n", conf->nr_strip_zones);
-               }
-       }
-               printk("raid0: FINAL %d zones\n", conf->nr_strip_zones);
-
-       conf->strip_zone = vmalloc(sizeof(struct strip_zone)*
-                               conf->nr_strip_zones);
-       if (!conf->strip_zone)
-               return 1;
-
-
-       conf->smallest = NULL;
-       current_offset = 0;
-       curr_zone_offset = 0;
-
-       for (i = 0; i < conf->nr_strip_zones; i++)
-       {
-               struct strip_zone *zone = conf->strip_zone + i;
-
-               printk("zone %d\n", i);
-               zone->dev_offset = current_offset;
-               smallest = NULL;
-               c = 0;
-
-               ITERATE_RDEV_ORDERED(mddev,rdev,j) {
-
-                       printk(" checking %s ...", partition_name(rdev->dev));
-                       if (rdev->size > current_offset)
-                       {
-                               printk(" contained as device %d\n", c);
-                               zone->dev[c] = rdev;
-                               c++;
-                               if (!smallest || (rdev->size <smallest->size)) {
-                                       smallest = rdev;
-                                       printk("  (%d) is smallest!.\n", rdev->size);
-                               }
-                       } else
-                               printk(" nope.\n");
-               }
-
-               zone->nb_dev = c;
-               zone->size = (smallest->size - current_offset) * c;
-               printk(" zone->nb_dev: %d, size: %d\n",zone->nb_dev,zone->size);
-
-               if (!conf->smallest || (zone->size < conf->smallest->size))
-                       conf->smallest = zone;
-
-               zone->zone_offset = curr_zone_offset;
-               curr_zone_offset += zone->size;
-
-               current_offset = smallest->size;
-               printk("current zone offset: %d\n", current_offset);
-       }
-       printk("done.\n");
-       return 0;
+  int i, j, c=0;
+  int current_offset=0;
+  struct real_dev *smallest_by_zone;
+  struct raid0_data *data=(struct raid0_data *) mddev->private;
+  
+  data->nr_strip_zones=1;
+  
+  for (i=1; i<mddev->nb_dev; i++)
+  {
+    for (j=0; j<i; j++)
+      if (mddev->devices[i].size==mddev->devices[j].size)
+      {
+       c=1;
+       break;
+      }
+
+    if (!c)
+      data->nr_strip_zones++;
+
+    c=0;
+  }
+
+  if ((data->strip_zone=vmalloc(sizeof(struct strip_zone)*data->nr_strip_zones)) == NULL)
+    return 1;
+
+  data->smallest=NULL;
+  
+  for (i=0; i<data->nr_strip_zones; i++)
+  {
+    data->strip_zone[i].dev_offset=current_offset;
+    smallest_by_zone=NULL;
+    c=0;
+
+    for (j=0; j<mddev->nb_dev; j++)
+      if (mddev->devices[j].size>current_offset)
+      {
+       data->strip_zone[i].dev[c++]=mddev->devices+j;
+       if (!smallest_by_zone ||
+           smallest_by_zone->size > mddev->devices[j].size)
+         smallest_by_zone=mddev->devices+j;
+      }
+
+    data->strip_zone[i].nb_dev=c;
+    data->strip_zone[i].size=(smallest_by_zone->size-current_offset)*c;
+
+    if (!data->smallest ||
+       data->smallest->size > data->strip_zone[i].size)
+      data->smallest=data->strip_zone+i;
+
+    data->strip_zone[i].zone_offset=i ? (data->strip_zone[i-1].zone_offset+
+                                          data->strip_zone[i-1].size) : 0;
+    current_offset=smallest_by_zone->size;
+  }
+  return 0;
 }
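The zone-sizing pass in create_strip_zones() above works outward from the smallest device still contributing: each zone spans (smallest_remaining − current_offset) blocks across every device large enough to reach past the current offset. A self-contained sketch of that sizing logic, with names invented for the example rather than taken from md.c:

```c
#include <assert.h>

#define MAX_ZONES 8

/* Fill zone_size[] from ndev device sizes (in blocks);
 * returns the number of zones produced. */
static int compute_zones(const int *dev_size, int ndev, int *zone_size)
{
        int offset = 0, nzones = 0;

        while (nzones < MAX_ZONES) {
                int c = 0, smallest = 0, i;

                /* Count devices still contributing past `offset`
                 * and find the smallest of them. */
                for (i = 0; i < ndev; i++)
                        if (dev_size[i] > offset) {
                                c++;
                                if (!smallest || dev_size[i] < smallest)
                                        smallest = dev_size[i];
                        }
                if (!c)
                        break;
                zone_size[nzones++] = (smallest - offset) * c;
                offset = smallest;
        }
        return nzones;
}
```

For devices of 100, 200 and 200 blocks this yields two zones: a first zone of 300 blocks striped across all three devices, then a second of 200 blocks across the two larger ones, covering the full 500 blocks.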
 
-static int raid0_run (mddev_t *mddev)
+static int raid0_run (int minor, struct md_dev *mddev)
 {
-       int cur=0, i=0, size, zone0_size, nb_zone;
-       raid0_conf_t *conf;
-
-       MOD_INC_USE_COUNT;
-
-       conf = vmalloc(sizeof (raid0_conf_t));
-       if (!conf)
-               goto out;
-       mddev->private = (void *)conf;
-       if (md_check_ordering(mddev)) {
-               printk("raid0: disks are not ordered, aborting!\n");
-               goto out_free_conf;
-       }
-
-       if (create_strip_zones (mddev)) 
-               goto out_free_conf;
-
-       printk("raid0 : md_size is %d blocks.\n", md_size[mdidx(mddev)]);
-       printk("raid0 : conf->smallest->size is %d blocks.\n", conf->smallest->size);
-       nb_zone = md_size[mdidx(mddev)]/conf->smallest->size +
-                       (md_size[mdidx(mddev)] % conf->smallest->size ? 1 : 0);
-       printk("raid0 : nb_zone is %d.\n", nb_zone);
-       conf->nr_zones = nb_zone;
-
-       printk("raid0 : Allocating %d bytes for hash.\n",
-                               sizeof(struct raid0_hash)*nb_zone);
-
-       conf->hash_table = vmalloc (sizeof (struct raid0_hash)*nb_zone);
-       if (!conf->hash_table)
-               goto out_free_zone_conf;
-       size = conf->strip_zone[cur].size;
-
-       i = 0;
-       while (cur < conf->nr_strip_zones) {
-               conf->hash_table[i].zone0 = conf->strip_zone + cur;
-
-               /*
-                * If we completely fill the slot
-                */
-               if (size >= conf->smallest->size) {
-                       conf->hash_table[i++].zone1 = NULL;
-                       size -= conf->smallest->size;
-
-                       if (!size) {
-                               if (++cur == conf->nr_strip_zones)
-                                       continue;
-                               size = conf->strip_zone[cur].size;
-                       }
-                       continue;
-               }
-               if (++cur == conf->nr_strip_zones) {
-                       /*
-                        * Last dev, set unit1 as NULL
-                        */
-                       conf->hash_table[i].zone1=NULL;
-                       continue;
-               }
-
-               /*
-                * Here we use a 2nd dev to fill the slot
-                */
-               zone0_size = size;
-               size = conf->strip_zone[cur].size;
-               conf->hash_table[i++].zone1 = conf->strip_zone + cur;
-               size -= (conf->smallest->size - zone0_size);
-       }
-       return 0;
-
-out_free_zone_conf:
-       vfree(conf->strip_zone);
-       conf->strip_zone = NULL;
-
-out_free_conf:
-       vfree(conf);
-       mddev->private = NULL;
-out:
-       MOD_DEC_USE_COUNT;
-       return 1;
+  int cur=0, i=0, size, zone0_size, nb_zone;
+  struct raid0_data *data;
+
+  MOD_INC_USE_COUNT;
+
+  if ((mddev->private=vmalloc (sizeof (struct raid0_data))) == NULL) return 1;
+  data=(struct raid0_data *) mddev->private;
+  
+  if (create_strip_zones (minor, mddev)) 
+  {
+       vfree(data);
+       return 1;
+  }
+
+  nb_zone=data->nr_zones=
+    md_size[minor]/data->smallest->size +
+    (md_size[minor]%data->smallest->size ? 1 : 0);
+
+  printk ("raid0 : Allocating %ld bytes for hash.\n",(long)sizeof(struct raid0_hash)*nb_zone);
+  if ((data->hash_table=vmalloc (sizeof (struct raid0_hash)*nb_zone)) == NULL)
+  {
+    vfree(data->strip_zone);
+    vfree(data);
+    return 1;
+  }
+  size=data->strip_zone[cur].size;
+
+  i=0;
+  while (cur<data->nr_strip_zones)
+  {
+    data->hash_table[i].zone0=data->strip_zone+cur;
+
+    if (size>=data->smallest->size)/* If we completely fill the slot */
+    {
+      data->hash_table[i++].zone1=NULL;
+      size-=data->smallest->size;
+
+      if (!size)
+      {
+       if (++cur==data->nr_strip_zones) continue;
+       size=data->strip_zone[cur].size;
+      }
+
+      continue;
+    }
+
+    if (++cur==data->nr_strip_zones) /* Last dev, set unit1 as NULL */
+    {
+      data->hash_table[i].zone1=NULL;
+      continue;
+    }
+
+    zone0_size=size;           /* Here, we use a 2nd dev to fill the slot */
+    size=data->strip_zone[cur].size;
+    data->hash_table[i++].zone1=data->strip_zone+cur;
+    size-=(data->smallest->size - zone0_size);
+  }
+
+  return (0);
 }
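The while loop in raid0_run() above fills a hash table in which each slot covers `smallest` blocks of the array; because no zone is smaller than the smallest zone, a slot can span at most two zones (zone0, zone1). A standalone sketch of that slot-filling logic, with zone indices standing in for the driver's strip_zone pointers and -1 standing in for NULL:

```c
#include <assert.h>

struct hash_slot { int zone0, zone1; };  /* zone indices; -1 means NULL */

/* slot[] must have room for ceil(total_size / smallest) entries. */
static void fill_hash(const int *zone_size, int nr_zones, int smallest,
                      struct hash_slot *slot)
{
        int cur = 0, i = 0;
        int size = zone_size[cur];

        while (cur < nr_zones) {
                slot[i].zone0 = cur;

                if (size >= smallest) {          /* one zone fills the slot */
                        slot[i++].zone1 = -1;
                        size -= smallest;
                        if (!size) {
                                if (++cur == nr_zones)
                                        continue;
                                size = zone_size[cur];
                        }
                        continue;
                }
                if (++cur == nr_zones) {         /* last zone, no partner */
                        slot[i].zone1 = -1;
                        continue;
                }
                /* a second zone completes this slot */
                int zone0_size = size;
                size = zone_size[cur];
                slot[i++].zone1 = cur;
                size -= smallest - zone0_size;
        }
}
```

With zone sizes {6, 4} and smallest = 4, the middle slot straddles the zone boundary: slot 0 is zone 0 alone, slot 1 is zones 0 and 1, slot 2 is zone 1 alone.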
 
-static int raid0_stop (mddev_t *mddev)
+
+static int raid0_stop (int minor, struct md_dev *mddev)
 {
-       raid0_conf_t *conf = mddev_to_conf(mddev);
+  struct raid0_data *data=(struct raid0_data *) mddev->private;
 
-       vfree (conf->hash_table);
-       conf->hash_table = NULL;
-       vfree (conf->strip_zone);
-       conf->strip_zone = NULL;
-       vfree (conf);
-       mddev->private = NULL;
+  vfree (data->hash_table);
+  vfree (data->strip_zone);
+  vfree (data);
 
-       MOD_DEC_USE_COUNT;
-       return 0;
+  MOD_DEC_USE_COUNT;
+  return 0;
 }
 
 /*
@@ -221,135 +167,129 @@ static int raid0_stop (mddev_t *mddev)
  * Of course, those facts may not be valid anymore (and surely won't...)
  * Hey guys, there's some work out there ;-)
  */
-static int raid0_map (mddev_t *mddev, kdev_t dev, kdev_t *rdev,
+static int raid0_map (struct md_dev *mddev, kdev_t *rdev,
                      unsigned long *rsector, unsigned long size)
 {
-       raid0_conf_t *conf = mddev_to_conf(mddev);
-       struct raid0_hash *hash;
-       struct strip_zone *zone;
-       mdk_rdev_t *tmp_dev;
-       int blk_in_chunk, chunksize_bits, chunk, chunk_size;
-       long block, rblock;
-
-       chunk_size = mddev->param.chunk_size >> 10;
-       chunksize_bits = ffz(~chunk_size);
-       block = *rsector >> 1;
-       hash = conf->hash_table + block / conf->smallest->size;
-
-       /* Sanity check */
-       if ((chunk_size * 2) < (*rsector % (chunk_size * 2)) + size)
-               goto bad_map;
-       if (!hash)
-               goto bad_hash;
-
-       if (!hash->zone0)
-               goto bad_zone0;
-       if (block >= (hash->zone0->size + hash->zone0->zone_offset)) {
-               if (!hash->zone1)
-                       goto bad_zone1;
-               zone = hash->zone1;
-       } else
-               zone = hash->zone0;
+  struct raid0_data *data=(struct raid0_data *) mddev->private;
+  static struct raid0_hash *hash;
+  struct strip_zone *zone;
+  struct real_dev *tmp_dev;
+  int blk_in_chunk, factor, chunk, chunk_size;
+  long block, rblock;
+
+  factor=FACTOR(mddev);
+  chunk_size=(1UL << FACTOR_SHIFT(factor));
+  block=*rsector >> 1;
+  hash=data->hash_table+(block/data->smallest->size);
+
+  /* Sanity check */
+  if ((chunk_size*2)<(*rsector % (chunk_size*2))+size)
+  {
+    printk ("raid0_convert : can't convert block across chunks or bigger than %dk %ld %ld\n", chunk_size, *rsector, size);
+    return (-1);
+  }
+  
+  if (block >= (hash->zone0->size +
+               hash->zone0->zone_offset))
+  {
+    if (!hash->zone1)
+    {
+      printk ("raid0_convert : hash->zone1==NULL for block %ld\n", block);
+      return (-1);
+    }
+    
+    zone=hash->zone1;
+  }
+  else
+    zone=hash->zone0;
     
-       blk_in_chunk = block & (chunk_size -1);
-       chunk = (block - zone->zone_offset) / (zone->nb_dev << chunksize_bits);
-       tmp_dev = zone->dev[(block >> chunksize_bits) % zone->nb_dev];
-       rblock = (chunk << chunksize_bits) + blk_in_chunk + zone->dev_offset;
+  blk_in_chunk=block & (chunk_size -1);
+  chunk=(block - zone->zone_offset) / (zone->nb_dev<<FACTOR_SHIFT(factor));
+  tmp_dev=zone->dev[(block >> FACTOR_SHIFT(factor)) % zone->nb_dev];
+  rblock=(chunk << FACTOR_SHIFT(factor)) + blk_in_chunk + zone->dev_offset;
   
-       *rdev = tmp_dev->dev;
-       *rsector = rblock << 1;
-
-       return 0;
-
-bad_map:
-       printk ("raid0_map bug: can't convert block across chunks or bigger than %dk %ld %ld\n", chunk_size, *rsector, size);
-       return -1;
-bad_hash:
-       printk("raid0_map bug: hash==NULL for block %ld\n", block);
-       return -1;
-bad_zone0:
-       printk ("raid0_map bug: hash->zone0==NULL for block %ld\n", block);
-       return -1;
-bad_zone1:
-       printk ("raid0_map bug: hash->zone1==NULL for block %ld\n", block);
-       return -1;
+  *rdev=tmp_dev->dev;
+  *rsector=rblock<<1;
+
+  return (0);
 }
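The address arithmetic at the end of raid0_map() above splits a block number into a device index and a device-local block. A standalone sketch of just that arithmetic, where `chunk_shift` plays the role of FACTOR_SHIFT(factor) and the zone fields are passed in directly:

```c
#include <assert.h>

struct mapping { int dev_idx; long rblock; };

static struct mapping raid0_addr(long block, int nb_dev, long zone_offset,
                                 long dev_offset, int chunk_shift)
{
        long chunk_size   = 1L << chunk_shift;
        long blk_in_chunk = block & (chunk_size - 1);
        /* which full stripe of the zone the block lives in */
        long chunk        = (block - zone_offset) / (nb_dev << chunk_shift);
        struct mapping m;

        m.dev_idx = (int)((block >> chunk_shift) % nb_dev);
        m.rblock  = (chunk << chunk_shift) + blk_in_chunk + dev_offset;
        return m;
}
```

With two devices and 4-block chunks, blocks 0-3 land on device 0, 4-7 on device 1, 8-11 back on device 0 at local blocks 4-7, and so on: the classic RAID-0 stripe layout.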
 
                           
-static int raid0_status (char *page, mddev_t *mddev)
+static int raid0_status (char *page, int minor, struct md_dev *mddev)
 {
-       int sz = 0;
+  int sz=0;
 #undef MD_DEBUG
 #ifdef MD_DEBUG
-       int j, k;
-       raid0_conf_t *conf = mddev_to_conf(mddev);
+  int j, k;
+  struct raid0_data *data=(struct raid0_data *) mddev->private;
   
-       sz += sprintf(page + sz, "      ");
-       for (j = 0; j < conf->nr_zones; j++) {
-               sz += sprintf(page + sz, "[z%d",
-                               conf->hash_table[j].zone0 - conf->strip_zone);
-               if (conf->hash_table[j].zone1)
-                       sz += sprintf(page+sz, "/z%d] ",
-                               conf->hash_table[j].zone1 - conf->strip_zone);
-               else
-                       sz += sprintf(page+sz, "] ");
-       }
+  sz+=sprintf (page+sz, "      ");
+  for (j=0; j<data->nr_zones; j++)
+  {
+    sz+=sprintf (page+sz, "[z%d",
+                data->hash_table[j].zone0-data->strip_zone);
+    if (data->hash_table[j].zone1)
+      sz+=sprintf (page+sz, "/z%d] ",
+                  data->hash_table[j].zone1-data->strip_zone);
+    else
+      sz+=sprintf (page+sz, "] ");
+  }
   
-       sz += sprintf(page + sz, "\n");
+  sz+=sprintf (page+sz, "\n");
   
-       for (j = 0; j < conf->nr_strip_zones; j++) {
-               sz += sprintf(page + sz, "      z%d=[", j);
-               for (k = 0; k < conf->strip_zone[j].nb_dev; k++)
-                       sz += sprintf (page+sz, "%s/", partition_name(
-                               conf->strip_zone[j].dev[k]->dev));
-               sz--;
-               sz += sprintf (page+sz, "] zo=%d do=%d s=%d\n",
-                               conf->strip_zone[j].zone_offset,
-                               conf->strip_zone[j].dev_offset,
-                               conf->strip_zone[j].size);
-       }
+  for (j=0; j<data->nr_strip_zones; j++)
+  {
+    sz+=sprintf (page+sz, "      z%d=[", j);
+    for (k=0; k<data->strip_zone[j].nb_dev; k++)
+      sz+=sprintf (page+sz, "%s/",
+                  partition_name(data->strip_zone[j].dev[k]->dev));
+    sz--;
+    sz+=sprintf (page+sz, "] zo=%d do=%d s=%d\n",
+                data->strip_zone[j].zone_offset,
+                data->strip_zone[j].dev_offset,
+                data->strip_zone[j].size);
+  }
 #endif
-       sz += sprintf(page + sz, " %dk chunks", mddev->param.chunk_size/1024);
-       return sz;
+  sz+=sprintf (page+sz, " %dk chunks", 1<<FACTOR_SHIFT(FACTOR(mddev)));
+  return sz;
 }
 
-static mdk_personality_t raid0_personality=
+
+static struct md_personality raid0_personality=
 {
-       "raid0",
-       raid0_map,
-       NULL,                           /* no special make_request */
-       NULL,                           /* no special end_request */
-       raid0_run,
-       raid0_stop,
-       raid0_status,
-       NULL,                           /* no ioctls */
-       0,
-       NULL,                           /* no error_handler */
-       NULL,                           /* no diskop */
-       NULL,                           /* no stop resync */
-       NULL                            /* no restart resync */
+  "raid0",
+  raid0_map,
+  NULL,                                /* no special make_request */
+  NULL,                                /* no special end_request */
+  raid0_run,
+  raid0_stop,
+  raid0_status,
+  NULL,                                /* no ioctls */
+  0,
+  NULL,                                /* no error_handler */
+  NULL,                                /* hot_add_disk */
+  NULL,                                /* hot_remove_disk */
+  NULL                         /* mark_spare */
 };
 
+
 #ifndef MODULE
 
 void raid0_init (void)
 {
-       register_md_personality (RAID0, &raid0_personality);
+  register_md_personality (RAID0, &raid0_personality);
 }
 
 #else
 
 int init_module (void)
 {
-       return (register_md_personality (RAID0, &raid0_personality));
+  return (register_md_personality (RAID0, &raid0_personality));
 }
 
 void cleanup_module (void)
 {
-       unregister_md_personality (RAID0);
+  unregister_md_personality (RAID0);
 }
 
 #endif
-
index a7caea3b5282a0c3e96c22102d6c1e5a1bdce791..890584dcdd684679c0bf3e422b27e791717110e3 100644 (file)
@@ -1,6 +1,6 @@
-/*
+/************************************************************************
  * raid1.c : Multiple Devices driver for Linux
- * Copyright (C) 1996, 1997, 1998 Ingo Molnar, Miguel de Icaza, Gadi Oxman
+ *           Copyright (C) 1996 Ingo Molnar, Miguel de Icaza, Gadi Oxman
  *
  * RAID-1 management functions.
  *
  */
 
 #include <linux/module.h>
+#include <linux/locks.h>
 #include <linux/malloc.h>
-#include <linux/raid/raid1.h>
+#include <linux/md.h>
+#include <linux/raid1.h>
+#include <asm/bitops.h>
 #include <asm/atomic.h>
 
 #define MAJOR_NR MD_MAJOR
 #define MD_DRIVER
 #define MD_PERSONALITY
 
-#define MAX_LINEAR_SECTORS 128
+/*
+ * The following can be used to debug the driver
+ */
+/*#define RAID1_DEBUG*/
+#ifdef RAID1_DEBUG
+#define PRINTK(x)   do { printk x; } while (0);
+#else
+#define PRINTK(x)   do { ; } while (0);
+#endif
 
 #define MAX(a,b)       ((a) > (b) ? (a) : (b))
 #define MIN(a,b)       ((a) < (b) ? (a) : (b))
 
-static mdk_personality_t raid1_personality;
+static struct md_personality raid1_personality;
+static struct md_thread *raid1_thread = NULL;
 struct buffer_head *raid1_retry_list = NULL;
 
-static void * raid1_kmalloc (int size)
-{
-       void * ptr;
-       /*
-        * now we are rather fault tolerant than nice, but
-        * there are a couple of places in the RAID code where we
-        * simply can not afford to fail an allocation because
-        * there is no failure return path (eg. make_request())
-        */
-       while (!(ptr = kmalloc (sizeof (raid1_conf_t), GFP_KERNEL)))
-               printk ("raid1: out of memory, retrying...\n");
-
-       memset(ptr, 0, size);
-       return ptr;
-}
-
-static int __raid1_map (mddev_t *mddev, kdev_t *rdev,
+static int __raid1_map (struct md_dev *mddev, kdev_t *rdev,
                        unsigned long *rsector, unsigned long size)
 {
-       raid1_conf_t *conf = mddev_to_conf(mddev);
-       int i, disks = MD_SB_DISKS;
+       struct raid1_data *raid_conf = (struct raid1_data *) mddev->private;
+       int i, n = raid_conf->raid_disks;
 
        /*
         * Later we do read balancing on the read side 
         * now we use the first available disk.
         */
 
-       for (i = 0; i < disks; i++) {
-               if (conf->mirrors[i].operational) {
-                       *rdev = conf->mirrors[i].dev;
+       PRINTK(("raid1_map().\n"));
+
+       for (i=0; i<n; i++) {
+               if (raid_conf->mirrors[i].operational) {
+                       *rdev = raid_conf->mirrors[i].dev;
                        return (0);
                }
        }
@@ -69,29 +67,29 @@ static int __raid1_map (mddev_t *mddev, kdev_t *rdev,
        return (-1);
 }
 
-static int raid1_map (mddev_t *mddev, kdev_t dev, kdev_t *rdev,
+static int raid1_map (struct md_dev *mddev, kdev_t *rdev,
                      unsigned long *rsector, unsigned long size)
 {
        return 0;
 }
 
-static void raid1_reschedule_retry (struct buffer_head *bh)
+void raid1_reschedule_retry (struct buffer_head *bh)
 {
        struct raid1_bh * r1_bh = (struct raid1_bh *)(bh->b_dev_id);
-       mddev_t *mddev = r1_bh->mddev;
-       raid1_conf_t *conf = mddev_to_conf(mddev);
+
+       PRINTK(("raid1_reschedule_retry().\n"));
 
        r1_bh->next_retry = raid1_retry_list;
        raid1_retry_list = bh;
-       md_wakeup_thread(conf->thread);
+       md_wakeup_thread(raid1_thread);
 }
 
 /*
- * raid1_end_bh_io() is called when we have finished servicing a mirrored
+ * raid1_end_buffer_io() is called when we have finished servicing a mirrored
  * operation and are ready to return a success/failure code to the buffer
  * cache layer.
  */
-static void raid1_end_bh_io (struct raid1_bh *r1_bh, int uptodate)
+static inline void raid1_end_buffer_io(struct raid1_bh *r1_bh, int uptodate)
 {
        struct buffer_head *bh = r1_bh->master_bh;
 
@@ -99,6 +97,8 @@ static void raid1_end_bh_io (struct raid1_bh *r1_bh, int uptodate)
        kfree(r1_bh);
 }
 
+int raid1_one_error=0;
+
 void raid1_end_request (struct buffer_head *bh, int uptodate)
 {
        struct raid1_bh * r1_bh = (struct raid1_bh *)(bh->b_dev_id);
@@ -106,7 +106,12 @@ void raid1_end_request (struct buffer_head *bh, int uptodate)
 
        save_flags(flags);
        cli();
+       PRINTK(("raid1_end_request().\n"));
 
+       if (raid1_one_error) {
+               raid1_one_error=0;
+               uptodate=0;
+       }
        /*
         * this branch is our 'one mirror IO has finished' event handler:
         */
@@ -131,11 +136,15 @@ void raid1_end_request (struct buffer_head *bh, int uptodate)
         */
 
        if ( (r1_bh->cmd == READ) || (r1_bh->cmd == READA) ) {
+
+               PRINTK(("raid1_end_request(), read branch.\n"));
+
                /*
                 * we have only one buffer_head on the read side
                 */
                if (uptodate) {
-                       raid1_end_bh_io(r1_bh, uptodate);
+                       PRINTK(("raid1_end_request(), read branch, uptodate.\n"));
+                       raid1_end_buffer_io(r1_bh, uptodate);
                        restore_flags(flags);
                        return;
                }
@@ -143,56 +152,67 @@ void raid1_end_request (struct buffer_head *bh, int uptodate)
                 * oops, read error:
                 */
                printk(KERN_ERR "raid1: %s: rescheduling block %lu\n", 
-                        partition_name(bh->b_dev), bh->b_blocknr);
-               raid1_reschedule_retry(bh);
+                                kdevname(bh->b_dev), bh->b_blocknr);
+               raid1_reschedule_retry (bh);
                restore_flags(flags);
                return;
        }
 
        /*
-        * WRITE:
-        *
+        * WRITE or WRITEA.
+        */
+       PRINTK(("raid1_end_request(), write branch.\n"));
+
+       /*
         * Let's see if all mirrored write operations have finished 
-        * already.
+        * already [we have irqs off, so we can decrease]:
         */
 
-       if (atomic_dec_and_test(&r1_bh->remaining)) {
-               int i, disks = MD_SB_DISKS;
+       if (!--r1_bh->remaining) {
+               struct md_dev *mddev = r1_bh->mddev;
+               struct raid1_data *raid_conf = (struct raid1_data *) mddev->private;
+               int i, n = raid_conf->raid_disks;
+
+               PRINTK(("raid1_end_request(), remaining == 0.\n"));
 
-               for ( i = 0; i < disks; i++)
-                       if (r1_bh->mirror_bh[i])
-                               kfree(r1_bh->mirror_bh[i]);
+               for ( i=0; i<n; i++)
+                       if (r1_bh->mirror_bh[i]) kfree(r1_bh->mirror_bh[i]);
 
-               raid1_end_bh_io(r1_bh, test_bit(BH_Uptodate, &r1_bh->state));
+               raid1_end_buffer_io(r1_bh, test_bit(BH_Uptodate, &r1_bh->state));
        }
+       else PRINTK(("raid1_end_request(), remaining == %u.\n", r1_bh->remaining));
        restore_flags(flags);
 }
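The write branch of raid1_end_request() above completes the master request only once every mirrored write has finished, by counting down `remaining`. A minimal sketch of that accounting (the driver runs it with interrupts off, so the plain decrement is safe there; `reported` stands in for the call to raid1_end_buffer_io()):

```c
#include <assert.h>

struct r1_write {
        int remaining;  /* mirrored writes still in flight */
        int reported;   /* completion handed back to the buffer cache? */
};

static void mirror_write_done(struct r1_write *w)
{
        if (!--w->remaining)
                w->reported = 1;  /* only the last mirror completion reports */
}
```

Three mirrored writes report nothing on the first two completions and signal the caller exactly once, on the third.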
 
-/*
- * This routine checks if the underlying device is an md device
- * and in that case it maps the blocks before putting the
- * request on the queue
+/* This routine checks if the underlying device is an md device and in that
+ * case it maps the blocks before putting the request on the queue
  */
-static void map_and_make_request (int rw, struct buffer_head *bh)
+static inline void
+map_and_make_request (int rw, struct buffer_head *bh)
 {
        if (MAJOR (bh->b_rdev) == MD_MAJOR)
-               md_map (bh->b_rdev, &bh->b_rdev,
-                               &bh->b_rsector, bh->b_size >> 9);
+               md_map (MINOR (bh->b_rdev), &bh->b_rdev, &bh->b_rsector, bh->b_size >> 9);
        clear_bit(BH_Lock, &bh->b_state);
        make_request (MAJOR (bh->b_rdev), rw, bh);
 }
        
-static int raid1_make_request (mddev_t *mddev, int rw,
-                                                struct buffer_head * bh)
+static int
+raid1_make_request (struct md_dev *mddev, int rw, struct buffer_head * bh)
 {
-       raid1_conf_t *conf = mddev_to_conf(mddev);
+
+       struct raid1_data *raid_conf = (struct raid1_data *) mddev->private;
        struct buffer_head *mirror_bh[MD_SB_DISKS], *bh_req;
        struct raid1_bh * r1_bh;
-       int disks = MD_SB_DISKS;
-       int i, sum_bhs = 0, switch_disks = 0, sectors, lowprio = 0;
+       int n = raid_conf->raid_disks, i, sum_bhs = 0, switch_disks = 0, sectors;
        struct mirror_info *mirror;
 
-       r1_bh = raid1_kmalloc (sizeof (struct raid1_bh));
+       PRINTK(("raid1_make_request().\n"));
+
+       while (!( /* FIXME: now we are rather fault tolerant than nice */
+       r1_bh = kmalloc (sizeof (struct raid1_bh), GFP_KERNEL)
+       ) )
+               printk ("raid1_make_request(#1): out of memory\n");
+       memset (r1_bh, 0, sizeof (struct raid1_bh));
 
 /*
  * make_request() can abort the operation when READA or WRITEA are being
@@ -203,65 +223,43 @@ static int raid1_make_request (mddev_t *mddev, int rw,
        if (rw == READA) rw = READ;
        if (rw == WRITEA) rw = WRITE;
 
-       if (rw == WRITE) {
-               /*
-                * Too early ?
-                */
-               mark_buffer_clean(bh);
-               /*
-                * not too early. we _first_ clean the bh, then we start
-                * the IO, then when the IO has finished, we unlock the
-                * bh and mark it uptodate. This way we do not miss the
-                * case when the bh got dirty again during the IO.
-                */
-       }
-
-       /*
-        * special flag for 'lowprio' reconstruction requests ...
-        */
-       if (buffer_lowprio(bh))
-               lowprio = 1;
+       if (rw == WRITE || rw == WRITEA)
+               mark_buffer_clean(bh);          /* Too early ? */
 
 /*
- * i think the read and write branch should be separated completely,
- * since we want to do read balancing on the read side for example.
- * Comments? :) --mingo
+ * i think the read and write branch should be separated completely, since we want
+ * to do read balancing on the read side for example. Comments? :) --mingo
  */
 
        r1_bh->master_bh=bh;
        r1_bh->mddev=mddev;
        r1_bh->cmd = rw;
 
-       if (rw==READ) {
-               int last_used = conf->last_used;
-
-               /*
-                * read balancing logic:
-                */
-               mirror = conf->mirrors + last_used;
+       if (rw==READ || rw==READA) {
+               int last_used = raid_conf->last_used;
+               PRINTK(("raid1_make_request(), read branch.\n"));
+               mirror = raid_conf->mirrors + last_used;
                bh->b_rdev = mirror->dev;
                sectors = bh->b_size >> 9;
-
-               if (bh->b_blocknr * sectors == conf->next_sect) {
-                       conf->sect_count += sectors;
-                       if (conf->sect_count >= mirror->sect_limit)
+               if (bh->b_blocknr * sectors == raid_conf->next_sect) {
+                       raid_conf->sect_count += sectors;
+                       if (raid_conf->sect_count >= mirror->sect_limit)
                                switch_disks = 1;
                } else
                        switch_disks = 1;
-               conf->next_sect = (bh->b_blocknr + 1) * sectors;
-               /*
-                * Do not switch disks if full resync is in progress ...
-                */
-               if (switch_disks && !conf->resync_mirrors) {
-                       conf->sect_count = 0;
-                       last_used = conf->last_used = mirror->next;
+               raid_conf->next_sect = (bh->b_blocknr + 1) * sectors;
+               if (switch_disks) {
+                       PRINTK(("read-balancing: switching %d -> %d (%d sectors)\n", last_used, mirror->next, raid_conf->sect_count));
+                       raid_conf->sect_count = 0;
+                       last_used = raid_conf->last_used = mirror->next;
                        /*
-                        * Do not switch to write-only disks ...
-                        * reconstruction is in progress
+                        * Do not switch to write-only disks ... resyncing
+                        * is in progress
                         */
-                       while (conf->mirrors[last_used].write_only)
-                               conf->last_used = conf->mirrors[last_used].next;
+                       while (raid_conf->mirrors[last_used].write_only)
+                               raid_conf->last_used = raid_conf->mirrors[last_used].next;
                }
+               PRINTK (("raid1 read queue: %d %d\n", MAJOR (bh->b_rdev), MINOR (bh->b_rdev)));
                bh_req = &r1_bh->bh_req;
                memcpy(bh_req, bh, sizeof(*bh));
                bh_req->b_end_io = raid1_end_request;
@@ -271,12 +269,13 @@ static int raid1_make_request (mddev_t *mddev, int rw,
        }
 
        /*
-        * WRITE:
+        * WRITE or WRITEA.
         */
+       PRINTK(("raid1_make_request(n=%d), write branch.\n",n));
 
-       for (i = 0; i < disks; i++) {
+       for (i = 0; i < n; i++) {
 
-               if (!conf->mirrors[i].operational) {
+               if (!raid_conf->mirrors [i].operational) {
                        /*
                         * the r1_bh->mirror_bh[i] pointer remains NULL
                         */
@@ -284,91 +283,85 @@ static int raid1_make_request (mddev_t *mddev, int rw,
                        continue;
                }
 
-               /*
-                * special case for reconstruction ...
-                */
-               if (lowprio && (i == conf->last_used)) {
-                       mirror_bh[i] = NULL;
-                       continue;
-               }
-       /*
-        * We should use a private pool (size depending on NR_REQUEST),
-        * to avoid writes filling up the memory with bhs
-        *
-        * Such pools are much faster than kmalloc anyways (so we waste
-        * almost nothing by not using the master bh when writing and
-        * win alot of cleanness) but for now we are cool enough. --mingo
-        * win a lot of cleanness) but for now we are cool enough. --mingo
-        * It's safe to sleep here, buffer heads cannot be used in a shared
-        * manner in the write branch. Look how we lock the buffer at the
-        * beginning of this function to grok the difference ;)
-        */
-               mirror_bh[i] = raid1_kmalloc(sizeof(struct buffer_head));
-       /*
-        * prepare mirrored bh (fields ordered for max mem throughput):
-        */
-               mirror_bh[i]->b_blocknr    = bh->b_blocknr;
-               mirror_bh[i]->b_dev        = bh->b_dev;
-               mirror_bh[i]->b_rdev       = conf->mirrors[i].dev;
-               mirror_bh[i]->b_rsector    = bh->b_rsector;
-               mirror_bh[i]->b_state      = (1<<BH_Req) | (1<<BH_Dirty);
-               if (lowprio)
-                       mirror_bh[i]->b_state |= (1<<BH_LowPrio);
-               mirror_bh[i]->b_count      = 1;
-               mirror_bh[i]->b_size       = bh->b_size;
-               mirror_bh[i]->b_data       = bh->b_data;
-               mirror_bh[i]->b_list       = BUF_LOCKED;
-               mirror_bh[i]->b_end_io     = raid1_end_request;
-               mirror_bh[i]->b_dev_id     = r1_bh;
-  
-               r1_bh->mirror_bh[i] = mirror_bh[i];
-               sum_bhs++;
+       /*
+        * We should use a private pool (size depending on NR_REQUEST),
+        * to avoid writes filling up the memory with bhs
+        *
+        * Such pools are much faster than kmalloc anyways (so we waste almost 
+        * nothing by not using the master bh when writing and win a lot of cleanness)
+        *
+        * but for now we are cool enough. --mingo
+        *
+        * It's safe to sleep here, buffer heads cannot be used in a shared
+        * manner in the write branch. Look how we lock the buffer at the beginning
+        * of this function to grok the difference ;)
+        */
+               while (!( /* FIXME: now we are rather fault tolerant than nice */
+               mirror_bh[i] = kmalloc (sizeof (struct buffer_head), GFP_KERNEL)
+               ) )
+                       printk ("raid1_make_request(#2): out of memory\n");
+               memset (mirror_bh[i], 0, sizeof (struct buffer_head));
+
+       /*
+        * prepare mirrored bh (fields ordered for max mem throughput):
+        */
+               mirror_bh [i]->b_blocknr    = bh->b_blocknr;
+               mirror_bh [i]->b_dev        = bh->b_dev;
+               mirror_bh [i]->b_rdev       = raid_conf->mirrors [i].dev;
+               mirror_bh [i]->b_rsector    = bh->b_rsector;
+               mirror_bh [i]->b_state      = (1<<BH_Req) | (1<<BH_Dirty);
+               mirror_bh [i]->b_count      = 1;
+               mirror_bh [i]->b_size       = bh->b_size;
+               mirror_bh [i]->b_data       = bh->b_data;
+               mirror_bh [i]->b_list       = BUF_LOCKED;
+               mirror_bh [i]->b_end_io     = raid1_end_request;
+               mirror_bh [i]->b_dev_id     = r1_bh;
+
+               r1_bh->mirror_bh[i] = mirror_bh[i];
+               sum_bhs++;
        }
 
-       md_atomic_set(&r1_bh->remaining, sum_bhs);
+       r1_bh->remaining = sum_bhs;
+
+       PRINTK(("raid1_make_request(), write branch, sum_bhs=%d.\n",sum_bhs));
 
        /*
-        * We have to be a bit careful about the semaphore above, that's
-        * why we start the requests separately. Since kmalloc() could
-        * fail, sleep and make_request() can sleep too, this is the
-        * safer solution. Imagine, end_request decreasing the semaphore
-        * before we could have set it up ... We could play tricks with
-        * the semaphore (presetting it and correcting at the end if
-        * sum_bhs is not 'n' but we have to do end_request by hand if
-        * all requests finish until we had a chance to set up the
-        * semaphore correctly ... lots of races).
+        * We have to be a bit careful about the semaphore above, that's why we
+        * start the requests separately. Since kmalloc() could fail, sleep and
+        * make_request() can sleep too, this is the safer solution. Imagine,
+        * end_request decreasing the semaphore before we could have set it up ...
+        * We could play tricks with the semaphore (presetting it and correcting
+        * at the end if sum_bhs is not 'n' but we have to do end_request by hand
+        * if all requests finish until we had a chance to set up the semaphore
+        * correctly ... lots of races).
         */
-       for (i = 0; i < disks; i++)
-               if (mirror_bh[i])
-                       map_and_make_request(rw, mirror_bh[i]);
+       for (i = 0; i < n; i++)
+               if (mirror_bh [i] != NULL)
+                       map_and_make_request (rw, mirror_bh [i]);
 
        return (0);
 }
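The read branch of raid1_make_request() above keeps sequential reads on one mirror until `sect_limit` sectors have been consumed (or the stream seeks), then moves to the next mirror. A standalone sketch of that policy, simplified from the driver's raid1_data/mirror_info fields; the real code follows each mirror's `next` link and skips write-only disks, whereas this sketch just rotates round-robin:

```c
#include <assert.h>

struct balancer {
        int  last_used;   /* current mirror index            */
        int  nr_mirrors;
        int  sect_limit;  /* sectors to read before switching */
        long next_sect;   /* sector expected next, if sequential */
        int  sect_count;  /* sectors read on current mirror  */
};

static int pick_mirror(struct balancer *b, long blocknr, int sectors)
{
        int switch_disks = 0;

        if (blocknr * sectors == b->next_sect) {   /* sequential read */
                b->sect_count += sectors;
                if (b->sect_count >= b->sect_limit)
                        switch_disks = 1;
        } else
                switch_disks = 1;                  /* seek: rebalance now */

        b->next_sect = (blocknr + 1) * sectors;
        if (switch_disks) {
                b->sect_count = 0;
                b->last_used = (b->last_used + 1) % b->nr_mirrors;
        }
        return b->last_used;
}
```

Keeping a sequential stream on one disk preserves that disk's readahead; the limit caps how long one mirror monopolizes the stream, and any seek redistributes load immediately.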
                           
-static int raid1_status (char *page, mddev_t *mddev)
+static int raid1_status (char *page, int minor, struct md_dev *mddev)
 {
-       raid1_conf_t *conf = mddev_to_conf(mddev);
+       struct raid1_data *raid_conf = (struct raid1_data *) mddev->private;
        int sz = 0, i;
        
-       sz += sprintf (page+sz, " [%d/%d] [", conf->raid_disks,
-                                                conf->working_disks);
-       for (i = 0; i < conf->raid_disks; i++)
-               sz += sprintf (page+sz, "%s",
-                       conf->mirrors[i].operational ? "U" : "_");
+       sz += sprintf (page+sz, " [%d/%d] [", raid_conf->raid_disks, raid_conf->working_disks);
+       for (i = 0; i < raid_conf->raid_disks; i++)
+               sz += sprintf (page+sz, "%s", raid_conf->mirrors [i].operational ? "U" : "_");
        sz += sprintf (page+sz, "]");
        return sz;
 }
 
-static void unlink_disk (raid1_conf_t *conf, int target)
+static void raid1_fix_links (struct raid1_data *raid_conf, int failed_index)
 {
-       int disks = MD_SB_DISKS;
-       int i;
+       int disks = raid_conf->raid_disks;
+       int j;
 
-       for (i = 0; i < disks; i++)
-               if (conf->mirrors[i].next == target)
-                       conf->mirrors[i].next = conf->mirrors[target].next;
+       for (j = 0; j < disks; j++)
+               if (raid_conf->mirrors [j].next == failed_index)
+                       raid_conf->mirrors [j].next = raid_conf->mirrors [failed_index].next;
 }
 
 #define LAST_DISK KERN_ALERT \
@@ -387,53 +380,48 @@ static void unlink_disk (raid1_conf_t *conf, int target)
 #define ALREADY_SYNCING KERN_INFO \
 "raid1: syncing already in progress.\n"
 
-static void mark_disk_bad (mddev_t *mddev, int failed)
+static int raid1_error (struct md_dev *mddev, kdev_t dev)
 {
-       raid1_conf_t *conf = mddev_to_conf(mddev);
-       struct mirror_info *mirror = conf->mirrors+failed;
-       mdp_super_t *sb = mddev->sb;
-
-       mirror->operational = 0;
-       unlink_disk(conf, failed);
-       mark_disk_faulty(sb->disks+mirror->number);
-       mark_disk_nonsync(sb->disks+mirror->number);
-       mark_disk_inactive(sb->disks+mirror->number);
-       sb->active_disks--;
-       sb->working_disks--;
-       sb->failed_disks++;
-       mddev->sb_dirty = 1;
-       md_wakeup_thread(conf->thread);
-       conf->working_disks--;
-       printk (DISK_FAILED, partition_name (mirror->dev),
-                                conf->working_disks);
-}
-
-static int raid1_error (mddev_t *mddev, kdev_t dev)
-{
-       raid1_conf_t *conf = mddev_to_conf(mddev);
-       struct mirror_info * mirrors = conf->mirrors;
-       int disks = MD_SB_DISKS;
+       struct raid1_data *raid_conf = (struct raid1_data *) mddev->private;
+       struct mirror_info *mirror;
+       md_superblock_t *sb = mddev->sb;
+       int disks = raid_conf->raid_disks;
        int i;
 
-       if (conf->working_disks == 1) {
+       PRINTK(("raid1_error called\n"));
+
+       if (raid_conf->working_disks == 1) {
                /*
                 * Uh oh, we can do nothing if this is our last disk, but
                 * first check if this is a queued request for a device
                 * which has just failed.
                 */
-               for (i = 0; i < disks; i++) {
-                       if (mirrors[i].dev==dev && !mirrors[i].operational)
+               for (i = 0, mirror = raid_conf->mirrors; i < disks;
+                                i++, mirror++)
+                       if (mirror->dev == dev && !mirror->operational)
                                return 0;
-               }
                printk (LAST_DISK);
        } else {
-               /*
-                * Mark disk as unusable
-                */
-               for (i = 0; i < disks; i++) {
-                       if (mirrors[i].dev==dev && mirrors[i].operational) {
-                               mark_disk_bad (mddev, i);
-                               break;
+               /* Mark disk as unusable */
+               for (i = 0, mirror = raid_conf->mirrors; i < disks;
+                                i++, mirror++) {
+                       if (mirror->dev == dev && mirror->operational){
+                               mirror->operational = 0;
+                               raid1_fix_links (raid_conf, i);
+                               sb->disks[mirror->number].state |=
+                                               (1 << MD_FAULTY_DEVICE);
+                               sb->disks[mirror->number].state &=
+                                               ~(1 << MD_SYNC_DEVICE);
+                               sb->disks[mirror->number].state &=
+                                               ~(1 << MD_ACTIVE_DEVICE);
+                               sb->active_disks--;
+                               sb->working_disks--;
+                               sb->failed_disks++;
+                               mddev->sb_dirty = 1;
+                               md_wakeup_thread(raid1_thread);
+                               raid_conf->working_disks--;
+                               printk (DISK_FAILED, kdevname (dev),
+                                               raid_conf->working_disks);
                        }
                }
        }
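
The failure path in the new raid1_error() folds the superblock bookkeeping inline: set the faulty bit, clear the sync and active bits. The bit manipulation can be isolated as below — bit positions here are illustrative placeholders, not the kernel's actual MD_*_DEVICE values:

```c
/* Sketch of the descriptor-state update done when a mirror fails:
 * mark the disk faulty and drop its sync and active status.
 * These bit numbers are assumed for illustration only. */
enum { MD_FAULTY_DEVICE = 0, MD_SYNC_DEVICE = 1, MD_ACTIVE_DEVICE = 2 };

unsigned int mark_disk_failed(unsigned int state)
{
    state |= (1u << MD_FAULTY_DEVICE);   /* disk has errors */
    state &= ~(1u << MD_SYNC_DEVICE);    /* no longer in sync */
    state &= ~(1u << MD_ACTIVE_DEVICE);  /* no longer active */
    return state;
}
```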
@@ -446,396 +434,219 @@ static int raid1_error (mddev_t *mddev, kdev_t dev)
 #undef START_SYNCING
 
 /*
- * Insert the spare disk into the drive-ring
+ * This is the personality-specific hot-addition routine
  */
-static void link_disk(raid1_conf_t *conf, struct mirror_info *mirror)
-{
-       int j, next;
-       int disks = MD_SB_DISKS;
-       struct mirror_info *p = conf->mirrors;
 
-       for (j = 0; j < disks; j++, p++)
-               if (p->operational && !p->write_only) {
-                       next = p->next;
-                       p->next = mirror->raid_disk;
-                       mirror->next = next;
-                       return;
-               }
+#define NO_SUPERBLOCK KERN_ERR \
+"raid1: cannot hot-add disk to the array with no RAID superblock\n"
 
-       printk("raid1: bug: no read-operational devices\n");
-}
-
-static void print_raid1_conf (raid1_conf_t *conf)
-{
-       int i;
-       struct mirror_info *tmp;
+#define WRONG_LEVEL KERN_ERR \
+"raid1: hot-add: level of disk is not RAID-1\n"
 
-       printk("RAID1 conf printout:\n");
-       if (!conf) {
-               printk("(conf==NULL)\n");
-               return;
-       }
-       printk(" --- wd:%d rd:%d nd:%d\n", conf->working_disks,
-                        conf->raid_disks, conf->nr_disks);
-
-       for (i = 0; i < MD_SB_DISKS; i++) {
-               tmp = conf->mirrors + i;
-               printk(" disk %d, s:%d, o:%d, n:%d rd:%d us:%d dev:%s\n",
-                       i, tmp->spare,tmp->operational,
-                       tmp->number,tmp->raid_disk,tmp->used_slot,
-                       partition_name(tmp->dev));
-       }
-}
+#define HOT_ADD_SUCCEEDED KERN_INFO \
+"raid1: device %s hot-added\n"
 
-static int raid1_diskop(mddev_t *mddev, mdp_disk_t **d, int state)
+static int raid1_hot_add_disk (struct md_dev *mddev, kdev_t dev)
 {
-       int err = 0;
-       int i, failed_disk=-1, spare_disk=-1, removed_disk=-1, added_disk=-1;
-       raid1_conf_t *conf = mddev->private;
-       struct mirror_info *tmp, *sdisk, *fdisk, *rdisk, *adisk;
        unsigned long flags;
-       mdp_super_t *sb = mddev->sb;
-       mdp_disk_t *failed_desc, *spare_desc, *added_desc;
-
-       save_flags(flags);
-       cli();
+       struct raid1_data *raid_conf = (struct raid1_data *) mddev->private;
+       struct mirror_info *mirror;
+       md_superblock_t *sb = mddev->sb;
+       struct real_dev * realdev;
+       int n;
 
-       print_raid1_conf(conf);
        /*
-        * find the disk ...
+        * The device has its superblock already read and it was found
+        * to be consistent for generic RAID usage.  Now we check whether
+        * it's usable for RAID-1 hot addition.
         */
-       switch (state) {
-
-       case DISKOP_SPARE_ACTIVE:
 
-               /*
-                * Find the failed disk within the RAID1 configuration ...
-                * (this can only be in the first conf->working_disks part)
-                */
-               for (i = 0; i < conf->raid_disks; i++) {
-                       tmp = conf->mirrors + i;
-                       if ((!tmp->operational && !tmp->spare) ||
-                                       !tmp->used_slot) {
-                               failed_disk = i;
-                               break;
-                       }
-               }
-               /*
-                * When we activate a spare disk we _must_ have a disk in
-                * the lower (active) part of the array to replace. 
-                */
-               if ((failed_disk == -1) || (failed_disk >= conf->raid_disks)) {
-                       MD_BUG();
-                       err = 1;
-                       goto abort;
-               }
-               /* fall through */
-
-       case DISKOP_SPARE_WRITE:
-       case DISKOP_SPARE_INACTIVE:
-
-               /*
-                * Find the spare disk ... (can only be in the 'high'
-                * area of the array)
-                */
-               for (i = conf->raid_disks; i < MD_SB_DISKS; i++) {
-                       tmp = conf->mirrors + i;
-                       if (tmp->spare && tmp->number == (*d)->number) {
-                               spare_disk = i;
-                               break;
-                       }
-               }
-               if (spare_disk == -1) {
-                       MD_BUG();
-                       err = 1;
-                       goto abort;
-               }
-               break;
-
-       case DISKOP_HOT_REMOVE_DISK:
-
-               for (i = 0; i < MD_SB_DISKS; i++) {
-                       tmp = conf->mirrors + i;
-                       if (tmp->used_slot && (tmp->number == (*d)->number)) {
-                               if (tmp->operational) {
-                                       err = -EBUSY;
-                                       goto abort;
-                               }
-                               removed_disk = i;
-                               break;
-                       }
-               }
-               if (removed_disk == -1) {
-                       MD_BUG();
-                       err = 1;
-                       goto abort;
-               }
-               break;
-
-       case DISKOP_HOT_ADD_DISK:
-
-               for (i = conf->raid_disks; i < MD_SB_DISKS; i++) {
-                       tmp = conf->mirrors + i;
-                       if (!tmp->used_slot) {
-                               added_disk = i;
-                               break;
-                       }
-               }
-               if (added_disk == -1) {
-                       MD_BUG();
-                       err = 1;
-                       goto abort;
-               }
-               break;
+       n = mddev->nb_dev++;
+       realdev = &mddev->devices[n];
+       if (!realdev->sb) {
+               printk (NO_SUPERBLOCK);
+               return -EINVAL;
        }
+       if (realdev->sb->level != 1) {
+               printk (WRONG_LEVEL);
+               return -EINVAL;
+       }
+       /* FIXME: are there other things left we could sanity-check? */
 
-       switch (state) {
-       /*
-        * Switch the spare disk to write-only mode:
-        */
-       case DISKOP_SPARE_WRITE:
-               sdisk = conf->mirrors + spare_disk;
-               sdisk->operational = 1;
-               sdisk->write_only = 1;
-               break;
        /*
-        * Deactivate a spare disk:
+        * We have to disable interrupts, as our RAID-1 state is used
+        * from irq handlers as well.
         */
-       case DISKOP_SPARE_INACTIVE:
-               sdisk = conf->mirrors + spare_disk;
-               sdisk->operational = 0;
-               sdisk->write_only = 0;
-               break;
-       /*
-        * Activate (mark read-write) the (now sync) spare disk,
-        * which means we switch it's 'raid position' (->raid_disk)
-        * with the failed disk. (only the first 'conf->nr_disks'
-        * slots are used for 'real' disks and we must preserve this
-        * property)
-        */
-       case DISKOP_SPARE_ACTIVE:
-
-               sdisk = conf->mirrors + spare_disk;
-               fdisk = conf->mirrors + failed_disk;
-
-               spare_desc = &sb->disks[sdisk->number];
-               failed_desc = &sb->disks[fdisk->number];
-
-               if (spare_desc != *d) {
-                       MD_BUG();
-                       err = 1;
-                       goto abort;
-               }
-
-               if (spare_desc->raid_disk != sdisk->raid_disk) {
-                       MD_BUG();
-                       err = 1;
-                       goto abort;
-               }
-                       
-               if (sdisk->raid_disk != spare_disk) {
-                       MD_BUG();
-                       err = 1;
-                       goto abort;
-               }
+       save_flags(flags);
+       cli();
 
-               if (failed_desc->raid_disk != fdisk->raid_disk) {
-                       MD_BUG();
-                       err = 1;
-                       goto abort;
-               }
+       raid_conf->raid_disks++;
+       mirror = raid_conf->mirrors+n;
 
-               if (fdisk->raid_disk != failed_disk) {
-                       MD_BUG();
-                       err = 1;
-                       goto abort;
-               }
+       mirror->number=n;
+       mirror->raid_disk=n;
+       mirror->dev=dev;
+       mirror->next=0; /* FIXME */
+       mirror->sect_limit=128;
 
-               /*
-                * do the switch finally
-                */
-               xchg_values(*spare_desc, *failed_desc);
-               xchg_values(*fdisk, *sdisk);
+       mirror->operational=0;
+       mirror->spare=1;
+       mirror->write_only=0;
 
-               /*
-                * (careful, 'failed' and 'spare' are switched from now on)
-                *
-                * we want to preserve linear numbering and we want to
-                * give the proper raid_disk number to the now activated
-                * disk. (this means we switch back these values)
-                */
-       
-               xchg_values(spare_desc->raid_disk, failed_desc->raid_disk);
-               xchg_values(sdisk->raid_disk, fdisk->raid_disk);
-               xchg_values(spare_desc->number, failed_desc->number);
-               xchg_values(sdisk->number, fdisk->number);
+       sb->disks[n].state |= (1 << MD_FAULTY_DEVICE);
+       sb->disks[n].state &= ~(1 << MD_SYNC_DEVICE);
+       sb->disks[n].state &= ~(1 << MD_ACTIVE_DEVICE);
+       sb->nr_disks++;
+       sb->spare_disks++;
 
-               *d = failed_desc;
+       restore_flags(flags);
 
-               if (sdisk->dev == MKDEV(0,0))
-                       sdisk->used_slot = 0;
-               /*
-                * this really activates the spare.
-                */
-               fdisk->spare = 0;
-               fdisk->write_only = 0;
-               link_disk(conf, fdisk);
+       md_update_sb(MINOR(dev));
 
-               /*
-                * if we activate a spare, we definitely replace a
-                * non-operational disk slot in the 'low' area of
-                * the disk array.
-                */
+       printk (HOT_ADD_SUCCEEDED, kdevname(realdev->dev));
 
-               conf->working_disks++;
+       return 0;
+}
 
-               break;
+#undef NO_SUPERBLOCK
+#undef WRONG_LEVEL
+#undef HOT_ADD_SUCCEEDED
 
-       case DISKOP_HOT_REMOVE_DISK:
-               rdisk = conf->mirrors + removed_disk;
+/*
+ * Insert the spare disk into the drive-ring
+ */
+static void add_ring(struct raid1_data *raid_conf, struct mirror_info *mirror)
+{
+       int j, next;
+       struct mirror_info *p = raid_conf->mirrors;
 
-               if (rdisk->spare && (removed_disk < conf->raid_disks)) {
-                       MD_BUG();       
-                       err = 1;
-                       goto abort;
-               }
-               rdisk->dev = MKDEV(0,0);
-               rdisk->used_slot = 0;
-               conf->nr_disks--;
-               break;
-
-       case DISKOP_HOT_ADD_DISK:
-               adisk = conf->mirrors + added_disk;
-               added_desc = *d;
-
-               if (added_disk != added_desc->number) {
-                       MD_BUG();       
-                       err = 1;
-                       goto abort;
+       for (j = 0; j < raid_conf->raid_disks; j++, p++)
+               if (p->operational && !p->write_only) {
+                       next = p->next;
+                       p->next = mirror->raid_disk;
+                       mirror->next = next;
+                       return;
                }
+       printk("raid1: bug: no read-operational devices\n");
+}
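
add_ring() above and raid1_fix_links() earlier maintain the same structure: a ring of read-operational mirrors threaded through the `next` index of each mirrors[] entry. The pair of operations can be sketched on a simplified mirror table (field names follow the diff; the struct itself is cut down for illustration):

```c
/* Simplified model of the mirrors[] table: each entry's `next` field
 * holds the array index of the next read-operational mirror, forming
 * the ring that read balancing walks. */
struct mirror {
    int next;        /* index of the next mirror in the ring */
    int operational; /* 1 if the disk can service reads */
    int write_only;  /* 1 while a spare is still syncing */
};

/* Splice `target` out of the ring, as raid1_fix_links() does:
 * any entry pointing at `target` is redirected past it. */
void ring_remove(struct mirror *m, int disks, int target)
{
    int j;
    for (j = 0; j < disks; j++)
        if (m[j].next == target)
            m[j].next = m[target].next;
}

/* Insert `newidx` after the first read-operational entry, mirroring
 * add_ring(); returns 0 on success, -1 if no read-operational
 * device exists (the "bug" printk case in the diff). */
int ring_insert(struct mirror *m, int disks, int newidx)
{
    int j;
    for (j = 0; j < disks; j++)
        if (m[j].operational && !m[j].write_only) {
            m[newidx].next = m[j].next;
            m[j].next = newidx;
            return 0;
        }
    return -1;
}
```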
 
-               adisk->number = added_desc->number;
-               adisk->raid_disk = added_desc->raid_disk;
-               adisk->dev = MKDEV(added_desc->major,added_desc->minor);
-
-               adisk->operational = 0;
-               adisk->write_only = 0;
-               adisk->spare = 1;
-               adisk->used_slot = 1;
-               conf->nr_disks++;
+static int raid1_mark_spare(struct md_dev *mddev, md_descriptor_t *spare,
+                               int state)
+{
+       int i = 0, failed_disk = -1;
+       struct raid1_data *raid_conf = mddev->private;
+       struct mirror_info *mirror = raid_conf->mirrors;
+       md_descriptor_t *descriptor;
+       unsigned long flags;
 
-               break;
+       for (i = 0; i < MD_SB_DISKS; i++, mirror++) {
+               if (mirror->spare && mirror->number == spare->number)
+                       goto found;
+       }
+       return 1;
+found:
+       for (i = 0, mirror = raid_conf->mirrors; i < raid_conf->raid_disks;
+                                                               i++, mirror++)
+               if (!mirror->operational)
+                       failed_disk = i;
 
-       default:
-               MD_BUG();       
-               err = 1;
-               goto abort;
+       save_flags(flags);
+       cli();
+       switch (state) {
+               case SPARE_WRITE:
+                       mirror->operational = 1;
+                       mirror->write_only = 1;
+                       raid_conf->raid_disks = MAX(raid_conf->raid_disks,
+                                                       mirror->raid_disk + 1);
+                       break;
+               case SPARE_INACTIVE:
+                       mirror->operational = 0;
+                       mirror->write_only = 0;
+                       break;
+               case SPARE_ACTIVE:
+                       mirror->spare = 0;
+                       mirror->write_only = 0;
+                       raid_conf->working_disks++;
+                       add_ring(raid_conf, mirror);
+
+                       if (failed_disk != -1) {
+                               descriptor = &mddev->sb->disks[raid_conf->mirrors[failed_disk].number];
+                               i = spare->raid_disk;
+                               spare->raid_disk = descriptor->raid_disk;
+                               descriptor->raid_disk = i;
+                       }
+                       break;
+               default:
+                       printk("raid1_mark_spare: bug: state == %d\n", state);
+                       restore_flags(flags);
+                       return 1;
        }
-abort:
        restore_flags(flags);
-       print_raid1_conf(conf);
-       return err;
+       return 0;
 }
 
-
-#define IO_ERROR KERN_ALERT \
-"raid1: %s: unrecoverable I/O read error for block %lu\n"
-
-#define REDIRECT_SECTOR KERN_ERR \
-"raid1: %s: redirecting sector %lu to another mirror\n"
-
 /*
  * This is a kernel thread which:
  *
  *     1.      Retries failed read operations on working mirrors.
 *     2.      Updates the raid superblock when problems are encountered.
  */
-static void raid1d (void *data)
+void raid1d (void *data)
 {
        struct buffer_head *bh;
        kdev_t dev;
        unsigned long flags;
-       struct raid1_bh *r1_bh;
-       mddev_t *mddev;
+       struct raid1_bh * r1_bh;
+       struct md_dev *mddev;
 
+       PRINTK(("raid1d() active\n"));
+       save_flags(flags);
+       cli();
        while (raid1_retry_list) {
-               save_flags(flags);
-               cli();
                bh = raid1_retry_list;
                r1_bh = (struct raid1_bh *)(bh->b_dev_id);
                raid1_retry_list = r1_bh->next_retry;
                restore_flags(flags);
 
-               mddev = kdev_to_mddev(bh->b_dev);
+               mddev = md_dev + MINOR(bh->b_dev);
                if (mddev->sb_dirty) {
-                       printk(KERN_INFO "dirty sb detected, updating.\n");
+                       printk("dirty sb detected, updating.\n");
                        mddev->sb_dirty = 0;
-                       md_update_sb(mddev);
+                       md_update_sb(MINOR(bh->b_dev));
                }
                dev = bh->b_rdev;
-               __raid1_map (mddev, &bh->b_rdev, &bh->b_rsector,
-                                                        bh->b_size >> 9);
+               __raid1_map (md_dev + MINOR(bh->b_dev), &bh->b_rdev, &bh->b_rsector, bh->b_size >> 9);
                if (bh->b_rdev == dev) {
-                       printk (IO_ERROR, partition_name(bh->b_dev), bh->b_blocknr);
-                       raid1_end_bh_io(r1_bh, 0);
+                       printk (KERN_ALERT 
+                                       "raid1: %s: unrecoverable I/O read error for block %lu\n",
+                                               kdevname(bh->b_dev), bh->b_blocknr);
+                       raid1_end_buffer_io(r1_bh, 0);
                } else {
-                       printk (REDIRECT_SECTOR,
-                               partition_name(bh->b_dev), bh->b_blocknr);
+                       printk (KERN_ERR "raid1: %s: redirecting sector %lu to another mirror\n", 
+                                         kdevname(bh->b_dev), bh->b_blocknr);
                        map_and_make_request (r1_bh->cmd, bh);
                }
+               cli();
        }
+       restore_flags(flags);
 }
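
In raid1d() the retry list head is shared with interrupt context, which is why each pop of raid1_retry_list happens with interrupts disabled (cli/restore_flags). The list discipline can be sketched in userspace with a mutex standing in for the interrupt disable — a toy analogue, not the kernel's locking primitives:

```c
#include <pthread.h>
#include <stddef.h>

/* Toy analogue of the raid1_retry_list handling in raid1d(): entries
 * are pushed from completion context and popped by the daemon, so
 * every head manipulation is done under a lock (the kernel uses
 * cli()/restore_flags(); a mutex stands in here). */
struct retry { struct retry *next_retry; };

static struct retry *retry_list;
static pthread_mutex_t retry_lock = PTHREAD_MUTEX_INITIALIZER;

/* Push one failed request onto the head of the retry list. */
void retry_push(struct retry *r)
{
    pthread_mutex_lock(&retry_lock);
    r->next_retry = retry_list;
    retry_list = r;
    pthread_mutex_unlock(&retry_lock);
}

/* Pop the head for reprocessing; NULL when the list is drained. */
struct retry *retry_pop(void)
{
    struct retry *r;
    pthread_mutex_lock(&retry_lock);
    r = retry_list;
    if (r)
        retry_list = r->next_retry;
    pthread_mutex_unlock(&retry_lock);
    return r;
}
```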
-#undef IO_ERROR
-#undef REDIRECT_SECTOR
-
-/*
- * Private kernel thread to reconstruct mirrors after an unclean
- * shutdown.
- */
-static void raid1syncd (void *data)
-{
-        raid1_conf_t *conf = data;
-        mddev_t *mddev = conf->mddev;
-
-        if (!conf->resync_mirrors)
-                return;
-        if (conf->resync_mirrors == 2)
-                return;
-       down(&mddev->recovery_sem);
-        if (md_do_sync(mddev, NULL)) {
-               up(&mddev->recovery_sem);
-               return;
-       }
-       /*
-        * Only if everything went Ok.
-        */
-        conf->resync_mirrors = 0;
-       up(&mddev->recovery_sem);
-}
-
 
 /*
  * This will catch the scenario in which one of the mirrors was
  * mounted as a normal device rather than as a part of a raid set.
- *
- * check_consistency is very personality-dependent, eg. RAID5 cannot
- * do this check, it uses another method.
  */
-static int __check_consistency (mddev_t *mddev, int row)
+static int __check_consistency (struct md_dev *mddev, int row)
 {
-       raid1_conf_t *conf = mddev_to_conf(mddev);
-       int disks = MD_SB_DISKS;
+       struct raid1_data *raid_conf = mddev->private;
        kdev_t dev;
        struct buffer_head *bh = NULL;
        int i, rc = 0;
        char *buffer = NULL;
 
-       for (i = 0; i < disks; i++) {
-               printk("(checking disk %d)\n",i);
-               if (!conf->mirrors[i].operational)
+       for (i = 0; i < raid_conf->raid_disks; i++) {
+               if (!raid_conf->mirrors[i].operational)
                        continue;
-               printk("(really checking disk %d)\n",i);
-               dev = conf->mirrors[i].dev;
+               dev = raid_conf->mirrors[i].dev;
                set_blocksize(dev, 4096);
                if ((bh = bread(dev, row / 4, 4096)) == NULL)
                        break;
@@ -864,342 +675,163 @@ static int __check_consistency (mddev_t *mddev, int row)
        return rc;
 }
 
-static int check_consistency (mddev_t *mddev)
+static int check_consistency (struct md_dev *mddev)
 {
-       if (__check_consistency(mddev, 0))
-/*
- * we do not do this currently, as it's perfectly possible to
- * have an inconsistent array when it's freshly created. Only
- * newly written data has to be consistent.
- */
-               return 0;
+       int size = mddev->sb->size;
+       int row;
 
+       for (row = 0; row < size; row += size / 8)
+               if (__check_consistency(mddev, row))
+                       return 1;
        return 0;
 }
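
The new check_consistency() above no longer checks a single row: it probes the array at eight evenly spaced rows and reports inconsistency on the first mismatch. The sampling loop can be sketched as follows, with check_row() standing in for __check_consistency(); the zero-step guard for very small arrays is an addition of this sketch, not present in the diff:

```c
/* Probe `size` rows at eight evenly spaced offsets; one mismatching
 * row is enough to declare the mirrors inconsistent. */
int check_sampled(int size, int (*check_row)(int row))
{
    int step = size / 8 ? size / 8 : 1;  /* guard against a zero step */
    int row;

    for (row = 0; row < size; row += step)
        if (check_row(row))
            return 1;  /* mirrors differ at this row */
    return 0;          /* all probed rows matched */
}

/* Example probe: pretend the mirrors disagree at and after row 512. */
static int demo_row(int row) { return row >= 512; }
```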
 
-#define INVALID_LEVEL KERN_WARNING \
-"raid1: md%d: raid level not set to mirroring (%d)\n"
-
-#define NO_SB KERN_ERR \
-"raid1: disabled mirror %s (couldn't access raid superblock)\n"
-
-#define ERRORS KERN_ERR \
-"raid1: disabled mirror %s (errors detected)\n"
-
-#define NOT_IN_SYNC KERN_ERR \
-"raid1: disabled mirror %s (not in sync)\n"
-
-#define INCONSISTENT KERN_ERR \
-"raid1: disabled mirror %s (inconsistent descriptor)\n"
-
-#define ALREADY_RUNNING KERN_ERR \
-"raid1: disabled mirror %s (mirror %d already operational)\n"
-
-#define OPERATIONAL KERN_INFO \
-"raid1: device %s operational as mirror %d\n"
-
-#define MEM_ERROR KERN_ERR \
-"raid1: couldn't allocate memory for md%d\n"
-
-#define SPARE KERN_INFO \
-"raid1: spare disk %s\n"
-
-#define NONE_OPERATIONAL KERN_ERR \
-"raid1: no operational mirrors for md%d\n"
-
-#define RUNNING_CKRAID KERN_ERR \
-"raid1: detected mirror differences -- running resync\n"
-
-#define ARRAY_IS_ACTIVE KERN_INFO \
-"raid1: raid set md%d active with %d out of %d mirrors\n"
-
-#define THREAD_ERROR KERN_ERR \
-"raid1: couldn't allocate thread for md%d\n"
-
-#define START_RESYNC KERN_WARNING \
-"raid1: raid set md%d not clean; reconstructing mirrors\n"
-
-static int raid1_run (mddev_t *mddev)
+static int raid1_run (int minor, struct md_dev *mddev)
 {
-       raid1_conf_t *conf;
-       int i, j, disk_idx;
-       struct mirror_info *disk;
-       mdp_super_t *sb = mddev->sb;
-       mdp_disk_t *descriptor;
-       mdk_rdev_t *rdev;
-       struct md_list_head *tmp;
-       int start_recovery = 0;
+       struct raid1_data *raid_conf;
+       int i, j, raid_disk;
+       md_superblock_t *sb = mddev->sb;
+       md_descriptor_t *descriptor;
+       struct real_dev *realdev;
 
        MOD_INC_USE_COUNT;
 
        if (sb->level != 1) {
-               printk(INVALID_LEVEL, mdidx(mddev), sb->level);
-               goto out;
+               printk("raid1: %s: raid level not set to mirroring (%d)\n",
+                               kdevname(MKDEV(MD_MAJOR, minor)), sb->level);
+               MOD_DEC_USE_COUNT;
+               return -EIO;
        }
-       /*
-        * copy the already verified devices into our private RAID1
-        * bookkeeping area. [whatever we allocate in raid1_run(),
-        * should be freed in raid1_stop()]
+       /****
+        * copy the now verified devices into our private RAID1 bookkeeping
+        * area. [whatever we allocate in raid1_run(), should be freed in
+        * raid1_stop()]
         */
 
-       conf = raid1_kmalloc(sizeof(raid1_conf_t));
-       mddev->private = conf;
-       if (!conf) {
-               printk(MEM_ERROR, mdidx(mddev));
-               goto out;
-       }
+       while (!( /* FIXME: now we are rather fault tolerant than nice */
+       mddev->private = kmalloc (sizeof (struct raid1_data), GFP_KERNEL)
+       ) )
+               printk ("raid1_run(): out of memory\n");
+       raid_conf = mddev->private;
+       memset(raid_conf, 0, sizeof(*raid_conf));
 
-       ITERATE_RDEV(mddev,rdev,tmp) {
-               if (rdev->faulty) {
-                       printk(ERRORS, partition_name(rdev->dev));
-               } else {
-                       if (!rdev->sb) {
-                               MD_BUG();
-                               continue;
-                       }
-               }
-               if (rdev->desc_nr == -1) {
-                       MD_BUG();
+       PRINTK(("raid1_run(%d) called.\n", minor));
+
+       for (i = 0; i < mddev->nb_dev; i++) {
+               realdev = &mddev->devices[i];
+               if (!realdev->sb) {
+                       printk(KERN_ERR "raid1: disabled mirror %s (couldn't access raid superblock)\n", kdevname(realdev->dev));
                        continue;
                }
-               descriptor = &sb->disks[rdev->desc_nr];
-               disk_idx = descriptor->raid_disk;
-               disk = conf->mirrors + disk_idx;
-
-               if (disk_faulty(descriptor)) {
-                       disk->number = descriptor->number;
-                       disk->raid_disk = disk_idx;
-                       disk->dev = rdev->dev;
-                       disk->sect_limit = MAX_LINEAR_SECTORS;
-                       disk->operational = 0;
-                       disk->write_only = 0;
-                       disk->spare = 0;
-                       disk->used_slot = 1;
+
+               /*
+                * This is important -- we are using the descriptor on
+                * the disk only to get a pointer to the descriptor on
+                * the main superblock, which might be more recent.
+                */
+               descriptor = &sb->disks[realdev->sb->descriptor.number];
+               if (descriptor->state & (1 << MD_FAULTY_DEVICE)) {
+                       printk(KERN_ERR "raid1: disabled mirror %s (errors detected)\n", kdevname(realdev->dev));
                        continue;
                }
-               if (disk_active(descriptor)) {
-                       if (!disk_sync(descriptor)) {
-                               printk(NOT_IN_SYNC,
-                                       partition_name(rdev->dev));
+               if (descriptor->state & (1 << MD_ACTIVE_DEVICE)) {
+                       if (!(descriptor->state & (1 << MD_SYNC_DEVICE))) {
+                               printk(KERN_ERR "raid1: disabled mirror %s (not in sync)\n", kdevname(realdev->dev));
                                continue;
                        }
-                       if ((descriptor->number > MD_SB_DISKS) ||
-                                        (disk_idx > sb->raid_disks)) {
-
-                               printk(INCONSISTENT,
-                                       partition_name(rdev->dev));
+                       raid_disk = descriptor->raid_disk;
+                       if (descriptor->number > sb->nr_disks || raid_disk > sb->raid_disks) {
+                               printk(KERN_ERR "raid1: disabled mirror %s (inconsistent descriptor)\n", kdevname(realdev->dev));
                                continue;
                        }
-                       if (disk->operational) {
-                               printk(ALREADY_RUNNING,
-                                       partition_name(rdev->dev),
-                                       disk_idx);
+                       if (raid_conf->mirrors[raid_disk].operational) {
+                               printk(KERN_ERR "raid1: disabled mirror %s (mirror %d already operational)\n", kdevname(realdev->dev), raid_disk);
                                continue;
                        }
-                       printk(OPERATIONAL, partition_name(rdev->dev),
-                                       disk_idx);
-                       disk->number = descriptor->number;
-                       disk->raid_disk = disk_idx;
-                       disk->dev = rdev->dev;
-                       disk->sect_limit = MAX_LINEAR_SECTORS;
-                       disk->operational = 1;
-                       disk->write_only = 0;
-                       disk->spare = 0;
-                       disk->used_slot = 1;
-                       conf->working_disks++;
+                       printk(KERN_INFO "raid1: device %s operational as mirror %d\n", kdevname(realdev->dev), raid_disk);
+                       raid_conf->mirrors[raid_disk].number = descriptor->number;
+                       raid_conf->mirrors[raid_disk].raid_disk = raid_disk;
+                       raid_conf->mirrors[raid_disk].dev = mddev->devices [i].dev;
+                       raid_conf->mirrors[raid_disk].operational = 1;
+                       raid_conf->mirrors[raid_disk].sect_limit = 128;
+                       raid_conf->working_disks++;
                } else {
                /*
                 * Must be a spare disk ..
                 */
-                       printk(SPARE, partition_name(rdev->dev));
-                       disk->number = descriptor->number;
-                       disk->raid_disk = disk_idx;
-                       disk->dev = rdev->dev;
-                       disk->sect_limit = MAX_LINEAR_SECTORS;
-                       disk->operational = 0;
-                       disk->write_only = 0;
-                       disk->spare = 1;
-                       disk->used_slot = 1;
-               }
-       }
-       if (!conf->working_disks) {
-               printk(NONE_OPERATIONAL, mdidx(mddev));
-               goto out_free_conf;
-       }
-
-       conf->raid_disks = sb->raid_disks;
-       conf->nr_disks = sb->nr_disks;
-       conf->mddev = mddev;
-
-       for (i = 0; i < MD_SB_DISKS; i++) {
-               
-               descriptor = sb->disks+i;
-               disk_idx = descriptor->raid_disk;
-               disk = conf->mirrors + disk_idx;
+                       printk(KERN_INFO "raid1: spare disk %s\n", kdevname(realdev->dev));
+                       raid_disk = descriptor->raid_disk;
+                       raid_conf->mirrors[raid_disk].number = descriptor->number;
+                       raid_conf->mirrors[raid_disk].raid_disk = raid_disk;
+                       raid_conf->mirrors[raid_disk].dev = mddev->devices [i].dev;
+                       raid_conf->mirrors[raid_disk].sect_limit = 128;
 
-               if (disk_faulty(descriptor) && (disk_idx < conf->raid_disks) &&
-                               !disk->used_slot) {
-
-                       disk->number = descriptor->number;
-                       disk->raid_disk = disk_idx;
-                       disk->dev = MKDEV(0,0);
-
-                       disk->operational = 0;
-                       disk->write_only = 0;
-                       disk->spare = 0;
-                       disk->used_slot = 1;
+                       raid_conf->mirrors[raid_disk].operational = 0;
+                       raid_conf->mirrors[raid_disk].write_only = 0;
+                       raid_conf->mirrors[raid_disk].spare = 1;
                }
        }
-
-       /*
-        * find the first working one and use it as a starting point
-        * to read balancing.
-        */
-       for (j = 0; !conf->mirrors[j].operational; j++)
-               /* nothing */;
-       conf->last_used = j;
-
-       /*
-        * initialize the 'working disks' list.
-        */
-       for (i = conf->raid_disks - 1; i >= 0; i--) {
-               if (conf->mirrors[i].operational) {
-                       conf->mirrors[i].next = j;
-                       j = i;
-               }
+       if (!raid_conf->working_disks) {
+               printk(KERN_ERR "raid1: no operational mirrors for %s\n", kdevname(MKDEV(MD_MAJOR, minor)));
+               kfree(raid_conf);
+               mddev->private = NULL;
+               MOD_DEC_USE_COUNT;
+               return -EIO;
        }
 
-       if (conf->working_disks != sb->raid_disks) {
-               printk(KERN_ALERT "raid1: md%d, not all disks are operational -- trying to recover array\n", mdidx(mddev));
-               start_recovery = 1;
-       }
+       raid_conf->raid_disks = sb->raid_disks;
+       raid_conf->mddev = mddev;
 
-       if (!start_recovery && (sb->state & (1 << MD_SB_CLEAN))) {
-               /*
-                * we do sanity checks even if the device says
-                * it's clean ...
-                */
-               if (check_consistency(mddev)) {
-                       printk(RUNNING_CKRAID);
-                       sb->state &= ~(1 << MD_SB_CLEAN);
+       for (j = 0; !raid_conf->mirrors[j].operational; j++);
+       raid_conf->last_used = j;
+       for (i = raid_conf->raid_disks - 1; i >= 0; i--) {
+               if (raid_conf->mirrors[i].operational) {
+                       PRINTK(("raid_conf->mirrors[%d].next == %d\n", i, j));
+                       raid_conf->mirrors[i].next = j;
+                       j = i;
                }
        }
 
-       {
-               const char * name = "raid1d";
-
-               conf->thread = md_register_thread(raid1d, conf, name);
-               if (!conf->thread) {
-                       printk(THREAD_ERROR, mdidx(mddev));
-                       goto out_free_conf;
-               }
+       if (check_consistency(mddev)) {
+               printk(KERN_ERR "raid1: detected mirror differences -- run ckraid\n");
+               sb->state |= 1 << MD_SB_ERRORS;
+               kfree(raid_conf);
+               mddev->private = NULL;
+               MOD_DEC_USE_COUNT;
+               return -EIO;
        }
 
-       if (!start_recovery && !(sb->state & (1 << MD_SB_CLEAN))) {
-               const char * name = "raid1syncd";
-
-               conf->resync_thread = md_register_thread(raid1syncd, conf,name);
-               if (!conf->resync_thread) {
-                       printk(THREAD_ERROR, mdidx(mddev));
-                       goto out_free_conf;
-               }
-
-               printk(START_RESYNC, mdidx(mddev));
-                conf->resync_mirrors = 1;
-                md_wakeup_thread(conf->resync_thread);
-        }
-
        /*
         * Regenerate the "device is in sync with the raid set" bit for
         * each device.
         */
-       for (i = 0; i < MD_SB_DISKS; i++) {
-               mark_disk_nonsync(sb->disks+i);
+       for (i = 0; i < sb->nr_disks ; i++) {
+               sb->disks[i].state &= ~(1 << MD_SYNC_DEVICE);
                for (j = 0; j < sb->raid_disks; j++) {
-                       if (!conf->mirrors[j].operational)
+                       if (!raid_conf->mirrors[j].operational)
                                continue;
-                       if (sb->disks[i].number == conf->mirrors[j].number)
-                               mark_disk_sync(sb->disks+i);
-               }
-       }
-       sb->active_disks = conf->working_disks;
-
-       if (start_recovery)
-               md_recover_arrays();
-
-
-       printk(ARRAY_IS_ACTIVE, mdidx(mddev), sb->active_disks, sb->raid_disks);
-       /*
-        * Ok, everything is just fine now
-        */
-       return 0;
-
-out_free_conf:
-       kfree(conf);
-       mddev->private = NULL;
-out:
-       MOD_DEC_USE_COUNT;
-       return -EIO;
-}
-
-#undef INVALID_LEVEL
-#undef NO_SB
-#undef ERRORS
-#undef NOT_IN_SYNC
-#undef INCONSISTENT
-#undef ALREADY_RUNNING
-#undef OPERATIONAL
-#undef SPARE
-#undef NONE_OPERATIONAL
-#undef RUNNING_CKRAID
-#undef ARRAY_IS_ACTIVE
-
-static int raid1_stop_resync (mddev_t *mddev)
-{
-       raid1_conf_t *conf = mddev_to_conf(mddev);
-
-       if (conf->resync_thread) {
-               if (conf->resync_mirrors) {
-                       conf->resync_mirrors = 2;
-                       md_interrupt_thread(conf->resync_thread);
-                       printk(KERN_INFO "raid1: mirror resync was not fully finished, restarting next time.\n");
-                       return 1;
+                       if (sb->disks[i].number == raid_conf->mirrors[j].number)
+                               sb->disks[i].state |= 1 << MD_SYNC_DEVICE;
                }
-               return 0;
        }
-       return 0;
-}
-
-static int raid1_restart_resync (mddev_t *mddev)
-{
-       raid1_conf_t *conf = mddev_to_conf(mddev);
+       sb->active_disks = raid_conf->working_disks;
 
-       if (conf->resync_mirrors) {
-               if (!conf->resync_thread) {
-                       MD_BUG();
-                       return 0;
-               }
-               conf->resync_mirrors = 1;
-               md_wakeup_thread(conf->resync_thread);
-               return 1;
-       }
-       return 0;
+       printk("raid1: raid set %s active with %d out of %d mirrors\n", kdevname(MKDEV(MD_MAJOR, minor)), sb->active_disks, sb->raid_disks);
+       /* Ok, everything is just fine now */
+       return (0);
 }
 
-static int raid1_stop (mddev_t *mddev)
+static int raid1_stop (int minor, struct md_dev *mddev)
 {
-       raid1_conf_t *conf = mddev_to_conf(mddev);
+       struct raid1_data *raid_conf = (struct raid1_data *) mddev->private;
 
-       md_unregister_thread(conf->thread);
-       if (conf->resync_thread)
-               md_unregister_thread(conf->resync_thread);
-       kfree(conf);
+       kfree (raid_conf);
        mddev->private = NULL;
        MOD_DEC_USE_COUNT;
        return 0;
 }
 
-static mdk_personality_t raid1_personality=
+static struct md_personality raid1_personality=
 {
        "raid1",
        raid1_map,
@@ -1211,13 +843,15 @@ static mdk_personality_t raid1_personality=
        NULL,                   /* no ioctls */
        0,
        raid1_error,
-       raid1_diskop,
-       raid1_stop_resync,
-       raid1_restart_resync
+       raid1_hot_add_disk,
+       /* raid1_hot_remove_drive */ NULL,
+       raid1_mark_spare
 };
 
 int raid1_init (void)
 {
+       if ((raid1_thread = md_register_thread(raid1d, NULL)) == NULL)
+               return -EBUSY;
        return register_md_personality (RAID1, &raid1_personality);
 }
 
@@ -1229,6 +863,7 @@ int init_module (void)
 
 void cleanup_module (void)
 {
+       md_unregister_thread (raid1_thread);
        unregister_md_personality (RAID1);
 }
 #endif
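Aside from the diff itself: the loop in raid1_run above (`for (j = 0; !raid_conf->mirrors[j].operational; j++); ... mirrors[i].next = j;`) threads every operational mirror into a circular list so reads can rotate through working disks. A minimal user-space sketch of that ring construction, with illustrative names and a fixed array size (nothing here is kernel API):

```c
/* Hypothetical model of the read-balancing ring: each operational
 * mirror's .next points to the next operational mirror, wrapping
 * around to the first one. */
struct mirror {
	int operational;
	int next;
};

/* Link operational mirrors into a ring; returns the first one,
 * which the kernel stores as last_used. */
static int build_ring(struct mirror *m, int n)
{
	int i, j;

	for (j = 0; !m[j].operational; j++)	/* first working mirror */
		;
	for (i = n - 1; i >= 0; i--)
		if (m[i].operational) {
			m[i].next = j;		/* point at successor */
			j = i;
		}
	return j;	/* lowest operational index closes the ring */
}
```

Walking `.next` from the returned index visits every working mirror exactly once before wrapping, which is what lets the driver spread reads without ever landing on a failed disk.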
index 4db59b8e77c420fad9ef9272cf917151ab1775e2..66713a84b955d46803ab628d3d96a49d95d3e05b 100644 (file)
@@ -1,4 +1,4 @@
-/*
+/*****************************************************************************
  * raid5.c : Multiple Devices driver for Linux
  *           Copyright (C) 1996, 1997 Ingo Molnar, Miguel de Icaza, Gadi Oxman
  *
  * Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
  */
 
-
 #include <linux/module.h>
 #include <linux/locks.h>
 #include <linux/malloc.h>
-#include <linux/raid/raid5.h>
+#include <linux/md.h>
+#include <linux/raid5.h>
 #include <asm/bitops.h>
 #include <asm/atomic.h>
+#include <asm/md.h>
 
-static mdk_personality_t raid5_personality;
+static struct md_personality raid5_personality;
 
 /*
  * Stripe cache
@@ -32,7 +33,7 @@ static mdk_personality_t raid5_personality;
 #define HASH_PAGES_ORDER       0
 #define NR_HASH                        (HASH_PAGES * PAGE_SIZE / sizeof(struct stripe_head *))
 #define HASH_MASK              (NR_HASH - 1)
-#define stripe_hash(conf, sect, size)  ((conf)->stripe_hashtbl[((sect) / (size >> 9)) & HASH_MASK])
+#define stripe_hash(raid_conf, sect, size)     ((raid_conf)->stripe_hashtbl[((sect) / (size >> 9)) & HASH_MASK])
 
 /*
  * The following can be used to debug the driver
@@ -45,8 +46,6 @@ static mdk_personality_t raid5_personality;
 #define PRINTK(x)   do { ; } while (0)
 #endif
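The `stripe_hash()` macro in the hunk above maps a sector to a hash bucket by dividing out the buffer size in sectors and masking with `HASH_MASK`, which only works because `NR_HASH` is a power of two. A small sketch with illustrative constants (not the kernel's, which derive from `PAGE_SIZE`):

```c
#define NR_HASH   256u			/* must be a power of two */
#define HASH_MASK (NR_HASH - 1)

/* sect: start sector of the stripe; size: buffer size in bytes.
 * size >> 9 converts bytes to 512-byte sectors, so the division
 * yields a stripe number, which the mask folds into the table. */
static unsigned int stripe_bucket(unsigned long sect, int size)
{
	return (unsigned int)((sect / (unsigned long)(size >> 9)) & HASH_MASK);
}
```

Masking instead of `%` saves a division per lookup; the cost is that changing the buffer size changes every stripe's bucket, which is why `find_stripe()` below flushes the whole cache when the size switches.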
 
-static void print_raid5_conf (raid5_conf_t *conf);
-
 static inline int stripe_locked(struct stripe_head *sh)
 {
        return test_bit(STRIPE_LOCKED, &sh->state);
@@ -62,32 +61,32 @@ static inline int stripe_error(struct stripe_head *sh)
  */
 static inline void lock_stripe(struct stripe_head *sh)
 {
-       raid5_conf_t *conf = sh->raid_conf;
-       if (!md_test_and_set_bit(STRIPE_LOCKED, &sh->state)) {
+       struct raid5_data *raid_conf = sh->raid_conf;
+       if (!test_and_set_bit(STRIPE_LOCKED, &sh->state)) {
                PRINTK(("locking stripe %lu\n", sh->sector));
-               conf->nr_locked_stripes++;
+               raid_conf->nr_locked_stripes++;
        }
 }
 
 static inline void unlock_stripe(struct stripe_head *sh)
 {
-       raid5_conf_t *conf = sh->raid_conf;
-       if (md_test_and_clear_bit(STRIPE_LOCKED, &sh->state)) {
+       struct raid5_data *raid_conf = sh->raid_conf;
+       if (test_and_clear_bit(STRIPE_LOCKED, &sh->state)) {
                PRINTK(("unlocking stripe %lu\n", sh->sector));
-               conf->nr_locked_stripes--;
+               raid_conf->nr_locked_stripes--;
                wake_up(&sh->wait);
        }
 }
 
 static inline void finish_stripe(struct stripe_head *sh)
 {
-       raid5_conf_t *conf = sh->raid_conf;
+       struct raid5_data *raid_conf = sh->raid_conf;
        unlock_stripe(sh);
        sh->cmd = STRIPE_NONE;
        sh->phase = PHASE_COMPLETE;
-       conf->nr_pending_stripes--;
-       conf->nr_cached_stripes++;
-       wake_up(&conf->wait_for_stripe);
+       raid_conf->nr_pending_stripes--;
+       raid_conf->nr_cached_stripes++;
+       wake_up(&raid_conf->wait_for_stripe);
 }
 
 void __wait_on_stripe(struct stripe_head *sh)
@@ -115,7 +114,7 @@ static inline void wait_on_stripe(struct stripe_head *sh)
                __wait_on_stripe(sh);
 }
 
-static inline void remove_hash(raid5_conf_t *conf, struct stripe_head *sh)
+static inline void remove_hash(struct raid5_data *raid_conf, struct stripe_head *sh)
 {
        PRINTK(("remove_hash(), stripe %lu\n", sh->sector));
 
@@ -124,22 +123,21 @@ static inline void remove_hash(raid5_conf_t *conf, struct stripe_head *sh)
                        sh->hash_next->hash_pprev = sh->hash_pprev;
                *sh->hash_pprev = sh->hash_next;
                sh->hash_pprev = NULL;
-               conf->nr_hashed_stripes--;
+               raid_conf->nr_hashed_stripes--;
        }
 }
 
-static inline void insert_hash(raid5_conf_t *conf, struct stripe_head *sh)
+static inline void insert_hash(struct raid5_data *raid_conf, struct stripe_head *sh)
 {
-       struct stripe_head **shp = &stripe_hash(conf, sh->sector, sh->size);
+       struct stripe_head **shp = &stripe_hash(raid_conf, sh->sector, sh->size);
 
-       PRINTK(("insert_hash(), stripe %lu, nr_hashed_stripes %d\n",
-                       sh->sector, conf->nr_hashed_stripes));
+       PRINTK(("insert_hash(), stripe %lu, nr_hashed_stripes %d\n", sh->sector, raid_conf->nr_hashed_stripes));
 
        if ((sh->hash_next = *shp) != NULL)
                (*shp)->hash_pprev = &sh->hash_next;
        *shp = sh;
        sh->hash_pprev = shp;
-       conf->nr_hashed_stripes++;
+       raid_conf->nr_hashed_stripes++;
 }
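The `hash_pprev` field manipulated by `insert_hash()`/`remove_hash()` above is an intrusive-list idiom: each node stores the address of whatever pointer currently points at it, so unlinking needs neither the list head nor a backward scan. A self-contained user-space sketch of the same two operations (structure and function names are illustrative):

```c
#include <stddef.h>

struct node {
	struct node *next;
	struct node **pprev;	/* address of the pointer pointing at us */
};

static void hlist_insert(struct node **head, struct node *n)
{
	if ((n->next = *head) != NULL)
		(*head)->pprev = &n->next;
	*head = n;
	n->pprev = head;
}

static void hlist_remove(struct node *n)
{
	if (n->pprev) {		/* NULL pprev means "not on any list" */
		if (n->next)
			n->next->pprev = n->pprev;
		*n->pprev = n->next;
		n->pprev = NULL;
	}
}
```

Because `pprev` points at a pointer rather than a node, removal is the same code whether the node is first in its bucket or mid-chain, which is exactly why the stripe cache can unhash any stripe in O(1).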
 
 static struct buffer_head *get_free_buffer(struct stripe_head *sh, int b_size)
@@ -147,15 +145,13 @@ static struct buffer_head *get_free_buffer(struct stripe_head *sh, int b_size)
        struct buffer_head *bh;
        unsigned long flags;
 
-       md_spin_lock_irqsave(&sh->stripe_lock, flags);
-       bh = sh->buffer_pool;
-       if (!bh)
-               goto out_unlock;
+       save_flags(flags);
+       cli();
+       if ((bh = sh->buffer_pool) == NULL)
+               return NULL;
        sh->buffer_pool = bh->b_next;
        bh->b_size = b_size;
-out_unlock:
-       md_spin_unlock_irqrestore(&sh->stripe_lock, flags);
-
+       restore_flags(flags);
        return bh;
 }
 
@@ -164,14 +160,12 @@ static struct buffer_head *get_free_bh(struct stripe_head *sh)
        struct buffer_head *bh;
        unsigned long flags;
 
-       md_spin_lock_irqsave(&sh->stripe_lock, flags);
-       bh = sh->bh_pool;
-       if (!bh)
-               goto out_unlock;
+       save_flags(flags);
+       cli();
+       if ((bh = sh->bh_pool) == NULL)
+               return NULL;
        sh->bh_pool = bh->b_next;
-out_unlock:
-       md_spin_unlock_irqrestore(&sh->stripe_lock, flags);
-
+       restore_flags(flags);
        return bh;
 }
 
@@ -179,52 +173,54 @@ static void put_free_buffer(struct stripe_head *sh, struct buffer_head *bh)
 {
        unsigned long flags;
 
-       md_spin_lock_irqsave(&sh->stripe_lock, flags);
+       save_flags(flags);
+       cli();
        bh->b_next = sh->buffer_pool;
        sh->buffer_pool = bh;
-       md_spin_unlock_irqrestore(&sh->stripe_lock, flags);
+       restore_flags(flags);
 }
 
 static void put_free_bh(struct stripe_head *sh, struct buffer_head *bh)
 {
        unsigned long flags;
 
-       md_spin_lock_irqsave(&sh->stripe_lock, flags);
+       save_flags(flags);
+       cli();
        bh->b_next = sh->bh_pool;
        sh->bh_pool = bh;
-       md_spin_unlock_irqrestore(&sh->stripe_lock, flags);
+       restore_flags(flags);
 }
 
-static struct stripe_head *get_free_stripe(raid5_conf_t *conf)
+static struct stripe_head *get_free_stripe(struct raid5_data *raid_conf)
 {
        struct stripe_head *sh;
        unsigned long flags;
 
        save_flags(flags);
        cli();
-       if ((sh = conf->free_sh_list) == NULL) {
+       if ((sh = raid_conf->free_sh_list) == NULL) {
                restore_flags(flags);
                return NULL;
        }
-       conf->free_sh_list = sh->free_next;
-       conf->nr_free_sh--;
-       if (!conf->nr_free_sh && conf->free_sh_list)
+       raid_conf->free_sh_list = sh->free_next;
+       raid_conf->nr_free_sh--;
+       if (!raid_conf->nr_free_sh && raid_conf->free_sh_list)
                printk ("raid5: bug: free_sh_list != NULL, nr_free_sh == 0\n");
        restore_flags(flags);
-       if (sh->hash_pprev || md_atomic_read(&sh->nr_pending) || sh->count)
+       if (sh->hash_pprev || sh->nr_pending || sh->count)
                printk("get_free_stripe(): bug\n");
        return sh;
 }
 
-static void put_free_stripe(raid5_conf_t *conf, struct stripe_head *sh)
+static void put_free_stripe(struct raid5_data *raid_conf, struct stripe_head *sh)
 {
        unsigned long flags;
 
        save_flags(flags);
        cli();
-       sh->free_next = conf->free_sh_list;
-       conf->free_sh_list = sh;
-       conf->nr_free_sh++;
+       sh->free_next = raid_conf->free_sh_list;
+       raid_conf->free_sh_list = sh;
+       raid_conf->nr_free_sh++;
        restore_flags(flags);
 }
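`get_free_stripe()`/`put_free_stripe()` above keep spare stripe heads on a LIFO list chained through `free_next`, bracketed by `cli()`/`restore_flags()` so an interrupt cannot observe a half-updated list. A user-space sketch of just the list discipline, with the interrupt masking omitted and illustrative names:

```c
#include <stddef.h>

struct sh {
	struct sh *free_next;
};

struct pool {
	struct sh *free_list;
	int nr_free;
};

/* Pop one free element, or NULL if the pool is exhausted. */
static struct sh *pool_get(struct pool *p)
{
	struct sh *s = p->free_list;

	if (s) {
		p->free_list = s->free_next;
		p->nr_free--;
	}
	return s;
}

/* Push an element back; most recently freed is reused first. */
static void pool_put(struct pool *p, struct sh *s)
{
	s->free_next = p->free_list;
	p->free_list = s;
	p->nr_free++;
}
```

The LIFO order is deliberate: the most recently freed stripe head is the most likely to still be cache-warm when the next request needs one.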
 
@@ -328,8 +324,8 @@ static void raid5_update_old_bh(struct stripe_head *sh, int i)
 
 static void kfree_stripe(struct stripe_head *sh)
 {
-       raid5_conf_t *conf = sh->raid_conf;
-       int disks = conf->raid_disks, j;
+       struct raid5_data *raid_conf = sh->raid_conf;
+       int disks = raid_conf->raid_disks, j;
 
        PRINTK(("kfree_stripe called, stripe %lu\n", sh->sector));
        if (sh->phase != PHASE_COMPLETE || stripe_locked(sh) || sh->count) {
@@ -342,19 +338,19 @@ static void kfree_stripe(struct stripe_head *sh)
                if (sh->bh_new[j] || sh->bh_copy[j])
                        printk("raid5: bug: sector %lu, new %p, copy %p\n", sh->sector, sh->bh_new[j], sh->bh_copy[j]);
        }
-       remove_hash(conf, sh);
-       put_free_stripe(conf, sh);
+       remove_hash(raid_conf, sh);
+       put_free_stripe(raid_conf, sh);
 }
 
-static int shrink_stripe_cache(raid5_conf_t *conf, int nr)
+static int shrink_stripe_cache(struct raid5_data *raid_conf, int nr)
 {
        struct stripe_head *sh;
        int i, count = 0;
 
-       PRINTK(("shrink_stripe_cache called, %d/%d, clock %d\n", nr, conf->nr_hashed_stripes, conf->clock));
+       PRINTK(("shrink_stripe_cache called, %d/%d, clock %d\n", nr, raid_conf->nr_hashed_stripes, raid_conf->clock));
        for (i = 0; i < NR_HASH; i++) {
 repeat:
-               sh = conf->stripe_hashtbl[(i + conf->clock) & HASH_MASK];
+               sh = raid_conf->stripe_hashtbl[(i + raid_conf->clock) & HASH_MASK];
                for (; sh; sh = sh->hash_next) {
                        if (sh->phase != PHASE_COMPLETE)
                                continue;
@@ -364,30 +360,30 @@ repeat:
                                continue;
                        kfree_stripe(sh);
                        if (++count == nr) {
-                               PRINTK(("shrink completed, nr_hashed_stripes %d\n", conf->nr_hashed_stripes));
-                               conf->clock = (i + conf->clock) & HASH_MASK;
+                               PRINTK(("shrink completed, nr_hashed_stripes %d\n", raid_conf->nr_hashed_stripes));
+                               raid_conf->clock = (i + raid_conf->clock) & HASH_MASK;
                                return nr;
                        }
                        goto repeat;
                }
        }
-       PRINTK(("shrink completed, nr_hashed_stripes %d\n", conf->nr_hashed_stripes));
+       PRINTK(("shrink completed, nr_hashed_stripes %d\n", raid_conf->nr_hashed_stripes));
        return count;
 }
 
-static struct stripe_head *find_stripe(raid5_conf_t *conf, unsigned long sector, int size)
+static struct stripe_head *find_stripe(struct raid5_data *raid_conf, unsigned long sector, int size)
 {
        struct stripe_head *sh;
 
-       if (conf->buffer_size != size) {
-               PRINTK(("switching size, %d --> %d\n", conf->buffer_size, size));
-               shrink_stripe_cache(conf, conf->max_nr_stripes);
-               conf->buffer_size = size;
+       if (raid_conf->buffer_size != size) {
+               PRINTK(("switching size, %d --> %d\n", raid_conf->buffer_size, size));
+               shrink_stripe_cache(raid_conf, raid_conf->max_nr_stripes);
+               raid_conf->buffer_size = size;
        }
 
        PRINTK(("find_stripe, sector %lu\n", sector));
-       for (sh = stripe_hash(conf, sector, size); sh; sh = sh->hash_next)
-               if (sh->sector == sector && sh->raid_conf == conf) {
+       for (sh = stripe_hash(raid_conf, sector, size); sh; sh = sh->hash_next)
+               if (sh->sector == sector && sh->raid_conf == raid_conf) {
                        if (sh->size == size) {
                                PRINTK(("found stripe %lu\n", sector));
                                return sh;
@@ -401,7 +397,7 @@ static struct stripe_head *find_stripe(raid5_conf_t *conf, unsigned long sector,
        return NULL;
 }
 
-static int grow_stripes(raid5_conf_t *conf, int num, int priority)
+static int grow_stripes(struct raid5_data *raid_conf, int num, int priority)
 {
        struct stripe_head *sh;
 
@@ -409,64 +405,62 @@ static int grow_stripes(raid5_conf_t *conf, int num, int priority)
                if ((sh = kmalloc(sizeof(struct stripe_head), priority)) == NULL)
                        return 1;
                memset(sh, 0, sizeof(*sh));
-               sh->stripe_lock = MD_SPIN_LOCK_UNLOCKED;
-
-               if (grow_buffers(sh, 2 * conf->raid_disks, PAGE_SIZE, priority)) {
-                       shrink_buffers(sh, 2 * conf->raid_disks);
+               if (grow_buffers(sh, 2 * raid_conf->raid_disks, PAGE_SIZE, priority)) {
+                       shrink_buffers(sh, 2 * raid_conf->raid_disks);
                        kfree(sh);
                        return 1;
                }
-               if (grow_bh(sh, conf->raid_disks, priority)) {
-                       shrink_buffers(sh, 2 * conf->raid_disks);
-                       shrink_bh(sh, conf->raid_disks);
+               if (grow_bh(sh, raid_conf->raid_disks, priority)) {
+                       shrink_buffers(sh, 2 * raid_conf->raid_disks);
+                       shrink_bh(sh, raid_conf->raid_disks);
                        kfree(sh);
                        return 1;
                }
-               put_free_stripe(conf, sh);
-               conf->nr_stripes++;
+               put_free_stripe(raid_conf, sh);
+               raid_conf->nr_stripes++;
        }
        return 0;
 }
 
-static void shrink_stripes(raid5_conf_t *conf, int num)
+static void shrink_stripes(struct raid5_data *raid_conf, int num)
 {
        struct stripe_head *sh;
 
        while (num--) {
-               sh = get_free_stripe(conf);
+               sh = get_free_stripe(raid_conf);
                if (!sh)
                        break;
-               shrink_buffers(sh, conf->raid_disks * 2);
-               shrink_bh(sh, conf->raid_disks);
+               shrink_buffers(sh, raid_conf->raid_disks * 2);
+               shrink_bh(sh, raid_conf->raid_disks);
                kfree(sh);
-               conf->nr_stripes--;
+               raid_conf->nr_stripes--;
        }
 }
 
-static struct stripe_head *kmalloc_stripe(raid5_conf_t *conf, unsigned long sector, int size)
+static struct stripe_head *kmalloc_stripe(struct raid5_data *raid_conf, unsigned long sector, int size)
 {
        struct stripe_head *sh = NULL, *tmp;
        struct buffer_head *buffer_pool, *bh_pool;
 
        PRINTK(("kmalloc_stripe called\n"));
 
-       while ((sh = get_free_stripe(conf)) == NULL) {
-               shrink_stripe_cache(conf, conf->max_nr_stripes / 8);
-               if ((sh = get_free_stripe(conf)) != NULL)
+       while ((sh = get_free_stripe(raid_conf)) == NULL) {
+               shrink_stripe_cache(raid_conf, raid_conf->max_nr_stripes / 8);
+               if ((sh = get_free_stripe(raid_conf)) != NULL)
                        break;
-               if (!conf->nr_pending_stripes)
+               if (!raid_conf->nr_pending_stripes)
                        printk("raid5: bug: nr_free_sh == 0, nr_pending_stripes == 0\n");
-               md_wakeup_thread(conf->thread);
+               md_wakeup_thread(raid_conf->thread);
                PRINTK(("waiting for some stripes to complete\n"));
-               sleep_on(&conf->wait_for_stripe);
+               sleep_on(&raid_conf->wait_for_stripe);
        }
 
        /*
         * The above might have slept, so perhaps another process
         * already created the stripe for us..
         */
-       if ((tmp = find_stripe(conf, sector, size)) != NULL) { 
-               put_free_stripe(conf, sh);
+       if ((tmp = find_stripe(raid_conf, sector, size)) != NULL) { 
+               put_free_stripe(raid_conf, sh);
                wait_on_stripe(tmp);
                return tmp;
        }
@@ -478,25 +472,25 @@ static struct stripe_head *kmalloc_stripe(raid5_conf_t *conf, unsigned long sect
                sh->bh_pool = bh_pool;
                sh->phase = PHASE_COMPLETE;
                sh->cmd = STRIPE_NONE;
-               sh->raid_conf = conf;
+               sh->raid_conf = raid_conf;
                sh->sector = sector;
                sh->size = size;
-               conf->nr_cached_stripes++;
-               insert_hash(conf, sh);
+               raid_conf->nr_cached_stripes++;
+               insert_hash(raid_conf, sh);
        } else printk("raid5: bug: kmalloc_stripe() == NULL\n");
        return sh;
 }
 
-static struct stripe_head *get_stripe(raid5_conf_t *conf, unsigned long sector, int size)
+static struct stripe_head *get_stripe(struct raid5_data *raid_conf, unsigned long sector, int size)
 {
        struct stripe_head *sh;
 
        PRINTK(("get_stripe, sector %lu\n", sector));
-       sh = find_stripe(conf, sector, size);
+       sh = find_stripe(raid_conf, sector, size);
        if (sh)
                wait_on_stripe(sh);
        else
-               sh = kmalloc_stripe(conf, sector, size);
+               sh = kmalloc_stripe(raid_conf, sector, size);
        return sh;
 }
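`get_stripe()` above is a classic find-or-allocate cache front end: look the stripe up by sector, and only fall back to allocation on a miss. A single-threaded sketch of that flow with a tiny chained hash table (the kernel version additionally sleeps for a free pool entry and re-checks the hash after waking, since another process may have created the stripe meanwhile; that race handling and the preallocated pool are omitted here, and all names are illustrative):

```c
#include <stdlib.h>

#define NBUCKETS 16u	/* power of two, so & replaces % */

struct stripe {
	struct stripe *next;
	unsigned long sector;
};

static struct stripe *table[NBUCKETS];

static struct stripe *find_stripe(unsigned long sector)
{
	struct stripe *s;

	for (s = table[sector & (NBUCKETS - 1)]; s; s = s->next)
		if (s->sector == sector)
			return s;
	return NULL;
}

static struct stripe *get_stripe(unsigned long sector)
{
	struct stripe *s = find_stripe(sector);

	if (s)
		return s;		/* cache hit */
	s = calloc(1, sizeof(*s));	/* kernel uses its stripe pool */
	if (s) {
		s->sector = sector;
		s->next = table[sector & (NBUCKETS - 1)];
		table[sector & (NBUCKETS - 1)] = s;
	}
	return s;
}
```

The second lookup after a blocking allocation is the part that matters in the kernel: without it, two sleepers could each insert a stripe for the same sector and corrupt the cache.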
 
@@ -529,7 +523,7 @@ static inline void raid5_end_buffer_io (struct stripe_head *sh, int i, int uptod
        bh->b_end_io(bh, uptodate);
        if (!uptodate)
                printk(KERN_ALERT "raid5: %s: unrecoverable I/O error for "
-                      "block %lu\n", partition_name(bh->b_dev), bh->b_blocknr);
+                      "block %lu\n", kdevname(bh->b_dev), bh->b_blocknr);
 }
 
 static inline void raid5_mark_buffer_uptodate (struct buffer_head *bh, int uptodate)
@@ -543,35 +537,36 @@ static inline void raid5_mark_buffer_uptodate (struct buffer_head *bh, int uptod
 static void raid5_end_request (struct buffer_head * bh, int uptodate)
 {
        struct stripe_head *sh = bh->b_dev_id;
-       raid5_conf_t *conf = sh->raid_conf;
-       int disks = conf->raid_disks, i;
+       struct raid5_data *raid_conf = sh->raid_conf;
+       int disks = raid_conf->raid_disks, i;
        unsigned long flags;
 
        PRINTK(("end_request %lu, nr_pending %d\n", sh->sector, sh->nr_pending));
-       md_spin_lock_irqsave(&sh->stripe_lock, flags);
+       save_flags(flags);
+       cli();
        raid5_mark_buffer_uptodate(bh, uptodate);
-       if (atomic_dec_and_test(&sh->nr_pending)) {
-               md_wakeup_thread(conf->thread);
-               atomic_inc(&conf->nr_handle);
+       --sh->nr_pending;
+       if (!sh->nr_pending) {
+               md_wakeup_thread(raid_conf->thread);
+               atomic_inc(&raid_conf->nr_handle);
        }
-       if (!uptodate) {
+       if (!uptodate)
                md_error(bh->b_dev, bh->b_rdev);
-       }
-       if (conf->failed_disks) {
+       if (raid_conf->failed_disks) {
                for (i = 0; i < disks; i++) {
-                       if (conf->disks[i].operational)
+                       if (raid_conf->disks[i].operational)
                                continue;
                        if (bh != sh->bh_old[i] && bh != sh->bh_req[i] && bh != sh->bh_copy[i])
                                continue;
-                       if (bh->b_rdev != conf->disks[i].dev)
+                       if (bh->b_rdev != raid_conf->disks[i].dev)
                                continue;
                        set_bit(STRIPE_ERROR, &sh->state);
                }
        }
-       md_spin_unlock_irqrestore(&sh->stripe_lock, flags);
+       restore_flags(flags);
 }
 
-static int raid5_map (mddev_t *mddev, kdev_t dev, kdev_t *rdev,
+static int raid5_map (struct md_dev *mddev, kdev_t *rdev,
                      unsigned long *rsector, unsigned long size)
 {
        /* No complex mapping used: the core of the work is done in the
@@ -582,10 +577,11 @@ static int raid5_map (mddev_t *mddev, kdev_t dev, kdev_t *rdev,
 
 static void raid5_build_block (struct stripe_head *sh, struct buffer_head *bh, int i)
 {
-       raid5_conf_t *conf = sh->raid_conf;
-       mddev_t *mddev = conf->mddev;
+       struct raid5_data *raid_conf = sh->raid_conf;
+       struct md_dev *mddev = raid_conf->mddev;
+       int minor = (int) (mddev - md_dev);
        char *b_data;
-       kdev_t dev = mddev_to_kdev(mddev);
+       kdev_t dev = MKDEV(MD_MAJOR, minor);
        int block = sh->sector / (sh->size >> 9);
 
        b_data = ((volatile struct buffer_head *) bh)->b_data;
@@ -593,7 +589,7 @@ static void raid5_build_block (struct stripe_head *sh, struct buffer_head *bh, i
        init_buffer(bh, dev, block, raid5_end_request, sh);
        ((volatile struct buffer_head *) bh)->b_data = b_data;
 
-       bh->b_rdev      = conf->disks[i].dev;
+       bh->b_rdev      = raid_conf->disks[i].dev;
        bh->b_rsector   = sh->sector;
 
        bh->b_state     = (1 << BH_Req);
@@ -601,62 +597,33 @@ static void raid5_build_block (struct stripe_head *sh, struct buffer_head *bh, i
        bh->b_list      = BUF_LOCKED;
 }
 
-static int raid5_error (mddev_t *mddev, kdev_t dev)
+static int raid5_error (struct md_dev *mddev, kdev_t dev)
 {
-       raid5_conf_t *conf = (raid5_conf_t *) mddev->private;
-       mdp_super_t *sb = mddev->sb;
+       struct raid5_data *raid_conf = (struct raid5_data *) mddev->private;
+       md_superblock_t *sb = mddev->sb;
        struct disk_info *disk;
        int i;
 
        PRINTK(("raid5_error called\n"));
-       conf->resync_parity = 0;
-       for (i = 0, disk = conf->disks; i < conf->raid_disks; i++, disk++) {
+       raid_conf->resync_parity = 0;
+       for (i = 0, disk = raid_conf->disks; i < raid_conf->raid_disks; i++, disk++)
                if (disk->dev == dev && disk->operational) {
                        disk->operational = 0;
-                       mark_disk_faulty(sb->disks+disk->number);
-                       mark_disk_nonsync(sb->disks+disk->number);
-                       mark_disk_inactive(sb->disks+disk->number);
+                       sb->disks[disk->number].state |= (1 << MD_FAULTY_DEVICE);
+                       sb->disks[disk->number].state &= ~(1 << MD_SYNC_DEVICE);
+                       sb->disks[disk->number].state &= ~(1 << MD_ACTIVE_DEVICE);
                        sb->active_disks--;
                        sb->working_disks--;
                        sb->failed_disks++;
                        mddev->sb_dirty = 1;
-                       conf->working_disks--;
-                       conf->failed_disks++;
-                       md_wakeup_thread(conf->thread);
+                       raid_conf->working_disks--;
+                       raid_conf->failed_disks++;
+                       md_wakeup_thread(raid_conf->thread);
                        printk (KERN_ALERT
-                               "raid5: Disk failure on %s, disabling device."
-                               " Operation continuing on %d devices\n",
-                               partition_name (dev), conf->working_disks);
-                       return -EIO;
-               }
-       }
-       /*
-        * handle errors in spares (during reconstruction)
-        */
-       if (conf->spare) {
-               disk = conf->spare;
-               if (disk->dev == dev) {
-                       printk (KERN_ALERT
-                               "raid5: Disk failure on spare %s\n",
-                               partition_name (dev));
-                       if (!conf->spare->operational) {
-                               MD_BUG();
-                               return -EIO;
-                       }
-                       disk->operational = 0;
-                       disk->write_only = 0;
-                       conf->spare = NULL;
-                       mark_disk_faulty(sb->disks+disk->number);
-                       mark_disk_nonsync(sb->disks+disk->number);
-                       mark_disk_inactive(sb->disks+disk->number);
-                       sb->spare_disks--;
-                       sb->working_disks--;
-                       sb->failed_disks++;
-
-                       return -EIO;
+                               "RAID5: Disk failure on %s, disabling device."
+                               "Operation continuing on %d devices\n",
+                               kdevname (dev), raid_conf->working_disks);
                }
-       }
-       MD_BUG();
        return 0;
 }      
 
@@ -667,12 +634,12 @@ static int raid5_error (mddev_t *mddev, kdev_t dev)
 static inline unsigned long 
 raid5_compute_sector (int r_sector, unsigned int raid_disks, unsigned int data_disks,
                        unsigned int * dd_idx, unsigned int * pd_idx, 
-                       raid5_conf_t *conf)
+                       struct raid5_data *raid_conf)
 {
        unsigned int  stripe;
        int chunk_number, chunk_offset;
        unsigned long new_sector;
-       int sectors_per_chunk = conf->chunk_size >> 9;
+       int sectors_per_chunk = raid_conf->chunk_size >> 9;
 
        /* First compute the information on this sector */
 
@@ -695,9 +662,9 @@ raid5_compute_sector (int r_sector, unsigned int raid_disks, unsigned int data_d
        /*
         * Select the parity disk based on the user selected algorithm.
         */
-       if (conf->level == 4)
+       if (raid_conf->level == 4)
                *pd_idx = data_disks;
-       else switch (conf->algorithm) {
+       else switch (raid_conf->algorithm) {
                case ALGORITHM_LEFT_ASYMMETRIC:
                        *pd_idx = data_disks - stripe % raid_disks;
                        if (*dd_idx >= *pd_idx)
@@ -717,7 +684,7 @@ raid5_compute_sector (int r_sector, unsigned int raid_disks, unsigned int data_d
                        *dd_idx = (*pd_idx + 1 + *dd_idx) % raid_disks;
                        break;
                default:
-                       printk ("raid5: unsupported algorithm %d\n", conf->algorithm);
+                       printk ("raid5: unsupported algorithm %d\n", raid_conf->algorithm);
        }
 
        /*
@@ -738,16 +705,16 @@ raid5_compute_sector (int r_sector, unsigned int raid_disks, unsigned int data_d
 
 static unsigned long compute_blocknr(struct stripe_head *sh, int i)
 {
-       raid5_conf_t *conf = sh->raid_conf;
-       int raid_disks = conf->raid_disks, data_disks = raid_disks - 1;
+       struct raid5_data *raid_conf = sh->raid_conf;
+       int raid_disks = raid_conf->raid_disks, data_disks = raid_disks - 1;
        unsigned long new_sector = sh->sector, check;
-       int sectors_per_chunk = conf->chunk_size >> 9;
+       int sectors_per_chunk = raid_conf->chunk_size >> 9;
        unsigned long stripe = new_sector / sectors_per_chunk;
        int chunk_offset = new_sector % sectors_per_chunk;
        int chunk_number, dummy1, dummy2, dd_idx = i;
        unsigned long r_sector, blocknr;
 
-       switch (conf->algorithm) {
+       switch (raid_conf->algorithm) {
                case ALGORITHM_LEFT_ASYMMETRIC:
                case ALGORITHM_RIGHT_ASYMMETRIC:
                        if (i > sh->pd_idx)
@@ -760,14 +727,14 @@ static unsigned long compute_blocknr(struct stripe_head *sh, int i)
                        i -= (sh->pd_idx + 1);
                        break;
                default:
-                       printk ("raid5: unsupported algorithm %d\n", conf->algorithm);
+                       printk ("raid5: unsupported algorithm %d\n", raid_conf->algorithm);
        }
 
        chunk_number = stripe * data_disks + i;
        r_sector = chunk_number * sectors_per_chunk + chunk_offset;
        blocknr = r_sector / (sh->size >> 9);
 
-       check = raid5_compute_sector (r_sector, raid_disks, data_disks, &dummy1, &dummy2, conf);
+       check = raid5_compute_sector (r_sector, raid_disks, data_disks, &dummy1, &dummy2, raid_conf);
        if (check != sh->sector || dummy1 != dd_idx || dummy2 != sh->pd_idx) {
                printk("compute_blocknr: map not correct\n");
                return 0;
@@ -775,11 +742,36 @@ static unsigned long compute_blocknr(struct stripe_head *sh, int i)
        return blocknr;
 }
 
+#ifdef HAVE_ARCH_XORBLOCK
+static void xor_block(struct buffer_head *dest, struct buffer_head *source)
+{
+       __xor_block((char *) dest->b_data, (char *) source->b_data, dest->b_size);
+}
+#else
+static void xor_block(struct buffer_head *dest, struct buffer_head *source)
+{
+       long lines = dest->b_size / (sizeof (long)) / 8, i;
+       long *destp = (long *) dest->b_data, *sourcep = (long *) source->b_data;
+
+       for (i = lines; i > 0; i--) {
+               *(destp + 0) ^= *(sourcep + 0);
+               *(destp + 1) ^= *(sourcep + 1);
+               *(destp + 2) ^= *(sourcep + 2);
+               *(destp + 3) ^= *(sourcep + 3);
+               *(destp + 4) ^= *(sourcep + 4);
+               *(destp + 5) ^= *(sourcep + 5);
+               *(destp + 6) ^= *(sourcep + 6);
+               *(destp + 7) ^= *(sourcep + 7);
+               destp += 8;
+               sourcep += 8;
+       }
+}
+#endif
+
 static void compute_block(struct stripe_head *sh, int dd_idx)
 {
-       raid5_conf_t *conf = sh->raid_conf;
-       int i, count, disks = conf->raid_disks;
-       struct buffer_head *bh_ptr[MAX_XOR_BLOCKS];
+       struct raid5_data *raid_conf = sh->raid_conf;
+       int i, disks = raid_conf->raid_disks;
 
        PRINTK(("compute_block, stripe %lu, idx %d\n", sh->sector, dd_idx));
 
@@ -788,100 +780,69 @@ static void compute_block(struct stripe_head *sh, int dd_idx)
        raid5_build_block(sh, sh->bh_old[dd_idx], dd_idx);
 
        memset(sh->bh_old[dd_idx]->b_data, 0, sh->size);
-       bh_ptr[0] = sh->bh_old[dd_idx];
-       count = 1;
        for (i = 0; i < disks; i++) {
                if (i == dd_idx)
                        continue;
                if (sh->bh_old[i]) {
-                       bh_ptr[count++] = sh->bh_old[i];
-               } else {
+                       xor_block(sh->bh_old[dd_idx], sh->bh_old[i]);
+                       continue;
+               } else
                        printk("compute_block() %d, stripe %lu, %d not present\n", dd_idx, sh->sector, i);
-               }
-               if (count == MAX_XOR_BLOCKS) {
-                       xor_block(count, &bh_ptr[0]);
-                       count = 1;
-               }
-       }
-       if(count != 1) {
-               xor_block(count, &bh_ptr[0]);
        }
        raid5_mark_buffer_uptodate(sh->bh_old[dd_idx], 1);
 }
 
 static void compute_parity(struct stripe_head *sh, int method)
 {
-       raid5_conf_t *conf = sh->raid_conf;
-       int i, pd_idx = sh->pd_idx, disks = conf->raid_disks, lowprio, count;
-       struct buffer_head *bh_ptr[MAX_XOR_BLOCKS];
+       struct raid5_data *raid_conf = sh->raid_conf;
+       int i, pd_idx = sh->pd_idx, disks = raid_conf->raid_disks;
 
        PRINTK(("compute_parity, stripe %lu, method %d\n", sh->sector, method));
-       lowprio = 1;
        for (i = 0; i < disks; i++) {
                if (i == pd_idx || !sh->bh_new[i])
                        continue;
                if (!sh->bh_copy[i])
                        sh->bh_copy[i] = raid5_kmalloc_buffer(sh, sh->size);
                raid5_build_block(sh, sh->bh_copy[i], i);
-               if (!buffer_lowprio(sh->bh_new[i]))
-                       lowprio = 0;
-               else
-                       mark_buffer_lowprio(sh->bh_copy[i]);
                mark_buffer_clean(sh->bh_new[i]);
                memcpy(sh->bh_copy[i]->b_data, sh->bh_new[i]->b_data, sh->size);
        }
        if (sh->bh_copy[pd_idx] == NULL)
                sh->bh_copy[pd_idx] = raid5_kmalloc_buffer(sh, sh->size);
        raid5_build_block(sh, sh->bh_copy[pd_idx], sh->pd_idx);
-       if (lowprio)
-               mark_buffer_lowprio(sh->bh_copy[pd_idx]);
 
        if (method == RECONSTRUCT_WRITE) {
                memset(sh->bh_copy[pd_idx]->b_data, 0, sh->size);
-               bh_ptr[0] = sh->bh_copy[pd_idx];
-               count = 1;
                for (i = 0; i < disks; i++) {
                        if (i == sh->pd_idx)
                                continue;
                        if (sh->bh_new[i]) {
-                               bh_ptr[count++] = sh->bh_copy[i];
-                       } else if (sh->bh_old[i]) {
-                               bh_ptr[count++] = sh->bh_old[i];
+                               xor_block(sh->bh_copy[pd_idx], sh->bh_copy[i]);
+                               continue;
                        }
-                       if (count == MAX_XOR_BLOCKS) {
-                               xor_block(count, &bh_ptr[0]);
-                               count = 1;
+                       if (sh->bh_old[i]) {
+                               xor_block(sh->bh_copy[pd_idx], sh->bh_old[i]);
+                               continue;
                        }
                }
-               if (count != 1) {
-                       xor_block(count, &bh_ptr[0]);
-               }
        } else if (method == READ_MODIFY_WRITE) {
                memcpy(sh->bh_copy[pd_idx]->b_data, sh->bh_old[pd_idx]->b_data, sh->size);
-               bh_ptr[0] = sh->bh_copy[pd_idx];
-               count = 1;
                for (i = 0; i < disks; i++) {
                        if (i == sh->pd_idx)
                                continue;
                        if (sh->bh_new[i] && sh->bh_old[i]) {
-                               bh_ptr[count++] = sh->bh_copy[i];
-                               bh_ptr[count++] = sh->bh_old[i];
-                       }
-                       if (count >= (MAX_XOR_BLOCKS - 1)) {
-                               xor_block(count, &bh_ptr[0]);
-                               count = 1;
+                               xor_block(sh->bh_copy[pd_idx], sh->bh_copy[i]);
+                               xor_block(sh->bh_copy[pd_idx], sh->bh_old[i]);
+                               continue;
                        }
                }
-               if (count != 1) {
-                       xor_block(count, &bh_ptr[0]);
-               }
        }
        raid5_mark_buffer_uptodate(sh->bh_copy[pd_idx], 1);
 }
 
 static void add_stripe_bh (struct stripe_head *sh, struct buffer_head *bh, int dd_idx, int rw)
 {
-       raid5_conf_t *conf = sh->raid_conf;
+       struct raid5_data *raid_conf = sh->raid_conf;
        struct buffer_head *bh_req;
 
        if (sh->bh_new[dd_idx]) {
@@ -899,22 +860,19 @@ static void add_stripe_bh (struct stripe_head *sh, struct buffer_head *bh, int d
        if (sh->phase == PHASE_COMPLETE && sh->cmd == STRIPE_NONE) {
                sh->phase = PHASE_BEGIN;
                sh->cmd = (rw == READ) ? STRIPE_READ : STRIPE_WRITE;
-               conf->nr_pending_stripes++;
-               atomic_inc(&conf->nr_handle);
+               raid_conf->nr_pending_stripes++;
+               atomic_inc(&raid_conf->nr_handle);
        }
        sh->bh_new[dd_idx] = bh;
        sh->bh_req[dd_idx] = bh_req;
        sh->cmd_new[dd_idx] = rw;
        sh->new[dd_idx] = 1;
-
-       if (buffer_lowprio(bh))
-               mark_buffer_lowprio(bh_req);
 }
 
 static void complete_stripe(struct stripe_head *sh)
 {
-       raid5_conf_t *conf = sh->raid_conf;
-       int disks = conf->raid_disks;
+       struct raid5_data *raid_conf = sh->raid_conf;
+       int disks = raid_conf->raid_disks;
        int i, new = 0;
        
        PRINTK(("complete_stripe %lu\n", sh->sector));
@@ -951,22 +909,6 @@ static void complete_stripe(struct stripe_head *sh)
        }
 }
 
-
-static int is_stripe_lowprio(struct stripe_head *sh, int disks)
-{
-       int i, lowprio = 1;
-
-       for (i = 0; i < disks; i++) {
-               if (sh->bh_new[i])
-                       if (!buffer_lowprio(sh->bh_new[i]))
-                               lowprio = 0;
-               if (sh->bh_old[i])
-                       if (!buffer_lowprio(sh->bh_old[i]))
-                               lowprio = 0;
-       }
-       return lowprio;
-}
-
 /*
  * handle_stripe() is our main logic routine. Note that:
  *
@@ -977,27 +919,28 @@ static int is_stripe_lowprio(struct stripe_head *sh, int disks)
  * 2.  We should be careful to set sh->nr_pending whenever we sleep,
  *     to prevent re-entry of handle_stripe() for the same sh.
  *
- * 3.  conf->failed_disks and disk->operational can be changed
+ * 3.  raid_conf->failed_disks and disk->operational can be changed
  *     from an interrupt. This complicates things a bit, but it allows
  *     us to stop issuing requests for a failed drive as soon as possible.
  */
 static void handle_stripe(struct stripe_head *sh)
 {
-       raid5_conf_t *conf = sh->raid_conf;
-       mddev_t *mddev = conf->mddev;
+       struct raid5_data *raid_conf = sh->raid_conf;
+       struct md_dev *mddev = raid_conf->mddev;
+       int minor = (int) (mddev - md_dev);
        struct buffer_head *bh;
-       int disks = conf->raid_disks;
-       int i, nr = 0, nr_read = 0, nr_write = 0, lowprio;
+       int disks = raid_conf->raid_disks;
+       int i, nr = 0, nr_read = 0, nr_write = 0;
        int nr_cache = 0, nr_cache_other = 0, nr_cache_overwrite = 0, parity = 0;
        int nr_failed_other = 0, nr_failed_overwrite = 0, parity_failed = 0;
        int reading = 0, nr_writing = 0;
        int method1 = INT_MAX, method2 = INT_MAX;
        int block;
        unsigned long flags;
-       int operational[MD_SB_DISKS], failed_disks = conf->failed_disks;
+       int operational[MD_SB_DISKS], failed_disks = raid_conf->failed_disks;
 
        PRINTK(("handle_stripe(), stripe %lu\n", sh->sector));
-       if (md_atomic_read(&sh->nr_pending)) {
+       if (sh->nr_pending) {
                printk("handle_stripe(), stripe %lu, io still pending\n", sh->sector);
                return;
        }
@@ -1006,9 +949,9 @@ static void handle_stripe(struct stripe_head *sh)
                return;
        }
 
-       atomic_dec(&conf->nr_handle);
+       atomic_dec(&raid_conf->nr_handle);
 
-       if (md_test_and_clear_bit(STRIPE_ERROR, &sh->state)) {
+       if (test_and_clear_bit(STRIPE_ERROR, &sh->state)) {
                printk("raid5: restarting stripe %lu\n", sh->sector);
                sh->phase = PHASE_BEGIN;
        }
@@ -1026,11 +969,11 @@ static void handle_stripe(struct stripe_head *sh)
        save_flags(flags);
        cli();
        for (i = 0; i < disks; i++) {
-               operational[i] = conf->disks[i].operational;
-               if (i == sh->pd_idx && conf->resync_parity)
+               operational[i] = raid_conf->disks[i].operational;
+               if (i == sh->pd_idx && raid_conf->resync_parity)
                        operational[i] = 0;
        }
-       failed_disks = conf->failed_disks;
+       failed_disks = raid_conf->failed_disks;
        restore_flags(flags);
 
        if (failed_disks > 1) {
@@ -1074,7 +1017,7 @@ static void handle_stripe(struct stripe_head *sh)
        }
 
        if (nr_write && nr_read)
-               printk("raid5: bug, nr_write ==`%d, nr_read == %d, sh->cmd == %d\n", nr_write, nr_read, sh->cmd);
+               printk("raid5: bug, nr_write == %d, nr_read == %d, sh->cmd == %d\n", nr_write, nr_read, sh->cmd);
 
        if (nr_write) {
                /*
@@ -1087,7 +1030,7 @@ static void handle_stripe(struct stripe_head *sh)
                                if (sh->bh_new[i])
                                        continue;
                                block = (int) compute_blocknr(sh, i);
-                               bh = find_buffer(mddev_to_kdev(mddev), block, sh->size);
+                               bh = find_buffer(MKDEV(MD_MAJOR, minor), block, sh->size);
                                if (bh && bh->b_count == 0 && buffer_dirty(bh) && !buffer_locked(bh)) {
                                        PRINTK(("Whee.. sector %lu, index %d (%d) found in the buffer cache!\n", sh->sector, i, block));
                                        add_stripe_bh(sh, bh, i, WRITE);
@@ -1121,22 +1064,21 @@ static void handle_stripe(struct stripe_head *sh)
 
                if (!method1 || !method2) {
                        lock_stripe(sh);
-                       lowprio = is_stripe_lowprio(sh, disks);
-                       atomic_inc(&sh->nr_pending);
+                       sh->nr_pending++;
                        sh->phase = PHASE_WRITE;
                        compute_parity(sh, method1 <= method2 ? RECONSTRUCT_WRITE : READ_MODIFY_WRITE);
                        for (i = 0; i < disks; i++) {
-                               if (!operational[i] && !conf->spare && !conf->resync_parity)
+                               if (!operational[i] && !raid_conf->spare && !raid_conf->resync_parity)
                                        continue;
                                if (i == sh->pd_idx || sh->bh_new[i])
                                        nr_writing++;
                        }
 
-                       md_atomic_set(&sh->nr_pending, nr_writing);
-                       PRINTK(("handle_stripe() %lu, writing back %d\n", sh->sector, md_atomic_read(&sh->nr_pending)));
+                       sh->nr_pending = nr_writing;
+                       PRINTK(("handle_stripe() %lu, writing back %d\n", sh->sector, sh->nr_pending));
 
                        for (i = 0; i < disks; i++) {
-                               if (!operational[i] && !conf->spare && !conf->resync_parity)
+                               if (!operational[i] && !raid_conf->spare && !raid_conf->resync_parity)
                                        continue;
                                bh = sh->bh_copy[i];
                                if (i != sh->pd_idx && ((bh == NULL) ^ (sh->bh_new[i] == NULL)))
@@ -1147,30 +1089,18 @@ static void handle_stripe(struct stripe_head *sh)
                                        bh->b_state |= (1<<BH_Dirty);
                                        PRINTK(("making request for buffer %d\n", i));
                                        clear_bit(BH_Lock, &bh->b_state);
-                                       if (!operational[i] && !conf->resync_parity) {
-                                               bh->b_rdev = conf->spare->dev;
-                                               make_request(MAJOR(conf->spare->dev), WRITE, bh);
-                                       } else {
-#if 0
-                                               make_request(MAJOR(conf->disks[i].dev), WRITE, bh);
-#else
-                                               if (!lowprio || (i==sh->pd_idx))
-                                                       make_request(MAJOR(conf->disks[i].dev), WRITE, bh);
-                                               else {
-                                                       mark_buffer_clean(bh);
-                                                       raid5_end_request(bh,1);
-                                                       sh->new[i] = 0;
-                                               }
-#endif
-                                       }
+                                       if (!operational[i] && !raid_conf->resync_parity) {
+                                               bh->b_rdev = raid_conf->spare->dev;
+                                               make_request(MAJOR(raid_conf->spare->dev), WRITE, bh);
+                                       } else
+                                               make_request(MAJOR(raid_conf->disks[i].dev), WRITE, bh);
                                }
                        }
                        return;
                }
 
                lock_stripe(sh);
-               lowprio = is_stripe_lowprio(sh, disks);
-               atomic_inc(&sh->nr_pending);
+               sh->nr_pending++;
                if (method1 < method2) {
                        sh->write_method = RECONSTRUCT_WRITE;
                        for (i = 0; i < disks; i++) {
@@ -1180,8 +1110,6 @@ static void handle_stripe(struct stripe_head *sh)
                                        continue;
                                sh->bh_old[i] = raid5_kmalloc_buffer(sh, sh->size);
                                raid5_build_block(sh, sh->bh_old[i], i);
-                               if (lowprio)
-                                       mark_buffer_lowprio(sh->bh_old[i]);
                                reading++;
                        }
                } else {
@@ -1193,21 +1121,19 @@ static void handle_stripe(struct stripe_head *sh)
                                        continue;
                                sh->bh_old[i] = raid5_kmalloc_buffer(sh, sh->size);
                                raid5_build_block(sh, sh->bh_old[i], i);
-                               if (lowprio)
-                                       mark_buffer_lowprio(sh->bh_old[i]);
                                reading++;
                        }
                }
                sh->phase = PHASE_READ_OLD;
-               md_atomic_set(&sh->nr_pending, reading);
-               PRINTK(("handle_stripe() %lu, reading %d old buffers\n", sh->sector, md_atomic_read(&sh->nr_pending)));
+               sh->nr_pending = reading;
+               PRINTK(("handle_stripe() %lu, reading %d old buffers\n", sh->sector, sh->nr_pending));
                for (i = 0; i < disks; i++) {
                        if (!sh->bh_old[i])
                                continue;
                        if (buffer_uptodate(sh->bh_old[i]))
                                continue;
                        clear_bit(BH_Lock, &sh->bh_old[i]->b_state);
-                       make_request(MAJOR(conf->disks[i].dev), READ, sh->bh_old[i]);
+                       make_request(MAJOR(raid_conf->disks[i].dev), READ, sh->bh_old[i]);
                }
        } else {
                /*
@@ -1215,8 +1141,7 @@ static void handle_stripe(struct stripe_head *sh)
                 */
                method1 = nr_read - nr_cache_overwrite;
                lock_stripe(sh);
-               lowprio = is_stripe_lowprio(sh,disks);
-               atomic_inc(&sh->nr_pending);
+               sh->nr_pending++;
 
                PRINTK(("handle_stripe(), sector %lu, nr_read %d, nr_cache %d, method1 %d\n", sh->sector, nr_read, nr_cache, method1));
                if (!method1 || (method1 == 1 && nr_cache == disks - 1)) {
@@ -1224,22 +1149,18 @@ static void handle_stripe(struct stripe_head *sh)
                        for (i = 0; i < disks; i++) {
                                if (!sh->bh_new[i])
                                        continue;
-                               if (!sh->bh_old[i]) {
+                               if (!sh->bh_old[i])
                                        compute_block(sh, i);
-                                       if (lowprio)
-                                               mark_buffer_lowprio
-                                                       (sh->bh_old[i]);
-                               }
                                memcpy(sh->bh_new[i]->b_data, sh->bh_old[i]->b_data, sh->size);
                        }
-                       atomic_dec(&sh->nr_pending);
+                       sh->nr_pending--;
                        complete_stripe(sh);
                        return;
                }
                if (nr_failed_overwrite) {
                        sh->phase = PHASE_READ_OLD;
-                       md_atomic_set(&sh->nr_pending, (disks - 1) - nr_cache);
-                       PRINTK(("handle_stripe() %lu, phase READ_OLD, pending %d\n", sh->sector, md_atomic_read(&sh->nr_pending)));
+                       sh->nr_pending = (disks - 1) - nr_cache;
+                       PRINTK(("handle_stripe() %lu, phase READ_OLD, pending %d\n", sh->sector, sh->nr_pending));
                        for (i = 0; i < disks; i++) {
                                if (sh->bh_old[i])
                                        continue;
@@ -1247,16 +1168,13 @@ static void handle_stripe(struct stripe_head *sh)
                                        continue;
                                sh->bh_old[i] = raid5_kmalloc_buffer(sh, sh->size);
                                raid5_build_block(sh, sh->bh_old[i], i);
-                               if (lowprio)
-                                       mark_buffer_lowprio(sh->bh_old[i]);
                                clear_bit(BH_Lock, &sh->bh_old[i]->b_state);
-                               make_request(MAJOR(conf->disks[i].dev), READ, sh->bh_old[i]);
+                               make_request(MAJOR(raid_conf->disks[i].dev), READ, sh->bh_old[i]);
                        }
                } else {
                        sh->phase = PHASE_READ;
-                       md_atomic_set(&sh->nr_pending,
-                               nr_read - nr_cache_overwrite);
-                       PRINTK(("handle_stripe() %lu, phase READ, pending %d\n", sh->sector, md_atomic_read(&sh->nr_pending)));
+                       sh->nr_pending = nr_read - nr_cache_overwrite;
+                       PRINTK(("handle_stripe() %lu, phase READ, pending %d\n", sh->sector, sh->nr_pending));
                        for (i = 0; i < disks; i++) {
                                if (!sh->bh_new[i])
                                        continue;
@@ -1264,16 +1182,16 @@ static void handle_stripe(struct stripe_head *sh)
                                        memcpy(sh->bh_new[i]->b_data, sh->bh_old[i]->b_data, sh->size);
                                        continue;
                                }
-                               make_request(MAJOR(conf->disks[i].dev), READ, sh->bh_req[i]);
+                               make_request(MAJOR(raid_conf->disks[i].dev), READ, sh->bh_req[i]);
                        }
                }
        }
 }
 
-static int raid5_make_request (mddev_t *mddev, int rw, struct buffer_head * bh)
+static int raid5_make_request (struct md_dev *mddev, int rw, struct buffer_head * bh)
 {
-       raid5_conf_t *conf = (raid5_conf_t *) mddev->private;
-       const unsigned int raid_disks = conf->raid_disks;
+       struct raid5_data *raid_conf = (struct raid5_data *) mddev->private;
+       const unsigned int raid_disks = raid_conf->raid_disks;
        const unsigned int data_disks = raid_disks - 1;
        unsigned int  dd_idx, pd_idx;
        unsigned long new_sector;
@@ -1284,15 +1202,15 @@ static int raid5_make_request (mddev_t *mddev, int rw, struct buffer_head * bh)
        if (rw == WRITEA) rw = WRITE;
 
        new_sector = raid5_compute_sector(bh->b_rsector, raid_disks, data_disks,
-                                               &dd_idx, &pd_idx, conf);
+                                               &dd_idx, &pd_idx, raid_conf);
 
        PRINTK(("raid5_make_request, sector %lu\n", new_sector));
 repeat:
-       sh = get_stripe(conf, new_sector, bh->b_size);
+       sh = get_stripe(raid_conf, new_sector, bh->b_size);
        if ((rw == READ && sh->cmd == STRIPE_WRITE) || (rw == WRITE && sh->cmd == STRIPE_READ)) {
                PRINTK(("raid5: lock contention, rw == %d, sh->cmd == %d\n", rw, sh->cmd));
                lock_stripe(sh);
-               if (!md_atomic_read(&sh->nr_pending))
+               if (!sh->nr_pending)
                        handle_stripe(sh);
                goto repeat;
        }
@@ -1303,24 +1221,24 @@ repeat:
                printk("raid5: bug: stripe->bh_new[%d], sector %lu exists\n", dd_idx, sh->sector);
                printk("raid5: bh %p, bh_new %p\n", bh, sh->bh_new[dd_idx]);
                lock_stripe(sh);
-               md_wakeup_thread(conf->thread);
+               md_wakeup_thread(raid_conf->thread);
                wait_on_stripe(sh);
                goto repeat;
        }
        add_stripe_bh(sh, bh, dd_idx, rw);
 
-       md_wakeup_thread(conf->thread);
+       md_wakeup_thread(raid_conf->thread);
        return 0;
 }
 
 static void unplug_devices(struct stripe_head *sh)
 {
 #if 0
-       raid5_conf_t *conf = sh->raid_conf;
+       struct raid5_data *raid_conf = sh->raid_conf;
        int i;
 
-       for (i = 0; i < conf->raid_disks; i++)
-               unplug_device(blk_dev + MAJOR(conf->disks[i].dev));
+       for (i = 0; i < raid_conf->raid_disks; i++)
+               unplug_device(blk_dev + MAJOR(raid_conf->disks[i].dev));
 #endif
 }
 
@@ -1334,8 +1252,8 @@ static void unplug_devices(struct stripe_head *sh)
 static void raid5d (void *data)
 {
        struct stripe_head *sh;
-       raid5_conf_t *conf = data;
-       mddev_t *mddev = conf->mddev;
+       struct raid5_data *raid_conf = data;
+       struct md_dev *mddev = raid_conf->mddev;
        int i, handled = 0, unplug = 0;
        unsigned long flags;
 
@@ -1343,47 +1261,47 @@ static void raid5d (void *data)
 
        if (mddev->sb_dirty) {
                mddev->sb_dirty = 0;
-               md_update_sb(mddev);
+               md_update_sb((int) (mddev - md_dev));
        }
        for (i = 0; i < NR_HASH; i++) {
 repeat:
-               sh = conf->stripe_hashtbl[i];
+               sh = raid_conf->stripe_hashtbl[i];
                for (; sh; sh = sh->hash_next) {
-                       if (sh->raid_conf != conf)
+                       if (sh->raid_conf != raid_conf)
                                continue;
                        if (sh->phase == PHASE_COMPLETE)
                                continue;
-                       if (md_atomic_read(&sh->nr_pending))
+                       if (sh->nr_pending)
                                continue;
-                       if (sh->sector == conf->next_sector) {
-                               conf->sector_count += (sh->size >> 9);
-                               if (conf->sector_count >= 128)
+                       if (sh->sector == raid_conf->next_sector) {
+                               raid_conf->sector_count += (sh->size >> 9);
+                               if (raid_conf->sector_count >= 128)
                                        unplug = 1;
                        } else
                                unplug = 1;
                        if (unplug) {
-                               PRINTK(("unplugging devices, sector == %lu, count == %d\n", sh->sector, conf->sector_count));
+                               PRINTK(("unplugging devices, sector == %lu, count == %d\n", sh->sector, raid_conf->sector_count));
                                unplug_devices(sh);
                                unplug = 0;
-                               conf->sector_count = 0;
+                               raid_conf->sector_count = 0;
                        }
-                       conf->next_sector = sh->sector + (sh->size >> 9);
+                       raid_conf->next_sector = sh->sector + (sh->size >> 9);
                        handled++;
                        handle_stripe(sh);
                        goto repeat;
                }
        }
-       if (conf) {
-               PRINTK(("%d stripes handled, nr_handle %d\n", handled, md_atomic_read(&conf->nr_handle)));
+       if (raid_conf) {
+               PRINTK(("%d stripes handled, nr_handle %d\n", handled, atomic_read(&raid_conf->nr_handle)));
                save_flags(flags);
                cli();
-               if (!md_atomic_read(&conf->nr_handle))
-                       clear_bit(THREAD_WAKEUP, &conf->thread->flags);
-               restore_flags(flags);
+               if (!atomic_read(&raid_conf->nr_handle))
+                       clear_bit(THREAD_WAKEUP, &raid_conf->thread->flags);
        }
        PRINTK(("--- raid5d inactive\n"));
 }
 
+#if SUPPORT_RECONSTRUCTION
 /*
  * Private kernel thread for parity reconstruction after an unclean
  * shutdown. Reconstruction on spare drives in case of a failed drive
@@ -1391,64 +1309,44 @@ repeat:
  */
 static void raid5syncd (void *data)
 {
-       raid5_conf_t *conf = data;
-       mddev_t *mddev = conf->mddev;
+       struct raid5_data *raid_conf = data;
+       struct md_dev *mddev = raid_conf->mddev;
 
-       if (!conf->resync_parity)
-               return;
-       if (conf->resync_parity == 2)
+       if (!raid_conf->resync_parity)
                return;
-       down(&mddev->recovery_sem);
-       if (md_do_sync(mddev,NULL)) {
-               up(&mddev->recovery_sem);
-               printk("raid5: resync aborted!\n");
-               return;
-       }
-       conf->resync_parity = 0;
-       up(&mddev->recovery_sem);
-       printk("raid5: resync finished.\n");
+       md_do_sync(mddev);
+       raid_conf->resync_parity = 0;
 }
+#endif /* SUPPORT_RECONSTRUCTION */
 
-static int __check_consistency (mddev_t *mddev, int row)
+static int __check_consistency (struct md_dev *mddev, int row)
 {
-       raid5_conf_t *conf = mddev->private;
+       struct raid5_data *raid_conf = mddev->private;
        kdev_t dev;
        struct buffer_head *bh[MD_SB_DISKS], tmp;
-       int i, rc = 0, nr = 0, count;
-       struct buffer_head *bh_ptr[MAX_XOR_BLOCKS];
+       int i, rc = 0, nr = 0;
 
-       if (conf->working_disks != conf->raid_disks)
+       if (raid_conf->working_disks != raid_conf->raid_disks)
                return 0;
        tmp.b_size = 4096;
        if ((tmp.b_data = (char *) get_free_page(GFP_KERNEL)) == NULL)
                return 0;
-       md_clear_page((unsigned long)tmp.b_data);
        memset(bh, 0, MD_SB_DISKS * sizeof(struct buffer_head *));
-       for (i = 0; i < conf->raid_disks; i++) {
-               dev = conf->disks[i].dev;
+       for (i = 0; i < raid_conf->raid_disks; i++) {
+               dev = raid_conf->disks[i].dev;
                set_blocksize(dev, 4096);
                if ((bh[i] = bread(dev, row / 4, 4096)) == NULL)
                        break;
                nr++;
        }
-       if (nr == conf->raid_disks) {
-               bh_ptr[0] = &tmp;
-               count = 1;
-               for (i = 1; i < nr; i++) {
-                       bh_ptr[count++] = bh[i];
-                       if (count == MAX_XOR_BLOCKS) {
-                               xor_block(count, &bh_ptr[0]);
-                               count = 1;
-                       }
-               }
-               if (count != 1) {
-                       xor_block(count, &bh_ptr[0]);
-               }
+       if (nr == raid_conf->raid_disks) {
+               for (i = 1; i < nr; i++)
+                       xor_block(&tmp, bh[i]);
                if (memcmp(tmp.b_data, bh[0]->b_data, 4096))
                        rc = 1;
        }
-       for (i = 0; i < conf->raid_disks; i++) {
-               dev = conf->disks[i].dev;
+       for (i = 0; i < raid_conf->raid_disks; i++) {
+               dev = raid_conf->disks[i].dev;
                if (bh[i]) {
                        bforget(bh[i]);
                        bh[i] = NULL;
@@ -1460,607 +1358,285 @@ static int __check_consistency (mddev_t *mddev, int row)
        return rc;
 }
 
-static int check_consistency (mddev_t *mddev)
+static int check_consistency (struct md_dev *mddev)
 {
-       if (__check_consistency(mddev, 0))
-/*
- * We are not checking this currently, as it's legitimate to have
- * an inconsistent array, at creation time.
- */
-               return 0;
+       int size = mddev->sb->size;
+       int row;
 
+       for (row = 0; row < size; row += size / 8)
+               if (__check_consistency(mddev, row))
+                       return 1;
        return 0;
 }
 
-static int raid5_run (mddev_t *mddev)
+static int raid5_run (int minor, struct md_dev *mddev)
 {
-       raid5_conf_t *conf;
+       struct raid5_data *raid_conf;
        int i, j, raid_disk, memory;
-       mdp_super_t *sb = mddev->sb;
-       mdp_disk_t *desc;
-       mdk_rdev_t *rdev;
-       struct disk_info *disk;
-       struct md_list_head *tmp;
-       int start_recovery = 0;
+       md_superblock_t *sb = mddev->sb;
+       md_descriptor_t *descriptor;
+       struct real_dev *realdev;
 
        MOD_INC_USE_COUNT;
 
        if (sb->level != 5 && sb->level != 4) {
-               printk("raid5: md%d: raid level not set to 4/5 (%d)\n", mdidx(mddev), sb->level);
+               printk("raid5: %s: raid level not set to 4/5 (%d)\n", kdevname(MKDEV(MD_MAJOR, minor)), sb->level);
                MOD_DEC_USE_COUNT;
                return -EIO;
        }
 
-       mddev->private = kmalloc (sizeof (raid5_conf_t), GFP_KERNEL);
-       if ((conf = mddev->private) == NULL)
+       mddev->private = kmalloc (sizeof (struct raid5_data), GFP_KERNEL);
+       if ((raid_conf = mddev->private) == NULL)
                goto abort;
-       memset (conf, 0, sizeof (*conf));
-       conf->mddev = mddev;
+       memset (raid_conf, 0, sizeof (*raid_conf));
+       raid_conf->mddev = mddev;
 
-       if ((conf->stripe_hashtbl = (struct stripe_head **) md__get_free_pages(GFP_ATOMIC, HASH_PAGES_ORDER)) == NULL)
+       if ((raid_conf->stripe_hashtbl = (struct stripe_head **) __get_free_pages(GFP_ATOMIC, HASH_PAGES_ORDER)) == NULL)
                goto abort;
-       memset(conf->stripe_hashtbl, 0, HASH_PAGES * PAGE_SIZE);
+       memset(raid_conf->stripe_hashtbl, 0, HASH_PAGES * PAGE_SIZE);
 
-       init_waitqueue(&conf->wait_for_stripe);
-       PRINTK(("raid5_run(md%d) called.\n", mdidx(mddev)));
+       init_waitqueue(&raid_conf->wait_for_stripe);
+       PRINTK(("raid5_run(%d) called.\n", minor));
+
+       for (i = 0; i < mddev->nb_dev; i++) {
+               realdev = &mddev->devices[i];
+               if (!realdev->sb) {
+                       printk(KERN_ERR "raid5: disabled device %s (couldn't access raid superblock)\n", kdevname(realdev->dev));
+                       continue;
+               }
 
-       ITERATE_RDEV(mddev,rdev,tmp) {
                /*
                 * This is important -- we are using the descriptor on
                 * the disk only to get a pointer to the descriptor on
                 * the main superblock, which might be more recent.
                 */
-               desc = sb->disks + rdev->desc_nr;
-               raid_disk = desc->raid_disk;
-               disk = conf->disks + raid_disk;
-
-               if (disk_faulty(desc)) {
-                       printk(KERN_ERR "raid5: disabled device %s (errors detected)\n", partition_name(rdev->dev));
-                       if (!rdev->faulty) {
-                               MD_BUG();
-                               goto abort;
-                       }
-                       disk->number = desc->number;
-                       disk->raid_disk = raid_disk;
-                       disk->dev = rdev->dev;
-
-                       disk->operational = 0;
-                       disk->write_only = 0;
-                       disk->spare = 0;
-                       disk->used_slot = 1;
+               descriptor = &sb->disks[realdev->sb->descriptor.number];
+               if (descriptor->state & (1 << MD_FAULTY_DEVICE)) {
+                       printk(KERN_ERR "raid5: disabled device %s (errors detected)\n", kdevname(realdev->dev));
                        continue;
                }
-               if (disk_active(desc)) {
-                       if (!disk_sync(desc)) {
-                               printk(KERN_ERR "raid5: disabled device %s (not in sync)\n", partition_name(rdev->dev));
-                               MD_BUG();
-                               goto abort;
+               if (descriptor->state & (1 << MD_ACTIVE_DEVICE)) {
+                       if (!(descriptor->state & (1 << MD_SYNC_DEVICE))) {
+                               printk(KERN_ERR "raid5: disabled device %s (not in sync)\n", kdevname(realdev->dev));
+                               continue;
                        }
-                       if (raid_disk > sb->raid_disks) {
-                               printk(KERN_ERR "raid5: disabled device %s (inconsistent descriptor)\n", partition_name(rdev->dev));
+                       raid_disk = descriptor->raid_disk;
+                       if (descriptor->number > sb->nr_disks || raid_disk > sb->raid_disks) {
+                               printk(KERN_ERR "raid5: disabled device %s (inconsistent descriptor)\n", kdevname(realdev->dev));
                                continue;
                        }
-                       if (disk->operational) {
-                               printk(KERN_ERR "raid5: disabled device %s (device %d already operational)\n", partition_name(rdev->dev), raid_disk);
+                       if (raid_conf->disks[raid_disk].operational) {
+                               printk(KERN_ERR "raid5: disabled device %s (device %d already operational)\n", kdevname(realdev->dev), raid_disk);
                                continue;
                        }
-                       printk(KERN_INFO "raid5: device %s operational as raid disk %d\n", partition_name(rdev->dev), raid_disk);
+                       printk(KERN_INFO "raid5: device %s operational as raid disk %d\n", kdevname(realdev->dev), raid_disk);
        
-                       disk->number = desc->number;
-                       disk->raid_disk = raid_disk;
-                       disk->dev = rdev->dev;
-                       disk->operational = 1;
-                       disk->used_slot = 1;
+                       raid_conf->disks[raid_disk].number = descriptor->number;
+                       raid_conf->disks[raid_disk].raid_disk = raid_disk;
+                       raid_conf->disks[raid_disk].dev = mddev->devices[i].dev;
+                       raid_conf->disks[raid_disk].operational = 1;
 
-                       conf->working_disks++;
+                       raid_conf->working_disks++;
                } else {
                        /*
                         * Must be a spare disk ..
                         */
-                       printk(KERN_INFO "raid5: spare disk %s\n", partition_name(rdev->dev));
-                       disk->number = desc->number;
-                       disk->raid_disk = raid_disk;
-                       disk->dev = rdev->dev;
-
-                       disk->operational = 0;
-                       disk->write_only = 0;
-                       disk->spare = 1;
-                       disk->used_slot = 1;
-               }
-       }
-
-       for (i = 0; i < MD_SB_DISKS; i++) {
-               desc = sb->disks + i;
-               raid_disk = desc->raid_disk;
-               disk = conf->disks + raid_disk;
-
-               if (disk_faulty(desc) && (raid_disk < sb->raid_disks) &&
-                       !conf->disks[raid_disk].used_slot) {
-
-                       disk->number = desc->number;
-                       disk->raid_disk = raid_disk;
-                       disk->dev = MKDEV(0,0);
-
-                       disk->operational = 0;
-                       disk->write_only = 0;
-                       disk->spare = 0;
-                       disk->used_slot = 1;
+                       printk(KERN_INFO "raid5: spare disk %s\n", kdevname(realdev->dev));
+                       raid_disk = descriptor->raid_disk;
+                       raid_conf->disks[raid_disk].number = descriptor->number;
+                       raid_conf->disks[raid_disk].raid_disk = raid_disk;
+                       raid_conf->disks[raid_disk].dev = mddev->devices [i].dev;
+
+                       raid_conf->disks[raid_disk].operational = 0;
+                       raid_conf->disks[raid_disk].write_only = 0;
+                       raid_conf->disks[raid_disk].spare = 1;
                }
        }
+       raid_conf->raid_disks = sb->raid_disks;
+       raid_conf->failed_disks = raid_conf->raid_disks - raid_conf->working_disks;
+       raid_conf->mddev = mddev;
+       raid_conf->chunk_size = sb->chunk_size;
+       raid_conf->level = sb->level;
+       raid_conf->algorithm = sb->parity_algorithm;
+       raid_conf->max_nr_stripes = NR_STRIPES;
 
-       conf->raid_disks = sb->raid_disks;
-       /*
-        * 0 for a fully functional array, 1 for a degraded array.
-        */
-       conf->failed_disks = conf->raid_disks - conf->working_disks;
-       conf->mddev = mddev;
-       conf->chunk_size = sb->chunk_size;
-       conf->level = sb->level;
-       conf->algorithm = sb->layout;
-       conf->max_nr_stripes = NR_STRIPES;
-
-#if 0
-       for (i = 0; i < conf->raid_disks; i++) {
-               if (!conf->disks[i].used_slot) {
-                       MD_BUG();
-                       goto abort;
-               }
+       if (raid_conf->working_disks != sb->raid_disks && sb->state != (1 << MD_SB_CLEAN)) {
+               printk(KERN_ALERT "raid5: raid set %s not clean and not all disks are operational -- run ckraid\n", kdevname(MKDEV(MD_MAJOR, minor)));
+               goto abort;
        }
-#endif
-       if (!conf->chunk_size || conf->chunk_size % 4) {
-               printk(KERN_ERR "raid5: invalid chunk size %d for md%d\n", conf->chunk_size, mdidx(mddev));
+       if (!raid_conf->chunk_size || raid_conf->chunk_size % 4) {
+               printk(KERN_ERR "raid5: invalid chunk size %d for %s\n", raid_conf->chunk_size, kdevname(MKDEV(MD_MAJOR, minor)));
                goto abort;
        }
-       if (conf->algorithm > ALGORITHM_RIGHT_SYMMETRIC) {
-               printk(KERN_ERR "raid5: unsupported parity algorithm %d for md%d\n", conf->algorithm, mdidx(mddev));
+       if (raid_conf->algorithm > ALGORITHM_RIGHT_SYMMETRIC) {
+               printk(KERN_ERR "raid5: unsupported parity algorithm %d for %s\n", raid_conf->algorithm, kdevname(MKDEV(MD_MAJOR, minor)));
                goto abort;
        }
-       if (conf->failed_disks > 1) {
-               printk(KERN_ERR "raid5: not enough operational devices for md%d (%d/%d failed)\n", mdidx(mddev), conf->failed_disks, conf->raid_disks);
+       if (raid_conf->failed_disks > 1) {
+               printk(KERN_ERR "raid5: not enough operational devices for %s (%d/%d failed)\n", kdevname(MKDEV(MD_MAJOR, minor)), raid_conf->failed_disks, raid_conf->raid_disks);
                goto abort;
        }
 
-       if (conf->working_disks != sb->raid_disks) {
-               printk(KERN_ALERT "raid5: md%d, not all disks are operational -- trying to recover array\n", mdidx(mddev));
-               start_recovery = 1;
+       if ((sb->state & (1 << MD_SB_CLEAN)) && check_consistency(mddev)) {
+               printk(KERN_ERR "raid5: detected raid-5 xor inconsistency -- run ckraid\n");
+               sb->state |= 1 << MD_SB_ERRORS;
+               goto abort;
        }
 
-       if (!start_recovery && (sb->state & (1 << MD_SB_CLEAN)) &&
-                       check_consistency(mddev)) {
-               printk(KERN_ERR "raid5: detected raid-5 superblock xor inconsistency -- running resync\n");
-               sb->state &= ~(1 << MD_SB_CLEAN);
+       if ((raid_conf->thread = md_register_thread(raid5d, raid_conf)) == NULL) {
+               printk(KERN_ERR "raid5: couldn't allocate thread for %s\n", kdevname(MKDEV(MD_MAJOR, minor)));
+               goto abort;
        }
 
-       {
-               const char * name = "raid5d";
-
-               conf->thread = md_register_thread(raid5d, conf, name);
-               if (!conf->thread) {
-                       printk(KERN_ERR "raid5: couldn't allocate thread for md%d\n", mdidx(mddev));
-                       goto abort;
-               }
+#if SUPPORT_RECONSTRUCTION
+       if ((raid_conf->resync_thread = md_register_thread(raid5syncd, raid_conf)) == NULL) {
+               printk(KERN_ERR "raid5: couldn't allocate thread for %s\n", kdevname(MKDEV(MD_MAJOR, minor)));
+               goto abort;
        }
+#endif /* SUPPORT_RECONSTRUCTION */
 
-       memory = conf->max_nr_stripes * (sizeof(struct stripe_head) +
-                conf->raid_disks * (sizeof(struct buffer_head) +
+       memory = raid_conf->max_nr_stripes * (sizeof(struct stripe_head) +
+                raid_conf->raid_disks * (sizeof(struct buffer_head) +
                 2 * (sizeof(struct buffer_head) + PAGE_SIZE))) / 1024;
-       if (grow_stripes(conf, conf->max_nr_stripes, GFP_KERNEL)) {
+       if (grow_stripes(raid_conf, raid_conf->max_nr_stripes, GFP_KERNEL)) {
                printk(KERN_ERR "raid5: couldn't allocate %dkB for buffers\n", memory);
-               shrink_stripes(conf, conf->max_nr_stripes);
+               shrink_stripes(raid_conf, raid_conf->max_nr_stripes);
                goto abort;
        } else
-               printk(KERN_INFO "raid5: allocated %dkB for md%d\n", memory, mdidx(mddev));
+               printk(KERN_INFO "raid5: allocated %dkB for %s\n", memory, kdevname(MKDEV(MD_MAJOR, minor)));
 
        /*
         * Regenerate the "device is in sync with the raid set" bit for
         * each device.
         */
-       for (i = 0; i < MD_SB_DISKS ; i++) {
-               mark_disk_nonsync(sb->disks + i);
+       for (i = 0; i < sb->nr_disks ; i++) {
+               sb->disks[i].state &= ~(1 << MD_SYNC_DEVICE);
                for (j = 0; j < sb->raid_disks; j++) {
-                       if (!conf->disks[j].operational)
+                       if (!raid_conf->disks[j].operational)
                                continue;
-                       if (sb->disks[i].number == conf->disks[j].number)
-                               mark_disk_sync(sb->disks + i);
+                       if (sb->disks[i].number == raid_conf->disks[j].number)
+                               sb->disks[i].state |= 1 << MD_SYNC_DEVICE;
                }
        }
-       sb->active_disks = conf->working_disks;
+       sb->active_disks = raid_conf->working_disks;
 
        if (sb->active_disks == sb->raid_disks)
-               printk("raid5: raid level %d set md%d active with %d out of %d devices, algorithm %d\n", conf->level, mdidx(mddev), sb->active_disks, sb->raid_disks, conf->algorithm);
+               printk("raid5: raid level %d set %s active with %d out of %d devices, algorithm %d\n", raid_conf->level, kdevname(MKDEV(MD_MAJOR, minor)), sb->active_disks, sb->raid_disks, raid_conf->algorithm);
        else
-               printk(KERN_ALERT "raid5: raid level %d set md%d active with %d out of %d devices, algorithm %d\n", conf->level, mdidx(mddev), sb->active_disks, sb->raid_disks, conf->algorithm);
-
-       if (!start_recovery && ((sb->state & (1 << MD_SB_CLEAN))==0)) {
-               const char * name = "raid5syncd";
+               printk(KERN_ALERT "raid5: raid level %d set %s active with %d out of %d devices, algorithm %d\n", raid_conf->level, kdevname(MKDEV(MD_MAJOR, minor)), sb->active_disks, sb->raid_disks, raid_conf->algorithm);
 
-               conf->resync_thread = md_register_thread(raid5syncd, conf,name);
-               if (!conf->resync_thread) {
-                       printk(KERN_ERR "raid5: couldn't allocate thread for md%d\n", mdidx(mddev));
-                       goto abort;
-               }
-
-               printk("raid5: raid set md%d not clean; reconstructing parity\n", mdidx(mddev));
-               conf->resync_parity = 1;
-               md_wakeup_thread(conf->resync_thread);
+       if ((sb->state & (1 << MD_SB_CLEAN)) == 0) {
+               printk("raid5: raid set %s not clean; reconstructing parity\n", kdevname(MKDEV(MD_MAJOR, minor)));
+               raid_conf->resync_parity = 1;
+#if SUPPORT_RECONSTRUCTION
+               md_wakeup_thread(raid_conf->resync_thread);
+#endif /* SUPPORT_RECONSTRUCTION */
        }
 
-       print_raid5_conf(conf);
-       if (start_recovery)
-               md_recover_arrays();
-       print_raid5_conf(conf);
-
        /* Ok, everything is just fine now */
        return (0);
 abort:
-       if (conf) {
-               print_raid5_conf(conf);
-               if (conf->stripe_hashtbl)
-                       free_pages((unsigned long) conf->stripe_hashtbl,
-                                                       HASH_PAGES_ORDER);
-               kfree(conf);
+       if (raid_conf) {
+               if (raid_conf->stripe_hashtbl)
+                       free_pages((unsigned long) raid_conf->stripe_hashtbl, HASH_PAGES_ORDER);
+               kfree(raid_conf);
        }
        mddev->private = NULL;
-       printk(KERN_ALERT "raid5: failed to run raid set md%d\n", mdidx(mddev));
+       printk(KERN_ALERT "raid5: failed to run raid set %s\n", kdevname(MKDEV(MD_MAJOR, minor)));
        MOD_DEC_USE_COUNT;
        return -EIO;
 }
 
-static int raid5_stop_resync (mddev_t *mddev)
+static int raid5_stop (int minor, struct md_dev *mddev)
 {
-       raid5_conf_t *conf = mddev_to_conf(mddev);
-       mdk_thread_t *thread = conf->resync_thread;
-
-       if (thread) {
-               if (conf->resync_parity) {
-                       conf->resync_parity = 2;
-                       md_interrupt_thread(thread);
-                       printk(KERN_INFO "raid5: parity resync was not fully finished, restarting next time.\n");
-                       return 1;
-               }
-               return 0;
-       }
-       return 0;
-}
-
-static int raid5_restart_resync (mddev_t *mddev)
-{
-       raid5_conf_t *conf = mddev_to_conf(mddev);
-
-       if (conf->resync_parity) {
-               if (!conf->resync_thread) {
-                       MD_BUG();
-                       return 0;
-               }
-               printk("raid5: waking up raid5resync.\n");
-               conf->resync_parity = 1;
-               md_wakeup_thread(conf->resync_thread);
-               return 1;
-       } else
-               printk("raid5: no restart-resync needed.\n");
-       return 0;
-}
-
-
-static int raid5_stop (mddev_t *mddev)
-{
-       raid5_conf_t *conf = (raid5_conf_t *) mddev->private;
-
-       shrink_stripe_cache(conf, conf->max_nr_stripes);
-       shrink_stripes(conf, conf->max_nr_stripes);
-       md_unregister_thread(conf->thread);
-       if (conf->resync_thread)
-               md_unregister_thread(conf->resync_thread);
-       free_pages((unsigned long) conf->stripe_hashtbl, HASH_PAGES_ORDER);
-       kfree(conf);
+       struct raid5_data *raid_conf = (struct raid5_data *) mddev->private;
+
+       shrink_stripe_cache(raid_conf, raid_conf->max_nr_stripes);
+       shrink_stripes(raid_conf, raid_conf->max_nr_stripes);
+       md_unregister_thread(raid_conf->thread);
+#if SUPPORT_RECONSTRUCTION
+       md_unregister_thread(raid_conf->resync_thread);
+#endif /* SUPPORT_RECONSTRUCTION */
+       free_pages((unsigned long) raid_conf->stripe_hashtbl, HASH_PAGES_ORDER);
+       kfree(raid_conf);
        mddev->private = NULL;
        MOD_DEC_USE_COUNT;
        return 0;
 }
 
-static int raid5_status (char *page, mddev_t *mddev)
+static int raid5_status (char *page, int minor, struct md_dev *mddev)
 {
-       raid5_conf_t *conf = (raid5_conf_t *) mddev->private;
-       mdp_super_t *sb = mddev->sb;
+       struct raid5_data *raid_conf = (struct raid5_data *) mddev->private;
+       md_superblock_t *sb = mddev->sb;
        int sz = 0, i;
 
-       sz += sprintf (page+sz, " level %d, %dk chunk, algorithm %d", sb->level, sb->chunk_size >> 10, sb->layout);
-       sz += sprintf (page+sz, " [%d/%d] [", conf->raid_disks, conf->working_disks);
-       for (i = 0; i < conf->raid_disks; i++)
-               sz += sprintf (page+sz, "%s", conf->disks[i].operational ? "U" : "_");
+       sz += sprintf (page+sz, " level %d, %dk chunk, algorithm %d", sb->level, sb->chunk_size >> 10, sb->parity_algorithm);
+       sz += sprintf (page+sz, " [%d/%d] [", raid_conf->raid_disks, raid_conf->working_disks);
+       for (i = 0; i < raid_conf->raid_disks; i++)
+               sz += sprintf (page+sz, "%s", raid_conf->disks[i].operational ? "U" : "_");
        sz += sprintf (page+sz, "]");
        return sz;
 }
 
-static void print_raid5_conf (raid5_conf_t *conf)
+static int raid5_mark_spare(struct md_dev *mddev, md_descriptor_t *spare, int state)
 {
-       int i;
-       struct disk_info *tmp;
-
-       printk("RAID5 conf printout:\n");
-       if (!conf) {
-               printk("(conf==NULL)\n");
-               return;
-       }
-       printk(" --- rd:%d wd:%d fd:%d\n", conf->raid_disks,
-                conf->working_disks, conf->failed_disks);
-
-       for (i = 0; i < MD_SB_DISKS; i++) {
-               tmp = conf->disks + i;
-               printk(" disk %d, s:%d, o:%d, n:%d rd:%d us:%d dev:%s\n",
-                       i, tmp->spare,tmp->operational,
-                       tmp->number,tmp->raid_disk,tmp->used_slot,
-                       partition_name(tmp->dev));
-       }
-}
-
-static int raid5_diskop(mddev_t *mddev, mdp_disk_t **d, int state)
-{
-       int err = 0;
-       int i, failed_disk=-1, spare_disk=-1, removed_disk=-1, added_disk=-1;
-       raid5_conf_t *conf = mddev->private;
-       struct disk_info *tmp, *sdisk, *fdisk, *rdisk, *adisk;
+       int i = 0, failed_disk = -1;
+       struct raid5_data *raid_conf = mddev->private;
+       struct disk_info *disk = raid_conf->disks;
        unsigned long flags;
-       mdp_super_t *sb = mddev->sb;
-       mdp_disk_t *failed_desc, *spare_desc, *added_desc;
-
+       md_superblock_t *sb = mddev->sb;
+       md_descriptor_t *descriptor;
+
+       for (i = 0; i < MD_SB_DISKS; i++, disk++) {
+               if (disk->spare && disk->number == spare->number)
+                       goto found;
+       }
+       return 1;
+found:
+       for (i = 0, disk = raid_conf->disks; i < raid_conf->raid_disks; i++, disk++)
+               if (!disk->operational)
+                       failed_disk = i;
+       if (failed_disk == -1)
+               return 1;
        save_flags(flags);
        cli();
-
-       print_raid5_conf(conf);
-       /*
-        * find the disk ...
-        */
-       switch (state) {
-
-       case DISKOP_SPARE_ACTIVE:
-
-               /*
-                * Find the failed disk within the RAID5 configuration ...
-                * (this can only be in the first conf->raid_disks part)
-                */
-               for (i = 0; i < conf->raid_disks; i++) {
-                       tmp = conf->disks + i;
-                       if ((!tmp->operational && !tmp->spare) ||
-                                       !tmp->used_slot) {
-                               failed_disk = i;
-                               break;
-                       }
-               }
-               /*
-                * When we activate a spare disk we _must_ have a disk in
-                * the lower (active) part of the array to replace. 
-                */
-               if ((failed_disk == -1) || (failed_disk >= conf->raid_disks)) {
-                       MD_BUG();
-                       err = 1;
-                       goto abort;
-               }
-               /* fall through */
-
-       case DISKOP_SPARE_WRITE:
-       case DISKOP_SPARE_INACTIVE:
-
-               /*
-                * Find the spare disk ... (can only be in the 'high'
-                * area of the array)
-                */
-               for (i = conf->raid_disks; i < MD_SB_DISKS; i++) {
-                       tmp = conf->disks + i;
-                       if (tmp->spare && tmp->number == (*d)->number) {
-                               spare_disk = i;
-                               break;
-                       }
-               }
-               if (spare_disk == -1) {
-                       MD_BUG();
-                       err = 1;
-                       goto abort;
-               }
-               break;
-
-       case DISKOP_HOT_REMOVE_DISK:
-
-               for (i = 0; i < MD_SB_DISKS; i++) {
-                       tmp = conf->disks + i;
-                       if (tmp->used_slot && (tmp->number == (*d)->number)) {
-                               if (tmp->operational) {
-                                       err = -EBUSY;
-                                       goto abort;
-                               }
-                               removed_disk = i;
-                               break;
-                       }
-               }
-               if (removed_disk == -1) {
-                       MD_BUG();
-                       err = 1;
-                       goto abort;
-               }
-               break;
-
-       case DISKOP_HOT_ADD_DISK:
-
-               for (i = conf->raid_disks; i < MD_SB_DISKS; i++) {
-                       tmp = conf->disks + i;
-                       if (!tmp->used_slot) {
-                               added_disk = i;
-                               break;
-                       }
-               }
-               if (added_disk == -1) {
-                       MD_BUG();
-                       err = 1;
-                       goto abort;
-               }
-               break;
-       }
-
        switch (state) {
-       /*
-        * Switch the spare disk to write-only mode:
-        */
-       case DISKOP_SPARE_WRITE:
-               if (conf->spare) {
-                       MD_BUG();
-                       err = 1;
-                       goto abort;
-               }
-               sdisk = conf->disks + spare_disk;
-               sdisk->operational = 1;
-               sdisk->write_only = 1;
-               conf->spare = sdisk;
-               break;
-       /*
-        * Deactivate a spare disk:
-        */
-       case DISKOP_SPARE_INACTIVE:
-               sdisk = conf->disks + spare_disk;
-               sdisk->operational = 0;
-               sdisk->write_only = 0;
-               /*
-                * Was the spare being resynced?
-                */
-               if (conf->spare == sdisk)
-                       conf->spare = NULL;
-               break;
-       /*
-        * Activate (mark read-write) the (now sync) spare disk,
-        * which means we switch it's 'raid position' (->raid_disk)
-        * with the failed disk. (only the first 'conf->raid_disks'
-        * slots are used for 'real' disks and we must preserve this
-        * property)
-        */
-       case DISKOP_SPARE_ACTIVE:
-               if (!conf->spare) {
-                       MD_BUG();
-                       err = 1;
-                       goto abort;
-               }
-               sdisk = conf->disks + spare_disk;
-               fdisk = conf->disks + failed_disk;
-
-               spare_desc = &sb->disks[sdisk->number];
-               failed_desc = &sb->disks[fdisk->number];
-
-               if (spare_desc != *d) {
-                       MD_BUG();
-                       err = 1;
-                       goto abort;
-               }
-
-               if (spare_desc->raid_disk != sdisk->raid_disk) {
-                       MD_BUG();
-                       err = 1;
-                       goto abort;
-               }
-                       
-               if (sdisk->raid_disk != spare_disk) {
-                       MD_BUG();
-                       err = 1;
-                       goto abort;
-               }
-
-               if (failed_desc->raid_disk != fdisk->raid_disk) {
-                       MD_BUG();
-                       err = 1;
-                       goto abort;
-               }
-
-               if (fdisk->raid_disk != failed_disk) {
-                       MD_BUG();
-                       err = 1;
-                       goto abort;
-               }
-
-               /*
-                * do the switch finally
-                */
-               xchg_values(*spare_desc, *failed_desc);
-               xchg_values(*fdisk, *sdisk);
-
-               /*
-                * (careful, 'failed' and 'spare' are switched from now on)
-                *
-                * we want to preserve linear numbering and we want to
-                * give the proper raid_disk number to the now activated
-                * disk. (this means we switch back these values)
-                */
-       
-               xchg_values(spare_desc->raid_disk, failed_desc->raid_disk);
-               xchg_values(sdisk->raid_disk, fdisk->raid_disk);
-               xchg_values(spare_desc->number, failed_desc->number);
-               xchg_values(sdisk->number, fdisk->number);
-
-               *d = failed_desc;
-
-               if (sdisk->dev == MKDEV(0,0))
-                       sdisk->used_slot = 0;
-
-               /*
-                * this really activates the spare.
-                */
-               fdisk->spare = 0;
-               fdisk->write_only = 0;
-
-               /*
-                * if we activate a spare, we definitely replace a
-                * non-operational disk slot in the 'low' area of
-                * the disk array.
-                */
-               conf->failed_disks--;
-               conf->working_disks++;
-               conf->spare = NULL;
-
-               break;
-
-       case DISKOP_HOT_REMOVE_DISK:
-               rdisk = conf->disks + removed_disk;
-
-               if (rdisk->spare && (removed_disk < conf->raid_disks)) {
-                       MD_BUG();       
-                       err = 1;
-                       goto abort;
-               }
-               rdisk->dev = MKDEV(0,0);
-               rdisk->used_slot = 0;
-
-               break;
-
-       case DISKOP_HOT_ADD_DISK:
-               adisk = conf->disks + added_disk;
-               added_desc = *d;
-
-               if (added_disk != added_desc->number) {
-                       MD_BUG();       
-                       err = 1;
-                       goto abort;
-               }
-
-               adisk->number = added_desc->number;
-               adisk->raid_disk = added_desc->raid_disk;
-               adisk->dev = MKDEV(added_desc->major,added_desc->minor);
-
-               adisk->operational = 0;
-               adisk->write_only = 0;
-               adisk->spare = 1;
-               adisk->used_slot = 1;
-
-
-               break;
+               case SPARE_WRITE:
+                       disk->operational = 1;
+                       disk->write_only = 1;
+                       raid_conf->spare = disk;
+                       break;
+               case SPARE_INACTIVE:
+                       disk->operational = 0;
+                       disk->write_only = 0;
+                       raid_conf->spare = NULL;
+                       break;
+               case SPARE_ACTIVE:
+                       disk->spare = 0;
+                       disk->write_only = 0;
 
-       default:
-               MD_BUG();       
-               err = 1;
-               goto abort;
+                       descriptor = &sb->disks[raid_conf->disks[failed_disk].number];
+                       i = spare->raid_disk;
+                       disk->raid_disk = spare->raid_disk = descriptor->raid_disk;
+                       if (disk->raid_disk != failed_disk)
+                               printk("raid5: disk->raid_disk != failed_disk");
+                       descriptor->raid_disk = i;
+
+                       raid_conf->spare = NULL;
+                       raid_conf->working_disks++;
+                       raid_conf->failed_disks--;
+                       raid_conf->disks[failed_disk] = *disk;
+                       break;
+               default:
+                       printk("raid5_mark_spare: bug: state == %d\n", state);
+                       restore_flags(flags);
+                       return 1;
        }
-abort:
        restore_flags(flags);
-       print_raid5_conf(conf);
-       return err;
+       return 0;
 }
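In the replacement `raid5_mark_spare()` path above, the `SPARE_ACTIVE` case boils down to exchanging `raid_disk` indices between the spare and the failed slot, so the activated disk lands in the low `raid_disks` range. A minimal sketch of that exchange, using hypothetical struct and function names (not the kernel's own types):

```c
#include <assert.h>

/* Hypothetical, simplified view of a disk slot: only the fields
 * the SPARE_ACTIVE swap touches in the diff above. */
struct slot {
    int raid_disk;   /* logical position in the array */
    int number;      /* superblock descriptor index */
};

/* Exchange the raid positions of the spare and the failed slot,
 * mirroring the index swap performed in the code above. */
static void activate_spare(struct slot *spare, struct slot *failed)
{
    int tmp = spare->raid_disk;
    spare->raid_disk = failed->raid_disk;
    failed->raid_disk = tmp;
}
```

After the swap the former spare occupies the failed disk's raid position, which is the invariant the original `DISKOP_SPARE_ACTIVE` comment insists on ("only the first 'conf->raid_disks' slots are used for 'real' disks").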
 
-static mdk_personality_t raid5_personality=
+static struct md_personality raid5_personality=
 {
        "raid5",
        raid5_map,
@@ -2072,19 +1648,14 @@ static mdk_personality_t raid5_personality=
        NULL,                   /* no ioctls */
        0,
        raid5_error,
-       raid5_diskop,
-       raid5_stop_resync,
-       raid5_restart_resync
+       /* raid5_hot_add_disk, */ NULL,
+       /* raid1_hot_remove_drive */ NULL,
+       raid5_mark_spare
 };
 
 int raid5_init (void)
 {
-       int err;
-
-       err = register_md_personality (RAID5, &raid5_personality);
-       if (err)
-               return err;
-       return 0;
+       return register_md_personality (RAID5, &raid5_personality);
 }
 
 #ifdef MODULE
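The `raid5_personality` table and `raid5_init()` above follow the md pattern of registering a named table of function pointers under a small integer key; the simplified `raid5_init()` just forwards `register_md_personality()`'s return value. A hedged miniature of that registry pattern, with invented names (`register_personality`, `pers_table`) rather than the kernel's:

```c
#include <stddef.h>

#define MAX_PERSONALITY 8

/* Hypothetical miniature of an md personality: a name plus an
 * operations table (here a single placeholder op). */
struct personality {
    const char *name;
    int (*map)(int dev);
};

static struct personality *pers_table[MAX_PERSONALITY];

/* Register a personality under a fixed slot number, refusing
 * out-of-range keys and already-occupied slots. */
static int register_personality(int pnum, struct personality *p)
{
    if (pnum < 0 || pnum >= MAX_PERSONALITY || pers_table[pnum])
        return -1;
    pers_table[pnum] = p;
    return 0;
}

/* Example personality, standing in for raid5_personality. */
static int dummy_map(int dev) { return dev; }
static struct personality raid5ish = { "raid5", dummy_map };
```

An init function in this style can then simply `return register_personality(4, &raid5ish);`, which is the shape the rewritten `raid5_init()` takes.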
diff --git a/drivers/block/translucent.c b/drivers/block/translucent.c
deleted file mode 100644 (file)
index 49d2d88..0000000
+++ /dev/null
@@ -1,136 +0,0 @@
-/*
-   translucent.c : Translucent RAID driver for Linux
-              Copyright (C) 1998 Ingo Molnar
-
-   Translucent mode management functions.
-
-   This program is free software; you can redistribute it and/or modify
-   it under the terms of the GNU General Public License as published by
-   the Free Software Foundation; either version 2, or (at your option)
-   any later version.
-   
-   You should have received a copy of the GNU General Public License
-   (for example /usr/src/linux/COPYING); if not, write to the Free
-   Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.  
-*/
-
-#include <linux/module.h>
-
-#include <linux/raid/md.h>
-#include <linux/malloc.h>
-
-#include <linux/raid/translucent.h>
-
-#define MAJOR_NR MD_MAJOR
-#define MD_DRIVER
-#define MD_PERSONALITY
-
-static int translucent_run (mddev_t *mddev)
-{
-       translucent_conf_t *conf;
-       mdk_rdev_t *rdev;
-       int i;
-
-       MOD_INC_USE_COUNT;
-
-       conf = kmalloc (sizeof (*conf), GFP_KERNEL);
-       if (!conf)
-               goto out;
-       mddev->private = conf;
-
-       if (mddev->nb_dev != 2) {
-               printk("translucent: this mode needs 2 disks, aborting!\n");
-               goto out;
-       }
-
-       if (md_check_ordering(mddev)) {
-               printk("translucent: disks are not ordered, aborting!\n");
-               goto out;
-       }
-
-       ITERATE_RDEV_ORDERED(mddev,rdev,i) {
-               dev_info_t *disk = conf->disks + i;
-
-               disk->dev = rdev->dev;
-               disk->size = rdev->size;
-       }
-
-       return 0;
-
-out:
-       if (conf)
-               kfree(conf);
-
-       MOD_DEC_USE_COUNT;
-       return 1;
-}
-
-static int translucent_stop (mddev_t *mddev)
-{
-       translucent_conf_t *conf = mddev_to_conf(mddev);
-  
-       kfree(conf);
-
-       MOD_DEC_USE_COUNT;
-
-       return 0;
-}
-
-
-static int translucent_map (mddev_t *mddev, kdev_t dev, kdev_t *rdev,
-                      unsigned long *rsector, unsigned long size)
-{
-       translucent_conf_t *conf = mddev_to_conf(mddev);
-  
-       *rdev = conf->disks[0].dev;
-
-       return 0;
-}
-
-static int translucent_status (char *page, mddev_t *mddev)
-{
-       int sz = 0;
-  
-       sz += sprintf(page+sz, " %d%% full", 10);
-       return sz;
-}
-
-
-static mdk_personality_t translucent_personality=
-{
-       "translucent",
-       translucent_map,
-       NULL,
-       NULL,
-       translucent_run,
-       translucent_stop,
-       translucent_status,
-       NULL,
-       0,
-       NULL,
-       NULL,
-       NULL,
-       NULL
-};
-
-#ifndef MODULE
-
-md__initfunc(void translucent_init (void))
-{
-       register_md_personality (TRANSLUCENT, &translucent_personality);
-}
-
-#else
-
-int init_module (void)
-{
-       return (register_md_personality (TRANSLUCENT, &translucent_personality));
-}
-
-void cleanup_module (void)
-{
-       unregister_md_personality (TRANSLUCENT);
-}
-
-#endif
-
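The deleted `translucent_map()` above is the whole of the "translucent" personality's I/O path: every request is redirected to the first of the two member devices, with sector and size untouched. A sketch of that passthrough, using stand-in types (`kdev_sketch_t` and friends are hypothetical, not the kernel's `kdev_t`):

```c
#include <assert.h>

typedef unsigned int kdev_sketch_t;   /* stand-in for kdev_t */

/* Hypothetical mirror of translucent_conf_t: just the two
 * required member devices. */
struct translucent_sketch {
    kdev_sketch_t disks[2];
};

/* Passthrough map: always resolve to the first disk, as the
 * deleted translucent_map() did. Returns 0 for success. */
static int translucent_map_sketch(const struct translucent_sketch *conf,
                                  kdev_sketch_t *rdev)
{
    *rdev = conf->disks[0];
    return 0;
}
```

Since only `disks[0]` is ever read, the second device exists purely as the "translucent" overlay target the driver never got around to using, which is consistent with the hard-coded `" %d%% full", 10` in the deleted status function.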
diff --git a/drivers/block/xor.c b/drivers/block/xor.c
deleted file mode 100644 (file)
index 062ba73..0000000
+++ /dev/null
@@ -1,1895 +0,0 @@
-/*
- * xor.c : Multiple Devices driver for Linux
- *
- * Copyright (C) 1996, 1997, 1998, 1999 Ingo Molnar, Matti Aarnio, Jakub Jelinek
- *
- *
- * optimized RAID-5 checksumming functions.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2, or (at your option)
- * any later version.
- *
- * You should have received a copy of the GNU General Public License
- * (for example /usr/src/linux/COPYING); if not, write to the Free
- * Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
- */
-#include <linux/config.h>
-#include <linux/module.h>
-#include <linux/raid/md.h>
-#ifdef __sparc_v9__
-#include <asm/head.h>
-#include <asm/asi.h>
-#include <asm/visasm.h>
-#endif
-
-/*
- * we use the 'XOR function template' to register multiple xor
- * functions runtime. The kernel measures their speed upon bootup
- * and decides which one to use. (compile-time registration is
- * not enough as certain CPU features like MMX can only be detected
- * runtime)
- *
- * this architecture makes it pretty easy to add new routines
- * that are faster on certain CPUs, without killing other CPU's
- * 'native' routine. Although the current routines are belived
- * to be the physically fastest ones on all CPUs tested, but
- * feel free to prove me wrong and add yet another routine =B-)
- * --mingo
- */
-
-#define MAX_XOR_BLOCKS 5
-
-#define XOR_ARGS (unsigned int count, struct buffer_head **bh_ptr)
-
-typedef void (*xor_block_t) XOR_ARGS;
-xor_block_t xor_block = NULL;
-
-#ifndef __sparc_v9__
-
-struct xor_block_template;
-
-struct xor_block_template {
-       char * name;
-       xor_block_t xor_block;
-       int speed;
-       struct xor_block_template * next;
-};
-
-struct xor_block_template * xor_functions = NULL;
-
-#define XORBLOCK_TEMPLATE(x) \
-static void xor_block_##x XOR_ARGS; \
-static struct xor_block_template t_xor_block_##x = \
-                                { #x, xor_block_##x, 0, NULL }; \
-static void xor_block_##x XOR_ARGS
-
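The comment and `XORBLOCK_TEMPLATE` macro above describe the scheme the deleted xor.c used: candidate routines are registered at runtime on a linked list, the kernel benchmarks each at boot, and the fastest becomes `xor_block`. A hedged miniature of that select-the-fastest registry, with made-up names and the speeds simply supplied rather than measured:

```c
#include <stddef.h>

/* Hypothetical miniature of struct xor_block_template: name,
 * measured speed, and an intrusive list link. */
struct xor_tmpl {
    const char *name;
    int speed;                 /* filled by a boot-time benchmark in the real code */
    struct xor_tmpl *next;
};

static struct xor_tmpl *xor_list;

/* Push a candidate onto the list, as XORBLOCK_TEMPLATE-generated
 * registration did for each routine. */
static void xor_register(struct xor_tmpl *t)
{
    t->next = xor_list;
    xor_list = t;
}

/* Walk the list and return the highest-speed candidate, the
 * decision the kernel made once at bootup. */
static struct xor_tmpl *xor_pick_fastest(void)
{
    struct xor_tmpl *t, *best = NULL;
    for (t = xor_list; t; t = t->next)
        if (!best || t->speed > best->speed)
            best = t;
    return best;
}
```

Runtime (rather than compile-time) registration matters here for exactly the reason the original comment gives: features like MMX or KNI can only be detected on the running CPU, so the winner cannot be chosen when the kernel is built.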
-#ifdef __i386__
-
-#ifdef CONFIG_X86_XMM
-/*
- * Cache avoiding checksumming functions utilizing KNI instructions
- * Copyright (C) 1999 Zach Brown (with obvious credit due Ingo)
- */
-
-XORBLOCK_TEMPLATE(pIII_kni)
-{
-       char xmm_save[16*4];
-       int cr0;
-        int lines = (bh_ptr[0]->b_size>>8);
-
-       __asm__ __volatile__ ( 
-               "movl %%cr0,%0          ;\n\t"
-               "clts                   ;\n\t"
-               "movups %%xmm0,(%1)     ;\n\t"
-               "movups %%xmm1,0x10(%1) ;\n\t"
-               "movups %%xmm2,0x20(%1) ;\n\t"
-               "movups %%xmm3,0x30(%1) ;\n\t"
-               : "=r" (cr0)
-               : "r" (xmm_save) 
-               : "memory" );
-
-#define OFFS(x) "8*("#x"*2)"
-#define        PF0(x) \
-       "       prefetcht0  "OFFS(x)"(%1)   ;\n"
-#define LD(x,y) \
-        "       movaps   "OFFS(x)"(%1), %%xmm"#y"   ;\n"
-#define ST(x,y) \
-        "       movaps %%xmm"#y",   "OFFS(x)"(%1)   ;\n"
-#define PF1(x) \
-       "       prefetchnta "OFFS(x)"(%2)   ;\n"
-#define PF2(x) \
-       "       prefetchnta "OFFS(x)"(%3)   ;\n"
-#define PF3(x) \
-       "       prefetchnta "OFFS(x)"(%4)   ;\n"
-#define PF4(x) \
-       "       prefetchnta "OFFS(x)"(%5)   ;\n"
-#define PF5(x) \
-       "       prefetchnta "OFFS(x)"(%6)   ;\n"
-#define XO1(x,y) \
-        "       xorps   "OFFS(x)"(%2), %%xmm"#y"   ;\n"
-#define XO2(x,y) \
-        "       xorps   "OFFS(x)"(%3), %%xmm"#y"   ;\n"
-#define XO3(x,y) \
-        "       xorps   "OFFS(x)"(%4), %%xmm"#y"   ;\n"
-#define XO4(x,y) \
-        "       xorps   "OFFS(x)"(%5), %%xmm"#y"   ;\n"
-#define XO5(x,y) \
-        "       xorps   "OFFS(x)"(%6), %%xmm"#y"   ;\n"
-
-       switch(count) {
-               case 2:
-                       __asm__ __volatile__ (
-#undef BLOCK
-#define BLOCK(i) \
-               LD(i,0)                                 \
-                       LD(i+1,1)                       \
-               PF1(i)                                  \
-                               PF1(i+2)                \
-                               LD(i+2,2)               \
-                                       LD(i+3,3)       \
-               PF0(i+4)                                \
-                               PF0(i+6)                \
-               XO1(i,0)                                \
-                       XO1(i+1,1)                      \
-                               XO1(i+2,2)              \
-                                       XO1(i+3,3)      \
-               ST(i,0)                                 \
-                       ST(i+1,1)                       \
-                               ST(i+2,2)               \
-                                       ST(i+3,3)       \
-
-
-               PF0(0)
-                               PF0(2)
-
-       " .align 32,0x90                ;\n"
-        " 1:                            ;\n"
-
-               BLOCK(0)
-               BLOCK(4)
-               BLOCK(8)
-               BLOCK(12)
-
-        "       addl $256, %1           ;\n"
-        "       addl $256, %2           ;\n"
-        "       decl %0                 ;\n"
-        "       jnz 1b                  ;\n"
-
-                       :
-                       : "r" (lines),
-                         "r" (bh_ptr[0]->b_data),
-                         "r" (bh_ptr[1]->b_data)
-                       : "memory" );
-                       break;
-               case 3:
-                       __asm__ __volatile__ (
-#undef BLOCK
-#define BLOCK(i) \
-               PF1(i)                                  \
-                               PF1(i+2)                \
-               LD(i,0)                                 \
-                       LD(i+1,1)                       \
-                               LD(i+2,2)               \
-                                       LD(i+3,3)       \
-               PF2(i)                                  \
-                               PF2(i+2)                \
-               PF0(i+4)                                \
-                               PF0(i+6)                \
-               XO1(i,0)                                \
-                       XO1(i+1,1)                      \
-                               XO1(i+2,2)              \
-                                       XO1(i+3,3)      \
-               XO2(i,0)                                \
-                       XO2(i+1,1)                      \
-                               XO2(i+2,2)              \
-                                       XO2(i+3,3)      \
-               ST(i,0)                                 \
-                       ST(i+1,1)                       \
-                               ST(i+2,2)               \
-                                       ST(i+3,3)       \
-
-
-               PF0(0)
-                               PF0(2)
-
-       " .align 32,0x90                ;\n"
-        " 1:                            ;\n"
-
-               BLOCK(0)
-               BLOCK(4)
-               BLOCK(8)
-               BLOCK(12)
-
-        "       addl $256, %1           ;\n"
-        "       addl $256, %2           ;\n"
-        "       addl $256, %3           ;\n"
-        "       decl %0                 ;\n"
-        "       jnz 1b                  ;\n"
-                       :
-                       : "r" (lines),
-                         "r" (bh_ptr[0]->b_data),
-                         "r" (bh_ptr[1]->b_data),
-                         "r" (bh_ptr[2]->b_data)
-                       : "memory" );
-                       break;
-               case 4:
-                       __asm__ __volatile__ (
-#undef BLOCK
-#define BLOCK(i) \
-               PF1(i)                                  \
-                               PF1(i+2)                \
-               LD(i,0)                                 \
-                       LD(i+1,1)                       \
-                               LD(i+2,2)               \
-                                       LD(i+3,3)       \
-               PF2(i)                                  \
-                               PF2(i+2)                \
-               XO1(i,0)                                \
-                       XO1(i+1,1)                      \
-                               XO1(i+2,2)              \
-                                       XO1(i+3,3)      \
-               PF3(i)                                  \
-                               PF3(i+2)                \
-               PF0(i+4)                                \
-                               PF0(i+6)                \
-               XO2(i,0)                                \
-                       XO2(i+1,1)                      \
-                               XO2(i+2,2)              \
-                                       XO2(i+3,3)      \
-               XO3(i,0)                                \
-                       XO3(i+1,1)                      \
-                               XO3(i+2,2)              \
-                                       XO3(i+3,3)      \
-               ST(i,0)                                 \
-                       ST(i+1,1)                       \
-                               ST(i+2,2)               \
-                                       ST(i+3,3)       \
-
-
-               PF0(0)
-                               PF0(2)
-
-       " .align 32,0x90                ;\n"
-        " 1:                            ;\n"
-
-               BLOCK(0)
-               BLOCK(4)
-               BLOCK(8)
-               BLOCK(12)
-
-        "       addl $256, %1           ;\n"
-        "       addl $256, %2           ;\n"
-        "       addl $256, %3           ;\n"
-        "       addl $256, %4           ;\n"
-        "       decl %0                 ;\n"
-        "       jnz 1b                  ;\n"
-
-                       :
-                       : "r" (lines),
-                         "r" (bh_ptr[0]->b_data),
-                         "r" (bh_ptr[1]->b_data),
-                         "r" (bh_ptr[2]->b_data),
-                         "r" (bh_ptr[3]->b_data)
-                       : "memory" );
-                       break;
-               case 5:
-                       __asm__ __volatile__ (
-#undef BLOCK
-#define BLOCK(i) \
-               PF1(i)                                  \
-                               PF1(i+2)                \
-               LD(i,0)                                 \
-                       LD(i+1,1)                       \
-                               LD(i+2,2)               \
-                                       LD(i+3,3)       \
-               PF2(i)                                  \
-                               PF2(i+2)                \
-               XO1(i,0)                                \
-                       XO1(i+1,1)                      \
-                               XO1(i+2,2)              \
-                                       XO1(i+3,3)      \
-               PF3(i)                                  \
-                               PF3(i+2)                \
-               XO2(i,0)                                \
-                       XO2(i+1,1)                      \
-                               XO2(i+2,2)              \
-                                       XO2(i+3,3)      \
-               PF4(i)                                  \
-                               PF4(i+2)                \
-               PF0(i+4)                                \
-                               PF0(i+6)                \
-               XO3(i,0)                                \
-                       XO3(i+1,1)                      \
-                               XO3(i+2,2)              \
-                                       XO3(i+3,3)      \
-               XO4(i,0)                                \
-                       XO4(i+1,1)                      \
-                               XO4(i+2,2)              \
-                                       XO4(i+3,3)      \
-               ST(i,0)                                 \
-                       ST(i+1,1)                       \
-                               ST(i+2,2)               \
-                                       ST(i+3,3)       \
-
-
-               PF0(0)
-                               PF0(2)
-
-       " .align 32,0x90                ;\n"
-        " 1:                            ;\n"
-
-               BLOCK(0)
-               BLOCK(4)
-               BLOCK(8)
-               BLOCK(12)
-
-        "       addl $256, %1           ;\n"
-        "       addl $256, %2           ;\n"
-        "       addl $256, %3           ;\n"
-        "       addl $256, %4           ;\n"
-        "       addl $256, %5           ;\n"
-        "       decl %0                 ;\n"
-        "       jnz 1b                  ;\n"
-
-                       :
-                       : "r" (lines),
-                         "r" (bh_ptr[0]->b_data),
-                         "r" (bh_ptr[1]->b_data),
-                         "r" (bh_ptr[2]->b_data),
-                         "r" (bh_ptr[3]->b_data),
-                         "r" (bh_ptr[4]->b_data)
-                       : "memory");
-                       break;
-       }
-
-       __asm__ __volatile__ ( 
-               "sfence                 ;\n\t"
-               "movups (%1),%%xmm0     ;\n\t"
-               "movups 0x10(%1),%%xmm1 ;\n\t"
-               "movups 0x20(%1),%%xmm2 ;\n\t"
-               "movups 0x30(%1),%%xmm3 ;\n\t"
-               "movl   %0,%%cr0        ;\n\t"
-               :
-               : "r" (cr0), "r" (xmm_save)
-               : "memory" );
-}
-
-#undef OFFS
-#undef LD
-#undef ST
-#undef PF0
-#undef PF1
-#undef PF2
-#undef PF3
-#undef PF4
-#undef PF5
-#undef XO1
-#undef XO2
-#undef XO3
-#undef XO4
-#undef XO5
-#undef BLOCK
-
-#endif /* CONFIG_X86_XMM */
-
-/*
- * high-speed RAID5 checksumming functions utilizing MMX instructions
- * Copyright (C) 1998 Ingo Molnar
- */
-XORBLOCK_TEMPLATE(pII_mmx)
-{
-       char fpu_save[108];
-        int lines = (bh_ptr[0]->b_size>>7);
-
-       if (!(current->flags & PF_USEDFPU))
-               __asm__ __volatile__ ( " clts;\n");
-
-       __asm__ __volatile__ ( " fsave %0; fwait\n"::"m"(fpu_save[0]) );
-
-#define LD(x,y) \
-        "       movq   8*("#x")(%1), %%mm"#y"   ;\n"
-#define ST(x,y) \
-        "       movq %%mm"#y",   8*("#x")(%1)   ;\n"
-#define XO1(x,y) \
-        "       pxor   8*("#x")(%2), %%mm"#y"   ;\n"
-#define XO2(x,y) \
-        "       pxor   8*("#x")(%3), %%mm"#y"   ;\n"
-#define XO3(x,y) \
-        "       pxor   8*("#x")(%4), %%mm"#y"   ;\n"
-#define XO4(x,y) \
-        "       pxor   8*("#x")(%5), %%mm"#y"   ;\n"
-
-       switch(count) {
-               case 2:
-                       __asm__ __volatile__ (
-#undef BLOCK
-#define BLOCK(i) \
-                       LD(i,0)                                 \
-                               LD(i+1,1)                       \
-                                       LD(i+2,2)               \
-                                               LD(i+3,3)       \
-                       XO1(i,0)                                \
-                       ST(i,0)                                 \
-                               XO1(i+1,1)                      \
-                               ST(i+1,1)                       \
-                                       XO1(i+2,2)              \
-                                       ST(i+2,2)               \
-                                               XO1(i+3,3)      \
-                                               ST(i+3,3)
-
-                       " .align 32,0x90                ;\n"
-                       " 1:                            ;\n"
-
-                       BLOCK(0)
-                       BLOCK(4)
-                       BLOCK(8)
-                       BLOCK(12)
-
-                       "       addl $128, %1         ;\n"
-                       "       addl $128, %2         ;\n"
-                       "       decl %0               ;\n"
-                       "       jnz 1b                ;\n"
-                       :
-                       : "r" (lines),
-                         "r" (bh_ptr[0]->b_data),
-                         "r" (bh_ptr[1]->b_data)
-                       : "memory");
-                       break;
-               case 3:
-                       __asm__ __volatile__ (
-#undef BLOCK
-#define BLOCK(i) \
-                       LD(i,0)                                 \
-                               LD(i+1,1)                       \
-                                       LD(i+2,2)               \
-                                               LD(i+3,3)       \
-                       XO1(i,0)                                \
-                               XO1(i+1,1)                      \
-                                       XO1(i+2,2)              \
-                                               XO1(i+3,3)      \
-                       XO2(i,0)                                \
-                       ST(i,0)                                 \
-                               XO2(i+1,1)                      \
-                               ST(i+1,1)                       \
-                                       XO2(i+2,2)              \
-                                       ST(i+2,2)               \
-                                               XO2(i+3,3)      \
-                                               ST(i+3,3)
-
-                       " .align 32,0x90                ;\n"
-                       " 1:                            ;\n"
-
-                       BLOCK(0)
-                       BLOCK(4)
-                       BLOCK(8)
-                       BLOCK(12)
-
-                       "       addl $128, %1         ;\n"
-                       "       addl $128, %2         ;\n"
-                       "       addl $128, %3         ;\n"
-                       "       decl %0               ;\n"
-                       "       jnz 1b                ;\n"
-                       :
-                       : "r" (lines),
-                         "r" (bh_ptr[0]->b_data),
-                         "r" (bh_ptr[1]->b_data),
-                         "r" (bh_ptr[2]->b_data)
-                       : "memory");
-                       break;
-               case 4:
-                       __asm__ __volatile__ (
-#undef BLOCK
-#define BLOCK(i) \
-                       LD(i,0)                                 \
-                               LD(i+1,1)                       \
-                                       LD(i+2,2)               \
-                                               LD(i+3,3)       \
-                       XO1(i,0)                                \
-                               XO1(i+1,1)                      \
-                                       XO1(i+2,2)              \
-                                               XO1(i+3,3)      \
-                       XO2(i,0)                                \
-                               XO2(i+1,1)                      \
-                                       XO2(i+2,2)              \
-                                               XO2(i+3,3)      \
-                       XO3(i,0)                                \
-                       ST(i,0)                                 \
-                               XO3(i+1,1)                      \
-                               ST(i+1,1)                       \
-                                       XO3(i+2,2)              \
-                                       ST(i+2,2)               \
-                                               XO3(i+3,3)      \
-                                               ST(i+3,3)
-
-                       " .align 32,0x90                ;\n"
-                       " 1:                            ;\n"
-
-                       BLOCK(0)
-                       BLOCK(4)
-                       BLOCK(8)
-                       BLOCK(12)
-
-                       "       addl $128, %1         ;\n"
-                       "       addl $128, %2         ;\n"
-                       "       addl $128, %3         ;\n"
-                       "       addl $128, %4         ;\n"
-                       "       decl %0               ;\n"
-                       "       jnz 1b                ;\n"
-                       :
-                       : "r" (lines),
-                         "r" (bh_ptr[0]->b_data),
-                         "r" (bh_ptr[1]->b_data),
-                         "r" (bh_ptr[2]->b_data),
-                         "r" (bh_ptr[3]->b_data)
-                       : "memory");
-                       break;
-               case 5:
-                       __asm__ __volatile__ (
-#undef BLOCK
-#define BLOCK(i) \
-                       LD(i,0)                                 \
-                               LD(i+1,1)                       \
-                                       LD(i+2,2)               \
-                                               LD(i+3,3)       \
-                       XO1(i,0)                                \
-                               XO1(i+1,1)                      \
-                                       XO1(i+2,2)              \
-                                               XO1(i+3,3)      \
-                       XO2(i,0)                                \
-                               XO2(i+1,1)                      \
-                                       XO2(i+2,2)              \
-                                               XO2(i+3,3)      \
-                       XO3(i,0)                                \
-                               XO3(i+1,1)                      \
-                                       XO3(i+2,2)              \
-                                               XO3(i+3,3)      \
-                       XO4(i,0)                                \
-                       ST(i,0)                                 \
-                               XO4(i+1,1)                      \
-                               ST(i+1,1)                       \
-                                       XO4(i+2,2)              \
-                                       ST(i+2,2)               \
-                                               XO4(i+3,3)      \
-                                               ST(i+3,3)
-
-                       " .align 32,0x90                ;\n"
-                       " 1:                            ;\n"
-
-                       BLOCK(0)
-                       BLOCK(4)
-                       BLOCK(8)
-                       BLOCK(12)
-
-                       "       addl $128, %1         ;\n"
-                       "       addl $128, %2         ;\n"
-                       "       addl $128, %3         ;\n"
-                       "       addl $128, %4         ;\n"
-                       "       addl $128, %5         ;\n"
-                       "       decl %0               ;\n"
-                       "       jnz 1b                ;\n"
-                       :
-                       : "r" (lines),
-                         "r" (bh_ptr[0]->b_data),
-                         "r" (bh_ptr[1]->b_data),
-                         "r" (bh_ptr[2]->b_data),
-                         "r" (bh_ptr[3]->b_data),
-                         "r" (bh_ptr[4]->b_data)
-                       : "memory");
-                       break;
-       }
-
-       __asm__ __volatile__ ( " frstor %0;\n"::"m"(fpu_save[0]) );
-
-       if (!(current->flags & PF_USEDFPU))
-               stts();
-}
-
-#undef LD
-#undef XO1
-#undef XO2
-#undef XO3
-#undef XO4
-#undef ST
-#undef BLOCK
-
-XORBLOCK_TEMPLATE(p5_mmx)
-{
-       char fpu_save[108];
-        int lines = (bh_ptr[0]->b_size>>6);
-
-       if (!(current->flags & PF_USEDFPU))
-               __asm__ __volatile__ ( " clts;\n");
-
-       __asm__ __volatile__ ( " fsave %0; fwait\n"::"m"(fpu_save[0]) );
-
-       switch(count) {
-               case 2:
-                       __asm__ __volatile__ (
-
-                               " .align 32,0x90             ;\n"
-                               " 1:                         ;\n"
-                               "       movq   (%1), %%mm0   ;\n"
-                               "       movq  8(%1), %%mm1   ;\n"
-                               "       pxor   (%2), %%mm0   ;\n"
-                               "       movq 16(%1), %%mm2   ;\n"
-                               "       movq %%mm0,   (%1)   ;\n"
-                               "       pxor  8(%2), %%mm1   ;\n"
-                               "       movq 24(%1), %%mm3   ;\n"
-                               "       movq %%mm1,  8(%1)   ;\n"
-                               "       pxor 16(%2), %%mm2   ;\n"
-                               "       movq 32(%1), %%mm4   ;\n"
-                               "       movq %%mm2, 16(%1)   ;\n"
-                               "       pxor 24(%2), %%mm3   ;\n"
-                               "       movq 40(%1), %%mm5   ;\n"
-                               "       movq %%mm3, 24(%1)   ;\n"
-                               "       pxor 32(%2), %%mm4   ;\n"
-                               "       movq 48(%1), %%mm6   ;\n"
-                               "       movq %%mm4, 32(%1)   ;\n"
-                               "       pxor 40(%2), %%mm5   ;\n"
-                               "       movq 56(%1), %%mm7   ;\n"
-                               "       movq %%mm5, 40(%1)   ;\n"
-                               "       pxor 48(%2), %%mm6   ;\n"
-                               "       pxor 56(%2), %%mm7   ;\n"
-                               "       movq %%mm6, 48(%1)   ;\n"
-                               "       movq %%mm7, 56(%1)   ;\n"
-        
-                               "       addl $64, %1         ;\n"
-                               "       addl $64, %2         ;\n"
-                               "       decl %0              ;\n"
-                               "       jnz 1b               ;\n"
-
-                               : 
-                               : "r" (lines),
-                                 "r" (bh_ptr[0]->b_data),
-                                 "r" (bh_ptr[1]->b_data)
-                               : "memory" );
-                       break;
-               case 3:
-                       __asm__ __volatile__ (
-
-                               " .align 32,0x90             ;\n"
-                               " 1:                         ;\n"
-                               "       movq   (%1), %%mm0   ;\n"
-                               "       movq  8(%1), %%mm1   ;\n"
-                               "       pxor   (%2), %%mm0   ;\n"
-                               "       movq 16(%1), %%mm2   ;\n"
-                               "       pxor  8(%2), %%mm1   ;\n"
-                               "       pxor   (%3), %%mm0   ;\n"
-                               "       pxor 16(%2), %%mm2   ;\n"
-                               "       movq %%mm0,   (%1)   ;\n"
-                               "       pxor  8(%3), %%mm1   ;\n"
-                               "       pxor 16(%3), %%mm2   ;\n"
-                               "       movq 24(%1), %%mm3   ;\n"
-                               "       movq %%mm1,  8(%1)   ;\n"
-                               "       movq 32(%1), %%mm4   ;\n"
-                               "       movq 40(%1), %%mm5   ;\n"
-                               "       pxor 24(%2), %%mm3   ;\n"
-                               "       movq %%mm2, 16(%1)   ;\n"
-                               "       pxor 32(%2), %%mm4   ;\n"
-                               "       pxor 24(%3), %%mm3   ;\n"
-                               "       pxor 40(%2), %%mm5   ;\n"
-                               "       movq %%mm3, 24(%1)   ;\n"
-                               "       pxor 32(%3), %%mm4   ;\n"
-                               "       pxor 40(%3), %%mm5   ;\n"
-                               "       movq 48(%1), %%mm6   ;\n"
-                               "       movq %%mm4, 32(%1)   ;\n"
-                               "       movq 56(%1), %%mm7   ;\n"
-                               "       pxor 48(%2), %%mm6   ;\n"
-                               "       movq %%mm5, 40(%1)   ;\n"
-                               "       pxor 56(%2), %%mm7   ;\n"
-                               "       pxor 48(%3), %%mm6   ;\n"
-                               "       pxor 56(%3), %%mm7   ;\n"
-                               "       movq %%mm6, 48(%1)   ;\n"
-                               "       movq %%mm7, 56(%1)   ;\n"
-        
-                               "       addl $64, %1         ;\n"
-                               "       addl $64, %2         ;\n"
-                               "       addl $64, %3         ;\n"
-                               "       decl %0              ;\n"
-                               "       jnz 1b               ;\n"
-
-                               : 
-                               : "r" (lines),
-                                 "r" (bh_ptr[0]->b_data),
-                                 "r" (bh_ptr[1]->b_data),
-                                 "r" (bh_ptr[2]->b_data)
-                               : "memory" );
-                       break;
-               case 4:
-                       __asm__ __volatile__ (
-
-                               " .align 32,0x90             ;\n"
-                               " 1:                         ;\n"
-                               "       movq   (%1), %%mm0   ;\n"
-                               "       movq  8(%1), %%mm1   ;\n"
-                               "       pxor   (%2), %%mm0   ;\n"
-                               "       movq 16(%1), %%mm2   ;\n"
-                               "       pxor  8(%2), %%mm1   ;\n"
-                               "       pxor   (%3), %%mm0   ;\n"
-                               "       pxor 16(%2), %%mm2   ;\n"
-                               "       pxor  8(%3), %%mm1   ;\n"
-                               "       pxor   (%4), %%mm0   ;\n"
-                               "       movq 24(%1), %%mm3   ;\n"
-                               "       pxor 16(%3), %%mm2   ;\n"
-                               "       pxor  8(%4), %%mm1   ;\n"
-                               "       movq %%mm0,   (%1)   ;\n"
-                               "       movq 32(%1), %%mm4   ;\n"
-                               "       pxor 24(%2), %%mm3   ;\n"
-                               "       pxor 16(%4), %%mm2   ;\n"
-                               "       movq %%mm1,  8(%1)   ;\n"
-                               "       movq 40(%1), %%mm5   ;\n"
-                               "       pxor 32(%2), %%mm4   ;\n"
-                               "       pxor 24(%3), %%mm3   ;\n"
-                               "       movq %%mm2, 16(%1)   ;\n"
-                               "       pxor 40(%2), %%mm5   ;\n"
-                               "       pxor 32(%3), %%mm4   ;\n"
-                               "       pxor 24(%4), %%mm3   ;\n"
-                               "       movq %%mm3, 24(%1)   ;\n"
-                               "       movq 56(%1), %%mm7   ;\n"
-                               "       movq 48(%1), %%mm6   ;\n"
-                               "       pxor 40(%3), %%mm5   ;\n"
-                               "       pxor 32(%4), %%mm4   ;\n"
-                               "       pxor 48(%2), %%mm6   ;\n"
-                               "       movq %%mm4, 32(%1)   ;\n"
-                               "       pxor 56(%2), %%mm7   ;\n"
-                               "       pxor 40(%4), %%mm5   ;\n"
-                               "       pxor 48(%3), %%mm6   ;\n"
-                               "       pxor 56(%3), %%mm7   ;\n"
-                               "       movq %%mm5, 40(%1)   ;\n"
-                               "       pxor 48(%4), %%mm6   ;\n"
-                               "       pxor 56(%4), %%mm7   ;\n"
-                               "       movq %%mm6, 48(%1)   ;\n"
-                               "       movq %%mm7, 56(%1)   ;\n"
-        
-                               "       addl $64, %1         ;\n"
-                               "       addl $64, %2         ;\n"
-                               "       addl $64, %3         ;\n"
-                               "       addl $64, %4         ;\n"
-                               "       decl %0              ;\n"
-                               "       jnz 1b               ;\n"
-
-                               : 
-                               : "r" (lines),
-                                 "r" (bh_ptr[0]->b_data),
-                                 "r" (bh_ptr[1]->b_data),
-                                 "r" (bh_ptr[2]->b_data),
-                                 "r" (bh_ptr[3]->b_data)
-                               : "memory" );
-                       break;
-               case 5:
-                       __asm__ __volatile__ (
-
-                               " .align 32,0x90             ;\n"
-                               " 1:                         ;\n"
-                               "       movq   (%1), %%mm0   ;\n"
-                               "       movq  8(%1), %%mm1   ;\n"
-                               "       pxor   (%2), %%mm0   ;\n"
-                               "       pxor  8(%2), %%mm1   ;\n"
-                               "       movq 16(%1), %%mm2   ;\n"
-                               "       pxor   (%3), %%mm0   ;\n"
-                               "       pxor  8(%3), %%mm1   ;\n"
-                               "       pxor 16(%2), %%mm2   ;\n"
-                               "       pxor   (%4), %%mm0   ;\n"
-                               "       pxor  8(%4), %%mm1   ;\n"
-                               "       pxor 16(%3), %%mm2   ;\n"
-                               "       movq 24(%1), %%mm3   ;\n"
-                               "       pxor   (%5), %%mm0   ;\n"
-                               "       pxor  8(%5), %%mm1   ;\n"
-                               "       movq %%mm0,   (%1)   ;\n"
-                               "       pxor 16(%4), %%mm2   ;\n"
-                               "       pxor 24(%2), %%mm3   ;\n"
-                               "       movq %%mm1,  8(%1)   ;\n"
-                               "       pxor 16(%5), %%mm2   ;\n"
-                               "       pxor 24(%3), %%mm3   ;\n"
-                               "       movq 32(%1), %%mm4   ;\n"
-                               "       movq %%mm2, 16(%1)   ;\n"
-                               "       pxor 24(%4), %%mm3   ;\n"
-                               "       pxor 32(%2), %%mm4   ;\n"
-                               "       movq 40(%1), %%mm5   ;\n"
-                               "       pxor 24(%5), %%mm3   ;\n"
-                               "       pxor 32(%3), %%mm4   ;\n"
-                               "       pxor 40(%2), %%mm5   ;\n"
-                               "       movq %%mm3, 24(%1)   ;\n"
-                               "       pxor 32(%4), %%mm4   ;\n"
-                               "       pxor 40(%3), %%mm5   ;\n"
-                               "       movq 48(%1), %%mm6   ;\n"
-                               "       movq 56(%1), %%mm7   ;\n"
-                               "       pxor 32(%5), %%mm4   ;\n"
-                               "       pxor 40(%4), %%mm5   ;\n"
-                               "       pxor 48(%2), %%mm6   ;\n"
-                               "       pxor 56(%2), %%mm7   ;\n"
-                               "       movq %%mm4, 32(%1)   ;\n"
-                               "       pxor 48(%3), %%mm6   ;\n"
-                               "       pxor 56(%3), %%mm7   ;\n"
-                               "       pxor 40(%5), %%mm5   ;\n"
-                               "       pxor 48(%4), %%mm6   ;\n"
-                               "       pxor 56(%4), %%mm7   ;\n"
-                               "       movq %%mm5, 40(%1)   ;\n"
-                               "       pxor 48(%5), %%mm6   ;\n"
-                               "       pxor 56(%5), %%mm7   ;\n"
-                               "       movq %%mm6, 48(%1)   ;\n"
-                               "       movq %%mm7, 56(%1)   ;\n"
-        
-                               "       addl $64, %1         ;\n"
-                               "       addl $64, %2         ;\n"
-                               "       addl $64, %3         ;\n"
-                               "       addl $64, %4         ;\n"
-                               "       addl $64, %5         ;\n"
-                               "       decl %0              ;\n"
-                               "       jnz 1b               ;\n"
-
-                               : 
-                               : "r" (lines),
-                                 "r" (bh_ptr[0]->b_data),
-                                 "r" (bh_ptr[1]->b_data),
-                                 "r" (bh_ptr[2]->b_data),
-                                 "r" (bh_ptr[3]->b_data),
-                                 "r" (bh_ptr[4]->b_data)
-                               : "memory" );
-                       break;
-       }
-
-       __asm__ __volatile__ ( " frstor %0;\n"::"m"(fpu_save[0]) );
-
-       if (!(current->flags & PF_USEDFPU))
-               stts();
-}
-#endif /* __i386__ */
-#endif /* !__sparc_v9__ */
-
-#ifdef __sparc_v9__
-/*
- * High speed xor_block operation for RAID4/5 utilizing the
- * UltraSparc Visual Instruction Set.
- *
- * Copyright (C) 1997, 1999 Jakub Jelinek (jj@ultra.linux.cz)
- *
- *     Requirements:
- *     !(((long)dest | (long)sourceN) & (64 - 1)) &&
- *     !(len & 127) && len >= 256
- *
- * It is done in pure assembly, as otherwise gcc makes it
- * a non-leaf function, which is not what we want.
- * Also, we don't measure the speeds as on other architectures,
- * as the measuring routine does not take into account cold caches
- * and the fact that xor_block_VIS bypasses the caches.
- * xor_block_32regs might be 5% faster for count 2 if caches are hot
- * and things just right (for count 3 VIS is about as fast as 32regs for
- * hot caches and for count 4 and 5 VIS is faster by good margin always),
- * but I think it is better not to pollute the caches.
- * Actually, if I'd just fight for speed for hot caches, I could
- * write a hybrid VIS/integer routine, which would do always two
- * 64B blocks in VIS and two in IEUs, but I really care more about
- * caches.
- */
-extern void *VISenter(void);
-extern void xor_block_VIS XOR_ARGS;
-
-void __xor_block_VIS(void)
-{
-__asm__ ("
-       .globl xor_block_VIS
-xor_block_VIS:
-       ldx     [%%o1 + 0], %%o4
-       ldx     [%%o1 + 8], %%o3
-       ldx     [%%o4 + %1], %%g5
-       ldx     [%%o4 + %0], %%o4
-       ldx     [%%o3 + %0], %%o3
-       rd      %%fprs, %%o5
-       andcc   %%o5, %2, %%g0
-       be,pt   %%icc, 297f
-        sethi  %%hi(%5), %%g1
-       jmpl    %%g1 + %%lo(%5), %%g7
-        add    %%g7, 8, %%g7
-297:   wr      %%g0, %4, %%fprs
-       membar  #LoadStore|#StoreLoad|#StoreStore
-       sub     %%g5, 64, %%g5
-       ldda    [%%o4] %3, %%f0
-       ldda    [%%o3] %3, %%f16
-       cmp     %%o0, 4
-       bgeu,pt %%xcc, 10f
-        cmp    %%o0, 3
-       be,pn   %%xcc, 13f
-        mov    -64, %%g1
-       sub     %%g5, 64, %%g5
-       rd      %%asi, %%g1
-       wr      %%g0, %3, %%asi
-
-2:     ldda    [%%o4 + 64] %%asi, %%f32
-       fxor    %%f0, %%f16, %%f16
-       fxor    %%f2, %%f18, %%f18
-       fxor    %%f4, %%f20, %%f20
-       fxor    %%f6, %%f22, %%f22
-       fxor    %%f8, %%f24, %%f24
-       fxor    %%f10, %%f26, %%f26
-       fxor    %%f12, %%f28, %%f28
-       fxor    %%f14, %%f30, %%f30
-       stda    %%f16, [%%o4] %3
-       ldda    [%%o3 + 64] %%asi, %%f48
-       ldda    [%%o4 + 128] %%asi, %%f0
-       fxor    %%f32, %%f48, %%f48
-       fxor    %%f34, %%f50, %%f50
-       add     %%o4, 128, %%o4
-       fxor    %%f36, %%f52, %%f52
-       add     %%o3, 128, %%o3
-       fxor    %%f38, %%f54, %%f54
-       subcc   %%g5, 128, %%g5
-       fxor    %%f40, %%f56, %%f56
-       fxor    %%f42, %%f58, %%f58
-       fxor    %%f44, %%f60, %%f60
-       fxor    %%f46, %%f62, %%f62
-       stda    %%f48, [%%o4 - 64] %%asi
-       bne,pt  %%xcc, 2b
-        ldda   [%%o3] %3, %%f16
-
-       ldda    [%%o4 + 64] %%asi, %%f32
-       fxor    %%f0, %%f16, %%f16
-       fxor    %%f2, %%f18, %%f18
-       fxor    %%f4, %%f20, %%f20
-       fxor    %%f6, %%f22, %%f22
-       fxor    %%f8, %%f24, %%f24
-       fxor    %%f10, %%f26, %%f26
-       fxor    %%f12, %%f28, %%f28
-       fxor    %%f14, %%f30, %%f30
-       stda    %%f16, [%%o4] %3
-       ldda    [%%o3 + 64] %%asi, %%f48
-       membar  #Sync
-       fxor    %%f32, %%f48, %%f48
-       fxor    %%f34, %%f50, %%f50
-       fxor    %%f36, %%f52, %%f52
-       fxor    %%f38, %%f54, %%f54
-       fxor    %%f40, %%f56, %%f56
-       fxor    %%f42, %%f58, %%f58
-       fxor    %%f44, %%f60, %%f60
-       fxor    %%f46, %%f62, %%f62
-       stda    %%f48, [%%o4 + 64] %%asi
-       membar  #Sync|#StoreStore|#StoreLoad
-       wr      %%g0, 0, %%fprs
-       retl
-        wr     %%g1, %%g0, %%asi
-
-13:    ldx     [%%o1 + 16], %%o2
-       ldx     [%%o2 + %0], %%o2
-
-3:     ldda    [%%o2] %3, %%f32
-       fxor    %%f0, %%f16, %%f48
-       fxor    %%f2, %%f18, %%f50
-       add     %%o4, 64, %%o4
-       fxor    %%f4, %%f20, %%f52
-       fxor    %%f6, %%f22, %%f54
-       add     %%o3, 64, %%o3
-       fxor    %%f8, %%f24, %%f56
-       fxor    %%f10, %%f26, %%f58
-       fxor    %%f12, %%f28, %%f60
-       fxor    %%f14, %%f30, %%f62
-       ldda    [%%o4] %3, %%f0
-       fxor    %%f48, %%f32, %%f48
-       fxor    %%f50, %%f34, %%f50
-       fxor    %%f52, %%f36, %%f52
-       fxor    %%f54, %%f38, %%f54
-       add     %%o2, 64, %%o2
-       fxor    %%f56, %%f40, %%f56
-       fxor    %%f58, %%f42, %%f58
-       subcc   %%g5, 64, %%g5
-       fxor    %%f60, %%f44, %%f60
-       fxor    %%f62, %%f46, %%f62
-       stda    %%f48, [%%o4 + %%g1] %3
-       bne,pt  %%xcc, 3b
-        ldda   [%%o3] %3, %%f16
-
-       ldda    [%%o2] %3, %%f32
-       fxor    %%f0, %%f16, %%f48
-       fxor    %%f2, %%f18, %%f50
-       fxor    %%f4, %%f20, %%f52
-       fxor    %%f6, %%f22, %%f54
-       fxor    %%f8, %%f24, %%f56
-       fxor    %%f10, %%f26, %%f58
-       fxor    %%f12, %%f28, %%f60
-       fxor    %%f14, %%f30, %%f62
-       membar  #Sync
-       fxor    %%f48, %%f32, %%f48
-       fxor    %%f50, %%f34, %%f50
-       fxor    %%f52, %%f36, %%f52
-       fxor    %%f54, %%f38, %%f54
-       fxor    %%f56, %%f40, %%f56
-       fxor    %%f58, %%f42, %%f58
-       fxor    %%f60, %%f44, %%f60
-       fxor    %%f62, %%f46, %%f62
-       stda    %%f48, [%%o4] %3
-       membar  #Sync|#StoreStore|#StoreLoad
-       retl
-        wr     %%g0, 0, %%fprs
-
-10:    cmp     %%o0, 5
-       be,pt   %%xcc, 15f
-        mov    -64, %%g1
-
-14:    ldx     [%%o1 + 16], %%o2
-       ldx     [%%o1 + 24], %%o0
-       ldx     [%%o2 + %0], %%o2
-       ldx     [%%o0 + %0], %%o0
-
-4:     ldda    [%%o2] %3, %%f32
-       fxor    %%f0, %%f16, %%f16
-       fxor    %%f2, %%f18, %%f18
-       add     %%o4, 64, %%o4
-       fxor    %%f4, %%f20, %%f20
-       fxor    %%f6, %%f22, %%f22
-       add     %%o3, 64, %%o3
-       fxor    %%f8, %%f24, %%f24
-       fxor    %%f10, %%f26, %%f26
-       fxor    %%f12, %%f28, %%f28
-       fxor    %%f14, %%f30, %%f30
-       ldda    [%%o0] %3, %%f48
-       fxor    %%f16, %%f32, %%f32
-       fxor    %%f18, %%f34, %%f34
-       fxor    %%f20, %%f36, %%f36
-       fxor    %%f22, %%f38, %%f38
-       add     %%o2, 64, %%o2
-       fxor    %%f24, %%f40, %%f40
-       fxor    %%f26, %%f42, %%f42
-       fxor    %%f28, %%f44, %%f44
-       fxor    %%f30, %%f46, %%f46
-       ldda    [%%o4] %3, %%f0
-       fxor    %%f32, %%f48, %%f48
-       fxor    %%f34, %%f50, %%f50
-       fxor    %%f36, %%f52, %%f52
-       add     %%o0, 64, %%o0
-       fxor    %%f38, %%f54, %%f54
-       fxor    %%f40, %%f56, %%f56
-       fxor    %%f42, %%f58, %%f58
-       subcc   %%g5, 64, %%g5
-       fxor    %%f44, %%f60, %%f60
-       fxor    %%f46, %%f62, %%f62
-       stda    %%f48, [%%o4 + %%g1] %3
-       bne,pt  %%xcc, 4b
-        ldda   [%%o3] %3, %%f16
-
-       ldda    [%%o2] %3, %%f32
-       fxor    %%f0, %%f16, %%f16
-       fxor    %%f2, %%f18, %%f18
-       fxor    %%f4, %%f20, %%f20
-       fxor    %%f6, %%f22, %%f22
-       fxor    %%f8, %%f24, %%f24
-       fxor    %%f10, %%f26, %%f26
-       fxor    %%f12, %%f28, %%f28
-       fxor    %%f14, %%f30, %%f30
-       ldda    [%%o0] %3, %%f48
-       fxor    %%f16, %%f32, %%f32
-       fxor    %%f18, %%f34, %%f34
-       fxor    %%f20, %%f36, %%f36
-       fxor    %%f22, %%f38, %%f38
-       fxor    %%f24, %%f40, %%f40
-       fxor    %%f26, %%f42, %%f42
-       fxor    %%f28, %%f44, %%f44
-       fxor    %%f30, %%f46, %%f46
-       membar  #Sync
-       fxor    %%f32, %%f48, %%f48
-       fxor    %%f34, %%f50, %%f50
-       fxor    %%f36, %%f52, %%f52
-       fxor    %%f38, %%f54, %%f54
-       fxor    %%f40, %%f56, %%f56
-       fxor    %%f42, %%f58, %%f58
-       fxor    %%f44, %%f60, %%f60
-       fxor    %%f46, %%f62, %%f62
-       stda    %%f48, [%%o4] %3
-       membar  #Sync|#StoreStore|#StoreLoad
-       retl
-        wr     %%g0, 0, %%fprs
-
-15:    ldx     [%%o1 + 16], %%o2
-       ldx     [%%o1 + 24], %%o0
-       ldx     [%%o1 + 32], %%o1
-       ldx     [%%o2 + %0], %%o2
-       ldx     [%%o0 + %0], %%o0
-       ldx     [%%o1 + %0], %%o1
-
-5:     ldda    [%%o2] %3, %%f32
-       fxor    %%f0, %%f16, %%f48
-       fxor    %%f2, %%f18, %%f50
-       add     %%o4, 64, %%o4
-       fxor    %%f4, %%f20, %%f52
-       fxor    %%f6, %%f22, %%f54
-       add     %%o3, 64, %%o3
-       fxor    %%f8, %%f24, %%f56
-       fxor    %%f10, %%f26, %%f58
-       fxor    %%f12, %%f28, %%f60
-       fxor    %%f14, %%f30, %%f62
-       ldda    [%%o0] %3, %%f16
-       fxor    %%f48, %%f32, %%f48
-       fxor    %%f50, %%f34, %%f50
-       fxor    %%f52, %%f36, %%f52
-       fxor    %%f54, %%f38, %%f54
-       add     %%o2, 64, %%o2
-       fxor    %%f56, %%f40, %%f56
-       fxor    %%f58, %%f42, %%f58
-       fxor    %%f60, %%f44, %%f60
-       fxor    %%f62, %%f46, %%f62
-       ldda    [%%o1] %3, %%f32
-       fxor    %%f48, %%f16, %%f48
-       fxor    %%f50, %%f18, %%f50
-       add     %%o0, 64, %%o0
-       fxor    %%f52, %%f20, %%f52
-       fxor    %%f54, %%f22, %%f54
-       add     %%o1, 64, %%o1
-       fxor    %%f56, %%f24, %%f56
-       fxor    %%f58, %%f26, %%f58
-       fxor    %%f60, %%f28, %%f60
-       fxor    %%f62, %%f30, %%f62
-       ldda    [%%o4] %3, %%f0
-       fxor    %%f48, %%f32, %%f48
-       fxor    %%f50, %%f34, %%f50
-       fxor    %%f52, %%f36, %%f52
-       fxor    %%f54, %%f38, %%f54
-       fxor    %%f56, %%f40, %%f56
-       fxor    %%f58, %%f42, %%f58
-       subcc   %%g5, 64, %%g5
-       fxor    %%f60, %%f44, %%f60
-       fxor    %%f62, %%f46, %%f62
-       stda    %%f48, [%%o4 + %%g1] %3
-       bne,pt  %%xcc, 5b
-        ldda   [%%o3] %3, %%f16
-
-       ldda    [%%o2] %3, %%f32
-       fxor    %%f0, %%f16, %%f48
-       fxor    %%f2, %%f18, %%f50
-       fxor    %%f4, %%f20, %%f52
-       fxor    %%f6, %%f22, %%f54
-       fxor    %%f8, %%f24, %%f56
-       fxor    %%f10, %%f26, %%f58
-       fxor    %%f12, %%f28, %%f60
-       fxor    %%f14, %%f30, %%f62
-       ldda    [%%o0] %3, %%f16
-       fxor    %%f48, %%f32, %%f48
-       fxor    %%f50, %%f34, %%f50
-       fxor    %%f52, %%f36, %%f52
-       fxor    %%f54, %%f38, %%f54
-       fxor    %%f56, %%f40, %%f56
-       fxor    %%f58, %%f42, %%f58
-       fxor    %%f60, %%f44, %%f60
-       fxor    %%f62, %%f46, %%f62
-       ldda    [%%o1] %3, %%f32
-       fxor    %%f48, %%f16, %%f48
-       fxor    %%f50, %%f18, %%f50
-       fxor    %%f52, %%f20, %%f52
-       fxor    %%f54, %%f22, %%f54
-       fxor    %%f56, %%f24, %%f56
-       fxor    %%f58, %%f26, %%f58
-       fxor    %%f60, %%f28, %%f60
-       fxor    %%f62, %%f30, %%f62
-       membar  #Sync
-       fxor    %%f48, %%f32, %%f48
-       fxor    %%f50, %%f34, %%f50
-       fxor    %%f52, %%f36, %%f52
-       fxor    %%f54, %%f38, %%f54
-       fxor    %%f56, %%f40, %%f56
-       fxor    %%f58, %%f42, %%f58
-       fxor    %%f60, %%f44, %%f60
-       fxor    %%f62, %%f46, %%f62
-       stda    %%f48, [%%o4] %3
-       membar  #Sync|#StoreStore|#StoreLoad
-       retl
-        wr     %%g0, 0, %%fprs
-       " : :
-       "i" (&((struct buffer_head *)0)->b_data),
-       "i" (&((struct buffer_head *)0)->b_size),
-       "i" (FPRS_FEF|FPRS_DU), "i" (ASI_BLK_P),
-       "i" (FPRS_FEF), "i" (VISenter));
-}
-#endif /* __sparc_v9__ */
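The VIS comment above describes what every `xor_block` variant in this file computes: XOR `count - 1` source buffers into the first buffer, a machine word at a time. For orientation, that operation reduces to a short portable C loop. This is an illustrative sketch only; the name `xor_block_generic` and the flat pointer-array signature are assumptions for the example, not the kernel's `XORBLOCK_TEMPLATE`/`buffer_head` interface:

```c
/* Illustrative, portable analogue of the xor_block operation that the
 * VIS/MMX routines in this file hand-optimize.  bufs[0] is the
 * destination; bufs[1..count-1] are XORed into it word by word.
 * (xor_block_generic is a hypothetical name for this sketch.) */
#include <stddef.h>

static void xor_block_generic(int count, size_t size, long *bufs[])
{
	size_t words = size / sizeof(long);
	size_t i;
	int n;

	for (i = 0; i < words; i++)
		for (n = 1; n < count; n++)
			bufs[0][i] ^= bufs[n][i];
}
```

The assembly versions exist because this naive loop leaves the memory pipeline idle; they instead batch 64-byte blocks through FP/MMX registers and, on UltraSparc, bypass the caches entirely.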
-
-#if defined(__sparc__) && !defined(__sparc_v9__)
-/*
- * High speed xor_block operation for RAID4/5 utilizing the
- * ldd/std SPARC instructions.
- *
- * Copyright (C) 1999 Jakub Jelinek (jj@ultra.linux.cz)
- *
- */
-
-XORBLOCK_TEMPLATE(SPARC)
-{
-       int size  = bh_ptr[0]->b_size;
-       int lines = size / (sizeof (long)) / 8, i;
-       long *destp   = (long *) bh_ptr[0]->b_data;
-       long *source1 = (long *) bh_ptr[1]->b_data;
-       long *source2, *source3, *source4;
-
-       switch (count) {
-       case 2:
-               for (i = lines; i > 0; i--) {
-                 __asm__ __volatile__("
-                 ldd [%0 + 0x00], %%g2
-                 ldd [%0 + 0x08], %%g4
-                 ldd [%0 + 0x10], %%o0
-                 ldd [%0 + 0x18], %%o2
-                 ldd [%1 + 0x00], %%o4
-                 ldd [%1 + 0x08], %%l0
-                 ldd [%1 + 0x10], %%l2
-                 ldd [%1 + 0x18], %%l4
-                 xor %%g2, %%o4, %%g2
-                 xor %%g3, %%o5, %%g3
-                 xor %%g4, %%l0, %%g4
-                 xor %%g5, %%l1, %%g5
-                 xor %%o0, %%l2, %%o0
-                 xor %%o1, %%l3, %%o1
-                 xor %%o2, %%l4, %%o2
-                 xor %%o3, %%l5, %%o3
-                 std %%g2, [%0 + 0x00]
-                 std %%g4, [%0 + 0x08]
-                 std %%o0, [%0 + 0x10]
-                 std %%o2, [%0 + 0x18]
-                 " : : "r" (destp), "r" (source1) : "g2", "g3", "g4", "g5", "o0", 
-                 "o1", "o2", "o3", "o4", "o5", "l0", "l1", "l2", "l3", "l4", "l5");
-                 destp += 8;
-                 source1 += 8;
-               }
-               break;
-       case 3:
-               source2 = (long *) bh_ptr[2]->b_data;
-               for (i = lines; i > 0; i--) {
-                 __asm__ __volatile__("
-                 ldd [%0 + 0x00], %%g2
-                 ldd [%0 + 0x08], %%g4
-                 ldd [%0 + 0x10], %%o0
-                 ldd [%0 + 0x18], %%o2
-                 ldd [%1 + 0x00], %%o4
-                 ldd [%1 + 0x08], %%l0
-                 ldd [%1 + 0x10], %%l2
-                 ldd [%1 + 0x18], %%l4
-                 xor %%g2, %%o4, %%g2
-                 xor %%g3, %%o5, %%g3
-                 ldd [%2 + 0x00], %%o4
-                 xor %%g4, %%l0, %%g4
-                 xor %%g5, %%l1, %%g5
-                 ldd [%2 + 0x08], %%l0
-                 xor %%o0, %%l2, %%o0
-                 xor %%o1, %%l3, %%o1
-                 ldd [%2 + 0x10], %%l2
-                 xor %%o2, %%l4, %%o2
-                 xor %%o3, %%l5, %%o3
-                 ldd [%2 + 0x18], %%l4
-                 xor %%g2, %%o4, %%g2
-                 xor %%g3, %%o5, %%g3
-                 xor %%g4, %%l0, %%g4
-                 xor %%g5, %%l1, %%g5
-                 xor %%o0, %%l2, %%o0
-                 xor %%o1, %%l3, %%o1
-                 xor %%o2, %%l4, %%o2
-                 xor %%o3, %%l5, %%o3
-                 std %%g2, [%0 + 0x00]
-                 std %%g4, [%0 + 0x08]
-                 std %%o0, [%0 + 0x10]
-                 std %%o2, [%0 + 0x18]
-                 " : : "r" (destp), "r" (source1), "r" (source2)
-                 : "g2", "g3", "g4", "g5", "o0", "o1", "o2", "o3", "o4", "o5",
-                 "l0", "l1", "l2", "l3", "l4", "l5");
-                 destp += 8;
-                 source1 += 8;
-                 source2 += 8;
-               }
-               break;
-       case 4:
-               source2 = (long *) bh_ptr[2]->b_data;
-               source3 = (long *) bh_ptr[3]->b_data;
-               for (i = lines; i > 0; i--) {
-                 __asm__ __volatile__("
-                 ldd [%0 + 0x00], %%g2
-                 ldd [%0 + 0x08], %%g4
-                 ldd [%0 + 0x10], %%o0
-                 ldd [%0 + 0x18], %%o2
-                 ldd [%1 + 0x00], %%o4
-                 ldd [%1 + 0x08], %%l0
-                 ldd [%1 + 0x10], %%l2
-                 ldd [%1 + 0x18], %%l4
-                 xor %%g2, %%o4, %%g2
-                 xor %%g3, %%o5, %%g3
-                 ldd [%2 + 0x00], %%o4
-                 xor %%g4, %%l0, %%g4
-                 xor %%g5, %%l1, %%g5
-                 ldd [%2 + 0x08], %%l0
-                 xor %%o0, %%l2, %%o0
-                 xor %%o1, %%l3, %%o1
-                 ldd [%2 + 0x10], %%l2
-                 xor %%o2, %%l4, %%o2
-                 xor %%o3, %%l5, %%o3
-                 ldd [%2 + 0x18], %%l4
-                 xor %%g2, %%o4, %%g2
-                 xor %%g3, %%o5, %%g3
-                 ldd [%3 + 0x00], %%o4
-                 xor %%g4, %%l0, %%g4
-                 xor %%g5, %%l1, %%g5
-                 ldd [%3 + 0x08], %%l0
-                 xor %%o0, %%l2, %%o0
-                 xor %%o1, %%l3, %%o1
-                 ldd [%3 + 0x10], %%l2
-                 xor %%o2, %%l4, %%o2
-                 xor %%o3, %%l5, %%o3
-                 ldd [%3 + 0x18], %%l4
-                 xor %%g2, %%o4, %%g2
-                 xor %%g3, %%o5, %%g3
-                 xor %%g4, %%l0, %%g4
-                 xor %%g5, %%l1, %%g5
-                 xor %%o0, %%l2, %%o0
-                 xor %%o1, %%l3, %%o1
-                 xor %%o2, %%l4, %%o2
-                 xor %%o3, %%l5, %%o3
-                 std %%g2, [%0 + 0x00]
-                 std %%g4, [%0 + 0x08]
-                 std %%o0, [%0 + 0x10]
-                 std %%o2, [%0 + 0x18]
-                 " : : "r" (destp), "r" (source1), "r" (source2), "r" (source3)
-                 : "g2", "g3", "g4", "g5", "o0", "o1", "o2", "o3", "o4", "o5",
-                 "l0", "l1", "l2", "l3", "l4", "l5");
-                 destp += 8;
-                 source1 += 8;
-                 source2 += 8;
-                 source3 += 8;
-               }
-               break;
-       case 5:
-               source2 = (long *) bh_ptr[2]->b_data;
-               source3 = (long *) bh_ptr[3]->b_data;
-               source4 = (long *) bh_ptr[4]->b_data;
-               for (i = lines; i > 0; i--) {
-                 __asm__ __volatile__("
-                 ldd [%0 + 0x00], %%g2
-                 ldd [%0 + 0x08], %%g4
-                 ldd [%0 + 0x10], %%o0
-                 ldd [%0 + 0x18], %%o2
-                 ldd [%1 + 0x00], %%o4
-                 ldd [%1 + 0x08], %%l0
-                 ldd [%1 + 0x10], %%l2
-                 ldd [%1 + 0x18], %%l4
-                 xor %%g2, %%o4, %%g2
-                 xor %%g3, %%o5, %%g3
-                 ldd [%2 + 0x00], %%o4
-                 xor %%g4, %%l0, %%g4
-                 xor %%g5, %%l1, %%g5
-                 ldd [%2 + 0x08], %%l0
-                 xor %%o0, %%l2, %%o0
-                 xor %%o1, %%l3, %%o1
-                 ldd [%2 + 0x10], %%l2
-                 xor %%o2, %%l4, %%o2
-                 xor %%o3, %%l5, %%o3
-                 ldd [%2 + 0x18], %%l4
-                 xor %%g2, %%o4, %%g2
-                 xor %%g3, %%o5, %%g3
-                 ldd [%3 + 0x00], %%o4
-                 xor %%g4, %%l0, %%g4
-                 xor %%g5, %%l1, %%g5
-                 ldd [%3 + 0x08], %%l0
-                 xor %%o0, %%l2, %%o0
-                 xor %%o1, %%l3, %%o1
-                 ldd [%3 + 0x10], %%l2
-                 xor %%o2, %%l4, %%o2
-                 xor %%o3, %%l5, %%o3
-                 ldd [%3 + 0x18], %%l4
-                 xor %%g2, %%o4, %%g2
-                 xor %%g3, %%o5, %%g3
-                 ldd [%4 + 0x00], %%o4
-                 xor %%g4, %%l0, %%g4
-                 xor %%g5, %%l1, %%g5
-                 ldd [%4 + 0x08], %%l0
-                 xor %%o0, %%l2, %%o0
-                 xor %%o1, %%l3, %%o1
-                 ldd [%4 + 0x10], %%l2
-                 xor %%o2, %%l4, %%o2
-                 xor %%o3, %%l5, %%o3
-                 ldd [%4 + 0x18], %%l4
-                 xor %%g2, %%o4, %%g2
-                 xor %%g3, %%o5, %%g3
-                 xor %%g4, %%l0, %%g4
-                 xor %%g5, %%l1, %%g5
-                 xor %%o0, %%l2, %%o0
-                 xor %%o1, %%l3, %%o1
-                 xor %%o2, %%l4, %%o2
-                 xor %%o3, %%l5, %%o3
-                 std %%g2, [%0 + 0x00]
-                 std %%g4, [%0 + 0x08]
-                 std %%o0, [%0 + 0x10]
-                 std %%o2, [%0 + 0x18]
-                 " : : "r" (destp), "r" (source1), "r" (source2), "r" (source3), "r" (source4)
-                 : "g2", "g3", "g4", "g5", "o0", "o1", "o2", "o3", "o4", "o5",
-                 "l0", "l1", "l2", "l3", "l4", "l5");
-                 destp += 8;
-                 source1 += 8;
-                 source2 += 8;
-                 source3 += 8;
-                 source4 += 8;
-               }
-               break;
-       }
-}
-#endif /* __sparc_v[78]__ */
-
-#ifndef __sparc_v9__
-
-/*
- * this one works reasonably on any x86 CPU
- * (send me an assembly version for inclusion if you can make it faster)
- *
- * this one is just as fast as written in pure assembly on x86.
- * the reason for this separate version is that the
- * fast open-coded xor routine "32reg" produces suboptimal code
- * on x86, due to lack of registers.
- */
-XORBLOCK_TEMPLATE(8regs)
-{
-       int len  = bh_ptr[0]->b_size;
-       long *destp   = (long *) bh_ptr[0]->b_data;
-       long *source1, *source2, *source3, *source4;
-       long lines = len / (sizeof (long)) / 8, i;
-
-       switch(count) {
-               case 2:
-                       source1 = (long *) bh_ptr[1]->b_data;
-                       for (i = lines; i > 0; i--) {
-                               *(destp + 0) ^= *(source1 + 0);
-                               *(destp + 1) ^= *(source1 + 1);
-                               *(destp + 2) ^= *(source1 + 2);
-                               *(destp + 3) ^= *(source1 + 3);
-                               *(destp + 4) ^= *(source1 + 4);
-                               *(destp + 5) ^= *(source1 + 5);
-                               *(destp + 6) ^= *(source1 + 6);
-                               *(destp + 7) ^= *(source1 + 7);
-                               source1 += 8;
-                               destp += 8;
-                       }
-                       break;
-               case 3:
-                       source2 = (long *) bh_ptr[2]->b_data;
-                       source1 = (long *) bh_ptr[1]->b_data;
-                       for (i = lines; i > 0; i--) {
-                               *(destp + 0) ^= *(source1 + 0);
-                               *(destp + 0) ^= *(source2 + 0);
-                               *(destp + 1) ^= *(source1 + 1);
-                               *(destp + 1) ^= *(source2 + 1);
-                               *(destp + 2) ^= *(source1 + 2);
-                               *(destp + 2) ^= *(source2 + 2);
-                               *(destp + 3) ^= *(source1 + 3);
-                               *(destp + 3) ^= *(source2 + 3);
-                               *(destp + 4) ^= *(source1 + 4);
-                               *(destp + 4) ^= *(source2 + 4);
-                               *(destp + 5) ^= *(source1 + 5);
-                               *(destp + 5) ^= *(source2 + 5);
-                               *(destp + 6) ^= *(source1 + 6);
-                               *(destp + 6) ^= *(source2 + 6);
-                               *(destp + 7) ^= *(source1 + 7);
-                               *(destp + 7) ^= *(source2 + 7);
-                               source1 += 8;
-                               source2 += 8;
-                               destp += 8;
-                       }
-                       break;
-               case 4:
-                       source3 = (long *) bh_ptr[3]->b_data;
-                       source2 = (long *) bh_ptr[2]->b_data;
-                       source1 = (long *) bh_ptr[1]->b_data;
-                       for (i = lines; i > 0; i--) {
-                               *(destp + 0) ^= *(source1 + 0);
-                               *(destp + 0) ^= *(source2 + 0);
-                               *(destp + 0) ^= *(source3 + 0);
-                               *(destp + 1) ^= *(source1 + 1);
-                               *(destp + 1) ^= *(source2 + 1);
-                               *(destp + 1) ^= *(source3 + 1);
-                               *(destp + 2) ^= *(source1 + 2);
-                               *(destp + 2) ^= *(source2 + 2);
-                               *(destp + 2) ^= *(source3 + 2);
-                               *(destp + 3) ^= *(source1 + 3);
-                               *(destp + 3) ^= *(source2 + 3);
-                               *(destp + 3) ^= *(source3 + 3);
-                               *(destp + 4) ^= *(source1 + 4);
-                               *(destp + 4) ^= *(source2 + 4);
-                               *(destp + 4) ^= *(source3 + 4);
-                               *(destp + 5) ^= *(source1 + 5);
-                               *(destp + 5) ^= *(source2 + 5);
-                               *(destp + 5) ^= *(source3 + 5);
-                               *(destp + 6) ^= *(source1 + 6);
-                               *(destp + 6) ^= *(source2 + 6);
-                               *(destp + 6) ^= *(source3 + 6);
-                               *(destp + 7) ^= *(source1 + 7);
-                               *(destp + 7) ^= *(source2 + 7);
-                               *(destp + 7) ^= *(source3 + 7);
-                               source1 += 8;
-                               source2 += 8;
-                               source3 += 8;
-                               destp += 8;
-                       }
-                       break;
-               case 5:
-                       source4 = (long *) bh_ptr[4]->b_data;
-                       source3 = (long *) bh_ptr[3]->b_data;
-                       source2 = (long *) bh_ptr[2]->b_data;
-                       source1 = (long *) bh_ptr[1]->b_data;
-                       for (i = lines; i > 0; i--) {
-                               *(destp + 0) ^= *(source1 + 0);
-                               *(destp + 0) ^= *(source2 + 0);
-                               *(destp + 0) ^= *(source3 + 0);
-                               *(destp + 0) ^= *(source4 + 0);
-                               *(destp + 1) ^= *(source1 + 1);
-                               *(destp + 1) ^= *(source2 + 1);
-                               *(destp + 1) ^= *(source3 + 1);
-                               *(destp + 1) ^= *(source4 + 1);
-                               *(destp + 2) ^= *(source1 + 2);
-                               *(destp + 2) ^= *(source2 + 2);
-                               *(destp + 2) ^= *(source3 + 2);
-                               *(destp + 2) ^= *(source4 + 2);
-                               *(destp + 3) ^= *(source1 + 3);
-                               *(destp + 3) ^= *(source2 + 3);
-                               *(destp + 3) ^= *(source3 + 3);
-                               *(destp + 3) ^= *(source4 + 3);
-                               *(destp + 4) ^= *(source1 + 4);
-                               *(destp + 4) ^= *(source2 + 4);
-                               *(destp + 4) ^= *(source3 + 4);
-                               *(destp + 4) ^= *(source4 + 4);
-                               *(destp + 5) ^= *(source1 + 5);
-                               *(destp + 5) ^= *(source2 + 5);
-                               *(destp + 5) ^= *(source3 + 5);
-                               *(destp + 5) ^= *(source4 + 5);
-                               *(destp + 6) ^= *(source1 + 6);
-                               *(destp + 6) ^= *(source2 + 6);
-                               *(destp + 6) ^= *(source3 + 6);
-                               *(destp + 6) ^= *(source4 + 6);
-                               *(destp + 7) ^= *(source1 + 7);
-                               *(destp + 7) ^= *(source2 + 7);
-                               *(destp + 7) ^= *(source3 + 7);
-                               *(destp + 7) ^= *(source4 + 7);
-                               source1 += 8;
-                               source2 += 8;
-                               source3 += 8;
-                               source4 += 8;
-                               destp += 8;
-                       }
-                       break;
-       }
-}
-
-/*
- * platform independent RAID5 checksum calculation, this should
- * be very fast on any platform that has a decent amount of
- * registers. (32 or more)
- */
-XORBLOCK_TEMPLATE(32regs)
-{
-       int size  = bh_ptr[0]->b_size;
-       int lines = size / (sizeof (long)) / 8, i;
-       long *destp   = (long *) bh_ptr[0]->b_data;
-       long *source1, *source2, *source3, *source4;
-       
-         /* LOTS of registers available...
-            We do explicite loop-unrolling here for code which
-            favours RISC machines.  In fact this is almoast direct
-            RISC assembly on Alpha and SPARC :-)  */
-
-
-       switch(count) {
-               case 2:
-                       source1 = (long *) bh_ptr[1]->b_data;
-                       for (i = lines; i > 0; i--) {
-                               register long d0, d1, d2, d3, d4, d5, d6, d7;
-                               d0 = destp[0];  /* Pull the stuff into registers        */
-                               d1 = destp[1];  /*  ... in bursts, if possible.         */
-                               d2 = destp[2];
-                               d3 = destp[3];
-                               d4 = destp[4];
-                               d5 = destp[5];
-                               d6 = destp[6];
-                               d7 = destp[7];
-                               d0 ^= source1[0];
-                               d1 ^= source1[1];
-                               d2 ^= source1[2];
-                               d3 ^= source1[3];
-                               d4 ^= source1[4];
-                               d5 ^= source1[5];
-                               d6 ^= source1[6];
-                               d7 ^= source1[7];
-                               destp[0] = d0;  /* Store the result (in burts)          */
-                               destp[1] = d1;
-                               destp[2] = d2;
-                               destp[3] = d3;
-                               destp[4] = d4;  /* Store the result (in burts)          */
-                               destp[5] = d5;
-                               destp[6] = d6;
-                               destp[7] = d7;
-                               source1 += 8;
-                               destp += 8;
-                       }
-                       break;
-               case 3:
-                       source2 = (long *) bh_ptr[2]->b_data;
-                       source1 = (long *) bh_ptr[1]->b_data;
-                       for (i = lines; i > 0; i--) {
-                               register long d0, d1, d2, d3, d4, d5, d6, d7;
-                               d0 = destp[0];  /* Pull the stuff into registers        */
-                               d1 = destp[1];  /*  ... in bursts, if possible.         */
-                               d2 = destp[2];
-                               d3 = destp[3];
-                               d4 = destp[4];
-                               d5 = destp[5];
-                               d6 = destp[6];
-                               d7 = destp[7];
-                               d0 ^= source1[0];
-                               d1 ^= source1[1];
-                               d2 ^= source1[2];
-                               d3 ^= source1[3];
-                               d4 ^= source1[4];
-                               d5 ^= source1[5];
-                               d6 ^= source1[6];
-                               d7 ^= source1[7];
-                               d0 ^= source2[0];
-                               d1 ^= source2[1];
-                               d2 ^= source2[2];
-                               d3 ^= source2[3];
-                               d4 ^= source2[4];
-                               d5 ^= source2[5];
-                               d6 ^= source2[6];
-                               d7 ^= source2[7];
-                               destp[0] = d0;  /* Store the result (in burts)          */
-                               destp[1] = d1;
-                               destp[2] = d2;
-                               destp[3] = d3;
-                               destp[4] = d4;  /* Store the result (in burts)          */
-                               destp[5] = d5;
-                               destp[6] = d6;
-                               destp[7] = d7;
-                               source1 += 8;
-                               source2 += 8;
-                               destp += 8;
-                       }
-                       break;
-               case 4:
-                       source3 = (long *) bh_ptr[3]->b_data;
-                       source2 = (long *) bh_ptr[2]->b_data;
-                       source1 = (long *) bh_ptr[1]->b_data;
-                       for (i = lines; i > 0; i--) {
-                               register long d0, d1, d2, d3, d4, d5, d6, d7;
-                               d0 = destp[0];  /* Pull the stuff into registers        */
-                               d1 = destp[1];  /*  ... in bursts, if possible.         */
-                               d2 = destp[2];
-                               d3 = destp[3];
-                               d4 = destp[4];
-                               d5 = destp[5];
-                               d6 = destp[6];
-                               d7 = destp[7];
-                               d0 ^= source1[0];
-                               d1 ^= source1[1];
-                               d2 ^= source1[2];
-                               d3 ^= source1[3];
-                               d4 ^= source1[4];
-                               d5 ^= source1[5];
-                               d6 ^= source1[6];
-                               d7 ^= source1[7];
-                               d0 ^= source2[0];
-                               d1 ^= source2[1];
-                               d2 ^= source2[2];
-                               d3 ^= source2[3];
-                               d4 ^= source2[4];
-                               d5 ^= source2[5];
-                               d6 ^= source2[6];
-                               d7 ^= source2[7];
-                               d0 ^= source3[0];
-                               d1 ^= source3[1];
-                               d2 ^= source3[2];
-                               d3 ^= source3[3];
-                               d4 ^= source3[4];
-                               d5 ^= source3[5];
-                               d6 ^= source3[6];
-                               d7 ^= source3[7];
-                               destp[0] = d0;  /* Store the result (in burts)          */
-                               destp[1] = d1;
-                               destp[2] = d2;
-                               destp[3] = d3;
-                               destp[4] = d4;  /* Store the result (in burts)          */
-                               destp[5] = d5;
-                               destp[6] = d6;
-                               destp[7] = d7;
-                               source1 += 8;
-                               source2 += 8;
-                               source3 += 8;
-                               destp += 8;
-                       }
-                       break;
-               case 5:
-                       source4 = (long *) bh_ptr[4]->b_data;
-                       source3 = (long *) bh_ptr[3]->b_data;
-                       source2 = (long *) bh_ptr[2]->b_data;
-                       source1 = (long *) bh_ptr[1]->b_data;
-                       for (i = lines; i > 0; i--) {
-                               register long d0, d1, d2, d3, d4, d5, d6, d7;
-                               d0 = destp[0];  /* Pull the stuff into registers        */
-                               d1 = destp[1];  /*  ... in bursts, if possible.         */
-                               d2 = destp[2];
-                               d3 = destp[3];
-                               d4 = destp[4];
-                               d5 = destp[5];
-                               d6 = destp[6];
-                               d7 = destp[7];
-                               d0 ^= source1[0];
-                               d1 ^= source1[1];
-                               d2 ^= source1[2];
-                               d3 ^= source1[3];
-                               d4 ^= source1[4];
-                               d5 ^= source1[5];
-                               d6 ^= source1[6];
-                               d7 ^= source1[7];
-                               d0 ^= source2[0];
-                               d1 ^= source2[1];
-                               d2 ^= source2[2];
-                               d3 ^= source2[3];
-                               d4 ^= source2[4];
-                               d5 ^= source2[5];
-                               d6 ^= source2[6];
-                               d7 ^= source2[7];
-                               d0 ^= source3[0];
-                               d1 ^= source3[1];
-                               d2 ^= source3[2];
-                               d3 ^= source3[3];
-                               d4 ^= source3[4];
-                               d5 ^= source3[5];
-                               d6 ^= source3[6];
-                               d7 ^= source3[7];
-                               d0 ^= source4[0];
-                               d1 ^= source4[1];
-                               d2 ^= source4[2];
-                               d3 ^= source4[3];
-                               d4 ^= source4[4];
-                               d5 ^= source4[5];
-                               d6 ^= source4[6];
-                               d7 ^= source4[7];
-                               destp[0] = d0;  /* Store the result (in burts)          */
-                               destp[1] = d1;
-                               destp[2] = d2;
-                               destp[3] = d3;
-                               destp[4] = d4;  /* Store the result (in burts)          */
-                               destp[5] = d5;
-                               destp[6] = d6;
-                               destp[7] = d7;
-                               source1 += 8;
-                               source2 += 8;
-                               source3 += 8;
-                               source4 += 8;
-                               destp += 8;
-                       }
-                       break;
-       }
-}
-
-/*
- * (the -6*32 shift factor colors the cache)
- */
-#define SIZE (PAGE_SIZE-6*32)
-
-static void xor_speed ( struct xor_block_template * func, 
-       struct buffer_head *b1, struct buffer_head *b2)
-{
-       int speed;
-       unsigned long now;
-       int i, count, max;
-       struct buffer_head *bh_ptr[6];
-
-       func->next = xor_functions;
-       xor_functions = func;
-       bh_ptr[0] = b1;
-       bh_ptr[1] = b2;
-
-       /*
-        * count the number of XORs done during a whole jiffy.
-        * calculate the speed of checksumming from this.
-        * (we use a 2-page allocation to have guaranteed
-        * color L1-cache layout)
-        */
-       max = 0;
-       for (i = 0; i < 5; i++) {
-               now = jiffies;
-               count = 0;
-               while (jiffies == now) {
-                       mb();
-                       func->xor_block(2,bh_ptr);
-                       mb();
-                       count++;
-                       mb();
-               }
-               if (count > max)
-                       max = count;
-       }
-
-       speed = max * (HZ*SIZE/1024);
-       func->speed = speed;
-
-       printk( "   %-10s: %5d.%03d MB/sec\n", func->name,
-               speed / 1000, speed % 1000);
-}
-
-static inline void pick_fastest_function(void)
-{
-       struct xor_block_template *f, *fastest;
-
-       fastest = xor_functions;
-       for (f = fastest; f; f = f->next) {
-               if (f->speed > fastest->speed)
-                       fastest = f;
-       }
-#ifdef CONFIG_X86_XMM 
-       if (boot_cpu_data.mmu_cr4_features & X86_CR4_OSXMMEXCPT) {
-               fastest = &t_xor_block_pIII_kni;
-       }
-#endif
-       xor_block = fastest->xor_block;
-       printk( "using fastest function: %s (%d.%03d MB/sec)\n", fastest->name,
-               fastest->speed / 1000, fastest->speed % 1000);
-}
-
-void calibrate_xor_block(void)
-{
-       struct buffer_head b1, b2;
-
-       memset(&b1,0,sizeof(b1));
-       b2 = b1;
-
-       b1.b_data = (char *) md__get_free_pages(GFP_KERNEL,2);
-       if (!b1.b_data) {
-               pick_fastest_function();
-               return;
-       }
-       b2.b_data = b1.b_data + 2*PAGE_SIZE + SIZE;
-
-       b1.b_size = SIZE;
-
-       printk(KERN_INFO "raid5: measuring checksumming speed\n");
-
-       sti(); /* should be safe */
-
-#if defined(__sparc__) && !defined(__sparc_v9__)
-       printk(KERN_INFO "raid5: trying high-speed SPARC checksum routine\n");
-       xor_speed(&t_xor_block_SPARC,&b1,&b2);
-#endif
-
-#ifdef CONFIG_X86_XMM 
-       if (boot_cpu_data.mmu_cr4_features & X86_CR4_OSXMMEXCPT) {
-               printk(KERN_INFO
-                       "raid5: KNI detected, trying cache-avoiding KNI checksum routine\n");
-               /* we force the use of the KNI xor block because it
-                       can write around l2.  we may also be able
-                       to load into the l1 only depending on how
-                       the cpu deals with a load to a line that is
-                       being prefetched.
-               */
-               xor_speed(&t_xor_block_pIII_kni,&b1,&b2);
-       }
-#endif /* CONFIG_X86_XMM */
-
-#ifdef __i386__
-
-       if (md_cpu_has_mmx()) {
-               printk(KERN_INFO
-                       "raid5: MMX detected, trying high-speed MMX checksum routines\n");
-               xor_speed(&t_xor_block_pII_mmx,&b1,&b2);
-               xor_speed(&t_xor_block_p5_mmx,&b1,&b2);
-       }
-
-#endif /* __i386__ */
-       
-       
-       xor_speed(&t_xor_block_8regs,&b1,&b2);
-       xor_speed(&t_xor_block_32regs,&b1,&b2);
-
-       free_pages((unsigned long)b1.b_data,2);
-       pick_fastest_function();
-}
-
-#else /* __sparc_v9__ */
-
-void calibrate_xor_block(void)
-{
-       printk(KERN_INFO "raid5: using high-speed VIS checksum routine\n");
-       xor_block = xor_block_VIS;
-}
-
-#endif /* __sparc_v9__ */
-
-MD_EXPORT_SYMBOL(xor_block);
-
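The deleted xor.c above picks among several RAID5 parity routines; the "32regs" variant unrolls the loop eight longs at a time so loads, XORs and stores happen in bursts that RISC compilers schedule well. A minimal user-space sketch of that unrolling for the two-buffer case (names are illustrative, not the kernel's):

```c
#include <assert.h>
#include <stddef.h>

/* XOR `words` longs of src into dst, unrolled eight at a time in the
 * style of the kernel's "32regs" routine.  `words` is assumed to be a
 * multiple of 8, as the kernel guarantees for its block sizes. */
static void xor_8unrolled(long *dst, const long *src, size_t words)
{
    size_t i;
    for (i = 0; i < words; i += 8) {
        /* pull into locals, xor, store back -- mirroring the
         * load/xor/store bursts of the original */
        long d0 = dst[i + 0] ^ src[i + 0];
        long d1 = dst[i + 1] ^ src[i + 1];
        long d2 = dst[i + 2] ^ src[i + 2];
        long d3 = dst[i + 3] ^ src[i + 3];
        long d4 = dst[i + 4] ^ src[i + 4];
        long d5 = dst[i + 5] ^ src[i + 5];
        long d6 = dst[i + 6] ^ src[i + 6];
        long d7 = dst[i + 7] ^ src[i + 7];
        dst[i + 0] = d0; dst[i + 1] = d1; dst[i + 2] = d2; dst[i + 3] = d3;
        dst[i + 4] = d4; dst[i + 5] = d5; dst[i + 6] = d6; dst[i + 7] = d7;
    }
}
```

Because XOR is its own inverse, applying the routine twice with the same source restores the destination, which is also what makes it usable for parity reconstruction.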
index f16b5b1d0c5e919d5099ffd5df4b3237d32e5861..f124836fd6b120fe95ff5b834c657f657e5c31f8 100644 (file)
@@ -1524,9 +1524,11 @@ sony535_init(void))
                printk(CDU535_MESSAGE_NAME ": my base address is not free!\n");
                return -EIO;
        }
+
        /* look for the CD-ROM, follows the procedure in the DOS driver */
        inb(select_unit_reg);
        /* wait for 40 18 Hz ticks (reverse-engineered from DOS driver) */
+       current->state = TASK_INTERRUPTIBLE;
        schedule_timeout((HZ+17)*40/18);
        inb(result_reg);
 
index 7d8fdb2429e9983e7b486e4990fac7672e5eb38a..25316c0148d0e1038b21d82875626aac18c618b3 100644 (file)
@@ -1471,10 +1471,10 @@ static int vgrab(struct bttv *btv, struct video_mmap *mp)
 /*      This doesn't work like this for NTSC anyway.
         So, better check the total image size ...
 */
-/*
-       if(mp->height>576 || mp->width>768+BURSTOFFSET)
+
+       if(mp->height>576 || mp->width>768+BURSTOFFSET || mp->height < 32 || mp->width <32)
                return -EINVAL;
-*/
+
        if (mp->format >= PALETTEFMT_MAX)
                return -EINVAL;
        if (mp->height*mp->width*fmtbppx2[palette2fmt[mp->format]&0x0f]/2
@@ -1977,7 +1977,8 @@ static int bttv_ioctl(struct video_device *dev, unsigned int cmd, void *arg)
                {
                        struct video_buffer v;
 #if LINUX_VERSION_CODE >= 0x020100
-                       if(!capable(CAP_SYS_ADMIN))
+                       if(!capable(CAP_SYS_ADMIN)
+                       || !capable(CAP_SYS_RAWIO))
 #else
                        if(!suser())
 #endif
@@ -1989,12 +1990,7 @@ static int bttv_ioctl(struct video_device *dev, unsigned int cmd, void *arg)
                                v.height > 16 && v.bytesperline > 16)
                                return -EINVAL;
                         if (v.base)
-                        {
-                                if ((unsigned long)v.base&1)
-                                        btv->win.vidadr=(unsigned long)(PAGE_OFFSET|uvirt_to_bus((unsigned long)v.base));
-                                else
-                                        btv->win.vidadr=(unsigned long)v.base;
-                        }
+                               btv->win.vidadr=(unsigned long)v.base;
                        btv->win.sheight=v.height;
                        btv->win.swidth=v.width;
                        btv->win.bpp=((v.depth+7)&0x38)/8;
@@ -2216,6 +2212,8 @@ static int bttv_ioctl(struct video_device *dev, unsigned int cmd, void *arg)
                         struct video_mmap vm;
                        if(copy_from_user((void *) &vm, (void *) arg, sizeof(vm)))
                                return -EFAULT;
+                       if (vm.frame < 0 || vm.frame >= MAX_GBUFFERS)
+                               return -EIO;
                         if (btv->frame_stat[vm.frame] == GBUFFER_GRABBING)
                                 return -EBUSY;
                        return vgrab(btv, &vm);
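The bttv hunk above adds validation of the user-supplied `vm.frame` before it indexes `frame_stat[]`; without it, a negative or oversized value from the ioctl reads (and later writes) outside the array. The pattern, sketched generically with a hypothetical limit constant:

```c
#include <assert.h>

#define MAX_GBUFFERS 2  /* illustrative; bttv defines its own limit */

/* Return 0 if a user-supplied frame index is safe to use, -1 otherwise.
 * Both ends must be checked: a signed index can be negative as well as
 * past the end of the array. */
static int frame_index_ok(int frame)
{
    if (frame < 0 || frame >= MAX_GBUFFERS)
        return -1;
    return 0;
}
```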
index 5352245a95fff971b8e4c7098549fc0f1101065a..0a361f21203861a141cdb2edf1437ffd6ae4bfed 100644 (file)
@@ -2654,7 +2654,8 @@ static int zoran_ioctl(struct video_device *dev, unsigned int cmd, void *arg)
                {
                        struct video_buffer v;
 
-                       if (!capable(CAP_SYS_ADMIN))
+                       if (!capable(CAP_SYS_ADMIN)
+                       || !capable(CAP_SYS_RAWIO))
                                return -EPERM;
 
                        if (copy_from_user(&v, arg, sizeof(v)))
index c26a448e85202cad16018324f7a6031719cf83fd..30f28280b7ada0bb2665c4e525d42676c1528615 100644 (file)
@@ -902,7 +902,7 @@ static void send_break (struct dz_serial *info, int duration)
   
   dz_out (info, DZ_TCR, tmp);
   
-  schedule_timeout(jiffies + duration);
+  schedule_timeout(duration);
   
   tmp &= ~mask;
   dz_out (info, DZ_TCR, tmp);
@@ -1093,7 +1093,7 @@ static void dz_close (struct tty_struct *tty, struct file *filp)
   if (info->blocked_open) {
     if (info->close_delay) {
       current->state = TASK_INTERRUPTIBLE;
-      schedule_timeout(jiffies + info->close_delay);
+      schedule_timeout(info->close_delay);
     }
     wake_up_interruptible (&info->open_wait);
   }
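Both dz hunks above fix the same class of bug: `schedule_timeout()` takes a *relative* delay in jiffies, so passing `jiffies + duration` sleeps roughly the machine's entire uptime too long. Callers that start from a wall-clock delay convert it to a relative jiffy count with round-up arithmetic; a sketch (the HZ value here is illustrative, not any particular platform's):

```c
#include <assert.h>

#define HZ 100  /* illustrative tick rate; real kernels vary */

/* Convert a delay in milliseconds to a relative jiffy count,
 * rounding up so the sleep is never shorter than requested. */
static long ms_to_jiffies(long ms)
{
    return (ms * HZ + 999) / 1000;
}
```

The resulting count is what gets passed directly to `schedule_timeout()`; nothing involving the current `jiffies` value belongs in the argument.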
index 15b4fcd96d9ceb69d143841380f20be7ee3876a0..d377683af9edda075e0275dafec12dfcd32d52dd 100644 (file)
@@ -94,7 +94,7 @@ static inline int copy_from_user(void *to,const void *from, int c)
 #ifndef TWO_THREE
 /* These are new in 2.3. The source now uses 2.3 syntax, and here is 
    the compatibility define... */
-#define waitq_head_t struct wait_queue *
+#define wait_queue_head_t struct wait_queue *
 #define DECLARE_MUTEX(name) struct semaphore name = MUTEX
 #define DECLARE_WAITQUEUE(wait, current) struct wait_queue wait = { current, NULL }
 
index 0f2d24e8bbe47870a24b2bd228428c9152b725fd..ede49febce596c1abcc7c13768dfb7275ef3cfee 100644 (file)
@@ -1533,7 +1533,8 @@ static int planb_ioctl(struct video_device *dev, unsigned int cmd, void *arg)
 
                        DEBUG("PlanB: IOCTL VIDIOCSFBUF\n");
 
-                        if (!capable(CAP_SYS_ADMIN))
+                        if (!capable(CAP_SYS_ADMIN)
+                       || !capable(CAP_SYS_RAWIO))
                                 return -EPERM;
                         if (copy_from_user(&v, arg,sizeof(v)))
                                 return -EFAULT;
index 35be7ed4e0ca315ae4b25a56eb729353f2148adb..69e84990512f0398fb0e5a6092360c098d6dbea2 100644 (file)
@@ -1352,7 +1352,7 @@ isdn_ppp_push_higher(isdn_net_dev * net_dev, isdn_net_local * lp, struct sk_buff
                        {
                                struct sk_buff *skb_old = skb;
                                int pkt_len;
-                               skb = dev_alloc_skb(skb_old->len + 40);
+                               skb = dev_alloc_skb(skb_old->len + 128);
 
                                if (!skb) {
                                        printk(KERN_WARNING "%s: Memory squeeze, dropping packet.\n", dev->name);
@@ -1361,7 +1361,7 @@ isdn_ppp_push_higher(isdn_net_dev * net_dev, isdn_net_local * lp, struct sk_buff
                                        return;
                                }
                                skb->dev = dev;
-                               skb_put(skb, skb_old->len + 40);
+                               skb_put(skb, skb_old->len + 128);
                                memcpy(skb->data, skb_old->data, skb_old->len);
                                skb->mac.raw = skb->data;
                                pkt_len = slhc_uncompress(ippp_table[net_dev->local->ppp_slot]->slcomp,
@@ -1413,16 +1413,22 @@ isdn_ppp_push_higher(isdn_net_dev * net_dev, isdn_net_local * lp, struct sk_buff
 static unsigned char *isdn_ppp_skb_push(struct sk_buff **skb_p,int len)
 {
        struct sk_buff *skb = *skb_p;
-
+       
        if(skb_headroom(skb) < len) {
-               printk(KERN_ERR "isdn_ppp_skb_push:under %d %d\n",skb_headroom(skb),len);
+               struct sk_buff *nskb = skb_realloc_headroom(skb, len);
+               
+               if (!nskb) {
+                       printk(KERN_INFO "isdn_ppp_skb_push: can't realloc headroom!\n");
+                       dev_kfree_skb(skb);
+                       return NULL;
+               }
                dev_kfree_skb(skb);
-               return NULL;
+               *skb_p = nskb;
+               return skb_push(nskb, len);
        }
        return skb_push(skb,len);
 }
-
-
+       
 /*
  * send ppp frame .. we expect a PIDCOMPressable proto --
  *  (here: currently always PPP_IP,PPP_VJC_COMP,PPP_VJC_UNCOMP)
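The isdn_ppp hunk above turns a hard failure into a recovery path via `skb_realloc_headroom()`: when a buffer lacks room in front of its payload, it is copied into a larger one instead of being dropped. A user-space sketch of growing headroom by copying (the struct and names are illustrative stand-ins for the sk_buff machinery):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy buffer: `data` points somewhere inside the allocation starting
 * at `head`; the gap between them is the headroom. */
struct buf {
    unsigned char *head, *data;
    size_t len;
};

/* Ensure at least `need` bytes of headroom, reallocating and copying
 * the payload when the current buffer is too tight.  Returns 0 on
 * success, -1 on allocation failure (the old buffer is untouched). */
static int ensure_headroom(struct buf *b, size_t need)
{
    size_t have = (size_t)(b->data - b->head);
    unsigned char *nhead;

    if (have >= need)
        return 0;
    nhead = malloc(need + b->len);
    if (!nhead)
        return -1;
    memcpy(nhead + need, b->data, b->len);  /* payload after new headroom */
    free(b->head);
    b->head = nhead;
    b->data = nhead + need;
    return 0;
}
```

The fix in the hunk follows the same shape: allocate a replacement, free the old buffer only after the copy succeeds, and hand the caller the new pointer.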
index 38e96cca2262d8b7df65872437f5d1c57b9e43ac..fc013ded501de3bf0728ea8ed67dcd50c03bf4af 100644 (file)
@@ -585,7 +585,7 @@ static struct device * sis900_probe1(   int pci_bus,
         tp = kmalloc(sizeof(*tp), GFP_KERNEL | GFP_DMA);
         if(tp==NULL)
         {
-               free_region(ioaddr, pci_tbl[chip_idx].io_size);
+               release_region(ioaddr, pci_tbl[chip_idx].io_size);
                return NULL;
         }
         memset(tp, 0, sizeof(*tp));
index b9f884a9e1d5543bd1e7a229dcdc8ecdb2b2f91c..e0e6aeb70cb9bf26864a47c875dacb371f42e86f 100644 (file)
@@ -1394,9 +1394,7 @@ int aha152x_reset(Scsi_Cmnd * SCpnt, unsigned int unused)
                if (ptr && !ptr->device->soft_reset) {
                        ptr->host_scribble = NULL;
                        ptr->result = DID_RESET << 16;
-                       spin_lock_irqsave(&io_request_lock, flags);
                        ptr->scsi_done(CURRENT_SC);
-                       spin_unlock_irqrestore(&io_request_lock, flags);
                        CURRENT_SC = NULL;
                }
                save_flags(flags);
@@ -1414,9 +1412,7 @@ int aha152x_reset(Scsi_Cmnd * SCpnt, unsigned int unused)
 
                                ptr->host_scribble = NULL;
                                ptr->result = DID_RESET << 16;
-                               spin_lock_irqsave(&io_request_lock, flags);
                                ptr->scsi_done(ptr);
-                               spin_unlock_irqrestore(&io_request_lock, flags);
 
                                ptr = next;
                        } else {
index e4664b06abdaf8164f2e7b88dd4e6c7ca1af5425..ba45fd594c3199fad511b65ddb3bcc41038dd96d 100644 (file)
  * of writing 0x00 to 0x7f (which should be done by reset): The ES1887 moves
  * into ES1888 mode. This means that it claims IRQ 11, which happens to be my
  * ISDN adapter. Needless to say it no longer worked. I now understand why
- * after rebooting 0x7f already was 0x05, the value of my choise: the BIOS
+ * after rebooting 0x7f already was 0x05, the value of my choice: the BIOS
  * did it.
  *
  * Oh, and this is another trap: in ES1887 docs mixer register 0x70 is decribed
@@ -1200,10 +1200,10 @@ FKS_test (devc);
 
        /* AAS: info stolen from ALSA: these boards have different clocks */
        switch(devc->submodel) {
-/* APPARENTLY NOT 1869 
+/* APPARENTLY NOT 1869 AND 1887
                case SUBMDL_ES1869:
-*/             
                case SUBMDL_ES1887:
+*/             
                case SUBMDL_ES1888:
                        devc->caps |= SB_CAP_ES18XX_RATE;
                        break;
index 4f21959d079f052932b14619f339cd659c6e8b9d..c1b57ec6e96923fc2f844630855e52fb1429fa35 100644 (file)
@@ -165,7 +165,7 @@ static int try_to_fill_dentry(struct dentry *dentry, struct super_block *sb, str
  * yet completely filled in, and revalidate has to delay such
  * lookups..
  */
-static int autofs_do_revalidate(struct dentry * dentry, int flags)
+static int autofs_revalidate(struct dentry * dentry, int flags)
 {
        struct inode * dir = dentry->d_parent->d_inode;
        struct autofs_sb_info *sbi = autofs_sbi(dir->i_sb);
@@ -200,15 +200,6 @@ static int autofs_do_revalidate(struct dentry * dentry, int flags)
        return 1;
 }
 
-static int autofs_revalidate(struct dentry * dentry, int flags)
-{
-       int r;
-       up(&dentry->d_parent->d_inode->i_sem);
-       r = autofs_do_revalidate(dentry, flags);
-       down(&dentry->d_parent->d_inode->i_sem);
-       return r;
-}
-
 static struct dentry_operations autofs_dentry_operations = {
        autofs_revalidate,      /* d_revalidate */
        NULL,                   /* d_hash */
@@ -246,7 +237,9 @@ static struct dentry *autofs_root_lookup(struct inode *dir, struct dentry *dentr
        dentry->d_flags |= DCACHE_AUTOFS_PENDING;
        d_add(dentry, NULL);
 
+       up(&dir->i_sem);
        autofs_revalidate(dentry, 0);
+       down(&dir->i_sem);
 
        /*
         * If we are still pending, check if we had to handle
index 401d9f34dfd1d40803ec1320e9bfaa58f42daaf5..13b3f534debca838640b5e9e05d94477ee4e266b 100644 (file)
@@ -14,8 +14,7 @@ extern int *blk_size[];
 extern int *blksize_size[];
 
 #define MAX_BUF_PER_PAGE (PAGE_SIZE / 512)
-#define NBUF 128
-#define READAHEAD_SECTORS      (128 * 4 * 2)
+#define NBUF 64
 
 ssize_t block_write(struct file * filp, const char * buf,
                    size_t count, loff_t *ppos)
@@ -153,13 +152,12 @@ ssize_t block_read(struct file * filp, char * buf, size_t count, loff_t *ppos)
        size_t blocks, rblocks, left;
        int bhrequest, uptodate;
        struct buffer_head ** bhb, ** bhe;
-       struct buffer_head ** buflist;
-       struct buffer_head ** bhreq;
+       struct buffer_head * buflist[NBUF];
+       struct buffer_head * bhreq[NBUF];
        unsigned int chars;
        loff_t size;
        kdev_t dev;
        ssize_t read;
-       int nbuf;
 
        dev = inode->i_rdev;
        blocksize = BLOCK_SIZE;
@@ -189,18 +187,6 @@ ssize_t block_read(struct file * filp, char * buf, size_t count, loff_t *ppos)
                left = count;
        if (left <= 0)
                return 0;
-
-       if ((buflist = (struct buffer_head **) __get_free_page(GFP_KERNEL)) == NULL)
-               return -ENOMEM;
-       if ((bhreq = (struct buffer_head **) __get_free_page(GFP_KERNEL)) == NULL) {
-               free_page((unsigned long) buflist);
-               return -ENOMEM;
-       }
-
-       nbuf = READAHEAD_SECTORS / (blocksize >> 9);
-       if (nbuf > PAGE_SIZE / sizeof(struct buffer_head *))
-               nbuf = PAGE_SIZE / sizeof(struct buffer_head *);
-               
        read = 0;
        block = offset >> blocksize_bits;
        offset &= blocksize-1;
@@ -208,12 +194,8 @@ ssize_t block_read(struct file * filp, char * buf, size_t count, loff_t *ppos)
        rblocks = blocks = (left + offset + blocksize - 1) >> blocksize_bits;
        bhb = bhe = buflist;
        if (filp->f_reada) {
-#if 0
                if (blocks < read_ahead[MAJOR(dev)] / (blocksize >> 9))
                        blocks = read_ahead[MAJOR(dev)] / (blocksize >> 9);
-#else
-               blocks += read_ahead[MAJOR(dev)] / (blocksize >> 9);
-#endif
                if (rblocks > blocks)
                        blocks = rblocks;
                
@@ -245,7 +227,7 @@ ssize_t block_read(struct file * filp, char * buf, size_t count, loff_t *ppos)
                                bhreq[bhrequest++] = *bhb;
                        }
 
-                       if (++bhb == &buflist[nbuf])
+                       if (++bhb == &buflist[NBUF])
                                bhb = buflist;
 
                        /* If the block we have on hand is uptodate, go ahead
@@ -266,7 +248,7 @@ ssize_t block_read(struct file * filp, char * buf, size_t count, loff_t *ppos)
                                wait_on_buffer(*bhe);
                                if (!buffer_uptodate(*bhe)) {   /* read error? */
                                        brelse(*bhe);
-                                       if (++bhe == &buflist[nbuf])
+                                       if (++bhe == &buflist[NBUF])
                                          bhe = buflist;
                                        left = 0;
                                        break;
@@ -288,7 +270,7 @@ ssize_t block_read(struct file * filp, char * buf, size_t count, loff_t *ppos)
                                        put_user(0,buf++);
                        }
                        offset = 0;
-                       if (++bhe == &buflist[nbuf])
+                       if (++bhe == &buflist[NBUF])
                                bhe = buflist;
                } while (left > 0 && bhe != bhb && (!*bhe || !buffer_locked(*bhe)));
                if (bhe == bhb && !blocks)
@@ -298,12 +280,9 @@ ssize_t block_read(struct file * filp, char * buf, size_t count, loff_t *ppos)
 /* Release the read-ahead blocks */
        while (bhe != bhb) {
                brelse(*bhe);
-               if (++bhe == &buflist[nbuf])
+               if (++bhe == &buflist[NBUF])
                        bhe = buflist;
        };
-
-       free_page((unsigned long) buflist);
-       free_page((unsigned long) bhreq);
        if (!read)
                return -EIO;
        filp->f_reada = 1;
index 5e08a9774f29b2178e6ea958218e4e2f88df47de..1f76051878cf59718a98c87d124237d4db5d0881 100644 (file)
@@ -600,7 +600,6 @@ unsigned int get_hardblocksize(kdev_t dev)
        return 0;
 }
 
-#if 0
 void set_blocksize(kdev_t dev, int size)
 {
        extern int *blksize_size[];
@@ -646,113 +645,10 @@ void set_blocksize(kdev_t dev, int size)
                                clear_bit(BH_Req, &bh->b_state);
                                bh->b_flushtime = 0;
                        }
-                       remove_from_queues(bh);
-                       bh->b_dev=B_FREE;
-                       insert_into_queues(bh);
-               }
-       }
-}
-
-#else
-void set_blocksize(kdev_t dev, int size)
-{
-       extern int *blksize_size[];
-       int i, nlist;
-       struct buffer_head * bh, *bhnext;
-
-       if (!blksize_size[MAJOR(dev)])
-               return;
-
-       /* Size must be a power of two, and between 512 and PAGE_SIZE */
-       if (size > PAGE_SIZE || size < 512 || (size & (size-1)))
-               panic("Invalid blocksize passed to set_blocksize");
-
-       if (blksize_size[MAJOR(dev)][MINOR(dev)] == 0 && size == BLOCK_SIZE) {
-               blksize_size[MAJOR(dev)][MINOR(dev)] = size;
-               return;
-       }
-       if (blksize_size[MAJOR(dev)][MINOR(dev)] == size)
-               return;
-       sync_buffers(dev, 2);
-       blksize_size[MAJOR(dev)][MINOR(dev)] = size;
-
-       /* We need to be quite careful how we do this - we are moving entries
-        * around on the free list, and we can get in a loop if we are not careful.
-        */
-       for(nlist = 0; nlist < NR_LIST; nlist++) {
-               bh = lru_list[nlist];
-               for (i = nr_buffers_type[nlist]*2 ; --i > 0 ; bh = bhnext) {
-                       if(!bh)
-                               break;
-
-                       bhnext = bh->b_next_free; 
-                       if (bh->b_dev != dev)
-                                continue;
-                       if (bh->b_size == size)
-                                continue;
-                       if (bhnext)
-                               bhnext->b_count++;
-                       wait_on_buffer(bh);
-                       if (bh->b_dev == dev && bh->b_size != size) {
-                               clear_bit(BH_Dirty, &bh->b_state);
-                               clear_bit(BH_Uptodate, &bh->b_state);
-                               clear_bit(BH_Req, &bh->b_state);
-                               bh->b_flushtime = 0;
-                       }
-
-                       /*
-                        * lets be mega-conservative about what to free:
-                        */
-                       if (!(bh->b_dev != dev) && 
-                               !(bh->b_size == size) &&
-                               !bh->b_count &&
-                               !buffer_protected(bh) &&
-                               !buffer_dirty(bh) &&
-                               !buffer_locked(bh) &&
-                               !waitqueue_active(&bh->b_wait)) {
-                                       remove_from_hash_queue(bh);
-                                       bh->b_dev = NODEV;
-                                       refile_buffer(bh);
-                                       try_to_free_buffers(buffer_page(bh));
-                       } else {
-                               remove_from_queues(bh);
-                               bh->b_dev=B_FREE;
-                               insert_into_queues(bh);
-                       }
-                       if (bhnext)
-                               bhnext->b_count--;
+                       remove_from_hash_queue(bh);
                }
        }
 }
-#endif
-
-/*
-* This function knows that we do a linear pass over the whole array,
-* so we can drop all unused buffers. Careful, bforget alone is
-* unsafe, we must be 100% sure that at the end of bforget() we will
-* really have no (new) users of this buffer.
-*
-* this logic improves overall system performance greatly during array
-* resync or reconstruction. Actually, the reconstruction is basically
-* seemless.
-*/
-void cache_drop_behind(struct buffer_head *bh)
-{
-       /*
-        * We are up to something dangerous ... rather be careful
-        */
-       if ((bh->b_count != 1) || buffer_protected(bh) ||
-                       buffer_dirty(bh) || buffer_locked(bh) ||
-                       !buffer_lowprio(bh) || waitqueue_active(&bh->b_wait)) {
-               brelse(bh);
-       } else {
-               bh->b_count--;
-               remove_from_hash_queue(bh);
-               bh->b_dev = NODEV;
-               refile_buffer(bh);
-               try_to_free_buffers(buffer_page(bh));
-       }
-}
 
 /*
  * We used to try various strange things. Let's not.
@@ -958,21 +854,22 @@ struct buffer_head * bread(kdev_t dev, int block, int size)
  * Ok, breada can be used as bread, but additionally to mark other
  * blocks for reading as well. End the argument list with a negative
  * number.
- *
- * __breada does the same but with block arguments. This is handy if a
- * device is bigger than 2G on a 32-bit architecture.
  */
 
 #define NBUF 16
 
-struct buffer_head * breada_blocks(kdev_t dev, int block,
-                        int bufsize, int blocks)
+struct buffer_head * breada(kdev_t dev, int block, int bufsize,
+       unsigned int pos, unsigned int filesize)
 {
        struct buffer_head * bhlist[NBUF];
+       unsigned int blocks;
        struct buffer_head * bh;
        int index;
        int i, j;
 
+       if (pos >= filesize)
+               return NULL;
+
        if (block < 0)
                return NULL;
 
@@ -981,14 +878,18 @@ struct buffer_head * breada_blocks(kdev_t dev, int block,
 
        if (buffer_uptodate(bh))
                return(bh);   
-       else
-               ll_rw_block(READ, 1, &bh);
+       else ll_rw_block(READ, 1, &bh);
+
+       blocks = (filesize - pos) >> (9+index);
 
        if (blocks < (read_ahead[MAJOR(dev)] >> index))
                blocks = read_ahead[MAJOR(dev)] >> index;
        if (blocks > NBUF) 
                blocks = NBUF;
 
+/*     if (blocks) printk("breada (new) %d blocks\n",blocks); */
+
+
        bhlist[0] = bh;
        j = 1;
        for(i=1; i<blocks; i++) {
@@ -1015,22 +916,6 @@ struct buffer_head * breada_blocks(kdev_t dev, int block,
        return NULL;
 }
 
-struct buffer_head * breada(kdev_t dev, int block, int bufsize,
-       unsigned int pos, unsigned int filesize)
-{
-       unsigned int blocks;
-       int index;
-
-       if (pos >= filesize)
-               return NULL;
-
-       index = BUFSIZE_INDEX(bufsize);
-
-       blocks = (filesize - pos) >> (9+index);
-
-       return (breada_blocks(dev,block,bufsize,blocks));
-}
-
 /*
  * Note: the caller should wake up the buffer_wait list if needed.
  */
index 987269fd7b72fe925ce381f56113ecdc92755f3f..705ce07e4bc844c19189019824dec44b76d45a98 100644 (file)
@@ -641,6 +641,8 @@ static struct dquot *dqduplicate(struct dquot *dquot)
                dquot->dq_count--;
                return NODQUOT;
        }
+       dquot->dq_referenced++;
+       dqstats.lookups++;
        return dquot;
 }
 
index 24b41b97d77a98fd1b68e1dd8314011fd814fef1..9bdbc37fc83f9d1f77682db8f3404d48a8313ec8 100644 (file)
@@ -305,12 +305,12 @@ static int parse_options(char *options,int *fat, int *blksize, int *debug,
                        else opts->quiet = 1;
                }
                else if (!strcmp(this_char,"blocksize")) {
-                       if (*value) ret = 0;
-                       else if (*blksize != 512  &&
-                                *blksize != 1024 &&
-                                *blksize != 2048) {
-                               printk ("MSDOS FS: Invalid blocksize "
-                                       "(512, 1024, or 2048)\n");
+                       if (!value || !*value) ret = 0;
+                       else {
+                               *blksize = simple_strtoul(value,&value,0);
+                               if (*value || (*blksize != 512 &&
+                                       *blksize != 1024 && *blksize != 2048))
+                                       ret = 0;
                        }
                }
                else if (!strcmp(this_char,"sys_immutable")) {
index 290676166018385308201b28a320279c4e4da25c..62a6643edc8f2f61586cd499a8dab2d8fe2df964 100644 (file)
@@ -280,8 +280,8 @@ sys_select(int n, fd_set *inp, fd_set *outp, fd_set *exp, struct timeval *tvp)
         
        if (n < 0)
                goto out_nofds;
-       if (n > current->files->max_fdset + 1)
-               n = current->files->max_fdset + 1;
+       if (n > current->files->max_fdset)
+               n = current->files->max_fdset;
                
        /*
         * We need 6 bitmaps (in/out/ex for both incoming and outgoing),
index 223ea7b3847b73d63feaf0deb3c2e828d2ed22ff..0e4df559cc452bb2ed2aab77b3283ba7823a3454 100644 (file)
@@ -94,7 +94,7 @@
 #define CIA_DMA_WIN_BASE               alpha_mv.dma_win_base
 #define CIA_DMA_WIN_SIZE               alpha_mv.dma_win_size
 #else
-#define CIA_DMA_WIN_BASE               CIA_DMA_WIN_SIZE_DEFAULT
+#define CIA_DMA_WIN_BASE               CIA_DMA_WIN_BASE_DEFAULT
 #define CIA_DMA_WIN_SIZE               CIA_DMA_WIN_SIZE_DEFAULT
 #endif
 
diff --git a/include/asm-alpha/md.h b/include/asm-alpha/md.h
new file mode 100644 (file)
index 0000000..6c9b822
--- /dev/null
@@ -0,0 +1,13 @@
+/* $Id: md.h,v 1.1 1997/12/15 15:11:48 jj Exp $
+ * md.h: High speed xor_block operation for RAID4/5 
+ *
+ */
+#ifndef __ASM_MD_H
+#define __ASM_MD_H
+
+/* #define HAVE_ARCH_XORBLOCK */
+
+#define MD_XORBLOCK_ALIGNMENT  sizeof(long)
+
+#endif /* __ASM_MD_H */
diff --git a/include/asm-i386/md.h b/include/asm-i386/md.h
new file mode 100644 (file)
index 0000000..0a2c5dd
--- /dev/null
@@ -0,0 +1,13 @@
+/* $Id: md.h,v 1.1 1997/12/15 15:11:57 jj Exp $
+ * md.h: High speed xor_block operation for RAID4/5 
+ *
+ */
+#ifndef __ASM_MD_H
+#define __ASM_MD_H
+
+/* #define HAVE_ARCH_XORBLOCK */
+
+#define MD_XORBLOCK_ALIGNMENT  sizeof(long)
+
+#endif /* __ASM_MD_H */
diff --git a/include/asm-m68k/md.h b/include/asm-m68k/md.h
new file mode 100644 (file)
index 0000000..1d15aae
--- /dev/null
@@ -0,0 +1,13 @@
+/* $Id: md.h,v 1.1 1997/12/15 15:12:04 jj Exp $
+ * md.h: High speed xor_block operation for RAID4/5 
+ *
+ */
+#ifndef __ASM_MD_H
+#define __ASM_MD_H
+
+/* #define HAVE_ARCH_XORBLOCK */
+
+#define MD_XORBLOCK_ALIGNMENT  sizeof(long)
+
+#endif /* __ASM_MD_H */
diff --git a/include/asm-ppc/md.h b/include/asm-ppc/md.h
new file mode 100644 (file)
index 0000000..0ff3e7e
--- /dev/null
@@ -0,0 +1,13 @@
+/* $Id: md.h,v 1.1 1997/12/15 15:12:15 jj Exp $
+ * md.h: High speed xor_block operation for RAID4/5 
+ *
+ */
+#ifndef __ASM_MD_H
+#define __ASM_MD_H
+
+/* #define HAVE_ARCH_XORBLOCK */
+
+#define MD_XORBLOCK_ALIGNMENT  sizeof(long)
+
+#endif /* __ASM_MD_H */
diff --git a/include/asm-sparc/md.h b/include/asm-sparc/md.h
new file mode 100644 (file)
index 0000000..e0d0e85
--- /dev/null
@@ -0,0 +1,13 @@
+/* $Id: md.h,v 1.1 1997/12/15 15:12:39 jj Exp $
+ * md.h: High speed xor_block operation for RAID4/5 
+ *
+ */
+#ifndef __ASM_MD_H
+#define __ASM_MD_H
+
+/* #define HAVE_ARCH_XORBLOCK */
+
+#define MD_XORBLOCK_ALIGNMENT  sizeof(long)
+
+#endif /* __ASM_MD_H */
diff --git a/include/asm-sparc64/md.h b/include/asm-sparc64/md.h
new file mode 100644 (file)
index 0000000..0387993
--- /dev/null
@@ -0,0 +1,91 @@
+/* $Id: md.h,v 1.2 1997/12/27 16:28:38 jj Exp $
+ * md.h: High speed xor_block operation for RAID4/5 
+ *            utilizing the UltraSparc Visual Instruction Set.
+ *
+ * Copyright (C) 1997 Jakub Jelinek (jj@sunsite.mff.cuni.cz)
+ */
+#ifndef __ASM_MD_H
+#define __ASM_MD_H
+
+#include <asm/head.h>
+#include <asm/asi.h>
+
+#define HAVE_ARCH_XORBLOCK
+
+#define MD_XORBLOCK_ALIGNMENT  64
+
+/*     void __xor_block (char *dest, char *src, long len)
+ *     {
+ *             while (len--) *dest++ ^= *src++;
+ *     }
+ *
+ *     Requirements:
+ *     !(((long)dest | (long)src) & (MD_XORBLOCK_ALIGNMENT - 1)) &&
+ *     !(len & 127) && len >= 256
+ */
+
+static inline void __xor_block (char *dest, char *src, long len)
+{
+       __asm__ __volatile__ ("
+       wr      %%g0, %3, %%fprs
+       wr      %%g0, %4, %%asi
+       membar  #LoadStore|#StoreLoad|#StoreStore
+       sub     %2, 128, %2
+       ldda    [%0] %4, %%f0
+       ldda    [%1] %4, %%f16
+1:     ldda    [%0 + 64] %%asi, %%f32
+       fxor    %%f0, %%f16, %%f16
+       fxor    %%f2, %%f18, %%f18
+       fxor    %%f4, %%f20, %%f20
+       fxor    %%f6, %%f22, %%f22
+       fxor    %%f8, %%f24, %%f24
+       fxor    %%f10, %%f26, %%f26
+       fxor    %%f12, %%f28, %%f28
+       fxor    %%f14, %%f30, %%f30
+       stda    %%f16, [%0] %4
+       ldda    [%1 + 64] %%asi, %%f48
+       ldda    [%0 + 128] %%asi, %%f0
+       fxor    %%f32, %%f48, %%f48
+       fxor    %%f34, %%f50, %%f50
+       add     %0, 128, %0
+       fxor    %%f36, %%f52, %%f52
+       add     %1, 128, %1
+       fxor    %%f38, %%f54, %%f54
+       subcc   %2, 128, %2
+       fxor    %%f40, %%f56, %%f56
+       fxor    %%f42, %%f58, %%f58
+       fxor    %%f44, %%f60, %%f60
+       fxor    %%f46, %%f62, %%f62
+       stda    %%f48, [%0 - 64] %%asi
+       bne,pt  %%xcc, 1b
+        ldda   [%1] %4, %%f16
+       ldda    [%0 + 64] %%asi, %%f32
+       fxor    %%f0, %%f16, %%f16
+       fxor    %%f2, %%f18, %%f18
+       fxor    %%f4, %%f20, %%f20
+       fxor    %%f6, %%f22, %%f22
+       fxor    %%f8, %%f24, %%f24
+       fxor    %%f10, %%f26, %%f26
+       fxor    %%f12, %%f28, %%f28
+       fxor    %%f14, %%f30, %%f30
+       stda    %%f16, [%0] %4
+       ldda    [%1 + 64] %%asi, %%f48
+       membar  #Sync
+       fxor    %%f32, %%f48, %%f48
+       fxor    %%f34, %%f50, %%f50
+       fxor    %%f36, %%f52, %%f52
+       fxor    %%f38, %%f54, %%f54
+       fxor    %%f40, %%f56, %%f56
+       fxor    %%f42, %%f58, %%f58
+       fxor    %%f44, %%f60, %%f60
+       fxor    %%f46, %%f62, %%f62
+       stda    %%f48, [%0 + 64] %%asi
+       membar  #Sync|#StoreStore|#StoreLoad
+       wr      %%g0, 0, %%fprs
+       " : :
+       "r" (dest), "r" (src), "r" (len), "i" (FPRS_FEF), "i" (ASI_BLK_P) :
+       "cc", "memory");
+}
+
+#endif /* __ASM_MD_H */
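The VIS assembly in the new sparc64 header implements exactly the semantics its own comment spells out. As a plain-C reference (this is just the loop from that comment, not the generic kernel fallback itself; the function name here drops the leading underscores):

```c
#include <assert.h>

/* Reference semantics of __xor_block() per the sparc64 header comment:
 * XOR 'len' bytes of src into dest. The assembly version additionally
 * requires 64-byte alignment, len a multiple of 128, and len >= 256;
 * this C loop has no such restrictions. */
static void xor_block(char *dest, char *src, long len)
{
	while (len--)
		*dest++ ^= *src++;
}
```

For RAID4/5 parity this is the hot path: XORing a buffer into itself yields zero, and XORing the parity block with all-but-one data block reconstructs the missing one.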
index 808f69323db998e63def32c7992676971046dcd8..87a9092b219fc31f584629787442a75c54130f5f 100644 (file)
@@ -62,9 +62,8 @@ extern void unplug_device(void * data);
 extern void make_request(int major,int rw, struct buffer_head * bh);
 
 /* md needs this function to remap requests */
-extern int md_map (kdev_t dev, kdev_t *rdev,
-                                unsigned long *rsector, unsigned long size);
-extern int md_make_request (struct buffer_head * bh, int rw);
+extern int md_map (int minor, kdev_t *rdev, unsigned long *rsector, unsigned long size);
+extern int md_make_request (int minor, int rw, struct buffer_head * bh);
 extern int md_error (kdev_t mddev, kdev_t rdev);
 
 extern int * blk_size[MAX_BLKDEV];
index 34f9fe4943e3a7e4251beb5bc9de88f6beedc98e..ba89313830234c9b8a39d90ccfd17151e0b4e3cc 100644 (file)
@@ -1,6 +1,6 @@
 /*
  *     IP_MASQ user space control interface
- *     $Id: ip_masq.h,v 1.2.2.1 1999/08/13 18:23:03 davem Exp $
+ *     $Id: ip_masq.h,v 1.2 1998/12/08 05:41:48 davem Exp $
  */
 
 #ifndef _LINUX_IP_MASQ_H
@@ -103,26 +103,6 @@ struct ip_mfw_user {
 
 #define IP_MASQ_MFW_SCHED      0x01
 
-/* 
- *     VS & schedulers stuff 
- */
-struct ip_vs_user {
-       /* create the virtual service and attach the scheduler to it */
-       u_int16_t       protocol;
-       u_int32_t       vaddr;          /* virtual address */
-       u_int16_t       vport;
-       /* ... timeouts and other stuff */
-
-       /* scheduler specific options */
-       u_int32_t       daddr;          /* real destination address */
-       u_int16_t       dport;
-       unsigned        masq_flags;
-       unsigned        sched_flags;
-       unsigned        weight;
-       char            data[0];        /* optional scheduler parameters */
-};
-
-
 #define IP_FW_MASQCTL_MAX 256
 #define IP_MASQ_TNAME_MAX  32
 
@@ -135,7 +115,6 @@ struct ip_masq_ctl {
                struct ip_autofw_user autofw_user;
                struct ip_mfw_user mfw_user;
                struct ip_masq_user user;
-               struct ip_vs_user vs_user;
                unsigned char m_raw[IP_FW_MASQCTL_MAX];
        } u;
 };
@@ -145,10 +124,7 @@ struct ip_masq_ctl {
 #define IP_MASQ_TARGET_CORE    1
 #define IP_MASQ_TARGET_MOD     2       /* masq_mod is selected by "name" */
 #define IP_MASQ_TARGET_USER    3       
-#define IP_MASQ_TARGET_VS      4       /* sched_mod is selected by "name" */
-/*  #define IP_MASQ_TARGET_VS_SCHED 5 */
-#define IP_MASQ_TARGET_LAST    5
-
+#define IP_MASQ_TARGET_LAST    4
 
 #define IP_MASQ_CMD_NONE       0       /* just peek */
 #define IP_MASQ_CMD_INSERT     1
@@ -160,9 +136,5 @@ struct ip_masq_ctl {
 #define IP_MASQ_CMD_LIST       7       /* actually fake: done via /proc */
 #define IP_MASQ_CMD_ENABLE     8
 #define IP_MASQ_CMD_DISABLE    9
-#define IP_MASQ_CMD_ADD_DEST   10      /* for adding dest in IPVS */
-#define IP_MASQ_CMD_DEL_DEST   11      /* for deleting dest in IPVS */
-#define IP_MASQ_CMD_SET_DEST   12      /* for setting dest in IPVS */
 
 #endif /* _LINUX_IP_MASQ_H */
-
diff --git a/include/linux/md.h b/include/linux/md.h
new file mode 100644 (file)
index 0000000..f4f4f54
--- /dev/null
@@ -0,0 +1,300 @@
+/*
+   md.h : Multiple Devices driver for Linux
+          Copyright (C) 1994-96 Marc ZYNGIER
+         <zyngier@ufr-info-p7.ibp.fr> or
+         <maz@gloups.fdn.fr>
+         
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 2, or (at your option)
+   any later version.
+   
+   You should have received a copy of the GNU General Public License
+   (for example /usr/src/linux/COPYING); if not, write to the Free
+   Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.  
+*/
+
+#ifndef _MD_H
+#define _MD_H
+
+#include <linux/major.h>
+#include <linux/ioctl.h>
+#include <linux/types.h>
+
+/*
+ * Different major versions are not compatible.
+ * Different minor versions are only downward compatible.
+ * Different patchlevel versions are downward and upward compatible.
+ */
+#define MD_MAJOR_VERSION               0
+#define MD_MINOR_VERSION               36
+#define MD_PATCHLEVEL_VERSION          6
+
+#define MD_DEFAULT_DISK_READAHEAD      (256 * 1024)
+
+/* ioctls */
+#define REGISTER_DEV           _IO (MD_MAJOR, 1)
+#define START_MD               _IO (MD_MAJOR, 2)
+#define STOP_MD                _IO (MD_MAJOR, 3)
+#define REGISTER_DEV_NEW       _IO (MD_MAJOR, 4)
+
+/*
+   personalities :
+   Byte 0 : Chunk size factor
+   Byte 1 : Fault tolerance count for each physical device
+            (   0 means no fault tolerance,
+             0xFF means always tolerate faults), not used by now.
+   Byte 2 : Personality
+   Byte 3 : Reserved.
+ */
+
+#define FAULT_SHIFT       8
+#define PERSONALITY_SHIFT 16
+
+#define FACTOR_MASK       0x000000FFUL
+#define FAULT_MASK        0x0000FF00UL
+#define PERSONALITY_MASK  0x00FF0000UL
+
+#define MD_RESERVED       0    /* Not used by now */
+#define LINEAR            (1UL << PERSONALITY_SHIFT)
+#define STRIPED           (2UL << PERSONALITY_SHIFT)
+#define RAID0             STRIPED
+#define RAID1             (3UL << PERSONALITY_SHIFT)
+#define RAID5             (4UL << PERSONALITY_SHIFT)
+#define MAX_PERSONALITY   5
+
+/*
+ * MD superblock.
+ *
+ * The MD superblock maintains some statistics on each MD configuration.
+ * Each real device in the MD set contains it near the end of the device.
+ * Some of the ideas are copied from the ext2fs implementation.
+ *
+ * We currently use 4096 bytes as follows:
+ *
+ *     word offset     function
+ *
+ *        0  -    31   Constant generic MD device information.
+ *        32  -    63   Generic state information.
+ *       64  -   127   Personality specific information.
+ *      128  -   511   12 32-words descriptors of the disks in the raid set.
+ *      512  -   911   Reserved.
+ *      912  -  1023   Disk specific descriptor.
+ */
+
+/*
+ * If x is the real device size in bytes, we return an apparent size of:
+ *
+ *     y = (x & ~(MD_RESERVED_BYTES - 1)) - MD_RESERVED_BYTES
+ *
+ * and place the 4kB superblock at offset y.
+ */
+#define MD_RESERVED_BYTES              (64 * 1024)
+#define MD_RESERVED_SECTORS            (MD_RESERVED_BYTES / 512)
+#define MD_RESERVED_BLOCKS             (MD_RESERVED_BYTES / BLOCK_SIZE)
+
+#define MD_NEW_SIZE_SECTORS(x)         ((x & ~(MD_RESERVED_SECTORS - 1)) - MD_RESERVED_SECTORS)
+#define MD_NEW_SIZE_BLOCKS(x)          ((x & ~(MD_RESERVED_BLOCKS - 1)) - MD_RESERVED_BLOCKS)
+
+#define MD_SB_BYTES                    4096
+#define MD_SB_WORDS                    (MD_SB_BYTES / 4)
+#define MD_SB_BLOCKS                   (MD_SB_BYTES / BLOCK_SIZE)
+#define MD_SB_SECTORS                  (MD_SB_BYTES / 512)
+
+/*
+ * The following are counted in 32-bit words
+ */
+#define        MD_SB_GENERIC_OFFSET            0
+#define MD_SB_PERSONALITY_OFFSET       64
+#define MD_SB_DISKS_OFFSET             128
+#define MD_SB_DESCRIPTOR_OFFSET                992
+
+#define MD_SB_GENERIC_CONSTANT_WORDS   32
+#define MD_SB_GENERIC_STATE_WORDS      32
+#define MD_SB_GENERIC_WORDS            (MD_SB_GENERIC_CONSTANT_WORDS + MD_SB_GENERIC_STATE_WORDS)
+#define MD_SB_PERSONALITY_WORDS                64
+#define MD_SB_DISKS_WORDS              384
+#define MD_SB_DESCRIPTOR_WORDS         32
+#define MD_SB_RESERVED_WORDS           (1024 - MD_SB_GENERIC_WORDS - MD_SB_PERSONALITY_WORDS - MD_SB_DISKS_WORDS - MD_SB_DESCRIPTOR_WORDS)
+#define MD_SB_EQUAL_WORDS              (MD_SB_GENERIC_WORDS + MD_SB_PERSONALITY_WORDS + MD_SB_DISKS_WORDS)
+#define MD_SB_DISKS                    (MD_SB_DISKS_WORDS / MD_SB_DESCRIPTOR_WORDS)
+
+/*
+ * Device "operational" state bits
+ */
+#define MD_FAULTY_DEVICE               0       /* Device is faulty / operational */
+#define MD_ACTIVE_DEVICE               1       /* Device is a part or the raid set / spare disk */
+#define MD_SYNC_DEVICE                 2       /* Device is in sync with the raid set */
+
+typedef struct md_device_descriptor_s {
+       __u32 number;           /* 0 Device number in the entire set */
+       __u32 major;            /* 1 Device major number */
+       __u32 minor;            /* 2 Device minor number */
+       __u32 raid_disk;        /* 3 The role of the device in the raid set */
+       __u32 state;            /* 4 Operational state */
+       __u32 reserved[MD_SB_DESCRIPTOR_WORDS - 5];
+} md_descriptor_t;
+
+#define MD_SB_MAGIC            0xa92b4efc
+
+/*
+ * Superblock state bits
+ */
+#define MD_SB_CLEAN            0
+#define MD_SB_ERRORS           1
+
+typedef struct md_superblock_s {
+
+       /*
+        * Constant generic information
+        */
+       __u32 md_magic;         /*  0 MD identifier */
+       __u32 major_version;    /*  1 major version to which the set conforms */
+       __u32 minor_version;    /*  2 minor version to which the set conforms */
+       __u32 patch_version;    /*  3 patchlevel version to which the set conforms */
+       __u32 gvalid_words;     /*  4 Number of non-reserved words in this section */
+       __u32 set_magic;        /*  5 Raid set identifier */
+       __u32 ctime;            /*  6 Creation time */
+       __u32 level;            /*  7 Raid personality (mirroring, raid5, ...) */
+       __u32 size;             /*  8 Apparent size of each individual disk, in kB */
+       __u32 nr_disks;         /*  9 Number of total disks in the raid set */
+       __u32 raid_disks;       /* 10 Number of disks in a fully functional raid set */
+       __u32 gstate_creserved[MD_SB_GENERIC_CONSTANT_WORDS - 11];
+
+       /*
+        * Generic state information
+        */
+       __u32 utime;            /*  0 Superblock update time */
+       __u32 state;            /*  1 State bits (clean, ...) */
+       __u32 active_disks;     /*  2 Number of currently active disks (some non-faulty disks might not be in sync) */
+       __u32 working_disks;    /*  3 Number of working disks */
+       __u32 failed_disks;     /*  4 Number of failed disks */
+       __u32 spare_disks;      /*  5 Number of spare disks */
+       __u32 gstate_sreserved[MD_SB_GENERIC_STATE_WORDS - 6];
+
+       /*
+        * Personality information
+        */
+       __u32 parity_algorithm;
+       __u32 chunk_size;
+       __u32 pstate_reserved[MD_SB_PERSONALITY_WORDS - 2];
+
+       /*
+        * Disks information
+        */
+       md_descriptor_t disks[MD_SB_DISKS];
+
+       /*
+        * Reserved
+        */
+       __u32 reserved[MD_SB_RESERVED_WORDS];
+
+       /*
+        * Active descriptor
+        */
+       md_descriptor_t descriptor;
+} md_superblock_t;
+
+#ifdef __KERNEL__
+
+#include <linux/mm.h>
+#include <linux/fs.h>
+#include <linux/blkdev.h>
+#include <asm/semaphore.h>
+
+/*
+ * Kernel-based reconstruction is mostly working, but still requires
+ * some additional work.
+ */
+#define SUPPORT_RECONSTRUCTION 0
+
+#define MAX_REAL     8         /* Max number of physical dev per md dev */
+#define MAX_MD_DEV   4         /* Max number of md dev */
+
+#define FACTOR(a)         ((a)->repartition & FACTOR_MASK)
+#define MAX_FAULT(a)      (((a)->repartition & FAULT_MASK)>>8)
+#define PERSONALITY(a)    ((a)->repartition & PERSONALITY_MASK)
+
+#define FACTOR_SHIFT(a) (PAGE_SHIFT + (a) - 10)
+
+struct real_dev
+{
+  kdev_t dev;                  /* Device number */
+  int size;                    /* Device size (in blocks) */
+  int offset;                  /* Real device offset (in blocks) in md dev
+                                  (only used in linear mode) */
+  struct inode *inode;         /* Lock inode */
+  md_superblock_t *sb;
+  u32 sb_offset;
+};
+
+struct md_dev;
+
+#define SPARE_INACTIVE 0
+#define SPARE_WRITE    1
+#define SPARE_ACTIVE   2
+
+struct md_personality
+{
+  char *name;
+  int (*map)(struct md_dev *mddev, kdev_t *rdev,
+                     unsigned long *rsector, unsigned long size);
+  int (*make_request)(struct md_dev *mddev, int rw, struct buffer_head * bh);
+  void (*end_request)(struct buffer_head * bh, int uptodate);
+  int (*run)(int minor, struct md_dev *mddev);
+  int (*stop)(int minor, struct md_dev *mddev);
+  int (*status)(char *page, int minor, struct md_dev *mddev);
+  int (*ioctl)(struct inode *inode, struct file *file,
+              unsigned int cmd, unsigned long arg);
+  int max_invalid_dev;
+  int (*error_handler)(struct md_dev *mddev, kdev_t dev);
+
+/*
+ * Some personalities (RAID-1, RAID-5) can get disks hot-added and
+ * hot-removed. Hot removal is different from failure. (failure marks
+ * a disk inactive, but the disk is still part of the array)
+ */
+  int (*hot_add_disk) (struct md_dev *mddev, kdev_t dev);
+  int (*hot_remove_disk) (struct md_dev *mddev, kdev_t dev);
+  int (*mark_spare) (struct md_dev *mddev, md_descriptor_t *descriptor, int state);
+};
+
+struct md_dev
+{
+  struct real_dev      devices[MAX_REAL];
+  struct md_personality        *pers;
+  md_superblock_t      *sb;
+  int                  sb_dirty;
+  int                  repartition;
+  int                  busy;
+  int                  nb_dev;
+  void                 *private;
+};
+
+struct md_thread {
+       void                    (*run) (void *data);
+       void                    *data;
+       struct wait_queue       *wqueue;
+       unsigned long           flags;
+       struct semaphore        *sem;
+       struct task_struct      *tsk;
+};
+
+#define THREAD_WAKEUP  0
+
+extern struct md_dev md_dev[MAX_MD_DEV];
+extern int md_size[MAX_MD_DEV];
+extern int md_maxreadahead[MAX_MD_DEV];
+
+extern char *partition_name (kdev_t dev);
+
+extern int register_md_personality (int p_num, struct md_personality *p);
+extern int unregister_md_personality (int p_num);
+extern struct md_thread *md_register_thread (void (*run) (void *data), void *data);
+extern void md_unregister_thread (struct md_thread *thread);
+extern void md_wakeup_thread(struct md_thread *thread);
+extern int md_update_sb (int minor);
+extern int md_do_sync(struct md_dev *mddev);
+
+#endif /* __KERNEL__ */
+#endif /* _MD_H */
diff --git a/include/linux/raid/hsm.h b/include/linux/raid/hsm.h
deleted file mode 100644 (file)
index 0438d27..0000000
+++ /dev/null
@@ -1,65 +0,0 @@
-#ifndef _LVM_H
-#define _LVM_H
-
-#include <linux/raid/md.h>
-
-#if __alpha__
-#error fix cpu_addr on Alpha first
-#endif
-
-#include <linux/raid/hsm_p.h>
-
-#define index_pv(lv,index) ((lv)->vg->pv_array+(index)->data.phys_nr)
-#define index_dev(lv,index) index_pv((lv),(index))->dev
-#define index_block(lv,index) (index)->data.phys_block
-#define index_child(index) ((lv_lptr_t *)((index)->cpu_addr))
-
-#define ptr_to_cpuaddr(ptr) ((__u32) (ptr))
-
-
-typedef struct pv_bg_desc_s {
-       unsigned int            free_blocks;
-       pv_block_group_t        *bg;
-} pv_bg_desc_t;
-
-typedef struct pv_s pv_t;
-typedef struct vg_s vg_t;
-typedef struct lv_s lv_t;
-
-struct pv_s
-{
-       int                     phys_nr;
-       kdev_t                  dev;
-       pv_sb_t                 *pv_sb;
-       pv_bg_desc_t            *bg_array;
-};
-
-struct lv_s
-{
-       int             log_id;
-       vg_t            *vg;
-
-       unsigned int    max_indices;
-       unsigned int    free_indices;
-       lv_lptr_t       root_index;
-
-       kdev_t          dev;
-};
-
-struct vg_s
-{
-       int             nr_pv;
-       pv_t            pv_array [MD_SB_DISKS];
-
-       int             nr_lv;
-       lv_t            lv_array [LVM_MAX_LVS_PER_VG];
-
-       vg_sb_t         *vg_sb;
-       mddev_t         *mddev;
-};
-
-#define kdev_to_lv(dev) ((lv_t *) mddev_map[MINOR(dev)].data)
-#define mddev_to_vg(mddev) ((vg_t *) mddev->private)
-
-#endif
-
diff --git a/include/linux/raid/hsm_p.h b/include/linux/raid/hsm_p.h
deleted file mode 100644 (file)
index 02674b3..0000000
+++ /dev/null
@@ -1,237 +0,0 @@
-#ifndef _LVM_P_H
-#define _LVM_P_H
-
-#define LVM_BLOCKSIZE 4096
-#define LVM_BLOCKSIZE_WORDS (LVM_BLOCKSIZE/4)
-#define PACKED __attribute__ ((packed))
-
-/*
- * Identifies a block in physical space
- */
-typedef struct phys_idx_s {
-       __u16 phys_nr;
-       __u32 phys_block;
-
-} PACKED phys_idx_t;
-
-/*
- * Identifies a block in logical space
- */
-typedef struct log_idx_s {
-       __u16 log_id;
-       __u32 log_index;
-
-} PACKED log_idx_t;
-
-/*
- * Describes one PV
- */
-#define LVM_PV_SB_MAGIC          0xf091ae9fU
-
-#define LVM_PV_SB_GENERIC_WORDS 32
-#define LVM_PV_SB_RESERVED_WORDS \
-               (LVM_BLOCKSIZE_WORDS - LVM_PV_SB_GENERIC_WORDS)
-
-/*
- * On-disk PV identification data, on block 0 in any PV.
- */
-typedef struct pv_sb_s
-{
-       __u32 pv_magic;         /*  0                                       */
-
-       __u32 pv_uuid0;         /*  1                                       */
-       __u32 pv_uuid1;         /*  2                                       */
-       __u32 pv_uuid2;         /*  3                                       */
-       __u32 pv_uuid3;         /*  4                                       */
-
-       __u32 pv_major;         /*  5                                       */
-       __u32 pv_minor;         /*  6                                       */
-       __u32 pv_patch;         /*  7                                       */
-
-       __u32 pv_ctime;         /*  8 Creation time                         */
-
-       __u32 pv_total_size;    /*  9 size of this PV, in blocks            */
-       __u32 pv_first_free;    /*  10 first free block                     */
-       __u32 pv_first_used;    /*  11 first used block                     */
-       __u32 pv_blocks_left;   /*  12 unallocated blocks                   */
-       __u32 pv_bg_size;       /*  13 size of a block group, in blocks     */
-       __u32 pv_block_size;    /*  14 size of blocks, in bytes             */
-       __u32 pv_pptr_size;     /*  15 size of block descriptor, in bytes   */
-       __u32 pv_block_groups;  /*  16 number of block groups               */
-
-       __u32 __reserved1[LVM_PV_SB_GENERIC_WORDS - 17];
-
-       /*
-        * Reserved
-        */
-       __u32 __reserved2[LVM_PV_SB_RESERVED_WORDS];
-
-} PACKED pv_sb_t;
-
-/*
- * this is pretty much arbitrary, but has to be less than ~64
- */
-#define LVM_MAX_LVS_PER_VG 32
-
-#define LVM_VG_SB_GENERIC_WORDS 32
-
-#define LV_DESCRIPTOR_WORDS 8
-#define LVM_VG_SB_RESERVED_WORDS (LVM_BLOCKSIZE_WORDS - \
-       LV_DESCRIPTOR_WORDS*LVM_MAX_LVS_PER_VG - LVM_VG_SB_GENERIC_WORDS)
-
-#if (LVM_PV_SB_RESERVED_WORDS < 0)
-#error you messed this one up dude ...
-#endif
-
-typedef struct lv_descriptor_s
-{
-       __u32 lv_id;            /*  0                                       */
-       phys_idx_t lv_root_idx; /*  1                                       */
-       __u16 __reserved;       /*  2                                       */
-       __u32 lv_max_indices;   /*  3                                       */
-       __u32 lv_free_indices;  /*  4                                       */
-       __u32 md_id;            /*  5                                       */
-
-       __u32 reserved[LV_DESCRIPTOR_WORDS - 6];
-
-} PACKED lv_descriptor_t;
-
-#define LVM_VG_SB_MAGIC          0x98320d7aU
-/*
- * On-disk VG identification data, in block 1 on all PVs
- */
-typedef struct vg_sb_s
-{
-       __u32 vg_magic;         /*  0                                       */
-       __u32 nr_lvs;           /*  1                                       */
-
-       __u32 __reserved1[LVM_VG_SB_GENERIC_WORDS - 2];
-
-       lv_descriptor_t lv_array [LVM_MAX_LVS_PER_VG];
-       /*
-        * Reserved
-        */
-       __u32 __reserved2[LVM_VG_SB_RESERVED_WORDS];
-
-} PACKED vg_sb_t;
-
-/*
- * Describes one LV
- */
-
-#define LVM_LV_SB_MAGIC          0xe182bd8aU
-
-/* do we need lv_sb_t? */
-
-typedef struct lv_sb_s
-{
-       /*
-        * On-disk LV identifier
-        */
-       __u32 lv_magic;         /*  0 LV identifier                         */
-       __u32 lv_uuid0;         /*  1                                       */
-       __u32 lv_uuid1;         /*  2                                       */
-       __u32 lv_uuid2;         /*  3                                       */
-       __u32 lv_uuid3;         /*  4                                       */
-
-       __u32 lv_major;         /*  5 PV identifier                         */
-       __u32 lv_minor;         /*  6 PV identifier                         */
-       __u32 lv_patch;         /*  7 PV identifier                         */
-
-       __u32 ctime;            /*  8 Creation time                         */
-       __u32 size;             /*  9 size of this LV, in blocks            */
-       phys_idx_t start;       /*  10 position of root index block         */
-       log_idx_t first_free;   /*  11-12 first free index                  */
-
-       /*
-        * Reserved
-        */
-       __u32 reserved[LVM_BLOCKSIZE_WORDS-13];
-
-} PACKED lv_sb_t;
-
-/*
- * Pointer pointing from the physical space, points to
- * the LV owning this block. It also contains various
- * statistics about the physical block.
- */
-typedef struct pv_pptr_s
-{
-       union {
-       /* case 1 */
-               struct {
-                       log_idx_t owner;
-                       log_idx_t predicted;
-                       __u32 last_referenced;
-               } used;
-       /* case 2 */
-               struct {
-                       __u16 log_id;
-                       __u16 __unused1;
-                       __u32 next_free;
-                       __u32 __unused2;
-                       __u32 __unused3;
-               } free;
-       } u;
-} PACKED pv_pptr_t;
-
-static __inline__ int pv_pptr_free (const pv_pptr_t * pptr)
-{
-       return !pptr->u.free.log_id;
-}
-
-
-#define DATA_BLOCKS_PER_BG ((LVM_BLOCKSIZE*8)/(8*sizeof(pv_pptr_t)+1))
-
-#define TOTAL_BLOCKS_PER_BG (DATA_BLOCKS_PER_BG+1)
-/*
- * A table of pointers filling up a single block, managing
- * the next DATA_BLOCKS_PER_BG physical blocks. Such block
- * groups form the physical space of blocks.
- */
-typedef struct pv_block_group_s
-{
-       __u8 used_bitmap[(DATA_BLOCKS_PER_BG+7)/8];
-
-       pv_pptr_t blocks[DATA_BLOCKS_PER_BG];
-
-} PACKED pv_block_group_t;
-
-/*
- * Pointer from the logical space, points to
- * the (PV,block) containing this logical block
- */
-typedef struct lv_lptr_s
-{
-       phys_idx_t data;
-       __u16 __reserved;
-       __u32 cpu_addr;
-       __u32 __reserved2;
-
-} PACKED lv_lptr_t;
-
-static __inline__ int index_free (const lv_lptr_t * index)
-{
-       return !index->data.phys_block;
-}
-
-static __inline__ int index_present (const lv_lptr_t * index)
-{
-       return index->cpu_addr;
-}
-
-
-#define LVM_LPTRS_PER_BLOCK (LVM_BLOCKSIZE/sizeof(lv_lptr_t))
-/*
- * A table of pointers filling up a single block, managing
- * LVM_LPTRS_PER_BLOCK logical blocks. Such block groups form
- * the logical space of blocks.
- */
-typedef struct lv_index_block_s
-{
-       lv_lptr_t blocks[LVM_LPTRS_PER_BLOCK];
-
-} PACKED lv_index_block_t;
-
-#endif
-
diff --git a/include/linux/raid/linear.h b/include/linux/raid/linear.h
deleted file mode 100644 (file)
index 55cfab7..0000000
+++ /dev/null
@@ -1,32 +0,0 @@
-#ifndef _LINEAR_H
-#define _LINEAR_H
-
-#include <linux/raid/md.h>
-
-struct dev_info {
-       kdev_t          dev;
-       int             size;
-       unsigned int    offset;
-};
-
-typedef struct dev_info dev_info_t;
-
-struct linear_hash
-{
-       dev_info_t *dev0, *dev1;
-};
-
-struct linear_private_data
-{
-       struct linear_hash      *hash_table;
-       dev_info_t              disks[MD_SB_DISKS];
-       dev_info_t              *smallest;
-       int                     nr_zones;
-};
-
-
-typedef struct linear_private_data linear_conf_t;
-
-#define mddev_to_conf(mddev) ((linear_conf_t *) mddev->private)
-
-#endif
diff --git a/include/linux/raid/md.h b/include/linux/raid/md.h
deleted file mode 100644 (file)
index 1059949..0000000
+++ /dev/null
@@ -1,95 +0,0 @@
-/*
-   md.h : Multiple Devices driver for Linux
-          Copyright (C) 1996-98 Ingo Molnar, Gadi Oxman
-          Copyright (C) 1994-96 Marc ZYNGIER
-         <zyngier@ufr-info-p7.ibp.fr> or
-         <maz@gloups.fdn.fr>
-         
-   This program is free software; you can redistribute it and/or modify
-   it under the terms of the GNU General Public License as published by
-   the Free Software Foundation; either version 2, or (at your option)
-   any later version.
-   
-   You should have received a copy of the GNU General Public License
-   (for example /usr/src/linux/COPYING); if not, write to the Free
-   Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.  
-*/
-
-#ifndef _MD_H
-#define _MD_H
-
-#include <linux/config.h>
-#include <linux/mm.h>
-#include <linux/fs.h>
-#include <linux/blkdev.h>
-#include <asm/semaphore.h>
-#include <linux/major.h>
-#include <linux/ioctl.h>
-#include <linux/types.h>
-#include <asm/bitops.h>
-#include <linux/module.h>
-#include <linux/hdreg.h>
-#include <linux/sysctl.h>
-#include <linux/proc_fs.h>
-#include <linux/smp_lock.h>
-#include <linux/delay.h>
-#include <net/checksum.h>
-#include <linux/random.h>
-#include <linux/locks.h>
-#include <asm/io.h>
-
-#include <linux/raid/md_compatible.h>
-/*
- * 'md_p.h' holds the 'physical' layout of RAID devices
- * 'md_u.h' holds the user <=> kernel API
- *
- * 'md_k.h' holds kernel internal definitions
- */
-
-#include <linux/raid/md_p.h>
-#include <linux/raid/md_u.h>
-#include <linux/raid/md_k.h>
-
-/*
- * Different major versions are not compatible.
- * Different minor versions are only downward compatible.
- * Different patchlevel versions are downward and upward compatible.
- */
-#define MD_MAJOR_VERSION                0
-#define MD_MINOR_VERSION                90
-#define MD_PATCHLEVEL_VERSION           0
-
-extern int md_size[MAX_MD_DEVS];
-extern struct hd_struct md_hd_struct[MAX_MD_DEVS];
-
-extern void add_mddev_mapping (mddev_t *mddev, kdev_t dev, void *data);
-extern void del_mddev_mapping (mddev_t *mddev, kdev_t dev);
-extern char * partition_name (kdev_t dev);
-extern int register_md_personality (int p_num, mdk_personality_t *p);
-extern int unregister_md_personality (int p_num);
-extern mdk_thread_t * md_register_thread (void (*run) (void *data),
-                               void *data, const char *name);
-extern void md_unregister_thread (mdk_thread_t *thread);
-extern void md_wakeup_thread(mdk_thread_t *thread);
-extern void md_interrupt_thread (mdk_thread_t *thread);
-extern int md_update_sb (mddev_t *mddev);
-extern int md_do_sync(mddev_t *mddev, mdp_disk_t *spare);
-extern void md_recover_arrays (void);
-extern int md_check_ordering (mddev_t *mddev);
-extern void autodetect_raid(void);
-extern struct gendisk * find_gendisk (kdev_t dev);
-extern int md_notify_reboot(struct notifier_block *this,
-                                       unsigned long code, void *x);
-#if CONFIG_BLK_DEV_MD
-extern void raid_setup(char *str,int *ints) md__init;
-#endif
-#ifdef CONFIG_MD_BOOT
-extern void md_setup(char *str,int *ints) md__init;
-#endif
-
-extern void md_print_devices (void);
-
-#define MD_BUG(x...) { printk("md: bug in file %s, line %d\n", __FILE__, __LINE__); md_print_devices(); }
-
-#endif _MD_H
-
diff --git a/include/linux/raid/md_compatible.h b/include/linux/raid/md_compatible.h
deleted file mode 100644 (file)
index d4119a0..0000000
+++ /dev/null
@@ -1,387 +0,0 @@
-
-/*
-   md.h : Multiple Devices driver compatibility layer for Linux 2.0/2.2
-          Copyright (C) 1998 Ingo Molnar
-         
-   This program is free software; you can redistribute it and/or modify
-   it under the terms of the GNU General Public License as published by
-   the Free Software Foundation; either version 2, or (at your option)
-   any later version.
-   
-   You should have received a copy of the GNU General Public License
-   (for example /usr/src/linux/COPYING); if not, write to the Free
-   Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.  
-*/
-
-#include <linux/version.h>
-
-#ifndef _MD_COMPATIBLE_H
-#define _MD_COMPATIBLE_H
-
-#define LinuxVersionCode(v, p, s) (((v)<<16)+((p)<<8)+(s))
-
-#if LINUX_VERSION_CODE < LinuxVersionCode(2,1,0)
-
-/* 000 */
-#define md__get_free_pages(x,y) __get_free_pages(x,y,GFP_KERNEL)
-
-#ifdef __i386__
-/* 001 */
-extern __inline__ int md_cpu_has_mmx(void)
-{
-       return x86_capability & 0x00800000;
-}
-#endif
-
-/* 002 */
-#define md_clear_page(page)        memset((void *)(page), 0, PAGE_SIZE)
-
-/* 003 */
-/*
- * someone please suggest a sane compatibility layer for modules
- */
-#define MD_EXPORT_SYMBOL(x)
-
-/* 004 */
-static inline unsigned long
-md_copy_from_user(void *to, const void *from, unsigned long n)
-{
-       int err;
-
-       err = verify_area(VERIFY_READ,from,n);
-       if (!err)
-               memcpy_fromfs(to, from, n);
-       return err; 
-}
-
-/* 005 */
-extern inline unsigned long
-md_copy_to_user(void *to, const void *from, unsigned long n)
-{
-       int err;
-
-       err = verify_area(VERIFY_WRITE,to,n);
-       if (!err)
-               memcpy_tofs(to, from, n);
-       return err; 
-}
-
-/* 006 */
-#define md_put_user(x,ptr)                                             \
-({                                                                     \
-       int __err;                                                      \
-                                                                       \
-       __err = verify_area(VERIFY_WRITE,ptr,sizeof(*ptr));             \
-       if (!__err)                                                     \
-               put_user(x,ptr);                                        \
-       __err;                                                          \
-})
-
-/* 007 */
-extern inline int md_capable_admin(void)
-{
-       return suser();
-}
-/* 008 */
-#define MD_FILE_TO_INODE(file) ((file)->f_inode)
-
-/* 009 */
-extern inline void md_flush_signals (void)
-{
-       current->signal = 0;
-}
-/* 010 */
-#define __S(nr) (1<<((nr)-1))
-extern inline void md_init_signals (void)
-{
-        current->exit_signal = SIGCHLD;
-        current->blocked = ~(__S(SIGKILL));
-}
-#undef __S
-
-/* 011 */
-extern inline unsigned long md_signal_pending (struct task_struct * tsk)
-{
-       return (tsk->signal & ~tsk->blocked);
-}
-
-/* 012 */
-#define md_set_global_readahead(x) read_ahead[MD_MAJOR] = MD_READAHEAD
-
-/* 013 */
-#define md_mdelay(n) (\
-       {unsigned long msec=(n); while (msec--) udelay(1000);})
-
-/* 014 */
-#define MD_SYS_DOWN 0
-#define MD_SYS_HALT 0
-#define MD_SYS_POWER_OFF 0
-
-/* 015 */
-#define md_register_reboot_notifier(x)
-
-/* 016 */
-extern __inline__ unsigned long
-md_test_and_set_bit(int nr, void * addr)
-{
-       unsigned long flags;
-       unsigned long oldbit;
-
-       save_flags(flags);
-       cli();
-       oldbit = test_bit(nr,addr);
-       set_bit(nr,addr);
-       restore_flags(flags);
-       return oldbit;
-}
-
-/* 017 */
-extern __inline__ unsigned long
-md_test_and_clear_bit(int nr, void * addr)
-{
-       unsigned long flags;
-       unsigned long oldbit;
-
-       save_flags(flags);
-       cli();
-       oldbit = test_bit(nr,addr);
-       clear_bit(nr,addr);
-       restore_flags(flags);
-       return oldbit;
-}
-
-/* 018 */
-#define md_atomic_read(x) (*(volatile int *)(x))
-#define md_atomic_set(x,y) (*(volatile int *)(x) = (y))
-
-/* 019 */
-extern __inline__ void md_lock_kernel (void)
-{
-#if __SMP__
-       lock_kernel();
-       syscall_count++;
-#endif
-}
-
-extern __inline__ void md_unlock_kernel (void)
-{
-#if __SMP__
-       syscall_count--;
-       unlock_kernel();
-#endif
-}
-/* 020 */
-
-#define md__init
-#define md__initdata
-#define md__initfunc(__arginit) __arginit
-
-/* 021 */
-
-/* 022 */
-
-struct md_list_head {
-       struct md_list_head *next, *prev;
-};
-
-#define MD_LIST_HEAD(name) \
-       struct md_list_head name = { &name, &name }
-
-#define MD_INIT_LIST_HEAD(ptr) do { \
-       (ptr)->next = (ptr); (ptr)->prev = (ptr); \
-} while (0)
-
-static __inline__ void md__list_add(struct md_list_head * new,
-       struct md_list_head * prev,
-       struct md_list_head * next)
-{
-       next->prev = new;
-       new->next = next;
-       new->prev = prev;
-       prev->next = new;
-}
-
-static __inline__ void md_list_add(struct md_list_head *new,
-                                               struct md_list_head *head)
-{
-       md__list_add(new, head, head->next);
-}
-
-static __inline__ void md__list_del(struct md_list_head * prev,
-                                       struct md_list_head * next)
-{
-       next->prev = prev;
-       prev->next = next;
-}
-
-static __inline__ void md_list_del(struct md_list_head *entry)
-{
-       md__list_del(entry->prev, entry->next);
-}
-
-static __inline__ int md_list_empty(struct md_list_head *head)
-{
-       return head->next == head;
-}
-
-#define md_list_entry(ptr, type, member) \
-       ((type *)((char *)(ptr)-(unsigned long)(&((type *)0)->member)))
-
-/* 023 */
-
-static __inline__ signed long md_schedule_timeout(signed long timeout)
-{
-       current->timeout = jiffies + timeout;
-       schedule();
-       return 0;
-}
-
-/* 024 */
-#define md_need_resched(tsk) (need_resched)
-
-/* 025 */
-typedef struct { int gcc_is_buggy; } md_spinlock_t;
-#define MD_SPIN_LOCK_UNLOCKED (md_spinlock_t) { 0 }
-
-#define md_spin_lock_irq cli
-#define md_spin_unlock_irq sti
-#define md_spin_unlock_irqrestore(x,flags) restore_flags(flags)
-#define md_spin_lock_irqsave(x,flags) do { save_flags(flags); cli(); } while (0)
-
-/* END */
-
-#else
-
-#include <linux/reboot.h>
-#include <linux/vmalloc.h>
-
-/* 000 */
-#define md__get_free_pages(x,y) __get_free_pages(x,y)
-
-#ifdef __i386__
-/* 001 */
-extern __inline__ int md_cpu_has_mmx(void)
-{
-       return boot_cpu_data.x86_capability & X86_FEATURE_MMX;
-}
-#endif
-
-/* 002 */
-#define md_clear_page(page)        clear_page(page)
-
-/* 003 */
-#define MD_EXPORT_SYMBOL(x) EXPORT_SYMBOL(x)
-
-/* 004 */
-#define md_copy_to_user(x,y,z) copy_to_user(x,y,z)
-
-/* 005 */
-#define md_copy_from_user(x,y,z) copy_from_user(x,y,z)
-
-/* 006 */
-#define md_put_user put_user
-
-/* 007 */
-extern inline int md_capable_admin(void)
-{
-       return capable(CAP_SYS_ADMIN);
-}
-
-/* 008 */
-#define MD_FILE_TO_INODE(file) ((file)->f_dentry->d_inode)
-
-/* 009 */
-extern inline void md_flush_signals (void)
-{
-       spin_lock(&current->sigmask_lock);
-       flush_signals(current);
-       spin_unlock(&current->sigmask_lock);
-}
-/* 010 */
-extern inline void md_init_signals (void)
-{
-        current->exit_signal = SIGCHLD;
-        siginitsetinv(&current->blocked, sigmask(SIGKILL));
-}
-
-/* 011 */
-#define md_signal_pending signal_pending
-
-/* 012 */
-extern inline void md_set_global_readahead(int * table)
-{
-       max_readahead[MD_MAJOR] = table;
-}
-
-/* 013 */
-#define md_mdelay(x) mdelay(x)
-
-/* 014 */
-#define MD_SYS_DOWN SYS_DOWN
-#define MD_SYS_HALT SYS_HALT
-#define MD_SYS_POWER_OFF SYS_POWER_OFF
-
-/* 015 */
-#define md_register_reboot_notifier register_reboot_notifier
-
-/* 016 */
-#define md_test_and_set_bit test_and_set_bit
-
-/* 017 */
-#define md_test_and_clear_bit test_and_clear_bit
-
-/* 018 */
-#define md_atomic_read atomic_read
-#define md_atomic_set atomic_set
-
-/* 019 */
-#define md_lock_kernel lock_kernel
-#define md_unlock_kernel unlock_kernel
-
-/* 020 */
-
-#include <linux/init.h>
-
-#define md__init __init
-#define md__initdata __initdata
-#define md__initfunc(__arginit) __initfunc(__arginit)
-
-/* 021 */
-
-
-/* 022 */
-
-#define md_list_head list_head
-#define MD_LIST_HEAD(name) LIST_HEAD(name)
-#define MD_INIT_LIST_HEAD(ptr) INIT_LIST_HEAD(ptr)
-#define md_list_add list_add
-#define md_list_del list_del
-#define md_list_empty list_empty
-
-#define md_list_entry(ptr, type, member) list_entry(ptr, type, member)
-
-/* 023 */
-
-#define md_schedule_timeout schedule_timeout
-
-/* 024 */
-#define md_need_resched(tsk) ((tsk)->need_resched)
-
-/* 025 */
-#define md_spinlock_t spinlock_t
-#define MD_SPIN_LOCK_UNLOCKED SPIN_LOCK_UNLOCKED
-
-#define md_spin_lock_irq spin_lock_irq
-#define md_spin_unlock_irq spin_unlock_irq
-#define md_spin_unlock_irqrestore spin_unlock_irqrestore
-#define md_spin_lock_irqsave spin_lock_irqsave
-
-/* END */
-
-#endif
-
-#endif _MD_COMPATIBLE_H
-
diff --git a/include/linux/raid/md_k.h b/include/linux/raid/md_k.h
deleted file mode 100644 (file)
index c98b4ef..0000000
+++ /dev/null
@@ -1,338 +0,0 @@
-/*
-   md_k.h : kernel internal structure of the Linux MD driver
-          Copyright (C) 1996-98 Ingo Molnar, Gadi Oxman
-         
-   This program is free software; you can redistribute it and/or modify
-   it under the terms of the GNU General Public License as published by
-   the Free Software Foundation; either version 2, or (at your option)
-   any later version.
-   
-   You should have received a copy of the GNU General Public License
-   (for example /usr/src/linux/COPYING); if not, write to the Free
-   Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.  
-*/
-
-#ifndef _MD_K_H
-#define _MD_K_H
-
-#define MD_RESERVED       0UL
-#define LINEAR            1UL
-#define STRIPED           2UL
-#define RAID0             STRIPED
-#define RAID1             3UL
-#define RAID5             4UL
-#define TRANSLUCENT       5UL
-#define LVM               6UL
-#define MAX_PERSONALITY   7UL
-
-extern inline int pers_to_level (int pers)
-{
-       switch (pers) {
-               case LVM:               return -3;
-               case TRANSLUCENT:       return -2;
-               case LINEAR:            return -1;
-               case RAID0:             return 0;
-               case RAID1:             return 1;
-               case RAID5:             return 5;
-       }
-       panic("pers_to_level()");
-}
-
-extern inline int level_to_pers (int level)
-{
-       switch (level) {
-               case -3: return LVM;
-               case -2: return TRANSLUCENT;
-               case -1: return LINEAR;
-               case 0: return RAID0;
-               case 1: return RAID1;
-               case 4:
-               case 5: return RAID5;
-       }
-       return MD_RESERVED;
-}
-
-typedef struct mddev_s mddev_t;
-typedef struct mdk_rdev_s mdk_rdev_t;
-
-#if (MINORBITS != 8)
-#error MD doesnt handle bigger kdev yet
-#endif
-
-#define MAX_REAL     12                        /* Max number of disks per md dev */
-#define MAX_MD_DEVS  (1<<MINORBITS)    /* Max number of md dev */
-
-/*
- * Maps a kdev to an mddev/subdev. How 'data' is handled is up to
- * the personality. (eg. LVM uses this to identify individual LVs)
- */
-typedef struct dev_mapping_s {
-       mddev_t *mddev;
-       void *data;
-} dev_mapping_t;
-
-extern dev_mapping_t mddev_map [MAX_MD_DEVS];
-
-extern inline mddev_t * kdev_to_mddev (kdev_t dev)
-{
-        return mddev_map[MINOR(dev)].mddev;
-}
-
-/*
- * options passed in raidrun:
- */
-
-#define MAX_CHUNK_SIZE (4096*1024)
-
-/*
- * default readahead
- */
-#define MD_READAHEAD   (256 * 512)
-
-extern inline int disk_faulty(mdp_disk_t * d)
-{
-       return d->state & (1 << MD_DISK_FAULTY);
-}
-
-extern inline int disk_active(mdp_disk_t * d)
-{
-       return d->state & (1 << MD_DISK_ACTIVE);
-}
-
-extern inline int disk_sync(mdp_disk_t * d)
-{
-       return d->state & (1 << MD_DISK_SYNC);
-}
-
-extern inline int disk_spare(mdp_disk_t * d)
-{
-       return !disk_sync(d) && !disk_active(d) && !disk_faulty(d);
-}
-
-extern inline int disk_removed(mdp_disk_t * d)
-{
-       return d->state & (1 << MD_DISK_REMOVED);
-}
-
-extern inline void mark_disk_faulty(mdp_disk_t * d)
-{
-       d->state |= (1 << MD_DISK_FAULTY);
-}
-
-extern inline void mark_disk_active(mdp_disk_t * d)
-{
-       d->state |= (1 << MD_DISK_ACTIVE);
-}
-
-extern inline void mark_disk_sync(mdp_disk_t * d)
-{
-       d->state |= (1 << MD_DISK_SYNC);
-}
-
-extern inline void mark_disk_spare(mdp_disk_t * d)
-{
-       d->state = 0;
-}
-
-extern inline void mark_disk_removed(mdp_disk_t * d)
-{
-       d->state = (1 << MD_DISK_FAULTY) | (1 << MD_DISK_REMOVED);
-}
-
-extern inline void mark_disk_inactive(mdp_disk_t * d)
-{
-       d->state &= ~(1 << MD_DISK_ACTIVE);
-}
-
-extern inline void mark_disk_nonsync(mdp_disk_t * d)
-{
-       d->state &= ~(1 << MD_DISK_SYNC);
-}
-
-/*
- * MD's 'extended' device
- */
-struct mdk_rdev_s
-{
-       struct md_list_head same_set;   /* RAID devices within the same set */
-       struct md_list_head all;        /* all RAID devices */
-       struct md_list_head pending;    /* undetected RAID devices */
-
-       kdev_t dev;                     /* Device number */
-       kdev_t old_dev;                 /*  "" when it was last imported */
-       int size;                       /* Device size (in blocks) */
-       mddev_t *mddev;                 /* RAID array if running */
-       unsigned long last_events;      /* IO event timestamp */
-
-       struct inode *inode;            /* Lock inode */
-       struct file filp;               /* Lock file */
-
-       mdp_super_t *sb;
-       int sb_offset;
-
-       int faulty;                     /* if faulty do not issue IO requests */
-       int desc_nr;                    /* descriptor index in the superblock */
-};
-
-
-/*
- * disk operations in a working array:
- */
-#define DISKOP_SPARE_INACTIVE  0
-#define DISKOP_SPARE_WRITE     1
-#define DISKOP_SPARE_ACTIVE    2
-#define DISKOP_HOT_REMOVE_DISK 3
-#define DISKOP_HOT_ADD_DISK    4
-
-typedef struct mdk_personality_s mdk_personality_t;
-
-struct mddev_s
-{
-       void                            *private;
-       mdk_personality_t               *pers;
-       int                             __minor;
-       mdp_super_t                     *sb;
-       int                             nb_dev;
-       struct md_list_head             disks;
-       int                             sb_dirty;
-       mdu_param_t                     param;
-       int                             ro;
-       unsigned int                    curr_resync;
-       unsigned long                   resync_start;
-       char                            *name;
-       int                             recovery_running;
-       struct semaphore                reconfig_sem;
-       struct semaphore                recovery_sem;
-       struct semaphore                resync_sem;
-       struct md_list_head             all_mddevs;
-};
-
-struct mdk_personality_s
-{
-       char *name;
-       int (*map)(mddev_t *mddev, kdev_t dev, kdev_t *rdev,
-               unsigned long *rsector, unsigned long size);
-       int (*make_request)(mddev_t *mddev, int rw, struct buffer_head * bh);
-       void (*end_request)(struct buffer_head * bh, int uptodate);
-       int (*run)(mddev_t *mddev);
-       int (*stop)(mddev_t *mddev);
-       int (*status)(char *page, mddev_t *mddev);
-       int (*ioctl)(struct inode *inode, struct file *file,
-               unsigned int cmd, unsigned long arg);
-       int max_invalid_dev;
-       int (*error_handler)(mddev_t *mddev, kdev_t dev);
-
-/*
- * Some personalities (RAID-1, RAID-5) can have disks hot-added and
- * hot-removed. Hot removal is different from failure. (failure marks
- * a disk inactive, but the disk is still part of the array) The interface
- * to such operations is the 'pers->diskop()' function, which can be NULL.
- *
- * the diskop function can change the pointer pointing to the incoming
- * descriptor, but must do so very carefully. (currently only
- * SPARE_ACTIVE expects such a change)
- */
-       int (*diskop) (mddev_t *mddev, mdp_disk_t **descriptor, int state);
-
-       int (*stop_resync)(mddev_t *mddev);
-       int (*restart_resync)(mddev_t *mddev);
-};
-
-
-/*
- * Currently we index md_array directly, based on the minor
- * number. This will have to change to dynamic allocation
- * once we start supporting partitioning of md devices.
- */
-extern inline int mdidx (mddev_t * mddev)
-{
-       return mddev->__minor;
-}
-
-extern inline kdev_t mddev_to_kdev(mddev_t * mddev)
-{
-       return MKDEV(MD_MAJOR, mdidx(mddev));
-}
-
-extern mdk_rdev_t * find_rdev(mddev_t * mddev, kdev_t dev);
-extern mdk_rdev_t * find_rdev_nr(mddev_t *mddev, int nr);
-
-/*
- * iterates through some rdev ringlist. It's safe to remove the
- * current 'rdev'. Don't touch 'tmp' though.
- */
-#define ITERATE_RDEV_GENERIC(head,field,rdev,tmp)                      \
-                                                                       \
-       for (tmp = head.next;                                           \
-               rdev = md_list_entry(tmp, mdk_rdev_t, field),           \
-                       tmp = tmp->next, tmp->prev != &head             \
-               ; )
-/*
- * iterates through the 'same array disks' ringlist
- */
-#define ITERATE_RDEV(mddev,rdev,tmp)                                   \
-       ITERATE_RDEV_GENERIC((mddev)->disks,same_set,rdev,tmp)
-
-/*
- * Same as above, but assumes that the device has rdev->desc_nr numbered
- * from 0 to mddev->nb_dev, and iterates through rdevs in ascending order.
- */
-#define ITERATE_RDEV_ORDERED(mddev,rdev,i)                             \
-       for (i = 0; rdev = find_rdev_nr(mddev, i), i < mddev->nb_dev; i++)
-
-
-/*
- * Iterates through all 'RAID managed disks'
- */
-#define ITERATE_RDEV_ALL(rdev,tmp)                                     \
-       ITERATE_RDEV_GENERIC(all_raid_disks,all,rdev,tmp)
-
-/*
- * Iterates through 'pending RAID disks'
- */
-#define ITERATE_RDEV_PENDING(rdev,tmp)                                 \
-       ITERATE_RDEV_GENERIC(pending_raid_disks,pending,rdev,tmp)
-
-/*
- * iterates through all used mddevs in the system.
- */
-#define ITERATE_MDDEV(mddev,tmp)                                       \
-                                                                       \
-       for (tmp = all_mddevs.next;                                     \
-               mddev = md_list_entry(tmp, mddev_t, all_mddevs),        \
-                       tmp = tmp->next, tmp->prev != &all_mddevs       \
-               ; )
-
-extern inline int lock_mddev (mddev_t * mddev)
-{
-       return down_interruptible(&mddev->reconfig_sem);
-}
-
-extern inline void unlock_mddev (mddev_t * mddev)
-{
-       up(&mddev->reconfig_sem);
-}
-
-#define xchg_values(x,y) do { __typeof__(x) __tmp = x; \
-                               x = y; y = __tmp; } while (0)
-
-typedef struct mdk_thread_s {
-       void                    (*run) (void *data);
-       void                    *data;
-       struct wait_queue       *wqueue;
-       unsigned long           flags;
-       struct semaphore        *sem;
-       struct task_struct      *tsk;
-       const char              *name;
-} mdk_thread_t;
-
-#define THREAD_WAKEUP  0
-
-typedef struct dev_name_s {
-       struct md_list_head list;
-       kdev_t dev;
-       char name [MAX_DISKNAME_LEN];
-} dev_name_t;
-
-#endif _MD_K_H
-
diff --git a/include/linux/raid/md_p.h b/include/linux/raid/md_p.h
deleted file mode 100644 (file)
index 83f8eb1..0000000
+++ /dev/null
@@ -1,161 +0,0 @@
-/*
-   md_p.h : physical layout of Linux RAID devices
-          Copyright (C) 1996-98 Ingo Molnar, Gadi Oxman
-         
-   This program is free software; you can redistribute it and/or modify
-   it under the terms of the GNU General Public License as published by
-   the Free Software Foundation; either version 2, or (at your option)
-   any later version.
-   
-   You should have received a copy of the GNU General Public License
-   (for example /usr/src/linux/COPYING); if not, write to the Free
-   Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.  
-*/
-
-#ifndef _MD_P_H
-#define _MD_P_H
-
-/*
- * RAID superblock.
- *
- * The RAID superblock maintains some statistics on each RAID configuration.
- * Each real device in the RAID set contains it near the end of the device.
- * Some of the ideas are copied from the ext2fs implementation.
- *
- * We currently use 4096 bytes as follows:
- *
- *     word offset     function
- *
- *        0  -    31   Constant generic RAID device information.
- *       32  -    63   Generic state information.
- *       64  -   127   Personality specific information.
- *      128  -   511   12 32-word descriptors of the disks in the raid set.
- *      512  -   991   Reserved.
- *      992  -  1023   Disk specific descriptor.
- */
-
-/*
- * If x is the real device size in bytes, we return an apparent size of:
- *
- *     y = (x & ~(MD_RESERVED_BYTES - 1)) - MD_RESERVED_BYTES
- *
- * and place the 4kB superblock at offset y.
- */
-#define MD_RESERVED_BYTES              (64 * 1024)
-#define MD_RESERVED_SECTORS            (MD_RESERVED_BYTES / 512)
-#define MD_RESERVED_BLOCKS             (MD_RESERVED_BYTES / BLOCK_SIZE)
-
-#define MD_NEW_SIZE_SECTORS(x)         ((x & ~(MD_RESERVED_SECTORS - 1)) - MD_RESERVED_SECTORS)
-#define MD_NEW_SIZE_BLOCKS(x)          ((x & ~(MD_RESERVED_BLOCKS - 1)) - MD_RESERVED_BLOCKS)
-
-#define MD_SB_BYTES                    4096
-#define MD_SB_WORDS                    (MD_SB_BYTES / 4)
-#define MD_SB_BLOCKS                   (MD_SB_BYTES / BLOCK_SIZE)
-#define MD_SB_SECTORS                  (MD_SB_BYTES / 512)
-
-/*
- * The following are counted in 32-bit words
- */
-#define        MD_SB_GENERIC_OFFSET            0
-#define MD_SB_PERSONALITY_OFFSET       64
-#define MD_SB_DISKS_OFFSET             128
-#define MD_SB_DESCRIPTOR_OFFSET                992
-
-#define MD_SB_GENERIC_CONSTANT_WORDS   32
-#define MD_SB_GENERIC_STATE_WORDS      32
-#define MD_SB_GENERIC_WORDS            (MD_SB_GENERIC_CONSTANT_WORDS + MD_SB_GENERIC_STATE_WORDS)
-#define MD_SB_PERSONALITY_WORDS                64
-#define MD_SB_DISKS_WORDS              384
-#define MD_SB_DESCRIPTOR_WORDS         32
-#define MD_SB_RESERVED_WORDS           (1024 - MD_SB_GENERIC_WORDS - MD_SB_PERSONALITY_WORDS - MD_SB_DISKS_WORDS - MD_SB_DESCRIPTOR_WORDS)
-#define MD_SB_EQUAL_WORDS              (MD_SB_GENERIC_WORDS + MD_SB_PERSONALITY_WORDS + MD_SB_DISKS_WORDS)
-#define MD_SB_DISKS                    (MD_SB_DISKS_WORDS / MD_SB_DESCRIPTOR_WORDS)
-
-/*
- * Device "operational" state bits
- */
-#define MD_DISK_FAULTY         0 /* disk is faulty / operational */
-#define MD_DISK_ACTIVE         1 /* disk is running or spare disk */
-#define MD_DISK_SYNC           2 /* disk is in sync with the raid set */
-#define MD_DISK_REMOVED                3 /* disk is removed from the raid set */
-
-typedef struct mdp_device_descriptor_s {
-       __u32 number;           /* 0 Device number in the entire set          */
-       __u32 major;            /* 1 Device major number                      */
-       __u32 minor;            /* 2 Device minor number                      */
-       __u32 raid_disk;        /* 3 The role of the device in the raid set   */
-       __u32 state;            /* 4 Operational state                        */
-       __u32 reserved[MD_SB_DESCRIPTOR_WORDS - 5];
-} mdp_disk_t;
-
-#define MD_SB_MAGIC            0xa92b4efc
-
-/*
- * Superblock state bits
- */
-#define MD_SB_CLEAN            0
-#define MD_SB_ERRORS           1
-
-typedef struct mdp_superblock_s {
-       /*
-        * Constant generic information
-        */
-       __u32 md_magic;         /*  0 MD identifier                           */
-       __u32 major_version;    /*  1 major version to which the set conforms */
-       __u32 minor_version;    /*  2 minor version ...                       */
-       __u32 patch_version;    /*  3 patchlevel version ...                  */
-       __u32 gvalid_words;     /*  4 Number of used words in this section    */
-       __u32 set_uuid0;        /*  5 Raid set identifier                     */
-       __u32 ctime;            /*  6 Creation time                           */
-       __u32 level;            /*  7 Raid personality                        */
-       __u32 size;             /*  8 Apparent size of each individual disk   */
-       __u32 nr_disks;         /*  9 total disks in the raid set             */
-       __u32 raid_disks;       /* 10 disks in a fully functional raid set    */
-       __u32 md_minor;         /* 11 preferred MD minor device number        */
-       __u32 not_persistent;   /* 12 does it have a persistent superblock    */
-       __u32 set_uuid1;        /* 13 Raid set identifier #2                  */
-       __u32 set_uuid2;        /* 14 Raid set identifier #3                  */
-       __u32 set_uuid3;        /* 15 Raid set identifier #4                  */
-       __u32 gstate_creserved[MD_SB_GENERIC_CONSTANT_WORDS - 16];
-
-       /*
-        * Generic state information
-        */
-       __u32 utime;            /*  0 Superblock update time                  */
-       __u32 state;            /*  1 State bits (clean, ...)                 */
-       __u32 active_disks;     /*  2 Number of currently active disks        */
-       __u32 working_disks;    /*  3 Number of working disks                 */
-       __u32 failed_disks;     /*  4 Number of failed disks                  */
-       __u32 spare_disks;      /*  5 Number of spare disks                   */
-       __u32 sb_csum;          /*  6 checksum of the whole superblock        */
-       __u64 events;           /*  7 number of superblock updates (64-bit!)  */
-       __u32 gstate_sreserved[MD_SB_GENERIC_STATE_WORDS - 9];
-
-       /*
-        * Personality information
-        */
-       __u32 layout;           /*  0 the array's physical layout             */
-       __u32 chunk_size;       /*  1 chunk size in bytes                     */
-       __u32 root_pv;          /*  2 LV root PV */
-       __u32 root_block;       /*  3 LV root block */
-       __u32 pstate_reserved[MD_SB_PERSONALITY_WORDS - 4];
-
-       /*
-        * Disks information
-        */
-       mdp_disk_t disks[MD_SB_DISKS];
-
-       /*
-        * Reserved
-        */
-       __u32 reserved[MD_SB_RESERVED_WORDS];
-
-       /*
-        * Active descriptor
-        */
-       mdp_disk_t this_disk;
-
-} mdp_super_t;
-
-#endif _MD_P_H
-
diff --git a/include/linux/raid/md_u.h b/include/linux/raid/md_u.h
deleted file mode 100644 (file)
index 18c0295..0000000
+++ /dev/null
@@ -1,114 +0,0 @@
-/*
-   md_u.h : user <=> kernel API between Linux raidtools and RAID drivers
-          Copyright (C) 1998 Ingo Molnar
-         
-   This program is free software; you can redistribute it and/or modify
-   it under the terms of the GNU General Public License as published by
-   the Free Software Foundation; either version 2, or (at your option)
-   any later version.
-   
-   You should have received a copy of the GNU General Public License
-   (for example /usr/src/linux/COPYING); if not, write to the Free
-   Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.  
-*/
-
-#ifndef _MD_U_H
-#define _MD_U_H
-
-/* ioctls */
-
-/* status */
-#define RAID_VERSION           _IOR (MD_MAJOR, 0x10, mdu_version_t)
-#define GET_ARRAY_INFO         _IOR (MD_MAJOR, 0x11, mdu_array_info_t)
-#define GET_DISK_INFO          _IOR (MD_MAJOR, 0x12, mdu_disk_info_t)
-#define PRINT_RAID_DEBUG       _IO (MD_MAJOR, 0x13)
-
-/* configuration */
-#define CLEAR_ARRAY            _IO (MD_MAJOR, 0x20)
-#define ADD_NEW_DISK           _IOW (MD_MAJOR, 0x21, mdu_disk_info_t)
-#define HOT_REMOVE_DISK                _IO (MD_MAJOR, 0x22)
-#define SET_ARRAY_INFO         _IOW (MD_MAJOR, 0x23, mdu_array_info_t)
-#define SET_DISK_INFO          _IO (MD_MAJOR, 0x24)
-#define WRITE_RAID_INFO                _IO (MD_MAJOR, 0x25)
-#define UNPROTECT_ARRAY                _IO (MD_MAJOR, 0x26)
-#define PROTECT_ARRAY          _IO (MD_MAJOR, 0x27)
-#define HOT_ADD_DISK           _IO (MD_MAJOR, 0x28)
-
-/* usage */
-#define RUN_ARRAY              _IOW (MD_MAJOR, 0x30, mdu_param_t)
-#define START_ARRAY            _IO (MD_MAJOR, 0x31)
-#define STOP_ARRAY             _IO (MD_MAJOR, 0x32)
-#define STOP_ARRAY_RO          _IO (MD_MAJOR, 0x33)
-#define RESTART_ARRAY_RW       _IO (MD_MAJOR, 0x34)
-
-typedef struct mdu_version_s {
-       int major;
-       int minor;
-       int patchlevel;
-} mdu_version_t;
-
-typedef struct mdu_array_info_s {
-       /*
-        * Generic constant information
-        */
-       int major_version;
-       int minor_version;
-       int patch_version;
-       int ctime;
-       int level;
-       int size;
-       int nr_disks;
-       int raid_disks;
-       int md_minor;
-       int not_persistent;
-
-       /*
-        * Generic state information
-        */
-       int utime;              /*  0 Superblock update time                  */
-       int state;              /*  1 State bits (clean, ...)                 */
-       int active_disks;       /*  2 Number of currently active disks        */
-       int working_disks;      /*  3 Number of working disks                 */
-       int failed_disks;       /*  4 Number of failed disks                  */
-       int spare_disks;        /*  5 Number of spare disks                   */
-
-       /*
-        * Personality information
-        */
-       int layout;             /*  0 the array's physical layout             */
-       int chunk_size; /*  1 chunk size in bytes                     */
-
-} mdu_array_info_t;
-
-typedef struct mdu_disk_info_s {
-       /*
-        * configuration/status of one particular disk
-        */
-       int number;
-       int major;
-       int minor;
-       int raid_disk;
-       int state;
-
-} mdu_disk_info_t;
-
-typedef struct mdu_start_info_s {
-       /*
-        * configuration/status of one particular disk
-        */
-       int major;
-       int minor;
-       int raid_disk;
-       int state;
-
-} mdu_start_info_t;
-
-typedef struct mdu_param_s
-{
-       int                     personality;    /* 1,2,3,4 */
-       int                     chunk_size;     /* in bytes */
-       int                     max_fault;      /* unused for now */
-} mdu_param_t;
-
-#endif _MD_U_H
-
diff --git a/include/linux/raid/raid0.h b/include/linux/raid/raid0.h
deleted file mode 100644 (file)
index 3ea74db..0000000
+++ /dev/null
@@ -1,33 +0,0 @@
-#ifndef _RAID0_H
-#define _RAID0_H
-
-#include <linux/raid/md.h>
-
-struct strip_zone
-{
-       int zone_offset;                /* Zone offset in md_dev */
-       int dev_offset;                 /* Zone offset in real dev */
-       int size;                       /* Zone size */
-       int nb_dev;                     /* # of devices attached to the zone */
-       mdk_rdev_t *dev[MAX_REAL]; /* Devices attached to the zone */
-};
-
-struct raid0_hash
-{
-       struct strip_zone *zone0, *zone1;
-};
-
-struct raid0_private_data
-{
-       struct raid0_hash *hash_table; /* Dynamically allocated */
-       struct strip_zone *strip_zone; /* This one too */
-       int nr_strip_zones;
-       struct strip_zone *smallest;
-       int nr_zones;
-};
-
-typedef struct raid0_private_data raid0_conf_t;
-
-#define mddev_to_conf(mddev) ((raid0_conf_t *) mddev->private)
-
-#endif
diff --git a/include/linux/raid/raid1.h b/include/linux/raid/raid1.h
deleted file mode 100644 (file)
index a50ba2f..0000000
+++ /dev/null
@@ -1,64 +0,0 @@
-#ifndef _RAID1_H
-#define _RAID1_H
-
-#include <linux/raid/md.h>
-
-struct mirror_info {
-       int             number;
-       int             raid_disk;
-       kdev_t          dev;
-       int             next;
-       int             sect_limit;
-
-       /*
-        * State bits:
-        */
-       int             operational;
-       int             write_only;
-       int             spare;
-
-       int             used_slot;
-};
-
-struct raid1_private_data {
-       mddev_t                 *mddev;
-       struct mirror_info      mirrors[MD_SB_DISKS];
-       int                     nr_disks;
-       int                     raid_disks;
-       int                     working_disks;
-       int                     last_used;
-       unsigned long           next_sect;
-       int                     sect_count;
-       mdk_thread_t            *thread, *resync_thread;
-       int                     resync_mirrors;
-       struct mirror_info      *spare;
-};
-
-typedef struct raid1_private_data raid1_conf_t;
-
-/*
- * this is the only point in the RAID code where we violate
- * C type safety. mddev->private is an 'opaque' pointer.
- */
-#define mddev_to_conf(mddev) ((raid1_conf_t *) mddev->private)
-
-/*
- * this is our 'private' 'collective' RAID1 buffer head.
- * it contains information about what kind of IO operations were started
- * for this RAID1 operation, and about their status:
- */
-
-struct raid1_bh {
-       atomic_t                remaining; /* 'have we finished' count,
-                                           * used from IRQ handlers
-                                           */
-       int                     cmd;
-       unsigned long           state;
-       mddev_t                 *mddev;
-       struct buffer_head      *master_bh;
-       struct buffer_head      *mirror_bh [MD_SB_DISKS];
-       struct buffer_head      bh_req;
-       struct buffer_head      *next_retry;
-};
-
-#endif
diff --git a/include/linux/raid/raid5.h b/include/linux/raid/raid5.h
deleted file mode 100644 (file)
index 471323c..0000000
+++ /dev/null
@@ -1,113 +0,0 @@
-#ifndef _RAID5_H
-#define _RAID5_H
-
-#include <linux/raid/md.h>
-#include <linux/raid/xor.h>
-
-struct disk_info {
-       kdev_t  dev;
-       int     operational;
-       int     number;
-       int     raid_disk;
-       int     write_only;
-       int     spare;
-       int     used_slot;
-};
-
-struct stripe_head {
-       md_spinlock_t           stripe_lock;
-       struct stripe_head      *hash_next, **hash_pprev; /* hash pointers */
-       struct stripe_head      *free_next;             /* pool of free sh's */
-       struct buffer_head      *buffer_pool;           /* pool of free buffers */
-       struct buffer_head      *bh_pool;               /* pool of free bh's */
-       struct raid5_private_data       *raid_conf;
-       struct buffer_head      *bh_old[MD_SB_DISKS];   /* disk image */
-       struct buffer_head      *bh_new[MD_SB_DISKS];   /* buffers of the MD device (present in buffer cache) */
-       struct buffer_head      *bh_copy[MD_SB_DISKS];  /* copy on write of bh_new (bh_new can change from under us) */
-       struct buffer_head      *bh_req[MD_SB_DISKS];   /* copy of bh_new (only the buffer heads), queued to the lower levels */
-       int                     cmd_new[MD_SB_DISKS];   /* READ/WRITE for new */
-       int                     new[MD_SB_DISKS];       /* buffer added since the last handle_stripe() */
-       unsigned long           sector;                 /* sector of this row */
-       int                     size;                   /* buffers size */
-       int                     pd_idx;                 /* parity disk index */
-       atomic_t                nr_pending;             /* nr of pending cmds */
-       unsigned long           state;                  /* state flags */
-       int                     cmd;                    /* stripe cmd */
-       int                     count;                  /* nr of waiters */
-       int                     write_method;           /* reconstruct-write / read-modify-write */
-       int                     phase;                  /* PHASE_BEGIN, ..., PHASE_COMPLETE */
-       struct wait_queue       *wait;                  /* processes waiting for this stripe */
-};
-
-/*
- * Phase
- */
-#define PHASE_BEGIN            0
-#define PHASE_READ_OLD         1
-#define PHASE_WRITE            2
-#define PHASE_READ             3
-#define PHASE_COMPLETE         4
-
-/*
- * Write method
- */
-#define METHOD_NONE            0
-#define RECONSTRUCT_WRITE      1
-#define READ_MODIFY_WRITE      2
-
-/*
- * Stripe state
- */
-#define STRIPE_LOCKED          0
-#define STRIPE_ERROR           1
-
-/*
- * Stripe commands
- */
-#define STRIPE_NONE            0
-#define        STRIPE_WRITE            1
-#define STRIPE_READ            2
-
-struct raid5_private_data {
-       struct stripe_head      **stripe_hashtbl;
-       mddev_t                 *mddev;
-       mdk_thread_t            *thread, *resync_thread;
-       struct disk_info        disks[MD_SB_DISKS];
-       struct disk_info        *spare;
-       int                     buffer_size;
-       int                     chunk_size, level, algorithm;
-       int                     raid_disks, working_disks, failed_disks;
-       int                     sector_count;
-       unsigned long           next_sector;
-       atomic_t                nr_handle;
-       struct stripe_head      *next_free_stripe;
-       int                     nr_stripes;
-       int                     resync_parity;
-       int                     max_nr_stripes;
-       int                     clock;
-       int                     nr_hashed_stripes;
-       int                     nr_locked_stripes;
-       int                     nr_pending_stripes;
-       int                     nr_cached_stripes;
-
-       /*
-        * Free stripes pool
-        */
-       int                     nr_free_sh;
-       struct stripe_head      *free_sh_list;
-       struct wait_queue       *wait_for_stripe;
-};
-
-typedef struct raid5_private_data raid5_conf_t;
-
-#define mddev_to_conf(mddev) ((raid5_conf_t *) mddev->private)
-
-/*
- * Our supported algorithms
- */
-#define ALGORITHM_LEFT_ASYMMETRIC      0
-#define ALGORITHM_RIGHT_ASYMMETRIC     1
-#define ALGORITHM_LEFT_SYMMETRIC       2
-#define ALGORITHM_RIGHT_SYMMETRIC      3
-
-#endif
diff --git a/include/linux/raid/translucent.h b/include/linux/raid/translucent.h
deleted file mode 100644 (file)
index a1326db..0000000
+++ /dev/null
@@ -1,23 +0,0 @@
-#ifndef _TRANSLUCENT_H
-#define _TRANSLUCENT_H
-
-#include <linux/raid/md.h>
-
-typedef struct dev_info dev_info_t;
-
-struct dev_info {
-       kdev_t          dev;
-       int             size;
-};
-
-struct translucent_private_data
-{
-       dev_info_t              disks[MD_SB_DISKS];
-};
-
-
-typedef struct translucent_private_data translucent_conf_t;
-
-#define mddev_to_conf(mddev) ((translucent_conf_t *) mddev->private)
-
-#endif
diff --git a/include/linux/raid/xor.h b/include/linux/raid/xor.h
deleted file mode 100644 (file)
index e345fe7..0000000
+++ /dev/null
@@ -1,12 +0,0 @@
-#ifndef _XOR_H
-#define _XOR_H
-
-#include <linux/raid/md.h>
-
-#define MAX_XOR_BLOCKS 5
-
-extern void calibrate_xor_block(void);
-extern void (*xor_block)(unsigned int count,
-                         struct buffer_head **bh_ptr);
-
-#endif
diff --git a/include/linux/raid0.h b/include/linux/raid0.h
new file mode 100644 (file)
index 0000000..e1ae51c
--- /dev/null
@@ -0,0 +1,27 @@
+#ifndef _RAID0_H
+#define _RAID0_H
+
+struct strip_zone
+{
+  int zone_offset;             /* Zone offset in md_dev */
+  int dev_offset;              /* Zone offset in real dev */
+  int size;                    /* Zone size */
+  int nb_dev;                  /* Number of devices attached to the zone */
+  struct real_dev *dev[MAX_REAL]; /* Devices attached to the zone */
+};
+
+struct raid0_hash
+{
+  struct strip_zone *zone0, *zone1;
+};
+
+struct raid0_data
+{
+  struct raid0_hash *hash_table; /* Dynamically allocated */
+  struct strip_zone *strip_zone; /* This one too */
+  int nr_strip_zones;
+  struct strip_zone *smallest;
+  int nr_zones;
+};
+
+#endif
diff --git a/include/linux/raid1.h b/include/linux/raid1.h
new file mode 100644 (file)
index 0000000..4b031e6
--- /dev/null
@@ -0,0 +1,49 @@
+#ifndef _RAID1_H
+#define _RAID1_H
+
+#include <linux/md.h>
+
+struct mirror_info {
+       int             number;
+       int             raid_disk;
+       kdev_t          dev;
+       int             next;
+       int             sect_limit;
+
+       /*
+        * State bits:
+        */
+       int             operational;
+       int             write_only;
+       int             spare;
+};
+
+struct raid1_data {
+       struct md_dev *mddev;
+       struct mirror_info mirrors[MD_SB_DISKS];        /* RAID1 devices, 2 to MD_SB_DISKS */
+       int raid_disks;
+       int working_disks;                      /* Number of working disks */
+       int last_used;
+       unsigned long   next_sect;
+       int             sect_count;
+       int resync_running;
+};
+
+/*
+ * this is our 'private' 'collective' RAID1 buffer head.
+ * it contains information about what kind of IO operations were started
+ * for this RAID1 operation, and about their status:
+ */
+
+struct raid1_bh {
+       unsigned int            remaining;
+       int                     cmd;
+       unsigned long           state;
+       struct md_dev           *mddev;
+       struct buffer_head      *master_bh;
+       struct buffer_head      *mirror_bh [MD_SB_DISKS];
+       struct buffer_head      bh_req;
+       struct buffer_head      *next_retry;
+};
+
+#endif
diff --git a/include/linux/raid5.h b/include/linux/raid5.h
new file mode 100644 (file)
index 0000000..5efd211
--- /dev/null
@@ -0,0 +1,110 @@
+#ifndef _RAID5_H
+#define _RAID5_H
+
+#ifdef __KERNEL__
+#include <linux/md.h>
+#include <asm/atomic.h>
+
+struct disk_info {
+       kdev_t  dev;
+       int     operational;
+       int     number;
+       int     raid_disk;
+       int     write_only;
+       int     spare;
+};
+
+struct stripe_head {
+       struct stripe_head      *hash_next, **hash_pprev; /* hash pointers */
+       struct stripe_head      *free_next;             /* pool of free sh's */
+       struct buffer_head      *buffer_pool;           /* pool of free buffers */
+       struct buffer_head      *bh_pool;               /* pool of free bh's */
+       struct raid5_data       *raid_conf;
+       struct buffer_head      *bh_old[MD_SB_DISKS];   /* disk image */
+       struct buffer_head      *bh_new[MD_SB_DISKS];   /* buffers of the MD device (present in buffer cache) */
+       struct buffer_head      *bh_copy[MD_SB_DISKS];  /* copy on write of bh_new (bh_new can change from under us) */
+       struct buffer_head      *bh_req[MD_SB_DISKS];   /* copy of bh_new (only the buffer heads), queued to the lower levels */
+       int                     cmd_new[MD_SB_DISKS];   /* READ/WRITE for new */
+       int                     new[MD_SB_DISKS];       /* buffer added since the last handle_stripe() */
+       unsigned long           sector;                 /* sector of this row */
+       int                     size;                   /* buffers size */
+       int                     pd_idx;                 /* parity disk index */
+       int                     nr_pending;             /* nr of pending cmds */
+       unsigned long           state;                  /* state flags */
+       int                     cmd;                    /* stripe cmd */
+       int                     count;                  /* nr of waiters */
+       int                     write_method;           /* reconstruct-write / read-modify-write */
+       int                     phase;                  /* PHASE_BEGIN, ..., PHASE_COMPLETE */
+       struct wait_queue       *wait;                  /* processes waiting for this stripe */
+};
+
+/*
+ * Phase
+ */
+#define PHASE_BEGIN            0
+#define PHASE_READ_OLD         1
+#define PHASE_WRITE            2
+#define PHASE_READ             3
+#define PHASE_COMPLETE         4
+
+/*
+ * Write method
+ */
+#define METHOD_NONE            0
+#define RECONSTRUCT_WRITE      1
+#define READ_MODIFY_WRITE      2
+
+/*
+ * Stripe state
+ */
+#define STRIPE_LOCKED          0
+#define STRIPE_ERROR           1
+
+/*
+ * Stripe commands
+ */
+#define STRIPE_NONE            0
+#define        STRIPE_WRITE            1
+#define STRIPE_READ            2
+
+struct raid5_data {
+       struct stripe_head      **stripe_hashtbl;
+       struct md_dev           *mddev;
+       struct md_thread        *thread, *resync_thread;
+       struct disk_info        disks[MD_SB_DISKS];
+       struct disk_info        *spare;
+       int                     buffer_size;
+       int                     chunk_size, level, algorithm;
+       int                     raid_disks, working_disks, failed_disks;
+       int                     sector_count;
+       unsigned long           next_sector;
+       atomic_t                nr_handle;
+       struct stripe_head      *next_free_stripe;
+       int                     nr_stripes;
+       int                     resync_parity;
+       int                     max_nr_stripes;
+       int                     clock;
+       int                     nr_hashed_stripes;
+       int                     nr_locked_stripes;
+       int                     nr_pending_stripes;
+       int                     nr_cached_stripes;
+
+       /*
+        * Free stripes pool
+        */
+       int                     nr_free_sh;
+       struct stripe_head      *free_sh_list;
+       struct wait_queue       *wait_for_stripe;
+};
+
+#endif
+
+/*
+ * Our supported algorithms
+ */
+#define ALGORITHM_LEFT_ASYMMETRIC      0
+#define ALGORITHM_RIGHT_ASYMMETRIC     1
+#define ALGORITHM_LEFT_SYMMETRIC       2
+#define ALGORITHM_RIGHT_SYMMETRIC      3
+
+#endif
index b7359b45b780511503aa0645f004ba487f1a11b7..aba1498a73de81a44f80db82a4eab1e8defa0935 100644 (file)
@@ -425,8 +425,7 @@ enum
 /* CTL_DEV names: */
 enum {
        DEV_CDROM=1,
-       DEV_HWMON=2,
-       DEV_MD=3
+       DEV_HWMON=2
 };
 
 /* /proc/sys/dev/cdrom */
@@ -434,11 +433,6 @@ enum {
        DEV_CDROM_INFO=1
 };
 
-/* /proc/sys/dev/md */
-enum {
-       DEV_MD_SPEED_LIMIT=1
-};
-
 #ifdef __KERNEL__
 
 extern asmlinkage int sys_sysctl(struct __sysctl_args *);
index 8788eb27e162aa766718c1ff45ca0220bbaff78e..1e050359a25c356f117652403ae6c78904645674 100644 (file)
 #include <linux/list.h>
 #endif /* __KERNEL__ */
 
-#ifdef CONFIG_IP_MASQUERADE_VS
-struct ip_vs_dest;
-#endif
-
 /*
  * This define affects the number of ports that can be handled
  * by each of the protocol helper modules.
@@ -44,6 +40,10 @@ struct ip_vs_dest;
 #define IP_MASQ_MOD_CTL                        0x00
 #define IP_MASQ_USER_CTL               0x01
 
+#ifdef __KERNEL__
+
+#define IP_MASQ_TAB_SIZE       256
+
 #define IP_MASQ_F_NO_DADDR           0x0001    /* no daddr yet */
 #define IP_MASQ_F_NO_DPORT                   0x0002    /* no dport set yet */
 #define IP_MASQ_F_NO_SADDR           0x0004    /* no sport set yet */
@@ -60,23 +60,6 @@ struct ip_vs_dest;
 #define IP_MASQ_F_USER               0x2000    /* from uspace */
 #define IP_MASQ_F_SIMPLE_HASH        0x8000    /* prevent s+d and m+d hashing */
 
-#ifdef CONFIG_IP_MASQUERADE_VS
-#define IP_MASQ_F_VS             0x00010000    /* virtual server releated */
-#define IP_MASQ_F_VS_NO_OUTPUT    0x00020000   /* output packets avoid masq */
-#define IP_MASQ_F_VS_FIN         0x00040000    /* fin detected */
-#define IP_MASQ_F_VS_FWD_MASK    0x00700000    /* mask for the fdw method */
-#define IP_MASQ_F_VS_LOCALNODE   0x00100000    /* local node destination */
-#define IP_MASQ_F_VS_TUNNEL      0x00200000    /* packets will be tunneled */
-#define IP_MASQ_F_VS_DROUTE      0x00400000    /* direct routing */
-                                                /* masquerading otherwise */
-#define IP_MASQ_VS_FWD(ms) (ms->flags & IP_MASQ_F_VS_FWD_MASK)
-#endif /* CONFIG_IP_MASQUERADE_VS */
-
-#ifdef __KERNEL__
-
-#define IP_MASQ_NTABLES                3
-#define IP_MASQ_TAB_SIZE       256
-
 /*
  *     Delta seq. info structure
  *     Each MASQ struct has 2 (output AND input seq. changes).
@@ -108,9 +91,6 @@ struct ip_masq {
        unsigned        timeout;        /* timeout */
        unsigned        state;          /* state info */
        struct ip_masq_timeout_table *timeout_table;
-#ifdef CONFIG_IP_MASQUERADE_VS
-       struct ip_vs_dest *dest;        /* real server & service */
-#endif /* CONFIG_IP_MASQUERADE_VS */
 };
 
 /*
diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
deleted file mode 100644
index af47b53..0000000
+++ /dev/null
@@ -1,154 +0,0 @@
-/*
- *      Virtual server support for IP masquerading
- *      data structure and funcationality definitions
- */
-
-#ifndef _IP_VS_H
-#define _IP_VS_H
-
-#include <linux/config.h>
-
-#ifdef CONFIG_IP_VS_DEBUG
-#define IP_VS_DBG(msg...) printk(KERN_DEBUG "IP_VS: " ## msg )
-#else  /* NO DEBUGGING at ALL */
-#define IP_VS_DBG(msg...)
-#endif
-
-#define IP_VS_ERR(msg...) printk(KERN_ERR "IP_VS: " ## msg )
-#define IP_VS_INFO(msg...) printk(KERN_INFO "IP_VS: " ## msg )
-#define IP_VS_WARNING(msg...) \
-       printk(KERN_WARNING "IP_VS: " ## msg)
-
-struct ip_vs_dest;
-struct ip_vs_scheduler;
-
-/*
- *     The information about the virtual service offered to the net
- *     and the forwarding entries
- */
-struct ip_vs_service {
-       struct ip_vs_service    *next;
-       __u32                   addr;     /* IP address for virtual service */
-       __u16                   port;     /* port number for the service */
-       __u16                   protocol; /* which protocol (TCP/UDP) */
-        struct ip_vs_dest      *destinations; /* real server list */
-       struct ip_vs_scheduler  *scheduler;    /* bound scheduler object */
-       void                    *sched_data;   /* scheduler application data */
-};
-
-
-/*
- *     The real server destination forwarding entry
- *     with ip address, port
- */
-struct ip_vs_dest {
-       struct ip_vs_dest       *next;
-       __u32                   addr;     /* IP address of real server */
-       __u16                   port;     /* port number of the service */
-       unsigned                masq_flags;     /* flags to copy to masq */
-       atomic_t                connections;
-       atomic_t                refcnt;
-       int                     weight;
-       struct ip_vs_service    *service;       /* service might be NULL */
-};
-
-
-/*
- *     The scheduler object
- */
-struct ip_vs_scheduler {
-       struct ip_vs_scheduler  *next;
-       char                    *name;
-       atomic_t                refcnt;
-
-        /* scheduler initializing service */
-       int (*init_service)(struct ip_vs_service *svc);
-        /* scheduling service finish */
-        int (*done_service)(struct ip_vs_service *svc);
-
-       /* scheduling and creating a masquerading entry */
-       struct ip_masq* (*schedule)(struct ip_vs_service *svc, 
-                                   struct iphdr *iph);
-};
-
-/*
- * IP Virtual Server hash table
- */
-#define IP_VS_TAB_BITS CONFIG_IP_MASQUERADE_VS_TAB_BITS
-#define IP_VS_TAB_SIZE  (1 << IP_VS_TAB_BITS)
-extern struct list_head  ip_vs_table[IP_VS_TAB_SIZE];
-
-/*
- *  Hash and unhash functions
- */
-extern int ip_vs_hash(struct ip_masq *ms);
-extern int ip_vs_unhash(struct ip_masq *ms);
-
-/*
- *      registering/unregistering scheduler functions
- */
-extern int register_ip_vs_scheduler(struct ip_vs_scheduler *scheduler);
-extern int unregister_ip_vs_scheduler(struct ip_vs_scheduler *scheduler);
-
-/*
- *  Lookup functions for the hash table
- */
-extern struct ip_masq * ip_vs_in_get(int protocol, __u32 s_addr, __u16 s_port, __u32 d_addr, __u16 d_port);
-extern struct ip_masq * ip_vs_out_get(int protocol, __u32 s_addr, __u16 s_port, __u32 d_addr, __u16 d_port);
-
-/*
- * Creating a masquerading entry for IPVS
- */
-extern struct ip_masq *ip_masq_new_vs(int proto, __u32 maddr, __u16 mport, __u32 saddr, __u16 sport, __u32 daddr, __u16 dport, unsigned flags);
-
-/*
- *      IPVS data and functions
- */
-extern rwlock_t __ip_vs_lock;
-
-extern int ip_vs_ctl(int optname, struct ip_masq_ctl *mctl, int optlen);
-
-extern void ip_vs_fin_masq(struct ip_masq *ms);
-extern void ip_vs_bind_masq(struct ip_masq *ms, struct ip_vs_dest *dest);
-extern void ip_vs_unbind_masq(struct ip_masq *ms);
-
-struct ip_vs_service *ip_vs_lookup_service(__u32 vaddr, __u16 vport,
-                                           __u16 protocol);
-extern struct ip_masq *ip_vs_schedule(__u32 vaddr, __u16 vport,
-                                     __u16 protocol,
-                                     struct iphdr *iph);
-
-extern int ip_vs_tunnel_xmit(struct sk_buff **skb_p, __u32 daddr);
-
-/*
- *      init function
- */
-extern int ip_vs_init(void);
-
-/*
- *     init function prototypes for scheduling modules
- *      these function will be called when they are built in kernel
- */
-extern int ip_vs_rr_init(void);
-extern int ip_vs_wrr_init(void);
-extern int ip_vs_wlc_init(void);
-extern int ip_vs_pcc_init(void);
-
-
-/*
- * ip_vs_fwd_tag returns the forwarding tag of the masq
- */
-static __inline__ char ip_vs_fwd_tag(struct ip_masq *ms)
-{
-  char fwd = 'M';
-
-  switch (IP_MASQ_VS_FWD(ms)) {
-    case IP_MASQ_F_VS_LOCALNODE: fwd = 'L'; break;
-    case IP_MASQ_F_VS_TUNNEL: fwd = 'T'; break;
-    case IP_MASQ_F_VS_DROUTE: fwd = 'R'; break;
-  }
-  return fwd;
-}
-
-
-#endif /* _IP_VS_H */
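The `ip_vs_scheduler` object deleted above was the hook through which the IPVS balancing modules (ip_vs_rr, ip_vs_wrr, ip_vs_wlc, ip_vs_pcc) plugged in. As a minimal user-space sketch of the weighted round-robin idea behind ip_vs_wrr — a simplified illustration with a hypothetical `wrr_next()` helper, not the code being removed here:

```c
#include <assert.h>

/* Hypothetical sketch of interleaved weighted round-robin: servers are
 * visited in turn, but a server is only chosen while its weight is at
 * least the current threshold cw, which steps down by the gcd of all
 * weights each full pass. Assumes at least one weight > 0, otherwise
 * the loop would not terminate.
 *
 *   weight  per-server weights
 *   n       number of servers
 *   i       cursor, initialised to n - 1
 *   cw      current threshold, initialised to 0
 *   maxw    maximum weight, gcdw gcd of the weights
 */
static int wrr_next(const int *weight, int n, int *i, int *cw,
		    int maxw, int gcdw)
{
	for (;;) {
		*i = (*i + 1) % n;
		if (*i == 0) {
			*cw -= gcdw;
			if (*cw <= 0)
				*cw = maxw;
		}
		if (weight[*i] >= *cw)
			return *i;
	}
}
```

With weights {3, 1} this yields the sequence 0, 0, 0, 1, repeating: the heavier server is picked in proportion to its weight.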
index a71187f48762aa2a78f5fd5e7d866ba1ed22eae8..f316a4746e1ec05d8f4dfb04cafbbcbe23312172 100644
@@ -19,7 +19,6 @@
 #include <linux/utsname.h>
 #include <linux/ioport.h>
 #include <linux/init.h>
-#include <linux/raid/md.h>
 #include <linux/smp_lock.h>
 #include <linux/blk.h>
 #include <linux/hdreg.h>
@@ -471,7 +470,7 @@ static struct dev_name_struct {
 #ifdef CONFIG_BLK_DEV_FD
        { "fd",      0x0200 },
 #endif
-#if CONFIG_MD_BOOT || CONFIG_AUTODETECT_RAID
+#ifdef CONFIG_MD_BOOT
        { "md",      0x0900 },       
 #endif     
 #ifdef CONFIG_BLK_DEV_XD
@@ -883,9 +882,6 @@ static struct kernel_param cooked_params[] __initdata = {
 #ifdef CONFIG_MD_BOOT
        { "md=", md_setup},
 #endif
-#if CONFIG_BLK_DEV_MD
-       { "raid=", raid_setup},
-#endif
 #ifdef CONFIG_ADBMOUSE
        { "adb_buttons=", adb_mouse_setup },
 #endif
@@ -1371,9 +1367,6 @@ static void __init do_basic_setup(void)
                        while (pid != wait(&i));
                if (MAJOR(real_root_dev) != RAMDISK_MAJOR
                     || MINOR(real_root_dev) != 0) {
-#ifdef CONFIG_BLK_DEV_MD
-                       autodetect_raid();
-#endif
                        error = change_root(real_root_dev,"/initrd");
                        if (error)
                                printk(KERN_ERR "Change root to /initrd: "
index 1f4b54dce0482ae3af2673fa3474c942e67620bb..29786da5e96be521fc50e56cbbf7564d65671083 100644
@@ -52,14 +52,6 @@ if [ "$CONFIG_IP_FIREWALL" = "y" ]; then
           tristate 'IP: ipportfw masq support (EXPERIMENTAL)' CONFIG_IP_MASQUERADE_IPPORTFW
           tristate 'IP: ip fwmark masq-forwarding support (EXPERIMENTAL)' CONFIG_IP_MASQUERADE_MFW
        fi
-       bool 'IP: masquerading virtual server support (EXPERIMENTAL)' CONFIG_IP_MASQUERADE_VS
-       if [ "$CONFIG_IP_MASQUERADE_VS" = "y" ]; then
-         int 'IP masquerading VS table size (the Nth power of 2)' CONFIG_IP_MASQUERADE_VS_TAB_BITS 12
-          tristate 'IPVS: round-robin scheduling' CONFIG_IP_MASQUERADE_VS_RR
-          tristate 'IPVS: weighted round-robin scheduling' CONFIG_IP_MASQUERADE_VS_WRR
-          tristate 'IPVS: weighted least-connection scheduling' CONFIG_IP_MASQUERADE_VS_WLC
-          tristate 'IPVS: persistent client connection scheduling' CONFIG_IP_MASQUERADE_VS_PCC
-       fi
       fi
     fi
   fi
index 45296ae25abbd3dc491df05666967c67d5e91c43..8ab280deba5ccc693c38995247f89f5511a17c4a 100644
@@ -91,42 +91,6 @@ ifeq ($(CONFIG_IP_MASQUERADE_MOD),y)
 
 endif
 
-ifeq ($(CONFIG_IP_MASQUERADE_VS),y)
-  IPV4X_OBJS += ip_vs.o
-  
-  ifeq ($(CONFIG_IP_MASQUERADE_VS_RR),y)
-  IPV4_OBJS += ip_vs_rr.o
-  else
-    ifeq ($(CONFIG_IP_MASQUERADE_VS_RR),m)
-    M_OBJS += ip_vs_rr.o
-    endif
-  endif
-  
-  ifeq ($(CONFIG_IP_MASQUERADE_VS_WRR),y)
-  IPV4_OBJS += ip_vs_wrr.o
-  else
-    ifeq ($(CONFIG_IP_MASQUERADE_VS_WRR),m)
-    M_OBJS += ip_vs_wrr.o
-    endif
-  endif
-  
-  ifeq ($(CONFIG_IP_MASQUERADE_VS_WLC),y)
-  IPV4_OBJS += ip_vs_wlc.o
-  else
-    ifeq ($(CONFIG_IP_MASQUERADE_VS_WLC),m)
-    M_OBJS += ip_vs_wlc.o
-    endif
-  endif
-
-  ifeq ($(CONFIG_IP_MASQUERADE_VS_PCC),y)
-  IPV4_OBJS += ip_vs_pcc.o
-  else
-    ifeq ($(CONFIG_IP_MASQUERADE_VS_PCC),m)
-    M_OBJS += ip_vs_pcc.o
-    endif
-  endif
-endif
-
 M_OBJS += ip_masq_user.o
 M_OBJS += ip_masq_ftp.o ip_masq_irc.o ip_masq_raudio.o ip_masq_quake.o
 M_OBJS += ip_masq_vdolive.o ip_masq_cuseeme.o
index 1646ee78a987ee8ed2e53cd903867c6dcfe1b339..27d2f80214b676f57767559371125579c95de224 100644
@@ -1,6 +1,6 @@
 /* linux/net/inet/arp.c
  *
- * Version:    $Id: arp.c,v 1.77.2.2 1999/08/13 18:26:03 davem Exp $
+ * Version:    $Id: arp.c,v 1.77.2.1 1999/06/28 10:39:23 davem Exp $
  *
  * Copyright (C) 1994 by Florian  La Roche
  *
@@ -65,8 +65,6 @@
  *                                     clean up the APFDDI & gen. FDDI bits.
  *             Alexey Kuznetsov:       new arp state machine;
  *                                     now it is in net/core/neighbour.c.
- *              Wensong Zhang   :       NOARP device (such as tunl) arp fix.
- *             Peter Kese      :       arp_solicit: saddr opt disabled for vs.
  */
 
 /* RFC1122 Status:
@@ -308,15 +306,9 @@ static void arp_solicit(struct neighbour *neigh, struct sk_buff *skb)
        u32 target = *(u32*)neigh->primary_key;
        int probes = neigh->probes;
 
-#if !defined(CONFIG_IP_MASQUERADE_VS)  /* Virtual server */ 
-       /* use default interface address as source address in virtual
-        * server environment. Otherways the saddr might be the virtual
-        * address and gateway's arp cache might start routing packets
-        * to the real server */
        if (skb && inet_addr_type(skb->nh.iph->saddr) == RTN_LOCAL)
                saddr = skb->nh.iph->saddr;
        else
-#endif
                saddr = inet_select_addr(dev, target, RT_SCOPE_LINK);
 
        if ((probes -= neigh->parms->ucast_probes) < 0) {
@@ -542,7 +534,6 @@ int arp_rcv(struct sk_buff *skb, struct device *dev, struct packet_type *pt)
        struct rtable *rt;
        unsigned char *sha, *tha;
        u32 sip, tip;
-       struct device *tdev;
        u16 dev_type = dev->type;
        int addr_type;
        struct in_device *in_dev = dev->ip_ptr;
@@ -638,13 +629,6 @@ int arp_rcv(struct sk_buff *skb, struct device *dev, struct packet_type *pt)
        if (LOOPBACK(tip) || MULTICAST(tip))
                goto out;
 
-/* 
- *      Check for the device flags for the target IP. If the IFF_NOARP
- *      is set, just delete it. No arp reply is sent.    -- WZ
- */ 
-       if ((tdev = ip_dev_find(tip)) && (tdev->flags & IFF_NOARP))
-               goto out;
-
 /*
  *  Process entry.  The idea here is we want to send a reply if it is a
  *  request for us or if it is a request for someone else that we hold
index 92b1078a44dd3e7358bfbe764bad2d66713a0da8..7a3e2618bd90b952f1302331868e463f5502a3d7 100644
@@ -5,7 +5,7 @@
  *
  *             The Internet Protocol (IP) module.
  *
- * Version:    $Id: ip_input.c,v 1.37.2.1 1999/08/13 18:26:08 davem Exp $
+ * Version:    $Id: ip_input.c,v 1.37 1999/04/22 10:38:36 davem Exp $
  *
  * Authors:    Ross Biro, <bir7@leland.Stanford.Edu>
  *             Fred N. van Kempen, <waltje@uWalt.NL.Mugnet.ORG>
@@ -266,15 +266,6 @@ int ip_local_deliver(struct sk_buff *skb)
                }
 
                ret = ip_fw_demasquerade(&skb);
-#ifdef CONFIG_IP_MASQUERADE_VS
-               if (ret == -3) {
-                       /* packet had been tunneled */
-                       return(0);
-               }
-               if (ret == -2) {
-                       return skb->dst->input(skb);
-               }
-#endif
                if (ret < 0) {
                        kfree_skb(skb);
                        return 0;
index 69b31496c8135610f71fe1f76d3c6cf3a58aca26..0187c58d7c5c2cd601b64d8b906332546180feac 100644
@@ -4,7 +4,7 @@
  *
  *     Copyright (c) 1994 Pauline Middelink
  *
- *     $Id: ip_masq.c,v 1.34.2.3 1999/08/13 18:26:15 davem Exp $
+ *     $Id: ip_masq.c,v 1.34.2.2 1999/08/07 10:56:28 davem Exp $
  *
  *
  *     See ip_fw.c for original log
@@ -47,8 +47,7 @@
  *     Kai Bankett             :       do not toss other IP protos in proto_doff()
  *     Dan Kegel               :       pointed correct NAT behavior for UDP streams
  *     Julian Anastasov        :       use daddr and dport as hash keys
- *     Wensong Zhang           :       Added virtual server support 
- *     Peter Kese              :       fixed TCP state handling for input-only
+ *     
  */
 
 #include <linux/config.h>
 #include <linux/ip_fw.h>
 #include <linux/ip_masq.h>
 
-#ifdef CONFIG_IP_MASQUERADE_VS
-#include <net/ip_vs.h>
-#endif /* CONFIG_IP_MASQUERADE_VS */
-
-#ifdef CONFIG_IP_MASQUERADE_VS
-/*
- * The following block implements slow timers, most code is stolen
- * from linux/kernel/sched.c
- */
-#define SHIFT_BITS 7
-#define TVN_BITS 11
-#define TVR_BITS 7
-#define TVN_SIZE (1 << TVN_BITS)
-#define TVR_SIZE (1 << TVR_BITS)
-#define TVN_MASK (TVN_SIZE - 1)
-#define TVR_MASK (TVR_SIZE - 1)
-
-struct sltimer_vec {
-        int index;
-        struct timer_list *vec[TVN_SIZE];
-};
-
-struct sltimer_vec_root {
-        int index;
-        struct timer_list *vec[TVR_SIZE];
-};
-
-static struct sltimer_vec sltv3 = { 0 };
-static struct sltimer_vec sltv2 = { 0 };
-static struct sltimer_vec_root sltv1 = { 0 };
-
-static struct sltimer_vec * const sltvecs[] = {
-       (struct sltimer_vec *)&sltv1, &sltv2, &sltv3
-};
-
-#define NOOF_SLTVECS (sizeof(sltvecs) / sizeof(sltvecs[0]))
-
-static unsigned long sltimer_jiffies = 0;
-
-static inline void insert_sltimer(struct timer_list *timer,
-                               struct timer_list **vec, int idx)
-{
-       if ((timer->next = vec[idx]))
-               vec[idx]->prev = timer;
-       vec[idx] = timer;
-       timer->prev = (struct timer_list *)&vec[idx];
-}
-
-static inline void internal_add_sltimer(struct timer_list *timer)
-{
-       /*
-        * must be cli-ed when calling this
-        */
-       unsigned long expires = timer->expires;
-       unsigned long idx = (expires - sltimer_jiffies) >> SHIFT_BITS;
-
-       if (idx < TVR_SIZE) {
-               int i = (expires >> SHIFT_BITS) & TVR_MASK;
-               insert_sltimer(timer, sltv1.vec, i);
-       } else if (idx < 1 << (TVR_BITS + TVN_BITS)) {
-               int i = (expires >> (SHIFT_BITS+TVR_BITS)) & TVN_MASK;
-               insert_sltimer(timer, sltv2.vec, i);
-       } else if ((signed long) idx < 0) {
-               /* can happen if you add a timer with expires == jiffies,
-                * or you set a timer to go off in the past
-                */
-               insert_sltimer(timer, sltv1.vec, sltv1.index);
-       } else if (idx <= 0xffffffffUL) {
-               int i = (expires >> (SHIFT_BITS+TVR_BITS+TVN_BITS)) & TVN_MASK;
-               insert_sltimer(timer, sltv3.vec, i);
-       } else {
-               /* Can only get here on architectures with 64-bit jiffies */
-               timer->next = timer->prev = timer;
-       }
-}
-
-rwlock_t  sltimerlist_lock = RW_LOCK_UNLOCKED;
-
-void add_sltimer(struct timer_list *timer)
-{
-       write_lock(&sltimerlist_lock);
-       if (timer->prev)
-               goto bug;
-       internal_add_sltimer(timer);
-out:
-       write_unlock(&sltimerlist_lock);
-       return;
-
-bug:
-       printk("bug: kernel sltimer added twice at %p.\n",
-                       __builtin_return_address(0));
-       goto out;
-}
-
-static inline int detach_sltimer(struct timer_list *timer)
-{
-       struct timer_list *prev = timer->prev;
-       if (prev) {
-               struct timer_list *next = timer->next;
-               prev->next = next;
-               if (next)
-                       next->prev = prev;
-               return 1;
-       }
-       return 0;
-}
-
-void mod_sltimer(struct timer_list *timer, unsigned long expires)
-{
-       write_lock(&sltimerlist_lock);
-       timer->expires = expires;
-       detach_sltimer(timer);
-       internal_add_sltimer(timer);
-       write_unlock(&sltimerlist_lock);
-}
-
-int del_sltimer(struct timer_list * timer)
-{
-       int ret;
-
-       write_lock(&sltimerlist_lock);
-       ret = detach_sltimer(timer);
-       timer->next = timer->prev = 0;
-       write_unlock(&sltimerlist_lock);
-       return ret;
-}
-
-
-static inline void cascade_sltimers(struct sltimer_vec *tv)
-{
-        /* cascade all the timers from tv up one level */
-        struct timer_list *timer;
-        timer = tv->vec[tv->index];
-        /*
-         * We are removing _all_ timers from the list, so we don't  have to
-         * detach them individually, just clear the list afterwards.
-         */
-        while (timer) {
-                struct timer_list *tmp = timer;
-                timer = timer->next;
-                internal_add_sltimer(tmp);
-        }
-        tv->vec[tv->index] = NULL;
-        tv->index = (tv->index + 1) & TVN_MASK;
-}
-
-static inline void run_sltimer_list(void)
-{
-       write_lock(&sltimerlist_lock);
-       while ((long)(jiffies - sltimer_jiffies) >= 0) {
-               struct timer_list *timer;
-               if (!sltv1.index) {
-                       int n = 1;
-                       do {
-                               cascade_sltimers(sltvecs[n]);
-                       } while (sltvecs[n]->index == 1 && ++n < NOOF_SLTVECS);
-               }
-               while ((timer = sltv1.vec[sltv1.index])) {
-                       void (*fn)(unsigned long) = timer->function;
-                       unsigned long data = timer->data;
-                       detach_sltimer(timer);
-                       timer->next = timer->prev = NULL;
-                       write_unlock(&sltimerlist_lock);
-                       fn(data);
-                       write_lock(&sltimerlist_lock);
-               }
-               sltimer_jiffies += 1<<SHIFT_BITS; 
-               sltv1.index = (sltv1.index + 1) & TVR_MASK;
-       }
-       write_unlock(&sltimerlist_lock);
-}
-
-static void sltimer_handler(unsigned long data);
-
-struct timer_list       slow_timer = {
-        NULL, NULL,
-        0, 0,
-        sltimer_handler,
-};
-
-#define SLTIMER_PERIOD       1*HZ
-
-void sltimer_handler(unsigned long data)
-{
-        run_sltimer_list();
-        mod_timer(&slow_timer, (jiffies + SLTIMER_PERIOD));
-}
-
-#endif /* CONFIG_IP_MASQUERADE_VS */
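The block deleted above is a three-level timer wheel (its comment notes the code was adapted from linux/kernel/sched.c): an expiry delta is coarsened by SHIFT_BITS and the resulting index picks one of the sltv1/sltv2/sltv3 vectors. A small sketch of just that level-selection arithmetic, using the constants from the deleted code (the `sltimer_level()` helper itself is hypothetical, for illustration):

```c
#include <assert.h>

#define SHIFT_BITS 7
#define TVN_BITS 11
#define TVR_BITS 7
#define TVR_SIZE (1 << TVR_BITS)

/* Hypothetical sketch: which wheel level (1..3) a timer with the given
 * delay in jiffies lands in. Delays are coarsened to 128-jiffy
 * (1 << SHIFT_BITS) granularity; level 1 holds the next TVR_SIZE
 * coarse slots, level 2 the next TVR_SIZE << TVN_BITS, and level 3
 * everything beyond that. */
static int sltimer_level(unsigned long delta_jiffies)
{
	unsigned long idx = delta_jiffies >> SHIFT_BITS;

	if (idx < TVR_SIZE)
		return 1;
	if (idx < (1UL << (TVR_BITS + TVN_BITS)))
		return 2;
	return 3;
}
```

As the root vector's index advances, `cascade_sltimers()` in the deleted code re-inserts each higher-level bucket's timers, which redistributes them into the finer-grained levels below.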
-
 int sysctl_ip_masq_debug = 0;
 
 /*
@@ -362,38 +171,29 @@ struct masq_tcp_states_t masq_tcp_states [] = {
 /*fin*/        {{mTW, mFW, mSS, mTW, mFW, mTW, mCL, mTW, mLA, mLI }},
 /*ack*/        {{mES, mES, mSS, mSR, mFW, mTW, mCL, mCW, mLA, mES }},
 /*rst*/ {{mCL, mCL, mSS, mCL, mCL, mTW, mCL, mCL, mCL, mCL }},
-
-#ifdef CONFIG_IP_MASQUERADE_VS
-/*     INPUT-ONLY */
-/*       mNO, mES, mSS, mSR, mFW, mTW, mCL, mCW, mLA, mLI      */
-/*syn*/        {{mES, mES, mES, mSR, mES, mSR, mSR, mSR, mSR, mSR }},
-/*fin*/        {{mCL, mFW, mSS, mTW, mFW, mTW, mCL, mCW, mLA, mLI }},
-/*ack*/        {{mCL, mES, mSS, mSR, mFW, mTW, mCL, mCW, mCL, mLI }},
-/*rst*/ {{mCL, mCL, mCL, mSR, mCL, mCL, mCL, mCL, mLA, mLI }},
-#endif
 };
 
-#define MASQ_STATE_INPUT       0
-#define MASQ_STATE_OUTPUT      4
-#define MASQ_STATE_INPUT_ONLY  8
-
-static __inline__ int masq_tcp_state_idx(struct tcphdr *th, int state_off) 
+static __inline__ int masq_tcp_state_idx(struct tcphdr *th, int output) 
 {
        /*
-        *      [0-3]: input states, [4-7]: output, [8-11] input only states.
+        *      [0-3]: input states, [4-7]: output.
         */
+       if (output) 
+               output=4;
+
        if (th->rst)
-               return state_off+3;
+               return output+3;
        if (th->syn)
-               return state_off+0;
+               return output+0;
        if (th->fin)
-               return state_off+1;
+               return output+1;
        if (th->ack)
-               return state_off+2;
+               return output+2;
        return -1;
 }
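The reverted `masq_tcp_state_idx()` above maps TCP header flags plus a direction bit to a row of the 8-row state table: rows 0-3 for input, 4-7 for output, with RST taking precedence over SYN, SYN over FIN, and FIN over ACK. A standalone sketch of the same row computation (hypothetical helper, for illustration only):

```c
#include <assert.h>

/* Hypothetical sketch of the row computation in the patched
 * masq_tcp_state_idx(): flag class (syn=0, fin=1, ack=2, rst=3)
 * plus a direction offset of 4 for output packets. Returns -1 when
 * none of the tracked flags is set. */
static int tcp_state_row(int syn, int fin, int ack, int rst, int output)
{
	int off = output ? 4 : 0;

	if (rst)
		return off + 3;
	if (syn)
		return off + 0;
	if (fin)
		return off + 1;
	if (ack)
		return off + 2;
	return -1;
}
```

So an input SYN selects row 0, an output FIN row 5, and a packet carrying both SYN and RST resolves to the RST row, matching the precedence of the `if` chain in the patch.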
 
 
+
 static int masq_set_state_timeout(struct ip_masq *ms, int state)
 {
        struct ip_masq_timeout_table *mstim = ms->timeout_table;
@@ -416,24 +216,14 @@ static int masq_set_state_timeout(struct ip_masq *ms, int state)
        return state;
 }
 
-static int masq_tcp_state(struct ip_masq *ms, int state_off, struct tcphdr *th)
+static int masq_tcp_state(struct ip_masq *ms, int output, struct tcphdr *th)
 {
        int state_idx;
        int new_state = IP_MASQ_S_CLOSE;
 
-#ifdef CONFIG_IP_MASQUERADE_VS
-       /* update state offset to INPUT_ONLY if necessary */
-       /* or delete NO_OUTPUT flag if output packet detected */
-       if (ms->flags & IP_MASQ_F_VS_NO_OUTPUT) {
-               if (state_off == MASQ_STATE_OUTPUT)
-                       ms->flags &= ~IP_MASQ_F_VS_NO_OUTPUT;
-               else state_off = MASQ_STATE_INPUT_ONLY;
-       } 
-#endif
-
-       if ((state_idx = masq_tcp_state_idx(th, state_off)) < 0) {
+       if ((state_idx = masq_tcp_state_idx(th, output)) < 0) {
                IP_MASQ_DEBUG(1, "masq_state_idx(%d)=%d!!!\n", 
-                       state_off, state_idx);
+                       output, state_idx);
                goto tcp_state_out;
        }
 
@@ -443,7 +233,7 @@ tcp_state_out:
        if (new_state!=ms->state)
                IP_MASQ_DEBUG(1, "%s %s [%c%c%c%c] %08lX:%04X-%08lX:%04X state: %s->%s\n",
                                masq_proto_name(ms->protocol),
-                               (state_off==MASQ_STATE_OUTPUT) ? "output " : "input ",
+                               output? "output" : "input ",
                                th->syn? 'S' : '.',
                                th->fin? 'F' : '.',
                                th->ack? 'A' : '.',
@@ -452,14 +242,6 @@ tcp_state_out:
                                ntohl(ms->daddr), ntohs(ms->dport),
                                ip_masq_state_name(ms->state),
                                ip_masq_state_name(new_state));
-
-#ifdef CONFIG_IP_MASQUERADE_VS
-       if (th->fin && (ms->state == IP_MASQ_S_ESTABLISHED)
-            && (ms->flags & IP_MASQ_F_VS) && !(ms->flags & IP_MASQ_F_VS_FIN)) {
-               ip_vs_fin_masq(ms);
-       }
-#endif /* CONFIG_IP_MASQUERADE_VS */
-
        return masq_set_state_timeout(ms, new_state);
 }
 
@@ -467,7 +249,7 @@ tcp_state_out:
 /*
  *     Handle state transitions
  */
-static int masq_set_state(struct ip_masq *ms, int state_off, struct iphdr *iph, void *tp)
+static int masq_set_state(struct ip_masq *ms, int output, struct iphdr *iph, void *tp)
 {
        switch (iph->protocol) {
                case IPPROTO_ICMP:
@@ -475,7 +257,7 @@ static int masq_set_state(struct ip_masq *ms, int state_off, struct iphdr *iph,
                case IPPROTO_UDP:
                        return masq_set_state_timeout(ms, IP_MASQ_S_UDP);
                case IPPROTO_TCP:
-                       return masq_tcp_state(ms, state_off, tp);
+                       return masq_tcp_state(ms, output, tp);
        }
        return -1;
 }
@@ -574,9 +356,6 @@ atomic_t mport_count = ATOMIC_INIT(0);
 
 EXPORT_SYMBOL(ip_masq_get_debug_level);
 EXPORT_SYMBOL(ip_masq_new);
-#ifdef CONFIG_IP_MASQUERADE_VS
-EXPORT_SYMBOL(ip_masq_new_vs);
-#endif /* CONFIG_IP_MASQUERADE_VS */
 EXPORT_SYMBOL(ip_masq_listen);
 EXPORT_SYMBOL(ip_masq_free_ports);
 EXPORT_SYMBOL(ip_masq_out_get);
@@ -599,6 +378,8 @@ EXPORT_SYMBOL(ip_masq_d_table);
  *       1 for extra modules support (daddr)
  */
   
+#define IP_MASQ_NTABLES 3
+
 struct list_head ip_masq_m_table[IP_MASQ_TAB_SIZE];
 struct list_head ip_masq_s_table[IP_MASQ_TAB_SIZE];
 struct list_head ip_masq_d_table[IP_MASQ_TAB_SIZE];
@@ -643,17 +424,9 @@ static void __ip_masq_set_expire(struct ip_masq *ms, unsigned long tout)
 {
         if (tout) {
                 ms->timer.expires = jiffies+tout;
-#ifdef CONFIG_IP_MASQUERADE_VS
-                add_sltimer(&ms->timer);
-#else
                 add_timer(&ms->timer);
-#endif
         } else {
-#ifdef CONFIG_IP_MASQUERADE_VS
-                del_sltimer(&ms->timer);
-#else
                 del_timer(&ms->timer);
-#endif
         }
 }
 
@@ -969,10 +742,6 @@ struct ip_masq * ip_masq_out_get(int protocol, __u32 s_addr, __u16 s_port, __u32
        struct ip_masq *ms;
 
        read_lock(&__ip_masq_lock);
-#ifdef CONFIG_IP_MASQUERADE_VS
-        ms = ip_vs_out_get(protocol, s_addr, s_port, d_addr, d_port);
-        if (ms == NULL)
-#endif /* CONFIG_IP_MASQUERADE_VS */
        ms = __ip_masq_out_get(protocol, s_addr, s_port, d_addr, d_port);
        read_unlock(&__ip_masq_lock);
 
@@ -986,10 +755,6 @@ struct ip_masq * ip_masq_in_get(int protocol, __u32 s_addr, __u16 s_port, __u32
        struct ip_masq *ms;
 
        read_lock(&__ip_masq_lock);
-#ifdef CONFIG_IP_MASQUERADE_VS
-        ms = ip_vs_in_get(protocol, s_addr, s_port, d_addr, d_port);
-        if (ms == NULL)
-#endif /* CONFIG_IP_MASQUERADE_VS */
        ms =  __ip_masq_in_get(protocol, s_addr, s_port, d_addr, d_port);
        read_unlock(&__ip_masq_lock);
 
@@ -1060,14 +825,6 @@ static void masq_expire(unsigned long data)
        if (ms->control) 
                ip_masq_control_del(ms);
 
-#ifdef CONFIG_IP_MASQUERADE_VS
-        if (ms->flags & IP_MASQ_F_VS) {
-                if (ip_vs_unhash(ms)) {
-                        ip_vs_unbind_masq(ms);
-                }
-        }
-        else
-#endif /* CONFIG_IP_MASQUERADE_VS */
         if (ip_masq_unhash(ms)) {
                if (ms->flags&IP_MASQ_F_MPORT) {
                        atomic_dec(&mport_count);
@@ -1304,73 +1061,6 @@ mport_nono:
         return NULL;
 }
 
-
-#ifdef CONFIG_IP_MASQUERADE_VS
-/*
- *  Create a new masquerade entry for IPVS, all parameters {maddr,
- *  mport, saddr, sport, daddr, dport, mflags} are known. No need
- *  to allocate a free mport. And, hash it into the ip_vs_table.
- *
- *  Be careful, it can be called from u-space
- */
-
-struct ip_masq * ip_masq_new_vs(int proto, __u32 maddr, __u16 mport, __u32 saddr, __u16 sport, __u32 daddr, __u16 dport, unsigned mflags)
-{
-        struct ip_masq *ms;
-        static int n_fails = 0;
-       int prio;
-
-       prio = (mflags&IP_MASQ_F_USER) ? GFP_KERNEL : GFP_ATOMIC;
-
-        ms = (struct ip_masq *) kmalloc(sizeof(struct ip_masq), prio);
-        if (ms == NULL) {
-                if (++n_fails < 5)
-                        IP_VS_ERR("ip_masq_new_vs(proto=%s): no memory available.\n",
-                                  masq_proto_name(proto));
-                return NULL;
-        }
-       MOD_INC_USE_COUNT;
-        memset(ms, 0, sizeof(*ms));
-       init_timer(&ms->timer);
-       ms->timer.data     = (unsigned long)ms;
-       ms->timer.function = masq_expire;
-        ms->protocol      = proto;
-        ms->saddr         = saddr;
-        ms->sport         = sport;
-        ms->daddr         = daddr;
-        ms->dport         = dport;
-        ms->maddr          = maddr;
-        ms->mport          = mport;
-        ms->flags         = mflags;
-        ms->app_data      = NULL;
-        ms->control       = NULL;
-       
-       atomic_set(&ms->n_control,0);
-       atomic_set(&ms->refcnt,0);
-
-        if (mflags & IP_MASQ_F_USER)   
-                write_lock_bh(&__ip_masq_lock);
-        else 
-                write_lock(&__ip_masq_lock);
-
-        /*
-         *  Hash it in the ip_vs_table
-         */
-        ip_vs_hash(ms);
-
-        if (mflags & IP_MASQ_F_USER)   
-                write_unlock_bh(&__ip_masq_lock);
-        else 
-                write_unlock(&__ip_masq_lock);
-
-        /*  ip_masq_bind_app(ms); */
-        atomic_inc(&ms->refcnt);
-        masq_set_state_timeout(ms, IP_MASQ_S_NONE);
-        return ms;
-}
-#endif /* CONFIG_IP_MASQUERADE_VS */
-
-
 /*
  *     Get transport protocol data offset, check against size
  *     return:
@@ -1413,7 +1103,6 @@ static __inline__ int proto_doff(unsigned proto, char *th, unsigned size)
        return ret;
 }
 
-
 int ip_fw_masquerade(struct sk_buff **skb_p, __u32 maddr)
 {
        struct sk_buff  *skb = *skb_p;
@@ -1668,11 +1357,11 @@ int ip_fw_masquerade(struct sk_buff **skb_p, __u32 maddr)
        IP_MASQ_DEBUG(2, "O-routed from %08lX:%04X with masq.addr %08lX\n",
                ntohl(ms->maddr),ntohs(ms->mport),ntohl(maddr));
 
-       masq_set_state(ms, MASQ_STATE_OUTPUT, iph, h.portp);
+       masq_set_state(ms, 1, iph, h.portp);
        ip_masq_put(ms);
 
        return 0;
-}
+ }
 
 /*
  *     Restore original addresses and ports in the original IP
@@ -1831,7 +1520,7 @@ int ip_fw_masq_icmp(struct sk_buff **skb_p, __u32 maddr)
                       ntohs(icmp_id(icmph)),
                       icmph->type);
 
-               masq_set_state(ms, MASQ_STATE_OUTPUT, iph, icmph);
+               masq_set_state(ms, 1, iph, icmph);
                ip_masq_put(ms);
 
                return 1;
@@ -2077,7 +1766,7 @@ int ip_fw_demasq_icmp(struct sk_buff **skb_p)
                       ntohs(icmp_id(icmph)),
                       icmph->type);
 
-               masq_set_state(ms, MASQ_STATE_INPUT, iph, icmph);
+               masq_set_state(ms, 0, iph, icmph);
                ip_masq_put(ms);
 
                return 1;
@@ -2295,19 +1984,13 @@ int ip_fw_demasquerade(struct sk_buff **skb_p)
                return(ip_fw_demasq_icmp(skb_p));
        case IPPROTO_TCP:
        case IPPROTO_UDP:
-               /*
+               /* 
                 *      Make sure packet is in the masq range 
                 *      ... or some mod-ule relaxes input range
                 *      ... or there is still some `special' mport opened
                 */
-#ifdef CONFIG_IP_MASQUERADE_VS
-               ms = ip_masq_in_get_iph(iph);
-               if ((ms == NULL)
-                    && (ip_vs_lookup_service(maddr, h.portp[1], iph->protocol) == NULL)
-#else
                if ((ntohs(h.portp[1]) < PORT_MASQ_BEGIN
                                || ntohs(h.portp[1]) > PORT_MASQ_END)
-#endif /* CONFIG_IP_MASQUERADE_VS */
 #ifdef CONFIG_IP_MASQUERADE_MOD
                                && (ip_masq_mod_in_rule(skb, iph) != 1) 
 #endif
@@ -2349,6 +2032,8 @@ int ip_fw_demasquerade(struct sk_buff **skb_p)
                return 0;
        }
 
+
+
        IP_MASQ_DEBUG(2, "Incoming %s %08lX:%04X -> %08lX:%04X\n",
                masq_proto_name(iph->protocol),
                ntohl(iph->saddr), ntohs(h.portp[0]),
@@ -2357,9 +2042,8 @@ int ip_fw_demasquerade(struct sk_buff **skb_p)
        /*
         * reroute to original host:port if found...
          */
-#ifndef CONFIG_IP_MASQUERADE_VS
+
         ms = ip_masq_in_get_iph(iph);
-#endif 
 
        /*
         *      Give additional modules a chance to create an entry
@@ -2374,19 +2058,10 @@ int ip_fw_demasquerade(struct sk_buff **skb_p)
        ip_masq_mod_in_update(skb, iph, ms);
 #endif
 
-#ifdef CONFIG_IP_MASQUERADE_VS
-       if (!ms && (h.th->syn || (iph->protocol!=IPPROTO_TCP))) {
-               /* 
-                * Let the virtual server select a real server
-                * for the incoming connection, and create a
-                * masquerading entry.
-                */ 
-               ms = ip_vs_schedule(iph->daddr,h.portp[1],iph->protocol,iph);
-       }
-#endif /* CONFIG_IP_MASQUERADE_VS */
 
         if (ms != NULL)
         {
+
                 /*
                  *     got reply, so clear flag
                  */
@@ -2435,65 +2110,13 @@ int ip_fw_demasquerade(struct sk_buff **skb_p)
 
                 }
                }
-
                if ((skb=masq_skb_cow(skb_p, &iph, &h.raw)) == NULL) {
                        ip_masq_put(ms);
                        return -1;
                }
-
-#ifdef CONFIG_IP_MASQUERADE_VS
-               if (IP_MASQ_VS_FWD(ms) != 0) {
-                        int ret = 0;
-                        
-                        /*
-                         *    Return values mean:
-                         *      -1    skb must be released
-                         *      -2    call skb->dst->input(skb) to release skb
-                         *      -3    skb has been released
-                         */
-                        switch (IP_MASQ_VS_FWD(ms)) {
-                          case IP_MASQ_F_VS_TUNNEL:
-                                if (ip_vs_tunnel_xmit(skb_p, ms->saddr) == 0) {
-                                        IP_VS_DBG("tunneling error.\n");
-                                } else {
-                                        IP_VS_DBG("tunneling succeeded.\n");
-                                }
-                                ret = -3;
-                                break;
-                                
-                          case IP_MASQ_F_VS_DROUTE:
-                                dst_release(skb->dst);
-                                skb->dst = NULL;
-                                ip_send_check(iph);
-                                if (ip_route_input(skb, ms->saddr, iph->saddr,
-                                                   iph->tos, skb->dev)) {
-                                        IP_VS_DBG("direct routing error.\n");
-                                        ret = -1;
-                                } else {
-                                        IP_VS_DBG("direct routing succeeded.\n");
-                                        ret = -2;
-                                }
-                                break;
-                                
-                          case IP_MASQ_F_VS_LOCALNODE:
-                                ret = 0;
-                        }
-                        
-                        /*
-                         *    Set state of masq entry
-                         */
-                        masq_set_state (ms, MASQ_STATE_INPUT, iph, h.portp);
-                        ip_masq_put(ms);
-
-                        return ret;
-               }
-                IP_VS_DBG("masquerading packet...\n");
-#endif /* CONFIG_IP_MASQUERADE_VS */
-                
                 iph->daddr = ms->saddr;
                 h.portp[1] = ms->sport;
-                
+
                /*
                 *      Invalidate csum saving if tunnel has masq helper
                 */
@@ -2550,11 +2173,11 @@ int ip_fw_demasquerade(struct sk_buff **skb_p)
                                        h.uh->check = 0xFFFF;
                                break;
                }
-               ip_send_check(iph);
+                ip_send_check(iph);
 
                 IP_MASQ_DEBUG(2, "I-routed to %08lX:%04X\n",ntohl(iph->daddr),ntohs(h.portp[1]));
 
-               masq_set_state (ms, MASQ_STATE_INPUT, iph, h.portp);
+               masq_set_state (ms, 0, iph, h.portp);
                ip_masq_put(ms);
 
                 return 1;
@@ -2669,49 +2292,7 @@ static int ip_msqhst_procinfo(char *buffer, char **start, off_t offset,
                len += sprintf(buffer+len, "%-127s\n", temp);
 
                if(len >= length) {
-                       read_unlock_bh(&__ip_masq_lock);
-                       goto done;
-               }
-        }
-       read_unlock_bh(&__ip_masq_lock);
 
-       }
-
-#ifdef CONFIG_IP_MASQUERADE_VS
-        for(idx = 0; idx < IP_VS_TAB_SIZE; idx++) 
-       {
-       /*
-        *      Lock is actually only needed in the next loop;
-        *      we are called from userspace: must stop bh.
-        */
-       read_lock_bh(&__ip_masq_lock);
-
-       l = &ip_vs_table[idx];
-       for (e=l->next; e!=l; e=e->next) {
-               ms = list_entry(e, struct ip_masq, m_list);
-               pos += 128;
-               if (pos <= offset) {
-                       len = 0;
-                       continue;
-               }
-
-               /*
-                *      We have locked the tables, no need to del/add timers
-                *      nor cli()  8)
-                */
-
-               sprintf(temp,"%s %08lX:%04X %08lX:%04X %04X %08X %6d %6d %7lu",
-                       masq_proto_name(ms->protocol),
-                       ntohl(ms->saddr), ntohs(ms->sport),
-                       ntohl(ms->daddr), ntohs(ms->dport),
-                       ntohs(ms->mport),
-                       ms->out_seq.init_seq,
-                       ms->out_seq.delta,
-                       ms->out_seq.previous_delta,
-                       ms->timer.expires-jiffies);
-               len += sprintf(buffer+len, "%-127s\n", temp);
-
-               if(len >= length) {
                        read_unlock_bh(&__ip_masq_lock);
                        goto done;
                }
@@ -2719,9 +2300,9 @@ static int ip_msqhst_procinfo(char *buffer, char **start, off_t offset,
        read_unlock_bh(&__ip_masq_lock);
 
        }
-#endif /* CONFIG_IP_MASQUERADE_VS */
-
 done:
+
+
        begin = len - (pos - offset);
        *start = buffer + begin;
        len -= begin;
@@ -2828,11 +2409,6 @@ int ip_masq_uctl(int optname, char * optval , int optlen)
                case IP_MASQ_TARGET_MOD:
                        ret = ip_masq_mod_ctl(optname, &masq_ctl, optlen);
                        break;
-#endif
-#ifdef CONFIG_IP_MASQUERADE_VS
-               case IP_MASQ_TARGET_VS:
-                       ret = ip_vs_ctl(optname, &masq_ctl, optlen);
-                       break;
 #endif
        }
 
@@ -2953,7 +2529,7 @@ __initfunc(int ip_masq_init(void))
                (char *) IPPROTO_ICMP,
                ip_masq_user_info
        });
-#endif /* CONFIG_PROC_FS */
+#endif 
 #ifdef CONFIG_IP_MASQUERADE_IPAUTOFW
        ip_autofw_init();
 #endif
@@ -2962,11 +2538,6 @@ __initfunc(int ip_masq_init(void))
 #endif
 #ifdef CONFIG_IP_MASQUERADE_MFW
        ip_mfw_init();
-#endif
-#ifdef CONFIG_IP_MASQUERADE_VS
-        ip_vs_init();
-        slow_timer.expires = jiffies+SLTIMER_PERIOD;
-        add_timer(&slow_timer);
 #endif
         ip_masq_app_init();
 
index 3030144165f5481d29a58e2de58a0d6a11cc9554..d2a1729c5386a5ff4af8785d8d47620d55c77b0d 100644 (file)
@@ -2,7 +2,7 @@
  *             IP_MASQ_AUTOFW auto forwarding module
  *
  *
- *     $Id: ip_masq_autofw.c,v 1.3.2.1 1999/08/13 18:26:20 davem Exp $
+ *     $Id: ip_masq_autofw.c,v 1.3 1998/08/29 23:51:10 davem Exp $
  *
  * Author:     Richard Lynch
  *
@@ -179,13 +179,13 @@ static __inline__ int ip_autofw_add(struct ip_autofw_user * af)
 {
        struct ip_autofw * newaf;
        newaf = kmalloc( sizeof(struct ip_autofw), GFP_KERNEL );
+       init_timer(&newaf->timer);
        if ( newaf == NULL ) 
        {
                printk("ip_autofw_add:  malloc said no\n");
                return( ENOMEM );
        }
 
-       init_timer(&newaf->timer);
        MOD_INC_USE_COUNT;
 
        memcpy(newaf, af, sizeof(struct ip_autofw_user));
index 473d0c8e615ec23dbedd5bf408a2365526267cb8..60c7797065f9ce5d1a40d9e7e1b8ea011a5296eb 100644 (file)
@@ -3,7 +3,7 @@
  *
  *     Does (reverse-masq) forwarding based on skb->fwmark value
  *
- *     $Id: ip_masq_mfw.c,v 1.3.2.2 1999/08/13 18:26:26 davem Exp $
+ *     $Id: ip_masq_mfw.c,v 1.3.2.1 1999/07/02 10:10:03 davem Exp $
  *
  * Author:     Juan Jose Ciarlante   <jjciarla@raiz.uncu.edu.ar>
  *               based on Steven Clarke's portfw
@@ -216,7 +216,6 @@ static int mfw_delhost(struct ip_masq_mfw *mfw, struct ip_mfw_user *mu)
                        (!mu->rport || h->port == mu->rport)) {
                        /* HIT */
                        atomic_dec(&mfw->nhosts);
-                       e = h->list.prev;
                        list_del(&h->list);
                        kfree_s(h, sizeof(*h));
                        MOD_DEC_USE_COUNT;
index c4b1ef4c88e0e0511599ef8b4d25779f3130c791..6c697a1029bc8abca9b74f615e1a45567e860731 100644 (file)
@@ -2,7 +2,7 @@
  *             IP_MASQ_PORTFW masquerading module
  *
  *
- *     $Id: ip_masq_portfw.c,v 1.3.2.2 1999/08/13 18:26:29 davem Exp $
+ *     $Id: ip_masq_portfw.c,v 1.3.2.1 1999/07/02 10:10:02 davem Exp $
  *
  * Author:     Steven Clarke <steven.clarke@monmouth.demon.co.uk>
  *
@@ -85,8 +85,7 @@ static __inline__ int ip_portfw_del(__u16 protocol, __u16 lport, __u32 laddr, __
                                (!laddr || n->laddr == laddr) &&
                                (!raddr || n->raddr == raddr) && 
                                (!rport || n->rport == rport)) {
-                       entry = n->list.prev;
-                       list_del(&n->list);
+                       list_del(entry);
                        ip_masq_mod_dec_nent(mmod_self);
                        kfree_s(n, sizeof(struct ip_portfw));
                        MOD_DEC_USE_COUNT;
@@ -423,6 +422,8 @@ static struct ip_masq * portfw_in_create(const struct sk_buff *skb, const struct
                                raddr, rport,
                                iph->saddr, portp[0],
                                0);
+               ip_masq_listen(ms);
+
                if (!ms || atomic_read(&mmod_self->mmod_nent) <= 1 
                        /* || ip_masq_nlocks(&portfw_lock) != 1 */ )
                                /*
@@ -430,8 +431,6 @@ static struct ip_masq * portfw_in_create(const struct sk_buff *skb, const struct
                                 */
                                goto out;
 
-               ip_masq_listen(ms);
-
                /*
                 *      Entry created, lock==1.
                 *      if pref_cnt == 0, move
index f369f03ddee6c7679efe7cf22186822b1cb32fd5..5129744195f52a8b306ea6431aab2ec976c5a6c2 100644 (file)
@@ -2,7 +2,7 @@
  *     IP_MASQ_USER user space control module
  *
  *
- *     $Id: ip_masq_user.c,v 1.1.2.2 1999/08/13 18:26:33 davem Exp $
+ *     $Id: ip_masq_user.c,v 1.1.2.1 1999/08/07 10:56:33 davem Exp $
  */
 
 #include <linux/config.h>
diff --git a/net/ipv4/ip_vs.c b/net/ipv4/ip_vs.c
deleted file mode 100644 (file)
index 9e4973d..0000000
+++ /dev/null
@@ -1,1297 +0,0 @@
-/*
- * IPVS         An implementation of the IP virtual server support for the
- *              LINUX operating system.  IPVS is now implemented as a part
- *              of IP masquerading code. IPVS can be used to build a
- *              high-performance and highly available server based on a
- *              cluster of servers.
- *
- * Version:     $Id: ip_vs.c,v 1.1.2.1 1999/08/13 18:25:27 davem Exp $
- *
- * Authors:     Wensong Zhang <wensong@iinchina.net>
- *              Peter Kese <peter.kese@ijs.si>
- *
- *              This program is free software; you can redistribute it and/or
- *              modify it under the terms of the GNU General Public License
- *              as published by the Free Software Foundation; either version
- *              2 of the License, or (at your option) any later version.
- *
- * Changes:
- *     Wensong Zhang            :     fixed the overflow bug in ip_vs_procinfo
- *     Wensong Zhang            :     added editing dest and service functions
- *     Wensong Zhang            :     changed name of some functions
- *     Wensong Zhang            :     fixed the unlocking bug in ip_vs_del_dest
- *     Wensong Zhang            :     added a separate hash table for IPVS
- *
- */
-
-#include <linux/config.h>
-#include <linux/module.h>
-#include <linux/types.h>
-#include <linux/kernel.h>
-#include <linux/errno.h>
-#include <net/ip_masq.h>
-
-#include <linux/sysctl.h>
-#include <linux/ip_fw.h>
-#include <linux/ip_masq.h>
-#include <linux/proc_fs.h>
-
-#include <linux/inetdevice.h>
-#include <linux/ip.h>
-#include <net/icmp.h>
-#include <net/ip.h>
-#include <net/route.h>
-
-#include <net/ip_masq.h>
-#include <net/ip_vs.h>
-
-#ifdef CONFIG_KMOD
-#include <linux/kmod.h>
-#endif
-
-EXPORT_SYMBOL(register_ip_vs_scheduler);
-EXPORT_SYMBOL(unregister_ip_vs_scheduler);
-EXPORT_SYMBOL(ip_vs_bind_masq);
-EXPORT_SYMBOL(ip_vs_unbind_masq);
-
-/*
- *  Lock for IPVS
- */
-rwlock_t __ip_vs_lock = RW_LOCK_UNLOCKED;
-
-/*
- *  Hash table: for input and output packets lookups of IPVS
- */
-struct list_head ip_vs_table[IP_VS_TAB_SIZE];
-
-/*
- * virtual server list and schedulers
- */
-static struct ip_vs_service *service_list[2] = {NULL,NULL};
-static struct ip_vs_scheduler *schedulers = NULL;
-
-
-/*
- *  Register a scheduler in the scheduler list
- */
-int register_ip_vs_scheduler(struct ip_vs_scheduler *scheduler)
-{
-       if (!scheduler) {
-               IP_VS_ERR("register_ip_vs_scheduler(): NULL arg\n");
-               return -EINVAL;
-       }
-
-        if (!scheduler->name) {
-               IP_MASQ_ERR("register_ip_vs_scheduler(): NULL scheduler_name\n");
-               return -EINVAL;
-       }
-
-       if (scheduler->next) {
-               IP_VS_ERR("register_ip_vs_scheduler(): scheduler already linked\n");
-               return -EINVAL;
-       }
-       
-       scheduler->next = schedulers;
-       schedulers = scheduler;
-
-       return 0;
-}
-
-
-/*
- *  Unregister a scheduler in the scheduler list
- */
-int unregister_ip_vs_scheduler(struct ip_vs_scheduler *scheduler)
-{
-       struct ip_vs_scheduler **psched;
-
-       if (!scheduler) {
-               IP_MASQ_ERR( "unregister_ip_vs_scheduler(): NULL arg\n");
-               return -EINVAL;
-       }
-
-       /*
-        *      Only allow unregistration if it is not referenced
-        */
-       if (atomic_read(&scheduler->refcnt))  {
-               IP_MASQ_ERR( "unregister_ip_vs_scheduler(): is in use by %d guys. failed\n",
-                               atomic_read(&scheduler->refcnt));
-               return -EINVAL;
-       }
-
-       /*      
-        *      Must be already removed from the scheduler list
-        */
-       for (psched = &schedulers; (*psched) && (*psched != scheduler);
-            psched = &((*psched)->next));
-
-       if (*psched != scheduler) {
-               IP_VS_ERR("unregister_ip_vs_scheduler(): scheduler not in the list. failed\n");
-               return -EINVAL;
-       }
-
-       *psched = scheduler->next;
-       scheduler->next = NULL;
-
-       return 0;
-}
-
-
-/*
- *  Bind a service with a scheduler
- */
-int ip_vs_bind_scheduler(struct ip_vs_service *svc,
-                         struct ip_vs_scheduler *scheduler)
-{
-        if (svc == NULL) {
-               IP_VS_ERR("ip_vs_bind_scheduler(): svc arg NULL\n");
-               return -EINVAL;
-       }
-        if (scheduler == NULL) {
-               IP_VS_ERR("ip_vs_bind_scheduler(): scheduler arg NULL\n");
-               return -EINVAL;
-       }
-
-        svc->scheduler = scheduler;
-        atomic_inc(&scheduler->refcnt);
-        
-        if(scheduler->init_service)
-                if(scheduler->init_service(svc) != 0) {
-                        IP_VS_ERR("ip_vs_bind_scheduler(): init error\n");
-                        return -EINVAL;
-                }
-        
-        return 0;
-}
-
-
-/*
- *  Unbind a service with its scheduler
- */
-int ip_vs_unbind_scheduler(struct ip_vs_service *svc)
-{
-       struct ip_vs_scheduler *sched;
-
-        if (svc == NULL) {
-               IP_VS_ERR("ip_vs_unbind_scheduler(): svc arg NULL\n");
-               return -EINVAL;
-       }
-
-        sched = svc->scheduler;
-        if (sched == NULL) {
-               IP_VS_ERR("ip_vs_unbind_scheduler(): svc isn't bound\n");
-               return -EINVAL;
-       }
-
-        if(sched->done_service)
-                if(sched->done_service(svc) != 0) {
-                        IP_VS_ERR("ip_vs_unbind_scheduler(): done error\n");
-                        return -EINVAL;
-                }
-
-        atomic_dec(&sched->refcnt);
-        svc->scheduler = NULL;
-
-        return 0;
-}
-
-
-/*
- *     Returns hash value for IPVS
- */
-
-static __inline__ unsigned 
-ip_vs_hash_key(unsigned proto, __u32 addr, __u16 port)
-{
-        unsigned addrh = ntohl(addr);
-        
-        return (proto^addrh^(addrh>>IP_VS_TAB_BITS)^ntohs(port))
-                & (IP_VS_TAB_SIZE-1);
-}
-
-
-/*
- *     Hashes ip_masq in ip_vs_table by proto,addr,port.
- *     should be called with locked tables.
- *     returns bool success.
- */
-int ip_vs_hash(struct ip_masq *ms)
-{
-        unsigned hash;
-
-        if (ms->flags & IP_MASQ_F_HASHED) {
-                IP_VS_ERR("ip_vs_hash(): request for already hashed, called from %p\n",
-                          __builtin_return_address(0));
-                return 0;
-        }
-        /*
-         *     Hash by proto,client{addr,port}
-         */
-        hash = ip_vs_hash_key(ms->protocol, ms->daddr, ms->dport);
-
-        /*
-         * Note: because ip_masq_put sets masq expire if its
-         *       refcnt==IP_MASQ_NTABLES, we have to increase
-         *       counter IP_MASQ_NTABLES times, otherwise the masq
-         *       won't expire.
-         */
-       atomic_add(IP_MASQ_NTABLES, &ms->refcnt);
-        list_add(&ms->m_list, &ip_vs_table[hash]);
-
-        ms->flags |= IP_MASQ_F_HASHED;
-        return 1;
-}
-
-
-/*
- *     UNhashes ip_masq from ip_vs_table.
- *     should be called with locked tables.
- *     returns bool success.
- */
-int ip_vs_unhash(struct ip_masq *ms)
-{
-        unsigned int hash;
-
-        if (!(ms->flags & IP_MASQ_F_HASHED)) {
-                IP_VS_ERR("ip_vs_unhash(): request for unhash flagged, called from %p\n",
-                          __builtin_return_address(0));
-                return 0;
-        }
-        /*
-         *     UNhash by client{addr,port}
-         */
-        hash = ip_vs_hash_key(ms->protocol, ms->daddr, ms->dport);
-        /*
-         * Note: since we increase refcnt while hashing,
-         *       we have to decrease it while unhashing.
-         */
-       atomic_sub(IP_MASQ_NTABLES, &ms->refcnt);
-       list_del(&ms->m_list);
-        ms->flags &= ~IP_MASQ_F_HASHED;
-        return 1;
-}
-
-
-/*
- *  Gets ip_masq associated with supplied parameters in the ip_vs_table.
- *  Called for pkts coming from OUTside-to-INside the firewall.
- *     s_addr, s_port: pkt source address (foreign host)
- *     d_addr, d_port: pkt dest address (firewall)
- *  Caller must lock tables
- */
-
-struct ip_masq * ip_vs_in_get(int protocol, __u32 s_addr, __u16 s_port, __u32 d_addr, __u16 d_port)
-{
-        unsigned hash;
-        struct ip_masq *ms;
-        struct list_head *l, *e;
-
-        hash = ip_vs_hash_key(protocol, s_addr, s_port);
-
-        l=&ip_vs_table[hash];
-        for(e=l->next; e!=l; e=e->next)
-       {
-               ms = list_entry(e, struct ip_masq, m_list);
-               if (protocol==ms->protocol && 
-                   d_addr==ms->maddr && d_port==ms->mport &&
-                   s_addr==ms->daddr && s_port==ms->dport
-                   ) {
-                       atomic_inc(&ms->refcnt);
-                        goto out;
-               }
-        }
-       ms = NULL;
-
-  out:
-        return ms;
-}
-
-
-/*
- *  Gets ip_masq associated with supplied parameters in the ip_vs_table.
- *  Called for pkts coming from inside-to-OUTside the firewall.
- *     s_addr, s_port: pkt source address (inside host)
- *     d_addr, d_port: pkt dest address (foreign host)
- *  Caller must lock tables
- */
-struct ip_masq * ip_vs_out_get(int protocol, __u32 s_addr, __u16 s_port, __u32 d_addr, __u16 d_port)
-{
-        unsigned hash;
-        struct ip_masq *ms;
-        struct list_head *l, *e;
-
-       /*      
-        *      Check for "full" addressed entries
-        */
-        hash = ip_vs_hash_key(protocol, d_addr, d_port);
-        l=&ip_vs_table[hash];
-
-        for(e=l->next; e!=l; e=e->next)
-       {       
-               ms = list_entry(e, struct ip_masq, m_list);
-               if (protocol == ms->protocol &&
-                   s_addr == ms->saddr && s_port == ms->sport &&
-                   d_addr == ms->daddr && d_port == ms->dport
-                    ) {
-                       atomic_inc(&ms->refcnt);
-                       goto out;
-               }
-
-        }
-       ms = NULL;
-
-  out:
-        return ms;
-}
-
-
-/*
- *  Create a destination
- */
-struct ip_vs_dest *ip_vs_new_dest(struct ip_vs_service *svc,
-                                 struct ip_masq_ctl *mctl)
-{
-       struct ip_vs_dest *dest;
-       struct ip_vs_user *mm =  &mctl->u.vs_user;
-
-       IP_VS_DBG("enter ip_vs_new_dest()\n");
-
-       dest = (struct ip_vs_dest*) kmalloc(sizeof(struct ip_vs_dest),
-                                           GFP_ATOMIC);
-       if (dest == NULL) {
-               IP_VS_ERR("ip_vs_new_dest: kmalloc failed.\n");
-               return NULL;
-       }
-       memset(dest, 0, sizeof(struct ip_vs_dest));
-
-       dest->service = svc;
-       dest->addr = mm->daddr;
-       dest->port = mm->dport;
-       dest->weight = mm->weight;
-       dest->masq_flags = mm->masq_flags;
-
-       atomic_set(&dest->connections, 0);
-       atomic_set(&dest->refcnt, 0);
-
-        /*
-         *    Set the IP_MASQ_F_VS flag
-         */
-        dest->masq_flags |= IP_MASQ_F_VS;
-                
-       /* check if local node and update the flags */
-       if (inet_addr_type(mm->daddr) == RTN_LOCAL) {
-               dest->masq_flags = (dest->masq_flags & ~IP_MASQ_F_VS_FWD_MASK)
-                        | IP_MASQ_F_VS_LOCALNODE;
-       }
-
-       /* check if (fwd != masquerading) and update the port & flags */
-       if ((dest->masq_flags & IP_MASQ_F_VS_FWD_MASK) != 0) {
-               dest->masq_flags |= IP_MASQ_F_VS_NO_OUTPUT;
-       }
-
-       return dest;
-}
-
-
-/*
- *  Add a destination into an existing service
- */
-int ip_vs_add_dest(struct ip_vs_service *svc, struct ip_masq_ctl *mctl)
-{
-       struct ip_vs_dest *dest;
-       struct ip_vs_user *mm =  &mctl->u.vs_user;
-        __u32 daddr = mm->daddr;
-        __u16 dport = mm->dport;
-
-       IP_VS_DBG("enter ip_vs_add_dest()\n");
-
-       if (mm->weight < 0) {
-                IP_VS_ERR("ip_vs_add_dest(): server weight less than zero\n");
-                return -ERANGE;
-        }
-
-       write_lock_bh(&__ip_vs_lock);
-
-        /* check the existing dest list */
-        for (dest=svc->destinations; dest; dest=dest->next) {
-                if ((dest->addr == daddr) && (dest->port == dport)) {
-                        write_unlock_bh(&__ip_vs_lock);
-                        IP_VS_ERR("ip_vs_add_dest(): dest exists\n");
-                        return -EEXIST;
-                }
-        }
-        
-       /* allocate and initialize the dest structure */
-       dest = ip_vs_new_dest(svc, mctl);
-       if (dest == NULL) {
-                write_unlock_bh(&__ip_vs_lock);
-                IP_VS_ERR("ip_vs_add_dest(): out of memory\n");
-                return -ENOMEM;
-        }
-        
-       /* put the dest entry into the list */
-       dest->next = svc->destinations;
-       svc->destinations = dest;
-        
-       write_unlock_bh(&__ip_vs_lock);
-
-       atomic_inc(&dest->refcnt);
-
-       return 0;
-}
-
-        
-/*
- *  Edit a destination in a service
- */
-int ip_vs_edit_dest(struct ip_vs_service *svc, struct ip_masq_ctl *mctl)
-{
-       struct ip_vs_dest *dest;
-       struct ip_vs_user *mm =  &mctl->u.vs_user;
-        __u32 daddr = mm->daddr;
-        __u16 dport = mm->dport;
-
-       IP_VS_DBG("enter ip_vs_edit_dest()\n");
-
-       if (mm->weight < 0) {
-                IP_VS_ERR("ip_vs_add_dest(): server weight less than zero\n");
-                return -ERANGE;
-        }
-        
-       write_lock_bh(&__ip_vs_lock);
-
-        /* lookup the destination list */
-        for (dest=svc->destinations; dest; dest=dest->next) {
-                if ((dest->addr == daddr) && (dest->port == dport)) {
-                        /* HIT */
-                        break;
-                }
-        }
-
-        if (dest == NULL) {
-                write_unlock_bh(&__ip_vs_lock);
-                IP_VS_ERR("ip_vs_edit_dest(): dest doesn't exist\n");
-                return -ENOENT;
-        }
-        
-        /*
-         *    Set the weight and the flags
-         */
-       dest->weight = mm->weight;
-       dest->masq_flags = mm->masq_flags;
-
-        dest->masq_flags |= IP_MASQ_F_VS;
-                
-       /* check if local node and update the flags */
-       if (inet_addr_type(mm->daddr) == RTN_LOCAL) {
-               dest->masq_flags = (dest->masq_flags & ~IP_MASQ_F_VS_FWD_MASK)
-                        | IP_MASQ_F_VS_LOCALNODE;
-       }
-
-       /* check if (fwd != masquerading) and update the port & flags */
-       if ((dest->masq_flags & IP_MASQ_F_VS_FWD_MASK) != 0) {
-               dest->masq_flags |= IP_MASQ_F_VS_NO_OUTPUT;
-       }
-        
-       write_unlock_bh(&__ip_vs_lock);
-
-       return 0;
-}
-
-
-/*
- *  Delete a destination from an existing service
- */
-int ip_vs_del_dest(struct ip_vs_service *svc, struct ip_masq_ctl *mctl)
-{
-        struct ip_vs_dest *dest;
-        struct ip_vs_dest **pdest;
-       struct ip_vs_user *mm =  &mctl->u.vs_user;
-        __u32 daddr = mm->daddr;
-        __u16 dport = mm->dport;
-        
-       IP_VS_DBG("enter ip_vs_del_dest()\n");
-
-       write_lock_bh(&__ip_vs_lock);
-
-       /* remove dest from the destination list */
-       pdest = &svc->destinations;
-       while (*pdest) {
-                dest = *pdest;
-                if ((dest->addr == daddr) && (dest->port == dport))
-                        /* HIT */
-                        break;
-
-                pdest = &dest->next;
-        }
-        
-       if (*pdest == NULL) {
-                write_unlock_bh(&__ip_vs_lock);
-               IP_VS_ERR("ip_vs_del_dest(): destination not found!\n");
-               return -ENOENT;
-       }
-        
-       *pdest = dest->next;
-       dest->service = NULL;
-
-       write_unlock_bh(&__ip_vs_lock);
-
-        /*
-         *  Decrease the refcnt of the dest, and free the dest
-         *  if nobody refers to it (refcnt=0).
-         */
-        if (atomic_dec_and_test(&dest->refcnt))
-                kfree_s(dest, sizeof(*dest));
-
-       return 0;
-}
-        
-
-#if 0
-struct ip_vs_dest * ip_vs_lookup_dest(struct ip_vs_service *svc,
-                                      __u32 daddr, __u16 dport)
-{
-       struct ip_vs_dest *dest;
-        
-       read_lock_bh(&__ip_vs_lock);
-
-       /*
-         * Find the destination for the given service
-         */
-       for (dest=svc->destinations; dest; dest=dest->next) {
-                if ((dest->addr == daddr) && (dest->port == dport)) {
-                        /* HIT */
-                        read_unlock_bh(&__ip_vs_lock);
-                        return dest;
-                }
-        }
-
-       read_unlock_bh(&__ip_vs_lock);
-       return NULL;
-}
-#endif
-
-
-/*
- *  Add a service into the service list
- */
-int ip_vs_add_service(__u32 vaddr, __u16 vport, 
-                     __u16 protocol, struct ip_vs_scheduler *scheduler)
-{
-       struct ip_vs_service *svc;
-       int proto_num = masq_proto_num(protocol);
-       int ret = 0;
-
-       write_lock_bh(&__ip_vs_lock);
-
-       /* check if the service already exists */
-       for (svc = service_list[proto_num]; svc; svc = svc->next) {
-               if ((svc->port == vport) && (svc->addr == vaddr)) {
-                       ret = -EEXIST;
-                       goto out;
-               }
-       }
-
-       svc = (struct ip_vs_service*) kmalloc(sizeof(struct ip_vs_service),
-                                             GFP_ATOMIC);
-       if (svc == NULL) {
-               IP_VS_ERR("vs_add_svc: kmalloc failed.\n");
-               ret = -1;
-               goto out;
-       }
-       memset(svc,0,sizeof(struct ip_vs_service));
-
-       svc->addr = vaddr;
-       svc->port = vport;
-       svc->protocol = protocol;
-
-        /*
-         *    Bind the scheduler
-         */
-       ip_vs_bind_scheduler(svc, scheduler);
-
-
-       /* put the service into the proper service list */
-       if ((svc->port) || (!service_list[proto_num])) {
-               /* prepend to the beginning of the list */
-               svc->next = service_list[proto_num];
-               service_list[proto_num] = svc;
-       } else {
-               /* append to the end of the list if port==0 */
-               struct ip_vs_service *lsvc = service_list[proto_num];
-               while (lsvc->next) lsvc = lsvc->next;
-               svc->next = NULL;
-               lsvc->next = svc;
-       }
-
-  out:
-       write_unlock_bh(&__ip_vs_lock);
-       return ret;
-}
-
-
-/*
- *  Edit a service
- */
-int ip_vs_edit_service(struct ip_vs_service *svc,
-                       struct ip_vs_scheduler *scheduler)
-{
-       write_lock_bh(&__ip_vs_lock);
-
-       /*
-         *    Unbind the old scheduler
-         */
-       ip_vs_unbind_scheduler(svc);
-
-        /*
-         *    Bind the new scheduler
-         */
-       ip_vs_bind_scheduler(svc, scheduler);
-        
-       write_unlock_bh(&__ip_vs_lock);
-        
-       return 0;
-}
-
-
-/*
- *  Delete a service from the service list
- */
-int ip_vs_del_service(struct ip_vs_service *svc)
-{
-       struct ip_vs_service **psvc;
-        struct ip_vs_dest *dest, *dnext;
-       int ret = 0;
-
-       write_lock_bh(&__ip_vs_lock);
-
-       /* remove the service from the service_list */
-       psvc = &service_list[masq_proto_num(svc->protocol)];
-       for(; *psvc; psvc = &(*psvc)->next) {
-               if (*psvc == svc) {
-                       break;
-               }
-       }
-
-       if (*psvc == NULL) {
-               IP_VS_ERR("vs_del_svc: service not listed.");
-               ret = -1;
-               goto out;
-       }
-
-       *psvc = svc->next;
-
-       /*
-         *    Unbind scheduler
-         */
-       ip_vs_unbind_scheduler(svc);
-
-        /*
-         *    Unlink the destination list
-         */
-        dest = svc->destinations;
-        svc->destinations = NULL;
-        for (; dest; dest=dnext) {
-                dnext = dest->next;
-                dest->service = NULL;
-                dest->next = NULL;
-                
-                /*
-                 *  Decrease the refcnt of the dest, and free the dest
-                 *  if nobody refers to it (refcnt=0).
-                 */
-                if (atomic_dec_and_test(&dest->refcnt))
-                        kfree_s(dest, sizeof(*dest));
-        }
-
-       /*
-         *    Free the service
-         */
-       kfree_s(svc, sizeof(struct ip_vs_service));
-
-  out:
-       write_unlock_bh(&__ip_vs_lock);
-       return ret;
-}
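The destination teardown above relies on a simple reference-count idiom: the service list and every bound masq entry each hold one reference on a `dest`, and whichever holder drops the count to zero frees the object. A minimal userspace sketch of the same idiom (a plain `int` stands in for the kernel's `atomic_t`; the helper name is hypothetical):

```c
#include <assert.h>
#include <stdlib.h>

struct dest {
    int refcnt;                /* stand-in for the kernel's atomic_t */
};

/* Drop one reference and free the object when the count hits zero,
 * mirroring the atomic_dec_and_test() + kfree_s() pattern above.
 * Returns 1 if the object was freed, 0 if it is still referenced. */
static int dest_put(struct dest *d)
{
    if (--d->refcnt == 0) {
        free(d);
        return 1;
    }
    return 0;
}
```

In the kernel the decrement must be atomic because several CPUs may drop references concurrently; the single-threaded sketch only shows the ownership rule.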
-
-
-/*
- *  Flush all the virtual services
- */
-int ip_vs_flush(void)
-{
-        int proto_num;
-        struct ip_vs_service *svc, *snext;
-        struct ip_vs_dest *dest, *dnext;
-       int ret = 0;
-
-       write_lock_bh(&__ip_vs_lock);
-        
-       for (proto_num=0; proto_num<2; proto_num++) {
-                svc = service_list[proto_num];
-                service_list[proto_num] = NULL;
-                for (; svc; svc=snext) {
-                        snext = svc->next;
-
-                        /*
-                         *    Unbind scheduler
-                         */
-                        ip_vs_unbind_scheduler(svc);
-
-                        /*
-                         *    Unlink the destination list
-                         */
-                        dest = svc->destinations;
-                        svc->destinations = NULL;
-                        for (; dest; dest=dnext) {
-                                dnext = dest->next;
-                                dest->service = NULL;
-                                dest->next = NULL;
-                
-                                /*
-                                 *  Decrease the refcnt of the dest, and free
-                                 *  the dest if nobody refers to it (refcnt=0).
-                                 */
-                                if (atomic_dec_and_test(&dest->refcnt))
-                                        kfree_s(dest, sizeof(*dest));
-                        }
-
-                        /*
-                         *    Free the service
-                         */
-                        kfree_s(svc, sizeof(*svc));
-                }
-        }
-        
-       write_unlock_bh(&__ip_vs_lock);
-       return ret;
-}
-
-
-/*
- *  Called when a FIN packet for this masq entry (ms) is received
- */
-void ip_vs_fin_masq(struct ip_masq *ms)
-{
-        IP_VS_DBG("enter ip_vs_fin_masq()\n");
-        
-        IP_VS_DBG("Masq fwd:%c s:%s c:%lX:%x v:%lX:%x d:%lX:%x flg:%X cnt:%d\n",
-                  ip_vs_fwd_tag(ms), ip_masq_state_name(ms->state),
-                  ntohl(ms->daddr),ntohs(ms->dport),
-                  ntohl(ms->maddr),ntohs(ms->mport),
-                  ntohl(ms->saddr),ntohs(ms->sport),
-                  ms->flags, atomic_read(&ms->refcnt));
-
-        if(ms->dest)
-                atomic_dec(&ms->dest->connections);
-       ms->flags |= IP_MASQ_F_VS_FIN;
-}
-
-
-/*
- *  Bind a masq entry to a VS destination
- */
-void ip_vs_bind_masq(struct ip_masq *ms, struct ip_vs_dest *dest)
-{
-        IP_VS_DBG("enter ip_vs_bind_masq()\n");
-
-        IP_VS_DBG("Masq fwd:%c s:%s c:%lX:%x v:%lX:%x d:%lX:%x flg:%X cnt:%d\n",
-                  ip_vs_fwd_tag(ms), ip_masq_state_name(ms->state),
-                  ntohl(ms->daddr),ntohs(ms->dport),
-                  ntohl(ms->maddr),ntohs(ms->mport),
-                  ntohl(ms->saddr),ntohs(ms->sport),
-                  ms->flags, atomic_read(&ms->refcnt));
-
-        ms->flags |= dest->masq_flags;
-        ms->dest = dest;
-
-        /*
-         *    Increase the refcnt and connections counters of the dest.
-         */
-        atomic_inc(&dest->refcnt);
-        atomic_inc(&dest->connections);
-}
-
-
-/*
- *  Unbind a masq entry from its VS destination
- */
-void ip_vs_unbind_masq(struct ip_masq *ms)
-{
-        struct ip_vs_dest *dest = ms->dest;
-        
-        IP_VS_DBG("enter ip_vs_unbind_masq()\n");
-
-        IP_VS_DBG("Masq fwd:%c s:%s c:%lX:%x v:%lX:%x d:%lX:%x flg:%X cnt:%d\n",
-                  ip_vs_fwd_tag(ms), ip_masq_state_name(ms->state),
-                  ntohl(ms->daddr),ntohs(ms->dport),
-                  ntohl(ms->maddr),ntohs(ms->mport),
-                  ntohl(ms->saddr),ntohs(ms->sport),
-                  ms->flags, atomic_read(&ms->refcnt));
-
-        if (dest) {
-               if (!(ms->flags & IP_MASQ_F_VS_FIN)) {
-                        /*
-                         * Masq timeout, decrease the connection counter
-                         */
-                       atomic_dec(&dest->connections);
-                }
-                
-                /*
-                 *  Decrease the refcnt of the dest, and free the dest
-                 *  if nobody refers to it (refcnt=0).
-                 */
-                if (atomic_dec_and_test(&dest->refcnt))
-                        kfree_s(dest, sizeof(*dest));
-       }
-}
-
-
-/*
- *    Get a scheduler from the scheduler list by name
- */
-struct ip_vs_scheduler * ip_vs_sched_getbyname(const char *sched_name)
-{
-       struct ip_vs_scheduler *sched;
-
-       IP_VS_DBG("ip_vs_sched_getbyname(): sched_name \"%s\"\n", sched_name);
-       
-       read_lock_bh(&__ip_vs_lock);
-       for (sched = schedulers; sched; sched = sched->next) {
-               if (strcmp(sched_name, sched->name)==0) {
-                       /* HIT */
-                       read_unlock_bh(&__ip_vs_lock);
-                        return sched;
-               }
-       }
-
-       read_unlock_bh(&__ip_vs_lock);
-       return NULL;
-}
-
-
-/*
- *    Lookup scheduler and try to load it if it doesn't exist
- */
-struct ip_vs_scheduler * ip_vs_lookup_scheduler(const char *sched_name)
-{
-       struct ip_vs_scheduler *sched;
-
-        /* search for the scheduler by sched_name */
-        sched = ip_vs_sched_getbyname(sched_name);
-
-        /* if scheduler not found, load the module and search again */
-        if (sched == NULL) {
-                char module_name[IP_MASQ_TNAME_MAX+8];
-                sprintf(module_name,"ip_vs_%s",sched_name);
-#ifdef CONFIG_KMOD
-                request_module(module_name);
-#endif /* CONFIG_KMOD */
-                sched = ip_vs_sched_getbyname(sched_name);
-        }
-                        
-        return sched;
-}
-
-
-/*
- *  Lookup service by {proto,addr,port} in the service list
- */
-struct ip_vs_service *ip_vs_lookup_service(__u32 vaddr, __u16 vport,
-                                           __u16 protocol)
-{
-        struct ip_vs_service *svc;
-
-        read_lock(&__ip_vs_lock);
-        svc = service_list[masq_proto_num(protocol)];
-        while (svc) {
-                if ((svc->addr == vaddr) &&
-                    (!svc->port || (svc->port == vport)))
-                        break;
-                svc = svc->next;
-        }
-        read_unlock(&__ip_vs_lock);
-        return svc; 
-}
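The lookup above accepts a service when its address equals the packet's destination address and its port either equals the packet's port or is zero, which acts as a wildcard matching all ports. The predicate in isolation (hypothetical helper name, userspace sketch):

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero when a service entry (saddr, sport) matches a packet's
 * (vaddr, vport); sport == 0 is the wildcard "all ports" case,
 * exactly as tested in ip_vs_lookup_service() above. */
static int svc_match(uint32_t saddr, uint16_t sport,
                     uint32_t vaddr, uint16_t vport)
{
    return (saddr == vaddr) && (sport == 0 || sport == vport);
}
```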
-
-        
-/*
- *  IPVS main scheduling function
- *  It selects a server according to the virtual service, and
- *  creates a masq entry.
- */
-struct ip_masq *ip_vs_schedule(__u32 vaddr, __u16 vport, __u16 protocol,
-                              struct iphdr *iph)
-{
-       struct ip_vs_service *svc;
-       struct ip_masq *ms = NULL;
-       int proto_num = masq_proto_num(protocol);
-
-       read_lock(&__ip_vs_lock);
-        
-       /*
-         * Lookup the service
-         */
-       for (svc = service_list[proto_num]; svc; svc = svc->next) {
-               if ((svc->addr == vaddr) &&
-                   (!svc->port || (svc->port == vport))) {
-                       /*
-                        * choose the destination and create ip_masq entry
-                        */
-                       ms = svc->scheduler->schedule(svc, iph);
-                       break;
-               }
-       }
-        
-       read_unlock(&__ip_vs_lock);
-
-        return ms;
-}
-
-
-/*
- *     IPVS user control entry
- */
-int ip_vs_ctl(int optname, struct ip_masq_ctl *mctl, int optlen)
-{
-       struct ip_vs_scheduler *sched = NULL;
-        struct ip_vs_service *svc = NULL;
-       struct ip_vs_user *mm =  &mctl->u.vs_user;
-       __u32 vaddr = mm->vaddr;
-       __u16 vport = mm->vport;
-       int proto_num = masq_proto_num(mm->protocol);
-
-       /*
-        * Sanity check the size of mctl
-        */
-       if (optlen != sizeof(*mctl))
-               return -EINVAL;
-
-       /*
-         * Flush all the virtual service...
-         */
-        if (mctl->m_cmd == IP_MASQ_CMD_FLUSH)
-                return ip_vs_flush();
-
-       /*
-         * Check for valid protocol: TCP or UDP
-         */
-        if ((proto_num < 0) || (proto_num > 1)) {
-                IP_VS_INFO("vs_ctl: invalid protocol: %d "
-                           "%d.%d.%d.%d:%d %s\n",
-                           ntohs(mm->protocol),
-                           NIPQUAD(vaddr), ntohs(vport), mctl->m_tname);
-                return -EFAULT;
-        }
-
-        /*
-         * Lookup the service by (vaddr, vport, protocol)
-         */
-        svc = ip_vs_lookup_service(vaddr, vport, mm->protocol);
-
-        switch (mctl->m_cmd) {
-                case IP_MASQ_CMD_ADD:
-                        if (svc != NULL)
-                                return -EEXIST;
-
-                        /* lookup the scheduler, by 'mctl->m_tname' */
-                        sched = ip_vs_lookup_scheduler(mctl->m_tname);
-                        if (sched == NULL) {
-                                IP_VS_INFO("Scheduler module ip_vs_%s.o not found\n",
-                                           mctl->m_tname);
-                                return -ENOENT;
-                        }
-
-                        return ip_vs_add_service(vaddr, vport,
-                                                 mm->protocol, sched);
-
-                case IP_MASQ_CMD_SET:
-                        if (svc == NULL)
-                                return -ESRCH;
-
-                        /* lookup the scheduler, by 'mctl->m_tname' */
-                        sched = ip_vs_lookup_scheduler(mctl->m_tname);
-                        if (sched == NULL) {
-                                IP_VS_INFO("Scheduler module ip_vs_%s.o not found\n",
-                                           mctl->m_tname);
-                                return -ENOENT;
-                        }
-
-                        return ip_vs_edit_service(svc, sched);
-                        
-                case IP_MASQ_CMD_DEL:
-                        if (svc == NULL)
-                                return  -ESRCH;
-                        else
-                                return ip_vs_del_service(svc);
-       
-                case IP_MASQ_CMD_ADD_DEST:
-                        if (svc == NULL)
-                                return  -ESRCH;
-                        else
-                                return ip_vs_add_dest(svc, mctl);
-
-                case IP_MASQ_CMD_SET_DEST:
-                        if (svc == NULL)
-                                return  -ESRCH;
-                        else
-                                return ip_vs_edit_dest(svc, mctl);
-                        
-                case IP_MASQ_CMD_DEL_DEST:
-                        if (svc == NULL)
-                                return  -ESRCH;
-                        else
-                                return ip_vs_del_dest(svc, mctl);
-        }
-        return -EINVAL;
-}
-
-
-
-#ifdef CONFIG_PROC_FS
-/*
- *     Write the contents of the VS rule table to a PROCfs file.
- */
-static int ip_vs_procinfo(char *buf, char **start, off_t offset,
-                         int length, int *eof, void *data)
-{
-       int ind;
-        int len=0;
-        off_t pos=0;
-        int size;
-        char str1[22];
-       struct ip_vs_service *svc = NULL;
-       struct ip_vs_dest *dest;
-       __u16 protocol = 0;
-
-       size = sprintf(buf+len,
-                       "IP Virtual Server (Version 0.7)\n"
-                       "Protocol Local Address:Port Scheduler\n"
-                       "      -> Remote Address:Port   Forward Weight ActiveConn FinConn\n");
-        pos += size;
-        len += size;
-
-       read_lock_bh(&__ip_vs_lock);
-
-        for (ind = 0; ind < 2; ind++) {
-                if (ind == 0)
-                        protocol = IPPROTO_UDP;
-                else
-                        protocol = IPPROTO_TCP;
-
-                for (svc=service_list[masq_proto_num(protocol)]; svc; svc=svc->next) {
-                        size = sprintf(buf+len, "%s %d.%d.%d.%d:%d %s\n",
-                                       masq_proto_name(protocol),
-                                       NIPQUAD(svc->addr), ntohs(svc->port),
-                                       svc->scheduler->name);
-                        len += size;
-                        pos += size;
-
-                        if (pos <= offset)
-                                len=0;
-                        if (pos >= offset+length)
-                                goto done;
-                              
-                        for (dest = svc->destinations; dest; dest = dest->next) {
-                                char *fwd;
-
-                                switch (dest->masq_flags & IP_MASQ_F_VS_FWD_MASK) {
-                                        case IP_MASQ_F_VS_LOCALNODE:
-                                                fwd = "Local";
-                                                break;
-                                        case IP_MASQ_F_VS_TUNNEL:
-                                                fwd = "Tunnel";
-                                                break;
-                                        case IP_MASQ_F_VS_DROUTE:
-                                                fwd = "Route";
-                                                break;
-                                        default:
-                                                fwd = "Masq";
-                                }
-
-                                sprintf(str1, "%d.%d.%d.%d:%d",
-                                        NIPQUAD(dest->addr), ntohs(dest->port));
-                                size = sprintf(buf+len,
-                                               "      -> %-21s %-7s %-6d %-10d %-10d\n",
-                                               str1, fwd, dest->weight,
-                                               atomic_read(&dest->connections),
-                                               atomic_read(&dest->refcnt) - atomic_read(&dest->connections) - 1);
-                                len += size;
-                                pos += size;
-                  
-                                if (pos <= offset)
-                                        len=0;
-                                if (pos >= offset+length)
-                                        goto done;
-                        }
-               }
-       }
-
-  done:
-       read_unlock_bh(&__ip_vs_lock);
-        
-        *start = buf+len-(pos-offset);          /* Start of wanted data */
-        len = pos-offset;
-        if (len > length)
-                len = length;
-        if (len < 0)
-                len = 0;
-        
-       return len;
-}
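The /proc read handler above uses the classic pre-2.6 windowing protocol: it keeps running totals `pos` (bytes generated so far) and `len` (bytes kept in the buffer), discards output that ends before `offset`, stops once `offset+length` is reached, and finally clamps the returned length. The final clamp, in isolation (hypothetical helper, userspace sketch):

```c
#include <assert.h>

/* Given total bytes generated (pos) and the requested window
 * (offset, length), return how many bytes the read hands back --
 * the clamp at the end of ip_vs_procinfo() above. */
static int proc_window(long pos, long offset, int length)
{
    long len = pos - offset;   /* bytes wanted past the offset */
    if (len > length)
        len = length;
    if (len < 0)
        len = 0;
    return (int)len;
}
```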
-
-struct proc_dir_entry ip_vs_proc_entry = {
-       0,                      /* dynamic inode */
-       2, "vs",                /* namelen and name */
-       S_IFREG | S_IRUGO,      /* mode */
-       1, 0, 0, 0,             /* nlinks, owner, group, size */
-       &proc_net_inode_operations, /* operations */
-       NULL,                   /* get_info */
-       NULL,                   /* fill_inode */
-       NULL, NULL, NULL,       /* next, parent, subdir */
-       NULL,                   /* data */
-       &ip_vs_procinfo,        /* function to generate proc data */
-};
-       
-#endif
-
-
-/*
- *   This function encapsulates the packet in a new IP header, its destination
- *   will be set to the daddr. Most code of this function is from ipip.c.
- *   Usage:
- *     It is called in the ip_fw_demasquerade() function. The load balancer
- *     selects a real server from a cluster based on a scheduling algorithm,
- *     encapsulates the packet and forwards it to the selected server. All real
- *     servers are configured with "ifconfig tunl0 <Virtual IP Address> up".
- *     When the server receives the encapsulated packet, it decapsulates the
- *     packet, processes the request and returns the reply packets directly to
- *     the client without passing through the load balancer. This can greatly
- *     increase the scalability of the virtual server.
- *   Returns:
- *     if succeeded, return 1; otherwise, return 0.
- */
-
-int ip_vs_tunnel_xmit(struct sk_buff **skb_p, __u32 daddr)
-{
-       struct sk_buff *skb = *skb_p;
-       struct rtable *rt;                      /* Route to the other host */
-       struct device *tdev;                    /* Device to other host */
-       struct iphdr  *old_iph = skb->nh.iph;
-       u8     tos = old_iph->tos;
-       u16    df = 0;
-       struct iphdr  *iph;                     /* Our new IP header */
-       int    max_headroom;                    /* The extra header space needed */
-       u32    dst = daddr;
-       u32    src = 0;
-       int    mtu;
-
-       if (skb->protocol != __constant_htons(ETH_P_IP)) {
-               IP_VS_ERR("ip_vs_tunnel_xmit(): protocol error, ETH_P_IP: %d, skb protocol: %d\n",
-                       __constant_htons(ETH_P_IP),skb->protocol);
-               goto tx_error;
-       }
-
-       if (ip_route_output(&rt, dst, src, RT_TOS(tos), 0)) {
-               IP_VS_ERR("ip_vs_tunnel_xmit(): route error, dst: %08X\n", dst);
-               goto tx_error_icmp;
-       }
-       tdev = rt->u.dst.dev;
-
-       mtu = rt->u.dst.pmtu - sizeof(struct iphdr);
-       if (mtu < 68) {
-               ip_rt_put(rt);
-               IP_VS_ERR("ip_vs_tunnel_xmit(): mtu less than 68\n");
-               goto tx_error;
-       }
-       if (skb->dst && mtu < skb->dst->pmtu)
-               skb->dst->pmtu = mtu;
-
-       df |= (old_iph->frag_off&__constant_htons(IP_DF));
-
-       if ((old_iph->frag_off&__constant_htons(IP_DF)) && mtu < ntohs(old_iph->tot_len)) {
-               icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED, htonl(mtu));
-               ip_rt_put(rt);
-               IP_VS_ERR("ip_vs_tunnel_xmit(): frag needed\n");
-               goto tx_error;
-       }
-
-       skb->h.raw = skb->nh.raw;
-
-       /*
-        * Okay, now see if we can stuff it in the buffer as-is.
-        */
-       max_headroom = (((tdev->hard_header_len+15)&~15)+sizeof(struct iphdr));
-
-       if (skb_headroom(skb) < max_headroom || skb_cloned(skb) || skb_shared(skb)) {
-               struct sk_buff *new_skb = skb_realloc_headroom(skb, max_headroom);
-               if (!new_skb) {
-                       ip_rt_put(rt);
-                       kfree_skb(skb);
-                       IP_VS_ERR("ip_vs_tunnel_xmit(): no memory for new_skb\n");
-                       return 0;
-               }
-               kfree_skb(skb);
-               skb = new_skb;
-       }
-
-       skb->nh.raw = skb_push(skb, sizeof(struct iphdr));
-       memset(&(IPCB(skb)->opt), 0, sizeof(IPCB(skb)->opt));
-       dst_release(skb->dst);
-       skb->dst = &rt->u.dst;
-
-       /*
-        *      Push down and install the IPIP header.
-        */
-
-       iph                     =       skb->nh.iph;
-       iph->version            =       4;
-       iph->ihl                =       sizeof(struct iphdr)>>2;
-       iph->frag_off           =       df;
-       iph->protocol           =       IPPROTO_IPIP;
-       iph->tos                =       tos;
-       iph->daddr              =       rt->rt_dst;
-       iph->saddr              =       rt->rt_src;
-       iph->ttl                =       old_iph->ttl;
-       iph->tot_len            =       htons(skb->len);
-       iph->id                 =       htons(ip_id_count++);
-       ip_send_check(iph);
-
-       ip_send(skb);
-       return 1;
-
-tx_error_icmp:
-       dst_link_failure(skb);
-tx_error:
-       kfree_skb(skb);
-       return 0;
-}
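The tunnel path above shrinks the effective MTU by the size of the outer IP header and refuses to forward a DF-flagged packet larger than that, answering with ICMP_FRAG_NEEDED instead. The arithmetic in isolation (hypothetical helpers, userspace sketch; 20 is `sizeof(struct iphdr)` without options, 68 the minimum the code accepts):

```c
#include <assert.h>

#define OUTER_IPHDR_LEN 20     /* outer IPIP header, no options */
#define MIN_TUNNEL_MTU  68     /* the "mtu less than 68" bail-out above */

/* Effective inner MTU once the IPIP outer header is prepended;
 * -1 means the path MTU is too small to tunnel at all. */
static int ipip_inner_mtu(int path_mtu)
{
    int mtu = path_mtu - OUTER_IPHDR_LEN;
    return (mtu < MIN_TUNNEL_MTU) ? -1 : mtu;
}

/* A DF-set packet must be refused (ICMP frag-needed) when its total
 * length exceeds the inner MTU, as in ip_vs_tunnel_xmit() above. */
static int must_refuse(int df, int tot_len, int inner_mtu)
{
    return df && tot_len > inner_mtu;
}
```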
-
-
-/*
- *     Initialize IP virtual server
- */
-__initfunc(int ip_vs_init(void))
-{
-       int idx;
-        for(idx = 0; idx < IP_VS_TAB_SIZE; idx++)  {
-               INIT_LIST_HEAD(&ip_vs_table[idx]);
-       }
-#ifdef CONFIG_PROC_FS
-       ip_masq_proc_register(&ip_vs_proc_entry);       
-#endif        
-
-#ifdef CONFIG_IP_MASQUERADE_VS_RR
-        ip_vs_rr_init();
-#endif
-#ifdef CONFIG_IP_MASQUERADE_VS_WRR
-        ip_vs_wrr_init();
-#endif
-#ifdef CONFIG_IP_MASQUERADE_VS_WLC
-        ip_vs_wlc_init();
-#endif
-#ifdef CONFIG_IP_MASQUERADE_VS_PCC
-        ip_vs_pcc_init();
-#endif
-        return 0;
-}
diff --git a/net/ipv4/ip_vs_pcc.c b/net/ipv4/ip_vs_pcc.c
deleted file mode 100644 (file)
index eed47ce..0000000
+++ /dev/null
@@ -1,240 +0,0 @@
-/*
- * IPVS:        Persistent Client Connection Scheduling module
- *
- * Version:     $Id: ip_vs_pcc.c,v 1.1.2.1 1999/08/13 18:25:33 davem Exp $
- *
- * Authors:     Wensong Zhang <wensong@iinchina.net>
- *              Peter Kese <peter.kese@ijs.si>
- *
- *              This program is free software; you can redistribute it and/or
- *              modify it under the terms of the GNU General Public License
- *              as published by the Free Software Foundation; either version
- *              2 of the License, or (at your option) any later version.
- *
- * Changes:
- *
- */
-
-#include <linux/config.h>
-#include <linux/module.h>
-#ifdef CONFIG_KMOD
-#include <linux/kmod.h>
-#endif
-#include <linux/types.h>
-#include <linux/kernel.h>
-#include <linux/errno.h>
-#include <net/ip_masq.h>
-#ifdef CONFIG_IP_MASQUERADE_MOD
-#include <net/ip_masq_mod.h>
-#endif
-#include <linux/sysctl.h>
-#include <linux/ip_fw.h>
-#include <net/ip_vs.h>
-
-/*
- * Note:
- *   It is not ideal to implement the persistent client connection
- *   feature as a separate scheduling module, because PCC differs from
- *   scheduling modules such as RR, WRR and WLC. In fact, it would be
- *   better to let the user specify which port is persistent. This will
- *   be fixed in the near future.
- */
-
-/*
- * Define TEMPLATE_TIMEOUT a little larger than the average connection
- * time plus MASQUERADE_EXPIRE_TCP_FIN (2*60*HZ), because the template
- * won't be released until its last controlled masq entry expires.
- * If TEMPLATE_TIMEOUT is too small, the template will soon expire and
- * be re-queued for expiry again and again, which costs additional
- * overhead. If it is too large, the same client will always visit the
- * same server, which will make dynamic load imbalance worse.
- */
-#define TEMPLATE_TIMEOUT       6*60*HZ
-
-static int ip_vs_pcc_init_svc(struct ip_vs_service *svc)
-{
-        MOD_INC_USE_COUNT;
-        return 0;
-}
-
-
-static int ip_vs_pcc_done_svc(struct ip_vs_service *svc)
-{
-        MOD_DEC_USE_COUNT;
-        return 0;
-}
-
-
-/*
- *    In fact, it is Weighted Least Connection scheduling
- */
-static struct ip_vs_dest* ip_vs_pcc_select(struct ip_vs_service *svc)
-{
-       struct ip_vs_dest *dest, *least;
-       int loh, doh;
-
-       IP_VS_DBG("ip_vs_pcc_select(): selecting a server...\n");
-
-       if (svc->destinations == NULL) return NULL;
-
-       /*
-         * The number of connections in TCP_FIN state is
-         *                 dest->refcnt - dest->connections -1
-         * We assume the overhead of processing an active connection is,
-         * on average, fifty times that of a connection in TCP_FIN. (This
-         * factor of fifty may not be accurate; it may be tuned later.) We use
-         * the following formula to estimate the overhead:
-         *                dest->connections*49 + dest->refcnt
-         * and the load:
-         *                (dest overhead) / dest->weight
-         *
-         * Remember -- no floats in kernel mode!!!
-         * The comparison of h1*w2 > h2*w1 is equivalent to that of
-         *                h1/w1 > h2/w2
-         * if every weight is larger than zero.
-         */
-
-       least = svc->destinations;
-       loh = atomic_read(&least->connections)*49 + atomic_read(&least->refcnt);
-        
-        /*
-         *    Find the destination with the least load.
-         */
-       for (dest = least->next; dest; dest = dest->next) {
-               doh = atomic_read(&dest->connections)*49 + atomic_read(&dest->refcnt);
-               if (loh*dest->weight > doh*least->weight) {
-                       least = dest;
-                       loh = doh;
-               }
-       }
-
-        IP_VS_DBG("The selected server: connections %d refcnt %d weight %d "
-                  "overhead %d\n", atomic_read(&least->connections),
-                  atomic_read(&least->refcnt), least->weight, loh);
-
-       return least;
-}
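The comment above explains the key integer trick: instead of comparing per-server loads `oh_a/w_a > oh_b/w_b` with division (no floating point in kernel mode), the code cross-multiplies, since `oh_a*w_b > oh_b*w_a` is equivalent whenever both weights are positive. A standalone sketch of the overhead formula and the comparison (helper names hypothetical):

```c
#include <assert.h>

/* Overhead estimate used by the WLC-style selection above: active
 * connections weighted 50:1 against connections in TCP_FIN (there are
 * refcnt - connections - 1 of those, hence conns*49 + refcnt overall). */
static int overhead(int connections, int refcnt)
{
    return connections * 49 + refcnt;
}

/* Nonzero when server b is less loaded than server a, using the
 * cross-multiplied comparison to avoid division:
 * oh_a/w_a > oh_b/w_b  <=>  oh_a*w_b > oh_b*w_a  (weights > 0). */
static int less_loaded(int oh_a, int w_a, int oh_b, int w_b)
{
    return oh_a * w_b > oh_b * w_a;
}
```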
-
-
-static struct ip_masq* ip_vs_pcc_schedule(struct ip_vs_service *svc, 
-                                        struct iphdr *iph)
-{
-       struct ip_masq *ms, *mst;
-       struct ip_vs_dest *dest;
-       const __u16 *portp = (__u16 *)&(((char *)iph)[iph->ihl*4]);
-
-       /* check if the template exists */
-        mst = ip_masq_in_get(0, iph->saddr, 0, svc->addr, svc->port);
-       if (mst) {
-               /*
-                 * Template masq exists...
-                 */
-               dest = mst->dest;
-                IP_VS_DBG("Template masq fwd:%c s:%s c:%lX:%x v:%lX:%x d:%lX:%x flg:%X cnt:%d\n",
-                          ip_vs_fwd_tag(mst), ip_masq_state_name(mst->state),
-                          ntohl(mst->daddr),ntohs(mst->dport),
-                          ntohl(mst->maddr),ntohs(mst->mport),
-                          ntohl(mst->saddr),ntohs(mst->sport),
-                          mst->flags, atomic_read(&mst->refcnt));
-       } else {
-               /* template does not exist, select the destination */
-               dest = ip_vs_pcc_select(svc);
-               if (!dest) return NULL;
-
-               /* create the template */
-               mst = ip_masq_new_vs(0, svc->addr, svc->port,
-                                     dest->addr, dest->port,
-                                     iph->saddr, 0, 0);
-               if (!mst) {
-                       IP_VS_ERR("ip_masq_new template failed\n");
-                       return NULL;
-               }
-
-                /*
-                 *    Bind the template masq entry with the vs dest.
-                 */
-                ip_vs_bind_masq(mst, dest);
-                
-                IP_VS_DBG("Template masq created fwd:%c s:%s c:%lX:%x v:%lX:%x"
-                          " d:%lX:%x flg:%X cnt:%d\n",
-                          ip_vs_fwd_tag(mst), ip_masq_state_name(mst->state),
-                          ntohl(mst->daddr),ntohs(mst->dport),
-                          ntohl(mst->maddr),ntohs(mst->mport),
-                          ntohl(mst->saddr),ntohs(mst->sport),
-                          mst->flags, atomic_read(&mst->refcnt));
-
-       }
-
-       /*
-         * The destination is known, and create the masq entry
-         */
-        ms = ip_masq_new_vs(iph->protocol,
-                            iph->daddr, portp[1],      
-                            dest->addr, dest->port,
-                            iph->saddr, portp[0],
-                            0);
-       if (ms == NULL) {
-               IP_VS_ERR("new_vs failed\n");
-               return NULL;
-       }
-
-        /*
-         *    Bind the masq entry with the vs dest.
-         */
-        ip_vs_bind_masq(ms, dest);
-        
-       /*
-         *    Add its control
-         */
-        ip_masq_control_add(ms, mst);
-
-        /*
-         *    Set the timeout, and put it in expire.
-         */
-        mst->timeout = TEMPLATE_TIMEOUT;
-        ip_masq_put(mst);
-
-        return ms;
-}
-
-
-static struct ip_vs_scheduler ip_vs_pcc_scheduler = {
-       NULL,                   /* next */
-       "pcc",                  /* name */
-       ATOMIC_INIT(0),         /* refcnt */
-       ip_vs_pcc_init_svc,     /* service initializer */
-       ip_vs_pcc_done_svc,     /* service done */
-       ip_vs_pcc_schedule,     /* select a server and create new masq entry */
-};
-
-
-__initfunc(int ip_vs_pcc_init(void))
-{
-       IP_VS_INFO("Initializing PCC scheduling\n");
-        return register_ip_vs_scheduler(&ip_vs_pcc_scheduler) ;
-}
-
-#ifdef MODULE
-EXPORT_NO_SYMBOLS;
-
-int init_module(void)
-{
-       /* module initialization by 'request_module' */
-       if(register_ip_vs_scheduler(&ip_vs_pcc_scheduler) != 0)
-               return -EIO;
-
-       IP_VS_INFO("PCC scheduling module loaded.\n");
-       
-        return 0;
-}
-
-void cleanup_module(void)
-{
-       /* module cleanup by 'release_module' */
-       if(unregister_ip_vs_scheduler(&ip_vs_pcc_scheduler) != 0)
-               IP_VS_INFO("cannot remove PCC scheduling module\n");
-       else
-               IP_VS_INFO("PCC scheduling module unloaded.\n");
-}
-
-#endif /* MODULE */
diff --git a/net/ipv4/ip_vs_rr.c b/net/ipv4/ip_vs_rr.c
deleted file mode 100644 (file)
index f7b9d2e..0000000
+++ /dev/null
@@ -1,138 +0,0 @@
-/*
- * IPVS:        Round-Robin Scheduling module
- *
- * Version:     $Id: ip_vs_rr.c,v 1.1.2.1 1999/08/13 18:25:39 davem Exp $
- *
- * Authors:     Wensong Zhang <wensong@iinchina.net>
- *              Peter Kese <peter.kese@ijs.si>
- *
- *              This program is free software; you can redistribute it and/or
- *              modify it under the terms of the GNU General Public License
- *              as published by the Free Software Foundation; either version
- *              2 of the License, or (at your option) any later version.
- *
- * Fixes/Changes:
- *
- */
-
-#include <linux/config.h>
-#include <linux/module.h>
-#ifdef CONFIG_KMOD
-#include <linux/kmod.h>
-#endif
-#include <linux/types.h>
-#include <linux/kernel.h>
-#include <linux/errno.h>
-#include <net/ip_masq.h>
-#ifdef CONFIG_IP_MASQUERADE_MOD
-#include <net/ip_masq_mod.h>
-#endif
-#include <linux/sysctl.h>
-#include <linux/ip_fw.h>
-#include <net/ip_vs.h>
-
-
-static int ip_vs_rr_init_svc(struct ip_vs_service *svc)
-{
-        MOD_INC_USE_COUNT;
-        return 0;
-}
-
-
-static int ip_vs_rr_done_svc(struct ip_vs_service *svc)
-{
-        MOD_DEC_USE_COUNT;
-        return 0;
-}
-
-
-/*
- * Round-Robin Scheduling
- */
-static struct ip_masq* ip_vs_rr_schedule(struct ip_vs_service *svc, 
-                                        struct iphdr *iph)
-{
-        struct ip_vs_dest *dest;
-       struct ip_masq *ms;
-       const __u16 *portp = (__u16 *)&(((char *)iph)[iph->ihl*4]);
-
-       IP_VS_DBG("ip_vs_rr_schedule(): Scheduling...\n");
-
-        if (svc->sched_data != NULL) 
-                svc->sched_data = ((struct ip_vs_dest*)svc->sched_data)->next;
-        if (svc->sched_data == NULL) 
-                svc->sched_data = svc->destinations;
-        if (svc->sched_data == NULL)
-                return NULL;
-
-        dest = svc->sched_data;
-
-       /*
-         *    Create a masquerading entry.
-         */
-        ms = ip_masq_new_vs(iph->protocol,
-                            iph->daddr, portp[1],      
-                            dest->addr, dest->port,
-                            iph->saddr, portp[0],
-                            0);
-       if (ms == NULL) {
-               IP_VS_ERR("ip_masq_new failed\n");
-               return NULL;
-       }
-
-        /*
-         *    Bind the masq entry with the vs dest.
-         */
-        ip_vs_bind_masq(ms, dest);
-        
-        IP_VS_DBG("Masq fwd:%c s:%s c:%lX:%x v:%lX:%x d:%lX:%x flg:%X cnt:%d\n",
-                  ip_vs_fwd_tag(ms), ip_masq_state_name(ms->state),
-                  ntohl(ms->daddr),ntohs(ms->dport),
-                  ntohl(ms->maddr),ntohs(ms->mport),
-                  ntohl(ms->saddr),ntohs(ms->sport),
-                  ms->flags, atomic_read(&ms->refcnt));
-
-       return ms;
-}
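The round-robin state above is just a cursor (`sched_data`) into the singly linked destination list: advance it one node, wrap to the head when it falls off the end, and fail only when the list is empty. The same three-step advance as a userspace sketch (node and helper names hypothetical):

```c
#include <assert.h>
#include <stddef.h>

struct node {
    struct node *next;
    int id;
};

/* Advance the round-robin cursor, wrapping to head when exhausted;
 * returns NULL only for an empty list, as in ip_vs_rr_schedule(). */
static struct node *rr_next(struct node **cursor, struct node *head)
{
    if (*cursor != NULL)
        *cursor = (*cursor)->next;
    if (*cursor == NULL)
        *cursor = head;
    return *cursor;
}
```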
-
-
-static struct ip_vs_scheduler ip_vs_rr_scheduler = {
-       NULL,                   /* next */
-       "rr",                   /* name */
-       ATOMIC_INIT(0),         /* refcnt */
-       ip_vs_rr_init_svc,      /* service initializer */
-       ip_vs_rr_done_svc,      /* service done */
-       ip_vs_rr_schedule,      /* select a server and create new masq entry */
-};
-
-
-__initfunc(int ip_vs_rr_init(void))
-{
-       IP_VS_INFO("Initializing RR scheduling\n");
-       return register_ip_vs_scheduler(&ip_vs_rr_scheduler) ;
-}
-
-#ifdef MODULE
-EXPORT_NO_SYMBOLS;
-
-int init_module(void)
-{
-       /* module initialization by 'request_module' */
-       if(register_ip_vs_scheduler(&ip_vs_rr_scheduler) != 0)
-               return -EIO;
-
-       IP_VS_INFO("RR scheduling module loaded.\n");
-       
-        return 0;
-}
-
-void cleanup_module(void)
-{
-       /* module cleanup by 'release_module' */
-       if(unregister_ip_vs_scheduler(&ip_vs_rr_scheduler) != 0)
-               IP_VS_INFO("cannot remove RR scheduling module\n");
-       else
-               IP_VS_INFO("RR scheduling module unloaded.\n");
-}
-
-#endif /* MODULE */
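The round-robin selection deleted above reduces to a three-step pointer advance. A minimal userspace C sketch of just that logic (`struct dest` and `rr_next` are hypothetical stand-ins for `struct ip_vs_dest` and the scheduling core of `ip_vs_rr_schedule()`; the masquerading-entry creation is omitted):

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for struct ip_vs_dest: a singly linked list of destinations. */
struct dest {
    struct dest *next;
    int id;
};

/* Mirrors the scheduling core of ip_vs_rr_schedule(): advance the
 * saved pointer, wrap to the list head when it runs off the end, and
 * return NULL when the service has no destinations. */
struct dest *rr_next(struct dest **sched_data, struct dest *head)
{
    if (*sched_data != NULL)
        *sched_data = (*sched_data)->next;
    if (*sched_data == NULL)
        *sched_data = head;
    return *sched_data;     /* NULL only when the list is empty */
}
```

Each `ip_vs_service` keeps its own `sched_data` pointer, so the rotation is per-service rather than global.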
diff --git a/net/ipv4/ip_vs_wlc.c b/net/ipv4/ip_vs_wlc.c
deleted file mode 100644 (file)
index 501a68a..0000000
+++ /dev/null
@@ -1,167 +0,0 @@
-/*
- * IPVS:        Weighted Least-Connection Scheduling module
- *
- * Version:     $Id: ip_vs_wlc.c,v 1.1.2.1 1999/08/13 18:25:44 davem Exp $
- *
- * Authors:     Wensong Zhang <wensong@iinchina.net>
- *              Peter Kese <peter.kese@ijs.si>
- *
- *              This program is free software; you can redistribute it and/or
- *              modify it under the terms of the GNU General Public License
- *              as published by the Free Software Foundation; either version
- *              2 of the License, or (at your option) any later version.
- *
- * Changes:
- *
- */
-
-#include <linux/config.h>
-#include <linux/module.h>
-#ifdef CONFIG_KMOD
-#include <linux/kmod.h>
-#endif
-#include <linux/types.h>
-#include <linux/kernel.h>
-#include <linux/errno.h>
-#include <net/ip_masq.h>
-#ifdef CONFIG_IP_MASQUERADE_MOD
-#include <net/ip_masq_mod.h>
-#endif
-#include <linux/sysctl.h>
-#include <linux/ip_fw.h>
-#include <net/ip_vs.h>
-
-
-static int ip_vs_wlc_init_svc(struct ip_vs_service *svc)
-{
-        MOD_INC_USE_COUNT;
-        return 0;
-}
-
-
-static int ip_vs_wlc_done_svc(struct ip_vs_service *svc)
-{
-        MOD_DEC_USE_COUNT;
-        return 0;
-}
-
-
-/*
- *    Weighted Least Connection scheduling
- */
-static struct ip_masq* ip_vs_wlc_schedule(struct ip_vs_service *svc, 
-                                        struct iphdr *iph)
-{
-       struct ip_masq *ms;
-       struct ip_vs_dest *dest, *least;
-       int loh, doh;
-       const __u16 *portp = (__u16 *)&(((char *)iph)[iph->ihl*4]);
-
-       IP_VS_DBG("ip_vs_wlc_schedule(): Scheduling...\n");
-
-       if (svc->destinations == NULL) return NULL;
-
-       /*
-         * The number of connections in TCP_FIN state is
-         *                 dest->refcnt - dest->connections - 1
-         * We estimate that processing an active connection costs about
-         * fifty times as much as a connection in TCP_FIN state on
-         * average. (This factor of fifty may not be accurate; we will
-         * tune it later.) We use the following formula to estimate the
-         * overhead:
-         *                dest->connections*49 + dest->refcnt
-         * and the load:
-         *                (dest overhead) / dest->weight
-         *
-         * Remember -- no floats in kernel mode!!!
-         * The comparison h1*w2 > h2*w1 is equivalent to h1/w1 > h2/w2
-         * if every weight is larger than zero.
-         */
-
-       least = svc->destinations;
-       loh = atomic_read(&least->connections)*49 + atomic_read(&least->refcnt);
-        
-        /*
-         *    Find the destination with the least load.
-         */
-       for (dest = least->next; dest; dest = dest->next) {
-               doh = atomic_read(&dest->connections)*49 + atomic_read(&dest->refcnt);
-               if (loh*dest->weight > doh*least->weight) {
-                       least = dest;
-                       loh = doh;
-               }
-       }
-
-        IP_VS_DBG("The selected server: connections %d refcnt %d weight %d "
-                  "overhead %d\n", atomic_read(&least->connections),
-                  atomic_read(&least->refcnt), least->weight, loh);
-
-       /*
-         *    Create a masquerading entry.
-         */
-        ms = ip_masq_new_vs(iph->protocol,
-                            iph->daddr, portp[1],      
-                            least->addr, least->port,
-                            iph->saddr, portp[0],
-                            0);
-       if (ms == NULL) {
-               IP_VS_ERR("ip_masq_new failed\n");
-               return NULL;
-       }
-
-        /*
-         *    Bind the masq entry with the vs dest.
-         */
-        ip_vs_bind_masq(ms, least);
-        
-        IP_VS_DBG("Masq fwd:%c s:%s c:%lX:%x v:%lX:%x d:%lX:%x flg:%X cnt:%d\n",
-                  ip_vs_fwd_tag(ms), ip_masq_state_name(ms->state),
-                  ntohl(ms->daddr),ntohs(ms->dport),
-                  ntohl(ms->maddr),ntohs(ms->mport),
-                  ntohl(ms->saddr),ntohs(ms->sport),
-                  ms->flags, atomic_read(&ms->refcnt));
-
-        return ms;
-}
-
-
-static struct ip_vs_scheduler ip_vs_wlc_scheduler = {
-       NULL,                   /* next */
-       "wlc",                  /* name */
-       ATOMIC_INIT(0),         /* refcnt */
-       ip_vs_wlc_init_svc,     /* service initializer */
-       ip_vs_wlc_done_svc,     /* service done */
-       ip_vs_wlc_schedule,     /* select a server and create new masq entry */
-};
-
-
-__initfunc(int ip_vs_wlc_init(void))
-{
-       IP_VS_INFO("Initializing WLC scheduling\n");
-        return register_ip_vs_scheduler(&ip_vs_wlc_scheduler) ;
-}
-
-#ifdef MODULE
-EXPORT_NO_SYMBOLS;
-
-int init_module(void)
-{
-       /* module initialization by 'request_module' */
-       if(register_ip_vs_scheduler(&ip_vs_wlc_scheduler) != 0)
-               return -EIO;
-
-       IP_VS_INFO("WLC scheduling module loaded.\n");
-       
-        return 0;
-}
-
-void cleanup_module(void)
-{
-       /* module cleanup by 'release_module' */
-       if(unregister_ip_vs_scheduler(&ip_vs_wlc_scheduler) != 0)
-               IP_VS_INFO("cannot remove WLC scheduling module\n");
-       else
-               IP_VS_INFO("WLC scheduling module unloaded.\n");
-}
-
-#endif /* MODULE */
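The weighted least-connection scheduler deleted above estimates each destination's overhead as `connections*49 + refcnt` and compares loads by cross-multiplying with the weights, avoiding floating point in the kernel. A userspace C sketch of just that comparison (`struct server` and `wlc_pick` are hypothetical names, not kernel API):

```c
#include <assert.h>

/* Stand-in for the fields of struct ip_vs_dest that WLC reads. */
struct server {
    int connections;    /* active connections */
    int refcnt;         /* all references, including TCP_FIN entries */
    int weight;         /* configured server weight, assumed > 0 */
};

/* Overhead estimate from ip_vs_wlc_schedule(): an active connection
 * is assumed to cost about 50x a connection lingering in TCP_FIN. */
static int overhead(const struct server *s)
{
    return s->connections * 49 + s->refcnt;
}

/* Return the index of the least-loaded server.  The float-free trick:
 * loh/w_least > doh/w_dest is tested as loh*w_dest > doh*w_least,
 * which is equivalent as long as every weight is positive. */
int wlc_pick(const struct server *srv, int n)
{
    int least = 0;
    int loh = overhead(&srv[0]);

    for (int i = 1; i < n; i++) {
        int doh = overhead(&srv[i]);
        if (loh * srv[i].weight > doh * srv[least].weight) {
            least = i;
            loh = doh;
        }
    }
    return least;
}
```

With equal overheads the higher-weighted server wins, since its effective load is lower.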
diff --git a/net/ipv4/ip_vs_wrr.c b/net/ipv4/ip_vs_wrr.c
deleted file mode 100644 (file)
index 5bbeaa8..0000000
+++ /dev/null
@@ -1,196 +0,0 @@
-/*
- * IPVS:        Weighted Round-Robin Scheduling module
- *
- * Version:     $Id: ip_vs_wrr.c,v 1.1.2.1 1999/08/13 18:25:49 davem Exp $
- *
- * Authors:     Wensong Zhang <wensong@iinchina.net>
- *
- *              This program is free software; you can redistribute it and/or
- *              modify it under the terms of the GNU General Public License
- *              as published by the Free Software Foundation; either version
- *              2 of the License, or (at your option) any later version.
- *
- * Changes:
- *
- */
-
-#include <linux/config.h>
-#include <linux/module.h>
-#ifdef CONFIG_KMOD
-#include <linux/kmod.h>
-#endif
-#include <linux/types.h>
-#include <linux/kernel.h>
-#include <linux/errno.h>
-#include <net/ip_masq.h>
-#ifdef CONFIG_IP_MASQUERADE_MOD
-#include <net/ip_masq_mod.h>
-#endif
-#include <linux/sysctl.h>
-#include <linux/ip_fw.h>
-#include <net/ip_vs.h>
-
-/*
- * current destination pointer for weighted round-robin scheduling
- */
-struct ip_vs_wrr_mark {
-        struct ip_vs_dest *cdest;    /* current destination pointer */
-        int cw;                      /* current weight */
-};
-
-
-static int ip_vs_wrr_init_svc(struct ip_vs_service *svc)
-{
-       /*
-         *    Allocate the mark variable for WRR scheduling
-         */
-        svc->sched_data = kmalloc(sizeof(struct ip_vs_wrr_mark), GFP_ATOMIC);
-
-        if (svc->sched_data == NULL) {
-                IP_VS_ERR("ip_vs_wrr_init_svc(): no memory\n");
-               return -ENOMEM;
-        }
-        memset(svc->sched_data, 0, sizeof(struct ip_vs_wrr_mark));
-
-        MOD_INC_USE_COUNT;
-        return 0;
-}
-
-
-static int ip_vs_wrr_done_svc(struct ip_vs_service *svc)
-{
-        /*
-         *    Release the mark variable
-         */
-        kfree_s(svc->sched_data, sizeof(struct ip_vs_wrr_mark));
-        
-        MOD_DEC_USE_COUNT;
-        return 0;
-}
-
-
-int ip_vs_wrr_max_weight(struct ip_vs_dest *destinations)
-{
-        struct ip_vs_dest *dest;
-        int weight = 0;
-
-        for (dest=destinations; dest; dest=dest->next) {
-                if (dest->weight > weight)
-                        weight = dest->weight;
-        }
-
-        return weight;
-}
-
-        
-/*
- *    Weighted Round-Robin Scheduling
- */
-static struct ip_masq* ip_vs_wrr_schedule(struct ip_vs_service *svc, 
-                                        struct iphdr *iph)
-{
-       struct ip_masq *ms;
-       const __u16 *portp = (__u16 *)&(((char *)iph)[iph->ihl*4]);
-        struct ip_vs_wrr_mark *mark = svc->sched_data;
-        struct ip_vs_dest *dest;
-
-       IP_VS_DBG("ip_vs_wrr_schedule(): Scheduling...\n");
-
-       if (svc->destinations == NULL) return NULL;
-
-        /*
-         * This loop always terminates, because 0 < mark->cw <= max_weight
-         * and at least one server has its weight equal to max_weight.
-         */
-        while (1) {
-                if (mark->cdest == NULL) {
-                        mark->cdest = svc->destinations;
-                        mark->cw--;
-                        if (mark->cw <= 0) {
-                                mark->cw = ip_vs_wrr_max_weight(svc->destinations);
-                                /*
-                                 * Still zero, which means no available servers.
-                                 */
-                                if (mark->cw == 0) {
-                                        IP_VS_INFO("ip_vs_wrr_schedule(): no available servers\n");
-                                        return NULL;
-                                }
-                        }
-                }
-                else mark->cdest = mark->cdest->next;
-
-                if(mark->cdest && (mark->cdest->weight >= mark->cw))
-                        break;
-        }
-        
-       dest = mark->cdest;
-        
-       /*
-         *    Create a masquerading entry.
-         */
-        ms = ip_masq_new_vs(iph->protocol,
-                            iph->daddr, portp[1],      
-                            dest->addr, dest->port,
-                            iph->saddr, portp[0],
-                            0);
-       if (ms == NULL) {
-               IP_VS_ERR("ip_masq_new failed\n");
-               return NULL;
-       }
-
-        /*
-         *    Bind the masq entry with the vs dest.
-         */
-        ip_vs_bind_masq(ms, dest);
-        
-        IP_VS_DBG("Masq fwd:%c s:%s c:%lX:%x v:%lX:%x d:%lX:%x flg:%X cnt:%d\n",
-                  ip_vs_fwd_tag(ms), ip_masq_state_name(ms->state),
-                  ntohl(ms->daddr),ntohs(ms->dport),
-                  ntohl(ms->maddr),ntohs(ms->mport),
-                  ntohl(ms->saddr),ntohs(ms->sport),
-                  ms->flags, atomic_read(&ms->refcnt));
-
-       return ms;
-}
-
-
-static struct ip_vs_scheduler ip_vs_wrr_scheduler = {
-       NULL,                   /* next */
-       "wrr",                  /* name */
-       ATOMIC_INIT(0),         /* refcnt */
-       ip_vs_wrr_init_svc,     /* service initializer */
-       ip_vs_wrr_done_svc,     /* service done */
-       ip_vs_wrr_schedule,     /* select a server and create new masq entry */
-};
-
-
-__initfunc(int ip_vs_wrr_init(void))
-{
-       IP_VS_INFO("Initializing WRR scheduling\n");
-       return register_ip_vs_scheduler(&ip_vs_wrr_scheduler) ;
-}
-
-#ifdef MODULE
-EXPORT_NO_SYMBOLS;
-
-int init_module(void)
-{
-       /* module initialization by 'request_module' */
-       if(register_ip_vs_scheduler(&ip_vs_wrr_scheduler) != 0)
-               return -EIO;
-
-       IP_VS_INFO("WRR scheduling module loaded.\n");
-       
-        return 0;
-}
-
-void cleanup_module(void)
-{
-       /* module cleanup by 'release_module' */
-       if(unregister_ip_vs_scheduler(&ip_vs_wrr_scheduler) != 0)
-               IP_VS_INFO("cannot remove WRR scheduling module\n");
-       else
-               IP_VS_INFO("WRR scheduling module unloaded.\n");
-}
-
-#endif /* MODULE */
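The weighted round-robin scheduler deleted above walks the destination list repeatedly, lowering a "current weight" threshold by one on each wrap, so a server with weight w is picked w times per max_weight passes. A userspace C sketch using an array index in place of the cdest pointer (-1 plays the role of NULL; all names here are stand-ins, not kernel API):

```c
#include <assert.h>

struct wdest {
    int weight;
};

/* Same job as ip_vs_wrr_max_weight(): highest weight in the set. */
static int wrr_max_weight(const struct wdest *d, int n)
{
    int w = 0;
    for (int i = 0; i < n; i++)
        if (d[i].weight > w)
            w = d[i].weight;
    return w;
}

/* Stand-in for struct ip_vs_wrr_mark; cdest is an index, -1 == NULL. */
struct wrr_mark {
    int cdest;  /* current destination */
    int cw;     /* current weight threshold */
};

/* Mirrors the selection loop of ip_vs_wrr_schedule(): each full pass
 * lowers cw by one, so servers are chosen in proportion to weight.
 * Returns -1 when no server is available. */
int wrr_next(struct wrr_mark *m, const struct wdest *d, int n)
{
    if (n == 0)                 /* the "destinations == NULL" check */
        return -1;

    while (1) {
        if (m->cdest < 0) {
            m->cdest = 0;
            m->cw--;
            if (m->cw <= 0) {
                m->cw = wrr_max_weight(d, n);
                if (m->cw == 0)
                    return -1;  /* no server has a positive weight */
            }
        } else {
            m->cdest++;
            if (m->cdest >= n)
                m->cdest = -1;  /* ran off the end, like a NULL next */
        }
        if (m->cdest >= 0 && d[m->cdest].weight >= m->cw)
            return m->cdest;
    }
}
```

For two servers with weights 2 and 1, the selection sequence repeats as 0, 0, 1: the heavier server is chosen twice per cycle.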
diff --git a/sound/solo1 b/sound/solo1
deleted file mode 100644 (file)
index 1c0a641..0000000
+++ /dev/null
@@ -1,48 +0,0 @@
-ALaw/uLaw sample formats
-------------------------
-
-This driver does not support the ALaw/uLaw sample formats.
-ALaw is the default mode when opening a sound device
-using OSS/Free. The reason for the lack of support is
-that the hardware does not support these formats, and adding
-conversion routines to the kernel would lead to very ugly
-code in the presence of the mmap interface to the driver.
-And since xquake uses mmap, mmap is considered important :-)
-and no sane application uses ALaw/uLaw these days anyway.
-In short, playing a Sun .au file as follows:
-
-cat my_file.au > /dev/dsp
-
-does not work. Instead, you may use the play script from
-Chris Bagwell's sox-12.14 package (or later, available from the URL
-below) to play many different audio file formats.
-The script automatically determines the audio format
-and performs audio conversions if necessary.
-http://home.sprynet.com/sprynet/cbagwell/projects.html
-
-
-Blocking vs. nonblocking IO
----------------------------
-
-Unlike OSS/Free this driver honours the O_NONBLOCK file flag
-not only during open, but also during read and write.
-This is an effort to make the sound driver interface more
-regular. Timidity has problems with this; a patch
-is available from http://www.ife.ee.ethz.ch/~sailer/linux/pciaudio.html.
-(The patched Timidity will also run on OSS/Free.)
-
-
-MIDI UART
----------
-
-The driver supports a simple MIDI UART interface;
-no ioctls are supported.
-
-
-MIDI synthesizer
-----------------
-
-The card has an OPL compatible FM synthesizer.
-
-Thomas Sailer
-sailer@ife.ee.ethz.ch