S: 8103 Rein
S: Austria
+N: Jan Kara
+E: jack@atrey.karlin.mff.cuni.cz
+D: Quota fixes for 2.2 kernel
+D: Few other fixes in filesystem area (isofs, loopback)
+W: http://atrey.karlin.mff.cuni.cz/~jack/
+S: Krosenska' 543
+S: 181 00 Praha 8
+S: Czech Republic
+
N: Jan "Yenya" Kasprzak
E: kas@fi.muni.cz
D: Author of the COSA/SRP sync serial board driver.
If unsure, say N.
-Autodetect RAID partitions
-CONFIG_AUTODETECT_RAID
- This feature lets the kernel detect RAID partitions on bootup.
- An autodetect RAID partition is a normal partition with partition
- type 0xfd. Use this if you want to boot RAID devices, or want to
- run them automatically.
-
Linear (append) mode
CONFIG_MD_LINEAR
If you say Y here, then your multiple devices driver will be able to
If unsure, say Y.
-Translucent Block Device Support (EXPERIMENTAL)
-CONFIG_MD_TRANSLUCENT
- DO NOT USE THIS STUFF YET!
-
- currently there is only a placeholder there as the implementation
- is not yet usable.
-
-Logical Volume Manager support (EXPERIMENTAL)
-CONFIG_MD_LVM
- DO NOT USE THIS STUFF YET!
-
- i have released this so people can comment on the architecture,
- but user-space tools are still unusable so there is nothing much
- you can do with this.
-
Boot support (linear, striped)
CONFIG_MD_BOOT
To boot with an initial linear or striped md device you have to
The module will be called ip_masq_markfw.o. If you want to compile
it as a module, say M here and read Documentation/modules.txt.
-IP: masquerading virtual server support
-CONFIG_IP_MASQUERADE_VS
- IP Virtual Server support will let you build a virtual server
- based on cluster of two or more real servers. This option must
- be enabled for at least one of the clustered computers that will
- take care of intercepting incomming connections to the virtual IP
- and scheduling them to real servers.
- Three request dispatching techniques are implemented, they are
- virtual server via NAT, virtual server via tunneling and virtual
- server via direct routing. The round-robin scheduling, the weighted
- round-robin secheduling, or the weighted least-connection scheduling
- algorithm can be used to choose which server the connection is
- directed to, thus load balancing can be achieved among the servers.
- For more information and its administration program, please visit
- the following URL:
- http://proxy.iinchina.net/~wensong/ippfvs/
- If you want this, say Y.
-
-IP masquerading VS table size (the Nth power of 2)
-CONFIG_IP_MASQUERADE_VS_TAB_BITS
- Using a big IP masquerading hash table for virtual server will greatly
- reduce conflicts in the masquerading hash table when there are
- thousands of active connections.
- Note the table size must be power of 2. The table size will be the
- value of 2 to the your input number power. For example, the default
- number is 12, so the table size is 4096. Don't input the number too
- small, otherwise you will lose performance on it.
- You can adapt the table size yourself, according to your virtual
- server application. It is good to set the table size larger than
- the number of connections per second multiplying average lasting time
- of connection in the table. For example, your virtual server gets
- 20 connections per second, the connection lasts for 200 seconds in
- average in the masquerading table, the table size should be larger
- than 20x200, it is good to set the table size 4096 (2**12).
-
-IPVS: round-robin scheduling
-CONFIG_IP_MASQUERADE_VS_RR
- The robin-robin scheduling algorithm simply directs network
- connections to different real servers in a round-robin manner.
- If you want to compile it in kernel, say Y. If you want to compile
- it as a module, say M here and read Documentation/modules.txt.
-
-IPVS: weighted round-robin scheduling
-CONFIG_IP_MASQUERADE_VS_WRR
- The weighted robin-robin scheduling algorithm directs network
- connections to different real servers based on server weights
- in a round-robin manner. Servers with higher weights receive
- new connections first than those with less weights, and servers
- with higher weights get more connections than those with less
- weights and servers with equal weights get equal connections.
- If you want to compile it in kernel, say Y. If you want to compile
- it as a module, say M here and read Documentation/modules.txt.
-
-IPVS: weighted least-connection scheduling
-CONFIG_IP_MASQUERADE_VS_WLC
- The weighted least-connection scheduling algorithm directs network
- connections to the server with the least number of alive connections
- dividing the server weight.
- If you want to compile it in kernel, say Y. If you want to compile
- it as a module, say M here and read Documentation/modules.txt.
-
-IPVS: persistent client connection scheduling
-CONFIG_IP_MASQUERADE_VS_PCC
- The persistent client connection feature means that after a client
- establishs a connection to the selected server, all connections
- from the same client will be directed to the same server in a
- specified period.
- If you want to compile it in kernel, say Y. If you want to compile
- it as a module, say M here and read Documentation/modules.txt.
-
IP: always defragment (required for masquerading)
CONFIG_IP_ALWAYS_DEFRAG
If you say Y here, then all incoming fragments (parts of IP packets
--- /dev/null
+ Mylex DAC960/DAC1100 PCI RAID Controller Driver for Linux
+
+ Version 2.2.4 for Linux 2.2.11
+ Version 2.0.4 for Linux 2.0.37
+
+ PRODUCTION RELEASE
+
+ 23 August 1999
+
+ Leonard N. Zubkoff
+ Dandelion Digital
+ lnz@dandelion.com
+
+ Copyright 1998-1999 by Leonard N. Zubkoff <lnz@dandelion.com>
+
+
+ INTRODUCTION
+
+Mylex, Inc. designs and manufactures a variety of high performance PCI RAID
+controllers. Mylex Corporation is located at 34551 Ardenwood Blvd., Fremont,
+California 94555, USA and can be reached at 510/796-6100 or on the World Wide
+Web at http://www.mylex.com. Mylex RAID Technical Support can be reached by
+electronic mail at support@mylex.com (for eXtremeRAID 1100 and older DAC960
+models) or techsup@mylex.com (for AcceleRAID models), by voice at 510/608-2400,
+or by FAX at 510/745-7715. Contact information for offices in Europe and Japan
+is available on the Web site.
+
+The latest information on Linux support for DAC960 PCI RAID Controllers, as
+well as the most recent release of this driver, will always be available from
+my Linux Home Page at URL "http://www.dandelion.com/Linux/". The Linux DAC960
+driver supports all current DAC960 PCI family controllers including the
+AcceleRAID models, as well as the eXtremeRAID 1100; see below for a complete
+list. For simplicity, in most places this documentation refers to DAC960
+generically rather than explicitly listing all the models.
+
+Bug reports should be sent via electronic mail to "lnz@dandelion.com". Please
+include with the bug report the complete configuration messages reported by the
+driver at startup, along with any subsequent system messages relevant to the
+controller's operation, and a detailed description of your system's hardware
+configuration.
+
+Please consult the DAC960 RAID controller documentation for detailed
+information regarding installation and configuration of the controllers. This
+document primarily provides information specific to the Linux DAC960 support.
+
+
+ DRIVER FEATURES
+
+The DAC960 RAID controllers are supported solely as high performance RAID
+controllers, not as interfaces to arbitrary SCSI devices. The Linux DAC960
+driver operates at the block device level, the same level as the SCSI and IDE
+drivers. Unlike other RAID controllers currently supported on Linux, the
+DAC960 driver is not dependent on the SCSI subsystem, and hence avoids all the
+complexity and unnecessary code that would be associated with an implementation
+as a SCSI driver. The DAC960 driver is designed for as high a performance as
+possible with no compromises or extra code for compatibility with lower
+performance devices. The DAC960 driver includes extensive error logging and
+online configuration management capabilities. Except for initial configuration
+of the controller and adding new disk drives, almost everything can be handled
+from Linux while the system is operational.
+
+The DAC960 driver is architected to support up to 8 controllers per system.
+Each DAC960 controller can support up to 15 disk drives per channel, for a
+maximum of 45 drives on a three channel controller. The drives installed on a
+controller are divided into one or more "Drive Groups", and then each Drive
+Group is subdivided further into 1 to 32 "Logical Drives". Each Logical Drive
+has a specific RAID Level and caching policy associated with it, and it appears
+to Linux as a single block device. Logical Drives are further subdivided into
+up to 7 partitions through the normal Linux and PC disk partitioning schemes.
+Logical Drives are also known as "System Drives", and Drive Groups are also
+called "Packs". Both terms are in use in the Mylex documentation; I have
+chosen to standardize on the more generic "Logical Drive" and "Drive Group".
+
+DAC960 RAID disk devices are named in the style of the Device File System
+(DEVFS). The device corresponding to Logical Drive D on Controller C is
+referred to as /dev/rd/cCdD, and the partitions are called /dev/rd/cCdDp1
+through /dev/rd/cCdDp7. For example, partition 3 of Logical Drive 5 on
+Controller 2 is referred to as /dev/rd/c2d5p3. Note that unlike with SCSI
+disks the device names will not change in the event of a disk drive failure.
+The DAC960 driver is assigned major numbers 48 - 55 with one major number per
+controller. The 8 bits of minor number are divided into 5 bits for the Logical
+Drive and 3 bits for the partition.
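The device numbers follow directly from this layout; the "make_rd" script mentioned under DRIVER INSTALLATION creates the device files for you, but as an illustration, here is a sketch (not part of the driver) deriving the numbers for the /dev/rd/c2d5p3 example above:

```shell
# Major = 48 + controller number; the 8-bit minor is 5 bits of
# Logical Drive number and 3 bits of partition (0 = whole drive).
C=2; D=5; P=3                      # /dev/rd/c2d5p3
MAJOR=$((48 + C))
MINOR=$(( (D << 3) | P ))
echo "mknod /dev/rd/c${C}d${D}p${P} b $MAJOR $MINOR"
```

This prints "mknod /dev/rd/c2d5p3 b 50 43".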
+
+
+ SUPPORTED DAC960/DAC1100 PCI RAID CONTROLLERS
+
+The following list comprises the supported DAC960 and DAC1100 PCI RAID
+Controllers as of the date of this document. It is recommended that anyone
+purchasing a Mylex PCI RAID Controller not in the following table contact the
+author beforehand to verify that it is or will be supported.
+
+eXtremeRAID 1100 (DAC1164P)
+ 3 Wide Ultra-2/LVD SCSI channels
+ 233MHz StrongARM SA 110 Processor
+ 64 Bit PCI (backward compatible with 32 Bit PCI slots)
+ 16MB/32MB/64MB Parity SDRAM Memory with Battery Backup
+
+AcceleRAID 250 (DAC960PTL1)
+ Uses onboard Symbios SCSI chips on certain motherboards
+ Also includes one onboard Wide Ultra-2/LVD SCSI Channel
+ 66MHz Intel i960RD RISC Processor
+ 4MB/8MB/16MB/32MB/64MB/128MB ECC EDO Memory
+
+AcceleRAID 200 (DAC960PTL0)
+ Uses onboard Symbios SCSI chips on certain motherboards
+ Includes no onboard SCSI Channels
+ 66MHz Intel i960RD RISC Processor
+ 4MB/8MB/16MB/32MB/64MB/128MB ECC EDO Memory
+
+AcceleRAID 150 (DAC960PRL)
+ Uses onboard Symbios SCSI chips on certain motherboards
+ Also includes one onboard Wide Ultra-2/LVD SCSI Channel
+ 33MHz Intel i960RP RISC Processor
+ 4MB Parity EDO Memory
+
+DAC960PJ 1/2/3 Wide Ultra SCSI-3 Channels
+ 66MHz Intel i960RD RISC Processor
+ 4MB/8MB/16MB/32MB/64MB/128MB ECC EDO Memory
+
+DAC960PG 1/2/3 Wide Ultra SCSI-3 Channels
+ 33MHz Intel i960RP RISC Processor
+ 4MB/8MB ECC EDO Memory
+
+DAC960PU 1/2/3 Wide Ultra SCSI-3 Channels
+ Intel i960CF RISC Processor
+ 4MB/8MB EDRAM or 2MB/4MB/8MB/16MB/32MB DRAM Memory
+
+DAC960PD 1/2/3 Wide Fast SCSI-2 Channels
+ Intel i960CF RISC Processor
+ 4MB/8MB EDRAM or 2MB/4MB/8MB/16MB/32MB DRAM Memory
+
+DAC960PL 1/2/3 Wide Fast SCSI-2 Channels
+ Intel i960 RISC Processor
+ 2MB/4MB/8MB/16MB/32MB DRAM Memory
+
+For the eXtremeRAID 1100, firmware version 5.06-0-52 or above is required.
+
+For the AcceleRAID 250, 200, and 150, firmware version 4.06-0-57 or above is
+required.
+
+For the DAC960PJ and DAC960PG, firmware version 4.06-0-00 or above is required.
+
+For the DAC960PU, DAC960PD, and DAC960PL, firmware version 3.51-0-04 or above
+is required.
+
+Note that earlier revisions of the DAC960PU, DAC960PD, and DAC960PL controllers
+were delivered with version 2.xx firmware. Version 2.xx firmware is not
+supported by this driver and no support is envisioned. Contact Mylex RAID
+Technical Support to inquire about upgrading controllers with version 2.xx
+firmware to version 3.51-0-04. Upgrading to version 3.xx firmware requires
+installation of higher capacity Flash ROM chips, and not all DAC960PD and
+DAC960PL controllers can be upgraded.
+
+Please note that not all SCSI disk drives are suitable for use with DAC960
+controllers, and only particular firmware versions of any given model may
+actually function correctly. Similarly, not all motherboards have a BIOS that
+properly initializes the AcceleRAID 250, AcceleRAID 200, AcceleRAID 150,
+DAC960PJ, and DAC960PG because the Intel i960RD/RP is a multi-function device.
+If in doubt, contact Mylex RAID Technical Support (support@mylex.com) to verify
+compatibility. Mylex makes available a hard disk compatibility list by FTP at
+ftp://ftp.mylex.com/pub/dac960/diskcomp.html.
+
+
+ DRIVER INSTALLATION
+
+This distribution was prepared for Linux kernel version 2.2.11 or 2.0.37.
+
+To install the DAC960 RAID driver, you may use the following commands,
+replacing "/usr/src" with wherever you keep your Linux kernel source tree:
+
+ cd /usr/src
+ tar -xvzf DAC960-2.2.4.tar.gz (or DAC960-2.0.4.tar.gz)
+ mv README.DAC960 linux/Documentation
+ mv DAC960.[ch] linux/drivers/block
+ patch -p0 < DAC960.patch
+ cd linux
+ make config
+ make depend
+ make bzImage (or zImage)
+
+Then install "arch/i386/boot/bzImage" or "arch/i386/boot/zImage" as your
+standard kernel, run lilo if appropriate, and reboot.
+
+To create the necessary devices in /dev, the "make_rd" script included in
+"DAC960-Utilities.tar.gz" from http://www.dandelion.com/Linux/ may be used.
+LILO 21 and FDISK v2.9 include DAC960 support; also included in this archive
+are patches to LILO 20 and FDISK v2.8 that add DAC960 support, along with
+statically linked executables of LILO and FDISK. This modified version of LILO
+will allow booting from a DAC960 controller and/or mounting the root file
+system from a DAC960.
+
+Red Hat Linux 6.0 and SuSE Linux 6.1 include support for Mylex PCI RAID
+controllers. Installing directly onto a DAC960 may be problematic from other
+Linux distributions until their installation utilities are updated.
+
+
+ INSTALLATION NOTES
+
+Before installing Linux or adding DAC960 logical drives to an existing Linux
+system, the controller must first be configured to provide one or more logical
+drives using the BIOS Configuration Utility or DACCF. Please note that since
+there are at most 6 usable partitions on each logical drive, systems
+requiring more partitions should subdivide a drive group into multiple logical
+drives, each of which can have up to 6 partitions. Also, note that with large
+disk arrays it is advisable to enable the 8GB BIOS Geometry (255/63) rather
+than accepting the default 2GB BIOS Geometry (128/32); failing to do so will
+cause the logical drive geometry to have more than 65535 cylinders which will
+make it impossible for FDISK to be used properly. The 8GB BIOS Geometry can be
+enabled by configuring the DAC960 BIOS, which is accessible via Alt-M during
+the BIOS initialization sequence.
+
+For maximum performance and the most efficient E2FSCK performance, it is
+recommended that EXT2 file systems be built with a 4KB block size and 16 block
+stride to match the DAC960 controller's 64KB default stripe size. The command
+"mke2fs -b 4096 -R stride=16 <device>" is appropriate. Unless there will be a
+large number of small files on the file systems, it is also beneficial to add
+the "-i 16384" option to increase the bytes per inode parameter thereby
+reducing the file system metadata. Finally, on systems that will only be run
+with Linux 2.2 or later kernels it is beneficial to enable sparse superblocks
+with the "-s 1" option.
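The stride value is simply the controller stripe size divided by the file system block size, which a short shell sketch makes explicit (the device name is only a placeholder):

```shell
# Stride = stripe size / block size.  For the DAC960 default 64KB
# stripe and a 4KB ext2 block this gives 16.
STRIPE_KB=64
BLOCK_KB=4
STRIDE=$((STRIPE_KB / BLOCK_KB))
echo "mke2fs -b $((BLOCK_KB * 1024)) -R stride=$STRIDE /dev/rd/c0d0p1"
```

If a non-default stripe size was configured on the controller, substitute it for the 64KB figure and recompute the stride accordingly.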
+
+
+ DAC960 ANNOUNCEMENTS MAILING LIST
+
+The DAC960 Announcements Mailing List provides a forum for informing Linux
+users of new driver releases and other announcements regarding Linux support
+for DAC960 PCI RAID Controllers. To join the mailing list, send a message to
+"dac960-announce-request@dandelion.com" with the line "subscribe" in the
+message body.
+
+
+ CONTROLLER CONFIGURATION AND STATUS MONITORING
+
+The DAC960 RAID controllers running firmware 4.06 or above include a Background
+Initialization facility so that system downtime is minimized both for initial
+installation and subsequent configuration of additional storage. The BIOS
+Configuration Utility (accessible via Alt-R during the BIOS initialization
+sequence) is used to quickly configure the controller, and then the logical
+drives that have been created are available for immediate use even while they
+are still being initialized by the controller. The primary need for online
+configuration and status monitoring is then to avoid system downtime when disk
+drives fail and must be replaced. Mylex's online monitoring and configuration
+utilities are being ported to Linux and will become available at some point in
+the future. Note that with a SAF-TE (SCSI Accessed Fault-Tolerant Enclosure)
+enclosure, the controller is able to rebuild failed drives automatically as
+soon as a drive replacement is made available.
+
+The primary interfaces for controller configuration and status monitoring are
+special files created in the /proc/rd/... hierarchy along with the normal
+system console logging mechanism. Whenever the system is operating, the DAC960
+driver queries each controller for status information every 10 seconds, and
+checks for additional conditions every 60 seconds. The initial status of each
+controller is always available for controller N in /proc/rd/cN/initial_status,
+and the current status as of the last status monitoring query is available in
+/proc/rd/cN/current_status. In addition, status changes are also logged by the
+driver to the system console and will appear in the log files maintained by
+syslog. The progress of asynchronous rebuild or consistency check operations
+is also available in /proc/rd/cN/current_status, and progress messages are
+logged to the system console at most every 60 seconds.
+
+Starting with the 2.2.3/2.0.3 versions of the driver, the status information
+available in /proc/rd/cN/initial_status and /proc/rd/cN/current_status has been
+augmented to include the vendor, model, revision, and serial number (if
+available) for each physical device found connected to the controller:
+
+***** DAC960 RAID Driver Version 2.2.3 of 19 August 1999 *****
+Copyright 1998-1999 by Leonard N. Zubkoff <lnz@dandelion.com>
+Configuring Mylex DAC960PRL PCI RAID Controller
+ Firmware Version: 4.07-0-07, Channels: 1, Memory Size: 16MB
+ PCI Bus: 1, Device: 4, Function: 1, I/O Address: Unassigned
+ PCI Address: 0xFE300000 mapped at 0xA0800000, IRQ Channel: 21
+ Controller Queue Depth: 128, Maximum Blocks per Command: 128
+ Driver Queue Depth: 127, Maximum Scatter/Gather Segments: 33
+ Stripe Size: 64KB, Segment Size: 8KB, BIOS Geometry: 255/63
+ SAF-TE Enclosure Management Enabled
+ Physical Devices:
+ 0:0 Vendor: IBM Model: DRVS09D Revision: 0270
+ Serial Number: 68016775HA
+ Disk Status: Online, 17928192 blocks
+ 0:1 Vendor: IBM Model: DRVS09D Revision: 0270
+ Serial Number: 68004E53HA
+ Disk Status: Online, 17928192 blocks
+ 0:2 Vendor: IBM Model: DRVS09D Revision: 0270
+ Serial Number: 13013935HA
+ Disk Status: Online, 17928192 blocks
+ 0:3 Vendor: IBM Model: DRVS09D Revision: 0270
+ Serial Number: 13016897HA
+ Disk Status: Online, 17928192 blocks
+ 0:4 Vendor: IBM Model: DRVS09D Revision: 0270
+ Serial Number: 68019905HA
+ Disk Status: Online, 17928192 blocks
+ 0:5 Vendor: IBM Model: DRVS09D Revision: 0270
+ Serial Number: 68012753HA
+ Disk Status: Online, 17928192 blocks
+ 0:6 Vendor: ESG-SHV Model: SCA HSBP M6 Revision: 0.61
+ Logical Drives:
+ /dev/rd/c0d0: RAID-5, Online, 89640960 blocks, Write Thru
+ No Rebuild or Consistency Check in Progress
+
+To simplify the monitoring process for custom software, the special file
+/proc/rd/status returns "OK" when all DAC960 controllers in the system are
+operating normally and no failures have occurred, or "ALERT" if any logical
+drives are offline or critical or any non-standby physical drives are dead.
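A monitoring script therefore only needs to read this one file. A minimal sketch (the function name and log message are illustrative, not part of the driver):

```shell
# Report the health of all DAC960 controllers from the single
# /proc/rd/status special file: "OK" means healthy, anything
# else (normally "ALERT") means a logical or physical drive
# needs attention.
check_rd_status() {
    # $1: path to the status file, normally /proc/rd/status
    status=$(cat "$1" 2>/dev/null)
    if [ "$status" = "OK" ]; then
        echo "healthy"
    else
        echo "degraded: $status"
    fi
}
```

Such a function could be invoked periodically from cron, with the "degraded" case forwarded to syslog or an operator.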
+
+Configuration commands for controller N are available via the special file
+/proc/rd/cN/user_command. A human readable command can be written to this
+special file to initiate a configuration operation, and the results of the
+operation can then be read back from the special file in addition to being
+logged to the system console. The shell command sequence
+
+ echo "<configuration-command>" > /proc/rd/c0/user_command
+ cat /proc/rd/c0/user_command
+
+is typically used to execute configuration commands. The configuration
+commands are:
+
+ flush-cache
+
+ The "flush-cache" command flushes the controller's cache. The system
+ automatically flushes the cache at shutdown or if the driver module is
+unloaded, so this command is only needed to be certain a write-back cache
+ is flushed to disk before the system is powered off by a command to a UPS.
+ Note that the flush-cache command also stops an asynchronous rebuild or
+ consistency check, so it should not be used except when the system is being
+ halted.
+
+ kill <channel>:<target-id>
+
+ The "kill" command marks the physical drive <channel>:<target-id> as DEAD.
+ This command is provided primarily for testing, and should not be used
+ during normal system operation.
+
+ make-online <channel>:<target-id>
+
+ The "make-online" command changes the physical drive <channel>:<target-id>
+ from status DEAD to status ONLINE. In cases where multiple physical drives
+ have been killed simultaneously, this command may be used to bring them
+ back online, after which a consistency check is advisable.
+
+ Warning: make-online should only be used on a dead physical drive that is
+ an active part of a drive group, never on a standby drive.
+
+ make-standby <channel>:<target-id>
+
+ The "make-standby" command changes physical drive <channel>:<target-id>
+ from status DEAD to status STANDBY. It should only be used in cases where
+ a dead drive was replaced after an automatic rebuild was performed onto a
+ standby drive. It cannot be used to add a standby drive to the controller
+ configuration if one was not created initially; the BIOS Configuration
+ Utility must be used for that currently.
+
+ rebuild <channel>:<target-id>
+
+ The "rebuild" command initiates an asynchronous rebuild onto physical drive
+ <channel>:<target-id>. It should only be used when a dead drive has been
+ replaced.
+
+ check-consistency <logical-drive-number>
+
+ The "check-consistency" command initiates an asynchronous consistency check
+ of <logical-drive-number> with automatic restoration. It can be used
+ whenever it is desired to verify the consistency of the redundancy
+ information.
+
+ cancel-rebuild
+ cancel-consistency-check
+
+ The "cancel-rebuild" and "cancel-consistency-check" commands cancel any
+ rebuild or consistency check operations previously initiated.
+
+
+ EXAMPLE I - DRIVE FAILURE WITHOUT A STANDBY DRIVE
+
+The following annotated logs demonstrate the controller configuration and
+online status monitoring capabilities of the Linux DAC960 Driver. The test
+configuration comprises 6 1GB Quantum Atlas I disk drives on two channels of a
+DAC960PJ controller. The physical drives are configured into a single drive
+group without a standby drive, and the drive group has been configured into two
+logical drives, one RAID-5 and one RAID-6. Note that these logs are from an
+earlier version of the driver and the messages have changed somewhat with newer
+releases, but the functionality remains similar. First, here is the current
+status of the RAID configuration:
+
+gwynedd:/u/lnz# cat /proc/rd/c0/current_status
+***** DAC960 RAID Driver Version 2.0.0 of 23 March 1999 *****
+Copyright 1998-1999 by Leonard N. Zubkoff <lnz@dandelion.com>
+Configuring Mylex DAC960PJ PCI RAID Controller
+ Firmware Version: 4.06-0-08, Channels: 3, Memory Size: 8MB
+ PCI Bus: 0, Device: 19, Function: 1, I/O Address: Unassigned
+ PCI Address: 0xFD4FC000 mapped at 0x8807000, IRQ Channel: 9
+ Controller Queue Depth: 128, Maximum Blocks per Command: 128
+ Driver Queue Depth: 127, Maximum Scatter/Gather Segments: 33
+ Stripe Size: 64KB, Segment Size: 8KB, BIOS Geometry: 255/63
+ Physical Devices:
+ 0:1 - Disk: Online, 2201600 blocks
+ 0:2 - Disk: Online, 2201600 blocks
+ 0:3 - Disk: Online, 2201600 blocks
+ 1:1 - Disk: Online, 2201600 blocks
+ 1:2 - Disk: Online, 2201600 blocks
+ 1:3 - Disk: Online, 2201600 blocks
+ Logical Drives:
+ /dev/rd/c0d0: RAID-5, Online, 5498880 blocks, Write Thru
+ /dev/rd/c0d1: RAID-6, Online, 3305472 blocks, Write Thru
+ No Rebuild or Consistency Check in Progress
+
+gwynedd:/u/lnz# cat /proc/rd/status
+OK
+
+The above messages indicate that everything is healthy, and /proc/rd/status
+returns "OK" indicating that there are no problems with any DAC960 controller
+in the system. For demonstration purposes, while I/O is active Physical Drive
+1:1 is now disconnected, simulating a drive failure. The failure is noted by
+the driver within 10 seconds of the controller's having detected it, and the
+driver logs the following console status messages indicating that Logical
+Drives 0 and 1 are now CRITICAL as a result of Physical Drive 1:1 being DEAD:
+
+DAC960#0: Physical Drive 1:2 Error Log: Sense Key = 6, ASC = 29, ASCQ = 02
+DAC960#0: Physical Drive 1:3 Error Log: Sense Key = 6, ASC = 29, ASCQ = 02
+DAC960#0: Physical Drive 1:1 killed because of timeout on SCSI command
+DAC960#0: Physical Drive 1:1 is now DEAD
+DAC960#0: Logical Drive 0 (/dev/rd/c0d0) is now CRITICAL
+DAC960#0: Logical Drive 1 (/dev/rd/c0d1) is now CRITICAL
+
+The Sense Keys logged here are just Check Condition / Unit Attention conditions
+arising from a SCSI bus reset that is forced by the controller during its error
+recovery procedures. Concurrently with the above, the driver status available
+from /proc/rd also reflects the drive failure. The status message in
+/proc/rd/status has changed from "OK" to "ALERT":
+
+gwynedd:/u/lnz# cat /proc/rd/status
+ALERT
+
+and /proc/rd/c0/current_status has been updated:
+
+gwynedd:/u/lnz# cat /proc/rd/c0/current_status
+ ...
+ Physical Devices:
+ 0:1 - Disk: Online, 2201600 blocks
+ 0:2 - Disk: Online, 2201600 blocks
+ 0:3 - Disk: Online, 2201600 blocks
+ 1:1 - Disk: Dead, 2201600 blocks
+ 1:2 - Disk: Online, 2201600 blocks
+ 1:3 - Disk: Online, 2201600 blocks
+ Logical Drives:
+ /dev/rd/c0d0: RAID-5, Critical, 5498880 blocks, Write Thru
+ /dev/rd/c0d1: RAID-6, Critical, 3305472 blocks, Write Thru
+ No Rebuild or Consistency Check in Progress
+
+Since there are no standby drives configured, the system can continue to access
+the logical drives in a performance degraded mode until the failed drive is
+replaced and a rebuild operation completed to restore the redundancy of the
+logical drives. Once Physical Drive 1:1 is replaced with a properly
+functioning drive, or if the physical drive was killed without having failed
+(e.g., due to electrical problems on the SCSI bus), the user can instruct the
+controller to initiate a rebuild operation onto the newly replaced drive:
+
+gwynedd:/u/lnz# echo "rebuild 1:1" > /proc/rd/c0/user_command
+gwynedd:/u/lnz# cat /proc/rd/c0/user_command
+Rebuild of Physical Drive 1:1 Initiated
+
+The echo command instructs the controller to initiate an asynchronous rebuild
+operation onto Physical Drive 1:1, and the status message that results from the
+operation is then available for reading from /proc/rd/c0/user_command, as well
+as being logged to the console by the driver.
+
+Within 10 seconds of this command the driver logs the initiation of the
+asynchronous rebuild operation:
+
+DAC960#0: Rebuild of Physical Drive 1:1 Initiated
+DAC960#0: Physical Drive 1:1 Error Log: Sense Key = 6, ASC = 29, ASCQ = 01
+DAC960#0: Physical Drive 1:1 is now WRITE-ONLY
+DAC960#0: Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 1% completed
+
+and /proc/rd/c0/current_status is updated:
+
+gwynedd:/u/lnz# cat /proc/rd/c0/current_status
+ ...
+ Physical Devices:
+ 0:1 - Disk: Online, 2201600 blocks
+ 0:2 - Disk: Online, 2201600 blocks
+ 0:3 - Disk: Online, 2201600 blocks
+ 1:1 - Disk: Write-Only, 2201600 blocks
+ 1:2 - Disk: Online, 2201600 blocks
+ 1:3 - Disk: Online, 2201600 blocks
+ Logical Drives:
+ /dev/rd/c0d0: RAID-5, Critical, 5498880 blocks, Write Thru
+ /dev/rd/c0d1: RAID-6, Critical, 3305472 blocks, Write Thru
+ Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 6% completed
+
+As the rebuild progresses, the current status in /proc/rd/c0/current_status is
+updated every 10 seconds:
+
+gwynedd:/u/lnz# cat /proc/rd/c0/current_status
+ ...
+ Physical Devices:
+ 0:1 - Disk: Online, 2201600 blocks
+ 0:2 - Disk: Online, 2201600 blocks
+ 0:3 - Disk: Online, 2201600 blocks
+ 1:1 - Disk: Write-Only, 2201600 blocks
+ 1:2 - Disk: Online, 2201600 blocks
+ 1:3 - Disk: Online, 2201600 blocks
+ Logical Drives:
+ /dev/rd/c0d0: RAID-5, Critical, 5498880 blocks, Write Thru
+ /dev/rd/c0d1: RAID-6, Critical, 3305472 blocks, Write Thru
+ Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 15% completed
+
+and every minute a progress message is logged to the console by the driver:
+
+DAC960#0: Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 32% completed
+DAC960#0: Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 63% completed
+DAC960#0: Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 94% completed
+DAC960#0: Rebuild in Progress: Logical Drive 1 (/dev/rd/c0d1) 94% completed
+
+Finally, the rebuild completes successfully. The driver logs the status of the
+logical and physical drives and the rebuild completion:
+
+DAC960#0: Rebuild Completed Successfully
+DAC960#0: Physical Drive 1:1 is now ONLINE
+DAC960#0: Logical Drive 0 (/dev/rd/c0d0) is now ONLINE
+DAC960#0: Logical Drive 1 (/dev/rd/c0d1) is now ONLINE
+
+/proc/rd/c0/current_status is updated:
+
+gwynedd:/u/lnz# cat /proc/rd/c0/current_status
+ ...
+ Physical Devices:
+ 0:1 - Disk: Online, 2201600 blocks
+ 0:2 - Disk: Online, 2201600 blocks
+ 0:3 - Disk: Online, 2201600 blocks
+ 1:1 - Disk: Online, 2201600 blocks
+ 1:2 - Disk: Online, 2201600 blocks
+ 1:3 - Disk: Online, 2201600 blocks
+ Logical Drives:
+ /dev/rd/c0d0: RAID-5, Online, 5498880 blocks, Write Thru
+ /dev/rd/c0d1: RAID-6, Online, 3305472 blocks, Write Thru
+ Rebuild Completed Successfully
+
+and /proc/rd/status indicates that everything is healthy once again:
+
+gwynedd:/u/lnz# cat /proc/rd/status
+OK
+
+
+ EXAMPLE II - DRIVE FAILURE WITH A STANDBY DRIVE
+
+The following annotated logs demonstrate the controller configuration and
+online status monitoring capabilities of the Linux DAC960 Driver. The test
+configuration comprises 6 1GB Quantum Atlas I disk drives on two channels of a
+DAC960PJ controller. The physical drives are configured into a single drive
+group with a standby drive, and the drive group has been configured into two
+logical drives, one RAID-5 and one RAID-6. Note that these logs are from an
+earlier version of the driver and the messages have changed somewhat with newer
+releases, but the functionality remains similar. First, here is the current
+status of the RAID configuration:
+
+gwynedd:/u/lnz# cat /proc/rd/c0/current_status
+***** DAC960 RAID Driver Version 2.0.0 of 23 March 1999 *****
+Copyright 1998-1999 by Leonard N. Zubkoff <lnz@dandelion.com>
+Configuring Mylex DAC960PJ PCI RAID Controller
+ Firmware Version: 4.06-0-08, Channels: 3, Memory Size: 8MB
+ PCI Bus: 0, Device: 19, Function: 1, I/O Address: Unassigned
+ PCI Address: 0xFD4FC000 mapped at 0x8807000, IRQ Channel: 9
+ Controller Queue Depth: 128, Maximum Blocks per Command: 128
+ Driver Queue Depth: 127, Maximum Scatter/Gather Segments: 33
+ Stripe Size: 64KB, Segment Size: 8KB, BIOS Geometry: 255/63
+ Physical Devices:
+ 0:1 - Disk: Online, 2201600 blocks
+ 0:2 - Disk: Online, 2201600 blocks
+ 0:3 - Disk: Online, 2201600 blocks
+ 1:1 - Disk: Online, 2201600 blocks
+ 1:2 - Disk: Online, 2201600 blocks
+ 1:3 - Disk: Standby, 2201600 blocks
+ Logical Drives:
+ /dev/rd/c0d0: RAID-5, Online, 4399104 blocks, Write Thru
+ /dev/rd/c0d1: RAID-6, Online, 2754560 blocks, Write Thru
+ No Rebuild or Consistency Check in Progress
+
+gwynedd:/u/lnz# cat /proc/rd/status
+OK
+
+The above messages indicate that everything is healthy, and /proc/rd/status
+returns "OK" indicating that there are no problems with any DAC960 controller
+in the system. For demonstration purposes, while I/O is active Physical Drive
+1:2 is now disconnected, simulating a drive failure. The failure is noted by
+the driver within 10 seconds of the controller's having detected it, and the
+driver logs the following console status messages:
+
+DAC960#0: Physical Drive 1:1 Error Log: Sense Key = 6, ASC = 29, ASCQ = 02
+DAC960#0: Physical Drive 1:3 Error Log: Sense Key = 6, ASC = 29, ASCQ = 02
+DAC960#0: Physical Drive 1:2 killed because of timeout on SCSI command
+DAC960#0: Physical Drive 1:2 is now DEAD
+DAC960#0: Physical Drive 1:2 killed because it was removed
+DAC960#0: Logical Drive 0 (/dev/rd/c0d0) is now CRITICAL
+DAC960#0: Logical Drive 1 (/dev/rd/c0d1) is now CRITICAL
+
+Since a standby drive is configured, the controller automatically begins
+rebuilding onto the standby drive:
+
+DAC960#0: Physical Drive 1:3 is now WRITE-ONLY
+DAC960#0: Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 4% completed
+
+Concurrently with the above, the driver status available from /proc/rd also
+reflects the drive failure and automatic rebuild. The status message in
+/proc/rd/status has changed from "OK" to "ALERT":
+
+gwynedd:/u/lnz# cat /proc/rd/status
+ALERT
+
+and /proc/rd/c0/current_status has been updated:
+
+gwynedd:/u/lnz# cat /proc/rd/c0/current_status
+ ...
+ Physical Devices:
+ 0:1 - Disk: Online, 2201600 blocks
+ 0:2 - Disk: Online, 2201600 blocks
+ 0:3 - Disk: Online, 2201600 blocks
+ 1:1 - Disk: Online, 2201600 blocks
+ 1:2 - Disk: Dead, 2201600 blocks
+ 1:3 - Disk: Write-Only, 2201600 blocks
+ Logical Drives:
+ /dev/rd/c0d0: RAID-5, Critical, 4399104 blocks, Write Thru
+ /dev/rd/c0d1: RAID-6, Critical, 2754560 blocks, Write Thru
+ Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 4% completed
+
+As the rebuild progresses, the current status in /proc/rd/c0/current_status is
+updated every 10 seconds:
+
+gwynedd:/u/lnz# cat /proc/rd/c0/current_status
+ ...
+ Physical Devices:
+ 0:1 - Disk: Online, 2201600 blocks
+ 0:2 - Disk: Online, 2201600 blocks
+ 0:3 - Disk: Online, 2201600 blocks
+ 1:1 - Disk: Online, 2201600 blocks
+ 1:2 - Disk: Dead, 2201600 blocks
+ 1:3 - Disk: Write-Only, 2201600 blocks
+ Logical Drives:
+ /dev/rd/c0d0: RAID-5, Critical, 4399104 blocks, Write Thru
+ /dev/rd/c0d1: RAID-6, Critical, 2754560 blocks, Write Thru
+ Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 40% completed
+
+and every minute a progress message is logged on the console by the driver:
+
+DAC960#0: Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 40% completed
+DAC960#0: Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 76% completed
+DAC960#0: Rebuild in Progress: Logical Drive 1 (/dev/rd/c0d1) 66% completed
+DAC960#0: Rebuild in Progress: Logical Drive 1 (/dev/rd/c0d1) 84% completed
+
+Finally, the rebuild completes successfully. The driver logs the status of the
+logical and physical drives and the rebuild completion:
+
+DAC960#0: Rebuild Completed Successfully
+DAC960#0: Physical Drive 1:3 is now ONLINE
+DAC960#0: Logical Drive 0 (/dev/rd/c0d0) is now ONLINE
+DAC960#0: Logical Drive 1 (/dev/rd/c0d1) is now ONLINE
+
+/proc/rd/c0/current_status is updated:
+
+***** DAC960 RAID Driver Version 2.0.0 of 23 March 1999 *****
+Copyright 1998-1999 by Leonard N. Zubkoff <lnz@dandelion.com>
+Configuring Mylex DAC960PJ PCI RAID Controller
+ Firmware Version: 4.06-0-08, Channels: 3, Memory Size: 8MB
+ PCI Bus: 0, Device: 19, Function: 1, I/O Address: Unassigned
+ PCI Address: 0xFD4FC000 mapped at 0x8807000, IRQ Channel: 9
+ Controller Queue Depth: 128, Maximum Blocks per Command: 128
+ Driver Queue Depth: 127, Maximum Scatter/Gather Segments: 33
+ Stripe Size: 64KB, Segment Size: 8KB, BIOS Geometry: 255/63
+ Physical Devices:
+ 0:1 - Disk: Online, 2201600 blocks
+ 0:2 - Disk: Online, 2201600 blocks
+ 0:3 - Disk: Online, 2201600 blocks
+ 1:1 - Disk: Online, 2201600 blocks
+ 1:2 - Disk: Dead, 2201600 blocks
+ 1:3 - Disk: Online, 2201600 blocks
+ Logical Drives:
+ /dev/rd/c0d0: RAID-5, Online, 4399104 blocks, Write Thru
+ /dev/rd/c0d1: RAID-6, Online, 2754560 blocks, Write Thru
+ Rebuild Completed Successfully
+
+and /proc/rd/status indicates that everything is healthy once again:
+
+gwynedd:/u/lnz# cat /proc/rd/status
+OK
+
+Note that the absence of a viable standby drive does not create an "ALERT"
+status. Once dead Physical Drive 1:2 has been replaced, the controller must be
+told that this has occurred and that the newly replaced drive should become the
+new standby drive:
+
+gwynedd:/u/lnz# echo "make-standby 1:2" > /proc/rd/c0/user_command
+gwynedd:/u/lnz# cat /proc/rd/c0/user_command
+Make Standby of Physical Drive 1:2 Succeeded
+
+The echo command instructs the controller to make Physical Drive 1:2 into a
+standby drive, and the status message that results from the operation is then
+available for reading from /proc/rd/c0/user_command, as well as being logged to
+the console by the driver. Within 60 seconds of this command the driver logs:
+
+DAC960#0: Physical Drive 1:2 Error Log: Sense Key = 6, ASC = 29, ASCQ = 01
+DAC960#0: Physical Drive 1:2 is now STANDBY
+DAC960#0: Make Standby of Physical Drive 1:2 Succeeded
+
+and /proc/rd/c0/current_status is updated:
+
+gwynedd:/u/lnz# cat /proc/rd/c0/current_status
+ ...
+ Physical Devices:
+ 0:1 - Disk: Online, 2201600 blocks
+ 0:2 - Disk: Online, 2201600 blocks
+ 0:3 - Disk: Online, 2201600 blocks
+ 1:1 - Disk: Online, 2201600 blocks
+ 1:2 - Disk: Standby, 2201600 blocks
+ 1:3 - Disk: Online, 2201600 blocks
+ Logical Drives:
+ /dev/rd/c0d0: RAID-5, Online, 4399104 blocks, Write Thru
+ /dev/rd/c0d1: RAID-6, Online, 2754560 blocks, Write Thru
+ Rebuild Completed Successfully
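+The echo/cat round trip above can also be driven from a program.  This is an
+illustrative sketch only (the function name and the /tmp path used in testing
+are hypothetical; on a live system the file would be
+/proc/rd/c0/user_command): write the command string, then read the resulting
+status message back from the same file.
+
+	#include <assert.h>
+	#include <stdio.h>
+	#include <string.h>
+
+	/* Write a user command and read back the reply, mirroring
+	 *   echo "make-standby 1:2" > /proc/rd/c0/user_command
+	 *   cat /proc/rd/c0/user_command
+	 * Against a plain file (as in a test) the "reply" is simply
+	 * the text that was written. */
+	static int dac960_user_command(const char *path, const char *cmd,
+				       char *reply, size_t len)
+	{
+		FILE *f = fopen(path, "w");
+
+		if (!f)
+			return -1;
+		fprintf(f, "%s\n", cmd);
+		fclose(f);
+
+		f = fopen(path, "r");
+		if (!f)
+			return -1;
+		if (fgets(reply, len, f) == NULL)
+			reply[0] = '\0';
+		fclose(f);
+
+		reply[strcspn(reply, "\n")] = '\0';
+		return 0;
+	}
+
+	int main(void)
+	{
+		char reply[128];
+
+		if (dac960_user_command("/tmp/user_command",
+					"make-standby 1:2",
+					reply, sizeof(reply)) == 0)
+			printf("controller said: %s\n", reply);
+		return 0;
+	}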
+++ /dev/null
-The contributors of Linux Virtual Server project are listed
-as follows in alphabetical order.
-
- Mike Douglas <spike@bayside.net>
- Virtual Server Logo.
-
- Matthew Kellett <matthewk@corelcomputer.com>
- Added the loadable load-balancing module to VS-0.5 patch
- for kernel 2.0.
-
- Peter Kese <peter.kese@ijs.si>
- Suggested the idea of the local-node feature and provided a
- local-node prototype patch for VS via tunneling.
- Port the VS patch to kernel 2.2 and rewrite of the code
- The persistent client connection feature.
-
- Joseph Mack <mack.joseph@epa.gov>
- Gaving a talk about Linux Virtual Server in the LinuxExpo'99.
-
- Rob Thomas <rob@rpi.net.au>
- Wrote the the "Greased Turkey" document about how to setup a
- load-sharing server. (a little bit stale, though)
-
- Wensong Zhang <wensong@iinchina.net>
- Chief author and developer.
-
+++ /dev/null
-ChangeLog of Virtual Server patch for Linux 2.2
-
-Virtual Server patch for Linux 2.2 - Version 0.7 - July 9, 1999
-
-Changes:
-- Added a separate masq hash table for IPVS.
-
-- Added slow timers to expire masq entries.
- Slow timers are checked in one second by default. Most overhead
- of cascading timers is avoided.
-
- With this new hash table and slow timers, the system can hold
- huge number of masq entries, but make sure that you have
- enough free memory. One masq entry costs 128 bytes memory
- effectively (Thank Alan Cox), if your box holds 1 million masq
- entries (it means that your box can receive 2000 connections per
- second if masq expire time is 500 seconds in average.), make sure
- that you have 128M free memory. And, thank Alan for suggesting
- the early random drop algorithm for masq entries that prevents
- the system from running out of memory, I will design and implement
- this feature in the near future.
-
-- Fixed the unlocking bug in the ip_vs_del_dest().
- Thank Ted Pavlic <tpavlic@netwalk.com> for reporting it.
-
-----------------------------------------------------------------------
-
-Virtual Server patch for Linux 2.2 - Version 0.6 - July 1, 1999
-
-Changes:
-- Fixed the overflow bug in the ip_vs_procinfo().
- Thank Ted Pavlic <tpavlic@netwalk.com> for reporting it.
-
-- Added the functionality to change weight and forwarding
- (dispatching) method of existing real server.
- This is useful for load-informed scheduling.
-
-- Added the functionality to change scheduler of virtual service
- on the fly.
-
-- Reorganized some code and changed names of some functions.
- This make the code more readable.
-
-----------------------------------------------------------------------
-
-Virtual Server patch for Linux 2.2 - Version 0.5 - June 22, 1999
-
-Changes:
-- Fixed the bug that LocalNode doesn't work in vs-0.4-2.2.9.
- Thank Changwon Kim <chwkim@samsung.co.kr> for
- reporting the bug and pointing me the checksum update
- problem in the code.
-
-- some code of VS in the ip_fw_demasquerade was reorganized
- so that the packets for VS-Tunneling, VS-DRouting and LocalNode
- skip the checksum update. This make the code right and efficient
-
-
-----------------------------------------------------------------------
-
-Virtual Server patch for Linux 2.2 - Version 0.4 - June 1, 1999
-
-Most of the code was rewritten. The locking and refcnt was changed
-The violation of "no floats in kernel mode" rule in the weighted
-least-connection scheduling was fixed. This patch is more efficient,
-and should be more stable.
-
-
-----------------------------------------------------------------------
-
-Virtual Server patch for Linux 2.2 - Version 0.1~0.3 - May 1999
-
-Peter Kese <peter.kese@ijs.si> ported the VS patch to kernel 2.2,
-rewrote the code and loadable scheduling modules.
-
-==========================================================================
-
-ChangeLog of Virtual Server patch for Linux 2.0
-----------------------------------------------------------------------
-
-Virtual Server Patch for Linux - Version 0.9 - May 1, 1999
-
-Differences with virtual server patch version 0.8:
-
-- Add Virtual Server via Direct Routing
- This approach was first implemented in IBM's NetDispatcher. All real
- servers have their loopback alias interface configured with the virtual
- IP address, the load balancer and the real servers must have one of
- their interfaces physically linked by a HUB/Switch. When the packets
- destined for the virtual IP address arrives, the load balnacer directly
- route them to the real servers, the real servers processing the requests
- and return the reply packets directly to the clients. Compared to the
- virtual server via IP tunneling approach, this approach doesn't have
- tunneling overhead(In fact, this overhead is minimal in most situations),
- but requires that one of the load balancer's interfaces and the real
- servers' interfaces must be in physical segment.
-
-- Add more satistics information
- The active connection counter and the total connection counter of
- each real server were added for all the scheduling algorithms.
-
-- Add resetting(zeroing) counters
- The total connection counters of all real servers can be reset to zero.
-
-- Change some statements in the masq_expire function and the
- ip_fw_demasquerade function, so that ip_masq_free_ports won't become
- abnormal number after the masquerading entries for virtual server
- are released.
-
-- Fix the bug of "double unlock on device queue"
- Remove the unnecessary function call of skb_device_unlock(skb) in the
- ip_pfvs_encapsule function, which sometimes cause "kernel: double
- unlock on device queue" waring in the virtual server via tunneling.
-
-- Many functions of virtual server patch was splitted into the
- linux/net/ipv4/ip_masq_pfvs.c.
-
-- Upgrade ippfvsadm 1.0.2 to ippfvsadm 1.0.3
- Zeroing counters is supported in the new version. The ippfvsadm 1.0.3
- can be used for all kernel with different virtual server options
- without rebuilding the program.
-
---------------------------------------------------------------------
-
-Virtual Server Patch for Linux - Version 0.8 - March 6, 1999
-
-Differences with virtual server patch version 0.7:
-
-- Add virtual FTP server support
- The original ippfvs via IP tunneling could not be used to
- build a virtual FTP server, because the real servers could
- not establish data connections to clients. The code was
- added to parse the port number in the ftp control data
- and create the corresponding masquerading entry for the
- coming data connection.
- Although the original ippfvs via NAT could be used to build
- a virtual server, the data connection was established in
- this way.
- Real Server port:20 ----> ippfvs: allocate a free masq port
- -----> the client port
- It is not elegent but time-consuming. Now it was changed
- as follows:
- Real Server port:20 ----> ippfvs port: 20
- ----> the client port
-
-- Change the port checking order in the ip_fw_demasquerade()
- If the size of masquerade hash table is well chosen, checking
- a masquerading entry in the hash table will just require one
- hit. It is much efficient than checking port for virtual
- services, and there are at least 3 incoming packets for each
- connection, which require port checking. So, it is efficient
- to check the masquerading hash table first and then check
- port for virtual services.
-
-- Remove a useless statement in the ip_masq_new_pfvs()
- The useless statement in the ip_masq_new_pfvs function is
- ip_masq_free_ports[masq_proto_num(proto)]++;
- which may disturb system.
-
-- Change the header printing of the ip_pfvs_procinfo()
-
---------------------------------------------------------------------
-
-Virtual Server Patch for Linux - Version 0.7 - Febuary 10, 1999
-
-Differences with virtual server patch version 0.6:
-
-- Fix a bug in detect the finish of connection for tunneling
- or NATing to the local node.
- Since the server reply the client directly in tunneling or
- NATing to the local node, the load balancer (LinuxDirector)
- can only detect a FIN segment. It is mistake that the masq
- entry is removed only if both-side FIN segments are detected,
- and then the masq entry expires in 15 minutes. For the
- situation above, the code was changed to set the masq entry
- expire in TCP_FIN_TIMEOUT (2min) when an incoming FIN segment
- is detecting.
-- Add the patch version printing in the ip_pfvs_procinfo()
- It would be easy for users and hackers to know which
- virtual server patch version they are running. Thank
- Peter Kese <peter.kese@ijs.si> for the suggestion.
-
---------------------------------------------------------------------
-
-Virtual Server Patch for Linux - Version 0.6 - Febuary 2, 1999
-
-Differences with virtual server patch version 0.5:
-
-- Add the local node feature in virtual server.
- If the local node feature is enabled, the load balancer can
- not only redirect the packets of the specified port to the
- other servers (remote nodes) to process it, but also can process
- the packets locally (local node). Which node is chosen depends on
- the scheduling algorithms.
- This local node feature can be used to build a virtual server of
- a few nodes, for example, 2, 3 or more sites, in which it is a
- resource waste if the load balancer is only used to redirect
- packets. It is wise to direct some packets to the local node to
- process. This feature can also be used to build distributed
- identical servers, in which one is too busy to handle requests
- locally, then it can seamlessly forward requests to other servers
- to process them.
- This feature can be applied to both virtual server via NAT and
- virtual server via IP tunneling.
- Thank Peter Kese <peter.kese@ijs.si> for idea of "Two node Virtual
- Server" and his single line patch for virtual server via IP
- tunneling.
-- Remove a useless function call ip_send_check in the virtual
- server via IP tunneling code.
-
---------------------------------------------------------------------
-
-Virtual Server Patch for Linux - Version 0.5 - November 25, 1998
-
-Differences with virtual server patch version 0.4:
-
-- Add the feature of virtual server via IP tunneling.
- If the ippfvs is enabled using IP tunneling, the load balancer
- chooses a real server from a cluster based on a scheduling algorithm,
- encapsules the packet and forwards it to the chosen server. All real
- servers are configured with "ifconfig tunl0 <Virtual IP Address> up".
- When the chosen server receives the encapsuled packet, it decapsules
- the packet, processes the request and returns the reply packets
- directly to the client without passing the load balancer. This can
- greatly increase the scalability of virtual server.
-- Fix a bug in the ip_portfw_del() for the weighted RR scheduling.
- The bug in version 0.4 is when the weighted round-robin scheduling
- is used, deleting the last rule for a virtual server will report
- "setsockopt failed: Invalid argument" warning, in fact the last
- rule is deleted but the gen_scheduling_seq() works on a null list
- and causes that warning.
-- Add and modify some description for virtual server options in
- the Linux kernel configuration help texts.
-
---------------------------------------------------------------------
-
-Virtual Server Patch for Linux - Version 0.4 - November 12, 1998
-
-Differences with virtual server patch version 0.3:
-
-- Fix a memory access error bug.
- The set_serverpointer_null() function is added to scan all the existing
- ip masquerading records for its server pointer which points to the
- server specified and set it null. It is useful when administrators
- delete a real server or all real servers, those pointers pointing to
- the server must be set null. Otherwise, decreasing the connection
- counter of the server may cause memory access error when the connection
- terminates or timeout.
-
---------------------------------------------------------------------
-
-Virtual Server Patch for Linux - Version 0.3 - November 10, 1998
-
-Differences with virtual server patch version 0.2:
-
-- Change the simple round-robin scheduling to the weighted round-robin
- scheduling. Simple is a special instance of the weighted round-robin
- scheduling when the weights of the servers are the same.
-- The scheduling algorithm, originally called the weighted round-robin
- scheduling in version 0.2, actually is the weighted least-connection
- scheduling. So the concept is clarified here.
-- Add the least-connection scheduling algorithm. Although it is a
- special instance of the weighted least-connection scheduling algorithm,
- it is used to avoid dividing the weight in looking up servers when
- the weights of the servers are the same, so the overhead of scheduling
- can be minimized in this case.
-- Change the type of the server load variables, curr_load and least_load,
- from integer to float in the weighted least-connection scheduling.
- It can make a better load-balancing when the weights specified are high.
-- Merge the original two patches into one. Users have to specify which
- scheduling algorithm is used, the weighted round-robin scheduling,
- the least-connection scheduling, or the weighted least-connection
- scheduling, before rebuild the kernel.
-- Change the ip_pfvs_proc function to make the output of the port
- forwarding & virtual server table more beautiful.
-
---------------------------------------------------------------------
-
-Virtual Server Patch for Linux - Version 0.2 - May 28, 1998
-
-Differences with virtual server patch version 0.1:
-
-- Add the weighted round-robin scheduling patch.
-
---------------------------------------------------------------------
-
-Virtual Server Patch for Linux - Version 0.1 - May 26, 1998
-
-- Implement the infrastructure of virtual server.
-- Implement the simple round-robin scheduling algorithm.
-
---------------------------------------------------------------------
+++ /dev/null
-README of Virtual Server Patch for Linux 2.2.10
---------------------------------------------------------------------
-
-Virtual Server Patch for Linux 2.2.10 - Version 0.7 - July 9, 1999
-
-Copyright (c) 1998,1999 by Wensong Zhang, Peter Kese.
-This is free software. See below for details.
-
-The ipvs is IP Virtual Server support in Linux kernel, which can be used
-to build a high-performance and highly available server. Check out the
-Linux Virtual Server Project homepage on the World Wide Web:
- http://proxy.iinchina.net/~wensong/ippfvs/
-for the most recent information and original sources about ipvs.
-
-We now call the Linux box running ipvs LinuxDirector. Thank
-Robert Thomas <rob@rpi.net.au> for this name, I love it. :-)
-
-This patch (Version 0.7) is for the Linux kernel 2.2.10. See the ChangeLog
-for how the code has been improved and what new features it has now.
-
-To rebuild a Linux kernel with virtual server support, first get a clean
-copy of the Linux kernel source of the right version and apply the patch
-to the kernel. The commands can be as follows:
- cd /usr/src/linux
- cat <path-name>/ipvs-0.7-2.2.10.patch | patch -p1
-Then make sure the following kernel compile options at least are selected
-via "make menuconfig" or "make xconfig".
-
-Kernel Compile Options:
-
-Code maturity level options ---
- [*] Prompt for development and/or incomplete code/drivers
-Networking options ---
- [*] Network firewalls
- ....
- [*] IP: firewalling
- [*] IP: always defragment (required for masquerading)
- ....
- [*] IP: masquerading
- ....
- [*] IP: masquerading virtual server support
- (12) IP masquerading table size (the Nth power of 2)
- < > IPVS: round-robin scheduling
- < > IPVS: weighted round-robin scheduling
- < > IPVS: weighted least-connection scheduling
- < > IPVS: persistent client connection scheduling
-Note that you can compile scheduling algorithms in kernel or as modules.
-
-Finally, rebuild the kernel. Once you have your kernel properly built,
-update your system kernel and reboot.
-
-Note that there are three request dispatching techniques existing together
-in the LinuxDirector, and there are also three scheduling algorithms
-implemented. Both the VS via IP Tunneling and the VS via Direct Routing
-can greatly increase the scalability of virtual server. If the VS-Tunneling
-is selected, it requires that all the servers must be configured with
- ifconfig tunl0 <Virtual IP Address> netmask 255.255.255.255
-If the VS-DRouting is chosen, it requires that all servers must be configured
-with the following command:
- ifconfig lo:0 <Virtual IP Address> netmask 255.255.255.255
-The localnode feature can make that the LinuxDiretor can not only redirect
-packets to other servers, but also process packets locally.
-
-Thanks must go to other contributors, check the CREDITS file to know
-who they are.
-
-There is a mailing list for virtual server. You are welcome to talk about
-building the virtual server kernel, using the virtual server and making
-the virtual server better there. :-) To subscribe, send a message to
- majordomo@iinchina.net
-with the body of "subscribe linux-virtualserver".
-
-
-Wensong Zhang <wensong@iinchina.net>
-
-
---------------------------------------------------------------------
-
-This program is free software; you can redistribute it and/or modify
-it under the terms of the GNU General Public License as published by
-the Free Software Foundation; either version 2 of the License, or
-(at your option) any later version.
-
-This program is distributed in the hope that it will be useful,
-but WITHOUT ANY WARRANTY; without even the implied warranty of
-MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
-GNU General Public License for more details.
-
-You should have received a copy of the GNU General Public License
-along with this program; if not, write to the Free Software
-Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
-
---------------------------------------------------------------------
#include <linux/in.h>
#include <linux/in6.h>
#include <linux/pci.h>
+#include <linux/tty.h>
+#include <linux/mm.h>
#include <asm/io.h>
#include <asm/hwrpb.h>
#include <linux/interrupt.h>
#include <asm/softirq.h>
#include <asm/fpu.h>
+#include <asm/irq.h>
+#include <asm/machvec.h>
+#include <asm/pgtable.h>
+#include <asm/semaphore.h>
#define __KERNEL_SYSCALLS__
#include <asm/unistd.h>
extern void __divqu (void);
extern void __remqu (void);
+EXPORT_SYMBOL(alpha_mv);
EXPORT_SYMBOL(local_bh_count);
EXPORT_SYMBOL(local_irq_count);
+EXPORT_SYMBOL(enable_irq);
+EXPORT_SYMBOL(disable_irq);
+EXPORT_SYMBOL(disable_irq_nosync);
+EXPORT_SYMBOL(screen_info);
+EXPORT_SYMBOL(perf_irq);
/* platform dependent support */
EXPORT_SYMBOL(_inb);
EXPORT_SYMBOL(strncat);
EXPORT_SYMBOL(strstr);
EXPORT_SYMBOL(strtok);
+EXPORT_SYMBOL(strpbrk);
EXPORT_SYMBOL(strchr);
EXPORT_SYMBOL(strrchr);
EXPORT_SYMBOL(memcmp);
EXPORT_SYMBOL(memmove);
EXPORT_SYMBOL(__memcpy);
EXPORT_SYMBOL(__memset);
+EXPORT_SYMBOL(__memsetw);
EXPORT_SYMBOL(__constant_c_memset);
EXPORT_SYMBOL(dump_thread);
EXPORT_SYMBOL(wrusp);
EXPORT_SYMBOL(start_thread);
EXPORT_SYMBOL(alpha_read_fp_reg);
-EXPORT_SYMBOL(alpha_write_fp_reg);
EXPORT_SYMBOL(alpha_read_fp_reg_s);
+EXPORT_SYMBOL(alpha_write_fp_reg);
EXPORT_SYMBOL(alpha_write_fp_reg_s);
/* In-kernel system calls. */
#ifdef CONFIG_MATHEMU_MODULE
extern long (*alpha_fp_emul_imprecise)(struct pt_regs *, unsigned long);
+extern long (*alpha_fp_emul) (unsigned long pc);
EXPORT_SYMBOL(alpha_fp_emul_imprecise);
+EXPORT_SYMBOL(alpha_fp_emul);
#endif
/*
EXPORT_SYMBOL(__strncpy_from_user);
EXPORT_SYMBOL(__strlen_user);
+/*
+ * The following are specially called from the semaphore assembly stubs.
+ */
+EXPORT_SYMBOL_NOVERS(__down_failed);
+EXPORT_SYMBOL_NOVERS(__down_failed_interruptible);
+EXPORT_SYMBOL_NOVERS(__up_wakeup);
+
+/*
+ * SMP-specific symbols.
+ */
+
+#ifdef __SMP__
+EXPORT_SYMBOL(synchronize_irq);
+EXPORT_SYMBOL(flush_tlb_all);
+EXPORT_SYMBOL(flush_tlb_mm);
+EXPORT_SYMBOL(flush_tlb_page);
+EXPORT_SYMBOL(flush_tlb_range);
+EXPORT_SYMBOL(cpu_data);
+EXPORT_SYMBOL(cpu_number_map);
+EXPORT_SYMBOL(global_bh_lock);
+EXPORT_SYMBOL(global_bh_count);
+EXPORT_SYMBOL(synchronize_bh);
+EXPORT_SYMBOL(global_irq_holder);
+EXPORT_SYMBOL(__global_cli);
+EXPORT_SYMBOL(__global_sti);
+EXPORT_SYMBOL(__global_save_flags);
+EXPORT_SYMBOL(__global_restore_flags);
+#if DEBUG_SPINLOCK
+EXPORT_SYMBOL(spin_unlock);
+EXPORT_SYMBOL(debug_spin_lock);
+EXPORT_SYMBOL(debug_spin_trylock);
+#endif
+#if DEBUG_RWLOCK
+EXPORT_SYMBOL(write_lock);
+EXPORT_SYMBOL(read_lock);
+#endif
+#endif /* __SMP__ */
+
/*
* The following are special because they're not called
* explicitly (the C compiler or assembler generates them in
EXPORT_SYMBOL_NOVERS(__remqu);
EXPORT_SYMBOL_NOVERS(memcpy);
EXPORT_SYMBOL_NOVERS(memset);
+
+
/* Print PAL fields */
for (i = 0; i < 24; i += 2) {
- printk("\tpal temp[%d-%d]\t\t= %16lx %16lx\n\r",
+ printk("\tpal temp[%d-%d]\t\t= %16lx %16lx\n",
i, i+1, frame->paltemp[i], frame->paltemp[i+1]);
}
for (i = 0; i < 8; i += 2) {
- printk("\tshadow[%d-%d]\t\t= %16lx %16lx\n\r",
+ printk("\tshadow[%d-%d]\t\t= %16lx %16lx\n",
i, i+1, frame->shadow[i],
frame->shadow[i+1]);
}
- printk("\tAddr of excepting instruction\t= %16lx\n\r",
+ printk("\tAddr of excepting instruction\t= %16lx\n",
frame->exc_addr);
- printk("\tSummary of arithmetic traps\t= %16lx\n\r",
+ printk("\tSummary of arithmetic traps\t= %16lx\n",
frame->exc_sum);
- printk("\tException mask\t\t\t= %16lx\n\r",
+ printk("\tException mask\t\t\t= %16lx\n",
frame->exc_mask);
- printk("\tBase address for PALcode\t= %16lx\n\r",
+ printk("\tBase address for PALcode\t= %16lx\n",
frame->pal_base);
- printk("\tInterrupt Status Reg\t\t= %16lx\n\r",
+ printk("\tInterrupt Status Reg\t\t= %16lx\n",
frame->isr);
- printk("\tCURRENT SETUP OF EV5 IBOX\t= %16lx\n\r",
+ printk("\tCURRENT SETUP OF EV5 IBOX\t= %16lx\n",
frame->icsr);
- printk("\tI-CACHE Reg %s parity error\t= %16lx\n\r",
+ printk("\tI-CACHE Reg %s parity error\t= %16lx\n",
(frame->ic_perr_stat & 0x800L) ?
"Data" : "Tag",
frame->ic_perr_stat);
- printk("\tD-CACHE error Reg\t\t= %16lx\n\r",
+ printk("\tD-CACHE error Reg\t\t= %16lx\n",
frame->dc_perr_stat);
if (frame->dc_perr_stat & 0x2) {
switch (frame->dc_perr_stat & 0x03c) {
case 8:
- printk("\t\tData error in bank 1\n\r");
+ printk("\t\tData error in bank 1\n");
break;
case 4:
- printk("\t\tData error in bank 0\n\r");
+ printk("\t\tData error in bank 0\n");
break;
case 20:
- printk("\t\tTag error in bank 1\n\r");
+ printk("\t\tTag error in bank 1\n");
break;
case 10:
- printk("\t\tTag error in bank 0\n\r");
+ printk("\t\tTag error in bank 0\n");
break;
}
}
- printk("\tEffective VA\t\t\t= %16lx\n\r",
+ printk("\tEffective VA\t\t\t= %16lx\n",
frame->va);
- printk("\tReason for D-stream\t\t= %16lx\n\r",
+ printk("\tReason for D-stream\t\t= %16lx\n",
frame->mm_stat);
- printk("\tEV5 SCache address\t\t= %16lx\n\r",
+ printk("\tEV5 SCache address\t\t= %16lx\n",
frame->sc_addr);
- printk("\tEV5 SCache TAG/Data parity\t= %16lx\n\r",
+ printk("\tEV5 SCache TAG/Data parity\t= %16lx\n",
frame->sc_stat);
- printk("\tEV5 BC_TAG_ADDR\t\t\t= %16lx\n\r",
+ printk("\tEV5 BC_TAG_ADDR\t\t\t= %16lx\n",
frame->bc_tag_addr);
- printk("\tEV5 EI_ADDR: Phys addr of Xfer\t= %16lx\n\r",
+ printk("\tEV5 EI_ADDR: Phys addr of Xfer\t= %16lx\n",
frame->ei_addr);
- printk("\tFill Syndrome\t\t\t= %16lx\n\r",
+ printk("\tFill Syndrome\t\t\t= %16lx\n",
frame->fill_syndrome);
- printk("\tEI_STAT reg\t\t\t= %16lx\n\r",
+ printk("\tEI_STAT reg\t\t\t= %16lx\n",
frame->ei_stat);
- printk("\tLD_LOCK\t\t\t\t= %16lx\n\r",
+ printk("\tLD_LOCK\t\t\t\t= %16lx\n",
frame->ld_lock);
}
process_mcheck_info(vector, la_ptr, regs, "MCPCIA",
DEBUG_MCHECK, MCPCIA_mcheck_expected[cpu]);
- if (vector != 0x620 && vector != 0x630) {
+ if (vector != 0x620 && vector != 0x630
+ && ! MCPCIA_mcheck_expected[cpu]) {
mcpcia_print_uncorrectable(mchk_logout);
}
/* Although we are an idle CPU, we do not want to
get into the scheduler unnecessarily. */
+ barrier();
if (current->need_resched) {
schedule();
check_pgt_cache();
high = (high + PAGE_SIZE) & (PAGE_MASK*2);
/* Enforce maximum of 2GB even if there is more. Blah. */
- if (high > 0x80000000UL)
+ if (high > 0x80000000UL) {
+ printk("Cropping memory from %luMB to 2048MB\n", high);
high = 0x80000000UL;
+ }
+
return PAGE_OFFSET + high;
}
#
# CONFIG_BLK_DEV_LOOP is not set
# CONFIG_BLK_DEV_NBD is not set
-CONFIG_BLK_DEV_MD=y
-CONFIG_AUTODETECT_RAID=y
-CONFIG_MD_LINEAR=y
-CONFIG_MD_STRIPED=y
-CONFIG_MD_MIRRORING=y
-CONFIG_MD_RAID5=y
-CONFIG_MD_TRANSLUCENT=y
-CONFIG_MD_LVM=y
-CONFIG_MD_BOOT=y
+# CONFIG_BLK_DEV_MD is not set
# CONFIG_BLK_DEV_RAM is not set
# CONFIG_BLK_DEV_XD is not set
CONFIG_PARIDE_PARPORT=y
# CONFIG_PARIDE is not set
-# CONFIG_BLK_CPQ_DA is not set
# CONFIG_BLK_DEV_HD is not set
#
19990512 Richard Gooch <rgooch@atnf.csiro.au>
Minor cleanups.
v1.35
+ 19990812 Zoltan Boszormenyi <zboszor@mol.hu>
+ PRELIMINARY CHANGES!!! ONLY FOR TESTING!!!
+	Rearrange switch() statements so the driver accommodates
+	the fact that the AMD Athlon handles its MTRRs the same way
+	as Intel does.
+
+ 19990819 Alan Cox <alan@redhat.com>
+	Tested Zoltan's changes on a pre-production Athlon - 100%
+ success. Fixed one fall through check to be Intel only.
*/
+
#include <linux/types.h>
#include <linux/errno.h>
#include <linux/sched.h>
#include <asm/hardirq.h>
#include "irq.h"
-#define MTRR_VERSION "1.35 (19990512)"
+#define MTRR_VERSION "1.35a (19990819)"
#define TRUE 1
#define FALSE 0
switch (boot_cpu_data.x86_vendor)
{
case X86_VENDOR_AMD:
+ if (boot_cpu_data.x86 >= 6) break; /* Athlon and post-Athlon CPUs */
+ /* else fall through */
case X86_VENDOR_CENTAUR:
return;
/*break;*/
switch (boot_cpu_data.x86_vendor)
{
+ case X86_VENDOR_AMD:
case X86_VENDOR_INTEL:
/* Disable MTRRs, and set the default type to uncached */
rdmsr (MTRRdefType_MSR, ctxt->deftype_lo, ctxt->deftype_hi);
switch (boot_cpu_data.x86_vendor)
{
case X86_VENDOR_AMD:
+ if (boot_cpu_data.x86 >= 6) break; /* Athlon and post-Athlon CPUs */
+ /* else fall through */
case X86_VENDOR_CENTAUR:
__restore_flags (ctxt->flags);
return;
/* Restore MTRRdefType */
switch (boot_cpu_data.x86_vendor)
{
+ case X86_VENDOR_AMD:
case X86_VENDOR_INTEL:
wrmsr (MTRRdefType_MSR, ctxt->deftype_lo, ctxt->deftype_hi);
break;
switch (boot_cpu_data.x86_vendor)
{
+ case X86_VENDOR_AMD:
+ if (boot_cpu_data.x86 < 6) return 2; /* pre-Athlon CPUs */
+ /* else fall through */
case X86_VENDOR_INTEL:
rdmsr (MTRRcap_MSR, config, dummy);
return (config & 0xff);
/* and Centaur has 8 MCR's */
return 8;
/*break;*/
- case X86_VENDOR_AMD:
- return 2;
- /*break;*/
}
return 0;
} /* End Function get_num_var_ranges */
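The hunks above and below both decode the MTRRcap MSR: the variable-range count lives in bits 0-7 and the write-combining feature flag in bit 10. A minimal standalone sketch of that decoding (helper names are illustrative, not from the driver):

```c
#include <assert.h>

/* Sketch: decode the two MTRRcap fields the driver reads.
   Bits 0-7 hold the variable-range register count (VCNT);
   bit 10 is the write-combining feature flag. */
static int mtrrcap_var_ranges(unsigned long cap)
{
        return cap & 0xff;              /* VCNT */
}

static int mtrrcap_have_wrcomb(unsigned long cap)
{
        return (cap >> 10) & 1;         /* WC feature bit */
}
```

The driver returns the raw `config & (1<<10)` value, which is merely truthy; the sketch normalizes it to 0/1 for clarity.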
switch (boot_cpu_data.x86_vendor)
{
+ case X86_VENDOR_AMD:
+ if (boot_cpu_data.x86 < 6) return 1; /* pre-Athlon CPUs */
+ /* else fall through */
case X86_VENDOR_INTEL:
rdmsr (MTRRcap_MSR, config, dummy);
return (config & (1<<10));
/*break;*/
case X86_VENDOR_CYRIX:
- case X86_VENDOR_AMD:
case X86_VENDOR_CENTAUR:
return 1;
/*break;*/
if ( !(boot_cpu_data.x86_capability & X86_FEATURE_MTRR) ) return -ENODEV;
switch (boot_cpu_data.x86_vendor)
{
+ case X86_VENDOR_AMD:
+ if (boot_cpu_data.x86 < 6) { /* pre-Athlon CPUs */
+ /* Apply the K6 block alignment and size rules
+ In order
+ o Uncached or gathering only
+ o 128K or bigger block
+ o Power of 2 block
+ o base suitably aligned to the power
+ */
+ if (type > MTRR_TYPE_WRCOMB || size < (1 << 17) ||
+ (size & ~(size-1))-size || (base & (size-1)))
+ return -EINVAL;
+ break;
+ } /* else fall through */
case X86_VENDOR_INTEL:
/* For Intel PPro stepping <= 7, must be 4 MiB aligned */
- if ( (boot_cpu_data.x86 == 6) && (boot_cpu_data.x86_model == 1) &&
+ if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL &&
+      (boot_cpu_data.x86 == 6) && (boot_cpu_data.x86_model == 1) &&
(boot_cpu_data.x86_mask <= 7) && ( base & ( (1 << 22) - 1 ) ) )
{
printk ("mtrr: base(0x%lx) is not 4 MiB aligned\n", base);
return -EINVAL;
}
break;
- case X86_VENDOR_AMD:
- /* Apply the K6 block alignment and size rules
- In order
- o Uncached or gathering only
- o 128K or bigger block
- o Power of 2 block
- o base suitably aligned to the power
- */
- if (type > MTRR_TYPE_WRCOMB || size < (1 << 17) ||
- (size & ~(size-1))-size || (base & (size-1)))
- return -EINVAL;
- break;
default:
return -EINVAL;
/*break;*/
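The K6 rules in the comment above (uncached or write-gathering only, at least a 128K block, power-of-2 size, base aligned to the size) map directly onto the four-term `-EINVAL` test. A standalone sketch of the same checks, one per condition (the `MTRR_TYPE_WRCOMB` value is assumed to be 1, as in the kernel's mtrr.h):

```c
#include <assert.h>

#define MTRR_TYPE_WRCOMB 1      /* assumed value, for illustration */

/* Sketch of the K6 region validation from the hunk above:
   returns 1 if (base, size, type) satisfies all four rules. */
static int k6_region_valid(unsigned long base, unsigned long size, int type)
{
        if (type > MTRR_TYPE_WRCOMB)    /* uncached or gathering only */
                return 0;
        if (size < (1UL << 17))         /* 128K or bigger block */
                return 0;
        if ((size & ~(size - 1)) - size) /* power-of-2 block */
                return 0;
        if (base & (size - 1))          /* base aligned to the power */
                return 0;
        return 1;
}
```

The power-of-2 test works because `size & ~(size-1)` isolates the lowest set bit, which equals `size` exactly when only one bit is set.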
printk ("mtrr: v%s Richard Gooch (rgooch@atnf.csiro.au)\n", MTRR_VERSION);
switch (boot_cpu_data.x86_vendor)
{
+ case X86_VENDOR_AMD:
+ if (boot_cpu_data.x86 < 6) { /* pre-Athlon CPUs */
+ get_mtrr = amd_get_mtrr;
+ set_mtrr_up = amd_set_mtrr_up;
+ break;
+ } /* else fall through */
case X86_VENDOR_INTEL:
get_mtrr = intel_get_mtrr;
set_mtrr_up = intel_set_mtrr_up;
set_mtrr_up = cyrix_set_arr_up;
get_free_region = cyrix_get_free_region;
break;
- case X86_VENDOR_AMD:
- get_mtrr = amd_get_mtrr;
- set_mtrr_up = amd_set_mtrr_up;
- break;
case X86_VENDOR_CENTAUR:
get_mtrr = centaur_get_mcr;
set_mtrr_up = centaur_set_mcr_up;
mtrr_setup ();
switch (boot_cpu_data.x86_vendor)
{
+ case X86_VENDOR_AMD:
+ if (boot_cpu_data.x86 < 6) break; /* pre-Athlon CPUs */
+ /* else fall through */
case X86_VENDOR_INTEL:
get_mtrr_state (&smp_mtrr_state);
break;
if ( !(boot_cpu_data.x86_capability & X86_FEATURE_MTRR) ) return;
switch (boot_cpu_data.x86_vendor)
{
+ case X86_VENDOR_AMD:
+ /* Just for robustness: pre-Athlon CPUs cannot do SMP. */
+ if (boot_cpu_data.x86 < 6) break;
case X86_VENDOR_INTEL:
intel_mtrr_init_secondary_cpu ();
break;
# ifdef __SMP__
switch (boot_cpu_data.x86_vendor)
{
+ case X86_VENDOR_AMD:
+ if (boot_cpu_data.x86 < 6) break; /* pre-Athlon CPUs */
+ /* else fall through */
case X86_VENDOR_INTEL:
finalize_mtrr_state (&smp_mtrr_state);
mtrr_state_warn (smp_changes_mask);
int datapages = 0;
int initpages = 0;
unsigned long tmp;
- unsigned long endbase;
end_mem &= PAGE_MASK;
high_memory = (void *) end_mem;
* IBM messed up *AGAIN* in their thinkpad: 0xA0000 -> 0x9F000.
* They seem to have done something stupid with the floppy
* controller as well..
- * The amount of available base memory is in WORD 40:13.
*/
- endbase = PAGE_OFFSET + ((*(unsigned short *)__va(0x413) * 1024) & PAGE_MASK);
- while (start_low_mem < endbase) {
+ while (start_low_mem < 0x9f000+PAGE_OFFSET) {
clear_bit(PG_reserved, &mem_map[MAP_NR(start_low_mem)].flags);
start_low_mem += PAGE_SIZE;
}
-/* $Id: ioctl32.c,v 1.62.2.2 1999/08/13 18:28:25 davem Exp $
+/* $Id: ioctl32.c,v 1.62.2.1 1999/06/09 04:53:03 davem Exp $
* ioctl32.c: Conversion between 32bit and 64bit native ioctls.
*
* Copyright (C) 1997 Jakub Jelinek (jj@sunsite.mff.cuni.cz)
#include <linux/if.h>
#include <linux/malloc.h>
#include <linux/hdreg.h>
-#include <linux/raid/md.h>
+#include <linux/md.h>
#include <linux/kd.h>
#include <linux/route.h>
#include <linux/skbuff.h>
case BLKRASET:
/* 0x09 */
- case RAID_VERSION:
- case GET_ARRAY_INFO:
- case GET_DISK_INFO:
- case CLEAR_ARRAY:
- case ADD_NEW_DISK:
- case HOT_REMOVE_DISK:
- case SET_ARRAY_INFO:
- case SET_DISK_INFO:
- case WRITE_RAID_INFO:
- case UNPROTECT_ARRAY:
- case PROTECT_ARRAY:
- case HOT_ADD_DISK:
- case RUN_ARRAY:
- case START_ARRAY:
- case STOP_ARRAY:
- case STOP_ARRAY_RO:
- case RESTART_ARRAY_RW:
-
+ case REGISTER_DEV:
+ case REGISTER_DEV_NEW:
+ case START_MD:
+ case STOP_MD:
+
/* Big K */
case PIO_FONT:
case GIO_FONT:
fi
bool 'Multiple devices driver support' CONFIG_BLK_DEV_MD
if [ "$CONFIG_BLK_DEV_MD" = "y" ]; then
- bool 'Autodetect RAID partitions' CONFIG_AUTODETECT_RAID
tristate ' Linear (append) mode' CONFIG_MD_LINEAR
tristate ' RAID-0 (striping) mode' CONFIG_MD_STRIPED
tristate ' RAID-1 (mirroring) mode' CONFIG_MD_MIRRORING
tristate ' RAID-4/RAID-5 mode' CONFIG_MD_RAID5
- tristate ' Translucent mode' CONFIG_MD_TRANSLUCENT
- tristate ' Logical Volume Manager support' CONFIG_MD_LVM
- if [ "$CONFIG_MD_LINEAR" = "y" -o "$CONFIG_MD_STRIPED" = "y" ]; then
- bool ' Boot support (linear, striped)' CONFIG_MD_BOOT
- fi
+fi
+if [ "$CONFIG_MD_LINEAR" = "y" -o "$CONFIG_MD_STRIPED" = "y" ]; then
+ bool ' Boot support (linear, striped)' CONFIG_MD_BOOT
fi
tristate 'RAM disk support' CONFIG_BLK_DEV_RAM
if [ "$CONFIG_BLK_DEV_RAM" = "y" ]; then
*/
-#define DAC960_DriverVersion "2.2.2"
-#define DAC960_DriverDate "3 July 1999"
+#define DAC960_DriverVersion "2.2.4"
+#define DAC960_DriverDate "23 August 1999"
#include <linux/version.h>
unsigned long BaseAddress0 = PCI_Device->base_address[0];
unsigned long BaseAddress1 = PCI_Device->base_address[1];
unsigned short SubsystemVendorID, SubsystemDeviceID;
+ int CommandIdentifier;
pci_read_config_word(PCI_Device, PCI_SUBSYSTEM_VENDOR_ID,
&SubsystemVendorID);
pci_read_config_word(PCI_Device, PCI_SUBSYSTEM_ID,
break;
}
DAC960_ActiveControllerCount++;
- Controller->Commands[0].Controller = Controller;
- Controller->Commands[0].Next = NULL;
- Controller->FreeCommands = &Controller->Commands[0];
+ for (CommandIdentifier = 0;
+ CommandIdentifier < DAC960_MaxChannels;
+ CommandIdentifier++)
+ {
+ Controller->Commands[CommandIdentifier].Controller = Controller;
+ Controller->Commands[CommandIdentifier].Next =
+ Controller->FreeCommands;
+ Controller->FreeCommands = &Controller->Commands[CommandIdentifier];
+ }
continue;
Failure:
if (IO_Address == 0)
/*
- DAC960_ReportControllerConfiguration reports the configuration of
+ DAC960_ReportControllerConfiguration reports the Configuration Information of
Controller.
*/
static boolean DAC960_ReportControllerConfiguration(DAC960_Controller_T
*Controller)
{
- int LogicalDriveNumber, Channel, TargetID;
DAC960_Info("Configuring Mylex %s PCI RAID Controller\n",
Controller, Controller->ModelName);
DAC960_Info(" Firmware Version: %s, Channels: %d, Memory Size: %dMB\n",
Controller->GeometryTranslationSectors);
if (Controller->SAFTE_EnclosureManagementEnabled)
DAC960_Info(" SAF-TE Enclosure Management Enabled\n", Controller);
+ return true;
+}
+
+
+/*
+ DAC960_ReadDeviceConfiguration reads the Device Configuration Information by
+ requesting the SCSI Inquiry and SCSI Inquiry Unit Serial Number information
+ for each device connected to Controller.
+*/
+
+static boolean DAC960_ReadDeviceConfiguration(DAC960_Controller_T *Controller)
+{
+ DAC960_DCDB_T DCDBs[DAC960_MaxChannels], *DCDB;
+ Semaphore_T Semaphores[DAC960_MaxChannels], *Semaphore;
+ unsigned long ProcessorFlags;
+ int Channel, TargetID;
+ for (TargetID = 0; TargetID < DAC960_MaxTargets; TargetID++)
+ {
+ for (Channel = 0; Channel < Controller->Channels; Channel++)
+ {
+ DAC960_Command_T *Command = &Controller->Commands[Channel];
+ DAC960_SCSI_Inquiry_T *InquiryStandardData =
+ &Controller->InquiryStandardData[Channel][TargetID];
+ InquiryStandardData->PeripheralDeviceType = 0x1F;
+ Semaphore = &Semaphores[Channel];
+ *Semaphore = MUTEX_LOCKED;
+ DCDB = &DCDBs[Channel];
+ DAC960_ClearCommand(Command);
+ Command->CommandType = DAC960_ImmediateCommand;
+ Command->Semaphore = Semaphore;
+ Command->CommandMailbox.Type3.CommandOpcode = DAC960_DCDB;
+ Command->CommandMailbox.Type3.BusAddress = Virtual_to_Bus(DCDB);
+ DCDB->Channel = Channel;
+ DCDB->TargetID = TargetID;
+ DCDB->Direction = DAC960_DCDB_DataTransferDeviceToSystem;
+ DCDB->EarlyStatus = false;
+ DCDB->Timeout = DAC960_DCDB_Timeout_10_seconds;
+ DCDB->NoAutomaticRequestSense = false;
+ DCDB->DisconnectPermitted = true;
+ DCDB->TransferLength = sizeof(DAC960_SCSI_Inquiry_T);
+ DCDB->BusAddress = Virtual_to_Bus(InquiryStandardData);
+ DCDB->CDBLength = 6;
+ DCDB->TransferLengthHigh4 = 0;
+ DCDB->SenseLength = sizeof(DCDB->SenseData);
+ DCDB->CDB[0] = 0x12; /* INQUIRY */
+ DCDB->CDB[1] = 0; /* EVPD = 0 */
+ DCDB->CDB[2] = 0; /* Page Code */
+ DCDB->CDB[3] = 0; /* Reserved */
+ DCDB->CDB[4] = sizeof(DAC960_SCSI_Inquiry_T);
+ DCDB->CDB[5] = 0; /* Control */
+ DAC960_AcquireControllerLock(Controller, &ProcessorFlags);
+ DAC960_QueueCommand(Command);
+ DAC960_ReleaseControllerLock(Controller, &ProcessorFlags);
+ }
+ for (Channel = 0; Channel < Controller->Channels; Channel++)
+ {
+ DAC960_Command_T *Command = &Controller->Commands[Channel];
+ DAC960_SCSI_Inquiry_UnitSerialNumber_T *InquiryUnitSerialNumber =
+ &Controller->InquiryUnitSerialNumber[Channel][TargetID];
+ InquiryUnitSerialNumber->PeripheralDeviceType = 0x1F;
+ Semaphore = &Semaphores[Channel];
+ down(Semaphore);
+ if (Command->CommandStatus != DAC960_NormalCompletion) continue;
+ Command->Semaphore = Semaphore;
+ DCDB = &DCDBs[Channel];
+ DCDB->TransferLength = sizeof(DAC960_SCSI_Inquiry_UnitSerialNumber_T);
+ DCDB->BusAddress = Virtual_to_Bus(InquiryUnitSerialNumber);
+ DCDB->SenseLength = sizeof(DCDB->SenseData);
+ DCDB->CDB[0] = 0x12; /* INQUIRY */
+ DCDB->CDB[1] = 1; /* EVPD = 1 */
+ DCDB->CDB[2] = 0x80; /* Page Code */
+ DCDB->CDB[3] = 0; /* Reserved */
+ DCDB->CDB[4] = sizeof(DAC960_SCSI_Inquiry_UnitSerialNumber_T);
+ DCDB->CDB[5] = 0; /* Control */
+ DAC960_AcquireControllerLock(Controller, &ProcessorFlags);
+ DAC960_QueueCommand(Command);
+ DAC960_ReleaseControllerLock(Controller, &ProcessorFlags);
+ down(Semaphore);
+ }
+ }
+ return true;
+}
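The function above fills a 6-byte SCSI INQUIRY CDB twice: once with EVPD = 0 for standard data and once with EVPD = 1, page 0x80, for the unit serial number. A standalone sketch of that CDB layout (the helper name is hypothetical, not part of the driver):

```c
#include <assert.h>
#include <string.h>

/* Sketch: build a 6-byte SCSI INQUIRY CDB the way
   DAC960_ReadDeviceConfiguration fills it in. */
static void build_inquiry_cdb(unsigned char cdb[6], int evpd,
                              unsigned char page, unsigned char alloc_len)
{
        memset(cdb, 0, 6);
        cdb[0] = 0x12;          /* INQUIRY opcode */
        cdb[1] = evpd ? 1 : 0;  /* EVPD selects vital product data */
        cdb[2] = page;          /* page code (0x80 = unit serial number) */
        cdb[4] = alloc_len;     /* allocation length */
}
```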
+
+
+/*
+ DAC960_ReportDeviceConfiguration reports the Device Configuration Information
+ of Controller.
+*/
+
+static boolean DAC960_ReportDeviceConfiguration(DAC960_Controller_T *Controller)
+{
+ int LogicalDriveNumber, Channel, TargetID;
DAC960_Info(" Physical Devices:\n", Controller);
for (Channel = 0; Channel < Controller->Channels; Channel++)
for (TargetID = 0; TargetID < DAC960_MaxTargets; TargetID++)
{
+ DAC960_SCSI_Inquiry_T *InquiryStandardData =
+ &Controller->InquiryStandardData[Channel][TargetID];
+ DAC960_SCSI_Inquiry_UnitSerialNumber_T *InquiryUnitSerialNumber =
+ &Controller->InquiryUnitSerialNumber[Channel][TargetID];
DAC960_DeviceState_T *DeviceState =
&Controller->DeviceState[Controller->DeviceStateIndex]
[Channel][TargetID];
- if (!DeviceState->Present) continue;
- switch (DeviceState->DeviceType)
+ DAC960_ErrorTable_T *ErrorTable =
+ &Controller->ErrorTable[Controller->ErrorTableIndex];
+ DAC960_ErrorTableEntry_T *ErrorEntry =
+ &ErrorTable->ErrorTableEntries[Channel][TargetID];
+ char Vendor[1+sizeof(InquiryStandardData->VendorIdentification)];
+ char Model[1+sizeof(InquiryStandardData->ProductIdentification)];
+ char Revision[1+sizeof(InquiryStandardData->ProductRevisionLevel)];
+ char SerialNumber[1+sizeof(InquiryUnitSerialNumber
+ ->ProductSerialNumber)];
+ int i;
+ if (InquiryStandardData->PeripheralDeviceType == 0x1F) continue;
+ for (i = 0; i < sizeof(Vendor)-1; i++)
+ {
+ unsigned char VendorCharacter =
+ InquiryStandardData->VendorIdentification[i];
+ Vendor[i] = (VendorCharacter >= ' ' && VendorCharacter <= '~'
+ ? VendorCharacter : ' ');
+ }
+ Vendor[sizeof(Vendor)-1] = '\0';
+ for (i = 0; i < sizeof(Model)-1; i++)
+ {
+ unsigned char ModelCharacter =
+ InquiryStandardData->ProductIdentification[i];
+ Model[i] = (ModelCharacter >= ' ' && ModelCharacter <= '~'
+ ? ModelCharacter : ' ');
+ }
+ Model[sizeof(Model)-1] = '\0';
+ for (i = 0; i < sizeof(Revision)-1; i++)
+ {
+ unsigned char RevisionCharacter =
+ InquiryStandardData->ProductRevisionLevel[i];
+ Revision[i] = (RevisionCharacter >= ' ' && RevisionCharacter <= '~'
+ ? RevisionCharacter : ' ');
+ }
+ Revision[sizeof(Revision)-1] = '\0';
+ DAC960_Info(" %d:%d%s Vendor: %s Model: %s Revision: %s\n",
+ Controller, Channel, TargetID, (TargetID < 10 ? " " : ""),
+ Vendor, Model, Revision);
+ if (InquiryUnitSerialNumber->PeripheralDeviceType != 0x1F)
{
- case DAC960_OtherType:
- DAC960_Info(" %d:%d - Other\n", Controller, Channel, TargetID);
- break;
- case DAC960_DiskType:
- DAC960_Info(" %d:%d - Disk: %s, %d blocks\n", Controller,
- Channel, TargetID,
- (DeviceState->DeviceState == DAC960_Device_Dead
- ? "Dead"
- : DeviceState->DeviceState == DAC960_Device_WriteOnly
+ int SerialNumberLength = InquiryUnitSerialNumber->PageLength;
+ if (SerialNumberLength >
+ sizeof(InquiryUnitSerialNumber->ProductSerialNumber))
+ SerialNumberLength =
+ sizeof(InquiryUnitSerialNumber->ProductSerialNumber);
+ for (i = 0; i < SerialNumberLength; i++)
+ {
+ unsigned char SerialNumberCharacter =
+ InquiryUnitSerialNumber->ProductSerialNumber[i];
+ SerialNumber[i] =
+ (SerialNumberCharacter >= ' ' && SerialNumberCharacter <= '~'
+ ? SerialNumberCharacter : ' ');
+ }
+ SerialNumber[SerialNumberLength] = '\0';
+ DAC960_Info(" Serial Number: %s\n",
+ Controller, SerialNumber);
+ }
+ if (DeviceState->Present && DeviceState->DeviceType == DAC960_DiskType)
+ {
+ if (Controller->DeviceResetCount[Channel][TargetID] > 0)
+ DAC960_Info(" Disk Status: %s, %d blocks, %d resets\n",
+ Controller,
+ (DeviceState->DeviceState == DAC960_Device_Dead
+ ? "Dead"
+ : DeviceState->DeviceState == DAC960_Device_WriteOnly
+ ? "Write-Only"
+ : DeviceState->DeviceState == DAC960_Device_Online
+ ? "Online" : "Standby"),
+ DeviceState->DiskSize,
+ Controller->DeviceResetCount[Channel][TargetID]);
+ else
+ DAC960_Info(" Disk Status: %s, %d blocks\n", Controller,
+ (DeviceState->DeviceState == DAC960_Device_Dead
+ ? "Dead"
+ : DeviceState->DeviceState == DAC960_Device_WriteOnly
? "Write-Only"
: DeviceState->DeviceState == DAC960_Device_Online
- ? "Online" : "Standby"),
- DeviceState->DiskSize);
- break;
- case DAC960_SequentialType:
- DAC960_Info(" %d:%d - Sequential\n", Controller,
- Channel, TargetID);
- break;
- case DAC960_CDROM_or_WORM_Type:
- DAC960_Info(" %d:%d - CD-ROM or WORM\n", Controller,
- Channel, TargetID);
- break;
+ ? "Online" : "Standby"),
+ DeviceState->DiskSize);
}
-
+ if (ErrorEntry->ParityErrorCount > 0 ||
+ ErrorEntry->SoftErrorCount > 0 ||
+ ErrorEntry->HardErrorCount > 0 ||
+ ErrorEntry->MiscErrorCount > 0)
+ DAC960_Info(" Errors - Parity: %d, Soft: %d, "
+ "Hard: %d, Misc: %d\n", Controller,
+ ErrorEntry->ParityErrorCount,
+ ErrorEntry->SoftErrorCount,
+ ErrorEntry->HardErrorCount,
+ ErrorEntry->MiscErrorCount);
}
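The loop above copies each INQUIRY string through the same filter three times: any byte outside the printable ASCII range becomes a space, and the result is NUL-terminated. A standalone sketch of that sanitizing copy (the helper name is illustrative):

```c
#include <assert.h>
#include <string.h>

/* Sketch of the sanitizing copy used for Vendor/Model/Revision above:
   non-printable bytes become spaces; dst must hold n+1 bytes. */
static void copy_printable(char *dst, const unsigned char *src, int n)
{
        int i;
        for (i = 0; i < n; i++)
                dst[i] = (src[i] >= ' ' && src[i] <= '~') ? src[i] : ' ';
        dst[n] = '\0';
}
```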
DAC960_Info(" Logical Drives:\n", Controller);
for (LogicalDriveNumber = 0;
{
if (DAC960_ReadControllerConfiguration(Controller) &&
DAC960_ReportControllerConfiguration(Controller) &&
+ DAC960_ReadDeviceConfiguration(Controller) &&
+ DAC960_ReportDeviceConfiguration(Controller) &&
DAC960_RegisterBlockDevice(Controller))
{
/*
Controller->NeedErrorTableInformation = true;
Controller->NeedDeviceStateInformation = true;
Controller->DeviceStateChannel = 0;
- Controller->DeviceStateTargetID = 0;
+ Controller->DeviceStateTargetID = -1;
Controller->SecondaryMonitoringTime = jiffies;
}
if (NewEnquiry->RebuildFlag == DAC960_StandbyRebuildInProgress ||
EventLogEntry->TargetID,
DAC960_EventMessages[
AdditionalSenseCodeQualifier]);
+ else if (SenseKey == 6 && AdditionalSenseCode == 0x29)
+ {
+ if (Controller->MonitoringTimerCount > 0)
+ Controller->DeviceResetCount[EventLogEntry->Channel]
+ [EventLogEntry->TargetID]++;
+ }
else if (!(SenseKey == 0 ||
(SenseKey == 2 &&
AdditionalSenseCode == 0x04 &&
(AdditionalSenseCodeQualifier == 0x01 ||
- AdditionalSenseCodeQualifier == 0x02)) ||
- (SenseKey == 6 && AdditionalSenseCode == 0x29 &&
- Controller->MonitoringTimerCount == 0)))
+ AdditionalSenseCodeQualifier == 0x02))))
{
DAC960_Critical("Physical Drive %d:%d Error Log: "
"Sense Key = %d, ASC = %02X, ASCQ = %02X\n",
: NewDeviceState->DeviceState
== DAC960_Device_Online
? "ONLINE" : "STANDBY"));
- if (++Controller->DeviceStateTargetID == DAC960_MaxTargets)
+ if (OldDeviceState->DeviceState == DAC960_Device_Dead &&
+ NewDeviceState->DeviceState != DAC960_Device_Dead)
{
- Controller->DeviceStateChannel++;
- Controller->DeviceStateTargetID = 0;
+ Controller->NeedDeviceInquiryInformation = true;
+ Controller->NeedDeviceSerialNumberInformation = true;
}
}
else if (CommandOpcode == DAC960_GetLogicalDriveInformation)
}
if (Controller->NeedDeviceStateInformation)
{
+ if (Controller->NeedDeviceInquiryInformation)
+ {
+ DAC960_DCDB_T *DCDB = &Controller->MonitoringDCDB;
+ DAC960_SCSI_Inquiry_T *InquiryStandardData =
+ &Controller->InquiryStandardData
+ [Controller->DeviceStateChannel]
+ [Controller->DeviceStateTargetID];
+ InquiryStandardData->PeripheralDeviceType = 0x1F;
+ Command->CommandMailbox.Type3.CommandOpcode = DAC960_DCDB;
+ Command->CommandMailbox.Type3.BusAddress = Virtual_to_Bus(DCDB);
+ DCDB->Channel = Controller->DeviceStateChannel;
+ DCDB->TargetID = Controller->DeviceStateTargetID;
+ DCDB->Direction = DAC960_DCDB_DataTransferDeviceToSystem;
+ DCDB->EarlyStatus = false;
+ DCDB->Timeout = DAC960_DCDB_Timeout_10_seconds;
+ DCDB->NoAutomaticRequestSense = false;
+ DCDB->DisconnectPermitted = true;
+ DCDB->TransferLength = sizeof(DAC960_SCSI_Inquiry_T);
+ DCDB->BusAddress = Virtual_to_Bus(InquiryStandardData);
+ DCDB->CDBLength = 6;
+ DCDB->TransferLengthHigh4 = 0;
+ DCDB->SenseLength = sizeof(DCDB->SenseData);
+ DCDB->CDB[0] = 0x12; /* INQUIRY */
+ DCDB->CDB[1] = 0; /* EVPD = 0 */
+ DCDB->CDB[2] = 0; /* Page Code */
+ DCDB->CDB[3] = 0; /* Reserved */
+ DCDB->CDB[4] = sizeof(DAC960_SCSI_Inquiry_T);
+ DCDB->CDB[5] = 0; /* Control */
+ DAC960_QueueCommand(Command);
+ Controller->NeedDeviceInquiryInformation = false;
+ return;
+ }
+ if (Controller->NeedDeviceSerialNumberInformation)
+ {
+ DAC960_DCDB_T *DCDB = &Controller->MonitoringDCDB;
+ DAC960_SCSI_Inquiry_UnitSerialNumber_T *InquiryUnitSerialNumber =
+ &Controller->InquiryUnitSerialNumber
+ [Controller->DeviceStateChannel]
+ [Controller->DeviceStateTargetID];
+ InquiryUnitSerialNumber->PeripheralDeviceType = 0x1F;
+ Command->CommandMailbox.Type3.CommandOpcode = DAC960_DCDB;
+ Command->CommandMailbox.Type3.BusAddress = Virtual_to_Bus(DCDB);
+ DCDB->Channel = Controller->DeviceStateChannel;
+ DCDB->TargetID = Controller->DeviceStateTargetID;
+ DCDB->Direction = DAC960_DCDB_DataTransferDeviceToSystem;
+ DCDB->EarlyStatus = false;
+ DCDB->Timeout = DAC960_DCDB_Timeout_10_seconds;
+ DCDB->NoAutomaticRequestSense = false;
+ DCDB->DisconnectPermitted = true;
+ DCDB->TransferLength =
+ sizeof(DAC960_SCSI_Inquiry_UnitSerialNumber_T);
+ DCDB->BusAddress = Virtual_to_Bus(InquiryUnitSerialNumber);
+ DCDB->CDBLength = 6;
+ DCDB->TransferLengthHigh4 = 0;
+ DCDB->SenseLength = sizeof(DCDB->SenseData);
+ DCDB->CDB[0] = 0x12; /* INQUIRY */
+ DCDB->CDB[1] = 1; /* EVPD = 1 */
+ DCDB->CDB[2] = 0x80; /* Page Code */
+ DCDB->CDB[3] = 0; /* Reserved */
+ DCDB->CDB[4] = sizeof(DAC960_SCSI_Inquiry_UnitSerialNumber_T);
+ DCDB->CDB[5] = 0; /* Control */
+ DAC960_QueueCommand(Command);
+ Controller->NeedDeviceSerialNumberInformation = false;
+ return;
+ }
+ if (++Controller->DeviceStateTargetID == DAC960_MaxTargets)
+ {
+ Controller->DeviceStateChannel++;
+ Controller->DeviceStateTargetID = 0;
+ }
while (Controller->DeviceStateChannel < Controller->Channels)
{
DAC960_DeviceState_T *OldDeviceState =
DAC960_ProcReadStatus implements reading /proc/rd/status.
*/
-static ssize_t DAC960_ProcReadStatus(char *Page, char **Start,
- off_t Offset, int Count,
- int *EOF, void *Data)
+static int DAC960_ProcReadStatus(char *Page, char **Start, off_t Offset,
+ int Count, int *EOF, void *Data)
{
char *StatusMessage = "OK\n";
int ControllerNumber, BytesAvailable;
*EOF = true;
}
if (Count <= 0) return 0;
+ *Start = Page;
memcpy(Page, &StatusMessage[Offset], Count);
return Count;
}
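The proc read handlers above all share the same chunking arithmetic: compute the bytes remaining past `Offset`, clamp `Count` and set `*EOF` at the end of the buffer, and return 0 for reads past the end. A standalone sketch of just that arithmetic (names are illustrative, not from the driver):

```c
#include <assert.h>

/* Sketch of the read_proc chunking used by the handlers above:
   for a buffer of total length len, return how many bytes a read
   at (offset, count) should copy, setting *eof at end of data. */
static int proc_read_chunk(int len, int offset, int count, int *eof)
{
        int avail = len - offset;
        if (count >= avail) {
                count = avail;
                *eof = 1;
        }
        if (count <= 0)
                return 0;
        return count;
}
```

Note that the added `*Start = Page` lines in the hunks above are what tell the 2.2 procfs layer that the returned bytes begin at the start of the page rather than at `Page + Offset`.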
DAC960_ProcReadInitialStatus implements reading /proc/rd/cN/initial_status.
*/
-static ssize_t DAC960_ProcReadInitialStatus(char *Page, char **Start,
- off_t Offset, int Count,
- int *EOF, void *Data)
+static int DAC960_ProcReadInitialStatus(char *Page, char **Start, off_t Offset,
+ int Count, int *EOF, void *Data)
{
DAC960_Controller_T *Controller = (DAC960_Controller_T *) Data;
int BytesAvailable = Controller->InitialStatusLength - Offset;
*EOF = true;
}
if (Count <= 0) return 0;
+ *Start = Page;
memcpy(Page, &Controller->InitialStatusBuffer[Offset], Count);
return Count;
}
DAC960_ProcReadCurrentStatus implements reading /proc/rd/cN/current_status.
*/
-static ssize_t DAC960_ProcReadCurrentStatus(char *Page, char **Start,
- off_t Offset, int Count,
- int *EOF, void *Data)
+static int DAC960_ProcReadCurrentStatus(char *Page, char **Start, off_t Offset,
+ int Count, int *EOF, void *Data)
{
DAC960_Controller_T *Controller = (DAC960_Controller_T *) Data;
int BytesAvailable;
- Controller->CurrentStatusLength = 0;
- DAC960_AnnounceDriver(Controller);
- DAC960_ReportControllerConfiguration(Controller);
- Controller->CurrentStatusBuffer[Controller->CurrentStatusLength++] = ' ';
- Controller->CurrentStatusBuffer[Controller->CurrentStatusLength++] = ' ';
- if (Controller->RebuildProgressLength > 0)
+ if (jiffies != Controller->LastCurrentStatusTime)
{
- strcpy(&Controller->CurrentStatusBuffer[Controller->CurrentStatusLength],
- Controller->RebuildProgressBuffer);
- Controller->CurrentStatusLength += Controller->RebuildProgressLength;
- }
- else
- {
- char *StatusMessage = "No Rebuild or Consistency Check in Progress\n";
- strcpy(&Controller->CurrentStatusBuffer[Controller->CurrentStatusLength],
- StatusMessage);
- Controller->CurrentStatusLength += strlen(StatusMessage);
+ Controller->CurrentStatusLength = 0;
+ DAC960_AnnounceDriver(Controller);
+ DAC960_ReportControllerConfiguration(Controller);
+ DAC960_ReportDeviceConfiguration(Controller);
+ Controller->CurrentStatusBuffer[Controller->CurrentStatusLength++] = ' ';
+ Controller->CurrentStatusBuffer[Controller->CurrentStatusLength++] = ' ';
+ if (Controller->RebuildProgressLength > 0)
+ {
+ strcpy(&Controller->CurrentStatusBuffer
+ [Controller->CurrentStatusLength],
+ Controller->RebuildProgressBuffer);
+ Controller->CurrentStatusLength += Controller->RebuildProgressLength;
+ }
+ else
+ {
+ char *StatusMessage = "No Rebuild or Consistency Check in Progress\n";
+ strcpy(&Controller->CurrentStatusBuffer
+ [Controller->CurrentStatusLength],
+ StatusMessage);
+ Controller->CurrentStatusLength += strlen(StatusMessage);
+ }
+ Controller->LastCurrentStatusTime = jiffies;
}
BytesAvailable = Controller->CurrentStatusLength - Offset;
if (Count >= BytesAvailable)
*EOF = true;
}
if (Count <= 0) return 0;
+ *Start = Page;
memcpy(Page, &Controller->CurrentStatusBuffer[Offset], Count);
return Count;
}
DAC960_ProcReadUserCommand implements reading /proc/rd/cN/user_command.
*/
-static ssize_t DAC960_ProcReadUserCommand(char *Page, char **Start,
- off_t Offset, int Count,
- int *EOF, void *Data)
+static int DAC960_ProcReadUserCommand(char *Page, char **Start, off_t Offset,
+ int Count, int *EOF, void *Data)
{
DAC960_Controller_T *Controller = (DAC960_Controller_T *) Data;
int BytesAvailable = Controller->UserStatusLength - Offset;
*EOF = true;
}
if (Count <= 0) return 0;
+ *Start = Page;
memcpy(Page, &Controller->UserStatusBuffer[Offset], Count);
return Count;
}
DAC960_ProcWriteUserCommand implements writing /proc/rd/cN/user_command.
*/
-static ssize_t DAC960_ProcWriteUserCommand(File_T *File, const char *Buffer,
- unsigned long Count, void *Data)
+static int DAC960_ProcWriteUserCommand(File_T *File, const char *Buffer,
+ unsigned long Count, void *Data)
{
DAC960_Controller_T *Controller = (DAC960_Controller_T *) Data;
char CommandBuffer[80];
DAC960_DCDB_T;
+/*
+ Define the SCSI INQUIRY Standard Data reply structure.
+*/
+
+typedef struct DAC960_SCSI_Inquiry
+{
+ unsigned char PeripheralDeviceType:5; /* Byte 0 Bits 0-4 */
+ unsigned char PeripheralQualifier:3; /* Byte 0 Bits 5-7 */
+ unsigned char DeviceTypeModifier:7; /* Byte 1 Bits 0-6 */
+ boolean RMB:1; /* Byte 1 Bit 7 */
+ unsigned char ANSI_ApprovedVersion:3; /* Byte 2 Bits 0-2 */
+ unsigned char ECMA_Version:3; /* Byte 2 Bits 3-5 */
+ unsigned char ISO_Version:2; /* Byte 2 Bits 6-7 */
+ unsigned char ResponseDataFormat:4; /* Byte 3 Bits 0-3 */
+ unsigned char :2; /* Byte 3 Bits 4-5 */
+ boolean TrmIOP:1; /* Byte 3 Bit 6 */
+ boolean AENC:1; /* Byte 3 Bit 7 */
+ unsigned char AdditionalLength; /* Byte 4 */
+ unsigned char :8; /* Byte 5 */
+ unsigned char :8; /* Byte 6 */
+ boolean SftRe:1; /* Byte 7 Bit 0 */
+ boolean CmdQue:1; /* Byte 7 Bit 1 */
+ boolean :1; /* Byte 7 Bit 2 */
+ boolean Linked:1; /* Byte 7 Bit 3 */
+ boolean Sync:1; /* Byte 7 Bit 4 */
+ boolean WBus16:1; /* Byte 7 Bit 5 */
+ boolean WBus32:1; /* Byte 7 Bit 6 */
+ boolean RelAdr:1; /* Byte 7 Bit 7 */
+ unsigned char VendorIdentification[8]; /* Bytes 8-15 */
+ unsigned char ProductIdentification[16]; /* Bytes 16-31 */
+ unsigned char ProductRevisionLevel[4]; /* Bytes 32-35 */
+}
+DAC960_SCSI_Inquiry_T;
+
+
+/*
+ Define the SCSI INQUIRY Unit Serial Number reply structure.
+*/
+
+typedef struct DAC960_SCSI_Inquiry_UnitSerialNumber
+{
+ unsigned char PeripheralDeviceType:5; /* Byte 0 Bits 0-4 */
+ unsigned char PeripheralQualifier:3; /* Byte 0 Bits 5-7 */
+ unsigned char PageCode; /* Byte 1 */
+ unsigned char :8; /* Byte 2 */
+ unsigned char PageLength; /* Byte 3 */
+ unsigned char ProductSerialNumber[28]; /* Bytes 4 - 31 */
+}
+DAC960_SCSI_Inquiry_UnitSerialNumber_T;
+
+
/*
Define the Scatter/Gather List Type 1 32 Bit Address 32 Bit Byte Count
structure.
/*
- Define the Controller Line, Status Buffer, Rebuild Progress, and
- User Message Sizes.
+ Define the Controller Line Buffer, Status Buffer, Rebuild Progress,
+ and User Message Sizes.
*/
#define DAC960_LineBufferSize 100
-#define DAC960_StatusBufferSize 5000
+#define DAC960_StatusBufferSize 16384
#define DAC960_RebuildProgressSize 200
#define DAC960_UserMessageSize 200
unsigned long MonitoringTimerCount;
unsigned long SecondaryMonitoringTime;
unsigned long LastProgressReportTime;
+ unsigned long LastCurrentStatusTime;
boolean DualModeMemoryMailboxInterface;
boolean SAFTE_EnclosureManagementEnabled;
boolean ControllerInitialized;
boolean NeedLogicalDriveInformation;
boolean NeedErrorTableInformation;
boolean NeedDeviceStateInformation;
+ boolean NeedDeviceInquiryInformation;
+ boolean NeedDeviceSerialNumberInformation;
boolean NeedRebuildProgress;
boolean NeedConsistencyCheckProgress;
boolean EphemeralProgressMessage;
PROC_DirectoryEntry_T CurrentStatusProcEntry;
PROC_DirectoryEntry_T UserCommandProcEntry;
WaitQueue_T *CommandWaitQueue;
+ DAC960_DCDB_T MonitoringDCDB;
DAC960_Enquiry_T Enquiry[2];
DAC960_ErrorTable_T ErrorTable[2];
DAC960_EventLogEntry_T EventLogEntry;
DAC960_LogicalDriveState_T LogicalDriveInitialState[DAC960_MaxLogicalDrives];
DAC960_DeviceState_T DeviceState[2][DAC960_MaxChannels][DAC960_MaxTargets];
DAC960_Command_T Commands[DAC960_MaxDriverQueueDepth];
+ DAC960_SCSI_Inquiry_T
+ InquiryStandardData[DAC960_MaxChannels][DAC960_MaxTargets];
+ DAC960_SCSI_Inquiry_UnitSerialNumber_T
+ InquiryUnitSerialNumber[DAC960_MaxChannels][DAC960_MaxTargets];
DiskPartition_T DiskPartitions[DAC960_MinorCount];
int LogicalDriveUsageCount[DAC960_MaxLogicalDrives];
int PartitionSizes[DAC960_MinorCount];
int BlockSizes[DAC960_MinorCount];
int MaxSectorsPerRequest[DAC960_MinorCount];
int MaxSegmentsPerRequest[DAC960_MinorCount];
+ int DeviceResetCount[DAC960_MaxChannels][DAC960_MaxTargets];
boolean DirectCommandActive[DAC960_MaxChannels][DAC960_MaxTargets];
char InitialStatusBuffer[DAC960_StatusBufferSize];
char CurrentStatusBuffer[DAC960_StatusBufferSize];
endif
endif
-ifeq ($(CONFIG_BLK_DEV_DAC960),y)
-LX_OBJS += DAC960.o
-else
- ifeq ($(CONFIG_BLK_DEV_DAC960),m)
- MX_OBJS += DAC960.o
- endif
-endif
-
ifeq ($(CONFIG_BLK_CPQ_DA),y)
L_OBJS += cpqarray.o
else
endif
endif
+ifeq ($(CONFIG_BLK_DEV_DAC960),y)
+LX_OBJS += DAC960.o
+else
+ ifeq ($(CONFIG_BLK_DEV_DAC960),m)
+ MX_OBJS += DAC960.o
+ endif
+endif
+
ifeq ($(CONFIG_BLK_DEV_MD),y)
LX_OBJS += md.o
endif
ifeq ($(CONFIG_MD_RAID5),y)
-LX_OBJS += xor.o
L_OBJS += raid5.o
else
ifeq ($(CONFIG_MD_RAID5),m)
- LX_OBJS += xor.o
M_OBJS += raid5.o
endif
endif
-ifeq ($(CONFIG_MD_TRANSLUCENT),y)
-L_OBJS += translucent.o
-else
- ifeq ($(CONFIG_MD_TRANSLUCENT),m)
- M_OBJS += translucent.o
- endif
-endif
-
-ifeq ($(CONFIG_MD_HSM),y)
-L_OBJS += hsm.o
-else
- ifeq ($(CONFIG_MD_HSM),m)
- M_OBJS += hsm.o
- endif
-endif
-
endif
ifeq ($(CONFIG_BLK_DEV_NBD),y)
#include <linux/locks.h>
#include <linux/malloc.h>
#include <linux/proc_fs.h>
+#include <linux/md.h>
#include <linux/timer.h>
#endif
#include <linux/string.h>
#include <linux/blk.h>
#include <linux/init.h>
-#include <linux/raid/md.h>
#include <asm/system.h>
#include <asm/byteorder.h>
#endif
rd_load();
#endif
-#ifdef CONFIG_BLK_DEV_MD
- autodetect_raid();
-#endif
#ifdef CONFIG_MD_BOOT
md_setup_drive();
#endif
+++ /dev/null
-/*
- hsm.c : HSM RAID driver for Linux
- Copyright (C) 1998 Ingo Molnar
-
- HSM mode management functions.
-
- This program is free software; you can redistribute it and/or modify
- it under the terms of the GNU General Public License as published by
- the Free Software Foundation; either version 2, or (at your option)
- any later version.
-
- You should have received a copy of the GNU General Public License
- (for example /usr/src/linux/COPYING); if not, write to the Free
- Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
-*/
-
-#include <linux/module.h>
-
-#include <linux/raid/md.h>
-#include <linux/malloc.h>
-
-#include <linux/raid/hsm.h>
-#include <linux/blk.h>
-
-#define MAJOR_NR MD_MAJOR
-#define MD_DRIVER
-#define MD_PERSONALITY
-
-
-#define DEBUG_HSM 1
-
-#if DEBUG_HSM
-#define dprintk(x,y...) printk(x,##y)
-#else
-#define dprintk(x,y...) do { } while (0)
-#endif
-
-void print_bh(struct buffer_head *bh)
-{
- dprintk("bh %p: %lx %lx %x %x %lx %p %lx %p %x %p %x %lx\n", bh,
- bh->b_blocknr, bh->b_size, bh->b_dev, bh->b_rdev,
- bh->b_rsector, bh->b_this_page, bh->b_state,
- bh->b_next_free, bh->b_count, bh->b_data,
- bh->b_list, bh->b_flushtime
- );
-}
-
-static int check_bg (pv_t *pv, pv_block_group_t * bg)
-{
- int i, free = 0;
-
- dprintk("checking bg ...\n");
-
- for (i = 0; i < pv->pv_sb->pv_bg_size-1; i++) {
- if (pv_pptr_free(bg->blocks + i)) {
- free++;
- if (test_bit(i, bg->used_bitmap)) {
- printk("hm, bit %d set?\n", i);
- }
- } else {
- if (!test_bit(i, bg->used_bitmap)) {
- printk("hm, bit %d not set?\n", i);
- }
- }
- }
- dprintk("%d free blocks in bg ...\n", free);
- return free;
-}
-
-static void get_bg (pv_t *pv, pv_bg_desc_t *desc, int nr)
-{
- unsigned int bg_pos = nr * pv->pv_sb->pv_bg_size + 2;
- struct buffer_head *bh;
-
- dprintk("... getting BG at %u ...\n", bg_pos);
-
- bh = bread (pv->dev, bg_pos, HSM_BLOCKSIZE);
- if (!bh) {
- MD_BUG();
- return;
- }
- desc->bg = (pv_block_group_t *) bh->b_data;
- desc->free_blocks = check_bg(pv, desc->bg);
-}
-
-static int find_free_block (lv_t *lv, pv_t *pv, pv_bg_desc_t *desc, int nr,
- unsigned int lblock, lv_lptr_t * index)
-{
- int i;
-
- for (i = 0; i < pv->pv_sb->pv_bg_size-1; i++) {
- pv_pptr_t * bptr = desc->bg->blocks + i;
- if (pv_pptr_free(bptr)) {
- unsigned int bg_pos = nr * pv->pv_sb->pv_bg_size + 2;
-
- if (test_bit(i, desc->bg->used_bitmap)) {
- MD_BUG();
- continue;
- }
- bptr->u.used.owner.log_id = lv->log_id;
- bptr->u.used.owner.log_index = lblock;
- index->data.phys_nr = pv->phys_nr;
- index->data.phys_block = bg_pos + i + 1;
- set_bit(i, desc->bg->used_bitmap);
- desc->free_blocks--;
- dprintk(".....free blocks left in bg %p: %d\n",
- desc->bg, desc->free_blocks);
- return 0;
- }
- }
- return -ENOSPC;
-}
-
-static int __get_free_block (lv_t *lv, pv_t *pv,
- unsigned int lblock, lv_lptr_t * index)
-{
- int i;
-
- dprintk("trying to get free block for lblock %d ...\n", lblock);
-
- for (i = 0; i < pv->pv_sb->pv_block_groups; i++) {
- pv_bg_desc_t *desc = pv->bg_array + i;
-
- dprintk("looking at desc #%d (%p)...\n", i, desc->bg);
- if (!desc->bg)
- get_bg(pv, desc, i);
-
- if (desc->bg && desc->free_blocks)
- return find_free_block(lv, pv, desc, i,
- lblock, index);
- }
- dprintk("hsm: pv %s full!\n", partition_name(pv->dev));
- return -ENOSPC;
-}
-
-static int get_free_block (lv_t *lv, unsigned int lblock, lv_lptr_t * index)
-{
- int err;
-
- if (!lv->free_indices)
- return -ENOSPC;
-
- /* fix me */
- err = __get_free_block(lv, lv->vg->pv_array + 0, lblock, index);
-
- if (err || !index->data.phys_block) {
- MD_BUG();
- return -ENOSPC;
- }
-
- lv->free_indices--;
-
- return 0;
-}
-
-/*
- * fix me: wordsize assumptions ...
- */
-#define INDEX_BITS 8
-#define INDEX_DEPTH (32/INDEX_BITS)
-#define INDEX_MASK ((1<<INDEX_BITS) - 1)
-
-static void print_index_list (lv_t *lv, lv_lptr_t *index)
-{
- lv_lptr_t *tmp;
- int i;
-
- dprintk("... block <%u,%u,%x> [.", index->data.phys_nr,
- index->data.phys_block, index->cpu_addr);
-
- tmp = index_child(index);
- for (i = 0; i < HSM_LPTRS_PER_BLOCK; i++) {
- if (index_block(lv, tmp))
- dprintk("(%d->%d)", i, index_block(lv, tmp));
- tmp++;
- }
- dprintk(".]\n");
-}
-
-static int read_index_group (lv_t *lv, lv_lptr_t *index)
-{
- lv_lptr_t *index_group, *tmp;
- struct buffer_head *bh;
- int i;
-
- dprintk("reading index group <%s:%d>\n",
- partition_name(index_dev(lv, index)), index_block(lv, index));
-
- bh = bread(index_dev(lv, index), index_block(lv, index), HSM_BLOCKSIZE);
- if (!bh) {
- MD_BUG();
- return -EIO;
- }
- if (!buffer_uptodate(bh))
- MD_BUG();
-
- index_group = (lv_lptr_t *) bh->b_data;
- tmp = index_group;
- for (i = 0; i < HSM_LPTRS_PER_BLOCK; i++) {
- if (index_block(lv, tmp)) {
- dprintk("index group has BLOCK %d, non-present.\n", i);
- tmp->cpu_addr = 0;
- }
- tmp++;
- }
- index->cpu_addr = ptr_to_cpuaddr(index_group);
-
- dprintk("have read index group %p at block %d.\n",
- index_group, index_block(lv, index));
- print_index_list(lv, index);
-
- return 0;
-}
-
-static int alloc_index_group (lv_t *lv, unsigned int lblock, lv_lptr_t * index)
-{
- struct buffer_head *bh;
- lv_lptr_t * index_group;
-
- if (get_free_block(lv, lblock, index))
- return -ENOSPC;
-
- dprintk("creating block for index group <%s:%d>\n",
- partition_name(index_dev(lv, index)), index_block(lv, index));
-
- bh = getblk(index_dev(lv, index),
- index_block(lv, index), HSM_BLOCKSIZE);
-
- index_group = (lv_lptr_t *) bh->b_data;
- md_clear_page(index_group);
- mark_buffer_uptodate(bh, 1);
-
- index->cpu_addr = ptr_to_cpuaddr(index_group);
-
- dprintk("allocated index group %p at block %d.\n",
- index_group, index_block(lv, index));
- return 0;
-}
-
-static lv_lptr_t * alloc_fixed_index (lv_t *lv, unsigned int lblock)
-{
- lv_lptr_t * index = index_child(&lv->root_index);
- int idx, l;
-
- for (l = INDEX_DEPTH-1; l >= 0; l--) {
- idx = (lblock >> (INDEX_BITS*l)) & INDEX_MASK;
- index += idx;
- if (!l)
- break;
- if (!index_present(index)) {
- dprintk("no group, level %u, pos %u\n", l, idx);
- if (alloc_index_group(lv, lblock, index))
- return NULL;
- }
- index = index_child(index);
- }
- if (!index_block(lv,index)) {
- dprintk("no data, pos %u\n", idx);
- if (get_free_block(lv, lblock, index))
- return NULL;
- return index;
- }
- MD_BUG();
- return index;
-}
-
-static lv_lptr_t * find_index (lv_t *lv, unsigned int lblock)
-{
- lv_lptr_t * index = index_child(&lv->root_index);
- int idx, l;
-
- for (l = INDEX_DEPTH-1; l >= 0; l--) {
- idx = (lblock >> (INDEX_BITS*l)) & INDEX_MASK;
- index += idx;
- if (!l)
- break;
- if (index_free(index))
- return NULL;
- if (!index_present(index))
- read_index_group(lv, index);
- if (!index_present(index)) {
- MD_BUG();
- return NULL;
- }
- index = index_child(index);
- }
- if (!index_block(lv,index))
- return NULL;
- return index;
-}
-
-static int read_root_index(lv_t *lv)
-{
- int err;
- lv_lptr_t *index = &lv->root_index;
-
- if (!index_block(lv, index)) {
- printk("LV has no root index yet, creating.\n");
-
- err = alloc_index_group (lv, 0, index);
- if (err) {
- printk("could not create index group, err:%d\n", err);
- return err;
- }
- lv->vg->vg_sb->lv_array[lv->log_id].lv_root_idx =
- lv->root_index.data;
- } else {
- printk("LV already has a root index.\n");
- printk("... at <%s:%d>.\n",
- partition_name(index_dev(lv, index)),
- index_block(lv, index));
-
- read_index_group(lv, index);
- }
- return 0;
-}
-
-static int init_pv(pv_t *pv)
-{
- struct buffer_head *bh;
- pv_sb_t *pv_sb;
-
- bh = bread (pv->dev, 0, HSM_BLOCKSIZE);
- if (!bh) {
- MD_BUG();
- return -1;
- }
-
- pv_sb = (pv_sb_t *) bh->b_data;
- pv->pv_sb = pv_sb;
-
- if (pv_sb->pv_magic != HSM_PV_SB_MAGIC) {
- printk("%s is not a PV, has magic %x instead of %x!\n",
- partition_name(pv->dev), pv_sb->pv_magic,
- HSM_PV_SB_MAGIC);
- return -1;
- }
- printk("%s detected as a valid PV (#%d).\n", partition_name(pv->dev),
- pv->phys_nr);
- printk("... created under HSM version %d.%d.%d, at %x.\n",
- pv_sb->pv_major, pv_sb->pv_minor, pv_sb->pv_patch, pv_sb->pv_ctime);
- printk("... total # of blocks: %d (%d left unallocated).\n",
- pv_sb->pv_total_size, pv_sb->pv_blocks_left);
-
- printk("... block size: %d bytes.\n", pv_sb->pv_block_size);
- printk("... block descriptor size: %d bytes.\n", pv_sb->pv_pptr_size);
- printk("... block group size: %d blocks.\n", pv_sb->pv_bg_size);
- printk("... # of block groups: %d.\n", pv_sb->pv_block_groups);
-
- if (pv_sb->pv_block_groups*sizeof(pv_bg_desc_t) > PAGE_SIZE) {
- MD_BUG();
- return 1;
- }
- pv->bg_array = (pv_bg_desc_t *)__get_free_page(GFP_KERNEL);
- if (!pv->bg_array) {
- MD_BUG();
- return 1;
- }
- memset(pv->bg_array, 0, PAGE_SIZE);
-
- return 0;
-}
-
-static int free_pv(pv_t *pv)
-{
- struct buffer_head *bh;
-
- dprintk("freeing PV %d ...\n", pv->phys_nr);
-
- if (pv->bg_array) {
- int i;
-
- dprintk(".... freeing BGs ...\n");
- for (i = 0; i < pv->pv_sb->pv_block_groups; i++) {
- unsigned int bg_pos = i * pv->pv_sb->pv_bg_size + 2;
- pv_bg_desc_t *desc = pv->bg_array + i;
-
- if (desc->bg) {
- dprintk(".... freeing BG %d ...\n", i);
- bh = getblk (pv->dev, bg_pos, HSM_BLOCKSIZE);
- mark_buffer_dirty(bh, 1);
- brelse(bh);
- brelse(bh);
- }
- }
- free_page((unsigned long)pv->bg_array);
- } else
- MD_BUG();
-
- bh = getblk (pv->dev, 0, HSM_BLOCKSIZE);
- if (!bh) {
- MD_BUG();
- return -1;
- }
- mark_buffer_dirty(bh, 1);
- brelse(bh);
- brelse(bh);
-
- return 0;
-}
-
-struct semaphore hsm_sem = MUTEX;
-
-#define HSM_SECTORS (HSM_BLOCKSIZE/512)
-
-static int hsm_map (mddev_t *mddev, kdev_t dev, kdev_t *rdev,
- unsigned long *rsector, unsigned long bsectors)
-{
- lv_t *lv = kdev_to_lv(dev);
- lv_lptr_t *index;
- unsigned int lblock = *rsector / HSM_SECTORS;
- unsigned int offset = *rsector % HSM_SECTORS;
- int err = -EIO;
-
- if (!lv) {
- printk("HSM: md%d not a Logical Volume!\n", mdidx(mddev));
- goto out;
- }
- if (offset + bsectors > HSM_SECTORS) {
- MD_BUG();
- goto out;
- }
- down(&hsm_sem);
- index = find_index(lv, lblock);
- if (!index) {
- printk("no block %u yet ... allocating\n", lblock);
- index = alloc_fixed_index(lv, lblock);
- }
-
- err = 0;
-
- printk(" %u <%s : %ld(%ld)> -> ", lblock,
- partition_name(*rdev), *rsector, bsectors);
-
- *rdev = index_dev(lv, index);
- *rsector = index_block(lv, index) * HSM_SECTORS + offset;
-
- printk(" <%s : %ld> %u\n",
- partition_name(*rdev), *rsector, index_block(lv, index));
-
- up(&hsm_sem);
-out:
- return err;
-}
-
-static void free_index (lv_t *lv, lv_lptr_t * index)
-{
- struct buffer_head *bh;
-
-	printk("trying to get cached block for index group <%s:%d>\n",
- partition_name(index_dev(lv, index)), index_block(lv, index));
-
- bh = getblk(index_dev(lv, index), index_block(lv, index),HSM_BLOCKSIZE);
-
- printk("....FREEING ");
- print_index_list(lv, index);
-
- if (bh) {
- if (!buffer_uptodate(bh))
- MD_BUG();
- if ((lv_lptr_t *)bh->b_data != index_child(index)) {
- printk("huh? b_data is %p, index content is %p.\n",
- bh->b_data, index_child(index));
- } else
- printk("good, b_data == index content == %p.\n",
- index_child(index));
- printk("b_count == %d, writing.\n", bh->b_count);
- mark_buffer_dirty(bh, 1);
- brelse(bh);
- brelse(bh);
- printk("done.\n");
- } else {
- printk("FAILED!\n");
- }
- print_index_list(lv, index);
- index_child(index) = NULL;
-}
-
-static void free_index_group (lv_t *lv, int level, lv_lptr_t * index_0)
-{
- char dots [3*8];
- lv_lptr_t * index;
- int i, nr_dots;
-
- nr_dots = (INDEX_DEPTH-level)*3;
- memcpy(dots,"...............",nr_dots);
- dots[nr_dots] = 0;
-
- dprintk("%s level %d index group block:\n", dots, level);
-
-
- index = index_0;
- for (i = 0; i < HSM_LPTRS_PER_BLOCK; i++) {
- if (index->data.phys_block) {
- dprintk("%s block <%u,%u,%x>\n", dots,
- index->data.phys_nr,
- index->data.phys_block,
- index->cpu_addr);
- if (level && index_present(index)) {
- dprintk("%s==> deeper one level\n", dots);
- free_index_group(lv, level-1,
- index_child(index));
- dprintk("%s freeing index group block %p ...",
- dots, index_child(index));
- free_index(lv, index);
- }
- }
- index++;
- }
- dprintk("%s DONE: level %d index group block.\n", dots, level);
-}
-
-static void free_lv_indextree (lv_t *lv)
-{
- dprintk("freeing LV %d ...\n", lv->log_id);
- dprintk("..root index: %p\n", index_child(&lv->root_index));
- dprintk("..INDEX TREE:\n");
- free_index_group(lv, INDEX_DEPTH-1, index_child(&lv->root_index));
- dprintk("..freeing root index %p ...", index_child(&lv->root_index));
- dprintk("root block <%u,%u,%x>\n", lv->root_index.data.phys_nr,
- lv->root_index.data.phys_block, lv->root_index.cpu_addr);
- free_index(lv, &lv->root_index);
- dprintk("..INDEX TREE done.\n");
- fsync_dev(lv->vg->pv_array[0].dev); /* fix me */
- lv->vg->vg_sb->lv_array[lv->log_id].lv_free_indices = lv->free_indices;
-}
-
-static void print_index_group (lv_t *lv, int level, lv_lptr_t * index_0)
-{
- char dots [3*5];
- lv_lptr_t * index;
- int i, nr_dots;
-
- nr_dots = (INDEX_DEPTH-level)*3;
- memcpy(dots,"...............",nr_dots);
- dots[nr_dots] = 0;
-
- dprintk("%s level %d index group block:\n", dots, level);
-
-
- for (i = 0; i < HSM_LPTRS_PER_BLOCK; i++) {
- index = index_0 + i;
- if (index->data.phys_block) {
- dprintk("%s block <%u,%u,%x>\n", dots,
- index->data.phys_nr,
- index->data.phys_block,
- index->cpu_addr);
- if (level && index_present(index)) {
- dprintk("%s==> deeper one level\n", dots);
- print_index_group(lv, level-1,
- index_child(index));
- }
- }
- }
- dprintk("%s DONE: level %d index group block.\n", dots, level);
-}
-
-static void print_lv (lv_t *lv)
-{
- dprintk("printing LV %d ...\n", lv->log_id);
- dprintk("..root index: %p\n", index_child(&lv->root_index));
- dprintk("..INDEX TREE:\n");
- print_index_group(lv, INDEX_DEPTH-1, index_child(&lv->root_index));
- dprintk("..INDEX TREE done.\n");
-}
-
-static int map_lv (lv_t *lv)
-{
- kdev_t dev = lv->dev;
- unsigned int nr = MINOR(dev);
- mddev_t *mddev = lv->vg->mddev;
-
- if (MAJOR(dev) != MD_MAJOR) {
- MD_BUG();
- return -1;
- }
- if (kdev_to_mddev(dev)) {
- MD_BUG();
- return -1;
- }
- md_hd_struct[nr].start_sect = 0;
- md_hd_struct[nr].nr_sects = md_size[mdidx(mddev)] << 1;
- md_size[nr] = md_size[mdidx(mddev)];
- add_mddev_mapping(mddev, dev, lv);
-
- return 0;
-}
-
-static int unmap_lv (lv_t *lv)
-{
- kdev_t dev = lv->dev;
- unsigned int nr = MINOR(dev);
-
- if (MAJOR(dev) != MD_MAJOR) {
- MD_BUG();
- return -1;
- }
- md_hd_struct[nr].start_sect = 0;
- md_hd_struct[nr].nr_sects = 0;
- md_size[nr] = 0;
- del_mddev_mapping(lv->vg->mddev, dev);
-
- return 0;
-}
-
-static int init_vg (vg_t *vg)
-{
- int i;
- lv_t *lv;
- kdev_t dev;
- vg_sb_t *vg_sb;
- struct buffer_head *bh;
- lv_descriptor_t *lv_desc;
-
- /*
- * fix me: read all PVs and compare the SB
- */
- dev = vg->pv_array[0].dev;
- bh = bread (dev, 1, HSM_BLOCKSIZE);
- if (!bh) {
- MD_BUG();
- return -1;
- }
-
- vg_sb = (vg_sb_t *) bh->b_data;
- vg->vg_sb = vg_sb;
-
- if (vg_sb->vg_magic != HSM_VG_SB_MAGIC) {
- printk("%s is not a valid VG, has magic %x instead of %x!\n",
- partition_name(dev), vg_sb->vg_magic,
- HSM_VG_SB_MAGIC);
- return -1;
- }
-
- vg->nr_lv = 0;
- for (i = 0; i < HSM_MAX_LVS_PER_VG; i++) {
- unsigned int id;
- lv_desc = vg->vg_sb->lv_array + i;
-
- id = lv_desc->lv_id;
- if (!id) {
- printk("... LV desc %d empty\n", i);
- continue;
- }
- if (id >= HSM_MAX_LVS_PER_VG) {
- MD_BUG();
- continue;
- }
-
- lv = vg->lv_array + id;
- if (lv->vg) {
- MD_BUG();
- continue;
- }
- lv->log_id = id;
- lv->vg = vg;
- lv->max_indices = lv_desc->lv_max_indices;
- lv->free_indices = lv_desc->lv_free_indices;
- lv->root_index.data = lv_desc->lv_root_idx;
- lv->dev = MKDEV(MD_MAJOR, lv_desc->md_id);
-
- vg->nr_lv++;
-
- map_lv(lv);
- if (read_root_index(lv)) {
- vg->nr_lv--;
- unmap_lv(lv);
- memset(lv, 0, sizeof(*lv));
- }
- }
- if (vg->nr_lv != vg_sb->nr_lvs)
- MD_BUG();
-
- return 0;
-}
-
-static int hsm_run (mddev_t *mddev)
-{
- int i;
- vg_t *vg;
- mdk_rdev_t *rdev;
-
- MOD_INC_USE_COUNT;
-
- vg = kmalloc (sizeof (*vg), GFP_KERNEL);
- if (!vg)
- goto out;
- memset(vg, 0, sizeof(*vg));
- mddev->private = vg;
- vg->mddev = mddev;
-
- if (md_check_ordering(mddev)) {
- printk("hsm: disks are not ordered, aborting!\n");
- goto out;
- }
-
- set_blocksize (mddev_to_kdev(mddev), HSM_BLOCKSIZE);
-
- vg->nr_pv = mddev->nb_dev;
- ITERATE_RDEV_ORDERED(mddev,rdev,i) {
- pv_t *pv = vg->pv_array + i;
-
- pv->dev = rdev->dev;
- fsync_dev (pv->dev);
- set_blocksize (pv->dev, HSM_BLOCKSIZE);
- pv->phys_nr = i;
- if (init_pv(pv))
- goto out;
- }
-
- init_vg(vg);
-
- return 0;
-
-out:
- if (vg) {
- kfree(vg);
- mddev->private = NULL;
- }
- MOD_DEC_USE_COUNT;
-
- return 1;
-}
-
-static int hsm_stop (mddev_t *mddev)
-{
- lv_t *lv;
- vg_t *vg;
- int i;
-
- vg = mddev_to_vg(mddev);
-
- for (i = 0; i < HSM_MAX_LVS_PER_VG; i++) {
- lv = vg->lv_array + i;
- if (!lv->log_id)
- continue;
- print_lv(lv);
- free_lv_indextree(lv);
- unmap_lv(lv);
- }
- for (i = 0; i < vg->nr_pv; i++)
- free_pv(vg->pv_array + i);
-
- kfree(vg);
-
- MOD_DEC_USE_COUNT;
-
- return 0;
-}
-
-
-static int hsm_status (char *page, mddev_t *mddev)
-{
- int sz = 0, i;
- lv_t *lv;
- vg_t *vg;
-
- vg = mddev_to_vg(mddev);
-
- for (i = 0; i < HSM_MAX_LVS_PER_VG; i++) {
- lv = vg->lv_array + i;
- if (!lv->log_id)
- continue;
- sz += sprintf(page+sz, "<LV%d %d/%d blocks used> ", lv->log_id,
- lv->max_indices - lv->free_indices, lv->max_indices);
- }
- return sz;
-}
-
-
-static mdk_personality_t hsm_personality=
-{
- "hsm",
- hsm_map,
- NULL,
- NULL,
- hsm_run,
- hsm_stop,
- hsm_status,
- NULL,
- 0,
- NULL,
- NULL,
- NULL,
- NULL
-};
-
-#ifndef MODULE
-
-md__initfunc(void hsm_init (void))
-{
- register_md_personality (HSM, &hsm_personality);
-}
-
-#else
-
-int init_module (void)
-{
- return (register_md_personality (HSM, &hsm_personality));
-}
-
-void cleanup_module (void)
-{
- unregister_md_personality (HSM);
-}
-
-#endif
-
-/*
- * This Linus-trick catches bugs via the linker.
- */
-
-extern void __BUG__in__hsm_dot_c_1(void);
-extern void __BUG__in__hsm_dot_c_2(void);
-extern void __BUG__in__hsm_dot_c_3(void);
-extern void __BUG__in__hsm_dot_c_4(void);
-extern void __BUG__in__hsm_dot_c_5(void);
-extern void __BUG__in__hsm_dot_c_6(void);
-extern void __BUG__in__hsm_dot_c_7(void);
-
-void bugcatcher (void)
-{
- if (sizeof(pv_block_group_t) != HSM_BLOCKSIZE)
- __BUG__in__hsm_dot_c_1();
- if (sizeof(lv_index_block_t) != HSM_BLOCKSIZE)
- __BUG__in__hsm_dot_c_2();
-
- if (sizeof(pv_sb_t) != HSM_BLOCKSIZE)
- __BUG__in__hsm_dot_c_4();
- if (sizeof(lv_sb_t) != HSM_BLOCKSIZE)
- __BUG__in__hsm_dot_c_3();
- if (sizeof(vg_sb_t) != HSM_BLOCKSIZE)
- __BUG__in__hsm_dot_c_6();
-
- if (sizeof(lv_lptr_t) != 16)
- __BUG__in__hsm_dot_c_5();
- if (sizeof(pv_pptr_t) != 16)
- __BUG__in__hsm_dot_c_6();
-}
-
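The hsm.c file deleted above addressed logical blocks through a fixed-depth radix tree: a 32-bit logical block number is split into `INDEX_DEPTH` digits of `INDEX_BITS` bits each, one digit per tree level, as walked by `alloc_fixed_index()` and `find_index()`. As a hedged illustration of that digit extraction only (the function name and test value below are mine, not the kernel's):

```c
/* Sketch of the removed hsm.c block addressing: a 32-bit logical block
 * number is split into INDEX_DEPTH digits of INDEX_BITS bits each, one
 * digit per level of the on-disk index tree, as in alloc_fixed_index()
 * and find_index(). The function name is illustrative, not kernel API. */
#define INDEX_BITS  8
#define INDEX_DEPTH (32 / INDEX_BITS)		/* 4 levels */
#define INDEX_MASK  ((1U << INDEX_BITS) - 1)	/* 0xff */

/* idx[l] receives the digit consumed at tree level l: l = 0 is the
 * leaf/data level, l = INDEX_DEPTH-1 the root index group. */
static void decompose_lblock(unsigned int lblock,
			     unsigned int idx[INDEX_DEPTH])
{
	int l;

	for (l = INDEX_DEPTH - 1; l >= 0; l--)
		idx[l] = (lblock >> (INDEX_BITS * l)) & INDEX_MASK;
}
```

With 8-bit digits each index group holds 256 pointers, so four levels cover the whole 32-bit logical block space.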
+
/*
linear.c : Multiple Devices driver for Linux
Copyright (C) 1994-96 Marc ZYNGIER
#include <linux/module.h>
-#include <linux/raid/md.h>
+#include <linux/md.h>
#include <linux/malloc.h>
+#include <linux/init.h>
-#include <linux/raid/linear.h>
+#include "linear.h"
#define MAJOR_NR MD_MAJOR
#define MD_DRIVER
#define MD_PERSONALITY
-static int linear_run (mddev_t *mddev)
+static int linear_run (int minor, struct md_dev *mddev)
{
- linear_conf_t *conf;
- struct linear_hash *table;
- mdk_rdev_t *rdev;
- int size, i, j, nb_zone;
- unsigned int curr_offset;
-
- MOD_INC_USE_COUNT;
-
- conf = kmalloc (sizeof (*conf), GFP_KERNEL);
- if (!conf)
- goto out;
- mddev->private = conf;
-
- if (md_check_ordering(mddev)) {
- printk("linear: disks are not ordered, aborting!\n");
- goto out;
- }
- /*
- * Find the smallest device.
- */
-
- conf->smallest = NULL;
- curr_offset = 0;
- ITERATE_RDEV_ORDERED(mddev,rdev,j) {
- dev_info_t *disk = conf->disks + j;
-
- disk->dev = rdev->dev;
- disk->size = rdev->size;
- disk->offset = curr_offset;
-
- curr_offset += disk->size;
-
- if (!conf->smallest || (disk->size < conf->smallest->size))
- conf->smallest = disk;
- }
-
- nb_zone = conf->nr_zones =
- md_size[mdidx(mddev)] / conf->smallest->size +
- ((md_size[mdidx(mddev)] % conf->smallest->size) ? 1 : 0);
+ int cur=0, i, size, dev0_size, nb_zone;
+ struct linear_data *data;
+
+ MOD_INC_USE_COUNT;
+
+ mddev->private=kmalloc (sizeof (struct linear_data), GFP_KERNEL);
+ data=(struct linear_data *) mddev->private;
+
+ /*
+ Find out the smallest device. This was previously done
+ at registry time, but since it violates modularity,
+ I moved it here... Any comment ? ;-)
+ */
+
+ data->smallest=mddev->devices;
+ for (i=1; i<mddev->nb_dev; i++)
+ if (data->smallest->size > mddev->devices[i].size)
+ data->smallest=mddev->devices+i;
- conf->hash_table = kmalloc (sizeof (struct linear_hash) * nb_zone,
- GFP_KERNEL);
- if (!conf->hash_table)
- goto out;
-
- /*
- * Here we generate the linear hash table
- */
- table = conf->hash_table;
- i = 0;
- size = 0;
- for (j = 0; j < mddev->nb_dev; j++) {
- dev_info_t *disk = conf->disks + j;
-
- if (size < 0) {
- table->dev1 = disk;
- table++;
- }
- size += disk->size;
-
- while (size) {
- table->dev0 = disk;
- size -= conf->smallest->size;
- if (size < 0)
- break;
- table->dev1 = NULL;
- table++;
- }
- }
- table->dev1 = NULL;
-
- return 0;
-
-out:
- if (conf)
- kfree(conf);
- MOD_DEC_USE_COUNT;
- return 1;
+ nb_zone=data->nr_zones=
+ md_size[minor]/data->smallest->size +
+ (md_size[minor]%data->smallest->size ? 1 : 0);
+
+ data->hash_table=kmalloc (sizeof (struct linear_hash)*nb_zone, GFP_KERNEL);
+
+ size=mddev->devices[cur].size;
+
+ i=0;
+ while (cur<mddev->nb_dev)
+ {
+ data->hash_table[i].dev0=mddev->devices+cur;
+
+ if (size>=data->smallest->size) /* If we completely fill the slot */
+ {
+ data->hash_table[i++].dev1=NULL;
+ size-=data->smallest->size;
+
+ if (!size)
+ {
+ if (++cur==mddev->nb_dev) continue;
+ size=mddev->devices[cur].size;
+ }
+
+ continue;
+ }
+
+ if (++cur==mddev->nb_dev) /* Last dev, set dev1 as NULL */
+ {
+ data->hash_table[i].dev1=NULL;
+ continue;
+ }
+
+ dev0_size=size; /* Here, we use a 2nd dev to fill the slot */
+ size=mddev->devices[cur].size;
+ data->hash_table[i++].dev1=mddev->devices+cur;
+ size-=(data->smallest->size - dev0_size);
+ }
+
+ return 0;
}
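The restored `linear_run()` above rebuilds the zone hash table with a fairly dense loop: zones are the size of the smallest member device, and each zone is served by one device (`dev0`) plus, when the zone straddles a device boundary, the next one (`dev1`). A standalone model of that layout, assuming illustrative struct and function names (only the arithmetic follows the code above):

```c
/* Model of the linear_run() zone layout. Device indices stand in for
 * the kernel's struct real_dev pointers; -1 means "no second device".
 * Names here are illustrative, not the kernel's. */
struct zone { int dev0, dev1; };

/* Fill 'zones' for members of the given sizes (in blocks) and return
 * the zone count, computed as nr_zones is in linear_run(). */
static int build_zones(const int *size, int nb_dev, struct zone *zones)
{
	int total = 0, smallest = size[0], i, z = 0, start = 0;

	for (i = 0; i < nb_dev; i++) {
		total += size[i];
		if (size[i] < smallest)
			smallest = size[i];
	}
	for (i = 0; i < nb_dev; start += size[i], i++) {
		int end = start + size[i];
		/* every zone whose start falls inside this device */
		for (; z * smallest < end; z++) {
			zones[z].dev0 = i;
			zones[z].dev1 =
				((z + 1) * smallest > end && i + 1 < nb_dev)
					? i + 1 : -1;
		}
	}
	return total / smallest + (total % smallest ? 1 : 0);
}
```

For members of 5 and 4 blocks this yields three 4-block zones: the middle zone starts on the first device and spills onto the second, which is exactly the `dev0`/`dev1` case `linear_map()` later distinguishes.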
-static int linear_stop (mddev_t *mddev)
+static int linear_stop (int minor, struct md_dev *mddev)
{
- linear_conf_t *conf = mddev_to_conf(mddev);
+ struct linear_data *data=(struct linear_data *) mddev->private;
- kfree(conf->hash_table);
- kfree(conf);
+ kfree (data->hash_table);
+ kfree (data);
- MOD_DEC_USE_COUNT;
+ MOD_DEC_USE_COUNT;
- return 0;
+ return 0;
}
-static int linear_map (mddev_t *mddev, kdev_t dev, kdev_t *rdev,
+static int linear_map (struct md_dev *mddev, kdev_t *rdev,
unsigned long *rsector, unsigned long size)
{
- linear_conf_t *conf = mddev_to_conf(mddev);
- struct linear_hash *hash;
- dev_info_t *tmp_dev;
- long block;
+ struct linear_data *data=(struct linear_data *) mddev->private;
+ struct linear_hash *hash;
+ struct real_dev *tmp_dev;
+ long block;
- block = *rsector >> 1;
- hash = conf->hash_table + (block / conf->smallest->size);
+ block=*rsector >> 1;
+ hash=data->hash_table+(block/data->smallest->size);
- if (block >= (hash->dev0->size + hash->dev0->offset))
- {
- if (!hash->dev1)
- {
- printk ("linear_map : hash->dev1==NULL for block %ld\n",
- block);
- return -1;
- }
- tmp_dev = hash->dev1;
- } else
- tmp_dev = hash->dev0;
+ if (block >= (hash->dev0->size + hash->dev0->offset))
+ {
+ if (!hash->dev1)
+ {
+ printk ("linear_map : hash->dev1==NULL for block %ld\n", block);
+ return (-1);
+ }
+
+ tmp_dev=hash->dev1;
+ }
+ else
+ tmp_dev=hash->dev0;
- if (block >= (tmp_dev->size + tmp_dev->offset)
- || block < tmp_dev->offset)
- printk ("Block %ld out of bounds on dev %s size %d offset %d\n",
- block, kdevname(tmp_dev->dev), tmp_dev->size, tmp_dev->offset);
+ if (block >= (tmp_dev->size + tmp_dev->offset) || block < tmp_dev->offset)
+ printk ("Block %ld out of bounds on dev %s size %d offset %d\n",
+ block, kdevname(tmp_dev->dev), tmp_dev->size, tmp_dev->offset);
- *rdev = tmp_dev->dev;
- *rsector = (block - tmp_dev->offset) << 1;
+ *rdev=tmp_dev->dev;
+ *rsector=(block-(tmp_dev->offset)) << 1;
- return 0;
+ return (0);
}
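`linear_map()` above does two pieces of arithmetic worth isolating: the 512-byte sector is halved into a 1K block, the zone's second device is chosen only when the block lies past `dev0`'s end, and the result is re-based on the member's offset. A minimal sketch with illustrative names (sizes and offsets in 1K blocks, `rsector` in 512-byte sectors, as in the code above):

```c
/* Returns 1 when the zone's second device (dev1) must serve 'block',
 * i.e. the block lies past the end of dev0 -- the same test
 * linear_map() makes against hash->dev0. Names are not kernel API. */
static int use_dev1(unsigned long block,
		    unsigned long dev0_offset, unsigned long dev0_size)
{
	return block >= dev0_offset + dev0_size;
}

/* Re-base a whole-array sector onto the chosen member device. */
static unsigned long remap_sector(unsigned long rsector,
				  unsigned long dev_offset)
{
	unsigned long block = rsector >> 1;	/* 2 sectors per 1K block */

	return (block - dev_offset) << 1;
}
```

Note that, like the original, the sketch drops the low sector bit on remap; requests at this layer are block-aligned, so nothing is lost.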
-static int linear_status (char *page, mddev_t *mddev)
+static int linear_status (char *page, int minor, struct md_dev *mddev)
{
- int sz=0;
+ int sz=0;
#undef MD_DEBUG
#ifdef MD_DEBUG
- int j;
- linear_conf_t *conf = mddev_to_conf(mddev);
+ int j;
+ struct linear_data *data=(struct linear_data *) mddev->private;
- sz += sprintf(page+sz, " ");
- for (j = 0; j < conf->nr_zones; j++)
- {
- sz += sprintf(page+sz, "[%s",
- partition_name(conf->hash_table[j].dev0->dev));
-
- if (conf->hash_table[j].dev1)
- sz += sprintf(page+sz, "/%s] ",
- partition_name(conf->hash_table[j].dev1->dev));
- else
- sz += sprintf(page+sz, "] ");
- }
- sz += sprintf(page+sz, "\n");
+ sz+=sprintf (page+sz, " ");
+ for (j=0; j<data->nr_zones; j++)
+ {
+ sz+=sprintf (page+sz, "[%s",
+ partition_name (data->hash_table[j].dev0->dev));
+
+ if (data->hash_table[j].dev1)
+ sz+=sprintf (page+sz, "/%s] ",
+ partition_name(data->hash_table[j].dev1->dev));
+ else
+ sz+=sprintf (page+sz, "] ");
+ }
+
+ sz+=sprintf (page+sz, "\n");
#endif
- sz += sprintf(page+sz, " %dk rounding", mddev->param.chunk_size/1024);
- return sz;
+ sz+=sprintf (page+sz, " %dk rounding", 1<<FACTOR_SHIFT(FACTOR(mddev)));
+ return sz;
}
-static mdk_personality_t linear_personality=
+static struct md_personality linear_personality=
{
- "linear",
- linear_map,
- NULL,
- NULL,
- linear_run,
- linear_stop,
- linear_status,
- NULL,
- 0,
- NULL,
- NULL,
- NULL,
- NULL
+ "linear",
+ linear_map,
+ NULL,
+ NULL,
+ linear_run,
+ linear_stop,
+ linear_status,
+ NULL, /* no ioctls */
+ 0
};
+
#ifndef MODULE
-md__initfunc(void linear_init (void))
+__initfunc(void linear_init (void))
{
- register_md_personality (LINEAR, &linear_personality);
+ register_md_personality (LINEAR, &linear_personality);
}
#else
int init_module (void)
{
- return (register_md_personality (LINEAR, &linear_personality));
+ return (register_md_personality (LINEAR, &linear_personality));
}
void cleanup_module (void)
{
- unregister_md_personality (LINEAR);
+ unregister_md_personality (LINEAR);
}
#endif
-
--- /dev/null
+#ifndef _LINEAR_H
+#define _LINEAR_H
+
+struct linear_hash
+{
+ struct real_dev *dev0, *dev1;
+};
+
+struct linear_data
+{
+ struct linear_hash *hash_table; /* Dynamically allocated */
+ struct real_dev *smallest;
+ int nr_zones;
+};
+
+#endif
#include <asm/system.h>
#include <asm/io.h>
#include <linux/blk.h>
-#include <linux/raid/md.h>
#include <linux/module.h>
*/
spinlock_t io_request_lock = SPIN_LOCK_UNLOCKED;
-/*
- * per-major idle-IO detection
- */
-unsigned long io_events[MAX_BLKDEV] = {0, };
-
/*
* used to wait on when there are no free requests
*/
/* Maybe the above fixes it, and maybe it doesn't boot. Life is interesting */
lock_buffer(bh);
- if (!buffer_lowprio(bh))
- io_events[major]++;
if (blk_size[major]) {
unsigned long maxsector = (blk_size[major][MINOR(bh->b_rdev)] << 1) + 1;
* entry may be busy being processed and we thus can't change it.
*/
if (req == blk_dev[major].current_request)
- req = req->next;
+ req = req->next;
if (!req)
break;
/* fall through */
bh[i]->b_rsector=bh[i]->b_blocknr*(bh[i]->b_size >> 9);
#ifdef CONFIG_BLK_DEV_MD
if (major==MD_MAJOR &&
- /* changed v to allow LVM to remap */
- md_map (bh[i]->b_rdev, &bh[i]->b_rdev,
- &bh[i]->b_rsector, bh[i]->b_size >> 9)) {
- printk (KERN_ERR
+ md_map (MINOR(bh[i]->b_dev), &bh[i]->b_rdev,
+ &bh[i]->b_rsector, bh[i]->b_size >> 9)) {
+ printk (KERN_ERR
"Bad md_map in ll_rw_block\n");
- goto sorry;
+ goto sorry;
}
#endif
}
if (bh[i]) {
set_bit(BH_Req, &bh[i]->b_state);
#ifdef CONFIG_BLK_DEV_MD
- /* changed v to allow LVM to remap */
- if (MAJOR(bh[i]->b_rdev) == MD_MAJOR) {
- /* changed for LVM to remap v */
- md_make_request(bh[i], rw);
+ if (MAJOR(bh[i]->b_dev) == MD_MAJOR) {
+ md_make_request(MINOR (bh[i]->b_dev), rw, bh[i]);
continue;
}
#endif
for (dev = blk_dev + MAX_BLKDEV; dev-- != blk_dev;) {
dev->request_fn = NULL;
- dev->queue = NULL;
+ dev->queue = NULL;
dev->current_request = NULL;
dev->plug.rq_status = RQ_INACTIVE;
- dev->plug.cmd = -1;
+ dev->plug.cmd = -1;
dev->plug.next = NULL;
dev->plug_tq.sync = 0;
dev->plug_tq.routine = &unplug_device;
sbpcd_init();
#endif CONFIG_SBPCD
#ifdef CONFIG_AZTCD
- aztcd_init();
+ aztcd_init();
#endif CONFIG_AZTCD
#ifdef CONFIG_CDU535
sony535_init();
+
/*
md.c : Multiple Devices driver for Linux
- Copyright (C) 1998 Ingo Molnar
+ Copyright (C) 1994-96 Marc ZYNGIER
+ <zyngier@ufr-info-p7.ibp.fr> or
+ <maz@gloups.fdn.fr>
- completely rewritten, based on the MD driver code from Marc Zyngier
+ A lot of inspiration came from hd.c ...
- Changes:
+ kerneld support by Boris Tobotras <boris@xtalk.msk.su>
+ boot support for linear and striped mode by Harald Hoyer <HarryH@Royal.Net>
- - RAID-1/RAID-5 extensions by Miguel de Icaza, Gadi Oxman, Ingo Molnar
- - boot support for linear and striped mode by Harald Hoyer <HarryH@Royal.Net>
- - kerneld support by Boris Tobotras <boris@xtalk.msk.su>
- - kmod support by: Cyrus Durgin
- - RAID0 bugfixes: Mark Anthony Lisher <markal@iname.com>
+ RAID-1/RAID-5 extensions by:
+ Ingo Molnar, Miguel de Icaza, Gadi Oxman
+ Changes for kmod by:
+ Cyrus Durgin
+
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2, or (at your option)
Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
*/
-#include <linux/config.h>
-#include <linux/raid/md.h>
-#include <linux/raid/xor.h>
+/*
+ * Current RAID-1,4,5 parallel reconstruction speed limit is 1024 KB/sec, so
+ * the extra system load does not show up that much. Increase it if your
+ * system can take more.
+ */
+#define SPEED_LIMIT 1024
+#include <linux/config.h>
+#include <linux/module.h>
+#include <linux/version.h>
+#include <linux/malloc.h>
+#include <linux/mm.h>
+#include <linux/md.h>
+#include <linux/hdreg.h>
+#include <linux/stat.h>
+#include <linux/fs.h>
+#include <linux/proc_fs.h>
+#include <linux/blkdev.h>
+#include <linux/genhd.h>
+#include <linux/smp_lock.h>
#ifdef CONFIG_KMOD
#include <linux/kmod.h>
#endif
+#include <linux/errno.h>
+#include <linux/init.h>
#define __KERNEL_SYSCALLS__
#include <linux/unistd.h>
-#include <asm/unaligned.h>
-
-extern asmlinkage int sys_sched_yield(void);
-extern asmlinkage int sys_setsid(void);
-
-extern unsigned long io_events[MAX_BLKDEV];
-
#define MAJOR_NR MD_MAJOR
#define MD_DRIVER
#include <linux/blk.h>
+#include <asm/uaccess.h>
+#include <asm/bitops.h>
+#include <asm/atomic.h>
#ifdef CONFIG_MD_BOOT
-extern kdev_t name_to_kdev_t(char *line) md__init;
+extern kdev_t name_to_kdev_t(char *line) __init;
#endif
-static mdk_personality_t *pers[MAX_PERSONALITY] = {NULL, };
-
-/*
- * these have to be allocated separately because external
- * subsystems want to have a pre-defined structure
- */
-struct hd_struct md_hd_struct[MAX_MD_DEVS];
-static int md_blocksizes[MAX_MD_DEVS];
-static int md_maxreadahead[MAX_MD_DEVS];
-static mdk_thread_t *md_recovery_thread = NULL;
+static struct hd_struct md_hd_struct[MAX_MD_DEV];
+static int md_blocksizes[MAX_MD_DEV];
+int md_maxreadahead[MAX_MD_DEV];
+#if SUPPORT_RECONSTRUCTION
+static struct md_thread *md_sync_thread = NULL;
+#endif /* SUPPORT_RECONSTRUCTION */
-int md_size[MAX_MD_DEVS] = {0, };
+int md_size[MAX_MD_DEV]={0, };
static void md_geninit (struct gendisk *);
static struct gendisk md_gendisk=
{
- MD_MAJOR,
- "md",
- 0,
- 1,
- MAX_MD_DEVS,
- md_geninit,
- md_hd_struct,
- md_size,
- MAX_MD_DEVS,
- NULL,
- NULL
+ MD_MAJOR,
+ "md",
+ 0,
+ 1,
+ MAX_MD_DEV,
+ md_geninit,
+ md_hd_struct,
+ md_size,
+ MAX_MD_DEV,
+ NULL,
+ NULL
};
-/*
- * Current RAID-1,4,5 parallel reconstruction 'guaranteed speed limit'
- * is 100 KB/sec, so the extra system load does not show up that much.
- * Increase it if you want to have more _guaranteed_ speed. Note that
- * the RAID driver will use the maximum available bandwidth if the IO
- * subsystem is idle.
- *
- * you can change it via /proc/sys/dev/speed-limit
- */
-
-static int sysctl_speed_limit = 100;
+static struct md_personality *pers[MAX_PERSONALITY]={NULL, };
+struct md_dev md_dev[MAX_MD_DEV];
-static struct ctl_table_header *md_table_header;
+int md_thread(void * arg);
-static ctl_table md_table[] = {
- {DEV_MD_SPEED_LIMIT, "speed-limit",
- &sysctl_speed_limit, sizeof(int), 0644, NULL, &proc_dointvec},
- {0}
-};
-
-static ctl_table md_dir_table[] = {
- {DEV_MD, "md", NULL, 0, 0555, md_table},
- {0}
-};
+static struct gendisk *find_gendisk (kdev_t dev)
+{
+ struct gendisk *tmp=gendisk_head;
-static ctl_table md_root_table[] = {
- {CTL_DEV, "dev", NULL, 0, 0555, md_dir_table},
- {0}
-};
+ while (tmp != NULL)
+ {
+ if (tmp->major==MAJOR(dev))
+ return (tmp);
+
+ tmp=tmp->next;
+ }
-static void md_register_sysctl(void)
-{
- md_table_header = register_sysctl_table(md_root_table, 1);
+ return (NULL);
}
-void md_unregister_sysctl(void)
+char *partition_name (kdev_t dev)
{
- unregister_sysctl_table(md_table_header);
-}
+ static char name[40]; /* This should be long
+ enough for a device name ! */
+ struct gendisk *hd = find_gendisk (dev);
-/*
- * The mapping between kdev and mddev is not necessary a simple
- * one! Eg. HSM uses several sub-devices to implement Logical
- * Volumes. All these sub-devices map to the same mddev.
- */
-dev_mapping_t mddev_map [MAX_MD_DEVS] = { {NULL, 0}, };
+ if (!hd)
+ {
+ sprintf (name, "[dev %s]", kdevname(dev));
+ return (name);
+ }
+
+ return disk_name (hd, MINOR(dev), name); /* routine in genhd.c */
+}
-void add_mddev_mapping (mddev_t * mddev, kdev_t dev, void *data)
+static int legacy_raid_sb (int minor, int pnum)
{
- unsigned int minor = MINOR(dev);
+ int i, factor;
- if (MAJOR(dev) != MD_MAJOR) {
- MD_BUG();
- return;
- }
- if (mddev_map[minor].mddev != NULL) {
- MD_BUG();
- return;
- }
- mddev_map[minor].mddev = mddev;
- mddev_map[minor].data = data;
+ factor = 1 << FACTOR_SHIFT(FACTOR((md_dev+minor)));
+
+ /*****
+ * do size and offset calculations.
+ */
+ for (i=0; i<md_dev[minor].nb_dev; i++) {
+ md_dev[minor].devices[i].size &= ~(factor - 1);
+ md_size[minor] += md_dev[minor].devices[i].size;
+ md_dev[minor].devices[i].offset=i ? (md_dev[minor].devices[i-1].offset +
+ md_dev[minor].devices[i-1].size) : 0;
+ }
+ if (pnum == RAID0 >> PERSONALITY_SHIFT)
+ md_maxreadahead[minor] = MD_DEFAULT_DISK_READAHEAD * md_dev[minor].nb_dev;
+ return 0;
}
-void del_mddev_mapping (mddev_t * mddev, kdev_t dev)
+static void free_sb (struct md_dev *mddev)
{
- unsigned int minor = MINOR(dev);
+ int i;
+ struct real_dev *realdev;
- if (MAJOR(dev) != MD_MAJOR) {
- MD_BUG();
- return;
+ if (mddev->sb) {
+ free_page((unsigned long) mddev->sb);
+ mddev->sb = NULL;
}
- if (mddev_map[minor].mddev != mddev) {
- MD_BUG();
- return;
+ for (i = 0; i <mddev->nb_dev; i++) {
+ realdev = mddev->devices + i;
+ if (realdev->sb) {
+ free_page((unsigned long) realdev->sb);
+ realdev->sb = NULL;
+ }
}
- mddev_map[minor].mddev = NULL;
- mddev_map[minor].data = NULL;
}
/*
- * Enables to iterate over all existing md arrays
+ * Check one RAID superblock for generic plausibility
*/
-static MD_LIST_HEAD(all_mddevs);
-static mddev_t * alloc_mddev (kdev_t dev)
-{
- mddev_t * mddev;
+#define BAD_MAGIC KERN_ERR \
+"md: %s: invalid raid superblock magic (%x) on block %u\n"
- if (MAJOR(dev) != MD_MAJOR) {
- MD_BUG();
- return 0;
- }
- mddev = (mddev_t *) kmalloc(sizeof(*mddev), GFP_KERNEL);
- if (!mddev)
- return NULL;
-
- memset(mddev, 0, sizeof(*mddev));
-
- mddev->__minor = MINOR(dev);
- mddev->reconfig_sem = MUTEX;
- mddev->recovery_sem = MUTEX;
- mddev->resync_sem = MUTEX;
- MD_INIT_LIST_HEAD(&mddev->disks);
- /*
- * The 'base' mddev is the one with data NULL.
- * personalities can create additional mddevs
- * if necessary.
- */
- add_mddev_mapping(mddev, dev, 0);
- md_list_add(&mddev->all_mddevs, &all_mddevs);
+#define OUT_OF_MEM KERN_ALERT \
+"md: out of memory.\n"
- return mddev;
-}
+#define NO_DEVICE KERN_ERR \
+"md: disabled device %s\n"
+
+#define SUCCESS 0
+#define FAILURE -1
-static void free_mddev (mddev_t *mddev)
+static int analyze_one_sb (struct real_dev * rdev)
{
- if (!mddev) {
- MD_BUG();
- return;
- }
+ int ret = FAILURE;
+ struct buffer_head *bh;
+ kdev_t dev = rdev->dev;
+ md_superblock_t *sb;
/*
- * Make sure nobody else is using this mddev
- * (careful, we rely on the global kernel lock here)
+ * Read the superblock, it's at the end of the disk
*/
- while (md_atomic_read(&mddev->resync_sem.count) != 1)
- schedule();
- while (md_atomic_read(&mddev->recovery_sem.count) != 1)
- schedule();
-
- del_mddev_mapping(mddev, MKDEV(MD_MAJOR, mdidx(mddev)));
- md_list_del(&mddev->all_mddevs);
- MD_INIT_LIST_HEAD(&mddev->all_mddevs);
- kfree(mddev);
-}
-
-
-struct gendisk * find_gendisk (kdev_t dev)
-{
- struct gendisk *tmp = gendisk_head;
-
- while (tmp != NULL) {
- if (tmp->major == MAJOR(dev))
- return (tmp);
- tmp = tmp->next;
- }
- return (NULL);
-}
-
-mdk_rdev_t * find_rdev_nr(mddev_t *mddev, int nr)
-{
- mdk_rdev_t * rdev;
- struct md_list_head *tmp;
-
- ITERATE_RDEV(mddev,rdev,tmp) {
- if (rdev->desc_nr == nr)
- return rdev;
- }
- return NULL;
-}
+ rdev->sb_offset = MD_NEW_SIZE_BLOCKS (blk_size[MAJOR(dev)][MINOR(dev)]);
+ set_blocksize (dev, MD_SB_BYTES);
+ bh = bread (dev, rdev->sb_offset / MD_SB_BLOCKS, MD_SB_BYTES);
-mdk_rdev_t * find_rdev(mddev_t * mddev, kdev_t dev)
-{
- struct md_list_head *tmp;
- mdk_rdev_t *rdev;
+ if (bh) {
+ sb = (md_superblock_t *) bh->b_data;
+ if (sb->md_magic != MD_SB_MAGIC) {
+ printk (BAD_MAGIC, kdevname(dev),
+ sb->md_magic, rdev->sb_offset);
+ goto abort;
+ }
+ rdev->sb = (md_superblock_t *) __get_free_page(GFP_KERNEL);
+ if (!rdev->sb) {
+ printk (OUT_OF_MEM);
+ goto abort;
+ }
+ memcpy (rdev->sb, bh->b_data, MD_SB_BYTES);
- ITERATE_RDEV(mddev,rdev,tmp) {
- if (rdev->dev == dev)
- return rdev;
- }
- return NULL;
+ rdev->size = sb->size;
+ } else
+ printk (NO_DEVICE,kdevname(rdev->dev));
+ ret = SUCCESS;
+abort:
+ if (bh)
+ brelse (bh);
+ return ret;
}
-static MD_LIST_HEAD(device_names);
+#undef SUCCESS
+#undef FAILURE
-char * partition_name (kdev_t dev)
-{
- struct gendisk *hd;
- static char nomem [] = "<nomem>";
- dev_name_t *dname;
- struct md_list_head *tmp = device_names.next;
-
- while (tmp != &device_names) {
- dname = md_list_entry(tmp, dev_name_t, list);
- if (dname->dev == dev)
- return dname->name;
- tmp = tmp->next;
- }
+#undef BAD_MAGIC
+#undef OUT_OF_MEM
+#undef NO_DEVICE
- dname = (dev_name_t *) kmalloc(sizeof(*dname), GFP_KERNEL);
+/*
+ * Check a full RAID array for plausibility
+ */
- if (!dname)
- return nomem;
- /*
- * ok, add this new device name to the list
- */
- hd = find_gendisk (dev);
+#define INCONSISTENT KERN_ERR \
+"md: superblock inconsistency -- run ckraid\n"
- if (!hd)
- sprintf (dname->name, "[dev %s]", kdevname(dev));
- else
- disk_name (hd, MINOR(dev), dname->name);
+#define OUT_OF_DATE KERN_ERR \
+"md: superblock update time inconsistency -- using the most recent one\n"
- dname->dev = dev;
- md_list_add(&dname->list, &device_names);
+#define OLD_VERSION KERN_ALERT \
+"md: %s: unsupported raid array version %d.%d.%d\n"
- return dname->name;
-}
+#define NOT_CLEAN KERN_ERR \
+"md: %s: raid array is not clean -- run ckraid\n"
-static unsigned int calc_dev_sboffset (kdev_t dev, mddev_t *mddev,
- int persistent)
-{
- unsigned int size = 0;
+#define NOT_CLEAN_IGNORE KERN_ERR \
+"md: %s: raid array is not clean -- reconstructing parity\n"
- if (blk_size[MAJOR(dev)])
- size = blk_size[MAJOR(dev)][MINOR(dev)];
- if (persistent)
- size = MD_NEW_SIZE_BLOCKS(size);
- return size;
-}
+#define UNKNOWN_LEVEL KERN_ERR \
+"md: %s: unsupported raid level %d\n"
-static unsigned int calc_dev_size (kdev_t dev, mddev_t *mddev, int persistent)
+static int analyze_sbs (int minor, int pnum)
{
- unsigned int size;
+ struct md_dev *mddev = md_dev + minor;
+ int i, N = mddev->nb_dev, out_of_date = 0;
+ struct real_dev * disks = mddev->devices;
+ md_superblock_t *sb, *freshest = NULL;
- size = calc_dev_sboffset(dev, mddev, persistent);
- if (!mddev->sb) {
- MD_BUG();
- return size;
- }
- if (mddev->sb->chunk_size)
- size &= ~(mddev->sb->chunk_size/1024 - 1);
- return size;
-}
+ /*
+ * RAID-0 and linear don't use a RAID superblock
+ */
+ if (pnum == RAID0 >> PERSONALITY_SHIFT ||
+ pnum == LINEAR >> PERSONALITY_SHIFT)
+ return legacy_raid_sb (minor, pnum);
-/*
- * We check wether all devices are numbered from 0 to nb_dev-1. The
- * order is guaranteed even after device name changes.
- *
- * Some personalities (raid0, linear) use this. Personalities that
- * provide data have to be able to deal with loss of individual
- * disks, so they do their checking themselves.
- */
-int md_check_ordering (mddev_t *mddev)
-{
- int i, c;
- mdk_rdev_t *rdev;
- struct md_list_head *tmp;
+ /*
+ * Verify the RAID superblock on each real device
+ */
+ for (i = 0; i < N; i++)
+ if (analyze_one_sb(disks+i))
+ goto abort;
/*
- * First, all devices must be fully functional
+ * The superblock constant part has to be the same
+ * for all disks in the array.
*/
- ITERATE_RDEV(mddev,rdev,tmp) {
- if (rdev->faulty) {
- printk("md: md%d's device %s faulty, aborting.\n",
- mdidx(mddev), partition_name(rdev->dev));
+ sb = NULL;
+ for (i = 0; i < N; i++) {
+ if (!disks[i].sb)
+ continue;
+ if (!sb) {
+ sb = disks[i].sb;
+ continue;
+ }
+ if (memcmp(sb,
+ disks[i].sb, MD_SB_GENERIC_CONSTANT_WORDS * 4)) {
+ printk (INCONSISTENT);
goto abort;
}
}
- c = 0;
- ITERATE_RDEV(mddev,rdev,tmp) {
- c++;
- }
- if (c != mddev->nb_dev) {
- MD_BUG();
+ /*
+ * OK, we have all disks and the array is ready to run. Let's
+ * find the freshest superblock, that one will be the superblock
+ * that represents the whole array.
+ */
+ if ((sb = mddev->sb = (md_superblock_t *) __get_free_page (GFP_KERNEL)) == NULL)
goto abort;
+ freshest = NULL;
+ for (i = 0; i < N; i++) {
+ if (!disks[i].sb)
+ continue;
+ if (!freshest) {
+ freshest = disks[i].sb;
+ continue;
+ }
+ /*
+ * Find the newest superblock version
+ */
+ if (disks[i].sb->utime != freshest->utime) {
+ out_of_date = 1;
+ if (disks[i].sb->utime > freshest->utime)
+ freshest = disks[i].sb;
+ }
}
- if (mddev->nb_dev != mddev->sb->raid_disks) {
- printk("md: md%d, array needs %d disks, has %d, aborting.\n",
- mdidx(mddev), mddev->sb->raid_disks, mddev->nb_dev);
+ if (out_of_date)
+ printk(OUT_OF_DATE);
+ memcpy (sb, freshest, sizeof(*freshest));
+
+ /*
+ * Check if we can support this RAID array
+ */
+ if (sb->major_version != MD_MAJOR_VERSION ||
+ sb->minor_version > MD_MINOR_VERSION) {
+
+ printk (OLD_VERSION, kdevname(MKDEV(MD_MAJOR, minor)),
+ sb->major_version, sb->minor_version,
+ sb->patch_version);
goto abort;
}
+
/*
- * Now the numbering check
+ * We need to add this as a superblock option.
*/
- for (i = 0; i < mddev->nb_dev; i++) {
- c = 0;
- ITERATE_RDEV(mddev,rdev,tmp) {
- if (rdev->desc_nr == i)
- c++;
- }
- if (c == 0) {
- printk("md: md%d, missing disk #%d, aborting.\n",
- mdidx(mddev), i);
+#if SUPPORT_RECONSTRUCTION
+ if (sb->state != (1 << MD_SB_CLEAN)) {
+ if (sb->level == 1) {
+ printk (NOT_CLEAN, kdevname(MKDEV(MD_MAJOR, minor)));
goto abort;
- }
- if (c > 1) {
- printk("md: md%d, too many disks #%d, aborting.\n",
- mdidx(mddev), i);
+ } else
+ printk (NOT_CLEAN_IGNORE, kdevname(MKDEV(MD_MAJOR, minor)));
+ }
+#else
+ if (sb->state != (1 << MD_SB_CLEAN)) {
+ printk (NOT_CLEAN, kdevname(MKDEV(MD_MAJOR, minor)));
+ goto abort;
+ }
+#endif /* SUPPORT_RECONSTRUCTION */
+
+ switch (sb->level) {
+ case 1:
+ md_size[minor] = sb->size;
+ md_maxreadahead[minor] = MD_DEFAULT_DISK_READAHEAD;
+ break;
+ case 4:
+ case 5:
+ md_size[minor] = sb->size * (sb->raid_disks - 1);
+ md_maxreadahead[minor] = MD_DEFAULT_DISK_READAHEAD * (sb->raid_disks - 1);
+ break;
+ default:
+ printk (UNKNOWN_LEVEL, kdevname(MKDEV(MD_MAJOR, minor)),
+ sb->level);
goto abort;
- }
}
return 0;
abort:
+ free_sb(mddev);
return 1;
}
-static unsigned int zoned_raid_size (mddev_t *mddev)
+#undef INCONSISTENT
+#undef OUT_OF_DATE
+#undef OLD_VERSION
+#undef NOT_CLEAN
+#undef NOT_CLEAN_IGNORE
+#undef UNKNOWN_LEVEL
+
+int md_update_sb(int minor)
{
- unsigned int mask;
- mdk_rdev_t * rdev;
- struct md_list_head *tmp;
+ struct md_dev *mddev = md_dev + minor;
+ struct buffer_head *bh;
+ md_superblock_t *sb = mddev->sb;
+ struct real_dev *realdev;
+ kdev_t dev;
+ int i;
+ u32 sb_offset;
- if (!mddev->sb) {
- MD_BUG();
- return -EINVAL;
- }
- /*
- * do size and offset calculations.
- */
- mask = ~(mddev->sb->chunk_size/1024 - 1);
-printk("mask %08x\n", mask);
-
- ITERATE_RDEV(mddev,rdev,tmp) {
-printk(" rdev->size: %d\n", rdev->size);
- rdev->size &= mask;
-printk(" masked rdev->size: %d\n", rdev->size);
- md_size[mdidx(mddev)] += rdev->size;
-printk(" new md_size: %d\n", md_size[mdidx(mddev)]);
+ sb->utime = CURRENT_TIME;
+ for (i = 0; i < mddev->nb_dev; i++) {
+ realdev = mddev->devices + i;
+ if (!realdev->sb)
+ continue;
+ dev = realdev->dev;
+ sb_offset = realdev->sb_offset;
+ set_blocksize(dev, MD_SB_BYTES);
+ printk("md: updating raid superblock on device %s, sb_offset == %u\n", kdevname(dev), sb_offset);
+ bh = getblk(dev, sb_offset / MD_SB_BLOCKS, MD_SB_BYTES);
+ if (bh) {
+ sb = (md_superblock_t *) bh->b_data;
+ memcpy(sb, mddev->sb, MD_SB_BYTES);
+ memcpy(&sb->descriptor, sb->disks + realdev->sb->descriptor.number, MD_SB_DESCRIPTOR_WORDS * 4);
+ mark_buffer_uptodate(bh, 1);
+ mark_buffer_dirty(bh, 1);
+ ll_rw_block(WRITE, 1, &bh);
+ wait_on_buffer(bh);
+ bforget(bh);
+ fsync_dev(dev);
+ invalidate_buffers(dev);
+ } else
+ printk(KERN_ERR "md: getblk failed for device %s\n", kdevname(dev));
}
return 0;
}
-static void remove_descriptor (mdp_disk_t *disk, mdp_super_t *sb)
+static int do_md_run (int minor, int repart)
{
- if (disk_active(disk)) {
- sb->working_disks--;
- } else {
- if (disk_spare(disk)) {
- sb->spare_disks--;
- sb->working_disks--;
- } else {
- sb->failed_disks--;
- }
- }
- sb->nr_disks--;
- disk->major = 0;
- disk->minor = 0;
- mark_disk_removed(disk);
-}
-
-#define BAD_MAGIC KERN_ERR \
-"md: invalid raid superblock magic on %s\n"
+ int pnum, i, min, factor, err;
-#define BAD_MINOR KERN_ERR \
-"md: %s: invalid raid minor (%x)\n"
+ if (!md_dev[minor].nb_dev)
+ return -EINVAL;
+
+ if (md_dev[minor].pers)
+ return -EBUSY;
-#define OUT_OF_MEM KERN_ALERT \
-"md: out of memory.\n"
+ md_dev[minor].repartition=repart;
+
+ if ((pnum=PERSONALITY(&md_dev[minor]) >> (PERSONALITY_SHIFT))
+ >= MAX_PERSONALITY)
+ return -EINVAL;
+
+ /* Only RAID-1 and RAID-5 can have MD devices as underlying devices */
+ if (pnum != (RAID1 >> PERSONALITY_SHIFT) && pnum != (RAID5 >> PERSONALITY_SHIFT)){
+ for (i = 0; i < md_dev [minor].nb_dev; i++)
+ if (MAJOR (md_dev [minor].devices [i].dev) == MD_MAJOR)
+ return -EINVAL;
+ }
+ if (!pers[pnum])
+ {
+#ifdef CONFIG_KMOD
+ char module_name[80];
+ sprintf (module_name, "md-personality-%d", pnum);
+ request_module (module_name);
+ if (!pers[pnum])
+#endif
+ return -EINVAL;
+ }
+
+ factor = min = 1 << FACTOR_SHIFT(FACTOR((md_dev+minor)));
+
+ for (i=0; i<md_dev[minor].nb_dev; i++)
+ if (md_dev[minor].devices[i].size<min)
+ {
+ printk ("Dev %s smaller than %dk, cannot shrink\n",
+ partition_name (md_dev[minor].devices[i].dev), min);
+ return -EINVAL;
+ }
+
+ for (i=0; i<md_dev[minor].nb_dev; i++) {
+ fsync_dev(md_dev[minor].devices[i].dev);
+ invalidate_buffers(md_dev[minor].devices[i].dev);
+ }
+
+ /* Resize devices according to the factor. It is used to align
+   partition sizes to a given chunk size. */
+ md_size[minor]=0;
-#define NO_SB KERN_ERR \
-"md: disabled device %s, could not read superblock.\n"
+ /*
+ * Analyze the raid superblock
+ */
+ if (analyze_sbs(minor, pnum))
+ return -EINVAL;
-#define BAD_CSUM KERN_WARNING \
-"md: invalid superblock checksum on %s\n"
+ md_dev[minor].pers=pers[pnum];
+
+ if ((err=md_dev[minor].pers->run (minor, md_dev+minor)))
+ {
+ md_dev[minor].pers=NULL;
+ free_sb(md_dev + minor);
+ return (err);
+ }
+
+ if (pnum != RAID0 >> PERSONALITY_SHIFT && pnum != LINEAR >> PERSONALITY_SHIFT)
+ {
+ md_dev[minor].sb->state &= ~(1 << MD_SB_CLEAN);
+ md_update_sb(minor);
+ }
+
+ /* FIXME : We assume here we have blocks
+ that are twice as large as sectors.
+ THIS MAY NOT BE TRUE !!! */
+ md_hd_struct[minor].start_sect=0;
+ md_hd_struct[minor].nr_sects=md_size[minor]<<1;
+
+ read_ahead[MD_MAJOR] = 128;
+ return (0);
+}
-static int alloc_array_sb (mddev_t * mddev)
+static int do_md_stop (int minor, struct inode *inode)
{
- if (mddev->sb) {
- MD_BUG();
- return 0;
+ int i;
+
+ if (inode->i_count>1 || md_dev[minor].busy>1) {
+ /*
+ * ioctl : one open channel
+ */
+		printk ("STOP_MD md%x failed: i_count=%d, busy=%d\n",
+ minor, inode->i_count, md_dev[minor].busy);
+ return -EBUSY;
}
-
- mddev->sb = (mdp_super_t *) __get_free_page (GFP_KERNEL);
- if (!mddev->sb)
- return -ENOMEM;
- md_clear_page((unsigned long)mddev->sb);
- return 0;
+
+ if (md_dev[minor].pers) {
+ /*
+ * It is safe to call stop here, it only frees private
+ * data. Also, it tells us if a device is unstoppable
+ * (eg. resyncing is in progress)
+ */
+ if (md_dev[minor].pers->stop (minor, md_dev+minor))
+ return -EBUSY;
+ /*
+ * The device won't exist anymore -> flush it now
+ */
+ fsync_dev (inode->i_rdev);
+ invalidate_buffers (inode->i_rdev);
+ if (md_dev[minor].sb) {
+ md_dev[minor].sb->state |= 1 << MD_SB_CLEAN;
+ md_update_sb(minor);
+ }
+ }
+
+ /* Remove locks. */
+ if (md_dev[minor].sb)
+ free_sb(md_dev + minor);
+ for (i=0; i<md_dev[minor].nb_dev; i++)
+ clear_inode (md_dev[minor].devices[i].inode);
+
+ md_dev[minor].nb_dev=md_size[minor]=0;
+ md_hd_struct[minor].nr_sects=0;
+ md_dev[minor].pers=NULL;
+
+ read_ahead[MD_MAJOR] = 128;
+
+ return (0);
}
-static int alloc_disk_sb (mdk_rdev_t * rdev)
+static int do_md_add (int minor, kdev_t dev)
{
- if (rdev->sb)
- MD_BUG();
+ int i;
+ int hot_add=0;
+ struct real_dev *realdev;
- rdev->sb = (mdp_super_t *) __get_free_page(GFP_KERNEL);
- if (!rdev->sb) {
- printk (OUT_OF_MEM);
+ if (md_dev[minor].nb_dev==MAX_REAL)
return -EINVAL;
- }
- md_clear_page((unsigned long)rdev->sb);
- return 0;
-}
+ if (!fs_may_mount (dev))
+ return -EBUSY;
-static void free_disk_sb (mdk_rdev_t * rdev)
-{
- if (rdev->sb) {
- free_page((unsigned long) rdev->sb);
- rdev->sb = NULL;
- rdev->sb_offset = 0;
- rdev->size = 0;
- } else {
- if (!rdev->faulty)
- MD_BUG();
+ if (blk_size[MAJOR(dev)] == NULL || blk_size[MAJOR(dev)][MINOR(dev)] == 0) {
+ printk("md_add(): zero device size, huh, bailing out.\n");
+ return -EINVAL;
}
-}
-
-static void mark_rdev_faulty (mdk_rdev_t * rdev)
-{
- unsigned long flags;
- if (!rdev) {
- MD_BUG();
- return;
+ if (md_dev[minor].pers) {
+ /*
+ * The array is already running, hot-add the drive, or
+ * bail out:
+ */
+ if (!md_dev[minor].pers->hot_add_disk)
+ return -EBUSY;
+ else
+ hot_add=1;
}
- save_flags(flags);
- cli();
- free_disk_sb(rdev);
- rdev->faulty = 1;
- restore_flags(flags);
-}
-
-static int read_disk_sb (mdk_rdev_t * rdev)
-{
- int ret = -EINVAL;
- struct buffer_head *bh = NULL;
- kdev_t dev = rdev->dev;
- mdp_super_t *sb;
- u32 sb_offset;
- if (!rdev->sb) {
- MD_BUG();
- goto abort;
- }
-
/*
- * Calculate the position of the superblock,
- * it's at the end of the disk
+ * Careful. We cannot increase nb_dev for a running array.
*/
- sb_offset = calc_dev_sboffset(rdev->dev, rdev->mddev, 1);
- rdev->sb_offset = sb_offset;
- printk("(read) %s's sb offset: %d", partition_name(dev),
- sb_offset);
- fsync_dev(dev);
- set_blocksize (dev, MD_SB_BYTES);
- bh = bread (dev, sb_offset / MD_SB_BLOCKS, MD_SB_BYTES);
+ i=md_dev[minor].nb_dev;
+ realdev = &md_dev[minor].devices[i];
+ realdev->dev=dev;
+
+ /* Lock the device by inserting a dummy inode. This doesn't
+ smell very good, but I need to be consistent with the
+	   mount stuff, especially with fs_may_mount. If someone has
+	   a better idea, please help! */
+
+ realdev->inode=get_empty_inode ();
+ realdev->inode->i_dev=dev; /* don't care about other fields */
+ insert_inode_hash (realdev->inode);
+
+ /* Sizes are now rounded at run time */
+
+/* md_dev[minor].devices[i].size=gen_real->sizes[MINOR(dev)]; HACKHACK*/
- if (bh) {
- sb = (mdp_super_t *) bh->b_data;
- memcpy (rdev->sb, sb, MD_SB_BYTES);
- } else {
- printk (NO_SB,partition_name(rdev->dev));
- goto abort;
- }
- printk(" [events: %08lx]\n", (unsigned long)get_unaligned(&rdev->sb->events));
- ret = 0;
-abort:
- if (bh)
- brelse (bh);
- return ret;
-}
-
-static unsigned int calc_sb_csum (mdp_super_t * sb)
-{
- unsigned int disk_csum, csum;
-
- disk_csum = sb->sb_csum;
- sb->sb_csum = 0;
- csum = csum_partial((void *)sb, MD_SB_BYTES, 0);
- sb->sb_csum = disk_csum;
- return csum;
-}
-
-/*
- * Check one RAID superblock for generic plausibility
- */
-
-static int check_disk_sb (mdk_rdev_t * rdev)
-{
- mdp_super_t *sb;
- int ret = -EINVAL;
-
- sb = rdev->sb;
- if (!sb) {
- MD_BUG();
- goto abort;
- }
-
- if (sb->md_magic != MD_SB_MAGIC) {
- printk (BAD_MAGIC, partition_name(rdev->dev));
- goto abort;
- }
-
- if (sb->md_minor >= MAX_MD_DEVS) {
- printk (BAD_MINOR, partition_name(rdev->dev),
- sb->md_minor);
- goto abort;
- }
-
- if (calc_sb_csum(sb) != sb->sb_csum)
- printk(BAD_CSUM, partition_name(rdev->dev));
- ret = 0;
-abort:
- return ret;
-}
-
-static kdev_t dev_unit(kdev_t dev)
-{
- unsigned int mask;
- struct gendisk *hd = find_gendisk(dev);
-
- if (!hd)
- return 0;
- mask = ~((1 << hd->minor_shift) - 1);
-
- return MKDEV(MAJOR(dev), MINOR(dev) & mask);
-}
-
-static mdk_rdev_t * match_dev_unit(mddev_t *mddev, kdev_t dev)
-{
- struct md_list_head *tmp;
- mdk_rdev_t *rdev;
-
- ITERATE_RDEV(mddev,rdev,tmp)
- if (dev_unit(rdev->dev) == dev_unit(dev))
- return rdev;
-
- return NULL;
-}
-
-static int match_mddev_units(mddev_t *mddev1, mddev_t *mddev2)
-{
- struct md_list_head *tmp;
- mdk_rdev_t *rdev;
-
- ITERATE_RDEV(mddev1,rdev,tmp)
- if (match_dev_unit(mddev2, rdev->dev))
- return 1;
-
- return 0;
-}
-
-static MD_LIST_HEAD(all_raid_disks);
-static MD_LIST_HEAD(pending_raid_disks);
-
-static void bind_rdev_to_array (mdk_rdev_t * rdev, mddev_t * mddev)
-{
- mdk_rdev_t *same_pdev;
-
- if (rdev->mddev) {
- MD_BUG();
- return;
- }
- same_pdev = match_dev_unit(mddev, rdev->dev);
- if (same_pdev)
- printk( KERN_WARNING
-"md%d: WARNING: %s appears to be on the same physical disk as %s. True\n"
-" protection against single-disk failure might be compromised.\n",
- mdidx(mddev), partition_name(rdev->dev),
- partition_name(same_pdev->dev));
-
- md_list_add(&rdev->same_set, &mddev->disks);
- rdev->mddev = mddev;
- mddev->nb_dev++;
- printk("bind<%s,%d>\n", partition_name(rdev->dev), mddev->nb_dev);
-}
-
-static void unbind_rdev_from_array (mdk_rdev_t * rdev)
-{
- if (!rdev->mddev) {
- MD_BUG();
- return;
- }
- md_list_del(&rdev->same_set);
- MD_INIT_LIST_HEAD(&rdev->same_set);
- rdev->mddev->nb_dev--;
- printk("unbind<%s,%d>\n", partition_name(rdev->dev),
- rdev->mddev->nb_dev);
- rdev->mddev = NULL;
-}
-
-/*
- * prevent the device from being mounted, repartitioned or
- * otherwise reused by a RAID array (or any other kernel
- * subsystem), by opening the device. [simply getting an
- * inode is not enough, the SCSI module usage code needs
- * an explicit open() on the device]
- */
-static void lock_rdev (mdk_rdev_t *rdev)
-{
- int err = 0;
-
- /*
- * First insert a dummy inode.
- */
- rdev->inode = get_empty_inode();
- /*
- * we dont care about any other fields
- */
- rdev->inode->i_dev = rdev->inode->i_rdev = rdev->dev;
- insert_inode_hash(rdev->inode);
-
- memset(&rdev->filp, 0, sizeof(rdev->filp));
- rdev->filp.f_mode = 3; /* read write */
- err = blkdev_open(rdev->inode, &rdev->filp);
- if (err)
- printk("blkdev_open() failed: %d\n", err);
-}
-
-static void unlock_rdev (mdk_rdev_t *rdev)
-{
- blkdev_release(rdev->inode);
-}
-
-static void export_rdev (mdk_rdev_t * rdev)
-{
- printk("export_rdev(%s)\n",partition_name(rdev->dev));
- if (rdev->mddev)
- MD_BUG();
- unlock_rdev(rdev);
- free_disk_sb(rdev);
- md_list_del(&rdev->all);
- MD_INIT_LIST_HEAD(&rdev->all);
- if (rdev->pending.next != &rdev->pending) {
- printk("(%s was pending)\n",partition_name(rdev->dev));
- md_list_del(&rdev->pending);
- MD_INIT_LIST_HEAD(&rdev->pending);
- }
- rdev->dev = 0;
- rdev->faulty = 0;
- kfree(rdev);
-}
-
-static void kick_rdev_from_array (mdk_rdev_t * rdev)
-{
- unbind_rdev_from_array(rdev);
- export_rdev(rdev);
-}
-
-static void export_array (mddev_t *mddev)
-{
- struct md_list_head *tmp;
- mdk_rdev_t *rdev;
- mdp_super_t *sb = mddev->sb;
-
- if (mddev->sb) {
- mddev->sb = NULL;
- free_page((unsigned long) sb);
- }
-
- ITERATE_RDEV(mddev,rdev,tmp) {
- if (!rdev->mddev) {
- MD_BUG();
- continue;
- }
- kick_rdev_from_array(rdev);
- }
- if (mddev->nb_dev)
- MD_BUG();
-}
-
-#undef BAD_CSUM
-#undef BAD_MAGIC
-#undef OUT_OF_MEM
-#undef NO_SB
-
-static void print_desc(mdp_disk_t *desc)
-{
- printk(" DISK<N:%d,%s(%d,%d),R:%d,S:%d>\n", desc->number,
- partition_name(MKDEV(desc->major,desc->minor)),
- desc->major,desc->minor,desc->raid_disk,desc->state);
-}
-
-static void print_sb(mdp_super_t *sb)
-{
- int i;
-
- printk(" SB: (V:%d.%d.%d) ID:<%08x.%08x.%08x.%08x> CT:%08x\n",
- sb->major_version, sb->minor_version, sb->patch_version,
- sb->set_uuid0, sb->set_uuid1, sb->set_uuid2, sb->set_uuid3,
- sb->ctime);
- printk(" L%d S%08d ND:%d RD:%d md%d LO:%d CS:%d\n", sb->level,
- sb->size, sb->nr_disks, sb->raid_disks, sb->md_minor,
- sb->layout, sb->chunk_size);
- printk(" UT:%08x ST:%d AD:%d WD:%d FD:%d SD:%d CSUM:%08x E:%08lx\n",
- sb->utime, sb->state, sb->active_disks, sb->working_disks,
- sb->failed_disks, sb->spare_disks,
- sb->sb_csum, (unsigned long)get_unaligned(&sb->events));
-
- for (i = 0; i < MD_SB_DISKS; i++) {
- mdp_disk_t *desc;
-
- desc = sb->disks + i;
- printk(" D %2d: ", i);
- print_desc(desc);
- }
- printk(" THIS: ");
- print_desc(&sb->this_disk);
-
-}
-
-static void print_rdev(mdk_rdev_t *rdev)
-{
- printk(" rdev %s: O:%s, SZ:%08d F:%d DN:%d ",
- partition_name(rdev->dev), partition_name(rdev->old_dev),
- rdev->size, rdev->faulty, rdev->desc_nr);
- if (rdev->sb) {
- printk("rdev superblock:\n");
- print_sb(rdev->sb);
- } else
- printk("no rdev superblock!\n");
-}
-
-void md_print_devices (void)
-{
- struct md_list_head *tmp, *tmp2;
- mdk_rdev_t *rdev;
- mddev_t *mddev;
-
- printk("\n");
- printk(" **********************************\n");
- printk(" * <COMPLETE RAID STATE PRINTOUT> *\n");
- printk(" **********************************\n");
- ITERATE_MDDEV(mddev,tmp) {
- printk("md%d: ", mdidx(mddev));
-
- ITERATE_RDEV(mddev,rdev,tmp2)
- printk("<%s>", partition_name(rdev->dev));
-
- if (mddev->sb) {
- printk(" array superblock:\n");
- print_sb(mddev->sb);
- } else
- printk(" no array superblock.\n");
-
- ITERATE_RDEV(mddev,rdev,tmp2)
- print_rdev(rdev);
- }
- printk(" **********************************\n");
- printk("\n");
-}
-
-static int sb_equal ( mdp_super_t *sb1, mdp_super_t *sb2)
-{
- int ret;
- mdp_super_t *tmp1, *tmp2;
-
- tmp1 = kmalloc(sizeof(*tmp1),GFP_KERNEL);
- tmp2 = kmalloc(sizeof(*tmp2),GFP_KERNEL);
-
- if (!tmp1 || !tmp2) {
- ret = 0;
- goto abort;
- }
-
- *tmp1 = *sb1;
- *tmp2 = *sb2;
-
- /*
- * nr_disks is not constant
- */
- tmp1->nr_disks = 0;
- tmp2->nr_disks = 0;
-
- if (memcmp(tmp1, tmp2, MD_SB_GENERIC_CONSTANT_WORDS * 4))
- ret = 0;
- else
- ret = 1;
-
-abort:
- if (tmp1)
- kfree(tmp1);
- if (tmp2)
- kfree(tmp2);
-
- return ret;
-}
-
-static int uuid_equal(mdk_rdev_t *rdev1, mdk_rdev_t *rdev2)
-{
- if ( (rdev1->sb->set_uuid0 == rdev2->sb->set_uuid0) &&
- (rdev1->sb->set_uuid1 == rdev2->sb->set_uuid1) &&
- (rdev1->sb->set_uuid2 == rdev2->sb->set_uuid2) &&
- (rdev1->sb->set_uuid3 == rdev2->sb->set_uuid3))
-
- return 1;
-
- return 0;
-}
-
-static mdk_rdev_t * find_rdev_all (kdev_t dev)
-{
- struct md_list_head *tmp;
- mdk_rdev_t *rdev;
-
- tmp = all_raid_disks.next;
- while (tmp != &all_raid_disks) {
- rdev = md_list_entry(tmp, mdk_rdev_t, all);
- if (rdev->dev == dev)
- return rdev;
- tmp = tmp->next;
- }
- return NULL;
-}
-
-#define GETBLK_FAILED KERN_ERR \
-"md: getblk failed for device %s\n"
-
-static int write_disk_sb(mdk_rdev_t * rdev)
-{
- struct buffer_head *bh;
- kdev_t dev;
- u32 sb_offset, size;
- mdp_super_t *sb;
-
- if (!rdev->sb) {
- MD_BUG();
- return -1;
- }
- if (rdev->faulty) {
- MD_BUG();
- return -1;
- }
- if (rdev->sb->md_magic != MD_SB_MAGIC) {
- MD_BUG();
- return -1;
- }
-
- dev = rdev->dev;
- sb_offset = calc_dev_sboffset(dev, rdev->mddev, 1);
- if (rdev->sb_offset != sb_offset) {
- printk("%s's sb offset has changed from %d to %d, skipping\n", partition_name(dev), rdev->sb_offset, sb_offset);
- goto skip;
- }
- /*
- * If the disk went offline meanwhile and it's just a spare, then
- * it's size has changed to zero silently, and the MD code does
- * not yet know that it's faulty.
- */
- size = calc_dev_size(dev, rdev->mddev, 1);
- if (size != rdev->size) {
- printk("%s's size has changed from %d to %d since import, skipping\n", partition_name(dev), rdev->size, size);
- goto skip;
- }
-
- printk("(write) %s's sb offset: %d\n", partition_name(dev), sb_offset);
- fsync_dev(dev);
- set_blocksize(dev, MD_SB_BYTES);
- bh = getblk(dev, sb_offset / MD_SB_BLOCKS, MD_SB_BYTES);
- if (!bh) {
- printk(GETBLK_FAILED, partition_name(dev));
- return 1;
- }
- memset(bh->b_data,0,bh->b_size);
- sb = (mdp_super_t *) bh->b_data;
- memcpy(sb, rdev->sb, MD_SB_BYTES);
-
- mark_buffer_uptodate(bh, 1);
- mark_buffer_dirty(bh, 1);
- ll_rw_block(WRITE, 1, &bh);
- wait_on_buffer(bh);
- brelse(bh);
- fsync_dev(dev);
-skip:
- return 0;
-}
-#undef GETBLK_FAILED KERN_ERR
-
-static void set_this_disk(mddev_t *mddev, mdk_rdev_t *rdev)
-{
- int i, ok = 0;
- mdp_disk_t *desc;
-
- for (i = 0; i < MD_SB_DISKS; i++) {
- desc = mddev->sb->disks + i;
-#if 0
- if (disk_faulty(desc)) {
- if (MKDEV(desc->major,desc->minor) == rdev->dev)
- ok = 1;
- continue;
- }
-#endif
- if (MKDEV(desc->major,desc->minor) == rdev->dev) {
- rdev->sb->this_disk = *desc;
- rdev->desc_nr = desc->number;
- ok = 1;
- break;
- }
- }
-
- if (!ok) {
- MD_BUG();
- }
-}
-
-static int sync_sbs(mddev_t * mddev)
-{
- mdk_rdev_t *rdev;
- mdp_super_t *sb;
- struct md_list_head *tmp;
-
- ITERATE_RDEV(mddev,rdev,tmp) {
- if (rdev->faulty)
- continue;
- sb = rdev->sb;
- *sb = *mddev->sb;
- set_this_disk(mddev, rdev);
- sb->sb_csum = calc_sb_csum(sb);
- }
- return 0;
-}
-
-int md_update_sb(mddev_t * mddev)
-{
- int first, err, count = 100;
- struct md_list_head *tmp;
- mdk_rdev_t *rdev;
- __u64 ev;
-
-repeat:
- mddev->sb->utime = CURRENT_TIME;
- ev = get_unaligned(&mddev->sb->events);
- ++ev;
- put_unaligned(ev,&mddev->sb->events);
- if (ev == (__u64)0) {
- /*
- * oops, this 64-bit counter should never wrap.
- * Either we are in around ~1 trillion A.C., assuming
- * 1 reboot per second, or we have a bug:
- */
- MD_BUG();
- --ev;
- put_unaligned(ev,&mddev->sb->events);
- }
- sync_sbs(mddev);
-
- /*
- * do not write anything to disk if using
- * nonpersistent superblocks
- */
- if (mddev->sb->not_persistent)
- return 0;
-
- printk(KERN_INFO "md: updating md%d RAID superblock on device\n",
- mdidx(mddev));
-
- first = 1;
- err = 0;
- ITERATE_RDEV(mddev,rdev,tmp) {
- if (!first) {
- first = 0;
- printk(", ");
- }
- if (rdev->faulty)
- printk("(skipping faulty ");
- printk("%s ", partition_name(rdev->dev));
- if (!rdev->faulty) {
- printk("[events: %08lx]",
- (unsigned long)get_unaligned(&rdev->sb->events));
- err += write_disk_sb(rdev);
- } else
- printk(")\n");
- }
- printk(".\n");
- if (err) {
- printk("errors occured during superblock update, repeating\n");
- if (--count)
- goto repeat;
- printk("excessive errors occured during superblock update, exiting\n");
- }
- return 0;
-}
-
-/*
- * Import a device. If 'on_disk', then sanity check the superblock
- *
- * mark the device faulty if:
- *
- * - the device is nonexistent (zero size)
- * - the device has no valid superblock
- *
- * a faulty rdev _never_ has rdev->sb set.
- */
-static int md_import_device (kdev_t newdev, int on_disk)
-{
- int err;
- mdk_rdev_t *rdev;
- unsigned int size;
-
- if (find_rdev_all(newdev))
- return -EEXIST;
-
- rdev = (mdk_rdev_t *) kmalloc(sizeof(*rdev), GFP_KERNEL);
- if (!rdev) {
- printk("could not alloc mem for %s!\n", partition_name(newdev));
- return -ENOMEM;
- }
- memset(rdev, 0, sizeof(*rdev));
-
- if (!fs_may_mount(newdev)) {
- printk("md: can not import %s, has active inodes!\n",
- partition_name(newdev));
- err = -EBUSY;
- goto abort_free;
- }
-
- if ((err = alloc_disk_sb(rdev)))
- goto abort_free;
-
- rdev->dev = newdev;
- lock_rdev(rdev);
- rdev->desc_nr = -1;
- rdev->faulty = 0;
-
- size = 0;
- if (blk_size[MAJOR(newdev)])
- size = blk_size[MAJOR(newdev)][MINOR(newdev)];
- if (!size) {
- printk("md: %s has zero size, marking faulty!\n",
- partition_name(newdev));
- err = -EINVAL;
- goto abort_free;
- }
-
- if (on_disk) {
- if ((err = read_disk_sb(rdev))) {
- printk("md: could not read %s's sb, not importing!\n",
- partition_name(newdev));
- goto abort_free;
- }
- if ((err = check_disk_sb(rdev))) {
- printk("md: %s has invalid sb, not importing!\n",
- partition_name(newdev));
- goto abort_free;
- }
-
- rdev->old_dev = MKDEV(rdev->sb->this_disk.major,
- rdev->sb->this_disk.minor);
- rdev->desc_nr = rdev->sb->this_disk.number;
- }
- md_list_add(&rdev->all, &all_raid_disks);
- MD_INIT_LIST_HEAD(&rdev->pending);
-
- if (rdev->faulty && rdev->sb)
- free_disk_sb(rdev);
- return 0;
-
-abort_free:
- if (rdev->sb)
- free_disk_sb(rdev);
- kfree(rdev);
- return err;
-}
-
-/*
- * Check a full RAID array for plausibility
- */
-
-#define INCONSISTENT KERN_ERR \
-"md: fatal superblock inconsistency in %s -- removing from array\n"
-
-#define OUT_OF_DATE KERN_ERR \
-"md: superblock update time inconsistency -- using the most recent one\n"
-
-#define OLD_VERSION KERN_ALERT \
-"md: md%d: unsupported raid array version %d.%d.%d\n"
-
-#define NOT_CLEAN_IGNORE KERN_ERR \
-"md: md%d: raid array is not clean -- starting background reconstruction\n"
-
-#define UNKNOWN_LEVEL KERN_ERR \
-"md: md%d: unsupported raid level %d\n"
-
-static int analyze_sbs (mddev_t * mddev)
-{
- int out_of_date = 0, i;
- struct md_list_head *tmp, *tmp2;
- mdk_rdev_t *rdev, *rdev2, *freshest;
- mdp_super_t *sb;
-
- /*
- * Verify the RAID superblock on each real device
- */
- ITERATE_RDEV(mddev,rdev,tmp) {
- if (rdev->faulty) {
- MD_BUG();
- goto abort;
- }
- if (!rdev->sb) {
- MD_BUG();
- goto abort;
- }
- if (check_disk_sb(rdev))
- goto abort;
- }
-
- /*
- * The superblock constant part has to be the same
- * for all disks in the array.
- */
- sb = NULL;
-
- ITERATE_RDEV(mddev,rdev,tmp) {
- if (!sb) {
- sb = rdev->sb;
- continue;
- }
- if (!sb_equal(sb, rdev->sb)) {
- printk (INCONSISTENT, partition_name(rdev->dev));
- kick_rdev_from_array(rdev);
- continue;
- }
- }
-
- /*
- * OK, we have all disks and the array is ready to run. Let's
- * find the freshest superblock, that one will be the superblock
- * that represents the whole array.
- */
- if (!mddev->sb)
- if (alloc_array_sb(mddev))
- goto abort;
- sb = mddev->sb;
- freshest = NULL;
-
- ITERATE_RDEV(mddev,rdev,tmp) {
- __u64 ev1, ev2;
- /*
- * if the checksum is invalid, use the superblock
-		 * only as a last resort. (decrease its age by
- * one event)
- */
- if (calc_sb_csum(rdev->sb) != rdev->sb->sb_csum) {
- __u64 ev = get_unaligned(&rdev->sb->events);
- if (ev != (__u64)0) {
- --ev;
- put_unaligned(ev,&rdev->sb->events);
- }
- }
-
- printk("%s's event counter: %08lx\n", partition_name(rdev->dev),
- (unsigned long)get_unaligned(&rdev->sb->events));
- if (!freshest) {
- freshest = rdev;
- continue;
- }
- /*
- * Find the newest superblock version
- */
- ev1 = get_unaligned(&rdev->sb->events);
- ev2 = get_unaligned(&freshest->sb->events);
- if (ev1 != ev2) {
- out_of_date = 1;
- if (ev1 > ev2)
- freshest = rdev;
- }
- }
- if (out_of_date) {
- printk(OUT_OF_DATE);
- printk("freshest: %s\n", partition_name(freshest->dev));
- }
- memcpy (sb, freshest->sb, sizeof(*sb));
-
- /*
- * at this point we have picked the 'best' superblock
- * from all available superblocks.
- * now we validate this superblock and kick out possibly
- * failed disks.
- */
- ITERATE_RDEV(mddev,rdev,tmp) {
- /*
- * Kick all non-fresh devices faulty
- */
- __u64 ev1, ev2;
- ev1 = get_unaligned(&rdev->sb->events);
- ev2 = get_unaligned(&sb->events);
- ++ev1;
- if (ev1 < ev2) {
- printk("md: kicking non-fresh %s from array!\n",
- partition_name(rdev->dev));
- kick_rdev_from_array(rdev);
- continue;
- }
- }
-
- /*
- * Fix up changed device names ... but only if this disk has a
-	 * recent update time. Disks with a faulty checksum are accepted too.
- */
- ITERATE_RDEV(mddev,rdev,tmp) {
- __u64 ev1, ev2, ev3;
- if (rdev->faulty) { /* REMOVEME */
- MD_BUG();
- goto abort;
- }
- ev1 = get_unaligned(&rdev->sb->events);
- ev2 = get_unaligned(&sb->events);
- ev3 = ev2;
- --ev3;
- if ((rdev->dev != rdev->old_dev) &&
- ((ev1 == ev2) || (ev1 == ev3))) {
- mdp_disk_t *desc;
-
- printk("md: device name has changed from %s to %s since last import!\n", partition_name(rdev->old_dev), partition_name(rdev->dev));
- if (rdev->desc_nr == -1) {
- MD_BUG();
- goto abort;
- }
- desc = &sb->disks[rdev->desc_nr];
- if (rdev->old_dev != MKDEV(desc->major, desc->minor)) {
- MD_BUG();
- goto abort;
- }
- desc->major = MAJOR(rdev->dev);
- desc->minor = MINOR(rdev->dev);
- desc = &rdev->sb->this_disk;
- desc->major = MAJOR(rdev->dev);
- desc->minor = MINOR(rdev->dev);
- }
- }
-
- /*
- * Remove unavailable and faulty devices ...
- *
- * note that if an array becomes completely unrunnable due to
- * missing devices, we do not write the superblock back, so the
- * administrator has a chance to fix things up. The removal thus
- * only happens if it's nonfatal to the contents of the array.
- */
- for (i = 0; i < MD_SB_DISKS; i++) {
- int found;
- mdp_disk_t *desc;
- kdev_t dev;
-
- desc = sb->disks + i;
- dev = MKDEV(desc->major, desc->minor);
-
- /*
- * We kick faulty devices/descriptors immediately.
- */
- if (disk_faulty(desc)) {
- found = 0;
- ITERATE_RDEV(mddev,rdev,tmp) {
- if (rdev->desc_nr != desc->number)
- continue;
- printk("md%d: kicking faulty %s!\n",
- mdidx(mddev),partition_name(rdev->dev));
- kick_rdev_from_array(rdev);
- found = 1;
- break;
- }
- if (!found) {
- if (dev == MKDEV(0,0))
- continue;
- printk("md%d: removing former faulty %s!\n",
- mdidx(mddev), partition_name(dev));
- }
- remove_descriptor(desc, sb);
- continue;
- }
-
- if (dev == MKDEV(0,0))
- continue;
- /*
- * Is this device present in the rdev ring?
- */
- found = 0;
- ITERATE_RDEV(mddev,rdev,tmp) {
- if (rdev->desc_nr == desc->number) {
- found = 1;
- break;
- }
- }
- if (found)
- continue;
-
- printk("md%d: former device %s is unavailable, removing from array!\n", mdidx(mddev), partition_name(dev));
- remove_descriptor(desc, sb);
- }
-
- /*
-	 * Double check whether all devices mentioned in the
- * superblock are in the rdev ring.
- */
- for (i = 0; i < MD_SB_DISKS; i++) {
- mdp_disk_t *desc;
- kdev_t dev;
-
- desc = sb->disks + i;
- dev = MKDEV(desc->major, desc->minor);
-
- if (dev == MKDEV(0,0))
- continue;
-
- if (disk_faulty(desc)) {
- MD_BUG();
- goto abort;
- }
-
- rdev = find_rdev(mddev, dev);
- if (!rdev) {
- MD_BUG();
- goto abort;
- }
- }
-
- /*
- * Do a final reality check.
- */
- ITERATE_RDEV(mddev,rdev,tmp) {
- if (rdev->desc_nr == -1) {
- MD_BUG();
- goto abort;
- }
- /*
- * is the desc_nr unique?
- */
- ITERATE_RDEV(mddev,rdev2,tmp2) {
- if ((rdev2 != rdev) &&
- (rdev2->desc_nr == rdev->desc_nr)) {
- MD_BUG();
- goto abort;
- }
- }
- /*
- * is the device unique?
- */
- ITERATE_RDEV(mddev,rdev2,tmp2) {
- if ((rdev2 != rdev) &&
- (rdev2->dev == rdev->dev)) {
- MD_BUG();
- goto abort;
- }
- }
- }
-
- /*
- * Check if we can support this RAID array
- */
- if (sb->major_version != MD_MAJOR_VERSION ||
- sb->minor_version > MD_MINOR_VERSION) {
-
- printk (OLD_VERSION, mdidx(mddev), sb->major_version,
- sb->minor_version, sb->patch_version);
- goto abort;
- }
-
- if ((sb->state != (1 << MD_SB_CLEAN)) && ((sb->level == 1) ||
- (sb->level == 4) || (sb->level == 5)))
- printk (NOT_CLEAN_IGNORE, mdidx(mddev));
-
- return 0;
-abort:
- return 1;
-}
-
-#undef INCONSISTENT
-#undef OUT_OF_DATE
-#undef OLD_VERSION
-#undef NOT_CLEAN_IGNORE
-
-static int device_size_calculation (mddev_t * mddev)
-{
- int data_disks = 0, persistent;
- unsigned int readahead;
- mdp_super_t *sb = mddev->sb;
- struct md_list_head *tmp;
- mdk_rdev_t *rdev;
-
- /*
- * Do device size calculation. Bail out if too small.
- * (we have to do this after having validated chunk_size,
-	 * because the device size has to be a multiple of chunk_size)
- */
- persistent = !mddev->sb->not_persistent;
- ITERATE_RDEV(mddev,rdev,tmp) {
- if (rdev->faulty)
- continue;
- if (rdev->size) {
- MD_BUG();
- continue;
- }
- rdev->size = calc_dev_size(rdev->dev, mddev, persistent);
- if (rdev->size < sb->chunk_size / 1024) {
- printk (KERN_WARNING
- "Dev %s smaller than chunk_size: %dk < %dk\n",
- partition_name(rdev->dev),
- rdev->size, sb->chunk_size / 1024);
- return -EINVAL;
- }
- }
-
- switch (sb->level) {
- case -3:
- data_disks = 1;
- break;
- case -2:
- data_disks = 1;
- break;
- case -1:
- zoned_raid_size(mddev);
- data_disks = 1;
- break;
- case 0:
- zoned_raid_size(mddev);
- data_disks = sb->raid_disks;
- break;
- case 1:
- data_disks = 1;
- break;
- case 4:
- case 5:
- data_disks = sb->raid_disks-1;
- break;
- default:
- printk (UNKNOWN_LEVEL, mdidx(mddev), sb->level);
- goto abort;
- }
- if (!md_size[mdidx(mddev)])
- md_size[mdidx(mddev)] = sb->size * data_disks;
-
- readahead = MD_READAHEAD;
- if ((sb->level == 0) || (sb->level == 4) || (sb->level == 5))
- readahead = mddev->sb->chunk_size * 4 * data_disks;
- if (readahead < data_disks * MAX_SECTORS*512*2)
- readahead = data_disks * MAX_SECTORS*512*2;
- else {
- if (sb->level == -3)
- readahead = 0;
- }
- md_maxreadahead[mdidx(mddev)] = readahead;
-
- printk(KERN_INFO "md%d: max total readahead window set to %dk\n",
- mdidx(mddev), readahead/1024);
-
- printk(KERN_INFO
- "md%d: %d data-disks, max readahead per data-disk: %dk\n",
- mdidx(mddev), data_disks, readahead/data_disks/1024);
- return 0;
-abort:
- return 1;
-}
-
-
-#define TOO_BIG_CHUNKSIZE KERN_ERR \
-"too big chunk_size: %d > %d\n"
-
-#define TOO_SMALL_CHUNKSIZE KERN_ERR \
-"too small chunk_size: %d < %ld\n"
-
-#define BAD_CHUNKSIZE KERN_ERR \
-"no chunksize specified, see 'man raidtab'\n"
-
-static int do_md_run (mddev_t * mddev)
-{
- int pnum, err;
- int chunk_size;
- struct md_list_head *tmp;
- mdk_rdev_t *rdev;
-
-
- if (!mddev->nb_dev) {
- MD_BUG();
- return -EINVAL;
- }
-
- if (mddev->pers)
- return -EBUSY;
-
- /*
- * Resize disks to align partitions size on a given
- * chunk size.
- */
- md_size[mdidx(mddev)] = 0;
-
- /*
- * Analyze all RAID superblock(s)
- */
- if (analyze_sbs(mddev)) {
- MD_BUG();
- return -EINVAL;
- }
-
- chunk_size = mddev->sb->chunk_size;
- pnum = level_to_pers(mddev->sb->level);
-
- mddev->param.chunk_size = chunk_size;
- mddev->param.personality = pnum;
-
- if (chunk_size > MAX_CHUNK_SIZE) {
- printk(TOO_BIG_CHUNKSIZE, chunk_size, MAX_CHUNK_SIZE);
- return -EINVAL;
- }
- /*
-	 * chunk-size has to be a power of 2 and a multiple of PAGE_SIZE
- */
- if ( (1 << ffz(~chunk_size)) != chunk_size) {
- MD_BUG();
- return -EINVAL;
- }
- if (chunk_size < PAGE_SIZE) {
- printk(TOO_SMALL_CHUNKSIZE, chunk_size, PAGE_SIZE);
- return -EINVAL;
- }
-
- if (pnum >= MAX_PERSONALITY) {
- MD_BUG();
- return -EINVAL;
- }
-
- if ((pnum != RAID1) && (pnum != LINEAR) && !chunk_size) {
- /*
- * 'default chunksize' in the old md code used to
- * be PAGE_SIZE, baaad.
-		 * we abort here to be on the safe side. We don't
- * want to continue the bad practice.
- */
- printk(BAD_CHUNKSIZE);
- return -EINVAL;
- }
-
- if (!pers[pnum])
- {
-#ifdef CONFIG_KMOD
- char module_name[80];
- sprintf (module_name, "md-personality-%d", pnum);
- request_module (module_name);
- if (!pers[pnum])
-#endif
- return -EINVAL;
- }
-
- if (device_size_calculation(mddev))
- return -EINVAL;
-
- /*
- * Drop all container device buffers, from now on
- * the only valid external interface is through the md
- * device.
- */
- ITERATE_RDEV(mddev,rdev,tmp) {
- if (rdev->faulty)
- continue;
- fsync_dev(rdev->dev);
- invalidate_buffers(rdev->dev);
- }
-
- mddev->pers = pers[pnum];
-
- err = mddev->pers->run(mddev);
- if (err) {
- printk("pers->run() failed ...\n");
- mddev->pers = NULL;
- return -EINVAL;
- }
-
- mddev->sb->state &= ~(1 << MD_SB_CLEAN);
- md_update_sb(mddev);
-
- /*
- * md_size has units of 1K blocks, which are
- * twice as large as sectors.
- */
- md_hd_struct[mdidx(mddev)].start_sect = 0;
- md_hd_struct[mdidx(mddev)].nr_sects = md_size[mdidx(mddev)] << 1;
-
- read_ahead[MD_MAJOR] = 1024;
- return (0);
-}
-
-#undef TOO_BIG_CHUNKSIZE
-#undef TOO_SMALL_CHUNKSIZE
-#undef BAD_CHUNKSIZE
-
-#define OUT(x) do { err = (x); goto out; } while (0)
-
-static int restart_array (mddev_t *mddev)
-{
- int err = 0;
-
- /*
- * Complain if it has no devices
- */
- if (!mddev->nb_dev)
- OUT(-ENXIO);
-
- if (mddev->pers) {
- if (!mddev->ro)
- OUT(-EBUSY);
-
- mddev->ro = 0;
- set_device_ro(mddev_to_kdev(mddev), 0);
-
- printk (KERN_INFO
- "md%d switched to read-write mode.\n", mdidx(mddev));
- /*
- * Kick recovery or resync if necessary
- */
- md_recover_arrays();
- if (mddev->pers->restart_resync)
- mddev->pers->restart_resync(mddev);
- } else
- err = -EINVAL;
-
-out:
- return err;
-}
-
-#define STILL_MOUNTED KERN_WARNING \
-"md: md%d still mounted.\n"
-
-static int do_md_stop (mddev_t * mddev, int ro)
-{
- int err = 0, resync_interrupted = 0;
- kdev_t dev = mddev_to_kdev(mddev);
-
- if (!ro && !fs_may_mount (dev)) {
- printk (STILL_MOUNTED, mdidx(mddev));
- OUT(-EBUSY);
- }
-
- /*
- * complain if it's already stopped
- */
- if (!mddev->nb_dev)
- OUT(-ENXIO);
-
- if (mddev->pers) {
- /*
- * It is safe to call stop here, it only frees private
- * data. Also, it tells us if a device is unstoppable
- * (eg. resyncing is in progress)
- */
- if (mddev->pers->stop_resync)
- if (mddev->pers->stop_resync(mddev))
- resync_interrupted = 1;
-
- if (mddev->recovery_running)
- md_interrupt_thread(md_recovery_thread);
-
- /*
- * This synchronizes with signal delivery to the
- * resync or reconstruction thread. It also nicely
- * hangs the process if some reconstruction has not
- * finished.
- */
- down(&mddev->recovery_sem);
- up(&mddev->recovery_sem);
-
- /*
- * sync and invalidate buffers because we cannot kill the
- * main thread with valid IO transfers still around.
- * the kernel lock protects us from new requests being
- * added after invalidate_buffers().
- */
- fsync_dev (mddev_to_kdev(mddev));
- fsync_dev (dev);
- invalidate_buffers (dev);
-
- if (ro) {
- if (mddev->ro)
- OUT(-ENXIO);
- mddev->ro = 1;
- } else {
- if (mddev->ro)
- set_device_ro(dev, 0);
- if (mddev->pers->stop(mddev)) {
- if (mddev->ro)
- set_device_ro(dev, 1);
- OUT(-EBUSY);
- }
- if (mddev->ro)
- mddev->ro = 0;
- }
- if (mddev->sb) {
- /*
- * mark it clean only if there was no resync
- * interrupted.
- */
- if (!mddev->recovery_running && !resync_interrupted) {
- printk("marking sb clean...\n");
- mddev->sb->state |= 1 << MD_SB_CLEAN;
- }
- md_update_sb(mddev);
- }
- if (ro)
- set_device_ro(dev, 1);
- }
-
- /*
- * Free resources if final stop
- */
- if (!ro) {
- export_array(mddev);
- md_size[mdidx(mddev)] = 0;
- md_hd_struct[mdidx(mddev)].nr_sects = 0;
- free_mddev(mddev);
-
- printk (KERN_INFO "md%d stopped.\n", mdidx(mddev));
- } else
- printk (KERN_INFO
- "md%d switched to read-only mode.\n", mdidx(mddev));
-out:
- return err;
-}
-
-#undef OUT
-
-/*
- * We have to safely support old arrays too.
- */
-int detect_old_array (mdp_super_t *sb)
-{
- if (sb->major_version > 0)
- return 0;
- if (sb->minor_version >= 90)
- return 0;
-
- return -EINVAL;
-}
-
-
-static void autorun_array (mddev_t *mddev)
-{
- mdk_rdev_t *rdev;
- struct md_list_head *tmp;
- int err;
-
- if (mddev->disks.prev == &mddev->disks) {
- MD_BUG();
- return;
- }
-
- printk("running: ");
-
- ITERATE_RDEV(mddev,rdev,tmp) {
- printk("<%s>", partition_name(rdev->dev));
- }
- printk("\nnow!\n");
-
- err = do_md_run (mddev);
- if (err) {
- printk("do_md_run() returned %d\n", err);
- /*
- * prevent the writeback of an unrunnable array
- */
- mddev->sb_dirty = 0;
- do_md_stop (mddev, 0);
- }
-}
-
-/*
- * Let's try to run arrays based on all disks that have arrived
- * until now. (those are in the ->pending list)
- *
- * the method: pick the first pending disk, collect all disks with
- * the same UUID, remove all from the pending list and put them into
- * the 'same_array' list. Then order this list based on superblock
- * update time (freshest comes first), kick out 'old' disks and
- * compare superblocks. If everything's fine then run it.
- */
-static void autorun_devices (void)
-{
- struct md_list_head candidates;
- struct md_list_head *tmp;
- mdk_rdev_t *rdev0, *rdev;
- mddev_t *mddev;
- kdev_t md_kdev;
-
-
- printk("autorun ...\n");
- while (pending_raid_disks.next != &pending_raid_disks) {
- rdev0 = md_list_entry(pending_raid_disks.next,
- mdk_rdev_t, pending);
-
- printk("considering %s ...\n", partition_name(rdev0->dev));
- MD_INIT_LIST_HEAD(&candidates);
- ITERATE_RDEV_PENDING(rdev,tmp) {
- if (uuid_equal(rdev0, rdev)) {
- if (!sb_equal(rdev0->sb, rdev->sb)) {
- printk("%s has same UUID as %s, but superblocks differ ...\n", partition_name(rdev->dev), partition_name(rdev0->dev));
- continue;
- }
- printk(" adding %s ...\n", partition_name(rdev->dev));
- md_list_del(&rdev->pending);
- md_list_add(&rdev->pending, &candidates);
- }
- }
- /*
- * now we have a set of devices, with all of them having
- * mostly sane superblocks. It's time to allocate the
- * mddev.
- */
- md_kdev = MKDEV(MD_MAJOR, rdev0->sb->md_minor);
- mddev = kdev_to_mddev(md_kdev);
- if (mddev) {
- printk("md%d already running, cannot run %s\n",
- mdidx(mddev), partition_name(rdev0->dev));
- ITERATE_RDEV_GENERIC(candidates,pending,rdev,tmp)
- export_rdev(rdev);
- continue;
- }
- mddev = alloc_mddev(md_kdev);
- printk("created md%d\n", mdidx(mddev));
- ITERATE_RDEV_GENERIC(candidates,pending,rdev,tmp) {
- bind_rdev_to_array(rdev, mddev);
- md_list_del(&rdev->pending);
- MD_INIT_LIST_HEAD(&rdev->pending);
- }
- autorun_array(mddev);
- }
- printk("... autorun DONE.\n");
-}
-
-/*
- * import RAID devices based on one partition
- * if possible, the array gets run as well.
- */
-
-#define BAD_VERSION KERN_ERR \
-"md: %s has RAID superblock version 0.%d, autodetect needs v0.90 or higher\n"
-
-#define OUT_OF_MEM KERN_ALERT \
-"md: out of memory.\n"
-
-#define NO_DEVICE KERN_ERR \
-"md: disabled device %s\n"
-
-#define AUTOADD_FAILED KERN_ERR \
-"md: auto-adding devices to md%d FAILED (error %d).\n"
-
-#define AUTOADD_FAILED_USED KERN_ERR \
-"md: cannot auto-add device %s to md%d, already used.\n"
-
-#define AUTORUN_FAILED KERN_ERR \
-"md: auto-running md%d FAILED (error %d).\n"
-
-#define MDDEV_BUSY KERN_ERR \
-"md: cannot auto-add to md%d, already running.\n"
-
-#define AUTOADDING KERN_INFO \
-"md: auto-adding devices to md%d, based on %s's superblock.\n"
-
-#define AUTORUNNING KERN_INFO \
-"md: auto-running md%d.\n"
-
-static int autostart_array (kdev_t startdev)
-{
- int err = -EINVAL, i;
- mdp_super_t *sb = NULL;
- mdk_rdev_t *start_rdev = NULL, *rdev;
-
- if (md_import_device(startdev, 1)) {
- printk("could not import %s!\n", partition_name(startdev));
- goto abort;
- }
-
- start_rdev = find_rdev_all(startdev);
- if (!start_rdev) {
- MD_BUG();
- goto abort;
- }
- if (start_rdev->faulty) {
-		printk("cannot autostart based on faulty %s!\n",
- partition_name(startdev));
- goto abort;
- }
- md_list_add(&start_rdev->pending, &pending_raid_disks);
-
- sb = start_rdev->sb;
-
- err = detect_old_array(sb);
- if (err) {
- printk("array version is too old to be autostarted, use raidtools 0.90 mkraid --upgrade\nto upgrade the array without data loss!\n");
- goto abort;
- }
-
- for (i = 0; i < MD_SB_DISKS; i++) {
- mdp_disk_t *desc;
- kdev_t dev;
-
- desc = sb->disks + i;
- dev = MKDEV(desc->major, desc->minor);
-
- if (dev == MKDEV(0,0))
- continue;
- if (dev == startdev)
- continue;
- if (md_import_device(dev, 1)) {
- printk("could not import %s, trying to run array nevertheless.\n", partition_name(dev));
- continue;
- }
- rdev = find_rdev_all(dev);
- if (!rdev) {
- MD_BUG();
- goto abort;
- }
- md_list_add(&rdev->pending, &pending_raid_disks);
- }
-
- /*
- * possibly return codes
- */
- autorun_devices();
- return 0;
-
-abort:
- if (start_rdev)
- export_rdev(start_rdev);
- return err;
-}
-
-#undef BAD_VERSION
-#undef OUT_OF_MEM
-#undef NO_DEVICE
-#undef AUTOADD_FAILED_USED
-#undef AUTOADD_FAILED
-#undef AUTORUN_FAILED
-#undef AUTOADDING
-#undef AUTORUNNING
-
-struct {
- int set;
- int noautodetect;
-
-} raid_setup_args md__initdata = { 0, 0 };
-
-/*
- * Searches all registered partitions for autorun RAID arrays
- * at boot time.
- */
-md__initfunc(void autodetect_raid(void))
-{
-#ifdef CONFIG_AUTODETECT_RAID
- struct gendisk *disk;
- mdk_rdev_t *rdev;
- int i;
-
- if (raid_setup_args.noautodetect) {
- printk(KERN_INFO "skipping autodetection of RAID arrays\n");
- return;
- }
- printk(KERN_INFO "autodetecting RAID arrays\n");
-
- for (disk = gendisk_head ; disk ; disk = disk->next) {
- for (i = 0; i < disk->max_p*disk->max_nr; i++) {
- kdev_t dev = MKDEV(disk->major,i);
-
- if (disk->part[i].type == LINUX_OLD_RAID_PARTITION) {
- printk(KERN_ALERT
-"md: %s's partition type has to be changed from type 0x86 to type 0xfd\n"
-" to maintain interoperability with other OSs! Autodetection support for\n"
-" type 0x86 will be deleted after some migration timeout. Sorry.\n",
- partition_name(dev));
- disk->part[i].type = LINUX_RAID_PARTITION;
- }
- if (disk->part[i].type != LINUX_RAID_PARTITION)
- continue;
-
- if (md_import_device(dev,1)) {
- printk(KERN_ALERT "could not import %s!\n",
- partition_name(dev));
- continue;
- }
- /*
- * Sanity checks:
- */
- rdev = find_rdev_all(dev);
- if (!rdev) {
- MD_BUG();
- continue;
- }
- if (rdev->faulty) {
- MD_BUG();
- continue;
- }
- md_list_add(&rdev->pending, &pending_raid_disks);
- }
- }
-
- autorun_devices();
-#endif
-}
-
-static int get_version (void * arg)
-{
- mdu_version_t ver;
-
- ver.major = MD_MAJOR_VERSION;
- ver.minor = MD_MINOR_VERSION;
- ver.patchlevel = MD_PATCHLEVEL_VERSION;
-
- if (md_copy_to_user(arg, &ver, sizeof(ver)))
- return -EFAULT;
-
- return 0;
-}
-
-#define SET_FROM_SB(x) info.x = mddev->sb->x
-static int get_array_info (mddev_t * mddev, void * arg)
-{
- mdu_array_info_t info;
-
- if (!mddev->sb)
- return -EINVAL;
-
- SET_FROM_SB(major_version);
- SET_FROM_SB(minor_version);
- SET_FROM_SB(patch_version);
- SET_FROM_SB(ctime);
- SET_FROM_SB(level);
- SET_FROM_SB(size);
- SET_FROM_SB(nr_disks);
- SET_FROM_SB(raid_disks);
- SET_FROM_SB(md_minor);
- SET_FROM_SB(not_persistent);
-
- SET_FROM_SB(utime);
- SET_FROM_SB(state);
- SET_FROM_SB(active_disks);
- SET_FROM_SB(working_disks);
- SET_FROM_SB(failed_disks);
- SET_FROM_SB(spare_disks);
-
- SET_FROM_SB(layout);
- SET_FROM_SB(chunk_size);
-
- if (md_copy_to_user(arg, &info, sizeof(info)))
- return -EFAULT;
-
- return 0;
-}
-#undef SET_FROM_SB
-
-#define SET_FROM_SB(x) info.x = mddev->sb->disks[nr].x
-static int get_disk_info (mddev_t * mddev, void * arg)
-{
- mdu_disk_info_t info;
- unsigned int nr;
-
- if (!mddev->sb)
- return -EINVAL;
-
- if (md_copy_from_user(&info, arg, sizeof(info)))
- return -EFAULT;
-
- nr = info.number;
- if (nr >= mddev->sb->nr_disks)
- return -EINVAL;
-
- SET_FROM_SB(major);
- SET_FROM_SB(minor);
- SET_FROM_SB(raid_disk);
- SET_FROM_SB(state);
-
- if (md_copy_to_user(arg, &info, sizeof(info)))
- return -EFAULT;
-
- return 0;
-}
-#undef SET_FROM_SB
-
-#define SET_SB(x) mddev->sb->disks[nr].x = info.x
-
-static int add_new_disk (mddev_t * mddev, void * arg)
-{
- int err, size, persistent;
- mdu_disk_info_t info;
- mdk_rdev_t *rdev;
- unsigned int nr;
- kdev_t dev;
-
- if (!mddev->sb)
- return -EINVAL;
-
- if (md_copy_from_user(&info, arg, sizeof(info)))
- return -EFAULT;
-
- nr = info.number;
- if (nr >= mddev->sb->nr_disks)
- return -EINVAL;
-
- dev = MKDEV(info.major,info.minor);
-
- if (find_rdev_all(dev)) {
- printk("device %s already used in a RAID array!\n",
- partition_name(dev));
- return -EBUSY;
- }
-
- SET_SB(number);
- SET_SB(major);
- SET_SB(minor);
- SET_SB(raid_disk);
- SET_SB(state);
-
- if ((info.state & (1<<MD_DISK_FAULTY))==0) {
- err = md_import_device (dev, 0);
- if (err) {
- printk("md: error, md_import_device() returned %d\n", err);
- return -EINVAL;
- }
- rdev = find_rdev_all(dev);
- if (!rdev) {
- MD_BUG();
- return -EINVAL;
- }
-
- rdev->old_dev = dev;
- rdev->desc_nr = info.number;
-
- bind_rdev_to_array(rdev, mddev);
-
- persistent = !mddev->sb->not_persistent;
- if (!persistent)
- printk("nonpersistent superblock ...\n");
- if (!mddev->sb->chunk_size)
- printk("no chunksize?\n");
-
- size = calc_dev_size(dev, mddev, persistent);
- rdev->sb_offset = calc_dev_sboffset(dev, mddev, persistent);
-
- if (!mddev->sb->size || (mddev->sb->size > size))
- mddev->sb->size = size;
- }
-
- /*
- * sync all other superblocks with the main superblock
- */
- sync_sbs(mddev);
-
- return 0;
-}
-#undef SET_SB
-
-static int hot_remove_disk (mddev_t * mddev, kdev_t dev)
-{
- int err;
- mdk_rdev_t *rdev;
- mdp_disk_t *disk;
-
- if (!mddev->pers)
- return -ENODEV;
-
- printk("trying to remove %s from md%d ... \n",
- partition_name(dev), mdidx(mddev));
-
- if (!mddev->pers->diskop) {
- printk("md%d: personality does not support diskops!\n",
- mdidx(mddev));
- return -EINVAL;
- }
-
- rdev = find_rdev(mddev, dev);
- if (!rdev)
- return -ENXIO;
-
- if (rdev->desc_nr == -1) {
- MD_BUG();
- return -EINVAL;
- }
- disk = &mddev->sb->disks[rdev->desc_nr];
- if (disk_active(disk))
- goto busy;
- if (disk_removed(disk)) {
- MD_BUG();
- return -EINVAL;
- }
-
- err = mddev->pers->diskop(mddev, &disk, DISKOP_HOT_REMOVE_DISK);
- if (err == -EBUSY)
- goto busy;
- if (err) {
- MD_BUG();
- return -EINVAL;
- }
-
- remove_descriptor(disk, mddev->sb);
- kick_rdev_from_array(rdev);
- mddev->sb_dirty = 1;
- md_update_sb(mddev);
-
- return 0;
-busy:
- printk("cannot remove active disk %s from md%d ... \n",
- partition_name(dev), mdidx(mddev));
- return -EBUSY;
-}
-
-static int hot_add_disk (mddev_t * mddev, kdev_t dev)
-{
- int i, err, persistent;
- unsigned int size;
- mdk_rdev_t *rdev;
- mdp_disk_t *disk;
-
- if (!mddev->pers)
- return -ENODEV;
-
- printk("trying to hot-add %s to md%d ... \n",
- partition_name(dev), mdidx(mddev));
-
- if (!mddev->pers->diskop) {
- printk("md%d: personality does not support diskops!\n",
- mdidx(mddev));
- return -EINVAL;
- }
-
- persistent = !mddev->sb->not_persistent;
- size = calc_dev_size(dev, mddev, persistent);
-
- if (size < mddev->sb->size) {
- printk("md%d: disk size %d blocks < array size %d\n",
- mdidx(mddev), size, mddev->sb->size);
- return -ENOSPC;
- }
-
- rdev = find_rdev(mddev, dev);
- if (rdev)
- return -EBUSY;
-
- err = md_import_device (dev, 0);
- if (err) {
- printk("md: error, md_import_device() returned %d\n", err);
- return -EINVAL;
- }
- rdev = find_rdev_all(dev);
- if (!rdev) {
- MD_BUG();
- return -EINVAL;
- }
- if (rdev->faulty) {
-		printk("md: cannot hot-add faulty %s disk to md%d!\n",
- partition_name(dev), mdidx(mddev));
- err = -EINVAL;
- goto abort_export;
- }
- bind_rdev_to_array(rdev, mddev);
-
- /*
-	 * The rest had better be atomic: disk failures can be
- * noticed in interrupt contexts ...
- */
- cli();
- rdev->old_dev = dev;
- rdev->size = size;
- rdev->sb_offset = calc_dev_sboffset(dev, mddev, persistent);
-
- disk = mddev->sb->disks + mddev->sb->raid_disks;
- for (i = mddev->sb->raid_disks; i < MD_SB_DISKS; i++) {
- disk = mddev->sb->disks + i;
-
- if (!disk->major && !disk->minor)
- break;
- if (disk_removed(disk))
- break;
- }
- if (i == MD_SB_DISKS) {
- sti();
-		printk("md%d: cannot hot-add to full array!\n", mdidx(mddev));
- err = -EBUSY;
- goto abort_unbind_export;
- }
+ realdev->size=blk_size[MAJOR(dev)][MINOR(dev)];
- if (disk_removed(disk)) {
+ if (hot_add) {
+ /*
+ * Check the superblock for consistency.
+ * The personality itself has to check whether it's getting
+ * added with the proper flags. The personality has to be
+ * checked too. ;)
+ */
+ if (analyze_one_sb (realdev))
+ return -EINVAL;
/*
- * reuse slot
+ * hot_add has to bump up nb_dev itself
*/
- if (disk->number != i) {
- sti();
- MD_BUG();
- err = -EINVAL;
- goto abort_unbind_export;
+ if (md_dev[minor].pers->hot_add_disk (&md_dev[minor], dev)) {
+ /*
+ * FIXME: here we should free up the inode and stuff
+ */
+ printk ("FIXME\n");
+ return -EINVAL;
}
- } else {
- disk->number = i;
- }
-
- disk->raid_disk = disk->number;
- disk->major = MAJOR(dev);
- disk->minor = MINOR(dev);
-
- if (mddev->pers->diskop(mddev, &disk, DISKOP_HOT_ADD_DISK)) {
- sti();
- MD_BUG();
- err = -EINVAL;
- goto abort_unbind_export;
- }
-
- mark_disk_spare(disk);
- mddev->sb->nr_disks++;
- mddev->sb->spare_disks++;
- mddev->sb->working_disks++;
-
- mddev->sb_dirty = 1;
-
- sti();
- md_update_sb(mddev);
-
- /*
- * Kick recovery, maybe this spare has to be added to the
- * array immediately.
- */
- md_recover_arrays();
-
- return 0;
-
-abort_unbind_export:
- unbind_rdev_from_array(rdev);
-
-abort_export:
- export_rdev(rdev);
- return err;
-}
-
-#define SET_SB(x) mddev->sb->x = info.x
-static int set_array_info (mddev_t * mddev, void * arg)
-{
- mdu_array_info_t info;
-
- if (mddev->sb) {
- printk("array md%d already has a superblock!\n",
- mdidx(mddev));
- return -EBUSY;
- }
-
- if (md_copy_from_user(&info, arg, sizeof(info)))
- return -EFAULT;
-
- if (alloc_array_sb(mddev))
- return -ENOMEM;
-
- mddev->sb->major_version = MD_MAJOR_VERSION;
- mddev->sb->minor_version = MD_MINOR_VERSION;
- mddev->sb->patch_version = MD_PATCHLEVEL_VERSION;
- mddev->sb->ctime = CURRENT_TIME;
-
- SET_SB(level);
- SET_SB(size);
- SET_SB(nr_disks);
- SET_SB(raid_disks);
- SET_SB(md_minor);
- SET_SB(not_persistent);
-
- SET_SB(state);
- SET_SB(active_disks);
- SET_SB(working_disks);
- SET_SB(failed_disks);
- SET_SB(spare_disks);
-
- SET_SB(layout);
- SET_SB(chunk_size);
-
- mddev->sb->md_magic = MD_SB_MAGIC;
-
- /*
- * Generate a 128 bit UUID
- */
- get_random_bytes(&mddev->sb->set_uuid0, 4);
- get_random_bytes(&mddev->sb->set_uuid1, 4);
- get_random_bytes(&mddev->sb->set_uuid2, 4);
- get_random_bytes(&mddev->sb->set_uuid3, 4);
-
- return 0;
-}
-#undef SET_SB
-
-static int set_disk_info (mddev_t * mddev, void * arg)
-{
- printk("not yet");
- return -EINVAL;
-}
-
-static int clear_array (mddev_t * mddev)
-{
- printk("not yet");
- return -EINVAL;
-}
-
-static int write_raid_info (mddev_t * mddev)
-{
- printk("not yet");
- return -EINVAL;
-}
-
-static int protect_array (mddev_t * mddev)
-{
- printk("not yet");
- return -EINVAL;
-}
+ } else
+ md_dev[minor].nb_dev++;
-static int unprotect_array (mddev_t * mddev)
-{
- printk("not yet");
- return -EINVAL;
+ printk ("REGISTER_DEV %s to md%x done\n", partition_name(dev), minor);
+ return (0);
}
static int md_ioctl (struct inode *inode, struct file *file,
unsigned int cmd, unsigned long arg)
{
- unsigned int minor;
- int err = 0;
- struct hd_geometry *loc = (struct hd_geometry *) arg;
- mddev_t *mddev = NULL;
- kdev_t dev;
-
- if (!md_capable_admin())
- return -EACCES;
-
- dev = inode->i_rdev;
- minor = MINOR(dev);
- if (minor >= MAX_MD_DEVS)
- return -EINVAL;
-
- /*
- * Commands dealing with the RAID driver but not any
- * particular array:
- */
- switch (cmd)
- {
- case RAID_VERSION:
- err = get_version((void *)arg);
- goto done;
-
- case PRINT_RAID_DEBUG:
- err = 0;
- md_print_devices();
- goto done_unlock;
-
- case BLKGETSIZE: /* Return device size */
- if (!arg) {
- err = -EINVAL;
- goto abort;
- }
- err = md_put_user(md_hd_struct[minor].nr_sects,
- (long *) arg);
- goto done;
-
- case BLKFLSBUF:
- fsync_dev(dev);
- invalidate_buffers(dev);
- goto done;
-
- case BLKRASET:
- if (arg > 0xff) {
- err = -EINVAL;
- goto abort;
- }
- read_ahead[MAJOR(dev)] = arg;
- goto done;
-
- case BLKRAGET:
- if (!arg) {
- err = -EINVAL;
- goto abort;
- }
- err = md_put_user (read_ahead[
- MAJOR(dev)], (long *) arg);
- goto done;
- default:
- }
-
- /*
- * Commands creating/starting a new array:
- */
-
- mddev = kdev_to_mddev(dev);
-
- switch (cmd)
- {
- case SET_ARRAY_INFO:
- case START_ARRAY:
- if (mddev) {
- printk("array md%d already exists!\n",
- mdidx(mddev));
- err = -EEXIST;
- goto abort;
- }
- default:
- }
-
- switch (cmd)
- {
- case SET_ARRAY_INFO:
- mddev = alloc_mddev(dev);
- if (!mddev) {
- err = -ENOMEM;
- goto abort;
- }
- /*
- * alloc_mddev() should possibly self-lock.
- */
- err = lock_mddev(mddev);
- if (err) {
- printk("ioctl, reason %d, cmd %d\n",err, cmd);
- goto abort;
- }
- err = set_array_info(mddev, (void *)arg);
- goto done_unlock;
-
- case START_ARRAY:
- /*
- * possibly make it lock the array ...
- */
- err = autostart_array((kdev_t)arg);
- if (err) {
- printk("autostart %s failed!\n",
- partition_name((kdev_t)arg));
- }
- goto done;
-
- default:
- }
-
- /*
- * Commands querying/configuring an existing array:
- */
-
- if (!mddev) {
- err = -ENODEV;
- goto abort;
- }
- err = lock_mddev(mddev);
- if (err) {
- printk("ioctl lock interrupted, reason %d, cmd %d\n",err, cmd);
- goto abort;
- }
-
- /*
- * Commands even a read-only array can execute:
- */
- switch (cmd)
- {
- case GET_ARRAY_INFO:
- err = get_array_info(mddev, (void *)arg);
- goto done_unlock;
-
- case GET_DISK_INFO:
- err = get_disk_info(mddev, (void *)arg);
- goto done_unlock;
-
- case RESTART_ARRAY_RW:
- err = restart_array(mddev);
- goto done_unlock;
+ int minor, err;
+ struct hd_geometry *loc = (struct hd_geometry *) arg;
- case STOP_ARRAY:
- err = do_md_stop (mddev, 0);
- goto done_unlock;
-
- case STOP_ARRAY_RO:
- err = do_md_stop (mddev, 1);
- goto done_unlock;
-
- /*
- * We have a problem here : there is no easy way to give a CHS
- * virtual geometry. We currently pretend that we have a 2 heads
- * 4 sectors (with a BIG number of cylinders...). This drives
- * dosfs just mad... ;-)
- */
- case HDIO_GETGEO:
- if (!loc) {
- err = -EINVAL;
- goto abort_unlock;
- }
- err = md_put_user (2, (char *) &loc->heads);
- if (err)
- goto abort_unlock;
- err = md_put_user (4, (char *) &loc->sectors);
- if (err)
- goto abort_unlock;
- err = md_put_user (md_hd_struct[mdidx(mddev)].nr_sects/8,
- (short *) &loc->cylinders);
- if (err)
- goto abort_unlock;
- err = md_put_user (md_hd_struct[minor].start_sect,
- (long *) &loc->start);
- goto done_unlock;
- }
+ if (!capable(CAP_SYS_ADMIN))
+ return -EACCES;
- /*
- * The remaining ioctls are changing the state of the
- * superblock, so we do not allow read-only arrays
- * here:
- */
- if (mddev->ro) {
- err = -EROFS;
- goto abort_unlock;
- }
+ if (((minor=MINOR(inode->i_rdev)) & 0x80) &&
+ (minor & 0x7f) < MAX_PERSONALITY &&
+ pers[minor & 0x7f] &&
+ pers[minor & 0x7f]->ioctl)
+ return (pers[minor & 0x7f]->ioctl (inode, file, cmd, arg));
+
+ if (minor >= MAX_MD_DEV)
+ return -EINVAL;
- switch (cmd)
- {
- case CLEAR_ARRAY:
- err = clear_array(mddev);
- goto done_unlock;
-
- case ADD_NEW_DISK:
- err = add_new_disk(mddev, (void *)arg);
- goto done_unlock;
-
- case HOT_REMOVE_DISK:
- err = hot_remove_disk(mddev, (kdev_t)arg);
- goto done_unlock;
-
- case HOT_ADD_DISK:
- err = hot_add_disk(mddev, (kdev_t)arg);
- goto done_unlock;
-
- case SET_DISK_INFO:
- err = set_disk_info(mddev, (void *)arg);
- goto done_unlock;
-
- case WRITE_RAID_INFO:
- err = write_raid_info(mddev);
- goto done_unlock;
-
- case UNPROTECT_ARRAY:
- err = unprotect_array(mddev);
- goto done_unlock;
-
- case PROTECT_ARRAY:
- err = protect_array(mddev);
- goto done_unlock;
-
- case RUN_ARRAY:
- {
- mdu_param_t param;
+ switch (cmd)
+ {
+ case REGISTER_DEV:
+ return do_md_add (minor, to_kdev_t ((dev_t) arg));
-			err = md_copy_from_user(&param, (mdu_param_t *)arg,
- sizeof(param));
- if (err)
- goto abort_unlock;
+ case START_MD:
+ return do_md_run (minor, (int) arg);
- err = do_md_run (mddev);
- /*
- * we have to clean up the mess if
- * the array cannot be run for some
- * reason ...
- */
- if (err) {
- mddev->sb_dirty = 0;
- do_md_stop (mddev, 0);
- }
- goto done_unlock;
- }
+ case STOP_MD:
+ return do_md_stop (minor, inode);
- default:
-			printk(KERN_WARNING "%s(pid %d) used obsolete MD ioctl, upgrade your software to use new ioctls.\n", current->comm, current->pid);
- err = -EINVAL;
- goto abort_unlock;
- }
-
-done_unlock:
-abort_unlock:
- if (mddev)
- unlock_mddev(mddev);
- else
- printk("huh11?\n");
+ case BLKGETSIZE: /* Return device size */
+ if (!arg) return -EINVAL;
+ err = put_user (md_hd_struct[MINOR(inode->i_rdev)].nr_sects, (long *) arg);
+ if (err)
+ return err;
+ break;
+
+ case BLKFLSBUF:
+ fsync_dev (inode->i_rdev);
+ invalidate_buffers (inode->i_rdev);
+ break;
+
+ case BLKRASET:
+ if (arg > 0xff)
+ return -EINVAL;
+ read_ahead[MAJOR(inode->i_rdev)] = arg;
+ return 0;
+
+ case BLKRAGET:
+ if (!arg) return -EINVAL;
+ err = put_user (read_ahead[MAJOR(inode->i_rdev)], (long *) arg);
+ if (err)
+ return err;
+ break;
+
+ /* We have a problem here : there is no easy way to give a CHS
+ virtual geometry. We currently pretend that we have a 2 heads
+ 4 sectors (with a BIG number of cylinders...). This drives dosfs
+ just mad... ;-) */
+
+ case HDIO_GETGEO:
+ if (!loc) return -EINVAL;
+ err = put_user (2, (char *) &loc->heads);
+ if (err)
+ return err;
+ err = put_user (4, (char *) &loc->sectors);
+ if (err)
+ return err;
+ err = put_user (md_hd_struct[minor].nr_sects/8, (short *) &loc->cylinders);
+ if (err)
+ return err;
+ err = put_user (md_hd_struct[MINOR(inode->i_rdev)].start_sect,
+ (long *) &loc->start);
+ if (err)
+ return err;
+ break;
+
+ RO_IOCTLS(inode->i_rdev,arg);
+
+ default:
+ return -EINVAL;
+ }
- return err;
-done:
- if (err)
- printk("huh12?\n");
-abort:
- return err;
+ return (0);
}
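The `HDIO_GETGEO` branches above fabricate a CHS geometry of 2 heads and 4 sectors, so the reported cylinder count is simply the total sector count divided by 8. A user-space sketch of that arithmetic (the struct and helper names are illustrative, not kernel API):

```c
#include <assert.h>

/* Illustrative mirror of md_ioctl's fake HDIO_GETGEO geometry:
 * 2 heads x 4 sectors means 8 sectors per cylinder, so
 * cylinders = nr_sects / 8. */
struct fake_geo {
	unsigned int heads;
	unsigned int sectors;
	unsigned long cylinders;
};

static struct fake_geo md_fake_geometry(unsigned long nr_sects)
{
	struct fake_geo g;

	g.heads = 2;
	g.sectors = 4;
	g.cylinders = nr_sects / 8;	/* 2 heads * 4 sectors per cylinder */
	return g;
}
```

As the in-code comment warns, this drives dosfs mad, but it keeps heads * sectors * cylinders consistent with the device size.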
-
-#if LINUX_VERSION_CODE < LinuxVersionCode(2,1,0)
-
static int md_open (struct inode *inode, struct file *file)
{
- /*
- * Always succeed
- */
- return (0);
-}
-
-static void md_release (struct inode *inode, struct file *file)
-{
- sync_dev(inode->i_rdev);
-}
-
-
-static int md_read (struct inode *inode, struct file *file,
- char *buf, int count)
-{
- mddev_t *mddev = kdev_to_mddev(MD_FILE_TO_INODE(file)->i_rdev);
-
- if (!mddev || !mddev->pers)
- return -ENXIO;
+ int minor=MINOR(inode->i_rdev);
- return block_read (inode, file, buf, count);
+ md_dev[minor].busy++;
+ return (0); /* Always succeed */
}
-static int md_write (struct inode *inode, struct file *file,
- const char *buf, int count)
-{
- mddev_t *mddev = kdev_to_mddev(MD_FILE_TO_INODE(file)->i_rdev);
-
- if (!mddev || !mddev->pers)
- return -ENXIO;
- return block_write (inode, file, buf, count);
-}
-
-static struct file_operations md_fops=
+static int md_release (struct inode *inode, struct file *file)
{
- NULL,
- md_read,
- md_write,
- NULL,
- NULL,
- md_ioctl,
- NULL,
- md_open,
- md_release,
- block_fsync
-};
-
-#else
+ int minor=MINOR(inode->i_rdev);
-static int md_open (struct inode *inode, struct file *file)
-{
- /*
- * Always succeed
- */
- return (0);
+ sync_dev (inode->i_rdev);
+ md_dev[minor].busy--;
+ return 0;
}
-static int md_release (struct inode *inode, struct file *file)
-{
- sync_dev(inode->i_rdev);
- return 0;
-}
static ssize_t md_read (struct file *file, char *buf, size_t count,
loff_t *ppos)
{
- mddev_t *mddev = kdev_to_mddev(MD_FILE_TO_INODE(file)->i_rdev);
+ int minor=MINOR(file->f_dentry->d_inode->i_rdev);
- if (!mddev || !mddev->pers)
- return -ENXIO;
+ if (!md_dev[minor].pers) /* Check if device is being run */
+ return -ENXIO;
- return block_read(file, buf, count, ppos);
+ return block_read(file, buf, count, ppos);
}
static ssize_t md_write (struct file *file, const char *buf,
size_t count, loff_t *ppos)
{
- mddev_t *mddev = kdev_to_mddev(MD_FILE_TO_INODE(file)->i_rdev);
+ int minor=MINOR(file->f_dentry->d_inode->i_rdev);
- if (!mddev || !mddev->pers)
- return -ENXIO;
+ if (!md_dev[minor].pers) /* Check if device is being run */
+ return -ENXIO;
- return block_write(file, buf, count, ppos);
+ return block_write(file, buf, count, ppos);
}
static struct file_operations md_fops=
{
- NULL,
- md_read,
- md_write,
- NULL,
- NULL,
- md_ioctl,
- NULL,
- md_open,
- NULL,
- md_release,
- block_fsync
+ NULL,
+ md_read,
+ md_write,
+ NULL,
+ NULL,
+ md_ioctl,
+ NULL,
+ md_open,
+ NULL,
+ md_release,
+ block_fsync
};
-#endif
-
-int md_map (kdev_t dev, kdev_t *rdev,
- unsigned long *rsector, unsigned long size)
+int md_map (int minor, kdev_t *rdev, unsigned long *rsector, unsigned long size)
{
- int err;
- mddev_t *mddev = kdev_to_mddev(dev);
-
- if (!mddev || !mddev->pers) {
- err = -ENXIO;
- goto out;
- }
+ if ((unsigned int) minor >= MAX_MD_DEV)
+ {
+ printk ("Bad md device %d\n", minor);
+ return (-1);
+ }
+
+ if (!md_dev[minor].pers)
+ {
+ printk ("Oops ! md%d not running, giving up !\n", minor);
+ return (-1);
+ }
- err = mddev->pers->map(mddev, dev, rdev, rsector, size);
-out:
- return err;
+ return (md_dev[minor].pers->map(md_dev+minor, rdev, rsector, size));
}
-int md_make_request (struct buffer_head * bh, int rw)
+int md_make_request (int minor, int rw, struct buffer_head * bh)
{
- int err;
- mddev_t *mddev = kdev_to_mddev(bh->b_dev);
-
- if (!mddev || !mddev->pers) {
- err = -ENXIO;
- goto out;
- }
-
- if (mddev->pers->make_request) {
- if (buffer_locked(bh)) {
- err = 0;
- goto out;
- }
+ if (md_dev [minor].pers->make_request) {
+ if (buffer_locked(bh))
+ return 0;
set_bit(BH_Lock, &bh->b_state);
if (rw == WRITE || rw == WRITEA) {
if (!buffer_dirty(bh)) {
- bh->b_end_io(bh, buffer_uptodate(bh));
- err = 0;
- goto out;
+ bh->b_end_io(bh, test_bit(BH_Uptodate, &bh->b_state));
+ return 0;
}
}
if (rw == READ || rw == READA) {
if (buffer_uptodate(bh)) {
- bh->b_end_io(bh, buffer_uptodate(bh));
- err = 0;
- goto out;
+ bh->b_end_io(bh, test_bit(BH_Uptodate, &bh->b_state));
+ return 0;
}
}
- err = mddev->pers->make_request(mddev, rw, bh);
+ return (md_dev[minor].pers->make_request(md_dev+minor, rw, bh));
} else {
make_request (MAJOR(bh->b_rdev), rw, bh);
- err = 0;
+ return 0;
}
-out:
- return err;
}
static void do_md_request (void)
{
- printk(KERN_ALERT "Got md request, not good...");
- return;
-}
-
-int md_thread(void * arg)
-{
- mdk_thread_t *thread = arg;
-
- md_lock_kernel();
- exit_mm(current);
- exit_files(current);
- exit_fs(current);
-
- /*
- * Detach thread
- */
- sys_setsid();
- sprintf(current->comm, thread->name);
- md_init_signals();
- md_flush_signals();
- thread->tsk = current;
-
- /*
- * md_thread is a 'system-thread', it's priority should be very
- * high. We avoid resource deadlocks individually in each
- * raid personality. (RAID5 does preallocation) We also use RR and
- * the very same RT priority as kswapd, thus we will never get
- * into a priority inversion deadlock.
- *
- * we definitely have to have equal or higher priority than
- * bdflush, otherwise bdflush will deadlock if there are too
- * many dirty RAID5 blocks.
- */
- current->policy = SCHED_OTHER;
- current->priority = 40;
-
- up(thread->sem);
-
- for (;;) {
- cli();
- if (!test_bit(THREAD_WAKEUP, &thread->flags)) {
- if (!thread->run)
- break;
- interruptible_sleep_on(&thread->wqueue);
- }
- sti();
- clear_bit(THREAD_WAKEUP, &thread->flags);
- if (thread->run) {
- thread->run(thread->data);
- run_task_queue(&tq_disk);
- }
- if (md_signal_pending(current)) {
- printk("%8s(%d) flushing signals.\n", current->comm,
- current->pid);
- md_flush_signals();
- }
- }
- sti();
- up(thread->sem);
- return 0;
+ printk ("Got md request, not good...");
+ return;
}
-void md_wakeup_thread(mdk_thread_t *thread)
+void md_wakeup_thread(struct md_thread *thread)
{
set_bit(THREAD_WAKEUP, &thread->flags);
wake_up(&thread->wqueue);
}
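`md_wakeup_thread` pairs setting the `THREAD_WAKEUP` bit with a `wake_up`, and the thread body clears the bit before calling `run`, so a wakeup that arrives while work is already in progress is not lost. A single-threaded user-space model of that flag handshake (all names here are illustrative, not the kernel's):

```c
#include <assert.h>

#define TOY_WAKEUP_BIT 0

struct toy_thread {
	unsigned long flags;
	int runs;		/* counts invocations of the work body */
};

/* Waker side: set the flag; in the driver this is followed by
 * wake_up(&thread->wqueue). */
static void toy_wakeup(struct toy_thread *t)
{
	t->flags |= 1UL << TOY_WAKEUP_BIT;
}

/* Sleeper side: clear the flag *before* doing the work, so a wakeup
 * posted during the work leaves the flag set for the next pass. */
static void toy_service(struct toy_thread *t)
{
	while (t->flags & (1UL << TOY_WAKEUP_BIT)) {
		t->flags &= ~(1UL << TOY_WAKEUP_BIT);
		t->runs++;
	}
}
```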
-mdk_thread_t *md_register_thread (void (*run) (void *),
- void *data, const char *name)
+struct md_thread *md_register_thread (void (*run) (void *), void *data)
{
- mdk_thread_t *thread;
+ struct md_thread *thread = (struct md_thread *)
+ kmalloc(sizeof(struct md_thread), GFP_KERNEL);
int ret;
struct semaphore sem = MUTEX_LOCKED;
- thread = (mdk_thread_t *) kmalloc
- (sizeof(mdk_thread_t), GFP_KERNEL);
- if (!thread)
- return NULL;
+ if (!thread) return NULL;
- memset(thread, 0, sizeof(mdk_thread_t));
+ memset(thread, 0, sizeof(struct md_thread));
init_waitqueue(&thread->wqueue);
thread->sem = &sem;
thread->run = run;
thread->data = data;
- thread->name = name;
ret = kernel_thread(md_thread, thread, 0);
if (ret < 0) {
		kfree(thread);
		return NULL;
	}
	down(&sem);
	return thread;
}
-void md_interrupt_thread (mdk_thread_t *thread)
-{
- if (!thread->tsk) {
- MD_BUG();
- return;
- }
- printk("interrupting MD-thread pid %d\n", thread->tsk->pid);
- send_sig(SIGKILL, thread->tsk, 1);
-}
-
-void md_unregister_thread (mdk_thread_t *thread)
+void md_unregister_thread (struct md_thread *thread)
{
struct semaphore sem = MUTEX_LOCKED;
thread->sem = &sem;
thread->run = NULL;
- thread->name = NULL;
- if (!thread->tsk) {
- MD_BUG();
- return;
- }
- md_interrupt_thread(thread);
+ if (thread->tsk)
+ printk("Killing md_thread %d %p %s\n",
+ thread->tsk->pid, thread->tsk, thread->tsk->comm);
+ else
+ printk("Aiee. md_thread has 0 tsk\n");
+ send_sig(SIGKILL, thread->tsk, 1);
+ printk("downing on %p\n", &sem);
down(&sem);
}
-void md_recover_arrays (void)
-{
- if (!md_recovery_thread) {
- MD_BUG();
- return;
- }
- md_wakeup_thread(md_recovery_thread);
-}
-
-
-int md_error (kdev_t dev, kdev_t rdev)
-{
- mddev_t *mddev = kdev_to_mddev(dev);
- mdk_rdev_t * rrdev;
- int rc;
-
- if (!mddev) {
- MD_BUG();
- return 0;
- }
- rrdev = find_rdev(mddev, rdev);
- mark_rdev_faulty(rrdev);
- /*
- * if recovery was running, stop it now.
- */
- if (mddev->pers->stop_resync)
- mddev->pers->stop_resync(mddev);
- if (mddev->pers->error_handler) {
- rc = mddev->pers->error_handler(mddev, rdev);
- md_recover_arrays();
- return rc;
- }
-#if 0
- /*
- * Drop all buffers in the failed array.
- * _not_. This is called from IRQ handlers ...
- */
- invalidate_buffers(rdev);
-#endif
- return 0;
-}
+#define SHUTDOWN_SIGS (sigmask(SIGKILL)|sigmask(SIGINT)|sigmask(SIGTERM))
-static int status_unused (char * page)
+int md_thread(void * arg)
{
- int sz = 0, i = 0;
- mdk_rdev_t *rdev;
- struct md_list_head *tmp;
+ struct md_thread *thread = arg;
- sz += sprintf(page + sz, "unused devices: ");
+ lock_kernel();
+ exit_mm(current);
+ exit_files(current);
+ exit_fs(current);
+
+ current->session = 1;
+ current->pgrp = 1;
+ sprintf(current->comm, "md_thread");
+	siginitsetinv(&current->blocked, SHUTDOWN_SIGS);
+ thread->tsk = current;
+ up(thread->sem);
- ITERATE_RDEV_ALL(rdev,tmp) {
- if (!rdev->same_set.next && !rdev->same_set.prev) {
- /*
- * The device is not yet used by any array.
- */
- i++;
- sz += sprintf(page + sz, "%s ",
- partition_name(rdev->dev));
+ for (;;) {
+ cli();
+ if (!test_bit(THREAD_WAKEUP, &thread->flags)) {
+ do {
+				spin_lock(&current->sigmask_lock);
+				flush_signals(current);
+				spin_unlock(&current->sigmask_lock);
+ interruptible_sleep_on(&thread->wqueue);
+ cli();
+ if (test_bit(THREAD_WAKEUP, &thread->flags))
+ break;
+ if (!thread->run) {
+ sti();
+ up(thread->sem);
+ return 0;
+ }
+ } while (signal_pending(current));
+ }
+ sti();
+ clear_bit(THREAD_WAKEUP, &thread->flags);
+ if (thread->run) {
+ thread->run(thread->data);
+ run_task_queue(&tq_disk);
}
}
- if (!i)
- sz += sprintf(page + sz, "<none>");
-
- sz += sprintf(page + sz, "\n");
- return sz;
}
+EXPORT_SYMBOL(md_size);
+EXPORT_SYMBOL(md_maxreadahead);
+EXPORT_SYMBOL(register_md_personality);
+EXPORT_SYMBOL(unregister_md_personality);
+EXPORT_SYMBOL(partition_name);
+EXPORT_SYMBOL(md_dev);
+EXPORT_SYMBOL(md_error);
+EXPORT_SYMBOL(md_register_thread);
+EXPORT_SYMBOL(md_unregister_thread);
+EXPORT_SYMBOL(md_update_sb);
+EXPORT_SYMBOL(md_map);
+EXPORT_SYMBOL(md_wakeup_thread);
+EXPORT_SYMBOL(md_do_sync);
-static int status_resync (char * page, mddev_t * mddev)
-{
- int sz = 0;
- unsigned int blocksize, max_blocks, resync, res, dt, tt, et;
+#ifdef CONFIG_PROC_FS
+static struct proc_dir_entry proc_md = {
+ PROC_MD, 6, "mdstat",
+ S_IFREG | S_IRUGO, 1, 0, 0,
+ 0, &proc_array_inode_operations,
+};
+#endif
- resync = mddev->curr_resync;
- blocksize = blksize_size[MD_MAJOR][mdidx(mddev)];
- max_blocks = blk_size[MD_MAJOR][mdidx(mddev)] / (blocksize >> 10);
+static void md_geninit (struct gendisk *gdisk)
+{
+ int i;
+
+ for(i=0;i<MAX_MD_DEV;i++)
+ {
+ md_blocksizes[i] = 1024;
+ md_maxreadahead[i] = MD_DEFAULT_DISK_READAHEAD;
+ md_gendisk.part[i].start_sect=-1; /* avoid partition check */
+ md_gendisk.part[i].nr_sects=0;
+ md_dev[i].pers=NULL;
+ }
+
+ blksize_size[MD_MAJOR] = md_blocksizes;
+ max_readahead[MD_MAJOR] = md_maxreadahead;
- /*
- * Should not happen.
- */
- if (!max_blocks) {
- MD_BUG();
- return 0;
- }
- res = resync*100/max_blocks;
- if (!mddev->recovery_running)
- /*
- * true resync
- */
- sz += sprintf(page + sz, " resync=%u%%", res);
- else
- /*
- * recovery ...
- */
- sz += sprintf(page + sz, " recovery=%u%%", res);
+#ifdef CONFIG_PROC_FS
+ proc_register(&proc_root, &proc_md);
+#endif
+}
- /*
- * We do not want to overflow, so the order of operands and
- * the * 100 / 100 trick are important. We do a +1 to be
- * safe against division by zero. We only estimate anyway.
- *
- * dt: time until now
- * tt: total time
- * et: estimated finish time
- */
- dt = ((jiffies - mddev->resync_start) / HZ);
- tt = (dt * (max_blocks / (resync/100+1)))/100;
- if (tt > dt)
- et = tt - dt;
- else
- /*
- * ignore rounding effects near finish time
- */
- et = 0;
-
- sz += sprintf(page + sz, " finish=%u.%umin", et / 60, (et % 60)/6);
+int md_error (kdev_t mddev, kdev_t rdev)
+{
+ unsigned int minor = MINOR (mddev);
+ int rc;
- return sz;
+ if (MAJOR(mddev) != MD_MAJOR || minor > MAX_MD_DEV)
+ panic ("md_error gets unknown device\n");
+ if (!md_dev [minor].pers)
+ panic ("md_error gets an error for an unknown device\n");
+ if (md_dev [minor].pers->error_handler) {
+ rc = md_dev [minor].pers->error_handler (md_dev+minor, rdev);
+#if SUPPORT_RECONSTRUCTION
+ md_wakeup_thread(md_sync_thread);
+#endif /* SUPPORT_RECONSTRUCTION */
+ return rc;
+ }
+ return 0;
}
int get_md_status (char *page)
{
- int sz = 0, j, size;
- struct md_list_head *tmp, *tmp2;
- mdk_rdev_t *rdev;
- mddev_t *mddev;
-
- sz += sprintf(page + sz, "Personalities : ");
- for (j = 0; j < MAX_PERSONALITY; j++)
- if (pers[j])
- sz += sprintf(page+sz, "[%s] ", pers[j]->name);
+ int sz=0, i, j, size;
- sz += sprintf(page+sz, "\n");
+ sz+=sprintf( page+sz, "Personalities : ");
+ for (i=0; i<MAX_PERSONALITY; i++)
+ if (pers[i])
+ sz+=sprintf (page+sz, "[%d %s] ", i, pers[i]->name);
+ page[sz-1]='\n';
- sz += sprintf(page+sz, "read_ahead ");
- if (read_ahead[MD_MAJOR] == INT_MAX)
- sz += sprintf(page+sz, "not set\n");
- else
- sz += sprintf(page+sz, "%d sectors\n", read_ahead[MD_MAJOR]);
+ sz+=sprintf (page+sz, "read_ahead ");
+ if (read_ahead[MD_MAJOR]==INT_MAX)
+ sz+=sprintf (page+sz, "not set\n");
+ else
+ sz+=sprintf (page+sz, "%d sectors\n", read_ahead[MD_MAJOR]);
- ITERATE_MDDEV(mddev,tmp) {
- sz += sprintf(page + sz, "md%d : %sactive", mdidx(mddev),
- mddev->pers ? "" : "in");
- if (mddev->pers) {
- if (mddev->ro)
- sz += sprintf(page + sz, " (read-only)");
- sz += sprintf(page + sz, " %s", mddev->pers->name);
- }
+ for (i=0; i<MAX_MD_DEV; i++)
+ {
+ sz+=sprintf (page+sz, "md%d : %sactive", i, md_dev[i].pers ? "" : "in");
- size = 0;
- ITERATE_RDEV(mddev,rdev,tmp2) {
- sz += sprintf(page + sz, " %s[%d]",
- partition_name(rdev->dev), rdev->desc_nr);
- if (rdev->faulty) {
- sz += sprintf(page + sz, "(F)");
- continue;
- }
- size += rdev->size;
- }
+ if (md_dev[i].pers)
+ sz+=sprintf (page+sz, " %s", md_dev[i].pers->name);
- if (mddev->nb_dev) {
- if (mddev->pers)
- sz += sprintf(page + sz, " %d blocks",
- md_size[mdidx(mddev)]);
- else
- sz += sprintf(page + sz, " %d blocks", size);
- }
+ size=0;
+ for (j=0; j<md_dev[i].nb_dev; j++)
+ {
+ sz+=sprintf (page+sz, " %s",
+ partition_name(md_dev[i].devices[j].dev));
+ size+=md_dev[i].devices[j].size;
+ }
- if (!mddev->pers) {
- sz += sprintf(page+sz, "\n");
- continue;
- }
+ if (md_dev[i].nb_dev) {
+ if (md_dev[i].pers)
+ sz+=sprintf (page+sz, " %d blocks", md_size[i]);
+ else
+ sz+=sprintf (page+sz, " %d blocks", size);
+ }
- sz += mddev->pers->status (page+sz, mddev);
+ if (!md_dev[i].pers)
+ {
+ sz+=sprintf (page+sz, "\n");
+ continue;
+ }
- if (mddev->curr_resync)
- sz += status_resync (page+sz, mddev);
- else {
- if (md_atomic_read(&mddev->resync_sem.count) != 1)
- sz += sprintf(page + sz, " resync=DELAYED");
- }
- sz += sprintf(page + sz, "\n");
- }
- sz += status_unused (page + sz);
+ if (md_dev[i].pers->max_invalid_dev)
+ sz+=sprintf (page+sz, " maxfault=%ld", MAX_FAULT(md_dev+i));
- return (sz);
+ sz+=md_dev[i].pers->status (page+sz, i, md_dev+i);
+ sz+=sprintf (page+sz, "\n");
+ }
+
+ return (sz);
}
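Both versions of `get_md_status` build their whole `/proc` page by accumulating `sprintf` return values into `sz` and always printing at `page + sz`. The same pattern in miniature, runnable in user space (`toy_status` is a made-up name):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Accumulate formatted output into one buffer, sprintf by sprintf:
 * the sz += sprintf(page + sz, ...) idiom used by get_md_status. */
static int toy_status(char *page, int npers, const char **names)
{
	int sz = 0, i;

	sz += sprintf(page + sz, "Personalities :");
	for (i = 0; i < npers; i++)
		sz += sprintf(page + sz, " [%s]", names[i]);
	sz += sprintf(page + sz, "\n");
	return sz;	/* total bytes written, as the /proc caller expects */
}
```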
-int register_md_personality (int pnum, mdk_personality_t *p)
+int register_md_personality (int p_num, struct md_personality *p)
{
- if (pnum >= MAX_PERSONALITY)
- return -EINVAL;
+ int i=(p_num >> PERSONALITY_SHIFT);
- if (pers[pnum])
- return -EBUSY;
+ if (i >= MAX_PERSONALITY)
+ return -EINVAL;
+
+ if (pers[i])
+ return -EBUSY;
- pers[pnum] = p;
- printk(KERN_INFO "%s personality registered\n", p->name);
- return 0;
+ pers[i]=p;
+ printk ("%s personality registered\n", p->name);
+ return 0;
}
-int unregister_md_personality (int pnum)
+int unregister_md_personality (int p_num)
{
- if (pnum >= MAX_PERSONALITY)
- return -EINVAL;
+ int i=(p_num >> PERSONALITY_SHIFT);
- printk(KERN_INFO "%s personality unregistered\n", pers[pnum]->name);
- pers[pnum] = NULL;
- return 0;
+ if (i >= MAX_PERSONALITY)
+ return -EINVAL;
+
+ printk ("%s personality unregistered\n", pers[i]->name);
+ pers[i]=NULL;
+ return 0;
}
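Both versions of `register_md_personality` implement the same slot-table contract: `-EINVAL` for an out-of-range personality number, `-EBUSY` for an occupied slot. A user-space sketch of that contract (the table size and `toy_` names are illustrative):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_MAX_PERSONALITY 8

struct toy_personality {
	const char *name;
};

static struct toy_personality *toy_pers[TOY_MAX_PERSONALITY];

static int toy_register(int pnum, struct toy_personality *p)
{
	if (pnum < 0 || pnum >= TOY_MAX_PERSONALITY)
		return -EINVAL;
	if (toy_pers[pnum])
		return -EBUSY;	/* slot already taken */
	toy_pers[pnum] = p;
	return 0;
}

static int toy_unregister(int pnum)
{
	if (pnum < 0 || pnum >= TOY_MAX_PERSONALITY || !toy_pers[pnum])
		return -EINVAL;
	toy_pers[pnum] = NULL;
	return 0;
}
```

The older code additionally shifts `p_num` right by `PERSONALITY_SHIFT` before indexing; the sketch omits that encoding step.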
-static mdp_disk_t *get_spare(mddev_t *mddev)
+static md_descriptor_t *get_spare(struct md_dev *mddev)
{
- mdp_super_t *sb = mddev->sb;
- mdp_disk_t *disk;
- mdk_rdev_t *rdev;
- struct md_list_head *tmp;
-
- ITERATE_RDEV(mddev,rdev,tmp) {
- if (rdev->faulty)
- continue;
- if (!rdev->sb) {
- MD_BUG();
+ int i;
+ md_superblock_t *sb = mddev->sb;
+ md_descriptor_t *descriptor;
+ struct real_dev *realdev;
+
+ for (i = 0; i < mddev->nb_dev; i++) {
+ realdev = &mddev->devices[i];
+ if (!realdev->sb)
continue;
- }
- disk = &sb->disks[rdev->desc_nr];
- if (disk_faulty(disk)) {
- MD_BUG();
+ descriptor = &sb->disks[realdev->sb->descriptor.number];
+ if (descriptor->state & (1 << MD_FAULTY_DEVICE))
continue;
- }
- if (disk_active(disk))
+ if (descriptor->state & (1 << MD_ACTIVE_DEVICE))
continue;
- return disk;
+ return descriptor;
}
return NULL;
}
-static int is_mddev_idle (mddev_t *mddev)
-{
- mdk_rdev_t * rdev;
- struct md_list_head *tmp;
- int idle;
- unsigned long curr_events;
-
- idle = 1;
- ITERATE_RDEV(mddev,rdev,tmp) {
- curr_events = io_events[MAJOR(rdev->dev)];
-
- if (curr_events != rdev->last_events) {
-// printk("!I(%d)", curr_events-rdev->last_events);
- rdev->last_events = curr_events;
- idle = 0;
- }
- }
- return idle;
-}
-
/*
* parallel resyncing thread.
+ *
+ * FIXME: - make it abort with a dirty array on mdstop, now it just blocks
+ * - fix read error handing
*/
-/*
- * Determine correct block size for this device.
- */
-unsigned int device_bsize (kdev_t dev)
-{
- unsigned int i, correct_size;
-
- correct_size = BLOCK_SIZE;
- if (blksize_size[MAJOR(dev)]) {
- i = blksize_size[MAJOR(dev)][MINOR(dev)];
- if (i)
- correct_size = i;
- }
-
- return correct_size;
-}
-
-static struct wait_queue *resync_wait = (struct wait_queue *)NULL;
-
-#define RA_ORDER (1)
-#define RA_PAGE_SIZE (PAGE_SIZE*(1<<RA_ORDER))
-#define MAX_NR_BLOCKS (RA_PAGE_SIZE/sizeof(struct buffer_head *))
-
-int md_do_sync(mddev_t *mddev, mdp_disk_t *spare)
+int md_do_sync(struct md_dev *mddev)
{
- mddev_t *mddev2;
- struct buffer_head **bh;
- unsigned int max_blocks, blocksize, curr_bsize,
- i, ii, j, k, chunk, window, nr_blocks, err, serialize;
- kdev_t read_disk = mddev_to_kdev(mddev);
+ struct buffer_head *bh;
+ int max_blocks, blocksize, curr_bsize, percent=1, j;
+ kdev_t read_disk = MKDEV(MD_MAJOR, mddev - md_dev);
int major = MAJOR(read_disk), minor = MINOR(read_disk);
unsigned long starttime;
- int max_read_errors = 2*MAX_NR_BLOCKS,
- max_write_errors = 2*MAX_NR_BLOCKS;
- struct md_list_head *tmp;
-
-retry_alloc:
- bh = (struct buffer_head **) md__get_free_pages(GFP_KERNEL, RA_ORDER);
- if (!bh) {
- printk(KERN_ERR
- "could not alloc bh array for reconstruction ... retrying!\n");
- goto retry_alloc;
- }
-
- err = down_interruptible(&mddev->resync_sem);
- if (err)
- goto out_nolock;
-
-recheck:
- serialize = 0;
- ITERATE_MDDEV(mddev2,tmp) {
- if (mddev2 == mddev)
- continue;
- if (mddev2->curr_resync && match_mddev_units(mddev,mddev2)) {
- printk(KERN_INFO "md: serializing resync, md%d has overlapping physical units with md%d!\n", mdidx(mddev), mdidx(mddev2));
- serialize = 1;
- break;
- }
- }
- if (serialize) {
- interruptible_sleep_on(&resync_wait);
- if (md_signal_pending(current)) {
- md_flush_signals();
- err = -EINTR;
- goto out;
- }
- goto recheck;
- }
- mddev->curr_resync = 1;
-
- blocksize = device_bsize(read_disk);
+ blocksize = blksize_size[major][minor];
max_blocks = blk_size[major][minor] / (blocksize >> 10);
- printk(KERN_INFO "md: syncing RAID array md%d\n", mdidx(mddev));
- printk(KERN_INFO "md: minimum _guaranteed_ reconstruction speed: %d KB/sec.\n",
- sysctl_speed_limit);
- printk(KERN_INFO "md: using maximum available idle IO bandwith for reconstruction.\n");
-
- /*
- * Resync has low priority.
- */
- current->priority = 1;
-
- is_mddev_idle(mddev); /* this also initializes IO event counters */
- starttime = jiffies;
- mddev->resync_start = starttime;
+ printk("... resync log\n");
+ printk(" .... mddev->nb_dev: %d\n", mddev->nb_dev);
+ printk(" .... raid array: %s\n", kdevname(read_disk));
+ printk(" .... max_blocks: %d blocksize: %d\n", max_blocks, blocksize);
+ printk("md: syncing RAID array %s\n", kdevname(read_disk));
- /*
- * Tune reconstruction:
- */
- window = md_maxreadahead[mdidx(mddev)]/1024;
- nr_blocks = window / (blocksize >> 10);
- if (!nr_blocks || (nr_blocks > MAX_NR_BLOCKS))
- nr_blocks = MAX_NR_BLOCKS;
- printk(KERN_INFO "md: using %dk window.\n",window);
+ mddev->busy++;
- for (j = 0; j < max_blocks; j += nr_blocks) {
+ starttime=jiffies;
+ for (j = 0; j < max_blocks; j++) {
- if (j)
- mddev->curr_resync = j;
/*
		 * Be careful. When someone mounts a non-'blocksize' filesystem
* then we get the blocksize changed right under us. Go deal
* with it transparently, recalculate 'blocksize', 'j' and
* 'max_blocks':
*/
- curr_bsize = device_bsize(read_disk);
+ curr_bsize = blksize_size[major][minor];
if (curr_bsize != blocksize) {
- printk(KERN_INFO "md%d: blocksize changed\n",
- mdidx(mddev));
-retry_read:
+ diff_blocksize:
if (curr_bsize > blocksize)
				/*
				 * this is safe, rounds downwards.
				 */
				j /= curr_bsize/blocksize;
			else
				j *= blocksize/curr_bsize;
blocksize = curr_bsize;
- nr_blocks = window / (blocksize >> 10);
- if (!nr_blocks || (nr_blocks > MAX_NR_BLOCKS))
- nr_blocks = MAX_NR_BLOCKS;
max_blocks = blk_size[major][minor] / (blocksize >> 10);
- printk("nr_blocks changed to %d (blocksize %d, j %d, max_blocks %d)\n",
- nr_blocks, blocksize, j, max_blocks);
+ }
+ if ((bh = breada (read_disk, j, blocksize, j * blocksize,
+ max_blocks * blocksize)) != NULL) {
+ mark_buffer_dirty(bh, 1);
+ brelse(bh);
+ } else {
/*
- * We will retry the current block-group
+ * FIXME: Ugly, but set_blocksize() isnt safe ...
*/
- }
-
- /*
- * Cleanup routines expect this
- */
- for (k = 0; k < nr_blocks; k++)
- bh[k] = NULL;
-
- chunk = nr_blocks;
- if (chunk > max_blocks-j)
- chunk = max_blocks-j;
+ curr_bsize = blksize_size[major][minor];
+ if (curr_bsize != blocksize)
+ goto diff_blocksize;
- /*
- * request buffer heads ...
- */
- for (i = 0; i < chunk; i++) {
- bh[i] = getblk (read_disk, j+i, blocksize);
- if (!bh[i])
- goto read_error;
- if (!buffer_dirty(bh[i]))
- mark_buffer_lowprio(bh[i]);
+ /*
+ * It's a real read problem. FIXME, handle this
+ * a better way.
+ */
+ printk ( KERN_ALERT
+ "read error, stopping reconstruction.\n");
+ mddev->busy--;
+ return 1;
}
/*
- * read buffer heads ...
- */
- ll_rw_block (READ, chunk, bh);
- run_task_queue(&tq_disk);
-
- /*
- * verify that all of them are OK ...
+ * Let's sleep some if we are faster than our speed limit:
*/
- for (i = 0; i < chunk; i++) {
- ii = chunk-i-1;
- wait_on_buffer(bh[ii]);
- if (!buffer_uptodate(bh[ii]))
- goto read_error;
- }
-
-retry_write:
- for (i = 0; i < chunk; i++)
- mark_buffer_dirty_lowprio(bh[i]);
-
- ll_rw_block(WRITE, chunk, bh);
- run_task_queue(&tq_disk);
-
- for (i = 0; i < chunk; i++) {
- ii = chunk-i-1;
- wait_on_buffer(bh[ii]);
-
- if (spare && disk_faulty(spare)) {
- for (k = 0; k < chunk; k++)
- brelse(bh[k]);
- printk(" <SPARE FAILED!>\n ");
- err = -EIO;
- goto out;
- }
-
- if (!buffer_uptodate(bh[ii])) {
- curr_bsize = device_bsize(read_disk);
- if (curr_bsize != blocksize) {
- printk(KERN_INFO
- "md%d: blocksize changed during write\n",
- mdidx(mddev));
- for (k = 0; k < chunk; k++)
- if (bh[k]) {
- if (buffer_lowprio(bh[k]))
- mark_buffer_clean(bh[k]);
- brelse(bh[k]);
- }
- goto retry_read;
- }
- printk(" BAD WRITE %8d>\n", j);
- /*
- * Ouch, write error, retry or bail out.
- */
- if (max_write_errors) {
- max_write_errors--;
- printk ( KERN_WARNING "md%d: write error while reconstructing, at block %u(%d).\n", mdidx(mddev), j, blocksize);
- goto retry_write;
- }
- printk ( KERN_ALERT
- "too many write errors, stopping reconstruction.\n");
- for (k = 0; k < chunk; k++)
- if (bh[k]) {
- if (buffer_lowprio(bh[k]))
- mark_buffer_clean(bh[k]);
- brelse(bh[k]);
- }
- err = -EIO;
- goto out;
- }
+ while (blocksize*j/(jiffies-starttime+1)*HZ/1024 > SPEED_LIMIT)
+ {
+ current->state = TASK_INTERRUPTIBLE;
+ schedule_timeout(1);
}
/*
- * This is the normal 'everything went OK' case
- * do a 'free-behind' logic, we sure dont need
- * this buffer if it was the only user.
+ * FIXME: put this status bar thing into /proc
*/
- for (i = 0; i < chunk; i++)
- cache_drop_behind(bh[i]);
-
-
- if (md_signal_pending(current)) {
- /*
- * got a signal, exit.
- */
- mddev->curr_resync = 0;
- printk("md_do_sync() got signal ... exiting\n");
- md_flush_signals();
- err = -EINTR;
- goto out;
+ if (!(j%(max_blocks/100))) {
+ if (!(percent%10))
+ printk (" %03d%% done.\n",percent);
+ else
+ printk (".");
+ percent++;
}
-
- /*
- * this loop exits only if either when we are slower than
- * the 'hard' speed limit, or the system was IO-idle for
- * a jiffy.
- * the system might be non-idle CPU-wise, but we only care
- * about not overloading the IO subsystem. (things like an
- * e2fsck being done on the RAID array should execute fast)
- */
-repeat:
- if (md_need_resched(current))
- schedule();
-
- if ((blocksize/1024)*j/((jiffies-starttime)/HZ + 1) + 1
- > sysctl_speed_limit) {
- current->priority = 1;
-
- if (!is_mddev_idle(mddev)) {
- current->state = TASK_INTERRUPTIBLE;
- md_schedule_timeout(HZ/2);
- if (!md_signal_pending(current))
- goto repeat;
- }
- } else
- current->priority = 40;
}
fsync_dev(read_disk);
- printk(KERN_INFO "md: md%d: sync done.\n",mdidx(mddev));
- err = 0;
- /*
- * this also signals 'finished resyncing' to md_stop
- */
-out:
- up(&mddev->resync_sem);
-out_nolock:
- free_pages((unsigned long)bh, RA_ORDER);
- mddev->curr_resync = 0;
- wake_up(&resync_wait);
- return err;
-
-read_error:
- /*
- * set_blocksize() might change the blocksize. This
- * should not happen often, but it happens when eg.
- * someone mounts a filesystem that has non-1k
- * blocksize. set_blocksize() doesnt touch our
- * buffer, but to avoid aliasing problems we change
- * our internal blocksize too and retry the read.
- */
- curr_bsize = device_bsize(read_disk);
- if (curr_bsize != blocksize) {
- printk(KERN_INFO "md%d: blocksize changed during read\n",
- mdidx(mddev));
- for (k = 0; k < chunk; k++)
- if (bh[k]) {
- if (buffer_lowprio(bh[k]))
- mark_buffer_clean(bh[k]);
- brelse(bh[k]);
- }
- goto retry_read;
- }
-
- /*
- * It's a real read problem. We retry and bail out
- * only if it's excessive.
- */
- if (max_read_errors) {
- max_read_errors--;
- printk ( KERN_WARNING "md%d: read error while reconstructing, at block %u(%d).\n", mdidx(mddev), j, blocksize);
- for (k = 0; k < chunk; k++)
- if (bh[k]) {
- if (buffer_lowprio(bh[k]))
- mark_buffer_clean(bh[k]);
- brelse(bh[k]);
- }
- goto retry_read;
- }
- printk ( KERN_ALERT "too many read errors, stopping reconstruction.\n");
- for (k = 0; k < chunk; k++)
- if (bh[k]) {
- if (buffer_lowprio(bh[k]))
- mark_buffer_clean(bh[k]);
- brelse(bh[k]);
- }
- err = -EIO;
- goto out;
+ printk("md: %s: sync done.\n", kdevname(read_disk));
+ mddev->busy--;
+ return 0;
}
-#undef MAX_NR_BLOCKS
-
/*
- * This is a kernel thread which syncs a spare disk with the active array
+ * This is a kernel thread which: syncs a spare disk with the active array
*
* the amount of foolproofing might seem to be a tad excessive, but an
* early (not so error-safe) version of raid1syncd synced the first 0.5 gigs
* of my root partition with the first 0.5 gigs of my /home partition ... so
* i'm a bit nervous ;)
*/
-void md_do_recovery (void *data)
+void mdsyncd (void *data)
{
- int err;
- mddev_t *mddev;
- mdp_super_t *sb;
- mdp_disk_t *spare;
+ int i;
+ struct md_dev *mddev;
+ md_superblock_t *sb;
+ md_descriptor_t *spare;
unsigned long flags;
- struct md_list_head *tmp;
- printk(KERN_INFO "md: recovery thread got woken up ...\n");
-restart:
- ITERATE_MDDEV(mddev,tmp) {
+ for (i = 0, mddev = md_dev; i < MAX_MD_DEV; i++, mddev++) {
if ((sb = mddev->sb) == NULL)
continue;
- if (mddev->recovery_running)
- continue;
if (sb->active_disks == sb->raid_disks)
continue;
- if (!sb->spare_disks) {
- printk(KERN_ERR "md%d: no spare disk to reconstruct array! -- continuing in degraded mode\n", mdidx(mddev));
+ if (!sb->spare_disks)
continue;
- }
- /*
- * now here we get the spare and resync it.
- */
if ((spare = get_spare(mddev)) == NULL)
continue;
- printk(KERN_INFO "md%d: resyncing spare disk %s to replace failed disk\n", mdidx(mddev), partition_name(MKDEV(spare->major,spare->minor)));
- if (!mddev->pers->diskop)
+ if (!mddev->pers->mark_spare)
continue;
- if (mddev->pers->diskop(mddev, &spare, DISKOP_SPARE_WRITE))
+ if (mddev->pers->mark_spare(mddev, spare, SPARE_WRITE))
+ continue;
+ if (md_do_sync(mddev) || (spare->state & (1 << MD_FAULTY_DEVICE))) {
+ mddev->pers->mark_spare(mddev, spare, SPARE_INACTIVE);
continue;
- down(&mddev->recovery_sem);
- mddev->recovery_running = 1;
- err = md_do_sync(mddev, spare);
- if (err == -EIO) {
- printk(KERN_INFO "md%d: spare disk %s failed, skipping to next spare.\n", mdidx(mddev), partition_name(MKDEV(spare->major,spare->minor)));
- if (!disk_faulty(spare)) {
- mddev->pers->diskop(mddev,&spare,DISKOP_SPARE_INACTIVE);
- mark_disk_faulty(spare);
- mark_disk_nonsync(spare);
- mark_disk_inactive(spare);
- sb->spare_disks--;
- sb->working_disks--;
- sb->failed_disks++;
- }
- } else
- if (disk_faulty(spare))
- mddev->pers->diskop(mddev, &spare,
- DISKOP_SPARE_INACTIVE);
- if (err == -EINTR) {
- /*
- * Recovery got interrupted ...
- * signal back that we have finished using the array.
- */
- mddev->pers->diskop(mddev, &spare,
- DISKOP_SPARE_INACTIVE);
- up(&mddev->recovery_sem);
- /*
- * we keep 'recovery_running == 1', so we will not
- * start a reconstruction next time around ...
- * the stop code will set it to 0 explicitly.
- */
- goto restart;
- } else {
- mddev->recovery_running = 0;
- up(&mddev->recovery_sem);
}
save_flags(flags);
cli();
- if (!disk_faulty(spare)) {
- /*
- * the SPARE_ACTIVE diskop possibly changes the
- * pointer too
- */
- mddev->pers->diskop(mddev, &spare, DISKOP_SPARE_ACTIVE);
- mark_disk_sync(spare);
- mark_disk_active(spare);
- sb->active_disks++;
- sb->spare_disks--;
- }
- restore_flags(flags);
+ mddev->pers->mark_spare(mddev, spare, SPARE_ACTIVE);
+ spare->state |= (1 << MD_SYNC_DEVICE);
+ spare->state |= (1 << MD_ACTIVE_DEVICE);
+ sb->spare_disks--;
+ sb->active_disks++;
mddev->sb_dirty = 1;
- md_update_sb(mddev);
- goto restart;
+ md_update_sb(mddev - md_dev);
+ restore_flags(flags);
}
- printk(KERN_INFO "md: recovery thread finished ...\n");
}
-int md_notify_reboot(struct notifier_block *this,
- unsigned long code, void *x)
-{
- struct md_list_head *tmp;
- mddev_t *mddev;
-
- if ((code == MD_SYS_DOWN) || (code == MD_SYS_HALT)
- || (code == MD_SYS_POWER_OFF)) {
-
- printk(KERN_INFO "stopping all md devices.\n");
-
- ITERATE_MDDEV(mddev,tmp)
- do_md_stop (mddev, 1);
- /*
- * certain more exotic SCSI devices are known to be
- * volatile wrt too early system reboots. While the
- * right place to handle this issue is the given
- * driver, we do want to have a safe RAID driver ...
- */
- md_mdelay(1000*1);
- }
- return NOTIFY_DONE;
-}
-
-struct notifier_block md_notifier = {
- md_notify_reboot,
- NULL,
- 0
-};
-
-md__initfunc(void raid_setup(char *str, int *ints))
-{
- char tmpline[100];
- int len, pos, nr, i;
-
- len = strlen(str) + 1;
- nr = 0;
- pos = 0;
-
- for (i = 0; i < len; i++) {
- char c = str[i];
-
- if (c == ',' || !c) {
- tmpline[pos] = 0;
- if (!strcmp(tmpline,"noautodetect"))
- raid_setup_args.noautodetect = 1;
- nr++;
- pos = 0;
- continue;
- }
- tmpline[pos] = c;
- pos++;
- }
- raid_setup_args.set = 1;
- return;
-}
-
#ifdef CONFIG_MD_BOOT
struct {
int set;
int ints[100];
char str[100];
-} md_setup_args md__initdata = {
+} md_setup_args __initdata = {
0,{0},{0}
};
/* called from init/main.c */
-md__initfunc(void md_setup(char *str,int *ints))
+__initfunc(void md_setup(char *str,int *ints))
{
int i;
for(i=0;i<=ints[0];i++) {
return;
}
-md__initfunc(void do_md_setup(char *str,int *ints))
+__initfunc(void do_md_setup(char *str,int *ints))
{
-#if 0
- int minor, pers, chunk_size, fault;
+ int minor, pers, factor, fault;
kdev_t dev;
int i=1;
- printk("i plan to phase this out --mingo\n");
-
if(ints[0] < 4) {
- printk (KERN_WARNING "md: Too few Arguments (%d).\n", ints[0]);
+		printk ("md: Too few arguments (%d).\n", ints[0]);
return;
}
minor=ints[i++];
- if ((unsigned int)minor >= MAX_MD_DEVS) {
- printk (KERN_WARNING "md: Minor device number too high.\n");
+ if (minor >= MAX_MD_DEV) {
+ printk ("md: Minor device number too high.\n");
return;
}
case -1:
#ifdef CONFIG_MD_LINEAR
pers = LINEAR;
- printk (KERN_INFO "md: Setting up md%d as linear device.\n",
- minor);
+ printk ("md: Setting up md%d as linear device.\n",minor);
#else
- printk (KERN_WARNING "md: Linear mode not configured."
+		printk ("md: Linear mode not configured. "
"Recompile the kernel with linear mode enabled!\n");
#endif
break;
case 0:
pers = STRIPED;
#ifdef CONFIG_MD_STRIPED
- printk (KERN_INFO "md: Setting up md%d as a striped device.\n",
- minor);
+ printk ("md: Setting up md%d as a striped device.\n",minor);
#else
- printk (KERN_WARNING "md: Striped mode not configured."
+		printk ("md: Striped mode not configured. "
"Recompile the kernel with striped mode enabled!\n");
#endif
break;
break;
*/
default:
- printk (KERN_WARNING "md: Unknown or not supported raid level %d.\n", ints[--i]);
+		printk ("md: Unknown or unsupported raid level %d.\n", ints[--i]);
return;
}
- if (pers) {
+ if(pers) {
- chunk_size = ints[i++]; /* Chunksize */
- fault = ints[i++]; /* Faultlevel */
+ factor=ints[i++]; /* Chunksize */
+ fault =ints[i++]; /* Faultlevel */
- pers = pers | chunk_size | (fault << FAULT_SHIFT);
+ pers=pers | factor | (fault << FAULT_SHIFT);
- while( str && (dev = name_to_kdev_t(str))) {
- do_md_add (minor, dev);
- if((str = strchr (str, ',')) != NULL)
- str++;
- }
+ while( str && (dev = name_to_kdev_t(str))) {
+ do_md_add (minor, dev);
+ if((str = strchr (str, ',')) != NULL)
+ str++;
+ }
- do_md_run (minor, pers);
- printk (KERN_INFO "md: Loading md%d.\n",minor);
+ do_md_run (minor, pers);
+ printk ("md: Loading md%d.\n",minor);
}
-#endif
+
}
#endif
-void hsm_init (void);
-void translucent_init (void);
void linear_init (void);
void raid0_init (void);
void raid1_init (void);
void raid5_init (void);
-md__initfunc(int md_init (void))
+__initfunc(int md_init (void))
{
- static char * name = "mdrecoveryd";
-
- printk (KERN_INFO "md driver %d.%d.%d MAX_MD_DEVS=%d, MAX_REAL=%d\n",
- MD_MAJOR_VERSION, MD_MINOR_VERSION,
- MD_PATCHLEVEL_VERSION, MAX_MD_DEVS, MAX_REAL);
-
- if (register_blkdev (MD_MAJOR, "md", &md_fops))
- {
- printk (KERN_ALERT "Unable to get major %d for md\n", MD_MAJOR);
- return (-1);
- }
+ printk ("md driver %d.%d.%d MAX_MD_DEV=%d, MAX_REAL=%d\n",
+ MD_MAJOR_VERSION, MD_MINOR_VERSION, MD_PATCHLEVEL_VERSION,
+ MAX_MD_DEV, MAX_REAL);
- blk_dev[MD_MAJOR].request_fn = DEVICE_REQUEST;
- blk_dev[MD_MAJOR].current_request = NULL;
- read_ahead[MD_MAJOR] = INT_MAX;
- md_gendisk.next = gendisk_head;
+ if (register_blkdev (MD_MAJOR, "md", &md_fops))
+ {
+ printk ("Unable to get major %d for md\n", MD_MAJOR);
+ return (-1);
+ }
- gendisk_head = &md_gendisk;
+ blk_dev[MD_MAJOR].request_fn=DEVICE_REQUEST;
+ blk_dev[MD_MAJOR].current_request=NULL;
+ read_ahead[MD_MAJOR]=INT_MAX;
+ memset(md_dev, 0, MAX_MD_DEV * sizeof (struct md_dev));
+ md_gendisk.next=gendisk_head;
- md_recovery_thread = md_register_thread(md_do_recovery, NULL, name);
- if (!md_recovery_thread)
- printk(KERN_ALERT "bug: couldn't allocate md_recovery_thread\n");
+ gendisk_head=&md_gendisk;
- md_register_reboot_notifier(&md_notifier);
- md_register_sysctl();
+#if SUPPORT_RECONSTRUCTION
+ if ((md_sync_thread = md_register_thread(mdsyncd, NULL)) == NULL)
+ printk("md: bug: md_sync_thread == NULL\n");
+#endif /* SUPPORT_RECONSTRUCTION */
-#ifdef CONFIG_MD_HSM
- hsm_init ();
-#endif
-#ifdef CONFIG_MD_TRANSLUCENT
- translucent_init ();
-#endif
#ifdef CONFIG_MD_LINEAR
- linear_init ();
+ linear_init ();
#endif
#ifdef CONFIG_MD_STRIPED
- raid0_init ();
+ raid0_init ();
#endif
#ifdef CONFIG_MD_MIRRORING
- raid1_init ();
+ raid1_init ();
#endif
#ifdef CONFIG_MD_RAID5
- raid5_init ();
-#endif
-#if defined(CONFIG_MD_RAID5) || defined(CONFIG_MD_RAID5_MODULE)
- /*
- * pick a XOR routine, runtime.
- */
- calibrate_xor_block();
+ raid5_init ();
#endif
-
- return (0);
+ return (0);
}
#ifdef CONFIG_MD_BOOT
-md__initfunc(void md_setup_drive(void))
+__initfunc(void md_setup_drive(void))
{
if(md_setup_args.set)
do_md_setup(md_setup_args.str, md_setup_args.ints);
}
#endif
-
-MD_EXPORT_SYMBOL(md_size);
-MD_EXPORT_SYMBOL(register_md_personality);
-MD_EXPORT_SYMBOL(unregister_md_personality);
-MD_EXPORT_SYMBOL(partition_name);
-MD_EXPORT_SYMBOL(md_error);
-MD_EXPORT_SYMBOL(md_recover_arrays);
-MD_EXPORT_SYMBOL(md_register_thread);
-MD_EXPORT_SYMBOL(md_unregister_thread);
-MD_EXPORT_SYMBOL(md_update_sb);
-MD_EXPORT_SYMBOL(md_map);
-MD_EXPORT_SYMBOL(md_wakeup_thread);
-MD_EXPORT_SYMBOL(md_do_sync);
-MD_EXPORT_SYMBOL(md_print_devices);
-MD_EXPORT_SYMBOL(find_rdev_nr);
-MD_EXPORT_SYMBOL(md_check_ordering);
-MD_EXPORT_SYMBOL(md_interrupt_thread);
-MD_EXPORT_SYMBOL(mddev_map);
-
-#ifdef CONFIG_PROC_FS
-static struct proc_dir_entry proc_md = {
- PROC_MD, 6, "mdstat",
- S_IFREG | S_IRUGO, 1, 0, 0,
- 0, &proc_array_inode_operations,
-};
-#endif
-
-static void md_geninit (struct gendisk *gdisk)
-{
- int i;
-
- for(i = 0; i < MAX_MD_DEVS; i++) {
- md_blocksizes[i] = 1024;
- md_maxreadahead[i] = MD_READAHEAD;
- md_gendisk.part[i].start_sect = -1; /* avoid partition check */
- md_gendisk.part[i].nr_sects = 0;
- }
-
- printk("md.c: sizeof(mdp_super_t) = %d\n", (int)sizeof(mdp_super_t));
-
- blksize_size[MD_MAJOR] = md_blocksizes;
- md_set_global_readahead(md_maxreadahead);
-
-#ifdef CONFIG_PROC_FS
- proc_register(&proc_root, &proc_md);
-#endif
-}
-
+
/*
raid0.c : Multiple Devices driver for Linux
Copyright (C) 1994-96 Marc ZYNGIER
*/
#include <linux/module.h>
-#include <linux/raid/raid0.h>
+#include <linux/md.h>
+#include <linux/raid0.h>
+#include <linux/vmalloc.h>
#define MAJOR_NR MD_MAJOR
#define MD_DRIVER
#define MD_PERSONALITY
-static int create_strip_zones (mddev_t *mddev)
+static int create_strip_zones (int minor, struct md_dev *mddev)
{
- int i, c, j, j1, j2;
- int current_offset, curr_zone_offset;
- raid0_conf_t *conf = mddev_to_conf(mddev);
- mdk_rdev_t *smallest, *rdev1, *rdev2, *rdev;
-
- /*
- * The number of 'same size groups'
- */
- conf->nr_strip_zones = 0;
-
- ITERATE_RDEV_ORDERED(mddev,rdev1,j1) {
- printk("raid0: looking at %s\n", partition_name(rdev1->dev));
- c = 0;
- ITERATE_RDEV_ORDERED(mddev,rdev2,j2) {
- printk("raid0: comparing %s(%d) with %s(%d)\n", partition_name(rdev1->dev), rdev1->size, partition_name(rdev2->dev), rdev2->size);
- if (rdev2 == rdev1) {
- printk("raid0: END\n");
- break;
- }
- if (rdev2->size == rdev1->size)
- {
- /*
- * Not unique, dont count it as a new
- * group
- */
- printk("raid0: EQUAL\n");
- c = 1;
- break;
- }
- printk("raid0: NOT EQUAL\n");
- }
- if (!c) {
- printk("raid0: ==> UNIQUE\n");
- conf->nr_strip_zones++;
- printk("raid0: %d zones\n", conf->nr_strip_zones);
- }
- }
- printk("raid0: FINAL %d zones\n", conf->nr_strip_zones);
-
- conf->strip_zone = vmalloc(sizeof(struct strip_zone)*
- conf->nr_strip_zones);
- if (!conf->strip_zone)
- return 1;
-
-
- conf->smallest = NULL;
- current_offset = 0;
- curr_zone_offset = 0;
-
- for (i = 0; i < conf->nr_strip_zones; i++)
- {
- struct strip_zone *zone = conf->strip_zone + i;
-
- printk("zone %d\n", i);
- zone->dev_offset = current_offset;
- smallest = NULL;
- c = 0;
-
- ITERATE_RDEV_ORDERED(mddev,rdev,j) {
-
- printk(" checking %s ...", partition_name(rdev->dev));
- if (rdev->size > current_offset)
- {
- printk(" contained as device %d\n", c);
- zone->dev[c] = rdev;
- c++;
- if (!smallest || (rdev->size <smallest->size)) {
- smallest = rdev;
- printk(" (%d) is smallest!.\n", rdev->size);
- }
- } else
- printk(" nope.\n");
- }
-
- zone->nb_dev = c;
- zone->size = (smallest->size - current_offset) * c;
- printk(" zone->nb_dev: %d, size: %d\n",zone->nb_dev,zone->size);
-
- if (!conf->smallest || (zone->size < conf->smallest->size))
- conf->smallest = zone;
-
- zone->zone_offset = curr_zone_offset;
- curr_zone_offset += zone->size;
-
- current_offset = smallest->size;
- printk("current zone offset: %d\n", current_offset);
- }
- printk("done.\n");
- return 0;
+ int i, j, c=0;
+ int current_offset=0;
+ struct real_dev *smallest_by_zone;
+ struct raid0_data *data=(struct raid0_data *) mddev->private;
+
+ data->nr_strip_zones=1;
+
+ for (i=1; i<mddev->nb_dev; i++)
+ {
+ for (j=0; j<i; j++)
+ if (mddev->devices[i].size==mddev->devices[j].size)
+ {
+ c=1;
+ break;
+ }
+
+ if (!c)
+ data->nr_strip_zones++;
+
+ c=0;
+ }
+
+ if ((data->strip_zone=vmalloc(sizeof(struct strip_zone)*data->nr_strip_zones)) == NULL)
+ return 1;
+
+ data->smallest=NULL;
+
+ for (i=0; i<data->nr_strip_zones; i++)
+ {
+ data->strip_zone[i].dev_offset=current_offset;
+ smallest_by_zone=NULL;
+ c=0;
+
+ for (j=0; j<mddev->nb_dev; j++)
+ if (mddev->devices[j].size>current_offset)
+ {
+ data->strip_zone[i].dev[c++]=mddev->devices+j;
+ if (!smallest_by_zone ||
+ smallest_by_zone->size > mddev->devices[j].size)
+ smallest_by_zone=mddev->devices+j;
+ }
+
+ data->strip_zone[i].nb_dev=c;
+ data->strip_zone[i].size=(smallest_by_zone->size-current_offset)*c;
+
+ if (!data->smallest ||
+ data->smallest->size > data->strip_zone[i].size)
+ data->smallest=data->strip_zone+i;
+
+ data->strip_zone[i].zone_offset=i ? (data->strip_zone[i-1].zone_offset+
+ data->strip_zone[i-1].size) : 0;
+ current_offset=smallest_by_zone->size;
+ }
+ return 0;
}
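The zone-counting pass at the top of create_strip_zones() is easy to check in isolation: the number of strip zones equals the number of distinct member-device sizes, since all devices sharing a size run out of space at the same offset. A minimal userspace sketch; count_strip_zones and the size array are illustrative names, not kernel symbols:

```c
#include <assert.h>

/* Count distinct device sizes: each distinct size opens a new zone.
 * This mirrors the duplicate-scan loop in create_strip_zones(). */
static int count_strip_zones(const long *size, int nb_dev)
{
	int i, j, zones = 0;

	for (i = 0; i < nb_dev; i++) {
		int seen = 0;
		for (j = 0; j < i; j++)
			if (size[i] == size[j]) {
				seen = 1;
				break;
			}
		if (!seen)
			zones++;
	}
	return zones;
}
```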
-static int raid0_run (mddev_t *mddev)
+static int raid0_run (int minor, struct md_dev *mddev)
{
- int cur=0, i=0, size, zone0_size, nb_zone;
- raid0_conf_t *conf;
-
- MOD_INC_USE_COUNT;
-
- conf = vmalloc(sizeof (raid0_conf_t));
- if (!conf)
- goto out;
- mddev->private = (void *)conf;
-
- if (md_check_ordering(mddev)) {
- printk("raid0: disks are not ordered, aborting!\n");
- goto out_free_conf;
- }
-
- if (create_strip_zones (mddev))
- goto out_free_conf;
-
- printk("raid0 : md_size is %d blocks.\n", md_size[mdidx(mddev)]);
- printk("raid0 : conf->smallest->size is %d blocks.\n", conf->smallest->size);
- nb_zone = md_size[mdidx(mddev)]/conf->smallest->size +
- (md_size[mdidx(mddev)] % conf->smallest->size ? 1 : 0);
- printk("raid0 : nb_zone is %d.\n", nb_zone);
- conf->nr_zones = nb_zone;
-
- printk("raid0 : Allocating %d bytes for hash.\n",
- sizeof(struct raid0_hash)*nb_zone);
-
- conf->hash_table = vmalloc (sizeof (struct raid0_hash)*nb_zone);
- if (!conf->hash_table)
- goto out_free_zone_conf;
- size = conf->strip_zone[cur].size;
-
- i = 0;
- while (cur < conf->nr_strip_zones) {
- conf->hash_table[i].zone0 = conf->strip_zone + cur;
-
- /*
- * If we completely fill the slot
- */
- if (size >= conf->smallest->size) {
- conf->hash_table[i++].zone1 = NULL;
- size -= conf->smallest->size;
-
- if (!size) {
- if (++cur == conf->nr_strip_zones)
- continue;
- size = conf->strip_zone[cur].size;
- }
- continue;
- }
- if (++cur == conf->nr_strip_zones) {
- /*
- * Last dev, set unit1 as NULL
- */
- conf->hash_table[i].zone1=NULL;
- continue;
- }
-
- /*
- * Here we use a 2nd dev to fill the slot
- */
- zone0_size = size;
- size = conf->strip_zone[cur].size;
- conf->hash_table[i++].zone1 = conf->strip_zone + cur;
- size -= (conf->smallest->size - zone0_size);
- }
- return 0;
-
-out_free_zone_conf:
- vfree(conf->strip_zone);
- conf->strip_zone = NULL;
-
-out_free_conf:
- vfree(conf);
- mddev->private = NULL;
-out:
- MOD_DEC_USE_COUNT;
- return 1;
+ int cur=0, i=0, size, zone0_size, nb_zone;
+ struct raid0_data *data;
+
+ MOD_INC_USE_COUNT;
+
+ if ((mddev->private=vmalloc (sizeof (struct raid0_data))) == NULL) return 1;
+ data=(struct raid0_data *) mddev->private;
+
+ if (create_strip_zones (minor, mddev))
+ {
+ vfree(data);
+ return 1;
+ }
+
+ nb_zone=data->nr_zones=
+ md_size[minor]/data->smallest->size +
+ (md_size[minor]%data->smallest->size ? 1 : 0);
+
+ printk ("raid0 : Allocating %ld bytes for hash.\n",(long)sizeof(struct raid0_hash)*nb_zone);
+ if ((data->hash_table=vmalloc (sizeof (struct raid0_hash)*nb_zone)) == NULL)
+ {
+ vfree(data->strip_zone);
+ vfree(data);
+ return 1;
+ }
+ size=data->strip_zone[cur].size;
+
+ i=0;
+ while (cur<data->nr_strip_zones)
+ {
+ data->hash_table[i].zone0=data->strip_zone+cur;
+
+ if (size>=data->smallest->size)/* If we completely fill the slot */
+ {
+ data->hash_table[i++].zone1=NULL;
+ size-=data->smallest->size;
+
+ if (!size)
+ {
+ if (++cur==data->nr_strip_zones) continue;
+ size=data->strip_zone[cur].size;
+ }
+
+ continue;
+ }
+
+ if (++cur==data->nr_strip_zones) /* Last dev, set unit1 as NULL */
+ {
+ data->hash_table[i].zone1=NULL;
+ continue;
+ }
+
+ zone0_size=size; /* Here, we use a 2nd dev to fill the slot */
+ size=data->strip_zone[cur].size;
+ data->hash_table[i++].zone1=data->strip_zone+cur;
+ size-=(data->smallest->size - zone0_size);
+ }
+
+ return (0);
}
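The slot-filling loop in raid0_run() is dense; the following standalone sketch reproduces it with zone indices instead of pointers (struct slot, build_hash and the -1 sentinel are assumptions for illustration). Each slot spans 'smallest' blocks, so a block index divided by 'smallest' lands in one slot, and a slot straddles at most two zones, hence zone0/zone1:

```c
#include <assert.h>

struct slot { int zone0, zone1; };	/* zone indices; -1 = none */

/* Fill the lookup table the way raid0_run() does; the caller
 * precomputes the slot count as ceil(total_size / smallest),
 * exactly as raid0_run() computes nb_zone. */
static void build_hash(const long *zone_size, int nr_zones,
		       long smallest, struct slot *slots)
{
	int cur = 0, i = 0;
	long size = zone_size[0], zone0_size;

	while (cur < nr_zones) {
		slots[i].zone0 = cur;

		if (size >= smallest) {		/* one zone fills the slot */
			slots[i++].zone1 = -1;
			size -= smallest;
			if (!size) {
				if (++cur == nr_zones)
					continue;
				size = zone_size[cur];
			}
			continue;
		}
		if (++cur == nr_zones) {	/* last zone, partial slot */
			slots[i].zone1 = -1;
			continue;
		}
		zone0_size = size;		/* next zone tops up the slot */
		size = zone_size[cur];
		slots[i++].zone1 = cur;
		size -= smallest - zone0_size;
	}
}
```

With zone sizes {10, 4} and smallest = 4, the 14 blocks make four slots: the third straddles both zones and the last is a partial slot in zone 1.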
-static int raid0_stop (mddev_t *mddev)
+
+static int raid0_stop (int minor, struct md_dev *mddev)
{
- raid0_conf_t *conf = mddev_to_conf(mddev);
+ struct raid0_data *data=(struct raid0_data *) mddev->private;
- vfree (conf->hash_table);
- conf->hash_table = NULL;
- vfree (conf->strip_zone);
- conf->strip_zone = NULL;
- vfree (conf);
- mddev->private = NULL;
+ vfree (data->hash_table);
+ vfree (data->strip_zone);
+ vfree (data);
- MOD_DEC_USE_COUNT;
- return 0;
+ MOD_DEC_USE_COUNT;
+ return 0;
}
/*
* Of course, those facts may not be valid anymore (and surely won't...)
* Hey guys, there's some work out there ;-)
*/
-static int raid0_map (mddev_t *mddev, kdev_t dev, kdev_t *rdev,
+static int raid0_map (struct md_dev *mddev, kdev_t *rdev,
unsigned long *rsector, unsigned long size)
{
- raid0_conf_t *conf = mddev_to_conf(mddev);
- struct raid0_hash *hash;
- struct strip_zone *zone;
- mdk_rdev_t *tmp_dev;
- int blk_in_chunk, chunksize_bits, chunk, chunk_size;
- long block, rblock;
-
- chunk_size = mddev->param.chunk_size >> 10;
- chunksize_bits = ffz(~chunk_size);
- block = *rsector >> 1;
- hash = conf->hash_table + block / conf->smallest->size;
-
- /* Sanity check */
- if ((chunk_size * 2) < (*rsector % (chunk_size * 2)) + size)
- goto bad_map;
-
- if (!hash)
- goto bad_hash;
-
- if (!hash->zone0)
- goto bad_zone0;
-
- if (block >= (hash->zone0->size + hash->zone0->zone_offset)) {
- if (!hash->zone1)
- goto bad_zone1;
- zone = hash->zone1;
- } else
- zone = hash->zone0;
+ struct raid0_data *data=(struct raid0_data *) mddev->private;
+ static struct raid0_hash *hash;
+ struct strip_zone *zone;
+ struct real_dev *tmp_dev;
+ int blk_in_chunk, factor, chunk, chunk_size;
+ long block, rblock;
+
+ factor=FACTOR(mddev);
+ chunk_size=(1UL << FACTOR_SHIFT(factor));
+ block=*rsector >> 1;
+ hash=data->hash_table+(block/data->smallest->size);
+
+ /* Sanity check */
+ if ((chunk_size*2)<(*rsector % (chunk_size*2))+size)
+ {
+ printk ("raid0_convert : can't convert block across chunks or bigger than %dk %ld %ld\n", chunk_size, *rsector, size);
+ return (-1);
+ }
+
+ if (block >= (hash->zone0->size +
+ hash->zone0->zone_offset))
+ {
+ if (!hash->zone1)
+ {
+ printk ("raid0_convert : hash->zone1==NULL for block %ld\n", block);
+ return (-1);
+ }
+
+ zone=hash->zone1;
+ }
+ else
+ zone=hash->zone0;
- blk_in_chunk = block & (chunk_size -1);
- chunk = (block - zone->zone_offset) / (zone->nb_dev << chunksize_bits);
- tmp_dev = zone->dev[(block >> chunksize_bits) % zone->nb_dev];
- rblock = (chunk << chunksize_bits) + blk_in_chunk + zone->dev_offset;
+ blk_in_chunk=block & (chunk_size -1);
+ chunk=(block - zone->zone_offset) / (zone->nb_dev<<FACTOR_SHIFT(factor));
+ tmp_dev=zone->dev[(block >> FACTOR_SHIFT(factor)) % zone->nb_dev];
+ rblock=(chunk << FACTOR_SHIFT(factor)) + blk_in_chunk + zone->dev_offset;
- *rdev = tmp_dev->dev;
- *rsector = rblock << 1;
-
- return 0;
-
-bad_map:
- printk ("raid0_map bug: can't convert block across chunks or bigger than %dk %ld %ld\n", chunk_size, *rsector, size);
- return -1;
-bad_hash:
- printk("raid0_map bug: hash==NULL for block %ld\n", block);
- return -1;
-bad_zone0:
- printk ("raid0_map bug: hash->zone0==NULL for block %ld\n", block);
- return -1;
-bad_zone1:
- printk ("raid0_map bug: hash->zone1==NULL for block %ld\n", block);
- return -1;
+ *rdev=tmp_dev->dev;
+ *rsector=rblock<<1;
+
+ return (0);
}
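The address arithmetic in raid0_map() can be checked on its own: with a power-of-two chunk of (1 << shift) blocks and nb_dev disks in the zone, a zone-relative block number splits into a stripe number, a disk index, and an offset inside the chunk. A hedged sketch; the names are illustrative, and zone_offset/dev_offset are taken as zero:

```c
#include <assert.h>

struct mapping { int disk; long rblock; };

/* Same arithmetic as raid0_map() for a single zone starting at
 * offset zero: the chunk index advances once per full stripe. */
static struct mapping raid0_map_block(long block, int shift, int nb_dev)
{
	struct mapping m;
	long blk_in_chunk = block & ((1L << shift) - 1);
	long chunk = block / ((long)nb_dev << shift);

	m.disk = (int)((block >> shift) % nb_dev);
	m.rblock = (chunk << shift) + blk_in_chunk;
	return m;
}
```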
-static int raid0_status (char *page, mddev_t *mddev)
+static int raid0_status (char *page, int minor, struct md_dev *mddev)
{
- int sz = 0;
+ int sz=0;
#undef MD_DEBUG
#ifdef MD_DEBUG
- int j, k;
- raid0_conf_t *conf = mddev_to_conf(mddev);
+ int j, k;
+ struct raid0_data *data=(struct raid0_data *) mddev->private;
- sz += sprintf(page + sz, " ");
- for (j = 0; j < conf->nr_zones; j++) {
- sz += sprintf(page + sz, "[z%d",
- conf->hash_table[j].zone0 - conf->strip_zone);
- if (conf->hash_table[j].zone1)
- sz += sprintf(page+sz, "/z%d] ",
- conf->hash_table[j].zone1 - conf->strip_zone);
- else
- sz += sprintf(page+sz, "] ");
- }
+ sz+=sprintf (page+sz, " ");
+ for (j=0; j<data->nr_zones; j++)
+ {
+ sz+=sprintf (page+sz, "[z%d",
+ data->hash_table[j].zone0-data->strip_zone);
+ if (data->hash_table[j].zone1)
+ sz+=sprintf (page+sz, "/z%d] ",
+ data->hash_table[j].zone1-data->strip_zone);
+ else
+ sz+=sprintf (page+sz, "] ");
+ }
- sz += sprintf(page + sz, "\n");
+ sz+=sprintf (page+sz, "\n");
- for (j = 0; j < conf->nr_strip_zones; j++) {
- sz += sprintf(page + sz, " z%d=[", j);
- for (k = 0; k < conf->strip_zone[j].nb_dev; k++)
- sz += sprintf (page+sz, "%s/", partition_name(
- conf->strip_zone[j].dev[k]->dev));
- sz--;
- sz += sprintf (page+sz, "] zo=%d do=%d s=%d\n",
- conf->strip_zone[j].zone_offset,
- conf->strip_zone[j].dev_offset,
- conf->strip_zone[j].size);
- }
+ for (j=0; j<data->nr_strip_zones; j++)
+ {
+ sz+=sprintf (page+sz, " z%d=[", j);
+ for (k=0; k<data->strip_zone[j].nb_dev; k++)
+ sz+=sprintf (page+sz, "%s/",
+ partition_name(data->strip_zone[j].dev[k]->dev));
+ sz--;
+ sz+=sprintf (page+sz, "] zo=%d do=%d s=%d\n",
+ data->strip_zone[j].zone_offset,
+ data->strip_zone[j].dev_offset,
+ data->strip_zone[j].size);
+ }
#endif
- sz += sprintf(page + sz, " %dk chunks", mddev->param.chunk_size/1024);
- return sz;
+ sz+=sprintf (page+sz, " %dk chunks", 1<<FACTOR_SHIFT(FACTOR(mddev)));
+ return sz;
}
-static mdk_personality_t raid0_personality=
+
+static struct md_personality raid0_personality=
{
- "raid0",
- raid0_map,
- NULL, /* no special make_request */
- NULL, /* no special end_request */
- raid0_run,
- raid0_stop,
- raid0_status,
- NULL, /* no ioctls */
- 0,
- NULL, /* no error_handler */
- NULL, /* no diskop */
- NULL, /* no stop resync */
- NULL /* no restart resync */
+ "raid0",
+ raid0_map,
+ NULL, /* no special make_request */
+ NULL, /* no special end_request */
+ raid0_run,
+ raid0_stop,
+ raid0_status,
+ NULL, /* no ioctls */
+ 0,
+ NULL, /* no error_handler */
+ NULL, /* hot_add_disk */
+ NULL, /* hot_remove_disk */
+ NULL /* mark_spare */
};
+
#ifndef MODULE
void raid0_init (void)
{
- register_md_personality (RAID0, &raid0_personality);
+ register_md_personality (RAID0, &raid0_personality);
}
#else
int init_module (void)
{
- return (register_md_personality (RAID0, &raid0_personality));
+ return (register_md_personality (RAID0, &raid0_personality));
}
void cleanup_module (void)
{
- unregister_md_personality (RAID0);
+ unregister_md_personality (RAID0);
}
#endif
-
-/*
+/************************************************************************
* raid1.c : Multiple Devices driver for Linux
- * Copyright (C) 1996, 1997, 1998 Ingo Molnar, Miguel de Icaza, Gadi Oxman
+ * Copyright (C) 1996 Ingo Molnar, Miguel de Icaza, Gadi Oxman
*
* RAID-1 management functions.
*
*/
#include <linux/module.h>
+#include <linux/locks.h>
#include <linux/malloc.h>
-#include <linux/raid/raid1.h>
+#include <linux/md.h>
+#include <linux/raid1.h>
+#include <asm/bitops.h>
#include <asm/atomic.h>
#define MAJOR_NR MD_MAJOR
#define MD_DRIVER
#define MD_PERSONALITY
-#define MAX_LINEAR_SECTORS 128
+/*
+ * The following can be used to debug the driver
+ */
+/*#define RAID1_DEBUG*/
+#ifdef RAID1_DEBUG
+#define PRINTK(x)   do { printk x; } while (0)
+#else
+#define PRINTK(x)   do { ; } while (0)
+#endif
#define MAX(a,b) ((a) > (b) ? (a) : (b))
#define MIN(a,b) ((a) < (b) ? (a) : (b))
-static mdk_personality_t raid1_personality;
+static struct md_personality raid1_personality;
+static struct md_thread *raid1_thread = NULL;
struct buffer_head *raid1_retry_list = NULL;
-static void * raid1_kmalloc (int size)
-{
- void * ptr;
- /*
- * now we are rather fault tolerant than nice, but
- * there are a couple of places in the RAID code where we
- * simply can not afford to fail an allocation because
- * there is no failure return path (eg. make_request())
- */
- while (!(ptr = kmalloc (sizeof (raid1_conf_t), GFP_KERNEL)))
- printk ("raid1: out of memory, retrying...\n");
-
- memset(ptr, 0, size);
- return ptr;
-}
-
-static int __raid1_map (mddev_t *mddev, kdev_t *rdev,
+static int __raid1_map (struct md_dev *mddev, kdev_t *rdev,
unsigned long *rsector, unsigned long size)
{
- raid1_conf_t *conf = mddev_to_conf(mddev);
- int i, disks = MD_SB_DISKS;
+ struct raid1_data *raid_conf = (struct raid1_data *) mddev->private;
+ int i, n = raid_conf->raid_disks;
/*
* Later we do read balancing on the read side
* now we use the first available disk.
*/
- for (i = 0; i < disks; i++) {
- if (conf->mirrors[i].operational) {
- *rdev = conf->mirrors[i].dev;
+ PRINTK(("raid1_map().\n"));
+
+ for (i=0; i<n; i++) {
+ if (raid_conf->mirrors[i].operational) {
+ *rdev = raid_conf->mirrors[i].dev;
return (0);
}
}
return (-1);
}
-static int raid1_map (mddev_t *mddev, kdev_t dev, kdev_t *rdev,
+static int raid1_map (struct md_dev *mddev, kdev_t *rdev,
unsigned long *rsector, unsigned long size)
{
return 0;
}
-static void raid1_reschedule_retry (struct buffer_head *bh)
+void raid1_reschedule_retry (struct buffer_head *bh)
{
struct raid1_bh * r1_bh = (struct raid1_bh *)(bh->b_dev_id);
- mddev_t *mddev = r1_bh->mddev;
- raid1_conf_t *conf = mddev_to_conf(mddev);
+
+ PRINTK(("raid1_reschedule_retry().\n"));
r1_bh->next_retry = raid1_retry_list;
raid1_retry_list = bh;
- md_wakeup_thread(conf->thread);
+ md_wakeup_thread(raid1_thread);
}
/*
- * raid1_end_bh_io() is called when we have finished servicing a mirrored
+ * raid1_end_buffer_io() is called when we have finished servicing a mirrored
* operation and are ready to return a success/failure code to the buffer
* cache layer.
*/
-static void raid1_end_bh_io (struct raid1_bh *r1_bh, int uptodate)
+static inline void raid1_end_buffer_io(struct raid1_bh *r1_bh, int uptodate)
{
struct buffer_head *bh = r1_bh->master_bh;
kfree(r1_bh);
}
+int raid1_one_error=0;
+
void raid1_end_request (struct buffer_head *bh, int uptodate)
{
struct raid1_bh * r1_bh = (struct raid1_bh *)(bh->b_dev_id);
save_flags(flags);
cli();
+ PRINTK(("raid1_end_request().\n"));
+ if (raid1_one_error) {
+ raid1_one_error=0;
+ uptodate=0;
+ }
/*
* this branch is our 'one mirror IO has finished' event handler:
*/
*/
if ( (r1_bh->cmd == READ) || (r1_bh->cmd == READA) ) {
+
+ PRINTK(("raid1_end_request(), read branch.\n"));
+
/*
* we have only one buffer_head on the read side
*/
if (uptodate) {
- raid1_end_bh_io(r1_bh, uptodate);
+ PRINTK(("raid1_end_request(), read branch, uptodate.\n"));
+ raid1_end_buffer_io(r1_bh, uptodate);
restore_flags(flags);
return;
}
* oops, read error:
*/
printk(KERN_ERR "raid1: %s: rescheduling block %lu\n",
- partition_name(bh->b_dev), bh->b_blocknr);
- raid1_reschedule_retry(bh);
+ kdevname(bh->b_dev), bh->b_blocknr);
+ raid1_reschedule_retry (bh);
restore_flags(flags);
return;
}
/*
- * WRITE:
- *
+ * WRITE or WRITEA.
+ */
+ PRINTK(("raid1_end_request(), write branch.\n"));
+
+ /*
* Let's see if all mirrored write operations have finished
- * already.
+ * already [we have irqs off, so we can decrease]:
*/
- if (atomic_dec_and_test(&r1_bh->remaining)) {
- int i, disks = MD_SB_DISKS;
+ if (!--r1_bh->remaining) {
+ struct md_dev *mddev = r1_bh->mddev;
+ struct raid1_data *raid_conf = (struct raid1_data *) mddev->private;
+ int i, n = raid_conf->raid_disks;
+
+ PRINTK(("raid1_end_request(), remaining == 0.\n"));
- for ( i = 0; i < disks; i++)
- if (r1_bh->mirror_bh[i])
- kfree(r1_bh->mirror_bh[i]);
+ for ( i=0; i<n; i++)
+ if (r1_bh->mirror_bh[i]) kfree(r1_bh->mirror_bh[i]);
- raid1_end_bh_io(r1_bh, test_bit(BH_Uptodate, &r1_bh->state));
+ raid1_end_buffer_io(r1_bh, test_bit(BH_Uptodate, &r1_bh->state));
}
+ else PRINTK(("raid1_end_request(), remaining == %u.\n", r1_bh->remaining));
restore_flags(flags);
}
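The write-completion bookkeeping above (decrement 'remaining' with interrupts off, finish the master bh when it reaches zero) can be sketched in userspace. In the driver the cli()/restore_flags() pair makes the plain decrement safe, so no atomic is shown here; the struct and function names are illustrative:

```c
#include <assert.h>

struct r1_done {
	int remaining;	/* mirrored writes still in flight */
	int completed;	/* master completion has run */
};

/* One mirror's end_request: only the last finisher completes the
 * master buffer, mirroring the !--r1_bh->remaining test. */
static void mirror_write_done(struct r1_done *d)
{
	if (!--d->remaining)
		d->completed = 1;	/* raid1_end_buffer_io() in the driver */
}
```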
-/*
- * This routine checks if the undelying device is an md device
- * and in that case it maps the blocks before putting the
- * request on the queue
+/* This routine checks if the underlying device is an md device and in that
+ * case it maps the blocks before putting the request on the queue
*/
-static void map_and_make_request (int rw, struct buffer_head *bh)
+static inline void
+map_and_make_request (int rw, struct buffer_head *bh)
{
if (MAJOR (bh->b_rdev) == MD_MAJOR)
- md_map (bh->b_rdev, &bh->b_rdev,
- &bh->b_rsector, bh->b_size >> 9);
+ md_map (MINOR (bh->b_rdev), &bh->b_rdev, &bh->b_rsector, bh->b_size >> 9);
clear_bit(BH_Lock, &bh->b_state);
make_request (MAJOR (bh->b_rdev), rw, bh);
}
-static int raid1_make_request (mddev_t *mddev, int rw,
- struct buffer_head * bh)
+static int
+raid1_make_request (struct md_dev *mddev, int rw, struct buffer_head * bh)
{
- raid1_conf_t *conf = mddev_to_conf(mddev);
+
+ struct raid1_data *raid_conf = (struct raid1_data *) mddev->private;
struct buffer_head *mirror_bh[MD_SB_DISKS], *bh_req;
struct raid1_bh * r1_bh;
- int disks = MD_SB_DISKS;
- int i, sum_bhs = 0, switch_disks = 0, sectors, lowprio = 0;
+ int n = raid_conf->raid_disks, i, sum_bhs = 0, switch_disks = 0, sectors;
struct mirror_info *mirror;
- r1_bh = raid1_kmalloc (sizeof (struct raid1_bh));
+ PRINTK(("raid1_make_request().\n"));
+
+ while (!( /* FIXME: now we are rather fault tolerant than nice */
+ r1_bh = kmalloc (sizeof (struct raid1_bh), GFP_KERNEL)
+ ) )
+ printk ("raid1_make_request(#1): out of memory\n");
+ memset (r1_bh, 0, sizeof (struct raid1_bh));
/*
* make_request() can abort the operation when READA or WRITEA are being
if (rw == READA) rw = READ;
if (rw == WRITEA) rw = WRITE;
- if (rw == WRITE) {
- /*
- * Too early ?
- */
- mark_buffer_clean(bh);
- /*
- * not too early. we _first_ clean the bh, then we start
- * the IO, then when the IO has finished, we unlock the
- * bh and mark it uptodate. This way we do not miss the
- * case when the bh got dirty again during the IO.
- */
- }
-
- /*
- * special flag for 'lowprio' reconstruction requests ...
- */
- if (buffer_lowprio(bh))
- lowprio = 1;
+ if (rw == WRITE || rw == WRITEA)
+ mark_buffer_clean(bh); /* Too early ? */
/*
- * i think the read and write branch should be separated completely,
- * since we want to do read balancing on the read side for example.
- * Comments? :) --mingo
+ * i think the read and write branch should be separated completely, since we want
+ * to do read balancing on the read side for example. Comments? :) --mingo
*/
r1_bh->master_bh=bh;
r1_bh->mddev=mddev;
r1_bh->cmd = rw;
- if (rw==READ) {
- int last_used = conf->last_used;
-
- /*
- * read balancing logic:
- */
- mirror = conf->mirrors + last_used;
+ if (rw==READ || rw==READA) {
+ int last_used = raid_conf->last_used;
+ PRINTK(("raid1_make_request(), read branch.\n"));
+ mirror = raid_conf->mirrors + last_used;
bh->b_rdev = mirror->dev;
sectors = bh->b_size >> 9;
-
- if (bh->b_blocknr * sectors == conf->next_sect) {
- conf->sect_count += sectors;
- if (conf->sect_count >= mirror->sect_limit)
+ if (bh->b_blocknr * sectors == raid_conf->next_sect) {
+ raid_conf->sect_count += sectors;
+ if (raid_conf->sect_count >= mirror->sect_limit)
switch_disks = 1;
} else
switch_disks = 1;
- conf->next_sect = (bh->b_blocknr + 1) * sectors;
- /*
- * Do not switch disks if full resync is in progress ...
- */
- if (switch_disks && !conf->resync_mirrors) {
- conf->sect_count = 0;
- last_used = conf->last_used = mirror->next;
+ raid_conf->next_sect = (bh->b_blocknr + 1) * sectors;
+ if (switch_disks) {
+ PRINTK(("read-balancing: switching %d -> %d (%d sectors)\n", last_used, mirror->next, raid_conf->sect_count));
+ raid_conf->sect_count = 0;
+ last_used = raid_conf->last_used = mirror->next;
/*
- * Do not switch to write-only disks ...
- * reconstruction is in progress
+ * Do not switch to write-only disks ... resyncing
+ * is in progress
*/
- while (conf->mirrors[last_used].write_only)
- conf->last_used = conf->mirrors[last_used].next;
+ while (raid_conf->mirrors[last_used].write_only)
+ raid_conf->last_used = raid_conf->mirrors[last_used].next;
}
+ PRINTK (("raid1 read queue: %d %d\n", MAJOR (bh->b_rdev), MINOR (bh->b_rdev)));
bh_req = &r1_bh->bh_req;
memcpy(bh_req, bh, sizeof(*bh));
bh_req->b_end_io = raid1_end_request;
}
/*
- * WRITE:
+ * WRITE or WRITEA.
*/
+ PRINTK(("raid1_make_request(n=%d), write branch.\n",n));
- for (i = 0; i < disks; i++) {
+ for (i = 0; i < n; i++) {
- if (!conf->mirrors[i].operational) {
+ if (!raid_conf->mirrors [i].operational) {
/*
* the r1_bh->mirror_bh[i] pointer remains NULL
*/
continue;
}
- /*
- * special case for reconstruction ...
- */
- if (lowprio && (i == conf->last_used)) {
- mirror_bh[i] = NULL;
- continue;
- }
-
- /*
- * We should use a private pool (size depending on NR_REQUEST),
- * to avoid writes filling up the memory with bhs
- *
- * Such pools are much faster than kmalloc anyways (so we waste
- * almost nothing by not using the master bh when writing and
- * win alot of cleanness) but for now we are cool enough. --mingo
- *
- * It's safe to sleep here, buffer heads cannot be used in a shared
- * manner in the write branch. Look how we lock the buffer at the
- * beginning of this function to grok the difference ;)
- */
- mirror_bh[i] = raid1_kmalloc(sizeof(struct buffer_head));
- /*
- * prepare mirrored bh (fields ordered for max mem throughput):
- */
- mirror_bh[i]->b_blocknr = bh->b_blocknr;
- mirror_bh[i]->b_dev = bh->b_dev;
- mirror_bh[i]->b_rdev = conf->mirrors[i].dev;
- mirror_bh[i]->b_rsector = bh->b_rsector;
- mirror_bh[i]->b_state = (1<<BH_Req) | (1<<BH_Dirty);
- if (lowprio)
- mirror_bh[i]->b_state |= (1<<BH_LowPrio);
-
- mirror_bh[i]->b_count = 1;
- mirror_bh[i]->b_size = bh->b_size;
- mirror_bh[i]->b_data = bh->b_data;
- mirror_bh[i]->b_list = BUF_LOCKED;
- mirror_bh[i]->b_end_io = raid1_end_request;
- mirror_bh[i]->b_dev_id = r1_bh;
-
- r1_bh->mirror_bh[i] = mirror_bh[i];
- sum_bhs++;
+ /*
+ * We should use a private pool (size depending on NR_REQUEST),
+ * to avoid writes filling up the memory with bhs
+ *
+	 * Such pools are much faster than kmalloc anyway (so we waste almost
+	 * nothing by not using the master bh when writing and win a lot of cleanness)
+ *
+ * but for now we are cool enough. --mingo
+ *
+ * It's safe to sleep here, buffer heads cannot be used in a shared
+ * manner in the write branch. Look how we lock the buffer at the beginning
+ * of this function to grok the difference ;)
+ */
+ while (!( /* FIXME: now we are rather fault tolerant than nice */
+ mirror_bh[i] = kmalloc (sizeof (struct buffer_head), GFP_KERNEL)
+ ) )
+ printk ("raid1_make_request(#2): out of memory\n");
+ memset (mirror_bh[i], 0, sizeof (struct buffer_head));
+
+ /*
+ * prepare mirrored bh (fields ordered for max mem throughput):
+ */
+ mirror_bh [i]->b_blocknr = bh->b_blocknr;
+ mirror_bh [i]->b_dev = bh->b_dev;
+ mirror_bh [i]->b_rdev = raid_conf->mirrors [i].dev;
+ mirror_bh [i]->b_rsector = bh->b_rsector;
+ mirror_bh [i]->b_state = (1<<BH_Req) | (1<<BH_Dirty);
+ mirror_bh [i]->b_count = 1;
+ mirror_bh [i]->b_size = bh->b_size;
+ mirror_bh [i]->b_data = bh->b_data;
+ mirror_bh [i]->b_list = BUF_LOCKED;
+ mirror_bh [i]->b_end_io = raid1_end_request;
+ mirror_bh [i]->b_dev_id = r1_bh;
+
+ r1_bh->mirror_bh[i] = mirror_bh[i];
+ sum_bhs++;
}
- md_atomic_set(&r1_bh->remaining, sum_bhs);
+ r1_bh->remaining = sum_bhs;
+
+ PRINTK(("raid1_make_request(), write branch, sum_bhs=%d.\n",sum_bhs));
/*
- * We have to be a bit careful about the semaphore above, thats
- * why we start the requests separately. Since kmalloc() could
- * fail, sleep and make_request() can sleep too, this is the
- * safer solution. Imagine, end_request decreasing the semaphore
- * before we could have set it up ... We could play tricks with
- * the semaphore (presetting it and correcting at the end if
- * sum_bhs is not 'n' but we have to do end_request by hand if
- * all requests finish until we had a chance to set up the
- * semaphore correctly ... lots of races).
+	 * We have to be a bit careful about the semaphore above, that's why we
+ * start the requests separately. Since kmalloc() could fail, sleep and
+ * make_request() can sleep too, this is the safer solution. Imagine,
+ * end_request decreasing the semaphore before we could have set it up ...
+ * We could play tricks with the semaphore (presetting it and correcting
+ * at the end if sum_bhs is not 'n' but we have to do end_request by hand
+ * if all requests finish until we had a chance to set up the semaphore
+ * correctly ... lots of races).
*/
- for (i = 0; i < disks; i++)
- if (mirror_bh[i])
- map_and_make_request(rw, mirror_bh[i]);
+ for (i = 0; i < n; i++)
+ if (mirror_bh [i] != NULL)
+ map_and_make_request (rw, mirror_bh [i]);
return (0);
}
-static int raid1_status (char *page, mddev_t *mddev)
+static int raid1_status (char *page, int minor, struct md_dev *mddev)
{
- raid1_conf_t *conf = mddev_to_conf(mddev);
+ struct raid1_data *raid_conf = (struct raid1_data *) mddev->private;
int sz = 0, i;
- sz += sprintf (page+sz, " [%d/%d] [", conf->raid_disks,
- conf->working_disks);
- for (i = 0; i < conf->raid_disks; i++)
- sz += sprintf (page+sz, "%s",
- conf->mirrors[i].operational ? "U" : "_");
+ sz += sprintf (page+sz, " [%d/%d] [", raid_conf->raid_disks, raid_conf->working_disks);
+ for (i = 0; i < raid_conf->raid_disks; i++)
+ sz += sprintf (page+sz, "%s", raid_conf->mirrors [i].operational ? "U" : "_");
sz += sprintf (page+sz, "]");
return sz;
}
-static void unlink_disk (raid1_conf_t *conf, int target)
+static void raid1_fix_links (struct raid1_data *raid_conf, int failed_index)
{
- int disks = MD_SB_DISKS;
- int i;
+ int disks = raid_conf->raid_disks;
+ int j;
- for (i = 0; i < disks; i++)
- if (conf->mirrors[i].next == target)
- conf->mirrors[i].next = conf->mirrors[target].next;
+ for (j = 0; j < disks; j++)
+ if (raid_conf->mirrors [j].next == failed_index)
+ raid_conf->mirrors [j].next = raid_conf->mirrors [failed_index].next;
}
#define LAST_DISK KERN_ALERT \
"raid1: only one disk left and IO error.\n"
#define ALREADY_SYNCING KERN_INFO \
"raid1: syncing already in progress.\n"
-static void mark_disk_bad (mddev_t *mddev, int failed)
+static int raid1_error (struct md_dev *mddev, kdev_t dev)
{
- raid1_conf_t *conf = mddev_to_conf(mddev);
- struct mirror_info *mirror = conf->mirrors+failed;
- mdp_super_t *sb = mddev->sb;
-
- mirror->operational = 0;
- unlink_disk(conf, failed);
- mark_disk_faulty(sb->disks+mirror->number);
- mark_disk_nonsync(sb->disks+mirror->number);
- mark_disk_inactive(sb->disks+mirror->number);
- sb->active_disks--;
- sb->working_disks--;
- sb->failed_disks++;
- mddev->sb_dirty = 1;
- md_wakeup_thread(conf->thread);
- conf->working_disks--;
- printk (DISK_FAILED, partition_name (mirror->dev),
- conf->working_disks);
-}
-
-static int raid1_error (mddev_t *mddev, kdev_t dev)
-{
- raid1_conf_t *conf = mddev_to_conf(mddev);
- struct mirror_info * mirrors = conf->mirrors;
- int disks = MD_SB_DISKS;
+ struct raid1_data *raid_conf = (struct raid1_data *) mddev->private;
+ struct mirror_info *mirror;
+ md_superblock_t *sb = mddev->sb;
+ int disks = raid_conf->raid_disks;
int i;
- if (conf->working_disks == 1) {
+ PRINTK(("raid1_error called\n"));
+
+ if (raid_conf->working_disks == 1) {
/*
* Uh oh, we can do nothing if this is our last disk, but
* first check if this is a queued request for a device
* which has just failed.
*/
- for (i = 0; i < disks; i++) {
- if (mirrors[i].dev==dev && !mirrors[i].operational)
+ for (i = 0, mirror = raid_conf->mirrors; i < disks;
+ i++, mirror++)
+ if (mirror->dev == dev && !mirror->operational)
return 0;
- }
printk (LAST_DISK);
} else {
- /*
- * Mark disk as unusable
- */
- for (i = 0; i < disks; i++) {
- if (mirrors[i].dev==dev && mirrors[i].operational) {
- mark_disk_bad (mddev, i);
- break;
+ /* Mark disk as unusable */
+ for (i = 0, mirror = raid_conf->mirrors; i < disks;
+ i++, mirror++) {
+ if (mirror->dev == dev && mirror->operational){
+ mirror->operational = 0;
+ raid1_fix_links (raid_conf, i);
+ sb->disks[mirror->number].state |=
+ (1 << MD_FAULTY_DEVICE);
+ sb->disks[mirror->number].state &=
+ ~(1 << MD_SYNC_DEVICE);
+ sb->disks[mirror->number].state &=
+ ~(1 << MD_ACTIVE_DEVICE);
+ sb->active_disks--;
+ sb->working_disks--;
+ sb->failed_disks++;
+ mddev->sb_dirty = 1;
+ md_wakeup_thread(raid1_thread);
+ raid_conf->working_disks--;
+ printk (DISK_FAILED, kdevname (dev),
+ raid_conf->working_disks);
}
}
}
#undef START_SYNCING
/*
- * Insert the spare disk into the drive-ring
+ * This is the personality-specific hot-addition routine
*/
-static void link_disk(raid1_conf_t *conf, struct mirror_info *mirror)
-{
- int j, next;
- int disks = MD_SB_DISKS;
- struct mirror_info *p = conf->mirrors;
- for (j = 0; j < disks; j++, p++)
- if (p->operational && !p->write_only) {
- next = p->next;
- p->next = mirror->raid_disk;
- mirror->next = next;
- return;
- }
+#define NO_SUPERBLOCK KERN_ERR \
+"raid1: cannot hot-add disk to the array with no RAID superblock\n"
- printk("raid1: bug: no read-operational devices\n");
-}
-
-static void print_raid1_conf (raid1_conf_t *conf)
-{
- int i;
- struct mirror_info *tmp;
+#define WRONG_LEVEL KERN_ERR \
+"raid1: hot-add: level of disk is not RAID-1\n"
- printk("RAID1 conf printout:\n");
- if (!conf) {
- printk("(conf==NULL)\n");
- return;
- }
- printk(" --- wd:%d rd:%d nd:%d\n", conf->working_disks,
- conf->raid_disks, conf->nr_disks);
-
- for (i = 0; i < MD_SB_DISKS; i++) {
- tmp = conf->mirrors + i;
- printk(" disk %d, s:%d, o:%d, n:%d rd:%d us:%d dev:%s\n",
- i, tmp->spare,tmp->operational,
- tmp->number,tmp->raid_disk,tmp->used_slot,
- partition_name(tmp->dev));
- }
-}
+#define HOT_ADD_SUCCEEDED KERN_INFO \
+"raid1: device %s hot-added\n"
-static int raid1_diskop(mddev_t *mddev, mdp_disk_t **d, int state)
+static int raid1_hot_add_disk (struct md_dev *mddev, kdev_t dev)
{
- int err = 0;
- int i, failed_disk=-1, spare_disk=-1, removed_disk=-1, added_disk=-1;
- raid1_conf_t *conf = mddev->private;
- struct mirror_info *tmp, *sdisk, *fdisk, *rdisk, *adisk;
unsigned long flags;
- mdp_super_t *sb = mddev->sb;
- mdp_disk_t *failed_desc, *spare_desc, *added_desc;
-
- save_flags(flags);
- cli();
+ struct raid1_data *raid_conf = (struct raid1_data *) mddev->private;
+ struct mirror_info *mirror;
+ md_superblock_t *sb = mddev->sb;
+ struct real_dev * realdev;
+ int n;
- print_raid1_conf(conf);
/*
- * find the disk ...
+ * The device has its superblock already read and it was found
+ * to be consistent for generic RAID usage. Now we check whether
+ * it's usable for RAID-1 hot addition.
*/
- switch (state) {
-
- case DISKOP_SPARE_ACTIVE:
- /*
- * Find the failed disk within the RAID1 configuration ...
- * (this can only be in the first conf->working_disks part)
- */
- for (i = 0; i < conf->raid_disks; i++) {
- tmp = conf->mirrors + i;
- if ((!tmp->operational && !tmp->spare) ||
- !tmp->used_slot) {
- failed_disk = i;
- break;
- }
- }
- /*
- * When we activate a spare disk we _must_ have a disk in
- * the lower (active) part of the array to replace.
- */
- if ((failed_disk == -1) || (failed_disk >= conf->raid_disks)) {
- MD_BUG();
- err = 1;
- goto abort;
- }
- /* fall through */
-
- case DISKOP_SPARE_WRITE:
- case DISKOP_SPARE_INACTIVE:
-
- /*
- * Find the spare disk ... (can only be in the 'high'
- * area of the array)
- */
- for (i = conf->raid_disks; i < MD_SB_DISKS; i++) {
- tmp = conf->mirrors + i;
- if (tmp->spare && tmp->number == (*d)->number) {
- spare_disk = i;
- break;
- }
- }
- if (spare_disk == -1) {
- MD_BUG();
- err = 1;
- goto abort;
- }
- break;
-
- case DISKOP_HOT_REMOVE_DISK:
-
- for (i = 0; i < MD_SB_DISKS; i++) {
- tmp = conf->mirrors + i;
- if (tmp->used_slot && (tmp->number == (*d)->number)) {
- if (tmp->operational) {
- err = -EBUSY;
- goto abort;
- }
- removed_disk = i;
- break;
- }
- }
- if (removed_disk == -1) {
- MD_BUG();
- err = 1;
- goto abort;
- }
- break;
-
- case DISKOP_HOT_ADD_DISK:
-
- for (i = conf->raid_disks; i < MD_SB_DISKS; i++) {
- tmp = conf->mirrors + i;
- if (!tmp->used_slot) {
- added_disk = i;
- break;
- }
- }
- if (added_disk == -1) {
- MD_BUG();
- err = 1;
- goto abort;
- }
- break;
+ n = mddev->nb_dev++;
+ realdev = &mddev->devices[n];
+ if (!realdev->sb) {
+ printk (NO_SUPERBLOCK);
+ return -EINVAL;
}
+ if (realdev->sb->level != 1) {
+ printk (WRONG_LEVEL);
+ return -EINVAL;
+ }
+ /* FIXME: are there other things left we could sanity-check? */
- switch (state) {
- /*
- * Switch the spare disk to write-only mode:
- */
- case DISKOP_SPARE_WRITE:
- sdisk = conf->mirrors + spare_disk;
- sdisk->operational = 1;
- sdisk->write_only = 1;
- break;
/*
- * Deactivate a spare disk:
+ * We have to disable interrupts, as our RAID-1 state is used
+ * from irq handlers as well.
*/
- case DISKOP_SPARE_INACTIVE:
- sdisk = conf->mirrors + spare_disk;
- sdisk->operational = 0;
- sdisk->write_only = 0;
- break;
- /*
- * Activate (mark read-write) the (now sync) spare disk,
- * which means we switch it's 'raid position' (->raid_disk)
- * with the failed disk. (only the first 'conf->nr_disks'
- * slots are used for 'real' disks and we must preserve this
- * property)
- */
- case DISKOP_SPARE_ACTIVE:
-
- sdisk = conf->mirrors + spare_disk;
- fdisk = conf->mirrors + failed_disk;
-
- spare_desc = &sb->disks[sdisk->number];
- failed_desc = &sb->disks[fdisk->number];
-
- if (spare_desc != *d) {
- MD_BUG();
- err = 1;
- goto abort;
- }
-
- if (spare_desc->raid_disk != sdisk->raid_disk) {
- MD_BUG();
- err = 1;
- goto abort;
- }
-
- if (sdisk->raid_disk != spare_disk) {
- MD_BUG();
- err = 1;
- goto abort;
- }
+ save_flags(flags);
+ cli();
- if (failed_desc->raid_disk != fdisk->raid_disk) {
- MD_BUG();
- err = 1;
- goto abort;
- }
+ raid_conf->raid_disks++;
+ mirror = raid_conf->mirrors+n;
- if (fdisk->raid_disk != failed_disk) {
- MD_BUG();
- err = 1;
- goto abort;
- }
+ mirror->number=n;
+ mirror->raid_disk=n;
+ mirror->dev=dev;
+ mirror->next=0; /* FIXME */
+ mirror->sect_limit=128;
- /*
- * do the switch finally
- */
- xchg_values(*spare_desc, *failed_desc);
- xchg_values(*fdisk, *sdisk);
+ mirror->operational=0;
+ mirror->spare=1;
+ mirror->write_only=0;
- /*
- * (careful, 'failed' and 'spare' are switched from now on)
- *
- * we want to preserve linear numbering and we want to
- * give the proper raid_disk number to the now activated
- * disk. (this means we switch back these values)
- */
-
- xchg_values(spare_desc->raid_disk, failed_desc->raid_disk);
- xchg_values(sdisk->raid_disk, fdisk->raid_disk);
- xchg_values(spare_desc->number, failed_desc->number);
- xchg_values(sdisk->number, fdisk->number);
+ sb->disks[n].state |= (1 << MD_FAULTY_DEVICE);
+ sb->disks[n].state &= ~(1 << MD_SYNC_DEVICE);
+ sb->disks[n].state &= ~(1 << MD_ACTIVE_DEVICE);
+ sb->nr_disks++;
+ sb->spare_disks++;
- *d = failed_desc;
+ restore_flags(flags);
- if (sdisk->dev == MKDEV(0,0))
- sdisk->used_slot = 0;
- /*
- * this really activates the spare.
- */
- fdisk->spare = 0;
- fdisk->write_only = 0;
- link_disk(conf, fdisk);
+ md_update_sb(MINOR(dev));
- /*
- * if we activate a spare, we definitely replace a
- * non-operational disk slot in the 'low' area of
- * the disk array.
- */
+ printk (HOT_ADD_SUCCEEDED, kdevname(realdev->dev));
- conf->working_disks++;
+ return 0;
+}
- break;
+#undef NO_SUPERBLOCK
+#undef WRONG_LEVEL
+#undef HOT_ADD_SUCCEEDED
- case DISKOP_HOT_REMOVE_DISK:
- rdisk = conf->mirrors + removed_disk;
+/*
+ * Insert the spare disk into the drive-ring
+ */
+static void add_ring(struct raid1_data *raid_conf, struct mirror_info *mirror)
+{
+ int j, next;
+ struct mirror_info *p = raid_conf->mirrors;
- if (rdisk->spare && (removed_disk < conf->raid_disks)) {
- MD_BUG();
- err = 1;
- goto abort;
- }
- rdisk->dev = MKDEV(0,0);
- rdisk->used_slot = 0;
- conf->nr_disks--;
- break;
-
- case DISKOP_HOT_ADD_DISK:
- adisk = conf->mirrors + added_disk;
- added_desc = *d;
-
- if (added_disk != added_desc->number) {
- MD_BUG();
- err = 1;
- goto abort;
+ for (j = 0; j < raid_conf->raid_disks; j++, p++)
+ if (p->operational && !p->write_only) {
+ next = p->next;
+ p->next = mirror->raid_disk;
+ mirror->next = next;
+ return;
}
+ printk("raid1: bug: no read-operational devices\n");
+}
- adisk->number = added_desc->number;
- adisk->raid_disk = added_desc->raid_disk;
- adisk->dev = MKDEV(added_desc->major,added_desc->minor);
-
- adisk->operational = 0;
- adisk->write_only = 0;
- adisk->spare = 1;
- adisk->used_slot = 1;
- conf->nr_disks++;
+static int raid1_mark_spare(struct md_dev *mddev, md_descriptor_t *spare,
+ int state)
+{
+ int i = 0, failed_disk = -1;
+ struct raid1_data *raid_conf = mddev->private;
+ struct mirror_info *mirror = raid_conf->mirrors;
+ md_descriptor_t *descriptor;
+ unsigned long flags;
- break;
+ for (i = 0; i < MD_SB_DISKS; i++, mirror++) {
+ if (mirror->spare && mirror->number == spare->number)
+ goto found;
+ }
+ return 1;
+found:
+ for (i = 0, mirror = raid_conf->mirrors; i < raid_conf->raid_disks;
+ i++, mirror++)
+ if (!mirror->operational)
+ failed_disk = i;
- default:
- MD_BUG();
- err = 1;
- goto abort;
+ save_flags(flags);
+ cli();
+ switch (state) {
+ case SPARE_WRITE:
+ mirror->operational = 1;
+ mirror->write_only = 1;
+ raid_conf->raid_disks = MAX(raid_conf->raid_disks,
+ mirror->raid_disk + 1);
+ break;
+ case SPARE_INACTIVE:
+ mirror->operational = 0;
+ mirror->write_only = 0;
+ break;
+ case SPARE_ACTIVE:
+ mirror->spare = 0;
+ mirror->write_only = 0;
+ raid_conf->working_disks++;
+ add_ring(raid_conf, mirror);
+
+ if (failed_disk != -1) {
+ descriptor = &mddev->sb->disks[raid_conf->mirrors[failed_disk].number];
+ i = spare->raid_disk;
+ spare->raid_disk = descriptor->raid_disk;
+ descriptor->raid_disk = i;
+ }
+ break;
+ default:
+ printk("raid1_mark_spare: bug: state == %d\n", state);
+ restore_flags(flags);
+ return 1;
}
-abort:
restore_flags(flags);
- print_raid1_conf(conf);
- return err;
+ return 0;
}
-
-#define IO_ERROR KERN_ALERT \
-"raid1: %s: unrecoverable I/O read error for block %lu\n"
-
-#define REDIRECT_SECTOR KERN_ERR \
-"raid1: %s: redirecting sector %lu to another mirror\n"
-
/*
* This is a kernel thread which:
*
* 1. Retries failed read operations on working mirrors.
 * 2. Updates the raid superblock when problems are encountered.
*/
-static void raid1d (void *data)
+void raid1d (void *data)
{
struct buffer_head *bh;
kdev_t dev;
unsigned long flags;
- struct raid1_bh *r1_bh;
- mddev_t *mddev;
+ struct raid1_bh * r1_bh;
+ struct md_dev *mddev;
+ PRINTK(("raid1d() active\n"));
+ save_flags(flags);
+ cli();
while (raid1_retry_list) {
- save_flags(flags);
- cli();
bh = raid1_retry_list;
r1_bh = (struct raid1_bh *)(bh->b_dev_id);
raid1_retry_list = r1_bh->next_retry;
restore_flags(flags);
- mddev = kdev_to_mddev(bh->b_dev);
+ mddev = md_dev + MINOR(bh->b_dev);
if (mddev->sb_dirty) {
- printk(KERN_INFO "dirty sb detected, updating.\n");
+ printk("dirty sb detected, updating.\n");
mddev->sb_dirty = 0;
- md_update_sb(mddev);
+ md_update_sb(MINOR(bh->b_dev));
}
dev = bh->b_rdev;
- __raid1_map (mddev, &bh->b_rdev, &bh->b_rsector,
- bh->b_size >> 9);
+ __raid1_map (md_dev + MINOR(bh->b_dev), &bh->b_rdev, &bh->b_rsector, bh->b_size >> 9);
if (bh->b_rdev == dev) {
- printk (IO_ERROR, partition_name(bh->b_dev), bh->b_blocknr);
- raid1_end_bh_io(r1_bh, 0);
+ printk (KERN_ALERT
+ "raid1: %s: unrecoverable I/O read error for block %lu\n",
+ kdevname(bh->b_dev), bh->b_blocknr);
+ raid1_end_buffer_io(r1_bh, 0);
} else {
- printk (REDIRECT_SECTOR,
- partition_name(bh->b_dev), bh->b_blocknr);
+ printk (KERN_ERR "raid1: %s: redirecting sector %lu to another mirror\n",
+ kdevname(bh->b_dev), bh->b_blocknr);
map_and_make_request (r1_bh->cmd, bh);
}
+ cli();
}
+ restore_flags(flags);
}
-#undef IO_ERROR
-#undef REDIRECT_SECTOR
-
-/*
- * Private kernel thread to reconstruct mirrors after an unclean
- * shutdown.
- */
-static void raid1syncd (void *data)
-{
- raid1_conf_t *conf = data;
- mddev_t *mddev = conf->mddev;
-
- if (!conf->resync_mirrors)
- return;
- if (conf->resync_mirrors == 2)
- return;
- down(&mddev->recovery_sem);
- if (md_do_sync(mddev, NULL)) {
- up(&mddev->recovery_sem);
- return;
- }
- /*
- * Only if everything went Ok.
- */
- conf->resync_mirrors = 0;
- up(&mddev->recovery_sem);
-}
-
/*
* This will catch the scenario in which one of the mirrors was
* mounted as a normal device rather than as a part of a raid set.
- *
- * check_consistency is very personality-dependent, eg. RAID5 cannot
- * do this check, it uses another method.
*/
-static int __check_consistency (mddev_t *mddev, int row)
+static int __check_consistency (struct md_dev *mddev, int row)
{
- raid1_conf_t *conf = mddev_to_conf(mddev);
- int disks = MD_SB_DISKS;
+ struct raid1_data *raid_conf = mddev->private;
kdev_t dev;
struct buffer_head *bh = NULL;
int i, rc = 0;
char *buffer = NULL;
- for (i = 0; i < disks; i++) {
- printk("(checking disk %d)\n",i);
- if (!conf->mirrors[i].operational)
+ for (i = 0; i < raid_conf->raid_disks; i++) {
+ if (!raid_conf->mirrors[i].operational)
continue;
- printk("(really checking disk %d)\n",i);
- dev = conf->mirrors[i].dev;
+ dev = raid_conf->mirrors[i].dev;
set_blocksize(dev, 4096);
if ((bh = bread(dev, row / 4, 4096)) == NULL)
break;
return rc;
}
-static int check_consistency (mddev_t *mddev)
+static int check_consistency (struct md_dev *mddev)
{
- if (__check_consistency(mddev, 0))
-/*
- * we do not do this currently, as it's perfectly possible to
- * have an inconsistent array when it's freshly created. Only
- * newly written data has to be consistent.
- */
- return 0;
+ int size = mddev->sb->size;
+ int row;
+ for (row = 0; row < size; row += size / 8)
+ if (__check_consistency(mddev, row))
+ return 1;
return 0;
}
-#define INVALID_LEVEL KERN_WARNING \
-"raid1: md%d: raid level not set to mirroring (%d)\n"
-
-#define NO_SB KERN_ERR \
-"raid1: disabled mirror %s (couldn't access raid superblock)\n"
-
-#define ERRORS KERN_ERR \
-"raid1: disabled mirror %s (errors detected)\n"
-
-#define NOT_IN_SYNC KERN_ERR \
-"raid1: disabled mirror %s (not in sync)\n"
-
-#define INCONSISTENT KERN_ERR \
-"raid1: disabled mirror %s (inconsistent descriptor)\n"
-
-#define ALREADY_RUNNING KERN_ERR \
-"raid1: disabled mirror %s (mirror %d already operational)\n"
-
-#define OPERATIONAL KERN_INFO \
-"raid1: device %s operational as mirror %d\n"
-
-#define MEM_ERROR KERN_ERR \
-"raid1: couldn't allocate memory for md%d\n"
-
-#define SPARE KERN_INFO \
-"raid1: spare disk %s\n"
-
-#define NONE_OPERATIONAL KERN_ERR \
-"raid1: no operational mirrors for md%d\n"
-
-#define RUNNING_CKRAID KERN_ERR \
-"raid1: detected mirror differences -- running resync\n"
-
-#define ARRAY_IS_ACTIVE KERN_INFO \
-"raid1: raid set md%d active with %d out of %d mirrors\n"
-
-#define THREAD_ERROR KERN_ERR \
-"raid1: couldn't allocate thread for md%d\n"
-
-#define START_RESYNC KERN_WARNING \
-"raid1: raid set md%d not clean; reconstructing mirrors\n"
-
-static int raid1_run (mddev_t *mddev)
+static int raid1_run (int minor, struct md_dev *mddev)
{
- raid1_conf_t *conf;
- int i, j, disk_idx;
- struct mirror_info *disk;
- mdp_super_t *sb = mddev->sb;
- mdp_disk_t *descriptor;
- mdk_rdev_t *rdev;
- struct md_list_head *tmp;
- int start_recovery = 0;
+ struct raid1_data *raid_conf;
+ int i, j, raid_disk;
+ md_superblock_t *sb = mddev->sb;
+ md_descriptor_t *descriptor;
+ struct real_dev *realdev;
MOD_INC_USE_COUNT;
if (sb->level != 1) {
- printk(INVALID_LEVEL, mdidx(mddev), sb->level);
- goto out;
+ printk("raid1: %s: raid level not set to mirroring (%d)\n",
+ kdevname(MKDEV(MD_MAJOR, minor)), sb->level);
+ MOD_DEC_USE_COUNT;
+ return -EIO;
}
- /*
- * copy the already verified devices into our private RAID1
- * bookkeeping area. [whatever we allocate in raid1_run(),
- * should be freed in raid1_stop()]
+ /****
+ * copy the now verified devices into our private RAID1 bookkeeping
+ * area. [whatever we allocate in raid1_run(), should be freed in
+ * raid1_stop()]
*/
- conf = raid1_kmalloc(sizeof(raid1_conf_t));
- mddev->private = conf;
- if (!conf) {
- printk(MEM_ERROR, mdidx(mddev));
- goto out;
- }
+ while (!( /* FIXME: now we are rather fault tolerant than nice */
+ mddev->private = kmalloc (sizeof (struct raid1_data), GFP_KERNEL)
+ ) )
+ printk ("raid1_run(): out of memory\n");
+ raid_conf = mddev->private;
+ memset(raid_conf, 0, sizeof(*raid_conf));
- ITERATE_RDEV(mddev,rdev,tmp) {
- if (rdev->faulty) {
- printk(ERRORS, partition_name(rdev->dev));
- } else {
- if (!rdev->sb) {
- MD_BUG();
- continue;
- }
- }
- if (rdev->desc_nr == -1) {
- MD_BUG();
+ PRINTK(("raid1_run(%d) called.\n", minor));
+
+ for (i = 0; i < mddev->nb_dev; i++) {
+ realdev = &mddev->devices[i];
+ if (!realdev->sb) {
+ printk(KERN_ERR "raid1: disabled mirror %s (couldn't access raid superblock)\n", kdevname(realdev->dev));
continue;
}
- descriptor = &sb->disks[rdev->desc_nr];
- disk_idx = descriptor->raid_disk;
- disk = conf->mirrors + disk_idx;
-
- if (disk_faulty(descriptor)) {
- disk->number = descriptor->number;
- disk->raid_disk = disk_idx;
- disk->dev = rdev->dev;
- disk->sect_limit = MAX_LINEAR_SECTORS;
- disk->operational = 0;
- disk->write_only = 0;
- disk->spare = 0;
- disk->used_slot = 1;
+
+ /*
+ * This is important -- we are using the descriptor on
+ * the disk only to get a pointer to the descriptor on
+ * the main superblock, which might be more recent.
+ */
+ descriptor = &sb->disks[realdev->sb->descriptor.number];
+ if (descriptor->state & (1 << MD_FAULTY_DEVICE)) {
+ printk(KERN_ERR "raid1: disabled mirror %s (errors detected)\n", kdevname(realdev->dev));
continue;
}
- if (disk_active(descriptor)) {
- if (!disk_sync(descriptor)) {
- printk(NOT_IN_SYNC,
- partition_name(rdev->dev));
+ if (descriptor->state & (1 << MD_ACTIVE_DEVICE)) {
+ if (!(descriptor->state & (1 << MD_SYNC_DEVICE))) {
+ printk(KERN_ERR "raid1: disabled mirror %s (not in sync)\n", kdevname(realdev->dev));
continue;
}
- if ((descriptor->number > MD_SB_DISKS) ||
- (disk_idx > sb->raid_disks)) {
-
- printk(INCONSISTENT,
- partition_name(rdev->dev));
+ raid_disk = descriptor->raid_disk;
+ if (descriptor->number > sb->nr_disks || raid_disk > sb->raid_disks) {
+ printk(KERN_ERR "raid1: disabled mirror %s (inconsistent descriptor)\n", kdevname(realdev->dev));
continue;
}
- if (disk->operational) {
- printk(ALREADY_RUNNING,
- partition_name(rdev->dev),
- disk_idx);
+ if (raid_conf->mirrors[raid_disk].operational) {
+ printk(KERN_ERR "raid1: disabled mirror %s (mirror %d already operational)\n", kdevname(realdev->dev), raid_disk);
continue;
}
- printk(OPERATIONAL, partition_name(rdev->dev),
- disk_idx);
- disk->number = descriptor->number;
- disk->raid_disk = disk_idx;
- disk->dev = rdev->dev;
- disk->sect_limit = MAX_LINEAR_SECTORS;
- disk->operational = 1;
- disk->write_only = 0;
- disk->spare = 0;
- disk->used_slot = 1;
- conf->working_disks++;
+ printk(KERN_INFO "raid1: device %s operational as mirror %d\n", kdevname(realdev->dev), raid_disk);
+ raid_conf->mirrors[raid_disk].number = descriptor->number;
+ raid_conf->mirrors[raid_disk].raid_disk = raid_disk;
+ raid_conf->mirrors[raid_disk].dev = mddev->devices [i].dev;
+ raid_conf->mirrors[raid_disk].operational = 1;
+ raid_conf->mirrors[raid_disk].sect_limit = 128;
+ raid_conf->working_disks++;
} else {
/*
* Must be a spare disk ..
*/
- printk(SPARE, partition_name(rdev->dev));
- disk->number = descriptor->number;
- disk->raid_disk = disk_idx;
- disk->dev = rdev->dev;
- disk->sect_limit = MAX_LINEAR_SECTORS;
- disk->operational = 0;
- disk->write_only = 0;
- disk->spare = 1;
- disk->used_slot = 1;
- }
- }
- if (!conf->working_disks) {
- printk(NONE_OPERATIONAL, mdidx(mddev));
- goto out_free_conf;
- }
-
- conf->raid_disks = sb->raid_disks;
- conf->nr_disks = sb->nr_disks;
- conf->mddev = mddev;
-
- for (i = 0; i < MD_SB_DISKS; i++) {
-
- descriptor = sb->disks+i;
- disk_idx = descriptor->raid_disk;
- disk = conf->mirrors + disk_idx;
+ printk(KERN_INFO "raid1: spare disk %s\n", kdevname(realdev->dev));
+ raid_disk = descriptor->raid_disk;
+ raid_conf->mirrors[raid_disk].number = descriptor->number;
+ raid_conf->mirrors[raid_disk].raid_disk = raid_disk;
+ raid_conf->mirrors[raid_disk].dev = mddev->devices [i].dev;
+ raid_conf->mirrors[raid_disk].sect_limit = 128;
- if (disk_faulty(descriptor) && (disk_idx < conf->raid_disks) &&
- !disk->used_slot) {
-
- disk->number = descriptor->number;
- disk->raid_disk = disk_idx;
- disk->dev = MKDEV(0,0);
-
- disk->operational = 0;
- disk->write_only = 0;
- disk->spare = 0;
- disk->used_slot = 1;
+ raid_conf->mirrors[raid_disk].operational = 0;
+ raid_conf->mirrors[raid_disk].write_only = 0;
+ raid_conf->mirrors[raid_disk].spare = 1;
}
}
-
- /*
- * find the first working one and use it as a starting point
- * to read balancing.
- */
- for (j = 0; !conf->mirrors[j].operational; j++)
- /* nothing */;
- conf->last_used = j;
-
- /*
- * initialize the 'working disks' list.
- */
- for (i = conf->raid_disks - 1; i >= 0; i--) {
- if (conf->mirrors[i].operational) {
- conf->mirrors[i].next = j;
- j = i;
- }
+ if (!raid_conf->working_disks) {
+ printk(KERN_ERR "raid1: no operational mirrors for %s\n", kdevname(MKDEV(MD_MAJOR, minor)));
+ kfree(raid_conf);
+ mddev->private = NULL;
+ MOD_DEC_USE_COUNT;
+ return -EIO;
}
- if (conf->working_disks != sb->raid_disks) {
- printk(KERN_ALERT "raid1: md%d, not all disks are operational -- trying to recover array\n", mdidx(mddev));
- start_recovery = 1;
- }
+ raid_conf->raid_disks = sb->raid_disks;
+ raid_conf->mddev = mddev;
- if (!start_recovery && (sb->state & (1 << MD_SB_CLEAN))) {
- /*
- * we do sanity checks even if the device says
- * it's clean ...
- */
- if (check_consistency(mddev)) {
- printk(RUNNING_CKRAID);
- sb->state &= ~(1 << MD_SB_CLEAN);
+ for (j = 0; !raid_conf->mirrors[j].operational; j++);
+ raid_conf->last_used = j;
+ for (i = raid_conf->raid_disks - 1; i >= 0; i--) {
+ if (raid_conf->mirrors[i].operational) {
+ PRINTK(("raid_conf->mirrors[%d].next == %d\n", i, j));
+ raid_conf->mirrors[i].next = j;
+ j = i;
}
}
- {
- const char * name = "raid1d";
-
- conf->thread = md_register_thread(raid1d, conf, name);
- if (!conf->thread) {
- printk(THREAD_ERROR, mdidx(mddev));
- goto out_free_conf;
- }
+ if (check_consistency(mddev)) {
+ printk(KERN_ERR "raid1: detected mirror differences -- run ckraid\n");
+ sb->state |= 1 << MD_SB_ERRORS;
+ kfree(raid_conf);
+ mddev->private = NULL;
+ MOD_DEC_USE_COUNT;
+ return -EIO;
}
- if (!start_recovery && !(sb->state & (1 << MD_SB_CLEAN))) {
- const char * name = "raid1syncd";
-
- conf->resync_thread = md_register_thread(raid1syncd, conf,name);
- if (!conf->resync_thread) {
- printk(THREAD_ERROR, mdidx(mddev));
- goto out_free_conf;
- }
-
- printk(START_RESYNC, mdidx(mddev));
- conf->resync_mirrors = 1;
- md_wakeup_thread(conf->resync_thread);
- }
-
/*
* Regenerate the "device is in sync with the raid set" bit for
* each device.
*/
- for (i = 0; i < MD_SB_DISKS; i++) {
- mark_disk_nonsync(sb->disks+i);
+ for (i = 0; i < sb->nr_disks ; i++) {
+ sb->disks[i].state &= ~(1 << MD_SYNC_DEVICE);
for (j = 0; j < sb->raid_disks; j++) {
- if (!conf->mirrors[j].operational)
+ if (!raid_conf->mirrors[j].operational)
continue;
- if (sb->disks[i].number == conf->mirrors[j].number)
- mark_disk_sync(sb->disks+i);
- }
- }
- sb->active_disks = conf->working_disks;
-
- if (start_recovery)
- md_recover_arrays();
-
-
- printk(ARRAY_IS_ACTIVE, mdidx(mddev), sb->active_disks, sb->raid_disks);
- /*
- * Ok, everything is just fine now
- */
- return 0;
-
-out_free_conf:
- kfree(conf);
- mddev->private = NULL;
-out:
- MOD_DEC_USE_COUNT;
- return -EIO;
-}
-
-#undef INVALID_LEVEL
-#undef NO_SB
-#undef ERRORS
-#undef NOT_IN_SYNC
-#undef INCONSISTENT
-#undef ALREADY_RUNNING
-#undef OPERATIONAL
-#undef SPARE
-#undef NONE_OPERATIONAL
-#undef RUNNING_CKRAID
-#undef ARRAY_IS_ACTIVE
-
-static int raid1_stop_resync (mddev_t *mddev)
-{
- raid1_conf_t *conf = mddev_to_conf(mddev);
-
- if (conf->resync_thread) {
- if (conf->resync_mirrors) {
- conf->resync_mirrors = 2;
- md_interrupt_thread(conf->resync_thread);
- printk(KERN_INFO "raid1: mirror resync was not fully finished, restarting next time.\n");
- return 1;
+ if (sb->disks[i].number == raid_conf->mirrors[j].number)
+ sb->disks[i].state |= 1 << MD_SYNC_DEVICE;
}
- return 0;
}
- return 0;
-}
-
-static int raid1_restart_resync (mddev_t *mddev)
-{
- raid1_conf_t *conf = mddev_to_conf(mddev);
+ sb->active_disks = raid_conf->working_disks;
- if (conf->resync_mirrors) {
- if (!conf->resync_thread) {
- MD_BUG();
- return 0;
- }
- conf->resync_mirrors = 1;
- md_wakeup_thread(conf->resync_thread);
- return 1;
- }
- return 0;
+ printk("raid1: raid set %s active with %d out of %d mirrors\n", kdevname(MKDEV(MD_MAJOR, minor)), sb->active_disks, sb->raid_disks);
+ /* Ok, everything is just fine now */
+ return (0);
}
-static int raid1_stop (mddev_t *mddev)
+static int raid1_stop (int minor, struct md_dev *mddev)
{
- raid1_conf_t *conf = mddev_to_conf(mddev);
+ struct raid1_data *raid_conf = (struct raid1_data *) mddev->private;
- md_unregister_thread(conf->thread);
- if (conf->resync_thread)
- md_unregister_thread(conf->resync_thread);
- kfree(conf);
+ kfree (raid_conf);
mddev->private = NULL;
MOD_DEC_USE_COUNT;
return 0;
}
-static mdk_personality_t raid1_personality=
+static struct md_personality raid1_personality=
{
"raid1",
raid1_map,
NULL, /* no ioctls */
0,
raid1_error,
- raid1_diskop,
- raid1_stop_resync,
- raid1_restart_resync
+ raid1_hot_add_disk,
+ /* raid1_hot_remove_drive */ NULL,
+ raid1_mark_spare
};
int raid1_init (void)
{
+ if ((raid1_thread = md_register_thread(raid1d, NULL)) == NULL)
+ return -EBUSY;
return register_md_personality (RAID1, &raid1_personality);
}
void cleanup_module (void)
{
+ md_unregister_thread (raid1_thread);
unregister_md_personality (RAID1);
}
#endif
-/*
+/*****************************************************************************
* raid5.c : Multiple Devices driver for Linux
* Copyright (C) 1996, 1997 Ingo Molnar, Miguel de Icaza, Gadi Oxman
*
* Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
*/
-
#include <linux/module.h>
#include <linux/locks.h>
#include <linux/malloc.h>
-#include <linux/raid/raid5.h>
+#include <linux/md.h>
+#include <linux/raid5.h>
#include <asm/bitops.h>
#include <asm/atomic.h>
+#include <asm/md.h>
-static mdk_personality_t raid5_personality;
+static struct md_personality raid5_personality;
/*
* Stripe cache
#define HASH_PAGES_ORDER 0
#define NR_HASH (HASH_PAGES * PAGE_SIZE / sizeof(struct stripe_head *))
#define HASH_MASK (NR_HASH - 1)
-#define stripe_hash(conf, sect, size) ((conf)->stripe_hashtbl[((sect) / (size >> 9)) & HASH_MASK])
+#define stripe_hash(raid_conf, sect, size) ((raid_conf)->stripe_hashtbl[((sect) / (size >> 9)) & HASH_MASK])
/*
* The following can be used to debug the driver
#define PRINTK(x) do { ; } while (0)
#endif
-static void print_raid5_conf (raid5_conf_t *conf);
-
static inline int stripe_locked(struct stripe_head *sh)
{
return test_bit(STRIPE_LOCKED, &sh->state);
*/
static inline void lock_stripe(struct stripe_head *sh)
{
- raid5_conf_t *conf = sh->raid_conf;
- if (!md_test_and_set_bit(STRIPE_LOCKED, &sh->state)) {
+ struct raid5_data *raid_conf = sh->raid_conf;
+ if (!test_and_set_bit(STRIPE_LOCKED, &sh->state)) {
PRINTK(("locking stripe %lu\n", sh->sector));
- conf->nr_locked_stripes++;
+ raid_conf->nr_locked_stripes++;
}
}
static inline void unlock_stripe(struct stripe_head *sh)
{
- raid5_conf_t *conf = sh->raid_conf;
- if (md_test_and_clear_bit(STRIPE_LOCKED, &sh->state)) {
+ struct raid5_data *raid_conf = sh->raid_conf;
+ if (test_and_clear_bit(STRIPE_LOCKED, &sh->state)) {
PRINTK(("unlocking stripe %lu\n", sh->sector));
- conf->nr_locked_stripes--;
+ raid_conf->nr_locked_stripes--;
wake_up(&sh->wait);
}
}
static inline void finish_stripe(struct stripe_head *sh)
{
- raid5_conf_t *conf = sh->raid_conf;
+ struct raid5_data *raid_conf = sh->raid_conf;
unlock_stripe(sh);
sh->cmd = STRIPE_NONE;
sh->phase = PHASE_COMPLETE;
- conf->nr_pending_stripes--;
- conf->nr_cached_stripes++;
- wake_up(&conf->wait_for_stripe);
+ raid_conf->nr_pending_stripes--;
+ raid_conf->nr_cached_stripes++;
+ wake_up(&raid_conf->wait_for_stripe);
}
void __wait_on_stripe(struct stripe_head *sh)
__wait_on_stripe(sh);
}
-static inline void remove_hash(raid5_conf_t *conf, struct stripe_head *sh)
+static inline void remove_hash(struct raid5_data *raid_conf, struct stripe_head *sh)
{
PRINTK(("remove_hash(), stripe %lu\n", sh->sector));
sh->hash_next->hash_pprev = sh->hash_pprev;
*sh->hash_pprev = sh->hash_next;
sh->hash_pprev = NULL;
- conf->nr_hashed_stripes--;
+ raid_conf->nr_hashed_stripes--;
}
}
-static inline void insert_hash(raid5_conf_t *conf, struct stripe_head *sh)
+static inline void insert_hash(struct raid5_data *raid_conf, struct stripe_head *sh)
{
- struct stripe_head **shp = &stripe_hash(conf, sh->sector, sh->size);
+ struct stripe_head **shp = &stripe_hash(raid_conf, sh->sector, sh->size);
- PRINTK(("insert_hash(), stripe %lu, nr_hashed_stripes %d\n",
- sh->sector, conf->nr_hashed_stripes));
+ PRINTK(("insert_hash(), stripe %lu, nr_hashed_stripes %d\n", sh->sector, raid_conf->nr_hashed_stripes));
if ((sh->hash_next = *shp) != NULL)
(*shp)->hash_pprev = &sh->hash_next;
*shp = sh;
sh->hash_pprev = shp;
- conf->nr_hashed_stripes++;
+ raid_conf->nr_hashed_stripes++;
}
static struct buffer_head *get_free_buffer(struct stripe_head *sh, int b_size)
struct buffer_head *bh;
unsigned long flags;
- md_spin_lock_irqsave(&sh->stripe_lock, flags);
- bh = sh->buffer_pool;
- if (!bh)
- goto out_unlock;
+	save_flags(flags);
+	cli();
+	if ((bh = sh->buffer_pool) == NULL) {
+		restore_flags(flags);
+		return NULL;
+	}
sh->buffer_pool = bh->b_next;
bh->b_size = b_size;
-out_unlock:
- md_spin_unlock_irqrestore(&sh->stripe_lock, flags);
-
+ restore_flags(flags);
return bh;
}
struct buffer_head *bh;
unsigned long flags;
- md_spin_lock_irqsave(&sh->stripe_lock, flags);
- bh = sh->bh_pool;
- if (!bh)
- goto out_unlock;
+	save_flags(flags);
+	cli();
+	if ((bh = sh->bh_pool) == NULL) {
+		restore_flags(flags);
+		return NULL;
+	}
sh->bh_pool = bh->b_next;
-out_unlock:
- md_spin_unlock_irqrestore(&sh->stripe_lock, flags);
-
+ restore_flags(flags);
return bh;
}
{
unsigned long flags;
- md_spin_lock_irqsave(&sh->stripe_lock, flags);
+ save_flags(flags);
+ cli();
bh->b_next = sh->buffer_pool;
sh->buffer_pool = bh;
- md_spin_unlock_irqrestore(&sh->stripe_lock, flags);
+ restore_flags(flags);
}
static void put_free_bh(struct stripe_head *sh, struct buffer_head *bh)
{
unsigned long flags;
- md_spin_lock_irqsave(&sh->stripe_lock, flags);
+ save_flags(flags);
+ cli();
bh->b_next = sh->bh_pool;
sh->bh_pool = bh;
- md_spin_unlock_irqrestore(&sh->stripe_lock, flags);
+ restore_flags(flags);
}
-static struct stripe_head *get_free_stripe(raid5_conf_t *conf)
+static struct stripe_head *get_free_stripe(struct raid5_data *raid_conf)
{
struct stripe_head *sh;
unsigned long flags;
save_flags(flags);
cli();
- if ((sh = conf->free_sh_list) == NULL) {
+ if ((sh = raid_conf->free_sh_list) == NULL) {
restore_flags(flags);
return NULL;
}
- conf->free_sh_list = sh->free_next;
- conf->nr_free_sh--;
- if (!conf->nr_free_sh && conf->free_sh_list)
+ raid_conf->free_sh_list = sh->free_next;
+ raid_conf->nr_free_sh--;
+ if (!raid_conf->nr_free_sh && raid_conf->free_sh_list)
printk ("raid5: bug: free_sh_list != NULL, nr_free_sh == 0\n");
restore_flags(flags);
- if (sh->hash_pprev || md_atomic_read(&sh->nr_pending) || sh->count)
+ if (sh->hash_pprev || sh->nr_pending || sh->count)
printk("get_free_stripe(): bug\n");
return sh;
}
-static void put_free_stripe(raid5_conf_t *conf, struct stripe_head *sh)
+static void put_free_stripe(struct raid5_data *raid_conf, struct stripe_head *sh)
{
unsigned long flags;
save_flags(flags);
cli();
- sh->free_next = conf->free_sh_list;
- conf->free_sh_list = sh;
- conf->nr_free_sh++;
+ sh->free_next = raid_conf->free_sh_list;
+ raid_conf->free_sh_list = sh;
+ raid_conf->nr_free_sh++;
restore_flags(flags);
}
static void kfree_stripe(struct stripe_head *sh)
{
- raid5_conf_t *conf = sh->raid_conf;
- int disks = conf->raid_disks, j;
+ struct raid5_data *raid_conf = sh->raid_conf;
+ int disks = raid_conf->raid_disks, j;
PRINTK(("kfree_stripe called, stripe %lu\n", sh->sector));
if (sh->phase != PHASE_COMPLETE || stripe_locked(sh) || sh->count) {
if (sh->bh_new[j] || sh->bh_copy[j])
printk("raid5: bug: sector %lu, new %p, copy %p\n", sh->sector, sh->bh_new[j], sh->bh_copy[j]);
}
- remove_hash(conf, sh);
- put_free_stripe(conf, sh);
+ remove_hash(raid_conf, sh);
+ put_free_stripe(raid_conf, sh);
}
-static int shrink_stripe_cache(raid5_conf_t *conf, int nr)
+static int shrink_stripe_cache(struct raid5_data *raid_conf, int nr)
{
struct stripe_head *sh;
int i, count = 0;
- PRINTK(("shrink_stripe_cache called, %d/%d, clock %d\n", nr, conf->nr_hashed_stripes, conf->clock));
+ PRINTK(("shrink_stripe_cache called, %d/%d, clock %d\n", nr, raid_conf->nr_hashed_stripes, raid_conf->clock));
for (i = 0; i < NR_HASH; i++) {
repeat:
- sh = conf->stripe_hashtbl[(i + conf->clock) & HASH_MASK];
+ sh = raid_conf->stripe_hashtbl[(i + raid_conf->clock) & HASH_MASK];
for (; sh; sh = sh->hash_next) {
if (sh->phase != PHASE_COMPLETE)
continue;
continue;
kfree_stripe(sh);
if (++count == nr) {
- PRINTK(("shrink completed, nr_hashed_stripes %d\n", conf->nr_hashed_stripes));
- conf->clock = (i + conf->clock) & HASH_MASK;
+ PRINTK(("shrink completed, nr_hashed_stripes %d\n", raid_conf->nr_hashed_stripes));
+ raid_conf->clock = (i + raid_conf->clock) & HASH_MASK;
return nr;
}
goto repeat;
}
}
- PRINTK(("shrink completed, nr_hashed_stripes %d\n", conf->nr_hashed_stripes));
+ PRINTK(("shrink completed, nr_hashed_stripes %d\n", raid_conf->nr_hashed_stripes));
return count;
}
-static struct stripe_head *find_stripe(raid5_conf_t *conf, unsigned long sector, int size)
+static struct stripe_head *find_stripe(struct raid5_data *raid_conf, unsigned long sector, int size)
{
struct stripe_head *sh;
- if (conf->buffer_size != size) {
- PRINTK(("switching size, %d --> %d\n", conf->buffer_size, size));
- shrink_stripe_cache(conf, conf->max_nr_stripes);
- conf->buffer_size = size;
+ if (raid_conf->buffer_size != size) {
+ PRINTK(("switching size, %d --> %d\n", raid_conf->buffer_size, size));
+ shrink_stripe_cache(raid_conf, raid_conf->max_nr_stripes);
+ raid_conf->buffer_size = size;
}
PRINTK(("find_stripe, sector %lu\n", sector));
- for (sh = stripe_hash(conf, sector, size); sh; sh = sh->hash_next)
- if (sh->sector == sector && sh->raid_conf == conf) {
+ for (sh = stripe_hash(raid_conf, sector, size); sh; sh = sh->hash_next)
+ if (sh->sector == sector && sh->raid_conf == raid_conf) {
if (sh->size == size) {
PRINTK(("found stripe %lu\n", sector));
return sh;
return NULL;
}
-static int grow_stripes(raid5_conf_t *conf, int num, int priority)
+static int grow_stripes(struct raid5_data *raid_conf, int num, int priority)
{
struct stripe_head *sh;
if ((sh = kmalloc(sizeof(struct stripe_head), priority)) == NULL)
return 1;
memset(sh, 0, sizeof(*sh));
- sh->stripe_lock = MD_SPIN_LOCK_UNLOCKED;
-
- if (grow_buffers(sh, 2 * conf->raid_disks, PAGE_SIZE, priority)) {
- shrink_buffers(sh, 2 * conf->raid_disks);
+ if (grow_buffers(sh, 2 * raid_conf->raid_disks, PAGE_SIZE, priority)) {
+ shrink_buffers(sh, 2 * raid_conf->raid_disks);
kfree(sh);
return 1;
}
- if (grow_bh(sh, conf->raid_disks, priority)) {
- shrink_buffers(sh, 2 * conf->raid_disks);
- shrink_bh(sh, conf->raid_disks);
+ if (grow_bh(sh, raid_conf->raid_disks, priority)) {
+ shrink_buffers(sh, 2 * raid_conf->raid_disks);
+ shrink_bh(sh, raid_conf->raid_disks);
kfree(sh);
return 1;
}
- put_free_stripe(conf, sh);
- conf->nr_stripes++;
+ put_free_stripe(raid_conf, sh);
+ raid_conf->nr_stripes++;
}
return 0;
}
-static void shrink_stripes(raid5_conf_t *conf, int num)
+static void shrink_stripes(struct raid5_data *raid_conf, int num)
{
struct stripe_head *sh;
while (num--) {
- sh = get_free_stripe(conf);
+ sh = get_free_stripe(raid_conf);
if (!sh)
break;
- shrink_buffers(sh, conf->raid_disks * 2);
- shrink_bh(sh, conf->raid_disks);
+ shrink_buffers(sh, raid_conf->raid_disks * 2);
+ shrink_bh(sh, raid_conf->raid_disks);
kfree(sh);
- conf->nr_stripes--;
+ raid_conf->nr_stripes--;
}
}
-static struct stripe_head *kmalloc_stripe(raid5_conf_t *conf, unsigned long sector, int size)
+static struct stripe_head *kmalloc_stripe(struct raid5_data *raid_conf, unsigned long sector, int size)
{
struct stripe_head *sh = NULL, *tmp;
struct buffer_head *buffer_pool, *bh_pool;
PRINTK(("kmalloc_stripe called\n"));
- while ((sh = get_free_stripe(conf)) == NULL) {
- shrink_stripe_cache(conf, conf->max_nr_stripes / 8);
- if ((sh = get_free_stripe(conf)) != NULL)
+ while ((sh = get_free_stripe(raid_conf)) == NULL) {
+ shrink_stripe_cache(raid_conf, raid_conf->max_nr_stripes / 8);
+ if ((sh = get_free_stripe(raid_conf)) != NULL)
break;
- if (!conf->nr_pending_stripes)
+ if (!raid_conf->nr_pending_stripes)
printk("raid5: bug: nr_free_sh == 0, nr_pending_stripes == 0\n");
- md_wakeup_thread(conf->thread);
+ md_wakeup_thread(raid_conf->thread);
PRINTK(("waiting for some stripes to complete\n"));
- sleep_on(&conf->wait_for_stripe);
+ sleep_on(&raid_conf->wait_for_stripe);
}
/*
* The above might have slept, so perhaps another process
* already created the stripe for us..
*/
- if ((tmp = find_stripe(conf, sector, size)) != NULL) {
- put_free_stripe(conf, sh);
+ if ((tmp = find_stripe(raid_conf, sector, size)) != NULL) {
+ put_free_stripe(raid_conf, sh);
wait_on_stripe(tmp);
return tmp;
}
sh->bh_pool = bh_pool;
sh->phase = PHASE_COMPLETE;
sh->cmd = STRIPE_NONE;
- sh->raid_conf = conf;
+ sh->raid_conf = raid_conf;
sh->sector = sector;
sh->size = size;
- conf->nr_cached_stripes++;
- insert_hash(conf, sh);
+ raid_conf->nr_cached_stripes++;
+ insert_hash(raid_conf, sh);
} else printk("raid5: bug: kmalloc_stripe() == NULL\n");
return sh;
}
-static struct stripe_head *get_stripe(raid5_conf_t *conf, unsigned long sector, int size)
+static struct stripe_head *get_stripe(struct raid5_data *raid_conf, unsigned long sector, int size)
{
struct stripe_head *sh;
PRINTK(("get_stripe, sector %lu\n", sector));
- sh = find_stripe(conf, sector, size);
+ sh = find_stripe(raid_conf, sector, size);
if (sh)
wait_on_stripe(sh);
else
- sh = kmalloc_stripe(conf, sector, size);
+ sh = kmalloc_stripe(raid_conf, sector, size);
return sh;
}
bh->b_end_io(bh, uptodate);
if (!uptodate)
printk(KERN_ALERT "raid5: %s: unrecoverable I/O error for "
- "block %lu\n", partition_name(bh->b_dev), bh->b_blocknr);
+ "block %lu\n", kdevname(bh->b_dev), bh->b_blocknr);
}
static inline void raid5_mark_buffer_uptodate (struct buffer_head *bh, int uptodate)
static void raid5_end_request (struct buffer_head * bh, int uptodate)
{
struct stripe_head *sh = bh->b_dev_id;
- raid5_conf_t *conf = sh->raid_conf;
- int disks = conf->raid_disks, i;
+ struct raid5_data *raid_conf = sh->raid_conf;
+ int disks = raid_conf->raid_disks, i;
unsigned long flags;
PRINTK(("end_request %lu, nr_pending %d\n", sh->sector, sh->nr_pending));
- md_spin_lock_irqsave(&sh->stripe_lock, flags);
+ save_flags(flags);
+ cli();
raid5_mark_buffer_uptodate(bh, uptodate);
- if (atomic_dec_and_test(&sh->nr_pending)) {
- md_wakeup_thread(conf->thread);
- atomic_inc(&conf->nr_handle);
+ --sh->nr_pending;
+ if (!sh->nr_pending) {
+ md_wakeup_thread(raid_conf->thread);
+ atomic_inc(&raid_conf->nr_handle);
}
- if (!uptodate) {
+ if (!uptodate)
md_error(bh->b_dev, bh->b_rdev);
- }
- if (conf->failed_disks) {
+ if (raid_conf->failed_disks) {
for (i = 0; i < disks; i++) {
- if (conf->disks[i].operational)
+ if (raid_conf->disks[i].operational)
continue;
if (bh != sh->bh_old[i] && bh != sh->bh_req[i] && bh != sh->bh_copy[i])
continue;
- if (bh->b_rdev != conf->disks[i].dev)
+ if (bh->b_rdev != raid_conf->disks[i].dev)
continue;
set_bit(STRIPE_ERROR, &sh->state);
}
}
- md_spin_unlock_irqrestore(&sh->stripe_lock, flags);
+ restore_flags(flags);
}
-static int raid5_map (mddev_t *mddev, kdev_t dev, kdev_t *rdev,
+static int raid5_map (struct md_dev *mddev, kdev_t *rdev,
unsigned long *rsector, unsigned long size)
{
/* No complex mapping used: the core of the work is done in the
static void raid5_build_block (struct stripe_head *sh, struct buffer_head *bh, int i)
{
- raid5_conf_t *conf = sh->raid_conf;
- mddev_t *mddev = conf->mddev;
+ struct raid5_data *raid_conf = sh->raid_conf;
+ struct md_dev *mddev = raid_conf->mddev;
+ int minor = (int) (mddev - md_dev);
char *b_data;
- kdev_t dev = mddev_to_kdev(mddev);
+ kdev_t dev = MKDEV(MD_MAJOR, minor);
int block = sh->sector / (sh->size >> 9);
b_data = ((volatile struct buffer_head *) bh)->b_data;
init_buffer(bh, dev, block, raid5_end_request, sh);
((volatile struct buffer_head *) bh)->b_data = b_data;
- bh->b_rdev = conf->disks[i].dev;
+ bh->b_rdev = raid_conf->disks[i].dev;
bh->b_rsector = sh->sector;
bh->b_state = (1 << BH_Req);
bh->b_list = BUF_LOCKED;
}
-static int raid5_error (mddev_t *mddev, kdev_t dev)
+static int raid5_error (struct md_dev *mddev, kdev_t dev)
{
- raid5_conf_t *conf = (raid5_conf_t *) mddev->private;
- mdp_super_t *sb = mddev->sb;
+ struct raid5_data *raid_conf = (struct raid5_data *) mddev->private;
+ md_superblock_t *sb = mddev->sb;
struct disk_info *disk;
int i;
PRINTK(("raid5_error called\n"));
- conf->resync_parity = 0;
- for (i = 0, disk = conf->disks; i < conf->raid_disks; i++, disk++) {
+ raid_conf->resync_parity = 0;
+ for (i = 0, disk = raid_conf->disks; i < raid_conf->raid_disks; i++, disk++)
if (disk->dev == dev && disk->operational) {
disk->operational = 0;
- mark_disk_faulty(sb->disks+disk->number);
- mark_disk_nonsync(sb->disks+disk->number);
- mark_disk_inactive(sb->disks+disk->number);
+ sb->disks[disk->number].state |= (1 << MD_FAULTY_DEVICE);
+ sb->disks[disk->number].state &= ~(1 << MD_SYNC_DEVICE);
+ sb->disks[disk->number].state &= ~(1 << MD_ACTIVE_DEVICE);
sb->active_disks--;
sb->working_disks--;
sb->failed_disks++;
mddev->sb_dirty = 1;
- conf->working_disks--;
- conf->failed_disks++;
- md_wakeup_thread(conf->thread);
+ raid_conf->working_disks--;
+ raid_conf->failed_disks++;
+ md_wakeup_thread(raid_conf->thread);
printk (KERN_ALERT
- "raid5: Disk failure on %s, disabling device."
- " Operation continuing on %d devices\n",
- partition_name (dev), conf->working_disks);
- return -EIO;
- }
- }
- /*
- * handle errors in spares (during reconstruction)
- */
- if (conf->spare) {
- disk = conf->spare;
- if (disk->dev == dev) {
- printk (KERN_ALERT
- "raid5: Disk failure on spare %s\n",
- partition_name (dev));
- if (!conf->spare->operational) {
- MD_BUG();
- return -EIO;
- }
- disk->operational = 0;
- disk->write_only = 0;
- conf->spare = NULL;
- mark_disk_faulty(sb->disks+disk->number);
- mark_disk_nonsync(sb->disks+disk->number);
- mark_disk_inactive(sb->disks+disk->number);
- sb->spare_disks--;
- sb->working_disks--;
- sb->failed_disks++;
-
- return -EIO;
+			"RAID5: Disk failure on %s, disabling device. "
+			"Operation continuing on %d devices\n",
+ kdevname (dev), raid_conf->working_disks);
}
- }
- MD_BUG();
return 0;
}
static inline unsigned long
raid5_compute_sector (int r_sector, unsigned int raid_disks, unsigned int data_disks,
unsigned int * dd_idx, unsigned int * pd_idx,
- raid5_conf_t *conf)
+ struct raid5_data *raid_conf)
{
unsigned int stripe;
int chunk_number, chunk_offset;
unsigned long new_sector;
- int sectors_per_chunk = conf->chunk_size >> 9;
+ int sectors_per_chunk = raid_conf->chunk_size >> 9;
/* First compute the information on this sector */
/*
* Select the parity disk based on the user selected algorithm.
*/
- if (conf->level == 4)
+ if (raid_conf->level == 4)
*pd_idx = data_disks;
- else switch (conf->algorithm) {
+ else switch (raid_conf->algorithm) {
case ALGORITHM_LEFT_ASYMMETRIC:
*pd_idx = data_disks - stripe % raid_disks;
if (*dd_idx >= *pd_idx)
*dd_idx = (*pd_idx + 1 + *dd_idx) % raid_disks;
break;
default:
- printk ("raid5: unsupported algorithm %d\n", conf->algorithm);
+ printk ("raid5: unsupported algorithm %d\n", raid_conf->algorithm);
}
/*
static unsigned long compute_blocknr(struct stripe_head *sh, int i)
{
- raid5_conf_t *conf = sh->raid_conf;
- int raid_disks = conf->raid_disks, data_disks = raid_disks - 1;
+ struct raid5_data *raid_conf = sh->raid_conf;
+ int raid_disks = raid_conf->raid_disks, data_disks = raid_disks - 1;
unsigned long new_sector = sh->sector, check;
- int sectors_per_chunk = conf->chunk_size >> 9;
+ int sectors_per_chunk = raid_conf->chunk_size >> 9;
unsigned long stripe = new_sector / sectors_per_chunk;
int chunk_offset = new_sector % sectors_per_chunk;
int chunk_number, dummy1, dummy2, dd_idx = i;
unsigned long r_sector, blocknr;
- switch (conf->algorithm) {
+ switch (raid_conf->algorithm) {
case ALGORITHM_LEFT_ASYMMETRIC:
case ALGORITHM_RIGHT_ASYMMETRIC:
if (i > sh->pd_idx)
i -= (sh->pd_idx + 1);
break;
default:
- printk ("raid5: unsupported algorithm %d\n", conf->algorithm);
+ printk ("raid5: unsupported algorithm %d\n", raid_conf->algorithm);
}
chunk_number = stripe * data_disks + i;
r_sector = chunk_number * sectors_per_chunk + chunk_offset;
blocknr = r_sector / (sh->size >> 9);
- check = raid5_compute_sector (r_sector, raid_disks, data_disks, &dummy1, &dummy2, conf);
+ check = raid5_compute_sector (r_sector, raid_disks, data_disks, &dummy1, &dummy2, raid_conf);
if (check != sh->sector || dummy1 != dd_idx || dummy2 != sh->pd_idx) {
printk("compute_blocknr: map not correct\n");
return 0;
return blocknr;
}
+#ifdef HAVE_ARCH_XORBLOCK
+static void xor_block(struct buffer_head *dest, struct buffer_head *source)
+{
+ __xor_block((char *) dest->b_data, (char *) source->b_data, dest->b_size);
+}
+#else
+static void xor_block(struct buffer_head *dest, struct buffer_head *source)
+{
+ long lines = dest->b_size / (sizeof (long)) / 8, i;
+ long *destp = (long *) dest->b_data, *sourcep = (long *) source->b_data;
+
+ for (i = lines; i > 0; i--) {
+ *(destp + 0) ^= *(sourcep + 0);
+ *(destp + 1) ^= *(sourcep + 1);
+ *(destp + 2) ^= *(sourcep + 2);
+ *(destp + 3) ^= *(sourcep + 3);
+ *(destp + 4) ^= *(sourcep + 4);
+ *(destp + 5) ^= *(sourcep + 5);
+ *(destp + 6) ^= *(sourcep + 6);
+ *(destp + 7) ^= *(sourcep + 7);
+ destp += 8;
+ sourcep += 8;
+ }
+}
+#endif
+
static void compute_block(struct stripe_head *sh, int dd_idx)
{
- raid5_conf_t *conf = sh->raid_conf;
- int i, count, disks = conf->raid_disks;
- struct buffer_head *bh_ptr[MAX_XOR_BLOCKS];
+ struct raid5_data *raid_conf = sh->raid_conf;
+ int i, disks = raid_conf->raid_disks;
PRINTK(("compute_block, stripe %lu, idx %d\n", sh->sector, dd_idx));
raid5_build_block(sh, sh->bh_old[dd_idx], dd_idx);
memset(sh->bh_old[dd_idx]->b_data, 0, sh->size);
- bh_ptr[0] = sh->bh_old[dd_idx];
- count = 1;
for (i = 0; i < disks; i++) {
if (i == dd_idx)
continue;
if (sh->bh_old[i]) {
- bh_ptr[count++] = sh->bh_old[i];
- } else {
+ xor_block(sh->bh_old[dd_idx], sh->bh_old[i]);
+ continue;
+ } else
printk("compute_block() %d, stripe %lu, %d not present\n", dd_idx, sh->sector, i);
- }
- if (count == MAX_XOR_BLOCKS) {
- xor_block(count, &bh_ptr[0]);
- count = 1;
- }
- }
- if(count != 1) {
- xor_block(count, &bh_ptr[0]);
}
raid5_mark_buffer_uptodate(sh->bh_old[dd_idx], 1);
}
static void compute_parity(struct stripe_head *sh, int method)
{
- raid5_conf_t *conf = sh->raid_conf;
- int i, pd_idx = sh->pd_idx, disks = conf->raid_disks, lowprio, count;
- struct buffer_head *bh_ptr[MAX_XOR_BLOCKS];
+ struct raid5_data *raid_conf = sh->raid_conf;
+ int i, pd_idx = sh->pd_idx, disks = raid_conf->raid_disks;
PRINTK(("compute_parity, stripe %lu, method %d\n", sh->sector, method));
- lowprio = 1;
for (i = 0; i < disks; i++) {
if (i == pd_idx || !sh->bh_new[i])
continue;
if (!sh->bh_copy[i])
sh->bh_copy[i] = raid5_kmalloc_buffer(sh, sh->size);
raid5_build_block(sh, sh->bh_copy[i], i);
- if (!buffer_lowprio(sh->bh_new[i]))
- lowprio = 0;
- else
- mark_buffer_lowprio(sh->bh_copy[i]);
mark_buffer_clean(sh->bh_new[i]);
memcpy(sh->bh_copy[i]->b_data, sh->bh_new[i]->b_data, sh->size);
}
if (sh->bh_copy[pd_idx] == NULL)
sh->bh_copy[pd_idx] = raid5_kmalloc_buffer(sh, sh->size);
raid5_build_block(sh, sh->bh_copy[pd_idx], sh->pd_idx);
- if (lowprio)
- mark_buffer_lowprio(sh->bh_copy[pd_idx]);
if (method == RECONSTRUCT_WRITE) {
memset(sh->bh_copy[pd_idx]->b_data, 0, sh->size);
- bh_ptr[0] = sh->bh_copy[pd_idx];
- count = 1;
for (i = 0; i < disks; i++) {
if (i == sh->pd_idx)
continue;
if (sh->bh_new[i]) {
- bh_ptr[count++] = sh->bh_copy[i];
- } else if (sh->bh_old[i]) {
- bh_ptr[count++] = sh->bh_old[i];
+ xor_block(sh->bh_copy[pd_idx], sh->bh_copy[i]);
+ continue;
}
- if (count == MAX_XOR_BLOCKS) {
- xor_block(count, &bh_ptr[0]);
- count = 1;
+ if (sh->bh_old[i]) {
+ xor_block(sh->bh_copy[pd_idx], sh->bh_old[i]);
+ continue;
}
}
- if (count != 1) {
- xor_block(count, &bh_ptr[0]);
- }
} else if (method == READ_MODIFY_WRITE) {
memcpy(sh->bh_copy[pd_idx]->b_data, sh->bh_old[pd_idx]->b_data, sh->size);
- bh_ptr[0] = sh->bh_copy[pd_idx];
- count = 1;
for (i = 0; i < disks; i++) {
if (i == sh->pd_idx)
continue;
if (sh->bh_new[i] && sh->bh_old[i]) {
- bh_ptr[count++] = sh->bh_copy[i];
- bh_ptr[count++] = sh->bh_old[i];
- }
- if (count >= (MAX_XOR_BLOCKS - 1)) {
- xor_block(count, &bh_ptr[0]);
- count = 1;
+ xor_block(sh->bh_copy[pd_idx], sh->bh_copy[i]);
+ xor_block(sh->bh_copy[pd_idx], sh->bh_old[i]);
+ continue;
}
}
- if (count != 1) {
- xor_block(count, &bh_ptr[0]);
- }
}
raid5_mark_buffer_uptodate(sh->bh_copy[pd_idx], 1);
}
static void add_stripe_bh (struct stripe_head *sh, struct buffer_head *bh, int dd_idx, int rw)
{
- raid5_conf_t *conf = sh->raid_conf;
+ struct raid5_data *raid_conf = sh->raid_conf;
struct buffer_head *bh_req;
if (sh->bh_new[dd_idx]) {
if (sh->phase == PHASE_COMPLETE && sh->cmd == STRIPE_NONE) {
sh->phase = PHASE_BEGIN;
sh->cmd = (rw == READ) ? STRIPE_READ : STRIPE_WRITE;
- conf->nr_pending_stripes++;
- atomic_inc(&conf->nr_handle);
+ raid_conf->nr_pending_stripes++;
+ atomic_inc(&raid_conf->nr_handle);
}
sh->bh_new[dd_idx] = bh;
sh->bh_req[dd_idx] = bh_req;
sh->cmd_new[dd_idx] = rw;
sh->new[dd_idx] = 1;
-
- if (buffer_lowprio(bh))
- mark_buffer_lowprio(bh_req);
}
static void complete_stripe(struct stripe_head *sh)
{
- raid5_conf_t *conf = sh->raid_conf;
- int disks = conf->raid_disks;
+ struct raid5_data *raid_conf = sh->raid_conf;
+ int disks = raid_conf->raid_disks;
int i, new = 0;
PRINTK(("complete_stripe %lu\n", sh->sector));
}
}
-
-static int is_stripe_lowprio(struct stripe_head *sh, int disks)
-{
- int i, lowprio = 1;
-
- for (i = 0; i < disks; i++) {
- if (sh->bh_new[i])
- if (!buffer_lowprio(sh->bh_new[i]))
- lowprio = 0;
- if (sh->bh_old[i])
- if (!buffer_lowprio(sh->bh_old[i]))
- lowprio = 0;
- }
- return lowprio;
-}
-
/*
* handle_stripe() is our main logic routine. Note that:
*
* 2. We should be careful to set sh->nr_pending whenever we sleep,
* to prevent re-entry of handle_stripe() for the same sh.
*
- * 3. conf->failed_disks and disk->operational can be changed
+ * 3. raid_conf->failed_disks and disk->operational can be changed
* from an interrupt. This complicates things a bit, but it allows
* us to stop issuing requests for a failed drive as soon as possible.
*/
static void handle_stripe(struct stripe_head *sh)
{
- raid5_conf_t *conf = sh->raid_conf;
- mddev_t *mddev = conf->mddev;
+ struct raid5_data *raid_conf = sh->raid_conf;
+ struct md_dev *mddev = raid_conf->mddev;
+ int minor = (int) (mddev - md_dev);
struct buffer_head *bh;
- int disks = conf->raid_disks;
- int i, nr = 0, nr_read = 0, nr_write = 0, lowprio;
+ int disks = raid_conf->raid_disks;
+ int i, nr = 0, nr_read = 0, nr_write = 0;
int nr_cache = 0, nr_cache_other = 0, nr_cache_overwrite = 0, parity = 0;
int nr_failed_other = 0, nr_failed_overwrite = 0, parity_failed = 0;
int reading = 0, nr_writing = 0;
int method1 = INT_MAX, method2 = INT_MAX;
int block;
unsigned long flags;
- int operational[MD_SB_DISKS], failed_disks = conf->failed_disks;
+ int operational[MD_SB_DISKS], failed_disks = raid_conf->failed_disks;
PRINTK(("handle_stripe(), stripe %lu\n", sh->sector));
- if (md_atomic_read(&sh->nr_pending)) {
+ if (sh->nr_pending) {
printk("handle_stripe(), stripe %lu, io still pending\n", sh->sector);
return;
}
return;
}
- atomic_dec(&conf->nr_handle);
+ atomic_dec(&raid_conf->nr_handle);
- if (md_test_and_clear_bit(STRIPE_ERROR, &sh->state)) {
+ if (test_and_clear_bit(STRIPE_ERROR, &sh->state)) {
printk("raid5: restarting stripe %lu\n", sh->sector);
sh->phase = PHASE_BEGIN;
}
save_flags(flags);
cli();
for (i = 0; i < disks; i++) {
- operational[i] = conf->disks[i].operational;
- if (i == sh->pd_idx && conf->resync_parity)
+ operational[i] = raid_conf->disks[i].operational;
+ if (i == sh->pd_idx && raid_conf->resync_parity)
operational[i] = 0;
}
- failed_disks = conf->failed_disks;
+ failed_disks = raid_conf->failed_disks;
restore_flags(flags);
if (failed_disks > 1) {
}
if (nr_write && nr_read)
- printk("raid5: bug, nr_write ==`%d, nr_read == %d, sh->cmd == %d\n", nr_write, nr_read, sh->cmd);
+ printk("raid5: bug, nr_write == %d, nr_read == %d, sh->cmd == %d\n", nr_write, nr_read, sh->cmd);
if (nr_write) {
/*
if (sh->bh_new[i])
continue;
block = (int) compute_blocknr(sh, i);
- bh = find_buffer(mddev_to_kdev(mddev), block, sh->size);
+ bh = find_buffer(MKDEV(MD_MAJOR, minor), block, sh->size);
if (bh && bh->b_count == 0 && buffer_dirty(bh) && !buffer_locked(bh)) {
PRINTK(("Whee.. sector %lu, index %d (%d) found in the buffer cache!\n", sh->sector, i, block));
add_stripe_bh(sh, bh, i, WRITE);
if (!method1 || !method2) {
lock_stripe(sh);
- lowprio = is_stripe_lowprio(sh, disks);
- atomic_inc(&sh->nr_pending);
+ sh->nr_pending++;
sh->phase = PHASE_WRITE;
compute_parity(sh, method1 <= method2 ? RECONSTRUCT_WRITE : READ_MODIFY_WRITE);
for (i = 0; i < disks; i++) {
- if (!operational[i] && !conf->spare && !conf->resync_parity)
+ if (!operational[i] && !raid_conf->spare && !raid_conf->resync_parity)
continue;
if (i == sh->pd_idx || sh->bh_new[i])
nr_writing++;
}
- md_atomic_set(&sh->nr_pending, nr_writing);
- PRINTK(("handle_stripe() %lu, writing back %d\n", sh->sector, md_atomic_read(&sh->nr_pending)));
+ sh->nr_pending = nr_writing;
+ PRINTK(("handle_stripe() %lu, writing back %d\n", sh->sector, sh->nr_pending));
for (i = 0; i < disks; i++) {
- if (!operational[i] && !conf->spare && !conf->resync_parity)
+ if (!operational[i] && !raid_conf->spare && !raid_conf->resync_parity)
continue;
bh = sh->bh_copy[i];
if (i != sh->pd_idx && ((bh == NULL) ^ (sh->bh_new[i] == NULL)))
bh->b_state |= (1<<BH_Dirty);
PRINTK(("making request for buffer %d\n", i));
clear_bit(BH_Lock, &bh->b_state);
- if (!operational[i] && !conf->resync_parity) {
- bh->b_rdev = conf->spare->dev;
- make_request(MAJOR(conf->spare->dev), WRITE, bh);
- } else {
-#if 0
- make_request(MAJOR(conf->disks[i].dev), WRITE, bh);
-#else
- if (!lowprio || (i==sh->pd_idx))
- make_request(MAJOR(conf->disks[i].dev), WRITE, bh);
- else {
- mark_buffer_clean(bh);
- raid5_end_request(bh,1);
- sh->new[i] = 0;
- }
-#endif
- }
+ if (!operational[i] && !raid_conf->resync_parity) {
+ bh->b_rdev = raid_conf->spare->dev;
+ make_request(MAJOR(raid_conf->spare->dev), WRITE, bh);
+ } else
+ make_request(MAJOR(raid_conf->disks[i].dev), WRITE, bh);
}
}
return;
}
lock_stripe(sh);
- lowprio = is_stripe_lowprio(sh, disks);
- atomic_inc(&sh->nr_pending);
+ sh->nr_pending++;
if (method1 < method2) {
sh->write_method = RECONSTRUCT_WRITE;
for (i = 0; i < disks; i++) {
continue;
sh->bh_old[i] = raid5_kmalloc_buffer(sh, sh->size);
raid5_build_block(sh, sh->bh_old[i], i);
- if (lowprio)
- mark_buffer_lowprio(sh->bh_old[i]);
reading++;
}
} else {
continue;
sh->bh_old[i] = raid5_kmalloc_buffer(sh, sh->size);
raid5_build_block(sh, sh->bh_old[i], i);
- if (lowprio)
- mark_buffer_lowprio(sh->bh_old[i]);
reading++;
}
}
sh->phase = PHASE_READ_OLD;
- md_atomic_set(&sh->nr_pending, reading);
- PRINTK(("handle_stripe() %lu, reading %d old buffers\n", sh->sector, md_atomic_read(&sh->nr_pending)));
+ sh->nr_pending = reading;
+ PRINTK(("handle_stripe() %lu, reading %d old buffers\n", sh->sector, sh->nr_pending));
for (i = 0; i < disks; i++) {
if (!sh->bh_old[i])
continue;
if (buffer_uptodate(sh->bh_old[i]))
continue;
clear_bit(BH_Lock, &sh->bh_old[i]->b_state);
- make_request(MAJOR(conf->disks[i].dev), READ, sh->bh_old[i]);
+ make_request(MAJOR(raid_conf->disks[i].dev), READ, sh->bh_old[i]);
}
} else {
/*
*/
method1 = nr_read - nr_cache_overwrite;
lock_stripe(sh);
- lowprio = is_stripe_lowprio(sh,disks);
- atomic_inc(&sh->nr_pending);
+ sh->nr_pending++;
PRINTK(("handle_stripe(), sector %lu, nr_read %d, nr_cache %d, method1 %d\n", sh->sector, nr_read, nr_cache, method1));
if (!method1 || (method1 == 1 && nr_cache == disks - 1)) {
for (i = 0; i < disks; i++) {
if (!sh->bh_new[i])
continue;
- if (!sh->bh_old[i]) {
+ if (!sh->bh_old[i])
compute_block(sh, i);
- if (lowprio)
- mark_buffer_lowprio
- (sh->bh_old[i]);
- }
memcpy(sh->bh_new[i]->b_data, sh->bh_old[i]->b_data, sh->size);
}
- atomic_dec(&sh->nr_pending);
+ sh->nr_pending--;
complete_stripe(sh);
return;
}
if (nr_failed_overwrite) {
sh->phase = PHASE_READ_OLD;
- md_atomic_set(&sh->nr_pending, (disks - 1) - nr_cache);
- PRINTK(("handle_stripe() %lu, phase READ_OLD, pending %d\n", sh->sector, md_atomic_read(&sh->nr_pending)));
+ sh->nr_pending = (disks - 1) - nr_cache;
+ PRINTK(("handle_stripe() %lu, phase READ_OLD, pending %d\n", sh->sector, sh->nr_pending));
for (i = 0; i < disks; i++) {
if (sh->bh_old[i])
continue;
continue;
sh->bh_old[i] = raid5_kmalloc_buffer(sh, sh->size);
raid5_build_block(sh, sh->bh_old[i], i);
- if (lowprio)
- mark_buffer_lowprio(sh->bh_old[i]);
clear_bit(BH_Lock, &sh->bh_old[i]->b_state);
- make_request(MAJOR(conf->disks[i].dev), READ, sh->bh_old[i]);
+ make_request(MAJOR(raid_conf->disks[i].dev), READ, sh->bh_old[i]);
}
} else {
sh->phase = PHASE_READ;
- md_atomic_set(&sh->nr_pending,
- nr_read - nr_cache_overwrite);
- PRINTK(("handle_stripe() %lu, phase READ, pending %d\n", sh->sector, md_atomic_read(&sh->nr_pending)));
+ sh->nr_pending = nr_read - nr_cache_overwrite;
+ PRINTK(("handle_stripe() %lu, phase READ, pending %d\n", sh->sector, sh->nr_pending));
for (i = 0; i < disks; i++) {
if (!sh->bh_new[i])
continue;
memcpy(sh->bh_new[i]->b_data, sh->bh_old[i]->b_data, sh->size);
continue;
}
- make_request(MAJOR(conf->disks[i].dev), READ, sh->bh_req[i]);
+ make_request(MAJOR(raid_conf->disks[i].dev), READ, sh->bh_req[i]);
}
}
}
}
-static int raid5_make_request (mddev_t *mddev, int rw, struct buffer_head * bh)
+static int raid5_make_request (struct md_dev *mddev, int rw, struct buffer_head * bh)
{
- raid5_conf_t *conf = (raid5_conf_t *) mddev->private;
- const unsigned int raid_disks = conf->raid_disks;
+ struct raid5_data *raid_conf = (struct raid5_data *) mddev->private;
+ const unsigned int raid_disks = raid_conf->raid_disks;
const unsigned int data_disks = raid_disks - 1;
unsigned int dd_idx, pd_idx;
unsigned long new_sector;
if (rw == WRITEA) rw = WRITE;
new_sector = raid5_compute_sector(bh->b_rsector, raid_disks, data_disks,
- &dd_idx, &pd_idx, conf);
+ &dd_idx, &pd_idx, raid_conf);
PRINTK(("raid5_make_request, sector %lu\n", new_sector));
repeat:
- sh = get_stripe(conf, new_sector, bh->b_size);
+ sh = get_stripe(raid_conf, new_sector, bh->b_size);
if ((rw == READ && sh->cmd == STRIPE_WRITE) || (rw == WRITE && sh->cmd == STRIPE_READ)) {
PRINTK(("raid5: lock contention, rw == %d, sh->cmd == %d\n", rw, sh->cmd));
lock_stripe(sh);
- if (!md_atomic_read(&sh->nr_pending))
+ if (!sh->nr_pending)
handle_stripe(sh);
goto repeat;
}
printk("raid5: bug: stripe->bh_new[%d], sector %lu exists\n", dd_idx, sh->sector);
printk("raid5: bh %p, bh_new %p\n", bh, sh->bh_new[dd_idx]);
lock_stripe(sh);
- md_wakeup_thread(conf->thread);
+ md_wakeup_thread(raid_conf->thread);
wait_on_stripe(sh);
goto repeat;
}
add_stripe_bh(sh, bh, dd_idx, rw);
- md_wakeup_thread(conf->thread);
+ md_wakeup_thread(raid_conf->thread);
return 0;
}
static void unplug_devices(struct stripe_head *sh)
{
#if 0
- raid5_conf_t *conf = sh->raid_conf;
+ struct raid5_data *raid_conf = sh->raid_conf;
int i;
- for (i = 0; i < conf->raid_disks; i++)
- unplug_device(blk_dev + MAJOR(conf->disks[i].dev));
+ for (i = 0; i < raid_conf->raid_disks; i++)
+ unplug_device(blk_dev + MAJOR(raid_conf->disks[i].dev));
#endif
}
static void raid5d (void *data)
{
struct stripe_head *sh;
- raid5_conf_t *conf = data;
- mddev_t *mddev = conf->mddev;
+ struct raid5_data *raid_conf = data;
+ struct md_dev *mddev = raid_conf->mddev;
int i, handled = 0, unplug = 0;
unsigned long flags;
if (mddev->sb_dirty) {
mddev->sb_dirty = 0;
- md_update_sb(mddev);
+ md_update_sb((int) (mddev - md_dev));
}
for (i = 0; i < NR_HASH; i++) {
repeat:
- sh = conf->stripe_hashtbl[i];
+ sh = raid_conf->stripe_hashtbl[i];
for (; sh; sh = sh->hash_next) {
- if (sh->raid_conf != conf)
+ if (sh->raid_conf != raid_conf)
continue;
if (sh->phase == PHASE_COMPLETE)
continue;
- if (md_atomic_read(&sh->nr_pending))
+ if (sh->nr_pending)
continue;
- if (sh->sector == conf->next_sector) {
- conf->sector_count += (sh->size >> 9);
- if (conf->sector_count >= 128)
+ if (sh->sector == raid_conf->next_sector) {
+ raid_conf->sector_count += (sh->size >> 9);
+ if (raid_conf->sector_count >= 128)
unplug = 1;
} else
unplug = 1;
if (unplug) {
- PRINTK(("unplugging devices, sector == %lu, count == %d\n", sh->sector, conf->sector_count));
+ PRINTK(("unplugging devices, sector == %lu, count == %d\n", sh->sector, raid_conf->sector_count));
unplug_devices(sh);
unplug = 0;
- conf->sector_count = 0;
+ raid_conf->sector_count = 0;
}
- conf->next_sector = sh->sector + (sh->size >> 9);
+ raid_conf->next_sector = sh->sector + (sh->size >> 9);
handled++;
handle_stripe(sh);
goto repeat;
}
}
- if (conf) {
- PRINTK(("%d stripes handled, nr_handle %d\n", handled, md_atomic_read(&conf->nr_handle)));
+ if (raid_conf) {
+ PRINTK(("%d stripes handled, nr_handle %d\n", handled, atomic_read(&raid_conf->nr_handle)));
save_flags(flags);
cli();
- if (!md_atomic_read(&conf->nr_handle))
- clear_bit(THREAD_WAKEUP, &conf->thread->flags);
- restore_flags(flags);
+ if (!atomic_read(&raid_conf->nr_handle))
+ clear_bit(THREAD_WAKEUP, &raid_conf->thread->flags);
}
PRINTK(("--- raid5d inactive\n"));
}
+#if SUPPORT_RECONSTRUCTION
/*
* Private kernel thread for parity reconstruction after an unclean
* shutdown. Reconstruction on spare drives in case of a failed drive
*/
static void raid5syncd (void *data)
{
- raid5_conf_t *conf = data;
- mddev_t *mddev = conf->mddev;
+ struct raid5_data *raid_conf = data;
+ struct md_dev *mddev = raid_conf->mddev;
- if (!conf->resync_parity)
- return;
- if (conf->resync_parity == 2)
+ if (!raid_conf->resync_parity)
return;
- down(&mddev->recovery_sem);
- if (md_do_sync(mddev,NULL)) {
- up(&mddev->recovery_sem);
- printk("raid5: resync aborted!\n");
- return;
- }
- conf->resync_parity = 0;
- up(&mddev->recovery_sem);
- printk("raid5: resync finished.\n");
+ md_do_sync(mddev);
+ raid_conf->resync_parity = 0;
}
+#endif /* SUPPORT_RECONSTRUCTION */
-static int __check_consistency (mddev_t *mddev, int row)
+static int __check_consistency (struct md_dev *mddev, int row)
{
- raid5_conf_t *conf = mddev->private;
+ struct raid5_data *raid_conf = mddev->private;
kdev_t dev;
struct buffer_head *bh[MD_SB_DISKS], tmp;
- int i, rc = 0, nr = 0, count;
- struct buffer_head *bh_ptr[MAX_XOR_BLOCKS];
+ int i, rc = 0, nr = 0;
- if (conf->working_disks != conf->raid_disks)
+ if (raid_conf->working_disks != raid_conf->raid_disks)
return 0;
tmp.b_size = 4096;
if ((tmp.b_data = (char *) get_free_page(GFP_KERNEL)) == NULL)
return 0;
- md_clear_page((unsigned long)tmp.b_data);
memset(bh, 0, MD_SB_DISKS * sizeof(struct buffer_head *));
- for (i = 0; i < conf->raid_disks; i++) {
- dev = conf->disks[i].dev;
+ for (i = 0; i < raid_conf->raid_disks; i++) {
+ dev = raid_conf->disks[i].dev;
set_blocksize(dev, 4096);
if ((bh[i] = bread(dev, row / 4, 4096)) == NULL)
break;
nr++;
}
- if (nr == conf->raid_disks) {
- bh_ptr[0] = &tmp;
- count = 1;
- for (i = 1; i < nr; i++) {
- bh_ptr[count++] = bh[i];
- if (count == MAX_XOR_BLOCKS) {
- xor_block(count, &bh_ptr[0]);
- count = 1;
- }
- }
- if (count != 1) {
- xor_block(count, &bh_ptr[0]);
- }
+ if (nr == raid_conf->raid_disks) {
+ for (i = 1; i < nr; i++)
+ xor_block(&tmp, bh[i]);
if (memcmp(tmp.b_data, bh[0]->b_data, 4096))
rc = 1;
}
- for (i = 0; i < conf->raid_disks; i++) {
- dev = conf->disks[i].dev;
+ for (i = 0; i < raid_conf->raid_disks; i++) {
+ dev = raid_conf->disks[i].dev;
if (bh[i]) {
bforget(bh[i]);
bh[i] = NULL;
return rc;
}
-static int check_consistency (mddev_t *mddev)
+static int check_consistency (struct md_dev *mddev)
{
- if (__check_consistency(mddev, 0))
-/*
- * We are not checking this currently, as it's legitimate to have
- * an inconsistent array, at creation time.
- */
- return 0;
+ int size = mddev->sb->size;
+ int row;
+ for (row = 0; row < size; row += size / 8)
+ if (__check_consistency(mddev, row))
+ return 1;
return 0;
}
-static int raid5_run (mddev_t *mddev)
+static int raid5_run (int minor, struct md_dev *mddev)
{
- raid5_conf_t *conf;
+ struct raid5_data *raid_conf;
int i, j, raid_disk, memory;
- mdp_super_t *sb = mddev->sb;
- mdp_disk_t *desc;
- mdk_rdev_t *rdev;
- struct disk_info *disk;
- struct md_list_head *tmp;
- int start_recovery = 0;
+ md_superblock_t *sb = mddev->sb;
+ md_descriptor_t *descriptor;
+ struct real_dev *realdev;
MOD_INC_USE_COUNT;
if (sb->level != 5 && sb->level != 4) {
- printk("raid5: md%d: raid level not set to 4/5 (%d)\n", mdidx(mddev), sb->level);
+ printk("raid5: %s: raid level not set to 4/5 (%d)\n", kdevname(MKDEV(MD_MAJOR, minor)), sb->level);
MOD_DEC_USE_COUNT;
return -EIO;
}
- mddev->private = kmalloc (sizeof (raid5_conf_t), GFP_KERNEL);
- if ((conf = mddev->private) == NULL)
+ mddev->private = kmalloc (sizeof (struct raid5_data), GFP_KERNEL);
+ if ((raid_conf = mddev->private) == NULL)
goto abort;
- memset (conf, 0, sizeof (*conf));
- conf->mddev = mddev;
+ memset (raid_conf, 0, sizeof (*raid_conf));
+ raid_conf->mddev = mddev;
- if ((conf->stripe_hashtbl = (struct stripe_head **) md__get_free_pages(GFP_ATOMIC, HASH_PAGES_ORDER)) == NULL)
+ if ((raid_conf->stripe_hashtbl = (struct stripe_head **) __get_free_pages(GFP_ATOMIC, HASH_PAGES_ORDER)) == NULL)
goto abort;
- memset(conf->stripe_hashtbl, 0, HASH_PAGES * PAGE_SIZE);
+ memset(raid_conf->stripe_hashtbl, 0, HASH_PAGES * PAGE_SIZE);
- init_waitqueue(&conf->wait_for_stripe);
- PRINTK(("raid5_run(md%d) called.\n", mdidx(mddev)));
+ init_waitqueue(&raid_conf->wait_for_stripe);
+ PRINTK(("raid5_run(%d) called.\n", minor));
+
+ for (i = 0; i < mddev->nb_dev; i++) {
+ realdev = &mddev->devices[i];
+ if (!realdev->sb) {
+ printk(KERN_ERR "raid5: disabled device %s (couldn't access raid superblock)\n", kdevname(realdev->dev));
+ continue;
+ }
- ITERATE_RDEV(mddev,rdev,tmp) {
/*
* This is important -- we are using the descriptor on
* the disk only to get a pointer to the descriptor on
* the main superblock, which might be more recent.
*/
- desc = sb->disks + rdev->desc_nr;
- raid_disk = desc->raid_disk;
- disk = conf->disks + raid_disk;
-
- if (disk_faulty(desc)) {
- printk(KERN_ERR "raid5: disabled device %s (errors detected)\n", partition_name(rdev->dev));
- if (!rdev->faulty) {
- MD_BUG();
- goto abort;
- }
- disk->number = desc->number;
- disk->raid_disk = raid_disk;
- disk->dev = rdev->dev;
-
- disk->operational = 0;
- disk->write_only = 0;
- disk->spare = 0;
- disk->used_slot = 1;
+ descriptor = &sb->disks[realdev->sb->descriptor.number];
+ if (descriptor->state & (1 << MD_FAULTY_DEVICE)) {
+ printk(KERN_ERR "raid5: disabled device %s (errors detected)\n", kdevname(realdev->dev));
continue;
}
- if (disk_active(desc)) {
- if (!disk_sync(desc)) {
- printk(KERN_ERR "raid5: disabled device %s (not in sync)\n", partition_name(rdev->dev));
- MD_BUG();
- goto abort;
+ if (descriptor->state & (1 << MD_ACTIVE_DEVICE)) {
+ if (!(descriptor->state & (1 << MD_SYNC_DEVICE))) {
+ printk(KERN_ERR "raid5: disabled device %s (not in sync)\n", kdevname(realdev->dev));
+ continue;
}
- if (raid_disk > sb->raid_disks) {
- printk(KERN_ERR "raid5: disabled device %s (inconsistent descriptor)\n", partition_name(rdev->dev));
+ raid_disk = descriptor->raid_disk;
+ if (descriptor->number > sb->nr_disks || raid_disk > sb->raid_disks) {
+ printk(KERN_ERR "raid5: disabled device %s (inconsistent descriptor)\n", kdevname(realdev->dev));
continue;
}
- if (disk->operational) {
- printk(KERN_ERR "raid5: disabled device %s (device %d already operational)\n", partition_name(rdev->dev), raid_disk);
+ if (raid_conf->disks[raid_disk].operational) {
+ printk(KERN_ERR "raid5: disabled device %s (device %d already operational)\n", kdevname(realdev->dev), raid_disk);
continue;
}
- printk(KERN_INFO "raid5: device %s operational as raid disk %d\n", partition_name(rdev->dev), raid_disk);
+ printk(KERN_INFO "raid5: device %s operational as raid disk %d\n", kdevname(realdev->dev), raid_disk);
- disk->number = desc->number;
- disk->raid_disk = raid_disk;
- disk->dev = rdev->dev;
- disk->operational = 1;
- disk->used_slot = 1;
+ raid_conf->disks[raid_disk].number = descriptor->number;
+ raid_conf->disks[raid_disk].raid_disk = raid_disk;
+ raid_conf->disks[raid_disk].dev = mddev->devices[i].dev;
+ raid_conf->disks[raid_disk].operational = 1;
- conf->working_disks++;
+ raid_conf->working_disks++;
} else {
/*
* Must be a spare disk ..
*/
- printk(KERN_INFO "raid5: spare disk %s\n", partition_name(rdev->dev));
- disk->number = desc->number;
- disk->raid_disk = raid_disk;
- disk->dev = rdev->dev;
-
- disk->operational = 0;
- disk->write_only = 0;
- disk->spare = 1;
- disk->used_slot = 1;
- }
- }
-
- for (i = 0; i < MD_SB_DISKS; i++) {
- desc = sb->disks + i;
- raid_disk = desc->raid_disk;
- disk = conf->disks + raid_disk;
-
- if (disk_faulty(desc) && (raid_disk < sb->raid_disks) &&
- !conf->disks[raid_disk].used_slot) {
-
- disk->number = desc->number;
- disk->raid_disk = raid_disk;
- disk->dev = MKDEV(0,0);
-
- disk->operational = 0;
- disk->write_only = 0;
- disk->spare = 0;
- disk->used_slot = 1;
+ printk(KERN_INFO "raid5: spare disk %s\n", kdevname(realdev->dev));
+ raid_disk = descriptor->raid_disk;
+ raid_conf->disks[raid_disk].number = descriptor->number;
+ raid_conf->disks[raid_disk].raid_disk = raid_disk;
+ raid_conf->disks[raid_disk].dev = mddev->devices [i].dev;
+
+ raid_conf->disks[raid_disk].operational = 0;
+ raid_conf->disks[raid_disk].write_only = 0;
+ raid_conf->disks[raid_disk].spare = 1;
}
}
+ raid_conf->raid_disks = sb->raid_disks;
+ raid_conf->failed_disks = raid_conf->raid_disks - raid_conf->working_disks;
+ raid_conf->mddev = mddev;
+ raid_conf->chunk_size = sb->chunk_size;
+ raid_conf->level = sb->level;
+ raid_conf->algorithm = sb->parity_algorithm;
+ raid_conf->max_nr_stripes = NR_STRIPES;
- conf->raid_disks = sb->raid_disks;
- /*
- * 0 for a fully functional array, 1 for a degraded array.
- */
- conf->failed_disks = conf->raid_disks - conf->working_disks;
- conf->mddev = mddev;
- conf->chunk_size = sb->chunk_size;
- conf->level = sb->level;
- conf->algorithm = sb->layout;
- conf->max_nr_stripes = NR_STRIPES;
-
-#if 0
- for (i = 0; i < conf->raid_disks; i++) {
- if (!conf->disks[i].used_slot) {
- MD_BUG();
- goto abort;
- }
+ if (raid_conf->working_disks != sb->raid_disks && sb->state != (1 << MD_SB_CLEAN)) {
+ printk(KERN_ALERT "raid5: raid set %s not clean and not all disks are operational -- run ckraid\n", kdevname(MKDEV(MD_MAJOR, minor)));
+ goto abort;
}
-#endif
- if (!conf->chunk_size || conf->chunk_size % 4) {
- printk(KERN_ERR "raid5: invalid chunk size %d for md%d\n", conf->chunk_size, mdidx(mddev));
+ if (!raid_conf->chunk_size || raid_conf->chunk_size % 4) {
+ printk(KERN_ERR "raid5: invalid chunk size %d for %s\n", raid_conf->chunk_size, kdevname(MKDEV(MD_MAJOR, minor)));
goto abort;
}
- if (conf->algorithm > ALGORITHM_RIGHT_SYMMETRIC) {
- printk(KERN_ERR "raid5: unsupported parity algorithm %d for md%d\n", conf->algorithm, mdidx(mddev));
+ if (raid_conf->algorithm > ALGORITHM_RIGHT_SYMMETRIC) {
+ printk(KERN_ERR "raid5: unsupported parity algorithm %d for %s\n", raid_conf->algorithm, kdevname(MKDEV(MD_MAJOR, minor)));
goto abort;
}
- if (conf->failed_disks > 1) {
- printk(KERN_ERR "raid5: not enough operational devices for md%d (%d/%d failed)\n", mdidx(mddev), conf->failed_disks, conf->raid_disks);
+ if (raid_conf->failed_disks > 1) {
+ printk(KERN_ERR "raid5: not enough operational devices for %s (%d/%d failed)\n", kdevname(MKDEV(MD_MAJOR, minor)), raid_conf->failed_disks, raid_conf->raid_disks);
goto abort;
}
- if (conf->working_disks != sb->raid_disks) {
- printk(KERN_ALERT "raid5: md%d, not all disks are operational -- trying to recover array\n", mdidx(mddev));
- start_recovery = 1;
+ if ((sb->state & (1 << MD_SB_CLEAN)) && check_consistency(mddev)) {
+ printk(KERN_ERR "raid5: detected raid-5 xor inconsistency -- run ckraid\n");
+ sb->state |= 1 << MD_SB_ERRORS;
+ goto abort;
}
- if (!start_recovery && (sb->state & (1 << MD_SB_CLEAN)) &&
- check_consistency(mddev)) {
- printk(KERN_ERR "raid5: detected raid-5 superblock xor inconsistency -- running resync\n");
- sb->state &= ~(1 << MD_SB_CLEAN);
+ if ((raid_conf->thread = md_register_thread(raid5d, raid_conf)) == NULL) {
+ printk(KERN_ERR "raid5: couldn't allocate thread for %s\n", kdevname(MKDEV(MD_MAJOR, minor)));
+ goto abort;
}
- {
- const char * name = "raid5d";
-
- conf->thread = md_register_thread(raid5d, conf, name);
- if (!conf->thread) {
- printk(KERN_ERR "raid5: couldn't allocate thread for md%d\n", mdidx(mddev));
- goto abort;
- }
+#if SUPPORT_RECONSTRUCTION
+ if ((raid_conf->resync_thread = md_register_thread(raid5syncd, raid_conf)) == NULL) {
+ printk(KERN_ERR "raid5: couldn't allocate thread for %s\n", kdevname(MKDEV(MD_MAJOR, minor)));
+ goto abort;
}
+#endif /* SUPPORT_RECONSTRUCTION */
- memory = conf->max_nr_stripes * (sizeof(struct stripe_head) +
- conf->raid_disks * (sizeof(struct buffer_head) +
+ memory = raid_conf->max_nr_stripes * (sizeof(struct stripe_head) +
+ raid_conf->raid_disks * (sizeof(struct buffer_head) +
2 * (sizeof(struct buffer_head) + PAGE_SIZE))) / 1024;
- if (grow_stripes(conf, conf->max_nr_stripes, GFP_KERNEL)) {
+ if (grow_stripes(raid_conf, raid_conf->max_nr_stripes, GFP_KERNEL)) {
printk(KERN_ERR "raid5: couldn't allocate %dkB for buffers\n", memory);
- shrink_stripes(conf, conf->max_nr_stripes);
+ shrink_stripes(raid_conf, raid_conf->max_nr_stripes);
goto abort;
} else
- printk(KERN_INFO "raid5: allocated %dkB for md%d\n", memory, mdidx(mddev));
+ printk(KERN_INFO "raid5: allocated %dkB for %s\n", memory, kdevname(MKDEV(MD_MAJOR, minor)));
/*
* Regenerate the "device is in sync with the raid set" bit for
* each device.
*/
- for (i = 0; i < MD_SB_DISKS ; i++) {
- mark_disk_nonsync(sb->disks + i);
+ for (i = 0; i < sb->nr_disks ; i++) {
+ sb->disks[i].state &= ~(1 << MD_SYNC_DEVICE);
for (j = 0; j < sb->raid_disks; j++) {
- if (!conf->disks[j].operational)
+ if (!raid_conf->disks[j].operational)
continue;
- if (sb->disks[i].number == conf->disks[j].number)
- mark_disk_sync(sb->disks + i);
+ if (sb->disks[i].number == raid_conf->disks[j].number)
+ sb->disks[i].state |= 1 << MD_SYNC_DEVICE;
}
}
- sb->active_disks = conf->working_disks;
+ sb->active_disks = raid_conf->working_disks;
if (sb->active_disks == sb->raid_disks)
- printk("raid5: raid level %d set md%d active with %d out of %d devices, algorithm %d\n", conf->level, mdidx(mddev), sb->active_disks, sb->raid_disks, conf->algorithm);
+ printk("raid5: raid level %d set %s active with %d out of %d devices, algorithm %d\n", raid_conf->level, kdevname(MKDEV(MD_MAJOR, minor)), sb->active_disks, sb->raid_disks, raid_conf->algorithm);
else
- printk(KERN_ALERT "raid5: raid level %d set md%d active with %d out of %d devices, algorithm %d\n", conf->level, mdidx(mddev), sb->active_disks, sb->raid_disks, conf->algorithm);
-
- if (!start_recovery && ((sb->state & (1 << MD_SB_CLEAN))==0)) {
- const char * name = "raid5syncd";
+ printk(KERN_ALERT "raid5: raid level %d set %s active with %d out of %d devices, algorithm %d\n", raid_conf->level, kdevname(MKDEV(MD_MAJOR, minor)), sb->active_disks, sb->raid_disks, raid_conf->algorithm);
- conf->resync_thread = md_register_thread(raid5syncd, conf,name);
- if (!conf->resync_thread) {
- printk(KERN_ERR "raid5: couldn't allocate thread for md%d\n", mdidx(mddev));
- goto abort;
- }
-
- printk("raid5: raid set md%d not clean; reconstructing parity\n", mdidx(mddev));
- conf->resync_parity = 1;
- md_wakeup_thread(conf->resync_thread);
+ if ((sb->state & (1 << MD_SB_CLEAN)) == 0) {
+ printk("raid5: raid set %s not clean; reconstructing parity\n", kdevname(MKDEV(MD_MAJOR, minor)));
+ raid_conf->resync_parity = 1;
+#if SUPPORT_RECONSTRUCTION
+ md_wakeup_thread(raid_conf->resync_thread);
+#endif /* SUPPORT_RECONSTRUCTION */
}
- print_raid5_conf(conf);
- if (start_recovery)
- md_recover_arrays();
- print_raid5_conf(conf);
-
/* Ok, everything is just fine now */
return (0);
abort:
- if (conf) {
- print_raid5_conf(conf);
- if (conf->stripe_hashtbl)
- free_pages((unsigned long) conf->stripe_hashtbl,
- HASH_PAGES_ORDER);
- kfree(conf);
+ if (raid_conf) {
+ if (raid_conf->stripe_hashtbl)
+ free_pages((unsigned long) raid_conf->stripe_hashtbl, HASH_PAGES_ORDER);
+ kfree(raid_conf);
}
mddev->private = NULL;
- printk(KERN_ALERT "raid5: failed to run raid set md%d\n", mdidx(mddev));
+ printk(KERN_ALERT "raid5: failed to run raid set %s\n", kdevname(MKDEV(MD_MAJOR, minor)));
MOD_DEC_USE_COUNT;
return -EIO;
}
-static int raid5_stop_resync (mddev_t *mddev)
+static int raid5_stop (int minor, struct md_dev *mddev)
{
- raid5_conf_t *conf = mddev_to_conf(mddev);
- mdk_thread_t *thread = conf->resync_thread;
-
- if (thread) {
- if (conf->resync_parity) {
- conf->resync_parity = 2;
- md_interrupt_thread(thread);
- printk(KERN_INFO "raid5: parity resync was not fully finished, restarting next time.\n");
- return 1;
- }
- return 0;
- }
- return 0;
-}
-
-static int raid5_restart_resync (mddev_t *mddev)
-{
- raid5_conf_t *conf = mddev_to_conf(mddev);
-
- if (conf->resync_parity) {
- if (!conf->resync_thread) {
- MD_BUG();
- return 0;
- }
- printk("raid5: waking up raid5resync.\n");
- conf->resync_parity = 1;
- md_wakeup_thread(conf->resync_thread);
- return 1;
- } else
- printk("raid5: no restart-resync needed.\n");
- return 0;
-}
-
-
-static int raid5_stop (mddev_t *mddev)
-{
- raid5_conf_t *conf = (raid5_conf_t *) mddev->private;
-
- shrink_stripe_cache(conf, conf->max_nr_stripes);
- shrink_stripes(conf, conf->max_nr_stripes);
- md_unregister_thread(conf->thread);
- if (conf->resync_thread)
- md_unregister_thread(conf->resync_thread);
- free_pages((unsigned long) conf->stripe_hashtbl, HASH_PAGES_ORDER);
- kfree(conf);
+ struct raid5_data *raid_conf = (struct raid5_data *) mddev->private;
+
+ shrink_stripe_cache(raid_conf, raid_conf->max_nr_stripes);
+ shrink_stripes(raid_conf, raid_conf->max_nr_stripes);
+ md_unregister_thread(raid_conf->thread);
+#if SUPPORT_RECONSTRUCTION
+ md_unregister_thread(raid_conf->resync_thread);
+#endif /* SUPPORT_RECONSTRUCTION */
+ free_pages((unsigned long) raid_conf->stripe_hashtbl, HASH_PAGES_ORDER);
+ kfree(raid_conf);
mddev->private = NULL;
MOD_DEC_USE_COUNT;
return 0;
}
-static int raid5_status (char *page, mddev_t *mddev)
+static int raid5_status (char *page, int minor, struct md_dev *mddev)
{
- raid5_conf_t *conf = (raid5_conf_t *) mddev->private;
- mdp_super_t *sb = mddev->sb;
+ struct raid5_data *raid_conf = (struct raid5_data *) mddev->private;
+ md_superblock_t *sb = mddev->sb;
int sz = 0, i;
- sz += sprintf (page+sz, " level %d, %dk chunk, algorithm %d", sb->level, sb->chunk_size >> 10, sb->layout);
- sz += sprintf (page+sz, " [%d/%d] [", conf->raid_disks, conf->working_disks);
- for (i = 0; i < conf->raid_disks; i++)
- sz += sprintf (page+sz, "%s", conf->disks[i].operational ? "U" : "_");
+ sz += sprintf (page+sz, " level %d, %dk chunk, algorithm %d", sb->level, sb->chunk_size >> 10, sb->parity_algorithm);
+ sz += sprintf (page+sz, " [%d/%d] [", raid_conf->raid_disks, raid_conf->working_disks);
+ for (i = 0; i < raid_conf->raid_disks; i++)
+ sz += sprintf (page+sz, "%s", raid_conf->disks[i].operational ? "U" : "_");
sz += sprintf (page+sz, "]");
return sz;
}
-static void print_raid5_conf (raid5_conf_t *conf)
+static int raid5_mark_spare(struct md_dev *mddev, md_descriptor_t *spare, int state)
{
- int i;
- struct disk_info *tmp;
-
- printk("RAID5 conf printout:\n");
- if (!conf) {
- printk("(conf==NULL)\n");
- return;
- }
- printk(" --- rd:%d wd:%d fd:%d\n", conf->raid_disks,
- conf->working_disks, conf->failed_disks);
-
- for (i = 0; i < MD_SB_DISKS; i++) {
- tmp = conf->disks + i;
- printk(" disk %d, s:%d, o:%d, n:%d rd:%d us:%d dev:%s\n",
- i, tmp->spare,tmp->operational,
- tmp->number,tmp->raid_disk,tmp->used_slot,
- partition_name(tmp->dev));
- }
-}
-
-static int raid5_diskop(mddev_t *mddev, mdp_disk_t **d, int state)
-{
- int err = 0;
- int i, failed_disk=-1, spare_disk=-1, removed_disk=-1, added_disk=-1;
- raid5_conf_t *conf = mddev->private;
- struct disk_info *tmp, *sdisk, *fdisk, *rdisk, *adisk;
+ int i = 0, failed_disk = -1;
+ struct raid5_data *raid_conf = mddev->private;
+ struct disk_info *disk = raid_conf->disks;
unsigned long flags;
- mdp_super_t *sb = mddev->sb;
- mdp_disk_t *failed_desc, *spare_desc, *added_desc;
-
+ md_superblock_t *sb = mddev->sb;
+ md_descriptor_t *descriptor;
+
+ for (i = 0; i < MD_SB_DISKS; i++, disk++) {
+ if (disk->spare && disk->number == spare->number)
+ goto found;
+ }
+ return 1;
+found:
+ for (i = 0, disk = raid_conf->disks; i < raid_conf->raid_disks; i++, disk++)
+ if (!disk->operational)
+ failed_disk = i;
+ if (failed_disk == -1)
+ return 1;
save_flags(flags);
cli();
-
- print_raid5_conf(conf);
- /*
- * find the disk ...
- */
- switch (state) {
-
- case DISKOP_SPARE_ACTIVE:
-
- /*
- * Find the failed disk within the RAID5 configuration ...
- * (this can only be in the first conf->raid_disks part)
- */
- for (i = 0; i < conf->raid_disks; i++) {
- tmp = conf->disks + i;
- if ((!tmp->operational && !tmp->spare) ||
- !tmp->used_slot) {
- failed_disk = i;
- break;
- }
- }
- /*
- * When we activate a spare disk we _must_ have a disk in
- * the lower (active) part of the array to replace.
- */
- if ((failed_disk == -1) || (failed_disk >= conf->raid_disks)) {
- MD_BUG();
- err = 1;
- goto abort;
- }
- /* fall through */
-
- case DISKOP_SPARE_WRITE:
- case DISKOP_SPARE_INACTIVE:
-
- /*
- * Find the spare disk ... (can only be in the 'high'
- * area of the array)
- */
- for (i = conf->raid_disks; i < MD_SB_DISKS; i++) {
- tmp = conf->disks + i;
- if (tmp->spare && tmp->number == (*d)->number) {
- spare_disk = i;
- break;
- }
- }
- if (spare_disk == -1) {
- MD_BUG();
- err = 1;
- goto abort;
- }
- break;
-
- case DISKOP_HOT_REMOVE_DISK:
-
- for (i = 0; i < MD_SB_DISKS; i++) {
- tmp = conf->disks + i;
- if (tmp->used_slot && (tmp->number == (*d)->number)) {
- if (tmp->operational) {
- err = -EBUSY;
- goto abort;
- }
- removed_disk = i;
- break;
- }
- }
- if (removed_disk == -1) {
- MD_BUG();
- err = 1;
- goto abort;
- }
- break;
-
- case DISKOP_HOT_ADD_DISK:
-
- for (i = conf->raid_disks; i < MD_SB_DISKS; i++) {
- tmp = conf->disks + i;
- if (!tmp->used_slot) {
- added_disk = i;
- break;
- }
- }
- if (added_disk == -1) {
- MD_BUG();
- err = 1;
- goto abort;
- }
- break;
- }
-
switch (state) {
- /*
- * Switch the spare disk to write-only mode:
- */
- case DISKOP_SPARE_WRITE:
- if (conf->spare) {
- MD_BUG();
- err = 1;
- goto abort;
- }
- sdisk = conf->disks + spare_disk;
- sdisk->operational = 1;
- sdisk->write_only = 1;
- conf->spare = sdisk;
- break;
- /*
- * Deactivate a spare disk:
- */
- case DISKOP_SPARE_INACTIVE:
- sdisk = conf->disks + spare_disk;
- sdisk->operational = 0;
- sdisk->write_only = 0;
- /*
- * Was the spare being resynced?
- */
- if (conf->spare == sdisk)
- conf->spare = NULL;
- break;
- /*
- * Activate (mark read-write) the (now sync) spare disk,
- * which means we switch it's 'raid position' (->raid_disk)
- * with the failed disk. (only the first 'conf->raid_disks'
- * slots are used for 'real' disks and we must preserve this
- * property)
- */
- case DISKOP_SPARE_ACTIVE:
- if (!conf->spare) {
- MD_BUG();
- err = 1;
- goto abort;
- }
- sdisk = conf->disks + spare_disk;
- fdisk = conf->disks + failed_disk;
-
- spare_desc = &sb->disks[sdisk->number];
- failed_desc = &sb->disks[fdisk->number];
-
- if (spare_desc != *d) {
- MD_BUG();
- err = 1;
- goto abort;
- }
-
- if (spare_desc->raid_disk != sdisk->raid_disk) {
- MD_BUG();
- err = 1;
- goto abort;
- }
-
- if (sdisk->raid_disk != spare_disk) {
- MD_BUG();
- err = 1;
- goto abort;
- }
-
- if (failed_desc->raid_disk != fdisk->raid_disk) {
- MD_BUG();
- err = 1;
- goto abort;
- }
-
- if (fdisk->raid_disk != failed_disk) {
- MD_BUG();
- err = 1;
- goto abort;
- }
-
- /*
- * do the switch finally
- */
- xchg_values(*spare_desc, *failed_desc);
- xchg_values(*fdisk, *sdisk);
-
- /*
- * (careful, 'failed' and 'spare' are switched from now on)
- *
- * we want to preserve linear numbering and we want to
- * give the proper raid_disk number to the now activated
- * disk. (this means we switch back these values)
- */
-
- xchg_values(spare_desc->raid_disk, failed_desc->raid_disk);
- xchg_values(sdisk->raid_disk, fdisk->raid_disk);
- xchg_values(spare_desc->number, failed_desc->number);
- xchg_values(sdisk->number, fdisk->number);
-
- *d = failed_desc;
-
- if (sdisk->dev == MKDEV(0,0))
- sdisk->used_slot = 0;
-
- /*
- * this really activates the spare.
- */
- fdisk->spare = 0;
- fdisk->write_only = 0;
-
- /*
- * if we activate a spare, we definitely replace a
- * non-operational disk slot in the 'low' area of
- * the disk array.
- */
- conf->failed_disks--;
- conf->working_disks++;
- conf->spare = NULL;
-
- break;
-
- case DISKOP_HOT_REMOVE_DISK:
- rdisk = conf->disks + removed_disk;
-
- if (rdisk->spare && (removed_disk < conf->raid_disks)) {
- MD_BUG();
- err = 1;
- goto abort;
- }
- rdisk->dev = MKDEV(0,0);
- rdisk->used_slot = 0;
-
- break;
-
- case DISKOP_HOT_ADD_DISK:
- adisk = conf->disks + added_disk;
- added_desc = *d;
-
- if (added_disk != added_desc->number) {
- MD_BUG();
- err = 1;
- goto abort;
- }
-
- adisk->number = added_desc->number;
- adisk->raid_disk = added_desc->raid_disk;
- adisk->dev = MKDEV(added_desc->major,added_desc->minor);
-
- adisk->operational = 0;
- adisk->write_only = 0;
- adisk->spare = 1;
- adisk->used_slot = 1;
-
-
- break;
+ case SPARE_WRITE:
+ disk->operational = 1;
+ disk->write_only = 1;
+ raid_conf->spare = disk;
+ break;
+ case SPARE_INACTIVE:
+ disk->operational = 0;
+ disk->write_only = 0;
+ raid_conf->spare = NULL;
+ break;
+ case SPARE_ACTIVE:
+ disk->spare = 0;
+ disk->write_only = 0;
- default:
- MD_BUG();
- err = 1;
- goto abort;
+ descriptor = &sb->disks[raid_conf->disks[failed_disk].number];
+ i = spare->raid_disk;
+ disk->raid_disk = spare->raid_disk = descriptor->raid_disk;
+ if (disk->raid_disk != failed_disk)
+ printk("raid5: disk->raid_disk != failed_disk\n");
+ descriptor->raid_disk = i;
+
+ raid_conf->spare = NULL;
+ raid_conf->working_disks++;
+ raid_conf->failed_disks--;
+ raid_conf->disks[failed_disk] = *disk;
+ break;
+ default:
+ printk("raid5_mark_spare: bug: state == %d\n", state);
+ restore_flags(flags);
+ return 1;
}
-abort:
restore_flags(flags);
- print_raid5_conf(conf);
- return err;
+ return 0;
}
-static mdk_personality_t raid5_personality=
+static struct md_personality raid5_personality=
{
"raid5",
raid5_map,
NULL, /* no ioctls */
0,
raid5_error,
- raid5_diskop,
- raid5_stop_resync,
- raid5_restart_resync
+ /* raid5_hot_add_disk, */ NULL,
+ /* raid1_hot_remove_drive */ NULL,
+ raid5_mark_spare
};
int raid5_init (void)
{
- int err;
-
- err = register_md_personality (RAID5, &raid5_personality);
- if (err)
- return err;
- return 0;
+ return register_md_personality (RAID5, &raid5_personality);
}
#ifdef MODULE
+++ /dev/null
-/*
- translucent.c : Translucent RAID driver for Linux
- Copyright (C) 1998 Ingo Molnar
-
- Translucent mode management functions.
-
- This program is free software; you can redistribute it and/or modify
- it under the terms of the GNU General Public License as published by
- the Free Software Foundation; either version 2, or (at your option)
- any later version.
-
- You should have received a copy of the GNU General Public License
- (for example /usr/src/linux/COPYING); if not, write to the Free
- Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
-*/
-
-#include <linux/module.h>
-
-#include <linux/raid/md.h>
-#include <linux/malloc.h>
-
-#include <linux/raid/translucent.h>
-
-#define MAJOR_NR MD_MAJOR
-#define MD_DRIVER
-#define MD_PERSONALITY
-
-static int translucent_run (mddev_t *mddev)
-{
- translucent_conf_t *conf;
- mdk_rdev_t *rdev;
- int i;
-
- MOD_INC_USE_COUNT;
-
- conf = kmalloc (sizeof (*conf), GFP_KERNEL);
- if (!conf)
- goto out;
- mddev->private = conf;
-
- if (mddev->nb_dev != 2) {
- printk("translucent: this mode needs 2 disks, aborting!\n");
- goto out;
- }
-
- if (md_check_ordering(mddev)) {
- printk("translucent: disks are not ordered, aborting!\n");
- goto out;
- }
-
- ITERATE_RDEV_ORDERED(mddev,rdev,i) {
- dev_info_t *disk = conf->disks + i;
-
- disk->dev = rdev->dev;
- disk->size = rdev->size;
- }
-
- return 0;
-
-out:
- if (conf)
- kfree(conf);
-
- MOD_DEC_USE_COUNT;
- return 1;
-}
-
-static int translucent_stop (mddev_t *mddev)
-{
- translucent_conf_t *conf = mddev_to_conf(mddev);
-
- kfree(conf);
-
- MOD_DEC_USE_COUNT;
-
- return 0;
-}
-
-
-static int translucent_map (mddev_t *mddev, kdev_t dev, kdev_t *rdev,
- unsigned long *rsector, unsigned long size)
-{
- translucent_conf_t *conf = mddev_to_conf(mddev);
-
- *rdev = conf->disks[0].dev;
-
- return 0;
-}
-
-static int translucent_status (char *page, mddev_t *mddev)
-{
- int sz = 0;
-
- sz += sprintf(page+sz, " %d%% full", 10);
- return sz;
-}
-
-
-static mdk_personality_t translucent_personality=
-{
- "translucent",
- translucent_map,
- NULL,
- NULL,
- translucent_run,
- translucent_stop,
- translucent_status,
- NULL,
- 0,
- NULL,
- NULL,
- NULL,
- NULL
-};
-
-#ifndef MODULE
-
-md__initfunc(void translucent_init (void))
-{
- register_md_personality (TRANSLUCENT, &translucent_personality);
-}
-
-#else
-
-int init_module (void)
-{
- return (register_md_personality (TRANSLUCENT, &translucent_personality));
-}
-
-void cleanup_module (void)
-{
- unregister_md_personality (TRANSLUCENT);
-}
-
-#endif
-
+++ /dev/null
-/*
- * xor.c : Multiple Devices driver for Linux
- *
- * Copyright (C) 1996, 1997, 1998, 1999 Ingo Molnar, Matti Aarnio, Jakub Jelinek
- *
- *
- * optimized RAID-5 checksumming functions.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2, or (at your option)
- * any later version.
- *
- * You should have received a copy of the GNU General Public License
- * (for example /usr/src/linux/COPYING); if not, write to the Free
- * Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
- */
-#include <linux/config.h>
-#include <linux/module.h>
-#include <linux/raid/md.h>
-#ifdef __sparc_v9__
-#include <asm/head.h>
-#include <asm/asi.h>
-#include <asm/visasm.h>
-#endif
-
-/*
- * we use the 'XOR function template' to register multiple xor
- * functions runtime. The kernel measures their speed upon bootup
- * and decides which one to use. (compile-time registration is
- * not enough as certain CPU features like MMX can only be detected
- * runtime)
- *
- * this architecture makes it pretty easy to add new routines
- * that are faster on certain CPUs, without killing other CPU's
- * 'native' routine. Although the current routines are belived
- * to be the physically fastest ones on all CPUs tested, but
- * feel free to prove me wrong and add yet another routine =B-)
- * --mingo
- */
-
-#define MAX_XOR_BLOCKS 5
-
-#define XOR_ARGS (unsigned int count, struct buffer_head **bh_ptr)
-
-typedef void (*xor_block_t) XOR_ARGS;
-xor_block_t xor_block = NULL;
-
-#ifndef __sparc_v9__
-
-struct xor_block_template;
-
-struct xor_block_template {
- char * name;
- xor_block_t xor_block;
- int speed;
- struct xor_block_template * next;
-};
-
-struct xor_block_template * xor_functions = NULL;
-
-#define XORBLOCK_TEMPLATE(x) \
-static void xor_block_##x XOR_ARGS; \
-static struct xor_block_template t_xor_block_##x = \
- { #x, xor_block_##x, 0, NULL }; \
-static void xor_block_##x XOR_ARGS
-
-#ifdef __i386__
-
-#ifdef CONFIG_X86_XMM
-/*
- * Cache avoiding checksumming functions utilizing KNI instructions
- * Copyright (C) 1999 Zach Brown (with obvious credit due Ingo)
- */
-
-XORBLOCK_TEMPLATE(pIII_kni)
-{
- char xmm_save[16*4];
- int cr0;
- int lines = (bh_ptr[0]->b_size>>8);
-
- __asm__ __volatile__ (
- "movl %%cr0,%0 ;\n\t"
- "clts ;\n\t"
- "movups %%xmm0,(%1) ;\n\t"
- "movups %%xmm1,0x10(%1) ;\n\t"
- "movups %%xmm2,0x20(%1) ;\n\t"
- "movups %%xmm3,0x30(%1) ;\n\t"
- : "=r" (cr0)
- : "r" (xmm_save)
- : "memory" );
-
-#define OFFS(x) "8*("#x"*2)"
-#define PF0(x) \
- " prefetcht0 "OFFS(x)"(%1) ;\n"
-#define LD(x,y) \
- " movaps "OFFS(x)"(%1), %%xmm"#y" ;\n"
-#define ST(x,y) \
- " movaps %%xmm"#y", "OFFS(x)"(%1) ;\n"
-#define PF1(x) \
- " prefetchnta "OFFS(x)"(%2) ;\n"
-#define PF2(x) \
- " prefetchnta "OFFS(x)"(%3) ;\n"
-#define PF3(x) \
- " prefetchnta "OFFS(x)"(%4) ;\n"
-#define PF4(x) \
- " prefetchnta "OFFS(x)"(%5) ;\n"
-#define PF5(x) \
- " prefetchnta "OFFS(x)"(%6) ;\n"
-#define XO1(x,y) \
- " xorps "OFFS(x)"(%2), %%xmm"#y" ;\n"
-#define XO2(x,y) \
- " xorps "OFFS(x)"(%3), %%xmm"#y" ;\n"
-#define XO3(x,y) \
- " xorps "OFFS(x)"(%4), %%xmm"#y" ;\n"
-#define XO4(x,y) \
- " xorps "OFFS(x)"(%5), %%xmm"#y" ;\n"
-#define XO5(x,y) \
- " xorps "OFFS(x)"(%6), %%xmm"#y" ;\n"
-
- switch(count) {
- case 2:
- __asm__ __volatile__ (
-#undef BLOCK
-#define BLOCK(i) \
- LD(i,0) \
- LD(i+1,1) \
- PF1(i) \
- PF1(i+2) \
- LD(i+2,2) \
- LD(i+3,3) \
- PF0(i+4) \
- PF0(i+6) \
- XO1(i,0) \
- XO1(i+1,1) \
- XO1(i+2,2) \
- XO1(i+3,3) \
- ST(i,0) \
- ST(i+1,1) \
- ST(i+2,2) \
- ST(i+3,3) \
-
-
- PF0(0)
- PF0(2)
-
- " .align 32,0x90 ;\n"
- " 1: ;\n"
-
- BLOCK(0)
- BLOCK(4)
- BLOCK(8)
- BLOCK(12)
-
- " addl $256, %1 ;\n"
- " addl $256, %2 ;\n"
- " decl %0 ;\n"
- " jnz 1b ;\n"
-
- :
- : "r" (lines),
- "r" (bh_ptr[0]->b_data),
- "r" (bh_ptr[1]->b_data)
- : "memory" );
- break;
- case 3:
- __asm__ __volatile__ (
-#undef BLOCK
-#define BLOCK(i) \
- PF1(i) \
- PF1(i+2) \
- LD(i,0) \
- LD(i+1,1) \
- LD(i+2,2) \
- LD(i+3,3) \
- PF2(i) \
- PF2(i+2) \
- PF0(i+4) \
- PF0(i+6) \
- XO1(i,0) \
- XO1(i+1,1) \
- XO1(i+2,2) \
- XO1(i+3,3) \
- XO2(i,0) \
- XO2(i+1,1) \
- XO2(i+2,2) \
- XO2(i+3,3) \
- ST(i,0) \
- ST(i+1,1) \
- ST(i+2,2) \
- ST(i+3,3) \
-
-
- PF0(0)
- PF0(2)
-
- " .align 32,0x90 ;\n"
- " 1: ;\n"
-
- BLOCK(0)
- BLOCK(4)
- BLOCK(8)
- BLOCK(12)
-
- " addl $256, %1 ;\n"
- " addl $256, %2 ;\n"
- " addl $256, %3 ;\n"
- " decl %0 ;\n"
- " jnz 1b ;\n"
- :
- : "r" (lines),
- "r" (bh_ptr[0]->b_data),
- "r" (bh_ptr[1]->b_data),
- "r" (bh_ptr[2]->b_data)
- : "memory" );
- break;
- case 4:
- __asm__ __volatile__ (
-#undef BLOCK
-#define BLOCK(i) \
- PF1(i) \
- PF1(i+2) \
- LD(i,0) \
- LD(i+1,1) \
- LD(i+2,2) \
- LD(i+3,3) \
- PF2(i) \
- PF2(i+2) \
- XO1(i,0) \
- XO1(i+1,1) \
- XO1(i+2,2) \
- XO1(i+3,3) \
- PF3(i) \
- PF3(i+2) \
- PF0(i+4) \
- PF0(i+6) \
- XO2(i,0) \
- XO2(i+1,1) \
- XO2(i+2,2) \
- XO2(i+3,3) \
- XO3(i,0) \
- XO3(i+1,1) \
- XO3(i+2,2) \
- XO3(i+3,3) \
- ST(i,0) \
- ST(i+1,1) \
- ST(i+2,2) \
- ST(i+3,3) \
-
-
- PF0(0)
- PF0(2)
-
- " .align 32,0x90 ;\n"
- " 1: ;\n"
-
- BLOCK(0)
- BLOCK(4)
- BLOCK(8)
- BLOCK(12)
-
- " addl $256, %1 ;\n"
- " addl $256, %2 ;\n"
- " addl $256, %3 ;\n"
- " addl $256, %4 ;\n"
- " decl %0 ;\n"
- " jnz 1b ;\n"
-
- :
- : "r" (lines),
- "r" (bh_ptr[0]->b_data),
- "r" (bh_ptr[1]->b_data),
- "r" (bh_ptr[2]->b_data),
- "r" (bh_ptr[3]->b_data)
- : "memory" );
- break;
- case 5:
- __asm__ __volatile__ (
-#undef BLOCK
-#define BLOCK(i) \
- PF1(i) \
- PF1(i+2) \
- LD(i,0) \
- LD(i+1,1) \
- LD(i+2,2) \
- LD(i+3,3) \
- PF2(i) \
- PF2(i+2) \
- XO1(i,0) \
- XO1(i+1,1) \
- XO1(i+2,2) \
- XO1(i+3,3) \
- PF3(i) \
- PF3(i+2) \
- XO2(i,0) \
- XO2(i+1,1) \
- XO2(i+2,2) \
- XO2(i+3,3) \
- PF4(i) \
- PF4(i+2) \
- PF0(i+4) \
- PF0(i+6) \
- XO3(i,0) \
- XO3(i+1,1) \
- XO3(i+2,2) \
- XO3(i+3,3) \
- XO4(i,0) \
- XO4(i+1,1) \
- XO4(i+2,2) \
- XO4(i+3,3) \
- ST(i,0) \
- ST(i+1,1) \
- ST(i+2,2) \
- ST(i+3,3) \
-
-
- PF0(0)
- PF0(2)
-
- " .align 32,0x90 ;\n"
- " 1: ;\n"
-
- BLOCK(0)
- BLOCK(4)
- BLOCK(8)
- BLOCK(12)
-
- " addl $256, %1 ;\n"
- " addl $256, %2 ;\n"
- " addl $256, %3 ;\n"
- " addl $256, %4 ;\n"
- " addl $256, %5 ;\n"
- " decl %0 ;\n"
- " jnz 1b ;\n"
-
- :
- : "r" (lines),
- "r" (bh_ptr[0]->b_data),
- "r" (bh_ptr[1]->b_data),
- "r" (bh_ptr[2]->b_data),
- "r" (bh_ptr[3]->b_data),
- "r" (bh_ptr[4]->b_data)
- : "memory");
- break;
- }
-
- __asm__ __volatile__ (
- "sfence ;\n\t"
- "movups (%1),%%xmm0 ;\n\t"
- "movups 0x10(%1),%%xmm1 ;\n\t"
- "movups 0x20(%1),%%xmm2 ;\n\t"
- "movups 0x30(%1),%%xmm3 ;\n\t"
- "movl %0,%%cr0 ;\n\t"
- :
- : "r" (cr0), "r" (xmm_save)
- : "memory" );
-}
-
-#undef OFFS
-#undef LD
-#undef ST
-#undef PF0
-#undef PF1
-#undef PF2
-#undef PF3
-#undef PF4
-#undef PF5
-#undef XO1
-#undef XO2
-#undef XO3
-#undef XO4
-#undef XO5
-#undef BLOCK
-
-#endif /* CONFIG_X86_XMM */
-
-/*
- * high-speed RAID5 checksumming functions utilizing MMX instructions
- * Copyright (C) 1998 Ingo Molnar
- */
-XORBLOCK_TEMPLATE(pII_mmx)
-{
- char fpu_save[108];
- int lines = (bh_ptr[0]->b_size>>7);
-
- if (!(current->flags & PF_USEDFPU))
- __asm__ __volatile__ ( " clts;\n");
-
- __asm__ __volatile__ ( " fsave %0; fwait\n"::"m"(fpu_save[0]) );
-
-#define LD(x,y) \
- " movq 8*("#x")(%1), %%mm"#y" ;\n"
-#define ST(x,y) \
- " movq %%mm"#y", 8*("#x")(%1) ;\n"
-#define XO1(x,y) \
- " pxor 8*("#x")(%2), %%mm"#y" ;\n"
-#define XO2(x,y) \
- " pxor 8*("#x")(%3), %%mm"#y" ;\n"
-#define XO3(x,y) \
- " pxor 8*("#x")(%4), %%mm"#y" ;\n"
-#define XO4(x,y) \
- " pxor 8*("#x")(%5), %%mm"#y" ;\n"
-
- switch(count) {
- case 2:
- __asm__ __volatile__ (
-#undef BLOCK
-#define BLOCK(i) \
- LD(i,0) \
- LD(i+1,1) \
- LD(i+2,2) \
- LD(i+3,3) \
- XO1(i,0) \
- ST(i,0) \
- XO1(i+1,1) \
- ST(i+1,1) \
- XO1(i+2,2) \
- ST(i+2,2) \
- XO1(i+3,3) \
- ST(i+3,3)
-
- " .align 32,0x90 ;\n"
- " 1: ;\n"
-
- BLOCK(0)
- BLOCK(4)
- BLOCK(8)
- BLOCK(12)
-
- " addl $128, %1 ;\n"
- " addl $128, %2 ;\n"
- " decl %0 ;\n"
- " jnz 1b ;\n"
- :
- : "r" (lines),
- "r" (bh_ptr[0]->b_data),
- "r" (bh_ptr[1]->b_data)
- : "memory");
- break;
- case 3:
- __asm__ __volatile__ (
-#undef BLOCK
-#define BLOCK(i) \
- LD(i,0) \
- LD(i+1,1) \
- LD(i+2,2) \
- LD(i+3,3) \
- XO1(i,0) \
- XO1(i+1,1) \
- XO1(i+2,2) \
- XO1(i+3,3) \
- XO2(i,0) \
- ST(i,0) \
- XO2(i+1,1) \
- ST(i+1,1) \
- XO2(i+2,2) \
- ST(i+2,2) \
- XO2(i+3,3) \
- ST(i+3,3)
-
- " .align 32,0x90 ;\n"
- " 1: ;\n"
-
- BLOCK(0)
- BLOCK(4)
- BLOCK(8)
- BLOCK(12)
-
- " addl $128, %1 ;\n"
- " addl $128, %2 ;\n"
- " addl $128, %3 ;\n"
- " decl %0 ;\n"
- " jnz 1b ;\n"
- :
- : "r" (lines),
- "r" (bh_ptr[0]->b_data),
- "r" (bh_ptr[1]->b_data),
- "r" (bh_ptr[2]->b_data)
- : "memory");
- break;
- case 4:
- __asm__ __volatile__ (
-#undef BLOCK
-#define BLOCK(i) \
- LD(i,0) \
- LD(i+1,1) \
- LD(i+2,2) \
- LD(i+3,3) \
- XO1(i,0) \
- XO1(i+1,1) \
- XO1(i+2,2) \
- XO1(i+3,3) \
- XO2(i,0) \
- XO2(i+1,1) \
- XO2(i+2,2) \
- XO2(i+3,3) \
- XO3(i,0) \
- ST(i,0) \
- XO3(i+1,1) \
- ST(i+1,1) \
- XO3(i+2,2) \
- ST(i+2,2) \
- XO3(i+3,3) \
- ST(i+3,3)
-
- " .align 32,0x90 ;\n"
- " 1: ;\n"
-
- BLOCK(0)
- BLOCK(4)
- BLOCK(8)
- BLOCK(12)
-
- " addl $128, %1 ;\n"
- " addl $128, %2 ;\n"
- " addl $128, %3 ;\n"
- " addl $128, %4 ;\n"
- " decl %0 ;\n"
- " jnz 1b ;\n"
- :
- : "r" (lines),
- "r" (bh_ptr[0]->b_data),
- "r" (bh_ptr[1]->b_data),
- "r" (bh_ptr[2]->b_data),
- "r" (bh_ptr[3]->b_data)
- : "memory");
- break;
- case 5:
- __asm__ __volatile__ (
-#undef BLOCK
-#define BLOCK(i) \
- LD(i,0) \
- LD(i+1,1) \
- LD(i+2,2) \
- LD(i+3,3) \
- XO1(i,0) \
- XO1(i+1,1) \
- XO1(i+2,2) \
- XO1(i+3,3) \
- XO2(i,0) \
- XO2(i+1,1) \
- XO2(i+2,2) \
- XO2(i+3,3) \
- XO3(i,0) \
- XO3(i+1,1) \
- XO3(i+2,2) \
- XO3(i+3,3) \
- XO4(i,0) \
- ST(i,0) \
- XO4(i+1,1) \
- ST(i+1,1) \
- XO4(i+2,2) \
- ST(i+2,2) \
- XO4(i+3,3) \
- ST(i+3,3)
-
- " .align 32,0x90 ;\n"
- " 1: ;\n"
-
- BLOCK(0)
- BLOCK(4)
- BLOCK(8)
- BLOCK(12)
-
- " addl $128, %1 ;\n"
- " addl $128, %2 ;\n"
- " addl $128, %3 ;\n"
- " addl $128, %4 ;\n"
- " addl $128, %5 ;\n"
- " decl %0 ;\n"
- " jnz 1b ;\n"
- :
- : "r" (lines),
- "r" (bh_ptr[0]->b_data),
- "r" (bh_ptr[1]->b_data),
- "r" (bh_ptr[2]->b_data),
- "r" (bh_ptr[3]->b_data),
- "r" (bh_ptr[4]->b_data)
- : "memory");
- break;
- }
-
- __asm__ __volatile__ ( " frstor %0;\n"::"m"(fpu_save[0]) );
-
- if (!(current->flags & PF_USEDFPU))
- stts();
-}
-
-#undef LD
-#undef XO1
-#undef XO2
-#undef XO3
-#undef XO4
-#undef ST
-#undef BLOCK
-
-XORBLOCK_TEMPLATE(p5_mmx)
-{
- char fpu_save[108];
- int lines = (bh_ptr[0]->b_size>>6);
-
- if (!(current->flags & PF_USEDFPU))
- __asm__ __volatile__ ( " clts;\n");
-
- __asm__ __volatile__ ( " fsave %0; fwait\n"::"m"(fpu_save[0]) );
-
- switch(count) {
- case 2:
- __asm__ __volatile__ (
-
- " .align 32,0x90 ;\n"
- " 1: ;\n"
- " movq (%1), %%mm0 ;\n"
- " movq 8(%1), %%mm1 ;\n"
- " pxor (%2), %%mm0 ;\n"
- " movq 16(%1), %%mm2 ;\n"
- " movq %%mm0, (%1) ;\n"
- " pxor 8(%2), %%mm1 ;\n"
- " movq 24(%1), %%mm3 ;\n"
- " movq %%mm1, 8(%1) ;\n"
- " pxor 16(%2), %%mm2 ;\n"
- " movq 32(%1), %%mm4 ;\n"
- " movq %%mm2, 16(%1) ;\n"
- " pxor 24(%2), %%mm3 ;\n"
- " movq 40(%1), %%mm5 ;\n"
- " movq %%mm3, 24(%1) ;\n"
- " pxor 32(%2), %%mm4 ;\n"
- " movq 48(%1), %%mm6 ;\n"
- " movq %%mm4, 32(%1) ;\n"
- " pxor 40(%2), %%mm5 ;\n"
- " movq 56(%1), %%mm7 ;\n"
- " movq %%mm5, 40(%1) ;\n"
- " pxor 48(%2), %%mm6 ;\n"
- " pxor 56(%2), %%mm7 ;\n"
- " movq %%mm6, 48(%1) ;\n"
- " movq %%mm7, 56(%1) ;\n"
-
- " addl $64, %1 ;\n"
- " addl $64, %2 ;\n"
- " decl %0 ;\n"
- " jnz 1b ;\n"
-
- :
- : "r" (lines),
- "r" (bh_ptr[0]->b_data),
- "r" (bh_ptr[1]->b_data)
- : "memory" );
- break;
- case 3:
- __asm__ __volatile__ (
-
- " .align 32,0x90 ;\n"
- " 1: ;\n"
- " movq (%1), %%mm0 ;\n"
- " movq 8(%1), %%mm1 ;\n"
- " pxor (%2), %%mm0 ;\n"
- " movq 16(%1), %%mm2 ;\n"
- " pxor 8(%2), %%mm1 ;\n"
- " pxor (%3), %%mm0 ;\n"
- " pxor 16(%2), %%mm2 ;\n"
- " movq %%mm0, (%1) ;\n"
- " pxor 8(%3), %%mm1 ;\n"
- " pxor 16(%3), %%mm2 ;\n"
- " movq 24(%1), %%mm3 ;\n"
- " movq %%mm1, 8(%1) ;\n"
- " movq 32(%1), %%mm4 ;\n"
- " movq 40(%1), %%mm5 ;\n"
- " pxor 24(%2), %%mm3 ;\n"
- " movq %%mm2, 16(%1) ;\n"
- " pxor 32(%2), %%mm4 ;\n"
- " pxor 24(%3), %%mm3 ;\n"
- " pxor 40(%2), %%mm5 ;\n"
- " movq %%mm3, 24(%1) ;\n"
- " pxor 32(%3), %%mm4 ;\n"
- " pxor 40(%3), %%mm5 ;\n"
- " movq 48(%1), %%mm6 ;\n"
- " movq %%mm4, 32(%1) ;\n"
- " movq 56(%1), %%mm7 ;\n"
- " pxor 48(%2), %%mm6 ;\n"
- " movq %%mm5, 40(%1) ;\n"
- " pxor 56(%2), %%mm7 ;\n"
- " pxor 48(%3), %%mm6 ;\n"
- " pxor 56(%3), %%mm7 ;\n"
- " movq %%mm6, 48(%1) ;\n"
- " movq %%mm7, 56(%1) ;\n"
-
- " addl $64, %1 ;\n"
- " addl $64, %2 ;\n"
- " addl $64, %3 ;\n"
- " decl %0 ;\n"
- " jnz 1b ;\n"
-
- :
- : "r" (lines),
- "r" (bh_ptr[0]->b_data),
- "r" (bh_ptr[1]->b_data),
- "r" (bh_ptr[2]->b_data)
- : "memory" );
- break;
- case 4:
- __asm__ __volatile__ (
-
- " .align 32,0x90 ;\n"
- " 1: ;\n"
- " movq (%1), %%mm0 ;\n"
- " movq 8(%1), %%mm1 ;\n"
- " pxor (%2), %%mm0 ;\n"
- " movq 16(%1), %%mm2 ;\n"
- " pxor 8(%2), %%mm1 ;\n"
- " pxor (%3), %%mm0 ;\n"
- " pxor 16(%2), %%mm2 ;\n"
- " pxor 8(%3), %%mm1 ;\n"
- " pxor (%4), %%mm0 ;\n"
- " movq 24(%1), %%mm3 ;\n"
- " pxor 16(%3), %%mm2 ;\n"
- " pxor 8(%4), %%mm1 ;\n"
- " movq %%mm0, (%1) ;\n"
- " movq 32(%1), %%mm4 ;\n"
- " pxor 24(%2), %%mm3 ;\n"
- " pxor 16(%4), %%mm2 ;\n"
- " movq %%mm1, 8(%1) ;\n"
- " movq 40(%1), %%mm5 ;\n"
- " pxor 32(%2), %%mm4 ;\n"
- " pxor 24(%3), %%mm3 ;\n"
- " movq %%mm2, 16(%1) ;\n"
- " pxor 40(%2), %%mm5 ;\n"
- " pxor 32(%3), %%mm4 ;\n"
- " pxor 24(%4), %%mm3 ;\n"
- " movq %%mm3, 24(%1) ;\n"
- " movq 56(%1), %%mm7 ;\n"
- " movq 48(%1), %%mm6 ;\n"
- " pxor 40(%3), %%mm5 ;\n"
- " pxor 32(%4), %%mm4 ;\n"
- " pxor 48(%2), %%mm6 ;\n"
- " movq %%mm4, 32(%1) ;\n"
- " pxor 56(%2), %%mm7 ;\n"
- " pxor 40(%4), %%mm5 ;\n"
- " pxor 48(%3), %%mm6 ;\n"
- " pxor 56(%3), %%mm7 ;\n"
- " movq %%mm5, 40(%1) ;\n"
- " pxor 48(%4), %%mm6 ;\n"
- " pxor 56(%4), %%mm7 ;\n"
- " movq %%mm6, 48(%1) ;\n"
- " movq %%mm7, 56(%1) ;\n"
-
- " addl $64, %1 ;\n"
- " addl $64, %2 ;\n"
- " addl $64, %3 ;\n"
- " addl $64, %4 ;\n"
- " decl %0 ;\n"
- " jnz 1b ;\n"
-
- :
- : "r" (lines),
- "r" (bh_ptr[0]->b_data),
- "r" (bh_ptr[1]->b_data),
- "r" (bh_ptr[2]->b_data),
- "r" (bh_ptr[3]->b_data)
- : "memory" );
- break;
- case 5:
- __asm__ __volatile__ (
-
- " .align 32,0x90 ;\n"
- " 1: ;\n"
- " movq (%1), %%mm0 ;\n"
- " movq 8(%1), %%mm1 ;\n"
- " pxor (%2), %%mm0 ;\n"
- " pxor 8(%2), %%mm1 ;\n"
- " movq 16(%1), %%mm2 ;\n"
- " pxor (%3), %%mm0 ;\n"
- " pxor 8(%3), %%mm1 ;\n"
- " pxor 16(%2), %%mm2 ;\n"
- " pxor (%4), %%mm0 ;\n"
- " pxor 8(%4), %%mm1 ;\n"
- " pxor 16(%3), %%mm2 ;\n"
- " movq 24(%1), %%mm3 ;\n"
- " pxor (%5), %%mm0 ;\n"
- " pxor 8(%5), %%mm1 ;\n"
- " movq %%mm0, (%1) ;\n"
- " pxor 16(%4), %%mm2 ;\n"
- " pxor 24(%2), %%mm3 ;\n"
- " movq %%mm1, 8(%1) ;\n"
- " pxor 16(%5), %%mm2 ;\n"
- " pxor 24(%3), %%mm3 ;\n"
- " movq 32(%1), %%mm4 ;\n"
- " movq %%mm2, 16(%1) ;\n"
- " pxor 24(%4), %%mm3 ;\n"
- " pxor 32(%2), %%mm4 ;\n"
- " movq 40(%1), %%mm5 ;\n"
- " pxor 24(%5), %%mm3 ;\n"
- " pxor 32(%3), %%mm4 ;\n"
- " pxor 40(%2), %%mm5 ;\n"
- " movq %%mm3, 24(%1) ;\n"
- " pxor 32(%4), %%mm4 ;\n"
- " pxor 40(%3), %%mm5 ;\n"
- " movq 48(%1), %%mm6 ;\n"
- " movq 56(%1), %%mm7 ;\n"
- " pxor 32(%5), %%mm4 ;\n"
- " pxor 40(%4), %%mm5 ;\n"
- " pxor 48(%2), %%mm6 ;\n"
- " pxor 56(%2), %%mm7 ;\n"
- " movq %%mm4, 32(%1) ;\n"
- " pxor 48(%3), %%mm6 ;\n"
- " pxor 56(%3), %%mm7 ;\n"
- " pxor 40(%5), %%mm5 ;\n"
- " pxor 48(%4), %%mm6 ;\n"
- " pxor 56(%4), %%mm7 ;\n"
- " movq %%mm5, 40(%1) ;\n"
- " pxor 48(%5), %%mm6 ;\n"
- " pxor 56(%5), %%mm7 ;\n"
- " movq %%mm6, 48(%1) ;\n"
- " movq %%mm7, 56(%1) ;\n"
-
- " addl $64, %1 ;\n"
- " addl $64, %2 ;\n"
- " addl $64, %3 ;\n"
- " addl $64, %4 ;\n"
- " addl $64, %5 ;\n"
- " decl %0 ;\n"
- " jnz 1b ;\n"
-
- :
- : "r" (lines),
- "r" (bh_ptr[0]->b_data),
- "r" (bh_ptr[1]->b_data),
- "r" (bh_ptr[2]->b_data),
- "r" (bh_ptr[3]->b_data),
- "r" (bh_ptr[4]->b_data)
- : "memory" );
- break;
- }
-
- __asm__ __volatile__ ( " frstor %0;\n"::"m"(fpu_save[0]) );
-
- if (!(current->flags & PF_USEDFPU))
- stts();
-}
-#endif /* __i386__ */
-#endif /* !__sparc_v9__ */
-
-#ifdef __sparc_v9__
-/*
- * High speed xor_block operation for RAID4/5 utilizing the
- * UltraSparc Visual Instruction Set.
- *
- * Copyright (C) 1997, 1999 Jakub Jelinek (jj@ultra.linux.cz)
- *
- * Requirements:
- * !(((long)dest | (long)sourceN) & (64 - 1)) &&
- * !(len & 127) && len >= 256
- *
- * It is done in pure assembly, as otherwise gcc makes it
- * a non-leaf function, which is not what we want.
- * Also, we don't measure the speeds as on other architectures,
- * as the measuring routine does not take into account cold caches
- * and the fact that xor_block_VIS bypasses the caches.
- * xor_block_32regs might be 5% faster for count 2 if caches are hot
- * and things just right (for count 3 VIS is about as fast as 32regs for
- * hot caches and for count 4 and 5 VIS is faster by good margin always),
- * but I think it is better not to pollute the caches.
- * Actually, if I'd just fight for speed for hot caches, I could
- * write a hybrid VIS/integer routine, which would do always two
- * 64B blocks in VIS and two in IEUs, but I really care more about
- * caches.
- */
-extern void *VISenter(void);
-extern void xor_block_VIS XOR_ARGS;
-
-void __xor_block_VIS(void)
-{
-__asm__ ("
- .globl xor_block_VIS
-xor_block_VIS:
- ldx [%%o1 + 0], %%o4
- ldx [%%o1 + 8], %%o3
- ldx [%%o4 + %1], %%g5
- ldx [%%o4 + %0], %%o4
- ldx [%%o3 + %0], %%o3
- rd %%fprs, %%o5
- andcc %%o5, %2, %%g0
- be,pt %%icc, 297f
- sethi %%hi(%5), %%g1
- jmpl %%g1 + %%lo(%5), %%g7
- add %%g7, 8, %%g7
-297: wr %%g0, %4, %%fprs
- membar #LoadStore|#StoreLoad|#StoreStore
- sub %%g5, 64, %%g5
- ldda [%%o4] %3, %%f0
- ldda [%%o3] %3, %%f16
- cmp %%o0, 4
- bgeu,pt %%xcc, 10f
- cmp %%o0, 3
- be,pn %%xcc, 13f
- mov -64, %%g1
- sub %%g5, 64, %%g5
- rd %%asi, %%g1
- wr %%g0, %3, %%asi
-
-2: ldda [%%o4 + 64] %%asi, %%f32
- fxor %%f0, %%f16, %%f16
- fxor %%f2, %%f18, %%f18
- fxor %%f4, %%f20, %%f20
- fxor %%f6, %%f22, %%f22
- fxor %%f8, %%f24, %%f24
- fxor %%f10, %%f26, %%f26
- fxor %%f12, %%f28, %%f28
- fxor %%f14, %%f30, %%f30
- stda %%f16, [%%o4] %3
- ldda [%%o3 + 64] %%asi, %%f48
- ldda [%%o4 + 128] %%asi, %%f0
- fxor %%f32, %%f48, %%f48
- fxor %%f34, %%f50, %%f50
- add %%o4, 128, %%o4
- fxor %%f36, %%f52, %%f52
- add %%o3, 128, %%o3
- fxor %%f38, %%f54, %%f54
- subcc %%g5, 128, %%g5
- fxor %%f40, %%f56, %%f56
- fxor %%f42, %%f58, %%f58
- fxor %%f44, %%f60, %%f60
- fxor %%f46, %%f62, %%f62
- stda %%f48, [%%o4 - 64] %%asi
- bne,pt %%xcc, 2b
- ldda [%%o3] %3, %%f16
-
- ldda [%%o4 + 64] %%asi, %%f32
- fxor %%f0, %%f16, %%f16
- fxor %%f2, %%f18, %%f18
- fxor %%f4, %%f20, %%f20
- fxor %%f6, %%f22, %%f22
- fxor %%f8, %%f24, %%f24
- fxor %%f10, %%f26, %%f26
- fxor %%f12, %%f28, %%f28
- fxor %%f14, %%f30, %%f30
- stda %%f16, [%%o4] %3
- ldda [%%o3 + 64] %%asi, %%f48
- membar #Sync
- fxor %%f32, %%f48, %%f48
- fxor %%f34, %%f50, %%f50
- fxor %%f36, %%f52, %%f52
- fxor %%f38, %%f54, %%f54
- fxor %%f40, %%f56, %%f56
- fxor %%f42, %%f58, %%f58
- fxor %%f44, %%f60, %%f60
- fxor %%f46, %%f62, %%f62
- stda %%f48, [%%o4 + 64] %%asi
- membar #Sync|#StoreStore|#StoreLoad
- wr %%g0, 0, %%fprs
- retl
- wr %%g1, %%g0, %%asi
-
-13: ldx [%%o1 + 16], %%o2
- ldx [%%o2 + %0], %%o2
-
-3: ldda [%%o2] %3, %%f32
- fxor %%f0, %%f16, %%f48
- fxor %%f2, %%f18, %%f50
- add %%o4, 64, %%o4
- fxor %%f4, %%f20, %%f52
- fxor %%f6, %%f22, %%f54
- add %%o3, 64, %%o3
- fxor %%f8, %%f24, %%f56
- fxor %%f10, %%f26, %%f58
- fxor %%f12, %%f28, %%f60
- fxor %%f14, %%f30, %%f62
- ldda [%%o4] %3, %%f0
- fxor %%f48, %%f32, %%f48
- fxor %%f50, %%f34, %%f50
- fxor %%f52, %%f36, %%f52
- fxor %%f54, %%f38, %%f54
- add %%o2, 64, %%o2
- fxor %%f56, %%f40, %%f56
- fxor %%f58, %%f42, %%f58
- subcc %%g5, 64, %%g5
- fxor %%f60, %%f44, %%f60
- fxor %%f62, %%f46, %%f62
- stda %%f48, [%%o4 + %%g1] %3
- bne,pt %%xcc, 3b
- ldda [%%o3] %3, %%f16
-
- ldda [%%o2] %3, %%f32
- fxor %%f0, %%f16, %%f48
- fxor %%f2, %%f18, %%f50
- fxor %%f4, %%f20, %%f52
- fxor %%f6, %%f22, %%f54
- fxor %%f8, %%f24, %%f56
- fxor %%f10, %%f26, %%f58
- fxor %%f12, %%f28, %%f60
- fxor %%f14, %%f30, %%f62
- membar #Sync
- fxor %%f48, %%f32, %%f48
- fxor %%f50, %%f34, %%f50
- fxor %%f52, %%f36, %%f52
- fxor %%f54, %%f38, %%f54
- fxor %%f56, %%f40, %%f56
- fxor %%f58, %%f42, %%f58
- fxor %%f60, %%f44, %%f60
- fxor %%f62, %%f46, %%f62
- stda %%f48, [%%o4] %3
- membar #Sync|#StoreStore|#StoreLoad
- retl
- wr %%g0, 0, %%fprs
-
-10: cmp %%o0, 5
- be,pt %%xcc, 15f
- mov -64, %%g1
-
-14: ldx [%%o1 + 16], %%o2
- ldx [%%o1 + 24], %%o0
- ldx [%%o2 + %0], %%o2
- ldx [%%o0 + %0], %%o0
-
-4: ldda [%%o2] %3, %%f32
- fxor %%f0, %%f16, %%f16
- fxor %%f2, %%f18, %%f18
- add %%o4, 64, %%o4
- fxor %%f4, %%f20, %%f20
- fxor %%f6, %%f22, %%f22
- add %%o3, 64, %%o3
- fxor %%f8, %%f24, %%f24
- fxor %%f10, %%f26, %%f26
- fxor %%f12, %%f28, %%f28
- fxor %%f14, %%f30, %%f30
- ldda [%%o0] %3, %%f48
- fxor %%f16, %%f32, %%f32
- fxor %%f18, %%f34, %%f34
- fxor %%f20, %%f36, %%f36
- fxor %%f22, %%f38, %%f38
- add %%o2, 64, %%o2
- fxor %%f24, %%f40, %%f40
- fxor %%f26, %%f42, %%f42
- fxor %%f28, %%f44, %%f44
- fxor %%f30, %%f46, %%f46
- ldda [%%o4] %3, %%f0
- fxor %%f32, %%f48, %%f48
- fxor %%f34, %%f50, %%f50
- fxor %%f36, %%f52, %%f52
- add %%o0, 64, %%o0
- fxor %%f38, %%f54, %%f54
- fxor %%f40, %%f56, %%f56
- fxor %%f42, %%f58, %%f58
- subcc %%g5, 64, %%g5
- fxor %%f44, %%f60, %%f60
- fxor %%f46, %%f62, %%f62
- stda %%f48, [%%o4 + %%g1] %3
- bne,pt %%xcc, 4b
- ldda [%%o3] %3, %%f16
-
- ldda [%%o2] %3, %%f32
- fxor %%f0, %%f16, %%f16
- fxor %%f2, %%f18, %%f18
- fxor %%f4, %%f20, %%f20
- fxor %%f6, %%f22, %%f22
- fxor %%f8, %%f24, %%f24
- fxor %%f10, %%f26, %%f26
- fxor %%f12, %%f28, %%f28
- fxor %%f14, %%f30, %%f30
- ldda [%%o0] %3, %%f48
- fxor %%f16, %%f32, %%f32
- fxor %%f18, %%f34, %%f34
- fxor %%f20, %%f36, %%f36
- fxor %%f22, %%f38, %%f38
- fxor %%f24, %%f40, %%f40
- fxor %%f26, %%f42, %%f42
- fxor %%f28, %%f44, %%f44
- fxor %%f30, %%f46, %%f46
- membar #Sync
- fxor %%f32, %%f48, %%f48
- fxor %%f34, %%f50, %%f50
- fxor %%f36, %%f52, %%f52
- fxor %%f38, %%f54, %%f54
- fxor %%f40, %%f56, %%f56
- fxor %%f42, %%f58, %%f58
- fxor %%f44, %%f60, %%f60
- fxor %%f46, %%f62, %%f62
- stda %%f48, [%%o4] %3
- membar #Sync|#StoreStore|#StoreLoad
- retl
- wr %%g0, 0, %%fprs
-
-15: ldx [%%o1 + 16], %%o2
- ldx [%%o1 + 24], %%o0
- ldx [%%o1 + 32], %%o1
- ldx [%%o2 + %0], %%o2
- ldx [%%o0 + %0], %%o0
- ldx [%%o1 + %0], %%o1
-
-5: ldda [%%o2] %3, %%f32
- fxor %%f0, %%f16, %%f48
- fxor %%f2, %%f18, %%f50
- add %%o4, 64, %%o4
- fxor %%f4, %%f20, %%f52
- fxor %%f6, %%f22, %%f54
- add %%o3, 64, %%o3
- fxor %%f8, %%f24, %%f56
- fxor %%f10, %%f26, %%f58
- fxor %%f12, %%f28, %%f60
- fxor %%f14, %%f30, %%f62
- ldda [%%o0] %3, %%f16
- fxor %%f48, %%f32, %%f48
- fxor %%f50, %%f34, %%f50
- fxor %%f52, %%f36, %%f52
- fxor %%f54, %%f38, %%f54
- add %%o2, 64, %%o2
- fxor %%f56, %%f40, %%f56
- fxor %%f58, %%f42, %%f58
- fxor %%f60, %%f44, %%f60
- fxor %%f62, %%f46, %%f62
- ldda [%%o1] %3, %%f32
- fxor %%f48, %%f16, %%f48
- fxor %%f50, %%f18, %%f50
- add %%o0, 64, %%o0
- fxor %%f52, %%f20, %%f52
- fxor %%f54, %%f22, %%f54
- add %%o1, 64, %%o1
- fxor %%f56, %%f24, %%f56
- fxor %%f58, %%f26, %%f58
- fxor %%f60, %%f28, %%f60
- fxor %%f62, %%f30, %%f62
- ldda [%%o4] %3, %%f0
- fxor %%f48, %%f32, %%f48
- fxor %%f50, %%f34, %%f50
- fxor %%f52, %%f36, %%f52
- fxor %%f54, %%f38, %%f54
- fxor %%f56, %%f40, %%f56
- fxor %%f58, %%f42, %%f58
- subcc %%g5, 64, %%g5
- fxor %%f60, %%f44, %%f60
- fxor %%f62, %%f46, %%f62
- stda %%f48, [%%o4 + %%g1] %3
- bne,pt %%xcc, 5b
- ldda [%%o3] %3, %%f16
-
- ldda [%%o2] %3, %%f32
- fxor %%f0, %%f16, %%f48
- fxor %%f2, %%f18, %%f50
- fxor %%f4, %%f20, %%f52
- fxor %%f6, %%f22, %%f54
- fxor %%f8, %%f24, %%f56
- fxor %%f10, %%f26, %%f58
- fxor %%f12, %%f28, %%f60
- fxor %%f14, %%f30, %%f62
- ldda [%%o0] %3, %%f16
- fxor %%f48, %%f32, %%f48
- fxor %%f50, %%f34, %%f50
- fxor %%f52, %%f36, %%f52
- fxor %%f54, %%f38, %%f54
- fxor %%f56, %%f40, %%f56
- fxor %%f58, %%f42, %%f58
- fxor %%f60, %%f44, %%f60
- fxor %%f62, %%f46, %%f62
- ldda [%%o1] %3, %%f32
- fxor %%f48, %%f16, %%f48
- fxor %%f50, %%f18, %%f50
- fxor %%f52, %%f20, %%f52
- fxor %%f54, %%f22, %%f54
- fxor %%f56, %%f24, %%f56
- fxor %%f58, %%f26, %%f58
- fxor %%f60, %%f28, %%f60
- fxor %%f62, %%f30, %%f62
- membar #Sync
- fxor %%f48, %%f32, %%f48
- fxor %%f50, %%f34, %%f50
- fxor %%f52, %%f36, %%f52
- fxor %%f54, %%f38, %%f54
- fxor %%f56, %%f40, %%f56
- fxor %%f58, %%f42, %%f58
- fxor %%f60, %%f44, %%f60
- fxor %%f62, %%f46, %%f62
- stda %%f48, [%%o4] %3
- membar #Sync|#StoreStore|#StoreLoad
- retl
- wr %%g0, 0, %%fprs
- " : :
- "i" (&((struct buffer_head *)0)->b_data),
- "i" (&((struct buffer_head *)0)->b_size),
- "i" (FPRS_FEF|FPRS_DU), "i" (ASI_BLK_P),
- "i" (FPRS_FEF), "i" (VISenter));
-}
-#endif /* __sparc_v9__ */
-
-#if defined(__sparc__) && !defined(__sparc_v9__)
-/*
- * High speed xor_block operation for RAID4/5 utilizing the
- * ldd/std SPARC instructions.
- *
- * Copyright (C) 1999 Jakub Jelinek (jj@ultra.linux.cz)
- *
- */
-
-XORBLOCK_TEMPLATE(SPARC)
-{
- int size = bh_ptr[0]->b_size;
- int lines = size / (sizeof (long)) / 8, i;
- long *destp = (long *) bh_ptr[0]->b_data;
- long *source1 = (long *) bh_ptr[1]->b_data;
- long *source2, *source3, *source4;
-
- switch (count) {
- case 2:
- for (i = lines; i > 0; i--) {
- __asm__ __volatile__("
- ldd [%0 + 0x00], %%g2
- ldd [%0 + 0x08], %%g4
- ldd [%0 + 0x10], %%o0
- ldd [%0 + 0x18], %%o2
- ldd [%1 + 0x00], %%o4
- ldd [%1 + 0x08], %%l0
- ldd [%1 + 0x10], %%l2
- ldd [%1 + 0x18], %%l4
- xor %%g2, %%o4, %%g2
- xor %%g3, %%o5, %%g3
- xor %%g4, %%l0, %%g4
- xor %%g5, %%l1, %%g5
- xor %%o0, %%l2, %%o0
- xor %%o1, %%l3, %%o1
- xor %%o2, %%l4, %%o2
- xor %%o3, %%l5, %%o3
- std %%g2, [%0 + 0x00]
- std %%g4, [%0 + 0x08]
- std %%o0, [%0 + 0x10]
- std %%o2, [%0 + 0x18]
- " : : "r" (destp), "r" (source1) : "g2", "g3", "g4", "g5", "o0",
- "o1", "o2", "o3", "o4", "o5", "l0", "l1", "l2", "l3", "l4", "l5");
- destp += 8;
- source1 += 8;
- }
- break;
- case 3:
- source2 = (long *) bh_ptr[2]->b_data;
- for (i = lines; i > 0; i--) {
- __asm__ __volatile__("
- ldd [%0 + 0x00], %%g2
- ldd [%0 + 0x08], %%g4
- ldd [%0 + 0x10], %%o0
- ldd [%0 + 0x18], %%o2
- ldd [%1 + 0x00], %%o4
- ldd [%1 + 0x08], %%l0
- ldd [%1 + 0x10], %%l2
- ldd [%1 + 0x18], %%l4
- xor %%g2, %%o4, %%g2
- xor %%g3, %%o5, %%g3
- ldd [%2 + 0x00], %%o4
- xor %%g4, %%l0, %%g4
- xor %%g5, %%l1, %%g5
- ldd [%2 + 0x08], %%l0
- xor %%o0, %%l2, %%o0
- xor %%o1, %%l3, %%o1
- ldd [%2 + 0x10], %%l2
- xor %%o2, %%l4, %%o2
- xor %%o3, %%l5, %%o3
- ldd [%2 + 0x18], %%l4
- xor %%g2, %%o4, %%g2
- xor %%g3, %%o5, %%g3
- xor %%g4, %%l0, %%g4
- xor %%g5, %%l1, %%g5
- xor %%o0, %%l2, %%o0
- xor %%o1, %%l3, %%o1
- xor %%o2, %%l4, %%o2
- xor %%o3, %%l5, %%o3
- std %%g2, [%0 + 0x00]
- std %%g4, [%0 + 0x08]
- std %%o0, [%0 + 0x10]
- std %%o2, [%0 + 0x18]
- " : : "r" (destp), "r" (source1), "r" (source2)
- : "g2", "g3", "g4", "g5", "o0", "o1", "o2", "o3", "o4", "o5",
- "l0", "l1", "l2", "l3", "l4", "l5");
- destp += 8;
- source1 += 8;
- source2 += 8;
- }
- break;
- case 4:
- source2 = (long *) bh_ptr[2]->b_data;
- source3 = (long *) bh_ptr[3]->b_data;
- for (i = lines; i > 0; i--) {
- __asm__ __volatile__("
- ldd [%0 + 0x00], %%g2
- ldd [%0 + 0x08], %%g4
- ldd [%0 + 0x10], %%o0
- ldd [%0 + 0x18], %%o2
- ldd [%1 + 0x00], %%o4
- ldd [%1 + 0x08], %%l0
- ldd [%1 + 0x10], %%l2
- ldd [%1 + 0x18], %%l4
- xor %%g2, %%o4, %%g2
- xor %%g3, %%o5, %%g3
- ldd [%2 + 0x00], %%o4
- xor %%g4, %%l0, %%g4
- xor %%g5, %%l1, %%g5
- ldd [%2 + 0x08], %%l0
- xor %%o0, %%l2, %%o0
- xor %%o1, %%l3, %%o1
- ldd [%2 + 0x10], %%l2
- xor %%o2, %%l4, %%o2
- xor %%o3, %%l5, %%o3
- ldd [%2 + 0x18], %%l4
- xor %%g2, %%o4, %%g2
- xor %%g3, %%o5, %%g3
- ldd [%3 + 0x00], %%o4
- xor %%g4, %%l0, %%g4
- xor %%g5, %%l1, %%g5
- ldd [%3 + 0x08], %%l0
- xor %%o0, %%l2, %%o0
- xor %%o1, %%l3, %%o1
- ldd [%3 + 0x10], %%l2
- xor %%o2, %%l4, %%o2
- xor %%o3, %%l5, %%o3
- ldd [%3 + 0x18], %%l4
- xor %%g2, %%o4, %%g2
- xor %%g3, %%o5, %%g3
- xor %%g4, %%l0, %%g4
- xor %%g5, %%l1, %%g5
- xor %%o0, %%l2, %%o0
- xor %%o1, %%l3, %%o1
- xor %%o2, %%l4, %%o2
- xor %%o3, %%l5, %%o3
- std %%g2, [%0 + 0x00]
- std %%g4, [%0 + 0x08]
- std %%o0, [%0 + 0x10]
- std %%o2, [%0 + 0x18]
- " : : "r" (destp), "r" (source1), "r" (source2), "r" (source3)
- : "g2", "g3", "g4", "g5", "o0", "o1", "o2", "o3", "o4", "o5",
- "l0", "l1", "l2", "l3", "l4", "l5");
- destp += 8;
- source1 += 8;
- source2 += 8;
- source3 += 8;
- }
- break;
- case 5:
- source2 = (long *) bh_ptr[2]->b_data;
- source3 = (long *) bh_ptr[3]->b_data;
- source4 = (long *) bh_ptr[4]->b_data;
- for (i = lines; i > 0; i--) {
- __asm__ __volatile__("
- ldd [%0 + 0x00], %%g2
- ldd [%0 + 0x08], %%g4
- ldd [%0 + 0x10], %%o0
- ldd [%0 + 0x18], %%o2
- ldd [%1 + 0x00], %%o4
- ldd [%1 + 0x08], %%l0
- ldd [%1 + 0x10], %%l2
- ldd [%1 + 0x18], %%l4
- xor %%g2, %%o4, %%g2
- xor %%g3, %%o5, %%g3
- ldd [%2 + 0x00], %%o4
- xor %%g4, %%l0, %%g4
- xor %%g5, %%l1, %%g5
- ldd [%2 + 0x08], %%l0
- xor %%o0, %%l2, %%o0
- xor %%o1, %%l3, %%o1
- ldd [%2 + 0x10], %%l2
- xor %%o2, %%l4, %%o2
- xor %%o3, %%l5, %%o3
- ldd [%2 + 0x18], %%l4
- xor %%g2, %%o4, %%g2
- xor %%g3, %%o5, %%g3
- ldd [%3 + 0x00], %%o4
- xor %%g4, %%l0, %%g4
- xor %%g5, %%l1, %%g5
- ldd [%3 + 0x08], %%l0
- xor %%o0, %%l2, %%o0
- xor %%o1, %%l3, %%o1
- ldd [%3 + 0x10], %%l2
- xor %%o2, %%l4, %%o2
- xor %%o3, %%l5, %%o3
- ldd [%3 + 0x18], %%l4
- xor %%g2, %%o4, %%g2
- xor %%g3, %%o5, %%g3
- ldd [%4 + 0x00], %%o4
- xor %%g4, %%l0, %%g4
- xor %%g5, %%l1, %%g5
- ldd [%4 + 0x08], %%l0
- xor %%o0, %%l2, %%o0
- xor %%o1, %%l3, %%o1
- ldd [%4 + 0x10], %%l2
- xor %%o2, %%l4, %%o2
- xor %%o3, %%l5, %%o3
- ldd [%4 + 0x18], %%l4
- xor %%g2, %%o4, %%g2
- xor %%g3, %%o5, %%g3
- xor %%g4, %%l0, %%g4
- xor %%g5, %%l1, %%g5
- xor %%o0, %%l2, %%o0
- xor %%o1, %%l3, %%o1
- xor %%o2, %%l4, %%o2
- xor %%o3, %%l5, %%o3
- std %%g2, [%0 + 0x00]
- std %%g4, [%0 + 0x08]
- std %%o0, [%0 + 0x10]
- std %%o2, [%0 + 0x18]
- " : : "r" (destp), "r" (source1), "r" (source2), "r" (source3), "r" (source4)
- : "g2", "g3", "g4", "g5", "o0", "o1", "o2", "o3", "o4", "o5",
- "l0", "l1", "l2", "l3", "l4", "l5");
- destp += 8;
- source1 += 8;
- source2 += 8;
- source3 += 8;
- source4 += 8;
- }
- break;
- }
-}
-#endif /* __sparc_v[78]__ */
-
-#ifndef __sparc_v9__
-
-/*
- * this one works reasonably on any x86 CPU
- * (send me an assembly version for inclusion if you can make it faster)
- *
- * this one is just as fast as written in pure assembly on x86.
- * the reason for this separate version is that the
- * fast open-coded xor routine "32regs" produces suboptimal code
- * on x86, due to lack of registers.
- */
-XORBLOCK_TEMPLATE(8regs)
-{
- int len = bh_ptr[0]->b_size;
- long *destp = (long *) bh_ptr[0]->b_data;
- long *source1, *source2, *source3, *source4;
- long lines = len / (sizeof (long)) / 8, i;
-
- switch(count) {
- case 2:
- source1 = (long *) bh_ptr[1]->b_data;
- for (i = lines; i > 0; i--) {
- *(destp + 0) ^= *(source1 + 0);
- *(destp + 1) ^= *(source1 + 1);
- *(destp + 2) ^= *(source1 + 2);
- *(destp + 3) ^= *(source1 + 3);
- *(destp + 4) ^= *(source1 + 4);
- *(destp + 5) ^= *(source1 + 5);
- *(destp + 6) ^= *(source1 + 6);
- *(destp + 7) ^= *(source1 + 7);
- source1 += 8;
- destp += 8;
- }
- break;
- case 3:
- source2 = (long *) bh_ptr[2]->b_data;
- source1 = (long *) bh_ptr[1]->b_data;
- for (i = lines; i > 0; i--) {
- *(destp + 0) ^= *(source1 + 0);
- *(destp + 0) ^= *(source2 + 0);
- *(destp + 1) ^= *(source1 + 1);
- *(destp + 1) ^= *(source2 + 1);
- *(destp + 2) ^= *(source1 + 2);
- *(destp + 2) ^= *(source2 + 2);
- *(destp + 3) ^= *(source1 + 3);
- *(destp + 3) ^= *(source2 + 3);
- *(destp + 4) ^= *(source1 + 4);
- *(destp + 4) ^= *(source2 + 4);
- *(destp + 5) ^= *(source1 + 5);
- *(destp + 5) ^= *(source2 + 5);
- *(destp + 6) ^= *(source1 + 6);
- *(destp + 6) ^= *(source2 + 6);
- *(destp + 7) ^= *(source1 + 7);
- *(destp + 7) ^= *(source2 + 7);
- source1 += 8;
- source2 += 8;
- destp += 8;
- }
- break;
- case 4:
- source3 = (long *) bh_ptr[3]->b_data;
- source2 = (long *) bh_ptr[2]->b_data;
- source1 = (long *) bh_ptr[1]->b_data;
- for (i = lines; i > 0; i--) {
- *(destp + 0) ^= *(source1 + 0);
- *(destp + 0) ^= *(source2 + 0);
- *(destp + 0) ^= *(source3 + 0);
- *(destp + 1) ^= *(source1 + 1);
- *(destp + 1) ^= *(source2 + 1);
- *(destp + 1) ^= *(source3 + 1);
- *(destp + 2) ^= *(source1 + 2);
- *(destp + 2) ^= *(source2 + 2);
- *(destp + 2) ^= *(source3 + 2);
- *(destp + 3) ^= *(source1 + 3);
- *(destp + 3) ^= *(source2 + 3);
- *(destp + 3) ^= *(source3 + 3);
- *(destp + 4) ^= *(source1 + 4);
- *(destp + 4) ^= *(source2 + 4);
- *(destp + 4) ^= *(source3 + 4);
- *(destp + 5) ^= *(source1 + 5);
- *(destp + 5) ^= *(source2 + 5);
- *(destp + 5) ^= *(source3 + 5);
- *(destp + 6) ^= *(source1 + 6);
- *(destp + 6) ^= *(source2 + 6);
- *(destp + 6) ^= *(source3 + 6);
- *(destp + 7) ^= *(source1 + 7);
- *(destp + 7) ^= *(source2 + 7);
- *(destp + 7) ^= *(source3 + 7);
- source1 += 8;
- source2 += 8;
- source3 += 8;
- destp += 8;
- }
- break;
- case 5:
- source4 = (long *) bh_ptr[4]->b_data;
- source3 = (long *) bh_ptr[3]->b_data;
- source2 = (long *) bh_ptr[2]->b_data;
- source1 = (long *) bh_ptr[1]->b_data;
- for (i = lines; i > 0; i--) {
- *(destp + 0) ^= *(source1 + 0);
- *(destp + 0) ^= *(source2 + 0);
- *(destp + 0) ^= *(source3 + 0);
- *(destp + 0) ^= *(source4 + 0);
- *(destp + 1) ^= *(source1 + 1);
- *(destp + 1) ^= *(source2 + 1);
- *(destp + 1) ^= *(source3 + 1);
- *(destp + 1) ^= *(source4 + 1);
- *(destp + 2) ^= *(source1 + 2);
- *(destp + 2) ^= *(source2 + 2);
- *(destp + 2) ^= *(source3 + 2);
- *(destp + 2) ^= *(source4 + 2);
- *(destp + 3) ^= *(source1 + 3);
- *(destp + 3) ^= *(source2 + 3);
- *(destp + 3) ^= *(source3 + 3);
- *(destp + 3) ^= *(source4 + 3);
- *(destp + 4) ^= *(source1 + 4);
- *(destp + 4) ^= *(source2 + 4);
- *(destp + 4) ^= *(source3 + 4);
- *(destp + 4) ^= *(source4 + 4);
- *(destp + 5) ^= *(source1 + 5);
- *(destp + 5) ^= *(source2 + 5);
- *(destp + 5) ^= *(source3 + 5);
- *(destp + 5) ^= *(source4 + 5);
- *(destp + 6) ^= *(source1 + 6);
- *(destp + 6) ^= *(source2 + 6);
- *(destp + 6) ^= *(source3 + 6);
- *(destp + 6) ^= *(source4 + 6);
- *(destp + 7) ^= *(source1 + 7);
- *(destp + 7) ^= *(source2 + 7);
- *(destp + 7) ^= *(source3 + 7);
- *(destp + 7) ^= *(source4 + 7);
- source1 += 8;
- source2 += 8;
- source3 += 8;
- source4 += 8;
- destp += 8;
- }
- break;
- }
-}
-
-/*
- * platform independent RAID5 checksum calculation, this should
- * be very fast on any platform that has a decent amount of
- * registers. (32 or more)
- */
-XORBLOCK_TEMPLATE(32regs)
-{
- int size = bh_ptr[0]->b_size;
- int lines = size / (sizeof (long)) / 8, i;
- long *destp = (long *) bh_ptr[0]->b_data;
- long *source1, *source2, *source3, *source4;
-
- /* LOTS of registers available...
- We do explicit loop-unrolling here for code which
- favours RISC machines. In fact this is almost direct
- RISC assembly on Alpha and SPARC :-) */
-
-
- switch(count) {
- case 2:
- source1 = (long *) bh_ptr[1]->b_data;
- for (i = lines; i > 0; i--) {
- register long d0, d1, d2, d3, d4, d5, d6, d7;
- d0 = destp[0]; /* Pull the stuff into registers */
- d1 = destp[1]; /* ... in bursts, if possible. */
- d2 = destp[2];
- d3 = destp[3];
- d4 = destp[4];
- d5 = destp[5];
- d6 = destp[6];
- d7 = destp[7];
- d0 ^= source1[0];
- d1 ^= source1[1];
- d2 ^= source1[2];
- d3 ^= source1[3];
- d4 ^= source1[4];
- d5 ^= source1[5];
- d6 ^= source1[6];
- d7 ^= source1[7];
- destp[0] = d0; /* Store the result (in bursts) */
- destp[1] = d1;
- destp[2] = d2;
- destp[3] = d3;
- destp[4] = d4; /* Store the result (in bursts) */
- destp[5] = d5;
- destp[6] = d6;
- destp[7] = d7;
- source1 += 8;
- destp += 8;
- }
- break;
- case 3:
- source2 = (long *) bh_ptr[2]->b_data;
- source1 = (long *) bh_ptr[1]->b_data;
- for (i = lines; i > 0; i--) {
- register long d0, d1, d2, d3, d4, d5, d6, d7;
- d0 = destp[0]; /* Pull the stuff into registers */
- d1 = destp[1]; /* ... in bursts, if possible. */
- d2 = destp[2];
- d3 = destp[3];
- d4 = destp[4];
- d5 = destp[5];
- d6 = destp[6];
- d7 = destp[7];
- d0 ^= source1[0];
- d1 ^= source1[1];
- d2 ^= source1[2];
- d3 ^= source1[3];
- d4 ^= source1[4];
- d5 ^= source1[5];
- d6 ^= source1[6];
- d7 ^= source1[7];
- d0 ^= source2[0];
- d1 ^= source2[1];
- d2 ^= source2[2];
- d3 ^= source2[3];
- d4 ^= source2[4];
- d5 ^= source2[5];
- d6 ^= source2[6];
- d7 ^= source2[7];
- destp[0] = d0; /* Store the result (in bursts) */
- destp[1] = d1;
- destp[2] = d2;
- destp[3] = d3;
- destp[4] = d4; /* Store the result (in bursts) */
- destp[5] = d5;
- destp[6] = d6;
- destp[7] = d7;
- source1 += 8;
- source2 += 8;
- destp += 8;
- }
- break;
- case 4:
- source3 = (long *) bh_ptr[3]->b_data;
- source2 = (long *) bh_ptr[2]->b_data;
- source1 = (long *) bh_ptr[1]->b_data;
- for (i = lines; i > 0; i--) {
- register long d0, d1, d2, d3, d4, d5, d6, d7;
- d0 = destp[0]; /* Pull the stuff into registers */
- d1 = destp[1]; /* ... in bursts, if possible. */
- d2 = destp[2];
- d3 = destp[3];
- d4 = destp[4];
- d5 = destp[5];
- d6 = destp[6];
- d7 = destp[7];
- d0 ^= source1[0];
- d1 ^= source1[1];
- d2 ^= source1[2];
- d3 ^= source1[3];
- d4 ^= source1[4];
- d5 ^= source1[5];
- d6 ^= source1[6];
- d7 ^= source1[7];
- d0 ^= source2[0];
- d1 ^= source2[1];
- d2 ^= source2[2];
- d3 ^= source2[3];
- d4 ^= source2[4];
- d5 ^= source2[5];
- d6 ^= source2[6];
- d7 ^= source2[7];
- d0 ^= source3[0];
- d1 ^= source3[1];
- d2 ^= source3[2];
- d3 ^= source3[3];
- d4 ^= source3[4];
- d5 ^= source3[5];
- d6 ^= source3[6];
- d7 ^= source3[7];
- destp[0] = d0; /* Store the result (in bursts) */
- destp[1] = d1;
- destp[2] = d2;
- destp[3] = d3;
- destp[4] = d4; /* Store the result (in bursts) */
- destp[5] = d5;
- destp[6] = d6;
- destp[7] = d7;
- source1 += 8;
- source2 += 8;
- source3 += 8;
- destp += 8;
- }
- break;
- case 5:
- source4 = (long *) bh_ptr[4]->b_data;
- source3 = (long *) bh_ptr[3]->b_data;
- source2 = (long *) bh_ptr[2]->b_data;
- source1 = (long *) bh_ptr[1]->b_data;
- for (i = lines; i > 0; i--) {
- register long d0, d1, d2, d3, d4, d5, d6, d7;
- d0 = destp[0]; /* Pull the stuff into registers */
- d1 = destp[1]; /* ... in bursts, if possible. */
- d2 = destp[2];
- d3 = destp[3];
- d4 = destp[4];
- d5 = destp[5];
- d6 = destp[6];
- d7 = destp[7];
- d0 ^= source1[0];
- d1 ^= source1[1];
- d2 ^= source1[2];
- d3 ^= source1[3];
- d4 ^= source1[4];
- d5 ^= source1[5];
- d6 ^= source1[6];
- d7 ^= source1[7];
- d0 ^= source2[0];
- d1 ^= source2[1];
- d2 ^= source2[2];
- d3 ^= source2[3];
- d4 ^= source2[4];
- d5 ^= source2[5];
- d6 ^= source2[6];
- d7 ^= source2[7];
- d0 ^= source3[0];
- d1 ^= source3[1];
- d2 ^= source3[2];
- d3 ^= source3[3];
- d4 ^= source3[4];
- d5 ^= source3[5];
- d6 ^= source3[6];
- d7 ^= source3[7];
- d0 ^= source4[0];
- d1 ^= source4[1];
- d2 ^= source4[2];
- d3 ^= source4[3];
- d4 ^= source4[4];
- d5 ^= source4[5];
- d6 ^= source4[6];
- d7 ^= source4[7];
- destp[0] = d0; /* Store the result (in bursts) */
- destp[1] = d1;
- destp[2] = d2;
- destp[3] = d3;
- destp[4] = d4; /* Store the result (in bursts) */
- destp[5] = d5;
- destp[6] = d6;
- destp[7] = d7;
- source1 += 8;
- source2 += 8;
- source3 += 8;
- source4 += 8;
- destp += 8;
- }
- break;
- }
-}
-
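Stripped of the register-burst choreography, the 32regs routine removed above is a plain word-wise XOR of each source block into the destination, unrolled eight longs at a time. A minimal portable sketch of the two-block case (our own helper name, not the kernel's XORBLOCK_TEMPLATE machinery):

```c
#include <assert.h>
#include <stddef.h>

/* XOR 'words' longs of src into dest, eight at a time as in the
 * unrolled kernel loop; 'words' is assumed to be a multiple of 8. */
static void xor_words(long *dest, const long *src, size_t words)
{
    size_t i;
    for (i = 0; i < words; i += 8) {
        long d0 = dest[i+0], d1 = dest[i+1], d2 = dest[i+2], d3 = dest[i+3];
        long d4 = dest[i+4], d5 = dest[i+5], d6 = dest[i+6], d7 = dest[i+7];
        d0 ^= src[i+0]; d1 ^= src[i+1]; d2 ^= src[i+2]; d3 ^= src[i+3];
        d4 ^= src[i+4]; d5 ^= src[i+5]; d6 ^= src[i+6]; d7 ^= src[i+7];
        dest[i+0] = d0; dest[i+1] = d1; dest[i+2] = d2; dest[i+3] = d3;
        dest[i+4] = d4; dest[i+5] = d5; dest[i+6] = d6; dest[i+7] = d7;
    }
}
```

Loading all eight destination words before any XOR, as above, is what gives the compiler the chance to schedule the loads in bursts on register-rich RISC targets.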
-/*
- * (the -6*32 shift factor colors the cache)
- */
-#define SIZE (PAGE_SIZE-6*32)
-
-static void xor_speed ( struct xor_block_template * func,
- struct buffer_head *b1, struct buffer_head *b2)
-{
- int speed;
- unsigned long now;
- int i, count, max;
- struct buffer_head *bh_ptr[6];
-
- func->next = xor_functions;
- xor_functions = func;
- bh_ptr[0] = b1;
- bh_ptr[1] = b2;
-
- /*
- * count the number of XORs done during a whole jiffy.
- * calculate the speed of checksumming from this.
- * (we use a 2-page allocation to have guaranteed
- * color L1-cache layout)
- */
- max = 0;
- for (i = 0; i < 5; i++) {
- now = jiffies;
- count = 0;
- while (jiffies == now) {
- mb();
- func->xor_block(2,bh_ptr);
- mb();
- count++;
- mb();
- }
- if (count > max)
- max = count;
- }
-
- speed = max * (HZ*SIZE/1024);
- func->speed = speed;
-
- printk( " %-10s: %5d.%03d MB/sec\n", func->name,
- speed / 1000, speed % 1000);
-}
-
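xor_speed() above calibrates by counting how many whole-buffer XORs finish inside one jiffy, repeating five times and keeping the best count to ride out interruptions. The same calibrate-by-counting idea in userland C, with clock() standing in for jiffies (an illustrative sketch, not the kernel routine):

```c
#include <assert.h>
#include <time.h>

static volatile long sink;

/* Dummy workload standing in for func->xor_block(). */
static void workload(void) { sink ^= 0x5a5a5a5aL; }

/* Count how many times fn() completes inside one tick-sized window;
 * return the best of 'tries' rounds, as the kernel loop does. */
static long count_calls_per_window(void (*fn)(void), int tries, clock_t window)
{
    long best = 0;
    for (int t = 0; t < tries; t++) {
        clock_t start = clock();
        long count = 0;
        while (clock() - start < window) {
            fn();
            count++;
        }
        if (count > best)
            best = count;
    }
    return best;
}
```

Taking the maximum over several rounds, rather than the mean, filters out rounds that lost time to other activity.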
-static inline void pick_fastest_function(void)
-{
- struct xor_block_template *f, *fastest;
-
- fastest = xor_functions;
- for (f = fastest; f; f = f->next) {
- if (f->speed > fastest->speed)
- fastest = f;
- }
-#ifdef CONFIG_X86_XMM
- if (boot_cpu_data.mmu_cr4_features & X86_CR4_OSXMMEXCPT) {
- fastest = &t_xor_block_pIII_kni;
- }
-#endif
- xor_block = fastest->xor_block;
- printk( "using fastest function: %s (%d.%03d MB/sec)\n", fastest->name,
- fastest->speed / 1000, fastest->speed % 1000);
-}
-
-
-void calibrate_xor_block(void)
-{
- struct buffer_head b1, b2;
-
- memset(&b1,0,sizeof(b1));
- b2 = b1;
-
- b1.b_data = (char *) md__get_free_pages(GFP_KERNEL,2);
- if (!b1.b_data) {
- pick_fastest_function();
- return;
- }
- b2.b_data = b1.b_data + 2*PAGE_SIZE + SIZE;
-
- b1.b_size = SIZE;
-
- printk(KERN_INFO "raid5: measuring checksumming speed\n");
-
- sti(); /* should be safe */
-
-#if defined(__sparc__) && !defined(__sparc_v9__)
- printk(KERN_INFO "raid5: trying high-speed SPARC checksum routine\n");
- xor_speed(&t_xor_block_SPARC,&b1,&b2);
-#endif
-
-#ifdef CONFIG_X86_XMM
- if (boot_cpu_data.mmu_cr4_features & X86_CR4_OSXMMEXCPT) {
- printk(KERN_INFO
- "raid5: KNI detected, trying cache-avoiding KNI checksum routine\n");
- /* we force the use of the KNI xor block because it
- can write around l2. we may also be able
- to load into the l1 only depending on how
- the cpu deals with a load to a line that is
- being prefetched.
- */
- xor_speed(&t_xor_block_pIII_kni,&b1,&b2);
- }
-#endif /* CONFIG_X86_XMM */
-
-#ifdef __i386__
-
- if (md_cpu_has_mmx()) {
- printk(KERN_INFO
- "raid5: MMX detected, trying high-speed MMX checksum routines\n");
- xor_speed(&t_xor_block_pII_mmx,&b1,&b2);
- xor_speed(&t_xor_block_p5_mmx,&b1,&b2);
- }
-
-#endif /* __i386__ */
-
-
- xor_speed(&t_xor_block_8regs,&b1,&b2);
- xor_speed(&t_xor_block_32regs,&b1,&b2);
-
- free_pages((unsigned long)b1.b_data,2);
- pick_fastest_function();
-}
-
-#else /* __sparc_v9__ */
-
-void calibrate_xor_block(void)
-{
- printk(KERN_INFO "raid5: using high-speed VIS checksum routine\n");
- xor_block = xor_block_VIS;
-}
-
-#endif /* __sparc_v9__ */
-
-MD_EXPORT_SYMBOL(xor_block);
-
printk(CDU535_MESSAGE_NAME ": my base address is not free!\n");
return -EIO;
}
+
/* look for the CD-ROM, follows the procedure in the DOS driver */
inb(select_unit_reg);
/* wait for 40 18 Hz ticks (reverse-engineered from DOS driver) */
+ current->state = TASK_INTERRUPTIBLE;
schedule_timeout((HZ+17)*40/18);
inb(result_reg);
/* This doesn't work like this for NTSC anyway.
So, better check the total image size ...
*/
-/*
- if(mp->height>576 || mp->width>768+BURSTOFFSET)
+
+ if(mp->height>576 || mp->width>768+BURSTOFFSET || mp->height < 32 || mp->width <32)
return -EINVAL;
-*/
+
if (mp->format >= PALETTEFMT_MAX)
return -EINVAL;
if (mp->height*mp->width*fmtbppx2[palette2fmt[mp->format]&0x0f]/2
{
struct video_buffer v;
#if LINUX_VERSION_CODE >= 0x020100
- if(!capable(CAP_SYS_ADMIN))
+ if(!capable(CAP_SYS_ADMIN)
+ || !capable(CAP_SYS_RAWIO))
#else
if(!suser())
#endif
v.height > 16 && v.bytesperline > 16)
return -EINVAL;
if (v.base)
- {
- if ((unsigned long)v.base&1)
- btv->win.vidadr=(unsigned long)(PAGE_OFFSET|uvirt_to_bus((unsigned long)v.base));
- else
- btv->win.vidadr=(unsigned long)v.base;
- }
+ btv->win.vidadr=(unsigned long)v.base;
btv->win.sheight=v.height;
btv->win.swidth=v.width;
btv->win.bpp=((v.depth+7)&0x38)/8;
struct video_mmap vm;
if(copy_from_user((void *) &vm, (void *) arg, sizeof(vm)))
return -EFAULT;
+ if (vm.frame < 0 || vm.frame >= MAX_GBUFFERS)
+ return -EIO;
if (btv->frame_stat[vm.frame] == GBUFFER_GRABBING)
return -EBUSY;
return vgrab(btv, &vm);
{
struct video_buffer v;
- if (!capable(CAP_SYS_ADMIN))
+ if (!capable(CAP_SYS_ADMIN)
+ || !capable(CAP_SYS_RAWIO))
return -EPERM;
if (copy_from_user(&v, arg, sizeof(v)))
dz_out (info, DZ_TCR, tmp);
- schedule_timeout(jiffies + duration);
+ schedule_timeout(duration);
tmp &= ~mask;
dz_out (info, DZ_TCR, tmp);
if (info->blocked_open) {
if (info->close_delay) {
current->state = TASK_INTERRUPTIBLE;
- schedule_timeout(jiffies + info->close_delay);
+ schedule_timeout(info->close_delay);
}
wake_up_interruptible (&info->open_wait);
}
#ifndef TWO_THREE
/* These are new in 2.3. The source now uses 2.3 syntax, and here is
the compatibility define... */
-#define waitq_head_t struct wait_queue *
+#define wait_queue_head_t struct wait_queue *
#define DECLARE_MUTEX(name) struct semaphore name = MUTEX
#define DECLARE_WAITQUEUE(wait, current) struct wait_queue wait = { current, NULL }
DEBUG("PlanB: IOCTL VIDIOCSFBUF\n");
- if (!capable(CAP_SYS_ADMIN))
+ if (!capable(CAP_SYS_ADMIN)
+ || !capable(CAP_SYS_RAWIO))
return -EPERM;
if (copy_from_user(&v, arg,sizeof(v)))
return -EFAULT;
{
struct sk_buff *skb_old = skb;
int pkt_len;
- skb = dev_alloc_skb(skb_old->len + 40);
+ skb = dev_alloc_skb(skb_old->len + 128);
if (!skb) {
printk(KERN_WARNING "%s: Memory squeeze, dropping packet.\n", dev->name);
return;
}
skb->dev = dev;
- skb_put(skb, skb_old->len + 40);
+ skb_put(skb, skb_old->len + 128);
memcpy(skb->data, skb_old->data, skb_old->len);
skb->mac.raw = skb->data;
pkt_len = slhc_uncompress(ippp_table[net_dev->local->ppp_slot]->slcomp,
static unsigned char *isdn_ppp_skb_push(struct sk_buff **skb_p,int len)
{
struct sk_buff *skb = *skb_p;
-
+
if(skb_headroom(skb) < len) {
- printk(KERN_ERR "isdn_ppp_skb_push:under %d %d\n",skb_headroom(skb),len);
+ struct sk_buff *nskb = skb_realloc_headroom(skb, len);
+
+ if (!nskb) {
+ printk(KERN_INFO "isdn_ppp_skb_push: can't realloc headroom!\n");
+ dev_kfree_skb(skb);
+ return NULL;
+ }
dev_kfree_skb(skb);
- return NULL;
+ *skb_p = nskb;
+ return skb_push(nskb, len);
}
return skb_push(skb,len);
}
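The patched isdn_ppp_skb_push() above no longer drops the frame when headroom runs out; it reallocates a buffer with enough leading space and copies the payload across. The same grow-headroom-by-copy pattern on a toy buffer (struct pkt is a hypothetical stand-in for sk_buff, not the kernel API):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy packet buffer: 'data' points into 'head', headroom = data - head. */
struct pkt {
    unsigned char *head, *data;
    size_t len, headroom;
};

/* If there is not enough headroom, reallocate with room for 'need'
 * extra leading bytes, then move 'data' back over the new space. */
static unsigned char *pkt_push(struct pkt *p, size_t need)
{
    if (p->headroom < need) {
        unsigned char *nhead = malloc(need + p->len);
        if (!nhead)
            return NULL;
        memcpy(nhead + need, p->data, p->len);
        free(p->head);
        p->head = nhead;
        p->data = nhead + need;
        p->headroom = need;
    }
    p->data -= need;
    p->headroom -= need;
    p->len += need;
    return p->data;
}
```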
-
-
+
/*
* send ppp frame .. we expect a PIDCOMPressable proto --
* (here: currently always PPP_IP,PPP_VJC_COMP,PPP_VJC_UNCOMP)
tp = kmalloc(sizeof(*tp), GFP_KERNEL | GFP_DMA);
if(tp==NULL)
{
- free_region(ioaddr, pci_tbl[chip_idx].io_size);
+ release_region(ioaddr, pci_tbl[chip_idx].io_size);
return NULL;
}
memset(tp, 0, sizeof(*tp));
if (ptr && !ptr->device->soft_reset) {
ptr->host_scribble = NULL;
ptr->result = DID_RESET << 16;
- spin_lock_irqsave(&io_request_lock, flags);
ptr->scsi_done(CURRENT_SC);
- spin_unlock_irqrestore(&io_request_lock, flags);
CURRENT_SC = NULL;
}
save_flags(flags);
ptr->host_scribble = NULL;
ptr->result = DID_RESET << 16;
- spin_lock_irqsave(&io_request_lock, flags);
ptr->scsi_done(ptr);
- spin_unlock_irqrestore(&io_request_lock, flags);
ptr = next;
} else {
* of writing 0x00 to 0x7f (which should be done by reset): The ES1887 moves
* into ES1888 mode. This means that it claims IRQ 11, which happens to be my
* ISDN adapter. Needless to say it no longer worked. I now understand why
- * after rebooting 0x7f already was 0x05, the value of my choise: the BIOS
+ * after rebooting 0x7f already was 0x05, the value of my choice: the BIOS
* did it.
*
* Oh, and this is another trap: in ES1887 docs mixer register 0x70 is described
/* AAS: info stolen from ALSA: these boards have different clocks */
switch(devc->submodel) {
-/* APPARENTLY NOT 1869
+/* APPARENTLY NOT 1869 AND 1887
case SUBMDL_ES1869:
-*/
case SUBMDL_ES1887:
+*/
case SUBMDL_ES1888:
devc->caps |= SB_CAP_ES18XX_RATE;
break;
* yet completely filled in, and revalidate has to delay such
* lookups..
*/
-static int autofs_do_revalidate(struct dentry * dentry, int flags)
+static int autofs_revalidate(struct dentry * dentry, int flags)
{
struct inode * dir = dentry->d_parent->d_inode;
struct autofs_sb_info *sbi = autofs_sbi(dir->i_sb);
return 1;
}
-static int autofs_revalidate(struct dentry * dentry, int flags)
-{
- int r;
- up(&dentry->d_parent->d_inode->i_sem);
- r = autofs_do_revalidate(dentry, flags);
- down(&dentry->d_parent->d_inode->i_sem);
- return r;
-}
-
static struct dentry_operations autofs_dentry_operations = {
autofs_revalidate, /* d_revalidate */
NULL, /* d_hash */
dentry->d_flags |= DCACHE_AUTOFS_PENDING;
d_add(dentry, NULL);
+ up(&dir->i_sem);
autofs_revalidate(dentry, 0);
+ down(&dir->i_sem);
/*
* If we are still pending, check if we had to handle
extern int *blksize_size[];
#define MAX_BUF_PER_PAGE (PAGE_SIZE / 512)
-#define NBUF 128
-#define READAHEAD_SECTORS (128 * 4 * 2)
+#define NBUF 64
ssize_t block_write(struct file * filp, const char * buf,
size_t count, loff_t *ppos)
size_t blocks, rblocks, left;
int bhrequest, uptodate;
struct buffer_head ** bhb, ** bhe;
- struct buffer_head ** buflist;
- struct buffer_head ** bhreq;
+ struct buffer_head * buflist[NBUF];
+ struct buffer_head * bhreq[NBUF];
unsigned int chars;
loff_t size;
kdev_t dev;
ssize_t read;
- int nbuf;
dev = inode->i_rdev;
blocksize = BLOCK_SIZE;
left = count;
if (left <= 0)
return 0;
-
- if ((buflist = (struct buffer_head **) __get_free_page(GFP_KERNEL)) == NULL)
- return -ENOMEM;
- if ((bhreq = (struct buffer_head **) __get_free_page(GFP_KERNEL)) == NULL) {
- free_page((unsigned long) buflist);
- return -ENOMEM;
- }
-
- nbuf = READAHEAD_SECTORS / (blocksize >> 9);
- if (nbuf > PAGE_SIZE / sizeof(struct buffer_head *))
- nbuf = PAGE_SIZE / sizeof(struct buffer_head *);
-
read = 0;
block = offset >> blocksize_bits;
offset &= blocksize-1;
rblocks = blocks = (left + offset + blocksize - 1) >> blocksize_bits;
bhb = bhe = buflist;
if (filp->f_reada) {
-#if 0
if (blocks < read_ahead[MAJOR(dev)] / (blocksize >> 9))
blocks = read_ahead[MAJOR(dev)] / (blocksize >> 9);
-#else
- blocks += read_ahead[MAJOR(dev)] / (blocksize >> 9);
-#endif
if (rblocks > blocks)
blocks = rblocks;
bhreq[bhrequest++] = *bhb;
}
- if (++bhb == &buflist[nbuf])
+ if (++bhb == &buflist[NBUF])
bhb = buflist;
/* If the block we have on hand is uptodate, go ahead
wait_on_buffer(*bhe);
if (!buffer_uptodate(*bhe)) { /* read error? */
brelse(*bhe);
- if (++bhe == &buflist[nbuf])
+ if (++bhe == &buflist[NBUF])
bhe = buflist;
left = 0;
break;
put_user(0,buf++);
}
offset = 0;
- if (++bhe == &buflist[nbuf])
+ if (++bhe == &buflist[NBUF])
bhe = buflist;
} while (left > 0 && bhe != bhb && (!*bhe || !buffer_locked(*bhe)));
if (bhe == bhb && !blocks)
/* Release the read-ahead blocks */
while (bhe != bhb) {
brelse(*bhe);
- if (++bhe == &buflist[nbuf])
+ if (++bhe == &buflist[NBUF])
bhe = buflist;
};
-
- free_page((unsigned long) buflist);
- free_page((unsigned long) bhreq);
if (!read)
return -EIO;
filp->f_reada = 1;
return 0;
}
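The restored block_read() cycles its bhb/bhe cursors through a fixed array of NBUF pointers, wrapping with the idiom `if (++bhb == &buflist[NBUF]) bhb = buflist;`. That one-past-the-end wrap in miniature (our helper, for illustration only):

```c
#include <assert.h>

#define RING 4

/* Advance a cursor through ring[0..RING-1], wrapping at the end,
 * exactly as block_read() advances bhb and bhe through buflist[]. */
static int *ring_next(int ring[], int **cur)
{
    if (++*cur == &ring[RING])
        *cur = ring;
    return *cur;
}
```

Comparing against `&ring[RING]` is legal C: taking the address one past the last element is permitted as long as it is never dereferenced.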
-#if 0
void set_blocksize(kdev_t dev, int size)
{
extern int *blksize_size[];
clear_bit(BH_Req, &bh->b_state);
bh->b_flushtime = 0;
}
- remove_from_queues(bh);
- bh->b_dev=B_FREE;
- insert_into_queues(bh);
- }
- }
-}
-
-#else
-void set_blocksize(kdev_t dev, int size)
-{
- extern int *blksize_size[];
- int i, nlist;
- struct buffer_head * bh, *bhnext;
-
- if (!blksize_size[MAJOR(dev)])
- return;
-
- /* Size must be a power of two, and between 512 and PAGE_SIZE */
- if (size > PAGE_SIZE || size < 512 || (size & (size-1)))
- panic("Invalid blocksize passed to set_blocksize");
-
- if (blksize_size[MAJOR(dev)][MINOR(dev)] == 0 && size == BLOCK_SIZE) {
- blksize_size[MAJOR(dev)][MINOR(dev)] = size;
- return;
- }
- if (blksize_size[MAJOR(dev)][MINOR(dev)] == size)
- return;
- sync_buffers(dev, 2);
- blksize_size[MAJOR(dev)][MINOR(dev)] = size;
-
- /* We need to be quite careful how we do this - we are moving entries
- * around on the free list, and we can get in a loop if we are not careful.
- */
- for(nlist = 0; nlist < NR_LIST; nlist++) {
- bh = lru_list[nlist];
- for (i = nr_buffers_type[nlist]*2 ; --i > 0 ; bh = bhnext) {
- if(!bh)
- break;
-
- bhnext = bh->b_next_free;
- if (bh->b_dev != dev)
- continue;
- if (bh->b_size == size)
- continue;
- if (bhnext)
- bhnext->b_count++;
- wait_on_buffer(bh);
- if (bh->b_dev == dev && bh->b_size != size) {
- clear_bit(BH_Dirty, &bh->b_state);
- clear_bit(BH_Uptodate, &bh->b_state);
- clear_bit(BH_Req, &bh->b_state);
- bh->b_flushtime = 0;
- }
-
- /*
- * lets be mega-conservative about what to free:
- */
- if (!(bh->b_dev != dev) &&
- !(bh->b_size == size) &&
- !bh->b_count &&
- !buffer_protected(bh) &&
- !buffer_dirty(bh) &&
- !buffer_locked(bh) &&
- !waitqueue_active(&bh->b_wait)) {
- remove_from_hash_queue(bh);
- bh->b_dev = NODEV;
- refile_buffer(bh);
- try_to_free_buffers(buffer_page(bh));
- } else {
- remove_from_queues(bh);
- bh->b_dev=B_FREE;
- insert_into_queues(bh);
- }
- if (bhnext)
- bhnext->b_count--;
+ remove_from_hash_queue(bh);
}
}
}
-#endif
-
-/*
-* This function knows that we do a linear pass over the whole array,
-* so we can drop all unused buffers. Careful, bforget alone is
-* unsafe, we must be 100% sure that at the end of bforget() we will
-* really have no (new) users of this buffer.
-*
-* this logic improves overall system performance greatly during array
-* resync or reconstruction. Actually, the reconstruction is basically
-* seamless.
-*/
-void cache_drop_behind(struct buffer_head *bh)
-{
- /*
- * We are up to something dangerous ... rather be careful
- */
- if ((bh->b_count != 1) || buffer_protected(bh) ||
- buffer_dirty(bh) || buffer_locked(bh) ||
- !buffer_lowprio(bh) || waitqueue_active(&bh->b_wait)) {
- brelse(bh);
- } else {
- bh->b_count--;
- remove_from_hash_queue(bh);
- bh->b_dev = NODEV;
- refile_buffer(bh);
- try_to_free_buffers(buffer_page(bh));
- }
-}
/*
* We used to try various strange things. Let's not.
* Ok, breada can be used as bread, but additionally to mark other
* blocks for reading as well. End the argument list with a negative
* number.
- *
- * __breada does the same but with block arguments. This is handy if a
- * device is bigger than 2G on a 32-bit architecture.
*/
#define NBUF 16
-struct buffer_head * breada_blocks(kdev_t dev, int block,
- int bufsize, int blocks)
+struct buffer_head * breada(kdev_t dev, int block, int bufsize,
+ unsigned int pos, unsigned int filesize)
{
struct buffer_head * bhlist[NBUF];
+ unsigned int blocks;
struct buffer_head * bh;
int index;
int i, j;
+ if (pos >= filesize)
+ return NULL;
+
if (block < 0)
return NULL;
if (buffer_uptodate(bh))
return(bh);
- else
- ll_rw_block(READ, 1, &bh);
+ else ll_rw_block(READ, 1, &bh);
+
+ blocks = (filesize - pos) >> (9+index);
if (blocks < (read_ahead[MAJOR(dev)] >> index))
blocks = read_ahead[MAJOR(dev)] >> index;
if (blocks > NBUF)
blocks = NBUF;
+/* if (blocks) printk("breada (new) %d blocks\n",blocks); */
+
+
bhlist[0] = bh;
j = 1;
for(i=1; i<blocks; i++) {
return NULL;
}
-struct buffer_head * breada(kdev_t dev, int block, int bufsize,
- unsigned int pos, unsigned int filesize)
-{
- unsigned int blocks;
- int index;
-
- if (pos >= filesize)
- return NULL;
-
- index = BUFSIZE_INDEX(bufsize);
-
- blocks = (filesize - pos) >> (9+index);
-
- return (breada_blocks(dev,block,bufsize,blocks));
-}
-
/*
* Note: the caller should wake up the buffer_wait list if needed.
*/
dquot->dq_count--;
return NODQUOT;
}
+ dquot->dq_referenced++;
+ dqstats.lookups++;
return dquot;
}
else opts->quiet = 1;
}
else if (!strcmp(this_char,"blocksize")) {
- if (*value) ret = 0;
- else if (*blksize != 512 &&
- *blksize != 1024 &&
- *blksize != 2048) {
- printk ("MSDOS FS: Invalid blocksize "
- "(512, 1024, or 2048)\n");
+ if (!value || !*value) ret = 0;
+ else {
+ *blksize = simple_strtoul(value,&value,0);
+ if (*value || (*blksize != 512 &&
+ *blksize != 1024 && *blksize != 2048))
+ ret = 0;
}
}
else if (!strcmp(this_char,"sys_immutable")) {
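The fixed blocksize handler above now actually parses the value with simple_strtoul() and rejects both trailing junk (*value still non-NUL after the call) and any size other than 512, 1024, or 2048. The same validation in userland C, with strtoul() in place of simple_strtoul():

```c
#include <assert.h>
#include <stdlib.h>

/* Parse a blocksize mount option; return 0 on any malformed or
 * unsupported value, mirroring the msdos "blocksize=" check. */
static int parse_blocksize(const char *value, unsigned long *out)
{
    char *end;
    unsigned long n;

    if (!value || !*value)
        return 0;
    n = strtoul(value, &end, 0);
    if (*end)                       /* trailing junk, e.g. "512x" */
        return 0;
    if (n != 512 && n != 1024 && n != 2048)
        return 0;
    *out = n;
    return 1;
}
```

Checking the end pointer is the step the original code skipped: without it, "512garbage" parses as 512.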
if (n < 0)
goto out_nofds;
- if (n > current->files->max_fdset + 1)
- n = current->files->max_fdset + 1;
+ if (n > current->files->max_fdset)
+ n = current->files->max_fdset;
/*
* We need 6 bitmaps (in/out/ex for both incoming and outgoing),
#define CIA_DMA_WIN_BASE alpha_mv.dma_win_base
#define CIA_DMA_WIN_SIZE alpha_mv.dma_win_size
#else
-#define CIA_DMA_WIN_BASE CIA_DMA_WIN_SIZE_DEFAULT
+#define CIA_DMA_WIN_BASE CIA_DMA_WIN_BASE_DEFAULT
#define CIA_DMA_WIN_SIZE CIA_DMA_WIN_SIZE_DEFAULT
#endif
--- /dev/null
+/* $Id: md.h,v 1.1 1997/12/15 15:11:48 jj Exp $
+ * md.h: High speed xor_block operation for RAID4/5
+ *
+ */
+
+#ifndef __ASM_MD_H
+#define __ASM_MD_H
+
+/* #define HAVE_ARCH_XORBLOCK */
+
+#define MD_XORBLOCK_ALIGNMENT sizeof(long)
+
+#endif /* __ASM_MD_H */
--- /dev/null
+/* $Id: md.h,v 1.1 1997/12/15 15:11:57 jj Exp $
+ * md.h: High speed xor_block operation for RAID4/5
+ *
+ */
+
+#ifndef __ASM_MD_H
+#define __ASM_MD_H
+
+/* #define HAVE_ARCH_XORBLOCK */
+
+#define MD_XORBLOCK_ALIGNMENT sizeof(long)
+
+#endif /* __ASM_MD_H */
--- /dev/null
+/* $Id: md.h,v 1.1 1997/12/15 15:12:04 jj Exp $
+ * md.h: High speed xor_block operation for RAID4/5
+ *
+ */
+
+#ifndef __ASM_MD_H
+#define __ASM_MD_H
+
+/* #define HAVE_ARCH_XORBLOCK */
+
+#define MD_XORBLOCK_ALIGNMENT sizeof(long)
+
+#endif /* __ASM_MD_H */
--- /dev/null
+/* $Id: md.h,v 1.1 1997/12/15 15:12:15 jj Exp $
+ * md.h: High speed xor_block operation for RAID4/5
+ *
+ */
+
+#ifndef __ASM_MD_H
+#define __ASM_MD_H
+
+/* #define HAVE_ARCH_XORBLOCK */
+
+#define MD_XORBLOCK_ALIGNMENT sizeof(long)
+
+#endif /* __ASM_MD_H */
--- /dev/null
+/* $Id: md.h,v 1.1 1997/12/15 15:12:39 jj Exp $
+ * md.h: High speed xor_block operation for RAID4/5
+ *
+ */
+
+#ifndef __ASM_MD_H
+#define __ASM_MD_H
+
+/* #define HAVE_ARCH_XORBLOCK */
+
+#define MD_XORBLOCK_ALIGNMENT sizeof(long)
+
+#endif /* __ASM_MD_H */
--- /dev/null
+/* $Id: md.h,v 1.2 1997/12/27 16:28:38 jj Exp $
+ * md.h: High speed xor_block operation for RAID4/5
+ * utilizing the UltraSparc Visual Instruction Set.
+ *
+ * Copyright (C) 1997 Jakub Jelinek (jj@sunsite.mff.cuni.cz)
+ */
+
+#ifndef __ASM_MD_H
+#define __ASM_MD_H
+
+#include <asm/head.h>
+#include <asm/asi.h>
+
+#define HAVE_ARCH_XORBLOCK
+
+#define MD_XORBLOCK_ALIGNMENT 64
+
+/* void __xor_block (char *dest, char *src, long len)
+ * {
+ * while (len--) *dest++ ^= *src++;
+ * }
+ *
+ * Requirements:
+ * !(((long)dest | (long)src) & (MD_XORBLOCK_ALIGNMENT - 1)) &&
+ * !(len & 127) && len >= 256
+ */
+
+static inline void __xor_block (char *dest, char *src, long len)
+{
+ __asm__ __volatile__ ("
+ wr %%g0, %3, %%fprs
+ wr %%g0, %4, %%asi
+ membar #LoadStore|#StoreLoad|#StoreStore
+ sub %2, 128, %2
+ ldda [%0] %4, %%f0
+ ldda [%1] %4, %%f16
+1: ldda [%0 + 64] %%asi, %%f32
+ fxor %%f0, %%f16, %%f16
+ fxor %%f2, %%f18, %%f18
+ fxor %%f4, %%f20, %%f20
+ fxor %%f6, %%f22, %%f22
+ fxor %%f8, %%f24, %%f24
+ fxor %%f10, %%f26, %%f26
+ fxor %%f12, %%f28, %%f28
+ fxor %%f14, %%f30, %%f30
+ stda %%f16, [%0] %4
+ ldda [%1 + 64] %%asi, %%f48
+ ldda [%0 + 128] %%asi, %%f0
+ fxor %%f32, %%f48, %%f48
+ fxor %%f34, %%f50, %%f50
+ add %0, 128, %0
+ fxor %%f36, %%f52, %%f52
+ add %1, 128, %1
+ fxor %%f38, %%f54, %%f54
+ subcc %2, 128, %2
+ fxor %%f40, %%f56, %%f56
+ fxor %%f42, %%f58, %%f58
+ fxor %%f44, %%f60, %%f60
+ fxor %%f46, %%f62, %%f62
+ stda %%f48, [%0 - 64] %%asi
+ bne,pt %%xcc, 1b
+ ldda [%1] %4, %%f16
+ ldda [%0 + 64] %%asi, %%f32
+ fxor %%f0, %%f16, %%f16
+ fxor %%f2, %%f18, %%f18
+ fxor %%f4, %%f20, %%f20
+ fxor %%f6, %%f22, %%f22
+ fxor %%f8, %%f24, %%f24
+ fxor %%f10, %%f26, %%f26
+ fxor %%f12, %%f28, %%f28
+ fxor %%f14, %%f30, %%f30
+ stda %%f16, [%0] %4
+ ldda [%1 + 64] %%asi, %%f48
+ membar #Sync
+ fxor %%f32, %%f48, %%f48
+ fxor %%f34, %%f50, %%f50
+ fxor %%f36, %%f52, %%f52
+ fxor %%f38, %%f54, %%f54
+ fxor %%f40, %%f56, %%f56
+ fxor %%f42, %%f58, %%f58
+ fxor %%f44, %%f60, %%f60
+ fxor %%f46, %%f62, %%f62
+ stda %%f48, [%0 + 64] %%asi
+ membar #Sync|#StoreStore|#StoreLoad
+ wr %%g0, 0, %%fprs
+ " : :
+ "r" (dest), "r" (src), "r" (len), "i" (FPRS_FEF), "i" (ASI_BLK_P) :
+ "cc", "memory");
+}
+
+#endif /* __ASM_MD_H */
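The VIS routine above documents hard preconditions: both pointers aligned to MD_XORBLOCK_ALIGNMENT (64 bytes), len a multiple of 128 and at least 256. A caller-side guard in plain C, paired with the byte-loop fallback given in the header comment (names and dispatch structure are our sketch, not the kernel's):

```c
#include <assert.h>
#include <stddef.h>

#define XOR_ALIGN 64

/* Byte-at-a-time reference loop, as given in the md.h comment. */
static void xor_bytes(char *dest, const char *src, long len)
{
    while (len--)
        *dest++ ^= *src++;
}

/* Return 1 if (dest, src, len) satisfy the documented VIS
 * requirements, 0 if the caller must use the fallback loop. */
static int xor_fast_ok(const char *dest, const char *src, long len)
{
    if (((size_t)dest | (size_t)src) & (XOR_ALIGN - 1))
        return 0;
    if ((len & 127) || len < 256)
        return 0;
    return 1;
}
```

OR-ing the two addresses before masking, as the header's requirement comment does, checks both alignments in one test.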
extern void make_request(int major,int rw, struct buffer_head * bh);
/* md needs this function to remap requests */
-extern int md_map (kdev_t dev, kdev_t *rdev,
- unsigned long *rsector, unsigned long size);
-extern int md_make_request (struct buffer_head * bh, int rw);
+extern int md_map (int minor, kdev_t *rdev, unsigned long *rsector, unsigned long size);
+extern int md_make_request (int minor, int rw, struct buffer_head * bh);
extern int md_error (kdev_t mddev, kdev_t rdev);
extern int * blk_size[MAX_BLKDEV];
/*
* IP_MASQ user space control interface
- * $Id: ip_masq.h,v 1.2.2.1 1999/08/13 18:23:03 davem Exp $
+ * $Id: ip_masq.h,v 1.2 1998/12/08 05:41:48 davem Exp $
*/
#ifndef _LINUX_IP_MASQ_H
#define IP_MASQ_MFW_SCHED 0x01
-/*
- * VS & schedulers stuff
- */
-struct ip_vs_user {
- /* create the virtual service and attach the scheduler to it */
- u_int16_t protocol;
- u_int32_t vaddr; /* virtual address */
- u_int16_t vport;
- /* ... timeouts and other stuff */
-
- /* scheduler specific options */
- u_int32_t daddr; /* real destination address */
- u_int16_t dport;
- unsigned masq_flags;
- unsigned sched_flags;
- unsigned weight;
- char data[0]; /* optional scheduler parameters */
-};
-
-
#define IP_FW_MASQCTL_MAX 256
#define IP_MASQ_TNAME_MAX 32
struct ip_autofw_user autofw_user;
struct ip_mfw_user mfw_user;
struct ip_masq_user user;
- struct ip_vs_user vs_user;
unsigned char m_raw[IP_FW_MASQCTL_MAX];
} u;
};
#define IP_MASQ_TARGET_CORE 1
#define IP_MASQ_TARGET_MOD 2 /* masq_mod is selected by "name" */
#define IP_MASQ_TARGET_USER 3
-#define IP_MASQ_TARGET_VS 4 /* sched_mod is selected by "name" */
-/* #define IP_MASQ_TARGET_VS_SCHED 5 */
-#define IP_MASQ_TARGET_LAST 5
-
+#define IP_MASQ_TARGET_LAST 4
#define IP_MASQ_CMD_NONE 0 /* just peek */
#define IP_MASQ_CMD_INSERT 1
#define IP_MASQ_CMD_LIST 7 /* actually fake: done via /proc */
#define IP_MASQ_CMD_ENABLE 8
#define IP_MASQ_CMD_DISABLE 9
-#define IP_MASQ_CMD_ADD_DEST 10 /* for adding dest in IPVS */
-#define IP_MASQ_CMD_DEL_DEST 11 /* for deleting dest in IPVS */
-#define IP_MASQ_CMD_SET_DEST 12 /* for setting dest in IPVS */
#endif /* _LINUX_IP_MASQ_H */
-
--- /dev/null
+/*
+ md.h : Multiple Devices driver for Linux
+ Copyright (C) 1994-96 Marc ZYNGIER
+ <zyngier@ufr-info-p7.ibp.fr> or
+ <maz@gloups.fdn.fr>
+
+ This program is free software; you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation; either version 2, or (at your option)
+ any later version.
+
+ You should have received a copy of the GNU General Public License
+ (for example /usr/src/linux/COPYING); if not, write to the Free
+ Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+*/
+
+#ifndef _MD_H
+#define _MD_H
+
+#include <linux/major.h>
+#include <linux/ioctl.h>
+#include <linux/types.h>
+
+/*
+ * Different major versions are not compatible.
+ * Different minor versions are only downward compatible.
+ * Different patchlevel versions are downward and upward compatible.
+ */
+#define MD_MAJOR_VERSION 0
+#define MD_MINOR_VERSION 36
+#define MD_PATCHLEVEL_VERSION 6
+
+#define MD_DEFAULT_DISK_READAHEAD (256 * 1024)
+
+/* ioctls */
+#define REGISTER_DEV _IO (MD_MAJOR, 1)
+#define START_MD _IO (MD_MAJOR, 2)
+#define STOP_MD _IO (MD_MAJOR, 3)
+#define REGISTER_DEV_NEW _IO (MD_MAJOR, 4)
+
+/*
+ personalities :
+ Byte 0 : Chunk size factor
+ Byte 1 : Fault tolerance count for each physical device
+ ( 0 means no fault tolerance,
+			  0xFF means always tolerate faults), not used for now.
+ Byte 2 : Personality
+ Byte 3 : Reserved.
+ */
+
+#define FAULT_SHIFT 8
+#define PERSONALITY_SHIFT 16
+
+#define FACTOR_MASK 0x000000FFUL
+#define FAULT_MASK 0x0000FF00UL
+#define PERSONALITY_MASK 0x00FF0000UL
+
+#define MD_RESERVED       0		/* Not used for now */
+#define LINEAR (1UL << PERSONALITY_SHIFT)
+#define STRIPED (2UL << PERSONALITY_SHIFT)
+#define RAID0 STRIPED
+#define RAID1 (3UL << PERSONALITY_SHIFT)
+#define RAID5 (4UL << PERSONALITY_SHIFT)
+#define MAX_PERSONALITY 5
+
+/*
+ * MD superblock.
+ *
+ * The MD superblock maintains some statistics on each MD configuration.
+ * Each real device in the MD set contains it near the end of the device.
+ * Some of the ideas are copied from the ext2fs implementation.
+ *
+ * We currently use 4096 bytes as follows:
+ *
+ * word offset function
+ *
+ * 0 - 31 Constant generic MD device information.
+ * 32 - 63 Generic state information.
+ * 64 - 127 Personality specific information.
+ * 128 - 511	12 32-word descriptors of the disks in the raid set.
+ * 512 - 911 Reserved.
+ * 912 - 1023 Disk specific descriptor.
+ */
+
+/*
+ * If x is the real device size in bytes, we return an apparent size of:
+ *
+ * y = (x & ~(MD_RESERVED_BYTES - 1)) - MD_RESERVED_BYTES
+ *
+ * and place the 4kB superblock at offset y.
+ */
+#define MD_RESERVED_BYTES (64 * 1024)
+#define MD_RESERVED_SECTORS (MD_RESERVED_BYTES / 512)
+#define MD_RESERVED_BLOCKS (MD_RESERVED_BYTES / BLOCK_SIZE)
+
+#define MD_NEW_SIZE_SECTORS(x) (((x) & ~(MD_RESERVED_SECTORS - 1)) - MD_RESERVED_SECTORS)
+#define MD_NEW_SIZE_BLOCKS(x) (((x) & ~(MD_RESERVED_BLOCKS - 1)) - MD_RESERVED_BLOCKS)
+
+#define MD_SB_BYTES 4096
+#define MD_SB_WORDS (MD_SB_BYTES / 4)
+#define MD_SB_BLOCKS (MD_SB_BYTES / BLOCK_SIZE)
+#define MD_SB_SECTORS (MD_SB_BYTES / 512)
+
+/*
+ * The following are counted in 32-bit words
+ */
+#define MD_SB_GENERIC_OFFSET 0
+#define MD_SB_PERSONALITY_OFFSET 64
+#define MD_SB_DISKS_OFFSET 128
+#define MD_SB_DESCRIPTOR_OFFSET 992
+
+#define MD_SB_GENERIC_CONSTANT_WORDS 32
+#define MD_SB_GENERIC_STATE_WORDS 32
+#define MD_SB_GENERIC_WORDS (MD_SB_GENERIC_CONSTANT_WORDS + MD_SB_GENERIC_STATE_WORDS)
+#define MD_SB_PERSONALITY_WORDS 64
+#define MD_SB_DISKS_WORDS 384
+#define MD_SB_DESCRIPTOR_WORDS 32
+#define MD_SB_RESERVED_WORDS (1024 - MD_SB_GENERIC_WORDS - MD_SB_PERSONALITY_WORDS - MD_SB_DISKS_WORDS - MD_SB_DESCRIPTOR_WORDS)
+#define MD_SB_EQUAL_WORDS (MD_SB_GENERIC_WORDS + MD_SB_PERSONALITY_WORDS + MD_SB_DISKS_WORDS)
+#define MD_SB_DISKS (MD_SB_DISKS_WORDS / MD_SB_DESCRIPTOR_WORDS)
+
+/*
+ * Device "operational" state bits
+ */
+#define MD_FAULTY_DEVICE 0 /* Device is faulty / operational */
+#define MD_ACTIVE_DEVICE	1	/* Device is part of the raid set / spare disk */
+#define MD_SYNC_DEVICE 2 /* Device is in sync with the raid set */
+
+typedef struct md_device_descriptor_s {
+ __u32 number; /* 0 Device number in the entire set */
+ __u32 major; /* 1 Device major number */
+ __u32 minor; /* 2 Device minor number */
+ __u32 raid_disk; /* 3 The role of the device in the raid set */
+ __u32 state; /* 4 Operational state */
+ __u32 reserved[MD_SB_DESCRIPTOR_WORDS - 5];
+} md_descriptor_t;
+
+#define MD_SB_MAGIC 0xa92b4efc
+
+/*
+ * Superblock state bits
+ */
+#define MD_SB_CLEAN 0
+#define MD_SB_ERRORS 1
+
+typedef struct md_superblock_s {
+
+ /*
+ * Constant generic information
+ */
+ __u32 md_magic; /* 0 MD identifier */
+ __u32 major_version; /* 1 major version to which the set conforms */
+ __u32 minor_version; /* 2 minor version to which the set conforms */
+ __u32 patch_version; /* 3 patchlevel version to which the set conforms */
+ __u32 gvalid_words; /* 4 Number of non-reserved words in this section */
+ __u32 set_magic; /* 5 Raid set identifier */
+ __u32 ctime; /* 6 Creation time */
+ __u32 level; /* 7 Raid personality (mirroring, raid5, ...) */
+ __u32 size; /* 8 Apparent size of each individual disk, in kB */
+ __u32 nr_disks; /* 9 Number of total disks in the raid set */
+ __u32 raid_disks; /* 10 Number of disks in a fully functional raid set */
+ __u32 gstate_creserved[MD_SB_GENERIC_CONSTANT_WORDS - 11];
+
+ /*
+ * Generic state information
+ */
+ __u32 utime; /* 0 Superblock update time */
+ __u32 state; /* 1 State bits (clean, ...) */
+ __u32 active_disks; /* 2 Number of currently active disks (some non-faulty disks might not be in sync) */
+ __u32 working_disks; /* 3 Number of working disks */
+ __u32 failed_disks; /* 4 Number of failed disks */
+ __u32 spare_disks; /* 5 Number of spare disks */
+ __u32 gstate_sreserved[MD_SB_GENERIC_STATE_WORDS - 6];
+
+ /*
+ * Personality information
+ */
+ __u32 parity_algorithm;
+ __u32 chunk_size;
+ __u32 pstate_reserved[MD_SB_PERSONALITY_WORDS - 2];
+
+ /*
+ * Disks information
+ */
+ md_descriptor_t disks[MD_SB_DISKS];
+
+ /*
+ * Reserved
+ */
+ __u32 reserved[MD_SB_RESERVED_WORDS];
+
+ /*
+ * Active descriptor
+ */
+ md_descriptor_t descriptor;
+} md_superblock_t;
+
+#ifdef __KERNEL__
+
+#include <linux/mm.h>
+#include <linux/fs.h>
+#include <linux/blkdev.h>
+#include <asm/semaphore.h>
+
+/*
+ * Kernel-based reconstruction is mostly working, but still requires
+ * some additional work.
+ */
+#define SUPPORT_RECONSTRUCTION 0
+
+#define MAX_REAL 8 /* Max number of physical dev per md dev */
+#define MAX_MD_DEV 4 /* Max number of md dev */
+
+#define FACTOR(a) ((a)->repartition & FACTOR_MASK)
+#define MAX_FAULT(a) (((a)->repartition & FAULT_MASK)>>8)
+#define PERSONALITY(a) ((a)->repartition & PERSONALITY_MASK)
+
+#define FACTOR_SHIFT(a) (PAGE_SHIFT + (a) - 10)
+
+struct real_dev
+{
+ kdev_t dev; /* Device number */
+ int size; /* Device size (in blocks) */
+ int offset; /* Real device offset (in blocks) in md dev
+ (only used in linear mode) */
+ struct inode *inode; /* Lock inode */
+ md_superblock_t *sb;
+ u32 sb_offset;
+};
+
+struct md_dev;
+
+#define SPARE_INACTIVE 0
+#define SPARE_WRITE 1
+#define SPARE_ACTIVE 2
+
+struct md_personality
+{
+ char *name;
+ int (*map)(struct md_dev *mddev, kdev_t *rdev,
+ unsigned long *rsector, unsigned long size);
+ int (*make_request)(struct md_dev *mddev, int rw, struct buffer_head * bh);
+ void (*end_request)(struct buffer_head * bh, int uptodate);
+ int (*run)(int minor, struct md_dev *mddev);
+ int (*stop)(int minor, struct md_dev *mddev);
+ int (*status)(char *page, int minor, struct md_dev *mddev);
+ int (*ioctl)(struct inode *inode, struct file *file,
+ unsigned int cmd, unsigned long arg);
+ int max_invalid_dev;
+ int (*error_handler)(struct md_dev *mddev, kdev_t dev);
+
+/*
+ * Some personalities (RAID-1, RAID-5) can get disks hot-added and
+ * hot-removed. Hot removal is different from failure. (failure marks
+ * a disk inactive, but the disk is still part of the array)
+ */
+ int (*hot_add_disk) (struct md_dev *mddev, kdev_t dev);
+ int (*hot_remove_disk) (struct md_dev *mddev, kdev_t dev);
+ int (*mark_spare) (struct md_dev *mddev, md_descriptor_t *descriptor, int state);
+};
+
+struct md_dev
+{
+ struct real_dev devices[MAX_REAL];
+ struct md_personality *pers;
+ md_superblock_t *sb;
+ int sb_dirty;
+ int repartition;
+ int busy;
+ int nb_dev;
+ void *private;
+};
+
+struct md_thread {
+ void (*run) (void *data);
+ void *data;
+ struct wait_queue *wqueue;
+ unsigned long flags;
+ struct semaphore *sem;
+ struct task_struct *tsk;
+};
+
+#define THREAD_WAKEUP 0
+
+extern struct md_dev md_dev[MAX_MD_DEV];
+extern int md_size[MAX_MD_DEV];
+extern int md_maxreadahead[MAX_MD_DEV];
+
+extern char *partition_name (kdev_t dev);
+
+extern int register_md_personality (int p_num, struct md_personality *p);
+extern int unregister_md_personality (int p_num);
+extern struct md_thread *md_register_thread (void (*run) (void *data), void *data);
+extern void md_unregister_thread (struct md_thread *thread);
+extern void md_wakeup_thread(struct md_thread *thread);
+extern int md_update_sb (int minor);
+extern int md_do_sync(struct md_dev *mddev);
+
+#endif /* __KERNEL__ */
+#endif /* _MD_H */
+++ /dev/null
-#ifndef _LVM_H
-#define _LVM_H
-
-#include <linux/raid/md.h>
-
-#if __alpha__
-#error fix cpu_addr on Alpha first
-#endif
-
-#include <linux/raid/hsm_p.h>
-
-#define index_pv(lv,index) ((lv)->vg->pv_array+(index)->data.phys_nr)
-#define index_dev(lv,index) index_pv((lv),(index))->dev
-#define index_block(lv,index) (index)->data.phys_block
-#define index_child(index) ((lv_lptr_t *)((index)->cpu_addr))
-
-#define ptr_to_cpuaddr(ptr) ((__u32) (ptr))
-
-
-typedef struct pv_bg_desc_s {
- unsigned int free_blocks;
- pv_block_group_t *bg;
-} pv_bg_desc_t;
-
-typedef struct pv_s pv_t;
-typedef struct vg_s vg_t;
-typedef struct lv_s lv_t;
-
-struct pv_s
-{
- int phys_nr;
- kdev_t dev;
- pv_sb_t *pv_sb;
- pv_bg_desc_t *bg_array;
-};
-
-struct lv_s
-{
- int log_id;
- vg_t *vg;
-
- unsigned int max_indices;
- unsigned int free_indices;
- lv_lptr_t root_index;
-
- kdev_t dev;
-};
-
-struct vg_s
-{
- int nr_pv;
- pv_t pv_array [MD_SB_DISKS];
-
- int nr_lv;
- lv_t lv_array [LVM_MAX_LVS_PER_VG];
-
- vg_sb_t *vg_sb;
- mddev_t *mddev;
-};
-
-#define kdev_to_lv(dev) ((lv_t *) mddev_map[MINOR(dev)].data)
-#define mddev_to_vg(mddev) ((vg_t *) mddev->private)
-
-#endif
-
+++ /dev/null
-#ifndef _LVM_P_H
-#define _LVM_P_H
-
-#define LVM_BLOCKSIZE 4096
-#define LVM_BLOCKSIZE_WORDS (LVM_BLOCKSIZE/4)
-#define PACKED __attribute__ ((packed))
-
-/*
- * Identifies a block in physical space
- */
-typedef struct phys_idx_s {
- __u16 phys_nr;
- __u32 phys_block;
-
-} PACKED phys_idx_t;
-
-/*
- * Identifies a block in logical space
- */
-typedef struct log_idx_s {
- __u16 log_id;
- __u32 log_index;
-
-} PACKED log_idx_t;
-
-/*
- * Describes one PV
- */
-#define LVM_PV_SB_MAGIC 0xf091ae9fU
-
-#define LVM_PV_SB_GENERIC_WORDS 32
-#define LVM_PV_SB_RESERVED_WORDS \
- (LVM_BLOCKSIZE_WORDS - LVM_PV_SB_GENERIC_WORDS)
-
-/*
- * On-disk PV identification data, on block 0 in any PV.
- */
-typedef struct pv_sb_s
-{
- __u32 pv_magic; /* 0 */
-
- __u32 pv_uuid0; /* 1 */
- __u32 pv_uuid1; /* 2 */
- __u32 pv_uuid2; /* 3 */
- __u32 pv_uuid3; /* 4 */
-
- __u32 pv_major; /* 5 */
- __u32 pv_minor; /* 6 */
- __u32 pv_patch; /* 7 */
-
- __u32 pv_ctime; /* 8 Creation time */
-
- __u32 pv_total_size; /* 9 size of this PV, in blocks */
- __u32 pv_first_free; /* 10 first free block */
- __u32 pv_first_used; /* 11 first used block */
- __u32 pv_blocks_left; /* 12 unallocated blocks */
- __u32 pv_bg_size; /* 13 size of a block group, in blocks */
- __u32 pv_block_size; /* 14 size of blocks, in bytes */
- __u32 pv_pptr_size; /* 15 size of block descriptor, in bytes */
- __u32 pv_block_groups; /* 16 number of block groups */
-
- __u32 __reserved1[LVM_PV_SB_GENERIC_WORDS - 17];
-
- /*
- * Reserved
- */
- __u32 __reserved2[LVM_PV_SB_RESERVED_WORDS];
-
-} PACKED pv_sb_t;
-
-/*
- * this is pretty much arbitrary, but has to be less than ~64
- */
-#define LVM_MAX_LVS_PER_VG 32
-
-#define LVM_VG_SB_GENERIC_WORDS 32
-
-#define LV_DESCRIPTOR_WORDS 8
-#define LVM_VG_SB_RESERVED_WORDS (LVM_BLOCKSIZE_WORDS - \
- LV_DESCRIPTOR_WORDS*LVM_MAX_LVS_PER_VG - LVM_VG_SB_GENERIC_WORDS)
-
-#if (LVM_PV_SB_RESERVED_WORDS < 0)
-#error you messed this one up dude ...
-#endif
-
-typedef struct lv_descriptor_s
-{
- __u32 lv_id; /* 0 */
- phys_idx_t lv_root_idx; /* 1 */
- __u16 __reserved; /* 2 */
- __u32 lv_max_indices; /* 3 */
- __u32 lv_free_indices; /* 4 */
- __u32 md_id; /* 5 */
-
- __u32 reserved[LV_DESCRIPTOR_WORDS - 6];
-
-} PACKED lv_descriptor_t;
-
-#define LVM_VG_SB_MAGIC 0x98320d7aU
-/*
- * On-disk VG identification data, in block 1 on all PVs
- */
-typedef struct vg_sb_s
-{
- __u32 vg_magic; /* 0 */
- __u32 nr_lvs; /* 1 */
-
- __u32 __reserved1[LVM_VG_SB_GENERIC_WORDS - 2];
-
- lv_descriptor_t lv_array [LVM_MAX_LVS_PER_VG];
- /*
- * Reserved
- */
- __u32 __reserved2[LVM_VG_SB_RESERVED_WORDS];
-
-} PACKED vg_sb_t;
-
-/*
- * Describes one LV
- */
-
-#define LVM_LV_SB_MAGIC 0xe182bd8aU
-
-/* do we need lv_sb_t? */
-
-typedef struct lv_sb_s
-{
- /*
- * On-disk LV identifier
- */
- __u32 lv_magic; /* 0 LV identifier */
- __u32 lv_uuid0; /* 1 */
- __u32 lv_uuid1; /* 2 */
- __u32 lv_uuid2; /* 3 */
- __u32 lv_uuid3; /* 4 */
-
- __u32 lv_major; /* 5 PV identifier */
- __u32 lv_minor; /* 6 PV identifier */
- __u32 lv_patch; /* 7 PV identifier */
-
- __u32 ctime; /* 8 Creation time */
- __u32 size; /* 9 size of this LV, in blocks */
- phys_idx_t start; /* 10 position of root index block */
- log_idx_t first_free; /* 11-12 first free index */
-
- /*
- * Reserved
- */
- __u32 reserved[LVM_BLOCKSIZE_WORDS-13];
-
-} PACKED lv_sb_t;
-
-/*
- * Pointer pointing from the physical space, points to
- * the LV owning this block. It also contains various
- * statistics about the physical block.
- */
-typedef struct pv_pptr_s
-{
- union {
- /* case 1 */
- struct {
- log_idx_t owner;
- log_idx_t predicted;
- __u32 last_referenced;
- } used;
- /* case 2 */
- struct {
- __u16 log_id;
- __u16 __unused1;
- __u32 next_free;
- __u32 __unused2;
- __u32 __unused3;
- } free;
- } u;
-} PACKED pv_pptr_t;
-
-static __inline__ int pv_pptr_free (const pv_pptr_t * pptr)
-{
- return !pptr->u.free.log_id;
-}
-
-
-#define DATA_BLOCKS_PER_BG ((LVM_BLOCKSIZE*8)/(8*sizeof(pv_pptr_t)+1))
-
-#define TOTAL_BLOCKS_PER_BG (DATA_BLOCKS_PER_BG+1)
-/*
- * A table of pointers filling up a single block, managing
- * the next DATA_BLOCKS_PER_BG physical blocks. Such block
- * groups form the physical space of blocks.
- */
-typedef struct pv_block_group_s
-{
- __u8 used_bitmap[(DATA_BLOCKS_PER_BG+7)/8];
-
- pv_pptr_t blocks[DATA_BLOCKS_PER_BG];
-
-} PACKED pv_block_group_t;
-
-/*
- * Pointer from the logical space, points to
- * the (PV,block) containing this logical block
- */
-typedef struct lv_lptr_s
-{
- phys_idx_t data;
- __u16 __reserved;
- __u32 cpu_addr;
- __u32 __reserved2;
-
-} PACKED lv_lptr_t;
-
-static __inline__ int index_free (const lv_lptr_t * index)
-{
- return !index->data.phys_block;
-}
-
-static __inline__ int index_present (const lv_lptr_t * index)
-{
- return index->cpu_addr;
-}
-
-
-#define LVM_LPTRS_PER_BLOCK (LVM_BLOCKSIZE/sizeof(lv_lptr_t))
-/*
- * A table of pointers filling up a single block, managing
- * LVM_LPTRS_PER_BLOCK logical blocks. Such block groups form
- * the logical space of blocks.
- */
-typedef struct lv_index_block_s
-{
- lv_lptr_t blocks[LVM_LPTRS_PER_BLOCK];
-
-} PACKED lv_index_block_t;
-
-#endif
-
+++ /dev/null
-#ifndef _LINEAR_H
-#define _LINEAR_H
-
-#include <linux/raid/md.h>
-
-struct dev_info {
- kdev_t dev;
- int size;
- unsigned int offset;
-};
-
-typedef struct dev_info dev_info_t;
-
-struct linear_hash
-{
- dev_info_t *dev0, *dev1;
-};
-
-struct linear_private_data
-{
- struct linear_hash *hash_table;
- dev_info_t disks[MD_SB_DISKS];
- dev_info_t *smallest;
- int nr_zones;
-};
-
-
-typedef struct linear_private_data linear_conf_t;
-
-#define mddev_to_conf(mddev) ((linear_conf_t *) mddev->private)
-
-#endif
+++ /dev/null
-/*
- md.h : Multiple Devices driver for Linux
- Copyright (C) 1996-98 Ingo Molnar, Gadi Oxman
- Copyright (C) 1994-96 Marc ZYNGIER
- <zyngier@ufr-info-p7.ibp.fr> or
- <maz@gloups.fdn.fr>
-
- This program is free software; you can redistribute it and/or modify
- it under the terms of the GNU General Public License as published by
- the Free Software Foundation; either version 2, or (at your option)
- any later version.
-
- You should have received a copy of the GNU General Public License
- (for example /usr/src/linux/COPYING); if not, write to the Free
- Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
-*/
-
-#ifndef _MD_H
-#define _MD_H
-
-#include <linux/config.h>
-#include <linux/mm.h>
-#include <linux/fs.h>
-#include <linux/blkdev.h>
-#include <asm/semaphore.h>
-#include <linux/major.h>
-#include <linux/ioctl.h>
-#include <linux/types.h>
-#include <asm/bitops.h>
-#include <linux/module.h>
-#include <linux/hdreg.h>
-#include <linux/sysctl.h>
-#include <linux/proc_fs.h>
-#include <linux/smp_lock.h>
-#include <linux/delay.h>
-#include <net/checksum.h>
-#include <linux/random.h>
-#include <linux/locks.h>
-#include <asm/io.h>
-
-#include <linux/raid/md_compatible.h>
-/*
- * 'md_p.h' holds the 'physical' layout of RAID devices
- * 'md_u.h' holds the user <=> kernel API
- *
- * 'md_k.h' holds kernel internal definitions
- */
-
-#include <linux/raid/md_p.h>
-#include <linux/raid/md_u.h>
-#include <linux/raid/md_k.h>
-
-/*
- * Different major versions are not compatible.
- * Different minor versions are only downward compatible.
- * Different patchlevel versions are downward and upward compatible.
- */
-#define MD_MAJOR_VERSION 0
-#define MD_MINOR_VERSION 90
-#define MD_PATCHLEVEL_VERSION 0
-
-extern int md_size[MAX_MD_DEVS];
-extern struct hd_struct md_hd_struct[MAX_MD_DEVS];
-
-extern void add_mddev_mapping (mddev_t *mddev, kdev_t dev, void *data);
-extern void del_mddev_mapping (mddev_t *mddev, kdev_t dev);
-extern char * partition_name (kdev_t dev);
-extern int register_md_personality (int p_num, mdk_personality_t *p);
-extern int unregister_md_personality (int p_num);
-extern mdk_thread_t * md_register_thread (void (*run) (void *data),
- void *data, const char *name);
-extern void md_unregister_thread (mdk_thread_t *thread);
-extern void md_wakeup_thread(mdk_thread_t *thread);
-extern void md_interrupt_thread (mdk_thread_t *thread);
-extern int md_update_sb (mddev_t *mddev);
-extern int md_do_sync(mddev_t *mddev, mdp_disk_t *spare);
-extern void md_recover_arrays (void);
-extern int md_check_ordering (mddev_t *mddev);
-extern void autodetect_raid(void);
-extern struct gendisk * find_gendisk (kdev_t dev);
-extern int md_notify_reboot(struct notifier_block *this,
- unsigned long code, void *x);
-#if CONFIG_BLK_DEV_MD
-extern void raid_setup(char *str,int *ints) md__init;
-#endif
-#ifdef CONFIG_MD_BOOT
-extern void md_setup(char *str,int *ints) md__init;
-#endif
-
-extern void md_print_devices (void);
-
-#define MD_BUG(x...) { printk("md: bug in file %s, line %d\n", __FILE__, __LINE__); md_print_devices(); }
-
-#endif _MD_H
-
+++ /dev/null
-
-/*
- md.h : Multiple Devices driver compatibility layer for Linux 2.0/2.2
- Copyright (C) 1998 Ingo Molnar
-
- This program is free software; you can redistribute it and/or modify
- it under the terms of the GNU General Public License as published by
- the Free Software Foundation; either version 2, or (at your option)
- any later version.
-
- You should have received a copy of the GNU General Public License
- (for example /usr/src/linux/COPYING); if not, write to the Free
- Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
-*/
-
-#include <linux/version.h>
-
-#ifndef _MD_COMPATIBLE_H
-#define _MD_COMPATIBLE_H
-
-#define LinuxVersionCode(v, p, s) (((v)<<16)+((p)<<8)+(s))
-
-#if LINUX_VERSION_CODE < LinuxVersionCode(2,1,0)
-
-/* 000 */
-#define md__get_free_pages(x,y) __get_free_pages(x,y,GFP_KERNEL)
-
-#ifdef __i386__
-/* 001 */
-extern __inline__ int md_cpu_has_mmx(void)
-{
- return x86_capability & 0x00800000;
-}
-#endif
-
-/* 002 */
-#define md_clear_page(page) memset((void *)(page), 0, PAGE_SIZE)
-
-/* 003 */
-/*
- * someone please suggest a sane compatibility layer for modules
- */
-#define MD_EXPORT_SYMBOL(x)
-
-/* 004 */
-static inline unsigned long
-md_copy_from_user(void *to, const void *from, unsigned long n)
-{
- int err;
-
- err = verify_area(VERIFY_READ,from,n);
- if (!err)
- memcpy_fromfs(to, from, n);
- return err;
-}
-
-/* 005 */
-extern inline unsigned long
-md_copy_to_user(void *to, const void *from, unsigned long n)
-{
- int err;
-
- err = verify_area(VERIFY_WRITE,to,n);
- if (!err)
- memcpy_tofs(to, from, n);
- return err;
-}
-
-/* 006 */
-#define md_put_user(x,ptr) \
-({ \
- int __err; \
- \
- __err = verify_area(VERIFY_WRITE,ptr,sizeof(*ptr)); \
- if (!__err) \
- put_user(x,ptr); \
- __err; \
-})
-
-/* 007 */
-extern inline int md_capable_admin(void)
-{
- return suser();
-}
-
-/* 008 */
-#define MD_FILE_TO_INODE(file) ((file)->f_inode)
-
-/* 009 */
-extern inline void md_flush_signals (void)
-{
- current->signal = 0;
-}
-
-/* 010 */
-#define __S(nr) (1<<((nr)-1))
-extern inline void md_init_signals (void)
-{
- current->exit_signal = SIGCHLD;
- current->blocked = ~(__S(SIGKILL));
-}
-#undef __S
-
-/* 011 */
-extern inline unsigned long md_signal_pending (struct task_struct * tsk)
-{
- return (tsk->signal & ~tsk->blocked);
-}
-
-/* 012 */
-#define md_set_global_readahead(x) read_ahead[MD_MAJOR] = MD_READAHEAD
-
-/* 013 */
-#define md_mdelay(n) (\
- {unsigned long msec=(n); while (msec--) udelay(1000);})
-
-/* 014 */
-#define MD_SYS_DOWN 0
-#define MD_SYS_HALT 0
-#define MD_SYS_POWER_OFF 0
-
-/* 015 */
-#define md_register_reboot_notifier(x)
-
-/* 016 */
-extern __inline__ unsigned long
-md_test_and_set_bit(int nr, void * addr)
-{
- unsigned long flags;
- unsigned long oldbit;
-
- save_flags(flags);
- cli();
- oldbit = test_bit(nr,addr);
- set_bit(nr,addr);
- restore_flags(flags);
- return oldbit;
-}
-
-/* 017 */
-extern __inline__ unsigned long
-md_test_and_clear_bit(int nr, void * addr)
-{
- unsigned long flags;
- unsigned long oldbit;
-
- save_flags(flags);
- cli();
- oldbit = test_bit(nr,addr);
- clear_bit(nr,addr);
- restore_flags(flags);
- return oldbit;
-}
-
-/* 018 */
-#define md_atomic_read(x) (*(volatile int *)(x))
-#define md_atomic_set(x,y) (*(volatile int *)(x) = (y))
-
-/* 019 */
-extern __inline__ void md_lock_kernel (void)
-{
-#if __SMP__
- lock_kernel();
- syscall_count++;
-#endif
-}
-
-extern __inline__ void md_unlock_kernel (void)
-{
-#if __SMP__
- syscall_count--;
- unlock_kernel();
-#endif
-}
-/* 020 */
-
-#define md__init
-#define md__initdata
-#define md__initfunc(__arginit) __arginit
-
-/* 021 */
-
-/* 022 */
-
-struct md_list_head {
- struct md_list_head *next, *prev;
-};
-
-#define MD_LIST_HEAD(name) \
- struct md_list_head name = { &name, &name }
-
-#define MD_INIT_LIST_HEAD(ptr) do { \
- (ptr)->next = (ptr); (ptr)->prev = (ptr); \
-} while (0)
-
-static __inline__ void md__list_add(struct md_list_head * new,
- struct md_list_head * prev,
- struct md_list_head * next)
-{
- next->prev = new;
- new->next = next;
- new->prev = prev;
- prev->next = new;
-}
-
-static __inline__ void md_list_add(struct md_list_head *new,
- struct md_list_head *head)
-{
- md__list_add(new, head, head->next);
-}
-
-static __inline__ void md__list_del(struct md_list_head * prev,
- struct md_list_head * next)
-{
- next->prev = prev;
- prev->next = next;
-}
-
-static __inline__ void md_list_del(struct md_list_head *entry)
-{
- md__list_del(entry->prev, entry->next);
-}
-
-static __inline__ int md_list_empty(struct md_list_head *head)
-{
- return head->next == head;
-}
-
-#define md_list_entry(ptr, type, member) \
- ((type *)((char *)(ptr)-(unsigned long)(&((type *)0)->member)))
-
-/* 023 */
-
-static __inline__ signed long md_schedule_timeout(signed long timeout)
-{
- current->timeout = jiffies + timeout;
- schedule();
- return 0;
-}
-
-/* 024 */
-#define md_need_resched(tsk) (need_resched)
-
-/* 025 */
-typedef struct { int gcc_is_buggy; } md_spinlock_t;
-#define MD_SPIN_LOCK_UNLOCKED (md_spinlock_t) { 0 }
-
-#define md_spin_lock_irq cli
-#define md_spin_unlock_irq sti
-#define md_spin_unlock_irqrestore(x,flags) restore_flags(flags)
-#define md_spin_lock_irqsave(x,flags) do { save_flags(flags); cli(); } while (0)
-
-/* END */
-
-#else
-
-#include <linux/reboot.h>
-#include <linux/vmalloc.h>
-
-/* 000 */
-#define md__get_free_pages(x,y) __get_free_pages(x,y)
-
-#ifdef __i386__
-/* 001 */
-extern __inline__ int md_cpu_has_mmx(void)
-{
- return boot_cpu_data.x86_capability & X86_FEATURE_MMX;
-}
-#endif
-
-/* 002 */
-#define md_clear_page(page) clear_page(page)
-
-/* 003 */
-#define MD_EXPORT_SYMBOL(x) EXPORT_SYMBOL(x)
-
-/* 004 */
-#define md_copy_to_user(x,y,z) copy_to_user(x,y,z)
-
-/* 005 */
-#define md_copy_from_user(x,y,z) copy_from_user(x,y,z)
-
-/* 006 */
-#define md_put_user put_user
-
-/* 007 */
-extern inline int md_capable_admin(void)
-{
- return capable(CAP_SYS_ADMIN);
-}
-
-/* 008 */
-#define MD_FILE_TO_INODE(file) ((file)->f_dentry->d_inode)
-
-/* 009 */
-extern inline void md_flush_signals (void)
-{
- spin_lock(&current->sigmask_lock);
- flush_signals(current);
- spin_unlock(&current->sigmask_lock);
-}
-
-/* 010 */
-extern inline void md_init_signals (void)
-{
- current->exit_signal = SIGCHLD;
- siginitsetinv(&current->blocked, sigmask(SIGKILL));
-}
-
-/* 011 */
-#define md_signal_pending signal_pending
-
-/* 012 */
-extern inline void md_set_global_readahead(int * table)
-{
- max_readahead[MD_MAJOR] = table;
-}
-
-/* 013 */
-#define md_mdelay(x) mdelay(x)
-
-/* 014 */
-#define MD_SYS_DOWN SYS_DOWN
-#define MD_SYS_HALT SYS_HALT
-#define MD_SYS_POWER_OFF SYS_POWER_OFF
-
-/* 015 */
-#define md_register_reboot_notifier register_reboot_notifier
-
-/* 016 */
-#define md_test_and_set_bit test_and_set_bit
-
-/* 017 */
-#define md_test_and_clear_bit test_and_clear_bit
-
-/* 018 */
-#define md_atomic_read atomic_read
-#define md_atomic_set atomic_set
-
-/* 019 */
-#define md_lock_kernel lock_kernel
-#define md_unlock_kernel unlock_kernel
-
-/* 020 */
-
-#include <linux/init.h>
-
-#define md__init __init
-#define md__initdata __initdata
-#define md__initfunc(__arginit) __initfunc(__arginit)
-
-/* 021 */
-
-
-/* 022 */
-
-#define md_list_head list_head
-#define MD_LIST_HEAD(name) LIST_HEAD(name)
-#define MD_INIT_LIST_HEAD(ptr) INIT_LIST_HEAD(ptr)
-#define md_list_add list_add
-#define md_list_del list_del
-#define md_list_empty list_empty
-
-#define md_list_entry(ptr, type, member) list_entry(ptr, type, member)
-
-/* 023 */
-
-#define md_schedule_timeout schedule_timeout
-
-/* 024 */
-#define md_need_resched(tsk) ((tsk)->need_resched)
-
-/* 025 */
-#define md_spinlock_t spinlock_t
-#define MD_SPIN_LOCK_UNLOCKED SPIN_LOCK_UNLOCKED
-
-#define md_spin_lock_irq spin_lock_irq
-#define md_spin_unlock_irq spin_unlock_irq
-#define md_spin_unlock_irqrestore spin_unlock_irqrestore
-#define md_spin_lock_irqsave spin_lock_irqsave
-
-/* END */
-
-#endif
-
-#endif _MD_COMPATIBLE_H
-
+++ /dev/null
-/*
- md_k.h : kernel internal structure of the Linux MD driver
- Copyright (C) 1996-98 Ingo Molnar, Gadi Oxman
-
- This program is free software; you can redistribute it and/or modify
- it under the terms of the GNU General Public License as published by
- the Free Software Foundation; either version 2, or (at your option)
- any later version.
-
- You should have received a copy of the GNU General Public License
- (for example /usr/src/linux/COPYING); if not, write to the Free
- Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
-*/
-
-#ifndef _MD_K_H
-#define _MD_K_H
-
-#define MD_RESERVED 0UL
-#define LINEAR 1UL
-#define STRIPED 2UL
-#define RAID0 STRIPED
-#define RAID1 3UL
-#define RAID5 4UL
-#define TRANSLUCENT 5UL
-#define LVM 6UL
-#define MAX_PERSONALITY 7UL
-
-extern inline int pers_to_level (int pers)
-{
- switch (pers) {
- case LVM: return -3;
- case TRANSLUCENT: return -2;
- case LINEAR: return -1;
- case RAID0: return 0;
- case RAID1: return 1;
- case RAID5: return 5;
- }
- panic("pers_to_level()");
-}
-
-extern inline int level_to_pers (int level)
-{
- switch (level) {
- case -3: return LVM;
- case -2: return TRANSLUCENT;
- case -1: return LINEAR;
- case 0: return RAID0;
- case 1: return RAID1;
- case 4:
- case 5: return RAID5;
- }
- return MD_RESERVED;
-}
-
-typedef struct mddev_s mddev_t;
-typedef struct mdk_rdev_s mdk_rdev_t;
-
-#if (MINORBITS != 8)
-#error MD doesnt handle bigger kdev yet
-#endif
-
-#define MAX_REAL 12 /* Max number of disks per md dev */
-#define MAX_MD_DEVS (1<<MINORBITS) /* Max number of md dev */
-
-/*
- * Maps a kdev to an mddev/subdev. How 'data' is handled is up to
- * the personality. (eg. LVM uses this to identify individual LVs)
- */
-typedef struct dev_mapping_s {
- mddev_t *mddev;
- void *data;
-} dev_mapping_t;
-
-extern dev_mapping_t mddev_map [MAX_MD_DEVS];
-
-extern inline mddev_t * kdev_to_mddev (kdev_t dev)
-{
- return mddev_map[MINOR(dev)].mddev;
-}
-
-/*
- * options passed in raidrun:
- */
-
-#define MAX_CHUNK_SIZE (4096*1024)
-
-/*
- * default readahead
- */
-#define MD_READAHEAD (256 * 512)
-
-extern inline int disk_faulty(mdp_disk_t * d)
-{
- return d->state & (1 << MD_DISK_FAULTY);
-}
-
-extern inline int disk_active(mdp_disk_t * d)
-{
- return d->state & (1 << MD_DISK_ACTIVE);
-}
-
-extern inline int disk_sync(mdp_disk_t * d)
-{
- return d->state & (1 << MD_DISK_SYNC);
-}
-
-extern inline int disk_spare(mdp_disk_t * d)
-{
- return !disk_sync(d) && !disk_active(d) && !disk_faulty(d);
-}
-
-extern inline int disk_removed(mdp_disk_t * d)
-{
- return d->state & (1 << MD_DISK_REMOVED);
-}
-
-extern inline void mark_disk_faulty(mdp_disk_t * d)
-{
- d->state |= (1 << MD_DISK_FAULTY);
-}
-
-extern inline void mark_disk_active(mdp_disk_t * d)
-{
- d->state |= (1 << MD_DISK_ACTIVE);
-}
-
-extern inline void mark_disk_sync(mdp_disk_t * d)
-{
- d->state |= (1 << MD_DISK_SYNC);
-}
-
-extern inline void mark_disk_spare(mdp_disk_t * d)
-{
- d->state = 0;
-}
-
-extern inline void mark_disk_removed(mdp_disk_t * d)
-{
- d->state = (1 << MD_DISK_FAULTY) | (1 << MD_DISK_REMOVED);
-}
-
-extern inline void mark_disk_inactive(mdp_disk_t * d)
-{
- d->state &= ~(1 << MD_DISK_ACTIVE);
-}
-
-extern inline void mark_disk_nonsync(mdp_disk_t * d)
-{
- d->state &= ~(1 << MD_DISK_SYNC);
-}
-
-/*
- * MD's 'extended' device
- */
-struct mdk_rdev_s
-{
- struct md_list_head same_set; /* RAID devices within the same set */
- struct md_list_head all; /* all RAID devices */
- struct md_list_head pending; /* undetected RAID devices */
-
- kdev_t dev; /* Device number */
- kdev_t old_dev; /* "" when it was last imported */
- int size; /* Device size (in blocks) */
- mddev_t *mddev; /* RAID array if running */
- unsigned long last_events; /* IO event timestamp */
-
- struct inode *inode; /* Lock inode */
- struct file filp; /* Lock file */
-
- mdp_super_t *sb;
- int sb_offset;
-
- int faulty; /* if faulty do not issue IO requests */
- int desc_nr; /* descriptor index in the superblock */
-};
-
-
-/*
- * disk operations in a working array:
- */
-#define DISKOP_SPARE_INACTIVE 0
-#define DISKOP_SPARE_WRITE 1
-#define DISKOP_SPARE_ACTIVE 2
-#define DISKOP_HOT_REMOVE_DISK 3
-#define DISKOP_HOT_ADD_DISK 4
-
-typedef struct mdk_personality_s mdk_personality_t;
-
-struct mddev_s
-{
- void *private;
- mdk_personality_t *pers;
- int __minor;
- mdp_super_t *sb;
- int nb_dev;
- struct md_list_head disks;
- int sb_dirty;
- mdu_param_t param;
- int ro;
- unsigned int curr_resync;
- unsigned long resync_start;
- char *name;
- int recovery_running;
- struct semaphore reconfig_sem;
- struct semaphore recovery_sem;
- struct semaphore resync_sem;
- struct md_list_head all_mddevs;
-};
-
-struct mdk_personality_s
-{
- char *name;
- int (*map)(mddev_t *mddev, kdev_t dev, kdev_t *rdev,
- unsigned long *rsector, unsigned long size);
- int (*make_request)(mddev_t *mddev, int rw, struct buffer_head * bh);
- void (*end_request)(struct buffer_head * bh, int uptodate);
- int (*run)(mddev_t *mddev);
- int (*stop)(mddev_t *mddev);
- int (*status)(char *page, mddev_t *mddev);
- int (*ioctl)(struct inode *inode, struct file *file,
- unsigned int cmd, unsigned long arg);
- int max_invalid_dev;
- int (*error_handler)(mddev_t *mddev, kdev_t dev);
-
-/*
- * Some personalities (RAID-1, RAID-5) can have disks hot-added and
- * hot-removed. Hot removal is different from failure. (failure marks
- * a disk inactive, but the disk is still part of the array) The interface
- * to such operations is the 'pers->diskop()' function, which can be NULL.
- *
- * the diskop function can change the pointer pointing to the incoming
- * descriptor, but must do so very carefully. (currently only
- * SPARE_ACTIVE expects such a change)
- */
- int (*diskop) (mddev_t *mddev, mdp_disk_t **descriptor, int state);
-
- int (*stop_resync)(mddev_t *mddev);
- int (*restart_resync)(mddev_t *mddev);
-};
-
-
-/*
- * Currently we index md_array directly, based on the minor
- * number. This will have to change to dynamic allocation
- * once we start supporting partitioning of md devices.
- */
-extern inline int mdidx (mddev_t * mddev)
-{
- return mddev->__minor;
-}
-
-extern inline kdev_t mddev_to_kdev(mddev_t * mddev)
-{
- return MKDEV(MD_MAJOR, mdidx(mddev));
-}
-
-extern mdk_rdev_t * find_rdev(mddev_t * mddev, kdev_t dev);
-extern mdk_rdev_t * find_rdev_nr(mddev_t *mddev, int nr);
-
-/*
- * iterates through some rdev ringlist. It's safe to remove the
- * current 'rdev'. Don't touch 'tmp' though.
- */
-#define ITERATE_RDEV_GENERIC(head,field,rdev,tmp) \
- \
- for (tmp = head.next; \
- rdev = md_list_entry(tmp, mdk_rdev_t, field), \
- tmp = tmp->next, tmp->prev != &head \
- ; )
-/*
- * iterates through the 'same array disks' ringlist
- */
-#define ITERATE_RDEV(mddev,rdev,tmp) \
- ITERATE_RDEV_GENERIC((mddev)->disks,same_set,rdev,tmp)
-
-/*
- * Same as above, but assumes that the device has rdev->desc_nr numbered
- * from 0 to mddev->nb_dev, and iterates through rdevs in ascending order.
- */
-#define ITERATE_RDEV_ORDERED(mddev,rdev,i) \
- for (i = 0; rdev = find_rdev_nr(mddev, i), i < mddev->nb_dev; i++)
-
-
-/*
- * Iterates through all 'RAID managed disks'
- */
-#define ITERATE_RDEV_ALL(rdev,tmp) \
- ITERATE_RDEV_GENERIC(all_raid_disks,all,rdev,tmp)
-
-/*
- * Iterates through 'pending RAID disks'
- */
-#define ITERATE_RDEV_PENDING(rdev,tmp) \
- ITERATE_RDEV_GENERIC(pending_raid_disks,pending,rdev,tmp)
-
-/*
- * iterates through all used mddevs in the system.
- */
-#define ITERATE_MDDEV(mddev,tmp) \
- \
- for (tmp = all_mddevs.next; \
- mddev = md_list_entry(tmp, mddev_t, all_mddevs), \
- tmp = tmp->next, tmp->prev != &all_mddevs \
- ; )
-
-extern inline int lock_mddev (mddev_t * mddev)
-{
- return down_interruptible(&mddev->reconfig_sem);
-}
-
-extern inline void unlock_mddev (mddev_t * mddev)
-{
- up(&mddev->reconfig_sem);
-}
-
-#define xchg_values(x,y) do { __typeof__(x) __tmp = x; \
- x = y; y = __tmp; } while (0)
-
-typedef struct mdk_thread_s {
- void (*run) (void *data);
- void *data;
- struct wait_queue *wqueue;
- unsigned long flags;
- struct semaphore *sem;
- struct task_struct *tsk;
- const char *name;
-} mdk_thread_t;
-
-#define THREAD_WAKEUP 0
-
-typedef struct dev_name_s {
- struct md_list_head list;
- kdev_t dev;
- char name [MAX_DISKNAME_LEN];
-} dev_name_t;
-
-#endif /* _MD_K_H */
-
+++ /dev/null
-/*
- md_p.h : physical layout of Linux RAID devices
- Copyright (C) 1996-98 Ingo Molnar, Gadi Oxman
-
- This program is free software; you can redistribute it and/or modify
- it under the terms of the GNU General Public License as published by
- the Free Software Foundation; either version 2, or (at your option)
- any later version.
-
- You should have received a copy of the GNU General Public License
- (for example /usr/src/linux/COPYING); if not, write to the Free
- Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
-*/
-
-#ifndef _MD_P_H
-#define _MD_P_H
-
-/*
- * RAID superblock.
- *
- * The RAID superblock maintains some statistics on each RAID configuration.
- * Each real device in the RAID set contains it near the end of the device.
- * Some of the ideas are copied from the ext2fs implementation.
- *
- * We currently use 4096 bytes as follows:
- *
- * word offset function
- *
- * 0 - 31 Constant generic RAID device information.
- * 32 - 63 Generic state information.
- * 64 - 127 Personality specific information.
- * 128 - 511 12 32-word descriptors of the disks in the raid set.
- * 512 - 991 Reserved.
- * 992 - 1023 Disk specific descriptor.
- */
-
-/*
- * If x is the real device size in bytes, we return an apparent size of:
- *
- * y = (x & ~(MD_RESERVED_BYTES - 1)) - MD_RESERVED_BYTES
- *
- * and place the 4kB superblock at offset y.
- */
-#define MD_RESERVED_BYTES (64 * 1024)
-#define MD_RESERVED_SECTORS (MD_RESERVED_BYTES / 512)
-#define MD_RESERVED_BLOCKS (MD_RESERVED_BYTES / BLOCK_SIZE)
-
-#define MD_NEW_SIZE_SECTORS(x) ((x & ~(MD_RESERVED_SECTORS - 1)) - MD_RESERVED_SECTORS)
-#define MD_NEW_SIZE_BLOCKS(x) ((x & ~(MD_RESERVED_BLOCKS - 1)) - MD_RESERVED_BLOCKS)
-
-#define MD_SB_BYTES 4096
-#define MD_SB_WORDS (MD_SB_BYTES / 4)
-#define MD_SB_BLOCKS (MD_SB_BYTES / BLOCK_SIZE)
-#define MD_SB_SECTORS (MD_SB_BYTES / 512)
-
-/*
- * The following are counted in 32-bit words
- */
-#define MD_SB_GENERIC_OFFSET 0
-#define MD_SB_PERSONALITY_OFFSET 64
-#define MD_SB_DISKS_OFFSET 128
-#define MD_SB_DESCRIPTOR_OFFSET 992
-
-#define MD_SB_GENERIC_CONSTANT_WORDS 32
-#define MD_SB_GENERIC_STATE_WORDS 32
-#define MD_SB_GENERIC_WORDS (MD_SB_GENERIC_CONSTANT_WORDS + MD_SB_GENERIC_STATE_WORDS)
-#define MD_SB_PERSONALITY_WORDS 64
-#define MD_SB_DISKS_WORDS 384
-#define MD_SB_DESCRIPTOR_WORDS 32
-#define MD_SB_RESERVED_WORDS (1024 - MD_SB_GENERIC_WORDS - MD_SB_PERSONALITY_WORDS - MD_SB_DISKS_WORDS - MD_SB_DESCRIPTOR_WORDS)
-#define MD_SB_EQUAL_WORDS (MD_SB_GENERIC_WORDS + MD_SB_PERSONALITY_WORDS + MD_SB_DISKS_WORDS)
-#define MD_SB_DISKS (MD_SB_DISKS_WORDS / MD_SB_DESCRIPTOR_WORDS)
-
-/*
- * Device "operational" state bits
- */
-#define MD_DISK_FAULTY 0 /* disk is faulty / operational */
-#define MD_DISK_ACTIVE 1 /* disk is running or spare disk */
-#define MD_DISK_SYNC 2 /* disk is in sync with the raid set */
-#define MD_DISK_REMOVED 3 /* disk has been removed from the raid set */
-
-typedef struct mdp_device_descriptor_s {
- __u32 number; /* 0 Device number in the entire set */
- __u32 major; /* 1 Device major number */
- __u32 minor; /* 2 Device minor number */
- __u32 raid_disk; /* 3 The role of the device in the raid set */
- __u32 state; /* 4 Operational state */
- __u32 reserved[MD_SB_DESCRIPTOR_WORDS - 5];
-} mdp_disk_t;
-
-#define MD_SB_MAGIC 0xa92b4efc
-
-/*
- * Superblock state bits
- */
-#define MD_SB_CLEAN 0
-#define MD_SB_ERRORS 1
-
-typedef struct mdp_superblock_s {
- /*
- * Constant generic information
- */
- __u32 md_magic; /* 0 MD identifier */
- __u32 major_version; /* 1 major version to which the set conforms */
- __u32 minor_version; /* 2 minor version ... */
- __u32 patch_version; /* 3 patchlevel version ... */
- __u32 gvalid_words; /* 4 Number of used words in this section */
- __u32 set_uuid0; /* 5 Raid set identifier */
- __u32 ctime; /* 6 Creation time */
- __u32 level; /* 7 Raid personality */
- __u32 size; /* 8 Apparent size of each individual disk */
- __u32 nr_disks; /* 9 total disks in the raid set */
- __u32 raid_disks; /* 10 disks in a fully functional raid set */
- __u32 md_minor; /* 11 preferred MD minor device number */
- __u32 not_persistent; /* 12 does it have a persistent superblock */
- __u32 set_uuid1; /* 13 Raid set identifier #2 */
- __u32 set_uuid2; /* 14 Raid set identifier #3 */
- __u32 set_uuid3; /* 15 Raid set identifier #4 */
- __u32 gstate_creserved[MD_SB_GENERIC_CONSTANT_WORDS - 16];
-
- /*
- * Generic state information
- */
- __u32 utime; /* 0 Superblock update time */
- __u32 state; /* 1 State bits (clean, ...) */
- __u32 active_disks; /* 2 Number of currently active disks */
- __u32 working_disks; /* 3 Number of working disks */
- __u32 failed_disks; /* 4 Number of failed disks */
- __u32 spare_disks; /* 5 Number of spare disks */
- __u32 sb_csum; /* 6 checksum of the whole superblock */
- __u64 events; /* 7 number of superblock updates (64-bit!) */
- __u32 gstate_sreserved[MD_SB_GENERIC_STATE_WORDS - 9];
-
- /*
- * Personality information
- */
- __u32 layout; /* 0 the array's physical layout */
- __u32 chunk_size; /* 1 chunk size in bytes */
- __u32 root_pv; /* 2 LV root PV */
- __u32 root_block; /* 3 LV root block */
- __u32 pstate_reserved[MD_SB_PERSONALITY_WORDS - 4];
-
- /*
- * Disks information
- */
- mdp_disk_t disks[MD_SB_DISKS];
-
- /*
- * Reserved
- */
- __u32 reserved[MD_SB_RESERVED_WORDS];
-
- /*
- * Active descriptor
- */
- mdp_disk_t this_disk;
-
-} mdp_super_t;
-
-#endif /* _MD_P_H */
-
+++ /dev/null
-/*
- md_u.h : user <=> kernel API between Linux raidtools and RAID drivers
- Copyright (C) 1998 Ingo Molnar
-
- This program is free software; you can redistribute it and/or modify
- it under the terms of the GNU General Public License as published by
- the Free Software Foundation; either version 2, or (at your option)
- any later version.
-
- You should have received a copy of the GNU General Public License
- (for example /usr/src/linux/COPYING); if not, write to the Free
- Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
-*/
-
-#ifndef _MD_U_H
-#define _MD_U_H
-
-/* ioctls */
-
-/* status */
-#define RAID_VERSION _IOR (MD_MAJOR, 0x10, mdu_version_t)
-#define GET_ARRAY_INFO _IOR (MD_MAJOR, 0x11, mdu_array_info_t)
-#define GET_DISK_INFO _IOR (MD_MAJOR, 0x12, mdu_disk_info_t)
-#define PRINT_RAID_DEBUG _IO (MD_MAJOR, 0x13)
-
-/* configuration */
-#define CLEAR_ARRAY _IO (MD_MAJOR, 0x20)
-#define ADD_NEW_DISK _IOW (MD_MAJOR, 0x21, mdu_disk_info_t)
-#define HOT_REMOVE_DISK _IO (MD_MAJOR, 0x22)
-#define SET_ARRAY_INFO _IOW (MD_MAJOR, 0x23, mdu_array_info_t)
-#define SET_DISK_INFO _IO (MD_MAJOR, 0x24)
-#define WRITE_RAID_INFO _IO (MD_MAJOR, 0x25)
-#define UNPROTECT_ARRAY _IO (MD_MAJOR, 0x26)
-#define PROTECT_ARRAY _IO (MD_MAJOR, 0x27)
-#define HOT_ADD_DISK _IO (MD_MAJOR, 0x28)
-
-/* usage */
-#define RUN_ARRAY _IOW (MD_MAJOR, 0x30, mdu_param_t)
-#define START_ARRAY _IO (MD_MAJOR, 0x31)
-#define STOP_ARRAY _IO (MD_MAJOR, 0x32)
-#define STOP_ARRAY_RO _IO (MD_MAJOR, 0x33)
-#define RESTART_ARRAY_RW _IO (MD_MAJOR, 0x34)
-
-typedef struct mdu_version_s {
- int major;
- int minor;
- int patchlevel;
-} mdu_version_t;
-
-typedef struct mdu_array_info_s {
- /*
- * Generic constant information
- */
- int major_version;
- int minor_version;
- int patch_version;
- int ctime;
- int level;
- int size;
- int nr_disks;
- int raid_disks;
- int md_minor;
- int not_persistent;
-
- /*
- * Generic state information
- */
- int utime; /* 0 Superblock update time */
- int state; /* 1 State bits (clean, ...) */
- int active_disks; /* 2 Number of currently active disks */
- int working_disks; /* 3 Number of working disks */
- int failed_disks; /* 4 Number of failed disks */
- int spare_disks; /* 5 Number of spare disks */
-
- /*
- * Personality information
- */
- int layout; /* 0 the array's physical layout */
- int chunk_size; /* 1 chunk size in bytes */
-
-} mdu_array_info_t;
-
-typedef struct mdu_disk_info_s {
- /*
- * configuration/status of one particular disk
- */
- int number;
- int major;
- int minor;
- int raid_disk;
- int state;
-
-} mdu_disk_info_t;
-
-typedef struct mdu_start_info_s {
- /*
- * configuration/status of one particular disk
- */
- int major;
- int minor;
- int raid_disk;
- int state;
-
-} mdu_start_info_t;
-
-typedef struct mdu_param_s
-{
- int personality; /* 1,2,3,4 */
- int chunk_size; /* in bytes */
- int max_fault; /* unused for now */
-} mdu_param_t;
-
-#endif /* _MD_U_H */
-
+++ /dev/null
-#ifndef _RAID0_H
-#define _RAID0_H
-
-#include <linux/raid/md.h>
-
-struct strip_zone
-{
- int zone_offset; /* Zone offset in md_dev */
- int dev_offset; /* Zone offset in real dev */
- int size; /* Zone size */
- int nb_dev; /* # of devices attached to the zone */
- mdk_rdev_t *dev[MAX_REAL]; /* Devices attached to the zone */
-};
-
-struct raid0_hash
-{
- struct strip_zone *zone0, *zone1;
-};
-
-struct raid0_private_data
-{
- struct raid0_hash *hash_table; /* Dynamically allocated */
- struct strip_zone *strip_zone; /* This one too */
- int nr_strip_zones;
- struct strip_zone *smallest;
- int nr_zones;
-};
-
-typedef struct raid0_private_data raid0_conf_t;
-
-#define mddev_to_conf(mddev) ((raid0_conf_t *) mddev->private)
-
-#endif
+++ /dev/null
-#ifndef _RAID1_H
-#define _RAID1_H
-
-#include <linux/raid/md.h>
-
-struct mirror_info {
- int number;
- int raid_disk;
- kdev_t dev;
- int next;
- int sect_limit;
-
- /*
- * State bits:
- */
- int operational;
- int write_only;
- int spare;
-
- int used_slot;
-};
-
-struct raid1_private_data {
- mddev_t *mddev;
- struct mirror_info mirrors[MD_SB_DISKS];
- int nr_disks;
- int raid_disks;
- int working_disks;
- int last_used;
- unsigned long next_sect;
- int sect_count;
- mdk_thread_t *thread, *resync_thread;
- int resync_mirrors;
- struct mirror_info *spare;
-};
-
-typedef struct raid1_private_data raid1_conf_t;
-
-/*
- * this is the only point in the RAID code where we violate
- * C type safety. mddev->private is an 'opaque' pointer.
- */
-#define mddev_to_conf(mddev) ((raid1_conf_t *) mddev->private)
-
-/*
- * this is our 'private' 'collective' RAID1 buffer head.
- * it contains information about what kind of IO operations were started
- * for this RAID1 operation, and about their status:
- */
-
-struct raid1_bh {
- atomic_t remaining; /* 'have we finished' count,
- * used from IRQ handlers
- */
- int cmd;
- unsigned long state;
- mddev_t *mddev;
- struct buffer_head *master_bh;
- struct buffer_head *mirror_bh [MD_SB_DISKS];
- struct buffer_head bh_req;
- struct buffer_head *next_retry;
-};
-
-#endif
+++ /dev/null
-#ifndef _RAID5_H
-#define _RAID5_H
-
-#include <linux/raid/md.h>
-#include <linux/raid/xor.h>
-
-struct disk_info {
- kdev_t dev;
- int operational;
- int number;
- int raid_disk;
- int write_only;
- int spare;
- int used_slot;
-};
-
-struct stripe_head {
- md_spinlock_t stripe_lock;
- struct stripe_head *hash_next, **hash_pprev; /* hash pointers */
- struct stripe_head *free_next; /* pool of free sh's */
- struct buffer_head *buffer_pool; /* pool of free buffers */
- struct buffer_head *bh_pool; /* pool of free bh's */
- struct raid5_private_data *raid_conf;
- struct buffer_head *bh_old[MD_SB_DISKS]; /* disk image */
- struct buffer_head *bh_new[MD_SB_DISKS]; /* buffers of the MD device (present in buffer cache) */
- struct buffer_head *bh_copy[MD_SB_DISKS]; /* copy on write of bh_new (bh_new can change from under us) */
- struct buffer_head *bh_req[MD_SB_DISKS]; /* copy of bh_new (only the buffer heads), queued to the lower levels */
- int cmd_new[MD_SB_DISKS]; /* READ/WRITE for new */
- int new[MD_SB_DISKS]; /* buffer added since the last handle_stripe() */
- unsigned long sector; /* sector of this row */
- int size; /* buffers size */
- int pd_idx; /* parity disk index */
- atomic_t nr_pending; /* nr of pending cmds */
- unsigned long state; /* state flags */
- int cmd; /* stripe cmd */
- int count; /* nr of waiters */
- int write_method; /* reconstruct-write / read-modify-write */
- int phase; /* PHASE_BEGIN, ..., PHASE_COMPLETE */
- struct wait_queue *wait; /* processes waiting for this stripe */
-};
-
-/*
- * Phase
- */
-#define PHASE_BEGIN 0
-#define PHASE_READ_OLD 1
-#define PHASE_WRITE 2
-#define PHASE_READ 3
-#define PHASE_COMPLETE 4
-
-/*
- * Write method
- */
-#define METHOD_NONE 0
-#define RECONSTRUCT_WRITE 1
-#define READ_MODIFY_WRITE 2
-
-/*
- * Stripe state
- */
-#define STRIPE_LOCKED 0
-#define STRIPE_ERROR 1
-
-/*
- * Stripe commands
- */
-#define STRIPE_NONE 0
-#define STRIPE_WRITE 1
-#define STRIPE_READ 2
-
-struct raid5_private_data {
- struct stripe_head **stripe_hashtbl;
- mddev_t *mddev;
- mdk_thread_t *thread, *resync_thread;
- struct disk_info disks[MD_SB_DISKS];
- struct disk_info *spare;
- int buffer_size;
- int chunk_size, level, algorithm;
- int raid_disks, working_disks, failed_disks;
- int sector_count;
- unsigned long next_sector;
- atomic_t nr_handle;
- struct stripe_head *next_free_stripe;
- int nr_stripes;
- int resync_parity;
- int max_nr_stripes;
- int clock;
- int nr_hashed_stripes;
- int nr_locked_stripes;
- int nr_pending_stripes;
- int nr_cached_stripes;
-
- /*
- * Free stripes pool
- */
- int nr_free_sh;
- struct stripe_head *free_sh_list;
- struct wait_queue *wait_for_stripe;
-};
-
-typedef struct raid5_private_data raid5_conf_t;
-
-#define mddev_to_conf(mddev) ((raid5_conf_t *) mddev->private)
-
-/*
- * Our supported algorithms
- */
-#define ALGORITHM_LEFT_ASYMMETRIC 0
-#define ALGORITHM_RIGHT_ASYMMETRIC 1
-#define ALGORITHM_LEFT_SYMMETRIC 2
-#define ALGORITHM_RIGHT_SYMMETRIC 3
-
-#endif
+++ /dev/null
-#ifndef _TRANSLUCENT_H
-#define _TRANSLUCENT_H
-
-#include <linux/raid/md.h>
-
-typedef struct dev_info dev_info_t;
-
-struct dev_info {
- kdev_t dev;
- int size;
-};
-
-struct translucent_private_data
-{
- dev_info_t disks[MD_SB_DISKS];
-};
-
-
-typedef struct translucent_private_data translucent_conf_t;
-
-#define mddev_to_conf(mddev) ((translucent_conf_t *) mddev->private)
-
-#endif
+++ /dev/null
-#ifndef _XOR_H
-#define _XOR_H
-
-#include <linux/raid/md.h>
-
-#define MAX_XOR_BLOCKS 5
-
-extern void calibrate_xor_block(void);
-extern void (*xor_block)(unsigned int count,
- struct buffer_head **bh_ptr);
-
-#endif
--- /dev/null
+#ifndef _RAID0_H
+#define _RAID0_H
+
+struct strip_zone
+{
+ int zone_offset; /* Zone offset in md_dev */
+ int dev_offset; /* Zone offset in real dev */
+ int size; /* Zone size */
+ int nb_dev; /* Number of devices attached to the zone */
+ struct real_dev *dev[MAX_REAL]; /* Devices attached to the zone */
+};
+
+struct raid0_hash
+{
+ struct strip_zone *zone0, *zone1;
+};
+
+struct raid0_data
+{
+ struct raid0_hash *hash_table; /* Dynamically allocated */
+ struct strip_zone *strip_zone; /* This one too */
+ int nr_strip_zones;
+ struct strip_zone *smallest;
+ int nr_zones;
+};
+
+#endif
--- /dev/null
+#ifndef _RAID1_H
+#define _RAID1_H
+
+#include <linux/md.h>
+
+struct mirror_info {
+ int number;
+ int raid_disk;
+ kdev_t dev;
+ int next;
+ int sect_limit;
+
+ /*
+ * State bits:
+ */
+ int operational;
+ int write_only;
+ int spare;
+};
+
+struct raid1_data {
+ struct md_dev *mddev;
+ struct mirror_info mirrors[MD_SB_DISKS]; /* RAID1 devices, 2 to MD_SB_DISKS */
+ int raid_disks;
+ int working_disks; /* Number of working disks */
+ int last_used;
+ unsigned long next_sect;
+ int sect_count;
+ int resync_running;
+};
+
+/*
+ * this is our 'private' 'collective' RAID1 buffer head.
+ * it contains information about what kind of IO operations were started
+ * for this RAID1 operation, and about their status:
+ */
+
+struct raid1_bh {
+ unsigned int remaining;
+ int cmd;
+ unsigned long state;
+ struct md_dev *mddev;
+ struct buffer_head *master_bh;
+ struct buffer_head *mirror_bh [MD_SB_DISKS];
+ struct buffer_head bh_req;
+ struct buffer_head *next_retry;
+};
+
+#endif
--- /dev/null
+#ifndef _RAID5_H
+#define _RAID5_H
+
+#ifdef __KERNEL__
+#include <linux/md.h>
+#include <asm/atomic.h>
+
+struct disk_info {
+ kdev_t dev;
+ int operational;
+ int number;
+ int raid_disk;
+ int write_only;
+ int spare;
+};
+
+struct stripe_head {
+ struct stripe_head *hash_next, **hash_pprev; /* hash pointers */
+ struct stripe_head *free_next; /* pool of free sh's */
+ struct buffer_head *buffer_pool; /* pool of free buffers */
+ struct buffer_head *bh_pool; /* pool of free bh's */
+ struct raid5_data *raid_conf;
+ struct buffer_head *bh_old[MD_SB_DISKS]; /* disk image */
+ struct buffer_head *bh_new[MD_SB_DISKS]; /* buffers of the MD device (present in buffer cache) */
+ struct buffer_head *bh_copy[MD_SB_DISKS]; /* copy on write of bh_new (bh_new can change from under us) */
+ struct buffer_head *bh_req[MD_SB_DISKS]; /* copy of bh_new (only the buffer heads), queued to the lower levels */
+ int cmd_new[MD_SB_DISKS]; /* READ/WRITE for new */
+ int new[MD_SB_DISKS]; /* buffer added since the last handle_stripe() */
+ unsigned long sector; /* sector of this row */
+ int size; /* buffers size */
+ int pd_idx; /* parity disk index */
+ int nr_pending; /* nr of pending cmds */
+ unsigned long state; /* state flags */
+ int cmd; /* stripe cmd */
+ int count; /* nr of waiters */
+ int write_method; /* reconstruct-write / read-modify-write */
+ int phase; /* PHASE_BEGIN, ..., PHASE_COMPLETE */
+ struct wait_queue *wait; /* processes waiting for this stripe */
+};
+
+/*
+ * Phase
+ */
+#define PHASE_BEGIN 0
+#define PHASE_READ_OLD 1
+#define PHASE_WRITE 2
+#define PHASE_READ 3
+#define PHASE_COMPLETE 4
+
+/*
+ * Write method
+ */
+#define METHOD_NONE 0
+#define RECONSTRUCT_WRITE 1
+#define READ_MODIFY_WRITE 2
+
+/*
+ * Stripe state
+ */
+#define STRIPE_LOCKED 0
+#define STRIPE_ERROR 1
+
+/*
+ * Stripe commands
+ */
+#define STRIPE_NONE 0
+#define STRIPE_WRITE 1
+#define STRIPE_READ 2
+
+struct raid5_data {
+ struct stripe_head **stripe_hashtbl;
+ struct md_dev *mddev;
+ struct md_thread *thread, *resync_thread;
+ struct disk_info disks[MD_SB_DISKS];
+ struct disk_info *spare;
+ int buffer_size;
+ int chunk_size, level, algorithm;
+ int raid_disks, working_disks, failed_disks;
+ int sector_count;
+ unsigned long next_sector;
+ atomic_t nr_handle;
+ struct stripe_head *next_free_stripe;
+ int nr_stripes;
+ int resync_parity;
+ int max_nr_stripes;
+ int clock;
+ int nr_hashed_stripes;
+ int nr_locked_stripes;
+ int nr_pending_stripes;
+ int nr_cached_stripes;
+
+ /*
+ * Free stripes pool
+ */
+ int nr_free_sh;
+ struct stripe_head *free_sh_list;
+ struct wait_queue *wait_for_stripe;
+};
+
+#endif
+
+/*
+ * Our supported algorithms
+ */
+#define ALGORITHM_LEFT_ASYMMETRIC 0
+#define ALGORITHM_RIGHT_ASYMMETRIC 1
+#define ALGORITHM_LEFT_SYMMETRIC 2
+#define ALGORITHM_RIGHT_SYMMETRIC 3
+
+#endif
/* CTL_DEV names: */
enum {
DEV_CDROM=1,
- DEV_HWMON=2,
- DEV_MD=3
+ DEV_HWMON=2
};
/* /proc/sys/dev/cdrom */
DEV_CDROM_INFO=1
};
-/* /proc/sys/dev/md */
-enum {
- DEV_MD_SPEED_LIMIT=1
-};
-
#ifdef __KERNEL__
extern asmlinkage int sys_sysctl(struct __sysctl_args *);
#include <linux/list.h>
#endif /* __KERNEL__ */
-#ifdef CONFIG_IP_MASQUERADE_VS
-struct ip_vs_dest;
-#endif
-
/*
* This define affects the number of ports that can be handled
* by each of the protocol helper modules.
#define IP_MASQ_MOD_CTL 0x00
#define IP_MASQ_USER_CTL 0x01
+#ifdef __KERNEL__
+
+#define IP_MASQ_TAB_SIZE 256
+
#define IP_MASQ_F_NO_DADDR 0x0001 /* no daddr yet */
#define IP_MASQ_F_NO_DPORT 0x0002 /* no dport set yet */
#define IP_MASQ_F_NO_SADDR 0x0004 /* no saddr set yet */
#define IP_MASQ_F_USER 0x2000 /* from uspace */
#define IP_MASQ_F_SIMPLE_HASH 0x8000 /* prevent s+d and m+d hashing */
-#ifdef CONFIG_IP_MASQUERADE_VS
-#define IP_MASQ_F_VS 0x00010000 /* virtual server related */
-#define IP_MASQ_F_VS_NO_OUTPUT 0x00020000 /* output packets avoid masq */
-#define IP_MASQ_F_VS_FIN 0x00040000 /* fin detected */
-#define IP_MASQ_F_VS_FWD_MASK 0x00700000 /* mask for the fdw method */
-#define IP_MASQ_F_VS_LOCALNODE 0x00100000 /* local node destination */
-#define IP_MASQ_F_VS_TUNNEL 0x00200000 /* packets will be tunneled */
-#define IP_MASQ_F_VS_DROUTE 0x00400000 /* direct routing */
- /* masquerading otherwise */
-#define IP_MASQ_VS_FWD(ms) (ms->flags & IP_MASQ_F_VS_FWD_MASK)
-#endif /* CONFIG_IP_MASQUERADE_VS */
-
-#ifdef __KERNEL__
-
-#define IP_MASQ_NTABLES 3
-#define IP_MASQ_TAB_SIZE 256
-
/*
* Delta seq. info structure
* Each MASQ struct has 2 (output AND input seq. changes).
unsigned timeout; /* timeout */
unsigned state; /* state info */
struct ip_masq_timeout_table *timeout_table;
-#ifdef CONFIG_IP_MASQUERADE_VS
- struct ip_vs_dest *dest; /* real server & service */
-#endif /* CONFIG_IP_MASQUERADE_VS */
};
/*
+++ /dev/null
-/*
- * Virtual server support for IP masquerading
- * data structure and functionality definitions
- */
-
-#ifndef _IP_VS_H
-#define _IP_VS_H
-
-#include <linux/config.h>
-
-#ifdef CONFIG_IP_VS_DEBUG
-#define IP_VS_DBG(msg...) printk(KERN_DEBUG "IP_VS: " ## msg )
-#else /* NO DEBUGGING at ALL */
-#define IP_VS_DBG(msg...)
-#endif
-
-#define IP_VS_ERR(msg...) printk(KERN_ERR "IP_VS: " ## msg )
-#define IP_VS_INFO(msg...) printk(KERN_INFO "IP_VS: " ## msg )
-#define IP_VS_WARNING(msg...) \
- printk(KERN_WARNING "IP_VS: " ## msg)
-
-struct ip_vs_dest;
-struct ip_vs_scheduler;
-
-/*
- * The information about the virtual service offered to the net
- * and the forwarding entries
- */
-struct ip_vs_service {
- struct ip_vs_service *next;
- __u32 addr; /* IP address for virtual service */
- __u16 port; /* port number for the service */
- __u16 protocol; /* which protocol (TCP/UDP) */
- struct ip_vs_dest *destinations; /* real server list */
- struct ip_vs_scheduler *scheduler; /* bound scheduler object */
- void *sched_data; /* scheduler application data */
-};
-
-
-/*
- * The real server destination forwarding entry
- * with ip address, port
- */
-struct ip_vs_dest {
- struct ip_vs_dest *next;
- __u32 addr; /* IP address of real server */
- __u16 port; /* port number of the service */
- unsigned masq_flags; /* flags to copy to masq */
- atomic_t connections;
- atomic_t refcnt;
- int weight;
- struct ip_vs_service *service; /* service might be NULL */
-};
-
-
-/*
- * The scheduler object
- */
-struct ip_vs_scheduler {
- struct ip_vs_scheduler *next;
- char *name;
- atomic_t refcnt;
-
- /* scheduler initializing service */
- int (*init_service)(struct ip_vs_service *svc);
- /* scheduling service finish */
- int (*done_service)(struct ip_vs_service *svc);
-
- /* scheduling and creating a masquerading entry */
- struct ip_masq* (*schedule)(struct ip_vs_service *svc,
- struct iphdr *iph);
-};
-
-/*
- * IP Virtual Server hash table
- */
-#define IP_VS_TAB_BITS CONFIG_IP_MASQUERADE_VS_TAB_BITS
-#define IP_VS_TAB_SIZE (1 << IP_VS_TAB_BITS)
-extern struct list_head ip_vs_table[IP_VS_TAB_SIZE];
-
-/*
- * Hash and unhash functions
- */
-extern int ip_vs_hash(struct ip_masq *ms);
-extern int ip_vs_unhash(struct ip_masq *ms);
-
-/*
- * registering/unregistering scheduler functions
- */
-extern int register_ip_vs_scheduler(struct ip_vs_scheduler *scheduler);
-extern int unregister_ip_vs_scheduler(struct ip_vs_scheduler *scheduler);
-
-/*
- * Lookup functions for the hash table
- */
-extern struct ip_masq * ip_vs_in_get(int protocol, __u32 s_addr, __u16 s_port, __u32 d_addr, __u16 d_port);
-extern struct ip_masq * ip_vs_out_get(int protocol, __u32 s_addr, __u16 s_port, __u32 d_addr, __u16 d_port);
-
-/*
- * Creating a masquerading entry for IPVS
- */
-extern struct ip_masq *ip_masq_new_vs(int proto, __u32 maddr, __u16 mport, __u32 saddr, __u16 sport, __u32 daddr, __u16 dport, unsigned flags);
-
-/*
- * IPVS data and functions
- */
-extern rwlock_t __ip_vs_lock;
-
-extern int ip_vs_ctl(int optname, struct ip_masq_ctl *mctl, int optlen);
-
-extern void ip_vs_fin_masq(struct ip_masq *ms);
-extern void ip_vs_bind_masq(struct ip_masq *ms, struct ip_vs_dest *dest);
-extern void ip_vs_unbind_masq(struct ip_masq *ms);
-
-struct ip_vs_service *ip_vs_lookup_service(__u32 vaddr, __u16 vport,
- __u16 protocol);
-extern struct ip_masq *ip_vs_schedule(__u32 vaddr, __u16 vport,
- __u16 protocol,
- struct iphdr *iph);
-
-extern int ip_vs_tunnel_xmit(struct sk_buff **skb_p, __u32 daddr);
-
-/*
- * init function
- */
-extern int ip_vs_init(void);
-
-/*
- * init function prototypes for scheduling modules
- * these function will be called when they are built in kernel
- */
-extern int ip_vs_rr_init(void);
-extern int ip_vs_wrr_init(void);
-extern int ip_vs_wlc_init(void);
-extern int ip_vs_pcc_init(void);
-
-
-/*
- * ip_vs_fwd_tag returns the forwarding tag of the masq
- */
-static __inline__ char ip_vs_fwd_tag(struct ip_masq *ms)
-{
- char fwd = 'M';
-
- switch (IP_MASQ_VS_FWD(ms)) {
- case IP_MASQ_F_VS_LOCALNODE: fwd = 'L'; break;
- case IP_MASQ_F_VS_TUNNEL: fwd = 'T'; break;
- case IP_MASQ_F_VS_DROUTE: fwd = 'R'; break;
- }
- return fwd;
-}
-
-
-#endif /* _IP_VS_H */
#include <linux/utsname.h>
#include <linux/ioport.h>
#include <linux/init.h>
-#include <linux/raid/md.h>
#include <linux/smp_lock.h>
#include <linux/blk.h>
#include <linux/hdreg.h>
#ifdef CONFIG_BLK_DEV_FD
{ "fd", 0x0200 },
#endif
-#if CONFIG_MD_BOOT || CONFIG_AUTODETECT_RAID
+#ifdef CONFIG_MD_BOOT
{ "md", 0x0900 },
#endif
#ifdef CONFIG_BLK_DEV_XD
#ifdef CONFIG_MD_BOOT
{ "md=", md_setup},
#endif
-#if CONFIG_BLK_DEV_MD
- { "raid=", raid_setup},
-#endif
#ifdef CONFIG_ADBMOUSE
{ "adb_buttons=", adb_mouse_setup },
#endif
while (pid != wait(&i));
if (MAJOR(real_root_dev) != RAMDISK_MAJOR
|| MINOR(real_root_dev) != 0) {
-#ifdef CONFIG_BLK_DEV_MD
- autodetect_raid();
-#endif
error = change_root(real_root_dev,"/initrd");
if (error)
printk(KERN_ERR "Change root to /initrd: "
tristate 'IP: ipportfw masq support (EXPERIMENTAL)' CONFIG_IP_MASQUERADE_IPPORTFW
tristate 'IP: ip fwmark masq-forwarding support (EXPERIMENTAL)' CONFIG_IP_MASQUERADE_MFW
fi
- bool 'IP: masquerading virtual server support (EXPERIMENTAL)' CONFIG_IP_MASQUERADE_VS
- if [ "$CONFIG_IP_MASQUERADE_VS" = "y" ]; then
- int 'IP masquerading VS table size (the Nth power of 2)' CONFIG_IP_MASQUERADE_VS_TAB_BITS 12
- tristate 'IPVS: round-robin scheduling' CONFIG_IP_MASQUERADE_VS_RR
- tristate 'IPVS: weighted round-robin scheduling' CONFIG_IP_MASQUERADE_VS_WRR
- tristate 'IPVS: weighted least-connection scheduling' CONFIG_IP_MASQUERADE_VS_WLC
- tristate 'IPVS: persistent client connection scheduling' CONFIG_IP_MASQUERADE_VS_PCC
- fi
fi
fi
fi
endif
-ifeq ($(CONFIG_IP_MASQUERADE_VS),y)
- IPV4X_OBJS += ip_vs.o
-
- ifeq ($(CONFIG_IP_MASQUERADE_VS_RR),y)
- IPV4_OBJS += ip_vs_rr.o
- else
- ifeq ($(CONFIG_IP_MASQUERADE_VS_RR),m)
- M_OBJS += ip_vs_rr.o
- endif
- endif
-
- ifeq ($(CONFIG_IP_MASQUERADE_VS_WRR),y)
- IPV4_OBJS += ip_vs_wrr.o
- else
- ifeq ($(CONFIG_IP_MASQUERADE_VS_WRR),m)
- M_OBJS += ip_vs_wrr.o
- endif
- endif
-
- ifeq ($(CONFIG_IP_MASQUERADE_VS_WLC),y)
- IPV4_OBJS += ip_vs_wlc.o
- else
- ifeq ($(CONFIG_IP_MASQUERADE_VS_WLC),m)
- M_OBJS += ip_vs_wlc.o
- endif
- endif
-
- ifeq ($(CONFIG_IP_MASQUERADE_VS_PCC),y)
- IPV4_OBJS += ip_vs_pcc.o
- else
- ifeq ($(CONFIG_IP_MASQUERADE_VS_PCC),m)
- M_OBJS += ip_vs_pcc.o
- endif
- endif
-endif
-
M_OBJS += ip_masq_user.o
M_OBJS += ip_masq_ftp.o ip_masq_irc.o ip_masq_raudio.o ip_masq_quake.o
M_OBJS += ip_masq_vdolive.o ip_masq_cuseeme.o
/* linux/net/inet/arp.c
*
- * Version: $Id: arp.c,v 1.77.2.2 1999/08/13 18:26:03 davem Exp $
+ * Version: $Id: arp.c,v 1.77.2.1 1999/06/28 10:39:23 davem Exp $
*
* Copyright (C) 1994 by Florian La Roche
*
* clean up the APFDDI & gen. FDDI bits.
* Alexey Kuznetsov: new arp state machine;
* now it is in net/core/neighbour.c.
- * Wensong Zhang : NOARP device (such as tunl) arp fix.
- * Peter Kese : arp_solicit: saddr opt disabled for vs.
*/
/* RFC1122 Status:
u32 target = *(u32*)neigh->primary_key;
int probes = neigh->probes;
-#if !defined(CONFIG_IP_MASQUERADE_VS) /* Virtual server */
- /* use default interface address as source address in virtual
- * server environment. Otherways the saddr might be the virtual
- * address and gateway's arp cache might start routing packets
- * to the real server */
if (skb && inet_addr_type(skb->nh.iph->saddr) == RTN_LOCAL)
saddr = skb->nh.iph->saddr;
else
-#endif
saddr = inet_select_addr(dev, target, RT_SCOPE_LINK);
if ((probes -= neigh->parms->ucast_probes) < 0) {
struct rtable *rt;
unsigned char *sha, *tha;
u32 sip, tip;
- struct device *tdev;
u16 dev_type = dev->type;
int addr_type;
struct in_device *in_dev = dev->ip_ptr;
if (LOOPBACK(tip) || MULTICAST(tip))
goto out;
-/*
- * Check for the device flags for the target IP. If the IFF_NOARP
- * is set, just delete it. No arp reply is sent. -- WZ
- */
- if ((tdev = ip_dev_find(tip)) && (tdev->flags & IFF_NOARP))
- goto out;
-
/*
* Process entry. The idea here is we want to send a reply if it is a
* request for us or if it is a request for someone else that we hold
*
* The Internet Protocol (IP) module.
*
- * Version: $Id: ip_input.c,v 1.37.2.1 1999/08/13 18:26:08 davem Exp $
+ * Version: $Id: ip_input.c,v 1.37 1999/04/22 10:38:36 davem Exp $
*
* Authors: Ross Biro, <bir7@leland.Stanford.Edu>
* Fred N. van Kempen, <waltje@uWalt.NL.Mugnet.ORG>
}
ret = ip_fw_demasquerade(&skb);
-#ifdef CONFIG_IP_MASQUERADE_VS
- if (ret == -3) {
- /* packet had been tunneled */
- return(0);
- }
- if (ret == -2) {
- return skb->dst->input(skb);
- }
-#endif
if (ret < 0) {
kfree_skb(skb);
return 0;
*
* Copyright (c) 1994 Pauline Middelink
*
- * $Id: ip_masq.c,v 1.34.2.3 1999/08/13 18:26:15 davem Exp $
+ * $Id: ip_masq.c,v 1.34.2.2 1999/08/07 10:56:28 davem Exp $
*
*
* See ip_fw.c for original log
* Kai Bankett : do not toss other IP protos in proto_doff()
* Dan Kegel : pointed correct NAT behavior for UDP streams
* Julian Anastasov : use daddr and dport as hash keys
- * Wensong Zhang : Added virtual server support
- * Peter Kese : fixed TCP state handling for input-only
+ *
*/
#include <linux/config.h>
#include <linux/ip_fw.h>
#include <linux/ip_masq.h>
-#ifdef CONFIG_IP_MASQUERADE_VS
-#include <net/ip_vs.h>
-#endif /* CONFIG_IP_MASQUERADE_VS */
-
-#ifdef CONFIG_IP_MASQUERADE_VS
-/*
- * The following block implements slow timers, most code is stolen
- * from linux/kernel/sched.c
- */
-#define SHIFT_BITS 7
-#define TVN_BITS 11
-#define TVR_BITS 7
-#define TVN_SIZE (1 << TVN_BITS)
-#define TVR_SIZE (1 << TVR_BITS)
-#define TVN_MASK (TVN_SIZE - 1)
-#define TVR_MASK (TVR_SIZE - 1)
-
-struct sltimer_vec {
- int index;
- struct timer_list *vec[TVN_SIZE];
-};
-
-struct sltimer_vec_root {
- int index;
- struct timer_list *vec[TVR_SIZE];
-};
-
-static struct sltimer_vec sltv3 = { 0 };
-static struct sltimer_vec sltv2 = { 0 };
-static struct sltimer_vec_root sltv1 = { 0 };
-
-static struct sltimer_vec * const sltvecs[] = {
- (struct sltimer_vec *)&sltv1, &sltv2, &sltv3
-};
-
-#define NOOF_SLTVECS (sizeof(sltvecs) / sizeof(sltvecs[0]))
-
-static unsigned long sltimer_jiffies = 0;
-
-static inline void insert_sltimer(struct timer_list *timer,
- struct timer_list **vec, int idx)
-{
- if ((timer->next = vec[idx]))
- vec[idx]->prev = timer;
- vec[idx] = timer;
- timer->prev = (struct timer_list *)&vec[idx];
-}
-
-static inline void internal_add_sltimer(struct timer_list *timer)
-{
- /*
- * must be cli-ed when calling this
- */
- unsigned long expires = timer->expires;
- unsigned long idx = (expires - sltimer_jiffies) >> SHIFT_BITS;
-
- if (idx < TVR_SIZE) {
- int i = (expires >> SHIFT_BITS) & TVR_MASK;
- insert_sltimer(timer, sltv1.vec, i);
- } else if (idx < 1 << (TVR_BITS + TVN_BITS)) {
- int i = (expires >> (SHIFT_BITS+TVR_BITS)) & TVN_MASK;
- insert_sltimer(timer, sltv2.vec, i);
- } else if ((signed long) idx < 0) {
- /* can happen if you add a timer with expires == jiffies,
- * or you set a timer to go off in the past
- */
- insert_sltimer(timer, sltv1.vec, sltv1.index);
- } else if (idx <= 0xffffffffUL) {
- int i = (expires >> (SHIFT_BITS+TVR_BITS+TVN_BITS)) & TVN_MASK;
- insert_sltimer(timer, sltv3.vec, i);
- } else {
- /* Can only get here on architectures with 64-bit jiffies */
- timer->next = timer->prev = timer;
- }
-}
-
-rwlock_t sltimerlist_lock = RW_LOCK_UNLOCKED;
-
-void add_sltimer(struct timer_list *timer)
-{
- write_lock(&sltimerlist_lock);
- if (timer->prev)
- goto bug;
- internal_add_sltimer(timer);
-out:
- write_unlock(&sltimerlist_lock);
- return;
-
-bug:
- printk("bug: kernel sltimer added twice at %p.\n",
- __builtin_return_address(0));
- goto out;
-}
-
-static inline int detach_sltimer(struct timer_list *timer)
-{
- struct timer_list *prev = timer->prev;
- if (prev) {
- struct timer_list *next = timer->next;
- prev->next = next;
- if (next)
- next->prev = prev;
- return 1;
- }
- return 0;
-}
-
-void mod_sltimer(struct timer_list *timer, unsigned long expires)
-{
- write_lock(&sltimerlist_lock);
- timer->expires = expires;
- detach_sltimer(timer);
- internal_add_sltimer(timer);
- write_unlock(&sltimerlist_lock);
-}
-
-int del_sltimer(struct timer_list * timer)
-{
- int ret;
-
- write_lock(&sltimerlist_lock);
- ret = detach_sltimer(timer);
- timer->next = timer->prev = 0;
- write_unlock(&sltimerlist_lock);
- return ret;
-}
-
-
-static inline void cascade_sltimers(struct sltimer_vec *tv)
-{
- /* cascade all the timers from tv up one level */
- struct timer_list *timer;
- timer = tv->vec[tv->index];
- /*
- * We are removing _all_ timers from the list, so we don't have to
- * detach them individually, just clear the list afterwards.
- */
- while (timer) {
- struct timer_list *tmp = timer;
- timer = timer->next;
- internal_add_sltimer(tmp);
- }
- tv->vec[tv->index] = NULL;
- tv->index = (tv->index + 1) & TVN_MASK;
-}
-
-static inline void run_sltimer_list(void)
-{
- write_lock(&sltimerlist_lock);
- while ((long)(jiffies - sltimer_jiffies) >= 0) {
- struct timer_list *timer;
- if (!sltv1.index) {
- int n = 1;
- do {
- cascade_sltimers(sltvecs[n]);
- } while (sltvecs[n]->index == 1 && ++n < NOOF_SLTVECS);
- }
- while ((timer = sltv1.vec[sltv1.index])) {
- void (*fn)(unsigned long) = timer->function;
- unsigned long data = timer->data;
- detach_sltimer(timer);
- timer->next = timer->prev = NULL;
- write_unlock(&sltimerlist_lock);
- fn(data);
- write_lock(&sltimerlist_lock);
- }
- sltimer_jiffies += 1<<SHIFT_BITS;
- sltv1.index = (sltv1.index + 1) & TVR_MASK;
- }
- write_unlock(&sltimerlist_lock);
-}
-
-static void sltimer_handler(unsigned long data);
-
-struct timer_list slow_timer = {
- NULL, NULL,
- 0, 0,
- sltimer_handler,
-};
-
-#define SLTIMER_PERIOD 1*HZ
-
-void sltimer_handler(unsigned long data)
-{
- run_sltimer_list();
- mod_timer(&slow_timer, (jiffies + SLTIMER_PERIOD));
-}
-
-#endif /* CONFIG_IP_MASQUERADE_VS */
-
int sysctl_ip_masq_debug = 0;
/*
/*fin*/ {{mTW, mFW, mSS, mTW, mFW, mTW, mCL, mTW, mLA, mLI }},
/*ack*/ {{mES, mES, mSS, mSR, mFW, mTW, mCL, mCW, mLA, mES }},
/*rst*/ {{mCL, mCL, mSS, mCL, mCL, mTW, mCL, mCL, mCL, mCL }},
-
-#ifdef CONFIG_IP_MASQUERADE_VS
-/* INPUT-ONLY */
-/* mNO, mES, mSS, mSR, mFW, mTW, mCL, mCW, mLA, mLI */
-/*syn*/ {{mES, mES, mES, mSR, mES, mSR, mSR, mSR, mSR, mSR }},
-/*fin*/ {{mCL, mFW, mSS, mTW, mFW, mTW, mCL, mCW, mLA, mLI }},
-/*ack*/ {{mCL, mES, mSS, mSR, mFW, mTW, mCL, mCW, mCL, mLI }},
-/*rst*/ {{mCL, mCL, mCL, mSR, mCL, mCL, mCL, mCL, mLA, mLI }},
-#endif
};
-#define MASQ_STATE_INPUT 0
-#define MASQ_STATE_OUTPUT 4
-#define MASQ_STATE_INPUT_ONLY 8
-
-static __inline__ int masq_tcp_state_idx(struct tcphdr *th, int state_off)
+static __inline__ int masq_tcp_state_idx(struct tcphdr *th, int output)
{
/*
- * [0-3]: input states, [4-7]: output, [8-11] input only states.
+ * [0-3]: input states, [4-7]: output.
*/
+ if (output)
+ output=4;
+
if (th->rst)
- return state_off+3;
+ return output+3;
if (th->syn)
- return state_off+0;
+ return output+0;
if (th->fin)
- return state_off+1;
+ return output+1;
if (th->ack)
- return state_off+2;
+ return output+2;
return -1;
}
+
static int masq_set_state_timeout(struct ip_masq *ms, int state)
{
struct ip_masq_timeout_table *mstim = ms->timeout_table;
return state;
}
-static int masq_tcp_state(struct ip_masq *ms, int state_off, struct tcphdr *th)
+static int masq_tcp_state(struct ip_masq *ms, int output, struct tcphdr *th)
{
int state_idx;
int new_state = IP_MASQ_S_CLOSE;
-#ifdef CONFIG_IP_MASQUERADE_VS
- /* update state offset to INPUT_ONLY if necessary */
- /* or delete NO_OUTPUT flag if output packet detected */
- if (ms->flags & IP_MASQ_F_VS_NO_OUTPUT) {
- if (state_off == MASQ_STATE_OUTPUT)
- ms->flags &= ~IP_MASQ_F_VS_NO_OUTPUT;
- else state_off = MASQ_STATE_INPUT_ONLY;
- }
-#endif
-
- if ((state_idx = masq_tcp_state_idx(th, state_off)) < 0) {
+ if ((state_idx = masq_tcp_state_idx(th, output)) < 0) {
IP_MASQ_DEBUG(1, "masq_state_idx(%d)=%d!!!\n",
- state_off, state_idx);
+ output, state_idx);
goto tcp_state_out;
}
if (new_state!=ms->state)
IP_MASQ_DEBUG(1, "%s %s [%c%c%c%c] %08lX:%04X-%08lX:%04X state: %s->%s\n",
masq_proto_name(ms->protocol),
- (state_off==MASQ_STATE_OUTPUT) ? "output " : "input ",
+ output? "output" : "input ",
th->syn? 'S' : '.',
th->fin? 'F' : '.',
th->ack? 'A' : '.',
ntohl(ms->daddr), ntohs(ms->dport),
ip_masq_state_name(ms->state),
ip_masq_state_name(new_state));
-
-#ifdef CONFIG_IP_MASQUERADE_VS
- if (th->fin && (ms->state == IP_MASQ_S_ESTABLISHED)
- && (ms->flags & IP_MASQ_F_VS) && !(ms->flags & IP_MASQ_F_VS_FIN)) {
- ip_vs_fin_masq(ms);
- }
-#endif /* CONFIG_IP_MASQUERADE_VS */
-
return masq_set_state_timeout(ms, new_state);
}
/*
* Handle state transitions
*/
-static int masq_set_state(struct ip_masq *ms, int state_off, struct iphdr *iph, void *tp)
+static int masq_set_state(struct ip_masq *ms, int output, struct iphdr *iph, void *tp)
{
switch (iph->protocol) {
case IPPROTO_ICMP:
case IPPROTO_UDP:
return masq_set_state_timeout(ms, IP_MASQ_S_UDP);
case IPPROTO_TCP:
- return masq_tcp_state(ms, state_off, tp);
+ return masq_tcp_state(ms, output, tp);
}
return -1;
}
EXPORT_SYMBOL(ip_masq_get_debug_level);
EXPORT_SYMBOL(ip_masq_new);
-#ifdef CONFIG_IP_MASQUERADE_VS
-EXPORT_SYMBOL(ip_masq_new_vs);
-#endif /* CONFIG_IP_MASQUERADE_VS */
EXPORT_SYMBOL(ip_masq_listen);
EXPORT_SYMBOL(ip_masq_free_ports);
EXPORT_SYMBOL(ip_masq_out_get);
* 1 for extra modules support (daddr)
*/
+#define IP_MASQ_NTABLES 3
+
struct list_head ip_masq_m_table[IP_MASQ_TAB_SIZE];
struct list_head ip_masq_s_table[IP_MASQ_TAB_SIZE];
struct list_head ip_masq_d_table[IP_MASQ_TAB_SIZE];
{
if (tout) {
ms->timer.expires = jiffies+tout;
-#ifdef CONFIG_IP_MASQUERADE_VS
- add_sltimer(&ms->timer);
-#else
add_timer(&ms->timer);
-#endif
} else {
-#ifdef CONFIG_IP_MASQUERADE_VS
- del_sltimer(&ms->timer);
-#else
del_timer(&ms->timer);
-#endif
}
}
struct ip_masq *ms;
read_lock(&__ip_masq_lock);
-#ifdef CONFIG_IP_MASQUERADE_VS
- ms = ip_vs_out_get(protocol, s_addr, s_port, d_addr, d_port);
- if (ms == NULL)
-#endif /* CONFIG_IP_MASQUERADE_VS */
ms = __ip_masq_out_get(protocol, s_addr, s_port, d_addr, d_port);
read_unlock(&__ip_masq_lock);
struct ip_masq *ms;
read_lock(&__ip_masq_lock);
-#ifdef CONFIG_IP_MASQUERADE_VS
- ms = ip_vs_in_get(protocol, s_addr, s_port, d_addr, d_port);
- if (ms == NULL)
-#endif /* CONFIG_IP_MASQUERADE_VS */
ms = __ip_masq_in_get(protocol, s_addr, s_port, d_addr, d_port);
read_unlock(&__ip_masq_lock);
if (ms->control)
ip_masq_control_del(ms);
-#ifdef CONFIG_IP_MASQUERADE_VS
- if (ms->flags & IP_MASQ_F_VS) {
- if (ip_vs_unhash(ms)) {
- ip_vs_unbind_masq(ms);
- }
- }
- else
-#endif /* CONFIG_IP_MASQUERADE_VS */
if (ip_masq_unhash(ms)) {
if (ms->flags&IP_MASQ_F_MPORT) {
atomic_dec(&mport_count);
return NULL;
}
-
-#ifdef CONFIG_IP_MASQUERADE_VS
-/*
- * Create a new masquerade entry for IPVS, all parameters {maddr,
- * mport, saddr, sport, daddr, dport, mflags} are known. No need
- * to allocate a free mport. And, hash it into the ip_vs_table.
- *
- * Be careful, it can be called from u-space
- */
-
-struct ip_masq * ip_masq_new_vs(int proto, __u32 maddr, __u16 mport, __u32 saddr, __u16 sport, __u32 daddr, __u16 dport, unsigned mflags)
-{
- struct ip_masq *ms;
- static int n_fails = 0;
- int prio;
-
- prio = (mflags&IP_MASQ_F_USER) ? GFP_KERNEL : GFP_ATOMIC;
-
- ms = (struct ip_masq *) kmalloc(sizeof(struct ip_masq), prio);
- if (ms == NULL) {
- if (++n_fails < 5)
- IP_VS_ERR("ip_masq_new_vs(proto=%s): no memory available.\n",
- masq_proto_name(proto));
- return NULL;
- }
- MOD_INC_USE_COUNT;
- memset(ms, 0, sizeof(*ms));
- init_timer(&ms->timer);
- ms->timer.data = (unsigned long)ms;
- ms->timer.function = masq_expire;
- ms->protocol = proto;
- ms->saddr = saddr;
- ms->sport = sport;
- ms->daddr = daddr;
- ms->dport = dport;
- ms->maddr = maddr;
- ms->mport = mport;
- ms->flags = mflags;
- ms->app_data = NULL;
- ms->control = NULL;
-
- atomic_set(&ms->n_control,0);
- atomic_set(&ms->refcnt,0);
-
- if (mflags & IP_MASQ_F_USER)
- write_lock_bh(&__ip_masq_lock);
- else
- write_lock(&__ip_masq_lock);
-
- /*
- * Hash it in the ip_vs_table
- */
- ip_vs_hash(ms);
-
- if (mflags & IP_MASQ_F_USER)
- write_unlock_bh(&__ip_masq_lock);
- else
- write_unlock(&__ip_masq_lock);
-
- /* ip_masq_bind_app(ms); */
- atomic_inc(&ms->refcnt);
- masq_set_state_timeout(ms, IP_MASQ_S_NONE);
- return ms;
-}
-#endif /* CONFIG_IP_MASQUERADE_VS */
-
-
/*
* Get transport protocol data offset, check against size
* return:
return ret;
}
-
int ip_fw_masquerade(struct sk_buff **skb_p, __u32 maddr)
{
struct sk_buff *skb = *skb_p;
IP_MASQ_DEBUG(2, "O-routed from %08lX:%04X with masq.addr %08lX\n",
ntohl(ms->maddr),ntohs(ms->mport),ntohl(maddr));
- masq_set_state(ms, MASQ_STATE_OUTPUT, iph, h.portp);
+ masq_set_state(ms, 1, iph, h.portp);
ip_masq_put(ms);
return 0;
-}
+ }
/*
* Restore original addresses and ports in the original IP
ntohs(icmp_id(icmph)),
icmph->type);
- masq_set_state(ms, MASQ_STATE_OUTPUT, iph, icmph);
+ masq_set_state(ms, 1, iph, icmph);
ip_masq_put(ms);
return 1;
ntohs(icmp_id(icmph)),
icmph->type);
- masq_set_state(ms, MASQ_STATE_INPUT, iph, icmph);
+ masq_set_state(ms, 0, iph, icmph);
ip_masq_put(ms);
return 1;
return(ip_fw_demasq_icmp(skb_p));
case IPPROTO_TCP:
case IPPROTO_UDP:
- /*
+ /*
* Make sure packet is in the masq range
* ... or some mod-ule relaxes input range
* ... or there is still some `special' mport opened
*/
-#ifdef CONFIG_IP_MASQUERADE_VS
- ms = ip_masq_in_get_iph(iph);
- if ((ms == NULL)
- && (ip_vs_lookup_service(maddr, h.portp[1], iph->protocol) == NULL)
-#else
if ((ntohs(h.portp[1]) < PORT_MASQ_BEGIN
|| ntohs(h.portp[1]) > PORT_MASQ_END)
-#endif /* CONFIG_IP_MASQUERADE_VS */
#ifdef CONFIG_IP_MASQUERADE_MOD
&& (ip_masq_mod_in_rule(skb, iph) != 1)
#endif
return 0;
}
+
+
IP_MASQ_DEBUG(2, "Incoming %s %08lX:%04X -> %08lX:%04X\n",
masq_proto_name(iph->protocol),
ntohl(iph->saddr), ntohs(h.portp[0]),
/*
* reroute to original host:port if found...
*/
-#ifndef CONFIG_IP_MASQUERADE_VS
+
ms = ip_masq_in_get_iph(iph);
-#endif
/*
* Give additional modules a chance to create an entry
ip_masq_mod_in_update(skb, iph, ms);
#endif
-#ifdef CONFIG_IP_MASQUERADE_VS
- if (!ms && (h.th->syn || (iph->protocol!=IPPROTO_TCP))) {
- /*
- * Let the virtual server select a real server
- * for the incomming connection, and create a
- * masquerading entry.
- */
- ms = ip_vs_schedule(iph->daddr,h.portp[1],iph->protocol,iph);
- }
-#endif /* CONFIG_IP_MASQUERADE_VS */
if (ms != NULL)
{
+
/*
* got reply, so clear flag
*/
}
}
-
if ((skb=masq_skb_cow(skb_p, &iph, &h.raw)) == NULL) {
ip_masq_put(ms);
return -1;
}
-
-#ifdef CONFIG_IP_MASQUERADE_VS
- if (IP_MASQ_VS_FWD(ms) != 0) {
- int ret = 0;
-
- /*
- * Return values mean:
- * -1 skb must be released
- * -2 call skb->dst->input(skb) to release skb
- * -3 skb has been released
- */
- switch (IP_MASQ_VS_FWD(ms)) {
- case IP_MASQ_F_VS_TUNNEL:
- if (ip_vs_tunnel_xmit(skb_p, ms->saddr) == 0) {
- IP_VS_DBG("tunneling error.\n");
- } else {
- IP_VS_DBG("tunneling succeeded.\n");
- }
- ret = -3;
- break;
-
- case IP_MASQ_F_VS_DROUTE:
- dst_release(skb->dst);
- skb->dst = NULL;
- ip_send_check(iph);
- if (ip_route_input(skb, ms->saddr, iph->saddr,
- iph->tos, skb->dev)) {
- IP_VS_DBG("direct routing error.\n");
- ret = -1;
- } else {
- IP_VS_DBG("direct routing succeeded.\n");
- ret = -2;
- }
- break;
-
- case IP_MASQ_F_VS_LOCALNODE:
- ret = 0;
- }
-
- /*
- * Set state of masq entry
- */
- masq_set_state (ms, MASQ_STATE_INPUT, iph, h.portp);
- ip_masq_put(ms);
-
- return ret;
- }
-
- IP_VS_DBG("masquerading packet...\n");
-#endif /* CONFIG_IP_MASQUERADE_VS */
-
iph->daddr = ms->saddr;
h.portp[1] = ms->sport;
-
+
/*
* Invalidate csum saving if tunnel has masq helper
*/
h.uh->check = 0xFFFF;
break;
}
- ip_send_check(iph);
+ ip_send_check(iph);
IP_MASQ_DEBUG(2, "I-routed to %08lX:%04X\n",ntohl(iph->daddr),ntohs(h.portp[1]));
- masq_set_state (ms, MASQ_STATE_INPUT, iph, h.portp);
+ masq_set_state (ms, 0, iph, h.portp);
ip_masq_put(ms);
return 1;
len += sprintf(buffer+len, "%-127s\n", temp);
if(len >= length) {
- read_unlock_bh(&__ip_masq_lock);
- goto done;
- }
- }
- read_unlock_bh(&__ip_masq_lock);
- }
-
-#ifdef CONFIG_IP_MASQUERADE_VS
- for(idx = 0; idx < IP_VS_TAB_SIZE; idx++)
- {
- /*
- * Lock is actually only need in next loop
- * we are called from uspace: must stop bh.
- */
- read_lock_bh(&__ip_masq_lock);
-
- l = &ip_vs_table[idx];
- for (e=l->next; e!=l; e=e->next) {
- ms = list_entry(e, struct ip_masq, m_list);
- pos += 128;
- if (pos <= offset) {
- len = 0;
- continue;
- }
-
- /*
- * We have locked the tables, no need to del/add timers
- * nor cli() 8)
- */
-
- sprintf(temp,"%s %08lX:%04X %08lX:%04X %04X %08X %6d %6d %7lu",
- masq_proto_name(ms->protocol),
- ntohl(ms->saddr), ntohs(ms->sport),
- ntohl(ms->daddr), ntohs(ms->dport),
- ntohs(ms->mport),
- ms->out_seq.init_seq,
- ms->out_seq.delta,
- ms->out_seq.previous_delta,
- ms->timer.expires-jiffies);
- len += sprintf(buffer+len, "%-127s\n", temp);
-
- if(len >= length) {
read_unlock_bh(&__ip_masq_lock);
goto done;
}
read_unlock_bh(&__ip_masq_lock);
}
-#endif /* CONFIG_IP_MASQUERADE_VS */
-
done:
+
+
begin = len - (pos - offset);
*start = buffer + begin;
len -= begin;
case IP_MASQ_TARGET_MOD:
ret = ip_masq_mod_ctl(optname, &masq_ctl, optlen);
break;
-#endif
-#ifdef CONFIG_IP_MASQUERADE_VS
- case IP_MASQ_TARGET_VS:
- ret = ip_vs_ctl(optname, &masq_ctl, optlen);
- break;
#endif
}
(char *) IPPROTO_ICMP,
ip_masq_user_info
});
-#endif /* CONFIG_PROC_FS */
+#endif
#ifdef CONFIG_IP_MASQUERADE_IPAUTOFW
ip_autofw_init();
#endif
#endif
#ifdef CONFIG_IP_MASQUERADE_MFW
ip_mfw_init();
-#endif
-#ifdef CONFIG_IP_MASQUERADE_VS
- ip_vs_init();
- slow_timer.expires = jiffies+SLTIMER_PERIOD;
- add_timer(&slow_timer);
#endif
ip_masq_app_init();
* IP_MASQ_AUTOFW auto forwarding module
*
*
- * $Id: ip_masq_autofw.c,v 1.3.2.1 1999/08/13 18:26:20 davem Exp $
+ * $Id: ip_masq_autofw.c,v 1.3 1998/08/29 23:51:10 davem Exp $
*
* Author: Richard Lynch
*
{
struct ip_autofw * newaf;
newaf = kmalloc( sizeof(struct ip_autofw), GFP_KERNEL );
if ( newaf == NULL )
{
printk("ip_autofw_add: malloc said no\n");
return( ENOMEM );
}
init_timer(&newaf->timer);
MOD_INC_USE_COUNT;
memcpy(newaf, af, sizeof(struct ip_autofw_user));
*
* Does (reverse-masq) forwarding based on skb->fwmark value
*
- * $Id: ip_masq_mfw.c,v 1.3.2.2 1999/08/13 18:26:26 davem Exp $
+ * $Id: ip_masq_mfw.c,v 1.3.2.1 1999/07/02 10:10:03 davem Exp $
*
* Author: Juan Jose Ciarlante <jjciarla@raiz.uncu.edu.ar>
* based on Steven Clarke's portfw
(!mu->rport || h->port == mu->rport)) {
/* HIT */
atomic_dec(&mfw->nhosts);
- e = h->list.prev;
list_del(&h->list);
kfree_s(h, sizeof(*h));
MOD_DEC_USE_COUNT;
* IP_MASQ_PORTFW masquerading module
*
*
- * $Id: ip_masq_portfw.c,v 1.3.2.2 1999/08/13 18:26:29 davem Exp $
+ * $Id: ip_masq_portfw.c,v 1.3.2.1 1999/07/02 10:10:02 davem Exp $
*
* Author: Steven Clarke <steven.clarke@monmouth.demon.co.uk>
*
(!laddr || n->laddr == laddr) &&
(!raddr || n->raddr == raddr) &&
(!rport || n->rport == rport)) {
- entry = n->list.prev;
- list_del(&n->list);
+ list_del(entry);
ip_masq_mod_dec_nent(mmod_self);
kfree_s(n, sizeof(struct ip_portfw));
MOD_DEC_USE_COUNT;
raddr, rport,
iph->saddr, portp[0],
0);
if (!ms || atomic_read(&mmod_self->mmod_nent) <= 1
/* || ip_masq_nlocks(&portfw_lock) != 1 */ )
/*
*/
goto out;
ip_masq_listen(ms);

/*
* Entry created, lock==1.
* if pref_cnt == 0, move
* IP_MASQ_USER user space control module
*
*
- * $Id: ip_masq_user.c,v 1.1.2.2 1999/08/13 18:26:33 davem Exp $
+ * $Id: ip_masq_user.c,v 1.1.2.1 1999/08/07 10:56:33 davem Exp $
*/
#include <linux/config.h>
+++ /dev/null
-/*
- * IPVS An implementation of the IP virtual server support for the
- * LINUX operating system. IPVS is now implemented as a part
- * of IP masquerading code. IPVS can be used to build a
- * high-performance and highly available server based on a
- * cluster of servers.
- *
- * Version: $Id: ip_vs.c,v 1.1.2.1 1999/08/13 18:25:27 davem Exp $
- *
- * Authors: Wensong Zhang <wensong@iinchina.net>
- * Peter Kese <peter.kese@ijs.si>
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License
- * as published by the Free Software Foundation; either version
- * 2 of the License, or (at your option) any later version.
- *
- * Changes:
- * Wensong Zhang : fixed the overflow bug in ip_vs_procinfo
- * Wensong Zhang : added editing dest and service functions
- * Wensong Zhang : changed name of some functions
- * Wensong Zhang : fixed the unlocking bug in ip_vs_del_dest
- * Wensong Zhang : added a separate hash table for IPVS
- *
- */
-
-#include <linux/config.h>
-#include <linux/module.h>
-#include <linux/types.h>
-#include <linux/kernel.h>
-#include <linux/errno.h>
-#include <net/ip_masq.h>
-
-#include <linux/sysctl.h>
-#include <linux/ip_fw.h>
-#include <linux/ip_masq.h>
-#include <linux/proc_fs.h>
-
-#include <linux/inetdevice.h>
-#include <linux/ip.h>
-#include <net/icmp.h>
-#include <net/ip.h>
-#include <net/route.h>
-
-#include <net/ip_masq.h>
-#include <net/ip_vs.h>
-
-#ifdef CONFIG_KMOD
-#include <linux/kmod.h>
-#endif
-
-EXPORT_SYMBOL(register_ip_vs_scheduler);
-EXPORT_SYMBOL(unregister_ip_vs_scheduler);
-EXPORT_SYMBOL(ip_vs_bind_masq);
-EXPORT_SYMBOL(ip_vs_unbind_masq);
-
-/*
- * Lock for IPVS
- */
-rwlock_t __ip_vs_lock = RW_LOCK_UNLOCKED;
-
-/*
- * Hash table: for input and output packets lookups of IPVS
- */
-struct list_head ip_vs_table[IP_VS_TAB_SIZE];
-
-/*
- * virtual server list and schedulers
- */
-static struct ip_vs_service *service_list[2] = {NULL,NULL};
-static struct ip_vs_scheduler *schedulers = NULL;
-
-
-/*
- * Register a scheduler in the scheduler list
- */
-int register_ip_vs_scheduler(struct ip_vs_scheduler *scheduler)
-{
- if (!scheduler) {
- IP_VS_ERR("register_ip_vs_scheduler(): NULL arg\n");
- return -EINVAL;
- }
-
- if (!scheduler->name) {
- IP_MASQ_ERR("register_ip_vs_scheduler(): NULL scheduler_name\n");
- return -EINVAL;
- }
-
- if (scheduler->next) {
- IP_VS_ERR("register_ip_vs_scheduler(): scheduler already linked\n");
- return -EINVAL;
- }
-
- scheduler->next = schedulers;
- schedulers = scheduler;
-
- return 0;
-}
-
-
-/*
- * Unregister a scheduler in the scheduler list
- */
-int unregister_ip_vs_scheduler(struct ip_vs_scheduler *scheduler)
-{
- struct ip_vs_scheduler **psched;
-
- if (!scheduler) {
- IP_MASQ_ERR( "unregister_ip_vs_scheduler(): NULL arg\n");
- return -EINVAL;
- }
-
- /*
- * Only allow unregistration if it is not referenced
- */
- if (atomic_read(&scheduler->refcnt)) {
- IP_MASQ_ERR( "unregister_ip_vs_scheduler(): is in use by %d guys. failed\n",
- atomic_read(&scheduler->refcnt));
- return -EINVAL;
- }
-
- /*
- * Must be already removed from the scheduler list
- */
- for (psched = &schedulers; (*psched) && (*psched != scheduler);
- psched = &((*psched)->next));
-
- if (*psched != scheduler) {
- IP_VS_ERR("unregister_ip_vs_scheduler(): scheduler is in the list. failed\n");
- return -EINVAL;
- }
-
- *psched = scheduler->next;
- scheduler->next = NULL;
-
- return 0;
-}
-
-
-/*
- * Bind a service with a scheduler
- */
-int ip_vs_bind_scheduler(struct ip_vs_service *svc,
- struct ip_vs_scheduler *scheduler)
-{
- if (svc == NULL) {
- IP_VS_ERR("ip_vs_bind_scheduler(): svc arg NULL\n");
- return -EINVAL;
- }
- if (scheduler == NULL) {
- IP_VS_ERR("ip_vs_bind_scheduler(): scheduler arg NULL\n");
- return -EINVAL;
- }
-
- svc->scheduler = scheduler;
- atomic_inc(&scheduler->refcnt);
-
- if(scheduler->init_service)
- if(scheduler->init_service(svc) != 0) {
- IP_VS_ERR("ip_vs_bind_scheduler(): init error\n");
- return -EINVAL;
- }
-
- return 0;
-}
-
-
-/*
- * Unbind a service with its scheduler
- */
-int ip_vs_unbind_scheduler(struct ip_vs_service *svc)
-{
- struct ip_vs_scheduler *sched;
-
- if (svc == NULL) {
- IP_VS_ERR("ip_vs_unbind_scheduler(): svc arg NULL\n");
- return -EINVAL;
- }
-
- sched = svc->scheduler;
- if (sched == NULL) {
- IP_VS_ERR("ip_vs_unbind_scheduler(): svc isn't bound\n");
- return -EINVAL;
- }
-
- if(sched->done_service)
- if(sched->done_service(svc) != 0) {
- IP_VS_ERR("ip_vs_unbind_scheduler(): done error\n");
- return -EINVAL;
- }
-
- atomic_dec(&sched->refcnt);
- svc->scheduler = NULL;
-
- return 0;
-}
-
-
-/*
- * Returns hash value for IPVS
- */
-
-static __inline__ unsigned
-ip_vs_hash_key(unsigned proto, __u32 addr, __u16 port)
-{
- unsigned addrh = ntohl(addr);
-
- return (proto^addrh^(addrh>>IP_VS_TAB_BITS)^ntohs(port))
- & (IP_VS_TAB_SIZE-1);
-}
-
-
-/*
- * Hashes ip_masq in ip_vs_table by proto,addr,port.
- * should be called with locked tables.
- * returns bool success.
- */
-int ip_vs_hash(struct ip_masq *ms)
-{
- unsigned hash;
-
- if (ms->flags & IP_MASQ_F_HASHED) {
- IP_VS_ERR("ip_vs_hash(): request for already hashed, called from %p\n",
- __builtin_return_address(0));
- return 0;
- }
- /*
- * Hash by proto,client{addr,port}
- */
- hash = ip_vs_hash_key(ms->protocol, ms->daddr, ms->dport);
-
- /*
- * Note: because ip_masq_put sets masq expire if its
- * refcnt==IP_MASQ_NTABLES, we have to increase
- * counter IP_MASQ_NTABLES times, otherwise the masq
- * won't expire.
- */
- atomic_add(IP_MASQ_NTABLES, &ms->refcnt);
- list_add(&ms->m_list, &ip_vs_table[hash]);
-
- ms->flags |= IP_MASQ_F_HASHED;
- return 1;
-}
-
-
-/*
- * UNhashes ip_masq from ip_vs_table.
- * should be called with locked tables.
- * returns bool success.
- */
-int ip_vs_unhash(struct ip_masq *ms)
-{
- unsigned int hash;
-
- if (!(ms->flags & IP_MASQ_F_HASHED)) {
- IP_VS_ERR("ip_vs_unhash(): request for unhash flagged, called from %p\n",
- __builtin_return_address(0));
- return 0;
- }
- /*
- * UNhash by client{addr,port}
- */
- hash = ip_vs_hash_key(ms->protocol, ms->daddr, ms->dport);
- /*
- * Note: since we increase refcnt while hashing,
- * we have to decrease it while unhashing.
- */
- atomic_sub(IP_MASQ_NTABLES, &ms->refcnt);
- list_del(&ms->m_list);
- ms->flags &= ~IP_MASQ_F_HASHED;
- return 1;
-}
-
-
-/*
- * Gets ip_masq associated with supplied parameters in the ip_vs_table.
- * Called for pkts coming from OUTside-to-INside the firewall.
- * s_addr, s_port: pkt source address (foreign host)
- * d_addr, d_port: pkt dest address (firewall)
- * Caller must lock tables
- */
-
-struct ip_masq * ip_vs_in_get(int protocol, __u32 s_addr, __u16 s_port, __u32 d_addr, __u16 d_port)
-{
- unsigned hash;
- struct ip_masq *ms;
- struct list_head *l, *e;
-
- hash = ip_vs_hash_key(protocol, s_addr, s_port);
-
- l=&ip_vs_table[hash];
- for(e=l->next; e!=l; e=e->next)
- {
- ms = list_entry(e, struct ip_masq, m_list);
- if (protocol==ms->protocol &&
- d_addr==ms->maddr && d_port==ms->mport &&
- s_addr==ms->daddr && s_port==ms->dport
- ) {
- atomic_inc(&ms->refcnt);
- goto out;
- }
- }
- ms = NULL;
-
- out:
- return ms;
-}
-
-
-/*
- * Gets ip_masq associated with supplied parameters in the ip_vs_table.
- * Called for pkts coming from inside-to-OUTside the firewall.
- * s_addr, s_port: pkt source address (inside host)
- * d_addr, d_port: pkt dest address (foreigh host)
- * Caller must lock tables
- */
-struct ip_masq * ip_vs_out_get(int protocol, __u32 s_addr, __u16 s_port, __u32 d_addr, __u16 d_port)
-{
- unsigned hash;
- struct ip_masq *ms;
- struct list_head *l, *e;
-
- /*
- * Check for "full" addressed entries
- */
- hash = ip_vs_hash_key(protocol, d_addr, d_port);
- l=&ip_vs_table[hash];
-
- for(e=l->next; e!=l; e=e->next)
- {
- ms = list_entry(e, struct ip_masq, m_list);
- if (protocol == ms->protocol &&
- s_addr == ms->saddr && s_port == ms->sport &&
- d_addr == ms->daddr && d_port == ms->dport
- ) {
- atomic_inc(&ms->refcnt);
- goto out;
- }
-
- }
- ms = NULL;
-
- out:
- return ms;
-}
-
-
-/*
- * Create a destination
- */
-struct ip_vs_dest *ip_vs_new_dest(struct ip_vs_service *svc,
- struct ip_masq_ctl *mctl)
-{
- struct ip_vs_dest *dest;
- struct ip_vs_user *mm = &mctl->u.vs_user;
-
- IP_VS_DBG("enter ip_vs_new_dest()\n");
-
- dest = (struct ip_vs_dest*) kmalloc(sizeof(struct ip_vs_dest),
- GFP_ATOMIC);
- if (dest == NULL) {
- IP_VS_ERR("ip_vs_new_dest: kmalloc failed.\n");
- return NULL;
- }
- memset(dest, 0, sizeof(struct ip_vs_dest));
-
- dest->service = svc;
- dest->addr = mm->daddr;
- dest->port = mm->dport;
- dest->weight = mm->weight;
- dest->masq_flags = mm->masq_flags;
-
- atomic_set(&dest->connections, 0);
- atomic_set(&dest->refcnt, 0);
-
- /*
- * Set the IP_MASQ_F_VS flag
- */
- dest->masq_flags |= IP_MASQ_F_VS;
-
- /* check if local node and update the flags */
- if (inet_addr_type(mm->daddr) == RTN_LOCAL) {
- dest->masq_flags = (dest->masq_flags & ~IP_MASQ_F_VS_FWD_MASK)
- | IP_MASQ_F_VS_LOCALNODE;
- }
-
- /* check if (fwd != masquerading) and update the port & flags */
- if ((dest->masq_flags & IP_MASQ_F_VS_FWD_MASK) != 0) {
- dest->masq_flags |= IP_MASQ_F_VS_NO_OUTPUT;
- }
-
- return dest;
-}
-
-
-/*
- * Add a destination into an existing service
- */
-int ip_vs_add_dest(struct ip_vs_service *svc, struct ip_masq_ctl *mctl)
-{
- struct ip_vs_dest *dest;
- struct ip_vs_user *mm = &mctl->u.vs_user;
- __u32 daddr = mm->daddr;
- __u16 dport = mm->dport;
-
- IP_VS_DBG("enter ip_vs_add_dest()\n");
-
- if (mm->weight < 0) {
- IP_VS_ERR("ip_vs_add_dest(): server weight less than zero\n");
- return -ERANGE;
- }
-
- write_lock_bh(&__ip_vs_lock);
-
- /* check the existing dest list */
- for (dest=svc->destinations; dest; dest=dest->next) {
- if ((dest->addr == daddr) && (dest->port == dport)) {
- write_unlock_bh(&__ip_vs_lock);
- IP_VS_ERR("ip_vs_add_dest(): dest exists\n");
- return -EEXIST;
- }
- }
-
- /* allocate and initialize the dest structure */
- dest = ip_vs_new_dest(svc, mctl);
- if (dest == NULL) {
- write_unlock_bh(&__ip_vs_lock);
- IP_VS_ERR("ip_vs_add_dest(): out of memory\n");
- return -ENOMEM;
- }
-
- /* put the dest entry into the list */
- dest->next = svc->destinations;
- svc->destinations = dest;
-
- write_unlock_bh(&__ip_vs_lock);
-
- atomic_inc(&dest->refcnt);
-
- return 0;
-}
-
-
-/*
- * Edit a destination in a service
- */
-int ip_vs_edit_dest(struct ip_vs_service *svc, struct ip_masq_ctl *mctl)
-{
- struct ip_vs_dest *dest;
- struct ip_vs_user *mm = &mctl->u.vs_user;
- __u32 daddr = mm->daddr;
- __u16 dport = mm->dport;
-
- IP_VS_DBG("enter ip_vs_edit_dest()\n");
-
- if (mm->weight < 0) {
- IP_VS_ERR("ip_vs_add_dest(): server weight less than zero\n");
- return -ERANGE;
- }
-
- write_lock_bh(&__ip_vs_lock);
-
- /* lookup the destination list */
- for (dest=svc->destinations; dest; dest=dest->next) {
- if ((dest->addr == daddr) && (dest->port == dport)) {
- /* HIT */
- break;
- }
- }
-
- if (dest == NULL) {
- write_unlock_bh(&__ip_vs_lock);
- IP_VS_ERR("ip_vs_edit_dest(): dest doesn't exist\n");
- return -ENOENT;
- }
-
- /*
- * Set the weight and the flags
- */
- dest->weight = mm->weight;
- dest->masq_flags = mm->masq_flags;
-
- dest->masq_flags |= IP_MASQ_F_VS;
-
- /* check if local node and update the flags */
- if (inet_addr_type(mm->daddr) == RTN_LOCAL) {
- dest->masq_flags = (dest->masq_flags & ~IP_MASQ_F_VS_FWD_MASK)
- | IP_MASQ_F_VS_LOCALNODE;
- }
-
- /* check if (fwd != masquerading) and update the port & flags */
- if ((dest->masq_flags & IP_MASQ_F_VS_FWD_MASK) != 0) {
- dest->masq_flags |= IP_MASQ_F_VS_NO_OUTPUT;
- }
-
- write_unlock_bh(&__ip_vs_lock);
-
- return 0;
-}
-
-
-/*
- * Delete a destination from an existing service
- */
-int ip_vs_del_dest(struct ip_vs_service *svc, struct ip_masq_ctl *mctl)
-{
- struct ip_vs_dest *dest;
- struct ip_vs_dest **pdest;
- struct ip_vs_user *mm = &mctl->u.vs_user;
- __u32 daddr = mm->daddr;
- __u16 dport = mm->dport;
-
- IP_VS_DBG("enter ip_vs_del_dest()\n");
-
- write_lock_bh(&__ip_vs_lock);
-
- /* remove dest from the destination list */
- pdest = &svc->destinations;
- while (*pdest) {
- dest = *pdest;
- if ((dest->addr == daddr) && (dest->port == dport))
- /* HIT */
- break;
-
- pdest = &dest->next;
- }
-
- if (*pdest == NULL) {
- write_unlock_bh(&__ip_vs_lock);
- IP_VS_ERR("ip_vs_del_dest(): destination not found!\n");
- return -ENOENT;
- }
-
- *pdest = dest->next;
- dest->service = NULL;
-
- write_unlock_bh(&__ip_vs_lock);
-
- /*
- * Decrease the refcnt of the dest, and free the dest
- * if nobody refers to it (refcnt=0).
- */
- if (atomic_dec_and_test(&dest->refcnt))
- kfree_s(dest, sizeof(*dest));
-
- return 0;
-}
-
-
-#if 0
-struct ip_vs_dest * ip_vs_lookup_dest(struct ip_vs_service *svc,
- __u32 daddr, __u16 dport)
-{
- struct ip_vs_dest *dest;
-
- read_lock_bh(&__ip_vs_lock);
-
- /*
- * Find the destination for the given service
- */
- for (dest=svc->destinations; dest; dest=dest->next) {
- if ((dest->addr == daddr) && (dest->port == dport)) {
- /* HIT */
- read_unlock_bh(&__ip_vs_lock);
- return dest;
- }
- }
-
- read_unlock_bh(&__ip_vs_lock);
- return NULL;
-}
-#endif
-
-
-/*
- * Add a service into the service list
- */
-int ip_vs_add_service(__u32 vaddr, __u16 vport,
- __u16 protocol, struct ip_vs_scheduler *scheduler)
-{
- struct ip_vs_service *svc;
- int proto_num = masq_proto_num(protocol);
- int ret = 0;
-
- write_lock_bh(&__ip_vs_lock);
-
- /* check if the service already exists */
- for (svc = service_list[proto_num]; svc; svc = svc->next) {
- if ((svc->port == vport) && (svc->addr == vaddr)) {
- ret = -EEXIST;
- goto out;
- }
- }
-
- svc = (struct ip_vs_service*) kmalloc(sizeof(struct ip_vs_service),
- GFP_ATOMIC);
- if (svc == NULL) {
-		IP_VS_ERR("vs_add_svc: kmalloc failed.\n");
-		ret = -ENOMEM;
- goto out;
- }
- memset(svc,0,sizeof(struct ip_vs_service));
-
- svc->addr = vaddr;
- svc->port = vport;
- svc->protocol = protocol;
-
- /*
- * Bind the scheduler
- */
- ip_vs_bind_scheduler(svc, scheduler);
-
-
- /* put the service into the proper service list */
- if ((svc->port) || (!service_list[proto_num])) {
- /* prepend to the beginning of the list */
- svc->next = service_list[proto_num];
- service_list[proto_num] = svc;
- } else {
- /* append to the end of the list if port==0 */
- struct ip_vs_service *lsvc = service_list[proto_num];
- while (lsvc->next) lsvc = lsvc->next;
- svc->next = NULL;
- lsvc->next = svc;
- }
-
- out:
- write_unlock_bh(&__ip_vs_lock);
- return ret;
-}
-
-
-/*
- * Edit a service
- */
-int ip_vs_edit_service(struct ip_vs_service *svc,
- struct ip_vs_scheduler *scheduler)
-{
- write_lock_bh(&__ip_vs_lock);
-
- /*
- * Unbind the old scheduler
- */
- ip_vs_unbind_scheduler(svc);
-
- /*
- * Bind the new scheduler
- */
- ip_vs_bind_scheduler(svc, scheduler);
-
- write_unlock_bh(&__ip_vs_lock);
-
- return 0;
-}
-
-
-/*
- * Delete a service from the service list
- */
-int ip_vs_del_service(struct ip_vs_service *svc)
-{
- struct ip_vs_service **psvc;
- struct ip_vs_dest *dest, *dnext;
- int ret = 0;
-
- write_lock_bh(&__ip_vs_lock);
-
- /* remove the service from the service_list */
- psvc = &service_list[masq_proto_num(svc->protocol)];
- for(; *psvc; psvc = &(*psvc)->next) {
- if (*psvc == svc) {
- break;
- }
- }
-
- if (*psvc == NULL) {
-		IP_VS_ERR("vs_del_svc: service not listed.\n");
-		ret = -ENOENT;
- goto out;
- }
-
- *psvc = svc->next;
-
- /*
- * Unbind scheduler
- */
- ip_vs_unbind_scheduler(svc);
-
- /*
- * Unlink the destination list
- */
- dest = svc->destinations;
- svc->destinations = NULL;
- for (; dest; dest=dnext) {
- dnext = dest->next;
- dest->service = NULL;
- dest->next = NULL;
-
- /*
- * Decrease the refcnt of the dest, and free the dest
- * if nobody refers to it (refcnt=0).
- */
- if (atomic_dec_and_test(&dest->refcnt))
- kfree_s(dest, sizeof(*dest));
- }
-
- /*
- * Free the service
- */
- kfree_s(svc, sizeof(struct ip_vs_service));
-
- out:
- write_unlock_bh(&__ip_vs_lock);
- return ret;
-}
-
-
-/*
- * Flush all the virtual services
- */
-int ip_vs_flush(void)
-{
- int proto_num;
- struct ip_vs_service *svc, *snext;
- struct ip_vs_dest *dest, *dnext;
- int ret = 0;
-
- write_lock_bh(&__ip_vs_lock);
-
- for (proto_num=0; proto_num<2; proto_num++) {
- svc = service_list[proto_num];
- service_list[proto_num] = NULL;
- for (; svc; svc=snext) {
- snext = svc->next;
-
- /*
- * Unbind scheduler
- */
- ip_vs_unbind_scheduler(svc);
-
- /*
- * Unlink the destination list
- */
- dest = svc->destinations;
- svc->destinations = NULL;
- for (; dest; dest=dnext) {
- dnext = dest->next;
- dest->service = NULL;
- dest->next = NULL;
-
- /*
- * Decrease the refcnt of the dest, and free
- * the dest if nobody refers to it (refcnt=0).
- */
- if (atomic_dec_and_test(&dest->refcnt))
- kfree_s(dest, sizeof(*dest));
- }
-
- /*
- * Free the service
- */
- kfree_s(svc, sizeof(*svc));
- }
- }
-
- write_unlock_bh(&__ip_vs_lock);
- return ret;
-}
-
-
-/*
- * Called when a FIN packet of ms is received
- */
-void ip_vs_fin_masq(struct ip_masq *ms)
-{
- IP_VS_DBG("enter ip_vs_fin_masq()\n");
-
- IP_VS_DBG("Masq fwd:%c s:%s c:%lX:%x v:%lX:%x d:%lX:%x flg:%X cnt:%d\n",
- ip_vs_fwd_tag(ms), ip_masq_state_name(ms->state),
- ntohl(ms->daddr),ntohs(ms->dport),
- ntohl(ms->maddr),ntohs(ms->mport),
- ntohl(ms->saddr),ntohs(ms->sport),
- ms->flags, atomic_read(&ms->refcnt));
-
- if(ms->dest)
- atomic_dec(&ms->dest->connections);
- ms->flags |= IP_MASQ_F_VS_FIN;
-}
-
-
-/*
- * Bind a masq entry with a VS destination
- */
-void ip_vs_bind_masq(struct ip_masq *ms, struct ip_vs_dest *dest)
-{
- IP_VS_DBG("enter ip_vs_bind_masq()\n");
-
- IP_VS_DBG("Masq fwd:%c s:%s c:%lX:%x v:%lX:%x d:%lX:%x flg:%X cnt:%d\n",
- ip_vs_fwd_tag(ms), ip_masq_state_name(ms->state),
- ntohl(ms->daddr),ntohs(ms->dport),
- ntohl(ms->maddr),ntohs(ms->mport),
- ntohl(ms->saddr),ntohs(ms->sport),
- ms->flags, atomic_read(&ms->refcnt));
-
- ms->flags |= dest->masq_flags;
- ms->dest = dest;
-
- /*
-	 * Increase the refcnt and connections counters of the dest.
- */
- atomic_inc(&dest->refcnt);
- atomic_inc(&dest->connections);
-}
-
-
-/*
- * Unbind a masq entry with its VS destination
- */
-void ip_vs_unbind_masq(struct ip_masq *ms)
-{
- struct ip_vs_dest *dest = ms->dest;
-
- IP_VS_DBG("enter ip_vs_unbind_masq()\n");
-
- IP_VS_DBG("Masq fwd:%c s:%s c:%lX:%x v:%lX:%x d:%lX:%x flg:%X cnt:%d\n",
- ip_vs_fwd_tag(ms), ip_masq_state_name(ms->state),
- ntohl(ms->daddr),ntohs(ms->dport),
- ntohl(ms->maddr),ntohs(ms->mport),
- ntohl(ms->saddr),ntohs(ms->sport),
- ms->flags, atomic_read(&ms->refcnt));
-
- if (dest) {
- if (!(ms->flags & IP_MASQ_F_VS_FIN)) {
- /*
- * Masq timeout, decrease the connection counter
- */
- atomic_dec(&dest->connections);
- }
-
- /*
- * Decrease the refcnt of the dest, and free the dest
- * if nobody refers to it (refcnt=0).
- */
- if (atomic_dec_and_test(&dest->refcnt))
- kfree_s(dest, sizeof(*dest));
- }
-}
-
-
-/*
- * Get scheduler in the scheduler list by name
- */
-struct ip_vs_scheduler * ip_vs_sched_getbyname(const char *sched_name)
-{
- struct ip_vs_scheduler *sched;
-
- IP_VS_DBG("ip_vs_sched_getbyname(): sched_name \"%s\"\n", sched_name);
-
- read_lock_bh(&__ip_vs_lock);
- for (sched = schedulers; sched; sched = sched->next) {
- if (strcmp(sched_name, sched->name)==0) {
- /* HIT */
- read_unlock_bh(&__ip_vs_lock);
- return sched;
- }
- }
-
- read_unlock_bh(&__ip_vs_lock);
- return NULL;
-}
-
-
-/*
- * Lookup scheduler and try to load it if it doesn't exist
- */
-struct ip_vs_scheduler * ip_vs_lookup_scheduler(const char *sched_name)
-{
- struct ip_vs_scheduler *sched;
-
- /* search for the scheduler by sched_name */
- sched = ip_vs_sched_getbyname(sched_name);
-
- /* if scheduler not found, load the module and search again */
- if (sched == NULL) {
- char module_name[IP_MASQ_TNAME_MAX+8];
- sprintf(module_name,"ip_vs_%s",sched_name);
-#ifdef CONFIG_KMOD
- request_module(module_name);
-#endif /* CONFIG_KMOD */
- sched = ip_vs_sched_getbyname(sched_name);
- }
-
- return sched;
-}
-
-
-/*
- * Lookup service by {proto,addr,port} in the service list
- */
-struct ip_vs_service *ip_vs_lookup_service(__u32 vaddr, __u16 vport,
- __u16 protocol)
-{
- struct ip_vs_service *svc;
-
- read_lock(&__ip_vs_lock);
- svc = service_list[masq_proto_num(protocol)];
- while (svc) {
- if ((svc->addr == vaddr) &&
- (!svc->port || (svc->port == vport)))
- break;
- svc = svc->next;
- }
- read_unlock(&__ip_vs_lock);
- return svc;
-}
-
-
-/*
- * IPVS main scheduling function
- * It selects a server according to the virtual service, and
- * creates a masq entry.
- */
-struct ip_masq *ip_vs_schedule(__u32 vaddr, __u16 vport, __u16 protocol,
- struct iphdr *iph)
-{
- struct ip_vs_service *svc;
- struct ip_masq *ms = NULL;
- int proto_num = masq_proto_num(protocol);
-
- read_lock(&__ip_vs_lock);
-
- /*
- * Lookup the service
- */
- for (svc = service_list[proto_num]; svc; svc = svc->next) {
- if ((svc->addr == vaddr) &&
- (!svc->port || (svc->port == vport))) {
- /*
- * choose the destination and create ip_masq entry
- */
- ms = svc->scheduler->schedule(svc, iph);
- break;
- }
- }
-
- read_unlock(&__ip_vs_lock);
-
- return ms;
-}
-
-
-/*
- * IPVS user control entry
- */
-int ip_vs_ctl(int optname, struct ip_masq_ctl *mctl, int optlen)
-{
- struct ip_vs_scheduler *sched = NULL;
- struct ip_vs_service *svc = NULL;
- struct ip_vs_user *mm = &mctl->u.vs_user;
- __u32 vaddr = mm->vaddr;
- __u16 vport = mm->vport;
- int proto_num = masq_proto_num(mm->protocol);
-
- /*
- * Check the size of mctl, no overflow...
- */
- if (optlen != sizeof(*mctl))
-		return -EINVAL;
-
- /*
- * Flush all the virtual service...
- */
- if (mctl->m_cmd == IP_MASQ_CMD_FLUSH)
- return ip_vs_flush();
-
- /*
- * Check for valid protocol: TCP or UDP
- */
- if ((proto_num < 0) || (proto_num > 1)) {
-		IP_VS_INFO("vs_ctl: invalid protocol: %d "
-			   "%d.%d.%d.%d:%d %s\n",
- ntohs(mm->protocol),
- NIPQUAD(vaddr), ntohs(vport), mctl->m_tname);
- return -EFAULT;
- }
-
- /*
- * Lookup the service by (vaddr, vport, protocol)
- */
- svc = ip_vs_lookup_service(vaddr, vport, mm->protocol);
-
- switch (mctl->m_cmd) {
- case IP_MASQ_CMD_ADD:
- if (svc != NULL)
- return -EEXIST;
-
- /* lookup the scheduler, by 'mctl->m_tname' */
- sched = ip_vs_lookup_scheduler(mctl->m_tname);
- if (sched == NULL) {
- IP_VS_INFO("Scheduler module ip_vs_%s.o not found\n",
- mctl->m_tname);
- return -ENOENT;
- }
-
- return ip_vs_add_service(vaddr, vport,
- mm->protocol, sched);
-
- case IP_MASQ_CMD_SET:
- if (svc == NULL)
- return -ESRCH;
-
- /* lookup the scheduler, by 'mctl->m_tname' */
- sched = ip_vs_lookup_scheduler(mctl->m_tname);
- if (sched == NULL) {
- IP_VS_INFO("Scheduler module ip_vs_%s.o not found\n",
- mctl->m_tname);
- return -ENOENT;
- }
-
- return ip_vs_edit_service(svc, sched);
-
- case IP_MASQ_CMD_DEL:
- if (svc == NULL)
- return -ESRCH;
- else
- return ip_vs_del_service(svc);
-
- case IP_MASQ_CMD_ADD_DEST:
- if (svc == NULL)
- return -ESRCH;
- else
- return ip_vs_add_dest(svc, mctl);
-
- case IP_MASQ_CMD_SET_DEST:
- if (svc == NULL)
- return -ESRCH;
- else
- return ip_vs_edit_dest(svc, mctl);
-
- case IP_MASQ_CMD_DEL_DEST:
- if (svc == NULL)
- return -ESRCH;
- else
- return ip_vs_del_dest(svc, mctl);
- }
- return -EINVAL;
-}
-
-
-
-#ifdef CONFIG_PROC_FS
-/*
- * Write the contents of the VS rule table to a PROCfs file.
- */
-static int ip_vs_procinfo(char *buf, char **start, off_t offset,
- int length, int *eof, void *data)
-{
- int ind;
- int len=0;
- off_t pos=0;
- int size;
- char str1[22];
- struct ip_vs_service *svc = NULL;
- struct ip_vs_dest *dest;
- __u16 protocol = 0;
-
- size = sprintf(buf+len,
- "IP Virtual Server (Version 0.7)\n"
- "Protocol Local Address:Port Scheduler\n"
- " -> Remote Address:Port Forward Weight ActiveConn FinConn\n");
- pos += size;
- len += size;
-
- read_lock_bh(&__ip_vs_lock);
-
- for (ind = 0; ind < 2; ind++) {
- if (ind == 0)
- protocol = IPPROTO_UDP;
- else
- protocol = IPPROTO_TCP;
-
- for (svc=service_list[masq_proto_num(protocol)]; svc; svc=svc->next) {
- size = sprintf(buf+len, "%s %d.%d.%d.%d:%d %s\n",
- masq_proto_name(protocol),
- NIPQUAD(svc->addr), ntohs(svc->port),
- svc->scheduler->name);
- len += size;
- pos += size;
-
- if (pos <= offset)
- len=0;
- if (pos >= offset+length)
- goto done;
-
- for (dest = svc->destinations; dest; dest = dest->next) {
- char *fwd;
-
- switch (dest->masq_flags & IP_MASQ_F_VS_FWD_MASK) {
- case IP_MASQ_F_VS_LOCALNODE:
- fwd = "Local";
- break;
- case IP_MASQ_F_VS_TUNNEL:
- fwd = "Tunnel";
- break;
- case IP_MASQ_F_VS_DROUTE:
- fwd = "Route";
- break;
- default:
- fwd = "Masq";
- }
-
- sprintf(str1, "%d.%d.%d.%d:%d",
- NIPQUAD(dest->addr), ntohs(dest->port));
- size = sprintf(buf+len,
- " -> %-21s %-7s %-6d %-10d %-10d\n",
- str1, fwd, dest->weight,
- atomic_read(&dest->connections),
- atomic_read(&dest->refcnt) - atomic_read(&dest->connections) - 1);
- len += size;
- pos += size;
-
- if (pos <= offset)
- len=0;
- if (pos >= offset+length)
- goto done;
- }
- }
- }
-
- done:
- read_unlock_bh(&__ip_vs_lock);
-
- *start = buf+len-(pos-offset); /* Start of wanted data */
- len = pos-offset;
- if (len > length)
- len = length;
- if (len < 0)
- len = 0;
-
- return len;
-}
-
-struct proc_dir_entry ip_vs_proc_entry = {
- 0, /* dynamic inode */
- 2, "vs", /* namelen and name */
- S_IFREG | S_IRUGO, /* mode */
- 1, 0, 0, 0, /* nlinks, owner, group, size */
- &proc_net_inode_operations, /* operations */
- NULL, /* get_info */
- NULL, /* fill_inode */
- NULL, NULL, NULL, /* next, parent, subdir */
- NULL, /* data */
- &ip_vs_procinfo, /* function to generate proc data */
-};
-
-#endif
-
-
-/*
- * This function encapsulates the packet in a new IP header, its destination
- * will be set to the daddr. Most code of this function is from ipip.c.
- * Usage:
- * It is called in the ip_fw_demasquerade() function. The load balancer
- * selects a real server from a cluster based on a scheduling algorithm,
- * encapsulates the packet and forwards it to the selected server. All real
- * servers are configured with "ifconfig tunl0 <Virtual IP Address> up".
- * When the server receives the encapsulated packet, it decapsulates the
- * packet, processes the request and returns the reply packets directly to
- * the client without passing through the load balancer. This can greatly
- * increase the scalability of the virtual server.
- * Returns:
- * if succeeded, return 1; otherwise, return 0.
- */
-
-int ip_vs_tunnel_xmit(struct sk_buff **skb_p, __u32 daddr)
-{
- struct sk_buff *skb = *skb_p;
- struct rtable *rt; /* Route to the other host */
- struct device *tdev; /* Device to other host */
- struct iphdr *old_iph = skb->nh.iph;
- u8 tos = old_iph->tos;
- u16 df = 0;
- struct iphdr *iph; /* Our new IP header */
- int max_headroom; /* The extra header space needed */
- u32 dst = daddr;
- u32 src = 0;
- int mtu;
-
- if (skb->protocol != __constant_htons(ETH_P_IP)) {
- IP_VS_ERR("ip_vs_tunnel_xmit(): protocol error, ETH_P_IP: %d, skb protocol: %d\n",
- __constant_htons(ETH_P_IP),skb->protocol);
- goto tx_error;
- }
-
- if (ip_route_output(&rt, dst, src, RT_TOS(tos), 0)) {
- IP_VS_ERR("ip_vs_tunnel_xmit(): route error, dst: %08X\n", dst);
- goto tx_error_icmp;
- }
- tdev = rt->u.dst.dev;
-
- mtu = rt->u.dst.pmtu - sizeof(struct iphdr);
- if (mtu < 68) {
- ip_rt_put(rt);
- IP_VS_ERR("ip_vs_tunnel_xmit(): mtu less than 68\n");
- goto tx_error;
- }
- if (skb->dst && mtu < skb->dst->pmtu)
- skb->dst->pmtu = mtu;
-
- df |= (old_iph->frag_off&__constant_htons(IP_DF));
-
- if ((old_iph->frag_off&__constant_htons(IP_DF)) && mtu < ntohs(old_iph->tot_len)) {
- icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED, htonl(mtu));
- ip_rt_put(rt);
- IP_VS_ERR("ip_vs_tunnel_xmit(): frag needed\n");
- goto tx_error;
- }
-
- skb->h.raw = skb->nh.raw;
-
- /*
- * Okay, now see if we can stuff it in the buffer as-is.
- */
- max_headroom = (((tdev->hard_header_len+15)&~15)+sizeof(struct iphdr));
-
- if (skb_headroom(skb) < max_headroom || skb_cloned(skb) || skb_shared(skb)) {
- struct sk_buff *new_skb = skb_realloc_headroom(skb, max_headroom);
- if (!new_skb) {
- ip_rt_put(rt);
- kfree_skb(skb);
- IP_VS_ERR("ip_vs_tunnel_xmit(): no memory for new_skb\n");
- return 0;
- }
- kfree_skb(skb);
- skb = new_skb;
- }
-
- skb->nh.raw = skb_push(skb, sizeof(struct iphdr));
- memset(&(IPCB(skb)->opt), 0, sizeof(IPCB(skb)->opt));
- dst_release(skb->dst);
- skb->dst = &rt->u.dst;
-
- /*
- * Push down and install the IPIP header.
- */
-
- iph = skb->nh.iph;
- iph->version = 4;
- iph->ihl = sizeof(struct iphdr)>>2;
- iph->frag_off = df;
- iph->protocol = IPPROTO_IPIP;
- iph->tos = tos;
- iph->daddr = rt->rt_dst;
- iph->saddr = rt->rt_src;
- iph->ttl = old_iph->ttl;
- iph->tot_len = htons(skb->len);
- iph->id = htons(ip_id_count++);
- ip_send_check(iph);
-
- ip_send(skb);
- return 1;
-
-tx_error_icmp:
- dst_link_failure(skb);
-tx_error:
- kfree_skb(skb);
- return 0;
-}
-
-
-/*
- * Initialize IP virtual server
- */
-__initfunc(int ip_vs_init(void))
-{
- int idx;
- for(idx = 0; idx < IP_VS_TAB_SIZE; idx++) {
- INIT_LIST_HEAD(&ip_vs_table[idx]);
- }
-#ifdef CONFIG_PROC_FS
- ip_masq_proc_register(&ip_vs_proc_entry);
-#endif
-
-#ifdef CONFIG_IP_MASQUERADE_VS_RR
- ip_vs_rr_init();
-#endif
-#ifdef CONFIG_IP_MASQUERADE_VS_WRR
- ip_vs_wrr_init();
-#endif
-#ifdef CONFIG_IP_MASQUERADE_VS_WLC
- ip_vs_wlc_init();
-#endif
-#ifdef CONFIG_IP_MASQUERADE_VS_PCC
- ip_vs_pcc_init();
-#endif
- return 0;
-}
+++ /dev/null
-/*
- * IPVS: Persistent Client Connection Scheduling module
- *
- * Version: $Id: ip_vs_pcc.c,v 1.1.2.1 1999/08/13 18:25:33 davem Exp $
- *
- * Authors: Wensong Zhang <wensong@iinchina.net>
- * Peter Kese <peter.kese@ijs.si>
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License
- * as published by the Free Software Foundation; either version
- * 2 of the License, or (at your option) any later version.
- *
- * Changes:
- *
- */
-
-#include <linux/config.h>
-#include <linux/module.h>
-#ifdef CONFIG_KMOD
-#include <linux/kmod.h>
-#endif
-#include <linux/types.h>
-#include <linux/kernel.h>
-#include <linux/errno.h>
-#include <net/ip_masq.h>
-#ifdef CONFIG_IP_MASQUERADE_MOD
-#include <net/ip_masq_mod.h>
-#endif
-#include <linux/sysctl.h>
-#include <linux/ip_fw.h>
-#include <net/ip_vs.h>
-
-/*
- * Note:
- * It is not ideal to implement the persistent client connection
- * feature as a separate scheduling module, because PCC differs from
- * scheduling modules such as RR, WRR and WLC. In fact, it would be
- * better to let the user specify which port is persistent. This will
- * be fixed in the near future.
- */
-
-/*
- * Define TEMPLATE_TIMEOUT a little larger than average connection time
- * plus MASQUERADE_EXPIRE_TCP_FIN(2*60*HZ). Because the template won't
- * be released until its last controlled masq entry gets expired.
- * If TEMPLATE_TIMEOUT is too small, the template will expire soon and
- * be put on the expire list again and again, which adds extra
- * overhead. If it is too large, the same client will always visit the
- * same server, which will make dynamic load imbalance worse.
- */
-#define TEMPLATE_TIMEOUT 6*60*HZ
-
-static int ip_vs_pcc_init_svc(struct ip_vs_service *svc)
-{
- MOD_INC_USE_COUNT;
- return 0;
-}
-
-
-static int ip_vs_pcc_done_svc(struct ip_vs_service *svc)
-{
- MOD_DEC_USE_COUNT;
- return 0;
-}
-
-
-/*
- * In fact, it is Weighted Least Connection scheduling
- */
-static struct ip_vs_dest* ip_vs_pcc_select(struct ip_vs_service *svc)
-{
- struct ip_vs_dest *dest, *least;
- int loh, doh;
-
- IP_VS_DBG("ip_vs_pcc_select(): selecting a server...\n");
-
- if (svc->destinations == NULL) return NULL;
-
- /*
- * The number of connections in TCP_FIN state is
- * dest->refcnt - dest->connections -1
-	 * We assume the overhead of processing an active connection is, on
-	 * average, fifty times that of a connection in TCP_FIN state. (This
-	 * factor of fifty may not be accurate; we will tune it later.) We use
- * the following formula to estimate the overhead:
- * dest->connections*49 + dest->refcnt
- * and the load:
- * (dest overhead) / dest->weight
- *
- * Remember -- no floats in kernel mode!!!
- * The comparison of h1*w2 > h2*w1 is equivalent to that of
- * h1/w1 > h2/w2
- * if every weight is larger than zero.
- */
-
- least = svc->destinations;
- loh = atomic_read(&least->connections)*49 + atomic_read(&least->refcnt);
-
- /*
- * Find the destination with the least load.
- */
- for (dest = least->next; dest; dest = dest->next) {
- doh = atomic_read(&dest->connections)*49 + atomic_read(&dest->refcnt);
- if (loh*dest->weight > doh*least->weight) {
- least = dest;
- loh = doh;
- }
- }
-
-	IP_VS_DBG("The selected server: connections %d refcnt %d weight %d "
- "overhead %d\n", atomic_read(&least->connections),
- atomic_read(&least->refcnt), least->weight, loh);
-
- return least;
-}
-
-
-static struct ip_masq* ip_vs_pcc_schedule(struct ip_vs_service *svc,
- struct iphdr *iph)
-{
- struct ip_masq *ms, *mst;
- struct ip_vs_dest *dest;
- const __u16 *portp = (__u16 *)&(((char *)iph)[iph->ihl*4]);
-
- /* check if the template exists */
- mst = ip_masq_in_get(0, iph->saddr, 0, svc->addr, svc->port);
- if (mst) {
- /*
- * Template masq exists...
- */
- dest = mst->dest;
- IP_VS_DBG("Template masq fwd:%c s:%s c:%lX:%x v:%lX:%x d:%lX:%x flg:%X cnt:%d\n",
- ip_vs_fwd_tag(mst), ip_masq_state_name(mst->state),
- ntohl(mst->daddr),ntohs(mst->dport),
- ntohl(mst->maddr),ntohs(mst->mport),
- ntohl(mst->saddr),ntohs(mst->sport),
- mst->flags, atomic_read(&mst->refcnt));
- } else {
- /* template does not exist, select the destination */
- dest = ip_vs_pcc_select(svc);
- if (!dest) return NULL;
-
- /* create the template */
- mst = ip_masq_new_vs(0, svc->addr, svc->port,
- dest->addr, dest->port,
- iph->saddr, 0, 0);
- if (!mst) {
- IP_VS_ERR("ip_masq_new template failed\n");
- return NULL;
- }
-
- /*
- * Bind the template masq entry with the vs dest.
- */
- ip_vs_bind_masq(mst, dest);
-
- IP_VS_DBG("Template masq created fwd:%c s:%s c:%lX:%x v:%lX:%x"
- " d:%lX:%x flg:%X cnt:%d\n",
- ip_vs_fwd_tag(mst), ip_masq_state_name(mst->state),
- ntohl(mst->daddr),ntohs(mst->dport),
- ntohl(mst->maddr),ntohs(mst->mport),
- ntohl(mst->saddr),ntohs(mst->sport),
- mst->flags, atomic_read(&mst->refcnt));
-
- }
-
- /*
- * The destination is known, and create the masq entry
- */
- ms = ip_masq_new_vs(iph->protocol,
- iph->daddr, portp[1],
- dest->addr, dest->port,
- iph->saddr, portp[0],
- 0);
- if (ms == NULL) {
- IP_VS_ERR("new_vs failed\n");
- return NULL;
- }
-
- /*
- * Bind the masq entry with the vs dest.
- */
- ip_vs_bind_masq(ms, dest);
-
- /*
- * Add its control
- */
- ip_masq_control_add(ms, mst);
-
- /*
- * Set the timeout, and put it in expire.
- */
- mst->timeout = TEMPLATE_TIMEOUT;
- ip_masq_put(mst);
-
- return ms;
-}
-
-
-static struct ip_vs_scheduler ip_vs_pcc_scheduler = {
- NULL, /* next */
- "pcc", /* name */
- ATOMIC_INIT(0), /* refcnt */
- ip_vs_pcc_init_svc, /* service initializer */
- ip_vs_pcc_done_svc, /* service done */
- ip_vs_pcc_schedule, /* select a server and create new masq entry */
-};
-
-
-__initfunc(int ip_vs_pcc_init(void))
-{
-	IP_VS_INFO("Initializing PCC scheduling\n");
-	return register_ip_vs_scheduler(&ip_vs_pcc_scheduler);
-}
-
-#ifdef MODULE
-EXPORT_NO_SYMBOLS;
-
-int init_module(void)
-{
- /* module initialization by 'request_module' */
- if(register_ip_vs_scheduler(&ip_vs_pcc_scheduler) != 0)
- return -EIO;
-
- IP_VS_INFO("PCC scheduling module loaded.\n");
-
- return 0;
-}
-
-void cleanup_module(void)
-{
- /* module cleanup by 'release_module' */
- if(unregister_ip_vs_scheduler(&ip_vs_pcc_scheduler) != 0)
- IP_VS_INFO("cannot remove PCC scheduling module\n");
- else
- IP_VS_INFO("PCC scheduling module unloaded.\n");
-}
-
-#endif /* MODULE */
+++ /dev/null
-/*
- * IPVS: Round-Robin Scheduling module
- *
- * Version: $Id: ip_vs_rr.c,v 1.1.2.1 1999/08/13 18:25:39 davem Exp $
- *
- * Authors: Wensong Zhang <wensong@iinchina.net>
- * Peter Kese <peter.kese@ijs.si>
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License
- * as published by the Free Software Foundation; either version
- * 2 of the License, or (at your option) any later version.
- *
- * Fixes/Changes:
- *
- */
-
-#include <linux/config.h>
-#include <linux/module.h>
-#ifdef CONFIG_KMOD
-#include <linux/kmod.h>
-#endif
-#include <linux/types.h>
-#include <linux/kernel.h>
-#include <linux/errno.h>
-#include <net/ip_masq.h>
-#ifdef CONFIG_IP_MASQUERADE_MOD
-#include <net/ip_masq_mod.h>
-#endif
-#include <linux/sysctl.h>
-#include <linux/ip_fw.h>
-#include <net/ip_vs.h>
-
-
-static int ip_vs_rr_init_svc(struct ip_vs_service *svc)
-{
- MOD_INC_USE_COUNT;
- return 0;
-}
-
-
-static int ip_vs_rr_done_svc(struct ip_vs_service *svc)
-{
- MOD_DEC_USE_COUNT;
- return 0;
-}
-
-
-/*
- * Round-Robin Scheduling
- */
-static struct ip_masq* ip_vs_rr_schedule(struct ip_vs_service *svc,
- struct iphdr *iph)
-{
- struct ip_vs_dest *dest;
- struct ip_masq *ms;
- const __u16 *portp = (__u16 *)&(((char *)iph)[iph->ihl*4]);
-
- IP_VS_DBG("ip_vs_rr_schedule(): Scheduling...\n");
-
- if (svc->sched_data != NULL)
- svc->sched_data = ((struct ip_vs_dest*)svc->sched_data)->next;
- if (svc->sched_data == NULL)
- svc->sched_data = svc->destinations;
- if (svc->sched_data == NULL)
- return NULL;
-
- dest = svc->sched_data;
-
- /*
- * Create a masquerading entry.
- */
- ms = ip_masq_new_vs(iph->protocol,
- iph->daddr, portp[1],
- dest->addr, dest->port,
- iph->saddr, portp[0],
- 0);
- if (ms == NULL) {
- IP_VS_ERR("ip_masq_new failed\n");
- return NULL;
- }
-
- /*
- * Bind the masq entry with the vs dest.
- */
- ip_vs_bind_masq(ms, dest);
-
- IP_VS_DBG("Masq fwd:%c s:%s c:%lX:%x v:%lX:%x d:%lX:%x flg:%X cnt:%d\n",
- ip_vs_fwd_tag(ms), ip_masq_state_name(ms->state),
- ntohl(ms->daddr),ntohs(ms->dport),
- ntohl(ms->maddr),ntohs(ms->mport),
- ntohl(ms->saddr),ntohs(ms->sport),
- ms->flags, atomic_read(&ms->refcnt));
-
- return ms;
-}
-
-
-static struct ip_vs_scheduler ip_vs_rr_scheduler = {
- NULL, /* next */
- "rr", /* name */
- ATOMIC_INIT(0), /* refcnt */
- ip_vs_rr_init_svc, /* service initializer */
- ip_vs_rr_done_svc, /* service done */
- ip_vs_rr_schedule, /* select a server and create new masq entry */
-};
-
-
-__initfunc(int ip_vs_rr_init(void))
-{
- IP_VS_INFO("Initializing RR scheduling\n");
-	return register_ip_vs_scheduler(&ip_vs_rr_scheduler);
-}
-
-#ifdef MODULE
-EXPORT_NO_SYMBOLS;
-
-int init_module(void)
-{
- /* module initialization by 'request_module' */
- if(register_ip_vs_scheduler(&ip_vs_rr_scheduler) != 0)
- return -EIO;
-
- IP_VS_INFO("RR scheduling module loaded.\n");
-
- return 0;
-}
-
-void cleanup_module(void)
-{
- /* module cleanup by 'release_module' */
- if(unregister_ip_vs_scheduler(&ip_vs_rr_scheduler) != 0)
- IP_VS_INFO("cannot remove RR scheduling module\n");
- else
- IP_VS_INFO("RR scheduling module unloaded.\n");
-}
-
-#endif /* MODULE */
+++ /dev/null
-/*
- * IPVS: Weighted Least-Connection Scheduling module
- *
- * Version: $Id: ip_vs_wlc.c,v 1.1.2.1 1999/08/13 18:25:44 davem Exp $
- *
- * Authors: Wensong Zhang <wensong@iinchina.net>
- * Peter Kese <peter.kese@ijs.si>
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License
- * as published by the Free Software Foundation; either version
- * 2 of the License, or (at your option) any later version.
- *
- * Changes:
- *
- */
-
-#include <linux/config.h>
-#include <linux/module.h>
-#ifdef CONFIG_KMOD
-#include <linux/kmod.h>
-#endif
-#include <linux/types.h>
-#include <linux/kernel.h>
-#include <linux/errno.h>
-#include <net/ip_masq.h>
-#ifdef CONFIG_IP_MASQUERADE_MOD
-#include <net/ip_masq_mod.h>
-#endif
-#include <linux/sysctl.h>
-#include <linux/ip_fw.h>
-#include <net/ip_vs.h>
-
-
-static int ip_vs_wlc_init_svc(struct ip_vs_service *svc)
-{
- MOD_INC_USE_COUNT;
- return 0;
-}
-
-
-static int ip_vs_wlc_done_svc(struct ip_vs_service *svc)
-{
- MOD_DEC_USE_COUNT;
- return 0;
-}
-
-
-/*
- * Weighted Least Connection scheduling
- */
-static struct ip_masq* ip_vs_wlc_schedule(struct ip_vs_service *svc,
- struct iphdr *iph)
-{
- struct ip_masq *ms;
- struct ip_vs_dest *dest, *least;
- int loh, doh;
- const __u16 *portp = (__u16 *)&(((char *)iph)[iph->ihl*4]);
-
- IP_VS_DBG("ip_vs_wlc_schedule(): Scheduling...\n");
-
- if (svc->destinations == NULL) return NULL;
-
- /*
- * The number of connections in TCP_FIN state is
- * dest->refcnt - dest->connections -1
-	 * We assume the overhead of processing an active connection is, on
-	 * average, fifty times that of a connection in TCP_FIN state. (This
-	 * factor of fifty may not be accurate; we will tune it later.) We use
- * the following formula to estimate the overhead:
- * dest->connections*49 + dest->refcnt
- * and the load:
- * (dest overhead) / dest->weight
- *
- * Remember -- no floats in kernel mode!!!
- * The comparison of h1*w2 > h2*w1 is equivalent to that of
- * h1/w1 > h2/w2
- * if every weight is larger than zero.
- */
-
- least = svc->destinations;
- loh = atomic_read(&least->connections)*49 + atomic_read(&least->refcnt);
-
- /*
- * Find the destination with the least load.
- */
- for (dest = least->next; dest; dest = dest->next) {
- doh = atomic_read(&dest->connections)*49 + atomic_read(&dest->refcnt);
- if (loh*dest->weight > doh*least->weight) {
- least = dest;
- loh = doh;
- }
- }
-
- IP_VS_DBG("The selected server: connections %d refcnt %d weight %d "
- "overhead %d\n", atomic_read(&least->connections),
- atomic_read(&least->refcnt), least->weight, loh);
-
- /*
- * Create a masquerading entry.
- */
- ms = ip_masq_new_vs(iph->protocol,
- iph->daddr, portp[1],
- least->addr, least->port,
- iph->saddr, portp[0],
- 0);
- if (ms == NULL) {
- IP_VS_ERR("ip_masq_new failed\n");
- return NULL;
- }
-
- /*
- * Bind the masq entry with the vs dest.
- */
- ip_vs_bind_masq(ms, least);
-
- IP_VS_DBG("Masq fwd:%c s:%s c:%lX:%x v:%lX:%x d:%lX:%x flg:%X cnt:%d\n",
- ip_vs_fwd_tag(ms), ip_masq_state_name(ms->state),
- ntohl(ms->daddr),ntohs(ms->dport),
- ntohl(ms->maddr),ntohs(ms->mport),
- ntohl(ms->saddr),ntohs(ms->sport),
- ms->flags, atomic_read(&ms->refcnt));
-
- return ms;
-}
-
-
-static struct ip_vs_scheduler ip_vs_wlc_scheduler = {
- NULL, /* next */
- "wlc", /* name */
- ATOMIC_INIT(0), /* refcnt */
- ip_vs_wlc_init_svc, /* service initializer */
- ip_vs_wlc_done_svc, /* service done */
- ip_vs_wlc_schedule, /* select a server and create new masq entry */
-};
-
-
-__initfunc(int ip_vs_wlc_init(void))
-{
- IP_VS_INFO("Initializing WLC scheduling\n");
-	return register_ip_vs_scheduler(&ip_vs_wlc_scheduler);
-}
-
-#ifdef MODULE
-EXPORT_NO_SYMBOLS;
-
-int init_module(void)
-{
- /* module initialization by 'request_module' */
- if(register_ip_vs_scheduler(&ip_vs_wlc_scheduler) != 0)
- return -EIO;
-
- IP_VS_INFO("WLC scheduling module loaded.\n");
-
- return 0;
-}
-
-void cleanup_module(void)
-{
- /* module cleanup by 'release_module' */
- if(unregister_ip_vs_scheduler(&ip_vs_wlc_scheduler) != 0)
- IP_VS_INFO("cannot remove WLC scheduling module\n");
- else
- IP_VS_INFO("WLC scheduling module unloaded.\n");
-}
-
-#endif /* MODULE */
+++ /dev/null
-/*
- * IPVS: Weighted Round-Robin Scheduling module
- *
- * Version: $Id: ip_vs_wrr.c,v 1.1.2.1 1999/08/13 18:25:49 davem Exp $
- *
- * Authors: Wensong Zhang <wensong@iinchina.net>
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License
- * as published by the Free Software Foundation; either version
- * 2 of the License, or (at your option) any later version.
- *
- * Changes:
- *
- */
-
-#include <linux/config.h>
-#include <linux/module.h>
-#ifdef CONFIG_KMOD
-#include <linux/kmod.h>
-#endif
-#include <linux/types.h>
-#include <linux/kernel.h>
-#include <linux/errno.h>
-#include <net/ip_masq.h>
-#ifdef CONFIG_IP_MASQUERADE_MOD
-#include <net/ip_masq_mod.h>
-#endif
-#include <linux/sysctl.h>
-#include <linux/ip_fw.h>
-#include <net/ip_vs.h>
-
-/*
- * current destination pointer for weighted round-robin scheduling
- */
-struct ip_vs_wrr_mark {
- struct ip_vs_dest *cdest; /* current destination pointer */
- int cw; /* current weight */
-};
-
-
-static int ip_vs_wrr_init_svc(struct ip_vs_service *svc)
-{
- /*
- * Allocate the mark variable for WRR scheduling
- */
- svc->sched_data = kmalloc(sizeof(struct ip_vs_wrr_mark), GFP_ATOMIC);
-
- if (svc->sched_data == NULL) {
- IP_VS_ERR("ip_vs_wrr_init_svc(): no memory\n");
-		return -ENOMEM;
- }
- memset(svc->sched_data, 0, sizeof(struct ip_vs_wrr_mark));
-
- MOD_INC_USE_COUNT;
- return 0;
-}
-
-
-static int ip_vs_wrr_done_svc(struct ip_vs_service *svc)
-{
- /*
- * Release the mark variable
- */
- kfree_s(svc->sched_data, sizeof(struct ip_vs_wrr_mark));
-
- MOD_DEC_USE_COUNT;
- return 0;
-}
-
-
-int ip_vs_wrr_max_weight(struct ip_vs_dest *destinations)
-{
- struct ip_vs_dest *dest;
- int weight = 0;
-
- for (dest=destinations; dest; dest=dest->next) {
- if (dest->weight > weight)
- weight = dest->weight;
- }
-
- return weight;
-}
-
-
-/*
- * Weighted Round-Robin Scheduling
- */
-static struct ip_masq* ip_vs_wrr_schedule(struct ip_vs_service *svc,
- struct iphdr *iph)
-{
- struct ip_masq *ms;
- const __u16 *portp = (__u16 *)&(((char *)iph)[iph->ihl*4]);
- struct ip_vs_wrr_mark *mark = svc->sched_data;
- struct ip_vs_dest *dest;
-
- IP_VS_DBG("ip_vs_wrr_schedule(): Scheduling...\n");
-
- if (svc->destinations == NULL) return NULL;
-
- /*
-	 * This loop will always terminate, because 0 < mark->cw <= max_weight,
- * and at least one server has its weight equal to max_weight.
- */
- while (1) {
- if (mark->cdest == NULL) {
- mark->cdest = svc->destinations;
- mark->cw--;
- if (mark->cw <= 0) {
- mark->cw = ip_vs_wrr_max_weight(svc->destinations);
- /*
-				 * Still zero, which means no available servers.
- */
- if (mark->cw == 0) {
- IP_VS_INFO("ip_vs_wrr_schedule(): no available servers\n");
- return NULL;
- }
- }
- }
- else mark->cdest = mark->cdest->next;
-
- if(mark->cdest && (mark->cdest->weight >= mark->cw))
- break;
- }
-
- dest = mark->cdest;
-
- /*
- * Create a masquerading entry.
- */
- ms = ip_masq_new_vs(iph->protocol,
- iph->daddr, portp[1],
- dest->addr, dest->port,
- iph->saddr, portp[0],
- 0);
- if (ms == NULL) {
-		IP_VS_ERR("ip_masq_new_vs failed\n");
- return NULL;
- }
-
- /*
- * Bind the masq entry with the vs dest.
- */
- ip_vs_bind_masq(ms, dest);
-
- IP_VS_DBG("Masq fwd:%c s:%s c:%lX:%x v:%lX:%x d:%lX:%x flg:%X cnt:%d\n",
- ip_vs_fwd_tag(ms), ip_masq_state_name(ms->state),
- ntohl(ms->daddr),ntohs(ms->dport),
- ntohl(ms->maddr),ntohs(ms->mport),
- ntohl(ms->saddr),ntohs(ms->sport),
- ms->flags, atomic_read(&ms->refcnt));
-
- return ms;
-}
-
-
-static struct ip_vs_scheduler ip_vs_wrr_scheduler = {
- NULL, /* next */
- "wrr", /* name */
- ATOMIC_INIT(0), /* refcnt */
- ip_vs_wrr_init_svc, /* service initializer */
- ip_vs_wrr_done_svc, /* service done */
- ip_vs_wrr_schedule, /* select a server and create new masq entry */
-};
-
-
-__initfunc(int ip_vs_wrr_init(void))
-{
-	IP_VS_INFO("Initializing WRR scheduling\n");
-	return register_ip_vs_scheduler(&ip_vs_wrr_scheduler);
-}
-
-#ifdef MODULE
-EXPORT_NO_SYMBOLS;
-
-int init_module(void)
-{
- /* module initialization by 'request_module' */
- if(register_ip_vs_scheduler(&ip_vs_wrr_scheduler) != 0)
- return -EIO;
-
- IP_VS_INFO("WRR scheduling module loaded.\n");
-
- return 0;
-}
-
-void cleanup_module(void)
-{
- /* module cleanup by 'release_module' */
- if(unregister_ip_vs_scheduler(&ip_vs_wrr_scheduler) != 0)
- IP_VS_INFO("cannot remove WRR scheduling module\n");
- else
- IP_VS_INFO("WRR scheduling module unloaded.\n");
-}
-
-#endif /* MODULE */
+++ /dev/null
-ALaw/uLaw sample formats
-------------------------
-
-This driver does not support the ALaw/uLaw sample formats.
-ALaw is the default mode when opening a sound device
-using OSS/Free. The reason for the lack of support is
-that the hardware does not support these formats, and adding
-conversion routines to the kernel would lead to very ugly
-code in the presence of the mmap interface to the driver.
-And since xquake uses mmap, mmap is considered important :-)
-and no sane application uses ALaw/uLaw these days anyway.
-In short, playing a Sun .au file as follows:
-
-cat my_file.au > /dev/dsp
-
-does not work. Instead, you may use the play script from
-Chris Bagwell's sox-12.14 package (or later, available from the URL
-below) to play many different audio file formats.
-The script automatically determines the audio format
-and performs audio conversions if necessary.
-http://home.sprynet.com/sprynet/cbagwell/projects.html
-
-
-Blocking vs. nonblocking IO
----------------------------
-
-Unlike OSS/Free this driver honours the O_NONBLOCK file flag
-not only during open, but also during read and write.
-This is an effort to make the sound driver interface more
-regular. Timidity has problems with this; a patch
-is available from http://www.ife.ee.ethz.ch/~sailer/linux/pciaudio.html.
-(The patched Timidity also runs on OSS/Free.)
-
-
-MIDI UART
----------
-
-The driver provides a simple MIDI UART interface;
-no ioctls are supported.
-
-
-MIDI synthesizer
-----------------
-
-The card has an OPL compatible FM synthesizer.
-
-Thomas Sailer
-sailer@ife.ee.ethz.ch