sending the processor to sleep and saving power.
CONFIG_ACPI
- ACPI/OSPM support for Linux is currently under development. As such,
- this support is preliminary and EXPERIMENTAL. Configuring ACPI
- support enables kernel interfaces that allow higher level software
- (OSPM) to manipulate ACPI defined hardware and software interfaces,
- including the evaluation of ACPI control methods. If unsure, choose
- N here. Note, this option will enlarge your kernel by about 120K.
-
- This support requires an ACPI compliant platform (hardware/firmware).
- If both ACPI and Advanced Power Management (APM) support are
- configured, whichever is loaded first shall be used.
-
- This code DOES NOT currently provide a complete OSPM implementation
- -- it has not yet reached APM's level of functionality. When fully
- implemented, Linux ACPI/OSPM will provide a more robust functional
- replacement for legacy configuration and power management
- interfaces, including the Plug-and-Play BIOS specification (PnP
- BIOS), the Multi-Processor Specification (MPS), and the Advanced
- Power Management specification (APM).
-
- Linux support for ACPI/OSPM is based on Intel Corporation's ACPI
- Component Architecture (ACPI CA). The latest ACPI CA source code,
- documentation, debug builds, and implementation status information
- can be downloaded from:
- <http://developer.intel.com/technology/iapc/acpi/downloads.htm>.
-
- The ACPI Sourceforge project may also be of interest:
- <http://sf.net/projects/acpi/>
+ Advanced Configuration and Power Interface (ACPI) support for
+ Linux requires an ACPI compliant platform (hardware/firmware),
+ and assumes the presence of OS-directed configuration and power
+ management (OSPM) software.
+
+ Linux ACPI provides a robust functional replacement for several
+ legacy configuration and power management interfaces, including
+ the Plug-and-Play BIOS specification (PnP BIOS), the
+ MultiProcessor Specification (MPS), and the Advanced Power
+ Management (APM) specification. If both ACPI and APM support
+ are configured, whichever is loaded first shall be used.
+
+ The ACPI SourceForge project contains the latest source code,
+ documentation, tools, mailing list subscription, and other
+ information. This project is available at:
+ <http://sourceforge.net/projects/acpi>
+
+ Linux support for ACPI is based on Intel Corporation's ACPI
+ Component Architecture (ACPI CA). For more information see:
+ <http://developer.intel.com/technology/iapc/acpi>
+
+ ACPI is an open industry specification co-developed by Compaq,
+ Intel, Microsoft, Phoenix, and Toshiba. The specification is
+ available at:
+ <http://www.acpi.info>
CONFIG_X86_MSR
This device gives privileged processes access to the x86
with major 203 and minors 0 to 31 for /dev/cpu/0/cpuid to
/dev/cpu/31/cpuid.
-CONFIG_SOUND
- If you have a sound card in your computer, i.e. if it can say more
- than an occasional beep, say Y. Be sure to have all the information
- about your sound card and its configuration down (I/O port,
- interrupt and DMA channel), because you will be asked for it.
-
- You want to read the Sound-HOWTO, available from
- <http://www.linuxdoc.org/docs.html#howto>. General information about
- the modular sound system is contained in the files
- <file:Documentation/sound/Introduction>. The file
- <file:Documentation/sound/README.OSS> contains some slightly
- outdated but still useful information as well.
-
- If you have a PnP sound card and you want to configure it at boot
- time using the ISA PnP tools (read
- <http://www.roestock.demon.co.uk/isapnptools/>), then you need to
- compile the sound card support as a module ( = code which can be
- inserted in and removed from the running kernel whenever you want)
- and load that module after the PnP configuration is finished. To do
- this, say M here and read <file:Documentation/modules.txt> as well
- as <file:Documentation/sound/README.modules>; the module will be
- called soundcore.o.
-
- I'm told that even without a sound card, you can make your computer
- say more than an occasional beep, by programming the PC speaker.
- Kernel patches and supporting utilities to do that are in the pcsp
- package, available at <ftp://ftp.infradead.org/pub/pcsp/>.
-
CONFIG_PREEMPT
This option reduces the latency of the kernel when reacting to
real-time or interactive events by allowing a low priority process to
klogd/syslogd or the X server. You should normally say N here, unless
you want to debug such a crash.
+CONFIG_X86_MCE_NONFATAL
+ Enabling this feature starts a timer that fires every 5 seconds and
+ polls the machine check registers to see if anything happened.
+ Non-fatal problems are corrected automatically (but still logged).
+ Disable this if you don't want to see these messages.
+ The messages this option prints out may indicate dying or
+ out-of-spec (i.e. overclocked) hardware.
+ This option only does something on hardware with Intel P6 style MCE
+ (Pentium Pro and above, AMD Athlon/Duron).
+
# 20000913 Pavel Machek <pavel@suse.cz>
# Converted for x86_64 architecture
# 20010105 Andi Kleen, add IA32 compiler.
+# ....and later removed it again....
#
-# $Id: Makefile,v 1.28 2001/06/29 17:47:43 aj Exp $
-
+# $Id: Makefile,v 1.31 2002/03/22 15:56:07 ak Exp $
#
-# boot system currently needs IA32 tools to link (to be fixed)
+# Early bootup linking needs 32-bit tools. You can either use real 32-bit
+# tools here or 64-bit tools switched to 32-bit mode.
#
-# Change this to your i386 compiler/binutils
-IA32_PREFIX := /usr/bin/
-IA32_CC := $(IA32_PREFIX)gcc -O2 -fomit-frame-pointer -nostdinc -I $(HPATH)
-IA32_LD := $(IA32_PREFIX)ld
-IA32_AS := $(IA32_PREFIX)gcc -D__ASSEMBLY__ -traditional -c -nostdinc -I $(HPATH)
-IA32_OBJCOPY := $(IA32_PREFIX)objcopy
-IA32_CPP := $(IA32_PREFIX)gcc -E
+IA32_CC := $(CROSS_COMPILE)gcc -m32 -O2 -fomit-frame-pointer -nostdinc -I $(HPATH)
+IA32_LD := $(CROSS_COMPILE)ld -m elf_i386
+IA32_AS := $(CROSS_COMPILE)gcc -m32 -Wa,--32 -D__ASSEMBLY__ -traditional -c -nostdinc -I $(HPATH)
+IA32_OBJCOPY := $(CROSS_COMPILE)objcopy
+IA32_CPP := $(CROSS_COMPILE)gcc -m32 -E
export IA32_CC IA32_LD IA32_AS IA32_OBJCOPY IA32_CPP
LDFLAGS=-e stext
LINKFLAGS =-T $(TOPDIR)/arch/x86_64/vmlinux.lds $(LDFLAGS)
-CFLAGS += $(shell if $(CC) -mno-red-zone -S -o /dev/null -xc /dev/null >/dev/null 2>&1; then echo "-mno-red-zone"; fi )
+CFLAGS += -mno-red-zone
CFLAGS += -mcmodel=kernel
CFLAGS += -pipe
-# generates worse code, but makes the assembly much more readable:
CFLAGS += -fno-reorder-blocks
-# work around early gcc 3.1 bugs. Later snapshots should this already fixed.
+# needed for later gcc 3.1
+CFLAGS += -finline-limit=2000
+# needed for earlier gcc 3.1
CFLAGS += -fno-strength-reduce
-# make sure all inline functions are inlined
-CFLAGS += -finline-limit=3000
-
#CFLAGS += -g
# prevent gcc from keeping the stack 16 byte aligned (FIXME)
CORE_FILES += arch/x86_64/mm/mm.o
LIBS := $(TOPDIR)/arch/x86_64/lib/lib.a $(LIBS)
-CLEAN_FILES += include/asm-x86_64/offset.h
-
ifdef CONFIG_IA32_EMULATION
SUBDIRS += arch/x86_64/ia32
CORE_FILES += arch/x86_64/ia32/ia32.o
vmlinux: arch/x86_64/vmlinux.lds
-checkoffset: FORCE
- make -C arch/$(ARCH)/tools $(TOPDIR)/include/asm-x86_64/offset.h
-
FORCE: ;
.PHONY: zImage bzImage compressed zlilo bzlilo zdisk bzdisk install \
clean archclean archmrproper archdep checkoffset
+checkoffset: FORCE
+ make -C arch/$(ARCH)/tools $(TOPDIR)/include/asm-x86_64/offset.h
+
bzImage: checkoffset vmlinux
@$(MAKEBOOT) bzImage
tmp:
@$(MAKEBOOT) BOOTIMAGE=bzImage zlilo
+
bzlilo: checkoffset vmlinux
@$(MAKEBOOT) BOOTIMAGE=bzImage zlilo
-zdisk: checkoffset vmlinux
- @$(MAKEBOOT) BOOTIMAGE=zImage zdisk
-
bzdisk: checkoffset vmlinux
@$(MAKEBOOT) BOOTIMAGE=bzImage zdisk
archclean:
@$(MAKEBOOT) clean
- $(MAKE) -C $(TOPDIR)/arch/x86_64/tools clean
+ @$(MAKE) -C $(TOPDIR)/arch/x86_64/tools clean
archmrproper:
rm -f $(TOPDIR)/arch/x86_64/tools/offset.h
rep
stosb
is_disk1:
-# check for Micro Channel (MCA) bus
- movw %cs, %ax # aka SETUPSEG
- subw $DELTA_INITSEG, %ax # aka INITSEG
- movw %ax, %ds
- xorw %ax, %ax
- movw %ax, (0xa0) # set table length to 0
- movb $0xc0, %ah
- stc
- int $0x15 # moves feature table to es:bx
- jc no_mca
- pushw %ds
- movw %es, %ax
- movw %ax, %ds
- movw %cs, %ax # aka SETUPSEG
- subw $DELTA_INITSEG, %ax # aka INITSEG
- movw %ax, %es
- movw %bx, %si
- movw $0xa0, %di
- movw (%si), %cx
- addw $2, %cx # table length is a short
- cmpw $0x10, %cx
- jc sysdesc_ok
-
- movw $0x10, %cx # we keep only first 16 bytes
-sysdesc_ok:
- rep
- movsb
- popw %ds
-no_mca:
# Check for PS/2 pointing device
movw %cs, %ax # aka SETUPSEG
subw $DELTA_INITSEG, %ax # aka INITSEG
movw $0xAA, (0x1ff) # device present
no_psmouse:
-#if defined(CONFIG_APM) || defined(CONFIG_APM_MODULE)
-# Then check for an APM BIOS...
- # %ds points to the bootsector
- movw $0, 0x40 # version = 0 means no APM BIOS
- movw $0x05300, %ax # APM BIOS installation check
- xorw %bx, %bx
- int $0x15
- jc done_apm_bios # Nope, no APM BIOS
-
- cmpw $0x0504d, %bx # Check for "PM" signature
- jne done_apm_bios # No signature, no APM BIOS
-
- andw $0x02, %cx # Is 32 bit supported?
- je done_apm_bios # No 32-bit, no (good) APM BIOS
-
- movw $0x05304, %ax # Disconnect first just in case
- xorw %bx, %bx
- int $0x15 # ignore return code
- movw $0x05303, %ax # 32 bit connect
- xorl %ebx, %ebx
- xorw %cx, %cx # paranoia :-)
- xorw %dx, %dx # ...
- xorl %esi, %esi # ...
- xorw %di, %di # ...
- int $0x15
- jc no_32_apm_bios # Ack, error.
-
- movw %ax, (66) # BIOS code segment
- movl %ebx, (68) # BIOS entry point offset
- movw %cx, (72) # BIOS 16 bit code segment
- movw %dx, (74) # BIOS data segment
- movl %esi, (78) # BIOS code segment lengths
- movw %di, (82) # BIOS data segment length
-# Redo the installation check as the 32 bit connect
-# modifies the flags returned on some BIOSs
- movw $0x05300, %ax # APM BIOS installation check
- xorw %bx, %bx
- xorw %cx, %cx # paranoia
- int $0x15
- jc apm_disconnect # error -> shouldn't happen
-
- cmpw $0x0504d, %bx # check for "PM" signature
- jne apm_disconnect # no sig -> shouldn't happen
-
- movw %ax, (64) # record the APM BIOS version
- movw %cx, (76) # and flags
- jmp done_apm_bios
-
-apm_disconnect: # Tidy up
- movw $0x05304, %ax # Disconnect
- xorw %bx, %bx
- int $0x15 # ignore return code
-
- jmp done_apm_bios
-
-no_32_apm_bios:
- andw $0xfffd, (76) # remove 32 bit support bit
-done_apm_bios:
-#endif
-
# Now we want to move to protected mode ...
cmpw $0, %cs:realmode_swtch
jz rmodeswtch_normal
- lcall *%cs:realmode_swtch
+ lcall %cs:realmode_swtch
jmp rmodeswtch_end
#define VIDEO_80x30 0x0f05
#define VIDEO_80x34 0x0f06
#define VIDEO_80x60 0x0f07
-#define VIDEO_GFX_HACK 0x0f08
#define VIDEO_LAST_SPECIAL 0x0f09
/* Video modes given by resolution */
movw $0x503c, force_size
jmp setvde
-# Special hack for ThinkPad graphics
set_gfx:
-#ifdef CONFIG_VIDEO_GFX_HACK
- movw $VIDEO_GFX_BIOS_AX, %ax
- movw $VIDEO_GFX_BIOS_BX, %bx
- int $0x10
- movw $VIDEO_GFX_DUMMY_RESOLUTION, force_size
- stc
-#endif
ret
#ifdef CONFIG_VIDEO_RETAIN
.word 0x5022 # 80x34
.word VIDEO_80x60
.word 0x503c # 80x60
-#ifdef CONFIG_VIDEO_GFX_HACK
- .word VIDEO_GFX_HACK
- .word VIDEO_GFX_DUMMY_RESOLUTION
-#endif
vga_modes_end:
# Detect VESA modes.
define_bool CONFIG_SBUS n
define_bool CONFIG_UID16 y
-define_bool CONFIG_RWSEM_GENERIC_SPINLOCK n
-define_bool CONFIG_RWSEM_XCHGADD_ALGORITHM y
-
+define_bool CONFIG_RWSEM_GENERIC_SPINLOCK y
+define_bool CONFIG_RWSEM_XCHGADD_ALGORITHM n
+define_bool CONFIG_X86_CMPXCHG y
source init/Config.in
mainmenu_option next_comment
comment 'Processor type and features'
choice 'Processor family' \
- "Clawhammer CONFIG_MK8" Clawhammer
+ "AMD-Hammer CONFIG_MK8" CONFIG_MK8
#
# Define implied options from the CPU selection here
define_int CONFIG_X86_L1_CACHE_SHIFT 6
define_bool CONFIG_X86_TSC y
define_bool CONFIG_X86_GOOD_APIC y
-define_bool CONFIG_X86_CMPXCHG y
tristate '/dev/cpu/*/msr - Model-specific register support' CONFIG_X86_MSR
tristate '/dev/cpu/*/cpuid - CPU information support' CONFIG_X86_CPUID
define_bool CONFIG_MATH_EMULATION n
define_bool CONFIG_MCA n
define_bool CONFIG_EISA n
+define_bool CONFIG_X86_IO_APIC y
+define_bool CONFIG_X86_LOCAL_APIC y
-bool 'MTRR (Memory Type Range Register) support' CONFIG_MTRR
+#currently broken:
+#bool 'MTRR (Memory Type Range Register) support' CONFIG_MTRR
bool 'Symmetric multi-processing support' CONFIG_SMP
bool 'Preemptible Kernel' CONFIG_PREEMPT
-# currently doesn't boot without hacks. probably simulator bug.
-#if [ "$CONFIG_SMP" != "y" ]; then
-# bool 'APIC and IO-APIC support on uniprocessors' CONFIG_X86_UP_IOAPIC
-# if [ "$CONFIG_X86_UP_IOAPIC" = "y" ]; then
-# define_bool CONFIG_X86_IO_APIC y
-# define_bool CONFIG_X86_LOCAL_APIC y
-# fi
-#fi
if [ "$CONFIG_SMP" = "y" -a "$CONFIG_X86_CMPXCHG" = "y" ]; then
define_bool CONFIG_HAVE_DEC_LOCK y
fi
+
+define_bool CONFIG_X86_MCE y
+bool 'Check for non-fatal machine check errors' CONFIG_X86_MCE_NONFATAL $CONFIG_X86_MCE
+
endmenu
mainmenu_option next_comment
+
comment 'General options'
-if [ "$CONFIG_SMP" = "y" ]; then
- define_bool CONFIG_X86_IO_APIC y
- define_bool CONFIG_X86_LOCAL_APIC y
-fi
+source drivers/acpi/Config.in
+
bool 'PCI support' CONFIG_PCI
if [ "$CONFIG_PCI" = "y" ]; then
+# x86-64 doesn't support PCI BIOS access from long mode so always go direct.
define_bool CONFIG_PCI_DIRECT y
fi
if [ "$CONFIG_PROC_FS" = "y" ]; then
define_bool CONFIG_KCORE_ELF y
fi
-# We probably are not going to support a.out, are we? Or should we support a.out in i386 compatibility mode?
- #tristate 'Kernel support for a.out binaries' CONFIG_BINFMT_AOUT
- tristate 'Kernel support for ELF binaries' CONFIG_BINFMT_ELF
+#tristate 'Kernel support for a.out binaries' CONFIG_BINFMT_AOUT
+tristate 'Kernel support for ELF binaries' CONFIG_BINFMT_ELF
tristate 'Kernel support for MISC binaries' CONFIG_BINFMT_MISC
bool 'Power Management support' CONFIG_PM
bool 'IA32 Emulation' CONFIG_IA32_EMULATION
-if [ "$CONFIG_EXPERIMENTAL" = "y" ]; then
- dep_bool ' ACPI support' CONFIG_ACPI $CONFIG_PM
- if [ "$CONFIG_ACPI" != "n" ]; then
- source drivers/acpi/Config.in
- fi
-fi
-
endmenu
source drivers/mtd/Config.in
source drivers/input/Config.in
source drivers/char/Config.in
-if [ "$CONFIG_EXPERIMENTAL" = "y" ]; then
- source net/bluetooth/Config.in
-fi
-
source drivers/misc/Config.in
source drivers/media/Config.in
source drivers/usb/Config.in
+if [ "$CONFIG_EXPERIMENTAL" = "y" ]; then
+ source net/bluetooth/Config.in
+fi
+
mainmenu_option next_comment
comment 'Kernel hacking'
# bool ' Memory mapped I/O debugging' CONFIG_DEBUG_IOVIRT
bool ' Magic SysRq key' CONFIG_MAGIC_SYSRQ
bool ' Spinlock debugging' CONFIG_DEBUG_SPINLOCK
-# bool ' Early printk' CONFIG_EARLY_PRINTK
+ bool ' Early printk' CONFIG_EARLY_PRINTK
bool ' Additional run-time checks' CONFIG_CHECKING
-fi
-bool 'Simnow environment (disables time-consuming things)' CONFIG_SIMNOW
+ bool ' Debug __init statements' CONFIG_INIT_DEBUG
#if [ "$CONFIG_SERIAL_CONSOLE" = "y" ]; then
# bool 'Early serial console (ttyS0)' CONFIG_EARLY_SERIAL_CONSOLE
#fi
+fi
endmenu
source lib/Config.in
CONFIG_ISA=y
# CONFIG_SBUS is not set
CONFIG_UID16=y
-# CONFIG_RWSEM_GENERIC_SPINLOCK is not set
-CONFIG_RWSEM_XCHGADD_ALGORITHM=y
+CONFIG_RWSEM_GENERIC_SPINLOCK=y
+# CONFIG_RWSEM_XCHGADD_ALGORITHM is not set
+CONFIG_X86_CMPXCHG=y
#
# Code maturity level options
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_X86_TSC=y
CONFIG_X86_GOOD_APIC=y
-CONFIG_X86_CMPXCHG=y
CONFIG_X86_MSR=y
CONFIG_X86_CPUID=y
# CONFIG_MATH_EMULATION is not set
# CONFIG_MCA is not set
# CONFIG_EISA is not set
-CONFIG_MTRR=y
-# CONFIG_SMP is not set
+CONFIG_X86_IO_APIC=y
+CONFIG_X86_LOCAL_APIC=y
+CONFIG_SMP=y
# CONFIG_PREEMPT is not set
+CONFIG_HAVE_DEC_LOCK=y
+CONFIG_X86_MCE=y
+# CONFIG_X86_MCE_NONFATAL is not set
#
# General options
#
+
+#
+# ACPI Support
+#
+# CONFIG_ACPI is not set
CONFIG_PCI=y
CONFIG_PCI_DIRECT=y
# CONFIG_PCI_NAMES is not set
# CONFIG_BINFMT_MISC is not set
CONFIG_PM=y
CONFIG_IA32_EMULATION=y
-# CONFIG_ACPI is not set
#
# Memory Technology Devices (MTD)
CONFIG_IDE=y
#
-# IDE, ATA and ATAPI Block devices
+# ATA and ATAPI Block devices
#
CONFIG_BLK_DEV_IDE=y
# CONFIG_BLK_DEV_HD_IDE is not set
# CONFIG_BLK_DEV_HD is not set
CONFIG_BLK_DEV_IDEDISK=y
-# CONFIG_IDEDISK_MULTI_MODE is not set
+CONFIG_IDEDISK_MULTI_MODE=y
# CONFIG_IDEDISK_STROKE is not set
# CONFIG_BLK_DEV_IDEDISK_VENDOR is not set
# CONFIG_BLK_DEV_IDEDISK_FUJITSU is not set
# CONFIG_BLK_DEV_COMMERIAL is not set
# CONFIG_BLK_DEV_TIVO is not set
# CONFIG_BLK_DEV_IDECS is not set
-# CONFIG_BLK_DEV_IDECD is not set
+CONFIG_BLK_DEV_IDECD=y
# CONFIG_BLK_DEV_IDETAPE is not set
# CONFIG_BLK_DEV_IDEFLOPPY is not set
# CONFIG_BLK_DEV_IDESCSI is not set
#
-# IDE chipset support/bugfixes
+# IDE chipset support
#
# CONFIG_BLK_DEV_CMD640 is not set
# CONFIG_BLK_DEV_CMD640_ENHANCED is not set
# CONFIG_BLK_DEV_ISAPNP is not set
# CONFIG_BLK_DEV_RZ1000 is not set
-# CONFIG_BLK_DEV_IDEPCI is not set
+CONFIG_BLK_DEV_IDEPCI=y
+# CONFIG_BLK_DEV_OFFBOARD is not set
+# CONFIG_IDEPCI_SHARE_IRQ is not set
+CONFIG_BLK_DEV_IDEDMA_PCI=y
+CONFIG_IDEDMA_PCI_AUTO=y
+# CONFIG_IDEDMA_ONLYDISK is not set
+CONFIG_BLK_DEV_IDEDMA=y
+# CONFIG_BLK_DEV_IDE_TCQ is not set
+# CONFIG_BLK_DEV_IDE_TCQ_DEFAULT is not set
+# CONFIG_IDEDMA_PCI_WIP is not set
+# CONFIG_IDEDMA_NEW_DRIVE_LISTINGS is not set
+# CONFIG_BLK_DEV_AEC62XX is not set
+# CONFIG_AEC62XX_TUNING is not set
+# CONFIG_BLK_DEV_ALI15X3 is not set
+# CONFIG_WDC_ALI15X3 is not set
+# CONFIG_BLK_DEV_AMD74XX is not set
+# CONFIG_BLK_DEV_CMD64X is not set
+# CONFIG_BLK_DEV_CY82C693 is not set
+# CONFIG_BLK_DEV_CS5530 is not set
+# CONFIG_BLK_DEV_HPT34X is not set
+# CONFIG_HPT34X_AUTODMA is not set
+# CONFIG_BLK_DEV_HPT366 is not set
+# CONFIG_BLK_DEV_PIIX is not set
+# CONFIG_BLK_DEV_NS87415 is not set
+# CONFIG_BLK_DEV_OPTI621 is not set
+# CONFIG_BLK_DEV_PDC_ADMA is not set
+# CONFIG_BLK_DEV_PDC202XX is not set
+# CONFIG_PDC202XX_BURST is not set
+# CONFIG_PDC202XX_FORCE is not set
+# CONFIG_BLK_DEV_SVWKS is not set
+# CONFIG_BLK_DEV_SIS5513 is not set
+# CONFIG_BLK_DEV_TRM290 is not set
+# CONFIG_BLK_DEV_VIA82CXXX is not set
# CONFIG_IDE_CHIPSETS is not set
-# CONFIG_IDEDMA_AUTO is not set
+# CONFIG_IDEDMA_IVB is not set
+CONFIG_IDEDMA_AUTO=y
# CONFIG_DMA_NONPCI is not set
-# CONFIG_BLK_DEV_IDE_MODES is not set
# CONFIG_BLK_DEV_ATARAID is not set
# CONFIG_BLK_DEV_ATARAID_PDC is not set
# CONFIG_BLK_DEV_ATARAID_HPT is not set
# CONFIG_DRM is not set
# CONFIG_MWAVE is not set
-#
-# Bluetooth support
-#
-# CONFIG_BLUEZ is not set
-
#
# Misc devices
#
# CONFIG_ISO9660_FS is not set
# CONFIG_JOLIET is not set
# CONFIG_ZISOFS is not set
+# CONFIG_JFS_FS is not set
+# CONFIG_JFS_DEBUG is not set
+# CONFIG_JFS_STATISTICS is not set
# CONFIG_MINIX_FS is not set
# CONFIG_VXFS_FS is not set
# CONFIG_NTFS_FS is not set
# CONFIG_ROOT_NFS is not set
# CONFIG_NFSD is not set
# CONFIG_NFSD_V3 is not set
+# CONFIG_NFSD_TCP is not set
# CONFIG_SUNRPC is not set
# CONFIG_LOCKD is not set
# CONFIG_SMB_FS is not set
# CONFIG_USB is not set
#
-# USB Host Controller Drivers
-#
-# CONFIG_USB_EHCI_HCD is not set
-# CONFIG_USB_OHCI_HCD is not set
-# CONFIG_USB_UHCI is not set
-# CONFIG_USB_UHCI_ALT is not set
-# CONFIG_USB_OHCI is not set
-
-#
-# USB Device Class drivers
-#
-# CONFIG_USB_AUDIO is not set
-# CONFIG_USB_BLUETOOTH is not set
-
-#
-# SCSI support is needed for USB Storage
-#
-# CONFIG_USB_STORAGE is not set
-# CONFIG_USB_STORAGE_DEBUG is not set
-# CONFIG_USB_STORAGE_DATAFAB is not set
-# CONFIG_USB_STORAGE_FREECOM is not set
-# CONFIG_USB_STORAGE_ISD200 is not set
-# CONFIG_USB_STORAGE_DPCM is not set
-# CONFIG_USB_STORAGE_HP8200e is not set
-# CONFIG_USB_STORAGE_SDDR09 is not set
-# CONFIG_USB_STORAGE_JUMPSHOT is not set
-# CONFIG_USB_ACM is not set
-# CONFIG_USB_PRINTER is not set
-
-#
-# USB Human Interface Devices (HID)
-#
-
-#
-# Input core support is needed for USB HID
-#
-
-#
-# USB Imaging devices
-#
-# CONFIG_USB_DC2XX is not set
-# CONFIG_USB_MDC800 is not set
-# CONFIG_USB_SCANNER is not set
-# CONFIG_USB_MICROTEK is not set
-# CONFIG_USB_HPUSBSCSI is not set
-
-#
-# USB Multimedia devices
-#
-
-#
-# Video4Linux support is needed for USB Multimedia device support
-#
-
-#
-# USB Network adaptors
-#
-# CONFIG_USB_PEGASUS is not set
-# CONFIG_USB_KAWETH is not set
-# CONFIG_USB_CATC is not set
-# CONFIG_USB_CDCETHER is not set
-# CONFIG_USB_USBNET is not set
-
-#
-# USB port drivers
-#
-# CONFIG_USB_USS720 is not set
-
-#
-# USB Serial Converter support
-#
-# CONFIG_USB_SERIAL is not set
-# CONFIG_USB_SERIAL_GENERIC is not set
-# CONFIG_USB_SERIAL_BELKIN is not set
-# CONFIG_USB_SERIAL_WHITEHEAT is not set
-# CONFIG_USB_SERIAL_DIGI_ACCELEPORT is not set
-# CONFIG_USB_SERIAL_EMPEG is not set
-# CONFIG_USB_SERIAL_FTDI_SIO is not set
-# CONFIG_USB_SERIAL_VISOR is not set
-# CONFIG_USB_SERIAL_IPAQ is not set
-# CONFIG_USB_SERIAL_IR is not set
-# CONFIG_USB_SERIAL_EDGEPORT is not set
-# CONFIG_USB_SERIAL_KEYSPAN_PDA is not set
-# CONFIG_USB_SERIAL_KEYSPAN is not set
-# CONFIG_USB_SERIAL_KEYSPAN_USA28 is not set
-# CONFIG_USB_SERIAL_KEYSPAN_USA28X is not set
-# CONFIG_USB_SERIAL_KEYSPAN_USA28XA is not set
-# CONFIG_USB_SERIAL_KEYSPAN_USA28XB is not set
-# CONFIG_USB_SERIAL_KEYSPAN_USA19 is not set
-# CONFIG_USB_SERIAL_KEYSPAN_USA18X is not set
-# CONFIG_USB_SERIAL_KEYSPAN_USA19W is not set
-# CONFIG_USB_SERIAL_KEYSPAN_USA49W is not set
-# CONFIG_USB_SERIAL_MCT_U232 is not set
-# CONFIG_USB_SERIAL_KLSI is not set
-# CONFIG_USB_SERIAL_PL2303 is not set
-# CONFIG_USB_SERIAL_CYBERJACK is not set
-# CONFIG_USB_SERIAL_XIRCOM is not set
-# CONFIG_USB_SERIAL_OMNINET is not set
-
-#
-# USB Miscellaneous drivers
+# Bluetooth support
#
-# CONFIG_USB_RIO500 is not set
-# CONFIG_USB_AUERSWALD is not set
+# CONFIG_BLUEZ is not set
#
# Kernel hacking
# CONFIG_DEBUG_SLAB is not set
# CONFIG_MAGIC_SYSRQ is not set
# CONFIG_DEBUG_SPINLOCK is not set
+# CONFIG_EARLY_PRINTK is not set
# CONFIG_CHECKING is not set
-CONFIG_SIMNOW=y
+# CONFIG_INIT_DEBUG is not set
#
# Library routines
.S.o:
$(CC) $(AFLAGS) -c -o $*.o $<
+export-objs := ia32_ioctl.o
+
all: ia32.o
O_TARGET := ia32.o
-obj-$(CONFIG_IA32_EMULATION) := ia32entry.o sys_ia32.o ia32_ioctl.o ia32_signal.o ia32_binfmt.o \
- socket32.o ptrace32.o
+obj-$(CONFIG_IA32_EMULATION) := ia32entry.o sys_ia32.o ia32_ioctl.o ia32_signal.o \
+ ia32_binfmt.o fpu32.o socket32.o ptrace32.o
clean::
--- /dev/null
+/*
+ * Copyright 2002 Andi Kleen, SuSE Labs.
+ * FXSAVE<->i387 conversion support. Based on code by Gareth Hughes.
+ * This is used for ptrace, signals and coredumps in 32bit emulation.
+ * $Id: fpu32.c,v 1.1 2002/03/21 14:16:32 ak Exp $
+ */
+
+#include <linux/sched.h>
+#include <asm/sigcontext32.h>
+#include <asm/processor.h>
+#include <asm/uaccess.h>
+#include <asm/i387.h>
+
+static inline unsigned short twd_i387_to_fxsr(unsigned short twd)
+{
+ unsigned int tmp; /* to avoid 16 bit prefixes in the code */
+
+ /* Transform each pair of bits into 01 (valid) or 00 (empty) */
+ tmp = ~twd;
+ tmp = (tmp | (tmp>>1)) & 0x5555; /* 0V0V0V0V0V0V0V0V */
+ /* and move the valid bits to the lower byte. */
+ tmp = (tmp | (tmp >> 1)) & 0x3333; /* 00VV00VV00VV00VV */
+ tmp = (tmp | (tmp >> 2)) & 0x0f0f; /* 0000VVVV0000VVVV */
+ tmp = (tmp | (tmp >> 4)) & 0x00ff; /* 00000000VVVVVVVV */
+ return tmp;
+}
+
+static inline unsigned long twd_fxsr_to_i387(struct i387_fxsave_struct *fxsave)
+{
+ struct _fpxreg *st = NULL;
+ unsigned long twd = (unsigned long) fxsave->twd;
+ unsigned long tag;
+ unsigned long ret = 0xffff0000;
+ int i;
+
+#define FPREG_ADDR(f, n) ((char *)&(f)->st_space + (n) * 16)
+
+ for (i = 0 ; i < 8 ; i++) {
+ if (twd & 0x1) {
+ st = (struct _fpxreg *) FPREG_ADDR( fxsave, i );
+
+ switch (st->exponent & 0x7fff) {
+ case 0x7fff:
+ tag = 2; /* Special */
+ break;
+ case 0x0000:
+ if ( !st->significand[0] &&
+ !st->significand[1] &&
+ !st->significand[2] &&
+ !st->significand[3] ) {
+ tag = 1; /* Zero */
+ } else {
+ tag = 2; /* Special */
+ }
+ break;
+ default:
+ if (st->significand[3] & 0x8000) {
+ tag = 0; /* Valid */
+ } else {
+ tag = 2; /* Special */
+ }
+ break;
+ }
+ } else {
+ tag = 3; /* Empty */
+ }
+ ret |= (tag << (2 * i));
+ twd = twd >> 1;
+ }
+ return ret;
+}
+
+
+static inline int convert_fxsr_from_user(struct i387_fxsave_struct *fxsave,
+ struct _fpstate_ia32 *buf)
+{
+ struct _fpxreg *to;
+ struct _fpreg *from;
+ int i;
+ int err;
+ __u32 v;
+
+ err = __get_user(fxsave->cwd, (u16 *)&buf->cw);
+ err |= __get_user(fxsave->swd, (u16 *)&buf->sw);
+ err |= __get_user(fxsave->twd, (u16 *)&buf->tag);
+ fxsave->twd = twd_i387_to_fxsr(fxsave->twd);
+ err |= __get_user(fxsave->rip, &buf->ipoff);
+ err |= __get_user(fxsave->rdp, &buf->dataoff);
+ err |= __get_user(v, &buf->cssel);
+ fxsave->fop = v >> 16;
+ if (err)
+ return -1;
+
+ to = (struct _fpxreg *)&fxsave->st_space[0];
+ from = &buf->_st[0];
+ for (i = 0 ; i < 8 ; i++, to++, from++) {
+ if (__copy_from_user(to, from, sizeof(*from)))
+ return -1;
+ }
+ return 0;
+}
+
+
+static inline int convert_fxsr_to_user(struct _fpstate_ia32 *buf,
+ struct i387_fxsave_struct *fxsave,
+ struct pt_regs *regs,
+ struct task_struct *tsk)
+{
+ struct _fpreg *to;
+ struct _fpxreg *from;
+ int i;
+ u32 ds;
+ int err;
+
+ err = __put_user((unsigned long)fxsave->cwd | 0xffff0000, &buf->cw);
+ err |= __put_user((unsigned long)fxsave->swd | 0xffff0000, &buf->sw);
+ err |= __put_user((u32)fxsave->rip, &buf->ipoff);
+ err |= __put_user((u32)(regs->cs | ((u32)fxsave->fop << 16)),
+ &buf->cssel);
+ err |= __put_user((u32)twd_fxsr_to_i387(fxsave), &buf->tag);
+ err |= __put_user((u32)fxsave->rdp, &buf->dataoff);
+ if (tsk == current)
+ asm("movl %%ds,%0 " : "=r" (ds));
+ else /* ptrace. task has stopped. */
+ ds = tsk->thread.ds;
+ err |= __put_user(ds, &buf->datasel);
+ if (err)
+ return -1;
+
+ to = &buf->_st[0];
+ from = (struct _fpxreg *) &fxsave->st_space[0];
+ for ( i = 0 ; i < 8 ; i++, to++, from++ ) {
+ if (__copy_to_user(to, from, sizeof(*to)))
+ return -1;
+ }
+ return 0;
+}
+
+int restore_i387_ia32(struct task_struct *tsk, struct _fpstate_ia32 *buf, int fsave)
+{
+ clear_fpu(tsk);
+ if (!fsave) {
+ if (__copy_from_user(&tsk->thread.i387.fxsave,
+ &buf->_fxsr_env[0],
+ sizeof(struct i387_fxsave_struct)))
+ return -1;
+ }
+ tsk->thread.i387.fxsave.mxcsr &= 0xffbf;
+ return convert_fxsr_from_user(&tsk->thread.i387.fxsave, buf);
+}
+
+int save_i387_ia32(struct task_struct *tsk,
+ struct _fpstate_ia32 *buf,
+ struct pt_regs *regs,
+ int fsave)
+{
+ int err = 0;
+
+ if (!tsk->used_math)
+ return 0;
+ tsk->used_math = 0;
+ unlazy_fpu(tsk);
+ if (convert_fxsr_to_user(buf, &tsk->thread.i387.fxsave, regs, tsk))
+ return -1;
+ err |= __put_user(tsk->thread.i387.fxsave.swd, &buf->status);
+ if (fsave)
+ return err ? -1 : 1;
+ err |= __put_user(X86_FXSR_MAGIC, &buf->magic);
+ err |= __copy_to_user(&buf->_fxsr_env[0], &tsk->thread.i387.fxsave,
+ sizeof(struct i387_fxsave_struct));
+ return err ? -1 : 1;
+}
/*
- * Written 2000 by Andi Kleen.
+ * Written 2000,2002 by Andi Kleen.
*
 * Loosely based on the sparc64 and IA64 32bit emulation loaders.
+ * This tricks binfmt_elf.c into loading 32bit binaries using lots
+ * of ugly preprocessor tricks. Talk about very very poor man's inheritance.
*/
#include <linux/types.h>
#include <linux/config.h>
#include <linux/stddef.h>
#include <linux/module.h>
#include <linux/rwsem.h>
+#include <linux/sched.h>
+#include <linux/string.h>
#include <asm/segment.h>
#include <asm/ptrace.h>
#include <asm/processor.h>
+#include <asm/user32.h>
+#include <asm/sigcontext32.h>
+#include <asm/fpu32.h>
+#include <asm/i387.h>
struct file;
struct elf_phdr;
#define ELF_CLASS ELFCLASS32
#define ELF_DATA ELFDATA2LSB
-//#define USE_ELF_CORE_DUMP
+
+#define USE_ELF_CORE_DUMP 1
+
+/* Overwrite elfcore.h */
+#define _LINUX_ELFCORE_H 1
+typedef unsigned int elf_greg_t;
+
+#define ELF_NGREG (sizeof (struct user_regs_struct32) / sizeof(elf_greg_t))
+typedef elf_greg_t elf_gregset_t[ELF_NGREG];
+
+struct elf_siginfo
+{
+ int si_signo; /* signal number */
+ int si_code; /* extra code */
+ int si_errno; /* errno */
+};
+
+struct timeval32
+{
+ int tv_sec, tv_usec;
+};
+
+struct elf_prstatus
+{
+ struct elf_siginfo pr_info; /* Info associated with signal */
+ short pr_cursig; /* Current signal */
+ unsigned int pr_sigpend; /* Set of pending signals */
+ unsigned int pr_sighold; /* Set of held signals */
+ pid_t pr_pid;
+ pid_t pr_ppid;
+ pid_t pr_pgrp;
+ pid_t pr_sid;
+ struct timeval32 pr_utime; /* User time */
+ struct timeval32 pr_stime; /* System time */
+ struct timeval32 pr_cutime; /* Cumulative user time */
+ struct timeval32 pr_cstime; /* Cumulative system time */
+ elf_gregset_t pr_reg; /* GP registers */
+ int pr_fpvalid; /* True if math co-processor being used. */
+};
+
+#define ELF_PRARGSZ (80) /* Number of chars for args */
+
+struct elf_prpsinfo
+{
+ char pr_state; /* numeric process state */
+ char pr_sname; /* char for pr_state */
+ char pr_zomb; /* zombie */
+ char pr_nice; /* nice val */
+ unsigned int pr_flag; /* flags */
+ __u16 pr_uid;
+ __u16 pr_gid;
+ pid_t pr_pid, pr_ppid, pr_pgrp, pr_sid;
+ /* Lots missing */
+ char pr_fname[16]; /* filename of executable */
+ char pr_psargs[ELF_PRARGSZ]; /* initial part of arg list */
+};
+
+#define __STR(x) #x
+#define STR(x) __STR(x)
+
+#define _GET_SEG(x) \
+ ({ __u32 seg; asm("movl %%" STR(x) ",%0" : "=r"(seg)); seg; })
+
+/* Assumes current==process to be dumped */
+#define ELF_CORE_COPY_REGS(pr_reg, regs) \
+ pr_reg[0] = regs->rbx; \
+ pr_reg[1] = regs->rcx; \
+ pr_reg[2] = regs->rdx; \
+ pr_reg[3] = regs->rsi; \
+ pr_reg[4] = regs->rdi; \
+ pr_reg[5] = regs->rbp; \
+ pr_reg[6] = regs->rax; \
+ pr_reg[7] = _GET_SEG(ds); \
+ pr_reg[8] = _GET_SEG(es); \
+ pr_reg[9] = _GET_SEG(fs); \
+ pr_reg[10] = _GET_SEG(gs); \
+ pr_reg[11] = regs->orig_rax; \
+ pr_reg[12] = regs->rip; \
+ pr_reg[13] = regs->cs; \
+ pr_reg[14] = regs->eflags; \
+ pr_reg[15] = regs->rsp; \
+ pr_reg[16] = regs->ss;
+
+#define user user32
+
+#define dump_fpu dump_fpu_ia32
#define __ASM_X86_64_ELF_H 1
#include <asm/ia32.h>
#include <linux/elf.h>
-typedef __u32 elf_greg_t;
-
-typedef elf_greg_t elf_gregset_t[8];
-
-/* FIXME -- wrong */
typedef struct user_i387_ia32_struct elf_fpregset_t;
-typedef struct user_i387_struct elf_fpxregset_t;
+typedef struct user32_fxsr_struct elf_fpxregset_t;
#undef elf_check_arch
#define elf_check_arch(x) \
unsigned long map_addr;
struct task_struct *me = current;
+ if (prot & PROT_READ)
+ prot |= PROT_EXEC;
+
down_write(&me->mm->mmap_sem);
map_addr = do_mmap(filep, ELF_PAGESTART(addr),
- eppnt->p_filesz + ELF_PAGEOFFSET(eppnt->p_vaddr), prot, type|MAP_32BIT,
+ eppnt->p_filesz + ELF_PAGEOFFSET(eppnt->p_vaddr), prot,
+ type|MAP_32BIT,
eppnt->p_offset - ELF_PAGEOFFSET(eppnt->p_vaddr));
up_write(&me->mm->mmap_sem);
return(map_addr);
}
+int dump_fpu_ia32(struct pt_regs *regs, elf_fpregset_t *fp)
+{
+ struct _fpstate_ia32 *fpu = (void*)fp;
+ struct task_struct *tsk = current;
+ mm_segment_t oldfs = get_fs();
+ int ret;
+
+ if (!tsk->used_math)
+ return 0;
+ if (!(test_thread_flag(TIF_IA32)))
+ BUG();
+ unlazy_fpu(tsk);
+ set_fs(KERNEL_DS);
+ ret = save_i387_ia32(current, fpu, regs, 1);
+ /* Correct for i386 bug. It puts the fop into the upper 16 bits of
+    the tag word (like FXSAVE), not into the fcs. */
+ fpu->cssel |= fpu->tag & 0xffff0000;
+ set_fs(oldfs);
+ return ret;
+}
-/* $Id: ia32_ioctl.c,v 1.2 2001/07/05 06:28:42 ak Exp $
+/* $Id: ia32_ioctl.c,v 1.11 2002/04/18 14:36:37 ak Exp $
* ioctl32.c: Conversion between 32bit and 64bit native ioctls.
*
* Copyright (C) 1997-2000 Jakub Jelinek (jakub@redhat.com)
* Copyright (C) 1998 Eddie C. Dost (ecd@skynet.be)
- * Copyright (C) 2001 Andi Kleen, SuSE Labs
+ * Copyright (C) 2001,2002 Andi Kleen, SuSE Labs
*
* These routines maintain argument size conversion between 32bit and 64bit
* ioctls.
#include <linux/elevator.h>
#include <linux/rtc.h>
#include <linux/pci.h>
+#include <linux/module.h>
+#include <linux/serial.h>
+#include <linux/reiserfs_fs.h>
#if defined(CONFIG_BLK_DEV_LVM) || defined(CONFIG_BLK_DEV_LVM_MODULE)
/* Ugh. This header really is not clean */
#define min min
return rw_long(fd, AUTOFS_IOC_SETTIMEOUT, arg);
}
+/* SuSE extension */
+#ifndef TIOCGDEV
+#define TIOCGDEV _IOR('T',0x32, unsigned int)
+#endif
+static int tiocgdev(unsigned fd, unsigned cmd, unsigned int *ptr)
+{
+
+ struct file *file = fget(fd);
+ struct tty_struct *real_tty;
+
+ if (!file)
+ return -EBADF;
+ if (file->f_op->ioctl != tty_ioctl)
+ return -EINVAL;
+ real_tty = (struct tty_struct *)file->private_data;
+ if (!real_tty)
+ return -EINVAL;
+ return put_user(kdev_t_to_nr(real_tty->device), ptr);
+}
+
+
+struct raw32_config_request
+{
+ int raw_minor;
+ __u64 block_major;
+ __u64 block_minor;
+} __attribute__((packed));
+
+static int raw_ioctl(unsigned fd, unsigned cmd, void *ptr)
+{
+ int ret;
+ switch (cmd) {
+ case RAW_SETBIND:
+ case RAW_GETBIND: {
+ struct raw_config_request req;
+ struct raw32_config_request *user_req = ptr;
+ mm_segment_t oldfs = get_fs();
+
+ if (get_user(req.raw_minor, &user_req->raw_minor) ||
+ get_user(req.block_major, &user_req->block_major) ||
+ get_user(req.block_minor, &user_req->block_minor))
+ return -EFAULT;
+ set_fs(KERNEL_DS);
+ ret = sys_ioctl(fd,cmd,(unsigned long)&req);
+ set_fs(oldfs);
+ break;
+ }
+ default:
+ ret = sys_ioctl(fd,cmd,(unsigned long)ptr);
+ break;
+ }
+ return ret;
+}
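The member-by-member copy through a kernel-side `struct raw_config_request` is necessary because the packed 32-bit layout above does not match a naturally aligned 64-bit one: the compiler pads after `raw_minor` in the native struct but not in the packed compat struct. A sketch of the mismatch (the native 64-bit layout here is an assumption for illustration):

```c
#include <assert.h>
#include <stddef.h>

/* packed 32-bit layout, as declared in the patch */
struct raw32_config_request {
	int raw_minor;
	unsigned long long block_major;
	unsigned long long block_minor;
} __attribute__((packed));

/* assumed naturally aligned 64-bit layout: padding after raw_minor */
struct raw64_config_request {
	int raw_minor;
	unsigned long long block_major;
	unsigned long long block_minor;
};
```

Because the field offsets differ, passing the user pointer straight through would make the kernel read `block_major` from the wrong place.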
+
+struct serial_struct32 {
+ int type;
+ int line;
+ unsigned int port;
+ int irq;
+ int flags;
+ int xmit_fifo_size;
+ int custom_divisor;
+ int baud_base;
+ unsigned short close_delay;
+ char io_type;
+ char reserved_char[1];
+ int hub6;
+ unsigned short closing_wait; /* time to wait before closing */
+ unsigned short closing_wait2; /* no longer used... */
+ __u32 iomem_base;
+ unsigned short iomem_reg_shift;
+ unsigned int port_high;
+ int reserved[1];
+};
+
+static int serial_struct_ioctl(unsigned fd, unsigned cmd, void *ptr)
+{
+ typedef struct serial_struct SS;
+ struct serial_struct32 *ss32 = ptr;
+ int err = 0;
+ struct serial_struct ss;
+ mm_segment_t oldseg = get_fs();
+ set_fs(KERNEL_DS);
+ if (cmd == TIOCSSERIAL) {
+ err = -EFAULT;
+ if (copy_from_user(&ss, ss32, sizeof(struct serial_struct32)))
+ goto out;
+ memmove(&ss.iomem_reg_shift, ((char*)&ss.iomem_base)+4,
+ sizeof(SS)-offsetof(SS,iomem_reg_shift));
+ ss.iomem_base = (void *)((unsigned long)ss.iomem_base & 0xffffffff);
+ }
+ if (!err)
+ err = sys_ioctl(fd,cmd,(unsigned long)(&ss));
+ if (cmd == TIOCGSERIAL && err >= 0) {
+ __u32 base;
+ if (__copy_to_user(ss32,&ss,offsetof(SS,iomem_base)) ||
+ __copy_to_user(&ss32->iomem_reg_shift,
+ &ss.iomem_reg_shift,
+ sizeof(SS) - offsetof(SS, iomem_reg_shift)))
+ err = -EFAULT;
+ if (ss.iomem_base > (unsigned char *)0xffffffff)
+ base = -1;
+ else
+ base = (unsigned long)ss.iomem_base;
+ err |= __put_user(base, &ss32->iomem_base);
+ }
+ out:
+ set_fs(oldseg);
+ return err;
+}
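The `memmove` and the separate `iomem_base` handling above exist because that member is a 4-byte `__u32` in the 32-bit layout but an 8-byte pointer in the native `struct serial_struct`, so every field from `iomem_reg_shift` onward sits 4 bytes later in the native struct. A reduced sketch of the offset shift (struct tails only; the native tail is an assumption of what `struct serial_struct` looks like on x86-64):

```c
#include <assert.h>
#include <stddef.h>

/* tail of the 32-bit layout: iomem_base is a plain __u32 */
struct tail32 {
	unsigned int iomem_base;
	unsigned short iomem_reg_shift;
	unsigned int port_high;
};

/* tail of the assumed native 64-bit layout: iomem_base is a pointer */
struct tail64 {
	void *iomem_base;
	unsigned short iomem_reg_shift;
	unsigned int port_high;
};
```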
+
struct ioctl_trans {
unsigned long cmd;
- unsigned long handler;
+ int (*handler)(unsigned int, unsigned int, unsigned long, struct file * filp);
struct ioctl_trans *next;
};
+/* Generic handler for ioctls where the 64bit kernel writes a long but the 32bit caller expects an int */
+static int arg2long(unsigned int fd, unsigned int cmd, unsigned long arg)
+{
+ int ret;
+ unsigned long val = 0;
+ mm_segment_t oldseg = get_fs();
+ set_fs(KERNEL_DS);
+ ret = sys_ioctl(fd, cmd, (unsigned long)&val);
+ set_fs(oldseg);
+ if (!ret || val) {
+ if (put_user((int)val, (unsigned int *)arg))
+ return -EFAULT;
+ }
+ return ret;
+}
+
#define REF_SYMBOL(handler) if (0) (void)handler;
#define HANDLE_IOCTL2(cmd,handler) REF_SYMBOL(handler); asm volatile(".quad %c0, " #handler ",0"::"i" (cmd));
#define HANDLE_IOCTL(cmd,handler) HANDLE_IOCTL2(cmd,handler)
#define IOCTL_TABLE_END asm volatile("\nioctl_end:"); }
IOCTL_TABLE_START
-/* List here exlicitly which ioctl's are known to have
- * compatable types passed or none at all...
+/* List here explicitly which ioctls are known to have
+ * compatible types passed or none at all...
*/
/* Big T */
COMPATIBLE_IOCTL(TCSETSW)
COMPATIBLE_IOCTL(TCSETSF)
COMPATIBLE_IOCTL(TIOCLINUX)
+HANDLE_IOCTL(TIOCGDEV, tiocgdev)
/* Little t */
COMPATIBLE_IOCTL(TIOCGETD)
COMPATIBLE_IOCTL(TIOCSETD)
COMPATIBLE_IOCTL(TIOCSCTTY)
COMPATIBLE_IOCTL(TIOCGPTN)
COMPATIBLE_IOCTL(TIOCSPTLCK)
-COMPATIBLE_IOCTL(TIOCGSERIAL)
-COMPATIBLE_IOCTL(TIOCSSERIAL)
COMPATIBLE_IOCTL(TIOCSERGETLSR)
COMPATIBLE_IOCTL(FBIOGET_VSCREENINFO)
COMPATIBLE_IOCTL(FBIOPUT_VSCREENINFO)
COMPATIBLE_IOCTL(BLKROGET)
COMPATIBLE_IOCTL(BLKRRPART)
COMPATIBLE_IOCTL(BLKFLSBUF)
+COMPATIBLE_IOCTL(BLKRASET)
+COMPATIBLE_IOCTL(BLKFRASET)
COMPATIBLE_IOCTL(BLKSECTSET)
COMPATIBLE_IOCTL(BLKSSZGET)
COMPATIBLE_IOCTL(RTC_SET_TIME)
COMPATIBLE_IOCTL(RTC_WKALM_SET)
COMPATIBLE_IOCTL(RTC_WKALM_RD)
-COMPATIBLE_IOCTL(RTC_IRQP_READ)
+HANDLE_IOCTL(RTC_IRQP_READ,arg2long)
COMPATIBLE_IOCTL(RTC_IRQP_SET)
COMPATIBLE_IOCTL(RTC_EPOCH_READ)
COMPATIBLE_IOCTL(RTC_EPOCH_SET)
COMPATIBLE_IOCTL(DEVFSDIOC_SET_EVENT_MASK)
COMPATIBLE_IOCTL(DEVFSDIOC_RELEASE_EVENT_QUEUE)
COMPATIBLE_IOCTL(DEVFSDIOC_SET_DEBUG_MASK)
-/* Raw devices */
-COMPATIBLE_IOCTL(RAW_SETBIND)
-COMPATIBLE_IOCTL(RAW_GETBIND)
/* SMB ioctls which do not need any translations */
COMPATIBLE_IOCTL(SMB_IOC_NEWCONN)
/* Little a */
COMPATIBLE_IOCTL(DRM_IOCTL_UNLOCK)
COMPATIBLE_IOCTL(DRM_IOCTL_FINISH)
#endif /* DRM */
+#ifdef CONFIG_AUTOFS_FS
+COMPATIBLE_IOCTL(AUTOFS_IOC_READY);
+COMPATIBLE_IOCTL(AUTOFS_IOC_FAIL);
+COMPATIBLE_IOCTL(AUTOFS_IOC_CATATONIC);
+COMPATIBLE_IOCTL(AUTOFS_IOC_PROTOVER);
+COMPATIBLE_IOCTL(AUTOFS_IOC_SETTIMEOUT);
+COMPATIBLE_IOCTL(AUTOFS_IOC_EXPIRE);
+#endif
+COMPATIBLE_IOCTL(REISERFS_IOC_UNPACK);
+/* serial driver */
+HANDLE_IOCTL(TIOCGSERIAL, serial_struct_ioctl);
+HANDLE_IOCTL(TIOCSSERIAL, serial_struct_ioctl);
/* elevator */
COMPATIBLE_IOCTL(BLKELVGET)
COMPATIBLE_IOCTL(BLKELVSET)
HANDLE_IOCTL(SIOCETHTOOL, ethtool_ioctl)
HANDLE_IOCTL(SIOCADDRT, routing_ioctl)
HANDLE_IOCTL(SIOCDELRT, routing_ioctl)
+/* Raw devices */
+HANDLE_IOCTL(RAW_SETBIND, raw_ioctl)
/* Note SIOCRTMSG is no longer used, so this is safe and the user would have seen just an -EINVAL anyway. */
HANDLE_IOCTL(SIOCRTMSG, ret_einval)
HANDLE_IOCTL(SIOCGSTAMP, do_siocgstamp)
/* Always call these with kernel lock held! */
+
int register_ioctl32_conversion(unsigned int cmd, int (*handler)(unsigned int, unsigned int, unsigned long, struct file *))
{
int i;
if (!additional_ioctls) {
- additional_ioctls = module_map(PAGE_SIZE);
+ additional_ioctls = (struct ioctl_trans *)get_zeroed_page(GFP_KERNEL);
if (!additional_ioctls)
return -ENOMEM;
- memset(additional_ioctls, 0, PAGE_SIZE);
}
for (i = 0; i < PAGE_SIZE/sizeof(struct ioctl_trans); i++)
if (!additional_ioctls[i].cmd)
return -ENOMEM;
additional_ioctls[i].cmd = cmd;
if (!handler)
- additional_ioctls[i].handler = (u32)(long)sys_ioctl;
+ additional_ioctls[i].handler =
+ (int (*)(unsigned,unsigned,unsigned long, struct file *))sys_ioctl;
else
- additional_ioctls[i].handler = (u32)(long)handler;
+ additional_ioctls[i].handler = handler;
ioctl32_insert_translation(&additional_ioctls[i]);
return 0;
}
+
int unregister_ioctl32_conversion(unsigned int cmd)
{
unsigned long hash = ioctl32_hash(cmd);
return -EINVAL;
}
+EXPORT_SYMBOL(register_ioctl32_conversion);
+EXPORT_SYMBOL(unregister_ioctl32_conversion);
+
asmlinkage int sys32_ioctl(unsigned int fd, unsigned int cmd, unsigned long arg)
{
struct file * filp;
while (t && t->cmd != cmd)
t = (struct ioctl_trans *)(long)t->next;
if (t) {
- handler = (void *)(long)t->handler;
+ handler = t->handler;
error = handler(fd, cmd, arg, filp);
} else {
static int count = 0;
* 2000-06-20 Pentium III FXSR, SSE support by Gareth Hughes
* 2000-12-* x86-64 compatibility mode signal handling by Andi Kleen
*
- * $Id: ia32_signal.c,v 1.15 2001/10/16 23:41:42 ak Exp $
+ * $Id: ia32_signal.c,v 1.17 2002/03/21 14:16:32 ak Exp $
*/
#include <linux/sched.h>
#include <asm/ptrace.h>
#include <asm/ia32_unistd.h>
#include <asm/user32.h>
+#include <asm/sigcontext32.h>
+#include <asm/fpu32.h>
#define ptr_to_u32(x) ((u32)(u64)(x)) /* avoid gcc warning */
};
static int
-restore_sigcontext(struct pt_regs *regs, struct sigcontext_ia32 *sc, unsigned int *peax)
+ia32_restore_sigcontext(struct pt_regs *regs, struct sigcontext_ia32 *sc, unsigned int *peax)
{
unsigned int err = 0;
/* Reload fs and gs if they have changed in the signal handler.
This does not handle long fs/gs base changes in the handler, but does not clobber
them at least in the normal case. */
- RELOAD_SEG(gs);
+
+
+ {
+ unsigned short gs;
+ err |= __get_user(gs, &sc->gs);
+ load_gs_index(gs);
+ }
RELOAD_SEG(fs);
COPY(di); COPY(si); COPY(bp); COPY(sp); COPY(bx);
{
u32 tmp;
- struct _fpstate * buf;
+ struct _fpstate_ia32 * buf;
err |= __get_user(tmp, &sc->fpstate);
- buf = (struct _fpstate *) (u64)tmp;
+ buf = (struct _fpstate_ia32 *) (u64)tmp;
if (buf) {
if (verify_area(VERIFY_READ, buf, sizeof(*buf)))
goto badframe;
- err |= restore_i387(buf);
+ err |= restore_i387_ia32(current, buf, 0);
}
}
recalc_sigpending();
spin_unlock_irq(&current->sigmask_lock);
- if (restore_sigcontext(&regs, &frame->sc, &eax))
+ if (ia32_restore_sigcontext(&regs, &frame->sc, &eax))
goto badframe;
return eax;
recalc_sigpending();
spin_unlock_irq(&current->sigmask_lock);
- if (restore_sigcontext(&regs, &frame->uc.uc_mcontext, &eax))
+ if (ia32_restore_sigcontext(&regs, &frame->uc.uc_mcontext, &eax))
goto badframe;
if (__copy_from_user(&st, &frame->uc.uc_stack, sizeof(st)))
*/
static int
-setup_sigcontext(struct sigcontext_ia32 *sc, struct _fpstate_ia32 *fpstate,
+ia32_setup_sigcontext(struct sigcontext_ia32 *sc, struct _fpstate_ia32 *fpstate,
struct pt_regs *regs, unsigned int mask)
{
int tmp, err = 0;
err |= __put_user((u32)regs->eflags, &sc->eflags);
err |= __put_user((u32)regs->rsp, &sc->esp_at_signal);
- tmp = save_i387(fpstate);
+ tmp = save_i387_ia32(current, fpstate, regs, 0);
if (tmp < 0)
err = -EFAULT;
else
{
struct sigframe *frame;
int err = 0;
- struct exec_domain *exec_domain = current_thread_info()->exec_domain;
frame = get_sigframe(ka, regs, sizeof(*frame));
if (!access_ok(VERIFY_WRITE, frame, sizeof(*frame)))
goto give_sigsegv;
- err |= __put_user((exec_domain
- && exec_domain->signal_invmap
- && sig < 32
- ? exec_domain->signal_invmap[sig]
+ {
+ struct exec_domain *ed = current_thread_info()->exec_domain;
+
+ err |= __put_user((ed && ed->signal_invmap && sig < 32
+ ? ed->signal_invmap[sig]
: sig),
&frame->sig);
+ }
if (err)
goto give_sigsegv;
- err |= setup_sigcontext(&frame->sc, &frame->fpstate, regs, set->sig[0]);
+ err |= ia32_setup_sigcontext(&frame->sc, &frame->fpstate, regs, set->sig[0]);
if (err)
goto give_sigsegv;
{
struct rt_sigframe *frame;
int err = 0;
- struct exec_domain *exec_domain = current_thread_info()->exec_domain;
frame = get_sigframe(ka, regs, sizeof(*frame));
if (!access_ok(VERIFY_WRITE, frame, sizeof(*frame)))
goto give_sigsegv;
- err |= __put_user((exec_domain
- && exec_domain->signal_invmap
- && sig < 32
- ? exec_domain->signal_invmap[sig]
+ {
+ struct exec_domain *ed = current_thread_info()->exec_domain;
+ err |= __put_user((ed && ed->signal_invmap && sig < 32
+ ? ed->signal_invmap[sig]
: sig),
&frame->sig);
+ }
err |= __put_user((u32)(u64)&frame->info, &frame->pinfo);
err |= __put_user((u32)(u64)&frame->uc, &frame->puc);
err |= ia32_copy_siginfo_to_user(&frame->info, info);
err |= __put_user(sas_ss_flags(regs->rsp),
&frame->uc.uc_stack.ss_flags);
err |= __put_user(current->sas_ss_size, &frame->uc.uc_stack.ss_size);
- err |= setup_sigcontext(&frame->uc.uc_mcontext, &frame->fpstate,
+ err |= ia32_setup_sigcontext(&frame->uc.uc_mcontext, &frame->fpstate,
regs, set->sig[0]);
err |= __copy_to_user(&frame->uc.uc_sigmask, set, sizeof(*set));
if (err)
/*
* Compatibility mode system call entry point for x86-64.
*
- * Copyright 2000,2001 Andi Kleen, SuSE Labs.
+ * Copyright 2000,2001,2002 Andi Kleen, SuSE Labs.
*
- * $Id: ia32entry.S,v 1.24 2001/11/11 17:47:47 ak Exp $
+ * $Id: ia32entry.S,v 1.31 2002/03/24 13:01:45 ak Exp $
*/
#include <asm/calling.h>
#include <asm/offset.h>
-#include <asm/thread_info.h>
+#include <asm/current.h>
#include <linux/linkage.h>
#include <asm/errno.h>
#include <asm/ia32_unistd.h>
+#include <asm/thread_info.h>
.macro IA32_ARG_FIXUP
movl %edi,%r8d
*/
ENTRY(ia32_cstar_target)
movq $-ENOSYS,%rax
- SYSRET32
+ sysret
/*
* Emulated IA32 system calls via int 0x80.
cmpl $(IA32_NR_syscalls),%eax
jae ia32_badsys
IA32_ARG_FIXUP
- movl $1,%r10d
call *ia32_sys_call_table(,%rax,8) # xxx: rip relative
movq %rax,RAX-ARGOFFSET(%rsp)
jmp int_ret_from_sys_call
PTREGSCALL stub32_rt_sigsuspend, sys_rt_sigsuspend
ENTRY(ia32_ptregs_common)
- popq %r11 /* save return address outside the stack frame. */
+ popq %r11
SAVE_REST
movq %r11, %r15
call *%rax
.quad stub32_clone /* 120 */
.quad sys_setdomainname
.quad sys_newuname
- .quad ni_syscall /* modify_ldt */
+ .quad sys_modify_ldt
.quad sys32_adjtimex
- .quad sys_mprotect /* 125 */
+ .quad sys32_mprotect /* 125 */
.quad sys32_sigprocmask
- .quad ni_syscall /* query_module */
- .quad ni_syscall /* init_module */
- .quad ni_syscall /* delete module */
- .quad ni_syscall /* 130 get_kernel_syms */
+ .quad sys32_module_warning /* create_module */
+ .quad sys32_module_warning /* init_module */
+ .quad sys32_module_warning /* delete module */
+ .quad sys32_module_warning /* 130 get_kernel_syms */
.quad ni_syscall /* quotactl */
.quad sys_getpgid
.quad sys_fchdir
.quad ni_syscall /* vm86 */
.quad ni_syscall /* query_module */
.quad sys_poll
- .quad ni_syscall /* nfsserverctl */
+ .quad sys32_nfsservctl
.quad sys_setresgid16 /* 170 */
.quad sys_getresgid16
.quad sys_prctl
.quad sys32_pwrite
.quad sys_chown16
.quad sys_getcwd
- .quad ni_syscall /* capget */
- .quad ni_syscall /* capset */
+ .quad sys_capget
+ .quad sys_capset
.quad stub32_sigaltstack
.quad sys32_sendfile
.quad ni_syscall /* streams1 */
.quad sys_pivot_root
.quad sys_mincore
.quad sys_madvise
- .quad sys_getdents64 /* 220 */
+ .quad sys_getdents64 /* 220 getdents64 */
.quad sys32_fcntl64
.quad sys_ni_syscall /* tux */
.quad sys_ni_syscall /* security */
.quad sys_lremovexattr
.quad sys_fremovexattr
.quad sys_tkill /* 238 */
+ .quad sys_sendfile64
+ .quad sys_futex
+ .quad sys32_sched_setaffinity
+ .quad sys32_sched_getaffinity
ia32_syscall_end:
.rept IA32_NR_syscalls-(ia32_syscall_end-ia32_sys_call_table)/8
.quad ni_syscall
/*
* 32bit ptrace for x86-64.
*
- * Copyright 2001 Andi Kleen, SuSE Labs.
- * Some parts copied from arch/i386/kernel/ptrace.c. See that file for
- * earlier copyright.
+ * Copyright 2001,2002 Andi Kleen, SuSE Labs.
+ * Some parts copied from arch/i386/kernel/ptrace.c. See that file for earlier
+ * copyright.
*
- * This allows to access 64bit processes too but there is no way to see
- * the extended register contents.
+ * This allows access to 64bit processes too, but there is no way to see the extended
+ * register contents.
*
- * $Id: ptrace32.c,v 1.2 2001/08/15 06:41:13 ak Exp $
+ * $Id: ptrace32.c,v 1.12 2002/03/24 13:02:02 ak Exp $
*/
#include <linux/kernel.h>
#include <linux/stddef.h>
#include <linux/sched.h>
#include <linux/mm.h>
-#include <linux/mm.h>
-#include <linux/ptrace.h>
#include <asm/ptrace.h>
#include <asm/uaccess.h>
#include <asm/user32.h>
+#include <asm/user.h>
#include <asm/errno.h>
#include <asm/debugreg.h>
+#include <asm/i387.h>
+#include <asm/fpu32.h>
+#include <linux/mm.h>
#define R32(l,q) \
case offsetof(struct user32, regs.l): stack[offsetof(struct pt_regs, q)/8] = val; break
#undef R32
-
static struct task_struct *find_target(int request, int pid, int *err)
{
struct task_struct *child;
if (child)
get_task_struct(child);
read_unlock(&tasklist_lock);
- if (child) {
- *err = -ESRCH;
- if (!(child->ptrace & PT_PTRACED))
- goto out;
- if (child->state != TASK_STOPPED) {
- if (request != PTRACE_KILL)
- goto out;
- }
- if (child->p_pptr != current)
- goto out;
-
+ *err = ptrace_check_attach(child,0);
+ if (*err == 0)
return child;
- }
- out:
put_task_struct(child);
return NULL;
-
}
extern asmlinkage long sys_ptrace(long request, long pid, unsigned long addr, unsigned long data);
asmlinkage long sys32_ptrace(long request, u32 pid, u32 addr, u32 data)
{
struct task_struct *child;
+ struct pt_regs *childregs;
int ret;
__u32 val;
case PTRACE_SETREGS:
case PTRACE_SETFPREGS:
case PTRACE_GETFPREGS:
+ case PTRACE_SETFPXREGS:
+ case PTRACE_GETFPXREGS:
break;
default:
if (!child)
return ret;
+ childregs = (struct pt_regs *)(child->thread.rsp0 - sizeof(struct pt_regs));
+
switch (request) {
case PTRACE_PEEKDATA:
case PTRACE_PEEKTEXT:
ret = 0;
- if (access_process_vm(child, addr, &val, sizeof(u32), 0) != sizeof(u32))
+ if (access_process_vm(child, addr, &val, sizeof(u32), 0)!=sizeof(u32))
ret = -EIO;
else
ret = put_user(val, (unsigned int *)(u64)data);
case PTRACE_POKEDATA:
case PTRACE_POKETEXT:
ret = 0;
- if (access_process_vm(child, addr, &data, sizeof(u32), 1) != sizeof(u32))
+ if (access_process_vm(child, addr, &data, sizeof(u32), 1)!=sizeof(u32))
ret = -EIO;
break;
case PTRACE_PEEKUSR:
ret = getreg32(child, addr, &val);
- if (ret >= 0)
+ if (ret == 0)
ret = put_user(val, (__u32 *)(unsigned long) data);
break;
case PTRACE_GETREGS: { /* Get all gp regs from the child. */
int i;
- if (!access_ok(VERIFY_WRITE, (unsigned *)(unsigned long)data, FRAME_SIZE)) {
+ if (!access_ok(VERIFY_WRITE, (unsigned *)(unsigned long)data, 16*4)) {
ret = -EIO;
break;
}
case PTRACE_SETREGS: { /* Set all gp regs in the child. */
unsigned long tmp;
int i;
- if (!access_ok(VERIFY_READ, (unsigned *)(unsigned long)data, FRAME_SIZE)) {
+ if (!access_ok(VERIFY_READ, (unsigned *)(unsigned long)data, 16*4)) {
ret = -EIO;
break;
}
+ empty_fpu(child);
ret = 0;
for ( i = 0; i <= 16*4; i += sizeof(u32) ) {
ret |= __get_user(tmp, (u32 *) (unsigned long) data);
break;
}
-#if 0 /* to be done. */
- case PTRACE_GETFPREGS: { /* Get the child extended FPU state. */
- if (!access_ok(VERIFY_WRITE, (unsigned *)data,
- sizeof(struct user_i387_struct))) {
- ret = -EIO;
+ case PTRACE_SETFPREGS:
+ empty_fpu(child);
+ restore_i387_ia32(child, (void *)(u64)data, 1);
+ ret = 0;
break;
- }
- if ( !child->used_math ) {
- /* Simulate an empty FPU. */
- set_fpu_cwd(child, 0x037f);
- set_fpu_swd(child, 0x0000);
- set_fpu_twd(child, 0xffff);
- set_fpu_mxcsr(child, 0x1f80);
- }
- ret = get_fpregs((struct user_i387_struct *)data, child);
+
+ case PTRACE_GETFPREGS:
+ empty_fpu(child);
+ save_i387_ia32(child, (void *)(u64)data, childregs, 1);
+ ret = 0;
break;
- }
- case PTRACE_SETFPREGS: { /* Set the child extended FPU state. */
- if (!access_ok(VERIFY_READ, (unsigned *)data,
- sizeof(struct user_i387_struct))) {
- ret = -EIO;
+ case PTRACE_GETFPXREGS: {
+ struct user32_fxsr_struct *u = (void *)(u64)data;
+ empty_fpu(child);
+ ret = copy_to_user(u, &child->thread.i387.fxsave, sizeof(*u));
+ ret |= __put_user(childregs->cs, &u->fcs);
+ ret |= __put_user(child->thread.ds, &u->fos);
+ if (ret)
+ ret = -EFAULT;
+ break;
+ }
+ case PTRACE_SETFPXREGS: {
+ struct user32_fxsr_struct *u = (void *)(u64)data;
+ empty_fpu(child);
+ /* no error checking, to be bug-to-bug compatible with i386 */
+ copy_from_user(&child->thread.i387.fxsave, u, sizeof(*u));
+ child->thread.i387.fxsave.mxcsr &= 0xffbf;
+ ret = 0;
break;
}
- child->used_math = 1;
- ret = set_fpregs(child, (struct user_i387_struct *)data);
- break;
-
-#endif
default:
ret = -EINVAL;
* Copyright (C) 1997 David S. Miller (davem@caip.rutgers.edu)
* Copyright (C) 2000 Hewlett-Packard Co.
* Copyright (C) 2000 David Mosberger-Tang <davidm@hpl.hp.com>
- * Copyright (C) 2000,2001 Andi Kleen, SuSE Labs (x86-64 port)
+ * Copyright (C) 2000,2001,2002 Andi Kleen, SuSE Labs (x86-64 port)
*
* These routines maintain argument size conversion between 32bit and 64bit
* environment. In 2.5 most of this should be moved to a generic directory.
*
* This file assumes that there is a hole at the end of user address space.
+ *
+ * Some of the functions are LE specific currently. These are hopefully all marked.
+ * This should be fixed.
*/
#include <linux/config.h>
#include <linux/smp_lock.h>
#include <linux/sem.h>
#include <linux/msg.h>
+#include <linux/binfmts.h>
#include <linux/mm.h>
#include <linux/shm.h>
#include <linux/slab.h>
#include <linux/stat.h>
#include <linux/ipc.h>
#include <linux/rwsem.h>
+#include <linux/init.h>
#include <asm/mman.h>
#include <asm/types.h>
#include <asm/uaccess.h>
#define ROUND_UP(x,a) ((__typeof__(x))(((unsigned long)(x) + ((a) - 1)) & ~((a) - 1)))
#define NAME_OFFSET(de) ((int) ((de)->d_name - (char *) (de)))
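The `ROUND_UP` macro above rounds a value up to the next multiple of an alignment by adding `a - 1` and masking off the low bits; this only works when `a` is a power of two. A quick user-space check, with the macro copied verbatim from the line above:

```c
#include <assert.h>

/* copied from the definition above: round x up to a multiple of a
 * (a must be a power of two for the mask trick to work) */
#define ROUND_UP(x,a) ((__typeof__(x))(((unsigned long)(x) + ((a) - 1)) & ~((a) - 1)))
```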
+#undef high2lowuid
+#undef high2lowgid
+#undef low2highuid
+#undef low2highgid
+
+#define high2lowuid(uid) (((uid) > 65535) ? (u16)overflowuid : (u16)(uid))
+#define high2lowgid(gid) (((gid) > 65535) ? (u16)overflowgid : (u16)(gid))
+#define low2highuid(uid) (((uid) == (u16)-1) ? (uid_t)-1 : (uid_t)(uid))
+#define low2highgid(gid) (((gid) == (u16)-1) ? (gid_t)-1 : (gid_t)(gid))
+extern int overflowuid,overflowgid;
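These macros squeeze full-width ids into the legacy 16-bit ABI: values that do not fit become the overflow id, and the 16-bit `-1` sentinel widens back to a full-width `-1`. A user-space sketch of the uid pair (the overflow default is an assumption; the kernel exposes the real value via sysctl, and `myuid_t` stands in for the kernel's `uid_t`):

```c
#include <assert.h>

typedef unsigned short u16;
typedef unsigned int myuid_t;

static int overflowuid = 65534;	/* assumed default, normally a sysctl */

/* clamp a full-width uid into 16 bits, substituting the overflow id */
#define high2lowuid(uid) (((uid) > 65535) ? (u16)overflowuid : (u16)(uid))
/* widen a 16-bit uid, mapping the (u16)-1 sentinel to a full-width -1 */
#define low2highuid(uid) (((uid) == (u16)-1) ? (myuid_t)-1 : (myuid_t)(uid))
```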
+
+
static int
putstat(struct stat32 *ubuf, struct stat *kbuf)
{
if (!file)
return -EBADF;
}
+ if (a.prot & PROT_READ)
+ a.prot |= PROT_EXEC;
+
+ a.flags |= MAP_32BIT;
mm = current->mm;
down_write(&mm->mmap_sem);
if (file)
fput(file);
- if (retval >= 0xFFFFFFFF) {
+ /* Should not happen */
+ if (retval >= 0xFFFFFFFF && (long)retval > 0) {
do_munmap(mm, retval, a.len);
retval = -ENOMEM;
}
up_write(&mm->mmap_sem);
+ return retval;
+}
+extern asmlinkage long sys_mprotect(unsigned long start,size_t len,unsigned long prot);
- return retval;
+asmlinkage int sys32_mprotect(unsigned long start, size_t len, unsigned long prot)
+{
+ if (prot & PROT_READ)
+ prot |= PROT_EXEC;
+ return sys_mprotect(start,len,prot);
}
asmlinkage long
static inline long
get_tv32(struct timeval *o, struct timeval32 *i)
{
- return (!access_ok(VERIFY_READ, i, sizeof(*i)) ||
- __get_user(o->tv_sec, &i->tv_sec) ||
- __get_user(o->tv_usec, &i->tv_usec));
- return ENOSYS;
+ int err = -EFAULT;
+ if (access_ok(VERIFY_READ, i, sizeof(*i))) {
+ err = __get_user(o->tv_sec, &i->tv_sec);
+ err |= __get_user(o->tv_usec, &i->tv_usec);
+ }
+ return err;
}
static inline long
put_tv32(struct timeval32 *o, struct timeval *i)
{
- return (!access_ok(VERIFY_WRITE, o, sizeof(*o)) ||
- __put_user(i->tv_sec, &o->tv_sec) ||
- __put_user(i->tv_usec, &o->tv_usec));
+ int err = -EFAULT;
+ if (access_ok(VERIFY_WRITE, o, sizeof(*o))) {
+ err = __put_user(i->tv_sec, &o->tv_sec);
+ err |= __put_user(i->tv_usec, &o->tv_usec);
+ }
+ return err;
}
static inline long
get_it32(struct itimerval *o, struct itimerval32 *i)
{
- return (!access_ok(VERIFY_READ, i, sizeof(*i)) ||
- __get_user(o->it_interval.tv_sec, &i->it_interval.tv_sec) ||
- __get_user(o->it_interval.tv_usec, &i->it_interval.tv_usec) ||
- __get_user(o->it_value.tv_sec, &i->it_value.tv_sec) ||
- __get_user(o->it_value.tv_usec, &i->it_value.tv_usec));
- return ENOSYS;
+ int err = -EFAULT;
+ if (access_ok(VERIFY_READ, i, sizeof(*i))) {
+ err = __get_user(o->it_interval.tv_sec, &i->it_interval.tv_sec);
+ err |= __get_user(o->it_interval.tv_usec, &i->it_interval.tv_usec);
+ err |= __get_user(o->it_value.tv_sec, &i->it_value.tv_sec);
+ err |= __get_user(o->it_value.tv_usec, &i->it_value.tv_usec);
+ }
+ return err;
}
static inline long
put_it32(struct itimerval32 *o, struct itimerval *i)
{
- return (!access_ok(VERIFY_WRITE, i, sizeof(*i)) ||
- __put_user(i->it_interval.tv_sec, &o->it_interval.tv_sec) ||
- __put_user(i->it_interval.tv_usec, &o->it_interval.tv_usec) ||
- __put_user(i->it_value.tv_sec, &o->it_value.tv_sec) ||
- __put_user(i->it_value.tv_usec, &o->it_value.tv_usec));
- return ENOSYS;
+ int err = -EFAULT;
+ if (access_ok(VERIFY_WRITE, o, sizeof(*o))) {
+ err = __put_user(i->it_interval.tv_sec, &o->it_interval.tv_sec);
+ err |= __put_user(i->it_interval.tv_usec, &o->it_interval.tv_usec);
+ err |= __put_user(i->it_value.tv_sec, &o->it_value.tv_sec);
+ err |= __put_user(i->it_value.tv_usec, &o->it_value.tv_usec);
+ }
+ return err;
}
extern int do_getitimer(int which, struct itimerval *value);
fourth.val = (int)pad;
else
fourth.__pad = (void *)A(pad);
+
switch (third) {
case IPC_INFO:
__kernel_clock_t32 tms_cstime;
};
-extern asmlinkage long sys_times(struct tms * tbuf);
+extern int sys_times(struct tms *);
asmlinkage long
sys32_times(struct tms32 *tbuf)
return ret;
}
-static inline int
-get_flock32(struct flock *kfl, struct flock32 *ufl)
-{
- if (verify_area(VERIFY_READ, ufl, sizeof(struct flock32)) ||
- __get_user(kfl->l_type, &ufl->l_type) ||
- __get_user(kfl->l_whence, &ufl->l_whence) ||
- __get_user(kfl->l_start, &ufl->l_start) ||
- __get_user(kfl->l_len, &ufl->l_len) ||
- __get_user(kfl->l_pid, &ufl->l_pid))
- return -EFAULT;
- return 0;
+
+static inline int get_flock(struct flock *kfl, struct flock32 *ufl)
+{
+ int err;
+
+ err = get_user(kfl->l_type, &ufl->l_type);
+ err |= __get_user(kfl->l_whence, &ufl->l_whence);
+ err |= __get_user(kfl->l_start, &ufl->l_start);
+ err |= __get_user(kfl->l_len, &ufl->l_len);
+ err |= __get_user(kfl->l_pid, &ufl->l_pid);
+ return err;
}
-static inline int
-put_flock32(struct flock *kfl, struct flock32 *ufl)
-{
- if (verify_area(VERIFY_WRITE, ufl, sizeof(struct flock32)) ||
- __put_user(kfl->l_type, &ufl->l_type) ||
- __put_user(kfl->l_whence, &ufl->l_whence) ||
- __put_user(kfl->l_start, &ufl->l_start) ||
- __put_user(kfl->l_len, &ufl->l_len) ||
- __put_user(kfl->l_pid, &ufl->l_pid))
- return -EFAULT;
- return 0;
+static inline int put_flock(struct flock *kfl, struct flock32 *ufl)
+{
+ int err;
+
+ err = __put_user(kfl->l_type, &ufl->l_type);
+ err |= __put_user(kfl->l_whence, &ufl->l_whence);
+ err |= __put_user(kfl->l_start, &ufl->l_start);
+ err |= __put_user(kfl->l_len, &ufl->l_len);
+ err |= __put_user(kfl->l_pid, &ufl->l_pid);
+ return err;
}
-extern asmlinkage long sys_fcntl(unsigned int fd, unsigned int cmd,
- unsigned long arg);
+extern asmlinkage long sys_fcntl(unsigned int fd, unsigned int cmd, unsigned long arg);
-asmlinkage long
-sys32_fcntl(unsigned int fd, unsigned int cmd, unsigned long arg)
+asmlinkage long sys32_fcntl(unsigned int fd, unsigned int cmd, unsigned long arg)
{
- struct flock f;
- mm_segment_t old_fs;
- long ret;
-
switch (cmd) {
case F_GETLK:
case F_SETLK:
case F_SETLKW:
- if(cmd != F_GETLK && get_flock32(&f, (struct flock32 *)((long)arg)))
- return -EFAULT;
- old_fs = get_fs();
- set_fs(KERNEL_DS);
- ret = sys_fcntl(fd, cmd, (unsigned long)&f);
- set_fs(old_fs);
- if(cmd == F_GETLK && put_flock32(&f, (struct flock32 *)((long)arg)))
- return -EFAULT;
- return ret;
- default:
- /*
- * `sys_fcntl' lies about arg, for the F_SETOWN
- * sub-function arg can have a negative value.
- */
- return sys_fcntl(fd, cmd, (unsigned long)((long)arg));
- }
-}
-
-static inline int
-get_flock64(struct flock *kfl, struct ia32_flock64 *ufl)
-{
- if (verify_area(VERIFY_READ, ufl, sizeof(struct ia32_flock64)) ||
- __get_user(kfl->l_type, &ufl->l_type) ||
- __get_user(kfl->l_whence, &ufl->l_whence) ||
- __copy_from_user(&kfl->l_start, &ufl->l_start, 8) ||
- __copy_from_user(&kfl->l_len, &ufl->l_len, 8) ||
- __get_user(kfl->l_pid, &ufl->l_pid))
- return -EFAULT;
- return 0;
-}
-
-static inline int
-put_flock64(struct flock *kfl, struct ia32_flock64 *ufl)
-{
- if (verify_area(VERIFY_WRITE, ufl, sizeof(struct ia32_flock64)) ||
- __put_user(kfl->l_type, &ufl->l_type) ||
- __put_user(kfl->l_whence, &ufl->l_whence) ||
- __copy_to_user(&ufl->l_start,&kfl->l_start, 8) ||
- __copy_to_user(&ufl->l_len,&kfl->l_len, 8) ||
- __put_user(kfl->l_pid, &ufl->l_pid))
- return -EFAULT;
- return 0;
-}
-
-asmlinkage long
-sys32_fcntl64(unsigned int fd, unsigned int cmd, unsigned long arg)
-{
+ {
struct flock f;
mm_segment_t old_fs;
long ret;
- /* sys_fcntl() is by default 64 bit and so don't know anything
- * about F_xxxx64 commands
- */
- switch (cmd) {
- case F_GETLK64:
- cmd = F_GETLK;
- break;
- case F_SETLK64:
- cmd = F_SETLK;
- break;
- case F_SETLKW64:
- cmd = F_SETLKW;
- break;
- }
-
- switch (cmd) {
- case F_SETLKW:
- case F_SETLK:
- if(get_flock64(&f, (struct ia32_flock64 *)arg))
+ if (get_flock(&f, (struct flock32 *)arg))
return -EFAULT;
- case F_GETLK:
- old_fs = get_fs();
- set_fs(KERNEL_DS);
+ old_fs = get_fs(); set_fs (KERNEL_DS);
ret = sys_fcntl(fd, cmd, (unsigned long)&f);
- set_fs(old_fs);
- if(cmd == F_GETLK64 && put_flock64(&f, (struct ia32_flock64 *)((long)arg)))
+ set_fs (old_fs);
+ if (ret) return ret;
+ if (put_flock(&f, (struct flock32 *)arg))
return -EFAULT;
- return ret;
+ return 0;
+ }
default:
- /*
- * `sys_fcntl' lies about arg, for the F_SETOWN
- * sub-function arg can have a negative value.
- */
- return sys_fcntl(fd, cmd, (unsigned long)((long)arg));
+ return sys_fcntl(fd, cmd, (unsigned long)arg);
}
}
+asmlinkage long sys32_fcntl64(unsigned int fd, unsigned int cmd, unsigned long arg)
+{
+ if (cmd >= F_GETLK64 && cmd <= F_SETLKW64)
+ return sys_fcntl(fd, cmd + F_GETLK - F_GETLK64, arg);
+ return sys32_fcntl(fd, cmd, arg);
+}
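`sys32_fcntl64` can forward the lock commands with simple arithmetic because the three `F_*LK64` commands are consecutive constants laid out in the same order as their non-64 counterparts, so one fixed offset maps the whole range. A sketch using the i386 values (the numeric constants below are assumed from the historical i386 headers):

```c
#include <assert.h>

/* historical i386 fcntl command values (assumed for illustration) */
#define F_GETLK    5
#define F_SETLK    6
#define F_SETLKW   7
#define F_GETLK64  12
#define F_SETLK64  13
#define F_SETLKW64 14

/* same remapping arithmetic as sys32_fcntl64 above */
static unsigned int map_fcntl64_cmd(unsigned int cmd)
{
	if (cmd >= F_GETLK64 && cmd <= F_SETLKW64)
		return cmd + F_GETLK - F_GETLK64;
	return cmd;
}
```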
+
int sys32_ni_syscall(int call)
{
printk(KERN_INFO "IA32 syscall %d from %s not implemented\n", call,
return -ENOSYS;
}
-/* In order to reduce some races, while at the same time doing additional
- * checking and hopefully speeding things up, we copy filenames to the
- * kernel data space before using them..
- *
- * POSIX.1 2.4: an empty pathname is invalid (ENOENT).
- */
-static inline int
-do_getname32(const char *filename, char *page)
-{
- int retval;
-
- /* 32bit pointer will be always far below TASK_SIZE :)) */
- retval = strncpy_from_user((char *)page, (char *)filename, PAGE_SIZE);
- if (retval > 0) {
- if (retval < PAGE_SIZE)
- return 0;
- return -ENAMETOOLONG;
- } else if (!retval)
- retval = -ENOENT;
- return retval;
-}
-
-char *
-getname32(const char *filename)
-{
- char *tmp, *result;
-
- result = ERR_PTR(-ENOMEM);
- tmp = (char *)__get_free_page(GFP_KERNEL);
- if (tmp) {
- int retval = do_getname32(filename, tmp);
-
- result = tmp;
- if (retval < 0) {
- putname(tmp);
- result = ERR_PTR(retval);
- }
- }
- return result;
-}
-
/* 32-bit timeval and related flotsam. */
extern asmlinkage long sys_utime(char * filename, struct utimbuf * times);
__get_user (t.actime, &times->actime) ||
__get_user (t.modtime, &times->modtime))
return -EFAULT;
- filenam = getname32 (filename);
+ filenam = getname (filename);
ret = PTR_ERR(filenam);
if (!IS_ERR(filenam)) {
old_fs = get_fs();
set_fs (KERNEL_DS);
ret = sys_utime(filenam, &t);
set_fs (old_fs);
- putname (filenam);
+ putname(filenam);
}
return ret;
}
static int checktype(char *user_type)
{
int err = 0;
- char **s,*kernel_type = getname32(user_type);
- if (!kernel_type)
+ char **s,*kernel_type = getname(user_type);
+ if (!kernel_type || IS_ERR(kernel_type))
return -EFAULT;
for (s = badfs; *s; ++s)
if (!strcmp(kernel_type, *s)) {
mm_segment_t old_fs;
int ret;
- kfilename = getname32(filename);
+ kfilename = getname(filename);
ret = PTR_ERR(kfilename);
if (!IS_ERR(kfilename)) {
if (tvs) {
typedef __kernel_ssize_t32 ssize_t32;
+/* warning. next two assume LE */
asmlinkage ssize_t32
sys32_pread(unsigned int fd, char *ubuf, __kernel_size_t32 count,
- u32 poshi, u32 poslo)
+ u32 poslo, u32 poshi)
{
return sys_pread(fd, ubuf, count,
((loff_t)AA(poshi) << 32) | AA(poslo));
asmlinkage ssize_t32
sys32_pwrite(unsigned int fd, char *ubuf, __kernel_size_t32 count,
- u32 poshi, u32 poslo)
+ u32 poslo, u32 poshi)
{
return sys_pwrite(fd, ubuf, count,
((loff_t)AA(poshi) << 32) | AA(poslo));
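The 32-bit ABI passes the 64-bit file offset as two 32-bit arguments, low half first on little-endian (hence the "assume LE" warning above). The reassembly used by both wrappers can be sketched as:

```c
#include <assert.h>

typedef unsigned int u32;

/* combine the two 32-bit halves into a 64-bit offset, mirroring
 * "((loff_t)AA(poshi) << 32) | AA(poslo)" above */
static long long join_offset(u32 poslo, u32 poshi)
{
	return ((long long)poshi << 32) | poslo;
}
```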
cnt = 0;
do {
int ret = get_user(val, (__u32 *)(u64)src);
- if (ret) {
+ if (ret)
return ret;
- }
if (dst)
dst[cnt] = (char *)(u64)val;
cnt++;
src += 4;
- } while(val && cnt < 1023); // XXX: fix limit.
+ if (cnt >= (MAX_ARG_PAGES*PAGE_SIZE)/sizeof(void*))
+ return -E2BIG;
+ } while(val);
if (dst)
dst[cnt-1] = 0;
return cnt;
char **buf;
int na,ne;
int ret;
+ unsigned sz;
na = nargs(argv, NULL);
if (na < 0)
if (ne < 0)
return -EFAULT;
- buf = kmalloc((na+ne)*sizeof(char*), GFP_KERNEL);
+ sz = (na+ne)*sizeof(void *);
+ if (sz > PAGE_SIZE)
+ buf = vmalloc(sz);
+ else
+ buf = kmalloc(sz, GFP_KERNEL);
if (!buf)
return -ENOMEM;
putname(name);
free:
+ if (sz > PAGE_SIZE)
+ vfree(buf);
+ else
kfree(buf);
return ret;
}
return sys_kill(pid, sig);
}
+
+#if defined(CONFIG_NFSD) || defined(CONFIG_NFSD_MODULE)
+/* Stuff for NFS server syscalls... */
+struct nfsctl_svc32 {
+ u16 svc32_port;
+ s32 svc32_nthreads;
+};
+
+struct nfsctl_client32 {
+ s8 cl32_ident[NFSCLNT_IDMAX+1];
+ s32 cl32_naddr;
+ struct in_addr cl32_addrlist[NFSCLNT_ADDRMAX];
+ s32 cl32_fhkeytype;
+ s32 cl32_fhkeylen;
+ u8 cl32_fhkey[NFSCLNT_KEYMAX];
+};
+
+struct nfsctl_export32 {
+ s8 ex32_client[NFSCLNT_IDMAX+1];
+ s8 ex32_path[NFS_MAXPATHLEN+1];
+ __kernel_dev_t32 ex32_dev;
+ __kernel_ino_t32 ex32_ino;
+ s32 ex32_flags;
+ __kernel_uid_t32 ex32_anon_uid;
+ __kernel_gid_t32 ex32_anon_gid;
+};
+
+struct nfsctl_uidmap32 {
+ u32 ug32_ident; /* char * */
+ __kernel_uid_t32 ug32_uidbase;
+ s32 ug32_uidlen;
+ u32 ug32_udimap; /* uid_t * */
+ __kernel_uid_t32 ug32_gidbase;
+ s32 ug32_gidlen;
+ u32 ug32_gdimap; /* gid_t * */
+};
+
+struct nfsctl_fhparm32 {
+ struct sockaddr gf32_addr;
+ __kernel_dev_t32 gf32_dev;
+ __kernel_ino_t32 gf32_ino;
+ s32 gf32_version;
+};
+
+struct nfsctl_fdparm32 {
+ struct sockaddr gd32_addr;
+ s8 gd32_path[NFS_MAXPATHLEN+1];
+ s32 gd32_version;
+};
+
+struct nfsctl_fsparm32 {
+ struct sockaddr gd32_addr;
+ s8 gd32_path[NFS_MAXPATHLEN+1];
+ s32 gd32_maxlen;
+};
+
+struct nfsctl_arg32 {
+ s32 ca32_version; /* safeguard */
+ union {
+ struct nfsctl_svc32 u32_svc;
+ struct nfsctl_client32 u32_client;
+ struct nfsctl_export32 u32_export;
+ struct nfsctl_uidmap32 u32_umap;
+ struct nfsctl_fhparm32 u32_getfh;
+ struct nfsctl_fdparm32 u32_getfd;
+ struct nfsctl_fsparm32 u32_getfs;
+ } u;
+#define ca32_svc u.u32_svc
+#define ca32_client u.u32_client
+#define ca32_export u.u32_export
+#define ca32_umap u.u32_umap
+#define ca32_getfh u.u32_getfh
+#define ca32_getfd u.u32_getfd
+#define ca32_getfs u.u32_getfs
+#define ca32_authd u.u32_authd
+};
+
+union nfsctl_res32 {
+ __u8 cr32_getfh[NFS_FHSIZE];
+ struct knfsd_fh cr32_getfs;
+};
+
+static int nfs_svc32_trans(struct nfsctl_arg *karg, struct nfsctl_arg32 *arg32)
+{
+ int err;
+
+ err = get_user(karg->ca_version, &arg32->ca32_version);
+ err |= __get_user(karg->ca_svc.svc_port, &arg32->ca32_svc.svc32_port);
+ err |= __get_user(karg->ca_svc.svc_nthreads, &arg32->ca32_svc.svc32_nthreads);
+ return err;
+}
+
+static int nfs_clnt32_trans(struct nfsctl_arg *karg, struct nfsctl_arg32 *arg32)
+{
+ int err;
+
+ err = get_user(karg->ca_version, &arg32->ca32_version);
+ err |= copy_from_user(&karg->ca_client.cl_ident[0],
+ &arg32->ca32_client.cl32_ident[0],
+ NFSCLNT_IDMAX);
+ err |= __get_user(karg->ca_client.cl_naddr, &arg32->ca32_client.cl32_naddr);
+ err |= copy_from_user(&karg->ca_client.cl_addrlist[0],
+ &arg32->ca32_client.cl32_addrlist[0],
+ (sizeof(struct in_addr) * NFSCLNT_ADDRMAX));
+ err |= __get_user(karg->ca_client.cl_fhkeytype,
+ &arg32->ca32_client.cl32_fhkeytype);
+ err |= __get_user(karg->ca_client.cl_fhkeylen,
+ &arg32->ca32_client.cl32_fhkeylen);
+ err |= copy_from_user(&karg->ca_client.cl_fhkey[0],
+ &arg32->ca32_client.cl32_fhkey[0],
+ NFSCLNT_KEYMAX);
+ return err;
+}
+
+static int nfs_exp32_trans(struct nfsctl_arg *karg, struct nfsctl_arg32 *arg32)
+{
+ int err;
+
+ err = get_user(karg->ca_version, &arg32->ca32_version);
+ err |= copy_from_user(&karg->ca_export.ex_client[0],
+ &arg32->ca32_export.ex32_client[0],
+ NFSCLNT_IDMAX);
+ err |= copy_from_user(&karg->ca_export.ex_path[0],
+ &arg32->ca32_export.ex32_path[0],
+ NFS_MAXPATHLEN);
+ err |= __get_user(karg->ca_export.ex_dev,
+ &arg32->ca32_export.ex32_dev);
+ err |= __get_user(karg->ca_export.ex_ino,
+ &arg32->ca32_export.ex32_ino);
+ err |= __get_user(karg->ca_export.ex_flags,
+ &arg32->ca32_export.ex32_flags);
+ err |= __get_user(karg->ca_export.ex_anon_uid,
+ &arg32->ca32_export.ex32_anon_uid);
+ err |= __get_user(karg->ca_export.ex_anon_gid,
+ &arg32->ca32_export.ex32_anon_gid);
+ karg->ca_export.ex_anon_uid = high2lowuid(karg->ca_export.ex_anon_uid);
+ karg->ca_export.ex_anon_gid = high2lowgid(karg->ca_export.ex_anon_gid);
+ return err;
+}
+
+static int nfs_uud32_trans(struct nfsctl_arg *karg, struct nfsctl_arg32 *arg32)
+{
+ u32 uaddr;
+ int i;
+ int err;
+
+ memset(karg, 0, sizeof(*karg));
+ if(get_user(karg->ca_version, &arg32->ca32_version))
+ return -EFAULT;
+	karg->ca_umap.ug_ident = kmalloc(PAGE_SIZE, GFP_USER);
+ if(!karg->ca_umap.ug_ident)
+ return -ENOMEM;
+ err = get_user(uaddr, &arg32->ca32_umap.ug32_ident);
+ if(strncpy_from_user(karg->ca_umap.ug_ident,
+ (char *)A(uaddr), PAGE_SIZE) <= 0)
+ return -EFAULT;
+ err |= __get_user(karg->ca_umap.ug_uidbase,
+ &arg32->ca32_umap.ug32_uidbase);
+ err |= __get_user(karg->ca_umap.ug_uidlen,
+ &arg32->ca32_umap.ug32_uidlen);
+ err |= __get_user(uaddr, &arg32->ca32_umap.ug32_udimap);
+ if (err)
+ return -EFAULT;
+ karg->ca_umap.ug_udimap = kmalloc((sizeof(uid_t) * karg->ca_umap.ug_uidlen),
+ GFP_USER);
+ if(!karg->ca_umap.ug_udimap)
+ return -ENOMEM;
+ for(i = 0; i < karg->ca_umap.ug_uidlen; i++)
+ err |= __get_user(karg->ca_umap.ug_udimap[i],
+ &(((__kernel_uid_t32 *)A(uaddr))[i]));
+ err |= __get_user(karg->ca_umap.ug_gidbase,
+ &arg32->ca32_umap.ug32_gidbase);
+	err |= __get_user(karg->ca_umap.ug_gidlen,
+ &arg32->ca32_umap.ug32_gidlen);
+ err |= __get_user(uaddr, &arg32->ca32_umap.ug32_gdimap);
+ if (err)
+ return -EFAULT;
+	karg->ca_umap.ug_gdimap = kmalloc((sizeof(gid_t) * karg->ca_umap.ug_gidlen),
+ GFP_USER);
+ if(!karg->ca_umap.ug_gdimap)
+ return -ENOMEM;
+ for(i = 0; i < karg->ca_umap.ug_gidlen; i++)
+ err |= __get_user(karg->ca_umap.ug_gdimap[i],
+ &(((__kernel_gid_t32 *)A(uaddr))[i]));
+
+ return err;
+}
+
+static int nfs_getfh32_trans(struct nfsctl_arg *karg, struct nfsctl_arg32 *arg32)
+{
+ int err;
+
+ err = get_user(karg->ca_version, &arg32->ca32_version);
+ err |= copy_from_user(&karg->ca_getfh.gf_addr,
+ &arg32->ca32_getfh.gf32_addr,
+ (sizeof(struct sockaddr)));
+ err |= __get_user(karg->ca_getfh.gf_dev,
+ &arg32->ca32_getfh.gf32_dev);
+ err |= __get_user(karg->ca_getfh.gf_ino,
+ &arg32->ca32_getfh.gf32_ino);
+ err |= __get_user(karg->ca_getfh.gf_version,
+ &arg32->ca32_getfh.gf32_version);
+ return err;
+}
+
+static int nfs_getfd32_trans(struct nfsctl_arg *karg, struct nfsctl_arg32 *arg32)
+{
+ int err;
+
+ err = get_user(karg->ca_version, &arg32->ca32_version);
+ err |= copy_from_user(&karg->ca_getfd.gd_addr,
+ &arg32->ca32_getfd.gd32_addr,
+ (sizeof(struct sockaddr)));
+ err |= copy_from_user(&karg->ca_getfd.gd_path,
+ &arg32->ca32_getfd.gd32_path,
+ (NFS_MAXPATHLEN+1));
+ err |= get_user(karg->ca_getfd.gd_version,
+ &arg32->ca32_getfd.gd32_version);
+ return err;
+}
+
+static int nfs_getfs32_trans(struct nfsctl_arg *karg, struct nfsctl_arg32 *arg32)
+{
+ int err;
+
+ err = get_user(karg->ca_version, &arg32->ca32_version);
+ err |= copy_from_user(&karg->ca_getfs.gd_addr,
+ &arg32->ca32_getfs.gd32_addr,
+ (sizeof(struct sockaddr)));
+ err |= copy_from_user(&karg->ca_getfs.gd_path,
+ &arg32->ca32_getfs.gd32_path,
+ (NFS_MAXPATHLEN+1));
+ err |= get_user(karg->ca_getfs.gd_maxlen,
+ &arg32->ca32_getfs.gd32_maxlen);
+ return err;
+}
+
+/* This really doesn't need translation; we are only passing
+ * back a union which contains opaque NFS file handle data.
+ */
+static int nfs_getfh32_res_trans(union nfsctl_res *kres, union nfsctl_res32 *res32)
+{
+ return copy_to_user(res32, kres, sizeof(*res32));
+}
+
+int asmlinkage sys32_nfsservctl(int cmd, struct nfsctl_arg32 *arg32, union nfsctl_res32 *res32)
+{
+ struct nfsctl_arg *karg = NULL;
+ union nfsctl_res *kres = NULL;
+ mm_segment_t oldfs;
+ int err;
+
+ karg = kmalloc(sizeof(*karg), GFP_USER);
+ if(!karg)
+ return -ENOMEM;
+ if(res32) {
+ kres = kmalloc(sizeof(*kres), GFP_USER);
+ if(!kres) {
+ kfree(karg);
+ return -ENOMEM;
+ }
+ }
+ switch(cmd) {
+ case NFSCTL_SVC:
+ err = nfs_svc32_trans(karg, arg32);
+ break;
+ case NFSCTL_ADDCLIENT:
+ err = nfs_clnt32_trans(karg, arg32);
+ break;
+ case NFSCTL_DELCLIENT:
+ err = nfs_clnt32_trans(karg, arg32);
+ break;
+ case NFSCTL_EXPORT:
+ case NFSCTL_UNEXPORT:
+ err = nfs_exp32_trans(karg, arg32);
+ break;
+	/* This one is unimplemented, but we're ready for it. */
+ case NFSCTL_UGIDUPDATE:
+ err = nfs_uud32_trans(karg, arg32);
+ break;
+ case NFSCTL_GETFH:
+ err = nfs_getfh32_trans(karg, arg32);
+ break;
+ case NFSCTL_GETFD:
+ err = nfs_getfd32_trans(karg, arg32);
+ break;
+ case NFSCTL_GETFS:
+ err = nfs_getfs32_trans(karg, arg32);
+ break;
+ default:
+ err = -EINVAL;
+ break;
+ }
+ if(err)
+ goto done;
+ oldfs = get_fs();
+ set_fs(KERNEL_DS);
+ err = sys_nfsservctl(cmd, karg, kres);
+ set_fs(oldfs);
+
+ if (err)
+ goto done;
+
+ if((cmd == NFSCTL_GETFH) ||
+ (cmd == NFSCTL_GETFD) ||
+ (cmd == NFSCTL_GETFS))
+ err = nfs_getfh32_res_trans(kres, res32);
+
+done:
+ if(karg) {
+ if(cmd == NFSCTL_UGIDUPDATE) {
+ if(karg->ca_umap.ug_ident)
+ kfree(karg->ca_umap.ug_ident);
+ if(karg->ca_umap.ug_udimap)
+ kfree(karg->ca_umap.ug_udimap);
+ if(karg->ca_umap.ug_gdimap)
+ kfree(karg->ca_umap.ug_gdimap);
+ }
+ kfree(karg);
+ }
+ if(kres)
+ kfree(kres);
+ return err;
+}
+#else /* !NFSD */
+extern asmlinkage long sys_ni_syscall(void);
+int asmlinkage sys32_nfsservctl(int cmd, void *notused, void *notused2)
+{
+ return sys_ni_syscall();
+}
+#endif
+
+int sys32_module_warning(void)
+{
+ static long warn_time = -(60*HZ);
+ if (time_before(warn_time + 60*HZ,jiffies) && strcmp(current->comm,"klogd")) {
+ printk(KERN_INFO "%s: 32bit modutils not supported on 64bit kernel\n",
+ current->comm);
+ warn_time = jiffies;
+ }
+	return -ENOSYS;
+}
+
+int sys_sched_getaffinity(pid_t pid, unsigned int len, unsigned long *new_mask_ptr);
+int sys_sched_setaffinity(pid_t pid, unsigned int len, unsigned long *new_mask_ptr);
+
+/* only works on LE */
+int sys32_sched_setaffinity(pid_t pid, unsigned int len,
+ unsigned int *new_mask_ptr)
+{
+ mm_segment_t oldfs = get_fs();
+ unsigned long mask;
+ int err;
+ if (get_user(mask, new_mask_ptr))
+ return -EFAULT;
+ set_fs(KERNEL_DS);
+ err = sys_sched_setaffinity(pid,sizeof(mask),&mask);
+ set_fs(oldfs);
+ return err;
+}
+
+/* only works on LE */
+int sys32_sched_getaffinity(pid_t pid, unsigned int len,
+ unsigned int *new_mask_ptr)
+{
+ mm_segment_t oldfs = get_fs();
+ unsigned long mask;
+ int err;
+ mask = 0;
+ set_fs(KERNEL_DS);
+ err = sys_sched_getaffinity(pid,sizeof(mask),&mask);
+ set_fs(oldfs);
+ if (!err)
+ err = put_user((u32)mask, new_mask_ptr);
+ return err;
+}
+
+struct exec_domain ia32_exec_domain = {
+ name: "linux/x86",
+ pers_low: PER_LINUX32,
+ pers_high: PER_LINUX32,
+};
+
+static int __init ia32_init (void)
+{
+ printk("IA32 emulation $Id: sys_ia32.c,v 1.32 2002/03/24 13:02:28 ak Exp $\n");
+ ia32_exec_domain.signal_map = default_exec_domain.signal_map;
+ ia32_exec_domain.signal_invmap = default_exec_domain.signal_invmap;
+ register_exec_domain(&ia32_exec_domain);
+ return 0;
+}
+
+__initcall(ia32_init);
obj-y := process.o semaphore.o signal.o entry.o traps.o irq.o \
ptrace.o i8259.o ioport.o ldt.o setup.o time.o sys_x86_64.o \
- pci-dma.o x8664_ksyms.o i387.o syscall.o early_printk.o vsyscall.o \
- setup64.o bluesmoke.o
+ pci-dma.o x8664_ksyms.o i387.o syscall.o vsyscall.o \
+ setup64.o bluesmoke.o bootflag.o
ifdef CONFIG_PCI
obj-y += pci-x86_64.o
obj-$(CONFIG_SMP) += smp.o smpboot.o trampoline.o
obj-$(CONFIG_X86_LOCAL_APIC) += apic.o nmi.o
obj-$(CONFIG_X86_IO_APIC) += io_apic.o mpparse.o
+#obj-$(CONFIG_ACPI) += acpi.o
+#obj-$(CONFIG_ACPI_SLEEP) += acpi_wakeup.o
+obj-$(CONFIG_EARLY_PRINTK) += early_printk.o
include $(TOPDIR)/Rules.make
/* Using APIC to generate smp_local_timer_interrupt? */
int using_apic_timer = 0;
+int dont_enable_local_apic __initdata = 0;
+
int prof_multiplier[NR_CPUS] = { 1, };
int prof_old_multiplier[NR_CPUS] = { 1, };
int prof_counter[NR_CPUS] = { 1, };
void clear_local_APIC(void)
{
int maxlvt;
- unsigned long v;
+ unsigned int v;
maxlvt = get_maxlvt();
apic_write_around(APIC_LVTPC, APIC_LVT_MASKED);
v = GET_APIC_VERSION(apic_read(APIC_LVR));
if (APIC_INTEGRATED(v)) { /* !82489DX */
- if (maxlvt > 3)
+ if (maxlvt > 3) /* Due to Pentium errata 3AP and 11AP. */
apic_write(APIC_ESR, 0);
apic_read(APIC_ESR);
}
void disable_local_APIC(void)
{
- unsigned long value;
+ unsigned int value;
clear_local_APIC();
*/
void __init init_bsp_APIC(void)
{
- unsigned long value, ver;
+ unsigned int value, ver;
/*
* Don't do the setup now if we have a SMP BIOS as the
void __init setup_local_APIC (void)
{
- unsigned long value, ver, maxlvt;
+ unsigned int value, ver, maxlvt;
/* Pound the ESR really hard over the head with a big hammer - mbligh */
if (esr_disable) {
if (maxlvt > 3) /* Due to the Pentium erratum 3AP. */
apic_write(APIC_ESR, 0);
value = apic_read(APIC_ESR);
- printk("ESR value before enabling vector: %08lx\n", value);
+ printk("ESR value before enabling vector: %08x\n", value);
value = ERROR_APIC_VECTOR; // enables sending errors
apic_write_around(APIC_LVTERR, value);
if (maxlvt > 3)
apic_write(APIC_ESR, 0);
value = apic_read(APIC_ESR);
- printk("ESR value after enabling vector: %08lx\n", value);
+ printk("ESR value after enabling vector: %08x\n", value);
} else {
if (esr_disable)
/*
unsigned int apic_lvterr;
unsigned int apic_tmict;
unsigned int apic_tdcr;
+ unsigned int apic_thmr;
} apic_pm_state;
static void apic_pm_suspend(void *data)
apic_pm_state.apic_lvterr = apic_read(APIC_LVTERR);
apic_pm_state.apic_tmict = apic_read(APIC_TMICT);
apic_pm_state.apic_tdcr = apic_read(APIC_TDCR);
+ apic_pm_state.apic_thmr = apic_read(APIC_LVTTHMR);
__save_flags(flags);
__cli();
disable_local_APIC();
apic_write(APIC_SPIV, apic_pm_state.apic_spiv);
apic_write(APIC_LVT0, apic_pm_state.apic_lvt0);
apic_write(APIC_LVT1, apic_pm_state.apic_lvt1);
+ apic_write(APIC_LVTTHMR, apic_pm_state.apic_thmr);
apic_write(APIC_LVTPC, apic_pm_state.apic_lvtpc);
apic_write(APIC_LVTT, apic_pm_state.apic_lvtt);
apic_write(APIC_TDCR, apic_pm_state.apic_tdcr);
static int __init detect_init_APIC (void)
{
u32 h, l, features;
- int needs_pm = 0;
extern void get_cpu_vendor(struct cpuinfo_x86*);
/* Workaround for us being called before identify_cpu(). */
l &= ~MSR_IA32_APICBASE_BASE;
l |= MSR_IA32_APICBASE_ENABLE | APIC_DEFAULT_PHYS_BASE;
wrmsr(MSR_IA32_APICBASE, l, h);
- needs_pm = 1;
}
}
/*
printk("Could not enable APIC!\n");
return -1;
}
- set_bit(X86_FEATURE_APIC, &boot_cpu_data.x86_capability);
+ set_bit(X86_FEATURE_APIC, boot_cpu_data.x86_capability);
mp_lapic_addr = APIC_DEFAULT_PHYS_BASE;
boot_cpu_id = 0;
if (nmi_watchdog != NMI_NONE)
nmi_watchdog = NMI_LOCAL_APIC;
printk("Found and enabled local APIC!\n");
-
- if (needs_pm)
- apic_pm_init1();
-
return 0;
no_apic:
void setup_APIC_timer(void * data)
{
- unsigned long clocks = (unsigned long) data, slice, t0, t1;
+ unsigned int clocks = (unsigned long) data, slice, t0, t1;
unsigned long flags;
int delta;
*/
slice = clocks / (smp_num_cpus+1);
- printk("cpu: %d, clocks: %lu, slice: %lu\n", smp_processor_id(), clocks, slice);
+	printk("cpu: %d, clocks: %u, slice: %u\n",
+	       smp_processor_id(), clocks, slice);
/*
* Wait for IRQ0's slice:
__setup_APIC_LVTT(clocks);
- printk("CPU%d<T0:%lu,T1:%lu,D:%d,S:%lu,C:%lu>\n", smp_processor_id(), t0, t1, delta, slice, clocks);
+ printk("CPU%d<T0:%u,T1:%u,D:%d,S:%u,C:%u>\n",
+ smp_processor_id(), t0, t1, delta, slice, clocks);
__restore_flags(flags);
}
int __init calibrate_APIC_clock(void)
{
- unsigned long long t1 = 0, t2 = 0;
- long tt1, tt2;
- long result;
+ unsigned long t1 = 0, t2 = 0;
+ int tt1, tt2;
+ int result;
int i;
const int LOOPS = HZ/10;
result = (tt1-tt2)*APIC_DIVISOR/LOOPS;
+
+	printk("t1 = %lu t2 = %lu tt1 = %d tt2 = %d\n", t1, t2, tt1, tt2);
+
if (cpu_has_tsc)
- printk("..... CPU clock speed is %ld.%04ld MHz.\n",
- ((long)(t2-t1)/LOOPS)/(1000000/HZ),
- ((long)(t2-t1)/LOOPS)%(1000000/HZ));
+ printk("..... CPU clock speed is %d.%04d MHz.\n",
+ ((int)(t2-t1)/LOOPS)/(1000000/HZ),
+ ((int)(t2-t1)/LOOPS)%(1000000/HZ));
- printk("..... host bus clock speed is %ld.%04ld MHz.\n",
+ printk("..... host bus clock speed is %d.%04d MHz.\n",
result/(1000000/HZ),
result%(1000000/HZ));
return result;
}
-static unsigned long calibration_result;
+static unsigned int calibration_result;
void __init setup_APIC_clocks (void)
{
/*
* Now set up the timer for real.
*/
- setup_APIC_timer((void *)calibration_result);
+ setup_APIC_timer((void *)(u64)calibration_result);
__sti();
/* and update all other cpus */
- smp_call_function(setup_APIC_timer, (void *)calibration_result, 1, 1);
+ smp_call_function(setup_APIC_timer, (void *)(u64)calibration_result, 1, 1);
}
void __init disable_APIC_timer(void)
* value into /proc/profile.
*/
-inline void smp_local_timer_interrupt(struct pt_regs * regs)
+inline void smp_local_timer_interrupt(struct pt_regs *regs)
{
int user = user_mode(regs);
int cpu = smp_processor_id();
*/
unsigned int apic_timer_irqs [NR_CPUS];
-void smp_apic_timer_interrupt(struct pt_regs regs)
+void smp_apic_timer_interrupt(struct pt_regs *regs)
{
int cpu = smp_processor_id();
* interrupt lock, which is the WrongThing (tm) to do.
*/
irq_enter(cpu, 0);
-	smp_local_timer_interrupt(&regs);
+ smp_local_timer_interrupt(regs);
irq_exit(cpu, 0);
if (softirq_pending(cpu))
*/
asmlinkage void smp_spurious_interrupt(void)
{
- unsigned long v;
+ unsigned int v;
+ static unsigned long last_warning;
+ static unsigned long skipped;
/*
* Check if this really is a spurious interrupt and ACK it
ack_APIC_irq();
/* see sw-dev-man vol 3, chapter 7.4.13.5 */
- printk(KERN_INFO "spurious APIC interrupt on CPU#%d, should never happen.\n",
- smp_processor_id());
+	if (time_after(jiffies, last_warning+30*HZ)) {
+		printk(KERN_INFO "spurious APIC interrupt on CPU#%d, %ld skipped.\n",
+		       smp_processor_id(), skipped);
+		last_warning = jiffies;
+		skipped = 0;
+	} else {
+		skipped++;
+	}
}
/*
asmlinkage void smp_error_interrupt(void)
{
- unsigned long v, v1;
+ unsigned int v, v1;
/* First tickle the hardware, only then report what went on. -- REW */
v = apic_read(APIC_ESR);
6: Received illegal vector
7: Illegal register address
*/
- printk (KERN_ERR "APIC error on CPU%d: %02lx(%02lx)\n",
+ printk (KERN_ERR "APIC error on CPU%d: %02x(%02x)\n",
smp_processor_id(), v , v1);
}
#include <linux/types.h>
#include <linux/kernel.h>
#include <linux/sched.h>
+#include <linux/smp.h>
+#include <linux/config.h>
+#include <linux/irq.h>
#include <asm/processor.h>
+#include <asm/system.h>
#include <asm/msr.h>
+#include <asm/apic.h>
+#include <asm/pgtable.h>
+#include <asm/tlbflush.h>
static int mce_disabled __initdata = 0;
+static int banks;
+
/*
- * Machine Check Handler For PII/PIII/K7
+ * If we get an MCE, we don't know what state the caches/TLBs are
+ * going to be in, so we throw them all away.
*/
+static inline void flush_all(void)
+{
+ __asm__ __volatile__ ("invd": : );
+ __flush_tlb();
+}
-static int banks;
+/*
+ * P4/Xeon Thermal transition interrupt handler
+ */
+
+static void intel_thermal_interrupt(struct pt_regs *regs)
+{
+#ifdef CONFIG_X86_LOCAL_APIC
+ u32 l, h;
+ unsigned int cpu = smp_processor_id();
+
+ ack_APIC_irq();
+
+ rdmsr(MSR_IA32_THERM_STATUS, l, h);
+ if (l & 1) {
+ printk(KERN_EMERG "CPU#%d: Temperature above threshold\n", cpu);
+ printk(KERN_EMERG "CPU#%d: Running in modulated clock mode\n", cpu);
+ } else {
+ printk(KERN_INFO "CPU#%d: Temperature/speed normal\n", cpu);
+ }
+#endif
+}
+
+static void unexpected_thermal_interrupt(struct pt_regs *regs)
+{
+ printk(KERN_ERR "CPU#%d: Unexpected LVT TMR interrupt!\n", smp_processor_id());
+}
+
+/*
+ * Thermal interrupt handler for this CPU setup
+ */
+
+static void (*vendor_thermal_interrupt)(struct pt_regs *regs) = unexpected_thermal_interrupt;
+
+asmlinkage void smp_thermal_interrupt(struct pt_regs regs)
+{
+	vendor_thermal_interrupt(&regs);
+}
+
+/* P4/Xeon Thermal regulation detect and init */
+
+static void __init intel_init_thermal(struct cpuinfo_x86 *c)
+{
+#ifdef CONFIG_X86_LOCAL_APIC
+ u32 l, h;
+ unsigned int cpu = smp_processor_id();
+
+ /* Thermal monitoring */
+	if (!test_bit(X86_FEATURE_ACPI, c->x86_capability))
+ return; /* -ENODEV */
+
+ /* Clock modulation */
+	if (!test_bit(X86_FEATURE_ACC, c->x86_capability))
+ return; /* -ENODEV */
+
+ rdmsr(MSR_IA32_MISC_ENABLE, l, h);
+	/* First check if it is enabled already, in which case there may be
+	 * SMM goo which handles it; we can't even install a handler then,
+	 * since it might already be delivered via SMI. -zwanem.
+	 */
+
+ if (l & (1<<3)) {
+ printk(KERN_DEBUG "CPU#%d: Thermal monitoring already enabled\n", cpu);
+ } else {
+ wrmsr(MSR_IA32_MISC_ENABLE, l | (1<<3), h);
+ printk(KERN_INFO "CPU#%d: Thermal monitoring enabled\n", cpu);
+ }
+
+	/* check whether a vector already exists */
+ l = apic_read(APIC_LVTTHMR);
+ if (l & 0xff) {
+ printk(KERN_DEBUG "CPU#%d: Thermal LVT already handled\n", cpu);
+ return; /* -EBUSY */
+ }
+
+
+ /* The temperature transition interrupt handler setup */
+ l = THERMAL_APIC_VECTOR; /* our delivery vector */
+ l |= (APIC_DM_FIXED | APIC_LVT_MASKED); /* we'll mask till we're ready */
+ apic_write_around(APIC_LVTTHMR, l);
+
+ rdmsr(MSR_IA32_THERM_INTERRUPT, l, h);
+ wrmsr(MSR_IA32_THERM_INTERRUPT, l | 0x3 , h);
+
+ /* ok we're good to go... */
+ vendor_thermal_interrupt = intel_thermal_interrupt;
+ l = apic_read(APIC_LVTTHMR);
+ apic_write_around(APIC_LVTTHMR, l & ~APIC_LVT_MASKED);
+
+ return;
+#endif
+}
+
+/*
+ * Machine Check Handler For PII/PIII
+ */
static void intel_machine_check(struct pt_regs * regs, long error_code)
{
u32 mcgstl, mcgsth;
int i;
+ flush_all();
+
rdmsr(MSR_IA32_MCG_STATUS, mcgstl, mcgsth);
if(mcgstl&(1<<0)) /* Recoverable ? */
recover=0;
if(high&(1<<27))
{
rdmsr(MSR_IA32_MC0_MISC+i*4, alow, ahigh);
- printk("[%08x%08x]", alow, ahigh);
+ printk("[%08x%08x]", ahigh, alow);
}
if(high&(1<<26))
{
rdmsr(MSR_IA32_MC0_ADDR+i*4, alow, ahigh);
- printk(" at %08x%08x",
- ahigh, alow);
+ printk(" at %08x%08x", ahigh, alow);
}
printk("\n");
/* Clear it */
wrmsr(MSR_IA32_MCG_STATUS,mcgstl, mcgsth);
}
-static void unexpected_machine_check(struct pt_regs *regs, long error_code)
+/*
+ * Handle unconfigured int18 (should never happen)
+ */
+
+static void unexpected_machine_check(struct pt_regs * regs, long error_code)
{
- printk("unexpected machine check %lx\n", error_code);
+ printk(KERN_ERR "CPU#%d: Unexpected int18 (Machine Check).\n", smp_processor_id());
}
/*
static void (*machine_check_vector)(struct pt_regs *, long error_code) = unexpected_machine_check;
-void do_machine_check(struct pt_regs * regs, long error_code)
+asmlinkage void do_machine_check(struct pt_regs * regs, long error_code)
{
machine_check_vector(regs, error_code);
}
+
+#ifdef CONFIG_X86_MCE_NONFATAL
+struct timer_list mce_timer;
+
+static void mce_checkregs (unsigned int cpu)
+{
+ u32 low, high;
+ int i;
+
+ if (cpu!=smp_processor_id())
+ BUG();
+
+ for (i=0; i<banks; i++) {
+ rdmsr(MSR_IA32_MC0_STATUS+i*4, low, high);
+
+ if ((low | high) != 0) {
+ flush_all();
+			printk (KERN_EMERG "MCE: The hardware reports a non-fatal, correctable incident on CPU %d.\n", smp_processor_id());
+ printk (KERN_EMERG "Bank %d: %08x%08x\n", i, high, low);
+
+ /* Scrub the error so we don't pick it up in 5 seconds time. */
+ wrmsr(MSR_IA32_MC0_STATUS+i*4, 0UL, 0UL);
+
+ /* Serialize */
+ wmb();
+ }
+ }
+
+ /* Refresh the timer. */
+ mce_timer.expires = jiffies + 5 * HZ;
+ add_timer (&mce_timer);
+}
+
+static void mce_timerfunc (unsigned long data)
+{
+ int i;
+
+ for (i=0; i<smp_num_cpus; i++) {
+ if (i == smp_processor_id())
+ mce_checkregs(i);
+ else
+ smp_call_function (mce_checkregs, i, 1, 1);
+ }
+}
+#endif
+
+
/*
- * Set up machine check reporting for Intel processors
+ * Set up machine check reporting for processors with Intel style MCE
*/
static void __init intel_mcheck_init(struct cpuinfo_x86 *c)
* Check for MCE support
*/
- if( !test_bit(X86_FEATURE_MCE, &c->x86_capability) )
+ if( !test_bit(X86_FEATURE_MCE, c->x86_capability) )
return;
/*
* Check for PPro style MCA
*/
- if( !test_bit(X86_FEATURE_MCA, &c->x86_capability) )
+ if( !test_bit(X86_FEATURE_MCA, c->x86_capability) )
return;
/* Ok machine check is available */
if(done==0)
printk(KERN_INFO "Intel machine check architecture supported.\n");
rdmsr(MSR_IA32_MCG_CAP, l, h);
- if(l&(1<<8))
+ if(l&(1<<8)) /* Control register present ? */
wrmsr(MSR_IA32_MCG_CTL, 0xffffffff, 0xffffffff);
banks = l&0xff;
- for(i=1;i<banks;i++)
- {
+
+ /* Don't enable bank 0 on intel P6 cores, it goes bang quickly. */
+ if (c->x86_vendor == X86_VENDOR_INTEL && c->x86 == 6) {
+ for(i=1; i<banks; i++)
+ wrmsr(MSR_IA32_MC0_CTL+4*i, 0xffffffff, 0xffffffff);
+ } else {
+ for(i=0; i<banks; i++)
wrmsr(MSR_IA32_MC0_CTL+4*i, 0xffffffff, 0xffffffff);
}
- for(i=0;i<banks;i++)
- {
+
+ for(i=0; i<banks; i++)
wrmsr(MSR_IA32_MC0_STATUS+4*i, 0x0, 0x0);
- }
+
set_in_cr4(X86_CR4_MCE);
printk(KERN_INFO "Intel machine check reporting enabled on CPU#%d.\n", smp_processor_id());
+
+ intel_init_thermal(c);
+
done=1;
}
* This has to be run for each processor
*/
-
-
void __init mcheck_init(struct cpuinfo_x86 *c)
{
if(mce_disabled==1)
switch(c->x86_vendor)
{
case X86_VENDOR_AMD:
- /*
- * AMD K7 machine check is Intel like
- */
- if(c->x86 == 6)
+ if(c->x86 == 6 || c->x86 == 15) {
intel_mcheck_init(c);
+#ifdef CONFIG_X86_MCE_NONFATAL
+ /* Set the timer to check for non-fatal errors every 5 seconds */
+ init_timer (&mce_timer);
+ mce_timer.expires = jiffies + 5 * HZ;
+ mce_timer.data = 0;
+ mce_timer.function = &mce_timerfunc;
+ add_timer (&mce_timer);
+#endif
+ }
break;
+
case X86_VENDOR_INTEL:
intel_mcheck_init(c);
break;
+
default:
break;
}
--- /dev/null
+/*
+ * Implement 'Simple Boot Flag Specification 1.0'
+ *
+ */
+
+
+#include <linux/config.h>
+#include <linux/types.h>
+#include <linux/kernel.h>
+#include <linux/init.h>
+#include <linux/string.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+#include <asm/io.h>
+
+#include <linux/mc146818rtc.h>
+
+
+#define SBF_RESERVED (0x78)
+#define SBF_PNPOS (1<<0)
+#define SBF_BOOTING (1<<1)
+#define SBF_DIAG (1<<2)
+#define SBF_PARITY (1<<7)
+
+
+struct sbf_boot
+{
+ u8 sbf_signature[4];
+ u32 sbf_len;
+ u8 sbf_revision __attribute((packed));
+ u8 sbf_csum __attribute((packed));
+ u8 sbf_oemid[6] __attribute((packed));
+ u8 sbf_oemtable[8] __attribute((packed));
+ u8 sbf_revdata[4] __attribute((packed));
+ u8 sbf_creator[4] __attribute((packed));
+ u8 sbf_crearev[4] __attribute((packed));
+ u8 sbf_cmos __attribute((packed));
+ u8 sbf_spare[3] __attribute((packed));
+};
+
+
+static int sbf_port __initdata = -1;
+
+static int __init sbf_struct_valid(unsigned long tptr)
+{
+ u8 *ap;
+ u8 v;
+ unsigned int i;
+ struct sbf_boot sb;
+
+ memcpy_fromio(&sb, tptr, sizeof(sb));
+
+	/* 39 on IBM ThinkPad A21m, BIOS version 1.02b (KXET24WW; 2000-12-19). */
+	if(sb.sbf_len != 40 && sb.sbf_len != 39)
+		return 0;
+
+ ap = (u8 *)&sb;
+ v= 0;
+
+ for(i=0;i<sb.sbf_len;i++)
+ v+=*ap++;
+
+ if(v)
+ return 0;
+
+ if(memcmp(sb.sbf_signature, "BOOT", 4))
+ return 0;
+
+ if (sb.sbf_len == 39)
+ printk (KERN_WARNING "SBF: ACPI BOOT descriptor is wrong length (%d)\n",
+ sb.sbf_len);
+
+ sbf_port = sb.sbf_cmos; /* Save CMOS port */
+ return 1;
+}
+
+static int __init parity(u8 v)
+{
+ int x = 0;
+ int i;
+
+ for(i=0;i<8;i++)
+ {
+ x^=(v&1);
+ v>>=1;
+ }
+ return x;
+}
+
+static void __init sbf_write(u8 v)
+{
+ unsigned long flags;
+ if(sbf_port != -1)
+ {
+ v &= ~SBF_PARITY;
+ if(!parity(v))
+ v|=SBF_PARITY;
+
+ printk(KERN_INFO "SBF: Setting boot flags 0x%x\n",v);
+
+ spin_lock_irqsave(&rtc_lock, flags);
+ CMOS_WRITE(v, sbf_port);
+ spin_unlock_irqrestore(&rtc_lock, flags);
+ }
+}
+
+static u8 __init sbf_read(void)
+{
+ u8 v;
+ unsigned long flags;
+ if(sbf_port == -1)
+ return 0;
+ spin_lock_irqsave(&rtc_lock, flags);
+ v = CMOS_READ(sbf_port);
+ spin_unlock_irqrestore(&rtc_lock, flags);
+ return v;
+}
+
+static int __init sbf_value_valid(u8 v)
+{
+ if(v&SBF_RESERVED) /* Reserved bits */
+ return 0;
+ if(!parity(v))
+ return 0;
+ return 1;
+}
+
+
+static void __init sbf_bootup(void)
+{
+ u8 v;
+ if(sbf_port == -1)
+ return;
+ v = sbf_read();
+ if(!sbf_value_valid(v))
+ printk(KERN_WARNING "SBF: Simple boot flag value 0x%x read from CMOS RAM was invalid\n",v);
+ v &= ~SBF_RESERVED;
+ v &= ~SBF_BOOTING;
+ v &= ~SBF_DIAG;
+#if defined(CONFIG_ISAPNP)
+ v |= SBF_PNPOS;
+#endif
+ sbf_write(v);
+}
+
+static int __init sbf_init(void)
+{
+ unsigned int i;
+ void *rsdt;
+ u32 rsdtlen = 0;
+ u32 rsdtbase = 0;
+ u8 sum = 0;
+ int n;
+
+ u8 *p;
+
+ for(i=0xE0000; i <= 0xFFFE0; i+=16)
+ {
+ p = phys_to_virt(i);
+
+ if(memcmp(p, "RSD PTR ", 8))
+ continue;
+
+ sum = 0;
+ for(n=0; n<20; n++)
+ sum+=p[n];
+
+ if(sum != 0)
+ continue;
+
+ /* So it says RSD PTR and it checksums... */
+
+ /*
+ * Process the RDSP pointer
+ */
+
+ rsdtbase = *(u32 *)(p+16);
+
+ /*
+ * RSDT length is ACPI 2 only, for ACPI 1 we must map
+ * and remap.
+ */
+
+ if(p[15]>1)
+ rsdtlen = *(u32 *)(p+20);
+ else
+ rsdtlen = 36;
+
+ if(rsdtlen < 36 || rsdtlen > 1024)
+ continue;
+ break;
+ }
+ if(i>0xFFFE0)
+ return 0;
+
+
+ rsdt = ioremap(rsdtbase, rsdtlen);
+ if(rsdt == 0)
+ return 0;
+
+ i = readl(rsdt + 4);
+
+ /*
+ * Remap if needed
+ */
+
+ if(i > rsdtlen)
+ {
+ rsdtlen = i;
+ iounmap(rsdt);
+ rsdt = ioremap(rsdtbase, rsdtlen);
+ if(rsdt == 0)
+ return 0;
+ }
+
+ for(n = 0; n < i; n++)
+ sum += readb(rsdt + n);
+
+ if(sum)
+ {
+ iounmap(rsdt);
+ return 0;
+ }
+
+ /* Ok the RSDT checksums too */
+
+ for(n = 36; n+3 < i; n += 4)
+ {
+ unsigned long rp = readl(rsdt+n);
+ int len = 4096;
+
+ if(rp > 0xFFFFFFFFUL - len)
+ len = 0xFFFFFFFFUL - rp;
+
+ /* Too close to the end!! */
+ if(len < 20)
+ continue;
+		rp = (unsigned long)ioremap(rp, len);
+ if(rp == 0)
+ continue;
+ if(sbf_struct_valid(rp))
+ {
+ /* Found the BOOT table and processed it */
+ printk(KERN_INFO "SBF: Simple Boot Flag extension found and enabled.\n");
+ }
+ iounmap((void *)rp);
+ }
+ iounmap(rsdt);
+ sbf_bootup();
+ return 0;
+}
+
+module_init(sbf_init);
#include <linux/poll.h>
#include <linux/smp.h>
#include <linux/major.h>
+#include <linux/fs.h>
#include <linux/smp_lock.h>
#include <linux/fs.h>
+#define printk real_printk
#include <asm/io.h>
+#undef printk
-/* This is "wrong" address to access it, we should access it using
- 0xffff8000000b8000ul; but 0xffff8000000b8000ul is not available
- early at boot. */
-#define VGABASE 0xffffffff800b8000ul
+
+/* This is the "wrong" address to access it; we should use
+ * 0xffff8000000b8000ul, but that mapping is not available early at boot. */
+#define VGABASE 0xffffffff800b8000ul
#define MAX_YPOS 25
#define MAX_XPOS 80
static char buf[1024];
+int printk(const char *fmt, ...) __attribute__((alias("early_printk")));
+
int early_printk(const char *fmt, ...)
{
va_list args;
* Copyright (C) 2000, 2001, 2002 Andi Kleen SuSE Labs
* Copyright (C) 2000 Pavel Machek <pavel@suse.cz>
*
- * $Id: entry.S,v 1.66 2001/11/11 17:47:47 ak Exp $
+ * $Id$
*/
/*
#include <asm/msr.h>
#include <asm/unistd.h>
#include <asm/thread_info.h>
-
-#define RIP_SYMBOL_NAME(x) x(%rip)
+#include <asm/hw_irq.h>
.code64
movq \tmp,RSP(%rsp)
movq $__USER_DS,SS(%rsp)
movq $__USER_CS,CS(%rsp)
- movq RCX(%rsp),\tmp /* get return address */
- movq \tmp,RIP(%rsp)
+ movq $-1,RCX(%rsp)
movq R11(%rsp),\tmp /* get eflags */
movq \tmp,EFLAGS(%rsp)
.endm
.macro RESTORE_TOP_OF_STACK tmp,offset=0
movq RSP-\offset(%rsp),\tmp
movq \tmp,PDAREF(pda_oldrsp)
- movq RIP-\offset(%rsp),\tmp
- movq \tmp,RCX-\offset(%rsp)
movq EFLAGS-\offset(%rsp),\tmp
movq \tmp,R11-\offset(%rsp)
.endm
addq $8*6, %rsp
.endm
-
/*
* A newly forked process directly context switches into this.
*/
ENTRY(ret_from_fork)
+#if CONFIG_SMP || CONFIG_PREEMPT
+ call schedule_tail
+#endif
GET_THREAD_INFO(%rcx)
bt $TIF_SYSCALL_TRACE,threadinfo_flags(%rcx)
jc rff_trace
rff_action:
RESTORE_REST
- cmpq $__KERNEL_CS,CS-ARGOFFSET(%rsp) # from kernel_thread?
+ testl $3,CS-ARGOFFSET(%rsp) # from kernel_thread?
je int_ret_from_sys_call
testl $_TIF_IA32,threadinfo_flags(%rcx)
jnz int_ret_from_sys_call
* rcx return address for syscall/sysret, C arg3
* rsi arg1
* rdx arg2
- * r10 arg4 (--> moved to rcx for C, serves as TOS flag afterwards)
- * r8 arg5
- * r9 arg6
+ * r10 arg3 (--> moved to rcx for C)
+ * r8 arg4
+ * r9 arg5
* r11 eflags for syscall/sysret, temporary for C
* r12-r15,rbp,rbx saved by C code, not touched.
*
* Interrupts are off on entry.
* Only called from user space.
*
- * XXX need to add a flag for thread_saved_pc/KSTK_*.
+ * XXX if we had a free scratch register we could save the RSP into the stack frame
+ * and report it properly in ps. Unfortunately we don't have one.
*/
ENTRY(system_call)
swapgs
movq %rsp,PDAREF(pda_oldrsp)
movq PDAREF(pda_kernelstack),%rsp
- pushq %rax
sti
- SAVE_ARGS
+ SAVE_ARGS 8,1
+ movq %rax,ORIG_RAX-ARGOFFSET(%rsp)
+ movq %rcx,RIP-ARGOFFSET(%rsp)
GET_THREAD_INFO(%rcx)
bt $TIF_SYSCALL_TRACE,threadinfo_flags(%rcx)
jc tracesys
* Syscall return path ending with SYSRET (fast path)
* Has incomplete stack frame and undefined top of stack.
*/
-ENTRY(ret_from_sys_call)
+ .globl ret_from_sys_call
+ret_from_sys_call:
+ movl $_TIF_WORK_MASK,%edi
+ /* edi: flagmask */
+sysret_check:
GET_THREAD_INFO(%rcx)
cli
movl threadinfo_flags(%rcx),%edx
- andl $_TIF_WORK_MASK,%edx # tracesys has been already checked.
+ andl %edi,%edx
jnz sysret_careful
-sysret_restore_args:
- RESTORE_ARGS
+ movq RIP-ARGOFFSET(%rsp),%rcx
+ RESTORE_ARGS 0,-ARG_SKIP,1
movq PDAREF(pda_oldrsp),%rsp
swapgs
- SYSRET64
+ sysretq
+ /* Handle reschedules */
+ /* edx: work, edi: workmask */
sysret_careful:
bt $TIF_NEED_RESCHED,%edx
- jnc 1f
+ jnc sysret_signal
+ sti
+ pushq %rdi
call schedule
- jmp ret_from_sys_call
-1: sti
- SAVE_REST
- FIXUP_TOP_OF_STACK %rax
- xorq %rsi,%rsi # oldset
- movq %rsp,%rdi # &ptregs
- call do_notify_resume
- RESTORE_TOP_OF_STACK %rax
- RESTORE_REST
- jmp ret_from_sys_call
+ popq %rdi
+ jmp sysret_check
+
+ /* Handle a signal */
+sysret_signal:
+ sti
+ testl $(_TIF_SIGPENDING|_TIF_NOTIFY_RESUME),%edx
+ jz 1f
+
+ /* Really a signal */
+ /* edx: work flags (arg3) */
+ leaq do_notify_resume(%rip),%rax
+ leaq -ARGOFFSET(%rsp),%rdi # &pt_regs -> arg1
+ xorl %esi,%esi # oldset -> arg2
+ call ptregscall_common
+1: movl $_TIF_NEED_RESCHED,%edi
+ jmp sysret_check
+ /* Do syscall tracing */
tracesys:
SAVE_REST
movq $-ENOSYS,RAX(%rsp)
cmpq $__NR_syscall_max,%rax
ja 1f
movq %r10,%rcx /* fixup for C */
- movl $1,%r10d /* set TOS flag */
call *sys_call_table(,%rax,8)
movq %rax,RAX-ARGOFFSET(%rsp)
SAVE_REST
* Has correct top of stack, but partial stack frame.
*/
ENTRY(int_ret_from_sys_call)
- cmpq $__KERNEL_CS,CS-ARGOFFSET(%rsp) # in kernel syscall?
+ testl $3,CS-ARGOFFSET(%rsp) # kernel syscall?
je int_restore_args
- movl $_TIF_ALLWORK_MASK,%esi
-int_with_reschedule:
+ movl $_TIF_ALLWORK_MASK,%edi
+ /* edi: mask to check */
+int_with_check:
GET_THREAD_INFO(%rcx)
cli
movl threadinfo_flags(%rcx),%edx
- andl %esi,%edx
+ andl %edi,%edx
jnz int_careful
+int_restore_swapgs:
swapgs
int_restore_args:
- RESTORE_ARGS
- addq $8,%rsp # Remove oldrax
+ RESTORE_ARGS 0,8,0
iretq
+ /* Either reschedule or signal or syscall exit tracking needed. */
+ /* First do a reschedule test. */
+ /* edx: work, edi: workmask */
int_careful:
- sti
bt $TIF_NEED_RESCHED,%edx
jnc int_very_careful
+ sti
+ pushq %rdi
call schedule
- movl $_TIF_ALLWORK_MASK,%esi
- jmp int_with_reschedule
+ popq %rdi
+ jmp int_with_check
+
+ /* handle signals and tracing -- both require a full stack frame */
int_very_careful:
+ sti
SAVE_REST
- leaq syscall_trace(%rip),%rbp
- leaq do_notify_resume(%rip),%rbx
+ /* Check for syscall exit trace */
bt $TIF_SYSCALL_TRACE,%edx
- cmovcq %rbp,%rbx
- xorq %rsi,%rsi # oldset -> arg2
+ jnc int_signal
movq %rsp,%rdi # &ptregs -> arg1
- call *%rbx
+ pushq %rdi
+ call syscall_trace
+ popq %rdi
+ btr $TIF_SYSCALL_TRACE,%edi
+ jmp int_restore_rest
+
+int_signal:
+ testl $(_TIF_NOTIFY_RESUME|_TIF_SIGPENDING),%edx
+ jz 1f
+ movq %rsp,%rdi # &ptregs -> arg1
+ xorl %esi,%esi # oldset -> arg2
+ call do_notify_resume
+1: movl $_TIF_NEED_RESCHED,%edi
+int_restore_rest:
RESTORE_REST
- movl $_TIF_WORK_MASK,%esi
- jmp int_with_reschedule
+ jmp int_with_check
/*
* Certain special system calls that need to save a complete full stack frame.
FIXUP_TOP_OF_STACK %r11
call sys_execve
GET_THREAD_INFO(%rcx)
- testl $_TIF_IA32,threadinfo_flags(%rcx)
- jnz exec_32bit
+ bt $TIF_IA32,threadinfo_flags(%rcx)
+ jc exec_32bit
RESTORE_TOP_OF_STACK %r11
movq %r15, %r11
RESTORE_REST
*/
/* 0(%rsp): interrupt number */
-ENTRY(common_interrupt)
- cmpq $__KERNEL_CS,16(%rsp)
- je 1f
- swapgs
-1: cld
+ .macro interrupt func
+ cld
SAVE_ARGS
#ifdef CONFIG_PREEMPT
GET_THREAD_INFO(%rdx)
incl threadinfo_preempt_count(%rdx)
#endif
leaq -ARGOFFSET(%rsp),%rdi # arg1 for handler
- addl $1,PDAREF(pda_irqcount) # XXX: should be merged with irq.c irqcount
+ testl $3,CS(%rdi)
+ je 1f
+ swapgs
+1: addl $1,PDAREF(pda_irqcount) # XXX: should be merged with irq.c irqcount
movq PDAREF(pda_irqstackptr),%rax
cmoveq %rax,%rsp
pushq %rdi # save old stack
- call do_IRQ
+ call \func
+ .endm
+
+ENTRY(common_interrupt)
+ interrupt do_IRQ
/* 0(%rsp): oldrsp-ARGOFFSET */
- .globl ret_from_intr
ret_from_intr:
popq %rdi
cli
#ifdef CONFIG_PREEMPT
decl threadinfo_preempt_count(%rcx)
#endif
- cmpq $__KERNEL_CS,CS-ARGOFFSET(%rsp)
+ testl $3,CS-ARGOFFSET(%rsp)
je retint_kernel
/* Interrupt came from user space */
/*
- * Shared return path for exceptions and interrupts that came from user space.
* Has a correct top of stack, but a partial stack frame
* %rcx: thread info. Interrupts off.
*/
retint_with_reschedule:
- testl $_TIF_WORK_MASK,threadinfo_flags(%rcx)
+ movl $_TIF_WORK_MASK,%edi
+retint_check:
+ movl threadinfo_flags(%rcx),%edx
+ andl %edi,%edx
jnz retint_careful
retint_swapgs:
swapgs
retint_restore_args:
- RESTORE_ARGS
- addq $8,%rsp
+ RESTORE_ARGS 0,8,0
iretq
+ /* edi: workmask, edx: work */
retint_careful:
- movl threadinfo_flags(%rcx),%edx
bt $TIF_NEED_RESCHED,%edx
jnc retint_signal
sti
+ pushq %rdi
call schedule
-retint_next_try:
+ popq %rdi
GET_THREAD_INFO(%rcx)
cli
- jmp retint_with_reschedule
+ jmp retint_check
+
retint_signal:
testl $(_TIF_SIGPENDING|_TIF_NOTIFY_RESUME),%edx
jz retint_swapgs
movq %rsp,%rdi # &pt_regs
call do_notify_resume
RESTORE_REST
- jmp retint_next_try
+ cli
+ movl $_TIF_NEED_RESCHED,%edi
+ GET_THREAD_INFO(%rcx)
+ jmp retint_check
#ifdef CONFIG_PREEMPT
/* Returning to kernel space. Check if we need preemption */
movl PDAREF(pda___local_bh_count),%eax
addl PDAREF(pda___local_irq_count),%eax
jnz retint_restore_args
- incl threadinfo_preempt_count(%rcx)
+ movl $PREEMPT_ACTIVE,threadinfo_preempt_count(%rcx)
sti
- call preempt_schedule
+ call schedule
cli
+ GET_THREAD_INFO(%rcx)
+ movl $0,threadinfo_preempt_count(%rcx)
jmp exit_intr
#endif
+/*
+ * APIC interrupts.
+ */
+ .macro apicinterrupt num,func
+ pushq $\num-256
+ interrupt \func
+ jmp ret_from_intr
+ .endm
+
+#ifdef CONFIG_SMP
+ENTRY(reschedule_interrupt)
+ apicinterrupt RESCHEDULE_VECTOR,smp_reschedule_interrupt
+
+ENTRY(invalidate_interrupt)
+ apicinterrupt INVALIDATE_TLB_VECTOR,smp_invalidate_interrupt
+
+ENTRY(call_function_interrupt)
+ apicinterrupt CALL_FUNCTION_VECTOR,smp_call_function_interrupt
+#endif
+
+#ifdef CONFIG_X86_LOCAL_APIC
+ENTRY(apic_timer_interrupt)
+ apicinterrupt LOCAL_TIMER_VECTOR,smp_apic_timer_interrupt
+
+ENTRY(error_interrupt)
+ apicinterrupt ERROR_APIC_VECTOR,smp_error_interrupt
+
+ENTRY(spurious_interrupt)
+ apicinterrupt SPURIOUS_APIC_VECTOR,smp_spurious_interrupt
+#endif
+
/*
* Exception entry points.
*/
.macro zeroentry sym
pushq $0 /* push error code/oldrax */
pushq %rax /* push real oldrax to the rdi slot */
- leaq RIP_SYMBOL_NAME(\sym),%rax
+ leaq \sym(%rip),%rax
jmp error_entry
.endm
.macro errorentry sym
pushq %rax
- leaq RIP_SYMBOL_NAME(\sym),%rax
+ leaq \sym(%rip),%rax
jmp error_entry
.endm
*/
ALIGN
error_entry:
- cmpq $__KERNEL_CS,24(%rsp)
+ testl $3,24(%rsp)
je error_kernelspace
swapgs
error_kernelspace:
RESTORE_REST
cli
GET_THREAD_INFO(%rcx)
- cmpq $__KERNEL_CS,CS-ARGOFFSET(%rsp)
+ testl $3,CS-ARGOFFSET(%rsp)
je retint_kernel
- jmp retint_with_reschedule
+ movl threadinfo_flags(%rcx),%edx
+ movl $_TIF_WORK_MASK,%edi
+ andl %edi,%edx
+ jnz retint_careful
+ swapgs
+ RESTORE_ARGS 0,8,0
+ iretq
/*
* Create a kernel thread.
ret
ENTRY(page_fault)
+#ifdef CONFIG_KDB
+ pushq %rcx
+ pushq %rdx
+ pushq %rax
+ movl $473,%ecx
+ rdmsr
+ andl $0xfffffffe,%eax /* Disable last branch recording */
+ wrmsr
+ popq %rax
+ popq %rdx
+ popq %rcx
+#endif
errorentry do_page_fault
ENTRY(coprocessor_error)
ENTRY(simd_coprocessor_error)
zeroentry do_simd_coprocessor_error
-
ENTRY(device_not_available)
- cmpq $0,(%rsp)
- jl 1f
+ testl $3,8(%rsp)
+ je 1f
swapgs
-1: pushq $-1
+1: pushq $-1 #error code
SAVE_ALL
movq %cr0,%rax
leaq math_state_restore(%rip),%rcx
ENTRY(debug)
zeroentry do_debug
- /* XXX checkme */
ENTRY(nmi)
- cmpq $0,(%rsp)
- jl 1f
- swapgs
-1: pushq $-1
+ pushq $-1
SAVE_ALL
- movq %rsp,%rdi
+ /* An NMI can happen inside the critical section around a swapgs,
+ so we have to use this expensive check instead. */
+ movl $MSR_GS_BASE,%ecx
+ rdmsr
+ xorl %ebx,%ebx
+ testl %edx,%edx
+ js 1f
+ swapgs
+ movl $1,%ebx
+1: movq %rsp,%rdi # regs -> arg1
call do_nmi
- RESTORE_ALL
- addq $8,%rsp
- cmpq $0,(%rsp)
- jl 2f
+ /* XXX: should do preemption checks here */
+ cli
+ testl %ebx,%ebx
+ jz 2f
swapgs
-2: iretq
+2: RESTORE_ALL 8
+ iretq
ENTRY(int3)
zeroentry do_int3
errorentry do_alignment_check
ENTRY(divide_error)
- errorentry do_divide_error
+ zeroentry do_divide_error
ENTRY(spurious_interrupt_bug)
zeroentry do_spurious_interrupt_bug
-ENTRY(__bad_intr)
- pushq $-1
- SAVE_ALL
- call bad_intr
- RESTORE_ALL
- addq $8,%rsp
- iretq
+ENTRY(machine_check)
+ zeroentry do_machine_check
+
+ENTRY(call_debug)
+ zeroentry do_call_debug
+
* Copyright (C) 2000 Andrea Arcangeli <andrea@suse.de> SuSE
* Copyright (C) 2000 Pavel Machek <pavel@suse.cz>
* Copyright (C) 2000 Karsten Keil <kkeil@suse.de>
- * Copyright (C) 2001 2002 Andi Kleen <ak@suse.de>
+ * Copyright (C) 2001,2002 Andi Kleen <ak@suse.de>
*
- * $Id: head.S,v 1.41 2001/07/05 23:43:45 ak Exp $
+ * $Id: head.S,v 1.49 2002/03/19 17:39:25 ak Exp $
*/
-.code64
-.text
+
#include <linux/linkage.h>
#include <linux/threads.h>
#include <asm/desc.h>
#include <asm/segment.h>
#include <asm/page.h>
-#include <asm/offset.h>
+#include <asm/msr.h>
-/* we don't able to switch in one step to final KERNEL ADDRESS SPACE
+/* we are not able to switch in one step to the final KERNEL ADDRESS SPACE
* because we need identity-mapped pages on setup so define __START_KERNEL to
* 0x100000 for this stage
*
*/
-
+ .text
+ .code32
+/* %bx: 1 if coming from smp trampoline on secondary cpu */
startup_32:
-.code32
+
/*
* At this point the CPU runs in 32bit protected mode (CS.D = 1) with
* paging disabled and the point of this file is to switch to 64bit
* There is no stack until we set one up.
*/
- /* As first check if extended functions are implemented */
+ movl %ebx,%ebp /* Save trampoline flag */
+
+ /* First check if extended functions are implemented */
movl $0x80000000, %eax
cpuid
cmpl $0x80000000, %eax
btl $29, %edx
jnc no_long_mode
+ movl %edx,%edi
+
/*
* Prepare for entering 64bits mode
*/
movl %eax, %cr3
/* Setup EFER (Extended Feature Enable Register) */
- movl $0xc0000080, %ecx
+ movl $MSR_EFER, %ecx
rdmsr
/* Fool rdmsr and reset %eax to avoid dependences */
xorl %eax, %eax
/* Enable Long Mode */
- btsl $8, %eax
+ btsl $_EFER_LME, %eax
/* Enable System Call */
- btsl $0, %eax
+ btsl $_EFER_SCE, %eax
+
+#if 0
+ /* No Execute supported? */
+ btl $20,%edi
+ jnc 1f
+ btsl $_EFER_NX, %eax
+1:
+#endif
+
/* Make changes effective */
wrmsr
* the new gdt/idt that has __KERNEL_CS with CS.L = 1.
*/
+ testw %bp,%bp /* secondary CPU? */
+ jnz second
+
/* Load new GDT with the 64bit segment using 32bit descriptor */
/* to avoid 32bit relocations we use fixed addresses here */
movl $0x100F00, %eax
lgdt (%eax)
+
movl $0x100F10, %eax
/* Finally jump in 64bit mode */
ljmp *(%eax)
-.code64
-reach_long64:
- /*
- * Where we're running at 0x0000000000100000, and yes, finally
- * in 64bit mode.
- */
- .globl init_rsp
+second:
+ /* Abuse syscall to get into 64bit mode. This way we don't need
+ a working low identity mapping just for the short 32bit roundtrip.
+ XXX kludge. This should not be needed. */
+ movl $MSR_STAR,%ecx
+ xorl %eax,%eax
+ movl $(__USER32_CS<<16)|__KERNEL_CS,%edx
+ wrmsr
- /* Setup the first kernel stack (this instruction is modified by smpboot) */
- .byte 0x48, 0xb8 /* movq *init_rsp,%rax */
-init_rsp:
- .quad init_thread_union+THREAD_SIZE
- movq %rax, %rsp
+ movl $MSR_CSTAR,%ecx
+ movl $0xffffffff,%edx
+ movl $0x80100100,%eax # reach_long64 absolute
+ wrmsr
+ syscall
+
+ .code64
+ .org 0x100
+reach_long64:
+ movq init_rsp(%rip),%rsp
/* zero EFLAGS after setting rsp */
pushq $0
*/
lgdt pGDT64
- /* esi is pointer to real mode structure with interesting info.
- pass it to C */
- movl %esi, %edi
-
movl $__KERNEL_DS,%eax
- movl %eax,%ss
movl %eax,%ds
+ movl %eax,%ss
movl %eax,%es
+ /* esi is pointer to real mode structure with interesting info.
+ pass it to C */
+ movl %esi, %edi
+
/* Finally jump to run C code and to be on real kernel address
* Since we are running on identity-mapped space we have to jump
* to the full 64bit address , this is only possible as indirect
movq initial_code(%rip),%rax
jmp *%rax
-
- /* SMP bootup changes this */
+ /* SMP bootup changes these two */
.globl initial_code
initial_code:
.quad x86_64_start_kernel
+ .globl init_rsp
+init_rsp:
+ .quad init_thread_union+THREAD_SIZE-8
+
.code32
ENTRY(no_long_mode)
.org 0xf00
pGDT32:
.word gdt32_end-gdt_table32
- .quad gdt_table32-__START_KERNEL+0x100000
+ .long gdt_table32-__START_KERNEL+0x100000
.org 0xf10
ljumpvector:
* 2Mbyte large pages provided by PAE mode)
*/
.org 0x1000
-ENTRY(level4_pgt)
+ENTRY(init_level4_pgt)
.quad 0x0000000000102007 /* -> level3_ident_pgt */
.fill 255,8,0
- /* __PAGE_OFFSET 0xffff800000000000 */
.quad 0x000000000010a007
.fill 254,8,0
/* (2^48-(2*1024*1024*1024))/(2^39) = 511 */
.org 0x2000
/* Kernel does not "know" about 4-th level of page tables. */
-ENTRY(swapper_pg_dir)
ENTRY(level3_ident_pgt)
.quad 0x0000000000104007
.fill 511,8,0
.org 0x4000
ENTRY(level2_ident_pgt)
- /* 2 Mbytes are enough, this is necessary only for head.S */
+ /* 40MB for bootup. */
.quad 0x0000000000000283
- /* .fill 511,8,0 */
- /* Jan needs more than 2Mbytes, so set a 40Mbyte mapping instead */
.quad 0x0000000000200183
.quad 0x0000000000400183
.quad 0x0000000000600183
.quad 0x0000000002200183
.quad 0x0000000002400183
.quad 0x0000000002600183
+ /* Temporary mappings for the super early allocator in arch/x86_64/mm/init.c */
+ .globl temp_boot_pmds
+temp_boot_pmds:
.fill 492,8,0
.org 0x5000
ENTRY(level2_kernel_pgt)
+ /* 40MB kernel mapping. The kernel code cannot be bigger than that.
+ When you change this, change KERNEL_TEXT_SIZE in pgtable.h too. */
/* (2^48-(2*1024*1024*1024)-((2^39)*511)-((2^30)*510)) = 0 */
.quad 0x0000000000000183
.quad 0x0000000000200183
.quad 0x0000000002200183
.quad 0x0000000002400183
.quad 0x0000000002600183
- /*
- * We could go ahead without any downside (except typing programmer
- * wise :) but 40Mbyte are just enough for the kernel statically
- * linked part (and extending it is trivial).
- */
+ /* Module mapping starts here */
.fill 492,8,0
.org 0x6000
ENTRY(level3_physmem_pgt)
.quad 0x0000000000105007 /* -> level2_kernel_pgt (so that __va works even before pagetable_init) */
-
-
.org 0xb000
-
-
.data
.globl SYMBOL_NAME(gdt)
.quad 0x00cffe000000ffff /* __USER32_CS */
.quad 0x00cff2000000ffff /* __USER_DS, __USER32_DS */
.quad 0x00affa000000ffff /* __USER_CS */
+ .word 0xFFFF # 4Gb - (0x100000*0x1000 = 4Gb)
+ .word 0 # base address = 0
+ .word 0x9A00 # code read/exec
+ .word 0x00CF # granularity = 4096, 386
+ # (+5th nibble of limit)
+ /* __KERNEL32_CS */
.globl tss_start
tss_start:
.rept NR_CPUS
- .quad 0,0 /* TSS descriptors. filled in later */
- .endr
- .globl ldt_start
-ldt_start:
- .rept NR_CPUS
- .quad 0,0 /* LDT descriptors. filled in later */
+ .quad 0,0,0,0,0,0,0,0 /* TSS/LDT/per cpu entries. filled in later */
.endr
gdt_end:
.globl gdt_end
.quad 0
.quad 0
.endr
-
-.section .text.lock
-ENTRY(stext_lock)
(int *)&boot_cpu_data.x86_vendor_id[4]);
/* get cpu type */
- cpuid(1, &eax, &dummy, &dummy, &boot_cpu_data.x86_capability[0]);
+ cpuid(1, &eax, &dummy, &dummy, (int *) &boot_cpu_data.x86_capability);
boot_cpu_data.x86 = (eax >> 8) & 0xf;
boot_cpu_data.x86_model = (eax >> 4) & 0xf;
boot_cpu_data.x86_mask = eax & 0xf;
* linux/arch/x86_64/kernel/i387.c
*
* Copyright (C) 1994 Linus Torvalds
+ * Copyright (C) 2002 Andi Kleen, SuSE Labs
*
* Pentium III FXSR, SSE support
* General FPU state handling cleanups
* Gareth Hughes <gareth@valinux.com>, May 2000
+ *
+ * x86-64 rework 2002 Andi Kleen.
+ * Does direct fxsave in and out of user space now for signal handlers.
+ * All the FSAVE<->FXSAVE conversion code has been moved to the 32bit emulation;
+ * 64bit user space sees an FXSAVE frame directly.
*/
#include <linux/config.h>
#include <linux/sched.h>
-#include <linux/spinlock.h>
+#include <linux/init.h>
#include <asm/processor.h>
#include <asm/i387.h>
#include <asm/sigcontext.h>
#include <asm/ptrace.h>
#include <asm/uaccess.h>
-#define HAVE_HWFP 1
+static struct i387_fxsave_struct init_fpu_env;
/*
- * The _current_ task is using the FPU for the first time
- * so initialize it and set the mxcsr to its default
- * value at reset if we support XMM instructions and then
- * remeber the current task has used the FPU.
+ * Called at bootup to set up the initial FPU state that is later cloned
+ * into all processes.
*/
-void init_fpu(void)
+void __init fpu_init(void)
{
- __asm__("fninit");
- if ( cpu_has_xmm )
- load_mxcsr(0x1f80);
+ unsigned long oldcr0 = read_cr0();
+ extern void __bad_fxsave_alignment(void);
- current->used_math = 1;
-}
-
-/*
- * FPU lazy state save handling.
- */
-
-static inline void __save_init_fpu( struct task_struct *tsk )
-{
- if ( cpu_has_fxsr ) {
- asm volatile( "fxsave %0 ; fnclex"
- : "=m" (tsk->thread.i387.fxsave) );
- } else {
- asm volatile( "fnsave %0 ; fwait"
- : "=m" (tsk->thread.i387.fsave) );
- }
- clear_tsk_thread_flag(tsk, TIF_USEDFPU);
-}
-
-void save_init_fpu( struct task_struct *tsk )
-{
- __save_init_fpu(tsk);
+ if (offsetof(struct task_struct, thread.i387.fxsave) & 15)
+ __bad_fxsave_alignment();
+ set_in_cr4(X86_CR4_OSFXSR);
+ set_in_cr4(X86_CR4_OSXMMEXCPT);
+
+ write_cr0(oldcr0 & ~((1UL<<3)|(1UL<<2))); /* clear TS and EM */
+
+ asm("fninit");
+ load_mxcsr(0x1f80);
+ /* Initialize MMX state. Normally this will be covered by fninit, but the
+ architecture doesn't guarantee it, so do it explicitly. */
+ asm volatile("movq %0,%%mm0\n\t"
+ "movq %%mm0,%%mm1\n\t"
+ "movq %%mm0,%%mm2\n\t"
+ "movq %%mm0,%%mm3\n\t"
+ "movq %%mm0,%%mm4\n\t"
+ "movq %%mm0,%%mm5\n\t"
+ "movq %%mm0,%%mm6\n\t"
+ "movq %%mm0,%%mm7\n\t" :: "m" (0ULL));
+ asm("emms");
+
+ /* initialize XMM state */
+ asm("xorpd %xmm0,%xmm0");
+ asm("xorpd %xmm1,%xmm1");
+ asm("xorpd %xmm2,%xmm2");
+ asm("xorpd %xmm3,%xmm3");
+ asm("xorpd %xmm4,%xmm4");
+ asm("xorpd %xmm5,%xmm5");
+ asm("xorpd %xmm6,%xmm6");
+ asm("xorpd %xmm7,%xmm7");
+ asm("xorpd %xmm8,%xmm8");
+ asm("xorpd %xmm9,%xmm9");
+ asm("xorpd %xmm10,%xmm10");
+ asm("xorpd %xmm11,%xmm11");
+ asm("xorpd %xmm12,%xmm12");
+ asm("xorpd %xmm13,%xmm13");
+ asm("xorpd %xmm14,%xmm14");
+ asm("xorpd %xmm15,%xmm15");
+ load_mxcsr(0x1f80);
+ asm volatile("fxsave %0" : "=m" (init_fpu_env));
+
+ /* clean state in init */
stts();
-}
-
-void kernel_fpu_begin(void)
-{
- preempt_disable();
- if (test_thread_flag(TIF_USEDFPU)) {
- __save_init_fpu(current);
- return;
- }
- clts();
-}
-
-void restore_fpu( struct task_struct *tsk )
-{
- if ( cpu_has_fxsr ) {
- asm volatile( "fxrstor %0"
- : : "m" (tsk->thread.i387.fxsave) );
- } else {
- asm volatile( "frstor %0"
- : : "m" (tsk->thread.i387.fsave) );
- }
-}
-
-/*
- * FPU tag word conversions.
- */
-
-static inline unsigned short twd_i387_to_fxsr( unsigned short twd )
-{
- unsigned int tmp; /* to avoid 16 bit prefixes in the code */
-
- /* Transform each pair of bits into 01 (valid) or 00 (empty) */
- tmp = ~twd;
- tmp = (tmp | (tmp>>1)) & 0x5555; /* 0V0V0V0V0V0V0V0V */
- /* and move the valid bits to the lower byte. */
- tmp = (tmp | (tmp >> 1)) & 0x3333; /* 00VV00VV00VV00VV */
- tmp = (tmp | (tmp >> 2)) & 0x0f0f; /* 0000VVVV0000VVVV */
- tmp = (tmp | (tmp >> 4)) & 0x00ff; /* 00000000VVVVVVVV */
- return tmp;
-}
-
-static inline u32 twd_fxsr_to_i387( struct i387_fxsave_struct *fxsave )
-{
- struct _fpxreg *st = NULL;
- u32 twd = (u32) fxsave->twd;
- u32 tag;
- u32 ret = 0xffff0000;
- int i;
-
-#define FPREG_ADDR(f, n) ((char *)&(f)->st_space + (n) * 16);
-
- for ( i = 0 ; i < 8 ; i++ ) {
- if ( twd & 0x1 ) {
- st = (struct _fpxreg *) FPREG_ADDR( fxsave, i );
-
- switch ( st->exponent & 0x7fff ) {
- case 0x7fff:
- tag = 2; /* Special */
- break;
- case 0x0000:
- if ( !st->significand[0] &&
- !st->significand[1] &&
- !st->significand[2] &&
- !st->significand[3] ) {
- tag = 1; /* Zero */
- } else {
- tag = 2; /* Special */
- }
- break;
- default:
- if ( st->significand[3] & 0x8000 ) {
- tag = 0; /* Valid */
- } else {
- tag = 2; /* Special */
- }
- break;
- }
- } else {
- tag = 3; /* Empty */
- }
- ret |= (tag << (2 * i));
- twd = twd >> 1;
- }
- return ret;
-}
-
-/*
- * FPU state interaction.
- */
-
-unsigned short get_fpu_cwd( struct task_struct *tsk )
-{
- if ( cpu_has_fxsr ) {
- return tsk->thread.i387.fxsave.cwd;
- } else {
- return (unsigned short)tsk->thread.i387.fsave.cwd;
- }
-}
-
-unsigned short get_fpu_swd( struct task_struct *tsk )
-{
- if ( cpu_has_fxsr ) {
- return tsk->thread.i387.fxsave.swd;
- } else {
- return (unsigned short)tsk->thread.i387.fsave.swd;
- }
-}
-
-unsigned short get_fpu_twd( struct task_struct *tsk )
-{
- if ( cpu_has_fxsr ) {
- return tsk->thread.i387.fxsave.twd;
- } else {
- return (unsigned short)tsk->thread.i387.fsave.twd;
- }
-}
-
-unsigned short get_fpu_mxcsr( struct task_struct *tsk )
-{
- if ( cpu_has_xmm ) {
- return tsk->thread.i387.fxsave.mxcsr;
- } else {
- return 0x1f80;
- }
-}
-
-void set_fpu_cwd( struct task_struct *tsk, unsigned short cwd )
-{
- if ( cpu_has_fxsr ) {
- tsk->thread.i387.fxsave.cwd = cwd;
- } else {
- tsk->thread.i387.fsave.cwd = ((u32)cwd | 0xffff0000);
- }
-}
-
-void set_fpu_swd( struct task_struct *tsk, unsigned short swd )
-{
- if ( cpu_has_fxsr ) {
- tsk->thread.i387.fxsave.swd = swd;
- } else {
- tsk->thread.i387.fsave.swd = ((u32)swd | 0xffff0000);
- }
-}
-
-void set_fpu_twd( struct task_struct *tsk, unsigned short twd )
-{
- if ( cpu_has_fxsr ) {
- tsk->thread.i387.fxsave.twd = twd_i387_to_fxsr(twd);
- } else {
- tsk->thread.i387.fsave.twd = ((u32)twd | 0xffff0000);
- }
-}
-
-void set_fpu_mxcsr( struct task_struct *tsk, unsigned short mxcsr )
-{
- if ( cpu_has_xmm ) {
- tsk->thread.i387.fxsave.mxcsr = (mxcsr & 0xffbf);
- }
+ clear_thread_flag(TIF_USEDFPU);
+ current->used_math = 0;
}
/*
- * FXSR floating point environment conversions.
+ * The _current_ task is using the FPU for the first time
+ * so initialize it and set the mxcsr to its default.
+ * Remember that the current task has used the FPU.
*/
-
-static inline int convert_fxsr_to_user( struct _fpstate *buf,
- struct i387_fxsave_struct *fxsave )
-{
- u32 env[7];
- struct _fpreg *to;
- struct _fpxreg *from;
- int i;
-
- env[0] = (u32)fxsave->cwd | 0xffff0000;
- env[1] = (u32)fxsave->swd | 0xffff0000;
- env[2] = twd_fxsr_to_i387(fxsave);
- env[3] = fxsave->fip;
- env[4] = fxsave->fcs | ((u32)fxsave->fop << 16);
- env[5] = fxsave->foo;
- env[6] = fxsave->fos;
-
- if ( __copy_to_user( buf, env, 7 * sizeof(u32) ) )
- return 1;
-
- to = &buf->_st[0];
- from = (struct _fpxreg *) &fxsave->st_space[0];
- for ( i = 0 ; i < 8 ; i++, to++, from++ ) {
- if ( __copy_to_user( to, from, sizeof(*to) ) )
- return 1;
- }
- return 0;
-}
-
-static inline int convert_fxsr_from_user( struct i387_fxsave_struct *fxsave,
- struct _fpstate *buf )
+void init_fpu(void)
{
- u32 env[7];
- struct _fpxreg *to;
- struct _fpreg *from;
- int i;
-
- if ( __copy_from_user( env, buf, 7 * sizeof(u32) ) )
- return 1;
-
- fxsave->cwd = (unsigned short)(env[0] & 0xffff);
- fxsave->swd = (unsigned short)(env[1] & 0xffff);
- fxsave->twd = twd_i387_to_fxsr((unsigned short)(env[2] & 0xffff));
- fxsave->fip = env[3];
- fxsave->fop = (unsigned short)((env[4] & 0xffff0000) >> 16);
- fxsave->fcs = (env[4] & 0xffff);
- fxsave->foo = env[5];
- fxsave->fos = env[6];
-
- to = (struct _fpxreg *) &fxsave->st_space[0];
- from = &buf->_st[0];
- for ( i = 0 ; i < 8 ; i++, to++, from++ ) {
- if ( __copy_from_user( to, from, sizeof(*from) ) )
- return 1;
- }
- return 0;
+#if 0
+ asm("fninit");
+ load_mxcsr(0x1f80);
+#else
+ asm volatile("fxrstor %0" :: "m" (init_fpu_env));
+#endif
+ current->used_math = 1;
}
/*
* Signal frame handlers.
*/
-static inline int save_i387_fsave( struct _fpstate *buf )
-{
- struct task_struct *tsk = current;
-
- unlazy_fpu( tsk );
- tsk->thread.i387.fsave.status = tsk->thread.i387.fsave.swd;
- if ( __copy_to_user( buf, &tsk->thread.i387.fsave,
- sizeof(struct i387_fsave_struct) ) )
- return -1;
- return 1;
-}
-
-static inline int save_i387_fxsave( struct _fpstate *buf )
+int save_i387(struct _fpstate *buf)
{
struct task_struct *tsk = current;
int err = 0;
- unlazy_fpu( tsk );
-
- if ( convert_fxsr_to_user( buf, &tsk->thread.i387.fxsave ) )
- return -1;
-
- err |= __put_user( tsk->thread.i387.fxsave.swd, &buf->status );
- err |= __put_user( X86_FXSR_MAGIC, &buf->magic );
- if ( err )
- return -1;
-
- if ( __copy_to_user( &buf->_fxsr_env[0], &tsk->thread.i387.fxsave,
- sizeof(struct i387_fxsave_struct) ) )
- return -1;
- return 1;
-}
+ {
+ extern void bad_user_i387_struct(void);
+ if (sizeof(struct user_i387_struct) != sizeof(tsk->thread.i387.fxsave))
+ bad_user_i387_struct();
+ }
-int save_i387( struct _fpstate *buf )
-{
- if ( !current->used_math )
+ if (!tsk->used_math)
return 0;
-
- /* This will cause a "finit" to be triggered by the next
- * attempted FPU operation by the 'current' process.
- */
- current->used_math = 0;
-
- if ( HAVE_HWFP ) {
- if ( cpu_has_fxsr ) {
- return save_i387_fxsave( buf );
+ tsk->used_math = 0; /* trigger finit */
+ if (test_thread_flag(TIF_USEDFPU)) {
+ err = save_i387_checking((struct i387_fxsave_struct *)buf);
+ if (err) return err;
+ stts();
} else {
- return save_i387_fsave( buf );
- }
+ if (__copy_to_user(buf, &tsk->thread.i387.fxsave,
+ sizeof(struct i387_fxsave_struct)))
+ return -1;
}
-}
-
-static inline int restore_i387_fsave( struct _fpstate *buf )
-{
- struct task_struct *tsk = current;
- clear_fpu( tsk );
- return __copy_from_user( &tsk->thread.i387.fsave, buf,
- sizeof(struct i387_fsave_struct) );
-}
-
-static inline int restore_i387_fxsave( struct _fpstate *buf )
-{
- struct task_struct *tsk = current;
- clear_fpu( tsk );
- if ( __copy_from_user( &tsk->thread.i387.fxsave, &buf->_fxsr_env[0],
- sizeof(struct i387_fxsave_struct) ) )
return 1;
- /* mxcsr bit 6 and 31-16 must be zero for security reasons */
- tsk->thread.i387.fxsave.mxcsr &= 0xffbf;
- return convert_fxsr_from_user( &tsk->thread.i387.fxsave, buf );
-}
-
-int restore_i387( struct _fpstate *buf )
-{
- int err;
-
- if ( HAVE_HWFP ) {
- if ( cpu_has_fxsr ) {
- err = restore_i387_fxsave( buf );
- } else {
- err = restore_i387_fsave( buf );
- }
- }
- current->used_math = 1;
- return err;
}
/*
* ptrace request handlers.
*/
-int get_fpregs( struct user_i387_struct *buf, struct task_struct *tsk )
+int get_fpregs(struct user_i387_struct *buf, struct task_struct *tsk)
{
- if ( cpu_has_fxsr ) {
- if (__copy_to_user( (void *)buf, &tsk->thread.i387.fxsave,
- sizeof(struct user_i387_struct) ))
- return -EFAULT;
- return 0;
- } else {
- return -EIO;
- }
+ empty_fpu(tsk);
+ return __copy_to_user((void *)buf, &tsk->thread.i387.fxsave,
+ sizeof(struct user_i387_struct)) ? -EFAULT : 0;
}
-int set_fpregs( struct task_struct *tsk, struct user_i387_struct *buf )
+int set_fpregs(struct task_struct *tsk, struct user_i387_struct *buf)
{
- if ( cpu_has_fxsr ) {
- __copy_from_user( &tsk->thread.i387.fxsave, (void *)buf,
- sizeof(struct user_i387_struct) );
- /* mxcsr bit 6 and 31-16 must be zero for security reasons */
- tsk->thread.i387.fxsave.mxcsr &= 0xffbf;
+ if (__copy_from_user(&tsk->thread.i387.fxsave, buf,
+ sizeof(struct user_i387_struct)))
+ return -EFAULT;
return 0;
- } else {
- return -EIO;
- }
}
/*
* FPU state for core dumps.
*/
-static inline void copy_fpu_fsave( struct task_struct *tsk,
- struct user_i387_struct *fpu )
-{
- memcpy( fpu, &tsk->thread.i387.fsave,
- sizeof(struct user_i387_struct) );
-}
-
-static inline void copy_fpu_fxsave( struct task_struct *tsk,
- struct user_i387_struct *fpu )
-{
- unsigned short *to;
- unsigned short *from;
- int i;
-
- memcpy( fpu, &tsk->thread.i387.fxsave, 7 * sizeof(u32) );
-
- to = (unsigned short *)&fpu->st_space[0];
- from = (unsigned short *)&tsk->thread.i387.fxsave.st_space[0];
- for ( i = 0 ; i < 8 ; i++, to += 5, from += 8 ) {
- memcpy( to, from, 5 * sizeof(unsigned short) );
- }
-}
-
int dump_fpu( struct pt_regs *regs, struct user_i387_struct *fpu )
{
- int fpvalid;
struct task_struct *tsk = current;
- fpvalid = tsk->used_math && cpu_has_fxsr;
- if ( fpvalid ) {
- unlazy_fpu( tsk );
- memcpy( fpu, &tsk->thread.i387.fxsave,
- sizeof(struct user_i387_struct) );
- }
+ if (!tsk->used_math)
+ return 0;
- return fpvalid;
+ unlazy_fpu(tsk);
+ memcpy(fpu, &tsk->thread.i387.fxsave, sizeof(struct user_i387_struct));
+ return 1;
}
* interrupt-controller happy.
*/
-BUILD_COMMON_IRQ()
-
#define BI(x,y) \
BUILD_IRQ(x##y)
*/
BUILD_16_IRQS(0x0)
-#ifdef CONFIG_X86_IO_APIC
+#ifdef CONFIG_X86_LOCAL_APIC
/*
* The IO-APIC gives us many more interrupt sources. Most of these
* are unused but an SMP system is supposed to have enough memory ...
#undef BI
-/*
- * The following vectors are part of the Linux architecture, there
- * is no hardware IRQ pin equivalent for them, they are triggered
- * through the ICC by us (IPIs)
- */
-#ifdef CONFIG_SMP
-BUILD_SMP_INTERRUPT(task_migration_interrupt,TASK_MIGRATION_VECTOR);
-BUILD_SMP_INTERRUPT(reschedule_interrupt,RESCHEDULE_VECTOR);
-BUILD_SMP_INTERRUPT(invalidate_interrupt,INVALIDATE_TLB_VECTOR);
-BUILD_SMP_INTERRUPT(call_function_interrupt,CALL_FUNCTION_VECTOR);
-#endif
-
-/*
- * every pentium local APIC has two 'local interrupts', with a
- * soft-definable vector attached to both interrupts, one of
- * which is a timer interrupt, the other one is error counter
- * overflow. Linux uses the local APIC timer interrupt to get
- * a much simpler SMP time architecture:
- */
-#ifdef CONFIG_X86_LOCAL_APIC
-BUILD_SMP_INTERRUPT(apic_timer_interrupt, LOCAL_TIMER_VECTOR);
-BUILD_SMP_INTERRUPT(error_interrupt,ERROR_APIC_VECTOR);
-BUILD_SMP_INTERRUPT(spurious_interrupt,SPURIOUS_APIC_VECTOR);
-#endif
-
#define IRQ(x,y) \
IRQ##x##y##_interrupt
static void end_8259A_irq (unsigned int irq)
{
+ if (irq > 256) {
+ char var;
+ printk("return %p stack %p ti %p\n", __builtin_return_address(0), &var, current->thread_info);
+
+ BUG();
+ }
+
if (!(irq_desc[irq].status & (IRQ_DISABLED|IRQ_INPROGRESS)))
enable_8259A_irq(irq);
}
* IRQ2 is cascade interrupt to second interrupt controller
*/
-#ifndef CONFIG_VISWS
static struct irqaction irq2 = { no_action, 0, 0, "cascade", NULL, NULL};
-#endif
-
void __init init_ISA_irqs (void)
{
}
}
+void apic_timer_interrupt(void);
+void spurious_interrupt(void);
+void error_interrupt(void);
+void reschedule_interrupt(void);
+void call_function_interrupt(void);
+void invalidate_interrupt(void);
+
void __init init_IRQ(void)
{
int i;
-#ifndef CONFIG_X86_VISWS_APIC
init_ISA_irqs();
-#else
- init_VISWS_APIC_irqs();
-#endif
/*
* Cover the whole vector space, no vector can escape
* us. (some of these will be overridden and become
*/
for (i = 0; i < NR_IRQS; i++) {
int vector = FIRST_EXTERNAL_VECTOR + i;
- if (vector != IA32_SYSCALL_VECTOR)
+ if (vector != IA32_SYSCALL_VECTOR && vector != KDB_VECTOR) {
set_intr_gate(vector, interrupt[i]);
}
+ }
#ifdef CONFIG_SMP
/*
*/
set_intr_gate(RESCHEDULE_VECTOR, reschedule_interrupt);
- /* IPI for task migration */
- set_intr_gate(TASK_MIGRATION_VECTOR, task_migration_interrupt);
-
/* IPI for invalidation */
set_intr_gate(INVALIDATE_TLB_VECTOR, invalidate_interrupt);
outb_p(LATCH & 0xff , 0x40); /* LSB */
outb(LATCH >> 8 , 0x40); /* MSB */
-#ifndef CONFIG_VISWS
setup_irq(2, &irq2);
-#endif
}
#include <asm/uaccess.h>
#include <asm/pgtable.h>
#include <asm/desc.h>
-#include <asm/thread_info.h>
static struct fs_struct init_fs = INIT_FS;
static struct files_struct init_files = INIT_FILES;
struct mm_struct init_mm = INIT_MM(init_mm);
/*
- * Initial thread structure.
+ * Initial task structure.
*
* We need to make sure that this is 8192-byte aligned due to the
* way process stacks are handled. This is done by having a special
* section. Since TSS's are completely CPU-local, we want them
* on exact cacheline boundaries, to eliminate cacheline ping-pong.
*/
-struct tss_struct init_tss[NR_CPUS] __cacheline_aligned = { [0 ... NR_CPUS-1] = INIT_TSS };
+struct tss_struct init_tss[NR_CPUS] __cacheline_aligned;
+
+
+#define ALIGN_TO_4K __attribute__((section(".data.init_task")))
+
+pgd_t boot_vmalloc_pgt[512] ALIGN_TO_4K;
* shared ISA-space IRQs, so we have to support them. We are super
* fast in the common case, and fast for shared ISA-space IRQs.
*/
-static void add_pin_to_irq(unsigned int irq, int apic, int pin)
+static void __init add_pin_to_irq(unsigned int irq, int apic, int pin)
{
static int first_free_entry = NR_IRQS;
struct irq_pin_list *entry = irq_2_pin + irq;
*/
void __global_cli(void)
{
- unsigned int flags;
+ unsigned long flags;
__save_flags(flags);
if (flags & (1 << EFLAGS_IF_SHIFT)) {
* 0 return value means that this irq is already being
* handled by some other CPU. (or is disabled)
*/
- int irq = regs->orig_rax & 0xff; /* high bits used in ret_from_ code */
+ unsigned irq = regs->orig_rax & 0xff; /* high bits used in ret_from_ code */
int cpu = smp_processor_id();
irq_desc_t *desc = irq_desc + irq;
struct irqaction * action;
unsigned int status;
+	if (irq >= 256) BUG();
+
kstat.irqs[cpu][irq]++;
spin_lock(&desc->lock);
desc->handler->ack(irq);
* The ->end() handler has to deal with interrupts which got
* disabled while the handler was running.
*/
+	if (irq >= 256) BUG();
desc->handler->end(irq);
spin_unlock(&desc->lock);
/*
- * linux/kernel/ldt.c
+ * linux/arch/x86_64/kernel/ldt.c
*
* Copyright (C) 1992 Krishna Balasubramanian and Linus Torvalds
* Copyright (C) 1999 Ingo Molnar <mingo@redhat.com>
+ * Copyright (C) 2002 Andi Kleen
+ *
+ * This handles calls from both 32bit and 64bit mode.
*/
/*
- * FIXME: forbid code segment setting for 64bit mode. doesn't work with SYSCALL
+ * FIXME:
+ * Need to add locking for LAR in load_gs_index.
*/
#include <linux/errno.h>
static int read_default_ldt(void * ptr, unsigned long bytecount)
{
- int err;
- unsigned long size;
- void *address;
-
- err = 0;
- address = &default_ldt[0];
- size = sizeof(struct desc_struct);
- if (size > bytecount)
- size = bytecount;
-
- err = size;
- if (copy_to_user(ptr, address, size))
- err = -EFAULT;
-
- return err;
+	/* Arbitrary number */
+ if (bytecount > 128)
+ bytecount = 128;
+ if (clear_user(ptr, bytecount))
+ return -EFAULT;
+ return bytecount;
}
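The rewritten read_default_ldt() no longer copies real descriptor contents; it clamps the request and reports zero-filled bytes. A userspace-flavored sketch of that behavior (memset stands in for clear_user; the 128-byte cap is the patch's arbitrary value):

```c
#include <assert.h>
#include <string.h>

/* Sketch of the new read_default_ldt() semantics: clamp the request to
 * 128 bytes, zero-fill the buffer, and return the byte count. */
static long default_ldt_read(void *ptr, unsigned long bytecount)
{
	if (bytecount > 128)		/* arbitrary cap, as in the patch */
		bytecount = 128;
	memset(ptr, 0, bytecount);	/* userspace stand-in for clear_user() */
	return (long) bytecount;
}
```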
static int write_ldt(void * ptr, unsigned long bytecount, int oldmode)
{
- struct mm_struct * mm = current->mm;
+ struct task_struct *me = current;
+ struct mm_struct * mm = me->mm;
__u32 entry_1, entry_2, *lp;
int error;
struct modify_ldt_ldt_s ldt_info;
error = -EINVAL;
+
if (bytecount != sizeof(ldt_info))
goto out;
error = -EFAULT;
- if (copy_from_user(&ldt_info, ptr, sizeof(ldt_info)))
+ if (copy_from_user(&ldt_info, ptr, bytecount))
goto out;
error = -EINVAL;
goto out;
}
- current->thread.fsindex = 0;
- current->thread.gsindex = 0;
+ me->thread.fsindex = 0;
+ me->thread.gsindex = 0;
+ me->thread.gs = 0;
+ me->thread.fs = 0;
/*
* the GDT index of the LDT is allocated dynamically, and is
ldt_info.seg_32bit == 0 &&
ldt_info.limit_in_pages == 0 &&
ldt_info.seg_not_present == 1 &&
- ldt_info.useable == 0 )) {
+ ldt_info.useable == 0 &&
+ ldt_info.lm == 0)) {
entry_1 = 0;
entry_2 = 0;
goto install;
((ldt_info.seg_not_present ^ 1) << 15) |
(ldt_info.seg_32bit << 22) |
(ldt_info.limit_in_pages << 23) |
+ (ldt_info.lm << 21) |
0x7000;
if (!oldmode)
entry_2 |= (ldt_info.useable << 20);
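The flag assembly above can be sketched in isolation. This models only the bit positions visible in the patch (present, 32-bit, granularity, the new `lm` long-mode bit, useable, plus the constant 0x7000 for S=1/DPL=3); the base and limit terms of entry_2 are omitted:

```c
#include <assert.h>

/* Sketch: high-word flag bits of an LDT descriptor as write_ldt()
 * assembles them.  Base/limit fields are left out of this model. */
static unsigned int ldt_flag_bits(unsigned seg_not_present, unsigned seg_32bit,
				  unsigned limit_in_pages, unsigned lm,
				  unsigned useable)
{
	return ((seg_not_present ^ 1) << 15) |	/* P: segment present */
	       (seg_32bit << 22) |		/* D/B: default operand size */
	       (limit_in_pages << 23) |		/* G: 4K granularity */
	       (lm << 21) |			/* L: 64-bit code segment */
	       (useable << 20) |		/* AVL */
	       0x7000;				/* S=1, DPL=3 */
}
```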
#include <asm/pgalloc.h>
/* Have we found an MP table */
-int smp_found_config = 0;
+int smp_found_config;
+
+int acpi_found_madt;
/*
* Various Linux-internal data structures created from the
/* Bitmask of physically existing CPUs */
unsigned long phys_cpu_present_map = 0;
+/* ACPI MADT entry parsing functions */
+#ifdef CONFIG_ACPI_BOOT
+extern struct acpi_boot_flags acpi_boot;
+#ifdef CONFIG_X86_LOCAL_APIC
+extern int acpi_parse_lapic (acpi_table_entry_header *header);
+extern int acpi_parse_lapic_addr_ovr (acpi_table_entry_header *header);
+extern int acpi_parse_lapic_nmi (acpi_table_entry_header *header);
+#endif /*CONFIG_X86_LOCAL_APIC*/
+#ifdef CONFIG_X86_IO_APIC
+extern int acpi_parse_ioapic (acpi_table_entry_header *header);
+#endif /*CONFIG_X86_IO_APIC*/
+#endif /*CONFIG_ACPI_BOOT*/
+
/*
* Intel MP BIOS table parsing routines:
*/
printk("APIC at: 0x%X\n",mpc->mpc_lapic);
/* save the local APIC address, it might be non-default */
+ if (!acpi_found_madt)
mp_lapic_addr = mpc->mpc_lapic;
/*
{
struct mpc_config_processor *m=
(struct mpc_config_processor *)mpt;
+ if (!acpi_found_madt)
MP_processor_info(m);
mpt += sizeof(*m);
count += sizeof(*m);
void __init get_smp_config (void)
{
struct intel_mp_floating *mpf = mpf_found;
+
+#ifdef CONFIG_ACPI_BOOT
+ /*
+ * Check if the MADT exists, and if so, use it to get processor
+	 * information (ACPI_MADT_LAPIC).  The MADT describes both logical
+	 * (e.g. HT) and physical processors, whereas the MPS only supports
+	 * physical processors.
+ */
+ if (acpi_boot.madt) {
+ acpi_found_madt = acpi_table_parse(ACPI_APIC, acpi_parse_madt);
+ if (acpi_found_madt > 0)
+ acpi_table_parse_madt(ACPI_MADT_LAPIC, acpi_parse_lapic);
+ }
+#endif /*CONFIG_ACPI_BOOT*/
+
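The precedence established above (MADT processor entries, when present and parsed successfully, supersede the MPS table) can be sketched as follows; the helper and flags are hypothetical, not kernel APIs:

```c
#include <assert.h>
#include <string.h>

/* Sketch: when a MADT is present and parses successfully, processor
 * enumeration comes from ACPI; the MPS table is only a fallback. */
static const char *cpu_enum_source(int madt_present, int madt_parsed)
{
	if (madt_present && madt_parsed > 0)
		return "ACPI MADT";	/* logical + physical CPUs */
	return "MPS";			/* physical CPUs only */
}
```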
printk("Intel MultiProcessor Specification v1.%d\n", mpf->mpf_specification);
if (mpf->mpf_feature2 & (1<<7)) {
printk(" IMCR and PIC compatibility mode.\n");
static int __init smp_scan_config (unsigned long base, unsigned long length)
{
- unsigned long *bp = phys_to_virt(base);
+ unsigned int *bp = phys_to_virt(base);
struct intel_mp_floating *mpf;
Dprintk("Scan SMP from %p for %ld bytes.\n", bp,length);
-/* Generic MTRR (Memory Type Range Register) driver.
+/* x86-64 MTRR (Memory Type Range Register) driver.
+ Based largely upon arch/i386/kernel/mtrr.c
Copyright (C) 1997-2000 Richard Gooch
+ Copyright (C) 2002 Dave Jones.
This library is free software; you can redistribute it and/or
modify it under the terms of the GNU Library General Public
License along with this library; if not, write to the Free
Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
- Richard Gooch may be reached by email at rgooch@atnf.csiro.au
- The postal address is:
- Richard Gooch, c/o ATNF, P. O. Box 76, Epping, N.S.W., 2121, Australia.
-
- Source: "Pentium Pro Family Developer's Manual, Volume 3:
- Operating System Writer's Guide" (Intel document number 242692),
- section 11.11.7
-
- ChangeLog
-
- Prehistory Martin Tischhäuser <martin@ikcbarka.fzk.de>
- Initial register-setting code (from proform-1.0).
- 19971216 Richard Gooch <rgooch@atnf.csiro.au>
- Original version for /proc/mtrr interface, SMP-safe.
- v1.0
- 19971217 Richard Gooch <rgooch@atnf.csiro.au>
- Bug fix for ioctls()'s.
- Added sample code in Documentation/mtrr.txt
- v1.1
- 19971218 Richard Gooch <rgooch@atnf.csiro.au>
- Disallow overlapping regions.
- 19971219 Jens Maurer <jmaurer@menuett.rhein-main.de>
- Register-setting fixups.
- v1.2
- 19971222 Richard Gooch <rgooch@atnf.csiro.au>
- Fixups for kernel 2.1.75.
- v1.3
- 19971229 David Wragg <dpw@doc.ic.ac.uk>
- Register-setting fixups and conformity with Intel conventions.
- 19971229 Richard Gooch <rgooch@atnf.csiro.au>
- Cosmetic changes and wrote this ChangeLog ;-)
- 19980106 Richard Gooch <rgooch@atnf.csiro.au>
- Fixups for kernel 2.1.78.
- v1.4
- 19980119 David Wragg <dpw@doc.ic.ac.uk>
- Included passive-release enable code (elsewhere in PCI setup).
- v1.5
- 19980131 Richard Gooch <rgooch@atnf.csiro.au>
- Replaced global kernel lock with private spinlock.
- v1.6
- 19980201 Richard Gooch <rgooch@atnf.csiro.au>
- Added wait for other CPUs to complete changes.
- v1.7
- 19980202 Richard Gooch <rgooch@atnf.csiro.au>
- Bug fix in definition of <set_mtrr> for UP.
- v1.8
- 19980319 Richard Gooch <rgooch@atnf.csiro.au>
- Fixups for kernel 2.1.90.
- 19980323 Richard Gooch <rgooch@atnf.csiro.au>
- Move SMP BIOS fixup before secondary CPUs call <calibrate_delay>
- v1.9
- 19980325 Richard Gooch <rgooch@atnf.csiro.au>
- Fixed test for overlapping regions: confused by adjacent regions
- 19980326 Richard Gooch <rgooch@atnf.csiro.au>
- Added wbinvd in <set_mtrr_prepare>.
- 19980401 Richard Gooch <rgooch@atnf.csiro.au>
- Bug fix for non-SMP compilation.
- 19980418 David Wragg <dpw@doc.ic.ac.uk>
- Fixed-MTRR synchronisation for SMP and use atomic operations
- instead of spinlocks.
- 19980418 Richard Gooch <rgooch@atnf.csiro.au>
- Differentiate different MTRR register classes for BIOS fixup.
- v1.10
- 19980419 David Wragg <dpw@doc.ic.ac.uk>
- Bug fix in variable MTRR synchronisation.
- v1.11
- 19980419 Richard Gooch <rgooch@atnf.csiro.au>
- Fixups for kernel 2.1.97.
- v1.12
- 19980421 Richard Gooch <rgooch@atnf.csiro.au>
- Safer synchronisation across CPUs when changing MTRRs.
- v1.13
- 19980423 Richard Gooch <rgooch@atnf.csiro.au>
- Bugfix for SMP systems without MTRR support.
- v1.14
- 19980427 Richard Gooch <rgooch@atnf.csiro.au>
- Trap calls to <mtrr_add> and <mtrr_del> on non-MTRR machines.
- v1.15
- 19980427 Richard Gooch <rgooch@atnf.csiro.au>
- Use atomic bitops for setting SMP change mask.
- v1.16
- 19980428 Richard Gooch <rgooch@atnf.csiro.au>
- Removed spurious diagnostic message.
- v1.17
- 19980429 Richard Gooch <rgooch@atnf.csiro.au>
- Moved register-setting macros into this file.
- Moved setup code from init/main.c to i386-specific areas.
- v1.18
- 19980502 Richard Gooch <rgooch@atnf.csiro.au>
- Moved MTRR detection outside conditionals in <mtrr_init>.
- v1.19
- 19980502 Richard Gooch <rgooch@atnf.csiro.au>
- Documentation improvement: mention Pentium II and AGP.
- v1.20
- 19980521 Richard Gooch <rgooch@atnf.csiro.au>
- Only manipulate interrupt enable flag on local CPU.
- Allow enclosed uncachable regions.
- v1.21
- 19980611 Richard Gooch <rgooch@atnf.csiro.au>
- Always define <main_lock>.
- v1.22
- 19980901 Richard Gooch <rgooch@atnf.csiro.au>
- Removed module support in order to tidy up code.
- Added sanity check for <mtrr_add>/<mtrr_del> before <mtrr_init>.
- Created addition queue for prior to SMP commence.
- v1.23
- 19980902 Richard Gooch <rgooch@atnf.csiro.au>
- Ported patch to kernel 2.1.120-pre3.
- v1.24
- 19980910 Richard Gooch <rgooch@atnf.csiro.au>
- Removed sanity checks and addition queue: Linus prefers an OOPS.
- v1.25
- 19981001 Richard Gooch <rgooch@atnf.csiro.au>
- Fixed harmless compiler warning in include/asm-i386/mtrr.h
- Fixed version numbering and history for v1.23 -> v1.24.
- v1.26
- 19990118 Richard Gooch <rgooch@atnf.csiro.au>
- Added devfs support.
- v1.27
- 19990123 Richard Gooch <rgooch@atnf.csiro.au>
- Changed locking to spin with reschedule.
- Made use of new <smp_call_function>.
- v1.28
- 19990201 Zoltán Böszörményi <zboszor@mail.externet.hu>
- Extended the driver to be able to use Cyrix style ARRs.
- 19990204 Richard Gooch <rgooch@atnf.csiro.au>
- Restructured Cyrix support.
- v1.29
- 19990204 Zoltán Böszörményi <zboszor@mail.externet.hu>
- Refined ARR support: enable MAPEN in set_mtrr_prepare()
- and disable MAPEN in set_mtrr_done().
- 19990205 Richard Gooch <rgooch@atnf.csiro.au>
- Minor cleanups.
- v1.30
- 19990208 Zoltán Böszörményi <zboszor@mail.externet.hu>
- Protect plain 6x86s (and other processors without the
- Page Global Enable feature) against accessing CR4 in
- set_mtrr_prepare() and set_mtrr_done().
- 19990210 Richard Gooch <rgooch@atnf.csiro.au>
- Turned <set_mtrr_up> and <get_mtrr> into function pointers.
- v1.31
- 19990212 Zoltán Böszörményi <zboszor@mail.externet.hu>
- Major rewrite of cyrix_arr_init(): do not touch ARRs,
- leave them as the BIOS have set them up.
- Enable usage of all 8 ARRs.
- Avoid multiplications by 3 everywhere and other
- code clean ups/speed ups.
- 19990213 Zoltán Böszörményi <zboszor@mail.externet.hu>
- Set up other Cyrix processors identical to the boot cpu.
- Since Cyrix don't support Intel APIC, this is l'art pour l'art.
- Weigh ARRs by size:
- If size <= 32M is given, set up ARR# we were given.
- If size > 32M is given, set up ARR7 only if it is free,
- fail otherwise.
- 19990214 Zoltán Böszörményi <zboszor@mail.externet.hu>
- Also check for size >= 256K if we are to set up ARR7,
- mtrr_add() returns the value it gets from set_mtrr()
- 19990218 Zoltán Böszörményi <zboszor@mail.externet.hu>
- Remove Cyrix "coma bug" workaround from here.
- Moved to linux/arch/i386/kernel/setup.c and
- linux/include/asm-i386/bugs.h
- 19990228 Richard Gooch <rgooch@atnf.csiro.au>
- Added MTRRIOC_KILL_ENTRY ioctl(2)
- Trap for counter underflow in <mtrr_file_del>.
- Trap for 4 MiB aligned regions for PPro, stepping <= 7.
- 19990301 Richard Gooch <rgooch@atnf.csiro.au>
- Created <get_free_region> hook.
- 19990305 Richard Gooch <rgooch@atnf.csiro.au>
- Temporarily disable AMD support now MTRR capability flag is set.
- v1.32
- 19990308 Zoltán Böszörményi <zboszor@mail.externet.hu>
- Adjust my changes (19990212-19990218) to Richard Gooch's
- latest changes. (19990228-19990305)
- v1.33
- 19990309 Richard Gooch <rgooch@atnf.csiro.au>
- Fixed typo in <printk> message.
- 19990310 Richard Gooch <rgooch@atnf.csiro.au>
- Support K6-II/III based on Alan Cox's <alan@redhat.com> patches.
- v1.34
- 19990511 Bart Hartgers <bart@etpmod.phys.tue.nl>
- Support Centaur C6 MCR's.
- 19990512 Richard Gooch <rgooch@atnf.csiro.au>
- Minor cleanups.
- v1.35
- 19990707 Zoltán Böszörményi <zboszor@mail.externet.hu>
- Check whether ARR3 is protected in cyrix_get_free_region()
- and mtrr_del(). The code won't attempt to delete or change it
- from now on if the BIOS protected ARR3. It silently skips ARR3
- in cyrix_get_free_region() or returns with an error code from
- mtrr_del().
- 19990711 Zoltán Böszörményi <zboszor@mail.externet.hu>
- Reset some bits in the CCRs in cyrix_arr_init() to disable SMM
- if ARR3 isn't protected. This is needed because if SMM is active
- and ARR3 isn't protected then deleting and setting ARR3 again
- may lock up the processor. With SMM entirely disabled, it does
- not happen.
- 19990812 Zoltán Böszörményi <zboszor@mail.externet.hu>
- Rearrange switch() statements so the driver accomodates to
- the fact that the AMD Athlon handles its MTRRs the same way
- as Intel does.
- 19990814 Zoltán Böszörményi <zboszor@mail.externet.hu>
- Double check for Intel in mtrr_add()'s big switch() because
- that revision check is only valid for Intel CPUs.
- 19990819 Alan Cox <alan@redhat.com>
- Tested Zoltan's changes on a pre production Athlon - 100%
- success.
- 19991008 Manfred Spraul <manfreds@colorfullife.com>
- replaced spin_lock_reschedule() with a normal semaphore.
- v1.36
- 20000221 Richard Gooch <rgooch@atnf.csiro.au>
- Compile fix if procfs and devfs not enabled.
- Formatting changes.
- v1.37
- 20001109 H. Peter Anvin <hpa@zytor.com>
- Use the new centralized CPU feature detects.
-
- v1.38
- 20010309 Dave Jones <davej@suse.de>
- Add support for Cyrix III.
-
- v1.39
- 20010312 Dave Jones <davej@suse.de>
- Ugh, I broke AMD support.
- Reworked fix by Troels Walsted Hansen <troels@thule.no>
-
- v1.40
- 20010327 Dave Jones <davej@suse.de>
- Adapted Cyrix III support to include VIA C3.
+ (For earlier history, see arch/i386/kernel/mtrr.c)
+ September 2001 Dave Jones <davej@suse.de>
+ Initial rewrite for x86-64.
*/
#include <linux/types.h>
#include <linux/devfs_fs_kernel.h>
#include <linux/mm.h>
#include <linux/module.h>
-#include <linux/pci.h>
#define MTRR_NEED_STRINGS
#include <asm/mtrr.h>
#include <linux/init.h>
#include <asm/hardirq.h>
#include <linux/irq.h>
-#define MTRR_VERSION "1.40 (20010327)"
+#define MTRR_VERSION "2.00 (20020207)"
#define TRUE 1
#define FALSE 0
-/*
- * The code assumes all processors support the same MTRR
- * interface. This is generally a good assumption, but could
- * potentially be a problem.
- */
-enum mtrr_if_type {
- MTRR_IF_NONE, /* No MTRRs supported */
- MTRR_IF_INTEL, /* Intel (P6) standard MTRRs */
- MTRR_IF_AMD_K6, /* AMD pre-Athlon MTRRs */
- MTRR_IF_CYRIX_ARR, /* Cyrix ARRs */
- MTRR_IF_CENTAUR_MCR, /* Centaur MCRs */
-} mtrr_if = MTRR_IF_NONE;
-
-static __initdata char *mtrr_if_name[] = {
- "none", "Intel", "AMD K6", "Cyrix ARR", "Centaur MCR"
-};
-
#define MTRRcap_MSR 0x0fe
#define MTRRdefType_MSR 0x2ff
#define MTRRfix4K_F8000_MSR 0x26f
#ifdef CONFIG_SMP
-# define MTRR_CHANGE_MASK_FIXED 0x01
-# define MTRR_CHANGE_MASK_VARIABLE 0x02
-# define MTRR_CHANGE_MASK_DEFTYPE 0x04
+#define MTRR_CHANGE_MASK_FIXED 0x01
+#define MTRR_CHANGE_MASK_VARIABLE 0x02
+#define MTRR_CHANGE_MASK_DEFTYPE 0x04
#endif
-/* In the Intel processor's MTRR interface, the MTRR type is always held in
- an 8 bit field: */
typedef u8 mtrr_type;
#define LINE_SIZE 80
-#define JIFFIE_TIMEOUT 100
#ifdef CONFIG_SMP
-# define set_mtrr(reg,base,size,type) set_mtrr_smp (reg, base, size, type)
+#define set_mtrr(reg,base,size,type) set_mtrr_smp (reg, base, size, type)
#else
-# define set_mtrr(reg,base,size,type) (*set_mtrr_up) (reg, base, size, type, \
+#define set_mtrr(reg,base,size,type) (*set_mtrr_up) (reg, base, size, type, \
TRUE)
#endif
#if defined(CONFIG_PROC_FS) || defined(CONFIG_DEVFS_FS)
-# define USERSPACE_INTERFACE
+#define USERSPACE_INTERFACE
#endif
#ifndef USERSPACE_INTERFACE
-# define compute_ascii() while (0)
+#define compute_ascii() while (0)
#endif
#ifdef USERSPACE_INTERFACE
static unsigned int ascii_buf_bytes;
#endif
static unsigned int *usage_table;
-static DECLARE_MUTEX(main_lock);
+static DECLARE_MUTEX (main_lock);
/* Private functions */
#ifdef USERSPACE_INTERFACE
static void compute_ascii (void);
#endif
-
-struct set_mtrr_context
-{
+struct set_mtrr_context {
unsigned long flags;
unsigned long deftype_lo;
unsigned long deftype_hi;
unsigned long cr4val;
- unsigned long ccr3;
};
-static int arr3_protected;
/* Put the processor into a state where MTRRs can be safely set */
-static void set_mtrr_prepare_save (struct set_mtrr_context *ctxt)
+static void set_mtrr_prepare (struct set_mtrr_context *ctxt)
{
- /* Disable interrupts locally */
- __save_flags (ctxt->flags); __cli ();
+ unsigned long cr0;
- if ( mtrr_if != MTRR_IF_INTEL && mtrr_if != MTRR_IF_CYRIX_ARR )
- return;
+ /* Disable interrupts locally */
+ __save_flags(ctxt->flags);
+ __cli();
/* Save value of CR4 and clear Page Global Enable (bit 7) */
- if ( test_bit(X86_FEATURE_PGE, &boot_cpu_data.x86_capability) ) {
+ if (test_bit(X86_FEATURE_PGE, &boot_cpu_data.x86_capability)) {
ctxt->cr4val = read_cr4();
- write_cr4(ctxt->cr4val & ~(1<<7));
+ write_cr4(ctxt->cr4val & ~(1UL << 7));
}
/* Disable and flush caches. Note that wbinvd flushes the TLBs as
a side-effect */
-
- {
- long cr0 = read_cr0() | 0x40000000;
+ cr0 = read_cr0() | 0x40000000;
wbinvd();
- write_cr0( cr0 );
+ write_cr0(cr0);
wbinvd();
- }
-
- if ( mtrr_if == MTRR_IF_INTEL ) {
- /* Save MTRR state */
- rdmsr (MTRRdefType_MSR, ctxt->deftype_lo, ctxt->deftype_hi);
- } else {
- /* Cyrix ARRs - everything else were excluded at the top */
- ctxt->ccr3 = getCx86 (CX86_CCR3);
- }
-} /* End Function set_mtrr_prepare_save */
-
-static void set_mtrr_cache_disable (struct set_mtrr_context *ctxt)
-{
- if ( mtrr_if != MTRR_IF_INTEL && mtrr_if != MTRR_IF_CYRIX_ARR )
- return;
- if ( mtrr_if == MTRR_IF_INTEL ) {
/* Disable MTRRs, and set the default type to uncached */
- wrmsr (MTRRdefType_MSR, ctxt->deftype_lo & 0xf300UL, ctxt->deftype_hi);
- } else {
- /* Cyrix ARRs - everything else were excluded at the top */
- setCx86 (CX86_CCR3, (ctxt->ccr3 & 0x0f) | 0x10);
- }
-} /* End Function set_mtrr_cache_disable */
+ rdmsr(MTRRdefType_MSR, ctxt->deftype_lo, ctxt->deftype_hi);
+ wrmsr(MTRRdefType_MSR, ctxt->deftype_lo & 0xf300UL, ctxt->deftype_hi);
+}
+
/* Restore the processor after a set_mtrr_prepare */
static void set_mtrr_done (struct set_mtrr_context *ctxt)
{
- if ( mtrr_if != MTRR_IF_INTEL && mtrr_if != MTRR_IF_CYRIX_ARR ) {
- __restore_flags (ctxt->flags);
- return;
- }
-
/* Flush caches and TLBs */
wbinvd();
/* Restore MTRRdefType */
- if ( mtrr_if == MTRR_IF_INTEL ) {
- /* Intel (P6) standard MTRRs */
- wrmsr (MTRRdefType_MSR, ctxt->deftype_lo, ctxt->deftype_hi);
- } else {
- /* Cyrix ARRs - everything else was excluded at the top */
- setCx86 (CX86_CCR3, ctxt->ccr3);
- }
+ wrmsr(MTRRdefType_MSR, ctxt->deftype_lo, ctxt->deftype_hi);
/* Enable caches */
- write_cr0( read_cr0() & 0xbfffffff );
+ write_cr0(read_cr0() & 0xbfffffff);
/* Restore value of CR4 */
- if ( test_bit(X86_FEATURE_PGE, &boot_cpu_data.x86_capability) )
- write_cr4(ctxt->cr4val);
+ if (test_bit(X86_FEATURE_PGE, &boot_cpu_data.x86_capability))
+ write_cr4 (ctxt->cr4val);
/* Re-enable interrupts locally (if enabled previously) */
- __restore_flags (ctxt->flags);
-} /* End Function set_mtrr_done */
+ __restore_flags(ctxt->flags);
+}
+
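The magic constants in set_mtrr_prepare()/set_mtrr_done() are the CR0 cache-disable bit: ORing with 0x40000000 sets CR0.CD (bit 30), and ANDing with 0xbfffffff clears it again, with wbinvd() flushing caches between the two states. A sketch of just the bit arithmetic:

```c
#include <assert.h>

/* Sketch: the CR0.CD (cache disable, bit 30) toggles performed around
 * MTRR updates in set_mtrr_prepare() and set_mtrr_done(). */
#define CR0_CD (1u << 30)

static unsigned int cr0_caches_off(unsigned int cr0)
{
	return cr0 | CR0_CD;		/* same as "| 0x40000000" */
}

static unsigned int cr0_caches_on(unsigned int cr0)
{
	return cr0 & ~CR0_CD;		/* same as "& 0xbfffffff" */
}
```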
/* This function returns the number of variable MTRRs */
static unsigned int get_num_var_ranges (void)
{
unsigned long config, dummy;
- switch ( mtrr_if )
- {
- case MTRR_IF_INTEL:
rdmsr (MTRRcap_MSR, config, dummy);
return (config & 0xff);
- case MTRR_IF_AMD_K6:
- return 2;
- case MTRR_IF_CYRIX_ARR:
- return 8;
- case MTRR_IF_CENTAUR_MCR:
- return 8;
- default:
- return 0;
- }
-} /* End Function get_num_var_ranges */
+}
+
/* Returns non-zero if we have the write-combining memory type */
static int have_wrcomb (void)
{
unsigned long config, dummy;
- struct pci_dev *dev = NULL;
-
- /* ServerWorks LE chipsets have problems with write-combining
- Don't allow it and leave room for other chipsets to be tagged */
-
- if ((dev = pci_find_class(PCI_CLASS_BRIDGE_HOST << 8, NULL)) != NULL) {
- switch(dev->vendor) {
- case PCI_VENDOR_ID_SERVERWORKS:
- switch (dev->device) {
- case PCI_DEVICE_ID_SERVERWORKS_LE:
- return 0;
- break;
- default:
- break;
- }
- break;
- default:
- break;
- }
- }
-
- switch ( mtrr_if )
- {
- case MTRR_IF_INTEL:
rdmsr (MTRRcap_MSR, config, dummy);
- return (config & (1<<10));
- return 1;
- case MTRR_IF_AMD_K6:
- case MTRR_IF_CENTAUR_MCR:
- case MTRR_IF_CYRIX_ARR:
- return 1;
- default:
- return 0;
- }
-} /* End Function have_wrcomb */
+ return (config & (1 << 10));
+}
+
static u32 size_or_mask, size_and_mask;
-static void intel_get_mtrr (unsigned int reg, unsigned long *base,
- unsigned long *size, mtrr_type *type)
+static void get_mtrr (unsigned int reg, unsigned long *base,
+ unsigned long *size, mtrr_type * type)
{
unsigned long mask_lo, mask_hi, base_lo, base_hi;
- rdmsr (MTRRphysMask_MSR(reg), mask_lo, mask_hi);
- if ( (mask_lo & 0x800) == 0 )
- {
+ rdmsr (MTRRphysMask_MSR (reg), mask_lo, mask_hi);
+ if ((mask_lo & 0x800) == 0) {
/* Invalid (i.e. free) range */
*base = 0;
*size = 0;
return;
}
- rdmsr(MTRRphysBase_MSR(reg), base_lo, base_hi);
+ rdmsr (MTRRphysBase_MSR (reg), base_lo, base_hi);
/* Work out the shifted address mask. */
	mask_lo = size_or_mask | mask_hi << (32 - PAGE_SHIFT)
	    | mask_lo >> PAGE_SHIFT;
*size = -mask_lo;
*base = base_hi << (32 - PAGE_SHIFT) | base_lo >> PAGE_SHIFT;
*type = base_lo & 0xff;
-} /* End Function intel_get_mtrr */
-
-static void cyrix_get_arr (unsigned int reg, unsigned long *base,
- unsigned long *size, mtrr_type *type)
-{
- unsigned long flags;
- unsigned char arr, ccr3, rcr, shift;
-
- arr = CX86_ARR_BASE + (reg << 1) + reg; /* avoid multiplication by 3 */
-
- /* Save flags and disable interrupts */
- __save_flags (flags); __cli ();
-
- ccr3 = getCx86 (CX86_CCR3);
- setCx86 (CX86_CCR3, (ccr3 & 0x0f) | 0x10); /* enable MAPEN */
- ((unsigned char *) base)[3] = getCx86 (arr);
- ((unsigned char *) base)[2] = getCx86 (arr+1);
- ((unsigned char *) base)[1] = getCx86 (arr+2);
- rcr = getCx86(CX86_RCR_BASE + reg);
- setCx86 (CX86_CCR3, ccr3); /* disable MAPEN */
-
- /* Enable interrupts if it was enabled previously */
- __restore_flags (flags);
- shift = ((unsigned char *) base)[1] & 0x0f;
- *base >>= PAGE_SHIFT;
-
- /* Power of two, at least 4K on ARR0-ARR6, 256K on ARR7
- * Note: shift==0xf means 4G, this is unsupported.
- */
- if (shift)
- *size = (reg < 7 ? 0x1UL : 0x40UL) << (shift - 1);
- else
- *size = 0;
-
- /* Bit 0 is Cache Enable on ARR7, Cache Disable on ARR0-ARR6 */
- if (reg < 7)
- {
- switch (rcr)
- {
- case 1: *type = MTRR_TYPE_UNCACHABLE; break;
- case 8: *type = MTRR_TYPE_WRBACK; break;
- case 9: *type = MTRR_TYPE_WRCOMB; break;
- case 24:
- default: *type = MTRR_TYPE_WRTHROUGH; break;
- }
- } else
- {
- switch (rcr)
- {
- case 0: *type = MTRR_TYPE_UNCACHABLE; break;
- case 8: *type = MTRR_TYPE_WRCOMB; break;
- case 9: *type = MTRR_TYPE_WRBACK; break;
- case 25:
- default: *type = MTRR_TYPE_WRTHROUGH; break;
- }
- }
-} /* End Function cyrix_get_arr */
-
-static void amd_get_mtrr (unsigned int reg, unsigned long *base,
- unsigned long *size, mtrr_type *type)
-{
- unsigned long low, high;
-
- rdmsr (0xC0000085, low, high);
- /* Upper dword is region 1, lower is region 0 */
- if (reg == 1) low = high;
- /* The base masks off on the right alignment */
- *base = (low & 0xFFFE0000) >> PAGE_SHIFT;
- *type = 0;
- if (low & 1) *type = MTRR_TYPE_UNCACHABLE;
- if (low & 2) *type = MTRR_TYPE_WRCOMB;
- if ( !(low & 3) )
- {
- *size = 0;
- return;
- }
- /*
- * This needs a little explaining. The size is stored as an
- * inverted mask of bits of 128K granularity 15 bits long offset
- * 2 bits
- *
- * So to get a size we do invert the mask and add 1 to the lowest
- * mask bit (4 as its 2 bits in). This gives us a size we then shift
- * to turn into 128K blocks
- *
- * eg 111 1111 1111 1100 is 512K
- *
- * invert 000 0000 0000 0011
- * +1 000 0000 0000 0100
- * *128K ...
- */
- low = (~low) & 0x1FFFC;
- *size = (low + 4) << (15 - PAGE_SHIFT);
- return;
-} /* End Function amd_get_mtrr */
-
-static struct
-{
- unsigned long high;
- unsigned long low;
-} centaur_mcr[8];
+}
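The size recovery in get_mtrr() relies on the PHYSMASK layout: the register stores ~(size-1) over the implemented address bits, so once size_or_mask forces the unimplemented high bits on, two's-complement negation yields the region size in pages. A sketch (the size_or_mask value in the test assumes 36-bit physical addressing with 4K pages):

```c
#include <assert.h>

/* Sketch: recovering an MTRR region size (in pages) from its mask, as
 * get_mtrr() does with "*size = -mask_lo".  size_or_mask fills in the
 * address bits the CPU does not implement. */
static unsigned int mtrr_size_pages(unsigned int mask_pages,
				    unsigned int size_or_mask)
{
	return 0u - (mask_pages | size_or_mask);  /* two's-complement trick */
}
```

For a 64 MB region the mask covers everything above 0x4000 pages, so `mtrr_size_pages(0x00ffc000, 0xff000000)` yields 0x4000 pages.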
-static u8 centaur_mcr_reserved;
-static u8 centaur_mcr_type; /* 0 for winchip, 1 for winchip2 */
-/*
- * Report boot time MCR setups
- */
-
-void mtrr_centaur_report_mcr(int mcr, u32 lo, u32 hi)
-{
- centaur_mcr[mcr].low = lo;
- centaur_mcr[mcr].high = hi;
-}
-static void centaur_get_mcr (unsigned int reg, unsigned long *base,
- unsigned long *size, mtrr_type *type)
-{
- *base = centaur_mcr[reg].high >> PAGE_SHIFT;
- *size = -(centaur_mcr[reg].low & 0xfffff000) >> PAGE_SHIFT;
- *type = MTRR_TYPE_WRCOMB; /* If it is there, it is write-combining */
- if(centaur_mcr_type==1 && ((centaur_mcr[reg].low&31)&2))
- *type = MTRR_TYPE_UNCACHABLE;
- if(centaur_mcr_type==1 && (centaur_mcr[reg].low&31)==25)
- *type = MTRR_TYPE_WRBACK;
- if(centaur_mcr_type==0 && (centaur_mcr[reg].low&31)==31)
- *type = MTRR_TYPE_WRBACK;
-
-} /* End Function centaur_get_mcr */
-
-static void (*get_mtrr) (unsigned int reg, unsigned long *base,
- unsigned long *size, mtrr_type *type);
-
-static void intel_set_mtrr_up (unsigned int reg, unsigned long base,
+static void set_mtrr_up (unsigned int reg, unsigned long base,
unsigned long size, mtrr_type type, int do_safe)
/* [SUMMARY] Set variable MTRR register on the local CPU.
<reg> The register to set.
{
struct set_mtrr_context ctxt;
- if (do_safe) {
- set_mtrr_prepare_save (&ctxt);
- set_mtrr_cache_disable (&ctxt);
- }
- if (size == 0)
- {
+ if (do_safe)
+ set_mtrr_prepare (&ctxt);
+
+ if (size == 0) {
/* The invalid bit is kept in the mask, so we simply clear the
relevant mask register to disable a range. */
wrmsr (MTRRphysMask_MSR (reg), 0, 0);
- }
- else
- {
+ } else {
wrmsr (MTRRphysBase_MSR (reg), base << PAGE_SHIFT | type,
(base & size_and_mask) >> (32 - PAGE_SHIFT));
wrmsr (MTRRphysMask_MSR (reg), -size << PAGE_SHIFT | 0x800,
(-size & size_and_mask) >> (32 - PAGE_SHIFT));
}
- if (do_safe) set_mtrr_done (&ctxt);
-} /* End Function intel_set_mtrr_up */
-
-static void cyrix_set_arr_up (unsigned int reg, unsigned long base,
- unsigned long size, mtrr_type type, int do_safe)
-{
- struct set_mtrr_context ctxt;
- unsigned char arr, arr_type, arr_size;
-
- arr = CX86_ARR_BASE + (reg << 1) + reg; /* avoid multiplication by 3 */
-
- /* count down from 32M (ARR0-ARR6) or from 2G (ARR7) */
- if (reg >= 7)
- size >>= 6;
-
- size &= 0x7fff; /* make sure arr_size <= 14 */
- for(arr_size = 0; size; arr_size++, size >>= 1);
-
- if (reg<7)
- {
- switch (type) {
- case MTRR_TYPE_UNCACHABLE: arr_type = 1; break;
- case MTRR_TYPE_WRCOMB: arr_type = 9; break;
- case MTRR_TYPE_WRTHROUGH: arr_type = 24; break;
- default: arr_type = 8; break;
- }
- }
- else
- {
- switch (type)
- {
- case MTRR_TYPE_UNCACHABLE: arr_type = 0; break;
- case MTRR_TYPE_WRCOMB: arr_type = 8; break;
- case MTRR_TYPE_WRTHROUGH: arr_type = 25; break;
- default: arr_type = 9; break;
- }
- }
-
- if (do_safe) {
- set_mtrr_prepare_save (&ctxt);
- set_mtrr_cache_disable (&ctxt);
- }
- base <<= PAGE_SHIFT;
- setCx86(arr, ((unsigned char *) &base)[3]);
- setCx86(arr+1, ((unsigned char *) &base)[2]);
- setCx86(arr+2, (((unsigned char *) &base)[1]) | arr_size);
- setCx86(CX86_RCR_BASE + reg, arr_type);
- if (do_safe) set_mtrr_done (&ctxt);
-} /* End Function cyrix_set_arr_up */
-
-static void amd_set_mtrr_up (unsigned int reg, unsigned long base,
- unsigned long size, mtrr_type type, int do_safe)
-/* [SUMMARY] Set variable MTRR register on the local CPU.
- <reg> The register to set.
- <base> The base address of the region.
- <size> The size of the region. If this is 0 the region is disabled.
- <type> The type of the region.
- <do_safe> If TRUE, do the change safely. If FALSE, safety measures should
- be done externally.
- [RETURNS] Nothing.
-*/
-{
- u32 regs[2];
- struct set_mtrr_context ctxt;
-
- if (do_safe) {
- set_mtrr_prepare_save (&ctxt);
- set_mtrr_cache_disable (&ctxt);
- }
- /*
- * Low is MTRR0 , High MTRR 1
- */
- rdmsr (0xC0000085, regs[0], regs[1]);
- /*
- * Blank to disable
- */
- if (size == 0)
- regs[reg] = 0;
- else
- /* Set the register to the base, the type (off by one) and an
- inverted bitmask of the size The size is the only odd
- bit. We are fed say 512K We invert this and we get 111 1111
- 1111 1011 but if you subtract one and invert you get the
- desired 111 1111 1111 1100 mask
-
- But ~(x - 1) == ~x + 1 == -x. Two's complement rocks! */
- regs[reg] = (-size>>(15-PAGE_SHIFT) & 0x0001FFFC)
- | (base<<PAGE_SHIFT) | (type+1);
-
- /*
- * The writeback rule is quite specific. See the manual. Its
- * disable local interrupts, write back the cache, set the mtrr
- */
- wbinvd();
- wrmsr (0xC0000085, regs[0], regs[1]);
- if (do_safe) set_mtrr_done (&ctxt);
-} /* End Function amd_set_mtrr_up */
-
-
-static void centaur_set_mcr_up (unsigned int reg, unsigned long base,
- unsigned long size, mtrr_type type,
- int do_safe)
-{
- struct set_mtrr_context ctxt;
- unsigned long low, high;
-
- if (do_safe) {
- set_mtrr_prepare_save (&ctxt);
- set_mtrr_cache_disable (&ctxt);
- }
- if (size == 0)
- {
- /* Disable */
- high = low = 0;
- }
- else
- {
- high = base << PAGE_SHIFT;
- if(centaur_mcr_type == 0)
- low = -size << PAGE_SHIFT | 0x1f; /* only support write-combining... */
- else
- {
- if(type == MTRR_TYPE_UNCACHABLE)
- low = -size << PAGE_SHIFT | 0x02; /* NC */
- else
- low = -size << PAGE_SHIFT | 0x09; /* WWO,WC */
- }
- }
- centaur_mcr[reg].high = high;
- centaur_mcr[reg].low = low;
- wrmsr (0x110 + reg, low, high);
- if (do_safe) set_mtrr_done( &ctxt );
-} /* End Function centaur_set_mtrr_up */
+ if (do_safe)
+ set_mtrr_done (&ctxt);
+}
-static void (*set_mtrr_up) (unsigned int reg, unsigned long base,
- unsigned long size, mtrr_type type,
- int do_safe);
#ifdef CONFIG_SMP
-struct mtrr_var_range
-{
+struct mtrr_var_range {
unsigned long base_lo;
unsigned long base_hi;
unsigned long mask_lo;
unsigned long mask_hi;
};
-
/* Get the MSR pair relating to a var range */
static void __init get_mtrr_var_range (unsigned int index,
struct mtrr_var_range *vr)
{
rdmsr (MTRRphysBase_MSR (index), vr->base_lo, vr->base_hi);
rdmsr (MTRRphysMask_MSR (index), vr->mask_lo, vr->mask_hi);
-} /* End Function get_mtrr_var_range */
+}
/* Set the MSR pair relating to a var range. Returns TRUE if
changes are made */
-static int __init set_mtrr_var_range_testing (unsigned int index,
- struct mtrr_var_range *vr)
+static int __init
+set_mtrr_var_range_testing (unsigned int index, struct mtrr_var_range *vr)
{
unsigned int lo, hi;
int changed = FALSE;
- rdmsr(MTRRphysBase_MSR(index), lo, hi);
- if ( (vr->base_lo & 0xfffff0ffUL) != (lo & 0xfffff0ffUL)
- || (vr->base_hi & 0xfUL) != (hi & 0xfUL) )
- {
- wrmsr (MTRRphysBase_MSR(index), vr->base_lo, vr->base_hi);
+ rdmsr (MTRRphysBase_MSR (index), lo, hi);
+ if ((vr->base_lo & 0xfffff0ffUL) != (lo & 0xfffff0ffUL)
+ || (vr->base_hi & 0xfUL) != (hi & 0xfUL)) {
+ wrmsr (MTRRphysBase_MSR (index), vr->base_lo, vr->base_hi);
changed = TRUE;
}
- rdmsr (MTRRphysMask_MSR(index), lo, hi);
+ rdmsr (MTRRphysMask_MSR (index), lo, hi);
- if ( (vr->mask_lo & 0xfffff800UL) != (lo & 0xfffff800UL)
- || (vr->mask_hi & 0xfUL) != (hi & 0xfUL) )
- {
- wrmsr(MTRRphysMask_MSR(index), vr->mask_lo, vr->mask_hi);
+ if ((vr->mask_lo & 0xfffff800UL) != (lo & 0xfffff800UL)
+ || (vr->mask_hi & 0xfUL) != (hi & 0xfUL)) {
+ wrmsr (MTRRphysMask_MSR (index), vr->mask_lo, vr->mask_hi);
changed = TRUE;
}
return changed;
-} /* End Function set_mtrr_var_range_testing */
+}
+
-static void __init get_fixed_ranges(mtrr_type *frs)
+static void __init get_fixed_ranges (mtrr_type * frs)
{
- unsigned long *p = (unsigned long *)frs;
+ unsigned long *p = (unsigned long *) frs;
int i;
- rdmsr(MTRRfix64K_00000_MSR, p[0], p[1]);
+ rdmsr (MTRRfix64K_00000_MSR, p[0], p[1]);
for (i = 0; i < 2; i++)
- rdmsr(MTRRfix16K_80000_MSR + i, p[2 + i*2], p[3 + i*2]);
+ rdmsr (MTRRfix16K_80000_MSR + i, p[2 + i * 2], p[3 + i * 2]);
for (i = 0; i < 8; i++)
- rdmsr(MTRRfix4K_C0000_MSR + i, p[6 + i*2], p[7 + i*2]);
-} /* End Function get_fixed_ranges */
+ rdmsr (MTRRfix4K_C0000_MSR + i, p[6 + i * 2], p[7 + i * 2]);
+}
-static int __init set_fixed_ranges_testing(mtrr_type *frs)
+
+static int __init set_fixed_ranges_testing (mtrr_type * frs)
{
- unsigned long *p = (unsigned long *)frs;
+ unsigned long *p = (unsigned long *) frs;
int changed = FALSE;
int i;
unsigned long lo, hi;
- rdmsr(MTRRfix64K_00000_MSR, lo, hi);
- if (p[0] != lo || p[1] != hi)
- {
+ rdmsr (MTRRfix64K_00000_MSR, lo, hi);
+ if (p[0] != lo || p[1] != hi) {
wrmsr (MTRRfix64K_00000_MSR, p[0], p[1]);
changed = TRUE;
}
- for (i = 0; i < 2; i++)
- {
+ for (i = 0; i < 2; i++) {
rdmsr (MTRRfix16K_80000_MSR + i, lo, hi);
- if (p[2 + i*2] != lo || p[3 + i*2] != hi)
- {
- wrmsr (MTRRfix16K_80000_MSR + i, p[2 + i*2], p[3 + i*2]);
+ if (p[2 + i * 2] != lo || p[3 + i * 2] != hi) {
+ wrmsr (MTRRfix16K_80000_MSR + i, p[2 + i * 2],
+ p[3 + i * 2]);
changed = TRUE;
}
}
- for (i = 0; i < 8; i++)
- {
+ for (i = 0; i < 8; i++) {
rdmsr (MTRRfix4K_C0000_MSR + i, lo, hi);
- if (p[6 + i*2] != lo || p[7 + i*2] != hi)
- {
- wrmsr(MTRRfix4K_C0000_MSR + i, p[6 + i*2], p[7 + i*2]);
+ if (p[6 + i * 2] != lo || p[7 + i * 2] != hi) {
+ wrmsr (MTRRfix4K_C0000_MSR + i, p[6 + i * 2],
+ p[7 + i * 2]);
changed = TRUE;
}
}
return changed;
-} /* End Function set_fixed_ranges_testing */
+}
-struct mtrr_state
-{
+
+struct mtrr_state {
unsigned int num_var_ranges;
struct mtrr_var_range *var_ranges;
mtrr_type fixed_ranges[NUM_FIXED_RANGES];
/* Grab all of the MTRR state for this CPU into *state */
-static void __init get_mtrr_state(struct mtrr_state *state)
+static void __init get_mtrr_state (struct mtrr_state *state)
{
unsigned int nvrs, i;
struct mtrr_var_range *vrs;
unsigned long lo, dummy;
- nvrs = state->num_var_ranges = get_num_var_ranges();
+ nvrs = state->num_var_ranges = get_num_var_ranges ();
vrs = state->var_ranges
= kmalloc (nvrs * sizeof (struct mtrr_var_range), GFP_KERNEL);
if (vrs == NULL)
rdmsr (MTRRdefType_MSR, lo, dummy);
state->def_type = (lo & 0xff);
state->enabled = (lo & 0xc00) >> 10;
-} /* End Function get_mtrr_state */
+}
/* Free resources associated with a struct mtrr_state */
-static void __init finalize_mtrr_state(struct mtrr_state *state)
+static void __init finalize_mtrr_state (struct mtrr_state *state)
{
- if (state->var_ranges) kfree (state->var_ranges);
-} /* End Function finalize_mtrr_state */
+ if (state->var_ranges)
+ kfree (state->var_ranges);
+}
static unsigned long __init set_mtrr_state (struct mtrr_state *state,
unsigned long change_mask = 0;
for (i = 0; i < state->num_var_ranges; i++)
- if ( set_mtrr_var_range_testing (i, &state->var_ranges[i]) )
+ if (set_mtrr_var_range_testing (i, &state->var_ranges[i]))
change_mask |= MTRR_CHANGE_MASK_VARIABLE;
- if ( set_fixed_ranges_testing(state->fixed_ranges) )
+ if (set_fixed_ranges_testing (state->fixed_ranges))
change_mask |= MTRR_CHANGE_MASK_FIXED;
/* Set_mtrr_restore restores the old value of MTRRdefType,
so to set it we fiddle with the saved value */
- if ( (ctxt->deftype_lo & 0xff) != state->def_type
- || ( (ctxt->deftype_lo & 0xc00) >> 10 ) != state->enabled)
- {
+ if ((ctxt->deftype_lo & 0xff) != state->def_type
+ || ((ctxt->deftype_lo & 0xc00) >> 10) != state->enabled) {
ctxt->deftype_lo |= (state->def_type | state->enabled << 10);
change_mask |= MTRR_CHANGE_MASK_DEFTYPE;
}
return change_mask;
-} /* End Function set_mtrr_state */
+}
static atomic_t undone_count;
-static volatile int wait_barrier_cache_disable = FALSE;
static volatile int wait_barrier_execute = FALSE;
static volatile int wait_barrier_cache_enable = FALSE;
-struct set_mtrr_data
-{
+struct set_mtrr_data {
unsigned long smp_base;
unsigned long smp_size;
unsigned int smp_reg;
{
struct set_mtrr_data *data = info;
struct set_mtrr_context ctxt;
- set_mtrr_prepare_save (&ctxt);
- /* Notify master that I've flushed and disabled my cache */
- atomic_dec (&undone_count);
- while (wait_barrier_cache_disable) { rep_nop(); barrier(); }
- set_mtrr_cache_disable (&ctxt);
+
+ set_mtrr_prepare (&ctxt);
/* Notify master that I've flushed and disabled my cache */
atomic_dec (&undone_count);
- while (wait_barrier_execute) { rep_nop(); barrier(); }
+ while (wait_barrier_execute)
+ barrier ();
+
/* The master has cleared me to execute */
(*set_mtrr_up) (data->smp_reg, data->smp_base, data->smp_size,
data->smp_type, FALSE);
+
/* Notify master CPU that I've executed the function */
atomic_dec (&undone_count);
+
/* Wait for master to clear me to enable cache and return */
- while (wait_barrier_cache_enable) { rep_nop(); barrier(); }
+ while (wait_barrier_cache_enable)
+ barrier ();
set_mtrr_done (&ctxt);
-} /* End Function ipi_handler */
+}
+
static void set_mtrr_smp (unsigned int reg, unsigned long base,
unsigned long size, mtrr_type type)
data.smp_base = base;
data.smp_size = size;
data.smp_type = type;
- wait_barrier_cache_disable = TRUE;
wait_barrier_execute = TRUE;
wait_barrier_cache_enable = TRUE;
atomic_set (&undone_count, smp_num_cpus - 1);
+
/* Start the ball rolling on other CPUs */
if (smp_call_function (ipi_handler, &data, 1, 0) != 0)
panic ("mtrr: timed out waiting for other CPUs\n");
+
/* Flush and disable the local CPU's cache */
- set_mtrr_prepare_save (&ctxt);
- /* Wait for all other CPUs to flush and disable their caches */
- while (atomic_read (&undone_count) > 0) { rep_nop(); barrier(); }
- /* Set up for completion wait and then release other CPUs to change MTRRs*/
- atomic_set (&undone_count, smp_num_cpus - 1);
- wait_barrier_cache_disable = FALSE;
- set_mtrr_cache_disable (&ctxt);
+ set_mtrr_prepare (&ctxt);
/* Wait for all other CPUs to flush and disable their caches */
- while (atomic_read (&undone_count) > 0) { rep_nop(); barrier(); }
- /* Set up for completion wait and then release other CPUs to change MTRRs*/
+ while (atomic_read (&undone_count) > 0)
+ barrier ();
+
+ /* Set up for completion wait and then release other CPUs to change MTRRs */
atomic_set (&undone_count, smp_num_cpus - 1);
wait_barrier_execute = FALSE;
(*set_mtrr_up) (reg, base, size, type, FALSE);
+
/* Now wait for other CPUs to complete the function */
- while (atomic_read (&undone_count) > 0) { rep_nop(); barrier(); }
+ while (atomic_read (&undone_count) > 0)
+ barrier ();
+
/* Now all CPUs should have finished the function. Release the barrier to
allow them to re-enable their caches and return from their interrupt,
then enable the local cache and return */
wait_barrier_cache_enable = FALSE;
set_mtrr_done (&ctxt);
-} /* End Function set_mtrr_smp */
+}
/* Some BIOSes are broken and don't set all MTRRs the same! */
-static void __init mtrr_state_warn(unsigned long mask)
+static void __init mtrr_state_warn (unsigned long mask)
{
- if (!mask) return;
+ if (!mask)
+ return;
if (mask & MTRR_CHANGE_MASK_FIXED)
printk ("mtrr: your CPUs had inconsistent fixed MTRR settings\n");
if (mask & MTRR_CHANGE_MASK_VARIABLE)
if (mask & MTRR_CHANGE_MASK_DEFTYPE)
printk ("mtrr: your CPUs had inconsistent MTRRdefType settings\n");
printk ("mtrr: probably your BIOS does not setup all CPUs\n");
-} /* End Function mtrr_state_warn */
+}
#endif /* CONFIG_SMP */
-static char *attrib_to_str (int x)
+
+static inline char *attrib_to_str (int x)

{
return (x <= 6) ? mtrr_strings[x] : "?";
-} /* End Function attrib_to_str */
+}
+
-static void init_table (void)
+static void __init init_table (void)
{
int i, max;
max = get_num_var_ranges ();
- if ( ( usage_table = kmalloc (max * sizeof *usage_table, GFP_KERNEL) )
- == NULL )
- {
+ if ((usage_table = kmalloc (max * sizeof *usage_table, GFP_KERNEL)) == NULL) {
printk ("mtrr: could not allocate\n");
return;
}
- for (i = 0; i < max; i++) usage_table[i] = 1;
+
+ for (i = 0; i < max; i++)
+ usage_table[i] = 1;
+
#ifdef USERSPACE_INTERFACE
- if ( ( ascii_buffer = kmalloc (max * LINE_SIZE, GFP_KERNEL) ) == NULL )
- {
+ if ((ascii_buffer = kmalloc (max * LINE_SIZE, GFP_KERNEL)) == NULL) {
printk ("mtrr: could not allocate\n");
return;
}
ascii_buf_bytes = 0;
compute_ascii ();
#endif
-} /* End Function init_table */
-
-static int generic_get_free_region (unsigned long base, unsigned long size)
-/* [SUMMARY] Get a free MTRR.
- <base> The starting (base) address of the region.
- <size> The size (in bytes) of the region.
- [RETURNS] The index of the region on success, else -1 on error.
-*/
-{
- int i, max;
- mtrr_type ltype;
- unsigned long lbase, lsize;
+}
- max = get_num_var_ranges ();
- for (i = 0; i < max; ++i)
- {
- (*get_mtrr) (i, &lbase, &lsize, &ltype);
- if (lsize == 0) return i;
- }
- return -ENOSPC;
-} /* End Function generic_get_free_region */
-static int centaur_get_free_region (unsigned long base, unsigned long size)
+static int generic_get_free_region (unsigned long base,
+ unsigned long size)
/* [SUMMARY] Get a free MTRR.
<base> The starting (base) address of the region.
<size> The size (in bytes) of the region.
unsigned long lbase, lsize;
max = get_num_var_ranges ();
- for (i = 0; i < max; ++i)
- {
- if(centaur_mcr_reserved & (1<<i))
- continue;
+ for (i = 0; i < max; ++i) {
(*get_mtrr) (i, &lbase, &lsize, &ltype);
- if (lsize == 0) return i;
+ if (lsize == 0)
+ return i;
}
return -ENOSPC;
-} /* End Function generic_get_free_region */
-
-static int cyrix_get_free_region (unsigned long base, unsigned long size)
-/* [SUMMARY] Get a free ARR.
- <base> The starting (base) address of the region.
- <size> The size (in bytes) of the region.
- [RETURNS] The index of the region on success, else -1 on error.
-*/
-{
- int i;
- mtrr_type ltype;
- unsigned long lbase, lsize;
+}
- /* If we are to set up a region >32M then look at ARR7 immediately */
- if (size > 0x2000)
- {
- cyrix_get_arr (7, &lbase, &lsize, &ltype);
- if (lsize == 0) return 7;
- /* Else try ARR0-ARR6 first */
- }
- else
- {
- for (i = 0; i < 7; i++)
- {
- cyrix_get_arr (i, &lbase, &lsize, &ltype);
- if ((i == 3) && arr3_protected) continue;
- if (lsize == 0) return i;
- }
- /* ARR0-ARR6 isn't free, try ARR7 but its size must be at least 256K */
- cyrix_get_arr (i, &lbase, &lsize, &ltype);
- if ((lsize == 0) && (size >= 0x40)) return i;
- }
- return -ENOSPC;
-} /* End Function cyrix_get_free_region */
static int (*get_free_region) (unsigned long base,
unsigned long size) = generic_get_free_region;
* failures and do not wish system log messages to be sent.
*/
-int mtrr_add_page(unsigned long base, unsigned long size, unsigned int type, char increment)
+int mtrr_add_page (unsigned long base, unsigned long size,
+ unsigned int type, char increment)
{
/* [SUMMARY] Add an MTRR entry.
<base> The starting (base, in pages) address of the region.
mtrr_type ltype;
unsigned long lbase, lsize, last;
- switch ( mtrr_if )
- {
- case MTRR_IF_NONE:
- return -ENXIO; /* No MTRRs whatsoever */
-
- case MTRR_IF_AMD_K6:
- /* Apply the K6 block alignment and size rules
- In order
- o Uncached or gathering only
- o 128K or bigger block
- o Power of 2 block
- o base suitably aligned to the power
- */
- if ( type > MTRR_TYPE_WRCOMB || size < (1 << (17-PAGE_SHIFT)) ||
- (size & ~(size-1))-size || ( base & (size-1) ) )
- return -EINVAL;
- break;
-
- case MTRR_IF_INTEL:
- /* For Intel PPro stepping <= 7, must be 4 MiB aligned
- and not touch 0x70000000->0x7003FFFF */
- if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL &&
- boot_cpu_data.x86 == 6 &&
- boot_cpu_data.x86_model == 1 &&
- boot_cpu_data.x86_mask <= 7 )
- {
- if ( base & ((1 << (22-PAGE_SHIFT))-1) )
- {
- printk (KERN_WARNING "mtrr: base(0x%lx000) is not 4 MiB aligned\n", base);
- return -EINVAL;
- }
- if (!(base + size < 0x70000000 || base > 0x7003FFFF) &&
- (type == MTRR_TYPE_WRCOMB || type == MTRR_TYPE_WRBACK))
- {
- printk (KERN_WARNING "mtrr: writable mtrr between 0x70000000 and 0x7003FFFF may hang the CPU.\n");
- return -EINVAL;
- }
- }
- /* Fall through */
-
- case MTRR_IF_CYRIX_ARR:
- case MTRR_IF_CENTAUR_MCR:
- if ( mtrr_if == MTRR_IF_CENTAUR_MCR )
- {
- /*
- * FIXME: Winchip2 supports uncached
- */
- if (type != MTRR_TYPE_WRCOMB && (centaur_mcr_type == 0 || type != MTRR_TYPE_UNCACHABLE))
- {
- printk (KERN_WARNING "mtrr: only write-combining%s supported\n",
- centaur_mcr_type?" and uncacheable are":" is");
- return -EINVAL;
- }
- }
- else if (base + size < 0x100)
- {
- printk (KERN_WARNING "mtrr: cannot set region below 1 MiB (0x%lx000,0x%lx000)\n",
+ if (base + size < 0x100) {
+ printk (KERN_WARNING
+ "mtrr: cannot set region below 1 MiB (0x%lx000,0x%lx000)\n",
base, size);
return -EINVAL;
}
+
/* Check upper bits of base and last are equal and lower bits are 0
for base and 1 for last */
last = base + size - 1;
for (lbase = base; !(lbase & 1) && (last & 1);
- lbase = lbase >> 1, last = last >> 1);
- if (lbase != last)
- {
- printk (KERN_WARNING "mtrr: base(0x%lx000) is not aligned on a size(0x%lx000) boundary\n",
- base, size);
- return -EINVAL;
- }
- break;
+ lbase = lbase >> 1, last = last >> 1) ;
- default:
+ if (lbase != last) {
+ printk (KERN_WARNING
+ "mtrr: base(0x%lx000) is not aligned on a size(0x%lx000) boundary\n",
+ base, size);
return -EINVAL;
}
- if (type >= MTRR_NUM_TYPES)
- {
+ if (type >= MTRR_NUM_TYPES) {
printk ("mtrr: type: %u illegal\n", type);
return -EINVAL;
}
/* If the type is WC, check that this processor supports it */
- if ( (type == MTRR_TYPE_WRCOMB) && !have_wrcomb () )
- {
- printk (KERN_WARNING "mtrr: your processor doesn't support write-combining\n");
+ if ((type == MTRR_TYPE_WRCOMB) && !have_wrcomb ()) {
+ printk (KERN_WARNING
+ "mtrr: your processor doesn't support write-combining\n");
return -ENOSYS;
}
- if ( base & size_or_mask || size & size_or_mask )
- {
+ if (base & size_or_mask || size & size_or_mask) {
printk ("mtrr: base or size exceeds the MTRR width\n");
return -EINVAL;
}
increment = increment ? 1 : 0;
max = get_num_var_ranges ();
/* Search for existing MTRR */
- down(&main_lock);
- for (i = 0; i < max; ++i)
- {
+ down (&main_lock);
+ for (i = 0; i < max; ++i) {
(*get_mtrr) (i, &lbase, &lsize, &ltype);
- if (base >= lbase + lsize) continue;
- if ( (base < lbase) && (base + size <= lbase) ) continue;
+ if (base >= lbase + lsize)
+ continue;
+ if ((base < lbase) && (base + size <= lbase))
+ continue;
+
/* At this point we know there is some kind of overlap/enclosure */
- if ( (base < lbase) || (base + size > lbase + lsize) )
- {
- up(&main_lock);
- printk (KERN_WARNING "mtrr: 0x%lx000,0x%lx000 overlaps existing"
- " 0x%lx000,0x%lx000\n",
- base, size, lbase, lsize);
+ if ((base < lbase) || (base + size > lbase + lsize)) {
+ up (&main_lock);
+ printk (KERN_WARNING
+ "mtrr: 0x%lx000,0x%lx000 overlaps existing"
+ " 0x%lx000,0x%lx000\n", base, size, lbase,
+ lsize);
return -EINVAL;
}
/* New region is enclosed by an existing region */
- if (ltype != type)
- {
- if (type == MTRR_TYPE_UNCACHABLE) continue;
- up(&main_lock);
- printk ( "mtrr: type mismatch for %lx000,%lx000 old: %s new: %s\n",
- base, size, attrib_to_str (ltype), attrib_to_str (type) );
+ if (ltype != type) {
+ if (type == MTRR_TYPE_UNCACHABLE)
+ continue;
+ up (&main_lock);
+ printk ("mtrr: type mismatch for %lx000,%lx000 old: %s new: %s\n",
+ base, size, attrib_to_str (ltype), attrib_to_str (type));
return -EINVAL;
}
- if (increment) ++usage_table[i];
+ if (increment)
+ ++usage_table[i];
compute_ascii ();
- up(&main_lock);
+ up (&main_lock);
return i;
}
/* Search for an empty MTRR */
i = (*get_free_region) (base, size);
- if (i < 0)
- {
- up(&main_lock);
+ if (i < 0) {
+ up (&main_lock);
printk ("mtrr: no more MTRRs available\n");
return i;
}
set_mtrr (i, base, size, type);
usage_table[i] = 1;
compute_ascii ();
- up(&main_lock);
+ up (&main_lock);
return i;
-} /* End Function mtrr_add_page */
+}
+
/**
* mtrr_add - Add a memory type region
* failures and do not wish system log messages to be sent.
*/
-int mtrr_add(unsigned long base, unsigned long size, unsigned int type, char increment)
+int mtrr_add (unsigned long base, unsigned long size, unsigned int type,
+ char increment)
{
/* [SUMMARY] Add an MTRR entry.
<base> The starting (base) address of the region.
the error code.
*/
- if ( (base & (PAGE_SIZE - 1)) || (size & (PAGE_SIZE - 1)) )
- {
+ if ((base & (PAGE_SIZE - 1)) || (size & (PAGE_SIZE - 1))) {
printk ("mtrr: size and base must be multiples of 4 kiB\n");
printk ("mtrr: size: 0x%lx base: 0x%lx\n", size, base);
return -EINVAL;
}
- return mtrr_add_page(base >> PAGE_SHIFT, size >> PAGE_SHIFT, type, increment);
-} /* End Function mtrr_add */
+ return mtrr_add_page (base >> PAGE_SHIFT, size >> PAGE_SHIFT, type,
+ increment);
+}
+
/**
* mtrr_del_page - delete a memory type region
mtrr_type ltype;
unsigned long lbase, lsize;
- if ( mtrr_if == MTRR_IF_NONE ) return -ENXIO;
-
max = get_num_var_ranges ();
down (&main_lock);
- if (reg < 0)
- {
+ if (reg < 0) {
/* Search for existing MTRR */
- for (i = 0; i < max; ++i)
- {
+ for (i = 0; i < max; ++i) {
(*get_mtrr) (i, &lbase, &lsize, &ltype);
- if (lbase == base && lsize == size)
- {
+ if (lbase == base && lsize == size) {
reg = i;
break;
}
}
- if (reg < 0)
- {
- up(&main_lock);
- printk ("mtrr: no MTRR for %lx000,%lx000 found\n", base, size);
+ if (reg < 0) {
+ up (&main_lock);
+ printk ("mtrr: no MTRR for %lx000,%lx000 found\n", base,
+ size);
return -EINVAL;
}
}
- if (reg >= max)
- {
+
+ if (reg >= max) {
up (&main_lock);
printk ("mtrr: register: %d too big\n", reg);
return -EINVAL;
}
- if ( mtrr_if == MTRR_IF_CYRIX_ARR )
- {
- if ( (reg == 3) && arr3_protected )
- {
- up (&main_lock);
- printk ("mtrr: ARR3 cannot be changed\n");
- return -EINVAL;
- }
- }
(*get_mtrr) (reg, &lbase, &lsize, &ltype);
- if (lsize < 1)
- {
+
+ if (lsize < 1) {
up (&main_lock);
printk ("mtrr: MTRR %d not used\n", reg);
return -EINVAL;
}
- if (usage_table[reg] < 1)
- {
+
+ if (usage_table[reg] < 1) {
up (&main_lock);
printk ("mtrr: reg: %d has count=0\n", reg);
return -EINVAL;
}
- if (--usage_table[reg] < 1) set_mtrr (reg, 0, 0, 0);
+
+ if (--usage_table[reg] < 1)
+ set_mtrr (reg, 0, 0, 0);
compute_ascii ();
up (&main_lock);
return reg;
-} /* End Function mtrr_del_page */
+}
+
/**
* mtrr_del - delete a memory type region
the error code.
*/
{
- if ( (base & (PAGE_SIZE - 1)) || (size & (PAGE_SIZE - 1)) )
- {
+ if ((base & (PAGE_SIZE - 1)) || (size & (PAGE_SIZE - 1))) {
printk ("mtrr: size and base must be multiples of 4 kiB\n");
printk ("mtrr: size: 0x%lx base: 0x%lx\n", size, base);
return -EINVAL;
}
- return mtrr_del_page(reg, base >> PAGE_SHIFT, size >> PAGE_SHIFT);
+ return mtrr_del_page (reg, base >> PAGE_SHIFT, size >> PAGE_SHIFT);
}
+
#ifdef USERSPACE_INTERFACE
static int mtrr_file_add (unsigned long base, unsigned long size,
unsigned int *fcount = file->private_data;
max = get_num_var_ranges ();
- if (fcount == NULL)
- {
- if ( ( fcount = kmalloc (max * sizeof *fcount, GFP_KERNEL) ) == NULL )
- {
+ if (fcount == NULL) {
+ if ((fcount = kmalloc (max * sizeof *fcount, GFP_KERNEL)) == NULL) {
printk ("mtrr: could not allocate\n");
return -ENOMEM;
}
memset (fcount, 0, max * sizeof *fcount);
file->private_data = fcount;
}
+
if (!page) {
- if ( (base & (PAGE_SIZE - 1)) || (size & (PAGE_SIZE - 1)) )
- {
- printk ("mtrr: size and base must be multiples of 4 kiB\n");
+ if ((base & (PAGE_SIZE - 1)) || (size & (PAGE_SIZE - 1))) {
+ printk ("mtrr: size and base must be multiples of 4 kiB\n");
printk ("mtrr: size: 0x%lx base: 0x%lx\n", size, base);
return -EINVAL;
}
base >>= PAGE_SHIFT;
size >>= PAGE_SHIFT;
}
+
reg = mtrr_add_page (base, size, type, 1);
- if (reg >= 0) ++fcount[reg];
+
+ if (reg >= 0)
+ ++fcount[reg];
return reg;
-} /* End Function mtrr_file_add */
+}
+
static int mtrr_file_del (unsigned long base, unsigned long size,
struct file *file, int page)
unsigned int *fcount = file->private_data;
if (!page) {
- if ( (base & (PAGE_SIZE - 1)) || (size & (PAGE_SIZE - 1)) )
- {
- printk ("mtrr: size and base must be multiples of 4 kiB\n");
+ if ((base & (PAGE_SIZE - 1)) || (size & (PAGE_SIZE - 1))) {
+ printk ("mtrr: size and base must be multiples of 4 kiB\n");
printk ("mtrr: size: 0x%lx base: 0x%lx\n", size, base);
return -EINVAL;
}
size >>= PAGE_SHIFT;
}
reg = mtrr_del_page (-1, base, size);
- if (reg < 0) return reg;
- if (fcount == NULL) return reg;
- if (fcount[reg] < 1) return -EINVAL;
+ if (reg < 0)
+ return reg;
+ if (fcount == NULL)
+ return reg;
+ if (fcount[reg] < 1)
+ return -EINVAL;
--fcount[reg];
return reg;
-} /* End Function mtrr_file_del */
+}
+
static ssize_t mtrr_read (struct file *file, char *buf, size_t len,
- loff_t *ppos)
+ loff_t * ppos)
{
- if (*ppos >= ascii_buf_bytes) return 0;
- if (*ppos + len > ascii_buf_bytes) len = ascii_buf_bytes - *ppos;
- if ( copy_to_user (buf, ascii_buffer + *ppos, len) ) return -EFAULT;
+ if (*ppos >= ascii_buf_bytes)
+ return 0;
+
+ if (*ppos + len > ascii_buf_bytes)
+ len = ascii_buf_bytes - *ppos;
+
+ if (copy_to_user (buf, ascii_buffer + *ppos, len))
+ return -EFAULT;
+
*ppos += len;
return len;
-} /* End Function mtrr_read */
+}
-static ssize_t mtrr_write (struct file *file, const char *buf, size_t len,
- loff_t *ppos)
+
+static ssize_t mtrr_write (struct file *file, const char *buf,
+ size_t len, loff_t * ppos)
/* Format of control line:
"base=%Lx size=%Lx type=%s" OR:
"disable=%d"
char *ptr;
char line[LINE_SIZE];
- if ( !suser () ) return -EPERM;
+ if (!suser ())
+ return -EPERM;
+
/* Can't seek (pwrite) on this device */
- if (ppos != &file->f_pos) return -ESPIPE;
+ if (ppos != &file->f_pos)
+ return -ESPIPE;
memset (line, 0, LINE_SIZE);
- if (len > LINE_SIZE) len = LINE_SIZE;
- if ( copy_from_user (line, buf, len - 1) ) return -EFAULT;
+
+ if (len > LINE_SIZE)
+ len = LINE_SIZE;
+
+ if (copy_from_user (line, buf, len - 1))
+ return -EFAULT;
ptr = line + strlen (line) - 1;
- if (*ptr == '\n') *ptr = '\0';
- if ( !strncmp (line, "disable=", 8) )
- {
+
+ if (*ptr == '\n')
+ *ptr = '\0';
+
+ if (!strncmp (line, "disable=", 8)) {
reg = simple_strtoul (line + 8, &ptr, 0);
err = mtrr_del_page (reg, 0, 0);
- if (err < 0) return err;
+ if (err < 0)
+ return err;
return len;
}
- if ( strncmp (line, "base=", 5) )
- {
+
+ if (strncmp (line, "base=", 5)) {
printk ("mtrr: no \"base=\" in line: \"%s\"\n", line);
return -EINVAL;
}
+
base = simple_strtoull (line + 5, &ptr, 0);
- for (; isspace (*ptr); ++ptr);
- if ( strncmp (ptr, "size=", 5) )
- {
+
+ for (; isspace (*ptr); ++ptr) ;
+
+ if (strncmp (ptr, "size=", 5)) {
printk ("mtrr: no \"size=\" in line: \"%s\"\n", line);
return -EINVAL;
}
+
size = simple_strtoull (ptr + 5, &ptr, 0);
- if ( (base & 0xfff) || (size & 0xfff) )
- {
+
+ if ((base & 0xfff) || (size & 0xfff)) {
printk ("mtrr: size and base must be multiples of 4 kiB\n");
printk ("mtrr: size: 0x%Lx base: 0x%Lx\n", size, base);
return -EINVAL;
}
- for (; isspace (*ptr); ++ptr);
- if ( strncmp (ptr, "type=", 5) )
- {
+
+ for (; isspace (*ptr); ++ptr) ;
+
+ if (strncmp (ptr, "type=", 5)) {
printk ("mtrr: no \"type=\" in line: \"%s\"\n", line);
return -EINVAL;
}
ptr += 5;
- for (; isspace (*ptr); ++ptr);
- for (i = 0; i < MTRR_NUM_TYPES; ++i)
- {
- if ( strcmp (ptr, mtrr_strings[i]) ) continue;
+
+ for (; isspace (*ptr); ++ptr) ;
+
+ for (i = 0; i < MTRR_NUM_TYPES; ++i) {
+ if (strcmp (ptr, mtrr_strings[i]))
+ continue;
base >>= PAGE_SHIFT;
size >>= PAGE_SHIFT;
- err = mtrr_add_page ((unsigned long)base, (unsigned long)size, i, 1);
- if (err < 0) return err;
+ err = mtrr_add_page ((unsigned long) base,
+ (unsigned long) size, i, 1);
+ if (err < 0)
+ return err;
return len;
}
printk ("mtrr: illegal type: \"%s\"\n", ptr);
return -EINVAL;
-} /* End Function mtrr_write */
+}
+
static int mtrr_ioctl (struct inode *inode, struct file *file,
unsigned int cmd, unsigned long arg)
struct mtrr_sentry sentry;
struct mtrr_gentry gentry;
- switch (cmd)
- {
+ switch (cmd) {
default:
return -ENOIOCTLCMD;
+
case MTRRIOC_ADD_ENTRY:
- if ( !suser () ) return -EPERM;
- if ( copy_from_user (&sentry, (void *) arg, sizeof sentry) )
+ if (!suser ())
+ return -EPERM;
+ if (copy_from_user (&sentry, (void *) arg, sizeof sentry))
return -EFAULT;
- err = mtrr_file_add (sentry.base, sentry.size, sentry.type, 1, file, 0);
- if (err < 0) return err;
+ err = mtrr_file_add (sentry.base, sentry.size, sentry.type,
+ 1, file, 0);
+ if (err < 0)
+ return err;
break;
+
case MTRRIOC_SET_ENTRY:
- if ( !suser () ) return -EPERM;
- if ( copy_from_user (&sentry, (void *) arg, sizeof sentry) )
+ if (!suser ())
+ return -EPERM;
+ if (copy_from_user (&sentry, (void *) arg, sizeof sentry))
return -EFAULT;
err = mtrr_add (sentry.base, sentry.size, sentry.type, 0);
- if (err < 0) return err;
+ if (err < 0)
+ return err;
break;
+
case MTRRIOC_DEL_ENTRY:
- if ( !suser () ) return -EPERM;
- if ( copy_from_user (&sentry, (void *) arg, sizeof sentry) )
+ if (!suser ())
+ return -EPERM;
+ if (copy_from_user (&sentry, (void *) arg, sizeof sentry))
return -EFAULT;
err = mtrr_file_del (sentry.base, sentry.size, file, 0);
- if (err < 0) return err;
+ if (err < 0)
+ return err;
break;
+
case MTRRIOC_KILL_ENTRY:
- if ( !suser () ) return -EPERM;
- if ( copy_from_user (&sentry, (void *) arg, sizeof sentry) )
+ if (!suser ())
+ return -EPERM;
+ if (copy_from_user (&sentry, (void *) arg, sizeof sentry))
return -EFAULT;
err = mtrr_del (-1, sentry.base, sentry.size);
- if (err < 0) return err;
+ if (err < 0)
+ return err;
break;
+
case MTRRIOC_GET_ENTRY:
- if ( copy_from_user (&gentry, (void *) arg, sizeof gentry) )
+ if (copy_from_user (&gentry, (void *) arg, sizeof gentry))
return -EFAULT;
- if ( gentry.regnum >= get_num_var_ranges () ) return -EINVAL;
+ if (gentry.regnum >= get_num_var_ranges ())
+ return -EINVAL;
(*get_mtrr) (gentry.regnum, &gentry.base, &gentry.size, &type);
/* Hide entries that go above 4GB */
- if (gentry.base + gentry.size > 0x100000 || gentry.size == 0x100000)
+ if (gentry.base + gentry.size > 0x100000
+ || gentry.size == 0x100000)
gentry.base = gentry.size = gentry.type = 0;
else {
gentry.base <<= PAGE_SHIFT;
gentry.type = type;
}
- if ( copy_to_user ( (void *) arg, &gentry, sizeof gentry) )
+ if (copy_to_user ((void *) arg, &gentry, sizeof gentry))
return -EFAULT;
break;
+
case MTRRIOC_ADD_PAGE_ENTRY:
- if ( !suser () ) return -EPERM;
- if ( copy_from_user (&sentry, (void *) arg, sizeof sentry) )
+ if (!suser ())
+ return -EPERM;
+ if (copy_from_user (&sentry, (void *) arg, sizeof sentry))
return -EFAULT;
- err = mtrr_file_add (sentry.base, sentry.size, sentry.type, 1, file, 1);
- if (err < 0) return err;
+ err = mtrr_file_add (sentry.base, sentry.size, sentry.type,
+ 1, file, 1);
+ if (err < 0)
+ return err;
break;
+
case MTRRIOC_SET_PAGE_ENTRY:
- if ( !suser () ) return -EPERM;
- if ( copy_from_user (&sentry, (void *) arg, sizeof sentry) )
+ if (!suser ())
+ return -EPERM;
+ if (copy_from_user (&sentry, (void *) arg, sizeof sentry))
return -EFAULT;
err = mtrr_add_page (sentry.base, sentry.size, sentry.type, 0);
- if (err < 0) return err;
+ if (err < 0)
+ return err;
break;
+
case MTRRIOC_DEL_PAGE_ENTRY:
- if ( !suser () ) return -EPERM;
- if ( copy_from_user (&sentry, (void *) arg, sizeof sentry) )
+ if (!suser ())
+ return -EPERM;
+ if (copy_from_user (&sentry, (void *) arg, sizeof sentry))
return -EFAULT;
err = mtrr_file_del (sentry.base, sentry.size, file, 1);
- if (err < 0) return err;
+ if (err < 0)
+ return err;
break;
+
case MTRRIOC_KILL_PAGE_ENTRY:
- if ( !suser () ) return -EPERM;
- if ( copy_from_user (&sentry, (void *) arg, sizeof sentry) )
+ if (!suser ())
+ return -EPERM;
+ if (copy_from_user (&sentry, (void *) arg, sizeof sentry))
return -EFAULT;
err = mtrr_del_page (-1, sentry.base, sentry.size);
- if (err < 0) return err;
+ if (err < 0)
+ return err;
break;
+
case MTRRIOC_GET_PAGE_ENTRY:
- if ( copy_from_user (&gentry, (void *) arg, sizeof gentry) )
+ if (copy_from_user (&gentry, (void *) arg, sizeof gentry))
return -EFAULT;
- if ( gentry.regnum >= get_num_var_ranges () ) return -EINVAL;
+ if (gentry.regnum >= get_num_var_ranges ())
+ return -EINVAL;
(*get_mtrr) (gentry.regnum, &gentry.base, &gentry.size, &type);
gentry.type = type;
- if ( copy_to_user ( (void *) arg, &gentry, sizeof gentry) )
+ if (copy_to_user ((void *) arg, &gentry, sizeof gentry))
return -EFAULT;
break;
}
return 0;
-} /* End Function mtrr_ioctl */
+}
+
static int mtrr_close (struct inode *ino, struct file *file)
{
int i, max;
unsigned int *fcount = file->private_data;
- if (fcount == NULL) return 0;
+ if (fcount == NULL)
+ return 0;
+
+ lock_kernel ();
max = get_num_var_ranges ();
- for (i = 0; i < max; ++i)
- {
- while (fcount[i] > 0)
- {
- if (mtrr_del (i, 0, 0) < 0) printk ("mtrr: reg %d not used\n", i);
+ for (i = 0; i < max; ++i) {
+ while (fcount[i] > 0) {
+ if (mtrr_del (i, 0, 0) < 0)
+ printk ("mtrr: reg %d not used\n", i);
--fcount[i];
}
}
+ unlock_kernel ();
kfree (fcount);
file->private_data = NULL;
return 0;
-} /* End Function mtrr_close */
+}
-static struct file_operations mtrr_fops =
-{
+
+static struct file_operations mtrr_fops = {
owner: THIS_MODULE,
read: mtrr_read,
write: mtrr_write,
ioctl: mtrr_ioctl,
- release: mtrr_close,
+ release: mtrr_close,
};
-# ifdef CONFIG_PROC_FS
-
+#ifdef CONFIG_PROC_FS
static struct proc_dir_entry *proc_root_mtrr;
-
-# endif /* CONFIG_PROC_FS */
+#endif
static devfs_handle_t devfs_handle;
ascii_buf_bytes = 0;
max = get_num_var_ranges ();
- for (i = 0; i < max; i++)
- {
+ for (i = 0; i < max; i++) {
(*get_mtrr) (i, &base, &size, &type);
- if (size == 0) usage_table[i] = 0;
- else
- {
- if (size < (0x100000 >> PAGE_SHIFT))
- {
+ if (size == 0)
+ usage_table[i] = 0;
+ else {
+ if (size < (0x100000 >> PAGE_SHIFT)) {
/* less than 1MB */
factor = 'K';
size <<= PAGE_SHIFT - 10;
- }
- else
- {
+ } else {
factor = 'M';
size >>= 20 - PAGE_SHIFT;
}
"reg%02i: base=0x%05lx000 (%4liMB), size=%4li%cB: %s, count=%d\n",
i, base, base >> (20 - PAGE_SHIFT), size, factor,
attrib_to_str (type), usage_table[i]);
- ascii_buf_bytes += strlen (ascii_buffer + ascii_buf_bytes);
+ ascii_buf_bytes += strlen (ascii_buffer + ascii_buf_bytes);
}
}
devfs_set_file_size (devfs_handle, ascii_buf_bytes);
-# ifdef CONFIG_PROC_FS
+#ifdef CONFIG_PROC_FS
if (proc_root_mtrr)
proc_root_mtrr->size = ascii_buf_bytes;
-# endif /* CONFIG_PROC_FS */
-} /* End Function compute_ascii */
-
-#endif /* USERSPACE_INTERFACE */
-
-EXPORT_SYMBOL(mtrr_add);
-EXPORT_SYMBOL(mtrr_del);
-
-#ifdef CONFIG_SMP
-
-typedef struct
-{
- unsigned long base;
- unsigned long size;
- mtrr_type type;
-} arr_state_t;
-
-arr_state_t arr_state[8] __initdata =
-{
- {0UL,0UL,0UL}, {0UL,0UL,0UL}, {0UL,0UL,0UL}, {0UL,0UL,0UL},
- {0UL,0UL,0UL}, {0UL,0UL,0UL}, {0UL,0UL,0UL}, {0UL,0UL,0UL}
-};
-
-unsigned char ccr_state[7] __initdata = { 0, 0, 0, 0, 0, 0, 0 };
-
-static void __init cyrix_arr_init_secondary(void)
-{
- struct set_mtrr_context ctxt;
- int i;
-
- /* flush cache and enable MAPEN */
- set_mtrr_prepare_save (&ctxt);
- set_mtrr_cache_disable (&ctxt);
-
- /* the CCRs are not contiguous */
- for(i=0; i<4; i++) setCx86(CX86_CCR0 + i, ccr_state[i]);
- for( ; i<7; i++) setCx86(CX86_CCR4 + i, ccr_state[i]);
- for(i=0; i<8; i++)
- cyrix_set_arr_up(i,
- arr_state[i].base, arr_state[i].size, arr_state[i].type, FALSE);
-
- set_mtrr_done (&ctxt); /* flush cache and disable MAPEN */
-} /* End Function cyrix_arr_init_secondary */
-
-#endif
-
-/*
- * On Cyrix 6x86(MX) and M II the ARR3 is special: it has connection
- * with the SMM (System Management Mode) mode. So we need the following:
- * Check whether SMI_LOCK (CCR3 bit 0) is set
- * if it is set, write a warning message: ARR3 cannot be changed!
- * (it cannot be changed until the next processor reset)
- * if it is reset, then we can change it, set all the needed bits:
- * - disable access to SMM memory through ARR3 range (CCR1 bit 7 reset)
- * - disable access to SMM memory (CCR1 bit 2 reset)
- * - disable SMM mode (CCR1 bit 1 reset)
- * - disable write protection of ARR3 (CCR6 bit 1 reset)
- * - (maybe) disable ARR3
- * Just to be sure, we enable ARR usage by the processor (CCR5 bit 5 set)
- */
-static void __init cyrix_arr_init(void)
-{
- struct set_mtrr_context ctxt;
- unsigned char ccr[7];
- int ccrc[7] = { 0, 0, 0, 0, 0, 0, 0 };
-#ifdef CONFIG_SMP
- int i;
-#endif
-
- /* flush cache and enable MAPEN */
- set_mtrr_prepare_save (&ctxt);
- set_mtrr_cache_disable (&ctxt);
-
- /* Save all CCRs locally */
- ccr[0] = getCx86 (CX86_CCR0);
- ccr[1] = getCx86 (CX86_CCR1);
- ccr[2] = getCx86 (CX86_CCR2);
- ccr[3] = ctxt.ccr3;
- ccr[4] = getCx86 (CX86_CCR4);
- ccr[5] = getCx86 (CX86_CCR5);
- ccr[6] = getCx86 (CX86_CCR6);
-
- if (ccr[3] & 1)
- {
- ccrc[3] = 1;
- arr3_protected = 1;
- }
- else
- {
- /* Disable SMM mode (bit 1), access to SMM memory (bit 2) and
- * access to SMM memory through ARR3 (bit 7).
- */
- if (ccr[1] & 0x80) { ccr[1] &= 0x7f; ccrc[1] |= 0x80; }
- if (ccr[1] & 0x04) { ccr[1] &= 0xfb; ccrc[1] |= 0x04; }
- if (ccr[1] & 0x02) { ccr[1] &= 0xfd; ccrc[1] |= 0x02; }
- arr3_protected = 0;
- if (ccr[6] & 0x02) {
- ccr[6] &= 0xfd; ccrc[6] = 1; /* Disable write protection of ARR3 */
- setCx86 (CX86_CCR6, ccr[6]);
- }
- /* Disable ARR3. This is safe now that we disabled SMM. */
- /* cyrix_set_arr_up (3, 0, 0, 0, FALSE); */
- }
- /* If we changed CCR1 in memory, change it in the processor, too. */
- if (ccrc[1]) setCx86 (CX86_CCR1, ccr[1]);
-
- /* Enable ARR usage by the processor */
- if (!(ccr[5] & 0x20))
- {
- ccr[5] |= 0x20; ccrc[5] = 1;
- setCx86 (CX86_CCR5, ccr[5]);
- }
-
-#ifdef CONFIG_SMP
- for(i=0; i<7; i++) ccr_state[i] = ccr[i];
- for(i=0; i<8; i++)
- cyrix_get_arr(i,
- &arr_state[i].base, &arr_state[i].size, &arr_state[i].type);
#endif
+}
- set_mtrr_done (&ctxt); /* flush cache and disable MAPEN */
+#endif /* USERSPACE_INTERFACE */
- if ( ccrc[5] ) printk ("mtrr: ARR usage was not enabled, enabled manually\n");
- if ( ccrc[3] ) printk ("mtrr: ARR3 cannot be changed\n");
-/*
- if ( ccrc[1] & 0x80) printk ("mtrr: SMM memory access through ARR3 disabled\n");
- if ( ccrc[1] & 0x04) printk ("mtrr: SMM memory access disabled\n");
- if ( ccrc[1] & 0x02) printk ("mtrr: SMM mode disabled\n");
-*/
- if ( ccrc[6] ) printk ("mtrr: ARR3 was write protected, unprotected\n");
-} /* End Function cyrix_arr_init */
+EXPORT_SYMBOL (mtrr_add);
+EXPORT_SYMBOL (mtrr_del);
-/*
- * Initialise the later (saner) Winchip MCR variant. In this version
- * the BIOS can pass us the registers it has used (but not their values)
- * and the control register is read/write
- */
-static void __init centaur_mcr1_init(void)
+static void __init mtrr_setup (void)
{
- unsigned i;
- u32 lo, hi;
-
- /* Unfortunately, MCR's are read-only, so there is no way to
- * find out what the bios might have done.
- */
-
- rdmsr(0x120, lo, hi);
- if(((lo>>17)&7)==1) /* Type 1 Winchip2 MCR */
- {
- lo&= ~0x1C0; /* clear key */
- lo|= 0x040; /* set key to 1 */
- wrmsr(0x120, lo, hi); /* unlock MCR */
- }
-
- centaur_mcr_type = 1;
-
- /*
- * Clear any unconfigured MCR's.
- */
-
- for (i = 0; i < 8; ++i)
- {
- if(centaur_mcr[i]. high == 0 && centaur_mcr[i].low == 0)
- {
- if(!(lo & (1<<(9+i))))
- wrmsr (0x110 + i , 0, 0);
- else
- /*
- * If the BIOS set up an MCR we cannot see it
- * but we don't wish to obliterate it
- */
- centaur_mcr_reserved |= (1<<i);
- }
- }
- /*
- * Throw the main write-combining switch...
- * However if OOSTORE is enabled then people have already done far
- * cleverer things and we should behave.
- */
-
- lo |= 15; /* Write combine enables */
- wrmsr(0x120, lo, hi);
-} /* End Function centaur_mcr1_init */
-
-/*
- * Initialise the original winchip with read only MCR registers
- * no used bitmask for the BIOS to pass on and write only control
- */
-
-static void __init centaur_mcr0_init(void)
-{
- unsigned i;
-
- /* Unfortunately, MCR's are read-only, so there is no way to
- * find out what the bios might have done.
- */
-
- /* Clear any unconfigured MCR's.
- * This way we are sure that the centaur_mcr array contains the actual
- * values. The disadvantage is that any BIOS tweaks are thus undone.
- *
- */
- for (i = 0; i < 8; ++i)
- {
- if(centaur_mcr[i]. high == 0 && centaur_mcr[i].low == 0)
- wrmsr (0x110 + i , 0, 0);
- }
-
- wrmsr(0x120, 0x01F0001F, 0); /* Write only */
-} /* End Function centaur_mcr0_init */
-
-/*
- * Initialise Winchip series MCR registers
- */
-
-static void __init centaur_mcr_init(void)
-{
- struct set_mtrr_context ctxt;
-
- set_mtrr_prepare_save (&ctxt);
- set_mtrr_cache_disable (&ctxt);
-
- if(boot_cpu_data.x86_model==4)
- centaur_mcr0_init();
- else if(boot_cpu_data.x86_model==8 || boot_cpu_data.x86_model == 9)
- centaur_mcr1_init();
+ printk ("mtrr: v%s\n", MTRR_VERSION);
- set_mtrr_done (&ctxt);
-} /* End Function centaur_mcr_init */
-
-static int __init mtrr_setup(void)
-{
- if ( test_bit(X86_FEATURE_MTRR, &boot_cpu_data.x86_capability) ) {
- /* Intel (P6) standard MTRRs */
- mtrr_if = MTRR_IF_INTEL;
- get_mtrr = intel_get_mtrr;
- set_mtrr_up = intel_set_mtrr_up;
- switch (boot_cpu_data.x86_vendor) {
-
- case X86_VENDOR_AMD:
- /* The original Athlon docs said that
- total addressable memory is 44 bits wide.
- It was not really clear whether its MTRRs
- follow this or not. (Read: 44 or 36 bits).
- However, "x86-64_overview.pdf" explicitly
- states that "previous implementations support
- 36 bit MTRRs" and also provides a way to
- query the width (in bits) of the physical
- addressable memory on the Hammer family.
- */
- if (boot_cpu_data.x86 == 7 && (cpuid_eax(0x80000000) >= 0x80000008)) {
+ if (test_bit (X86_FEATURE_MTRR, &boot_cpu_data.x86_capability)) {
+ /* Query the width (in bits) of physically
+ addressable memory on the Hammer family. */
+ if (cpuid_eax (0x80000000) >= 0x80000008) {
u32 phys_addr;
- phys_addr = cpuid_eax(0x80000008) & 0xff ;
- size_or_mask = ~((1 << (phys_addr - PAGE_SHIFT)) - 1);
+ phys_addr = cpuid_eax (0x80000008) & 0xff;
+ size_or_mask =
+ ~((1 << (phys_addr - PAGE_SHIFT)) - 1);
size_and_mask = ~size_or_mask & 0xfff00000;
- break;
- }
- size_or_mask = 0xff000000; /* 36 bits */
- size_and_mask = 0x00f00000;
- break;
-
- case X86_VENDOR_CENTAUR:
- /* VIA Cyrix family have Intel style MTRRs, but don't support PAE */
- if (boot_cpu_data.x86 == 6) {
- size_or_mask = 0xfff00000; /* 32 bits */
- size_and_mask = 0;
- }
- break;
-
- default:
- /* Intel, etc. */
+ } else {
+ /* FIXME: This is to make it work on Athlon during debugging. */
size_or_mask = 0xff000000; /* 36 bits */
size_and_mask = 0x00f00000;
- break;
}
- } else if ( test_bit(X86_FEATURE_K6_MTRR, &boot_cpu_data.x86_capability) ) {
- /* Pre-Athlon (K6) AMD CPU MTRRs */
- mtrr_if = MTRR_IF_AMD_K6;
- get_mtrr = amd_get_mtrr;
- set_mtrr_up = amd_set_mtrr_up;
- size_or_mask = 0xfff00000; /* 32 bits */
- size_and_mask = 0;
- } else if ( test_bit(X86_FEATURE_CYRIX_ARR, &boot_cpu_data.x86_capability) ) {
- /* Cyrix ARRs */
- mtrr_if = MTRR_IF_CYRIX_ARR;
- get_mtrr = cyrix_get_arr;
- set_mtrr_up = cyrix_set_arr_up;
- get_free_region = cyrix_get_free_region;
- cyrix_arr_init();
- size_or_mask = 0xfff00000; /* 32 bits */
- size_and_mask = 0;
- } else if ( test_bit(X86_FEATURE_CENTAUR_MCR, &boot_cpu_data.x86_capability) ) {
- /* Centaur MCRs */
- mtrr_if = MTRR_IF_CENTAUR_MCR;
- get_mtrr = centaur_get_mcr;
- set_mtrr_up = centaur_set_mcr_up;
- get_free_region = centaur_get_free_region;
- centaur_mcr_init();
- size_or_mask = 0xfff00000; /* 32 bits */
- size_and_mask = 0;
- } else {
- /* No supported MTRR interface */
- mtrr_if = MTRR_IF_NONE;
+ printk ("mtrr: detected mtrr type: x86-64\n");
}
-
- printk ("mtrr: v%s Richard Gooch (rgooch@atnf.csiro.au)\n"
- "mtrr: detected mtrr type: %s\n",
- MTRR_VERSION, mtrr_if_name[mtrr_if]);
-
- return (mtrr_if != MTRR_IF_NONE);
-} /* End Function mtrr_setup */
+}
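The width computation in `mtrr_setup()` above can be sketched in isolation: CPUID leaf 0x80000008 reports the physical address width in the low byte of EAX, and the driver derives a mask covering the page-number bits beyond that width. A user-space sketch, assuming 4 KB pages (`PAGE_SHIFT` = 12); the helper name is illustrative:

```c
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12	/* 4 KB pages, as on x86-64 */

/* Sketch of the size_or_mask computation above: given the
 * physical address width in bits (CPUID 0x80000008, EAX[7:0]),
 * build the mask of page-number bits that lie beyond the
 * addressable range. Hypothetical helper name. */
static uint32_t demo_size_or_mask(unsigned phys_addr_bits)
{
	return ~((1u << (phys_addr_bits - DEMO_PAGE_SHIFT)) - 1);
}
```

For a 36-bit machine this yields 0xff000000, matching the hard-coded fallback constant in the code above; a 40-bit Hammer part would get 0xf0000000.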
#ifdef CONFIG_SMP
static volatile unsigned long smp_changes_mask __initdata = 0;
-static struct mtrr_state smp_mtrr_state __initdata = {0, 0};
+static struct mtrr_state smp_mtrr_state __initdata = { 0, 0 };
-void __init mtrr_init_boot_cpu(void)
+void __init mtrr_init_boot_cpu (void)
{
- if ( !mtrr_setup () )
- return;
-
- if ( mtrr_if == MTRR_IF_INTEL ) {
- /* Only for Intel MTRRs */
+ mtrr_setup();
get_mtrr_state (&smp_mtrr_state);
- }
-} /* End Function mtrr_init_boot_cpu */
+}
+
-static void __init intel_mtrr_init_secondary_cpu(void)
+void __init mtrr_init_secondary_cpu (void)
{
unsigned long mask, count;
struct set_mtrr_context ctxt;
/* Note that this is not ideal, since the cache is only flushed/disabled
for this CPU while the MTRRs are changed, but changing this requires
more invasive changes to the way the kernel boots */
- set_mtrr_prepare_save (&ctxt);
- set_mtrr_cache_disable (&ctxt);
+ set_mtrr_prepare (&ctxt);
mask = set_mtrr_state (&smp_mtrr_state, &ctxt);
set_mtrr_done (&ctxt);
+
/* Use the atomic bitops to update the global mask */
- for (count = 0; count < sizeof mask * 8; ++count)
- {
- if (mask & 0x01) set_bit (count, &smp_changes_mask);
+ for (count = 0; count < sizeof mask * 8; ++count) {
+ if (mask & 0x01)
+ set_bit (count, &smp_changes_mask);
mask >>= 1;
}
-} /* End Function intel_mtrr_init_secondary_cpu */
+}
-void __init mtrr_init_secondary_cpu(void)
-{
- switch ( mtrr_if ) {
- case MTRR_IF_INTEL:
- /* Intel (P6) standard MTRRs */
- intel_mtrr_init_secondary_cpu();
- break;
- case MTRR_IF_CYRIX_ARR:
- /* This is _completely theoretical_!
- * I assume here that one day Cyrix will support Intel APIC.
- * In reality on non-Intel CPUs we won't even get to this routine.
- * Hopefully no one will plug two Cyrix processors in a dual P5 board.
- * :-)
- */
- cyrix_arr_init_secondary ();
- break;
- default:
- /* I see no MTRRs I can support in SMP mode... */
- printk ("mtrr: SMP support incomplete for this vendor\n");
- }
-} /* End Function mtrr_init_secondary_cpu */
#endif /* CONFIG_SMP */
-int __init mtrr_init(void)
+
+int __init mtrr_init (void)
{
#ifdef CONFIG_SMP
/* mtrr_setup() should already have been called from mtrr_init_boot_cpu() */
- if ( mtrr_if == MTRR_IF_INTEL ) {
finalize_mtrr_state (&smp_mtrr_state);
mtrr_state_warn (smp_changes_mask);
- }
#else
- if ( !mtrr_setup() )
- return 0; /* MTRRs not supported? */
+ mtrr_setup();
#endif
#ifdef CONFIG_PROC_FS
proc_root_mtrr->proc_fops = &mtrr_fops;
}
#endif
+#ifdef CONFIG_DEVFS_FS
devfs_handle = devfs_register (NULL, "cpu/mtrr", DEVFS_FL_DEFAULT, 0, 0,
S_IFREG | S_IRUGO | S_IWUSR,
&mtrr_fops, NULL);
+#endif
init_table ();
return 0;
-} /* End Function mtrr_init */
-
-/*
- * Local Variables:
- * mode:c
- * c-file-style:"k&r"
- * c-basic-offset:4
- * End:
- */
+}
+
void *ret;
int gfp = GFP_ATOMIC;
- /* We need to always allocate below 4Gig. We probably need new
- GPF mask to say that */
gfp |= GFP_DMA;
ret = (void *)__get_free_pages(gfp, get_order(size));
#include <linux/slab.h>
#include <linux/interrupt.h>
#include <linux/irq.h>
-
#include <asm/io.h>
#include <asm/smp.h>
#include <asm/io_apic.h>
#define PIRQ_SIGNATURE (('$' << 0) + ('P' << 8) + ('I' << 16) + ('R' << 24))
#define PIRQ_VERSION 0x0100
+int pci_use_acpi_routing = 0;
+
static struct irq_routing_table *pirq_table;
/*
{ "VLSI 82C534", PCI_VENDOR_ID_VLSI, PCI_DEVICE_ID_VLSI_82C534, pirq_vlsi_get, pirq_vlsi_set },
{ "ServerWorks", PCI_VENDOR_ID_SERVERWORKS, PCI_DEVICE_ID_SERVERWORKS_OSB4,
pirq_serverworks_get, pirq_serverworks_set },
+ { "ServerWorks", PCI_VENDOR_ID_SERVERWORKS, PCI_DEVICE_ID_SERVERWORKS_CSB5,
+ pirq_serverworks_get, pirq_serverworks_set },
{ "AMD756 VIPER", PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_VIPER_740B,
pirq_amd756_get, pirq_amd756_set },
{
}
+#ifdef CONFIG_ACPI_PCI
+
+static int acpi_lookup_irq (
+ struct pci_dev *dev,
+ u8 pin,
+ int assign)
+{
+ int result = 0;
+ int irq = 0;
+
+ /* TBD: Select an IRQ from the possible set to improve routing performance. */
+
+ result = acpi_prt_get_irq(dev, pin, &irq);
+ if (!irq)
+ result = -ENODEV;
+ if (result) {
+ printk(KERN_WARNING "PCI: No IRQ known for interrupt pin %c of device %s\n",
+ 'A'+pin, dev->slot_name);
+ return result;
+ }
+
+ dev->irq = irq;
+
+ if (!assign) {
+ /* only check for the IRQ */
+ printk(KERN_INFO "PCI: Found IRQ %d for device %s\n", irq,
+ dev->slot_name);
+ return 1;
+ }
+
+ /* also assign an IRQ */
+ if (irq && (dev->class >> 8) != PCI_CLASS_DISPLAY_VGA) {
+ result = acpi_prt_set_irq(dev, pin, irq);
+ if (result) {
+ printk(KERN_WARNING "PCI: Could not assign IRQ %d to device %s\n", irq, dev->slot_name);
+ return result;
+ }
+
+ eisa_set_level_irq(irq);
+
+ printk(KERN_INFO "PCI: Assigned IRQ %d for device %s\n", irq, dev->slot_name);
+ }
+
+ return 1;
+}
+
+#endif /* CONFIG_ACPI_PCI */
+
static int pcibios_lookup_irq(struct pci_dev *dev, int assign)
{
u8 pin;
struct pci_dev *dev2;
char *msg = NULL;
- if (!pirq_table)
- return 0;
-
- /* Find IRQ routing entry */
+ /* Find IRQ pin */
pci_read_config_byte(dev, PCI_INTERRUPT_PIN, &pin);
if (!pin) {
DBG(" -> no interrupt pin\n");
}
pin = pin - 1;
+#ifdef CONFIG_ACPI_PCI
+ /* Use ACPI to lookup IRQ */
+ if (pci_use_acpi_routing)
+ return acpi_lookup_irq(dev, pin, assign);
+#endif
+
+ /* Find IRQ routing entry */
+
+ if (!pirq_table)
+ return 0;
+
DBG("IRQ for %s:%d", dev->slot_name, pin);
info = pirq_get_info(dev);
if (!info) {
* reported by the device if possible.
*/
newirq = dev->irq;
+ if (!((1 << newirq) & mask)) {
+ if (pci_probe & PCI_USE_PIRQ_MASK)
+ newirq = 0;
+ else
+ printk(KERN_WARNING "PCI: IRQ %i for device %s doesn't match PIRQ mask - try pci=usepirqmask\n", newirq, dev->slot_name);
+ }
if (!newirq && assign) {
for (i = 0; i < 16; i++) {
if (!(mask & (1 << i)))
irq = pirq & 0xf;
DBG(" -> hardcoded IRQ %d\n", irq);
msg = "Hardcoded";
- } else if (r->get && (irq = r->get(pirq_router_dev, dev, pirq))) {
+ } else if (r->get && (irq = r->get(pirq_router_dev, dev, pirq)) &&
+ ((!(pci_probe & PCI_USE_PIRQ_MASK)) || ((1 << irq) & mask))) {
DBG(" -> got IRQ %d\n", irq);
msg = "Found";
} else if (newirq && r->set && (dev->class >> 8) != PCI_CLASS_DISPLAY_VGA) {
continue;
if (info->irq[pin].link == pirq) {
/* We refuse to override the dev->irq information. Give a warning! */
- if (dev2->irq && dev2->irq != irq) {
+ if (dev2->irq && dev2->irq != irq &&
+ (!(pci_probe & PCI_USE_PIRQ_MASK) ||
+ ((1 << dev2->irq) & mask))) {
printk(KERN_INFO "IRQ routing conflict for %s, have irq %d, want irq %d\n",
dev2->slot_name, dev2->irq, irq);
continue;
void __init pcibios_irq_init(void)
{
DBG("PCI: IRQ init\n");
+
+#ifdef CONFIG_ACPI_PCI
+ if (!(pci_probe & PCI_NO_ACPI_ROUTING)) {
+ if (acpi_prts.count) {
+ printk(KERN_INFO "PCI: Using ACPI for IRQ routing\n");
+ pci_use_acpi_routing = 1;
+ return;
+ } else
+ printk(KERN_WARNING "PCI: Invalid ACPI-PCI IRQ routing table\n");
+ }
+#endif
+
pirq_table = pirq_find_routing_table();
+
#ifdef CONFIG_PCI_BIOS
if (!pirq_table && (pci_probe & PCI_BIOS_IRQ_SCAN))
pirq_table = pcibios_get_irq_routing_table();
pci_probe = PCI_PROBE_CONF2 | PCI_NO_CHECKS;
return NULL;
}
+#endif
+#ifdef CONFIG_ACPI_PCI
+ else if (!strcmp(str, "noacpi")) {
+ pci_probe |= PCI_NO_ACPI_ROUTING;
+ return NULL;
+ }
#endif
else if (!strcmp(str, "rom")) {
pci_probe |= PCI_ASSIGN_ROMS;
} else if (!strncmp(str, "lastbus=", 8)) {
pcibios_last_bus = simple_strtol(str+8, NULL, 0);
return NULL;
+ } else if (!strcmp(str, "usepirqmask")) {
+ pci_probe |= PCI_USE_PIRQ_MASK;
+ return NULL;
}
return str;
}
#define PCI_NO_SORT 0x0100
#define PCI_BIOS_SORT 0x0200
#define PCI_NO_CHECKS 0x0400
+#define PCI_USE_PIRQ_MASK 0x0800
#define PCI_ASSIGN_ROMS 0x1000
#define PCI_BIOS_IRQ_SCAN 0x2000
#define PCI_ASSIGN_ALL_BUSSES 0x4000
+#define PCI_NO_ACPI_ROUTING 0x8000
extern unsigned int pci_probe;
extern unsigned int pcibios_irq_mask;
+extern int pci_use_acpi_routing;
+
void pcibios_irq_init(void);
void pcibios_fixup_irqs(void);
void pcibios_enable_irq(struct pci_dev *dev);
#define __KERNEL_SYSCALLS__
#include <stdarg.h>
+#include <linux/compiler.h>
#include <linux/errno.h>
#include <linux/sched.h>
-#include <linux/fs.h>
#include <linux/kernel.h>
#include <linux/mm.h>
#include <linux/smp.h>
#include <linux/delay.h>
#include <linux/reboot.h>
#include <linux/init.h>
+#include <linux/ctype.h>
#include <asm/uaccess.h>
#include <asm/pgtable.h>
#include <asm/processor.h>
#include <asm/i387.h>
#include <asm/desc.h>
+#include <asm/mmu_context.h>
#include <asm/pda.h>
#include <asm/prctl.h>
+#include <asm/kdebug.h>
#include <linux/irq.h>
-#include <linux/err.h>
asmlinkage extern void ret_from_fork(void);
* We use this if we don't have any better
* idle routine..
*/
-static void default_idle(void)
+void default_idle(void)
{
if (!hlt_counter) {
__cli();
static long no_idt[3];
static int reboot_mode;
-int reboot_thru_bios;
#ifdef CONFIG_SMP
int reboot_smp = 0;
static int reboot_cpu = -1;
-/* shamelessly grabbed from lib/vsprintf.c for readability */
-#define is_digit(c) ((c) >= '0' && (c) <= '9')
#endif
static int __init reboot_setup(char *str)
{
case 'c': /* "cold" reboot (with memory testing etc) */
reboot_mode = 0x0;
break;
- case 'b': /* "bios" reboot by jumping through the BIOS */
- reboot_thru_bios = 1;
- break;
- case 'h': /* "hard" reboot by toggling RESET and/or crashing the CPU */
- reboot_thru_bios = 0;
- break;
#ifdef CONFIG_SMP
case 's': /* "smp" reboot by executing reset on BSP or other CPU*/
reboot_smp = 1;
- if (is_digit(*(str+1))) {
- reboot_cpu = (int) (*(str+1) - '0');
- if (is_digit(*(str+2)))
- reboot_cpu = reboot_cpu*10 + (int)(*(str+2) - '0');
- }
+ if (isdigit(str[1]))
+ sscanf(str+1, "%d", &reboot_cpu);
+ else if (!strncmp(str, "smp", 3))
+ sscanf(str+3, "%d", &reboot_cpu);
/* we will leave sorting out the final value
when we are ready to reboot, since we might not
have set up boot_cpu_id or smp_num_cpu */
break;
}
-/*
- * Switch to real mode and then execute the code
- * specified by the code and length parameters.
- * We assume that length will aways be less that 100!
- */
-void machine_real_restart(unsigned char *code, int length)
-{
- cli();
-
- /* This will have to be rewritten for sledgehammer. It would
- help if sledgehammer have simple option to reset itself.
- */
-
- panic( "real_restart is hard to do.\n" );
- while(1);
-}
-
void machine_restart(char * __unused)
{
#if CONFIG_SMP
* Stop all CPUs and turn off local APICs and the IO-APIC, so
* other OSs see a clean IRQ state.
*/
+ if (notify_die(DIE_STOP,"cpustop",0,0) != NOTIFY_BAD)
smp_send_stop();
disable_IO_APIC();
#endif
-
- if(!reboot_thru_bios) {
/* rebooting needs to touch the page at absolute addr 0 */
*((unsigned short *)__va(0x472)) = reboot_mode;
for (;;) {
int i;
+ /* First try the keyboard controller. */
for (i=0; i<100; i++) {
kb_wait();
udelay(50);
__asm__ __volatile__("lidt %0": :"m" (no_idt));
__asm__ __volatile__("int3");
}
- }
- printk("no bios restart currently\n");
- for (;;);
+ /* Could do reset through the northbridge of Hammer here. */
}
void machine_halt(void)
/* Prints also some state that isn't saved in the pt_regs */
void show_regs(struct pt_regs * regs)
{
- unsigned long cr0 = 0L, cr2 = 0L, cr3 = 0L, cr4 = 0L, fs, gs;
+ unsigned long cr0 = 0L, cr2 = 0L, cr3 = 0L, cr4 = 0L, fs, gs, shadowgs;
unsigned int fsindex,gsindex;
unsigned int ds,cs,es;
regs->rax, regs->rbx, regs->rcx);
printk("RDX: %016lx RSI: %016lx RDI: %016lx\n",
regs->rdx, regs->rsi, regs->rdi);
- printk("RBP: %016lx R08: %016lx R09: %08lx\n",
+ printk("RBP: %016lx R08: %016lx R09: %016lx\n",
regs->rbp, regs->r8, regs->r9);
printk("R10: %016lx R11: %016lx R12: %016lx\n",
regs->r10, regs->r11, regs->r12);
regs->r13, regs->r14, regs->r15);
asm("movl %%ds,%0" : "=r" (ds));
- asm("movl %%es,%0" : "=r" (es));
asm("movl %%cs,%0" : "=r" (cs));
+ asm("movl %%es,%0" : "=r" (es));
asm("movl %%fs,%0" : "=r" (fsindex));
asm("movl %%gs,%0" : "=r" (gsindex));
- rdmsrl(0xc0000100, fs);
- rdmsrl(0xc0000101, gs);
+ rdmsrl(MSR_FS_BASE, fs);
+ rdmsrl(MSR_GS_BASE, gs);
+ rdmsrl(MSR_KERNEL_GS_BASE, shadowgs);
asm("movq %%cr0, %0": "=r" (cr0));
asm("movq %%cr2, %0": "=r" (cr2));
asm("movq %%cr3, %0": "=r" (cr3));
asm("movq %%cr4, %0": "=r" (cr4));
- printk("FS: %016lx(%04x) GS:%016lx(%04x)\n", fs,fsindex,gs,gsindex);
- printk("CS: %04x DS:%04x ES:%04x CR0: %016lx\n", cs, ds, es, cr0);
+ printk("FS: %016lx(%04x) GS:%016lx(%04x) knlGS:%016lx\n",
+ fs,fsindex,gs,gsindex,shadowgs);
+ printk("CS: %04x DS: %04x ES: %04x CR0: %016lx\n", cs, ds, es, cr0);
printk("CR2: %016lx CR3: %016lx CR4: %016lx\n", cr2, cr3, cr4);
}
}
}
+void load_gs_index(unsigned gs)
+{
+ int access;
+ /* should load gs in syscall exit after swapgs instead */
+ /* XXX need to add LDT locking for SMP to protect against parallel changes */
+ asm volatile("pushf\n\t"
+ "cli\n\t"
+ "swapgs\n\t"
+ "lar %1,%0\n\t"
+ "jnz 1f\n\t"
+ "movl %1,%%eax\n\t"
+ "movl %%eax,%%gs\n\t"
+ "jmp 2f\n\t"
+ "1: movl %2,%%gs\n\t"
+ "2: swapgs\n\t"
+ "popf" : "=g" (access) : "g" (gs), "r" (0) : "rax");
+}
+
#define __STR(x) #x
#define __STR2(x) __STR(x)
return 0;
}
-/*
- * fill in the user structure for a core dump..
- */
-void dump_thread(struct pt_regs * regs, struct user * dump)
-{
- int i;
-
-/* changed the size calculations - should hopefully work better. lbt */
- dump->magic = CMAGIC;
- dump->start_code = 0;
- dump->start_stack = regs->rsp & ~(PAGE_SIZE - 1);
- dump->u_tsize = ((unsigned long) current->mm->end_code) >> PAGE_SHIFT;
- dump->u_dsize = ((unsigned long) (current->mm->brk + (PAGE_SIZE-1))) >> PAGE_SHIFT;
- dump->u_dsize -= dump->u_tsize;
- dump->u_ssize = 0;
- for (i = 0; i < 8; i++)
- dump->u_debugreg[i] = current->thread.debugreg[i];
-
- if (dump->start_stack < TASK_SIZE)
- dump->u_ssize = ((unsigned long) (TASK_SIZE - dump->start_stack)) >> PAGE_SHIFT;
-
-#define SAVE(reg) dump->regs.reg = regs->reg
- SAVE(rax);
- SAVE(rbx);
- SAVE(rcx);
- SAVE(rdx);
- SAVE(rsi);
- SAVE(rdi);
- SAVE(rbp);
- SAVE(r8);
- SAVE(r9);
- SAVE(r10);
- SAVE(r11);
- SAVE(r12);
- SAVE(r13);
- SAVE(r14);
- SAVE(r15);
- SAVE(orig_rax);
- SAVE(rip);
-#undef SAVE
-
- /* FIXME: Should use symbolic names for msr-s! */
- rdmsrl(0xc0000100, dump->regs.fs_base);
- rdmsrl(0xc0000101, dump->regs.kernel_gs_base);
-
- dump->u_fpvalid = dump_fpu (regs, &dump->i387);
-}
-
/*
* This special macro can be used to load a debugging register
*/
*next = &next_p->thread;
struct tss_struct *tss = init_tss + smp_processor_id();
+
unlazy_fpu(prev_p);
/*
/*
* Switch DS and ES.
- * XXX: check if this is safe on SMP
*/
asm volatile("movl %%es,%0" : "=m" (prev->es));
- if (unlikely(next->es != prev->es))
+ if (unlikely(next->es | prev->es))
loadsegment(es, next->es);
asm volatile ("movl %%ds,%0" : "=m" (prev->ds));
- if (unlikely(next->ds != prev->ds))
+ if (unlikely(next->ds | prev->ds))
loadsegment(ds, next->ds);
/*
* Switch FS and GS.
+ * XXX Check if this is safe on SMP (!= -> |)
*/
{
unsigned int fsindex;
asm volatile("movl %%fs,%0" : "=g" (fsindex));
+ if (unlikely(fsindex != next->fsindex)) /* or likely? */
+ loadsegment(fs, next->fsindex);
if (unlikely(fsindex != prev->fsindex))
prev->fs = 0;
- if (unlikely((fsindex | next->fsindex) || prev->fs))
- loadsegment(fs, next->fsindex);
- /* Should use a shortcut via a GDT entry if next->fs is 32bit */
- if (fsindex != prev->fsindex || next->fs != prev->fs)
+ if ((fsindex != prev->fsindex) || (prev->fs != next->fs))
wrmsrl(MSR_FS_BASE, next->fs);
prev->fsindex = fsindex;
}
-
{
unsigned int gsindex;
asm volatile("movl %%gs,%0" : "=g" (gsindex));
+ if (unlikely(gsindex != next->gsindex))
+ load_gs_index(next->gs);
if (unlikely(gsindex != prev->gsindex))
prev->gs = 0;
- if (unlikely((gsindex | next->gsindex) || prev->gs)) {
- unsigned long flags;
- /* could load gs in syscall exit after swapgs instead */
- int nr = smp_processor_id();
- __save_flags(flags);
- __cli();
- loadsegment(gs, next->gsindex);
- wrmsrl(MSR_GS_BASE, cpu_pda+nr);
- __restore_flags(flags);
- }
- if (gsindex != prev->gsindex || (prev->gs | next->gs))
+ if (gsindex != prev->gsindex || prev->gs != next->gs)
wrmsrl(MSR_KERNEL_GS_BASE, next->gs);
prev->gsindex = gsindex;
}
prev->userrsp = read_pda(oldrsp);
write_pda(oldrsp, next->userrsp);
write_pda(pcurrent, next_p);
- write_pda(kernelstack,
- (unsigned long)next_p->thread_info + THREAD_SIZE - PDA_STACKOFFSET);
+ write_pda(kernelstack, (unsigned long)next_p->thread_info + THREAD_SIZE - PDA_STACKOFFSET);
/*
* Now maybe reload the debug registers
/*
* Handle the IO bitmap
*/
- if (unlikely(prev->ioperm | next->ioperm)) {
+ if (unlikely(prev->ioperm || next->ioperm)) {
if (next->ioperm) {
/*
* 4 cachelines copy ... not good, but not that
#define first_sched ((unsigned long) scheduling_functions_start_here)
#define last_sched ((unsigned long) scheduling_functions_end_here)
-/*
- * Do a traceback through the scheduler to find where a process sleeps.
- *
- * Currently sched.c is compiled with -fno-omit-frame-pointer, so we
- * just go through the stack frames.
- */
unsigned long get_wchan(struct task_struct *p)
{
- return -1;
+ u64 fp, rip;
+ int count = 0;
+
+ if (!p || p == current || p->state == TASK_RUNNING)
+ return 0;
+ if (p->thread.rsp < (u64)p || p->thread.rsp > (u64)p + THREAD_SIZE)
+ return 0;
+ fp = *(u64 *)(p->thread.rsp);
+ do {
+ if (fp < (unsigned long)p || fp > (unsigned long)p+THREAD_SIZE)
+ return 0;
+ rip = *(u64 *)(fp+8);
+ if (rip < first_sched || rip >= last_sched)
+ return rip;
+ fp = *(u64 *)fp;
+ } while (count++ < 16);
+ return 0;
}
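`get_wchan()` above relies on stack frames laid out with the saved frame pointer at `[fp]` and the return address at `[fp + 8]`, stopping with a bounded count. The same walk can be exercised on a simulated stack in user space (word-indexed for simplicity; all names and addresses below are illustrative):

```c
#include <stdint.h>

/* Simulated frame-pointer walk in the style of get_wchan above:
 * each frame stores the previous frame pointer at stack[fp] and
 * the return address at stack[fp + 1]. Frames whose return
 * address falls inside [lo, hi) (the "scheduler" range) are
 * skipped; the first address outside it is returned, 0 if none
 * is found within the iteration bound. */
static uint64_t walk_frames(const uint64_t *stack, uint64_t fp,
			    uint64_t lo, uint64_t hi)
{
	int count = 0;

	while (count++ < 16) {
		uint64_t rip = stack[fp + 1];
		if (rip < lo || rip >= hi)	/* outside scheduler: found it */
			return rip;
		fp = stack[fp];			/* follow saved frame pointer */
	}
	return 0;
}
```

The real function additionally bounds-checks `fp` against the task's stack before each dereference, which the sketch omits since its fake stack is a plain array.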
#undef last_sched
#undef first_sched
{
int ret = 0;
unsigned long tmp;
+
switch (code) {
case ARCH_SET_GS:
+ if (addr >= TASK_SIZE)
+ return -EPERM;
asm volatile("movw %%gs,%0" : "=g" (current->thread.gsindex));
current->thread.gs = addr;
ret = checking_wrmsrl(MSR_KERNEL_GS_BASE, addr);
break;
case ARCH_SET_FS:
+ /* Not strictly needed for fs, but do it for symmetry
+ with gs */
+ if (addr >= TASK_SIZE)
+ return -EPERM;
asm volatile("movw %%fs,%0" : "=g" (current->thread.fsindex));
current->thread.fs = addr;
ret = checking_wrmsrl(MSR_FS_BASE, addr);
/*
* Pentium III FXSR, SSE support
* Gareth Hughes <gareth@valinux.com>, May 2000
+ *
+ * x86-64 port 2000-2002 Andi Kleen
*/
#include <linux/kernel.h>
unsigned long regno, unsigned long value)
{
unsigned long tmp;
- switch (regno >> 2) {
- // XXX: add 64bit setting.
- case FS:
+ switch (regno) {
+ case offsetof(struct user_regs_struct,fs):
if (value && (value & 3) != 3)
return -EIO;
- child->thread.fs = value;
+ child->thread.fsindex = value & 0xffff;
+ return 0;
+ case offsetof(struct user_regs_struct,gs):
+ if (value && (value & 3) != 3)
+ return -EIO;
+ child->thread.gsindex = value & 0xffff;
return 0;
- case GS:
+ case offsetof(struct user_regs_struct,ds):
if (value && (value & 3) != 3)
return -EIO;
+ child->thread.ds = value & 0xffff;
+ return 0;
+ case offsetof(struct user_regs_struct,es):
+ if (value && (value & 3) != 3)
+ return -EIO;
+ child->thread.es = value & 0xffff;
+ return 0;
+ case offsetof(struct user_regs_struct,fs_base):
+ if (!((value >> 48) == 0 || (value >> 48) == 0xffff))
+ return -EIO;
+ child->thread.fs = value;
+ return 0;
+ case offsetof(struct user_regs_struct,gs_base):
+ if (!((value >> 48) == 0 || (value >> 48) == 0xffff))
+ return -EIO;
child->thread.gs = value;
return 0;
- case EFLAGS:
+ case offsetof(struct user_regs_struct, eflags):
value &= FLAG_MASK;
tmp = get_stack_long(child, EFL_OFFSET);
tmp &= ~FLAG_MASK;
value |= tmp;
break;
+ case offsetof(struct user_regs_struct,cs):
+ if (value && (value & 3) != 3)
+ return -EIO;
+ value &= 0xffff;
+ break;
}
- /* assumption about sizes... */
- if (regno > GS*4)
- regno -= 2*4;
- /* This has to be changes to put_stack_64() */
- /* Hmm, with 32 bit applications being around... this will be
- rather funny */
put_stack_long(child, regno - sizeof(struct pt_regs), value);
return 0;
}
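The `fs_base`/`gs_base` checks in `putreg()` above approximate the x86-64 canonical-address rule: bits 63..48 must be all zeros or all ones (strict canonical form additionally ties them to bit 47, which this looser check does not enforce). A stand-alone sketch with a hypothetical helper name:

```c
#include <stdint.h>

/* Mirror of the (value >> 48) test above: accept an address only
 * if its top 16 bits are all clear or all set. This is the same
 * looser-than-canonical check used for fs_base/gs_base. */
static int is_canonical(uint64_t addr)
{
	uint64_t top = addr >> 48;

	return top == 0 || top == 0xffff;
}
```

Loading a non-canonical base would fault later at use, so rejecting it here with -EIO gives the debugger a clean error instead.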
-static unsigned long getreg(struct task_struct *child,
- unsigned long regno)
+static unsigned long getreg(struct task_struct *child, unsigned long regno)
{
- switch (regno >> 3) {
- case FS:
+ switch (regno) {
+ case offsetof(struct user_regs_struct, fs):
+ return child->thread.fsindex;
+ case offsetof(struct user_regs_struct, gs):
+ return child->thread.gsindex;
+ case offsetof(struct user_regs_struct, ds):
+ return child->thread.ds;
+ case offsetof(struct user_regs_struct, es):
+ return child->thread.es;
+ case offsetof(struct user_regs_struct, fs_base):
return child->thread.fs;
- case GS:
+ case offsetof(struct user_regs_struct, gs_base):
return child->thread.gs;
default:
regno = regno - sizeof(struct pt_regs);
ret = ptrace_attach(child);
goto out_tsk;
}
- ret = -ESRCH;
- if (!(child->ptrace & PT_PTRACED))
- goto out_tsk;
- if (child->state != TASK_STOPPED) {
- if (request != PTRACE_KILL)
- goto out_tsk;
- }
- if (child->p_pptr != current)
+ ret = ptrace_check_attach(child, request == PTRACE_KILL);
+ if (ret < 0)
goto out_tsk;
+
switch (request) {
/* when I and D space are separate, these will need to be fixed. */
case PTRACE_PEEKTEXT: /* read word at location addr. */
break;
tmp = 0; /* Default return condition */
- if(addr < 20*sizeof(long))
+ if(addr < sizeof(struct user_regs_struct))
tmp = getreg(child, addr);
if(addr >= (long) &dummy->u_debugreg[0] &&
addr <= (long) &dummy->u_debugreg[7]){
addr > sizeof(struct user) - 3)
break;
- if (addr < 20*sizeof(long)) {
+ if (addr < sizeof(struct user_regs_struct)) {
ret = putreg(child, addr, data);
break;
}
ret = -EIO;
if ((unsigned long) data > _NSIG)
break;
- if (request == PTRACE_SYSCALL) {
- set_tsk_thread_flag(child, TIF_SYSCALL_TRACE);
- }
- else {
- clear_tsk_thread_flag(child, TIF_SYSCALL_TRACE);
- }
+ if (request == PTRACE_SYSCALL)
+ set_tsk_thread_flag(child, TIF_SYSCALL_TRACE);
+ else
+ clear_tsk_thread_flag(child, TIF_SYSCALL_TRACE);
child->exit_code = data;
/* make sure the single step bit is not set. */
tmp = get_stack_long(child, EFL_OFFSET);
ret = -EIO;
if ((unsigned long) data > _NSIG)
break;
- clear_tsk_thread_flag(child, TIF_SYSCALL_TRACE);
+ clear_tsk_thread_flag(child, TIF_SYSCALL_TRACE);
if ((child->ptrace & PT_DTRACE) == 0) {
/* Spurious delayed TF traps may occur */
child->ptrace |= PT_DTRACE;
ret = -EIO;
break;
}
- for ( i = 0; i < FRAME_SIZE; i += sizeof(long) ) {
+ for ( i = 0; i < sizeof(struct user_regs_struct); i += sizeof(long) ) {
__put_user(getreg(child, i),(unsigned long *) data);
data += sizeof(long);
}
ret = -EIO;
break;
}
- for ( i = 0; i < FRAME_SIZE; i += sizeof(long) ) {
+ for ( i = 0; i < sizeof(struct user_regs_struct); i += sizeof(long) ) {
__get_user(tmp, (unsigned long *) data);
putreg(child, i, tmp);
data += sizeof(long);
ret = -EIO;
break;
}
- if ( !child->used_math ) {
- /* Simulate an empty FPU. */
- set_fpu_cwd(child, 0x037f);
- set_fpu_swd(child, 0x0000);
- set_fpu_twd(child, 0xffff);
- set_fpu_mxcsr(child, 0x1f80);
- }
ret = get_fpregs((struct user_i387_struct *)data, child);
break;
}
current->exit_code = SIGTRAP | ((current->ptrace & PT_TRACESYSGOOD)
? 0x80 : 0);
- preempt_disable();
current->state = TASK_STOPPED;
notify_parent(current, SIGCHLD);
schedule();
- preempt_enable();
/*
* this isn't the same as continuing with a signal, but it will do
* for normal use. strace only continues with a signal if the
*/
#include <linux/config.h>
#include <linux/sched.h>
-#include <linux/err.h>
+#include <asm/errno.h>
#include <asm/semaphore.h>
}
-/*
- * The semaphore operations have a special calling sequence that
- * allow us to do a simpler in-line version of them. These routines
- * need to convert that sequence back into the C sequence when
- * there is contention on the semaphore.
- *
- * %rcx contains the semaphore pointer on entry. Save all the callee
- * clobbered registers. It would be better if the compiler had a way
- * to specify that for the callee.
- */
-
-
-#define PUSH_CLOBBER "pushq %rdi ; pushq %rsi ; pushq %rdx ; pushq %rcx ;" \
- "pushq %rbx ; pushq %r8 ; push %r9\n\t"
-#define POP_CLOBBER "popq %r9 ; popq %r8 ; popq %rbx ; popq %rcx ; " \
- "popq %rdx ; popq %rsi ; popq %rdi\n\t"
-
-#define SEM_ENTRY(label, name) asm( \
- ".p2align\n\t.globl " #label "\n\t" \
- #label ":\n\t" PUSH_CLOBBER "call " #name "\n\t" POP_CLOBBER "ret" )
-
-SEM_ENTRY(__down_failed, __down);
-SEM_ENTRY(__down_failed_interruptible, __down_interruptible);
-SEM_ENTRY(__down_failed_trylock, __down_trylock);
-SEM_ENTRY(__up_wakeup, __up);
-
-
-#if defined(CONFIG_SMP)
-asm(
-".p2align"
-"\n.globl __write_lock_failed"
-"\n__write_lock_failed:"
-"\n " LOCK "addl $" RW_LOCK_BIAS_STR ",(%rax)"
-"\n1: rep; nop; cmpl $" RW_LOCK_BIAS_STR ",(%rax)"
-"\n jne 1b"
-
-"\n " LOCK "subl $" RW_LOCK_BIAS_STR ",(%rax)"
-"\n jnz __write_lock_failed"
-"\n ret"
-
-
-"\n.p2align"
-"\n.globl __read_lock_failed"
-"\n__read_lock_failed:"
-"\n lock ; incl (%rax)"
-"\n1: rep; nop; cmpl $1,(%rax)"
-"\n js 1b"
-
-"\n lock ; decl (%rax)"
-"\n js __read_lock_failed"
-"\n ret"
-);
-#endif
-
#include <asm/mpspec.h>
#include <asm/mmu_context.h>
#include <asm/bootsetup.h>
+#include <asm/smp.h>
/*
* Machine setup..
*/
extern void mcheck_init(struct cpuinfo_x86 *c);
+extern void init_memory_mapping(void);
-char ignore_irq13; /* set if exception 16 works */
struct cpuinfo_x86 boot_cpu_data = { 0, 0, 0, 0, -1, 1, 0, 0, -1 };
unsigned long mmu_cr4_features;
*/
struct drive_info_struct { char dummy[32]; } drive_info;
struct screen_info screen_info;
-struct apm_info apm_info;
struct sys_desc_table_struct {
unsigned short length;
unsigned char table[0];
extern int root_mountflags;
extern char _text, _etext, _edata, _end;
-extern unsigned long cpu_khz;
static int disable_x86_fxsr __initdata = 0;
for (i = 0; i < e820.nr_map; i++) {
printk(" %s: %016Lx - %016Lx ", who,
- (unsigned long long)e820.map[i].addr,
- (unsigned long long)(e820.map[i].addr + e820.map[i].size));
+ (unsigned long long) e820.map[i].addr,
+ (unsigned long long) (e820.map[i].addr + e820.map[i].size));
switch (e820.map[i].type) {
case E820_RAM: printk("(usable)\n");
break;
case E820_NVS:
printk("(ACPI NVS)\n");
break;
- default: printk("type %lu\n", (unsigned long)e820.map[i].type);
+ default: printk("type %u\n", e820.map[i].type);
break;
}
}
}
}
+unsigned long start_pfn, end_pfn;
+
void __init setup_arch(char **cmdline_p)
{
unsigned long bootmap_size, low_mem_size;
- unsigned long start_pfn, max_pfn, max_low_pfn;
int i;
ROOT_DEV = to_kdev_t(ORIG_ROOT_DEV);
drive_info = DRIVE_INFO;
screen_info = SCREEN_INFO;
- apm_info.bios = APM_BIOS_INFO;
aux_device_present = AUX_DEVICE_INFO;
#ifdef CONFIG_BLK_DEV_RAM
#define PFN_DOWN(x) ((x) >> PAGE_SHIFT)
#define PFN_PHYS(x) ((x) << PAGE_SHIFT)
-#define VMALLOC_RESERVE (unsigned long)(4096 << 20)
-#define MAXMEM (unsigned long)(-PAGE_OFFSET-VMALLOC_RESERVE)
+#define MAXMEM (120UL * 1024 * 1024 * 1024 * 1024) /* 120TB */
#define MAXMEM_PFN PFN_DOWN(MAXMEM)
+#define MAX_NONPAE_PFN (1 << 20)
/*
* partially used pages are not usable - thus
* we are rounding upwards:
*/
- start_pfn = PFN_UP(__pa(&_end));
+ start_pfn = PFN_UP(__pa_symbol(&_end));
/*
* Find the highest page frame number we have available
*/
- max_pfn = 0;
+ end_pfn = 0;
for (i = 0; i < e820.nr_map; i++) {
unsigned long start, end;
/* RAM? */
end = PFN_DOWN(e820.map[i].addr + e820.map[i].size);
if (start >= end)
continue;
- if (end > max_pfn)
- max_pfn = end;
+ if (end > end_pfn)
+ end_pfn = end;
}
- /*
- * Determine low and high memory ranges:
- */
- max_low_pfn = max_pfn;
- if (max_low_pfn > MAXMEM_PFN) {
- max_low_pfn = MAXMEM_PFN;
+ if (end_pfn > MAXMEM_PFN) {
+ end_pfn = MAXMEM_PFN;
}
+ init_memory_mapping();
/*
* Initialize the boot-time allocator (with low memory only):
*/
- bootmap_size = init_bootmem(start_pfn, max_low_pfn);
+ bootmap_size = init_bootmem(start_pfn, end_pfn);
/*
* Register fully available low RAM pages with the bootmem allocator.
* We are rounding up the start address of usable memory:
*/
curr_pfn = PFN_UP(e820.map[i].addr);
- if (curr_pfn >= max_low_pfn)
+ if (curr_pfn >= end_pfn)
continue;
/*
* ... and at the end of the usable range downwards:
*/
last_pfn = PFN_DOWN(e820.map[i].addr + e820.map[i].size);
- if (last_pfn > max_low_pfn)
- last_pfn = max_low_pfn;
+ if (last_pfn > end_pfn)
+ last_pfn = end_pfn;
/*
* .. finally, did all the rounding and playing
*/
reserve_bootmem(PAGE_SIZE, PAGE_SIZE);
#endif
-
+#ifdef CONFIG_ACPI_SLEEP
+ /*
+ * Reserve low memory region for sleep support.
+ */
+ acpi_reserve_bootmem();
+#endif
#ifdef CONFIG_X86_LOCAL_APIC
/*
* Find and reserve possible boot-time SMP configuration:
smp_alloc_memory(); /* AP processor realmode stacks in low memory*/
#endif
paging_init();
+#ifdef CONFIG_ACPI_BOOT
+ /*
+ * Initialize the ACPI boot-time table parser (gets the RSDP and SDT).
+ * Must do this after paging_init (due to reliance on fixmap, and thus
+ * the bootmem allocator) but before get_smp_config (to allow parsing
+ * of MADT).
+ */
+ acpi_boot_init(*cmdline_p);
+#endif
#ifdef CONFIG_X86_LOCAL_APIC
/*
* get boot-time SMP configuration:
request_resource(&ioport_resource, standard_io_resources+i);
/* Tell the PCI layer not to allocate too close to the RAM area.. */
+ /* ??? move this up on x86-64 */
low_mem_size = ((max_low_pfn << PAGE_SHIFT) + 0xfffff) & ~0xfffff;
if (low_mem_size > pci_mem_start)
pci_mem_start = low_mem_size;
clear_bit(0*32+31, &c->x86_capability);
r = get_model_name(c);
+ if (!r) {
+ switch (c->x86) {
+ case 15:
+ /* Should distinguish models here, but this is only
+ a fallback anyway. */
+ strcpy(c->x86_model_id, "Hammer");
+ break;
+ }
+ }
display_cacheinfo(c);
return r;
}
static int show_cpuinfo(struct seq_file *m, void *v)
{
struct cpuinfo_x86 *c = v;
- int index = c - cpu_data;
/*
* These flag bits must match the definitions in <asm/cpufeature.h>.
/* AMD-defined */
NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL,
NULL, NULL, NULL, "syscall", NULL, NULL, NULL, NULL,
- NULL, NULL, NULL, NULL, NULL, NULL, "mmxext", NULL,
+ NULL, NULL, NULL, NULL, "nx", NULL, "mmxext", NULL,
NULL, NULL, NULL, NULL, NULL, "lm", "3dnowext", "3dnow",
/* Transmeta-defined */
return 0;
#endif
- seq_printf(m,"processor\t: %d\n"
+ seq_printf(m,"processor\t: %u\n"
"vendor_id\t: %s\n"
"cpu family\t: %d\n"
"model\t\t: %d\n"
"model name\t: %s\n",
- index,
+ (unsigned)(c-cpu_data),
c->x86_vendor_id[0] ? c->x86_vendor_id : "unknown",
c->x86,
- c->x86_model,
+ (int)c->x86_model,
c->x86_model_id[0] ? c->x86_model_id : "unknown");
if (c->x86_mask || c->cpuid_level >= 0)
seq_printf(m, "stepping\t: unknown\n");
if ( test_bit(X86_FEATURE_TSC, &c->x86_capability) ) {
- seq_printf(m, "cpu MHz\t\t: %lu.%03lu\n",
+ seq_printf(m, "cpu MHz\t\t: %u.%03u\n",
cpu_khz / 1000, (cpu_khz % 1000));
}
/*
- * X86-64 specific setup part.
+ * X86-64 specific CPU setup.
* Copyright (C) 1995 Linus Torvalds
- * Copyright 2001 2002 SuSE Labs / Andi Kleen.
+ * Copyright 2001, 2002 SuSE Labs / Andi Kleen.
* See setup.c for older changelog.
+ * $Id: setup64.c,v 1.12 2002/03/21 10:09:17 ak Exp $
*/
#include <linux/config.h>
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/sched.h>
+#include <linux/string.h>
#include <asm/pda.h>
-#include <asm/pda.h>
+#include <asm/pgtable.h>
#include <asm/processor.h>
#include <asm/desc.h>
#include <asm/bitops.h>
#include <asm/atomic.h>
#include <asm/mmu_context.h>
+#include <asm/smp.h>
+#include <asm/i387.h>
char x86_boot_params[2048] __initdata = {0,};
extern void system_call(void);
extern void ia32_cstar_target(void);
+extern struct task_struct init_task;
+
struct desc_ptr gdt_descr = { 0 /* filled in */, (unsigned long) gdt_table };
struct desc_ptr idt_descr = { 256 * 16, (unsigned long) idt_table };
+char boot_cpu_stack[IRQSTACKSIZE] __cacheline_aligned;
+
void pda_init(int cpu)
{
- cpu_pda[cpu].me = &cpu_pda[cpu];
- cpu_pda[cpu].cpunumber = cpu;
- cpu_pda[cpu].irqcount = -1;
- cpu_pda[cpu].irqstackptr = cpu_pda[cpu].irqstack + sizeof(cpu_pda[0].irqstack);
- /* others are initialized in smpboot.c */
+ pml4_t *level4;
+
if (cpu == 0) {
+ /* others are initialized in smpboot.c */
cpu_pda[cpu].pcurrent = &init_task;
- cpu_pda[cpu].kernelstack =
- (unsigned long)&init_thread_union+THREAD_SIZE-PDA_STACKOFFSET;
+ cpu_pda[cpu].irqstackptr = boot_cpu_stack;
+ level4 = init_level4_pgt;
+ } else {
+ cpu_pda[cpu].irqstackptr = (char *)
+ __get_free_pages(GFP_ATOMIC, IRQSTACK_ORDER);
+ if (!cpu_pda[cpu].irqstackptr)
+ panic("cannot allocate irqstack for cpu %d\n", cpu);
+ level4 = (pml4_t *)__get_free_pages(GFP_ATOMIC, 0);
}
- asm volatile("movl %0,%%gs ; movl %0,%%fs" :: "r" (0));
+ if (!level4)
+ panic("Cannot allocate top level page for cpu %d", cpu);
+
+ cpu_pda[cpu].level4_pgt = (unsigned long *)level4;
+ if (level4 != init_level4_pgt)
+ memcpy(level4, &init_level4_pgt, PAGE_SIZE);
+ set_pml4(level4 + 510, mk_kernel_pml4(__pa_symbol(boot_vmalloc_pgt)));
+ asm volatile("movq %0,%%cr3" :: "r" (__pa(level4)));
+
+ cpu_pda[cpu].irqstackptr += IRQSTACKSIZE-64;
+ cpu_pda[cpu].cpunumber = cpu;
+ cpu_pda[cpu].irqcount = -1;
+ cpu_pda[cpu].kernelstack =
+ (unsigned long)stack_thread_info() - PDA_STACKOFFSET + THREAD_SIZE;
+ cpu_pda[cpu].me = &cpu_pda[cpu];
+
+ asm volatile("movl %0,%%fs ; movl %0,%%gs" :: "r" (0));
wrmsrl(MSR_GS_BASE, cpu_pda + cpu);
}
+#define EXCEPTION_STK_ORDER 0 /* >= N_EXCEPTION_STACKS*EXCEPTION_STKSZ */
+char boot_exception_stacks[N_EXCEPTION_STACKS*EXCEPTION_STKSZ];
+
+
/*
* cpu_init() initializes state that is per-CPU. Some data is already
* initialized (naturally) in the bootstrap process, such as the GDT
* and IDT. We reload them nevertheless, this function acts as a
* 'CPU state barrier', nothing should get across.
+ * A lot of state is already set up in PDA init.
*/
void __init cpu_init (void)
{
#ifdef CONFIG_SMP
- int nr = current_thread_info()->cpu;
+ int nr = stack_smp_processor_id();
#else
int nr = smp_processor_id();
#endif
struct tss_struct * t = &init_tss[nr];
unsigned long v;
+ char *estacks;
/* CPU 0 is initialised in head64.c */
- if (nr != 0)
+ if (nr != 0) {
+ estacks = (char *)__get_free_pages(GFP_ATOMIC, 0);
+ if (!estacks)
+ panic("Can't allocate exception stacks for CPU %d\n",nr);
pda_init(nr);
+ } else
+ estacks = boot_exception_stacks;
+
+ if (test_and_set_bit(nr, &cpu_initialized))
+ panic("CPU#%d already initialized!\n", nr);
- if (test_and_set_bit(nr, &cpu_initialized)) {
- printk("CPU#%d already initialized!\n", nr);
- for (;;) __sti();
- }
printk("Initializing CPU#%d\n", nr);
- if (cpu_has_vme || cpu_has_tsc || cpu_has_de)
clear_in_cr4(X86_CR4_VME|X86_CR4_PVI|X86_CR4_TSD|X86_CR4_DE);
gdt_descr.size = (__u8*) gdt_end - (__u8*)gdt_table;
* Delete NT
*/
- __asm__ volatile("pushfq ; popq %%rax ; btr $14,%%rax ; pushq %%rax ; popfq" :: : "eax");
+ asm volatile("pushfq ; popq %%rax ; btr $14,%%rax ; pushq %%rax ; popfq" ::: "eax");
/*
* LSTAR and STAR live in a bit strange symbiosis.
wrmsrl(MSR_CSTAR, ia32_cstar_target);
#endif
- rdmsrl(MSR_EFER, v);
- wrmsrl(MSR_EFER, v|1);
-
/* Flags to clear on syscall */
wrmsrl(MSR_SYSCALL_MASK, EF_TF|EF_DF|EF_IE);
-
wrmsrl(MSR_FS_BASE, 0);
wrmsrl(MSR_KERNEL_GS_BASE, 0);
barrier();
/*
- * set up and load the per-CPU TSS and LDT
+ * set up and load the per-CPU TSS
*/
+ estacks += EXCEPTION_STKSZ;
+ for (v = 0; v < N_EXCEPTION_STACKS; v++) {
+ t->ist[v] = (unsigned long)estacks;
+ estacks += EXCEPTION_STKSZ;
+ }
+
atomic_inc(&init_mm.mm_count);
current->active_mm = &init_mm;
if(current->mm)
BUG();
enter_lazy_tlb(&init_mm, current, nr);
- set_tssldt_descriptor((__u8 *)tss_start + (nr*16), (unsigned long) t,
- DESC_TSS,
- offsetof(struct tss_struct, io_bitmap));
+ set_tss_desc(nr, t);
load_TR(nr);
load_LDT(&init_mm);
set_debug(0UL, 6);
set_debug(0UL, 7);
- /*
- * Force FPU initialization:
- */
- clear_thread_flag(TIF_USEDFPU);
- current->used_math = 0;
- stts();
+ fpu_init();
}
#include <linux/stddef.h>
#include <linux/tty.h>
#include <linux/personality.h>
+#include <linux/compiler.h>
#include <linux/binfmts.h>
#include <asm/ucontext.h>
#include <asm/uaccess.h>
#include <asm/i387.h>
-#define DEBUG_SIG 0
+/* #define DEBUG_SIG 1 */
#define _BLOCKABLE (~(sigmask(SIGKILL) | sigmask(SIGSTOP)))
char *pretcode;
struct ucontext uc;
struct siginfo info;
- struct _fpstate fpstate;
- char retcode[8];
+ struct _fpstate fpstate __attribute__((aligned(8)));
};
static int
{
unsigned int err = 0;
-#define COPY(x) err |= __get_user(regs->x, &sc->x)
-
-#define COPY_SEG(seg) \
- { unsigned short tmp; \
- err |= __get_user(tmp, &sc->seg); \
- regs->x##seg = tmp; }
-#define COPY_SEG_STRICT(seg) \
- { unsigned short tmp; \
- err |= __get_user(tmp, &sc->seg); \
- regs->x##seg = tmp|3; }
+#define COPY(x) err |= __get_user(regs->x, &sc->x)
-#define GET_SEG(seg) \
- { unsigned short tmp; \
- err |= __get_user(tmp, &sc->seg); \
- loadsegment(seg,tmp); }
+ {
+ unsigned int seg;
+ err |= __get_user(seg, &sc->gs);
+ load_gs_index(seg);
+ err |= __get_user(seg, &sc->fs);
+ loadsegment(fs,seg);
+ }
- /* XXX: rdmsr for 64bits */
- GET_SEG(gs);
- GET_SEG(fs);
COPY(rdi); COPY(rsi); COPY(rbp); COPY(rsp); COPY(rbx);
COPY(rdx); COPY(rcx); COPY(rip);
COPY(r8);
COPY(r14);
COPY(r15);
-
{
unsigned int tmpflags;
err |= __get_user(tmpflags, &sc->eflags);
struct pt_regs *regs, unsigned long mask)
{
int tmp, err = 0;
+ struct task_struct *me = current;
tmp = 0;
__asm__("movl %%gs,%0" : "=r"(tmp): "0"(tmp));
err |= __put_user(regs->r13, &sc->r13);
err |= __put_user(regs->r14, &sc->r14);
err |= __put_user(regs->r15, &sc->r15);
- err |= __put_user(current->thread.trap_no, &sc->trapno);
- err |= __put_user(current->thread.error_code, &sc->err);
+ err |= __put_user(me->thread.trap_no, &sc->trapno);
+ err |= __put_user(me->thread.error_code, &sc->err);
err |= __put_user(regs->rip, &sc->rip);
err |= __put_user(regs->eflags, &sc->eflags);
- err |= __put_user(regs->rsp, &sc->rsp_at_signal);
+ err |= __put_user(mask, &sc->oldmask);
+ err |= __put_user(me->thread.cr2, &sc->cr2);
tmp = save_i387(fpstate);
if (tmp < 0)
else
err |= __put_user(tmp ? fpstate : NULL, &sc->fpstate);
- /* non-iBCS2 extensions.. */
- err |= __put_user(mask, &sc->oldmask);
- err |= __put_user(current->thread.cr2, &sc->cr2);
-
return err;
}
rsp = current->sas_ss_sp + current->sas_ss_size;
}
- return (void *)((rsp - frame_size) & -16UL);
+ {
+ extern void bad_sigframe(void);
+ /* The beginning of the sigframe is 8 bytes misaligned, but fpstate
+ must end up on a 16-byte boundary. */
+ if ((offsetof(struct rt_sigframe, fpstate) & 15) != 8)
+ bad_sigframe();
+ }
+
+ return (void *)((rsp - frame_size) & ~(15UL)) - 8;
}
static void setup_rt_frame(int sig, struct k_sigaction *ka, siginfo_t *info,
sigset_t *set, struct pt_regs * regs)
{
- struct thread_info *ti;
struct rt_sigframe *frame;
int err = 0;
printk("%d old rip %lx old rsp %lx old rax %lx\n", current->pid,regs->rip,regs->rsp,regs->rax);
#endif
- ti = current_thread_info();
/* Set up registers for signal handler */
- regs->rdi = (ti->exec_domain
- && ti->exec_domain->signal_invmap
- && sig < 32
- ? ti->exec_domain->signal_invmap[sig]
- : sig);
- regs->rax = 0; /* In case the signal handler was declared without prototypes */
-
+ {
+ struct exec_domain *ed = current_thread_info()->exec_domain;
+ if (unlikely(ed && ed->signal_invmap && sig < 32))
+ sig = ed->signal_invmap[sig];
+ }
+ regs->rdi = sig;
+ /* In case the signal handler was declared without prototypes */
+ regs->rax = 0;
/* This also works for non SA_SIGINFO handlers because they expect the
next argument after the signal number on the stack. */
regs->rip = (unsigned long) ka->sa.sa_handler;
set_fs(USER_DS);
- // XXX: cs
regs->eflags &= ~TF_MASK;
#if DEBUG_SIG
if ((current->ptrace & PT_PTRACED) && signr != SIGKILL) {
/* Let the debugger run. */
current->exit_code = signr;
- preempt_disable();
current->state = TASK_STOPPED;
notify_parent(current, SIGCHLD);
schedule();
- preempt_enable();
/* We're back. Did the debugger cancel the sig? */
if (!(signr = current->exit_code))
info.si_signo = signr;
info.si_errno = 0;
info.si_code = SI_USER;
- info.si_pid = current->p_pptr->pid;
- info.si_uid = current->p_pptr->uid;
+ info.si_pid = current->parent->pid;
+ info.si_uid = current->parent->uid;
}
/* If the (new) signal is now blocked, requeue it. */
preempt_disable();
current->state = TASK_STOPPED;
current->exit_code = signr;
- sig = current->p_pptr->sig;
+ sig = current->parent->sig;
if (sig && !(sig->action[SIGCHLD-1].sa.sa_flags & SA_NOCLDSTOP))
notify_parent(current, SIGCHLD);
schedule();
regs->rax == -ERESTARTSYS ||
regs->rax == -ERESTARTNOINTR) {
regs->rax = regs->orig_rax;
- regs->rcx -= 2;
+ regs->rip -= 2;
}
}
return 0;
#include <asm/mtrr.h>
#include <asm/pgalloc.h>
-
-/*
- * Some notes on x86 processor bugs affecting SMP operation:
- *
- * Pentium, Pentium Pro, II, III (and all CPUs) have bugs.
- * The Linux implications for SMP are handled as follows:
- *
- * Pentium III / [Xeon]
- * None of the E1AP-E3AP errata are visible to the user.
- *
- * E1AP. see PII A1AP
- * E2AP. see PII A2AP
- * E3AP. see PII A3AP
- *
- * Pentium II / [Xeon]
- * None of the A1AP-A3AP errata are visible to the user.
- *
- * A1AP. see PPro 1AP
- * A2AP. see PPro 2AP
- * A3AP. see PPro 7AP
- *
- * Pentium Pro
- * None of 1AP-9AP errata are visible to the normal user,
- * except occasional delivery of 'spurious interrupt' as trap #15.
- * This is very rare and a non-problem.
- *
- * 1AP. Linux maps APIC as non-cacheable
- * 2AP. worked around in hardware
- * 3AP. fixed in C0 and above steppings microcode update.
- * Linux does not use excessive STARTUP_IPIs.
- * 4AP. worked around in hardware
- * 5AP. symmetric IO mode (normal Linux operation) not affected.
- * 'noapic' mode has vector 0xf filled out properly.
- * 6AP. 'noapic' mode might be affected - fixed in later steppings
- * 7AP. We do not assume writes to the LVT deassering IRQs
- * 8AP. We do not enable low power mode (deep sleep) during MP bootup
- * 9AP. We do not use mixed mode
- *
- * Pentium
- * There is a marginal case where REP MOVS on 100MHz SMP
- * machines with B stepping processors can fail. XXX should provide
- * an L1cache=Writethrough or L1cache=off option.
- *
- * B stepping CPUs may hang. There are hardware work arounds
- * for this. We warn about it in case your board doesnt have the work
- * arounds. Basically thats so I can tell anyone with a B stepping
- * CPU and SMP problems "tough".
- *
- * Specific items [From Pentium Processor Specification Update]
- *
- * 1AP. Linux doesn't use remote read
- * 2AP. Linux doesn't trust APIC errors
- * 3AP. We work around this
- * 4AP. Linux never generated 3 interrupts of the same priority
- * to cause a lost local interrupt.
- * 5AP. Remote read is never used
- * 6AP. not affected - worked around in hardware
- * 7AP. not affected - worked around in hardware
- * 8AP. worked around in hardware - we get explicit CS errors if not
- * 9AP. only 'noapic' mode affected. Might generate spurious
- * interrupts, we log only the first one and count the
- * rest silently.
- * 10AP. not affected - worked around in hardware
- * 11AP. Linux reads the APIC between writes to avoid this, as per
- * the documentation. Make sure you preserve this as it affects
- * the C stepping chips too.
- * 12AP. not affected - worked around in hardware
- * 13AP. not affected - worked around in hardware
- * 14AP. we always deassert INIT during bootup
- * 15AP. not affected - worked around in hardware
- * 16AP. not affected - worked around in hardware
- * 17AP. not affected - worked around in hardware
- * 18AP. not affected - worked around in hardware
- * 19AP. not affected - worked around in BIOS
- *
- * If this sounds worrying believe me these bugs are either ___RARE___,
- * or are signal timing bugs worked around in hardware and there's
- * about nothing of note with C stepping upwards.
- */
+#include <asm/tlbflush.h>
/* The 'big kernel lock' */
spinlock_t kernel_flag __cacheline_aligned_in_smp = SPIN_LOCK_UNLOCKED;
-struct tlb_state cpu_tlbstate[NR_CPUS] __cacheline_aligned = {[0 ... NR_CPUS-1] = { &init_mm, 0, }};
+struct tlb_state cpu_tlbstate[NR_CPUS] = {[0 ... NR_CPUS-1] = { &init_mm, 0 }};
/*
* the following functions deal with sending IPIs between CPUs.
* We use 'broadcast', CPU->CPU IPIs and self-IPIs too.
*/
-static inline int __prepare_ICR (unsigned int shortcut, int vector)
+static inline unsigned int __prepare_ICR (unsigned int shortcut, int vector)
{
- return APIC_DM_FIXED | shortcut | vector | APIC_DEST_LOGICAL;
+ unsigned int icr = APIC_DM_FIXED | shortcut | vector | APIC_DEST_LOGICAL;
+ if (vector == KDB_VECTOR)
+ icr = (icr & (~APIC_VECTOR_MASK)) | APIC_DM_NMI;
+ return icr;
}
static inline int __prepare_ICR2 (unsigned int mask)
do_flush_tlb_all_local();
}
-static spinlock_t migration_lock = SPIN_LOCK_UNLOCKED;
-static task_t *new_task;
-
-/*
- * This function sends a 'task migration' IPI to another CPU.
- * Must be called from syscall contexts, with interrupts *enabled*.
- */
-void smp_migrate_task(int cpu, task_t *p)
+void smp_kdb_stop(void)
{
- /*
- * The target CPU will unlock the migration spinlock:
- */
- _raw_spin_lock(&migration_lock);
- new_task = p;
- send_IPI_mask(1 << cpu, TASK_MIGRATION_VECTOR);
+ send_IPI_allbutself(KDB_VECTOR);
}
-/*
- * Task migration callback.
- */
-asmlinkage void smp_task_migration_interrupt(void)
-{
- task_t *p;
-
- ack_APIC_irq();
- p = new_task;
- _raw_spin_unlock(&migration_lock);
- sched_task_migrated(p);
-}
/*
* this function sends a 'reschedule' IPI to another CPU.
* it goes straight through and wastes no time serializing
* anything. Worst case is that we lose a reschedule ...
*/
+
void smp_send_reschedule(int cpu)
{
send_IPI_mask(1 << cpu, RESCHEDULE_VECTOR);
}
-/*
- * this function sends a reschedule IPI to all (other) CPUs.
- * This should only be used if some 'global' task became runnable,
- * such as a RT task, that must be handled now. The first CPU
- * that manages to grab the task will run it.
- */
-void smp_send_reschedule_all(void)
-{
- send_IPI_allbutself(RESCHEDULE_VECTOR);
-}
-
/*
* Structure and data for smp_call_function(). This is designed to minimise
* static memory requirements. It also looks cleaner.
#include <asm/mtrr.h>
#include <asm/pgalloc.h>
#include <asm/desc.h>
+#include <asm/kdebug.h>
+#include <asm/tlbflush.h>
/* Set if we find a B stepping CPU */
static int smp_b_stepping;
static unsigned long __init setup_trampoline(void)
{
- extern __u32 tramp_gdt_ptr;
- tramp_gdt_ptr = (__u32)virt_to_phys(&gdt_table);
+ extern volatile __u32 tramp_gdt_ptr;
+ tramp_gdt_ptr = __pa_symbol(&gdt_table);
memcpy(trampoline_base, trampoline_data, trampoline_end - trampoline_data);
return virt_to_phys(trampoline_base);
}
#define NR_LOOPS 5
-extern unsigned long fast_gettimeoffset_quotient;
+extern unsigned int fast_gettimeoffset_quotient;
-/*
- * accurate 64-bit/32-bit division, expanded to 32-bit divisions and 64-bit
- * multiplication. Not terribly optimized but we need it at boot time only
- * anyway.
- *
- * result == a / b
- * == (a1 + a2*(2^32)) / b
- * == a1/b + a2*(2^32/b)
- * == a1/b + a2*((2^32-1)/b) + a2/b + (a2*((2^32-1) % b))/b
- * ^---- (this multiplication can overflow)
- */
-
-static unsigned long long div64 (unsigned long long a, unsigned long b0)
+static inline unsigned long long div64 (unsigned long long a, unsigned long b)
{
- unsigned int a1, a2;
- unsigned long long res;
-
- a1 = ((unsigned int*)&a)[0];
- a2 = ((unsigned int*)&a)[1];
-
- res = a1/b0 +
- (unsigned long long)a2 * (unsigned long long)(0xffffffff/b0) +
- a2 / b0 +
- (a2 * (0xffffffff % b0)) / b0;
-
- return res;
+ return a/b;
}
static void __init synchronize_tsc_bp (void)
if (tsc_values[i] < avg)
realdelta = -realdelta;
- printk("BIOS BUG: CPU#%d improperly initialized, has %ld usecs TSC skew! FIXED.\n", i, realdelta);
+ printk("BIOS BUG: CPU#%d improperly initialized, has %ld usecs TSC skew! FIXED.\n",
+ i, realdelta);
}
sum += delta;
}
if (!buggy)
printk("passed.\n");
- ;
}
static void __init synchronize_tsc_ap (void)
*/
smp_store_cpu_info(cpuid);
- disable_APIC_timer();
+ notify_die(DIE_CPUINIT, "cpuinit", NULL, 0);
+
/*
* Allow the master to continue.
*/
*/
int __init start_secondary(void *unused)
{
+ int var;
+ printk("rsp %p\n",&var);
+
/*
 * Don't put anything before smp_callin(); SMP
 * booting is so fragile that we want to limit the
smp_callin();
while (!atomic_read(&smp_commenced))
rep_nop();
- enable_APIC_timer();
/*
* low-memory mappings have been cleared, flush them from
* the local TLBs too.
*/
void __init initialize_secondary(void)
{
+ struct task_struct *me = stack_current();
+
/*
* We don't actually need to load the full TSS,
* basically just the stack pointer and the eip.
"movq %0,%%rsp\n\t"
"jmp *%1"
:
- :"r" (current->thread.rsp),"r" (current->thread.rip));
+ :"r" (me->thread.rsp),"r" (me->thread.rip));
}
-extern void *init_rsp;
+extern volatile unsigned long init_rsp;
extern void (*initial_code)(void);
static int __init fork_by_hand(void)
{
struct pt_regs regs;
/*
- * don't care about the eip and regs settings since
+ * don't care about the rip and regs settings since
* we'll never reschedule the forked task.
*/
return do_fork(CLONE_VM|CLONE_PID, 0, ®s, 0);
int timeout, num_starts, j, cpu;
unsigned long start_eip;
+ printk("do_boot_cpu cpucount = %d\n", cpucount);
+
cpu = ++cpucount;
/*
* We can't use kernel_thread since we must avoid to
x86_cpu_to_apicid[cpu] = apicid;
x86_apicid_to_cpu[apicid] = cpu;
- idle->thread.rip = (unsigned long) start_secondary;
-
- init_rsp = (void *) (THREAD_SIZE + (char *)idle->thread_info);
+ idle->thread.rip = (unsigned long)start_secondary;
+// idle->thread.rsp = (unsigned long)idle->thread_info + THREAD_SIZE - 512;
unhash_process(idle);
+
cpu_pda[cpu].pcurrent = idle;
- cpu_pda[cpu].kernelstack = init_rsp - PDA_STACKOFFSET;
/* start_eip had better be page-aligned! */
start_eip = setup_trampoline();
- /* So we see what's up */
- printk("Booting processor %d/%d eip %lx\n", cpu, apicid, start_eip);
-
+ init_rsp = (unsigned long)idle->thread_info + PAGE_SIZE + 1024;
initial_code = initialize_secondary;
+ printk("Booting processor %d/%d rip %lx rsp %lx rsp2 %lx\n", cpu, apicid,
+ start_eip, idle->thread.rsp, init_rsp);
+
/*
* This grunge runs the startup process for
* the targeted processor.
struct vm_area_struct *vma;
unsigned long end = TASK_SIZE;
+ if (test_thread_flag(TIF_IA32))
+ flags |= MAP_32BIT;
if (flags & MAP_32BIT)
- end = 0xffffffff;
- if (len > TASK_SIZE)
+ end = 0xffffffff-1;
+ if (len > end)
return -ENOMEM;
if (!addr) {
addr = TASK_UNMAPPED_64;
- if (test_thread_flag(TIF_IA32) || (flags & MAP_32BIT)) {
+ if (flags & MAP_32BIT) {
addr = TASK_UNMAPPED_32;
}
}
for (vma = find_vma(current->mm, addr); ; vma = vma->vm_next) {
/* At this point: (!vma || addr < vma->vm_end). */
- if (TASK_SIZE - len < addr)
+ if (end - len < addr)
return -ENOMEM;
if (!vma || addr + len <= vma->vm_start)
return addr;
schedule();
return -ERESTARTNOHAND;
}
+
+asmlinkage long wrap_sys_shmat(int shmid, char *shmaddr, int shmflg,
+ unsigned long *raddr_user)
+{
+ unsigned long raddr;
+ return sys_shmat(shmid,shmaddr,shmflg,&raddr) ?: put_user(raddr,raddr_user);
+}
+
+asmlinkage long wrap_sys_semctl(int semid, int semnum, int cmd, unsigned long *ptr)
+{
+ unsigned long val;
+ /* XXX: for cmd==SETVAL the manpage says ptr is the value directly. i386
+ seems to always get it via a pointer. Follow i386 here. Check this. */
+ if (get_user(val, ptr))
+ return -EFAULT;
+ return sys_semctl(semid, semnum, cmd, (union semun)(void *)val);
+}
#include <linux/linkage.h>
#include <linux/sys.h>
#include <linux/cache.h>
+#include <linux/config.h>
+
+/* No comment. */
+#if defined(CONFIG_NFSD) || defined(CONFIG_NFSD_MODULE)
+#else
+#define sys_nfsservctl sys_ni_syscall
+#endif
#define __NO_STUBS
#include <linux/irq.h>
-unsigned long cpu_khz; /* Detected as we calibrate the TSC */
+unsigned int cpu_khz; /* Detected as we calibrate the TSC */
/* Number of usecs that the last interrupt was delayed */
int __delay_at_last_interrupt __section_delay_at_last_interrupt;
-unsigned long __last_tsc_low __section_last_tsc_low; /* lsb 32 bits of Time Stamp Counter */
+unsigned int __last_tsc_low __section_last_tsc_low; /* lsb 32 bits of Time Stamp Counter */
/* Cached *multiplier* to convert TSC counts to microseconds.
* (see the equation below).
* Equal to 2^32 * (1 / (clocks per usec) ).
* Initialized in time_init.
*/
-unsigned long __fast_gettimeoffset_quotient __section_fast_gettimeoffset_quotient;
+unsigned int __fast_gettimeoffset_quotient __section_fast_gettimeoffset_quotient;
extern rwlock_t xtime_lock;
struct timeval __xtime __section_xtime;
spinlock_t rtc_lock = SPIN_LOCK_UNLOCKED;
-static inline unsigned long do_gettimeoffset(void)
+inline unsigned long do_gettimeoffset(void)
{
- register unsigned long eax, edx;
+ register unsigned int eax, edx;
/* Read the Time Stamp Counter */
* in the critical path.
*/
- edx = (eax*fast_gettimeoffset_quotient) >> 32;
+ __asm__("mull %2"
+ :"=a" (eax), "=d" (edx)
+ :"rm" (fast_gettimeoffset_quotient),
+ "0" (eax));
/* our adjusted time offset in microseconds */
return delay_at_last_interrupt + edx;
}
-
-
-
#define TICK_SIZE tick
spinlock_t i8253_lock = SPIN_LOCK_UNLOCKED;
extern spinlock_t i8259A_lock;
-
-static inline unsigned long do_fast_gettimeoffset(void)
-{
- register unsigned long eax, edx;
-
- /* Read the Time Stamp Counter */
-
- rdtsc(eax,edx);
-
- /* .. relative to previous jiffy (32 bits is enough) */
- eax -= last_tsc_low; /* tsc_low delta */
-
- /*
- * Time offset = (tsc_low delta) * fast_gettimeoffset_quotient
- * = (tsc_low delta) * (usecs_per_clock)
- * = (tsc_low delta) * (usecs_per_jiffy / clocks_per_jiffy)
- *
- * Using a mull instead of a divl saves up to 31 clock cycles
- * in the critical path.
- */
-
- edx = (eax*fast_gettimeoffset_quotient) >> 32;
-
- /* our adjusted time offset in microseconds */
- return delay_at_last_interrupt + edx;
-}
-
/*
* This version of gettimeofday has microsecond resolution
* and better than microsecond precision on fast x86 machines with TSC.
unsigned long get_cmos_time(void)
{
unsigned int year, mon, day, hour, min, sec;
+ int i;
+ spin_lock(&rtc_lock);
/* The Linux interpretation of the CMOS clock register contents:
* When the Update-In-Progress (UIP) flag goes from 1 to 0, the
* RTC registers show the second which has precisely just started.
* Let's hope other operating systems interpret the RTC the same way.
*/
-#ifndef CONFIG_SIMNOW
- int i;
- /* FIXME: This would take eons in emulated environment */
+
/* read RTC exactly on falling edge of update flag */
for (i = 0 ; i < 1000000 ; i++) /* may take up to 1 second... */
if (CMOS_READ(RTC_FREQ_SELECT) & RTC_UIP)
for (i = 0 ; i < 1000000 ; i++) /* must try at least 2.228 ms */
if (!(CMOS_READ(RTC_FREQ_SELECT) & RTC_UIP))
break;
-#endif
do { /* Isn't this overkill ? UIP above should guarantee consistency */
sec = CMOS_READ(RTC_SECONDS);
min = CMOS_READ(RTC_MINUTES);
BCD_TO_BIN(mon);
BCD_TO_BIN(year);
}
+ spin_unlock(&rtc_lock);
if ((year += 1900) < 1970)
year += 100;
return mktime(year, mon, day, hour, min, sec);
#define CALIBRATE_LATCH (5 * LATCH)
#define CALIBRATE_TIME (5 * 1000020/HZ)
+/* Could use 64-bit arithmetic on x86-64, but the code is too fragile */
static unsigned long __init calibrate_tsc(void)
{
/* Set the Gate high, disable speaker */
outb(CALIBRATE_LATCH >> 8, 0x42); /* MSB of count */
{
- unsigned long start;
- unsigned long end;
- unsigned long count;
-
- {
- int low, high;
- rdtsc(low,high);
- start = ((u64)high)<<32 | low;
- }
+ unsigned int startlow, starthigh;
+ unsigned int endlow, endhigh;
+ unsigned int count;
+
+ rdtsc(startlow,starthigh);
count = 0;
do {
count++;
} while ((inb(0x61) & 0x20) == 0);
+ rdtsc(endlow,endhigh);
- {
- int low, high;
- rdtsc(low,high);
- end = ((u64)high)<<32 | low;
- last_tsc_low = low;
- }
-
+ last_tsc_low = endlow;
/* Error: ECTCNEVERSET */
if (count <= 1)
goto bad_ctc;
- end -= start;
+ __asm__("subl %2,%0\n\t"
+ "sbbl %3,%1"
+ :"=a" (endlow), "=d" (endhigh)
+ :"g" (startlow), "g" (starthigh),
+ "0" (endlow), "1" (endhigh));
+ /* Error: ECPUTOOFAST */
+ if (endhigh)
+ goto bad_ctc;
/* Error: ECPUTOOSLOW */
- if (end <= CALIBRATE_TIME)
+ if (endlow <= CALIBRATE_TIME)
goto bad_ctc;
- end = (((u64)CALIBRATE_TIME)<<32)/end;
- return end;
+ __asm__("divl %2"
+ :"=a" (endlow), "=d" (endhigh)
+ :"r" (endlow), "0" (0), "1" (CALIBRATE_TIME));
+
+ return endlow;
}
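In portable C, the two inline-asm statements above (the 64-bit subtract and the divl) amount to the following sketch, with the same overflow checks; names are illustrative, not the kernel's:

```c
#include <stdint.h>

/* Returns (calibrate_time_us << 32) / elapsed_ticks, i.e. microseconds
 * per TSC tick in 32.32 fixed point, or 0 on the ECPUTOOFAST /
 * ECPUTOOSLOW error paths. */
static uint32_t tsc_quotient(uint64_t tsc_start, uint64_t tsc_end,
			     uint32_t calibrate_time_us)
{
	uint64_t delta = tsc_end - tsc_start;

	if (delta >> 32)		/* CPU too fast: delta won't fit divl */
		return 0;
	if (delta <= calibrate_time_us)	/* CPU too slow */
		return 0;
	return (uint32_t)(((uint64_t)calibrate_time_us << 32) / delta);
}
```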
/*
*/
if (cpu_has_tsc) {
- unsigned long tsc_quotient = calibrate_tsc();
+ unsigned int tsc_quotient = calibrate_tsc();
if (tsc_quotient) {
fast_gettimeoffset_quotient = tsc_quotient;
use_tsc = 1;
* The formula is (10^6 * 2^32) / (2^32 * 1 / (clocks/us)) =
* clock/second. Our precision is about 100 ppm.
*/
- {
- cpu_khz = ((1000*(1UL<<32)) / tsc_quotient);
- printk("Detected %ld Hz processor.\n", cpu_khz);
+ { unsigned int eax=0, edx=1000;
+ __asm__("divl %2"
+ :"=a" (cpu_khz), "=d" (edx)
+ :"r" (tsc_quotient),
+ "0" (eax), "1" (edx));
+ printk("Detected %u.%03u MHz processor.\n",
+ cpu_khz / 1000, cpu_khz % 1000);
}
}
}
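The divl with edx:eax = 1000:0 above computes cpu_khz = (1000 * 2^32) / quotient; a C sketch of the same arithmetic:

```c
#include <stdint.h>

/* quotient = (us << 32) / ticks, so (1000 << 32) / quotient gives
 * ticks per millisecond, i.e. the CPU frequency in kHz. */
static uint32_t khz_from_quotient(uint32_t quotient)
{
	return (uint32_t)((1000ULL << 32) / quotient);
}
```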
lidt idt_48 - r_base # load idt with 0, 0
lgdt gdt_48 - r_base # load gdt with whatever is appropriate
+ movw $__KERNEL_DS,%ax
+ movw %ax,%ds
+ movw %ax,%es
+
xor %ax, %ax
inc %ax # protected mode (PE) bit
lmsw %ax # into protected mode
jmp flush_instr
flush_instr:
- ljmpl $__KERNEL_CS, $0x00100000
- # jump to startup_32 in arch/i386/kernel/head.S
+ ljmpl $__KERNEL32_CS, $0x00100000
+ # jump to startup_32 in arch/x86_64/kernel/head.S
idt_48:
.word 0 # idt limit = 0
# relocations.
-.globl SYMBOL_NAME(trampoline_end)
-SYMBOL_NAME_LABEL(trampoline_end)
+.globl trampoline_end
+trampoline_end:
*
* Pentium III FXSR, SSE support
* Gareth Hughes <gareth@valinux.com>, May 2000
+ *
+ * $Id: traps.c,v 1.36 2002/03/24 11:09:10 ak Exp $
*/
/*
- * 'traps.c' handles hardware traps and faults after we have saved some
+ * 'Traps.c' handles hardware traps and faults after we have saved some
* state in 'entry.S'.
*/
#include <linux/config.h>
#include <asm/desc.h>
#include <asm/i387.h>
#include <asm/kdebug.h>
-#include <asm/uaccess.h>
#include <asm/smp.h>
#include <asm/pgalloc.h>
char stackfault_stack[4*1024];
#endif
-/* We really need to get rid of F0 0F workaround. Setting up new idts
- is not an option */
-
-struct desc_struct default_ldt[] = { { 0, 0 }, { 0, 0 }, { 0, 0 },
- { 0, 0 }, { 0, 0 } };
-
extern struct gate_struct idt_table[256];
asmlinkage void divide_error(void);
asmlinkage void simd_coprocessor_error(void);
asmlinkage void reserved(void);
asmlinkage void alignment_check(void);
+asmlinkage void machine_check(void);
asmlinkage void spurious_interrupt_bug(void);
+asmlinkage void call_debug(void);
+
+extern char iret_address[];
struct notifier_block *die_chain;
-int kstack_depth_to_print = 24;
+int kstack_depth_to_print = 10;
+
+#ifdef CONFIG_KALLSYMS
+#include <linux/kallsyms.h>
+int printk_address(unsigned long address)
+{
+ unsigned long dummy;
+ const char *modname, *secname, *symname;
+ unsigned long symstart;
+ char *delim = ":";
+
+ /* What a function call! */
+ if (!kallsyms_address_to_symbol(address,
+ &modname, &dummy, &dummy,
+ &secname, &dummy, &dummy,
+ &symname, &symstart, &dummy)) {
+ return printk("[<%016lx>]", address);
+ }
+ if (!strcmp(modname, "kernel"))
+ modname = delim = "";
+ return printk("[%016lx%s%s%s%s%+ld]",
+ address,delim,modname,delim,symname,address-symstart);
+}
+#else
+int printk_address(unsigned long address)
+{
+ return printk("[<%016lx>]", address);
+}
+#endif
+
#ifdef CONFIG_MODULES
printk("\nCall Trace: ");
- irqstack = (unsigned long *) &(cpu_pda[cpu].irqstack);
- irqstack_end = (unsigned long *) ((char *)irqstack + sizeof_field(struct x8664_pda, irqstack));
+ irqstack_end = (unsigned long *) (cpu_pda[cpu].irqstackptr);
+ irqstack = (unsigned long *) (cpu_pda[cpu].irqstackptr - IRQSTACKSIZE + 64);
i = 1;
if (stack >= irqstack && stack < irqstack_end) {
+ unsigned long *tstack;
while (stack < irqstack_end) {
addr = *stack++;
/*
* out the call path that was taken.
*/
if (kernel_text_address(addr)) {
- if (i && ((i % 6) == 0))
+ i += printk_address(addr);
+ i += printk(" ");
+ if (i > 50) {
printk("\n ");
- printk("[<%016lx>] ", addr);
- i++;
+ i = 0;
+ }
}
}
stack = (unsigned long *) (irqstack_end[-1]);
printk(" <EOI> ");
#if 1
- if (stack < (unsigned long *)current ||
- (char*)stack > ((char*)current->thread_info)+THREAD_SIZE)
+ tstack = (unsigned long *)(current_thread_info()+1);
+ if (stack < tstack || (char*)stack > (char*)tstack+THREAD_SIZE)
printk("\n" KERN_DEBUG
- "no stack at the end of irqstack; stack:%p, cur:%p/%p\n",
- stack, current, ((char*)current)+THREAD_SIZE);
+		"no stack at the end of irqstack; stack:%p, curstack %p\n",
+		stack, tstack);

#endif
}
while (((long) stack & (THREAD_SIZE-1)) != 0) {
addr = *stack++;
- /*
- * If the address is either in the text segment of the
- * kernel, or in the region which contains vmalloc'ed
- * memory, it *may* be the address of a calling
- * routine; if so, print it so that someone tracing
- * down the cause of the crash will be able to figure
- * out the call path that was taken.
- */
if (kernel_text_address(addr)) {
- if (i && ((i % 6) == 0))
+ i += printk_address(addr);
+ i += printk(" ");
+ if (i > 50) {
printk("\n ");
- printk("[<%016lx>] ", addr);
- i++;
+ i = 0;
+ }
}
}
printk("\n");
{
unsigned long *stack;
int i;
+ const int cpu = smp_processor_id();
+ unsigned long *irqstack_end = (unsigned long *) (cpu_pda[cpu].irqstackptr);
+ unsigned long *irqstack = (unsigned long *) (cpu_pda[cpu].irqstackptr - IRQSTACKSIZE);
// debugging aid: "show_stack(NULL);" prints the
// back trace for this cpu.
stack = rsp;
for(i=0; i < kstack_depth_to_print; i++) {
+ if (stack >= irqstack && stack <= irqstack_end) {
+ if (stack == irqstack_end) {
+ stack = (unsigned long *) (irqstack_end[-1]);
+ printk(" <EOI> ");
+ }
+ } else {
if (((long) stack & (THREAD_SIZE-1)) == 0)
break;
- if (i && ((i % 8) == 0))
+ }
+ if (i && ((i % 4) == 0))
printk("\n ");
printk("%016lx ", *stack++);
}
+ show_trace((unsigned long *)rsp);
}
-extern void dump_pagetable(void);
-
void show_registers(struct pt_regs *regs)
{
int i;
}
printk("CPU %d ", cpu);
show_regs(regs);
- printk("Process %s (pid: %d, thread_info:%p task:%p)\n",
- cur->comm, cur->pid, cur->thread_info, cur);
-
- dump_pagetable();
+ printk("Process %s (pid: %d, stackpage=%08lx)\n",
+ cur->comm, cur->pid, 4096+(unsigned long)cur);
/*
* When in-kernel, we also print out the stack and code at the
}
}
printk("\n");
- show_trace((unsigned long *)rsp);
}
void handle_BUG(struct pt_regs *regs)
struct bug_frame f;
char tmp;
- if ((regs->cs & 3) || regs->rip < __PAGE_OFFSET)
+ if (regs->cs & 3)
return;
- if (__copy_from_user(&f, (struct bug_frame *) regs->rip, sizeof(struct bug_frame)))
+ if (__copy_from_user(&f, (struct bug_frame *) regs->rip,
+ sizeof(struct bug_frame)))
return;
- if ((unsigned long)f.filename < __PAGE_OFFSET)
+ if ((unsigned long)f.filename < __PAGE_OFFSET ||
+ f.ud2[0] != 0x0f || f.ud2[1] != 0x0b)
return;
if (__get_user(tmp, f.filename))
f.filename = "unmapped filename";
- printk("Kernel BUG at %.30s:%d\n", f.filename, f.line);
+ printk("Kernel BUG at %.50s:%d\n", f.filename, f.line);
+}
+
+void out_of_line_bug(void)
+{
+ BUG();
}
spinlock_t die_lock = SPIN_LOCK_UNLOCKED;
+int die_owner = -1;
void die(const char * str, struct pt_regs * regs, long err)
{
+ int cpu;
struct die_args args = { regs, str, err };
console_verbose();
notifier_call_chain(&die_chain, DIE_DIE, &args);
- spin_lock_irq(&die_lock);
bust_spinlocks(1);
handle_BUG(regs);
printk("%s: %04lx\n", str, err & 0xffff);
+ cpu = smp_processor_id();
+ /* racy, but better than risking deadlock. */
+ __cli();
+ if (!spin_trylock(&die_lock)) {
+ if (cpu == die_owner)
+ /* nested oops. should stop eventually */;
+ else
+ spin_lock(&die_lock);
+ }
+ die_owner = cpu;
show_registers(regs);
bust_spinlocks(0);
spin_unlock_irq(&die_lock);
+ notify_die(DIE_OOPS, (char *)str, regs, err);
do_exit(SIGSEGV);
}
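The trylock dance above avoids a self-deadlock when an oops happens inside die() itself. A self-contained sketch of that policy, with hypothetical stand-ins for spin_trylock/spin_lock (not the kernel primitives):

```c
#include <stdbool.h>

static bool lock_held;
static int  blocking_lock_calls;
static int  die_owner_cpu = -1;

static bool try_lock(void)
{
	if (lock_held)
		return false;
	lock_held = true;
	return true;
}

static void blocking_lock(void)
{
	blocking_lock_calls++;	/* would spin here in reality */
	lock_held = true;
}

/* die()'s policy: if trylock fails and this CPU already owns the lock
 * (a nested oops), continue without it rather than deadlock;
 * otherwise block normally behind the other CPU's oops. */
static void die_lock_acquire(int cpu)
{
	if (!try_lock()) {
		if (cpu != die_owner_cpu)
			blocking_lock();
		/* else: nested oops on this CPU, proceed unlocked */
	}
	die_owner_cpu = cpu;
}
```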
static inline void die_if_kernel(const char * str, struct pt_regs * regs, long err)
{
- if (!(regs->eflags & VM_MASK) && (regs->rip >= TASK_SIZE))
+ if (!(regs->eflags & VM_MASK) && (regs->cs == __KERNEL_CS))
die(str, regs, err);
}
return address;
}
-static void inline do_trap(int trapnr, int signr, char *str, int vm86,
+static void do_trap(int trapnr, int signr, char *str,
struct pt_regs * regs, long error_code, siginfo_t *info)
{
- if ((regs->cs & 3) == 0)
- goto kernel_trap;
-
+ if ((regs->cs & 3) != 0) {
+ struct task_struct *tsk = current;
-#if 0
- printk("%d/%s trap %d sig %d %s rip:%lx rsp:%lx error_code:%lx\n",
- current->pid, current->comm,
- trapnr, signr, str, regs->rip, regs->rsp, error_code);
-#endif
+ if (trapnr != 3)
+ printk("%s[%d] trap %s at rip:%lx rsp:%lx err:%lx\n",
+ tsk->comm, tsk->pid, str, regs->rip, regs->rsp, error_code);
- {
- struct task_struct *tsk = current;
tsk->thread.error_code = error_code;
tsk->thread.trap_no = trapnr;
if (info)
return;
}
- kernel_trap: {
+
+ /* kernel trap */
+ {
unsigned long fixup = search_exception_table(regs->rip);
- if (fixup)
+ if (fixup) {
+ extern int exception_trace;
+ if (exception_trace)
+ printk(KERN_ERR
+ "%s: fixed kernel exception at %lx err:%ld\n",
+ current->comm, regs->rip, error_code);
+
regs->rip = fixup;
- else
+ } else
die(str, regs, error_code);
return;
}
#define DO_ERROR(trapnr, signr, str, name) \
asmlinkage void do_##name(struct pt_regs * regs, long error_code) \
{ \
- do_trap(trapnr, signr, str, 0, regs, error_code, NULL); \
+ do_trap(trapnr, signr, str, regs, error_code, NULL); \
}
#define DO_ERROR_INFO(trapnr, signr, str, name, sicode, siaddr) \
info.si_errno = 0; \
info.si_code = sicode; \
info.si_addr = (void *)siaddr; \
- do_trap(trapnr, signr, str, 0, regs, error_code, &info); \
+ do_trap(trapnr, signr, str, regs, error_code, &info); \
}
-#define DO_VM86_ERROR(trapnr, signr, str, name) \
-asmlinkage void do_##name(struct pt_regs * regs, long error_code) \
-{ \
- do_trap(trapnr, signr, str, 1, regs, error_code, NULL); \
-}
-
-#define DO_VM86_ERROR_INFO(trapnr, signr, str, name, sicode, siaddr) \
-asmlinkage void do_##name(struct pt_regs * regs, long error_code) \
-{ \
- siginfo_t info; \
- info.si_signo = signr; \
- info.si_errno = 0; \
- info.si_code = sicode; \
- info.si_addr = (void *)siaddr; \
- do_trap(trapnr, signr, str, 1, regs, error_code, &info); \
-}
-
-DO_VM86_ERROR_INFO( 0, SIGFPE, "divide error", divide_error, FPE_INTDIV, regs->rip)
-DO_VM86_ERROR( 4, SIGSEGV, "overflow", overflow)
-DO_VM86_ERROR( 5, SIGSEGV, "bounds", bounds)
+DO_ERROR_INFO( 0, SIGFPE, "divide error", divide_error, FPE_INTDIV, regs->rip)
+DO_ERROR( 4, SIGSEGV, "overflow", overflow)
+DO_ERROR( 5, SIGSEGV, "bounds", bounds)
DO_ERROR_INFO( 6, SIGILL, "invalid operand", invalid_op, ILL_ILLOPN, regs->rip)
-DO_VM86_ERROR( 7, SIGSEGV, "device not available", device_not_available)
+DO_ERROR( 7, SIGSEGV, "device not available", device_not_available)
DO_ERROR( 8, SIGSEGV, "double fault", double_fault)
DO_ERROR( 9, SIGFPE, "coprocessor segment overrun", coprocessor_segment_overrun)
DO_ERROR(10, SIGSEGV, "invalid TSS", invalid_TSS)
asmlinkage void do_int3(struct pt_regs * regs, long error_code)
{
- struct die_args args = { regs, "int3", error_code };
- notifier_call_chain(&die_chain, DIE_INT3, &args);
- do_trap(3, SIGTRAP, "int3", 1, regs, error_code, NULL);
+ if (notify_die(DIE_INT3, "int3", regs, error_code) == NOTIFY_BAD)
+ return;
+ do_trap(3, SIGTRAP, "int3", regs, error_code, NULL);
}
+extern void dump_pagetable(unsigned long);
+
asmlinkage void do_general_protection(struct pt_regs * regs, long error_code)
{
- if ((regs->cs & 3)==0)
- goto gp_in_kernel;
-
+ if ((regs->cs & 3)!=0) {
current->thread.error_code = error_code;
current->thread.trap_no = 13;
force_sig(SIGSEGV, current);
return;
+ }
-gp_in_kernel:
+ /* kernel gp */
{
unsigned long fixup;
fixup = search_exception_table(regs->rip);
regs->rip = fixup;
return;
}
+// dump_pagetable(regs->rip);
die("general protection fault", regs, error_code);
}
}
unknown_nmi_error(reason, regs);
return;
}
+ if (notify_die(DIE_NMI, "nmi", regs, reason) == NOTIFY_BAD)
+ return;
if (reason & 0x80)
mem_parity_error(reason, regs);
if (reason & 0x40)
asm("movq %%db6,%0" : "=r" (condition));
+ if (notify_die(DIE_DEBUG, "debug", regs, error_code) == NOTIFY_BAD)
+ return;
+
/* Mask out spurious debug traps due to lazy DR7 setting */
if (condition & (DR_TRAP0|DR_TRAP1|DR_TRAP2|DR_TRAP3)) {
if (!tsk->thread.debugreg[7]) {
* allowing programs to debug themselves without the ptrace()
* interface.
*/
+ if ((regs->cs & 3) == 0)
+ goto clear_TF;
if ((tsk->ptrace & (PT_DTRACE|PT_PTRACED)) == PT_DTRACE)
goto clear_TF;
}
/* Ok, finally something we can handle */
- /* XXX: add die_chain here */
tsk->thread.trap_no = 1;
tsk->thread.error_code = error_code;
info.si_signo = SIGTRAP;
* the correct behaviour even in the presence of the asynchronous
* IRQ13 behaviour
*/
-void math_error(void *eip)
+void math_error(void *rip)
{
struct task_struct * task;
siginfo_t info;
info.si_signo = SIGFPE;
info.si_errno = 0;
info.si_code = __SI_FAULT;
- info.si_addr = eip;
+ info.si_addr = rip;
/*
* (~cwd & swd) will mask out exceptions that are not set to unmasked
* status. 0x3f is the exception bits in these regs, 0x200 is the
asmlinkage void do_coprocessor_error(struct pt_regs * regs, long error_code)
{
- ignore_irq13 = 1;
math_error((void *)regs->rip);
}
printk("bad interrupt");
}
-void simd_math_error(void *eip)
+static inline void simd_math_error(void *rip)
{
struct task_struct * task;
siginfo_t info;
info.si_signo = SIGFPE;
info.si_errno = 0;
info.si_code = __SI_FAULT;
- info.si_addr = eip;
+ info.si_addr = rip;
/*
* The SIMD FPU exceptions are handled a little differently, as there
* is only a single status/control register. Thus, to determine which
asmlinkage void do_simd_coprocessor_error(struct pt_regs * regs,
long error_code)
{
- if (cpu_has_xmm) {
- /* Handle SIMD FPU exceptions on PIII+ processors. */
- ignore_irq13 = 1;
simd_math_error((void *)regs->rip);
- } else {
- /*
- * Handle strange cache flush from user space exception
- * in all other cases. This is undocumented behaviour.
- */
- die_if_kernel("cache flush denied", regs, error_code);
- current->thread.trap_no = 19;
- current->thread.error_code = error_code;
- force_sig(SIGSEGV, current);
- }
}
-asmlinkage void do_spurious_interrupt_bug(struct pt_regs * regs,
- long error_code)
+asmlinkage void do_spurious_interrupt_bug(struct pt_regs * regs)
{
-#if 0
- /* No need to warn about this any longer. */
- printk("Ignoring P6 Local APIC Spurious Interrupt Bug...\n");
-#endif
}
/*
*
* Careful.. There are problems with IBM-designed IRQ13 behaviour.
* Don't touch unless you *really* know how it works.
- *
- * Must be called with kernel preemption disabled.
*/
asmlinkage void math_state_restore(void)
{
+ struct task_struct *me = current;
clts(); /* Allow maths ops (or we recurse) */
- if (current->used_math) {
- restore_fpu(current);
+ if (me->used_math) {
+ restore_fpu_checking(&me->thread.i387.fxsave);
} else {
init_fpu();
}
asmlinkage void math_emulate(void)
{
- printk("math-emulation not enabled and no coprocessor found.\n");
- printk("killing %s.\n",current->comm);
- force_sig(SIGFPE,current);
- schedule();
+ BUG();
+}
+
+void do_call_debug(struct pt_regs *regs)
+{
+ notify_die(DIE_CALL, "debug call", regs, 0);
}
void __init trap_init(void)
{
	set_intr_gate(0,&divide_error);
set_intr_gate(1,&debug);
- set_intr_gate(2,&nmi);
+ set_intr_gate_ist(2,&nmi,NMI_STACK);
set_system_gate(3,&int3); /* int3-5 can be called from all */
set_system_gate(4,&overflow);
set_system_gate(5,&bounds);
set_intr_gate(6,&invalid_op);
set_intr_gate(7,&device_not_available);
- set_intr_gate_ist(8,&double_fault, 1);
+ set_intr_gate_ist(8,&double_fault, DOUBLEFAULT_STACK);
set_intr_gate(9,&coprocessor_segment_overrun);
set_intr_gate(10,&invalid_TSS);
set_intr_gate(11,&segment_not_present);
set_intr_gate(15,&spurious_interrupt_bug);
set_intr_gate(16,&coprocessor_error);
set_intr_gate(17,&alignment_check);
+ set_intr_gate(18,&machine_check);
set_intr_gate(19,&simd_coprocessor_error);
#ifdef CONFIG_IA32_EMULATION
set_intr_gate(IA32_SYSCALL_VECTOR, ia32_syscall);
#endif
-#if 0
- /*
- * default LDT is a single-entry callgate to lcall7 for iBCS
- * and a callgate to lcall27 for Solaris/x86 binaries
- */
- set_call_gate(&default_ldt[0],lcall7);
- set_call_gate(&default_ldt[4],lcall27);
-#endif
+ set_intr_gate(KDB_VECTOR, call_debug);
+
+ notify_die(DIE_TRAPINIT, "traps initialized", 0, 0);
/*
* Should be a barrier for any external CPU state.
*/
cpu_init();
}
-
* vsyscalls. One vsyscall can reserve more than 1 slot to avoid
* jumping out of line if necessary.
*
- * $Id: vsyscall.c,v 1.4 2001/09/27 17:58:13 ak Exp $
+ * $Id: vsyscall.c,v 1.9 2002/03/21 13:42:58 ak Exp $
*/
/*
#include <asm/fixmap.h>
#include <asm/errno.h>
+
#define __vsyscall(nr) __attribute__ ((unused,__section__(".vsyscall_" #nr)))
+//#define NO_VSYSCALL 1
+
+#ifdef NO_VSYSCALL
+#include <asm/unistd.h>
+
+static int errno __section_vxtime_sequence;
+
+__syscall2(static inline int,int,gettimeofday,struct timeval *,tv,struct timezone *,tz)
+
+#else
static inline void timeval_normalize(struct timeval * tv)
{
time_t __sec;
long __vxtime_sequence[2] __section_vxtime_sequence;
-inline void do_vgettimeofday(struct timeval * tv)
+static inline void do_vgettimeofday(struct timeval * tv)
{
long sequence;
unsigned long usec, sec;
rmb();
} while (sequence != __vxtime_sequence[0]);
}
+#endif
static int __vsyscall(0) vgettimeofday(struct timeval * tv, struct timezone * tz)
{
+#ifdef NO_VSYSCALL
+ return gettimeofday(tv,tz);
+#else
if (tv)
do_vgettimeofday(tv);
if (tz)
do_get_tz(tz);
return 0;
+#endif
}
-static time_t __vsyscall(1) vtime(time_t * time)
+static time_t __vsyscall(1) vtime(time_t * t)
{
+#ifdef NO_VSYSCALL
+ struct timeval tv;
+ gettimeofday(&tv,NULL);
+ if (t) *t = tv.tv_sec;
+ return tv.tv_sec;
+#else
long sequence;
time_t __time;
rmb();
} while (sequence != __vxtime_sequence[0]);
- if (time)
- *time = __time;
+ if (t)
+ *t = __time;
return __time;
+#endif
}
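The retry loops in vgettimeofday/vtime are a lockless sequence read: assuming the writer bumps __vxtime_sequence[0] before and [1] after updating, a reader retries until both counters match. A single-threaded sketch (the rmb() barriers, required on real hardware, are elided):

```c
static long seq[2];	/* stand-in for __vxtime_sequence */
static long value;	/* the datum published by the writer */

static long read_value(void)
{
	long s, v;

	do {
		s = seq[1];
		/* rmb(); */
		v = value;
		/* rmb(); */
	} while (s != seq[0]);	/* writer active: torn read possible */
	return v;
}
```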
static long __vsyscall(2) venosys_0(void)
#include <linux/module.h>
#include <linux/smp.h>
#include <linux/user.h>
-#include <linux/elfcore.h>
#include <linux/mca.h>
#include <linux/sched.h>
#include <linux/in6.h>
#include <asm/desc.h>
#include <asm/pgtable.h>
#include <asm/pgalloc.h>
+#include <asm/kdebug.h>
-extern void dump_thread(struct pt_regs *, struct user *);
extern spinlock_t rtc_lock;
-#if defined(CONFIG_APM) || defined(CONFIG_APM_MODULE)
-extern void machine_real_restart(unsigned char *, int);
-EXPORT_SYMBOL(machine_real_restart);
-#endif
-
#ifdef CONFIG_SMP
-extern void FASTCALL( __write_lock_failed(rwlock_t *rw));
-extern void FASTCALL( __read_lock_failed(rwlock_t *rw));
+extern void __write_lock_failed(rwlock_t *rw);
+extern void __read_lock_failed(rwlock_t *rw);
#endif
#if defined(CONFIG_BLK_DEV_IDE) || defined(CONFIG_BLK_DEV_HD) || defined(CONFIG_BLK_DEV_IDE_MODULE) || defined(CONFIG_BLK_DEV_HD_MODULE)
/* platform dependent support */
EXPORT_SYMBOL(boot_cpu_data);
-EXPORT_SYMBOL(dump_thread);
EXPORT_SYMBOL(dump_fpu);
EXPORT_SYMBOL(__ioremap);
EXPORT_SYMBOL(iounmap);
EXPORT_SYMBOL(pm_idle);
EXPORT_SYMBOL(pm_power_off);
EXPORT_SYMBOL(get_cmos_time);
-EXPORT_SYMBOL(apm_info);
#ifdef CONFIG_IO_DEBUG
EXPORT_SYMBOL(__io_virt_debug);
EXPORT_SYMBOL_NOVERS(__get_user_1);
EXPORT_SYMBOL_NOVERS(__get_user_2);
EXPORT_SYMBOL_NOVERS(__get_user_4);
+EXPORT_SYMBOL_NOVERS(__get_user_8);
EXPORT_SYMBOL_NOVERS(__put_user_1);
EXPORT_SYMBOL_NOVERS(__put_user_2);
EXPORT_SYMBOL_NOVERS(__put_user_4);
+EXPORT_SYMBOL_NOVERS(__put_user_8);
EXPORT_SYMBOL(strpbrk);
EXPORT_SYMBOL(strstr);
EXPORT_SYMBOL(__global_restore_flags);
EXPORT_SYMBOL(smp_call_function);
-/* TLB flushing */
-EXPORT_SYMBOL(flush_tlb_page);
#endif
#ifdef CONFIG_MCA
EXPORT_SYMBOL(rtc_lock);
+/* Export string functions. We normally rely on gcc builtins for most of these,
+   but gcc sometimes decides not to inline them. */
#undef memcpy
#undef memset
+#undef memmove
+#undef memchr
+#undef strlen
+#undef strcpy
+#undef strncmp
+#undef strncpy
+#undef strchr
+#undef strcmp
+#undef bcopy
+
extern void * memset(void *,int,__kernel_size_t);
-extern void * memcpy(void *,const void *,__kernel_size_t);
-EXPORT_SYMBOL_NOVERS(memcpy);
+extern size_t strlen(const char *);
+extern char * bcopy(const char * src, char * dest, int count);
+extern void * memmove(void * dest,const void *src,size_t count);
+extern char * strcpy(char * dest,const char *src);
+extern int strcmp(const char * cs,const char * ct);
+extern void *memchr(const void *s, int c, size_t n);
EXPORT_SYMBOL_NOVERS(memset);
+EXPORT_SYMBOL_NOVERS(strlen);
+EXPORT_SYMBOL_NOVERS(memmove);
+EXPORT_SYMBOL_NOVERS(strcpy);
+EXPORT_SYMBOL_NOVERS(strncmp);
+EXPORT_SYMBOL_NOVERS(strncpy);
+EXPORT_SYMBOL_NOVERS(strchr);
+EXPORT_SYMBOL_NOVERS(strcmp);
+EXPORT_SYMBOL_NOVERS(strcat);
+EXPORT_SYMBOL_NOVERS(strncat);
+EXPORT_SYMBOL_NOVERS(memchr);
+EXPORT_SYMBOL_NOVERS(strrchr);
+EXPORT_SYMBOL_NOVERS(strnlen);
+EXPORT_SYMBOL_NOVERS(memscan);
+EXPORT_SYMBOL_NOVERS(bcopy);
EXPORT_SYMBOL(empty_zero_page);
EXPORT_SYMBOL(atomic_dec_and_lock);
#endif
+EXPORT_SYMBOL(die_chain);
+
+extern void do_softirq_thunk(void);
+EXPORT_SYMBOL_NOVERS(do_softirq_thunk);
+
+void out_of_line_bug(void);
+EXPORT_SYMBOL(out_of_line_bug);
L_TARGET = lib.a
obj-y = generic-checksum.o old-checksum.o delay.o \
usercopy.o getuser.o putuser.o \
- checksum_copy.o rwsem_thunk.o
+ checksum_copy.o thunk.o mmx.o
obj-$(CONFIG_IO_DEBUG) += iodebug.o
-obj-$(CONFIG_X86_USE_3DNOW) += mmx.o
obj-$(CONFIG_HAVE_DEC_LOCK) += dec_and_lock.o
include $(TOPDIR)/Rules.make
-#include <linux/config.h>
#include <linux/types.h>
#include <linux/string.h>
#include <linux/sched.h>
+#include <linux/compiler.h>
#include <asm/i387.h>
#include <asm/hardirq.h>
-
+#include <asm/page.h>
/*
* MMX 3DNow! library helper functions
* 22/09/2000 - Arjan van de Ven
 * Improved for non-engineering-sample Athlons
*
+ * 2002 Andi Kleen. Some cleanups and changes for x86-64.
+ * Not really tuned yet. Using the Athlon version for now.
+ * This currently uses MMX for 8 byte stores, but on Hammer we could
+ * use integer 8 byte stores too and avoid the FPU save overhead.
+ * Disadvantage is that the integer loads/stores have a stronger
+ * ordering model and may be slower.
+ *
+ * $Id$
*/
-#error Don't use these for now, but we'll have to provide optimized functions in future
+#ifdef MMX_MEMCPY_THRESH
+
void *_mmx_memcpy(void *to, const void *from, size_t len)
{
void *p;
int i;
- if (in_interrupt())
- return __memcpy(to, from, len);
-
p = to;
+
+ if (unlikely(in_interrupt()))
+ goto standard;
+
+	/* XXX: check if this is still memory bound with unaligned to/from.
+	   If not, align them here to 8 bytes. */
i = len >> 6; /* len/64 */
kernel_fpu_begin();
__asm__ __volatile__ (
- "1: prefetch (%0)\n" /* This set is 28 bytes */
+ " prefetch (%0)\n" /* This set is 28 bytes */
" prefetch 64(%0)\n"
" prefetch 128(%0)\n"
" prefetch 192(%0)\n"
" prefetch 256(%0)\n"
- "2: \n"
- ".section .fixup, \"ax\"\n"
- "3: movw $0x1AEB, 1b\n" /* jmp on 26 bytes */
- " jmp 2b\n"
- ".previous\n"
- ".section __ex_table,\"a\"\n"
- " .align 4\n"
- " .long 1b, 3b\n"
- ".previous"
+ "\n"
: : "r" (from) );
for(; i>0; i--)
{
__asm__ __volatile__ (
- "1: prefetch 320(%0)\n"
- "2: movq (%0), %%mm0\n"
+ " prefetch 320(%0)\n"
+ " movq (%0), %%mm0\n"
" movq 8(%0), %%mm1\n"
" movq 16(%0), %%mm2\n"
" movq 24(%0), %%mm3\n"
" movq %%mm1, 40(%1)\n"
" movq %%mm2, 48(%1)\n"
" movq %%mm3, 56(%1)\n"
- ".section .fixup, \"ax\"\n"
- "3: movw $0x05EB, 1b\n" /* jmp on 5 bytes */
- " jmp 2b\n"
- ".previous\n"
- ".section __ex_table,\"a\"\n"
- " .align 4\n"
- " .long 1b, 3b\n"
- ".previous"
: : "r" (from), "r" (to) : "memory");
from+=64;
to+=64;
}
+ len &= 63;
+ kernel_fpu_end();
+
/*
* Now do the tail of the block
*/
- __memcpy(to, from, len&63);
- kernel_fpu_end();
+
+ standard:
+ __inline_memcpy(to, from, len);
return p;
}
+#endif
-#ifdef CONFIG_MK7
-
-/*
- * The K7 has streaming cache bypass load/store. The Cyrix III, K6 and
- * other MMX using processors do not.
- */
-
-static void fast_clear_page(void *page)
+static inline void fast_clear_page(void *page)
{
int i;
kernel_fpu_end();
}
-static void fast_copy_page(void *to, void *from)
+static inline void fast_copy_page(void *to, void *from)
{
int i;
* but that is for later. -AV
*/
__asm__ __volatile__ (
- "1: prefetch (%0)\n"
+ " prefetch (%0)\n"
" prefetch 64(%0)\n"
" prefetch 128(%0)\n"
" prefetch 192(%0)\n"
" prefetch 256(%0)\n"
- "2: \n"
- ".section .fixup, \"ax\"\n"
- "3: movw $0x1AEB, 1b\n" /* jmp on 26 bytes */
- " jmp 2b\n"
- ".previous\n"
- ".section __ex_table,\"a\"\n"
- " .align 4\n"
- " .long 1b, 3b\n"
- ".previous"
: : "r" (from) );
for(i=0; i<(4096-320)/64; i++)
{
__asm__ __volatile__ (
- "1: prefetch 320(%0)\n"
- "2: movq (%0), %%mm0\n"
+ " prefetch 320(%0)\n"
+ " movq (%0), %%mm0\n"
" movntq %%mm0, (%1)\n"
" movq 8(%0), %%mm1\n"
" movntq %%mm1, 8(%1)\n"
" movntq %%mm6, 48(%1)\n"
" movq 56(%0), %%mm7\n"
" movntq %%mm7, 56(%1)\n"
- ".section .fixup, \"ax\"\n"
- "3: movw $0x05EB, 1b\n" /* jmp on 5 bytes */
- " jmp 2b\n"
- ".previous\n"
- ".section __ex_table,\"a\"\n"
- " .align 4\n"
- " .long 1b, 3b\n"
- ".previous"
: : "r" (from), "r" (to) : "memory");
from+=64;
to+=64;
kernel_fpu_end();
}
-#else
-
-/*
- * Generic MMX implementation without K7 specific streaming
- */
-
-static void fast_clear_page(void *page)
-{
- int i;
-
- kernel_fpu_begin();
-
- __asm__ __volatile__ (
- " pxor %%mm0, %%mm0\n" : :
- );
-
- for(i=0;i<4096/128;i++)
- {
- __asm__ __volatile__ (
- " movq %%mm0, (%0)\n"
- " movq %%mm0, 8(%0)\n"
- " movq %%mm0, 16(%0)\n"
- " movq %%mm0, 24(%0)\n"
- " movq %%mm0, 32(%0)\n"
- " movq %%mm0, 40(%0)\n"
- " movq %%mm0, 48(%0)\n"
- " movq %%mm0, 56(%0)\n"
- " movq %%mm0, 64(%0)\n"
- " movq %%mm0, 72(%0)\n"
- " movq %%mm0, 80(%0)\n"
- " movq %%mm0, 88(%0)\n"
- " movq %%mm0, 96(%0)\n"
- " movq %%mm0, 104(%0)\n"
- " movq %%mm0, 112(%0)\n"
- " movq %%mm0, 120(%0)\n"
- : : "r" (page) : "memory");
- page+=128;
- }
-
- kernel_fpu_end();
-}
-
-static void fast_copy_page(void *to, void *from)
-{
- int i;
-
-
- kernel_fpu_begin();
-
- __asm__ __volatile__ (
- "1: prefetch (%0)\n"
- " prefetch 64(%0)\n"
- " prefetch 128(%0)\n"
- " prefetch 192(%0)\n"
- " prefetch 256(%0)\n"
- "2: \n"
- ".section .fixup, \"ax\"\n"
- "3: movw $0x1AEB, 1b\n" /* jmp on 26 bytes */
- " jmp 2b\n"
- ".previous\n"
- ".section __ex_table,\"a\"\n"
- " .align 4\n"
- " .long 1b, 3b\n"
- ".previous"
- : : "r" (from) );
-
- for(i=0; i<4096/64; i++)
- {
- __asm__ __volatile__ (
- "1: prefetch 320(%0)\n"
- "2: movq (%0), %%mm0\n"
- " movq 8(%0), %%mm1\n"
- " movq 16(%0), %%mm2\n"
- " movq 24(%0), %%mm3\n"
- " movq %%mm0, (%1)\n"
- " movq %%mm1, 8(%1)\n"
- " movq %%mm2, 16(%1)\n"
- " movq %%mm3, 24(%1)\n"
- " movq 32(%0), %%mm0\n"
- " movq 40(%0), %%mm1\n"
- " movq 48(%0), %%mm2\n"
- " movq 56(%0), %%mm3\n"
- " movq %%mm0, 32(%1)\n"
- " movq %%mm1, 40(%1)\n"
- " movq %%mm2, 48(%1)\n"
- " movq %%mm3, 56(%1)\n"
- ".section .fixup, \"ax\"\n"
- "3: movw $0x05EB, 1b\n" /* jmp on 5 bytes */
- " jmp 2b\n"
- ".previous\n"
- ".section __ex_table,\"a\"\n"
- " .align 4\n"
- " .long 1b, 3b\n"
- ".previous"
- : : "r" (from), "r" (to) : "memory");
- from+=64;
- to+=64;
- }
- kernel_fpu_end();
-}
-
-
-#endif
-
-/*
- * Favour MMX for page clear and copy.
- */
-
-static void slow_zero_page(void * page)
-{
- int d0, d1;
- __asm__ __volatile__( \
- "cld\n\t" \
- "rep ; stosl" \
- : "=&c" (d0), "=&D" (d1)
- :"a" (0),"1" (page),"0" (1024)
- :"memory");
-}
-
void mmx_clear_page(void * page)
{
- if(in_interrupt())
- slow_zero_page(page);
+#if 1
+ __builtin_memset(page,0,PAGE_SIZE);
+#else
+ /* AK: these in_interrupt checks should not be needed. */
+ if(unlikely(in_interrupt()))
+ __builtin_memset(page,0,PAGE_SIZE);
else
fast_clear_page(page);
+#endif
}
-static void slow_copy_page(void *to, void *from)
-{
- int d0, d1, d2;
- __asm__ __volatile__( \
- "cld\n\t" \
- "rep ; movsl" \
- : "=&c" (d0), "=&D" (d1), "=&S" (d2) \
- : "0" (1024),"1" ((long) to),"2" ((long) from) \
- : "memory");
-}
-
-
void mmx_copy_page(void *to, void *from)
{
- if(in_interrupt())
- slow_copy_page(to, from);
+#if 1
+ __builtin_memcpy(to,from,PAGE_SIZE);
+#else
+ /* AK: these in_interrupt checks should not be needed. */
+ if(unlikely(in_interrupt()))
+ __builtin_memcpy(to,from,PAGE_SIZE);
else
fast_copy_page(to, from);
+#endif
}
--- /dev/null
+ /*
+ * Save registers before calling assembly functions. This avoids
+ * disturbance of register allocation in some inline assembly constructs.
+ * Copyright 2001,2002 by Andi Kleen, SuSE Labs.
+ * Subject to the GNU public license, v.2. No warranty of any kind.
+ * $Id: thunk.S,v 1.2 2002/03/13 20:06:58 ak Exp $
+ */
+
+ #include <linux/config.h>
+ #include <linux/linkage.h>
+ #include <asm/calling.h>
+ #include <asm/rwlock.h>
+
+ /* rdi: arg1 ... normal C conventions. rax is saved/restored. */
+ .macro thunk name,func
+ .globl \name
+\name:
+ SAVE_ARGS
+ call \func
+ jmp restore
+ .endm
+
+ /* rdi: arg1 ... normal C conventions. rax is passed from C. */
+ .macro thunk_retrax name,func
+ .globl \name
+\name:
+ SAVE_ARGS
+ call \func
+ jmp restore_norax
+ .endm
+
+
+#ifdef CONFIG_RWSEM_XCHGADD_ALGORITHM
+ thunk rwsem_down_read_failed_thunk,rwsem_down_read_failed
+ thunk rwsem_down_write_failed_thunk,rwsem_down_write_failed
+ thunk rwsem_wake_thunk,rwsem_wake
+#endif
+ thunk do_softirq_thunk,do_softirq
+
+ thunk __down_failed,__down
+ thunk_retrax __down_failed_interruptible,__down_interruptible
+ thunk_retrax __down_failed_trylock,__down_trylock
+ thunk __up_wakeup,__up
+
+restore:
+ RESTORE_ARGS
+ ret
+
+restore_norax:
+ RESTORE_ARGS 1
+ ret
+
+#ifdef CONFIG_SMP
+/* Support for read/write spinlocks. */
+
+/* rax: pointer to rwlock_t */
+ENTRY(__write_lock_failed)
+ lock
+ addl $RW_LOCK_BIAS,(%rax)
+1: rep
+ nop
+ cmpl $RW_LOCK_BIAS,(%rax)
+ jne 1b
+ lock
+ subl $RW_LOCK_BIAS,(%rax)
+ jnz __write_lock_failed
+ ret
+
+/* rax: pointer to rwlock_t */
+ENTRY(__read_lock_failed)
+ lock
+ incl (%rax)
+1: rep
+ nop
+ cmpl $1,(%rax)
+ js 1b
+ lock
+ decl (%rax)
+ js __read_lock_failed
+ ret
+#endif
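The __write_lock_failed slow path above can be rendered in C11 atomics as a sketch (RW_LOCK_BIAS taken from the real asm/rwlock.h; names otherwise illustrative): the failed bias subtraction is undone, the CPU spins until the lock looks free, then retries atomically.

```c
#include <stdatomic.h>

#define RW_LOCK_BIAS 0x01000000

/* Mirrors the asm: addl to undo our attempt, spin while readers or a
 * writer hold the lock, subl again and check the result for zero. */
static void write_lock_slow(atomic_int *lock)
{
	for (;;) {
		atomic_fetch_add(lock, RW_LOCK_BIAS);	/* undo failed sub */
		while (atomic_load(lock) != RW_LOCK_BIAS)
			;				/* lock still busy */
		if (atomic_fetch_sub(lock, RW_LOCK_BIAS) == RW_LOCK_BIAS)
			return;				/* count hit 0: ours */
	}
}
```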
# Note 2! The CFLAGS definition is now in the main makefile...
O_TARGET := mm.o
-obj-y := init.o fault.o ioremap.o extable.o
+obj-y := init.o fault.o ioremap.o extable.o modutil.o
include $(TOPDIR)/Rules.make
* linux/arch/x86-64/mm/fault.c
*
* Copyright (C) 1995 Linus Torvalds
+ * Copyright (C) 2001,2002 Andi Kleen, SuSE Labs.
*/
+#include <linux/config.h>
#include <linux/signal.h>
#include <linux/sched.h>
#include <linux/kernel.h>
#include <linux/init.h>
#include <linux/tty.h>
#include <linux/vt_kern.h> /* For unblank_screen() */
+#include <linux/compiler.h>
#include <asm/system.h>
#include <asm/uaccess.h>
#include <asm/pgalloc.h>
#include <asm/hardirq.h>
+#include <asm/smp.h>
+#include <asm/tlbflush.h>
extern void die(const char *,struct pt_regs *,long);
-asmlinkage void do_invalid_op(struct pt_regs *, unsigned long);
-extern unsigned long idt;
-
extern spinlock_t console_lock, timerlist_lock;
void bust_spinlocks(int yes)
}
}
-void do_BUG(const char *file, int line)
-{
- bust_spinlocks(1);
- printk("kernel BUG at %s:%d!\n", file, line);
-}
-
-
void dump_pagetable(unsigned long address)
{
static char *name[] = { "PML4", "PGD", "PDE", "PTE" };
int i, shift;
unsigned long page;
- asm("movq %%cr3,%0":"=r" (page));
shift = 9+9+9+12;
address &= ~0xFFFF000000000000UL;
+ asm("movq %%cr3,%0" : "=r" (page));
for (i = 0; i < 4; i++) {
- page = ((unsigned long *) __va(page))[(address >> shift) & 0x1FFU];
+ unsigned long *padr = (unsigned long *) __va(page);
+ padr += (address >> shift) & 0x1FFU;
+ if (__get_user(page, padr)) {
+ printk("%s: bad %p\n", name[i], padr);
+ break;
+ }
printk("%s: %016lx ", name[i], page);
if ((page & (1 | (1<<7))) != 1) /* Not present or 2MB page */
break;
page &= ~0xFFFUL;
 shift -= 9;
}
printk("\n");
}
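dump_pagetable() above walks all four levels with the same 9-bit index at shifts 39, 30, 21, and 12, since each table holds 512 entries. A small sketch of the index extraction (the shift constants are the architectural ones, not names from this patch):

```c
#include <assert.h>

/* Index extraction for the 4-level walk in dump_pagetable(): each
 * table has 512 entries, so each level consumes 9 bits of the 48-bit
 * (sign-extension stripped) virtual address. */
enum { PML4_SHIFT = 39, PGD_SHIFT = 30, PDE_SHIFT = 21, PTE_SHIFT = 12 };

static unsigned long table_index(unsigned long address, int shift)
{
	return (address >> shift) & 0x1FFUL;	/* 9-bit index */
}
```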
+int page_fault_trace;
+int exception_trace = 1;
+
/*
* This routine handles page faults. It determines the address,
* and the problem, and then passes it off to one of the appropriate
* bit 0 == 0 means no page found, 1 means protection fault
* bit 1 == 0 means read, 1 means write
* bit 2 == 0 means kernel, 1 means user-mode
+ * bit 3 == 1 means fault was an instruction fetch
*/
asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long error_code)
{
/* get the address */
__asm__("movq %%cr2,%0":"=r" (address));
+ if (page_fault_trace)
+ printk("pagefault rip:%lx rsp:%lx cs:%lu ss:%lu address %lx error %lx\n",
+ regs->rip,regs->rsp,regs->cs,regs->ss,address,error_code);
tsk = current;
mm = tsk->mm;
info.si_code = SEGV_MAPERR;
- if (address >= TASK_SIZE && !(error_code & 5))
+ /* 5 => page not present and from supervisor mode */
+ if (unlikely(!(error_code & 5) &&
+ ((address >= VMALLOC_START && address <= VMALLOC_END) ||
+ (address >= MODULES_VADDR && address <= MODULES_END))))
goto vmalloc_fault;
-
/*
* If we're in an interrupt or have no user
* context, we must not take the fault..
if (in_interrupt() || !mm)
goto no_context;
+ again:
down_read(&mm->mmap_sem);
vma = find_vma(mm, address);
-
-#if 0
- printk("fault at %lx rip:%lx rsp:%lx err:%lx thr:%x ", address,regs->rip,regs->rsp,error_code,tsk->thread.flags);
- if (vma)
- printk("vma %lx-%lx prot:%lx flags:%lx",vma->vm_start,vma->vm_end,
- vma->vm_page_prot,vma->vm_flags);
- printk("\n");
-#endif
-
-
if (!vma)
goto bad_area;
if (vma->vm_start <= address)
goto bad_area;
}
-survive:
/*
* If for any reason at all we couldn't handle the fault,
* make sure we exit gracefully rather than endlessly redo
/* User mode accesses just cause a SIGSEGV */
if (error_code & 4) {
-
- printk(KERN_ERR "%.20s[%d] segfaulted rip:%lx rsp:%lx adr:%lx err:%lx\n",
- tsk->comm, tsk->pid,
- regs->rip, regs->rsp, address, error_code);
+ printk("%s[%d] segfault at rip:%lx rsp:%lx adr:%lx err:%lx\n",
+ tsk->comm, tsk->pid, regs->rip, regs->rsp, address,
+ error_code);
tsk->thread.cr2 = address;
tsk->thread.error_code = error_code;
/* Are we prepared to handle this kernel fault? */
if ((fixup = search_exception_table(regs->rip)) != 0) {
regs->rip = fixup;
+ if (exception_trace)
+ printk(KERN_ERR
+ "%s: fixed kernel exception at %lx address %lx err:%ld\n",
+ current->comm, regs->rip, address, error_code);
return;
}
up_read(&mm->mmap_sem);
if (current->pid == 1) {
yield();
- down_read(&mm->mmap_sem);
- goto survive;
+ goto again;
}
printk("VM: killing process %s\n", tsk->comm);
if (error_code & 4)
/* Kernel mode? Handle exceptions or die */
if (!(error_code & 4))
goto no_context;
+ return;
vmalloc_fault:
{
+ pgd_t *pgd;
+ pmd_t *pmd;
+ pte_t *pte;
+
/*
- * Synchronize the kernel space top level page-table
- * with the 'reference' page table.
- * Currently it only works for first and last 512 GB of
- * kernel memory FIXME
- *
+ * x86-64 has the same kernel 3rd level pages for all CPUs.
+ * But for vmalloc/modules the TLB synchronization works lazily,
+ * so it can happen that we get a page fault for something
+ * that is really already in the page table. Just check whether
+ * it is actually present, and if so flush the local TLB.
*/
- level4_t *l4pd = level4_offset_k(address);
- int offset = __pgd_offset(address);
- pgd_t *pgd, *pgd_k;
- pmd_t *pmd, *pmd_k;
-
- if (! level4_val(*l4pd)) {
- printk(KERN_ERR "fatal - no entry in level4_page for %lx\n",
- address);
- goto bad_area_nosemaphore;
- }
- pgd = level3_offset_k(l4pd, address);
- pgd_k = init_mm.pgd + offset;
-
- if (!pgd_present(*pgd)) {
- if (!pgd_present(*pgd_k))
+ pgd = pgd_offset_k(address);
+ if (pgd != current_pgd_offset_k(address))
+ BUG();
+ if (!pgd_present(*pgd))
goto bad_area_nosemaphore;
- set_pgd(pgd, *pgd_k);
- return;
- }
-
pmd = pmd_offset(pgd, address);
- pmd_k = pmd_offset(pgd_k, address);
-
- if (pmd_present(*pmd) || !pmd_present(*pmd_k))
+ if (!pmd_present(*pmd))
+ goto bad_area_nosemaphore;
+ pte = pte_offset_kernel(pmd, address);
+ if (!pte_present(*pte))
goto bad_area_nosemaphore;
- set_pmd(pmd, *pmd_k);
+
+ /* Strictly speaking this should be a flush_tlb_all (vmalloc
+ mappings are global), but that is only needed for new global
+ pages, which these are not. vmalloc will likely touch multiple
+ ptes, so do a full local flush instead of a partial one. */
+ __flush_tlb();
return;
}
}
#include <asm/e820.h>
#include <asm/apic.h>
#include <asm/tlb.h>
+#include <asm/mmu_context.h>
mmu_gather_t mmu_gathers[NR_CPUS];
static void set_pte_phys(unsigned long vaddr,
unsigned long phys, pgprot_t prot)
{
- level4_t *level4;
+ pml4_t *level4;
pgd_t *pgd;
pmd_t *pmd;
pte_t *pte;
- level4 = level4_offset_k(vaddr);
- if (level4_none(*level4)) {
- printk("LEVEL4 FIXMAP MISSING, it should be setup in head.S!\n");
+ level4 = pml4_offset_k(vaddr);
+ if (pml4_none(*level4)) {
+ printk("PML4 FIXMAP MISSING, it should be setup in head.S!\n");
return;
}
pgd = level3_offset_k(level4, vaddr);
if (pgd_none(*pgd)) {
pmd = (pmd_t *) spp_getpage();
- set_pgd(pgd, __pgd(__pa(pmd) + 0x7));
+ set_pgd(pgd, __pgd(__pa(pmd) | _KERNPG_TABLE | _PAGE_USER));
if (pmd != pmd_offset(pgd, 0)) {
printk("PAGETABLE BUG #01!\n");
return;
pmd = pmd_offset(pgd, vaddr);
if (pmd_none(*pmd)) {
pte = (pte_t *) spp_getpage();
- set_pmd(pmd, __pmd(__pa(pte) + 0x7));
+ set_pmd(pmd, __pmd(__pa(pte) | _KERNPG_TABLE | _PAGE_USER));
if (pte != pte_offset_kernel(pmd, 0)) {
printk("PAGETABLE BUG #02!\n");
return;
set_pte_phys(address, phys, prot);
}
-static void __init pagetable_init (void)
-{
- unsigned long paddr, end;
- pgd_t *pgd;
- int i, j;
+extern unsigned long start_pfn, end_pfn;
+extern pmd_t temp_boot_pmds[];
+
+static struct temp_map {
pmd_t *pmd;
+ void *address;
+ int allocated;
+} temp_mappings[] __initdata = {
+ { &temp_boot_pmds[0], (void *)(40UL * 1024 * 1024) },
+ { &temp_boot_pmds[1], (void *)(42UL * 1024 * 1024) },
+ {}
+};
+
+static __init void *alloc_low_page(int *index, unsigned long *phys)
+{
+ struct temp_map *ti;
+ int i;
+ unsigned long pfn = start_pfn++, paddr;
+ void *adr;
+
+ if (pfn >= end_pfn)
+ panic("alloc_low_page: ran out of memory");
+ for (i = 0; temp_mappings[i].allocated; i++) {
+ if (!temp_mappings[i].pmd)
+ panic("alloc_low_page: ran out of temp mappings");
+ }
+ ti = &temp_mappings[i];
+ paddr = (pfn & (~511)) << PAGE_SHIFT;
+ set_pmd(ti->pmd, __pmd(paddr | _KERNPG_TABLE | _PAGE_PSE));
+ ti->allocated = 1;
+ __flush_tlb();
+ adr = ti->address + (pfn & 511)*PAGE_SIZE;
+ *index = i;
+ *phys = pfn * PAGE_SIZE;
+ return adr;
+}
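The arithmetic in alloc_low_page() above is worth spelling out: a 2MB PSE mapping covers 512 4K pages, so the temporary pmd is pointed at the 2MB-aligned region containing the page frame, and the page itself sits at the frame's offset within that window. A sketch under the assumption of 4K pages:

```c
#include <assert.h>

/* Address arithmetic from alloc_low_page(): map the 2MB-aligned region
 * containing pfn, then locate the page inside that window.
 * PAGE_SHIFT/PAGE_SIZE assume 4K pages. */
#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)

static unsigned long pmd_target_phys(unsigned long pfn)
{
	return (pfn & ~511UL) << PAGE_SHIFT;	/* 2MB-aligned physical base */
}

static unsigned long window_offset(unsigned long pfn)
{
	return (pfn & 511UL) * PAGE_SIZE;	/* offset inside the 2MB window */
}
```

Together the two always reconstruct the frame's physical address, which is why a single PSE entry suffices per temporary mapping.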
- /*
- * This can be zero as well - no problem, in that case we exit
- * the loops anyway due to the PTRS_PER_* conditions.
- */
- end = (unsigned long) max_low_pfn*PAGE_SIZE;
- if (end > 0x8000000000) {
- printk("Temporary supporting only 512G of global RAM\n");
- end = 0x8000000000;
- max_low_pfn = 0x8000000000 >> PAGE_SHIFT;
- }
+static __init void unmap_low_page(int i)
+{
+ struct temp_map *ti = &temp_mappings[i];
+ set_pmd(ti->pmd, __pmd(0));
+ ti->allocated = 0;
+}
- i = __pgd_offset(PAGE_OFFSET);
- pgd = level3_physmem_pgt + i;
+static void __init phys_pgd_init(pgd_t *pgd, unsigned long address, unsigned long end)
+{
+ long i, j;
+ i = pgd_index(address);
+ pgd = pgd + i;
for (; i < PTRS_PER_PGD; pgd++, i++) {
- paddr = i*PGDIR_SIZE;
- if (paddr >= end)
- break;
- if (i)
- pmd = (pmd_t *) alloc_bootmem_low_pages(PAGE_SIZE);
- else
- pmd = level2_kernel_pgt;
+ int map;
+ unsigned long paddr = i*PGDIR_SIZE, pmd_phys;
+ pmd_t *pmd;
- set_pgd(pgd, __pgd(__pa(pmd) + 0x7));
+ if (paddr >= end) {
+ for (; i < PTRS_PER_PGD; i++, pgd++)
+ set_pgd(pgd, __pgd(0));
+ break;
+ }
+ pmd = alloc_low_page(&map, &pmd_phys);
+ set_pgd(pgd, __pgd(pmd_phys | _KERNPG_TABLE));
for (j = 0; j < PTRS_PER_PMD; pmd++, j++) {
- unsigned long __pe;
+ unsigned long pe;
paddr = i*PGDIR_SIZE + j*PMD_SIZE;
- if (paddr >= end)
+ if (paddr >= end) {
+ for (; j < PTRS_PER_PMD; j++, pmd++)
+ set_pmd(pmd, __pmd(0));
break;
-
- __pe = _KERNPG_TABLE + _PAGE_PSE + paddr + _PAGE_GLOBAL;
- set_pmd(pmd, __pmd(__pe));
}
+ pe = _PAGE_PSE | _KERNPG_TABLE | _PAGE_GLOBAL | paddr;
+ set_pmd(pmd, __pmd(pe));
+ }
+ unmap_low_page(map);
}
+ __flush_tlb();
+}
- /*
- * Add low memory identity-mappings - SMP needs it when
- * starting up on an AP from real-mode. In the non-PAE
- * case we already have these mappings through head.S.
- * All user-space mappings are explicitly cleared after
- * SMP startup.
- */
-#ifdef FIXME
- pgd_base [0] is not what you think, this needs to be rewritten for SMP.
- pgd_base[0] = pgd_base[USER_PTRS_PER_PGD];
-#endif
+/* Set up the direct mapping of physical memory at PAGE_OFFSET.
+ This runs before bootmem is initialized and takes pages directly from
+ physical memory; to access them they are temporarily mapped. */
+void __init init_memory_mapping(void)
+{
+ unsigned long adr;
+ unsigned long end;
+ unsigned long next;
+
+ end = PAGE_OFFSET + (end_pfn * PAGE_SIZE);
+ for (adr = PAGE_OFFSET; adr < end; adr = next) {
+ int map;
+ unsigned long pgd_phys;
+ pgd_t *pgd = alloc_low_page(&map, &pgd_phys);
+ next = adr + (512UL * 1024 * 1024 * 1024);
+ if (next > end)
+ next = end;
+ phys_pgd_init(pgd, adr-PAGE_OFFSET, next-PAGE_OFFSET);
+ set_pml4(init_level4_pgt + pml4_index(adr), mk_kernel_pml4(pgd_phys));
+ unmap_low_page(map);
+ }
+ asm volatile("movq %%cr4,%0" : "=r" (mmu_cr4_features));
+ __flush_tlb_all();
}
+extern struct x8664_pda cpu_pda[NR_CPUS];
+
void __init zap_low_mappings (void)
{
int i;
- /*
- * Zap initial low-memory mappings.
- *
- * Note that "pgd_clear()" doesn't do it for
- * us in this case, because pgd_clear() is a
- * no-op in the 2-level case (pmd_clear() is
- * the thing that clears the page-tables in
- * that case).
- */
- for (i = 0; i < USER_PTRS_PER_PGD; i++)
- pgd_clear(swapper_pg_dir+i);
+ for (i = 0; i < NR_CPUS; i++) {
+ if (cpu_pda[i].level4_pgt)
+ cpu_pda[i].level4_pgt[0] = 0;
+ }
flush_tlb_all();
}
-/*
- * paging_init() sets up the page tables - note that the first 4MB are
- * already mapped by head.S.
- *
- * This routines also unmaps the page at virtual kernel address 0, so
- * that we can trap those pesky NULL-reference errors in the kernel.
- */
void __init paging_init(void)
{
- asm volatile("movq %%cr4,%0" : "=r" (mmu_cr4_features));
-
- pagetable_init();
-
- __flush_tlb_all();
-
{
unsigned long zones_size[MAX_NR_ZONES] = {0, 0, 0};
unsigned int max_dma, low;
initsize >> 10);
/*
- * Subtle. SMP is doing it's boot stuff late (because it has to
+ * Subtle. SMP is doing its boot stuff late (because it has to
* fork idle threads) - but it also needs low mappings for the
* protected-mode entry to work. We zap these entries only after
* the WP-bit has been tested.
for (; addr < (unsigned long)(&__init_end); addr += PAGE_SIZE) {
ClearPageReserved(virt_to_page(addr));
set_page_count(virt_to_page(addr), 1);
+#ifdef CONFIG_INIT_DEBUG
+ memset((void *)(addr & ~(PAGE_SIZE-1)), 0xcc, PAGE_SIZE);
+#endif
free_page(addr);
totalram_pages++;
}
/*
- * arch/i386/mm/ioremap.c
+ * arch/x86_64/mm/ioremap.c
*
* Re-map IO memory to kernel address space so that we can access it.
* This is needed for high PCI addresses that aren't mapped in the
#include <linux/vmalloc.h>
#include <asm/io.h>
#include <asm/pgalloc.h>
+#include <asm/fixmap.h>
+#include <asm/cacheflush.h>
+#include <asm/tlbflush.h>
+
static inline void remap_area_pte(pte_t * pte, unsigned long address, unsigned long size,
unsigned long phys_addr, unsigned long flags)
BUG();
}
set_pte(pte, mk_pte_phys(phys_addr, __pgprot(_PAGE_PRESENT | _PAGE_RW |
- _PAGE_DIRTY | _PAGE_ACCESSED | flags)));
+ _PAGE_GLOBAL | _PAGE_DIRTY | _PAGE_ACCESSED | flags)));
address += PAGE_SIZE;
phys_addr += PAGE_SIZE;
pte++;
unsigned long end = address + size;
phys_addr -= address;
- dir = pgd_offset(&init_mm, address);
+ dir = pgd_offset_k(address);
flush_cache_all();
if (address >= end)
BUG();
--- /dev/null
+/* arch/x86_64/mm/modutil.c
+ *
+ * Copyright (C) 1997,1998 Jakub Jelinek (jj@sunsite.mff.cuni.cz)
+ * Based upon code written by Linus Torvalds and others.
+ *
+ * Blatantly copied from sparc64 for x86-64 by Andi Kleen.
+ * Should use direct mapping with 2MB pages. This would need extension
+ * of the kernel mapping.
+ */
+
+#include <linux/slab.h>
+#include <linux/vmalloc.h>
+
+#include <asm/uaccess.h>
+#include <asm/system.h>
+
+static struct vm_struct * modvmlist = NULL;
+
+void module_unmap (void * addr)
+{
+ struct vm_struct **p, *tmp;
+
+ if (!addr)
+ return;
+ if ((PAGE_SIZE-1) & (unsigned long) addr) {
+ printk("Trying to unmap module with bad address (%p)\n", addr);
+ return;
+ }
+ for (p = &modvmlist ; (tmp = *p) ; p = &tmp->next) {
+ if (tmp->addr == addr) {
+ *p = tmp->next;
+ vmfree_area_pages(VMALLOC_VMADDR(tmp->addr), tmp->size);
+ kfree(tmp);
+ return;
+ }
+ }
+ printk("Trying to unmap nonexistent module vm area (%p)\n", addr);
+}
+
+void * module_map (unsigned long size)
+{
+ void * addr;
+ struct vm_struct **p, *tmp, *area;
+
+ size = PAGE_ALIGN(size);
+ if (!size || size > MODULES_LEN) return NULL;
+
+ addr = (void *) MODULES_VADDR;
+ for (p = &modvmlist; (tmp = *p) ; p = &tmp->next) {
+ if (size + (unsigned long) addr < (unsigned long) tmp->addr)
+ break;
+ addr = (void *) (tmp->size + (unsigned long) tmp->addr);
+ }
+ if ((unsigned long) addr + size >= MODULES_END) return NULL;
+
+ area = (struct vm_struct *) kmalloc(sizeof(*area), GFP_KERNEL);
+ if (!area) return NULL;
+ area->size = size + PAGE_SIZE;
+ area->addr = addr;
+ area->next = *p;
+ *p = area;
+
+ if (vmalloc_area_pages(VMALLOC_VMADDR(addr), size, GFP_KERNEL, PAGE_KERNEL)) {
+ module_unmap(addr);
+ return NULL;
+ }
+ return addr;
+}
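module_map() above does a first-fit scan over an address-ordered list of in-use areas and places the request in the first gap large enough (keeping an unmapped guard page via the `size + PAGE_SIZE` area size). A simplified model of just the gap search, with plain numbers standing in for addresses:

```c
#include <assert.h>
#include <stddef.h>

/* Model of the first-fit scan in module_map(): `list` is sorted by
 * address; return the first address in [start, end) with `size` free
 * space before the next area, or 0 on failure (as module_map returns
 * NULL). Field names here are illustrative, not the kernel's. */
struct area { unsigned long addr, size; struct area *next; };

static unsigned long first_fit(struct area *list, unsigned long start,
			       unsigned long end, unsigned long size)
{
	unsigned long addr = start;
	struct area *tmp;

	for (tmp = list; tmp; tmp = tmp->next) {
		if (addr + size < tmp->addr)
			break;			/* gap before this area fits */
		addr = tmp->addr + tmp->size;	/* skip past the area */
	}
	if (addr + size >= end)
		return 0;			/* ran past MODULES_END */
	return addr;
}
```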
ENTRY(pcurrent);
ENTRY(irqrsp);
ENTRY(irqcount);
- ENTRY(irqstack);
ENTRY(cpunumber);
ENTRY(irqstackptr);
- ENTRY(me);
ENTRY(__softirq_pending);
ENTRY(__local_irq_count);
ENTRY(__local_bh_count);
ENTRY(__ksoftirqd_task);
+ ENTRY(level4_pgt);
+ ENTRY(me);
#undef ENTRY
output("#ifdef __ASSEMBLY__");
#define CONST(t) outconst("#define " #t " %0", t)
* Written by Martin Mares <mj@atrey.karlin.mff.cuni.cz>;
*/
OUTPUT_FORMAT("elf64-x86-64", "elf64-x86-64", "elf64-x86-64")
-OUTPUT_ARCH(i386)
+OUTPUT_ARCH(i386:x86-64)
ENTRY(_start)
SECTIONS
{
__ksymtab : { *(__ksymtab) }
__stop___ksymtab = .;
+ __start___kallsyms = .; /* All kernel symbols */
+ __kallsyms : { *(__kallsyms) }
+ __stop___kallsyms = .;
+
.data : { /* Data */
*(.data)
CONSTRUCTORS
. = ALIGN(8192); /* init_task */
.data.init_task : { *(.data.init_task) }
+ . = ALIGN(4096);
+ .data.boot_pgt : { *(.data.boot_pgt) }
+
. = ALIGN(4096); /* Init code and data */
__init_begin = .;
.text.init : { *(.text.init) }
*(.initcall7.init)
}
__initcall_end = .;
+ . = ALIGN(32);
+ __per_cpu_start = .;
+ . = ALIGN(64);
+ .data.percpu : { *(.data.percpu) }
+ __per_cpu_end = .;
. = ALIGN(4096);
__init_end = .;
*(.exitcall.exit)
}
- /* Stabs debugging sections. */
- .stab 0 : { *(.stab) }
- .stabstr 0 : { *(.stabstr) }
- .stab.excl 0 : { *(.stab.excl) }
- .stab.exclstr 0 : { *(.stab.exclstr) }
- .stab.index 0 : { *(.stab.index) }
- .stab.indexstr 0 : { *(.stab.indexstr) }
+ /* DWARF 2 */
+ .debug_info 0 : { *(.debug_info) }
+ .debug_abbrev 0 : { *(.debug_abbrev) }
+ .debug_line 0 : { *(.debug_line) }
+ .debug_frame 0 : { *(.debug_frame) }
+ .debug_str 0 : { *(.debug_str) }
+ .debug_loc 0 : { *(.debug_loc) }
+ .debug_macinfo 0 : { *(.debug_macinfo) }
+ /* SGI/MIPS DWARF 2 extensions */
+ .debug_weaknames 0 : { *(.debug_weaknames) }
+ .debug_funcnames 0 : { *(.debug_funcnames) }
+ .debug_typenames 0 : { *(.debug_typenames) }
+ .debug_varnames 0 : { *(.debug_varnames) }
+
+
.comment 0 : { *(.comment) }
}
#include <linux/config.h>
#include <linux/pm.h>
+#include <asm/fixmap.h>
#include <asm/apicdef.h>
#include <asm/system.h>
* Basic functions accessing APICs.
*/
-static __inline void apic_write(unsigned long reg, unsigned long v)
+static __inline void apic_write(unsigned long reg, unsigned int v)
{
- *((volatile unsigned long *)(APIC_BASE+reg)) = v;
+ *((volatile unsigned int *)(APIC_BASE+reg)) = v;
}
-static __inline void apic_write_atomic(unsigned long reg, unsigned long v)
+static __inline void apic_write_atomic(unsigned long reg, unsigned int v)
{
- xchg((volatile unsigned long *)(APIC_BASE+reg), v);
+ xchg((volatile unsigned int *)(APIC_BASE+reg), v);
}
-static __inline unsigned long apic_read(unsigned long reg)
+static __inline unsigned int apic_read(unsigned long reg)
{
- return *((volatile unsigned long *)(APIC_BASE+reg));
+ return *((volatile unsigned int *)(APIC_BASE+reg));
}
static __inline__ void apic_wait_icr_idle(void)
{
- do { } while ( apic_read( APIC_ICR ) & APIC_ICR_BUSY );
+ while ( apic_read( APIC_ICR ) & APIC_ICR_BUSY );
}
#ifdef CONFIG_X86_GOOD_APIC
apic_write_around(APIC_EOI, 0);
}
-extern int get_maxlvt(void);
-extern void clear_local_APIC(void);
+extern int get_maxlvt (void);
+extern void clear_local_APIC (void);
extern void connect_bsp_APIC (void);
extern void disconnect_bsp_APIC (void);
extern void disable_local_APIC (void);
#define GET_APIC_DEST_FIELD(x) (((x)>>24)&0xFF)
#define SET_APIC_DEST_FIELD(x) ((x)<<24)
#define APIC_LVTT 0x320
+#define APIC_LVTTHMR 0x330
#define APIC_LVTPC 0x340
#define APIC_LVT0 0x350
#define APIC_LVT_TIMER_BASE_MASK (0x3<<18)
u32 __reserved_4[3];
} lvt_timer;
-/*330*/ struct { u32 __reserved[4]; } __reserved_15;
+/*330*/ struct { /* LVT - Thermal Sensor */
+ u32 vector : 8,
+ delivery_mode : 3,
+ __reserved_1 : 1,
+ delivery_status : 1,
+ __reserved_2 : 3,
+ mask : 1,
+ __reserved_3 : 15;
+ u32 __reserved_4[3];
+ } lvt_thermal;
/*340*/ struct { /* LVT - Performance Counter */
u32 vector : 8,
*
* This function is atomic and may not be reordered. See __set_bit()
* if you do not require the atomic guarantees.
+ * Note that @nr may be almost arbitrarily large; this function is not
+ * restricted to acting on a single-word quantity.
*/
-static __inline__ void set_bit(int nr, volatile void * addr)
+static __inline__ void set_bit(long nr, volatile void * addr)
{
__asm__ __volatile__( LOCK_PREFIX
- "btsl %1,%0"
+ "btsq %1,%0"
:"=m" (ADDR)
- :"dIr" (nr));
+ :"dIr" (nr) : "memory");
}
/**
*/
static __inline__ void __set_bit(int nr, volatile void * addr)
{
- __asm__(
+ __asm__ volatile(
"btsl %1,%0"
:"=m" (ADDR)
- :"dIr" (nr));
+ :"dIr" (nr) : "memory");
}
/**
__asm__ __volatile__(
"btrl %1,%0"
:"=m" (ADDR)
- :"Ir" (nr));
+ :"dIr" (nr));
}
+
#define smp_mb__before_clear_bit() barrier()
#define smp_mb__after_clear_bit() barrier()
return 0;
__asm__ __volatile__(
"movl $-1,%%eax\n\t"
- "xorq %%rdx,%%rdx\n\t"
+ "xorl %%edx,%%edx\n\t"
"repe; scasl\n\t"
"je 1f\n\t"
"xorl -4(%%rdi),%%eax\n\t"
return res;
}
+/**
+ * find_next_zero_bit - find the first zero bit in a memory region
+ * @addr: The address to base the search on
+ * @offset: The bitnumber to start searching at
+ * @size: The maximum size to search
+ */
+static __inline__ int find_next_zero_bit (void * addr, int size, int offset)
+{
+ unsigned long * p = ((unsigned long *) addr) + (offset >> 6);
+ unsigned long set = 0;
+ long res, bit = offset&63;
+
+ if (bit) {
+ /*
+ * Look for zero in first word
+ */
+ __asm__("bsfq %1,%0\n\t"
+ "cmoveq %2,%0"
+ : "=r" (set)
+ : "r" (~(*p >> bit)), "r"(64L));
+ if (set < (64 - bit))
+ return set + offset;
+ set = 64 - bit;
+ p++;
+ }
+ /*
+ * No zero yet, search remaining full words for a zero
+ */
+ res = find_first_zero_bit (p, size - 64 * (p - (unsigned long *) addr));
+ return (offset + set + res);
+}
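As a cross-check on the bsfq-based routine above, here is a straightforward reference model of the same contract, a linear scan rather than the word-at-a-time fast path (this assumes 64-bit `unsigned long`, as the x86-64 code does):

```c
#include <assert.h>

/* Reference model for the 64-bit find_next_zero_bit(): index of the
 * first clear bit at or after `offset`, or `size` if there is none.
 * Linear scan only; the real routine uses bsfq and repe scasl. */
static int model_find_next_zero_bit(const unsigned long *addr, int size,
				    int offset)
{
	int bit;

	for (bit = offset; bit < size; bit++)
		if (!((addr[bit >> 6] >> (bit & 63)) & 1))
			return bit;
	return size;
}
```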
+
+
/**
* find_first_bit - find the first set bit in a memory region
* @addr: The address to start the search at
int res;
/* This looks at memory. Mark it volatile to tell gcc not to move it around */
- /* Work in 32bit for now */
__asm__ __volatile__(
"xorl %%eax,%%eax\n\t"
"repe; scasl\n\t"
"jz 1f\n\t"
"leaq -4(%%rdi),%%rdi\n\t"
- "bsfl (%%rdi),%%eax\n"
- "1:\tsubq %%rbx,%%rdi\n\t"
- "shlq $3,%%rdi\n\t"
- "addq %%rdi,%%rax"
+ "bsfq (%%rdi),%%rax\n"
+ "1:\tsubl %%ebx,%%edi\n\t"
+ "shll $3,%%edi\n\t"
+ "addl %%edi,%%eax"
:"=a" (res), "=&c" (d0), "=&D" (d1)
:"1" ((size + 31) >> 5), "2" (addr), "b" (addr));
return res;
}
+
/**
- * find_next_zero_bit - find the first zero bit in a memory region
+ * find_next_bit - find the first set bit in a memory region
* @addr: The address to base the search on
* @offset: The bitnumber to start searching at
* @size: The maximum size to search
*/
-static __inline__ int find_next_zero_bit (void * addr, int size, int offset)
+static __inline__ int find_next_bit(void * addr, int size, int offset)
{
unsigned int * p = ((unsigned int *) addr) + (offset >> 5);
int set = 0, bit = offset & 31, res;
if (bit) {
/*
- * Look for zero in the first 32 bits.
+ * Look for nonzero in the first 32 bits:
*/
__asm__("bsfl %1,%0\n\t"
- "jne 1f\n\t"
- "movl $32, %0\n"
- "1:"
+ "cmovel %2,%0\n\t"
: "=r" (set)
- : "r" (~(*p >> bit)));
+ : "r" (*p >> bit), "r" (32));
if (set < (32 - bit))
return set + offset;
set = 32 - bit;
p++;
}
- /*
- * No zero yet, search remaining full bytes for a zero
- */
- res = find_first_zero_bit (p, size - 32 * (p - (unsigned int *) addr));
- return (offset + set + res);
-}
-
-/**
- * find_next_bit - find the first set bit in a memory region
- * @addr: The address to base the search on
- * @offset: The bitnumber to start searching at
- * @size: The maximum size to search
- */
-static __inline__ int find_next_bit (void * addr, int size, int offset)
-{
- unsigned long * p = ((unsigned long *) addr) + (offset >> 5);
- unsigned long set = 0, bit = offset & 63, res;
-
- if (bit) {
- /*
- * Look for nonzero in the first 64 bits:
- */
- __asm__("bsfq %1,%0\n\t"
- "jne 1f\n\t"
- "movq $64, %0\n"
- "1:"
- : "=r" (set)
- : "r" (*p >> bit));
- if (set < (64 - bit))
- return set + offset;
- set = 64 - bit;
- p++;
- }
/*
* No set bit yet, search remaining full words for a bit
*/
- res = find_first_bit (p, size - 64 * (p - (unsigned long *) addr));
+ res = find_first_bit (p, size - 32 * (p - (unsigned int *) addr));
return (offset + set + res);
}
int r;
__asm__("bsfl %1,%0\n\t"
- "jnz 1f\n\t"
- "movl $-1,%0\n"
- "1:" : "=r" (r) : "g" (x));
+ "cmovzl %2,%0"
+ : "=r" (r) : "rm" (x), "r" (-1));
return r+1;
}
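The inline-asm ffs() above returns the 1-based position of the least significant set bit, and 0 when no bit is set (the conventional ffs contract). A portable model of the same behavior:

```c
#include <assert.h>

/* Portable model of ffs(): 1-based index of the least significant set
 * bit, 0 when the argument is zero. */
static int model_ffs(int x)
{
	unsigned int v = (unsigned int)x;
	int r = 1;

	if (v == 0)
		return 0;
	while (!(v & 1)) {
		v >>= 1;
		r++;
	}
	return r;
}
```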
#ifdef __KERNEL__
-#define ext2_set_bit __test_and_set_bit
-#define ext2_clear_bit __test_and_clear_bit
-#define ext2_test_bit test_bit
-#define ext2_find_first_zero_bit find_first_zero_bit
-#define ext2_find_next_zero_bit find_next_zero_bit
+#define ext2_set_bit(nr,addr) \
+ __test_and_set_bit((nr),(unsigned long*)(addr))
+#define ext2_clear_bit(nr, addr) \
+ __test_and_clear_bit((nr),(unsigned long*)(addr))
+#define ext2_test_bit(nr, addr) test_bit((nr),(unsigned long*)(addr))
+#define ext2_find_first_zero_bit(addr, size) \
+ find_first_zero_bit((unsigned long*)(addr), (size))
+#define ext2_find_next_zero_bit(addr, size, off) \
+ find_next_zero_bit((unsigned long*)(addr), (size), (off))
/* Bitmap functions for the minix filesystem. */
-#define minix_test_and_set_bit(nr,addr) __test_and_set_bit(nr,addr)
-#define minix_set_bit(nr,addr) __set_bit(nr,addr)
-#define minix_test_and_clear_bit(nr,addr) __test_and_clear_bit(nr,addr)
-#define minix_test_bit(nr,addr) test_bit(nr,addr)
-#define minix_find_first_zero_bit(addr,size) find_first_zero_bit(addr,size)
+#define minix_test_and_set_bit(nr,addr) __test_and_set_bit((nr),(void*)(addr))
+#define minix_set_bit(nr,addr) __set_bit((nr),(void*)(addr))
+#define minix_test_and_clear_bit(nr,addr) __test_and_clear_bit((nr),(void*)(addr))
+#define minix_test_bit(nr,addr) test_bit((nr),(void*)(addr))
+#define minix_find_first_zero_bit(addr,size) \
+ find_first_zero_bit((void*)(addr),(size))
#endif /* __KERNEL__ */
#include <linux/config.h>
#include <asm/processor.h>
#include <asm/i387.h>
-
-static inline void check_fpu(void)
-{
- extern void __bad_fxsave_alignment(void);
- if (offsetof(struct task_struct, thread.i387.fxsave) & 15)
- __bad_fxsave_alignment();
- printk(KERN_INFO "Enabling fast FPU save and restore... ");
- set_in_cr4(X86_CR4_OSFXSR);
- printk("done.\n");
- printk(KERN_INFO "Enabling unmasked SIMD FPU exception support... ");
- set_in_cr4(X86_CR4_OSXMMEXCPT);
- printk("done.\n");
-}
-
-/*
- * If we configured ourselves for FXSR, we'd better have it.
- */
+#include <asm/msr.h>
+#include <asm/pda.h>
static void __init check_bugs(void)
{
identify_cpu(&boot_cpu_data);
- check_fpu();
#if !defined(CONFIG_SMP)
printk("CPU: ");
print_cpu_info(&boot_cpu_data);
--- /dev/null
+#ifndef _I386_CACHEFLUSH_H
+#define _I386_CACHEFLUSH_H
+
+/* Keep includes the same across arches. */
+#include <linux/mm.h>
+
+/* Caches aren't brain-dead on the intel. */
+#define flush_cache_all() do { } while (0)
+#define flush_cache_mm(mm) do { } while (0)
+#define flush_cache_range(vma, start, end) do { } while (0)
+#define flush_cache_page(vma, vmaddr) do { } while (0)
+#define flush_page_to_ram(page) do { } while (0)
+#define flush_dcache_page(page) do { } while (0)
+#define flush_icache_range(start, end) do { } while (0)
+#define flush_icache_page(vma,pg) do { } while (0)
+#define flush_icache_user_range(vma,pg,adr,len) do { } while (0)
+
+#endif /* _I386_CACHEFLUSH_H */
-/* Some macros to handle stack frames */
+/*
+ * Some macros to handle stack frames in assembly.
+ */
- .macro SAVE_ARGS
- pushq %rdi
- pushq %rsi
- pushq %rdx
- pushq %rcx
- pushq %rax
- pushq %r8
- pushq %r9
- pushq %r10
- pushq %r11
+#include <linux/config.h>
+
+#define R15 0
+#define R14 8
+#define R13 16
+#define R12 24
+#define RBP 32
+#define RBX 40
+/* arguments: interrupts/non-tracing syscalls only save up to here */
+#define R11 48
+#define R10 56
+#define R9 64
+#define R8 72
+#define RAX 80
+#define RCX 88
+#define RDX 96
+#define RSI 104
+#define RDI 112
+#define ORIG_RAX 120 /* + error_code */
+/* end of arguments */
+/* cpu exception frame or undefined in case of fast syscall. */
+#define RIP 128
+#define CS 136
+#define EFLAGS 144
+#define RSP 152
+#define SS 160
+#define ARGOFFSET R11
+
+ .macro SAVE_ARGS addskip=0,norcx=0
+ subq $9*8+\addskip,%rsp
+ movq %rdi,8*8(%rsp)
+ movq %rsi,7*8(%rsp)
+ movq %rdx,6*8(%rsp)
+ .if \norcx
+ .else
+ movq %rcx,5*8(%rsp)
+ .endif
+ movq %rax,4*8(%rsp)
+ movq %r8,3*8(%rsp)
+ movq %r9,2*8(%rsp)
+ movq %r10,1*8(%rsp)
+ movq %r11,(%rsp)
.endm
- .macro RESTORE_ARGS
- popq %r11
- popq %r10
- popq %r9
- popq %r8
- popq %rax
- popq %rcx
- popq %rdx
- popq %rsi
- popq %rdi
+#define ARG_SKIP 9*8
+ .macro RESTORE_ARGS skiprax=0,addskip=0,skiprcx=0
+ movq (%rsp),%r11
+ movq 1*8(%rsp),%r10
+ movq 2*8(%rsp),%r9
+ movq 3*8(%rsp),%r8
+ .if \skiprax
+ .else
+ movq 4*8(%rsp),%rax
+ .endif
+ .if \skiprcx
+ .else
+ movq 5*8(%rsp),%rcx
+ .endif
+ movq 6*8(%rsp),%rdx
+ movq 7*8(%rsp),%rsi
+ movq 8*8(%rsp),%rdi
+ .if ARG_SKIP+\addskip > 0
+ addq $ARG_SKIP+\addskip,%rsp
+ .endif
.endm
.macro LOAD_ARGS offset
.endm
.macro SAVE_REST
- pushq %rbx
- pushq %rbp
- pushq %r12
- pushq %r13
- pushq %r14
- pushq %r15
+ subq $6*8,%rsp
+ movq %rbx,5*8(%rsp)
+ movq %rbp,4*8(%rsp)
+ movq %r12,3*8(%rsp)
+ movq %r13,2*8(%rsp)
+ movq %r14,1*8(%rsp)
+ movq %r15,(%rsp)
.endm
+#define REST_SKIP 6*8
.macro RESTORE_REST
- popq %r15
- popq %r14
- popq %r13
- popq %r12
- popq %rbp
- popq %rbx
+ movq (%rsp),%r15
+ movq 1*8(%rsp),%r14
+ movq 2*8(%rsp),%r13
+ movq 3*8(%rsp),%r12
+ movq 4*8(%rsp),%rbp
+ movq 5*8(%rsp),%rbx
+ addq $REST_SKIP,%rsp
.endm
.macro SAVE_ALL
SAVE_REST
.endm
- .macro RESTORE_ALL
+ .macro RESTORE_ALL addskip=0
RESTORE_REST
- RESTORE_ARGS
+ RESTORE_ARGS 0,\addskip
.endm
+ /* push in order ss, rsp, eflags, cs, rip */
+ .macro FAKE_STACK_FRAME child_rip
+ xorl %eax,%eax
+ subq $6*8,%rsp
+ movq %rax,5*8(%rsp) /* ss */
+ movq %rax,4*8(%rsp) /* rsp */
+ movq %rax,3*8(%rsp) /* eflags */
+ movq $__KERNEL_CS,2*8(%rsp) /* cs */
+ movq \child_rip,1*8(%rsp) /* rip */
+ movq %rax,(%rsp) /* orig_rax */
+ .endm
-R15 = 0
-R14 = 8
-R13 = 16
-R12 = 24
-RBP = 36
-RBX = 40
-/* arguments: interrupts/non tracing syscalls only save upto here*/
-R11 = 48
-R10 = 56
-R9 = 64
-R8 = 72
-RAX = 80
-RCX = 88
-RDX = 96
-RSI = 104
-RDI = 112
-ORIG_RAX = 120 /* = ERROR */
-/* end of arguments */
-/* cpu exception frame or undefined in case of fast syscall. */
-RIP = 128
-CS = 136
-EFLAGS = 144
-RSP = 152
-SS = 160
-ARGOFFSET = R11
-
- .macro SYSRET32
- .byte 0x0f,0x07
+ .macro UNFAKE_STACK_FRAME
+ addq $8*6, %rsp
.endm
- .macro SYSRET64
- .byte 0x48,0x0f,0x07
+ .macro icebp
+ .byte 0xf1
.endm
+
+#ifdef CONFIG_FRAME_POINTER
+#define ENTER enter
+#define LEAVE leave
+#else
+#define ENTER
+#define LEAVE
+#endif
#ifndef _X86_64_CHECKSUM_H
#define _X86_64_CHECKSUM_H
+#include <linux/in6.h>
/*
* This is a version of ip_compute_csum() optimized for IP headers,
}
+#define stack_current() \
+({ \
+ struct thread_info *ti; \
+ __asm__("andq %%rsp,%0; ":"=r" (ti) : "0" (~8191UL)); \
+ ti->task; \
+})
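The stack_current() macro above relies on the kernel stack being an aligned 8KB block with struct thread_info at its base, so masking %rsp with ~8191 (clearing the low 13 bits) lands exactly on that base. The masking step in isolation:

```c
#include <assert.h>
#include <stdint.h>

/* Stack-base computation used by stack_current(): round any stack
 * pointer value down to the start of its 8KB (two-page) stack. */
#define KSTACK_SIZE 8192UL

static uintptr_t stack_base(uintptr_t rsp)
{
	return rsp & ~(KSTACK_SIZE - 1);   /* same as: andq %rsp, ~8191 */
}
```

This only works because the stack allocation is KSTACK_SIZE-aligned; for an unaligned stack the mask would land in the middle of someone else's memory.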
+
#define current get_current()
#ifndef __ASSEMBLY__
/* Keep this synchronized with kernel/head.S */
-#define TSS_START (7 * 8)
-#define LDT_START (TSS_START + NR_CPUS*16)
+#define TSS_START (8 * 8)
+#define LDT_START (TSS_START + 16)
-#define __TSS(n) (TSS_START + (n)*16)
-#define __LDT(n) (LDT_START + (n)*16)
+#define __TSS(n) (TSS_START + (n)*64)
+#define __LDT(n) (LDT_START + (n)*64)
extern __u8 tss_start[];
extern __u8 gdt_table[];
-extern __u8 ldt_start[];
extern __u8 gdt_end[];
enum {
enum {
DESC_TSS = 0x9,
DESC_LDT = 0x2,
- TSSLIMIT = 0x67,
};
// LDT or TSS descriptor in the GDT. 16 bytes.
unsigned long address;
} __attribute__((packed)) ;
-/* FIXME: these should use more generic register classes */
-#define load_TR(n) asm volatile("ltr %%ax"::"a" (__TSS(n)))
-#define __load_LDT(n) asm volatile("lldt %%ax"::"a" (__LDT(n)))
+#define load_TR(n) asm volatile("ltr %w0"::"r" (__TSS(n)))
+#define __load_LDT(n) asm volatile("lldt %w0"::"r" (__LDT(n)))
+#define clear_LDT(n) asm volatile("lldt %w0"::"r" (0))
/*
* This is the ldt that every process will get unless we need
_set_gate(&idt_table[nr], GATE_INTERRUPT, (unsigned long) func, 3, 0);
}
-static inline void set_trap_gate(int nr, void *func)
-{
- _set_gate(&idt_table[nr], GATE_TRAP, (unsigned long) func, 0, 0);
-}
-
-static inline void set_call_gate(void *adr, void *func)
-{
- _set_gate(adr, GATE_CALL, (unsigned long) func, 3, 0);
-}
-
-static inline void set_priv_gate(int nr, void *func)
-{
- _set_gate(&idt_table[nr], GATE_TRAP, (unsigned long) func, 0, 0);
-}
-
static inline void set_tssldt_descriptor(void *ptr, unsigned long tss, unsigned type,
unsigned size)
{
static inline void set_tss_desc(unsigned n, void *addr)
{
- set_tssldt_descriptor((__u8*)gdt_table + __TSS(n), (unsigned long)addr, DESC_TSS,
- TSSLIMIT);
+ set_tssldt_descriptor((__u8*)gdt_table + __TSS(n), (unsigned long)addr,
+ DESC_TSS,
+ sizeof(struct tss_struct));
}
static inline void set_ldt_desc(unsigned n, void *addr, int size)
{
- set_tssldt_descriptor((__u8*)gdt_table + __LDT(n), (unsigned long)addr, DESC_LDT, size);
-}
-
-
-#ifndef MINIKERNEL
-extern inline void clear_LDT(void)
-{
- int cpu = smp_processor_id();
- set_ldt_desc(cpu, &default_ldt[0], 5);
- __load_LDT(cpu);
+ set_tssldt_descriptor((__u8*)gdt_table + __LDT(n), (unsigned long)addr,
+ DESC_LDT, size);
}
-
/*
* load one particular LDT into the current CPU
*/
{
int cpu = smp_processor_id();
void *segments = mm->context.segments;
- int count = LDT_ENTRIES;
if (!segments) {
- segments = &default_ldt[0];
- count = 5;
+ clear_LDT(cpu);
+ return;
}
- set_ldt_desc(cpu, segments, count);
+ set_ldt_desc(cpu, segments, LDT_ENTRIES);
__load_LDT(cpu);
}
-#endif
#endif /* !__ASSEMBLY__ */
A value of 0 tells we have no such handler.
- We might as well make sure everything else is cleared too (except for %rsp),
+ We might as well make sure everything else is cleared too (except for %esp),
just to make things more deterministic.
*/
#define ELF_PLAT_INIT(_r) do { \
cur->thread.fs = 0; cur->thread.gs = 0; \
cur->thread.fsindex = 0; cur->thread.gsindex = 0; \
cur->thread.ds = 0; cur->thread.es = 0; \
+ clear_thread_flag(TIF_IA32); \
} while (0)
#define USE_ELF_CORE_DUMP
#define ELF_ET_DYN_BASE (2 * TASK_SIZE / 3)
/* regs is struct pt_regs, pr_reg is elf_gregset_t (which is
- now struct_user_regs, they are different) */
+ now struct_user_regs, they are different). Assumes current is the process
+ getting dumped. */
-#define ELF_CORE_COPY_REGS(pr_reg, regs) \
+#define ELF_CORE_COPY_REGS(pr_reg, regs) do { \
+ unsigned v; \
(pr_reg)[0] = (regs)->r15; \
(pr_reg)[1] = (regs)->r14; \
(pr_reg)[2] = (regs)->r13; \
(pr_reg)[18] = (regs)->eflags; \
(pr_reg)[19] = (regs)->rsp; \
(pr_reg)[20] = (regs)->ss; \
- rdmsrl(MSR_FS_BASE, (pr_reg)[21]); \
- rdmsrl(MSR_KERNEL_GS_BASE, (pr_reg)[22]);
+ (pr_reg)[21] = current->thread.fs; \
+ (pr_reg)[22] = current->thread.gs; \
+ asm("movl %%ds,%0" : "=r" (v)); (pr_reg)[23] = v; \
+ asm("movl %%es,%0" : "=r" (v)); (pr_reg)[24] = v; \
+ asm("movl %%fs,%0" : "=r" (v)); (pr_reg)[25] = v; \
+ asm("movl %%gs,%0" : "=r" (v)); (pr_reg)[26] = v; \
+} while(0)
/* This yields a mask that user programs can use to figure out what
instruction set this CPU supports. This could be done in user space,
* Here we define all the compile-time 'special' virtual
* addresses. The point is to have a constant address at
* compile time, but to set the physical address only
- * in the boot process. We allocate these special addresses
- * from the end of virtual memory (0xfffff000) backwards.
- * Also this lets us do fail-safe vmalloc(), we
- * can guarantee that these special addresses and
- * vmalloc()-ed addresses never overlap.
+ * in the boot process.
*
* these 'compile-time allocated' memory buffers are
* fixed-size 4k pages. (or larger if used with an increment
* task switches.
*/
-/*
- * on UP currently we will have no trace of the fixmap mechanizm,
- * no page table allocations, etc. This might change in the
- * future, say framebuffers for the console driver(s) could be
- * fix-mapped?
- */
enum fixed_addresses {
VSYSCALL_LAST_PAGE,
VSYSCALL_FIRST_PAGE = VSYSCALL_LAST_PAGE + ((VSYSCALL_END-VSYSCALL_START) >> PAGE_SHIFT) - 1,
*/
#define set_fixmap_nocache(idx, phys) \
__set_fixmap(idx, phys, PAGE_KERNEL_NOCACHE)
-/*
- * used by vmalloc.c.
- *
- * Leave one empty page between vmalloc'ed areas and
- * the start of the fixmap, and leave one page empty
- * at the top of mem..
- */
+
#define FIXADDR_TOP (VSYSCALL_END-PAGE_SIZE)
#define FIXADDR_SIZE (__end_of_fixed_addresses << PAGE_SHIFT)
#define FIXADDR_START (FIXADDR_TOP - FIXADDR_SIZE)
--- /dev/null
+#ifndef _FPU32_H
+#define _FPU32_H 1
+
+struct _fpstate_ia32;
+
+int restore_i387_ia32(struct task_struct *tsk, struct _fpstate_ia32 *buf, int fsave);
+int save_i387_ia32(struct task_struct *tsk, struct _fpstate_ia32 *buf,
+ struct pt_regs *regs, int fsave);
+
+#endif
* $Id: hw_irq.h,v 1.24 2001/09/14 20:55:03 vojtech Exp $
*/
+#ifndef __ASSEMBLY__
#include <linux/config.h>
#include <asm/atomic.h>
#include <asm/irq.h>
+#endif
/*
* IDT vectors usable for external interrupt sources start
#define RESCHEDULE_VECTOR 0xfc
#define TASK_MIGRATION_VECTOR 0xfb
#define CALL_FUNCTION_VECTOR 0xfa
+#define KDB_VECTOR 0xf9
+
+#define THERMAL_APIC_VECTOR 0xf0
+
/*
* Local APIC timer IRQ vector is on a different priority level,
#define FIRST_DEVICE_VECTOR 0x31
#define FIRST_SYSTEM_VECTOR 0xef
+
+#ifndef __ASSEMBLY__
extern int irq_vector[NR_IRQS];
#define IO_APIC_VECTOR(irq) irq_vector[irq]
#include <asm/ptrace.h>
-#ifdef CONFIG_PREEMPT
-#define PREEMPT_LOCK \
-" movq %rsp,%rdx ;" \
-" andq $-8192,%rdx ;" \
-" incl " __STR(threadinfo_preempt_count)"(%rdx) ;"
-#else
-#define PREEMPT_LOCK
-#endif
-
-/* IF:off, stack contains irq number on origrax */
-#define IRQ_ENTER \
-" cld ;" \
-" pushq %rdi ;" \
-" pushq %rsi ;" \
-" pushq %rdx ;" \
-" pushq %rcx ;" \
-" pushq %rax ;" \
-" pushq %r8 ;" \
-" pushq %r9 ;" \
-" pushq %r10 ;" \
-" pushq %r11 ;" \
- PREEMPT_LOCK \
-" leaq -48(%rsp),%rdi # arg1 for handler ;" \
-" cmpq $ " __STR(__KERNEL_CS) ",88(%rsp) # CS - ARGOFFSET ;" \
-" je 1f ;" \
-" swapgs ;" \
-"1: addl $1,%gs: " __STR(pda_irqcount) ";" \
-" movq %gs: " __STR(pda_irqstackptr) ",%rax ;" \
-" cmoveq %rax,%rsp ;"
-
#define IRQ_NAME2(nr) nr##_interrupt(void)
#define IRQ_NAME(nr) IRQ_NAME2(IRQ##nr)
* SMP has a few special interrupts for IPI messages
*/
- /* there is a second layer of macro just to get the symbolic
- name for the vector evaluated. This change is for RTLinux */
-#define BUILD_SMP_INTERRUPT(x,v) XBUILD_SMP_INTERRUPT(x,v)
-#define XBUILD_SMP_INTERRUPT(x,v)\
-asmlinkage void x(void); \
-asmlinkage void call_##x(void); \
-__asm__( \
-"\n"__ALIGN_STR"\n" \
-SYMBOL_NAME_STR(x) ":\n\t" \
- "push $" #v "-256;" \
- IRQ_ENTER \
- "pushq %rdi ; " \
- "call " SYMBOL_NAME_STR(smp_##x) " ; " \
- "jmp ret_from_intr")
-
-#define BUILD_COMMON_IRQ()
-
#define BUILD_IRQ(nr) \
asmlinkage void IRQ_NAME(nr); \
__asm__( \
-"\n"__ALIGN_STR "\n" \
-SYMBOL_NAME_STR(IRQ) #nr "_interrupt:\n\t" \
+"\n.p2align\n" \
+"IRQ" #nr "_interrupt:\n\t" \
"push $" #nr "-256 ; " \
"jmp common_interrupt");
static inline void hw_resend_irq(struct hw_interrupt_type *h, unsigned int i) {}
#endif
+#endif
+
#endif /* _ASM_HW_IRQ_H */
* Pentium III FXSR, SSE support
* General FPU state handling cleanups
* Gareth Hughes <gareth@valinux.com>, May 2000
+ * x86-64 work by Andi Kleen 2002
*/
#ifndef __ASM_X86_64_I387_H
#define __ASM_X86_64_I387_H
#include <linux/sched.h>
-#include <linux/spinlock.h>
#include <asm/processor.h>
#include <asm/sigcontext.h>
#include <asm/user.h>
+extern void fpu_init(void);
extern void init_fpu(void);
+int save_i387(struct _fpstate *buf);
/*
* FPU lazy state save handling...
*/
-extern void save_fpu( struct task_struct *tsk );
-extern void save_init_fpu( struct task_struct *tsk );
-extern void restore_fpu( struct task_struct *tsk );
-extern void kernel_fpu_begin(void);
-#define kernel_fpu_end() do { stts(); preempt_enable(); } while(0)
+#define kernel_fpu_end() stts()
-
-#define unlazy_fpu( tsk ) do { \
+#define unlazy_fpu(tsk) do { \
if (test_tsk_thread_flag(tsk, TIF_USEDFPU)) \
- save_init_fpu( tsk ); \
+ save_init_fpu(tsk); \
} while (0)
-#define clear_fpu( tsk ) \
-do { \
+#define clear_fpu(tsk) do { \
if (test_tsk_thread_flag(tsk, TIF_USEDFPU)) { \
asm volatile("fwait"); \
- clear_tsk_thread_flag(tsk, TIF_USEDFPU); \
+ clear_tsk_thread_flag(tsk,TIF_USEDFPU); \
stts(); \
} \
} while (0)
-/*
- * FPU state interaction...
- */
-extern unsigned short get_fpu_cwd( struct task_struct *tsk );
-extern unsigned short get_fpu_swd( struct task_struct *tsk );
-extern unsigned short get_fpu_twd( struct task_struct *tsk );
-extern unsigned short get_fpu_mxcsr( struct task_struct *tsk );
-
-extern void set_fpu_cwd( struct task_struct *tsk, unsigned short cwd );
-extern void set_fpu_swd( struct task_struct *tsk, unsigned short swd );
-extern void set_fpu_twd( struct task_struct *tsk, unsigned short twd );
-extern void set_fpu_mxcsr( struct task_struct *tsk, unsigned short mxcsr );
-
-#define load_mxcsr( val ) do { \
+#define load_mxcsr(val) do { \
unsigned long __mxcsr = ((unsigned long)(val) & 0xffbf); \
- asm volatile( "ldmxcsr %0" : : "m" (__mxcsr) ); \
+ asm volatile("ldmxcsr %0" : : "m" (__mxcsr)); \
} while (0)
/*
- * Signal frame handlers...
+ * ptrace request handlers...
*/
-extern int save_i387( struct _fpstate *buf );
-extern int restore_i387( struct _fpstate *buf );
+extern int get_fpregs(struct user_i387_struct *buf,
+ struct task_struct *tsk);
+extern int set_fpregs(struct task_struct *tsk,
+ struct user_i387_struct *buf);
/*
- * ptrace request handers...
+ * FPU state for core dumps...
*/
-extern int get_fpregs( struct user_i387_struct *buf,
- struct task_struct *tsk );
-extern int set_fpregs( struct task_struct *tsk,
- struct user_i387_struct *buf );
+extern int dump_fpu(struct pt_regs *regs,
+ struct user_i387_struct *fpu);
/*
- * FPU state for core dumps...
+ * i387 state interaction
*/
-extern int dump_fpu( struct pt_regs *regs,
- struct user_i387_struct *fpu );
-extern int dump_extended_fpu( struct pt_regs *regs,
- struct user_i387_struct *fpu );
+#define get_fpu_mxcsr(t) ((t)->thread.i387.fxsave.mxcsr)
+#define get_fpu_cwd(t) ((t)->thread.i387.fxsave.cwd)
+#define get_fpu_fxsr_twd(t) ((t)->thread.i387.fxsave.twd)
+#define get_fpu_swd(t) ((t)->thread.i387.fxsave.swd)
+#define set_fpu_cwd(t,val) ((t)->thread.i387.fxsave.cwd = (val))
+#define set_fpu_swd(t,val) ((t)->thread.i387.fxsave.swd = (val))
+#define set_fpu_fxsr_twd(t,val) ((t)->thread.i387.fxsave.twd = (val))
+#define set_fpu_mxcsr(t,val) ((t)->thread.i387.fxsave.mxcsr = (val)&0xffbf)
+
+static inline int restore_fpu_checking(struct i387_fxsave_struct *fx)
+{
+ int err;
+ asm volatile("1: rex64 ; fxrstor (%[fx])\n\t"
+ "2:\n"
+ ".section .fixup,\"ax\"\n"
+ "3: movl $-1,%[err]\n"
+ " jmp 2b\n"
+ ".previous\n"
+ ".section __ex_table,\"a\"\n"
+ " .align 8\n"
+ " .quad 1b,3b\n"
+ ".previous"
+ : [err] "=r" (err)
+ : [fx] "r" (fx), "0" (0));
+ return err;
+}
+
+static inline int save_i387_checking(struct i387_fxsave_struct *fx)
+{
+ int err;
+ asm volatile("1: rex64 ; fxsave (%[fx])\n\t"
+ "2:\n"
+ ".section .fixup,\"ax\"\n"
+ "3: movl $-1,%[err]\n"
+ " jmp 2b\n"
+ ".previous\n"
+ ".section __ex_table,\"a\"\n"
+ " .align 8\n"
+ " .quad 1b,3b\n"
+ ".previous"
+ : [err] "=r" (err)
+ : [fx] "r" (fx), "0" (0));
+ return err;
+}
+
+static inline void kernel_fpu_begin(void)
+{
+ struct task_struct *me = current;
+ if (test_tsk_thread_flag(me,TIF_USEDFPU)) {
+ asm volatile("fxsave %0 ; fnclex"
+ : "=m" (me->thread.i387.fxsave));
+ clear_tsk_thread_flag(me, TIF_USEDFPU);
+ return;
+ }
+ clts();
+}
+
+static inline void save_init_fpu(struct task_struct *tsk)
+{
+ asm volatile("fxsave %0 ; fnclex"
+ : "=m" (tsk->thread.i387.fxsave));
+ clear_tsk_thread_flag(tsk, TIF_USEDFPU);
+ stts();
+}
+
+/*
+ * This restores directly out of user space. Exceptions are handled.
+ */
+static inline int restore_i387(struct _fpstate *buf)
+{
+ return restore_fpu_checking((struct i387_fxsave_struct *)buf);
+}
+
+
+static inline void empty_fpu(struct task_struct *child)
+{
+ if (!child->used_math) {
+ /* Simulate an empty FPU. */
+ child->thread.i387.fxsave.cwd = 0x037f;
+ child->thread.i387.fxsave.swd = 0;
+ child->thread.i387.fxsave.twd = 0;
+ child->thread.i387.fxsave.mxcsr = 0x1f80;
+ }
+ child->used_math = 1;
+}
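As an aside (not part of the patch itself): the reset values used by empty_fpu() above are the architectural power-on defaults — 0x037f leaves all six x87 exception-mask bits set, and 0x1f80 sets the six SSE exception-mask bits in MXCSR. A minimal standalone check of those bit patterns:

```c
#include <assert.h>

/* Architectural FPU/SSE reset values, as used by empty_fpu() above. */
#define FPU_DEFAULT_CWD 0x037f /* x87 control word: all exceptions masked */
#define MXCSR_DEFAULT   0x1f80 /* MXCSR: all SSE exceptions masked */

/* x87 exception masks live in bits 0-5 of the control word;
   SSE exception masks live in bits 7-12 of MXCSR. */
static inline int x87_all_masked(unsigned cwd)   { return (cwd & 0x3f) == 0x3f; }
static inline int sse_all_masked(unsigned mxcsr) { return ((mxcsr >> 7) & 0x3f) == 0x3f; }
```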
#endif /* __ASM_X86_64_I387_H */
#define F_SETLK64 13
#define F_SETLKW64 14
-
-
-/* sigcontext.h */
-/* The x86-64 port uses FXSAVE without prefix; thus a 32bit compatible
- FXSAVE layout. The additional XMM registers are added, but they're
- in currently unused space. Hopefully nobody else will use them*/
-#define _fpstate_ia32 _fpstate
-
-struct sigcontext_ia32 {
- unsigned short gs, __gsh;
- unsigned short fs, __fsh;
- unsigned short es, __esh;
- unsigned short ds, __dsh;
- unsigned int edi;
- unsigned int esi;
- unsigned int ebp;
- unsigned int esp;
- unsigned int ebx;
- unsigned int edx;
- unsigned int ecx;
- unsigned int eax;
- unsigned int trapno;
- unsigned int err;
- unsigned int eip;
- unsigned short cs, __csh;
- unsigned int eflags;
- unsigned int esp_at_signal;
- unsigned short ss, __ssh;
- unsigned int fpstate; /* really (struct _fpstate_ia32 *) */
- unsigned int oldmask;
- unsigned int cr2;
-};
+#include <asm/sigcontext32.h>
/* signal.h */
#define _IA32_NSIG 64
#define __NR_ia32_lremovexattr 236
#define __NR_ia32_fremovexattr 237
#define __NR_ia32_tkill 238
+#define __NR_ia32_sendfile64 239
+#define __NR_ia32_futex 240
+#define __NR_ia32_sched_setaffinity 241
+#define __NR_ia32_sched_getaffinity 242
-#define IA32_NR_syscalls 240 /* must be > than biggest syscall! */
+#define IA32_NR_syscalls 243 /* must be greater than the biggest syscall number! */
#endif /* _ASM_X86_64_IA32_UNISTD_H_ */
/*
* However PCI ones are not necessarily 1:1 and therefore these interfaces
* are forbidden in portable PCI drivers.
+ *
+ * Allow them on x86 for legacy drivers, though.
*/
-extern unsigned long virt_to_bus_not_defined_use_pci_map(volatile void *addr);
-#define virt_to_bus virt_to_bus_not_defined_use_pci_map
-extern unsigned long bus_to_virt_not_defined_use_pci_map(volatile void *addr);
-#define bus_to_virt bus_to_virt_not_defined_use_pci_map
+#define virt_to_bus virt_to_phys
+#define bus_to_virt phys_to_virt
/*
* readX/writeX() are used to access memory mapped devices. On some
#define eth_io_copy_and_sum(a,b,c,d) eth_copy_and_sum((a),__io_virt(b),(c),(d))
#define isa_eth_io_copy_and_sum(a,b,c,d) eth_copy_and_sum((a),__io_virt(__ISA_IO_base + (b)),(c),(d))
+/**
+ * check_signature - find BIOS signatures
+ * @io_addr: mmio address to check
+ * @signature: signature block
+ * @length: length of signature
+ *
+ * Perform a signature comparison with the mmio address io_addr. This
+ * address should have been obtained by ioremap.
+ * Returns 1 on a match.
+ */
+
static inline int check_signature(unsigned long io_addr,
const unsigned char *signature, int length)
{
return retval;
}
+/**
+ * isa_check_signature - find BIOS signatures
+ * @io_addr: mmio address to check
+ * @signature: signature block
+ * @length: length of signature
+ *
+ * Perform a signature comparison with the ISA mmio address io_addr.
+ * Returns 1 on a match.
+ *
+ * This function is deprecated. New drivers should use ioremap and
+ * check_signature.
+ */
+
+
static inline int isa_check_signature(unsigned long io_addr,
const unsigned char *signature, int length)
{
#include <linux/config.h>
#include <asm/types.h>
-#include <asm/mpspec.h>
/*
* Intel IO-APIC support for SMP and UP systems.
#define APIC_MISMATCH_DEBUG
#define IO_APIC_BASE(idx) \
- ((volatile int *)__fix_to_virt(FIX_IO_APIC_BASE_0 + idx))
+ ((volatile int *)(__fix_to_virt(FIX_IO_APIC_BASE_0 + idx) \
+ + (mp_ioapics[idx].mpc_apicaddr & ~PAGE_MASK)))
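The new IO_APIC_BASE() adds the sub-page offset of the MP-table address back onto the page-aligned fixmap mapping, so an IO-APIC that is not page-aligned is still addressed correctly. The masking step, sketched with a hypothetical APIC address:

```c
#include <assert.h>

#define PAGE_SIZE 4096UL
#define PAGE_MASK (~(PAGE_SIZE - 1))

/* addr & ~PAGE_MASK keeps only the offset within the page;
   the fixmap supplies the page-aligned virtual base. */
static unsigned long subpage_offset(unsigned long addr)
{
	return addr & ~PAGE_MASK;
}
```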
/*
* The structure of the IO-APIC:
* These are used to wrap system calls on x86.
*
* See arch/i386/kernel/sys_i386.c for ugly details..
+ *
+ * (on x86-64 only used for 32bit emulation)
*/
struct ipc_kludge {
extern struct notifier_block *die_chain;
-enum {
- DIE_DIE = 1,
+/* Grossly misnamed. */
+enum die_val {
+ DIE_OOPS = 1,
DIE_INT3,
DIE_DEBUG,
DIE_PANIC,
+ DIE_NMI,
+ DIE_DIE,
+ DIE_CALL,
+ DIE_CPUINIT, /* not really a die, but .. */
+ DIE_TRAPINIT, /* not really a die, but .. */
+ DIE_STOP,
};
+static inline int notify_die(enum die_val val, char *str, struct pt_regs *regs, long err)
+{
+ struct die_args args = { regs: regs, str: str, err: err };
+ return notifier_call_chain(&die_chain, val, &args);
+}
+
+int printk_address(unsigned long address);
+
#endif
#ifndef _LINUX_LDT_H
#define _LINUX_LDT_H
-/* Is this to allow userland manipulate LDTs? It looks so. We should
- consider disallowing LDT manipulations altogether: in long mode
- there's no possibility of v86 mode, so something will have to
- break, anyway. --pavel */
-
/* Maximum number of LDT entries supported. */
#define LDT_ENTRIES 8192
/* The size of each LDT entry. */
#define LDT_ENTRY_SIZE 8
#ifndef __ASSEMBLY__
+/* Note: on 64bit, base and limit are ignored, and you cannot set
+ DS/ES/CS to anything but the default values if you still want to do
+ syscalls. This call is therefore mostly useful for 32bit mode. */
struct modify_ldt_ldt_s {
unsigned int entry_number;
- unsigned long base_addr;
+ unsigned int base_addr;
unsigned int limit;
unsigned int seg_32bit:1;
unsigned int contents:2;
unsigned int limit_in_pages:1;
unsigned int seg_not_present:1;
unsigned int useable:1;
+ unsigned int lm:1;
};
#define MODIFY_LDT_CONTENTS_DATA 0
#define PROT_WRITE 0x2 /* page can be written */
#define PROT_EXEC 0x4 /* page can be executed */
#define PROT_NONE 0x0 /* page can not be accessed */
+#define PROT_SEM 0x0
#define MAP_SHARED 0x01 /* Share changes */
#define MAP_PRIVATE 0x02 /* Changes are private */
#include <asm/desc.h>
#include <asm/atomic.h>
#include <asm/pgalloc.h>
+#include <asm/pda.h>
+#include <asm/pgtable.h>
+#include <asm/tlbflush.h>
/*
* possibly do the LDT unload here?
}
#endif
-static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next, struct task_struct *tsk, unsigned cpu)
+static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,
+ struct task_struct *tsk, unsigned cpu)
{
if (likely(prev != next)) {
/* stop flush ipis for the previous mm */
set_bit(cpu, &next->cpu_vm_mask);
set_bit(cpu, &next->context.cpuvalid);
/* Re-load page tables */
- asm volatile("movq %0,level4_pgt": :"r" (__pa(next->pgd) | 7));
+ *read_pda(level4_pgt) = __pa(next->pgd) | _PAGE_TABLE;
__flush_tlb();
}
#ifdef CONFIG_SMP
else {
cpu_tlbstate[cpu].state = TLBSTATE_OK;
if(cpu_tlbstate[cpu].active_mm != next)
- BUG();
+ out_of_line_bug();
if(!test_and_set_bit(cpu, &next->cpu_vm_mask)) {
/* We were in lazy tlb mode and leave_mm disabled
* tlb flush IPI delivery. We must flush our tlb.
#define activate_mm(prev, next) \
switch_mm((prev),(next),NULL,smp_processor_id())
+
#endif
#ifndef _ASM_X8664_MODULE_H
#define _ASM_X8664_MODULE_H
+
/*
* This file contains the x8664 architecture specific module code.
+ * Modules need to be mapped near the kernel code to allow 32bit relocations.
*/
-#define module_map(x) vmalloc(x)
-#define module_unmap(x) vfree(x)
+extern void *module_map(unsigned long);
+extern void module_unmap(void *);
+
#define module_arch_init(x) (0)
#define arch_init_modules(x) do { } while (0)
MP_BUS_MCA
};
extern int mp_bus_id_to_type [MAX_MP_BUSSES];
+extern int mp_bus_id_to_node [MAX_MP_BUSSES];
+extern int mp_bus_id_to_local [MAX_MP_BUSSES];
+extern int quad_local_to_mp_bus_id [NR_CPUS/4][4];
extern int mp_bus_id_to_pci_bus [MAX_MP_BUSSES];
extern unsigned int boot_cpu_physical_apicid;
#ifndef X86_64_MSR_H
#define X86_64_MSR_H 1
+
+#ifndef __ASSEMBLY__
/*
* Access to machine-specific registers (available on 586 and better only)
* Note: the rd* operations modify the parameters directly (without using
" .quad 2b,3b\n\t" \
".previous" \
: "=a" (ret__) \
- : "c" (msr), "0" ((__u32)val), "d" ((val)>>32), "i" (-EFAULT)); \
+ : "c" (msr), "0" ((__u32)val), "d" ((val)>>32), "i" (-EFAULT));\
ret__; })
#define rdtsc(low,high) \
#define rdtscl(low) \
__asm__ __volatile__ ("rdtsc" : "=a" (low) : : "edx")
-#define rdtscll(val) \
- __asm__ __volatile__ ("rdtsc" : "=A" (val))
+#define rdtscll(val) do { \
+ unsigned int a,d; \
+ asm volatile("rdtsc" : "=a" (a), "=d" (d)); \
+ (val) = ((unsigned long)a) | (((unsigned long)d)<<32); \
+} while(0)
+
+#define rdpmc(counter,low,high) \
+ __asm__ __volatile__("rdpmc" \
+ : "=a" (low), "=d" (high) \
+ : "c" (counter))
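The rewritten rdtscll() is needed because the old "=A" constraint does not span the rdx:rax pair on x86-64; rdtsc still returns the counter split across two 32-bit halves, which must be recombined by hand. The arithmetic, with hypothetical register values:

```c
#include <stdint.h>

/* Combine the EDX:EAX halves the way the new rdtscll() macro does. */
static uint64_t combine_tsc(uint32_t eax, uint32_t edx)
{
	return ((uint64_t)eax) | (((uint64_t)edx) << 32);
}
```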
#define write_tsc(val1,val2) wrmsr(0x10, val1, val2)
: "=a" (low), "=d" (high) \
: "c" (counter))
+#endif
/* AMD/K8 specific MSRs */
#define MSR_EFER 0xc0000080 /* extended feature register */
#define MSR_SYSCALL_MASK 0xc0000084 /* EFLAGS mask for syscall */
#define MSR_FS_BASE 0xc0000100 /* 64bit FS base */
#define MSR_GS_BASE 0xc0000101 /* 64bit GS base */
-#define MSR_KERNEL_GS_BASE 0xc0000102 /* SwapGS GS shadow (or USER_GS from kernel view) */
-
+#define MSR_KERNEL_GS_BASE 0xc0000102 /* SwapGS GS shadow (or USER_GS from kernel) */
+/* EFER bits: */
+#define _EFER_SCE 0 /* SYSCALL/SYSRET */
+#define _EFER_LME 8 /* Long mode enable */
+#define _EFER_LMA 10 /* Long mode active (read-only) */
+#define _EFER_NX 11 /* No execute enable */
+
+#define EFER_SCE (1<<_EFER_SCE)
+#define EFER_LME (1<<_EFER_LME)
+#define EFER_LMA (1<<_EFER_LMA)
+#define EFER_NX (1<<_EFER_NX)
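For reference, each EFER_* mask is simply the corresponding _EFER_* bit position shifted into place; a quick standalone check of the values the definitions above are meant to produce:

```c
/* EFER bit positions from the hunk above, and the masks they imply. */
#define _EFER_SCE 0  /* SYSCALL/SYSRET */
#define _EFER_LME 8  /* Long mode enable */
#define _EFER_LMA 10 /* Long mode active (read-only) */
#define _EFER_NX  11 /* No execute enable */

#define EFER_SCE (1 << _EFER_SCE)
#define EFER_LME (1 << _EFER_LME)
#define EFER_LMA (1 << _EFER_LMA)
#define EFER_NX  (1 << _EFER_NX)
```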
/* Intel MSRs. Some also available on other CPUs */
#define MSR_IA32_PLATFORM_ID 0x17
#define MSR_IA32_MC0_ADDR 0x402
#define MSR_IA32_MC0_MISC 0x403
-/* K7 MSRs */
+#define MSR_P6_PERFCTR0 0xc1
+#define MSR_P6_PERFCTR1 0xc2
+#define MSR_P6_EVNTSEL0 0x186
+#define MSR_P6_EVNTSEL1 0x187
+
+/* K7/K8 MSRs. Not complete. See the architecture manual for a more complete list. */
#define MSR_K7_EVNTSEL0 0xC0010000
#define MSR_K7_PERFCTR0 0xC0010004
+#define MSR_K7_EVNTSEL1 0xC0010001
+#define MSR_K7_PERFCTR1 0xC0010005
+#define MSR_K7_EVNTSEL2 0xC0010002
+#define MSR_K7_PERFCTR2 0xC0010006
+#define MSR_K7_EVNTSEL3 0xC0010003
+#define MSR_K7_PERFCTR3 0xC0010007
+#define MSR_K8_TOP_MEM1 0xC001001A
+#define MSR_K8_TOP_MEM2 0xC001001D
+#define MSR_K8_SYSCFG 0xC0000010
/* K6 MSRs */
#define MSR_K6_EFER 0xC0000080
#define MSR_IA32_APICBASE_ENABLE (1<<11)
#define MSR_IA32_APICBASE_BASE (0xfffff<<12)
+
+#define MSR_IA32_THERM_CONTROL 0x19a
+#define MSR_IA32_THERM_INTERRUPT 0x19b
+#define MSR_IA32_THERM_STATUS 0x19c
+#define MSR_IA32_MISC_ENABLE 0x1a0
+
+
#endif
#ifdef __KERNEL__
#ifndef __ASSEMBLY__
-#include <linux/config.h>
-
-#ifdef CONFIG_X86_USE_3DNOW
-
+#if 0
#include <asm/mmx.h>
-
#define clear_page(page) mmx_clear_page((void *)(page))
#define copy_page(to,from) mmx_copy_page(to,from)
-
#else
-
-/*
- * On older X86 processors its not a win to use MMX here it seems.
- * Maybe the K6-III ?
- */
-
#define clear_page(page) memset((void *)(page), 0, PAGE_SIZE)
-
#define copy_page(to,from) memcpy((void *)(to), (void *)(from), PAGE_SIZE)
-
#endif
#define clear_user_page(page, vaddr) clear_page(page)
typedef struct { unsigned long pte; } pte_t;
typedef struct { unsigned long pmd; } pmd_t;
typedef struct { unsigned long pgd; } pgd_t;
-typedef struct { unsigned long level4; } level4_t;
+typedef struct { unsigned long pml4; } pml4_t;
#define PTE_MASK PAGE_MASK
typedef struct { unsigned long pgprot; } pgprot_t;
#define pte_val(x) ((x).pte)
#define pmd_val(x) ((x).pmd)
#define pgd_val(x) ((x).pgd)
-#define level4_val(x) ((x).level4)
+#define pml4_val(x) ((x).pml4)
#define pgprot_val(x) ((x).pgprot)
#define __pte(x) ((pte_t) { (x) } )
/* to align the pointer to the (next) page boundary */
#define PAGE_ALIGN(addr) (((addr)+PAGE_SIZE-1)&PAGE_MASK)
-
+/* See Documentation/x86_64/mm.txt for a description of the layout. */
#define __START_KERNEL 0xffffffff80100000
#define __START_KERNEL_map 0xffffffff80000000
-#define __PAGE_OFFSET 0xffff800000000000
+#define __PAGE_OFFSET 0x0000010000000000
#ifndef __ASSEMBLY__
/*
- * Tell the user there is some problem.
+ * Tell the user there is some problem. The exception handler decodes this frame.
*/
-
struct bug_frame {
- unsigned short ud2;
+ unsigned char ud2[2];
char *filename; /* should use 32bit offset instead, but the assembler doesn't like it */
unsigned short line;
} __attribute__((packed));
-
#define BUG() asm volatile("ud2 ; .quad %c1 ; .short %c0" :: "i"(__LINE__), "i" (__FILE__))
-#define PAGE_BUG(page) BUG();
+#define PAGE_BUG(page) BUG()
+void out_of_line_bug(void);
/* Pure 2^n version of get_order */
extern __inline__ int get_order(unsigned long size)
return order;
}
-static unsigned long start_kernel_map __attribute__((unused)) = __START_KERNEL_map; /* FIXME: workaround gcc bug */
-
#endif /* __ASSEMBLY__ */
#define PAGE_OFFSET ((unsigned long)__PAGE_OFFSET)
-#define __pa(x) (((unsigned long)(x)>=start_kernel_map)?(unsigned long)(x) - (unsigned long)start_kernel_map:(unsigned long)(x) - PAGE_OFFSET)
+
+/* Note: __pa(&symbol_visible_to_c) should always be replaced with __pa_symbol.
+ Otherwise you risk miscompilation. */
+#define __pa(x) (((unsigned long)(x)>=__START_KERNEL_map)?(unsigned long)(x) - (unsigned long)__START_KERNEL_map:(unsigned long)(x) - PAGE_OFFSET)
+/* __pa_symbol should be used for C visible symbols.
+ This seems to be the official gcc blessed way to do such arithmetic. */
+#define __pa_symbol(x) \
+ ({unsigned long v; \
+ asm("" : "=r" (v) : "0" (x)); \
+ __pa(v); })
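The __pa() arithmetic distinguishes kernel-image addresses (mapped at __START_KERNEL_map) from the direct mapping at PAGE_OFFSET. A standalone sketch of the same branch, using the constants from this patch:

```c
#define START_KERNEL_MAP 0xffffffff80000000UL /* __START_KERNEL_map */
#define DIRECT_MAP_BASE  0x0000010000000000UL /* __PAGE_OFFSET */

/* Mirror of the __pa() macro: kernel-image addresses subtract the map
   base, direct-mapped addresses subtract PAGE_OFFSET. */
static unsigned long pa(unsigned long x)
{
	return (x >= START_KERNEL_MAP) ? x - START_KERNEL_MAP
	                               : x - DIRECT_MAP_BASE;
}
```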
+
#define __va(x) ((void *)((unsigned long)(x)+PAGE_OFFSET))
#define virt_to_page(kaddr) (mem_map + (__pa(kaddr) >> PAGE_SHIFT))
#define VALID_PAGE(page) ((page - mem_map) < max_mapnr)
+
#define VM_DATA_DEFAULT_FLAGS (VM_READ | VM_WRITE | VM_EXEC | \
VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
#endif
#include <linux/cache.h>
-struct task_struct;
-
/* Per processor datastructure. %gs points to it while the kernel runs */
/* To use a new field with the *_pda macros it needs to be added to tools/offset.c */
struct x8664_pda {
struct task_struct *pcurrent; /* Current process */
int irqcount; /* Irq nesting counter. Starts with -1 */
int cpunumber; /* Logical CPU number */
- char *irqstackptr;
+ char *irqstackptr; /* top of irqstack */
+ unsigned long volatile *level4_pgt;
unsigned int __softirq_pending;
unsigned int __local_irq_count;
unsigned int __local_bh_count;
unsigned int __nmi_count; /* arch dependent */
struct task_struct * __ksoftirqd_task; /* waitqueue is too large */
- char irqstack[16 * 1024]; /* Stack used by interrupts */
} ____cacheline_aligned;
#define PDA_STACKOFFSET (5*8)
+#define IRQSTACK_ORDER 2
+#define IRQSTACKSIZE (PAGE_SIZE << IRQSTACK_ORDER)
+
extern struct x8664_pda cpu_pda[];
/*
#define pda_from_op(op,field) ({ \
typedef typeof_field(struct x8664_pda, field) T__; T__ ret__; \
switch (sizeof_field(struct x8664_pda, field)) { \
- case 2: asm volatile (op "w %%gs:" __STR2(pda_ ## field) ",%0":"=r" (ret__)::"memory"); break; \
- case 4: asm volatile (op "l %%gs:" __STR2(pda_ ## field) ",%0":"=r" (ret__)::"memory"); break; \
- case 8: asm volatile (op "q %%gs:" __STR2(pda_ ## field) ",%0":"=r" (ret__)::"memory"); break; \
+ case 2: asm volatile(op "w %%gs:" __STR2(pda_ ## field) ",%0":"=r" (ret__)::"memory"); break; \
+ case 4: asm volatile(op "l %%gs:" __STR2(pda_ ## field) ",%0":"=r" (ret__)::"memory"); break; \
+ case 8: asm volatile(op "q %%gs:" __STR2(pda_ ## field) ",%0":"=r" (ret__)::"memory"); break; \
default: __bad_pda_field(); \
} \
ret__; })
--- /dev/null
+#ifndef __ARCH_I386_PERCPU__
+#define __ARCH_I386_PERCPU__
+
+#include <asm-generic/percpu.h>
+
+#endif /* __ARCH_I386_PERCPU__ */
#ifndef _X86_64_PGALLOC_H
#define _X86_64_PGALLOC_H
-#include <linux/config.h>
#include <asm/processor.h>
#include <asm/fixmap.h>
#include <asm/pda.h>
}
-/*
- * TLB flushing:
- *
- * - flush_tlb() flushes the current mm struct TLBs
- * - flush_tlb_all() flushes all processes TLBs
- * - flush_tlb_mm(mm) flushes the specified mm context TLB's
- * - flush_tlb_page(vma, vmaddr) flushes one page
- * - flush_tlb_range(vma, start, end) flushes a range of pages
- * - flush_tlb_pgtables(mm, start, end) flushes a range of page tables
- *
- * ..but the i386 has somewhat limited tlb flushing capabilities,
- * and page-granular flushes are available only on i486 and up.
- */
-
-#ifndef CONFIG_SMP
-
-#define flush_tlb() __flush_tlb()
-#define flush_tlb_all() __flush_tlb_all()
-#define local_flush_tlb() __flush_tlb()
-
-static inline void flush_tlb_mm(struct mm_struct *mm)
-{
- if (mm == current->active_mm)
- __flush_tlb();
-}
-
-static inline void flush_tlb_page(struct vm_area_struct *vma,
- unsigned long addr)
-{
- if (vma->vm_mm == current->active_mm)
- __flush_tlb_one(addr);
-}
-
-static inline void flush_tlb_range(struct vm_area_struct *vma,
- unsigned long start, unsigned long end)
-{
- if (vma->vm_mm == current->active_mm)
- __flush_tlb();
-}
-
-#else
-
-#include <asm/smp.h>
-
-#define local_flush_tlb() \
- __flush_tlb()
-
-extern void flush_tlb_all(void);
-extern void flush_tlb_current_task(void);
-extern void flush_tlb_mm(struct mm_struct *);
-extern void flush_tlb_page(struct vm_area_struct *, unsigned long);
-
-#define flush_tlb() flush_tlb_current_task()
-
-static inline void flush_tlb_range(struct vm_area_struct * vma, unsigned long start, unsigned long end)
-{
- flush_tlb_mm(vma->vm_mm);
-}
-
-#define TLBSTATE_OK 1
-#define TLBSTATE_LAZY 2
-
-struct tlb_state
-{
- struct mm_struct *active_mm;
- int state;
- char __cacheline_padding[24];
-};
-extern struct tlb_state cpu_tlbstate[NR_CPUS];
-
-
-#endif
-
-extern inline void flush_tlb_pgtables(struct mm_struct *mm,
- unsigned long start, unsigned long end)
-{
- /* i386 does not keep any page table caches in TLB */
-}
-
#endif /* _X86_64_PGALLOC_H */
#ifndef _X86_64_PGTABLE_H
#define _X86_64_PGTABLE_H
-#include <linux/config.h>
-
/*
* This file contains the functions and defines necessary to modify and use
* the x86-64 page table tree.
* three level page setup on the beginning and some kernel mappings at
* the end. For more details see Documentation/x86_64/mm.txt
*/
-#ifndef __ASSEMBLY__
#include <asm/processor.h>
#include <asm/fixmap.h>
#include <asm/bitops.h>
#include <linux/threads.h>
+#include <asm/pda.h>
-extern level4_t level4_pgt[512];
extern pgd_t level3_kernel_pgt[512];
extern pgd_t level3_physmem_pgt[512];
-extern pgd_t level3_ident_pgt[512], swapper_pg_dir[512];
+extern pgd_t level3_ident_pgt[512];
extern pmd_t level2_kernel_pgt[512];
-extern void paging_init(void);
+extern pml4_t init_level4_pgt[];
+extern pgd_t boot_vmalloc_pgt[];
-/* Caches aren't brain-dead. */
-#define flush_cache_all() do { } while (0)
-#define flush_cache_mm(mm) do { } while (0)
-#define flush_cache_range(vma, start, end) do { } while (0)
-#define flush_cache_page(vma, vmaddr) do { } while (0)
-#define flush_page_to_ram(page) do { } while (0)
-#define flush_dcache_page(page) do { } while (0)
-#define flush_icache_range(start, end) do { } while (0)
-#define flush_icache_page(vma,pg) do { } while (0)
-#define flush_icache_user_range(vma,pg,adr,len) do { } while (0)
-
-#define __flush_tlb() \
- do { \
- unsigned long tmpreg; \
- \
- __asm__ __volatile__( \
- "movq %%cr3, %0; # flush TLB \n" \
- "movq %0, %%cr3; \n" \
- : "=r" (tmpreg) \
- :: "memory"); \
- } while (0)
+#define swapper_pg_dir NULL
-/*
- * Global pages have to be flushed a bit differently. Not a real
- * performance problem because this does not happen often.
- */
-#define __flush_tlb_global() \
- do { \
- unsigned long tmpreg; \
- \
- __asm__ __volatile__( \
- "movq %1, %%cr4; # turn off PGE \n" \
- "movq %%cr3, %0; # flush TLB \n" \
- "movq %0, %%cr3; \n" \
- "movq %2, %%cr4; # turn PGE back on \n" \
- : "=&r" (tmpreg) \
- : "r" (mmu_cr4_features & ~X86_CR4_PGE), \
- "r" (mmu_cr4_features) \
- : "memory"); \
- } while (0)
+extern void paging_init(void);
extern unsigned long pgkern_mask;
-/*
- * Do not check the PGE bit unnecesserily if this is a PPro+ kernel.
- * FIXME: This should be cleaned up
- */
-
-# define __flush_tlb_all() __flush_tlb_global()
-
-#define __flush_tlb_one(addr) __asm__ __volatile__("invlpg %0": :"m" (*(char *) addr))
-
/*
* ZERO_PAGE is a global shared page that is always zero: used
* for zero-mapped memory areas etc..
extern unsigned long empty_zero_page[1024];
#define ZERO_PAGE(vaddr) (virt_to_page(empty_zero_page))
-#endif /* !__ASSEMBLY__ */
-
-#define LEVEL4_SHIFT 39
-#define PTRS_PER_LEVEL4 512
+#define PML4_SHIFT 39
+#define PTRS_PER_PML4 512
/*
* PGDIR_SHIFT determines what a top-level page table entry can map
#define pgd_ERROR(e) \
printk("%s:%d: bad pgd %p(%016lx).\n", __FILE__, __LINE__, &(e), pgd_val(e))
-#define level4_none(x) (!level4_val(x))
+#define pml4_none(x) (!pml4_val(x))
#define pgd_none(x) (!pgd_val(x))
-#define pgd_bad(x) ((pgd_val(x) & (~PAGE_MASK & ~_PAGE_USER)) != _KERNPG_TABLE )
-
extern inline int pgd_present(pgd_t pgd) { return !pgd_none(pgd); }
static inline void set_pte(pte_t *dst, pte_t val)
{
- *((unsigned long *)dst) = pte_val(val);
+ pte_val(*dst) = pte_val(val);
}
static inline void set_pmd(pmd_t *dst, pmd_t val)
{
- *((unsigned long *)dst) = pmd_val(val);
+ pmd_val(*dst) = pmd_val(val);
}
static inline void set_pgd(pgd_t *dst, pgd_t val)
{
- *((unsigned long *)dst) = pgd_val(val);
+ pgd_val(*dst) = pgd_val(val);
}
-extern inline void __pgd_clear (pgd_t * pgd)
+extern inline void pgd_clear (pgd_t * pgd)
{
set_pgd(pgd, __pgd(0));
}
-extern inline void pgd_clear (pgd_t * pgd)
+static inline void set_pml4(pml4_t *dst, pml4_t val)
{
- __pgd_clear(pgd);
- __flush_tlb();
+ pml4_val(*dst) = pml4_val(val);
}
#define pgd_page(pgd) \
#ifndef __ASSEMBLY__
-/* Just any arbitrary offset to the start of the vmalloc VM area: the
- * current 8MB value just means that there will be a 8MB "hole" after the
- * physical memory until the kernel virtual memory starts. That means that
- * any out-of-bounds memory accesses will hopefully be caught.
- * The vmalloc() routines leaves a hole of 4kB between each vmalloced
- * area for the same reason. ;)
- */
-#define VMALLOC_OFFSET (8*1024*1024)
-#define VMALLOC_START (((unsigned long) high_memory + 2*VMALLOC_OFFSET-1) & \
- ~(VMALLOC_OFFSET-1))
+#define VMALLOC_START 0xffffff0000000000
+#define VMALLOC_END 0xffffff7fffffffff
#define VMALLOC_VMADDR(x) ((unsigned long)(x))
-#define VMALLOC_END (__START_KERNEL_map-PAGE_SIZE)
+#define MODULES_VADDR 0xffffffffa0000000
+#define MODULES_END 0xffffffffafffffff
+#define MODULES_LEN (MODULES_END - MODULES_VADDR)
#define _PAGE_BIT_PRESENT 0
#define _PAGE_BIT_RW 1
#define _PAGE_BIT_PCD 4
#define _PAGE_BIT_ACCESSED 5
#define _PAGE_BIT_DIRTY 6
-#define _PAGE_BIT_PSE 7 /* 4 MB (or 2MB) page, Pentium+, if present.. */
+#define _PAGE_BIT_PSE 7 /* 4 MB (or 2MB) page */
#define _PAGE_BIT_GLOBAL 8 /* Global TLB entry PPro+ */
+#define _PAGE_BIT_NX 63 /* No execute: only valid after cpuid check */
#define _PAGE_PRESENT 0x001
#define _PAGE_RW 0x002
#define _PAGE_ACCESSED 0x020
#define _PAGE_DIRTY 0x040
#define _PAGE_PSE 0x080 /* 2MB page */
-#define _PAGE_GLOBAL 0x100 /* Global TLB entry PPro+ */
+#define _PAGE_GLOBAL 0x100 /* Global TLB entry */
#define _PAGE_PROTNONE 0x080 /* If not present */
+#define _PAGE_NX (1UL<<_PAGE_BIT_NX)
#define _PAGE_TABLE (_PAGE_PRESENT | _PAGE_RW | _PAGE_USER | _PAGE_ACCESSED | _PAGE_DIRTY)
#define _KERNPG_TABLE (_PAGE_PRESENT | _PAGE_RW | _PAGE_ACCESSED | _PAGE_DIRTY)
#define PAGE_KERNEL_NOCACHE MAKE_GLOBAL(__PAGE_KERNEL_NOCACHE)
#define PAGE_KERNEL_VSYSCALL MAKE_GLOBAL(__PAGE_KERNEL_VSYSCALL)
-/*
- * The i386 can't do page protection for execute, and considers that
- * the same are read. Also, write permissions imply read permissions.
- * This is the closest we can get..
- */
#define __P000 PAGE_NONE
#define __P001 PAGE_READONLY
#define __P010 PAGE_COPY
#define __S110 PAGE_SHARED
#define __S111 PAGE_SHARED
-/*
- * Define this if things work differently on an i386 and an i486:
- * it will (on an i486) warn about kernel memory accesses that are
- * done without a 'verify_area(VERIFY_WRITE,..)'
- */
-#undef TEST_VERIFY_AREA
-
-/* page table for 0-4MB for everybody */
-extern unsigned long pg0[1024];
-
-/*
- * Handling allocation failures during page table setup.
- */
-extern void __handle_bad_pmd(pmd_t * pmd);
-extern void __handle_bad_pmd_kernel(pmd_t * pmd);
+static inline unsigned long pgd_bad(pgd_t pgd)
+{
+ unsigned long val = pgd_val(pgd);
+ val &= ~PAGE_MASK;
+ val &= ~(_PAGE_USER | _PAGE_DIRTY);
+ return val & ~(_PAGE_PRESENT | _PAGE_RW | _PAGE_ACCESSED);
+}
#define pte_none(x) (!pte_val(x))
#define pte_present(x) (pte_val(x) & (_PAGE_PRESENT | _PAGE_PROTNONE))
#define pmd_clear(xp) do { set_pmd(xp, __pmd(0)); } while (0)
#define pmd_bad(x) ((pmd_val(x) & (~PAGE_MASK & ~_PAGE_USER)) != _KERNPG_TABLE )
-/*
- * Permanent address of a page. Obviously must never be
- * called on a highmem page.
- */
#define pages_to_mb(x) ((x) >> (20-PAGE_SHIFT))	/* FIXME: is this right? */
#define pte_page(x) (mem_map+((unsigned long)((pte_val(x) >> PAGE_SHIFT))))
* and a page entry and page directory to the page they refer to.
*/
-#define mk_pte(page,pgprot) \
-({ \
- pte_t __pte; \
- \
- set_pte(&__pte, __pte(((page)-mem_map) * \
- (unsigned long long)PAGE_SIZE + pgprot_val(pgprot))); \
- __pte; \
-})
-
-/* This takes a physical page address that is used by the remapping functions */
-#define mk_pte_phys(physpage, pgprot) \
-({ pte_t __pte; set_pte(&__pte, __pte(physpage + pgprot_val(pgprot))); __pte; })
-
-extern inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
-{ set_pte(&pte, __pte((pte_val(pte) & _PAGE_CHG_MASK) | pgprot_val(newprot))); return pte; }
-
#define page_pte(page) page_pte_prot(page, __pgprot(0))
-#define pmd_page_kernel(pmd) \
-((unsigned long) __va(pmd_val(pmd) & PAGE_MASK))
-#define pmd_page(pmd) \
- (mem_map + (pmd_val(pmd) >> PAGE_SHIFT))
+/*
+ * Level 4 access.
+ * Never use these in the common code.
+ */
+#define pml4_page(pml4) ((unsigned long) __va(pml4_val(pml4) & PAGE_MASK))
+#define pml4_index(address) ((address >> PML4_SHIFT) & (PTRS_PER_PML4-1))
+#define pml4_offset_k(address) (init_level4_pgt + pml4_index(address))
+#define mk_kernel_pml4(address) ((pml4_t){ (address) | _KERNPG_TABLE })
+#define level3_offset_k(dir, address) ((pgd_t *) pml4_page(*(dir)) + pgd_index(address))
+/* PGD - Level3 access */
+
+#define __pgd_offset_k(pgd, address) ((pgd) + pgd_index(address))
/* to find an entry in a page-table-directory. */
#define pgd_index(address) ((address >> PGDIR_SHIFT) & (PTRS_PER_PGD-1))
+#define current_pgd_offset_k(address) \
+ __pgd_offset_k((pgd_t *)read_pda(level4_pgt), address)
-#define __pgd_offset(address) pgd_index(address)
+/* This accesses the reference page table of the boot cpu.
+ Other CPUs get synced lazily via the page fault handler. */
+static inline pgd_t *pgd_offset_k(unsigned long address)
+{
+ pml4_t pml4;
+
+ pml4 = init_level4_pgt[pml4_index(address)];
+ return __pgd_offset_k(__va(pml4_val(pml4) & PAGE_MASK), address);
+}
+#define __pgd_offset(address) pgd_index(address)
#define pgd_offset(mm, address) ((mm)->pgd+pgd_index(address))
-/* to find an entry in a kernel page-table-directory */
-#define pgd_offset_k(address) pgd_offset(&init_mm, address)
+/* PMD - Level 2 access */
+#define pmd_page_kernel(pmd) ((unsigned long) __va(pmd_val(pmd) & PAGE_MASK))
+#define pmd_page(pmd) (mem_map + (pmd_val(pmd) >> PAGE_SHIFT))
#define __pmd_offset(address) \
(((address) >> PMD_SHIFT) & (PTRS_PER_PMD-1))
-/* Find an entry in the third-level page table.. */
+/* PTE - Level 1 access. */
+
+#define mk_pte(page,pgprot) \
+({ \
+ pte_t __pte; \
+ \
+ set_pte(&__pte, __pte(((page)-mem_map) * \
+ (unsigned long long)PAGE_SIZE + pgprot_val(pgprot))); \
+ __pte; \
+})
+
+/* This takes a physical page address that is used by the remapping functions */
+static inline pte_t mk_pte_phys(unsigned long physpage, pgprot_t pgprot)
+{
+ pte_t __pte;
+ set_pte(&__pte, __pte(physpage + pgprot_val(pgprot)));
+ return __pte;
+}
+
+extern inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
+{
+ set_pte(&pte, __pte((pte_val(pte) & _PAGE_CHG_MASK) | pgprot_val(newprot)));
+ return pte;
+}
+
#define __pte_offset(address) \
((address >> PAGE_SHIFT) & (PTRS_PER_PTE - 1))
#define pte_offset_kernel(dir, address) ((pte_t *) pmd_page_kernel(*(dir)) + \
__pte_offset(address))
+/* x86-64 always has all page tables mapped. */
#define pte_offset_map(dir,address) pte_offset_kernel(dir,address)
#define pte_offset_map_nested(dir,address) pte_offset_kernel(dir,address)
#define pte_unmap(pte) /* NOP */
#define pte_unmap_nested(pte) /* NOP */
-
-/* never use these in the common code */
-#define level4_page(level4) ((unsigned long) __va(level4_val(level4) & PAGE_MASK))
-#define level4_index(address) ((address >> LEVEL4_SHIFT) & (PTRS_PER_LEVEL4-1))
-#define level4_offset_k(address) (level4_pgt + level4_index(address))
-#define level3_offset_k(dir, address) ((pgd_t *) level4_page(*(dir)) + pgd_index(address))
-
-/*
- * The i386 doesn't have any external MMU info: the kernel page
- * tables contain all the necessary information.
- */
#define update_mmu_cache(vma,address,pte) do { } while (0)
/* Encode and de-code a swap entry */
#define HAVE_ARCH_UNMAPPED_AREA
#define pgtable_cache_init() do { } while (0)
-
+#define check_pgt_cache() do { } while (0)
#endif /* _X86_64_PGTABLE_H */
#define VIP_MASK 0x00100000 /* virtual interrupt pending */
#define ID_MASK 0x00200000
+/*
+ * Default implementation of macro that returns current
+ * instruction pointer ("program counter").
+ */
#define current_text_addr() ({ void *pc; asm volatile("leaq 1f(%%rip),%0\n1:":"=r"(pc)); pc; })
/*
:"ax");
}
+#if 0
+/*
+ * Cyrix CPU configuration register indexes
+ */
#define CX86_CCR0 0xc0
#define CX86_CCR1 0xc1
#define CX86_CCR2 0xc2
outb((data), 0x23); \
} while (0)
+#endif
/*
* Bus types
#define TASK_UNMAPPED_32 0x40000000
#define TASK_UNMAPPED_64 (TASK_SIZE/3)
#define TASK_UNMAPPED_BASE \
- ((current->thread.flags & THREAD_IA32) ? TASK_UNMAPPED_32 : TASK_UNMAPPED_64)
+ (test_thread_flags(TIF_IA32) ? TASK_UNMAPPED_32 : TASK_UNMAPPED_64)
/*
* Size of io_bitmap in longwords: 32 is ports 0-0x3ff.
#define IO_BITMAP_OFFSET offsetof(struct tss_struct,io_bitmap)
#define INVALID_IO_BITMAP_OFFSET 0x8000
-/* We'll have to decide which format to use for floating stores, and
- kill all others... */
-struct i387_fsave_struct {
- u32 cwd;
- u32 swd;
- u32 twd;
- u32 fip;
- u32 fcs;
- u32 foo;
- u32 fos;
- u32 st_space[20]; /* 8*10 bytes for each FP-reg = 80 bytes */
- u32 status; /* software status information */
-};
-
struct i387_fxsave_struct {
u16 cwd;
u16 swd;
u16 twd;
u16 fop;
- u32 fip;
- u32 fcs;
- u32 foo;
- u32 fos;
+ u64 rip;
+ u64 rdp;
u32 mxcsr;
- u32 reserved;
+ u32 mxcsr_mask;
u32 st_space[32]; /* 8*16 bytes for each FP-reg = 128 bytes */
- u32 xmm_space[32]; /* 8*16 bytes for each XMM-reg = 128 bytes */
- u32 padding[56];
+ u32 xmm_space[64]; /* 16*16 bytes for each XMM-reg = 256 bytes */
+ u32 padding[24];
} __attribute__ ((aligned (16)));
-struct i387_soft_struct {
- u32 cwd;
- u32 swd;
- u32 twd;
- u32 fip;
- u32 fcs;
- u32 foo;
- u32 fos;
- u32 st_space[20]; /* 8*10 bytes for each FP-reg = 80 bytes */
- unsigned char ftop, changed, lookahead, no_update, rm, alimit;
- struct info *info;
- unsigned long entry_eip;
-};
-
union i387_union {
- struct i387_fsave_struct fsave;
struct i387_fxsave_struct fxsave;
- struct i387_soft_struct soft;
};
typedef struct {
#define INIT_MMAP \
{ &init_mm, 0, 0, NULL, PAGE_SHARED, VM_READ | VM_WRITE | VM_EXEC, 1, NULL, NULL }
-
-#ifndef CONFIG_SMP
-extern char stackfault_stack[];
-#define STACKDESC rsp2: (unsigned long)stackfault_stack,
-#define STACKFAULT_STACK 2
-#else
-#define STACKFAULT_STACK 0
-#define STACKDESC
-#endif
-
-/* Doublefault currently shares the same stack on all CPUs. Hopefully
- only one gets into this unfortunate condition at a time. Cannot do
- the same for SF because that can be easily triggered by user
- space. */
-#define INIT_TSS { \
- rsp1: (unsigned long)doublefault_stack, \
- STACKDESC \
-}
-
-extern char doublefault_stack[];
+#define STACKFAULT_STACK 1
+#define DOUBLEFAULT_STACK 2
+#define NMI_STACK 3
+#define N_EXCEPTION_STACKS 3 /* hw limit: 7 */
+#define EXCEPTION_STKSZ 1024
#define start_thread(regs,new_rip,new_rsp) do { \
__asm__("movl %0,%%fs; movl %0,%%es; movl %0,%%ds": :"r" (0)); \
/*
* Return saved PC of a blocked thread.
+ * What is this good for? It will always be the scheduler or ret_from_fork.
*/
-extern inline unsigned long thread_saved_pc(struct task_struct *t)
-{
- return -1; /* FIXME */
-}
-
-unsigned long get_wchan(struct task_struct *p);
-
+#define thread_saved_pc(t) (*(unsigned long *)((t)->thread.rsp - 8))
-/* FIXME: this is incorrect when the task is sleeping in a syscall entered
- through SYSCALL. */
-#define __kstk_regs(tsk) \
- ((struct pt_regs *)\
- (((char *)(tsk)->thread_info) + THREAD_SIZE - sizeof(struct pt_regs)))
-#define KSTK_EIP(tsk) (__kstk_regs(tsk)->rip)
-#define KSTK_ESP(tsk) (__kstk_regs(tsk)->rsp)
+extern unsigned long get_wchan(struct task_struct *p);
+#define KSTK_EIP(tsk) \
+ (((struct pt_regs *)(tsk->thread.rsp0 - sizeof(struct pt_regs)))->rip)
+#define KSTK_ESP(tsk) -1 /* sorry. doesn't work for syscall. */
/* REP NOP (PAUSE) is a good thing to insert into busy-wait loops. */
extern inline void rep_nop(void)
#define cpu_has_fpu 1
-/* 3d now! prefetch instructions. Could also use the SSE flavours; not sure
- if it makes a difference. gcc 3.1 has __builtin_prefetch too, but I am
- not sure it makes sense to use them. */
-
#define ARCH_HAS_PREFETCH
#define ARCH_HAS_PREFETCHW
#define ARCH_HAS_SPINLOCK_PREFETCH
-extern inline void prefetch(const void *x)
-{
- __asm__ __volatile__ ("prefetch (%0)" : : "r"(x));
-}
-
-extern inline void prefetchw(const void *x)
-{
- __asm__ __volatile__ ("prefetchw (%0)" : : "r"(x));
-}
+#define prefetch(x) __builtin_prefetch((x),0)
+#define prefetchw(x) __builtin_prefetch((x),1)
#define spin_lock_prefetch(x) prefetchw(x)
#define cpu_relax() rep_nop()
#ifndef _X86_64_PTRACE_H
#define _X86_64_PTRACE_H
-#ifdef __ASSEMBLY__
+#if defined(__ASSEMBLY__) || defined(__FRAME_OFFSETS)
#define R15 0
#define R14 8
#define R13 16
#define PTRACE_SETFPXREGS 19
#if defined(__KERNEL__) && !defined(__ASSEMBLY__)
-#define user_mode(regs) ((regs)->rsp <= PAGE_OFFSET)
-#define instruction_pointer(regs) ((regs)->eip)
+#define user_mode(regs) (!!((regs)->cs & 3))
+#define instruction_pointer(regs) ((regs)->rip)
extern void show_regs(struct pt_regs *);
enum {
* spinlock.h Copyright 1996 Linus Torvalds.
*
* Copyright 1999 Red Hat, Inc.
- * Copyright 2001 SuSE labs
+ * Copyright 2001,2002 SuSE labs
*
* Written by Benjamin LaHaise.
*
#ifndef _ASM_X86_64_RWLOCK_H
#define _ASM_X86_64_RWLOCK_H
+#include <linux/stringify.h>
+
#define RW_LOCK_BIAS 0x01000000
#define RW_LOCK_BIAS_STR "0x01000000"
asm volatile(LOCK "subl $1,(%0)\n\t" \
"js 2f\n" \
"1:\n" \
- ".section .text.lock,\"ax\"\n" \
+ LOCK_SECTION_START("") \
"2:\tcall " helper "\n\t" \
"jmp 1b\n" \
- ".previous" \
- ::"d" (rw) : "memory")
+ LOCK_SECTION_END \
+ ::"a" (rw) : "memory")
#define __build_read_lock_const(rw, helper) \
asm volatile(LOCK "subl $1,%0\n\t" \
"js 2f\n" \
"1:\n" \
- ".section .text.lock,\"ax\"\n" \
+ LOCK_SECTION_START("") \
"2:\tpushq %%rax\n\t" \
- "leal %0,%%eax\n\t" \
+ "leaq %0,%%rax\n\t" \
"call " helper "\n\t" \
"popq %%rax\n\t" \
"jmp 1b\n" \
- ".previous" \
+ LOCK_SECTION_END \
:"=m" (*((volatile int *)rw))::"memory")
#define __build_read_lock(rw, helper) do { \
asm volatile(LOCK "subl $" RW_LOCK_BIAS_STR ",(%0)\n\t" \
"jnz 2f\n" \
"1:\n" \
- ".section .text.lock,\"ax\"\n" \
+ LOCK_SECTION_START("") \
"2:\tcall " helper "\n\t" \
"jmp 1b\n" \
- ".previous" \
- ::"d" (rw) : "memory")
+ LOCK_SECTION_END \
+ ::"a" (rw) : "memory")
#define __build_write_lock_const(rw, helper) \
asm volatile(LOCK "subl $" RW_LOCK_BIAS_STR ",(%0)\n\t" \
"jnz 2f\n" \
"1:\n" \
- ".section .text.lock,\"ax\"\n" \
+ LOCK_SECTION_START("") \
"2:\tpushq %%rax\n\t" \
"leaq %0,%%rax\n\t" \
"call " helper "\n\t" \
"popq %%rax\n\t" \
"jmp 1b\n" \
- ".previous" \
+ LOCK_SECTION_END \
:"=m" (*((volatile long *)rw))::"memory")
#define __build_write_lock(rw, helper) do { \
* the semaphore definition
*/
struct rw_semaphore {
- signed long count;
+ signed int count;
#define RWSEM_UNLOCKED_VALUE 0x00000000
#define RWSEM_ACTIVE_BIAS 0x00000001
#define RWSEM_ACTIVE_MASK 0x0000ffff
{
__asm__ __volatile__(
"# beginning down_read\n\t"
-LOCK_PREFIX " incl (%%rax)\n\t" /* adds 0x00000001, returns the old value */
+LOCK_PREFIX " incl (%%rdi)\n\t" /* adds 0x00000001, returns the old value */
" js 2f\n\t" /* jump if we weren't granted the lock */
"1:\n\t"
- ".section .text.lock,\"ax\"\n"
+ LOCK_SECTION_START("") \
"2:\n\t"
" call rwsem_down_read_failed_thunk\n\t"
" jmp 1b\n"
- ".previous"
+ LOCK_SECTION_END \
"# ending down_read\n\t"
: "+m"(sem->count)
- : "a"(sem)
+ : "D"(sem)
: "memory", "cc");
}
tmp = RWSEM_ACTIVE_WRITE_BIAS;
__asm__ __volatile__(
"# beginning down_write\n\t"
-LOCK_PREFIX " xadd %0,(%%rax)\n\t" /* subtract 0x0000ffff, returns the old value */
+LOCK_PREFIX " xaddl %0,(%%rdi)\n\t" /* subtract 0x0000ffff, returns the old value */
" testl %0,%0\n\t" /* was the count 0 before? */
" jnz 2f\n\t" /* jump if we weren't granted the lock */
"1:\n\t"
- ".section .text.lock,\"ax\"\n"
+ LOCK_SECTION_START("")
"2:\n\t"
" call rwsem_down_write_failed_thunk\n\t"
" jmp 1b\n"
- ".previous\n"
+ LOCK_SECTION_END
"# ending down_write"
- : "=r" (tmp)
- : "0"(tmp), "a"(sem)
+ : "=&r" (tmp)
+ : "0"(tmp), "D"(sem)
: "memory", "cc");
}
__s32 tmp = -RWSEM_ACTIVE_READ_BIAS;
__asm__ __volatile__(
"# beginning __up_read\n\t"
-LOCK_PREFIX " xadd %%edx,(%%rax)\n\t" /* subtracts 1, returns the old value */
+LOCK_PREFIX " xaddl %%edx,(%%rdi)\n\t" /* subtracts 1, returns the old value */
" js 2f\n\t" /* jump if the lock is being waited upon */
"1:\n\t"
- ".section .text.lock,\"ax\"\n"
+ LOCK_SECTION_START("")
"2:\n\t"
" decw %%dx\n\t" /* do nothing if still outstanding active readers */
" jnz 1b\n\t"
" call rwsem_wake_thunk\n\t"
" jmp 1b\n"
- ".previous\n"
+ LOCK_SECTION_END
"# ending __up_read\n"
- : "+d"(tmp)
- : "a"(sem)
- : "memory");
+ : "+m"(sem->count), "+d"(tmp)
+ : "D"(sem)
+ : "memory", "cc");
}
/*
{
__asm__ __volatile__(
"# beginning __up_write\n\t"
- " movl %1,%%edx\n\t"
-LOCK_PREFIX " xaddl %%edx,(%%rax)\n\t" /* tries to transition 0xffff0001 -> 0x00000000 */
+ " movl %2,%%edx\n\t"
+LOCK_PREFIX " xaddl %%edx,(%%rdi)\n\t" /* tries to transition 0xffff0001 -> 0x00000000 */
" jnz 2f\n\t" /* jump if the lock is being waited upon */
"1:\n\t"
- ".section .text.lock,\"ax\"\n"
+ LOCK_SECTION_START("")
"2:\n\t"
" decw %%dx\n\t" /* did the active count reduce to 0? */
" jnz 1b\n\t" /* jump back if not */
" call rwsem_wake_thunk\n\t"
" jmp 1b\n"
- ".previous\n"
+ LOCK_SECTION_END
"# ending __up_write\n"
- :
- : "a"(sem), "i"(-RWSEM_ACTIVE_WRITE_BIAS)
- : "memory", "cc", "edx");
+ : "+m"(sem->count)
+ : "D"(sem), "i"(-RWSEM_ACTIVE_WRITE_BIAS)
+ : "memory", "cc", "rdx");
}
/*
int tmp = delta;
__asm__ __volatile__(
-LOCK_PREFIX "xadd %0,(%2)"
+LOCK_PREFIX "xaddl %0,(%2)"
: "=r"(tmp), "=m"(sem->count)
: "r"(sem), "m"(sem->count), "0" (tmp)
: "memory");
}
#endif /* __KERNEL__ */
-#endif /* _I386_RWSEM_H */
+#endif /* _X8664_RWSEM_H */
#define __KERNEL_CS 0x10
#define __KERNEL_DS 0x18
+#define __KERNEL32_CS 0x38
+
/*
* we cannot use the same code segment descriptor for user and kernel
* even not in the long flat model, because of different DPL /kkeil
#include <asm/rwlock.h>
#include <linux/wait.h>
#include <linux/rwsem.h>
+#include <linux/stringify.h>
struct semaphore {
atomic_t count;
LOCK "decl %0\n\t" /* --sem->count */
"js 2f\n"
"1:\n"
- ".section .text.lock,\"ax\"\n"
+ LOCK_SECTION_START("")
"2:\tcall __down_failed\n\t"
"jmp 1b\n"
- ".previous"
+ LOCK_SECTION_END
:"=m" (sem->count)
:"D" (sem)
:"memory");
"js 2f\n\t"
"xorl %0,%0\n"
"1:\n"
- ".section .text.lock,\"ax\"\n"
+ LOCK_SECTION_START("")
"2:\tcall __down_failed_interruptible\n\t"
"jmp 1b\n"
- ".previous"
+ LOCK_SECTION_END
:"=a" (result), "=m" (sem->count)
:"D" (sem)
:"memory");
"js 2f\n\t"
"xorl %0,%0\n"
"1:\n"
- ".section .text.lock,\"ax\"\n"
+ LOCK_SECTION_START("")
"2:\tcall __down_failed_trylock\n\t"
"jmp 1b\n"
- ".previous"
+ LOCK_SECTION_END
:"=a" (result), "=m" (sem->count)
:"D" (sem)
- :"memory");
+ :"memory","cc");
return result;
}
LOCK "incl %0\n\t" /* ++sem->count */
"jle 2f\n"
"1:\n"
- ".section .text.lock,\"ax\"\n"
+ LOCK_SECTION_START("")
"2:\tcall __up_wakeup\n\t"
"jmp 1b\n"
- ".previous"
+ LOCK_SECTION_END
:"=m" (sem->count)
:"D" (sem)
:"memory");
#include <asm/types.h>
-/*
- * The first part of "struct _fpstate" is just the normal i387
- * hardware setup, the extra "status" word is used to save the
- * coprocessor status word before entering the handler.
- *
- * Pentium III FXSR, SSE support
- * Gareth Hughes <gareth@valinux.com>, May 2000
- *
- * The FPU state data structure has had to grow to accomodate the
- * extended FPU state required by the Streaming SIMD Extensions.
- * There is no documented standard to accomplish this at the moment.
- */
-struct _fpreg {
- unsigned short significand[4];
- unsigned short exponent;
-};
-
-struct _fpxreg {
- unsigned short significand[4];
- unsigned short exponent;
- unsigned short padding[3];
-};
-
-struct _xmmreg {
- __u32 element[4];
-};
-
-
-/* This is FXSAVE layout without 64bit prefix thus 32bit compatible.
- This means that the IP and DPs are only 32bit and are not useful
- in 64bit space.
- If someone used them we would need to switch to 64bit FXSAVE.
-*/
+/* FXSAVE frame */
+/* Note: reserved1/2 may someday contain valuable data. Always save/restore
+ them when you change signal frames. */
struct _fpstate {
- /* Regular FPU environment */
- __u32 cw;
- __u32 sw;
- __u32 tag;
- __u32 ipoff;
- __u32 cssel;
- __u32 dataoff;
- __u32 datasel;
- struct _fpreg _st[8];
- unsigned short status;
- unsigned short magic; /* 0xffff = regular FPU data only */
-
- /* FXSR FPU environment */
- __u32 _fxsr_env[6];
+ __u16 cwd;
+ __u16 swd;
+ __u16 twd; /* Note this is not the same as the 32bit/x87/FSAVE twd */
+ __u16 fop;
+ __u64 rip;
+ __u64 rdp;
__u32 mxcsr;
- __u32 reserved;
- struct _fpxreg _fxsr_st[8];
- struct _xmmreg _xmm[8]; /* It's actually 16 */
- __u32 padding[56];
+ __u32 mxcsr_mask;
+ __u32 st_space[32]; /* 8*16 bytes for each FP-reg */
+ __u32 xmm_space[64]; /* 16*16 bytes for each XMM-reg */
+ __u32 reserved2[24];
};
-#define X86_FXSR_MAGIC 0x0000
-
struct sigcontext {
- unsigned short gs, __gsh;
- unsigned short fs, __fsh;
- unsigned short es, __esh;
- unsigned short ds, __dsh;
unsigned long r8;
unsigned long r9;
unsigned long r10;
+ unsigned long r11;
unsigned long r12;
unsigned long r13;
unsigned long r14;
unsigned long rbx;
unsigned long rdx;
unsigned long rax;
- unsigned long trapno;
- unsigned long err;
+ unsigned long rcx;
+ unsigned long rsp;
unsigned long rip;
- unsigned short cs, __csh;
- unsigned int __pad0;
- unsigned long eflags;
- unsigned long rsp_at_signal;
- struct _fpstate * fpstate;
+ unsigned long eflags; /* RFLAGS */
+ unsigned short cs;
+ unsigned short gs;
+ unsigned short fs;
+ unsigned short __pad0;
+ unsigned long err;
+ unsigned long trapno;
unsigned long oldmask;
unsigned long cr2;
- unsigned long r11;
- unsigned long rcx;
- unsigned long rsp;
+ struct _fpstate *fpstate; /* zero when no FPU context */
+ unsigned long reserved1[8];
};
-
#endif
--- /dev/null
+#ifndef _SIGCONTEXT32_H
+#define _SIGCONTEXT32_H 1
+
+/* signal context for 32bit programs. */
+
+#define X86_FXSR_MAGIC 0x0000
+
+struct _fpreg {
+ unsigned short significand[4];
+ unsigned short exponent;
+};
+
+struct _fpxreg {
+ unsigned short significand[4];
+ unsigned short exponent;
+ unsigned short padding[3];
+};
+
+struct _xmmreg {
+ __u32 element[4];
+};
+
+/* FSAVE frame with extensions */
+struct _fpstate_ia32 {
+ /* Regular FPU environment */
+ __u32 cw;
+ __u32 sw;
+ __u32 tag; /* not compatible to 64bit twd */
+ __u32 ipoff;
+ __u32 cssel;
+ __u32 dataoff;
+ __u32 datasel;
+ struct _fpreg _st[8];
+ unsigned short status;
+ unsigned short magic; /* 0xffff = regular FPU data only */
+
+ /* FXSR FPU environment */
+ __u32 _fxsr_env[6];
+ __u32 mxcsr;
+ __u32 reserved;
+ struct _fpxreg _fxsr_st[8];
+ struct _xmmreg _xmm[8]; /* It's actually 16 */
+ __u32 padding[56];
+};
+
+struct sigcontext_ia32 {
+ unsigned short gs, __gsh;
+ unsigned short fs, __fsh;
+ unsigned short es, __esh;
+ unsigned short ds, __dsh;
+ unsigned int edi;
+ unsigned int esi;
+ unsigned int ebp;
+ unsigned int esp;
+ unsigned int ebx;
+ unsigned int edx;
+ unsigned int ecx;
+ unsigned int eax;
+ unsigned int trapno;
+ unsigned int err;
+ unsigned int eip;
+ unsigned short cs, __csh;
+ unsigned int eflags;
+ unsigned int esp_at_signal;
+ unsigned short ss, __ssh;
+ unsigned int fpstate; /* really (struct _fpstate_ia32 *) */
+ unsigned int oldmask;
+ unsigned int cr2;
+};
+
+#endif
#define SI_ASYNCIO -4 /* sent by AIO completion */
#define SI_SIGIO -5 /* sent by queued SIGIO */
#define SI_TKILL -6 /* sent by tkill system call */
+#define SI_DETHREAD -7 /* sent by execve() killing subsidiary thread */
#define SI_FROMUSER(siptr) ((siptr)->si_code <= 0)
#define SI_FROMKERNEL(siptr) ((siptr)->si_code > 0)
#include <asm/io_apic.h>
#endif
#include <asm/apic.h>
+#include <asm/thread_info.h>
#endif
#endif
#define NO_PROC_ID 0xFF /* No processor magic marker */
-
-
#endif
#define INT_DELIVERY_MODE 1 /* logical delivery */
#define TARGET_CPUS 1
+
+
+#ifndef CONFIG_SMP
+#define stack_smp_processor_id() 0
+#else
+#include <asm/thread_info.h>
+#define stack_smp_processor_id() \
+({ \
+ struct thread_info *ti; \
+ __asm__("andq %%rsp,%0; ":"=r" (ti) : "0" (~8191UL)); \
+ ti->cpu; \
+})
#endif
+
+#endif
+
#define LOCK_PREFIX ""
#endif
-struct task_struct; /* one of the stranger aspects of C forward declarations.. */
-extern void __switch_to(struct task_struct *prev, struct task_struct *next);
-
-#define prepare_to_switch() do { } while(0)
-
-#define switch_to(prev,next) do { \
- asm volatile("pushq %%rbp\n\t" \
- "pushq %%rbx\n\t" \
- "pushq %%r8\n\t" \
- "pushq %%r9\n\t" \
- "pushq %%r10\n\t" \
- "pushq %%r11\n\t" \
- "pushq %%r12\n\t" \
- "pushq %%r13\n\t" \
- "pushq %%r14\n\t" \
- "pushq %%r15\n\t" \
- "movq %%rsp,%0\n\t" /* save RSP */ \
- "movq %2,%%rsp\n\t" /* restore RSP */ \
- "leaq 1f(%%rip),%%rbp\n\t" \
- "movq %%rbp,%1\n\t" /* save RIP */ \
- "pushq %3\n\t" /* setup new RIP */ \
+#define prepare_to_switch() do {} while(0)
+
+#define __STR(x) #x
+#define STR(x) __STR(x)
+
+#define __PUSH(x) "pushq %%" __STR(x) "\n\t"
+#define __POP(x) "popq %%" __STR(x) "\n\t"
+
+/* frame pointer must be last for get_wchan */
+
+/* It would be more efficient to let the compiler clobber most of these registers.
+   Clobbering them all is not possible because that makes reload freak out. Even
+   just clobbering six generates wrong code with gcc 3.1 for me, so do it this
+   way for now. rbp always needs to be saved explicitly because gcc cannot
+   clobber the frame pointer and the scheduler is compiled with frame pointers. -AK */
+#define SAVE_CONTEXT \
+ __PUSH(r8) __PUSH(r9) __PUSH(r10) __PUSH(r11) __PUSH(r12) __PUSH(r13) \
+ __PUSH(r14) __PUSH(r15) __PUSH(rax) \
+ __PUSH(rdi) __PUSH(rsi) \
+ __PUSH(rdx) __PUSH(rcx) \
+ __PUSH(rbx) __PUSH(rbp)
+#define RESTORE_CONTEXT \
+ __POP(rbp) __POP(rbx) \
+ __POP(rcx) __POP(rdx) \
+ __POP(rsi) __POP(rdi) \
+ __POP(rax) __POP(r15) __POP(r14) __POP(r13) __POP(r12) __POP(r11) __POP(r10) \
+ __POP(r9) __POP(r8)
+
+#define switch_to(prev,next) \
+ asm volatile(SAVE_CONTEXT \
+ "movq %%rsp,%[prevrsp]\n\t" \
+ "movq %[nextrsp],%%rsp\n\t" \
+ "movq $1f,%[prevrip]\n\t" \
+ "pushq %[nextrip]\n\t" \
"jmp __switch_to\n\t" \
- "1:\t" \
- "popq %%r15\n\t" \
- "popq %%r14\n\t" \
- "popq %%r13\n\t" \
- "popq %%r12\n\t" \
- "popq %%r11\n\t" \
- "popq %%r10\n\t" \
- "popq %%r9\n\t" \
- "popq %%r8\n\t" \
- "popq %%rbx\n\t" \
- "popq %%rbp\n\t" \
- :"=m" (prev->thread.rsp),"=m" (prev->thread.rip) \
- :"m" (next->thread.rsp),"m" (next->thread.rip), \
- "b" (prev), "S" (next), "D" (prev)); \
-} while (0)
+ "1:\n\t" \
+ RESTORE_CONTEXT \
+ :[prevrsp] "=m" (prev->thread.rsp), \
+ [prevrip] "=m" (prev->thread.rip) \
+ :[nextrsp] "m" (next->thread.rsp), \
+ [nextrip]"m" (next->thread.rip), \
+ [next] "S" (next), [prev] "D" (prev) \
+ :"memory")
+
+extern void load_gs_index(unsigned);
/*
* Load a segment. Fall back on loading the zero
* segment if something goes wrong..
*/
-#define loadsegment(seg,value) do { int v = value; \
+#define loadsegment(seg,value) \
asm volatile("\n" \
"1:\t" \
"movl %0,%%" #seg "\n" \
".align 4\n\t" \
".quad 1b,3b\n" \
".previous" \
- : :"r" (v)); } while(0)
+ : :"r" ((int)(value)))
#define set_debug(value,register) \
__asm__("movq %0,%%db" #register \
* Force strict CPU ordering.
* And yes, this is required on UP too when we're talking
* to devices.
- *
- * For now, "wmb()" doesn't actually do anything, as all
- * Intel CPU's follow what Intel calls a *Processor Order*,
- * in which all writes are seen in the program order even
- * outside the CPU.
- *
- * I expect future Intel CPU's to have a weaker ordering,
- * but I'd also expect them to finally get their act together
- * and add some real memory barriers if so.
*/
-#define mb() __asm__ __volatile__ ("lock; addl $0,0(%%rsp)": : :"memory")
-#define rmb() mb()
-#define wmb() __asm__ __volatile__ ("": : :"memory")
+#define mb() asm volatile("mfence":::"memory")
+#define rmb() asm volatile("lfence":::"memory")
+#define wmb() asm volatile("sfence":::"memory")
#define set_mb(var, value) do { xchg(&var, value); } while (0)
#define set_wmb(var, value) do { var = value; wmb(); } while (0)
+#define warn_if_not_ulong(x) do { unsigned long foo; (void) (&(x) == &foo); } while (0)
+
/* interrupt control.. */
-#define __save_flags(x) __asm__ __volatile__("# save_flags \n\t pushfq ; popq %q0":"=g" (x): /* no input */ :"memory")
+#define __save_flags(x) do { warn_if_not_ulong(x); __asm__ __volatile__("# save_flags \n\t pushfq ; popq %q0":"=g" (x): /* no input */ :"memory"); } while (0)
#define __restore_flags(x) __asm__ __volatile__("# restore_flags \n\t pushq %0 ; popfq": /* no output */ :"g" (x):"memory", "cc")
#define __cli() __asm__ __volatile__("cli": : :"memory")
#define __sti() __asm__ __volatile__("sti": : :"memory")
#define safe_halt() __asm__ __volatile__("sti; hlt": : :"memory")
/* For spinlocks etc */
-#define local_irq_save(x) __asm__ __volatile__("# local_irq_save \n\t pushfq ; popq %0 ; cli":"=g" (x): /* no input */ :"memory")
+#define local_irq_save(x) do { warn_if_not_ulong(x); __asm__ __volatile__("# local_irq_save \n\t pushfq ; popq %0 ; cli":"=g" (x): /* no input */ :"memory"); } while (0)
#define local_irq_restore(x) __asm__ __volatile__("# local_irq_restore \n\t pushq %0 ; popfq": /* no output */ :"g" (x):"memory")
-#define local_irq_disable() __asm__ __volatile__("cli": : :"memory")
-#define local_irq_enable() __asm__ __volatile__("sti": : :"memory")
+#define local_irq_disable() __cli()
+#define local_irq_enable() __sti()
#ifdef CONFIG_SMP
#endif
-#define icebp() asm volatile("xchg %bx,%bx")
-
+/* Default Simics "magic" breakpoint */
+#define icebp() asm volatile("xchg %%bx,%%bx" ::: "ebx")
/*
* disable hlt during certain critical i/o operations
/* how to get the thread information struct from C */
-#ifdef CONFIG_PREEMPT
-/* Preemptive kernels need to access this from interrupt context too. */
static inline struct thread_info *current_thread_info(void)
{
struct thread_info *ti;
ti = (void *)read_pda(kernelstack) + PDA_STACKOFFSET - THREAD_SIZE;
return ti;
}
-#else
-/* On others go for a minimally cheaper way. */
-static inline struct thread_info *current_thread_info(void)
+
+static inline struct thread_info *stack_thread_info(void)
{
struct thread_info *ti;
__asm__("andq %%rsp,%0; ":"=r" (ti) : "0" (~8191UL));
return ti;
}
-#endif
/* thread information allocation */
#define THREAD_SIZE (2*PAGE_SIZE)
#define TIF_NOTIFY_RESUME 1 /* resumption notification requested */
#define TIF_SIGPENDING 2 /* signal pending */
#define TIF_NEED_RESCHED 3 /* rescheduling necessary */
-#define TIF_USEDFPU 16 /* FPU was used by this task this quantum (SMP) */
+#define TIF_USEDFPU 16 /* FPU was used by this task this quantum */
#define TIF_POLLING_NRFLAG 17 /* true if poll_idle() is polling TIF_NEED_RESCHED */
#define TIF_IA32 18 /* 32bit process */
#define _TIF_WORK_MASK 0x0000FFFE /* work to do on interrupt/exception return */
#define _TIF_ALLWORK_MASK 0x0000FFFF /* work to do on any return to u-space */
+#define PREEMPT_ACTIVE 0x4000000
+
#endif /* __KERNEL__ */
#endif /* _ASM_THREAD_INFO_H */
#define CLOCK_TICK_RATE 1193180 /* Underlying HZ */
#define CLOCK_TICK_FACTOR 20 /* Factor of both 1000000 and CLOCK_TICK_RATE */
-#define FINETUNE ((((((long)LATCH * HZ - CLOCK_TICK_RATE) << SHIFT_HZ) * \
+#define FINETUNE ((((((int)LATCH * HZ - CLOCK_TICK_RATE) << SHIFT_HZ) * \
(1000000/CLOCK_TICK_FACTOR) / (CLOCK_TICK_RATE/CLOCK_TICK_FACTOR)) \
<< (SHIFT_SCALE-SHIFT_HZ)) / HZ)
#endif
}
-extern unsigned long cpu_khz;
+extern unsigned int cpu_khz;
#endif
--- /dev/null
+#ifndef _X8664_TLBFLUSH_H
+#define _X8664_TLBFLUSH_H
+
+#include <linux/config.h>
+#include <linux/mm.h>
+#include <asm/processor.h>
+
+#define __flush_tlb() \
+ do { \
+ unsigned long tmpreg; \
+ \
+ __asm__ __volatile__( \
+ "movq %%cr3, %0; # flush TLB \n" \
+ "movq %0, %%cr3; \n" \
+ : "=r" (tmpreg) \
+ :: "memory"); \
+ } while (0)
+
+/*
+ * Global pages have to be flushed a bit differently. Not a real
+ * performance problem because this does not happen often.
+ */
+#define __flush_tlb_global() \
+ do { \
+ unsigned long tmpreg; \
+ \
+ __asm__ __volatile__( \
+ "movq %1, %%cr4; # turn off PGE \n" \
+ "movq %%cr3, %0; # flush TLB \n" \
+ "movq %0, %%cr3; \n" \
+ "movq %2, %%cr4; # turn PGE back on \n" \
+ : "=&r" (tmpreg) \
+ : "r" (mmu_cr4_features & ~X86_CR4_PGE), \
+ "r" (mmu_cr4_features) \
+ : "memory"); \
+ } while (0)
+
+extern unsigned long pgkern_mask;
+
+#define __flush_tlb_all() __flush_tlb_global()
+
+#define __flush_tlb_one(addr) \
+ __asm__ __volatile__("invlpg %0": :"m" (*(char *) addr))
+
+
+/*
+ * TLB flushing:
+ *
+ * - flush_tlb() flushes the current mm struct TLBs
+ * - flush_tlb_all() flushes all processes TLBs
+ * - flush_tlb_mm(mm) flushes the specified mm context TLB's
+ * - flush_tlb_page(vma, vmaddr) flushes one page
+ * - flush_tlb_range(vma, start, end) flushes a range of pages
+ * - flush_tlb_kernel_range(start, end) flushes a range of kernel pages
+ * - flush_tlb_pgtables(mm, start, end) flushes a range of page tables
+ *
+ * ..but the x86_64 has somewhat limited tlb flushing capabilities,
+ * though page-granular flushes (invlpg) are available on all x86_64 CPUs.
+ */
+
+#ifndef CONFIG_SMP
+
+#define flush_tlb() __flush_tlb()
+#define flush_tlb_all() __flush_tlb_all()
+#define local_flush_tlb() __flush_tlb()
+
+static inline void flush_tlb_mm(struct mm_struct *mm)
+{
+ if (mm == current->active_mm)
+ __flush_tlb();
+}
+
+static inline void flush_tlb_page(struct vm_area_struct *vma,
+ unsigned long addr)
+{
+ if (vma->vm_mm == current->active_mm)
+ __flush_tlb_one(addr);
+}
+
+static inline void flush_tlb_range(struct vm_area_struct *vma,
+ unsigned long start, unsigned long end)
+{
+ if (vma->vm_mm == current->active_mm)
+ __flush_tlb();
+}
+
+#else
+
+#include <asm/smp.h>
+
+#define local_flush_tlb() \
+ __flush_tlb()
+
+extern void flush_tlb_all(void);
+extern void flush_tlb_current_task(void);
+extern void flush_tlb_mm(struct mm_struct *);
+extern void flush_tlb_page(struct vm_area_struct *, unsigned long);
+
+#define flush_tlb() flush_tlb_current_task()
+
+static inline void flush_tlb_range(struct vm_area_struct * vma, unsigned long start, unsigned long end)
+{
+ flush_tlb_mm(vma->vm_mm);
+}
+
+#define TLBSTATE_OK 1
+#define TLBSTATE_LAZY 2
+
+struct tlb_state
+{
+ struct mm_struct *active_mm;
+ int state;
+ char __cacheline_padding[24];
+};
+extern struct tlb_state cpu_tlbstate[NR_CPUS];
+
+
+#endif
+
+#define flush_tlb_kernel_range(start, end) flush_tlb_all()
+
+static inline void flush_tlb_pgtables(struct mm_struct *mm,
+ unsigned long start, unsigned long end)
+{
+ /* x86_64 does not keep any page table caches in TLB */
+}
+
+#endif /* _X8664_TLBFLUSH_H */
#define BITS_PER_LONG 64
-typedef u32 dma64_addr_t;
+typedef u64 dma64_addr_t;
typedef u64 dma_addr_t;
#endif /* __KERNEL__ */
#define __NR_shmget 29
__SYSCALL(__NR_shmget, sys_shmget)
#define __NR_shmat 30
-__SYSCALL(__NR_shmat, sys_shmat)
+__SYSCALL(__NR_shmat, wrap_sys_shmat)
#define __NR_shmctl 31
__SYSCALL(__NR_shmctl, sys_shmctl)
#define __NR_semop 65
__SYSCALL(__NR_semop, sys_semop)
#define __NR_semctl 66
-__SYSCALL(__NR_semctl, sys_semctl)
+__SYSCALL(__NR_semctl, wrap_sys_semctl)
#define __NR_shmdt 67
__SYSCALL(__NR_shmdt, sys_shmdt)
#define __NR_msgget 68
__SYSCALL(__NR_fremovexattr, sys_fremovexattr)
#define __NR_tkill 200
__SYSCALL(__NR_tkill, sys_tkill)
+#define __NR_time 201
+__SYSCALL(__NR_time, sys_time)
+#define __NR_futex 202
+__SYSCALL(__NR_futex, sys_futex)
+#define __NR_sched_setaffinity 203
+__SYSCALL(__NR_sched_setaffinity, sys_sched_setaffinity)
+#define __NR_sched_getaffinity 204
+__SYSCALL(__NR_sched_getaffinity, sys_sched_getaffinity)
-#define __NR_syscall_max __NR_tkill
+
+#define __NR_syscall_max __NR_sched_getaffinity
#ifndef __NO_STUBS
-/* user-visible error numbers are in the range -1 - -124: see <asm-i386/errno.h> */
+/* user-visible error numbers are in the range -1 to -4095 */
#define __syscall_clobber "r11","rcx","memory"
#include <asm/page.h>
#include <linux/ptrace.h>
/* Core file format: The core file is written in such a way that gdb
- can understand it and provide useful information to the user (under
- linux we use the 'trad-core' bfd). There are quite a number of
- obstacles to being able to view the contents of the floating point
- registers, and until these are solved you will not be able to view the
- contents of them. Actually, you can read in the core file and look at
- the contents of the user struct to find out what the floating point
- registers contain.
+ can understand it and provide useful information to the user.
+ There are quite a number of obstacles to being able to view the
+ contents of the floating point registers, and until these are
+ solved you will not be able to view the contents of them.
+ Actually, you can read in the core file and look at the contents of
+ the user struct to find out what the floating point registers
+ contain.
+
The actual file contents are as follows:
UPAGE: 1 page consisting of a user struct that tells gdb what is present
in the file. Directly after this is a copy of the task_struct, which
backtrace. We need to write the data from (esp) to
current->start_stack, so we round each of these off in order to be able
to write an integer number of pages.
- The minimum core file size is 3 pages, or 12288 bytes.
-*/
-
-/* This is not neccessary in first phase. It will have to be
- synchronized with gdb later. */
+ The minimum core file size is 3 pages, or 12288 bytes. */
/*
* Pentium III FXSR, SSE support
* and both the standard and SIMD floating point data can be accessed via
* the new ptrace requests. In either case, changes to the FPU environment
* will be reflected in the task's state as expected.
+ *
+ * x86-64 support by Andi Kleen.
*/
+/* This matches the 64bit FXSAVE format as defined by AMD. It is the same
+ as the 32bit format defined by Intel, except that the selector:offset pairs for
+ data and eip are replaced with flat 64bit pointers. */
struct user_i387_struct {
unsigned short cwd;
unsigned short swd;
- unsigned short twd;
+ unsigned short twd; /* Note this is not the same as the 32bit/x87/FSAVE twd */
unsigned short fop;
- u32 fip;
- u32 fcs;
- u32 foo;
- u32 fos;
+ u64 rip;
+ u64 rdp;
u32 mxcsr;
- u32 reserved;
+ u32 mxcsr_mask;
u32 st_space[32]; /* 8*16 bytes for each FP-reg = 128 bytes */
- u32 xmm_space[32]; /* 8*16 bytes for each XMM-reg = 128 bytes */
- u32 padding[56];
+ u32 xmm_space[64]; /* 16*16 bytes for each XMM-reg = 256 bytes */
+ u32 padding[24];
};
/*
- * This is copy of the layout of "struct pt_regs", and
- * is still the layout used by user mode (the new
- * pt_regs doesn't have all registers as the kernel
- * doesn't use the extra segment registers)
+ * User-mode register layout in coredumps.
*/
struct user_regs_struct {
unsigned long r15,r14,r13,r12,rbp,rbx,r11,r10;
unsigned long r9,r8,rax,rcx,rdx,rsi,rdi,orig_rax;
unsigned long rip,cs,eflags;
unsigned long rsp,ss;
- unsigned long fs_base, kernel_gs_base;
+ unsigned long fs_base, gs_base;
+ unsigned long ds,es,fs,gs;
};
/* When the kernel dumps core, it starts by dumping the user struct -
struct user_i387_struct* u_fpstate; /* Math Co-processor pointer. */
unsigned long magic; /* To uniquely identify a core file */
char u_comm[32]; /* User command that was responsible */
- int u_debugreg[8];
+ unsigned long u_debugreg[8];
};
#define NBPG PAGE_SIZE
#define UPAGES 1
u32 st_space[20]; /* 8*10 bytes for each FP-reg = 80 bytes */
};
-/*
- * This is the old layout of "struct pt_regs", and
- * is still the layout used by user mode (the new
- * pt_regs doesn't have all registers as the kernel
- * doesn't use the extra segment registers)
- */
+/* 32bit FXSAVE frame (FSAVE layout with SSE extensions) */
+struct user32_fxsr_struct {
+ unsigned short cwd;
+ unsigned short swd;
+ unsigned short twd; /* not compatible to 64bit twd */
+ unsigned short fop;
+ int fip;
+ int fcs;
+ int foo;
+ int fos;
+ int mxcsr;
+ int reserved;
+ int st_space[32]; /* 8*16 bytes for each FP-reg = 128 bytes */
+ int xmm_space[32]; /* 8*16 bytes for each XMM-reg = 128 bytes */
+ int padding[56];
+};
+
struct user_regs_struct32 {
__u32 ebx, ecx, edx, esi, edi, ebp, eax;
unsigned short ds, __ds, es, __es;
/* vsyscall space (readonly) */
extern long __vxtime_sequence[2];
extern int __delay_at_last_interrupt;
-extern unsigned long __last_tsc_low;
-extern unsigned long __fast_gettimeoffset_quotient;
+extern unsigned int __last_tsc_low;
+extern unsigned int __fast_gettimeoffset_quotient;
extern struct timeval __xtime;
extern volatile unsigned long __jiffies;
extern unsigned long __wall_jiffies;
/* kernel space (writeable) */
extern unsigned long last_tsc_low;
extern int delay_at_last_interrupt;
-extern unsigned long fast_gettimeoffset_quotient;
+extern unsigned int fast_gettimeoffset_quotient;
extern unsigned long wall_jiffies;
extern struct timezone sys_tz;
extern long vxtime_sequence[2];