diff --git a/[refs] b/[refs] index 3051332ad29a..b6f7210d12ca 100644 --- a/[refs] +++ b/[refs] @@ -1,2 +1,2 @@ --- -refs/heads/master: cdf4f383a4b0ffbf458f65380ecffbeee1f79841 +refs/heads/master: d00e708cef16442cabaf23f653baf924f5d66e83 diff --git a/trunk/Documentation/console/console.txt b/trunk/Documentation/console/console.txt deleted file mode 100644 index d3e17447321c..000000000000 --- a/trunk/Documentation/console/console.txt +++ /dev/null @@ -1,144 +0,0 @@ -Console Drivers -=============== - -The linux kernel has 2 general types of console drivers. The first type is -assigned by the kernel to all the virtual consoles during the boot process. -This type will be called 'system driver', and only one system driver is allowed -to exist. The system driver is persistent and it can never be unloaded, though -it may become inactive. - -The second type has to be explicitly loaded and unloaded. This will be called -'modular driver' by this document. Multiple modular drivers can coexist at -any time with each driver sharing the console with other drivers including -the system driver. However, modular drivers cannot take over the console -that is currently occupied by another modular driver. (Exception: Drivers that -call take_over_console() will succeed in the takeover regardless of the type -of driver occupying the consoles.) They can only take over the console that is -occupied by the system driver. In the same token, if the modular driver is -released by the console, the system driver will take over. - -Modular drivers, from the programmer's point of view, has to call: - - take_over_console() - load and bind driver to console layer - give_up_console() - unbind and unload driver - -In newer kernels, the following are also available: - - register_con_driver() - unregister_con_driver() - -If sysfs is enabled, the contents of /sys/class/vtconsole can be -examined. This shows the console backends currently registered by the -system which are named vtcon where is an integer fro 0 to 15. Thus: - - ls /sys/class/vtconsole - . .. vtcon0 vtcon1 - -Each directory in /sys/class/vtconsole has 3 files: - - ls /sys/class/vtconsole/vtcon0 - . .. bind name uevent - -What do these files signify? - - 1. bind - this is a read/write file. It shows the status of the driver if - read, or acts to bind or unbind the driver to the virtual consoles - when written to. The possible values are: - - 0 - means the driver is not bound and if echo'ed, commands the driver - to unbind - - 1 - means the driver is bound and if echo'ed, commands the driver to - bind - - 2. name - read-only file. Shows the name of the driver in this format: - - cat /sys/class/vtconsole/vtcon0/name - (S) VGA+ - - '(S)' stands for a (S)ystem driver, ie, it cannot be directly - commanded to bind or unbind - - 'VGA+' is the name of the driver - - cat /sys/class/vtconsole/vtcon1/name - (M) frame buffer device - - In this case, '(M)' stands for a (M)odular driver, one that can be - directly commanded to bind or unbind. - - 3. uevent - ignore this file - -When unbinding, the modular driver is detached first, and then the system -driver takes over the consoles vacated by the driver. Binding, on the other -hand, will bind the driver to the consoles that are currently occupied by a -system driver. - -NOTE1: Binding and binding must be selected in Kconfig. It's under: - -Device Drivers -> Character devices -> Support for binding and unbinding -console drivers - -NOTE2: If any of the virtual consoles are in KD_GRAPHICS mode, then binding or -unbinding will not succeed. An example of an application that sets the console -to KD_GRAPHICS is X. - -How useful is this feature? This is very useful for console driver -developers. By unbinding the driver from the console layer, one can unload the -driver, make changes, recompile, reload and rebind the driver without any need -for rebooting the kernel. For regular users who may want to switch from -framebuffer console to VGA console and vice versa, this feature also makes -this possible. (NOTE NOTE NOTE: Please read fbcon.txt under Documentation/fb -for more details). - -Notes for developers: -===================== - -take_over_console() is now broken up into: - - register_con_driver() - bind_con_driver() - private function - -give_up_console() is a wrapper to unregister_con_driver(), and a driver must -be fully unbound for this call to succeed. con_is_bound() will check if the -driver is bound or not. - -Guidelines for console driver writers: -===================================== - -In order for binding to and unbinding from the console to properly work, -console drivers must follow these guidelines: - -1. All drivers, except system drivers, must call either register_con_driver() - or take_over_console(). register_con_driver() will just add the driver to - the console's internal list. It won't take over the - console. take_over_console(), as it name implies, will also take over (or - bind to) the console. - -2. All resources allocated during con->con_init() must be released in - con->con_deinit(). - -3. All resources allocated in con->con_startup() must be released when the - driver, which was previously bound, becomes unbound. The console layer - does not have a complementary call to con->con_startup() so it's up to the - driver to check when it's legal to release these resources. Calling - con_is_bound() in con->con_deinit() will help. If the call returned - false(), then it's safe to release the resources. This balance has to be - ensured because con->con_startup() can be called again when a request to - rebind the driver to the console arrives. - -4. Upon exit of the driver, ensure that the driver is totally unbound. If the - condition is satisfied, then the driver must call unregister_con_driver() - or give_up_console(). - -5. unregister_con_driver() can also be called on conditions which make it - impossible for the driver to service console requests. This can happen - with the framebuffer console that suddenly lost all of its drivers. - -The current crop of console drivers should still work correctly, but binding -and unbinding them may cause problems. With minimal fixes, these drivers can -be made to work correctly. - -========================== -Antonino Daplas - diff --git a/trunk/Documentation/fb/fbcon.txt b/trunk/Documentation/fb/fbcon.txt index f373df12ed4c..08dce0f631bf 100644 --- a/trunk/Documentation/fb/fbcon.txt +++ b/trunk/Documentation/fb/fbcon.txt @@ -135,10 +135,10 @@ C. Boot options The angle can be changed anytime afterwards by 'echoing' the same numbers to any one of the 2 attributes found in - /sys/class/graphics/fbcon + /sys/class/graphics/fb{x} - rotate - rotate the display of the active console - rotate_all - rotate the display of all consoles + con_rotate - rotate the display of the active console + con_rotate_all - rotate the display of all consoles Console rotation will only become available if Console Rotation Support is compiled in your kernel. @@ -148,177 +148,5 @@ C. Boot options Actually, the underlying fb driver is totally ignorant of console rotation. -C. Attaching, Detaching and Unloading - -Before going on on how to attach, detach and unload the framebuffer console, an -illustration of the dependencies may help. - -The console layer, as with most subsystems, needs a driver that interfaces with -the hardware. Thus, in a VGA console: - -console ---> VGA driver ---> hardware. - -Assuming the VGA driver can be unloaded, one must first unbind the VGA driver -from the console layer before unloading the driver. The VGA driver cannot be -unloaded if it is still bound to the console layer. (See -Documentation/console/console.txt for more information). - -This is more complicated in the case of the the framebuffer console (fbcon), -because fbcon is an intermediate layer between the console and the drivers: - -console ---> fbcon ---> fbdev drivers ---> hardware - -The fbdev drivers cannot be unloaded if it's bound to fbcon, and fbcon cannot -be unloaded if it's bound to the console layer. - -So to unload the fbdev drivers, one must first unbind fbcon from the console, -then unbind the fbdev drivers from fbcon. Fortunately, unbinding fbcon from -the console layer will automatically unbind framebuffer drivers from -fbcon. Thus, there is no need to explicitly unbind the fbdev drivers from -fbcon. - -So, how do we unbind fbcon from the console? Part of the answer is in -Documentation/console/console.txt. To summarize: - -Echo a value to the bind file that represents the framebuffer console -driver. So assuming vtcon1 represents fbcon, then: - -echo 1 > sys/class/vtconsole/vtcon1/bind - attach framebuffer console to - console layer -echo 0 > sys/class/vtconsole/vtcon1/bind - detach framebuffer console from - console layer - -If fbcon is detached from the console layer, your boot console driver (which is -usually VGA text mode) will take over. A few drivers (rivafb and i810fb) will -restore VGA text mode for you. With the rest, before detaching fbcon, you -must take a few additional steps to make sure that your VGA text mode is -restored properly. The following is one of the several methods that you can do: - -1. Download or install vbetool. This utility is included with most - distributions nowadays, and is usually part of the suspend/resume tool. - -2. In your kernel configuration, ensure that CONFIG_FRAMEBUFFER_CONSOLE is set - to 'y' or 'm'. Enable one or more of your favorite framebuffer drivers. - -3. Boot into text mode and as root run: - - vbetool vbestate save > - - The above command saves the register contents of your graphics - hardware to . You need to do this step only once as - the state file can be reused. - -4. If fbcon is compiled as a module, load fbcon by doing: - - modprobe fbcon - -5. Now to detach fbcon: - - vbetool vbestate restore < && \ - echo 0 > /sys/class/vtconsole/vtcon1/bind - -6. That's it, you're back to VGA mode. And if you compiled fbcon as a module, - you can unload it by 'rmmod fbcon' - -7. To reattach fbcon: - - echo 1 > /sys/class/vtconsole/vtcon1/bind - -8. Once fbcon is unbound, all drivers registered to the system will also -become unbound. This means that fbcon and individual framebuffer drivers -can be unloaded or reloaded at will. Reloading the drivers or fbcon will -automatically bind the console, fbcon and the drivers together. Unloading -all the drivers without unloading fbcon will make it impossible for the -console to bind fbcon. - -Notes for vesafb users: -======================= - -Unfortunately, if your bootline includes a vga=xxx parameter that sets the -hardware in graphics mode, such as when loading vesafb, vgacon will not load. -Instead, vgacon will replace the default boot console with dummycon, and you -won't get any display after detaching fbcon. Your machine is still alive, so -you can reattach vesafb. However, to reattach vesafb, you need to do one of -the following: - -Variation 1: - - a. Before detaching fbcon, do - - vbetool vbemode save > # do once for each vesafb mode, - # the file can be reused - - b. Detach fbcon as in step 5. - - c. Attach fbcon - - vbetool vbestate restore < && \ - echo 1 > /sys/class/vtconsole/vtcon1/bind - -Variation 2: - - a. Before detaching fbcon, do: - echo > /sys/class/tty/console/bind - - - vbetool vbemode get - - b. Take note of the mode number - - b. Detach fbcon as in step 5. - - c. Attach fbcon: - - vbetool vbemode set && \ - echo 1 > /sys/class/vtconsole/vtcon1/bind - -Samples: -======== - -Here are 2 sample bash scripts that you can use to bind or unbind the -framebuffer console driver if you are in an X86 box: - ---------------------------------------------------------------------------- -#!/bin/bash -# Unbind fbcon - -# Change this to where your actual vgastate file is located -# Or Use VGASTATE=$1 to indicate the state file at runtime -VGASTATE=/tmp/vgastate - -# path to vbetool -VBETOOL=/usr/local/bin - - -for (( i = 0; i < 16; i++)) -do - if test -x /sys/class/vtconsole/vtcon$i; then - if [ `cat /sys/class/vtconsole/vtcon$i/name | grep -c "frame buffer"` \ - = 1 ]; then - if test -x $VBETOOL/vbetool; then - echo Unbinding vtcon$i - $VBETOOL/vbetool vbestate restore < $VGASTATE - echo 0 > /sys/class/vtconsole/vtcon$i/bind - fi - fi - fi -done - ---------------------------------------------------------------------------- -#!/bin/bash -# Bind fbcon - -for (( i = 0; i < 16; i++)) -do - if test -x /sys/class/vtconsole/vtcon$i; then - if [ `cat /sys/class/vtconsole/vtcon$i/name | grep -c "frame buffer"` \ - = 1 ]; then - echo Unbinding vtcon$i - echo 1 > /sys/class/vtconsole/vtcon$i/bind - fi - fi -done ---------------------------------------------------------------------------- - --- +--- Antonino Daplas diff --git a/trunk/Documentation/filesystems/ext3.txt b/trunk/Documentation/filesystems/ext3.txt index 4aecc9bdb273..afb1335c05d6 100644 --- a/trunk/Documentation/filesystems/ext3.txt +++ b/trunk/Documentation/filesystems/ext3.txt @@ -113,14 +113,6 @@ noquota grpquota usrquota -bh (*) ext3 associates buffer heads to data pages to -nobh (a) cache disk block mapping information - (b) link pages into transaction to provide - ordering guarantees. - "bh" option forces use of buffer heads. - "nobh" option tries to avoid associating buffer - heads (supported only for "writeback" mode). - Specification ============= diff --git a/trunk/Documentation/kernel-parameters.txt b/trunk/Documentation/kernel-parameters.txt index 2e352a605fcf..bca6f389da66 100644 --- a/trunk/Documentation/kernel-parameters.txt +++ b/trunk/Documentation/kernel-parameters.txt @@ -61,7 +61,6 @@ parameter is applicable: MTD MTD support is enabled. NET Appropriate network support is enabled. NUMA NUMA support is enabled. - GENERIC_TIME The generic timeofday code is enabled. NFS Appropriate NFS support is enabled. OSS OSS sound support is enabled. PARIDE The ParIDE subsystem is enabled. @@ -180,11 +179,6 @@ running once the system is up. override platform specific driver. See also Documentation/acpi-hotkey.txt. - acpi_pm_good [IA-32,X86-64] - Override the pmtimer bug detection: force the kernel - to assume that this machine's pmtimer latches its value - and always returns good values. - enable_timer_pin_1 [i386,x86-64] Enable PIN 1 of APIC timer Can be useful to work around chipset bugs @@ -347,11 +341,10 @@ running once the system is up. Value can be changed at runtime via /selinux/checkreqprot. - clock= [BUGS=IA-32, HW] gettimeofday clocksource override. - [Deprecated] - Forces specified clocksource (if avaliable) to be used - when calculating gettimeofday(). If specified - clocksource is not avalible, it defaults to PIT. + clock= [BUGS=IA-32,HW] gettimeofday timesource override. + Forces specified timesource (if avaliable) to be used + when calculating gettimeofday(). If specicified + timesource is not avalible, it defaults to PIT. Format: { pit | tsc | cyclone | pmtmr } disable_8254_timer @@ -1624,10 +1617,6 @@ running once the system is up. time Show timing data prefixed to each printk message line - clocksource= [GENERIC_TIME] Override the default clocksource - Override the default clocksource and use the clocksource - with the name specified. - tipar.timeout= [HW,PPT] Set communications timeout in tenths of a second (default 15). diff --git a/trunk/Documentation/keys.txt b/trunk/Documentation/keys.txt index 61c0fad2fe2f..3bbe157b45e4 100644 --- a/trunk/Documentation/keys.txt +++ b/trunk/Documentation/keys.txt @@ -241,30 +241,25 @@ The security class "key" has been added to SELinux so that mandatory access controls can be applied to keys created within various contexts. This support is preliminary, and is likely to change quite significantly in the near future. Currently, all of the basic permissions explained above are provided in SELinux -as well; SELinux is simply invoked after all basic permission checks have been +as well; SE Linux is simply invoked after all basic permission checks have been performed. -The value of the file /proc/self/attr/keycreate influences the labeling of -newly-created keys. If the contents of that file correspond to an SELinux -security context, then the key will be assigned that context. Otherwise, the -key will be assigned the current context of the task that invoked the key -creation request. Tasks must be granted explicit permission to assign a -particular context to newly-created keys, using the "create" permission in the -key security class. +Each key is labeled with the same context as the task to which it belongs. +Typically, this is the same task that was running when the key was created. +The default keyrings are handled differently, but in a way that is very +intuitive: -The default keyrings associated with users will be labeled with the default -context of the user if and only if the login programs have been instrumented to -properly initialize keycreate during the login process. Otherwise, they will -be labeled with the context of the login program itself. + (*) The user and user session keyrings that are created when the user logs in + are currently labeled with the context of the login manager. + + (*) The keyrings associated with new threads are each labeled with the context + of their associated thread, and both session and process keyrings are + handled similarly. Note, however, that the default keyrings associated with the root user are labeled with the default kernel context, since they are created early in the boot process, before root has a chance to log in. -The keyrings associated with new threads are each labeled with the context of -their associated thread, and both session and process keyrings are handled -similarly. - ================ NEW PROCFS FILES @@ -275,17 +270,9 @@ about the status of the key service: (*) /proc/keys - This lists the keys that are currently viewable by the task reading the - file, giving information about their type, description and permissions. - It is not possible to view the payload of the key this way, though some - information about it may be given. - - The only keys included in the list are those that grant View permission to - the reading process whether or not it possesses them. Note that LSM - security checks are still performed, and may further filter out keys that - the current process is not authorised to view. - - The contents of the file look like this: + This lists all the keys on the system, giving information about their + type, description and permissions. The payload of the key is not available + this way: SERIAL FLAGS USAGE EXPY PERM UID GID TYPE DESCRIPTION: SUMMARY 00000001 I----- 39 perm 1f3f0000 0 0 keyring _uid_ses.0: 1/4 @@ -313,7 +300,7 @@ about the status of the key service: (*) /proc/key-users This file lists the tracking data for each user that has at least one key - on the system. Such data includes quota information and statistics: + on the system. Such data includes quota information and statistics: [root@andromeda root]# cat /proc/key-users 0: 46 45/45 1/100 13/10000 diff --git a/trunk/Documentation/md.txt b/trunk/Documentation/md.txt index 0668f9dc9d29..03a13c462cf2 100644 --- a/trunk/Documentation/md.txt +++ b/trunk/Documentation/md.txt @@ -200,17 +200,6 @@ All md devices contain: This can be written only while the array is being assembled, not after it is started. - layout - The "layout" for the array for the particular level. This is - simply a number that is interpretted differently by different - levels. It can be written while assembling an array. - - resync_start - The point at which resync should start. If no resync is needed, - this will be a very large number. At array creation it will - default to 0, though starting the array as 'clean' will - set it much larger. - new_dev This file can be written but not read. The value written should be a block device number as major:minor. e.g. 8:0 @@ -218,54 +207,6 @@ All md devices contain: available. It will then appear at md/dev-XXX (depending on the name of the device) and further configuration is then possible. - safe_mode_delay - When an md array has seen no write requests for a certain period - of time, it will be marked as 'clean'. When another write - request arrive, the array is marked as 'dirty' before the write - commenses. This is known as 'safe_mode'. - The 'certain period' is controlled by this file which stores the - period as a number of seconds. The default is 200msec (0.200). - Writing a value of 0 disables safemode. - - array_state - This file contains a single word which describes the current - state of the array. In many cases, the state can be set by - writing the word for the desired state, however some states - cannot be explicitly set, and some transitions are not allowed. - - clear - No devices, no size, no level - Writing is equivalent to STOP_ARRAY ioctl - inactive - May have some settings, but array is not active - all IO results in error - When written, doesn't tear down array, but just stops it - suspended (not supported yet) - All IO requests will block. The array can be reconfigured. - Writing this, if accepted, will block until array is quiessent - readonly - no resync can happen. no superblocks get written. - write requests fail - read-auto - like readonly, but behaves like 'clean' on a write request. - - clean - no pending writes, but otherwise active. - When written to inactive array, starts without resync - If a write request arrives then - if metadata is known, mark 'dirty' and switch to 'active'. - if not known, block and switch to write-pending - If written to an active array that has pending writes, then fails. - active - fully active: IO and resync can be happening. - When written to inactive array, starts with resync - - write-pending - clean, but writes are blocked waiting for 'active' to be written. - - active-idle - like active, but no writes have been seen for a while (safe_mode_delay). - - sync_speed_min sync_speed_max This are similar to /proc/sys/dev/raid/speed_limit_{min,max} @@ -309,18 +250,10 @@ Each directory contains: faulty - device has been kicked from active use due to a detected fault in_sync - device is a fully in-sync member of the array - writemostly - device will only be subject to read - requests if there are no other options. - This applies only to raid1 arrays. spare - device is working, but not a full member. This includes spares that are in the process of being recoverred to This list make grow in future. - This can be written to. - Writing "faulty" simulates a failure on the device. - Writing "remove" removes the device from the array. - Writing "writemostly" sets the writemostly flag. - Writing "-writemostly" clears the writemostly flag. errors An approximate count of read errors that have been detected on diff --git a/trunk/Documentation/tty.txt b/trunk/Documentation/tty.txt index dab56604745d..8ff7bc2a0811 100644 --- a/trunk/Documentation/tty.txt +++ b/trunk/Documentation/tty.txt @@ -80,6 +80,13 @@ receive_buf() - Hand buffers of bytes from the driver to the ldisc for processing. Semantics currently rather mysterious 8( +receive_room() - Can be called by the driver layer at any time when + the ldisc is opened. The ldisc must be able to + handle the reported amount of data at that instant. + Synchronization between active receive_buf and + receive_room calls is down to the driver not the + ldisc. Must not sleep. + write_wakeup() - May be called at any point between open and close. The TTY_DO_WRITE_WAKEUP flag indicates if a call is needed but always races versus calls. Thus the diff --git a/trunk/Documentation/x86_64/boot-options.txt b/trunk/Documentation/x86_64/boot-options.txt index 6887d44d2661..f2cd6ef53ff3 100644 --- a/trunk/Documentation/x86_64/boot-options.txt +++ b/trunk/Documentation/x86_64/boot-options.txt @@ -205,27 +205,6 @@ IOMMU pages Prereserve that many 128K pages for the software IO bounce buffering. force Force all IO through the software TLB. - calgary=[64k,128k,256k,512k,1M,2M,4M,8M] - calgary=[translate_empty_slots] - calgary=[disable=] - - 64k,...,8M - Set the size of each PCI slot's translation table - when using the Calgary IOMMU. This is the size of the translation - table itself in main memory. The smallest table, 64k, covers an IO - space of 32MB; the largest, 8MB table, can cover an IO space of - 4GB. Normally the kernel will make the right choice by itself. - - translate_empty_slots - Enable translation even on slots that have - no devices attached to them, in case a device will be hotplugged - in the future. - - disable= - Disable translation on a given PHB. For - example, the built-in graphics adapter resides on the first bridge - (PCI bus number 0); if translation (isolation) is enabled on this - bridge, X servers that access the hardware directly from user - space might stop working. Use this option if you have devices that - are accessed from userspace directly on some PCI host bridge. - Debugging oops=panic Always panic on oopses. Default is to just kill the process, diff --git a/trunk/MAINTAINERS b/trunk/MAINTAINERS index 31a13720f23c..4dcd2f1f14d6 100644 --- a/trunk/MAINTAINERS +++ b/trunk/MAINTAINERS @@ -1118,11 +1118,6 @@ L: lm-sensors@lm-sensors.org W: http://www.lm-sensors.nu/ S: Maintained -HARDWARE RANDOM NUMBER GENERATOR CORE -P: Michael Buesch -M: mb@bu3sch.de -S: Maintained - HARD DRIVE ACTIVE PROTECTION SYSTEM (HDAPS) DRIVER P: Robert Love M: rlove@rlove.org @@ -1401,8 +1396,7 @@ S: Supported INPUT (KEYBOARD, MOUSE, JOYSTICK) DRIVERS P: Dmitry Torokhov -M: dmitry.torokhov@gmail.com -M: dtor@mail.ru +M: dtor_core@ameritech.net L: linux-input@atrey.karlin.mff.cuni.cz L: linux-joystick@atrey.karlin.mff.cuni.cz T: git kernel.org:/pub/scm/linux/kernel/git/dtor/input.git @@ -1442,11 +1436,6 @@ P: Tigran Aivazian M: tigran@veritas.com S: Maintained -INTEL IXP4XX RANDOM NUMBER GENERATOR SUPPORT -P: Deepak Saxena -M: dsaxena@plexity.net -S: Maintained - INTEL PRO/100 ETHERNET SUPPORT P: John Ronciak M: john.ronciak@intel.com @@ -2736,11 +2725,6 @@ P: Christoph Hellwig M: hch@infradead.org S: Maintained -TI OMAP RANDOM NUMBER GENERATOR SUPPORT -P: Deepak Saxena -M: dsaxena@plexity.net -S: Maintained - TI PARALLEL LINK CABLE DRIVER P: Romain Lievin M: roms@lpg.ticalc.org diff --git a/trunk/arch/alpha/oprofile/common.c b/trunk/arch/alpha/oprofile/common.c index 9fc0eeb4f0ab..ba788cfdc3c6 100644 --- a/trunk/arch/alpha/oprofile/common.c +++ b/trunk/arch/alpha/oprofile/common.c @@ -112,7 +112,7 @@ op_axp_create_files(struct super_block * sb, struct dentry * root) for (i = 0; i < model->num_counters; ++i) { struct dentry *dir; - char buf[4]; + char buf[3]; snprintf(buf, sizeof buf, "%d", i); dir = oprofilefs_mkdir(sb, root, buf); diff --git a/trunk/arch/arm/common/locomo.c b/trunk/arch/arm/common/locomo.c index 0dafba3a701d..a7dc1370695b 100644 --- a/trunk/arch/arm/common/locomo.c +++ b/trunk/arch/arm/common/locomo.c @@ -629,6 +629,21 @@ static int locomo_resume(struct platform_device *dev) #endif +#define LCM_ALC_EN 0x8000 + +void frontlight_set(struct locomo *lchip, int duty, int vr, int bpwf) +{ + unsigned long flags; + + spin_lock_irqsave(&lchip->lock, flags); + locomo_writel(bpwf, lchip->base + LOCOMO_FRONTLIGHT + LOCOMO_ALS); + udelay(100); + locomo_writel(duty, lchip->base + LOCOMO_FRONTLIGHT + LOCOMO_ALD); + locomo_writel(bpwf | LCM_ALC_EN, lchip->base + LOCOMO_FRONTLIGHT + LOCOMO_ALS); + spin_unlock_irqrestore(&lchip->lock, flags); +} + + /** * locomo_probe - probe for a single LoCoMo chip. * @phys_addr: physical address of device. @@ -683,10 +698,14 @@ __locomo_probe(struct device *me, struct resource *mem, int irq) , lchip->base + LOCOMO_GPD); locomo_writel(0, lchip->base + LOCOMO_GIE); - /* Frontlight */ + /* FrontLight */ locomo_writel(0, lchip->base + LOCOMO_FRONTLIGHT + LOCOMO_ALS); locomo_writel(0, lchip->base + LOCOMO_FRONTLIGHT + LOCOMO_ALD); + /* Same constants can be used for collie and poodle + (depending on CONFIG options in original sharp code)? */ + frontlight_set(lchip, 163, 0, 148); + /* Longtime timer */ locomo_writel(0, lchip->base + LOCOMO_LTINT); /* SPI */ @@ -1043,30 +1062,6 @@ void locomo_m62332_senddata(struct locomo_dev *ldev, unsigned int dac_data, int spin_unlock_irqrestore(&lchip->lock, flags); } -/* - * Frontlight control - */ - -static struct locomo *locomo_chip_driver(struct locomo_dev *ldev); - -void locomo_frontlight_set(struct locomo_dev *dev, int duty, int vr, int bpwf) -{ - unsigned long flags; - struct locomo *lchip = locomo_chip_driver(dev); - - if (vr) - locomo_gpio_write(dev, LOCOMO_GPIO_FL_VR, 1); - else - locomo_gpio_write(dev, LOCOMO_GPIO_FL_VR, 0); - - spin_lock_irqsave(&lchip->lock, flags); - locomo_writel(bpwf, lchip->base + LOCOMO_FRONTLIGHT + LOCOMO_ALS); - udelay(100); - locomo_writel(duty, lchip->base + LOCOMO_FRONTLIGHT + LOCOMO_ALD); - locomo_writel(bpwf | LOCOMO_ALC_EN, lchip->base + LOCOMO_FRONTLIGHT + LOCOMO_ALS); - spin_unlock_irqrestore(&lchip->lock, flags); -} - /* * LoCoMo "Register Access Bus." * diff --git a/trunk/arch/i386/Kconfig b/trunk/arch/i386/Kconfig index f3eaf22f273d..1596101cfaf8 100644 --- a/trunk/arch/i386/Kconfig +++ b/trunk/arch/i386/Kconfig @@ -14,10 +14,6 @@ config X86_32 486, 586, Pentiums, and various instruction-set-compatible chips by AMD, Cyrix, and others. -config GENERIC_TIME - bool - default y - config SEMAPHORE_SLEEPERS bool default y @@ -328,15 +324,6 @@ config X86_MCE_P4THERMAL Enabling this feature will cause a message to be printed when the P4 enters thermal throttling. -config VM86 - default y - bool "Enable VM86 support" if EMBEDDED - help - This option is required by programs like DOSEMU to run 16-bit legacy - code on X86 processors. It also may be needed by software like - XFree86 to initialize some video cards via BIOS. Disabling this - option saves about 6k. - config TOSHIBA tristate "Toshiba Laptop support" ---help--- @@ -1059,27 +1046,13 @@ config SCx200 tristate "NatSemi SCx200 support" depends on !X86_VOYAGER help - This provides basic support for National Semiconductor's - (now AMD's) Geode processors. The driver probes for the - PCI-IDs of several on-chip devices, so its a good dependency - for other scx200_* drivers. + This provides basic support for the National Semiconductor SCx200 + processor. Right now this is just a driver for the GPIO pins. - If compiled as a module, the driver is named scx200. - -config SCx200HR_TIMER - tristate "NatSemi SCx200 27MHz High-Resolution Timer Support" - depends on SCx200 && GENERIC_TIME - default y - help - This driver provides a clocksource built upon the on-chip - 27MHz high-resolution timer. Its also a workaround for - NSC Geode SC-1100's buggy TSC, which loses time when the - processor goes idle (as is done by the scheduler). The - other workaround is idle=poll boot option. + If you don't know what to do here, say N. -config K8_NB - def_bool y - depends on AGP_AMD64 + This support is also available as a module. If compiled as a + module, it will be called scx200. source "drivers/pcmcia/Kconfig" diff --git a/trunk/arch/i386/boot/Makefile b/trunk/arch/i386/boot/Makefile index e97946626064..33e55476381b 100644 --- a/trunk/arch/i386/boot/Makefile +++ b/trunk/arch/i386/boot/Makefile @@ -109,13 +109,8 @@ fdimage288: $(BOOTIMAGE) $(obj)/mtools.conf isoimage: $(BOOTIMAGE) -rm -rf $(obj)/isoimage mkdir $(obj)/isoimage - for i in lib lib64 share end ; do \ - if [ -f /usr/$$i/syslinux/isolinux.bin ] ; then \ - cp /usr/$$i/syslinux/isolinux.bin $(obj)/isoimage ; \ - break ; \ - fi ; \ - if [ $$i = end ] ; then exit 1 ; fi ; \ - done + cp `echo /usr/lib*/syslinux/isolinux.bin | awk '{ print $1; }'` \ + $(obj)/isoimage cp $(BOOTIMAGE) $(obj)/isoimage/linux echo '$(image_cmdline)' > $(obj)/isoimage/isolinux.cfg if [ -f '$(FDINITRD)' ] ; then \ diff --git a/trunk/arch/i386/boot/compressed/misc.c b/trunk/arch/i386/boot/compressed/misc.c index b2ccd543410d..f19f3a7492a5 100644 --- a/trunk/arch/i386/boot/compressed/misc.c +++ b/trunk/arch/i386/boot/compressed/misc.c @@ -24,6 +24,14 @@ #undef memset #undef memcpy + +/* + * Why do we do this? Don't ask me.. + * + * Incomprehensible are the ways of bootloaders. + */ +static void* memset(void *, int, size_t); +static void* memcpy(void *, __const void *, size_t); #define memzero(s, n) memset ((s), 0, (n)) typedef unsigned char uch; @@ -85,7 +93,7 @@ static unsigned char *real_mode; /* Pointer to real-mode data */ #endif #define RM_SCREEN_INFO (*(struct screen_info *)(real_mode+0)) -extern unsigned char input_data[]; +extern char input_data[]; extern int input_len; static long bytes_out = 0; @@ -95,9 +103,6 @@ static unsigned long output_ptr = 0; static void *malloc(int size); static void free(void *where); -static void *memset(void *s, int c, unsigned n); -static void *memcpy(void *dest, const void *src, unsigned n); - static void putstr(const char *); extern int end; @@ -200,7 +205,7 @@ static void putstr(const char *s) outb_p(0xff & (pos >> 1), vidport+1); } -static void* memset(void* s, int c, unsigned n) +static void* memset(void* s, int c, size_t n) { int i; char *ss = (char*)s; @@ -209,13 +214,14 @@ static void* memset(void* s, int c, unsigned n) return s; } -static void* memcpy(void* dest, const void* src, unsigned n) +static void* memcpy(void* __dest, __const void* __src, + size_t __n) { int i; - char *d = (char *)dest, *s = (char *)src; + char *d = (char *)__dest, *s = (char *)__src; - for (i=0;i RM_EXT_MEM_K ? RM_ALT_MEM_K : RM_EXT_MEM_K) < 1024) error("Less than 2MB of memory"); #endif - output_data = (unsigned char *)__PHYSICAL_START; /* Normally Points to 1M */ + output_data = (char *)__PHYSICAL_START; /* Normally Points to 1M */ free_mem_end_ptr = (long)real_mode; } @@ -318,9 +324,11 @@ static void setup_output_buffer_if_we_run_high(struct moveparams *mv) #ifdef STANDARD_MEMORY_BIOS_CALL if (RM_EXT_MEM_K < (3*1024)) error("Less than 4MB of memory"); #else - if ((RM_ALT_MEM_K > RM_EXT_MEM_K ? RM_ALT_MEM_K : RM_EXT_MEM_K) < (3*1024)) error("Less than 4MB of memory"); + if ((RM_ALT_MEM_K > RM_EXT_MEM_K ? RM_ALT_MEM_K : RM_EXT_MEM_K) < + (3*1024)) + error("Less than 4MB of memory"); #endif - mv->low_buffer_start = output_data = (unsigned char *)LOW_BUFFER_START; + mv->low_buffer_start = output_data = (char *)LOW_BUFFER_START; low_buffer_end = ((unsigned int)real_mode > LOW_BUFFER_MAX ? LOW_BUFFER_MAX : (unsigned int)real_mode) & ~0xfff; low_buffer_size = low_buffer_end - LOW_BUFFER_START; diff --git a/trunk/arch/i386/boot/video.S b/trunk/arch/i386/boot/video.S index 8c2a6faeeae5..c9343c3a8082 100644 --- a/trunk/arch/i386/boot/video.S +++ b/trunk/arch/i386/boot/video.S @@ -1929,7 +1929,7 @@ skip10: movb %ah, %al ret store_edid: -#ifdef CONFIG_FIRMWARE_EDID +#ifdef CONFIG_FB_FIRMWARE_EDID pushw %es # just save all registers pushw %ax pushw %bx @@ -1947,22 +1947,6 @@ store_edid: rep stosl - pushw %es # save ES - xorw %di, %di # Report Capability - pushw %di - popw %es # ES:DI must be 0:0 - movw $0x4f15, %ax - xorw %bx, %bx - xorw %cx, %cx - int $0x10 - popw %es # restore ES - - cmpb $0x00, %ah # call successful - jne no_edid - - cmpb $0x4f, %al # function supported - jne no_edid - movw $0x4f15, %ax # do VBE/DDC movw $0x01, %bx movw $0x00, %cx @@ -1970,7 +1954,6 @@ store_edid: movw $0x140, %di int $0x10 -no_edid: popw %di # restore all registers popw %dx popw %cx diff --git a/trunk/arch/i386/kernel/Makefile b/trunk/arch/i386/kernel/Makefile index 5e70c2fb273a..96fb8a020af2 100644 --- a/trunk/arch/i386/kernel/Makefile +++ b/trunk/arch/i386/kernel/Makefile @@ -7,9 +7,10 @@ extra-y := head.o init_task.o vmlinux.lds obj-y := process.o semaphore.o signal.o entry.o traps.o irq.o \ ptrace.o time.o ioport.o ldt.o setup.o i8259.o sys_i386.o \ pci-dma.o i386_ksyms.o i387.o bootflag.o \ - quirks.o i8237.o topology.o alternative.o i8253.o tsc.o + quirks.o i8237.o topology.o alternative.o obj-y += cpu/ +obj-y += timers/ obj-y += acpi/ obj-$(CONFIG_X86_BIOS_REBOOT) += reboot.o obj-$(CONFIG_MCA) += mca.o @@ -36,8 +37,6 @@ obj-$(CONFIG_EFI) += efi.o efi_stub.o obj-$(CONFIG_DOUBLEFAULT) += doublefault.o obj-$(CONFIG_VM86) += vm86.o obj-$(CONFIG_EARLY_PRINTK) += early_printk.o -obj-$(CONFIG_HPET_TIMER) += hpet.o -obj-$(CONFIG_K8_NB) += k8.o EXTRA_AFLAGS := -traditional @@ -77,6 +76,3 @@ SYSCFLAGS_vsyscall-syms.o = -r $(obj)/vsyscall-syms.o: $(src)/vsyscall.lds \ $(obj)/vsyscall-sysenter.o $(obj)/vsyscall-note.o FORCE $(call if_changed,syscall) - -k8-y += ../../x86_64/kernel/k8.o - diff --git a/trunk/arch/i386/kernel/alternative.c b/trunk/arch/i386/kernel/alternative.c index 50eb0e03777e..5cbd6f99fb2a 100644 --- a/trunk/arch/i386/kernel/alternative.c +++ b/trunk/arch/i386/kernel/alternative.c @@ -4,41 +4,27 @@ #include #include -static int no_replacement = 0; -static int smp_alt_once = 0; -static int debug_alternative = 0; - -static int __init noreplacement_setup(char *s) -{ - no_replacement = 1; - return 1; -} -static int __init bootonly(char *str) -{ - smp_alt_once = 1; - return 1; -} -static int __init debug_alt(char *str) -{ - debug_alternative = 1; - return 1; -} - -__setup("noreplacement", noreplacement_setup); -__setup("smp-alt-boot", bootonly); -__setup("debug-alternative", debug_alt); - -#define DPRINTK(fmt, args...) if (debug_alternative) \ - printk(KERN_DEBUG fmt, args) +#define DEBUG 0 +#if DEBUG +# define DPRINTK(fmt, args...) printk(fmt, args) +#else +# define DPRINTK(fmt, args...) +#endif -#ifdef GENERIC_NOP1 /* Use inline assembly to define this because the nops are defined as inline assembly strings in the include files and we cannot get them easily into strings. */ asm("\t.data\nintelnops: " GENERIC_NOP1 GENERIC_NOP2 GENERIC_NOP3 GENERIC_NOP4 GENERIC_NOP5 GENERIC_NOP6 GENERIC_NOP7 GENERIC_NOP8); -extern unsigned char intelnops[]; +asm("\t.data\nk8nops: " + K8_NOP1 K8_NOP2 K8_NOP3 K8_NOP4 K8_NOP5 K8_NOP6 + K8_NOP7 K8_NOP8); +asm("\t.data\nk7nops: " + K7_NOP1 K7_NOP2 K7_NOP3 K7_NOP4 K7_NOP5 K7_NOP6 + K7_NOP7 K7_NOP8); + +extern unsigned char intelnops[], k8nops[], k7nops[]; static unsigned char *intel_nops[ASM_NOP_MAX+1] = { NULL, intelnops, @@ -50,13 +36,6 @@ static unsigned char *intel_nops[ASM_NOP_MAX+1] = { intelnops + 1 + 2 + 3 + 4 + 5 + 6, intelnops + 1 + 2 + 3 + 4 + 5 + 6 + 7, }; -#endif - -#ifdef K8_NOP1 -asm("\t.data\nk8nops: " - K8_NOP1 K8_NOP2 K8_NOP3 K8_NOP4 K8_NOP5 K8_NOP6 - K8_NOP7 K8_NOP8); -extern unsigned char k8nops[]; static unsigned char *k8_nops[ASM_NOP_MAX+1] = { NULL, k8nops, @@ -68,13 +47,6 @@ static unsigned char *k8_nops[ASM_NOP_MAX+1] = { k8nops + 1 + 2 + 3 + 4 + 5 + 6, k8nops + 1 + 2 + 3 + 4 + 5 + 6 + 7, }; -#endif - -#ifdef K7_NOP1 -asm("\t.data\nk7nops: " - K7_NOP1 K7_NOP2 K7_NOP3 K7_NOP4 K7_NOP5 K7_NOP6 - K7_NOP7 K7_NOP8); -extern unsigned char k7nops[]; static unsigned char *k7_nops[ASM_NOP_MAX+1] = { NULL, k7nops, @@ -86,18 +58,6 @@ static unsigned char *k7_nops[ASM_NOP_MAX+1] = { k7nops + 1 + 2 + 3 + 4 + 5 + 6, k7nops + 1 + 2 + 3 + 4 + 5 + 6 + 7, }; -#endif - -#ifdef CONFIG_X86_64 - -extern char __vsyscall_0; -static inline unsigned char** find_nop_table(void) -{ - return k8_nops; -} - -#else /* CONFIG_X86_64 */ - static struct nop { int cpuid; unsigned char **noptable; @@ -107,6 +67,14 @@ static struct nop { { -1, NULL } }; + +extern struct alt_instr __alt_instructions[], __alt_instructions_end[]; +extern struct alt_instr __smp_alt_instructions[], __smp_alt_instructions_end[]; +extern u8 *__smp_locks[], *__smp_locks_end[]; + +extern u8 __smp_alt_begin[], __smp_alt_end[]; + + static unsigned char** find_nop_table(void) { unsigned char **noptable = intel_nops; @@ -121,14 +89,6 @@ static unsigned char** find_nop_table(void) return noptable; } -#endif /* CONFIG_X86_64 */ - -extern struct alt_instr __alt_instructions[], __alt_instructions_end[]; -extern struct alt_instr __smp_alt_instructions[], __smp_alt_instructions_end[]; -extern u8 *__smp_locks[], *__smp_locks_end[]; - -extern u8 __smp_alt_begin[], __smp_alt_end[]; - /* Replace instructions with better alternatives for this CPU type. This runs before SMP is initialized to avoid SMP problems with self modifying code. This implies that assymetric systems where @@ -139,7 +99,6 @@ void apply_alternatives(struct alt_instr *start, struct alt_instr *end) { unsigned char **noptable = find_nop_table(); struct alt_instr *a; - u8 *instr; int diff, i, k; DPRINTK("%s: alt table %p -> %p\n", __FUNCTION__, start, end); @@ -147,16 +106,7 @@ void apply_alternatives(struct alt_instr *start, struct alt_instr *end) BUG_ON(a->replacementlen > a->instrlen); if (!boot_cpu_has(a->cpuid)) continue; - instr = a->instr; -#ifdef CONFIG_X86_64 - /* vsyscall code is not mapped yet. resolve it manually. */ - if (instr >= (u8 *)VSYSCALL_START && instr < (u8*)VSYSCALL_END) { - instr = __va(instr - (u8*)VSYSCALL_START + (u8*)__pa_symbol(&__vsyscall_0)); - DPRINTK("%s: vsyscall fixup: %p => %p\n", - __FUNCTION__, a->instr, instr); - } -#endif - memcpy(instr, a->replacement, a->replacementlen); + memcpy(a->instr, a->replacement, a->replacementlen); diff = a->instrlen - a->replacementlen; /* Pad the rest with nops */ for (i = a->replacementlen; diff > 0; diff -= k, i += k) { @@ -236,6 +186,14 @@ struct smp_alt_module { static LIST_HEAD(smp_alt_modules); static DEFINE_SPINLOCK(smp_alt); +static int smp_alt_once = 0; +static int __init bootonly(char *str) +{ + smp_alt_once = 1; + return 1; +} +__setup("smp-alt-boot", bootonly); + void alternatives_smp_module_add(struct module *mod, char *name, void *locks, void *locks_end, void *text, void *text_end) @@ -243,9 +201,6 @@ void alternatives_smp_module_add(struct module *mod, char *name, struct smp_alt_module *smp; unsigned long flags; - if (no_replacement) - return; - if (smp_alt_once) { if (boot_cpu_has(X86_FEATURE_UP)) alternatives_smp_unlock(locks, locks_end, @@ -280,7 +235,7 @@ void alternatives_smp_module_del(struct module *mod) struct smp_alt_module *item; unsigned long flags; - if (no_replacement || smp_alt_once) + if (smp_alt_once) return; spin_lock_irqsave(&smp_alt, flags); @@ -301,7 +256,7 @@ void alternatives_smp_switch(int smp) struct smp_alt_module *mod; unsigned long flags; - if (no_replacement || smp_alt_once) + if (smp_alt_once) return; BUG_ON(!smp && (num_online_cpus() > 1)); @@ -330,13 +285,6 @@ void alternatives_smp_switch(int smp) void __init alternative_instructions(void) { - if (no_replacement) { - printk(KERN_INFO "(SMP-)alternatives turned off\n"); - free_init_pages("SMP alternatives", - (unsigned long)__smp_alt_begin, - (unsigned long)__smp_alt_end); - return; - } apply_alternatives(__alt_instructions, __alt_instructions_end); /* switch to patch-once-at-boottime-only mode and free the diff --git a/trunk/arch/i386/kernel/apic.c b/trunk/arch/i386/kernel/apic.c index 7ce09492fc0c..5ab59c12335b 100644 --- a/trunk/arch/i386/kernel/apic.c +++ b/trunk/arch/i386/kernel/apic.c @@ -36,7 +36,6 @@ #include #include #include -#include #include #include @@ -157,7 +156,7 @@ void clear_local_APIC(void) maxlvt = get_maxlvt(); /* - * Masking an LVT entry can trigger a local APIC error + * Masking an LVT entry on a P6 can trigger a local APIC error * if the vector is zero. Mask LVTERR first to prevent this. */ if (maxlvt >= 3) { @@ -1118,18 +1117,7 @@ void disable_APIC_timer(void) unsigned long v; v = apic_read(APIC_LVTT); - /* - * When an illegal vector value (0-15) is written to an LVT - * entry and delivery mode is Fixed, the APIC may signal an - * illegal vector error, with out regard to whether the mask - * bit is set or whether an interrupt is actually seen on input. - * - * Boot sequence might call this function when the LVTT has - * '0' vector value. So make sure vector field is set to - * valid value. - */ - v |= (APIC_LVT_MASKED | LOCAL_TIMER_VECTOR); - apic_write_around(APIC_LVTT, v); + apic_write_around(APIC_LVTT, v | APIC_LVT_MASKED); } } diff --git a/trunk/arch/i386/kernel/apm.c b/trunk/arch/i386/kernel/apm.c index 7c5729d1fd06..9e819eb68229 100644 --- a/trunk/arch/i386/kernel/apm.c +++ b/trunk/arch/i386/kernel/apm.c @@ -764,9 +764,9 @@ static int apm_do_idle(void) int idled = 0; int polling; - polling = !!(current_thread_info()->status & TS_POLLING); + polling = test_thread_flag(TIF_POLLING_NRFLAG); if (polling) { - current_thread_info()->status &= ~TS_POLLING; + clear_thread_flag(TIF_POLLING_NRFLAG); smp_mb__after_clear_bit(); } if (!need_resched()) { @@ -774,7 +774,7 @@ static int apm_do_idle(void) ret = apm_bios_call_simple(APM_FUNC_IDLE, 0, 0, &eax); } if (polling) - current_thread_info()->status |= TS_POLLING; + set_thread_flag(TIF_POLLING_NRFLAG); if (!idled) return 0; diff --git a/trunk/arch/i386/kernel/cpu/amd.c b/trunk/arch/i386/kernel/cpu/amd.c index fd0457c9c827..786d1a57048b 100644 --- a/trunk/arch/i386/kernel/cpu/amd.c +++ b/trunk/arch/i386/kernel/cpu/amd.c @@ -224,17 +224,15 @@ static void __init init_amd(struct cpuinfo_x86 *c) #ifdef CONFIG_X86_HT /* - * On a AMD multi core setup the lower bits of the APIC id - * distingush the cores. + * On a AMD dual core setup the lower bits of the APIC id + * distingush the cores. Assumes number of cores is a power + * of two. */ if (c->x86_max_cores > 1) { int cpu = smp_processor_id(); - unsigned bits = (cpuid_ecx(0x80000008) >> 12) & 0xf; - - if (bits == 0) { - while ((1 << bits) < c->x86_max_cores) - bits++; - } + unsigned bits = 0; + while ((1 << bits) < c->x86_max_cores) + bits++; cpu_core_id[cpu] = phys_proc_id[cpu] & ((1<>= bits; printk(KERN_INFO "CPU %d(%d) -> Core %d\n", @@ -242,8 +240,6 @@ static void __init init_amd(struct cpuinfo_x86 *c) } #endif - if (cpuid_eax(0x80000000) >= 0x80000006) - num_cache_leaves = 3; } static unsigned int amd_size_cache(struct cpuinfo_x86 * c, unsigned int size) diff --git a/trunk/arch/i386/kernel/cpu/intel.c b/trunk/arch/i386/kernel/cpu/intel.c index 10afc645c540..5386b29bb5a5 100644 --- a/trunk/arch/i386/kernel/cpu/intel.c +++ b/trunk/arch/i386/kernel/cpu/intel.c @@ -122,12 +122,6 @@ static void __cpuinit init_intel(struct cpuinfo_x86 *c) select_idle_routine(c); l2 = init_intel_cacheinfo(c); - if (c->cpuid_level > 9 ) { - unsigned eax = cpuid_eax(10); - /* Check for version and the number of counters */ - if ((eax & 0xff) && (((eax>>8) & 0xff) > 1)) - set_bit(X86_FEATURE_ARCH_PERFMON, c->x86_capability); - } /* SEP CPUID bug: Pentium Pro reports SEP but doesn't have it until model 3 mask 3 */ if ((c->x86<<8 | c->x86_model<<4 | c->x86_mask) < 0x633) diff --git a/trunk/arch/i386/kernel/cpu/intel_cacheinfo.c b/trunk/arch/i386/kernel/cpu/intel_cacheinfo.c index 6c37b4fd8ce2..c8547a6fa7e6 100644 --- a/trunk/arch/i386/kernel/cpu/intel_cacheinfo.c +++ b/trunk/arch/i386/kernel/cpu/intel_cacheinfo.c @@ -4,7 +4,6 @@ * Changes: * Venkatesh Pallipadi : Adding cache identification through cpuid(4) * Ashok Raj : Work with CPU hotplug infrastructure. - * Andi Kleen : CPUID4 emulation on AMD. */ #include @@ -131,111 +130,25 @@ struct _cpuid4_info { cpumask_t shared_cpu_map; }; -unsigned short num_cache_leaves; - -/* AMD doesn't have CPUID4. Emulate it here to report the same - information to the user. This makes some assumptions about the machine: - No L3, L2 not shared, no SMT etc. that is currently true on AMD CPUs. - - In theory the TLBs could be reported as fake type (they are in "dummy"). - Maybe later */ -union l1_cache { - struct { - unsigned line_size : 8; - unsigned lines_per_tag : 8; - unsigned assoc : 8; - unsigned size_in_kb : 8; - }; - unsigned val; -}; - -union l2_cache { - struct { - unsigned line_size : 8; - unsigned lines_per_tag : 4; - unsigned assoc : 4; - unsigned size_in_kb : 16; - }; - unsigned val; -}; - -static unsigned short assocs[] = { - [1] = 1, [2] = 2, [4] = 4, [6] = 8, - [8] = 16, - [0xf] = 0xffff // ?? - }; -static unsigned char levels[] = { 1, 1, 2 }; -static unsigned char types[] = { 1, 2, 3 }; - -static void __cpuinit amd_cpuid4(int leaf, union _cpuid4_leaf_eax *eax, - union _cpuid4_leaf_ebx *ebx, - union _cpuid4_leaf_ecx *ecx) -{ - unsigned dummy; - unsigned line_size, lines_per_tag, assoc, size_in_kb; - union l1_cache l1i, l1d; - union l2_cache l2; - - eax->full = 0; - ebx->full = 0; - ecx->full = 0; - - cpuid(0x80000005, &dummy, &dummy, &l1d.val, &l1i.val); - cpuid(0x80000006, &dummy, &dummy, &l2.val, &dummy); - - if (leaf > 2 || !l1d.val || !l1i.val || !l2.val) - return; - - eax->split.is_self_initializing = 1; - eax->split.type = types[leaf]; - eax->split.level = levels[leaf]; - eax->split.num_threads_sharing = 0; - eax->split.num_cores_on_die = current_cpu_data.x86_max_cores - 1; - - if (leaf <= 1) { - union l1_cache *l1 = leaf == 0 ? &l1d : &l1i; - assoc = l1->assoc; - line_size = l1->line_size; - lines_per_tag = l1->lines_per_tag; - size_in_kb = l1->size_in_kb; - } else { - assoc = l2.assoc; - line_size = l2.line_size; - lines_per_tag = l2.lines_per_tag; - /* cpu_data has errata corrections for K7 applied */ - size_in_kb = current_cpu_data.x86_cache_size; - } - - if (assoc == 0xf) - eax->split.is_fully_associative = 1; - ebx->split.coherency_line_size = line_size - 1; - ebx->split.ways_of_associativity = assocs[assoc] - 1; - ebx->split.physical_line_partition = lines_per_tag - 1; - ecx->split.number_of_sets = (size_in_kb * 1024) / line_size / - (ebx->split.ways_of_associativity + 1) - 1; -} +static unsigned short num_cache_leaves; static int __cpuinit cpuid4_cache_lookup(int index, struct _cpuid4_info *this_leaf) { - union _cpuid4_leaf_eax eax; - union _cpuid4_leaf_ebx ebx; - union _cpuid4_leaf_ecx ecx; - unsigned edx; + unsigned int eax, ebx, ecx, edx; + union _cpuid4_leaf_eax cache_eax; - if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD) - amd_cpuid4(index, &eax, &ebx, &ecx); - else - cpuid_count(4, index, &eax.full, &ebx.full, &ecx.full, &edx); - if (eax.split.type == CACHE_TYPE_NULL) + cpuid_count(4, index, &eax, &ebx, &ecx, &edx); + cache_eax.full = eax; + if (cache_eax.split.type == CACHE_TYPE_NULL) return -EIO; /* better error ? */ - this_leaf->eax = eax; - this_leaf->ebx = ebx; - this_leaf->ecx = ecx; - this_leaf->size = (ecx.split.number_of_sets + 1) * - (ebx.split.coherency_line_size + 1) * - (ebx.split.physical_line_partition + 1) * - (ebx.split.ways_of_associativity + 1); + this_leaf->eax.full = eax; + this_leaf->ebx.full = ebx; + this_leaf->ecx.full = ecx; + this_leaf->size = (this_leaf->ecx.split.number_of_sets + 1) * + (this_leaf->ebx.split.coherency_line_size + 1) * + (this_leaf->ebx.split.physical_line_partition + 1) * + (this_leaf->ebx.split.ways_of_associativity + 1); return 0; } diff --git a/trunk/arch/i386/kernel/crash.c b/trunk/arch/i386/kernel/crash.c index 0c88d3ec8c18..21dc1bbb8067 100644 --- a/trunk/arch/i386/kernel/crash.c +++ b/trunk/arch/i386/kernel/crash.c @@ -120,9 +120,14 @@ static int crash_nmi_callback(struct pt_regs *regs, int cpu) return 1; } +/* + * By using the NMI code instead of a vector we just sneak thru the + * word generator coming out with just what we want. AND it does + * not matter if clustered_apic_mode is set or not. + */ static void smp_send_nmi_allbutself(void) { - send_IPI_allbutself(NMI_VECTOR); + send_IPI_allbutself(APIC_DM_NMI); } static void nmi_shootdown_cpus(void) diff --git a/trunk/arch/i386/kernel/entry.S b/trunk/arch/i386/kernel/entry.S index e6e4506e749a..cfc683f153b9 100644 --- a/trunk/arch/i386/kernel/entry.S +++ b/trunk/arch/i386/kernel/entry.S @@ -48,7 +48,6 @@ #include #include #include -#include #include "irq_vectors.h" #define nr_syscalls ((syscall_table_size)/4) @@ -86,67 +85,31 @@ VM_MASK = 0x00020000 #define SAVE_ALL \ cld; \ pushl %es; \ - CFI_ADJUST_CFA_OFFSET 4;\ - /*CFI_REL_OFFSET es, 0;*/\ pushl %ds; \ - CFI_ADJUST_CFA_OFFSET 4;\ - /*CFI_REL_OFFSET ds, 0;*/\ pushl %eax; \ - CFI_ADJUST_CFA_OFFSET 4;\ - CFI_REL_OFFSET eax, 0;\ pushl %ebp; \ - CFI_ADJUST_CFA_OFFSET 4;\ - CFI_REL_OFFSET ebp, 0;\ pushl %edi; \ - CFI_ADJUST_CFA_OFFSET 4;\ - CFI_REL_OFFSET edi, 0;\ pushl %esi; \ - CFI_ADJUST_CFA_OFFSET 4;\ - CFI_REL_OFFSET esi, 0;\ pushl %edx; \ - CFI_ADJUST_CFA_OFFSET 4;\ - CFI_REL_OFFSET edx, 0;\ pushl %ecx; \ - CFI_ADJUST_CFA_OFFSET 4;\ - CFI_REL_OFFSET ecx, 0;\ pushl %ebx; \ - CFI_ADJUST_CFA_OFFSET 4;\ - CFI_REL_OFFSET ebx, 0;\ movl $(__USER_DS), %edx; \ movl %edx, %ds; \ movl %edx, %es; #define RESTORE_INT_REGS \ popl %ebx; \ - CFI_ADJUST_CFA_OFFSET -4;\ - CFI_RESTORE ebx;\ popl %ecx; \ - CFI_ADJUST_CFA_OFFSET -4;\ - CFI_RESTORE ecx;\ popl %edx; \ - CFI_ADJUST_CFA_OFFSET -4;\ - CFI_RESTORE edx;\ popl %esi; \ - CFI_ADJUST_CFA_OFFSET -4;\ - CFI_RESTORE esi;\ popl %edi; \ - CFI_ADJUST_CFA_OFFSET -4;\ - CFI_RESTORE edi;\ popl %ebp; \ - CFI_ADJUST_CFA_OFFSET -4;\ - CFI_RESTORE ebp;\ - popl %eax; \ - CFI_ADJUST_CFA_OFFSET -4;\ - CFI_RESTORE eax + popl %eax #define RESTORE_REGS \ RESTORE_INT_REGS; \ 1: popl %ds; \ - CFI_ADJUST_CFA_OFFSET -4;\ - /*CFI_RESTORE ds;*/\ 2: popl %es; \ - CFI_ADJUST_CFA_OFFSET -4;\ - /*CFI_RESTORE es;*/\ .section .fixup,"ax"; \ 3: movl $0,(%esp); \ jmp 1b; \ @@ -159,43 +122,13 @@ VM_MASK = 0x00020000 .long 2b,4b; \ .previous -#define RING0_INT_FRAME \ - CFI_STARTPROC simple;\ - CFI_DEF_CFA esp, 3*4;\ - /*CFI_OFFSET cs, -2*4;*/\ - CFI_OFFSET eip, -3*4 - -#define RING0_EC_FRAME \ - CFI_STARTPROC simple;\ - CFI_DEF_CFA esp, 4*4;\ - /*CFI_OFFSET cs, -2*4;*/\ - CFI_OFFSET eip, -3*4 - -#define RING0_PTREGS_FRAME \ - CFI_STARTPROC simple;\ - CFI_DEF_CFA esp, OLDESP-EBX;\ - /*CFI_OFFSET cs, CS-OLDESP;*/\ - CFI_OFFSET eip, EIP-OLDESP;\ - /*CFI_OFFSET es, ES-OLDESP;*/\ - /*CFI_OFFSET ds, DS-OLDESP;*/\ - CFI_OFFSET eax, EAX-OLDESP;\ - CFI_OFFSET ebp, EBP-OLDESP;\ - CFI_OFFSET edi, EDI-OLDESP;\ - CFI_OFFSET esi, ESI-OLDESP;\ - CFI_OFFSET edx, EDX-OLDESP;\ - CFI_OFFSET ecx, ECX-OLDESP;\ - CFI_OFFSET ebx, EBX-OLDESP ENTRY(ret_from_fork) - CFI_STARTPROC pushl %eax - CFI_ADJUST_CFA_OFFSET -4 call schedule_tail GET_THREAD_INFO(%ebp) popl %eax - CFI_ADJUST_CFA_OFFSET -4 jmp syscall_exit - CFI_ENDPROC /* * Return to user mode is not as complex as all this looks, @@ -206,7 +139,6 @@ ENTRY(ret_from_fork) # userspace resumption stub bypassing syscall exit tracing ALIGN - RING0_PTREGS_FRAME ret_from_exception: preempt_stop ret_from_intr: @@ -239,33 +171,20 @@ need_resched: call preempt_schedule_irq jmp need_resched #endif - CFI_ENDPROC /* SYSENTER_RETURN points to after the "sysenter" instruction in the vsyscall page. See vsyscall-sysentry.S, which defines the symbol. */ # sysenter call handler stub ENTRY(sysenter_entry) - CFI_STARTPROC simple - CFI_DEF_CFA esp, 0 - CFI_REGISTER esp, ebp movl TSS_sysenter_esp0(%esp),%esp sysenter_past_esp: sti pushl $(__USER_DS) - CFI_ADJUST_CFA_OFFSET 4 - /*CFI_REL_OFFSET ss, 0*/ pushl %ebp - CFI_ADJUST_CFA_OFFSET 4 - CFI_REL_OFFSET esp, 0 pushfl - CFI_ADJUST_CFA_OFFSET 4 pushl $(__USER_CS) - CFI_ADJUST_CFA_OFFSET 4 - /*CFI_REL_OFFSET cs, 0*/ pushl $SYSENTER_RETURN - CFI_ADJUST_CFA_OFFSET 4 - CFI_REL_OFFSET eip, 0 /* * Load the potential sixth argument from user stack. @@ -280,7 +199,6 @@ sysenter_past_esp: .previous pushl %eax - CFI_ADJUST_CFA_OFFSET 4 SAVE_ALL GET_THREAD_INFO(%ebp) @@ -301,14 +219,11 @@ sysenter_past_esp: xorl %ebp,%ebp sti sysexit - CFI_ENDPROC # system call handler stub ENTRY(system_call) - RING0_INT_FRAME # can't unwind into user space anyway pushl %eax # save orig_eax - CFI_ADJUST_CFA_OFFSET 4 SAVE_ALL GET_THREAD_INFO(%ebp) testl $TF_MASK,EFLAGS(%esp) @@ -341,12 +256,10 @@ restore_all: movb CS(%esp), %al andl $(VM_MASK | (4 << 8) | 3), %eax cmpl $((4 << 8) | 3), %eax - CFI_REMEMBER_STATE je ldt_ss # returning to user-space with LDT SS restore_nocheck: RESTORE_REGS addl $4, %esp - CFI_ADJUST_CFA_OFFSET -4 1: iret .section .fixup,"ax" iret_exc: @@ -360,7 +273,6 @@ iret_exc: .long 1b,iret_exc .previous - CFI_RESTORE_STATE ldt_ss: larl OLDSS(%esp), %eax jnz restore_nocheck @@ -373,13 +285,11 @@ ldt_ss: * CPUs, which we can try to work around to make * dosemu and wine happy. */ subl $8, %esp # reserve space for switch16 pointer - CFI_ADJUST_CFA_OFFSET 8 cli movl %esp, %eax /* Set up the 16bit stack frame with switch32 pointer on top, * and a switch16 pointer on top of the current frame. */ call setup_x86_bogus_stack - CFI_ADJUST_CFA_OFFSET -8 # frame has moved RESTORE_REGS lss 20+4(%esp), %esp # switch to 16bit stack 1: iret @@ -387,11 +297,9 @@ ldt_ss: .align 4 .long 1b,iret_exc .previous - CFI_ENDPROC # perform work that needs to be done immediately before resumption ALIGN - RING0_PTREGS_FRAME # can't unwind into user space anyway work_pending: testb $_TIF_NEED_RESCHED, %cl jz work_notifysig @@ -421,10 +329,8 @@ work_notifysig: # deal with pending signals and work_notifysig_v86: #ifdef CONFIG_VM86 pushl %ecx # save ti_flags for do_notify_resume - CFI_ADJUST_CFA_OFFSET 4 call save_v86_state # %eax contains pt_regs pointer popl %ecx - CFI_ADJUST_CFA_OFFSET -4 movl %eax, %esp xorl %edx, %edx call do_notify_resume @@ -457,21 +363,19 @@ syscall_exit_work: movl $1, %edx call do_syscall_trace jmp resume_userspace - CFI_ENDPROC - RING0_INT_FRAME # can't unwind into user space anyway + ALIGN syscall_fault: pushl %eax # save orig_eax - CFI_ADJUST_CFA_OFFSET 4 SAVE_ALL GET_THREAD_INFO(%ebp) movl $-EFAULT,EAX(%esp) jmp resume_userspace + ALIGN syscall_badsys: movl $-ENOSYS,EAX(%esp) jmp resume_userspace - CFI_ENDPROC #define FIXUP_ESPFIX_STACK \ movl %esp, %eax; \ @@ -483,21 +387,16 @@ syscall_badsys: movl %eax, %esp; #define UNWIND_ESPFIX_STACK \ pushl %eax; \ - CFI_ADJUST_CFA_OFFSET 4; \ movl %ss, %eax; \ /* see if on 16bit stack */ \ cmpw $__ESPFIX_SS, %ax; \ - je 28f; \ -27: popl %eax; \ - CFI_ADJUST_CFA_OFFSET -4; \ -.section .fixup,"ax"; \ -28: movl $__KERNEL_DS, %eax; \ - movl %eax, %ds; \ - movl %eax, %es; \ + jne 28f; \ + movl $__KERNEL_DS, %edx; \ + movl %edx, %ds; \ + movl %edx, %es; \ /* switch to 32bit stack */ \ - FIXUP_ESPFIX_STACK; \ - jmp 27b; \ -.previous + FIXUP_ESPFIX_STACK \ +28: popl %eax; /* * Build the entry stubs and pointer table with @@ -509,14 +408,9 @@ ENTRY(interrupt) vector=0 ENTRY(irq_entries_start) - RING0_INT_FRAME .rept NR_IRQS ALIGN - .if vector - CFI_ADJUST_CFA_OFFSET -4 - .endif 1: pushl $vector-256 - CFI_ADJUST_CFA_OFFSET 4 jmp common_interrupt .data .long 1b @@ -530,99 +424,60 @@ common_interrupt: movl %esp,%eax call do_IRQ jmp ret_from_intr - CFI_ENDPROC #define BUILD_INTERRUPT(name, nr) \ ENTRY(name) \ - RING0_INT_FRAME; \ pushl $nr-256; \ - CFI_ADJUST_CFA_OFFSET 4; \ - SAVE_ALL; \ + SAVE_ALL \ movl %esp,%eax; \ call smp_/**/name; \ - jmp ret_from_intr; \ - CFI_ENDPROC + jmp ret_from_intr; /* The include is where all of the SMP etc. interrupts come from */ #include "entry_arch.h" ENTRY(divide_error) - RING0_INT_FRAME pushl $0 # no error code - CFI_ADJUST_CFA_OFFSET 4 pushl $do_divide_error - CFI_ADJUST_CFA_OFFSET 4 ALIGN error_code: pushl %ds - CFI_ADJUST_CFA_OFFSET 4 - /*CFI_REL_OFFSET ds, 0*/ pushl %eax - CFI_ADJUST_CFA_OFFSET 4 - CFI_REL_OFFSET eax, 0 xorl %eax, %eax pushl %ebp - CFI_ADJUST_CFA_OFFSET 4 - CFI_REL_OFFSET ebp, 0 pushl %edi - CFI_ADJUST_CFA_OFFSET 4 - CFI_REL_OFFSET edi, 0 pushl %esi - CFI_ADJUST_CFA_OFFSET 4 - CFI_REL_OFFSET esi, 0 pushl %edx - CFI_ADJUST_CFA_OFFSET 4 - CFI_REL_OFFSET edx, 0 decl %eax # eax = -1 pushl %ecx - CFI_ADJUST_CFA_OFFSET 4 - CFI_REL_OFFSET ecx, 0 pushl %ebx - CFI_ADJUST_CFA_OFFSET 4 - CFI_REL_OFFSET ebx, 0 cld pushl %es - CFI_ADJUST_CFA_OFFSET 4 - /*CFI_REL_OFFSET es, 0*/ UNWIND_ESPFIX_STACK popl %ecx - CFI_ADJUST_CFA_OFFSET -4 - /*CFI_REGISTER es, ecx*/ movl ES(%esp), %edi # get the function address movl ORIG_EAX(%esp), %edx # get the error code movl %eax, ORIG_EAX(%esp) movl %ecx, ES(%esp) - /*CFI_REL_OFFSET es, ES*/ movl $(__USER_DS), %ecx movl %ecx, %ds movl %ecx, %es movl %esp,%eax # pt_regs pointer call *%edi jmp ret_from_exception - CFI_ENDPROC ENTRY(coprocessor_error) - RING0_INT_FRAME pushl $0 - CFI_ADJUST_CFA_OFFSET 4 pushl $do_coprocessor_error - CFI_ADJUST_CFA_OFFSET 4 jmp error_code - CFI_ENDPROC ENTRY(simd_coprocessor_error) - RING0_INT_FRAME pushl $0 - CFI_ADJUST_CFA_OFFSET 4 pushl $do_simd_coprocessor_error - CFI_ADJUST_CFA_OFFSET 4 jmp error_code - CFI_ENDPROC ENTRY(device_not_available) - RING0_INT_FRAME pushl $-1 # mark this as an int - CFI_ADJUST_CFA_OFFSET 4 SAVE_ALL movl %cr0, %eax testl $0x4, %eax # EM (math emulation bit) @@ -632,12 +487,9 @@ ENTRY(device_not_available) jmp ret_from_exception device_not_available_emulate: pushl $0 # temporary storage for ORIG_EIP - CFI_ADJUST_CFA_OFFSET 4 call math_emulate addl $4, %esp - CFI_ADJUST_CFA_OFFSET -4 jmp ret_from_exception - CFI_ENDPROC /* * Debug traps and NMI can happen at the one SYSENTER instruction @@ -662,19 +514,16 @@ label: \ pushl $sysenter_past_esp KPROBE_ENTRY(debug) - RING0_INT_FRAME cmpl $sysenter_entry,(%esp) jne debug_stack_correct FIX_STACK(12, debug_stack_correct, debug_esp_fix_insn) debug_stack_correct: pushl $-1 # mark this as an int - CFI_ADJUST_CFA_OFFSET 4 SAVE_ALL xorl %edx,%edx # error code 0 movl %esp,%eax # pt_regs pointer call do_debug jmp ret_from_exception - CFI_ENDPROC .previous .text /* * NMI is doubly nasty. It can happen _while_ we're handling @@ -685,18 +534,14 @@ debug_stack_correct: * fault happened on the sysenter path. */ ENTRY(nmi) - RING0_INT_FRAME pushl %eax - CFI_ADJUST_CFA_OFFSET 4 movl %ss, %eax cmpw $__ESPFIX_SS, %ax popl %eax - CFI_ADJUST_CFA_OFFSET -4 je nmi_16bit_stack cmpl $sysenter_entry,(%esp) je nmi_stack_fixup pushl %eax - CFI_ADJUST_CFA_OFFSET 4 movl %esp,%eax /* Do not access memory above the end of our stack page, * it might not exist. @@ -704,19 +549,16 @@ ENTRY(nmi) andl $(THREAD_SIZE-1),%eax cmpl $(THREAD_SIZE-20),%eax popl %eax - CFI_ADJUST_CFA_OFFSET -4 jae nmi_stack_correct cmpl $sysenter_entry,12(%esp) je nmi_debug_stack_check nmi_stack_correct: pushl %eax - CFI_ADJUST_CFA_OFFSET 4 SAVE_ALL xorl %edx,%edx # zero error code movl %esp,%eax # pt_regs pointer call do_nmi jmp restore_all - CFI_ENDPROC nmi_stack_fixup: FIX_STACK(12,nmi_stack_correct, 1) @@ -732,177 +574,94 @@ nmi_debug_stack_check: jmp nmi_stack_correct nmi_16bit_stack: - RING0_INT_FRAME /* create the pointer to lss back */ pushl %ss - CFI_ADJUST_CFA_OFFSET 4 pushl %esp - CFI_ADJUST_CFA_OFFSET 4 movzwl %sp, %esp addw $4, (%esp) /* copy the iret frame of 12 bytes */ .rept 3 pushl 16(%esp) - CFI_ADJUST_CFA_OFFSET 4 .endr pushl %eax - CFI_ADJUST_CFA_OFFSET 4 SAVE_ALL FIXUP_ESPFIX_STACK # %eax == %esp - CFI_ADJUST_CFA_OFFSET -20 # the frame has now moved xorl %edx,%edx # zero error code call do_nmi RESTORE_REGS lss 12+4(%esp), %esp # back to 16bit stack 1: iret - CFI_ENDPROC .section __ex_table,"a" .align 4 .long 1b,iret_exc .previous KPROBE_ENTRY(int3) - RING0_INT_FRAME pushl $-1 # mark this as an int - CFI_ADJUST_CFA_OFFSET 4 SAVE_ALL xorl %edx,%edx # zero error code movl %esp,%eax # pt_regs pointer call do_int3 jmp ret_from_exception - CFI_ENDPROC .previous .text ENTRY(overflow) - RING0_INT_FRAME pushl $0 - CFI_ADJUST_CFA_OFFSET 4 pushl $do_overflow - CFI_ADJUST_CFA_OFFSET 4 jmp error_code - CFI_ENDPROC ENTRY(bounds) - RING0_INT_FRAME pushl $0 - CFI_ADJUST_CFA_OFFSET 4 pushl $do_bounds - CFI_ADJUST_CFA_OFFSET 4 jmp error_code - CFI_ENDPROC ENTRY(invalid_op) - RING0_INT_FRAME pushl $0 - CFI_ADJUST_CFA_OFFSET 4 pushl $do_invalid_op - CFI_ADJUST_CFA_OFFSET 4 jmp error_code - CFI_ENDPROC ENTRY(coprocessor_segment_overrun) - RING0_INT_FRAME pushl $0 - CFI_ADJUST_CFA_OFFSET 4 pushl $do_coprocessor_segment_overrun - CFI_ADJUST_CFA_OFFSET 4 jmp error_code - CFI_ENDPROC ENTRY(invalid_TSS) - RING0_EC_FRAME pushl $do_invalid_TSS - CFI_ADJUST_CFA_OFFSET 4 jmp error_code - CFI_ENDPROC ENTRY(segment_not_present) - RING0_EC_FRAME pushl $do_segment_not_present - CFI_ADJUST_CFA_OFFSET 4 jmp error_code - CFI_ENDPROC ENTRY(stack_segment) - RING0_EC_FRAME pushl $do_stack_segment - CFI_ADJUST_CFA_OFFSET 4 jmp error_code - CFI_ENDPROC KPROBE_ENTRY(general_protection) - RING0_EC_FRAME pushl $do_general_protection - CFI_ADJUST_CFA_OFFSET 4 jmp error_code - CFI_ENDPROC .previous .text ENTRY(alignment_check) - RING0_EC_FRAME pushl $do_alignment_check - CFI_ADJUST_CFA_OFFSET 4 jmp error_code - CFI_ENDPROC KPROBE_ENTRY(page_fault) - RING0_EC_FRAME pushl $do_page_fault - CFI_ADJUST_CFA_OFFSET 4 jmp error_code - CFI_ENDPROC .previous .text #ifdef CONFIG_X86_MCE ENTRY(machine_check) - RING0_INT_FRAME pushl $0 - CFI_ADJUST_CFA_OFFSET 4 pushl machine_check_vector - CFI_ADJUST_CFA_OFFSET 4 jmp error_code - CFI_ENDPROC #endif ENTRY(spurious_interrupt_bug) - RING0_INT_FRAME pushl $0 - CFI_ADJUST_CFA_OFFSET 4 pushl $do_spurious_interrupt_bug - CFI_ADJUST_CFA_OFFSET 4 jmp error_code - CFI_ENDPROC - -#ifdef CONFIG_STACK_UNWIND -ENTRY(arch_unwind_init_running) - CFI_STARTPROC - movl 4(%esp), %edx - movl (%esp), %ecx - leal 4(%esp), %eax - movl %ebx, EBX(%edx) - xorl %ebx, %ebx - movl %ebx, ECX(%edx) - movl %ebx, EDX(%edx) - movl %esi, ESI(%edx) - movl %edi, EDI(%edx) - movl %ebp, EBP(%edx) - movl %ebx, EAX(%edx) - movl $__USER_DS, DS(%edx) - movl $__USER_DS, ES(%edx) - movl %ebx, ORIG_EAX(%edx) - movl %ecx, EIP(%edx) - movl 12(%esp), %ecx - movl $__KERNEL_CS, CS(%edx) - movl %ebx, EFLAGS(%edx) - movl %eax, OLDESP(%edx) - movl 8(%esp), %eax - movl %ecx, 8(%esp) - movl EBX(%edx), %ebx - movl $__KERNEL_DS, OLDSS(%edx) - jmpl *%eax - CFI_ENDPROC -ENDPROC(arch_unwind_init_running) -#endif .section .rodata,"a" #include "syscall_table.S" diff --git a/trunk/arch/i386/kernel/hpet.c b/trunk/arch/i386/kernel/hpet.c deleted file mode 100644 index c6737c35815d..000000000000 --- a/trunk/arch/i386/kernel/hpet.c +++ /dev/null @@ -1,67 +0,0 @@ -#include -#include -#include -#include - -#include -#include - -#define HPET_MASK CLOCKSOURCE_MASK(32) -#define HPET_SHIFT 22 - -/* FSEC = 10^-15 NSEC = 10^-9 */ -#define FSEC_PER_NSEC 1000000 - -static void *hpet_ptr; - -static cycle_t read_hpet(void) -{ - return (cycle_t)readl(hpet_ptr); -} - -static struct clocksource clocksource_hpet = { - .name = "hpet", - .rating = 250, - .read = read_hpet, - .mask = HPET_MASK, - .mult = 0, /* set below */ - .shift = HPET_SHIFT, - .is_continuous = 1, -}; - -static int __init init_hpet_clocksource(void) -{ - unsigned long hpet_period; - void __iomem* hpet_base; - u64 tmp; - - if (!hpet_address) - return -ENODEV; - - /* calculate the hpet address: */ - hpet_base = - (void __iomem*)ioremap_nocache(hpet_address, HPET_MMAP_SIZE); - hpet_ptr = hpet_base + HPET_COUNTER; - - /* calculate the frequency: */ - hpet_period = readl(hpet_base + HPET_PERIOD); - - /* - * hpet period is in femto seconds per cycle - * so we need to convert this to ns/cyc units - * aproximated by mult/2^shift - * - * fsec/cyc * 1nsec/1000000fsec = nsec/cyc = mult/2^shift - * fsec/cyc * 1ns/1000000fsec * 2^shift = mult - * fsec/cyc * 2^shift * 1nsec/1000000fsec = mult - * (fsec/cyc << shift)/1000000 = mult - * (hpet_period << shift)/FSEC_PER_NSEC = mult - */ - tmp = (u64)hpet_period << HPET_SHIFT; - do_div(tmp, FSEC_PER_NSEC); - clocksource_hpet.mult = (u32)tmp; - - return clocksource_register(&clocksource_hpet); -} - -module_init(init_hpet_clocksource); diff --git a/trunk/arch/i386/kernel/i8253.c b/trunk/arch/i386/kernel/i8253.c deleted file mode 100644 index 477b24daff53..000000000000 --- a/trunk/arch/i386/kernel/i8253.c +++ /dev/null @@ -1,118 +0,0 @@ -/* - * i8253.c 8253/PIT functions - * - */ -#include -#include -#include -#include -#include -#include - -#include -#include -#include -#include - -#include "io_ports.h" - -DEFINE_SPINLOCK(i8253_lock); -EXPORT_SYMBOL(i8253_lock); - -void setup_pit_timer(void) -{ - unsigned long flags; - - spin_lock_irqsave(&i8253_lock, flags); - outb_p(0x34,PIT_MODE); /* binary, mode 2, LSB/MSB, ch 0 */ - udelay(10); - outb_p(LATCH & 0xff , PIT_CH0); /* LSB */ - udelay(10); - outb(LATCH >> 8 , PIT_CH0); /* MSB */ - spin_unlock_irqrestore(&i8253_lock, flags); -} - -/* - * Since the PIT overflows every tick, its not very useful - * to just read by itself. So use jiffies to emulate a free - * running counter: - */ -static cycle_t pit_read(void) -{ - unsigned long flags; - int count; - u32 jifs; - static int old_count; - static u32 old_jifs; - - spin_lock_irqsave(&i8253_lock, flags); - /* - * Although our caller may have the read side of xtime_lock, - * this is now a seqlock, and we are cheating in this routine - * by having side effects on state that we cannot undo if - * there is a collision on the seqlock and our caller has to - * retry. (Namely, old_jifs and old_count.) So we must treat - * jiffies as volatile despite the lock. We read jiffies - * before latching the timer count to guarantee that although - * the jiffies value might be older than the count (that is, - * the counter may underflow between the last point where - * jiffies was incremented and the point where we latch the - * count), it cannot be newer. - */ - jifs = jiffies; - outb_p(0x00, PIT_MODE); /* latch the count ASAP */ - count = inb_p(PIT_CH0); /* read the latched count */ - count |= inb_p(PIT_CH0) << 8; - - /* VIA686a test code... reset the latch if count > max + 1 */ - if (count > LATCH) { - outb_p(0x34, PIT_MODE); - outb_p(LATCH & 0xff, PIT_CH0); - outb(LATCH >> 8, PIT_CH0); - count = LATCH - 1; - } - - /* - * It's possible for count to appear to go the wrong way for a - * couple of reasons: - * - * 1. The timer counter underflows, but we haven't handled the - * resulting interrupt and incremented jiffies yet. - * 2. Hardware problem with the timer, not giving us continuous time, - * the counter does small "jumps" upwards on some Pentium systems, - * (see c't 95/10 page 335 for Neptun bug.) - * - * Previous attempts to handle these cases intelligently were - * buggy, so we just do the simple thing now. - */ - if (count > old_count && jifs == old_jifs) { - count = old_count; - } - old_count = count; - old_jifs = jifs; - - spin_unlock_irqrestore(&i8253_lock, flags); - - count = (LATCH - 1) - count; - - return (cycle_t)(jifs * LATCH) + count; -} - -static struct clocksource clocksource_pit = { - .name = "pit", - .rating = 110, - .read = pit_read, - .mask = CLOCKSOURCE_MASK(32), - .mult = 0, - .shift = 20, -}; - -static int __init init_pit_clocksource(void) -{ - if (num_possible_cpus() > 4) /* PIT does not scale! */ - return 0; - - clocksource_pit.mult = clocksource_hz2mult(CLOCK_TICK_RATE, 20); - return clocksource_register(&clocksource_pit); -} -module_init(init_pit_clocksource); diff --git a/trunk/arch/i386/kernel/io_apic.c b/trunk/arch/i386/kernel/io_apic.c index 72ae414e4d49..a62df3e764c5 100644 --- a/trunk/arch/i386/kernel/io_apic.c +++ b/trunk/arch/i386/kernel/io_apic.c @@ -38,7 +38,6 @@ #include #include #include -#include #include @@ -51,7 +50,6 @@ atomic_t irq_mis_count; static struct { int pin, apic; } ioapic_i8259 = { -1, -1 }; static DEFINE_SPINLOCK(ioapic_lock); -static DEFINE_SPINLOCK(vector_lock); int timer_over_8254 __initdata = 1; @@ -1163,17 +1161,10 @@ u8 irq_vector[NR_IRQ_VECTORS] __read_mostly = { FIRST_DEVICE_VECTOR , 0 }; int assign_irq_vector(int irq) { static int current_vector = FIRST_DEVICE_VECTOR, offset = 0; - unsigned long flags; - int vector; - - BUG_ON(irq != AUTO_ASSIGN && (unsigned)irq >= NR_IRQ_VECTORS); - spin_lock_irqsave(&vector_lock, flags); - - if (irq != AUTO_ASSIGN && IO_APIC_VECTOR(irq) > 0) { - spin_unlock_irqrestore(&vector_lock, flags); + BUG_ON(irq >= NR_IRQ_VECTORS); + if (irq != AUTO_ASSIGN && IO_APIC_VECTOR(irq) > 0) return IO_APIC_VECTOR(irq); - } next: current_vector += 8; if (current_vector == SYSCALL_VECTOR) @@ -1181,21 +1172,16 @@ int assign_irq_vector(int irq) if (current_vector >= FIRST_SYSTEM_VECTOR) { offset++; - if (!(offset%8)) { - spin_unlock_irqrestore(&vector_lock, flags); + if (!(offset%8)) return -ENOSPC; - } current_vector = FIRST_DEVICE_VECTOR + offset; } - vector = current_vector; - vector_irq[vector] = irq; + vector_irq[current_vector] = irq; if (irq != AUTO_ASSIGN) - IO_APIC_VECTOR(irq) = vector; + IO_APIC_VECTOR(irq) = current_vector; - spin_unlock_irqrestore(&vector_lock, flags); - - return vector; + return current_vector; } static struct hw_interrupt_type ioapic_level_type; @@ -1207,14 +1193,21 @@ static struct hw_interrupt_type ioapic_edge_type; static inline void ioapic_register_intr(int irq, int vector, unsigned long trigger) { - unsigned idx = use_pci_vector() && !platform_legacy_irq(irq) ? vector : irq; - - if ((trigger == IOAPIC_AUTO && IO_APIC_irq_trigger(irq)) || - trigger == IOAPIC_LEVEL) - irq_desc[idx].handler = &ioapic_level_type; - else - irq_desc[idx].handler = &ioapic_edge_type; - set_intr_gate(vector, interrupt[idx]); + if (use_pci_vector() && !platform_legacy_irq(irq)) { + if ((trigger == IOAPIC_AUTO && IO_APIC_irq_trigger(irq)) || + trigger == IOAPIC_LEVEL) + irq_desc[vector].handler = &ioapic_level_type; + else + irq_desc[vector].handler = &ioapic_edge_type; + set_intr_gate(vector, interrupt[vector]); + } else { + if ((trigger == IOAPIC_AUTO && IO_APIC_irq_trigger(irq)) || + trigger == IOAPIC_LEVEL) + irq_desc[irq].handler = &ioapic_level_type; + else + irq_desc[irq].handler = &ioapic_edge_type; + set_intr_gate(vector, interrupt[irq]); + } } static void __init setup_IO_APIC_irqs(void) diff --git a/trunk/arch/i386/kernel/irq.c b/trunk/arch/i386/kernel/irq.c index 061533e0cb5e..49ce4c31b713 100644 --- a/trunk/arch/i386/kernel/irq.c +++ b/trunk/arch/i386/kernel/irq.c @@ -227,7 +227,7 @@ int show_interrupts(struct seq_file *p, void *v) if (i == 0) { seq_printf(p, " "); for_each_online_cpu(j) - seq_printf(p, "CPU%-8d",j); + seq_printf(p, "CPU%d ",j); seq_putc(p, '\n'); } diff --git a/trunk/arch/i386/kernel/kprobes.c b/trunk/arch/i386/kernel/kprobes.c index 727e419ad78a..395a9a6dff88 100644 --- a/trunk/arch/i386/kernel/kprobes.c +++ b/trunk/arch/i386/kernel/kprobes.c @@ -57,85 +57,34 @@ static __always_inline void set_jmp_op(void *from, void *to) /* * returns non-zero if opcodes can be boosted. */ -static __always_inline int can_boost(kprobe_opcode_t *opcodes) +static __always_inline int can_boost(kprobe_opcode_t opcode) { -#define W(row,b0,b1,b2,b3,b4,b5,b6,b7,b8,b9,ba,bb,bc,bd,be,bf) \ - (((b0##UL << 0x0)|(b1##UL << 0x1)|(b2##UL << 0x2)|(b3##UL << 0x3) | \ - (b4##UL << 0x4)|(b5##UL << 0x5)|(b6##UL << 0x6)|(b7##UL << 0x7) | \ - (b8##UL << 0x8)|(b9##UL << 0x9)|(ba##UL << 0xa)|(bb##UL << 0xb) | \ - (bc##UL << 0xc)|(bd##UL << 0xd)|(be##UL << 0xe)|(bf##UL << 0xf)) \ - << (row % 32)) - /* - * Undefined/reserved opcodes, conditional jump, Opcode Extension - * Groups, and some special opcodes can not be boost. - */ - static const unsigned long twobyte_is_boostable[256 / 32] = { - /* 0 1 2 3 4 5 6 7 8 9 a b c d e f */ - /* ------------------------------- */ - W(0x00, 0,0,1,1,0,0,1,0,1,1,0,0,0,0,0,0)| /* 00 */ - W(0x10, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0), /* 10 */ - W(0x20, 1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0)| /* 20 */ - W(0x30, 0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0), /* 30 */ - W(0x40, 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1)| /* 40 */ - W(0x50, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0), /* 50 */ - W(0x60, 1,1,1,1,1,1,1,1,1,1,1,1,0,0,1,1)| /* 60 */ - W(0x70, 0,0,0,0,1,1,1,1,0,0,0,0,0,0,1,1), /* 70 */ - W(0x80, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0)| /* 80 */ - W(0x90, 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1), /* 90 */ - W(0xa0, 1,1,0,1,1,1,0,0,1,1,0,1,1,1,0,1)| /* a0 */ - W(0xb0, 1,1,1,1,1,1,1,1,0,0,0,1,1,1,1,1), /* b0 */ - W(0xc0, 1,1,0,0,0,0,0,0,1,1,1,1,1,1,1,1)| /* c0 */ - W(0xd0, 0,1,1,1,0,1,0,0,1,1,0,1,1,1,0,1), /* d0 */ - W(0xe0, 0,1,1,0,0,1,0,0,1,1,0,1,1,1,0,1)| /* e0 */ - W(0xf0, 0,1,1,1,0,1,0,0,1,1,1,0,1,1,1,0) /* f0 */ - /* ------------------------------- */ - /* 0 1 2 3 4 5 6 7 8 9 a b c d e f */ - }; -#undef W - kprobe_opcode_t opcode; - kprobe_opcode_t *orig_opcodes = opcodes; -retry: - if (opcodes - orig_opcodes > MAX_INSN_SIZE - 1) - return 0; - opcode = *(opcodes++); - - /* 2nd-byte opcode */ - if (opcode == 0x0f) { - if (opcodes - orig_opcodes > MAX_INSN_SIZE - 1) - return 0; - return test_bit(*opcodes, twobyte_is_boostable); - } - - switch (opcode & 0xf0) { - case 0x60: - if (0x63 < opcode && opcode < 0x67) - goto retry; /* prefixes */ - /* can't boost Address-size override and bound */ - return (opcode != 0x62 && opcode != 0x67); + switch (opcode & 0xf0 ) { case 0x70: return 0; /* can't boost conditional jump */ + case 0x90: + /* can't boost call and pushf */ + return opcode != 0x9a && opcode != 0x9c; case 0xc0: - /* can't boost software-interruptions */ - return (0xc1 < opcode && opcode < 0xcc) || opcode == 0xcf; + /* can't boost undefined opcodes and soft-interruptions */ + return (0xc1 < opcode && opcode < 0xc6) || + (0xc7 < opcode && opcode < 0xcc) || opcode == 0xcf; case 0xd0: /* can boost AA* and XLAT */ return (opcode == 0xd4 || opcode == 0xd5 || opcode == 0xd7); case 0xe0: - /* can boost in/out and absolute jmps */ - return ((opcode & 0x04) || opcode == 0xea); + /* can boost in/out and (may be) jmps */ + return (0xe3 < opcode && opcode != 0xe8); case 0xf0: - if ((opcode & 0x0c) == 0 && opcode != 0xf1) - goto retry; /* lock/rep(ne) prefix */ /* clear and set flags can be boost */ return (opcode == 0xf5 || (0xf7 < opcode && opcode < 0xfe)); default: - if (opcode == 0x26 || opcode == 0x36 || opcode == 0x3e) - goto retry; /* prefixes */ - /* can't boost CS override and call */ - return (opcode != 0x2e && opcode != 0x9a); + /* currently, can't boost 2 bytes opcodes */ + return opcode != 0x0f; } } + /* * returns non-zero if opcode modifies the interrupt flag. */ @@ -160,7 +109,7 @@ int __kprobes arch_prepare_kprobe(struct kprobe *p) memcpy(p->ainsn.insn, p->addr, MAX_INSN_SIZE * sizeof(kprobe_opcode_t)); p->opcode = *p->addr; - if (can_boost(p->addr)) { + if (can_boost(p->opcode)) { p->ainsn.boostable = 0; } else { p->ainsn.boostable = -1; @@ -259,9 +208,7 @@ static int __kprobes kprobe_handler(struct pt_regs *regs) struct kprobe_ctlblk *kcb; #ifdef CONFIG_PREEMPT unsigned pre_preempt_count = preempt_count(); -#else - unsigned pre_preempt_count = 1; -#endif +#endif /* CONFIG_PREEMPT */ addr = (kprobe_opcode_t *)(regs->eip - sizeof(kprobe_opcode_t)); @@ -338,14 +285,22 @@ static int __kprobes kprobe_handler(struct pt_regs *regs) /* handler has already set things up, so skip ss setup */ return 1; -ss_probe: - if (pre_preempt_count && p->ainsn.boostable == 1 && !p->post_handler){ + if (p->ainsn.boostable == 1 && +#ifdef CONFIG_PREEMPT + !(pre_preempt_count) && /* + * This enables booster when the direct + * execution path aren't preempted. + */ +#endif /* CONFIG_PREEMPT */ + !p->post_handler && !p->break_handler ) { /* Boost up -- we can execute copied instructions directly */ reset_current_kprobe(); regs->eip = (unsigned long)p->ainsn.insn; preempt_enable_no_resched(); return 1; } + +ss_probe: prepare_singlestep(p, regs); kcb->kprobe_status = KPROBE_HIT_SS; return 1; diff --git a/trunk/arch/i386/kernel/nmi.c b/trunk/arch/i386/kernel/nmi.c index a76e93146585..d43b498ec745 100644 --- a/trunk/arch/i386/kernel/nmi.c +++ b/trunk/arch/i386/kernel/nmi.c @@ -14,17 +14,21 @@ */ #include +#include #include +#include +#include #include +#include +#include #include #include #include #include -#include #include +#include #include -#include #include "mach_traps.h" @@ -96,9 +100,6 @@ int nmi_active; (P4_CCCR_OVF_PMI0|P4_CCCR_THRESHOLD(15)|P4_CCCR_COMPLEMENT| \ P4_CCCR_COMPARE|P4_CCCR_REQUIRED|P4_CCCR_ESCR_SELECT(4)|P4_CCCR_ENABLE) -#define ARCH_PERFMON_NMI_EVENT_SEL ARCH_PERFMON_UNHALTED_CORE_CYCLES_SEL -#define ARCH_PERFMON_NMI_EVENT_UMASK ARCH_PERFMON_UNHALTED_CORE_CYCLES_UMASK - #ifdef CONFIG_SMP /* The performance counters used by NMI_LOCAL_APIC don't trigger when * the CPU is idle. To make sure the NMI watchdog really ticks on all @@ -211,8 +212,6 @@ static int __init setup_nmi_watchdog(char *str) __setup("nmi_watchdog=", setup_nmi_watchdog); -static void disable_intel_arch_watchdog(void); - static void disable_lapic_nmi_watchdog(void) { if (nmi_active <= 0) @@ -222,10 +221,6 @@ static void disable_lapic_nmi_watchdog(void) wrmsr(MSR_K7_EVNTSEL0, 0, 0); break; case X86_VENDOR_INTEL: - if (cpu_has(&boot_cpu_data, X86_FEATURE_ARCH_PERFMON)) { - disable_intel_arch_watchdog(); - break; - } switch (boot_cpu_data.x86) { case 6: if (boot_cpu_data.x86_model > 0xd) @@ -454,53 +449,6 @@ static int setup_p4_watchdog(void) return 1; } -static void disable_intel_arch_watchdog(void) -{ - unsigned ebx; - - /* - * Check whether the Architectural PerfMon supports - * Unhalted Core Cycles Event or not. - * NOTE: Corresponding bit = 0 in ebp indicates event present. - */ - ebx = cpuid_ebx(10); - if (!(ebx & ARCH_PERFMON_UNHALTED_CORE_CYCLES_PRESENT)) - wrmsr(MSR_ARCH_PERFMON_EVENTSEL0, 0, 0); -} - -static int setup_intel_arch_watchdog(void) -{ - unsigned int evntsel; - unsigned ebx; - - /* - * Check whether the Architectural PerfMon supports - * Unhalted Core Cycles Event or not. - * NOTE: Corresponding bit = 0 in ebp indicates event present. - */ - ebx = cpuid_ebx(10); - if ((ebx & ARCH_PERFMON_UNHALTED_CORE_CYCLES_PRESENT)) - return 0; - - nmi_perfctr_msr = MSR_ARCH_PERFMON_PERFCTR0; - - clear_msr_range(MSR_ARCH_PERFMON_EVENTSEL0, 2); - clear_msr_range(MSR_ARCH_PERFMON_PERFCTR0, 2); - - evntsel = ARCH_PERFMON_EVENTSEL_INT - | ARCH_PERFMON_EVENTSEL_OS - | ARCH_PERFMON_EVENTSEL_USR - | ARCH_PERFMON_NMI_EVENT_SEL - | ARCH_PERFMON_NMI_EVENT_UMASK; - - wrmsr(MSR_ARCH_PERFMON_EVENTSEL0, evntsel, 0); - write_watchdog_counter("INTEL_ARCH_PERFCTR0"); - apic_write(APIC_LVTPC, APIC_DM_NMI); - evntsel |= ARCH_PERFMON_EVENTSEL0_ENABLE; - wrmsr(MSR_ARCH_PERFMON_EVENTSEL0, evntsel, 0); - return 1; -} - void setup_apic_nmi_watchdog (void) { switch (boot_cpu_data.x86_vendor) { @@ -510,11 +458,6 @@ void setup_apic_nmi_watchdog (void) setup_k7_watchdog(); break; case X86_VENDOR_INTEL: - if (cpu_has(&boot_cpu_data, X86_FEATURE_ARCH_PERFMON)) { - if (!setup_intel_arch_watchdog()) - return; - break; - } switch (boot_cpu_data.x86) { case 6: if (boot_cpu_data.x86_model > 0xd) @@ -618,8 +561,7 @@ void nmi_watchdog_tick (struct pt_regs * regs) wrmsr(MSR_P4_IQ_CCCR0, nmi_p4_cccr_val, 0); apic_write(APIC_LVTPC, APIC_DM_NMI); } - else if (nmi_perfctr_msr == MSR_P6_PERFCTR0 || - nmi_perfctr_msr == MSR_ARCH_PERFMON_PERFCTR0) { + else if (nmi_perfctr_msr == MSR_P6_PERFCTR0) { /* Only P6 based Pentium M need to re-unmask * the apic vector but it doesn't hurt * other P6 variant */ diff --git a/trunk/arch/i386/kernel/numaq.c b/trunk/arch/i386/kernel/numaq.c index 0caf14652bad..5f5b075f860a 100644 --- a/trunk/arch/i386/kernel/numaq.c +++ b/trunk/arch/i386/kernel/numaq.c @@ -79,12 +79,10 @@ int __init get_memcfg_numaq(void) return 1; } -static int __init numaq_tsc_disable(void) +static int __init numaq_dsc_disable(void) { - if (num_online_nodes() > 1) { - printk(KERN_DEBUG "NUMAQ: disabling TSC\n"); - tsc_disable = 1; - } + printk(KERN_DEBUG "NUMAQ: disabling TSC\n"); + tsc_disable = 1; return 0; } -arch_initcall(numaq_tsc_disable); +core_initcall(numaq_dsc_disable); diff --git a/trunk/arch/i386/kernel/process.c b/trunk/arch/i386/kernel/process.c index 6946b06e2784..6259afea46d1 100644 --- a/trunk/arch/i386/kernel/process.c +++ b/trunk/arch/i386/kernel/process.c @@ -102,7 +102,7 @@ void default_idle(void) local_irq_enable(); if (!hlt_counter && boot_cpu_data.hlt_works_ok) { - current_thread_info()->status &= ~TS_POLLING; + clear_thread_flag(TIF_POLLING_NRFLAG); smp_mb__after_clear_bit(); while (!need_resched()) { local_irq_disable(); @@ -111,7 +111,7 @@ void default_idle(void) else local_irq_enable(); } - current_thread_info()->status |= TS_POLLING; + set_thread_flag(TIF_POLLING_NRFLAG); } else { while (!need_resched()) cpu_relax(); @@ -174,7 +174,7 @@ void cpu_idle(void) { int cpu = smp_processor_id(); - current_thread_info()->status |= TS_POLLING; + set_thread_flag(TIF_POLLING_NRFLAG); /* endless idle loop with no priority at all */ while (1) { @@ -312,7 +312,7 @@ void show_regs(struct pt_regs * regs) cr3 = read_cr3(); cr4 = read_cr4_safe(); printk("CR0: %08lx CR2: %08lx CR3: %08lx CR4: %08lx\n", cr0, cr2, cr3, cr4); - show_trace(NULL, regs, ®s->esp); + show_trace(NULL, ®s->esp); } /* diff --git a/trunk/arch/i386/kernel/setup.c b/trunk/arch/i386/kernel/setup.c index 4a65040cc624..6bef9273733e 100644 --- a/trunk/arch/i386/kernel/setup.c +++ b/trunk/arch/i386/kernel/setup.c @@ -1575,7 +1575,6 @@ void __init setup_arch(char **cmdline_p) conswitchp = &dummy_con; #endif #endif - tsc_init(); } static __init int add_pcspkr(void) diff --git a/trunk/arch/i386/kernel/smp.c b/trunk/arch/i386/kernel/smp.c index c10789d7a9d3..d134e9643a58 100644 --- a/trunk/arch/i386/kernel/smp.c +++ b/trunk/arch/i386/kernel/smp.c @@ -114,17 +114,7 @@ DEFINE_PER_CPU(struct tlb_state, cpu_tlbstate) ____cacheline_aligned = { &init_m static inline int __prepare_ICR (unsigned int shortcut, int vector) { - unsigned int icr = shortcut | APIC_DEST_LOGICAL; - - switch (vector) { - default: - icr |= APIC_DM_FIXED | vector; - break; - case NMI_VECTOR: - icr |= APIC_DM_NMI; - break; - } - return icr; + return APIC_DM_FIXED | shortcut | vector | APIC_DEST_LOGICAL; } static inline int __prepare_ICR2 (unsigned int mask) diff --git a/trunk/arch/i386/kernel/smpboot.c b/trunk/arch/i386/kernel/smpboot.c index bce5470ecb42..bd0ca5c9f053 100644 --- a/trunk/arch/i386/kernel/smpboot.c +++ b/trunk/arch/i386/kernel/smpboot.c @@ -52,7 +52,6 @@ #include #include #include -#include #include #include diff --git a/trunk/arch/i386/kernel/time.c b/trunk/arch/i386/kernel/time.c index 5f43d0410122..9d3074759856 100644 --- a/trunk/arch/i386/kernel/time.c +++ b/trunk/arch/i386/kernel/time.c @@ -82,6 +82,13 @@ extern unsigned long wall_jiffies; DEFINE_SPINLOCK(rtc_lock); EXPORT_SYMBOL(rtc_lock); +#include + +DEFINE_SPINLOCK(i8253_lock); +EXPORT_SYMBOL(i8253_lock); + +struct timer_opts *cur_timer __read_mostly = &timer_none; + /* * This is a special lock that is owned by the CPU and holds the index * register we are working with. It is required for NMI access to the @@ -111,19 +118,99 @@ void rtc_cmos_write(unsigned char val, unsigned char addr) } EXPORT_SYMBOL(rtc_cmos_write); +/* + * This version of gettimeofday has microsecond resolution + * and better than microsecond precision on fast x86 machines with TSC. + */ +void do_gettimeofday(struct timeval *tv) +{ + unsigned long seq; + unsigned long usec, sec; + unsigned long max_ntp_tick; + + do { + unsigned long lost; + + seq = read_seqbegin(&xtime_lock); + + usec = cur_timer->get_offset(); + lost = jiffies - wall_jiffies; + + /* + * If time_adjust is negative then NTP is slowing the clock + * so make sure not to go into next possible interval. + * Better to lose some accuracy than have time go backwards.. + */ + if (unlikely(time_adjust < 0)) { + max_ntp_tick = (USEC_PER_SEC / HZ) - tickadj; + usec = min(usec, max_ntp_tick); + + if (lost) + usec += lost * max_ntp_tick; + } + else if (unlikely(lost)) + usec += lost * (USEC_PER_SEC / HZ); + + sec = xtime.tv_sec; + usec += (xtime.tv_nsec / 1000); + } while (read_seqretry(&xtime_lock, seq)); + + while (usec >= 1000000) { + usec -= 1000000; + sec++; + } + + tv->tv_sec = sec; + tv->tv_usec = usec; +} + +EXPORT_SYMBOL(do_gettimeofday); + +int do_settimeofday(struct timespec *tv) +{ + time_t wtm_sec, sec = tv->tv_sec; + long wtm_nsec, nsec = tv->tv_nsec; + + if ((unsigned long)tv->tv_nsec >= NSEC_PER_SEC) + return -EINVAL; + + write_seqlock_irq(&xtime_lock); + /* + * This is revolting. We need to set "xtime" correctly. However, the + * value in this location is the value at the most recent update of + * wall time. Discover what correction gettimeofday() would have + * made, and then undo it! + */ + nsec -= cur_timer->get_offset() * NSEC_PER_USEC; + nsec -= (jiffies - wall_jiffies) * TICK_NSEC; + + wtm_sec = wall_to_monotonic.tv_sec + (xtime.tv_sec - sec); + wtm_nsec = wall_to_monotonic.tv_nsec + (xtime.tv_nsec - nsec); + + set_normalized_timespec(&xtime, sec, nsec); + set_normalized_timespec(&wall_to_monotonic, wtm_sec, wtm_nsec); + + ntp_clear(); + write_sequnlock_irq(&xtime_lock); + clock_was_set(); + return 0; +} + +EXPORT_SYMBOL(do_settimeofday); + static int set_rtc_mmss(unsigned long nowtime) { int retval; - unsigned long flags; + + WARN_ON(irqs_disabled()); /* gets recalled with irq locally disabled */ - /* XXX - does irqsave resolve this? -johnstul */ - spin_lock_irqsave(&rtc_lock, flags); + spin_lock_irq(&rtc_lock); if (efi_enabled) retval = efi_set_rtc_mmss(nowtime); else retval = mach_set_rtc_mmss(nowtime); - spin_unlock_irqrestore(&rtc_lock, flags); + spin_unlock_irq(&rtc_lock); return retval; } @@ -131,6 +218,16 @@ static int set_rtc_mmss(unsigned long nowtime) int timer_ack; +/* monotonic_clock(): returns # of nanoseconds passed since time_init() + * Note: This function is required to return accurate + * time even in the absence of multiple timer ticks. + */ +unsigned long long monotonic_clock(void) +{ + return cur_timer->monotonic_clock(); +} +EXPORT_SYMBOL(monotonic_clock); + #if defined(CONFIG_SMP) && defined(CONFIG_FRAME_POINTER) unsigned long profile_pc(struct pt_regs *regs) { @@ -145,21 +242,11 @@ EXPORT_SYMBOL(profile_pc); #endif /* - * This is the same as the above, except we _also_ save the current - * Time Stamp Counter value at the time of the timer interrupt, so that - * we later on can estimate the time of day more exactly. + * timer_interrupt() needs to keep up the real-time clock, + * as well as call the "do_timer()" routine every clocktick */ -irqreturn_t timer_interrupt(int irq, void *dev_id, struct pt_regs *regs) +static inline void do_timer_interrupt(int irq, struct pt_regs *regs) { - /* - * Here we are in the timer irq handler. We just have irqs locally - * disabled but we don't know if the timer_bh is running on the other - * CPU. We need to avoid to SMP race with it. NOTE: we don' t need - * the irq version of write_lock because as just said we have irq - * locally disabled. -arca - */ - write_seqlock(&xtime_lock); - #ifdef CONFIG_X86_IO_APIC if (timer_ack) { /* @@ -192,6 +279,27 @@ irqreturn_t timer_interrupt(int irq, void *dev_id, struct pt_regs *regs) irq = inb_p( 0x61 ); /* read the current state */ outb_p( irq|0x80, 0x61 ); /* reset the IRQ */ } +} + +/* + * This is the same as the above, except we _also_ save the current + * Time Stamp Counter value at the time of the timer interrupt, so that + * we later on can estimate the time of day more exactly. + */ +irqreturn_t timer_interrupt(int irq, void *dev_id, struct pt_regs *regs) +{ + /* + * Here we are in the timer irq handler. We just have irqs locally + * disabled but we don't know if the timer_bh is running on the other + * CPU. We need to avoid to SMP race with it. NOTE: we don' t need + * the irq version of write_lock because as just said we have irq + * locally disabled. -arca + */ + write_seqlock(&xtime_lock); + + cur_timer->mark_offset(); + + do_timer_interrupt(irq, regs); write_sequnlock(&xtime_lock); @@ -272,6 +380,7 @@ void notify_arch_cmos_timer(void) static long clock_cmos_diff, sleep_start; +static struct timer_opts *last_timer; static int timer_suspend(struct sys_device *dev, pm_message_t state) { /* @@ -280,6 +389,10 @@ static int timer_suspend(struct sys_device *dev, pm_message_t state) clock_cmos_diff = -get_cmos_time(); clock_cmos_diff += get_seconds(); sleep_start = get_cmos_time(); + last_timer = cur_timer; + cur_timer = &timer_none; + if (last_timer->suspend) + last_timer->suspend(state); return 0; } @@ -302,6 +415,10 @@ static int timer_resume(struct sys_device *dev) jiffies_64 += sleep_length; wall_jiffies += sleep_length; write_sequnlock_irqrestore(&xtime_lock, flags); + if (last_timer->resume) + last_timer->resume(); + cur_timer = last_timer; + last_timer = NULL; touch_softlockup_watchdog(); return 0; } @@ -343,6 +460,9 @@ static void __init hpet_time_init(void) printk("Using HPET for base-timer\n"); } + cur_timer = select_timer(); + printk(KERN_INFO "Using %s for high-res timesource\n",cur_timer->name); + time_init_hook(); } #endif @@ -364,5 +484,8 @@ void __init time_init(void) set_normalized_timespec(&wall_to_monotonic, -xtime.tv_sec, -xtime.tv_nsec); + cur_timer = select_timer(); + printk(KERN_INFO "Using %s for high-res timesource\n",cur_timer->name); + time_init_hook(); } diff --git a/trunk/arch/i386/kernel/timers/Makefile b/trunk/arch/i386/kernel/timers/Makefile new file mode 100644 index 000000000000..8fa12be658dd --- /dev/null +++ b/trunk/arch/i386/kernel/timers/Makefile @@ -0,0 +1,9 @@ +# +# Makefile for x86 timers +# + +obj-y := timer.o timer_none.o timer_tsc.o timer_pit.o common.o + +obj-$(CONFIG_X86_CYCLONE_TIMER) += timer_cyclone.o +obj-$(CONFIG_HPET_TIMER) += timer_hpet.o +obj-$(CONFIG_X86_PM_TIMER) += timer_pm.o diff --git a/trunk/arch/i386/kernel/timers/common.c b/trunk/arch/i386/kernel/timers/common.c new file mode 100644 index 000000000000..8163fe0cf1f0 --- /dev/null +++ b/trunk/arch/i386/kernel/timers/common.c @@ -0,0 +1,172 @@ +/* + * Common functions used across the timers go here + */ + +#include +#include +#include +#include +#include + +#include +#include +#include + +#include "mach_timer.h" + +/* ------ Calibrate the TSC ------- + * Return 2^32 * (1 / (TSC clocks per usec)) for do_fast_gettimeoffset(). + * Too much 64-bit arithmetic here to do this cleanly in C, and for + * accuracy's sake we want to keep the overhead on the CTC speaker (channel 2) + * output busy loop as low as possible. We avoid reading the CTC registers + * directly because of the awkward 8-bit access mechanism of the 82C54 + * device. + */ + +#define CALIBRATE_TIME (5 * 1000020/HZ) + +unsigned long calibrate_tsc(void) +{ + mach_prepare_counter(); + + { + unsigned long startlow, starthigh; + unsigned long endlow, endhigh; + unsigned long count; + + rdtsc(startlow,starthigh); + mach_countup(&count); + rdtsc(endlow,endhigh); + + + /* Error: ECTCNEVERSET */ + if (count <= 1) + goto bad_ctc; + + /* 64-bit subtract - gcc just messes up with long longs */ + __asm__("subl %2,%0\n\t" + "sbbl %3,%1" + :"=a" (endlow), "=d" (endhigh) + :"g" (startlow), "g" (starthigh), + "0" (endlow), "1" (endhigh)); + + /* Error: ECPUTOOFAST */ + if (endhigh) + goto bad_ctc; + + /* Error: ECPUTOOSLOW */ + if (endlow <= CALIBRATE_TIME) + goto bad_ctc; + + __asm__("divl %2" + :"=a" (endlow), "=d" (endhigh) + :"r" (endlow), "0" (0), "1" (CALIBRATE_TIME)); + + return endlow; + } + + /* + * The CTC wasn't reliable: we got a hit on the very first read, + * or the CPU was so fast/slow that the quotient wouldn't fit in + * 32 bits.. + */ +bad_ctc: + return 0; +} + +#ifdef CONFIG_HPET_TIMER +/* ------ Calibrate the TSC using HPET ------- + * Return 2^32 * (1 / (TSC clocks per usec)) for getting the CPU freq. + * Second output is parameter 1 (when non NULL) + * Set 2^32 * (1 / (tsc per HPET clk)) for delay_hpet(). + * calibrate_tsc() calibrates the processor TSC by comparing + * it to the HPET timer of known frequency. + * Too much 64-bit arithmetic here to do this cleanly in C + */ +#define CALIBRATE_CNT_HPET (5 * hpet_tick) +#define CALIBRATE_TIME_HPET (5 * KERNEL_TICK_USEC) + +unsigned long __devinit calibrate_tsc_hpet(unsigned long *tsc_hpet_quotient_ptr) +{ + unsigned long tsc_startlow, tsc_starthigh; + unsigned long tsc_endlow, tsc_endhigh; + unsigned long hpet_start, hpet_end; + unsigned long result, remain; + + hpet_start = hpet_readl(HPET_COUNTER); + rdtsc(tsc_startlow, tsc_starthigh); + do { + hpet_end = hpet_readl(HPET_COUNTER); + } while ((hpet_end - hpet_start) < CALIBRATE_CNT_HPET); + rdtsc(tsc_endlow, tsc_endhigh); + + /* 64-bit subtract - gcc just messes up with long longs */ + __asm__("subl %2,%0\n\t" + "sbbl %3,%1" + :"=a" (tsc_endlow), "=d" (tsc_endhigh) + :"g" (tsc_startlow), "g" (tsc_starthigh), + "0" (tsc_endlow), "1" (tsc_endhigh)); + + /* Error: ECPUTOOFAST */ + if (tsc_endhigh) + goto bad_calibration; + + /* Error: ECPUTOOSLOW */ + if (tsc_endlow <= CALIBRATE_TIME_HPET) + goto bad_calibration; + + ASM_DIV64_REG(result, remain, tsc_endlow, 0, CALIBRATE_TIME_HPET); + if (remain > (tsc_endlow >> 1)) + result++; /* rounding the result */ + + if (tsc_hpet_quotient_ptr) { + unsigned long tsc_hpet_quotient; + + ASM_DIV64_REG(tsc_hpet_quotient, remain, tsc_endlow, 0, + CALIBRATE_CNT_HPET); + if (remain > (tsc_endlow >> 1)) + tsc_hpet_quotient++; /* rounding the result */ + *tsc_hpet_quotient_ptr = tsc_hpet_quotient; + } + + return result; +bad_calibration: + /* + * the CPU was so fast/slow that the quotient wouldn't fit in + * 32 bits.. + */ + return 0; +} +#endif + + +unsigned long read_timer_tsc(void) +{ + unsigned long retval; + rdtscl(retval); + return retval; +} + + +/* calculate cpu_khz */ +void init_cpu_khz(void) +{ + if (cpu_has_tsc) { + unsigned long tsc_quotient = calibrate_tsc(); + if (tsc_quotient) { + /* report CPU clock rate in Hz. + * The formula is (10^6 * 2^32) / (2^32 * 1 / (clocks/us)) = + * clock/second. Our precision is about 100 ppm. + */ + { unsigned long eax=0, edx=1000; + __asm__("divl %2" + :"=a" (cpu_khz), "=d" (edx) + :"r" (tsc_quotient), + "0" (eax), "1" (edx)); + printk("Detected %u.%03u MHz processor.\n", + cpu_khz / 1000, cpu_khz % 1000); + } + } + } +} + diff --git a/trunk/arch/i386/kernel/timers/timer.c b/trunk/arch/i386/kernel/timers/timer.c new file mode 100644 index 000000000000..7e39ed8e33f8 --- /dev/null +++ b/trunk/arch/i386/kernel/timers/timer.c @@ -0,0 +1,75 @@ +#include +#include +#include +#include + +#ifdef CONFIG_HPET_TIMER +/* + * HPET memory read is slower than tsc reads, but is more dependable as it + * always runs at constant frequency and reduces complexity due to + * cpufreq. So, we prefer HPET timer to tsc based one. Also, we cannot use + * timer_pit when HPET is active. So, we default to timer_tsc. + */ +#endif +/* list of timers, ordered by preference, NULL terminated */ +static struct init_timer_opts* __initdata timers[] = { +#ifdef CONFIG_X86_CYCLONE_TIMER + &timer_cyclone_init, +#endif +#ifdef CONFIG_HPET_TIMER + &timer_hpet_init, +#endif +#ifdef CONFIG_X86_PM_TIMER + &timer_pmtmr_init, +#endif + &timer_tsc_init, + &timer_pit_init, + NULL, +}; + +static char clock_override[10] __initdata; + +static int __init clock_setup(char* str) +{ + if (str) + strlcpy(clock_override, str, sizeof(clock_override)); + return 1; +} +__setup("clock=", clock_setup); + + +/* The chosen timesource has been found to be bad. + * Fall back to a known good timesource (the PIT) + */ +void clock_fallback(void) +{ + cur_timer = &timer_pit; +} + +/* iterates through the list of timers, returning the first + * one that initializes successfully. + */ +struct timer_opts* __init select_timer(void) +{ + int i = 0; + + /* find most preferred working timer */ + while (timers[i]) { + if (timers[i]->init) + if (timers[i]->init(clock_override) == 0) + return timers[i]->opts; + ++i; + } + + panic("select_timer: Cannot find a suitable timer\n"); + return NULL; +} + +int read_current_timer(unsigned long *timer_val) +{ + if (cur_timer->read_timer) { + *timer_val = cur_timer->read_timer(); + return 0; + } + return -1; +} diff --git a/trunk/arch/i386/kernel/timers/timer_cyclone.c b/trunk/arch/i386/kernel/timers/timer_cyclone.c new file mode 100644 index 000000000000..13892a65c941 --- /dev/null +++ b/trunk/arch/i386/kernel/timers/timer_cyclone.c @@ -0,0 +1,259 @@ +/* Cyclone-timer: + * This code implements timer_ops for the cyclone counter found + * on IBM x440, x360, and other Summit based systems. + * + * Copyright (C) 2002 IBM, John Stultz (johnstul@us.ibm.com) + */ + + +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include + +#include "io_ports.h" + +/* Number of usecs that the last interrupt was delayed */ +static int delay_at_last_interrupt; + +#define CYCLONE_CBAR_ADDR 0xFEB00CD0 +#define CYCLONE_PMCC_OFFSET 0x51A0 +#define CYCLONE_MPMC_OFFSET 0x51D0 +#define CYCLONE_MPCS_OFFSET 0x51A8 +#define CYCLONE_TIMER_FREQ 100000000 +#define CYCLONE_TIMER_MASK (((u64)1<<40)-1) /* 40 bit mask */ +int use_cyclone = 0; + +static u32* volatile cyclone_timer; /* Cyclone MPMC0 register */ +static u32 last_cyclone_low; +static u32 last_cyclone_high; +static unsigned long long monotonic_base; +static seqlock_t monotonic_lock = SEQLOCK_UNLOCKED; + +/* helper macro to atomically read both cyclone counter registers */ +#define read_cyclone_counter(low,high) \ + do{ \ + high = cyclone_timer[1]; low = cyclone_timer[0]; \ + } while (high != cyclone_timer[1]); + + +static void mark_offset_cyclone(void) +{ + unsigned long lost, delay; + unsigned long delta = last_cyclone_low; + int count; + unsigned long long this_offset, last_offset; + + write_seqlock(&monotonic_lock); + last_offset = ((unsigned long long)last_cyclone_high<<32)|last_cyclone_low; + + spin_lock(&i8253_lock); + read_cyclone_counter(last_cyclone_low,last_cyclone_high); + + /* read values for delay_at_last_interrupt */ + outb_p(0x00, 0x43); /* latch the count ASAP */ + + count = inb_p(0x40); /* read the latched count */ + count |= inb(0x40) << 8; + + /* + * VIA686a test code... reset the latch if count > max + 1 + * from timer_pit.c - cjb + */ + if (count > LATCH) { + outb_p(0x34, PIT_MODE); + outb_p(LATCH & 0xff, PIT_CH0); + outb(LATCH >> 8, PIT_CH0); + count = LATCH - 1; + } + spin_unlock(&i8253_lock); + + /* lost tick compensation */ + delta = last_cyclone_low - delta; + delta /= (CYCLONE_TIMER_FREQ/1000000); + delta += delay_at_last_interrupt; + lost = delta/(1000000/HZ); + delay = delta%(1000000/HZ); + if (lost >= 2) + jiffies_64 += lost-1; + + /* update the monotonic base value */ + this_offset = ((unsigned long long)last_cyclone_high<<32)|last_cyclone_low; + monotonic_base += (this_offset - last_offset) & CYCLONE_TIMER_MASK; + write_sequnlock(&monotonic_lock); + + /* calculate delay_at_last_interrupt */ + count = ((LATCH-1) - count) * TICK_SIZE; + delay_at_last_interrupt = (count + LATCH/2) / LATCH; + + + /* catch corner case where tick rollover occured + * between cyclone and pit reads (as noted when + * usec delta is > 90% # of usecs/tick) + */ + if (lost && abs(delay - delay_at_last_interrupt) > (900000/HZ)) + jiffies_64++; +} + +static unsigned long get_offset_cyclone(void) +{ + u32 offset; + + if(!cyclone_timer) + return delay_at_last_interrupt; + + /* Read the cyclone timer */ + offset = cyclone_timer[0]; + + /* .. relative to previous jiffy */ + offset = offset - last_cyclone_low; + + /* convert cyclone ticks to microseconds */ + /* XXX slow, can we speed this up? */ + offset = offset/(CYCLONE_TIMER_FREQ/1000000); + + /* our adjusted time offset in microseconds */ + return delay_at_last_interrupt + offset; +} + +static unsigned long long monotonic_clock_cyclone(void) +{ + u32 now_low, now_high; + unsigned long long last_offset, this_offset, base; + unsigned long long ret; + unsigned seq; + + /* atomically read monotonic base & last_offset */ + do { + seq = read_seqbegin(&monotonic_lock); + last_offset = ((unsigned long long)last_cyclone_high<<32)|last_cyclone_low; + base = monotonic_base; + } while (read_seqretry(&monotonic_lock, seq)); + + + /* Read the cyclone counter */ + read_cyclone_counter(now_low,now_high); + this_offset = ((unsigned long long)now_high<<32)|now_low; + + /* convert to nanoseconds */ + ret = base + ((this_offset - last_offset)&CYCLONE_TIMER_MASK); + return ret * (1000000000 / CYCLONE_TIMER_FREQ); +} + +static int __init init_cyclone(char* override) +{ + u32* reg; + u32 base; /* saved cyclone base address */ + u32 pageaddr; /* page that contains cyclone_timer register */ + u32 offset; /* offset from pageaddr to cyclone_timer register */ + int i; + + /* check clock override */ + if (override[0] && strncmp(override,"cyclone",7)) + return -ENODEV; + + /*make sure we're on a summit box*/ + if(!use_cyclone) return -ENODEV; + + printk(KERN_INFO "Summit chipset: Starting Cyclone Counter.\n"); + + /* find base address */ + pageaddr = (CYCLONE_CBAR_ADDR)&PAGE_MASK; + offset = (CYCLONE_CBAR_ADDR)&(~PAGE_MASK); + set_fixmap_nocache(FIX_CYCLONE_TIMER, pageaddr); + reg = (u32*)(fix_to_virt(FIX_CYCLONE_TIMER) + offset); + if(!reg){ + printk(KERN_ERR "Summit chipset: Could not find valid CBAR register.\n"); + return -ENODEV; + } + base = *reg; + if(!base){ + printk(KERN_ERR "Summit chipset: Could not find valid CBAR value.\n"); + return -ENODEV; + } + + /* setup PMCC */ + pageaddr = (base + CYCLONE_PMCC_OFFSET)&PAGE_MASK; + offset = (base + CYCLONE_PMCC_OFFSET)&(~PAGE_MASK); + set_fixmap_nocache(FIX_CYCLONE_TIMER, pageaddr); + reg = (u32*)(fix_to_virt(FIX_CYCLONE_TIMER) + offset); + if(!reg){ + printk(KERN_ERR "Summit chipset: Could not find valid PMCC register.\n"); + return -ENODEV; + } + reg[0] = 0x00000001; + + /* setup MPCS */ + pageaddr = (base + CYCLONE_MPCS_OFFSET)&PAGE_MASK; + offset = (base + CYCLONE_MPCS_OFFSET)&(~PAGE_MASK); + set_fixmap_nocache(FIX_CYCLONE_TIMER, pageaddr); + reg = (u32*)(fix_to_virt(FIX_CYCLONE_TIMER) + offset); + if(!reg){ + printk(KERN_ERR "Summit chipset: Could not find valid MPCS register.\n"); + return -ENODEV; + } + reg[0] = 0x00000001; + + /* map in cyclone_timer */ + pageaddr = (base + CYCLONE_MPMC_OFFSET)&PAGE_MASK; + offset = (base + CYCLONE_MPMC_OFFSET)&(~PAGE_MASK); + set_fixmap_nocache(FIX_CYCLONE_TIMER, pageaddr); + cyclone_timer = (u32*)(fix_to_virt(FIX_CYCLONE_TIMER) + offset); + if(!cyclone_timer){ + printk(KERN_ERR "Summit chipset: Could not find valid MPMC register.\n"); + return -ENODEV; + } + + /*quick test to make sure its ticking*/ + for(i=0; i<3; i++){ + u32 old = cyclone_timer[0]; + int stall = 100; + while(stall--) barrier(); + if(cyclone_timer[0] == old){ + printk(KERN_ERR "Summit chipset: Counter not counting! DISABLED\n"); + cyclone_timer = 0; + return -ENODEV; + } + } + + init_cpu_khz(); + + /* Everything looks good! */ + return 0; +} + + +static void delay_cyclone(unsigned long loops) +{ + unsigned long bclock, now; + if(!cyclone_timer) + return; + bclock = cyclone_timer[0]; + do { + rep_nop(); + now = cyclone_timer[0]; + } while ((now-bclock) < loops); +} +/************************************************************/ + +/* cyclone timer_opts struct */ +static struct timer_opts timer_cyclone = { + .name = "cyclone", + .mark_offset = mark_offset_cyclone, + .get_offset = get_offset_cyclone, + .monotonic_clock = monotonic_clock_cyclone, + .delay = delay_cyclone, +}; + +struct init_timer_opts __initdata timer_cyclone_init = { + .init = init_cyclone, + .opts = &timer_cyclone, +}; diff --git a/trunk/arch/i386/kernel/timers/timer_hpet.c b/trunk/arch/i386/kernel/timers/timer_hpet.c new file mode 100644 index 000000000000..17a6fe7166e7 --- /dev/null +++ b/trunk/arch/i386/kernel/timers/timer_hpet.c @@ -0,0 +1,217 @@ +/* + * This code largely moved from arch/i386/kernel/time.c. + * See comments there for proper credits. + */ + +#include +#include +#include +#include +#include +#include + +#include +#include +#include + +#include "io_ports.h" +#include "mach_timer.h" +#include + +static unsigned long hpet_usec_quotient __read_mostly; /* convert hpet clks to usec */ +static unsigned long tsc_hpet_quotient __read_mostly; /* convert tsc to hpet clks */ +static unsigned long hpet_last; /* hpet counter value at last tick*/ +static unsigned long last_tsc_low; /* lsb 32 bits of Time Stamp Counter */ +static unsigned long last_tsc_high; /* msb 32 bits of Time Stamp Counter */ +static unsigned long long monotonic_base; +static seqlock_t monotonic_lock = SEQLOCK_UNLOCKED; + +/* convert from cycles(64bits) => nanoseconds (64bits) + * basic equation: + * ns = cycles / (freq / ns_per_sec) + * ns = cycles * (ns_per_sec / freq) + * ns = cycles * (10^9 / (cpu_khz * 10^3)) + * ns = cycles * (10^6 / cpu_khz) + * + * Then we use scaling math (suggested by george@mvista.com) to get: + * ns = cycles * (10^6 * SC / cpu_khz) / SC + * ns = cycles * cyc2ns_scale / SC + * + * And since SC is a constant power of two, we can convert the div + * into a shift. + * + * We can use khz divisor instead of mhz to keep a better percision, since + * cyc2ns_scale is limited to 10^6 * 2^10, which fits in 32 bits. + * (mathieu.desnoyers@polymtl.ca) + * + * -johnstul@us.ibm.com "math is hard, lets go shopping!" + */ +static unsigned long cyc2ns_scale __read_mostly; +#define CYC2NS_SCALE_FACTOR 10 /* 2^10, carefully chosen */ + +static inline void set_cyc2ns_scale(unsigned long cpu_khz) +{ + cyc2ns_scale = (1000000 << CYC2NS_SCALE_FACTOR)/cpu_khz; +} + +static inline unsigned long long cycles_2_ns(unsigned long long cyc) +{ + return (cyc * cyc2ns_scale) >> CYC2NS_SCALE_FACTOR; +} + +static unsigned long long monotonic_clock_hpet(void) +{ + unsigned long long last_offset, this_offset, base; + unsigned seq; + + /* atomically read monotonic base & last_offset */ + do { + seq = read_seqbegin(&monotonic_lock); + last_offset = ((unsigned long long)last_tsc_high<<32)|last_tsc_low; + base = monotonic_base; + } while (read_seqretry(&monotonic_lock, seq)); + + /* Read the Time Stamp Counter */ + rdtscll(this_offset); + + /* return the value in ns */ + return base + cycles_2_ns(this_offset - last_offset); +} + +static unsigned long get_offset_hpet(void) +{ + register unsigned long eax, edx; + + eax = hpet_readl(HPET_COUNTER); + eax -= hpet_last; /* hpet delta */ + eax = min(hpet_tick, eax); + /* + * Time offset = (hpet delta) * ( usecs per HPET clock ) + * = (hpet delta) * ( usecs per tick / HPET clocks per tick) + * = (hpet delta) * ( hpet_usec_quotient ) / (2^32) + * + * Where, + * hpet_usec_quotient = (2^32 * usecs per tick)/HPET clocks per tick + * + * Using a mull instead of a divl saves some cycles in critical path. + */ + ASM_MUL64_REG(eax, edx, hpet_usec_quotient, eax); + + /* our adjusted time offset in microseconds */ + return edx; +} + +static void mark_offset_hpet(void) +{ + unsigned long long this_offset, last_offset; + unsigned long offset; + + write_seqlock(&monotonic_lock); + last_offset = ((unsigned long long)last_tsc_high<<32)|last_tsc_low; + rdtsc(last_tsc_low, last_tsc_high); + + if (hpet_use_timer) + offset = hpet_readl(HPET_T0_CMP) - hpet_tick; + else + offset = hpet_readl(HPET_COUNTER); + if (unlikely(((offset - hpet_last) >= (2*hpet_tick)) && (hpet_last != 0))) { + int lost_ticks = ((offset - hpet_last) / hpet_tick) - 1; + jiffies_64 += lost_ticks; + } + hpet_last = offset; + + /* update the monotonic base value */ + this_offset = ((unsigned long long)last_tsc_high<<32)|last_tsc_low; + monotonic_base += cycles_2_ns(this_offset - last_offset); + write_sequnlock(&monotonic_lock); +} + +static void delay_hpet(unsigned long loops) +{ + unsigned long hpet_start, hpet_end; + unsigned long eax; + + /* loops is the number of cpu cycles. Convert it to hpet clocks */ + ASM_MUL64_REG(eax, loops, tsc_hpet_quotient, loops); + + hpet_start = hpet_readl(HPET_COUNTER); + do { + rep_nop(); + hpet_end = hpet_readl(HPET_COUNTER); + } while ((hpet_end - hpet_start) < (loops)); +} + +static struct timer_opts timer_hpet; + +static int __init init_hpet(char* override) +{ + unsigned long result, remain; + + /* check clock override */ + if (override[0] && strncmp(override,"hpet",4)) + return -ENODEV; + + if (!is_hpet_enabled()) + return -ENODEV; + + printk("Using HPET for gettimeofday\n"); + if (cpu_has_tsc) { + unsigned long tsc_quotient = calibrate_tsc_hpet(&tsc_hpet_quotient); + if (tsc_quotient) { + /* report CPU clock rate in Hz. + * The formula is (10^6 * 2^32) / (2^32 * 1 / (clocks/us)) = + * clock/second. Our precision is about 100 ppm. + */ + { unsigned long eax=0, edx=1000; + ASM_DIV64_REG(cpu_khz, edx, tsc_quotient, + eax, edx); + printk("Detected %u.%03u MHz processor.\n", + cpu_khz / 1000, cpu_khz % 1000); + } + set_cyc2ns_scale(cpu_khz); + } + /* set this only when cpu_has_tsc */ + timer_hpet.read_timer = read_timer_tsc; + } + + /* + * Math to calculate hpet to usec multiplier + * Look for the comments at get_offset_hpet() + */ + ASM_DIV64_REG(result, remain, hpet_tick, 0, KERNEL_TICK_USEC); + if (remain > (hpet_tick >> 1)) + result++; /* rounding the result */ + hpet_usec_quotient = result; + + return 0; +} + +static int hpet_resume(void) +{ + write_seqlock(&monotonic_lock); + /* Assume this is the last mark offset time */ + rdtsc(last_tsc_low, last_tsc_high); + + if (hpet_use_timer) + hpet_last = hpet_readl(HPET_T0_CMP) - hpet_tick; + else + hpet_last = hpet_readl(HPET_COUNTER); + write_sequnlock(&monotonic_lock); + return 0; +} +/************************************************************/ + +/* tsc timer_opts struct */ +static struct timer_opts timer_hpet __read_mostly = { + .name = "hpet", + .mark_offset = mark_offset_hpet, + .get_offset = get_offset_hpet, + .monotonic_clock = monotonic_clock_hpet, + .delay = delay_hpet, + .resume = hpet_resume, +}; + +struct init_timer_opts __initdata timer_hpet_init = { + .init = init_hpet, + .opts = &timer_hpet, +}; diff --git a/trunk/arch/i386/kernel/timers/timer_none.c b/trunk/arch/i386/kernel/timers/timer_none.c new file mode 100644 index 000000000000..4ea2f414dbbd --- /dev/null +++ b/trunk/arch/i386/kernel/timers/timer_none.c @@ -0,0 +1,39 @@ +#include +#include + +static void mark_offset_none(void) +{ + /* nothing needed */ +} + +static unsigned long get_offset_none(void) +{ + return 0; +} + +static unsigned long long monotonic_clock_none(void) +{ + return 0; +} + +static void delay_none(unsigned long loops) +{ + int d0; + __asm__ __volatile__( + "\tjmp 1f\n" + ".align 16\n" + "1:\tjmp 2f\n" + ".align 16\n" + "2:\tdecl %0\n\tjns 2b" + :"=&a" (d0) + :"0" (loops)); +} + +/* none timer_opts struct */ +struct timer_opts timer_none = { + .name = "none", + .mark_offset = mark_offset_none, + .get_offset = get_offset_none, + .monotonic_clock = monotonic_clock_none, + .delay = delay_none, +}; diff --git a/trunk/arch/i386/kernel/timers/timer_pit.c b/trunk/arch/i386/kernel/timers/timer_pit.c new file mode 100644 index 000000000000..b9b6bd56b9ba --- /dev/null +++ b/trunk/arch/i386/kernel/timers/timer_pit.c @@ -0,0 +1,177 @@ +/* + * This code largely moved from arch/i386/kernel/time.c. + * See comments there for proper credits. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "do_timer.h" +#include "io_ports.h" + +static int count_p; /* counter in get_offset_pit() */ + +static int __init init_pit(char* override) +{ + /* check clock override */ + if (override[0] && strncmp(override,"pit",3)) + printk(KERN_ERR "Warning: clock= override failed. Defaulting " + "to PIT\n"); + init_cpu_khz(); + count_p = LATCH; + return 0; +} + +static void mark_offset_pit(void) +{ + /* nothing needed */ +} + +static unsigned long long monotonic_clock_pit(void) +{ + return 0; +} + +static void delay_pit(unsigned long loops) +{ + int d0; + __asm__ __volatile__( + "\tjmp 1f\n" + ".align 16\n" + "1:\tjmp 2f\n" + ".align 16\n" + "2:\tdecl %0\n\tjns 2b" + :"=&a" (d0) + :"0" (loops)); +} + + +/* This function must be called with xtime_lock held. + * It was inspired by Steve McCanne's microtime-i386 for BSD. -- jrs + * + * However, the pc-audio speaker driver changes the divisor so that + * it gets interrupted rather more often - it loads 64 into the + * counter rather than 11932! This has an adverse impact on + * do_gettimeoffset() -- it stops working! What is also not + * good is that the interval that our timer function gets called + * is no longer 10.0002 ms, but 9.9767 ms. To get around this + * would require using a different timing source. Maybe someone + * could use the RTC - I know that this can interrupt at frequencies + * ranging from 8192Hz to 2Hz. If I had the energy, I'd somehow fix + * it so that at startup, the timer code in sched.c would select + * using either the RTC or the 8253 timer. The decision would be + * based on whether there was any other device around that needed + * to trample on the 8253. I'd set up the RTC to interrupt at 1024 Hz, + * and then do some jiggery to have a version of do_timer that + * advanced the clock by 1/1024 s. Every time that reached over 1/100 + * of a second, then do all the old code. If the time was kept correct + * then do_gettimeoffset could just return 0 - there is no low order + * divider that can be accessed. + * + * Ideally, you would be able to use the RTC for the speaker driver, + * but it appears that the speaker driver really needs interrupt more + * often than every 120 us or so. + * + * Anyway, this needs more thought.... pjsg (1993-08-28) + * + * If you are really that interested, you should be reading + * comp.protocols.time.ntp! + */ + +static unsigned long get_offset_pit(void) +{ + int count; + unsigned long flags; + static unsigned long jiffies_p = 0; + + /* + * cache volatile jiffies temporarily; we have xtime_lock. + */ + unsigned long jiffies_t; + + spin_lock_irqsave(&i8253_lock, flags); + /* timer count may underflow right here */ + outb_p(0x00, PIT_MODE); /* latch the count ASAP */ + + count = inb_p(PIT_CH0); /* read the latched count */ + + /* + * We do this guaranteed double memory access instead of a _p + * postfix in the previous port access. Wheee, hackady hack + */ + jiffies_t = jiffies; + + count |= inb_p(PIT_CH0) << 8; + + /* VIA686a test code... reset the latch if count > max + 1 */ + if (count > LATCH) { + outb_p(0x34, PIT_MODE); + outb_p(LATCH & 0xff, PIT_CH0); + outb(LATCH >> 8, PIT_CH0); + count = LATCH - 1; + } + + /* + * avoiding timer inconsistencies (they are rare, but they happen)... + * there are two kinds of problems that must be avoided here: + * 1. the timer counter underflows + * 2. hardware problem with the timer, not giving us continuous time, + * the counter does small "jumps" upwards on some Pentium systems, + * (see c't 95/10 page 335 for Neptun bug.) + */ + + if( jiffies_t == jiffies_p ) { + if( count > count_p ) { + /* the nutcase */ + count = do_timer_overflow(count); + } + } else + jiffies_p = jiffies_t; + + count_p = count; + + spin_unlock_irqrestore(&i8253_lock, flags); + + count = ((LATCH-1) - count) * TICK_SIZE; + count = (count + LATCH/2) / LATCH; + + return count; +} + + +/* tsc timer_opts struct */ +struct timer_opts timer_pit = { + .name = "pit", + .mark_offset = mark_offset_pit, + .get_offset = get_offset_pit, + .monotonic_clock = monotonic_clock_pit, + .delay = delay_pit, +}; + +struct init_timer_opts __initdata timer_pit_init = { + .init = init_pit, + .opts = &timer_pit, +}; + +void setup_pit_timer(void) +{ + unsigned long flags; + + spin_lock_irqsave(&i8253_lock, flags); + outb_p(0x34,PIT_MODE); /* binary, mode 2, LSB/MSB, ch 0 */ + udelay(10); + outb_p(LATCH & 0xff , PIT_CH0); /* LSB */ + udelay(10); + outb(LATCH >> 8 , PIT_CH0); /* MSB */ + spin_unlock_irqrestore(&i8253_lock, flags); +} diff --git a/trunk/arch/i386/kernel/timers/timer_pm.c b/trunk/arch/i386/kernel/timers/timer_pm.c new file mode 100644 index 000000000000..144e94a04933 --- /dev/null +++ b/trunk/arch/i386/kernel/timers/timer_pm.c @@ -0,0 +1,342 @@ +/* + * (C) Dominik Brodowski 2003 + * + * Driver to use the Power Management Timer (PMTMR) available in some + * southbridges as primary timing source for the Linux kernel. + * + * Based on parts of linux/drivers/acpi/hardware/hwtimer.c, timer_pit.c, + * timer_hpet.c, and on Arjan van de Ven's implementation for 2.4. + * + * This file is licensed under the GPL v2. + */ + + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include "mach_timer.h" + +/* Number of PMTMR ticks expected during calibration run */ +#define PMTMR_TICKS_PER_SEC 3579545 +#define PMTMR_EXPECTED_RATE \ + ((CALIBRATE_LATCH * (PMTMR_TICKS_PER_SEC >> 10)) / (CLOCK_TICK_RATE>>10)) + + +/* The I/O port the PMTMR resides at. + * The location is detected during setup_arch(), + * in arch/i386/acpi/boot.c */ +u32 pmtmr_ioport = 0; + + +/* value of the Power timer at last timer interrupt */ +static u32 offset_tick; +static u32 offset_delay; + +static unsigned long long monotonic_base; +static seqlock_t monotonic_lock = SEQLOCK_UNLOCKED; + +#define ACPI_PM_MASK 0xFFFFFF /* limit it to 24 bits */ + +static int pmtmr_need_workaround __read_mostly = 1; + +/*helper function to safely read acpi pm timesource*/ +static inline u32 read_pmtmr(void) +{ + if (pmtmr_need_workaround) { + u32 v1, v2, v3; + + /* It has been reported that because of various broken + * chipsets (ICH4, PIIX4 and PIIX4E) where the ACPI PM time + * source is not latched, so you must read it multiple + * times to insure a safe value is read. + */ + do { + v1 = inl(pmtmr_ioport); + v2 = inl(pmtmr_ioport); + v3 = inl(pmtmr_ioport); + } while ((v1 > v2 && v1 < v3) || (v2 > v3 && v2 < v1) + || (v3 > v1 && v3 < v2)); + + /* mask the output to 24 bits */ + return v2 & ACPI_PM_MASK; + } + + return inl(pmtmr_ioport) & ACPI_PM_MASK; +} + + +/* + * Some boards have the PMTMR running way too fast. We check + * the PMTMR rate against PIT channel 2 to catch these cases. + */ +static int verify_pmtmr_rate(void) +{ + u32 value1, value2; + unsigned long count, delta; + + mach_prepare_counter(); + value1 = read_pmtmr(); + mach_countup(&count); + value2 = read_pmtmr(); + delta = (value2 - value1) & ACPI_PM_MASK; + + /* Check that the PMTMR delta is within 5% of what we expect */ + if (delta < (PMTMR_EXPECTED_RATE * 19) / 20 || + delta > (PMTMR_EXPECTED_RATE * 21) / 20) { + printk(KERN_INFO "PM-Timer running at invalid rate: %lu%% of normal - aborting.\n", 100UL * delta / PMTMR_EXPECTED_RATE); + return -1; + } + + return 0; +} + + +static int init_pmtmr(char* override) +{ + u32 value1, value2; + unsigned int i; + + if (override[0] && strncmp(override,"pmtmr",5)) + return -ENODEV; + + if (!pmtmr_ioport) + return -ENODEV; + + /* we use the TSC for delay_pmtmr, so make sure it exists */ + if (!cpu_has_tsc) + return -ENODEV; + + /* "verify" this timing source */ + value1 = read_pmtmr(); + for (i = 0; i < 10000; i++) { + value2 = read_pmtmr(); + if (value2 == value1) + continue; + if (value2 > value1) + goto pm_good; + if ((value2 < value1) && ((value2) < 0xFFF)) + goto pm_good; + printk(KERN_INFO "PM-Timer had inconsistent results: 0x%#x, 0x%#x - aborting.\n", value1, value2); + return -EINVAL; + } + printk(KERN_INFO "PM-Timer had no reasonable result: 0x%#x - aborting.\n", value1); + return -ENODEV; + +pm_good: + if (verify_pmtmr_rate() != 0) + return -ENODEV; + + init_cpu_khz(); + return 0; +} + +static inline u32 cyc2us(u32 cycles) +{ + /* The Power Management Timer ticks at 3.579545 ticks per microsecond. + * 1 / PM_TIMER_FREQUENCY == 0.27936511 =~ 286/1024 [error: 0.024%] + * + * Even with HZ = 100, delta is at maximum 35796 ticks, so it can + * easily be multiplied with 286 (=0x11E) without having to fear + * u32 overflows. + */ + cycles *= 286; + return (cycles >> 10); +} + +/* + * this gets called during each timer interrupt + * - Called while holding the writer xtime_lock + */ +static void mark_offset_pmtmr(void) +{ + u32 lost, delta, last_offset; + static int first_run = 1; + last_offset = offset_tick; + + write_seqlock(&monotonic_lock); + + offset_tick = read_pmtmr(); + + /* calculate tick interval */ + delta = (offset_tick - last_offset) & ACPI_PM_MASK; + + /* convert to usecs */ + delta = cyc2us(delta); + + /* update the monotonic base value */ + monotonic_base += delta * NSEC_PER_USEC; + write_sequnlock(&monotonic_lock); + + /* convert to ticks */ + delta += offset_delay; + lost = delta / (USEC_PER_SEC / HZ); + offset_delay = delta % (USEC_PER_SEC / HZ); + + + /* compensate for lost ticks */ + if (lost >= 2) + jiffies_64 += lost - 1; + + /* don't calculate delay for first run, + or if we've got less then a tick */ + if (first_run || (lost < 1)) { + first_run = 0; + offset_delay = 0; + } +} + +static int pmtmr_resume(void) +{ + write_seqlock(&monotonic_lock); + /* Assume this is the last mark offset time */ + offset_tick = read_pmtmr(); + write_sequnlock(&monotonic_lock); + return 0; +} + +static unsigned long long monotonic_clock_pmtmr(void) +{ + u32 last_offset, this_offset; + unsigned long long base, ret; + unsigned seq; + + + /* atomically read monotonic base & last_offset */ + do { + seq = read_seqbegin(&monotonic_lock); + last_offset = offset_tick; + base = monotonic_base; + } while (read_seqretry(&monotonic_lock, seq)); + + /* Read the pmtmr */ + this_offset = read_pmtmr(); + + /* convert to nanoseconds */ + ret = (this_offset - last_offset) & ACPI_PM_MASK; + ret = base + (cyc2us(ret) * NSEC_PER_USEC); + return ret; +} + +static void delay_pmtmr(unsigned long loops) +{ + unsigned long bclock, now; + + rdtscl(bclock); + do + { + rep_nop(); + rdtscl(now); + } while ((now-bclock) < loops); +} + + +/* + * get the offset (in microseconds) from the last call to mark_offset() + * - Called holding a reader xtime_lock + */ +static unsigned long get_offset_pmtmr(void) +{ + u32 now, offset, delta = 0; + + offset = offset_tick; + now = read_pmtmr(); + delta = (now - offset)&ACPI_PM_MASK; + + return (unsigned long) offset_delay + cyc2us(delta); +} + + +/* acpi timer_opts struct */ +static struct timer_opts timer_pmtmr = { + .name = "pmtmr", + .mark_offset = mark_offset_pmtmr, + .get_offset = get_offset_pmtmr, + .monotonic_clock = monotonic_clock_pmtmr, + .delay = delay_pmtmr, + .read_timer = read_timer_tsc, + .resume = pmtmr_resume, +}; + +struct init_timer_opts __initdata timer_pmtmr_init = { + .init = init_pmtmr, + .opts = &timer_pmtmr, +}; + +#ifdef CONFIG_PCI +/* + * PIIX4 Errata: + * + * The power management timer may return improper results when read. + * Although the timer value settles properly after incrementing, + * while incrementing there is a 3 ns window every 69.8 ns where the + * timer value is indeterminate (a 4.2% chance that the data will be + * incorrect when read). As a result, the ACPI free running count up + * timer specification is violated due to erroneous reads. + */ +static int __init pmtmr_bug_check(void) +{ + static struct pci_device_id gray_list[] __initdata = { + /* these chipsets may have bug. */ + { PCI_DEVICE(PCI_VENDOR_ID_INTEL, + PCI_DEVICE_ID_INTEL_82801DB_0) }, + { }, + }; + struct pci_dev *dev; + int pmtmr_has_bug = 0; + u8 rev; + + if (cur_timer != &timer_pmtmr || !pmtmr_need_workaround) + return 0; + + dev = pci_get_device(PCI_VENDOR_ID_INTEL, + PCI_DEVICE_ID_INTEL_82371AB_3, NULL); + if (dev) { + pci_read_config_byte(dev, PCI_REVISION_ID, &rev); + /* the bug has been fixed in PIIX4M */ + if (rev < 3) { + printk(KERN_WARNING "* Found PM-Timer Bug on this " + "chipset. Due to workarounds for a bug,\n" + "* this time source is slow. Consider trying " + "other time sources (clock=)\n"); + pmtmr_has_bug = 1; + } + pci_dev_put(dev); + } + + if (pci_dev_present(gray_list)) { + printk(KERN_WARNING "* This chipset may have PM-Timer Bug. Due" + " to workarounds for a bug,\n" + "* this time source is slow. If you are sure your timer" + " does not have\n" + "* this bug, please use \"pmtmr_good\" to disable the " + "workaround\n"); + pmtmr_has_bug = 1; + } + + if (!pmtmr_has_bug) + pmtmr_need_workaround = 0; + + return 0; +} +device_initcall(pmtmr_bug_check); +#endif + +static int __init pmtr_good_setup(char *__str) +{ + pmtmr_need_workaround = 0; + return 1; +} +__setup("pmtmr_good", pmtr_good_setup); + +MODULE_LICENSE("GPL"); +MODULE_AUTHOR("Dominik Brodowski "); +MODULE_DESCRIPTION("Power Management Timer (PMTMR) as primary timing source for x86"); diff --git a/trunk/arch/i386/kernel/timers/timer_tsc.c b/trunk/arch/i386/kernel/timers/timer_tsc.c new file mode 100644 index 000000000000..f1187ddb0d0f --- /dev/null +++ b/trunk/arch/i386/kernel/timers/timer_tsc.c @@ -0,0 +1,617 @@ +/* + * This code largely moved from arch/i386/kernel/time.c. + * See comments there for proper credits. + * + * 2004-06-25 Jesper Juhl + * moved mark_offset_tsc below cpufreq_delayed_get to avoid gcc 3.4 + * failing to inline. + */ + +#include +#include +#include +#include +#include +#include +#include + +#include +#include +/* processor.h for distable_tsc flag */ +#include + +#include "io_ports.h" +#include "mach_timer.h" + +#include +#include + +#ifdef CONFIG_HPET_TIMER +static unsigned long hpet_usec_quotient; +static unsigned long hpet_last; +static struct timer_opts timer_tsc; +#endif + +static inline void cpufreq_delayed_get(void); + +int tsc_disable __devinitdata = 0; + +static int use_tsc; +/* Number of usecs that the last interrupt was delayed */ +static int delay_at_last_interrupt; + +static unsigned long last_tsc_low; /* lsb 32 bits of Time Stamp Counter */ +static unsigned long last_tsc_high; /* msb 32 bits of Time Stamp Counter */ +static unsigned long long monotonic_base; +static seqlock_t monotonic_lock = SEQLOCK_UNLOCKED; + +/* Avoid compensating for lost ticks before TSCs are synched */ +static int detect_lost_ticks; +static int __init start_lost_tick_compensation(void) +{ + detect_lost_ticks = 1; + return 0; +} +late_initcall(start_lost_tick_compensation); + +/* convert from cycles(64bits) => nanoseconds (64bits) + * basic equation: + * ns = cycles / (freq / ns_per_sec) + * ns = cycles * (ns_per_sec / freq) + * ns = cycles * (10^9 / (cpu_khz * 10^3)) + * ns = cycles * (10^6 / cpu_khz) + * + * Then we use scaling math (suggested by george@mvista.com) to get: + * ns = cycles * (10^6 * SC / cpu_khz) / SC + * ns = cycles * cyc2ns_scale / SC + * + * And since SC is a constant power of two, we can convert the div + * into a shift. + * + * We can use khz divisor instead of mhz to keep a better percision, since + * cyc2ns_scale is limited to 10^6 * 2^10, which fits in 32 bits. + * (mathieu.desnoyers@polymtl.ca) + * + * -johnstul@us.ibm.com "math is hard, lets go shopping!" + */ +static unsigned long cyc2ns_scale __read_mostly; +#define CYC2NS_SCALE_FACTOR 10 /* 2^10, carefully chosen */ + +static inline void set_cyc2ns_scale(unsigned long cpu_khz) +{ + cyc2ns_scale = (1000000 << CYC2NS_SCALE_FACTOR)/cpu_khz; +} + +static inline unsigned long long cycles_2_ns(unsigned long long cyc) +{ + return (cyc * cyc2ns_scale) >> CYC2NS_SCALE_FACTOR; +} + +static int count2; /* counter for mark_offset_tsc() */ + +/* Cached *multiplier* to convert TSC counts to microseconds. + * (see the equation below). + * Equal to 2^32 * (1 / (clocks per usec) ). + * Initialized in time_init. + */ +static unsigned long fast_gettimeoffset_quotient; + +static unsigned long get_offset_tsc(void) +{ + register unsigned long eax, edx; + + /* Read the Time Stamp Counter */ + + rdtsc(eax,edx); + + /* .. relative to previous jiffy (32 bits is enough) */ + eax -= last_tsc_low; /* tsc_low delta */ + + /* + * Time offset = (tsc_low delta) * fast_gettimeoffset_quotient + * = (tsc_low delta) * (usecs_per_clock) + * = (tsc_low delta) * (usecs_per_jiffy / clocks_per_jiffy) + * + * Using a mull instead of a divl saves up to 31 clock cycles + * in the critical path. + */ + + __asm__("mull %2" + :"=a" (eax), "=d" (edx) + :"rm" (fast_gettimeoffset_quotient), + "0" (eax)); + + /* our adjusted time offset in microseconds */ + return delay_at_last_interrupt + edx; +} + +static unsigned long long monotonic_clock_tsc(void) +{ + unsigned long long last_offset, this_offset, base; + unsigned seq; + + /* atomically read monotonic base & last_offset */ + do { + seq = read_seqbegin(&monotonic_lock); + last_offset = ((unsigned long long)last_tsc_high<<32)|last_tsc_low; + base = monotonic_base; + } while (read_seqretry(&monotonic_lock, seq)); + + /* Read the Time Stamp Counter */ + rdtscll(this_offset); + + /* return the value in ns */ + return base + cycles_2_ns(this_offset - last_offset); +} + +/* + * Scheduler clock - returns current time in nanosec units. + */ +unsigned long long sched_clock(void) +{ + unsigned long long this_offset; + + /* + * In the NUMA case we dont use the TSC as they are not + * synchronized across all CPUs. + */ +#ifndef CONFIG_NUMA + if (!use_tsc) +#endif + /* no locking but a rare wrong value is not a big deal */ + return jiffies_64 * (1000000000 / HZ); + + /* Read the Time Stamp Counter */ + rdtscll(this_offset); + + /* return the value in ns */ + return cycles_2_ns(this_offset); +} + +static void delay_tsc(unsigned long loops) +{ + unsigned long bclock, now; + + rdtscl(bclock); + do + { + rep_nop(); + rdtscl(now); + } while ((now-bclock) < loops); +} + +#ifdef CONFIG_HPET_TIMER +static void mark_offset_tsc_hpet(void) +{ + unsigned long long this_offset, last_offset; + unsigned long offset, temp, hpet_current; + + write_seqlock(&monotonic_lock); + last_offset = ((unsigned long long)last_tsc_high<<32)|last_tsc_low; + /* + * It is important that these two operations happen almost at + * the same time. We do the RDTSC stuff first, since it's + * faster. To avoid any inconsistencies, we need interrupts + * disabled locally. + */ + /* + * Interrupts are just disabled locally since the timer irq + * has the SA_INTERRUPT flag set. -arca + */ + /* read Pentium cycle counter */ + + hpet_current = hpet_readl(HPET_COUNTER); + rdtsc(last_tsc_low, last_tsc_high); + + /* lost tick compensation */ + offset = hpet_readl(HPET_T0_CMP) - hpet_tick; + if (unlikely(((offset - hpet_last) > hpet_tick) && (hpet_last != 0)) + && detect_lost_ticks) { + int lost_ticks = (offset - hpet_last) / hpet_tick; + jiffies_64 += lost_ticks; + } + hpet_last = hpet_current; + + /* update the monotonic base value */ + this_offset = ((unsigned long long)last_tsc_high<<32)|last_tsc_low; + monotonic_base += cycles_2_ns(this_offset - last_offset); + write_sequnlock(&monotonic_lock); + + /* calculate delay_at_last_interrupt */ + /* + * Time offset = (hpet delta) * ( usecs per HPET clock ) + * = (hpet delta) * ( usecs per tick / HPET clocks per tick) + * = (hpet delta) * ( hpet_usec_quotient ) / (2^32) + * Where, + * hpet_usec_quotient = (2^32 * usecs per tick)/HPET clocks per tick + */ + delay_at_last_interrupt = hpet_current - offset; + ASM_MUL64_REG(temp, delay_at_last_interrupt, + hpet_usec_quotient, delay_at_last_interrupt); +} +#endif + + +#ifdef CONFIG_CPU_FREQ +#include + +static unsigned int cpufreq_delayed_issched = 0; +static unsigned int cpufreq_init = 0; +static struct work_struct cpufreq_delayed_get_work; + +static void handle_cpufreq_delayed_get(void *v) +{ + unsigned int cpu; + for_each_online_cpu(cpu) { + cpufreq_get(cpu); + } + cpufreq_delayed_issched = 0; +} + +/* if we notice lost ticks, schedule a call to cpufreq_get() as it tries + * to verify the CPU frequency the timing core thinks the CPU is running + * at is still correct. + */ +static inline void cpufreq_delayed_get(void) +{ + if (cpufreq_init && !cpufreq_delayed_issched) { + cpufreq_delayed_issched = 1; + printk(KERN_DEBUG "Losing some ticks... checking if CPU frequency changed.\n"); + schedule_work(&cpufreq_delayed_get_work); + } +} + +/* If the CPU frequency is scaled, TSC-based delays will need a different + * loops_per_jiffy value to function properly. + */ + +static unsigned int ref_freq = 0; +static unsigned long loops_per_jiffy_ref = 0; + +#ifndef CONFIG_SMP +static unsigned long fast_gettimeoffset_ref = 0; +static unsigned int cpu_khz_ref = 0; +#endif + +static int +time_cpufreq_notifier(struct notifier_block *nb, unsigned long val, + void *data) +{ + struct cpufreq_freqs *freq = data; + + if (val != CPUFREQ_RESUMECHANGE && val != CPUFREQ_SUSPENDCHANGE) + write_seqlock_irq(&xtime_lock); + if (!ref_freq) { + if (!freq->old){ + ref_freq = freq->new; + goto end; + } + ref_freq = freq->old; + loops_per_jiffy_ref = cpu_data[freq->cpu].loops_per_jiffy; +#ifndef CONFIG_SMP + fast_gettimeoffset_ref = fast_gettimeoffset_quotient; + cpu_khz_ref = cpu_khz; +#endif + } + + if ((val == CPUFREQ_PRECHANGE && freq->old < freq->new) || + (val == CPUFREQ_POSTCHANGE && freq->old > freq->new) || + (val == CPUFREQ_RESUMECHANGE)) { + if (!(freq->flags & CPUFREQ_CONST_LOOPS)) + cpu_data[freq->cpu].loops_per_jiffy = cpufreq_scale(loops_per_jiffy_ref, ref_freq, freq->new); +#ifndef CONFIG_SMP + if (cpu_khz) + cpu_khz = cpufreq_scale(cpu_khz_ref, ref_freq, freq->new); + if (use_tsc) { + if (!(freq->flags & CPUFREQ_CONST_LOOPS)) { + fast_gettimeoffset_quotient = cpufreq_scale(fast_gettimeoffset_ref, freq->new, ref_freq); + set_cyc2ns_scale(cpu_khz); + } + } +#endif + } + +end: + if (val != CPUFREQ_RESUMECHANGE && val != CPUFREQ_SUSPENDCHANGE) + write_sequnlock_irq(&xtime_lock); + + return 0; +} + +static struct notifier_block time_cpufreq_notifier_block = { + .notifier_call = time_cpufreq_notifier +}; + + +static int __init cpufreq_tsc(void) +{ + int ret; + INIT_WORK(&cpufreq_delayed_get_work, handle_cpufreq_delayed_get, NULL); + ret = cpufreq_register_notifier(&time_cpufreq_notifier_block, + CPUFREQ_TRANSITION_NOTIFIER); + if (!ret) + cpufreq_init = 1; + return ret; +} +core_initcall(cpufreq_tsc); + +#else /* CONFIG_CPU_FREQ */ +static inline void cpufreq_delayed_get(void) { return; } +#endif + +int recalibrate_cpu_khz(void) +{ +#ifndef CONFIG_SMP + unsigned int cpu_khz_old = cpu_khz; + + if (cpu_has_tsc) { + local_irq_disable(); + init_cpu_khz(); + local_irq_enable(); + cpu_data[0].loops_per_jiffy = + cpufreq_scale(cpu_data[0].loops_per_jiffy, + cpu_khz_old, + cpu_khz); + return 0; + } else + return -ENODEV; +#else + return -ENODEV; +#endif +} +EXPORT_SYMBOL(recalibrate_cpu_khz); + +static void mark_offset_tsc(void) +{ + unsigned long lost,delay; + unsigned long delta = last_tsc_low; + int count; + int countmp; + static int count1 = 0; + unsigned long long this_offset, last_offset; + static int lost_count = 0; + + write_seqlock(&monotonic_lock); + last_offset = ((unsigned long long)last_tsc_high<<32)|last_tsc_low; + /* + * It is important that these two operations happen almost at + * the same time. We do the RDTSC stuff first, since it's + * faster. To avoid any inconsistencies, we need interrupts + * disabled locally. + */ + + /* + * Interrupts are just disabled locally since the timer irq + * has the SA_INTERRUPT flag set. -arca + */ + + /* read Pentium cycle counter */ + + rdtsc(last_tsc_low, last_tsc_high); + + spin_lock(&i8253_lock); + outb_p(0x00, PIT_MODE); /* latch the count ASAP */ + + count = inb_p(PIT_CH0); /* read the latched count */ + count |= inb(PIT_CH0) << 8; + + /* + * VIA686a test code... reset the latch if count > max + 1 + * from timer_pit.c - cjb + */ + if (count > LATCH) { + outb_p(0x34, PIT_MODE); + outb_p(LATCH & 0xff, PIT_CH0); + outb(LATCH >> 8, PIT_CH0); + count = LATCH - 1; + } + + spin_unlock(&i8253_lock); + + if (pit_latch_buggy) { + /* get center value of last 3 time lutch */ + if ((count2 >= count && count >= count1) + || (count1 >= count && count >= count2)) { + count2 = count1; count1 = count; + } else if ((count1 >= count2 && count2 >= count) + || (count >= count2 && count2 >= count1)) { + countmp = count;count = count2; + count2 = count1;count1 = countmp; + } else { + count2 = count1; count1 = count; count = count1; + } + } + + /* lost tick compensation */ + delta = last_tsc_low - delta; + { + register unsigned long eax, edx; + eax = delta; + __asm__("mull %2" + :"=a" (eax), "=d" (edx) + :"rm" (fast_gettimeoffset_quotient), + "0" (eax)); + delta = edx; + } + delta += delay_at_last_interrupt; + lost = delta/(1000000/HZ); + delay = delta%(1000000/HZ); + if (lost >= 2 && detect_lost_ticks) { + jiffies_64 += lost-1; + + /* sanity check to ensure we're not always losing ticks */ + if (lost_count++ > 100) { + printk(KERN_WARNING "Losing too many ticks!\n"); + printk(KERN_WARNING "TSC cannot be used as a timesource. \n"); + printk(KERN_WARNING "Possible reasons for this are:\n"); + printk(KERN_WARNING " You're running with Speedstep,\n"); + printk(KERN_WARNING " You don't have DMA enabled for your hard disk (see hdparm),\n"); + printk(KERN_WARNING " Incorrect TSC synchronization on an SMP system (see dmesg).\n"); + printk(KERN_WARNING "Falling back to a sane timesource now.\n"); + + clock_fallback(); + } + /* ... but give the TSC a fair chance */ + if (lost_count > 25) + cpufreq_delayed_get(); + } else + lost_count = 0; + /* update the monotonic base value */ + this_offset = ((unsigned long long)last_tsc_high<<32)|last_tsc_low; + monotonic_base += cycles_2_ns(this_offset - last_offset); + write_sequnlock(&monotonic_lock); + + /* calculate delay_at_last_interrupt */ + count = ((LATCH-1) - count) * TICK_SIZE; + delay_at_last_interrupt = (count + LATCH/2) / LATCH; + + /* catch corner case where tick rollover occured + * between tsc and pit reads (as noted when + * usec delta is > 90% # of usecs/tick) + */ + if (lost && abs(delay - delay_at_last_interrupt) > (900000/HZ)) + jiffies_64++; +} + +static int __init init_tsc(char* override) +{ + + /* check clock override */ + if (override[0] && strncmp(override,"tsc",3)) { +#ifdef CONFIG_HPET_TIMER + if (is_hpet_enabled()) { + printk(KERN_ERR "Warning: clock= override failed. Defaulting to tsc\n"); + } else +#endif + { + return -ENODEV; + } + } + + /* + * If we have APM enabled or the CPU clock speed is variable + * (CPU stops clock on HLT or slows clock to save power) + * then the TSC timestamps may diverge by up to 1 jiffy from + * 'real time' but nothing will break. + * The most frequent case is that the CPU is "woken" from a halt + * state by the timer interrupt itself, so we get 0 error. In the + * rare cases where a driver would "wake" the CPU and request a + * timestamp, the maximum error is < 1 jiffy. But timestamps are + * still perfectly ordered. + * Note that the TSC counter will be reset if APM suspends + * to disk; this won't break the kernel, though, 'cuz we're + * smart. See arch/i386/kernel/apm.c. + */ + /* + * Firstly we have to do a CPU check for chips with + * a potentially buggy TSC. At this point we haven't run + * the ident/bugs checks so we must run this hook as it + * may turn off the TSC flag. + * + * NOTE: this doesn't yet handle SMP 486 machines where only + * some CPU's have a TSC. Thats never worked and nobody has + * moaned if you have the only one in the world - you fix it! + */ + + count2 = LATCH; /* initialize counter for mark_offset_tsc() */ + + if (cpu_has_tsc) { + unsigned long tsc_quotient; +#ifdef CONFIG_HPET_TIMER + if (is_hpet_enabled() && hpet_use_timer) { + unsigned long result, remain; + printk("Using TSC for gettimeofday\n"); + tsc_quotient = calibrate_tsc_hpet(NULL); + timer_tsc.mark_offset = &mark_offset_tsc_hpet; + /* + * Math to calculate hpet to usec multiplier + * Look for the comments at get_offset_tsc_hpet() + */ + ASM_DIV64_REG(result, remain, hpet_tick, + 0, KERNEL_TICK_USEC); + if (remain > (hpet_tick >> 1)) + result++; /* rounding the result */ + + hpet_usec_quotient = result; + } else +#endif + { + tsc_quotient = calibrate_tsc(); + } + + if (tsc_quotient) { + fast_gettimeoffset_quotient = tsc_quotient; + use_tsc = 1; + /* + * We could be more selective here I suspect + * and just enable this for the next intel chips ? + */ + /* report CPU clock rate in Hz. + * The formula is (10^6 * 2^32) / (2^32 * 1 / (clocks/us)) = + * clock/second. Our precision is about 100 ppm. + */ + { unsigned long eax=0, edx=1000; + __asm__("divl %2" + :"=a" (cpu_khz), "=d" (edx) + :"r" (tsc_quotient), + "0" (eax), "1" (edx)); + printk("Detected %u.%03u MHz processor.\n", + cpu_khz / 1000, cpu_khz % 1000); + } + set_cyc2ns_scale(cpu_khz); + return 0; + } + } + return -ENODEV; +} + +static int tsc_resume(void) +{ + write_seqlock(&monotonic_lock); + /* Assume this is the last mark offset time */ + rdtsc(last_tsc_low, last_tsc_high); +#ifdef CONFIG_HPET_TIMER + if (is_hpet_enabled() && hpet_use_timer) + hpet_last = hpet_readl(HPET_COUNTER); +#endif + write_sequnlock(&monotonic_lock); + return 0; +} + +#ifndef CONFIG_X86_TSC +/* disable flag for tsc. Takes effect by clearing the TSC cpu flag + * in cpu/common.c */ +static int __init tsc_setup(char *str) +{ + tsc_disable = 1; + return 1; +} +#else +static int __init tsc_setup(char *str) +{ + printk(KERN_WARNING "notsc: Kernel compiled with CONFIG_X86_TSC, " + "cannot disable TSC.\n"); + return 1; +} +#endif +__setup("notsc", tsc_setup); + + + +/************************************************************/ + +/* tsc timer_opts struct */ +static struct timer_opts timer_tsc = { + .name = "tsc", + .mark_offset = mark_offset_tsc, + .get_offset = get_offset_tsc, + .monotonic_clock = monotonic_clock_tsc, + .delay = delay_tsc, + .read_timer = read_timer_tsc, + .resume = tsc_resume, +}; + +struct init_timer_opts __initdata timer_tsc_init = { + .init = init_tsc, + .opts = &timer_tsc, +}; diff --git a/trunk/arch/i386/kernel/traps.c b/trunk/arch/i386/kernel/traps.c index 78464097470a..dcc14477af1f 100644 --- a/trunk/arch/i386/kernel/traps.c +++ b/trunk/arch/i386/kernel/traps.c @@ -28,7 +28,6 @@ #include #include #include -#include #ifdef CONFIG_EISA #include @@ -48,7 +47,7 @@ #include #include #include -#include + #include #include #include @@ -93,7 +92,6 @@ asmlinkage void spurious_interrupt_bug(void); asmlinkage void machine_check(void); static int kstack_depth_to_print = 24; -static int call_trace = 1; ATOMIC_NOTIFIER_HEAD(i386die_chain); int register_die_notifier(struct notifier_block *nb) @@ -172,23 +170,7 @@ static inline unsigned long print_context_stack(struct thread_info *tinfo, return ebp; } -static asmlinkage int show_trace_unwind(struct unwind_frame_info *info, void *log_lvl) -{ - int n = 0; - int printed = 0; /* nr of entries already printed on current line */ - - while (unwind(info) == 0 && UNW_PC(info)) { - ++n; - printed = print_addr_and_symbol(UNW_PC(info), log_lvl, printed); - if (arch_unw_user_mode(info)) - break; - } - if (printed) - printk("\n"); - return n; -} - -static void show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs, +static void show_trace_log_lvl(struct task_struct *task, unsigned long *stack, char *log_lvl) { unsigned long ebp; @@ -196,26 +178,6 @@ static void show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs, if (!task) task = current; - if (call_trace >= 0) { - int unw_ret = 0; - struct unwind_frame_info info; - - if (regs) { - if (unwind_init_frame_info(&info, task, regs) == 0) - unw_ret = show_trace_unwind(&info, log_lvl); - } else if (task == current) - unw_ret = unwind_init_running(&info, show_trace_unwind, log_lvl); - else { - if (unwind_init_blocked(&info, task) == 0) - unw_ret = show_trace_unwind(&info, log_lvl); - } - if (unw_ret > 0) { - if (call_trace > 0) - return; - printk("%sLegacy call trace:\n", log_lvl); - } - } - if (task == current) { /* Grab ebp right from our regs */ asm ("movl %%ebp, %0" : "=r" (ebp) : ); @@ -236,13 +198,13 @@ static void show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs, } } -void show_trace(struct task_struct *task, struct pt_regs *regs, unsigned long * stack) +void show_trace(struct task_struct *task, unsigned long * stack) { - show_trace_log_lvl(task, regs, stack, ""); + show_trace_log_lvl(task, stack, ""); } -static void show_stack_log_lvl(struct task_struct *task, struct pt_regs *regs, - unsigned long *esp, char *log_lvl) +static void show_stack_log_lvl(struct task_struct *task, unsigned long *esp, + char *log_lvl) { unsigned long *stack; int i; @@ -263,13 +225,13 @@ static void show_stack_log_lvl(struct task_struct *task, struct pt_regs *regs, printk("%08lx ", *stack++); } printk("\n%sCall Trace:\n", log_lvl); - show_trace_log_lvl(task, regs, esp, log_lvl); + show_trace_log_lvl(task, esp, log_lvl); } void show_stack(struct task_struct *task, unsigned long *esp) { printk(" "); - show_stack_log_lvl(task, NULL, esp, ""); + show_stack_log_lvl(task, esp, ""); } /* @@ -279,7 +241,7 @@ void dump_stack(void) { unsigned long stack; - show_trace(current, NULL, &stack); + show_trace(current, &stack); } EXPORT_SYMBOL(dump_stack); @@ -323,7 +285,7 @@ void show_registers(struct pt_regs *regs) u8 __user *eip; printk("\n" KERN_EMERG "Stack: "); - show_stack_log_lvl(NULL, regs, (unsigned long *)esp, KERN_EMERG); + show_stack_log_lvl(NULL, (unsigned long *)esp, KERN_EMERG); printk(KERN_EMERG "Code: "); @@ -1253,15 +1215,3 @@ static int __init kstack_setup(char *s) return 1; } __setup("kstack=", kstack_setup); - -static int __init call_trace_setup(char *s) -{ - if (strcmp(s, "old") == 0) - call_trace = -1; - else if (strcmp(s, "both") == 0) - call_trace = 0; - else if (strcmp(s, "new") == 0) - call_trace = 1; - return 1; -} -__setup("call_trace=", call_trace_setup); diff --git a/trunk/arch/i386/kernel/tsc.c b/trunk/arch/i386/kernel/tsc.c deleted file mode 100644 index 7e0d8dab2075..000000000000 --- a/trunk/arch/i386/kernel/tsc.c +++ /dev/null @@ -1,478 +0,0 @@ -/* - * This code largely moved from arch/i386/kernel/timer/timer_tsc.c - * which was originally moved from arch/i386/kernel/time.c. - * See comments there for proper credits. - */ - -#include -#include -#include -#include -#include -#include - -#include -#include -#include -#include - -#include "mach_timer.h" - -/* - * On some systems the TSC frequency does not - * change with the cpu frequency. So we need - * an extra value to store the TSC freq - */ -unsigned int tsc_khz; - -int tsc_disable __cpuinitdata = 0; - -#ifdef CONFIG_X86_TSC -static int __init tsc_setup(char *str) -{ - printk(KERN_WARNING "notsc: Kernel compiled with CONFIG_X86_TSC, " - "cannot disable TSC.\n"); - return 1; -} -#else -/* - * disable flag for tsc. Takes effect by clearing the TSC cpu flag - * in cpu/common.c - */ -static int __init tsc_setup(char *str) -{ - tsc_disable = 1; - - return 1; -} -#endif - -__setup("notsc", tsc_setup); - -/* - * code to mark and check if the TSC is unstable - * due to cpufreq or due to unsynced TSCs - */ -static int tsc_unstable; - -static inline int check_tsc_unstable(void) -{ - return tsc_unstable; -} - -void mark_tsc_unstable(void) -{ - tsc_unstable = 1; -} -EXPORT_SYMBOL_GPL(mark_tsc_unstable); - -/* Accellerators for sched_clock() - * convert from cycles(64bits) => nanoseconds (64bits) - * basic equation: - * ns = cycles / (freq / ns_per_sec) - * ns = cycles * (ns_per_sec / freq) - * ns = cycles * (10^9 / (cpu_khz * 10^3)) - * ns = cycles * (10^6 / cpu_khz) - * - * Then we use scaling math (suggested by george@mvista.com) to get: - * ns = cycles * (10^6 * SC / cpu_khz) / SC - * ns = cycles * cyc2ns_scale / SC - * - * And since SC is a constant power of two, we can convert the div - * into a shift. - * - * We can use khz divisor instead of mhz to keep a better percision, since - * cyc2ns_scale is limited to 10^6 * 2^10, which fits in 32 bits. - * (mathieu.desnoyers@polymtl.ca) - * - * -johnstul@us.ibm.com "math is hard, lets go shopping!" - */ -static unsigned long cyc2ns_scale __read_mostly; - -#define CYC2NS_SCALE_FACTOR 10 /* 2^10, carefully chosen */ - -static inline void set_cyc2ns_scale(unsigned long cpu_khz) -{ - cyc2ns_scale = (1000000 << CYC2NS_SCALE_FACTOR)/cpu_khz; -} - -static inline unsigned long long cycles_2_ns(unsigned long long cyc) -{ - return (cyc * cyc2ns_scale) >> CYC2NS_SCALE_FACTOR; -} - -/* - * Scheduler clock - returns current time in nanosec units. - */ -unsigned long long sched_clock(void) -{ - unsigned long long this_offset; - - /* - * in the NUMA case we dont use the TSC as they are not - * synchronized across all CPUs. - */ -#ifndef CONFIG_NUMA - if (!cpu_khz || check_tsc_unstable()) -#endif - /* no locking but a rare wrong value is not a big deal */ - return (jiffies_64 - INITIAL_JIFFIES) * (1000000000 / HZ); - - /* read the Time Stamp Counter: */ - rdtscll(this_offset); - - /* return the value in ns */ - return cycles_2_ns(this_offset); -} - -static unsigned long calculate_cpu_khz(void) -{ - unsigned long long start, end; - unsigned long count; - u64 delta64; - int i; - unsigned long flags; - - local_irq_save(flags); - - /* run 3 times to ensure the cache is warm */ - for (i = 0; i < 3; i++) { - mach_prepare_counter(); - rdtscll(start); - mach_countup(&count); - rdtscll(end); - } - /* - * Error: ECTCNEVERSET - * The CTC wasn't reliable: we got a hit on the very first read, - * or the CPU was so fast/slow that the quotient wouldn't fit in - * 32 bits.. - */ - if (count <= 1) - goto err; - - delta64 = end - start; - - /* cpu freq too fast: */ - if (delta64 > (1ULL<<32)) - goto err; - - /* cpu freq too slow: */ - if (delta64 <= CALIBRATE_TIME_MSEC) - goto err; - - delta64 += CALIBRATE_TIME_MSEC/2; /* round for do_div */ - do_div(delta64,CALIBRATE_TIME_MSEC); - - local_irq_restore(flags); - return (unsigned long)delta64; -err: - local_irq_restore(flags); - return 0; -} - -int recalibrate_cpu_khz(void) -{ -#ifndef CONFIG_SMP - unsigned long cpu_khz_old = cpu_khz; - - if (cpu_has_tsc) { - cpu_khz = calculate_cpu_khz(); - tsc_khz = cpu_khz; - cpu_data[0].loops_per_jiffy = - cpufreq_scale(cpu_data[0].loops_per_jiffy, - cpu_khz_old, cpu_khz); - return 0; - } else - return -ENODEV; -#else - return -ENODEV; -#endif -} - -EXPORT_SYMBOL(recalibrate_cpu_khz); - -void tsc_init(void) -{ - if (!cpu_has_tsc || tsc_disable) - return; - - cpu_khz = calculate_cpu_khz(); - tsc_khz = cpu_khz; - - if (!cpu_khz) - return; - - printk("Detected %lu.%03lu MHz processor.\n", - (unsigned long)cpu_khz / 1000, - (unsigned long)cpu_khz % 1000); - - set_cyc2ns_scale(cpu_khz); - use_tsc_delay(); -} - -#ifdef CONFIG_CPU_FREQ - -static unsigned int cpufreq_delayed_issched = 0; -static unsigned int cpufreq_init = 0; -static struct work_struct cpufreq_delayed_get_work; - -static void handle_cpufreq_delayed_get(void *v) -{ - unsigned int cpu; - - for_each_online_cpu(cpu) - cpufreq_get(cpu); - - cpufreq_delayed_issched = 0; -} - -/* - * if we notice cpufreq oddness, schedule a call to cpufreq_get() as it tries - * to verify the CPU frequency the timing core thinks the CPU is running - * at is still correct. - */ -static inline void cpufreq_delayed_get(void) -{ - if (cpufreq_init && !cpufreq_delayed_issched) { - cpufreq_delayed_issched = 1; - printk(KERN_DEBUG "Checking if CPU frequency changed.\n"); - schedule_work(&cpufreq_delayed_get_work); - } -} - -/* - * if the CPU frequency is scaled, TSC-based delays will need a different - * loops_per_jiffy value to function properly. - */ -static unsigned int ref_freq = 0; -static unsigned long loops_per_jiffy_ref = 0; -static unsigned long cpu_khz_ref = 0; - -static int -time_cpufreq_notifier(struct notifier_block *nb, unsigned long val, void *data) -{ - struct cpufreq_freqs *freq = data; - - if (val != CPUFREQ_RESUMECHANGE && val != CPUFREQ_SUSPENDCHANGE) - write_seqlock_irq(&xtime_lock); - - if (!ref_freq) { - if (!freq->old){ - ref_freq = freq->new; - goto end; - } - ref_freq = freq->old; - loops_per_jiffy_ref = cpu_data[freq->cpu].loops_per_jiffy; - cpu_khz_ref = cpu_khz; - } - - if ((val == CPUFREQ_PRECHANGE && freq->old < freq->new) || - (val == CPUFREQ_POSTCHANGE && freq->old > freq->new) || - (val == CPUFREQ_RESUMECHANGE)) { - if (!(freq->flags & CPUFREQ_CONST_LOOPS)) - cpu_data[freq->cpu].loops_per_jiffy = - cpufreq_scale(loops_per_jiffy_ref, - ref_freq, freq->new); - - if (cpu_khz) { - - if (num_online_cpus() == 1) - cpu_khz = cpufreq_scale(cpu_khz_ref, - ref_freq, freq->new); - if (!(freq->flags & CPUFREQ_CONST_LOOPS)) { - tsc_khz = cpu_khz; - set_cyc2ns_scale(cpu_khz); - /* - * TSC based sched_clock turns - * to junk w/ cpufreq - */ - mark_tsc_unstable(); - } - } - } -end: - if (val != CPUFREQ_RESUMECHANGE && val != CPUFREQ_SUSPENDCHANGE) - write_sequnlock_irq(&xtime_lock); - - return 0; -} - -static struct notifier_block time_cpufreq_notifier_block = { - .notifier_call = time_cpufreq_notifier -}; - -static int __init cpufreq_tsc(void) -{ - int ret; - - INIT_WORK(&cpufreq_delayed_get_work, handle_cpufreq_delayed_get, NULL); - ret = cpufreq_register_notifier(&time_cpufreq_notifier_block, - CPUFREQ_TRANSITION_NOTIFIER); - if (!ret) - cpufreq_init = 1; - - return ret; -} - -core_initcall(cpufreq_tsc); - -#endif - -/* clock source code */ - -static unsigned long current_tsc_khz = 0; -static int tsc_update_callback(void); - -static cycle_t read_tsc(void) -{ - cycle_t ret; - - rdtscll(ret); - - return ret; -} - -static struct clocksource clocksource_tsc = { - .name = "tsc", - .rating = 300, - .read = read_tsc, - .mask = CLOCKSOURCE_MASK(64), - .mult = 0, /* to be set */ - .shift = 22, - .update_callback = tsc_update_callback, - .is_continuous = 1, -}; - -static int tsc_update_callback(void) -{ - int change = 0; - - /* check to see if we should switch to the safe clocksource: */ - if (clocksource_tsc.rating != 50 && check_tsc_unstable()) { - clocksource_tsc.rating = 50; - clocksource_reselect(); - change = 1; - } - - /* only update if tsc_khz has changed: */ - if (current_tsc_khz != tsc_khz) { - current_tsc_khz = tsc_khz; - clocksource_tsc.mult = clocksource_khz2mult(current_tsc_khz, - clocksource_tsc.shift); - change = 1; - } - - return change; -} - -static int __init dmi_mark_tsc_unstable(struct dmi_system_id *d) -{ - printk(KERN_NOTICE "%s detected: marking TSC unstable.\n", - d->ident); - mark_tsc_unstable(); - return 0; -} - -/* List of systems that have known TSC problems */ -static struct dmi_system_id __initdata bad_tsc_dmi_table[] = { - { - .callback = dmi_mark_tsc_unstable, - .ident = "IBM Thinkpad 380XD", - .matches = { - DMI_MATCH(DMI_BOARD_VENDOR, "IBM"), - DMI_MATCH(DMI_BOARD_NAME, "2635FA0"), - }, - }, - {} -}; - -#define TSC_FREQ_CHECK_INTERVAL (10*MSEC_PER_SEC) /* 10sec in MS */ -static struct timer_list verify_tsc_freq_timer; - -/* XXX - Probably should add locking */ -static void verify_tsc_freq(unsigned long unused) -{ - static u64 last_tsc; - static unsigned long last_jiffies; - - u64 now_tsc, interval_tsc; - unsigned long now_jiffies, interval_jiffies; - - - if (check_tsc_unstable()) - return; - - rdtscll(now_tsc); - now_jiffies = jiffies; - - if (!last_jiffies) { - goto out; - } - - interval_jiffies = now_jiffies - last_jiffies; - interval_tsc = now_tsc - last_tsc; - interval_tsc *= HZ; - do_div(interval_tsc, cpu_khz*1000); - - if (interval_tsc < (interval_jiffies * 3 / 4)) { - printk("TSC appears to be running slowly. " - "Marking it as unstable\n"); - mark_tsc_unstable(); - return; - } - -out: - last_tsc = now_tsc; - last_jiffies = now_jiffies; - /* set us up to go off on the next interval: */ - mod_timer(&verify_tsc_freq_timer, - jiffies + msecs_to_jiffies(TSC_FREQ_CHECK_INTERVAL)); -} - -/* - * Make an educated guess if the TSC is trustworthy and synchronized - * over all CPUs. - */ -static __init int unsynchronized_tsc(void) -{ - /* - * Intel systems are normally all synchronized. - * Exceptions must mark TSC as unstable: - */ - if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL) - return 0; - - /* assume multi socket systems are not synchronized: */ - return num_possible_cpus() > 1; -} - -static int __init init_tsc_clocksource(void) -{ - - if (cpu_has_tsc && tsc_khz && !tsc_disable) { - /* check blacklist */ - dmi_check_system(bad_tsc_dmi_table); - - if (unsynchronized_tsc()) /* mark unstable if unsynced */ - mark_tsc_unstable(); - current_tsc_khz = tsc_khz; - clocksource_tsc.mult = clocksource_khz2mult(current_tsc_khz, - clocksource_tsc.shift); - /* lower the rating if we already know its unstable: */ - if (check_tsc_unstable()) - clocksource_tsc.rating = 50; - - init_timer(&verify_tsc_freq_timer); - verify_tsc_freq_timer.function = verify_tsc_freq; - verify_tsc_freq_timer.expires = - jiffies + msecs_to_jiffies(TSC_FREQ_CHECK_INTERVAL); - add_timer(&verify_tsc_freq_timer); - - return clocksource_register(&clocksource_tsc); - } - - return 0; -} - -module_init(init_tsc_clocksource); diff --git a/trunk/arch/i386/kernel/vmlinux.lds.S b/trunk/arch/i386/kernel/vmlinux.lds.S index 2d4f1386e2b1..7512f39c9f25 100644 --- a/trunk/arch/i386/kernel/vmlinux.lds.S +++ b/trunk/arch/i386/kernel/vmlinux.lds.S @@ -71,15 +71,6 @@ SECTIONS .data.read_mostly : AT(ADDR(.data.read_mostly) - LOAD_OFFSET) { *(.data.read_mostly) } _edata = .; /* End of data section */ -#ifdef CONFIG_STACK_UNWIND - . = ALIGN(4); - .eh_frame : AT(ADDR(.eh_frame) - LOAD_OFFSET) { - __start_unwind = .; - *(.eh_frame) - __end_unwind = .; - } -#endif - . = ALIGN(THREAD_SIZE); /* init_task */ .data.init_task : AT(ADDR(.data.init_task) - LOAD_OFFSET) { *(.data.init_task) diff --git a/trunk/arch/i386/lib/delay.c b/trunk/arch/i386/lib/delay.c index 3c0714c4b669..c49a6acbee56 100644 --- a/trunk/arch/i386/lib/delay.c +++ b/trunk/arch/i386/lib/delay.c @@ -10,92 +10,43 @@ * we have to worry about. */ -#include #include #include #include - +#include #include #include #include #ifdef CONFIG_SMP -# include +#include #endif -/* simple loop based delay: */ -static void delay_loop(unsigned long loops) -{ - int d0; - - __asm__ __volatile__( - "\tjmp 1f\n" - ".align 16\n" - "1:\tjmp 2f\n" - ".align 16\n" - "2:\tdecl %0\n\tjns 2b" - :"=&a" (d0) - :"0" (loops)); -} - -/* TSC based delay: */ -static void delay_tsc(unsigned long loops) -{ - unsigned long bclock, now; - - rdtscl(bclock); - do { - rep_nop(); - rdtscl(now); - } while ((now-bclock) < loops); -} - -/* - * Since we calibrate only once at boot, this - * function should be set once at boot and not changed - */ -static void (*delay_fn)(unsigned long) = delay_loop; - -void use_tsc_delay(void) -{ - delay_fn = delay_tsc; -} - -int read_current_timer(unsigned long *timer_val) -{ - if (delay_fn == delay_tsc) { - rdtscl(*timer_val); - return 0; - } - return -1; -} +extern struct timer_opts* timer; void __delay(unsigned long loops) { - delay_fn(loops); + cur_timer->delay(loops); } inline void __const_udelay(unsigned long xloops) { int d0; - xloops *= 4; __asm__("mull %0" :"=d" (xloops), "=&a" (d0) - :"1" (xloops), "0" - (cpu_data[raw_smp_processor_id()].loops_per_jiffy * (HZ/4))); - - __delay(++xloops); + :"1" (xloops),"0" (cpu_data[raw_smp_processor_id()].loops_per_jiffy * (HZ/4))); + __delay(++xloops); } void __udelay(unsigned long usecs) { - __const_udelay(usecs * 0x000010c7); /* 2**32 / 1000000 (rounded up) */ + __const_udelay(usecs * 0x000010c7); /* 2**32 / 1000000 (rounded up) */ } void __ndelay(unsigned long nsecs) { - __const_udelay(nsecs * 0x00005); /* 2**32 / 1000000000 (rounded up) */ + __const_udelay(nsecs * 0x00005); /* 2**32 / 1000000000 (rounded up) */ } EXPORT_SYMBOL(__delay); diff --git a/trunk/arch/i386/mm/fault.c b/trunk/arch/i386/mm/fault.c index 6ee7faaf2c1b..bd6fe96cc16d 100644 --- a/trunk/arch/i386/mm/fault.c +++ b/trunk/arch/i386/mm/fault.c @@ -30,40 +30,6 @@ extern void die(const char *,struct pt_regs *,long); -#ifdef CONFIG_KPROBES -ATOMIC_NOTIFIER_HEAD(notify_page_fault_chain); -int register_page_fault_notifier(struct notifier_block *nb) -{ - vmalloc_sync_all(); - return atomic_notifier_chain_register(¬ify_page_fault_chain, nb); -} - -int unregister_page_fault_notifier(struct notifier_block *nb) -{ - return atomic_notifier_chain_unregister(¬ify_page_fault_chain, nb); -} - -static inline int notify_page_fault(enum die_val val, const char *str, - struct pt_regs *regs, long err, int trap, int sig) -{ - struct die_args args = { - .regs = regs, - .str = str, - .err = err, - .trapnr = trap, - .signr = sig - }; - return atomic_notifier_call_chain(¬ify_page_fault_chain, val, &args); -} -#else -static inline int notify_page_fault(enum die_val val, const char *str, - struct pt_regs *regs, long err, int trap, int sig) -{ - return NOTIFY_DONE; -} -#endif - - /* * Unlock any spinlocks which will prevent us from getting the * message out @@ -358,7 +324,7 @@ fastcall void __kprobes do_page_fault(struct pt_regs *regs, if (unlikely(address >= TASK_SIZE)) { if (!(error_code & 0x0000000d) && vmalloc_fault(address) >= 0) return; - if (notify_page_fault(DIE_PAGE_FAULT, "page fault", regs, error_code, 14, + if (notify_die(DIE_PAGE_FAULT, "page fault", regs, error_code, 14, SIGSEGV) == NOTIFY_STOP) return; /* @@ -368,7 +334,7 @@ fastcall void __kprobes do_page_fault(struct pt_regs *regs, goto bad_area_nosemaphore; } - if (notify_page_fault(DIE_PAGE_FAULT, "page fault", regs, error_code, 14, + if (notify_die(DIE_PAGE_FAULT, "page fault", regs, error_code, 14, SIGSEGV) == NOTIFY_STOP) return; diff --git a/trunk/arch/i386/oprofile/nmi_int.c b/trunk/arch/i386/oprofile/nmi_int.c index fa8a37bcb391..ec0fd3cfa774 100644 --- a/trunk/arch/i386/oprofile/nmi_int.c +++ b/trunk/arch/i386/oprofile/nmi_int.c @@ -281,9 +281,9 @@ static int nmi_create_files(struct super_block * sb, struct dentry * root) for (i = 0; i < model->num_counters; ++i) { struct dentry * dir; - char buf[4]; + char buf[2]; - snprintf(buf, sizeof(buf), "%d", i); + snprintf(buf, 2, "%d", i); dir = oprofilefs_mkdir(sb, root, buf); oprofilefs_create_ulong(sb, dir, "enabled", &counter_config[i].enabled); oprofilefs_create_ulong(sb, dir, "event", &counter_config[i].event); diff --git a/trunk/arch/i386/oprofile/op_model_athlon.c b/trunk/arch/i386/oprofile/op_model_athlon.c index 693bdea4a52b..3ad9a72a5036 100644 --- a/trunk/arch/i386/oprofile/op_model_athlon.c +++ b/trunk/arch/i386/oprofile/op_model_athlon.c @@ -13,7 +13,6 @@ #include #include #include -#include #include "op_x86_model.h" #include "op_counter.h" diff --git a/trunk/arch/i386/oprofile/op_model_p4.c b/trunk/arch/i386/oprofile/op_model_p4.c index 7c61d357b82b..ac8a066035c2 100644 --- a/trunk/arch/i386/oprofile/op_model_p4.c +++ b/trunk/arch/i386/oprofile/op_model_p4.c @@ -14,7 +14,6 @@ #include #include #include -#include #include "op_x86_model.h" #include "op_counter.h" diff --git a/trunk/arch/i386/oprofile/op_model_ppro.c b/trunk/arch/i386/oprofile/op_model_ppro.c index 5c3ab4b027ad..d719015fc044 100644 --- a/trunk/arch/i386/oprofile/op_model_ppro.c +++ b/trunk/arch/i386/oprofile/op_model_ppro.c @@ -14,7 +14,6 @@ #include #include #include -#include #include "op_x86_model.h" #include "op_counter.h" diff --git a/trunk/arch/i386/pci/pcbios.c b/trunk/arch/i386/pci/pcbios.c index ed1512a175ab..1eec0868f4b3 100644 --- a/trunk/arch/i386/pci/pcbios.c +++ b/trunk/arch/i386/pci/pcbios.c @@ -371,7 +371,8 @@ void __devinit pcibios_sort(void) list_for_each(ln, &pci_devices) { d = pci_dev_g(ln); if (d->bus->number == bus && d->devfn == devfn) { - list_move_tail(&d->global_list, &sorted_devices); + list_del(&d->global_list); + list_add_tail(&d->global_list, &sorted_devices); if (d == dev) found = 1; break; @@ -389,7 +390,8 @@ void __devinit pcibios_sort(void) if (!found) { printk(KERN_WARNING "PCI: Device %s not found by BIOS\n", pci_name(dev)); - list_move_tail(&dev->global_list, &sorted_devices); + list_del(&dev->global_list); + list_add_tail(&dev->global_list, &sorted_devices); } } list_splice(&sorted_devices, &pci_devices); diff --git a/trunk/arch/ia64/kernel/process.c b/trunk/arch/ia64/kernel/process.c index b045c279136c..355d57970ba3 100644 --- a/trunk/arch/ia64/kernel/process.c +++ b/trunk/arch/ia64/kernel/process.c @@ -272,9 +272,9 @@ cpu_idle (void) /* endless idle loop with no priority at all */ while (1) { if (can_do_pal_halt) - current_thread_info()->status &= ~TS_POLLING; + clear_thread_flag(TIF_POLLING_NRFLAG); else - current_thread_info()->status |= TS_POLLING; + set_thread_flag(TIF_POLLING_NRFLAG); if (!need_resched()) { void (*idle)(void); diff --git a/trunk/arch/ia64/mm/fault.c b/trunk/arch/ia64/mm/fault.c index 14ef7cceb208..d98ec49570b8 100644 --- a/trunk/arch/ia64/mm/fault.c +++ b/trunk/arch/ia64/mm/fault.c @@ -19,40 +19,6 @@ extern void die (char *, struct pt_regs *, long); -#ifdef CONFIG_KPROBES -ATOMIC_NOTIFIER_HEAD(notify_page_fault_chain); - -/* Hook to register for page fault notifications */ -int register_page_fault_notifier(struct notifier_block *nb) -{ - return atomic_notifier_chain_register(¬ify_page_fault_chain, nb); -} - -int unregister_page_fault_notifier(struct notifier_block *nb) -{ - return atomic_notifier_chain_unregister(¬ify_page_fault_chain, nb); -} - -static inline int notify_page_fault(enum die_val val, const char *str, - struct pt_regs *regs, long err, int trap, int sig) -{ - struct die_args args = { - .regs = regs, - .str = str, - .err = err, - .trapnr = trap, - .signr = sig - }; - return atomic_notifier_call_chain(¬ify_page_fault_chain, val, &args); -} -#else -static inline int notify_page_fault(enum die_val val, const char *str, - struct pt_regs *regs, long err, int trap, int sig) -{ - return NOTIFY_DONE; -} -#endif - /* * Return TRUE if ADDRESS points at a page in the kernel's mapped segment * (inside region 5, on ia64) and that page is present. @@ -118,7 +84,7 @@ ia64_do_page_fault (unsigned long address, unsigned long isr, struct pt_regs *re /* * This is to handle the kprobes on user space access instructions */ - if (notify_page_fault(DIE_PAGE_FAULT, "page fault", regs, code, TRAP_BRKPT, + if (notify_die(DIE_PAGE_FAULT, "page fault", regs, code, TRAP_BRKPT, SIGSEGV) == NOTIFY_STOP) return; diff --git a/trunk/arch/m68k/mm/memory.c b/trunk/arch/m68k/mm/memory.c index a226668f20c3..d6d582a5abb0 100644 --- a/trunk/arch/m68k/mm/memory.c +++ b/trunk/arch/m68k/mm/memory.c @@ -94,7 +94,8 @@ pmd_t *get_pointer_table (void) PD_MARKBITS(dp) = mask & ~tmp; if (!PD_MARKBITS(dp)) { /* move to end of list */ - list_move_tail(dp, &ptable_list); + list_del(dp); + list_add_tail(dp, &ptable_list); } return (pmd_t *) (page_address(PD_PAGE(dp)) + off); } @@ -122,7 +123,8 @@ int free_pointer_table (pmd_t *ptable) * move this descriptor to the front of the list, since * it has one or more free tables. */ - list_move(dp, &ptable_list); + list_del(dp); + list_add(dp, &ptable_list); } return 0; } diff --git a/trunk/arch/m68k/sun3/sun3dvma.c b/trunk/arch/m68k/sun3/sun3dvma.c index 97c7bfde8ae8..f04a1d25f1a2 100644 --- a/trunk/arch/m68k/sun3/sun3dvma.c +++ b/trunk/arch/m68k/sun3/sun3dvma.c @@ -119,7 +119,8 @@ static inline int refill(void) if(hole->end == prev->start) { hole->size += prev->size; hole->end = prev->end; - list_move(&(prev->list), &hole_cache); + list_del(&(prev->list)); + list_add(&(prev->list), &hole_cache); ret++; } @@ -181,7 +182,8 @@ static inline unsigned long get_baddr(int len, unsigned long align) #endif return hole->end; } else if(hole->size == newlen) { - list_move(&(hole->list), &hole_cache); + list_del(&(hole->list)); + list_add(&(hole->list), &hole_cache); dvma_entry_use(hole->start) = newlen; #ifdef DVMA_DEBUG dvma_allocs++; diff --git a/trunk/arch/m68knommu/Kconfig b/trunk/arch/m68knommu/Kconfig index 8b6e723eb82b..6c6980b9b6d4 100644 --- a/trunk/arch/m68knommu/Kconfig +++ b/trunk/arch/m68knommu/Kconfig @@ -472,46 +472,38 @@ config 4KSTACKS running more threads on a system and also reduces the pressure on the VM subsystem for higher order allocations. -comment "RAM configuration" - -config RAMBASE - hex "Address of the base of RAM" - default "0" - help - Define the address that RAM starts at. On many platforms this is - 0, the base of the address space. And this is the default. Some - platforms choose to setup their RAM at other addresses within the - processor address space. - -config RAMSIZE - hex "Size of RAM (in bytes)" - default "0x400000" - help - Define the size of the system RAM. If you select 0 then the - kernel will try to probe the RAM size at runtime. This is not - supported on all CPU types. - -config VECTORBASE - hex "Address of the base of system vectors" - default "0" - help - Define the address of the the system vectors. Commonly this is - put at the start of RAM, but it doesn't have to be. On ColdFire - platforms this address is programmed into the VBR register, thus - actually setting the address to use. - -config KERNELBASE - hex "Address of the base of kernel code" - default "0x400" - help - Typically on m68k systems the kernel will not start at the base - of RAM, but usually some small offset from it. Define the start - address of the kernel here. The most common setup will have the - processor vectors at the base of RAM and then the start of the - kernel. On some platforms some RAM is reserved for boot loaders - and the kernel starts after that. The 0x400 default was based on - a system with the RAM based at address 0, and leaving enough room - for the theoretical maximum number of 256 vectors. +choice + prompt "RAM size" + default AUTO + +config RAMAUTO + bool "AUTO" + ---help--- + Configure the RAM size on your platform. Many platforms can auto + detect this, on those choose the AUTO option. Otherwise set the + RAM size you intend using. + +config RAM4MB + bool "4MiB" + help + Set RAM size to be 4MiB. + +config RAM8MB + bool "8MiB" + help + Set RAM size to be 8MiB. + +config RAM16MB + bool "16MiB" + help + Set RAM size to be 16MiB. + +config RAM32MB + bool "32MiB" + help + Set RAM size to be 32MiB. + +endchoice choice prompt "RAM bus width" @@ -519,7 +511,7 @@ choice config RAMAUTOBIT bool "AUTO" - help + ---help--- Select the physical RAM data bus size. Not needed on most platforms, so you can generally choose AUTO. @@ -553,9 +545,7 @@ config RAMKERNEL config ROMKERNEL bool "ROM" help - The kernel will be resident in FLASH/ROM when running. This is - often referred to as Execute-in-Place (XIP), since the kernel - code executes from the position it is stored in the FLASH/ROM. + The kernel will be resident in FLASH/ROM when running. endchoice diff --git a/trunk/arch/m68knommu/kernel/vmlinux.lds.S b/trunk/arch/m68knommu/kernel/vmlinux.lds.S index 6a2f0c693254..a331cc90797c 100644 --- a/trunk/arch/m68knommu/kernel/vmlinux.lds.S +++ b/trunk/arch/m68knommu/kernel/vmlinux.lds.S @@ -1,7 +1,7 @@ /* * vmlinux.lds.S -- master linker script for m68knommu arch * - * (C) Copyright 2002-2006, Greg Ungerer + * (C) Copyright 2002-2004, Greg Ungerer * * This ends up looking compilcated, because of the number of * address variations for ram and rom/flash layouts. The real @@ -22,7 +22,13 @@ #define ROM_START 0x10c10400 #define ROM_LENGTH 0xfec00 #define ROM_END 0x10d00000 -#define DATA_ADDR CONFIG_KERNELBASE +#define RAMVEC_START 0x00000000 +#define RAMVEC_LENGTH 0x400 +#define RAM_START 0x10000400 +#define RAM_LENGTH 0xffc00 +#define RAM_END 0x10100000 +#define _ramend _ram_end_notused +#define DATA_ADDR RAM_START #endif /* @@ -35,6 +41,11 @@ #define ROM_START 0x10c10400 #define ROM_LENGTH 0x1efc00 #define ROM_END 0x10e00000 +#define RAMVEC_START 0x00000000 +#define RAMVEC_LENGTH 0x400 +#define RAM_START 0x00020400 +#define RAM_LENGTH 0x7dfc00 +#define RAM_END 0x00800000 #endif #ifdef CONFIG_ROMKERNEL #define ROMVEC_START 0x10c10000 @@ -42,6 +53,11 @@ #define ROM_START 0x10c10400 #define ROM_LENGTH 0x1efc00 #define ROM_END 0x10e00000 +#define RAMVEC_START 0x00000000 +#define RAMVEC_LENGTH 0x400 +#define RAM_START 0x00020000 +#define RAM_LENGTH 0x600000 +#define RAM_END 0x00800000 #endif #ifdef CONFIG_HIMEMKERNEL #define ROMVEC_START 0x00600000 @@ -49,28 +65,141 @@ #define ROM_START 0x00600400 #define ROM_LENGTH 0x1efc00 #define ROM_END 0x007f0000 +#define RAMVEC_START 0x00000000 +#define RAMVEC_LENGTH 0x400 +#define RAM_START 0x00020000 +#define RAM_LENGTH 0x5e0000 +#define RAM_END 0x00600000 #endif #endif +#ifdef CONFIG_DRAGEN2 +#define RAM_START 0x10000 +#define RAM_LENGTH 0x7f0000 +#endif + #ifdef CONFIG_UCQUICC #define ROMVEC_START 0x00000000 #define ROMVEC_LENGTH 0x404 #define ROM_START 0x00000404 #define ROM_LENGTH 0x1ff6fc #define ROM_END 0x00200000 +#define RAMVEC_START 0x00200000 +#define RAMVEC_LENGTH 0x404 +#define RAM_START 0x00200404 +#define RAM_LENGTH 0x1ff6fc +#define RAM_END 0x00400000 +#endif + +/* + * The standard Arnewsh 5206 board only has 1MiB of ram. Not normally + * enough to be useful. Assume the user has fitted something larger, + * at least 4MiB in size. No point in not letting the kernel completely + * link, it will be obvious if it is too big when they go to load it. + */ +#if defined(CONFIG_ARN5206) +#define RAM_START 0x10000 +#define RAM_LENGTH 0x3f0000 +#endif + +/* + * The Motorola 5206eLITE board only has 1MiB of static RAM. + */ +#if defined(CONFIG_ELITE) +#define RAM_START 0x30020000 +#define RAM_LENGTH 0xe0000 +#endif + +/* + * All the Motorola eval boards have the same basic arrangement. + * The end of RAM will vary depending on how much ram is fitted, + * but this isn't important here, we assume at least 4MiB. + */ +#if defined(CONFIG_M5206eC3) || defined(CONFIG_M5249C3) || \ + defined(CONFIG_M5272C3) || defined(CONFIG_M5307C3) || \ + defined(CONFIG_ARN5307) || defined(CONFIG_M5407C3) || \ + defined(CONFIG_M5271EVB) || defined(CONFIG_M5275EVB) || \ + defined(CONFIG_M5235EVB) +#define RAM_START 0x20000 +#define RAM_LENGTH 0x3e0000 +#endif + +/* + * The Freescale 5208EVB board has 32MB of RAM. + */ +#if defined(CONFIG_M5208EVB) +#define RAM_START 0x40020000 +#define RAM_LENGTH 0x01fe0000 +#endif + +/* + * The senTec COBRA5272 board has nearly the same memory layout as + * the M5272C3. We assume 16MiB ram. + */ +#if defined(CONFIG_COBRA5272) +#define RAM_START 0x20000 +#define RAM_LENGTH 0xfe0000 +#endif + +#if defined(CONFIG_M5282EVB) +#define RAM_START 0x10000 +#define RAM_LENGTH 0x3f0000 +#endif + +/* + * The senTec COBRA5282 board has the same memory layout as the M5282EVB. + */ +#if defined(CONFIG_COBRA5282) +#define RAM_START 0x10000 +#define RAM_LENGTH 0x3f0000 +#endif + + +/* + * The EMAC SoM-5282EM module. + */ +#if defined(CONFIG_SOM5282EM) +#define RAM_START 0x10000 +#define RAM_LENGTH 0xff0000 +#endif + + +/* + * These flash boot boards use all of ram for operation. Again the + * actual memory size is not important here, assume at least 4MiB. + * They currently have no support for running in flash. + */ +#if defined(CONFIG_NETtel) || defined(CONFIG_eLIA) || \ + defined(CONFIG_DISKtel) || defined(CONFIG_SECUREEDGEMP3) || \ + defined(CONFIG_HW_FEITH) +#define RAM_START 0x400 +#define RAM_LENGTH 0x3ffc00 +#endif + +/* + * Sneha Boards mimimun memory + * The end of RAM will vary depending on how much ram is fitted, + * but this isn't important here, we assume at least 4MiB. + */ +#if defined(CONFIG_CPU16B) +#define RAM_START 0x20000 +#define RAM_LENGTH 0x3e0000 +#endif + +#if defined(CONFIG_MOD5272) +#define RAM_START 0x02000000 +#define RAM_LENGTH 0x00800000 +#define RAMVEC_START 0x20000000 +#define RAMVEC_LENGTH 0x00000400 #endif #if defined(CONFIG_RAMKERNEL) -#define RAM_START CONFIG_KERNELBASE -#define RAM_LENGTH (CONFIG_RAMBASE + CONFIG_RAMSIZE - CONFIG_KERNELBASE) #define TEXT ram #define DATA ram #define INIT ram #define BSS ram #endif #if defined(CONFIG_ROMKERNEL) || defined(CONFIG_HIMEMKERNEL) -#define RAM_START CONFIG_RAMBASE -#define RAM_LENGTH CONFIG_RAMSIZE #define TEXT rom #define DATA ram #define INIT ram @@ -86,7 +215,13 @@ OUTPUT_ARCH(m68k) ENTRY(_start) MEMORY { +#ifdef RAMVEC_START + ramvec : ORIGIN = RAMVEC_START, LENGTH = RAMVEC_LENGTH +#endif ram : ORIGIN = RAM_START, LENGTH = RAM_LENGTH +#ifdef RAM_END + eram : ORIGIN = RAM_END, LENGTH = 0 +#endif #ifdef ROM_START romvec : ORIGIN = ROMVEC_START, LENGTH = ROMVEC_LENGTH rom : ORIGIN = ROM_START, LENGTH = ROM_LENGTH @@ -173,6 +308,12 @@ SECTIONS { __rom_end = . ; } > erom #endif +#ifdef RAMVEC_START + . = RAMVEC_START ; + .ramvec : { + __ramvec = .; + } > ramvec +#endif .data DATA_ADDR : { . = ALIGN(4); @@ -232,5 +373,12 @@ SECTIONS { _ebss = . ; } > BSS +#ifdef RAM_END + . = RAM_END ; + .eram : { + __ramend = . ; + _ramend = . ; + } > eram +#endif } diff --git a/trunk/arch/m68knommu/platform/5307/head.S b/trunk/arch/m68knommu/platform/5307/head.S index 1d9eb301d7ac..c30c462b99b1 100644 --- a/trunk/arch/m68knommu/platform/5307/head.S +++ b/trunk/arch/m68knommu/platform/5307/head.S @@ -3,7 +3,7 @@ /* * head.S -- common startup code for ColdFire CPUs. * - * (C) Copyright 1999-2006, Greg Ungerer . + * (C) Copyright 1999-2004, Greg Ungerer (gerg@snapgear.com). */ /*****************************************************************************/ @@ -19,15 +19,47 @@ /*****************************************************************************/ /* - * If we don't have a fixed memory size, then lets build in code + * Define fixed memory sizes. Configuration of a fixed memory size + * overrides everything else. If the user defined a size we just + * blindly use it (they know what they are doing right :-) + */ +#if defined(CONFIG_RAM32MB) +#define MEM_SIZE 0x02000000 /* memory size 32Mb */ +#elif defined(CONFIG_RAM16MB) +#define MEM_SIZE 0x01000000 /* memory size 16Mb */ +#elif defined(CONFIG_RAM8MB) +#define MEM_SIZE 0x00800000 /* memory size 8Mb */ +#elif defined(CONFIG_RAM4MB) +#define MEM_SIZE 0x00400000 /* memory size 4Mb */ +#elif defined(CONFIG_RAM1MB) +#define MEM_SIZE 0x00100000 /* memory size 1Mb */ +#endif + +/* + * Memory size exceptions for special cases. Some boards may be set + * for auto memory sizing, but we can't do it that way for some reason. + * For example the 5206eLITE board has static RAM, and auto-detecting + * the SDRAM will do you no good at all. Same goes for the MOD5272. + */ +#ifdef CONFIG_RAMAUTO +#if defined(CONFIG_M5206eLITE) +#define MEM_SIZE 0x00100000 /* 1MiB default memory */ +#endif +#if defined(CONFIG_MOD5272) +#define MEM_SIZE 0x00800000 /* 8MiB default memory */ +#endif +#endif /* CONFIG_RAMAUTO */ + + +/* + * If we don't have a fixed memory size now, then lets build in code * to auto detect the DRAM size. Obviously this is the prefered - * method, and should work for most boards. It won't work for those - * that do not have their RAM starting at address 0, and it only - * works on SDRAM (not boards fitted with SRAM). + * method, and should work for most boards (it won't work for those + * that do not have their RAM starting at address 0). */ -#if CONFIG_RAMSIZE != 0 +#if defined(MEM_SIZE) .macro GET_MEM_SIZE - movel #CONFIG_RAMSIZE,%d0 /* hard coded memory size */ + movel #MEM_SIZE,%d0 /* hard coded memory size */ .endm #elif defined(CONFIG_M5206) || defined(CONFIG_M5206e) || \ @@ -66,7 +98,37 @@ .endm #else -#error "ERROR: I don't know how to probe your boards memory size?" +#error "ERROR: I don't know how to determine your boards memory size?" +#endif + + +/* + * Most ColdFire boards have their DRAM starting at address 0. + * Notable exception is the 5206eLITE board, another is the MOD5272. + */ +#if defined(CONFIG_M5206eLITE) +#define MEM_BASE 0x30000000 +#endif +#if defined(CONFIG_MOD5272) +#define MEM_BASE 0x02000000 +#define VBR_BASE 0x20000000 /* vectors in SRAM */ +#endif +#if defined(CONFIG_M5208EVB) +#define MEM_BASE 0x40000000 +#endif + +#ifndef MEM_BASE +#define MEM_BASE 0x00000000 /* memory base at address 0 */ +#endif + +/* + * The default location for the vectors is at the base of RAM. + * Some boards might like to use internal SRAM or something like + * that. If no board specific header defines an alternative then + * use the base of RAM. + */ +#ifndef VBR_BASE +#define VBR_BASE MEM_BASE /* vector address */ #endif /*****************************************************************************/ @@ -129,11 +191,11 @@ _start: * Create basic memory configuration. Set VBR accordingly, * and size memory. */ - movel #CONFIG_VECTORBASE,%a7 + movel #VBR_BASE,%a7 movec %a7,%VBR /* set vectors addr */ movel %a7,_ramvec - movel #CONFIG_RAMBASE,%a7 /* mark the base of RAM */ + movel #MEM_BASE,%a7 /* mark the base of RAM */ movel %a7,_rambase GET_MEM_SIZE /* macro code determines size */ diff --git a/trunk/arch/m68knommu/platform/68328/head-pilot.S b/trunk/arch/m68knommu/platform/68328/head-pilot.S index 46b3604f999c..c46775fe04be 100644 --- a/trunk/arch/m68knommu/platform/68328/head-pilot.S +++ b/trunk/arch/m68knommu/platform/68328/head-pilot.S @@ -21,6 +21,7 @@ .global _start .global _rambase +.global __ramvec .global _ramvec .global _ramstart .global _ramend @@ -120,7 +121,7 @@ L0: DBG_PUTC('B') /* Copy command line from beginning of RAM (+16) to end of bss */ - movel #CONFIG_VECTORBASE, %d7 + movel #__ramvec, %d7 addl #16, %d7 moveal %d7, %a0 moveal #_ebss, %a1 diff --git a/trunk/arch/m68knommu/platform/68328/head-ram.S b/trunk/arch/m68knommu/platform/68328/head-ram.S index e8dc9241ff96..6bdc9bce43f2 100644 --- a/trunk/arch/m68knommu/platform/68328/head-ram.S +++ b/trunk/arch/m68knommu/platform/68328/head-ram.S @@ -1,7 +1,10 @@ #include .global __main + .global __ram_start + .global __ram_end .global __rom_start + .global __rom_end .global _rambase .global _ramstart @@ -9,7 +12,6 @@ .global splash_bits .global _start .global _stext - .global _edata #define DEBUG #define ROM_OFFSET 0x10C00000 @@ -71,7 +73,7 @@ pclp1: #ifdef CONFIG_RELOCATE /* Copy me to RAM */ moveal #__rom_start, %a0 - moveal #_stext, %a1 + moveal #__ram_start, %a1 moveal #_edata, %a2 /* Copy %a0 to %a1 until %a1 == %a2 */ diff --git a/trunk/arch/mips/oprofile/common.c b/trunk/arch/mips/oprofile/common.c index 65eb55400d77..c31e4cff64e0 100644 --- a/trunk/arch/mips/oprofile/common.c +++ b/trunk/arch/mips/oprofile/common.c @@ -38,7 +38,7 @@ static int op_mips_create_files(struct super_block * sb, struct dentry * root) for (i = 0; i < model->num_counters; ++i) { struct dentry *dir; - char buf[4]; + char buf[3]; snprintf(buf, sizeof buf, "%d", i); dir = oprofilefs_mkdir(sb, root, buf); diff --git a/trunk/arch/powerpc/kernel/time.c b/trunk/arch/powerpc/kernel/time.c index 7dd5dab789a1..d20907561f46 100644 --- a/trunk/arch/powerpc/kernel/time.c +++ b/trunk/arch/powerpc/kernel/time.c @@ -102,7 +102,7 @@ EXPORT_SYMBOL(tb_ticks_per_sec); /* for cputime_t conversions */ u64 tb_to_xs; unsigned tb_to_us; -#define TICKLEN_SCALE TICK_LENGTH_SHIFT +#define TICKLEN_SCALE (SHIFT_SCALE - 10) u64 last_tick_len; /* units are ns / 2^TICKLEN_SCALE */ u64 ticklen_to_xs; /* 0.64 fraction */ diff --git a/trunk/arch/powerpc/mm/fault.c b/trunk/arch/powerpc/mm/fault.c index a0a9e1e0061e..fdbba4206d59 100644 --- a/trunk/arch/powerpc/mm/fault.c +++ b/trunk/arch/powerpc/mm/fault.c @@ -40,40 +40,6 @@ #include #include -#ifdef CONFIG_KPROBES -ATOMIC_NOTIFIER_HEAD(notify_page_fault_chain); - -/* Hook to register for page fault notifications */ -int register_page_fault_notifier(struct notifier_block *nb) -{ - return atomic_notifier_chain_register(¬ify_page_fault_chain, nb); -} - -int unregister_page_fault_notifier(struct notifier_block *nb) -{ - return atomic_notifier_chain_unregister(¬ify_page_fault_chain, nb); -} - -static inline int notify_page_fault(enum die_val val, const char *str, - struct pt_regs *regs, long err, int trap, int sig) -{ - struct die_args args = { - .regs = regs, - .str = str, - .err = err, - .trapnr = trap, - .signr = sig - }; - return atomic_notifier_call_chain(¬ify_page_fault_chain, val, &args); -} -#else -static inline int notify_page_fault(enum die_val val, const char *str, - struct pt_regs *regs, long err, int trap, int sig) -{ - return NOTIFY_DONE; -} -#endif - /* * Check whether the instruction at regs->nip is a store using * an update addressing form which will update r1. @@ -176,7 +142,7 @@ int __kprobes do_page_fault(struct pt_regs *regs, unsigned long address, is_write = error_code & ESR_DST; #endif /* CONFIG_4xx || CONFIG_BOOKE */ - if (notify_page_fault(DIE_PAGE_FAULT, "page_fault", regs, error_code, + if (notify_die(DIE_PAGE_FAULT, "page_fault", regs, error_code, 11, SIGSEGV) == NOTIFY_STOP) return 0; diff --git a/trunk/arch/powerpc/oprofile/common.c b/trunk/arch/powerpc/oprofile/common.c index fd0bbbe7a4de..27ad56bd227e 100644 --- a/trunk/arch/powerpc/oprofile/common.c +++ b/trunk/arch/powerpc/oprofile/common.c @@ -94,7 +94,7 @@ static int op_powerpc_create_files(struct super_block *sb, struct dentry *root) for (i = 0; i < model->num_counters; ++i) { struct dentry *dir; - char buf[4]; + char buf[3]; snprintf(buf, sizeof buf, "%d", i); dir = oprofilefs_mkdir(sb, root, buf); diff --git a/trunk/arch/sh/oprofile/op_model_sh7750.c b/trunk/arch/sh/oprofile/op_model_sh7750.c index c265185b22a7..5ec9ddcc4b0b 100644 --- a/trunk/arch/sh/oprofile/op_model_sh7750.c +++ b/trunk/arch/sh/oprofile/op_model_sh7750.c @@ -198,7 +198,7 @@ static int sh7750_perf_counter_create_files(struct super_block *sb, struct dentr for (i = 0; i < NR_CNTRS; i++) { struct dentry *dir; - char buf[4]; + char buf[3]; snprintf(buf, sizeof(buf), "%d", i); dir = oprofilefs_mkdir(sb, root, buf); diff --git a/trunk/arch/sparc/kernel/of_device.c b/trunk/arch/sparc/kernel/of_device.c index 80a809478781..001b8673b4bd 100644 --- a/trunk/arch/sparc/kernel/of_device.c +++ b/trunk/arch/sparc/kernel/of_device.c @@ -138,7 +138,6 @@ struct bus_type ebus_bus_type = { .suspend = of_device_suspend, .resume = of_device_resume, }; -EXPORT_SYMBOL(ebus_bus_type); #endif #ifdef CONFIG_SBUS @@ -150,7 +149,6 @@ struct bus_type sbus_bus_type = { .suspend = of_device_suspend, .resume = of_device_resume, }; -EXPORT_SYMBOL(sbus_bus_type); #endif static int __init of_bus_driver_init(void) diff --git a/trunk/arch/sparc/kernel/prom.c b/trunk/arch/sparc/kernel/prom.c index 946ce6d15819..63b2b9bd778e 100644 --- a/trunk/arch/sparc/kernel/prom.c +++ b/trunk/arch/sparc/kernel/prom.c @@ -27,11 +27,6 @@ static struct device_node *allnodes; -/* use when traversing tree through the allnext, child, sibling, - * or parent members of struct device_node. - */ -static DEFINE_RWLOCK(devtree_lock); - int of_device_is_compatible(struct device_node *device, const char *compat) { const char* cp; @@ -190,54 +185,6 @@ int of_getintprop_default(struct device_node *np, const char *name, int def) } EXPORT_SYMBOL(of_getintprop_default); -int of_set_property(struct device_node *dp, const char *name, void *val, int len) -{ - struct property **prevp; - void *new_val; - int err; - - new_val = kmalloc(len, GFP_KERNEL); - if (!new_val) - return -ENOMEM; - - memcpy(new_val, val, len); - - err = -ENODEV; - - write_lock(&devtree_lock); - prevp = &dp->properties; - while (*prevp) { - struct property *prop = *prevp; - - if (!strcmp(prop->name, name)) { - void *old_val = prop->value; - int ret; - - ret = prom_setprop(dp->node, name, val, len); - err = -EINVAL; - if (ret >= 0) { - prop->value = new_val; - prop->length = len; - - if (OF_IS_DYNAMIC(prop)) - kfree(old_val); - - OF_MARK_DYNAMIC(prop); - - err = 0; - } - break; - } - prevp = &(*prevp)->next; - } - write_unlock(&devtree_lock); - - /* XXX Upate procfs if necessary... */ - - return err; -} -EXPORT_SYMBOL(of_set_property); - static unsigned int prom_early_allocated; static void * __init prom_early_alloc(unsigned long size) @@ -407,9 +354,7 @@ static char * __init build_full_name(struct device_node *dp) return n; } -static unsigned int unique_id; - -static struct property * __init build_one_prop(phandle node, char *prev, char *special_name, void *special_val, int special_len) +static struct property * __init build_one_prop(phandle node, char *prev) { static struct property *tmp = NULL; struct property *p; @@ -419,34 +364,25 @@ static struct property * __init build_one_prop(phandle node, char *prev, char *s p = tmp; memset(p, 0, sizeof(*p) + 32); tmp = NULL; - } else { + } else p = prom_early_alloc(sizeof(struct property) + 32); - p->unique_id = unique_id++; - } p->name = (char *) (p + 1); - if (special_name) { - p->length = special_len; - p->value = prom_early_alloc(special_len); - memcpy(p->value, special_val, special_len); + if (prev == NULL) { + prom_firstprop(node, p->name); } else { - if (prev == NULL) { - prom_firstprop(node, p->name); - } else { - prom_nextprop(node, prev, p->name); - } - if (strlen(p->name) == 0) { - tmp = p; - return NULL; - } - p->length = prom_getproplen(node, p->name); - if (p->length <= 0) { - p->length = 0; - } else { - p->value = prom_early_alloc(p->length + 1); - prom_getproperty(node, p->name, p->value, p->length); - ((unsigned char *)p->value)[p->length] = '\0'; - } + prom_nextprop(node, prev, p->name); + } + if (strlen(p->name) == 0) { + tmp = p; + return NULL; + } + p->length = prom_getproplen(node, p->name); + if (p->length <= 0) { + p->length = 0; + } else { + p->value = prom_early_alloc(p->length); + len = prom_getproperty(node, p->name, p->value, p->length); } return p; } @@ -455,14 +391,9 @@ static struct property * __init build_prop_list(phandle node) { struct property *head, *tail; - head = tail = build_one_prop(node, NULL, - ".node", &node, sizeof(node)); - - tail->next = build_one_prop(node, NULL, NULL, NULL, 0); - tail = tail->next; + head = tail = build_one_prop(node, NULL); while(tail) { - tail->next = build_one_prop(node, tail->name, - NULL, NULL, 0); + tail->next = build_one_prop(node, tail->name); tail = tail->next; } @@ -491,7 +422,6 @@ static struct device_node * __init create_node(phandle node) return NULL; dp = prom_early_alloc(sizeof(*dp)); - dp->unique_id = unique_id++; kref_init(&dp->kref); diff --git a/trunk/arch/sparc/lib/Makefile b/trunk/arch/sparc/lib/Makefile index 5db7e1d85385..fa5006946062 100644 --- a/trunk/arch/sparc/lib/Makefile +++ b/trunk/arch/sparc/lib/Makefile @@ -9,5 +9,3 @@ lib-y := mul.o rem.o sdiv.o udiv.o umul.o urem.o ashrdi3.o memcpy.o memset.o \ strncpy_from_user.o divdi3.o udivdi3.o strlen_user.o \ copy_user.o locks.o atomic.o atomic32.o bitops.o \ lshrdi3.o ashldi3.o rwsem.o muldi3.o bitext.o - -obj-y += iomap.o diff --git a/trunk/arch/sparc/lib/iomap.c b/trunk/arch/sparc/lib/iomap.c deleted file mode 100644 index 54501c1ca785..000000000000 --- a/trunk/arch/sparc/lib/iomap.c +++ /dev/null @@ -1,48 +0,0 @@ -/* - * Implement the sparc iomap interfaces - */ -#include -#include -#include - -/* Create a virtual mapping cookie for an IO port range */ -void __iomem *ioport_map(unsigned long port, unsigned int nr) -{ - return (void __iomem *) (unsigned long) port; -} - -void ioport_unmap(void __iomem *addr) -{ - /* Nothing to do */ -} -EXPORT_SYMBOL(ioport_map); -EXPORT_SYMBOL(ioport_unmap); - -/* Create a virtual mapping cookie for a PCI BAR (memory or IO) */ -void __iomem *pci_iomap(struct pci_dev *dev, int bar, unsigned long maxlen) -{ - unsigned long start = pci_resource_start(dev, bar); - unsigned long len = pci_resource_len(dev, bar); - unsigned long flags = pci_resource_flags(dev, bar); - - if (!len || !start) - return NULL; - if (maxlen && len > maxlen) - len = maxlen; - if (flags & IORESOURCE_IO) - return ioport_map(start, len); - if (flags & IORESOURCE_MEM) { - if (flags & IORESOURCE_CACHEABLE) - return ioremap(start, len); - return ioremap_nocache(start, len); - } - /* What? */ - return NULL; -} - -void pci_iounmap(struct pci_dev *dev, void __iomem * addr) -{ - /* nothing to do */ -} -EXPORT_SYMBOL(pci_iomap); -EXPORT_SYMBOL(pci_iounmap); diff --git a/trunk/arch/sparc64/kernel/auxio.c b/trunk/arch/sparc64/kernel/auxio.c index c2c69c167d18..2c42894b188f 100644 --- a/trunk/arch/sparc64/kernel/auxio.c +++ b/trunk/arch/sparc64/kernel/auxio.c @@ -6,7 +6,6 @@ */ #include -#include #include #include #include @@ -17,8 +16,8 @@ #include #include +/* This cannot be static, as it is referenced in irq.c */ void __iomem *auxio_register = NULL; -EXPORT_SYMBOL(auxio_register); enum auxio_type { AUXIO_TYPE_NODEV, diff --git a/trunk/arch/sparc64/kernel/irq.c b/trunk/arch/sparc64/kernel/irq.c index cc89b06d0178..31e0fbb0d82c 100644 --- a/trunk/arch/sparc64/kernel/irq.c +++ b/trunk/arch/sparc64/kernel/irq.c @@ -563,6 +563,67 @@ void handler_irq(int irq, struct pt_regs *regs) irq_exit(); } +#ifdef CONFIG_BLK_DEV_FD +extern irqreturn_t floppy_interrupt(int, void *, struct pt_regs *); + +/* XXX No easy way to include asm/floppy.h XXX */ +extern unsigned char *pdma_vaddr; +extern unsigned long pdma_size; +extern volatile int doing_pdma; +extern unsigned long fdc_status; + +irqreturn_t sparc_floppy_irq(int irq, void *dev_cookie, struct pt_regs *regs) +{ + if (likely(doing_pdma)) { + void __iomem *stat = (void __iomem *) fdc_status; + unsigned char *vaddr = pdma_vaddr; + unsigned long size = pdma_size; + u8 val; + + while (size) { + val = readb(stat); + if (unlikely(!(val & 0x80))) { + pdma_vaddr = vaddr; + pdma_size = size; + return IRQ_HANDLED; + } + if (unlikely(!(val & 0x20))) { + pdma_vaddr = vaddr; + pdma_size = size; + doing_pdma = 0; + goto main_interrupt; + } + if (val & 0x40) { + /* read */ + *vaddr++ = readb(stat + 1); + } else { + unsigned char data = *vaddr++; + + /* write */ + writeb(data, stat + 1); + } + size--; + } + + pdma_vaddr = vaddr; + pdma_size = size; + + /* Send Terminal Count pulse to floppy controller. */ + val = readb(auxio_register); + val |= AUXIO_AUX1_FTCNT; + writeb(val, auxio_register); + val &= ~AUXIO_AUX1_FTCNT; + writeb(val, auxio_register); + + doing_pdma = 0; + } + +main_interrupt: + return floppy_interrupt(irq, dev_cookie, regs); +} +EXPORT_SYMBOL(sparc_floppy_irq); +#endif + struct sun5_timer { u64 count0; u64 limit0; diff --git a/trunk/arch/sparc64/kernel/of_device.c b/trunk/arch/sparc64/kernel/of_device.c index 768475bbce82..566aa343aa62 100644 --- a/trunk/arch/sparc64/kernel/of_device.c +++ b/trunk/arch/sparc64/kernel/of_device.c @@ -138,7 +138,6 @@ struct bus_type isa_bus_type = { .suspend = of_device_suspend, .resume = of_device_resume, }; -EXPORT_SYMBOL(isa_bus_type); struct bus_type ebus_bus_type = { .name = "ebus", @@ -148,7 +147,6 @@ struct bus_type ebus_bus_type = { .suspend = of_device_suspend, .resume = of_device_resume, }; -EXPORT_SYMBOL(ebus_bus_type); #endif #ifdef CONFIG_SBUS @@ -160,7 +158,6 @@ struct bus_type sbus_bus_type = { .suspend = of_device_suspend, .resume = of_device_resume, }; -EXPORT_SYMBOL(sbus_bus_type); #endif static int __init of_bus_driver_init(void) diff --git a/trunk/arch/sparc64/kernel/prom.c b/trunk/arch/sparc64/kernel/prom.c index 8e87e7ea0325..e9d703eea806 100644 --- a/trunk/arch/sparc64/kernel/prom.c +++ b/trunk/arch/sparc64/kernel/prom.c @@ -27,11 +27,6 @@ static struct device_node *allnodes; -/* use when traversing tree through the allnext, child, sibling, - * or parent members of struct device_node. - */ -static DEFINE_RWLOCK(devtree_lock); - int of_device_is_compatible(struct device_node *device, const char *compat) { const char* cp; @@ -190,54 +185,6 @@ int of_getintprop_default(struct device_node *np, const char *name, int def) } EXPORT_SYMBOL(of_getintprop_default); -int of_set_property(struct device_node *dp, const char *name, void *val, int len) -{ - struct property **prevp; - void *new_val; - int err; - - new_val = kmalloc(len, GFP_KERNEL); - if (!new_val) - return -ENOMEM; - - memcpy(new_val, val, len); - - err = -ENODEV; - - write_lock(&devtree_lock); - prevp = &dp->properties; - while (*prevp) { - struct property *prop = *prevp; - - if (!strcmp(prop->name, name)) { - void *old_val = prop->value; - int ret; - - ret = prom_setprop(dp->node, name, val, len); - err = -EINVAL; - if (ret >= 0) { - prop->value = new_val; - prop->length = len; - - if (OF_IS_DYNAMIC(prop)) - kfree(old_val); - - OF_MARK_DYNAMIC(prop); - - err = 0; - } - break; - } - prevp = &(*prevp)->next; - } - write_unlock(&devtree_lock); - - /* XXX Upate procfs if necessary... */ - - return err; -} -EXPORT_SYMBOL(of_set_property); - static unsigned int prom_early_allocated; static void * __init prom_early_alloc(unsigned long size) @@ -584,9 +531,7 @@ static char * __init build_full_name(struct device_node *dp) return n; } -static unsigned int unique_id; - -static struct property * __init build_one_prop(phandle node, char *prev, char *special_name, void *special_val, int special_len) +static struct property * __init build_one_prop(phandle node, char *prev) { static struct property *tmp = NULL; struct property *p; @@ -595,35 +540,25 @@ static struct property * __init build_one_prop(phandle node, char *prev, char *s p = tmp; memset(p, 0, sizeof(*p) + 32); tmp = NULL; - } else { + } else p = prom_early_alloc(sizeof(struct property) + 32); - p->unique_id = unique_id++; - } p->name = (char *) (p + 1); - if (special_name) { - strcpy(p->name, special_name); - p->length = special_len; - p->value = prom_early_alloc(special_len); - memcpy(p->value, special_val, special_len); + if (prev == NULL) { + prom_firstprop(node, p->name); } else { - if (prev == NULL) { - prom_firstprop(node, p->name); - } else { - prom_nextprop(node, prev, p->name); - } - if (strlen(p->name) == 0) { - tmp = p; - return NULL; - } - p->length = prom_getproplen(node, p->name); - if (p->length <= 0) { - p->length = 0; - } else { - p->value = prom_early_alloc(p->length + 1); - prom_getproperty(node, p->name, p->value, p->length); - ((unsigned char *)p->value)[p->length] = '\0'; - } + prom_nextprop(node, prev, p->name); + } + if (strlen(p->name) == 0) { + tmp = p; + return NULL; + } + p->length = prom_getproplen(node, p->name); + if (p->length <= 0) { + p->length = 0; + } else { + p->value = prom_early_alloc(p->length); + prom_getproperty(node, p->name, p->value, p->length); } return p; } @@ -632,14 +567,9 @@ static struct property * __init build_prop_list(phandle node) { struct property *head, *tail; - head = tail = build_one_prop(node, NULL, - ".node", &node, sizeof(node)); - - tail->next = build_one_prop(node, NULL, NULL, NULL, 0); - tail = tail->next; + head = tail = build_one_prop(node, NULL); while(tail) { - tail->next = build_one_prop(node, tail->name, - NULL, NULL, 0); + tail->next = build_one_prop(node, tail->name); tail = tail->next; } @@ -668,7 +598,6 @@ static struct device_node * __init create_node(phandle node) return NULL; dp = prom_early_alloc(sizeof(*dp)); - dp->unique_id = unique_id++; kref_init(&dp->kref); diff --git a/trunk/arch/sparc64/mm/fault.c b/trunk/arch/sparc64/mm/fault.c index 1605967cce91..6e002aacb961 100644 --- a/trunk/arch/sparc64/mm/fault.c +++ b/trunk/arch/sparc64/mm/fault.c @@ -31,40 +31,6 @@ #include #include -#ifdef CONFIG_KPROBES -ATOMIC_NOTIFIER_HEAD(notify_page_fault_chain); - -/* Hook to register for page fault notifications */ -int register_page_fault_notifier(struct notifier_block *nb) -{ - return atomic_notifier_chain_register(¬ify_page_fault_chain, nb); -} - -int unregister_page_fault_notifier(struct notifier_block *nb) -{ - return atomic_notifier_chain_unregister(¬ify_page_fault_chain, nb); -} - -static inline int notify_page_fault(enum die_val val, const char *str, - struct pt_regs *regs, long err, int trap, int sig) -{ - struct die_args args = { - .regs = regs, - .str = str, - .err = err, - .trapnr = trap, - .signr = sig - }; - return atomic_notifier_call_chain(¬ify_page_fault_chain, val, &args); -} -#else -static inline int notify_page_fault(enum die_val val, const char *str, - struct pt_regs *regs, long err, int trap, int sig) -{ - return NOTIFY_DONE; -} -#endif - /* * To debug kernel to catch accesses to certain virtual/physical addresses. * Mode = 0 selects physical watchpoints, mode = 1 selects virtual watchpoints. @@ -297,7 +263,7 @@ asmlinkage void __kprobes do_sparc64_fault(struct pt_regs *regs) fault_code = get_thread_fault_code(); - if (notify_page_fault(DIE_PAGE_FAULT, "page_fault", regs, + if (notify_die(DIE_PAGE_FAULT, "page_fault", regs, fault_code, 0, SIGSEGV) == NOTIFY_STOP) return; diff --git a/trunk/arch/sparc64/mm/init.c b/trunk/arch/sparc64/mm/init.c index 5c2bcf354ce6..513993414747 100644 --- a/trunk/arch/sparc64/mm/init.c +++ b/trunk/arch/sparc64/mm/init.c @@ -1568,7 +1568,6 @@ pgprot_t PAGE_EXEC __read_mostly; unsigned long pg_iobits __read_mostly; unsigned long _PAGE_IE __read_mostly; -EXPORT_SYMBOL(_PAGE_IE); unsigned long _PAGE_E __read_mostly; EXPORT_SYMBOL(_PAGE_E); diff --git a/trunk/arch/x86_64/Kconfig b/trunk/arch/x86_64/Kconfig index ccc4a7fb97a3..af44130f0d65 100644 --- a/trunk/arch/x86_64/Kconfig +++ b/trunk/arch/x86_64/Kconfig @@ -386,45 +386,24 @@ config HPET_EMULATE_RTC bool "Provide RTC interrupt" depends on HPET_TIMER && RTC=y -# Mark as embedded because too many people got it wrong. -# The code disables itself when not needed. -config IOMMU - bool "IOMMU support" if EMBEDDED +config GART_IOMMU + bool "K8 GART IOMMU support" default y select SWIOTLB select AGP depends on PCI help - Support for full DMA access of devices with 32bit memory access only - on systems with more than 3GB. This is usually needed for USB, - sound, many IDE/SATA chipsets and some other devices. - Provides a driver for the AMD Athlon64/Opteron/Turion/Sempron GART - based IOMMU and a software bounce buffer based IOMMU used on Intel - systems and as fallback. - The code is only active when needed (enough memory and limited - device) unless CONFIG_IOMMU_DEBUG or iommu=force is specified - too. - -config CALGARY_IOMMU - bool "IBM Calgary IOMMU support" - default y - select SWIOTLB - depends on PCI && EXPERIMENTAL - help - Support for hardware IOMMUs in IBM's xSeries x366 and x460 - systems. Needed to run systems with more than 3GB of memory - properly with 32-bit PCI devices that do not support DAC - (Double Address Cycle). Calgary also supports bus level - isolation, where all DMAs pass through the IOMMU. This - prevents them from going anywhere except their intended - destination. This catches hard-to-find kernel bugs and - mis-behaving drivers and devices that do not use the DMA-API - properly to set up their DMA buffers. The IOMMU can be - turned off at boot time with the iommu=off parameter. - Normally the kernel will make the right choice by itself. - If unsure, say Y. - -# need this always selected by IOMMU for the VIA workaround + Support for hardware IOMMU in AMD's Opteron/Athlon64 Processors + and for the bounce buffering software IOMMU. + Needed to run systems with more than 3GB of memory properly with + 32-bit PCI devices that do not support DAC (Double Address Cycle). + The IOMMU can be turned off at runtime with the iommu=off parameter. + Normally the kernel will take the right choice by itself. + This option includes a driver for the AMD Opteron/Athlon64 IOMMU + northbridge and a software emulation used on other systems without + hardware IOMMU. If unsure, say Y. + +# need this always selected by GART_IOMMU for the VIA workaround config SWIOTLB bool @@ -522,10 +501,6 @@ config REORDER optimal TLB usage. If you have pretty much any version of binutils, this can increase your kernel build time by roughly one minute. -config K8_NB - def_bool y - depends on AGP_AMD64 || IOMMU || (PCI && NUMA) - endmenu # diff --git a/trunk/arch/x86_64/Kconfig.debug b/trunk/arch/x86_64/Kconfig.debug index 1d92ab56c0f9..ea31b4c62105 100644 --- a/trunk/arch/x86_64/Kconfig.debug +++ b/trunk/arch/x86_64/Kconfig.debug @@ -13,7 +13,7 @@ config DEBUG_RODATA If in doubt, say "N". config IOMMU_DEBUG - depends on IOMMU && DEBUG_KERNEL + depends on GART_IOMMU && DEBUG_KERNEL bool "Enable IOMMU debugging" help Force the IOMMU to on even when you have less than 4GB of @@ -35,22 +35,6 @@ config IOMMU_LEAK Add a simple leak tracer to the IOMMU code. This is useful when you are debugging a buggy device driver that leaks IOMMU mappings. -config DEBUG_STACKOVERFLOW - bool "Check for stack overflows" - depends on DEBUG_KERNEL - help - This option will cause messages to be printed if free stack space - drops below a certain limit. - -config DEBUG_STACK_USAGE - bool "Stack utilization instrumentation" - depends on DEBUG_KERNEL - help - Enables the display of the minimum amount of free stack which each - task has ever had available in the sysrq-T and sysrq-P debug output. - - This option will slow down process creation somewhat. - #config X86_REMOTE_DEBUG # bool "kgdb debugging stub" diff --git a/trunk/arch/x86_64/Makefile b/trunk/arch/x86_64/Makefile index 431bb4bc36cd..e573e2ab5510 100644 --- a/trunk/arch/x86_64/Makefile +++ b/trunk/arch/x86_64/Makefile @@ -27,7 +27,6 @@ LDFLAGS_vmlinux := CHECKFLAGS += -D__x86_64__ -m64 cflags-y := -cflags-kernel-y := cflags-$(CONFIG_MK8) += $(call cc-option,-march=k8) cflags-$(CONFIG_MPSC) += $(call cc-option,-march=nocona) cflags-$(CONFIG_GENERIC_CPU) += $(call cc-option,-mtune=generic) @@ -36,7 +35,7 @@ cflags-y += -m64 cflags-y += -mno-red-zone cflags-y += -mcmodel=kernel cflags-y += -pipe -cflags-kernel-$(CONFIG_REORDER) += -ffunction-sections +cflags-$(CONFIG_REORDER) += -ffunction-sections # this makes reading assembly source easier, but produces worse code # actually it makes the kernel smaller too. cflags-y += -fno-reorder-blocks @@ -56,7 +55,6 @@ cflags-y += $(call cc-option,-funit-at-a-time) cflags-y += $(call cc-option,-mno-sse -mno-mmx -mno-sse2 -mno-3dnow,) CFLAGS += $(cflags-y) -CFLAGS_KERNEL += $(cflags-kernel-y) AFLAGS += -m64 head-y := arch/x86_64/kernel/head.o arch/x86_64/kernel/head64.o arch/x86_64/kernel/init_task.o diff --git a/trunk/arch/x86_64/boot/Makefile b/trunk/arch/x86_64/boot/Makefile index deb063e7762d..43ee6c50c277 100644 --- a/trunk/arch/x86_64/boot/Makefile +++ b/trunk/arch/x86_64/boot/Makefile @@ -107,13 +107,8 @@ fdimage288: $(BOOTIMAGE) $(obj)/mtools.conf isoimage: $(BOOTIMAGE) -rm -rf $(obj)/isoimage mkdir $(obj)/isoimage - for i in lib lib64 share end ; do \ - if [ -f /usr/$$i/syslinux/isolinux.bin ] ; then \ - cp /usr/$$i/syslinux/isolinux.bin $(obj)/isoimage ; \ - break ; \ - fi ; \ - if [ $$i = end ] ; then exit 1 ; fi ; \ - done + cp `echo /usr/lib*/syslinux/isolinux.bin | awk '{ print $1; }'` \ + $(obj)/isoimage cp $(BOOTIMAGE) $(obj)/isoimage/linux echo '$(image_cmdline)' > $(obj)/isoimage/isolinux.cfg if [ -f '$(FDINITRD)' ] ; then \ diff --git a/trunk/arch/x86_64/boot/compressed/misc.c b/trunk/arch/x86_64/boot/compressed/misc.c index 3755b2e394d0..cf4b88c416dc 100644 --- a/trunk/arch/x86_64/boot/compressed/misc.c +++ b/trunk/arch/x86_64/boot/compressed/misc.c @@ -77,11 +77,11 @@ static void gzip_release(void **); */ static unsigned char *real_mode; /* Pointer to real-mode data */ -#define RM_EXT_MEM_K (*(unsigned short *)(real_mode + 0x2)) +#define EXT_MEM_K (*(unsigned short *)(real_mode + 0x2)) #ifndef STANDARD_MEMORY_BIOS_CALL -#define RM_ALT_MEM_K (*(unsigned long *)(real_mode + 0x1e0)) +#define ALT_MEM_K (*(unsigned long *)(real_mode + 0x1e0)) #endif -#define RM_SCREEN_INFO (*(struct screen_info *)(real_mode+0)) +#define SCREEN_INFO (*(struct screen_info *)(real_mode+0)) extern unsigned char input_data[]; extern int input_len; @@ -92,9 +92,9 @@ static unsigned long output_ptr = 0; static void *malloc(int size); static void free(void *where); - -static void *memset(void *s, int c, unsigned n); -static void *memcpy(void *dest, const void *src, unsigned n); + +void* memset(void* s, int c, unsigned n); +void* memcpy(void* dest, const void* src, unsigned n); static void putstr(const char *); @@ -162,8 +162,8 @@ static void putstr(const char *s) int x,y,pos; char c; - x = RM_SCREEN_INFO.orig_x; - y = RM_SCREEN_INFO.orig_y; + x = SCREEN_INFO.orig_x; + y = SCREEN_INFO.orig_y; while ( ( c = *s++ ) != '\0' ) { if ( c == '\n' ) { @@ -184,8 +184,8 @@ static void putstr(const char *s) } } - RM_SCREEN_INFO.orig_x = x; - RM_SCREEN_INFO.orig_y = y; + SCREEN_INFO.orig_x = x; + SCREEN_INFO.orig_y = y; pos = (x + cols * y) * 2; /* Update cursor position */ outb_p(14, vidport); @@ -194,7 +194,7 @@ static void putstr(const char *s) outb_p(0xff & (pos >> 1), vidport+1); } -static void* memset(void* s, int c, unsigned n) +void* memset(void* s, int c, unsigned n) { int i; char *ss = (char*)s; @@ -203,7 +203,7 @@ static void* memset(void* s, int c, unsigned n) return s; } -static void* memcpy(void* dest, const void* src, unsigned n) +void* memcpy(void* dest, const void* src, unsigned n) { int i; char *d = (char *)dest, *s = (char *)src; @@ -278,15 +278,15 @@ static void error(char *x) putstr(x); putstr("\n\n -- System halted"); - while(1); /* Halt */ + while(1); } -static void setup_normal_output_buffer(void) +void setup_normal_output_buffer(void) { #ifdef STANDARD_MEMORY_BIOS_CALL - if (RM_EXT_MEM_K < 1024) error("Less than 2MB of memory"); + if (EXT_MEM_K < 1024) error("Less than 2MB of memory"); #else - if ((RM_ALT_MEM_K > RM_EXT_MEM_K ? RM_ALT_MEM_K : RM_EXT_MEM_K) < 1024) error("Less than 2MB of memory"); + if ((ALT_MEM_K > EXT_MEM_K ? ALT_MEM_K : EXT_MEM_K) < 1024) error("Less than 2MB of memory"); #endif output_data = (unsigned char *)__PHYSICAL_START; /* Normally Points to 1M */ free_mem_end_ptr = (long)real_mode; @@ -297,13 +297,13 @@ struct moveparams { uch *high_buffer_start; int hcount; }; -static void setup_output_buffer_if_we_run_high(struct moveparams *mv) +void setup_output_buffer_if_we_run_high(struct moveparams *mv) { high_buffer_start = (uch *)(((ulg)&end) + HEAP_SIZE); #ifdef STANDARD_MEMORY_BIOS_CALL - if (RM_EXT_MEM_K < (3*1024)) error("Less than 4MB of memory"); + if (EXT_MEM_K < (3*1024)) error("Less than 4MB of memory"); #else - if ((RM_ALT_MEM_K > RM_EXT_MEM_K ? RM_ALT_MEM_K : RM_EXT_MEM_K) < (3*1024)) error("Less than 4MB of memory"); + if ((ALT_MEM_K > EXT_MEM_K ? ALT_MEM_K : EXT_MEM_K) < (3*1024)) error("Less than 4MB of memory"); #endif mv->low_buffer_start = output_data = (unsigned char *)LOW_BUFFER_START; low_buffer_end = ((unsigned int)real_mode > LOW_BUFFER_MAX @@ -319,7 +319,7 @@ static void setup_output_buffer_if_we_run_high(struct moveparams *mv) mv->high_buffer_start = high_buffer_start; } -static void close_output_buffer_if_we_run_high(struct moveparams *mv) +void close_output_buffer_if_we_run_high(struct moveparams *mv) { if (bytes_out > low_buffer_size) { mv->lcount = low_buffer_size; @@ -335,7 +335,7 @@ int decompress_kernel(struct moveparams *mv, void *rmode) { real_mode = rmode; - if (RM_SCREEN_INFO.orig_video_mode == 7) { + if (SCREEN_INFO.orig_video_mode == 7) { vidmem = (char *) 0xb0000; vidport = 0x3b4; } else { @@ -343,8 +343,8 @@ int decompress_kernel(struct moveparams *mv, void *rmode) vidport = 0x3d4; } - lines = RM_SCREEN_INFO.orig_video_lines; - cols = RM_SCREEN_INFO.orig_video_cols; + lines = SCREEN_INFO.orig_video_lines; + cols = SCREEN_INFO.orig_video_cols; if (free_mem_ptr < 0x100000) setup_normal_output_buffer(); else setup_output_buffer_if_we_run_high(mv); diff --git a/trunk/arch/x86_64/boot/tools/build.c b/trunk/arch/x86_64/boot/tools/build.c index eae86691709a..c44f5e2ec100 100644 --- a/trunk/arch/x86_64/boot/tools/build.c +++ b/trunk/arch/x86_64/boot/tools/build.c @@ -149,8 +149,10 @@ int main(int argc, char ** argv) sz = sb.st_size; fprintf (stderr, "System is %d kB\n", sz/1024); sys_size = (sz + 15) / 16; - if (!is_big_kernel && sys_size > DEF_SYSSIZE) - die("System is too big. Try using bzImage or modules."); + /* 0x40000*16 = 4.0 MB, reasonable estimate for the current maximum */ + if (sys_size > (is_big_kernel ? 0x40000 : DEF_SYSSIZE)) + die("System is too big. Try using %smodules.", + is_big_kernel ? "" : "bzImage or "); while (sz > 0) { int l, n; diff --git a/trunk/arch/x86_64/boot/video.S b/trunk/arch/x86_64/boot/video.S index 2aa565c136e5..32327bb37aff 100644 --- a/trunk/arch/x86_64/boot/video.S +++ b/trunk/arch/x86_64/boot/video.S @@ -1929,7 +1929,6 @@ skip10: movb %ah, %al ret store_edid: -#ifdef CONFIG_FIRMWARE_EDID pushw %es # just save all registers pushw %ax pushw %bx @@ -1947,22 +1946,6 @@ store_edid: rep stosl - pushw %es # save ES - xorw %di, %di # Report Capability - pushw %di - popw %es # ES:DI must be 0:0 - movw $0x4f15, %ax - xorw %bx, %bx - xorw %cx, %cx - int $0x10 - popw %es # restore ES - - cmpb $0x00, %ah # call successful - jne no_edid - - cmpb $0x4f, %al # function supported - jne no_edid - movw $0x4f15, %ax # do VBE/DDC movw $0x01, %bx movw $0x00, %cx @@ -1970,14 +1953,12 @@ store_edid: movw $0x140, %di int $0x10 -no_edid: popw %di # restore all registers popw %dx popw %cx popw %bx popw %ax popw %es -#endif ret # VIDEO_SELECT-only variables diff --git a/trunk/arch/x86_64/defconfig b/trunk/arch/x86_64/defconfig index e69d403949c8..69db0c0721d1 100644 --- a/trunk/arch/x86_64/defconfig +++ b/trunk/arch/x86_64/defconfig @@ -1,7 +1,7 @@ # # Automatically generated make config: don't edit -# Linux kernel version: 2.6.17-git6 -# Sat Jun 24 00:52:28 2006 +# Linux kernel version: 2.6.17-rc1-git11 +# Sun Apr 16 07:22:36 2006 # CONFIG_X86_64=y CONFIG_64BIT=y @@ -42,6 +42,7 @@ CONFIG_IKCONFIG_PROC=y # CONFIG_RELAY is not set CONFIG_INITRAMFS_SOURCE="" CONFIG_UID16=y +CONFIG_VM86=y CONFIG_CC_OPTIMIZE_FOR_SIZE=y # CONFIG_EMBEDDED is not set CONFIG_KALLSYMS=y @@ -56,6 +57,7 @@ CONFIG_FUTEX=y CONFIG_EPOLL=y CONFIG_SHMEM=y CONFIG_SLAB=y +CONFIG_DOUBLEFAULT=y # CONFIG_TINY_SHMEM is not set CONFIG_BASE_SMALL=0 # CONFIG_SLOB is not set @@ -142,8 +144,7 @@ CONFIG_NR_CPUS=32 CONFIG_HOTPLUG_CPU=y CONFIG_HPET_TIMER=y CONFIG_HPET_EMULATE_RTC=y -CONFIG_IOMMU=y -# CONFIG_CALGARY_IOMMU is not set +CONFIG_GART_IOMMU=y CONFIG_SWIOTLB=y CONFIG_X86_MCE=y CONFIG_X86_MCE_INTEL=y @@ -157,7 +158,6 @@ CONFIG_HZ_250=y # CONFIG_HZ_1000 is not set CONFIG_HZ=250 # CONFIG_REORDER is not set -CONFIG_K8_NB=y CONFIG_GENERIC_HARDIRQS=y CONFIG_GENERIC_IRQ_PROBE=y CONFIG_ISA_DMA_API=y @@ -293,8 +293,6 @@ CONFIG_IP_PNP_DHCP=y # CONFIG_INET_IPCOMP is not set # CONFIG_INET_XFRM_TUNNEL is not set # CONFIG_INET_TUNNEL is not set -# CONFIG_INET_XFRM_MODE_TRANSPORT is not set -# CONFIG_INET_XFRM_MODE_TUNNEL is not set CONFIG_INET_DIAG=y CONFIG_INET_TCP_DIAG=y # CONFIG_TCP_CONG_ADVANCED is not set @@ -307,10 +305,7 @@ CONFIG_IPV6=y # CONFIG_INET6_IPCOMP is not set # CONFIG_INET6_XFRM_TUNNEL is not set # CONFIG_INET6_TUNNEL is not set -# CONFIG_INET6_XFRM_MODE_TRANSPORT is not set -# CONFIG_INET6_XFRM_MODE_TUNNEL is not set # CONFIG_IPV6_TUNNEL is not set -# CONFIG_NETWORK_SECMARK is not set # CONFIG_NETFILTER is not set # @@ -349,7 +344,6 @@ CONFIG_IPV6=y # Network testing # # CONFIG_NET_PKTGEN is not set -# CONFIG_NET_TCPPROBE is not set # CONFIG_HAMRADIO is not set # CONFIG_IRDA is not set # CONFIG_BT is not set @@ -366,7 +360,6 @@ CONFIG_STANDALONE=y CONFIG_PREVENT_FIRMWARE_BUILD=y CONFIG_FW_LOADER=y # CONFIG_DEBUG_DRIVER is not set -# CONFIG_SYS_HYPERVISOR is not set # # Connector - unified userspace <-> kernelspace linker @@ -533,7 +526,6 @@ CONFIG_SCSI_ATA_PIIX=y # CONFIG_SCSI_SATA_MV is not set CONFIG_SCSI_SATA_NV=y # CONFIG_SCSI_PDC_ADMA is not set -# CONFIG_SCSI_HPTIOP is not set # CONFIG_SCSI_SATA_QSTOR is not set # CONFIG_SCSI_SATA_PROMISE is not set # CONFIG_SCSI_SATA_SX4 is not set @@ -599,7 +591,10 @@ CONFIG_IEEE1394=y # # Device Drivers # -# CONFIG_IEEE1394_PCILYNX is not set + +# +# Texas Instruments PCILynx requires I2C +# CONFIG_IEEE1394_OHCI1394=y # @@ -650,16 +645,7 @@ CONFIG_VORTEX=y # # Tulip family network device support # -CONFIG_NET_TULIP=y -# CONFIG_DE2104X is not set -CONFIG_TULIP=y -# CONFIG_TULIP_MWI is not set -# CONFIG_TULIP_MMIO is not set -# CONFIG_TULIP_NAPI is not set -# CONFIG_DE4X5 is not set -# CONFIG_WINBOND_840 is not set -# CONFIG_DM9102 is not set -# CONFIG_ULI526X is not set +# CONFIG_NET_TULIP is not set # CONFIG_HP100 is not set CONFIG_NET_PCI=y # CONFIG_PCNET32 is not set @@ -711,7 +697,6 @@ CONFIG_TIGON3=y # CONFIG_IXGB is not set CONFIG_S2IO=m # CONFIG_S2IO_NAPI is not set -# CONFIG_MYRI10GE is not set # # Token Ring devices @@ -902,56 +887,7 @@ CONFIG_HPET_MMAP=y # # I2C support # -CONFIG_I2C=m -CONFIG_I2C_CHARDEV=m - -# -# I2C Algorithms -# -# CONFIG_I2C_ALGOBIT is not set -# CONFIG_I2C_ALGOPCF is not set -# CONFIG_I2C_ALGOPCA is not set - -# -# I2C Hardware Bus support -# -# CONFIG_I2C_ALI1535 is not set -# CONFIG_I2C_ALI1563 is not set -# CONFIG_I2C_ALI15X3 is not set -# CONFIG_I2C_AMD756 is not set -# CONFIG_I2C_AMD8111 is not set -# CONFIG_I2C_I801 is not set -# CONFIG_I2C_I810 is not set -# CONFIG_I2C_PIIX4 is not set -CONFIG_I2C_ISA=m -# CONFIG_I2C_NFORCE2 is not set -# CONFIG_I2C_OCORES is not set -# CONFIG_I2C_PARPORT_LIGHT is not set -# CONFIG_I2C_PROSAVAGE is not set -# CONFIG_I2C_SAVAGE4 is not set -# CONFIG_I2C_SIS5595 is not set -# CONFIG_I2C_SIS630 is not set -# CONFIG_I2C_SIS96X is not set -# CONFIG_I2C_STUB is not set -# CONFIG_I2C_VIA is not set -# CONFIG_I2C_VIAPRO is not set -# CONFIG_I2C_VOODOO3 is not set -# CONFIG_I2C_PCA_ISA is not set - -# -# Miscellaneous I2C Chip support -# -# CONFIG_SENSORS_DS1337 is not set -# CONFIG_SENSORS_DS1374 is not set -# CONFIG_SENSORS_EEPROM is not set -# CONFIG_SENSORS_PCF8574 is not set -# CONFIG_SENSORS_PCA9539 is not set -# CONFIG_SENSORS_PCF8591 is not set -# CONFIG_SENSORS_MAX6875 is not set -# CONFIG_I2C_DEBUG_CORE is not set -# CONFIG_I2C_DEBUG_ALGO is not set -# CONFIG_I2C_DEBUG_BUS is not set -# CONFIG_I2C_DEBUG_CHIP is not set +# CONFIG_I2C is not set # # SPI support @@ -962,51 +898,14 @@ CONFIG_I2C_ISA=m # # Dallas's 1-wire bus # +# CONFIG_W1 is not set # # Hardware Monitoring support # CONFIG_HWMON=y # CONFIG_HWMON_VID is not set -# CONFIG_SENSORS_ABITUGURU is not set -# CONFIG_SENSORS_ADM1021 is not set -# CONFIG_SENSORS_ADM1025 is not set -# CONFIG_SENSORS_ADM1026 is not set -# CONFIG_SENSORS_ADM1031 is not set -# CONFIG_SENSORS_ADM9240 is not set -# CONFIG_SENSORS_ASB100 is not set -# CONFIG_SENSORS_ATXP1 is not set -# CONFIG_SENSORS_DS1621 is not set # CONFIG_SENSORS_F71805F is not set -# CONFIG_SENSORS_FSCHER is not set -# CONFIG_SENSORS_FSCPOS is not set -# CONFIG_SENSORS_GL518SM is not set -# CONFIG_SENSORS_GL520SM is not set -# CONFIG_SENSORS_IT87 is not set -# CONFIG_SENSORS_LM63 is not set -# CONFIG_SENSORS_LM75 is not set -# CONFIG_SENSORS_LM77 is not set -# CONFIG_SENSORS_LM78 is not set -# CONFIG_SENSORS_LM80 is not set -# CONFIG_SENSORS_LM83 is not set -# CONFIG_SENSORS_LM85 is not set -# CONFIG_SENSORS_LM87 is not set -# CONFIG_SENSORS_LM90 is not set -# CONFIG_SENSORS_LM92 is not set -# CONFIG_SENSORS_MAX1619 is not set -# CONFIG_SENSORS_PC87360 is not set -# CONFIG_SENSORS_SIS5595 is not set -# CONFIG_SENSORS_SMSC47M1 is not set -# CONFIG_SENSORS_SMSC47M192 is not set -CONFIG_SENSORS_SMSC47B397=m -# CONFIG_SENSORS_VIA686A is not set -# CONFIG_SENSORS_VT8231 is not set -# CONFIG_SENSORS_W83781D is not set -# CONFIG_SENSORS_W83791D is not set -# CONFIG_SENSORS_W83792D is not set -# CONFIG_SENSORS_W83L785TS is not set -# CONFIG_SENSORS_W83627HF is not set -# CONFIG_SENSORS_W83627EHF is not set # CONFIG_SENSORS_HDAPS is not set # CONFIG_HWMON_DEBUG_CHIP is not set @@ -1019,7 +918,6 @@ CONFIG_SENSORS_SMSC47B397=m # Multimedia devices # # CONFIG_VIDEO_DEV is not set -CONFIG_VIDEO_V4L2=y # # Digital Video Broadcasting Devices @@ -1055,17 +953,28 @@ CONFIG_SOUND=y # Open Sound System # CONFIG_SOUND_PRIME=y +CONFIG_OBSOLETE_OSS_DRIVER=y # CONFIG_SOUND_BT878 is not set +# CONFIG_SOUND_CMPCI is not set # CONFIG_SOUND_EMU10K1 is not set # CONFIG_SOUND_FUSION is not set +# CONFIG_SOUND_CS4281 is not set +# CONFIG_SOUND_ES1370 is not set # CONFIG_SOUND_ES1371 is not set +# CONFIG_SOUND_ESSSOLO1 is not set +# CONFIG_SOUND_MAESTRO is not set +# CONFIG_SOUND_MAESTRO3 is not set CONFIG_SOUND_ICH=y +# CONFIG_SOUND_SONICVIBES is not set # CONFIG_SOUND_TRIDENT is not set # CONFIG_SOUND_MSNDCLAS is not set # CONFIG_SOUND_MSNDPIN is not set # CONFIG_SOUND_VIA82CXXX is not set # CONFIG_SOUND_OSS is not set -# CONFIG_SOUND_TVMIXER is not set +# CONFIG_SOUND_ALI5455 is not set +# CONFIG_SOUND_FORTE is not set +# CONFIG_SOUND_RME96XX is not set +# CONFIG_SOUND_AD1980 is not set # # USB support @@ -1091,7 +1000,6 @@ CONFIG_USB_DEVICEFS=y CONFIG_USB_EHCI_HCD=y # CONFIG_USB_EHCI_SPLIT_ISO is not set # CONFIG_USB_EHCI_ROOT_HUB_TT is not set -# CONFIG_USB_EHCI_TT_NEWSCHED is not set # CONFIG_USB_ISP116X_HCD is not set CONFIG_USB_OHCI_HCD=y # CONFIG_USB_OHCI_BIG_ENDIAN is not set @@ -1181,12 +1089,10 @@ CONFIG_USB_MON=y # CONFIG_USB_LEGOTOWER is not set # CONFIG_USB_LCD is not set # CONFIG_USB_LED is not set -# CONFIG_USB_CY7C63 is not set # CONFIG_USB_CYTHERM is not set # CONFIG_USB_PHIDGETKIT is not set # CONFIG_USB_PHIDGETSERVO is not set # CONFIG_USB_IDMOUSE is not set -# CONFIG_USB_APPLEDISPLAY is not set # CONFIG_USB_SISUSBVGA is not set # CONFIG_USB_LD is not set # CONFIG_USB_TEST is not set @@ -1234,19 +1140,6 @@ CONFIG_USB_MON=y # # CONFIG_RTC_CLASS is not set -# -# DMA Engine support -# -# CONFIG_DMA_ENGINE is not set - -# -# DMA Clients -# - -# -# DMA Devices -# - # # Firmware Drivers # @@ -1282,7 +1175,6 @@ CONFIG_FS_POSIX_ACL=y # CONFIG_MINIX_FS is not set # CONFIG_ROMFS_FS is not set CONFIG_INOTIFY=y -CONFIG_INOTIFY_USER=y # CONFIG_QUOTA is not set CONFIG_DNOTIFY=y CONFIG_AUTOFS_FS=y @@ -1439,8 +1331,7 @@ CONFIG_DETECT_SOFTLOCKUP=y CONFIG_DEBUG_FS=y # CONFIG_DEBUG_VM is not set # CONFIG_FRAME_POINTER is not set -CONFIG_UNWIND_INFO=y -CONFIG_STACK_UNWIND=y +# CONFIG_UNWIND_INFO is not set # CONFIG_FORCED_INLINING is not set # CONFIG_RCU_TORTURE_TEST is not set # CONFIG_DEBUG_RODATA is not set diff --git a/trunk/arch/x86_64/ia32/fpu32.c b/trunk/arch/x86_64/ia32/fpu32.c index 2c8209a3605a..1c23095f1813 100644 --- a/trunk/arch/x86_64/ia32/fpu32.c +++ b/trunk/arch/x86_64/ia32/fpu32.c @@ -2,6 +2,7 @@ * Copyright 2002 Andi Kleen, SuSE Labs. * FXSAVE<->i387 conversion support. Based on code by Gareth Hughes. * This is used for ptrace, signals and coredumps in 32bit emulation. + * $Id: fpu32.c,v 1.1 2002/03/21 14:16:32 ak Exp $ */ #include diff --git a/trunk/arch/x86_64/ia32/ia32_signal.c b/trunk/arch/x86_64/ia32/ia32_signal.c index 25e5ca22204c..e0a92439f634 100644 --- a/trunk/arch/x86_64/ia32/ia32_signal.c +++ b/trunk/arch/x86_64/ia32/ia32_signal.c @@ -6,6 +6,8 @@ * 1997-11-28 Modified for POSIX.1b signals by Richard Henderson * 2000-06-20 Pentium III FXSR, SSE support by Gareth Hughes * 2000-12-* x86-64 compatibility mode signal handling by Andi Kleen + * + * $Id: ia32_signal.c,v 1.22 2002/07/29 10:34:03 ak Exp $ */ #include diff --git a/trunk/arch/x86_64/ia32/ia32entry.S b/trunk/arch/x86_64/ia32/ia32entry.S index c536fa98ea37..4ec594ab1a98 100644 --- a/trunk/arch/x86_64/ia32/ia32entry.S +++ b/trunk/arch/x86_64/ia32/ia32entry.S @@ -155,7 +155,6 @@ sysenter_tracesys: .previous jmp sysenter_do_call CFI_ENDPROC -ENDPROC(ia32_sysenter_target) /* * 32bit SYSCALL instruction entry. @@ -179,7 +178,7 @@ ENDPROC(ia32_sysenter_target) */ ENTRY(ia32_cstar_target) CFI_STARTPROC32 simple - CFI_DEF_CFA rsp,PDA_STACKOFFSET + CFI_DEF_CFA rsp,0 CFI_REGISTER rip,rcx /*CFI_REGISTER rflags,r11*/ swapgs @@ -250,7 +249,6 @@ cstar_tracesys: .quad 1b,ia32_badarg .previous jmp cstar_do_call -END(ia32_cstar_target) ia32_badarg: movq $-EFAULT,%rax @@ -316,13 +314,16 @@ ia32_tracesys: LOAD_ARGS ARGOFFSET /* reload args from stack in case ptrace changed it */ RESTORE_REST jmp ia32_do_syscall -END(ia32_syscall) ia32_badsys: movq $0,ORIG_RAX-ARGOFFSET(%rsp) movq $-ENOSYS,RAX-ARGOFFSET(%rsp) jmp int_ret_from_sys_call +ni_syscall: + movq %rax,%rdi + jmp sys32_ni_syscall + quiet_ni_syscall: movq $-ENOSYS,%rax ret @@ -369,10 +370,10 @@ ENTRY(ia32_ptregs_common) RESTORE_REST jmp ia32_sysret /* misbalances the return cache */ CFI_ENDPROC -END(ia32_ptregs_common) .section .rodata,"a" .align 8 + .globl ia32_sys_call_table ia32_sys_call_table: .quad sys_restart_syscall .quad sys_exit diff --git a/trunk/arch/x86_64/ia32/ptrace32.c b/trunk/arch/x86_64/ia32/ptrace32.c index a590b7a0d92d..23a4515a73b4 100644 --- a/trunk/arch/x86_64/ia32/ptrace32.c +++ b/trunk/arch/x86_64/ia32/ptrace32.c @@ -7,6 +7,8 @@ * * This allows to access 64bit processes too; but there is no way to see the extended * register contents. + * + * $Id: ptrace32.c,v 1.16 2003/03/14 16:06:35 ak Exp $ */ #include @@ -25,7 +27,6 @@ #include #include #include -#include /* * Determines which flags the user has access to [1 = access, 0 = no access]. @@ -198,24 +199,6 @@ static int getreg32(struct task_struct *child, unsigned regno, u32 *val) #undef R32 -static long ptrace32_siginfo(unsigned request, u32 pid, u32 addr, u32 data) -{ - int ret; - compat_siginfo_t *si32 = (compat_siginfo_t *)compat_ptr(data); - siginfo_t *si = compat_alloc_user_space(sizeof(siginfo_t)); - if (request == PTRACE_SETSIGINFO) { - ret = copy_siginfo_from_user32(si, si32); - if (ret) - return ret; - } - ret = sys_ptrace(request, pid, addr, (unsigned long)si); - if (ret) - return ret; - if (request == PTRACE_GETSIGINFO) - ret = copy_siginfo_to_user32(si32, si); - return ret; -} - asmlinkage long sys32_ptrace(long request, u32 pid, u32 addr, u32 data) { struct task_struct *child; @@ -225,18 +208,8 @@ asmlinkage long sys32_ptrace(long request, u32 pid, u32 addr, u32 data) __u32 val; switch (request) { - case PTRACE_TRACEME: - case PTRACE_ATTACH: - case PTRACE_KILL: - case PTRACE_CONT: - case PTRACE_SINGLESTEP: - case PTRACE_DETACH: - case PTRACE_SYSCALL: - case PTRACE_SETOPTIONS: - return sys_ptrace(request, pid, addr, data); - default: - return -EINVAL; + return sys_ptrace(request, pid, addr, data); case PTRACE_PEEKTEXT: case PTRACE_PEEKDATA: @@ -252,11 +225,10 @@ asmlinkage long sys32_ptrace(long request, u32 pid, u32 addr, u32 data) case PTRACE_GETFPXREGS: case PTRACE_GETEVENTMSG: break; + } - case PTRACE_SETSIGINFO: - case PTRACE_GETSIGINFO: - return ptrace32_siginfo(request, pid, addr, data); - } + if (request == PTRACE_TRACEME) + return ptrace_traceme(); child = ptrace_get_task_struct(pid); if (IS_ERR(child)) @@ -377,7 +349,8 @@ asmlinkage long sys32_ptrace(long request, u32 pid, u32 addr, u32 data) break; default: - BUG(); + ret = -EINVAL; + break; } out: diff --git a/trunk/arch/x86_64/ia32/sys_ia32.c b/trunk/arch/x86_64/ia32/sys_ia32.c index dc88154c412b..f182b20858e2 100644 --- a/trunk/arch/x86_64/ia32/sys_ia32.c +++ b/trunk/arch/x86_64/ia32/sys_ia32.c @@ -508,6 +508,19 @@ sys32_waitpid(compat_pid_t pid, unsigned int *stat_addr, int options) return compat_sys_wait4(pid, stat_addr, options, NULL); } +int sys32_ni_syscall(int call) +{ + struct task_struct *me = current; + static char lastcomm[sizeof(me->comm)]; + + if (strncmp(lastcomm, me->comm, sizeof(lastcomm))) { + printk(KERN_INFO "IA32 syscall %d from %s not implemented\n", + call, me->comm); + strncpy(lastcomm, me->comm, sizeof(lastcomm)); + } + return -ENOSYS; +} + /* 32-bit timeval and related flotsam. */ asmlinkage long @@ -903,7 +916,7 @@ long sys32_vm86_warning(void) struct task_struct *me = current; static char lastcomm[sizeof(me->comm)]; if (strncmp(lastcomm, me->comm, sizeof(lastcomm))) { - compat_printk(KERN_INFO "%s: vm86 mode not supported on 64 bit kernel\n", + printk(KERN_INFO "%s: vm86 mode not supported on 64 bit kernel\n", me->comm); strncpy(lastcomm, me->comm, sizeof(lastcomm)); } @@ -916,3 +929,13 @@ long sys32_lookup_dcookie(u32 addr_low, u32 addr_high, return sys_lookup_dcookie(((u64)addr_high << 32) | addr_low, buf, len); } +static int __init ia32_init (void) +{ + printk("IA32 emulation $Id: sys_ia32.c,v 1.32 2002/03/24 13:02:28 ak Exp $\n"); + return 0; +} + +__initcall(ia32_init); + +extern unsigned long ia32_sys_call_table[]; +EXPORT_SYMBOL(ia32_sys_call_table); diff --git a/trunk/arch/x86_64/kernel/Makefile b/trunk/arch/x86_64/kernel/Makefile index aeb9c560be88..059c88313f4e 100644 --- a/trunk/arch/x86_64/kernel/Makefile +++ b/trunk/arch/x86_64/kernel/Makefile @@ -8,7 +8,7 @@ obj-y := process.o signal.o entry.o traps.o irq.o \ ptrace.o time.o ioport.o ldt.o setup.o i8259.o sys_x86_64.o \ x8664_ksyms.o i387.o syscall.o vsyscall.o \ setup64.o bootflag.o e820.o reboot.o quirks.o i8237.o \ - pci-dma.o pci-nommu.o alternative.o + pci-dma.o pci-nommu.o obj-$(CONFIG_X86_MCE) += mce.o obj-$(CONFIG_X86_MCE_INTEL) += mce_intel.o @@ -28,13 +28,11 @@ obj-$(CONFIG_PM) += suspend.o obj-$(CONFIG_SOFTWARE_SUSPEND) += suspend_asm.o obj-$(CONFIG_CPU_FREQ) += cpufreq/ obj-$(CONFIG_EARLY_PRINTK) += early_printk.o -obj-$(CONFIG_IOMMU) += pci-gart.o aperture.o -obj-$(CONFIG_CALGARY_IOMMU) += pci-calgary.o tce.o +obj-$(CONFIG_GART_IOMMU) += pci-gart.o aperture.o obj-$(CONFIG_SWIOTLB) += pci-swiotlb.o obj-$(CONFIG_KPROBES) += kprobes.o obj-$(CONFIG_X86_PM_TIMER) += pmtimer.o obj-$(CONFIG_X86_VSMP) += vsmp.o -obj-$(CONFIG_K8_NB) += k8.o obj-$(CONFIG_MODULES) += module.o @@ -51,5 +49,3 @@ intel_cacheinfo-y += ../../i386/kernel/cpu/intel_cacheinfo.o quirks-y += ../../i386/kernel/quirks.o i8237-y += ../../i386/kernel/i8237.o msr-$(subst m,y,$(CONFIG_X86_MSR)) += ../../i386/kernel/msr.o -alternative-y += ../../i386/kernel/alternative.o - diff --git a/trunk/arch/x86_64/kernel/aperture.c b/trunk/arch/x86_64/kernel/aperture.c index a195ef06ec55..70b9d21ed675 100644 --- a/trunk/arch/x86_64/kernel/aperture.c +++ b/trunk/arch/x86_64/kernel/aperture.c @@ -8,6 +8,7 @@ * because only the bootmem allocator can allocate 32+MB. * * Copyright 2002 Andi Kleen, SuSE Labs. + * $Id: aperture.c,v 1.7 2003/08/01 03:36:18 ak Exp $ */ #include #include @@ -23,7 +24,6 @@ #include #include #include -#include int iommu_aperture; int iommu_aperture_disabled __initdata = 0; @@ -37,6 +37,8 @@ int fix_aperture __initdata = 1; /* This code runs before the PCI subsystem is initialized, so just access the northbridge directly. */ +#define NB_ID_3 (PCI_VENDOR_ID_AMD | (0x1103<<16)) + static u32 __init allocate_aperture(void) { pg_data_t *nd0 = NODE_DATA(0); @@ -66,20 +68,20 @@ static u32 __init allocate_aperture(void) return (u32)__pa(p); } -static int __init aperture_valid(u64 aper_base, u32 aper_size) +static int __init aperture_valid(char *name, u64 aper_base, u32 aper_size) { if (!aper_base) return 0; if (aper_size < 64*1024*1024) { - printk("Aperture too small (%d MB)\n", aper_size>>20); + printk("Aperture from %s too small (%d MB)\n", name, aper_size>>20); return 0; } if (aper_base + aper_size >= 0xffffffff) { - printk("Aperture beyond 4GB. Ignoring.\n"); + printk("Aperture from %s beyond 4GB. Ignoring.\n",name); return 0; } if (e820_any_mapped(aper_base, aper_base + aper_size, E820_RAM)) { - printk("Aperture pointing to e820 RAM. Ignoring.\n"); + printk("Aperture from %s pointing to e820 RAM. Ignoring.\n",name); return 0; } return 1; @@ -138,7 +140,7 @@ static __u32 __init read_agp(int num, int slot, int func, int cap, u32 *order) printk("Aperture from AGP @ %Lx size %u MB (APSIZE %x)\n", aper, 32 << *order, apsizereg); - if (!aperture_valid(aper, (32*1024*1024) << *order)) + if (!aperture_valid("AGP bridge", aper, (32*1024*1024) << *order)) return 0; return (u32)aper; } @@ -206,10 +208,10 @@ void __init iommu_hole_init(void) fix = 0; for (num = 24; num < 32; num++) { - if (!early_is_k8_nb(read_pci_config(0, num, 3, 0x00))) - continue; + char name[30]; + if (read_pci_config(0, num, 3, 0x00) != NB_ID_3) + continue; - iommu_detected = 1; iommu_aperture = 1; aper_order = (read_pci_config(0, num, 3, 0x90) >> 1) & 7; @@ -220,7 +222,9 @@ void __init iommu_hole_init(void) printk("CPU %d: aperture @ %Lx size %u MB\n", num-24, aper_base, aper_size>>20); - if (!aperture_valid(aper_base, aper_size)) { + sprintf(name, "northbridge cpu %d", num-24); + + if (!aperture_valid(name, aper_base, aper_size)) { fix = 1; break; } @@ -269,7 +273,7 @@ void __init iommu_hole_init(void) /* Fix up the north bridges */ for (num = 24; num < 32; num++) { - if (!early_is_k8_nb(read_pci_config(0, num, 3, 0x00))) + if (read_pci_config(0, num, 3, 0x00) != NB_ID_3) continue; /* Don't enable translation yet. That is done later. diff --git a/trunk/arch/x86_64/kernel/apic.c b/trunk/arch/x86_64/kernel/apic.c index b2ead91df218..29ef99001e05 100644 --- a/trunk/arch/x86_64/kernel/apic.c +++ b/trunk/arch/x86_64/kernel/apic.c @@ -100,7 +100,7 @@ void clear_local_APIC(void) maxlvt = get_maxlvt(); /* - * Masking an LVT entry can trigger a local APIC error + * Masking an LVT entry on a P6 can trigger a local APIC error * if the vector is zero. Mask LVTERR first to prevent this. */ if (maxlvt >= 3) { @@ -851,18 +851,7 @@ void disable_APIC_timer(void) unsigned long v; v = apic_read(APIC_LVTT); - /* - * When an illegal vector value (0-15) is written to an LVT - * entry and delivery mode is Fixed, the APIC may signal an - * illegal vector error, with out regard to whether the mask - * bit is set or whether an interrupt is actually seen on input. - * - * Boot sequence might call this function when the LVTT has - * '0' vector value. So make sure vector field is set to - * valid value. - */ - v |= (APIC_LVT_MASKED | LOCAL_TIMER_VECTOR); - apic_write(APIC_LVTT, v); + apic_write(APIC_LVTT, v | APIC_LVT_MASKED); } } @@ -920,13 +909,15 @@ int setup_profiling_timer(unsigned int multiplier) return -EINVAL; } -void setup_APIC_extened_lvt(unsigned char lvt_off, unsigned char vector, - unsigned char msg_type, unsigned char mask) +#ifdef CONFIG_X86_MCE_AMD +void setup_threshold_lvt(unsigned long lvt_off) { - unsigned long reg = (lvt_off << 4) + K8_APIC_EXT_LVT_BASE; - unsigned int v = (mask << 16) | (msg_type << 8) | vector; + unsigned int v = 0; + unsigned long reg = (lvt_off << 4) + 0x500; + v |= THRESHOLD_APIC_VECTOR; apic_write(reg, v); } +#endif /* CONFIG_X86_MCE_AMD */ #undef APIC_DIVISOR @@ -992,7 +983,7 @@ void smp_apic_timer_interrupt(struct pt_regs *regs) } /* - * apic_is_clustered_box() -- Check if we can expect good TSC + * oem_force_hpet_timer -- force HPET mode for some boxes. * * Thus far, the major user of this is IBM's Summit2 series: * @@ -1000,7 +991,7 @@ void smp_apic_timer_interrupt(struct pt_regs *regs) * multi-chassis. Use available data to take a good guess. * If in doubt, go HPET. */ -__cpuinit int apic_is_clustered_box(void) +__cpuinit int oem_force_hpet_timer(void) { int i, clusters, zeros; unsigned id; @@ -1031,7 +1022,8 @@ __cpuinit int apic_is_clustered_box(void) } /* - * If clusters > 2, then should be multi-chassis. + * If clusters > 2, then should be multi-chassis. Return 1 for HPET. + * Else return 0 to use TSC. * May have to revisit this when multi-core + hyperthreaded CPUs come * out, but AFAIK this will work even for them. */ diff --git a/trunk/arch/x86_64/kernel/crash.c b/trunk/arch/x86_64/kernel/crash.c index 8ca04912b1cc..4e6c3b729e39 100644 --- a/trunk/arch/x86_64/kernel/crash.c +++ b/trunk/arch/x86_64/kernel/crash.c @@ -111,14 +111,14 @@ static int crash_nmi_callback(struct pt_regs *regs, int cpu) atomic_dec(&waiting_for_crash_ipi); /* Assume hlt works */ for(;;) - halt(); + asm("hlt"); return 1; } static void smp_send_nmi_allbutself(void) { - send_IPI_allbutself(NMI_VECTOR); + send_IPI_allbutself(APIC_DM_NMI); } /* diff --git a/trunk/arch/x86_64/kernel/e820.c b/trunk/arch/x86_64/kernel/e820.c index 9e94d834624b..1ef6028f721e 100644 --- a/trunk/arch/x86_64/kernel/e820.c +++ b/trunk/arch/x86_64/kernel/e820.c @@ -1,6 +1,7 @@ /* * Handle the memory map. * The functions here do the job until bootmem takes over. + * $Id: e820.c,v 1.4 2002/09/19 19:25:32 ak Exp $ * * Getting sanitize_e820_map() in sync with i386 version by applying change: * - Provisions for empty E820 memory regions (reported by certain BIOSes). @@ -620,7 +621,6 @@ void __init parse_memmapopt(char *p, char **from) } unsigned long pci_mem_start = 0xaeedbabe; -EXPORT_SYMBOL(pci_mem_start); /* * Search for the biggest gap in the low 32 bits of the e820 diff --git a/trunk/arch/x86_64/kernel/entry.S b/trunk/arch/x86_64/kernel/entry.S index 7290e72b9a34..586b34c00c48 100644 --- a/trunk/arch/x86_64/kernel/entry.S +++ b/trunk/arch/x86_64/kernel/entry.S @@ -154,7 +154,6 @@ rff_trace: GET_THREAD_INFO(%rcx) jmp rff_action CFI_ENDPROC -END(ret_from_fork) /* * System call entry. Upto 6 arguments in registers are supported. @@ -189,7 +188,7 @@ END(ret_from_fork) ENTRY(system_call) CFI_STARTPROC simple - CFI_DEF_CFA rsp,PDA_STACKOFFSET + CFI_DEF_CFA rsp,0 CFI_REGISTER rip,rcx /*CFI_REGISTER rflags,r11*/ swapgs @@ -286,7 +285,6 @@ tracesys: /* Use IRET because user could have changed frame */ jmp int_ret_from_sys_call CFI_ENDPROC -END(system_call) /* * Syscall return path ending with IRET. @@ -366,7 +364,6 @@ int_restore_rest: cli jmp int_with_check CFI_ENDPROC -END(int_ret_from_sys_call) /* * Certain special system calls that need to save a complete full stack frame. @@ -378,7 +375,6 @@ END(int_ret_from_sys_call) leaq \func(%rip),%rax leaq -ARGOFFSET+8(%rsp),\arg /* 8 for return address */ jmp ptregscall_common -END(\label) .endm CFI_STARTPROC @@ -408,7 +404,6 @@ ENTRY(ptregscall_common) CFI_REL_OFFSET rip, 0 ret CFI_ENDPROC -END(ptregscall_common) ENTRY(stub_execve) CFI_STARTPROC @@ -423,7 +418,6 @@ ENTRY(stub_execve) RESTORE_REST jmp int_ret_from_sys_call CFI_ENDPROC -END(stub_execve) /* * sigreturn is special because it needs to restore all registers on return. @@ -441,7 +435,6 @@ ENTRY(stub_rt_sigreturn) RESTORE_REST jmp int_ret_from_sys_call CFI_ENDPROC -END(stub_rt_sigreturn) /* * initial frame state for interrupts and exceptions @@ -473,18 +466,29 @@ END(stub_rt_sigreturn) /* 0(%rsp): interrupt number */ .macro interrupt func cld - SAVE_ARGS - leaq -ARGOFFSET(%rsp),%rdi # arg1 for handler - pushq %rbp - CFI_ADJUST_CFA_OFFSET 8 - CFI_REL_OFFSET rbp, 0 +#ifdef CONFIG_DEBUG_INFO + SAVE_ALL + movq %rsp,%rdi + /* + * Setup a stack frame pointer. This allows gdb to trace + * back to the original stack. + */ movq %rsp,%rbp CFI_DEF_CFA_REGISTER rbp +#else + SAVE_ARGS + leaq -ARGOFFSET(%rsp),%rdi # arg1 for handler +#endif testl $3,CS(%rdi) je 1f swapgs 1: incl %gs:pda_irqcount # RED-PEN should check preempt count - cmoveq %gs:pda_irqstackptr,%rsp + movq %gs:pda_irqstackptr,%rax + cmoveq %rax,%rsp /*todo This needs CFI annotation! */ + pushq %rdi # save old stack +#ifndef CONFIG_DEBUG_INFO + CFI_ADJUST_CFA_OFFSET 8 +#endif call \func .endm @@ -493,11 +497,17 @@ ENTRY(common_interrupt) interrupt do_IRQ /* 0(%rsp): oldrsp-ARGOFFSET */ ret_from_intr: + popq %rdi +#ifndef CONFIG_DEBUG_INFO + CFI_ADJUST_CFA_OFFSET -8 +#endif cli decl %gs:pda_irqcount - leaveq +#ifdef CONFIG_DEBUG_INFO + movq RBP(%rdi),%rbp CFI_DEF_CFA_REGISTER rsp - CFI_ADJUST_CFA_OFFSET -8 +#endif + leaq ARGOFFSET(%rdi),%rsp /*todo This needs CFI annotation! */ exit_intr: GET_THREAD_INFO(%rcx) testl $3,CS-ARGOFFSET(%rsp) @@ -579,9 +589,7 @@ retint_kernel: call preempt_schedule_irq jmp exit_intr #endif - CFI_ENDPROC -END(common_interrupt) /* * APIC interrupts. @@ -597,21 +605,17 @@ END(common_interrupt) ENTRY(thermal_interrupt) apicinterrupt THERMAL_APIC_VECTOR,smp_thermal_interrupt -END(thermal_interrupt) ENTRY(threshold_interrupt) apicinterrupt THRESHOLD_APIC_VECTOR,mce_threshold_interrupt -END(threshold_interrupt) #ifdef CONFIG_SMP ENTRY(reschedule_interrupt) apicinterrupt RESCHEDULE_VECTOR,smp_reschedule_interrupt -END(reschedule_interrupt) .macro INVALIDATE_ENTRY num ENTRY(invalidate_interrupt\num) apicinterrupt INVALIDATE_TLB_VECTOR_START+\num,smp_invalidate_interrupt -END(invalidate_interrupt\num) .endm INVALIDATE_ENTRY 0 @@ -625,21 +629,17 @@ END(invalidate_interrupt\num) ENTRY(call_function_interrupt) apicinterrupt CALL_FUNCTION_VECTOR,smp_call_function_interrupt -END(call_function_interrupt) #endif #ifdef CONFIG_X86_LOCAL_APIC ENTRY(apic_timer_interrupt) apicinterrupt LOCAL_TIMER_VECTOR,smp_apic_timer_interrupt -END(apic_timer_interrupt) ENTRY(error_interrupt) apicinterrupt ERROR_APIC_VECTOR,smp_error_interrupt -END(error_interrupt) ENTRY(spurious_interrupt) apicinterrupt SPURIOUS_APIC_VECTOR,smp_spurious_interrupt -END(spurious_interrupt) #endif /* @@ -777,7 +777,6 @@ error_kernelspace: cmpq $gs_change,RIP(%rsp) je error_swapgs jmp error_sti -END(error_entry) /* Reload gs selector with exception handling */ /* edi: new selector */ @@ -795,7 +794,6 @@ gs_change: CFI_ADJUST_CFA_OFFSET -8 ret CFI_ENDPROC -ENDPROC(load_gs_index) .section __ex_table,"a" .align 8 @@ -849,7 +847,7 @@ ENTRY(kernel_thread) UNFAKE_STACK_FRAME ret CFI_ENDPROC -ENDPROC(kernel_thread) + child_rip: /* @@ -862,7 +860,6 @@ child_rip: # exit xorl %edi, %edi call do_exit -ENDPROC(child_rip) /* * execve(). This function needs to use IRET, not SYSRET, to set up all state properly. @@ -892,24 +889,19 @@ ENTRY(execve) UNFAKE_STACK_FRAME ret CFI_ENDPROC -ENDPROC(execve) KPROBE_ENTRY(page_fault) errorentry do_page_fault -END(page_fault) .previous .text ENTRY(coprocessor_error) zeroentry do_coprocessor_error -END(coprocessor_error) ENTRY(simd_coprocessor_error) zeroentry do_simd_coprocessor_error -END(simd_coprocessor_error) ENTRY(device_not_available) zeroentry math_state_restore -END(device_not_available) /* runs on exception stack */ KPROBE_ENTRY(debug) @@ -919,7 +911,6 @@ KPROBE_ENTRY(debug) paranoidentry do_debug, DEBUG_STACK jmp paranoid_exit CFI_ENDPROC -END(debug) .previous .text /* runs on exception stack */ @@ -970,7 +961,6 @@ paranoid_schedule: cli jmp paranoid_userspace CFI_ENDPROC -END(nmi) .previous .text KPROBE_ENTRY(int3) @@ -980,28 +970,22 @@ KPROBE_ENTRY(int3) paranoidentry do_int3, DEBUG_STACK jmp paranoid_exit CFI_ENDPROC -END(int3) .previous .text ENTRY(overflow) zeroentry do_overflow -END(overflow) ENTRY(bounds) zeroentry do_bounds -END(bounds) ENTRY(invalid_op) zeroentry do_invalid_op -END(invalid_op) ENTRY(coprocessor_segment_overrun) zeroentry do_coprocessor_segment_overrun -END(coprocessor_segment_overrun) ENTRY(reserved) zeroentry do_reserved -END(reserved) /* runs on exception stack */ ENTRY(double_fault) @@ -1009,15 +993,12 @@ ENTRY(double_fault) paranoidentry do_double_fault jmp paranoid_exit CFI_ENDPROC -END(double_fault) ENTRY(invalid_TSS) errorentry do_invalid_TSS -END(invalid_TSS) ENTRY(segment_not_present) errorentry do_segment_not_present -END(segment_not_present) /* runs on exception stack */ ENTRY(stack_segment) @@ -1025,24 +1006,19 @@ ENTRY(stack_segment) paranoidentry do_stack_segment jmp paranoid_exit CFI_ENDPROC -END(stack_segment) KPROBE_ENTRY(general_protection) errorentry do_general_protection -END(general_protection) .previous .text ENTRY(alignment_check) errorentry do_alignment_check -END(alignment_check) ENTRY(divide_error) zeroentry do_divide_error -END(divide_error) ENTRY(spurious_interrupt_bug) zeroentry do_spurious_interrupt_bug -END(spurious_interrupt_bug) #ifdef CONFIG_X86_MCE /* runs on exception stack */ @@ -1053,7 +1029,6 @@ ENTRY(machine_check) paranoidentry do_machine_check jmp paranoid_exit CFI_ENDPROC -END(machine_check) #endif ENTRY(call_softirq) @@ -1071,37 +1046,3 @@ ENTRY(call_softirq) decl %gs:pda_irqcount ret CFI_ENDPROC -ENDPROC(call_softirq) - -#ifdef CONFIG_STACK_UNWIND -ENTRY(arch_unwind_init_running) - CFI_STARTPROC - movq %r15, R15(%rdi) - movq %r14, R14(%rdi) - xchgq %rsi, %rdx - movq %r13, R13(%rdi) - movq %r12, R12(%rdi) - xorl %eax, %eax - movq %rbp, RBP(%rdi) - movq %rbx, RBX(%rdi) - movq (%rsp), %rcx - movq %rax, R11(%rdi) - movq %rax, R10(%rdi) - movq %rax, R9(%rdi) - movq %rax, R8(%rdi) - movq %rax, RAX(%rdi) - movq %rax, RCX(%rdi) - movq %rax, RDX(%rdi) - movq %rax, RSI(%rdi) - movq %rax, RDI(%rdi) - movq %rax, ORIG_RAX(%rdi) - movq %rcx, RIP(%rdi) - leaq 8(%rsp), %rcx - movq $__KERNEL_CS, CS(%rdi) - movq %rax, EFLAGS(%rdi) - movq %rcx, RSP(%rdi) - movq $__KERNEL_DS, SS(%rdi) - jmpq *%rdx - CFI_ENDPROC -ENDPROC(arch_unwind_init_running) -#endif diff --git a/trunk/arch/x86_64/kernel/genapic_flat.c b/trunk/arch/x86_64/kernel/genapic_flat.c index 21c7066e236a..1a2ab825be98 100644 --- a/trunk/arch/x86_64/kernel/genapic_flat.c +++ b/trunk/arch/x86_64/kernel/genapic_flat.c @@ -78,29 +78,22 @@ static void flat_send_IPI_mask(cpumask_t cpumask, int vector) static void flat_send_IPI_allbutself(int vector) { -#ifdef CONFIG_HOTPLUG_CPU - int hotplug = 1; +#ifndef CONFIG_HOTPLUG_CPU + if (((num_online_cpus()) - 1) >= 1) + __send_IPI_shortcut(APIC_DEST_ALLBUT, vector,APIC_DEST_LOGICAL); #else - int hotplug = 0; -#endif - if (hotplug || vector == NMI_VECTOR) { - cpumask_t allbutme = cpu_online_map; + cpumask_t allbutme = cpu_online_map; - cpu_clear(smp_processor_id(), allbutme); + cpu_clear(smp_processor_id(), allbutme); - if (!cpus_empty(allbutme)) - flat_send_IPI_mask(allbutme, vector); - } else if (num_online_cpus() > 1) { - __send_IPI_shortcut(APIC_DEST_ALLBUT, vector,APIC_DEST_LOGICAL); - } + if (!cpus_empty(allbutme)) + flat_send_IPI_mask(allbutme, vector); +#endif } static void flat_send_IPI_all(int vector) { - if (vector == NMI_VECTOR) - flat_send_IPI_mask(cpu_online_map, vector); - else - __send_IPI_shortcut(APIC_DEST_ALLINC, vector, APIC_DEST_LOGICAL); + __send_IPI_shortcut(APIC_DEST_ALLINC, vector, APIC_DEST_LOGICAL); } static int flat_apic_id_registered(void) @@ -115,7 +108,10 @@ static unsigned int flat_cpu_mask_to_apicid(cpumask_t cpumask) static unsigned int phys_pkg_id(int index_msb) { - return hard_smp_processor_id() >> index_msb; + u32 ebx; + + ebx = cpuid_ebx(1); + return ((ebx >> 24) & 0xFF) >> index_msb; } struct genapic apic_flat = { diff --git a/trunk/arch/x86_64/kernel/head64.c b/trunk/arch/x86_64/kernel/head64.c index e6a71c9556d9..cea20a66c150 100644 --- a/trunk/arch/x86_64/kernel/head64.c +++ b/trunk/arch/x86_64/kernel/head64.c @@ -2,6 +2,8 @@ * linux/arch/x86_64/kernel/head64.c -- prepare to run common code * * Copyright (C) 2000 Andrea Arcangeli SuSE + * + * $Id: head64.c,v 1.22 2001/07/06 14:28:20 ak Exp $ */ #include diff --git a/trunk/arch/x86_64/kernel/i8259.c b/trunk/arch/x86_64/kernel/i8259.c index 9b1a4e147321..5ecd34ab8c2b 100644 --- a/trunk/arch/x86_64/kernel/i8259.c +++ b/trunk/arch/x86_64/kernel/i8259.c @@ -44,11 +44,11 @@ BI(x,8) BI(x,9) BI(x,a) BI(x,b) \ BI(x,c) BI(x,d) BI(x,e) BI(x,f) -#define BUILD_15_IRQS(x) \ +#define BUILD_14_IRQS(x) \ BI(x,0) BI(x,1) BI(x,2) BI(x,3) \ BI(x,4) BI(x,5) BI(x,6) BI(x,7) \ BI(x,8) BI(x,9) BI(x,a) BI(x,b) \ - BI(x,c) BI(x,d) BI(x,e) + BI(x,c) BI(x,d) /* * ISA PIC or low IO-APIC triggered (INTA-cycle or APIC) interrupts: @@ -73,13 +73,13 @@ BUILD_16_IRQS(0x8) BUILD_16_IRQS(0x9) BUILD_16_IRQS(0xa) BUILD_16_IRQS(0xb) BUILD_16_IRQS(0xc) BUILD_16_IRQS(0xd) #ifdef CONFIG_PCI_MSI - BUILD_15_IRQS(0xe) + BUILD_14_IRQS(0xe) #endif #endif #undef BUILD_16_IRQS -#undef BUILD_15_IRQS +#undef BUILD_14_IRQS #undef BI @@ -92,11 +92,11 @@ BUILD_16_IRQS(0xc) BUILD_16_IRQS(0xd) IRQ(x,8), IRQ(x,9), IRQ(x,a), IRQ(x,b), \ IRQ(x,c), IRQ(x,d), IRQ(x,e), IRQ(x,f) -#define IRQLIST_15(x) \ +#define IRQLIST_14(x) \ IRQ(x,0), IRQ(x,1), IRQ(x,2), IRQ(x,3), \ IRQ(x,4), IRQ(x,5), IRQ(x,6), IRQ(x,7), \ IRQ(x,8), IRQ(x,9), IRQ(x,a), IRQ(x,b), \ - IRQ(x,c), IRQ(x,d), IRQ(x,e) + IRQ(x,c), IRQ(x,d) void (*interrupt[NR_IRQS])(void) = { IRQLIST_16(0x0), @@ -108,7 +108,7 @@ void (*interrupt[NR_IRQS])(void) = { IRQLIST_16(0xc), IRQLIST_16(0xd) #ifdef CONFIG_PCI_MSI - , IRQLIST_15(0xe) + , IRQLIST_14(0xe) #endif #endif diff --git a/trunk/arch/x86_64/kernel/io_apic.c b/trunk/arch/x86_64/kernel/io_apic.c index c768d8a036d0..9cc7031b7151 100644 --- a/trunk/arch/x86_64/kernel/io_apic.c +++ b/trunk/arch/x86_64/kernel/io_apic.c @@ -41,7 +41,6 @@ #include #include #include -#include #define __apicdebuginit __init @@ -57,7 +56,6 @@ int timer_over_8254 __initdata = 0; static struct { int pin, apic; } ioapic_i8259 = { -1, -1 }; static DEFINE_SPINLOCK(ioapic_lock); -static DEFINE_SPINLOCK(vector_lock); /* * # of IRQ routing registers @@ -319,7 +317,7 @@ void __init check_ioapic(void) vendor &= 0xffff; switch (vendor) { case PCI_VENDOR_ID_VIA: -#ifdef CONFIG_IOMMU +#ifdef CONFIG_GART_IOMMU if ((end_pfn > MAX_DMA32_PFN || force_iommu) && !iommu_aperture_allowed) { @@ -836,17 +834,10 @@ u8 irq_vector[NR_IRQ_VECTORS] __read_mostly = { FIRST_DEVICE_VECTOR , 0 }; int assign_irq_vector(int irq) { static int current_vector = FIRST_DEVICE_VECTOR, offset = 0; - unsigned long flags; - int vector; BUG_ON(irq != AUTO_ASSIGN && (unsigned)irq >= NR_IRQ_VECTORS); - - spin_lock_irqsave(&vector_lock, flags); - - if (irq != AUTO_ASSIGN && IO_APIC_VECTOR(irq) > 0) { - spin_unlock_irqrestore(&vector_lock, flags); + if (irq != AUTO_ASSIGN && IO_APIC_VECTOR(irq) > 0) return IO_APIC_VECTOR(irq); - } next: current_vector += 8; if (current_vector == IA32_SYSCALL_VECTOR) @@ -858,14 +849,11 @@ int assign_irq_vector(int irq) current_vector = FIRST_DEVICE_VECTOR + offset; } - vector = current_vector; - vector_irq[vector] = irq; + vector_irq[current_vector] = irq; if (irq != AUTO_ASSIGN) - IO_APIC_VECTOR(irq) = vector; - - spin_unlock_irqrestore(&vector_lock, flags); + IO_APIC_VECTOR(irq) = current_vector; - return vector; + return current_vector; } extern void (*interrupt[NR_IRQS])(void); @@ -878,14 +866,21 @@ static struct hw_interrupt_type ioapic_edge_type; static inline void ioapic_register_intr(int irq, int vector, unsigned long trigger) { - unsigned idx = use_pci_vector() && !platform_legacy_irq(irq) ? vector : irq; - - if ((trigger == IOAPIC_AUTO && IO_APIC_irq_trigger(irq)) || - trigger == IOAPIC_LEVEL) - irq_desc[idx].handler = &ioapic_level_type; - else - irq_desc[idx].handler = &ioapic_edge_type; - set_intr_gate(vector, interrupt[idx]); + if (use_pci_vector() && !platform_legacy_irq(irq)) { + if ((trigger == IOAPIC_AUTO && IO_APIC_irq_trigger(irq)) || + trigger == IOAPIC_LEVEL) + irq_desc[vector].handler = &ioapic_level_type; + else + irq_desc[vector].handler = &ioapic_edge_type; + set_intr_gate(vector, interrupt[vector]); + } else { + if ((trigger == IOAPIC_AUTO && IO_APIC_irq_trigger(irq)) || + trigger == IOAPIC_LEVEL) + irq_desc[irq].handler = &ioapic_level_type; + else + irq_desc[irq].handler = &ioapic_edge_type; + set_intr_gate(vector, interrupt[irq]); + } } static void __init setup_IO_APIC_irqs(void) diff --git a/trunk/arch/x86_64/kernel/irq.c b/trunk/arch/x86_64/kernel/irq.c index 59518d4d4358..d8bd0b345b1e 100644 --- a/trunk/arch/x86_64/kernel/irq.c +++ b/trunk/arch/x86_64/kernel/irq.c @@ -26,30 +26,6 @@ atomic_t irq_mis_count; #endif #endif -#ifdef CONFIG_DEBUG_STACKOVERFLOW -/* - * Probabilistic stack overflow check: - * - * Only check the stack in process context, because everything else - * runs on the big interrupt stacks. Checking reliably is too expensive, - * so we just check from interrupts. - */ -static inline void stack_overflow_check(struct pt_regs *regs) -{ - u64 curbase = (u64) current->thread_info; - static unsigned long warned = -60*HZ; - - if (regs->rsp >= curbase && regs->rsp <= curbase + THREAD_SIZE && - regs->rsp < curbase + sizeof(struct thread_info) + 128 && - time_after(jiffies, warned + 60*HZ)) { - printk("do_IRQ: %s near stack overflow (cur:%Lx,rsp:%lx)\n", - current->comm, curbase, regs->rsp); - show_stack(NULL,NULL); - warned = jiffies; - } -} -#endif - /* * Generic, controller-independent functions: */ @@ -63,7 +39,7 @@ int show_interrupts(struct seq_file *p, void *v) if (i == 0) { seq_printf(p, " "); for_each_online_cpu(j) - seq_printf(p, "CPU%-8d",j); + seq_printf(p, "CPU%d ",j); seq_putc(p, '\n'); } @@ -120,9 +96,7 @@ asmlinkage unsigned int do_IRQ(struct pt_regs *regs) exit_idle(); irq_enter(); -#ifdef CONFIG_DEBUG_STACKOVERFLOW - stack_overflow_check(regs); -#endif + __do_IRQ(irq, regs); irq_exit(); diff --git a/trunk/arch/x86_64/kernel/k8.c b/trunk/arch/x86_64/kernel/k8.c deleted file mode 100644 index 6416682d33d0..000000000000 --- a/trunk/arch/x86_64/kernel/k8.c +++ /dev/null @@ -1,118 +0,0 @@ -/* - * Shared support code for AMD K8 northbridges and derivates. - * Copyright 2006 Andi Kleen, SUSE Labs. Subject to GPLv2. - */ -#include -#include -#include -#include -#include -#include -#include - -int num_k8_northbridges; -EXPORT_SYMBOL(num_k8_northbridges); - -static u32 *flush_words; - -struct pci_device_id k8_nb_ids[] = { - { PCI_DEVICE(PCI_VENDOR_ID_AMD, 0x1103) }, - { PCI_DEVICE(PCI_VENDOR_ID_AMD, 0x1203) }, - {} -}; -EXPORT_SYMBOL(k8_nb_ids); - -struct pci_dev **k8_northbridges; -EXPORT_SYMBOL(k8_northbridges); - -static struct pci_dev *next_k8_northbridge(struct pci_dev *dev) -{ - do { - dev = pci_get_device(PCI_ANY_ID, PCI_ANY_ID, dev); - if (!dev) - break; - } while (!pci_match_id(&k8_nb_ids[0], dev)); - return dev; -} - -int cache_k8_northbridges(void) -{ - int i; - struct pci_dev *dev; - if (num_k8_northbridges) - return 0; - - num_k8_northbridges = 0; - dev = NULL; - while ((dev = next_k8_northbridge(dev)) != NULL) - num_k8_northbridges++; - - k8_northbridges = kmalloc((num_k8_northbridges + 1) * sizeof(void *), - GFP_KERNEL); - if (!k8_northbridges) - return -ENOMEM; - - flush_words = kmalloc(num_k8_northbridges * sizeof(u32), GFP_KERNEL); - if (!flush_words) { - kfree(k8_northbridges); - return -ENOMEM; - } - - dev = NULL; - i = 0; - while ((dev = next_k8_northbridge(dev)) != NULL) { - k8_northbridges[i++] = dev; - pci_read_config_dword(dev, 0x9c, &flush_words[i]); - } - k8_northbridges[i] = NULL; - return 0; -} -EXPORT_SYMBOL_GPL(cache_k8_northbridges); - -/* Ignores subdevice/subvendor but as far as I can figure out - they're useless anyways */ -int __init early_is_k8_nb(u32 device) -{ - struct pci_device_id *id; - u32 vendor = device & 0xffff; - device >>= 16; - for (id = k8_nb_ids; id->vendor; id++) - if (vendor == id->vendor && device == id->device) - return 1; - return 0; -} - -void k8_flush_garts(void) -{ - int flushed, i; - unsigned long flags; - static DEFINE_SPINLOCK(gart_lock); - - /* Avoid races between AGP and IOMMU. In theory it's not needed - but I'm not sure if the hardware won't lose flush requests - when another is pending. This whole thing is so expensive anyways - that it doesn't matter to serialize more. -AK */ - spin_lock_irqsave(&gart_lock, flags); - flushed = 0; - for (i = 0; i < num_k8_northbridges; i++) { - pci_write_config_dword(k8_northbridges[i], 0x9c, - flush_words[i]|1); - flushed++; - } - for (i = 0; i < num_k8_northbridges; i++) { - u32 w; - /* Make sure the hardware actually executed the flush*/ - for (;;) { - pci_read_config_dword(k8_northbridges[i], - 0x9c, &w); - if (!(w & 1)) - break; - cpu_relax(); - } - } - spin_unlock_irqrestore(&gart_lock, flags); - if (!flushed) - printk("nothing to flush?\n"); -} -EXPORT_SYMBOL_GPL(k8_flush_garts); - diff --git a/trunk/arch/x86_64/kernel/mce.c b/trunk/arch/x86_64/kernel/mce.c index acd5816b1a6f..c69fc43cee7b 100644 --- a/trunk/arch/x86_64/kernel/mce.c +++ b/trunk/arch/x86_64/kernel/mce.c @@ -562,7 +562,7 @@ static struct sysdev_class mce_sysclass = { set_kset_name("machinecheck"), }; -DEFINE_PER_CPU(struct sys_device, device_mce); +static DEFINE_PER_CPU(struct sys_device, device_mce); /* Why are there no generic functions for this? */ #define ACCESSOR(name, var, start) \ diff --git a/trunk/arch/x86_64/kernel/mce_amd.c b/trunk/arch/x86_64/kernel/mce_amd.c index 335200aa2737..d13b241ad094 100644 --- a/trunk/arch/x86_64/kernel/mce_amd.c +++ b/trunk/arch/x86_64/kernel/mce_amd.c @@ -1,5 +1,5 @@ /* - * (c) 2005, 2006 Advanced Micro Devices, Inc. + * (c) 2005 Advanced Micro Devices, Inc. * Your use of this code is subject to the terms and conditions of the * GNU general public license version 2. See "COPYING" or * http://www.gnu.org/licenses/gpl.html @@ -8,10 +8,9 @@ * * Support : jacob.shin@amd.com * - * April 2006 - * - added support for AMD Family 0x10 processors + * MC4_MISC0 DRAM ECC Error Threshold available under AMD K8 Rev F. + * MC4_MISC0 exists per physical processor. * - * All MC4_MISCi registers are shared between multi-cores */ #include @@ -30,45 +29,32 @@ #include #include -#define PFX "mce_threshold: " -#define VERSION "version 1.1.1" -#define NR_BANKS 6 -#define NR_BLOCKS 9 -#define THRESHOLD_MAX 0xFFF -#define INT_TYPE_APIC 0x00020000 -#define MASK_VALID_HI 0x80000000 -#define MASK_LVTOFF_HI 0x00F00000 -#define MASK_COUNT_EN_HI 0x00080000 -#define MASK_INT_TYPE_HI 0x00060000 -#define MASK_OVERFLOW_HI 0x00010000 +#define PFX "mce_threshold: " +#define VERSION "version 1.00.9" +#define NR_BANKS 5 +#define THRESHOLD_MAX 0xFFF +#define INT_TYPE_APIC 0x00020000 +#define MASK_VALID_HI 0x80000000 +#define MASK_LVTOFF_HI 0x00F00000 +#define MASK_COUNT_EN_HI 0x00080000 +#define MASK_INT_TYPE_HI 0x00060000 +#define MASK_OVERFLOW_HI 0x00010000 #define MASK_ERR_COUNT_HI 0x00000FFF -#define MASK_BLKPTR_LO 0xFF000000 -#define MCG_XBLK_ADDR 0xC0000400 +#define MASK_OVERFLOW 0x0001000000000000L -struct threshold_block { - unsigned int block; - unsigned int bank; +struct threshold_bank { unsigned int cpu; - u32 address; - u16 interrupt_enable; + u8 bank; + u8 interrupt_enable; u16 threshold_limit; struct kobject kobj; - struct list_head miscj; }; -/* defaults used early on boot */ -static struct threshold_block threshold_defaults = { +static struct threshold_bank threshold_defaults = { .interrupt_enable = 0, .threshold_limit = THRESHOLD_MAX, }; -struct threshold_bank { - struct kobject kobj; - struct threshold_block *blocks; - cpumask_t cpus; -}; -static DEFINE_PER_CPU(struct threshold_bank *, threshold_banks[NR_BANKS]); - #ifdef CONFIG_SMP static unsigned char shared_bank[NR_BANKS] = { 0, 0, 0, 0, 1 @@ -82,12 +68,12 @@ static DEFINE_PER_CPU(unsigned char, bank_map); /* see which banks are on */ */ /* must be called with correct cpu affinity */ -static void threshold_restart_bank(struct threshold_block *b, +static void threshold_restart_bank(struct threshold_bank *b, int reset, u16 old_limit) { u32 mci_misc_hi, mci_misc_lo; - rdmsr(b->address, mci_misc_lo, mci_misc_hi); + rdmsr(MSR_IA32_MC0_MISC + b->bank * 4, mci_misc_lo, mci_misc_hi); if (b->threshold_limit < (mci_misc_hi & THRESHOLD_MAX)) reset = 1; /* limit cannot be lower than err count */ @@ -108,57 +94,35 @@ static void threshold_restart_bank(struct threshold_block *b, (mci_misc_hi &= ~MASK_INT_TYPE_HI); mci_misc_hi |= MASK_COUNT_EN_HI; - wrmsr(b->address, mci_misc_lo, mci_misc_hi); + wrmsr(MSR_IA32_MC0_MISC + b->bank * 4, mci_misc_lo, mci_misc_hi); } -/* cpu init entry point, called from mce.c with preempt off */ void __cpuinit mce_amd_feature_init(struct cpuinfo_x86 *c) { - unsigned int bank, block; + int bank; + u32 mci_misc_lo, mci_misc_hi; unsigned int cpu = smp_processor_id(); - u32 low = 0, high = 0, address = 0; for (bank = 0; bank < NR_BANKS; ++bank) { - for (block = 0; block < NR_BLOCKS; ++block) { - if (block == 0) - address = MSR_IA32_MC0_MISC + bank * 4; - else if (block == 1) - address = MCG_XBLK_ADDR - + ((low & MASK_BLKPTR_LO) >> 21); - else - ++address; - - if (rdmsr_safe(address, &low, &high)) - continue; + rdmsr(MSR_IA32_MC0_MISC + bank * 4, mci_misc_lo, mci_misc_hi); - if (!(high & MASK_VALID_HI)) { - if (block) - continue; - else - break; - } + /* !valid, !counter present, bios locked */ + if (!(mci_misc_hi & MASK_VALID_HI) || + !(mci_misc_hi & MASK_VALID_HI >> 1) || + (mci_misc_hi & MASK_VALID_HI >> 2)) + continue; - if (!(high & MASK_VALID_HI >> 1) || - (high & MASK_VALID_HI >> 2)) - continue; + per_cpu(bank_map, cpu) |= (1 << bank); - if (!block) - per_cpu(bank_map, cpu) |= (1 << bank); #ifdef CONFIG_SMP - if (shared_bank[bank] && c->cpu_core_id) - break; + if (shared_bank[bank] && cpu_core_id[cpu]) + continue; #endif - high &= ~MASK_LVTOFF_HI; - high |= K8_APIC_EXT_LVT_ENTRY_THRESHOLD << 20; - wrmsr(address, low, high); - setup_APIC_extened_lvt(K8_APIC_EXT_LVT_ENTRY_THRESHOLD, - THRESHOLD_APIC_VECTOR, - K8_APIC_EXT_INT_MSG_FIX, 0); - - threshold_defaults.address = address; - threshold_restart_bank(&threshold_defaults, 0, 0); - } + setup_threshold_lvt((mci_misc_hi & MASK_LVTOFF_HI) >> 20); + threshold_defaults.cpu = cpu; + threshold_defaults.bank = bank; + threshold_restart_bank(&threshold_defaults, 0, 0); } } @@ -173,9 +137,8 @@ void __cpuinit mce_amd_feature_init(struct cpuinfo_x86 *c) */ asmlinkage void mce_threshold_interrupt(void) { - unsigned int bank, block; + int bank; struct mce m; - u32 low = 0, high = 0, address = 0; ack_APIC_irq(); exit_idle(); @@ -187,42 +150,15 @@ asmlinkage void mce_threshold_interrupt(void) /* assume first bank caused it */ for (bank = 0; bank < NR_BANKS; ++bank) { - for (block = 0; block < NR_BLOCKS; ++block) { - if (block == 0) - address = MSR_IA32_MC0_MISC + bank * 4; - else if (block == 1) - address = MCG_XBLK_ADDR - + ((low & MASK_BLKPTR_LO) >> 21); - else - ++address; - - if (rdmsr_safe(address, &low, &high)) - continue; + m.bank = MCE_THRESHOLD_BASE + bank; + rdmsrl(MSR_IA32_MC0_MISC + bank * 4, m.misc); - if (!(high & MASK_VALID_HI)) { - if (block) - continue; - else - break; - } - - if (!(high & MASK_VALID_HI >> 1) || - (high & MASK_VALID_HI >> 2)) - continue; - - if (high & MASK_OVERFLOW_HI) { - rdmsrl(address, m.misc); - rdmsrl(MSR_IA32_MC0_STATUS + bank * 4, - m.status); - m.bank = K8_MCE_THRESHOLD_BASE - + bank * NR_BLOCKS - + block; - mce_log(&m); - goto out; - } + if (m.misc & MASK_OVERFLOW) { + mce_log(&m); + goto out; } } -out: + out: irq_exit(); } @@ -230,12 +166,20 @@ asmlinkage void mce_threshold_interrupt(void) * Sysfs Interface */ +static struct sysdev_class threshold_sysclass = { + set_kset_name("threshold"), +}; + +static DEFINE_PER_CPU(struct sys_device, device_threshold); + struct threshold_attr { - struct attribute attr; - ssize_t(*show) (struct threshold_block *, char *); - ssize_t(*store) (struct threshold_block *, const char *, size_t count); + struct attribute attr; + ssize_t(*show) (struct threshold_bank *, char *); + ssize_t(*store) (struct threshold_bank *, const char *, size_t count); }; +static DEFINE_PER_CPU(struct threshold_bank *, threshold_banks[NR_BANKS]); + static cpumask_t affinity_set(unsigned int cpu) { cpumask_t oldmask = current->cpus_allowed; @@ -250,15 +194,15 @@ static void affinity_restore(cpumask_t oldmask) set_cpus_allowed(current, oldmask); } -#define SHOW_FIELDS(name) \ -static ssize_t show_ ## name(struct threshold_block * b, char *buf) \ -{ \ - return sprintf(buf, "%lx\n", (unsigned long) b->name); \ -} +#define SHOW_FIELDS(name) \ + static ssize_t show_ ## name(struct threshold_bank * b, char *buf) \ + { \ + return sprintf(buf, "%lx\n", (unsigned long) b->name); \ + } SHOW_FIELDS(interrupt_enable) SHOW_FIELDS(threshold_limit) -static ssize_t store_interrupt_enable(struct threshold_block *b, +static ssize_t store_interrupt_enable(struct threshold_bank *b, const char *buf, size_t count) { char *end; @@ -275,7 +219,7 @@ static ssize_t store_interrupt_enable(struct threshold_block *b, return end - buf; } -static ssize_t store_threshold_limit(struct threshold_block *b, +static ssize_t store_threshold_limit(struct threshold_bank *b, const char *buf, size_t count) { char *end; @@ -298,18 +242,18 @@ static ssize_t store_threshold_limit(struct threshold_block *b, return end - buf; } -static ssize_t show_error_count(struct threshold_block *b, char *buf) +static ssize_t show_error_count(struct threshold_bank *b, char *buf) { u32 high, low; cpumask_t oldmask; oldmask = affinity_set(b->cpu); - rdmsr(b->address, low, high); + rdmsr(MSR_IA32_MC0_MISC + b->bank * 4, low, high); /* ignore low 32 */ affinity_restore(oldmask); return sprintf(buf, "%x\n", (high & 0xFFF) - (THRESHOLD_MAX - b->threshold_limit)); } -static ssize_t store_error_count(struct threshold_block *b, +static ssize_t store_error_count(struct threshold_bank *b, const char *buf, size_t count) { cpumask_t oldmask; @@ -325,13 +269,13 @@ static ssize_t store_error_count(struct threshold_block *b, .store = _store, \ }; -#define RW_ATTR(name) \ -static struct threshold_attr name = \ +#define ATTR_FIELDS(name) \ + static struct threshold_attr name = \ THRESHOLD_ATTR(name, 0644, show_## name, store_## name) -RW_ATTR(interrupt_enable); -RW_ATTR(threshold_limit); -RW_ATTR(error_count); +ATTR_FIELDS(interrupt_enable); +ATTR_FIELDS(threshold_limit); +ATTR_FIELDS(error_count); static struct attribute *default_attrs[] = { &interrupt_enable.attr, @@ -340,12 +284,12 @@ static struct attribute *default_attrs[] = { NULL }; -#define to_block(k) container_of(k, struct threshold_block, kobj) -#define to_attr(a) container_of(a, struct threshold_attr, attr) +#define to_bank(k) container_of(k,struct threshold_bank,kobj) +#define to_attr(a) container_of(a,struct threshold_attr,attr) static ssize_t show(struct kobject *kobj, struct attribute *attr, char *buf) { - struct threshold_block *b = to_block(kobj); + struct threshold_bank *b = to_bank(kobj); struct threshold_attr *a = to_attr(attr); ssize_t ret; ret = a->show ? a->show(b, buf) : -EIO; @@ -355,7 +299,7 @@ static ssize_t show(struct kobject *kobj, struct attribute *attr, char *buf) static ssize_t store(struct kobject *kobj, struct attribute *attr, const char *buf, size_t count) { - struct threshold_block *b = to_block(kobj); + struct threshold_bank *b = to_bank(kobj); struct threshold_attr *a = to_attr(attr); ssize_t ret; ret = a->store ? a->store(b, buf, count) : -EIO; @@ -372,174 +316,69 @@ static struct kobj_type threshold_ktype = { .default_attrs = default_attrs, }; -static __cpuinit int allocate_threshold_blocks(unsigned int cpu, - unsigned int bank, - unsigned int block, - u32 address) -{ - int err; - u32 low, high; - struct threshold_block *b = NULL; - - if ((bank >= NR_BANKS) || (block >= NR_BLOCKS)) - return 0; - - if (rdmsr_safe(address, &low, &high)) - goto recurse; - - if (!(high & MASK_VALID_HI)) { - if (block) - goto recurse; - else - return 0; - } - - if (!(high & MASK_VALID_HI >> 1) || - (high & MASK_VALID_HI >> 2)) - goto recurse; - - b = kzalloc(sizeof(struct threshold_block), GFP_KERNEL); - if (!b) - return -ENOMEM; - memset(b, 0, sizeof(struct threshold_block)); - - b->block = block; - b->bank = bank; - b->cpu = cpu; - b->address = address; - b->interrupt_enable = 0; - b->threshold_limit = THRESHOLD_MAX; - - INIT_LIST_HEAD(&b->miscj); - - if (per_cpu(threshold_banks, cpu)[bank]->blocks) - list_add(&b->miscj, - &per_cpu(threshold_banks, cpu)[bank]->blocks->miscj); - else - per_cpu(threshold_banks, cpu)[bank]->blocks = b; - - kobject_set_name(&b->kobj, "misc%i", block); - b->kobj.parent = &per_cpu(threshold_banks, cpu)[bank]->kobj; - b->kobj.ktype = &threshold_ktype; - err = kobject_register(&b->kobj); - if (err) - goto out_free; -recurse: - if (!block) { - address = (low & MASK_BLKPTR_LO) >> 21; - if (!address) - return 0; - address += MCG_XBLK_ADDR; - } else - ++address; - - err = allocate_threshold_blocks(cpu, bank, ++block, address); - if (err) - goto out_free; - - return err; - -out_free: - if (b) { - kobject_unregister(&b->kobj); - kfree(b); - } - return err; -} - /* symlinks sibling shared banks to first core. first core owns dir/files. */ -static __cpuinit int threshold_create_bank(unsigned int cpu, unsigned int bank) +static __cpuinit int threshold_create_bank(unsigned int cpu, int bank) { - int i, err = 0; + int err = 0; struct threshold_bank *b = NULL; - cpumask_t oldmask = CPU_MASK_NONE; - char name[32]; - - sprintf(name, "threshold_bank%i", bank); #ifdef CONFIG_SMP - if (cpu_data[cpu].cpu_core_id && shared_bank[bank]) { /* symlink */ - i = first_cpu(cpu_core_map[cpu]); - - /* first core not up yet */ - if (cpu_data[i].cpu_core_id) - goto out; - - /* already linked */ - if (per_cpu(threshold_banks, cpu)[bank]) - goto out; - - b = per_cpu(threshold_banks, i)[bank]; + if (cpu_core_id[cpu] && shared_bank[bank]) { /* symlink */ + char name[16]; + unsigned lcpu = first_cpu(cpu_core_map[cpu]); + if (cpu_core_id[lcpu]) + goto out; /* first core not up yet */ + b = per_cpu(threshold_banks, lcpu)[bank]; if (!b) goto out; - - err = sysfs_create_link(&per_cpu(device_mce, cpu).kobj, + sprintf(name, "bank%i", bank); + err = sysfs_create_link(&per_cpu(device_threshold, cpu).kobj, &b->kobj, name); if (err) goto out; - - b->cpus = cpu_core_map[cpu]; per_cpu(threshold_banks, cpu)[bank] = b; goto out; } #endif - b = kzalloc(sizeof(struct threshold_bank), GFP_KERNEL); + b = kmalloc(sizeof(struct threshold_bank), GFP_KERNEL); if (!b) { err = -ENOMEM; goto out; } memset(b, 0, sizeof(struct threshold_bank)); - kobject_set_name(&b->kobj, "threshold_bank%i", bank); - b->kobj.parent = &per_cpu(device_mce, cpu).kobj; -#ifndef CONFIG_SMP - b->cpus = CPU_MASK_ALL; -#else - b->cpus = cpu_core_map[cpu]; -#endif - err = kobject_register(&b->kobj); - if (err) - goto out_free; - - per_cpu(threshold_banks, cpu)[bank] = b; - - oldmask = affinity_set(cpu); - err = allocate_threshold_blocks(cpu, bank, 0, - MSR_IA32_MC0_MISC + bank * 4); - affinity_restore(oldmask); - - if (err) - goto out_free; - - for_each_cpu_mask(i, b->cpus) { - if (i == cpu) - continue; - - err = sysfs_create_link(&per_cpu(device_mce, i).kobj, - &b->kobj, name); - if (err) - goto out; + b->cpu = cpu; + b->bank = bank; + b->interrupt_enable = 0; + b->threshold_limit = THRESHOLD_MAX; + kobject_set_name(&b->kobj, "bank%i", bank); + b->kobj.parent = &per_cpu(device_threshold, cpu).kobj; + b->kobj.ktype = &threshold_ktype; - per_cpu(threshold_banks, i)[bank] = b; + err = kobject_register(&b->kobj); + if (err) { + kfree(b); + goto out; } - - goto out; - -out_free: - per_cpu(threshold_banks, cpu)[bank] = NULL; - kfree(b); -out: + per_cpu(threshold_banks, cpu)[bank] = b; + out: return err; } /* create dir/files for all valid threshold banks */ static __cpuinit int threshold_create_device(unsigned int cpu) { - unsigned int bank; + int bank; int err = 0; + per_cpu(device_threshold, cpu).id = cpu; + per_cpu(device_threshold, cpu).cls = &threshold_sysclass; + err = sysdev_register(&per_cpu(device_threshold, cpu)); + if (err) + goto out; + for (bank = 0; bank < NR_BANKS; ++bank) { if (!(per_cpu(bank_map, cpu) & 1 << bank)) continue; @@ -547,7 +386,7 @@ static __cpuinit int threshold_create_device(unsigned int cpu) if (err) goto out; } -out: + out: return err; } @@ -558,85 +397,92 @@ static __cpuinit int threshold_create_device(unsigned int cpu) * of shared sysfs dir/files, and rest of the cores will be symlinked to it. */ -static __cpuinit void deallocate_threshold_block(unsigned int cpu, - unsigned int bank) -{ - struct threshold_block *pos = NULL; - struct threshold_block *tmp = NULL; - struct threshold_bank *head = per_cpu(threshold_banks, cpu)[bank]; - - if (!head) - return; - - list_for_each_entry_safe(pos, tmp, &head->blocks->miscj, miscj) { - kobject_unregister(&pos->kobj); - list_del(&pos->miscj); - kfree(pos); - } - - kfree(per_cpu(threshold_banks, cpu)[bank]->blocks); - per_cpu(threshold_banks, cpu)[bank]->blocks = NULL; -} - +/* cpu hotplug call removes all symlinks before first core dies */ static __cpuinit void threshold_remove_bank(unsigned int cpu, int bank) { - int i = 0; struct threshold_bank *b; - char name[32]; + char name[16]; b = per_cpu(threshold_banks, cpu)[bank]; - if (!b) return; - - if (!b->blocks) - goto free_out; - - sprintf(name, "threshold_bank%i", bank); - - /* sibling symlink */ - if (shared_bank[bank] && b->blocks->cpu != cpu) { - sysfs_remove_link(&per_cpu(device_mce, cpu).kobj, name); - per_cpu(threshold_banks, i)[bank] = NULL; - return; - } - - /* remove all sibling symlinks before unregistering */ - for_each_cpu_mask(i, b->cpus) { - if (i == cpu) - continue; - - sysfs_remove_link(&per_cpu(device_mce, i).kobj, name); - per_cpu(threshold_banks, i)[bank] = NULL; + if (shared_bank[bank] && atomic_read(&b->kobj.kref.refcount) > 2) { + sprintf(name, "bank%i", bank); + sysfs_remove_link(&per_cpu(device_threshold, cpu).kobj, name); + per_cpu(threshold_banks, cpu)[bank] = NULL; + } else { + kobject_unregister(&b->kobj); + kfree(per_cpu(threshold_banks, cpu)[bank]); } - - deallocate_threshold_block(cpu, bank); - -free_out: - kobject_unregister(&b->kobj); - kfree(b); - per_cpu(threshold_banks, cpu)[bank] = NULL; } static __cpuinit void threshold_remove_device(unsigned int cpu) { - unsigned int bank; + int bank; for (bank = 0; bank < NR_BANKS; ++bank) { if (!(per_cpu(bank_map, cpu) & 1 << bank)) continue; threshold_remove_bank(cpu, bank); } + sysdev_unregister(&per_cpu(device_threshold, cpu)); } +/* link all existing siblings when first core comes up */ +static __cpuinit int threshold_create_symlinks(unsigned int cpu) +{ + int bank, err = 0; + unsigned int lcpu = 0; + + if (cpu_core_id[cpu]) + return 0; + for_each_cpu_mask(lcpu, cpu_core_map[cpu]) { + if (lcpu == cpu) + continue; + for (bank = 0; bank < NR_BANKS; ++bank) { + if (!(per_cpu(bank_map, cpu) & 1 << bank)) + continue; + if (!shared_bank[bank]) + continue; + err = threshold_create_bank(lcpu, bank); + } + } + return err; +} + +/* remove all symlinks before first core dies. */ +static __cpuinit void threshold_remove_symlinks(unsigned int cpu) +{ + int bank; + unsigned int lcpu = 0; + if (cpu_core_id[cpu]) + return; + for_each_cpu_mask(lcpu, cpu_core_map[cpu]) { + if (lcpu == cpu) + continue; + for (bank = 0; bank < NR_BANKS; ++bank) { + if (!(per_cpu(bank_map, cpu) & 1 << bank)) + continue; + if (!shared_bank[bank]) + continue; + threshold_remove_bank(lcpu, bank); + } + } +} #else /* !CONFIG_HOTPLUG_CPU */ +static __cpuinit void threshold_create_symlinks(unsigned int cpu) +{ +} +static __cpuinit void threshold_remove_symlinks(unsigned int cpu) +{ +} static void threshold_remove_device(unsigned int cpu) { } #endif /* get notified when a cpu comes on/off */ -static int __cpuinit threshold_cpu_callback(struct notifier_block *nfb, +static int threshold_cpu_callback(struct notifier_block *nfb, unsigned long action, void *hcpu) { /* cpu was unsigned int to begin with */ @@ -648,6 +494,13 @@ static int __cpuinit threshold_cpu_callback(struct notifier_block *nfb, switch (action) { case CPU_ONLINE: threshold_create_device(cpu); + threshold_create_symlinks(cpu); + break; + case CPU_DOWN_PREPARE: + threshold_remove_symlinks(cpu); + break; + case CPU_DOWN_FAILED: + threshold_create_symlinks(cpu); break; case CPU_DEAD: threshold_remove_device(cpu); @@ -659,22 +512,29 @@ static int __cpuinit threshold_cpu_callback(struct notifier_block *nfb, return NOTIFY_OK; } -static struct notifier_block threshold_cpu_notifier __cpuinitdata = { +static struct notifier_block threshold_cpu_notifier = { .notifier_call = threshold_cpu_callback, }; static __init int threshold_init_device(void) { - unsigned lcpu = 0; + int err; + int lcpu = 0; + + err = sysdev_class_register(&threshold_sysclass); + if (err) + goto out; /* to hit CPUs online before the notifier is up */ for_each_online_cpu(lcpu) { - int err = threshold_create_device(lcpu); + err = threshold_create_device(lcpu); if (err) - return err; + goto out; } register_cpu_notifier(&threshold_cpu_notifier); - return 0; + + out: + return err; } device_initcall(threshold_init_device); diff --git a/trunk/arch/x86_64/kernel/module.c b/trunk/arch/x86_64/kernel/module.c index 9d0958ff547f..bac195c74bcc 100644 --- a/trunk/arch/x86_64/kernel/module.c +++ b/trunk/arch/x86_64/kernel/module.c @@ -145,38 +145,26 @@ int apply_relocate(Elf_Shdr *sechdrs, return -ENOSYS; } +extern void apply_alternatives(void *start, void *end); + int module_finalize(const Elf_Ehdr *hdr, - const Elf_Shdr *sechdrs, - struct module *me) + const Elf_Shdr *sechdrs, + struct module *me) { - const Elf_Shdr *s, *text = NULL, *alt = NULL, *locks = NULL; + const Elf_Shdr *s; char *secstrings = (void *)hdr + sechdrs[hdr->e_shstrndx].sh_offset; - for (s = sechdrs; s < sechdrs + hdr->e_shnum; s++) { - if (!strcmp(".text", secstrings + s->sh_name)) - text = s; - if (!strcmp(".altinstructions", secstrings + s->sh_name)) - alt = s; - if (!strcmp(".smp_locks", secstrings + s->sh_name)) - locks= s; - } - - if (alt) { - /* patch .altinstructions */ - void *aseg = (void *)alt->sh_addr; - apply_alternatives(aseg, aseg + alt->sh_size); - } - if (locks && text) { - void *lseg = (void *)locks->sh_addr; - void *tseg = (void *)text->sh_addr; - alternatives_smp_module_add(me, me->name, - lseg, lseg + locks->sh_size, - tseg, tseg + text->sh_size); - } + /* look for .altinstructions to patch */ + for (s = sechdrs; s < sechdrs + hdr->e_shnum; s++) { + void *seg; + if (strcmp(".altinstructions", secstrings + s->sh_name)) + continue; + seg = (void *)s->sh_addr; + apply_alternatives(seg, seg + s->sh_size); + } return 0; } void module_arch_cleanup(struct module *mod) { - alternatives_smp_module_del(mod); } diff --git a/trunk/arch/x86_64/kernel/nmi.c b/trunk/arch/x86_64/kernel/nmi.c index 399489c93132..4e6357fe0ec3 100644 --- a/trunk/arch/x86_64/kernel/nmi.c +++ b/trunk/arch/x86_64/kernel/nmi.c @@ -15,7 +15,11 @@ #include #include #include +#include +#include #include +#include +#include #include #include #include @@ -23,11 +27,14 @@ #include #include +#include +#include #include +#include #include #include +#include #include -#include /* * lapic_nmi_owner tracks the ownership of the lapic NMI hardware: @@ -67,9 +74,6 @@ static unsigned int nmi_p4_cccr_val; #define K7_EVENT_CYCLES_PROCESSOR_IS_RUNNING 0x76 #define K7_NMI_EVENT K7_EVENT_CYCLES_PROCESSOR_IS_RUNNING -#define ARCH_PERFMON_NMI_EVENT_SEL ARCH_PERFMON_UNHALTED_CORE_CYCLES_SEL -#define ARCH_PERFMON_NMI_EVENT_UMASK ARCH_PERFMON_UNHALTED_CORE_CYCLES_UMASK - #define MSR_P4_MISC_ENABLE 0x1A0 #define MSR_P4_MISC_ENABLE_PERF_AVAIL (1<<7) #define MSR_P4_MISC_ENABLE_PEBS_UNAVAIL (1<<12) @@ -101,10 +105,7 @@ static __cpuinit inline int nmi_known_cpu(void) case X86_VENDOR_AMD: return boot_cpu_data.x86 == 15; case X86_VENDOR_INTEL: - if (cpu_has(&boot_cpu_data, X86_FEATURE_ARCH_PERFMON)) - return 1; - else - return (boot_cpu_data.x86 == 15); + return boot_cpu_data.x86 == 15; } return 0; } @@ -210,8 +211,6 @@ int __init setup_nmi_watchdog(char *str) __setup("nmi_watchdog=", setup_nmi_watchdog); -static void disable_intel_arch_watchdog(void); - static void disable_lapic_nmi_watchdog(void) { if (nmi_active <= 0) @@ -224,8 +223,6 @@ static void disable_lapic_nmi_watchdog(void) if (boot_cpu_data.x86 == 15) { wrmsr(MSR_P4_IQ_CCCR0, 0, 0); wrmsr(MSR_P4_CRU_ESCR0, 0, 0); - } else if (cpu_has(&boot_cpu_data, X86_FEATURE_ARCH_PERFMON)) { - disable_intel_arch_watchdog(); } break; } @@ -378,53 +375,6 @@ static void setup_k7_watchdog(void) wrmsr(MSR_K7_EVNTSEL0, evntsel, 0); } -static void disable_intel_arch_watchdog(void) -{ - unsigned ebx; - - /* - * Check whether the Architectural PerfMon supports - * Unhalted Core Cycles Event or not. - * NOTE: Corresponding bit = 0 in ebp indicates event present. - */ - ebx = cpuid_ebx(10); - if (!(ebx & ARCH_PERFMON_UNHALTED_CORE_CYCLES_PRESENT)) - wrmsr(MSR_ARCH_PERFMON_EVENTSEL0, 0, 0); -} - -static int setup_intel_arch_watchdog(void) -{ - unsigned int evntsel; - unsigned ebx; - - /* - * Check whether the Architectural PerfMon supports - * Unhalted Core Cycles Event or not. - * NOTE: Corresponding bit = 0 in ebp indicates event present. - */ - ebx = cpuid_ebx(10); - if ((ebx & ARCH_PERFMON_UNHALTED_CORE_CYCLES_PRESENT)) - return 0; - - nmi_perfctr_msr = MSR_ARCH_PERFMON_PERFCTR0; - - clear_msr_range(MSR_ARCH_PERFMON_EVENTSEL0, 2); - clear_msr_range(MSR_ARCH_PERFMON_PERFCTR0, 2); - - evntsel = ARCH_PERFMON_EVENTSEL_INT - | ARCH_PERFMON_EVENTSEL_OS - | ARCH_PERFMON_EVENTSEL_USR - | ARCH_PERFMON_NMI_EVENT_SEL - | ARCH_PERFMON_NMI_EVENT_UMASK; - - wrmsr(MSR_ARCH_PERFMON_EVENTSEL0, evntsel, 0); - wrmsrl(MSR_ARCH_PERFMON_PERFCTR0, -((u64)cpu_khz * 1000 / nmi_hz)); - apic_write(APIC_LVTPC, APIC_DM_NMI); - evntsel |= ARCH_PERFMON_EVENTSEL0_ENABLE; - wrmsr(MSR_ARCH_PERFMON_EVENTSEL0, evntsel, 0); - return 1; -} - static int setup_p4_watchdog(void) { @@ -478,16 +428,10 @@ void setup_apic_nmi_watchdog(void) setup_k7_watchdog(); break; case X86_VENDOR_INTEL: - if (cpu_has(&boot_cpu_data, X86_FEATURE_ARCH_PERFMON)) { - if (!setup_intel_arch_watchdog()) - return; - } else if (boot_cpu_data.x86 == 15) { - if (!setup_p4_watchdog()) - return; - } else { + if (boot_cpu_data.x86 != 15) + return; + if (!setup_p4_watchdog()) return; - } - break; default: @@ -572,14 +516,7 @@ void __kprobes nmi_watchdog_tick(struct pt_regs * regs, unsigned reason) */ wrmsr(MSR_P4_IQ_CCCR0, nmi_p4_cccr_val, 0); apic_write(APIC_LVTPC, APIC_DM_NMI); - } else if (nmi_perfctr_msr == MSR_ARCH_PERFMON_PERFCTR0) { - /* - * For Intel based architectural perfmon - * - LVTPC is masked on interrupt and must be - * unmasked by the LVTPC handler. - */ - apic_write(APIC_LVTPC, APIC_DM_NMI); - } + } wrmsrl(nmi_perfctr_msr, -((u64)cpu_khz * 1000 / nmi_hz)); } } diff --git a/trunk/arch/x86_64/kernel/pci-calgary.c b/trunk/arch/x86_64/kernel/pci-calgary.c deleted file mode 100644 index d91cb843f54d..000000000000 --- a/trunk/arch/x86_64/kernel/pci-calgary.c +++ /dev/null @@ -1,1018 +0,0 @@ -/* - * Derived from arch/powerpc/kernel/iommu.c - * - * Copyright (C) 2006 Jon Mason , IBM Corporation - * Copyright (C) 2006 Muli Ben-Yehuda , IBM Corporation - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License as published by - * the Free Software Foundation; either version 2 of the License, or - * (at your option) any later version. - * - * This program is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU General Public License for more details. - * - * You should have received a copy of the GNU General Public License - * along with this program; if not, write to the Free Software - * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA - */ - -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#define PCI_DEVICE_ID_IBM_CALGARY 0x02a1 -#define PCI_VENDOR_DEVICE_ID_CALGARY \ - (PCI_VENDOR_ID_IBM | PCI_DEVICE_ID_IBM_CALGARY << 16) - -/* we need these for register space address calculation */ -#define START_ADDRESS 0xfe000000 -#define CHASSIS_BASE 0 -#define ONE_BASED_CHASSIS_NUM 1 - -/* register offsets inside the host bridge space */ -#define PHB_CSR_OFFSET 0x0110 -#define PHB_PLSSR_OFFSET 0x0120 -#define PHB_CONFIG_RW_OFFSET 0x0160 -#define PHB_IOBASE_BAR_LOW 0x0170 -#define PHB_IOBASE_BAR_HIGH 0x0180 -#define PHB_MEM_1_LOW 0x0190 -#define PHB_MEM_1_HIGH 0x01A0 -#define PHB_IO_ADDR_SIZE 0x01B0 -#define PHB_MEM_1_SIZE 0x01C0 -#define PHB_MEM_ST_OFFSET 0x01D0 -#define PHB_AER_OFFSET 0x0200 -#define PHB_CONFIG_0_HIGH 0x0220 -#define PHB_CONFIG_0_LOW 0x0230 -#define PHB_CONFIG_0_END 0x0240 -#define PHB_MEM_2_LOW 0x02B0 -#define PHB_MEM_2_HIGH 0x02C0 -#define PHB_MEM_2_SIZE_HIGH 0x02D0 -#define PHB_MEM_2_SIZE_LOW 0x02E0 -#define PHB_DOSHOLE_OFFSET 0x08E0 - -/* PHB_CONFIG_RW */ -#define PHB_TCE_ENABLE 0x20000000 -#define PHB_SLOT_DISABLE 0x1C000000 -#define PHB_DAC_DISABLE 0x01000000 -#define PHB_MEM2_ENABLE 0x00400000 -#define PHB_MCSR_ENABLE 0x00100000 -/* TAR (Table Address Register) */ -#define TAR_SW_BITS 0x0000ffffffff800fUL -#define TAR_VALID 0x0000000000000008UL -/* CSR (Channel/DMA Status Register) */ -#define CSR_AGENT_MASK 0xffe0ffff - -#define MAX_NUM_OF_PHBS 8 /* how many PHBs in total? */ -#define MAX_PHB_BUS_NUM (MAX_NUM_OF_PHBS * 2) /* max dev->bus->number */ -#define PHBS_PER_CALGARY 4 - -/* register offsets in Calgary's internal register space */ -static const unsigned long tar_offsets[] = { - 0x0580 /* TAR0 */, - 0x0588 /* TAR1 */, - 0x0590 /* TAR2 */, - 0x0598 /* TAR3 */ -}; - -static const unsigned long split_queue_offsets[] = { - 0x4870 /* SPLIT QUEUE 0 */, - 0x5870 /* SPLIT QUEUE 1 */, - 0x6870 /* SPLIT QUEUE 2 */, - 0x7870 /* SPLIT QUEUE 3 */ -}; - -static const unsigned long phb_offsets[] = { - 0x8000 /* PHB0 */, - 0x9000 /* PHB1 */, - 0xA000 /* PHB2 */, - 0xB000 /* PHB3 */ -}; - -void* tce_table_kva[MAX_NUM_OF_PHBS * MAX_NUMNODES]; -unsigned int specified_table_size = TCE_TABLE_SIZE_UNSPECIFIED; -static int translate_empty_slots __read_mostly = 0; -static int calgary_detected __read_mostly = 0; - -/* - * the bitmap of PHBs the user requested that we disable - * translation on. - */ -static DECLARE_BITMAP(translation_disabled, MAX_NUMNODES * MAX_PHB_BUS_NUM); - -static void tce_cache_blast(struct iommu_table *tbl); - -/* enable this to stress test the chip's TCE cache */ -#ifdef CONFIG_IOMMU_DEBUG -static inline void tce_cache_blast_stress(struct iommu_table *tbl) -{ - tce_cache_blast(tbl); -} -#else -static inline void tce_cache_blast_stress(struct iommu_table *tbl) -{ -} -#endif /* BLAST_TCE_CACHE_ON_UNMAP */ - -static inline unsigned int num_dma_pages(unsigned long dma, unsigned int dmalen) -{ - unsigned int npages; - - npages = PAGE_ALIGN(dma + dmalen) - (dma & PAGE_MASK); - npages >>= PAGE_SHIFT; - - return npages; -} - -static inline int translate_phb(struct pci_dev* dev) -{ - int disabled = test_bit(dev->bus->number, translation_disabled); - return !disabled; -} - -static void iommu_range_reserve(struct iommu_table *tbl, - unsigned long start_addr, unsigned int npages) -{ - unsigned long index; - unsigned long end; - - index = start_addr >> PAGE_SHIFT; - - /* bail out if we're asked to reserve a region we don't cover */ - if (index >= tbl->it_size) - return; - - end = index + npages; - if (end > tbl->it_size) /* don't go off the table */ - end = tbl->it_size; - - while (index < end) { - if (test_bit(index, tbl->it_map)) - printk(KERN_ERR "Calgary: entry already allocated at " - "0x%lx tbl %p dma 0x%lx npages %u\n", - index, tbl, start_addr, npages); - ++index; - } - set_bit_string(tbl->it_map, start_addr >> PAGE_SHIFT, npages); -} - -static unsigned long iommu_range_alloc(struct iommu_table *tbl, - unsigned int npages) -{ - unsigned long offset; - - BUG_ON(npages == 0); - - offset = find_next_zero_string(tbl->it_map, tbl->it_hint, - tbl->it_size, npages); - if (offset == ~0UL) { - tce_cache_blast(tbl); - offset = find_next_zero_string(tbl->it_map, 0, - tbl->it_size, npages); - if (offset == ~0UL) { - printk(KERN_WARNING "Calgary: IOMMU full.\n"); - if (panic_on_overflow) - panic("Calgary: fix the allocator.\n"); - else - return bad_dma_address; - } - } - - set_bit_string(tbl->it_map, offset, npages); - tbl->it_hint = offset + npages; - BUG_ON(tbl->it_hint > tbl->it_size); - - return offset; -} - -static dma_addr_t iommu_alloc(struct iommu_table *tbl, void *vaddr, - unsigned int npages, int direction) -{ - unsigned long entry, flags; - dma_addr_t ret = bad_dma_address; - - spin_lock_irqsave(&tbl->it_lock, flags); - - entry = iommu_range_alloc(tbl, npages); - - if (unlikely(entry == bad_dma_address)) - goto error; - - /* set the return dma address */ - ret = (entry << PAGE_SHIFT) | ((unsigned long)vaddr & ~PAGE_MASK); - - /* put the TCEs in the HW table */ - tce_build(tbl, entry, npages, (unsigned long)vaddr & PAGE_MASK, - direction); - - spin_unlock_irqrestore(&tbl->it_lock, flags); - - return ret; - -error: - spin_unlock_irqrestore(&tbl->it_lock, flags); - printk(KERN_WARNING "Calgary: failed to allocate %u pages in " - "iommu %p\n", npages, tbl); - return bad_dma_address; -} - -static void __iommu_free(struct iommu_table *tbl, dma_addr_t dma_addr, - unsigned int npages) -{ - unsigned long entry; - unsigned long i; - - entry = dma_addr >> PAGE_SHIFT; - - BUG_ON(entry + npages > tbl->it_size); - - tce_free(tbl, entry, npages); - - for (i = 0; i < npages; ++i) { - if (!test_bit(entry + i, tbl->it_map)) - printk(KERN_ERR "Calgary: bit is off at 0x%lx " - "tbl %p dma 0x%Lx entry 0x%lx npages %u\n", - entry + i, tbl, dma_addr, entry, npages); - } - - __clear_bit_string(tbl->it_map, entry, npages); - - tce_cache_blast_stress(tbl); -} - -static void iommu_free(struct iommu_table *tbl, dma_addr_t dma_addr, - unsigned int npages) -{ - unsigned long flags; - - spin_lock_irqsave(&tbl->it_lock, flags); - - __iommu_free(tbl, dma_addr, npages); - - spin_unlock_irqrestore(&tbl->it_lock, flags); -} - -static void __calgary_unmap_sg(struct iommu_table *tbl, - struct scatterlist *sglist, int nelems, int direction) -{ - while (nelems--) { - unsigned int npages; - dma_addr_t dma = sglist->dma_address; - unsigned int dmalen = sglist->dma_length; - - if (dmalen == 0) - break; - - npages = num_dma_pages(dma, dmalen); - __iommu_free(tbl, dma, npages); - sglist++; - } -} - -void calgary_unmap_sg(struct device *dev, struct scatterlist *sglist, - int nelems, int direction) -{ - unsigned long flags; - struct iommu_table *tbl = to_pci_dev(dev)->bus->self->sysdata; - - if (!translate_phb(to_pci_dev(dev))) - return; - - spin_lock_irqsave(&tbl->it_lock, flags); - - __calgary_unmap_sg(tbl, sglist, nelems, direction); - - spin_unlock_irqrestore(&tbl->it_lock, flags); -} - -static int calgary_nontranslate_map_sg(struct device* dev, - struct scatterlist *sg, int nelems, int direction) -{ - int i; - - for (i = 0; i < nelems; i++ ) { - struct scatterlist *s = &sg[i]; - BUG_ON(!s->page); - s->dma_address = virt_to_bus(page_address(s->page) +s->offset); - s->dma_length = s->length; - } - return nelems; -} - -int calgary_map_sg(struct device *dev, struct scatterlist *sg, - int nelems, int direction) -{ - struct iommu_table *tbl = to_pci_dev(dev)->bus->self->sysdata; - unsigned long flags; - unsigned long vaddr; - unsigned int npages; - unsigned long entry; - int i; - - if (!translate_phb(to_pci_dev(dev))) - return calgary_nontranslate_map_sg(dev, sg, nelems, direction); - - spin_lock_irqsave(&tbl->it_lock, flags); - - for (i = 0; i < nelems; i++ ) { - struct scatterlist *s = &sg[i]; - BUG_ON(!s->page); - - vaddr = (unsigned long)page_address(s->page) + s->offset; - npages = num_dma_pages(vaddr, s->length); - - entry = iommu_range_alloc(tbl, npages); - if (entry == bad_dma_address) { - /* makes sure unmap knows to stop */ - s->dma_length = 0; - goto error; - } - - s->dma_address = (entry << PAGE_SHIFT) | s->offset; - - /* insert into HW table */ - tce_build(tbl, entry, npages, vaddr & PAGE_MASK, - direction); - - s->dma_length = s->length; - } - - spin_unlock_irqrestore(&tbl->it_lock, flags); - - return nelems; -error: - __calgary_unmap_sg(tbl, sg, nelems, direction); - for (i = 0; i < nelems; i++) { - sg[i].dma_address = bad_dma_address; - sg[i].dma_length = 0; - } - spin_unlock_irqrestore(&tbl->it_lock, flags); - return 0; -} - -dma_addr_t calgary_map_single(struct device *dev, void *vaddr, - size_t size, int direction) -{ - dma_addr_t dma_handle = bad_dma_address; - unsigned long uaddr; - unsigned int npages; - struct iommu_table *tbl = to_pci_dev(dev)->bus->self->sysdata; - - uaddr = (unsigned long)vaddr; - npages = num_dma_pages(uaddr, size); - - if (translate_phb(to_pci_dev(dev))) - dma_handle = iommu_alloc(tbl, vaddr, npages, direction); - else - dma_handle = virt_to_bus(vaddr); - - return dma_handle; -} - -void calgary_unmap_single(struct device *dev, dma_addr_t dma_handle, - size_t size, int direction) -{ - struct iommu_table *tbl = to_pci_dev(dev)->bus->self->sysdata; - unsigned int npages; - - if (!translate_phb(to_pci_dev(dev))) - return; - - npages = num_dma_pages(dma_handle, size); - iommu_free(tbl, dma_handle, npages); -} - -void* calgary_alloc_coherent(struct device *dev, size_t size, - dma_addr_t *dma_handle, gfp_t flag) -{ - void *ret = NULL; - dma_addr_t mapping; - unsigned int npages, order; - struct iommu_table *tbl; - - tbl = to_pci_dev(dev)->bus->self->sysdata; - - size = PAGE_ALIGN(size); /* size rounded up to full pages */ - npages = size >> PAGE_SHIFT; - order = get_order(size); - - /* alloc enough pages (and possibly more) */ - ret = (void *)__get_free_pages(flag, order); - if (!ret) - goto error; - memset(ret, 0, size); - - if (translate_phb(to_pci_dev(dev))) { - /* set up tces to cover the allocated range */ - mapping = iommu_alloc(tbl, ret, npages, DMA_BIDIRECTIONAL); - if (mapping == bad_dma_address) - goto free; - - *dma_handle = mapping; - } else /* non translated slot */ - *dma_handle = virt_to_bus(ret); - - return ret; - -free: - free_pages((unsigned long)ret, get_order(size)); - ret = NULL; -error: - return ret; -} - -static struct dma_mapping_ops calgary_dma_ops = { - .alloc_coherent = calgary_alloc_coherent, - .map_single = calgary_map_single, - .unmap_single = calgary_unmap_single, - .map_sg = calgary_map_sg, - .unmap_sg = calgary_unmap_sg, -}; - -static inline int busno_to_phbid(unsigned char num) -{ - return bus_to_phb(num) % PHBS_PER_CALGARY; -} - -static inline unsigned long split_queue_offset(unsigned char num) -{ - size_t idx = busno_to_phbid(num); - - return split_queue_offsets[idx]; -} - -static inline unsigned long tar_offset(unsigned char num) -{ - size_t idx = busno_to_phbid(num); - - return tar_offsets[idx]; -} - -static inline unsigned long phb_offset(unsigned char num) -{ - size_t idx = busno_to_phbid(num); - - return phb_offsets[idx]; -} - -static inline void __iomem* calgary_reg(void __iomem *bar, unsigned long offset) -{ - unsigned long target = ((unsigned long)bar) | offset; - return (void __iomem*)target; -} - -static void tce_cache_blast(struct iommu_table *tbl) -{ - u64 val; - u32 aer; - int i = 0; - void __iomem *bbar = tbl->bbar; - void __iomem *target; - - /* disable arbitration on the bus */ - target = calgary_reg(bbar, phb_offset(tbl->it_busno) | PHB_AER_OFFSET); - aer = readl(target); - writel(0, target); - - /* read plssr to ensure it got there */ - target = calgary_reg(bbar, phb_offset(tbl->it_busno) | PHB_PLSSR_OFFSET); - val = readl(target); - - /* poll split queues until all DMA activity is done */ - target = calgary_reg(bbar, split_queue_offset(tbl->it_busno)); - do { - val = readq(target); - i++; - } while ((val & 0xff) != 0xff && i < 100); - if (i == 100) - printk(KERN_WARNING "Calgary: PCI bus not quiesced, " - "continuing anyway\n"); - - /* invalidate TCE cache */ - target = calgary_reg(bbar, tar_offset(tbl->it_busno)); - writeq(tbl->tar_val, target); - - /* enable arbitration */ - target = calgary_reg(bbar, phb_offset(tbl->it_busno) | PHB_AER_OFFSET); - writel(aer, target); - (void)readl(target); /* flush */ -} - -static void __init calgary_reserve_mem_region(struct pci_dev *dev, u64 start, - u64 limit) -{ - unsigned int numpages; - - limit = limit | 0xfffff; - limit++; - - numpages = ((limit - start) >> PAGE_SHIFT); - iommu_range_reserve(dev->sysdata, start, numpages); -} - -static void __init calgary_reserve_peripheral_mem_1(struct pci_dev *dev) -{ - void __iomem *target; - u64 low, high, sizelow; - u64 start, limit; - struct iommu_table *tbl = dev->sysdata; - unsigned char busnum = dev->bus->number; - void __iomem *bbar = tbl->bbar; - - /* peripheral MEM_1 region */ - target = calgary_reg(bbar, phb_offset(busnum) | PHB_MEM_1_LOW); - low = be32_to_cpu(readl(target)); - target = calgary_reg(bbar, phb_offset(busnum) | PHB_MEM_1_HIGH); - high = be32_to_cpu(readl(target)); - target = calgary_reg(bbar, phb_offset(busnum) | PHB_MEM_1_SIZE); - sizelow = be32_to_cpu(readl(target)); - - start = (high << 32) | low; - limit = sizelow; - - calgary_reserve_mem_region(dev, start, limit); -} - -static void __init calgary_reserve_peripheral_mem_2(struct pci_dev *dev) -{ - void __iomem *target; - u32 val32; - u64 low, high, sizelow, sizehigh; - u64 start, limit; - struct iommu_table *tbl = dev->sysdata; - unsigned char busnum = dev->bus->number; - void __iomem *bbar = tbl->bbar; - - /* is it enabled? */ - target = calgary_reg(bbar, phb_offset(busnum) | PHB_CONFIG_RW_OFFSET); - val32 = be32_to_cpu(readl(target)); - if (!(val32 & PHB_MEM2_ENABLE)) - return; - - target = calgary_reg(bbar, phb_offset(busnum) | PHB_MEM_2_LOW); - low = be32_to_cpu(readl(target)); - target = calgary_reg(bbar, phb_offset(busnum) | PHB_MEM_2_HIGH); - high = be32_to_cpu(readl(target)); - target = calgary_reg(bbar, phb_offset(busnum) | PHB_MEM_2_SIZE_LOW); - sizelow = be32_to_cpu(readl(target)); - target = calgary_reg(bbar, phb_offset(busnum) | PHB_MEM_2_SIZE_HIGH); - sizehigh = be32_to_cpu(readl(target)); - - start = (high << 32) | low; - limit = (sizehigh << 32) | sizelow; - - calgary_reserve_mem_region(dev, start, limit); -} - -/* - * some regions of the IO address space do not get translated, so we - * must not give devices IO addresses in those regions. The regions - * are the 640KB-1MB region and the two PCI peripheral memory holes. - * Reserve all of them in the IOMMU bitmap to avoid giving them out - * later. - */ -static void __init calgary_reserve_regions(struct pci_dev *dev) -{ - unsigned int npages; - void __iomem *bbar; - unsigned char busnum; - u64 start; - struct iommu_table *tbl = dev->sysdata; - - bbar = tbl->bbar; - busnum = dev->bus->number; - - /* reserve bad_dma_address in case it's a legal address */ - iommu_range_reserve(tbl, bad_dma_address, 1); - - /* avoid the BIOS/VGA first 640KB-1MB region */ - start = (640 * 1024); - npages = ((1024 - 640) * 1024) >> PAGE_SHIFT; - iommu_range_reserve(tbl, start, npages); - - /* reserve the two PCI peripheral memory regions in IO space */ - calgary_reserve_peripheral_mem_1(dev); - calgary_reserve_peripheral_mem_2(dev); -} - -static int __init calgary_setup_tar(struct pci_dev *dev, void __iomem *bbar) -{ - u64 val64; - u64 table_phys; - void __iomem *target; - int ret; - struct iommu_table *tbl; - - /* build TCE tables for each PHB */ - ret = build_tce_table(dev, bbar); - if (ret) - return ret; - - calgary_reserve_regions(dev); - - /* set TARs for each PHB */ - target = calgary_reg(bbar, tar_offset(dev->bus->number)); - val64 = be64_to_cpu(readq(target)); - - /* zero out all TAR bits under sw control */ - val64 &= ~TAR_SW_BITS; - - tbl = dev->sysdata; - table_phys = (u64)__pa(tbl->it_base); - val64 |= table_phys; - - BUG_ON(specified_table_size > TCE_TABLE_SIZE_8M); - val64 |= (u64) specified_table_size; - - tbl->tar_val = cpu_to_be64(val64); - writeq(tbl->tar_val, target); - readq(target); /* flush */ - - return 0; -} - -static void __init calgary_free_tar(struct pci_dev *dev) -{ - u64 val64; - struct iommu_table *tbl = dev->sysdata; - void __iomem *target; - - target = calgary_reg(tbl->bbar, tar_offset(dev->bus->number)); - val64 = be64_to_cpu(readq(target)); - val64 &= ~TAR_SW_BITS; - writeq(cpu_to_be64(val64), target); - readq(target); /* flush */ - - kfree(tbl); - dev->sysdata = NULL; -} - -static void calgary_watchdog(unsigned long data) -{ - struct pci_dev *dev = (struct pci_dev *)data; - struct iommu_table *tbl = dev->sysdata; - void __iomem *bbar = tbl->bbar; - u32 val32; - void __iomem *target; - - target = calgary_reg(bbar, phb_offset(tbl->it_busno) | PHB_CSR_OFFSET); - val32 = be32_to_cpu(readl(target)); - - /* If no error, the agent ID in the CSR is not valid */ - if (val32 & CSR_AGENT_MASK) { - printk(KERN_EMERG "calgary_watchdog: DMA error on bus %d, " - "CSR = %#x\n", dev->bus->number, val32); - writel(0, target); - - /* Disable bus that caused the error */ - target = calgary_reg(bbar, phb_offset(tbl->it_busno) | - PHB_CONFIG_RW_OFFSET); - val32 = be32_to_cpu(readl(target)); - val32 |= PHB_SLOT_DISABLE; - writel(cpu_to_be32(val32), target); - readl(target); /* flush */ - } else { - /* Reset the timer */ - mod_timer(&tbl->watchdog_timer, jiffies + 2 * HZ); - } -} - -static void __init calgary_enable_translation(struct pci_dev *dev) -{ - u32 val32; - unsigned char busnum; - void __iomem *target; - void __iomem *bbar; - struct iommu_table *tbl; - - busnum = dev->bus->number; - tbl = dev->sysdata; - bbar = tbl->bbar; - - /* enable TCE in PHB Config Register */ - target = calgary_reg(bbar, phb_offset(busnum) | PHB_CONFIG_RW_OFFSET); - val32 = be32_to_cpu(readl(target)); - val32 |= PHB_TCE_ENABLE | PHB_DAC_DISABLE | PHB_MCSR_ENABLE; - - printk(KERN_INFO "Calgary: enabling translation on PHB %d\n", busnum); - printk(KERN_INFO "Calgary: errant DMAs will now be prevented on this " - "bus.\n"); - - writel(cpu_to_be32(val32), target); - readl(target); /* flush */ - - init_timer(&tbl->watchdog_timer); - tbl->watchdog_timer.function = &calgary_watchdog; - tbl->watchdog_timer.data = (unsigned long)dev; - mod_timer(&tbl->watchdog_timer, jiffies); -} - -static void __init calgary_disable_translation(struct pci_dev *dev) -{ - u32 val32; - unsigned char busnum; - void __iomem *target; - void __iomem *bbar; - struct iommu_table *tbl; - - busnum = dev->bus->number; - tbl = dev->sysdata; - bbar = tbl->bbar; - - /* disable TCE in PHB Config Register */ - target = calgary_reg(bbar, phb_offset(busnum) | PHB_CONFIG_RW_OFFSET); - val32 = be32_to_cpu(readl(target)); - val32 &= ~(PHB_TCE_ENABLE | PHB_DAC_DISABLE | PHB_MCSR_ENABLE); - - printk(KERN_INFO "Calgary: disabling translation on PHB %d!\n", busnum); - writel(cpu_to_be32(val32), target); - readl(target); /* flush */ - - del_timer_sync(&tbl->watchdog_timer); -} - -static inline unsigned int __init locate_register_space(struct pci_dev *dev) -{ - int rionodeid; - u32 address; - - rionodeid = (dev->bus->number % 15 > 4) ? 3 : 2; - /* - * register space address calculation as follows: - * FE0MB-8MB*OneBasedChassisNumber+1MB*(RioNodeId-ChassisBase) - * ChassisBase is always zero for x366/x260/x460 - * RioNodeId is 2 for first Calgary, 3 for second Calgary - */ - address = START_ADDRESS - - (0x800000 * (ONE_BASED_CHASSIS_NUM + dev->bus->number / 15)) + - (0x100000) * (rionodeid - CHASSIS_BASE); - return address; -} - -static int __init calgary_init_one_nontraslated(struct pci_dev *dev) -{ - dev->sysdata = NULL; - dev->bus->self = dev; - - return 0; -} - -static int __init calgary_init_one(struct pci_dev *dev) -{ - u32 address; - void __iomem *bbar; - int ret; - - address = locate_register_space(dev); - /* map entire 1MB of Calgary config space */ - bbar = ioremap_nocache(address, 1024 * 1024); - if (!bbar) { - ret = -ENODATA; - goto done; - } - - ret = calgary_setup_tar(dev, bbar); - if (ret) - goto iounmap; - - dev->bus->self = dev; - calgary_enable_translation(dev); - - return 0; - -iounmap: - iounmap(bbar); -done: - return ret; -} - -static int __init calgary_init(void) -{ - int i, ret = -ENODEV; - struct pci_dev *dev = NULL; - - for (i = 0; i <= num_online_nodes() * MAX_NUM_OF_PHBS; i++) { - dev = pci_get_device(PCI_VENDOR_ID_IBM, - PCI_DEVICE_ID_IBM_CALGARY, - dev); - if (!dev) - break; - if (!translate_phb(dev)) { - calgary_init_one_nontraslated(dev); - continue; - } - if (!tce_table_kva[i] && !translate_empty_slots) { - pci_dev_put(dev); - continue; - } - ret = calgary_init_one(dev); - if (ret) - goto error; - } - - return ret; - -error: - for (i--; i >= 0; i--) { - dev = pci_find_device_reverse(PCI_VENDOR_ID_IBM, - PCI_DEVICE_ID_IBM_CALGARY, - dev); - if (!translate_phb(dev)) { - pci_dev_put(dev); - continue; - } - if (!tce_table_kva[i] && !translate_empty_slots) - continue; - calgary_disable_translation(dev); - calgary_free_tar(dev); - pci_dev_put(dev); - } - - return ret; -} - -static inline int __init determine_tce_table_size(u64 ram) -{ - int ret; - - if (specified_table_size != TCE_TABLE_SIZE_UNSPECIFIED) - return specified_table_size; - - /* - * Table sizes are from 0 to 7 (TCE_TABLE_SIZE_64K to - * TCE_TABLE_SIZE_8M). Table size 0 has 8K entries and each - * larger table size has twice as many entries, so shift the - * max ram address by 13 to divide by 8K and then look at the - * order of the result to choose between 0-7. - */ - ret = get_order(ram >> 13); - if (ret > TCE_TABLE_SIZE_8M) - ret = TCE_TABLE_SIZE_8M; - - return ret; -} - -void __init detect_calgary(void) -{ - u32 val; - int bus, table_idx; - void *tbl; - int detected = 0; - - /* - * if the user specified iommu=off or iommu=soft or we found - * another HW IOMMU already, bail out. - */ - if (swiotlb || no_iommu || iommu_detected) - return; - - specified_table_size = determine_tce_table_size(end_pfn * PAGE_SIZE); - - for (bus = 0, table_idx = 0; - bus <= num_online_nodes() * MAX_PHB_BUS_NUM; - bus++) { - BUG_ON(bus > MAX_NUMNODES * MAX_PHB_BUS_NUM); - if (read_pci_config(bus, 0, 0, 0) != PCI_VENDOR_DEVICE_ID_CALGARY) - continue; - if (test_bit(bus, translation_disabled)) { - printk(KERN_INFO "Calgary: translation is disabled for " - "PHB 0x%x\n", bus); - /* skip this phb, don't allocate a tbl for it */ - tce_table_kva[table_idx] = NULL; - table_idx++; - continue; - } - /* - * scan the first slot of the PCI bus to see if there - * are any devices present - */ - val = read_pci_config(bus, 1, 0, 0); - if (val != 0xffffffff || translate_empty_slots) { - tbl = alloc_tce_table(); - if (!tbl) - goto cleanup; - detected = 1; - } else - tbl = NULL; - - tce_table_kva[table_idx] = tbl; - table_idx++; - } - - if (detected) { - iommu_detected = 1; - calgary_detected = 1; - printk(KERN_INFO "PCI-DMA: Calgary IOMMU detected. " - "TCE table spec is %d.\n", specified_table_size); - } - return; - -cleanup: - for (--table_idx; table_idx >= 0; --table_idx) - if (tce_table_kva[table_idx]) - free_tce_table(tce_table_kva[table_idx]); -} - -int __init calgary_iommu_init(void) -{ - int ret; - - if (no_iommu || swiotlb) - return -ENODEV; - - if (!calgary_detected) - return -ENODEV; - - /* ok, we're trying to use Calgary - let's roll */ - printk(KERN_INFO "PCI-DMA: Using Calgary IOMMU\n"); - - ret = calgary_init(); - if (ret) { - printk(KERN_ERR "PCI-DMA: Calgary init failed %d, " - "falling back to no_iommu\n", ret); - if (end_pfn > MAX_DMA32_PFN) - printk(KERN_ERR "WARNING more than 4GB of memory, " - "32bit PCI may malfunction.\n"); - return ret; - } - - force_iommu = 1; - dma_ops = &calgary_dma_ops; - - return 0; -} - -static int __init calgary_parse_options(char *p) -{ - unsigned int bridge; - size_t len; - char* endp; - - while (*p) { - if (!strncmp(p, "64k", 3)) - specified_table_size = TCE_TABLE_SIZE_64K; - else if (!strncmp(p, "128k", 4)) - specified_table_size = TCE_TABLE_SIZE_128K; - else if (!strncmp(p, "256k", 4)) - specified_table_size = TCE_TABLE_SIZE_256K; - else if (!strncmp(p, "512k", 4)) - specified_table_size = TCE_TABLE_SIZE_512K; - else if (!strncmp(p, "1M", 2)) - specified_table_size = TCE_TABLE_SIZE_1M; - else if (!strncmp(p, "2M", 2)) - specified_table_size = TCE_TABLE_SIZE_2M; - else if (!strncmp(p, "4M", 2)) - specified_table_size = TCE_TABLE_SIZE_4M; - else if (!strncmp(p, "8M", 2)) - specified_table_size = TCE_TABLE_SIZE_8M; - - len = strlen("translate_empty_slots"); - if (!strncmp(p, "translate_empty_slots", len)) - translate_empty_slots = 1; - - len = strlen("disable"); - if (!strncmp(p, "disable", len)) { - p += len; - if (*p == '=') - ++p; - if (*p == '\0') - break; - bridge = simple_strtol(p, &endp, 0); - if (p == endp) - break; - - if (bridge <= (num_online_nodes() * MAX_PHB_BUS_NUM)) { - printk(KERN_INFO "Calgary: disabling " - "translation for PHB 0x%x\n", bridge); - set_bit(bridge, translation_disabled); - } - } - - p = strpbrk(p, ","); - if (!p) - break; - - p++; /* skip ',' */ - } - return 1; -} -__setup("calgary=", calgary_parse_options); diff --git a/trunk/arch/x86_64/kernel/pci-dma.c b/trunk/arch/x86_64/kernel/pci-dma.c index 9c44f4f2433d..a9275c9557cf 100644 --- a/trunk/arch/x86_64/kernel/pci-dma.c +++ b/trunk/arch/x86_64/kernel/pci-dma.c @@ -9,7 +9,6 @@ #include #include #include -#include int iommu_merge __read_mostly = 0; EXPORT_SYMBOL(iommu_merge); @@ -34,15 +33,12 @@ int panic_on_overflow __read_mostly = 0; int force_iommu __read_mostly= 0; #endif -/* Set this to 1 if there is a HW IOMMU in the system */ -int iommu_detected __read_mostly = 0; - /* Dummy device used for NULL arguments (normally ISA). Better would be probably a smaller DMA mask, but this is bug-to-bug compatible to i386. */ struct device fallback_dev = { .bus_id = "fallback device", - .coherent_dma_mask = DMA_32BIT_MASK, + .coherent_dma_mask = 0xffffffff, .dma_mask = &fallback_dev.coherent_dma_mask, }; @@ -81,7 +77,7 @@ dma_alloc_coherent(struct device *dev, size_t size, dma_addr_t *dma_handle, dev = &fallback_dev; dma_mask = dev->coherent_dma_mask; if (dma_mask == 0) - dma_mask = DMA_32BIT_MASK; + dma_mask = 0xffffffff; /* Don't invoke OOM killer */ gfp |= __GFP_NORETRY; @@ -94,7 +90,7 @@ dma_alloc_coherent(struct device *dev, size_t size, dma_addr_t *dma_handle, larger than 16MB and in this case we have a chance of finding fitting memory in the next higher zone first. If not retry with true GFP_DMA. -AK */ - if (dma_mask <= DMA_32BIT_MASK) + if (dma_mask <= 0xffffffff) gfp |= GFP_DMA32; again: @@ -115,7 +111,7 @@ dma_alloc_coherent(struct device *dev, size_t size, dma_addr_t *dma_handle, /* Don't use the 16MB ZONE_DMA unless absolutely needed. It's better to use remapping first. */ - if (dma_mask < DMA_32BIT_MASK && !(gfp & GFP_DMA)) { + if (dma_mask < 0xffffffff && !(gfp & GFP_DMA)) { gfp = (gfp & ~GFP_DMA32) | GFP_DMA; goto again; } @@ -178,7 +174,7 @@ int dma_supported(struct device *dev, u64 mask) /* Copied from i386. Doesn't make much sense, because it will only work for pci_alloc_coherent. The caller just has to use GFP_DMA in this case. */ - if (mask < DMA_24BIT_MASK) + if (mask < 0x00ffffff) return 0; /* Tell the device to use SAC when IOMMU force is on. This @@ -193,7 +189,7 @@ int dma_supported(struct device *dev, u64 mask) SAC for these. Assume all masks <= 40 bits are of this type. Normally this doesn't make any difference, but gives more gentle handling of IOMMU overflow. */ - if (iommu_sac_force && (mask >= DMA_40BIT_MASK)) { + if (iommu_sac_force && (mask >= 0xffffffffffULL)) { printk(KERN_INFO "%s: Force SAC with mask %Lx\n", dev->bus_id,mask); return 0; } @@ -270,7 +266,7 @@ __init int iommu_setup(char *p) swiotlb = 1; #endif -#ifdef CONFIG_IOMMU +#ifdef CONFIG_GART_IOMMU gart_parse_options(p); #endif @@ -280,40 +276,3 @@ __init int iommu_setup(char *p) } return 1; } -__setup("iommu=", iommu_setup); - -void __init pci_iommu_alloc(void) -{ - /* - * The order of these functions is important for - * fall-back/fail-over reasons - */ -#ifdef CONFIG_IOMMU - iommu_hole_init(); -#endif - -#ifdef CONFIG_CALGARY_IOMMU - detect_calgary(); -#endif - -#ifdef CONFIG_SWIOTLB - pci_swiotlb_init(); -#endif -} - -static int __init pci_iommu_init(void) -{ -#ifdef CONFIG_CALGARY_IOMMU - calgary_iommu_init(); -#endif - -#ifdef CONFIG_IOMMU - gart_iommu_init(); -#endif - - no_iommu_init(); - return 0; -} - -/* Must execute after PCI subsystem */ -fs_initcall(pci_iommu_init); diff --git a/trunk/arch/x86_64/kernel/pci-gart.c b/trunk/arch/x86_64/kernel/pci-gart.c index 4ca674d16b09..82a7c9bfdfa0 100644 --- a/trunk/arch/x86_64/kernel/pci-gart.c +++ b/trunk/arch/x86_64/kernel/pci-gart.c @@ -32,7 +32,6 @@ #include #include #include -#include unsigned long iommu_bus_base; /* GART remapping area (physical) */ static unsigned long iommu_size; /* size of remapping area bytes */ @@ -47,6 +46,8 @@ u32 *iommu_gatt_base; /* Remapping table */ also seen with Qlogic at least). */ int iommu_fullflush = 1; +#define MAX_NB 8 + /* Allocation bitmap for the remapping area */ static DEFINE_SPINLOCK(iommu_bitmap_lock); static unsigned long *iommu_gart_bitmap; /* guarded by iommu_bitmap_lock */ @@ -62,6 +63,13 @@ static u32 gart_unmapped_entry; #define to_pages(addr,size) \ (round_up(((addr) & ~PAGE_MASK) + (size), PAGE_SIZE) >> PAGE_SHIFT) +#define for_all_nb(dev) \ + dev = NULL; \ + while ((dev = pci_get_device(PCI_VENDOR_ID_AMD, 0x1103, dev))!=NULL) + +static struct pci_dev *northbridges[MAX_NB]; +static u32 northbridge_flush_word[MAX_NB]; + #define EMERGENCY_PAGES 32 /* = 128KB */ #ifdef CONFIG_AGP @@ -85,7 +93,7 @@ static unsigned long alloc_iommu(int size) offset = find_next_zero_string(iommu_gart_bitmap,next_bit,iommu_pages,size); if (offset == -1) { need_flush = 1; - offset = find_next_zero_string(iommu_gart_bitmap,0,iommu_pages,size); + offset = find_next_zero_string(iommu_gart_bitmap,0,next_bit,size); } if (offset != -1) { set_bit_string(iommu_gart_bitmap, offset, size); @@ -112,17 +120,44 @@ static void free_iommu(unsigned long offset, int size) /* * Use global flush state to avoid races with multiple flushers. */ -static void flush_gart(void) +static void flush_gart(struct device *dev) { unsigned long flags; + int flushed = 0; + int i, max; + spin_lock_irqsave(&iommu_bitmap_lock, flags); - if (need_flush) { - k8_flush_garts(); + if (need_flush) { + max = 0; + for (i = 0; i < MAX_NB; i++) { + if (!northbridges[i]) + continue; + pci_write_config_dword(northbridges[i], 0x9c, + northbridge_flush_word[i] | 1); + flushed++; + max = i; + } + for (i = 0; i <= max; i++) { + u32 w; + if (!northbridges[i]) + continue; + /* Make sure the hardware actually executed the flush. */ + for (;;) { + pci_read_config_dword(northbridges[i], 0x9c, &w); + if (!(w & 1)) + break; + cpu_relax(); + } + } + if (!flushed) + printk("nothing to flush?\n"); need_flush = 0; } spin_unlock_irqrestore(&iommu_bitmap_lock, flags); } + + #ifdef CONFIG_IOMMU_LEAK #define SET_LEAK(x) if (iommu_leak_tab) \ @@ -231,7 +266,7 @@ static dma_addr_t gart_map_simple(struct device *dev, char *buf, size_t size, int dir) { dma_addr_t map = dma_map_area(dev, virt_to_bus(buf), size, dir); - flush_gart(); + flush_gart(dev); return map; } @@ -253,28 +288,6 @@ dma_addr_t gart_map_single(struct device *dev, void *addr, size_t size, int dir) return bus; } -/* - * Free a DMA mapping. - */ -void gart_unmap_single(struct device *dev, dma_addr_t dma_addr, - size_t size, int direction) -{ - unsigned long iommu_page; - int npages; - int i; - - if (dma_addr < iommu_bus_base + EMERGENCY_PAGES*PAGE_SIZE || - dma_addr >= iommu_bus_base + iommu_size) - return; - iommu_page = (dma_addr - iommu_bus_base)>>PAGE_SHIFT; - npages = to_pages(dma_addr, size); - for (i = 0; i < npages; i++) { - iommu_gatt_base[iommu_page + i] = gart_unmapped_entry; - CLEAR_LEAK(iommu_page + i); - } - free_iommu(iommu_page, npages); -} - /* * Wrapper for pci_unmap_single working with scatterlists. */ @@ -286,7 +299,7 @@ void gart_unmap_sg(struct device *dev, struct scatterlist *sg, int nents, int di struct scatterlist *s = &sg[i]; if (!s->dma_length || !s->length) break; - gart_unmap_single(dev, s->dma_address, s->dma_length, dir); + dma_unmap_single(dev, s->dma_address, s->dma_length, dir); } } @@ -316,7 +329,7 @@ static int dma_map_sg_nonforce(struct device *dev, struct scatterlist *sg, s->dma_address = addr; s->dma_length = s->length; } - flush_gart(); + flush_gart(dev); return nents; } @@ -423,13 +436,13 @@ int gart_map_sg(struct device *dev, struct scatterlist *sg, int nents, int dir) if (dma_map_cont(sg, start, i, sg+out, pages, need) < 0) goto error; out++; - flush_gart(); + flush_gart(dev); if (out < nents) sg[out].dma_length = 0; return out; error: - flush_gart(); + flush_gart(NULL); gart_unmap_sg(dev, sg, nents, dir); /* When it was forced or merged try again in a dumb way */ if (force_iommu || iommu_merge) { @@ -445,6 +458,28 @@ int gart_map_sg(struct device *dev, struct scatterlist *sg, int nents, int dir) return 0; } +/* + * Free a DMA mapping. + */ +void gart_unmap_single(struct device *dev, dma_addr_t dma_addr, + size_t size, int direction) +{ + unsigned long iommu_page; + int npages; + int i; + + if (dma_addr < iommu_bus_base + EMERGENCY_PAGES*PAGE_SIZE || + dma_addr >= iommu_bus_base + iommu_size) + return; + iommu_page = (dma_addr - iommu_bus_base)>>PAGE_SHIFT; + npages = to_pages(dma_addr, size); + for (i = 0; i < npages; i++) { + iommu_gatt_base[iommu_page + i] = gart_unmapped_entry; + CLEAR_LEAK(iommu_page + i); + } + free_iommu(iommu_page, npages); +} + static int no_agp; static __init unsigned long check_iommu_size(unsigned long aper, u64 aper_size) @@ -497,13 +532,10 @@ static __init int init_k8_gatt(struct agp_kern_info *info) void *gatt; unsigned aper_base, new_aper_base; unsigned aper_size, gatt_size, new_aper_size; - int i; - + printk(KERN_INFO "PCI-DMA: Disabling AGP.\n"); aper_size = aper_base = info->aper_size = 0; - dev = NULL; - for (i = 0; i < num_k8_northbridges; i++) { - dev = k8_northbridges[i]; + for_all_nb(dev) { new_aper_base = read_aperture(dev, &new_aper_size); if (!new_aper_base) goto nommu; @@ -526,12 +558,11 @@ static __init int init_k8_gatt(struct agp_kern_info *info) panic("Cannot allocate GATT table"); memset(gatt, 0, gatt_size); agp_gatt_table = gatt; - - for (i = 0; i < num_k8_northbridges; i++) { + + for_all_nb(dev) { u32 ctl; u32 gatt_reg; - dev = k8_northbridges[i]; gatt_reg = __pa(gatt) >> 12; gatt_reg <<= 4; pci_write_config_dword(dev, 0x98, gatt_reg); @@ -542,7 +573,7 @@ static __init int init_k8_gatt(struct agp_kern_info *info) pci_write_config_dword(dev, 0x90, ctl); } - flush_gart(); + flush_gart(NULL); printk("PCI-DMA: aperture base @ %x size %u KB\n",aper_base, aper_size>>10); return 0; @@ -571,19 +602,15 @@ static struct dma_mapping_ops gart_dma_ops = { .unmap_sg = gart_unmap_sg, }; -void __init gart_iommu_init(void) +static int __init pci_iommu_init(void) { struct agp_kern_info info; unsigned long aper_size; unsigned long iommu_start; + struct pci_dev *dev; unsigned long scratch; long i; - if (cache_k8_northbridges() < 0 || num_k8_northbridges == 0) { - printk(KERN_INFO "PCI-GART: No AMD northbridge found.\n"); - return; - } - #ifndef CONFIG_AGP_AMD64 no_agp = 1; #else @@ -595,11 +622,7 @@ void __init gart_iommu_init(void) #endif if (swiotlb) - return; - - /* Did we detect a different HW IOMMU? */ - if (iommu_detected && !iommu_aperture) - return; + return -1; if (no_iommu || (!force_iommu && end_pfn <= MAX_DMA32_PFN) || @@ -611,7 +634,15 @@ void __init gart_iommu_init(void) "but IOMMU not available.\n" KERN_ERR "WARNING 32bit PCI may malfunction.\n"); } - return; + return -1; + } + + i = 0; + for_all_nb(dev) + i++; + if (i > MAX_NB) { + printk(KERN_ERR "PCI-GART: Too many northbridges (%ld). Disabled\n", i); + return -1; } printk(KERN_INFO "PCI-DMA: using GART IOMMU.\n"); @@ -676,10 +707,26 @@ void __init gart_iommu_init(void) for (i = EMERGENCY_PAGES; i < iommu_pages; i++) iommu_gatt_base[i] = gart_unmapped_entry; - flush_gart(); + for_all_nb(dev) { + u32 flag; + int cpu = PCI_SLOT(dev->devfn) - 24; + if (cpu >= MAX_NB) + continue; + northbridges[cpu] = dev; + pci_read_config_dword(dev, 0x9c, &flag); /* cache flush word */ + northbridge_flush_word[cpu] = flag; + } + + flush_gart(NULL); + dma_ops = &gart_dma_ops; + + return 0; } +/* Must execute after PCI subsystem */ +fs_initcall(pci_iommu_init); + void gart_parse_options(char *p) { int arg; diff --git a/trunk/arch/x86_64/kernel/pci-nommu.c b/trunk/arch/x86_64/kernel/pci-nommu.c index c4c3cc36ac5b..1f6ecc62061d 100644 --- a/trunk/arch/x86_64/kernel/pci-nommu.c +++ b/trunk/arch/x86_64/kernel/pci-nommu.c @@ -4,8 +4,6 @@ #include #include #include -#include - #include #include #include @@ -14,11 +12,10 @@ static int check_addr(char *name, struct device *hwdev, dma_addr_t bus, size_t size) { if (hwdev && bus + size > *hwdev->dma_mask) { - if (*hwdev->dma_mask >= DMA_32BIT_MASK) + if (*hwdev->dma_mask >= 0xffffffffULL) printk(KERN_ERR - "nommu_%s: overflow %Lx+%zu of device mask %Lx\n", - name, (long long)bus, size, - (long long)*hwdev->dma_mask); + "nommu_%s: overflow %Lx+%lu of device mask %Lx\n", + name, (long long)bus, size, (long long)*hwdev->dma_mask); return 0; } return 1; diff --git a/trunk/arch/x86_64/kernel/pci-swiotlb.c b/trunk/arch/x86_64/kernel/pci-swiotlb.c index ebdb77fe2057..990ed67896f2 100644 --- a/trunk/arch/x86_64/kernel/pci-swiotlb.c +++ b/trunk/arch/x86_64/kernel/pci-swiotlb.c @@ -31,7 +31,7 @@ struct dma_mapping_ops swiotlb_dma_ops = { void pci_swiotlb_init(void) { /* don't initialize swiotlb if iommu=off (no_iommu=1) */ - if (!iommu_detected && !no_iommu && + if (!iommu_aperture && !no_iommu && (end_pfn > MAX_DMA32_PFN || force_iommu)) swiotlb = 1; if (swiotlb) { diff --git a/trunk/arch/x86_64/kernel/pmtimer.c b/trunk/arch/x86_64/kernel/pmtimer.c index 7554458dc9cb..bf421ed26808 100644 --- a/trunk/arch/x86_64/kernel/pmtimer.c +++ b/trunk/arch/x86_64/kernel/pmtimer.c @@ -27,7 +27,7 @@ /* The I/O port the PMTMR resides at. * The location is detected during setup_arch(), * in arch/i386/kernel/acpi/boot.c */ -u32 pmtmr_ioport __read_mostly; +u32 pmtmr_ioport; /* value of the Power timer at last timer interrupt */ static u32 offset_delay; diff --git a/trunk/arch/x86_64/kernel/process.c b/trunk/arch/x86_64/kernel/process.c index ca56e19b8b6e..fb903e65e079 100644 --- a/trunk/arch/x86_64/kernel/process.c +++ b/trunk/arch/x86_64/kernel/process.c @@ -10,6 +10,7 @@ * Andi Kleen. * * CPU hotplug support - ashok.raj@intel.com + * $Id: process.c,v 1.38 2002/01/15 10:08:03 ak Exp $ */ /* @@ -63,7 +64,6 @@ EXPORT_SYMBOL(boot_option_idle_override); * Powermanagement idle function, if any.. */ void (*pm_idle)(void); -EXPORT_SYMBOL(pm_idle); static DEFINE_PER_CPU(unsigned int, cpu_idle_state); static ATOMIC_NOTIFIER_HEAD(idle_notifier); @@ -111,7 +111,7 @@ static void default_idle(void) { local_irq_enable(); - current_thread_info()->status &= ~TS_POLLING; + clear_thread_flag(TIF_POLLING_NRFLAG); smp_mb__after_clear_bit(); while (!need_resched()) { local_irq_disable(); @@ -120,7 +120,7 @@ static void default_idle(void) else local_irq_enable(); } - current_thread_info()->status |= TS_POLLING; + set_thread_flag(TIF_POLLING_NRFLAG); } /* @@ -203,7 +203,8 @@ static inline void play_dead(void) */ void cpu_idle (void) { - current_thread_info()->status |= TS_POLLING; + set_thread_flag(TIF_POLLING_NRFLAG); + /* endless idle loop with no priority at all */ while (1) { while (!need_resched()) { @@ -334,7 +335,7 @@ void show_regs(struct pt_regs *regs) { printk("CPU %d:", smp_processor_id()); __show_regs(regs); - show_trace(NULL, regs, (void *)(regs + 1)); + show_trace(®s->rsp); } /* @@ -364,11 +365,8 @@ void flush_thread(void) struct task_struct *tsk = current; struct thread_info *t = current_thread_info(); - if (t->flags & _TIF_ABI_PENDING) { + if (t->flags & _TIF_ABI_PENDING) t->flags ^= (_TIF_ABI_PENDING | _TIF_IA32); - if (t->flags & _TIF_IA32) - current_thread_info()->status |= TS_COMPAT; - } tsk->thread.debugreg0 = 0; tsk->thread.debugreg1 = 0; diff --git a/trunk/arch/x86_64/kernel/reboot.c b/trunk/arch/x86_64/kernel/reboot.c index 2d6769847456..57117b8beb2b 100644 --- a/trunk/arch/x86_64/kernel/reboot.c +++ b/trunk/arch/x86_64/kernel/reboot.c @@ -20,7 +20,6 @@ * Power off function, if any */ void (*pm_power_off)(void); -EXPORT_SYMBOL(pm_power_off); static long no_idt[3]; static enum { diff --git a/trunk/arch/x86_64/kernel/setup.c b/trunk/arch/x86_64/kernel/setup.c index 1129918ede82..fb850b52b4da 100644 --- a/trunk/arch/x86_64/kernel/setup.c +++ b/trunk/arch/x86_64/kernel/setup.c @@ -5,6 +5,8 @@ * * Nov 2001 Dave Jones * Forked from i386 setup code. + * + * $Id$ */ /* @@ -63,7 +65,9 @@ #include #include #include +#include #include +#include #include /* @@ -71,7 +75,6 @@ */ struct cpuinfo_x86 boot_cpu_data __read_mostly; -EXPORT_SYMBOL(boot_cpu_data); unsigned long mmu_cr4_features; @@ -100,14 +103,12 @@ char dmi_alloc_data[DMI_MAX_DATA]; * Setup options */ struct screen_info screen_info; -EXPORT_SYMBOL(screen_info); struct sys_desc_table_struct { unsigned short length; unsigned char table[0]; }; struct edid_info edid_info; -EXPORT_SYMBOL_GPL(edid_info); struct e820map e820; extern int root_mountflags; @@ -472,6 +473,80 @@ contig_initmem_init(unsigned long start_pfn, unsigned long end_pfn) } #endif +/* Use inline assembly to define this because the nops are defined + as inline assembly strings in the include files and we cannot + get them easily into strings. */ +asm("\t.data\nk8nops: " + K8_NOP1 K8_NOP2 K8_NOP3 K8_NOP4 K8_NOP5 K8_NOP6 + K8_NOP7 K8_NOP8); + +extern unsigned char k8nops[]; +static unsigned char *k8_nops[ASM_NOP_MAX+1] = { + NULL, + k8nops, + k8nops + 1, + k8nops + 1 + 2, + k8nops + 1 + 2 + 3, + k8nops + 1 + 2 + 3 + 4, + k8nops + 1 + 2 + 3 + 4 + 5, + k8nops + 1 + 2 + 3 + 4 + 5 + 6, + k8nops + 1 + 2 + 3 + 4 + 5 + 6 + 7, +}; + +extern char __vsyscall_0; + +/* Replace instructions with better alternatives for this CPU type. + + This runs before SMP is initialized to avoid SMP problems with + self modifying code. This implies that assymetric systems where + APs have less capabilities than the boot processor are not handled. + In this case boot with "noreplacement". */ +void apply_alternatives(void *start, void *end) +{ + struct alt_instr *a; + int diff, i, k; + for (a = start; (void *)a < end; a++) { + u8 *instr; + + if (!boot_cpu_has(a->cpuid)) + continue; + + BUG_ON(a->replacementlen > a->instrlen); + instr = a->instr; + /* vsyscall code is not mapped yet. resolve it manually. */ + if (instr >= (u8 *)VSYSCALL_START && instr < (u8*)VSYSCALL_END) + instr = __va(instr - (u8*)VSYSCALL_START + (u8*)__pa_symbol(&__vsyscall_0)); + __inline_memcpy(instr, a->replacement, a->replacementlen); + diff = a->instrlen - a->replacementlen; + + /* Pad the rest with nops */ + for (i = a->replacementlen; diff > 0; diff -= k, i += k) { + k = diff; + if (k > ASM_NOP_MAX) + k = ASM_NOP_MAX; + __inline_memcpy(instr + i, k8_nops[k], k); + } + } +} + +static int no_replacement __initdata = 0; + +void __init alternative_instructions(void) +{ + extern struct alt_instr __alt_instructions[], __alt_instructions_end[]; + if (no_replacement) + return; + apply_alternatives(__alt_instructions, __alt_instructions_end); +} + +static int __init noreplacement_setup(char *s) +{ + no_replacement = 1; + return 1; +} + +__setup("noreplacement", noreplacement_setup); + #if defined(CONFIG_EDD) || defined(CONFIG_EDD_MODULE) struct edd edd; #ifdef CONFIG_EDD_MODULE @@ -704,6 +779,10 @@ void __init setup_arch(char **cmdline_p) e820_setup_gap(); +#ifdef CONFIG_GART_IOMMU + iommu_hole_init(); +#endif + #ifdef CONFIG_VT #if defined(CONFIG_VGA_CONSOLE) conswitchp = &vga_con; @@ -788,32 +867,24 @@ static int nearby_node(int apicid) static void __init amd_detect_cmp(struct cpuinfo_x86 *c) { #ifdef CONFIG_SMP + int cpu = smp_processor_id(); unsigned bits; #ifdef CONFIG_NUMA - int cpu = smp_processor_id(); int node = 0; unsigned apicid = hard_smp_processor_id(); #endif - unsigned ecx = cpuid_ecx(0x80000008); - - c->x86_max_cores = (ecx & 0xff) + 1; - /* CPU telling us the core id bits shift? */ - bits = (ecx >> 12) & 0xF; - - /* Otherwise recompute */ - if (bits == 0) { - while ((1 << bits) < c->x86_max_cores) - bits++; - } + bits = 0; + while ((1 << bits) < c->x86_max_cores) + bits++; /* Low order bits define the core id (index of core in socket) */ - c->cpu_core_id = c->phys_proc_id & ((1 << bits)-1); + cpu_core_id[cpu] = phys_proc_id[cpu] & ((1 << bits)-1); /* Convert the APIC ID into the socket ID */ - c->phys_proc_id = phys_pkg_id(bits); + phys_proc_id[cpu] = phys_pkg_id(bits); #ifdef CONFIG_NUMA - node = c->phys_proc_id; + node = phys_proc_id[cpu]; if (apicid_to_node[apicid] != NUMA_NO_NODE) node = apicid_to_node[apicid]; if (!node_online(node)) { @@ -826,7 +897,7 @@ static void __init amd_detect_cmp(struct cpuinfo_x86 *c) but in the same order as the HT nodeids. If that doesn't result in a usable node fall back to the path for the previous case. */ - int ht_nodeid = apicid - (cpu_data[0].phys_proc_id << bits); + int ht_nodeid = apicid - (phys_proc_id[0] << bits); if (ht_nodeid >= 0 && apicid_to_node[ht_nodeid] != NUMA_NO_NODE) node = apicid_to_node[ht_nodeid]; @@ -836,13 +907,15 @@ static void __init amd_detect_cmp(struct cpuinfo_x86 *c) } numa_set_node(cpu, node); - printk(KERN_INFO "CPU %d/%x -> Node %d\n", cpu, apicid, node); + printk(KERN_INFO "CPU %d/%x(%d) -> Node %d -> Core %d\n", + cpu, apicid, c->x86_max_cores, node, cpu_core_id[cpu]); #endif #endif } -static void __init init_amd(struct cpuinfo_x86 *c) +static int __init init_amd(struct cpuinfo_x86 *c) { + int r; unsigned level; #ifdef CONFIG_SMP @@ -875,8 +948,8 @@ static void __init init_amd(struct cpuinfo_x86 *c) if (c->x86 >= 6) set_bit(X86_FEATURE_FXSAVE_LEAK, &c->x86_capability); - level = get_model_name(c); - if (!level) { + r = get_model_name(c); + if (!r) { switch (c->x86) { case 15: /* Should distinguish Models here, but this is only @@ -891,12 +964,13 @@ static void __init init_amd(struct cpuinfo_x86 *c) if (c->x86_power & (1<<8)) set_bit(X86_FEATURE_CONSTANT_TSC, &c->x86_capability); - /* Multi core CPU? */ - if (c->extended_cpuid_level >= 0x80000008) + if (c->extended_cpuid_level >= 0x80000008) { + c->x86_max_cores = (cpuid_ecx(0x80000008) & 0xff) + 1; + amd_detect_cmp(c); + } - /* Fix cpuid4 emulation for more */ - num_cache_leaves = 3; + return r; } static void __cpuinit detect_ht(struct cpuinfo_x86 *c) @@ -904,14 +978,13 @@ static void __cpuinit detect_ht(struct cpuinfo_x86 *c) #ifdef CONFIG_SMP u32 eax, ebx, ecx, edx; int index_msb, core_bits; + int cpu = smp_processor_id(); cpuid(1, &eax, &ebx, &ecx, &edx); - if (!cpu_has(c, X86_FEATURE_HT)) + if (!cpu_has(c, X86_FEATURE_HT) || cpu_has(c, X86_FEATURE_CMP_LEGACY)) return; - if (cpu_has(c, X86_FEATURE_CMP_LEGACY)) - goto out; smp_num_siblings = (ebx & 0xff0000) >> 16; @@ -926,7 +999,10 @@ static void __cpuinit detect_ht(struct cpuinfo_x86 *c) } index_msb = get_count_order(smp_num_siblings); - c->phys_proc_id = phys_pkg_id(index_msb); + phys_proc_id[cpu] = phys_pkg_id(index_msb); + + printk(KERN_INFO "CPU: Physical Processor ID: %d\n", + phys_proc_id[cpu]); smp_num_siblings = smp_num_siblings / c->x86_max_cores; @@ -934,15 +1010,13 @@ static void __cpuinit detect_ht(struct cpuinfo_x86 *c) core_bits = get_count_order(c->x86_max_cores); - c->cpu_core_id = phys_pkg_id(index_msb) & + cpu_core_id[cpu] = phys_pkg_id(index_msb) & ((1 << core_bits) - 1); - } -out: - if ((c->x86_max_cores * smp_num_siblings) > 1) { - printk(KERN_INFO "CPU: Physical Processor ID: %d\n", c->phys_proc_id); - printk(KERN_INFO "CPU: Processor Core ID: %d\n", c->cpu_core_id); - } + if (c->x86_max_cores > 1) + printk(KERN_INFO "CPU: Processor Core ID: %d\n", + cpu_core_id[cpu]); + } #endif } @@ -951,12 +1025,15 @@ static void __cpuinit detect_ht(struct cpuinfo_x86 *c) */ static int __cpuinit intel_num_cpu_cores(struct cpuinfo_x86 *c) { - unsigned int eax, t; + unsigned int eax; if (c->cpuid_level < 4) return 1; - cpuid_count(4, 0, &eax, &t, &t, &t); + __asm__("cpuid" + : "=a" (eax) + : "0" (4), "c" (0) + : "bx", "dx"); if (eax & 0x1f) return ((eax >> 26) + 1); @@ -969,17 +1046,16 @@ static void srat_detect_node(void) #ifdef CONFIG_NUMA unsigned node; int cpu = smp_processor_id(); - int apicid = hard_smp_processor_id(); /* Don't do the funky fallback heuristics the AMD version employs for now. */ - node = apicid_to_node[apicid]; + node = apicid_to_node[hard_smp_processor_id()]; if (node == NUMA_NO_NODE) node = first_node(node_online_map); numa_set_node(cpu, node); if (acpi_numa > 0) - printk(KERN_INFO "CPU %d/%x -> Node %d\n", cpu, apicid, node); + printk(KERN_INFO "CPU %d -> Node %d\n", cpu, node); #endif } @@ -989,13 +1065,6 @@ static void __cpuinit init_intel(struct cpuinfo_x86 *c) unsigned n; init_intel_cacheinfo(c); - if (c->cpuid_level > 9 ) { - unsigned eax = cpuid_eax(10); - /* Check for version and the number of counters */ - if ((eax & 0xff) && (((eax>>8) & 0xff) > 1)) - set_bit(X86_FEATURE_ARCH_PERFMON, &c->x86_capability); - } - n = c->extended_cpuid_level; if (n >= 0x80000008) { unsigned eax = cpuid_eax(0x80000008); @@ -1087,7 +1156,7 @@ void __cpuinit early_identify_cpu(struct cpuinfo_x86 *c) } #ifdef CONFIG_SMP - c->phys_proc_id = (cpuid_ebx(1) >> 24) & 0xff; + phys_proc_id[smp_processor_id()] = (cpuid_ebx(1) >> 24) & 0xff; #endif } @@ -1214,7 +1283,7 @@ static int show_cpuinfo(struct seq_file *m, void *v) NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, "syscall", NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, "nx", NULL, "mmxext", NULL, - NULL, "fxsr_opt", NULL, "rdtscp", NULL, "lm", "3dnowext", "3dnow", + NULL, "fxsr_opt", "rdtscp", NULL, NULL, "lm", "3dnowext", "3dnow", /* Transmeta-defined */ "recovery", "longrun", NULL, "lrti", NULL, NULL, NULL, NULL, @@ -1225,7 +1294,7 @@ static int show_cpuinfo(struct seq_file *m, void *v) /* Other (Linux-defined) */ "cxmmx", NULL, "cyrix_arr", "centaur_mcr", NULL, "constant_tsc", NULL, NULL, - "up", NULL, NULL, NULL, NULL, NULL, NULL, NULL, + NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, @@ -1295,9 +1364,9 @@ static int show_cpuinfo(struct seq_file *m, void *v) #ifdef CONFIG_SMP if (smp_num_siblings * c->x86_max_cores > 1) { int cpu = c - cpu_data; - seq_printf(m, "physical id\t: %d\n", c->phys_proc_id); + seq_printf(m, "physical id\t: %d\n", phys_proc_id[cpu]); seq_printf(m, "siblings\t: %d\n", cpus_weight(cpu_core_map[cpu])); - seq_printf(m, "core id\t\t: %d\n", c->cpu_core_id); + seq_printf(m, "core id\t\t: %d\n", cpu_core_id[cpu]); seq_printf(m, "cpu cores\t: %d\n", c->booted_cores); } #endif @@ -1371,7 +1440,7 @@ struct seq_operations cpuinfo_op = { .show = show_cpuinfo, }; -#if defined(CONFIG_INPUT_PCSPKR) || defined(CONFIG_INPUT_PCSPKR_MODULE) +#ifdef CONFIG_INPUT_PCSPKR #include static __init int add_pcspkr(void) { diff --git a/trunk/arch/x86_64/kernel/setup64.c b/trunk/arch/x86_64/kernel/setup64.c index f5934cb4a2b6..8a691fa6d393 100644 --- a/trunk/arch/x86_64/kernel/setup64.c +++ b/trunk/arch/x86_64/kernel/setup64.c @@ -3,6 +3,7 @@ * Copyright (C) 1995 Linus Torvalds * Copyright 2001, 2002, 2003 SuSE Labs / Andi Kleen. * See setup.c for older changelog. + * $Id: setup64.c,v 1.12 2002/03/21 10:09:17 ak Exp $ */ #include #include @@ -30,7 +31,6 @@ char x86_boot_params[BOOT_PARAM_SIZE] __initdata = {0,}; cpumask_t cpu_initialized __cpuinitdata = CPU_MASK_NONE; struct x8664_pda *_cpu_pda[NR_CPUS] __read_mostly; -EXPORT_SYMBOL(_cpu_pda); struct x8664_pda boot_cpu_pda[NR_CPUS] __cacheline_aligned; struct desc_ptr idt_descr = { 256 * 16 - 1, (unsigned long) idt_table }; @@ -38,7 +38,6 @@ struct desc_ptr idt_descr = { 256 * 16 - 1, (unsigned long) idt_table }; char boot_cpu_stack[IRQSTACKSIZE] __attribute__((section(".bss.page_aligned"))); unsigned long __supported_pte_mask __read_mostly = ~0UL; -EXPORT_SYMBOL(__supported_pte_mask); static int do_not_nx __cpuinitdata = 0; /* noexec=on|off diff --git a/trunk/arch/x86_64/kernel/signal.c b/trunk/arch/x86_64/kernel/signal.c index 28161170fb0a..e5f5ce7909a3 100644 --- a/trunk/arch/x86_64/kernel/signal.c +++ b/trunk/arch/x86_64/kernel/signal.c @@ -7,6 +7,8 @@ * 1997-11-28 Modified for POSIX.1b signals by Richard Henderson * 2000-06-20 Pentium III FXSR, SSE support by Gareth Hughes * 2000-2002 x86-64 support by Andi Kleen + * + * $Id: signal.c,v 1.18 2001/10/17 22:30:37 ak Exp $ */ #include @@ -237,6 +239,7 @@ get_stack(struct k_sigaction *ka, struct pt_regs *regs, unsigned long size) rsp = regs->rsp - 128; /* This is the X/Open sanctioned signal stack switching. */ + /* RED-PEN: redzone on that stack? */ if (ka->sa.sa_flags & SA_ONSTACK) { if (sas_ss_flags(rsp) == 0) rsp = current->sas_ss_sp + current->sas_ss_size; diff --git a/trunk/arch/x86_64/kernel/smp.c b/trunk/arch/x86_64/kernel/smp.c index 8188bae9c6d5..4a6628b14d99 100644 --- a/trunk/arch/x86_64/kernel/smp.c +++ b/trunk/arch/x86_64/kernel/smp.c @@ -224,7 +224,6 @@ void flush_tlb_current_task(void) flush_tlb_others(cpu_mask, mm, FLUSH_ALL); preempt_enable(); } -EXPORT_SYMBOL(flush_tlb_current_task); void flush_tlb_mm (struct mm_struct * mm) { @@ -245,7 +244,6 @@ void flush_tlb_mm (struct mm_struct * mm) preempt_enable(); } -EXPORT_SYMBOL(flush_tlb_mm); void flush_tlb_page(struct vm_area_struct * vma, unsigned long va) { @@ -268,7 +266,6 @@ void flush_tlb_page(struct vm_area_struct * vma, unsigned long va) preempt_enable(); } -EXPORT_SYMBOL(flush_tlb_page); static void do_flush_tlb_all(void* info) { @@ -446,7 +443,6 @@ int smp_call_function (void (*func) (void *info), void *info, int nonatomic, spin_unlock(&call_lock); return 0; } -EXPORT_SYMBOL(smp_call_function); void smp_stop_cpu(void) { @@ -464,7 +460,7 @@ static void smp_really_stop_cpu(void *dummy) { smp_stop_cpu(); for (;;) - halt(); + asm("hlt"); } void smp_send_stop(void) @@ -524,13 +520,13 @@ asmlinkage void smp_call_function_interrupt(void) int safe_smp_processor_id(void) { - unsigned apicid, i; + int apicid, i; if (disable_apic) return 0; apicid = hard_smp_processor_id(); - if (apicid < NR_CPUS && x86_cpu_to_apicid[apicid] == apicid) + if (x86_cpu_to_apicid[apicid] == apicid) return apicid; for (i = 0; i < NR_CPUS; ++i) { diff --git a/trunk/arch/x86_64/kernel/smpboot.c b/trunk/arch/x86_64/kernel/smpboot.c index 4e9755179ecf..71a7222cf9ce 100644 --- a/trunk/arch/x86_64/kernel/smpboot.c +++ b/trunk/arch/x86_64/kernel/smpboot.c @@ -63,11 +63,13 @@ /* Number of siblings per CPU package */ int smp_num_siblings = 1; -EXPORT_SYMBOL(smp_num_siblings); +/* Package ID of each logical CPU */ +u8 phys_proc_id[NR_CPUS] __read_mostly = { [0 ... NR_CPUS-1] = BAD_APICID }; +/* core ID of each logical CPU */ +u8 cpu_core_id[NR_CPUS] __read_mostly = { [0 ... NR_CPUS-1] = BAD_APICID }; /* Last level cache ID of each logical CPU */ u8 cpu_llc_id[NR_CPUS] __cpuinitdata = {[0 ... NR_CPUS-1] = BAD_APICID}; -EXPORT_SYMBOL(cpu_llc_id); /* Bitmask of currently online CPUs */ cpumask_t cpu_online_map __read_mostly; @@ -80,21 +82,18 @@ EXPORT_SYMBOL(cpu_online_map); */ cpumask_t cpu_callin_map; cpumask_t cpu_callout_map; -EXPORT_SYMBOL(cpu_callout_map); cpumask_t cpu_possible_map; EXPORT_SYMBOL(cpu_possible_map); /* Per CPU bogomips and other parameters */ struct cpuinfo_x86 cpu_data[NR_CPUS] __cacheline_aligned; -EXPORT_SYMBOL(cpu_data); /* Set when the idlers are all forked */ int smp_threads_ready; /* representing HT siblings of each logical CPU */ cpumask_t cpu_sibling_map[NR_CPUS] __read_mostly; -EXPORT_SYMBOL(cpu_sibling_map); /* representing HT and core siblings of each logical CPU */ cpumask_t cpu_core_map[NR_CPUS] __read_mostly; @@ -473,8 +472,8 @@ static inline void set_cpu_sibling_map(int cpu) if (smp_num_siblings > 1) { for_each_cpu_mask(i, cpu_sibling_setup_map) { - if (c[cpu].phys_proc_id == c[i].phys_proc_id && - c[cpu].cpu_core_id == c[i].cpu_core_id) { + if (phys_proc_id[cpu] == phys_proc_id[i] && + cpu_core_id[cpu] == cpu_core_id[i]) { cpu_set(i, cpu_sibling_map[cpu]); cpu_set(cpu, cpu_sibling_map[i]); cpu_set(i, cpu_core_map[cpu]); @@ -501,7 +500,7 @@ static inline void set_cpu_sibling_map(int cpu) cpu_set(i, c[cpu].llc_shared_map); cpu_set(cpu, c[i].llc_shared_map); } - if (c[cpu].phys_proc_id == c[i].phys_proc_id) { + if (phys_proc_id[cpu] == phys_proc_id[i]) { cpu_set(i, cpu_core_map[cpu]); cpu_set(cpu, cpu_core_map[i]); /* @@ -798,8 +797,6 @@ static int __cpuinit do_boot_cpu(int cpu, int apicid) } - alternatives_smp_switch(1); - c_idle.idle = get_idle_for_cpu(cpu); if (c_idle.idle) { @@ -1202,8 +1199,8 @@ static void remove_siblinginfo(int cpu) cpu_clear(cpu, cpu_sibling_map[sibling]); cpus_clear(cpu_sibling_map[cpu]); cpus_clear(cpu_core_map[cpu]); - c[cpu].phys_proc_id = 0; - c[cpu].cpu_core_id = 0; + phys_proc_id[cpu] = BAD_APICID; + cpu_core_id[cpu] = BAD_APICID; cpu_clear(cpu, cpu_sibling_setup_map); } @@ -1262,8 +1259,6 @@ void __cpu_die(unsigned int cpu) /* They ack this in play_dead by setting CPU_DEAD */ if (per_cpu(cpu_state, cpu) == CPU_DEAD) { printk ("CPU %d is now offline\n", cpu); - if (1 == num_online_cpus()) - alternatives_smp_switch(0); return; } msleep(100); diff --git a/trunk/arch/x86_64/kernel/tce.c b/trunk/arch/x86_64/kernel/tce.c deleted file mode 100644 index 8d4c67f61b8e..000000000000 --- a/trunk/arch/x86_64/kernel/tce.c +++ /dev/null @@ -1,202 +0,0 @@ -/* - * Derived from arch/powerpc/platforms/pseries/iommu.c - * - * Copyright (C) 2006 Jon Mason , IBM Corporation - * Copyright (C) 2006 Muli Ben-Yehuda , IBM Corporation - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License as published by - * the Free Software Foundation; either version 2 of the License, or - * (at your option) any later version. - * - * This program is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU General Public License for more details. - * - * You should have received a copy of the GNU General Public License - * along with this program; if not, write to the Free Software - * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA - */ - -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -/* flush a tce at 'tceaddr' to main memory */ -static inline void flush_tce(void* tceaddr) -{ - /* a single tce can't cross a cache line */ - if (cpu_has_clflush) - asm volatile("clflush (%0)" :: "r" (tceaddr)); - else - asm volatile("wbinvd":::"memory"); -} - -void tce_build(struct iommu_table *tbl, unsigned long index, - unsigned int npages, unsigned long uaddr, int direction) -{ - u64* tp; - u64 t; - u64 rpn; - - t = (1 << TCE_READ_SHIFT); - if (direction != DMA_TO_DEVICE) - t |= (1 << TCE_WRITE_SHIFT); - - tp = ((u64*)tbl->it_base) + index; - - while (npages--) { - rpn = (virt_to_bus((void*)uaddr)) >> PAGE_SHIFT; - t &= ~TCE_RPN_MASK; - t |= (rpn << TCE_RPN_SHIFT); - - *tp = cpu_to_be64(t); - flush_tce(tp); - - uaddr += PAGE_SIZE; - tp++; - } -} - -void tce_free(struct iommu_table *tbl, long index, unsigned int npages) -{ - u64* tp; - - tp = ((u64*)tbl->it_base) + index; - - while (npages--) { - *tp = cpu_to_be64(0); - flush_tce(tp); - tp++; - } -} - -static inline unsigned int table_size_to_number_of_entries(unsigned char size) -{ - /* - * size is the order of the table, 0-7 - * smallest table is 8K entries, so shift result by 13 to - * multiply by 8K - */ - return (1 << size) << 13; -} - -static int tce_table_setparms(struct pci_dev *dev, struct iommu_table *tbl) -{ - unsigned int bitmapsz; - unsigned int tce_table_index; - unsigned long bmppages; - int ret; - - tbl->it_busno = dev->bus->number; - - /* set the tce table size - measured in entries */ - tbl->it_size = table_size_to_number_of_entries(specified_table_size); - - tce_table_index = bus_to_phb(tbl->it_busno); - tbl->it_base = (unsigned long)tce_table_kva[tce_table_index]; - if (!tbl->it_base) { - printk(KERN_ERR "Calgary: iommu_table_setparms: " - "no table allocated?!\n"); - ret = -ENOMEM; - goto done; - } - - /* - * number of bytes needed for the bitmap size in number of - * entries; we need one bit per entry - */ - bitmapsz = tbl->it_size / BITS_PER_BYTE; - bmppages = __get_free_pages(GFP_KERNEL, get_order(bitmapsz)); - if (!bmppages) { - printk(KERN_ERR "Calgary: cannot allocate bitmap\n"); - ret = -ENOMEM; - goto done; - } - - tbl->it_map = (unsigned long*)bmppages; - - memset(tbl->it_map, 0, bitmapsz); - - tbl->it_hint = 0; - - spin_lock_init(&tbl->it_lock); - - return 0; - -done: - return ret; -} - -int build_tce_table(struct pci_dev *dev, void __iomem *bbar) -{ - struct iommu_table *tbl; - int ret; - - if (dev->sysdata) { - printk(KERN_ERR "Calgary: dev %p has sysdata %p\n", - dev, dev->sysdata); - BUG(); - } - - tbl = kzalloc(sizeof(struct iommu_table), GFP_KERNEL); - if (!tbl) { - printk(KERN_ERR "Calgary: error allocating iommu_table\n"); - ret = -ENOMEM; - goto done; - } - - ret = tce_table_setparms(dev, tbl); - if (ret) - goto free_tbl; - - tce_free(tbl, 0, tbl->it_size); - - tbl->bbar = bbar; - - /* - * NUMA is already using the bus's sysdata pointer, so we use - * the bus's pci_dev's sysdata instead. - */ - dev->sysdata = tbl; - - return 0; - -free_tbl: - kfree(tbl); -done: - return ret; -} - -void* alloc_tce_table(void) -{ - unsigned int size; - - size = table_size_to_number_of_entries(specified_table_size); - size *= TCE_ENTRY_SIZE; - - return __alloc_bootmem_low(size, size, 0); -} - -void free_tce_table(void *tbl) -{ - unsigned int size; - - if (!tbl) - return; - - size = table_size_to_number_of_entries(specified_table_size); - size *= TCE_ENTRY_SIZE; - - free_bootmem(__pa(tbl), size); -} diff --git a/trunk/arch/x86_64/kernel/time.c b/trunk/arch/x86_64/kernel/time.c index ebbee6f59ff5..7392570f975d 100644 --- a/trunk/arch/x86_64/kernel/time.c +++ b/trunk/arch/x86_64/kernel/time.c @@ -8,7 +8,7 @@ * Copyright (c) 1995 Markus Kuhn * Copyright (c) 1996 Ingo Molnar * Copyright (c) 1998 Andrea Arcangeli - * Copyright (c) 2002,2006 Vojtech Pavlik + * Copyright (c) 2002 Vojtech Pavlik * Copyright (c) 2003 Andi Kleen * RTC support code taken from arch/i386/kernel/timers/time_hpet.c */ @@ -51,21 +51,14 @@ extern int using_apic_timer; static char *time_init_gtod(void); DEFINE_SPINLOCK(rtc_lock); -EXPORT_SYMBOL(rtc_lock); DEFINE_SPINLOCK(i8253_lock); int nohpet __initdata = 0; static int notsc __initdata = 0; -#define USEC_PER_TICK (USEC_PER_SEC / HZ) -#define NSEC_PER_TICK (NSEC_PER_SEC / HZ) -#define FSEC_PER_TICK (FSEC_PER_SEC / HZ) - -#define NS_SCALE 10 /* 2^10, carefully chosen */ -#define US_SCALE 32 /* 2^32, arbitralrily chosen */ +#undef HPET_HACK_ENABLE_DANGEROUS unsigned int cpu_khz; /* TSC clocks / usec, not used here */ -EXPORT_SYMBOL(cpu_khz); static unsigned long hpet_period; /* fsecs / HPET clock */ unsigned long hpet_tick; /* HPET clocks / interrupt */ int hpet_use_timer; /* Use counter of hpet for time keeping, otherwise PIT */ @@ -97,7 +90,7 @@ static inline unsigned int do_gettimeoffset_tsc(void) t = get_cycles_sync(); if (t < vxtime.last_tsc) t = vxtime.last_tsc; /* hack */ - x = ((t - vxtime.last_tsc) * vxtime.tsc_quot) >> US_SCALE; + x = ((t - vxtime.last_tsc) * vxtime.tsc_quot) >> 32; return x; } @@ -105,7 +98,7 @@ static inline unsigned int do_gettimeoffset_hpet(void) { /* cap counter read to one tick to avoid inconsistencies */ unsigned long counter = hpet_readl(HPET_COUNTER) - vxtime.last; - return (min(counter,hpet_tick) * vxtime.quot) >> US_SCALE; + return (min(counter,hpet_tick) * vxtime.quot) >> 32; } unsigned int (*do_gettimeoffset)(void) = do_gettimeoffset_tsc; @@ -125,7 +118,7 @@ void do_gettimeofday(struct timeval *tv) seq = read_seqbegin(&xtime_lock); sec = xtime.tv_sec; - usec = xtime.tv_nsec / NSEC_PER_USEC; + usec = xtime.tv_nsec / 1000; /* i386 does some correction here to keep the clock monotonous even when ntpd is fixing drift. @@ -136,14 +129,14 @@ void do_gettimeofday(struct timeval *tv) in arch/x86_64/kernel/vsyscall.c and export all needed variables in vmlinux.lds. -AK */ - t = (jiffies - wall_jiffies) * USEC_PER_TICK + + t = (jiffies - wall_jiffies) * (1000000L / HZ) + do_gettimeoffset(); usec += t; } while (read_seqretry(&xtime_lock, seq)); - tv->tv_sec = sec + usec / USEC_PER_SEC; - tv->tv_usec = usec % USEC_PER_SEC; + tv->tv_sec = sec + usec / 1000000; + tv->tv_usec = usec % 1000000; } EXPORT_SYMBOL(do_gettimeofday); @@ -164,8 +157,8 @@ int do_settimeofday(struct timespec *tv) write_seqlock_irq(&xtime_lock); - nsec -= do_gettimeoffset() * NSEC_PER_USEC + - (jiffies - wall_jiffies) * NSEC_PER_TICK; + nsec -= do_gettimeoffset() * 1000 + + (jiffies - wall_jiffies) * (NSEC_PER_SEC/HZ); wtm_sec = wall_to_monotonic.tv_sec + (xtime.tv_sec - sec); wtm_nsec = wall_to_monotonic.tv_nsec + (xtime.tv_nsec - nsec); @@ -295,7 +288,7 @@ unsigned long long monotonic_clock(void) this_offset = hpet_readl(HPET_COUNTER); } while (read_seqretry(&xtime_lock, seq)); offset = (this_offset - last_offset); - offset *= NSEC_PER_TICK / hpet_tick; + offset *= (NSEC_PER_SEC/HZ) / hpet_tick; } else { do { seq = read_seqbegin(&xtime_lock); @@ -304,8 +297,7 @@ unsigned long long monotonic_clock(void) base = monotonic_base; } while (read_seqretry(&xtime_lock, seq)); this_offset = get_cycles_sync(); - /* FIXME: 1000 or 1000000? */ - offset = (this_offset - last_offset)*1000 / cpu_khz; + offset = (this_offset - last_offset)*1000 / cpu_khz; } return base + offset; } @@ -390,7 +382,7 @@ void main_timer_handler(struct pt_regs *regs) } monotonic_base += - (offset - vxtime.last) * NSEC_PER_TICK / hpet_tick; + (offset - vxtime.last)*(NSEC_PER_SEC/HZ) / hpet_tick; vxtime.last = offset; #ifdef CONFIG_X86_PM_TIMER @@ -399,25 +391,24 @@ void main_timer_handler(struct pt_regs *regs) #endif } else { offset = (((tsc - vxtime.last_tsc) * - vxtime.tsc_quot) >> US_SCALE) - USEC_PER_TICK; + vxtime.tsc_quot) >> 32) - (USEC_PER_SEC / HZ); if (offset < 0) offset = 0; - if (offset > USEC_PER_TICK) { - lost = offset / USEC_PER_TICK; - offset %= USEC_PER_TICK; + if (offset > (USEC_PER_SEC / HZ)) { + lost = offset / (USEC_PER_SEC / HZ); + offset %= (USEC_PER_SEC / HZ); } - /* FIXME: 1000 or 1000000? */ - monotonic_base += (tsc - vxtime.last_tsc) * 1000000 / cpu_khz; + monotonic_base += (tsc - vxtime.last_tsc)*1000000/cpu_khz ; vxtime.last_tsc = tsc - vxtime.quot * delay / vxtime.tsc_quot; if ((((tsc - vxtime.last_tsc) * - vxtime.tsc_quot) >> US_SCALE) < offset) + vxtime.tsc_quot) >> 32) < offset) vxtime.last_tsc = tsc - - (((long) offset << US_SCALE) / vxtime.tsc_quot) - 1; + (((long) offset << 32) / vxtime.tsc_quot) - 1; } if (lost > 0) { @@ -477,15 +468,16 @@ static irqreturn_t timer_interrupt(int irq, void *dev_id, struct pt_regs *regs) } static unsigned int cyc2ns_scale __read_mostly; +#define CYC2NS_SCALE_FACTOR 10 /* 2^10, carefully chosen */ static inline void set_cyc2ns_scale(unsigned long cpu_khz) { - cyc2ns_scale = (NSEC_PER_MSEC << NS_SCALE) / cpu_khz; + cyc2ns_scale = (1000000 << CYC2NS_SCALE_FACTOR)/cpu_khz; } static inline unsigned long long cycles_2_ns(unsigned long long cyc) { - return (cyc * cyc2ns_scale) >> NS_SCALE; + return (cyc * cyc2ns_scale) >> CYC2NS_SCALE_FACTOR; } unsigned long long sched_clock(void) @@ -498,7 +490,7 @@ unsigned long long sched_clock(void) Disadvantage is a small drift between CPUs in some configurations, but that should be tolerable. */ if (__vxtime.mode == VXTIME_HPET) - return (hpet_readl(HPET_COUNTER) * vxtime.quot) >> US_SCALE; + return (hpet_readl(HPET_COUNTER) * vxtime.quot) >> 32; #endif /* Could do CPU core sync here. Opteron can execute rdtsc speculatively, @@ -641,7 +633,7 @@ static int time_cpufreq_notifier(struct notifier_block *nb, unsigned long val, cpu_khz = cpufreq_scale(cpu_khz_ref, ref_freq, freq->new); if (!(freq->flags & CPUFREQ_CONST_LOOPS)) - vxtime.tsc_quot = (USEC_PER_MSEC << US_SCALE) / cpu_khz; + vxtime.tsc_quot = (1000L << 32) / cpu_khz; } set_cyc2ns_scale(cpu_khz_ref); @@ -797,8 +789,8 @@ static int hpet_timer_stop_set_go(unsigned long tick) if (hpet_use_timer) { hpet_writel(HPET_TN_ENABLE | HPET_TN_PERIODIC | HPET_TN_SETVAL | HPET_TN_32BIT, HPET_T0_CFG); - hpet_writel(hpet_tick, HPET_T0_CMP); /* next interrupt */ - hpet_writel(hpet_tick, HPET_T0_CMP); /* period */ + hpet_writel(hpet_tick, HPET_T0_CMP); + hpet_writel(hpet_tick, HPET_T0_CMP); /* AK: why twice? */ cfg |= HPET_CFG_LEGACY; } /* @@ -833,7 +825,8 @@ static int hpet_init(void) if (hpet_period < 100000 || hpet_period > 100000000) return -1; - hpet_tick = (FSEC_PER_TICK + hpet_period / 2) / hpet_period; + hpet_tick = (1000000000L * (USEC_PER_SEC / HZ) + hpet_period / 2) / + hpet_period; hpet_use_timer = (id & HPET_ID_LEGSUP); @@ -897,6 +890,18 @@ void __init time_init(void) char *timename; char *gtod; +#ifdef HPET_HACK_ENABLE_DANGEROUS + if (!vxtime.hpet_address) { + printk(KERN_WARNING "time.c: WARNING: Enabling HPET base " + "manually!\n"); + outl(0x800038a0, 0xcf8); + outl(0xff000001, 0xcfc); + outl(0x800038a0, 0xcf8); + vxtime.hpet_address = inl(0xcfc) & 0xfffffffe; + printk(KERN_WARNING "time.c: WARNING: Enabled HPET " + "at %#lx.\n", vxtime.hpet_address); + } +#endif if (nohpet) vxtime.hpet_address = 0; @@ -907,7 +912,7 @@ void __init time_init(void) -xtime.tv_sec, -xtime.tv_nsec); if (!hpet_init()) - vxtime_hz = (FSEC_PER_SEC + hpet_period / 2) / hpet_period; + vxtime_hz = (1000000000000000L + hpet_period / 2) / hpet_period; else vxtime.hpet_address = 0; @@ -936,8 +941,8 @@ void __init time_init(void) vxtime_hz / 1000000, vxtime_hz % 1000000, timename, gtod); printk(KERN_INFO "time.c: Detected %d.%03d MHz processor.\n", cpu_khz / 1000, cpu_khz % 1000); - vxtime.quot = (USEC_PER_SEC << US_SCALE) / vxtime_hz; - vxtime.tsc_quot = (USEC_PER_MSEC << US_SCALE) / cpu_khz; + vxtime.quot = (1000000L << 32) / vxtime_hz; + vxtime.tsc_quot = (1000L << 32) / cpu_khz; vxtime.last_tsc = get_cycles_sync(); setup_irq(0, &irq0); @@ -951,10 +956,10 @@ void __init time_init(void) __cpuinit int unsynchronized_tsc(void) { #ifdef CONFIG_SMP - if (apic_is_clustered_box()) + if (oem_force_hpet_timer()) return 1; /* Intel systems are normally all synchronized. Exceptions - are handled in the check above. */ + are handled in the OEM check above. */ if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL) return 0; #endif diff --git a/trunk/arch/x86_64/kernel/traps.c b/trunk/arch/x86_64/kernel/traps.c index 3d11a2fe45b7..cea335e8746c 100644 --- a/trunk/arch/x86_64/kernel/traps.c +++ b/trunk/arch/x86_64/kernel/traps.c @@ -6,6 +6,8 @@ * * Pentium III FXSR, SSE support * Gareth Hughes , May 2000 + * + * $Id: traps.c,v 1.36 2002/03/24 11:09:10 ak Exp $ */ /* @@ -29,7 +31,6 @@ #include #include #include -#include #include #include @@ -40,7 +41,7 @@ #include #include #include -#include + #include #include #include @@ -70,7 +71,6 @@ asmlinkage void machine_check(void); asmlinkage void spurious_interrupt_bug(void); ATOMIC_NOTIFIER_HEAD(die_chain); -EXPORT_SYMBOL(die_chain); int register_die_notifier(struct notifier_block *nb) { @@ -107,8 +107,7 @@ static inline void preempt_conditional_cli(struct pt_regs *regs) preempt_enable_no_resched(); } -static int kstack_depth_to_print = 12; -static int call_trace = 1; +static int kstack_depth_to_print = 10; #ifdef CONFIG_KALLSYMS #include @@ -192,25 +191,6 @@ static unsigned long *in_exception_stack(unsigned cpu, unsigned long stack, return NULL; } -static int show_trace_unwind(struct unwind_frame_info *info, void *context) -{ - int i = 11, n = 0; - - while (unwind(info) == 0 && UNW_PC(info)) { - ++n; - if (i > 50) { - printk("\n "); - i = 7; - } else - i += printk(" "); - i += printk_address(UNW_PC(info)); - if (arch_unw_user_mode(info)) - break; - } - printk("\n"); - return n; -} - /* * x86-64 can have upto three kernel stacks: * process stack @@ -218,39 +198,15 @@ static int show_trace_unwind(struct unwind_frame_info *info, void *context) * severe exception (double fault, nmi, stack fault, debug, mce) hardware stack */ -void show_trace(struct task_struct *tsk, struct pt_regs *regs, unsigned long * stack) +void show_trace(unsigned long *stack) { const unsigned cpu = safe_smp_processor_id(); unsigned long *irqstack_end = (unsigned long *)cpu_pda(cpu)->irqstackptr; - int i = 11; + int i; unsigned used = 0; printk("\nCall Trace:"); - if (!tsk) - tsk = current; - - if (call_trace >= 0) { - int unw_ret = 0; - struct unwind_frame_info info; - - if (regs) { - if (unwind_init_frame_info(&info, tsk, regs) == 0) - unw_ret = show_trace_unwind(&info, NULL); - } else if (tsk == current) - unw_ret = unwind_init_running(&info, show_trace_unwind, NULL); - else { - if (unwind_init_blocked(&info, tsk) == 0) - unw_ret = show_trace_unwind(&info, NULL); - } - if (unw_ret > 0) { - if (call_trace > 0) - return; - printk("Legacy call trace:"); - i = 18; - } - } - #define HANDLE_STACK(cond) \ do while (cond) { \ unsigned long addr = *stack++; \ @@ -273,7 +229,7 @@ void show_trace(struct task_struct *tsk, struct pt_regs *regs, unsigned long * s } \ } while (0) - for(; ; ) { + for(i = 11; ; ) { const char *id; unsigned long *estack_end; estack_end = in_exception_stack(cpu, (unsigned long)stack, @@ -308,7 +264,7 @@ void show_trace(struct task_struct *tsk, struct pt_regs *regs, unsigned long * s printk("\n"); } -static void _show_stack(struct task_struct *tsk, struct pt_regs *regs, unsigned long * rsp) +void show_stack(struct task_struct *tsk, unsigned long * rsp) { unsigned long *stack; int i; @@ -342,12 +298,7 @@ static void _show_stack(struct task_struct *tsk, struct pt_regs *regs, unsigned printk("%016lx ", *stack++); touch_nmi_watchdog(); } - show_trace(tsk, regs, rsp); -} - -void show_stack(struct task_struct *tsk, unsigned long * rsp) -{ - _show_stack(tsk, NULL, rsp); + show_trace((unsigned long *)rsp); } /* @@ -356,7 +307,7 @@ void show_stack(struct task_struct *tsk, unsigned long * rsp) void dump_stack(void) { unsigned long dummy; - show_trace(NULL, NULL, &dummy); + show_trace(&dummy); } EXPORT_SYMBOL(dump_stack); @@ -383,7 +334,7 @@ void show_registers(struct pt_regs *regs) if (in_kernel) { printk("Stack: "); - _show_stack(NULL, regs, (unsigned long*)rsp); + show_stack(NULL, (unsigned long*)rsp); printk("\nCode: "); if (regs->rip < PAGE_OFFSET) @@ -432,7 +383,6 @@ void out_of_line_bug(void) { BUG(); } -EXPORT_SYMBOL(out_of_line_bug); #endif static DEFINE_SPINLOCK(die_lock); @@ -1062,14 +1012,3 @@ static int __init kstack_setup(char *s) } __setup("kstack=", kstack_setup); -static int __init call_trace_setup(char *s) -{ - if (strcmp(s, "old") == 0) - call_trace = -1; - else if (strcmp(s, "both") == 0) - call_trace = 0; - else if (strcmp(s, "new") == 0) - call_trace = 1; - return 1; -} -__setup("call_trace=", call_trace_setup); diff --git a/trunk/arch/x86_64/kernel/vmlinux.lds.S b/trunk/arch/x86_64/kernel/vmlinux.lds.S index 1c6a5f322919..b81f473c4a19 100644 --- a/trunk/arch/x86_64/kernel/vmlinux.lds.S +++ b/trunk/arch/x86_64/kernel/vmlinux.lds.S @@ -45,15 +45,6 @@ SECTIONS RODATA -#ifdef CONFIG_STACK_UNWIND - . = ALIGN(8); - .eh_frame : AT(ADDR(.eh_frame) - LOAD_OFFSET) { - __start_unwind = .; - *(.eh_frame) - __end_unwind = .; - } -#endif - /* Data */ .data : AT(ADDR(.data) - LOAD_OFFSET) { *(.data) @@ -140,26 +131,6 @@ SECTIONS *(.data.page_aligned) } - /* might get freed after init */ - . = ALIGN(4096); - __smp_alt_begin = .; - __smp_alt_instructions = .; - .smp_altinstructions : AT(ADDR(.smp_altinstructions) - LOAD_OFFSET) { - *(.smp_altinstructions) - } - __smp_alt_instructions_end = .; - . = ALIGN(8); - __smp_locks = .; - .smp_locks : AT(ADDR(.smp_locks) - LOAD_OFFSET) { - *(.smp_locks) - } - __smp_locks_end = .; - .smp_altinstr_replacement : AT(ADDR(.smp_altinstr_replacement) - LOAD_OFFSET) { - *(.smp_altinstr_replacement) - } - . = ALIGN(4096); - __smp_alt_end = .; - . = ALIGN(4096); /* Init code and data */ __init_begin = .; .init.text : AT(ADDR(.init.text) - LOAD_OFFSET) { diff --git a/trunk/arch/x86_64/kernel/vsyscall.c b/trunk/arch/x86_64/kernel/vsyscall.c index f603037df162..9468fb20b0bc 100644 --- a/trunk/arch/x86_64/kernel/vsyscall.c +++ b/trunk/arch/x86_64/kernel/vsyscall.c @@ -107,7 +107,7 @@ static __always_inline long time_syscall(long *t) int __vsyscall(0) vgettimeofday(struct timeval * tv, struct timezone * tz) { - if (!__sysctl_vsyscall) + if (unlikely(!__sysctl_vsyscall)) return gettimeofday(tv,tz); if (tv) do_vgettimeofday(tv); @@ -120,7 +120,7 @@ int __vsyscall(0) vgettimeofday(struct timeval * tv, struct timezone * tz) * unlikely */ time_t __vsyscall(1) vtime(time_t *t) { - if (!__sysctl_vsyscall) + if (unlikely(!__sysctl_vsyscall)) return time_syscall(t); else if (t) *t = __xtime.tv_sec; diff --git a/trunk/arch/x86_64/kernel/x8664_ksyms.c b/trunk/arch/x86_64/kernel/x8664_ksyms.c index 370952c4ff22..1def21c9f7cd 100644 --- a/trunk/arch/x86_64/kernel/x8664_ksyms.c +++ b/trunk/arch/x86_64/kernel/x8664_ksyms.c @@ -1,21 +1,66 @@ -/* Exports for assembly files. - All C exports should go in the respective C files. */ - #include #include #include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include #include #include +#include #include +#include +#include +#include +#include +#include +#include #include +#include +#include +#include +#include +#include +#include + +extern spinlock_t rtc_lock; +#ifdef CONFIG_SMP +extern void __write_lock_failed(rwlock_t *rw); +extern void __read_lock_failed(rwlock_t *rw); +#endif + +/* platform dependent support */ +EXPORT_SYMBOL(boot_cpu_data); +//EXPORT_SYMBOL(dump_fpu); +EXPORT_SYMBOL(__ioremap); +EXPORT_SYMBOL(ioremap_nocache); +EXPORT_SYMBOL(iounmap); EXPORT_SYMBOL(kernel_thread); +EXPORT_SYMBOL(pm_idle); +EXPORT_SYMBOL(pm_power_off); EXPORT_SYMBOL(__down_failed); EXPORT_SYMBOL(__down_failed_interruptible); EXPORT_SYMBOL(__down_failed_trylock); EXPORT_SYMBOL(__up_wakeup); +/* Networking helper routines. */ +EXPORT_SYMBOL(csum_partial_copy_nocheck); +EXPORT_SYMBOL(ip_compute_csum); +/* Delay loops */ +EXPORT_SYMBOL(__udelay); +EXPORT_SYMBOL(__ndelay); +EXPORT_SYMBOL(__delay); +EXPORT_SYMBOL(__const_udelay); EXPORT_SYMBOL(__get_user_1); EXPORT_SYMBOL(__get_user_2); @@ -26,20 +71,42 @@ EXPORT_SYMBOL(__put_user_2); EXPORT_SYMBOL(__put_user_4); EXPORT_SYMBOL(__put_user_8); +EXPORT_SYMBOL(strncpy_from_user); +EXPORT_SYMBOL(__strncpy_from_user); +EXPORT_SYMBOL(clear_user); +EXPORT_SYMBOL(__clear_user); EXPORT_SYMBOL(copy_user_generic); EXPORT_SYMBOL(copy_from_user); EXPORT_SYMBOL(copy_to_user); +EXPORT_SYMBOL(copy_in_user); +EXPORT_SYMBOL(strnlen_user); + +#ifdef CONFIG_PCI +EXPORT_SYMBOL(pci_mem_start); +#endif EXPORT_SYMBOL(copy_page); EXPORT_SYMBOL(clear_page); +EXPORT_SYMBOL(_cpu_pda); #ifdef CONFIG_SMP -extern void FASTCALL( __write_lock_failed(rwlock_t *rw)); -extern void FASTCALL( __read_lock_failed(rwlock_t *rw)); +EXPORT_SYMBOL(cpu_data); EXPORT_SYMBOL(__write_lock_failed); EXPORT_SYMBOL(__read_lock_failed); + +EXPORT_SYMBOL(smp_call_function); +EXPORT_SYMBOL(cpu_callout_map); +#endif + +#ifdef CONFIG_VT +EXPORT_SYMBOL(screen_info); #endif +EXPORT_SYMBOL(rtc_lock); + +EXPORT_SYMBOL_GPL(set_nmi_callback); +EXPORT_SYMBOL_GPL(unset_nmi_callback); + /* Export string functions. We normally rely on gcc builtin for most of these, but gcc sometimes decides not to inline them. */ #undef memcpy @@ -47,14 +114,51 @@ EXPORT_SYMBOL(__read_lock_failed); #undef memmove extern void * memset(void *,int,__kernel_size_t); +extern size_t strlen(const char *); +extern void * memmove(void * dest,const void *src,size_t count); extern void * memcpy(void *,const void *,__kernel_size_t); extern void * __memcpy(void *,const void *,__kernel_size_t); EXPORT_SYMBOL(memset); +EXPORT_SYMBOL(memmove); EXPORT_SYMBOL(memcpy); EXPORT_SYMBOL(__memcpy); +#ifdef CONFIG_RWSEM_XCHGADD_ALGORITHM +/* prototypes are wrong, these are assembly with custom calling functions */ +extern void rwsem_down_read_failed_thunk(void); +extern void rwsem_wake_thunk(void); +extern void rwsem_downgrade_thunk(void); +extern void rwsem_down_write_failed_thunk(void); +EXPORT_SYMBOL(rwsem_down_read_failed_thunk); +EXPORT_SYMBOL(rwsem_wake_thunk); +EXPORT_SYMBOL(rwsem_downgrade_thunk); +EXPORT_SYMBOL(rwsem_down_write_failed_thunk); +#endif + EXPORT_SYMBOL(empty_zero_page); + +EXPORT_SYMBOL(die_chain); + +#ifdef CONFIG_SMP +EXPORT_SYMBOL(cpu_sibling_map); +EXPORT_SYMBOL(smp_num_siblings); +#endif + +#ifdef CONFIG_BUG +EXPORT_SYMBOL(out_of_line_bug); +#endif + EXPORT_SYMBOL(init_level4_pgt); + +extern unsigned long __supported_pte_mask; +EXPORT_SYMBOL(__supported_pte_mask); + +#ifdef CONFIG_SMP +EXPORT_SYMBOL(flush_tlb_page); +#endif + +EXPORT_SYMBOL(cpu_khz); + EXPORT_SYMBOL(load_gs_index); diff --git a/trunk/arch/x86_64/lib/csum-partial.c b/trunk/arch/x86_64/lib/csum-partial.c index c493735218da..5384e227cdf6 100644 --- a/trunk/arch/x86_64/lib/csum-partial.c +++ b/trunk/arch/x86_64/lib/csum-partial.c @@ -147,5 +147,4 @@ unsigned short ip_compute_csum(unsigned char * buff, int len) { return csum_fold(csum_partial(buff,len,0)); } -EXPORT_SYMBOL(ip_compute_csum); diff --git a/trunk/arch/x86_64/lib/csum-wrappers.c b/trunk/arch/x86_64/lib/csum-wrappers.c index b1320ec58428..94323f20816e 100644 --- a/trunk/arch/x86_64/lib/csum-wrappers.c +++ b/trunk/arch/x86_64/lib/csum-wrappers.c @@ -109,7 +109,6 @@ csum_partial_copy_nocheck(const unsigned char *src, unsigned char *dst, int len, { return csum_partial_copy_generic(src,dst,len,sum,NULL,NULL); } -EXPORT_SYMBOL(csum_partial_copy_nocheck); unsigned short csum_ipv6_magic(struct in6_addr *saddr, struct in6_addr *daddr, __u32 len, unsigned short proto, unsigned int sum) diff --git a/trunk/arch/x86_64/lib/delay.c b/trunk/arch/x86_64/lib/delay.c index b6cd3cca2f45..03c460cbdd1c 100644 --- a/trunk/arch/x86_64/lib/delay.c +++ b/trunk/arch/x86_64/lib/delay.c @@ -9,7 +9,6 @@ */ #include -#include #include #include #include @@ -37,22 +36,18 @@ void __delay(unsigned long loops) } while((now-bclock) < loops); } -EXPORT_SYMBOL(__delay); inline void __const_udelay(unsigned long xloops) { __delay((xloops * HZ * cpu_data[raw_smp_processor_id()].loops_per_jiffy) >> 32); } -EXPORT_SYMBOL(__const_udelay); void __udelay(unsigned long usecs) { __const_udelay(usecs * 0x000010c6); /* 2**32 / 1000000 */ } -EXPORT_SYMBOL(__udelay); void __ndelay(unsigned long nsecs) { __const_udelay(nsecs * 0x00005); /* 2**32 / 1000000000 (rounded up) */ } -EXPORT_SYMBOL(__ndelay); diff --git a/trunk/arch/x86_64/lib/memmove.c b/trunk/arch/x86_64/lib/memmove.c index 751ebae8ec42..e93d5255fdc9 100644 --- a/trunk/arch/x86_64/lib/memmove.c +++ b/trunk/arch/x86_64/lib/memmove.c @@ -3,13 +3,12 @@ */ #define _STRING_C #include -#include #undef memmove void *memmove(void * dest,const void *src,size_t count) { if (dest < src) { - return memcpy(dest,src,count); + __inline_memcpy(dest,src,count); } else { char *p = (char *) dest + count; char *s = (char *) src + count; @@ -18,4 +17,3 @@ void *memmove(void * dest,const void *src,size_t count) } return dest; } -EXPORT_SYMBOL(memmove); diff --git a/trunk/arch/x86_64/lib/usercopy.c b/trunk/arch/x86_64/lib/usercopy.c index 893d43f838cc..9bc2c295818e 100644 --- a/trunk/arch/x86_64/lib/usercopy.c +++ b/trunk/arch/x86_64/lib/usercopy.c @@ -5,7 +5,6 @@ * Copyright 1997 Linus Torvalds * Copyright 2002 Andi Kleen */ -#include #include /* @@ -48,17 +47,15 @@ __strncpy_from_user(char *dst, const char __user *src, long count) __do_strncpy_from_user(dst, src, count, res); return res; } -EXPORT_SYMBOL(__strncpy_from_user); long strncpy_from_user(char *dst, const char __user *src, long count) { long res = -EFAULT; if (access_ok(VERIFY_READ, src, 1)) - return __strncpy_from_user(dst, src, count); + __do_strncpy_from_user(dst, src, count, res); return res; } -EXPORT_SYMBOL(strncpy_from_user); /* * Zero Userspace @@ -97,7 +94,7 @@ unsigned long __clear_user(void __user *addr, unsigned long size) [zero] "r" (0UL), [eight] "r" (8UL)); return size; } -EXPORT_SYMBOL(__clear_user); + unsigned long clear_user(void __user *to, unsigned long n) { @@ -105,7 +102,6 @@ unsigned long clear_user(void __user *to, unsigned long n) return __clear_user(to, n); return n; } -EXPORT_SYMBOL(clear_user); /* * Return the size of a string (including the ending 0) @@ -129,7 +125,6 @@ long __strnlen_user(const char __user *s, long n) s++; } } -EXPORT_SYMBOL(__strnlen_user); long strnlen_user(const char __user *s, long n) { @@ -137,7 +132,6 @@ long strnlen_user(const char __user *s, long n) return 0; return __strnlen_user(s, n); } -EXPORT_SYMBOL(strnlen_user); long strlen_user(const char __user *s) { @@ -153,7 +147,6 @@ long strlen_user(const char __user *s) s++; } } -EXPORT_SYMBOL(strlen_user); unsigned long copy_in_user(void __user *to, const void __user *from, unsigned len) { @@ -162,5 +155,3 @@ unsigned long copy_in_user(void __user *to, const void __user *from, unsigned le } return len; } -EXPORT_SYMBOL(copy_in_user); - diff --git a/trunk/arch/x86_64/mm/fault.c b/trunk/arch/x86_64/mm/fault.c index 08dc696f54ee..55250593d8c9 100644 --- a/trunk/arch/x86_64/mm/fault.c +++ b/trunk/arch/x86_64/mm/fault.c @@ -41,41 +41,6 @@ #define PF_RSVD (1<<3) #define PF_INSTR (1<<4) -#ifdef CONFIG_KPROBES -ATOMIC_NOTIFIER_HEAD(notify_page_fault_chain); - -/* Hook to register for page fault notifications */ -int register_page_fault_notifier(struct notifier_block *nb) -{ - vmalloc_sync_all(); - return atomic_notifier_chain_register(¬ify_page_fault_chain, nb); -} - -int unregister_page_fault_notifier(struct notifier_block *nb) -{ - return atomic_notifier_chain_unregister(¬ify_page_fault_chain, nb); -} - -static inline int notify_page_fault(enum die_val val, const char *str, - struct pt_regs *regs, long err, int trap, int sig) -{ - struct die_args args = { - .regs = regs, - .str = str, - .err = err, - .trapnr = trap, - .signr = sig - }; - return atomic_notifier_call_chain(¬ify_page_fault_chain, val, &args); -} -#else -static inline int notify_page_fault(enum die_val val, const char *str, - struct pt_regs *regs, long err, int trap, int sig) -{ - return NOTIFY_DONE; -} -#endif - void bust_spinlocks(int yes) { int loglevel_save = console_loglevel; @@ -195,7 +160,7 @@ void dump_pagetable(unsigned long address) printk("PGD %lx ", pgd_val(*pgd)); if (!pgd_present(*pgd)) goto ret; - pud = pud_offset(pgd, address); + pud = __pud_offset_k((pud_t *)pgd_page(*pgd), address); if (bad_address(pud)) goto bad; printk("PUD %lx ", pud_val(*pud)); if (!pud_present(*pud)) goto ret; @@ -383,7 +348,7 @@ asmlinkage void __kprobes do_page_fault(struct pt_regs *regs, if (vmalloc_fault(address) >= 0) return; } - if (notify_page_fault(DIE_PAGE_FAULT, "page fault", regs, error_code, 14, + if (notify_die(DIE_PAGE_FAULT, "page fault", regs, error_code, 14, SIGSEGV) == NOTIFY_STOP) return; /* @@ -393,7 +358,7 @@ asmlinkage void __kprobes do_page_fault(struct pt_regs *regs, goto bad_area_nosemaphore; } - if (notify_page_fault(DIE_PAGE_FAULT, "page fault", regs, error_code, 14, + if (notify_die(DIE_PAGE_FAULT, "page fault", regs, error_code, 14, SIGSEGV) == NOTIFY_STOP) return; @@ -445,10 +410,8 @@ asmlinkage void __kprobes do_page_fault(struct pt_regs *regs, if (!(vma->vm_flags & VM_GROWSDOWN)) goto bad_area; if (error_code & 4) { - /* Allow userspace just enough access below the stack pointer - * to let the 'enter' instruction work. - */ - if (address + 65536 + 32 * sizeof(unsigned long) < regs->rsp) + // XXX: align red zone size with ABI + if (address + 128 < regs->rsp) goto bad_area; } if (expand_stack(vma, address)) diff --git a/trunk/arch/x86_64/mm/init.c b/trunk/arch/x86_64/mm/init.c index 02add1d1dfa8..4ba34e95d835 100644 --- a/trunk/arch/x86_64/mm/init.c +++ b/trunk/arch/x86_64/mm/init.c @@ -41,6 +41,8 @@ #include #include #include +#include +#include #ifndef Dprintk #define Dprintk(x...) @@ -88,6 +90,8 @@ void show_mem(void) printk(KERN_INFO "%lu pages swap cached\n",cached); } +/* References to section boundaries */ + int after_bootmem; static __init void *spp_getpage(void) @@ -257,10 +261,9 @@ phys_pmd_init(pmd_t *pmd, unsigned long address, unsigned long end) for (i = 0; i < PTRS_PER_PMD; pmd++, i++, address += PMD_SIZE) { unsigned long entry; - if (address >= end) { - if (!after_bootmem) - for (; i < PTRS_PER_PMD; i++, pmd++) - set_pmd(pmd, __pmd(0)); + if (address > end) { + for (; i < PTRS_PER_PMD; i++, pmd++) + set_pmd(pmd, __pmd(0)); break; } entry = _PAGE_NX|_PAGE_PSE|_KERNPG_TABLE|_PAGE_GLOBAL|address; @@ -338,8 +341,7 @@ static void __init find_early_table_space(unsigned long end) table_end = table_start; early_printk("kernel direct mapping tables up to %lx @ %lx-%lx\n", - end, table_start << PAGE_SHIFT, - (table_start << PAGE_SHIFT) + tables); + end, table_start << PAGE_SHIFT, table_end << PAGE_SHIFT); } /* Setup the direct mapping of the physical memory at PAGE_OFFSET. @@ -370,7 +372,7 @@ void __meminit init_memory_mapping(unsigned long start, unsigned long end) pud_t *pud; if (after_bootmem) - pud = pud_offset(pgd, start & PGDIR_MASK); + pud = pud_offset_k(pgd, start & PGDIR_MASK); else pud = alloc_low_page(&map, &pud_phys); @@ -585,7 +587,10 @@ void __init mem_init(void) { long codesize, reservedpages, datasize, initsize; - pci_iommu_alloc(); +#ifdef CONFIG_SWIOTLB + pci_swiotlb_init(); +#endif + no_iommu_init(); /* How many end-of-memory variables you have, grandma! */ max_low_pfn = end_pfn; @@ -639,29 +644,20 @@ void __init mem_init(void) #endif } -void free_init_pages(char *what, unsigned long begin, unsigned long end) +void free_initmem(void) { unsigned long addr; - if (begin >= end) - return; - - printk(KERN_INFO "Freeing %s: %ldk freed\n", what, (end - begin) >> 10); - for (addr = begin; addr < end; addr += PAGE_SIZE) { + addr = (unsigned long)(&__init_begin); + for (; addr < (unsigned long)(&__init_end); addr += PAGE_SIZE) { ClearPageReserved(virt_to_page(addr)); init_page_count(virt_to_page(addr)); memset((void *)(addr & ~(PAGE_SIZE-1)), 0xcc, PAGE_SIZE); free_page(addr); totalram_pages++; } -} - -void free_initmem(void) -{ memset(__initdata_begin, 0xba, __initdata_end - __initdata_begin); - free_init_pages("unused kernel memory", - (unsigned long)(&__init_begin), - (unsigned long)(&__init_end)); + printk ("Freeing unused kernel memory: %luk freed\n", (__init_end - __init_begin) >> 10); } #ifdef CONFIG_DEBUG_RODATA @@ -690,7 +686,15 @@ void mark_rodata_ro(void) #ifdef CONFIG_BLK_DEV_INITRD void free_initrd_mem(unsigned long start, unsigned long end) { - free_init_pages("initrd memory", start, end); + if (start >= end) + return; + printk ("Freeing initrd memory: %ldk freed\n", (end - start) >> 10); + for (; start < end; start += PAGE_SIZE) { + ClearPageReserved(virt_to_page(start)); + init_page_count(virt_to_page(start)); + free_page(start); + totalram_pages++; + } } #endif diff --git a/trunk/arch/x86_64/mm/ioremap.c b/trunk/arch/x86_64/mm/ioremap.c index 45d7d823c3b8..ae207064201e 100644 --- a/trunk/arch/x86_64/mm/ioremap.c +++ b/trunk/arch/x86_64/mm/ioremap.c @@ -11,7 +11,6 @@ #include #include #include -#include #include #include #include @@ -220,7 +219,6 @@ void __iomem * __ioremap(unsigned long phys_addr, unsigned long size, unsigned l } return (__force void __iomem *) (offset + (char *)addr); } -EXPORT_SYMBOL(__ioremap); /** * ioremap_nocache - map bus memory into CPU space @@ -248,7 +246,6 @@ void __iomem *ioremap_nocache (unsigned long phys_addr, unsigned long size) { return __ioremap(phys_addr, size, _PAGE_PCD); } -EXPORT_SYMBOL(ioremap_nocache); /** * iounmap - Free a IO remapping @@ -294,5 +291,3 @@ void iounmap(volatile void __iomem *addr) BUG_ON(p != o || o == NULL); kfree(p); } -EXPORT_SYMBOL(iounmap); - diff --git a/trunk/arch/x86_64/pci/k8-bus.c b/trunk/arch/x86_64/pci/k8-bus.c index b50a7c7c47f8..3acf60ded2a0 100644 --- a/trunk/arch/x86_64/pci/k8-bus.c +++ b/trunk/arch/x86_64/pci/k8-bus.c @@ -2,7 +2,6 @@ #include #include #include -#include /* * This discovers the pcibus <-> node mapping on AMD K8. @@ -19,6 +18,7 @@ #define NR_LDT_BUS_NUMBER_REGISTERS 3 #define SECONDARY_LDT_BUS_NUMBER(dword) ((dword >> 8) & 0xFF) #define SUBORDINATE_LDT_BUS_NUMBER(dword) ((dword >> 16) & 0xFF) +#define PCI_DEVICE_ID_K8HTCONFIG 0x1100 /** * fill_mp_bus_to_cpumask() @@ -28,7 +28,8 @@ __init static int fill_mp_bus_to_cpumask(void) { - int i, j, k; + struct pci_dev *nb_dev = NULL; + int i, j; u32 ldtbus, nid; static int lbnr[3] = { LDT_BUS_NUMBER_REGISTER_0, @@ -36,9 +37,8 @@ fill_mp_bus_to_cpumask(void) LDT_BUS_NUMBER_REGISTER_2 }; - cache_k8_northbridges(); - for (k = 0; k < num_k8_northbridges; k++) { - struct pci_dev *nb_dev = k8_northbridges[k]; + while ((nb_dev = pci_get_device(PCI_VENDOR_ID_AMD, + PCI_DEVICE_ID_K8HTCONFIG, nb_dev))) { pci_read_config_dword(nb_dev, NODE_ID_REGISTER, &nid); for (i = 0; i < NR_LDT_BUS_NUMBER_REGISTERS; i++) { diff --git a/trunk/crypto/khazad.c b/trunk/crypto/khazad.c index 807f2bf4ea24..5b8dc9a2d374 100644 --- a/trunk/crypto/khazad.c +++ b/trunk/crypto/khazad.c @@ -758,7 +758,7 @@ static int khazad_setkey(void *ctx_arg, const u8 *in_key, unsigned int key_len, u32 *flags) { struct khazad_ctx *ctx = ctx_arg; - const __be64 *key = (const __be64 *)in_key; + const __be32 *key = (const __be32 *)in_key; int r; const u64 *S = T7; u64 K2, K1; @@ -769,8 +769,9 @@ static int khazad_setkey(void *ctx_arg, const u8 *in_key, return -EINVAL; } - K2 = be64_to_cpu(key[0]); - K1 = be64_to_cpu(key[1]); + /* key is supposed to be 32-bit aligned */ + K2 = ((u64)be32_to_cpu(key[0]) << 32) | be32_to_cpu(key[1]); + K1 = ((u64)be32_to_cpu(key[2]) << 32) | be32_to_cpu(key[3]); /* setup the encrypt key */ for (r = 0; r <= KHAZAD_ROUNDS; r++) { diff --git a/trunk/drivers/Makefile b/trunk/drivers/Makefile index fc2d744a4e4a..3c5170310bd0 100644 --- a/trunk/drivers/Makefile +++ b/trunk/drivers/Makefile @@ -74,5 +74,4 @@ obj-$(CONFIG_SGI_SN) += sn/ obj-y += firmware/ obj-$(CONFIG_CRYPTO) += crypto/ obj-$(CONFIG_SUPERH) += sh/ -obj-$(CONFIG_GENERIC_TIME) += clocksource/ obj-$(CONFIG_DMA_ENGINE) += dma/ diff --git a/trunk/drivers/acpi/processor_idle.c b/trunk/drivers/acpi/processor_idle.c index 8a74bf3efd8e..3b97a5eae9e8 100644 --- a/trunk/drivers/acpi/processor_idle.c +++ b/trunk/drivers/acpi/processor_idle.c @@ -206,11 +206,11 @@ acpi_processor_power_activate(struct acpi_processor *pr, static void acpi_safe_halt(void) { - current_thread_info()->status &= ~TS_POLLING; + clear_thread_flag(TIF_POLLING_NRFLAG); smp_mb__after_clear_bit(); if (!need_resched()) safe_halt(); - current_thread_info()->status |= TS_POLLING; + set_thread_flag(TIF_POLLING_NRFLAG); } static atomic_t c3_cpu_count; @@ -330,10 +330,10 @@ static void acpi_processor_idle(void) * Invoke the current Cx state to put the processor to sleep. */ if (cx->type == ACPI_STATE_C2 || cx->type == ACPI_STATE_C3) { - current_thread_info()->status &= ~TS_POLLING; + clear_thread_flag(TIF_POLLING_NRFLAG); smp_mb__after_clear_bit(); if (need_resched()) { - current_thread_info()->status |= TS_POLLING; + set_thread_flag(TIF_POLLING_NRFLAG); local_irq_enable(); return; } @@ -369,14 +369,9 @@ static void acpi_processor_idle(void) t2 = inl(acpi_fadt.xpm_tmr_blk.address); /* Get end time (ticks) */ t2 = inl(acpi_fadt.xpm_tmr_blk.address); - -#ifdef CONFIG_GENERIC_TIME - /* TSC halts in C2, so notify users */ - mark_tsc_unstable(); -#endif /* Re-enable interrupts */ local_irq_enable(); - current_thread_info()->status |= TS_POLLING; + set_thread_flag(TIF_POLLING_NRFLAG); /* Compute time (ticks) that we were actually asleep */ sleep_ticks = ticks_elapsed(t1, t2) - cx->latency_ticks - C2_OVERHEAD; @@ -414,13 +409,9 @@ static void acpi_processor_idle(void) ACPI_MTX_DO_NOT_LOCK); } -#ifdef CONFIG_GENERIC_TIME - /* TSC halts in C3, so notify users */ - mark_tsc_unstable(); -#endif /* Re-enable interrupts */ local_irq_enable(); - current_thread_info()->status |= TS_POLLING; + set_thread_flag(TIF_POLLING_NRFLAG); /* Compute time (ticks) that we were actually asleep */ sleep_ticks = ticks_elapsed(t1, t2) - cx->latency_ticks - C3_OVERHEAD; diff --git a/trunk/drivers/base/power/resume.c b/trunk/drivers/base/power/resume.c index 826093ef4c7e..520679ce53a8 100644 --- a/trunk/drivers/base/power/resume.c +++ b/trunk/drivers/base/power/resume.c @@ -53,7 +53,8 @@ void dpm_resume(void) struct device * dev = to_device(entry); get_device(dev); - list_move_tail(entry, &dpm_active); + list_del_init(entry); + list_add_tail(entry, &dpm_active); up(&dpm_list_sem); if (!dev->power.prev_state.event) @@ -100,7 +101,8 @@ void dpm_power_up(void) struct device * dev = to_device(entry); get_device(dev); - list_move_tail(entry, &dpm_active); + list_del_init(entry); + list_add_tail(entry, &dpm_active); resume_device(dev); put_device(dev); } diff --git a/trunk/drivers/base/power/suspend.c b/trunk/drivers/base/power/suspend.c index 69509e02f703..1a1fe43a3057 100644 --- a/trunk/drivers/base/power/suspend.c +++ b/trunk/drivers/base/power/suspend.c @@ -116,10 +116,12 @@ int device_suspend(pm_message_t state) /* Check if the device got removed */ if (!list_empty(&dev->power.entry)) { /* Move it to the dpm_off or dpm_off_irq list */ - if (!error) - list_move(&dev->power.entry, &dpm_off); - else if (error == -EAGAIN) { - list_move(&dev->power.entry, &dpm_off_irq); + if (!error) { + list_del(&dev->power.entry); + list_add(&dev->power.entry, &dpm_off); + } else if (error == -EAGAIN) { + list_del(&dev->power.entry); + list_add(&dev->power.entry, &dpm_off_irq); error = 0; } } @@ -137,7 +139,8 @@ int device_suspend(pm_message_t state) */ while (!list_empty(&dpm_off_irq)) { struct list_head * entry = dpm_off_irq.next; - list_move(entry, &dpm_off); + list_del(entry); + list_add(entry, &dpm_off); } dpm_resume(); } diff --git a/trunk/drivers/bluetooth/dtl1_cs.c b/trunk/drivers/bluetooth/dtl1_cs.c index ed8dca84ff69..a71a240611e0 100644 --- a/trunk/drivers/bluetooth/dtl1_cs.c +++ b/trunk/drivers/bluetooth/dtl1_cs.c @@ -423,9 +423,6 @@ static int dtl1_hci_send_frame(struct sk_buff *skb) nsh.len = skb->len; s = bt_skb_alloc(NSHL + skb->len + 1, GFP_ATOMIC); - if (!s) - return -ENOMEM; - skb_reserve(s, NSHL); memcpy(skb_put(s, skb->len), skb->data, skb->len); if (skb->len & 0x0001) diff --git a/trunk/drivers/char/Kconfig b/trunk/drivers/char/Kconfig index 3610c5729553..63f28d169b36 100644 --- a/trunk/drivers/char/Kconfig +++ b/trunk/drivers/char/Kconfig @@ -62,23 +62,6 @@ config HW_CONSOLE depends on VT && !S390 && !UML default y -config VT_HW_CONSOLE_BINDING - bool "Support for binding and unbinding console drivers" - depends on HW_CONSOLE - default n - ---help--- - The virtual terminal is the device that interacts with the physical - terminal through console drivers. On these systems, at least one - console driver is loaded. In other configurations, additional console - drivers may be enabled, such as the framebuffer console. If more than - 1 console driver is enabled, setting this to 'y' will allow you to - select the console driver that will serve as the backend for the - virtual terminals. - - See for more - information. For framebuffer console users, please refer to - . - config SERIAL_NONSTANDARD bool "Non-standard serial port support" ---help--- @@ -687,7 +670,20 @@ config NWFLASH If you're not sure, say N. -source "drivers/char/hw_random/Kconfig" +config HW_RANDOM + tristate "Intel/AMD/VIA HW Random Number Generator support" + depends on (X86 || IA64) && PCI + ---help--- + This driver provides kernel-side support for the Random Number + Generator hardware found on Intel i8xx-based motherboards, + AMD 76x-based motherboards, and Via Nehemiah CPUs. + + Provides a character driver, used to read() entropy data. + + To compile this driver as a module, choose M here: the + module will be called hw_random. + + If unsure, say N. config NVRAM tristate "/dev/nvram support" diff --git a/trunk/drivers/char/Makefile b/trunk/drivers/char/Makefile index 524105597ea7..fb919bfb2824 100644 --- a/trunk/drivers/char/Makefile +++ b/trunk/drivers/char/Makefile @@ -75,7 +75,7 @@ endif obj-$(CONFIG_TOSHIBA) += toshiba.o obj-$(CONFIG_I8K) += i8k.o obj-$(CONFIG_DS1620) += ds1620.o -obj-$(CONFIG_HW_RANDOM) += hw_random/ +obj-$(CONFIG_HW_RANDOM) += hw_random.o obj-$(CONFIG_FTAPE) += ftape/ obj-$(CONFIG_COBALT_LCD) += lcd.o obj-$(CONFIG_PPDEV) += ppdev.o diff --git a/trunk/drivers/char/agp/Kconfig b/trunk/drivers/char/agp/Kconfig index 9826a399fa02..46685a540772 100644 --- a/trunk/drivers/char/agp/Kconfig +++ b/trunk/drivers/char/agp/Kconfig @@ -55,9 +55,9 @@ config AGP_AMD X on AMD Irongate, 761, and 762 chipsets. config AGP_AMD64 - tristate "AMD Opteron/Athlon64 on-CPU GART support" if !IOMMU + tristate "AMD Opteron/Athlon64 on-CPU GART support" if !GART_IOMMU depends on AGP && X86 - default y if IOMMU + default y if GART_IOMMU help This option gives you AGP support for the GLX component of X using the on-CPU northbridge of the AMD Athlon64/Opteron CPUs. diff --git a/trunk/drivers/char/agp/amd64-agp.c b/trunk/drivers/char/agp/amd64-agp.c index f690ee8cb732..ac3c33a2e37d 100644 --- a/trunk/drivers/char/agp/amd64-agp.c +++ b/trunk/drivers/char/agp/amd64-agp.c @@ -15,9 +15,11 @@ #include #include #include /* PAGE_SIZE */ -#include #include "agp.h" +/* Will need to be increased if AMD64 ever goes >8-way. */ +#define MAX_HAMMER_GARTS 8 + /* PTE bits. */ #define GPTE_VALID 1 #define GPTE_COHERENT 2 @@ -51,12 +53,28 @@ #define ULI_X86_64_HTT_FEA_REG 0x50 #define ULI_X86_64_ENU_SCR_REG 0x54 +static int nr_garts; +static struct pci_dev * hammers[MAX_HAMMER_GARTS]; + static struct resource *aperture_resource; static int __initdata agp_try_unsupported = 1; +#define for_each_nb() for(gart_iterator=0;gart_iteratorgatt_table_real); - int i; + int gart_iterator; /* Configure AGP regs in each x86-64 host bridge. */ - for (i = 0; i < num_k8_northbridges; i++) { + for_each_nb() { agp_bridge->gart_bus_addr = - amd64_configure(k8_northbridges[i], gatt_bus); + amd64_configure(hammers[gart_iterator],gatt_bus); } - k8_flush_garts(); return 0; } @@ -216,13 +236,12 @@ static int amd_8151_configure(void) static void amd64_cleanup(void) { u32 tmp; - int i; - for (i = 0; i < num_k8_northbridges; i++) { - struct pci_dev *dev = k8_northbridges[i]; + int gart_iterator; + for_each_nb() { /* disable gart translation */ - pci_read_config_dword (dev, AMD64_GARTAPERTURECTL, &tmp); + pci_read_config_dword (hammers[gart_iterator], AMD64_GARTAPERTURECTL, &tmp); tmp &= ~AMD64_GARTEN; - pci_write_config_dword (dev, AMD64_GARTAPERTURECTL, tmp); + pci_write_config_dword (hammers[gart_iterator], AMD64_GARTAPERTURECTL, tmp); } } @@ -292,7 +311,7 @@ static int __devinit aperture_valid(u64 aper, u32 size) /* * W*s centric BIOS sometimes only set up the aperture in the AGP * bridge, not the northbridge. On AMD64 this is handled early - * in aperture.c, but when IOMMU is not enabled or we run + * in aperture.c, but when GART_IOMMU is not enabled or we run * on a 32bit kernel this needs to be redone. * Unfortunately it is impossible to fix the aperture here because it's too late * to allocate that much memory. But at least error out cleanly instead of @@ -342,15 +361,17 @@ static __devinit int fix_northbridge(struct pci_dev *nb, struct pci_dev *agp, static __devinit int cache_nbs (struct pci_dev *pdev, u32 cap_ptr) { - int i; - - if (cache_k8_northbridges() < 0) - return -ENODEV; - - i = 0; - for (i = 0; i < num_k8_northbridges; i++) { - struct pci_dev *dev = k8_northbridges[i]; - if (fix_northbridge(dev, pdev, cap_ptr) < 0) { + struct pci_dev *loop_dev = NULL; + int i = 0; + + /* cache pci_devs of northbridges. */ + while ((loop_dev = pci_get_device(PCI_VENDOR_ID_AMD, 0x1103, loop_dev)) + != NULL) { + if (i == MAX_HAMMER_GARTS) { + printk(KERN_ERR PFX "Too many northbridges for AGP\n"); + return -1; + } + if (fix_northbridge(loop_dev, pdev, cap_ptr) < 0) { printk(KERN_ERR PFX "No usable aperture found.\n"); #ifdef __x86_64__ /* should port this to i386 */ @@ -358,8 +379,10 @@ static __devinit int cache_nbs (struct pci_dev *pdev, u32 cap_ptr) #endif return -1; } + hammers[i++] = loop_dev; } - return 0; + nr_garts = i; + return i == 0 ? -1 : 0; } /* Handle AMD 8151 quirks */ @@ -427,7 +450,7 @@ static int __devinit uli_agp_init(struct pci_dev *pdev) } /* shadow x86-64 registers into ULi registers */ - pci_read_config_dword (k8_northbridges[0], AMD64_GARTAPERTUREBASE, &httfea); + pci_read_config_dword (hammers[0], AMD64_GARTAPERTUREBASE, &httfea); /* if x86-64 aperture base is beyond 4G, exit here */ if ((httfea & 0x7fff) >> (32 - 25)) @@ -490,7 +513,7 @@ static int __devinit nforce3_agp_init(struct pci_dev *pdev) pci_write_config_dword(dev1, NVIDIA_X86_64_1_APSIZE, tmp); /* shadow x86-64 registers into NVIDIA registers */ - pci_read_config_dword (k8_northbridges[0], AMD64_GARTAPERTUREBASE, &apbase); + pci_read_config_dword (hammers[0], AMD64_GARTAPERTUREBASE, &apbase); /* if x86-64 aperture base is beyond 4G, exit here */ if ( (apbase & 0x7fff) >> (32 - 25) ) { @@ -731,6 +754,10 @@ static struct pci_driver agp_amd64_pci_driver = { int __init agp_amd64_init(void) { int err = 0; + static struct pci_device_id amd64nb[] = { + { PCI_DEVICE(PCI_VENDOR_ID_AMD, 0x1103) }, + { }, + }; if (agp_off) return -EINVAL; @@ -747,7 +774,7 @@ int __init agp_amd64_init(void) } /* First check that we have at least one AMD64 NB */ - if (!pci_dev_present(k8_nb_ids)) + if (!pci_dev_present(amd64nb)) return -ENODEV; /* Look for any AGP bridge */ @@ -775,7 +802,7 @@ static void __exit agp_amd64_cleanup(void) /* On AMD64 the PCI driver needs to initialize this driver early for the IOMMU, so it has to be called via a backdoor. */ -#ifndef CONFIG_IOMMU +#ifndef CONFIG_GART_IOMMU module_init(agp_amd64_init); module_exit(agp_amd64_cleanup); #endif diff --git a/trunk/drivers/char/hangcheck-timer.c b/trunk/drivers/char/hangcheck-timer.c index d69f2ad9a67d..ac626418b329 100644 --- a/trunk/drivers/char/hangcheck-timer.c +++ b/trunk/drivers/char/hangcheck-timer.c @@ -117,12 +117,12 @@ __setup("hcheck_reboot", hangcheck_parse_reboot); __setup("hcheck_dump_tasks", hangcheck_parse_dump_tasks); #endif /* not MODULE */ -#if defined(CONFIG_X86_64) || defined(CONFIG_S390) +#if defined(CONFIG_X86) || defined(CONFIG_S390) # define HAVE_MONOTONIC # define TIMER_FREQ 1000000000ULL #elif defined(CONFIG_IA64) # define TIMER_FREQ ((unsigned long long)local_cpu_data->itc_freq) -#else +#elif defined(CONFIG_PPC64) # define TIMER_FREQ (HZ*loops_per_jiffy) #endif diff --git a/trunk/drivers/char/hw_random.c b/trunk/drivers/char/hw_random.c new file mode 100644 index 000000000000..29dc87e59020 --- /dev/null +++ b/trunk/drivers/char/hw_random.c @@ -0,0 +1,698 @@ +/* + Added support for the AMD Geode LX RNG + (c) Copyright 2004-2005 Advanced Micro Devices, Inc. + + derived from + + Hardware driver for the Intel/AMD/VIA Random Number Generators (RNG) + (c) Copyright 2003 Red Hat Inc + + derived from + + Hardware driver for the AMD 768 Random Number Generator (RNG) + (c) Copyright 2001 Red Hat Inc + + derived from + + Hardware driver for Intel i810 Random Number Generator (RNG) + Copyright 2000,2001 Jeff Garzik + Copyright 2000,2001 Philipp Rumpf + + Please read Documentation/hw_random.txt for details on use. + + ---------------------------------------------------------- + This software may be used and distributed according to the terms + of the GNU General Public License, incorporated herein by reference. + + */ + + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#ifdef __i386__ +#include +#include +#endif + +#include +#include + + +/* + * core module and version information + */ +#define RNG_VERSION "1.0.0" +#define RNG_MODULE_NAME "hw_random" +#define RNG_DRIVER_NAME RNG_MODULE_NAME " hardware driver " RNG_VERSION +#define PFX RNG_MODULE_NAME ": " + + +/* + * debugging macros + */ + +/* pr_debug() collapses to a no-op if DEBUG is not defined */ +#define DPRINTK(fmt, args...) pr_debug(PFX "%s: " fmt, __FUNCTION__ , ## args) + + +#undef RNG_NDEBUG /* define to enable lightweight runtime checks */ +#ifdef RNG_NDEBUG +#define assert(expr) \ + if(!(expr)) { \ + printk(KERN_DEBUG PFX "Assertion failed! %s,%s,%s," \ + "line=%d\n", #expr, __FILE__, __FUNCTION__, __LINE__); \ + } +#else +#define assert(expr) +#endif + +#define RNG_MISCDEV_MINOR 183 /* official */ + +static int rng_dev_open (struct inode *inode, struct file *filp); +static ssize_t rng_dev_read (struct file *filp, char __user *buf, size_t size, + loff_t * offp); + +static int __init intel_init (struct pci_dev *dev); +static void intel_cleanup(void); +static unsigned int intel_data_present (void); +static u32 intel_data_read (void); + +static int __init amd_init (struct pci_dev *dev); +static void amd_cleanup(void); +static unsigned int amd_data_present (void); +static u32 amd_data_read (void); + +#ifdef __i386__ +static int __init via_init(struct pci_dev *dev); +static void via_cleanup(void); +static unsigned int via_data_present (void); +static u32 via_data_read (void); +#endif + +static int __init geode_init(struct pci_dev *dev); +static void geode_cleanup(void); +static unsigned int geode_data_present (void); +static u32 geode_data_read (void); + +struct rng_operations { + int (*init) (struct pci_dev *dev); + void (*cleanup) (void); + unsigned int (*data_present) (void); + u32 (*data_read) (void); + unsigned int n_bytes; /* number of bytes per ->data_read */ +}; +static struct rng_operations *rng_ops; + +static struct file_operations rng_chrdev_ops = { + .owner = THIS_MODULE, + .open = rng_dev_open, + .read = rng_dev_read, +}; + + +static struct miscdevice rng_miscdev = { + RNG_MISCDEV_MINOR, + RNG_MODULE_NAME, + &rng_chrdev_ops, +}; + +enum { + rng_hw_none, + rng_hw_intel, + rng_hw_amd, +#ifdef __i386__ + rng_hw_via, +#endif + rng_hw_geode, +}; + +static struct rng_operations rng_vendor_ops[] = { + /* rng_hw_none */ + { }, + + /* rng_hw_intel */ + { intel_init, intel_cleanup, intel_data_present, + intel_data_read, 1 }, + + /* rng_hw_amd */ + { amd_init, amd_cleanup, amd_data_present, amd_data_read, 4 }, + +#ifdef __i386__ + /* rng_hw_via */ + { via_init, via_cleanup, via_data_present, via_data_read, 1 }, +#endif + + /* rng_hw_geode */ + { geode_init, geode_cleanup, geode_data_present, geode_data_read, 4 } +}; + +/* + * Data for PCI driver interface + * + * This data only exists for exporting the supported + * PCI ids via MODULE_DEVICE_TABLE. We do not actually + * register a pci_driver, because someone else might one day + * want to register another driver on the same PCI id. + */ +static struct pci_device_id rng_pci_tbl[] = { + { 0x1022, 0x7443, PCI_ANY_ID, PCI_ANY_ID, 0, 0, rng_hw_amd }, + { 0x1022, 0x746b, PCI_ANY_ID, PCI_ANY_ID, 0, 0, rng_hw_amd }, + + { 0x8086, 0x2418, PCI_ANY_ID, PCI_ANY_ID, 0, 0, rng_hw_intel }, + { 0x8086, 0x2428, PCI_ANY_ID, PCI_ANY_ID, 0, 0, rng_hw_intel }, + { 0x8086, 0x2430, PCI_ANY_ID, PCI_ANY_ID, 0, 0, rng_hw_intel }, + { 0x8086, 0x2448, PCI_ANY_ID, PCI_ANY_ID, 0, 0, rng_hw_intel }, + { 0x8086, 0x244e, PCI_ANY_ID, PCI_ANY_ID, 0, 0, rng_hw_intel }, + { 0x8086, 0x245e, PCI_ANY_ID, PCI_ANY_ID, 0, 0, rng_hw_intel }, + + { PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_LX_AES, + PCI_ANY_ID, PCI_ANY_ID, 0, 0, rng_hw_geode }, + + { 0, }, /* terminate list */ +}; +MODULE_DEVICE_TABLE (pci, rng_pci_tbl); + + +/*********************************************************************** + * + * Intel RNG operations + * + */ + +/* + * RNG registers (offsets from rng_mem) + */ +#define INTEL_RNG_HW_STATUS 0 +#define INTEL_RNG_PRESENT 0x40 +#define INTEL_RNG_ENABLED 0x01 +#define INTEL_RNG_STATUS 1 +#define INTEL_RNG_DATA_PRESENT 0x01 +#define INTEL_RNG_DATA 2 + +/* + * Magic address at which Intel PCI bridges locate the RNG + */ +#define INTEL_RNG_ADDR 0xFFBC015F +#define INTEL_RNG_ADDR_LEN 3 + +/* token to our ioremap'd RNG register area */ +static void __iomem *rng_mem; + +static inline u8 intel_hwstatus (void) +{ + assert (rng_mem != NULL); + return readb (rng_mem + INTEL_RNG_HW_STATUS); +} + +static inline u8 intel_hwstatus_set (u8 hw_status) +{ + assert (rng_mem != NULL); + writeb (hw_status, rng_mem + INTEL_RNG_HW_STATUS); + return intel_hwstatus (); +} + +static unsigned int intel_data_present(void) +{ + assert (rng_mem != NULL); + + return (readb (rng_mem + INTEL_RNG_STATUS) & INTEL_RNG_DATA_PRESENT) ? + 1 : 0; +} + +static u32 intel_data_read(void) +{ + assert (rng_mem != NULL); + + return readb (rng_mem + INTEL_RNG_DATA); +} + +static int __init intel_init (struct pci_dev *dev) +{ + int rc; + u8 hw_status; + + DPRINTK ("ENTER\n"); + + rng_mem = ioremap (INTEL_RNG_ADDR, INTEL_RNG_ADDR_LEN); + if (rng_mem == NULL) { + printk (KERN_ERR PFX "cannot ioremap RNG Memory\n"); + rc = -EBUSY; + goto err_out; + } + + /* Check for Intel 82802 */ + hw_status = intel_hwstatus (); + if ((hw_status & INTEL_RNG_PRESENT) == 0) { + printk (KERN_ERR PFX "RNG not detected\n"); + rc = -ENODEV; + goto err_out_free_map; + } + + /* turn RNG h/w on, if it's off */ + if ((hw_status & INTEL_RNG_ENABLED) == 0) + hw_status = intel_hwstatus_set (hw_status | INTEL_RNG_ENABLED); + if ((hw_status & INTEL_RNG_ENABLED) == 0) { + printk (KERN_ERR PFX "cannot enable RNG, aborting\n"); + rc = -EIO; + goto err_out_free_map; + } + + DPRINTK ("EXIT, returning 0\n"); + return 0; + +err_out_free_map: + iounmap (rng_mem); + rng_mem = NULL; +err_out: + DPRINTK ("EXIT, returning %d\n", rc); + return rc; +} + +static void intel_cleanup(void) +{ + u8 hw_status; + + hw_status = intel_hwstatus (); + if (hw_status & INTEL_RNG_ENABLED) + intel_hwstatus_set (hw_status & ~INTEL_RNG_ENABLED); + else + printk(KERN_WARNING PFX "unusual: RNG already disabled\n"); + iounmap(rng_mem); + rng_mem = NULL; +} + +/*********************************************************************** + * + * AMD RNG operations + * + */ + +static u32 pmbase; /* PMxx I/O base */ +static struct pci_dev *amd_dev; + +static unsigned int amd_data_present (void) +{ + return inl(pmbase + 0xF4) & 1; +} + + +static u32 amd_data_read (void) +{ + return inl(pmbase + 0xF0); +} + +static int __init amd_init (struct pci_dev *dev) +{ + int rc; + u8 rnen; + + DPRINTK ("ENTER\n"); + + pci_read_config_dword(dev, 0x58, &pmbase); + + pmbase &= 0x0000FF00; + + if (pmbase == 0) + { + printk (KERN_ERR PFX "power management base not set\n"); + rc = -EIO; + goto err_out; + } + + pci_read_config_byte(dev, 0x40, &rnen); + rnen |= (1 << 7); /* RNG on */ + pci_write_config_byte(dev, 0x40, rnen); + + pci_read_config_byte(dev, 0x41, &rnen); + rnen |= (1 << 7); /* PMIO enable */ + pci_write_config_byte(dev, 0x41, rnen); + + pr_info( PFX "AMD768 system management I/O registers at 0x%X.\n", + pmbase); + + amd_dev = dev; + + DPRINTK ("EXIT, returning 0\n"); + return 0; + +err_out: + DPRINTK ("EXIT, returning %d\n", rc); + return rc; +} + +static void amd_cleanup(void) +{ + u8 rnen; + + pci_read_config_byte(amd_dev, 0x40, &rnen); + rnen &= ~(1 << 7); /* RNG off */ + pci_write_config_byte(amd_dev, 0x40, rnen); + + /* FIXME: twiddle pmio, also? */ +} + +#ifdef __i386__ +/*********************************************************************** + * + * VIA RNG operations + * + */ + +enum { + VIA_STRFILT_CNT_SHIFT = 16, + VIA_STRFILT_FAIL = (1 << 15), + VIA_STRFILT_ENABLE = (1 << 14), + VIA_RAWBITS_ENABLE = (1 << 13), + VIA_RNG_ENABLE = (1 << 6), + VIA_XSTORE_CNT_MASK = 0x0F, + + VIA_RNG_CHUNK_8 = 0x00, /* 64 rand bits, 64 stored bits */ + VIA_RNG_CHUNK_4 = 0x01, /* 32 rand bits, 32 stored bits */ + VIA_RNG_CHUNK_4_MASK = 0xFFFFFFFF, + VIA_RNG_CHUNK_2 = 0x02, /* 16 rand bits, 32 stored bits */ + VIA_RNG_CHUNK_2_MASK = 0xFFFF, + VIA_RNG_CHUNK_1 = 0x03, /* 8 rand bits, 32 stored bits */ + VIA_RNG_CHUNK_1_MASK = 0xFF, +}; + +static u32 via_rng_datum; + +/* + * Investigate using the 'rep' prefix to obtain 32 bits of random data + * in one insn. The upside is potentially better performance. The + * downside is that the instruction becomes no longer atomic. Due to + * this, just like familiar issues with /dev/random itself, the worst + * case of a 'rep xstore' could potentially pause a cpu for an + * unreasonably long time. In practice, this condition would likely + * only occur when the hardware is failing. (or so we hope :)) + * + * Another possible performance boost may come from simply buffering + * until we have 4 bytes, thus returning a u32 at a time, + * instead of the current u8-at-a-time. + */ + +static inline u32 xstore(u32 *addr, u32 edx_in) +{ + u32 eax_out; + + asm(".byte 0x0F,0xA7,0xC0 /* xstore %%edi (addr=%0) */" + :"=m"(*addr), "=a"(eax_out) + :"D"(addr), "d"(edx_in)); + + return eax_out; +} + +static unsigned int via_data_present(void) +{ + u32 bytes_out; + + /* We choose the recommended 1-byte-per-instruction RNG rate, + * for greater randomness at the expense of speed. Larger + * values 2, 4, or 8 bytes-per-instruction yield greater + * speed at lesser randomness. + * + * If you change this to another VIA_CHUNK_n, you must also + * change the ->n_bytes values in rng_vendor_ops[] tables. + * VIA_CHUNK_8 requires further code changes. + * + * A copy of MSR_VIA_RNG is placed in eax_out when xstore + * completes. + */ + via_rng_datum = 0; /* paranoia, not really necessary */ + bytes_out = xstore(&via_rng_datum, VIA_RNG_CHUNK_1) & VIA_XSTORE_CNT_MASK; + if (bytes_out == 0) + return 0; + + return 1; +} + +static u32 via_data_read(void) +{ + return via_rng_datum; +} + +static int __init via_init(struct pci_dev *dev) +{ + u32 lo, hi, old_lo; + + /* Control the RNG via MSR. Tread lightly and pay very close + * close attention to values written, as the reserved fields + * are documented to be "undefined and unpredictable"; but it + * does not say to write them as zero, so I make a guess that + * we restore the values we find in the register. + */ + rdmsr(MSR_VIA_RNG, lo, hi); + + old_lo = lo; + lo &= ~(0x7f << VIA_STRFILT_CNT_SHIFT); + lo &= ~VIA_XSTORE_CNT_MASK; + lo &= ~(VIA_STRFILT_ENABLE | VIA_STRFILT_FAIL | VIA_RAWBITS_ENABLE); + lo |= VIA_RNG_ENABLE; + + if (lo != old_lo) + wrmsr(MSR_VIA_RNG, lo, hi); + + /* perhaps-unnecessary sanity check; remove after testing if + unneeded */ + rdmsr(MSR_VIA_RNG, lo, hi); + if ((lo & VIA_RNG_ENABLE) == 0) { + printk(KERN_ERR PFX "cannot enable VIA C3 RNG, aborting\n"); + return -ENODEV; + } + + return 0; +} + +static void via_cleanup(void) +{ + /* do nothing */ +} +#endif + +/*********************************************************************** + * + * AMD Geode RNG operations + * + */ + +static void __iomem *geode_rng_base = NULL; + +#define GEODE_RNG_DATA_REG 0x50 +#define GEODE_RNG_STATUS_REG 0x54 + +static u32 geode_data_read(void) +{ + u32 val; + + assert(geode_rng_base != NULL); + val = readl(geode_rng_base + GEODE_RNG_DATA_REG); + return val; +} + +static unsigned int geode_data_present(void) +{ + u32 val; + + assert(geode_rng_base != NULL); + val = readl(geode_rng_base + GEODE_RNG_STATUS_REG); + return val; +} + +static void geode_cleanup(void) +{ + iounmap(geode_rng_base); + geode_rng_base = NULL; +} + +static int geode_init(struct pci_dev *dev) +{ + unsigned long rng_base = pci_resource_start(dev, 0); + + if (rng_base == 0) + return 1; + + geode_rng_base = ioremap(rng_base, 0x58); + + if (geode_rng_base == NULL) { + printk(KERN_ERR PFX "Cannot ioremap RNG memory\n"); + return -EBUSY; + } + + return 0; +} + +/*********************************************************************** + * + * /dev/hwrandom character device handling (major 10, minor 183) + * + */ + +static int rng_dev_open (struct inode *inode, struct file *filp) +{ + /* enforce read-only access to this chrdev */ + if ((filp->f_mode & FMODE_READ) == 0) + return -EINVAL; + if (filp->f_mode & FMODE_WRITE) + return -EINVAL; + + return 0; +} + + +static ssize_t rng_dev_read (struct file *filp, char __user *buf, size_t size, + loff_t * offp) +{ + static DEFINE_SPINLOCK(rng_lock); + unsigned int have_data; + u32 data = 0; + ssize_t ret = 0; + + while (size) { + spin_lock(&rng_lock); + + have_data = 0; + if (rng_ops->data_present()) { + data = rng_ops->data_read(); + have_data = rng_ops->n_bytes; + } + + spin_unlock (&rng_lock); + + while (have_data && size) { + if (put_user((u8)data, buf++)) { + ret = ret ? : -EFAULT; + break; + } + size--; + ret++; + have_data--; + data>>=8; + } + + if (filp->f_flags & O_NONBLOCK) + return ret ? : -EAGAIN; + + if(need_resched()) + schedule_timeout_interruptible(1); + else + udelay(200); /* FIXME: We could poll for 250uS ?? */ + + if (signal_pending (current)) + return ret ? : -ERESTARTSYS; + } + return ret; +} + + + +/* + * rng_init_one - look for and attempt to init a single RNG + */ +static int __init rng_init_one (struct pci_dev *dev) +{ + int rc; + + DPRINTK ("ENTER\n"); + + assert(rng_ops != NULL); + + rc = rng_ops->init(dev); + if (rc) + goto err_out; + + rc = misc_register (&rng_miscdev); + if (rc) { + printk (KERN_ERR PFX "misc device register failed\n"); + goto err_out_cleanup_hw; + } + + DPRINTK ("EXIT, returning 0\n"); + return 0; + +err_out_cleanup_hw: + rng_ops->cleanup(); +err_out: + DPRINTK ("EXIT, returning %d\n", rc); + return rc; +} + + + +MODULE_AUTHOR("The Linux Kernel team"); +MODULE_DESCRIPTION("H/W Random Number Generator (RNG) driver"); +MODULE_LICENSE("GPL"); + + +/* + * rng_init - initialize RNG module + */ +static int __init rng_init (void) +{ + int rc; + struct pci_dev *pdev = NULL; + const struct pci_device_id *ent; + + DPRINTK ("ENTER\n"); + + /* Probe for Intel, AMD, Geode RNGs */ + for_each_pci_dev(pdev) { + ent = pci_match_id(rng_pci_tbl, pdev); + if (ent) { + rng_ops = &rng_vendor_ops[ent->driver_data]; + goto match; + } + } + +#ifdef __i386__ + /* Probe for VIA RNG */ + if (cpu_has_xstore) { + rng_ops = &rng_vendor_ops[rng_hw_via]; + pdev = NULL; + goto match; + } +#endif + + DPRINTK ("EXIT, returning -ENODEV\n"); + return -ENODEV; + +match: + rc = rng_init_one (pdev); + if (rc) + return rc; + + pr_info( RNG_DRIVER_NAME " loaded\n"); + + DPRINTK ("EXIT, returning 0\n"); + return 0; +} + + +/* + * rng_init - shutdown RNG module + */ +static void __exit rng_cleanup (void) +{ + DPRINTK ("ENTER\n"); + + misc_deregister (&rng_miscdev); + + if (rng_ops->cleanup) + rng_ops->cleanup(); + + DPRINTK ("EXIT\n"); +} + + +module_init (rng_init); +module_exit (rng_cleanup); diff --git a/trunk/drivers/char/hw_random/Kconfig b/trunk/drivers/char/hw_random/Kconfig deleted file mode 100644 index 9f7635f75178..000000000000 --- a/trunk/drivers/char/hw_random/Kconfig +++ /dev/null @@ -1,90 +0,0 @@ -# -# Hardware Random Number Generator (RNG) configuration -# - -config HW_RANDOM - bool "Hardware Random Number Generator Core support" - default y - ---help--- - Hardware Random Number Generator Core infrastructure. - - If unsure, say Y. - -config HW_RANDOM_INTEL - tristate "Intel HW Random Number Generator support" - depends on HW_RANDOM && (X86 || IA64) && PCI - default y - ---help--- - This driver provides kernel-side support for the Random Number - Generator hardware found on Intel i8xx-based motherboards. - - To compile this driver as a module, choose M here: the - module will be called intel-rng. - - If unsure, say Y. - -config HW_RANDOM_AMD - tristate "AMD HW Random Number Generator support" - depends on HW_RANDOM && X86 && PCI - default y - ---help--- - This driver provides kernel-side support for the Random Number - Generator hardware found on AMD 76x-based motherboards. - - To compile this driver as a module, choose M here: the - module will be called amd-rng. - - If unsure, say Y. - -config HW_RANDOM_GEODE - tristate "AMD Geode HW Random Number Generator support" - depends on HW_RANDOM && X86 && PCI - default y - ---help--- - This driver provides kernel-side support for the Random Number - Generator hardware found on the AMD Geode LX. - - To compile this driver as a module, choose M here: the - module will be called geode-rng. - - If unsure, say Y. - -config HW_RANDOM_VIA - tristate "VIA HW Random Number Generator support" - depends on HW_RANDOM && X86_32 - default y - ---help--- - This driver provides kernel-side support for the Random Number - Generator hardware found on VIA based motherboards. - - To compile this driver as a module, choose M here: the - module will be called via-rng. - - If unsure, say Y. - -config HW_RANDOM_IXP4XX - tristate "Intel IXP4xx NPU HW Random Number Generator support" - depends on HW_RANDOM && ARCH_IXP4XX - default y - ---help--- - This driver provides kernel-side support for the Random - Number Generator hardware found on the Intel IXP4xx NPU. - - To compile this driver as a module, choose M here: the - module will be called ixp4xx-rng. - - If unsure, say Y. - -config HW_RANDOM_OMAP - tristate "OMAP Random Number Generator support" - depends on HW_RANDOM && (ARCH_OMAP16XX || ARCH_OMAP24XX) - default y - ---help--- - This driver provides kernel-side support for the Random Number - Generator hardware found on OMAP16xx and OMAP24xx multimedia - processors. - - To compile this driver as a module, choose M here: the - module will be called omap-rng. - - If unsure, say Y. diff --git a/trunk/drivers/char/hw_random/Makefile b/trunk/drivers/char/hw_random/Makefile deleted file mode 100644 index e263ae96f940..000000000000 --- a/trunk/drivers/char/hw_random/Makefile +++ /dev/null @@ -1,11 +0,0 @@ -# -# Makefile for HW Random Number Generator (RNG) device drivers. -# - -obj-$(CONFIG_HW_RANDOM) += core.o -obj-$(CONFIG_HW_RANDOM_INTEL) += intel-rng.o -obj-$(CONFIG_HW_RANDOM_AMD) += amd-rng.o -obj-$(CONFIG_HW_RANDOM_GEODE) += geode-rng.o -obj-$(CONFIG_HW_RANDOM_VIA) += via-rng.o -obj-$(CONFIG_HW_RANDOM_IXP4XX) += ixp4xx-rng.o -obj-$(CONFIG_HW_RANDOM_OMAP) += omap-rng.o diff --git a/trunk/drivers/char/hw_random/amd-rng.c b/trunk/drivers/char/hw_random/amd-rng.c deleted file mode 100644 index 71e4e0f3fd54..000000000000 --- a/trunk/drivers/char/hw_random/amd-rng.c +++ /dev/null @@ -1,152 +0,0 @@ -/* - * RNG driver for AMD RNGs - * - * Copyright 2005 (c) MontaVista Software, Inc. - * - * with the majority of the code coming from: - * - * Hardware driver for the Intel/AMD/VIA Random Number Generators (RNG) - * (c) Copyright 2003 Red Hat Inc - * - * derived from - * - * Hardware driver for the AMD 768 Random Number Generator (RNG) - * (c) Copyright 2001 Red Hat Inc - * - * derived from - * - * Hardware driver for Intel i810 Random Number Generator (RNG) - * Copyright 2000,2001 Jeff Garzik - * Copyright 2000,2001 Philipp Rumpf - * - * This file is licensed under the terms of the GNU General Public - * License version 2. This program is licensed "as is" without any - * warranty of any kind, whether express or implied. - */ - -#include -#include -#include -#include -#include - - -#define PFX KBUILD_MODNAME ": " - - -/* - * Data for PCI driver interface - * - * This data only exists for exporting the supported - * PCI ids via MODULE_DEVICE_TABLE. We do not actually - * register a pci_driver, because someone else might one day - * want to register another driver on the same PCI id. - */ -static const struct pci_device_id pci_tbl[] = { - { 0x1022, 0x7443, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0, }, - { 0x1022, 0x746b, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0, }, - { 0, }, /* terminate list */ -}; -MODULE_DEVICE_TABLE(pci, pci_tbl); - -static struct pci_dev *amd_pdev; - - -static int amd_rng_data_present(struct hwrng *rng) -{ - u32 pmbase = (u32)rng->priv; - - return !!(inl(pmbase + 0xF4) & 1); -} - -static int amd_rng_data_read(struct hwrng *rng, u32 *data) -{ - u32 pmbase = (u32)rng->priv; - - *data = inl(pmbase + 0xF0); - - return 4; -} - -static int amd_rng_init(struct hwrng *rng) -{ - u8 rnen; - - pci_read_config_byte(amd_pdev, 0x40, &rnen); - rnen |= (1 << 7); /* RNG on */ - pci_write_config_byte(amd_pdev, 0x40, rnen); - - pci_read_config_byte(amd_pdev, 0x41, &rnen); - rnen |= (1 << 7); /* PMIO enable */ - pci_write_config_byte(amd_pdev, 0x41, rnen); - - return 0; -} - -static void amd_rng_cleanup(struct hwrng *rng) -{ - u8 rnen; - - pci_read_config_byte(amd_pdev, 0x40, &rnen); - rnen &= ~(1 << 7); /* RNG off */ - pci_write_config_byte(amd_pdev, 0x40, rnen); -} - - -static struct hwrng amd_rng = { - .name = "amd", - .init = amd_rng_init, - .cleanup = amd_rng_cleanup, - .data_present = amd_rng_data_present, - .data_read = amd_rng_data_read, -}; - - -static int __init mod_init(void) -{ - int err = -ENODEV; - struct pci_dev *pdev = NULL; - const struct pci_device_id *ent; - u32 pmbase; - - for_each_pci_dev(pdev) { - ent = pci_match_id(pci_tbl, pdev); - if (ent) - goto found; - } - /* Device not found. */ - goto out; - -found: - err = pci_read_config_dword(pdev, 0x58, &pmbase); - if (err) - goto out; - err = -EIO; - pmbase &= 0x0000FF00; - if (pmbase == 0) - goto out; - amd_rng.priv = (unsigned long)pmbase; - amd_pdev = pdev; - - printk(KERN_INFO "AMD768 RNG detected\n"); - err = hwrng_register(&amd_rng); - if (err) { - printk(KERN_ERR PFX "RNG registering failed (%d)\n", - err); - goto out; - } -out: - return err; -} - -static void __exit mod_exit(void) -{ - hwrng_unregister(&amd_rng); -} - -subsys_initcall(mod_init); -module_exit(mod_exit); - -MODULE_AUTHOR("The Linux Kernel team"); -MODULE_DESCRIPTION("H/W RNG driver for AMD chipsets"); -MODULE_LICENSE("GPL"); diff --git a/trunk/drivers/char/hw_random/core.c b/trunk/drivers/char/hw_random/core.c deleted file mode 100644 index 88b026639f10..000000000000 --- a/trunk/drivers/char/hw_random/core.c +++ /dev/null @@ -1,354 +0,0 @@ -/* - Added support for the AMD Geode LX RNG - (c) Copyright 2004-2005 Advanced Micro Devices, Inc. - - derived from - - Hardware driver for the Intel/AMD/VIA Random Number Generators (RNG) - (c) Copyright 2003 Red Hat Inc - - derived from - - Hardware driver for the AMD 768 Random Number Generator (RNG) - (c) Copyright 2001 Red Hat Inc - - derived from - - Hardware driver for Intel i810 Random Number Generator (RNG) - Copyright 2000,2001 Jeff Garzik - Copyright 2000,2001 Philipp Rumpf - - Added generic RNG API - Copyright 2006 Michael Buesch - Copyright 2005 (c) MontaVista Software, Inc. - - Please read Documentation/hw_random.txt for details on use. - - ---------------------------------------------------------- - This software may be used and distributed according to the terms - of the GNU General Public License, incorporated herein by reference. - - */ - - -#include -#include -#include -#include -#include -#include -#include -#include -#include - - -#define RNG_MODULE_NAME "hw_random" -#define PFX RNG_MODULE_NAME ": " -#define RNG_MISCDEV_MINOR 183 /* official */ - - -static struct hwrng *current_rng; -static LIST_HEAD(rng_list); -static DEFINE_MUTEX(rng_mutex); - - -static inline int hwrng_init(struct hwrng *rng) -{ - if (!rng->init) - return 0; - return rng->init(rng); -} - -static inline void hwrng_cleanup(struct hwrng *rng) -{ - if (rng && rng->cleanup) - rng->cleanup(rng); -} - -static inline int hwrng_data_present(struct hwrng *rng) -{ - if (!rng->data_present) - return 1; - return rng->data_present(rng); -} - -static inline int hwrng_data_read(struct hwrng *rng, u32 *data) -{ - return rng->data_read(rng, data); -} - - -static int rng_dev_open(struct inode *inode, struct file *filp) -{ - /* enforce read-only access to this chrdev */ - if ((filp->f_mode & FMODE_READ) == 0) - return -EINVAL; - if (filp->f_mode & FMODE_WRITE) - return -EINVAL; - return 0; -} - -static ssize_t rng_dev_read(struct file *filp, char __user *buf, - size_t size, loff_t *offp) -{ - u32 data; - ssize_t ret = 0; - int i, err = 0; - int data_present; - int bytes_read; - - while (size) { - err = -ERESTARTSYS; - if (mutex_lock_interruptible(&rng_mutex)) - goto out; - if (!current_rng) { - mutex_unlock(&rng_mutex); - err = -ENODEV; - goto out; - } - if (filp->f_flags & O_NONBLOCK) { - data_present = hwrng_data_present(current_rng); - } else { - /* Some RNG require some time between data_reads to gather - * new entropy. Poll it. - */ - for (i = 0; i < 20; i++) { - data_present = hwrng_data_present(current_rng); - if (data_present) - break; - udelay(10); - } - } - bytes_read = 0; - if (data_present) - bytes_read = hwrng_data_read(current_rng, &data); - mutex_unlock(&rng_mutex); - - err = -EAGAIN; - if (!bytes_read && (filp->f_flags & O_NONBLOCK)) - goto out; - - err = -EFAULT; - while (bytes_read && size) { - if (put_user((u8)data, buf++)) - goto out; - size--; - ret++; - bytes_read--; - data >>= 8; - } - - if (need_resched()) - schedule_timeout_interruptible(1); - err = -ERESTARTSYS; - if (signal_pending(current)) - goto out; - } -out: - return ret ? : err; -} - - -static struct file_operations rng_chrdev_ops = { - .owner = THIS_MODULE, - .open = rng_dev_open, - .read = rng_dev_read, -}; - -static struct miscdevice rng_miscdev = { - .minor = RNG_MISCDEV_MINOR, - .name = RNG_MODULE_NAME, - .fops = &rng_chrdev_ops, -}; - - -static ssize_t hwrng_attr_current_store(struct class_device *class, - const char *buf, size_t len) -{ - int err; - struct hwrng *rng; - - err = mutex_lock_interruptible(&rng_mutex); - if (err) - return -ERESTARTSYS; - err = -ENODEV; - list_for_each_entry(rng, &rng_list, list) { - if (strcmp(rng->name, buf) == 0) { - if (rng == current_rng) { - err = 0; - break; - } - err = hwrng_init(rng); - if (err) - break; - hwrng_cleanup(current_rng); - current_rng = rng; - err = 0; - break; - } - } - mutex_unlock(&rng_mutex); - - return err ? : len; -} - -static ssize_t hwrng_attr_current_show(struct class_device *class, - char *buf) -{ - int err; - ssize_t ret; - const char *name = "none"; - - err = mutex_lock_interruptible(&rng_mutex); - if (err) - return -ERESTARTSYS; - if (current_rng) - name = current_rng->name; - ret = snprintf(buf, PAGE_SIZE, "%s\n", name); - mutex_unlock(&rng_mutex); - - return ret; -} - -static ssize_t hwrng_attr_available_show(struct class_device *class, - char *buf) -{ - int err; - ssize_t ret = 0; - struct hwrng *rng; - - err = mutex_lock_interruptible(&rng_mutex); - if (err) - return -ERESTARTSYS; - buf[0] = '\0'; - list_for_each_entry(rng, &rng_list, list) { - strncat(buf, rng->name, PAGE_SIZE - ret - 1); - ret += strlen(rng->name); - strncat(buf, " ", PAGE_SIZE - ret - 1); - ret++; - } - strncat(buf, "\n", PAGE_SIZE - ret - 1); - ret++; - mutex_unlock(&rng_mutex); - - return ret; -} - -static CLASS_DEVICE_ATTR(rng_current, S_IRUGO | S_IWUSR, - hwrng_attr_current_show, - hwrng_attr_current_store); -static CLASS_DEVICE_ATTR(rng_available, S_IRUGO, - hwrng_attr_available_show, - NULL); - - -static void unregister_miscdev(void) -{ - class_device_remove_file(rng_miscdev.class, - &class_device_attr_rng_available); - class_device_remove_file(rng_miscdev.class, - &class_device_attr_rng_current); - misc_deregister(&rng_miscdev); -} - -static int register_miscdev(void) -{ - int err; - - err = misc_register(&rng_miscdev); - if (err) - goto out; - err = class_device_create_file(rng_miscdev.class, - &class_device_attr_rng_current); - if (err) - goto err_misc_dereg; - err = class_device_create_file(rng_miscdev.class, - &class_device_attr_rng_available); - if (err) - goto err_remove_current; -out: - return err; - -err_remove_current: - class_device_remove_file(rng_miscdev.class, - &class_device_attr_rng_current); -err_misc_dereg: - misc_deregister(&rng_miscdev); - goto out; -} - -int hwrng_register(struct hwrng *rng) -{ - int must_register_misc; - int err = -EINVAL; - struct hwrng *old_rng, *tmp; - - if (rng->name == NULL || - rng->data_read == NULL) - goto out; - - mutex_lock(&rng_mutex); - - /* Must not register two RNGs with the same name. */ - err = -EEXIST; - list_for_each_entry(tmp, &rng_list, list) { - if (strcmp(tmp->name, rng->name) == 0) - goto out_unlock; - } - - must_register_misc = (current_rng == NULL); - old_rng = current_rng; - if (!old_rng) { - err = hwrng_init(rng); - if (err) - goto out_unlock; - current_rng = rng; - } - err = 0; - if (must_register_misc) { - err = register_miscdev(); - if (err) { - if (!old_rng) { - hwrng_cleanup(rng); - current_rng = NULL; - } - goto out_unlock; - } - } - INIT_LIST_HEAD(&rng->list); - list_add_tail(&rng->list, &rng_list); -out_unlock: - mutex_unlock(&rng_mutex); -out: - return err; -} -EXPORT_SYMBOL_GPL(hwrng_register); - -void hwrng_unregister(struct hwrng *rng) -{ - int err; - - mutex_lock(&rng_mutex); - - list_del(&rng->list); - if (current_rng == rng) { - hwrng_cleanup(rng); - if (list_empty(&rng_list)) { - current_rng = NULL; - } else { - current_rng = list_entry(rng_list.prev, struct hwrng, list); - err = hwrng_init(current_rng); - if (err) - current_rng = NULL; - } - } - if (list_empty(&rng_list)) - unregister_miscdev(); - - mutex_unlock(&rng_mutex); -} -EXPORT_SYMBOL_GPL(hwrng_unregister); - - -MODULE_DESCRIPTION("H/W Random Number Generator (RNG) driver"); -MODULE_LICENSE("GPL"); diff --git a/trunk/drivers/char/hw_random/geode-rng.c b/trunk/drivers/char/hw_random/geode-rng.c deleted file mode 100644 index be61f22ee7bb..000000000000 --- a/trunk/drivers/char/hw_random/geode-rng.c +++ /dev/null @@ -1,128 +0,0 @@ -/* - * RNG driver for AMD Geode RNGs - * - * Copyright 2005 (c) MontaVista Software, Inc. - * - * with the majority of the code coming from: - * - * Hardware driver for the Intel/AMD/VIA Random Number Generators (RNG) - * (c) Copyright 2003 Red Hat Inc - * - * derived from - * - * Hardware driver for the AMD 768 Random Number Generator (RNG) - * (c) Copyright 2001 Red Hat Inc - * - * derived from - * - * Hardware driver for Intel i810 Random Number Generator (RNG) - * Copyright 2000,2001 Jeff Garzik - * Copyright 2000,2001 Philipp Rumpf - * - * This file is licensed under the terms of the GNU General Public - * License version 2. This program is licensed "as is" without any - * warranty of any kind, whether express or implied. - */ - -#include -#include -#include -#include -#include - - -#define PFX KBUILD_MODNAME ": " - -#define GEODE_RNG_DATA_REG 0x50 -#define GEODE_RNG_STATUS_REG 0x54 - -/* - * Data for PCI driver interface - * - * This data only exists for exporting the supported - * PCI ids via MODULE_DEVICE_TABLE. We do not actually - * register a pci_driver, because someone else might one day - * want to register another driver on the same PCI id. - */ -static const struct pci_device_id pci_tbl[] = { - { PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_LX_AES, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0, }, - { 0, }, /* terminate list */ -}; -MODULE_DEVICE_TABLE(pci, pci_tbl); - - -static int geode_rng_data_read(struct hwrng *rng, u32 *data) -{ - void __iomem *mem = (void __iomem *)rng->priv; - - *data = readl(mem + GEODE_RNG_DATA_REG); - - return 4; -} - -static int geode_rng_data_present(struct hwrng *rng) -{ - void __iomem *mem = (void __iomem *)rng->priv; - - return !!(readl(mem + GEODE_RNG_STATUS_REG)); -} - - -static struct hwrng geode_rng = { - .name = "geode", - .data_present = geode_rng_data_present, - .data_read = geode_rng_data_read, -}; - - -static int __init mod_init(void) -{ - int err = -ENODEV; - struct pci_dev *pdev = NULL; - const struct pci_device_id *ent; - void __iomem *mem; - unsigned long rng_base; - - for_each_pci_dev(pdev) { - ent = pci_match_id(pci_tbl, pdev); - if (ent) - goto found; - } - /* Device not found. */ - goto out; - -found: - rng_base = pci_resource_start(pdev, 0); - if (rng_base == 0) - goto out; - err = -ENOMEM; - mem = ioremap(rng_base, 0x58); - if (!mem) - goto out; - geode_rng.priv = (unsigned long)mem; - - printk(KERN_INFO "AMD Geode RNG detected\n"); - err = hwrng_register(&geode_rng); - if (err) { - printk(KERN_ERR PFX "RNG registering failed (%d)\n", - err); - goto out; - } -out: - return err; -} - -static void __exit mod_exit(void) -{ - void __iomem *mem = (void __iomem *)geode_rng.priv; - - hwrng_unregister(&geode_rng); - iounmap(mem); -} - -subsys_initcall(mod_init); -module_exit(mod_exit); - -MODULE_DESCRIPTION("H/W RNG driver for AMD Geode LX CPUs"); -MODULE_LICENSE("GPL"); diff --git a/trunk/drivers/char/hw_random/intel-rng.c b/trunk/drivers/char/hw_random/intel-rng.c deleted file mode 100644 index 6594bd5645f4..000000000000 --- a/trunk/drivers/char/hw_random/intel-rng.c +++ /dev/null @@ -1,189 +0,0 @@ -/* - * RNG driver for Intel RNGs - * - * Copyright 2005 (c) MontaVista Software, Inc. - * - * with the majority of the code coming from: - * - * Hardware driver for the Intel/AMD/VIA Random Number Generators (RNG) - * (c) Copyright 2003 Red Hat Inc - * - * derived from - * - * Hardware driver for the AMD 768 Random Number Generator (RNG) - * (c) Copyright 2001 Red Hat Inc - * - * derived from - * - * Hardware driver for Intel i810 Random Number Generator (RNG) - * Copyright 2000,2001 Jeff Garzik - * Copyright 2000,2001 Philipp Rumpf - * - * This file is licensed under the terms of the GNU General Public - * License version 2. This program is licensed "as is" without any - * warranty of any kind, whether express or implied. - */ - -#include -#include -#include -#include -#include - - -#define PFX KBUILD_MODNAME ": " - -/* - * RNG registers - */ -#define INTEL_RNG_HW_STATUS 0 -#define INTEL_RNG_PRESENT 0x40 -#define INTEL_RNG_ENABLED 0x01 -#define INTEL_RNG_STATUS 1 -#define INTEL_RNG_DATA_PRESENT 0x01 -#define INTEL_RNG_DATA 2 - -/* - * Magic address at which Intel PCI bridges locate the RNG - */ -#define INTEL_RNG_ADDR 0xFFBC015F -#define INTEL_RNG_ADDR_LEN 3 - -/* - * Data for PCI driver interface - * - * This data only exists for exporting the supported - * PCI ids via MODULE_DEVICE_TABLE. We do not actually - * register a pci_driver, because someone else might one day - * want to register another driver on the same PCI id. - */ -static const struct pci_device_id pci_tbl[] = { - { 0x8086, 0x2418, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0, }, - { 0x8086, 0x2428, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0, }, - { 0x8086, 0x2430, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0, }, - { 0x8086, 0x2448, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0, }, - { 0x8086, 0x244e, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0, }, - { 0x8086, 0x245e, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0, }, - { 0, }, /* terminate list */ -}; -MODULE_DEVICE_TABLE(pci, pci_tbl); - - -static inline u8 hwstatus_get(void __iomem *mem) -{ - return readb(mem + INTEL_RNG_HW_STATUS); -} - -static inline u8 hwstatus_set(void __iomem *mem, - u8 hw_status) -{ - writeb(hw_status, mem + INTEL_RNG_HW_STATUS); - return hwstatus_get(mem); -} - -static int intel_rng_data_present(struct hwrng *rng) -{ - void __iomem *mem = (void __iomem *)rng->priv; - - return !!(readb(mem + INTEL_RNG_STATUS) & INTEL_RNG_DATA_PRESENT); -} - -static int intel_rng_data_read(struct hwrng *rng, u32 *data) -{ - void __iomem *mem = (void __iomem *)rng->priv; - - *data = readb(mem + INTEL_RNG_DATA); - - return 1; -} - -static int intel_rng_init(struct hwrng *rng) -{ - void __iomem *mem = (void __iomem *)rng->priv; - u8 hw_status; - int err = -EIO; - - hw_status = hwstatus_get(mem); - /* turn RNG h/w on, if it's off */ - if ((hw_status & INTEL_RNG_ENABLED) == 0) - hw_status = hwstatus_set(mem, hw_status | INTEL_RNG_ENABLED); - if ((hw_status & INTEL_RNG_ENABLED) == 0) { - printk(KERN_ERR PFX "cannot enable RNG, aborting\n"); - goto out; - } - err = 0; -out: - return err; -} - -static void intel_rng_cleanup(struct hwrng *rng) -{ - void __iomem *mem = (void __iomem *)rng->priv; - u8 hw_status; - - hw_status = hwstatus_get(mem); - if (hw_status & INTEL_RNG_ENABLED) - hwstatus_set(mem, hw_status & ~INTEL_RNG_ENABLED); - else - printk(KERN_WARNING PFX "unusual: RNG already disabled\n"); -} - - -static struct hwrng intel_rng = { - .name = "intel", - .init = intel_rng_init, - .cleanup = intel_rng_cleanup, - .data_present = intel_rng_data_present, - .data_read = intel_rng_data_read, -}; - - -static int __init mod_init(void) -{ - int err = -ENODEV; - void __iomem *mem; - u8 hw_status; - - if (!pci_dev_present(pci_tbl)) - goto out; /* Device not found. */ - - err = -ENOMEM; - mem = ioremap(INTEL_RNG_ADDR, INTEL_RNG_ADDR_LEN); - if (!mem) - goto out; - intel_rng.priv = (unsigned long)mem; - - /* Check for Intel 82802 */ - err = -ENODEV; - hw_status = hwstatus_get(mem); - if ((hw_status & INTEL_RNG_PRESENT) == 0) - goto err_unmap; - - printk(KERN_INFO "Intel 82802 RNG detected\n"); - err = hwrng_register(&intel_rng); - if (err) { - printk(KERN_ERR PFX "RNG registering failed (%d)\n", - err); - goto out; - } -out: - return err; - -err_unmap: - iounmap(mem); - goto out; -} - -static void __exit mod_exit(void) -{ - void __iomem *mem = (void __iomem *)intel_rng.priv; - - hwrng_unregister(&intel_rng); - iounmap(mem); -} - -subsys_initcall(mod_init); -module_exit(mod_exit); - -MODULE_DESCRIPTION("H/W RNG driver for Intel chipsets"); -MODULE_LICENSE("GPL"); diff --git a/trunk/drivers/char/hw_random/ixp4xx-rng.c b/trunk/drivers/char/hw_random/ixp4xx-rng.c deleted file mode 100644 index ef71022423c9..000000000000 --- a/trunk/drivers/char/hw_random/ixp4xx-rng.c +++ /dev/null @@ -1,73 +0,0 @@ -/* - * drivers/char/rng/ixp4xx-rng.c - * - * RNG driver for Intel IXP4xx family of NPUs - * - * Author: Deepak Saxena - * - * Copyright 2005 (c) MontaVista Software, Inc. - * - * Fixes by Michael Buesch - * - * This file is licensed under the terms of the GNU General Public - * License version 2. This program is licensed "as is" without any - * warranty of any kind, whether express or implied. - */ - -#include -#include -#include -#include -#include -#include -#include -#include - -#include -#include - - -static int ixp4xx_rng_data_read(struct hwrng *rng, u32 *buffer) -{ - void __iomem * rng_base = (void __iomem *)rng->priv; - - *buffer = __raw_readl(rng_base); - - return 4; -} - -static struct hwrng ixp4xx_rng_ops = { - .name = "ixp4xx", - .data_read = ixp4xx_rng_data_read, -}; - -static int __init ixp4xx_rng_init(void) -{ - void __iomem * rng_base; - int err; - - rng_base = ioremap(0x70002100, 4); - if (!rng_base) - return -ENOMEM; - ixp4xx_rng_ops.priv = (unsigned long)rng_base; - err = hwrng_register(&ixp4xx_rng_ops); - if (err) - iounmap(rng_base); - - return err; -} - -static void __exit ixp4xx_rng_exit(void) -{ - void __iomem * rng_base = (void __iomem *)ixp4xx_rng_ops.priv; - - hwrng_unregister(&ixp4xx_rng_ops); - iounmap(rng_base); -} - -subsys_initcall(ixp4xx_rng_init); -module_exit(ixp4xx_rng_exit); - -MODULE_AUTHOR("Deepak Saxena "); -MODULE_DESCRIPTION("H/W Random Number Generator (RNG) driver for IXP4xx"); -MODULE_LICENSE("GPL"); diff --git a/trunk/drivers/char/hw_random/omap-rng.c b/trunk/drivers/char/hw_random/omap-rng.c deleted file mode 100644 index 819516b35a79..000000000000 --- a/trunk/drivers/char/hw_random/omap-rng.c +++ /dev/null @@ -1,208 +0,0 @@ -/* - * driver/char/hw_random/omap-rng.c - * - * RNG driver for TI OMAP CPU family - * - * Author: Deepak Saxena - * - * Copyright 2005 (c) MontaVista Software, Inc. - * - * Mostly based on original driver: - * - * Copyright (C) 2005 Nokia Corporation - * Author: Juha Yrj�� - * - * This file is licensed under the terms of the GNU General Public - * License version 2. This program is licensed "as is" without any - * warranty of any kind, whether express or implied. - * - * TODO: - * - * - Make status updated be interrupt driven so we don't poll - * - */ - -#include -#include -#include -#include -#include -#include - -#include -#include - -#define RNG_OUT_REG 0x00 /* Output register */ -#define RNG_STAT_REG 0x04 /* Status register - [0] = STAT_BUSY */ -#define RNG_ALARM_REG 0x24 /* Alarm register - [7:0] = ALARM_COUNTER */ -#define RNG_CONFIG_REG 0x28 /* Configuration register - [11:6] = RESET_COUNT - [5:3] = RING2_DELAY - [2:0] = RING1_DELAY */ -#define RNG_REV_REG 0x3c /* Revision register - [7:0] = REV_NB */ -#define RNG_MASK_REG 0x40 /* Mask and reset register - [2] = IT_EN - [1] = SOFTRESET - [0] = AUTOIDLE */ -#define RNG_SYSSTATUS 0x44 /* System status - [0] = RESETDONE */ - -static void __iomem *rng_base; -static struct clk *rng_ick; -static struct device *rng_dev; - -static u32 omap_rng_read_reg(int reg) -{ - return __raw_readl(rng_base + reg); -} - -static void omap_rng_write_reg(int reg, u32 val) -{ - __raw_writel(val, rng_base + reg); -} - -/* REVISIT: Does the status bit really work on 16xx? */ -static int omap_rng_data_present(struct hwrng *rng) -{ - return omap_rng_read_reg(RNG_STAT_REG) ? 0 : 1; -} - -static int omap_rng_data_read(struct hwrng *rng, u32 *data) -{ - *data = omap_rng_read_reg(RNG_OUT_REG); - - return 4; -} - -static struct hwrng omap_rng_ops = { - .name = "omap", - .data_present = omap_rng_data_present, - .data_read = omap_rng_data_read, -}; - -static int __init omap_rng_probe(struct device *dev) -{ - struct platform_device *pdev = to_platform_device(dev); - struct resource *res, *mem; - int ret; - - /* - * A bit ugly, and it will never actually happen but there can - * be only one RNG and this catches any bork - */ - BUG_ON(rng_dev); - - if (cpu_is_omap24xx()) { - rng_ick = clk_get(NULL, "rng_ick"); - if (IS_ERR(rng_ick)) { - dev_err(dev, "Could not get rng_ick\n"); - ret = PTR_ERR(rng_ick); - return ret; - } - else { - clk_use(rng_ick); - } - } - - res = platform_get_resource(pdev, IORESOURCE_MEM, 0); - - if (!res) - return -ENOENT; - - mem = request_mem_region(res->start, res->end - res->start + 1, - pdev->name); - if (mem == NULL) - return -EBUSY; - - dev_set_drvdata(dev, mem); - rng_base = (u32 __iomem *)io_p2v(res->start); - - ret = hwrng_register(&omap_rng_ops); - if (ret) { - release_resource(mem); - rng_base = NULL; - return ret; - } - - dev_info(dev, "OMAP Random Number Generator ver. %02x\n", - omap_rng_read_reg(RNG_REV_REG)); - omap_rng_write_reg(RNG_MASK_REG, 0x1); - - rng_dev = dev; - - return 0; -} - -static int __exit omap_rng_remove(struct device *dev) -{ - struct resource *mem = dev_get_drvdata(dev); - - hwrng_unregister(&omap_rng_ops); - - omap_rng_write_reg(RNG_MASK_REG, 0x0); - - if (cpu_is_omap24xx()) { - clk_unuse(rng_ick); - clk_put(rng_ick); - } - - release_resource(mem); - rng_base = NULL; - - return 0; -} - -#ifdef CONFIG_PM - -static int omap_rng_suspend(struct device *dev, pm_message_t message, u32 level) -{ - omap_rng_write_reg(RNG_MASK_REG, 0x0); - - return 0; -} - -static int omap_rng_resume(struct device *dev, pm_message_t message, u32 level) -{ - omap_rng_write_reg(RNG_MASK_REG, 0x1); - - return 1; -} - -#else - -#define omap_rng_suspend NULL -#define omap_rng_resume NULL - -#endif - - -static struct device_driver omap_rng_driver = { - .name = "omap_rng", - .bus = &platform_bus_type, - .probe = omap_rng_probe, - .remove = __exit_p(omap_rng_remove), - .suspend = omap_rng_suspend, - .resume = omap_rng_resume -}; - -static int __init omap_rng_init(void) -{ - if (!cpu_is_omap16xx() && !cpu_is_omap24xx()) - return -ENODEV; - - return driver_register(&omap_rng_driver); -} - -static void __exit omap_rng_exit(void) -{ - driver_unregister(&omap_rng_driver); -} - -module_init(omap_rng_init); -module_exit(omap_rng_exit); - -MODULE_AUTHOR("Deepak Saxena (and others)"); -MODULE_LICENSE("GPL"); diff --git a/trunk/drivers/char/hw_random/via-rng.c b/trunk/drivers/char/hw_random/via-rng.c deleted file mode 100644 index 0e786b617bb8..000000000000 --- a/trunk/drivers/char/hw_random/via-rng.c +++ /dev/null @@ -1,183 +0,0 @@ -/* - * RNG driver for VIA RNGs - * - * Copyright 2005 (c) MontaVista Software, Inc. - * - * with the majority of the code coming from: - * - * Hardware driver for the Intel/AMD/VIA Random Number Generators (RNG) - * (c) Copyright 2003 Red Hat Inc - * - * derived from - * - * Hardware driver for the AMD 768 Random Number Generator (RNG) - * (c) Copyright 2001 Red Hat Inc - * - * derived from - * - * Hardware driver for Intel i810 Random Number Generator (RNG) - * Copyright 2000,2001 Jeff Garzik - * Copyright 2000,2001 Philipp Rumpf - * - * This file is licensed under the terms of the GNU General Public - * License version 2. This program is licensed "as is" without any - * warranty of any kind, whether express or implied. - */ - -#include -#include -#include -#include -#include -#include -#include - - -#define PFX KBUILD_MODNAME ": " - - -enum { - VIA_STRFILT_CNT_SHIFT = 16, - VIA_STRFILT_FAIL = (1 << 15), - VIA_STRFILT_ENABLE = (1 << 14), - VIA_RAWBITS_ENABLE = (1 << 13), - VIA_RNG_ENABLE = (1 << 6), - VIA_XSTORE_CNT_MASK = 0x0F, - - VIA_RNG_CHUNK_8 = 0x00, /* 64 rand bits, 64 stored bits */ - VIA_RNG_CHUNK_4 = 0x01, /* 32 rand bits, 32 stored bits */ - VIA_RNG_CHUNK_4_MASK = 0xFFFFFFFF, - VIA_RNG_CHUNK_2 = 0x02, /* 16 rand bits, 32 stored bits */ - VIA_RNG_CHUNK_2_MASK = 0xFFFF, - VIA_RNG_CHUNK_1 = 0x03, /* 8 rand bits, 32 stored bits */ - VIA_RNG_CHUNK_1_MASK = 0xFF, -}; - -/* - * Investigate using the 'rep' prefix to obtain 32 bits of random data - * in one insn. The upside is potentially better performance. The - * downside is that the instruction becomes no longer atomic. Due to - * this, just like familiar issues with /dev/random itself, the worst - * case of a 'rep xstore' could potentially pause a cpu for an - * unreasonably long time. In practice, this condition would likely - * only occur when the hardware is failing. (or so we hope :)) - * - * Another possible performance boost may come from simply buffering - * until we have 4 bytes, thus returning a u32 at a time, - * instead of the current u8-at-a-time. - */ - -static inline u32 xstore(u32 *addr, u32 edx_in) -{ - u32 eax_out; - - asm(".byte 0x0F,0xA7,0xC0 /* xstore %%edi (addr=%0) */" - :"=m"(*addr), "=a"(eax_out) - :"D"(addr), "d"(edx_in)); - - return eax_out; -} - -static int via_rng_data_present(struct hwrng *rng) -{ - u32 bytes_out; - u32 *via_rng_datum = (u32 *)(&rng->priv); - - /* We choose the recommended 1-byte-per-instruction RNG rate, - * for greater randomness at the expense of speed. Larger - * values 2, 4, or 8 bytes-per-instruction yield greater - * speed at lesser randomness. - * - * If you change this to another VIA_CHUNK_n, you must also - * change the ->n_bytes values in rng_vendor_ops[] tables. - * VIA_CHUNK_8 requires further code changes. - * - * A copy of MSR_VIA_RNG is placed in eax_out when xstore - * completes. - */ - - *via_rng_datum = 0; /* paranoia, not really necessary */ - bytes_out = xstore(via_rng_datum, VIA_RNG_CHUNK_1); - bytes_out &= VIA_XSTORE_CNT_MASK; - if (bytes_out == 0) - return 0; - return 1; -} - -static int via_rng_data_read(struct hwrng *rng, u32 *data) -{ - u32 via_rng_datum = (u32)rng->priv; - - *data = via_rng_datum; - - return 1; -} - -static int via_rng_init(struct hwrng *rng) -{ - u32 lo, hi, old_lo; - - /* Control the RNG via MSR. Tread lightly and pay very close - * close attention to values written, as the reserved fields - * are documented to be "undefined and unpredictable"; but it - * does not say to write them as zero, so I make a guess that - * we restore the values we find in the register. - */ - rdmsr(MSR_VIA_RNG, lo, hi); - - old_lo = lo; - lo &= ~(0x7f << VIA_STRFILT_CNT_SHIFT); - lo &= ~VIA_XSTORE_CNT_MASK; - lo &= ~(VIA_STRFILT_ENABLE | VIA_STRFILT_FAIL | VIA_RAWBITS_ENABLE); - lo |= VIA_RNG_ENABLE; - - if (lo != old_lo) - wrmsr(MSR_VIA_RNG, lo, hi); - - /* perhaps-unnecessary sanity check; remove after testing if - unneeded */ - rdmsr(MSR_VIA_RNG, lo, hi); - if ((lo & VIA_RNG_ENABLE) == 0) { - printk(KERN_ERR PFX "cannot enable VIA C3 RNG, aborting\n"); - return -ENODEV; - } - - return 0; -} - - -static struct hwrng via_rng = { - .name = "via", - .init = via_rng_init, - .data_present = via_rng_data_present, - .data_read = via_rng_data_read, -}; - - -static int __init mod_init(void) -{ - int err; - - if (!cpu_has_xstore) - return -ENODEV; - printk(KERN_INFO "VIA RNG detected\n"); - err = hwrng_register(&via_rng); - if (err) { - printk(KERN_ERR PFX "RNG registering failed (%d)\n", - err); - goto out; - } -out: - return err; -} - -static void __exit mod_exit(void) -{ - hwrng_unregister(&via_rng); -} - -subsys_initcall(mod_init); -module_exit(mod_exit); - -MODULE_DESCRIPTION("H/W RNG driver for VIA chipsets"); -MODULE_LICENSE("GPL"); diff --git a/trunk/drivers/char/ipmi/ipmi_msghandler.c b/trunk/drivers/char/ipmi/ipmi_msghandler.c index 23028559dbc4..9f2f8fdec69a 100644 --- a/trunk/drivers/char/ipmi/ipmi_msghandler.c +++ b/trunk/drivers/char/ipmi/ipmi_msghandler.c @@ -936,8 +936,11 @@ int ipmi_set_gets_events(ipmi_user_t user, int val) if (val) { /* Deliver any queued events. */ - list_for_each_entry_safe(msg, msg2, &intf->waiting_events, link) - list_move_tail(&msg->link, &msgs); + list_for_each_entry_safe(msg, msg2, &intf->waiting_events, + link) { + list_del(&msg->link); + list_add_tail(&msg->link, &msgs); + } intf->waiting_events_count = 0; } diff --git a/trunk/drivers/char/keyboard.c b/trunk/drivers/char/keyboard.c index 4bb3d2272604..edd996f6fb87 100644 --- a/trunk/drivers/char/keyboard.c +++ b/trunk/drivers/char/keyboard.c @@ -151,7 +151,6 @@ unsigned char kbd_sysrq_xlate[KEY_MAX + 1] = "230\177\000\000\213\214\000\000\000\000\000\000\000\000\000\000" /* 0x50 - 0x5f */ "\r\000/"; /* 0x60 - 0x6f */ static int sysrq_down; -static int sysrq_alt_use; #endif static int sysrq_alt; @@ -674,7 +673,7 @@ static void k_dead2(struct vc_data *vc, unsigned char value, char up_flag, struc */ static void k_dead(struct vc_data *vc, unsigned char value, char up_flag, struct pt_regs *regs) { - static const unsigned char ret_diacr[NR_DEAD] = {'`', '\'', '^', '~', '"', ',' }; + static unsigned char ret_diacr[NR_DEAD] = {'`', '\'', '^', '~', '"', ',' }; value = ret_diacr[value]; k_deadunicode(vc, value, up_flag, regs); } @@ -711,8 +710,8 @@ static void k_cur(struct vc_data *vc, unsigned char value, char up_flag, struct static void k_pad(struct vc_data *vc, unsigned char value, char up_flag, struct pt_regs *regs) { - static const char pad_chars[] = "0123456789+-*/\015,.?()#"; - static const char app_map[] = "pqrstuvwxylSRQMnnmPQS"; + static const char *pad_chars = "0123456789+-*/\015,.?()#"; + static const char *app_map = "pqrstuvwxylSRQMnnmPQS"; if (up_flag) return; /* no action, if this is a key release */ @@ -1037,7 +1036,7 @@ static void kbd_refresh_leds(struct input_handle *handle) #define HW_RAW(dev) (test_bit(EV_MSC, dev->evbit) && test_bit(MSC_RAW, dev->mscbit) &&\ ((dev)->id.bustype == BUS_I8042) && ((dev)->id.vendor == 0x0001) && ((dev)->id.product == 0x0001)) -static const unsigned short x86_keycodes[256] = +static unsigned short x86_keycodes[256] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, @@ -1075,13 +1074,11 @@ static int emulate_raw(struct vc_data *vc, unsigned int keycode, put_queue(vc, 0x1d | up_flag); put_queue(vc, 0x45 | up_flag); return 0; - case KEY_HANGEUL: - if (!up_flag) - put_queue(vc, 0xf2); + case KEY_HANGUEL: + if (!up_flag) put_queue(vc, 0xf1); return 0; case KEY_HANJA: - if (!up_flag) - put_queue(vc, 0xf1); + if (!up_flag) put_queue(vc, 0xf2); return 0; } @@ -1146,7 +1143,7 @@ static void kbd_keycode(unsigned int keycode, int down, kbd = kbd_table + fg_console; if (keycode == KEY_LEFTALT || keycode == KEY_RIGHTALT) - sysrq_alt = down ? keycode : 0; + sysrq_alt = down; #ifdef CONFIG_SPARC if (keycode == KEY_STOP) sparc_l1_a_state = down; @@ -1166,14 +1163,9 @@ static void kbd_keycode(unsigned int keycode, int down, #ifdef CONFIG_MAGIC_SYSRQ /* Handle the SysRq Hack */ if (keycode == KEY_SYSRQ && (sysrq_down || (down == 1 && sysrq_alt))) { - if (!sysrq_down) { - sysrq_down = down; - sysrq_alt_use = sysrq_alt; - } + sysrq_down = down; return; } - if (sysrq_down && !down && keycode == sysrq_alt_use) - sysrq_down = 0; if (sysrq_down && down && !rep) { handle_sysrq(kbd_sysrq_xlate[keycode], regs, tty); return; diff --git a/trunk/drivers/char/vt.c b/trunk/drivers/char/vt.c index 714d95ff2f1e..6c94879e0b99 100644 --- a/trunk/drivers/char/vt.c +++ b/trunk/drivers/char/vt.c @@ -98,22 +98,7 @@ #include #include -#define MAX_NR_CON_DRIVER 16 -#define CON_DRIVER_FLAG_MODULE 1 -#define CON_DRIVER_FLAG_INIT 2 - -struct con_driver { - const struct consw *con; - const char *desc; - struct class_device *class_dev; - int node; - int first; - int last; - int flag; -}; - -static struct con_driver registered_con_driver[MAX_NR_CON_DRIVER]; const struct consw *conswitchp; /* A bitmap for codes <32. A bit of 1 indicates that the code @@ -2572,7 +2557,7 @@ static int __init con_init(void) { const char *display_desc = NULL; struct vc_data *vc; - unsigned int currcons = 0, i; + unsigned int currcons = 0; acquire_console_sem(); @@ -2584,22 +2569,6 @@ static int __init con_init(void) return 0; } - for (i = 0; i < MAX_NR_CON_DRIVER; i++) { - struct con_driver *con_driver = ®istered_con_driver[i]; - - if (con_driver->con == NULL) { - con_driver->con = conswitchp; - con_driver->desc = display_desc; - con_driver->flag = CON_DRIVER_FLAG_INIT; - con_driver->first = 0; - con_driver->last = MAX_NR_CONSOLES - 1; - break; - } - } - - for (i = 0; i < MAX_NR_CONSOLES; i++) - con_driver_map[i] = conswitchp; - init_timer(&console_timer); console_timer.function = blank_screen_t; if (blankinterval) { @@ -2687,53 +2656,38 @@ int __init vty_init(void) } #ifndef VT_SINGLE_DRIVER -#include -static struct class *vtconsole_class; +/* + * If we support more console drivers, this function is used + * when a driver wants to take over some existing consoles + * and become default driver for newly opened ones. + */ -static int bind_con_driver(const struct consw *csw, int first, int last, - int deflt) +int take_over_console(const struct consw *csw, int first, int last, int deflt) { - struct module *owner = csw->owner; - const char *desc = NULL; - struct con_driver *con_driver; - int i, j = -1, k = -1, retval = -ENODEV; + int i, j = -1; + const char *desc; + struct module *owner; + owner = csw->owner; if (!try_module_get(owner)) return -ENODEV; acquire_console_sem(); - /* check if driver is registered */ - for (i = 0; i < MAX_NR_CON_DRIVER; i++) { - con_driver = ®istered_con_driver[i]; - - if (con_driver->con == csw) { - desc = con_driver->desc; - retval = 0; - break; - } - } - - if (retval) - goto err; - - if (!(con_driver->flag & CON_DRIVER_FLAG_INIT)) { - csw->con_startup(); - con_driver->flag |= CON_DRIVER_FLAG_INIT; + desc = csw->con_startup(); + if (!desc) { + release_console_sem(); + module_put(owner); + return -ENODEV; } - if (deflt) { if (conswitchp) module_put(conswitchp->owner); - __module_get(owner); conswitchp = csw; } - first = max(first, con_driver->first); - last = min(last, con_driver->last); - for (i = first; i <= last; i++) { int old_was_color; struct vc_data *vc = vc_cons[i].d; @@ -2747,17 +2701,15 @@ static int bind_con_driver(const struct consw *csw, int first, int last, continue; j = i; - - if (CON_IS_VISIBLE(vc)) { - k = i; + if (CON_IS_VISIBLE(vc)) save_screen(vc); - } - old_was_color = vc->vc_can_do_color; vc->vc_sw->con_deinit(vc); vc->vc_origin = (unsigned long)vc->vc_screenbuf; + vc->vc_visible_origin = vc->vc_origin; + vc->vc_scr_end = vc->vc_origin + vc->vc_screenbuf_size; + vc->vc_pos = vc->vc_origin + vc->vc_size_row * vc->vc_y + 2 * vc->vc_x; visual_init(vc, i, 0); - set_origin(vc); update_attr(vc); /* If the console changed between mono <-> color, then @@ -2766,506 +2718,36 @@ static int bind_con_driver(const struct consw *csw, int first, int last, */ if (old_was_color != vc->vc_can_do_color) clear_buffer_attributes(vc); - } + if (CON_IS_VISIBLE(vc)) + update_screen(vc); + } printk("Console: switching "); if (!deflt) printk("consoles %d-%d ", first+1, last+1); - if (j >= 0) { - struct vc_data *vc = vc_cons[j].d; - + if (j >= 0) printk("to %s %s %dx%d\n", - vc->vc_can_do_color ? "colour" : "mono", - desc, vc->vc_cols, vc->vc_rows); - - if (k >= 0) { - vc = vc_cons[k].d; - update_screen(vc); - } - } else + vc_cons[j].d->vc_can_do_color ? "colour" : "mono", + desc, vc_cons[j].d->vc_cols, vc_cons[j].d->vc_rows); + else printk("to %s\n", desc); - retval = 0; -err: release_console_sem(); - module_put(owner); - return retval; -}; - -#ifdef CONFIG_VT_HW_CONSOLE_BINDING -static int con_is_graphics(const struct consw *csw, int first, int last) -{ - int i, retval = 0; - - for (i = first; i <= last; i++) { - struct vc_data *vc = vc_cons[i].d; - - if (vc && vc->vc_mode == KD_GRAPHICS) { - retval = 1; - break; - } - } - - return retval; -} - -static int unbind_con_driver(const struct consw *csw, int first, int last, - int deflt) -{ - struct module *owner = csw->owner; - const struct consw *defcsw = NULL; - struct con_driver *con_driver = NULL, *con_back = NULL; - int i, retval = -ENODEV; - - if (!try_module_get(owner)) - return -ENODEV; - - acquire_console_sem(); - - /* check if driver is registered and if it is unbindable */ - for (i = 0; i < MAX_NR_CON_DRIVER; i++) { - con_driver = ®istered_con_driver[i]; - - if (con_driver->con == csw && - con_driver->flag & CON_DRIVER_FLAG_MODULE) { - retval = 0; - break; - } - } - - if (retval) { - release_console_sem(); - goto err; - } - - retval = -ENODEV; - - /* check if backup driver exists */ - for (i = 0; i < MAX_NR_CON_DRIVER; i++) { - con_back = ®istered_con_driver[i]; - - if (con_back->con && - !(con_back->flag & CON_DRIVER_FLAG_MODULE)) { - defcsw = con_back->con; - retval = 0; - break; - } - } - - if (retval) { - release_console_sem(); - goto err; - } - - if (!con_is_bound(csw)) { - release_console_sem(); - goto err; - } - - first = max(first, con_driver->first); - last = min(last, con_driver->last); - - for (i = first; i <= last; i++) { - if (con_driver_map[i] == csw) { - module_put(csw->owner); - con_driver_map[i] = NULL; - } - } - - if (!con_is_bound(defcsw)) { - const struct consw *defconsw = conswitchp; - - defcsw->con_startup(); - con_back->flag |= CON_DRIVER_FLAG_INIT; - /* - * vgacon may change the default driver to point - * to dummycon, we restore it here... - */ - conswitchp = defconsw; - } - - if (!con_is_bound(csw)) - con_driver->flag &= ~CON_DRIVER_FLAG_INIT; - release_console_sem(); - /* ignore return value, binding should not fail */ - bind_con_driver(defcsw, first, last, deflt); -err: module_put(owner); - return retval; - -} - -static int vt_bind(struct con_driver *con) -{ - const struct consw *defcsw = NULL, *csw = NULL; - int i, more = 1, first = -1, last = -1, deflt = 0; - - if (!con->con || !(con->flag & CON_DRIVER_FLAG_MODULE) || - con_is_graphics(con->con, con->first, con->last)) - goto err; - - csw = con->con; - - for (i = 0; i < MAX_NR_CON_DRIVER; i++) { - struct con_driver *con = ®istered_con_driver[i]; - - if (con->con && !(con->flag & CON_DRIVER_FLAG_MODULE)) { - defcsw = con->con; - break; - } - } - - if (!defcsw) - goto err; - - while (more) { - more = 0; - - for (i = con->first; i <= con->last; i++) { - if (con_driver_map[i] == defcsw) { - if (first == -1) - first = i; - last = i; - more = 1; - } else if (first != -1) - break; - } - - if (first == 0 && last == MAX_NR_CONSOLES -1) - deflt = 1; - - if (first != -1) - bind_con_driver(csw, first, last, deflt); - - first = -1; - last = -1; - deflt = 0; - } - -err: - return 0; -} - -static int vt_unbind(struct con_driver *con) -{ - const struct consw *csw = NULL; - int i, more = 1, first = -1, last = -1, deflt = 0; - - if (!con->con || !(con->flag & CON_DRIVER_FLAG_MODULE) || - con_is_graphics(con->con, con->first, con->last)) - goto err; - - csw = con->con; - - while (more) { - more = 0; - - for (i = con->first; i <= con->last; i++) { - if (con_driver_map[i] == csw) { - if (first == -1) - first = i; - last = i; - more = 1; - } else if (first != -1) - break; - } - - if (first == 0 && last == MAX_NR_CONSOLES -1) - deflt = 1; - - if (first != -1) - unbind_con_driver(csw, first, last, deflt); - - first = -1; - last = -1; - deflt = 0; - } - -err: - return 0; -} -#else -static inline int vt_bind(struct con_driver *con) -{ - return 0; -} -static inline int vt_unbind(struct con_driver *con) -{ - return 0; -} -#endif /* CONFIG_VT_HW_CONSOLE_BINDING */ - -static ssize_t store_bind(struct class_device *class_device, - const char *buf, size_t count) -{ - struct con_driver *con = class_get_devdata(class_device); - int bind = simple_strtoul(buf, NULL, 0); - - if (bind) - vt_bind(con); - else - vt_unbind(con); - - return count; -} - -static ssize_t show_bind(struct class_device *class_device, char *buf) -{ - struct con_driver *con = class_get_devdata(class_device); - int bind = con_is_bound(con->con); - - return snprintf(buf, PAGE_SIZE, "%i\n", bind); -} - -static ssize_t show_name(struct class_device *class_device, char *buf) -{ - struct con_driver *con = class_get_devdata(class_device); - - return snprintf(buf, PAGE_SIZE, "%s %s\n", - (con->flag & CON_DRIVER_FLAG_MODULE) ? "(M)" : "(S)", - con->desc); - -} - -static struct class_device_attribute class_device_attrs[] = { - __ATTR(bind, S_IRUGO|S_IWUSR, show_bind, store_bind), - __ATTR(name, S_IRUGO, show_name, NULL), -}; - -static int vtconsole_init_class_device(struct con_driver *con) -{ - int i; - - class_set_devdata(con->class_dev, con); - for (i = 0; i < ARRAY_SIZE(class_device_attrs); i++) - class_device_create_file(con->class_dev, - &class_device_attrs[i]); - return 0; } -static void vtconsole_deinit_class_device(struct con_driver *con) -{ - int i; - - for (i = 0; i < ARRAY_SIZE(class_device_attrs); i++) - class_device_remove_file(con->class_dev, - &class_device_attrs[i]); -} - -/** - * con_is_bound - checks if driver is bound to the console - * @csw: console driver - * - * RETURNS: zero if unbound, nonzero if bound - * - * Drivers can call this and if zero, they should release - * all resources allocated on con_startup() - */ -int con_is_bound(const struct consw *csw) -{ - int i, bound = 0; - - for (i = 0; i < MAX_NR_CONSOLES; i++) { - if (con_driver_map[i] == csw) { - bound = 1; - break; - } - } - - return bound; -} -EXPORT_SYMBOL(con_is_bound); - -/** - * register_con_driver - register console driver to console layer - * @csw: console driver - * @first: the first console to take over, minimum value is 0 - * @last: the last console to take over, maximum value is MAX_NR_CONSOLES -1 - * - * DESCRIPTION: This function registers a console driver which can later - * bind to a range of consoles specified by @first and @last. It will - * also initialize the console driver by calling con_startup(). - */ -int register_con_driver(const struct consw *csw, int first, int last) -{ - struct module *owner = csw->owner; - struct con_driver *con_driver; - const char *desc; - int i, retval = 0; - - if (!try_module_get(owner)) - return -ENODEV; - - acquire_console_sem(); - - for (i = 0; i < MAX_NR_CON_DRIVER; i++) { - con_driver = ®istered_con_driver[i]; - - /* already registered */ - if (con_driver->con == csw) - retval = -EINVAL; - } - - if (retval) - goto err; - - desc = csw->con_startup(); - - if (!desc) - goto err; - - retval = -EINVAL; - - for (i = 0; i < MAX_NR_CON_DRIVER; i++) { - con_driver = ®istered_con_driver[i]; - - if (con_driver->con == NULL) { - con_driver->con = csw; - con_driver->desc = desc; - con_driver->node = i; - con_driver->flag = CON_DRIVER_FLAG_MODULE | - CON_DRIVER_FLAG_INIT; - con_driver->first = first; - con_driver->last = last; - retval = 0; - break; - } - } - - if (retval) - goto err; - - con_driver->class_dev = class_device_create(vtconsole_class, NULL, - MKDEV(0, con_driver->node), - NULL, "vtcon%i", - con_driver->node); - - if (IS_ERR(con_driver->class_dev)) { - printk(KERN_WARNING "Unable to create class_device for %s; " - "errno = %ld\n", con_driver->desc, - PTR_ERR(con_driver->class_dev)); - con_driver->class_dev = NULL; - } else { - vtconsole_init_class_device(con_driver); - } -err: - release_console_sem(); - module_put(owner); - return retval; -} -EXPORT_SYMBOL(register_con_driver); - -/** - * unregister_con_driver - unregister console driver from console layer - * @csw: console driver - * - * DESCRIPTION: All drivers that registers to the console layer must - * call this function upon exit, or if the console driver is in a state - * where it won't be able to handle console services, such as the - * framebuffer console without loaded framebuffer drivers. - * - * The driver must unbind first prior to unregistration. - */ -int unregister_con_driver(const struct consw *csw) -{ - int i, retval = -ENODEV; - - acquire_console_sem(); - - /* cannot unregister a bound driver */ - if (con_is_bound(csw)) - goto err; - - for (i = 0; i < MAX_NR_CON_DRIVER; i++) { - struct con_driver *con_driver = ®istered_con_driver[i]; - - if (con_driver->con == csw && - con_driver->flag & CON_DRIVER_FLAG_MODULE) { - vtconsole_deinit_class_device(con_driver); - class_device_destroy(vtconsole_class, - MKDEV(0, con_driver->node)); - con_driver->con = NULL; - con_driver->desc = NULL; - con_driver->class_dev = NULL; - con_driver->node = 0; - con_driver->flag = 0; - con_driver->first = 0; - con_driver->last = 0; - retval = 0; - break; - } - } -err: - release_console_sem(); - return retval; -} -EXPORT_SYMBOL(unregister_con_driver); - -/* - * If we support more console drivers, this function is used - * when a driver wants to take over some existing consoles - * and become default driver for newly opened ones. - * - * take_over_console is basically a register followed by unbind - */ -int take_over_console(const struct consw *csw, int first, int last, int deflt) -{ - int err; - - err = register_con_driver(csw, first, last); - - if (!err) - bind_con_driver(csw, first, last, deflt); - - return err; -} - -/* - * give_up_console is a wrapper to unregister_con_driver. It will only - * work if driver is fully unbound. - */ void give_up_console(const struct consw *csw) -{ - unregister_con_driver(csw); -} - -static int __init vtconsole_class_init(void) { int i; - vtconsole_class = class_create(THIS_MODULE, "vtconsole"); - if (IS_ERR(vtconsole_class)) { - printk(KERN_WARNING "Unable to create vt console class; " - "errno = %ld\n", PTR_ERR(vtconsole_class)); - vtconsole_class = NULL; - } - - /* Add system drivers to sysfs */ - for (i = 0; i < MAX_NR_CON_DRIVER; i++) { - struct con_driver *con = ®istered_con_driver[i]; - - if (con->con && !con->class_dev) { - con->class_dev = - class_device_create(vtconsole_class, NULL, - MKDEV(0, con->node), NULL, - "vtcon%i", con->node); - - if (IS_ERR(con->class_dev)) { - printk(KERN_WARNING "Unable to create " - "class_device for %s; errno = %ld\n", - con->desc, PTR_ERR(con->class_dev)); - con->class_dev = NULL; - } else { - vtconsole_init_class_device(con); - } + for(i = 0; i < MAX_NR_CONSOLES; i++) + if (con_driver_map[i] == csw) { + module_put(csw->owner); + con_driver_map[i] = NULL; } - } - - return 0; } -postcore_initcall(vtconsole_class_init); #endif diff --git a/trunk/drivers/clocksource/Makefile b/trunk/drivers/clocksource/Makefile deleted file mode 100644 index a52225470225..000000000000 --- a/trunk/drivers/clocksource/Makefile +++ /dev/null @@ -1,3 +0,0 @@ -obj-$(CONFIG_X86_CYCLONE_TIMER) += cyclone.o -obj-$(CONFIG_X86_PM_TIMER) += acpi_pm.o -obj-$(CONFIG_SCx200HR_TIMER) += scx200_hrt.o diff --git a/trunk/drivers/clocksource/acpi_pm.c b/trunk/drivers/clocksource/acpi_pm.c deleted file mode 100644 index 7ad3be8c0f49..000000000000 --- a/trunk/drivers/clocksource/acpi_pm.c +++ /dev/null @@ -1,177 +0,0 @@ -/* - * linux/drivers/clocksource/acpi_pm.c - * - * This file contains the ACPI PM based clocksource. - * - * This code was largely moved from the i386 timer_pm.c file - * which was (C) Dominik Brodowski 2003 - * and contained the following comments: - * - * Driver to use the Power Management Timer (PMTMR) available in some - * southbridges as primary timing source for the Linux kernel. - * - * Based on parts of linux/drivers/acpi/hardware/hwtimer.c, timer_pit.c, - * timer_hpet.c, and on Arjan van de Ven's implementation for 2.4. - * - * This file is licensed under the GPL v2. - */ - -#include -#include -#include -#include -#include - -/* Number of PMTMR ticks expected during calibration run */ -#define PMTMR_TICKS_PER_SEC 3579545 - -/* - * The I/O port the PMTMR resides at. - * The location is detected during setup_arch(), - * in arch/i386/acpi/boot.c - */ -u32 pmtmr_ioport __read_mostly; - -#define ACPI_PM_MASK CLOCKSOURCE_MASK(24) /* limit it to 24 bits */ - -static inline u32 read_pmtmr(void) -{ - /* mask the output to 24 bits */ - return inl(pmtmr_ioport) & ACPI_PM_MASK; -} - -static cycle_t acpi_pm_read_verified(void) -{ - u32 v1 = 0, v2 = 0, v3 = 0; - - /* - * It has been reported that because of various broken - * chipsets (ICH4, PIIX4 and PIIX4E) where the ACPI PM clock - * source is not latched, you must read it multiple - * times to ensure a safe value is read: - */ - do { - v1 = read_pmtmr(); - v2 = read_pmtmr(); - v3 = read_pmtmr(); - } while ((v1 > v2 && v1 < v3) || (v2 > v3 && v2 < v1) - || (v3 > v1 && v3 < v2)); - - return (cycle_t)v2; -} - -static cycle_t acpi_pm_read(void) -{ - return (cycle_t)read_pmtmr(); -} - -static struct clocksource clocksource_acpi_pm = { - .name = "acpi_pm", - .rating = 200, - .read = acpi_pm_read, - .mask = (cycle_t)ACPI_PM_MASK, - .mult = 0, /*to be caluclated*/ - .shift = 22, - .is_continuous = 1, -}; - - -#ifdef CONFIG_PCI -static int acpi_pm_good; -static int __init acpi_pm_good_setup(char *__str) -{ - acpi_pm_good = 1; - return 1; -} -__setup("acpi_pm_good", acpi_pm_good_setup); - -static inline void acpi_pm_need_workaround(void) -{ - clocksource_acpi_pm.read = acpi_pm_read_verified; - clocksource_acpi_pm.rating = 110; -} - -/* - * PIIX4 Errata: - * - * The power management timer may return improper results when read. - * Although the timer value settles properly after incrementing, - * while incrementing there is a 3 ns window every 69.8 ns where the - * timer value is indeterminate (a 4.2% chance that the data will be - * incorrect when read). As a result, the ACPI free running count up - * timer specification is violated due to erroneous reads. - */ -static void __devinit acpi_pm_check_blacklist(struct pci_dev *dev) -{ - u8 rev; - - if (acpi_pm_good) - return; - - pci_read_config_byte(dev, PCI_REVISION_ID, &rev); - /* the bug has been fixed in PIIX4M */ - if (rev < 3) { - printk(KERN_WARNING "* Found PM-Timer Bug on the chipset." - " Due to workarounds for a bug,\n" - "* this clock source is slow. Consider trying" - " other clock sources\n"); - - acpi_pm_need_workaround(); - } -} -DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82371AB_3, - acpi_pm_check_blacklist); - -static void __devinit acpi_pm_check_graylist(struct pci_dev *dev) -{ - if (acpi_pm_good) - return; - - printk(KERN_WARNING "* The chipset may have PM-Timer Bug. Due to" - " workarounds for a bug,\n" - "* this clock source is slow. If you are sure your timer" - " does not have\n" - "* this bug, please use \"acpi_pm_good\" to disable the" - " workaround\n"); - - acpi_pm_need_workaround(); -} -DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82801DB_0, - acpi_pm_check_graylist); -#endif - - -static int __init init_acpi_pm_clocksource(void) -{ - u32 value1, value2; - unsigned int i; - - if (!pmtmr_ioport) - return -ENODEV; - - clocksource_acpi_pm.mult = clocksource_hz2mult(PMTMR_TICKS_PER_SEC, - clocksource_acpi_pm.shift); - - /* "verify" this timing source: */ - value1 = read_pmtmr(); - for (i = 0; i < 10000; i++) { - value2 = read_pmtmr(); - if (value2 == value1) - continue; - if (value2 > value1) - goto pm_good; - if ((value2 < value1) && ((value2) < 0xFFF)) - goto pm_good; - printk(KERN_INFO "PM-Timer had inconsistent results:" - " 0x%#x, 0x%#x - aborting.\n", value1, value2); - return -EINVAL; - } - printk(KERN_INFO "PM-Timer had no reasonable result:" - " 0x%#x - aborting.\n", value1); - return -ENODEV; - -pm_good: - return clocksource_register(&clocksource_acpi_pm); -} - -module_init(init_acpi_pm_clocksource); diff --git a/trunk/drivers/clocksource/cyclone.c b/trunk/drivers/clocksource/cyclone.c deleted file mode 100644 index bf4d3d50d1c4..000000000000 --- a/trunk/drivers/clocksource/cyclone.c +++ /dev/null @@ -1,119 +0,0 @@ -#include -#include -#include -#include -#include - -#include -#include - -#include "mach_timer.h" - -#define CYCLONE_CBAR_ADDR 0xFEB00CD0 /* base address ptr */ -#define CYCLONE_PMCC_OFFSET 0x51A0 /* offset to control register */ -#define CYCLONE_MPCS_OFFSET 0x51A8 /* offset to select register */ -#define CYCLONE_MPMC_OFFSET 0x51D0 /* offset to count register */ -#define CYCLONE_TIMER_FREQ 99780000 /* 100Mhz, but not really */ -#define CYCLONE_TIMER_MASK CLOCKSOURCE_MASK(32) /* 32 bit mask */ - -int use_cyclone = 0; -static void __iomem *cyclone_ptr; - -static cycle_t read_cyclone(void) -{ - return (cycle_t)readl(cyclone_ptr); -} - -static struct clocksource clocksource_cyclone = { - .name = "cyclone", - .rating = 250, - .read = read_cyclone, - .mask = CYCLONE_TIMER_MASK, - .mult = 10, - .shift = 0, - .is_continuous = 1, -}; - -static int __init init_cyclone_clocksource(void) -{ - unsigned long base; /* saved value from CBAR */ - unsigned long offset; - u32 __iomem* volatile cyclone_timer; /* Cyclone MPMC0 register */ - u32 __iomem* reg; - int i; - - /* make sure we're on a summit box: */ - if (!use_cyclone) - return -ENODEV; - - printk(KERN_INFO "Summit chipset: Starting Cyclone Counter.\n"); - - /* find base address: */ - offset = CYCLONE_CBAR_ADDR; - reg = ioremap_nocache(offset, sizeof(reg)); - if (!reg) { - printk(KERN_ERR "Summit chipset: Could not find valid CBAR register.\n"); - return -ENODEV; - } - /* even on 64bit systems, this is only 32bits: */ - base = readl(reg); - if (!base) { - printk(KERN_ERR "Summit chipset: Could not find valid CBAR value.\n"); - return -ENODEV; - } - iounmap(reg); - - /* setup PMCC: */ - offset = base + CYCLONE_PMCC_OFFSET; - reg = ioremap_nocache(offset, sizeof(reg)); - if (!reg) { - printk(KERN_ERR "Summit chipset: Could not find valid PMCC register.\n"); - return -ENODEV; - } - writel(0x00000001,reg); - iounmap(reg); - - /* setup MPCS: */ - offset = base + CYCLONE_MPCS_OFFSET; - reg = ioremap_nocache(offset, sizeof(reg)); - if (!reg) { - printk(KERN_ERR "Summit chipset: Could not find valid MPCS register.\n"); - return -ENODEV; - } - writel(0x00000001,reg); - iounmap(reg); - - /* map in cyclone_timer: */ - offset = base + CYCLONE_MPMC_OFFSET; - cyclone_timer = ioremap_nocache(offset, sizeof(u64)); - if (!cyclone_timer) { - printk(KERN_ERR "Summit chipset: Could not find valid MPMC register.\n"); - return -ENODEV; - } - - /* quick test to make sure its ticking: */ - for (i = 0; i < 3; i++){ - u32 old = readl(cyclone_timer); - int stall = 100; - - while (stall--) - barrier(); - - if (readl(cyclone_timer) == old) { - printk(KERN_ERR "Summit chipset: Counter not counting! DISABLED\n"); - iounmap(cyclone_timer); - cyclone_timer = NULL; - return -ENODEV; - } - } - cyclone_ptr = cyclone_timer; - - /* sort out mult/shift values: */ - clocksource_cyclone.shift = 22; - clocksource_cyclone.mult = clocksource_hz2mult(CYCLONE_TIMER_FREQ, - clocksource_cyclone.shift); - - return clocksource_register(&clocksource_cyclone); -} - -module_init(init_cyclone_clocksource); diff --git a/trunk/drivers/clocksource/scx200_hrt.c b/trunk/drivers/clocksource/scx200_hrt.c deleted file mode 100644 index d418b8297211..000000000000 --- a/trunk/drivers/clocksource/scx200_hrt.c +++ /dev/null @@ -1,101 +0,0 @@ -/* - * Copyright (C) 2006 Jim Cromie - * - * This is a clocksource driver for the Geode SCx200's 1 or 27 MHz - * high-resolution timer. The Geode SC-1100 (at least) has a buggy - * time stamp counter (TSC), which loses time unless 'idle=poll' is - * given as a boot-arg. In its absence, the Generic Timekeeping code - * will detect and de-rate the bad TSC, allowing this timer to take - * over timekeeping duties. - * - * Based on work by John Stultz, and Ted Phelps (in a 2.6.12-rc6 patch) - * - * This program is free software; you can redistribute it and/or - * modify it under the terms of the GNU General Public License as - * published by the Free Software Foundation; either version 2 of the - * License, or (at your option) any later version. - */ - -#include -#include -#include -#include -#include - -#define NAME "scx200_hrt" - -static int mhz27; -module_param(mhz27, int, 0); /* load time only */ -MODULE_PARM_DESC(mhz27, "count at 27.0 MHz (default is 1.0 MHz)"); - -static int ppm; -module_param(ppm, int, 0); /* load time only */ -MODULE_PARM_DESC(ppm, "+-adjust to actual XO freq (ppm)"); - -/* HiRes Timer configuration register address */ -#define SCx200_TMCNFG_OFFSET (SCx200_TIMER_OFFSET + 5) - -/* and config settings */ -#define HR_TMEN (1 << 0) /* timer interrupt enable */ -#define HR_TMCLKSEL (1 << 1) /* 1|0 counts at 27|1 MHz */ -#define HR_TM27MPD (1 << 2) /* 1 turns off input clock (power-down) */ - -/* The base timer frequency, * 27 if selected */ -#define HRT_FREQ 1000000 - -static cycle_t read_hrt(void) -{ - /* Read the timer value */ - return (cycle_t) inl(scx200_cb_base + SCx200_TIMER_OFFSET); -} - -#define HRT_SHIFT_1 22 -#define HRT_SHIFT_27 26 - -static struct clocksource cs_hrt = { - .name = "scx200_hrt", - .rating = 250, - .read = read_hrt, - .mask = CLOCKSOURCE_MASK(32), - .is_continuous = 1, - /* mult, shift are set based on mhz27 flag */ -}; - -static int __init init_hrt_clocksource(void) -{ - /* Make sure scx200 has initializedd the configuration block */ - if (!scx200_cb_present()) - return -ENODEV; - - /* Reserve the timer's ISA io-region for ourselves */ - if (!request_region(scx200_cb_base + SCx200_TIMER_OFFSET, - SCx200_TIMER_SIZE, - "NatSemi SCx200 High-Resolution Timer")) { - printk(KERN_WARNING NAME ": unable to lock timer region\n"); - return -ENODEV; - } - - /* write timer config */ - outb(HR_TMEN | (mhz27) ? HR_TMCLKSEL : 0, - scx200_cb_base + SCx200_TMCNFG_OFFSET); - - if (mhz27) { - cs_hrt.shift = HRT_SHIFT_27; - cs_hrt.mult = clocksource_hz2mult((HRT_FREQ + ppm) * 27, - cs_hrt.shift); - } else { - cs_hrt.shift = HRT_SHIFT_1; - cs_hrt.mult = clocksource_hz2mult(HRT_FREQ + ppm, - cs_hrt.shift); - } - printk(KERN_INFO "enabling scx200 high-res timer (%s MHz +%d ppm)\n", - mhz27 ? "27":"1", ppm); - - return clocksource_register(&cs_hrt); -} - -module_init(init_hrt_clocksource); - -MODULE_AUTHOR("Jim Cromie "); -MODULE_DESCRIPTION("clocksource on SCx200 HiRes Timer"); -MODULE_LICENSE("GPL"); diff --git a/trunk/drivers/dma/ioatdma.c b/trunk/drivers/dma/ioatdma.c index 2801d14a5e42..0fdf7fbd6495 100644 --- a/trunk/drivers/dma/ioatdma.c +++ b/trunk/drivers/dma/ioatdma.c @@ -824,9 +824,10 @@ static int __init ioat_init_module(void) { /* it's currently unsafe to unload this module */ /* if forced, worst case is that rmmod hangs */ - __unsafe(THIS_MODULE); + if (THIS_MODULE != NULL) + THIS_MODULE->unsafe = 1; - pci_module_init(&ioat_pci_drv); + return pci_module_init(&ioat_pci_drv); } module_init(ioat_init_module); diff --git a/trunk/drivers/ide/ide-io.c b/trunk/drivers/ide/ide-io.c index d2428cef1598..622a55c72f03 100644 --- a/trunk/drivers/ide/ide-io.c +++ b/trunk/drivers/ide/ide-io.c @@ -959,7 +959,7 @@ static void ide_check_pm_state(ide_drive_t *drive, struct request *rq) printk(KERN_WARNING "%s: bus not ready on wakeup\n", drive->name); SELECT_DRIVE(drive); HWIF(drive)->OUTB(8, HWIF(drive)->io_ports[IDE_CONTROL_OFFSET]); - rc = ide_wait_not_busy(HWIF(drive), 100000); + rc = ide_wait_not_busy(HWIF(drive), 10000); if (rc) printk(KERN_WARNING "%s: drive not ready on wakeup\n", drive->name); } diff --git a/trunk/drivers/ide/ide-lib.c b/trunk/drivers/ide/ide-lib.c index 7ddb11828731..16a143133f93 100644 --- a/trunk/drivers/ide/ide-lib.c +++ b/trunk/drivers/ide/ide-lib.c @@ -485,7 +485,7 @@ static u8 ide_dump_ata_status(ide_drive_t *drive, const char *msg, u8 stat) unsigned long flags; u8 err = 0; - local_irq_save(flags); + local_irq_set(flags); printk("%s: %s: status=0x%02x { ", drive->name, msg, stat); if (stat & BUSY_STAT) printk("Busy "); @@ -567,7 +567,7 @@ static u8 ide_dump_atapi_status(ide_drive_t *drive, const char *msg, u8 stat) status.all = stat; error.all = 0; - local_irq_save(flags); + local_irq_set(flags); printk("%s: %s: status=0x%02x { ", drive->name, msg, stat); if (status.b.bsy) printk("Busy "); diff --git a/trunk/drivers/ide/ide-timing.h b/trunk/drivers/ide/ide-timing.h index c0864b1e9228..2fcfac6e967a 100644 --- a/trunk/drivers/ide/ide-timing.h +++ b/trunk/drivers/ide/ide-timing.h @@ -219,12 +219,6 @@ static int ide_timing_compute(ide_drive_t *drive, short speed, struct ide_timing if (!(s = ide_timing_find_mode(speed))) return -EINVAL; -/* - * Copy the timing from the table. - */ - - *t = *s; - /* * If the drive is an EIDE drive, it can tell us it needs extended * PIO/MWDMA cycle timing. @@ -253,7 +247,7 @@ static int ide_timing_compute(ide_drive_t *drive, short speed, struct ide_timing * Convert the timing to bus clock counts. */ - ide_timing_quantize(t, t, T, UT); + ide_timing_quantize(s, t, T, UT); /* * Even in DMA/UDMA modes we still use PIO access for IDENTIFY, S.M.A.R.T diff --git a/trunk/drivers/ide/pci/pdc202xx_old.c b/trunk/drivers/ide/pci/pdc202xx_old.c index 22d17548ecdb..7ce5bf783688 100644 --- a/trunk/drivers/ide/pci/pdc202xx_old.c +++ b/trunk/drivers/ide/pci/pdc202xx_old.c @@ -370,6 +370,7 @@ static int config_chipset_for_dma (ide_drive_t *drive) if (!(speed)) { /* restore original pci-config space */ pci_write_config_dword(dev, drive_pci, drive_conf); + hwif->tuneproc(drive, 5); return 0; } @@ -414,6 +415,8 @@ static void pdc202xx_old_ide_dma_start(ide_drive_t *drive) if (drive->addressing == 1) { struct request *rq = HWGROUP(drive)->rq; ide_hwif_t *hwif = HWIF(drive); +// struct pci_dev *dev = hwif->pci_dev; +// unsgned long high_16 = pci_resource_start(dev, 4); unsigned long high_16 = hwif->dma_master; unsigned long atapi_reg = high_16 + (hwif->channel ? 0x24 : 0x20); u32 word_count = 0; @@ -433,6 +436,7 @@ static int pdc202xx_old_ide_dma_end(ide_drive_t *drive) { if (drive->addressing == 1) { ide_hwif_t *hwif = HWIF(drive); +// unsigned long high_16 = pci_resource_start(hwif->pci_dev, 4); unsigned long high_16 = hwif->dma_master; unsigned long atapi_reg = high_16 + (hwif->channel ? 0x24 : 0x20); u8 clock = 0; @@ -449,6 +453,8 @@ static int pdc202xx_old_ide_dma_end(ide_drive_t *drive) static int pdc202xx_old_ide_dma_test_irq(ide_drive_t *drive) { ide_hwif_t *hwif = HWIF(drive); +// struct pci_dev *dev = hwif->pci_dev; +// unsigned long high_16 = pci_resource_start(dev, 4); unsigned long high_16 = hwif->dma_master; u8 dma_stat = hwif->INB(hwif->dma_status); u8 sc1d = hwif->INB((high_16 + 0x001d)); @@ -486,7 +492,12 @@ static int pdc202xx_ide_dma_timeout(ide_drive_t *drive) static void pdc202xx_reset_host (ide_hwif_t *hwif) { +#ifdef CONFIG_BLK_DEV_IDEDMA +// unsigned long high_16 = hwif->dma_base - (8*(hwif->channel)); unsigned long high_16 = hwif->dma_master; +#else /* !CONFIG_BLK_DEV_IDEDMA */ + unsigned long high_16 = pci_resource_start(hwif->pci_dev, 4); +#endif /* CONFIG_BLK_DEV_IDEDMA */ u8 udma_speed_flag = hwif->INB(high_16|0x001f); hwif->OUTB((udma_speed_flag | 0x10), (high_16|0x001f)); @@ -539,6 +550,31 @@ static void pdc202xx_reset (ide_drive_t *drive) #endif } +/* + * Since SUN Cobalt is attempting to do this operation, I should disclose + * this has been a long time ago Thu Jul 27 16:40:57 2000 was the patch date + * HOTSWAP ATA Infrastructure. + */ +static int pdc202xx_tristate (ide_drive_t * drive, int state) +{ + ide_hwif_t *hwif = HWIF(drive); +// unsigned long high_16 = hwif->dma_base - (8*(hwif->channel)); + unsigned long high_16 = hwif->dma_master; + u8 sc1f = hwif->INB(high_16|0x001f); + + if (!hwif) + return -EINVAL; + +// hwif->bus_state = state; + + if (state) { + hwif->OUTB(sc1f | 0x08, (high_16|0x001f)); + } else { + hwif->OUTB(sc1f & ~0x08, (high_16|0x001f)); + } + return 0; +} + static unsigned int __devinit init_chipset_pdc202xx(struct pci_dev *dev, const char *name) { if (dev->resource[PCI_ROM_RESOURCE].start) { @@ -588,8 +624,10 @@ static void __devinit init_hwif_pdc202xx(ide_hwif_t *hwif) hwif->tuneproc = &config_chipset_for_pio; hwif->quirkproc = &pdc202xx_quirkproc; - if (hwif->pci_dev->device != PCI_DEVICE_ID_PROMISE_20246) + if (hwif->pci_dev->device != PCI_DEVICE_ID_PROMISE_20246) { + hwif->busproc = &pdc202xx_tristate; hwif->resetproc = &pdc202xx_reset; + } hwif->speedproc = &pdc202xx_tune_chipset; diff --git a/trunk/drivers/ide/pci/piix.c b/trunk/drivers/ide/pci/piix.c index 7fac6f57b5d6..e9b83e1a3028 100644 --- a/trunk/drivers/ide/pci/piix.c +++ b/trunk/drivers/ide/pci/piix.c @@ -222,8 +222,6 @@ static void piix_tune_drive (ide_drive_t *drive, u8 pio) unsigned long flags; u16 master_data; u8 slave_data; - static DEFINE_SPINLOCK(tune_lock); - /* ISP RTC */ u8 timings[][2] = { { 0, 0 }, { 0, 0 }, @@ -232,13 +230,7 @@ static void piix_tune_drive (ide_drive_t *drive, u8 pio) { 2, 3 }, }; pio = ide_get_best_pio_mode(drive, pio, 5, NULL); - - /* - * Master vs slave is synchronized above us but the slave register is - * shared by the two hwifs so the corner case of two slave timeouts in - * parallel must be locked. - */ - spin_lock_irqsave(&tune_lock, flags); + spin_lock_irqsave(&ide_lock, flags); pci_read_config_word(dev, master_port, &master_data); if (is_slave) { master_data = master_data | 0x4000; @@ -258,7 +250,7 @@ static void piix_tune_drive (ide_drive_t *drive, u8 pio) pci_write_config_word(dev, master_port, master_data); if (is_slave) pci_write_config_byte(dev, slave_port, slave_data); - spin_unlock_irqrestore(&tune_lock, flags); + spin_unlock_irqrestore(&ide_lock, flags); } /** diff --git a/trunk/drivers/ieee1394/eth1394.c b/trunk/drivers/ieee1394/eth1394.c index 2d5b57be98c3..5bda15904a08 100644 --- a/trunk/drivers/ieee1394/eth1394.c +++ b/trunk/drivers/ieee1394/eth1394.c @@ -1074,7 +1074,8 @@ static inline int update_partial_datagram(struct list_head *pdgl, struct list_he /* Move list entry to beginnig of list so that oldest partial * datagrams percolate to the end of the list */ - list_move(lh, pdgl); + list_del(lh); + list_add(lh, pdgl); return 0; } diff --git a/trunk/drivers/ieee1394/raw1394.c b/trunk/drivers/ieee1394/raw1394.c index 571ea68c0cf2..20ce539580f1 100644 --- a/trunk/drivers/ieee1394/raw1394.c +++ b/trunk/drivers/ieee1394/raw1394.c @@ -132,7 +132,8 @@ static void free_pending_request(struct pending_request *req) static void __queue_complete_req(struct pending_request *req) { struct file_info *fi = req->file_info; - list_move_tail(&req->list, &fi->req_complete); + list_del(&req->list); + list_add_tail(&req->list, &fi->req_complete); up(&fi->complete_sem); wake_up_interruptible(&fi->poll_wait_complete); diff --git a/trunk/drivers/infiniband/core/mad.c b/trunk/drivers/infiniband/core/mad.c index 5ed4dab52a6f..b38e02a5db35 100644 --- a/trunk/drivers/infiniband/core/mad.c +++ b/trunk/drivers/infiniband/core/mad.c @@ -1775,9 +1775,11 @@ ib_find_send_mad(struct ib_mad_agent_private *mad_agent_priv, void ib_mark_mad_done(struct ib_mad_send_wr_private *mad_send_wr) { mad_send_wr->timeout = 0; - if (mad_send_wr->refcount == 1) - list_move_tail(&mad_send_wr->agent_list, + if (mad_send_wr->refcount == 1) { + list_del(&mad_send_wr->agent_list); + list_add_tail(&mad_send_wr->agent_list, &mad_send_wr->mad_agent_priv->done_list); + } } static void ib_mad_complete_recv(struct ib_mad_agent_private *mad_agent_priv, @@ -2096,7 +2098,8 @@ static void ib_mad_send_done_handler(struct ib_mad_port_private *port_priv, queued_send_wr = container_of(mad_list, struct ib_mad_send_wr_private, mad_list); - list_move_tail(&mad_list->list, &send_queue->list); + list_del(&mad_list->list); + list_add_tail(&mad_list->list, &send_queue->list); } spin_unlock_irqrestore(&send_queue->lock, flags); diff --git a/trunk/drivers/infiniband/core/mad_rmpp.c b/trunk/drivers/infiniband/core/mad_rmpp.c index ebcd5b181770..d4704e054e30 100644 --- a/trunk/drivers/infiniband/core/mad_rmpp.c +++ b/trunk/drivers/infiniband/core/mad_rmpp.c @@ -665,7 +665,8 @@ static void process_rmpp_ack(struct ib_mad_agent_private *agent, goto out; mad_send_wr->refcount++; - list_move_tail(&mad_send_wr->agent_list, + list_del(&mad_send_wr->agent_list); + list_add_tail(&mad_send_wr->agent_list, &mad_send_wr->mad_agent_priv->send_list); } out: diff --git a/trunk/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/trunk/drivers/infiniband/ulp/ipoib/ipoib_multicast.c index ab40488182b3..216471fa01cc 100644 --- a/trunk/drivers/infiniband/ulp/ipoib/ipoib_multicast.c +++ b/trunk/drivers/infiniband/ulp/ipoib/ipoib_multicast.c @@ -864,7 +864,8 @@ void ipoib_mcast_restart_task(void *dev_ptr) if (mcast) { /* Destroy the send only entry */ - list_move_tail(&mcast->list, &remove_list); + list_del(&mcast->list); + list_add_tail(&mcast->list, &remove_list); rb_replace_node(&mcast->rb_node, &nmcast->rb_node, @@ -889,7 +890,8 @@ void ipoib_mcast_restart_task(void *dev_ptr) rb_erase(&mcast->rb_node, &priv->multicast_tree); /* Move to the remove list */ - list_move_tail(&mcast->list, &remove_list); + list_del(&mcast->list); + list_add_tail(&mcast->list, &remove_list); } } diff --git a/trunk/drivers/input/evdev.c b/trunk/drivers/input/evdev.c index a29d5ceb00cf..5f561fce32d8 100644 --- a/trunk/drivers/input/evdev.c +++ b/trunk/drivers/input/evdev.c @@ -78,19 +78,14 @@ static int evdev_fasync(int fd, struct file *file, int on) { int retval; struct evdev_list *list = file->private_data; - retval = fasync_helper(fd, file, on, &list->fasync); - return retval < 0 ? retval : 0; } -static int evdev_flush(struct file *file, fl_owner_t id) +static int evdev_flush(struct file * file, fl_owner_t id) { struct evdev_list *list = file->private_data; - - if (!list->evdev->exist) - return -ENODEV; - + if (!list->evdev->exist) return -ENODEV; return input_flush_device(&list->evdev->handle, file); } @@ -305,7 +300,6 @@ static ssize_t evdev_read(struct file * file, char __user * buffer, size_t count static unsigned int evdev_poll(struct file *file, poll_table *wait) { struct evdev_list *list = file->private_data; - poll_wait(file, &list->evdev->wait, wait); return ((list->head == list->tail) ? 0 : (POLLIN | POLLRDNORM)) | (list->evdev->exist ? 0 : (POLLHUP | POLLERR)); diff --git a/trunk/drivers/input/input.c b/trunk/drivers/input/input.c index de2e7546b491..3038c268917d 100644 --- a/trunk/drivers/input/input.c +++ b/trunk/drivers/input/input.c @@ -28,6 +28,20 @@ MODULE_AUTHOR("Vojtech Pavlik "); MODULE_DESCRIPTION("Input core"); MODULE_LICENSE("GPL"); +EXPORT_SYMBOL(input_allocate_device); +EXPORT_SYMBOL(input_register_device); +EXPORT_SYMBOL(input_unregister_device); +EXPORT_SYMBOL(input_register_handler); +EXPORT_SYMBOL(input_unregister_handler); +EXPORT_SYMBOL(input_grab_device); +EXPORT_SYMBOL(input_release_device); +EXPORT_SYMBOL(input_open_device); +EXPORT_SYMBOL(input_close_device); +EXPORT_SYMBOL(input_accept_process); +EXPORT_SYMBOL(input_flush_device); +EXPORT_SYMBOL(input_event); +EXPORT_SYMBOL_GPL(input_class); + #define INPUT_DEVICES 256 static LIST_HEAD(input_dev_list); @@ -49,13 +63,11 @@ void input_event(struct input_dev *dev, unsigned int type, unsigned int code, in case EV_SYN: switch (code) { case SYN_CONFIG: - if (dev->event) - dev->event(dev, type, code, value); + if (dev->event) dev->event(dev, type, code, value); break; case SYN_REPORT: - if (dev->sync) - return; + if (dev->sync) return; dev->sync = 1; break; } @@ -124,8 +136,7 @@ void input_event(struct input_dev *dev, unsigned int type, unsigned int code, in if (code > MSC_MAX || !test_bit(code, dev->mscbit)) return; - if (dev->event) - dev->event(dev, type, code, value); + if (dev->event) dev->event(dev, type, code, value); break; @@ -135,9 +146,7 @@ void input_event(struct input_dev *dev, unsigned int type, unsigned int code, in return; change_bit(code, dev->led); - - if (dev->event) - dev->event(dev, type, code, value); + if (dev->event) dev->event(dev, type, code, value); break; @@ -149,25 +158,21 @@ void input_event(struct input_dev *dev, unsigned int type, unsigned int code, in if (!!test_bit(code, dev->snd) != !!value) change_bit(code, dev->snd); - if (dev->event) - dev->event(dev, type, code, value); + if (dev->event) dev->event(dev, type, code, value); break; case EV_REP: - if (code > REP_MAX || value < 0 || dev->rep[code] == value) - return; + if (code > REP_MAX || value < 0 || dev->rep[code] == value) return; dev->rep[code] = value; - if (dev->event) - dev->event(dev, type, code, value); + if (dev->event) dev->event(dev, type, code, value); break; case EV_FF: - if (dev->event) - dev->event(dev, type, code, value); + if (dev->event) dev->event(dev, type, code, value); break; } @@ -181,7 +186,6 @@ void input_event(struct input_dev *dev, unsigned int type, unsigned int code, in if (handle->open) handle->handler->event(handle, type, code, value); } -EXPORT_SYMBOL(input_event); static void input_repeat_key(unsigned long data) { @@ -204,7 +208,6 @@ int input_accept_process(struct input_handle *handle, struct file *file) return 0; } -EXPORT_SYMBOL(input_accept_process); int input_grab_device(struct input_handle *handle) { @@ -214,14 +217,12 @@ int input_grab_device(struct input_handle *handle) handle->dev->grab = handle; return 0; } -EXPORT_SYMBOL(input_grab_device); void input_release_device(struct input_handle *handle) { if (handle->dev->grab == handle) handle->dev->grab = NULL; } -EXPORT_SYMBOL(input_release_device); int input_open_device(struct input_handle *handle) { @@ -244,7 +245,6 @@ int input_open_device(struct input_handle *handle) return err; } -EXPORT_SYMBOL(input_open_device); int input_flush_device(struct input_handle* handle, struct file* file) { @@ -253,7 +253,6 @@ int input_flush_device(struct input_handle* handle, struct file* file) return 0; } -EXPORT_SYMBOL(input_flush_device); void input_close_device(struct input_handle *handle) { @@ -269,7 +268,6 @@ void input_close_device(struct input_handle *handle) mutex_unlock(&dev->mutex); } -EXPORT_SYMBOL(input_close_device); static void input_link_handle(struct input_handle *handle) { @@ -337,11 +335,9 @@ static inline void input_wakeup_procfs_readers(void) static unsigned int input_proc_devices_poll(struct file *file, poll_table *wait) { int state = input_devices_state; - poll_wait(file, &input_devices_poll_wait, wait); if (state != input_devices_state) return POLLIN | POLLRDNORM; - return 0; } @@ -633,7 +629,7 @@ static ssize_t input_dev_show_modalias(struct class_device *dev, char *buf) len = input_print_modalias(buf, PAGE_SIZE, id, 1); - return min_t(int, len, PAGE_SIZE); + return max_t(int, len, PAGE_SIZE); } static CLASS_DEVICE_ATTR(modalias, S_IRUGO, input_dev_show_modalias, NULL); @@ -866,7 +862,6 @@ struct class input_class = { .release = input_dev_release, .uevent = input_dev_uevent, }; -EXPORT_SYMBOL_GPL(input_class); struct input_dev *input_allocate_device(void) { @@ -877,27 +872,12 @@ struct input_dev *input_allocate_device(void) dev->dynalloc = 1; dev->cdev.class = &input_class; class_device_initialize(&dev->cdev); - mutex_init(&dev->mutex); INIT_LIST_HEAD(&dev->h_list); INIT_LIST_HEAD(&dev->node); } return dev; } -EXPORT_SYMBOL(input_allocate_device); - -void input_free_device(struct input_dev *dev) -{ - if (dev) { - - mutex_lock(&dev->mutex); - dev->name = dev->phys = dev->uniq = NULL; - mutex_unlock(&dev->mutex); - - input_put_device(dev); - } -} -EXPORT_SYMBOL(input_free_device); int input_register_device(struct input_dev *dev) { @@ -915,6 +895,7 @@ int input_register_device(struct input_dev *dev) return -EINVAL; } + mutex_init(&dev->mutex); set_bit(EV_SYN, dev->evbit); /* @@ -975,14 +956,12 @@ int input_register_device(struct input_dev *dev) fail1: class_device_del(&dev->cdev); return error; } -EXPORT_SYMBOL(input_register_device); void input_unregister_device(struct input_dev *dev) { - struct list_head *node, *next; + struct list_head * node, * next; - if (!dev) - return; + if (!dev) return; del_timer_sync(&dev->timer); @@ -1000,13 +979,8 @@ void input_unregister_device(struct input_dev *dev) sysfs_remove_group(&dev->cdev.kobj, &input_dev_attr_group); class_device_unregister(&dev->cdev); - mutex_lock(&dev->mutex); - dev->name = dev->phys = dev->uniq = NULL; - mutex_unlock(&dev->mutex); - input_wakeup_procfs_readers(); } -EXPORT_SYMBOL(input_unregister_device); void input_register_handler(struct input_handler *handler) { @@ -1014,8 +988,7 @@ void input_register_handler(struct input_handler *handler) struct input_handle *handle; struct input_device_id *id; - if (!handler) - return; + if (!handler) return; INIT_LIST_HEAD(&handler->h_list); @@ -1032,11 +1005,10 @@ void input_register_handler(struct input_handler *handler) input_wakeup_procfs_readers(); } -EXPORT_SYMBOL(input_register_handler); void input_unregister_handler(struct input_handler *handler) { - struct list_head *node, *next; + struct list_head * node, * next; list_for_each_safe(node, next, &handler->h_list) { struct input_handle * handle = to_handle_h(node); @@ -1052,7 +1024,6 @@ void input_unregister_handler(struct input_handler *handler) input_wakeup_procfs_readers(); } -EXPORT_SYMBOL(input_unregister_handler); static int input_open_file(struct inode *inode, struct file *file) { diff --git a/trunk/drivers/input/joydev.c b/trunk/drivers/input/joydev.c index d67157513bf7..949bdcef8c2b 100644 --- a/trunk/drivers/input/joydev.c +++ b/trunk/drivers/input/joydev.c @@ -81,7 +81,10 @@ static int joydev_correct(int value, struct js_corr *corr) return 0; } - return value < -32767 ? -32767 : (value > 32767 ? 32767 : value); + if (value < -32767) return -32767; + if (value > 32767) return 32767; + + return value; } static void joydev_event(struct input_handle *handle, unsigned int type, unsigned int code, int value) @@ -93,8 +96,7 @@ static void joydev_event(struct input_handle *handle, unsigned int type, unsigne switch (type) { case EV_KEY: - if (code < BTN_MISC || value == 2) - return; + if (code < BTN_MISC || value == 2) return; event.type = JS_EVENT_BUTTON; event.number = joydev->keymap[code - BTN_MISC]; event.value = value; @@ -104,8 +106,7 @@ static void joydev_event(struct input_handle *handle, unsigned int type, unsigne event.type = JS_EVENT_AXIS; event.number = joydev->absmap[code]; event.value = joydev_correct(value, joydev->corr + event.number); - if (event.value == joydev->abs[event.number]) - return; + if (event.value == joydev->abs[event.number]) return; joydev->abs[event.number] = event.value; break; @@ -133,9 +134,7 @@ static int joydev_fasync(int fd, struct file *file, int on) { int retval; struct joydev_list *list = file->private_data; - retval = fasync_helper(fd, file, on, &list->fasync); - return retval < 0 ? retval : 0; } @@ -223,12 +222,12 @@ static ssize_t joydev_read(struct file *file, char __user *buf, size_t count, lo return sizeof(struct JS_DATA_TYPE); } - if (list->startup == joydev->nabs + joydev->nkey && - list->head == list->tail && (file->f_flags & O_NONBLOCK)) - return -EAGAIN; + if (list->startup == joydev->nabs + joydev->nkey + && list->head == list->tail && (file->f_flags & O_NONBLOCK)) + return -EAGAIN; retval = wait_event_interruptible(list->joydev->wait, - !list->joydev->exist || + !list->joydev->exist || list->startup < joydev->nabs + joydev->nkey || list->head != list->tail); @@ -277,9 +276,8 @@ static ssize_t joydev_read(struct file *file, char __user *buf, size_t count, lo static unsigned int joydev_poll(struct file *file, poll_table *wait) { struct joydev_list *list = file->private_data; - poll_wait(file, &list->joydev->wait, wait); - return ((list->head != list->tail || list->startup < list->joydev->nabs + list->joydev->nkey) ? + return ((list->head != list->tail || list->startup < list->joydev->nabs + list->joydev->nkey) ? (POLLIN | POLLRDNORM) : 0) | (list->joydev->exist ? 0 : (POLLHUP | POLLERR)); } @@ -293,26 +291,20 @@ static int joydev_ioctl_common(struct joydev *joydev, unsigned int cmd, void __u case JS_SET_CAL: return copy_from_user(&joydev->glue.JS_CORR, argp, sizeof(joydev->glue.JS_CORR)) ? -EFAULT : 0; - case JS_GET_CAL: return copy_to_user(argp, &joydev->glue.JS_CORR, sizeof(joydev->glue.JS_CORR)) ? -EFAULT : 0; - case JS_SET_TIMEOUT: return get_user(joydev->glue.JS_TIMEOUT, (s32 __user *) argp); - case JS_GET_TIMEOUT: return put_user(joydev->glue.JS_TIMEOUT, (s32 __user *) argp); case JSIOCGVERSION: return put_user(JS_VERSION, (__u32 __user *) argp); - case JSIOCGAXES: return put_user(joydev->nabs, (__u8 __user *) argp); - case JSIOCGBUTTONS: return put_user(joydev->nkey, (__u8 __user *) argp); - case JSIOCSCORR: if (copy_from_user(joydev->corr, argp, sizeof(joydev->corr[0]) * joydev->nabs)) @@ -322,49 +314,38 @@ static int joydev_ioctl_common(struct joydev *joydev, unsigned int cmd, void __u joydev->abs[i] = joydev_correct(dev->abs[j], joydev->corr + i); } return 0; - case JSIOCGCORR: return copy_to_user(argp, joydev->corr, sizeof(joydev->corr[0]) * joydev->nabs) ? -EFAULT : 0; - case JSIOCSAXMAP: if (copy_from_user(joydev->abspam, argp, sizeof(__u8) * (ABS_MAX + 1))) return -EFAULT; for (i = 0; i < joydev->nabs; i++) { - if (joydev->abspam[i] > ABS_MAX) - return -EINVAL; + if (joydev->abspam[i] > ABS_MAX) return -EINVAL; joydev->absmap[joydev->abspam[i]] = i; } return 0; - case JSIOCGAXMAP: return copy_to_user(argp, joydev->abspam, sizeof(__u8) * (ABS_MAX + 1)) ? -EFAULT : 0; - case JSIOCSBTNMAP: if (copy_from_user(joydev->keypam, argp, sizeof(__u16) * (KEY_MAX - BTN_MISC + 1))) return -EFAULT; for (i = 0; i < joydev->nkey; i++) { - if (joydev->keypam[i] > KEY_MAX || joydev->keypam[i] < BTN_MISC) - return -EINVAL; + if (joydev->keypam[i] > KEY_MAX || joydev->keypam[i] < BTN_MISC) return -EINVAL; joydev->keymap[joydev->keypam[i] - BTN_MISC] = i; } return 0; - case JSIOCGBTNMAP: return copy_to_user(argp, joydev->keypam, sizeof(__u16) * (KEY_MAX - BTN_MISC + 1)) ? -EFAULT : 0; - default: if ((cmd & ~(_IOC_SIZEMASK << _IOC_SIZESHIFT)) == JSIOCGNAME(0)) { int len; - if (!dev->name) - return 0; + if (!dev->name) return 0; len = strlen(dev->name) + 1; - if (len > _IOC_SIZE(cmd)) - len = _IOC_SIZE(cmd); - if (copy_to_user(argp, dev->name, len)) - return -EFAULT; + if (len > _IOC_SIZE(cmd)) len = _IOC_SIZE(cmd); + if (copy_to_user(argp, dev->name, len)) return -EFAULT; return len; } } @@ -381,9 +362,7 @@ static long joydev_compat_ioctl(struct file *file, unsigned int cmd, unsigned lo struct JS_DATA_SAVE_TYPE_32 ds32; int err; - if (!joydev->exist) - return -ENODEV; - + if (!joydev->exist) return -ENODEV; switch(cmd) { case JS_SET_TIMELIMIT: err = get_user(tmp32, (s32 __user *) arg); @@ -416,7 +395,8 @@ static long joydev_compat_ioctl(struct file *file, unsigned int cmd, unsigned lo ds32.JS_SAVE = joydev->glue.JS_SAVE; ds32.JS_CORR = joydev->glue.JS_CORR; - err = copy_to_user(argp, &ds32, sizeof(ds32)) ? -EFAULT : 0; + err = copy_to_user(argp, &ds32, + sizeof(ds32)) ? -EFAULT : 0; break; default: @@ -432,8 +412,7 @@ static int joydev_ioctl(struct inode *inode, struct file *file, unsigned int cmd struct joydev *joydev = list->joydev; void __user *argp = (void __user *)arg; - if (!joydev->exist) - return -ENODEV; + if (!joydev->exist) return -ENODEV; switch(cmd) { case JS_SET_TIMELIMIT: @@ -567,8 +546,8 @@ static struct input_device_id joydev_blacklist[] = { .flags = INPUT_DEVICE_ID_MATCH_EVBIT | INPUT_DEVICE_ID_MATCH_KEYBIT, .evbit = { BIT(EV_KEY) }, .keybit = { [LONG(BTN_TOUCH)] = BIT(BTN_TOUCH) }, - }, /* Avoid itouchpads, touchscreens and tablets */ - { } /* Terminating entry */ + }, /* Avoid itouchpads, touchscreens and tablets */ + { }, /* Terminating entry */ }; static struct input_device_id joydev_ids[] = { @@ -587,7 +566,7 @@ static struct input_device_id joydev_ids[] = { .evbit = { BIT(EV_ABS) }, .absbit = { BIT(ABS_THROTTLE) }, }, - { } /* Terminating entry */ + { }, /* Terminating entry */ }; MODULE_DEVICE_TABLE(input, joydev_ids); @@ -600,7 +579,7 @@ static struct input_handler joydev_handler = { .minor = JOYDEV_MINOR_BASE, .name = "joydev", .id_table = joydev_ids, - .blacklist = joydev_blacklist, + .blacklist = joydev_blacklist, }; static int __init joydev_init(void) diff --git a/trunk/drivers/input/joystick/a3d.c b/trunk/drivers/input/joystick/a3d.c index b11a4bbc84c4..4612d13ea756 100644 --- a/trunk/drivers/input/joystick/a3d.c +++ b/trunk/drivers/input/joystick/a3d.c @@ -306,7 +306,7 @@ static int a3d_connect(struct gameport *gameport, struct gameport_driver *drv) gameport_set_poll_handler(gameport, a3d_poll); gameport_set_poll_interval(gameport, 20); - snprintf(a3d->phys, sizeof(a3d->phys), "%s/input0", gameport->phys); + sprintf(a3d->phys, "%s/input0", gameport->phys); input_dev->name = a3d_names[a3d->mode]; input_dev->phys = a3d->phys; diff --git a/trunk/drivers/input/joystick/analog.c b/trunk/drivers/input/joystick/analog.c index 01dc0b195d59..3121961e3e7c 100644 --- a/trunk/drivers/input/joystick/analog.c +++ b/trunk/drivers/input/joystick/analog.c @@ -408,23 +408,21 @@ static void analog_calibrate_timer(struct analog_port *port) static void analog_name(struct analog *analog) { - snprintf(analog->name, sizeof(analog->name), "Analog %d-axis %d-button", - hweight8(analog->mask & ANALOG_AXES_STD), - hweight8(analog->mask & ANALOG_BTNS_STD) + !!(analog->mask & ANALOG_BTNS_CHF) * 2 + - hweight16(analog->mask & ANALOG_BTNS_GAMEPAD) + !!(analog->mask & ANALOG_HBTN_CHF) * 4); + sprintf(analog->name, "Analog %d-axis %d-button", + hweight8(analog->mask & ANALOG_AXES_STD), + hweight8(analog->mask & ANALOG_BTNS_STD) + !!(analog->mask & ANALOG_BTNS_CHF) * 2 + + hweight16(analog->mask & ANALOG_BTNS_GAMEPAD) + !!(analog->mask & ANALOG_HBTN_CHF) * 4); if (analog->mask & ANALOG_HATS_ALL) - snprintf(analog->name, sizeof(analog->name), "%s %d-hat", - analog->name, hweight16(analog->mask & ANALOG_HATS_ALL)); + sprintf(analog->name, "%s %d-hat", + analog->name, hweight16(analog->mask & ANALOG_HATS_ALL)); if (analog->mask & ANALOG_HAT_FCS) - strlcat(analog->name, " FCS", sizeof(analog->name)); + strcat(analog->name, " FCS"); if (analog->mask & ANALOG_ANY_CHF) - strlcat(analog->name, (analog->mask & ANALOG_SAITEK) ? " Saitek" : " CHF", - sizeof(analog->name)); + strcat(analog->name, (analog->mask & ANALOG_SAITEK) ? " Saitek" : " CHF"); - strlcat(analog->name, (analog->mask & ANALOG_GAMEPAD) ? " gamepad": " joystick", - sizeof(analog->name)); + strcat(analog->name, (analog->mask & ANALOG_GAMEPAD) ? " gamepad": " joystick"); } /* @@ -437,8 +435,7 @@ static int analog_init_device(struct analog_port *port, struct analog *analog, i int i, j, t, v, w, x, y, z; analog_name(analog); - snprintf(analog->phys, sizeof(analog->phys), - "%s/input%d", port->gameport->phys, index); + sprintf(analog->phys, "%s/input%d", port->gameport->phys, index); analog->buttons = (analog->mask & ANALOG_GAMEPAD) ? analog_pad_btn : analog_joy_btn; analog->dev = input_dev = input_allocate_device(); diff --git a/trunk/drivers/input/joystick/cobra.c b/trunk/drivers/input/joystick/cobra.c index d5e42eb88a20..1909f7ef340c 100644 --- a/trunk/drivers/input/joystick/cobra.c +++ b/trunk/drivers/input/joystick/cobra.c @@ -202,8 +202,7 @@ static int cobra_connect(struct gameport *gameport, struct gameport_driver *drv) goto fail3; } - snprintf(cobra->phys[i], sizeof(cobra->phys[i]), - "%s/input%d", gameport->phys, i); + sprintf(cobra->phys[i], "%s/input%d", gameport->phys, i); input_dev->name = "Creative Labs Blaster GamePad Cobra"; input_dev->phys = cobra->phys[i]; diff --git a/trunk/drivers/input/joystick/db9.c b/trunk/drivers/input/joystick/db9.c index 6f31f054d1bb..e61894685cb1 100644 --- a/trunk/drivers/input/joystick/db9.c +++ b/trunk/drivers/input/joystick/db9.c @@ -620,8 +620,7 @@ static struct db9 __init *db9_probe(int parport, int mode) goto err_unreg_devs; } - snprintf(db9->phys[i], sizeof(db9->phys[i]), - "%s/input%d", db9->pd->port->name, i); + sprintf(db9->phys[i], "%s/input%d", db9->pd->port->name, i); input_dev->name = db9_mode->name; input_dev->phys = db9->phys[i]; diff --git a/trunk/drivers/input/joystick/gamecon.c b/trunk/drivers/input/joystick/gamecon.c index fe12aa37393d..ecbdb6b9bbd6 100644 --- a/trunk/drivers/input/joystick/gamecon.c +++ b/trunk/drivers/input/joystick/gamecon.c @@ -761,8 +761,7 @@ static struct gc __init *gc_probe(int parport, int *pads, int n_pads) if (!pads[i]) continue; - snprintf(gc->phys[i], sizeof(gc->phys[i]), - "%s/input%d", gc->pd->port->name, i); + sprintf(gc->phys[i], "%s/input%d", gc->pd->port->name, i); err = gc_setup_pad(gc, i, pads[i]); if (err) goto err_unreg_devs; diff --git a/trunk/drivers/input/joystick/gf2k.c b/trunk/drivers/input/joystick/gf2k.c index e4a699f6ec87..8a3ad455eb38 100644 --- a/trunk/drivers/input/joystick/gf2k.c +++ b/trunk/drivers/input/joystick/gf2k.c @@ -298,7 +298,7 @@ static int gf2k_connect(struct gameport *gameport, struct gameport_driver *drv) gameport_set_poll_handler(gameport, gf2k_poll); gameport_set_poll_interval(gameport, 20); - snprintf(gf2k->phys, sizeof(gf2k->phys), "%s/input0", gameport->phys); + sprintf(gf2k->phys, "%s/input0", gameport->phys); gf2k->length = gf2k_lens[gf2k->id]; diff --git a/trunk/drivers/input/joystick/grip.c b/trunk/drivers/input/joystick/grip.c index 17a90c436de8..20cb98ac2d79 100644 --- a/trunk/drivers/input/joystick/grip.c +++ b/trunk/drivers/input/joystick/grip.c @@ -354,8 +354,7 @@ static int grip_connect(struct gameport *gameport, struct gameport_driver *drv) goto fail3; } - snprintf(grip->phys[i], sizeof(grip->phys[i]), - "%s/input%d", gameport->phys, i); + sprintf(grip->phys[i], "%s/input%d", gameport->phys, i); input_dev->name = grip_name[grip->mode[i]]; input_dev->phys = grip->phys[i]; diff --git a/trunk/drivers/input/joystick/guillemot.c b/trunk/drivers/input/joystick/guillemot.c index 840ed9b512b2..6e2c721c26ba 100644 --- a/trunk/drivers/input/joystick/guillemot.c +++ b/trunk/drivers/input/joystick/guillemot.c @@ -222,7 +222,7 @@ static int guillemot_connect(struct gameport *gameport, struct gameport_driver * gameport_set_poll_handler(gameport, guillemot_poll); gameport_set_poll_interval(gameport, 20); - snprintf(guillemot->phys, sizeof(guillemot->phys), "%s/input0", gameport->phys); + sprintf(guillemot->phys, "%s/input0", gameport->phys); guillemot->type = guillemot_type + i; input_dev->name = guillemot_type[i].name; diff --git a/trunk/drivers/input/joystick/iforce/iforce-ff.c b/trunk/drivers/input/joystick/iforce/iforce-ff.c index 50c90765aee1..2b8e8456c9fa 100644 --- a/trunk/drivers/input/joystick/iforce/iforce-ff.c +++ b/trunk/drivers/input/joystick/iforce/iforce-ff.c @@ -47,7 +47,7 @@ static int make_magnitude_modifier(struct iforce* iforce, iforce->device_memory.start, iforce->device_memory.end, 2L, NULL, NULL)) { mutex_unlock(&iforce->mem_mutex); - return -ENOSPC; + return -ENOMEM; } mutex_unlock(&iforce->mem_mutex); } @@ -80,7 +80,7 @@ static int make_period_modifier(struct iforce* iforce, iforce->device_memory.start, iforce->device_memory.end, 2L, NULL, NULL)) { mutex_unlock(&iforce->mem_mutex); - return -ENOSPC; + return -ENOMEM; } mutex_unlock(&iforce->mem_mutex); } @@ -120,7 +120,7 @@ static int make_envelope_modifier(struct iforce* iforce, iforce->device_memory.start, iforce->device_memory.end, 2L, NULL, NULL)) { mutex_unlock(&iforce->mem_mutex); - return -ENOSPC; + return -ENOMEM; } mutex_unlock(&iforce->mem_mutex); } @@ -157,7 +157,7 @@ static int make_condition_modifier(struct iforce* iforce, iforce->device_memory.start, iforce->device_memory.end, 2L, NULL, NULL)) { mutex_unlock(&iforce->mem_mutex); - return -ENOSPC; + return -ENOMEM; } mutex_unlock(&iforce->mem_mutex); } diff --git a/trunk/drivers/input/joystick/iforce/iforce-main.c b/trunk/drivers/input/joystick/iforce/iforce-main.c index 6d99e3c37884..ab0a26b924ca 100644 --- a/trunk/drivers/input/joystick/iforce/iforce-main.c +++ b/trunk/drivers/input/joystick/iforce/iforce-main.c @@ -86,7 +86,7 @@ static struct iforce_device iforce_device[] = { static int iforce_input_event(struct input_dev *dev, unsigned int type, unsigned int code, int value) { - struct iforce* iforce = dev->private; + struct iforce* iforce = (struct iforce*)(dev->private); unsigned char data[3]; if (type != EV_FF) @@ -138,7 +138,7 @@ static int iforce_input_event(struct input_dev *dev, unsigned int type, unsigned */ static int iforce_upload_effect(struct input_dev *dev, struct ff_effect *effect) { - struct iforce* iforce = dev->private; + struct iforce* iforce = (struct iforce*)(dev->private); int id; int ret; int is_update; @@ -218,7 +218,7 @@ static int iforce_upload_effect(struct input_dev *dev, struct ff_effect *effect) */ static int iforce_erase_effect(struct input_dev *dev, int effect_id) { - struct iforce* iforce = dev->private; + struct iforce* iforce = (struct iforce*)(dev->private); int err = 0; struct iforce_core_effect* core_effect; diff --git a/trunk/drivers/input/joystick/interact.c b/trunk/drivers/input/joystick/interact.c index bbfeb9c59b87..c4ed01758226 100644 --- a/trunk/drivers/input/joystick/interact.c +++ b/trunk/drivers/input/joystick/interact.c @@ -251,7 +251,7 @@ static int interact_connect(struct gameport *gameport, struct gameport_driver *d gameport_set_poll_handler(gameport, interact_poll); gameport_set_poll_interval(gameport, 20); - snprintf(interact->phys, sizeof(interact->phys), "%s/input0", gameport->phys); + sprintf(interact->phys, "%s/input0", gameport->phys); interact->type = i; interact->length = interact_type[i].length; diff --git a/trunk/drivers/input/joystick/magellan.c b/trunk/drivers/input/joystick/magellan.c index 168b1061a03b..ca3cc2319d6a 100644 --- a/trunk/drivers/input/joystick/magellan.c +++ b/trunk/drivers/input/joystick/magellan.c @@ -162,7 +162,7 @@ static int magellan_connect(struct serio *serio, struct serio_driver *drv) goto fail; magellan->dev = input_dev; - snprintf(magellan->phys, sizeof(magellan->phys), "%s/input0", serio->phys); + sprintf(magellan->phys, "%s/input0", serio->phys); input_dev->name = "LogiCad3D Magellan / SpaceMouse"; input_dev->phys = magellan->phys; diff --git a/trunk/drivers/input/joystick/sidewinder.c b/trunk/drivers/input/joystick/sidewinder.c index e58b22c018e4..95c0de7964a0 100644 --- a/trunk/drivers/input/joystick/sidewinder.c +++ b/trunk/drivers/input/joystick/sidewinder.c @@ -541,7 +541,7 @@ static void sw_print_packet(char *name, int length, unsigned char *buf, char bit * Unfortunately I don't know how to do this for the other SW types. */ -static void sw_3dp_id(unsigned char *buf, char *comment, size_t size) +static void sw_3dp_id(unsigned char *buf, char *comment) { int i; char pnp[8], rev[9]; @@ -554,7 +554,7 @@ static void sw_3dp_id(unsigned char *buf, char *comment, size_t size) pnp[7] = rev[8] = 0; - snprintf(comment, size, " [PnP %d.%02d id %s rev %s]", + sprintf(comment, " [PnP %d.%02d id %s rev %s]", (int) ((sw_get_bits(buf, 8, 6, 1) << 6) | /* Two 6-bit values */ sw_get_bits(buf, 16, 6, 1)) / 100, (int) ((sw_get_bits(buf, 8, 6, 1) << 6) | @@ -695,7 +695,7 @@ static int sw_connect(struct gameport *gameport, struct gameport_driver *drv) sw->type = SW_ID_FFP; sprintf(comment, " [AC %s]", sw_get_bits(idbuf,38,1,3) ? "off" : "on"); } else - sw->type = SW_ID_PP; + sw->type = SW_ID_PP; break; case 66: sw->bits = 3; @@ -703,8 +703,7 @@ static int sw_connect(struct gameport *gameport, struct gameport_driver *drv) sw->length = 22; case 64: sw->type = SW_ID_3DP; - if (j == 160) - sw_3dp_id(idbuf, comment, sizeof(comment)); + if (j == 160) sw_3dp_id(idbuf, comment); break; } } @@ -734,10 +733,8 @@ static int sw_connect(struct gameport *gameport, struct gameport_driver *drv) for (i = 0; i < sw->number; i++) { int bits, code; - snprintf(sw->name, sizeof(sw->name), - "Microsoft SideWinder %s", sw_name[sw->type]); - snprintf(sw->phys[i], sizeof(sw->phys[i]), - "%s/input%d", gameport->phys, i); + sprintf(sw->name, "Microsoft SideWinder %s", sw_name[sw->type]); + sprintf(sw->phys[i], "%s/input%d", gameport->phys, i); sw->dev[i] = input_dev = input_allocate_device(); if (!input_dev) { diff --git a/trunk/drivers/input/joystick/spaceball.c b/trunk/drivers/input/joystick/spaceball.c index 75eb5ca59992..d6f8db8ec3fd 100644 --- a/trunk/drivers/input/joystick/spaceball.c +++ b/trunk/drivers/input/joystick/spaceball.c @@ -220,7 +220,7 @@ static int spaceball_connect(struct serio *serio, struct serio_driver *drv) goto fail; spaceball->dev = input_dev; - snprintf(spaceball->phys, sizeof(spaceball->phys), "%s/input0", serio->phys); + sprintf(spaceball->phys, "%s/input0", serio->phys); input_dev->name = spaceball_names[id]; input_dev->phys = spaceball->phys; diff --git a/trunk/drivers/input/joystick/spaceorb.c b/trunk/drivers/input/joystick/spaceorb.c index 3e2782e79834..7c123a01c58e 100644 --- a/trunk/drivers/input/joystick/spaceorb.c +++ b/trunk/drivers/input/joystick/spaceorb.c @@ -177,7 +177,7 @@ static int spaceorb_connect(struct serio *serio, struct serio_driver *drv) goto fail; spaceorb->dev = input_dev; - snprintf(spaceorb->phys, sizeof(spaceorb->phys), "%s/input0", serio->phys); + sprintf(spaceorb->phys, "%s/input0", serio->phys); input_dev->name = "SpaceTec SpaceOrb 360 / Avenger"; input_dev->phys = spaceorb->phys; diff --git a/trunk/drivers/input/joystick/stinger.c b/trunk/drivers/input/joystick/stinger.c index 011ec4858e15..0a9ed1d30636 100644 --- a/trunk/drivers/input/joystick/stinger.c +++ b/trunk/drivers/input/joystick/stinger.c @@ -148,7 +148,7 @@ static int stinger_connect(struct serio *serio, struct serio_driver *drv) goto fail; stinger->dev = input_dev; - snprintf(stinger->phys, sizeof(stinger->phys), "%s/serio0", serio->phys); + sprintf(stinger->phys, "%s/serio0", serio->phys); input_dev->name = "Gravis Stinger"; input_dev->phys = stinger->phys; diff --git a/trunk/drivers/input/joystick/twidjoy.c b/trunk/drivers/input/joystick/twidjoy.c index 076f237d9654..7f8b0093c5bc 100644 --- a/trunk/drivers/input/joystick/twidjoy.c +++ b/trunk/drivers/input/joystick/twidjoy.c @@ -199,7 +199,7 @@ static int twidjoy_connect(struct serio *serio, struct serio_driver *drv) goto fail; twidjoy->dev = input_dev; - snprintf(twidjoy->phys, sizeof(twidjoy->phys), "%s/input0", serio->phys); + sprintf(twidjoy->phys, "%s/input0", serio->phys); input_dev->name = "Handykey Twiddler"; input_dev->phys = twidjoy->phys; diff --git a/trunk/drivers/input/joystick/warrior.c b/trunk/drivers/input/joystick/warrior.c index f9c1a03214eb..1849b176cf18 100644 --- a/trunk/drivers/input/joystick/warrior.c +++ b/trunk/drivers/input/joystick/warrior.c @@ -154,7 +154,7 @@ static int warrior_connect(struct serio *serio, struct serio_driver *drv) goto fail; warrior->dev = input_dev; - snprintf(warrior->phys, sizeof(warrior->phys), "%s/input0", serio->phys); + sprintf(warrior->phys, "%s/input0", serio->phys); input_dev->name = "Logitech WingMan Warrior"; input_dev->phys = warrior->phys; diff --git a/trunk/drivers/input/keyboard/atkbd.c b/trunk/drivers/input/keyboard/atkbd.c index ffde8f86e0fb..fad04b66d268 100644 --- a/trunk/drivers/input/keyboard/atkbd.c +++ b/trunk/drivers/input/keyboard/atkbd.c @@ -55,7 +55,7 @@ static int atkbd_softraw = 1; module_param_named(softraw, atkbd_softraw, bool, 0); MODULE_PARM_DESC(softraw, "Use software generated rawmode"); -static int atkbd_scroll; +static int atkbd_scroll = 0; module_param_named(scroll, atkbd_scroll, bool, 0); MODULE_PARM_DESC(scroll, "Enable scroll-wheel on MS Office and similar keyboards"); @@ -150,8 +150,8 @@ static unsigned char atkbd_unxlate_table[128] = { #define ATKBD_RET_EMUL0 0xe0 #define ATKBD_RET_EMUL1 0xe1 #define ATKBD_RET_RELEASE 0xf0 -#define ATKBD_RET_HANJA 0xf1 -#define ATKBD_RET_HANGEUL 0xf2 +#define ATKBD_RET_HANGUEL 0xf1 +#define ATKBD_RET_HANJA 0xf2 #define ATKBD_RET_ERR 0xff #define ATKBD_KEY_UNKNOWN 0 @@ -170,13 +170,6 @@ static unsigned char atkbd_unxlate_table[128] = { #define ATKBD_LED_EVENT_BIT 0 #define ATKBD_REP_EVENT_BIT 1 -#define ATKBD_XL_ERR 0x01 -#define ATKBD_XL_BAT 0x02 -#define ATKBD_XL_ACK 0x04 -#define ATKBD_XL_NAK 0x08 -#define ATKBD_XL_HANGEUL 0x10 -#define ATKBD_XL_HANJA 0x20 - static struct { unsigned char keycode; unsigned char set2; @@ -218,7 +211,8 @@ struct atkbd { unsigned char emul; unsigned char resend; unsigned char release; - unsigned long xl_bit; + unsigned char bat_xl; + unsigned char err_xl; unsigned int last; unsigned long time; @@ -251,65 +245,17 @@ ATKBD_DEFINE_ATTR(set); ATKBD_DEFINE_ATTR(softrepeat); ATKBD_DEFINE_ATTR(softraw); -static const unsigned int xl_table[] = { - ATKBD_RET_BAT, ATKBD_RET_ERR, ATKBD_RET_ACK, - ATKBD_RET_NAK, ATKBD_RET_HANJA, ATKBD_RET_HANGEUL, -}; - -/* - * Checks if we should mangle the scancode to extract 'release' bit - * in translated mode. - */ -static int atkbd_need_xlate(unsigned long xl_bit, unsigned char code) -{ - int i; - - if (code == ATKBD_RET_EMUL0 || code == ATKBD_RET_EMUL1) - return 0; - - for (i = 0; i < ARRAY_SIZE(xl_table); i++) - if (code == xl_table[i]) - return test_bit(i, &xl_bit); - - return 1; -} -/* - * Calculates new value of xl_bit so the driver can distinguish - * between make/break pair of scancodes for select keys and PS/2 - * protocol responses. - */ -static void atkbd_calculate_xl_bit(struct atkbd *atkbd, unsigned char code) -{ - int i; - - for (i = 0; i < ARRAY_SIZE(xl_table); i++) { - if (!((code ^ xl_table[i]) & 0x7f)) { - if (code & 0x80) - __clear_bit(i, &atkbd->xl_bit); - else - __set_bit(i, &atkbd->xl_bit); - break; - } - } -} - -/* - * Encode the scancode, 0xe0 prefix, and high bit into a single integer, - * keeping kernel 2.4 compatibility for set 2 - */ -static unsigned int atkbd_compat_scancode(struct atkbd *atkbd, unsigned int code) +static void atkbd_report_key(struct input_dev *dev, struct pt_regs *regs, int code, int value) { - if (atkbd->set == 3) { - if (atkbd->emul == 1) - code |= 0x100; - } else { - code = (code & 0x7f) | ((code & 0x80) << 1); - if (atkbd->emul == 1) - code |= 0x80; - } - - return code; + input_regs(dev, regs); + if (value == 3) { + input_report_key(dev, code, 1); + input_sync(dev); + input_report_key(dev, code, 0); + } else + input_event(dev, EV_KEY, code, value); + input_sync(dev); } /* @@ -321,11 +267,9 @@ static irqreturn_t atkbd_interrupt(struct serio *serio, unsigned char data, unsigned int flags, struct pt_regs *regs) { struct atkbd *atkbd = serio_get_drvdata(serio); - struct input_dev *dev = atkbd->dev; unsigned int code = data; - int scroll = 0, hscroll = 0, click = -1, add_release_event = 0; + int scroll = 0, hscroll = 0, click = -1; int value; - unsigned char keycode; #ifdef ATKBD_DEBUG printk(KERN_DEBUG "atkbd.c: Received %02x flags %02x\n", data, flags); @@ -354,17 +298,25 @@ static irqreturn_t atkbd_interrupt(struct serio *serio, unsigned char data, if (!atkbd->enabled) goto out; - input_event(dev, EV_MSC, MSC_RAW, code); + input_event(atkbd->dev, EV_MSC, MSC_RAW, code); if (atkbd->translated) { - if (atkbd->emul || atkbd_need_xlate(atkbd->xl_bit, code)) { + if (atkbd->emul || + (code != ATKBD_RET_EMUL0 && code != ATKBD_RET_EMUL1 && + code != ATKBD_RET_HANGUEL && code != ATKBD_RET_HANJA && + (code != ATKBD_RET_ERR || atkbd->err_xl) && + (code != ATKBD_RET_BAT || atkbd->bat_xl))) { atkbd->release = code >> 7; code &= 0x7f; } - if (!atkbd->emul) - atkbd_calculate_xl_bit(atkbd, data); + if (!atkbd->emul) { + if ((code & 0x7f) == (ATKBD_RET_BAT & 0x7f)) + atkbd->bat_xl = !(data >> 7); + if ((code & 0x7f) == (ATKBD_RET_ERR & 0x7f)) + atkbd->err_xl = !(data >> 7); + } } switch (code) { @@ -381,48 +333,47 @@ static irqreturn_t atkbd_interrupt(struct serio *serio, unsigned char data, case ATKBD_RET_RELEASE: atkbd->release = 1; goto out; - case ATKBD_RET_ACK: - case ATKBD_RET_NAK: - printk(KERN_WARNING "atkbd.c: Spurious %s on %s. " - "Some program might be trying access hardware directly.\n", - data == ATKBD_RET_ACK ? "ACK" : "NAK", serio->phys); + case ATKBD_RET_HANGUEL: + atkbd_report_key(atkbd->dev, regs, KEY_HANGUEL, 3); goto out; - case ATKBD_RET_HANGEUL: case ATKBD_RET_HANJA: - /* - * These keys do not report release and thus need to be - * flagged properly - */ - add_release_event = 1; - break; + atkbd_report_key(atkbd->dev, regs, KEY_HANJA, 3); + goto out; case ATKBD_RET_ERR: printk(KERN_DEBUG "atkbd.c: Keyboard on %s reports too many keys pressed.\n", serio->phys); goto out; } - code = atkbd_compat_scancode(atkbd, code); - - if (atkbd->emul && --atkbd->emul) - goto out; - - keycode = atkbd->keycode[code]; + if (atkbd->set != 3) + code = (code & 0x7f) | ((code & 0x80) << 1); + if (atkbd->emul) { + if (--atkbd->emul) + goto out; + code |= (atkbd->set != 3) ? 0x80 : 0x100; + } - if (keycode != ATKBD_KEY_NULL) - input_event(dev, EV_MSC, MSC_SCAN, code); + if (atkbd->keycode[code] != ATKBD_KEY_NULL) + input_event(atkbd->dev, EV_MSC, MSC_SCAN, code); - switch (keycode) { + switch (atkbd->keycode[code]) { case ATKBD_KEY_NULL: break; case ATKBD_KEY_UNKNOWN: - printk(KERN_WARNING - "atkbd.c: Unknown key %s (%s set %d, code %#x on %s).\n", - atkbd->release ? "released" : "pressed", - atkbd->translated ? "translated" : "raw", - atkbd->set, code, serio->phys); - printk(KERN_WARNING - "atkbd.c: Use 'setkeycodes %s%02x ' to make it known.\n", - code & 0x80 ? "e0" : "", code & 0x7f); - input_sync(dev); + if (data == ATKBD_RET_ACK || data == ATKBD_RET_NAK) { + printk(KERN_WARNING "atkbd.c: Spurious %s on %s. Some program, " + "like XFree86, might be trying access hardware directly.\n", + data == ATKBD_RET_ACK ? "ACK" : "NAK", serio->phys); + } else { + printk(KERN_WARNING "atkbd.c: Unknown key %s " + "(%s set %d, code %#x on %s).\n", + atkbd->release ? "released" : "pressed", + atkbd->translated ? "translated" : "raw", + atkbd->set, code, serio->phys); + printk(KERN_WARNING "atkbd.c: Use 'setkeycodes %s%02x ' " + "to make it known.\n", + code & 0x80 ? "e0" : "", code & 0x7f); + } + input_sync(atkbd->dev); break; case ATKBD_SCR_1: scroll = 1 - atkbd->release * 2; @@ -446,35 +397,33 @@ static irqreturn_t atkbd_interrupt(struct serio *serio, unsigned char data, hscroll = 1; break; default: - if (atkbd->release) { - value = 0; - atkbd->last = 0; - } else if (!atkbd->softrepeat && test_bit(keycode, dev->key)) { - /* Workaround Toshiba laptop multiple keypress */ - value = time_before(jiffies, atkbd->time) && atkbd->last == code ? 1 : 2; - } else { - value = 1; - atkbd->last = code; - atkbd->time = jiffies + msecs_to_jiffies(dev->rep[REP_DELAY]) / 2; + value = atkbd->release ? 0 : + (1 + (!atkbd->softrepeat && test_bit(atkbd->keycode[code], atkbd->dev->key))); + + switch (value) { /* Workaround Toshiba laptop multiple keypress */ + case 0: + atkbd->last = 0; + break; + case 1: + atkbd->last = code; + atkbd->time = jiffies + msecs_to_jiffies(atkbd->dev->rep[REP_DELAY]) / 2; + break; + case 2: + if (!time_after(jiffies, atkbd->time) && atkbd->last == code) + value = 1; + break; } - input_regs(dev, regs); - input_report_key(dev, keycode, value); - input_sync(dev); - - if (value && add_release_event) { - input_report_key(dev, keycode, 0); - input_sync(dev); - } + atkbd_report_key(atkbd->dev, regs, atkbd->keycode[code], value); } if (atkbd->scroll) { - input_regs(dev, regs); + input_regs(atkbd->dev, regs); if (click != -1) - input_report_key(dev, BTN_MIDDLE, click); - input_report_rel(dev, REL_WHEEL, scroll); - input_report_rel(dev, REL_HWHEEL, hscroll); - input_sync(dev); + input_report_key(atkbd->dev, BTN_MIDDLE, click); + input_report_rel(atkbd->dev, REL_WHEEL, scroll); + input_report_rel(atkbd->dev, REL_HWHEEL, hscroll); + input_sync(atkbd->dev); } atkbd->release = 0; @@ -815,9 +764,6 @@ static void atkbd_set_keycode_table(struct atkbd *atkbd) for (i = 0; i < ARRAY_SIZE(atkbd_scroll_keys); i++) atkbd->keycode[atkbd_scroll_keys[i].set2] = atkbd_scroll_keys[i].keycode; } - - atkbd->keycode[atkbd_compat_scancode(atkbd, ATKBD_RET_HANGEUL)] = KEY_HANGUEL; - atkbd->keycode[atkbd_compat_scancode(atkbd, ATKBD_RET_HANJA)] = KEY_HANJA; } /* @@ -830,15 +776,12 @@ static void atkbd_set_device_attrs(struct atkbd *atkbd) int i; if (atkbd->extra) - snprintf(atkbd->name, sizeof(atkbd->name), - "AT Set 2 Extra keyboard"); + sprintf(atkbd->name, "AT Set 2 Extra keyboard"); else - snprintf(atkbd->name, sizeof(atkbd->name), - "AT %s Set %d keyboard", - atkbd->translated ? "Translated" : "Raw", atkbd->set); + sprintf(atkbd->name, "AT %s Set %d keyboard", + atkbd->translated ? "Translated" : "Raw", atkbd->set); - snprintf(atkbd->phys, sizeof(atkbd->phys), - "%s/input0", atkbd->ps2dev.serio->phys); + sprintf(atkbd->phys, "%s/input0", atkbd->ps2dev.serio->phys); input_dev->name = atkbd->name; input_dev->phys = atkbd->phys; diff --git a/trunk/drivers/input/keyboard/lkkbd.c b/trunk/drivers/input/keyboard/lkkbd.c index 5174224cadb4..77c4d9669ad0 100644 --- a/trunk/drivers/input/keyboard/lkkbd.c +++ b/trunk/drivers/input/keyboard/lkkbd.c @@ -384,21 +384,18 @@ lkkbd_detection_done (struct lkkbd *lk) */ switch (lk->id[4]) { case 1: - strlcpy (lk->name, "DEC LK201 keyboard", - sizeof (lk->name)); + sprintf (lk->name, "DEC LK201 keyboard"); if (lk201_compose_is_alt) lk->keycode[0xb1] = KEY_LEFTALT; break; case 2: - strlcpy (lk->name, "DEC LK401 keyboard", - sizeof (lk->name)); + sprintf (lk->name, "DEC LK401 keyboard"); break; default: - strlcpy (lk->name, "Unknown DEC keyboard", - sizeof (lk->name)); + sprintf (lk->name, "Unknown DEC keyboard"); printk (KERN_ERR "lkkbd: keyboard on %s is unknown, " "please report to Jan-Benedict Glaw " "\n", lk->phys); diff --git a/trunk/drivers/input/keyboard/newtonkbd.c b/trunk/drivers/input/keyboard/newtonkbd.c index 40a3f551247e..d10983c521e6 100644 --- a/trunk/drivers/input/keyboard/newtonkbd.c +++ b/trunk/drivers/input/keyboard/newtonkbd.c @@ -96,7 +96,7 @@ static int nkbd_connect(struct serio *serio, struct serio_driver *drv) nkbd->serio = serio; nkbd->dev = input_dev; - snprintf(nkbd->phys, sizeof(nkbd->phys), "%s/input0", serio->phys); + sprintf(nkbd->phys, "%s/input0", serio->phys); memcpy(nkbd->keycode, nkbd_keycode, sizeof(nkbd->keycode)); input_dev->name = "Newton Keyboard"; diff --git a/trunk/drivers/input/keyboard/sunkbd.c b/trunk/drivers/input/keyboard/sunkbd.c index 9dbd7b85686d..b15b6d8d4f83 100644 --- a/trunk/drivers/input/keyboard/sunkbd.c +++ b/trunk/drivers/input/keyboard/sunkbd.c @@ -263,7 +263,7 @@ static int sunkbd_connect(struct serio *serio, struct serio_driver *drv) goto fail; } - snprintf(sunkbd->name, sizeof(sunkbd->name), "Sun Type %d keyboard", sunkbd->type); + sprintf(sunkbd->name, "Sun Type %d keyboard", sunkbd->type); memcpy(sunkbd->keycode, sunkbd_keycode, sizeof(sunkbd->keycode)); input_dev->name = sunkbd->name; diff --git a/trunk/drivers/input/keyboard/xtkbd.c b/trunk/drivers/input/keyboard/xtkbd.c index 0821d53cf0c1..4135e3e16c51 100644 --- a/trunk/drivers/input/keyboard/xtkbd.c +++ b/trunk/drivers/input/keyboard/xtkbd.c @@ -100,7 +100,7 @@ static int xtkbd_connect(struct serio *serio, struct serio_driver *drv) xtkbd->serio = serio; xtkbd->dev = input_dev; - snprintf(xtkbd->phys, sizeof(xtkbd->phys), "%s/input0", serio->phys); + sprintf(xtkbd->phys, "%s/input0", serio->phys); memcpy(xtkbd->keycode, xtkbd_keycode, sizeof(xtkbd->keycode)); input_dev->name = "XT Keyboard"; diff --git a/trunk/drivers/input/mouse/alps.c b/trunk/drivers/input/mouse/alps.c index 070d75330afd..a0e2e797c6d5 100644 --- a/trunk/drivers/input/mouse/alps.c +++ b/trunk/drivers/input/mouse/alps.c @@ -470,7 +470,7 @@ int alps_init(struct psmouse *psmouse) dev1->keybit[LONG(BTN_BACK)] |= BIT(BTN_BACK); } - snprintf(priv->phys, sizeof(priv->phys), "%s/input1", psmouse->ps2dev.serio->phys); + sprintf(priv->phys, "%s/input1", psmouse->ps2dev.serio->phys); dev2->phys = priv->phys; dev2->name = (priv->i->flags & ALPS_DUALPOINT) ? "DualPoint Stick" : "PS/2 Mouse"; dev2->id.bustype = BUS_I8042; diff --git a/trunk/drivers/input/mouse/psmouse-base.c b/trunk/drivers/input/mouse/psmouse-base.c index 8bc9f51ae6c2..136321a2cfdb 100644 --- a/trunk/drivers/input/mouse/psmouse-base.c +++ b/trunk/drivers/input/mouse/psmouse-base.c @@ -150,20 +150,9 @@ static psmouse_ret_t psmouse_process_byte(struct psmouse *psmouse, struct pt_reg */ if (psmouse->type == PSMOUSE_IMEX) { - switch (packet[3] & 0xC0) { - case 0x80: /* vertical scroll on IntelliMouse Explorer 4.0 */ - input_report_rel(dev, REL_WHEEL, (int) (packet[3] & 32) - (int) (packet[3] & 31)); - break; - case 0x40: /* horizontal scroll on IntelliMouse Explorer 4.0 */ - input_report_rel(dev, REL_HWHEEL, (int) (packet[3] & 32) - (int) (packet[3] & 31)); - break; - case 0x00: - case 0xC0: - input_report_rel(dev, REL_WHEEL, (int) (packet[3] & 8) - (int) (packet[3] & 7)); - input_report_key(dev, BTN_SIDE, (packet[3] >> 4) & 1); - input_report_key(dev, BTN_EXTRA, (packet[3] >> 5) & 1); - break; - } + input_report_rel(dev, REL_WHEEL, (int) (packet[3] & 8) - (int) (packet[3] & 7)); + input_report_key(dev, BTN_SIDE, (packet[3] >> 4) & 1); + input_report_key(dev, BTN_EXTRA, (packet[3] >> 5) & 1); } /* @@ -477,25 +466,9 @@ static int im_explorer_detect(struct psmouse *psmouse, int set_properties) if (param[0] != 4) return -1; -/* Magic to enable horizontal scrolling on IntelliMouse 4.0 */ - param[0] = 200; - ps2_command(ps2dev, param, PSMOUSE_CMD_SETRATE); - param[0] = 80; - ps2_command(ps2dev, param, PSMOUSE_CMD_SETRATE); - param[0] = 40; - ps2_command(ps2dev, param, PSMOUSE_CMD_SETRATE); - - param[0] = 200; - ps2_command(ps2dev, param, PSMOUSE_CMD_SETRATE); - param[0] = 200; - ps2_command(ps2dev, param, PSMOUSE_CMD_SETRATE); - param[0] = 60; - ps2_command(ps2dev, param, PSMOUSE_CMD_SETRATE); - if (set_properties) { set_bit(BTN_MIDDLE, psmouse->dev->keybit); set_bit(REL_WHEEL, psmouse->dev->relbit); - set_bit(REL_HWHEEL, psmouse->dev->relbit); set_bit(BTN_SIDE, psmouse->dev->keybit); set_bit(BTN_EXTRA, psmouse->dev->keybit); @@ -1084,8 +1057,8 @@ static int psmouse_switch_protocol(struct psmouse *psmouse, struct psmouse_proto if (psmouse->resync_time && psmouse->poll(psmouse)) psmouse->resync_time = 0; - snprintf(psmouse->devname, sizeof(psmouse->devname), "%s %s %s", - psmouse_protocol_by_type(psmouse->type)->name, psmouse->vendor, psmouse->name); + sprintf(psmouse->devname, "%s %s %s", + psmouse_protocol_by_type(psmouse->type)->name, psmouse->vendor, psmouse->name); input_dev->name = psmouse->devname; input_dev->phys = psmouse->phys; @@ -1126,7 +1099,7 @@ static int psmouse_connect(struct serio *serio, struct serio_driver *drv) ps2_init(&psmouse->ps2dev, serio); INIT_WORK(&psmouse->resync_work, psmouse_resync, psmouse); psmouse->dev = input_dev; - snprintf(psmouse->phys, sizeof(psmouse->phys), "%s/input0", serio->phys); + sprintf(psmouse->phys, "%s/input0", serio->phys); psmouse_set_state(psmouse, PSMOUSE_INITIALIZING); diff --git a/trunk/drivers/input/mouse/sermouse.c b/trunk/drivers/input/mouse/sermouse.c index a89742431717..2f9a04ae725f 100644 --- a/trunk/drivers/input/mouse/sermouse.c +++ b/trunk/drivers/input/mouse/sermouse.c @@ -254,7 +254,7 @@ static int sermouse_connect(struct serio *serio, struct serio_driver *drv) goto fail; sermouse->dev = input_dev; - snprintf(sermouse->phys, sizeof(sermouse->phys), "%s/input0", serio->phys); + sprintf(sermouse->phys, "%s/input0", serio->phys); sermouse->type = serio->id.proto; input_dev->name = sermouse_protocols[sermouse->type]; diff --git a/trunk/drivers/input/mouse/vsxxxaa.c b/trunk/drivers/input/mouse/vsxxxaa.c index 7b85bc21ae4a..36e9442a16b2 100644 --- a/trunk/drivers/input/mouse/vsxxxaa.c +++ b/trunk/drivers/input/mouse/vsxxxaa.c @@ -153,25 +153,22 @@ vsxxxaa_detection_done (struct vsxxxaa *mouse) { switch (mouse->type) { case 0x02: - strlcpy (mouse->name, "DEC VSXXX-AA/-GA mouse", - sizeof (mouse->name)); + sprintf (mouse->name, "DEC VSXXX-AA/-GA mouse"); break; case 0x04: - strlcpy (mouse->name, "DEC VSXXX-AB digitizer", - sizeof (mouse->name)); + sprintf (mouse->name, "DEC VSXXX-AB digitizer"); break; default: - snprintf (mouse->name, sizeof (mouse->name), - "unknown DEC pointer device (type = 0x%02x)", - mouse->type); + sprintf (mouse->name, "unknown DEC pointer device " + "(type = 0x%02x)", mouse->type); break; } - printk (KERN_INFO - "Found %s version 0x%02x from country 0x%02x on port %s\n", - mouse->name, mouse->version, mouse->country, mouse->phys); + printk (KERN_INFO "Found %s version 0x%02x from country 0x%02x " + "on port %s\n", mouse->name, mouse->version, + mouse->country, mouse->phys); } /* @@ -506,9 +503,8 @@ vsxxxaa_connect (struct serio *serio, struct serio_driver *drv) mouse->dev = input_dev; mouse->serio = serio; - strlcat (mouse->name, "DEC VSXXX-AA/-GA mouse or VSXXX-AB digitizer", - sizeof (mouse->name)); - snprintf (mouse->phys, sizeof (mouse->phys), "%s/input0", serio->phys); + sprintf (mouse->name, "DEC VSXXX-AA/-GA mouse or VSXXX-AB digitizer"); + sprintf (mouse->phys, "%s/input0", serio->phys); input_dev->name = mouse->name; input_dev->phys = mouse->phys; diff --git a/trunk/drivers/input/mousedev.c b/trunk/drivers/input/mousedev.c index eb721b11ff37..b685a507955d 100644 --- a/trunk/drivers/input/mousedev.c +++ b/trunk/drivers/input/mousedev.c @@ -123,9 +123,7 @@ static void mousedev_touchpad_event(struct input_dev *dev, struct mousedev *mous if (mousedev->touch) { size = dev->absmax[ABS_X] - dev->absmin[ABS_X]; - if (size == 0) - size = 256 * 2; - + if (size == 0) size = 256 * 2; switch (code) { case ABS_X: fx(0) = value; @@ -157,24 +155,18 @@ static void mousedev_abs_event(struct input_dev *dev, struct mousedev *mousedev, switch (code) { case ABS_X: size = dev->absmax[ABS_X] - dev->absmin[ABS_X]; - if (size == 0) - size = xres ? : 1; - if (value > dev->absmax[ABS_X]) - value = dev->absmax[ABS_X]; - if (value < dev->absmin[ABS_X]) - value = dev->absmin[ABS_X]; + if (size == 0) size = xres ? : 1; + if (value > dev->absmax[ABS_X]) value = dev->absmax[ABS_X]; + if (value < dev->absmin[ABS_X]) value = dev->absmin[ABS_X]; mousedev->packet.x = ((value - dev->absmin[ABS_X]) * xres) / size; mousedev->packet.abs_event = 1; break; case ABS_Y: size = dev->absmax[ABS_Y] - dev->absmin[ABS_Y]; - if (size == 0) - size = yres ? : 1; - if (value > dev->absmax[ABS_Y]) - value = dev->absmax[ABS_Y]; - if (value < dev->absmin[ABS_Y]) - value = dev->absmin[ABS_Y]; + if (size == 0) size = yres ? : 1; + if (value > dev->absmax[ABS_Y]) value = dev->absmax[ABS_Y]; + if (value < dev->absmin[ABS_Y]) value = dev->absmin[ABS_Y]; mousedev->packet.y = yres - ((value - dev->absmin[ABS_Y]) * yres) / size; mousedev->packet.abs_event = 1; break; @@ -210,7 +202,7 @@ static void mousedev_key_event(struct mousedev *mousedev, unsigned int code, int case BTN_SIDE: index = 3; break; case BTN_4: case BTN_EXTRA: index = 4; break; - default: return; + default: return; } if (value) { @@ -293,9 +285,10 @@ static void mousedev_touchpad_touch(struct mousedev *mousedev, int value) mousedev->touch = mousedev->pkt_count = 0; mousedev->frac_dx = 0; mousedev->frac_dy = 0; - - } else if (!mousedev->touch) - mousedev->touch = jiffies; + } + else + if (!mousedev->touch) + mousedev->touch = jiffies; } static void mousedev_event(struct input_handle *handle, unsigned int type, unsigned int code, int value) @@ -334,7 +327,7 @@ static void mousedev_event(struct input_handle *handle, unsigned int type, unsig mousedev->pkt_count++; /* Input system eats duplicate events, but we need all of them * to do correct averaging so apply present one forward - */ + */ fx(0) = fx(1); fy(0) = fy(1); } @@ -353,9 +346,7 @@ static int mousedev_fasync(int fd, struct file *file, int on) { int retval; struct mousedev_list *list = file->private_data; - retval = fasync_helper(fd, file, on, &list->fasync); - return retval < 0 ? retval : 0; } @@ -516,16 +507,14 @@ static ssize_t mousedev_write(struct file * file, const char __user * buffer, si list->imexseq = 0; list->mode = MOUSEDEV_EMUL_EXPS; } - } else - list->imexseq = 0; + } else list->imexseq = 0; if (c == mousedev_imps_seq[list->impsseq]) { if (++list->impsseq == MOUSEDEV_SEQ_LEN) { list->impsseq = 0; list->mode = MOUSEDEV_EMUL_IMPS; } - } else - list->impsseq = 0; + } else list->impsseq = 0; list->ps2[0] = 0xfa; @@ -609,7 +598,6 @@ static ssize_t mousedev_read(struct file * file, char __user * buffer, size_t co static unsigned int mousedev_poll(struct file *file, poll_table *wait) { struct mousedev_list *list = file->private_data; - poll_wait(file, &list->mousedev->wait, wait); return ((list->ready || list->buffer) ? (POLLIN | POLLRDNORM) : 0) | (list->mousedev->exist ? 0 : (POLLHUP | POLLERR)); diff --git a/trunk/drivers/input/touchscreen/gunze.c b/trunk/drivers/input/touchscreen/gunze.c index b769b21973b7..466da190ceec 100644 --- a/trunk/drivers/input/touchscreen/gunze.c +++ b/trunk/drivers/input/touchscreen/gunze.c @@ -129,7 +129,7 @@ static int gunze_connect(struct serio *serio, struct serio_driver *drv) gunze->serio = serio; gunze->dev = input_dev; - snprintf(gunze->phys, sizeof(serio->phys), "%s/input0", serio->phys); + sprintf(gunze->phys, "%s/input0", serio->phys); input_dev->private = gunze; input_dev->name = "Gunze AHL-51S TouchScreen"; diff --git a/trunk/drivers/input/touchscreen/h3600_ts_input.c b/trunk/drivers/input/touchscreen/h3600_ts_input.c index 2de2139f2fed..a595d386312f 100644 --- a/trunk/drivers/input/touchscreen/h3600_ts_input.c +++ b/trunk/drivers/input/touchscreen/h3600_ts_input.c @@ -363,7 +363,7 @@ static int h3600ts_connect(struct serio *serio, struct serio_driver *drv) ts->serio = serio; ts->dev = input_dev; - snprintf(ts->phys, sizeof(ts->phys), "%s/input0", serio->phys); + sprintf(ts->phys, "%s/input0", serio->phys); input_dev->name = "H3600 TouchScreen"; input_dev->phys = ts->phys; diff --git a/trunk/drivers/input/touchscreen/mtouch.c b/trunk/drivers/input/touchscreen/mtouch.c index 8647a905df80..1d0d37eeef6e 100644 --- a/trunk/drivers/input/touchscreen/mtouch.c +++ b/trunk/drivers/input/touchscreen/mtouch.c @@ -143,7 +143,7 @@ static int mtouch_connect(struct serio *serio, struct serio_driver *drv) mtouch->serio = serio; mtouch->dev = input_dev; - snprintf(mtouch->phys, sizeof(mtouch->phys), "%s/input0", serio->phys); + sprintf(mtouch->phys, "%s/input0", serio->phys); input_dev->private = mtouch; input_dev->name = "MicroTouch Serial TouchScreen"; diff --git a/trunk/drivers/input/tsdev.c b/trunk/drivers/input/tsdev.c index 5f9ecad2ca75..d678d144bbf8 100644 --- a/trunk/drivers/input/tsdev.c +++ b/trunk/drivers/input/tsdev.c @@ -35,7 +35,7 @@ * e-mail - mail your message to . */ -#define TSDEV_MINOR_BASE 128 +#define TSDEV_MINOR_BASE 128 #define TSDEV_MINORS 32 /* First 16 devices are h3600_ts compatible; second 16 are h3600_tsraw */ #define TSDEV_MINOR_MASK 15 @@ -230,7 +230,6 @@ static ssize_t tsdev_read(struct file *file, char __user *buffer, size_t count, static unsigned int tsdev_poll(struct file *file, poll_table * wait) { struct tsdev_list *list = file->private_data; - poll_wait(file, &list->tsdev->wait, wait); return ((list->head == list->tail) ? 0 : (POLLIN | POLLRDNORM)) | (list->tsdev->exist ? 0 : (POLLHUP | POLLERR)); @@ -249,13 +248,11 @@ static int tsdev_ioctl(struct inode *inode, struct file *file, sizeof (struct ts_calibration))) retval = -EFAULT; break; - case TS_SET_CAL: if (copy_from_user (&tsdev->cal, (void __user *)arg, sizeof (struct ts_calibration))) retval = -EFAULT; break; - default: retval = -EINVAL; break; @@ -287,11 +284,9 @@ static void tsdev_event(struct input_handle *handle, unsigned int type, case ABS_X: tsdev->x = value; break; - case ABS_Y: tsdev->y = value; break; - case ABS_PRESSURE: if (value > handle->dev->absmax[ABS_PRESSURE]) value = handle->dev->absmax[ABS_PRESSURE]; @@ -312,7 +307,6 @@ static void tsdev_event(struct input_handle *handle, unsigned int type, else if (tsdev->x > xres) tsdev->x = xres; break; - case REL_Y: tsdev->y += value; if (tsdev->y < 0) @@ -329,7 +323,6 @@ static void tsdev_event(struct input_handle *handle, unsigned int type, case 0: tsdev->pressure = 0; break; - case 1: if (!tsdev->pressure) tsdev->pressure = 1; @@ -377,8 +370,9 @@ static struct input_handle *tsdev_connect(struct input_handler *handler, struct class_device *cdev; int minor, delta; - for (minor = 0; minor < TSDEV_MINORS / 2 && tsdev_table[minor]; minor++); - if (minor >= TSDEV_MINORS / 2) { + for (minor = 0; minor < TSDEV_MINORS/2 && tsdev_table[minor]; + minor++); + if (minor >= TSDEV_MINORS/2) { printk(KERN_ERR "tsdev: You have way too many touchscreens\n"); return NULL; @@ -450,22 +444,22 @@ static struct input_device_id tsdev_ids[] = { .evbit = { BIT(EV_KEY) | BIT(EV_REL) }, .keybit = { [LONG(BTN_LEFT)] = BIT(BTN_LEFT) }, .relbit = { BIT(REL_X) | BIT(REL_Y) }, - }, /* A mouse like device, at least one button, two relative axes */ + },/* A mouse like device, at least one button, two relative axes */ { .flags = INPUT_DEVICE_ID_MATCH_EVBIT | INPUT_DEVICE_ID_MATCH_KEYBIT | INPUT_DEVICE_ID_MATCH_ABSBIT, .evbit = { BIT(EV_KEY) | BIT(EV_ABS) }, .keybit = { [LONG(BTN_TOUCH)] = BIT(BTN_TOUCH) }, .absbit = { BIT(ABS_X) | BIT(ABS_Y) }, - }, /* A tablet like device, at least touch detection, two absolute axes */ + },/* A tablet like device, at least touch detection, two absolute axes */ { .flags = INPUT_DEVICE_ID_MATCH_EVBIT | INPUT_DEVICE_ID_MATCH_ABSBIT, .evbit = { BIT(EV_ABS) }, .absbit = { BIT(ABS_X) | BIT(ABS_Y) | BIT(ABS_PRESSURE) }, - }, /* A tablet like device with several gradations of pressure */ + },/* A tablet like device with several gradations of pressure */ - {} /* Terminating entry */ + {},/* Terminating entry */ }; MODULE_DEVICE_TABLE(input, tsdev_ids); diff --git a/trunk/drivers/isdn/capi/capi.c b/trunk/drivers/isdn/capi/capi.c index 2e541fa02024..173c899a1fb4 100644 --- a/trunk/drivers/isdn/capi/capi.c +++ b/trunk/drivers/isdn/capi/capi.c @@ -87,11 +87,6 @@ struct capincci; #ifdef CONFIG_ISDN_CAPI_MIDDLEWARE struct capiminor; -struct datahandle_queue { - struct list_head list; - u16 datahandle; -}; - struct capiminor { struct list_head list; struct capincci *nccip; @@ -114,9 +109,12 @@ struct capiminor { int outbytes; /* transmit path */ - struct list_head ackqueue; + struct datahandle_queue { + struct datahandle_queue *next; + u16 datahandle; + } *ackqueue; int nack; - spinlock_t ackqlock; + }; #endif /* CONFIG_ISDN_CAPI_MIDDLEWARE */ @@ -158,54 +156,48 @@ static LIST_HEAD(capiminor_list); static int capincci_add_ack(struct capiminor *mp, u16 datahandle) { - struct datahandle_queue *n; - unsigned long flags; + struct datahandle_queue *n, **pp; n = kmalloc(sizeof(*n), GFP_ATOMIC); - if (unlikely(!n)) { - printk(KERN_ERR "capi: alloc datahandle failed\n"); - return -1; + if (!n) { + printk(KERN_ERR "capi: alloc datahandle failed\n"); + return -1; } + n->next = NULL; n->datahandle = datahandle; - INIT_LIST_HEAD(&n->list); - spin_lock_irqsave(&mp->ackqlock, flags); - list_add_tail(&n->list, &mp->ackqueue); + for (pp = &mp->ackqueue; *pp; pp = &(*pp)->next) ; + *pp = n; mp->nack++; - spin_unlock_irqrestore(&mp->ackqlock, flags); return 0; } static int capiminor_del_ack(struct capiminor *mp, u16 datahandle) { - struct datahandle_queue *p, *tmp; - unsigned long flags; + struct datahandle_queue **pp, *p; - spin_lock_irqsave(&mp->ackqlock, flags); - list_for_each_entry_safe(p, tmp, &mp->ackqueue, list) { - if (p->datahandle == datahandle) { - list_del(&p->list); + for (pp = &mp->ackqueue; *pp; pp = &(*pp)->next) { + if ((*pp)->datahandle == datahandle) { + p = *pp; + *pp = (*pp)->next; kfree(p); mp->nack--; - spin_unlock_irqrestore(&mp->ackqlock, flags); return 0; } } - spin_unlock_irqrestore(&mp->ackqlock, flags); return -1; } static void capiminor_del_all_ack(struct capiminor *mp) { - struct datahandle_queue *p, *tmp; - unsigned long flags; + struct datahandle_queue **pp, *p; - spin_lock_irqsave(&mp->ackqlock, flags); - list_for_each_entry_safe(p, tmp, &mp->ackqueue, list) { - list_del(&p->list); + pp = &mp->ackqueue; + while (*pp) { + p = *pp; + *pp = (*pp)->next; kfree(p); mp->nack--; } - spin_unlock_irqrestore(&mp->ackqlock, flags); } @@ -228,8 +220,6 @@ static struct capiminor *capiminor_alloc(struct capi20_appl *ap, u32 ncci) mp->ncci = ncci; mp->msgid = 0; atomic_set(&mp->ttyopencount,0); - INIT_LIST_HEAD(&mp->ackqueue); - spin_lock_init(&mp->ackqlock); skb_queue_head_init(&mp->inqueue); skb_queue_head_init(&mp->outqueue); diff --git a/trunk/drivers/isdn/gigaset/bas-gigaset.c b/trunk/drivers/isdn/gigaset/bas-gigaset.c index 8a45715dd4c1..eb41aba3ddef 100644 --- a/trunk/drivers/isdn/gigaset/bas-gigaset.c +++ b/trunk/drivers/isdn/gigaset/bas-gigaset.c @@ -65,22 +65,23 @@ static struct usb_device_id gigaset_table [] = { MODULE_DEVICE_TABLE(usb, gigaset_table); -/*======================= local function prototypes ==========================*/ +/*======================= local function prototypes =============================*/ -/* function called if a new device belonging to this driver is connected */ +/* This function is called if a new device is connected to the USB port. It + * checks whether this new device belongs to this driver. + */ static int gigaset_probe(struct usb_interface *interface, const struct usb_device_id *id); /* Function will be called if the device is unplugged */ static void gigaset_disconnect(struct usb_interface *interface); -static int atread_submit(struct cardstate *, int); +static void read_ctrl_callback(struct urb *, struct pt_regs *); static void stopurbs(struct bas_bc_state *); -static int req_submit(struct bc_state *, int, int, int); static int atwrite_submit(struct cardstate *, unsigned char *, int); static int start_cbsend(struct cardstate *); -/*============================================================================*/ +/*==============================================================================*/ struct bas_cardstate { struct usb_device *udev; /* USB device pointer */ @@ -90,7 +91,6 @@ struct bas_cardstate { struct urb *urb_ctrl; /* control pipe default URB */ struct usb_ctrlrequest dr_ctrl; struct timer_list timer_ctrl; /* control request timeout */ - int retry_ctrl; struct timer_list timer_atrdy; /* AT command ready timeout */ struct urb *urb_cmd_out; /* for sending AT commands */ @@ -307,7 +307,6 @@ static int gigaset_set_line_ctrl(struct cardstate *cs, unsigned cflag) * hang up any existing connection because of an unrecoverable error * This function may be called from any context and takes care of scheduling * the necessary actions for execution outside of interrupt context. - * cs->lock must not be held. * argument: * B channel control structure */ @@ -326,17 +325,14 @@ static inline void error_hangup(struct bc_state *bcs) /* error_reset * reset Gigaset device because of an unrecoverable error - * This function may be called from any context, and takes care of + * This function may be called from any context, and should take care of * scheduling the necessary actions for execution outside of interrupt context. - * cs->lock must not be held. + * Right now, it just generates a kernel message calling for help. * argument: * controller state structure */ static inline void error_reset(struct cardstate *cs) { - /* close AT command channel to recover (ignore errors) */ - req_submit(cs->bcs, HD_CLOSE_ATCHANNEL, 0, BAS_TIMEOUT); - //FIXME try to recover without bothering the user dev_err(cs->dev, "unrecoverable error - please disconnect Gigaset base to reset\n"); @@ -407,30 +403,14 @@ static void cmd_in_timeout(unsigned long data) { struct cardstate *cs = (struct cardstate *) data; struct bas_cardstate *ucs = cs->hw.bas; - int rc; if (!ucs->rcvbuf_size) { gig_dbg(DEBUG_USBREQ, "%s: no receive in progress", __func__); return; } - if (ucs->retry_cmd_in++ < BAS_RETRY) { - dev_notice(cs->dev, "control read: timeout, retry %d\n", - ucs->retry_cmd_in); - rc = atread_submit(cs, BAS_TIMEOUT); - if (rc >= 0 || rc == -ENODEV) - /* resubmitted or disconnected */ - /* - bypass regular exit block */ - return; - } else { - dev_err(cs->dev, - "control read: timeout, giving up after %d tries\n", - ucs->retry_cmd_in); - } - kfree(ucs->rcvbuf); - ucs->rcvbuf = NULL; - ucs->rcvbuf_size = 0; - error_reset(cs); + dev_err(cs->dev, "timeout reading AT response\n"); + error_reset(cs); //FIXME retry? } /* set/clear bits in base connection state, return previous state @@ -448,96 +428,6 @@ inline static int update_basstate(struct bas_cardstate *ucs, return state; } -/* read_ctrl_callback - * USB completion handler for control pipe input - * called by the USB subsystem in interrupt context - * parameter: - * urb USB request block - * urb->context = inbuf structure for controller state - */ -static void read_ctrl_callback(struct urb *urb, struct pt_regs *regs) -{ - struct inbuf_t *inbuf = urb->context; - struct cardstate *cs = inbuf->cs; - struct bas_cardstate *ucs = cs->hw.bas; - int have_data = 0; - unsigned numbytes; - int rc; - - update_basstate(ucs, 0, BS_ATRDPEND); - - if (!ucs->rcvbuf_size) { - dev_warn(cs->dev, "%s: no receive in progress\n", __func__); - return; - } - - del_timer(&ucs->timer_cmd_in); - - switch (urb->status) { - case 0: /* normal completion */ - numbytes = urb->actual_length; - if (unlikely(numbytes != ucs->rcvbuf_size)) { - dev_warn(cs->dev, - "control read: received %d chars, expected %d\n", - numbytes, ucs->rcvbuf_size); - if (numbytes > ucs->rcvbuf_size) - numbytes = ucs->rcvbuf_size; - } - - /* copy received bytes to inbuf */ - have_data = gigaset_fill_inbuf(inbuf, ucs->rcvbuf, numbytes); - - if (unlikely(numbytes < ucs->rcvbuf_size)) { - /* incomplete - resubmit for remaining bytes */ - ucs->rcvbuf_size -= numbytes; - ucs->retry_cmd_in = 0; - rc = atread_submit(cs, BAS_TIMEOUT); - if (rc >= 0 || rc == -ENODEV) - /* resubmitted or disconnected */ - /* - bypass regular exit block */ - return; - error_reset(cs); - } - break; - - case -ENOENT: /* cancelled */ - case -ECONNRESET: /* cancelled (async) */ - case -EINPROGRESS: /* pending */ - case -ENODEV: /* device removed */ - case -ESHUTDOWN: /* device shut down */ - /* no action necessary */ - gig_dbg(DEBUG_USBREQ, "%s: %s", - __func__, get_usb_statmsg(urb->status)); - break; - - default: /* severe trouble */ - dev_warn(cs->dev, "control read: %s\n", - get_usb_statmsg(urb->status)); - if (ucs->retry_cmd_in++ < BAS_RETRY) { - dev_notice(cs->dev, "control read: retry %d\n", - ucs->retry_cmd_in); - rc = atread_submit(cs, BAS_TIMEOUT); - if (rc >= 0 || rc == -ENODEV) - /* resubmitted or disconnected */ - /* - bypass regular exit block */ - return; - } else { - dev_err(cs->dev, - "control read: giving up after %d tries\n", - ucs->retry_cmd_in); - } - error_reset(cs); - } - - kfree(ucs->rcvbuf); - ucs->rcvbuf = NULL; - ucs->rcvbuf_size = 0; - if (have_data) { - gig_dbg(DEBUG_INTR, "%s-->BH", __func__); - gigaset_schedule_event(cs); - } -} - /* atread_submit * submit an HD_READ_ATMESSAGE command URB and optionally start a timeout * parameters: @@ -576,7 +466,7 @@ static int atread_submit(struct cardstate *cs, int timeout) if ((ret = usb_submit_urb(ucs->urb_cmd_in, SLAB_ATOMIC)) != 0) { update_basstate(ucs, 0, BS_ATRDPEND); dev_err(cs->dev, "could not submit HD_READ_ATMESSAGE: %s\n", - get_usb_rcmsg(ret)); + get_usb_statmsg(ret)); return ret; } @@ -721,12 +611,9 @@ static void read_int_callback(struct urb *urb, struct pt_regs *regs) kfree(ucs->rcvbuf); ucs->rcvbuf = NULL; ucs->rcvbuf_size = 0; - if (rc != -ENODEV) { + if (rc != -ENODEV) //FIXME corrective action? - spin_unlock_irqrestore(&cs->lock, flags); error_reset(cs); - break; - } } spin_unlock_irqrestore(&cs->lock, flags); break; @@ -756,6 +643,97 @@ static void read_int_callback(struct urb *urb, struct pt_regs *regs) } } +/* read_ctrl_callback + * USB completion handler for control pipe input + * called by the USB subsystem in interrupt context + * parameter: + * urb USB request block + * urb->context = inbuf structure for controller state + */ +static void read_ctrl_callback(struct urb *urb, struct pt_regs *regs) +{ + struct inbuf_t *inbuf = urb->context; + struct cardstate *cs = inbuf->cs; + struct bas_cardstate *ucs = cs->hw.bas; + int have_data = 0; + unsigned numbytes; + int rc; + + update_basstate(ucs, 0, BS_ATRDPEND); + + if (!ucs->rcvbuf_size) { + dev_warn(cs->dev, "%s: no receive in progress\n", __func__); + return; + } + + del_timer(&ucs->timer_cmd_in); + + switch (urb->status) { + case 0: /* normal completion */ + numbytes = urb->actual_length; + if (unlikely(numbytes == 0)) { + dev_warn(cs->dev, + "control read: empty block received\n"); + goto retry; + } + if (unlikely(numbytes != ucs->rcvbuf_size)) { + dev_warn(cs->dev, + "control read: received %d chars, expected %d\n", + numbytes, ucs->rcvbuf_size); + if (numbytes > ucs->rcvbuf_size) + numbytes = ucs->rcvbuf_size; + } + + /* copy received bytes to inbuf */ + have_data = gigaset_fill_inbuf(inbuf, ucs->rcvbuf, numbytes); + + if (unlikely(numbytes < ucs->rcvbuf_size)) { + /* incomplete - resubmit for remaining bytes */ + ucs->rcvbuf_size -= numbytes; + ucs->retry_cmd_in = 0; + goto retry; + } + break; + + case -ENOENT: /* cancelled */ + case -ECONNRESET: /* cancelled (async) */ + case -EINPROGRESS: /* pending */ + case -ENODEV: /* device removed */ + case -ESHUTDOWN: /* device shut down */ + /* no action necessary */ + gig_dbg(DEBUG_USBREQ, "%s: %s", + __func__, get_usb_statmsg(urb->status)); + break; + + default: /* severe trouble */ + dev_warn(cs->dev, "control read: %s\n", + get_usb_statmsg(urb->status)); + retry: + if (ucs->retry_cmd_in++ < BAS_RETRY) { + dev_notice(cs->dev, "control read: retry %d\n", + ucs->retry_cmd_in); + rc = atread_submit(cs, BAS_TIMEOUT); + if (rc >= 0 || rc == -ENODEV) + /* resubmitted or disconnected */ + /* - bypass regular exit block */ + return; + } else { + dev_err(cs->dev, + "control read: giving up after %d tries\n", + ucs->retry_cmd_in); + } + error_reset(cs); + } + + kfree(ucs->rcvbuf); + ucs->rcvbuf = NULL; + ucs->rcvbuf_size = 0; + if (have_data) { + gig_dbg(DEBUG_INTR, "%s-->BH", __func__); + gigaset_schedule_event(cs); + } +} + /* read_iso_callback * USB completion handler for B channel isochronous input * called by the USB subsystem in interrupt context @@ -1400,7 +1378,6 @@ static void req_timeout(unsigned long data) case HD_CLOSE_B1CHANNEL: dev_err(bcs->cs->dev, "timeout closing channel %d\n", bcs->channel + 1); - error_reset(bcs->cs); break; default: @@ -1419,61 +1396,22 @@ static void req_timeout(unsigned long data) static void write_ctrl_callback(struct urb *urb, struct pt_regs *regs) { struct bas_cardstate *ucs = urb->context; - int rc; unsigned long flags; - /* check status */ - switch (urb->status) { - case 0: /* normal completion */ - spin_lock_irqsave(&ucs->lock, flags); - switch (ucs->pending) { - case HD_DEVICE_INIT_ACK: /* no reply expected */ - del_timer(&ucs->timer_ctrl); - ucs->pending = 0; - break; - } - spin_unlock_irqrestore(&ucs->lock, flags); - return; - - case -ENOENT: /* cancelled */ - case -ECONNRESET: /* cancelled (async) */ - case -EINPROGRESS: /* pending */ - case -ENODEV: /* device removed */ - case -ESHUTDOWN: /* device shut down */ - /* ignore silently */ - gig_dbg(DEBUG_USBREQ, "%s: %s", - __func__, get_usb_statmsg(urb->status)); + spin_lock_irqsave(&ucs->lock, flags); + if (urb->status && ucs->pending) { + dev_err(&ucs->interface->dev, + "control request 0x%02x failed: %s\n", + ucs->pending, get_usb_statmsg(urb->status)); + del_timer(&ucs->timer_ctrl); + ucs->pending = 0; + } + /* individual handling of specific request types */ + switch (ucs->pending) { + case HD_DEVICE_INIT_ACK: /* no reply expected */ + ucs->pending = 0; break; - - default: /* any failure */ - if (++ucs->retry_ctrl > BAS_RETRY) { - dev_err(&ucs->interface->dev, - "control request 0x%02x failed: %s\n", - ucs->dr_ctrl.bRequest, - get_usb_statmsg(urb->status)); - break; /* give up */ - } - dev_notice(&ucs->interface->dev, - "control request 0x%02x: %s, retry %d\n", - ucs->dr_ctrl.bRequest, get_usb_statmsg(urb->status), - ucs->retry_ctrl); - /* urb->dev is clobbered by USB subsystem */ - urb->dev = ucs->udev; - rc = usb_submit_urb(urb, SLAB_ATOMIC); - if (unlikely(rc)) { - dev_err(&ucs->interface->dev, - "could not resubmit request 0x%02x: %s\n", - ucs->dr_ctrl.bRequest, get_usb_rcmsg(rc)); - break; - } - /* resubmitted */ - return; } - - /* failed, clear pending request */ - spin_lock_irqsave(&ucs->lock, flags); - del_timer(&ucs->timer_ctrl); - ucs->pending = 0; spin_unlock_irqrestore(&ucs->lock, flags); } @@ -1517,11 +1455,9 @@ static int req_submit(struct bc_state *bcs, int req, int val, int timeout) usb_sndctrlpipe(ucs->udev, 0), (unsigned char*) &ucs->dr_ctrl, NULL, 0, write_ctrl_callback, ucs); - ucs->retry_ctrl = 0; - ret = usb_submit_urb(ucs->urb_ctrl, SLAB_ATOMIC); - if (unlikely(ret)) { + if ((ret = usb_submit_urb(ucs->urb_ctrl, SLAB_ATOMIC)) != 0) { dev_err(bcs->cs->dev, "could not submit request 0x%02x: %s\n", - req, get_usb_rcmsg(ret)); + req, get_usb_statmsg(ret)); spin_unlock_irqrestore(&ucs->lock, flags); return ret; } diff --git a/trunk/drivers/isdn/gigaset/ev-layer.c b/trunk/drivers/isdn/gigaset/ev-layer.c index 44f02dbd1111..18e05c09b71c 100644 --- a/trunk/drivers/isdn/gigaset/ev-layer.c +++ b/trunk/drivers/isdn/gigaset/ev-layer.c @@ -1262,8 +1262,7 @@ static void do_action(int action, struct cardstate *cs, break; case ACT_HUPMODEM: /* send "+++" (hangup in unimodem mode) */ - if (cs->connected) - cs->ops->write_cmd(cs, "+++", 3, NULL); + cs->ops->write_cmd(cs, "+++", 3, NULL); break; case ACT_RING: /* get fresh AT state structure for new CID */ @@ -1295,6 +1294,7 @@ static void do_action(int action, struct cardstate *cs, break; case ACT_ICALL: handle_icall(cs, bcs, p_at_state); + at_state = *p_at_state; break; case ACT_FAILSDOWN: dev_warn(cs->dev, "Could not shut down the device.\n"); @@ -1334,8 +1334,10 @@ static void do_action(int action, struct cardstate *cs, */ at_state->pending_commands |= PC_DLE0; atomic_set(&cs->commands_pending, 1); - } else + } else { disconnect(p_at_state); + at_state = *p_at_state; + } break; case ACT_FAKEDLE0: at_state->int_var[VAR_ZDLE] = 0; @@ -1352,8 +1354,10 @@ static void do_action(int action, struct cardstate *cs, at_state->cid = -1; if (bcs && cs->onechannel) at_state->pending_commands |= PC_DLE0; - else + else { disconnect(p_at_state); + at_state = *p_at_state; + } schedule_init(cs, MS_RECOVER); break; case ACT_FAILDLE0: @@ -1406,6 +1410,7 @@ static void do_action(int action, struct cardstate *cs, case ACT_ABORTACCEPT: /* hangup/error/timeout during ICALL processing */ disconnect(p_at_state); + at_state = *p_at_state; break; case ACT_ABORTDIAL: /* error/timeout during dial preparation */ diff --git a/trunk/drivers/isdn/hisax/q931.c b/trunk/drivers/isdn/hisax/q931.c index aacbf0d14b64..abecabf8c271 100644 --- a/trunk/drivers/isdn/hisax/q931.c +++ b/trunk/drivers/isdn/hisax/q931.c @@ -1402,12 +1402,12 @@ dlogframe(struct IsdnCardState *cs, struct sk_buff *skb, int dir) } /* No, locate it in the table */ if (cset == 0) { - for (i = 0; i < IESIZE_NI1; i++) + for (i = 0; i < IESIZE; i++) if (*buf == ielist_ni1[i].nr) break; /* When not found, give appropriate msg */ - if (i != IESIZE_NI1) { + if (i != IESIZE) { dp += sprintf(dp, " %s\n", ielist_ni1[i].descr); dp += ielist_ni1[i].f(dp, buf); } else diff --git a/trunk/drivers/macintosh/Makefile b/trunk/drivers/macintosh/Makefile index 45a268f8047e..8972e53d2dcb 100644 --- a/trunk/drivers/macintosh/Makefile +++ b/trunk/drivers/macintosh/Makefile @@ -11,7 +11,7 @@ obj-$(CONFIG_MAC_EMUMOUSEBTN) += mac_hid.o obj-$(CONFIG_INPUT_ADBHID) += adbhid.o obj-$(CONFIG_ANSLCD) += ans-lcd.o -obj-$(CONFIG_ADB_PMU) += via-pmu.o via-pmu-event.o +obj-$(CONFIG_ADB_PMU) += via-pmu.o obj-$(CONFIG_PMAC_BACKLIGHT) += via-pmu-backlight.o obj-$(CONFIG_ADB_CUDA) += via-cuda.o obj-$(CONFIG_PMAC_APM_EMU) += apm_emu.o diff --git a/trunk/drivers/macintosh/adbhid.c b/trunk/drivers/macintosh/adbhid.c index cbfbbe2b150a..c26e1236b275 100644 --- a/trunk/drivers/macintosh/adbhid.c +++ b/trunk/drivers/macintosh/adbhid.c @@ -179,7 +179,7 @@ u8 adb_to_linux_keycodes[128] = { /* 0x65 */ KEY_F9, /* 67 */ /* 0x66 */ KEY_HANJA, /* 123 */ /* 0x67 */ KEY_F11, /* 87 */ - /* 0x68 */ KEY_HANGEUL, /* 122 */ + /* 0x68 */ KEY_HANGUEL, /* 122 */ /* 0x69 */ KEY_SYSRQ, /* 99 */ /* 0x6a */ 0, /* 0x6b */ KEY_SCROLLLOCK, /* 70 */ diff --git a/trunk/drivers/macintosh/via-pmu-event.c b/trunk/drivers/macintosh/via-pmu-event.c deleted file mode 100644 index 25cd56542328..000000000000 --- a/trunk/drivers/macintosh/via-pmu-event.c +++ /dev/null @@ -1,80 +0,0 @@ -/* - * via-pmu event device for reporting some events that come through the PMU - * - * Copyright 2006 Johannes Berg - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License as published by - * the Free Software Foundation; either version 2 of the License, or - * (at your option) any later version. - * - * This program is distributed in the hope that it will be useful, but - * WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or - * NON INFRINGEMENT. See the GNU General Public License for more - * details. - * - * You should have received a copy of the GNU General Public License - * along with this program; if not, write to the Free Software - * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA - * - */ - -#include -#include -#include -#include "via-pmu-event.h" - -static struct input_dev *pmu_input_dev; - -static int __init via_pmu_event_init(void) -{ - int err; - - /* do other models report button/lid status? */ - if (pmu_get_model() != PMU_KEYLARGO_BASED) - return -ENODEV; - - pmu_input_dev = input_allocate_device(); - if (!pmu_input_dev) - return -ENOMEM; - - pmu_input_dev->name = "PMU"; - pmu_input_dev->id.bustype = BUS_HOST; - pmu_input_dev->id.vendor = 0x0001; - pmu_input_dev->id.product = 0x0001; - pmu_input_dev->id.version = 0x0100; - - set_bit(EV_KEY, pmu_input_dev->evbit); - set_bit(EV_SW, pmu_input_dev->evbit); - set_bit(KEY_POWER, pmu_input_dev->keybit); - set_bit(SW_LID, pmu_input_dev->swbit); - - err = input_register_device(pmu_input_dev); - if (err) - input_free_device(pmu_input_dev); - return err; -} - -void via_pmu_event(int key, int down) -{ - - if (unlikely(!pmu_input_dev)) - return; - - switch (key) { - case PMU_EVT_POWER: - input_report_key(pmu_input_dev, KEY_POWER, down); - break; - case PMU_EVT_LID: - input_report_switch(pmu_input_dev, SW_LID, down); - break; - default: - /* no such key handled */ - return; - } - - input_sync(pmu_input_dev); -} - -late_initcall(via_pmu_event_init); diff --git a/trunk/drivers/macintosh/via-pmu-event.h b/trunk/drivers/macintosh/via-pmu-event.h deleted file mode 100644 index 72c54de408e8..000000000000 --- a/trunk/drivers/macintosh/via-pmu-event.h +++ /dev/null @@ -1,8 +0,0 @@ -#ifndef __VIA_PMU_EVENT_H -#define __VIA_PMU_EVENT_H - -#define PMU_EVT_POWER 0 -#define PMU_EVT_LID 1 -extern void via_pmu_event(int key, int down); - -#endif /* __VIA_PMU_EVENT_H */ diff --git a/trunk/drivers/macintosh/via-pmu.c b/trunk/drivers/macintosh/via-pmu.c index 1ab4f16c08b9..2a355ae59562 100644 --- a/trunk/drivers/macintosh/via-pmu.c +++ b/trunk/drivers/macintosh/via-pmu.c @@ -69,8 +69,6 @@ #include #endif -#include "via-pmu-event.h" - /* Some compile options */ #undef SUSPEND_USES_PMU #define DEBUG_SLEEP @@ -1429,12 +1427,6 @@ pmu_handle_data(unsigned char *data, int len, struct pt_regs *regs) if (pmu_battery_count) query_battery_state(); pmu_pass_intr(data, len); - /* len == 6 is probably a bad check. But how do I - * know what PMU versions send what events here? */ - if (len == 6) { - via_pmu_event(PMU_EVT_POWER, !!(data[1]&8)); - via_pmu_event(PMU_EVT_LID, data[1]&1); - } } else { pmu_pass_intr(data, len); } diff --git a/trunk/drivers/md/Kconfig b/trunk/drivers/md/Kconfig index bf869ed03eed..ac25a48362ac 100644 --- a/trunk/drivers/md/Kconfig +++ b/trunk/drivers/md/Kconfig @@ -90,7 +90,7 @@ config MD_RAID10 depends on BLK_DEV_MD && EXPERIMENTAL ---help--- RAID-10 provides a combination of striping (RAID-0) and - mirroring (RAID-1) with easier configuration and more flexible + mirroring (RAID-1) with easier configuration and more flexable layout. Unlike RAID-0, but like RAID-1, RAID-10 requires all devices to be the same size (or at least, only as much as the smallest device @@ -104,8 +104,8 @@ config MD_RAID10 If unsure, say Y. -config MD_RAID456 - tristate "RAID-4/RAID-5/RAID-6 mode" +config MD_RAID5 + tristate "RAID-4/RAID-5 mode" depends on BLK_DEV_MD ---help--- A RAID-5 set of N drives with a capacity of C MB per drive provides @@ -116,28 +116,20 @@ config MD_RAID456 while a RAID-5 set distributes the parity across the drives in one of the available parity distribution methods. - A RAID-6 set of N drives with a capacity of C MB per drive - provides the capacity of C * (N - 2) MB, and protects - against a failure of any two drives. For a given sector - (row) number, (N - 2) drives contain data sectors, and two - drives contains two independent redundancy syndromes. Like - RAID-5, RAID-6 distributes the syndromes across the drives - in one of the available parity distribution methods. - Information about Software RAID on Linux is contained in the Software-RAID mini-HOWTO, available from . There you will also learn where to get the supporting user space utilities raidtools. - If you want to use such a RAID-4/RAID-5/RAID-6 set, say Y. To + If you want to use such a RAID-4/RAID-5 set, say Y. To compile this code as a module, choose M here: the module - will be called raid456. + will be called raid5. If unsure, say Y. config MD_RAID5_RESHAPE bool "Support adding drives to a raid-5 array (experimental)" - depends on MD_RAID456 && EXPERIMENTAL + depends on MD_RAID5 && EXPERIMENTAL ---help--- A RAID-5 set can be expanded by adding extra drives. This requires "restriping" the array which means (almost) every @@ -147,7 +139,7 @@ config MD_RAID5_RESHAPE is online. However it is still EXPERIMENTAL code. It should work, but please be sure that you have backups. - You will need mdadm version 2.4.1 or later to use this + You will need mdadm verion 2.4.1 or later to use this feature safely. During the early stage of reshape there is a critical section where live data is being over-written. A crash during this time needs extra care for recovery. The @@ -162,6 +154,28 @@ config MD_RAID5_RESHAPE There should be enough spares already present to make the new array workable. +config MD_RAID6 + tristate "RAID-6 mode" + depends on BLK_DEV_MD + ---help--- + A RAID-6 set of N drives with a capacity of C MB per drive + provides the capacity of C * (N - 2) MB, and protects + against a failure of any two drives. For a given sector + (row) number, (N - 2) drives contain data sectors, and two + drives contains two independent redundancy syndromes. Like + RAID-5, RAID-6 distributes the syndromes across the drives + in one of the available parity distribution methods. + + RAID-6 requires mdadm-1.5.0 or later, available at: + + ftp://ftp.kernel.org/pub/linux/utils/raid/mdadm/ + + If you want to use such a RAID-6 set, say Y. To compile + this code as a module, choose M here: the module will be + called raid6. + + If unsure, say Y. + config MD_MULTIPATH tristate "Multipath I/O support" depends on BLK_DEV_MD @@ -221,7 +235,7 @@ config DM_SNAPSHOT tristate "Snapshot target (EXPERIMENTAL)" depends on BLK_DEV_DM && EXPERIMENTAL ---help--- - Allow volume managers to take writable snapshots of a device. + Allow volume managers to take writeable snapshots of a device. config DM_MIRROR tristate "Mirror target (EXPERIMENTAL)" diff --git a/trunk/drivers/md/Makefile b/trunk/drivers/md/Makefile index 34957a68d921..d3efedf6a6ad 100644 --- a/trunk/drivers/md/Makefile +++ b/trunk/drivers/md/Makefile @@ -8,7 +8,7 @@ dm-multipath-objs := dm-hw-handler.o dm-path-selector.o dm-mpath.o dm-snapshot-objs := dm-snap.o dm-exception-store.o dm-mirror-objs := dm-log.o dm-raid1.o md-mod-objs := md.o bitmap.o -raid456-objs := raid5.o raid6algos.o raid6recov.o raid6tables.o \ +raid6-objs := raid6main.o raid6algos.o raid6recov.o raid6tables.o \ raid6int1.o raid6int2.o raid6int4.o \ raid6int8.o raid6int16.o raid6int32.o \ raid6altivec1.o raid6altivec2.o raid6altivec4.o \ @@ -25,7 +25,8 @@ obj-$(CONFIG_MD_LINEAR) += linear.o obj-$(CONFIG_MD_RAID0) += raid0.o obj-$(CONFIG_MD_RAID1) += raid1.o obj-$(CONFIG_MD_RAID10) += raid10.o -obj-$(CONFIG_MD_RAID456) += raid456.o xor.o +obj-$(CONFIG_MD_RAID5) += raid5.o xor.o +obj-$(CONFIG_MD_RAID6) += raid6.o xor.o obj-$(CONFIG_MD_MULTIPATH) += multipath.o obj-$(CONFIG_MD_FAULTY) += faulty.o obj-$(CONFIG_BLK_DEV_MD) += md-mod.o diff --git a/trunk/drivers/md/bitmap.c b/trunk/drivers/md/bitmap.c index ebbd2d856256..f8ffaee20ff8 100644 --- a/trunk/drivers/md/bitmap.c +++ b/trunk/drivers/md/bitmap.c @@ -7,6 +7,7 @@ * additions, Copyright (C) 2003-2004, Paul Clements, SteelEye Technology, Inc.: * - added disk storage for bitmap * - changes to allow various bitmap chunk sizes + * - added bitmap daemon (to asynchronously clear bitmap bits from disk) */ /* @@ -14,6 +15,9 @@ * * flush after percent set rather than just time based. (maybe both). * wait if count gets too high, wake when it drops to half. + * allow bitmap to be mirrored with superblock (before or after...) + * allow hot-add to re-instate a current device. + * allow hot-add of bitmap after quiessing device */ #include @@ -68,6 +72,24 @@ static inline char * bmname(struct bitmap *bitmap) } +/* + * test if the bitmap is active + */ +int bitmap_active(struct bitmap *bitmap) +{ + unsigned long flags; + int res = 0; + + if (!bitmap) + return res; + spin_lock_irqsave(&bitmap->lock, flags); + res = bitmap->flags & BITMAP_ACTIVE; + spin_unlock_irqrestore(&bitmap->lock, flags); + return res; +} + +#define WRITE_POOL_SIZE 256 + /* * just a placeholder - calls kmalloc for bitmap pages */ @@ -247,8 +269,6 @@ static struct page *read_sb_page(mddev_t *mddev, long offset, unsigned long inde if (sync_page_io(rdev->bdev, target, PAGE_SIZE, page, READ)) { page->index = index; - attach_page_buffers(page, NULL); /* so that free_buffer will - * quietly no-op */ return page; } } @@ -280,132 +300,77 @@ static int write_sb_page(mddev_t *mddev, long offset, struct page *page, int wai */ static int write_page(struct bitmap *bitmap, struct page *page, int wait) { - struct buffer_head *bh; + int ret = -ENOMEM; if (bitmap->file == NULL) return write_sb_page(bitmap->mddev, bitmap->offset, page, wait); - bh = page_buffers(page); + flush_dcache_page(page); /* make sure visible to anyone reading the file */ - while (bh && bh->b_blocknr) { - atomic_inc(&bitmap->pending_writes); - set_buffer_locked(bh); - set_buffer_mapped(bh); - submit_bh(WRITE, bh); - bh = bh->b_this_page; + if (wait) + lock_page(page); + else { + if (TestSetPageLocked(page)) + return -EAGAIN; /* already locked */ + if (PageWriteback(page)) { + unlock_page(page); + return -EAGAIN; + } } - if (wait) { - wait_event(bitmap->write_wait, - atomic_read(&bitmap->pending_writes)==0); - return (bitmap->flags & BITMAP_WRITE_ERROR) ? -EIO : 0; + ret = page->mapping->a_ops->prepare_write(bitmap->file, page, 0, PAGE_SIZE); + if (!ret) + ret = page->mapping->a_ops->commit_write(bitmap->file, page, 0, + PAGE_SIZE); + if (ret) { + unlock_page(page); + return ret; } - return 0; -} -static void end_bitmap_write(struct buffer_head *bh, int uptodate) -{ - struct bitmap *bitmap = bh->b_private; - unsigned long flags; - - if (!uptodate) { - spin_lock_irqsave(&bitmap->lock, flags); - bitmap->flags |= BITMAP_WRITE_ERROR; - spin_unlock_irqrestore(&bitmap->lock, flags); - } - if (atomic_dec_and_test(&bitmap->pending_writes)) - wake_up(&bitmap->write_wait); -} - -/* copied from buffer.c */ -static void -__clear_page_buffers(struct page *page) -{ - ClearPagePrivate(page); - set_page_private(page, 0); - page_cache_release(page); -} -static void free_buffers(struct page *page) -{ - struct buffer_head *bh = page_buffers(page); - - while (bh) { - struct buffer_head *next = bh->b_this_page; - free_buffer_head(bh); - bh = next; + set_page_dirty(page); /* force it to be written out */ + + if (!wait) { + /* add to list to be waited for by daemon */ + struct page_list *item = mempool_alloc(bitmap->write_pool, GFP_NOIO); + item->page = page; + get_page(page); + spin_lock(&bitmap->write_lock); + list_add(&item->list, &bitmap->complete_pages); + spin_unlock(&bitmap->write_lock); + md_wakeup_thread(bitmap->writeback_daemon); } - __clear_page_buffers(page); - put_page(page); + return write_one_page(page, wait); } -/* read a page from a file. - * We both read the page, and attach buffers to the page to record the - * address of each block (using bmap). These addresses will be used - * to write the block later, completely bypassing the filesystem. - * This usage is similar to how swap files are handled, and allows us - * to write to a file with no concerns of memory allocation failing. - */ +/* read a page from a file, pinning it into cache, and return bytes_read */ static struct page *read_page(struct file *file, unsigned long index, - struct bitmap *bitmap, - unsigned long count) + unsigned long *bytes_read) { + struct inode *inode = file->f_mapping->host; struct page *page = NULL; - struct inode *inode = file->f_dentry->d_inode; - struct buffer_head *bh; - sector_t block; + loff_t isize = i_size_read(inode); + unsigned long end_index = isize >> PAGE_SHIFT; PRINTK("read bitmap file (%dB @ %Lu)\n", (int)PAGE_SIZE, (unsigned long long)index << PAGE_SHIFT); - page = alloc_page(GFP_KERNEL); - if (!page) - page = ERR_PTR(-ENOMEM); + page = read_cache_page(inode->i_mapping, index, + (filler_t *)inode->i_mapping->a_ops->readpage, file); if (IS_ERR(page)) goto out; - - bh = alloc_page_buffers(page, 1<i_blkbits, 0); - if (!bh) { + wait_on_page_locked(page); + if (!PageUptodate(page) || PageError(page)) { put_page(page); - page = ERR_PTR(-ENOMEM); + page = ERR_PTR(-EIO); goto out; } - attach_page_buffers(page, bh); - block = index << (PAGE_SHIFT - inode->i_blkbits); - while (bh) { - if (count == 0) - bh->b_blocknr = 0; - else { - bh->b_blocknr = bmap(inode, block); - if (bh->b_blocknr == 0) { - /* Cannot use this file! */ - free_buffers(page); - page = ERR_PTR(-EINVAL); - goto out; - } - bh->b_bdev = inode->i_sb->s_bdev; - if (count < (1<i_blkbits)) - count = 0; - else - count -= (1<i_blkbits); - - bh->b_end_io = end_bitmap_write; - bh->b_private = bitmap; - atomic_inc(&bitmap->pending_writes); - set_buffer_locked(bh); - set_buffer_mapped(bh); - submit_bh(READ, bh); - } - block++; - bh = bh->b_this_page; - } - page->index = index; - wait_event(bitmap->write_wait, - atomic_read(&bitmap->pending_writes)==0); - if (bitmap->flags & BITMAP_WRITE_ERROR) { - free_buffers(page); - page = ERR_PTR(-EIO); - } + if (index > end_index) /* we have read beyond EOF */ + *bytes_read = 0; + else if (index == end_index) /* possible short read */ + *bytes_read = isize & ~PAGE_MASK; + else + *bytes_read = PAGE_SIZE; /* got a full page */ out: if (IS_ERR(page)) printk(KERN_ALERT "md: bitmap read error: (%dB @ %Lu): %ld\n", @@ -476,14 +441,16 @@ static int bitmap_read_sb(struct bitmap *bitmap) char *reason = NULL; bitmap_super_t *sb; unsigned long chunksize, daemon_sleep, write_behind; + unsigned long bytes_read; unsigned long long events; int err = -EINVAL; /* page 0 is the superblock, read it... */ if (bitmap->file) - bitmap->sb_page = read_page(bitmap->file, 0, bitmap, PAGE_SIZE); + bitmap->sb_page = read_page(bitmap->file, 0, &bytes_read); else { bitmap->sb_page = read_sb_page(bitmap->mddev, bitmap->offset, 0); + bytes_read = PAGE_SIZE; } if (IS_ERR(bitmap->sb_page)) { err = PTR_ERR(bitmap->sb_page); @@ -493,6 +460,13 @@ static int bitmap_read_sb(struct bitmap *bitmap) sb = (bitmap_super_t *)kmap_atomic(bitmap->sb_page, KM_USER0); + if (bytes_read < sizeof(*sb)) { /* short read */ + printk(KERN_INFO "%s: bitmap file superblock truncated\n", + bmname(bitmap)); + err = -ENOSPC; + goto out; + } + chunksize = le32_to_cpu(sb->chunksize); daemon_sleep = le32_to_cpu(sb->daemon_sleep); write_behind = le32_to_cpu(sb->write_behind); @@ -576,6 +550,7 @@ static void bitmap_mask_state(struct bitmap *bitmap, enum bitmap_state bits, spin_unlock_irqrestore(&bitmap->lock, flags); return; } + get_page(bitmap->sb_page); spin_unlock_irqrestore(&bitmap->lock, flags); sb = (bitmap_super_t *)kmap_atomic(bitmap->sb_page, KM_USER0); switch (op) { @@ -586,6 +561,7 @@ static void bitmap_mask_state(struct bitmap *bitmap, enum bitmap_state bits, default: BUG(); } kunmap_atomic(sb, KM_USER0); + put_page(bitmap->sb_page); } /* @@ -638,17 +614,48 @@ static void bitmap_file_unmap(struct bitmap *bitmap) while (pages--) if (map[pages]->index != 0) /* 0 is sb_page, release it below */ - free_buffers(map[pages]); + put_page(map[pages]); kfree(map); kfree(attr); - if (sb_page) - free_buffers(sb_page); + safe_put_page(sb_page); +} + +static void bitmap_stop_daemon(struct bitmap *bitmap); + +/* dequeue the next item in a page list -- don't call from irq context */ +static struct page_list *dequeue_page(struct bitmap *bitmap) +{ + struct page_list *item = NULL; + struct list_head *head = &bitmap->complete_pages; + + spin_lock(&bitmap->write_lock); + if (list_empty(head)) + goto out; + item = list_entry(head->prev, struct page_list, list); + list_del(head->prev); +out: + spin_unlock(&bitmap->write_lock); + return item; +} + +static void drain_write_queues(struct bitmap *bitmap) +{ + struct page_list *item; + + while ((item = dequeue_page(bitmap))) { + /* don't bother to wait */ + put_page(item->page); + mempool_free(item, bitmap->write_pool); + } + + wake_up(&bitmap->write_wait); } static void bitmap_file_put(struct bitmap *bitmap) { struct file *file; + struct inode *inode; unsigned long flags; spin_lock_irqsave(&bitmap->lock, flags); @@ -656,14 +663,17 @@ static void bitmap_file_put(struct bitmap *bitmap) bitmap->file = NULL; spin_unlock_irqrestore(&bitmap->lock, flags); - if (file) - wait_event(bitmap->write_wait, - atomic_read(&bitmap->pending_writes)==0); + bitmap_stop_daemon(bitmap); + + drain_write_queues(bitmap); + bitmap_file_unmap(bitmap); if (file) { - struct inode *inode = file->f_dentry->d_inode; - invalidate_inode_pages(inode->i_mapping); + inode = file->f_mapping->host; + spin_lock(&inode->i_lock); + atomic_set(&inode->i_writecount, 1); /* allow writes again */ + spin_unlock(&inode->i_lock); fput(file); } } @@ -698,27 +708,26 @@ static void bitmap_file_kick(struct bitmap *bitmap) } enum bitmap_page_attr { - BITMAP_PAGE_DIRTY = 0, // there are set bits that need to be synced - BITMAP_PAGE_CLEAN = 1, // there are bits that might need to be cleared - BITMAP_PAGE_NEEDWRITE=2, // there are cleared bits that need to be synced + BITMAP_PAGE_DIRTY = 1, // there are set bits that need to be synced + BITMAP_PAGE_CLEAN = 2, // there are bits that might need to be cleared + BITMAP_PAGE_NEEDWRITE=4, // there are cleared bits that need to be synced }; static inline void set_page_attr(struct bitmap *bitmap, struct page *page, enum bitmap_page_attr attr) { - __set_bit((page->index<<2) + attr, bitmap->filemap_attr); + bitmap->filemap_attr[page->index] |= attr; } static inline void clear_page_attr(struct bitmap *bitmap, struct page *page, enum bitmap_page_attr attr) { - __clear_bit((page->index<<2) + attr, bitmap->filemap_attr); + bitmap->filemap_attr[page->index] &= ~attr; } -static inline unsigned long test_page_attr(struct bitmap *bitmap, struct page *page, - enum bitmap_page_attr attr) +static inline unsigned long get_page_attr(struct bitmap *bitmap, struct page *page) { - return test_bit((page->index<<2) + attr, bitmap->filemap_attr); + return bitmap->filemap_attr[page->index]; } /* @@ -742,6 +751,11 @@ static void bitmap_file_set_bit(struct bitmap *bitmap, sector_t block) page = filemap_get_page(bitmap, chunk); bit = file_page_offset(chunk); + + /* make sure the page stays cached until it gets written out */ + if (! (get_page_attr(bitmap, page) & BITMAP_PAGE_DIRTY)) + get_page(page); + /* set the bit */ kaddr = kmap_atomic(page, KM_USER0); if (bitmap->flags & BITMAP_HOSTENDIAN) @@ -761,8 +775,7 @@ static void bitmap_file_set_bit(struct bitmap *bitmap, sector_t block) * sync the dirty pages of the bitmap file to disk */ int bitmap_unplug(struct bitmap *bitmap) { - unsigned long i, flags; - int dirty, need_write; + unsigned long i, attr, flags; struct page *page; int wait = 0; int err; @@ -779,26 +792,35 @@ int bitmap_unplug(struct bitmap *bitmap) return 0; } page = bitmap->filemap[i]; - dirty = test_page_attr(bitmap, page, BITMAP_PAGE_DIRTY); - need_write = test_page_attr(bitmap, page, BITMAP_PAGE_NEEDWRITE); + attr = get_page_attr(bitmap, page); clear_page_attr(bitmap, page, BITMAP_PAGE_DIRTY); clear_page_attr(bitmap, page, BITMAP_PAGE_NEEDWRITE); - if (dirty) + if ((attr & BITMAP_PAGE_DIRTY)) wait = 1; spin_unlock_irqrestore(&bitmap->lock, flags); - if (dirty | need_write) + if (attr & (BITMAP_PAGE_DIRTY | BITMAP_PAGE_NEEDWRITE)) { err = write_page(bitmap, page, 0); + if (err == -EAGAIN) { + if (attr & BITMAP_PAGE_DIRTY) + err = write_page(bitmap, page, 1); + else + err = 0; + } + if (err) + return 1; + } } if (wait) { /* if any writes were performed, we need to wait on them */ - if (bitmap->file) - wait_event(bitmap->write_wait, - atomic_read(&bitmap->pending_writes)==0); - else + if (bitmap->file) { + spin_lock_irq(&bitmap->write_lock); + wait_event_lock_irq(bitmap->write_wait, + list_empty(&bitmap->complete_pages), bitmap->write_lock, + wake_up_process(bitmap->writeback_daemon->tsk)); + spin_unlock_irq(&bitmap->write_lock); + } else md_super_wait(bitmap->mddev); } - if (bitmap->flags & BITMAP_WRITE_ERROR) - bitmap_file_kick(bitmap); return 0; } @@ -820,7 +842,7 @@ static int bitmap_init_from_disk(struct bitmap *bitmap, sector_t start) struct page *page = NULL, *oldpage = NULL; unsigned long num_pages, bit_cnt = 0; struct file *file; - unsigned long bytes, offset; + unsigned long bytes, offset, dummy; int outofdate; int ret = -ENOSPC; void *paddr; @@ -857,12 +879,7 @@ static int bitmap_init_from_disk(struct bitmap *bitmap, sector_t start) if (!bitmap->filemap) goto out; - /* We need 4 bits per page, rounded up to a multiple of sizeof(unsigned long) */ - bitmap->filemap_attr = kzalloc( - (((num_pages*4/8)+sizeof(unsigned long)-1) - /sizeof(unsigned long)) - *sizeof(unsigned long), - GFP_KERNEL); + bitmap->filemap_attr = kzalloc(sizeof(long) * num_pages, GFP_KERNEL); if (!bitmap->filemap_attr) goto out; @@ -873,12 +890,7 @@ static int bitmap_init_from_disk(struct bitmap *bitmap, sector_t start) index = file_page_index(i); bit = file_page_offset(i); if (index != oldindex) { /* this is a new page, read it in */ - int count; /* unmap the old page, we're done with it */ - if (index == num_pages-1) - count = bytes - index * PAGE_SIZE; - else - count = PAGE_SIZE; if (index == 0) { /* * if we're here then the superblock page @@ -888,7 +900,7 @@ static int bitmap_init_from_disk(struct bitmap *bitmap, sector_t start) page = bitmap->sb_page; offset = sizeof(bitmap_super_t); } else if (file) { - page = read_page(file, index, bitmap, count); + page = read_page(file, index, &dummy); offset = 0; } else { page = read_sb_page(bitmap->mddev, bitmap->offset, index); @@ -959,11 +971,12 @@ void bitmap_write_all(struct bitmap *bitmap) /* We don't actually write all bitmap blocks here, * just flag them as needing to be written */ - int i; - for (i=0; i < bitmap->file_pages; i++) - set_page_attr(bitmap, bitmap->filemap[i], - BITMAP_PAGE_NEEDWRITE); + unsigned long chunks = bitmap->chunks; + unsigned long bytes = (chunks+7)/8 + sizeof(bitmap_super_t); + unsigned long num_pages = (bytes + PAGE_SIZE-1) / PAGE_SIZE; + while (num_pages--) + bitmap->filemap_attr[num_pages] |= BITMAP_PAGE_NEEDWRITE; } @@ -994,6 +1007,7 @@ int bitmap_daemon_work(struct bitmap *bitmap) struct page *page = NULL, *lastpage = NULL; int err = 0; int blocks; + int attr; void *paddr; if (bitmap == NULL) @@ -1015,34 +1029,43 @@ int bitmap_daemon_work(struct bitmap *bitmap) if (page != lastpage) { /* skip this page unless it's marked as needing cleaning */ - if (!test_page_attr(bitmap, page, BITMAP_PAGE_CLEAN)) { - int need_write = test_page_attr(bitmap, page, - BITMAP_PAGE_NEEDWRITE); - if (need_write) + if (!((attr=get_page_attr(bitmap, page)) & BITMAP_PAGE_CLEAN)) { + if (attr & BITMAP_PAGE_NEEDWRITE) { + get_page(page); clear_page_attr(bitmap, page, BITMAP_PAGE_NEEDWRITE); - + } spin_unlock_irqrestore(&bitmap->lock, flags); - if (need_write) { + if (attr & BITMAP_PAGE_NEEDWRITE) { switch (write_page(bitmap, page, 0)) { + case -EAGAIN: + set_page_attr(bitmap, page, BITMAP_PAGE_NEEDWRITE); + break; case 0: break; default: bitmap_file_kick(bitmap); } + put_page(page); } continue; } /* grab the new page, sync and release the old */ + get_page(page); if (lastpage != NULL) { - if (test_page_attr(bitmap, lastpage, BITMAP_PAGE_NEEDWRITE)) { + if (get_page_attr(bitmap, lastpage) & BITMAP_PAGE_NEEDWRITE) { clear_page_attr(bitmap, lastpage, BITMAP_PAGE_NEEDWRITE); spin_unlock_irqrestore(&bitmap->lock, flags); err = write_page(bitmap, lastpage, 0); + if (err == -EAGAIN) { + err = 0; + set_page_attr(bitmap, lastpage, BITMAP_PAGE_NEEDWRITE); + } } else { set_page_attr(bitmap, lastpage, BITMAP_PAGE_NEEDWRITE); spin_unlock_irqrestore(&bitmap->lock, flags); } + put_page(lastpage); if (err) bitmap_file_kick(bitmap); } else @@ -1084,19 +1107,131 @@ int bitmap_daemon_work(struct bitmap *bitmap) /* now sync the final page */ if (lastpage != NULL) { spin_lock_irqsave(&bitmap->lock, flags); - if (test_page_attr(bitmap, lastpage, BITMAP_PAGE_NEEDWRITE)) { + if (get_page_attr(bitmap, lastpage) &BITMAP_PAGE_NEEDWRITE) { clear_page_attr(bitmap, lastpage, BITMAP_PAGE_NEEDWRITE); spin_unlock_irqrestore(&bitmap->lock, flags); err = write_page(bitmap, lastpage, 0); + if (err == -EAGAIN) { + set_page_attr(bitmap, lastpage, BITMAP_PAGE_NEEDWRITE); + err = 0; + } } else { set_page_attr(bitmap, lastpage, BITMAP_PAGE_NEEDWRITE); spin_unlock_irqrestore(&bitmap->lock, flags); } + + put_page(lastpage); } return err; } +static void daemon_exit(struct bitmap *bitmap, mdk_thread_t **daemon) +{ + mdk_thread_t *dmn; + unsigned long flags; + + /* if no one is waiting on us, we'll free the md thread struct + * and exit, otherwise we let the waiter clean things up */ + spin_lock_irqsave(&bitmap->lock, flags); + if ((dmn = *daemon)) { /* no one is waiting, cleanup and exit */ + *daemon = NULL; + spin_unlock_irqrestore(&bitmap->lock, flags); + kfree(dmn); + complete_and_exit(NULL, 0); /* do_exit not exported */ + } + spin_unlock_irqrestore(&bitmap->lock, flags); +} + +static void bitmap_writeback_daemon(mddev_t *mddev) +{ + struct bitmap *bitmap = mddev->bitmap; + struct page *page; + struct page_list *item; + int err = 0; + + if (signal_pending(current)) { + printk(KERN_INFO + "%s: bitmap writeback daemon got signal, exiting...\n", + bmname(bitmap)); + err = -EINTR; + goto out; + } + if (bitmap == NULL) + /* about to be stopped. */ + return; + + PRINTK("%s: bitmap writeback daemon woke up...\n", bmname(bitmap)); + /* wait on bitmap page writebacks */ + while ((item = dequeue_page(bitmap))) { + page = item->page; + mempool_free(item, bitmap->write_pool); + PRINTK("wait on page writeback: %p\n", page); + wait_on_page_writeback(page); + PRINTK("finished page writeback: %p\n", page); + + err = PageError(page); + put_page(page); + if (err) { + printk(KERN_WARNING "%s: bitmap file writeback " + "failed (page %lu): %d\n", + bmname(bitmap), page->index, err); + bitmap_file_kick(bitmap); + goto out; + } + } + out: + wake_up(&bitmap->write_wait); + if (err) { + printk(KERN_INFO "%s: bitmap writeback daemon exiting (%d)\n", + bmname(bitmap), err); + daemon_exit(bitmap, &bitmap->writeback_daemon); + } +} + +static mdk_thread_t *bitmap_start_daemon(struct bitmap *bitmap, + void (*func)(mddev_t *), char *name) +{ + mdk_thread_t *daemon; + char namebuf[32]; + +#ifdef INJECT_FATAL_FAULT_2 + daemon = NULL; +#else + sprintf(namebuf, "%%s_%s", name); + daemon = md_register_thread(func, bitmap->mddev, namebuf); +#endif + if (!daemon) { + printk(KERN_ERR "%s: failed to start bitmap daemon\n", + bmname(bitmap)); + return ERR_PTR(-ECHILD); + } + + md_wakeup_thread(daemon); /* start it running */ + + PRINTK("%s: %s daemon (pid %d) started...\n", + bmname(bitmap), name, daemon->tsk->pid); + + return daemon; +} + +static void bitmap_stop_daemon(struct bitmap *bitmap) +{ + /* the daemon can't stop itself... it'll just exit instead... */ + if (bitmap->writeback_daemon && ! IS_ERR(bitmap->writeback_daemon) && + current->pid != bitmap->writeback_daemon->tsk->pid) { + mdk_thread_t *daemon; + unsigned long flags; + + spin_lock_irqsave(&bitmap->lock, flags); + daemon = bitmap->writeback_daemon; + bitmap->writeback_daemon = NULL; + spin_unlock_irqrestore(&bitmap->lock, flags); + if (daemon && ! IS_ERR(daemon)) + md_unregister_thread(daemon); /* destroy the thread */ + } +} + static bitmap_counter_t *bitmap_get_counter(struct bitmap *bitmap, sector_t offset, int *blocks, int create) @@ -1365,6 +1500,8 @@ static void bitmap_free(struct bitmap *bitmap) /* free all allocated memory */ + mempool_destroy(bitmap->write_pool); + if (bp) /* deallocate the page memory */ for (k = 0; k < pages; k++) if (bp[k].map && !bp[k].hijacked) @@ -1412,20 +1549,20 @@ int bitmap_create(mddev_t *mddev) return -ENOMEM; spin_lock_init(&bitmap->lock); - atomic_set(&bitmap->pending_writes, 0); - init_waitqueue_head(&bitmap->write_wait); - bitmap->mddev = mddev; + spin_lock_init(&bitmap->write_lock); + INIT_LIST_HEAD(&bitmap->complete_pages); + init_waitqueue_head(&bitmap->write_wait); + bitmap->write_pool = mempool_create_kmalloc_pool(WRITE_POOL_SIZE, + sizeof(struct page_list)); + err = -ENOMEM; + if (!bitmap->write_pool) + goto error; + bitmap->file = file; bitmap->offset = mddev->bitmap_offset; - if (file) { - get_file(file); - do_sync_file_range(file, 0, LLONG_MAX, - SYNC_FILE_RANGE_WAIT_BEFORE | - SYNC_FILE_RANGE_WRITE | - SYNC_FILE_RANGE_WAIT_AFTER); - } + if (file) get_file(file); /* read superblock from bitmap file (this sets bitmap->chunksize) */ err = bitmap_read_sb(bitmap); if (err) @@ -1457,6 +1594,8 @@ int bitmap_create(mddev_t *mddev) if (!bitmap->bp) goto error; + bitmap->flags |= BITMAP_ACTIVE; + /* now that we have some pages available, initialize the in-memory * bitmap from the on-disk bitmap */ start = 0; @@ -1474,6 +1613,15 @@ int bitmap_create(mddev_t *mddev) mddev->bitmap = bitmap; + if (file) + /* kick off the bitmap writeback daemon */ + bitmap->writeback_daemon = + bitmap_start_daemon(bitmap, + bitmap_writeback_daemon, + "bitmap_wb"); + + if (IS_ERR(bitmap->writeback_daemon)) + return PTR_ERR(bitmap->writeback_daemon); mddev->thread->timeout = bitmap->daemon_sleep * HZ; return bitmap_update_sb(bitmap); @@ -1490,3 +1638,4 @@ EXPORT_SYMBOL(bitmap_start_sync); EXPORT_SYMBOL(bitmap_end_sync); EXPORT_SYMBOL(bitmap_unplug); EXPORT_SYMBOL(bitmap_close_sync); +EXPORT_SYMBOL(bitmap_daemon_work); diff --git a/trunk/drivers/md/dm-crypt.c b/trunk/drivers/md/dm-crypt.c index 6022ed12a795..61a590bb6241 100644 --- a/trunk/drivers/md/dm-crypt.c +++ b/trunk/drivers/md/dm-crypt.c @@ -20,7 +20,7 @@ #include "dm.h" -#define DM_MSG_PREFIX "crypt" +#define PFX "crypt: " /* * per bio private data @@ -125,19 +125,19 @@ static int crypt_iv_essiv_ctr(struct crypt_config *cc, struct dm_target *ti, u8 *salt; if (opts == NULL) { - ti->error = "Digest algorithm missing for ESSIV mode"; + ti->error = PFX "Digest algorithm missing for ESSIV mode"; return -EINVAL; } /* Hash the cipher key with the given hash algorithm */ hash_tfm = crypto_alloc_tfm(opts, CRYPTO_TFM_REQ_MAY_SLEEP); if (hash_tfm == NULL) { - ti->error = "Error initializing ESSIV hash"; + ti->error = PFX "Error initializing ESSIV hash"; return -EINVAL; } if (crypto_tfm_alg_type(hash_tfm) != CRYPTO_ALG_TYPE_DIGEST) { - ti->error = "Expected digest algorithm for ESSIV hash"; + ti->error = PFX "Expected digest algorithm for ESSIV hash"; crypto_free_tfm(hash_tfm); return -EINVAL; } @@ -145,7 +145,7 @@ static int crypt_iv_essiv_ctr(struct crypt_config *cc, struct dm_target *ti, saltsize = crypto_tfm_alg_digestsize(hash_tfm); salt = kmalloc(saltsize, GFP_KERNEL); if (salt == NULL) { - ti->error = "Error kmallocing salt storage in ESSIV"; + ti->error = PFX "Error kmallocing salt storage in ESSIV"; crypto_free_tfm(hash_tfm); return -ENOMEM; } @@ -159,20 +159,20 @@ static int crypt_iv_essiv_ctr(struct crypt_config *cc, struct dm_target *ti, CRYPTO_TFM_MODE_ECB | CRYPTO_TFM_REQ_MAY_SLEEP); if (essiv_tfm == NULL) { - ti->error = "Error allocating crypto tfm for ESSIV"; + ti->error = PFX "Error allocating crypto tfm for ESSIV"; kfree(salt); return -EINVAL; } if (crypto_tfm_alg_blocksize(essiv_tfm) != crypto_tfm_alg_ivsize(cc->tfm)) { - ti->error = "Block size of ESSIV cipher does " + ti->error = PFX "Block size of ESSIV cipher does " "not match IV size of block cipher"; crypto_free_tfm(essiv_tfm); kfree(salt); return -EINVAL; } if (crypto_cipher_setkey(essiv_tfm, salt, saltsize) < 0) { - ti->error = "Failed to set key for ESSIV cipher"; + ti->error = PFX "Failed to set key for ESSIV cipher"; crypto_free_tfm(essiv_tfm); kfree(salt); return -EINVAL; @@ -521,7 +521,7 @@ static int crypt_ctr(struct dm_target *ti, unsigned int argc, char **argv) unsigned long long tmpll; if (argc != 5) { - ti->error = "Not enough arguments"; + ti->error = PFX "Not enough arguments"; return -EINVAL; } @@ -532,21 +532,21 @@ static int crypt_ctr(struct dm_target *ti, unsigned int argc, char **argv) ivmode = strsep(&ivopts, ":"); if (tmp) - DMWARN("Unexpected additional cipher options"); + DMWARN(PFX "Unexpected additional cipher options"); key_size = strlen(argv[1]) >> 1; cc = kmalloc(sizeof(*cc) + key_size * sizeof(u8), GFP_KERNEL); if (cc == NULL) { ti->error = - "Cannot allocate transparent encryption context"; + PFX "Cannot allocate transparent encryption context"; return -ENOMEM; } cc->key_size = key_size; if ((!key_size && strcmp(argv[1], "-") != 0) || (key_size && crypt_decode_key(cc->key, argv[1], key_size) < 0)) { - ti->error = "Error decoding key"; + ti->error = PFX "Error decoding key"; goto bad1; } @@ -562,22 +562,22 @@ static int crypt_ctr(struct dm_target *ti, unsigned int argc, char **argv) else if (strcmp(chainmode, "ecb") == 0) crypto_flags = CRYPTO_TFM_MODE_ECB; else { - ti->error = "Unknown chaining mode"; + ti->error = PFX "Unknown chaining mode"; goto bad1; } if (crypto_flags != CRYPTO_TFM_MODE_ECB && !ivmode) { - ti->error = "This chaining mode requires an IV mechanism"; + ti->error = PFX "This chaining mode requires an IV mechanism"; goto bad1; } tfm = crypto_alloc_tfm(cipher, crypto_flags | CRYPTO_TFM_REQ_MAY_SLEEP); if (!tfm) { - ti->error = "Error allocating crypto tfm"; + ti->error = PFX "Error allocating crypto tfm"; goto bad1; } if (crypto_tfm_alg_type(tfm) != CRYPTO_ALG_TYPE_CIPHER) { - ti->error = "Expected cipher algorithm"; + ti->error = PFX "Expected cipher algorithm"; goto bad2; } @@ -595,7 +595,7 @@ static int crypt_ctr(struct dm_target *ti, unsigned int argc, char **argv) else if (strcmp(ivmode, "essiv") == 0) cc->iv_gen_ops = &crypt_iv_essiv_ops; else { - ti->error = "Invalid IV mode"; + ti->error = PFX "Invalid IV mode"; goto bad2; } @@ -610,7 +610,7 @@ static int crypt_ctr(struct dm_target *ti, unsigned int argc, char **argv) else { cc->iv_size = 0; if (cc->iv_gen_ops) { - DMWARN("Selected cipher does not support IVs"); + DMWARN(PFX "Selected cipher does not support IVs"); if (cc->iv_gen_ops->dtr) cc->iv_gen_ops->dtr(cc); cc->iv_gen_ops = NULL; @@ -619,36 +619,36 @@ static int crypt_ctr(struct dm_target *ti, unsigned int argc, char **argv) cc->io_pool = mempool_create_slab_pool(MIN_IOS, _crypt_io_pool); if (!cc->io_pool) { - ti->error = "Cannot allocate crypt io mempool"; + ti->error = PFX "Cannot allocate crypt io mempool"; goto bad3; } cc->page_pool = mempool_create_page_pool(MIN_POOL_PAGES, 0); if (!cc->page_pool) { - ti->error = "Cannot allocate page mempool"; + ti->error = PFX "Cannot allocate page mempool"; goto bad4; } if (tfm->crt_cipher.cit_setkey(tfm, cc->key, key_size) < 0) { - ti->error = "Error setting key"; + ti->error = PFX "Error setting key"; goto bad5; } if (sscanf(argv[2], "%llu", &tmpll) != 1) { - ti->error = "Invalid iv_offset sector"; + ti->error = PFX "Invalid iv_offset sector"; goto bad5; } cc->iv_offset = tmpll; if (sscanf(argv[4], "%llu", &tmpll) != 1) { - ti->error = "Invalid device sector"; + ti->error = PFX "Invalid device sector"; goto bad5; } cc->start = tmpll; if (dm_get_device(ti, argv[3], cc->start, ti->len, dm_table_get_mode(ti->table), &cc->dev)) { - ti->error = "Device lookup failed"; + ti->error = PFX "Device lookup failed"; goto bad5; } @@ -657,7 +657,7 @@ static int crypt_ctr(struct dm_target *ti, unsigned int argc, char **argv) *(ivopts - 1) = ':'; cc->iv_mode = kmalloc(strlen(ivmode) + 1, GFP_KERNEL); if (!cc->iv_mode) { - ti->error = "Error kmallocing iv_mode string"; + ti->error = PFX "Error kmallocing iv_mode string"; goto bad5; } strcpy(cc->iv_mode, ivmode); @@ -918,13 +918,13 @@ static int __init dm_crypt_init(void) _kcryptd_workqueue = create_workqueue("kcryptd"); if (!_kcryptd_workqueue) { r = -ENOMEM; - DMERR("couldn't create kcryptd"); + DMERR(PFX "couldn't create kcryptd"); goto bad1; } r = dm_register_target(&crypt_target); if (r < 0) { - DMERR("register failed %d", r); + DMERR(PFX "register failed %d", r); goto bad2; } @@ -942,7 +942,7 @@ static void __exit dm_crypt_exit(void) int r = dm_unregister_target(&crypt_target); if (r < 0) - DMERR("unregister failed %d", r); + DMERR(PFX "unregister failed %d", r); destroy_workqueue(_kcryptd_workqueue); kmem_cache_destroy(_crypt_io_pool); diff --git a/trunk/drivers/md/dm-emc.c b/trunk/drivers/md/dm-emc.c index 2a374ccb30dd..c7067674dcb7 100644 --- a/trunk/drivers/md/dm-emc.c +++ b/trunk/drivers/md/dm-emc.c @@ -12,8 +12,6 @@ #include #include -#define DM_MSG_PREFIX "multipath emc" - struct emc_handler { spinlock_t lock; @@ -68,7 +66,7 @@ static struct bio *get_failover_bio(struct path *path, unsigned data_size) bio = bio_alloc(GFP_ATOMIC, 1); if (!bio) { - DMERR("get_failover_bio: bio_alloc() failed."); + DMERR("dm-emc: get_failover_bio: bio_alloc() failed."); return NULL; } @@ -80,13 +78,13 @@ static struct bio *get_failover_bio(struct path *path, unsigned data_size) page = alloc_page(GFP_ATOMIC); if (!page) { - DMERR("get_failover_bio: alloc_page() failed."); + DMERR("dm-emc: get_failover_bio: alloc_page() failed."); bio_put(bio); return NULL; } if (bio_add_page(bio, page, data_size, 0) != data_size) { - DMERR("get_failover_bio: alloc_page() failed."); + DMERR("dm-emc: get_failover_bio: alloc_page() failed."); __free_page(page); bio_put(bio); return NULL; @@ -105,7 +103,7 @@ static struct request *get_failover_req(struct emc_handler *h, /* FIXME: Figure out why it fails with GFP_ATOMIC. */ rq = blk_get_request(q, WRITE, __GFP_WAIT); if (!rq) { - DMERR("get_failover_req: blk_get_request failed"); + DMERR("dm-emc: get_failover_req: blk_get_request failed"); return NULL; } @@ -162,7 +160,7 @@ static struct request *emc_trespass_get(struct emc_handler *h, bio = get_failover_bio(path, data_size); if (!bio) { - DMERR("emc_trespass_get: no bio"); + DMERR("dm-emc: emc_trespass_get: no bio"); return NULL; } @@ -175,7 +173,7 @@ static struct request *emc_trespass_get(struct emc_handler *h, /* get request for block layer packet command */ rq = get_failover_req(h, bio, path); if (!rq) { - DMERR("emc_trespass_get: no rq"); + DMERR("dm-emc: emc_trespass_get: no rq"); free_bio(bio); return NULL; } @@ -202,18 +200,18 @@ static void emc_pg_init(struct hw_handler *hwh, unsigned bypassed, * initial state passed into us and then get an update here. */ if (!q) { - DMINFO("emc_pg_init: no queue"); + DMINFO("dm-emc: emc_pg_init: no queue"); goto fail_path; } /* FIXME: The request should be pre-allocated. */ rq = emc_trespass_get(hwh->context, path); if (!rq) { - DMERR("emc_pg_init: no rq"); + DMERR("dm-emc: emc_pg_init: no rq"); goto fail_path; } - DMINFO("emc_pg_init: sending switch-over command"); + DMINFO("dm-emc: emc_pg_init: sending switch-over command"); elv_add_request(q, rq, ELEVATOR_INSERT_FRONT, 1); return; @@ -243,18 +241,18 @@ static int emc_create(struct hw_handler *hwh, unsigned argc, char **argv) hr = 0; short_trespass = 0; } else if (argc != 2) { - DMWARN("incorrect number of arguments"); + DMWARN("dm-emc hwhandler: incorrect number of arguments"); return -EINVAL; } else { if ((sscanf(argv[0], "%u", &short_trespass) != 1) || (short_trespass > 1)) { - DMWARN("invalid trespass mode selected"); + DMWARN("dm-emc: invalid trespass mode selected"); return -EINVAL; } if ((sscanf(argv[1], "%u", &hr) != 1) || (hr > 1)) { - DMWARN("invalid honor reservation flag selected"); + DMWARN("dm-emc: invalid honor reservation flag selected"); return -EINVAL; } } @@ -266,14 +264,14 @@ static int emc_create(struct hw_handler *hwh, unsigned argc, char **argv) hwh->context = h; if ((h->short_trespass = short_trespass)) - DMWARN("short trespass command will be send"); + DMWARN("dm-emc: short trespass command will be send"); else - DMWARN("long trespass command will be send"); + DMWARN("dm-emc: long trespass command will be send"); if ((h->hr = hr)) - DMWARN("honor reservation bit will be set"); + DMWARN("dm-emc: honor reservation bit will be set"); else - DMWARN("honor reservation bit will not be set (default)"); + DMWARN("dm-emc: honor reservation bit will not be set (default)"); return 0; } @@ -338,9 +336,9 @@ static int __init dm_emc_init(void) int r = dm_register_hw_handler(&emc_hwh); if (r < 0) - DMERR("register failed %d", r); + DMERR("emc: register failed %d", r); - DMINFO("version 0.0.3 loaded"); + DMINFO("dm-emc version 0.0.3 loaded"); return r; } @@ -350,7 +348,7 @@ static void __exit dm_emc_exit(void) int r = dm_unregister_hw_handler(&emc_hwh); if (r < 0) - DMERR("unregister failed %d", r); + DMERR("emc: unregister failed %d", r); } module_init(dm_emc_init); diff --git a/trunk/drivers/md/dm-exception-store.c b/trunk/drivers/md/dm-exception-store.c index d12379b5cdb5..cc07bbebbb16 100644 --- a/trunk/drivers/md/dm-exception-store.c +++ b/trunk/drivers/md/dm-exception-store.c @@ -16,8 +16,6 @@ #include #include -#define DM_MSG_PREFIX "snapshots" - /*----------------------------------------------------------------- * Persistent snapshots, by persistent we mean that the snapshot * will survive a reboot. @@ -93,6 +91,7 @@ struct pstore { struct dm_snapshot *snap; /* up pointer to my snapshot */ int version; int valid; + uint32_t chunk_size; uint32_t exceptions_per_area; /* @@ -134,7 +133,7 @@ static int alloc_area(struct pstore *ps) int r = -ENOMEM; size_t len; - len = ps->snap->chunk_size << SECTOR_SHIFT; + len = ps->chunk_size << SECTOR_SHIFT; /* * Allocate the chunk_size block of memory that will hold @@ -161,8 +160,8 @@ static int chunk_io(struct pstore *ps, uint32_t chunk, int rw) unsigned long bits; where.bdev = ps->snap->cow->bdev; - where.sector = ps->snap->chunk_size * chunk; - where.count = ps->snap->chunk_size; + where.sector = ps->chunk_size * chunk; + where.count = ps->chunk_size; return dm_io_sync_vm(1, &where, rw, ps->area, &bits); } @@ -189,7 +188,7 @@ static int area_io(struct pstore *ps, uint32_t area, int rw) static int zero_area(struct pstore *ps, uint32_t area) { - memset(ps->area, 0, ps->snap->chunk_size << SECTOR_SHIFT); + memset(ps->area, 0, ps->chunk_size << SECTOR_SHIFT); return area_io(ps, area, WRITE); } @@ -197,7 +196,6 @@ static int read_header(struct pstore *ps, int *new_snapshot) { int r; struct disk_header *dh; - chunk_t chunk_size; r = chunk_io(ps, 0, READ); if (r) @@ -212,29 +210,8 @@ static int read_header(struct pstore *ps, int *new_snapshot) *new_snapshot = 0; ps->valid = le32_to_cpu(dh->valid); ps->version = le32_to_cpu(dh->version); - chunk_size = le32_to_cpu(dh->chunk_size); - if (ps->snap->chunk_size != chunk_size) { - DMWARN("chunk size %llu in device metadata overrides " - "table chunk size of %llu.", - (unsigned long long)chunk_size, - (unsigned long long)ps->snap->chunk_size); - - /* We had a bogus chunk_size. Fix stuff up. */ - dm_io_put(sectors_to_pages(ps->snap->chunk_size)); - free_area(ps); - - ps->snap->chunk_size = chunk_size; - ps->snap->chunk_mask = chunk_size - 1; - ps->snap->chunk_shift = ffs(chunk_size) - 1; - - r = alloc_area(ps); - if (r) - return r; - - r = dm_io_get(sectors_to_pages(chunk_size)); - if (r) - return r; - } + ps->chunk_size = le32_to_cpu(dh->chunk_size); + } else { DMWARN("Invalid/corrupt snapshot"); r = -ENXIO; @@ -247,13 +224,13 @@ static int write_header(struct pstore *ps) { struct disk_header *dh; - memset(ps->area, 0, ps->snap->chunk_size << SECTOR_SHIFT); + memset(ps->area, 0, ps->chunk_size << SECTOR_SHIFT); dh = (struct disk_header *) ps->area; dh->magic = cpu_to_le32(SNAP_MAGIC); dh->valid = cpu_to_le32(ps->valid); dh->version = cpu_to_le32(ps->version); - dh->chunk_size = cpu_to_le32(ps->snap->chunk_size); + dh->chunk_size = cpu_to_le32(ps->chunk_size); return chunk_io(ps, 0, WRITE); } @@ -388,7 +365,7 @@ static void persistent_destroy(struct exception_store *store) { struct pstore *ps = get_info(store); - dm_io_put(sectors_to_pages(ps->snap->chunk_size)); + dm_io_put(sectors_to_pages(ps->chunk_size)); vfree(ps->callbacks); free_area(ps); kfree(ps); @@ -406,16 +383,6 @@ static int persistent_read_metadata(struct exception_store *store) if (r) return r; - /* - * Now we know correct chunk_size, complete the initialisation. - */ - ps->exceptions_per_area = (ps->snap->chunk_size << SECTOR_SHIFT) / - sizeof(struct disk_exception); - ps->callbacks = dm_vcalloc(ps->exceptions_per_area, - sizeof(*ps->callbacks)); - if (!ps->callbacks) - return -ENOMEM; - /* * Do we need to setup a new snapshot ? */ @@ -566,6 +533,9 @@ int dm_create_persistent(struct exception_store *store, uint32_t chunk_size) ps->snap = store->snap; ps->valid = 1; ps->version = SNAPSHOT_DISK_VERSION; + ps->chunk_size = chunk_size; + ps->exceptions_per_area = (chunk_size << SECTOR_SHIFT) / + sizeof(struct disk_exception); ps->next_free = 2; /* skipping the header and first area */ ps->current_committed = 0; @@ -573,9 +543,18 @@ int dm_create_persistent(struct exception_store *store, uint32_t chunk_size) if (r) goto bad; + /* + * Allocate space for all the callbacks. + */ ps->callback_count = 0; atomic_set(&ps->pending_count, 0); - ps->callbacks = NULL; + ps->callbacks = dm_vcalloc(ps->exceptions_per_area, + sizeof(*ps->callbacks)); + + if (!ps->callbacks) { + r = -ENOMEM; + goto bad; + } store->destroy = persistent_destroy; store->read_metadata = persistent_read_metadata; diff --git a/trunk/drivers/md/dm-ioctl.c b/trunk/drivers/md/dm-ioctl.c index 3edb3477f987..8edd6435414d 100644 --- a/trunk/drivers/md/dm-ioctl.c +++ b/trunk/drivers/md/dm-ioctl.c @@ -1,6 +1,6 @@ /* * Copyright (C) 2001, 2002 Sistina Software (UK) Limited. - * Copyright (C) 2004 - 2006 Red Hat, Inc. All rights reserved. + * Copyright (C) 2004 - 2005 Red Hat, Inc. All rights reserved. * * This file is released under the GPL. */ @@ -19,7 +19,6 @@ #include -#define DM_MSG_PREFIX "ioctl" #define DM_DRIVER_EMAIL "dm-devel@redhat.com" /*----------------------------------------------------------------- @@ -49,7 +48,7 @@ struct vers_iter { static struct list_head _name_buckets[NUM_BUCKETS]; static struct list_head _uuid_buckets[NUM_BUCKETS]; -static void dm_hash_remove_all(int keep_open_devices); +static void dm_hash_remove_all(void); /* * Guards access to both hash tables. @@ -74,7 +73,7 @@ static int dm_hash_init(void) static void dm_hash_exit(void) { - dm_hash_remove_all(0); + dm_hash_remove_all(); devfs_remove(DM_DIR); } @@ -103,10 +102,8 @@ static struct hash_cell *__get_name_cell(const char *str) unsigned int h = hash_str(str); list_for_each_entry (hc, _name_buckets + h, name_list) - if (!strcmp(hc->name, str)) { - dm_get(hc->md); + if (!strcmp(hc->name, str)) return hc; - } return NULL; } @@ -117,10 +114,8 @@ static struct hash_cell *__get_uuid_cell(const char *str) unsigned int h = hash_str(str); list_for_each_entry (hc, _uuid_buckets + h, uuid_list) - if (!strcmp(hc->uuid, str)) { - dm_get(hc->md); + if (!strcmp(hc->uuid, str)) return hc; - } return NULL; } @@ -196,7 +191,7 @@ static int unregister_with_devfs(struct hash_cell *hc) */ static int dm_hash_insert(const char *name, const char *uuid, struct mapped_device *md) { - struct hash_cell *cell, *hc; + struct hash_cell *cell; /* * Allocate the new cells. @@ -209,19 +204,14 @@ static int dm_hash_insert(const char *name, const char *uuid, struct mapped_devi * Insert the cell into both hash tables. */ down_write(&_hash_lock); - hc = __get_name_cell(name); - if (hc) { - dm_put(hc->md); + if (__get_name_cell(name)) goto bad; - } list_add(&cell->name_list, _name_buckets + hash_str(name)); if (uuid) { - hc = __get_uuid_cell(uuid); - if (hc) { + if (__get_uuid_cell(uuid)) { list_del(&cell->name_list); - dm_put(hc->md); goto bad; } list_add(&cell->uuid_list, _uuid_buckets + hash_str(uuid)); @@ -261,41 +251,19 @@ static void __hash_remove(struct hash_cell *hc) free_cell(hc); } -static void dm_hash_remove_all(int keep_open_devices) +static void dm_hash_remove_all(void) { - int i, dev_skipped, dev_removed; + int i; struct hash_cell *hc; struct list_head *tmp, *n; down_write(&_hash_lock); - -retry: - dev_skipped = dev_removed = 0; for (i = 0; i < NUM_BUCKETS; i++) { list_for_each_safe (tmp, n, _name_buckets + i) { hc = list_entry(tmp, struct hash_cell, name_list); - - if (keep_open_devices && - dm_lock_for_deletion(hc->md)) { - dev_skipped++; - continue; - } __hash_remove(hc); - dev_removed = 1; } } - - /* - * Some mapped devices may be using other mapped devices, so if any - * still exist, repeat until we make no further progress. - */ - if (dev_skipped) { - if (dev_removed) - goto retry; - - DMWARN("remove_all left %d open device(s)", dev_skipped); - } - up_write(&_hash_lock); } @@ -321,7 +289,6 @@ static int dm_hash_rename(const char *old, const char *new) if (hc) { DMWARN("asked to rename to an already existing name %s -> %s", old, new); - dm_put(hc->md); up_write(&_hash_lock); kfree(new_name); return -EBUSY; @@ -361,7 +328,6 @@ static int dm_hash_rename(const char *old, const char *new) dm_table_put(table); } - dm_put(hc->md); up_write(&_hash_lock); kfree(old_name); return 0; @@ -378,7 +344,7 @@ typedef int (*ioctl_fn)(struct dm_ioctl *param, size_t param_size); static int remove_all(struct dm_ioctl *param, size_t param_size) { - dm_hash_remove_all(1); + dm_hash_remove_all(); param->data_size = 0; return 0; } @@ -558,6 +524,7 @@ static int __dev_status(struct mapped_device *md, struct dm_ioctl *param) { struct gendisk *disk = dm_disk(md); struct dm_table *table; + struct block_device *bdev; param->flags &= ~(DM_SUSPEND_FLAG | DM_READONLY_FLAG | DM_ACTIVE_PRESENT_FLAG); @@ -567,12 +534,20 @@ static int __dev_status(struct mapped_device *md, struct dm_ioctl *param) param->dev = huge_encode_dev(MKDEV(disk->major, disk->first_minor)); - /* - * Yes, this will be out of date by the time it gets back - * to userland, but it is still very useful for - * debugging. - */ - param->open_count = dm_open_count(md); + if (!(param->flags & DM_SKIP_BDGET_FLAG)) { + bdev = bdget_disk(disk, 0); + if (!bdev) + return -ENXIO; + + /* + * Yes, this will be out of date by the time it gets back + * to userland, but it is still very useful for + * debugging. + */ + param->open_count = bdev->bd_openers; + bdput(bdev); + } else + param->open_count = -1; if (disk->policy) param->flags |= DM_READONLY_FLAG; @@ -592,7 +567,7 @@ static int __dev_status(struct mapped_device *md, struct dm_ioctl *param) static int dev_create(struct dm_ioctl *param, size_t param_size) { - int r, m = DM_ANY_MINOR; + int r; struct mapped_device *md; r = check_name(param->name); @@ -600,9 +575,10 @@ static int dev_create(struct dm_ioctl *param, size_t param_size) return r; if (param->flags & DM_PERSISTENT_DEV_FLAG) - m = MINOR(huge_decode_dev(param->dev)); + r = dm_create_with_minor(MINOR(huge_decode_dev(param->dev)), &md); + else + r = dm_create(&md); - r = dm_create(m, &md); if (r) return r; @@ -635,8 +611,10 @@ static struct hash_cell *__find_device_hash_cell(struct dm_ioctl *param) return __get_name_cell(param->name); md = dm_get_md(huge_decode_dev(param->dev)); - if (md) + if (md) { mdptr = dm_get_mdptr(md); + dm_put(md); + } return mdptr; } @@ -650,6 +628,7 @@ static struct mapped_device *find_device(struct dm_ioctl *param) hc = __find_device_hash_cell(param); if (hc) { md = hc->md; + dm_get(md); /* * Sneakily write in both the name and the uuid @@ -674,8 +653,6 @@ static struct mapped_device *find_device(struct dm_ioctl *param) static int dev_remove(struct dm_ioctl *param, size_t param_size) { struct hash_cell *hc; - struct mapped_device *md; - int r; down_write(&_hash_lock); hc = __find_device_hash_cell(param); @@ -686,22 +663,8 @@ static int dev_remove(struct dm_ioctl *param, size_t param_size) return -ENXIO; } - md = hc->md; - - /* - * Ensure the device is not open and nothing further can open it. - */ - r = dm_lock_for_deletion(md); - if (r) { - DMWARN("unable to remove open device %s", hc->name); - up_write(&_hash_lock); - dm_put(md); - return r; - } - __hash_remove(hc); up_write(&_hash_lock); - dm_put(md); param->data_size = 0; return 0; } @@ -827,6 +790,7 @@ static int do_resume(struct dm_ioctl *param) } md = hc->md; + dm_get(md); new_map = hc->new_map; hc->new_map = NULL; @@ -1114,7 +1078,6 @@ static int table_clear(struct dm_ioctl *param, size_t param_size) { int r; struct hash_cell *hc; - struct mapped_device *md; down_write(&_hash_lock); @@ -1133,9 +1096,7 @@ static int table_clear(struct dm_ioctl *param, size_t param_size) param->flags &= ~DM_INACTIVE_PRESENT_FLAG; r = __dev_status(hc->md, param); - md = hc->md; up_write(&_hash_lock); - dm_put(md); return r; } diff --git a/trunk/drivers/md/dm-linear.c b/trunk/drivers/md/dm-linear.c index 47b3c62bbdb8..daf586c0898d 100644 --- a/trunk/drivers/md/dm-linear.c +++ b/trunk/drivers/md/dm-linear.c @@ -12,8 +12,6 @@ #include #include -#define DM_MSG_PREFIX "linear" - /* * Linear: maps a linear range of a device. */ @@ -31,7 +29,7 @@ static int linear_ctr(struct dm_target *ti, unsigned int argc, char **argv) unsigned long long tmp; if (argc != 2) { - ti->error = "Invalid argument count"; + ti->error = "dm-linear: Invalid argument count"; return -EINVAL; } @@ -113,7 +111,7 @@ int __init dm_linear_init(void) int r = dm_register_target(&linear_target); if (r < 0) - DMERR("register failed %d", r); + DMERR("linear: register failed %d", r); return r; } @@ -123,5 +121,5 @@ void dm_linear_exit(void) int r = dm_unregister_target(&linear_target); if (r < 0) - DMERR("unregister failed %d", r); + DMERR("linear: unregister failed %d", r); } diff --git a/trunk/drivers/md/dm-log.c b/trunk/drivers/md/dm-log.c index 64b764bd02cc..d73779a42417 100644 --- a/trunk/drivers/md/dm-log.c +++ b/trunk/drivers/md/dm-log.c @@ -12,8 +12,6 @@ #include "dm-log.h" #include "dm-io.h" -#define DM_MSG_PREFIX "mirror log" - static LIST_HEAD(_log_types); static DEFINE_SPINLOCK(_lock); @@ -157,6 +155,8 @@ struct log_c { struct io_region header_location; struct log_header *disk_header; + + struct io_region bits_location; }; /* @@ -240,22 +240,44 @@ static inline int write_header(struct log_c *log) log->disk_header, &ebits); } +/*---------------------------------------------------------------- + * Bits IO + *--------------------------------------------------------------*/ +static int read_bits(struct log_c *log) +{ + int r; + unsigned long ebits; + + r = dm_io_sync_vm(1, &log->bits_location, READ, + log->clean_bits, &ebits); + if (r) + return r; + + return 0; +} + +static int write_bits(struct log_c *log) +{ + unsigned long ebits; + return dm_io_sync_vm(1, &log->bits_location, WRITE, + log->clean_bits, &ebits); +} + /*---------------------------------------------------------------- * core log constructor/destructor * * argv contains region_size followed optionally by [no]sync *--------------------------------------------------------------*/ #define BYTE_SHIFT 3 -static int create_log_context(struct dirty_log *log, struct dm_target *ti, - unsigned int argc, char **argv, - struct dm_dev *dev) +static int core_ctr(struct dirty_log *log, struct dm_target *ti, + unsigned int argc, char **argv) { enum sync sync = DEFAULTSYNC; struct log_c *lc; uint32_t region_size; unsigned int region_count; - size_t bitset_size, buf_size; + size_t bitset_size; if (argc < 1 || argc > 2) { DMWARN("wrong number of arguments to mirror log"); @@ -297,53 +319,22 @@ static int create_log_context(struct dirty_log *log, struct dm_target *ti, * Work out how many "unsigned long"s we need to hold the bitset. */ bitset_size = dm_round_up(region_count, - sizeof(*lc->clean_bits) << BYTE_SHIFT); + sizeof(unsigned long) << BYTE_SHIFT); bitset_size >>= BYTE_SHIFT; - lc->bitset_uint32_count = bitset_size / sizeof(*lc->clean_bits); - - /* - * Disk log? - */ - if (!dev) { - lc->clean_bits = vmalloc(bitset_size); - if (!lc->clean_bits) { - DMWARN("couldn't allocate clean bitset"); - kfree(lc); - return -ENOMEM; - } - lc->disk_header = NULL; - } else { - lc->log_dev = dev; - lc->header_location.bdev = lc->log_dev->bdev; - lc->header_location.sector = 0; - - /* - * Buffer holds both header and bitset. - */ - buf_size = dm_round_up((LOG_OFFSET << SECTOR_SHIFT) + - bitset_size, ti->limits.hardsect_size); - lc->header_location.count = buf_size >> SECTOR_SHIFT; - - lc->disk_header = vmalloc(buf_size); - if (!lc->disk_header) { - DMWARN("couldn't allocate disk log buffer"); - kfree(lc); - return -ENOMEM; - } - - lc->clean_bits = (void *)lc->disk_header + - (LOG_OFFSET << SECTOR_SHIFT); + lc->bitset_uint32_count = bitset_size / 4; + lc->clean_bits = vmalloc(bitset_size); + if (!lc->clean_bits) { + DMWARN("couldn't allocate clean bitset"); + kfree(lc); + return -ENOMEM; } - memset(lc->clean_bits, -1, bitset_size); lc->sync_bits = vmalloc(bitset_size); if (!lc->sync_bits) { DMWARN("couldn't allocate sync bitset"); - if (!dev) - vfree(lc->clean_bits); - vfree(lc->disk_header); + vfree(lc->clean_bits); kfree(lc); return -ENOMEM; } @@ -354,38 +345,23 @@ static int create_log_context(struct dirty_log *log, struct dm_target *ti, if (!lc->recovering_bits) { DMWARN("couldn't allocate sync bitset"); vfree(lc->sync_bits); - if (!dev) - vfree(lc->clean_bits); - vfree(lc->disk_header); + vfree(lc->clean_bits); kfree(lc); return -ENOMEM; } memset(lc->recovering_bits, 0, bitset_size); lc->sync_search = 0; log->context = lc; - return 0; } -static int core_ctr(struct dirty_log *log, struct dm_target *ti, - unsigned int argc, char **argv) -{ - return create_log_context(log, ti, argc, argv, NULL); -} - -static void destroy_log_context(struct log_c *lc) -{ - vfree(lc->sync_bits); - vfree(lc->recovering_bits); - kfree(lc); -} - static void core_dtr(struct dirty_log *log) { struct log_c *lc = (struct log_c *) log->context; - vfree(lc->clean_bits); - destroy_log_context(lc); + vfree(lc->sync_bits); + vfree(lc->recovering_bits); + kfree(lc); } /*---------------------------------------------------------------- @@ -397,6 +373,8 @@ static int disk_ctr(struct dirty_log *log, struct dm_target *ti, unsigned int argc, char **argv) { int r; + size_t size; + struct log_c *lc; struct dm_dev *dev; if (argc < 2 || argc > 3) { @@ -409,22 +387,49 @@ static int disk_ctr(struct dirty_log *log, struct dm_target *ti, if (r) return r; - r = create_log_context(log, ti, argc - 1, argv + 1, dev); + r = core_ctr(log, ti, argc - 1, argv + 1); if (r) { dm_put_device(ti, dev); return r; } + lc = (struct log_c *) log->context; + lc->log_dev = dev; + + /* setup the disk header fields */ + lc->header_location.bdev = lc->log_dev->bdev; + lc->header_location.sector = 0; + lc->header_location.count = 1; + + /* + * We can't read less than this amount, even though we'll + * not be using most of this space. + */ + lc->disk_header = vmalloc(1 << SECTOR_SHIFT); + if (!lc->disk_header) + goto bad; + + /* setup the disk bitset fields */ + lc->bits_location.bdev = lc->log_dev->bdev; + lc->bits_location.sector = LOG_OFFSET; + + size = dm_round_up(lc->bitset_uint32_count * sizeof(uint32_t), + 1 << SECTOR_SHIFT); + lc->bits_location.count = size >> SECTOR_SHIFT; return 0; + + bad: + dm_put_device(ti, lc->log_dev); + core_dtr(log); + return -ENOMEM; } static void disk_dtr(struct dirty_log *log) { struct log_c *lc = (struct log_c *) log->context; - dm_put_device(lc->ti, lc->log_dev); vfree(lc->disk_header); - destroy_log_context(lc); + core_dtr(log); } static int count_bits32(uint32_t *addr, unsigned size) @@ -449,7 +454,12 @@ static int disk_resume(struct dirty_log *log) if (r) return r; - /* set or clear any new bits -- device has grown */ + /* read the bits */ + r = read_bits(lc); + if (r) + return r; + + /* set or clear any new bits */ if (lc->sync == NOSYNC) for (i = lc->header.nr_regions; i < lc->region_count; i++) /* FIXME: amazingly inefficient */ @@ -459,14 +469,15 @@ static int disk_resume(struct dirty_log *log) /* FIXME: amazingly inefficient */ log_clear_bit(lc, lc->clean_bits, i); - /* clear any old bits -- device has shrunk */ - for (i = lc->region_count; i % (sizeof(*lc->clean_bits) << BYTE_SHIFT); i++) - log_clear_bit(lc, lc->clean_bits, i); - /* copy clean across to sync */ memcpy(lc->sync_bits, lc->clean_bits, size); lc->sync_count = count_bits32(lc->clean_bits, lc->bitset_uint32_count); + /* write the bits */ + r = write_bits(lc); + if (r) + return r; + /* set the correct number of regions in the header */ lc->header.nr_regions = lc->region_count; @@ -507,7 +518,7 @@ static int disk_flush(struct dirty_log *log) if (!lc->touched) return 0; - r = write_header(lc); + r = write_bits(lc); if (!r) lc->touched = 0; diff --git a/trunk/drivers/md/dm-mpath.c b/trunk/drivers/md/dm-mpath.c index 217615b33223..1816f30678ed 100644 --- a/trunk/drivers/md/dm-mpath.c +++ b/trunk/drivers/md/dm-mpath.c @@ -21,7 +21,6 @@ #include #include -#define DM_MSG_PREFIX "multipath" #define MESG_STR(x) x, sizeof(x) /* Path properties */ @@ -447,6 +446,8 @@ struct param { char *error; }; +#define ESTR(s) ("dm-multipath: " s) + static int read_param(struct param *param, char *str, unsigned *v, char **error) { if (!str || @@ -494,12 +495,12 @@ static int parse_path_selector(struct arg_set *as, struct priority_group *pg, unsigned ps_argc; static struct param _params[] = { - {0, 1024, "invalid number of path selector args"}, + {0, 1024, ESTR("invalid number of path selector args")}, }; pst = dm_get_path_selector(shift(as)); if (!pst) { - ti->error = "unknown path selector type"; + ti->error = ESTR("unknown path selector type"); return -EINVAL; } @@ -510,7 +511,7 @@ static int parse_path_selector(struct arg_set *as, struct priority_group *pg, r = pst->create(&pg->ps, ps_argc, as->argv); if (r) { dm_put_path_selector(pst); - ti->error = "path selector constructor failed"; + ti->error = ESTR("path selector constructor failed"); return r; } @@ -528,7 +529,7 @@ static struct pgpath *parse_path(struct arg_set *as, struct path_selector *ps, /* we need at least a path arg */ if (as->argc < 1) { - ti->error = "no device given"; + ti->error = ESTR("no device given"); return NULL; } @@ -539,7 +540,7 @@ static struct pgpath *parse_path(struct arg_set *as, struct path_selector *ps, r = dm_get_device(ti, shift(as), ti->begin, ti->len, dm_table_get_mode(ti->table), &p->path.dev); if (r) { - ti->error = "error getting device"; + ti->error = ESTR("error getting device"); goto bad; } @@ -561,8 +562,8 @@ static struct priority_group *parse_priority_group(struct arg_set *as, struct dm_target *ti) { static struct param _params[] = { - {1, 1024, "invalid number of paths"}, - {0, 1024, "invalid number of selector args"} + {1, 1024, ESTR("invalid number of paths")}, + {0, 1024, ESTR("invalid number of selector args")} }; int r; @@ -571,13 +572,13 @@ static struct priority_group *parse_priority_group(struct arg_set *as, if (as->argc < 2) { as->argc = 0; - ti->error = "not enough priority group aruments"; + ti->error = ESTR("not enough priority group aruments"); return NULL; } pg = alloc_priority_group(); if (!pg) { - ti->error = "couldn't allocate priority group"; + ti->error = ESTR("couldn't allocate priority group"); return NULL; } pg->m = m; @@ -632,7 +633,7 @@ static int parse_hw_handler(struct arg_set *as, struct multipath *m, unsigned hw_argc; static struct param _params[] = { - {0, 1024, "invalid number of hardware handler args"}, + {0, 1024, ESTR("invalid number of hardware handler args")}, }; r = read_param(_params, shift(as), &hw_argc, &ti->error); @@ -644,14 +645,14 @@ static int parse_hw_handler(struct arg_set *as, struct multipath *m, hwht = dm_get_hw_handler(shift(as)); if (!hwht) { - ti->error = "unknown hardware handler type"; + ti->error = ESTR("unknown hardware handler type"); return -EINVAL; } r = hwht->create(&m->hw_handler, hw_argc - 1, as->argv); if (r) { dm_put_hw_handler(hwht); - ti->error = "hardware handler constructor failed"; + ti->error = ESTR("hardware handler constructor failed"); return r; } @@ -668,7 +669,7 @@ static int parse_features(struct arg_set *as, struct multipath *m, unsigned argc; static struct param _params[] = { - {0, 1, "invalid number of feature args"}, + {0, 1, ESTR("invalid number of feature args")}, }; r = read_param(_params, shift(as), &argc, &ti->error); @@ -691,8 +692,8 @@ static int multipath_ctr(struct dm_target *ti, unsigned int argc, { /* target parameters */ static struct param _params[] = { - {1, 1024, "invalid number of priority groups"}, - {1, 1024, "invalid initial priority group number"}, + {1, 1024, ESTR("invalid number of priority groups")}, + {1, 1024, ESTR("invalid initial priority group number")}, }; int r; @@ -706,7 +707,7 @@ static int multipath_ctr(struct dm_target *ti, unsigned int argc, m = alloc_multipath(); if (!m) { - ti->error = "can't allocate multipath"; + ti->error = ESTR("can't allocate multipath"); return -EINVAL; } @@ -745,7 +746,7 @@ static int multipath_ctr(struct dm_target *ti, unsigned int argc, } if (pg_count != m->nr_priority_groups) { - ti->error = "priority group count mismatch"; + ti->error = ESTR("priority group count mismatch"); r = -EINVAL; goto bad; } @@ -806,7 +807,7 @@ static int fail_path(struct pgpath *pgpath) if (!pgpath->path.is_active) goto out; - DMWARN("Failing path %s.", pgpath->path.dev->name); + DMWARN("dm-multipath: Failing path %s.", pgpath->path.dev->name); pgpath->pg->ps.type->fail_path(&pgpath->pg->ps, &pgpath->path); pgpath->path.is_active = 0; @@ -1249,7 +1250,7 @@ static int multipath_message(struct dm_target *ti, unsigned argc, char **argv) r = dm_get_device(ti, argv[1], ti->begin, ti->len, dm_table_get_mode(ti->table), &dev); if (r) { - DMWARN("message: error getting device %s", + DMWARN("dm-multipath message: error getting device %s", argv[1]); return -EINVAL; } @@ -1308,7 +1309,7 @@ static int __init dm_multipath_init(void) return -ENOMEM; } - DMINFO("version %u.%u.%u loaded", + DMINFO("dm-multipath version %u.%u.%u loaded", multipath_target.version[0], multipath_target.version[1], multipath_target.version[2]); diff --git a/trunk/drivers/md/dm-raid1.c b/trunk/drivers/md/dm-raid1.c index be48cedf986b..d12cf3e5e076 100644 --- a/trunk/drivers/md/dm-raid1.c +++ b/trunk/drivers/md/dm-raid1.c @@ -20,8 +20,6 @@ #include #include -#define DM_MSG_PREFIX "raid1" - static struct workqueue_struct *_kmirrord_wq; static struct work_struct _kmirrord_work; @@ -108,42 +106,12 @@ struct region { struct bio_list delayed_bios; }; - -/*----------------------------------------------------------------- - * Mirror set structures. - *---------------------------------------------------------------*/ -struct mirror { - atomic_t error_count; - struct dm_dev *dev; - sector_t offset; -}; - -struct mirror_set { - struct dm_target *ti; - struct list_head list; - struct region_hash rh; - struct kcopyd_client *kcopyd_client; - - spinlock_t lock; /* protects the next two lists */ - struct bio_list reads; - struct bio_list writes; - - /* recovery */ - region_t nr_regions; - int in_sync; - - struct mirror *default_mirror; /* Default mirror */ - - unsigned int nr_mirrors; - struct mirror mirror[0]; -}; - /* * Conversion fns */ static inline region_t bio_to_region(struct region_hash *rh, struct bio *bio) { - return (bio->bi_sector - rh->ms->ti->begin) >> rh->region_shift; + return bio->bi_sector >> rh->region_shift; } static inline sector_t region_to_sector(struct region_hash *rh, region_t region) @@ -490,9 +458,11 @@ static int __rh_recovery_prepare(struct region_hash *rh) /* Already quiesced ? */ if (atomic_read(®->pending)) list_del_init(®->list); - else - list_move(®->list, &rh->quiesced_regions); + else { + list_del_init(®->list); + list_add(®->list, &rh->quiesced_regions); + } spin_unlock_irq(&rh->region_lock); return 1; @@ -571,6 +541,35 @@ static void rh_start_recovery(struct region_hash *rh) wake(); } +/*----------------------------------------------------------------- + * Mirror set structures. + *---------------------------------------------------------------*/ +struct mirror { + atomic_t error_count; + struct dm_dev *dev; + sector_t offset; +}; + +struct mirror_set { + struct dm_target *ti; + struct list_head list; + struct region_hash rh; + struct kcopyd_client *kcopyd_client; + + spinlock_t lock; /* protects the next two lists */ + struct bio_list reads; + struct bio_list writes; + + /* recovery */ + region_t nr_regions; + int in_sync; + + struct mirror *default_mirror; /* Default mirror */ + + unsigned int nr_mirrors; + struct mirror mirror[0]; +}; + /* * Every mirror should look like this one. */ @@ -604,7 +603,7 @@ static void recovery_complete(int read_err, unsigned int write_err, struct region *reg = (struct region *) context; /* FIXME: better error handling */ - rh_recovery_end(reg, !(read_err || write_err)); + rh_recovery_end(reg, read_err || write_err); } static int recover(struct mirror_set *ms, struct region *reg) @@ -894,7 +893,7 @@ static struct mirror_set *alloc_context(unsigned int nr_mirrors, ms = kmalloc(len, GFP_KERNEL); if (!ms) { - ti->error = "Cannot allocate mirror context"; + ti->error = "dm-mirror: Cannot allocate mirror context"; return NULL; } @@ -908,7 +907,7 @@ static struct mirror_set *alloc_context(unsigned int nr_mirrors, ms->default_mirror = &ms->mirror[DEFAULT_MIRROR]; if (rh_init(&ms->rh, ms, dl, region_size, ms->nr_regions)) { - ti->error = "Error creating dirty region hash"; + ti->error = "dm-mirror: Error creating dirty region hash"; kfree(ms); return NULL; } @@ -938,14 +937,14 @@ static int get_mirror(struct mirror_set *ms, struct dm_target *ti, unsigned long long offset; if (sscanf(argv[1], "%llu", &offset) != 1) { - ti->error = "Invalid offset"; + ti->error = "dm-mirror: Invalid offset"; return -EINVAL; } if (dm_get_device(ti, argv[0], offset, ti->len, dm_table_get_mode(ti->table), &ms->mirror[mirror].dev)) { - ti->error = "Device lookup failure"; + ti->error = "dm-mirror: Device lookup failure"; return -ENXIO; } @@ -982,30 +981,30 @@ static struct dirty_log *create_dirty_log(struct dm_target *ti, struct dirty_log *dl; if (argc < 2) { - ti->error = "Insufficient mirror log arguments"; + ti->error = "dm-mirror: Insufficient mirror log arguments"; return NULL; } if (sscanf(argv[1], "%u", ¶m_count) != 1) { - ti->error = "Invalid mirror log argument count"; + ti->error = "dm-mirror: Invalid mirror log argument count"; return NULL; } *args_used = 2 + param_count; if (argc < *args_used) { - ti->error = "Insufficient mirror log arguments"; + ti->error = "dm-mirror: Insufficient mirror log arguments"; return NULL; } dl = dm_create_dirty_log(argv[0], ti, param_count, argv + 2); if (!dl) { - ti->error = "Error creating mirror dirty log"; + ti->error = "dm-mirror: Error creating mirror dirty log"; return NULL; } if (!_check_region_size(ti, dl->type->get_region_size(dl))) { - ti->error = "Invalid region size"; + ti->error = "dm-mirror: Invalid region size"; dm_destroy_dirty_log(dl); return NULL; } @@ -1039,7 +1038,7 @@ static int mirror_ctr(struct dm_target *ti, unsigned int argc, char **argv) if (!argc || sscanf(argv[0], "%u", &nr_mirrors) != 1 || nr_mirrors < 2 || nr_mirrors > KCOPYD_MAX_REGIONS + 1) { - ti->error = "Invalid number of mirrors"; + ti->error = "dm-mirror: Invalid number of mirrors"; dm_destroy_dirty_log(dl); return -EINVAL; } @@ -1047,7 +1046,7 @@ static int mirror_ctr(struct dm_target *ti, unsigned int argc, char **argv) argv++, argc--; if (argc != nr_mirrors * 2) { - ti->error = "Wrong number of mirror arguments"; + ti->error = "dm-mirror: Wrong number of mirror arguments"; dm_destroy_dirty_log(dl); return -EINVAL; } @@ -1116,7 +1115,7 @@ static int mirror_map(struct dm_target *ti, struct bio *bio, struct mirror *m; struct mirror_set *ms = ti->private; - map_context->ll = bio_to_region(&ms->rh, bio); + map_context->ll = bio->bi_sector >> ms->rh.region_shift; if (rw == WRITE) { queue_bio(ms, bio, rw); @@ -1222,7 +1221,7 @@ static int mirror_status(struct dm_target *ti, status_type_t type, static struct target_type mirror_target = { .name = "mirror", - .version = {1, 0, 2}, + .version = {1, 0, 1}, .module = THIS_MODULE, .ctr = mirror_ctr, .dtr = mirror_dtr, diff --git a/trunk/drivers/md/dm-round-robin.c b/trunk/drivers/md/dm-round-robin.c index c5a16c550122..d0024865a789 100644 --- a/trunk/drivers/md/dm-round-robin.c +++ b/trunk/drivers/md/dm-round-robin.c @@ -14,8 +14,6 @@ #include -#define DM_MSG_PREFIX "multipath round-robin" - /*----------------------------------------------------------------- * Path-handling code, paths are held in lists *---------------------------------------------------------------*/ @@ -193,9 +191,9 @@ static int __init dm_rr_init(void) int r = dm_register_path_selector(&rr_ps); if (r < 0) - DMERR("register failed %d", r); + DMERR("round-robin: register failed %d", r); - DMINFO("version 1.0.0 loaded"); + DMINFO("dm-round-robin version 1.0.0 loaded"); return r; } diff --git a/trunk/drivers/md/dm-snap.c b/trunk/drivers/md/dm-snap.c index 8eea0ddbf5ec..08312b46463a 100644 --- a/trunk/drivers/md/dm-snap.c +++ b/trunk/drivers/md/dm-snap.c @@ -23,8 +23,6 @@ #include "dm-bio-list.h" #include "kcopyd.h" -#define DM_MSG_PREFIX "snapshots" - /* * The percentage increment we will wake up users at */ @@ -119,7 +117,7 @@ static int init_origin_hash(void) _origins = kmalloc(ORIGIN_HASH_SIZE * sizeof(struct list_head), GFP_KERNEL); if (!_origins) { - DMERR("unable to allocate memory"); + DMERR("Device mapper: Snapshot: unable to allocate memory"); return -ENOMEM; } @@ -414,7 +412,7 @@ static int snapshot_ctr(struct dm_target *ti, unsigned int argc, char **argv) int blocksize; if (argc < 4) { - ti->error = "requires exactly 4 arguments"; + ti->error = "dm-snapshot: requires exactly 4 arguments"; r = -EINVAL; goto bad1; } @@ -532,7 +530,7 @@ static int snapshot_ctr(struct dm_target *ti, unsigned int argc, char **argv) } ti->private = s; - ti->split_io = s->chunk_size; + ti->split_io = chunk_size; return 0; @@ -1129,7 +1127,7 @@ static int origin_ctr(struct dm_target *ti, unsigned int argc, char **argv) struct dm_dev *dev; if (argc != 1) { - ti->error = "origin: incorrect number of arguments"; + ti->error = "dm-origin: incorrect number of arguments"; return -EINVAL; } @@ -1206,7 +1204,7 @@ static int origin_status(struct dm_target *ti, status_type_t type, char *result, static struct target_type origin_target = { .name = "snapshot-origin", - .version = {1, 4, 0}, + .version = {1, 1, 0}, .module = THIS_MODULE, .ctr = origin_ctr, .dtr = origin_dtr, @@ -1217,7 +1215,7 @@ static struct target_type origin_target = { static struct target_type snapshot_target = { .name = "snapshot", - .version = {1, 4, 0}, + .version = {1, 1, 0}, .module = THIS_MODULE, .ctr = snapshot_ctr, .dtr = snapshot_dtr, @@ -1238,7 +1236,7 @@ static int __init dm_snapshot_init(void) r = dm_register_target(&origin_target); if (r < 0) { - DMERR("Origin target register failed %d", r); + DMERR("Device mapper: Origin: register failed %d\n", r); goto bad1; } diff --git a/trunk/drivers/md/dm-stripe.c b/trunk/drivers/md/dm-stripe.c index 6c29fcecd892..08328a8f5a3c 100644 --- a/trunk/drivers/md/dm-stripe.c +++ b/trunk/drivers/md/dm-stripe.c @@ -12,8 +12,6 @@ #include #include -#define DM_MSG_PREFIX "striped" - struct stripe { struct dm_dev *dev; sector_t physical_start; @@ -80,19 +78,19 @@ static int stripe_ctr(struct dm_target *ti, unsigned int argc, char **argv) unsigned int i; if (argc < 2) { - ti->error = "Not enough arguments"; + ti->error = "dm-stripe: Not enough arguments"; return -EINVAL; } stripes = simple_strtoul(argv[0], &end, 10); if (*end) { - ti->error = "Invalid stripe count"; + ti->error = "dm-stripe: Invalid stripe count"; return -EINVAL; } chunk_size = simple_strtoul(argv[1], &end, 10); if (*end) { - ti->error = "Invalid chunk_size"; + ti->error = "dm-stripe: Invalid chunk_size"; return -EINVAL; } @@ -101,19 +99,19 @@ static int stripe_ctr(struct dm_target *ti, unsigned int argc, char **argv) */ if (!chunk_size || (chunk_size & (chunk_size - 1)) || (chunk_size < (PAGE_SIZE >> SECTOR_SHIFT))) { - ti->error = "Invalid chunk size"; + ti->error = "dm-stripe: Invalid chunk size"; return -EINVAL; } if (ti->len & (chunk_size - 1)) { - ti->error = "Target length not divisible by " + ti->error = "dm-stripe: Target length not divisible by " "chunk size"; return -EINVAL; } width = ti->len; if (sector_div(width, stripes)) { - ti->error = "Target length not divisible by " + ti->error = "dm-stripe: Target length not divisible by " "number of stripes"; return -EINVAL; } @@ -122,14 +120,14 @@ static int stripe_ctr(struct dm_target *ti, unsigned int argc, char **argv) * Do we have enough arguments for that many stripes ? */ if (argc != (2 + 2 * stripes)) { - ti->error = "Not enough destinations " + ti->error = "dm-stripe: Not enough destinations " "specified"; return -EINVAL; } sc = alloc_context(stripes); if (!sc) { - ti->error = "Memory allocation for striped context " + ti->error = "dm-stripe: Memory allocation for striped context " "failed"; return -ENOMEM; } @@ -151,7 +149,8 @@ static int stripe_ctr(struct dm_target *ti, unsigned int argc, char **argv) r = get_stripe(ti, sc, i, argv); if (r < 0) { - ti->error = "Couldn't parse stripe destination"; + ti->error = "dm-stripe: Couldn't parse stripe " + "destination"; while (i--) dm_put_device(ti, sc->stripe[i].dev); kfree(sc); @@ -228,7 +227,7 @@ int __init dm_stripe_init(void) r = dm_register_target(&stripe_target); if (r < 0) - DMWARN("target registration failed"); + DMWARN("striped target registration failed"); return r; } @@ -236,7 +235,7 @@ int __init dm_stripe_init(void) void dm_stripe_exit(void) { if (dm_unregister_target(&stripe_target)) - DMWARN("target unregistration failed"); + DMWARN("striped target unregistration failed"); return; } diff --git a/trunk/drivers/md/dm-table.c b/trunk/drivers/md/dm-table.c index 75fe9493e6af..8f56a54cf0ce 100644 --- a/trunk/drivers/md/dm-table.c +++ b/trunk/drivers/md/dm-table.c @@ -17,8 +17,6 @@ #include #include -#define DM_MSG_PREFIX "table" - #define MAX_DEPTH 16 #define NODE_SIZE L1_CACHE_BYTES #define KEYS_PER_NODE (NODE_SIZE / sizeof(sector_t)) @@ -239,44 +237,6 @@ int dm_table_create(struct dm_table **result, int mode, return 0; } -int dm_create_error_table(struct dm_table **result, struct mapped_device *md) -{ - struct dm_table *t; - sector_t dev_size = 1; - int r; - - /* - * Find current size of device. - * Default to 1 sector if inactive. - */ - t = dm_get_table(md); - if (t) { - dev_size = dm_table_get_size(t); - dm_table_put(t); - } - - r = dm_table_create(&t, FMODE_READ, 1, md); - if (r) - return r; - - r = dm_table_add_target(t, "error", 0, dev_size, NULL); - if (r) - goto out; - - r = dm_table_complete(t); - if (r) - goto out; - - *result = t; - -out: - if (r) - dm_table_put(t); - - return r; -} -EXPORT_SYMBOL_GPL(dm_create_error_table); - static void free_devices(struct list_head *devices) { struct list_head *tmp, *next; @@ -630,12 +590,6 @@ int dm_split_args(int *argc, char ***argvp, char *input) unsigned array_size = 0; *argc = 0; - - if (!input) { - *argvp = NULL; - return 0; - } - argv = realloc_argv(&array_size, argv); if (!argv) return -ENOMEM; @@ -717,14 +671,15 @@ int dm_table_add_target(struct dm_table *t, const char *type, memset(tgt, 0, sizeof(*tgt)); if (!len) { - DMERR("%s: zero-length target", dm_device_name(t->md)); + tgt->error = "zero-length target"; + DMERR("%s", tgt->error); return -EINVAL; } tgt->type = dm_get_target_type(type); if (!tgt->type) { - DMERR("%s: %s: unknown target type", dm_device_name(t->md), - type); + tgt->error = "unknown target type"; + DMERR("%s", tgt->error); return -EINVAL; } @@ -761,7 +716,7 @@ int dm_table_add_target(struct dm_table *t, const char *type, return 0; bad: - DMERR("%s: %s: %s", dm_device_name(t->md), type, tgt->error); + DMERR("%s", tgt->error); dm_put_target_type(tgt->type); return r; } @@ -847,7 +802,7 @@ sector_t dm_table_get_size(struct dm_table *t) struct dm_target *dm_table_get_target(struct dm_table *t, unsigned int index) { - if (index >= t->num_targets) + if (index > t->num_targets) return NULL; return t->targets + index; diff --git a/trunk/drivers/md/dm-target.c b/trunk/drivers/md/dm-target.c index 477a041a41cf..64fd8e79ea4c 100644 --- a/trunk/drivers/md/dm-target.c +++ b/trunk/drivers/md/dm-target.c @@ -12,8 +12,6 @@ #include #include -#define DM_MSG_PREFIX "target" - struct tt_internal { struct target_type tt; diff --git a/trunk/drivers/md/dm-zero.c b/trunk/drivers/md/dm-zero.c index ea569f7348d2..51c0639b2487 100644 --- a/trunk/drivers/md/dm-zero.c +++ b/trunk/drivers/md/dm-zero.c @@ -10,15 +10,13 @@ #include #include -#define DM_MSG_PREFIX "zero" - /* * Construct a dummy mapping that only returns zeros */ static int zero_ctr(struct dm_target *ti, unsigned int argc, char **argv) { if (argc != 0) { - ti->error = "No arguments required"; + ti->error = "dm-zero: No arguments required"; return -EINVAL; } @@ -62,7 +60,7 @@ static int __init dm_zero_init(void) int r = dm_register_target(&zero_target); if (r < 0) - DMERR("register failed %d", r); + DMERR("zero: register failed %d", r); return r; } @@ -72,7 +70,7 @@ static void __exit dm_zero_exit(void) int r = dm_unregister_target(&zero_target); if (r < 0) - DMERR("unregister failed %d", r); + DMERR("zero: unregister failed %d", r); } module_init(dm_zero_init) diff --git a/trunk/drivers/md/dm.c b/trunk/drivers/md/dm.c index 3ed2e53b9eb6..4d710b7a133b 100644 --- a/trunk/drivers/md/dm.c +++ b/trunk/drivers/md/dm.c @@ -1,6 +1,6 @@ /* * Copyright (C) 2001, 2002 Sistina Software (UK) Limited. - * Copyright (C) 2004-2006 Red Hat, Inc. All rights reserved. + * Copyright (C) 2004 Red Hat, Inc. All rights reserved. * * This file is released under the GPL. */ @@ -21,14 +21,11 @@ #include #include -#define DM_MSG_PREFIX "core" - static const char *_name = DM_NAME; static unsigned int major = 0; static unsigned int _major = 0; -static DEFINE_SPINLOCK(_minor_lock); /* * One of these is allocated per bio. */ @@ -52,28 +49,23 @@ struct target_io { union map_info *dm_get_mapinfo(struct bio *bio) { - if (bio && bio->bi_private) - return &((struct target_io *)bio->bi_private)->info; - return NULL; + if (bio && bio->bi_private) + return &((struct target_io *)bio->bi_private)->info; + return NULL; } -#define MINOR_ALLOCED ((void *)-1) - /* * Bits for the md->flags field. */ #define DMF_BLOCK_IO 0 #define DMF_SUSPENDED 1 #define DMF_FROZEN 2 -#define DMF_FREEING 3 -#define DMF_DELETING 4 struct mapped_device { struct rw_semaphore io_lock; struct semaphore suspend_lock; rwlock_t map_lock; atomic_t holders; - atomic_t open_count; unsigned long flags; @@ -226,25 +218,9 @@ static int dm_blk_open(struct inode *inode, struct file *file) { struct mapped_device *md; - spin_lock(&_minor_lock); - md = inode->i_bdev->bd_disk->private_data; - if (!md) - goto out; - - if (test_bit(DMF_FREEING, &md->flags) || - test_bit(DMF_DELETING, &md->flags)) { - md = NULL; - goto out; - } - dm_get(md); - atomic_inc(&md->open_count); - -out: - spin_unlock(&_minor_lock); - - return md ? 0 : -ENXIO; + return 0; } static int dm_blk_close(struct inode *inode, struct file *file) @@ -252,35 +228,10 @@ static int dm_blk_close(struct inode *inode, struct file *file) struct mapped_device *md; md = inode->i_bdev->bd_disk->private_data; - atomic_dec(&md->open_count); dm_put(md); return 0; } -int dm_open_count(struct mapped_device *md) -{ - return atomic_read(&md->open_count); -} - -/* - * Guarantees nothing is using the device before it's deleted. - */ -int dm_lock_for_deletion(struct mapped_device *md) -{ - int r = 0; - - spin_lock(&_minor_lock); - - if (dm_open_count(md)) - r = -EBUSY; - else - set_bit(DMF_DELETING, &md->flags); - - spin_unlock(&_minor_lock); - - return r; -} - static int dm_blk_getgeo(struct block_device *bdev, struct hd_geometry *geo) { struct mapped_device *md = bdev->bd_disk->private_data; @@ -505,8 +456,8 @@ static void __map_bio(struct dm_target *ti, struct bio *clone, if (r > 0) { /* the bio has been remapped so dispatch it */ - blk_add_trace_remap(bdev_get_queue(clone->bi_bdev), clone, - tio->io->bio->bi_bdev->bd_dev, sector, + blk_add_trace_remap(bdev_get_queue(clone->bi_bdev), clone, + tio->io->bio->bi_bdev->bd_dev, sector, clone->bi_sector); generic_make_request(clone); @@ -793,39 +744,43 @@ static int dm_any_congested(void *congested_data, int bdi_bits) /*----------------------------------------------------------------- * An IDR is used to keep track of allocated minor numbers. *---------------------------------------------------------------*/ +static DEFINE_MUTEX(_minor_lock); static DEFINE_IDR(_minor_idr); -static void free_minor(int minor) +static void free_minor(unsigned int minor) { - spin_lock(&_minor_lock); + mutex_lock(&_minor_lock); idr_remove(&_minor_idr, minor); - spin_unlock(&_minor_lock); + mutex_unlock(&_minor_lock); } /* * See if the device with a specific minor # is free. */ -static int specific_minor(struct mapped_device *md, int minor) +static int specific_minor(struct mapped_device *md, unsigned int minor) { int r, m; if (minor >= (1 << MINORBITS)) return -EINVAL; - r = idr_pre_get(&_minor_idr, GFP_KERNEL); - if (!r) - return -ENOMEM; - - spin_lock(&_minor_lock); + mutex_lock(&_minor_lock); if (idr_find(&_minor_idr, minor)) { r = -EBUSY; goto out; } - r = idr_get_new_above(&_minor_idr, MINOR_ALLOCED, minor, &m); - if (r) + r = idr_pre_get(&_minor_idr, GFP_KERNEL); + if (!r) { + r = -ENOMEM; + goto out; + } + + r = idr_get_new_above(&_minor_idr, md, minor, &m); + if (r) { goto out; + } if (m != minor) { idr_remove(&_minor_idr, m); @@ -834,21 +789,24 @@ static int specific_minor(struct mapped_device *md, int minor) } out: - spin_unlock(&_minor_lock); + mutex_unlock(&_minor_lock); return r; } -static int next_free_minor(struct mapped_device *md, int *minor) +static int next_free_minor(struct mapped_device *md, unsigned int *minor) { - int r, m; + int r; + unsigned int m; - r = idr_pre_get(&_minor_idr, GFP_KERNEL); - if (!r) - return -ENOMEM; + mutex_lock(&_minor_lock); - spin_lock(&_minor_lock); + r = idr_pre_get(&_minor_idr, GFP_KERNEL); + if (!r) { + r = -ENOMEM; + goto out; + } - r = idr_get_new(&_minor_idr, MINOR_ALLOCED, &m); + r = idr_get_new(&_minor_idr, md, &m); if (r) { goto out; } @@ -862,7 +820,7 @@ static int next_free_minor(struct mapped_device *md, int *minor) *minor = m; out: - spin_unlock(&_minor_lock); + mutex_unlock(&_minor_lock); return r; } @@ -871,25 +829,18 @@ static struct block_device_operations dm_blk_dops; /* * Allocate and initialise a blank device with a given minor. */ -static struct mapped_device *alloc_dev(int minor) +static struct mapped_device *alloc_dev(unsigned int minor, int persistent) { int r; struct mapped_device *md = kmalloc(sizeof(*md), GFP_KERNEL); - void *old_md; if (!md) { DMWARN("unable to allocate device, out of memory."); return NULL; } - if (!try_module_get(THIS_MODULE)) - goto bad0; - /* get a minor number for the dev */ - if (minor == DM_ANY_MINOR) - r = next_free_minor(md, &minor); - else - r = specific_minor(md, minor); + r = persistent ? specific_minor(md, minor) : next_free_minor(md, &minor); if (r < 0) goto bad1; @@ -898,7 +849,6 @@ static struct mapped_device *alloc_dev(int minor) init_MUTEX(&md->suspend_lock); rwlock_init(&md->map_lock); atomic_set(&md->holders, 1); - atomic_set(&md->open_count, 0); atomic_set(&md->event_nr, 0); md->queue = blk_alloc_queue(GFP_KERNEL); @@ -925,10 +875,6 @@ static struct mapped_device *alloc_dev(int minor) if (!md->disk) goto bad4; - atomic_set(&md->pending, 0); - init_waitqueue_head(&md->wait); - init_waitqueue_head(&md->eventq); - md->disk->major = _major; md->disk->first_minor = minor; md->disk->fops = &dm_blk_dops; @@ -938,12 +884,9 @@ static struct mapped_device *alloc_dev(int minor) add_disk(md->disk); format_dev_t(md->name, MKDEV(_major, minor)); - /* Populate the mapping, nobody knows we exist yet */ - spin_lock(&_minor_lock); - old_md = idr_replace(&_minor_idr, md, minor); - spin_unlock(&_minor_lock); - - BUG_ON(old_md != MINOR_ALLOCED); + atomic_set(&md->pending, 0); + init_waitqueue_head(&md->wait); + init_waitqueue_head(&md->eventq); return md; @@ -955,15 +898,13 @@ static struct mapped_device *alloc_dev(int minor) blk_cleanup_queue(md->queue); free_minor(minor); bad1: - module_put(THIS_MODULE); - bad0: kfree(md); return NULL; } static void free_dev(struct mapped_device *md) { - int minor = md->disk->first_minor; + unsigned int minor = md->disk->first_minor; if (md->suspended_bdev) { thaw_bdev(md->suspended_bdev, NULL); @@ -973,14 +914,8 @@ static void free_dev(struct mapped_device *md) mempool_destroy(md->io_pool); del_gendisk(md->disk); free_minor(minor); - - spin_lock(&_minor_lock); - md->disk->private_data = NULL; - spin_unlock(&_minor_lock); - put_disk(md->disk); blk_cleanup_queue(md->queue); - module_put(THIS_MODULE); kfree(md); } @@ -1049,11 +984,12 @@ static void __unbind(struct mapped_device *md) /* * Constructor for a new device. */ -int dm_create(int minor, struct mapped_device **result) +static int create_aux(unsigned int minor, int persistent, + struct mapped_device **result) { struct mapped_device *md; - md = alloc_dev(minor); + md = alloc_dev(minor, persistent); if (!md) return -ENXIO; @@ -1061,6 +997,16 @@ int dm_create(int minor, struct mapped_device **result) return 0; } +int dm_create(struct mapped_device **result) +{ + return create_aux(0, 0, result); +} + +int dm_create_with_minor(unsigned int minor, struct mapped_device **result) +{ + return create_aux(minor, 1, result); +} + static struct mapped_device *dm_find_md(dev_t dev) { struct mapped_device *md; @@ -1069,18 +1015,13 @@ static struct mapped_device *dm_find_md(dev_t dev) if (MAJOR(dev) != _major || minor >= (1 << MINORBITS)) return NULL; - spin_lock(&_minor_lock); + mutex_lock(&_minor_lock); md = idr_find(&_minor_idr, minor); - if (md && (md == MINOR_ALLOCED || - (dm_disk(md)->first_minor != minor) || - test_bit(DMF_FREEING, &md->flags))) { + if (!md || (dm_disk(md)->first_minor != minor)) md = NULL; - goto out; - } -out: - spin_unlock(&_minor_lock); + mutex_unlock(&_minor_lock); return md; } @@ -1110,23 +1051,12 @@ void dm_get(struct mapped_device *md) atomic_inc(&md->holders); } -const char *dm_device_name(struct mapped_device *md) -{ - return md->name; -} -EXPORT_SYMBOL_GPL(dm_device_name); - void dm_put(struct mapped_device *md) { struct dm_table *map; - BUG_ON(test_bit(DMF_FREEING, &md->flags)); - - if (atomic_dec_and_lock(&md->holders, &_minor_lock)) { + if (atomic_dec_and_test(&md->holders)) { map = dm_get_table(md); - idr_replace(&_minor_idr, MINOR_ALLOCED, dm_disk(md)->first_minor); - set_bit(DMF_FREEING, &md->flags); - spin_unlock(&_minor_lock); if (!dm_suspended(md)) { dm_table_presuspend_targets(map); dm_table_postsuspend_targets(map); diff --git a/trunk/drivers/md/dm.h b/trunk/drivers/md/dm.h index 3c03c0ecab7e..fd90bc8f9e45 100644 --- a/trunk/drivers/md/dm.h +++ b/trunk/drivers/md/dm.h @@ -2,7 +2,7 @@ * Internal header file for device mapper * * Copyright (C) 2001, 2002 Sistina Software - * Copyright (C) 2004-2006 Red Hat, Inc. All rights reserved. + * Copyright (C) 2004 Red Hat, Inc. All rights reserved. * * This file is released under the LGPL. */ @@ -17,10 +17,9 @@ #include #define DM_NAME "device-mapper" - -#define DMERR(f, arg...) printk(KERN_ERR DM_NAME ": " DM_MSG_PREFIX ": " f "\n", ## arg) -#define DMWARN(f, arg...) printk(KERN_WARNING DM_NAME ": " DM_MSG_PREFIX ": " f "\n", ## arg) -#define DMINFO(f, arg...) printk(KERN_INFO DM_NAME ": " DM_MSG_PREFIX ": " f "\n", ## arg) +#define DMWARN(f, x...) printk(KERN_WARNING DM_NAME ": " f "\n" , ## x) +#define DMERR(f, x...) printk(KERN_ERR DM_NAME ": " f "\n" , ## x) +#define DMINFO(f, x...) printk(KERN_INFO DM_NAME ": " f "\n" , ## x) #define DMEMIT(x...) sz += ((sz >= maxlen) ? \ 0 : scnprintf(result + sz, maxlen - sz, x)) @@ -40,16 +39,83 @@ struct dm_dev { }; struct dm_table; +struct mapped_device; + +/*----------------------------------------------------------------- + * Functions for manipulating a struct mapped_device. + * Drop the reference with dm_put when you finish with the object. + *---------------------------------------------------------------*/ +int dm_create(struct mapped_device **md); +int dm_create_with_minor(unsigned int minor, struct mapped_device **md); +void dm_set_mdptr(struct mapped_device *md, void *ptr); +void *dm_get_mdptr(struct mapped_device *md); +struct mapped_device *dm_get_md(dev_t dev); + +/* + * Reference counting for md. + */ +void dm_get(struct mapped_device *md); +void dm_put(struct mapped_device *md); + +/* + * A device can still be used while suspended, but I/O is deferred. + */ +int dm_suspend(struct mapped_device *md, int with_lockfs); +int dm_resume(struct mapped_device *md); + +/* + * The device must be suspended before calling this method. + */ +int dm_swap_table(struct mapped_device *md, struct dm_table *t); + +/* + * Drop a reference on the table when you've finished with the + * result. + */ +struct dm_table *dm_get_table(struct mapped_device *md); + +/* + * Event functions. + */ +uint32_t dm_get_event_nr(struct mapped_device *md); +int dm_wait_event(struct mapped_device *md, int event_nr); + +/* + * Info functions. + */ +struct gendisk *dm_disk(struct mapped_device *md); +int dm_suspended(struct mapped_device *md); + +/* + * Geometry functions. + */ +int dm_get_geometry(struct mapped_device *md, struct hd_geometry *geo); +int dm_set_geometry(struct mapped_device *md, struct hd_geometry *geo); /*----------------------------------------------------------------- - * Internal table functions. + * Functions for manipulating a table. Tables are also reference + * counted. *---------------------------------------------------------------*/ +int dm_table_create(struct dm_table **result, int mode, + unsigned num_targets, struct mapped_device *md); + +void dm_table_get(struct dm_table *t); +void dm_table_put(struct dm_table *t); + +int dm_table_add_target(struct dm_table *t, const char *type, + sector_t start, sector_t len, char *params); +int dm_table_complete(struct dm_table *t); void dm_table_event_callback(struct dm_table *t, void (*fn)(void *), void *context); +void dm_table_event(struct dm_table *t); +sector_t dm_table_get_size(struct dm_table *t); struct dm_target *dm_table_get_target(struct dm_table *t, unsigned int index); struct dm_target *dm_table_find_target(struct dm_table *t, sector_t sector); void dm_table_set_restrictions(struct dm_table *t, struct request_queue *q); +unsigned int dm_table_get_num_targets(struct dm_table *t); struct list_head *dm_table_get_devices(struct dm_table *t); +int dm_table_get_mode(struct dm_table *t); +struct mapped_device *dm_table_get_md(struct dm_table *t); void dm_table_presuspend_targets(struct dm_table *t); void dm_table_postsuspend_targets(struct dm_table *t); void dm_table_resume_targets(struct dm_table *t); @@ -67,6 +133,7 @@ void dm_put_target_type(struct target_type *t); int dm_target_iterate(void (*iter_func)(struct target_type *tt, void *param), void *param); + /*----------------------------------------------------------------- * Useful inlines. *---------------------------------------------------------------*/ @@ -124,7 +191,5 @@ void dm_stripe_exit(void); void *dm_vcalloc(unsigned long nmemb, unsigned long elem_size); union map_info *dm_get_mapinfo(struct bio *bio); -int dm_open_count(struct mapped_device *md); -int dm_lock_for_deletion(struct mapped_device *md); #endif diff --git a/trunk/drivers/md/kcopyd.c b/trunk/drivers/md/kcopyd.c index 73ab875fb158..72480a48d88b 100644 --- a/trunk/drivers/md/kcopyd.c +++ b/trunk/drivers/md/kcopyd.c @@ -314,7 +314,7 @@ static void complete_io(unsigned long error, void *context) if (error) { if (job->rw == WRITE) - job->write_err |= error; + job->write_err &= error; else job->read_err = 1; @@ -460,7 +460,7 @@ static void segment_complete(int read_err, job->read_err = 1; if (write_err) - job->write_err |= write_err; + job->write_err &= write_err; /* * Only dispatch more work if there hasn't been an error. diff --git a/trunk/drivers/md/linear.c b/trunk/drivers/md/linear.c index ff83c9b5979e..777585458c85 100644 --- a/trunk/drivers/md/linear.c +++ b/trunk/drivers/md/linear.c @@ -111,7 +111,7 @@ static int linear_issue_flush(request_queue_t *q, struct gendisk *disk, return ret; } -static linear_conf_t *linear_conf(mddev_t *mddev, int raid_disks) +static int linear_run (mddev_t *mddev) { linear_conf_t *conf; dev_info_t **table; @@ -121,21 +121,20 @@ static linear_conf_t *linear_conf(mddev_t *mddev, int raid_disks) sector_t curr_offset; struct list_head *tmp; - conf = kzalloc (sizeof (*conf) + raid_disks*sizeof(dev_info_t), + conf = kzalloc (sizeof (*conf) + mddev->raid_disks*sizeof(dev_info_t), GFP_KERNEL); if (!conf) - return NULL; - + goto out; mddev->private = conf; cnt = 0; - conf->array_size = 0; + mddev->array_size = 0; ITERATE_RDEV(mddev,rdev,tmp) { int j = rdev->raid_disk; dev_info_t *disk = conf->disks + j; - if (j < 0 || j > raid_disks || disk->rdev) { + if (j < 0 || j > mddev->raid_disks || disk->rdev) { printk("linear: disk numbering problem. Aborting!\n"); goto out; } @@ -153,11 +152,11 @@ static linear_conf_t *linear_conf(mddev_t *mddev, int raid_disks) blk_queue_max_sectors(mddev->queue, PAGE_SIZE>>9); disk->size = rdev->size; - conf->array_size += rdev->size; + mddev->array_size += rdev->size; cnt++; } - if (cnt != raid_disks) { + if (cnt != mddev->raid_disks) { printk("linear: not enough drives present. Aborting!\n"); goto out; } @@ -201,7 +200,7 @@ static linear_conf_t *linear_conf(mddev_t *mddev, int raid_disks) unsigned round; unsigned long base; - sz = conf->array_size >> conf->preshift; + sz = mddev->array_size >> conf->preshift; sz += 1; /* force round-up */ base = conf->hash_spacing >> conf->preshift; round = sector_div(sz, base); @@ -248,56 +247,14 @@ static linear_conf_t *linear_conf(mddev_t *mddev, int raid_disks) BUG_ON(table - conf->hash_table > nb_zone); - return conf; - -out: - kfree(conf); - return NULL; -} - -static int linear_run (mddev_t *mddev) -{ - linear_conf_t *conf; - - conf = linear_conf(mddev, mddev->raid_disks); - - if (!conf) - return 1; - mddev->private = conf; - mddev->array_size = conf->array_size; - blk_queue_merge_bvec(mddev->queue, linear_mergeable_bvec); mddev->queue->unplug_fn = linear_unplug; mddev->queue->issue_flush_fn = linear_issue_flush; return 0; -} - -static int linear_add(mddev_t *mddev, mdk_rdev_t *rdev) -{ - /* Adding a drive to a linear array allows the array to grow. - * It is permitted if the new drive has a matching superblock - * already on it, with raid_disk equal to raid_disks. - * It is achieved by creating a new linear_private_data structure - * and swapping it in in-place of the current one. - * The current one is never freed until the array is stopped. - * This avoids races. - */ - linear_conf_t *newconf; - - if (rdev->raid_disk != mddev->raid_disks) - return -EINVAL; - newconf = linear_conf(mddev,mddev->raid_disks+1); - - if (!newconf) - return -ENOMEM; - - newconf->prev = mddev_to_conf(mddev); - mddev->private = newconf; - mddev->raid_disks++; - mddev->array_size = newconf->array_size; - set_capacity(mddev->gendisk, mddev->array_size << 1); - return 0; +out: + kfree(conf); + return 1; } static int linear_stop (mddev_t *mddev) @@ -305,12 +262,8 @@ static int linear_stop (mddev_t *mddev) linear_conf_t *conf = mddev_to_conf(mddev); blk_sync_queue(mddev->queue); /* the unplug fn references 'conf'*/ - do { - linear_conf_t *t = conf->prev; - kfree(conf->hash_table); - kfree(conf); - conf = t; - } while (conf); + kfree(conf->hash_table); + kfree(conf); return 0; } @@ -407,7 +360,6 @@ static struct mdk_personality linear_personality = .run = linear_run, .stop = linear_stop, .status = linear_status, - .hot_add_disk = linear_add, }; static int __init linear_init (void) diff --git a/trunk/drivers/md/md.c b/trunk/drivers/md/md.c index 306268ec99ff..f19b874753a9 100644 --- a/trunk/drivers/md/md.c +++ b/trunk/drivers/md/md.c @@ -44,7 +44,6 @@ #include #include #include -#include #include @@ -73,10 +72,6 @@ static void autostart_arrays (int part); static LIST_HEAD(pers_list); static DEFINE_SPINLOCK(pers_lock); -static void md_print_devices(void); - -#define MD_BUG(x...) { printk("md: bug in file %s, line %d\n", __FILE__, __LINE__); md_print_devices(); } - /* * Current RAID-1,4,5 parallel reconstruction 'guaranteed speed limit' * is 1000 KB/sec, so the extra system load does not show up that much. @@ -175,7 +170,7 @@ EXPORT_SYMBOL_GPL(md_new_event); /* Alternate version that can be called from interrupts * when calling sysfs_notify isn't needed. */ -static void md_new_event_inintr(mddev_t *mddev) +void md_new_event_inintr(mddev_t *mddev) { atomic_inc(&md_event_count); wake_up(&md_event_waiters); @@ -737,7 +732,6 @@ static int super_90_validate(mddev_t *mddev, mdk_rdev_t *rdev) { mdp_disk_t *desc; mdp_super_t *sb = (mdp_super_t *)page_address(rdev->sb_page); - __u64 ev1 = md_event(sb); rdev->raid_disk = -1; rdev->flags = 0; @@ -754,7 +748,7 @@ static int super_90_validate(mddev_t *mddev, mdk_rdev_t *rdev) mddev->layout = sb->layout; mddev->raid_disks = sb->raid_disks; mddev->size = sb->size; - mddev->events = ev1; + mddev->events = md_event(sb); mddev->bitmap_offset = 0; mddev->default_bitmap_offset = MD_SB_BYTES >> 9; @@ -803,6 +797,7 @@ static int super_90_validate(mddev_t *mddev, mdk_rdev_t *rdev) } else if (mddev->pers == NULL) { /* Insist on good event counter while assembling */ + __u64 ev1 = md_event(sb); ++ev1; if (ev1 < mddev->events) return -EINVAL; @@ -810,21 +805,19 @@ static int super_90_validate(mddev_t *mddev, mdk_rdev_t *rdev) /* if adding to array with a bitmap, then we can accept an * older device ... but not too old. */ + __u64 ev1 = md_event(sb); if (ev1 < mddev->bitmap->events_cleared) return 0; - } else { - if (ev1 < mddev->events) - /* just a hot-add of a new device, leave raid_disk at -1 */ - return 0; - } + } else /* just a hot-add of a new device, leave raid_disk at -1 */ + return 0; if (mddev->level != LEVEL_MULTIPATH) { desc = sb->disks + rdev->desc_nr; if (desc->state & (1<flags); - else if (desc->state & (1<raid_disk < mddev->raid_disks */) { + else if (desc->state & (1<raid_disk < mddev->raid_disks) { set_bit(In_sync, &rdev->flags); rdev->raid_disk = desc->raid_disk; } @@ -1107,7 +1100,6 @@ static int super_1_load(mdk_rdev_t *rdev, mdk_rdev_t *refdev, int minor_version) static int super_1_validate(mddev_t *mddev, mdk_rdev_t *rdev) { struct mdp_superblock_1 *sb = (struct mdp_superblock_1*)page_address(rdev->sb_page); - __u64 ev1 = le64_to_cpu(sb->events); rdev->raid_disk = -1; rdev->flags = 0; @@ -1123,7 +1115,7 @@ static int super_1_validate(mddev_t *mddev, mdk_rdev_t *rdev) mddev->layout = le32_to_cpu(sb->layout); mddev->raid_disks = le32_to_cpu(sb->raid_disks); mddev->size = le64_to_cpu(sb->size)/2; - mddev->events = ev1; + mddev->events = le64_to_cpu(sb->events); mddev->bitmap_offset = 0; mddev->default_bitmap_offset = 1024 >> 9; @@ -1157,6 +1149,7 @@ static int super_1_validate(mddev_t *mddev, mdk_rdev_t *rdev) } else if (mddev->pers == NULL) { /* Insist of good event counter while assembling */ + __u64 ev1 = le64_to_cpu(sb->events); ++ev1; if (ev1 < mddev->events) return -EINVAL; @@ -1164,13 +1157,12 @@ static int super_1_validate(mddev_t *mddev, mdk_rdev_t *rdev) /* If adding to array with a bitmap, then we can accept an * older device, but not too old. */ + __u64 ev1 = le64_to_cpu(sb->events); if (ev1 < mddev->bitmap->events_cleared) return 0; - } else { - if (ev1 < mddev->events) - /* just a hot-add of a new device, leave raid_disk at -1 */ - return 0; - } + } else /* just a hot-add of a new device, leave raid_disk at -1 */ + return 0; + if (mddev->level != LEVEL_MULTIPATH) { int role; rdev->desc_nr = le32_to_cpu(sb->dev_number); @@ -1182,11 +1174,7 @@ static int super_1_validate(mddev_t *mddev, mdk_rdev_t *rdev) set_bit(Faulty, &rdev->flags); break; default: - if ((le32_to_cpu(sb->feature_map) & - MD_FEATURE_RECOVERY_OFFSET)) - rdev->recovery_offset = le64_to_cpu(sb->recovery_offset); - else - set_bit(In_sync, &rdev->flags); + set_bit(In_sync, &rdev->flags); rdev->raid_disk = role; break; } @@ -1210,7 +1198,6 @@ static void super_1_sync(mddev_t *mddev, mdk_rdev_t *rdev) sb->feature_map = 0; sb->pad0 = 0; - sb->recovery_offset = cpu_to_le64(0); memset(sb->pad1, 0, sizeof(sb->pad1)); memset(sb->pad2, 0, sizeof(sb->pad2)); memset(sb->pad3, 0, sizeof(sb->pad3)); @@ -1231,14 +1218,6 @@ static void super_1_sync(mddev_t *mddev, mdk_rdev_t *rdev) sb->bitmap_offset = cpu_to_le32((__u32)mddev->bitmap_offset); sb->feature_map = cpu_to_le32(MD_FEATURE_BITMAP_OFFSET); } - - if (rdev->raid_disk >= 0 && - !test_bit(In_sync, &rdev->flags) && - rdev->recovery_offset > 0) { - sb->feature_map |= cpu_to_le32(MD_FEATURE_RECOVERY_OFFSET); - sb->recovery_offset = cpu_to_le64(rdev->recovery_offset); - } - if (mddev->reshape_position != MaxSector) { sb->feature_map |= cpu_to_le32(MD_FEATURE_RESHAPE_ACTIVE); sb->reshape_position = cpu_to_le64(mddev->reshape_position); @@ -1263,12 +1242,11 @@ static void super_1_sync(mddev_t *mddev, mdk_rdev_t *rdev) sb->dev_roles[i] = cpu_to_le16(0xfffe); else if (test_bit(In_sync, &rdev2->flags)) sb->dev_roles[i] = cpu_to_le16(rdev2->raid_disk); - else if (rdev2->raid_disk >= 0 && rdev2->recovery_offset > 0) - sb->dev_roles[i] = cpu_to_le16(rdev2->raid_disk); else sb->dev_roles[i] = cpu_to_le16(0xffff); } + sb->recovery_offset = cpu_to_le64(0); /* not supported yet */ sb->sb_csum = calc_sb_1_csum(sb); } @@ -1529,7 +1507,7 @@ static void print_rdev(mdk_rdev_t *rdev) printk(KERN_INFO "md: no rdev superblock!\n"); } -static void md_print_devices(void) +void md_print_devices(void) { struct list_head *tmp, *tmp2; mdk_rdev_t *rdev; @@ -1558,30 +1536,15 @@ static void md_print_devices(void) } -static void sync_sbs(mddev_t * mddev, int nospares) +static void sync_sbs(mddev_t * mddev) { - /* Update each superblock (in-memory image), but - * if we are allowed to, skip spares which already - * have the right event counter, or have one earlier - * (which would mean they aren't being marked as dirty - * with the rest of the array) - */ mdk_rdev_t *rdev; struct list_head *tmp; ITERATE_RDEV(mddev,rdev,tmp) { - if (rdev->sb_events == mddev->events || - (nospares && - rdev->raid_disk < 0 && - (rdev->sb_events&1)==0 && - rdev->sb_events+1 == mddev->events)) { - /* Don't update this superblock */ - rdev->sb_loaded = 2; - } else { - super_types[mddev->major_version]. - sync_super(mddev, rdev); - rdev->sb_loaded = 1; - } + super_types[mddev->major_version]. + sync_super(mddev, rdev); + rdev->sb_loaded = 1; } } @@ -1591,42 +1554,12 @@ void md_update_sb(mddev_t * mddev) struct list_head *tmp; mdk_rdev_t *rdev; int sync_req; - int nospares = 0; repeat: spin_lock_irq(&mddev->write_lock); sync_req = mddev->in_sync; mddev->utime = get_seconds(); - if (mddev->sb_dirty == 3) - /* just a clean<-> dirty transition, possibly leave spares alone, - * though if events isn't the right even/odd, we will have to do - * spares after all - */ - nospares = 1; - - /* If this is just a dirty<->clean transition, and the array is clean - * and 'events' is odd, we can roll back to the previous clean state */ - if (mddev->sb_dirty == 3 - && (mddev->in_sync && mddev->recovery_cp == MaxSector) - && (mddev->events & 1)) - mddev->events--; - else { - /* otherwise we have to go forward and ... */ - mddev->events ++; - if (!mddev->in_sync || mddev->recovery_cp != MaxSector) { /* not clean */ - /* .. if the array isn't clean, insist on an odd 'events' */ - if ((mddev->events&1)==0) { - mddev->events++; - nospares = 0; - } - } else { - /* otherwise insist on an even 'events' (for clean states) */ - if ((mddev->events&1)) { - mddev->events++; - nospares = 0; - } - } - } + mddev->events ++; if (!mddev->events) { /* @@ -1638,7 +1571,7 @@ void md_update_sb(mddev_t * mddev) mddev->events --; } mddev->sb_dirty = 2; - sync_sbs(mddev, nospares); + sync_sbs(mddev); /* * do not write anything to disk if using @@ -1660,8 +1593,6 @@ void md_update_sb(mddev_t * mddev) ITERATE_RDEV(mddev,rdev,tmp) { char b[BDEVNAME_SIZE]; dprintk(KERN_INFO "md: "); - if (rdev->sb_loaded != 1) - continue; /* no noise on spare devices */ if (test_bit(Faulty, &rdev->flags)) dprintk("(skipping faulty "); @@ -1673,7 +1604,6 @@ void md_update_sb(mddev_t * mddev) dprintk(KERN_INFO "(write) %s's sb offset: %llu\n", bdevname(rdev->bdev,b), (unsigned long long)rdev->sb_offset); - rdev->sb_events = mddev->events; } else dprintk(")\n"); @@ -1737,10 +1667,6 @@ state_show(mdk_rdev_t *rdev, char *page) len += sprintf(page+len, "%sin_sync",sep); sep = ","; } - if (test_bit(WriteMostly, &rdev->flags)) { - len += sprintf(page+len, "%swrite_mostly",sep); - sep = ","; - } if (!test_bit(Faulty, &rdev->flags) && !test_bit(In_sync, &rdev->flags)) { len += sprintf(page+len, "%sspare", sep); @@ -1749,40 +1675,8 @@ state_show(mdk_rdev_t *rdev, char *page) return len+sprintf(page+len, "\n"); } -static ssize_t -state_store(mdk_rdev_t *rdev, const char *buf, size_t len) -{ - /* can write - * faulty - simulates and error - * remove - disconnects the device - * writemostly - sets write_mostly - * -writemostly - clears write_mostly - */ - int err = -EINVAL; - if (cmd_match(buf, "faulty") && rdev->mddev->pers) { - md_error(rdev->mddev, rdev); - err = 0; - } else if (cmd_match(buf, "remove")) { - if (rdev->raid_disk >= 0) - err = -EBUSY; - else { - mddev_t *mddev = rdev->mddev; - kick_rdev_from_array(rdev); - md_update_sb(mddev); - md_new_event(mddev); - err = 0; - } - } else if (cmd_match(buf, "writemostly")) { - set_bit(WriteMostly, &rdev->flags); - err = 0; - } else if (cmd_match(buf, "-writemostly")) { - clear_bit(WriteMostly, &rdev->flags); - err = 0; - } - return err ? err : len; -} static struct rdev_sysfs_entry -rdev_state = __ATTR(state, 0644, state_show, state_store); +rdev_state = __ATTR_RO(state); static ssize_t super_show(mdk_rdev_t *rdev, char *page) @@ -1979,7 +1873,6 @@ static mdk_rdev_t *md_import_device(dev_t newdev, int super_format, int super_mi rdev->desc_nr = -1; rdev->flags = 0; rdev->data_offset = 0; - rdev->sb_events = 0; atomic_set(&rdev->nr_pending, 0); atomic_set(&rdev->read_errors, 0); atomic_set(&rdev->corrected_errors, 0); @@ -2084,54 +1977,6 @@ static void analyze_sbs(mddev_t * mddev) } -static ssize_t -safe_delay_show(mddev_t *mddev, char *page) -{ - int msec = (mddev->safemode_delay*1000)/HZ; - return sprintf(page, "%d.%03d\n", msec/1000, msec%1000); -} -static ssize_t -safe_delay_store(mddev_t *mddev, const char *cbuf, size_t len) -{ - int scale=1; - int dot=0; - int i; - unsigned long msec; - char buf[30]; - char *e; - /* remove a period, and count digits after it */ - if (len >= sizeof(buf)) - return -EINVAL; - strlcpy(buf, cbuf, len); - buf[len] = 0; - for (i=0; isafemode_delay = 0; - else { - mddev->safemode_delay = (msec*HZ)/1000; - if (mddev->safemode_delay == 0) - mddev->safemode_delay = 1; - } - return len; -} -static struct md_sysfs_entry md_safe_delay = -__ATTR(safe_mode_delay, 0644,safe_delay_show, safe_delay_store); - static ssize_t level_show(mddev_t *mddev, char *page) { @@ -2167,32 +2012,6 @@ level_store(mddev_t *mddev, const char *buf, size_t len) static struct md_sysfs_entry md_level = __ATTR(level, 0644, level_show, level_store); - -static ssize_t -layout_show(mddev_t *mddev, char *page) -{ - /* just a number, not meaningful for all levels */ - return sprintf(page, "%d\n", mddev->layout); -} - -static ssize_t -layout_store(mddev_t *mddev, const char *buf, size_t len) -{ - char *e; - unsigned long n = simple_strtoul(buf, &e, 10); - if (mddev->pers) - return -EBUSY; - - if (!*buf || (*e && *e != '\n')) - return -EINVAL; - - mddev->layout = n; - return len; -} -static struct md_sysfs_entry md_layout = -__ATTR(layout, 0655, layout_show, layout_store); - - static ssize_t raid_disks_show(mddev_t *mddev, char *page) { @@ -2247,200 +2066,6 @@ chunk_size_store(mddev_t *mddev, const char *buf, size_t len) static struct md_sysfs_entry md_chunk_size = __ATTR(chunk_size, 0644, chunk_size_show, chunk_size_store); -static ssize_t -resync_start_show(mddev_t *mddev, char *page) -{ - return sprintf(page, "%llu\n", (unsigned long long)mddev->recovery_cp); -} - -static ssize_t -resync_start_store(mddev_t *mddev, const char *buf, size_t len) -{ - /* can only set chunk_size if array is not yet active */ - char *e; - unsigned long long n = simple_strtoull(buf, &e, 10); - - if (mddev->pers) - return -EBUSY; - if (!*buf || (*e && *e != '\n')) - return -EINVAL; - - mddev->recovery_cp = n; - return len; -} -static struct md_sysfs_entry md_resync_start = -__ATTR(resync_start, 0644, resync_start_show, resync_start_store); - -/* - * The array state can be: - * - * clear - * No devices, no size, no level - * Equivalent to STOP_ARRAY ioctl - * inactive - * May have some settings, but array is not active - * all IO results in error - * When written, doesn't tear down array, but just stops it - * suspended (not supported yet) - * All IO requests will block. The array can be reconfigured. - * Writing this, if accepted, will block until array is quiessent - * readonly - * no resync can happen. no superblocks get written. - * write requests fail - * read-auto - * like readonly, but behaves like 'clean' on a write request. - * - * clean - no pending writes, but otherwise active. - * When written to inactive array, starts without resync - * If a write request arrives then - * if metadata is known, mark 'dirty' and switch to 'active'. - * if not known, block and switch to write-pending - * If written to an active array that has pending writes, then fails. - * active - * fully active: IO and resync can be happening. - * When written to inactive array, starts with resync - * - * write-pending - * clean, but writes are blocked waiting for 'active' to be written. - * - * active-idle - * like active, but no writes have been seen for a while (100msec). - * - */ -enum array_state { clear, inactive, suspended, readonly, read_auto, clean, active, - write_pending, active_idle, bad_word}; -static char *array_states[] = { - "clear", "inactive", "suspended", "readonly", "read-auto", "clean", "active", - "write-pending", "active-idle", NULL }; - -static int match_word(const char *word, char **list) -{ - int n; - for (n=0; list[n]; n++) - if (cmd_match(word, list[n])) - break; - return n; -} - -static ssize_t -array_state_show(mddev_t *mddev, char *page) -{ - enum array_state st = inactive; - - if (mddev->pers) - switch(mddev->ro) { - case 1: - st = readonly; - break; - case 2: - st = read_auto; - break; - case 0: - if (mddev->in_sync) - st = clean; - else if (mddev->safemode) - st = active_idle; - else - st = active; - } - else { - if (list_empty(&mddev->disks) && - mddev->raid_disks == 0 && - mddev->size == 0) - st = clear; - else - st = inactive; - } - return sprintf(page, "%s\n", array_states[st]); -} - -static int do_md_stop(mddev_t * mddev, int ro); -static int do_md_run(mddev_t * mddev); -static int restart_array(mddev_t *mddev); - -static ssize_t -array_state_store(mddev_t *mddev, const char *buf, size_t len) -{ - int err = -EINVAL; - enum array_state st = match_word(buf, array_states); - switch(st) { - case bad_word: - break; - case clear: - /* stopping an active array */ - if (mddev->pers) { - if (atomic_read(&mddev->active) > 1) - return -EBUSY; - err = do_md_stop(mddev, 0); - } - break; - case inactive: - /* stopping an active array */ - if (mddev->pers) { - if (atomic_read(&mddev->active) > 1) - return -EBUSY; - err = do_md_stop(mddev, 2); - } - break; - case suspended: - break; /* not supported yet */ - case readonly: - if (mddev->pers) - err = do_md_stop(mddev, 1); - else { - mddev->ro = 1; - err = do_md_run(mddev); - } - break; - case read_auto: - /* stopping an active array */ - if (mddev->pers) { - err = do_md_stop(mddev, 1); - if (err == 0) - mddev->ro = 2; /* FIXME mark devices writable */ - } else { - mddev->ro = 2; - err = do_md_run(mddev); - } - break; - case clean: - if (mddev->pers) { - restart_array(mddev); - spin_lock_irq(&mddev->write_lock); - if (atomic_read(&mddev->writes_pending) == 0) { - mddev->in_sync = 1; - mddev->sb_dirty = 1; - } - spin_unlock_irq(&mddev->write_lock); - } else { - mddev->ro = 0; - mddev->recovery_cp = MaxSector; - err = do_md_run(mddev); - } - break; - case active: - if (mddev->pers) { - restart_array(mddev); - mddev->sb_dirty = 0; - wake_up(&mddev->sb_wait); - err = 0; - } else { - mddev->ro = 0; - err = do_md_run(mddev); - } - break; - case write_pending: - case active_idle: - /* these cannot be set */ - break; - } - if (err) - return err; - else - return len; -} -static struct md_sysfs_entry md_array_state = __ATTR(array_state, 0644, array_state_show, array_state_store); - static ssize_t null_show(mddev_t *mddev, char *page) { @@ -2803,15 +2428,11 @@ __ATTR(suspend_hi, S_IRUGO|S_IWUSR, suspend_hi_show, suspend_hi_store); static struct attribute *md_default_attrs[] = { &md_level.attr, - &md_layout.attr, &md_raid_disks.attr, &md_chunk_size.attr, &md_size.attr, - &md_resync_start.attr, &md_metadata.attr, &md_new_device.attr, - &md_safe_delay.attr, - &md_array_state.attr, NULL, }; @@ -2932,6 +2553,8 @@ static struct kobject *md_probe(dev_t dev, int *part, void *data) return NULL; } +void md_wakeup_thread(mdk_thread_t *thread); + static void md_safemode_timeout(unsigned long data) { mddev_t *mddev = (mddev_t *) data; @@ -3085,7 +2708,7 @@ static int do_md_run(mddev_t * mddev) mddev->safemode = 0; mddev->safemode_timer.function = md_safemode_timeout; mddev->safemode_timer.data = (unsigned long) mddev; - mddev->safemode_delay = (200 * HZ)/1000 +1; /* 200 msec delay */ + mddev->safemode_delay = (20 * HZ)/1000 +1; /* 20 msec delay */ mddev->in_sync = 1; ITERATE_RDEV(mddev,rdev,tmp) @@ -3113,36 +2736,6 @@ static int do_md_run(mddev_t * mddev) mddev->queue->queuedata = mddev; mddev->queue->make_request_fn = mddev->pers->make_request; - /* If there is a partially-recovered drive we need to - * start recovery here. If we leave it to md_check_recovery, - * it will remove the drives and not do the right thing - */ - if (mddev->degraded) { - struct list_head *rtmp; - int spares = 0; - ITERATE_RDEV(mddev,rdev,rtmp) - if (rdev->raid_disk >= 0 && - !test_bit(In_sync, &rdev->flags) && - !test_bit(Faulty, &rdev->flags)) - /* complete an interrupted recovery */ - spares++; - if (spares && mddev->pers->sync_request) { - mddev->recovery = 0; - set_bit(MD_RECOVERY_RUNNING, &mddev->recovery); - mddev->sync_thread = md_register_thread(md_do_sync, - mddev, - "%s_resync"); - if (!mddev->sync_thread) { - printk(KERN_ERR "%s: could not start resync" - " thread...\n", - mdname(mddev)); - /* leave the spares where they are, it shouldn't hurt */ - mddev->recovery = 0; - } else - md_wakeup_thread(mddev->sync_thread); - } - } - mddev->changed = 1; md_new_event(mddev); return 0; @@ -3176,47 +2769,18 @@ static int restart_array(mddev_t *mddev) */ set_bit(MD_RECOVERY_NEEDED, &mddev->recovery); md_wakeup_thread(mddev->thread); - md_wakeup_thread(mddev->sync_thread); err = 0; - } else + } else { + printk(KERN_ERR "md: %s has no personality assigned.\n", + mdname(mddev)); err = -EINVAL; + } out: return err; } -/* similar to deny_write_access, but accounts for our holding a reference - * to the file ourselves */ -static int deny_bitmap_write_access(struct file * file) -{ - struct inode *inode = file->f_mapping->host; - - spin_lock(&inode->i_lock); - if (atomic_read(&inode->i_writecount) > 1) { - spin_unlock(&inode->i_lock); - return -ETXTBSY; - } - atomic_set(&inode->i_writecount, -1); - spin_unlock(&inode->i_lock); - - return 0; -} - -static void restore_bitmap_write_access(struct file *file) -{ - struct inode *inode = file->f_mapping->host; - - spin_lock(&inode->i_lock); - atomic_set(&inode->i_writecount, 1); - spin_unlock(&inode->i_lock); -} - -/* mode: - * 0 - completely stop and dis-assemble array - * 1 - switch to readonly - * 2 - stop but do not disassemble array - */ -static int do_md_stop(mddev_t * mddev, int mode) +static int do_md_stop(mddev_t * mddev, int ro) { int err = 0; struct gendisk *disk = mddev->gendisk; @@ -3228,7 +2792,6 @@ static int do_md_stop(mddev_t * mddev, int mode) } if (mddev->sync_thread) { - set_bit(MD_RECOVERY_FROZEN, &mddev->recovery); set_bit(MD_RECOVERY_INTR, &mddev->recovery); md_unregister_thread(mddev->sync_thread); mddev->sync_thread = NULL; @@ -3238,15 +2801,12 @@ static int do_md_stop(mddev_t * mddev, int mode) invalidate_partition(disk, 0); - switch(mode) { - case 1: /* readonly */ + if (ro) { err = -ENXIO; if (mddev->ro==1) goto out; mddev->ro = 1; - break; - case 0: /* disassemble */ - case 2: /* stop */ + } else { bitmap_flush(mddev); md_super_wait(mddev); if (mddev->ro) @@ -3261,20 +2821,19 @@ static int do_md_stop(mddev_t * mddev, int mode) if (mddev->ro) mddev->ro = 0; } - if (!mddev->in_sync || mddev->sb_dirty) { + if (!mddev->in_sync) { /* mark array as shutdown cleanly */ mddev->in_sync = 1; md_update_sb(mddev); } - if (mode == 1) + if (ro) set_disk_ro(disk, 1); - clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery); } /* * Free resources if final stop */ - if (mode == 0) { + if (!ro) { mdk_rdev_t *rdev; struct list_head *tmp; struct gendisk *disk; @@ -3282,7 +2841,7 @@ static int do_md_stop(mddev_t * mddev, int mode) bitmap_destroy(mddev); if (mddev->bitmap_file) { - restore_bitmap_write_access(mddev->bitmap_file); + atomic_set(&mddev->bitmap_file->f_dentry->d_inode->i_writecount, 1); fput(mddev->bitmap_file); mddev->bitmap_file = NULL; } @@ -3298,15 +2857,11 @@ static int do_md_stop(mddev_t * mddev, int mode) export_array(mddev); mddev->array_size = 0; - mddev->size = 0; - mddev->raid_disks = 0; - mddev->recovery_cp = 0; - disk = mddev->gendisk; if (disk) set_capacity(disk, 0); mddev->changed = 1; - } else if (mddev->pers) + } else printk(KERN_INFO "md: %s switched to read-only mode.\n", mdname(mddev)); err = 0; @@ -3709,17 +3264,6 @@ static int add_new_disk(mddev_t * mddev, mdu_disk_info_t *info) rdev->raid_disk = -1; err = bind_rdev_to_array(rdev, mddev); - if (!err && !mddev->pers->hot_remove_disk) { - /* If there is hot_add_disk but no hot_remove_disk - * then added disks for geometry changes, - * and should be added immediately. - */ - super_types[mddev->major_version]. - validate_super(mddev, rdev); - err = mddev->pers->hot_add_disk(mddev, rdev); - if (err) - unbind_rdev_from_array(rdev); - } if (err) export_rdev(rdev); @@ -3890,6 +3434,23 @@ static int hot_add_disk(mddev_t * mddev, dev_t dev) return err; } +/* similar to deny_write_access, but accounts for our holding a reference + * to the file ourselves */ +static int deny_bitmap_write_access(struct file * file) +{ + struct inode *inode = file->f_mapping->host; + + spin_lock(&inode->i_lock); + if (atomic_read(&inode->i_writecount) > 1) { + spin_unlock(&inode->i_lock); + return -ETXTBSY; + } + atomic_set(&inode->i_writecount, -1); + spin_unlock(&inode->i_lock); + + return 0; +} + static int set_bitmap_file(mddev_t *mddev, int fd) { int err; @@ -3930,17 +3491,12 @@ static int set_bitmap_file(mddev_t *mddev, int fd) mddev->pers->quiesce(mddev, 1); if (fd >= 0) err = bitmap_create(mddev); - if (fd < 0 || err) { + if (fd < 0 || err) bitmap_destroy(mddev); - fd = -1; /* make sure to put the file */ - } mddev->pers->quiesce(mddev, 0); - } - if (fd < 0) { - if (mddev->bitmap_file) { - restore_bitmap_write_access(mddev->bitmap_file); + } else if (fd < 0) { + if (mddev->bitmap_file) fput(mddev->bitmap_file); - } mddev->bitmap_file = NULL; } @@ -4421,6 +3977,11 @@ static int md_ioctl(struct inode *inode, struct file *file, goto done_unlock; default: + if (_IOC_TYPE(cmd) == MD_MAJOR) + printk(KERN_WARNING "md: %s(pid %d) used" + " obsolete MD ioctl, upgrade your" + " software to use new ictls.\n", + current->comm, current->pid); err = -EINVAL; goto abort_unlock; } @@ -5025,7 +4586,7 @@ void md_write_start(mddev_t *mddev, struct bio *bi) spin_lock_irq(&mddev->write_lock); if (mddev->in_sync) { mddev->in_sync = 0; - mddev->sb_dirty = 3; + mddev->sb_dirty = 1; md_wakeup_thread(mddev->thread); } spin_unlock_irq(&mddev->write_lock); @@ -5038,7 +4599,7 @@ void md_write_end(mddev_t *mddev) if (atomic_dec_and_test(&mddev->writes_pending)) { if (mddev->safemode == 2) md_wakeup_thread(mddev->thread); - else if (mddev->safemode_delay) + else mod_timer(&mddev->safemode_timer, jiffies + mddev->safemode_delay); } } @@ -5059,14 +4620,10 @@ void md_do_sync(mddev_t *mddev) struct list_head *tmp; sector_t last_check; int skipped = 0; - struct list_head *rtmp; - mdk_rdev_t *rdev; /* just incase thread restarts... */ if (test_bit(MD_RECOVERY_DONE, &mddev->recovery)) return; - if (mddev->ro) /* never try to sync a read-only array */ - return; /* we overload curr_resync somewhat here. * 0 == not engaged in resync at all @@ -5125,30 +4682,17 @@ void md_do_sync(mddev_t *mddev) } } while (mddev->curr_resync < 2); - j = 0; if (test_bit(MD_RECOVERY_SYNC, &mddev->recovery)) { /* resync follows the size requested by the personality, * which defaults to physical size, but can be virtual size */ max_sectors = mddev->resync_max_sectors; mddev->resync_mismatches = 0; - /* we don't use the checkpoint if there's a bitmap */ - if (!mddev->bitmap && - !test_bit(MD_RECOVERY_REQUESTED, &mddev->recovery)) - j = mddev->recovery_cp; } else if (test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery)) max_sectors = mddev->size << 1; - else { + else /* recovery follows the physical size of devices */ max_sectors = mddev->size << 1; - j = MaxSector; - ITERATE_RDEV(mddev,rdev,rtmp) - if (rdev->raid_disk >= 0 && - !test_bit(Faulty, &rdev->flags) && - !test_bit(In_sync, &rdev->flags) && - rdev->recovery_offset < j) - j = rdev->recovery_offset; - } printk(KERN_INFO "md: syncing RAID array %s\n", mdname(mddev)); printk(KERN_INFO "md: minimum _guaranteed_ reconstruction speed:" @@ -5158,7 +4702,12 @@ void md_do_sync(mddev_t *mddev) speed_max(mddev)); is_mddev_idle(mddev); /* this also initializes IO event counters */ - + /* we don't use the checkpoint if there's a bitmap */ + if (test_bit(MD_RECOVERY_SYNC, &mddev->recovery) && !mddev->bitmap + && ! test_bit(MD_RECOVERY_REQUESTED, &mddev->recovery)) + j = mddev->recovery_cp; + else + j = 0; io_sectors = 0; for (m = 0; m < SYNC_MARKS; m++) { mark[m] = jiffies; @@ -5279,28 +4828,15 @@ void md_do_sync(mddev_t *mddev) if (!test_bit(MD_RECOVERY_ERR, &mddev->recovery) && test_bit(MD_RECOVERY_SYNC, &mddev->recovery) && !test_bit(MD_RECOVERY_CHECK, &mddev->recovery) && - mddev->curr_resync > 2) { - if (test_bit(MD_RECOVERY_SYNC, &mddev->recovery)) { - if (test_bit(MD_RECOVERY_INTR, &mddev->recovery)) { - if (mddev->curr_resync >= mddev->recovery_cp) { - printk(KERN_INFO - "md: checkpointing recovery of %s.\n", - mdname(mddev)); - mddev->recovery_cp = mddev->curr_resync; - } - } else - mddev->recovery_cp = MaxSector; - } else { - if (!test_bit(MD_RECOVERY_INTR, &mddev->recovery)) - mddev->curr_resync = MaxSector; - ITERATE_RDEV(mddev,rdev,rtmp) - if (rdev->raid_disk >= 0 && - !test_bit(Faulty, &rdev->flags) && - !test_bit(In_sync, &rdev->flags) && - rdev->recovery_offset < mddev->curr_resync) - rdev->recovery_offset = mddev->curr_resync; - mddev->sb_dirty = 1; - } + mddev->curr_resync > 2 && + mddev->curr_resync >= mddev->recovery_cp) { + if (test_bit(MD_RECOVERY_INTR, &mddev->recovery)) { + printk(KERN_INFO + "md: checkpointing recovery of %s.\n", + mdname(mddev)); + mddev->recovery_cp = mddev->curr_resync; + } else + mddev->recovery_cp = MaxSector; } skip: @@ -5372,7 +4908,7 @@ void md_check_recovery(mddev_t *mddev) if (mddev->safemode && !atomic_read(&mddev->writes_pending) && !mddev->in_sync && mddev->recovery_cp == MaxSector) { mddev->in_sync = 1; - mddev->sb_dirty = 3; + mddev->sb_dirty = 1; } if (mddev->safemode == 1) mddev->safemode = 0; @@ -5421,8 +4957,6 @@ void md_check_recovery(mddev_t *mddev) clear_bit(MD_RECOVERY_INTR, &mddev->recovery); clear_bit(MD_RECOVERY_DONE, &mddev->recovery); - if (test_bit(MD_RECOVERY_FROZEN, &mddev->recovery)) - goto unlock; /* no recovery is running. * remove any failed drives, then * add spares if possible. @@ -5445,7 +4979,6 @@ void md_check_recovery(mddev_t *mddev) ITERATE_RDEV(mddev,rdev,rtmp) if (rdev->raid_disk < 0 && !test_bit(Faulty, &rdev->flags)) { - rdev->recovery_offset = 0; if (mddev->pers->hot_add_disk(mddev,rdev)) { char nm[20]; sprintf(nm, "rd%d", rdev->raid_disk); @@ -5683,6 +5216,7 @@ EXPORT_SYMBOL(md_write_end); EXPORT_SYMBOL(md_register_thread); EXPORT_SYMBOL(md_unregister_thread); EXPORT_SYMBOL(md_wakeup_thread); +EXPORT_SYMBOL(md_print_devices); EXPORT_SYMBOL(md_check_recovery); MODULE_LICENSE("GPL"); MODULE_ALIAS("md"); diff --git a/trunk/drivers/md/raid1.c b/trunk/drivers/md/raid1.c index cead918578a7..4070eff6f0f8 100644 --- a/trunk/drivers/md/raid1.c +++ b/trunk/drivers/md/raid1.c @@ -374,26 +374,26 @@ static int raid1_end_write_request(struct bio *bio, unsigned int bytes_done, int * already. */ if (atomic_dec_and_test(&r1_bio->remaining)) { - if (test_bit(R1BIO_BarrierRetry, &r1_bio->state)) + if (test_bit(R1BIO_BarrierRetry, &r1_bio->state)) { reschedule_retry(r1_bio); - else { - /* it really is the end of this request */ - if (test_bit(R1BIO_BehindIO, &r1_bio->state)) { - /* free extra copy of the data pages */ - int i = bio->bi_vcnt; - while (i--) - safe_put_page(bio->bi_io_vec[i].bv_page); - } - /* clear the bitmap if all writes complete successfully */ - bitmap_endwrite(r1_bio->mddev->bitmap, r1_bio->sector, - r1_bio->sectors, - !test_bit(R1BIO_Degraded, &r1_bio->state), - behind); - md_write_end(r1_bio->mddev); - raid_end_bio_io(r1_bio); + goto out; } + /* it really is the end of this request */ + if (test_bit(R1BIO_BehindIO, &r1_bio->state)) { + /* free extra copy of the data pages */ + int i = bio->bi_vcnt; + while (i--) + safe_put_page(bio->bi_io_vec[i].bv_page); + } + /* clear the bitmap if all writes complete successfully */ + bitmap_endwrite(r1_bio->mddev->bitmap, r1_bio->sector, + r1_bio->sectors, + !test_bit(R1BIO_Degraded, &r1_bio->state), + behind); + md_write_end(r1_bio->mddev); + raid_end_bio_io(r1_bio); } - + out: if (to_put) bio_put(to_put); @@ -1625,12 +1625,6 @@ static sector_t sync_request(mddev_t *mddev, sector_t sector_nr, int *skipped, i /* before building a request, check if we can skip these blocks.. * This call the bitmap_start_sync doesn't actually record anything */ - if (mddev->bitmap == NULL && - mddev->recovery_cp == MaxSector && - conf->fullsync == 0) { - *skipped = 1; - return max_sector - sector_nr; - } if (!bitmap_start_sync(mddev->bitmap, sector_nr, &sync_blocks, 1) && !conf->fullsync && !test_bit(MD_RECOVERY_REQUESTED, &mddev->recovery)) { /* We can skip this block, and probably several more */ @@ -1894,8 +1888,7 @@ static int run(mddev_t *mddev) disk = conf->mirrors + i; - if (!disk->rdev || - !test_bit(In_sync, &disk->rdev->flags)) { + if (!disk->rdev) { disk->head_position = 0; mddev->degraded++; } diff --git a/trunk/drivers/md/raid10.c b/trunk/drivers/md/raid10.c index 7f636283a1ba..1440935414e6 100644 --- a/trunk/drivers/md/raid10.c +++ b/trunk/drivers/md/raid10.c @@ -29,7 +29,6 @@ * raid_disks * near_copies (stored in low byte of layout) * far_copies (stored in second byte of layout) - * far_offset (stored in bit 16 of layout ) * * The data to be stored is divided into chunks using chunksize. * Each device is divided into far_copies sections. @@ -37,14 +36,10 @@ * near_copies copies of each chunk is stored (each on a different drive). * The starting device for each section is offset near_copies from the starting * device of the previous section. - * Thus they are (near_copies*far_copies) of each chunk, and each is on a different + * Thus there are (near_copies*far_copies) of each chunk, and each is on a different * drive. * near_copies and far_copies must be at least one, and their product is at most * raid_disks. - * - * If far_offset is true, then the far_copies are handled a bit differently. - * The copies are still in different stripes, but instead of be very far apart - * on disk, there are adjacent stripes. */ /* @@ -362,7 +357,8 @@ static int raid10_end_write_request(struct bio *bio, unsigned int bytes_done, in * With this layout, and block is never stored twice on the one device. * * raid10_find_phys finds the sector offset of a given virtual sector - * on each device that it is on. + * on each device that it is on. If a block isn't on a device, + * that entry in the array is set to MaxSector. * * raid10_find_virt does the reverse mapping, from a device and a * sector offset to a virtual address @@ -385,8 +381,6 @@ static void raid10_find_phys(conf_t *conf, r10bio_t *r10bio) chunk *= conf->near_copies; stripe = chunk; dev = sector_div(stripe, conf->raid_disks); - if (conf->far_offset) - stripe *= conf->far_copies; sector += stripe << conf->chunk_shift; @@ -420,24 +414,16 @@ static sector_t raid10_find_virt(conf_t *conf, sector_t sector, int dev) { sector_t offset, chunk, vchunk; - offset = sector & conf->chunk_mask; - if (conf->far_offset) { - int fc; - chunk = sector >> conf->chunk_shift; - fc = sector_div(chunk, conf->far_copies); - dev -= fc * conf->near_copies; - if (dev < 0) - dev += conf->raid_disks; - } else { - while (sector > conf->stride) { - sector -= conf->stride; - if (dev < conf->near_copies) - dev += conf->raid_disks - conf->near_copies; - else - dev -= conf->near_copies; - } - chunk = sector >> conf->chunk_shift; + while (sector > conf->stride) { + sector -= conf->stride; + if (dev < conf->near_copies) + dev += conf->raid_disks - conf->near_copies; + else + dev -= conf->near_copies; } + + offset = sector & conf->chunk_mask; + chunk = sector >> conf->chunk_shift; vchunk = chunk * conf->raid_disks + dev; sector_div(vchunk, conf->near_copies); return (vchunk << conf->chunk_shift) + offset; @@ -914,12 +900,9 @@ static void status(struct seq_file *seq, mddev_t *mddev) seq_printf(seq, " %dK chunks", mddev->chunk_size/1024); if (conf->near_copies > 1) seq_printf(seq, " %d near-copies", conf->near_copies); - if (conf->far_copies > 1) { - if (conf->far_offset) - seq_printf(seq, " %d offset-copies", conf->far_copies); - else - seq_printf(seq, " %d far-copies", conf->far_copies); - } + if (conf->far_copies > 1) + seq_printf(seq, " %d far-copies", conf->far_copies); + seq_printf(seq, " [%d/%d] [", conf->raid_disks, conf->working_disks); for (i = 0; i < conf->raid_disks; i++) @@ -1932,7 +1915,7 @@ static int run(mddev_t *mddev) mirror_info_t *disk; mdk_rdev_t *rdev; struct list_head *tmp; - int nc, fc, fo; + int nc, fc; sector_t stride, size; if (mddev->chunk_size == 0) { @@ -1942,9 +1925,8 @@ static int run(mddev_t *mddev) nc = mddev->layout & 255; fc = (mddev->layout >> 8) & 255; - fo = mddev->layout & (1<<16); if ((nc*fc) <2 || (nc*fc) > mddev->raid_disks || - (mddev->layout >> 17)) { + (mddev->layout >> 16)) { printk(KERN_ERR "raid10: %s: unsupported raid10 layout: 0x%8x\n", mdname(mddev), mddev->layout); goto out; @@ -1976,16 +1958,12 @@ static int run(mddev_t *mddev) conf->near_copies = nc; conf->far_copies = fc; conf->copies = nc*fc; - conf->far_offset = fo; conf->chunk_mask = (sector_t)(mddev->chunk_size>>9)-1; conf->chunk_shift = ffz(~mddev->chunk_size) - 9; - if (fo) - conf->stride = 1 << conf->chunk_shift; - else { - stride = mddev->size >> (conf->chunk_shift-1); - sector_div(stride, fc); - conf->stride = stride << conf->chunk_shift; - } + stride = mddev->size >> (conf->chunk_shift-1); + sector_div(stride, fc); + conf->stride = stride << conf->chunk_shift; + conf->r10bio_pool = mempool_create(NR_RAID10_BIOS, r10bio_pool_alloc, r10bio_pool_free, conf); if (!conf->r10bio_pool) { @@ -2037,8 +2015,7 @@ static int run(mddev_t *mddev) disk = conf->mirrors + i; - if (!disk->rdev || - !test_bit(In_sync, &rdev->flags)) { + if (!disk->rdev) { disk->head_position = 0; mddev->degraded++; } @@ -2060,13 +2037,7 @@ static int run(mddev_t *mddev) /* * Ok, everything is just fine now */ - if (conf->far_offset) { - size = mddev->size >> (conf->chunk_shift-1); - size *= conf->raid_disks; - size <<= conf->chunk_shift; - sector_div(size, conf->far_copies); - } else - size = conf->stride * conf->raid_disks; + size = conf->stride * conf->raid_disks; sector_div(size, conf->near_copies); mddev->array_size = size/2; mddev->resync_max_sectors = size; @@ -2079,7 +2050,7 @@ static int run(mddev_t *mddev) * maybe... */ { - int stripe = conf->raid_disks * (mddev->chunk_size / PAGE_SIZE); + int stripe = conf->raid_disks * mddev->chunk_size / PAGE_SIZE; stripe /= conf->near_copies; if (mddev->queue->backing_dev_info.ra_pages < 2* stripe) mddev->queue->backing_dev_info.ra_pages = 2* stripe; diff --git a/trunk/drivers/md/raid5.c b/trunk/drivers/md/raid5.c index f920e50ea124..31843604049c 100644 --- a/trunk/drivers/md/raid5.c +++ b/trunk/drivers/md/raid5.c @@ -2,11 +2,8 @@ * raid5.c : Multiple Devices driver for Linux * Copyright (C) 1996, 1997 Ingo Molnar, Miguel de Icaza, Gadi Oxman * Copyright (C) 1999, 2000 Ingo Molnar - * Copyright (C) 2002, 2003 H. Peter Anvin * - * RAID-4/5/6 management functions. - * Thanks to Penguin Computing for making the RAID-6 development possible - * by donating a test server! + * RAID-5 management functions. * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by @@ -22,11 +19,11 @@ #include #include #include +#include #include #include #include #include -#include "raid6.h" #include @@ -71,16 +68,6 @@ #define __inline__ #endif -#if !RAID6_USE_EMPTY_ZERO_PAGE -/* In .bss so it's zeroed */ -const char raid6_empty_zero_page[PAGE_SIZE] __attribute__((aligned(256))); -#endif - -static inline int raid6_next_disk(int disk, int raid_disks) -{ - disk++; - return (disk < raid_disks) ? disk : 0; -} static void print_raid5_conf (raid5_conf_t *conf); static void __release_stripe(raid5_conf_t *conf, struct stripe_head *sh) @@ -117,7 +104,7 @@ static void release_stripe(struct stripe_head *sh) { raid5_conf_t *conf = sh->raid_conf; unsigned long flags; - + spin_lock_irqsave(&conf->device_lock, flags); __release_stripe(conf, sh); spin_unlock_irqrestore(&conf->device_lock, flags); @@ -130,7 +117,7 @@ static inline void remove_hash(struct stripe_head *sh) hlist_del_init(&sh->hash); } -static inline void insert_hash(raid5_conf_t *conf, struct stripe_head *sh) +static void insert_hash(raid5_conf_t *conf, struct stripe_head *sh) { struct hlist_head *hp = stripe_hash(conf, sh->sector); @@ -203,7 +190,7 @@ static void init_stripe(struct stripe_head *sh, sector_t sector, int pd_idx, int (unsigned long long)sh->sector); remove_hash(sh); - + sh->sector = sector; sh->pd_idx = pd_idx; sh->state = 0; @@ -282,9 +269,8 @@ static struct stripe_head *get_active_stripe(raid5_conf_t *conf, sector_t sector } else { if (!test_bit(STRIPE_HANDLE, &sh->state)) atomic_inc(&conf->active_stripes); - if (list_empty(&sh->lru)) - BUG(); - list_del_init(&sh->lru); + if (!list_empty(&sh->lru)) + list_del_init(&sh->lru); } } } while (sh == NULL); @@ -335,9 +321,10 @@ static int grow_stripes(raid5_conf_t *conf, int num) return 1; conf->slab_cache = sc; conf->pool_size = devs; - while (num--) + while (num--) { if (!grow_one_stripe(conf)) return 1; + } return 0; } @@ -644,7 +631,8 @@ static void raid5_build_block (struct stripe_head *sh, int i) dev->req.bi_private = sh; dev->flags = 0; - dev->sector = compute_blocknr(sh, i); + if (i != sh->pd_idx) + dev->sector = compute_blocknr(sh, i); } static void error(mddev_t *mddev, mdk_rdev_t *rdev) @@ -671,7 +659,7 @@ static void error(mddev_t *mddev, mdk_rdev_t *rdev) " Operation continuing on %d devices\n", bdevname(rdev->bdev,b), conf->working_disks); } -} +} /* * Input: a 'big' sector number, @@ -709,12 +697,9 @@ static sector_t raid5_compute_sector(sector_t r_sector, unsigned int raid_disks, /* * Select the parity disk based on the user selected algorithm. */ - switch(conf->level) { - case 4: + if (conf->level == 4) *pd_idx = data_disks; - break; - case 5: - switch (conf->algorithm) { + else switch (conf->algorithm) { case ALGORITHM_LEFT_ASYMMETRIC: *pd_idx = data_disks - stripe % raid_disks; if (*dd_idx >= *pd_idx) @@ -736,39 +721,6 @@ static sector_t raid5_compute_sector(sector_t r_sector, unsigned int raid_disks, default: printk(KERN_ERR "raid5: unsupported algorithm %d\n", conf->algorithm); - } - break; - case 6: - - /**** FIX THIS ****/ - switch (conf->algorithm) { - case ALGORITHM_LEFT_ASYMMETRIC: - *pd_idx = raid_disks - 1 - (stripe % raid_disks); - if (*pd_idx == raid_disks-1) - (*dd_idx)++; /* Q D D D P */ - else if (*dd_idx >= *pd_idx) - (*dd_idx) += 2; /* D D P Q D */ - break; - case ALGORITHM_RIGHT_ASYMMETRIC: - *pd_idx = stripe % raid_disks; - if (*pd_idx == raid_disks-1) - (*dd_idx)++; /* Q D D D P */ - else if (*dd_idx >= *pd_idx) - (*dd_idx) += 2; /* D D P Q D */ - break; - case ALGORITHM_LEFT_SYMMETRIC: - *pd_idx = raid_disks - 1 - (stripe % raid_disks); - *dd_idx = (*pd_idx + 2 + *dd_idx) % raid_disks; - break; - case ALGORITHM_RIGHT_SYMMETRIC: - *pd_idx = stripe % raid_disks; - *dd_idx = (*pd_idx + 2 + *dd_idx) % raid_disks; - break; - default: - printk (KERN_CRIT "raid6: unsupported algorithm %d\n", - conf->algorithm); - } - break; } /* @@ -790,17 +742,12 @@ static sector_t compute_blocknr(struct stripe_head *sh, int i) int chunk_number, dummy1, dummy2, dd_idx = i; sector_t r_sector; - chunk_offset = sector_div(new_sector, sectors_per_chunk); stripe = new_sector; BUG_ON(new_sector != stripe); - if (i == sh->pd_idx) - return 0; - switch(conf->level) { - case 4: break; - case 5: - switch (conf->algorithm) { + + switch (conf->algorithm) { case ALGORITHM_LEFT_ASYMMETRIC: case ALGORITHM_RIGHT_ASYMMETRIC: if (i > sh->pd_idx) @@ -814,37 +761,7 @@ static sector_t compute_blocknr(struct stripe_head *sh, int i) break; default: printk(KERN_ERR "raid5: unsupported algorithm %d\n", - conf->algorithm); - } - break; - case 6: - data_disks = raid_disks - 2; - if (i == raid6_next_disk(sh->pd_idx, raid_disks)) - return 0; /* It is the Q disk */ - switch (conf->algorithm) { - case ALGORITHM_LEFT_ASYMMETRIC: - case ALGORITHM_RIGHT_ASYMMETRIC: - if (sh->pd_idx == raid_disks-1) - i--; /* Q D D D P */ - else if (i > sh->pd_idx) - i -= 2; /* D D P Q D */ - break; - case ALGORITHM_LEFT_SYMMETRIC: - case ALGORITHM_RIGHT_SYMMETRIC: - if (sh->pd_idx == raid_disks-1) - i--; /* Q D D D P */ - else { - /* D D P Q D */ - if (i < sh->pd_idx) - i += raid_disks; - i -= (sh->pd_idx + 2); - } - break; - default: - printk (KERN_CRIT "raid6: unsupported algorithm %d\n", conf->algorithm); - } - break; } chunk_number = stripe * data_disks + i; @@ -861,11 +778,10 @@ static sector_t compute_blocknr(struct stripe_head *sh, int i) /* - * Copy data between a page in the stripe cache, and one or more bion - * The page could align with the middle of the bio, or there could be - * several bion, each with several bio_vecs, which cover part of the page - * Multiple bion are linked together on bi_next. There may be extras - * at the end of this list. We ignore them. + * Copy data between a page in the stripe cache, and a bio. + * There are no alignment or size guarantees between the page or the + * bio except that there is some overlap. + * All iovecs in the bio must be considered. */ static void copy_data(int frombio, struct bio *bio, struct page *page, @@ -894,7 +810,7 @@ static void copy_data(int frombio, struct bio *bio, if (len > 0 && page_offset + len > STRIPE_SIZE) clen = STRIPE_SIZE - page_offset; else clen = len; - + if (clen > 0) { char *ba = __bio_kmap_atomic(bio, i, KM_USER0); if (frombio) @@ -946,14 +862,14 @@ static void compute_block(struct stripe_head *sh, int dd_idx) set_bit(R5_UPTODATE, &sh->dev[dd_idx].flags); } -static void compute_parity5(struct stripe_head *sh, int method) +static void compute_parity(struct stripe_head *sh, int method) { raid5_conf_t *conf = sh->raid_conf; int i, pd_idx = sh->pd_idx, disks = sh->disks, count; void *ptr[MAX_XOR_BLOCKS]; struct bio *chosen; - PRINTK("compute_parity5, stripe %llu, method %d\n", + PRINTK("compute_parity, stripe %llu, method %d\n", (unsigned long long)sh->sector, method); count = 1; @@ -1040,195 +956,9 @@ static void compute_parity5(struct stripe_head *sh, int method) clear_bit(R5_UPTODATE, &sh->dev[pd_idx].flags); } -static void compute_parity6(struct stripe_head *sh, int method) -{ - raid6_conf_t *conf = sh->raid_conf; - int i, pd_idx = sh->pd_idx, qd_idx, d0_idx, disks = conf->raid_disks, count; - struct bio *chosen; - /**** FIX THIS: This could be very bad if disks is close to 256 ****/ - void *ptrs[disks]; - - qd_idx = raid6_next_disk(pd_idx, disks); - d0_idx = raid6_next_disk(qd_idx, disks); - - PRINTK("compute_parity, stripe %llu, method %d\n", - (unsigned long long)sh->sector, method); - - switch(method) { - case READ_MODIFY_WRITE: - BUG(); /* READ_MODIFY_WRITE N/A for RAID-6 */ - case RECONSTRUCT_WRITE: - for (i= disks; i-- ;) - if ( i != pd_idx && i != qd_idx && sh->dev[i].towrite ) { - chosen = sh->dev[i].towrite; - sh->dev[i].towrite = NULL; - - if (test_and_clear_bit(R5_Overlap, &sh->dev[i].flags)) - wake_up(&conf->wait_for_overlap); - - if (sh->dev[i].written) BUG(); - sh->dev[i].written = chosen; - } - break; - case CHECK_PARITY: - BUG(); /* Not implemented yet */ - } - - for (i = disks; i--;) - if (sh->dev[i].written) { - sector_t sector = sh->dev[i].sector; - struct bio *wbi = sh->dev[i].written; - while (wbi && wbi->bi_sector < sector + STRIPE_SECTORS) { - copy_data(1, wbi, sh->dev[i].page, sector); - wbi = r5_next_bio(wbi, sector); - } - - set_bit(R5_LOCKED, &sh->dev[i].flags); - set_bit(R5_UPTODATE, &sh->dev[i].flags); - } - -// switch(method) { -// case RECONSTRUCT_WRITE: -// case CHECK_PARITY: -// case UPDATE_PARITY: - /* Note that unlike RAID-5, the ordering of the disks matters greatly. */ - /* FIX: Is this ordering of drives even remotely optimal? */ - count = 0; - i = d0_idx; - do { - ptrs[count++] = page_address(sh->dev[i].page); - if (count <= disks-2 && !test_bit(R5_UPTODATE, &sh->dev[i].flags)) - printk("block %d/%d not uptodate on parity calc\n", i,count); - i = raid6_next_disk(i, disks); - } while ( i != d0_idx ); -// break; -// } - - raid6_call.gen_syndrome(disks, STRIPE_SIZE, ptrs); - - switch(method) { - case RECONSTRUCT_WRITE: - set_bit(R5_UPTODATE, &sh->dev[pd_idx].flags); - set_bit(R5_UPTODATE, &sh->dev[qd_idx].flags); - set_bit(R5_LOCKED, &sh->dev[pd_idx].flags); - set_bit(R5_LOCKED, &sh->dev[qd_idx].flags); - break; - case UPDATE_PARITY: - set_bit(R5_UPTODATE, &sh->dev[pd_idx].flags); - set_bit(R5_UPTODATE, &sh->dev[qd_idx].flags); - break; - } -} - - -/* Compute one missing block */ -static void compute_block_1(struct stripe_head *sh, int dd_idx, int nozero) -{ - raid6_conf_t *conf = sh->raid_conf; - int i, count, disks = conf->raid_disks; - void *ptr[MAX_XOR_BLOCKS], *p; - int pd_idx = sh->pd_idx; - int qd_idx = raid6_next_disk(pd_idx, disks); - - PRINTK("compute_block_1, stripe %llu, idx %d\n", - (unsigned long long)sh->sector, dd_idx); - - if ( dd_idx == qd_idx ) { - /* We're actually computing the Q drive */ - compute_parity6(sh, UPDATE_PARITY); - } else { - ptr[0] = page_address(sh->dev[dd_idx].page); - if (!nozero) memset(ptr[0], 0, STRIPE_SIZE); - count = 1; - for (i = disks ; i--; ) { - if (i == dd_idx || i == qd_idx) - continue; - p = page_address(sh->dev[i].page); - if (test_bit(R5_UPTODATE, &sh->dev[i].flags)) - ptr[count++] = p; - else - printk("compute_block() %d, stripe %llu, %d" - " not present\n", dd_idx, - (unsigned long long)sh->sector, i); - - check_xor(); - } - if (count != 1) - xor_block(count, STRIPE_SIZE, ptr); - if (!nozero) set_bit(R5_UPTODATE, &sh->dev[dd_idx].flags); - else clear_bit(R5_UPTODATE, &sh->dev[dd_idx].flags); - } -} - -/* Compute two missing blocks */ -static void compute_block_2(struct stripe_head *sh, int dd_idx1, int dd_idx2) -{ - raid6_conf_t *conf = sh->raid_conf; - int i, count, disks = conf->raid_disks; - int pd_idx = sh->pd_idx; - int qd_idx = raid6_next_disk(pd_idx, disks); - int d0_idx = raid6_next_disk(qd_idx, disks); - int faila, failb; - - /* faila and failb are disk numbers relative to d0_idx */ - /* pd_idx become disks-2 and qd_idx become disks-1 */ - faila = (dd_idx1 < d0_idx) ? dd_idx1+(disks-d0_idx) : dd_idx1-d0_idx; - failb = (dd_idx2 < d0_idx) ? dd_idx2+(disks-d0_idx) : dd_idx2-d0_idx; - - BUG_ON(faila == failb); - if ( failb < faila ) { int tmp = faila; faila = failb; failb = tmp; } - - PRINTK("compute_block_2, stripe %llu, idx %d,%d (%d,%d)\n", - (unsigned long long)sh->sector, dd_idx1, dd_idx2, faila, failb); - - if ( failb == disks-1 ) { - /* Q disk is one of the missing disks */ - if ( faila == disks-2 ) { - /* Missing P+Q, just recompute */ - compute_parity6(sh, UPDATE_PARITY); - return; - } else { - /* We're missing D+Q; recompute D from P */ - compute_block_1(sh, (dd_idx1 == qd_idx) ? dd_idx2 : dd_idx1, 0); - compute_parity6(sh, UPDATE_PARITY); /* Is this necessary? */ - return; - } - } - - /* We're missing D+P or D+D; build pointer table */ - { - /**** FIX THIS: This could be very bad if disks is close to 256 ****/ - void *ptrs[disks]; - - count = 0; - i = d0_idx; - do { - ptrs[count++] = page_address(sh->dev[i].page); - i = raid6_next_disk(i, disks); - if (i != dd_idx1 && i != dd_idx2 && - !test_bit(R5_UPTODATE, &sh->dev[i].flags)) - printk("compute_2 with missing block %d/%d\n", count, i); - } while ( i != d0_idx ); - - if ( failb == disks-2 ) { - /* We're missing D+P. */ - raid6_datap_recov(disks, STRIPE_SIZE, faila, ptrs); - } else { - /* We're missing D+D. */ - raid6_2data_recov(disks, STRIPE_SIZE, faila, failb, ptrs); - } - - /* Both the above update both missing blocks */ - set_bit(R5_UPTODATE, &sh->dev[dd_idx1].flags); - set_bit(R5_UPTODATE, &sh->dev[dd_idx2].flags); - } -} - - - /* * Each stripe/dev can have one or more bion attached. - * toread/towrite point to the first in a chain. + * toread/towrite point to the first in a chain. * The bi_next chain must be in order. */ static int add_stripe_bio(struct stripe_head *sh, struct bio *bi, int dd_idx, int forwrite) @@ -1301,13 +1031,6 @@ static int add_stripe_bio(struct stripe_head *sh, struct bio *bi, int dd_idx, in static void end_reshape(raid5_conf_t *conf); -static int page_is_zero(struct page *p) -{ - char *a = page_address(p); - return ((*(u32*)a) == 0 && - memcmp(a, a+4, STRIPE_SIZE-4)==0); -} - static int stripe_to_pdidx(sector_t stripe, raid5_conf_t *conf, int disks) { int sectors_per_chunk = conf->chunk_size >> 9; @@ -1339,7 +1062,7 @@ static int stripe_to_pdidx(sector_t stripe, raid5_conf_t *conf, int disks) * */ -static void handle_stripe5(struct stripe_head *sh) +static void handle_stripe(struct stripe_head *sh) { raid5_conf_t *conf = sh->raid_conf; int disks = sh->disks; @@ -1671,7 +1394,7 @@ static void handle_stripe5(struct stripe_head *sh) if (locked == 0 && (rcw == 0 ||rmw == 0) && !test_bit(STRIPE_BIT_DELAY, &sh->state)) { PRINTK("Computing parity...\n"); - compute_parity5(sh, rcw==0 ? RECONSTRUCT_WRITE : READ_MODIFY_WRITE); + compute_parity(sh, rcw==0 ? RECONSTRUCT_WRITE : READ_MODIFY_WRITE); /* now every locked buffer is ready to be written */ for (i=disks; i--;) if (test_bit(R5_LOCKED, &sh->dev[i].flags)) { @@ -1698,10 +1421,13 @@ static void handle_stripe5(struct stripe_head *sh) !test_bit(STRIPE_INSYNC, &sh->state)) { set_bit(STRIPE_HANDLE, &sh->state); if (failed == 0) { + char *pagea; BUG_ON(uptodate != disks); - compute_parity5(sh, CHECK_PARITY); + compute_parity(sh, CHECK_PARITY); uptodate--; - if (page_is_zero(sh->dev[sh->pd_idx].page)) { + pagea = page_address(sh->dev[sh->pd_idx].page); + if ((*(u32*)pagea) == 0 && + !memcmp(pagea, pagea+4, STRIPE_SIZE-4)) { /* parity is correct (on disc, not in buffer any more) */ set_bit(STRIPE_INSYNC, &sh->state); } else { @@ -1761,7 +1487,7 @@ static void handle_stripe5(struct stripe_head *sh) /* Need to write out all blocks after computing parity */ sh->disks = conf->raid_disks; sh->pd_idx = stripe_to_pdidx(sh->sector, conf, conf->raid_disks); - compute_parity5(sh, RECONSTRUCT_WRITE); + compute_parity(sh, RECONSTRUCT_WRITE); for (i= conf->raid_disks; i--;) { set_bit(R5_LOCKED, &sh->dev[i].flags); locked++; @@ -1889,630 +1615,67 @@ static void handle_stripe5(struct stripe_head *sh) } } -static void handle_stripe6(struct stripe_head *sh, struct page *tmp_page) +static void raid5_activate_delayed(raid5_conf_t *conf) { - raid6_conf_t *conf = sh->raid_conf; - int disks = conf->raid_disks; - struct bio *return_bi= NULL; - struct bio *bi; - int i; - int syncing; - int locked=0, uptodate=0, to_read=0, to_write=0, failed=0, written=0; - int non_overwrite = 0; - int failed_num[2] = {0, 0}; - struct r5dev *dev, *pdev, *qdev; - int pd_idx = sh->pd_idx; - int qd_idx = raid6_next_disk(pd_idx, disks); - int p_failed, q_failed; - - PRINTK("handling stripe %llu, state=%#lx cnt=%d, pd_idx=%d, qd_idx=%d\n", - (unsigned long long)sh->sector, sh->state, atomic_read(&sh->count), - pd_idx, qd_idx); + if (atomic_read(&conf->preread_active_stripes) < IO_THRESHOLD) { + while (!list_empty(&conf->delayed_list)) { + struct list_head *l = conf->delayed_list.next; + struct stripe_head *sh; + sh = list_entry(l, struct stripe_head, lru); + list_del_init(l); + clear_bit(STRIPE_DELAYED, &sh->state); + if (!test_and_set_bit(STRIPE_PREREAD_ACTIVE, &sh->state)) + atomic_inc(&conf->preread_active_stripes); + list_add_tail(&sh->lru, &conf->handle_list); + } + } +} - spin_lock(&sh->lock); - clear_bit(STRIPE_HANDLE, &sh->state); - clear_bit(STRIPE_DELAYED, &sh->state); +static void activate_bit_delay(raid5_conf_t *conf) +{ + /* device_lock is held */ + struct list_head head; + list_add(&head, &conf->bitmap_list); + list_del_init(&conf->bitmap_list); + while (!list_empty(&head)) { + struct stripe_head *sh = list_entry(head.next, struct stripe_head, lru); + list_del_init(&sh->lru); + atomic_inc(&sh->count); + __release_stripe(conf, sh); + } +} - syncing = test_bit(STRIPE_SYNCING, &sh->state); - /* Now to look around and see what can be done */ +static void unplug_slaves(mddev_t *mddev) +{ + raid5_conf_t *conf = mddev_to_conf(mddev); + int i; rcu_read_lock(); - for (i=disks; i--; ) { - mdk_rdev_t *rdev; - dev = &sh->dev[i]; - clear_bit(R5_Insync, &dev->flags); - - PRINTK("check %d: state 0x%lx read %p write %p written %p\n", - i, dev->flags, dev->toread, dev->towrite, dev->written); - /* maybe we can reply to a read */ - if (test_bit(R5_UPTODATE, &dev->flags) && dev->toread) { - struct bio *rbi, *rbi2; - PRINTK("Return read for disc %d\n", i); - spin_lock_irq(&conf->device_lock); - rbi = dev->toread; - dev->toread = NULL; - if (test_and_clear_bit(R5_Overlap, &dev->flags)) - wake_up(&conf->wait_for_overlap); - spin_unlock_irq(&conf->device_lock); - while (rbi && rbi->bi_sector < dev->sector + STRIPE_SECTORS) { - copy_data(0, rbi, dev->page, dev->sector); - rbi2 = r5_next_bio(rbi, dev->sector); - spin_lock_irq(&conf->device_lock); - if (--rbi->bi_phys_segments == 0) { - rbi->bi_next = return_bi; - return_bi = rbi; - } - spin_unlock_irq(&conf->device_lock); - rbi = rbi2; - } - } + for (i=0; iraid_disks; i++) { + mdk_rdev_t *rdev = rcu_dereference(conf->disks[i].rdev); + if (rdev && !test_bit(Faulty, &rdev->flags) && atomic_read(&rdev->nr_pending)) { + request_queue_t *r_queue = bdev_get_queue(rdev->bdev); - /* now count some things */ - if (test_bit(R5_LOCKED, &dev->flags)) locked++; - if (test_bit(R5_UPTODATE, &dev->flags)) uptodate++; + atomic_inc(&rdev->nr_pending); + rcu_read_unlock(); + if (r_queue->unplug_fn) + r_queue->unplug_fn(r_queue); - if (dev->toread) to_read++; - if (dev->towrite) { - to_write++; - if (!test_bit(R5_OVERWRITE, &dev->flags)) - non_overwrite++; - } - if (dev->written) written++; - rdev = rcu_dereference(conf->disks[i].rdev); - if (!rdev || !test_bit(In_sync, &rdev->flags)) { - /* The ReadError flag will just be confusing now */ - clear_bit(R5_ReadError, &dev->flags); - clear_bit(R5_ReWrite, &dev->flags); + rdev_dec_pending(rdev, mddev); + rcu_read_lock(); } - if (!rdev || !test_bit(In_sync, &rdev->flags) - || test_bit(R5_ReadError, &dev->flags)) { - if ( failed < 2 ) - failed_num[failed] = i; - failed++; - } else - set_bit(R5_Insync, &dev->flags); } rcu_read_unlock(); - PRINTK("locked=%d uptodate=%d to_read=%d" - " to_write=%d failed=%d failed_num=%d,%d\n", - locked, uptodate, to_read, to_write, failed, - failed_num[0], failed_num[1]); - /* check if the array has lost >2 devices and, if so, some requests might - * need to be failed - */ - if (failed > 2 && to_read+to_write+written) { - for (i=disks; i--; ) { - int bitmap_end = 0; - - if (test_bit(R5_ReadError, &sh->dev[i].flags)) { - mdk_rdev_t *rdev; - rcu_read_lock(); - rdev = rcu_dereference(conf->disks[i].rdev); - if (rdev && test_bit(In_sync, &rdev->flags)) - /* multiple read failures in one stripe */ - md_error(conf->mddev, rdev); - rcu_read_unlock(); - } +} - spin_lock_irq(&conf->device_lock); - /* fail all writes first */ - bi = sh->dev[i].towrite; - sh->dev[i].towrite = NULL; - if (bi) { to_write--; bitmap_end = 1; } +static void raid5_unplug_device(request_queue_t *q) +{ + mddev_t *mddev = q->queuedata; + raid5_conf_t *conf = mddev_to_conf(mddev); + unsigned long flags; - if (test_and_clear_bit(R5_Overlap, &sh->dev[i].flags)) - wake_up(&conf->wait_for_overlap); - - while (bi && bi->bi_sector < sh->dev[i].sector + STRIPE_SECTORS){ - struct bio *nextbi = r5_next_bio(bi, sh->dev[i].sector); - clear_bit(BIO_UPTODATE, &bi->bi_flags); - if (--bi->bi_phys_segments == 0) { - md_write_end(conf->mddev); - bi->bi_next = return_bi; - return_bi = bi; - } - bi = nextbi; - } - /* and fail all 'written' */ - bi = sh->dev[i].written; - sh->dev[i].written = NULL; - if (bi) bitmap_end = 1; - while (bi && bi->bi_sector < sh->dev[i].sector + STRIPE_SECTORS) { - struct bio *bi2 = r5_next_bio(bi, sh->dev[i].sector); - clear_bit(BIO_UPTODATE, &bi->bi_flags); - if (--bi->bi_phys_segments == 0) { - md_write_end(conf->mddev); - bi->bi_next = return_bi; - return_bi = bi; - } - bi = bi2; - } - - /* fail any reads if this device is non-operational */ - if (!test_bit(R5_Insync, &sh->dev[i].flags) || - test_bit(R5_ReadError, &sh->dev[i].flags)) { - bi = sh->dev[i].toread; - sh->dev[i].toread = NULL; - if (test_and_clear_bit(R5_Overlap, &sh->dev[i].flags)) - wake_up(&conf->wait_for_overlap); - if (bi) to_read--; - while (bi && bi->bi_sector < sh->dev[i].sector + STRIPE_SECTORS){ - struct bio *nextbi = r5_next_bio(bi, sh->dev[i].sector); - clear_bit(BIO_UPTODATE, &bi->bi_flags); - if (--bi->bi_phys_segments == 0) { - bi->bi_next = return_bi; - return_bi = bi; - } - bi = nextbi; - } - } - spin_unlock_irq(&conf->device_lock); - if (bitmap_end) - bitmap_endwrite(conf->mddev->bitmap, sh->sector, - STRIPE_SECTORS, 0, 0); - } - } - if (failed > 2 && syncing) { - md_done_sync(conf->mddev, STRIPE_SECTORS,0); - clear_bit(STRIPE_SYNCING, &sh->state); - syncing = 0; - } - - /* - * might be able to return some write requests if the parity blocks - * are safe, or on a failed drive - */ - pdev = &sh->dev[pd_idx]; - p_failed = (failed >= 1 && failed_num[0] == pd_idx) - || (failed >= 2 && failed_num[1] == pd_idx); - qdev = &sh->dev[qd_idx]; - q_failed = (failed >= 1 && failed_num[0] == qd_idx) - || (failed >= 2 && failed_num[1] == qd_idx); - - if ( written && - ( p_failed || ((test_bit(R5_Insync, &pdev->flags) - && !test_bit(R5_LOCKED, &pdev->flags) - && test_bit(R5_UPTODATE, &pdev->flags))) ) && - ( q_failed || ((test_bit(R5_Insync, &qdev->flags) - && !test_bit(R5_LOCKED, &qdev->flags) - && test_bit(R5_UPTODATE, &qdev->flags))) ) ) { - /* any written block on an uptodate or failed drive can be - * returned. Note that if we 'wrote' to a failed drive, - * it will be UPTODATE, but never LOCKED, so we don't need - * to test 'failed' directly. - */ - for (i=disks; i--; ) - if (sh->dev[i].written) { - dev = &sh->dev[i]; - if (!test_bit(R5_LOCKED, &dev->flags) && - test_bit(R5_UPTODATE, &dev->flags) ) { - /* We can return any write requests */ - int bitmap_end = 0; - struct bio *wbi, *wbi2; - PRINTK("Return write for stripe %llu disc %d\n", - (unsigned long long)sh->sector, i); - spin_lock_irq(&conf->device_lock); - wbi = dev->written; - dev->written = NULL; - while (wbi && wbi->bi_sector < dev->sector + STRIPE_SECTORS) { - wbi2 = r5_next_bio(wbi, dev->sector); - if (--wbi->bi_phys_segments == 0) { - md_write_end(conf->mddev); - wbi->bi_next = return_bi; - return_bi = wbi; - } - wbi = wbi2; - } - if (dev->towrite == NULL) - bitmap_end = 1; - spin_unlock_irq(&conf->device_lock); - if (bitmap_end) - bitmap_endwrite(conf->mddev->bitmap, sh->sector, - STRIPE_SECTORS, - !test_bit(STRIPE_DEGRADED, &sh->state), 0); - } - } - } - - /* Now we might consider reading some blocks, either to check/generate - * parity, or to satisfy requests - * or to load a block that is being partially written. - */ - if (to_read || non_overwrite || (to_write && failed) || (syncing && (uptodate < disks))) { - for (i=disks; i--;) { - dev = &sh->dev[i]; - if (!test_bit(R5_LOCKED, &dev->flags) && !test_bit(R5_UPTODATE, &dev->flags) && - (dev->toread || - (dev->towrite && !test_bit(R5_OVERWRITE, &dev->flags)) || - syncing || - (failed >= 1 && (sh->dev[failed_num[0]].toread || to_write)) || - (failed >= 2 && (sh->dev[failed_num[1]].toread || to_write)) - ) - ) { - /* we would like to get this block, possibly - * by computing it, but we might not be able to - */ - if (uptodate == disks-1) { - PRINTK("Computing stripe %llu block %d\n", - (unsigned long long)sh->sector, i); - compute_block_1(sh, i, 0); - uptodate++; - } else if ( uptodate == disks-2 && failed >= 2 ) { - /* Computing 2-failure is *very* expensive; only do it if failed >= 2 */ - int other; - for (other=disks; other--;) { - if ( other == i ) - continue; - if ( !test_bit(R5_UPTODATE, &sh->dev[other].flags) ) - break; - } - BUG_ON(other < 0); - PRINTK("Computing stripe %llu blocks %d,%d\n", - (unsigned long long)sh->sector, i, other); - compute_block_2(sh, i, other); - uptodate += 2; - } else if (test_bit(R5_Insync, &dev->flags)) { - set_bit(R5_LOCKED, &dev->flags); - set_bit(R5_Wantread, &dev->flags); -#if 0 - /* if I am just reading this block and we don't have - a failed drive, or any pending writes then sidestep the cache */ - if (sh->bh_read[i] && !sh->bh_read[i]->b_reqnext && - ! syncing && !failed && !to_write) { - sh->bh_cache[i]->b_page = sh->bh_read[i]->b_page; - sh->bh_cache[i]->b_data = sh->bh_read[i]->b_data; - } -#endif - locked++; - PRINTK("Reading block %d (sync=%d)\n", - i, syncing); - } - } - } - set_bit(STRIPE_HANDLE, &sh->state); - } - - /* now to consider writing and what else, if anything should be read */ - if (to_write) { - int rcw=0, must_compute=0; - for (i=disks ; i--;) { - dev = &sh->dev[i]; - /* Would I have to read this buffer for reconstruct_write */ - if (!test_bit(R5_OVERWRITE, &dev->flags) - && i != pd_idx && i != qd_idx - && (!test_bit(R5_LOCKED, &dev->flags) -#if 0 - || sh->bh_page[i] != bh->b_page -#endif - ) && - !test_bit(R5_UPTODATE, &dev->flags)) { - if (test_bit(R5_Insync, &dev->flags)) rcw++; - else { - PRINTK("raid6: must_compute: disk %d flags=%#lx\n", i, dev->flags); - must_compute++; - } - } - } - PRINTK("for sector %llu, rcw=%d, must_compute=%d\n", - (unsigned long long)sh->sector, rcw, must_compute); - set_bit(STRIPE_HANDLE, &sh->state); - - if (rcw > 0) - /* want reconstruct write, but need to get some data */ - for (i=disks; i--;) { - dev = &sh->dev[i]; - if (!test_bit(R5_OVERWRITE, &dev->flags) - && !(failed == 0 && (i == pd_idx || i == qd_idx)) - && !test_bit(R5_LOCKED, &dev->flags) && !test_bit(R5_UPTODATE, &dev->flags) && - test_bit(R5_Insync, &dev->flags)) { - if (test_bit(STRIPE_PREREAD_ACTIVE, &sh->state)) - { - PRINTK("Read_old stripe %llu block %d for Reconstruct\n", - (unsigned long long)sh->sector, i); - set_bit(R5_LOCKED, &dev->flags); - set_bit(R5_Wantread, &dev->flags); - locked++; - } else { - PRINTK("Request delayed stripe %llu block %d for Reconstruct\n", - (unsigned long long)sh->sector, i); - set_bit(STRIPE_DELAYED, &sh->state); - set_bit(STRIPE_HANDLE, &sh->state); - } - } - } - /* now if nothing is locked, and if we have enough data, we can start a write request */ - if (locked == 0 && rcw == 0 && - !test_bit(STRIPE_BIT_DELAY, &sh->state)) { - if ( must_compute > 0 ) { - /* We have failed blocks and need to compute them */ - switch ( failed ) { - case 0: BUG(); - case 1: compute_block_1(sh, failed_num[0], 0); break; - case 2: compute_block_2(sh, failed_num[0], failed_num[1]); break; - default: BUG(); /* This request should have been failed? */ - } - } - - PRINTK("Computing parity for stripe %llu\n", (unsigned long long)sh->sector); - compute_parity6(sh, RECONSTRUCT_WRITE); - /* now every locked buffer is ready to be written */ - for (i=disks; i--;) - if (test_bit(R5_LOCKED, &sh->dev[i].flags)) { - PRINTK("Writing stripe %llu block %d\n", - (unsigned long long)sh->sector, i); - locked++; - set_bit(R5_Wantwrite, &sh->dev[i].flags); - } - /* after a RECONSTRUCT_WRITE, the stripe MUST be in-sync */ - set_bit(STRIPE_INSYNC, &sh->state); - - if (test_and_clear_bit(STRIPE_PREREAD_ACTIVE, &sh->state)) { - atomic_dec(&conf->preread_active_stripes); - if (atomic_read(&conf->preread_active_stripes) < IO_THRESHOLD) - md_wakeup_thread(conf->mddev->thread); - } - } - } - - /* maybe we need to check and possibly fix the parity for this stripe - * Any reads will already have been scheduled, so we just see if enough data - * is available - */ - if (syncing && locked == 0 && !test_bit(STRIPE_INSYNC, &sh->state)) { - int update_p = 0, update_q = 0; - struct r5dev *dev; - - set_bit(STRIPE_HANDLE, &sh->state); - - BUG_ON(failed>2); - BUG_ON(uptodate < disks); - /* Want to check and possibly repair P and Q. - * However there could be one 'failed' device, in which - * case we can only check one of them, possibly using the - * other to generate missing data - */ - - /* If !tmp_page, we cannot do the calculations, - * but as we have set STRIPE_HANDLE, we will soon be called - * by stripe_handle with a tmp_page - just wait until then. - */ - if (tmp_page) { - if (failed == q_failed) { - /* The only possible failed device holds 'Q', so it makes - * sense to check P (If anything else were failed, we would - * have used P to recreate it). - */ - compute_block_1(sh, pd_idx, 1); - if (!page_is_zero(sh->dev[pd_idx].page)) { - compute_block_1(sh,pd_idx,0); - update_p = 1; - } - } - if (!q_failed && failed < 2) { - /* q is not failed, and we didn't use it to generate - * anything, so it makes sense to check it - */ - memcpy(page_address(tmp_page), - page_address(sh->dev[qd_idx].page), - STRIPE_SIZE); - compute_parity6(sh, UPDATE_PARITY); - if (memcmp(page_address(tmp_page), - page_address(sh->dev[qd_idx].page), - STRIPE_SIZE)!= 0) { - clear_bit(STRIPE_INSYNC, &sh->state); - update_q = 1; - } - } - if (update_p || update_q) { - conf->mddev->resync_mismatches += STRIPE_SECTORS; - if (test_bit(MD_RECOVERY_CHECK, &conf->mddev->recovery)) - /* don't try to repair!! */ - update_p = update_q = 0; - } - - /* now write out any block on a failed drive, - * or P or Q if they need it - */ - - if (failed == 2) { - dev = &sh->dev[failed_num[1]]; - locked++; - set_bit(R5_LOCKED, &dev->flags); - set_bit(R5_Wantwrite, &dev->flags); - } - if (failed >= 1) { - dev = &sh->dev[failed_num[0]]; - locked++; - set_bit(R5_LOCKED, &dev->flags); - set_bit(R5_Wantwrite, &dev->flags); - } - - if (update_p) { - dev = &sh->dev[pd_idx]; - locked ++; - set_bit(R5_LOCKED, &dev->flags); - set_bit(R5_Wantwrite, &dev->flags); - } - if (update_q) { - dev = &sh->dev[qd_idx]; - locked++; - set_bit(R5_LOCKED, &dev->flags); - set_bit(R5_Wantwrite, &dev->flags); - } - clear_bit(STRIPE_DEGRADED, &sh->state); - - set_bit(STRIPE_INSYNC, &sh->state); - } - } - - if (syncing && locked == 0 && test_bit(STRIPE_INSYNC, &sh->state)) { - md_done_sync(conf->mddev, STRIPE_SECTORS,1); - clear_bit(STRIPE_SYNCING, &sh->state); - } - - /* If the failed drives are just a ReadError, then we might need - * to progress the repair/check process - */ - if (failed <= 2 && ! conf->mddev->ro) - for (i=0; idev[failed_num[i]]; - if (test_bit(R5_ReadError, &dev->flags) - && !test_bit(R5_LOCKED, &dev->flags) - && test_bit(R5_UPTODATE, &dev->flags) - ) { - if (!test_bit(R5_ReWrite, &dev->flags)) { - set_bit(R5_Wantwrite, &dev->flags); - set_bit(R5_ReWrite, &dev->flags); - set_bit(R5_LOCKED, &dev->flags); - } else { - /* let's read it back */ - set_bit(R5_Wantread, &dev->flags); - set_bit(R5_LOCKED, &dev->flags); - } - } - } - spin_unlock(&sh->lock); - - while ((bi=return_bi)) { - int bytes = bi->bi_size; - - return_bi = bi->bi_next; - bi->bi_next = NULL; - bi->bi_size = 0; - bi->bi_end_io(bi, bytes, 0); - } - for (i=disks; i-- ;) { - int rw; - struct bio *bi; - mdk_rdev_t *rdev; - if (test_and_clear_bit(R5_Wantwrite, &sh->dev[i].flags)) - rw = 1; - else if (test_and_clear_bit(R5_Wantread, &sh->dev[i].flags)) - rw = 0; - else - continue; - - bi = &sh->dev[i].req; - - bi->bi_rw = rw; - if (rw) - bi->bi_end_io = raid5_end_write_request; - else - bi->bi_end_io = raid5_end_read_request; - - rcu_read_lock(); - rdev = rcu_dereference(conf->disks[i].rdev); - if (rdev && test_bit(Faulty, &rdev->flags)) - rdev = NULL; - if (rdev) - atomic_inc(&rdev->nr_pending); - rcu_read_unlock(); - - if (rdev) { - if (syncing) - md_sync_acct(rdev->bdev, STRIPE_SECTORS); - - bi->bi_bdev = rdev->bdev; - PRINTK("for %llu schedule op %ld on disc %d\n", - (unsigned long long)sh->sector, bi->bi_rw, i); - atomic_inc(&sh->count); - bi->bi_sector = sh->sector + rdev->data_offset; - bi->bi_flags = 1 << BIO_UPTODATE; - bi->bi_vcnt = 1; - bi->bi_max_vecs = 1; - bi->bi_idx = 0; - bi->bi_io_vec = &sh->dev[i].vec; - bi->bi_io_vec[0].bv_len = STRIPE_SIZE; - bi->bi_io_vec[0].bv_offset = 0; - bi->bi_size = STRIPE_SIZE; - bi->bi_next = NULL; - if (rw == WRITE && - test_bit(R5_ReWrite, &sh->dev[i].flags)) - atomic_add(STRIPE_SECTORS, &rdev->corrected_errors); - generic_make_request(bi); - } else { - if (rw == 1) - set_bit(STRIPE_DEGRADED, &sh->state); - PRINTK("skip op %ld on disc %d for sector %llu\n", - bi->bi_rw, i, (unsigned long long)sh->sector); - clear_bit(R5_LOCKED, &sh->dev[i].flags); - set_bit(STRIPE_HANDLE, &sh->state); - } - } -} - -static void handle_stripe(struct stripe_head *sh, struct page *tmp_page) -{ - if (sh->raid_conf->level == 6) - handle_stripe6(sh, tmp_page); - else - handle_stripe5(sh); -} - - - -static void raid5_activate_delayed(raid5_conf_t *conf) -{ - if (atomic_read(&conf->preread_active_stripes) < IO_THRESHOLD) { - while (!list_empty(&conf->delayed_list)) { - struct list_head *l = conf->delayed_list.next; - struct stripe_head *sh; - sh = list_entry(l, struct stripe_head, lru); - list_del_init(l); - clear_bit(STRIPE_DELAYED, &sh->state); - if (!test_and_set_bit(STRIPE_PREREAD_ACTIVE, &sh->state)) - atomic_inc(&conf->preread_active_stripes); - list_add_tail(&sh->lru, &conf->handle_list); - } - } -} - -static void activate_bit_delay(raid5_conf_t *conf) -{ - /* device_lock is held */ - struct list_head head; - list_add(&head, &conf->bitmap_list); - list_del_init(&conf->bitmap_list); - while (!list_empty(&head)) { - struct stripe_head *sh = list_entry(head.next, struct stripe_head, lru); - list_del_init(&sh->lru); - atomic_inc(&sh->count); - __release_stripe(conf, sh); - } -} - -static void unplug_slaves(mddev_t *mddev) -{ - raid5_conf_t *conf = mddev_to_conf(mddev); - int i; - - rcu_read_lock(); - for (i=0; iraid_disks; i++) { - mdk_rdev_t *rdev = rcu_dereference(conf->disks[i].rdev); - if (rdev && !test_bit(Faulty, &rdev->flags) && atomic_read(&rdev->nr_pending)) { - request_queue_t *r_queue = bdev_get_queue(rdev->bdev); - - atomic_inc(&rdev->nr_pending); - rcu_read_unlock(); - - if (r_queue->unplug_fn) - r_queue->unplug_fn(r_queue); - - rdev_dec_pending(rdev, mddev); - rcu_read_lock(); - } - } - rcu_read_unlock(); -} - -static void raid5_unplug_device(request_queue_t *q) -{ - mddev_t *mddev = q->queuedata; - raid5_conf_t *conf = mddev_to_conf(mddev); - unsigned long flags; - - spin_lock_irqsave(&conf->device_lock, flags); + spin_lock_irqsave(&conf->device_lock, flags); if (blk_remove_plug(q)) { conf->seq_flush++; @@ -2590,7 +1753,7 @@ static int make_request(request_queue_t *q, struct bio * bi) for (;logical_sector < last_sector; logical_sector += STRIPE_SECTORS) { DEFINE_WAIT(w); - int disks, data_disks; + int disks; retry: prepare_to_wait(&conf->wait_for_overlap, &w, TASK_UNINTERRUPTIBLE); @@ -2618,9 +1781,7 @@ static int make_request(request_queue_t *q, struct bio * bi) } spin_unlock_irq(&conf->device_lock); } - data_disks = disks - conf->max_degraded; - - new_sector = raid5_compute_sector(logical_sector, disks, data_disks, + new_sector = raid5_compute_sector(logical_sector, disks, disks - 1, &dd_idx, &pd_idx, conf); PRINTK("raid5: make_request, sector %llu logical %llu\n", (unsigned long long)new_sector, @@ -2672,7 +1833,7 @@ static int make_request(request_queue_t *q, struct bio * bi) } finish_wait(&conf->wait_for_overlap, &w); raid5_plug_device(conf); - handle_stripe(sh, NULL); + handle_stripe(sh); release_stripe(sh); } else { /* cannot get stripe for read-ahead, just give-up */ @@ -2688,7 +1849,7 @@ static int make_request(request_queue_t *q, struct bio * bi) if (remaining == 0) { int bytes = bi->bi_size; - if ( rw == WRITE ) + if ( bio_data_dir(bi) == WRITE ) md_write_end(mddev); bi->bi_size = 0; bi->bi_end_io(bi, bytes, 0); @@ -2696,142 +1857,17 @@ static int make_request(request_queue_t *q, struct bio * bi) return 0; } -static sector_t reshape_request(mddev_t *mddev, sector_t sector_nr, int *skipped) -{ - /* reshaping is quite different to recovery/resync so it is - * handled quite separately ... here. - * - * On each call to sync_request, we gather one chunk worth of - * destination stripes and flag them as expanding. - * Then we find all the source stripes and request reads. - * As the reads complete, handle_stripe will copy the data - * into the destination stripe and release that stripe. - */ - raid5_conf_t *conf = (raid5_conf_t *) mddev->private; - struct stripe_head *sh; - int pd_idx; - sector_t first_sector, last_sector; - int raid_disks; - int data_disks; - int i; - int dd_idx; - sector_t writepos, safepos, gap; - - if (sector_nr == 0 && - conf->expand_progress != 0) { - /* restarting in the middle, skip the initial sectors */ - sector_nr = conf->expand_progress; - sector_div(sector_nr, conf->raid_disks-1); - *skipped = 1; - return sector_nr; - } - - /* we update the metadata when there is more than 3Meg - * in the block range (that is rather arbitrary, should - * probably be time based) or when the data about to be - * copied would over-write the source of the data at - * the front of the range. - * i.e. one new_stripe forward from expand_progress new_maps - * to after where expand_lo old_maps to - */ - writepos = conf->expand_progress + - conf->chunk_size/512*(conf->raid_disks-1); - sector_div(writepos, conf->raid_disks-1); - safepos = conf->expand_lo; - sector_div(safepos, conf->previous_raid_disks-1); - gap = conf->expand_progress - conf->expand_lo; - - if (writepos >= safepos || - gap > (conf->raid_disks-1)*3000*2 /*3Meg*/) { - /* Cannot proceed until we've updated the superblock... */ - wait_event(conf->wait_for_overlap, - atomic_read(&conf->reshape_stripes)==0); - mddev->reshape_position = conf->expand_progress; - mddev->sb_dirty = 1; - md_wakeup_thread(mddev->thread); - wait_event(mddev->sb_wait, mddev->sb_dirty == 0 || - kthread_should_stop()); - spin_lock_irq(&conf->device_lock); - conf->expand_lo = mddev->reshape_position; - spin_unlock_irq(&conf->device_lock); - wake_up(&conf->wait_for_overlap); - } - - for (i=0; i < conf->chunk_size/512; i+= STRIPE_SECTORS) { - int j; - int skipped = 0; - pd_idx = stripe_to_pdidx(sector_nr+i, conf, conf->raid_disks); - sh = get_active_stripe(conf, sector_nr+i, - conf->raid_disks, pd_idx, 0); - set_bit(STRIPE_EXPANDING, &sh->state); - atomic_inc(&conf->reshape_stripes); - /* If any of this stripe is beyond the end of the old - * array, then we need to zero those blocks - */ - for (j=sh->disks; j--;) { - sector_t s; - if (j == sh->pd_idx) - continue; - s = compute_blocknr(sh, j); - if (s < (mddev->array_size<<1)) { - skipped = 1; - continue; - } - memset(page_address(sh->dev[j].page), 0, STRIPE_SIZE); - set_bit(R5_Expanded, &sh->dev[j].flags); - set_bit(R5_UPTODATE, &sh->dev[j].flags); - } - if (!skipped) { - set_bit(STRIPE_EXPAND_READY, &sh->state); - set_bit(STRIPE_HANDLE, &sh->state); - } - release_stripe(sh); - } - spin_lock_irq(&conf->device_lock); - conf->expand_progress = (sector_nr + i)*(conf->raid_disks-1); - spin_unlock_irq(&conf->device_lock); - /* Ok, those stripe are ready. We can start scheduling - * reads on the source stripes. - * The source stripes are determined by mapping the first and last - * block on the destination stripes. - */ - raid_disks = conf->previous_raid_disks; - data_disks = raid_disks - 1; - first_sector = - raid5_compute_sector(sector_nr*(conf->raid_disks-1), - raid_disks, data_disks, - &dd_idx, &pd_idx, conf); - last_sector = - raid5_compute_sector((sector_nr+conf->chunk_size/512) - *(conf->raid_disks-1) -1, - raid_disks, data_disks, - &dd_idx, &pd_idx, conf); - if (last_sector >= (mddev->size<<1)) - last_sector = (mddev->size<<1)-1; - while (first_sector <= last_sector) { - pd_idx = stripe_to_pdidx(first_sector, conf, conf->previous_raid_disks); - sh = get_active_stripe(conf, first_sector, - conf->previous_raid_disks, pd_idx, 0); - set_bit(STRIPE_EXPAND_SOURCE, &sh->state); - set_bit(STRIPE_HANDLE, &sh->state); - release_stripe(sh); - first_sector += STRIPE_SECTORS; - } - return conf->chunk_size>>9; -} - /* FIXME go_faster isn't used */ -static inline sector_t sync_request(mddev_t *mddev, sector_t sector_nr, int *skipped, int go_faster) +static sector_t sync_request(mddev_t *mddev, sector_t sector_nr, int *skipped, int go_faster) { raid5_conf_t *conf = (raid5_conf_t *) mddev->private; struct stripe_head *sh; int pd_idx; + sector_t first_sector, last_sector; int raid_disks = conf->raid_disks; - int data_disks = raid_disks - conf->max_degraded; + int data_disks = raid_disks-1; sector_t max_sector = mddev->size << 1; int sync_blocks; - int still_degraded = 0; - int i; if (sector_nr >= max_sector) { /* just being told to finish up .. nothing much to do */ @@ -2844,22 +1880,134 @@ static inline sector_t sync_request(mddev_t *mddev, sector_t sector_nr, int *ski if (mddev->curr_resync < max_sector) /* aborted */ bitmap_end_sync(mddev->bitmap, mddev->curr_resync, &sync_blocks, 1); - else /* completed sync */ + else /* compelted sync */ conf->fullsync = 0; bitmap_close_sync(mddev->bitmap); return 0; } - if (test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery)) - return reshape_request(mddev, sector_nr, skipped); - - /* if there is too many failed drives and we are trying + if (test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery)) { + /* reshaping is quite different to recovery/resync so it is + * handled quite separately ... here. + * + * On each call to sync_request, we gather one chunk worth of + * destination stripes and flag them as expanding. + * Then we find all the source stripes and request reads. + * As the reads complete, handle_stripe will copy the data + * into the destination stripe and release that stripe. + */ + int i; + int dd_idx; + sector_t writepos, safepos, gap; + + if (sector_nr == 0 && + conf->expand_progress != 0) { + /* restarting in the middle, skip the initial sectors */ + sector_nr = conf->expand_progress; + sector_div(sector_nr, conf->raid_disks-1); + *skipped = 1; + return sector_nr; + } + + /* we update the metadata when there is more than 3Meg + * in the block range (that is rather arbitrary, should + * probably be time based) or when the data about to be + * copied would over-write the source of the data at + * the front of the range. + * i.e. one new_stripe forward from expand_progress new_maps + * to after where expand_lo old_maps to + */ + writepos = conf->expand_progress + + conf->chunk_size/512*(conf->raid_disks-1); + sector_div(writepos, conf->raid_disks-1); + safepos = conf->expand_lo; + sector_div(safepos, conf->previous_raid_disks-1); + gap = conf->expand_progress - conf->expand_lo; + + if (writepos >= safepos || + gap > (conf->raid_disks-1)*3000*2 /*3Meg*/) { + /* Cannot proceed until we've updated the superblock... */ + wait_event(conf->wait_for_overlap, + atomic_read(&conf->reshape_stripes)==0); + mddev->reshape_position = conf->expand_progress; + mddev->sb_dirty = 1; + md_wakeup_thread(mddev->thread); + wait_event(mddev->sb_wait, mddev->sb_dirty == 0 || + kthread_should_stop()); + spin_lock_irq(&conf->device_lock); + conf->expand_lo = mddev->reshape_position; + spin_unlock_irq(&conf->device_lock); + wake_up(&conf->wait_for_overlap); + } + + for (i=0; i < conf->chunk_size/512; i+= STRIPE_SECTORS) { + int j; + int skipped = 0; + pd_idx = stripe_to_pdidx(sector_nr+i, conf, conf->raid_disks); + sh = get_active_stripe(conf, sector_nr+i, + conf->raid_disks, pd_idx, 0); + set_bit(STRIPE_EXPANDING, &sh->state); + atomic_inc(&conf->reshape_stripes); + /* If any of this stripe is beyond the end of the old + * array, then we need to zero those blocks + */ + for (j=sh->disks; j--;) { + sector_t s; + if (j == sh->pd_idx) + continue; + s = compute_blocknr(sh, j); + if (s < (mddev->array_size<<1)) { + skipped = 1; + continue; + } + memset(page_address(sh->dev[j].page), 0, STRIPE_SIZE); + set_bit(R5_Expanded, &sh->dev[j].flags); + set_bit(R5_UPTODATE, &sh->dev[j].flags); + } + if (!skipped) { + set_bit(STRIPE_EXPAND_READY, &sh->state); + set_bit(STRIPE_HANDLE, &sh->state); + } + release_stripe(sh); + } + spin_lock_irq(&conf->device_lock); + conf->expand_progress = (sector_nr + i)*(conf->raid_disks-1); + spin_unlock_irq(&conf->device_lock); + /* Ok, those stripe are ready. We can start scheduling + * reads on the source stripes. + * The source stripes are determined by mapping the first and last + * block on the destination stripes. + */ + raid_disks = conf->previous_raid_disks; + data_disks = raid_disks - 1; + first_sector = + raid5_compute_sector(sector_nr*(conf->raid_disks-1), + raid_disks, data_disks, + &dd_idx, &pd_idx, conf); + last_sector = + raid5_compute_sector((sector_nr+conf->chunk_size/512) + *(conf->raid_disks-1) -1, + raid_disks, data_disks, + &dd_idx, &pd_idx, conf); + if (last_sector >= (mddev->size<<1)) + last_sector = (mddev->size<<1)-1; + while (first_sector <= last_sector) { + pd_idx = stripe_to_pdidx(first_sector, conf, conf->previous_raid_disks); + sh = get_active_stripe(conf, first_sector, + conf->previous_raid_disks, pd_idx, 0); + set_bit(STRIPE_EXPAND_SOURCE, &sh->state); + set_bit(STRIPE_HANDLE, &sh->state); + release_stripe(sh); + first_sector += STRIPE_SECTORS; + } + return conf->chunk_size>>9; + } + /* if there is 1 or more failed drives and we are trying * to resync, then assert that we are finished, because there is * nothing we can do. */ - if (mddev->degraded >= conf->max_degraded && - test_bit(MD_RECOVERY_SYNC, &mddev->recovery)) { + if (mddev->degraded >= 1 && test_bit(MD_RECOVERY_SYNC, &mddev->recovery)) { sector_t rv = (mddev->size << 1) - sector_nr; *skipped = 1; return rv; @@ -2878,26 +2026,17 @@ static inline sector_t sync_request(mddev_t *mddev, sector_t sector_nr, int *ski if (sh == NULL) { sh = get_active_stripe(conf, sector_nr, raid_disks, pd_idx, 0); /* make sure we don't swamp the stripe cache if someone else - * is trying to get access + * is trying to get access */ schedule_timeout_uninterruptible(1); } - /* Need to check if array will still be degraded after recovery/resync - * We don't need to check the 'failed' flag as when that gets set, - * recovery aborts. - */ - for (i=0; iraid_disks; i++) - if (conf->disks[i].rdev == NULL) - still_degraded = 1; - - bitmap_start_sync(mddev->bitmap, sector_nr, &sync_blocks, still_degraded); - - spin_lock(&sh->lock); + bitmap_start_sync(mddev->bitmap, sector_nr, &sync_blocks, 0); + spin_lock(&sh->lock); set_bit(STRIPE_SYNCING, &sh->state); clear_bit(STRIPE_INSYNC, &sh->state); spin_unlock(&sh->lock); - handle_stripe(sh, NULL); + handle_stripe(sh); release_stripe(sh); return STRIPE_SECTORS; @@ -2952,7 +2091,7 @@ static void raid5d (mddev_t *mddev) spin_unlock_irq(&conf->device_lock); handled++; - handle_stripe(sh, conf->spare_page); + handle_stripe(sh); release_stripe(sh); spin_lock_irq(&conf->device_lock); @@ -3042,8 +2181,8 @@ static int run(mddev_t *mddev) struct disk_info *disk; struct list_head *tmp; - if (mddev->level != 5 && mddev->level != 4 && mddev->level != 6) { - printk(KERN_ERR "raid5: %s: raid level not set to 4/5/6 (%d)\n", + if (mddev->level != 5 && mddev->level != 4) { + printk(KERN_ERR "raid5: %s: raid level not set to 4/5 (%d)\n", mdname(mddev), mddev->level); return -EIO; } @@ -3112,11 +2251,6 @@ static int run(mddev_t *mddev) if ((conf->stripe_hashtbl = kzalloc(PAGE_SIZE, GFP_KERNEL)) == NULL) goto abort; - if (mddev->level == 6) { - conf->spare_page = alloc_page(GFP_KERNEL); - if (!conf->spare_page) - goto abort; - } spin_lock_init(&conf->device_lock); init_waitqueue_head(&conf->wait_for_stripe); init_waitqueue_head(&conf->wait_for_overlap); @@ -3148,16 +2282,12 @@ static int run(mddev_t *mddev) } /* - * 0 for a fully functional array, 1 or 2 for a degraded array. + * 0 for a fully functional array, 1 for a degraded array. */ mddev->degraded = conf->failed_disks = conf->raid_disks - conf->working_disks; conf->mddev = mddev; conf->chunk_size = mddev->chunk_size; conf->level = mddev->level; - if (conf->level == 6) - conf->max_degraded = 2; - else - conf->max_degraded = 1; conf->algorithm = mddev->layout; conf->max_nr_stripes = NR_STRIPES; conf->expand_progress = mddev->reshape_position; @@ -3166,11 +2296,6 @@ static int run(mddev_t *mddev) mddev->size &= ~(mddev->chunk_size/1024 -1); mddev->resync_max_sectors = mddev->size << 1; - if (conf->level == 6 && conf->raid_disks < 4) { - printk(KERN_ERR "raid6: not enough configured devices for %s (%d, minimum 4)\n", - mdname(mddev), conf->raid_disks); - goto abort; - } if (!conf->chunk_size || conf->chunk_size % 4) { printk(KERN_ERR "raid5: invalid chunk size %d for %s\n", conf->chunk_size, mdname(mddev)); @@ -3182,14 +2307,14 @@ static int run(mddev_t *mddev) conf->algorithm, mdname(mddev)); goto abort; } - if (mddev->degraded > conf->max_degraded) { + if (mddev->degraded > 1) { printk(KERN_ERR "raid5: not enough operational devices for %s" " (%d/%d failed)\n", mdname(mddev), conf->failed_disks, conf->raid_disks); goto abort; } - if (mddev->degraded > 0 && + if (mddev->degraded == 1 && mddev->recovery_cp != MaxSector) { if (mddev->ok_start_degraded) printk(KERN_WARNING @@ -3254,12 +2379,11 @@ static int run(mddev_t *mddev) } /* read-ahead size must cover two whole stripes, which is - * 2 * (datadisks) * chunksize where 'n' is the number of raid devices + * 2 * (n-1) * chunksize where 'n' is the number of raid devices */ { - int data_disks = conf->previous_raid_disks - conf->max_degraded; - int stripe = data_disks * - (mddev->chunk_size / PAGE_SIZE); + int stripe = (mddev->raid_disks-1) * mddev->chunk_size + / PAGE_SIZE; if (mddev->queue->backing_dev_info.ra_pages < 2 * stripe) mddev->queue->backing_dev_info.ra_pages = 2 * stripe; } @@ -3269,14 +2393,12 @@ static int run(mddev_t *mddev) mddev->queue->unplug_fn = raid5_unplug_device; mddev->queue->issue_flush_fn = raid5_issue_flush; - mddev->array_size = mddev->size * (conf->previous_raid_disks - - conf->max_degraded); + mddev->array_size = mddev->size * (conf->previous_raid_disks - 1); return 0; abort: if (conf) { print_raid5_conf(conf); - safe_put_page(conf->spare_page); kfree(conf->disks); kfree(conf->stripe_hashtbl); kfree(conf); @@ -3305,23 +2427,23 @@ static int stop(mddev_t *mddev) } #if RAID5_DEBUG -static void print_sh (struct seq_file *seq, struct stripe_head *sh) +static void print_sh (struct stripe_head *sh) { int i; - seq_printf(seq, "sh %llu, pd_idx %d, state %ld.\n", - (unsigned long long)sh->sector, sh->pd_idx, sh->state); - seq_printf(seq, "sh %llu, count %d.\n", - (unsigned long long)sh->sector, atomic_read(&sh->count)); - seq_printf(seq, "sh %llu, ", (unsigned long long)sh->sector); + printk("sh %llu, pd_idx %d, state %ld.\n", + (unsigned long long)sh->sector, sh->pd_idx, sh->state); + printk("sh %llu, count %d.\n", + (unsigned long long)sh->sector, atomic_read(&sh->count)); + printk("sh %llu, ", (unsigned long long)sh->sector); for (i = 0; i < sh->disks; i++) { - seq_printf(seq, "(cache%d: %p %ld) ", - i, sh->dev[i].page, sh->dev[i].flags); + printk("(cache%d: %p %ld) ", + i, sh->dev[i].page, sh->dev[i].flags); } - seq_printf(seq, "\n"); + printk("\n"); } -static void printall (struct seq_file *seq, raid5_conf_t *conf) +static void printall (raid5_conf_t *conf) { struct stripe_head *sh; struct hlist_node *hn; @@ -3332,7 +2454,7 @@ static void printall (struct seq_file *seq, raid5_conf_t *conf) hlist_for_each_entry(sh, hn, &conf->stripe_hashtbl[i], hash) { if (sh->raid_conf != conf) continue; - print_sh(seq, sh); + print_sh(sh); } } spin_unlock_irq(&conf->device_lock); @@ -3352,8 +2474,9 @@ static void status (struct seq_file *seq, mddev_t *mddev) test_bit(In_sync, &conf->disks[i].rdev->flags) ? "U" : "_"); seq_printf (seq, "]"); #if RAID5_DEBUG - seq_printf (seq, "\n"); - printall(seq, conf); +#define D(x) \ + seq_printf (seq, "<"#x":%d>", atomic_read(&conf->x)) + printall(conf); #endif } @@ -3437,20 +2560,14 @@ static int raid5_add_disk(mddev_t *mddev, mdk_rdev_t *rdev) int disk; struct disk_info *p; - if (mddev->degraded > conf->max_degraded) + if (mddev->degraded > 1) /* no point adding a device */ return 0; /* - * find the disk ... but prefer rdev->saved_raid_disk - * if possible. + * find the disk ... */ - if (rdev->saved_raid_disk >= 0 && - conf->disks[rdev->saved_raid_disk].rdev == NULL) - disk = rdev->saved_raid_disk; - else - disk = 0; - for ( ; disk < conf->raid_disks; disk++) + for (disk=0; disk < conf->raid_disks; disk++) if ((p=conf->disks + disk)->rdev == NULL) { clear_bit(In_sync, &rdev->flags); rdev->raid_disk = disk; @@ -3473,10 +2590,8 @@ static int raid5_resize(mddev_t *mddev, sector_t sectors) * any io in the removed space completes, but it hardly seems * worth it. */ - raid5_conf_t *conf = mddev_to_conf(mddev); - sectors &= ~((sector_t)mddev->chunk_size/512 - 1); - mddev->array_size = (sectors * (mddev->raid_disks-conf->max_degraded))>>1; + mddev->array_size = (sectors * (mddev->raid_disks-1))>>1; set_capacity(mddev->gendisk, mddev->array_size << 1); mddev->changed = 1; if (sectors/2 > mddev->size && mddev->recovery_cp == MaxSector) { @@ -3565,7 +2680,6 @@ static int raid5_start_reshape(mddev_t *mddev) set_bit(In_sync, &rdev->flags); conf->working_disks++; added_devices++; - rdev->recovery_offset = 0; sprintf(nm, "rd%d", rdev->raid_disk); sysfs_create_link(&mddev->kobj, &rdev->kobj, nm); } else @@ -3617,17 +2731,6 @@ static void end_reshape(raid5_conf_t *conf) conf->expand_progress = MaxSector; spin_unlock_irq(&conf->device_lock); conf->mddev->reshape_position = MaxSector; - - /* read-ahead size must cover two whole stripes, which is - * 2 * (datadisks) * chunksize where 'n' is the number of raid devices - */ - { - int data_disks = conf->previous_raid_disks - conf->max_degraded; - int stripe = data_disks * - (conf->mddev->chunk_size / PAGE_SIZE); - if (conf->mddev->queue->backing_dev_info.ra_pages < 2 * stripe) - conf->mddev->queue->backing_dev_info.ra_pages = 2 * stripe; - } } } @@ -3659,23 +2762,6 @@ static void raid5_quiesce(mddev_t *mddev, int state) } } -static struct mdk_personality raid6_personality = -{ - .name = "raid6", - .level = 6, - .owner = THIS_MODULE, - .make_request = make_request, - .run = run, - .stop = stop, - .status = status, - .error_handler = error, - .hot_add_disk = raid5_add_disk, - .hot_remove_disk= raid5_remove_disk, - .spare_active = raid5_spare_active, - .sync_request = sync_request, - .resize = raid5_resize, - .quiesce = raid5_quiesce, -}; static struct mdk_personality raid5_personality = { .name = "raid5", @@ -3718,12 +2804,6 @@ static struct mdk_personality raid4_personality = static int __init raid5_init(void) { - int e; - - e = raid6_select_algo(); - if ( e ) - return e; - register_md_personality(&raid6_personality); register_md_personality(&raid5_personality); register_md_personality(&raid4_personality); return 0; @@ -3731,7 +2811,6 @@ static int __init raid5_init(void) static void raid5_exit(void) { - unregister_md_personality(&raid6_personality); unregister_md_personality(&raid5_personality); unregister_md_personality(&raid4_personality); } @@ -3744,10 +2823,3 @@ MODULE_ALIAS("md-raid5"); MODULE_ALIAS("md-raid4"); MODULE_ALIAS("md-level-5"); MODULE_ALIAS("md-level-4"); -MODULE_ALIAS("md-personality-8"); /* RAID6 */ -MODULE_ALIAS("md-raid6"); -MODULE_ALIAS("md-level-6"); - -/* This used to be two separate modules, they were: */ -MODULE_ALIAS("raid5"); -MODULE_ALIAS("raid6"); diff --git a/trunk/drivers/md/raid6main.c b/trunk/drivers/md/raid6main.c new file mode 100644 index 000000000000..bc69355e0100 --- /dev/null +++ b/trunk/drivers/md/raid6main.c @@ -0,0 +1,2427 @@ +/* + * raid6main.c : Multiple Devices driver for Linux + * Copyright (C) 1996, 1997 Ingo Molnar, Miguel de Icaza, Gadi Oxman + * Copyright (C) 1999, 2000 Ingo Molnar + * Copyright (C) 2002, 2003 H. Peter Anvin + * + * RAID-6 management functions. This code is derived from raid5.c. + * Last merge from raid5.c bkcvs version 1.79 (kernel 2.6.1). + * + * Thanks to Penguin Computing for making the RAID-6 development possible + * by donating a test server! + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2, or (at your option) + * any later version. + * + * You should have received a copy of the GNU General Public License + * (for example /usr/src/linux/COPYING); if not, write to the Free + * Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. + */ + + +#include +#include +#include +#include +#include +#include +#include "raid6.h" + +#include + +/* + * Stripe cache + */ + +#define NR_STRIPES 256 +#define STRIPE_SIZE PAGE_SIZE +#define STRIPE_SHIFT (PAGE_SHIFT - 9) +#define STRIPE_SECTORS (STRIPE_SIZE>>9) +#define IO_THRESHOLD 1 +#define NR_HASH (PAGE_SIZE / sizeof(struct hlist_head)) +#define HASH_MASK (NR_HASH - 1) + +#define stripe_hash(conf, sect) (&((conf)->stripe_hashtbl[((sect) >> STRIPE_SHIFT) & HASH_MASK])) + +/* bio's attached to a stripe+device for I/O are linked together in bi_sector + * order without overlap. There may be several bio's per stripe+device, and + * a bio could span several devices. + * When walking this list for a particular stripe+device, we must never proceed + * beyond a bio that extends past this device, as the next bio might no longer + * be valid. + * This macro is used to determine the 'next' bio in the list, given the sector + * of the current stripe+device + */ +#define r5_next_bio(bio, sect) ( ( (bio)->bi_sector + ((bio)->bi_size>>9) < sect + STRIPE_SECTORS) ? (bio)->bi_next : NULL) +/* + * The following can be used to debug the driver + */ +#define RAID6_DEBUG 0 /* Extremely verbose printk */ +#define RAID6_PARANOIA 1 /* Check spinlocks */ +#define RAID6_DUMPSTATE 0 /* Include stripe cache state in /proc/mdstat */ +#if RAID6_PARANOIA && defined(CONFIG_SMP) +# define CHECK_DEVLOCK() assert_spin_locked(&conf->device_lock) +#else +# define CHECK_DEVLOCK() +#endif + +#define PRINTK(x...) ((void)(RAID6_DEBUG && printk(KERN_DEBUG x))) +#if RAID6_DEBUG +#undef inline +#undef __inline__ +#define inline +#define __inline__ +#endif + +#if !RAID6_USE_EMPTY_ZERO_PAGE +/* In .bss so it's zeroed */ +const char raid6_empty_zero_page[PAGE_SIZE] __attribute__((aligned(256))); +#endif + +static inline int raid6_next_disk(int disk, int raid_disks) +{ + disk++; + return (disk < raid_disks) ? disk : 0; +} + +static void print_raid6_conf (raid6_conf_t *conf); + +static void __release_stripe(raid6_conf_t *conf, struct stripe_head *sh) +{ + if (atomic_dec_and_test(&sh->count)) { + BUG_ON(!list_empty(&sh->lru)); + BUG_ON(atomic_read(&conf->active_stripes)==0); + if (test_bit(STRIPE_HANDLE, &sh->state)) { + if (test_bit(STRIPE_DELAYED, &sh->state)) + list_add_tail(&sh->lru, &conf->delayed_list); + else if (test_bit(STRIPE_BIT_DELAY, &sh->state) && + conf->seq_write == sh->bm_seq) + list_add_tail(&sh->lru, &conf->bitmap_list); + else { + clear_bit(STRIPE_BIT_DELAY, &sh->state); + list_add_tail(&sh->lru, &conf->handle_list); + } + md_wakeup_thread(conf->mddev->thread); + } else { + if (test_and_clear_bit(STRIPE_PREREAD_ACTIVE, &sh->state)) { + atomic_dec(&conf->preread_active_stripes); + if (atomic_read(&conf->preread_active_stripes) < IO_THRESHOLD) + md_wakeup_thread(conf->mddev->thread); + } + list_add_tail(&sh->lru, &conf->inactive_list); + atomic_dec(&conf->active_stripes); + if (!conf->inactive_blocked || + atomic_read(&conf->active_stripes) < (conf->max_nr_stripes*3/4)) + wake_up(&conf->wait_for_stripe); + } + } +} +static void release_stripe(struct stripe_head *sh) +{ + raid6_conf_t *conf = sh->raid_conf; + unsigned long flags; + + spin_lock_irqsave(&conf->device_lock, flags); + __release_stripe(conf, sh); + spin_unlock_irqrestore(&conf->device_lock, flags); +} + +static inline void remove_hash(struct stripe_head *sh) +{ + PRINTK("remove_hash(), stripe %llu\n", (unsigned long long)sh->sector); + + hlist_del_init(&sh->hash); +} + +static inline void insert_hash(raid6_conf_t *conf, struct stripe_head *sh) +{ + struct hlist_head *hp = stripe_hash(conf, sh->sector); + + PRINTK("insert_hash(), stripe %llu\n", (unsigned long long)sh->sector); + + CHECK_DEVLOCK(); + hlist_add_head(&sh->hash, hp); +} + + +/* find an idle stripe, make sure it is unhashed, and return it. */ +static struct stripe_head *get_free_stripe(raid6_conf_t *conf) +{ + struct stripe_head *sh = NULL; + struct list_head *first; + + CHECK_DEVLOCK(); + if (list_empty(&conf->inactive_list)) + goto out; + first = conf->inactive_list.next; + sh = list_entry(first, struct stripe_head, lru); + list_del_init(first); + remove_hash(sh); + atomic_inc(&conf->active_stripes); +out: + return sh; +} + +static void shrink_buffers(struct stripe_head *sh, int num) +{ + struct page *p; + int i; + + for (i=0; idev[i].page; + if (!p) + continue; + sh->dev[i].page = NULL; + put_page(p); + } +} + +static int grow_buffers(struct stripe_head *sh, int num) +{ + int i; + + for (i=0; idev[i].page = page; + } + return 0; +} + +static void raid6_build_block (struct stripe_head *sh, int i); + +static void init_stripe(struct stripe_head *sh, sector_t sector, int pd_idx) +{ + raid6_conf_t *conf = sh->raid_conf; + int disks = conf->raid_disks, i; + + BUG_ON(atomic_read(&sh->count) != 0); + BUG_ON(test_bit(STRIPE_HANDLE, &sh->state)); + + CHECK_DEVLOCK(); + PRINTK("init_stripe called, stripe %llu\n", + (unsigned long long)sh->sector); + + remove_hash(sh); + + sh->sector = sector; + sh->pd_idx = pd_idx; + sh->state = 0; + + for (i=disks; i--; ) { + struct r5dev *dev = &sh->dev[i]; + + if (dev->toread || dev->towrite || dev->written || + test_bit(R5_LOCKED, &dev->flags)) { + PRINTK("sector=%llx i=%d %p %p %p %d\n", + (unsigned long long)sh->sector, i, dev->toread, + dev->towrite, dev->written, + test_bit(R5_LOCKED, &dev->flags)); + BUG(); + } + dev->flags = 0; + raid6_build_block(sh, i); + } + insert_hash(conf, sh); +} + +static struct stripe_head *__find_stripe(raid6_conf_t *conf, sector_t sector) +{ + struct stripe_head *sh; + struct hlist_node *hn; + + CHECK_DEVLOCK(); + PRINTK("__find_stripe, sector %llu\n", (unsigned long long)sector); + hlist_for_each_entry (sh, hn, stripe_hash(conf, sector), hash) + if (sh->sector == sector) + return sh; + PRINTK("__stripe %llu not in cache\n", (unsigned long long)sector); + return NULL; +} + +static void unplug_slaves(mddev_t *mddev); + +static struct stripe_head *get_active_stripe(raid6_conf_t *conf, sector_t sector, + int pd_idx, int noblock) +{ + struct stripe_head *sh; + + PRINTK("get_stripe, sector %llu\n", (unsigned long long)sector); + + spin_lock_irq(&conf->device_lock); + + do { + wait_event_lock_irq(conf->wait_for_stripe, + conf->quiesce == 0, + conf->device_lock, /* nothing */); + sh = __find_stripe(conf, sector); + if (!sh) { + if (!conf->inactive_blocked) + sh = get_free_stripe(conf); + if (noblock && sh == NULL) + break; + if (!sh) { + conf->inactive_blocked = 1; + wait_event_lock_irq(conf->wait_for_stripe, + !list_empty(&conf->inactive_list) && + (atomic_read(&conf->active_stripes) + < (conf->max_nr_stripes *3/4) + || !conf->inactive_blocked), + conf->device_lock, + unplug_slaves(conf->mddev); + ); + conf->inactive_blocked = 0; + } else + init_stripe(sh, sector, pd_idx); + } else { + if (atomic_read(&sh->count)) { + BUG_ON(!list_empty(&sh->lru)); + } else { + if (!test_bit(STRIPE_HANDLE, &sh->state)) + atomic_inc(&conf->active_stripes); + BUG_ON(list_empty(&sh->lru)); + list_del_init(&sh->lru); + } + } + } while (sh == NULL); + + if (sh) + atomic_inc(&sh->count); + + spin_unlock_irq(&conf->device_lock); + return sh; +} + +static int grow_one_stripe(raid6_conf_t *conf) +{ + struct stripe_head *sh; + sh = kmem_cache_alloc(conf->slab_cache, GFP_KERNEL); + if (!sh) + return 0; + memset(sh, 0, sizeof(*sh) + (conf->raid_disks-1)*sizeof(struct r5dev)); + sh->raid_conf = conf; + spin_lock_init(&sh->lock); + + if (grow_buffers(sh, conf->raid_disks)) { + shrink_buffers(sh, conf->raid_disks); + kmem_cache_free(conf->slab_cache, sh); + return 0; + } + /* we just created an active stripe so... */ + atomic_set(&sh->count, 1); + atomic_inc(&conf->active_stripes); + INIT_LIST_HEAD(&sh->lru); + release_stripe(sh); + return 1; +} + +static int grow_stripes(raid6_conf_t *conf, int num) +{ + kmem_cache_t *sc; + int devs = conf->raid_disks; + + sprintf(conf->cache_name[0], "raid6/%s", mdname(conf->mddev)); + + sc = kmem_cache_create(conf->cache_name[0], + sizeof(struct stripe_head)+(devs-1)*sizeof(struct r5dev), + 0, 0, NULL, NULL); + if (!sc) + return 1; + conf->slab_cache = sc; + while (num--) + if (!grow_one_stripe(conf)) + return 1; + return 0; +} + +static int drop_one_stripe(raid6_conf_t *conf) +{ + struct stripe_head *sh; + spin_lock_irq(&conf->device_lock); + sh = get_free_stripe(conf); + spin_unlock_irq(&conf->device_lock); + if (!sh) + return 0; + BUG_ON(atomic_read(&sh->count)); + shrink_buffers(sh, conf->raid_disks); + kmem_cache_free(conf->slab_cache, sh); + atomic_dec(&conf->active_stripes); + return 1; +} + +static void shrink_stripes(raid6_conf_t *conf) +{ + while (drop_one_stripe(conf)) + ; + + if (conf->slab_cache) + kmem_cache_destroy(conf->slab_cache); + conf->slab_cache = NULL; +} + +static int raid6_end_read_request(struct bio * bi, unsigned int bytes_done, + int error) +{ + struct stripe_head *sh = bi->bi_private; + raid6_conf_t *conf = sh->raid_conf; + int disks = conf->raid_disks, i; + int uptodate = test_bit(BIO_UPTODATE, &bi->bi_flags); + + if (bi->bi_size) + return 1; + + for (i=0 ; idev[i].req) + break; + + PRINTK("end_read_request %llu/%d, count: %d, uptodate %d.\n", + (unsigned long long)sh->sector, i, atomic_read(&sh->count), + uptodate); + if (i == disks) { + BUG(); + return 0; + } + + if (uptodate) { +#if 0 + struct bio *bio; + unsigned long flags; + spin_lock_irqsave(&conf->device_lock, flags); + /* we can return a buffer if we bypassed the cache or + * if the top buffer is not in highmem. If there are + * multiple buffers, leave the extra work to + * handle_stripe + */ + buffer = sh->bh_read[i]; + if (buffer && + (!PageHighMem(buffer->b_page) + || buffer->b_page == bh->b_page ) + ) { + sh->bh_read[i] = buffer->b_reqnext; + buffer->b_reqnext = NULL; + } else + buffer = NULL; + spin_unlock_irqrestore(&conf->device_lock, flags); + if (sh->bh_page[i]==bh->b_page) + set_buffer_uptodate(bh); + if (buffer) { + if (buffer->b_page != bh->b_page) + memcpy(buffer->b_data, bh->b_data, bh->b_size); + buffer->b_end_io(buffer, 1); + } +#else + set_bit(R5_UPTODATE, &sh->dev[i].flags); +#endif + if (test_bit(R5_ReadError, &sh->dev[i].flags)) { + printk(KERN_INFO "raid6: read error corrected!!\n"); + clear_bit(R5_ReadError, &sh->dev[i].flags); + clear_bit(R5_ReWrite, &sh->dev[i].flags); + } + if (atomic_read(&conf->disks[i].rdev->read_errors)) + atomic_set(&conf->disks[i].rdev->read_errors, 0); + } else { + int retry = 0; + clear_bit(R5_UPTODATE, &sh->dev[i].flags); + atomic_inc(&conf->disks[i].rdev->read_errors); + if (conf->mddev->degraded) + printk(KERN_WARNING "raid6: read error not correctable.\n"); + else if (test_bit(R5_ReWrite, &sh->dev[i].flags)) + /* Oh, no!!! */ + printk(KERN_WARNING "raid6: read error NOT corrected!!\n"); + else if (atomic_read(&conf->disks[i].rdev->read_errors) + > conf->max_nr_stripes) + printk(KERN_WARNING + "raid6: Too many read errors, failing device.\n"); + else + retry = 1; + if (retry) + set_bit(R5_ReadError, &sh->dev[i].flags); + else { + clear_bit(R5_ReadError, &sh->dev[i].flags); + clear_bit(R5_ReWrite, &sh->dev[i].flags); + md_error(conf->mddev, conf->disks[i].rdev); + } + } + rdev_dec_pending(conf->disks[i].rdev, conf->mddev); +#if 0 + /* must restore b_page before unlocking buffer... */ + if (sh->bh_page[i] != bh->b_page) { + bh->b_page = sh->bh_page[i]; + bh->b_data = page_address(bh->b_page); + clear_buffer_uptodate(bh); + } +#endif + clear_bit(R5_LOCKED, &sh->dev[i].flags); + set_bit(STRIPE_HANDLE, &sh->state); + release_stripe(sh); + return 0; +} + +static int raid6_end_write_request (struct bio *bi, unsigned int bytes_done, + int error) +{ + struct stripe_head *sh = bi->bi_private; + raid6_conf_t *conf = sh->raid_conf; + int disks = conf->raid_disks, i; + unsigned long flags; + int uptodate = test_bit(BIO_UPTODATE, &bi->bi_flags); + + if (bi->bi_size) + return 1; + + for (i=0 ; idev[i].req) + break; + + PRINTK("end_write_request %llu/%d, count %d, uptodate: %d.\n", + (unsigned long long)sh->sector, i, atomic_read(&sh->count), + uptodate); + if (i == disks) { + BUG(); + return 0; + } + + spin_lock_irqsave(&conf->device_lock, flags); + if (!uptodate) + md_error(conf->mddev, conf->disks[i].rdev); + + rdev_dec_pending(conf->disks[i].rdev, conf->mddev); + + clear_bit(R5_LOCKED, &sh->dev[i].flags); + set_bit(STRIPE_HANDLE, &sh->state); + __release_stripe(conf, sh); + spin_unlock_irqrestore(&conf->device_lock, flags); + return 0; +} + + +static sector_t compute_blocknr(struct stripe_head *sh, int i); + +static void raid6_build_block (struct stripe_head *sh, int i) +{ + struct r5dev *dev = &sh->dev[i]; + int pd_idx = sh->pd_idx; + int qd_idx = raid6_next_disk(pd_idx, sh->raid_conf->raid_disks); + + bio_init(&dev->req); + dev->req.bi_io_vec = &dev->vec; + dev->req.bi_vcnt++; + dev->req.bi_max_vecs++; + dev->vec.bv_page = dev->page; + dev->vec.bv_len = STRIPE_SIZE; + dev->vec.bv_offset = 0; + + dev->req.bi_sector = sh->sector; + dev->req.bi_private = sh; + + dev->flags = 0; + if (i != pd_idx && i != qd_idx) + dev->sector = compute_blocknr(sh, i); +} + +static void error(mddev_t *mddev, mdk_rdev_t *rdev) +{ + char b[BDEVNAME_SIZE]; + raid6_conf_t *conf = (raid6_conf_t *) mddev->private; + PRINTK("raid6: error called\n"); + + if (!test_bit(Faulty, &rdev->flags)) { + mddev->sb_dirty = 1; + if (test_bit(In_sync, &rdev->flags)) { + conf->working_disks--; + mddev->degraded++; + conf->failed_disks++; + clear_bit(In_sync, &rdev->flags); + /* + * if recovery was running, make sure it aborts. + */ + set_bit(MD_RECOVERY_ERR, &mddev->recovery); + } + set_bit(Faulty, &rdev->flags); + printk (KERN_ALERT + "raid6: Disk failure on %s, disabling device." + " Operation continuing on %d devices\n", + bdevname(rdev->bdev,b), conf->working_disks); + } +} + +/* + * Input: a 'big' sector number, + * Output: index of the data and parity disk, and the sector # in them. + */ +static sector_t raid6_compute_sector(sector_t r_sector, unsigned int raid_disks, + unsigned int data_disks, unsigned int * dd_idx, + unsigned int * pd_idx, raid6_conf_t *conf) +{ + long stripe; + unsigned long chunk_number; + unsigned int chunk_offset; + sector_t new_sector; + int sectors_per_chunk = conf->chunk_size >> 9; + + /* First compute the information on this sector */ + + /* + * Compute the chunk number and the sector offset inside the chunk + */ + chunk_offset = sector_div(r_sector, sectors_per_chunk); + chunk_number = r_sector; + if ( r_sector != chunk_number ) { + printk(KERN_CRIT "raid6: ERROR: r_sector = %llu, chunk_number = %lu\n", + (unsigned long long)r_sector, (unsigned long)chunk_number); + BUG(); + } + + /* + * Compute the stripe number + */ + stripe = chunk_number / data_disks; + + /* + * Compute the data disk and parity disk indexes inside the stripe + */ + *dd_idx = chunk_number % data_disks; + + /* + * Select the parity disk based on the user selected algorithm. + */ + + /**** FIX THIS ****/ + switch (conf->algorithm) { + case ALGORITHM_LEFT_ASYMMETRIC: + *pd_idx = raid_disks - 1 - (stripe % raid_disks); + if (*pd_idx == raid_disks-1) + (*dd_idx)++; /* Q D D D P */ + else if (*dd_idx >= *pd_idx) + (*dd_idx) += 2; /* D D P Q D */ + break; + case ALGORITHM_RIGHT_ASYMMETRIC: + *pd_idx = stripe % raid_disks; + if (*pd_idx == raid_disks-1) + (*dd_idx)++; /* Q D D D P */ + else if (*dd_idx >= *pd_idx) + (*dd_idx) += 2; /* D D P Q D */ + break; + case ALGORITHM_LEFT_SYMMETRIC: + *pd_idx = raid_disks - 1 - (stripe % raid_disks); + *dd_idx = (*pd_idx + 2 + *dd_idx) % raid_disks; + break; + case ALGORITHM_RIGHT_SYMMETRIC: + *pd_idx = stripe % raid_disks; + *dd_idx = (*pd_idx + 2 + *dd_idx) % raid_disks; + break; + default: + printk (KERN_CRIT "raid6: unsupported algorithm %d\n", + conf->algorithm); + } + + PRINTK("raid6: chunk_number = %lu, pd_idx = %u, dd_idx = %u\n", + chunk_number, *pd_idx, *dd_idx); + + /* + * Finally, compute the new sector number + */ + new_sector = (sector_t) stripe * sectors_per_chunk + chunk_offset; + return new_sector; +} + + +static sector_t compute_blocknr(struct stripe_head *sh, int i) +{ + raid6_conf_t *conf = sh->raid_conf; + int raid_disks = conf->raid_disks, data_disks = raid_disks - 2; + sector_t new_sector = sh->sector, check; + int sectors_per_chunk = conf->chunk_size >> 9; + sector_t stripe; + int chunk_offset; + int chunk_number, dummy1, dummy2, dd_idx = i; + sector_t r_sector; + int i0 = i; + + chunk_offset = sector_div(new_sector, sectors_per_chunk); + stripe = new_sector; + if ( new_sector != stripe ) { + printk(KERN_CRIT "raid6: ERROR: new_sector = %llu, stripe = %lu\n", + (unsigned long long)new_sector, (unsigned long)stripe); + BUG(); + } + + switch (conf->algorithm) { + case ALGORITHM_LEFT_ASYMMETRIC: + case ALGORITHM_RIGHT_ASYMMETRIC: + if (sh->pd_idx == raid_disks-1) + i--; /* Q D D D P */ + else if (i > sh->pd_idx) + i -= 2; /* D D P Q D */ + break; + case ALGORITHM_LEFT_SYMMETRIC: + case ALGORITHM_RIGHT_SYMMETRIC: + if (sh->pd_idx == raid_disks-1) + i--; /* Q D D D P */ + else { + /* D D P Q D */ + if (i < sh->pd_idx) + i += raid_disks; + i -= (sh->pd_idx + 2); + } + break; + default: + printk (KERN_CRIT "raid6: unsupported algorithm %d\n", + conf->algorithm); + } + + PRINTK("raid6: compute_blocknr: pd_idx = %u, i0 = %u, i = %u\n", sh->pd_idx, i0, i); + + chunk_number = stripe * data_disks + i; + r_sector = (sector_t)chunk_number * sectors_per_chunk + chunk_offset; + + check = raid6_compute_sector (r_sector, raid_disks, data_disks, &dummy1, &dummy2, conf); + if (check != sh->sector || dummy1 != dd_idx || dummy2 != sh->pd_idx) { + printk(KERN_CRIT "raid6: compute_blocknr: map not correct\n"); + return 0; + } + return r_sector; +} + + + +/* + * Copy data between a page in the stripe cache, and one or more bion + * The page could align with the middle of the bio, or there could be + * several bion, each with several bio_vecs, which cover part of the page + * Multiple bion are linked together on bi_next. There may be extras + * at the end of this list. We ignore them. + */ +static void copy_data(int frombio, struct bio *bio, + struct page *page, + sector_t sector) +{ + char *pa = page_address(page); + struct bio_vec *bvl; + int i; + int page_offset; + + if (bio->bi_sector >= sector) + page_offset = (signed)(bio->bi_sector - sector) * 512; + else + page_offset = (signed)(sector - bio->bi_sector) * -512; + bio_for_each_segment(bvl, bio, i) { + int len = bio_iovec_idx(bio,i)->bv_len; + int clen; + int b_offset = 0; + + if (page_offset < 0) { + b_offset = -page_offset; + page_offset += b_offset; + len -= b_offset; + } + + if (len > 0 && page_offset + len > STRIPE_SIZE) + clen = STRIPE_SIZE - page_offset; + else clen = len; + + if (clen > 0) { + char *ba = __bio_kmap_atomic(bio, i, KM_USER0); + if (frombio) + memcpy(pa+page_offset, ba+b_offset, clen); + else + memcpy(ba+b_offset, pa+page_offset, clen); + __bio_kunmap_atomic(ba, KM_USER0); + } + if (clen < len) /* hit end of page */ + break; + page_offset += len; + } +} + +#define check_xor() do { \ + if (count == MAX_XOR_BLOCKS) { \ + xor_block(count, STRIPE_SIZE, ptr); \ + count = 1; \ + } \ + } while(0) + +/* Compute P and Q syndromes */ +static void compute_parity(struct stripe_head *sh, int method) +{ + raid6_conf_t *conf = sh->raid_conf; + int i, pd_idx = sh->pd_idx, qd_idx, d0_idx, disks = conf->raid_disks, count; + struct bio *chosen; + /**** FIX THIS: This could be very bad if disks is close to 256 ****/ + void *ptrs[disks]; + + qd_idx = raid6_next_disk(pd_idx, disks); + d0_idx = raid6_next_disk(qd_idx, disks); + + PRINTK("compute_parity, stripe %llu, method %d\n", + (unsigned long long)sh->sector, method); + + switch(method) { + case READ_MODIFY_WRITE: + BUG(); /* READ_MODIFY_WRITE N/A for RAID-6 */ + case RECONSTRUCT_WRITE: + for (i= disks; i-- ;) + if ( i != pd_idx && i != qd_idx && sh->dev[i].towrite ) { + chosen = sh->dev[i].towrite; + sh->dev[i].towrite = NULL; + + if (test_and_clear_bit(R5_Overlap, &sh->dev[i].flags)) + wake_up(&conf->wait_for_overlap); + + BUG_ON(sh->dev[i].written); + sh->dev[i].written = chosen; + } + break; + case CHECK_PARITY: + BUG(); /* Not implemented yet */ + } + + for (i = disks; i--;) + if (sh->dev[i].written) { + sector_t sector = sh->dev[i].sector; + struct bio *wbi = sh->dev[i].written; + while (wbi && wbi->bi_sector < sector + STRIPE_SECTORS) { + copy_data(1, wbi, sh->dev[i].page, sector); + wbi = r5_next_bio(wbi, sector); + } + + set_bit(R5_LOCKED, &sh->dev[i].flags); + set_bit(R5_UPTODATE, &sh->dev[i].flags); + } + +// switch(method) { +// case RECONSTRUCT_WRITE: +// case CHECK_PARITY: +// case UPDATE_PARITY: + /* Note that unlike RAID-5, the ordering of the disks matters greatly. */ + /* FIX: Is this ordering of drives even remotely optimal? */ + count = 0; + i = d0_idx; + do { + ptrs[count++] = page_address(sh->dev[i].page); + if (count <= disks-2 && !test_bit(R5_UPTODATE, &sh->dev[i].flags)) + printk("block %d/%d not uptodate on parity calc\n", i,count); + i = raid6_next_disk(i, disks); + } while ( i != d0_idx ); +// break; +// } + + raid6_call.gen_syndrome(disks, STRIPE_SIZE, ptrs); + + switch(method) { + case RECONSTRUCT_WRITE: + set_bit(R5_UPTODATE, &sh->dev[pd_idx].flags); + set_bit(R5_UPTODATE, &sh->dev[qd_idx].flags); + set_bit(R5_LOCKED, &sh->dev[pd_idx].flags); + set_bit(R5_LOCKED, &sh->dev[qd_idx].flags); + break; + case UPDATE_PARITY: + set_bit(R5_UPTODATE, &sh->dev[pd_idx].flags); + set_bit(R5_UPTODATE, &sh->dev[qd_idx].flags); + break; + } +} + +/* Compute one missing block */ +static void compute_block_1(struct stripe_head *sh, int dd_idx, int nozero) +{ + raid6_conf_t *conf = sh->raid_conf; + int i, count, disks = conf->raid_disks; + void *ptr[MAX_XOR_BLOCKS], *p; + int pd_idx = sh->pd_idx; + int qd_idx = raid6_next_disk(pd_idx, disks); + + PRINTK("compute_block_1, stripe %llu, idx %d\n", + (unsigned long long)sh->sector, dd_idx); + + if ( dd_idx == qd_idx ) { + /* We're actually computing the Q drive */ + compute_parity(sh, UPDATE_PARITY); + } else { + ptr[0] = page_address(sh->dev[dd_idx].page); + if (!nozero) memset(ptr[0], 0, STRIPE_SIZE); + count = 1; + for (i = disks ; i--; ) { + if (i == dd_idx || i == qd_idx) + continue; + p = page_address(sh->dev[i].page); + if (test_bit(R5_UPTODATE, &sh->dev[i].flags)) + ptr[count++] = p; + else + printk("compute_block() %d, stripe %llu, %d" + " not present\n", dd_idx, + (unsigned long long)sh->sector, i); + + check_xor(); + } + if (count != 1) + xor_block(count, STRIPE_SIZE, ptr); + if (!nozero) set_bit(R5_UPTODATE, &sh->dev[dd_idx].flags); + else clear_bit(R5_UPTODATE, &sh->dev[dd_idx].flags); + } +} + +/* Compute two missing blocks */ +static void compute_block_2(struct stripe_head *sh, int dd_idx1, int dd_idx2) +{ + raid6_conf_t *conf = sh->raid_conf; + int i, count, disks = conf->raid_disks; + int pd_idx = sh->pd_idx; + int qd_idx = raid6_next_disk(pd_idx, disks); + int d0_idx = raid6_next_disk(qd_idx, disks); + int faila, failb; + + /* faila and failb are disk numbers relative to d0_idx */ + /* pd_idx become disks-2 and qd_idx become disks-1 */ + faila = (dd_idx1 < d0_idx) ? dd_idx1+(disks-d0_idx) : dd_idx1-d0_idx; + failb = (dd_idx2 < d0_idx) ? dd_idx2+(disks-d0_idx) : dd_idx2-d0_idx; + + BUG_ON(faila == failb); + if ( failb < faila ) { int tmp = faila; faila = failb; failb = tmp; } + + PRINTK("compute_block_2, stripe %llu, idx %d,%d (%d,%d)\n", + (unsigned long long)sh->sector, dd_idx1, dd_idx2, faila, failb); + + if ( failb == disks-1 ) { + /* Q disk is one of the missing disks */ + if ( faila == disks-2 ) { + /* Missing P+Q, just recompute */ + compute_parity(sh, UPDATE_PARITY); + return; + } else { + /* We're missing D+Q; recompute D from P */ + compute_block_1(sh, (dd_idx1 == qd_idx) ? dd_idx2 : dd_idx1, 0); + compute_parity(sh, UPDATE_PARITY); /* Is this necessary? */ + return; + } + } + + /* We're missing D+P or D+D; build pointer table */ + { + /**** FIX THIS: This could be very bad if disks is close to 256 ****/ + void *ptrs[disks]; + + count = 0; + i = d0_idx; + do { + ptrs[count++] = page_address(sh->dev[i].page); + i = raid6_next_disk(i, disks); + if (i != dd_idx1 && i != dd_idx2 && + !test_bit(R5_UPTODATE, &sh->dev[i].flags)) + printk("compute_2 with missing block %d/%d\n", count, i); + } while ( i != d0_idx ); + + if ( failb == disks-2 ) { + /* We're missing D+P. */ + raid6_datap_recov(disks, STRIPE_SIZE, faila, ptrs); + } else { + /* We're missing D+D. */ + raid6_2data_recov(disks, STRIPE_SIZE, faila, failb, ptrs); + } + + /* Both the above update both missing blocks */ + set_bit(R5_UPTODATE, &sh->dev[dd_idx1].flags); + set_bit(R5_UPTODATE, &sh->dev[dd_idx2].flags); + } +} + + +/* + * Each stripe/dev can have one or more bion attached. + * toread/towrite point to the first in a chain. + * The bi_next chain must be in order. + */ +static int add_stripe_bio(struct stripe_head *sh, struct bio *bi, int dd_idx, int forwrite) +{ + struct bio **bip; + raid6_conf_t *conf = sh->raid_conf; + int firstwrite=0; + + PRINTK("adding bh b#%llu to stripe s#%llu\n", + (unsigned long long)bi->bi_sector, + (unsigned long long)sh->sector); + + + spin_lock(&sh->lock); + spin_lock_irq(&conf->device_lock); + if (forwrite) { + bip = &sh->dev[dd_idx].towrite; + if (*bip == NULL && sh->dev[dd_idx].written == NULL) + firstwrite = 1; + } else + bip = &sh->dev[dd_idx].toread; + while (*bip && (*bip)->bi_sector < bi->bi_sector) { + if ((*bip)->bi_sector + ((*bip)->bi_size >> 9) > bi->bi_sector) + goto overlap; + bip = &(*bip)->bi_next; + } + if (*bip && (*bip)->bi_sector < bi->bi_sector + ((bi->bi_size)>>9)) + goto overlap; + + BUG_ON(*bip && bi->bi_next && (*bip) != bi->bi_next); + if (*bip) + bi->bi_next = *bip; + *bip = bi; + bi->bi_phys_segments ++; + spin_unlock_irq(&conf->device_lock); + spin_unlock(&sh->lock); + + PRINTK("added bi b#%llu to stripe s#%llu, disk %d.\n", + (unsigned long long)bi->bi_sector, + (unsigned long long)sh->sector, dd_idx); + + if (conf->mddev->bitmap && firstwrite) { + sh->bm_seq = conf->seq_write; + bitmap_startwrite(conf->mddev->bitmap, sh->sector, + STRIPE_SECTORS, 0); + set_bit(STRIPE_BIT_DELAY, &sh->state); + } + + if (forwrite) { + /* check if page is covered */ + sector_t sector = sh->dev[dd_idx].sector; + for (bi=sh->dev[dd_idx].towrite; + sector < sh->dev[dd_idx].sector + STRIPE_SECTORS && + bi && bi->bi_sector <= sector; + bi = r5_next_bio(bi, sh->dev[dd_idx].sector)) { + if (bi->bi_sector + (bi->bi_size>>9) >= sector) + sector = bi->bi_sector + (bi->bi_size>>9); + } + if (sector >= sh->dev[dd_idx].sector + STRIPE_SECTORS) + set_bit(R5_OVERWRITE, &sh->dev[dd_idx].flags); + } + return 1; + + overlap: + set_bit(R5_Overlap, &sh->dev[dd_idx].flags); + spin_unlock_irq(&conf->device_lock); + spin_unlock(&sh->lock); + return 0; +} + + +static int page_is_zero(struct page *p) +{ + char *a = page_address(p); + return ((*(u32*)a) == 0 && + memcmp(a, a+4, STRIPE_SIZE-4)==0); +} +/* + * handle_stripe - do things to a stripe. + * + * We lock the stripe and then examine the state of various bits + * to see what needs to be done. + * Possible results: + * return some read request which now have data + * return some write requests which are safely on disc + * schedule a read on some buffers + * schedule a write of some buffers + * return confirmation of parity correctness + * + * Parity calculations are done inside the stripe lock + * buffers are taken off read_list or write_list, and bh_cache buffers + * get BH_Lock set before the stripe lock is released. + * + */ + +static void handle_stripe(struct stripe_head *sh, struct page *tmp_page) +{ + raid6_conf_t *conf = sh->raid_conf; + int disks = conf->raid_disks; + struct bio *return_bi= NULL; + struct bio *bi; + int i; + int syncing; + int locked=0, uptodate=0, to_read=0, to_write=0, failed=0, written=0; + int non_overwrite = 0; + int failed_num[2] = {0, 0}; + struct r5dev *dev, *pdev, *qdev; + int pd_idx = sh->pd_idx; + int qd_idx = raid6_next_disk(pd_idx, disks); + int p_failed, q_failed; + + PRINTK("handling stripe %llu, state=%#lx cnt=%d, pd_idx=%d, qd_idx=%d\n", + (unsigned long long)sh->sector, sh->state, atomic_read(&sh->count), + pd_idx, qd_idx); + + spin_lock(&sh->lock); + clear_bit(STRIPE_HANDLE, &sh->state); + clear_bit(STRIPE_DELAYED, &sh->state); + + syncing = test_bit(STRIPE_SYNCING, &sh->state); + /* Now to look around and see what can be done */ + + rcu_read_lock(); + for (i=disks; i--; ) { + mdk_rdev_t *rdev; + dev = &sh->dev[i]; + clear_bit(R5_Insync, &dev->flags); + + PRINTK("check %d: state 0x%lx read %p write %p written %p\n", + i, dev->flags, dev->toread, dev->towrite, dev->written); + /* maybe we can reply to a read */ + if (test_bit(R5_UPTODATE, &dev->flags) && dev->toread) { + struct bio *rbi, *rbi2; + PRINTK("Return read for disc %d\n", i); + spin_lock_irq(&conf->device_lock); + rbi = dev->toread; + dev->toread = NULL; + if (test_and_clear_bit(R5_Overlap, &dev->flags)) + wake_up(&conf->wait_for_overlap); + spin_unlock_irq(&conf->device_lock); + while (rbi && rbi->bi_sector < dev->sector + STRIPE_SECTORS) { + copy_data(0, rbi, dev->page, dev->sector); + rbi2 = r5_next_bio(rbi, dev->sector); + spin_lock_irq(&conf->device_lock); + if (--rbi->bi_phys_segments == 0) { + rbi->bi_next = return_bi; + return_bi = rbi; + } + spin_unlock_irq(&conf->device_lock); + rbi = rbi2; + } + } + + /* now count some things */ + if (test_bit(R5_LOCKED, &dev->flags)) locked++; + if (test_bit(R5_UPTODATE, &dev->flags)) uptodate++; + + + if (dev->toread) to_read++; + if (dev->towrite) { + to_write++; + if (!test_bit(R5_OVERWRITE, &dev->flags)) + non_overwrite++; + } + if (dev->written) written++; + rdev = rcu_dereference(conf->disks[i].rdev); + if (!rdev || !test_bit(In_sync, &rdev->flags)) { + /* The ReadError flag will just be confusing now */ + clear_bit(R5_ReadError, &dev->flags); + clear_bit(R5_ReWrite, &dev->flags); + } + if (!rdev || !test_bit(In_sync, &rdev->flags) + || test_bit(R5_ReadError, &dev->flags)) { + if ( failed < 2 ) + failed_num[failed] = i; + failed++; + } else + set_bit(R5_Insync, &dev->flags); + } + rcu_read_unlock(); + PRINTK("locked=%d uptodate=%d to_read=%d" + " to_write=%d failed=%d failed_num=%d,%d\n", + locked, uptodate, to_read, to_write, failed, + failed_num[0], failed_num[1]); + /* check if the array has lost >2 devices and, if so, some requests might + * need to be failed + */ + if (failed > 2 && to_read+to_write+written) { + for (i=disks; i--; ) { + int bitmap_end = 0; + + if (test_bit(R5_ReadError, &sh->dev[i].flags)) { + mdk_rdev_t *rdev; + rcu_read_lock(); + rdev = rcu_dereference(conf->disks[i].rdev); + if (rdev && test_bit(In_sync, &rdev->flags)) + /* multiple read failures in one stripe */ + md_error(conf->mddev, rdev); + rcu_read_unlock(); + } + + spin_lock_irq(&conf->device_lock); + /* fail all writes first */ + bi = sh->dev[i].towrite; + sh->dev[i].towrite = NULL; + if (bi) { to_write--; bitmap_end = 1; } + + if (test_and_clear_bit(R5_Overlap, &sh->dev[i].flags)) + wake_up(&conf->wait_for_overlap); + + while (bi && bi->bi_sector < sh->dev[i].sector + STRIPE_SECTORS){ + struct bio *nextbi = r5_next_bio(bi, sh->dev[i].sector); + clear_bit(BIO_UPTODATE, &bi->bi_flags); + if (--bi->bi_phys_segments == 0) { + md_write_end(conf->mddev); + bi->bi_next = return_bi; + return_bi = bi; + } + bi = nextbi; + } + /* and fail all 'written' */ + bi = sh->dev[i].written; + sh->dev[i].written = NULL; + if (bi) bitmap_end = 1; + while (bi && bi->bi_sector < sh->dev[i].sector + STRIPE_SECTORS) { + struct bio *bi2 = r5_next_bio(bi, sh->dev[i].sector); + clear_bit(BIO_UPTODATE, &bi->bi_flags); + if (--bi->bi_phys_segments == 0) { + md_write_end(conf->mddev); + bi->bi_next = return_bi; + return_bi = bi; + } + bi = bi2; + } + + /* fail any reads if this device is non-operational */ + if (!test_bit(R5_Insync, &sh->dev[i].flags) || + test_bit(R5_ReadError, &sh->dev[i].flags)) { + bi = sh->dev[i].toread; + sh->dev[i].toread = NULL; + if (test_and_clear_bit(R5_Overlap, &sh->dev[i].flags)) + wake_up(&conf->wait_for_overlap); + if (bi) to_read--; + while (bi && bi->bi_sector < sh->dev[i].sector + STRIPE_SECTORS){ + struct bio *nextbi = r5_next_bio(bi, sh->dev[i].sector); + clear_bit(BIO_UPTODATE, &bi->bi_flags); + if (--bi->bi_phys_segments == 0) { + bi->bi_next = return_bi; + return_bi = bi; + } + bi = nextbi; + } + } + spin_unlock_irq(&conf->device_lock); + if (bitmap_end) + bitmap_endwrite(conf->mddev->bitmap, sh->sector, + STRIPE_SECTORS, 0, 0); + } + } + if (failed > 2 && syncing) { + md_done_sync(conf->mddev, STRIPE_SECTORS,0); + clear_bit(STRIPE_SYNCING, &sh->state); + syncing = 0; + } + + /* + * might be able to return some write requests if the parity blocks + * are safe, or on a failed drive + */ + pdev = &sh->dev[pd_idx]; + p_failed = (failed >= 1 && failed_num[0] == pd_idx) + || (failed >= 2 && failed_num[1] == pd_idx); + qdev = &sh->dev[qd_idx]; + q_failed = (failed >= 1 && failed_num[0] == qd_idx) + || (failed >= 2 && failed_num[1] == qd_idx); + + if ( written && + ( p_failed || ((test_bit(R5_Insync, &pdev->flags) + && !test_bit(R5_LOCKED, &pdev->flags) + && test_bit(R5_UPTODATE, &pdev->flags))) ) && + ( q_failed || ((test_bit(R5_Insync, &qdev->flags) + && !test_bit(R5_LOCKED, &qdev->flags) + && test_bit(R5_UPTODATE, &qdev->flags))) ) ) { + /* any written block on an uptodate or failed drive can be + * returned. Note that if we 'wrote' to a failed drive, + * it will be UPTODATE, but never LOCKED, so we don't need + * to test 'failed' directly. + */ + for (i=disks; i--; ) + if (sh->dev[i].written) { + dev = &sh->dev[i]; + if (!test_bit(R5_LOCKED, &dev->flags) && + test_bit(R5_UPTODATE, &dev->flags) ) { + /* We can return any write requests */ + int bitmap_end = 0; + struct bio *wbi, *wbi2; + PRINTK("Return write for stripe %llu disc %d\n", + (unsigned long long)sh->sector, i); + spin_lock_irq(&conf->device_lock); + wbi = dev->written; + dev->written = NULL; + while (wbi && wbi->bi_sector < dev->sector + STRIPE_SECTORS) { + wbi2 = r5_next_bio(wbi, dev->sector); + if (--wbi->bi_phys_segments == 0) { + md_write_end(conf->mddev); + wbi->bi_next = return_bi; + return_bi = wbi; + } + wbi = wbi2; + } + if (dev->towrite == NULL) + bitmap_end = 1; + spin_unlock_irq(&conf->device_lock); + if (bitmap_end) + bitmap_endwrite(conf->mddev->bitmap, sh->sector, + STRIPE_SECTORS, + !test_bit(STRIPE_DEGRADED, &sh->state), 0); + } + } + } + + /* Now we might consider reading some blocks, either to check/generate + * parity, or to satisfy requests + * or to load a block that is being partially written. + */ + if (to_read || non_overwrite || (to_write && failed) || (syncing && (uptodate < disks))) { + for (i=disks; i--;) { + dev = &sh->dev[i]; + if (!test_bit(R5_LOCKED, &dev->flags) && !test_bit(R5_UPTODATE, &dev->flags) && + (dev->toread || + (dev->towrite && !test_bit(R5_OVERWRITE, &dev->flags)) || + syncing || + (failed >= 1 && (sh->dev[failed_num[0]].toread || to_write)) || + (failed >= 2 && (sh->dev[failed_num[1]].toread || to_write)) + ) + ) { + /* we would like to get this block, possibly + * by computing it, but we might not be able to + */ + if (uptodate == disks-1) { + PRINTK("Computing stripe %llu block %d\n", + (unsigned long long)sh->sector, i); + compute_block_1(sh, i, 0); + uptodate++; + } else if ( uptodate == disks-2 && failed >= 2 ) { + /* Computing 2-failure is *very* expensive; only do it if failed >= 2 */ + int other; + for (other=disks; other--;) { + if ( other == i ) + continue; + if ( !test_bit(R5_UPTODATE, &sh->dev[other].flags) ) + break; + } + BUG_ON(other < 0); + PRINTK("Computing stripe %llu blocks %d,%d\n", + (unsigned long long)sh->sector, i, other); + compute_block_2(sh, i, other); + uptodate += 2; + } else if (test_bit(R5_Insync, &dev->flags)) { + set_bit(R5_LOCKED, &dev->flags); + set_bit(R5_Wantread, &dev->flags); +#if 0 + /* if I am just reading this block and we don't have + a failed drive, or any pending writes then sidestep the cache */ + if (sh->bh_read[i] && !sh->bh_read[i]->b_reqnext && + ! syncing && !failed && !to_write) { + sh->bh_cache[i]->b_page = sh->bh_read[i]->b_page; + sh->bh_cache[i]->b_data = sh->bh_read[i]->b_data; + } +#endif + locked++; + PRINTK("Reading block %d (sync=%d)\n", + i, syncing); + } + } + } + set_bit(STRIPE_HANDLE, &sh->state); + } + + /* now to consider writing and what else, if anything should be read */ + if (to_write) { + int rcw=0, must_compute=0; + for (i=disks ; i--;) { + dev = &sh->dev[i]; + /* Would I have to read this buffer for reconstruct_write */ + if (!test_bit(R5_OVERWRITE, &dev->flags) + && i != pd_idx && i != qd_idx + && (!test_bit(R5_LOCKED, &dev->flags) +#if 0 + || sh->bh_page[i] != bh->b_page +#endif + ) && + !test_bit(R5_UPTODATE, &dev->flags)) { + if (test_bit(R5_Insync, &dev->flags)) rcw++; + else { + PRINTK("raid6: must_compute: disk %d flags=%#lx\n", i, dev->flags); + must_compute++; + } + } + } + PRINTK("for sector %llu, rcw=%d, must_compute=%d\n", + (unsigned long long)sh->sector, rcw, must_compute); + set_bit(STRIPE_HANDLE, &sh->state); + + if (rcw > 0) + /* want reconstruct write, but need to get some data */ + for (i=disks; i--;) { + dev = &sh->dev[i]; + if (!test_bit(R5_OVERWRITE, &dev->flags) + && !(failed == 0 && (i == pd_idx || i == qd_idx)) + && !test_bit(R5_LOCKED, &dev->flags) && !test_bit(R5_UPTODATE, &dev->flags) && + test_bit(R5_Insync, &dev->flags)) { + if (test_bit(STRIPE_PREREAD_ACTIVE, &sh->state)) + { + PRINTK("Read_old stripe %llu block %d for Reconstruct\n", + (unsigned long long)sh->sector, i); + set_bit(R5_LOCKED, &dev->flags); + set_bit(R5_Wantread, &dev->flags); + locked++; + } else { + PRINTK("Request delayed stripe %llu block %d for Reconstruct\n", + (unsigned long long)sh->sector, i); + set_bit(STRIPE_DELAYED, &sh->state); + set_bit(STRIPE_HANDLE, &sh->state); + } + } + } + /* now if nothing is locked, and if we have enough data, we can start a write request */ + if (locked == 0 && rcw == 0 && + !test_bit(STRIPE_BIT_DELAY, &sh->state)) { + if ( must_compute > 0 ) { + /* We have failed blocks and need to compute them */ + switch ( failed ) { + case 0: BUG(); + case 1: compute_block_1(sh, failed_num[0], 0); break; + case 2: compute_block_2(sh, failed_num[0], failed_num[1]); break; + default: BUG(); /* This request should have been failed? */ + } + } + + PRINTK("Computing parity for stripe %llu\n", (unsigned long long)sh->sector); + compute_parity(sh, RECONSTRUCT_WRITE); + /* now every locked buffer is ready to be written */ + for (i=disks; i--;) + if (test_bit(R5_LOCKED, &sh->dev[i].flags)) { + PRINTK("Writing stripe %llu block %d\n", + (unsigned long long)sh->sector, i); + locked++; + set_bit(R5_Wantwrite, &sh->dev[i].flags); + } + /* after a RECONSTRUCT_WRITE, the stripe MUST be in-sync */ + set_bit(STRIPE_INSYNC, &sh->state); + + if (test_and_clear_bit(STRIPE_PREREAD_ACTIVE, &sh->state)) { + atomic_dec(&conf->preread_active_stripes); + if (atomic_read(&conf->preread_active_stripes) < IO_THRESHOLD) + md_wakeup_thread(conf->mddev->thread); + } + } + } + + /* maybe we need to check and possibly fix the parity for this stripe + * Any reads will already have been scheduled, so we just see if enough data + * is available + */ + if (syncing && locked == 0 && !test_bit(STRIPE_INSYNC, &sh->state)) { + int update_p = 0, update_q = 0; + struct r5dev *dev; + + set_bit(STRIPE_HANDLE, &sh->state); + + BUG_ON(failed>2); + BUG_ON(uptodate < disks); + /* Want to check and possibly repair P and Q. + * However there could be one 'failed' device, in which + * case we can only check one of them, possibly using the + * other to generate missing data + */ + + /* If !tmp_page, we cannot do the calculations, + * but as we have set STRIPE_HANDLE, we will soon be called + * by stripe_handle with a tmp_page - just wait until then. + */ + if (tmp_page) { + if (failed == q_failed) { + /* The only possible failed device holds 'Q', so it makes + * sense to check P (If anything else were failed, we would + * have used P to recreate it). + */ + compute_block_1(sh, pd_idx, 1); + if (!page_is_zero(sh->dev[pd_idx].page)) { + compute_block_1(sh,pd_idx,0); + update_p = 1; + } + } + if (!q_failed && failed < 2) { + /* q is not failed, and we didn't use it to generate + * anything, so it makes sense to check it + */ + memcpy(page_address(tmp_page), + page_address(sh->dev[qd_idx].page), + STRIPE_SIZE); + compute_parity(sh, UPDATE_PARITY); + if (memcmp(page_address(tmp_page), + page_address(sh->dev[qd_idx].page), + STRIPE_SIZE)!= 0) { + clear_bit(STRIPE_INSYNC, &sh->state); + update_q = 1; + } + } + if (update_p || update_q) { + conf->mddev->resync_mismatches += STRIPE_SECTORS; + if (test_bit(MD_RECOVERY_CHECK, &conf->mddev->recovery)) + /* don't try to repair!! */ + update_p = update_q = 0; + } + + /* now write out any block on a failed drive, + * or P or Q if they need it + */ + + if (failed == 2) { + dev = &sh->dev[failed_num[1]]; + locked++; + set_bit(R5_LOCKED, &dev->flags); + set_bit(R5_Wantwrite, &dev->flags); + } + if (failed >= 1) { + dev = &sh->dev[failed_num[0]]; + locked++; + set_bit(R5_LOCKED, &dev->flags); + set_bit(R5_Wantwrite, &dev->flags); + } + + if (update_p) { + dev = &sh->dev[pd_idx]; + locked ++; + set_bit(R5_LOCKED, &dev->flags); + set_bit(R5_Wantwrite, &dev->flags); + } + if (update_q) { + dev = &sh->dev[qd_idx]; + locked++; + set_bit(R5_LOCKED, &dev->flags); + set_bit(R5_Wantwrite, &dev->flags); + } + clear_bit(STRIPE_DEGRADED, &sh->state); + + set_bit(STRIPE_INSYNC, &sh->state); + } + } + + if (syncing && locked == 0 && test_bit(STRIPE_INSYNC, &sh->state)) { + md_done_sync(conf->mddev, STRIPE_SECTORS,1); + clear_bit(STRIPE_SYNCING, &sh->state); + } + + /* If the failed drives are just a ReadError, then we might need + * to progress the repair/check process + */ + if (failed <= 2 && ! conf->mddev->ro) + for (i=0; idev[failed_num[i]]; + if (test_bit(R5_ReadError, &dev->flags) + && !test_bit(R5_LOCKED, &dev->flags) + && test_bit(R5_UPTODATE, &dev->flags) + ) { + if (!test_bit(R5_ReWrite, &dev->flags)) { + set_bit(R5_Wantwrite, &dev->flags); + set_bit(R5_ReWrite, &dev->flags); + set_bit(R5_LOCKED, &dev->flags); + } else { + /* let's read it back */ + set_bit(R5_Wantread, &dev->flags); + set_bit(R5_LOCKED, &dev->flags); + } + } + } + spin_unlock(&sh->lock); + + while ((bi=return_bi)) { + int bytes = bi->bi_size; + + return_bi = bi->bi_next; + bi->bi_next = NULL; + bi->bi_size = 0; + bi->bi_end_io(bi, bytes, 0); + } + for (i=disks; i-- ;) { + int rw; + struct bio *bi; + mdk_rdev_t *rdev; + if (test_and_clear_bit(R5_Wantwrite, &sh->dev[i].flags)) + rw = 1; + else if (test_and_clear_bit(R5_Wantread, &sh->dev[i].flags)) + rw = 0; + else + continue; + + bi = &sh->dev[i].req; + + bi->bi_rw = rw; + if (rw) + bi->bi_end_io = raid6_end_write_request; + else + bi->bi_end_io = raid6_end_read_request; + + rcu_read_lock(); + rdev = rcu_dereference(conf->disks[i].rdev); + if (rdev && test_bit(Faulty, &rdev->flags)) + rdev = NULL; + if (rdev) + atomic_inc(&rdev->nr_pending); + rcu_read_unlock(); + + if (rdev) { + if (syncing) + md_sync_acct(rdev->bdev, STRIPE_SECTORS); + + bi->bi_bdev = rdev->bdev; + PRINTK("for %llu schedule op %ld on disc %d\n", + (unsigned long long)sh->sector, bi->bi_rw, i); + atomic_inc(&sh->count); + bi->bi_sector = sh->sector + rdev->data_offset; + bi->bi_flags = 1 << BIO_UPTODATE; + bi->bi_vcnt = 1; + bi->bi_max_vecs = 1; + bi->bi_idx = 0; + bi->bi_io_vec = &sh->dev[i].vec; + bi->bi_io_vec[0].bv_len = STRIPE_SIZE; + bi->bi_io_vec[0].bv_offset = 0; + bi->bi_size = STRIPE_SIZE; + bi->bi_next = NULL; + if (rw == WRITE && + test_bit(R5_ReWrite, &sh->dev[i].flags)) + atomic_add(STRIPE_SECTORS, &rdev->corrected_errors); + generic_make_request(bi); + } else { + if (rw == 1) + set_bit(STRIPE_DEGRADED, &sh->state); + PRINTK("skip op %ld on disc %d for sector %llu\n", + bi->bi_rw, i, (unsigned long long)sh->sector); + clear_bit(R5_LOCKED, &sh->dev[i].flags); + set_bit(STRIPE_HANDLE, &sh->state); + } + } +} + +static void raid6_activate_delayed(raid6_conf_t *conf) +{ + if (atomic_read(&conf->preread_active_stripes) < IO_THRESHOLD) { + while (!list_empty(&conf->delayed_list)) { + struct list_head *l = conf->delayed_list.next; + struct stripe_head *sh; + sh = list_entry(l, struct stripe_head, lru); + list_del_init(l); + clear_bit(STRIPE_DELAYED, &sh->state); + if (!test_and_set_bit(STRIPE_PREREAD_ACTIVE, &sh->state)) + atomic_inc(&conf->preread_active_stripes); + list_add_tail(&sh->lru, &conf->handle_list); + } + } +} + +static void activate_bit_delay(raid6_conf_t *conf) +{ + /* device_lock is held */ + struct list_head head; + list_add(&head, &conf->bitmap_list); + list_del_init(&conf->bitmap_list); + while (!list_empty(&head)) { + struct stripe_head *sh = list_entry(head.next, struct stripe_head, lru); + list_del_init(&sh->lru); + atomic_inc(&sh->count); + __release_stripe(conf, sh); + } +} + +static void unplug_slaves(mddev_t *mddev) +{ + raid6_conf_t *conf = mddev_to_conf(mddev); + int i; + + rcu_read_lock(); + for (i=0; iraid_disks; i++) { + mdk_rdev_t *rdev = rcu_dereference(conf->disks[i].rdev); + if (rdev && !test_bit(Faulty, &rdev->flags) && atomic_read(&rdev->nr_pending)) { + request_queue_t *r_queue = bdev_get_queue(rdev->bdev); + + atomic_inc(&rdev->nr_pending); + rcu_read_unlock(); + + if (r_queue->unplug_fn) + r_queue->unplug_fn(r_queue); + + rdev_dec_pending(rdev, mddev); + rcu_read_lock(); + } + } + rcu_read_unlock(); +} + +static void raid6_unplug_device(request_queue_t *q) +{ + mddev_t *mddev = q->queuedata; + raid6_conf_t *conf = mddev_to_conf(mddev); + unsigned long flags; + + spin_lock_irqsave(&conf->device_lock, flags); + + if (blk_remove_plug(q)) { + conf->seq_flush++; + raid6_activate_delayed(conf); + } + md_wakeup_thread(mddev->thread); + + spin_unlock_irqrestore(&conf->device_lock, flags); + + unplug_slaves(mddev); +} + +static int raid6_issue_flush(request_queue_t *q, struct gendisk *disk, + sector_t *error_sector) +{ + mddev_t *mddev = q->queuedata; + raid6_conf_t *conf = mddev_to_conf(mddev); + int i, ret = 0; + + rcu_read_lock(); + for (i=0; iraid_disks && ret == 0; i++) { + mdk_rdev_t *rdev = rcu_dereference(conf->disks[i].rdev); + if (rdev && !test_bit(Faulty, &rdev->flags)) { + struct block_device *bdev = rdev->bdev; + request_queue_t *r_queue = bdev_get_queue(bdev); + + if (!r_queue->issue_flush_fn) + ret = -EOPNOTSUPP; + else { + atomic_inc(&rdev->nr_pending); + rcu_read_unlock(); + ret = r_queue->issue_flush_fn(r_queue, bdev->bd_disk, + error_sector); + rdev_dec_pending(rdev, mddev); + rcu_read_lock(); + } + } + } + rcu_read_unlock(); + return ret; +} + +static inline void raid6_plug_device(raid6_conf_t *conf) +{ + spin_lock_irq(&conf->device_lock); + blk_plug_device(conf->mddev->queue); + spin_unlock_irq(&conf->device_lock); +} + +static int make_request (request_queue_t *q, struct bio * bi) +{ + mddev_t *mddev = q->queuedata; + raid6_conf_t *conf = mddev_to_conf(mddev); + const unsigned int raid_disks = conf->raid_disks; + const unsigned int data_disks = raid_disks - 2; + unsigned int dd_idx, pd_idx; + sector_t new_sector; + sector_t logical_sector, last_sector; + struct stripe_head *sh; + const int rw = bio_data_dir(bi); + + if (unlikely(bio_barrier(bi))) { + bio_endio(bi, bi->bi_size, -EOPNOTSUPP); + return 0; + } + + md_write_start(mddev, bi); + + disk_stat_inc(mddev->gendisk, ios[rw]); + disk_stat_add(mddev->gendisk, sectors[rw], bio_sectors(bi)); + + logical_sector = bi->bi_sector & ~((sector_t)STRIPE_SECTORS-1); + last_sector = bi->bi_sector + (bi->bi_size>>9); + + bi->bi_next = NULL; + bi->bi_phys_segments = 1; /* over-loaded to count active stripes */ + + for (;logical_sector < last_sector; logical_sector += STRIPE_SECTORS) { + DEFINE_WAIT(w); + + new_sector = raid6_compute_sector(logical_sector, + raid_disks, data_disks, &dd_idx, &pd_idx, conf); + + PRINTK("raid6: make_request, sector %llu logical %llu\n", + (unsigned long long)new_sector, + (unsigned long long)logical_sector); + + retry: + prepare_to_wait(&conf->wait_for_overlap, &w, TASK_UNINTERRUPTIBLE); + sh = get_active_stripe(conf, new_sector, pd_idx, (bi->bi_rw&RWA_MASK)); + if (sh) { + if (!add_stripe_bio(sh, bi, dd_idx, (bi->bi_rw&RW_MASK))) { + /* Add failed due to overlap. Flush everything + * and wait a while + */ + raid6_unplug_device(mddev->queue); + release_stripe(sh); + schedule(); + goto retry; + } + finish_wait(&conf->wait_for_overlap, &w); + raid6_plug_device(conf); + handle_stripe(sh, NULL); + release_stripe(sh); + } else { + /* cannot get stripe for read-ahead, just give-up */ + clear_bit(BIO_UPTODATE, &bi->bi_flags); + finish_wait(&conf->wait_for_overlap, &w); + break; + } + + } + spin_lock_irq(&conf->device_lock); + if (--bi->bi_phys_segments == 0) { + int bytes = bi->bi_size; + + if (rw == WRITE ) + md_write_end(mddev); + bi->bi_size = 0; + bi->bi_end_io(bi, bytes, 0); + } + spin_unlock_irq(&conf->device_lock); + return 0; +} + +/* FIXME go_faster isn't used */ +static sector_t sync_request(mddev_t *mddev, sector_t sector_nr, int *skipped, int go_faster) +{ + raid6_conf_t *conf = (raid6_conf_t *) mddev->private; + struct stripe_head *sh; + int sectors_per_chunk = conf->chunk_size >> 9; + sector_t x; + unsigned long stripe; + int chunk_offset; + int dd_idx, pd_idx; + sector_t first_sector; + int raid_disks = conf->raid_disks; + int data_disks = raid_disks - 2; + sector_t max_sector = mddev->size << 1; + int sync_blocks; + int still_degraded = 0; + int i; + + if (sector_nr >= max_sector) { + /* just being told to finish up .. nothing much to do */ + unplug_slaves(mddev); + + if (mddev->curr_resync < max_sector) /* aborted */ + bitmap_end_sync(mddev->bitmap, mddev->curr_resync, + &sync_blocks, 1); + else /* completed sync */ + conf->fullsync = 0; + bitmap_close_sync(mddev->bitmap); + + return 0; + } + /* if there are 2 or more failed drives and we are trying + * to resync, then assert that we are finished, because there is + * nothing we can do. + */ + if (mddev->degraded >= 2 && test_bit(MD_RECOVERY_SYNC, &mddev->recovery)) { + sector_t rv = (mddev->size << 1) - sector_nr; + *skipped = 1; + return rv; + } + if (!bitmap_start_sync(mddev->bitmap, sector_nr, &sync_blocks, 1) && + !test_bit(MD_RECOVERY_REQUESTED, &mddev->recovery) && + !conf->fullsync && sync_blocks >= STRIPE_SECTORS) { + /* we can skip this block, and probably more */ + sync_blocks /= STRIPE_SECTORS; + *skipped = 1; + return sync_blocks * STRIPE_SECTORS; /* keep things rounded to whole stripes */ + } + + x = sector_nr; + chunk_offset = sector_div(x, sectors_per_chunk); + stripe = x; + BUG_ON(x != stripe); + + first_sector = raid6_compute_sector((sector_t)stripe*data_disks*sectors_per_chunk + + chunk_offset, raid_disks, data_disks, &dd_idx, &pd_idx, conf); + sh = get_active_stripe(conf, sector_nr, pd_idx, 1); + if (sh == NULL) { + sh = get_active_stripe(conf, sector_nr, pd_idx, 0); + /* make sure we don't swamp the stripe cache if someone else + * is trying to get access + */ + schedule_timeout_uninterruptible(1); + } + /* Need to check if array will still be degraded after recovery/resync + * We don't need to check the 'failed' flag as when that gets set, + * recovery aborts. + */ + for (i=0; iraid_disks; i++) + if (conf->disks[i].rdev == NULL) + still_degraded = 1; + + bitmap_start_sync(mddev->bitmap, sector_nr, &sync_blocks, still_degraded); + + spin_lock(&sh->lock); + set_bit(STRIPE_SYNCING, &sh->state); + clear_bit(STRIPE_INSYNC, &sh->state); + spin_unlock(&sh->lock); + + handle_stripe(sh, NULL); + release_stripe(sh); + + return STRIPE_SECTORS; +} + +/* + * This is our raid6 kernel thread. + * + * We scan the hash table for stripes which can be handled now. + * During the scan, completed stripes are saved for us by the interrupt + * handler, so that they will not have to wait for our next wakeup. + */ +static void raid6d (mddev_t *mddev) +{ + struct stripe_head *sh; + raid6_conf_t *conf = mddev_to_conf(mddev); + int handled; + + PRINTK("+++ raid6d active\n"); + + md_check_recovery(mddev); + + handled = 0; + spin_lock_irq(&conf->device_lock); + while (1) { + struct list_head *first; + + if (conf->seq_flush - conf->seq_write > 0) { + int seq = conf->seq_flush; + spin_unlock_irq(&conf->device_lock); + bitmap_unplug(mddev->bitmap); + spin_lock_irq(&conf->device_lock); + conf->seq_write = seq; + activate_bit_delay(conf); + } + + if (list_empty(&conf->handle_list) && + atomic_read(&conf->preread_active_stripes) < IO_THRESHOLD && + !blk_queue_plugged(mddev->queue) && + !list_empty(&conf->delayed_list)) + raid6_activate_delayed(conf); + + if (list_empty(&conf->handle_list)) + break; + + first = conf->handle_list.next; + sh = list_entry(first, struct stripe_head, lru); + + list_del_init(first); + atomic_inc(&sh->count); + BUG_ON(atomic_read(&sh->count)!= 1); + spin_unlock_irq(&conf->device_lock); + + handled++; + handle_stripe(sh, conf->spare_page); + release_stripe(sh); + + spin_lock_irq(&conf->device_lock); + } + PRINTK("%d stripes handled\n", handled); + + spin_unlock_irq(&conf->device_lock); + + unplug_slaves(mddev); + + PRINTK("--- raid6d inactive\n"); +} + +static ssize_t +raid6_show_stripe_cache_size(mddev_t *mddev, char *page) +{ + raid6_conf_t *conf = mddev_to_conf(mddev); + if (conf) + return sprintf(page, "%d\n", conf->max_nr_stripes); + else + return 0; +} + +static ssize_t +raid6_store_stripe_cache_size(mddev_t *mddev, const char *page, size_t len) +{ + raid6_conf_t *conf = mddev_to_conf(mddev); + char *end; + int new; + if (len >= PAGE_SIZE) + return -EINVAL; + if (!conf) + return -ENODEV; + + new = simple_strtoul(page, &end, 10); + if (!*page || (*end && *end != '\n') ) + return -EINVAL; + if (new <= 16 || new > 32768) + return -EINVAL; + while (new < conf->max_nr_stripes) { + if (drop_one_stripe(conf)) + conf->max_nr_stripes--; + else + break; + } + while (new > conf->max_nr_stripes) { + if (grow_one_stripe(conf)) + conf->max_nr_stripes++; + else break; + } + return len; +} + +static struct md_sysfs_entry +raid6_stripecache_size = __ATTR(stripe_cache_size, S_IRUGO | S_IWUSR, + raid6_show_stripe_cache_size, + raid6_store_stripe_cache_size); + +static ssize_t +stripe_cache_active_show(mddev_t *mddev, char *page) +{ + raid6_conf_t *conf = mddev_to_conf(mddev); + if (conf) + return sprintf(page, "%d\n", atomic_read(&conf->active_stripes)); + else + return 0; +} + +static struct md_sysfs_entry +raid6_stripecache_active = __ATTR_RO(stripe_cache_active); + +static struct attribute *raid6_attrs[] = { + &raid6_stripecache_size.attr, + &raid6_stripecache_active.attr, + NULL, +}; +static struct attribute_group raid6_attrs_group = { + .name = NULL, + .attrs = raid6_attrs, +}; + +static int run(mddev_t *mddev) +{ + raid6_conf_t *conf; + int raid_disk, memory; + mdk_rdev_t *rdev; + struct disk_info *disk; + struct list_head *tmp; + + if (mddev->level != 6) { + PRINTK("raid6: %s: raid level not set to 6 (%d)\n", mdname(mddev), mddev->level); + return -EIO; + } + + mddev->private = kzalloc(sizeof (raid6_conf_t), GFP_KERNEL); + if ((conf = mddev->private) == NULL) + goto abort; + conf->disks = kzalloc(mddev->raid_disks * sizeof(struct disk_info), + GFP_KERNEL); + if (!conf->disks) + goto abort; + + conf->mddev = mddev; + + if ((conf->stripe_hashtbl = kzalloc(PAGE_SIZE, GFP_KERNEL)) == NULL) + goto abort; + + conf->spare_page = alloc_page(GFP_KERNEL); + if (!conf->spare_page) + goto abort; + + spin_lock_init(&conf->device_lock); + init_waitqueue_head(&conf->wait_for_stripe); + init_waitqueue_head(&conf->wait_for_overlap); + INIT_LIST_HEAD(&conf->handle_list); + INIT_LIST_HEAD(&conf->delayed_list); + INIT_LIST_HEAD(&conf->bitmap_list); + INIT_LIST_HEAD(&conf->inactive_list); + atomic_set(&conf->active_stripes, 0); + atomic_set(&conf->preread_active_stripes, 0); + + PRINTK("raid6: run(%s) called.\n", mdname(mddev)); + + ITERATE_RDEV(mddev,rdev,tmp) { + raid_disk = rdev->raid_disk; + if (raid_disk >= mddev->raid_disks + || raid_disk < 0) + continue; + disk = conf->disks + raid_disk; + + disk->rdev = rdev; + + if (test_bit(In_sync, &rdev->flags)) { + char b[BDEVNAME_SIZE]; + printk(KERN_INFO "raid6: device %s operational as raid" + " disk %d\n", bdevname(rdev->bdev,b), + raid_disk); + conf->working_disks++; + } + } + + conf->raid_disks = mddev->raid_disks; + + /* + * 0 for a fully functional array, 1 or 2 for a degraded array. + */ + mddev->degraded = conf->failed_disks = conf->raid_disks - conf->working_disks; + conf->mddev = mddev; + conf->chunk_size = mddev->chunk_size; + conf->level = mddev->level; + conf->algorithm = mddev->layout; + conf->max_nr_stripes = NR_STRIPES; + + /* device size must be a multiple of chunk size */ + mddev->size &= ~(mddev->chunk_size/1024 -1); + mddev->resync_max_sectors = mddev->size << 1; + + if (conf->raid_disks < 4) { + printk(KERN_ERR "raid6: not enough configured devices for %s (%d, minimum 4)\n", + mdname(mddev), conf->raid_disks); + goto abort; + } + if (!conf->chunk_size || conf->chunk_size % 4) { + printk(KERN_ERR "raid6: invalid chunk size %d for %s\n", + conf->chunk_size, mdname(mddev)); + goto abort; + } + if (conf->algorithm > ALGORITHM_RIGHT_SYMMETRIC) { + printk(KERN_ERR + "raid6: unsupported parity algorithm %d for %s\n", + conf->algorithm, mdname(mddev)); + goto abort; + } + if (mddev->degraded > 2) { + printk(KERN_ERR "raid6: not enough operational devices for %s" + " (%d/%d failed)\n", + mdname(mddev), conf->failed_disks, conf->raid_disks); + goto abort; + } + + if (mddev->degraded > 0 && + mddev->recovery_cp != MaxSector) { + if (mddev->ok_start_degraded) + printk(KERN_WARNING "raid6: starting dirty degraded array:%s" + "- data corruption possible.\n", + mdname(mddev)); + else { + printk(KERN_ERR "raid6: cannot start dirty degraded array" + " for %s\n", mdname(mddev)); + goto abort; + } + } + + { + mddev->thread = md_register_thread(raid6d, mddev, "%s_raid6"); + if (!mddev->thread) { + printk(KERN_ERR + "raid6: couldn't allocate thread for %s\n", + mdname(mddev)); + goto abort; + } + } + + memory = conf->max_nr_stripes * (sizeof(struct stripe_head) + + conf->raid_disks * ((sizeof(struct bio) + PAGE_SIZE))) / 1024; + if (grow_stripes(conf, conf->max_nr_stripes)) { + printk(KERN_ERR + "raid6: couldn't allocate %dkB for buffers\n", memory); + shrink_stripes(conf); + md_unregister_thread(mddev->thread); + goto abort; + } else + printk(KERN_INFO "raid6: allocated %dkB for %s\n", + memory, mdname(mddev)); + + if (mddev->degraded == 0) + printk(KERN_INFO "raid6: raid level %d set %s active with %d out of %d" + " devices, algorithm %d\n", conf->level, mdname(mddev), + mddev->raid_disks-mddev->degraded, mddev->raid_disks, + conf->algorithm); + else + printk(KERN_ALERT "raid6: raid level %d set %s active with %d" + " out of %d devices, algorithm %d\n", conf->level, + mdname(mddev), mddev->raid_disks - mddev->degraded, + mddev->raid_disks, conf->algorithm); + + print_raid6_conf(conf); + + /* read-ahead size must cover two whole stripes, which is + * 2 * (n-2) * chunksize where 'n' is the number of raid devices + */ + { + int stripe = (mddev->raid_disks-2) * mddev->chunk_size + / PAGE_SIZE; + if (mddev->queue->backing_dev_info.ra_pages < 2 * stripe) + mddev->queue->backing_dev_info.ra_pages = 2 * stripe; + } + + /* Ok, everything is just fine now */ + sysfs_create_group(&mddev->kobj, &raid6_attrs_group); + + mddev->array_size = mddev->size * (mddev->raid_disks - 2); + + mddev->queue->unplug_fn = raid6_unplug_device; + mddev->queue->issue_flush_fn = raid6_issue_flush; + return 0; +abort: + if (conf) { + print_raid6_conf(conf); + safe_put_page(conf->spare_page); + kfree(conf->stripe_hashtbl); + kfree(conf->disks); + kfree(conf); + } + mddev->private = NULL; + printk(KERN_ALERT "raid6: failed to run raid set %s\n", mdname(mddev)); + return -EIO; +} + + + +static int stop (mddev_t *mddev) +{ + raid6_conf_t *conf = (raid6_conf_t *) mddev->private; + + md_unregister_thread(mddev->thread); + mddev->thread = NULL; + shrink_stripes(conf); + kfree(conf->stripe_hashtbl); + blk_sync_queue(mddev->queue); /* the unplug fn references 'conf'*/ + sysfs_remove_group(&mddev->kobj, &raid6_attrs_group); + kfree(conf); + mddev->private = NULL; + return 0; +} + +#if RAID6_DUMPSTATE +static void print_sh (struct seq_file *seq, struct stripe_head *sh) +{ + int i; + + seq_printf(seq, "sh %llu, pd_idx %d, state %ld.\n", + (unsigned long long)sh->sector, sh->pd_idx, sh->state); + seq_printf(seq, "sh %llu, count %d.\n", + (unsigned long long)sh->sector, atomic_read(&sh->count)); + seq_printf(seq, "sh %llu, ", (unsigned long long)sh->sector); + for (i = 0; i < sh->raid_conf->raid_disks; i++) { + seq_printf(seq, "(cache%d: %p %ld) ", + i, sh->dev[i].page, sh->dev[i].flags); + } + seq_printf(seq, "\n"); +} + +static void printall (struct seq_file *seq, raid6_conf_t *conf) +{ + struct stripe_head *sh; + struct hlist_node *hn; + int i; + + spin_lock_irq(&conf->device_lock); + for (i = 0; i < NR_HASH; i++) { + sh = conf->stripe_hashtbl[i]; + hlist_for_each_entry(sh, hn, &conf->stripe_hashtbl[i], hash) { + if (sh->raid_conf != conf) + continue; + print_sh(seq, sh); + } + } + spin_unlock_irq(&conf->device_lock); +} +#endif + +static void status (struct seq_file *seq, mddev_t *mddev) +{ + raid6_conf_t *conf = (raid6_conf_t *) mddev->private; + int i; + + seq_printf (seq, " level %d, %dk chunk, algorithm %d", mddev->level, mddev->chunk_size >> 10, mddev->layout); + seq_printf (seq, " [%d/%d] [", conf->raid_disks, conf->working_disks); + for (i = 0; i < conf->raid_disks; i++) + seq_printf (seq, "%s", + conf->disks[i].rdev && + test_bit(In_sync, &conf->disks[i].rdev->flags) ? "U" : "_"); + seq_printf (seq, "]"); +#if RAID6_DUMPSTATE + seq_printf (seq, "\n"); + printall(seq, conf); +#endif +} + +static void print_raid6_conf (raid6_conf_t *conf) +{ + int i; + struct disk_info *tmp; + + printk("RAID6 conf printout:\n"); + if (!conf) { + printk("(conf==NULL)\n"); + return; + } + printk(" --- rd:%d wd:%d fd:%d\n", conf->raid_disks, + conf->working_disks, conf->failed_disks); + + for (i = 0; i < conf->raid_disks; i++) { + char b[BDEVNAME_SIZE]; + tmp = conf->disks + i; + if (tmp->rdev) + printk(" disk %d, o:%d, dev:%s\n", + i, !test_bit(Faulty, &tmp->rdev->flags), + bdevname(tmp->rdev->bdev,b)); + } +} + +static int raid6_spare_active(mddev_t *mddev) +{ + int i; + raid6_conf_t *conf = mddev->private; + struct disk_info *tmp; + + for (i = 0; i < conf->raid_disks; i++) { + tmp = conf->disks + i; + if (tmp->rdev + && !test_bit(Faulty, &tmp->rdev->flags) + && !test_bit(In_sync, &tmp->rdev->flags)) { + mddev->degraded--; + conf->failed_disks--; + conf->working_disks++; + set_bit(In_sync, &tmp->rdev->flags); + } + } + print_raid6_conf(conf); + return 0; +} + +static int raid6_remove_disk(mddev_t *mddev, int number) +{ + raid6_conf_t *conf = mddev->private; + int err = 0; + mdk_rdev_t *rdev; + struct disk_info *p = conf->disks + number; + + print_raid6_conf(conf); + rdev = p->rdev; + if (rdev) { + if (test_bit(In_sync, &rdev->flags) || + atomic_read(&rdev->nr_pending)) { + err = -EBUSY; + goto abort; + } + p->rdev = NULL; + synchronize_rcu(); + if (atomic_read(&rdev->nr_pending)) { + /* lost the race, try later */ + err = -EBUSY; + p->rdev = rdev; + } + } + +abort: + + print_raid6_conf(conf); + return err; +} + +static int raid6_add_disk(mddev_t *mddev, mdk_rdev_t *rdev) +{ + raid6_conf_t *conf = mddev->private; + int found = 0; + int disk; + struct disk_info *p; + + if (mddev->degraded > 2) + /* no point adding a device */ + return 0; + /* + * find the disk ... but prefer rdev->saved_raid_disk + * if possible. + */ + if (rdev->saved_raid_disk >= 0 && + conf->disks[rdev->saved_raid_disk].rdev == NULL) + disk = rdev->saved_raid_disk; + else + disk = 0; + for ( ; disk < mddev->raid_disks; disk++) + if ((p=conf->disks + disk)->rdev == NULL) { + clear_bit(In_sync, &rdev->flags); + rdev->raid_disk = disk; + found = 1; + if (rdev->saved_raid_disk != disk) + conf->fullsync = 1; + rcu_assign_pointer(p->rdev, rdev); + break; + } + print_raid6_conf(conf); + return found; +} + +static int raid6_resize(mddev_t *mddev, sector_t sectors) +{ + /* no resync is happening, and there is enough space + * on all devices, so we can resize. + * We need to make sure resync covers any new space. + * If the array is shrinking we should possibly wait until + * any io in the removed space completes, but it hardly seems + * worth it. + */ + sectors &= ~((sector_t)mddev->chunk_size/512 - 1); + mddev->array_size = (sectors * (mddev->raid_disks-2))>>1; + set_capacity(mddev->gendisk, mddev->array_size << 1); + mddev->changed = 1; + if (sectors/2 > mddev->size && mddev->recovery_cp == MaxSector) { + mddev->recovery_cp = mddev->size << 1; + set_bit(MD_RECOVERY_NEEDED, &mddev->recovery); + } + mddev->size = sectors /2; + mddev->resync_max_sectors = sectors; + return 0; +} + +static void raid6_quiesce(mddev_t *mddev, int state) +{ + raid6_conf_t *conf = mddev_to_conf(mddev); + + switch(state) { + case 1: /* stop all writes */ + spin_lock_irq(&conf->device_lock); + conf->quiesce = 1; + wait_event_lock_irq(conf->wait_for_stripe, + atomic_read(&conf->active_stripes) == 0, + conf->device_lock, /* nothing */); + spin_unlock_irq(&conf->device_lock); + break; + + case 0: /* re-enable writes */ + spin_lock_irq(&conf->device_lock); + conf->quiesce = 0; + wake_up(&conf->wait_for_stripe); + spin_unlock_irq(&conf->device_lock); + break; + } +} + +static struct mdk_personality raid6_personality = +{ + .name = "raid6", + .level = 6, + .owner = THIS_MODULE, + .make_request = make_request, + .run = run, + .stop = stop, + .status = status, + .error_handler = error, + .hot_add_disk = raid6_add_disk, + .hot_remove_disk= raid6_remove_disk, + .spare_active = raid6_spare_active, + .sync_request = sync_request, + .resize = raid6_resize, + .quiesce = raid6_quiesce, +}; + +static int __init raid6_init(void) +{ + int e; + + e = raid6_select_algo(); + if ( e ) + return e; + + return register_md_personality(&raid6_personality); +} + +static void raid6_exit (void) +{ + unregister_md_personality(&raid6_personality); +} + +module_init(raid6_init); +module_exit(raid6_exit); +MODULE_LICENSE("GPL"); +MODULE_ALIAS("md-personality-8"); /* RAID6 */ +MODULE_ALIAS("md-raid6"); +MODULE_ALIAS("md-level-6"); diff --git a/trunk/drivers/media/Kconfig b/trunk/drivers/media/Kconfig index ef52e6da01ed..583d151b7486 100644 --- a/trunk/drivers/media/Kconfig +++ b/trunk/drivers/media/Kconfig @@ -82,6 +82,9 @@ config VIDEO_IR config VIDEO_TVEEPROM tristate +config VIDEO_CX2341X + tristate + config USB_DABUSB tristate "DABUSB driver" depends on USB diff --git a/trunk/drivers/media/dvb/dvb-core/dvb_frontend.c b/trunk/drivers/media/dvb/dvb-core/dvb_frontend.c index 5e8bb41a088b..3152a54a2539 100644 --- a/trunk/drivers/media/dvb/dvb-core/dvb_frontend.c +++ b/trunk/drivers/media/dvb/dvb-core/dvb_frontend.c @@ -556,23 +556,22 @@ static int dvb_frontend_thread(void *data) } /* do an iteration of the tuning loop */ - if (fe->ops.get_frontend_algo) { - if (fe->ops.get_frontend_algo(fe) == FE_ALGO_HW) { - /* have we been asked to retune? */ - params = NULL; - if (fepriv->state & FESTATE_RETUNE) { - params = &fepriv->parameters; - fepriv->state = FESTATE_TUNED; - } + if (fe->ops.get_frontend_algo(fe) == FE_ALGO_HW) { + /* have we been asked to retune? */ + params = NULL; + if (fepriv->state & FESTATE_RETUNE) { + params = &fepriv->parameters; + fepriv->state = FESTATE_TUNED; + } - fe->ops.tune(fe, params, fepriv->tune_mode_flags, &fepriv->delay, &s); - if (s != fepriv->status) { - dvb_frontend_add_event(fe, s); - fepriv->status = s; - } + fe->ops.tune(fe, params, fepriv->tune_mode_flags, &fepriv->delay, &s); + if (s != fepriv->status) { + dvb_frontend_add_event(fe, s); + fepriv->status = s; } - } else + } else { dvb_frontend_swzigzag(fe); + } } if (dvb_shutdown_timeout) { diff --git a/trunk/drivers/media/dvb/ttpci/av7110.c b/trunk/drivers/media/dvb/ttpci/av7110.c index 7a5c99c200e8..8832f80c05f7 100644 --- a/trunk/drivers/media/dvb/ttpci/av7110.c +++ b/trunk/drivers/media/dvb/ttpci/av7110.c @@ -152,9 +152,13 @@ static void init_av7110_av(struct av7110 *av7110) /* remaining inits according to card and frontend type */ av7110->analog_tuner_flags = 0; av7110->current_input = 0; - if (dev->pci->subsystem_vendor == 0x13c2 && dev->pci->subsystem_device == 0x000a) + if (dev->pci->subsystem_vendor == 0x13c2 && dev->pci->subsystem_device == 0x000a) { + printk("dvb-ttpci: MSP3415 audio DAC @ card %d\n", + av7110->dvb_adapter.num); + av7110->adac_type = DVB_ADAC_MSP34x5; av7110_fw_cmd(av7110, COMTYPE_AUDIODAC, ADSwitch, 1, 0); // SPDIF on - if (i2c_writereg(av7110, 0x20, 0x00, 0x00) == 1) { + } + else if (i2c_writereg(av7110, 0x20, 0x00, 0x00) == 1) { printk ("dvb-ttpci: Crystal audio DAC @ card %d detected\n", av7110->dvb_adapter.num); av7110->adac_type = DVB_ADAC_CRYSTAL; diff --git a/trunk/drivers/media/dvb/ttpci/av7110_av.c b/trunk/drivers/media/dvb/ttpci/av7110_av.c index 0f3a044aeb17..2eff09f638d3 100644 --- a/trunk/drivers/media/dvb/ttpci/av7110_av.c +++ b/trunk/drivers/media/dvb/ttpci/av7110_av.c @@ -318,17 +318,7 @@ int av7110_set_volume(struct av7110 *av7110, int volleft, int volright) msp_writereg(av7110, MSP_WR_DSP, 0x0000, val); /* loudspeaker */ msp_writereg(av7110, MSP_WR_DSP, 0x0006, val); /* headphonesr */ return 0; - - case DVB_ADAC_MSP34x5: - vol = (volleft > volright) ? volleft : volright; - val = (vol * 0x73 / 255) << 8; - if (vol > 0) - balance = ((volright - volleft) * 127) / vol; - msp_writereg(av7110, MSP_WR_DSP, 0x0001, balance << 8); - msp_writereg(av7110, MSP_WR_DSP, 0x0000, val); /* loudspeaker */ - return 0; } - return 0; } @@ -1277,32 +1267,23 @@ static int dvb_audio_ioctl(struct inode *inode, struct file *file, switch(av7110->audiostate.channel_select) { case AUDIO_STEREO: ret = audcom(av7110, AUDIO_CMD_STEREO); - if (!ret) { + if (!ret) if (av7110->adac_type == DVB_ADAC_CRYSTAL) i2c_writereg(av7110, 0x20, 0x02, 0x49); - else if (av7110->adac_type == DVB_ADAC_MSP34x5) - msp_writereg(av7110, MSP_WR_DSP, 0x0008, 0x0220); - } break; case AUDIO_MONO_LEFT: ret = audcom(av7110, AUDIO_CMD_MONO_L); - if (!ret) { + if (!ret) if (av7110->adac_type == DVB_ADAC_CRYSTAL) i2c_writereg(av7110, 0x20, 0x02, 0x4a); - else if (av7110->adac_type == DVB_ADAC_MSP34x5) - msp_writereg(av7110, MSP_WR_DSP, 0x0008, 0x0200); - } break; case AUDIO_MONO_RIGHT: ret = audcom(av7110, AUDIO_CMD_MONO_R); - if (!ret) { + if (!ret) if (av7110->adac_type == DVB_ADAC_CRYSTAL) i2c_writereg(av7110, 0x20, 0x02, 0x45); - else if (av7110->adac_type == DVB_ADAC_MSP34x5) - msp_writereg(av7110, MSP_WR_DSP, 0x0008, 0x0210); - } break; default: diff --git a/trunk/drivers/media/dvb/ttpci/av7110_v4l.c b/trunk/drivers/media/dvb/ttpci/av7110_v4l.c index 64055461559d..603a22e4bfe2 100644 --- a/trunk/drivers/media/dvb/ttpci/av7110_v4l.c +++ b/trunk/drivers/media/dvb/ttpci/av7110_v4l.c @@ -42,18 +42,7 @@ int msp_writereg(struct av7110 *av7110, u8 dev, u16 reg, u16 val) { u8 msg[5] = { dev, reg >> 8, reg & 0xff, val >> 8 , val & 0xff }; - struct i2c_msg msgs = { .flags = 0, .len = 5, .buf = msg }; - - switch (av7110->adac_type) { - case DVB_ADAC_MSP34x0: - msgs.addr = 0x40; - break; - case DVB_ADAC_MSP34x5: - msgs.addr = 0x42; - break; - default: - return 0; - } + struct i2c_msg msgs = { .flags = 0, .addr = 0x40, .len = 5, .buf = msg }; if (i2c_transfer(&av7110->i2c_adap, &msgs, 1) != 1) { dprintk(1, "dvb-ttpci: failed @ card %d, %u = %u\n", @@ -68,23 +57,10 @@ static int msp_readreg(struct av7110 *av7110, u8 dev, u16 reg, u16 *val) u8 msg1[3] = { dev, reg >> 8, reg & 0xff }; u8 msg2[2]; struct i2c_msg msgs[2] = { - { .flags = 0 , .len = 3, .buf = msg1 }, - { .flags = I2C_M_RD, .len = 2, .buf = msg2 } + { .flags = 0, .addr = 0x40, .len = 3, .buf = msg1 }, + { .flags = I2C_M_RD, .addr = 0x40, .len = 2, .buf = msg2 } }; - switch (av7110->adac_type) { - case DVB_ADAC_MSP34x0: - msgs[0].addr = 0x40; - msgs[1].addr = 0x40; - break; - case DVB_ADAC_MSP34x5: - msgs[0].addr = 0x42; - msgs[1].addr = 0x42; - break; - default: - return 0; - } - if (i2c_transfer(&av7110->i2c_adap, &msgs[0], 2) != 2) { dprintk(1, "dvb-ttpci: failed @ card %d, %u\n", av7110->dvb_adapter.num, reg); @@ -702,23 +678,17 @@ int av7110_init_analog_module(struct av7110 *av7110) { u16 version1, version2; - if (i2c_writereg(av7110, 0x80, 0x0, 0x80) == 1 && - i2c_writereg(av7110, 0x80, 0x0, 0) == 1) { - printk("dvb-ttpci: DVB-C analog module @ card %d detected, initializing MSP3400\n", - av7110->dvb_adapter.num); - av7110->adac_type = DVB_ADAC_MSP34x0; - } else if (i2c_writereg(av7110, 0x84, 0x0, 0x80) == 1 && - i2c_writereg(av7110, 0x84, 0x0, 0) == 1) { - printk("dvb-ttpci: DVB-C analog module @ card %d detected, initializing MSP3415\n", - av7110->dvb_adapter.num); - av7110->adac_type = DVB_ADAC_MSP34x5; - } else + if (i2c_writereg(av7110, 0x80, 0x0, 0x80) != 1 + || i2c_writereg(av7110, 0x80, 0x0, 0) != 1) return -ENODEV; + printk("dvb-ttpci: DVB-C analog module @ card %d detected, initializing MSP3400\n", + av7110->dvb_adapter.num); + av7110->adac_type = DVB_ADAC_MSP34x0; msleep(100); // the probing above resets the msp... msp_readreg(av7110, MSP_RD_DSP, 0x001e, &version1); msp_readreg(av7110, MSP_RD_DSP, 0x001f, &version2); - dprintk(1, "dvb-ttpci: @ card %d MSP34xx version 0x%04x 0x%04x\n", + dprintk(1, "dvb-ttpci: @ card %d MSP3400 version 0x%04x 0x%04x\n", av7110->dvb_adapter.num, version1, version2); msp_writereg(av7110, MSP_WR_DSP, 0x0013, 0x0c00); msp_writereg(av7110, MSP_WR_DSP, 0x0000, 0x7f00); // loudspeaker + headphone @@ -727,7 +697,7 @@ int av7110_init_analog_module(struct av7110 *av7110) msp_writereg(av7110, MSP_WR_DSP, 0x0004, 0x7f00); // loudspeaker volume msp_writereg(av7110, MSP_WR_DSP, 0x000a, 0x0220); // SCART 1 source msp_writereg(av7110, MSP_WR_DSP, 0x0007, 0x7f00); // SCART 1 volume - msp_writereg(av7110, MSP_WR_DSP, 0x000d, 0x1900); // prescale SCART + msp_writereg(av7110, MSP_WR_DSP, 0x000d, 0x4800); // prescale SCART if (i2c_writereg(av7110, 0x48, 0x01, 0x00)!=1) { INFO(("saa7113 not accessible.\n")); diff --git a/trunk/drivers/media/video/Kconfig b/trunk/drivers/media/video/Kconfig index e4290491fa9e..824a63c92629 100644 --- a/trunk/drivers/media/video/Kconfig +++ b/trunk/drivers/media/video/Kconfig @@ -381,18 +381,6 @@ config VIDEO_WM8739 To compile this driver as a module, choose M here: the module will be called wm8739. -config VIDEO_CX2341X - tristate "Conexant CX2341x MPEG encoders" - depends on VIDEO_V4L2 && EXPERIMENTAL - ---help--- - Support for the Conexant CX23416 MPEG encoders - and CX23415 MPEG encoder/decoders. - - This module currently supports the encoding functions only. - - To compile this driver as a module, choose M here: the - module will be called cx2341x. - source "drivers/media/video/cx25840/Kconfig" config VIDEO_SAA711X diff --git a/trunk/drivers/media/video/cx2341x.c b/trunk/drivers/media/video/cx2341x.c index 01b22eab5725..554813e6f65d 100644 --- a/trunk/drivers/media/video/cx2341x.c +++ b/trunk/drivers/media/video/cx2341x.c @@ -43,7 +43,6 @@ MODULE_PARM_DESC(debug, "Debug level (0-1)"); const u32 cx2341x_mpeg_ctrls[] = { V4L2_CID_MPEG_CLASS, V4L2_CID_MPEG_STREAM_TYPE, - V4L2_CID_MPEG_STREAM_VBI_FMT, V4L2_CID_MPEG_AUDIO_SAMPLING_FREQ, V4L2_CID_MPEG_AUDIO_ENCODING, V4L2_CID_MPEG_AUDIO_L2_BITRATE, @@ -136,9 +135,6 @@ static int cx2341x_get_ctrl(struct cx2341x_mpeg_params *params, case V4L2_CID_MPEG_STREAM_TYPE: ctrl->value = params->stream_type; break; - case V4L2_CID_MPEG_STREAM_VBI_FMT: - ctrl->value = params->stream_vbi_fmt; - break; case V4L2_CID_MPEG_CX2341X_VIDEO_SPATIAL_FILTER_MODE: ctrl->value = params->video_spatial_filter_mode; break; @@ -261,9 +257,6 @@ static int cx2341x_set_ctrl(struct cx2341x_mpeg_params *params, params->video_bitrate_mode = V4L2_MPEG_VIDEO_BITRATE_MODE_CBR; } break; - case V4L2_CID_MPEG_STREAM_VBI_FMT: - params->stream_vbi_fmt = ctrl->value; - break; case V4L2_CID_MPEG_CX2341X_VIDEO_SPATIAL_FILTER_MODE: params->video_spatial_filter_mode = ctrl->value; break; @@ -425,14 +418,6 @@ int cx2341x_ctrl_query(struct cx2341x_mpeg_params *params, struct v4l2_queryctrl qctrl->flags |= V4L2_CTRL_FLAG_INACTIVE; return err; - case V4L2_CID_MPEG_STREAM_VBI_FMT: - if (params->capabilities & CX2341X_CAP_HAS_SLICED_VBI) - return v4l2_ctrl_query_fill_std(qctrl); - return cx2341x_ctrl_query_fill(qctrl, - V4L2_MPEG_STREAM_VBI_FMT_NONE, - V4L2_MPEG_STREAM_VBI_FMT_NONE, 1, - V4L2_MPEG_STREAM_VBI_FMT_NONE); - /* CX23415/6 specific */ case V4L2_CID_MPEG_CX2341X_VIDEO_SPATIAL_FILTER_MODE: return cx2341x_ctrl_query_fill(qctrl, @@ -654,7 +639,6 @@ void cx2341x_fill_defaults(struct cx2341x_mpeg_params *p) { static struct cx2341x_mpeg_params default_params = { /* misc */ - .capabilities = 0, .port = CX2341X_PORT_MEMORY, .width = 720, .height = 480, @@ -662,7 +646,6 @@ void cx2341x_fill_defaults(struct cx2341x_mpeg_params *p) /* stream */ .stream_type = V4L2_MPEG_STREAM_TYPE_MPEG2_PS, - .stream_vbi_fmt = V4L2_MPEG_STREAM_VBI_FMT_NONE, /* audio */ .audio_sampling_freq = V4L2_MPEG_AUDIO_SAMPLING_FREQ_48000, diff --git a/trunk/drivers/media/video/cx88/Kconfig b/trunk/drivers/media/video/cx88/Kconfig index 80e23ee9801c..91e1c481a164 100644 --- a/trunk/drivers/media/video/cx88/Kconfig +++ b/trunk/drivers/media/video/cx88/Kconfig @@ -11,6 +11,7 @@ config VIDEO_CX88 select VIDEO_BUF select VIDEO_TUNER select VIDEO_TVEEPROM + select VIDEO_CX2341X select VIDEO_IR ---help--- This is a video4linux driver for Conexant 2388x based @@ -35,25 +36,13 @@ config VIDEO_CX88_ALSA To compile this driver as a module, choose M here: the module will be called cx88-alsa. -config VIDEO_CX88_BLACKBIRD - tristate "Blackbird MPEG encoder support (cx2388x + cx23416)" - depends on VIDEO_CX88 - select VIDEO_CX2341X - ---help--- - This adds support for MPEG encoder cards based on the - Blackbird reference design, using the Conexant 2388x - and 23416 chips. - - To compile this driver as a module, choose M here: the - module will be called cx88-blackbird. - config VIDEO_CX88_DVB tristate "DVB/ATSC Support for cx2388x based TV cards" depends on VIDEO_CX88 && DVB_CORE select VIDEO_BUF_DVB ---help--- This adds support for DVB/ATSC cards based on the - Conexant 2388x chip. + Connexant 2388x chip. To compile this driver as a module, choose M here: the module will be called cx88-dvb. diff --git a/trunk/drivers/media/video/cx88/Makefile b/trunk/drivers/media/video/cx88/Makefile index 352b919f30c4..0dcd09b9b727 100644 --- a/trunk/drivers/media/video/cx88/Makefile +++ b/trunk/drivers/media/video/cx88/Makefile @@ -3,10 +3,9 @@ cx88xx-objs := cx88-cards.o cx88-core.o cx88-i2c.o cx88-tvaudio.o \ cx8800-objs := cx88-video.o cx88-vbi.o cx8802-objs := cx88-mpeg.o -obj-$(CONFIG_VIDEO_CX88) += cx88xx.o cx8800.o cx8802.o -obj-$(CONFIG_VIDEO_CX88_ALSA) += cx88-alsa.o -obj-$(CONFIG_VIDEO_CX88_BLACKBIRD) += cx88-blackbird.o +obj-$(CONFIG_VIDEO_CX88) += cx88xx.o cx8800.o cx8802.o cx88-blackbird.o obj-$(CONFIG_VIDEO_CX88_DVB) += cx88-dvb.o +obj-$(CONFIG_VIDEO_CX88_ALSA) += cx88-alsa.o obj-$(CONFIG_VIDEO_CX88_VP3054) += cx88-vp3054-i2c.o EXTRA_CFLAGS += -Idrivers/media/video diff --git a/trunk/drivers/media/video/cx88/cx88-blackbird.c b/trunk/drivers/media/video/cx88/cx88-blackbird.c index 78df66671ea2..67fd3302e8f2 100644 --- a/trunk/drivers/media/video/cx88/cx88-blackbird.c +++ b/trunk/drivers/media/video/cx88/cx88-blackbird.c @@ -846,7 +846,7 @@ static int mpeg_do_ioctl(struct inode *inode, struct file *file, BLACKBIRD_MPEG_CAPTURE, BLACKBIRD_RAW_BITS_NONE); - cx88_do_ioctl(inode, file, 0, dev->core, cmd, arg, mpeg_do_ioctl); + cx88_do_ioctl( inode, file, 0, dev->core, cmd, arg, cx88_ioctl_hook ); blackbird_initialize_codec(dev); cx88_set_scale(dev->core, dev->width, dev->height, @@ -855,11 +855,15 @@ static int mpeg_do_ioctl(struct inode *inode, struct file *file, } default: - return cx88_do_ioctl(inode, file, 0, dev->core, cmd, arg, mpeg_do_ioctl); + return cx88_do_ioctl( inode, file, 0, dev->core, cmd, arg, cx88_ioctl_hook ); } return 0; } +int (*cx88_ioctl_hook)(struct inode *inode, struct file *file, + unsigned int cmd, void *arg); +unsigned int (*cx88_ioctl_translator)(unsigned int cmd); + static unsigned int mpeg_translate_ioctl(unsigned int cmd) { return cmd; @@ -868,8 +872,8 @@ static unsigned int mpeg_translate_ioctl(unsigned int cmd) static int mpeg_ioctl(struct inode *inode, struct file *file, unsigned int cmd, unsigned long arg) { - cmd = mpeg_translate_ioctl( cmd ); - return video_usercopy(inode, file, cmd, arg, mpeg_do_ioctl); + cmd = cx88_ioctl_translator( cmd ); + return video_usercopy(inode, file, cmd, arg, cx88_ioctl_hook); } static int mpeg_open(struct inode *inode, struct file *file) @@ -1115,6 +1119,8 @@ static int blackbird_init(void) printk(KERN_INFO "cx2388x: snapshot date %04d-%02d-%02d\n", SNAPSHOT/10000, (SNAPSHOT/100)%100, SNAPSHOT%100); #endif + cx88_ioctl_hook = mpeg_do_ioctl; + cx88_ioctl_translator = mpeg_translate_ioctl; return pci_register_driver(&blackbird_pci_driver); } @@ -1126,6 +1132,9 @@ static void blackbird_fini(void) module_init(blackbird_init); module_exit(blackbird_fini); +EXPORT_SYMBOL(cx88_ioctl_hook); +EXPORT_SYMBOL(cx88_ioctl_translator); + /* ----------------------------------------------------------- */ /* * Local variables: diff --git a/trunk/drivers/media/video/cx88/cx88-cards.c b/trunk/drivers/media/video/cx88/cx88-cards.c index f9d68f20dc88..67cdd8270863 100644 --- a/trunk/drivers/media/video/cx88/cx88-cards.c +++ b/trunk/drivers/media/video/cx88/cx88-cards.c @@ -1700,6 +1700,11 @@ void cx88_card_setup(struct cx88_core *core) /* ------------------------------------------------------------------ */ EXPORT_SYMBOL(cx88_boards); +EXPORT_SYMBOL(cx88_bcount); +EXPORT_SYMBOL(cx88_subids); +EXPORT_SYMBOL(cx88_idcount); +EXPORT_SYMBOL(cx88_card_list); +EXPORT_SYMBOL(cx88_card_setup); /* * Local variables: diff --git a/trunk/drivers/media/video/cx88/cx88-core.c b/trunk/drivers/media/video/cx88/cx88-core.c index 26f4c0fb8c36..c56292d8d93b 100644 --- a/trunk/drivers/media/video/cx88/cx88-core.c +++ b/trunk/drivers/media/video/cx88/cx88-core.c @@ -1181,6 +1181,8 @@ EXPORT_SYMBOL(cx88_set_scale); EXPORT_SYMBOL(cx88_vdev_init); EXPORT_SYMBOL(cx88_core_get); EXPORT_SYMBOL(cx88_core_put); +EXPORT_SYMBOL(cx88_start_audio_dma); +EXPORT_SYMBOL(cx88_stop_audio_dma); /* * Local variables: diff --git a/trunk/drivers/media/video/cx88/cx88-i2c.c b/trunk/drivers/media/video/cx88/cx88-i2c.c index 70663805cc30..7efa6def0bde 100644 --- a/trunk/drivers/media/video/cx88/cx88-i2c.c +++ b/trunk/drivers/media/video/cx88/cx88-i2c.c @@ -234,6 +234,7 @@ int cx88_i2c_init(struct cx88_core *core, struct pci_dev *pci) /* ----------------------------------------------------------------------- */ EXPORT_SYMBOL(cx88_call_i2c_clients); +EXPORT_SYMBOL(cx88_i2c_init); /* * Local variables: diff --git a/trunk/drivers/media/video/cx88/cx88-tvaudio.c b/trunk/drivers/media/video/cx88/cx88-tvaudio.c index 5785c3481579..1e4278b588d8 100644 --- a/trunk/drivers/media/video/cx88/cx88-tvaudio.c +++ b/trunk/drivers/media/video/cx88/cx88-tvaudio.c @@ -726,7 +726,7 @@ static void set_audio_standard_FM(struct cx88_core *core, /* ----------------------------------------------------------- */ -static int cx88_detect_nicam(struct cx88_core *core) +int cx88_detect_nicam(struct cx88_core *core) { int i, j = 0; diff --git a/trunk/drivers/media/video/cx88/cx88-video.c b/trunk/drivers/media/video/cx88/cx88-video.c index dcda5291b990..694d1d80ff3f 100644 --- a/trunk/drivers/media/video/cx88/cx88-video.c +++ b/trunk/drivers/media/video/cx88/cx88-video.c @@ -494,7 +494,8 @@ static int restart_video_queue(struct cx8800_dev *dev, return 0; buf = list_entry(q->queued.next, struct cx88_buffer, vb.queue); if (NULL == prev) { - list_move_tail(&buf->vb.queue, &q->active); + list_del(&buf->vb.queue); + list_add_tail(&buf->vb.queue,&q->active); start_video_dma(dev, q, buf); buf->vb.state = STATE_ACTIVE; buf->count = q->count++; @@ -505,7 +506,8 @@ static int restart_video_queue(struct cx8800_dev *dev, } else if (prev->vb.width == buf->vb.width && prev->vb.height == buf->vb.height && prev->fmt == buf->fmt) { - list_move_tail(&buf->vb.queue, &q->active); + list_del(&buf->vb.queue); + list_add_tail(&buf->vb.queue,&q->active); buf->vb.state = STATE_ACTIVE; buf->count = q->count++; prev->risc.jmp[1] = cpu_to_le32(buf->risc.dma); diff --git a/trunk/drivers/media/video/cx88/cx88.h b/trunk/drivers/media/video/cx88/cx88.h index 9a9a0fc7a41a..dc7bc35f18f4 100644 --- a/trunk/drivers/media/video/cx88/cx88.h +++ b/trunk/drivers/media/video/cx88/cx88.h @@ -563,6 +563,7 @@ void cx88_newstation(struct cx88_core *core); void cx88_get_stereo(struct cx88_core *core, struct v4l2_tuner *t); void cx88_set_stereo(struct cx88_core *core, u32 mode, int manual); int cx88_audio_thread(void *data); +int cx88_detect_nicam(struct cx88_core *core); /* ----------------------------------------------------------- */ /* cx88-input.c */ @@ -591,6 +592,12 @@ extern int cx88_do_ioctl(struct inode *inode, struct file *file, int radio, struct cx88_core *core, unsigned int cmd, void *arg, v4l2_kioctl driver_ioctl); +/* ----------------------------------------------------------- */ +/* cx88-blackbird.c */ +extern int (*cx88_ioctl_hook)(struct inode *inode, struct file *file, + unsigned int cmd, void *arg); +extern unsigned int (*cx88_ioctl_translator)(unsigned int cmd); + /* * Local variables: * c-basic-offset: 8 diff --git a/trunk/drivers/media/video/tuner-core.c b/trunk/drivers/media/video/tuner-core.c index a26ded7d6fae..e95792fd70f8 100644 --- a/trunk/drivers/media/video/tuner-core.c +++ b/trunk/drivers/media/video/tuner-core.c @@ -730,10 +730,14 @@ static int tuner_command(struct i2c_client *client, unsigned int cmd, void *arg) { struct v4l2_frequency *f = arg; - if (set_mode (client, t, f->type, "VIDIOC_S_FREQUENCY") - == EINVAL) - return 0; switch_v4l2(); + if ((V4L2_TUNER_RADIO == f->type && V4L2_TUNER_RADIO != t->mode) + || (V4L2_TUNER_DIGITAL_TV == f->type + && V4L2_TUNER_DIGITAL_TV != t->mode)) { + if (set_mode (client, t, f->type, "VIDIOC_S_FREQUENCY") + == EINVAL) + return 0; + } set_freq(client,f->frequency); break; diff --git a/trunk/drivers/media/video/usbvideo/quickcam_messenger.c b/trunk/drivers/media/video/usbvideo/quickcam_messenger.c index 56e01b622417..3f3182a24da1 100644 --- a/trunk/drivers/media/video/usbvideo/quickcam_messenger.c +++ b/trunk/drivers/media/video/usbvideo/quickcam_messenger.c @@ -33,7 +33,7 @@ #include #include #include -#include +#include #include "usbvideo.h" #include "quickcam_messenger.h" diff --git a/trunk/drivers/media/video/v4l2-common.c b/trunk/drivers/media/video/v4l2-common.c index f4b3d64ebf73..14e523471354 100644 --- a/trunk/drivers/media/video/v4l2-common.c +++ b/trunk/drivers/media/video/v4l2-common.c @@ -1101,11 +1101,6 @@ const char **v4l2_ctrl_get_menu(u32 id) "MPEG-2 SVCD-compatible Stream", NULL }; - static const char *mpeg_stream_vbi_fmt[] = { - "No VBI", - "VBI in private packets, IVTV format", - NULL - }; switch (id) { case V4L2_CID_MPEG_AUDIO_SAMPLING_FREQ: @@ -1134,8 +1129,6 @@ const char **v4l2_ctrl_get_menu(u32 id) return mpeg_video_bitrate_mode; case V4L2_CID_MPEG_STREAM_TYPE: return mpeg_stream_type; - case V4L2_CID_MPEG_STREAM_VBI_FMT: - return mpeg_stream_vbi_fmt; default: return NULL; } @@ -1189,7 +1182,6 @@ int v4l2_ctrl_query_fill(struct v4l2_queryctrl *qctrl, s32 min, s32 max, s32 ste case V4L2_CID_MPEG_STREAM_PID_PCR: name = "Stream PCR Program ID"; break; case V4L2_CID_MPEG_STREAM_PES_ID_AUDIO: name = "Stream PES Audio ID"; break; case V4L2_CID_MPEG_STREAM_PES_ID_VIDEO: name = "Stream PES Video ID"; break; - case V4L2_CID_MPEG_STREAM_VBI_FMT: name = "Stream VBI Format"; break; default: return -EINVAL; @@ -1216,7 +1208,6 @@ int v4l2_ctrl_query_fill(struct v4l2_queryctrl *qctrl, s32 min, s32 max, s32 ste case V4L2_CID_MPEG_VIDEO_ASPECT: case V4L2_CID_MPEG_VIDEO_BITRATE_MODE: case V4L2_CID_MPEG_STREAM_TYPE: - case V4L2_CID_MPEG_STREAM_VBI_FMT: qctrl->type = V4L2_CTRL_TYPE_MENU; step = 1; break; @@ -1376,11 +1367,6 @@ int v4l2_ctrl_query_fill_std(struct v4l2_queryctrl *qctrl) return v4l2_ctrl_query_fill(qctrl, 0, 255, 1, 0); case V4L2_CID_MPEG_STREAM_PES_ID_VIDEO: return v4l2_ctrl_query_fill(qctrl, 0, 255, 1, 0); - case V4L2_CID_MPEG_STREAM_VBI_FMT: - return v4l2_ctrl_query_fill(qctrl, - V4L2_MPEG_STREAM_VBI_FMT_NONE, - V4L2_MPEG_STREAM_VBI_FMT_IVTV, 1, - V4L2_MPEG_STREAM_VBI_FMT_NONE); default: return -EINVAL; } diff --git a/trunk/drivers/mtd/maps/sun_uflash.c b/trunk/drivers/mtd/maps/sun_uflash.c index 24a03152d196..0758cb1d0105 100644 --- a/trunk/drivers/mtd/maps/sun_uflash.c +++ b/trunk/drivers/mtd/maps/sun_uflash.c @@ -18,7 +18,6 @@ #include #include #include -#include #include #include @@ -31,140 +30,146 @@ #define UFLASH_WINDOW_SIZE 0x200000 #define UFLASH_BUSWIDTH 1 /* EBus is 8-bit */ -MODULE_AUTHOR("Eric Brower "); -MODULE_DESCRIPTION("User-programmable flash device on Sun Microsystems boardsets"); -MODULE_SUPPORTED_DEVICE("userflash"); -MODULE_LICENSE("GPL"); -MODULE_VERSION("2.0"); +MODULE_AUTHOR + ("Eric Brower "); +MODULE_DESCRIPTION + ("User-programmable flash device on Sun Microsystems boardsets"); +MODULE_SUPPORTED_DEVICE + ("userflash"); +MODULE_LICENSE + ("GPL"); static LIST_HEAD(device_list); struct uflash_dev { - char *name; /* device name */ + char * name; /* device name */ struct map_info map; /* mtd map info */ - struct mtd_info *mtd; /* mtd info */ + struct mtd_info * mtd; /* mtd info */ + struct list_head list; }; struct map_info uflash_map_templ = { - .name = "SUNW,???-????", - .size = UFLASH_WINDOW_SIZE, - .bankwidth = UFLASH_BUSWIDTH, + .name = "SUNW,???-????", + .size = UFLASH_WINDOW_SIZE, + .bankwidth = UFLASH_BUSWIDTH, }; -int uflash_devinit(struct linux_ebus_device *edev, struct device_node *dp) +int uflash_devinit(struct linux_ebus_device* edev) { - struct uflash_dev *up; - struct resource *res; + int iTmp, nregs; + struct linux_prom_registers regs[2]; + struct uflash_dev *pdev; + + iTmp = prom_getproperty( + edev->prom_node, "reg", (void *)regs, sizeof(regs)); + if ((iTmp % sizeof(regs[0])) != 0) { + printk("%s: Strange reg property size %d\n", + UFLASH_DEVNAME, iTmp); + return -ENODEV; + } - res = &edev->resource[0]; + nregs = iTmp / sizeof(regs[0]); - if (edev->num_addrs != 1) { + if (nregs != 1) { /* Non-CFI userflash device-- once I find one we * can work on supporting it. */ printk("%s: unsupported device at 0x%lx (%d regs): " \ "email ebrower@usa.net\n", - dp->full_name, res->start, edev->num_addrs); - + UFLASH_DEVNAME, edev->resource[0].start, nregs); return -ENODEV; } - up = kzalloc(sizeof(struct uflash_dev), GFP_KERNEL); - if (!up) - return -ENOMEM; + if(0 == (pdev = kmalloc(sizeof(struct uflash_dev), GFP_KERNEL))) { + printk("%s: unable to kmalloc new device\n", UFLASH_DEVNAME); + return(-ENOMEM); + } /* copy defaults and tweak parameters */ - memcpy(&up->map, &uflash_map_templ, sizeof(uflash_map_templ)); - up->map.size = (res->end - res->start) + 1UL; - - up->name = of_get_property(dp, "model", NULL); - if (up->name && 0 < strlen(up->name)) - up->map.name = up->name; - - up->map.phys = res->start; - - up->map.virt = ioremap_nocache(res->start, up->map.size); - if (!up->map.virt) { - printk("%s: Failed to map device.\n", dp->full_name); - kfree(up); - - return -EINVAL; + memcpy(&pdev->map, &uflash_map_templ, sizeof(uflash_map_templ)); + pdev->map.size = regs[0].reg_size; + + iTmp = prom_getproplen(edev->prom_node, "model"); + pdev->name = kmalloc(iTmp, GFP_KERNEL); + prom_getstring(edev->prom_node, "model", pdev->name, iTmp); + if(0 != pdev->name && 0 < strlen(pdev->name)) { + pdev->map.name = pdev->name; + } + pdev->map.phys = edev->resource[0].start; + pdev->map.virt = ioremap_nocache(edev->resource[0].start, pdev->map.size); + if(0 == pdev->map.virt) { + printk("%s: failed to map device\n", __FUNCTION__); + kfree(pdev->name); + kfree(pdev); + return(-1); } - simple_map_init(&up->map); + simple_map_init(&pdev->map); /* MTD registration */ - up->mtd = do_map_probe("cfi_probe", &up->map); - if (!up->mtd) { - iounmap(up->map.virt); - kfree(up); - - return -ENXIO; + pdev->mtd = do_map_probe("cfi_probe", &pdev->map); + if(0 == pdev->mtd) { + iounmap(pdev->map.virt); + kfree(pdev->name); + kfree(pdev); + return(-ENXIO); } - up->mtd->owner = THIS_MODULE; + list_add(&pdev->list, &device_list); - add_mtd_device(up->mtd); + pdev->mtd->owner = THIS_MODULE; - dev_set_drvdata(&edev->ofdev.dev, up); - - return 0; + add_mtd_device(pdev->mtd); + return(0); } -static int __devinit uflash_probe(struct of_device *dev, const struct of_device_id *match) +static int __init uflash_init(void) { - struct linux_ebus_device *edev = to_ebus_device(&dev->dev); - struct device_node *dp = dev->node; + struct linux_ebus *ebus = NULL; + struct linux_ebus_device *edev = NULL; + + for_each_ebus(ebus) { + for_each_ebusdev(edev, ebus) { + if (!strcmp(edev->prom_name, UFLASH_OBPNAME)) { + if(0 > prom_getproplen(edev->prom_node, "user")) { + DEBUG(2, "%s: ignoring device at 0x%lx\n", + UFLASH_DEVNAME, edev->resource[0].start); + } else { + uflash_devinit(edev); + } + } + } + } - if (of_find_property(dp, "user", NULL)) + if(list_empty(&device_list)) { + printk("%s: unable to locate device\n", UFLASH_DEVNAME); return -ENODEV; - - return uflash_devinit(edev, dp); -} - -static int __devexit uflash_remove(struct of_device *dev) -{ - struct uflash_dev *up = dev_get_drvdata(&dev->dev); - - if (up->mtd) { - del_mtd_device(up->mtd); - map_destroy(up->mtd); } - if (up->map.virt) { - iounmap(up->map.virt); - up->map.virt = NULL; - } - - kfree(up); - - return 0; + return(0); } -static struct of_device_id uflash_match[] = { - { - .name = UFLASH_OBPNAME, - }, - {}, -}; - -MODULE_DEVICE_TABLE(of, uflash_match); - -static struct of_platform_driver uflash_driver = { - .name = UFLASH_DEVNAME, - .match_table = uflash_match, - .probe = uflash_probe, - .remove = __devexit_p(uflash_remove), -}; - -static int __init uflash_init(void) +static void __exit uflash_cleanup(void) { - return of_register_driver(&uflash_driver, &ebus_bus_type); -} - -static void __exit uflash_exit(void) -{ - of_unregister_driver(&uflash_driver); + struct list_head *udevlist; + struct uflash_dev *udev; + + list_for_each(udevlist, &device_list) { + udev = list_entry(udevlist, struct uflash_dev, list); + DEBUG(2, "%s: removing device %s\n", + UFLASH_DEVNAME, udev->name); + + if(0 != udev->mtd) { + del_mtd_device(udev->mtd); + map_destroy(udev->mtd); + } + if(0 != udev->map.virt) { + iounmap(udev->map.virt); + udev->map.virt = NULL; + } + kfree(udev->name); + kfree(udev); + } } module_init(uflash_init); -module_exit(uflash_exit); +module_exit(uflash_cleanup); diff --git a/trunk/drivers/net/irda/nsc-ircc.c b/trunk/drivers/net/irda/nsc-ircc.c index cb62f2a9676a..cc7ff8f00e42 100644 --- a/trunk/drivers/net/irda/nsc-ircc.c +++ b/trunk/drivers/net/irda/nsc-ircc.c @@ -115,12 +115,8 @@ static nsc_chip_t chips[] = { /* Contributed by Jan Frey - IBM A30/A31 */ { "PC8739x", { 0x2e, 0x4e, 0x0 }, 0x20, 0xea, 0xff, nsc_ircc_probe_39x, nsc_ircc_init_39x }, - /* IBM ThinkPads using PC8738x (T60/X60/Z60) */ - { "IBM-PC8738x", { 0x2e, 0x4e, 0x0 }, 0x20, 0xf4, 0xff, - nsc_ircc_probe_39x, nsc_ircc_init_39x }, - /* IBM ThinkPads using PC8394T (T43/R52/?) */ - { "IBM-PC8394T", { 0x2e, 0x4e, 0x0 }, 0x20, 0xf9, 0xff, - nsc_ircc_probe_39x, nsc_ircc_init_39x }, + { "IBM", { 0x2e, 0x4e, 0x0 }, 0x20, 0xf4, 0xff, + nsc_ircc_probe_39x, nsc_ircc_init_39x }, { NULL } }; diff --git a/trunk/drivers/net/ppp_generic.c b/trunk/drivers/net/ppp_generic.c index d643a097faa5..01cd8ec751ea 100644 --- a/trunk/drivers/net/ppp_generic.c +++ b/trunk/drivers/net/ppp_generic.c @@ -2578,7 +2578,8 @@ ppp_find_channel(int unit) list_for_each_entry(pch, &new_channels, list) { if (pch->file.index == unit) { - list_move(&pch->list, &all_channels); + list_del(&pch->list); + list_add(&pch->list, &all_channels); return pch; } } diff --git a/trunk/drivers/net/wireless/bcm43xx/Kconfig b/trunk/drivers/net/wireless/bcm43xx/Kconfig index 533993f538fc..25ea4748f0b9 100644 --- a/trunk/drivers/net/wireless/bcm43xx/Kconfig +++ b/trunk/drivers/net/wireless/bcm43xx/Kconfig @@ -2,7 +2,6 @@ config BCM43XX tristate "Broadcom BCM43xx wireless support" depends on PCI && IEEE80211 && IEEE80211_SOFTMAC && NET_RADIO && EXPERIMENTAL select FW_LOADER - select HW_RANDOM ---help--- This is an experimental driver for the Broadcom 43xx wireless chip, found in the Apple Airport Extreme and various other devices. diff --git a/trunk/drivers/net/wireless/bcm43xx/bcm43xx.h b/trunk/drivers/net/wireless/bcm43xx/bcm43xx.h index 17a56828e232..d8f917c21ea4 100644 --- a/trunk/drivers/net/wireless/bcm43xx/bcm43xx.h +++ b/trunk/drivers/net/wireless/bcm43xx/bcm43xx.h @@ -1,7 +1,6 @@ #ifndef BCM43xx_H_ #define BCM43xx_H_ -#include #include #include #include @@ -83,7 +82,6 @@ #define BCM43xx_MMIO_TSF_1 0x634 /* core rev < 3 only */ #define BCM43xx_MMIO_TSF_2 0x636 /* core rev < 3 only */ #define BCM43xx_MMIO_TSF_3 0x638 /* core rev < 3 only */ -#define BCM43xx_MMIO_RNG 0x65A #define BCM43xx_MMIO_POWERUP_DELAY 0x6A8 /* SPROM offsets. */ @@ -752,10 +750,6 @@ struct bcm43xx_private { const struct firmware *initvals0; const struct firmware *initvals1; - /* Random Number Generator. */ - struct hwrng rng; - char rng_name[20 + 1]; - /* Debugging stuff follows. */ #ifdef CONFIG_BCM43XX_DEBUG struct bcm43xx_dfsentry *dfsentry; diff --git a/trunk/drivers/net/wireless/bcm43xx/bcm43xx_main.c b/trunk/drivers/net/wireless/bcm43xx/bcm43xx_main.c index 27bcf47228e2..085d7857fe31 100644 --- a/trunk/drivers/net/wireless/bcm43xx/bcm43xx_main.c +++ b/trunk/drivers/net/wireless/bcm43xx/bcm43xx_main.c @@ -3237,39 +3237,6 @@ static void bcm43xx_security_init(struct bcm43xx_private *bcm) bcm43xx_clear_keys(bcm); } -static int bcm43xx_rng_read(struct hwrng *rng, u32 *data) -{ - struct bcm43xx_private *bcm = (struct bcm43xx_private *)rng->priv; - unsigned long flags; - - bcm43xx_lock_irqonly(bcm, flags); - *data = bcm43xx_read16(bcm, BCM43xx_MMIO_RNG); - bcm43xx_unlock_irqonly(bcm, flags); - - return (sizeof(u16)); -} - -static void bcm43xx_rng_exit(struct bcm43xx_private *bcm) -{ - hwrng_unregister(&bcm->rng); -} - -static int bcm43xx_rng_init(struct bcm43xx_private *bcm) -{ - int err; - - snprintf(bcm->rng_name, ARRAY_SIZE(bcm->rng_name), - "%s_%s", KBUILD_MODNAME, bcm->net_dev->name); - bcm->rng.name = bcm->rng_name; - bcm->rng.data_read = bcm43xx_rng_read; - bcm->rng.priv = (unsigned long)bcm; - err = hwrng_register(&bcm->rng); - if (err) - printk(KERN_ERR PFX "RNG init failed (%d)\n", err); - - return err; -} - /* This is the opposite of bcm43xx_init_board() */ static void bcm43xx_free_board(struct bcm43xx_private *bcm) { @@ -3281,7 +3248,6 @@ static void bcm43xx_free_board(struct bcm43xx_private *bcm) bcm43xx_set_status(bcm, BCM43xx_STAT_SHUTTINGDOWN); - bcm43xx_rng_exit(bcm); for (i = 0; i < BCM43xx_MAX_80211_CORES; i++) { if (!bcm->core_80211[i].available) continue; @@ -3359,9 +3325,6 @@ static int bcm43xx_init_board(struct bcm43xx_private *bcm) bcm43xx_switch_core(bcm, &bcm->core_80211[0]); bcm43xx_mac_enable(bcm); } - err = bcm43xx_rng_init(bcm); - if (err) - goto err_80211_unwind; bcm43xx_macfilter_clear(bcm, BCM43xx_MACFILTER_ASSOC); bcm43xx_macfilter_set(bcm, BCM43xx_MACFILTER_SELF, (u8 *)(bcm->net_dev->dev_addr)); dprintk(KERN_INFO PFX "80211 cores initialized\n"); diff --git a/trunk/drivers/parport/parport_sunbpp.c b/trunk/drivers/parport/parport_sunbpp.c index 7c43c5392bed..69a4bbd4cbee 100644 --- a/trunk/drivers/parport/parport_sunbpp.c +++ b/trunk/drivers/parport/parport_sunbpp.c @@ -389,7 +389,7 @@ static struct of_device_id bpp_match[] = { {}, }; -MODULE_DEVICE_TABLE(of, bpp_match); +MODULE_DEVICE_TABLE(of, qec_sbus_match); static struct of_platform_driver bpp_sbus_driver = { .name = "bpp", diff --git a/trunk/drivers/pci/msi-apic.c b/trunk/drivers/pci/msi-apic.c index 5ed798b319c7..0eb5fe9003a2 100644 --- a/trunk/drivers/pci/msi-apic.c +++ b/trunk/drivers/pci/msi-apic.c @@ -4,7 +4,6 @@ #include #include -#include #include "msi.h" diff --git a/trunk/drivers/s390/net/lcs.c b/trunk/drivers/s390/net/lcs.c index 2eded55ae88d..f94419b334f7 100644 --- a/trunk/drivers/s390/net/lcs.c +++ b/trunk/drivers/s390/net/lcs.c @@ -1140,9 +1140,10 @@ lcs_fix_multicast_list(struct lcs_card *card) } } /* re-insert all entries from the failed_list into ipm_list */ - list_for_each_entry_safe(ipm, tmp, &failed_list, list) - list_move_tail(&ipm->list, &card->ipm_list); - + list_for_each_entry_safe(ipm, tmp, &failed_list, list) { + list_del_init(&ipm->list); + list_add_tail(&ipm->list, &card->ipm_list); + } spin_unlock_irqrestore(&card->ipm_lock, flags); } diff --git a/trunk/drivers/sbus/char/cpwatchdog.c b/trunk/drivers/sbus/char/cpwatchdog.c index 21737b7e86a1..5bf3dd901b65 100644 --- a/trunk/drivers/sbus/char/cpwatchdog.c +++ b/trunk/drivers/sbus/char/cpwatchdog.c @@ -755,7 +755,7 @@ static int __init wd_init(void) for_each_ebus(ebus) { for_each_ebusdev(edev, ebus) { - if (!strcmp(edev->ofdev.node->name, WD_OBPNAME)) + if (!strcmp(edev->prom_name, WD_OBPNAME)) goto ebus_done; } } diff --git a/trunk/drivers/sbus/char/openprom.c b/trunk/drivers/sbus/char/openprom.c index d7e4bb41bd79..cf5b476b5496 100644 --- a/trunk/drivers/sbus/char/openprom.c +++ b/trunk/drivers/sbus/char/openprom.c @@ -29,6 +29,8 @@ * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. */ +#define PROMLIB_INTERNAL + #include #include #include @@ -37,10 +39,10 @@ #include #include #include +#include #include #include #include -#include #include #include #include @@ -49,20 +51,15 @@ #include #endif -MODULE_AUTHOR("Thomas K. Dyas (tdyas@noc.rutgers.edu) and Eddie C. Dost (ecd@skynet.be)"); -MODULE_DESCRIPTION("OPENPROM Configuration Driver"); -MODULE_LICENSE("GPL"); -MODULE_VERSION("1.0"); - /* Private data kept by the driver for each descriptor. */ typedef struct openprom_private_data { - struct device_node *current_node; /* Current node for SunOS ioctls. */ - struct device_node *lastnode; /* Last valid node used by BSD ioctls. */ + int current_node; /* Current node for SunOS ioctls. */ + int lastnode; /* Last valid node used by BSD ioctls. */ } DATA; /* ID of the PROM node containing all of the EEPROM options. */ -static struct device_node *options_node; +static int options_node = 0; /* * Copy an openpromio structure into kernel space from user space. @@ -90,8 +87,9 @@ static int copyin(struct openpromio __user *info, struct openpromio **opp_p) if (bufsize > OPROMMAXPARAM) bufsize = OPROMMAXPARAM; - if (!(*opp_p = kzalloc(sizeof(int) + bufsize + 1, GFP_KERNEL))) + if (!(*opp_p = kmalloc(sizeof(int) + bufsize + 1, GFP_KERNEL))) return -ENOMEM; + memset(*opp_p, 0, sizeof(int) + bufsize + 1); if (copy_from_user(&(*opp_p)->oprom_array, &info->oprom_array, bufsize)) { @@ -109,9 +107,10 @@ static int getstrings(struct openpromio __user *info, struct openpromio **opp_p) if (!info || !opp_p) return -EFAULT; - if (!(*opp_p = kzalloc(sizeof(int) + OPROMMAXPARAM + 1, GFP_KERNEL))) + if (!(*opp_p = kmalloc(sizeof(int) + OPROMMAXPARAM + 1, GFP_KERNEL))) return -ENOMEM; + memset(*opp_p, 0, sizeof(int) + OPROMMAXPARAM + 1); (*opp_p)->oprom_size = 0; n = bufsize = 0; @@ -141,164 +140,16 @@ static int copyout(void __user *info, struct openpromio *opp, int len) return 0; } -static int opromgetprop(void __user *argp, struct device_node *dp, struct openpromio *op, int bufsize) -{ - void *pval; - int len; - - pval = of_get_property(dp, op->oprom_array, &len); - if (!pval || len <= 0 || len > bufsize) - return copyout(argp, op, sizeof(int)); - - memcpy(op->oprom_array, pval, len); - op->oprom_array[len] = '\0'; - op->oprom_size = len; - - return copyout(argp, op, sizeof(int) + bufsize); -} - -static int opromnxtprop(void __user *argp, struct device_node *dp, struct openpromio *op, int bufsize) -{ - struct property *prop; - int len; - - if (op->oprom_array[0] == '\0') { - prop = dp->properties; - if (!prop) - return copyout(argp, op, sizeof(int)); - len = strlen(prop->name); - } else { - prop = of_find_property(dp, op->oprom_array, NULL); - - if (!prop || - !prop->next || - (len = strlen(prop->next->name)) + 1 > bufsize) - return copyout(argp, op, sizeof(int)); - - prop = prop->next; - } - - memcpy(op->oprom_array, prop->name, len); - op->oprom_array[len] = '\0'; - op->oprom_size = ++len; - - return copyout(argp, op, sizeof(int) + bufsize); -} - -static int opromsetopt(struct device_node *dp, struct openpromio *op, int bufsize) -{ - char *buf = op->oprom_array + strlen(op->oprom_array) + 1; - int len = op->oprom_array + bufsize - buf; - - return of_set_property(options_node, op->oprom_array, buf, len); -} - -static int opromnext(void __user *argp, unsigned int cmd, struct device_node *dp, struct openpromio *op, int bufsize, DATA *data) -{ - phandle ph; - - BUILD_BUG_ON(sizeof(phandle) != sizeof(int)); - - if (bufsize < sizeof(phandle)) - return -EINVAL; - - ph = *((int *) op->oprom_array); - if (ph) { - dp = of_find_node_by_phandle(ph); - if (!dp) - return -EINVAL; - - switch (cmd) { - case OPROMNEXT: - dp = dp->sibling; - break; - - case OPROMCHILD: - dp = dp->child; - break; - - case OPROMSETCUR: - default: - break; - }; - } else { - /* Sibling of node zero is the root node. */ - if (cmd != OPROMNEXT) - return -EINVAL; - - dp = of_find_node_by_path("/"); - } - - ph = 0; - if (dp) - ph = dp->node; - - data->current_node = dp; - *((int *) op->oprom_array) = ph; - op->oprom_size = sizeof(phandle); - - return copyout(argp, op, bufsize + sizeof(int)); -} - -static int oprompci2node(void __user *argp, struct device_node *dp, struct openpromio *op, int bufsize, DATA *data) -{ - int err = -EINVAL; - - if (bufsize >= 2*sizeof(int)) { -#ifdef CONFIG_PCI - struct pci_dev *pdev; - struct pcidev_cookie *pcp; - pdev = pci_find_slot (((int *) op->oprom_array)[0], - ((int *) op->oprom_array)[1]); - - pcp = pdev->sysdata; - if (pcp != NULL) { - dp = pcp->prom_node; - data->current_node = dp; - *((int *)op->oprom_array) = dp->node; - op->oprom_size = sizeof(int); - err = copyout(argp, op, bufsize + sizeof(int)); - } -#endif - } - - return err; -} - -static int oprompath2node(void __user *argp, struct device_node *dp, struct openpromio *op, int bufsize, DATA *data) -{ - dp = of_find_node_by_path(op->oprom_array); - data->current_node = dp; - *((int *)op->oprom_array) = dp->node; - op->oprom_size = sizeof(int); - - return copyout(argp, op, bufsize + sizeof(int)); -} - -static int opromgetbootargs(void __user *argp, struct openpromio *op, int bufsize) -{ - char *buf = saved_command_line; - int len = strlen(buf); - - if (len > bufsize) - return -EINVAL; - - strcpy(op->oprom_array, buf); - op->oprom_size = len; - - return copyout(argp, op, bufsize + sizeof(int)); -} - /* * SunOS and Solaris /dev/openprom ioctl calls. */ static int openprom_sunos_ioctl(struct inode * inode, struct file * file, - unsigned int cmd, unsigned long arg, - struct device_node *dp) + unsigned int cmd, unsigned long arg, int node) { - DATA *data = file->private_data; + DATA *data = (DATA *) file->private_data; + char buffer[OPROMMAXPARAM+1], *buf; struct openpromio *opp; - int bufsize, error = 0; + int bufsize, len, error = 0; static int cnt; void __user *argp = (void __user *)arg; @@ -313,35 +164,119 @@ static int openprom_sunos_ioctl(struct inode * inode, struct file * file, switch (cmd) { case OPROMGETOPT: case OPROMGETPROP: - error = opromgetprop(argp, dp, opp, bufsize); + len = prom_getproplen(node, opp->oprom_array); + + if (len <= 0 || len > bufsize) { + error = copyout(argp, opp, sizeof(int)); + break; + } + + len = prom_getproperty(node, opp->oprom_array, buffer, bufsize); + + memcpy(opp->oprom_array, buffer, len); + opp->oprom_array[len] = '\0'; + opp->oprom_size = len; + + error = copyout(argp, opp, sizeof(int) + bufsize); break; case OPROMNXTOPT: case OPROMNXTPROP: - error = opromnxtprop(argp, dp, opp, bufsize); + buf = prom_nextprop(node, opp->oprom_array, buffer); + + len = strlen(buf); + if (len == 0 || len + 1 > bufsize) { + error = copyout(argp, opp, sizeof(int)); + break; + } + + memcpy(opp->oprom_array, buf, len); + opp->oprom_array[len] = '\0'; + opp->oprom_size = ++len; + + error = copyout(argp, opp, sizeof(int) + bufsize); break; case OPROMSETOPT: case OPROMSETOPT2: - error = opromsetopt(dp, opp, bufsize); + buf = opp->oprom_array + strlen(opp->oprom_array) + 1; + len = opp->oprom_array + bufsize - buf; + + error = prom_setprop(options_node, opp->oprom_array, + buf, len); + + if (error < 0) + error = -EINVAL; break; case OPROMNEXT: case OPROMCHILD: case OPROMSETCUR: - error = opromnext(argp, cmd, dp, opp, bufsize, data); + if (bufsize < sizeof(int)) { + error = -EINVAL; + break; + } + + node = *((int *) opp->oprom_array); + + switch (cmd) { + case OPROMNEXT: node = __prom_getsibling(node); break; + case OPROMCHILD: node = __prom_getchild(node); break; + case OPROMSETCUR: break; + } + + data->current_node = node; + *((int *)opp->oprom_array) = node; + opp->oprom_size = sizeof(int); + + error = copyout(argp, opp, bufsize + sizeof(int)); break; case OPROMPCI2NODE: - error = oprompci2node(argp, dp, opp, bufsize, data); + error = -EINVAL; + + if (bufsize >= 2*sizeof(int)) { +#ifdef CONFIG_PCI + struct pci_dev *pdev; + struct pcidev_cookie *pcp; + pdev = pci_find_slot (((int *) opp->oprom_array)[0], + ((int *) opp->oprom_array)[1]); + + pcp = pdev->sysdata; + if (pcp != NULL) { + node = pcp->prom_node->node; + data->current_node = node; + *((int *)opp->oprom_array) = node; + opp->oprom_size = sizeof(int); + error = copyout(argp, opp, bufsize + sizeof(int)); + } +#endif + } break; case OPROMPATH2NODE: - error = oprompath2node(argp, dp, opp, bufsize, data); + node = prom_finddevice(opp->oprom_array); + data->current_node = node; + *((int *)opp->oprom_array) = node; + opp->oprom_size = sizeof(int); + + error = copyout(argp, opp, bufsize + sizeof(int)); break; case OPROMGETBOOTARGS: - error = opromgetbootargs(argp, opp, bufsize); + buf = saved_command_line; + + len = strlen(buf); + + if (len > bufsize) { + error = -EINVAL; + break; + } + + strcpy(opp->oprom_array, buf); + opp->oprom_size = len; + + error = copyout(argp, opp, bufsize + sizeof(int)); break; case OPROMU2P: @@ -362,14 +297,25 @@ static int openprom_sunos_ioctl(struct inode * inode, struct file * file, return error; } -static struct device_node *get_node(phandle n, DATA *data) -{ - struct device_node *dp = of_find_node_by_phandle(n); - if (dp) - data->lastnode = dp; +/* Return nonzero if a specific node is in the PROM device tree. */ +static int intree(int root, int node) +{ + for (; root != 0; root = prom_getsibling(root)) + if (root == node || intree(prom_getchild(root),node)) + return 1; + return 0; +} - return dp; +/* Return nonzero if a specific node is "valid". */ +static int goodnode(int n, DATA *data) +{ + if (n == data->lastnode || n == prom_root_node || n == options_node) + return 1; + if (n == 0 || n == -1 || !intree(prom_root_node,n)) + return 0; + data->lastnode = n; + return 1; } /* Copy in a whole string from userspace into kernelspace. */ @@ -384,7 +330,7 @@ static int copyin_string(char __user *user, size_t len, char **ptr) if (!tmp) return -ENOMEM; - if (copy_from_user(tmp, user, len)) { + if(copy_from_user(tmp, user, len)) { kfree(tmp); return -EFAULT; } @@ -399,187 +345,162 @@ static int copyin_string(char __user *user, size_t len, char **ptr) /* * NetBSD /dev/openprom ioctl calls. */ -static int opiocget(void __user *argp, DATA *data) +static int openprom_bsd_ioctl(struct inode * inode, struct file * file, + unsigned int cmd, unsigned long arg) { + DATA *data = (DATA *) file->private_data; + void __user *argp = (void __user *)arg; struct opiocdesc op; - struct device_node *dp; - char *str; - void *pval; - int err, len; + int error, node, len; + char *str, *tmp; + char buffer[64]; + static int cnt; - if (copy_from_user(&op, argp, sizeof(op))) - return -EFAULT; + switch (cmd) { + case OPIOCGET: + if (copy_from_user(&op, argp, sizeof(op))) + return -EFAULT; - dp = get_node(op.op_nodeid, data); + if (!goodnode(op.op_nodeid,data)) + return -EINVAL; - err = copyin_string(op.op_name, op.op_namelen, &str); - if (err) - return err; + error = copyin_string(op.op_name, op.op_namelen, &str); + if (error) + return error; - pval = of_get_property(dp, str, &len); - err = 0; - if (!pval || len > op.op_buflen) { - err = -EINVAL; - } else { - op.op_buflen = len; - if (copy_to_user(argp, &op, sizeof(op)) || - copy_to_user(op.op_buf, pval, len)) - err = -EFAULT; - } - kfree(str); - - return err; -} + len = prom_getproplen(op.op_nodeid,str); -static int opiocnextprop(void __user *argp, DATA *data) -{ - struct opiocdesc op; - struct device_node *dp; - struct property *prop; - char *str; - int err, len; + if (len > op.op_buflen) { + kfree(str); + return -ENOMEM; + } - if (copy_from_user(&op, argp, sizeof(op))) - return -EFAULT; + op.op_buflen = len; - dp = get_node(op.op_nodeid, data); - if (!dp) - return -EINVAL; + if (len <= 0) { + kfree(str); + /* Verified by the above copy_from_user */ + if (__copy_to_user(argp, &op, + sizeof(op))) + return -EFAULT; + return 0; + } - err = copyin_string(op.op_name, op.op_namelen, &str); - if (err) - return err; + tmp = kmalloc(len + 1, GFP_KERNEL); + if (!tmp) { + kfree(str); + return -ENOMEM; + } - if (str[0] == '\0') { - prop = dp->properties; - } else { - prop = of_find_property(dp, str, NULL); - if (prop) - prop = prop->next; - } - kfree(str); + cnt = prom_getproperty(op.op_nodeid, str, tmp, len); + if (cnt <= 0) { + error = -EINVAL; + } else { + tmp[len] = '\0'; - if (!prop) - len = 0; - else - len = prop->length; + if (__copy_to_user(argp, &op, sizeof(op)) != 0 || + copy_to_user(op.op_buf, tmp, len) != 0) + error = -EFAULT; + } - if (len > op.op_buflen) - len = op.op_buflen; + kfree(tmp); + kfree(str); - if (copy_to_user(argp, &op, sizeof(op))) - return -EFAULT; + return error; - if (len && - copy_to_user(op.op_buf, prop->value, len)) - return -EFAULT; + case OPIOCNEXTPROP: + if (copy_from_user(&op, argp, sizeof(op))) + return -EFAULT; - return 0; -} + if (!goodnode(op.op_nodeid,data)) + return -EINVAL; -static int opiocset(void __user *argp, DATA *data) -{ - struct opiocdesc op; - struct device_node *dp; - char *str, *tmp; - int err; + error = copyin_string(op.op_name, op.op_namelen, &str); + if (error) + return error; - if (copy_from_user(&op, argp, sizeof(op))) - return -EFAULT; + tmp = prom_nextprop(op.op_nodeid,str,buffer); - dp = get_node(op.op_nodeid, data); - if (!dp) - return -EINVAL; + if (tmp) { + len = strlen(tmp); + if (len > op.op_buflen) + len = op.op_buflen; + else + op.op_buflen = len; + } else { + len = op.op_buflen = 0; + } - err = copyin_string(op.op_name, op.op_namelen, &str); - if (err) - return err; + if (!access_ok(VERIFY_WRITE, argp, sizeof(op))) { + kfree(str); + return -EFAULT; + } - err = copyin_string(op.op_buf, op.op_buflen, &tmp); - if (err) { - kfree(str); - return err; - } + if (!access_ok(VERIFY_WRITE, op.op_buf, len)) { + kfree(str); + return -EFAULT; + } - err = of_set_property(dp, str, tmp, op.op_buflen); + error = __copy_to_user(argp, &op, sizeof(op)); + if (!error) error = __copy_to_user(op.op_buf, tmp, len); - kfree(str); - kfree(tmp); + kfree(str); - return err; -} + return error; -static int opiocgetnext(unsigned int cmd, void __user *argp) -{ - struct device_node *dp; - phandle nd; + case OPIOCSET: + if (copy_from_user(&op, argp, sizeof(op))) + return -EFAULT; - BUILD_BUG_ON(sizeof(phandle) != sizeof(int)); + if (!goodnode(op.op_nodeid,data)) + return -EINVAL; - if (copy_from_user(&nd, argp, sizeof(phandle))) - return -EFAULT; + error = copyin_string(op.op_name, op.op_namelen, &str); + if (error) + return error; - if (nd == 0) { - if (cmd != OPIOCGETNEXT) - return -EINVAL; - dp = of_find_node_by_path("/"); - } else { - dp = of_find_node_by_phandle(nd); - nd = 0; - if (dp) { - if (cmd == OPIOCGETNEXT) - dp = dp->sibling; - else - dp = dp->child; + error = copyin_string(op.op_buf, op.op_buflen, &tmp); + if (error) { + kfree(str); + return error; } - } - if (dp) - nd = dp->node; - if (copy_to_user(argp, &nd, sizeof(phandle))) - return -EFAULT; - return 0; -} + len = prom_setprop(op.op_nodeid,str,tmp,op.op_buflen+1); -static int openprom_bsd_ioctl(struct inode * inode, struct file * file, - unsigned int cmd, unsigned long arg) -{ - DATA *data = (DATA *) file->private_data; - void __user *argp = (void __user *)arg; - int err; - - switch (cmd) { - case OPIOCGET: - err = opiocget(argp, data); - break; + if (len != op.op_buflen) + return -EINVAL; - case OPIOCNEXTPROP: - err = opiocnextprop(argp, data); - break; + kfree(str); + kfree(tmp); - case OPIOCSET: - err = opiocset(argp, data); - break; + return 0; case OPIOCGETOPTNODE: - BUILD_BUG_ON(sizeof(phandle) != sizeof(int)); - - if (copy_to_user(argp, &options_node->node, sizeof(phandle))) + if (copy_to_user(argp, &options_node, sizeof(int))) return -EFAULT; - return 0; case OPIOCGETNEXT: case OPIOCGETCHILD: - err = opiocgetnext(cmd, argp); - break; + if (copy_from_user(&node, argp, sizeof(int))) + return -EFAULT; + + if (cmd == OPIOCGETNEXT) + node = __prom_getsibling(node); + else + node = __prom_getchild(node); + + if (__copy_to_user(argp, &node, sizeof(int))) + return -EFAULT; + + return 0; default: + if (cnt++ < 10) + printk(KERN_INFO "openprom_bsd_ioctl: cmd 0x%X\n", cmd); return -EINVAL; - }; - - return err; + } } @@ -590,6 +511,7 @@ static int openprom_ioctl(struct inode * inode, struct file * file, unsigned int cmd, unsigned long arg) { DATA *data = (DATA *) file->private_data; + static int cnt; switch (cmd) { case OPROMGETOPT: @@ -641,8 +563,10 @@ static int openprom_ioctl(struct inode * inode, struct file * file, return openprom_bsd_ioctl(inode,file,cmd,arg); default: + if (cnt++ < 10) + printk("openprom_ioctl: cmd 0x%X, arg 0x%lX\n", cmd, arg); return -EINVAL; - }; + } } static long openprom_compat_ioctl(struct file *file, unsigned int cmd, @@ -670,7 +594,9 @@ static long openprom_compat_ioctl(struct file *file, unsigned int cmd, case OPROMSETCUR: case OPROMPCI2NODE: case OPROMPATH2NODE: + lock_kernel(); rval = openprom_ioctl(file->f_dentry->d_inode, file, cmd, arg); + lock_kernel(); break; } @@ -681,13 +607,13 @@ static int openprom_open(struct inode * inode, struct file * file) { DATA *data; - data = kmalloc(sizeof(DATA), GFP_KERNEL); + data = (DATA *) kmalloc(sizeof(DATA), GFP_KERNEL); if (!data) return -ENOMEM; - data->current_node = of_find_node_by_path("/"); - data->lastnode = data->current_node; - file->private_data = (void *) data; + data->current_node = prom_root_node; + data->lastnode = prom_root_node; + file->private_data = (void *)data; return 0; } @@ -708,30 +634,24 @@ static struct file_operations openprom_fops = { }; static struct miscdevice openprom_dev = { - .minor = SUN_OPENPROM_MINOR, - .name = "openprom", - .fops = &openprom_fops, + SUN_OPENPROM_MINOR, "openprom", &openprom_fops }; static int __init openprom_init(void) { - struct device_node *dp; - int err; - - err = misc_register(&openprom_dev); - if (err) - return err; + int error; - dp = of_find_node_by_path("/"); - dp = dp->child; - while (dp) { - if (!strcmp(dp->name, "options")) - break; - dp = dp->sibling; + error = misc_register(&openprom_dev); + if (error) { + printk(KERN_ERR "openprom: unable to get misc minor\n"); + return error; } - options_node = dp; - if (!options_node) { + options_node = prom_getchild(prom_root_node); + options_node = prom_searchsiblings(options_node,"options"); + + if (options_node == 0 || options_node == -1) { + printk(KERN_ERR "openprom: unable to find options node\n"); misc_deregister(&openprom_dev); return -EIO; } @@ -746,3 +666,4 @@ static void __exit openprom_cleanup(void) module_init(openprom_init); module_exit(openprom_cleanup); +MODULE_LICENSE("GPL"); diff --git a/trunk/drivers/sbus/char/riowatchdog.c b/trunk/drivers/sbus/char/riowatchdog.c index 2a9cc8204429..d1babff6a535 100644 --- a/trunk/drivers/sbus/char/riowatchdog.c +++ b/trunk/drivers/sbus/char/riowatchdog.c @@ -211,7 +211,7 @@ static int __init riowd_bbc_init(void) for_each_ebus(ebus) { for_each_ebusdev(edev, ebus) { - if (!strcmp(edev->ofdev.node->name, "bbc")) + if (!strcmp(edev->prom_name, "bbc")) goto found_bbc; } } @@ -238,7 +238,7 @@ static int __init riowd_init(void) for_each_ebus(ebus) { for_each_ebusdev(edev, ebus) { - if (!strcmp(edev->ofdev.node->name, RIOWD_NAME)) + if (!strcmp(edev->prom_name, RIOWD_NAME)) goto ebus_done; } } diff --git a/trunk/drivers/scsi/aacraid/comminit.c b/trunk/drivers/scsi/aacraid/comminit.c index 7cea514e810a..35b0a6ebd3f5 100644 --- a/trunk/drivers/scsi/aacraid/comminit.c +++ b/trunk/drivers/scsi/aacraid/comminit.c @@ -104,11 +104,8 @@ static int aac_alloc_comm(struct aac_dev *dev, void **commaddr, unsigned long co * always true on real computers. It also has some slight problems * with the GART on x86-64. I've btw never tried DMA from PCI space * on this platform but don't be surprised if its problematic. - * [AK: something is very very wrong when a driver tests this symbol. - * Someone should figure out what the comment writer really meant here and fix - * the code. Or just remove that bad code. ] */ -#ifndef CONFIG_IOMMU +#ifndef CONFIG_GART_IOMMU if ((num_physpages << (PAGE_SHIFT - 12)) <= AAC_MAX_HOSTPHYSMEMPAGES) { init->HostPhysMemPages = cpu_to_le32(num_physpages << (PAGE_SHIFT-12)); diff --git a/trunk/drivers/scsi/ncr53c8xx.c b/trunk/drivers/scsi/ncr53c8xx.c index b28712df0b77..6ab035590ee6 100644 --- a/trunk/drivers/scsi/ncr53c8xx.c +++ b/trunk/drivers/scsi/ncr53c8xx.c @@ -5118,7 +5118,8 @@ static void ncr_ccb_skipped(struct ncb *np, struct ccb *cp) cp->host_status &= ~HS_SKIPMASK; cp->start.schedule.l_paddr = cpu_to_scr(NCB_SCRIPT_PHYS (np, select)); - list_move_tail(&cp->link_ccbq, &lp->skip_ccbq); + list_del(&cp->link_ccbq); + list_add_tail(&cp->link_ccbq, &lp->skip_ccbq); if (cp->queued) { --lp->queuedccbs; } diff --git a/trunk/drivers/scsi/qla2xxx/qla_init.c b/trunk/drivers/scsi/qla2xxx/qla_init.c index 3d4487eac9b7..aef093db597e 100644 --- a/trunk/drivers/scsi/qla2xxx/qla_init.c +++ b/trunk/drivers/scsi/qla2xxx/qla_init.c @@ -2258,7 +2258,8 @@ qla2x00_configure_fabric(scsi_qla_host_t *ha) } /* Remove device from the new list and add it to DB */ - list_move_tail(&fcport->list, &ha->fcports); + list_del(&fcport->list); + list_add_tail(&fcport->list, &ha->fcports); /* Login and update database */ qla2x00_fabric_dev_login(ha, fcport, &next_loopid); diff --git a/trunk/drivers/usb/host/hc_crisv10.c b/trunk/drivers/usb/host/hc_crisv10.c index 4a22909518f5..2fe7fd19437b 100644 --- a/trunk/drivers/usb/host/hc_crisv10.c +++ b/trunk/drivers/usb/host/hc_crisv10.c @@ -411,7 +411,8 @@ static inline void urb_list_move_last(struct urb *urb, int epid) urb_entry_t *urb_entry = __urb_list_entry(urb, epid); assert(urb_entry); - list_move_tail(&urb_entry->list, &urb_list[epid]); + list_del(&urb_entry->list); + list_add_tail(&urb_entry->list, &urb_list[epid]); } /* Get the next urb in the list. */ diff --git a/trunk/drivers/usb/input/fixp-arith.h b/trunk/drivers/usb/input/fixp-arith.h index ed3d2da0c485..b44d398de071 100644 --- a/trunk/drivers/usb/input/fixp-arith.h +++ b/trunk/drivers/usb/input/fixp-arith.h @@ -2,6 +2,8 @@ #define _FIXP_ARITH_H /* + * $$ + * * Simplistic fixed-point arithmetics. * Hmm, I'm probably duplicating some code :( * @@ -29,20 +31,20 @@ #include -/* The type representing fixed-point values */ +// The type representing fixed-point values typedef s16 fixp_t; #define FRAC_N 8 #define FRAC_MASK ((1<>= 1; diff --git a/trunk/drivers/usb/input/hid-debug.h b/trunk/drivers/usb/input/hid-debug.h index f04d6d75c098..702c48c2f81b 100644 --- a/trunk/drivers/usb/input/hid-debug.h +++ b/trunk/drivers/usb/input/hid-debug.h @@ -563,7 +563,7 @@ static char *keys[KEY_MAX + 1] = { [KEY_VOLUMEUP] = "VolumeUp", [KEY_POWER] = "Power", [KEY_KPEQUAL] = "KPEqual", [KEY_KPPLUSMINUS] = "KPPlusMinus", [KEY_PAUSE] = "Pause", [KEY_KPCOMMA] = "KPComma", - [KEY_HANGUEL] = "Hangeul", [KEY_HANJA] = "Hanja", + [KEY_HANGUEL] = "Hanguel", [KEY_HANJA] = "Hanja", [KEY_YEN] = "Yen", [KEY_LEFTMETA] = "LeftMeta", [KEY_RIGHTMETA] = "RightMeta", [KEY_COMPOSE] = "Compose", [KEY_STOP] = "Stop", [KEY_AGAIN] = "Again", diff --git a/trunk/drivers/usb/serial/whiteheat.c b/trunk/drivers/usb/serial/whiteheat.c index 56ffc81302fc..5b06fa366098 100644 --- a/trunk/drivers/usb/serial/whiteheat.c +++ b/trunk/drivers/usb/serial/whiteheat.c @@ -686,16 +686,19 @@ static void whiteheat_close(struct usb_serial_port *port, struct file * filp) wrap = list_entry(tmp, struct whiteheat_urb_wrap, list); urb = wrap->urb; usb_kill_urb(urb); - list_move(tmp, &info->rx_urbs_free); + list_del(tmp); + list_add(tmp, &info->rx_urbs_free); + } + list_for_each_safe(tmp, tmp2, &info->rx_urb_q) { + list_del(tmp); + list_add(tmp, &info->rx_urbs_free); } - list_for_each_safe(tmp, tmp2, &info->rx_urb_q) - list_move(tmp, &info->rx_urbs_free); - list_for_each_safe(tmp, tmp2, &info->tx_urbs_submitted) { wrap = list_entry(tmp, struct whiteheat_urb_wrap, list); urb = wrap->urb; usb_kill_urb(urb); - list_move(tmp, &info->tx_urbs_free); + list_del(tmp); + list_add(tmp, &info->tx_urbs_free); } spin_unlock_irqrestore(&info->lock, flags); @@ -1077,7 +1080,8 @@ static void whiteheat_write_callback(struct urb *urb, struct pt_regs *regs) err("%s - Not my urb!", __FUNCTION__); return; } - list_move(&wrap->list, &info->tx_urbs_free); + list_del(&wrap->list); + list_add(&wrap->list, &info->tx_urbs_free); spin_unlock(&info->lock); if (urb->status) { @@ -1367,7 +1371,8 @@ static int start_port_read(struct usb_serial_port *port) wrap = list_entry(tmp, struct whiteheat_urb_wrap, list); urb = wrap->urb; usb_kill_urb(urb); - list_move(tmp, &info->rx_urbs_free); + list_del(tmp); + list_add(tmp, &info->rx_urbs_free); } break; } diff --git a/trunk/drivers/video/Kconfig b/trunk/drivers/video/Kconfig index 17de4c84db69..168ede7902bd 100644 --- a/trunk/drivers/video/Kconfig +++ b/trunk/drivers/video/Kconfig @@ -4,21 +4,6 @@ menu "Graphics support" -config FIRMWARE_EDID - bool "Enable firmware EDID" - default y - ---help--- - This enables access to the EDID transferred from the firmware. - On the i386, this is from the Video BIOS. Enable this if DDC/I2C - transfers do not work for your driver and if you are using - nvidiafb, i810fb or savagefb. - - In general, choosing Y for this option is safe. If you - experience extremely long delays while booting before you get - something on your display, try setting this to N. Matrox cards in - combination with certain motherboards and monitors are known to - suffer from this problem. - config FB tristate "Support for frame buffer devices" ---help--- @@ -85,6 +70,22 @@ config FB_MACMODES depends on FB default n +config FB_FIRMWARE_EDID + bool "Enable firmware EDID" + depends on FB + default y + ---help--- + This enables access to the EDID transferred from the firmware. + On the i386, this is from the Video BIOS. Enable this if DDC/I2C + transfers do not work for your driver and if you are using + nvidiafb, i810fb or savagefb. + + In general, choosing Y for this option is safe. If you + experience extremely long delays while booting before you get + something on your display, try setting this to N. Matrox cards in + combination with certain motherboards and monitors are known to + suffer from this problem. + config FB_BACKLIGHT bool depends on FB @@ -550,14 +551,10 @@ config FB_VESA You will get a boot time penguin logo at no additional cost. Please read . If unsure, say Y. -config FB_IMAC - bool "Intel-based Macintosh Framebuffer Support" - depends on (FB = y) && X86 - select FB_CFB_FILLRECT - select FB_CFB_COPYAREA - select FB_CFB_IMAGEBLIT - help - This is the frame buffer device driver for the Intel-based Macintosh +config VIDEO_SELECT + bool + depends on FB_VESA + default y config FB_HGA tristate "Hercules mono graphics support" @@ -581,6 +578,12 @@ config FB_HGA_ACCEL This will compile the Hercules mono graphics with acceleration functions. + +config VIDEO_SELECT + bool + depends on (FB = y) && X86 + default y + config FB_SGIVW tristate "SGI Visual Workstation framebuffer support" depends on FB && X86_VISWS diff --git a/trunk/drivers/video/Makefile b/trunk/drivers/video/Makefile index c335e9bc3b20..23de3b2c7856 100644 --- a/trunk/drivers/video/Makefile +++ b/trunk/drivers/video/Makefile @@ -4,15 +4,15 @@ # Each configuration option enables a list of files. +obj-$(CONFIG_VT) += console/ +obj-$(CONFIG_LOGO) += logo/ +obj-$(CONFIG_SYSFS) += backlight/ + obj-$(CONFIG_FB) += fb.o fb-y := fbmem.o fbmon.o fbcmap.o fbsysfs.o \ modedb.o fbcvt.o fb-objs := $(fb-y) -obj-$(CONFIG_VT) += console/ -obj-$(CONFIG_LOGO) += logo/ -obj-$(CONFIG_SYSFS) += backlight/ - obj-$(CONFIG_FB_CFB_FILLRECT) += cfbfillrect.o obj-$(CONFIG_FB_CFB_COPYAREA) += cfbcopyarea.o obj-$(CONFIG_FB_CFB_IMAGEBLIT) += cfbimgblt.o @@ -97,7 +97,6 @@ obj-$(CONFIG_FB_S3C2410) += s3c2410fb.o # Platform or fallback drivers go here obj-$(CONFIG_FB_VESA) += vesafb.o -obj-$(CONFIG_FB_IMAC) += imacfb.o obj-$(CONFIG_FB_VGA16) += vga16fb.o vgastate.o obj-$(CONFIG_FB_OF) += offb.o diff --git a/trunk/drivers/video/aty/aty128fb.c b/trunk/drivers/video/aty/aty128fb.c index 11cf7fcb1d55..db878fd55fb2 100644 --- a/trunk/drivers/video/aty/aty128fb.c +++ b/trunk/drivers/video/aty/aty128fb.c @@ -100,7 +100,7 @@ #ifndef CONFIG_PPC_PMAC /* default mode */ -static struct fb_var_screeninfo default_var __devinitdata = { +static struct fb_var_screeninfo default_var __initdata = { /* 640x480, 60 Hz, Non-Interlaced (25.175 MHz dotclock) */ 640, 480, 640, 480, 0, 0, 8, 0, {0, 8, 0}, {0, 8, 0}, {0, 8, 0}, {0, 0, 0}, @@ -123,7 +123,7 @@ static struct fb_var_screeninfo default_var = { /* default modedb mode */ /* 640x480, 60 Hz, Non-Interlaced (25.172 MHz dotclock) */ -static struct fb_videomode defaultmode __devinitdata = { +static struct fb_videomode defaultmode __initdata = { .refresh = 60, .xres = 640, .yres = 480, @@ -335,7 +335,7 @@ static const struct aty128_meminfo sdr_sgram = static const struct aty128_meminfo ddr_sgram = { 4, 4, 3, 3, 2, 3, 1, 16, 31, 16, "64-bit DDR SGRAM" }; -static struct fb_fix_screeninfo aty128fb_fix __devinitdata = { +static struct fb_fix_screeninfo aty128fb_fix __initdata = { .id = "ATY Rage128", .type = FB_TYPE_PACKED_PIXELS, .visual = FB_VISUAL_PSEUDOCOLOR, @@ -345,15 +345,15 @@ static struct fb_fix_screeninfo aty128fb_fix __devinitdata = { .accel = FB_ACCEL_ATI_RAGE128, }; -static char *mode_option __devinitdata = NULL; +static char *mode_option __initdata = NULL; #ifdef CONFIG_PPC_PMAC -static int default_vmode __devinitdata = VMODE_1024_768_60; -static int default_cmode __devinitdata = CMODE_8; +static int default_vmode __initdata = VMODE_1024_768_60; +static int default_cmode __initdata = CMODE_8; #endif -static int default_crt_on __devinitdata = 0; -static int default_lcd_on __devinitdata = 1; +static int default_crt_on __initdata = 0; +static int default_lcd_on __initdata = 1; #ifdef CONFIG_MTRR static int mtrr = 1; @@ -445,9 +445,9 @@ static int aty128_encode_var(struct fb_var_screeninfo *var, static int aty128_decode_var(struct fb_var_screeninfo *var, struct aty128fb_par *par); #if 0 -static void __devinit aty128_get_pllinfo(struct aty128fb_par *par, +static void __init aty128_get_pllinfo(struct aty128fb_par *par, void __iomem *bios); -static void __devinit __iomem *aty128_map_ROM(struct pci_dev *pdev, const struct aty128fb_par *par); +static void __init __iomem *aty128_map_ROM(struct pci_dev *pdev, const struct aty128fb_par *par); #endif static void aty128_timings(struct aty128fb_par *par); static void aty128_init_engine(struct aty128fb_par *par); @@ -573,7 +573,7 @@ static void aty_pll_writeupdate(const struct aty128fb_par *par) /* write to the scratch register to test r/w functionality */ -static int __devinit register_test(const struct aty128fb_par *par) +static int __init register_test(const struct aty128fb_par *par) { u32 val; int flag = 0; @@ -772,7 +772,7 @@ static u32 depth_to_dst(u32 depth) #ifndef __sparc__ -static void __iomem * __devinit aty128_map_ROM(const struct aty128fb_par *par, struct pci_dev *dev) +static void __iomem * __init aty128_map_ROM(const struct aty128fb_par *par, struct pci_dev *dev) { u16 dptr; u8 rom_type; @@ -856,7 +856,7 @@ static void __iomem * __devinit aty128_map_ROM(const struct aty128fb_par *par, s return NULL; } -static void __devinit aty128_get_pllinfo(struct aty128fb_par *par, unsigned char __iomem *bios) +static void __init aty128_get_pllinfo(struct aty128fb_par *par, unsigned char __iomem *bios) { unsigned int bios_hdr; unsigned int bios_pll; @@ -903,7 +903,7 @@ static void __iomem * __devinit aty128_find_mem_vbios(struct aty128fb_par *par) #endif /* ndef(__sparc__) */ /* fill in known card constants if pll_block is not available */ -static void __devinit aty128_timings(struct aty128fb_par *par) +static void __init aty128_timings(struct aty128fb_par *par) { #ifdef CONFIG_PPC_OF /* instead of a table lookup, assume OF has properly @@ -1645,7 +1645,7 @@ static int aty128fb_sync(struct fb_info *info) } #ifndef MODULE -static int __devinit aty128fb_setup(char *options) +static int __init aty128fb_setup(char *options) { char *this_opt; @@ -1893,7 +1893,7 @@ static void aty128_early_resume(void *data) } #endif /* CONFIG_PPC_PMAC */ -static int __devinit aty128_init(struct pci_dev *pdev, const struct pci_device_id *ent) +static int __init aty128_init(struct pci_dev *pdev, const struct pci_device_id *ent) { struct fb_info *info = pci_get_drvdata(pdev); struct aty128fb_par *par = info->par; @@ -2037,7 +2037,7 @@ static int __devinit aty128_init(struct pci_dev *pdev, const struct pci_device_i #ifdef CONFIG_PCI /* register a card ++ajoshi */ -static int __devinit aty128_probe(struct pci_dev *pdev, const struct pci_device_id *ent) +static int __init aty128_probe(struct pci_dev *pdev, const struct pci_device_id *ent) { unsigned long fb_addr, reg_addr; struct aty128fb_par *par; @@ -2556,7 +2556,7 @@ static int aty128_pci_resume(struct pci_dev *pdev) } -static int __devinit aty128fb_init(void) +static int __init aty128fb_init(void) { #ifndef MODULE char *option = NULL; diff --git a/trunk/drivers/video/aty/atyfb_base.c b/trunk/drivers/video/aty/atyfb_base.c index 22e720611bf6..c5185f7cf4ba 100644 --- a/trunk/drivers/video/aty/atyfb_base.c +++ b/trunk/drivers/video/aty/atyfb_base.c @@ -316,12 +316,12 @@ static int vram; static int pll; static int mclk; static int xclk; -static int comp_sync __devinitdata = -1; +static int comp_sync __initdata = -1; static char *mode; #ifdef CONFIG_PPC -static int default_vmode __devinitdata = VMODE_CHOOSE; -static int default_cmode __devinitdata = CMODE_CHOOSE; +static int default_vmode __initdata = VMODE_CHOOSE; +static int default_cmode __initdata = CMODE_CHOOSE; module_param_named(vmode, default_vmode, int, 0); MODULE_PARM_DESC(vmode, "int: video mode for mac"); @@ -330,10 +330,10 @@ MODULE_PARM_DESC(cmode, "int: color mode for mac"); #endif #ifdef CONFIG_ATARI -static unsigned int mach64_count __devinitdata = 0; -static unsigned long phys_vmembase[FB_MAX] __devinitdata = { 0, }; -static unsigned long phys_size[FB_MAX] __devinitdata = { 0, }; -static unsigned long phys_guiregbase[FB_MAX] __devinitdata = { 0, }; +static unsigned int mach64_count __initdata = 0; +static unsigned long phys_vmembase[FB_MAX] __initdata = { 0, }; +static unsigned long phys_size[FB_MAX] __initdata = { 0, }; +static unsigned long phys_guiregbase[FB_MAX] __initdata = { 0, }; #endif /* top -> down is an evolution of mach64 chipset, any corrections? */ @@ -583,7 +583,7 @@ static u32 atyfb_get_pixclock(struct fb_var_screeninfo *var, struct atyfb_par *p * Apple monitor sense */ -static int __devinit read_aty_sense(const struct atyfb_par *par) +static int __init read_aty_sense(const struct atyfb_par *par) { int sense, i; @@ -1281,14 +1281,6 @@ static int atyfb_set_par(struct fb_info *info) par->accel_flags = var->accel_flags; /* hack */ - if (var->accel_flags) { - info->fbops->fb_sync = atyfb_sync; - info->flags &= ~FBINFO_HWACCEL_DISABLED; - } else { - info->fbops->fb_sync = NULL; - info->flags |= FBINFO_HWACCEL_DISABLED; - } - if (par->blitter_may_be_busy) wait_for_idle(par); @@ -2261,7 +2253,7 @@ static void aty_bl_exit(struct atyfb_par *par) #endif /* CONFIG_FB_ATY_BACKLIGHT */ -static void __devinit aty_calc_mem_refresh(struct atyfb_par *par, int xclk) +static void __init aty_calc_mem_refresh(struct atyfb_par *par, int xclk) { const int ragepro_tbl[] = { 44, 50, 55, 66, 75, 80, 100 @@ -2321,7 +2313,7 @@ static int __devinit atyfb_get_timings_from_lcd(struct atyfb_par *par, } #endif /* defined(__i386__) && defined(CONFIG_FB_ATY_GENERIC_LCD) */ -static int __devinit aty_init(struct fb_info *info, const char *name) +static int __init aty_init(struct fb_info *info, const char *name) { struct atyfb_par *par = (struct atyfb_par *) info->par; const char *ramname = NULL, *xtal; @@ -2402,15 +2394,12 @@ static int __devinit aty_init(struct fb_info *info, const char *name) break; } switch (clk_type) { -#ifdef CONFIG_ATARI case CLK_ATI18818_1: par->pll_ops = &aty_pll_ati18818_1; break; -#else case CLK_IBMRGB514: par->pll_ops = &aty_pll_ibm514; break; -#endif #if 0 /* dead code */ case CLK_STG1703: par->pll_ops = &aty_pll_stg1703; @@ -2615,11 +2604,7 @@ static int __devinit aty_init(struct fb_info *info, const char *name) info->fbops = &atyfb_ops; info->pseudo_palette = pseudo_palette; - info->flags = FBINFO_DEFAULT | - FBINFO_HWACCEL_IMAGEBLIT | - FBINFO_HWACCEL_FILLRECT | - FBINFO_HWACCEL_COPYAREA | - FBINFO_HWACCEL_YPAN; + info->flags = FBINFO_FLAG_DEFAULT; #ifdef CONFIG_PMAC_BACKLIGHT if (M64_HAS(G3_PB_1_1) && machine_is_compatible("PowerBook1,1")) { @@ -2748,7 +2733,7 @@ static int __devinit aty_init(struct fb_info *info, const char *name) } #ifdef CONFIG_ATARI -static int __devinit store_video_par(char *video_str, unsigned char m64_num) +static int __init store_video_par(char *video_str, unsigned char m64_num) { char *p; unsigned long vmembase, size, guiregbase; @@ -3779,7 +3764,7 @@ static struct pci_driver atyfb_driver = { #endif /* CONFIG_PCI */ #ifndef MODULE -static int __devinit atyfb_setup(char *options) +static int __init atyfb_setup(char *options) { char *this_opt; @@ -3851,7 +3836,7 @@ static int __devinit atyfb_setup(char *options) } #endif /* MODULE */ -static int __devinit atyfb_init(void) +static int __init atyfb_init(void) { #ifndef MODULE char *option = NULL; diff --git a/trunk/drivers/video/aty/mach64_accel.c b/trunk/drivers/video/aty/mach64_accel.c index 1490e5e1c232..c98f4a442134 100644 --- a/trunk/drivers/video/aty/mach64_accel.c +++ b/trunk/drivers/video/aty/mach64_accel.c @@ -200,6 +200,8 @@ void atyfb_copyarea(struct fb_info *info, const struct fb_copyarea *area) if (!area->width || !area->height) return; if (!par->accel_flags) { + if (par->blitter_may_be_busy) + wait_for_idle(par); cfb_copyarea(info, area); return; } @@ -246,6 +248,8 @@ void atyfb_fillrect(struct fb_info *info, const struct fb_fillrect *rect) if (!rect->width || !rect->height) return; if (!par->accel_flags) { + if (par->blitter_may_be_busy) + wait_for_idle(par); cfb_fillrect(info, rect); return; } @@ -284,10 +288,14 @@ void atyfb_imageblit(struct fb_info *info, const struct fb_image *image) return; if (!par->accel_flags || (image->depth != 1 && info->var.bits_per_pixel != image->depth)) { + if (par->blitter_may_be_busy) + wait_for_idle(par); + cfb_imageblit(info, image); return; } + wait_for_idle(par); pix_width = pix_width_save = aty_ld_le32(DP_PIX_WIDTH, par); host_cntl = aty_ld_le32(HOST_CNTL, par) | HOST_BYTE_ALIGN; @@ -417,6 +425,8 @@ void atyfb_imageblit(struct fb_info *info, const struct fb_image *image) } } + wait_for_idle(par); + /* restore pix_width */ wait_for_fifo(1, par); aty_st_le32(DP_PIX_WIDTH, pix_width_save, par); diff --git a/trunk/drivers/video/aty/mach64_cursor.c b/trunk/drivers/video/aty/mach64_cursor.c index 2a7f381c330f..ad8b7496f853 100644 --- a/trunk/drivers/video/aty/mach64_cursor.c +++ b/trunk/drivers/video/aty/mach64_cursor.c @@ -66,6 +66,11 @@ static const u8 cursor_bits_lookup[16] = { 0x01, 0x41, 0x11, 0x51, 0x05, 0x45, 0x15, 0x55 }; +static const u8 cursor_mask_lookup[16] = { + 0xaa, 0x2a, 0x8a, 0x0a, 0xa2, 0x22, 0x82, 0x02, + 0xa8, 0x28, 0x88, 0x08, 0xa0, 0x20, 0x80, 0x00 +}; + static int atyfb_cursor(struct fb_info *info, struct fb_cursor *cursor) { struct atyfb_par *par = (struct atyfb_par *) info->par; @@ -125,13 +130,13 @@ static int atyfb_cursor(struct fb_info *info, struct fb_cursor *cursor) fg_idx = cursor->image.fg_color; bg_idx = cursor->image.bg_color; - fg = ((info->cmap.red[fg_idx] & 0xff) << 24) | - ((info->cmap.green[fg_idx] & 0xff) << 16) | - ((info->cmap.blue[fg_idx] & 0xff) << 8) | 0xff; + fg = (info->cmap.red[fg_idx] << 24) | + (info->cmap.green[fg_idx] << 16) | + (info->cmap.blue[fg_idx] << 8) | 15; - bg = ((info->cmap.red[bg_idx] & 0xff) << 24) | - ((info->cmap.green[bg_idx] & 0xff) << 16) | - ((info->cmap.blue[bg_idx] & 0xff) << 8); + bg = (info->cmap.red[bg_idx] << 24) | + (info->cmap.green[bg_idx] << 16) | + (info->cmap.blue[bg_idx] << 8); wait_for_fifo(2, par); aty_st_le32(CUR_CLR0, bg, par); @@ -161,17 +166,19 @@ static int atyfb_cursor(struct fb_info *info, struct fb_cursor *cursor) switch (cursor->rop) { case ROP_XOR: // Upper 4 bits of mask data - fb_writeb(cursor_bits_lookup[(b ^ m) >> 4], dst++); + fb_writeb(cursor_mask_lookup[m >> 4 ] | + cursor_bits_lookup[(b ^ m) >> 4], dst++); // Lower 4 bits of mask - fb_writeb(cursor_bits_lookup[(b ^ m) & 0x0f], - dst++); + fb_writeb(cursor_mask_lookup[m & 0x0f ] | + cursor_bits_lookup[(b ^ m) & 0x0f], dst++); break; case ROP_COPY: // Upper 4 bits of mask data - fb_writeb(cursor_bits_lookup[(b & m) >> 4], dst++); + fb_writeb(cursor_mask_lookup[m >> 4 ] | + cursor_bits_lookup[(b & m) >> 4], dst++); // Lower 4 bits of mask - fb_writeb(cursor_bits_lookup[(b & m) & 0x0f], - dst++); + fb_writeb(cursor_mask_lookup[m & 0x0f ] | + cursor_bits_lookup[(b & m) & 0x0f], dst++); break; } } @@ -187,7 +194,7 @@ static int atyfb_cursor(struct fb_info *info, struct fb_cursor *cursor) return 0; } -int __devinit aty_init_cursor(struct fb_info *info) +int __init aty_init_cursor(struct fb_info *info) { unsigned long addr; diff --git a/trunk/drivers/video/aty/radeon_base.c b/trunk/drivers/video/aty/radeon_base.c index 68b15645b893..c5ecbb02e01d 100644 --- a/trunk/drivers/video/aty/radeon_base.c +++ b/trunk/drivers/video/aty/radeon_base.c @@ -2379,6 +2379,7 @@ static int __devinit radeonfb_pci_register (struct pci_dev *pdev, err_release_fb: framebuffer_release(info); err_disable: + pci_disable_device(pdev); err_out: return ret; } @@ -2435,6 +2436,7 @@ static void __devexit radeonfb_pci_unregister (struct pci_dev *pdev) #endif fb_dealloc_cmap(&info->cmap); framebuffer_release(info); + pci_disable_device(pdev); } diff --git a/trunk/drivers/video/au1100fb.c b/trunk/drivers/video/au1100fb.c index d63c3f485853..789450bb0bc9 100644 --- a/trunk/drivers/video/au1100fb.c +++ b/trunk/drivers/video/au1100fb.c @@ -7,8 +7,6 @@ * Karl Lessard * * - * PM support added by Rodolfo Giometti - * * Copyright 2002 MontaVista Software * Author: MontaVista Software, Inc. * ppopov@mvista.com or source@mvista.com @@ -604,52 +602,17 @@ int au1100fb_drv_remove(struct device *dev) return 0; } -#ifdef CONFIG_PM -static u32 sys_clksrc; -static struct au1100fb_regs fbregs; - int au1100fb_drv_suspend(struct device *dev, pm_message_t state) { - struct au1100fb_device *fbdev = dev_get_drvdata(dev); - - if (!fbdev) - return 0; - - /* Save the clock source state */ - sys_clksrc = au_readl(SYS_CLKSRC); - - /* Blank the LCD */ - au1100fb_fb_blank(VESA_POWERDOWN, &fbdev->info); - - /* Stop LCD clocking */ - au_writel(sys_clksrc & ~SYS_CS_ML_MASK, SYS_CLKSRC); - - memcpy(&fbregs, fbdev->regs, sizeof(struct au1100fb_regs)); - + /* TODO */ return 0; } int au1100fb_drv_resume(struct device *dev) { - struct au1100fb_device *fbdev = dev_get_drvdata(dev); - - if (!fbdev) - return 0; - - memcpy(fbdev->regs, &fbregs, sizeof(struct au1100fb_regs)); - - /* Restart LCD clocking */ - au_writel(sys_clksrc, SYS_CLKSRC); - - /* Unblank the LCD */ - au1100fb_fb_blank(VESA_NO_BLANKING, &fbdev->info); - + /* TODO */ return 0; } -#else -#define au1100fb_drv_suspend NULL -#define au1100fb_drv_resume NULL -#endif static struct device_driver au1100fb_driver = { .name = "au1100-lcd", diff --git a/trunk/drivers/video/backlight/Kconfig b/trunk/drivers/video/backlight/Kconfig index 022f9d3473f5..b895eaaa73fd 100644 --- a/trunk/drivers/video/backlight/Kconfig +++ b/trunk/drivers/video/backlight/Kconfig @@ -10,7 +10,7 @@ menuconfig BACKLIGHT_LCD_SUPPORT config BACKLIGHT_CLASS_DEVICE tristate "Lowlevel Backlight controls" - depends on BACKLIGHT_LCD_SUPPORT && FB + depends on BACKLIGHT_LCD_SUPPORT default m help This framework adds support for low-level control of the LCD @@ -26,7 +26,7 @@ config BACKLIGHT_DEVICE config LCD_CLASS_DEVICE tristate "Lowlevel LCD controls" - depends on BACKLIGHT_LCD_SUPPORT && FB + depends on BACKLIGHT_LCD_SUPPORT default m help This framework adds support for low-level control of LCD. @@ -50,14 +50,6 @@ config BACKLIGHT_CORGI If you have a Sharp Zaurus SL-C7xx, SL-Cxx00 or SL-6000x say y to enable the backlight driver. -config BACKLIGHT_LOCOMO - tristate "Sharp LOCOMO LCD/Backlight Driver" - depends on BACKLIGHT_DEVICE && SHARP_LOCOMO - default y - help - If you have a Sharp Zaurus SL-5500 (Collie) or SL-5600 (Poodle) say y to - enable the LCD/backlight driver. - config BACKLIGHT_HP680 tristate "HP Jornada 680 Backlight Driver" depends on BACKLIGHT_DEVICE && SH_HP6XX diff --git a/trunk/drivers/video/backlight/Makefile b/trunk/drivers/video/backlight/Makefile index 65e5553fc849..744210c38e74 100644 --- a/trunk/drivers/video/backlight/Makefile +++ b/trunk/drivers/video/backlight/Makefile @@ -4,4 +4,4 @@ obj-$(CONFIG_LCD_CLASS_DEVICE) += lcd.o obj-$(CONFIG_BACKLIGHT_CLASS_DEVICE) += backlight.o obj-$(CONFIG_BACKLIGHT_CORGI) += corgi_bl.o obj-$(CONFIG_BACKLIGHT_HP680) += hp680_bl.o -obj-$(CONFIG_BACKLIGHT_LOCOMO) += locomolcd.o +obj-$(CONFIG_SHARP_LOCOMO) += locomolcd.o diff --git a/trunk/drivers/video/backlight/locomolcd.c b/trunk/drivers/video/backlight/locomolcd.c index bd879b7ec119..60831bb23685 100644 --- a/trunk/drivers/video/backlight/locomolcd.c +++ b/trunk/drivers/video/backlight/locomolcd.c @@ -17,8 +17,6 @@ #include #include #include -#include -#include #include #include @@ -27,10 +25,7 @@ #include "../../../arch/arm/mach-sa1100/generic.h" -static struct backlight_device *locomolcd_bl_device; static struct locomo_dev *locomolcd_dev; -static unsigned long locomolcd_flags; -#define LOCOMOLCD_SUSPENDED 0x01 static void locomolcd_on(int comadj) { @@ -94,10 +89,12 @@ void locomolcd_power(int on) } /* read comadj */ - if (comadj == -1 && machine_is_collie()) - comadj = 128; - if (comadj == -1 && machine_is_poodle()) - comadj = 118; + if (comadj == -1) { + if (machine_is_poodle()) + comadj = 118; + if (machine_is_collie()) + comadj = 128; + } if (on) locomolcd_on(comadj); @@ -108,100 +105,26 @@ void locomolcd_power(int on) } EXPORT_SYMBOL(locomolcd_power); - -static int current_intensity; - -static int locomolcd_set_intensity(struct backlight_device *bd) -{ - int intensity = bd->props->brightness; - - if (bd->props->power != FB_BLANK_UNBLANK) - intensity = 0; - if (bd->props->fb_blank != FB_BLANK_UNBLANK) - intensity = 0; - if (locomolcd_flags & LOCOMOLCD_SUSPENDED) - intensity = 0; - - switch (intensity) { - /* AC and non-AC are handled differently, but produce same results in sharp code? */ - case 0: locomo_frontlight_set(locomolcd_dev, 0, 0, 161); break; - case 1: locomo_frontlight_set(locomolcd_dev, 117, 0, 161); break; - case 2: locomo_frontlight_set(locomolcd_dev, 163, 0, 148); break; - case 3: locomo_frontlight_set(locomolcd_dev, 194, 0, 161); break; - case 4: locomo_frontlight_set(locomolcd_dev, 194, 1, 161); break; - - default: - return -ENODEV; - } - current_intensity = intensity; - return 0; -} - -static int locomolcd_get_intensity(struct backlight_device *bd) -{ - return current_intensity; -} - -static struct backlight_properties locomobl_data = { - .owner = THIS_MODULE, - .get_brightness = locomolcd_get_intensity, - .update_status = locomolcd_set_intensity, - .max_brightness = 4, -}; - -#ifdef CONFIG_PM -static int locomolcd_suspend(struct locomo_dev *dev, pm_message_t state) -{ - locomolcd_flags |= LOCOMOLCD_SUSPENDED; - locomolcd_set_intensity(locomolcd_bl_device); - return 0; -} - -static int locomolcd_resume(struct locomo_dev *dev) -{ - locomolcd_flags &= ~LOCOMOLCD_SUSPENDED; - locomolcd_set_intensity(locomolcd_bl_device); - return 0; -} -#else -#define locomolcd_suspend NULL -#define locomolcd_resume NULL -#endif - -static int locomolcd_probe(struct locomo_dev *dev) +static int poodle_lcd_probe(struct locomo_dev *dev) { unsigned long flags; local_irq_save(flags); locomolcd_dev = dev; - locomo_gpio_set_dir(dev, LOCOMO_GPIO_FL_VR, 0); - /* the poodle_lcd_power function is called for the first time * from fs_initcall, which is before locomo is activated. * We need to recall poodle_lcd_power here*/ - if (machine_is_poodle()) - locomolcd_power(1); - +#ifdef CONFIG_MACH_POODLE + locomolcd_power(1); +#endif local_irq_restore(flags); - - locomolcd_bl_device = backlight_device_register("locomo-bl", NULL, &locomobl_data); - - if (IS_ERR (locomolcd_bl_device)) - return PTR_ERR (locomolcd_bl_device); - - /* Set up frontlight so that screen is readable */ - locomobl_data.brightness = 2; - locomolcd_set_intensity(locomolcd_bl_device); - return 0; } -static int locomolcd_remove(struct locomo_dev *dev) +static int poodle_lcd_remove(struct locomo_dev *dev) { unsigned long flags; - - backlight_device_unregister(locomolcd_bl_device); local_irq_save(flags); locomolcd_dev = NULL; local_irq_restore(flags); @@ -213,33 +136,19 @@ static struct locomo_driver poodle_lcd_driver = { .name = "locomo-backlight", }, .devid = LOCOMO_DEVID_BACKLIGHT, - .probe = locomolcd_probe, - .remove = locomolcd_remove, - .suspend = locomolcd_suspend, - .resume = locomolcd_resume, + .probe = poodle_lcd_probe, + .remove = poodle_lcd_remove, }; - -static int __init locomolcd_init(void) +static int __init poodle_lcd_init(void) { int ret = locomo_driver_register(&poodle_lcd_driver); - if (ret) - return ret; + if (ret) return ret; #ifdef CONFIG_SA1100_COLLIE sa1100fb_lcd_power = locomolcd_power; #endif return 0; } +device_initcall(poodle_lcd_init); -static void __exit locomolcd_exit(void) -{ - locomo_driver_unregister(&poodle_lcd_driver); -} - -module_init(locomolcd_init); -module_exit(locomolcd_exit); - -MODULE_AUTHOR("John Lenz , Pavel Machek "); -MODULE_DESCRIPTION("Collie LCD driver"); -MODULE_LICENSE("GPL"); diff --git a/trunk/drivers/video/cfbimgblt.c b/trunk/drivers/video/cfbimgblt.c index ad8a89bf8eae..8ba6152db2fd 100644 --- a/trunk/drivers/video/cfbimgblt.c +++ b/trunk/drivers/video/cfbimgblt.c @@ -230,7 +230,6 @@ static inline void fast_imageblit(const struct fb_image *image, struct fb_info * tab = cfb_tab16; break; case 32: - default: tab = cfb_tab32; break; } diff --git a/trunk/drivers/video/cirrusfb.c b/trunk/drivers/video/cirrusfb.c index dda240eb7360..1103010af54a 100644 --- a/trunk/drivers/video/cirrusfb.c +++ b/trunk/drivers/video/cirrusfb.c @@ -2227,6 +2227,7 @@ static void cirrusfb_pci_unmap (struct cirrusfb_info *cinfo) release_region(0x3C0, 32); pci_release_regions(pdev); framebuffer_release(cinfo->info); + pci_disable_device(pdev); } #endif /* CONFIG_PCI */ @@ -2457,6 +2458,7 @@ static int cirrusfb_pci_register (struct pci_dev *pdev, err_release_fb: framebuffer_release(info); err_disable: + pci_disable_device(pdev); err_out: return ret; } diff --git a/trunk/drivers/video/console/fbcon.c b/trunk/drivers/video/console/fbcon.c index 5dc4083552d8..47ba1a79adcd 100644 --- a/trunk/drivers/video/console/fbcon.c +++ b/trunk/drivers/video/console/fbcon.c @@ -125,8 +125,6 @@ static int softback_lines; static int first_fb_vc; static int last_fb_vc = MAX_NR_CONSOLES - 1; static int fbcon_is_default = 1; -static int fbcon_has_exited; - /* font data */ static char fontname[40]; @@ -142,6 +140,7 @@ static const struct consw fb_con; #define advance_row(p, delta) (unsigned short *)((unsigned long)(p) + (delta) * vc->vc_size_row) +static void fbcon_free_font(struct display *); static int fbcon_set_origin(struct vc_data *); #define CURSOR_DRAW_DELAY (1) @@ -195,9 +194,6 @@ static void fbcon_redraw_move(struct vc_data *vc, struct display *p, int line, int count, int dy); static void fbcon_modechanged(struct fb_info *info); static void fbcon_set_all_vcs(struct fb_info *info); -static void fbcon_start(void); -static void fbcon_exit(void); -static struct class_device *fbcon_class_device; #ifdef CONFIG_MAC /* @@ -256,7 +252,7 @@ static void fbcon_rotate_all(struct fb_info *info, u32 rotate) if (!ops || ops->currcon < 0 || rotate > 3) return; - for (i = first_fb_vc; i <= last_fb_vc; i++) { + for (i = 0; i < MAX_NR_CONSOLES; i++) { vc = vc_cons[i].d; if (!vc || vc->vc_mode != KD_TEXT || registered_fb[con2fb_map[i]] != info) @@ -393,18 +389,15 @@ static void fb_flashcursor(void *private) int c; int mode; - acquire_console_sem(); - if (ops && ops->currcon != -1) + if (ops->currcon != -1) vc = vc_cons[ops->currcon].d; if (!vc || !CON_IS_VISIBLE(vc) || fbcon_is_inactive(vc, info) || registered_fb[con2fb_map[vc->vc_num]] != info || - vc_cons[ops->currcon].d->vc_deccm != 1) { - release_console_sem(); + vc_cons[ops->currcon].d->vc_deccm != 1) return; - } - + acquire_console_sem(); p = &fb_display[vc->vc_num]; c = scr_readw((u16 *) vc->vc_pos); mode = (!ops->cursor_flash || ops->cursor_state.enable) ? @@ -535,7 +528,7 @@ static int search_fb_in_map(int idx) { int i, retval = 0; - for (i = first_fb_vc; i <= last_fb_vc; i++) { + for (i = 0; i < MAX_NR_CONSOLES; i++) { if (con2fb_map[i] == idx) retval = 1; } @@ -546,7 +539,7 @@ static int search_for_mapped_con(void) { int i, retval = 0; - for (i = first_fb_vc; i <= last_fb_vc; i++) { + for (i = 0; i < MAX_NR_CONSOLES; i++) { if (con2fb_map[i] != -1) retval = 1; } @@ -568,7 +561,6 @@ static int fbcon_takeover(int show_logo) err = take_over_console(&fb_con, first_fb_vc, last_fb_vc, fbcon_is_default); - if (err) { for (i = first_fb_vc; i <= last_fb_vc; i++) { con2fb_map[i] = -1; @@ -803,8 +795,8 @@ static int set_con2fb_map(int unit, int newidx, int user) if (oldidx == newidx) return 0; - if (!info || fbcon_has_exited) - return -EINVAL; + if (!info) + err = -EINVAL; if (!err && !search_for_mapped_con()) { info_idx = newidx; @@ -840,9 +832,6 @@ static int set_con2fb_map(int unit, int newidx, int user) con2fb_init_display(vc, info, unit, show_logo); } - if (!search_fb_in_map(info_idx)) - info_idx = newidx; - release_console_sem(); return err; } @@ -1045,7 +1034,6 @@ static const char *fbcon_startup(void) #endif /* CONFIG_MAC */ fbcon_add_cursor_timer(info); - fbcon_has_exited = 0; return display_desc; } @@ -1073,36 +1061,17 @@ static void fbcon_init(struct vc_data *vc, int init) /* If we are not the first console on this fb, copy the font from that console */ - t = &fb_display[fg_console]; - if (!p->fontdata) { - if (t->fontdata) { - struct vc_data *fvc = vc_cons[fg_console].d; - - vc->vc_font.data = (void *)(p->fontdata = - fvc->vc_font.data); - vc->vc_font.width = fvc->vc_font.width; - vc->vc_font.height = fvc->vc_font.height; - p->userfont = t->userfont; - - if (p->userfont) - REFCOUNT(p->fontdata)++; - } else { - const struct font_desc *font = NULL; - - if (!fontname[0] || !(font = find_font(fontname))) - font = get_default_font(info->var.xres, - info->var.yres); - vc->vc_font.width = font->width; - vc->vc_font.height = font->height; - vc->vc_font.data = (void *)(p->fontdata = font->data); - vc->vc_font.charcount = 256; /* FIXME Need to - support more fonts */ - } + t = &fb_display[svc->vc_num]; + if (!vc->vc_font.data) { + vc->vc_font.data = (void *)(p->fontdata = t->fontdata); + vc->vc_font.width = (*default_mode)->vc_font.width; + vc->vc_font.height = (*default_mode)->vc_font.height; + p->userfont = t->userfont; + if (p->userfont) + REFCOUNT(p->fontdata)++; } - if (p->userfont) charcnt = FNTCHARCNT(p->fontdata); - vc->vc_can_do_color = (fb_get_color_depth(&info->var, &info->fix)!=1); vc->vc_complement_mask = vc->vc_can_do_color ? 0x7700 : 0x0800; if (charcnt == 256) { @@ -1176,47 +1145,13 @@ static void fbcon_init(struct vc_data *vc, int init) ops->p = &fb_display[fg_console]; } -static void fbcon_free_font(struct display *p) -{ - if (p->userfont && p->fontdata && (--REFCOUNT(p->fontdata) == 0)) - kfree(p->fontdata - FONT_EXTRA_WORDS * sizeof(int)); - p->fontdata = NULL; - p->userfont = 0; -} - static void fbcon_deinit(struct vc_data *vc) { struct display *p = &fb_display[vc->vc_num]; - struct fb_info *info; - struct fbcon_ops *ops; - int idx; + if (info_idx != -1) + return; fbcon_free_font(p); - idx = con2fb_map[vc->vc_num]; - - if (idx == -1) - goto finished; - - info = registered_fb[idx]; - - if (!info) - goto finished; - - ops = info->fbcon_par; - - if (!ops) - goto finished; - - if (CON_IS_VISIBLE(vc)) - fbcon_del_cursor_timer(info); - - ops->flags &= ~FBCON_FLAGS_INIT; -finished: - - if (!con_is_bound(&fb_con)) - fbcon_exit(); - - return; } /* ====================================================================== */ @@ -2164,11 +2099,12 @@ static int fbcon_switch(struct vc_data *vc) if (info->fbops->fb_set_par) info->fbops->fb_set_par(info); - if (old_info != info) + if (old_info != info) { fbcon_del_cursor_timer(old_info); + fbcon_add_cursor_timer(info); + } } - fbcon_add_cursor_timer(info); set_blitting_type(vc, info); ops->cursor_reset = 1; @@ -2286,6 +2222,14 @@ static int fbcon_blank(struct vc_data *vc, int blank, int mode_switch) return 0; } +static void fbcon_free_font(struct display *p) +{ + if (p->userfont && p->fontdata && (--REFCOUNT(p->fontdata) == 0)) + kfree(p->fontdata - FONT_EXTRA_WORDS * sizeof(int)); + p->fontdata = NULL; + p->userfont = 0; +} + static int fbcon_get_font(struct vc_data *vc, struct console_font *font) { u8 *fontdata = vc->vc_font.data; @@ -2499,7 +2443,7 @@ static int fbcon_set_font(struct vc_data *vc, struct console_font *font, unsigne FNTSUM(new_data) = csum; /* Check if the same font is on some other console already */ - for (i = first_fb_vc; i <= last_fb_vc; i++) { + for (i = 0; i < MAX_NR_CONSOLES; i++) { struct vc_data *tmp = vc_cons[i].d; if (fb_display[i].userfont && @@ -2824,7 +2768,7 @@ static void fbcon_set_all_vcs(struct fb_info *info) if (!ops || ops->currcon < 0) return; - for (i = first_fb_vc; i <= last_fb_vc; i++) { + for (i = 0; i < MAX_NR_CONSOLES; i++) { vc = vc_cons[i].d; if (!vc || vc->vc_mode != KD_TEXT || registered_fb[con2fb_map[i]] != info) @@ -2886,57 +2830,22 @@ static int fbcon_mode_deleted(struct fb_info *info, return found; } -static int fbcon_fb_unregistered(int idx) -{ - int i; - - for (i = first_fb_vc; i <= last_fb_vc; i++) { - if (con2fb_map[i] == idx) - con2fb_map[i] = -1; - } - - if (idx == info_idx) { - info_idx = -1; - - for (i = 0; i < FB_MAX; i++) { - if (registered_fb[i] != NULL) { - info_idx = i; - break; - } - } - } - - if (info_idx != -1) { - for (i = first_fb_vc; i <= last_fb_vc; i++) { - if (con2fb_map[i] == -1) - con2fb_map[i] = info_idx; - } - } - - if (!num_registered_fb) - unregister_con_driver(&fb_con); - - return 0; -} - static int fbcon_fb_registered(int idx) { int ret = 0, i; if (info_idx == -1) { - for (i = first_fb_vc; i <= last_fb_vc; i++) { + for (i = 0; i < MAX_NR_CONSOLES; i++) { if (con2fb_map_boot[i] == idx) { info_idx = idx; break; } } - if (info_idx != -1) ret = fbcon_takeover(1); } else { - for (i = first_fb_vc; i <= last_fb_vc; i++) { - if (con2fb_map_boot[i] == idx && - con2fb_map[i] == -1) + for (i = 0; i < MAX_NR_CONSOLES; i++) { + if (con2fb_map_boot[i] == idx) set_con2fb_map(i, idx, 0); } } @@ -2973,7 +2882,7 @@ static void fbcon_new_modelist(struct fb_info *info) struct fb_var_screeninfo var; struct fb_videomode *mode; - for (i = first_fb_vc; i <= last_fb_vc; i++) { + for (i = 0; i < MAX_NR_CONSOLES; i++) { if (registered_fb[con2fb_map[i]] != info) continue; if (!fb_display[i].mode) @@ -3001,14 +2910,6 @@ static int fbcon_event_notify(struct notifier_block *self, struct fb_con2fbmap *con2fb; int ret = 0; - /* - * ignore all events except driver registration and deregistration - * if fbcon is not active - */ - if (fbcon_has_exited && !(action == FB_EVENT_FB_REGISTERED || - action == FB_EVENT_FB_UNREGISTERED)) - goto done; - switch(action) { case FB_EVENT_SUSPEND: fbcon_suspended(info); @@ -3029,9 +2930,6 @@ static int fbcon_event_notify(struct notifier_block *self, case FB_EVENT_FB_REGISTERED: ret = fbcon_fb_registered(info->node); break; - case FB_EVENT_FB_UNREGISTERED: - ret = fbcon_fb_unregistered(info->node); - break; case FB_EVENT_SET_CONSOLE_MAP: con2fb = event->data; ret = set_con2fb_map(con2fb->console - 1, @@ -3047,9 +2945,16 @@ static int fbcon_event_notify(struct notifier_block *self, case FB_EVENT_NEW_MODELIST: fbcon_new_modelist(info); break; + case FB_EVENT_SET_CON_ROTATE: + fbcon_rotate(info, *(int *)event->data); + break; + case FB_EVENT_GET_CON_ROTATE: + ret = fbcon_get_rotate(info); + break; + case FB_EVENT_SET_CON_ROTATE_ALL: + fbcon_rotate_all(info, *(int *)event->data); } -done: return ret; } @@ -3087,181 +2992,27 @@ static struct notifier_block fbcon_event_notifier = { .notifier_call = fbcon_event_notify, }; -static ssize_t store_rotate(struct class_device *class_device, - const char *buf, size_t count) -{ - struct fb_info *info; - int rotate, idx; - char **last = NULL; - - if (fbcon_has_exited) - return count; - - acquire_console_sem(); - idx = con2fb_map[fg_console]; - - if (idx == -1 || registered_fb[idx] == NULL) - goto err; - - info = registered_fb[idx]; - rotate = simple_strtoul(buf, last, 0); - fbcon_rotate(info, rotate); -err: - release_console_sem(); - return count; -} - -static ssize_t store_rotate_all(struct class_device *class_device, - const char *buf, size_t count) -{ - struct fb_info *info; - int rotate, idx; - char **last = NULL; - - if (fbcon_has_exited) - return count; - - acquire_console_sem(); - idx = con2fb_map[fg_console]; - - if (idx == -1 || registered_fb[idx] == NULL) - goto err; - - info = registered_fb[idx]; - rotate = simple_strtoul(buf, last, 0); - fbcon_rotate_all(info, rotate); -err: - release_console_sem(); - return count; -} - -static ssize_t show_rotate(struct class_device *class_device, char *buf) +static int __init fb_console_init(void) { - struct fb_info *info; - int rotate = 0, idx; - - if (fbcon_has_exited) - return 0; + int i; acquire_console_sem(); - idx = con2fb_map[fg_console]; - - if (idx == -1 || registered_fb[idx] == NULL) - goto err; - - info = registered_fb[idx]; - rotate = fbcon_get_rotate(info); -err: + fb_register_client(&fbcon_event_notifier); release_console_sem(); - return snprintf(buf, PAGE_SIZE, "%d\n", rotate); -} - -static struct class_device_attribute class_device_attrs[] = { - __ATTR(rotate, S_IRUGO|S_IWUSR, show_rotate, store_rotate), - __ATTR(rotate_all, S_IWUSR, NULL, store_rotate_all), -}; - -static int fbcon_init_class_device(void) -{ - int i; - for (i = 0; i < ARRAY_SIZE(class_device_attrs); i++) - class_device_create_file(fbcon_class_device, - &class_device_attrs[i]); - return 0; -} + for (i = 0; i < MAX_NR_CONSOLES; i++) + con2fb_map[i] = -1; -static void fbcon_start(void) -{ if (num_registered_fb) { - int i; - - acquire_console_sem(); - for (i = 0; i < FB_MAX; i++) { if (registered_fb[i] != NULL) { info_idx = i; break; } } - - release_console_sem(); fbcon_takeover(0); } -} - -static void fbcon_exit(void) -{ - struct fb_info *info; - int i, j, mapped; - - if (fbcon_has_exited) - return; - -#ifdef CONFIG_ATARI - free_irq(IRQ_AUTO_4, fbcon_vbl_handler); -#endif -#ifdef CONFIG_MAC - if (MACH_IS_MAC && vbl_detected) - free_irq(IRQ_MAC_VBL, fbcon_vbl_handler); -#endif - - kfree((void *)softback_buf); - softback_buf = 0UL; - - for (i = 0; i < FB_MAX; i++) { - mapped = 0; - info = registered_fb[i]; - - if (info == NULL) - continue; - - for (j = first_fb_vc; j <= last_fb_vc; j++) { - if (con2fb_map[j] == i) - mapped = 1; - } - - if (mapped) { - if (info->fbops->fb_release) - info->fbops->fb_release(info, 0); - module_put(info->fbops->owner); - - if (info->fbcon_par) { - fbcon_del_cursor_timer(info); - kfree(info->fbcon_par); - info->fbcon_par = NULL; - } - if (info->queue.func == fb_flashcursor) - info->queue.func = NULL; - } - } - - fbcon_has_exited = 1; -} - -static int __init fb_console_init(void) -{ - int i; - - acquire_console_sem(); - fb_register_client(&fbcon_event_notifier); - fbcon_class_device = - class_device_create(fb_class, NULL, MKDEV(0, 0), NULL, "fbcon"); - - if (IS_ERR(fbcon_class_device)) { - printk(KERN_WARNING "Unable to create class_device " - "for fbcon; errno = %ld\n", - PTR_ERR(fbcon_class_device)); - fbcon_class_device = NULL; - } else - fbcon_init_class_device(); - - for (i = 0; i < MAX_NR_CONSOLES; i++) - con2fb_map[i] = -1; - - release_console_sem(); - fbcon_start(); return 0; } @@ -3269,24 +3020,12 @@ module_init(fb_console_init); #ifdef MODULE -static void __exit fbcon_deinit_class_device(void) -{ - int i; - - for (i = 0; i < ARRAY_SIZE(class_device_attrs); i++) - class_device_remove_file(fbcon_class_device, - &class_device_attrs[i]); -} - static void __exit fb_console_exit(void) { acquire_console_sem(); fb_unregister_client(&fbcon_event_notifier); - fbcon_deinit_class_device(); - class_device_destroy(fb_class, MKDEV(0, 0)); - fbcon_exit(); release_console_sem(); - unregister_con_driver(&fb_con); + give_up_console(&fb_con); } module_exit(fb_console_exit); diff --git a/trunk/drivers/video/console/fbcon.h b/trunk/drivers/video/console/fbcon.h index 3487a636370a..c38c3d8e7a74 100644 --- a/trunk/drivers/video/console/fbcon.h +++ b/trunk/drivers/video/console/fbcon.h @@ -175,7 +175,6 @@ extern void fbcon_set_tileops(struct vc_data *vc, struct fb_info *info); #endif extern void fbcon_set_bitops(struct fbcon_ops *ops); extern int soft_cursor(struct fb_info *info, struct fb_cursor *cursor); -extern struct class *fb_class; #define FBCON_ATTRIBUTE_UNDERLINE 1 #define FBCON_ATTRIBUTE_REVERSE 2 diff --git a/trunk/drivers/video/console/mdacon.c b/trunk/drivers/video/console/mdacon.c index c89f90edf8ac..7f939d066a5a 100644 --- a/trunk/drivers/video/console/mdacon.c +++ b/trunk/drivers/video/console/mdacon.c @@ -308,7 +308,7 @@ static void __init mda_initialize(void) outb_p(0x00, mda_gfx_port); } -static const char *mdacon_startup(void) +static const char __init *mdacon_startup(void) { mda_num_columns = 80; mda_num_lines = 25; diff --git a/trunk/drivers/video/console/newport_con.c b/trunk/drivers/video/console/newport_con.c index 03041311711b..e99fe30e568c 100644 --- a/trunk/drivers/video/console/newport_con.c +++ b/trunk/drivers/video/console/newport_con.c @@ -51,7 +51,6 @@ static int topscan; static int xcurs_correction = 29; static int newport_xsize; static int newport_ysize; -static int newport_has_init; static int newport_set_def_font(int unit, struct console_font *op); @@ -284,15 +283,6 @@ static void newport_get_revisions(void) xcurs_correction = 21; } -static void newport_exit(void) -{ - int i; - - /* free memory used by user font */ - for (i = 0; i < MAX_NR_CONSOLES; i++) - newport_set_def_font(i, NULL); -} - /* Can't be __init, take_over_console may call it later */ static const char *newport_startup(void) { @@ -300,10 +290,8 @@ static const char *newport_startup(void) if (!sgi_gfxaddr) return NULL; - - if (!npregs) - npregs = (struct newport_regs *)/* ioremap cannot fail */ - ioremap(sgi_gfxaddr, sizeof(struct newport_regs)); + npregs = (struct newport_regs *) /* ioremap cannot fail */ + ioremap(sgi_gfxaddr, sizeof(struct newport_regs)); npregs->cset.config = NPORT_CFG_GD0; if (newport_wait(npregs)) @@ -319,11 +307,11 @@ static const char *newport_startup(void) newport_reset(); newport_get_revisions(); newport_get_screensize(); - newport_has_init = 1; return "SGI Newport"; out_unmap: + iounmap((void *)npregs); return NULL; } @@ -336,10 +324,11 @@ static void newport_init(struct vc_data *vc, int init) static void newport_deinit(struct vc_data *c) { - if (!con_is_bound(&newport_con) && newport_has_init) { - newport_exit(); - newport_has_init = 0; - } + int i; + + /* free memory used by user font */ + for (i = 0; i < MAX_NR_CONSOLES; i++) + newport_set_def_font(i, NULL); } static void newport_clear(struct vc_data *vc, int sy, int sx, int height, @@ -739,23 +728,16 @@ const struct consw newport_con = { #ifdef MODULE static int __init newport_console_init(void) { - - if (!sgi_gfxaddr) - return NULL; - - if (!npregs) - npregs = (struct newport_regs *)/* ioremap cannot fail */ - ioremap(sgi_gfxaddr, sizeof(struct newport_regs)); - return take_over_console(&newport_con, 0, MAX_NR_CONSOLES - 1, 1); } -module_init(newport_console_init); static void __exit newport_console_exit(void) { give_up_console(&newport_con); iounmap((void *)npregs); } + +module_init(newport_console_init); module_exit(newport_console_exit); #endif diff --git a/trunk/drivers/video/console/promcon.c b/trunk/drivers/video/console/promcon.c index d6e6ad537f9f..04f42fcaac59 100644 --- a/trunk/drivers/video/console/promcon.c +++ b/trunk/drivers/video/console/promcon.c @@ -109,7 +109,7 @@ promcon_end(struct vc_data *conp, char *b) return b - p; } -const char *promcon_startup(void) +const char __init *promcon_startup(void) { const char *display_desc = "PROM"; int node; @@ -133,7 +133,7 @@ const char *promcon_startup(void) return display_desc; } -static void +static void __init promcon_init_unimap(struct vc_data *conp) { mm_segment_t old_fs = get_fs(); diff --git a/trunk/drivers/video/console/sticon.c b/trunk/drivers/video/console/sticon.c index 45c4f227e56e..fd5940f41271 100644 --- a/trunk/drivers/video/console/sticon.c +++ b/trunk/drivers/video/console/sticon.c @@ -75,7 +75,7 @@ static inline void cursor_undrawn(void) cursor_drawn = 0; } -static const char *sticon_startup(void) +static const char *__init sticon_startup(void) { return "STI console"; } diff --git a/trunk/drivers/video/console/vgacon.c b/trunk/drivers/video/console/vgacon.c index f32b590730f2..e64d42e2449e 100644 --- a/trunk/drivers/video/console/vgacon.c +++ b/trunk/drivers/video/console/vgacon.c @@ -114,7 +114,6 @@ static int vga_512_chars; static int vga_video_font_height; static int vga_scan_lines; static unsigned int vga_rolled_over = 0; -static int vga_init_done; static int __init no_scroll(char *str) { @@ -191,7 +190,7 @@ static void vgacon_scrollback_init(int pitch) } } -static void vgacon_scrollback_startup(void) +static void __init vgacon_scrollback_startup(void) { vgacon_scrollback = alloc_bootmem(CONFIG_VGACON_SOFT_SCROLLBACK_SIZE * 1024); @@ -356,7 +355,7 @@ static int vgacon_scrolldelta(struct vc_data *c, int lines) } #endif /* CONFIG_VGACON_SOFT_SCROLLBACK */ -static const char *vgacon_startup(void) +static const char __init *vgacon_startup(void) { const char *display_desc = NULL; u16 saved1, saved2; @@ -524,12 +523,7 @@ static const char *vgacon_startup(void) vgacon_xres = ORIG_VIDEO_COLS * VGA_FONTWIDTH; vgacon_yres = vga_scan_lines; - - if (!vga_init_done) { - vgacon_scrollback_startup(); - vga_init_done = 1; - } - + vgacon_scrollback_startup(); return display_desc; } @@ -537,20 +531,10 @@ static void vgacon_init(struct vc_data *c, int init) { unsigned long p; - /* - * We cannot be loaded as a module, therefore init is always 1, - * but vgacon_init can be called more than once, and init will - * not be 1. - */ + /* We cannot be loaded as a module, therefore init is always 1 */ c->vc_can_do_color = vga_can_do_color; - - /* set dimensions manually if init != 0 since vc_resize() will fail */ - if (init) { - c->vc_cols = vga_video_num_columns; - c->vc_rows = vga_video_num_lines; - } else - vc_resize(c, vga_video_num_columns, vga_video_num_lines); - + c->vc_cols = vga_video_num_columns; + c->vc_rows = vga_video_num_lines; c->vc_scan_lines = vga_scan_lines; c->vc_font.height = vga_video_font_height; c->vc_complement_mask = 0x7700; diff --git a/trunk/drivers/video/epson1355fb.c b/trunk/drivers/video/epson1355fb.c index f0a621ecc288..082759447bf6 100644 --- a/trunk/drivers/video/epson1355fb.c +++ b/trunk/drivers/video/epson1355fb.c @@ -605,6 +605,11 @@ static void clearfb16(struct fb_info *info) fb_writeb(0, dst); } +static void epson1355fb_platform_release(struct device *device) +{ + dev_err(device, "This driver is broken, please bug the authors so they will fix it.\n"); +} + static int epson1355fb_remove(struct platform_device *dev) { struct fb_info *info = platform_get_drvdata(dev); @@ -728,7 +733,13 @@ static struct platform_driver epson1355fb_driver = { }, }; -static struct platform_device *epson1355fb_device; +static struct platform_device epson1355fb_device = { + .name = "epson1355fb", + .id = 0, + .dev = { + .release = epson1355fb_platform_release, + } +}; int __init epson1355fb_init(void) { @@ -738,21 +749,11 @@ int __init epson1355fb_init(void) return -ENODEV; ret = platform_driver_register(&epson1355fb_driver); - if (!ret) { - epson1355fb_device = platform_device_alloc("epson1355fb", 0); - - if (epson1355fb_device) - ret = platform_device_add(epson1355fb_device); - else - ret = -ENOMEM; - - if (ret) { - platform_device_put(epson1355fb_device); + ret = platform_device_register(&epson1355fb_device); + if (ret) platform_driver_unregister(&epson1355fb_driver); - } } - return ret; } @@ -761,7 +762,7 @@ module_init(epson1355fb_init); #ifdef MODULE static void __exit epson1355fb_exit(void) { - platform_device_unregister(epson1355fb_device); + platform_device_unregister(&epson1355fb_device); platform_driver_unregister(&epson1355fb_driver); } diff --git a/trunk/drivers/video/fbcvt.c b/trunk/drivers/video/fbcvt.c index b5498999c4ec..ac90883dc3aa 100644 --- a/trunk/drivers/video/fbcvt.c +++ b/trunk/drivers/video/fbcvt.c @@ -376,3 +376,4 @@ int fb_find_mode_cvt(struct fb_videomode *mode, int margins, int rb) return 0; } +EXPORT_SYMBOL(fb_find_mode_cvt); diff --git a/trunk/drivers/video/fbmem.c b/trunk/drivers/video/fbmem.c index 31143afe7c95..372aa1776827 100644 --- a/trunk/drivers/video/fbmem.c +++ b/trunk/drivers/video/fbmem.c @@ -34,6 +34,7 @@ #endif #include #include +#include #include #include @@ -161,6 +162,7 @@ char* fb_get_buffer_offset(struct fb_info *info, struct fb_pixmap *buf, u32 size } #ifdef CONFIG_LOGO +#include static inline unsigned safe_shift(unsigned d, int n) { @@ -334,11 +336,11 @@ static void fb_rotate_logo_ud(const u8 *in, u8 *out, u32 width, u32 height) static void fb_rotate_logo_cw(const u8 *in, u8 *out, u32 width, u32 height) { - int i, j, h = height - 1; + int i, j, w = width - 1; for (i = 0; i < height; i++) for (j = 0; j < width; j++) - out[height * j + h - i] = *in++; + out[height * j + w - i] = *in++; } static void fb_rotate_logo_ccw(const u8 *in, u8 *out, u32 width, u32 height) @@ -356,24 +358,24 @@ static void fb_rotate_logo(struct fb_info *info, u8 *dst, u32 tmp; if (rotate == FB_ROTATE_UD) { - fb_rotate_logo_ud(image->data, dst, image->width, - image->height); image->dx = info->var.xres - image->width; image->dy = info->var.yres - image->height; - } else if (rotate == FB_ROTATE_CW) { - fb_rotate_logo_cw(image->data, dst, image->width, + fb_rotate_logo_ud(image->data, dst, image->width, image->height); + } else if (rotate == FB_ROTATE_CW) { tmp = image->width; image->width = image->height; image->height = tmp; - image->dx = info->var.xres - image->width; + image->dx = info->var.xres - image->height; + fb_rotate_logo_cw(image->data, dst, image->width, + image->height); } else if (rotate == FB_ROTATE_CCW) { - fb_rotate_logo_ccw(image->data, dst, image->width, - image->height); tmp = image->width; image->width = image->height; image->height = tmp; - image->dy = info->var.yres - image->height; + image->dy = info->var.yres - image->width; + fb_rotate_logo_ccw(image->data, dst, image->width, + image->height); } image->data = dst; @@ -433,7 +435,7 @@ int fb_prepare_logo(struct fb_info *info, int rotate) depth = info->var.green.length; } - if (info->fix.visual == FB_VISUAL_STATIC_PSEUDOCOLOR && depth > 4) { + if (info->fix.visual == FB_VISUAL_STATIC_PSEUDOCOLOR) { /* assume console colormap */ depth = 4; } @@ -1276,8 +1278,8 @@ static struct file_operations fb_fops = { #endif }; -struct class *fb_class; -EXPORT_SYMBOL(fb_class); +static struct class *fb_class; + /** * register_framebuffer - registers a frame buffer device * @fb_info: frame buffer info structure @@ -1353,7 +1355,6 @@ register_framebuffer(struct fb_info *fb_info) int unregister_framebuffer(struct fb_info *fb_info) { - struct fb_event event; int i; i = fb_info->node; @@ -1361,17 +1362,13 @@ unregister_framebuffer(struct fb_info *fb_info) return -EINVAL; devfs_remove("fb/%d", i); - if (fb_info->pixmap.addr && - (fb_info->pixmap.flags & FB_PIXMAP_DEFAULT)) + if (fb_info->pixmap.addr && (fb_info->pixmap.flags & FB_PIXMAP_DEFAULT)) kfree(fb_info->pixmap.addr); fb_destroy_modelist(&fb_info->modelist); registered_fb[i]=NULL; num_registered_fb--; fb_cleanup_class_device(fb_info); class_device_destroy(fb_class, MKDEV(FB_MAJOR, i)); - event.info = fb_info; - blocking_notifier_call_chain(&fb_notifier_list, - FB_EVENT_FB_UNREGISTERED, &event); return 0; } @@ -1494,6 +1491,28 @@ int fb_new_modelist(struct fb_info *info) return err; } +/** + * fb_con_duit - user<->fbcon passthrough + * @info: struct fb_info + * @event: notification event to be passed to fbcon + * @data: private data + * + * DESCRIPTION + * This function is an fbcon-user event passing channel + * which bypasses fbdev. This is hopefully temporary + * until a user interface for fbcon is created + */ +int fb_con_duit(struct fb_info *info, int event, void *data) +{ + struct fb_event evnt; + + evnt.info = info; + evnt.data = data; + + return blocking_notifier_call_chain(&fb_notifier_list, event, &evnt); +} +EXPORT_SYMBOL(fb_con_duit); + static char *video_options[FB_MAX]; static int ofonly; @@ -1603,5 +1622,6 @@ EXPORT_SYMBOL(fb_set_suspend); EXPORT_SYMBOL(fb_register_client); EXPORT_SYMBOL(fb_unregister_client); EXPORT_SYMBOL(fb_get_options); +EXPORT_SYMBOL(fb_new_modelist); MODULE_LICENSE("GPL"); diff --git a/trunk/drivers/video/fbmon.c b/trunk/drivers/video/fbmon.c index 3ccfff715a51..53beeb4a9998 100644 --- a/trunk/drivers/video/fbmon.c +++ b/trunk/drivers/video/fbmon.c @@ -29,9 +29,9 @@ #include #include #include -#include #include