diff --git a/[refs] b/[refs] index db568907eb57..586e3be3a69d 100644 --- a/[refs] +++ b/[refs] @@ -1,2 +1,2 @@ --- -refs/heads/master: 7ae93f51d7fa8b9130d47e0b7d17979a165c5bc3 +refs/heads/master: 4378dcca8578b0fd0fba883a3354ad4820d4f85f diff --git a/trunk/CREDITS b/trunk/CREDITS index e97bea06b59f..c62dcb3b7e26 100644 --- a/trunk/CREDITS +++ b/trunk/CREDITS @@ -317,6 +317,14 @@ S: 2322 37th Ave SW S: Seattle, Washington 98126-2010 S: USA +N: Muli Ben-Yehuda +E: mulix@mulix.org +E: muli@il.ibm.com +W: http://www.mulix.org +D: trident OSS sound driver, x86-64 dma-ops and Calgary IOMMU, +D: KVM and Xen bits and other misc. hackery. +S: Haifa, Israel + N: Johannes Berg E: johannes@sipsolutions.net W: http://johannes.sipsolutions.net/ @@ -3344,8 +3352,7 @@ S: Spain N: Linus Torvalds E: torvalds@linux-foundation.org D: Original kernel hacker -S: 12725 SW Millikan Way, Suite 400 -S: Beaverton, Oregon 97005 +S: Portland, Oregon 97005 S: USA N: Marcelo Tosatti diff --git a/trunk/Documentation/ABI/testing/sysfs-dev b/trunk/Documentation/ABI/testing/sysfs-dev new file mode 100644 index 000000000000..a9f2b8b0530f --- /dev/null +++ b/trunk/Documentation/ABI/testing/sysfs-dev @@ -0,0 +1,20 @@ +What: /sys/dev +Date: April 2008 +KernelVersion: 2.6.26 +Contact: Dan Williams +Description: The /sys/dev tree provides a method to look up the sysfs + path for a device using the information returned from + stat(2). There are two directories, 'block' and 'char', + beneath /sys/dev containing symbolic links with names of + the form ":". These links point to the + corresponding sysfs path for the given device. + + Example: + $ readlink /sys/dev/block/8:32 + ../../block/sdc + + Entries in /sys/dev/char and /sys/dev/block will be + dynamically created and destroyed as devices enter and + leave the system. + +Users: mdadm diff --git a/trunk/Documentation/ABI/testing/sysfs-devices-memory b/trunk/Documentation/ABI/testing/sysfs-devices-memory new file mode 100644 index 000000000000..7a16fe1e2270 --- /dev/null +++ b/trunk/Documentation/ABI/testing/sysfs-devices-memory @@ -0,0 +1,24 @@ +What: /sys/devices/system/memory +Date: June 2008 +Contact: Badari Pulavarty +Description: + The /sys/devices/system/memory contains a snapshot of the + internal state of the kernel memory blocks. Files could be + added or removed dynamically to represent hot-add/remove + operations. + +Users: hotplug memory add/remove tools + https://w3.opensource.ibm.com/projects/powerpc-utils/ + +What: /sys/devices/system/memory/memoryX/removable +Date: June 2008 +Contact: Badari Pulavarty +Description: + The file /sys/devices/system/memory/memoryX/removable + indicates whether this memory block is removable or not. + This is useful for a user-level agent to determine + identify removable sections of the memory before attempting + potentially expensive hot-remove memory operation + +Users: hotplug memory remove tools + https://w3.opensource.ibm.com/projects/powerpc-utils/ diff --git a/trunk/Documentation/ABI/testing/sysfs-kernel-mm b/trunk/Documentation/ABI/testing/sysfs-kernel-mm new file mode 100644 index 000000000000..190d523ac159 --- /dev/null +++ b/trunk/Documentation/ABI/testing/sysfs-kernel-mm @@ -0,0 +1,6 @@ +What: /sys/kernel/mm +Date: July 2008 +Contact: Nishanth Aravamudan , VM maintainers +Description: + /sys/kernel/mm/ should contain any and all VM + related information in /sys/kernel/. diff --git a/trunk/Documentation/ABI/testing/sysfs-kernel-mm-hugepages b/trunk/Documentation/ABI/testing/sysfs-kernel-mm-hugepages new file mode 100644 index 000000000000..e21c00571cf4 --- /dev/null +++ b/trunk/Documentation/ABI/testing/sysfs-kernel-mm-hugepages @@ -0,0 +1,15 @@ +What: /sys/kernel/mm/hugepages/ +Date: June 2008 +Contact: Nishanth Aravamudan , hugetlb maintainers +Description: + /sys/kernel/mm/hugepages/ contains a number of subdirectories + of the form hugepages-kB, where is the page size + of the hugepages supported by the kernel/CPU combination. + + Under these directories are a number of files: + nr_hugepages + nr_overcommit_hugepages + free_hugepages + surplus_hugepages + resv_hugepages + See Documentation/vm/hugetlbpage.txt for details. diff --git a/trunk/Documentation/DMA-attributes.txt b/trunk/Documentation/DMA-attributes.txt index 6d772f84b477..b768cc0e402b 100644 --- a/trunk/Documentation/DMA-attributes.txt +++ b/trunk/Documentation/DMA-attributes.txt @@ -22,3 +22,12 @@ ready and available in memory. The DMA of the "completion indication" could race with data DMA. Mapping the memory used for completion indications with DMA_ATTR_WRITE_BARRIER would prevent the race. +DMA_ATTR_WEAK_ORDERING +---------------------- + +DMA_ATTR_WEAK_ORDERING specifies that reads and writes to the mapping +may be weakly ordered, that is that reads and writes may pass each other. + +Since it is optional for platforms to implement DMA_ATTR_WEAK_ORDERING, +those that do not will simply ignore the attribute and exhibit default +behavior. diff --git a/trunk/Documentation/DocBook/gadget.tmpl b/trunk/Documentation/DocBook/gadget.tmpl index 5a8ffa761e09..ea3bc9565e6a 100644 --- a/trunk/Documentation/DocBook/gadget.tmpl +++ b/trunk/Documentation/DocBook/gadget.tmpl @@ -524,6 +524,44 @@ These utilities include endpoint autoconfiguration. +Composite Device Framework + +The core API is sufficient for writing drivers for composite +USB devices (with more than one function in a given configuration), +and also multi-configuration devices (also more than one function, +but not necessarily sharing a given configuration). +There is however an optional framework which makes it easier to +reuse and combine functions. + + +Devices using this framework provide a struct +usb_composite_driver, which in turn provides one or +more struct usb_configuration instances. +Each such configuration includes at least one +struct usb_function, which packages a user +visible role such as "network link" or "mass storage device". +Management functions may also exist, such as "Device Firmware +Upgrade". + + +!Iinclude/linux/usb/composite.h +!Edrivers/usb/gadget/composite.c + + + +Composite Device Functions + +At this writing, a few of the current gadget drivers have +been converted to this framework. +Near-term plans include converting all of them, except for "gadgetfs". + + +!Edrivers/usb/gadget/f_acm.c +!Edrivers/usb/gadget/f_serial.c + + + + Peripheral Controller Drivers diff --git a/trunk/Documentation/DocBook/uio-howto.tmpl b/trunk/Documentation/DocBook/uio-howto.tmpl index fdd7f4f887b7..df87d1b93605 100644 --- a/trunk/Documentation/DocBook/uio-howto.tmpl +++ b/trunk/Documentation/DocBook/uio-howto.tmpl @@ -21,6 +21,18 @@ + + 2006-2008 + Hans-Jürgen Koch. + + + + +This documentation is Free Software licensed under the terms of the +GPL version 2. + + + 2006-12-11 @@ -29,6 +41,12 @@ + + 0.5 + 2008-05-22 + hjk + Added description of write() function. + 0.4 2007-11-26 @@ -57,20 +75,9 @@ - + About this document - - -Copyright and License - - Copyright (c) 2006 by Hans-Jürgen Koch. - -This documentation is Free Software licensed under the terms of the -GPL version 2. - - - Translations @@ -189,6 +196,30 @@ interested in translating it, please email me represents the total interrupt count. You can use this number to figure out if you missed some interrupts. + + For some hardware that has more than one interrupt source internally, + but not separate IRQ mask and status registers, there might be + situations where userspace cannot determine what the interrupt source + was if the kernel handler disables them by writing to the chip's IRQ + register. In such a case, the kernel has to disable the IRQ completely + to leave the chip's register untouched. Now the userspace part can + determine the cause of the interrupt, but it cannot re-enable + interrupts. Another cornercase is chips where re-enabling interrupts + is a read-modify-write operation to a combined IRQ status/acknowledge + register. This would be racy if a new interrupt occurred + simultaneously. + + + To address these problems, UIO also implements a write() function. It + is normally not used and can be ignored for hardware that has only a + single interrupt source or has separate IRQ mask and status registers. + If you need it, however, a write to /dev/uioX + will call the irqcontrol() function implemented + by the driver. You have to write a 32-bit value that is usually either + 0 or 1 to disable or enable interrupts. If a driver does not implement + irqcontrol(), write() will + return with -ENOSYS. + To handle interrupts properly, your custom kernel module can @@ -362,6 +393,14 @@ device is actually used. open(), you will probably also want a custom release() function. + + +int (*irqcontrol)(struct uio_info *info, s32 irq_on) +: Optional. If you need to be able to enable or disable +interrupts from userspace by writing to /dev/uioX, +you can implement this function. The parameter irq_on +will be 0 to disable interrupts and 1 to enable them. + diff --git a/trunk/Documentation/HOWTO b/trunk/Documentation/HOWTO index 619e8caf30db..c2371c5a98f9 100644 --- a/trunk/Documentation/HOWTO +++ b/trunk/Documentation/HOWTO @@ -358,7 +358,7 @@ Here is a list of some of the different kernel trees available: - pcmcia, Dominik Brodowski git.kernel.org:/pub/scm/linux/kernel/git/brodo/pcmcia-2.6.git - - SCSI, James Bottomley + - SCSI, James Bottomley git.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6.git - x86, Ingo Molnar diff --git a/trunk/Documentation/fb/sh7760fb.txt b/trunk/Documentation/fb/sh7760fb.txt new file mode 100644 index 000000000000..c87bfe5c630a --- /dev/null +++ b/trunk/Documentation/fb/sh7760fb.txt @@ -0,0 +1,131 @@ +SH7760/SH7763 integrated LCDC Framebuffer driver +================================================ + +0. Overwiew +----------- +The SH7760/SH7763 have an integrated LCD Display controller (LCDC) which +supports (in theory) resolutions ranging from 1x1 to 1024x1024, +with color depths ranging from 1 to 16 bits, on STN, DSTN and TFT Panels. + +Caveats: +* Framebuffer memory must be a large chunk allocated at the top + of Area3 (HW requirement). Because of this requirement you should NOT + make the driver a module since at runtime it may become impossible to + get a large enough contiguous chunk of memory. + +* The driver does not support changing resolution while loaded + (displays aren't hotpluggable anyway) + +* Heavy flickering may be observed + a) if you're using 15/16bit color modes at >= 640x480 px resolutions, + b) during PCMCIA (or any other slow bus) activity. + +* Rotation works only 90degress clockwise, and only if horizontal + resolution is <= 320 pixels. + +files: drivers/video/sh7760fb.c + include/asm-sh/sh7760fb.h + Documentation/fb/sh7760fb.txt + +1. Platform setup +----------------- +SH7760: + Video data is fetched via the DMABRG DMA engine, so you have to + configure the SH DMAC for DMABRG mode (write 0x94808080 to the + DMARSRA register somewhere at boot). + + PFC registers PCCR and PCDR must be set to peripheral mode. + (write zeros to both). + +The driver does NOT do the above for you since board setup is, well, job +of the board setup code. + +2. Panel definitions +-------------------- +The LCDC must explicitly be told about the type of LCD panel +attached. Data must be wrapped in a "struct sh7760fb_platdata" and +passed to the driver as platform_data. + +Suggest you take a closer look at the SH7760 Manual, Section 30. +(http://documentation.renesas.com/eng/products/mpumcu/e602291_sh7760.pdf) + +The following code illustrates what needs to be done to +get the framebuffer working on a 640x480 TFT: + +====================== cut here ====================================== + +#include +#include + +/* + * NEC NL6440bc26-01 640x480 TFT + * dotclock 25175 kHz + * Xres 640 Yres 480 + * Htotal 800 Vtotal 525 + * HsynStart 656 VsynStart 490 + * HsynLenn 30 VsynLenn 2 + * + * The linux framebuffer layer does not use the syncstart/synclen + * values but right/left/upper/lower margin values. The comments + * for the x_margin explain how to calculate those from given + * panel sync timings. + */ +static struct fb_videomode nl6448bc26 = { + .name = "NL6448BC26", + .refresh = 60, + .xres = 640, + .yres = 480, + .pixclock = 39683, /* in picoseconds! */ + .hsync_len = 30, + .vsync_len = 2, + .left_margin = 114, /* HTOT - (HSYNSLEN + HSYNSTART) */ + .right_margin = 16, /* HSYNSTART - XRES */ + .upper_margin = 33, /* VTOT - (VSYNLEN + VSYNSTART) */ + .lower_margin = 10, /* VSYNSTART - YRES */ + .sync = FB_SYNC_HOR_HIGH_ACT | FB_SYNC_VERT_HIGH_ACT, + .vmode = FB_VMODE_NONINTERLACED, + .flag = 0, +}; + +static struct sh7760fb_platdata sh7760fb_nl6448 = { + .def_mode = &nl6448bc26, + .ldmtr = LDMTR_TFT_COLOR_16, /* 16bit TFT panel */ + .lddfr = LDDFR_8BPP, /* we want 8bit output */ + .ldpmmr = 0x0070, + .ldpspr = 0x0500, + .ldaclnr = 0, + .ldickr = LDICKR_CLKSRC(LCDC_CLKSRC_EXTERNAL) | + LDICKR_CLKDIV(1), + .rotate = 0, + .novsync = 1, + .blank = NULL, +}; + +/* SH7760: + * 0xFE300800: 256 * 4byte xRGB palette ram + * 0xFE300C00: 42 bytes ctrl registers + */ +static struct resource sh7760_lcdc_res[] = { + [0] = { + .start = 0xFE300800, + .end = 0xFE300CFF, + .flags = IORESOURCE_MEM, + }, + [1] = { + .start = 65, + .end = 65, + .flags = IORESOURCE_IRQ, + }, +}; + +static struct platform_device sh7760_lcdc_dev = { + .dev = { + .platform_data = &sh7760fb_nl6448, + }, + .name = "sh7760-lcdc", + .id = -1, + .resource = sh7760_lcdc_res, + .num_resources = ARRAY_SIZE(sh7760_lcdc_res), +}; + +====================== cut here ====================================== diff --git a/trunk/Documentation/fb/tridentfb.txt b/trunk/Documentation/fb/tridentfb.txt index 8a6c8a43e6a3..45d9de5b13a3 100644 --- a/trunk/Documentation/fb/tridentfb.txt +++ b/trunk/Documentation/fb/tridentfb.txt @@ -3,11 +3,25 @@ Tridentfb is a framebuffer driver for some Trident chip based cards. The following list of chips is thought to be supported although not all are tested: -those from the Image series with Cyber in their names - accelerated -those with Blade in their names (Blade3D,CyberBlade...) - accelerated -the newer CyberBladeXP family - nonaccelerated - -Only PCI/AGP based cards are supported, none of the older Tridents. +those from the TGUI series 9440/96XX and with Cyber in their names +those from the Image series and with Cyber in their names +those with Blade in their names (Blade3D,CyberBlade...) +the newer CyberBladeXP family + +All families are accelerated. Only PCI/AGP based cards are supported, +none of the older Tridents. +The driver supports 8, 16 and 32 bits per pixel depths. +The TGUI family requires a line length to be power of 2 if acceleration +is enabled. This means that range of possible resolutions and bpp is +limited comparing to the range if acceleration is disabled (see list +of parameters below). + +Known bugs: +1. The driver randomly locks up on 3DImage975 chip with acceleration + enabled. The same happens in X11 (Xorg). +2. The ramdac speeds require some more fine tuning. It is possible to + switch resolution which the chip does not support at some depths for + older chips. How to use it? ============== @@ -17,12 +31,11 @@ video=tridentfb The parameters for tridentfb are concatenated with a ':' as in this example. -video=tridentfb:800x600,bpp=16,noaccel +video=tridentfb:800x600-16@75,noaccel The second level parameters that tridentfb understands are: noaccel - turns off acceleration (when it doesn't work for your card) -accel - force text acceleration (for boards which by default are noacceled) fp - use flat panel related stuff crt - assume monitor is present instead of fp @@ -31,21 +44,24 @@ center - for flat panels and resolutions smaller than native size center the image, otherwise use stretch -memsize - integer value in Kb, use if your card's memory size is misdetected. +memsize - integer value in KB, use if your card's memory size is misdetected. look at the driver output to see what it says when initializing. -memdiff - integer value in Kb,should be nonzero if your card reports - more memory than it actually has.For instance mine is 192K less than + +memdiff - integer value in KB, should be nonzero if your card reports + more memory than it actually has. For instance mine is 192K less than detection says in all three BIOS selectable situations 2M, 4M, 8M. Only use if your video memory is taken from main memory hence of - configurable size.Otherwise use memsize. - If in some modes which barely fit the memory you see garbage at the bottom - this might help by not letting change to that mode anymore. + configurable size. Otherwise use memsize. + If in some modes which barely fit the memory you see garbage + at the bottom this might help by not letting change to that mode + anymore. nativex - the width in pixels of the flat panel.If you know it (usually 1024 800 or 1280) and it is not what the driver seems to detect use it. -bpp - bits per pixel (8,16 or 32) -mode - a mode name like 800x600 (as described in Documentation/fb/modedb.txt) +bpp - bits per pixel (8,16 or 32) +mode - a mode name like 800x600-8@75 as described in + Documentation/fb/modedb.txt Using insane values for the above parameters will probably result in driver misbehaviour so take care(for instance memsize=12345678 or memdiff=23784 or diff --git a/trunk/Documentation/feature-removal-schedule.txt b/trunk/Documentation/feature-removal-schedule.txt index 65a1482457a8..9f73587219e8 100644 --- a/trunk/Documentation/feature-removal-schedule.txt +++ b/trunk/Documentation/feature-removal-schedule.txt @@ -308,9 +308,41 @@ Who: Matthew Wilcox --------------------------- +What: SCTP_GET_PEER_ADDRS_NUM_OLD, SCTP_GET_PEER_ADDRS_OLD, + SCTP_GET_LOCAL_ADDRS_NUM_OLD, SCTP_GET_LOCAL_ADDRS_OLD +When: June 2009 +Why: A newer version of the options have been introduced in 2005 that + removes the limitions of the old API. The sctp library has been + converted to use these new options at the same time. Any user + space app that directly uses the old options should convert to using + the new options. +Who: Vlad Yasevich + +--------------------------- + What: CONFIG_THERMAL_HWMON When: January 2009 Why: This option was introduced just to allow older lm-sensors userspace to keep working over the upgrade to 2.6.26. At the scheduled time of removal fixed lm-sensors (2.x or 3.x) should be readily available. Who: Rene Herman + +--------------------------- + +What: Code that is now under CONFIG_WIRELESS_EXT_SYSFS + (in net/core/net-sysfs.c) +When: After the only user (hal) has seen a release with the patches + for enough time, probably some time in 2010. +Why: Over 1K .text/.data size reduction, data is available in other + ways (ioctls) +Who: Johannes Berg + +--------------------------- + +What: CONFIG_NF_CT_ACCT +When: 2.6.29 +Why: Accounting can now be enabled/disabled without kernel recompilation. + Currently used only to set a default value for a feature that is also + controlled by a kernel/module/sysfs/sysctl parameter. +Who: Krzysztof Piotr Oledzki + diff --git a/trunk/Documentation/filesystems/Locking b/trunk/Documentation/filesystems/Locking index 8b22d7d8b991..680fb566b928 100644 --- a/trunk/Documentation/filesystems/Locking +++ b/trunk/Documentation/filesystems/Locking @@ -510,6 +510,7 @@ prototypes: void (*close)(struct vm_area_struct*); int (*fault)(struct vm_area_struct*, struct vm_fault *); int (*page_mkwrite)(struct vm_area_struct *, struct page *); + int (*access)(struct vm_area_struct *, unsigned long, void*, int, int); locking rules: BKL mmap_sem PageLocked(page) @@ -517,6 +518,7 @@ open: no yes close: no yes fault: no yes page_mkwrite: no yes no +access: no yes ->page_mkwrite() is called when a previously read-only page is about to become writeable. The file system is responsible for @@ -525,6 +527,11 @@ taking to lock out truncate, the page range should be verified to be within i_size. The page mapping should also be checked that it is not NULL. + ->access() is called when get_user_pages() fails in +acces_process_vm(), typically used to debug a process through +/proc/pid/mem or ptrace. This function is needed only for +VM_IO | VM_PFNMAP VMAs. + ================================================================================ Dubious stuff diff --git a/trunk/Documentation/filesystems/bfs.txt b/trunk/Documentation/filesystems/bfs.txt index ea825e178e79..78043d5a8fc3 100644 --- a/trunk/Documentation/filesystems/bfs.txt +++ b/trunk/Documentation/filesystems/bfs.txt @@ -26,11 +26,11 @@ You can simplify mounting by just typing: this will allocate the first available loopback device (and load loop.o kernel module if necessary) automatically. If the loopback driver is not -loaded automatically, make sure that your kernel is compiled with kmod -support (CONFIG_KMOD) enabled. Beware that umount will not -deallocate /dev/loopN device if /etc/mtab file on your system is a -symbolic link to /proc/mounts. You will need to do it manually using -"-d" switch of losetup(8). Read losetup(8) manpage for more info. +loaded automatically, make sure that you have compiled the module and +that modprobe is functioning. Beware that umount will not deallocate +/dev/loopN device if /etc/mtab file on your system is a symbolic link to +/proc/mounts. You will need to do it manually using "-d" switch of +losetup(8). Read losetup(8) manpage for more info. To create the BFS image under UnixWare you need to find out first which slice contains it. The command prtvtoc(1M) is your friend: diff --git a/trunk/Documentation/filesystems/configfs/configfs.txt b/trunk/Documentation/filesystems/configfs/configfs.txt index 15838d706ea2..44c97e6accb2 100644 --- a/trunk/Documentation/filesystems/configfs/configfs.txt +++ b/trunk/Documentation/filesystems/configfs/configfs.txt @@ -233,12 +233,10 @@ accomplished via the group operations specified on the group's config_item_type. struct configfs_group_operations { - int (*make_item)(struct config_group *group, - const char *name, - struct config_item **new_item); - int (*make_group)(struct config_group *group, - const char *name, - struct config_group **new_group); + struct config_item *(*make_item)(struct config_group *group, + const char *name); + struct config_group *(*make_group)(struct config_group *group, + const char *name); int (*commit_item)(struct config_item *item); void (*disconnect_notify)(struct config_group *group, struct config_item *item); diff --git a/trunk/Documentation/filesystems/configfs/configfs_example.c b/trunk/Documentation/filesystems/configfs/configfs_example.c index 0b422acd470c..039648791701 100644 --- a/trunk/Documentation/filesystems/configfs/configfs_example.c +++ b/trunk/Documentation/filesystems/configfs/configfs_example.c @@ -273,13 +273,13 @@ static inline struct simple_children *to_simple_children(struct config_item *ite return item ? container_of(to_config_group(item), struct simple_children, group) : NULL; } -static int simple_children_make_item(struct config_group *group, const char *name, struct config_item **new_item) +static struct config_item *simple_children_make_item(struct config_group *group, const char *name) { struct simple_child *simple_child; simple_child = kzalloc(sizeof(struct simple_child), GFP_KERNEL); if (!simple_child) - return -ENOMEM; + return ERR_PTR(-ENOMEM); config_item_init_type_name(&simple_child->item, name, @@ -287,8 +287,7 @@ static int simple_children_make_item(struct config_group *group, const char *nam simple_child->storeme = 0; - *new_item = &simple_child->item; - return 0; + return &simple_child->item; } static struct configfs_attribute simple_children_attr_description = { @@ -360,21 +359,20 @@ static struct configfs_subsystem simple_children_subsys = { * children of its own. */ -static int group_children_make_group(struct config_group *group, const char *name, struct config_group **new_group) +static struct config_group *group_children_make_group(struct config_group *group, const char *name) { struct simple_children *simple_children; simple_children = kzalloc(sizeof(struct simple_children), GFP_KERNEL); if (!simple_children) - return -ENOMEM; + return ERR_PTR(-ENOMEM); config_group_init_type_name(&simple_children->group, name, &simple_children_type); - *new_group = &simple_children->group; - return 0; + return &simple_children->group; } static struct configfs_attribute group_children_attr_description = { diff --git a/trunk/Documentation/filesystems/nfs-rdma.txt b/trunk/Documentation/filesystems/nfs-rdma.txt index d0ec45ae4e7d..44bd766f2e5d 100644 --- a/trunk/Documentation/filesystems/nfs-rdma.txt +++ b/trunk/Documentation/filesystems/nfs-rdma.txt @@ -5,7 +5,7 @@ ################################################################################ Author: NetApp and Open Grid Computing - Date: April 15, 2008 + Date: May 29, 2008 Table of Contents ~~~~~~~~~~~~~~~~~ @@ -60,16 +60,18 @@ Installation The procedures described in this document have been tested with distributions from Red Hat's Fedora Project (http://fedora.redhat.com/). - - Install nfs-utils-1.1.1 or greater on the client + - Install nfs-utils-1.1.2 or greater on the client - An NFS/RDMA mount point can only be obtained by using the mount.nfs - command in nfs-utils-1.1.1 or greater. To see which version of mount.nfs - you are using, type: + An NFS/RDMA mount point can be obtained by using the mount.nfs command in + nfs-utils-1.1.2 or greater (nfs-utils-1.1.1 was the first nfs-utils + version with support for NFS/RDMA mounts, but for various reasons we + recommend using nfs-utils-1.1.2 or greater). To see which version of + mount.nfs you are using, type: - > /sbin/mount.nfs -V + $ /sbin/mount.nfs -V - If the version is less than 1.1.1 or the command does not exist, - then you will need to install the latest version of nfs-utils. + If the version is less than 1.1.2 or the command does not exist, + you should install the latest version of nfs-utils. Download the latest package from: @@ -77,22 +79,33 @@ Installation Uncompress the package and follow the installation instructions. - If you will not be using GSS and NFSv4, the installation process - can be simplified by disabling these features when running configure: + If you will not need the idmapper and gssd executables (you do not need + these to create an NFS/RDMA enabled mount command), the installation + process can be simplified by disabling these features when running + configure: - > ./configure --disable-gss --disable-nfsv4 + $ ./configure --disable-gss --disable-nfsv4 - For more information on this see the package's README and INSTALL files. + To build nfs-utils you will need the tcp_wrappers package installed. For + more information on this see the package's README and INSTALL files. After building the nfs-utils package, there will be a mount.nfs binary in the utils/mount directory. This binary can be used to initiate NFS v2, v3, - or v4 mounts. To initiate a v4 mount, the binary must be called mount.nfs4. - The standard technique is to create a symlink called mount.nfs4 to mount.nfs. + or v4 mounts. To initiate a v4 mount, the binary must be called + mount.nfs4. The standard technique is to create a symlink called + mount.nfs4 to mount.nfs. - NOTE: mount.nfs and therefore nfs-utils-1.1.1 or greater is only needed + This mount.nfs binary should be installed at /sbin/mount.nfs as follows: + + $ sudo cp utils/mount/mount.nfs /sbin/mount.nfs + + In this location, mount.nfs will be invoked automatically for NFS mounts + by the system mount commmand. + + NOTE: mount.nfs and therefore nfs-utils-1.1.2 or greater is only needed on the NFS client machine. You do not need this specific version of nfs-utils on the server. Furthermore, only the mount.nfs command from - nfs-utils-1.1.1 is needed on the client. + nfs-utils-1.1.2 is needed on the client. - Install a Linux kernel with NFS/RDMA @@ -156,8 +169,8 @@ Check RDMA and NFS Setup this time. For example, if you are using a Mellanox Tavor/Sinai/Arbel card: - > modprobe ib_mthca - > modprobe ib_ipoib + $ modprobe ib_mthca + $ modprobe ib_ipoib If you are using InfiniBand, make sure there is a Subnet Manager (SM) running on the network. If your IB switch has an embedded SM, you can @@ -166,7 +179,7 @@ Check RDMA and NFS Setup If an SM is running on your network, you should see the following: - > cat /sys/class/infiniband/driverX/ports/1/state + $ cat /sys/class/infiniband/driverX/ports/1/state 4: ACTIVE where driverX is mthca0, ipath5, ehca3, etc. @@ -174,10 +187,10 @@ Check RDMA and NFS Setup To further test the InfiniBand software stack, use IPoIB (this assumes you have two IB hosts named host1 and host2): - host1> ifconfig ib0 a.b.c.x - host2> ifconfig ib0 a.b.c.y - host1> ping a.b.c.y - host2> ping a.b.c.x + host1$ ifconfig ib0 a.b.c.x + host2$ ifconfig ib0 a.b.c.y + host1$ ping a.b.c.y + host2$ ping a.b.c.x For other device types, follow the appropriate procedures. @@ -202,11 +215,11 @@ NFS/RDMA Setup /vol0 192.168.0.47(fsid=0,rw,async,insecure,no_root_squash) /vol0 192.168.0.0/255.255.255.0(fsid=0,rw,async,insecure,no_root_squash) - The IP address(es) is(are) the client's IPoIB address for an InfiniBand HCA or the - cleint's iWARP address(es) for an RNIC. + The IP address(es) is(are) the client's IPoIB address for an InfiniBand + HCA or the cleint's iWARP address(es) for an RNIC. - NOTE: The "insecure" option must be used because the NFS/RDMA client does not - use a reserved port. + NOTE: The "insecure" option must be used because the NFS/RDMA client does + not use a reserved port. Each time a machine boots: @@ -214,43 +227,45 @@ NFS/RDMA Setup For InfiniBand using a Mellanox adapter: - > modprobe ib_mthca - > modprobe ib_ipoib - > ifconfig ib0 a.b.c.d + $ modprobe ib_mthca + $ modprobe ib_ipoib + $ ifconfig ib0 a.b.c.d NOTE: use unique addresses for the client and server - Start the NFS server - If the NFS/RDMA server was built as a module (CONFIG_SUNRPC_XPRT_RDMA=m in kernel config), - load the RDMA transport module: + If the NFS/RDMA server was built as a module (CONFIG_SUNRPC_XPRT_RDMA=m in + kernel config), load the RDMA transport module: - > modprobe svcrdma + $ modprobe svcrdma - Regardless of how the server was built (module or built-in), start the server: + Regardless of how the server was built (module or built-in), start the + server: - > /etc/init.d/nfs start + $ /etc/init.d/nfs start or - > service nfs start + $ service nfs start Instruct the server to listen on the RDMA transport: - > echo rdma 2050 > /proc/fs/nfsd/portlist + $ echo rdma 2050 > /proc/fs/nfsd/portlist - On the client system - If the NFS/RDMA client was built as a module (CONFIG_SUNRPC_XPRT_RDMA=m in kernel config), - load the RDMA client module: + If the NFS/RDMA client was built as a module (CONFIG_SUNRPC_XPRT_RDMA=m in + kernel config), load the RDMA client module: - > modprobe xprtrdma.ko + $ modprobe xprtrdma.ko - Regardless of how the client was built (module or built-in), issue the mount.nfs command: + Regardless of how the client was built (module or built-in), use this + command to mount the NFS/RDMA server: - > /path/to/your/mount.nfs :/ /mnt -i -o rdma,port=2050 + $ mount -o rdma,port=2050 :/ /mnt - To verify that the mount is using RDMA, run "cat /proc/mounts" and check the - "proto" field for the given mount. + To verify that the mount is using RDMA, run "cat /proc/mounts" and check + the "proto" field for the given mount. Congratulations! You're using NFS/RDMA! diff --git a/trunk/Documentation/filesystems/proc.txt b/trunk/Documentation/filesystems/proc.txt index 7f268f327d75..8c6384bdfed4 100644 --- a/trunk/Documentation/filesystems/proc.txt +++ b/trunk/Documentation/filesystems/proc.txt @@ -296,6 +296,7 @@ Table 1-4: Kernel info in /proc uptime System uptime version Kernel version video bttv info of video resources (2.4) + vmallocinfo Show vmalloced areas .............................................................................. You can, for example, check which interrupts are currently in use and what @@ -557,6 +558,49 @@ VmallocTotal: total size of vmalloc memory area VmallocUsed: amount of vmalloc area which is used VmallocChunk: largest contigious block of vmalloc area which is free +.............................................................................. + +vmallocinfo: + +Provides information about vmalloced/vmaped areas. One line per area, +containing the virtual address range of the area, size in bytes, +caller information of the creator, and optional information depending +on the kind of area : + + pages=nr number of pages + phys=addr if a physical address was specified + ioremap I/O mapping (ioremap() and friends) + vmalloc vmalloc() area + vmap vmap()ed pages + user VM_USERMAP area + vpages buffer for pages pointers was vmalloced (huge area) + N=nr (Only on NUMA kernels) + Number of pages allocated on memory node + +> cat /proc/vmallocinfo +0xffffc20000000000-0xffffc20000201000 2101248 alloc_large_system_hash+0x204 ... + /0x2c0 pages=512 vmalloc N0=128 N1=128 N2=128 N3=128 +0xffffc20000201000-0xffffc20000302000 1052672 alloc_large_system_hash+0x204 ... + /0x2c0 pages=256 vmalloc N0=64 N1=64 N2=64 N3=64 +0xffffc20000302000-0xffffc20000304000 8192 acpi_tb_verify_table+0x21/0x4f... + phys=7fee8000 ioremap +0xffffc20000304000-0xffffc20000307000 12288 acpi_tb_verify_table+0x21/0x4f... + phys=7fee7000 ioremap +0xffffc2000031d000-0xffffc2000031f000 8192 init_vdso_vars+0x112/0x210 +0xffffc2000031f000-0xffffc2000032b000 49152 cramfs_uncompress_init+0x2e ... + /0x80 pages=11 vmalloc N0=3 N1=3 N2=2 N3=3 +0xffffc2000033a000-0xffffc2000033d000 12288 sys_swapon+0x640/0xac0 ... + pages=2 vmalloc N1=2 +0xffffc20000347000-0xffffc2000034c000 20480 xt_alloc_table_info+0xfe ... + /0x130 [x_tables] pages=4 vmalloc N0=4 +0xffffffffa0000000-0xffffffffa000f000 61440 sys_init_module+0xc27/0x1d00 ... + pages=14 vmalloc N2=14 +0xffffffffa000f000-0xffffffffa0014000 20480 sys_init_module+0xc27/0x1d00 ... + pages=4 vmalloc N1=4 +0xffffffffa0014000-0xffffffffa0017000 12288 sys_init_module+0xc27/0x1d00 ... + pages=2 vmalloc N1=2 +0xffffffffa0017000-0xffffffffa0022000 45056 sys_init_module+0xc27/0x1d00 ... + pages=10 vmalloc N0=10 1.3 IDE devices in /proc/ide ---------------------------- diff --git a/trunk/Documentation/filesystems/sysfs.txt b/trunk/Documentation/filesystems/sysfs.txt index 7f27b8f840d0..9e9c348275a9 100644 --- a/trunk/Documentation/filesystems/sysfs.txt +++ b/trunk/Documentation/filesystems/sysfs.txt @@ -248,6 +248,7 @@ The top level sysfs directory looks like: block/ bus/ class/ +dev/ devices/ firmware/ net/ @@ -274,6 +275,11 @@ fs/ contains a directory for some filesystems. Currently each filesystem wanting to export attributes must create its own hierarchy below fs/ (see ./fuse.txt for an example). +dev/ contains two directories char/ and block/. Inside these two +directories there are symlinks named :. These symlinks +point to the sysfs directory for the given device. /sys/dev provides a +quick way to lookup the sysfs interface for a device from the result of +a stat(2) operation. More information can driver-model specific features can be found in Documentation/driver-model/. diff --git a/trunk/Documentation/ia64/paravirt_ops.txt b/trunk/Documentation/ia64/paravirt_ops.txt new file mode 100644 index 000000000000..39ded02ec33f --- /dev/null +++ b/trunk/Documentation/ia64/paravirt_ops.txt @@ -0,0 +1,137 @@ +Paravirt_ops on IA64 +==================== + 21 May 2008, Isaku Yamahata + + +Introduction +------------ +The aim of this documentation is to help with maintainability and/or to +encourage people to use paravirt_ops/IA64. + +paravirt_ops (pv_ops in short) is a way for virtualization support of +Linux kernel on x86. Several ways for virtualization support were +proposed, paravirt_ops is the winner. +On the other hand, now there are also several IA64 virtualization +technologies like kvm/IA64, xen/IA64 and many other academic IA64 +hypervisors so that it is good to add generic virtualization +infrastructure on Linux/IA64. + + +What is paravirt_ops? +--------------------- +It has been developed on x86 as virtualization support via API, not ABI. +It allows each hypervisor to override operations which are important for +hypervisors at API level. And it allows a single kernel binary to run on +all supported execution environments including native machine. +Essentially paravirt_ops is a set of function pointers which represent +operations corresponding to low level sensitive instructions and high +level functionalities in various area. But one significant difference +from usual function pointer table is that it allows optimization with +binary patch. It is because some of these operations are very +performance sensitive and indirect call overhead is not negligible. +With binary patch, indirect C function call can be transformed into +direct C function call or in-place execution to eliminate the overhead. + +Thus, operations of paravirt_ops are classified into three categories. +- simple indirect call + These operations correspond to high level functionality so that the + overhead of indirect call isn't very important. + +- indirect call which allows optimization with binary patch + Usually these operations correspond to low level instructions. They + are called frequently and performance critical. So the overhead is + very important. + +- a set of macros for hand written assembly code + Hand written assembly codes (.S files) also need paravirtualization + because they include sensitive instructions or some of code paths in + them are very performance critical. + + +The relation to the IA64 machine vector +--------------------------------------- +Linux/IA64 has the IA64 machine vector functionality which allows the +kernel to switch implementations (e.g. initialization, ipi, dma api...) +depending on executing platform. +We can replace some implementations very easily defining a new machine +vector. Thus another approach for virtualization support would be +enhancing the machine vector functionality. +But paravirt_ops approach was taken because +- virtualization support needs wider support than machine vector does. + e.g. low level instruction paravirtualization. It must be + initialized very early before platform detection. + +- virtualization support needs more functionality like binary patch. + Probably the calling overhead might not be very large compared to the + emulation overhead of virtualization. However in the native case, the + overhead should be eliminated completely. + A single kernel binary should run on each environment including native, + and the overhead of paravirt_ops on native environment should be as + small as possible. + +- for full virtualization technology, e.g. KVM/IA64 or + Xen/IA64 HVM domain, the result would be + (the emulated platform machine vector. probably dig) + (pv_ops). + This means that the virtualization support layer should be under + the machine vector layer. + +Possibly it might be better to move some function pointers from +paravirt_ops to machine vector. In fact, Xen domU case utilizes both +pv_ops and machine vector. + + +IA64 paravirt_ops +----------------- +In this section, the concrete paravirt_ops will be discussed. +Because of the architecture difference between ia64 and x86, the +resulting set of functions is very different from x86 pv_ops. + +- C function pointer tables +They are not very performance critical so that simple C indirect +function call is acceptable. The following structures are defined at +this moment. For details see linux/include/asm-ia64/paravirt.h + - struct pv_info + This structure describes the execution environment. + - struct pv_init_ops + This structure describes the various initialization hooks. + - struct pv_iosapic_ops + This structure describes hooks to iosapic operations. + - struct pv_irq_ops + This structure describes hooks to irq related operations + - struct pv_time_op + This structure describes hooks to steal time accounting. + +- a set of indirect calls which need optimization +Currently this class of functions correspond to a subset of IA64 +intrinsics. At this moment the optimization with binary patch isn't +implemented yet. +struct pv_cpu_op is defined. For details see +linux/include/asm-ia64/paravirt_privop.h +Mostly they correspond to ia64 intrinsics 1-to-1. +Caveat: Now they are defined as C indirect function pointers, but in +order to support binary patch optimization, they will be changed +using GCC extended inline assembly code. + +- a set of macros for hand written assembly code (.S files) +For maintenance purpose, the taken approach for .S files is single +source code and compile multiple times with different macros definitions. +Each pv_ops instance must define those macros to compile. +The important thing here is that sensitive, but non-privileged +instructions must be paravirtualized and that some privileged +instructions also need paravirtualization for reasonable performance. +Developers who modify .S files must be aware of that. At this moment +an easy checker is implemented to detect paravirtualization breakage. +But it doesn't cover all the cases. + +Sometimes this set of macros is called pv_cpu_asm_op. But there is no +corresponding structure in the source code. +Those macros mostly 1:1 correspond to a subset of privileged +instructions. See linux/include/asm-ia64/native/inst.h. +And some functions written in assembly also need to be overrided so +that each pv_ops instance have to define some macros. Again see +linux/include/asm-ia64/native/inst.h. + + +Those structures must be initialized very early before start_kernel. +Probably initialized in head.S using multi entry point or some other trick. +For native case implementation see linux/arch/ia64/kernel/paravirt.c. diff --git a/trunk/Documentation/input/gameport-programming.txt b/trunk/Documentation/input/gameport-programming.txt index 14e0a8b70225..03a74fc3b496 100644 --- a/trunk/Documentation/input/gameport-programming.txt +++ b/trunk/Documentation/input/gameport-programming.txt @@ -1,5 +1,3 @@ -$Id: gameport-programming.txt,v 1.3 2001/04/24 13:51:37 vojtech Exp $ - Programming gameport drivers ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ diff --git a/trunk/Documentation/input/input.txt b/trunk/Documentation/input/input.txt index ff8cea0225f9..686ee9932dff 100644 --- a/trunk/Documentation/input/input.txt +++ b/trunk/Documentation/input/input.txt @@ -1,7 +1,6 @@ Linux Input drivers v1.0 (c) 1999-2001 Vojtech Pavlik Sponsored by SuSE - $Id: input.txt,v 1.8 2002/05/29 03:15:01 bradleym Exp $ ---------------------------------------------------------------------------- 0. Disclaimer diff --git a/trunk/Documentation/input/joystick-api.txt b/trunk/Documentation/input/joystick-api.txt index acbd32b88454..c507330740cd 100644 --- a/trunk/Documentation/input/joystick-api.txt +++ b/trunk/Documentation/input/joystick-api.txt @@ -5,8 +5,6 @@ 7 Aug 1998 - $Id: joystick-api.txt,v 1.2 2001/05/08 21:21:23 vojtech Exp $ - 1. Initialization ~~~~~~~~~~~~~~~~~ diff --git a/trunk/Documentation/input/joystick-parport.txt b/trunk/Documentation/input/joystick-parport.txt index ede5f33daad3..1c856f32ff2c 100644 --- a/trunk/Documentation/input/joystick-parport.txt +++ b/trunk/Documentation/input/joystick-parport.txt @@ -2,7 +2,6 @@ (c) 1998-2000 Vojtech Pavlik (c) 1998 Andree Borrmann Sponsored by SuSE - $Id: joystick-parport.txt,v 1.6 2001/09/25 09:31:32 vojtech Exp $ ---------------------------------------------------------------------------- 0. Disclaimer diff --git a/trunk/Documentation/input/joystick.txt b/trunk/Documentation/input/joystick.txt index 389de9bd9878..154d767b2acb 100644 --- a/trunk/Documentation/input/joystick.txt +++ b/trunk/Documentation/input/joystick.txt @@ -1,7 +1,6 @@ Linux Joystick driver v2.0.0 (c) 1996-2000 Vojtech Pavlik Sponsored by SuSE - $Id: joystick.txt,v 1.12 2002/03/03 12:13:07 jdeneux Exp $ ---------------------------------------------------------------------------- 0. Disclaimer diff --git a/trunk/Documentation/kernel-parameters.txt b/trunk/Documentation/kernel-parameters.txt index 09ad7450647b..497a98dafdaa 100644 --- a/trunk/Documentation/kernel-parameters.txt +++ b/trunk/Documentation/kernel-parameters.txt @@ -87,7 +87,8 @@ parameter is applicable: SH SuperH architecture is enabled. SMP The kernel is an SMP kernel. SPARC Sparc architecture is enabled. - SWSUSP Software suspend is enabled. + SWSUSP Software suspend (hibernation) is enabled. + SUSPEND System suspend states are enabled. TS Appropriate touchscreen support is enabled. USB USB support is enabled. USBHID USB Human Interface Device support is enabled. @@ -147,10 +148,12 @@ and is between 256 and 4096 characters. It is defined in the file default: 0 acpi_sleep= [HW,ACPI] Sleep options - Format: { s3_bios, s3_mode, s3_beep, old_ordering } + Format: { s3_bios, s3_mode, s3_beep, s4_nohwsig, old_ordering } See Documentation/power/video.txt for s3_bios and s3_mode. s3_beep is for debugging; it makes the PC's speaker beep as soon as the kernel's real-mode entry point is called. + s4_nohwsig prevents ACPI hardware signature from being + used during resume from hibernation. old_ordering causes the ACPI 1.0 ordering of the _PTS control method, wrt putting devices into low power states, to be enforced (the ACPI 2.0 ordering of _PTS is @@ -774,8 +777,22 @@ and is between 256 and 4096 characters. It is defined in the file hisax= [HW,ISDN] See Documentation/isdn/README.HiSax. - hugepages= [HW,X86-32,IA-64] Maximal number of HugeTLB pages. - hugepagesz= [HW,IA-64,PPC] The size of the HugeTLB pages. + hugepages= [HW,X86-32,IA-64] HugeTLB pages to allocate at boot. + hugepagesz= [HW,IA-64,PPC,X86-64] The size of the HugeTLB pages. + On x86-64 and powerpc, this option can be specified + multiple times interleaved with hugepages= to reserve + huge pages of different sizes. Valid pages sizes on + x86-64 are 2M (when the CPU supports "pse") and 1G + (when the CPU supports the "pdpe1gb" cpuinfo flag) + Note that 1GB pages can only be allocated at boot time + using hugepages= and not freed afterwards. + default_hugepagesz= + [same as hugepagesz=] The size of the default + HugeTLB page size. This is the size represented by + the legacy /proc/ hugepages APIs, used for SHM, and + default size when mounting hugetlbfs filesystems. + Defaults to the default architecture's huge page size + if not specified. i8042.direct [HW] Put keyboard port into non-translated mode i8042.dumbkbd [HW] Pretend that controller can only read data from @@ -1206,7 +1223,7 @@ and is between 256 and 4096 characters. It is defined in the file or memmap=0x10000$0x18690000 - memtest= [KNL,X86_64] Enable memtest + memtest= [KNL,X86] Enable memtest Format: range: 0,4 : pattern number default : 0 @@ -1225,6 +1242,14 @@ and is between 256 and 4096 characters. It is defined in the file mga= [HW,DRM] + mminit_loglevel= + [KNL] When CONFIG_DEBUG_MEMORY_INIT is set, this + parameter allows control of the logging verbosity for + the additional memory initialisation checks. A value + of 0 disables mminit logging and a level of 4 will + log everything. Information is printed at KERN_DEBUG + so loglevel=8 may also need to be specified. + mousedev.tap_time= [MOUSE] Maximum time between finger touching and leaving touchpad surface for touch to be considered @@ -1279,6 +1304,13 @@ and is between 256 and 4096 characters. It is defined in the file This usage is only documented in each driver source file if at all. + nf_conntrack.acct= + [NETFILTER] Enable connection tracking flow accounting + 0 to disable accounting + 1 to enable accounting + Default value depends on CONFIG_NF_CT_ACCT that is + going to be removed in 2.6.29. + nfsaddrs= [NFS] See Documentation/filesystems/nfsroot.txt. @@ -2027,6 +2059,9 @@ and is between 256 and 4096 characters. It is defined in the file snd-ymfpci= [HW,ALSA] + softlockup_panic= + [KNL] Should the soft-lockup detector generate panics. + sonypi.*= [HW] Sony Programmable I/O Control Device driver See Documentation/sonypi.txt @@ -2091,6 +2126,12 @@ and is between 256 and 4096 characters. It is defined in the file tdfx= [HW,DRM] + test_suspend= [SUSPEND] + Specify "mem" (for Suspend-to-RAM) or "standby" (for + standby suspend) as the system sleep state to briefly + enter during system startup. The system is woken from + this state using a wakeup-capable RTC alarm. + thash_entries= [KNL,NET] Set number of hash buckets for TCP connection @@ -2158,6 +2199,10 @@ and is between 256 and 4096 characters. It is defined in the file Note that genuine overcurrent events won't be reported either. + unknown_nmi_panic + [X86-32,X86-64] + Set unknown_nmi_panic=1 early on boot. + usbcore.autosuspend= [USB] The autosuspend time delay (in seconds) used for newly-detected USB devices (default 2). This diff --git a/trunk/Documentation/md.txt b/trunk/Documentation/md.txt index a8b430627473..1da9d1b1793f 100644 --- a/trunk/Documentation/md.txt +++ b/trunk/Documentation/md.txt @@ -236,6 +236,11 @@ All md devices contain: writing the word for the desired state, however some states cannot be explicitly set, and some transitions are not allowed. + Select/poll works on this file. All changes except between + active_idle and active (which can be frequent and are not + very interesting) are notified. active->active_idle is + reported if the metadata is externally managed. + clear No devices, no size, no level Writing is equivalent to STOP_ARRAY ioctl @@ -292,6 +297,10 @@ Each directory contains: writemostly - device will only be subject to read requests if there are no other options. This applies only to raid1 arrays. + blocked - device has failed, metadata is "external", + and the failure hasn't been acknowledged yet. + Writes that would write to this device if + it were not faulty are blocked. spare - device is working, but not a full member. This includes spares that are in the process of being recovered to @@ -301,6 +310,12 @@ Each directory contains: Writing "remove" removes the device from the array. Writing "writemostly" sets the writemostly flag. Writing "-writemostly" clears the writemostly flag. + Writing "blocked" sets the "blocked" flag. + Writing "-blocked" clear the "blocked" flag and allows writes + to complete. + + This file responds to select/poll. Any change to 'faulty' + or 'blocked' causes an event. errors An approximate count of read errors that have been detected on @@ -332,7 +347,7 @@ Each directory contains: for storage of data. This will normally be the same as the component_size. This can be written while assembling an array. If a value less than the current component_size is - written, component_size will be reduced to this value. + written, it will be rejected. An active md device will also contain and entry for each active device @@ -381,6 +396,19 @@ also have 'check' and 'repair' will start the appropriate process providing the current state is 'idle'. + This file responds to select/poll. Any important change in the value + triggers a poll event. Sometimes the value will briefly be + "recover" if a recovery seems to be needed, but cannot be + achieved. In that case, the transition to "recover" isn't + notified, but the transition away is. + + degraded + This contains a count of the number of devices by which the + arrays is degraded. So an optimal array with show '0'. A + single failed/missing drive will show '1', etc. + This file responds to select/poll, any increase or decrease + in the count of missing devices will trigger an event. + mismatch_count When performing 'check' and 'repair', and possibly when performing 'resync', md will count the number of errors that are diff --git a/trunk/Documentation/networking/bonding.txt b/trunk/Documentation/networking/bonding.txt index a0cda062bc33..7fa7fe71d7a8 100644 --- a/trunk/Documentation/networking/bonding.txt +++ b/trunk/Documentation/networking/bonding.txt @@ -289,35 +289,73 @@ downdelay fail_over_mac Specifies whether active-backup mode should set all slaves to - the same MAC address (the traditional behavior), or, when - enabled, change the bond's MAC address when changing the - active interface (i.e., fail over the MAC address itself). - - Fail over MAC is useful for devices that cannot ever alter - their MAC address, or for devices that refuse incoming - broadcasts with their own source MAC (which interferes with - the ARP monitor). - - The down side of fail over MAC is that every device on the - network must be updated via gratuitous ARP, vs. just updating - a switch or set of switches (which often takes place for any - traffic, not just ARP traffic, if the switch snoops incoming - traffic to update its tables) for the traditional method. If - the gratuitous ARP is lost, communication may be disrupted. - - When fail over MAC is used in conjuction with the mii monitor, - devices which assert link up prior to being able to actually - transmit and receive are particularly susecptible to loss of - the gratuitous ARP, and an appropriate updelay setting may be - required. - - A value of 0 disables fail over MAC, and is the default. A - value of 1 enables fail over MAC. This option is enabled - automatically if the first slave added cannot change its MAC - address. This option may be modified via sysfs only when no - slaves are present in the bond. - - This option was added in bonding version 3.2.0. + the same MAC address at enslavement (the traditional + behavior), or, when enabled, perform special handling of the + bond's MAC address in accordance with the selected policy. + + Possible values are: + + none or 0 + + This setting disables fail_over_mac, and causes + bonding to set all slaves of an active-backup bond to + the same MAC address at enslavement time. This is the + default. + + active or 1 + + The "active" fail_over_mac policy indicates that the + MAC address of the bond should always be the MAC + address of the currently active slave. The MAC + address of the slaves is not changed; instead, the MAC + address of the bond changes during a failover. + + This policy is useful for devices that cannot ever + alter their MAC address, or for devices that refuse + incoming broadcasts with their own source MAC (which + interferes with the ARP monitor). + + The down side of this policy is that every device on + the network must be updated via gratuitous ARP, + vs. just updating a switch or set of switches (which + often takes place for any traffic, not just ARP + traffic, if the switch snoops incoming traffic to + update its tables) for the traditional method. If the + gratuitous ARP is lost, communication may be + disrupted. + + When this policy is used in conjuction with the mii + monitor, devices which assert link up prior to being + able to actually transmit and receive are particularly + susecptible to loss of the gratuitous ARP, and an + appropriate updelay setting may be required. + + follow or 2 + + The "follow" fail_over_mac policy causes the MAC + address of the bond to be selected normally (normally + the MAC address of the first slave added to the bond). + However, the second and subsequent slaves are not set + to this MAC address while they are in a backup role; a + slave is programmed with the bond's MAC address at + failover time (and the formerly active slave receives + the newly active slave's MAC address). + + This policy is useful for multiport devices that + either become confused or incur a performance penalty + when multiple ports are programmed with the same MAC + address. + + + The default policy is none, unless the first slave cannot + change its MAC address, in which case the active policy is + selected by default. + + This option may be modified via sysfs only when no slaves are + present in the bond. + + This option was added in bonding version 3.2.0. The "follow" + policy was added in bonding version 3.3.0. lacp_rate @@ -338,7 +376,8 @@ max_bonds Specifies the number of bonding devices to create for this instance of the bonding driver. E.g., if max_bonds is 3, and the bonding driver is not already loaded, then bond0, bond1 - and bond2 will be created. The default value is 1. + and bond2 will be created. The default value is 1. Specifying + a value of 0 will load bonding, but will not create any devices. miimon @@ -501,6 +540,17 @@ mode swapped with the new curr_active_slave that was chosen. +num_grat_arp + + Specifies the number of gratuitous ARPs to be issued after a + failover event. One gratuitous ARP is issued immediately after + the failover, subsequent ARPs are sent at a rate of one per link + monitor interval (arp_interval or miimon, whichever is active). + + The valid range is 0 - 255; the default value is 1. This option + affects only the active-backup mode. This option was added for + bonding version 3.3.0. + primary A string (eth0, eth2, etc) specifying which slave is the diff --git a/trunk/Documentation/networking/dm9000.txt b/trunk/Documentation/networking/dm9000.txt new file mode 100644 index 000000000000..65df3dea5561 --- /dev/null +++ b/trunk/Documentation/networking/dm9000.txt @@ -0,0 +1,167 @@ +DM9000 Network driver +===================== + +Copyright 2008 Simtec Electronics, + Ben Dooks + + +Introduction +------------ + +This file describes how to use the DM9000 platform-device based network driver +that is contained in the files drivers/net/dm9000.c and drivers/net/dm9000.h. + +The driver supports three DM9000 variants, the DM9000E which is the first chip +supported as well as the newer DM9000A and DM9000B devices. It is currently +maintained and tested by Ben Dooks, who should be CC: to any patches for this +driver. + + +Defining the platform device +---------------------------- + +The minimum set of resources attached to the platform device are as follows: + + 1) The physical address of the address register + 2) The physical address of the data register + 3) The IRQ line the device's interrupt pin is connected to. + +These resources should be specified in that order, as the ordering of the +two address regions is important (the driver expects these to be address +and then data). + +An example from arch/arm/mach-s3c2410/mach-bast.c is: + +static struct resource bast_dm9k_resource[] = { + [0] = { + .start = S3C2410_CS5 + BAST_PA_DM9000, + .end = S3C2410_CS5 + BAST_PA_DM9000 + 3, + .flags = IORESOURCE_MEM, + }, + [1] = { + .start = S3C2410_CS5 + BAST_PA_DM9000 + 0x40, + .end = S3C2410_CS5 + BAST_PA_DM9000 + 0x40 + 0x3f, + .flags = IORESOURCE_MEM, + }, + [2] = { + .start = IRQ_DM9000, + .end = IRQ_DM9000, + .flags = IORESOURCE_IRQ | IORESOURCE_IRQ_HIGHLEVEL, + } +}; + +static struct platform_device bast_device_dm9k = { + .name = "dm9000", + .id = 0, + .num_resources = ARRAY_SIZE(bast_dm9k_resource), + .resource = bast_dm9k_resource, +}; + +Note the setting of the IRQ trigger flag in bast_dm9k_resource[2].flags, +as this will generate a warning if it is not present. The trigger from +the flags field will be passed to request_irq() when registering the IRQ +handler to ensure that the IRQ is setup correctly. + +This shows a typical platform device, without the optional configuration +platform data supplied. The next example uses the same resources, but adds +the optional platform data to pass extra configuration data: + +static struct dm9000_plat_data bast_dm9k_platdata = { + .flags = DM9000_PLATF_16BITONLY, +}; + +static struct platform_device bast_device_dm9k = { + .name = "dm9000", + .id = 0, + .num_resources = ARRAY_SIZE(bast_dm9k_resource), + .resource = bast_dm9k_resource, + .dev = { + .platform_data = &bast_dm9k_platdata, + } +}; + +The platform data is defined in include/linux/dm9000.h and described below. + + +Platform data +------------- + +Extra platform data for the DM9000 can describe the IO bus width to the +device, whether or not an external PHY is attached to the device and +the availability of an external configuration EEPROM. + +The flags for the platform data .flags field are as follows: + +DM9000_PLATF_8BITONLY + + The IO should be done with 8bit operations. + +DM9000_PLATF_16BITONLY + + The IO should be done with 16bit operations. + +DM9000_PLATF_32BITONLY + + The IO should be done with 32bit operations. + +DM9000_PLATF_EXT_PHY + + The chip is connected to an external PHY. + +DM9000_PLATF_NO_EEPROM + + This can be used to signify that the board does not have an + EEPROM, or that the EEPROM should be hidden from the user. + +DM9000_PLATF_SIMPLE_PHY + + Switch to using the simpler PHY polling method which does not + try and read the MII PHY state regularly. This is only available + when using the internal PHY. See the section on link state polling + for more information. + + The config symbol DM9000_FORCE_SIMPLE_PHY_POLL, Kconfig entry + "Force simple NSR based PHY polling" allows this flag to be + forced on at build time. + + +PHY Link state polling +---------------------- + +The driver keeps track of the link state and informs the network core +about link (carrier) availablilty. This is managed by several methods +depending on the version of the chip and on which PHY is being used. + +For the internal PHY, the original (and currently default) method is +to read the MII state, either when the status changes if we have the +necessary interrupt support in the chip or every two seconds via a +periodic timer. + +To reduce the overhead for the internal PHY, there is now the option +of using the DM9000_FORCE_SIMPLE_PHY_POLL config, or DM9000_PLATF_SIMPLE_PHY +platform data option to read the summary information without the +expensive MII accesses. This method is faster, but does not print +as much information. + +When using an external PHY, the driver currently has to poll the MII +link status as there is no method for getting an interrupt on link change. + + +DM9000A / DM9000B +----------------- + +These chips are functionally similar to the DM9000E and are supported easily +by the same driver. The features are: + + 1) Interrupt on internal PHY state change. This means that the periodic + polling of the PHY status may be disabled on these devices when using + the internal PHY. + + 2) TCP/UDP checksum offloading, which the driver does not currently support. + + +ethtool +------- + +The driver supports the ethtool interface for access to the driver +state information, the PHY state and the EEPROM. diff --git a/trunk/Documentation/networking/e1000.txt b/trunk/Documentation/networking/e1000.txt index 61b171cf5313..2df71861e578 100644 --- a/trunk/Documentation/networking/e1000.txt +++ b/trunk/Documentation/networking/e1000.txt @@ -513,21 +513,11 @@ Additional Configurations Intel(R) PRO/1000 PT Dual Port Server Connection Intel(R) PRO/1000 PT Dual Port Server Adapter Intel(R) PRO/1000 PF Dual Port Server Adapter - Intel(R) PRO/1000 PT Quad Port Server Adapter + Intel(R) PRO/1000 PT Quad Port Server Adapter NAPI ---- - NAPI (Rx polling mode) is supported in the e1000 driver. NAPI is enabled - or disabled based on the configuration of the kernel. To override - the default, use the following compile-time flags. - - To enable NAPI, compile the driver module, passing in a configuration option: - - make CFLAGS_EXTRA=-DE1000_NAPI install - - To disable NAPI, compile the driver module, passing in a configuration option: - - make CFLAGS_EXTRA=-DE1000_NO_NAPI install + NAPI (Rx polling mode) is enabled in the e1000 driver. See www.cyberus.ca/~hadi/usenix-paper.tgz for more information on NAPI. diff --git a/trunk/Documentation/networking/ip-sysctl.txt b/trunk/Documentation/networking/ip-sysctl.txt index 946b66e1b652..d84932650fd3 100644 --- a/trunk/Documentation/networking/ip-sysctl.txt +++ b/trunk/Documentation/networking/ip-sysctl.txt @@ -551,8 +551,9 @@ icmp_echo_ignore_broadcasts - BOOLEAN icmp_ratelimit - INTEGER Limit the maximal rates for sending ICMP packets whose type matches icmp_ratemask (see below) to specific targets. - 0 to disable any limiting, otherwise the maximal rate in jiffies(1) - Default: 100 + 0 to disable any limiting, + otherwise the minimal space between responses in milliseconds. + Default: 1000 icmp_ratemask - INTEGER Mask made of ICMP types for which rates are being limited. @@ -1023,11 +1024,23 @@ max_addresses - INTEGER autoconfigured addresses. Default: 16 +disable_ipv6 - BOOLEAN + Disable IPv6 operation. + Default: FALSE (enable IPv6 operation) + +accept_dad - INTEGER + Whether to accept DAD (Duplicate Address Detection). + 0: Disable DAD + 1: Enable DAD (default) + 2: Enable DAD, and disable IPv6 operation if MAC-based duplicate + link-local address has been found. + icmp/*: ratelimit - INTEGER Limit the maximal rates for sending ICMPv6 packets. - 0 to disable any limiting, otherwise the maximal rate in jiffies(1) - Default: 100 + 0 to disable any limiting, + otherwise the minimal space between responses in milliseconds. + Default: 1000 IPv6 Update by: diff --git a/trunk/Documentation/networking/ixgb.txt b/trunk/Documentation/networking/ixgb.txt index 7c98277777eb..a0d0ffb5e584 100644 --- a/trunk/Documentation/networking/ixgb.txt +++ b/trunk/Documentation/networking/ixgb.txt @@ -1,7 +1,7 @@ -Linux* Base Driver for the Intel(R) PRO/10GbE Family of Adapters -================================================================ +Linux Base Driver for 10 Gigabit Intel(R) Network Connection +============================================================= -November 17, 2004 +October 9, 2007 Contents @@ -9,94 +9,151 @@ Contents - In This Release - Identifying Your Adapter +- Building and Installation - Command Line Parameters - Improving Performance +- Additional Configurations +- Known Issues/Troubleshooting - Support + In This Release =============== -This file describes the Linux* Base Driver for the Intel(R) PRO/10GbE Family -of Adapters, version 1.0.x. +This file describes the ixgb Linux Base Driver for the 10 Gigabit Intel(R) +Network Connection. This driver includes support for Itanium(R)2-based +systems. + +For questions related to hardware requirements, refer to the documentation +supplied with your 10 Gigabit adapter. All hardware requirements listed apply +to use with Linux. + +The following features are available in this kernel: + - Native VLANs + - Channel Bonding (teaming) + - SNMP + +Channel Bonding documentation can be found in the Linux kernel source: +/Documentation/networking/bonding.txt + +The driver information previously displayed in the /proc filesystem is not +supported in this release. Alternatively, you can use ethtool (version 1.6 +or later), lspci, and ifconfig to obtain the same information. + +Instructions on updating ethtool can be found in the section "Additional +Configurations" later in this document. -For questions related to hardware requirements, refer to the documentation -supplied with your Intel PRO/10GbE adapter. All hardware requirements listed -apply to use with Linux. Identifying Your Adapter ======================== -To verify your Intel adapter is supported, find the board ID number on the -adapter. Look for a label that has a barcode and a number in the format -A12345-001. +The following Intel network adapters are compatible with the drivers in this +release: + +Controller Adapter Name Physical Layer +---------- ------------ -------------- +82597EX Intel(R) PRO/10GbE LR/SR/CX4 10G Base-LR (1310 nm optical fiber) + Server Adapters 10G Base-SR (850 nm optical fiber) + 10G Base-CX4(twin-axial copper cabling) + +For more information on how to identify your adapter, go to the Adapter & +Driver ID Guide at: + + http://support.intel.com/support/network/sb/CS-012904.htm + + +Building and Installation +========================= + +select m for "Intel(R) PRO/10GbE support" located at: + Location: + -> Device Drivers + -> Network device support (NETDEVICES [=y]) + -> Ethernet (10000 Mbit) (NETDEV_10000 [=y]) +1. make modules && make modules_install + +2. Load the module: + +    modprobe ixgb = + + The insmod command can be used if the full + path to the driver module is specified. For example: + + insmod /lib/modules//kernel/drivers/net/ixgb/ixgb.ko + + With 2.6 based kernels also make sure that older ixgb drivers are + removed from the kernel, before loading the new module: -Use the above information and the Adapter & Driver ID Guide at: + rmmod ixgb; modprobe ixgb - http://support.intel.com/support/network/adapter/pro100/21397.htm +3. Assign an IP address to the interface by entering the following, where + x is the interface number: -For the latest Intel network drivers for Linux, go to: + ifconfig ethx + +4. Verify that the interface works. Enter the following, where + is the IP address for another machine on the same subnet as the interface + that is being tested: + + ping - http://downloadfinder.intel.com/scripts-df/support_intel.asp Command Line Parameters ======================= -If the driver is built as a module, the following optional parameters are -used by entering them on the command line with the modprobe or insmod command -using this syntax: +If the driver is built as a module, the following optional parameters are +used by entering them on the command line with the modprobe command using +this syntax: modprobe ixgb [