Commit e5f4a34

---
r: 118525
b: refs/heads/master
c: 6597cb8
h: refs/heads/master
i:
  118523: 4c4c72e
v: v3
---
Russell King authored and Russell King committed Nov 6, 2008
1 parent 4ad2d23 commit e5f4a34
Showing 54 changed files with 834 additions and 299 deletions.
2 changes: 1 addition & 1 deletion [refs]
@@ -1,2 +1,2 @@
 ---
-refs/heads/master: 5c32f62b97d62bec097c09e54e6602d0fce2af07
+refs/heads/master: 6597cb84c86cefe4e174533b79e17b86f634b5e0
82 changes: 82 additions & 0 deletions trunk/Documentation/io-mapping.txt
@@ -0,0 +1,82 @@
The io_mapping functions in linux/io-mapping.h provide an abstraction for
efficiently mapping small regions of an I/O device to the CPU. The initial
usage is to support the large graphics aperture on 32-bit processors where
ioremap_wc cannot be used to statically map the entire aperture to the CPU
as it would consume too much of the kernel address space.

A mapping object is created during driver initialization using

	struct io_mapping *io_mapping_create_wc(unsigned long base,
						unsigned long size)

'base' is the bus address of the region to be made
mappable, while 'size' indicates how large a mapping region to
enable. Both are in bytes.

This _wc variant provides a mapping which may only be used with
io_mapping_map_atomic_wc or io_mapping_map_wc.

With this mapping object, individual pages can be mapped either atomically
or not, depending on the necessary scheduling environment. Of course, atomic
maps are more efficient:

	void *io_mapping_map_atomic_wc(struct io_mapping *mapping,
				       unsigned long offset)

'offset' is the offset within the defined mapping region.
Accessing addresses beyond the region specified in the
creation function yields undefined results. Using an offset
which is not page aligned yields an undefined result. The
return value points to a single page in CPU address space.

This _wc variant returns a write-combining map to the
page and may only be used with mappings created by
io_mapping_create_wc.

Note that the task may not sleep while holding this page
mapped.

void io_mapping_unmap_atomic(void *vaddr)

'vaddr' must be the value returned by the last
io_mapping_map_atomic_wc call. This unmaps the specified
page and allows the task to sleep once again.

If you need to sleep while holding a mapping, you can use the non-atomic
variant, although it may be significantly slower.

	void *io_mapping_map_wc(struct io_mapping *mapping,
				unsigned long offset)

This works like io_mapping_map_atomic_wc except it allows
the task to sleep while holding the page mapped.

void io_mapping_unmap(void *vaddr)

This works like io_mapping_unmap_atomic, except it is used
for pages mapped with io_mapping_map_wc.

At driver close time, the io_mapping object must be freed:

void io_mapping_free(struct io_mapping *mapping)
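
Putting the calls together, a driver might use this API as in the
following sketch (hypothetical driver code, not from the kernel tree;
the example_* names, the global, and the offsets are made up for
illustration):

	#include <linux/io-mapping.h>

	static struct io_mapping *gtt_mapping;

	static int example_init(unsigned long aperture_base,
				unsigned long aperture_size)
	{
		/* Create the mapping object once, at initialization. */
		gtt_mapping = io_mapping_create_wc(aperture_base,
						   aperture_size);
		if (!gtt_mapping)
			return -ENOMEM;
		return 0;
	}

	static void example_write_dword(unsigned long page_offset, u32 val)
	{
		/* page_offset must be page aligned, per the rules above. */
		u32 *vaddr = io_mapping_map_atomic_wc(gtt_mapping,
						      page_offset);
		/* No sleeping between map_atomic and unmap_atomic. */
		vaddr[0] = val;
		io_mapping_unmap_atomic(vaddr);
	}

	static void example_exit(void)
	{
		io_mapping_free(gtt_mapping);
	}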

Current Implementation:

The initial implementation of these functions uses existing mapping
mechanisms and so provides only an abstraction layer and no new
functionality.

On 64-bit processors, io_mapping_create_wc calls ioremap_wc for the whole
range, creating a permanent kernel-visible mapping to the resource. The
map_atomic and map functions add the requested offset to the base of the
virtual address returned by ioremap_wc.
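
In other words, on 64-bit the mapping object can collapse to little
more than the permanent mapping itself. A simplified sketch of the
behaviour described above (an illustration, not the verbatim kernel
code):

	static inline struct io_mapping *
	io_mapping_create_wc(unsigned long base, unsigned long size)
	{
		/* Map the whole resource once, permanently. */
		return (struct io_mapping *) ioremap_wc(base, size);
	}

	static inline void *
	io_mapping_map_atomic_wc(struct io_mapping *mapping,
				 unsigned long offset)
	{
		/* Just add the offset to the base virtual address. */
		return ((char *) mapping) + offset;
	}

	static inline void
	io_mapping_unmap_atomic(void *vaddr)
	{
		/* Nothing to undo; the mapping is permanent. */
	}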

On 32-bit processors with HIGHMEM defined, io_mapping_map_atomic_wc uses
kmap_atomic_pfn to map the specified page in an atomic fashion;
kmap_atomic_pfn isn't really supposed to be used with device pages, but it
provides an efficient mapping for this usage.

On 32-bit processors without HIGHMEM defined, io_mapping_map_atomic_wc and
io_mapping_map_wc both use ioremap_wc, a terribly inefficient function which
performs an IPI to inform all processors about the new mapping. This results
in a significant performance penalty.
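
A sketch of that fallback path (again simplified; it assumes the
mapping object records the bus address of the region in a 'base'
field):

	static inline void *
	io_mapping_map_wc(struct io_mapping *mapping, unsigned long offset)
	{
		/* A fresh single-page mapping on every call; the
		 * cross-processor flush (IPI) is what makes it slow. */
		return ioremap_wc(mapping->base + offset, PAGE_SIZE);
	}

	static inline void
	io_mapping_unmap(void *vaddr)
	{
		iounmap(vaddr);
	}
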
12 changes: 6 additions & 6 deletions trunk/arch/arm/include/asm/memory.h
@@ -44,10 +44,10 @@
* The module space lives between the addresses given by TASK_SIZE
* and PAGE_OFFSET - it must be within 32MB of the kernel text.
*/
-#define MODULE_END (PAGE_OFFSET)
-#define MODULE_START (MODULE_END - 16*1048576)
+#define MODULES_END (PAGE_OFFSET)
+#define MODULES_VADDR (MODULES_END - 16*1048576)

-#if TASK_SIZE > MODULE_START
+#if TASK_SIZE > MODULES_VADDR
#error Top of user space clashes with start of module space
#endif

@@ -56,7 +56,7 @@
* Since we use sections to map it, this macro replaces the physical address
* with its virtual address while keeping offset from the base section.
*/
-#define XIP_VIRT_ADDR(physaddr) (MODULE_START + ((physaddr) & 0x000fffff))
+#define XIP_VIRT_ADDR(physaddr) (MODULES_VADDR + ((physaddr) & 0x000fffff))

/*
* Allow 16MB-aligned ioremap pages
@@ -94,8 +94,8 @@
/*
* The module can be at any place in ram in nommu mode.
*/
-#define MODULE_END (END_MEM)
-#define MODULE_START (PHYS_OFFSET)
+#define MODULES_END (END_MEM)
+#define MODULES_VADDR (PHYS_OFFSET)

#endif /* !CONFIG_MMU */

4 changes: 4 additions & 0 deletions trunk/arch/arm/include/asm/system.h
@@ -42,6 +42,10 @@
#define CR_U (1 << 22) /* Unaligned access operation */
#define CR_XP (1 << 23) /* Extended page tables */
#define CR_VE (1 << 24) /* Vectored interrupts */
#define CR_EE (1 << 25) /* Exception (Big) Endian */
#define CR_TRE (1 << 28) /* TEX remap enable */
#define CR_AFE (1 << 29) /* Access flag enable */
#define CR_TE (1 << 30) /* Thumb exception enable */

/*
* This is used to ensure the compiler did actually allocate the register we
6 changes: 5 additions & 1 deletion trunk/arch/arm/kernel/elf.c
@@ -21,12 +21,16 @@ int elf_check_arch(const struct elf32_hdr *x)

eflags = x->e_flags;
if ((eflags & EF_ARM_EABI_MASK) == EF_ARM_EABI_UNKNOWN) {
unsigned int flt_fmt;

/* APCS26 is only allowed if the CPU supports it */
if ((eflags & EF_ARM_APCS_26) && !(elf_hwcap & HWCAP_26BIT))
return 0;

flt_fmt = eflags & (EF_ARM_VFP_FLOAT | EF_ARM_SOFT_FLOAT);

/* VFP requires the supporting code */
-if ((eflags & EF_ARM_VFP_FLOAT) && !(elf_hwcap & HWCAP_VFP))
+if (flt_fmt == EF_ARM_VFP_FLOAT && !(elf_hwcap & HWCAP_VFP))
return 0;
}
return 1;
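
The practical effect of the flt_fmt change: a binary is rejected only
when its float-format field is purely VFP, not merely when the VFP bit
happens to be set. A standalone demonstration (ordinary C outside the
kernel, using the EF_ARM_* values from elf.h):

	#include <assert.h>

	#define EF_ARM_SOFT_FLOAT 0x200
	#define EF_ARM_VFP_FLOAT  0x400

	int main(void)
	{
		unsigned int eflags = EF_ARM_VFP_FLOAT | EF_ARM_SOFT_FLOAT;
		unsigned int flt_fmt =
			eflags & (EF_ARM_VFP_FLOAT | EF_ARM_SOFT_FLOAT);

		/* The old test fired on the VFP bit alone... */
		assert(eflags & EF_ARM_VFP_FLOAT);
		/* ...the new one only on a pure VFP float format. */
		assert(flt_fmt != EF_ARM_VFP_FLOAT);
		return 0;
	}
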
8 changes: 4 additions & 4 deletions trunk/arch/arm/kernel/module.c
@@ -26,12 +26,12 @@
/*
* The XIP kernel text is mapped in the module area for modules and
* some other stuff to work without any indirect relocations.
-* MODULE_START is redefined here and not in asm/memory.h to avoid
+* MODULES_VADDR is redefined here and not in asm/memory.h to avoid
* recompiling the whole kernel when CONFIG_XIP_KERNEL is turned on/off.
*/
extern void _etext;
-#undef MODULE_START
-#define MODULE_START (((unsigned long)&_etext + ~PGDIR_MASK) & PGDIR_MASK)
+#undef MODULES_VADDR
+#define MODULES_VADDR (((unsigned long)&_etext + ~PGDIR_MASK) & PGDIR_MASK)
#endif

#ifdef CONFIG_MMU
@@ -43,7 +43,7 @@ void *module_alloc(unsigned long size)
if (!size)
return NULL;

-area = __get_vm_area(size, VM_ALLOC, MODULE_START, MODULE_END);
+area = __get_vm_area(size, VM_ALLOC, MODULES_VADDR, MODULES_END);
if (!area)
return NULL;

4 changes: 2 additions & 2 deletions trunk/arch/arm/mm/cache-xsc3l2.c
@@ -98,7 +98,7 @@ static void xsc3_l2_inv_range(unsigned long start, unsigned long end)
/*
* Clean and invalidate partial last cache line.
*/
-if (end & (CACHE_LINE_SIZE - 1)) {
+if (start < end && (end & (CACHE_LINE_SIZE - 1))) {
xsc3_l2_clean_pa(end & ~(CACHE_LINE_SIZE - 1));
xsc3_l2_inv_pa(end & ~(CACHE_LINE_SIZE - 1));
end &= ~(CACHE_LINE_SIZE - 1);
@@ -107,7 +107,7 @@ static void xsc3_l2_inv_range(unsigned long start, unsigned long end)
/*
* Invalidate all full cache lines between 'start' and 'end'.
*/
-while (start != end) {
+while (start < end) {
xsc3_l2_inv_pa(start);
start += CACHE_LINE_SIZE;
}
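
Why both guards matter: for a range that lies entirely inside one
cache line, the partial-first-line handling earlier in
xsc3_l2_inv_range rounds 'start' up past 'end'; the old code would
then write back a line outside the range, and 'while (start != end)'
would never terminate. A standalone illustration of the arithmetic
(plain C outside the kernel; the addresses are made up):

	#include <assert.h>

	#define CACHE_LINE_SIZE 32

	int main(void)
	{
		unsigned long start = 0x1010, end = 0x1018;

		/* Partial first line: round start up to a boundary,
		 * as the function does before the hunk shown above. */
		if (start & (CACHE_LINE_SIZE - 1))
			start = (start | (CACHE_LINE_SIZE - 1)) + 1;

		/* start (0x1020) now lies beyond end (0x1018), so only
		 * 'start < end' comparisons remain safe. */
		assert(start > end);
		assert(end & (CACHE_LINE_SIZE - 1));
		return 0;
	}
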
111 changes: 79 additions & 32 deletions trunk/arch/arm/mm/mmu.c
@@ -180,20 +180,20 @@ void adjust_cr(unsigned long mask, unsigned long set)
#endif

#define PROT_PTE_DEVICE L_PTE_PRESENT|L_PTE_YOUNG|L_PTE_DIRTY|L_PTE_WRITE
-#define PROT_SECT_DEVICE PMD_TYPE_SECT|PMD_SECT_XN|PMD_SECT_AP_WRITE
+#define PROT_SECT_DEVICE PMD_TYPE_SECT|PMD_SECT_AP_WRITE

static struct mem_type mem_types[] = {
[MT_DEVICE] = { /* Strongly ordered / ARMv6 shared device */
.prot_pte = PROT_PTE_DEVICE | L_PTE_MT_DEV_SHARED |
L_PTE_SHARED,
.prot_l1 = PMD_TYPE_TABLE,
-.prot_sect = PROT_SECT_DEVICE | PMD_SECT_UNCACHED,
+.prot_sect = PROT_SECT_DEVICE | PMD_SECT_S,
.domain = DOMAIN_IO,
},
[MT_DEVICE_NONSHARED] = { /* ARMv6 non-shared device */
.prot_pte = PROT_PTE_DEVICE | L_PTE_MT_DEV_NONSHARED,
.prot_l1 = PMD_TYPE_TABLE,
-.prot_sect = PROT_SECT_DEVICE | PMD_SECT_TEX(2),
+.prot_sect = PROT_SECT_DEVICE,
.domain = DOMAIN_IO,
},
[MT_DEVICE_CACHED] = { /* ioremap_cached */
@@ -205,7 +205,7 @@ static struct mem_type mem_types[] = {
[MT_DEVICE_WC] = { /* ioremap_wc */
.prot_pte = PROT_PTE_DEVICE | L_PTE_MT_DEV_WC,
.prot_l1 = PMD_TYPE_TABLE,
-.prot_sect = PROT_SECT_DEVICE | PMD_SECT_BUFFERABLE,
+.prot_sect = PROT_SECT_DEVICE,
.domain = DOMAIN_IO,
},
[MT_CACHECLEAN] = {
@@ -273,22 +273,23 @@ static void __init build_mem_type_table(void)
#endif

/*
-* On non-Xscale3 ARMv5-and-older systems, use CB=01
-* (Uncached/Buffered) for ioremap_wc() mappings. On XScale3
-* and ARMv6+, use TEXCB=00100 mappings (Inner/Outer Uncacheable
-* in xsc3 parlance, Uncached Normal in ARMv6 parlance).
+* Strip out features not present on earlier architectures.
+* Pre-ARMv5 CPUs don't have TEX bits. Pre-ARMv6 CPUs or those
+* without extended page tables don't have the 'Shared' bit.
*/
-if (cpu_is_xsc3() || cpu_arch >= CPU_ARCH_ARMv6) {
-mem_types[MT_DEVICE_WC].prot_sect |= PMD_SECT_TEX(1);
-mem_types[MT_DEVICE_WC].prot_sect &= ~PMD_SECT_BUFFERABLE;
-}
+if (cpu_arch < CPU_ARCH_ARMv5)
+for (i = 0; i < ARRAY_SIZE(mem_types); i++)
+mem_types[i].prot_sect &= ~PMD_SECT_TEX(7);
+if ((cpu_arch < CPU_ARCH_ARMv6 || !(cr & CR_XP)) && !cpu_is_xsc3())
+for (i = 0; i < ARRAY_SIZE(mem_types); i++)
+mem_types[i].prot_sect &= ~PMD_SECT_S;

/*
-* ARMv5 and lower, bit 4 must be set for page tables.
-* (was: cache "update-able on write" bit on ARM610)
-* However, Xscale cores require this bit to be cleared.
+* ARMv5 and lower, bit 4 must be set for page tables (was: cache
+* "update-able on write" bit on ARM610). However, Xscale and
+* Xscale3 require this bit to be cleared.
*/
-if (cpu_is_xscale()) {
+if (cpu_is_xscale() || cpu_is_xsc3()) {
for (i = 0; i < ARRAY_SIZE(mem_types); i++) {
mem_types[i].prot_sect &= ~PMD_BIT4;
mem_types[i].prot_l1 &= ~PMD_BIT4;
@@ -302,6 +303,64 @@ static void __init build_mem_type_table(void)
}
}

/*
* Mark the device areas according to the CPU/architecture.
*/
if (cpu_is_xsc3() || (cpu_arch >= CPU_ARCH_ARMv6 && (cr & CR_XP))) {
if (!cpu_is_xsc3()) {
/*
* Mark device regions on ARMv6+ as execute-never
* to prevent speculative instruction fetches.
*/
mem_types[MT_DEVICE].prot_sect |= PMD_SECT_XN;
mem_types[MT_DEVICE_NONSHARED].prot_sect |= PMD_SECT_XN;
mem_types[MT_DEVICE_CACHED].prot_sect |= PMD_SECT_XN;
mem_types[MT_DEVICE_WC].prot_sect |= PMD_SECT_XN;
}
if (cpu_arch >= CPU_ARCH_ARMv7 && (cr & CR_TRE)) {
/*
* For ARMv7 with TEX remapping,
* - shared device is SXCB=1100
* - nonshared device is SXCB=0100
* - write combine device mem is SXCB=0001
* (Uncached Normal memory)
*/
mem_types[MT_DEVICE].prot_sect |= PMD_SECT_TEX(1);
mem_types[MT_DEVICE_NONSHARED].prot_sect |= PMD_SECT_TEX(1);
mem_types[MT_DEVICE_WC].prot_sect |= PMD_SECT_BUFFERABLE;
} else if (cpu_is_xsc3()) {
/*
* For Xscale3,
* - shared device is TEXCB=00101
* - nonshared device is TEXCB=01000
* - write combine device mem is TEXCB=00100
* (Inner/Outer Uncacheable in xsc3 parlance)
*/
mem_types[MT_DEVICE].prot_sect |= PMD_SECT_TEX(1) | PMD_SECT_BUFFERED;
mem_types[MT_DEVICE_NONSHARED].prot_sect |= PMD_SECT_TEX(2);
mem_types[MT_DEVICE_WC].prot_sect |= PMD_SECT_TEX(1);
} else {
/*
* For ARMv6 and ARMv7 without TEX remapping,
* - shared device is TEXCB=00001
* - nonshared device is TEXCB=01000
* - write combine device mem is TEXCB=00100
* (Uncached Normal in ARMv6 parlance).
*/
mem_types[MT_DEVICE].prot_sect |= PMD_SECT_BUFFERED;
mem_types[MT_DEVICE_NONSHARED].prot_sect |= PMD_SECT_TEX(2);
mem_types[MT_DEVICE_WC].prot_sect |= PMD_SECT_TEX(1);
}
} else {
/*
* On others, write combining is "Uncached/Buffered"
*/
mem_types[MT_DEVICE_WC].prot_sect |= PMD_SECT_BUFFERABLE;
}

/*
* Now deal with the memory-type mappings
*/
cp = &cache_policies[cachepolicy];
vecs_pgprot = kern_pgprot = user_pgprot = cp->pte;

@@ -317,12 +376,8 @@ static void __init build_mem_type_table(void)
* Enable CPU-specific coherency if supported.
* (Only available on XSC3 at the moment.)
*/
-if (arch_is_coherent()) {
-if (cpu_is_xsc3()) {
-mem_types[MT_MEMORY].prot_sect |= PMD_SECT_S;
-mem_types[MT_MEMORY].prot_pte |= L_PTE_SHARED;
-}
-}
+if (arch_is_coherent() && cpu_is_xsc3())
+mem_types[MT_MEMORY].prot_sect |= PMD_SECT_S;

/*
* ARMv6 and above have extended page tables.
@@ -336,11 +391,6 @@ static void __init build_mem_type_table(void)
mem_types[MT_MINICLEAN].prot_sect |= PMD_SECT_APX|PMD_SECT_AP_WRITE;
mem_types[MT_CACHECLEAN].prot_sect |= PMD_SECT_APX|PMD_SECT_AP_WRITE;

-/*
-* Mark the device area as "shared device"
-*/
-mem_types[MT_DEVICE].prot_sect |= PMD_SECT_BUFFERED;

#ifdef CONFIG_SMP
/*
* Mark memory with the "shared" attribute for SMP systems
@@ -360,9 +410,6 @@ static void __init build_mem_type_table(void)
mem_types[MT_LOW_VECTORS].prot_pte |= vecs_pgprot;
mem_types[MT_HIGH_VECTORS].prot_pte |= vecs_pgprot;

-if (cpu_arch < CPU_ARCH_ARMv5)
-mem_types[MT_MINICLEAN].prot_sect &= ~PMD_SECT_TEX(1);

pgprot_user = __pgprot(L_PTE_PRESENT | L_PTE_YOUNG | user_pgprot);
pgprot_kernel = __pgprot(L_PTE_PRESENT | L_PTE_YOUNG |
L_PTE_DIRTY | L_PTE_WRITE |
@@ -654,7 +701,7 @@ static inline void prepare_page_table(struct meminfo *mi)
/*
* Clear out all the mappings below the kernel image.
*/
-for (addr = 0; addr < MODULE_START; addr += PGDIR_SIZE)
+for (addr = 0; addr < MODULES_VADDR; addr += PGDIR_SIZE)
pmd_clear(pmd_off_k(addr));

#ifdef CONFIG_XIP_KERNEL
@@ -766,7 +813,7 @@ static void __init devicemaps_init(struct machine_desc *mdesc)
*/
#ifdef CONFIG_XIP_KERNEL
map.pfn = __phys_to_pfn(CONFIG_XIP_PHYS_ADDR & SECTION_MASK);
-map.virtual = MODULE_START;
+map.virtual = MODULES_VADDR;
map.length = ((unsigned long)&_etext - map.virtual + ~SECTION_MASK) & SECTION_MASK;
map.type = MT_ROM;
create_mapping(&map);