Force 32-bit displacement in memset-vec-unaligned-erms.S
	* sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S: Force
	32-bit displacement to avoid long nop between instructions.
H.J. Lu committed Apr 5, 2016
1 parent 696ac77 commit ec0cac9
Showing 2 changed files with 18 additions and 0 deletions.
5 changes: 5 additions & 0 deletions ChangeLog
@@ -1,3 +1,8 @@
2016-04-05 H.J. Lu <hongjiu.lu@intel.com>

* sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S: Force
32-bit displacement to avoid long nop between instructions.

2016-04-05 H.J. Lu <hongjiu.lu@intel.com>

* sysdeps/x86_64/multiarch/memset-sse2-unaligned-erms.S: Add
13 changes: 13 additions & 0 deletions sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
@@ -159,10 +159,23 @@ L(return):
	.p2align 4
L(loop_start):
	leaq	(VEC_SIZE * 4)(%rdi), %rcx
# if VEC_SIZE == 32 || VEC_SIZE == 64
	/* Force 32-bit displacement to avoid long nop between
	   instructions.  */
	VMOVU.d32 %VEC(0), (%rdi)
# else
	VMOVU	%VEC(0), (%rdi)
# endif
	andq	$-(VEC_SIZE * 4), %rcx
# if VEC_SIZE == 32
	/* Force 32-bit displacement to avoid long nop between
	   instructions.  */
	VMOVU.d32 %VEC(0), -VEC_SIZE(%rdi,%rdx)
	VMOVU.d32 %VEC(0), VEC_SIZE(%rdi)
# else
	VMOVU	%VEC(0), -VEC_SIZE(%rdi,%rdx)
	VMOVU	%VEC(0), VEC_SIZE(%rdi)
# endif
	VMOVU	%VEC(0), -(VEC_SIZE * 2)(%rdi,%rdx)
	VMOVU	%VEC(0), (VEC_SIZE * 2)(%rdi)
	VMOVU	%VEC(0), -(VEC_SIZE * 3)(%rdi,%rdx)
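Note (not part of the commit): a minimal standalone sketch of the encoding effect, assuming VEC_SIZE == 32 so that VMOVU expands to the AVX2 vmovdqu, and an assembler (GAS/binutils) recent enough to accept the .d32 mnemonic suffix. With a register-indirect operand and no displacement, the assembler picks the shortest encoding; forcing a 32-bit displacement makes the store itself longer, so less (or no) multi-byte nop padding is needed to reach the next aligned label. The "demo" label and the exact byte listings below are illustrative only:

	.text
	.p2align 4
demo:	/* hypothetical label, for illustration only */
	/* Shortest encoding: c5 fe 7f 07 (4 bytes).  */
	vmovdqu	%ymm0, (%rdi)
	/* Forced 32-bit displacement: c5 fe 7f 87 00 00 00 00 (8 bytes),
	   same store, same semantics.  */
	vmovdqu.d32 %ymm0, (%rdi)
	ret

Assembling both forms and comparing with objdump -d shows the 4-byte size difference; in the loop above, those extra instruction bytes stand in for alignment padding that would otherwise be emitted as a long nop between instructions.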
