Similar to the x86/sparc/powerpc implementations except:

1) We implement an extremely efficient has_zero()/find_zero() sequence with both prep_zero_mask() and create_zero_mask() as no-operations.

2) Our output from prep_zero_mask() differs in that only the lowest eight bits are used to represent the zero bytes; nevertheless, it can safely be ORed with other similar masks from prep_zero_mask() and forms valid input to create_zero_mask(), the two fundamental properties prep_zero_mask() must satisfy.

Tests on EV67 and EV68 CPUs revealed that, for large quadword-aligned strings, the generic code is essentially as fast as the old Alpha-specific code (to within 0.5% of CPU cycles), despite executing about 30% more CPU instructions. In contrast, for unaligned strings the generic code is substantially slower (by more than a factor of 3) than the old Alpha-specific code.

Signed-off-by: Michael Cree <mcree@orcon.net.nz>
Acked-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
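The points above can be illustrated with a minimal sketch in portable C. Alpha's `cmpbge` instruction compares all eight bytes of a quadword at once and yields an 8-bit result mask, which is why only the lowest eight bits carry zero-byte information and why the prep/create steps can collapse to no-ops; the `cmpbge0()` helper below is a hypothetical software emulation of that instruction, not the kernel's actual assembly:

```c
#include <stdint.h>

/* Emulate Alpha's cmpbge(0, x): return an 8-bit mask with bit i set
 * when byte i of x is zero (byte 0 = least significant byte).
 * Hypothetical helper for illustration only. */
static unsigned long cmpbge0(unsigned long x)
{
	unsigned long mask = 0;
	for (int i = 0; i < 8; i++)
		if (((x >> (i * 8)) & 0xff) == 0)
			mask |= 1ul << i;
	return mask;
}

/* has_zero(): nonzero iff some byte of x is zero.  The 8-bit result
 * already has both required properties (ORable with other masks,
 * directly usable by create_zero_mask()), so prep_zero_mask() and
 * create_zero_mask() need do nothing. */
static unsigned long has_zero(unsigned long x)
{
	return cmpbge0(x);
}

/* find_zero(): index of the first zero byte, i.e. the position of the
 * lowest set bit of the 8-bit mask (a count-trailing-zeros loop). */
static unsigned long find_zero(unsigned long mask)
{
	unsigned long i = 0;
	while (!(mask & 1)) {
		mask >>= 1;
		i++;
	}
	return i;
}
```

For example, `has_zero(0x1122330044556677)` reports the zero in byte 4, and `find_zero()` recovers that index directly from the mask with no intermediate transformation.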
Files in this directory:

- callback_srm.S
- checksum.c
- clear_page.S
- clear_user.S
- copy_page.S
- copy_user.S
- csum_ipv6_magic.S
- csum_partial_copy.c
- dbg_current.S
- dbg_stackcheck.S
- dbg_stackkill.S
- dec_and_lock.c
- divide.S
- ev6-clear_page.S
- ev6-clear_user.S
- ev6-copy_page.S
- ev6-copy_user.S
- ev6-csum_ipv6_magic.S
- ev6-divide.S
- ev6-memchr.S
- ev6-memcpy.S
- ev6-memset.S
- ev6-stxcpy.S
- ev6-stxncpy.S
- ev67-strcat.S
- ev67-strchr.S
- ev67-strlen.S
- ev67-strncat.S
- ev67-strrchr.S
- fls.c
- fpreg.c
- Makefile
- memchr.S
- memcpy.c
- memmove.S
- memset.S
- srm_printk.c
- srm_puts.c
- stacktrace.c
- strcat.S
- strchr.S
- strcpy.S
- strlen.S
- strncat.S
- strncpy.S
- strrchr.S
- stxcpy.S
- stxncpy.S
- udelay.c