  1. Aug 14, 2020
  2. Jul 29, 2020
  3. Jul 01, 2020
    • string: Optimize strlen · 224cb5f6
      Wilco Dijkstra authored
      Optimize strlen using a mix of scalar and SIMD code. On modern
      microarchitectures, large strings are 55% faster than the current version,
      and 35% faster than strlen-mte.  On the random strlen benchmark the
      speedup is 3.4% and 40% respectively.
  4. Jun 23, 2020
  5. Jun 12, 2020
  6. Jun 01, 2020
    • string: Fix issue in strcmp-mte NUL check · f08b12e8
      Wilco Dijkstra authored
      Improve the previous fix: if a string is immediately preceded by a NUL byte
      and its first byte is 0x1, the NUL check may mistake that first byte for a NUL.
      Instead of removing bytes outside the string via a shift, force them to be
      non-NUL.
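The failure mode described in this commit can be reproduced with the classic word-at-a-time zero-byte test. The following is a minimal scalar sketch in C, not the actual AArch64 assembly: the function names, the 64-bit little-endian word model, and the reading of "shift" as shifting the computed syndrome are all assumptions made for illustration.

```c
#include <stdint.h>

#define REP8_01 0x0101010101010101ULL
#define REP8_80 0x8080808080808080ULL

/* Classic word-at-a-time zero-byte test. The lowest flagged byte is always a
   real zero, but a borrow from a zero byte can also flag a 0x01 byte that
   sits directly above it. */
static uint64_t nul_syndrome(uint64_t x)
{
    return (x - REP8_01) & ~x & REP8_80;
}

/* Variant A (assumed shape of the earlier approach): test the raw word, then
   shift away the flags of the `skip` bytes (0..7) that precede the string. */
static uint64_t syndrome_after_shift(uint64_t word, unsigned skip)
{
    return nul_syndrome(word) >> (8 * skip);
}

/* Variant B: force the `skip` leading bytes to 0xff before testing, so they
   can neither match nor generate a borrow into the first string byte. */
static uint64_t syndrome_after_force(uint64_t word, unsigned skip)
{
    uint64_t mask = skip ? ~0ULL >> (64 - 8 * skip) : 0;
    return nul_syndrome(word | mask);
}
```

With the word `0x6161616161610100` and `skip` set to 1 (a NUL byte before the string, then 0x01, then six `'a'` bytes), variant A reports a spurious NUL in the first string byte, while variant B reports none.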
  7. May 29, 2020
    • v20.05 release · ef907c7a
      Szabolcs Nagy authored
      * New functionality (64-bit Arm)
        * string: Optimized MTE variants of strlen, strnlen, strchr,
          strchrnul, strrchr, memchr, memrchr, strcpy, stpcpy, strcmp,
          strncmp
        * string: Changes to support BTI
        * string: New optimized memrchr, strnlen
      * Performance improvements (Neoverse N1)
        * strchr/strchrnul: 21% improvement on long strings
        * strrchr: 11% improvement
        * strnlen: 130% improvement on long strings, 50% on short strings
      * Benchmark and tests
        * string: New memcpy benchmark
        * string: Cleanup testsuite and improve test coverage
    • string: Fix issue in strcmp-mte · 304137d8
      Wilco Dijkstra authored
      Ensure NUL bytes before unaligned strings are correctly ignored.
  8. May 28, 2020
    • string: Improve strcmp-mte performance · 27bb6b2b
      Wilco Dijkstra authored
      Improve strcmp performance. On various microarchitectures the speedup is 65%
      on large unaligned strings and 21% on large (mutually) aligned strings.
      On small unaligned strings the speedup is 12%.
    • string: Improve memcpy benchmark · 2525af9b
      Wilco Dijkstra authored
      Print results in bytes/ns. Add medium and large copy benchmark.
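For reference, a bytes/ns figure is simply the number of bytes copied divided by the elapsed nanoseconds, which is numerically the same as GB/s (10^9 bytes per second). A minimal sketch follows; the helper name and the zero-time guard are assumptions, not code from the benchmark.

```c
#include <stddef.h>
#include <stdint.h>

/* Throughput in bytes per nanosecond (hypothetical helper).
   Returns 0 when no time elapsed, to avoid dividing by zero. */
static double bytes_per_ns(size_t bytes_copied, uint64_t elapsed_ns)
{
    if (elapsed_ns == 0)
        return 0.0;
    return (double)bytes_copied / (double)elapsed_ns;
}
```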
    • string: Add MTE support to string tests. · 4d55c2d3
      Branislav Rankov authored
      Set tags for every test case so that boundaries are as narrow as
      possible. There is no handling of tag faults, so the test will
      crash if there is an MTE problem.
      
      The implementations that are not compatible are excluded, including
      the standard symbols that may come from an MTE-incompatible libc.
  9. May 22, 2020
  10. May 20, 2020
  11. May 18, 2020
    • string: Improve strrchr-mte performance · a99a1a96
      Wilco Dijkstra authored
      Improve strrchr performance by using a fast strchr loop to find the first
      match. On various microarchitectures the speedup is 30-80% on large strings
      and 32% on small strings.
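The idea of locating the first match with a fast strchr-style search before doing any last-match bookkeeping can be sketched in portable C. This is an illustration only: `strrchr_sketch` is a hypothetical name, it leans on the standard `strchr` rather than a hand-written loop, and the real routine is written in assembly.

```c
#include <stddef.h>
#include <string.h>

static char *strrchr_sketch(const char *s, int c)
{
    /* Fast path: strchr locates the first match, or the terminator when c
       is NUL (which is then also the last match). */
    const char *p = strchr(s, c);
    if (p == NULL || (char)c == '\0')
        return (char *)p;

    /* Slower path: keep searching after each match and remember the last. */
    const char *last;
    do {
        last = p;
        p = strchr(p + 1, c);
    } while (p != NULL);
    return (char *)last;
}
```

Strings that do not contain the character at all never leave the fast path, which is where a cheap first-match loop pays off.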
  12. May 13, 2020
  13. May 12, 2020
  14. May 01, 2020
    • string: add a setting to disable GNU Property Notes · e1127946
      Szabolcs Nagy authored
      GNU Property Notes are only supported in recent tooling and older
      tools may warn about them, so it makes sense to remove these notes
      on a system where BTI is not supported anyway.
      
      The actual BTI instructions should be kept in place to avoid
      disturbing code layout.
      
      -DWANT_GNU_PROPERTY=0 removes the .note.gnu.property section
      from assembly files (ideally it would be based on the compiler
      default setting, but there is no feature test macro for BTI and
      PAC-RET).
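A setting of this kind is typically a preprocessor macro that defaults to on and can be overridden from the command line. The sketch below shows that pattern in C; only the `WANT_GNU_PROPERTY` name comes from the commit message, while the helper function is hypothetical and stands in for the assembly that would emit the note.

```c
/* Default to emitting the notes; building with -DWANT_GNU_PROPERTY=0
   overrides this definition. */
#ifndef WANT_GNU_PROPERTY
#define WANT_GNU_PROPERTY 1
#endif

/* Hypothetical helper, for illustration only: reports the chosen setting. */
static int gnu_property_notes_enabled(void)
{
#if WANT_GNU_PROPERTY
    return 1;
#else
    return 0;
#endif
}
```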
    • string: Further improve strchrnul-mte performance · fa69d42a
      Wilco Dijkstra authored
      Remove 2 more instructions, resulting in a 9.8% speedup of medium
      sized strings (16-32).
      
      The BTI patch changed ENTRY so the loops got misaligned; this fixes
      that regression.
    • string: Further improve strchr-mte performance · 7bb8464f
      Wilco Dijkstra authored
      Remove 2 more instructions, resulting in a 6.8% speedup of medium
      sized strings (16-32).
      
      The BTI patch changed ENTRY so the loops got misaligned; this fixes
      that regression.
  15. Apr 30, 2020
    • string: ARMv8.5 MTE: Add MTE compatible version of strncmp. · 1de12a67
      Branislav Rankov authored
      Reading outside the range of the string is only allowed within 16 byte
      aligned granules when MTE is enabled.
      
      This implementation is based on string/aarch64/strncmp.S
      
      Change the case when strings are misaligned: align the pointers
      down and ignore bytes before the start of the string. Carry the part
      that is not compared to the next comparison.
      
      Testing done:
      string/test/strncmp.c on big endian, little endian and with MTE support.
      Booted nanodroid with MTE enabled.
      
      Benchmarked on Pixel4.
    • string: ARMv8.5 MTE: Add MTE compatible version of strcmp. · 9ebd9615
      Branislav Rankov authored
      Reading outside the range of the string is only allowed within 16 byte
      aligned granules when MTE is enabled.
      
      This implementation is based on string/aarch64/strcmp.S
      
      Change the case when strings are misaligned: align the pointers
      down and ignore bytes before the start of the string. Carry the part
      that is not compared to the next comparison.
      
      Testing done:
      optimized-routines/string/test/strcmp.c on big and little endian.
      Booted nanodroid with MTE enabled.
      bionic string tests with MTE enabled.
      
      Benchmark results:
      Ran both bionic and glibc benchmarks on Pixel4 (A76 and A55 cores).
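The "carry" step for misaligned strings can be pictured as rebuilding each unaligned word from two consecutive aligned loads. The following is a minimal sketch under stated assumptions: 64-bit little-endian words, a hypothetical `combine_le` helper, and plain integers in place of real loads. It is not the routine's actual code.

```c
#include <stdint.h>

/* Rebuild the 8 bytes that start `offset` bytes (0..7) into `prev` from two
   consecutive aligned little-endian words. The upper part of `next` is not
   consumed here; the caller carries it into the following step as `prev`. */
static uint64_t combine_le(uint64_t prev, uint64_t next, unsigned offset)
{
    if (offset == 0)
        return prev;                      /* already aligned */
    return (prev >> (8 * offset)) | (next << (64 - 8 * offset));
}
```

Because every load is aligned, no access crosses into a granule that the string does not touch, which is the property the MTE-compatible variant needs.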
    • string: ARMv8.5 BTI: Add BTI support to assembly files. · 232e3c08
      Tamas Zsoldos authored
      This change adds landing pads to the start of functions
      implemented in assembly by adding them to the ENTRY macro. To avoid
      skipping a landing pad when using an alias, every ENTRY_ALIAS use must
      precede the corresponding ENTRY.
      
      Furthermore, the GNU property note is added to the assembly
      files. Since none of the functions save LR to the stack, both BTI and PAC
      support are indicated.
      
      Paddings before __strncmp_aarch64 and __strnlen_aarch64 were adjusted.
    • ARMv8.5 MTE: Add MTE compatible version of strrchr. · bfaeb591
      Gabor Kertesz authored
      Reading outside the range of the string is only allowed within
      16 byte aligned granules when MTE is enabled.
      
      This implementation is based on string/aarch64/strrchr.S.
      
      Testing done:
      optimized-routines/string/test/strrchr.c
      Booted nanodroid with MTE enabled.
      Bionic string tests with MTE enabled.
      Big endian with Qemu: qemu-aarch64_be
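The granule rule quoted in these MTE commits reduces to simple address arithmetic: an over-read is safe only if it stays in the same 16-byte aligned granule as a byte known to belong to the string. A minimal sketch, with hypothetical helper names and addresses modelled as plain integers:

```c
#include <stdint.h>

#define MTE_GRANULE 16u

/* Start address of the 16-byte aligned granule containing `addr`. */
static uintptr_t granule_base(uintptr_t addr)
{
    return addr & ~(uintptr_t)(MTE_GRANULE - 1);
}

/* True if `addr` lies in the same granule as `valid`, a byte known to
   belong to the string, so reading it cannot cross a tag boundary. */
static int same_granule(uintptr_t valid, uintptr_t addr)
{
    return granule_base(valid) == granule_base(addr);
}
```

For example, with a valid byte at `0x1007`, any address from `0x1000` to `0x100f` may be read, but `0x1010` may not.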
    • string: Improve strchrnul-mte performance · 878eb93f
      Wilco Dijkstra authored
      Improve strchrnul performance by using more efficient termination tests.
      On various microarchitectures the speedup is 20% on large strings and 26%
      on small strings.
    • string: Improve strchr-mte performance · 7aeabda5
      Wilco Dijkstra authored
      Improve strchr performance by using a more efficient termination test.
      On various microarchitectures the speedup is 19% on large strings and
      19% on small strings.
    • string: Improve strrchr performance · 2f12ab4a
      Wilco Dijkstra authored
      Improve strrchr performance by using a more efficient termination test.
      On various microarchitectures the speedup is 11% on large strings.