  1. May 29, 2020
    • v20.05 release · ef907c7a
      Szabolcs Nagy authored
      * New functionality (64-bit Arm)
        * string: Optimized MTE variants of strlen, strnlen, strchr,
          strchrnul, strrchr, memchr, memrchr, strcpy, stpcpy, strcmp,
          strncmp
        * string: Changes to support BTI
        * string: New optimized memrchr, strnlen
      * Performance improvements (Neoverse N1)
        * strchr/strchrnul: 21% improvement on long strings
        * strrchr: 11% improvement
        * strnlen: 130% improvement on long strings, 50% on short strings
      * Benchmark and tests
        * string: New memcpy benchmark
        * string: Cleanup testsuite and improve test coverage
    • string: Fix issue in strcmp-mte · 304137d8
      Wilco Dijkstra authored
      Ensure nul bytes before unaligned strings are correctly ignored.
  2. May 28, 2020
    • string: Improve strcmp-mte performance · 27bb6b2b
      Wilco Dijkstra authored
      Improve strcmp performance. On various microarchitectures the speedup is 65%
      on large unaligned strings and 21% on large (mutually) aligned strings.
      On small unaligned strings the speedup is 12%.
    • string: Improve memcpy benchmark · 2525af9b
      Wilco Dijkstra authored
      Print results in bytes/ns. Add medium and large copy benchmark.
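
The bytes/ns figure mentioned in the memcpy benchmark commit is a simple ratio of bytes copied to elapsed nanoseconds. A minimal sketch of that arithmetic follows; the function name and parameters are hypothetical and are not taken from the benchmark source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Throughput in bytes per nanosecond.
   1 byte/ns equals 10^9 bytes/s, so the value also reads as GB/s.
   Hypothetical helper: not the benchmark's actual code. */
static double bytes_per_ns(size_t bytes_per_call, size_t calls,
                           uint64_t elapsed_ns)
{
    assert(elapsed_ns > 0); /* avoid division by zero */
    return (double)bytes_per_call * (double)calls / (double)elapsed_ns;
}
```

For example, 1000 copies of 4096 bytes that take 2,048,000 ns in total give 2.0 bytes/ns. How the elapsed time is measured is left out of the sketch.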
    • string: Add MTE support to string tests. · 4d55c2d3
      Branislav Rankov authored
      Set tags for every test case so that boundaries are as narrow as
      possible. There is no handling of tag faults, so the test will
      crash if there is an MTE problem.
      
      The implementations that are not compatible are excluded, including
      the standard symbols that may come from an MTE-incompatible libc.
  3. May 22, 2020
  4. May 20, 2020
  5. May 18, 2020
    • string: Improve strrchr-mte performance · a99a1a96
      Wilco Dijkstra authored
      Improve strrchr performance by using a fast strchr loop to find the first
      match. On various microarchitectures the speedup is 30-80% on large strings
      and 32% on small strings.
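
The idea in the strrchr-mte commit, a fast strchr-style search for the first match followed by a scan that tracks later matches, can be illustrated in portable C. This is only a sketch of the two-phase structure: the real routine is AArch64 assembly, and the function name `two_phase_strrchr` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Two-phase strrchr sketch (hypothetical name, not the library routine). */
static char *two_phase_strrchr(const char *s, int c)
{
    assert(s != NULL);

    /* Phase 1: fast search that only has to find the first match. */
    const char *last = strchr(s, c);

    /* No match at all, or searching for the terminator: done. */
    if (last == NULL || (char)c == '\0')
        return (char *)last;

    /* Phase 2: scan the remainder, remembering the most recent match. */
    for (const char *p = last + 1; *p != '\0'; p++)
        if (*p == (char)c)
            last = p;

    return (char *)last;
}
```

When the character does not occur, the whole string is handled by the cheaper first phase, which is one plausible reason such a split pays off.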
  6. May 13, 2020
  7. May 12, 2020
  8. May 01, 2020
    • string: add a setting to disable GNU Property Notes · e1127946
      Szabolcs Nagy authored
      GNU Property Notes are only supported in recent tooling and older
      tools may warn about them, so it makes sense to remove these notes
      on a system where BTI is not supported anyway.
      
      The actual BTI instructions should be kept in place to avoid
      disturbing code layout.
      
      -DWANT_GNU_PROPERTY=0 removes the .note.gnu.property section
      from assembly files (ideally it would be based on the compiler
      default setting, but there is no feature test macro for BTI and
      PAC-RET).
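
A default-on switch such as -DWANT_GNU_PROPERTY=0 is commonly wired up with the preprocessor pattern below. This is a sketch under assumptions: only the WANT_GNU_PROPERTY name comes from the commit message, while `EMIT_PROPERTY_NOTE` and `property_note_enabled` are hypothetical stand-ins for the assembly that emits the .note.gnu.property section.

```c
#include <assert.h>

/* Enabled unless the build passes -DWANT_GNU_PROPERTY=0. */
#ifndef WANT_GNU_PROPERTY
# define WANT_GNU_PROPERTY 1
#endif

#if WANT_GNU_PROPERTY
# define EMIT_PROPERTY_NOTE 1 /* hypothetical: the note would be emitted */
#else
# define EMIT_PROPERTY_NOTE 0 /* hypothetical: the note is left out */
#endif

/* Reports the compile-time choice so that it can be checked in a test. */
static int property_note_enabled(void)
{
    return EMIT_PROPERTY_NOTE;
}
```

With no flag on the command line the function returns 1; compiling the same file with -DWANT_GNU_PROPERTY=0 makes it return 0.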
    • string: Further improve strchrnul-mte performance · fa69d42a
      Wilco Dijkstra authored
      Remove 2 more instructions, resulting in a 9.8% speedup on medium-sized
      strings (16-32 bytes).
      
      The BTI patch changed ENTRY so the loops got misaligned; this fixes
      that regression.
    • string: Further improve strchr-mte performance · 7bb8464f
      Wilco Dijkstra authored
      Remove 2 more instructions, resulting in a 6.8% speedup on medium-sized
      strings (16-32 bytes).
      
      The BTI patch changed ENTRY so the loops got misaligned; this fixes
      that regression.
  9. Apr 30, 2020
    • string: ARMv8.5 MTE: Add MTE compatible version of strncmp. · 1de12a67
      Branislav Rankov authored
      Reading outside the range of the string is only allowed within 16 byte
      aligned granules when MTE is enabled.
      
      This implementation is based on string/aarch64/strncmp.S
      
      Change the case when strings are misaligned: align the pointers
      down, and ignore bytes before the start of the string. Carry the part
      that is not compared to the next comparison.
      
      Testing done:
      string/test/strncmp.c on big endian, little endian and with MTE support.
      Booted nanodroid with MTE enabled.
      
      Benchmarked on Pixel4.
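
The granule rule described in the MTE commits (reads may go outside the string only within its 16-byte aligned granules) can be illustrated with a simplified single-string scan. This is a sketch, not the strncmp algorithm: it aligns the pointer down, reads whole granules, and ignores the bytes before the start of the string, but it does not show how uncompared parts are carried between two mutually misaligned strings. The names `granule_strlen` and `granule_strlen_demo` are hypothetical, and the sketch assumes the buffer covers every granule that the string touches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define GRANULE 16u /* MTE tag granule size in bytes */

/* String length computed from whole, 16-byte aligned granule reads.
   Assumption: the buffer holding s spans each granule the string touches. */
static size_t granule_strlen(const char *s)
{
    assert(s != NULL);
    size_t skip = (size_t)((uintptr_t)s % GRANULE); /* bytes before start */
    const unsigned char *start = (const unsigned char *)s;
    const unsigned char *g = start - skip;          /* align pointer down */

    for (;;) {
        unsigned char chunk[GRANULE];
        memcpy(chunk, g, GRANULE);                  /* one aligned granule */
        for (size_t i = skip; i < GRANULE; i++)     /* ignore early bytes */
            if (chunk[i] == 0)
                return (size_t)((g + i) - start);
        skip = 0;                                   /* later granules: all */
        g += GRANULE;
    }
}

/* Usage: an unaligned string with NUL bytes in front of it. */
static size_t granule_strlen_demo(void)
{
    char *buf = aligned_alloc(GRANULE, 64);
    assert(buf != NULL);
    memset(buf, 0, 64);                             /* NULs before string */
    memcpy(buf + 5, "hello, granule world", 21);
    size_t n = granule_strlen(buf + 5);
    free(buf);
    return n;
}
```

The NUL bytes at offsets 0 to 4 share a granule with the start of the string, so a scan that failed to skip them would stop too early.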
    • string: ARMv8.5 MTE: Add MTE compatible version of strcmp. · 9ebd9615
      Branislav Rankov authored
      Reading outside the range of the string is only allowed within 16 byte
      aligned granules when MTE is enabled.
      
      This implementation is based on string/aarch64/strcmp.S
      
      Change the case when strings are misaligned: align the pointers
      down, and ignore bytes before the start of the string. Carry the part
      that is not compared to the next comparison.
      
      Testing done:
      optimized-routines/string/test/strcmp.c on big and little endian.
      Booted nanodroid with MTE enabled.
      bionic string tests with MTE enabled.
      
      Benchmark results:
      Ran both the bionic and glibc benchmarks on Pixel4 (A76 and A55 cores).
    • string: ARMv8.5 BTI: Add BTI support to assembly files. · 232e3c08
      Tamas Zsoldos authored
      This change adds landing pads to the start of functions
      implemented in assembly by adding them to the ENTRY macro. To avoid
      skipping the landing pad when using an alias, every ENTRY_ALIAS use
      must precede the corresponding ENTRY.
      
      Furthermore, the GNU property note is added to the assembly
      files. Since none of the functions save LR to stack, both BTI and PAC
      support are indicated.
      
      Paddings before __strncmp_aarch64 and __strnlen_aarch64 were adjusted.
    • ARMv8.5 MTE: Add MTE compatible version of strrchr. · bfaeb591
      Gabor Kertesz authored
      Reading outside the range of the string is only allowed within
      16 byte aligned granules when MTE is enabled.
      
      This implementation is based on string/aarch64/strrchr.S.
      
      Testing done:
      optimized-routines/string/test/strrchr.c
      Booted nanodroid with MTE enabled.
      Bionic string tests with MTE enabled.
      Big endian with Qemu: qemu-aarch64_be
    • string: Improve strchrnul-mte performance · 878eb93f
      Wilco Dijkstra authored
      Improve strchrnul performance by using more efficient termination tests.
      On various microarchitectures the speedup is 20% on large strings and 26%
      on small strings.
    • string: Improve strchr-mte performance · 7aeabda5
      Wilco Dijkstra authored
      Improve strchr performance by using a more efficient termination test.
      On various microarchitectures the speedup is 19% on both large and
      small strings.
    • string: Improve strrchr performance · 2f12ab4a
      Wilco Dijkstra authored
      Improve strrchr performance by using a more efficient termination test.
      On various microarchitectures the speedup is 11% on large strings.
    • string: Improve strchrnul performance · 6306e48c
      Wilco Dijkstra authored
      Improve strchrnul performance by using a more efficient termination test.
      On various microarchitectures the speedup is 21% on large strings.
    • string: Improve strchr performance · c1509861
      Wilco Dijkstra authored
      Improve strchr performance by using a more efficient termination test.
      On various microarchitectures the speedup is 21% on large strings.
    • string: fix check-string · e3ac5315
      Szabolcs Nagy authored
      Don't stop at first failing test and allow running tests in parallel.
  10. Apr 29, 2020
  11. Apr 24, 2020
    • string: test cleanups · 22fd9317
      Szabolcs Nagy authored
      Tests printed too much output on a broken string function
      and the output was not entirely useful.
      
      Added a new header file with some common logic for
      printing buffers nicely.
      
      In str* tests len now means string length (not buffer
      size, which was confusing).
  12. Apr 08, 2020
    • ARMv8.5 MTE: Add MTE compatible version of strchrnul. · 563b710d
      Gabor Kertesz authored
      Reading outside the range of the string is only allowed within
      16 byte aligned granules when MTE is enabled.
      
      This implementation is based on string/aarch64/strchr-mte.S and
      string/aarch64/strchrnul.S
      
      Testing done:
      optimized-routines/string/test/strchrnul.c
      Booted nanodroid with MTE enabled.
      bionic string tests with MTE enabled.
      Big endian with Qemu: qemu-aarch64_be
    • ARMv8.5 MTE: Make strchr-mte big and little endian compatible. · 6d09ce9f
      Gabor Kertesz authored
      The previously used LDR instruction resulted in little endian behavior.
      Using LD1 instead results in a byte-by-byte load.
      
      Testing done:
      optimized-routines/string/test/memchr.c
        Big Endian test: qemu-aarch64_be
      Booted nanodroid with MTE enabled.
      bionic string tests with MTE enabled.
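
The endianness issue in the strchr-mte commit has a rough analogue in portable C: a wide load places the first byte in memory at a different end of the register depending on byte order, while a byte-by-byte view always follows memory order. The sketch below is an analogy only, not the LDR/LD1 code itself, and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Index of the first byte equal to c in an 8-byte chunk, or 8 if none.
   Bytes are examined in memory order, so the result does not depend on
   the byte order of the machine. */
static size_t first_match_bytewise(const unsigned char chunk[8],
                                   unsigned char c)
{
    for (size_t i = 0; i < 8; i++)
        if (chunk[i] == c)
            return i;
    return 8;
}

/* The same query through a single 64-bit load. Scanning from the
   low-order byte follows memory order only on a little-endian machine;
   on big endian it walks the chunk backwards. */
static size_t first_match_wordwise_le(const unsigned char chunk[8],
                                      unsigned char c)
{
    uint64_t w;
    memcpy(&w, chunk, sizeof w);
    for (size_t i = 0; i < 8; i++)
        if ((unsigned char)(w >> (8 * i)) == c)
            return i;
    return 8;
}
```

On a little-endian host both functions agree; on a big-endian host the word-wise version reports the position counted from the wrong end, which is the kind of discrepancy a big-endian test run under qemu-aarch64_be would expose.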