Commits · ded1e17ec29745209188db8b40ef17466b0d595a · Android-smartphones / realme / realme 5pro / kaderbava / external_arm-optimized-routines

Aug 14, 2020
- string: Benchmark unaligned memmove · ded1e17e
  Wilco Dijkstra authored Aug 14, 2020
```
Add benchmarking of forward and backward unaligned memmoves.
```
  ded1e17e
- string: Improve backwards memmove performance · cf3b6b37
  Wilco Dijkstra authored Aug 14, 2020
```
On some microarchitectures performance of the backwards memmove improves
if the stores use STR with decreasing addresses.
```
  cf3b6b37
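
As a rough illustration of what "stores with decreasing addresses" means for a backwards copy, here is a minimal C sketch. The library's implementation is AArch64 assembly and copies in much larger units; `memmove_sketch` is a hypothetical name and the byte-wise loops are an assumption made for clarity.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of the library: copies n bytes and
   handles overlap by choosing the copy direction.  */
void *memmove_sketch (void *dst, const void *src, size_t n)
{
  unsigned char *d = dst;
  const unsigned char *s = src;

  if ((uintptr_t) d - (uintptr_t) s >= n)
    {
      /* dst is below src, or the ranges do not overlap: copy forwards.  */
      for (size_t i = 0; i < n; i++)
        d[i] = s[i];
    }
  else
    {
      /* dst overlaps the tail of src: copy backwards, so every store
         goes to a lower address than the previous one.  */
      for (size_t i = n; i > 0; i--)
        d[i - 1] = s[i - 1];
    }
  return dst;
}
```

The single unsigned comparison covers both forward cases, because the pointer difference wraps to a large value when `dst` is below `src`.
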
Jul 29, 2020
- string: Fix CVE-2020-6096 for arm memcpy · 77ac889d
  Adhemerval Zanella authored Jul 27, 2020
```
This fix is similar to the one done in glibc (beea361050).
```
  77ac889d
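
CVE-2020-6096 is publicly described as a signed comparison issue in the ARMv7 memcpy: a length with the top bit set is treated as negative. The C sketch below only illustrates that class of bug. The function names and the threshold of 64 are hypothetical, and the real code is Arm assembly that uses condition codes rather than C comparisons.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical illustration only; the threshold 64 is arbitrary.  */

/* Buggy pattern: the length is compared as a signed value, so a length
   with the top bit set looks negative and takes the "small copy" path.
   (The conversion is implementation-defined; it wraps on common ABIs.)  */
bool is_small_copy_signed (size_t n)
{
  return (ptrdiff_t) n < 64;
}

/* Fixed pattern: compare the length as an unsigned value.  */
bool is_small_copy_unsigned (size_t n)
{
  return n < 64;
}
```

With a length of `SIZE_MAX`, the signed variant reports a small copy while the unsigned variant does not.
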
Jul 01, 2020

Wilco Dijkstra authored Jul 01, 2020

Optimize strlen using a mix of scalar and SIMD code. On modern micro
architectures large strings are 55% faster than the current version,
and 35% faster than strlen-mte.  On the random strlen benchmark the
speedup is 3.4% and 40% respectively.

224cb5f6

Jun 23, 2020
- string: Add strlen benchmark · bb88c18f
  Wilco Dijkstra authored Jun 23, 2020
```
Add strlen benchmark with a random latency and small/medium throughput tests.
```
  bb88c18f
Jun 12, 2020

string: Fix overflow issue in strncmp-mte · d4f3f7f0

Wilco Dijkstra authored Jun 12, 2020

If limit is near SIZE_MAX it can overflow in the mutually aligned path.
Fix this by clamping limit to SIZE_MAX.

d4f3f7f0
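
The clamping idea can be sketched in C as a saturating addition. This assumes the aligned path adds an alignment adjustment to the limit, which the commit message does not spell out; `add_clamped` is a hypothetical helper, not a function from the library.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: add an alignment adjustment to a byte limit
   without wrapping around.  */
size_t add_clamped (size_t limit, size_t adjust)
{
  size_t sum = limit + adjust;  /* unsigned addition wraps on overflow */

  /* A wrapped sum is smaller than either operand: clamp to SIZE_MAX.  */
  return sum < limit ? SIZE_MAX : sum;
}
```
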

Jun 01, 2020

string: Fix issue in strcmp-mte NUL check · f08b12e8

Wilco Dijkstra authored Jun 01, 2020

Improve the previous fix: if a string is immediately preceded by a NUL byte
and its first byte is 0x1, the NUL check may mistake that byte for a NUL byte.
Instead of removing bytes outside the string via a shift, force them to be
non-NUL.

f08b12e8
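
The false positive can be reproduced with the classic word-at-a-time zero-byte test. This is a C sketch under assumptions: a little-endian 64-bit word, and hypothetical helper names `has_nul` and `force_non_nul`. The library itself implements the check in assembly.

```c
#include <stdint.h>

#define REP8_01 0x0101010101010101ULL
#define REP8_80 0x8080808080808080ULL

/* Classic word-at-a-time test: sets 0x80 in every zero byte, but borrow
   propagation can also flag a 0x01 byte directly above a zero byte.  */
uint64_t has_nul (uint64_t x)
{
  return (x - REP8_01) & ~x & REP8_80;
}

/* Force the low `skip` bytes (the bytes before the string start in a
   little-endian word) to 0xff so they cannot look like NUL.
   skip must be in the range 0..7.  */
uint64_t force_non_nul (uint64_t x, unsigned skip)
{
  uint64_t mask = skip ? ~0ULL >> (64 - 8 * skip) : 0;
  return x | mask;
}
```

For the word `0x6161616161610100` (a NUL byte, then 0x01, then six `'a'` bytes), `has_nul` flags both low bytes, so ignoring only the first flag still leaves a bogus NUL at the string's first byte. After `force_non_nul (w, 1)` the test reports no NUL at all.
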

May 29, 2020

v20.05 release · ef907c7a

Szabolcs Nagy authored May 29, 2020

* New functionality (64-bit Arm)
  * string: Optimized MTE variants of strlen, strnlen, strchr,
    strchrnul, strrchr, memchr, memrchr, strcpy, stpcpy, strcmp,
    strncmp
  * string: Changes to support BTI
  * string: New optimized memrchr, strnlen
* Performance improvements (Neoverse N1)
  * strchr/strchrnul: 21% improvement on long strings
  * strrchr: 11% improvement
  * strnlen: 130% improvement on long strings, 50% on short strings
* Benchmark and tests
  * string: New memcpy benchmark
  * string: Cleanup testsuite and improve test coverage

ef907c7a

string: Fix issue in strcmp-mte · 304137d8
Wilco Dijkstra authored May 29, 2020
```
Ensure nul bytes before unaligned strings are correctly ignored.
```
304137d8

May 28, 2020

string: Improve strcmp-mte performance · 27bb6b2b

Wilco Dijkstra authored May 28, 2020

Improve strcmp performance. On various micro architectures the speedup is 65%
on large unaligned strings and 21% on large (mutually) aligned strings.
On small unaligned strings the speedup is 12%.

27bb6b2b

string: Improve memcpy benchmark · 2525af9b
Wilco Dijkstra authored May 28, 2020
```
Print results in bytes/ns. Add medium and large copy benchmark.
```
2525af9b

string: Add MTE support to string tests. · 4d55c2d3

Branislav Rankov authored May 28, 2020

Set tags for every test case so that boundaries are as narrow as
possible. There is no handling of tag faults, so the test will
crash if there is an MTE problem.

The implementations that are not compatible are excluded, including
the standard symbols that may come from an MTE-incompatible libc.

4d55c2d3

May 22, 2020
- string: Cleanup strchrnul test · 09b21d98
  Wilco Dijkstra authored May 22, 2020
```
Clean up code and improve test coverage.
```
  09b21d98
- string: Cleanup strchr test · 620b09f1
  Wilco Dijkstra authored May 22, 2020
```
Clean up code and improve test coverage.
```
  620b09f1
- string: Cleanup strrchr test · f5edabeb
  Wilco Dijkstra authored May 22, 2020
```
Clean up code and improve test coverage.
```
  f5edabeb
- string: Cleanup stpcpy test · e3b6fdf1
  Wilco Dijkstra authored May 22, 2020
```
Cleanup stpcpy test and improve test coverage.
```
  e3b6fdf1
- string: Cleanup strcpy test · 76203e7e
  Wilco Dijkstra authored May 22, 2020
```
Cleanup strcpy test and improve test coverage.
```
  76203e7e
- string: Cleanup strnlen test · edfa34d2
  Wilco Dijkstra authored May 22, 2020
```
Cleanup strnlen test and improve test coverage.
```
  edfa34d2
- string: Cleanup strlen test · 833a1ea7
  Wilco Dijkstra authored May 21, 2020
```
Cleanup strlen test and improve test coverage.
```
  833a1ea7
May 20, 2020

string: Add optimized strcpy-mte and stpcpy-mte · 0c9a5f3e

Wilco Dijkstra authored May 20, 2020

Add optimized MTE-compatible strcpy-mte and stpcpy-mte. On various micro
architectures the speedup over the non-MTE version is 53% on large strings
and 20-60% on small strings.

0c9a5f3e

May 18, 2020

string: Improve strrchr-mte performance · a99a1a96

Wilco Dijkstra authored May 18, 2020

Improve strrchr performance by using a fast strchr loop to find the first
match. On various micro architectures the speedup is 30-80% on large strings
and 32% on small strings.

a99a1a96
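
A loose C sketch of the two-phase idea: a fast search finds the first match, and only then does the code look for later ones. It uses the standard `strchr` as the fast loop, which is an assumption made for illustration. `strrchr_sketch` is a hypothetical name and this is not the SIMD algorithm used in the commit.

```c
#include <stddef.h>
#include <string.h>

char *strrchr_sketch (const char *s, int c)
{
  /* Fast phase: find the first match. Strings without a match, and
     searches for the terminator, are finished here.  */
  const char *last = strchr (s, c);
  if (last == NULL || (char) c == '\0')
    return (char *) last;

  /* Slower phase: keep looking for later matches.  */
  for (const char *p = strchr (last + 1, c); p != NULL;
       p = strchr (p + 1, c))
    last = p;

  return (char *) last;
}
```
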

May 13, 2020

string: cleaner handling of GNU property notes · 98e4d6a5

Szabolcs Nagy authored May 12, 2020

Add GNU property notes to asm files in asmdefs.h instead of adding
the END_FILE macro to each file.

The WANT_GNU_PROPERTY macro can still be used to opt out of the
notes.

98e4d6a5

May 12, 2020

string: Add memrchr test · 875cc5fd
Wilco Dijkstra authored May 12, 2020
```
Add new memrchr test.
```
875cc5fd

string: Add optimized memrchr · ad3f8def

Wilco Dijkstra authored May 12, 2020

Add optimized MTE-compatible memrchr. This walks the input backwards
using the same algorithm as memchr-mte.

ad3f8def

string: Improve strlen-mte performance · 2fdbac97

Wilco Dijkstra authored May 12, 2020

Improve strlen performance by using a much simpler SIMD implementation.
On various micro architectures the speedup is 11% on large strings and
63% on small strings.

2fdbac97

string: Cleanup memchr test · 04957075
Wilco Dijkstra authored May 12, 2020
```
Improve memchr test coverage and cleanup code.
```
04957075
string: Cleanup strnlen test · e7517100
Wilco Dijkstra authored May 12, 2020
```
Improve strnlen test coverage and cleanup code.
```
e7517100

string: Improve memchr-mte performance · cbbc5965

Wilco Dijkstra authored May 12, 2020

Improve memchr performance by using a more efficient termination test.
On various micro architectures the speedup is 16% on large strings and
46% on small strings.

cbbc5965

string: Improve strnlen performance · 6b23ea83

Wilco Dijkstra authored May 12, 2020

Improve strnlen performance by using a much simpler SIMD implementation.
On modern micro architectures the speedup is 2.3x on large strings and
1.5x on small strings.

6b23ea83

string: format tests according to GNU style · 0b7d1aeb

Szabolcs Nagy authored May 12, 2020

Use the GNU style consistently in the string test code.

Added clang-format guard comments where necessary so the
code can be reformatted using the clang-format tool and
GNU style settings from gcc contrib/clang-format.

0b7d1aeb

May 01, 2020

string: add a setting to disable GNU Property Notes · e1127946

Szabolcs Nagy authored May 01, 2020

GNU Property Notes are only supported in recent tooling and older
tools may warn about them, so it makes sense to remove these notes
on a system where BTI is not supported anyway.

The actual BTI instructions should be kept in place to avoid
disturbing code layout.

-DWANT_GNU_PROPERTY=0 removes the .note.gnu.property section
from assembly files (ideally it would be based on the compiler
default setting, but there is no feature test macro for BTI and
PAC-RET).

e1127946

string: Further improve strchrnul-mte performance · fa69d42a

Wilco Dijkstra authored May 01, 2020

Remove 2 more instructions, resulting in a 9.8% speedup on medium-sized
strings (16-32).

The BTI patch changed ENTRY so the loops got misaligned; this fixes
that regression.

fa69d42a

string: Further improve strchr-mte performance · 7bb8464f

Wilco Dijkstra authored May 01, 2020

Remove 2 more instructions, resulting in a 6.8% speedup on medium-sized
strings (16-32).

The BTI patch changed ENTRY so the loops got misaligned; this fixes
that regression.

7bb8464f

Apr 30, 2020

string: ARMv8.5 MTE: Add MTE compatible version of strncmp. · 1de12a67

Branislav Rankov authored Apr 30, 2020

Reading outside the range of the string is only allowed within 16 byte
aligned granules when MTE is enabled.

This implementation is based on string/aarch64/strncmp.S

Change the case when the strings are misaligned: align the pointers
down and ignore bytes before the start of the string. Carry the part
that is not compared to the next comparison.

Testing done:
string/test/strncmp.c on big endian, little endian and with MTE support.
Booted nanodroid with MTE enabled.

Benchmarked on Pixel 4.

1de12a67
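
The align-down approach from this commit message can be sketched in C on a simpler problem, a length scan rather than a comparison. The sketch assumes every 16-byte granule it touches is fully readable, which plain C does not guarantee for arbitrary objects. `strlen_granule_sketch` is a hypothetical name and the byte-wise inner loop stands in for the real word or SIMD accesses.

```c
#include <stddef.h>
#include <stdint.h>

#define GRANULE 16

/* Hypothetical sketch: length of the string at s, reading only within
   whole 16-byte aligned granules.  */
size_t strlen_granule_sketch (const char *s)
{
  uintptr_t addr = (uintptr_t) s;

  /* Align the pointer down to the start of its granule.  */
  const unsigned char *g
    = (const unsigned char *) (addr & ~(uintptr_t) (GRANULE - 1));

  /* Bytes in the first granule that precede the string are ignored.  */
  size_t skip = addr & (GRANULE - 1);

  for (;;)
    {
      for (size_t i = skip; i < GRANULE; i++)
        if (g[i] == 0)
          return (size_t) (g + i - (const unsigned char *) s);

      g += GRANULE;  /* later granules are examined in full */
      skip = 0;
    }
}
```

Because every access stays inside an aligned granule that contains at least one byte of the string, the scan never reads from a granule the string does not occupy.
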

string: ARMv8.5 MTE: Add MTE compatible version of strcmp. · 9ebd9615

Branislav Rankov authored Apr 30, 2020

Reading outside the range of the string is only allowed within 16 byte
aligned granules when MTE is enabled.

This implementation is based on string/aarch64/strcmp.S

Change the case when the strings are misaligned: align the pointers
down and ignore bytes before the start of the string. Carry the part
that is not compared to the next comparison.

Testing done:
optimized-routines/string/test/strcmp.c on big and little endian.
Booted nanodroid with MTE enabled.
bionic string tests with MTE enabled.

Benchmark results:
Ran both the bionic and glibc benchmarks on Pixel 4, on A76 and A55 cores.

9ebd9615

string: ARMv8.5 BTI: Add BTI support to assembly files. · 232e3c08

Tamas Zsoldos authored Apr 28, 2020

This change adds a landing pad to the start of each function
implemented in assembly by adding it to the ENTRY macro. To avoid
skipping it when using an alias, every ENTRY_ALIAS use must precede
the corresponding ENTRY.

Furthermore, the GNU property note is added to the assembly
files. Since none of the functions save LR to the stack, both BTI and PAC
support are indicated.

Paddings before __strncmp_aarch64 and __strnlen_aarch64 were adjusted.

232e3c08

ARMv8.5 MTE: Add MTE compatible version of strrchr. · bfaeb591

Gabor Kertesz authored Apr 24, 2020

Reading outside the range of the string is only allowed within
16 byte aligned granules when MTE is enabled.

This implementation is based on string/aarch64/strrchr.S.

Testing done:
optimized-routines/string/test/strrchr.c
Booted nanodroid with MTE enabled.
Bionic string tests with MTE enabled.
Big endian with Qemu: qemu-aarch64_be

bfaeb591

string: Improve strchrnul-mte performance · 878eb93f

Wilco Dijkstra authored Apr 29, 2020

Improve strchrnul performance by using more efficient termination tests.
On various micro architectures the speedup is 20% on large strings and 26%
on small strings.

878eb93f

string: Improve strchr-mte performance · 7aeabda5

Wilco Dijkstra authored Apr 29, 2020

Improve strchr performance by using a more efficient termination test.
On various micro architectures the speedup is 19% on large strings and
19% on small strings.

7aeabda5

string: Improve strrchr performance · 2f12ab4a

Wilco Dijkstra authored Apr 24, 2020

Improve strrchr performance by using a more efficient termination test.
On various micro architectures the speedup is 11% on large strings.

2f12ab4a