iommu/dma: Plumb in the per-CPU IOVA caches

With IOVA allocation suitably tidied up, we are finally free to opt in to the per-CPU caching mechanism. The caching alone can provide a modest improvement over walking the rbtree for weedier systems (iperf3 shows ~10% more ethernet throughput on an ARM Juno r1 constrained to a single 650MHz Cortex-A53), but the real gain will be in sidestepping the rbtree lock contention which larger ARM-based systems with lots of parallel I/O are starting to feel the pain of.

Reviewed-by: Nate Watterson <nwatters@codeaurora.org>
Tested-by: Nate Watterson <nwatters@codeaurora.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Git-commit: bb65a64c7285e7105c1a6c8a33b37770343a4e96
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
[pdaly@codeaurora.org: Resolve conflicts due to missing msi changes]
Change-Id: I8b0ef39e577e22b914fc10cfdcf5482b41bd6661
Signed-off-by: Patrick Daly <pdaly@codeaurora.org>