nfit: do an ARS scrub on hitting a latent media error
When a latent (unknown to 'badblocks') error is encountered, it will trigger a machine check exception. On a system with machine check recovery, this will only SIGBUS the process(es) which had the bad page mapped (as opposed to a kernel panic on platforms without machine check recovery features). In the former case, we want to trigger a full rescan of that nvdimm bus. This will allow any additional, new errors to be captured in the block devices' badblocks lists, and offending operations on them can be trapped early, avoiding machine checks. This is done by registering a callback function with the x86_mce_decoder_chain and calling the new ars_rescan functionality with the address in the mce notificatiion. Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Tony Luck <tony.luck@intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Loading
Please register or sign in to comment