When the Meltdown and Spectre vulnerabilities were first rumored last year, I wondered about something. We’re familiar here with the CPU’s hardware performance counters through our work on software performance analysis and optimization. Briefly, these counters were introduced over the last decade or so and are built into the CPU. They solve a problem that existed in earlier designs: it was hard for software to measure, at high resolution, how much time had passed. Knowing the time elapsed between two points in a program helps the developer understand its performance characteristics. As processors became faster and more complex over the years, this became a major impediment to analysis work, and various workarounds were used, such as executing performance-critical code many times over in a loop or using statistical sampling techniques. Hardware performance counters came to the rescue with a very low-overhead way to get both timing and data about things like cache hit/miss behavior.
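To make the "run it many times in a loop" workaround concrete, here is a minimal Python sketch. It is only an illustration: the `time_per_call_ns` helper and the `work` function are hypothetical names invented for this example, and the clock used is the operating system's high-resolution timer exposed by Python rather than a hardware performance counter read directly.

```python
import time

def time_per_call_ns(fn, repeats=100_000):
    """Average wall-clock cost of one fn() call, in nanoseconds.

    A single call may be too short to time reliably, so the call is
    repeated and the total elapsed time is divided by the repeat count.
    """
    start = time.perf_counter_ns()
    for _ in range(repeats):
        fn()
    elapsed = time.perf_counter_ns() - start
    return elapsed / repeats

def work():
    # Hypothetical stand-in for a short piece of performance-critical code.
    return sum(range(50))

avg_ns = time_per_call_ns(work)
print(f"about {avg_ns:.0f} ns per call (loop overhead included)")
```

The average hides variation between individual calls and includes the cost of the loop itself, which is part of why this approach is only a workaround.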
Since these counters allow a detailed view into the operation of the micro-architecture, it follows that they could be used for both good and ill in relation to Meltdown/Spectre.
Cody Pierce from Endgame has written an interesting article about the “white hat” side of this. The idea is to detect when an adversary is trying to exploit the vulnerability by looking for anomalous patterns in the performance counter data. This would allow a kind of “burglar alarm” to be triggered in the victim process, presumably followed by administrative action to investigate and remove the attacker’s agent from the machine.
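As a rough illustration of what "looking for anomalous patterns" could mean, the sketch below flags sampling intervals whose counter value sits far above a known-good baseline. This is not the method from the Endgame article. It assumes, purely for the example, that the signal of interest is a per-interval cache-miss count, and the `find_anomalies` function, the threshold of three standard deviations, and all of the numbers are made up.

```python
from statistics import mean, stdev

def find_anomalies(baseline, samples, threshold=3.0):
    """Return the indices of samples that exceed the baseline mean
    by more than `threshold` baseline standard deviations."""
    limit = mean(baseline) + threshold * stdev(baseline)
    return [i for i, value in enumerate(samples) if value > limit]

# Invented cache-miss counts per sampling interval (illustrative only).
baseline = [120, 130, 125, 118, 122, 127, 131, 119]  # normal operation
observed = [124, 129, 940, 1010, 126]                # intervals under watch

alerts = find_anomalies(baseline, observed)
print(alerts)  # [2, 3]: the two intervals with a sharp jump in misses
```

A real detector would have to read the counters from the hardware or the operating system and cope with legitimate workloads that also miss the cache heavily, so a fixed threshold like this one is only a starting point.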