Since 2018, an almost endless series of attacks broadly known as Spectre has kept Intel and AMD scrambling to develop defenses to mitigate vulnerabilities that allow malware to pluck passwords and other sensitive information directly out of silicon. Now, researchers say they’ve devised a new attack that breaks most—if not all—of those on-chip defenses.
Spectre got its name for its abuse of speculative execution, a feature in virtually all modern CPUs that predicts which instructions a CPU is likely to receive next and begins executing them before it knows whether the prediction is correct. By using code that forces a CPU to execute instructions along the wrong path, Spectre can extract confidential data that the CPU touches while on that wrong path, before the mistake is discovered and the work is discarded. These exploits are known as transient execution attacks.
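To make the idea concrete, here is a minimal sketch in C of the kind of code pattern that the original bounds-check-bypass variant of Spectre abuses. Every name in it (victim_lookup, public_data, probe, PROBE_STRIDE) is hypothetical, and it shows only the vulnerable shape of the code, not a working exploit.

```c
#include <stddef.h>
#include <stdint.h>

#define PROBE_STRIDE 4096  /* hypothetical spacing so each byte value maps to its own region */

uint8_t public_data[16] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16};
size_t  public_len = 16;
uint8_t probe[256 * PROBE_STRIDE];  /* array whose cache state an attacker could later inspect */

/* Returns public_data[idx] for an in-range idx, or 0 otherwise. */
uint8_t victim_lookup(size_t idx)
{
    if (idx < public_len) {                 /* the branch the CPU may predict incorrectly */
        uint8_t value = public_data[idx];   /* may run transiently with an out-of-range idx */
        volatile uint8_t *slot = &probe[(size_t)value * PROBE_STRIDE];
        (void)*slot;                        /* value-dependent load that leaves a cache footprint */
        return value;
    }
    return 0;
}
```

Architecturally, a call such as victim_lookup(16) returns 0 and never reads out of range. The concern is what a CPU with a mistrained branch predictor may do transiently before the bounds check resolves: the value-dependent load can change cache state even though its result is thrown away.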
Since Spectre was first described in 2018, new variants have surfaced almost every month. In many cases, the new variants have required chipmakers to develop new or augmented defenses to mitigate the attacks.
A key Intel protection known as LFENCE, for instance, stops more recent instructions from being dispatched to execution before earlier ones. Other hardware- and software-based solutions broadly known as “fencing” build digital fences around secret data to protect against transient execution attacks that would allow unauthorized access.
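As a rough illustration of the fencing idea, the sketch below places a barrier right after a bounds check so that the dependent load is not dispatched until the check has resolved. It assumes an x86-64 compiler that provides the _mm_lfence intrinsic; the function name read_checked and the SPECULATION_BARRIER macro are made up for this example, and the fallback for other architectures is a placeholder that provides no protection.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(_M_X64)
#include <emmintrin.h>                        /* declares _mm_lfence (SSE2) */
#define SPECULATION_BARRIER() _mm_lfence()
#else
#define SPECULATION_BARRIER() ((void)0)       /* placeholder only: no barrier on this target */
#endif

/* Hypothetical lookup: returns table[idx], or -1 if idx is out of range. */
int read_checked(const uint8_t *table, size_t len, size_t idx)
{
    if (idx >= len)
        return -1;

    /* Hold back the load below until the comparison above has completed. */
    SPECULATION_BARRIER();

    return table[idx];
}
```

A call such as read_checked(buf, sizeof buf, i) behaves exactly like an ordinary checked read from the program's point of view. The only difference is the ordering constraint on the CPU, which is also where the performance cost of fencing comes from.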
Researchers at the University of Virginia said last week that they found a new transient execution variant that breaks virtually all on-chip defenses that Intel and AMD have implemented to date. The new technique works by targeting an on-chip buffer that caches “micro-ops,” which are simplified commands that are derived from complex instructions. By allowing the CPU to fetch the commands quickly and early in the speculative execution process, micro-op caches improve processor speed.
The researchers are the first to exploit the micro-ops cache as a side channel, or as a medium for making observations about the confidential data stored inside a vulnerable computing system. By measuring the timing, power consumption, or other physical properties of a targeted system, an attacker can use a side channel to deduce data that otherwise would be off-limits.
“The micro-op cache as a side channel has several dangerous implications,” the researchers wrote in an academic paper. “First, it bypasses all techniques that mitigate caches as side channels. Second, these attacks are not detected by any existing attack or malware profile. Third, because the micro-op cache sits at the front of the pipeline, well before execution, certain defenses that mitigate Spectre and other transient execution attacks by restricting speculative cache updates still remain vulnerable to micro-op cache attacks.”
The paper continues:
“Most existing invisible speculation and fencing-based solutions focus on hiding the unintended vulnerable side-effects of speculative execution that occur at the backend of the processor pipeline, rather than inhibiting the source of speculation at the front-end. That makes them vulnerable to the attack we describe, which discloses speculatively accessed secrets through a front-end side channel, before a transient instruction has the opportunity to get dispatched for execution. This eludes a whole suite of existing defenses. Furthermore, due to the relatively small size of the micro-op cache, our attack is significantly faster than existing Spectre variants that rely on priming and probing several cache sets to transmit secret information, and is considerably more stealthy, as it uses the micro-op cache as its sole disclosure primitive, introducing fewer data/instruction cache accesses, let alone misses.”
There has been some pushback since the researchers published their paper. Intel disagreed that the new technique breaks defenses already put in place to protect against transient execution. In a statement, company officials wrote:
“Intel reviewed the report and informed researchers that existing mitigations were not being bypassed and that this scenario is addressed in our secure coding guidance. Software following our guidance already have protections against incidental channels including the uop cache incidental channel. No new mitigations or guidance are needed.”
Transient execution attacks use malicious code to exploit speculative execution. The exploits, in turn, bypass bounds checks, authorization checks, and other security measures built into applications. Software that follows Intel’s secure coding guidelines is resistant to such attacks, including the variant introduced last week.
Key to Intel’s guidance is the use of constant-time programming, an approach where code is written to be secret-independent. The technique the researchers introduced last week uses code that embeds secrets into the CPU branch predictors, and as such, it doesn’t follow Intel’s recommendations, a company spokeswoman said on background.
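For readers unfamiliar with the style, the following is a small sketch of what secret-independent code can look like in C. The function names (branchy_select, ct_select, ct_equal) are invented for illustration and are not taken from Intel’s guidance. Compilers are free to restructure either version, so real projects generally rely on vetted library routines and inspect the generated machine code rather than trusting source-level patterns like these.

```c
#include <stddef.h>
#include <stdint.h>

/* Branching version: which instructions get fetched depends on the secret bit. */
uint32_t branchy_select(uint32_t secret_bit, uint32_t a, uint32_t b)
{
    if (secret_bit & 1u)
        return a;
    return b;
}

/* Branchless version: the same instructions run whatever the secret bit is. */
uint32_t ct_select(uint32_t secret_bit, uint32_t a, uint32_t b)
{
    uint32_t mask = (uint32_t)0 - (secret_bit & 1u);  /* all ones if the bit is set, else zero */
    return (a & mask) | (b & ~mask);
}

/* Comparison that always scans all n bytes instead of stopping at the first mismatch.
   Returns 1 if the buffers are equal and 0 otherwise. */
int ct_equal(const uint8_t *a, const uint8_t *b, size_t n)
{
    uint8_t diff = 0;
    for (size_t i = 0; i < n; i++)
        diff |= (uint8_t)(a[i] ^ b[i]);               /* accumulate any differing bits */
    return (int)(1u & ((uint32_t)(diff - 1) >> 8));   /* diff == 0 maps to 1, anything else to 0 */
}
```

Both select functions return the same results. The difference is that the branchless one has no control flow that depends on the secret, which is the property constant-time programming aims for, and the extra arithmetic hints at why the approach tends to cost some performance.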
AMD didn’t provide a response in time to be included in this post.
Another rebuff has come in a blog post written by Jon Masters, an independent researcher into computer architecture. He said the paper, particularly the cross-domain attack it describes, is “interesting reading” and a “potential concern” but that there are ways to fix the vulnerabilities, possibly by invalidating the micro-ops cache when crossing the privilege barrier.
“The industry had a huge problem on its hands with Spectre, and as a direct consequence, a great deal of effort was invested in separating privilege, isolating workloads, and using different contexts,” Masters wrote. “There may be some cleanup needed in light of this latest paper, but there are mitigations available, albeit always at some performance cost.”
Not so simple
Ashish Venkat, a professor in the computer science department at the University of Virginia and a co-author of last week’s paper, agreed that constant-time programming is an effective means for writing apps that are invulnerable to side-channel attacks, including those described by last week’s paper. But he said that the vulnerability being exploited resides in the CPU and therefore should receive a microcode patch.
He added that much of today’s software remains vulnerable because it doesn’t use constant-time programming, and there’s no indication when that will change. He also echoed Masters’ observation that the code approach slows down applications.
Constant-time programming, he told me, “is not only extremely hard in terms of the actual programmer effort but also entails significant deployment challenges related to patching all sensitive software that’s ever been written. It is also typically exclusively used for small, specialized security routines due to the performance overhead.”
Venkat said the new technique is effective against all Intel chips designed since 2011. He told me that besides being vulnerable to the same cross-domain exploit, AMD CPUs are also susceptible to a separate attack. It exploits the simultaneous multithreading design because the micro-op cache in AMD processors is competitively shared. As a result, attackers can create a cross-thread covert channel that can transmit secrets with a bandwidth of 250 Kbps and an error rate of 5.6 percent.
Transient execution attacks pose serious risks, but at the moment, those risks are mostly theoretical because the underlying vulnerabilities are rarely if ever actively exploited. Software engineers, on the other hand, have much more reason for concern, and this new technique should only increase their worries.