As expected, there is no impact on reads enabling the mitigations. There is a small increase in writes when the average writes during the test are compared, but the chart of the Host writes/sec during the tests look very similar.

Lower is better

Lower is better

An increase in logon times of 17% when enabling Spectre and Meltdown and 71% when also enabling the Side-Channel Aware scheduler does seem a lot. But this is also caused by the higher CPU utilization when the mitigations are enabled. In the next chart, the logon times during the first 20 minutes are compared.

Lower is better

When the CPU utilization of the host is still below 80%, the logon times are much closer between the tests. Enabling the Spectre and Meltdown mitigations increase the logon times with 5% and with 6% if the Side-Channel aware scheduler is enabled.

Alternative test

Because the impact of enabling the mitigations against these vulnerabilities is significant (especially enabling the Side-Channel aware scheduler on ESXi), we decided to run another test with a lower number of sessions and VMs when the Side-Channel Aware scheduler is enabled. The number of launched sessions is lowered by 25% and the number of VMs is 4 instead of 8 RDSH VMs running on the host. These results should provide a more realistic comparison of user density because the virtual CPU to CPU core ration will now be the same again as enabling the Side-Channel Aware scheduler will not leverage hyperthreading.

The alternative results

In the following chart, the impact on Login VSI VSImax of enabling Spectre, Meltdown and L1TF in an RDSH environment is presented, when the number of sessions and VMs are adjusted for the test with L1TF mitigations.

Vmware Side-channel-aware Scheduler

Higher is better

Lowering the number of launched sessions and VMs when the Side-Channel Aware scheduler is enabled on ESXi will improve the user density. Instead of a decrease of VSImax of 50%, it’s now “only” 34%.

Lower is better

The Login VSI Baseline is now comparable to the Login VSI Baseline of the test without the mitigations for Spectre and Meltdown enabled (instead of the 2% increase in Login VSI baseline).

Lower is better

Vmware Side Channel Aware Scheduler Tool

As you can see, the CPU load is quite similar between the 3 configurations. The CPU utilization of the test with 4 VMs with L1TF enabled is lower during almost the entire test. This is caused by the lower number of launched sessions. In the end, the CPU still reaches 100% utilization.

Vmware Side Channel Aware Scheduler App

Aware

Lower is better

Vmware Side Channel Aware Scheduler Jobs

On average, the CPU utilization was 6% lower with the Side-Channel Aware scheduler enabled, using less VMs, but also with 25% fewer users launched.

Conclusion

The impact of Spectre, Meltdown, and L1TF on a virtualized RDSH infrastructure can be significant. In our testing enabling just the Spectre and Meltdown mitigations cause an impact on user density of 9%. The impact on the user experience (Login VSI baseline) is 9%, which means 9% higher response times of applications. The mitigation for L1TF is a different story. Our testing shows an impact of 50% on Login VSI VSImax when the Side-Channel aware scheduler on ESXi is enabled. Testing with a different amount of launched sessions and a lower number of VMs showed that the impact was still more than 30% on the Login VSI VSImax score, with all mitigations enabled.
The question remains, should you enable these mitigations in your RDSH environment? And the answer is: it depends on who you ask. If you ask your security officer, he will (or should) say yes. Anyone else probably doesn’t know or don’t care and just want good performance, good user experience or a low TCO (fewer servers hosting the RDSH-environment). This translates to not wanting the mitigations enabled because it will impact the performance, user experience and TCO.
If you don’t enable the mitigations, your environment is vulnerable to these exploits. We cannot decide for you if it’s worth the risk of not enabling the mitigations. We can only advice you to test the mitigations and measure the impact on your environment and take the appropriate measures.
If you have comments about this research or want to discuss other configurations, please join us on our GO-EUC Slack channel.
Photo by Shawn Ang on Unsplash

Share this post: