Royal Flush: Privilege Escalation Vulnerability in Azure Functions

April 8, 2021

Paul Litvak

One of the most commonly cited benefits of moving to cloud services is shared responsibility for securing your assets. But cloud providers are not immune to security mistakes such as vulnerabilities or misconfigurations. This is the second escalation of privileges (EoP) vulnerability we have found in Azure Functions in the last few months. We reported the vulnerability to the Microsoft Security Response Center (MSRC) and worked with them on it. They determined that this behavior has no security impact on Azure Functions users: since the Docker host we probed was actually a Hyper-V guest, it was protected by another sandboxing layer. Still, cases like this underscore that vulnerabilities are sometimes unknown or out of the cloud consumer’s control. We recommend a two-pronged approach to cloud security. First, do the basics, like fixing known vulnerabilities and hardening your systems, to decrease the likelihood of getting attacked. Second, implement runtime protection to detect and respond to post-exploitation activity and other in-memory attacks as they occur.

Disclosing a Vulnerability in Azure Functions Containers

Azure Functions containers run with the --privileged Docker flag, which causes device files in the /dev directory to be shared between the Docker host and the container guest. This is standard privileged container behavior. However, these device files have ‘rw’ permissions for ‘others’, which is the root cause of the vulnerability we present.
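The ‘rw’ for ‘others’ condition can be checked programmatically. The following is a minimal Python sketch, not the tooling we used: the function names are hypothetical, and it assumes the devices of interest are block devices listed directly under /dev.

```python
import os
import stat
from typing import List


def is_world_rw_block_device(mode: int) -> bool:
    """Return True if `mode` is a block device that 'others' can read and write."""
    rw_others = stat.S_IROTH | stat.S_IWOTH
    return stat.S_ISBLK(mode) and (mode & rw_others) == rw_others


def find_world_rw_block_devices(dev_dir: str = "/dev") -> List[str]:
    """List block devices directly under `dev_dir` with o+rw permissions."""
    hits = []
    for entry in os.scandir(dev_dir):
        try:
            # Do not follow symlinks; we want the device node itself.
            mode = entry.stat(follow_symlinks=False).st_mode
        except OSError:
            continue  # entry vanished or is not accessible
        if is_world_rw_block_device(mode):
            hits.append(entry.path)
    return sorted(hits)


if __name__ == "__main__":
    for path in find_world_rw_block_devices():
        print(path)
```

On a typical Linux installation, block devices are usually not readable or writable by ‘others’, so any path printed here would be worth a closer look.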

Azure Functions containers run as the low-privileged app user. The container’s hostname includes the word Sandbox, which suggests that the user is meant to be contained with low privileges. Because the container runs with the --privileged flag, a user who is able to escalate to root would be able to escape to the Docker host using various Docker escape techniques. The lax permissions on the device files are not standard behavior. In our own local privileged container setup, device files in /dev are not nearly as permissive by default.

The Azure Functions environment contains 52 “pmem” partitions with ext4 filesystems. At first, we suspected that these partitions belonged to other Azure Functions clients but further assessment showed that these partitions were just ordinary file systems used by the same operating system, including pmem0, which is the Docker host’s filesystem.

Reading the Azure Functions Docker host’s disk using debugfs

To further investigate how we could exploit this writable disk without potentially affecting other Azure customers, we set up a local environment imitating the vulnerability in a container of our own, together with the unprivileged user ‘bob’.

Exploiting Device File o+rw

On our local setup, /dev/sda5 is the root filesystem, and it is the one we target. Using the debugfs utility, an attacker can traverse the filesystem, as we successfully demonstrated above. debugfs also supports a write mode via the -w flag, so we can commit changes to the underlying disk. Note that writing to a mounted disk is generally a bad idea, as it can corrupt the disk.
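As an illustration of the read-only traversal step, here is a small Python sketch that wraps debugfs. The helper names are hypothetical. It relies on the debugfs -R option, which executes a single request and exits, and it deliberately omits -w so the filesystem is opened read-only. The device path /dev/sda5 is the one from our local setup; adjust it for yours.

```python
import subprocess
from typing import List


def debugfs_readonly_argv(device: str, request: str) -> List[str]:
    """Build the argv for a single read-only debugfs request.

    No -w flag is passed, so debugfs opens the filesystem read-only.
    """
    return ["debugfs", "-R", request, device]


def run_debugfs(device: str, request: str) -> str:
    """Run one debugfs request and return its standard output.

    Requires debugfs (e2fsprogs) and read permission on `device`.
    """
    result = subprocess.run(
        debugfs_readonly_argv(device, request),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # List the root directory, then dump a file, straight from the block device.
    print(run_debugfs("/dev/sda5", "ls -l /"))
    print(run_debugfs("/dev/sda5", "cat /etc/passwd"))
```

Keeping the argv construction separate from execution makes it easy to inspect exactly what would be run before touching a real device.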

Exploit Through Direct Filesystem Editing

To demonstrate how an attacker can change any arbitrary file, we wanted to gain control over /etc/passwd. At first, we tried to edit the file’s contents with the zap_block command, which directly edits the contents of filesystem blocks. Internally, the Linux kernel treats these as changes to the *device file* /dev/sda5, and they are write-cached in a different location than changes to the *regular file* /etc/passwd. As a result, the changes must be flushed to disk, but this flush is handled by the debugfs utility (for more information on this mechanism, refer to Understanding the Linux Kernel, pages 601-602).

Overwriting /etc/passwd content with ‘A’ (0x41) using debugfs

Similarly, the Linux kernel keeps a read cache of pages that were recently loaded into memory. Unfortunately, due to the same constraint we explained for the write cache, changes to /dev/sda5 do not propagate to the view of the /etc/passwd file until its cached pages are discarded. This meant we could only overwrite files that had not recently been loaded from disk into memory, or else had to wait for a system restart for our changes to apply.

Royal Flush

After further research, we found a way to instruct the kernel to discard the read cache so that our zap_block changes could take effect. First, we created a hard link via debugfs into our container’s diff directory so that changes would propagate to our container.

This hard link still requires root permissions to edit, so we still had to use zap_block to edit its content. We then used posix_fadvise to instruct the kernel to discard pages from the read cache (flush them, hence the name of the technique), inspired by a project named pagecache management (source: fadv.c slightly edited by us). This caused the kernel to load our changes and we were finally able to propagate them to the Docker host filesystem:

Flushing the read cache

/etc/passwd in the Docker host filesystem. We can see the ‘AAA’ string after flushing
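The cache-discarding step can be sketched in Python through os.posix_fadvise, which wraps the same posix_fadvise call. This is a minimal illustration rather than the fadv.c program we used; the function name is hypothetical, and it assumes a Linux system. The advice is only a hint, so the kernel is not guaranteed to drop every page.

```python
import os


def drop_cached_pages(path: str) -> None:
    """Ask the kernel to discard cached pages for the file at `path`."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # Dirty pages are not discarded by DONTNEED, so sync first.
        os.fsync(fd)
        # offset=0, len=0 means "from the start to the end of the file".
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)


if __name__ == "__main__":
    import sys

    for target in sys.argv[1:]:
        drop_cached_pages(target)
```

After the call, the next read of the file has to fetch its pages from the underlying disk again, which is what lets changes made through the device file become visible.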

Summary

By being able to edit arbitrary files belonging to the Docker host, an attacker can launch a preload hijack by making similar changes to /etc/ld.so.preload and serving a malicious shared object through the container’s diff directory. This file could be preloaded into every process on the Docker host system (we previously documented HiddenWasp malware using this technique), and thus the attacker would be able to execute malicious code on the Docker host. To sum things up, the PoC for the exploit is:

Afterword

We demonstrated this vulnerability to the Microsoft Security Response Center (MSRC). Their assessment is that this behavior has no security impact on Azure Functions users. Since the Docker host we probed was actually a Hyper-V guest, it was protected by another sandboxing layer. No matter how hard you work to secure your own code, sometimes vulnerabilities are unknown or out of your control. You should have runtime protection in place to detect and terminate unauthorized code when an attacker executes it in your production environment. This Zero Trust mentality is echoed by Microsoft.
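As one small example of a defensive check related to the preload hijack described in the Summary, the Python sketch below lists the entries in /etc/ld.so.preload so they can be reviewed. It is an illustration only, not a substitute for runtime protection. The function names are hypothetical, and the parsing rules (entries separated by whitespace or colons, with ‘#’ starting a comment) are our assumption about the file format.

```python
import re
from typing import List

PRELOAD_PATH = "/etc/ld.so.preload"


def parse_preload(text: str) -> List[str]:
    """Return the shared-object entries found in ld.so.preload content."""
    # Assumption: '#' starts a comment that runs to the end of the line.
    without_comments = re.sub(r"#[^\n]*", " ", text)
    # Assumption: entries are separated by whitespace or colons.
    return [item for item in re.split(r"[\s:]+", without_comments) if item]


def read_preload_entries(path: str = PRELOAD_PATH) -> List[str]:
    """Read and parse the preload file; a missing file means no entries."""
    try:
        with open(path, "r", encoding="utf-8", errors="replace") as fh:
            return parse_preload(fh.read())
    except FileNotFoundError:
        return []


if __name__ == "__main__":
    for entry in read_preload_entries():
        print("preloaded into every process:", entry)
```

On many systems this file does not exist or is empty, so an unexpected entry is a reasonable trigger for further investigation.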

Paul Litvak

Paul is a malware analyst and reverse engineer at Intezer. He previously served as a developer in the Israel Defense Forces (IDF) Intelligence Corps for three years.
