The official PASS Blog is where you’ll find the latest blog posts from PASS community members and the PASS Board. Contributors share their thoughts and discuss a wide variety of topics spanning PASS and the data community.

Introducing Always Encrypted with Secure Enclaves in SQL Server 2019

There are several encryption features that have been around in SQL Server for a long time. For starters, column-level encryption has been available since SQL Server 2005. This type of encryption uses either certificates or symmetric keys such that, without the required certificate or key, encrypted data remains hidden from view. SQL Server 2008 (Enterprise Edition) added Transparent Data Encryption (TDE) to encrypt the entire database–again, using a special database encryption key–so that without that key, the entire database (and its backups) remains encrypted and completely inaccessible (though notably, data is still fully exposed in memory).

From a security standpoint, these older features do serve us well. However, they suffer from two significant drawbacks.

First, the very certificates and keys used for encryption are themselves stored in the database (or database server), which means that the database is always capable of decrypting the data (assuming the required certificates and keys are present). While this may be perfectly fine for an on-premise data center that you manage yourself, it’s a major problem if you want to move your data to the cloud. Because by giving up physical ownership of the database, you're also handing over the encryption keys and certificates to your cloud provider (Microsoft, in the case of Azure), empowering them to access your private data.

Another issue is the fact that these older features only encrypt data “at rest”–meaning that data is only encrypted after you call EncryptByKey or EncryptByCert, before it gets written to disk. Thus, other techniques (for example, SSL and TLS) are required to provide additional encryption “in flight” (as it travels across the network).

Enter Always Encrypted

To address these concerns, SQL Server 2016 introduced Always Encrypted. With this feature, data is encrypted not just at rest and in flight, but on the server as well. So Always Encrypted really means never decrypted anywhere other than the client. So, the cryptography keys themselves–which are essential for both encrypting and decrypting–are not stored in the database, nor are they ever shared with the database system. Those keys stay with you, the customer (even though you might end up storing them in Azure Key Vault, if both your applications and database run in the cloud).

With this approach, we effectively separate the clients who own the data, from the cloud providers who host it. Because the data is always encrypted, SQL Server (and the cloud hosting provider) is powerless to decrypt it. All it can do is serve it up in its encrypted state, so that the data is also encrypted in flight. Only when it arrives at the client can it be decrypted, on the client, by the client, who possesses the necessary keys. Likewise, when inserting or updating new data, that data gets encrypted immediately on the client, before it ever leaves the client, and remains encrypted in-flight all the way to the database server, where SQL Server can only store it in that encrypted state–it cannot decrypt it. This is classic GIGO (garbage in, garbage out), as far as SQL Server is concerned.

The initial version (“V1”) of Always Encrypted in SQL Server 2016 represented an important first step toward confidential computing. In SQL Server 2019, Always Encrypted has been greatly enhanced to work with secure enclaves to enable rich query processing over encrypted data. But first, I’ll describe the V1 implementation, which is pre-requisite knowledge before jumping into secure enclaves.

Always Encrypted V1 (SQL Server 2016)

Always Encrypted protects sensitive data on the server using cryptography keys available only to the client. Specifically, these include Column Encryption Keys and Column Master Keys.

Column Encryption Key (CEK)

For each column in each table that you want to encrypt, you create one or more Column Encryption Keys (CEKs). These are keys that encrypt your data using the SHA256 cryptography algorithm. You can create one CEK for each column you want to encrypt, or you can use the same CEK to encrypt multiple columns in multiple tables; it’s not necessarily one-to-one.

With a CEK in hand, data can be encrypted and decrypted in column(s) protected by that CEK. Thus, CEKs must be carefully guarded. They cannot be stored in the database, since that would defeat the whole idea of encrypting and decrypting exclusively on the client side. However, if the CEK itself is encrypted, then it’s safe to store in the database. And that’s where the Column Master Key (CMK) comes in.

Column Master Key (CMK)

Every CEK is encrypted by a CMK. Think of the CMK as a “key encrypting key” that encrypts the CEK which is a “data encrypting key.” The CMK is only ever available on the client side; for example, in the client machine’s Windows certificate store, or in the cloud using Azure Key Vault (AKV) with client credentials. Then, the encrypted CEKs (not the CEKs themselves) are stored in the database, along with the client-side path to the CMK (not the CMK itself).

As a result, the database has all the information that the client needs to perform encryption and decryption, but is itself powerless to perform these operations on its own. And that’s because the CEK is needed for cryptography operations; but the database only has a CEK that has been encrypted by the CMK (not the CEK itself). And it has only a client-side path to the CMK (not the CMK itself). Thus, Always Encrypted is a hybrid feature based on client-side encryption/decryption, and driven by server-side metadata.

Encrypting a Table

SQL Server Management Studio (SSMS) provides tooling to generate CMKs and CEKs, and has an Always Encrypted wizard that will migrate an existing (non-encrypted) table to a table with one or more encrypted columns.

For each column, you choose a CEK and an encryption type of deterministic or randomized. You must choose deterministic if you want to be able to query (equality only) or join on the column. This works because the same ciphertext (encrypted data) is always generated from the same clear text. Otherwise, you should choose randomized because it’s much more secure than deterministic. For example, deterministically encrypting a Boolean column yields only two distinct ciphertext values, making it easy for a hacker to distinguish true and false values. With random encryption, the same Boolean column will appear to have many different values, but cannot be queried against.

The wizard then creates a new table with matching schema, and with CEK and encryption type designations assigned accordingly to each column you selected. Then it transfers the rows into the new table, encrypting along the way. But remember, it’s the client (SSMS in this case) that’s performing the encryption–not SQL Server, which has no access to the CEK. All the data gets round-tripped through the wizard, which encrypts the selected column(s) using the CEK(s) and encryption type(s) that you selected. Finally, the wizard drops the old table, renames the new table, and the migration is complete.

Querying an Encrypted Table

On the client side, cryptography operations are transparently performed by the client driver, and is currently supported for ADO.NET, ODBC, JDBC, or PHP. Note that the ADO.NET driver supports Always Encrypted only for .NET Framework; it is not currently supported for .NET Core.

Here’s the workflow:

  1. The client includes “Column Encryption Setting=Enabled” in the connection string
  2. The server sends the encrypted CEK (yellow key with red lock) and the CMK path (red key in dotted border) back to the client.
  3. The client retrieves the CMK (red key) from the path (local computer certificate store or Azure Key Vault) and uses it to decrypt the CEK.
  4. The client encrypts the SSN with the CEK (yellow key), so that it can be queried by the server. This implies that deterministic encryption is being used on the SSN column, making it possible to query on it (equality only).
  5. The client issues a modified version of the query with ciphertext for the SSN. This is in-flight encryption, so anyone hacking the wire cannot see the SSN in clear text. Of course, TLS encrypts the wire anyway, but on the other end of the wire, DBAs still just see ciphertext, and memory dumps reveal nothing.
  6. The Name column returned to the client is also encrypted, but it could (should) be using randomized encryption if there is never a need to query on it. Again, anyone hacking the wire in this direction cannot see the Name in clear text.
  7. The client receives the encrypted name column, and decrypts it using the CEK.

With this basic implementation, data is encrypted not just at rest and in-flight, but “in-use” as well. That is, limited operations over encrypted data can be performed by SQL Server without requiring decryption. Notably, only equality comparison is allowed, supporting point lookups (like the SSN example), as well as JOIN, GROUP BY, and DISTINCT, but not much else. Furthermore, these operations only work with deterministically encrypted columns, which is less secure than randomly encrypted columns.

Always Encrypted with Secure Enclaves (SQL Server 2019)

In SQL Server 2019, Always Encrypted has been greatly enhanced to leverage secure enclaves. With secure enclaves, server-side processing of encrypted data is fully supported, including not just equality comparisons (previously possible only with deterministic encryption), but range queries, pattern matching (LIKE), and sorting–all over randomly encrypted data.

Furthermore, using secure enclaves, cryptography operations can be performed in place on the server. Encrypting existing data therefore does not require round-tripping the network (like the SSMS wizard with V1), which does not scale well for large amounts of data.

All this may seem like an impossible feat, given that SQL Server has no access to the keys needed for encryption and decryption. Yet SQL Server 2019 makes this possible by leveraging secure enclaves in conjunction with the base Always Encrypted functionality added in SQL Server 2016. So, data remains protected on the one hand, while at the same time, the ability to perform rich server-side computations over that data is preserved.

What is a Secure Enclave?

To understand how this magic works, you need to understand what an enclave is. Simply put, an enclave is a special region of the normal memory allocated to a process. This region of memory is isolated and protected not only from its containing process, but everything else on the entire machine. No other processes, and not even the almighty kernel itself, can access this region of memory. The enclave is essentially a black box that cannot be accessed even by highly privileged administrators.

Like any memory, an enclave can contain both code and data. However, code must be signed in a special way in order to be able to run in an enclave, and then that becomes the only code running on the machine that can access data contained inside the same enclave.

Several technologies are available today to provide the secure isolation of an enclave. This includes hardware-based solutions such as Intel Software Guard Extensions (SGX) and Arm TrustZone. Secure enclave isolation can also be powered by leveraging the machine’s hypervisor, such as virtualization-based security (VBS) in Windows Server 2019 and Windows 10 v1809. In the current SQL Server 2019 Preview, only VBS enclaves are supported, while the product team has already announced their roadmap to support SGX (there is no planned support for ARM TrustZone at this time).

An attacker attempting to access an enclave can easily open the WinDbg debugger, connect to the process that contains the enclave, and find the enclave memory. But the memory contents will not be visible to them. For example, attempting to view the contents of a VBS enclave reveals nothing but question marks:

Enclaves will never be exposed in a full memory dump and they are completely impervious to memory scanning attacks. This makes them an extremely attractive technology that can serve as a trusted execution environment for processing sensitive data.

Leveraging Secure Enclaves

For Always Encrypted, the goal with secure enclaves remains the same; protect sensitive data from highly privileged but unauthorized users (like DBAs and machine admins). By using secure enclaves, this level of protection can be maintained without compromising SQL Server’s ability to perform rich queries, and, encryption can be performed in-place on the server.

When SQL Server 2019 starts, it loads an enclave. This means that SQL Server is now a hosting process that contains an enclave, but SQL Server itself does not run in the enclave, nor can it access the enclave’s content. Rather, the enclave acts as an extension of the client-side trust boundary on the server machine; a trusted representative of the client within the SQL Server environment. Think of it as a foreign embassy. The embassy is physically located inside a foreign country. Yet within the perimeter of the embassy, only the laws of its native country apply, while the laws of the hosting foreign country do not. At the same time, it’s just a footstep to enter or exit the embassy, compared with the thousands of miles to travel back and forth between the countries.

The way to think of this is in terms of the Always Encrypted philosophy is, cryptography operations are still performed exclusively by the client, but not necessarily the client machine. Meaning, the enclave on the server machine in essence is the client. Critically, this means that the client and server can communicate without round-tripping the network, because client code is running inside the enclave as an extension of the client machine.

Enclave Attestation

But how does the client machine know that the enclave on the server machine can be trusted? How does it know that there isn’t malicious code running inside the enclave? This supreme level of trust is achieved by both the client and server machines negotiating through a third machine, called the attestation server.

As the name implies, the sole purpose of this server is to attest to the authenticity of the enclave. That is, it certifies to the client that the enclave on the server is running code that can be trusted. Only then does the client authorize the use of the enclave.

Once attestation succeeds, the client driver establishes a secure tunnel connection to client code running inside the enclave on the server machine. The client machine and the client code inside the enclave on the server both exchange a shared secret over this secure tunnel. This secret is used to encrypt a CEK on the client machine and send it to the enclave on the server machine. Inside the enclave–and only inside the enclave–the shared secret is used to decrypt the CEK.

Enabling Rich Query

Now the CEK is available inside the enclave, but still completely unavailable to SQL Server running inside the process that’s hosting the enclave. At this point, SQL Server can perform rich queries over encrypted data; for example, with support for pattern matching (LIKE), range comparisons, and sorting. And that’s because the client running in the enclave is so close at hand, and can be utilized for cryptography operations all on one machine, with no network activity.

When we ask SQL Server to execute a query that includes rich computations, it’s still powerless to process those portions of the query that operate over encrypted columns. So instead, SQL Server delegates these portions of the query over to the enclave, along with the encrypted data that needs to be examined for that particular operation (for example, a range comparison). The query engine injects the encrypted data into the enclave (which is effectively the same as passing it to the client without a network call) and asks it to perform the operation.

Inside the enclave, the CEK obtained from the client machine via the secure tunnel is used to decrypt the value and perform the operation. The result is then returned to the query engine, which then continues to process the rest of the query. Any additional query references to encrypted columns are similarly resolved by the enclave in-line with query execution. In this manner, encrypted data is decrypted on the fly and processed by the client running in the enclave, as needed, by the query engine on the server.

Conclusion

This article gave you a high-level overview of Always Encrypted with secure enclaves. Hopefully, you’re getting excited about these new possibilities for greater security in your SQL Server databases and applications. So, stay tuned for the next installment, where I’ll walk you through the environment set and test-drive Always Encrypted with secure enclaves, step by step. Until then, thanks for reading!

Lenni Lobel
About the author

Leonard Lobel (Microsoft MVP, Data Platform) is CTO and co-founder of Sleek Technologies, Inc., a New York-based development shop with an early adopter philosophy toward new technologies. He is also a principal consultant at Tallan, Inc., a Microsoft National Systems Integrator and Gold Competency Partner.

Programming since 1979, Lenni specializes in Microsoft-based solutions, with experience that spans a variety of business domains, including publishing, financial, wholesale/retail, health care, and e-commerce. Lenni has served as chief architect and lead developer for various organizations, ranging from small shops to high-profile clients. He is also a consultant, trainer, and frequent speaker at local group meetings, VSLive, SQL PASS, and other industry conferences.

Lenni has also authored several MS Press books and Pluralsight courses on SQL Server programming.

Please login or register to post comments.

Back to Top
cage-aids
cage-aids
cage-aids
cage-aids