How encryption, digital signatures, and digital certificates work
More system administrators are turning to public key infrastructure (PKI) solutions as the trend of letting data flow freely past network boundaries becomes more prevalent. Most people associate PKI with encryption, but PKI isn’t just about encryption. It’s also about data integrity and authentication. So, before implementing a PKI solution, you need to understand how encryption, digital signatures, and digital certificates work together to secure and maintain the integrity and confidentiality of sensitive data.
Encryption is the process of turning legible clear text, which is referred to as plaintext, into incomprehensible ciphertext. In other words, you use cryptography to make the data you want to keep secret indecipherable to everyone except for the people with the necessary key to decrypt it.
Cryptography uses mathematical methods, sometimes referred to as ciphers or algorithms, to scramble data so that it can’t be easily read without the necessary key. A decryption key is usually a long random number that you must possess to decrypt a given piece of data using the same algorithm with which the data was encrypted.
There are several types of encryption, including symmetric and asymmetric. In symmetric encryption, shared keys are used to encrypt and decrypt data. The encryption and decryption keys can be identical or one key can be easily derived from the other. Although symmetric encryption is computationally fast, it requires that the key be exchanged between the sender and recipient. If the key is compromised during transit, the encrypted data can be read by the person in possession of the key.
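The shared-key idea can be sketched in a few lines of Python. This is a toy construction for illustration only: the keystream derived from SHA-256 and the names `keystream` and `xor_cipher` are assumptions made for this example, not a real standard such as AES.

```python
import hashlib
import secrets


def keystream(key: bytes, length: int) -> bytes:
    """Stretch the shared key into `length` pseudo-random bytes (toy construction)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt: XOR with the keystream is its own inverse."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


shared_key = secrets.token_bytes(32)          # both parties must hold this key
plaintext = b"Quarterly results are confidential."
ciphertext = xor_cipher(shared_key, plaintext)
recovered = xor_cipher(shared_key, ciphertext)

# Someone with a different key gets only noise.
wrong_key_result = xor_cipher(secrets.token_bytes(32), ciphertext)
```

The same key and the same function both encrypt and decrypt, which is why the key must reach the recipient without being intercepted.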
Asymmetric encryption, which PKI implements, involves two keys: a public key and a private key. As Figure 1 shows, the process starts when a sender uses a public key to encrypt a message. The sender can request a public key from the intended recipient or download it from a public directory or website. Only the intended recipient can decrypt the message with its corresponding private key. Although slower than symmetric encryption, asymmetric encryption doesn’t require a secure key exchange.
Figure 1: Asymmetric encryption process
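As a rough illustration of the public/private key relationship, here is a minimal textbook-RSA sketch. The primes, the constant names, and the absence of padding are all simplifications assumed for this example; a production system would use a vetted library and much stronger parameters.

```python
# Toy RSA key pair. The primes are two well-known Mersenne primes, chosen
# only so the example needs no third-party library. Never use keys like
# this for real data.
P = 2**61 - 1
Q = 2**89 - 1
PUBLIC_N = P * Q                                   # modulus, part of both keys
PUBLIC_E = 65537                                   # public exponent
PRIVATE_D = pow(PUBLIC_E, -1, (P - 1) * (Q - 1))   # private exponent (Python 3.8+)


def encrypt(message: bytes, e: int, n: int) -> int:
    """Sender side: encrypt with the recipient's public key (e, n)."""
    m = int.from_bytes(message, "big")
    if m >= n:
        raise ValueError("message too long for this toy key")
    return pow(m, e, n)


def decrypt(ciphertext: int, d: int, n: int, length: int) -> bytes:
    """Recipient side: decrypt with the private exponent d."""
    return pow(ciphertext, d, n).to_bytes(length, "big")


message = b"Meet at noon"
ciphertext = encrypt(message, PUBLIC_E, PUBLIC_N)
recovered = decrypt(ciphertext, PRIVATE_D, PUBLIC_N, len(message))
```

Anyone may know `PUBLIC_E` and `PUBLIC_N`; only the holder of `PRIVATE_D` can reverse the encryption.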
Symmetric and asymmetric encryption are often used together. An asymmetric cipher is used to encrypt a session key (i.e., a symmetric key intended for use in a given exchange of data), and the session key itself is used to encrypt the message. The recipient decrypts the session key with its private key, then uses the session key to decrypt the message. This approach, which is referred to as bulk encryption, provides the security of asymmetric encryption with the speed of a symmetric cipher.
The length of the key is an important factor in bulk and asymmetric encryption. It’s mathematically possible, although computationally very expensive, to derive a private key from only its public key. Therefore, as computing power constantly improves, you should assume that the encrypted data will be secure for only a limited amount of time. The longer the key, the more time your data should remain secure. However, longer keys are more processor intensive, so you need to strike a balance between security and speed.
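The combination can be sketched as follows: a random session key protects the message, and the recipient's public key protects the session key. Everything here is a toy under stated assumptions: textbook RSA without padding, Mersenne primes, a SHA-256-based keystream, and the hypothetical helper names `send` and `receive`.

```python
import hashlib
import secrets

# Recipient's toy RSA key pair (illustrative parameters only).
P, Q = 2**61 - 1, 2**89 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # Python 3.8+


def stream_xor(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR the data with a SHA-256-derived keystream."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def send(message: bytes, recipient_e: int, recipient_n: int):
    """Encrypt the message with a fresh session key, then wrap that key."""
    session_key = secrets.token_bytes(16)  # 128 bits, smaller than N
    wrapped_key = pow(int.from_bytes(session_key, "big"), recipient_e, recipient_n)
    return wrapped_key, stream_xor(session_key, message)


def receive(wrapped_key: int, body: bytes, d: int, n: int) -> bytes:
    """Unwrap the session key with the private key, then decrypt the body."""
    session_key = pow(wrapped_key, d, n).to_bytes(16, "big")
    return stream_xor(session_key, body)


message = b"A long message is cheap to encrypt with a symmetric cipher." * 20
wrapped_key, body = send(message, E, N)
recovered = receive(wrapped_key, body, D, N)
```

Only the short session key goes through the slow asymmetric operation; the message itself, however long, is handled by the fast symmetric cipher.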
The length of a shared key is also an important factor in symmetric encryption. For information about the key lengths in symmetric and asymmetric encryption standards, see the sidebar “Common Encryption and Hash Standards.”
Public key cryptography can be used to issue messages with a digital signature. As with a handwritten signature, this seal of approval enables a message’s receiver to verify that the information did in fact come from a given sender. Digital signatures are much more reliable than handwritten signatures, as it’s very difficult to produce a fake digital signature. In addition, the integrity of the message content is guaranteed.
A hash is used to ensure message integrity; in other words, it lets the recipient detect whether the message has been modified in transit. Hash algorithms analyze a message, then generate a small, fixed-length code (hash or message digest) that acts as the message’s fingerprint. Changing a message without changing its hash is extremely difficult. Because a digest is shorter than most messages, two different messages can in principle share the same hash, but a sound hash algorithm makes such collisions computationally infeasible to find.
Hash algorithms produce message digests that form part of the digital signature sent with a message. As Figure 2 shows, the process begins when the sender uses an algorithm to generate a hash of the original data to form a message digest. The sender then uses its private key to encrypt the message digest, which produces the digital signature, and sends the message together with the signature to the recipient. The recipient generates its own hash of the message using the same algorithm. The recipient decrypts the original message digest sent with the message using the sender’s public key and compares the two digests. If they’re identical, the message hasn’t been tampered with in transit, and the signature must have been created with the sender’s private key.
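A short example using Python's standard `hashlib` module shows how even a one-character change produces a different digest. SHA-256 is chosen here only as a representative algorithm, and the sample messages are made up.

```python
import hashlib

original = b"Transfer $100 to account 12345"
tampered = b"Transfer $900 to account 12345"  # a single character changed

digest_original = hashlib.sha256(original).hexdigest()
digest_tampered = hashlib.sha256(tampered).hexdigest()

# Hashing the same bytes again always yields the same digest.
digest_repeat = hashlib.sha256(original).hexdigest()

print(digest_original)
print(digest_tampered)
```

The digests have the same length regardless of the input size, and comparing them is enough to tell that the two messages differ.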
Figure 2: Data integrity process
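The sign-and-verify flow can be sketched with textbook RSA and SHA-256. The key parameters, the lack of a padding scheme, and the function names `sign` and `verify` are assumptions made for illustration; real signature schemes add padding and use properly generated keys.

```python
import hashlib

# Sender's toy RSA key pair (Mersenne primes, for illustration only).
P, Q = 2**127 - 1, 2**521 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent, Python 3.8+


def message_digest(message: bytes) -> int:
    """Hash the message and return the digest as an integer."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big")


def sign(message: bytes, d: int, n: int) -> int:
    """Sender: transform the digest with the private key."""
    return pow(message_digest(message), d, n)


def verify(message: bytes, signature: int, e: int, n: int) -> bool:
    """Recipient: recover the digest with the public key and compare."""
    return pow(signature, e, n) == message_digest(message)


message = b"Purchase order 7781: 40 units"
signature = sign(message, D, N)

is_valid = verify(message, signature, E, N)
is_valid_after_tampering = verify(b"Purchase order 7781: 90 units", signature, E, N)
```

Verification needs only the public values `E` and `N`, so any recipient can check the signature, but only the holder of `D` could have produced it.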
Digital certificates are electronic documents that contain:
- A public key
- Information about the purposes for which the certificate can be used (e.g., server authentication, email encryption)
- Start and end validity dates
- Identity information about the individual or organization using the certificate
- A digital signature to attest that the identity information provided corresponds with the included public key
Digital certificates are usually distributed in the standard X.509 format.
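To make the list of fields concrete, here is a simplified in-memory model of a certificate. It is not the X.509 encoding; the class name, field names, and sample values are hypothetical and exist only to mirror the items listed above.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Certificate:
    subject: str            # identity of the certificate holder
    issuer: str             # who vouches for that identity
    public_key: int         # placeholder for the real key material
    usages: tuple           # purposes the certificate may be used for
    not_before: datetime    # start of validity
    not_after: datetime     # end of validity
    signature: int          # issuer's signature over the other fields

    def is_within_validity(self, when: datetime) -> bool:
        """True if `when` falls between the start and end dates."""
        return self.not_before <= when <= self.not_after

    def permits(self, usage: str) -> bool:
        """True if the certificate was issued for the given purpose."""
        return usage in self.usages


# Made-up sample values for illustration.
cert = Certificate(
    subject="www.example.com",
    issuer="Example Issuing CA",
    public_key=0xC0FFEE,
    usages=("server authentication",),
    not_before=datetime(2024, 1, 1, tzinfo=timezone.utc),
    not_after=datetime(2025, 1, 1, tzinfo=timezone.utc),
    signature=0xABCDEF,
)
```

A client would typically check both the validity dates and the permitted usage before relying on the public key.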
A Certification Authority (CA) is a trusted entity that confirms the identities of individuals and organizations that are using digital certificates, much in the same way that one government relies on the passport authority of another country to validate its citizens’ identities. For instance, if you require a digital certificate for a public-facing web server for data encryption and server authentication, you can approach a CA to confirm your organization’s identity and send information that only your company can provide. Client OSs usually come supplied with the root CA certificates of the most commonly used public CAs (e.g., Thawte, VeriSign), enabling the OS (and the applications that run on it) to trust certificates that those CAs issue. If you require authentication inside your organization only, you can install and manage your own CA.
CA systems consist of several components, including a registration authority and a validation authority. The registration authority is responsible for proving the identity of entities that require a certificate. It’s also responsible for revoking certificates, approving requests to renew expiring certificates, and providing a new key for an existing certificate (i.e., re-key a certificate).
The validation authority is used to provide real-time assurance that a certificate is valid. This can be done by checking certificate revocation lists (CRLs) or using the Online Certificate Status Protocol (OCSP), which I’ll discuss shortly. First, though, I want to bring up the topic of self-signed certificates.
Because public keys for asymmetric encryption are usually distributed using digital certificates, organizations often use a CA to manage this process. Technically, using a CA isn’t required, as server applications can usually generate self-signed certificates without a CA. However, because no trusted third party vouches for a self-signed certificate, you can’t easily use one to prove the identity of internal resources to devices outside of your organization. Self-signed certificates are recommended only for test or lab scenarios, as they are difficult to manage.
CRLs and OCSP
Occasionally certificates are issued in error and need to be invalidated, or they need to be invalidated for some other reason. This process is called certificate revocation. Each CA has a CRL that contains information about previously issued certificates that have yet to expire but are no longer valid.
The primary drawback of CRLs is that a large CA might need to revoke many certificates. Consequently, the CRL can grow quite large. When checking the status of a certificate, client OSs must retrieve the CRL in its entirety, which becomes bandwidth intensive. A delta CRL—a CRL that lists only the certificates revoked since the last complete (or base) CRL was issued—can help ease the problem. However, it doesn’t provide the ideal solution because it, too, must be retrieved in its entirety.
OCSP is a protocol, typically carried over HTTP, that uses minimal bandwidth to perform certificate status checks, as opposed to the clients downloading a CRL. OCSP determines certificate status by requesting information about a single certificate, so the volume of data returned to the client doesn't increase if the number of revoked certificates increases. Starting in Windows Server 2008 and Windows Vista, OCSP is enabled by default in Microsoft Internet Explorer (IE). The issuing certificate server must also support OCSP and configure certificates appropriately.
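The bandwidth difference between the two approaches can be modeled with a small sketch. The `ToyCA` class and its methods are hypothetical and ignore real message formats and signatures; they only contrast "download the whole list" with "ask about one serial number."

```python
class ToyCA:
    """Hypothetical CA that tracks revoked certificate serial numbers."""

    def __init__(self):
        self._revoked = set()

    def revoke(self, serial: int) -> None:
        self._revoked.add(serial)

    def publish_crl(self) -> list:
        """CRL style: the client receives every revoked serial number."""
        return sorted(self._revoked)

    def ocsp_status(self, serial: int) -> str:
        """OCSP style: the client receives the status of one certificate."""
        return "revoked" if serial in self._revoked else "good"


ca = ToyCA()
for serial in range(1000, 6000):      # the CA revokes 5,000 certificates
    ca.revoke(serial)

# CRL check: fetch the full list, then search it locally.
crl = ca.publish_crl()
crl_says_revoked = 4242 in crl

# OCSP check: one question, one short answer.
ocsp_answer = ca.ocsp_status(4242)
ocsp_answer_for_valid = ca.ocsp_status(9999)
```

The CRL response grows with every revocation, while the OCSP response stays the same size no matter how many certificates the CA has revoked.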
Chain of Trust
At some point, it becomes impractical for one CA to validate and issue certificates to every entity that requires one. Therefore, root CAs can grant subordinate CAs the right to issue certificates. This system creates a root/subordinate hierarchy.
The private key of a root CA certificate is used to sign the certificate of subordinate CAs. As long as a subordinate CA certificate is signed by the root CA certificate, certificates issued by the subordinate CA are valid within the hierarchy.
In the example of a web browser, root CA certificates are shipped with the client OS and provide a direct line of trust to public CAs, such as VeriSign and Thawte. If the CA that issued a certificate isn’t directly trusted, the certificate chain must include a CA that’s directly trusted.
Here’s how the validation process works: The client OS checks the certificate’s Issuer field to see which CA issued the certificate. Using the public key of the issuer’s subordinate or root CA certificate, the client OS decrypts the digital signature of the certificate to be validated in order to read the signature’s hash. The client OS then generates a second hash for the certificate to be validated and compares it to the hash from the decrypted signature. If both match, the certificate is considered valid.
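The validation steps can be sketched as a walk up the issuer chain. This is a toy model under several assumptions: certificates are plain dictionaries, signatures use textbook RSA over a SHA-256 hash, keys come from Mersenne primes, and the helper names (`make_keypair`, `issue`, `validate`) are invented for this example.

```python
import hashlib

E = 65537  # public exponent shared by all toy keys


def make_keypair(p: int, q: int):
    """Return (modulus, private exponent) for a toy RSA key (Python 3.8+)."""
    return p * q, pow(E, -1, (p - 1) * (q - 1))


def cert_hash(subject: str, issuer: str, subject_n: int) -> int:
    data = f"{subject}|{issuer}|{subject_n}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def issue(subject, subject_n, issuer, issuer_n, issuer_d) -> dict:
    """Issuer signs the hash of the certificate fields with its private key."""
    signature = pow(cert_hash(subject, issuer, subject_n), issuer_d, issuer_n)
    return {"subject": subject, "issuer": issuer, "n": subject_n, "signature": signature}


def validate(cert: dict, intermediates: dict, trusted_roots: dict, max_depth: int = 5) -> bool:
    """Check each signature with the issuer's public key until a trusted root is reached."""
    for _ in range(max_depth):
        issuer_cert = trusted_roots.get(cert["issuer"]) or intermediates.get(cert["issuer"])
        if issuer_cert is None:
            return False                      # unknown issuer, no path to trust
        recovered = pow(cert["signature"], E, issuer_cert["n"])
        if recovered != cert_hash(cert["subject"], cert["issuer"], cert["n"]):
            return False                      # hashes differ, signature invalid
        if cert["issuer"] in trusted_roots:
            return True                       # chain ends at a directly trusted CA
        cert = issuer_cert                    # otherwise validate the issuer next
    return False


root_n, root_d = make_keypair(2**127 - 1, 2**521 - 1)
sub_n, sub_d = make_keypair(2**107 - 1, 2**607 - 1)
server_n, _ = make_keypair(2**61 - 1, 2**89 - 1)

root_cert = issue("Root CA", root_n, "Root CA", root_n, root_d)        # self-signed
sub_cert = issue("Subordinate CA", sub_n, "Root CA", root_n, root_d)
server_cert = issue("www.example.com", server_n, "Subordinate CA", sub_n, sub_d)

trusted_roots = {"Root CA": root_cert}
intermediates = {"Subordinate CA": sub_cert}

chain_ok = validate(server_cert, intermediates, trusted_roots)
forged = dict(server_cert, subject="www.attacker.example")
forged_ok = validate(forged, intermediates, trusted_roots)
```

The server certificate is accepted only because each signature checks out and the chain ends at a root the client already trusts; changing any signed field breaks the match.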
The Big Picture
Let’s take a look at how all the pieces fit into a PKI solution. SSL encryption is commonly used by websites and web browsers to verify the authenticity of a web server and encrypt data in transit over the public Internet. Transport Layer Security (TLS) is the successor to SSL (it was derived from SSL 3.0) and is commonly used for secure Internet transactions.
When a browser initiates communication, the web server and client OS first negotiate algorithm support. The server defaults to using the strongest standards that both the client OS and server support. The server then identifies itself by sending its public key in the form of a digital certificate. The client OS determines whether it trusts that certificate by checking the installed root CA certificates, checking the certificate’s dates of validity, and making sure the certificate hasn’t been revoked. Modern OSs, such as Windows Vista and later, can also perform validation over the Internet using OCSP.
After validating the server’s identity, the client OS creates a symmetric encryption key by generating a random number and encrypts it with the server’s public key. The client OS then sends the encrypted key (the encrypted random number) to the server, which the server decrypts with its own private key. The new symmetric encryption key can then be used by the client OS and server to encrypt and decrypt message data.
The process of validating identities and exchanging a symmetric encryption key is known as a handshake. Once completed, encrypted message data is sent between the two parties.
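From an application's point of view, the handshake is usually handled by a library. The sketch below uses Python's standard `ssl` module; the function name `fetch_server_certificate` is an assumption for this example, and calling it requires network access to a real host.

```python
import socket
import ssl


def fetch_server_certificate(hostname: str, port: int = 443) -> dict:
    """Perform a TLS handshake and report what was negotiated."""
    context = ssl.create_default_context()   # loads the system's trusted root CAs
    with socket.create_connection((hostname, port), timeout=10) as sock:
        # wrap_socket runs the handshake; it raises an ssl.SSLError subclass
        # if the server's certificate cannot be validated.
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return {
                "protocol": tls.version(),        # negotiated protocol version
                "cipher": tls.cipher(),           # negotiated cipher suite
                "certificate": tls.getpeercert()  # validated server certificate
            }


# The default context already enforces the checks described above.
context = ssl.create_default_context()
requires_valid_certificate = context.verify_mode == ssl.CERT_REQUIRED
checks_hostname = context.check_hostname
```

With the default context, the connection fails unless the server's certificate chains to a trusted root, is within its validity dates, and matches the requested hostname.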