General concepts to study Cryptography

Cryptography is a wide topic. Sometimes beginners don’t know where to start. In this article, I’ll explain the main concepts and some recommendations that will facilitate the understanding of this important area of knowledge.

What is cryptography?
Confusion and diffusion
Information protection in modern cryptography
Symmetric cryptography
Asymmetric cryptography
- Scenario 1
- Scenario 2
Why do we still use symmetric algorithms?
Digital Certificates and Public Key Infrastructure
- Example
Hash functions
- Examples of use
Summary

What is cryptography?

The word cryptology is derived from the Greek kryptos, which means hidden, and logos, which means word. In other words, cryptology is the study of hidden writing.

Cryptology is divided into 4 disciplines:

Cryptography
Cryptanalysis
Steganography
Steganalysis

As you can see above, Cryptography is one of the areas of knowledge studied in cryptology.

Cryptography encompasses the study of methods and techniques to secure information, so people without required permissions cannot have access to it.

Cryptanalysis is, in a sense, the opposite of cryptography. It studies how to recover information that was secured with cryptography without having the required permissions (for example, without knowing the key).

The focus of steganography is to hide information. A common example is hiding a message in a picture: you see the picture but not the message, because it is hidden. Notice that this is different from cryptography. With cryptography the text is protected: you can see the message, but you can’t understand it. Steganography is about hiding the existence of the message: if you can detect the message, then you can read it.
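
As a rough illustration of hiding a message in a picture, the sketch below stores each bit of the message in the least significant bit of one "pixel" value. The names hide, reveal and pixels are made up for this example, and the pixel data is just a list of numbers rather than a real image file.

```python
def hide(cover: bytes, message: bytes) -> bytes:
    """Hide `message` in the least significant bit of each cover byte."""
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("cover is too small for this message")
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit  # overwrite only the lowest bit
    return bytes(stego)


def reveal(stego: bytes, length: int) -> bytes:
    """Read back `length` bytes from the least significant bits."""
    out = bytearray()
    for i in range(length):
        value = 0
        for byte in stego[i * 8:(i + 1) * 8]:
            value = (value << 1) | (byte & 1)
        out.append(value)
    return bytes(out)


pixels = bytes(range(200, 256)) * 2   # 112 made-up grayscale pixel values
secret = b"hi there"
stego = hide(pixels, secret)

print(reveal(stego, len(secret)))                       # the hidden bytes
print(max(abs(a - b) for a, b in zip(pixels, stego)))   # largest change to any pixel
```

Each pixel value changes by at most 1, which is why the picture looks the same to the eye. Nothing here is encrypted: anyone who knows where to look can read the message.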

Steganalysis is the study of how to detect messages that were hidden using steganography.

To protect a message, cryptographic algorithms aim to achieve two properties in the text: confusion and diffusion.

Confusion and diffusion

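Confusion focuses on replacing the symbols of the original message with different ones (for instance, substituting “a” with “6”), so that the relationship between the original message, the key and the protected message is hard to see. Notice that a message can be text, audio, video, images, etc.

Diffusion refers to the goal of spreading the symbols of the original message across the protected one, so that an encrypted symbol does not keep the position of the original symbol and a small change in the message affects many symbols of the result.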

If we achieve both objectives, the protected/secured message will seem random, and it won’t have patterns that can facilitate obtaining the original message from the secured one, or the key that was used to protect the message.
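
The two ideas can be sketched with classic, insecure building blocks: a letter shift for confusion and a simple column transposition for diffusion. This is only a toy under those assumptions (the function names are made up here), and real cyphers mix many rounds of far stronger operations.

```python
def substitute(text: str, shift: int) -> str:
    """Confusion (toy): replace each letter with the one `shift` places later."""
    out = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("a") if ch.islower() else ord("A")
            out.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            out.append(ch)
    return "".join(out)


def transpose(text: str, columns: int) -> str:
    """Diffusion (toy): write the text in rows, read it out column by column."""
    return "".join(text[c::columns] for c in range(columns))


message = "attackatdawn"
step1 = substitute(message, 3)   # symbols change, positions stay
step2 = transpose(step1, 4)      # positions change, symbols stay

print(step1)  # dwwdfndwgdzq
print(step2)  # dfgwndwdzdwq
```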

The study of cryptography is divided into classic and modern cryptography. Classic cryptography groups the “manual methods”, e.g. the Caesar cypher, the scytale, etc.

Current solutions use modern cryptography. In this case, we are referring to methods that are applied through machines.

The change from classic to modern cryptography happened with the creation of the Enigma machine, which Germany used during the Second World War.

In this article, we will focus on concepts related to modern cryptography.

Information protection in modern cryptography

Modern cryptography focuses on protecting information so that it is reliable when it reaches the receiver. To achieve that, one or more of the following characteristics must be guaranteed:

Integrity: The information is not modified on its way from the sender to the receiver.
Availability: The information must be readable/legible for the person or machine it is addressed to.
Confidentiality: The information is readable only by the person/machine it was sent to.

There are other characteristics to consider, such as non-repudiation and the period during which the information is valid. These should also be guaranteed through cryptography but won’t be explained in this article.

To achieve those characteristics or goals, we can subdivide modern cryptography into these topics:

Symmetric cryptography
Asymmetric cryptography
Hash functions (These are cryptographic data integrity algorithms)

Symmetric cryptography

Symmetric cryptography groups the algorithms that we can use to encrypt and decrypt high volumes of information at a relatively low computational cost (the time needed to encrypt and decrypt). The main characteristic of these algorithms is that the same key is used to encrypt and decrypt the message.
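
The following sketch shows the "same key in both directions" idea with a toy cypher built from the Python standard library: a keystream derived with SHA-256 is XORed with the data. This construction and the names keystream and xor_with_keystream are assumptions made for illustration only. It is not a vetted algorithm, and real systems should use a reviewed implementation of a standard cypher such as AES.

```python
import hashlib
import secrets


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key, nonce and a block counter."""
    stream = b""
    counter = 0
    while len(stream) < length:
        stream += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return stream[:length]


def xor_with_keystream(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Encrypts or decrypts: XOR with the same keystream undoes itself."""
    return bytes(d ^ k for d, k in zip(data, keystream(key, nonce, len(data))))


key = secrets.token_bytes(32)    # the single shared secret key
nonce = secrets.token_bytes(16)  # public value, must be fresh for every message

ciphertext = xor_with_keystream(key, nonce, b"meet me at noon")
plaintext = xor_with_keystream(key, nonce, ciphertext)   # same key, same function
print(ciphertext.hex())
print(plaintext)
```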

We can measure the strength of encryption by the robustness of the algorithm (confusion and diffusion in the encrypted text) and the robustness of the key. Depending on the algorithm, the key must have certain characteristics (for example, sufficient length and randomness) for the encryption to be strong.
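
One characteristic that is easy to show is randomness. A minimal sketch, assuming a 256-bit key is wanted, uses Python's secrets module, which draws from the operating system's cryptographic random source:

```python
import secrets

key = secrets.token_bytes(32)   # 32 bytes = 256 bits of random key material
print(len(key) * 8, "bit key:", key.hex())
```

A key typed by a person or produced by a general-purpose random number generator is much easier to guess than one generated this way.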

It is important to clarify that the security of the encryption must not depend on keeping the algorithm secret. This is known as Kerckhoffs’s principle: only the key has to remain secret.

Symmetric cryptography algorithms are grouped into stream cyphers and block cyphers. The former encrypt the message bit by bit while it is being transmitted, and the latter divide the message into blocks and encrypt each block.

Stream cyphers are generally used in hardware with very limited memory or in communications with a high risk of information loss, while block cyphers are generally used in computers.
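
To make "divides the message into blocks" concrete, the sketch below pads a message to a multiple of 16 bytes (the AES block size) with PKCS#7-style padding and splits it into blocks. It only prepares the blocks and does not encrypt them; the helper names are chosen for this example.

```python
BLOCK_SIZE = 16  # bytes; AES works on 128-bit blocks


def pad(data: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """PKCS#7: append n bytes, each of value n, to reach a full block."""
    n = block_size - len(data) % block_size
    return data + bytes([n]) * n


def unpad(data: bytes) -> bytes:
    """Remove PKCS#7 padding, rejecting malformed input."""
    n = data[-1]
    if n < 1 or n > len(data) or data[-n:] != bytes([n]) * n:
        raise ValueError("invalid padding")
    return data[:-n]


def split_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Cut the padded data into equally sized blocks."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


message = b"block cyphers work on fixed-size chunks"
blocks = split_blocks(pad(message))
print(len(message), "bytes ->", len(blocks), "blocks of", BLOCK_SIZE, "bytes")
print(unpad(b"".join(blocks)) == message)
```

A block cypher would then transform each block with the key, usually chaining the blocks through a mode of operation so that equal blocks do not produce equal output.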

Find below some block cyphers that are currently considered secure.

Algorithm	Key length
AES	256 bit
GOST	256 bit

Symmetric cryptography solves the problem of achieving confidentiality in large volumes of data efficiently. There is a condition, though: the receiver and the sender must know the same key. If they are in distant places, there must be a secure way to share the key between the sender and the receiver. In cryptography, we call this the key exchange problem.

This problem is solved by using asymmetric cryptography.

Asymmetric cryptography

Asymmetric cryptography, or public-key cryptography, uses one key to encrypt a message and a different key to decrypt it. The pair of keys is generated together, and the two keys are closely related through mathematical functions that vary depending on the algorithm being used.

The main restriction between the two keys is that it must be computationally infeasible to calculate the private key from the public one, and a message encrypted with one key of the pair can only be decrypted with the other key of the pair.

The characteristics explained above allow you to use the keys in a variety of situations. Find below explanations of some ways you can use them.

The secret (private) key “s” is only known by the owner of the pair of keys, and the public key “p” is known by anyone who wants to communicate securely with the owner of the key.

Scenario 1

‘A’ wants to send a secret message ‘M’ to ‘B’ (only ‘B’ should be able to read the message).

‘A’ writes the message, encrypts it with B’s public key, and sends it to ‘B’.

When ‘B’ receives the message, ‘B’ uses its secret key to decrypt it and can then read the original message, which was meant only for ‘B’.

Public keys are known by everyone. But once a message is encrypted with a public key, it can only be decrypted with the corresponding private key. Remember that in asymmetric cryptography a pair of keys is always involved in the encryption/decryption process.

This is the way that we can solve the key exchange problem. We share public keys using an insecure communication channel, but the private keys remain secret.
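
Scenario 1 can be sketched with "textbook" RSA and deliberately tiny numbers. The primes 61 and 53 and the message 65 are picked only to keep the arithmetic readable. Real RSA uses moduli thousands of bits long plus a padding scheme, so treat this as an illustration of the idea, not as usable encryption. It assumes Python 3.8 or later for the modular inverse via pow.

```python
# B's key pair (toy sizes)
p, q = 61, 53
n = p * q                  # 3233, part of both keys
phi = (p - 1) * (q - 1)    # 3120
e = 17                     # public exponent: B publishes (n, e)
d = pow(e, -1, phi)        # private exponent: B keeps d secret

M = 65                     # the message, encoded as a number smaller than n

C = pow(M, e, n)           # A encrypts with B's public key
recovered = pow(C, d, n)   # B decrypts with its private key

print(d)                   # 2753
print(C, recovered)        # 2790 65
```

Anyone can compute C from the public values (n, e), but turning C back into M requires d, which never leaves ‘B’.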

Scenario 2

Another scenario is this one: ‘A’ wants to send a message to ‘B’, but ‘B’ has to be sure that the sender of the message is ‘A’.

‘A’ writes the message, encrypts it with its own private key, and sends it to ‘B’. Once ‘B’ receives the message, it uses ‘A’s’ public key to decrypt it.

Notice that in this case the process doesn’t guarantee confidentiality, because anyone with ‘A’s’ public key (that is, everyone) can read the message. However, only ‘A’ could have sent that message, because a message that decrypts correctly with ‘A’s’ public key must have been encrypted with ‘A’s’ private key.

This is the basis of digital signatures. Although the process is more complex (it combines other techniques such as hash functions), you can get an initial understanding with the previous explanation.
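
A minimal sketch of that idea, again with textbook RSA: ‘A’ hashes the message and applies the private key to the hash, and ‘B’ checks the result with the public key. The key here is built from two well-known Mersenne primes purely so the example runs without extra libraries. It is far too small and too structured to be secure, and real signature schemes add padding on top.

```python
import hashlib

# A's toy key pair, built from two known Mersenne primes (NOT secure)
p, q = 2**127 - 1, 2**107 - 1
n = p * q
e = 65537                            # public exponent
d = pow(e, -1, (p - 1) * (q - 1))    # private exponent (Python 3.8+)


def digest_as_int(message: bytes) -> int:
    """Hash the message and map the digest to a number below n."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n


def sign(message: bytes) -> int:
    return pow(digest_as_int(message), d, n)                # uses A's private key


def verify(message: bytes, signature: int) -> bool:
    return pow(signature, e, n) == digest_as_int(message)   # uses A's public key


message = b"pay 10 coins to B"
signature = sign(message)

print(verify(message, signature))                # True
print(verify(b"pay 99 coins to B", signature))   # False
```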

If asymmetric algorithms solve the key exchange problem, then why do we still use symmetric algorithms?

Why do we still use symmetric algorithms?

We still use symmetric algorithms because asymmetric algorithms rely on complex mathematical operations, so they are slow. They only work efficiently on small volumes of data, while symmetric algorithms are very fast even when encrypting large volumes of data.

Based on that, we combine both symmetric and asymmetric algorithms. For example, without going into details about cryptographic protocols: first, we generate a symmetric key to encrypt the data. Second, that key is securely exchanged between sender and receiver with an asymmetric algorithm.

Once the sender and the receiver both know the key, they encrypt and decrypt the messages between them with the symmetric algorithm. In short, the sender uses the asymmetric algorithm only to encrypt the symmetric key and share it with the receiver, and the bulk of the data is then protected with that shared symmetric key.

It is important to point out that each protocol has its own process. But in general, they follow the previous approach.
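
The general approach can be sketched as follows. Both building blocks are toys chosen so the example needs only the standard library: textbook RSA over two known Mersenne primes for the asymmetric part, and an XOR with a SHAKE-256 keystream for the symmetric part. Neither is what a real protocol uses, and the variable names are assumptions of this example.

```python
import hashlib
import secrets

# Receiver's toy RSA key pair (two known Mersenne primes; NOT secure)
p, q = 2**127 - 1, 2**107 - 1
n, e = p * q, 65537
d = pow(e, -1, (p - 1) * (q - 1))   # private exponent (Python 3.8+)


def xor_stream(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cypher: XOR the data with a SHAKE-256 keystream."""
    stream = hashlib.shake_256(key).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


message = b"a long message ... " * 1000

# Sender: pick a fresh symmetric key and wrap it with the receiver's public key
session_key = secrets.token_bytes(16)
wrapped_key = pow(int.from_bytes(session_key, "big"), e, n)   # slow part, tiny input
ciphertext = xor_stream(session_key, message)                 # fast part, large input

# Receiver: unwrap the symmetric key with the private key, then decrypt the data
unwrapped = pow(wrapped_key, d, n).to_bytes(16, "big")
plaintext = xor_stream(unwrapped, ciphertext)

print(unwrapped == session_key, plaintext == message)
```

The expensive asymmetric operation touches only 16 bytes, while the 19,000-byte message goes through the cheap symmetric step. In this toy a session key must be used for a single message only, since reusing it would repeat the keystream.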

An example of an asymmetric algorithm that is currently considered safe is the following:

Algorithm	Key length
RSA	4096 bit

Digital Certificates and Public Key Infrastructure

Asymmetric algorithms solve the key exchange problem, but a new problem arises. When using an asymmetric algorithm, there is a risk that someone will send a public key while pretending to be someone else, tricking the sender into encrypting the information with the impostor’s key. This type of attack is known in cryptography as a man-in-the-middle attack.

The next issue we should consider is the following: does the public key belong to the person that I think it belongs to? This is considered a trust issue.

Existing solutions to mitigate this risk are digital certificates and public key infrastructures.

Example

A digital certificate contains the public key of its owner (the sender, in the scenarios above) and some identifying information such as name, address, etc. When the certificate is sent to the receiver, the public key can be reviewed along with the other identification data.

The issue is that in most cases the receiver does not have prior knowledge of the legitimate public key of the service to which it is connecting or of the person who is sending the message. So, you don’t know whether to trust the digital certificate.

For this reason, Certification Authorities (CAs) were created as part of Public Key Infrastructures (PKI). These entities have the technical means and procedures to validate the identity of whoever requests a digital certificate. After validating it, they sign the certificate with their own private key and vouch for the veracity of the digital certificate.

The HTTPS protocol used by websites relies on this verification system. Web browsers already have a list of trusted certification authorities. When you connect to a website and its certificate is signed by one of these authorities, the browser accepts it without problems. Otherwise, it displays a warning because the certificate may be false.
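
The core of that check can be sketched in a few lines. Here a certificate is just a dictionary with a name, a public key placeholder and the authority's signature over both. Real certificates follow the X.509 format and real browsers validate a whole chain, so the field names, the placeholder strings and the toy RSA key below are all assumptions made for illustration.

```python
import hashlib

# Certification authority's toy RSA key pair (NOT secure)
p, q = 2**127 - 1, 2**107 - 1
CA_N, CA_E = p * q, 65537                  # public, shipped with the browser
CA_D = pow(CA_E, -1, (p - 1) * (q - 1))    # private, known only to the authority


def fingerprint(subject: str, public_key: str) -> int:
    """Hash the name together with the key it is bound to."""
    data = f"{subject}|{public_key}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % CA_N


def issue_certificate(subject: str, public_key: str) -> dict:
    """Authority side: sign the binding between a name and a public key."""
    signature = pow(fingerprint(subject, public_key), CA_D, CA_N)
    return {"subject": subject, "public_key": public_key, "signature": signature}


def browser_accepts(cert: dict) -> bool:
    """Browser side: check the signature with the authority's public key."""
    expected = fingerprint(cert["subject"], cert["public_key"])
    return pow(cert["signature"], CA_E, CA_N) == expected


cert = issue_certificate("example.org", "public-key-of-example.org")
forged = dict(cert, public_key="public-key-of-attacker")   # attacker swaps the key

print(browser_accepts(cert), browser_accepts(forged))      # True False
```

Because the attacker cannot produce a valid signature without the authority's private key, swapping the public key inside the certificate is detected.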

Figure: Digital certificate example.

Hash functions

It is also important to know how hash functions work while studying cryptography.

We can use these functions to generate a fixed-size output (the hash or digest), whose length is defined by the function, from an input message of any size. A hash function is not an encryption algorithm because there is no practical way to reverse the process and obtain the original message. In addition, the main characteristic of these functions is that a small variation in the original message generates a large variation in the output of the hash function.
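
Both properties are easy to observe with Python's hashlib. The two input messages below are invented for the example and differ in a single character.

```python
import hashlib

a = hashlib.sha3_512(b"transfer 100 to Alice").digest()
b = hashlib.sha3_512(b"transfer 900 to Alice").digest()

# Count how many of the output bits differ between the two digests
differing_bits = sum(bin(x ^ y).count("1") for x, y in zip(a, b))

print(len(a) * 8)                     # 512, regardless of the input length
print(a.hex()[:16], b.hex()[:16])     # the digests look unrelated
print(differing_bits)                 # typically close to half of the 512 bits
```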

Examples of use

Checking the integrity of a text: We calculate the message hash (the fixed-size output) and store it. To check whether the text has been modified, the hash is calculated again, and if it matches the original it means that it is the same text.
Creating a digital signature: We calculate the message hash, which is usually much smaller than the message, and what is signed is the (fixed-size) hash, to consume less time. To check whether the message has been modified, the hash is calculated again, and if it is the same as the signed one, then the message is correct. Notice that because the hash is signed, and it was created from the message, the message is also signed.
Storage of keys (passwords): As a rule, keys should not be stored as plain text, to reduce the risk of someone without permission getting access to them. Instead of storing the key, we store the hash of the key. When the user enters his/her key, the system calculates its hash and checks it against the one in the database, and if they are the same then the key is correct.
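
The first and third uses can be sketched with the standard library. One deliberate difference from the description above: for stored passwords the sketch uses PBKDF2 with a random salt instead of a single plain hash, which is a common hardening against guessing attacks. The sample strings and the iteration count are assumptions for this example.

```python
import hashlib
import hmac
import secrets

# Integrity check: store the digest, recompute it later and compare
document = b"contract v1: pay 100"
stored_digest = hashlib.sha3_512(document).hexdigest()
unchanged = hashlib.sha3_512(document).hexdigest() == stored_digest
tampered = hashlib.sha3_512(b"contract v1: pay 900").hexdigest() == stored_digest

# Password storage: keep a random salt and a slow, salted hash, never the password
ITERATIONS = 200_000   # assumed work factor; tune it for your hardware


def hash_password(password: str, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)


salt = secrets.token_bytes(16)
stored_hash = hash_password("correct horse", salt)   # what the database keeps


def check_password(attempt: str) -> bool:
    # compare_digest avoids leaking information through timing differences
    return hmac.compare_digest(hash_password(attempt, salt), stored_hash)


print(unchanged, tampered)                                              # True False
print(check_password("correct horse"), check_password("wrong guess"))   # True False
```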

Currently, the SHA-3 standard with a 512-bit output (SHA3-512) is considered secure.

Summary

In this article, I presented an introduction to cryptography. Each of the aforementioned topics is a field of study in its own right that we will explain in this blog. We hope this introduction has been useful to you and that it makes your study of cryptography easier.
