Cryptography Cheat Sheet for Developers

This is a quick reference to cryptography for developers, just enough to tell when and how to use what. Click the reference links (and the links there (and the links there…)) to chase the rabbit.

RULE No.1 of cryptography: DO NOT implement or invent a new cipher yourself!!!

Haskell is the programming language for the examples since it is concise, interactive, type-safe, and has cryptonite! Actually, Haskell was chosen because I learnt most of the content in this article while implementing age in Haskell for my new experimental project https://github.com/jcouyang/dhall-secret/pull/1 (PRs welcome, btw :)

The Sheet

Category         Algorithm/Standard  Variants/Schemes
Encoding/Format  Base                Base16, Base32, Base64
                 X.509
                 JWT                 JWS, JWE, JWK
                 PEM
                 PKCS                PKCS1, PKCS5, PKCS8, PKCS12/PFX
Hash             MD5
                 SHA                 SHA1, SHA2, SHA3
                 BLAKE2              Blake2b, Blake2s
MAC              HMAC
                 Poly1305
KDF              PBKDF2
                 HKDF
                 Bcrypt
                 Scrypt
Symmetric        AES                 AES-GCM, AES-GCM-SIV
                 ChaCha20            ChaCha20-Poly1305
Asymmetric       ECC                 ECDSA, EdDSA, ECDH, X25519, Ed25519
                 RSA

Prerequisites

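All you need is nix!

All code in the examples can be run in the interactive GHCi REPL.

Please run the following to get into GHCi and configure it before trying any example. Several cryptonite modules export the same names (generate, encrypt, finalize, …), so start a fresh session for each section to avoid ambiguity errors.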

nix-shell -p "haskellPackages.ghcWithPackages (pkgs: [ pkgs.cryptonite pkgs.memory ])" --run ghci
GHCi, version 9.0.2: https://www.haskell.org/ghc/  :? for help
ghci> :set -XOverloadedStrings
ghci> import Data.ByteString (ByteString)

The openssl CLI is used in a few examples as well.

Encoding

Encoding is not Encryption!!! It just converts bytes from one format to another, to make them easier to store, transmit, etc. Although the result looks scrambled, anyone can convert it back and forth.

Base

Most developers are very familiar with base64 https://datatracker.ietf.org/doc/html/rfc4648

ghci> import Data.ByteArray.Encoding
ghci> convertToBase Base64 ("hello world" :: ByteString) :: ByteString
"aGVsbG8gd29ybGQ="
ghci> convertFromBase Base64 ("aGVsbG8gd29ybGQ=" :: ByteString) :: Either String ByteString
Right "hello world"
ghci> convertToBase Base16 ("hello world" :: ByteString) :: ByteString
"68656c6c6f20776f726c64"
ghci> convertFromBase Base16 ("68656c6c6f20776f726c64" :: ByteString) :: Either String ByteString
Right "hello world"

The number 64 or 16 is the size of the alphabet. The larger the alphabet, the shorter the encoded message.

For instance, base64 has an alphabet of 64 characters (plus = for padding, so 65 in total), hence each character can describe one of 2^6 values, aka 6 bits.

base16 is basically just hex, since each character maps to one of 2^4 values (4 bits).

The following example shows how the message "hel" becomes base64 data (see the alphabet table in https://datatracker.ietf.org/doc/html/rfc4648#section-9 )

Input:   h        e        l
Hex:     6   8    6   5    6   c  
8-bit:   01101000 01100101 01101100
6-bit:   011010 000110 010101 101100
Decimal: 26     6      21     44     
Output:  a      G      V      s      
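
The regrouping in the table above can be sketched with nothing but bit operations. This is a minimal illustration, not how a real library does it: encodeTriple is a hypothetical helper that handles exactly three 8-bit characters and ignores padding.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Char (ord)

-- The 64-character alphabet from RFC 4648 (value 0 is 'A', value 63 is '/').
alphabet :: String
alphabet = ['A' .. 'Z'] ++ ['a' .. 'z'] ++ ['0' .. '9'] ++ "+/"

-- Encode exactly three 8-bit characters as four base64 characters.
-- Hypothetical helper for illustration: no padding, no input validation.
encodeTriple :: Char -> Char -> Char -> String
encodeTriple a b c = map pick [18, 12, 6, 0]
  where
    -- Pack the three bytes into one 24-bit number.
    n = (ord a `shiftL` 16) .|. (ord b `shiftL` 8) .|. ord c
    -- Cut out the 6-bit group that starts at the given bit offset.
    pick s = alphabet !! ((n `shiftR` s) .&. 63)
```

Loaded into GHCi, encodeTriple 'h' 'e' 'l' gives "aGVs", the same four characters as the Output row above.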

Privacy-Enhanced Mail(PEM) https://datatracker.ietf.org/doc/html/rfc1421

PEM is the baseline format for most public key encryption material. It contains:

  • Boundary
  • Headers
  • Body

Example PEM:

-----BEGIN PRIVACY-ENHANCED MESSAGE-----
Proc-Type: 4,ENCRYPTED
Content-Domain: RFC822
DEK-Info: DES-CBC,F8143EDE5960C597
Originator-ID-Symmetric: linn@zendia.enet.dec.com,,
Recipient-ID-Symmetric: linn@zendia.enet.dec.com,ptf-kmc,3
Key-Info: DES-ECB,RSA-MD2,9FD3AAD2F2691B9A,
          B70665BB9BF7CBCDA60195DB94F727D3
Recipient-ID-Symmetric: pem-dev@tis.com,ptf-kmc,4
Key-Info: DES-ECB,RSA-MD2,161A3F75DC82EF26,
          E2EF532C65CBCFF79F83A2658132DB47

LLrHB0eJzyhP+/fSStdW8okeEnv47jxe7SJ/iN72ohNcUk2jHEUSoH1nvNSIWL9M
8tEjmF/zxB+bATMtPjCUWbz8Lr9wloXIkjHUlBLpvXR0UrUzYbkNpk0agV2IzUpk
J6UiRRGcDSvzrsoK+oNvqu6z7Xs5Xfz5rDqUcMlK1Z6720dcBWGGsDLpTpSCnpot
dXd/H5LMDWnonNvPCwQUHt==
-----END PRIVACY-ENHANCED MESSAGE-----

Public-Key Cryptography Standards (PKCS) https://datatracker.ietf.org/doc/html/rfc5958

There are a lot of PKCS #X standards. The most familiar ones are probably those used for RSA keys, since you may have seen such a key quite often (something like server.key) when updating a website's certificate.

PKCS #8

PKCS #8 is usually used as the syntax of an unencrypted RSA private key.

You can simply generate an RSA key via openssl:

nix-shell -p openssl
[nix-shell:/tmp]$ openssl genpkey -algorithm rsa -out test.key
[nix-shell:/tmp]$ cat test.key
-----BEGIN PRIVATE KEY-----
MIIEvwIBADANBgkqhkiG9w0BAQEFAASCBKkwggSlAgEAAoIBAQC6LU2ZNdy32+HL
...
c581/XSSIu1kZpptICNGM4MiDJyGoysNX7417wXgwr8YEb6fbMAMGjjYKbF9BlpY
yRdkNiEmIKL4/ZQoTLdyQR4vJQ==
-----END PRIVATE KEY-----
[nix-shell:/tmp]$ openssl rsa -in test.key -noout -text
RSA Private-Key: (2048 bit, 2 primes)
modulus:
    00:e9:6a:68:ab:7b:73:f0:14:72:24:e5:35:f1:c2:
    ...
publicExponent: 65537 (0x10001)
privateExponent:
    00:c0:6f:a1:11:d7:ba:f2:f0:f8:56:20:be:c3:ad:
    ...
prime1:
    00:fb:d7:d5:fd:2c:b5:b2:cd:92:b0:ea:60:83:29:
    ...
prime2:
    ...
exponent1:
    00:e5:8f:16:15:92:9d:85:00:71:c8:25:bc:17:92:
    ...
exponent2:
    3e:6e:01:ad:b7:63:36:96:90:f9:ed:38:c4:10:bf:
    ...
coefficient:
    00:89:e1:69:2b:78:97:a9:91:88:39:7a:75:08:f0:
    ...

The output is readable text, but the original PKCS #8 file is in Abstract Syntax Notation One (ASN.1) https://www.itu.int/en/ITU-T/asn1/Pages/introduction.aspx syntax and DER encoded.

A private key is too long to use as an example, so let's use the public key to explain.

PKCS #1

An RSA public key can be generated from the private key:

[nix-shell:/tmp]$ openssl rsa -in test.key -outform PEM -pubout
writing RSA key
-----BEGIN PUBLIC KEY-----
MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEA6Wpoq3tz8BRyJOU18cJK
D+4lVGBz94lpRLfAAK3mjEWbIbeQ/uElnyGbq0Fe+XRMBdVpI5B5PQbM8kf6sNYv
n7BM+dRVq1LuRGmxjL/i+CW7VRWiIZxHWNF/eCaqn3j2hij81NK22m13eFMSOELo
76G6TDtEbv5rqJdhrJw6BlCbslHXNr4rT+q0R2ajricbY/xig/bz6mOetjgxoL6X
WiuJibtAYqGa7+iQse1icFz3SWCwwZjYE46uW1rUI7iyugRBhdVMiypPDj00wdak
77NiaiFw91Vl1EfZo09b8ztcSjBKWeE0tte8Iy5+AhKsC59/hE2wIFj5TnxVE4JC
kwIDAQAB
-----END PUBLIC KEY-----

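The public key output is ASN.1 syntax, DER encoded, in PEM format. Strictly speaking, a -----BEGIN PUBLIC KEY----- block is an X.509 SubjectPublicKeyInfo structure, which wraps the PKCS #1 RSAPublicKey together with an algorithm identifier.

PKCS #1 represents an RSA public key in ASN.1 as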

RSAPublicKey:

      RSAPublicKey ::= SEQUENCE {
          modulus           INTEGER,  -- n
          publicExponent    INTEGER   -- e
      }

You can read the same fields (modulus and public exponent) from the key with

[nix-shell:/tmp]$ openssl rsa -in test.key -pubout -out test.pem
[nix-shell:/tmp]$ openssl rsa -in test.pem -pubin -noout -text
RSA Public-Key: (2048 bit)
Modulus:
    00:e9:6a:68:ab:7b:73:f0:14:72:24:e5:35:f1:c2:
    4a:0f:ee:25:54:60:73:f7:89:69:44:b7:c0:00:ad:
    e6:8c:45:9b:21:b7:90:fe:e1:25:9f:21:9b:ab:41:
    5e:f9:74:4c:05:d5:69:23:90:79:3d:06:cc:f2:47:
    fa:b0:d6:2f:9f:b0:4c:f9:d4:55:ab:52:ee:44:69:
    b1:8c:bf:e2:f8:25:bb:55:15:a2:21:9c:47:58:d1:
    7f:78:26:aa:9f:78:f6:86:28:fc:d4:d2:b6:da:6d:
    77:78:53:12:38:42:e8:ef:a1:ba:4c:3b:44:6e:fe:
    6b:a8:97:61:ac:9c:3a:06:50:9b:b2:51:d7:36:be:
    2b:4f:ea:b4:47:66:a3:ae:27:1b:63:fc:62:83:f6:
    f3:ea:63:9e:b6:38:31:a0:be:97:5a:2b:89:89:bb:
    40:62:a1:9a:ef:e8:90:b1:ed:62:70:5c:f7:49:60:
    b0:c1:98:d8:13:8e:ae:5b:5a:d4:23:b8:b2:ba:04:
    41:85:d5:4c:8b:2a:4f:0e:3d:34:c1:d6:a4:ef:b3:
    62:6a:21:70:f7:55:65:d4:47:d9:a3:4f:5b:f3:3b:
    5c:4a:30:4a:59:e1:34:b6:d7:bc:23:2e:7e:02:12:
    ac:0b:9f:7f:84:4d:b0:20:58:f9:4e:7c:55:13:82:
    42:93
Exponent: 65537 (0x10001)

So far all these PKCS formats are UNENCRYPTED, they are just encoded in a certain format.

There is also a common standard for storing and exchanging certs and keys that is encrypted: PKCS #12 Personal Information Exchange Syntax https://datatracker.ietf.org/doc/html/rfc7292 , aka PFX. This is the common format when you get a new cert.

JSON Web Token(JWT) https://datatracker.ietf.org/doc/html/rfc7519

JSON Web Token (JWT) is a compact, URL-safe means of representing claims to be transferred between two parties

A JWT is either a JWS or a JWE.

JSON Web Signature (JWS) https://datatracker.ietf.org/doc/html/rfc7515

JWS is commonly used in OIDC https://openid.net/specs/openid-connect-core-1_0.html as id_token and sometimes access_token too.

The message is NOT ENCRYPTED, so anyone can actually see the claims in the JSON.

BASE64URL(UTF8(JWS Protected Header)) || '.' ||
BASE64URL(JWS Payload) || '.' ||
BASE64URL(JWS Signature)

An example of JWS (with line breaks for display purposes only):

eyJ0eXAiOiJKV1QiLA0KICJhbGciOiJIUzI1NiJ9
.
eyJpc3MiOiJqb2UiLA0KICJleHAiOjEzMDA4MTkzODAsDQogImh0dHA6Ly9leGFt
cGxlLmNvbS9pc19yb290Ijp0cnVlfQ
.
dBjftJeZ4CVP-mB92K27uhbUJU1p1r_wW1gFWFOEjXk

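The JWS Signature is basically a MAC (e.g. HS256) or a digital signature (e.g. RS256) over BASE64URL(UTF8(JWS Protected Header)) || '.' || BASE64URL(JWS Payload).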
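
To see for yourself that a JWS is only encoded, the sketch below splits a compact JWS on the dots and base64url-decodes the first two parts. peekJws is a hypothetical name, and it assumes the Base64URLUnpadded encoding from the memory package; it does not check the signature at all.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.ByteArray.Encoding (Base (Base64URLUnpadded), convertFromBase)
import Data.ByteString (ByteString)
import qualified Data.ByteString.Char8 as C

-- Split a compact JWS into header, payload and signature, then decode
-- the header and payload. No key is needed, which is the whole point.
-- (Hypothetical helper: the signature is ignored, not verified.)
peekJws :: ByteString -> Either String (ByteString, ByteString)
peekJws token = case C.split '.' token of
  [h, p, _signature] ->
    (,) <$> convertFromBase Base64URLUnpadded h
        <*> convertFromBase Base64URLUnpadded p
  _ -> Left "expected three dot-separated parts"
```

Applied to the example above (joined into a single line), the first component decodes to the JSON header {"typ":"JWT", "alg":"HS256"} (with a line break after the comma), readable by anyone who holds the token.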

JSON Web Encryption (JWE) https://datatracker.ietf.org/doc/html/rfc7516

As the name indicates, JWE can be used to encrypt messages. A JWE is in the following format, where Ciphertext is the encrypted message.

BASE64URL(UTF8(JWE Protected Header)) ||
      '.' || BASE64URL(JWE Encrypted Key) || '.' || BASE64URL(JWE
      Initialization Vector) || '.' || BASE64URL(JWE Ciphertext) || '.'
      || BASE64URL(JWE Authentication Tag)

Example (with line breaks for display purposes only):

eyJhbGciOiJSU0EtT0FFUCIsImVuYyI6IkEyNTZHQ00ifQ.
OKOawDo13gRp2ojaHV7LFpZcgV7T6DVZKTyKOMTYUmKoTCVJRgckCL9kiMT03JGe
ipsEdY3mx_etLbbWSrFr05kLzcSr4qKAq7YN7e9jwQRb23nfa6c9d-StnImGyFDb
Sv04uVuxIp5Zms1gNxKKK2Da14B8S4rzVRltdYwam_lDp5XnZAYpQdb76FdIKLaV
mqgfwX7XWRxv2322i-vDxRfqNzo_tETKzpVLzfiwQyeyPGLBIO56YJ7eObdv0je8
1860ppamavo35UgoRdbYaBcoh9QcfylQr66oc6vFWXRcZ_ZT2LawVCWTIy3brGPi
6UklfCpIMfIjf7iGdXKHzg.
48V1_ALb6US04U3b.
5eym8TW_c8SuK0ltJ3rpYIzOeDQz7TALvtu6UG9oMo4vpzs9tX_EFShS8iB7j6ji
SdiwkIr3ajwQzaBtQD_A.
XFBoMYUZodetZdvTiFvSkQ

Hash Function

A hash function maps bytes to a fixed-length digest ONE WAY only: there is no way to map the digest back. Common hash functions are SHA2, SHA3, MD5, Blake2… Modern hash functions such as SHA2, SHA3 and Blake2 are considered secure. Old functions such as MD5 and SHA1 are not secure, since collisions have been found, and should be avoided.

Hash functions are commonly used to prove that content has not been tampered with. For example, if you download an executable file from the internet, you should compare the hash provided by the site with the one calculated locally. Once collisions can be found, the function is not secure anymore: someone could hijack the download and replace the content with malware that hashes to the same value.

ghci> import Crypto.Hash
ghci> hash ("hello world"::ByteString) :: Digest SHA1
2aae6c35c94fcfb415dbe95f408b9ce91ee846ed
ghci> hash ("hello world"::ByteString) :: Digest MD5
5eb63bbbe01eeed093cb22bb8f5acdc3
ghci> hash ("hello world"::ByteString) :: Digest SHA256
b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9
ghci> hash ("hello world"::ByteString) :: Digest SHA3_256
644bcc7e564373040999aac89e7622f3ca71fba1d972fd94a31c3bfbf24e3938
ghci> hash ("hello world"::ByteString) :: Digest Blake2s_256
9aec6806794561107e594b1f6a8a6b0c92a0cba9acf5e5e93cca06f781813b0b
ghci> hash ("hello world"::ByteString) :: Digest Blake2b_256
256c83b297114d201b30179f3f0ef0cace9783622da5974326b436178aeef610

The number 256 in the SHA and Blake names is the output length in bits. More bits usually means higher collision resistance.
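
The download check described above can be sketched as a one-line comparison. matchesSha256 is a hypothetical helper; it relies on show rendering a Digest as lowercase hex, as in the GHCi output above, so the published digest is assumed to be lowercase hex too.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Crypto.Hash (Digest, SHA256, hash)
import Data.ByteString (ByteString)

-- Compare the SHA256 of some bytes with a published hex digest.
-- (Hypothetical helper: expects the published digest in lowercase hex.)
matchesSha256 :: String -> ByteString -> Bool
matchesSha256 publishedHex content =
  show (hash content :: Digest SHA256) == publishedHex
```

With the digest from the session above, matchesSha256 "b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9" "hello world" is True, and changing a single byte of the content makes it False. For a real file, the bytes would come from Data.ByteString.readFile.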

Hashing is NOT encryption!!! DO NOT store a plain hash of a password in the database. Although a hash function is not reversible, with a large enough dictionary I can definitely tell from the database that the password behind 5eb63bbbe01eeed093cb22bb8f5acdc3 is hello world.

There is a worked example of Blake2b for "abc", and a C implementation, in RFC 7693 https://datatracker.ietf.org/doc/html/rfc7693#appendix-A

Message Authentication Code(MAC)

A MAC is basically a hash function + a key.

For example, HMAC SHA256 scrambles the message with a key and hashes the result with SHA256.

ghci> import Crypto.MAC.HMAC
ghci> import Crypto.Hash
ghci> hmacGetDigest $ hmac ("secret key"::ByteString) ("hello world"::ByteString) :: Digest SHA256
c61b5198df58639edb9892514756b89a36856d826e5d85023ab181b48ea5d018
ghci> hmacGetDigest $ hmac ("secret key"::ByteString) ("hello world"::ByteString) :: Digest Blake2b_256
198e317eba56eee5056b88f527c895d6235ace9153fdf6467e38c2758073328c

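The scramble part is defined in RFC 2104 https://datatracker.ietf.org/doc/html/rfc2104 , where H is the hash function (e.g. SHA256), K is the secret key padded with zeros to B bytes, B is the block size of H in bytes (64 for SHA256), and , is concatenation: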

ipad = the byte 0x36 repeated B times
opad = the byte 0x5C repeated B times
H(K XOR opad, H(K XOR ipad, text))
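
The formula above can be spelled out for SHA256 and compared with the library. This is only a sketch to make the construction concrete (RULE No.1 still applies, use hmac in real code); hmacSha256 and libraryHmac are hypothetical names, and the key handling follows RFC 2104: a key longer than B is hashed first, then padded with zeros to B bytes.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Crypto.Hash (Digest, SHA256, hash)
import Crypto.MAC.HMAC (hmac, hmacGetDigest)
import Data.Bits (xor)
import qualified Data.ByteArray as BA
import Data.ByteString (ByteString)
import qualified Data.ByteString as B

-- B in the formula: the block size of SHA256 in bytes.
blockSize :: Int
blockSize = 64

-- H in the formula, returning raw bytes instead of a Digest.
sha256 :: ByteString -> ByteString
sha256 bs = BA.convert (hash bs :: Digest SHA256)

-- H(K XOR opad, H(K XOR ipad, text)), written out by hand.
hmacSha256 :: ByteString -> ByteString -> ByteString
hmacSha256 key text = sha256 (opadKey <> sha256 (ipadKey <> text))
  where
    -- Keys longer than one block are hashed first (RFC 2104).
    k0 | B.length key > blockSize = sha256 key
       | otherwise                = key
    -- Pad the key with zero bytes up to the block size.
    k = k0 <> B.replicate (blockSize - B.length k0) 0
    ipadKey = B.map (xor 0x36) k
    opadKey = B.map (xor 0x5c) k

-- The same MAC computed by cryptonite, for comparison.
libraryHmac :: ByteString -> ByteString -> ByteString
libraryHmac key text =
  BA.convert (hmacGetDigest (hmac key text) :: Digest SHA256)
```

hmacSha256 "secret key" "hello world" == libraryHmac "secret key" "hello world" should evaluate to True, which is a quick way to convince yourself that HMAC really is just two nested hashes over a padded key.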

A MAC can be used in scenarios like:

  • Exchanging private messages: append a MAC of the message to prove it has not been tampered with. This is very similar to the usage of a hash function, but a plain hash is mainly used for public messages, for example a file from a public website that everyone can download.
  • Pseudo Random Generator (PRG): HMAC(salt, seed) generates a key that is random enough to be used in a KDF.

Key Derivation Function(KDF)

A KDF is a function that generates a pseudo-random key from a password. A password is what we usually use to encrypt a file or log in to a website, because it is easy for a human to remember or note down. But it is not random enough to use directly as an encryption key, and it is not secure to store in a database.

You can think of a KDF as just a MAC, but run for many iterations so that it consumes some CPU and RAM.

Password Based Key Derivation Function (PBKDF2) https://datatracker.ietf.org/doc/html/rfc2898

The following is an example of PBKDF2 using HMAC SHA256, with 1000 iterations and an output length of 32 bytes.

ghci> import Crypto.KDF.PBKDF2
ghci> generate (prfHMAC SHA256 :: PRF ByteString) (Parameters {iterCounts = 1000, outputLength = 32}) ("password":: ByteString) ("salt"::ByteString) :: ByteString
"c,(\DC2\228mF\EOT\DLE+\167a\142\157m}/\129(\246&kJ\ETX&M*\EOT`\183\220\179"

The output is a 32-byte pseudo-random bytestring. We can print it in hex format with base16 encoding:

ghci> convertToBase Base16 $ (generate (prfHMAC SHA256 :: PRF ByteString) (Parameters {iterCounts = 1000, outputLength = 32}) ("password":: ByteString) ("salt"::ByteString) :: ByteString) :: ByteString
"632c2812e46d4604102ba7618e9d6d7d2f8128f6266b4a03264d2a0460b7dcb3"

It is safe to store the parameters (salt, iteration count, output length) together with the output bytes in the database. In a scenario such as login, the server can run the same function again with the salt, iterations and length from the record, and compare the output bytes with the ones stored in the database.

Since PBKDF2 hashes each password with HMAC and a random salt over many iterations, it is resistant to dictionary attacks https://datatracker.ietf.org/doc/html/rfc4949#page-102 .
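
The login flow described above might look like the sketch below. Stored, enroll and verify are hypothetical names, not part of cryptonite. The salt is passed in as an argument here, whereas a real system would generate a fresh random salt per user (e.g. with getRandomBytes), and 1000 iterations only mirrors the example above; real deployments use far more.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Crypto.Hash (SHA256 (..))
import Crypto.KDF.PBKDF2 (PRF, Parameters (..), generate, prfHMAC)
import qualified Data.ByteArray as BA
import Data.ByteString (ByteString)

-- What the database row holds: salt, parameters and the derived bytes.
-- The password itself is never stored. (Hypothetical record.)
data Stored = Stored
  { storedSalt   :: ByteString
  , storedParams :: Parameters
  , storedKey    :: ByteString
  }

-- Same call as in the GHCi example: PBKDF2 with HMAC SHA256.
derive :: Parameters -> ByteString -> ByteString -> ByteString
derive params password salt =
  generate (prfHMAC SHA256 :: PRF ByteString) params password salt

-- Run once at sign-up.
enroll :: ByteString -> ByteString -> Stored
enroll salt password = Stored salt params (derive params password salt)
  where
    params = Parameters { iterCounts = 1000, outputLength = 32 }

-- Run at every login: derive again with the stored salt and parameters,
-- then compare in constant time.
verify :: Stored -> ByteString -> Bool
verify (Stored salt params key) attempt =
  BA.constEq key (derive params attempt salt)
```

verify (enroll "salt" "password") "password" is True, while any other attempt is False. constEq is used instead of == so that the comparison time does not depend on how many leading bytes match.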

PBKDF2 is a common KDF, but it is considered less secure than modern KDFs such as Scrypt and Argon2.

Scrypt https://datatracker.ietf.org/doc/html/rfc7914

The following is an example of deriving a 32-byte key with cost parameter N = 1024, block size r = 8 and parallelization p = 2.

ghci> import Crypto.KDF.Scrypt
ghci> generate (Parameters {n=1024,r=8,p=2,outputLength=32}) ("password":: ByteString) ("salt"::ByteString) ::ByteString
"\ETBeHl\244\197Y\DEL\181\&0\141\SYN\185\151\148\215\211\160\189.\148d\185\172\177\202\&2\ETX\SUB\133\223\237"

HMAC-based Extract-and-Expand Key Derivation Function (HKDF) https://datatracker.ietf.org/doc/html/rfc5869

ghci> import Crypto.KDF.HKDF
ghci> import Crypto.Hash (SHA256)
ghci> let prk = extract ("salt" :: ByteString) ("secret" :: ByteString) :: PRK SHA256
ghci| in expand prk ("payload" :: ByteString) 32 :: ByteString
"\DC4\147\223\v%\175\f\177\143\132\202\142\233\236\135\153\253\CANs\213wh\149\193\128\240\192t\DC1\UST,"

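A typical use of the two HKDF steps is to extract one pseudo-random key from a secret and then expand it several times with different info strings, so that each purpose gets its own key. The sketch below assumes that use case; subkeys and the info labels "encryption" and "authentication" are made up for illustration.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Crypto.Hash (SHA256)
import Crypto.KDF.HKDF (PRK, expand, extract)
import Data.ByteString (ByteString)

-- Derive two independent 32-byte keys from one secret.
-- (Hypothetical helper: the info labels are arbitrary.)
subkeys :: ByteString -> ByteString -> (ByteString, ByteString)
subkeys salt secret =
  ( expand prk ("encryption" :: ByteString) 32
  , expand prk ("authentication" :: ByteString) 32
  )
  where
    -- Extract step: concentrate the secret into a fixed-size PRK.
    prk = extract salt secret :: PRK SHA256
```

subkeys "salt" "secret" returns two different 32-byte keys, and the same inputs always give the same pair, so both sides of a protocol can derive them independently.
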
Symmetric Ciphers

AES

AES is a block cipher. Its modes of operation, such as CTR and GCM below, require an initialization vector (IV), aka nonce.

Counter Mode(CTR)

The following is an example of AES in CTR mode with a random key and an all-zero IV.

In practice the key should be generated by one of the secure KDFs, and the IV should be random and never reused with the same key.

ghci> import Crypto.Random
ghci> import Crypto.Cipher.Types
ghci> import Crypto.Cipher.AES (AES256)
ghci> import Crypto.Error
ghci> do
ghci| cipher <- (getRandomBytes 32 :: IO ByteString) >>= (throwCryptoErrorIO . cipherInit) :: IO AES256
ghci| return $ ctrCombine cipher nullIV ("message"::ByteString)
ghci| 
"\208\207\SI\191\206\DELN"

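Because CTR simply XORs the data with a keystream, the same call both encrypts and decrypts, and reusing a key and IV pair leaks the XOR of the two plaintexts. The sketch below shows both properties. It deliberately uses a fixed all-zero key and nullIV so the behaviour is reproducible, which is exactly what you must not do in practice; demoCipher, ctr, roundTrips and leaksXor are hypothetical names.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Crypto.Cipher.AES (AES256)
import Crypto.Cipher.Types (cipherInit, ctrCombine, nullIV)
import Crypto.Error (throwCryptoError)
import Data.Bits (xor)
import Data.ByteString (ByteString)
import qualified Data.ByteString as B

-- A fixed all-zero key, for a reproducible demo only.
demoCipher :: AES256
demoCipher = throwCryptoError (cipherInit (B.replicate 32 0 :: ByteString))

-- Encrypt or decrypt: in CTR mode it is the same operation.
ctr :: ByteString -> ByteString
ctr = ctrCombine demoCipher nullIV

-- Applying the keystream twice gives the message back.
roundTrips :: ByteString -> Bool
roundTrips msg = ctr (ctr msg) == msg

-- With a reused key and IV, XORing two ciphertexts cancels the
-- keystream and reveals the XOR of the two plaintexts.
leaksXor :: ByteString -> ByteString -> Bool
leaksXor a b = xorBytes (ctr a) (ctr b) == xorBytes a b
  where
    xorBytes x y = B.pack (B.zipWith xor x y)
```

Both roundTrips "message" and leaksXor "message" "another" are True. The second one is the reason the IV must never repeat under the same key.
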
Galois/Counter Mode https://csrc.nist.gov/publications/detail/sp/800-38d/final Synthetic Initialization Vector (GCM-SIV) https://datatracker.ietf.org/doc/html/rfc8452

CTR only provides confidentiality. GCM adds Authenticated Encryption with Associated Data (AEAD) https://datatracker.ietf.org/doc/html/rfc5116 on top of it, and SIV makes it nonce misuse-resistant.

AEAD basically binds extra data, or context, to the ciphertext and generates a MAC, aka authentication tag, so that the receiver can verify the ciphertext's integrity (not tampered with) and authenticity (not cut-and-pasted into another context).

The following is an example of AES-GCM-SIV encryption of "message" with additional data "context" and a random nonce.

ghci> :set -XScopedTypeVariables
ghci> import Crypto.Cipher.AESGCMSIV
ghci> do
ghci|   key :: ByteString <- getRandomBytes 32
ghci|   nonce <- generateNonce
ghci|   throwCryptoErrorIO $ do
ghci|     aes :: AES256 <- cipherInit key
ghci|     return $ encrypt aes nonce ("context" :: ByteString) ("message" :: ByteString)
ghci| 
(AuthTag {unAuthTag = "\239|\229V\USNT3\ACKf\NAK\STXC\251\134\FS"},"\149\229\142SW\209Z")
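
Decryption is where the authentication tag pays off. The sketch below encrypts as above, then decrypts twice: once with the right additional data and once with a different one. It assumes decrypt from Crypto.Cipher.AESGCMSIV takes the same arguments as encrypt plus the tag and returns a Maybe; roundTrip is a hypothetical name.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Crypto.Cipher.AES (AES256)
import Crypto.Cipher.AESGCMSIV (decrypt, encrypt, generateNonce)
import Crypto.Cipher.Types (cipherInit)
import Crypto.Error (throwCryptoErrorIO)
import Crypto.Random (getRandomBytes)
import Data.ByteString (ByteString)

-- Encrypt "message" bound to "context", then decrypt it with the
-- right context and with a wrong one.
roundTrip :: IO (Maybe ByteString, Maybe ByteString)
roundTrip = do
  key <- getRandomBytes 32 :: IO ByteString
  nonce <- generateNonce
  aes <- throwCryptoErrorIO (cipherInit key) :: IO AES256
  let (tag, ct) = encrypt aes nonce ("context" :: ByteString) ("message" :: ByteString)
      -- Same key, nonce and additional data: the tag verifies.
      good = decrypt aes nonce ("context" :: ByteString) ct tag
      -- Different additional data: the tag check fails.
      bad = decrypt aes nonce ("other context" :: ByteString) ct tag
  return (good, bad)
```

Running roundTrip should give (Just "message", Nothing): the ciphertext only opens in the context it was sealed for.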

ChaCha20

ChaCha20 is a high-speed stream cipher, a variant of Salsa20, usually combined with Poly1305 as an AEAD construction.

ChaCha20-Poly1305 https://datatracker.ietf.org/doc/html/rfc8439

ChaCha20-Poly1305 requires very similar inputs to AES:

  • a 32-byte (256-bit) key, which can be derived from a password with a secure KDF
  • a 12-byte (96-bit) nonce, aka IV

There are more steps to encrypt a message because ChaCha20 is a stream cipher that carries state between calls. A block cipher such as AES encrypts fixed-length blocks, while a stream cipher produces a keystream that is combined with the data, and the state left after one chunk is used to generate the keystream for the next chunk.

  • additional data (AAD) must be appended and finalized before encrypting, and cannot be modified later on
  • encrypt can be called multiple times, each call continuing from the current state
  • finalizing a state generates the auth tag

ghci> import Crypto.Error
ghci> import Crypto.Cipher.ChaChaPoly1305
ghci> do
ghci|   key <- getRandomBytes 32 :: IO ByteString
ghci|   nonce <- getRandomBytes 12 :: IO ByteString
ghci|   throwCryptoErrorIO $ do
ghci|     st1 <- nonce12 nonce >>= initialize key
ghci|     let
ghci|       st2 = finalizeAAD $ appendAAD ("context":: ByteString) st1
ghci|       (out, st3) = encrypt ("message":: ByteString) st2
ghci|       auth = finalize st3
ghci|     return $ (convertToBase Base16 out :: ByteString, convertToBase Base16 auth :: ByteString)
("f0dd593fb3cac0","4a29dd7ae8b51ac748b37092ed485e88")
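
Decryption walks through the same states: initialize, absorb the AAD, decrypt, finalize, and then compare the freshly computed tag with the one that came with the ciphertext. The sketch below wraps both directions; start, seal and open are hypothetical helper names around the cryptonite calls used above.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Crypto.Cipher.ChaChaPoly1305 as C
import Crypto.Error (CryptoFailable, throwCryptoError)
import Crypto.MAC.Poly1305 (Auth)
import Data.ByteString (ByteString)
import qualified Data.ByteString as B

-- Initialize the state and absorb the additional data.
start :: ByteString -> ByteString -> ByteString -> CryptoFailable C.State
start key nonce aad = do
  st <- C.nonce12 nonce >>= C.initialize key
  return (C.finalizeAAD (C.appendAAD aad st))

-- Encrypt and produce the auth tag.
seal :: ByteString -> ByteString -> ByteString -> ByteString
     -> CryptoFailable (ByteString, Auth)
seal key nonce aad msg = do
  st <- start key nonce aad
  let (ct, st') = C.encrypt msg st
  return (ct, C.finalize st')

-- Decrypt, and only release the plaintext if the tag matches.
open :: ByteString -> ByteString -> ByteString -> (ByteString, Auth)
     -> CryptoFailable (Maybe ByteString)
open key nonce aad (ct, tag) = do
  st <- start key nonce aad
  let (pt, st') = C.decrypt ct st
  return (if C.finalize st' == tag then Just pt else Nothing)
```

With a 32-byte key and a 12-byte nonce, sealing "message" under "context" and opening it under the same context gives Just "message", while opening it under any other context gives Nothing.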

TODO Asymmetric Ciphers