Session-based Authentication
Before we get to token-based authentication, let's first look at session-based authentication, the traditional approach.
HTTP is a stateless protocol. To maintain state (typically the user's login state), the client needs a way to tell the server: hey, I'm already authenticated, so don't make me authenticate again, just give me my private information. Traditionally, this is done with a session ID stored in a browser cookie, which represents the authenticated state.
The flow of session-based authentication looks like this:
- The user (normally a browser) sends a request to the server. The request contains the login credentials of the user and the info it is requesting.
- The server verifies the credentials, creates a sessionId, stores it in memory or a database, and returns it to the user.
- The browser stores this sessionId in a cookie. On the next request, it sends the cookie along in the HTTP headers.
- The server looks at the sessionId and checks if it is valid.
- If the sessionId is valid then the web server recognizes the user and returns the requested information.
In the process above, the sessionId is generated by the server and stored in a cache or database. This can cause problems in distributed systems, where a load balancer may spread requests across many servers: a server that did not create the session cannot validate it unless all servers share the same session store.
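The session flow can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the SESSIONS and USERS dictionaries and the function names are hypothetical, and a real system would store salted password hashes rather than plain passwords.

```python
import secrets

# Hypothetical in-memory session store: sessionId -> username.
SESSIONS = {}

# Hypothetical user table (plain passwords only to keep the sketch short).
USERS = {"alice": "correct horse"}


def login(username, password):
    """Return a new sessionId if the credentials match, else None."""
    if USERS.get(username) != password:
        return None
    session_id = secrets.token_urlsafe(32)  # random, unguessable id
    SESSIONS[session_id] = username         # server-side state
    return session_id


def current_user(session_id):
    """Look up the sessionId sent in the cookie; None means not logged in."""
    return SESSIONS.get(session_id)


def logout(session_id):
    """Forget the session so the id stops being valid."""
    SESSIONS.pop(session_id, None)
```

Because SESSIONS lives inside one process, a second server behind the load balancer would not recognize a sessionId created here unless both servers share the store.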
Token-based Authentication
Token-based authentication was invented to improve on the traditional session-based approach. Its flow looks like this:
- The client sends a request to the server with a username/password. The server validates the credentials and generates a secure, signed token for the client.
- The token is sent back to the client and stored there.
- When the client needs to access something new on the server, it sends the token through the HTTP header.
- The server decodes and verifies the attached token. If it is valid, the server sends a response to the client.
- When the client logs out, the token is destroyed.
You may notice that this flow is not very different from the session-based one. The key difference between a token and a sessionId is that the token is signed with a cryptographic algorithm, and the server validates it using a cryptographic key, so there is no need to store it. Because nothing is stored, any server can generate tokens and any server can validate them, as long as it has the key and follows the same algorithm.
JWT
JWT (JSON Web Token) is a standard that defines how a token is generated, what information it contains, how it is validated, and so on. Let's get into the details.
A JSON Web Token is basically three base64url-encoded strings separated by dots.
aaaa.bbbb.cccc
The first part, aaaa, is called the header. It describes the algorithm used to sign or encrypt the data contained in the JWT. The header JSON looks like this:
{
  "alg": "HS256",
  "typ": "JWT"
}
- alg: the algorithm used to sign or encrypt the JWT
- typ: the type of content being signed or encrypted (here, a JWT)
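To make the encoding concrete, here is a small Python sketch that turns the header above into the first segment of a token. The helper names b64url_encode and b64url_decode are made up for this example; JWT uses the URL-safe base64 alphabet with the trailing "=" padding removed.

```python
import base64
import json


def b64url_encode(data: bytes) -> str:
    """Base64url-encode and drop the trailing '=' padding."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    """Restore the stripped padding, then decode."""
    padding = "=" * (-len(text) % 4)
    return base64.urlsafe_b64decode(text + padding)


header = {"alg": "HS256", "typ": "JWT"}
# Compact separators avoid spaces, so the encoded form is stable.
header_json = json.dumps(header, separators=(",", ":"))
encoded_header = b64url_encode(header_json.encode("utf-8"))
print(encoded_header)  # eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9
```

Note that this is encoding, not encryption: anyone holding the token can decode the header and read it.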
The second part, bbbb, is called the payload. It contains the main information the server uses to identify the user and their permissions. The payload consists of claims: statements about an entity (typically the user) plus additional data. Typical registered claims are:
- iss: identifies the principal that issued the JWT
- sub: identifies the principal that is the subject of the JWT
- aud: identifies the recipients that the JWT is intended for
- exp: identifies the expiration time at or after which the JWT MUST NOT be accepted for processing
- nbf: identifies the time before which the JWT MUST NOT be accepted for processing
- iat: identifies the time at which the JWT was issued
- jti: the JWT ID, a unique identifier for the JWT
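As an illustration of how the time-related claims might be used, the following Python sketch builds a payload and checks exp and nbf. The claim values (issuer URL, subject, audience, jti) and the check_time_claims helper are invented for this example; the time values are seconds since the Unix epoch.

```python
import time


def check_time_claims(claims, now=None):
    """Return True if 'now' falls inside the token's validity window."""
    now = time.time() if now is None else now
    if "exp" in claims and now >= claims["exp"]:
        return False  # expired: at or after exp
    if "nbf" in claims and now < claims["nbf"]:
        return False  # not valid yet: before nbf
    return True


issued = 1_700_000_000  # fixed timestamp so the example is repeatable
claims = {
    "iss": "https://auth.example.com",  # hypothetical issuer
    "sub": "user-123",                  # hypothetical user id
    "aud": "orders-api",                # hypothetical audience
    "iat": issued,
    "nbf": issued,
    "exp": issued + 3600,               # valid for one hour
    "jti": "a1b2c3",
}
```

With these values, check_time_claims(claims, now=issued + 10) accepts the token, while a check at issued + 3600 or later rejects it.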
The third part, cccc, is the signature. It is created by combining the encoded header and payload and computing a keyed hash over them with a secret key:
HMACSHA256(
  base64UrlEncode(header) + "." + base64UrlEncode(payload),
  secret key
)
This is the part that lets the server validate tokens without storing them. Because the secret key is known only to the server, nobody else can forge a signature. And because the signature is computed over the header and the payload, any change to either is detected: the server recomputes the signature from the received header and payload and compares it with the one in the token. This is the process of token validation.
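The signing and validation steps can be sketched with Python's standard library alone. This is a minimal HS256 illustration, not a full JWT implementation: the function names are made up, claims such as exp are not checked, and production code would normally rely on a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json


def b64url_encode(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign(payload: dict, secret: bytes) -> str:
    """Build header.payload.signature using HMAC-SHA256."""
    header = {"alg": "HS256", "typ": "JWT"}
    parts = [
        b64url_encode(json.dumps(obj, separators=(",", ":")).encode("utf-8"))
        for obj in (header, payload)
    ]
    signing_input = ".".join(parts).encode("ascii")
    signature = hmac.new(secret, signing_input, hashlib.sha256).digest()
    return ".".join(parts + [b64url_encode(signature)])


def verify(token: str, secret: bytes):
    """Return the payload if the signature matches, else None."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        header = json.loads(b64url_decode(header_b64))
        signature = b64url_decode(signature_b64)
        signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
    except ValueError:
        return None  # wrong number of parts or undecodable content
    # Only accept the algorithm this server actually uses.
    if not isinstance(header, dict) or header.get("alg") != "HS256":
        return None
    expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
    # Constant-time comparison avoids leaking how many bytes matched.
    if not hmac.compare_digest(signature, expected):
        return None
    try:
        return json.loads(b64url_decode(payload_b64))
    except ValueError:
        return None
```

A token signed with one secret verifies only with that same secret, and swapping in a different payload while keeping the old signature makes verify return None.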
Key Management
There are two mechanisms to sign a token: symmetric signature and asymmetric signature.
A symmetric signature uses the same secret key to sign and to validate tokens. Typically this is a keyed hash (HMAC) computed over the data with the secret key; JWT defines HS256, HS384, and HS512, which are based on SHA-256, SHA-384, and SHA-512.
An asymmetric signature uses a private key to sign tokens and a public key to validate them. The private key is known only to the server, so only the server can generate valid tokens. The public key is available to everyone: anyone can obtain it and use it to validate tokens. This is usually done with the RSA algorithm.
So the validation process is: the server receives the token sent by the client and validates it with either the secret key (symmetric signature) or the public key (asymmetric signature).
So far we have had only one key for validation. What if we change the key? Tokens generated with the old key will fail validation against the new key. So if we have multiple keys, the server needs to know which key was used to generate a given token in order to validate it properly.
For tokens with a symmetric signature, we have multiple secret keys. We give each key a unique name, called kid (key ID), and store it in the header of the JWT. When the server receives a token, it reads the kid parameter and looks up the corresponding secret key.
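A kid lookup for symmetric keys might look like the following Python sketch. The KEYS dictionary, the key names, and the secrets are all hypothetical; the idea is only that the newest key signs while older keys remain available for validation until their tokens expire.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical key ring: kid -> secret.
KEYS = {"2023-01": b"old-secret", "2024-01": b"new-secret"}
CURRENT_KID = "2024-01"


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _unb64(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign(payload: dict, kid: str = CURRENT_KID) -> str:
    """Sign with the named key and record its kid in the header."""
    header = {"alg": "HS256", "typ": "JWT", "kid": kid}
    body = ".".join(
        _b64(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    mac = hmac.new(KEYS[kid], body.encode("ascii"), hashlib.sha256).digest()
    return f"{body}.{_b64(mac)}"


def verify(token: str):
    """Return the payload if the key named by kid validates the token."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        header = json.loads(_unb64(header_b64))
        secret = KEYS[header["kid"]]  # pick the key named in the header
        signature = _unb64(sig_b64)
        body = f"{header_b64}.{payload_b64}".encode("ascii")
    except (ValueError, KeyError, TypeError):
        return None  # malformed token or unknown kid
    if header.get("alg") != "HS256":
        return None
    expected = hmac.new(secret, body, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return None
    try:
        return json.loads(_unb64(payload_b64))
    except ValueError:
        return None
```

Removing an entry from KEYS retires that key: tokens that name its kid stop validating, while tokens signed with the remaining keys are unaffected.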
For tokens with an asymmetric signature, validation is done with public keys. Because the key is already public, we can embed it directly in the JWT header using the jwk parameter, so the server can take the jwk from the token and use it to validate the token. Another way is to publish all public keys (each with its kid) at a public location (such as a URL), store that URL in the jku parameter, and embed the corresponding kid in the JWT. This time, the server reads the kid and jku from the JWT and then finds the matching public key. In either case the server should only accept keys or URLs it already trusts, since otherwise an attacker could sign a token with their own key and point the server at it.
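The jku and kid lookup could be sketched as below. Everything concrete here is an assumption made for illustration: the URL, the key IDs, the placeholder "n" values (not real key material), the TRUSTED_JKU allow-list, and the fetch callback that stands in for an HTTP client. The document served at the jku location is a JWK Set, a JSON object with a "keys" array.

```python
import json

# Hypothetical JWK Set, shaped like the document a jku URL would serve.
JWKS_DOCUMENT = json.dumps({
    "keys": [
        {"kty": "RSA", "kid": "key-1", "use": "sig", "alg": "RS256",
         "n": "<modulus-1>", "e": "AQAB"},
        {"kty": "RSA", "kid": "key-2", "use": "sig", "alg": "RS256",
         "n": "<modulus-2>", "e": "AQAB"},
    ]
})

# Hypothetical allow-list: only fetch keys from locations we trust.
TRUSTED_JKU = {"https://auth.example.com/.well-known/jwks.json"}


def find_public_key(header, fetch):
    """Return the JWK matching the header's kid, or None.

    'fetch' is a caller-supplied function mapping a URL to JSON text.
    """
    jku = header.get("jku")
    if jku not in TRUSTED_JKU:
        return None  # never follow a URL chosen by the token alone
    jwks = json.loads(fetch(jku))
    for key in jwks.get("keys", []):
        if key.get("kid") == header.get("kid"):
            return key
    return None


header = {
    "alg": "RS256",
    "kid": "key-2",
    "jku": "https://auth.example.com/.well-known/jwks.json",
}
key = find_public_key(header, lambda url: JWKS_DOCUMENT)
```

The returned JWK would then be handed to an RSA implementation to check the signature; that step needs a cryptography library and is left out of this sketch.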