Multi-environment OIDC auth with custom JWT claims and local RS256 verification
AWS Cognito, OIDC, Lambda, JWKS, RS256, Terraform
Context
A multi-tenant SaaS platform serves several customer organizations across four isolated environments — dev, sandbox, beta, and production. Each tenant's users need to authenticate against an identity provider that's both centrally operated and tenant-aware: a backend service handling a request must be able to read the caller's tenant id, role, and employee id directly from the bearer token, without hitting the IdP on every call.
The challenge
Out of the box, Cognito issues JWTs with a fixed claim set (for example sub, email, cognito:username, aud, exp). The platform needs additional claims that originate from the application database and are refreshed from it on every token issuance. The auth design must also be consistent across all four pools (any policy change must roll out everywhere), and backends must verify tokens locally: going back to Cognito on every API call would add a network round-trip per request and inflate p99 latency.
Approach
Provisioned four Cognito user pools through a single Terraform module so password policy, recovery rules, OAuth scopes, and schema attributes stay in lockstep across environments. Wired a pre-token-generation Lambda into each pool that fires on every token issuance and overrides the JWT's claim set with values pulled from the application database. On the consumer side, every backend service implements RS256 verification locally against the pool's published JWKS (JSON Web Key Set), with key material cached for an hour after first fetch.
Authorization Code OAuth flow (with PKCE on public clients) for the hosted UI. The Implicit flow is omitted from OAuth 2.1 and discouraged for new clients by current OAuth security guidance.
Token lifetimes: 60-minute access/ID tokens, 5-day refresh tokens. Short access reduces blast radius on token leaks; long refresh keeps the user signed in across the work week.
Password policy enforced via Terraform: min 8 chars, requires upper / lower / number / symbol; account recovery via verified email then verified phone.
Pre-token Lambda is idempotent and side-effect-free. Cognito expects a trigger to respond within 5 seconds and may retry an invocation that does not, so a claim source with write side effects could double-write. The Lambda only reads.
JWKS cache TTL is 1 hour. Cognito signing keys rotate rarely, so an hour is a safe upper bound for staleness with negligible memory cost; a token whose kid is not in the cache triggers an immediate refresh.
Architecture
Authentication is a request-time path with strict latency budgets. The flow keeps the IdP off the hot path: Cognito issues, the pre-token Lambda enriches, the backend verifies locally with cached keys.
Workflow diagram
01 User authenticates
Frontend calls Cognito's hosted UI (Authorization Code + PKCE) or the InitiateAuth API directly with email and password. Cognito performs the credential check.
02 Pre-token Lambda enriches
On every token issuance Cognito invokes the pre-token-generation trigger, which looks up the user in the application database and adds role, tenant_id, and employee_id to the outgoing JWT's claim set.
03 Cognito returns three tokens
ID token (identity), access token (authorization, sent to APIs), refresh token (5-day lifetime, used to obtain new access tokens without re-prompting for credentials).
04 API call with bearer
Each request to the backend carries the access token in the Authorization header. Nothing on the network path validates the token; verification is the consumer's responsibility.
05 Local RS256 verification
Backend reads the kid from the token header, looks up the matching public key in the cached JWKS, verifies the RS256 signature, then checks that the token was issued for the expected app client (the client_id claim on a Cognito access token, aud on an ID token) and that exp is in the future.
06 Trust the claims
Once the key lookup, signature, client, and expiry checks all pass, the service treats role / tenant_id / employee_id as authoritative for that request, with no further lookups needed.
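What "trusting the claims" might look like inside a service is sketched below. It assumes the claims dict has already passed signature and expiry verification. The role names other than `viewer` and their ordering are hypothetical, as are `Caller`, `Forbidden`, and `require`.

```python
from dataclasses import dataclass


class Forbidden(Exception):
    """Raised when verified claims do not permit the requested action."""


@dataclass(frozen=True)
class Caller:
    tenant_id: str
    role: str
    employee_id: str


# Hypothetical role ordering; the source only names the role claim itself.
ROLE_RANK = {"viewer": 0, "editor": 1, "admin": 2}


def caller_from_claims(claims: dict) -> Caller:
    """Build a Caller from already-verified token claims."""
    tenant_id = claims.get("tenant_id", "")
    if not tenant_id:
        # A token without a tenant must never reach tenant-scoped data.
        raise Forbidden("token carries no tenant_id")
    return Caller(
        tenant_id=tenant_id,
        role=claims.get("role", "viewer"),
        employee_id=claims.get("employee_id", ""),
    )


def require(caller: Caller, resource_tenant_id: str, min_role: str) -> None:
    """Enforce tenant isolation first, then the minimum role."""
    if caller.tenant_id != resource_tenant_id:
        raise Forbidden("cross-tenant access")
    # Unknown roles rank below every known role.
    if ROLE_RANK.get(caller.role, -1) < ROLE_RANK[min_role]:
        raise Forbidden(f"role {caller.role!r} is below {min_role!r}")
```

Because every decision is made from the token alone, the check costs a couple of dictionary lookups and needs no database or IdP call.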
Engineering decisions
Why RS256, not HS256
RS256 is asymmetric: Cognito signs with a private key, services verify with the corresponding public key from JWKS. Services never possess Cognito's signing key, so a compromised service can't forge tokens. HS256 is symmetric — services would need the shared secret, multiplying the attack surface.
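To make the HS256 concern concrete, here is a standard-library sketch of HMAC-SHA256 token signing. It is not how Cognito works (Cognito signs with RS256); the secret, claim values, and helper names are invented for the demonstration. The point is that the same secret both verifies and signs, so any service able to verify can also mint.

```python
import base64
import hashlib
import hmac
import json


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def hs256_sign(claims: dict, secret: bytes) -> str:
    """Produce a compact JWT signed with HMAC-SHA256."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode("utf-8"))
    payload = _b64url(json.dumps(claims).encode("utf-8"))
    signing_input = f"{header}.{payload}".encode("ascii")
    signature = hmac.new(secret, signing_input, hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url(signature)}"


def hs256_verify(token: str, secret: bytes) -> bool:
    """Check only the signature (no exp or audience checks in this demo)."""
    parts = token.split(".")
    if len(parts) != 3:
        return False
    header, payload, signature = parts
    expected = hmac.new(
        secret, f"{header}.{payload}".encode("utf-8"), hashlib.sha256
    ).digest()
    return hmac.compare_digest(_b64url(expected).encode("ascii"), signature.encode("utf-8"))


# Every verifying service would have to hold this value.
SHARED_SECRET = b"demo-only-secret"

# A compromised verifier can therefore mint a token its peers accept.
forged = hs256_sign(
    {"sub": "attacker", "role": "admin", "tenant_id": "t-1"}, SHARED_SECRET
)
forged_is_accepted = hs256_verify(forged, SHARED_SECRET)
```

With RS256 the equivalent attack is not available: services hold only the public key, which can check a signature but cannot produce one.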
Why a pre-token Lambda, not custom user attributes
Cognito custom user attributes (custom:role) are stored on the user record and only updated when the user record is mutated. The pre-token Lambda runs on every token issuance, so claims always reflect the current state of the source-of-truth database, not a stale snapshot.
Why JWKS caching with a 1-hour TTL
Cognito signing keys rotate infrequently and the pool's JWKS endpoint URL is stable. Caching for an hour means roughly one Cognito call per service process per hour (plus a refresh when an unknown kid appears), so the round-trip cost is amortized to effectively zero. Without caching, the JWKS would be fetched on every API call.
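One way to structure such a cache is sketched below. It is a variation on the approach described here, not the platform's code: the fetcher and clock are injected so the class has no network dependency, and the `min_refresh_interval` throttle is an added assumption that keeps requests with bogus kid values from forcing a fetch every time.

```python
import threading
import time
from typing import Callable, Dict, Optional


class KeyCache:
    """TTL cache for {kid: key} with a throttled refresh on unknown kid."""

    def __init__(
        self,
        fetch: Callable[[], Dict[str, object]],
        ttl: float = 3600.0,
        min_refresh_interval: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl
        self._min_interval = min_refresh_interval
        self._clock = clock
        self._lock = threading.Lock()
        self._keys: Dict[str, object] = {}
        self._fetched_at: Optional[float] = None

    def _refresh(self) -> None:
        self._keys = dict(self._fetch())
        self._fetched_at = self._clock()

    def get(self, kid: str) -> Optional[object]:
        """Return the key for kid, or None if it is not in the key set."""
        with self._lock:
            if self._fetched_at is None:
                self._refresh()  # first use
            else:
                age = self._clock() - self._fetched_at
                if age >= self._ttl:
                    self._refresh()  # routine expiry
                elif kid not in self._keys and age >= self._min_interval:
                    # Unknown kid may mean rotation; refresh, but at most
                    # once per min_refresh_interval.
                    self._refresh()
            return self._keys.get(kid)
```

In a real service `fetch` would download and parse the pool's JWKS; a `None` result should be treated as an invalid token.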
Why lifecycle ignore_changes on the schema attribute
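Cognito's user-pool schema is append-only: the API accepts AddCustomAttributes but rejects deletes. Without `lifecycle { ignore_changes = [schema] }`, removing an attribute from the Terraform configuration would produce a change that can never be applied successfully. The lifecycle rule tells Terraform to ignore schema differences after the pool is created, so removals become silent no-ops. The trade-off is that later additions are ignored by Terraform too, and have to be applied outside it (for example via AddCustomAttributes) or with the rule temporarily lifted.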
import os

import boto3

ddb = boto3.resource("dynamodb")
profiles = ddb.Table(os.environ["PROFILES_TABLE"])


def lambda_handler(event, _ctx):
    """Cognito Pre-Token-Generation trigger.

    Runs on every JWT issuance, so claims always reflect current DB state.
    Read-only by design, because Cognito may retry the invocation.
    """
    sub = event["request"]["userAttributes"]["sub"]
    profile = profiles.get_item(Key={"sub": sub}).get("Item", {})
    # claimsOverrideDetails is the V1 response shape, which customizes the
    # ID token; adding claims to the access token needs the V2 trigger event.
    event["response"] = {
        "claimsOverrideDetails": {
            "claimsToAddOrOverride": {
                # Claim values must be strings; DynamoDB may return other types.
                "tenant_id": str(profile.get("tenant_id", "")),
                "role": str(profile.get("role", "viewer")),
                "employee_id": str(profile.get("employee_id", "")),
            },
            # Drop sensitive defaults if present
            "claimsToSuppress": ["custom:debug"],
        }
    }
    return event
Simplified illustrative example
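The handler above returns `claimsOverrideDetails`, which is the V1 response shape. Since the backends read claims from the access token, a pool configured to send V2 trigger events would use `claimsAndScopeOverrideDetails` instead. The sketch below shows that shape under stated assumptions: the pool is set up for V2 events (which, per the Cognito documentation, depends on the pool's feature plan), and `load_profile` is a hypothetical stand-in for the DynamoDB lookup so the example has no AWS dependency.

```python
from typing import Callable, Dict


def build_v2_handler(load_profile: Callable[[str], Dict[str, object]]):
    """Return a pre-token-generation handler that emits the V2 response shape.

    load_profile is a hypothetical callable mapping a user's sub to a
    profile dict; in the original it would wrap the DynamoDB get_item call.
    """

    def handler(event: dict, _ctx: object) -> dict:
        sub = event["request"]["userAttributes"]["sub"]
        profile = load_profile(sub)
        claims = {
            "tenant_id": str(profile.get("tenant_id", "")),
            "role": str(profile.get("role", "viewer")),
            "employee_id": str(profile.get("employee_id", "")),
        }
        event["response"] = {
            "claimsAndScopeOverrideDetails": {
                # Same claims on both tokens: the frontend reads the ID
                # token, the backends read the access token.
                "idTokenGeneration": {"claimsToAddOrOverride": dict(claims)},
                "accessTokenGeneration": {"claimsToAddOrOverride": dict(claims)},
            }
        }
        return event

    return handler
```

Keeping the lookup injectable also makes the claim-mapping logic testable without a table or a deployed function.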
Local JWT verification with cached JWKS (Python)
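import os
import time

import jwt
import requests
from jwt.algorithms import RSAAlgorithm

# Assumed configuration source; the environment variable names are illustrative.
REGION = os.environ["COGNITO_REGION"]
POOL_ID = os.environ["COGNITO_POOL_ID"]
APP_CLIENT_ID = os.environ["COGNITO_APP_CLIENT_ID"]

ISSUER = f"https://cognito-idp.{REGION}.amazonaws.com/{POOL_ID}"
JWKS_URL = f"{ISSUER}/.well-known/jwks.json"

_keys: dict = {}
_fetched_at: float = 0.0
_TTL = 3600  # 1 hour


def _load_keys(force: bool = False) -> dict:
    """Return {kid: public key}, refetching the JWKS when stale or forced."""
    global _keys, _fetched_at
    if _keys and not force and (time.time() - _fetched_at) < _TTL:
        return _keys
    resp = requests.get(JWKS_URL, timeout=2)
    resp.raise_for_status()
    _keys = {k["kid"]: RSAAlgorithm.from_jwk(k) for k in resp.json()["keys"]}
    _fetched_at = time.time()
    return _keys


def verify(token: str) -> dict:
    kid = jwt.get_unverified_header(token).get("kid")
    key = _load_keys().get(kid)
    if key is None:
        # Possible key rotation: force one refresh, then fail loudly.
        key = _load_keys(force=True).get(kid)
        if key is None:
            raise jwt.InvalidTokenError(f"unknown signing key: {kid!r}")
    # Signature, exp, and iss are checked here. aud is not, because Cognito
    # access tokens carry client_id instead of aud.
    claims = jwt.decode(
        token,
        key=key,
        algorithms=["RS256"],
        issuer=ISSUER,
        options={"verify_aud": False},
    )
    # Reject ID tokens and tokens issued for other apps.
    if claims.get("token_use") != "access" or claims.get("client_id") != APP_CLIENT_ID:
        raise jwt.InvalidTokenError("token was not issued to this app client")
    return claims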
Simplified illustrative example
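Before `verify` can run, the service has to pull the raw token out of the Authorization header. A small helper for that might look like the following; `bearer_token` is a hypothetical name, and the header mapping stands in for whatever the web framework provides.

```python
from typing import Mapping, Optional


def bearer_token(headers: Mapping[str, str]) -> Optional[str]:
    """Return the token from an 'Authorization: Bearer <token>' header.

    Header names and the scheme are matched case-insensitively.
    Returns None when the header is missing or uses another scheme.
    """
    for name, value in headers.items():
        if name.lower() != "authorization":
            continue
        scheme, _, token = value.strip().partition(" ")
        token = token.strip()
        if scheme.lower() == "bearer" and token:
            return token
        return None
    return None
```

A request handler would respond with 401 when this returns `None` or when `verify` raises, and only then read tenant_id and role from the returned claims.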
Impact
All four environments authenticate users in the same way; one Terraform apply rolls a policy change across the entire footprint. Backend services verify a token in well under a millisecond, fully offline from Cognito, so token verification is no longer a measurable contributor to API latency. Adding a new claim is a one-file change to the pre-token Lambda; downstream services see it as soon as a user's token is next issued or refreshed, without any service redeploy.