Handling the event lifecycle: retries, DLQs and destinations

Serverless Applications with AWS Lambda

Claudio Canales

Senior DevOps Engineer

The event lifecycle at a glance

  • Lambda invokes your handler.
  • The handler either succeeds or fails.
  • When it fails, retries and routing decide what happens next.

Event lifecycle flow

Two ways failures show up

Synchronous

  • Caller waits and receives the error in the response.
  • Lambda does not retry; the caller decides whether to try again.

Asynchronous

  • Lambda queues the event and acknowledges the caller before the handler runs.
  • Lambda retries failed events in the background (two retries by default).
  • Failure handling depends on the invocation mode.

Sync vs async failure paths
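
The two paths can be sketched with the Lambda Invoke API, where `InvocationType` selects the mode. This is a minimal sketch, not the course's own code: it assumes a boto3-style client is passed in (for example `boto3.client("lambda")`), and the helper names and the function name `process-order` are hypothetical.

```python
import json


def build_invoke_request(function_name, payload, asynchronous=False):
    """Build keyword arguments for a Lambda Invoke call.

    "RequestResponse" makes the caller wait for the result (synchronous);
    "Event" queues the event and returns immediately (asynchronous).
    """
    return {
        "FunctionName": function_name,
        "InvocationType": "Event" if asynchronous else "RequestResponse",
        "Payload": json.dumps(payload).encode("utf-8"),
    }


def invoke(client, function_name, payload, asynchronous=False):
    """Invoke through a boto3-style Lambda client and summarize the outcome."""
    response = client.invoke(
        **build_invoke_request(function_name, payload, asynchronous)
    )
    return {
        # 200 for a synchronous call, 202 when an async event was accepted.
        "status": response["StatusCode"],
        # Present only when a synchronous invocation ended in a function error.
        "function_error": response.get("FunctionError"),
    }
```

With a real client, `invoke(client, "process-order", {"id": 1})` would surface a handler error to the caller, while the same call with `asynchronous=True` would only confirm that the event was accepted.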

Retries are normal

  • Retries are often a feature, not a bug.
  • A transient failure might succeed on the next attempt.
  • Retries can cause duplicate processing; your handler must account for it.

Retry redialing analogy

Retries over time

  • Retries can recover from transient issues.
  • But the same event may run multiple times.
  • Idempotency and clear error handling are essential.

Retry attempts timeline
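
A small local simulation can make the duplicate-processing risk concrete. This is only a sketch: `run_with_retries` is a hypothetical stand-in for Lambda's asynchronous retry loop (one initial attempt plus up to two retries by default), and it skips the waiting that Lambda applies between attempts.

```python
def run_with_retries(handler, event, max_retries=2):
    """Call handler up to 1 + max_retries times.

    Returns (succeeded, attempts). Delays between attempts are not simulated.
    """
    attempts = 0
    for _ in range(1 + max_retries):
        attempts += 1
        try:
            handler(event)
            return True, attempts
        except Exception:
            continue  # treat every error as retryable in this sketch
    return False, attempts


side_effects = []  # records every time the handler did its work


def flaky_handler(event):
    # The side effect happens before the failure, so a retry repeats it.
    side_effects.append(event["id"])
    if len(side_effects) < 2:
        raise RuntimeError("transient failure")


result = run_with_retries(flaky_handler, {"id": "evt-1"})
# result is (True, 2) and side_effects holds "evt-1" twice:
# the retry recovered, but the same event was processed two times.
```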

When retries are dangerous

  • Retries are risky when work is not idempotent.
  • Examples: charging a card, sending an email.
  • Use idempotency keys and safe updates so duplicates do not cause harm.

Idempotency goal
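
One way to make a handler safe under retries is to record an idempotency key before repeating the work. The sketch below is an assumption-heavy illustration: the `idempotency_key` field, the `charge_card` helper, and the in-memory `processed` dictionary are all hypothetical, and a real function would need a durable store with an atomic check-and-write.

```python
processed = {}  # stand-in for a durable store such as a database table
charges = []    # records each time the (hypothetical) card charge runs


def charge_card(order_id, amount):
    """Hypothetical side effect; here it only records the charge."""
    charges.append((order_id, amount))
    return {"order_id": order_id, "charged": amount}


def handler(event, context=None):
    key = event["idempotency_key"]
    if key in processed:
        # Duplicate delivery: return the stored result, do not charge again.
        return processed[key]
    result = charge_card(event["order_id"], event["amount"])
    processed[key] = result
    return result
```

Calling `handler` twice with the same `idempotency_key` returns the same result both times, while `charges` grows by only one entry.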

DLQ (Dead-Letter Queue)

  • A safe place for events that still fail after retries.
  • Often an SQS queue, AWS's managed message queue.
  • Inspect the payload, fix the issue, and re-drive.

DLQ lost-and-found analogy
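
Attaching a DLQ is a function configuration setting. As a hedged sketch, the helper below builds the arguments one could pass to boto3's `update_function_configuration`; the helper name, function name, and ARN are hypothetical, and the prefix check assumes the standard `aws` partition.

```python
def build_dlq_config(function_name, target_arn):
    """Build arguments for lambda_client.update_function_configuration(**...)."""
    # A Lambda DLQ target is an SQS queue or an SNS topic.
    if not target_arn.startswith(("arn:aws:sqs:", "arn:aws:sns:")):
        raise ValueError("DLQ target must be an SQS queue or SNS topic ARN")
    return {
        "FunctionName": function_name,
        "DeadLetterConfig": {"TargetArn": target_arn},
    }


dlq_config = build_dlq_config(
    "process-order",  # hypothetical function name
    "arn:aws:sqs:us-east-1:123456789012:process-order-dlq",  # placeholder ARN
)
```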

DLQ vs destinations

DLQ

  • Captures failed events after retries.
  • Use for investigation.

Destinations

  • Route outcomes on success or failure.
  • Build explicit success and failure paths.

Destinations: success and failure routes

  • On success, Lambda sends the result to the onSuccess destination.
  • On failure, Lambda sends the error details to the onFailure destination.
  • This makes the next step explicit.

Destinations routing diagram
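
In the API, the two routes are expressed as a `DestinationConfig` with `OnSuccess` and `OnFailure` entries. The sketch below builds arguments that could be passed to boto3's `put_function_event_invoke_config`; the helper name and all ARNs are hypothetical placeholders.

```python
def build_destination_config(function_name, on_success_arn, on_failure_arn):
    """Build arguments for lambda_client.put_function_event_invoke_config(**...)."""
    return {
        "FunctionName": function_name,
        "DestinationConfig": {
            # Where the record goes when the async invocation succeeds.
            "OnSuccess": {"Destination": on_success_arn},
            # Where the record goes once retries are exhausted or the event expires.
            "OnFailure": {"Destination": on_failure_arn},
        },
    }


destinations = build_destination_config(
    "process-order",  # hypothetical function name
    "arn:aws:sns:us-east-1:123456789012:orders-succeeded",  # placeholder ARN
    "arn:aws:sqs:us-east-1:123456789012:orders-failed",     # placeholder ARN
)
```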

Tuning retry policy

  • Tune how many times Lambda retries an asynchronous event (0 to 2).
  • Limit the maximum event age (60 seconds to 6 hours) to avoid processing stale data.
  • More retries improve reliability but increase duplicates and delay.

Retry policy controls
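
For asynchronous invocations, both controls are plain numbers: `MaximumRetryAttempts` accepts 0 to 2 and `MaximumEventAgeInSeconds` accepts 60 to 21600. A minimal sketch of a helper that validates them before calling boto3's `put_function_event_invoke_config` might look like this; the helper and function names are hypothetical.

```python
def build_retry_policy(function_name, max_retries=2, max_event_age_seconds=21600):
    """Validate and build arguments for put_function_event_invoke_config."""
    if not 0 <= max_retries <= 2:
        raise ValueError("max_retries must be between 0 and 2")
    if not 60 <= max_event_age_seconds <= 21600:
        raise ValueError("max_event_age_seconds must be between 60 and 21600")
    return {
        "FunctionName": function_name,
        "MaximumRetryAttempts": max_retries,
        "MaximumEventAgeInSeconds": max_event_age_seconds,
    }


# One retry only, and give up on events older than five minutes.
policy = build_retry_policy("process-order", max_retries=1, max_event_age_seconds=300)
```

Fewer retries and a shorter event age favor timeliness; the defaults favor eventual delivery.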

Maximum event age: an expiration date

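  • Maximum event age is an expiration policy for queued events.
  • If an event is too old, processing it may be pointless.
  • The trade-off: behavior stays timely, but expired events are never processed; Lambda discards them or sends them to the DLQ or failure destination.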

Event age expiration analogy

Observability: where to look

  • Logs answer what happened.
  • Metrics answer how often it is happening.
  • Alarms help you catch spikes quickly.

Logs vs metrics
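
An alarm on the function's `Errors` metric is one way to catch spikes. The sketch below builds arguments for boto3's CloudWatch `put_metric_alarm`; the helper name, alarm name, and thresholds are illustrative assumptions, not recommended values.

```python
def build_error_alarm(function_name, threshold=1, period_seconds=60):
    """Build arguments for cloudwatch_client.put_metric_alarm(**...)."""
    return {
        "AlarmName": f"{function_name}-errors",  # hypothetical naming scheme
        "Namespace": "AWS/Lambda",
        "MetricName": "Errors",
        "Dimensions": [{"Name": "FunctionName", "Value": function_name}],
        "Statistic": "Sum",
        "Period": period_seconds,
        "EvaluationPeriods": 1,
        "Threshold": threshold,
        # Fire when errors in one period reach the threshold.
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        # No invocations means no data points; treat that as healthy.
        "TreatMissingData": "notBreaching",
    }


alarm = build_error_alarm("process-order", threshold=5, period_seconds=300)
```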

What to do with failed events

  • Inspect the payload and error.
  • Fix the root cause.
  • Re-drive the event, then monitor errors and throughput.

Failed events recovery cycle
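
The re-drive step can be sketched as a small loop over an SQS DLQ. This assumes the root cause is already fixed, that boto3-style SQS and Lambda clients are passed in, and that each message body holds the original event; the `redrive` helper itself is hypothetical.

```python
def redrive(sqs, lambda_client, queue_url, function_name, max_messages=10):
    """Re-invoke the function for one batch of DLQ messages.

    Returns the number of messages re-driven and removed from the queue.
    """
    redriven = 0
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=max_messages,  # SQS allows at most 10 per call
    )
    for message in response.get("Messages", []):
        result = lambda_client.invoke(
            FunctionName=function_name,
            InvocationType="Event",  # asynchronous, like the original delivery
            Payload=message["Body"].encode("utf-8"),
        )
        # Delete only after Lambda has accepted the event (HTTP 202).
        if result["StatusCode"] == 202:
            sqs.delete_message(
                QueueUrl=queue_url,
                ReceiptHandle=message["ReceiptHandle"],
            )
            redriven += 1
    return redriven
```

Deleting a message only after the invoke is accepted means a failed re-drive leaves the event in the queue for another try.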

Key takeaways

  • Reliability comes from retries, routing, and observability.
  • Synchronous errors reach the caller.
  • Asynchronous errors need retries plus DLQs or destinations.
  • Together, these keep failures visible instead of silent.

Reliability formula diagram

Let's practice!
