Event source mappings and stream processing

Serverless Applications with AWS Lambda

Claudio Canales

Senior DevOps Engineer

Where event source mappings fit

  • Queues and streams are often polled in batches.
  • Event source mappings manage polling and batching behavior.

Event source mapping position

Serverless Applications with AWS Lambda

What is an event source mapping?

  • A resource that connects a source to your function.
  • Lambda polls the source for records.
  • Lambda invokes your handler with a batch.

Event source mapping flow

Serverless Applications with AWS Lambda

Push vs poll (why it matters)

Push model

  • Push sources send events immediately.
  • Example: an upload to Amazon S3.

Poll model

  • Poll sources are checked by Lambda.
  • Lambda builds a batch and invokes your handler.
  • This changes latency and retry behavior.

Push vs poll models

Serverless Applications with AWS Lambda

Queue vs stream

Queue (Amazon SQS)

  • A managed message queue, like an inbox.

Stream (DynamoDB Streams)

  • A change log for a DynamoDB table.
  • Both arrive as Records, but the meaning differs.

Queue vs stream comparison

Serverless Applications with AWS Lambda

Batch event shape (simplified)

{
  "Records": [{
    "messageId": "abc-123",
    "body": "{\"order_id\": \"A-42\"}"
  }]
}
  • Most mapping events start with Records.
  • Each record has a messageId and a body string.
  • Parse body into your own payload.
Serverless Applications with AWS Lambda

Walkthrough: process each record

def lambda_handler(event, context):
    records = event.get("Records", [])
    for record in records:
        body = record.get("body", "")
        print("BODY:", body)
    return {"statusCode": 200}
  • Read Records with a default list.
  • Loop through each record and read body safely.
  • Log what you need, then return.
Serverless Applications with AWS Lambda

Walkthrough: parse JSON body

import json

def lambda_handler(event, context):
    record = event.get("Records", [])[0]
    payload = json.loads(record.get("body", "{}"))
    order_id = payload.get("order_id")
    print("ORDER_ID:", order_id)
    return {"statusCode": 200}
  • Import json and read body with a safe default.
  • Parse with json.loads to get a dict.
  • Extract the fields you need; always validate first.
Serverless Applications with AWS Lambda

Batching: batch size and batch window

  • Batch size: how many records per invocation.
  • Batch window: how long Lambda waits to fill a batch.
  • Bigger batches improve throughput but add per-run work.
  • For SQS, watch ApproximateAgeOfOldestMessage for backlog.

Batch size tradeoff diagram

Serverless Applications with AWS Lambda

Batch size trade-offs

  • Large batches are efficient but can increase latency.
  • Small batches reduce time-to-first-processing but increase invocations.
  • Choose based on your workload.

Batch size vs latency chart

Serverless Applications with AWS Lambda

Scaling with concurrency

  • Multiple batches are processed at the same time.
  • Higher concurrency increases throughput.
  • But it also increases downstream load.

Concurrent batch processing

Serverless Applications with AWS Lambda

Protect downstream systems

  • Too much concurrency can overwhelm a database or API.
  • Use limits to control load.
  • Prefer small, safe units of work.

Downstream load protection

Serverless Applications with AWS Lambda

Partial batch failures

  • A batch can contain good and bad records.
  • Without partial failures, one bad record retries the whole batch.
  • Partial failure handling lets you retry only the failed items.

Partial batch failure flow

Serverless Applications with AWS Lambda

Walkthrough: reporting partial failures (SQS)

def lambda_handler(event, context):
    failures = []
    for record in event.get("Records", []):
        try:
            process(record)
        except Exception:
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
  • Collect messageIds for failed records in batchItemFailures.
  • Lambda retries only those items.
  • Successfully processed records are not re-sent.
Serverless Applications with AWS Lambda

Key takeaways

  • Event source mappings poll queues and streams.
  • They deliver Records in batches to your handler.
  • Batch size and window trade latency for throughput.
  • Partial failures avoid reprocessing successes.

Mapping summary diagram

Serverless Applications with AWS Lambda

Let's practice!

Serverless Applications with AWS Lambda

Preparing Video For Download...