Understanding Modern Data Architecture
Miller Trujillo
Senior Software Engineer
Use case | Solution | Cloud solution
---|---|---
Batch/streaming, big data, cluster | Apache Spark, Flink, Beam | AWS EMR, AWS Glue, Google Dataproc, Google Dataflow |
Batch/streaming, big data, serverless (servers are fully managed by the provider) | Apache Spark, Beam | AWS Glue, Google Dataflow |
Individual events, simple processing, 24/7 availability without always-on servers | General-purpose programming languages: Python, JavaScript, C#, Java, Go | AWS Lambda, Google Cloud Functions
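
To make the last row more concrete, here is a minimal sketch of a function that processes one event at a time, written in the `handler(event, context)` style that AWS Lambda uses for Python. The event shape (an `order` with `items`, `price`, and `quantity`) is a hypothetical example, not something the platform defines.

```python
import json


def handler(event, context):
    """Handle a single event and return a small JSON response.

    The `order`/`items` layout is an assumed, illustrative payload;
    a real event depends on whatever service triggers the function.
    """
    order = event.get("order", {})
    items = order.get("items", [])

    # Simple per-event processing: compute the order total.
    total = sum(item["price"] * item["quantity"] for item in items)

    return {
        "statusCode": 200,
        "body": json.dumps({"order_id": order.get("id"), "total": total}),
    }
```

Because the function holds no state between calls, the provider can start it only when an event arrives, which is what allows round-the-clock availability without a server that is always running.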