Monitoring AKS Clusters

Azure Compute Solutions

Florin Angelescu

Azure Cloud Architect

Why monitoring matters

  • Monitoring is essential for understanding how applications behave in production.
  • Issues can go unnoticed until users are impacted.
  • Kubernetes provides basic metrics.
  • AKS integrates with Azure Monitor and Log Analytics.

  • Monitoring helps:

    • Detect problems early
    • Optimize resource usage
    • Ensure applications meet performance expectations
  • Monitoring also supports compliance reporting.

Metrics and dashboards

  • Metrics are the foundation of monitoring.
  • AKS exposes CPU, memory, and network usage for pods and nodes.

Azure Monitor

  • Collects these metrics.
  • Presents them in dashboards.
  • Makes trends easy to visualize.

Custom dashboards can combine:

  • Infrastructure metrics
  • Application-level data (request latency, error counts)

By tailoring dashboards to team needs, you ensure that monitoring delivers actionable insights rather than overwhelming detail.
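
As a rough illustration of the application-level data such a dashboard might show, the sketch below derives an error count, an error rate, and a 95th-percentile latency from a batch of request records. The `Request` type, the `summarize` helper, and the choice to treat HTTP 5xx responses as errors are assumptions made for this example; they are not part of AKS or Azure Monitor.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """One handled request (hypothetical record shape)."""
    latency_ms: float
    status: int


def summarize(requests):
    """Return error count, error rate, and nearest-rank p95 latency."""
    if not requests:
        return {"error_count": 0, "error_rate": 0.0, "p95_latency_ms": None}
    # Assumption: only HTTP 5xx responses count as errors.
    errors = sum(1 for r in requests if r.status >= 500)
    latencies = sorted(r.latency_ms for r in requests)
    # Nearest-rank percentile: ceil(0.95 * n), computed with integers.
    rank = (95 * len(latencies) + 99) // 100
    return {
        "error_count": errors,
        "error_rate": errors / len(requests),
        "p95_latency_ms": latencies[rank - 1],
    }
```

A summary like this can sit next to node CPU and memory charts so that an infrastructure spike can be compared with its effect on users.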

Logs and diagnostics

Logs

  • Provide detailed information about application behavior.
  • Stream logs directly from pods using kubectl logs.
  • Centralize logs with Log Analytics.
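
To make the `kubectl logs` bullet concrete, here is a minimal sketch that assembles the command line in Python. The helper name `kubectl_logs_command`, the pod name `web-7d9c`, and the namespace `shop` are hypothetical; the flags used (`--namespace`, `--container`, `--tail`, `--follow`) are standard `kubectl logs` options.

```python
import shlex


def kubectl_logs_command(pod, namespace="default", container=None,
                         tail=None, follow=False):
    """Build the argument list for a `kubectl logs` invocation."""
    cmd = ["kubectl", "logs", pod, "--namespace", namespace]
    if container:
        # Needed when the pod runs more than one container.
        cmd += ["--container", container]
    if tail is not None:
        # Limit output to the most recent lines.
        cmd.append(f"--tail={tail}")
    if follow:
        # Keep streaming new log lines as they arrive.
        cmd.append("--follow")
    return cmd


# Hypothetical pod and namespace, shown as a shell-ready string.
print(shlex.join(kubectl_logs_command("web-7d9c", namespace="shop", tail=100)))
```

The resulting list can be passed to `subprocess.run` on a machine where `kubectl` is configured for the cluster.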

  • Diagnostics tools help identify root causes:

    • Failing container
    • Misconfigured service
    • Resource bottleneck
  • Structured logging formats such as JSON:

    • Make logs easier to parse and analyze automatically
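
One possible way to emit structured logs from an application is sketched below using Python's standard `logging` module. The `JsonFormatter` class, the `orders` logger name, and the `context` field are assumptions for this example. Timestamps are left out to keep the output short.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record):
        entry = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Extra fields arrive via `extra={"context": {...}}` (our convention).
        entry.update(getattr(record, "context", {}))
        return json.dumps(entry)


def build_logger(stream):
    """Return a logger that writes JSON lines to `stream`."""
    logger = logging.getLogger("orders")  # hypothetical logger name
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


stream = io.StringIO()
log = build_logger(stream)
log.error("payment failed", extra={"context": {"order_id": 42, "status": 502}})
parsed = json.loads(stream.getvalue())  # each line parses back into a dict
```

Because every line is valid JSON, a log pipeline can filter on fields such as `level` or `status` without fragile text matching.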

Alerts and automation

Alerts

Azure Monitor allows you to set thresholds for metrics and trigger alerts when they are exceeded.

For example, if CPU usage stays above 80% for several minutes:

  • An alert can notify the operations team
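
The logic behind a "sustained threshold" rule like this can be sketched in a few lines. Azure Monitor evaluates alert rules on its own; the function below only illustrates the idea and does not call any Azure API. The name `sustained_breach` and the default window of five samples (for instance, one per minute) are assumptions.

```python
def sustained_breach(samples, threshold=80.0, window=5):
    """Return True if the last `window` samples all exceed `threshold`."""
    if len(samples) < window:
        # Not enough data yet to say the breach is sustained.
        return False
    return all(value > threshold for value in samples[-window:])


# CPU percentages, oldest first (made-up values).
cpu = [70, 85, 90, 88, 91, 86]
if sustained_breach(cpu):
    print("CPU above 80% for the whole window: notify the operations team")
```

Requiring every sample in the window to exceed the threshold avoids alerting on a single short spike.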

Alerts can also trigger automated actions, such as scaling workloads or restarting pods.

Combining alerts with automation helps you:

  • Reduce manual intervention
  • Shorten recovery times

Best practices for monitoring

Metrics

Collect metrics that reflect system health and user experience

Dashboards

Focus on actionable insights

Logs

Use log aggregation to simplify troubleshooting

Monitor

Review and refine monitoring rules regularly

Recap

  • Monitoring in AKS provides the visibility needed to operate applications confidently.
  • With metrics, logs, dashboards, and alerts, you can:
    • Detect issues early
    • Optimize performance
    • Maintain reliability

Let's practice!
