Nobody likes the idea of degraded performance or unscheduled downtime in their services. In many cases, the implications can be quite damaging, to say the least.
In the context of Kubernetes, many would argue that such disruptions go against “the promise” of a container orchestration solution. That’s why teams build fault tolerance and resilience mechanisms to mitigate these risks in their environments. However, the solutions to these challenges aren’t always straightforward.
A typical pattern (and recommended practice) is to make your infrastructure and applications highly available. But how do you optimize the networking in such environments to balance performance and cost? In cloud environments, cross-zone traffic can significantly increase your total cost. So how do you manage this while maintaining fault tolerance? In other scenarios, it’s not enough to manage traffic at a zonal level. Some applications require very low latency, with traffic restricted to a specific node in your cluster. Does that mean we dump all our pods on one node?
There is no one-size-fits-all answer, but there are different patterns, each with its own constraints.
[Technicality rating: 4/4]
CNCF Event link: Optimizing Network Costs in Kubernetes
Meetup Event link: Optimizing Network Costs in Kubernetes