57 changes: 33 additions & 24 deletions docs/en/configure/networking/functions/endpoint_health_checker.mdx
@@ -6,13 +6,14 @@ weight: 400

 ## Overview

-The Endpoint Health Checker is a cluster plugin designed to monitor and manage the health status of service endpoints. It automatically removes unhealthy endpoints from load balancers to ensure traffic is only routed to healthy instances, improving overall service reliability and availability.
+The Endpoint Health Checker is a cluster plugin designed to monitor and manage the health status of service endpoints in ALB + MetalLB environments. It automatically removes unhealthy endpoints from load balancers to ensure traffic is only routed to healthy instances, improving overall service reliability and availability.

 ## Key Features

-- **Automatic Health Monitoring**: Continuously monitors the health status of service endpoints
-- **Load Balancer Integration**: Automatically removes unhealthy endpoints from load balancer rotation
+- **Automatic Health Monitoring**: Continuously monitors the health status of service endpoints in ALB + MetalLB environments
+- **Load Balancer Integration**: Automatically removes unhealthy endpoints from ALB and MetalLB rotation
 - **Service Availability**: Ensures traffic is only directed to healthy, available endpoints
+- **Rapid Failover**: Reduces endpoint switching time from 40s to 10s during node power outages

 ## Installation

@@ -30,13 +31,13 @@ The Endpoint Health Checker is a cluster plugin designed to monitor and manage t

 ### Health Check Mechanism

-The Endpoint Health Checker is a dedicated health monitoring component that ensures only healthy endpoints receive traffic. It operates by monitoring service endpoints and automatically managing their availability status.
+The Endpoint Health Checker is a dedicated health monitoring component that ensures only healthy endpoints receive traffic in ALB + MetalLB environments. It operates by monitoring service endpoints and automatically managing their availability status.

 #### Core Functionality

 The Endpoint Health Checker works by:

-1. **Service Discovery**: Identifies services and pods configured for health monitoring
+1. **Service Discovery**: Identifies services and pods configured for health monitoring in ALB + MetalLB environments
 2. **Pod Health Monitoring**: Monitors the readiness and liveness probe status of pods backing the service endpoints
 3. **Active Health Checks**: Performs active health assessments using configurable criteria:
    - **TCP connectivity checks**: Establishes TCP connections to verify port accessibility
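
The TCP connectivity check described in this hunk can be illustrated with a minimal Python sketch. It is not the plugin's implementation: the function names `tcp_endpoint_healthy` and `healthy_endpoints` and the one-second default timeout are assumptions made for illustration only.

```python
import socket


def tcp_endpoint_healthy(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`.

    Hypothetical helper: the real plugin's check logic and timeout are not
    shown in this diff.
    """
    try:
        # create_connection performs the TCP handshake; the socket is closed
        # again as soon as the `with` block exits.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def healthy_endpoints(endpoints, timeout: float = 1.0):
    """Keep only the (host, port) pairs that accept a TCP connection."""
    return [(h, p) for h, p in endpoints if tcp_endpoint_healthy(h, p, timeout)]
```

Filtering a list of endpoints this way mirrors the documented behavior of dropping endpoints that fail the check from rotation; how the plugin actually updates ALB or MetalLB is outside what the diff shows.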
@@ -50,7 +51,13 @@ The health checking process involves:
 - **Probe Integration**: Leverages Kubernetes readiness and liveness probe results as initial health indicators
 - **Network Connectivity**: Sends TCP or HTTP packets to target endpoint ports to verify accessibility
 - **Response Validation**: Evaluates response status, timing, and content to determine endpoint health
-- **Automatic Failover**: Removes unresponsive or failed endpoints from load balancer rotation
+- **Automatic Failover**: Removes unresponsive or failed endpoints from ALB and MetalLB rotation
+
+#### Performance Improvement
+
+- **Previous Method**: Relied on kubelet heartbeat detection, with a delay of up to 40 seconds
+- **Current Method**: Active endpoint health checking, with a 10-second detection and switching time
+- **Improvement**: Cuts endpoint detection and switching time by up to 30 seconds (from up to 40 seconds down to 10 seconds), improving service availability during node failures in ALB + MetalLB environments

 #### Activation Methods

@@ -64,32 +71,33 @@ Health checking can be activated through two methods:
 apiVersion: apps/v1
 kind: Deployment
 metadata:
-  name: nginx-demo
+  name: alb-pod
+  namespace: cpaas-system
 spec:
-  replicas: 10
+  replicas: 3
   selector:
     matchLabels:
-      app: nginx-demo
+      app: alb-pod
   template:
     metadata:
       labels:
-        app: nginx-demo
+        app: alb-pod
       annotations:
         endpoint-health-checker.io/enabled: "true"
     spec:
       containers:
-      - name: nginx
-        image: nginx:alpine
+      - name: alb-container
+        image: your-alb-image:latest
         ports:
-        - containerPort: 80
+        - containerPort: 8080
         livenessProbe:
           tcpSocket:
-            port: 80
+            port: 8080
           initialDelaySeconds: 15
           periodSeconds: 10
         readinessProbe:
           tcpSocket:
-            port: 80
+            port: 8080
           initialDelaySeconds: 5
           periodSeconds: 5
```
@@ -102,32 +110,33 @@ Health checking can be activated through two methods:
 apiVersion: apps/v1
 kind: Deployment
 metadata:
-  name: nginx-demo-legacy
+  name: alb-pod-legacy
+  namespace: cpaas-system
 spec:
-  replicas: 5
+  replicas: 3
   selector:
     matchLabels:
-      app: nginx-demo-legacy
+      app: alb-pod-legacy
   template:
     metadata:
       labels:
-        app: nginx-demo-legacy
+        app: alb-pod-legacy
     spec:
       readinessGates:
       - conditionType: "endpointHealthCheckSuccess"
       containers:
-      - name: nginx
-        image: nginx:alpine
+      - name: alb-container
+        image: your-alb-image:latest
         ports:
-        - containerPort: 80
+        - containerPort: 8080
         livenessProbe:
           tcpSocket:
-            port: 80
+            port: 8080
           initialDelaySeconds: 15
           periodSeconds: 10
         readinessProbe:
           tcpSocket:
-            port: 80
+            port: 8080
           initialDelaySeconds: 5
           periodSeconds: 5
```
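
The two manifests in this diff differ only in how they opt in: one sets the `endpoint-health-checker.io/enabled: "true"` annotation on the pod template, the other declares a readiness gate with `conditionType: "endpointHealthCheckSuccess"`. As a rough sketch, assuming the pod template has already been parsed into a Python dictionary, a reviewer could check which method a manifest uses like this. The function name `activation_method`, its return values, and the order in which the two methods are tested are assumptions for illustration, not the plugin's own logic.

```python
from typing import Optional

# Both identifiers are taken from the manifests in this diff.
ANNOTATION_KEY = "endpoint-health-checker.io/enabled"
READINESS_GATE = "endpointHealthCheckSuccess"


def activation_method(pod_template: dict) -> Optional[str]:
    """Report how a pod template opts in to endpoint health checking.

    Returns "annotation", "readinessGate", or None. Checking the annotation
    first is an arbitrary choice made for this sketch.
    """
    annotations = (pod_template.get("metadata") or {}).get("annotations") or {}
    if annotations.get(ANNOTATION_KEY) == "true":
        return "annotation"

    gates = (pod_template.get("spec") or {}).get("readinessGates") or []
    if any(gate.get("conditionType") == READINESS_GATE for gate in gates):
        return "readinessGate"

    return None
```

Passing the `spec.template` section of either Deployment to this helper is enough to confirm that the opt-in setting survived the rename from `nginx-demo` to `alb-pod`.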