feat: Add cluster-level CCR sync and dynamic monitor interval configuration #637
base: dev
Please move the CLUSTER_CCR_*.md files into the doc directory.
@@ -0,0 +1,27 @@
package main
Remove this file.
@@ -0,0 +1,140 @@
# Cluster CCR Feature Changes
Remove this file
@@ -0,0 +1,347 @@
# Cluster CCR Feature User Guide
Move to doc/
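## Performance Optimization

### 1. Lock Optimization
- Use channel communication in place of frequent lock operations
- Reduce lock contention in the monitor loop
- Improve overall monitoring performance

### 2. Resource Management
- Set a reasonable monitor interval to avoid excessive resource consumption
- Keep track of the number of tasks to avoid creating too many concurrent tasks
- Periodically clean up resources that belong to deleted databases

### 3. Network Optimization
- Use a connection pool to reduce connection overhead
- Set reasonable timeouts
- Consider the effect of network latency on the monitor interval

## Version Compatibility

- **Backward compatible**: existing single-database sync behavior is unchanged
- **Default behavior**: the `cluster_sync` parameter defaults to `false`
- **API compatible**: all existing API endpoints remain compatible

## Security Considerations

1. **Permission control**: ensure the sync user has only the database privileges it needs
2. **Network security**: use secure network connections; consider a VPN or a dedicated line
3. **Password management**: avoid writing sensitive information to logs
4. **Access control**: restrict access to the monitor interval configuration API

---

For more details, see the technical document `CLUSTER_CCR_CHANGES.md`.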
log.Infof("Cluster-level sync tasks creation completed, success: %d, failed: %d", successCount, len(errors))

// Start daemon task to periodically detect new databases in source cluster, passing existing database list
go startDatabaseMonitor(request, db, jobManager, databases)
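You need to set up a DatabaseMonitor manager and start it from main.go; at startup you can pass in a ctx (and similar) that is used to stop the service. The current approach cannot do a graceful shutdown.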
	return xerror.Wrapf(err, xerror.Normal, "Failed to get database list from source cluster")
}

if len(databases) == 0 {
Since this monitors the cluster and creates CCR jobs automatically, shouldn't an empty database list also be allowed here?
if err := rows.Scan(&database); err != nil {
	return nil, xerror.Wrapf(err, xerror.Normal, "scan database failed")
}
// Filter out system databases
// Filter out system databases
@@ -83,6 +83,22 @@ func ParseBackupState(state string) BackupState {
	}
}

// isSystemDatabase reports whether a database is a system database that must be skipped
// isSystemDatabase reports whether a database is a system database that must be skipped
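The body of `isSystemDatabase` is not visible in this hunk. A plausible shape for such a filter is sketched below; the set of database names is an assumption (typical internal schemas) and should be checked against what the source cluster actually exposes.

```go
package main

import (
	"fmt"
	"strings"
)

// systemDatabases is an assumed list, not taken from the PR; adjust it to
// the databases your cluster treats as internal.
var systemDatabases = map[string]struct{}{
	"information_schema": {},
	"__internal_schema":  {},
	"mysql":              {},
}

// isSystemDatabase reports whether name is a system database that should be
// skipped. The comparison ignores case and surrounding whitespace.
func isSystemDatabase(name string) bool {
	_, ok := systemDatabases[strings.ToLower(strings.TrimSpace(name))]
	return ok
}

func main() {
	for _, db := range []string{"information_schema", "sales"} {
		fmt.Printf("%s skipped=%v\n", db, isSystemDatabase(db))
	}
}
```
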
	dbRequest := &CreateCcrRequest{
		Name:             fmt.Sprintf("%s_%s", request.Name, dbName), // Task name with database name appended
		Src:              request.Src,
		Dest:             request.Dest,
		SkipError:        request.SkipError,
		AllowTableExists: request.AllowTableExists,
		ReuseBinlogLabel: request.ReuseBinlogLabel,
		ClusterSync:      false, // Set to false to avoid recursive calls
	}

	dbRequest.Src.Database = dbName
	dbRequest.Dest.Database = dbName

	if err := createCcr(dbRequest, db, jobManager); err != nil {
		errMsg := fmt.Sprintf("Failed to create sync task for database %s: %v", dbName, err)
		log.Warnf(errMsg)
		errors = append(errors, errMsg)
	} else {
		successCount++
		log.Infof("Successfully created sync task for database %s", dbName)
	}
}
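You need to create a separate job type in the DB instead of reusing the Ccr Job.
Also, after a restart, you need to consider:
- how these cluster sync tasks are handled
- which syncer is responsible for the cluster sync task when multiple syncers form a failover setup.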
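## Feature Overview

This PR adds cluster-level sync and dynamic monitor interval configuration to CCR Syncer.

## Main Changes

### New API endpoints
- `POST /update_monitor_interval`: update the monitor interval
- `GET /get_monitor_interval`: get the current monitor interval

### Backward compatibility
- `cluster_sync` defaults to `false`

1. Cluster-level CCR sync
2. Dynamic monitor interval configuration

## API

### Create a cluster-level sync task

Endpoint: `POST /create_ccr`

Request parameters:

Parameter description:
- `cluster_sync`: set to `true` to enable cluster-level sync
- `name`: base name of the sync task; the actual task name has the database name appended as a suffix
- `src`/`dest`: connection information for the source and destination clusters

Response example:

### Update the monitor interval

Endpoint: `POST /update_monitor_interval`

Request parameters:

Parameter description:
- `interval_seconds`: monitor interval in seconds; must be greater than 0

Response example:

### Get the current monitor interval

Endpoint: `GET /get_monitor_interval`

Response example: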
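
The request and response bodies are not included above. As a rough illustration of the one documented constraint (`interval_seconds` must be greater than 0), the sketch below parses a request body and stores the interval atomically. Only the `interval_seconds` field name comes from the description; `updateIntervalRequest`, `updateMonitorInterval`, and `monitorIntervalSeconds` are hypothetical and do not reflect the PR's actual handler. It assumes Go 1.19+ for `atomic.Int64`.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"sync/atomic"
)

// updateIntervalRequest mirrors the documented interval_seconds parameter.
type updateIntervalRequest struct {
	IntervalSeconds int64 `json:"interval_seconds"`
}

// monitorIntervalSeconds holds the current interval; hypothetical storage.
var monitorIntervalSeconds atomic.Int64

// updateMonitorInterval validates the body and stores the new interval.
// Values of zero or below are rejected and leave the old interval in place.
func updateMonitorInterval(body []byte) error {
	var req updateIntervalRequest
	if err := json.Unmarshal(body, &req); err != nil {
		return fmt.Errorf("invalid request body: %w", err)
	}
	if req.IntervalSeconds <= 0 {
		return errors.New("interval_seconds must be greater than 0")
	}
	monitorIntervalSeconds.Store(req.IntervalSeconds)
	return nil
}

func main() {
	fmt.Println(updateMonitorInterval([]byte(`{"interval_seconds": 60}`)))
	fmt.Println(updateMonitorInterval([]byte(`{"interval_seconds": 0}`)))
	fmt.Println("current interval:", monitorIntervalSeconds.Load())
}
```
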