@liangxin1300 liangxin1300 commented Jul 9, 2025

Problem

When SSH is not available on a peer node, configuring sbd fails partway through and might leave the cluster in an inconsistent state:

# crm sbd configure watchdog-timeout=120
INFO: No 'msgwait-timeout=' specified in the command, use 2*watchdog timeout: 240
INFO: Configuring disk-based SBD
INFO: Initializing SBD device /dev/sda5
INFO: Update SBD_WATCHDOG_DEV in /etc/sysconfig/sbd: /dev/watchdog0
ERROR: sbd.configure: Failed to run command test -d /etc/sysconfig || mkdir -p /etc/sysconfig on root@sle16-2: Cannot create SSH connection to root@sle16-2: ssh: connect to host sle16-2 port 22: Connection refused

See issue #1738

Changed

Raise an exception when any node's SSH port is detected to be unreachable:

# crm sbd configure watchdog-timeout=120
WARNING: host "sle16-2" is unreachable
ERROR: sbd.configure: There are unreachable nodes: sle16-2.
Please check the network connectivity before configuring SBD.
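
The check described above can be sketched roughly as follows. This is a minimal illustration, not the crmsh implementation: the function names `ssh_port_reachable` and `check_all_nodes_reachable`, the default port and timeout, and the local `UnreachableNodeError` class are all assumptions made for the example. Only the exception name (from the PR title) and the error wording (from the output above) come from this PR.

```python
import socket


class UnreachableNodeError(Exception):
    """Hypothetical stand-in for the exception named in the PR title."""


def ssh_port_reachable(host, port=22, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened.

    Port 22 and the 2-second timeout are assumed defaults for this sketch.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


def check_all_nodes_reachable(nodes, probe=ssh_port_reachable):
    """Raise UnreachableNodeError naming every node whose probe fails.

    `probe` is injectable so the check can be exercised without a network.
    """
    unreachable = [node for node in nodes if not probe(node)]
    if unreachable:
        raise UnreachableNodeError(
            "There are unreachable nodes: {}.".format(", ".join(unreachable))
        )
```

Running such a check before the first configuration step means the command can stop before touching any local or remote file, which is the behaviour the new output shows.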

@liangxin1300 liangxin1300 force-pushed the 20250709_no_ssh branch 2 times, most recently from 5b9aa52 to 79751a9, October 20, 2025 03:07
@liangxin1300 liangxin1300 changed the title from "Dev: utils: Raise UnreachableNodeError while detecting unreachable nodes" to "Fix: utils: Raise UnreachableNodeError while detecting unreachable nodes (bsc#1250645)", Oct 20, 2025
@liangxin1300 liangxin1300 force-pushed the 20250709_no_ssh branch 2 times, most recently from 3e97567 to c538eb3, October 20, 2025 09:00

codecov bot commented Oct 20, 2025

Codecov Report

❌ Patch coverage is 70.00000% with 9 lines in your changes missing coverage. Please review.
✅ Project coverage is 70.75%. Comparing base (2b481a7) to head (7aaa85e).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| crmsh/utils.py | 66.66% | 6 Missing ⚠️ |
| crmsh/ui_cluster.py | 75.00% | 3 Missing ⚠️ |
Additional details and impacted files
| Flag | Coverage Δ |
|---|---|
| integration | 55.19% <70.00%> (-0.01%) ⬇️ |
| unit | 52.94% <53.33%> (+<0.01%) ⬆️ |

| Files with missing lines | Coverage Δ |
|---|---|
| crmsh/qdevice.py | 98.03% <ø> (-0.01%) ⬇️ |
| crmsh/ui_cluster.py | 72.71% <75.00%> (+0.17%) ⬆️ |
| crmsh/utils.py | 67.48% <66.66%> (-0.09%) ⬇️ |

... and 1 file with indirect coverage changes

- Drop ping-based check and only use SSH to determine node reachability
- When SSH check fails, raise NoSSHError when config.core.no_ssh is set to yes
- Otherwise, raise ValueError as before
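
The branching in the last two bullets might look something like the sketch below. It follows the bullets as written and is not taken from the crmsh source: the `NoSSHError` class is defined locally as a placeholder, `report_ssh_failure` is a hypothetical helper name, the `no_ssh` argument models `config.core.no_ssh` as a plain boolean, and the message texts are illustrative only.

```python
class NoSSHError(Exception):
    """Hypothetical stand-in for the NoSSHError named in the commit message."""


def report_ssh_failure(host, no_ssh):
    """Raise the error type the commit message describes for a failed SSH check.

    `no_ssh` stands in for the config.core.no_ssh setting (assumed to be a bool
    here); the messages below are illustrative, not crmsh's actual wording.
    """
    if no_ssh:
        # SSH was deliberately disabled, so report that rather than a network fault.
        raise NoSSHError(f"cannot reach {host}: SSH is disabled by configuration")
    # Otherwise keep the previous behaviour and raise ValueError.
    raise ValueError(f'host "{host}" is unreachable via SSH')
```

Keeping `ValueError` for the default path means callers that already catch it continue to work, while the separate type lets callers tell a disabled-SSH setup apart from a connectivity problem.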
@liangxin1300 liangxin1300 changed the title from "Fix: utils: Raise UnreachableNodeError while detecting unreachable nodes (bsc#1250645)" to "Fix: utils: Raise UnreachableNodeError for those ssh unreachable nodes (bsc#1250645)", Oct 20, 2025
@liangxin1300 liangxin1300 marked this pull request as ready for review October 21, 2025 03:04