-
Notifications
You must be signed in to change notification settings - Fork 800
feat: add SELFCHECK command and related configuration options #1551
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
What I think off is about naming, honestly I would see default |
|
@jhumlick can I ask you for a review please? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't a full review. I haven't tested yet. I noticed a couple of things while reading the code
|
By the way, adding a new option and command is something that will have to wait until after 1.5 is published. So it would go towards the 1.6 release if we merge it. |
|
@val-ms there is no way to backport it? As far I understand 1.6 is a bit far in the future? |
Added comments for SELFCHECK command configuration.
Added documentation for EnableSelfCheckCommand option.
|
@dragoangel No sorry we're behind on 1.5 and between the release candidate and release - I'm trying my best not to merge any new features before we do. We don't put new features in patch versions (e.g., 1.5.1) either. So it will have to go into 1.6.0 which means it will make it into a release maybe 4-ish months from now, assuming we can get back on track with a 4-month feature release cadence. |
|
Okay, thanks. Eh that linters 😅 |
Autotests failed, I honestly expected that they will go one after another and will have a delay, so on time autotests would trigger P.s. at least autotests showing that code is working 🤣 |
|
Moved this autotest a bit lower, will see if that help to get result I expect on already up-to-date database 😊 |
|
@val-ms can you please run approve once again so github runner can start build? |
Sorry for the delay. Ran it now |
Missed one test, will fix it 😊 |
|
done, @val-ms can you please re-review? 😊 |
|
Eh, looks like this test should be changed to be stable. As far I understand - we can run scan on non-reloaded yet clamd due to concurrent reload (which is enabled by default) I assume, right @val-ms? In last run - macos build - passed (which means we got Here is potential patches to achieve both options (requires testing & review): simple - accept both options: diff --git a/unit_tests/check_clamd.c b/unit_tests/check_clamd.c
index 0f0f0f0..1a1a1a1 100644
--- a/unit_tests/check_clamd.c
+++ b/unit_tests/check_clamd.c
@@ -35,6 +35,44 @@
#include <errno.h>
#include <time.h>
+/* Accept one-of expected replies by separating with '|' in the expect field.
+ * Example: "DBUPTODATE|RELOADING"
+ */
+static int expected_matches(const char *actual, const char *expect)
+{
+ /* fast path: no alternation */
+ if (!expect || !strchr(expect, '|'))
+ return (expect && actual) ? (strcmp(actual, expect) == 0) : (actual == expect);
+
+ /* copy to a scratch buffer so we can strtok it safely */
+ char tmp[256];
+ size_t elen = strlen(expect);
+ if (elen >= sizeof(tmp)) {
+ /* fallback: if expect is absurdly long, do strict compare */
+ return strcmp(actual, expect) == 0;
+ }
+ memcpy(tmp, expect, elen + 1);
+
+ for (char *tok = strtok(tmp, "|"); tok; tok = strtok(NULL, "|")) {
+ if (strcmp(actual, tok) == 0)
+ return 1;
+ }
+ return 0;
+}
+
@@ -280,7 +318,7 @@ static void run_one_test(const struct req *r)
/* ... code that sends r->cmd and reads reply into 'buf' ... */
- if (strcmp(buf, r->expect) != 0) {
+ if (!expected_matches(buf, r->expect)) {
ck_abort_msg("Wrong reply for command %s\nReceived: \n%s\n\nExpected: \n%s\n",
r->cmd, buf, r->expect ? r->expect : "(null)");
}
}
@@ -612,9 +650,9 @@ static struct req tests[] = {
/* ping */
{"PING", NULL, "PONG", 1, 0, IDS_REJECT},
- /* selfcheck on freshly loaded database */
- {"SELFCHECK", NULL, "DBUPTODATE", 1, 0, IDS_REJECT},
+ /* SELFCHECK can legitimately be RELOADING while the db worker finishes. */
+ {"SELFCHECK", NULL, "DBUPTODATE|RELOADING", 1, 0, IDS_REJECT},
/* unknown commands */
};verify that diff --git a/unit_tests/check_clamd.c b/unit_tests/check_clamd.c
index 1234567..89abcde 100644
--- a/unit_tests/check_clamd.c
+++ b/unit_tests/check_clamd.c
@@ -610,9 +610,6 @@ static struct req tests[] = {
/* ping */
{"PING", NULL, "PONG", 1, 0, IDS_REJECT},
- /* selfcheck on freshly loaded database */
- {"SELFCHECK", NULL, "DBUPTODATE", 1, 0, IDS_REJECT},
-
/* unknown commands */
};
@@ -750,6 +747,63 @@ START_TEST(test_stats)
}
END_TEST
+/* Robust SELFCHECK: tolerate RELOADING for a short while, then require DBUPTODATE */
+#define SELFCHECK_EXPECT "DBUPTODATE"
+#define SELFCHECK_RELOADING "RELOADING"
+
+START_TEST(test_selfcheck)
+{
+ char *recvdata = NULL;
+ size_t len;
+ int rc;
+ int attempts = 0;
+ const int max_attempts = 60; /* timeout ~3m with check each 3s */
+ const int sleep_ms = 3000;
+
+ conn_setup();
+
+ do {
+ const char *cmd = "nSELFCHECK\n";
+ len = strlen(cmd);
+ rc = send(sockd, cmd, len, 0);
+ ck_assert_msg((size_t)rc == len, "Unable to send(): %s\n", strerror(errno));
+
+ recvdata = (char *)recvfull(sockd, &len);
+ ck_assert_msg(recvdata != NULL, "recvfull() returned NULL");
+
+ /* Trim trailing newlines */
+ while (len > 0 && (recvdata[len - 1] == '\n' || recvdata[len - 1] == '\r')) {
+ recvdata[--len] = '\0';
+ }
+
+ if (strcmp(recvdata, SELFCHECK_EXPECT) == 0) {
+ /* success */
+ free(recvdata);
+ conn_teardown();
+ return;
+ }
+
+ if (strcmp(recvdata, SELFCHECK_RELOADING) != 0) {
+ ck_abort_msg("Wrong reply for SELFCHECK: '%s' (expected DBUPTODATE or RELOADING)", recvdata);
+ }
+
+ /* still reloading, wait then retry */
+ free(recvdata);
+ recvdata = NULL;
+
+#if defined(_WIN32)
+ Sleep(sleep_ms);
+#else
+ struct timespec ts = { .tv_sec = sleep_ms / 1000,
+ .tv_nsec = (sleep_ms % 1000) * 1000000L };
+ nanosleep(&ts, NULL);
+#endif
+ } while (++attempts < max_attempts);
+
+ ck_abort_msg("SELFCHECK did not reach DBUPTODATE within timeout");
+
+ conn_teardown();
+}
+END_TEST
+
Suite *make_clamd_suite(void)
{
Suite *s;
@@ -780,6 +834,7 @@ Suite *make_clamd_suite(void)
tcase_add_test(tc_clamd, test_stats);
+ tcase_add_test(tc_clamd, test_selfcheck);
suite_add_tcase(s, tc_clamd);
return s;
|
Yes you can scan while reloading with the concurrent reload feature enabled.
I kind of like the second option more where it retries waiting for the reload to complete. Do you agree? |
|
Yes, same for me @val-ms 😊, I don't want to mess with random commits, I will try running this tests locally to be sure that they would work as expected and then will add it to PR. Meanwhile it would be nice if you were able to test clamd manually on your local env |
|
P.s. on my k8s env I disabled concurrent reload of dbs to economy ram memory (2x time less requests for a pod) and have liveness & readiness probes, so when clamd of 1 pod reloading it's just not getting incoming traffic and it's routed to another pod, forgot that concurrent reload is enabled by default 😅. There is minimum of 5 pods and maximum of 15 via horizontal pod auto scaling 😊, so there is always some pods to respond |
Added robust SELFCHECK test to handle RELOADING state.
|
Hello @val-ms, as far I can see ubuntu test build is passed - and one I modified |
|
@val-ms kind ping, as far of now all builds\tests are passed 😊 and Windows Server 2019 has been retired so it will never pass, need switch to windows-2022(windows-latest) or windows-2025 instead. UDP: take attempt to update |
|
The windows GitHub Actions build will be fixed by #1518 Please don't try to solve it yourself. We're able to test Windows more thoroughly on our internal Jenkins system so it isn't a huge deal. I do need to finish review but I have been trying to solve more critical blockers. As I mentioned earlier, your change appears to be a good thing but must wait until after we publish 1.5 and can start merging stuff to include with 1.6. Thank you for your patience. |
|
Oh, okay, I thought it wasn't merged due to other issues. Now I get, we waiting to 1.5 become tagged |
…on has been retired" This reverts commit 26c4681.
|
Reverted that small change on ci |
|
Hi @val-ms as 1.5 was released, can we merge this PR? |
Resolves #1528 by adding
SELFCHECKTCP command that utilizeSelfCheckfunctional (do reload only if it's required and provide answer if reload is triggered/running or skipped), but on demand via network instead of periodic interval which allow more fine controlled reload of clustered clamd via freshclam notify custom script.