-
Notifications
You must be signed in to change notification settings - Fork 6
chore(parser): remove old parser #81
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #81 +/- ##
==========================================
- Coverage 73.44% 72.30% -1.15%
==========================================
Files 193 191 -2
Lines 16015 15629 -386
==========================================
- Hits 11762 11300 -462
- Misses 3662 3745 +83
+ Partials 591 584 -7 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
📝 WalkthroughWalkthroughRemoves the legacy parser (ParseQuery and tokenizer) and migrates parsing to SeqQL across the codebase. Updates gRPC search to always use ParseSeqQL and drops header-based gating. Adjusts and refactors tests: deletes old parser tests/benchmarks, revises SeqQL filter tests, and updates integration tests. Minor call-site changes follow. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related issues
Suggested reviewers
Tip 🔌 Remote MCP (Model Context Protocol) integration is now available!Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats. ✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🔭 Outside diff range comments (1)
tests/integration_tests/single_test.go (1)
574-579
: Potential loop variable capture in parallel subtestsIf SingleEnvs returns pointers, the loop variable cfg may be captured by reference across parallel subtests. Shadow cfg within the loop to avoid heisenbugs.
for _, cfg := range suites.SingleEnvs() { - t.Run(cfg.Name, func(t *testing.T) { + cfg := cfg // capture per-iteration + t.Run(cfg.Name, func(t *testing.T) { t.Parallel() suite.Run(t, NewSingleTestSuite(cfg)) }) }
🧹 Nitpick comments (6)
parser/ast_test.go (1)
8-27
: Don’t leave flaky/commented tests; use t.Skip with context or remove themReplace the commented-out test with a minimal t.Skip stub (with an issue link) or delete the dead code to keep the suite clean. Also, when re-enabling, avoid nondeterminism (e.g., seed math/rand) to keep tests stable.
Example stub:
+func TestBuildingTree(t *testing.T) { + t.Skip("TODO(moflotas): investigate failure; tracked in #76") + // original assertions go here when fixed +}parser/seqql_filter.go (1)
103-124
: Avoid panic path in parser; return an error insteadPanicking on unexpected index types is risky in user-facing parsing. Prefer a typed error so callers can handle it gracefully.
Proposed change:
switch t { case seq.TokenizerTypeKeyword, seq.TokenizerTypePath: // ... return &ASTNode{Value: &Literal{Field: fieldName, Terms: terms}}, nil case seq.TokenizerTypeText: // ... return buildAndTree(tokens), nil default: - panic(fmt.Errorf("BUG: unexpected index type: %d", t)) + return nil, fmt.Errorf("unsupported index type %d for field %q", t, fieldName) }storeapi/grpc_search.go (2)
192-195
: Inline return of AST root.Minor: avoid the temporary variable to keep it tight.
Apply this diff:
- ast := seqql.Root - - return ast, nil + return seqql.Root, nil
32-42
: Tracing attribute bug: “to” uses req.From instead of req.To.This skews trace metadata and can mislead analysis.
Apply this diff:
- span.AddAttributes(trace.Int64Attribute("to", req.From)) + span.AddAttributes(trace.Int64Attribute("to", req.To))parser/seqql_filter_test.go (2)
408-415
: Use runes in getPerm to avoid breaking multi-byte sequences.Templates are ASCII today, but using []rune makes the helper future-proof and robust with Unicode.
Apply this diff:
-func getPerm(p []int, s string) string { - res := []byte(s) - for i, v := range p { - res[i], res[i+v] = res[i+v], res[i] - } - return string(res) -} +func getPerm(p []int, s string) string { + res := []rune(s) + for i, v := range p { + res[i], res[i+v] = res[i+v], res[i] + } + return string(res) +}
444-460
: Commented-out stress test: consider a build tag or t.Skip.Keeping dead code reduces clarity. If you want it around, guard it behind a dedicated build tag (e.g., //go:build stress) or convert to a test that calls t.Skip with context.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (12)
frac/processor/eval_test.go
(1 hunks)parser/ast_test.go
(1 hunks)parser/bench_test.go
(0 hunks)parser/parser_test.go
(0 hunks)parser/process_test.go
(0 hunks)parser/query_parser.go
(0 hunks)parser/seqql_filter.go
(1 hunks)parser/seqql_filter_test.go
(5 hunks)parser/token_parser.go
(0 hunks)storeapi/grpc_search.go
(2 hunks)tests/integration_tests/integration_test.go
(2 hunks)tests/integration_tests/single_test.go
(4 hunks)
💤 Files with no reviewable changes (5)
- parser/parser_test.go
- parser/bench_test.go
- parser/process_test.go
- parser/query_parser.go
- parser/token_parser.go
🧰 Additional context used
🧬 Code Graph Analysis (4)
parser/seqql_filter.go (2)
seq/tokenizer.go (6)
TokenizerType
(20-20)TokenAll
(10-10)TokenizerTypeKeyword
(24-24)TokenExists
(11-11)TokenIndex
(12-12)TokenizerTypeNoop
(23-23)seq/mapping.go (1)
Mapping
(89-89)
frac/processor/eval_test.go (1)
parser/seqql.go (1)
ParseSeqQL
(28-58)
storeapi/grpc_search.go (4)
storeapi/grpc_v1.go (1)
GrpcV1
(83-98)parser/ast_node.go (1)
ASTNode
(8-11)seq/tokenizer.go (1)
TokenAll
(10-10)parser/seqql.go (1)
ParseSeqQL
(28-58)
tests/integration_tests/single_test.go (1)
tests/suites/single.go (1)
AllFracEnvs
(94-98)
🔇 Additional comments (15)
parser/seqql_filter.go (2)
16-21
: Built-in tokenizer mapping looks goodTreating all, exists, and _index as keyword fields is reasonable and keeps special tokens predictable.
22-36
: Confirm: Defaulting to Keyword when mapping is nil is intentionalindexType returns Keyword when userMapping is nil; parseSeqQLFieldFilter then allows any field (unless explicitly Noop). This changes error behavior vs. “field is not indexed” and effectively enables queries on arbitrary fields when no mapping is provided.
- If this is meant to align with “Index all fields” behavior, consider tying it to config (e.g., IndexAllFields) rather than mapping == nil.
- If not intended, return Noop here and let the caller surface “not indexed”.
Please confirm intended semantics.
Also applies to: 46-49
frac/processor/eval_test.go (1)
76-79
: SeqQL migration in tests: LGTMSwitching to ParseSeqQL and feeding buildEvalTree with query.Root is correct and consistent with the new API.
Also applies to: 86-89
tests/integration_tests/single_test.go (4)
144-164
: Nice table-driven rewrite for basic search casesThe consolidation improves readability and maintenance without changing semantics.
244-269
: NOT-query cases table: LGTMGood coverage across negations and combinations; concise and clear.
471-493
: Wildcard symbol cases table: LGTMThe explicit quoting/escaping scenarios are well covered.
529-543
: Index-all-fields assertions: LGTMResetting mapping and enabling IndexAllFields with restart is clear; expectations align with SeqQL behavior.
tests/integration_tests/integration_test.go (2)
1402-1412
: Range-query table is clear and correctThe bracket/parenthesis semantics and wildcard upper bounds are well represented; counts match the generated dataset.
1710-1721
: Path search: quoting and patterns look correctQuoting path literals and exercising segment/dot/wildcard variants is aligned with SeqQL path semantics.
storeapi/grpc_search.go (3)
102-104
: Call-site update to SeqQL-only parse looks correct.Removing ctx from parseQuery and invoking the SeqQL path directly is consistent with the PR objective and keeps tracing around the call intact.
183-195
: Single SeqQL-based parsing path: good simplification.This consolidates logic and standardizes error handling to InvalidArgument, aligning with the migration.
183-191
: Empty-query fallback (all: ) is safe — no action required.*Quick check: seq.TokenAll == "all" and parser treats it as a built-in field. parseQuery sets seq.TokenAll + ":*", indexType consults builtinMapping (which contains TokenAll) so ParseSeqQL accepts all even when the user mapping is nil or missing that field.
Relevant locations:
- storeapi/grpc_search.go — parseQuery sets fallback: seq.TokenAll + ":*" (around line 185)
- seq/tokenizer.go — const TokenAll = "all" (line ~10)
- parser/seqql_filter.go — builtinMapping contains seq.TokenAll and indexType() falls back to it (lines ~16–32)
- parser/token_literal.go — DumpSeqQL prints "" for all: (lines ~25–27)
- parser/seqql.go — top-level "*" is parsed into Literal with Field seq.TokenAll (around lines ~354–358)
parser/seqql_filter_test.go (3)
194-206
: Nice expansion of range filter coverage.Added cases cover wildcards, quoted/unquoted forms, whitespace, and mixed constructs. This materially improves confidence in range parsing.
471-472
: Benchmark cleanup LGTM.Using
_ = query.Root
avoids unused-variable churn while keeping the benchmark semantics.
484-485
: Long benchmark cleanup LGTM.Same as above; keeps the benchmark focused.
func nextPerm(p []int) { | ||
for i := len(p) - 1; i >= 0; i-- { | ||
if i == 0 || p[i] < len(p)-i-1 { | ||
p[i]++ | ||
return | ||
} | ||
p[i] = 0 | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Factorial blow-up risk: nextPerm-driven fuzz enumerates n! permutations.
With templates up to length 10–11 (e.g., "AND OR NOT"), the loop runs millions of iterations per template. This will significantly slow or time out CI.
Consider sampling instead of exhaustive enumeration. For example, replace the factorial loop with bounded random shuffles:
// Replace the exhaustive permutation loop with bounded random samples.
const maxSamplesPerTemplate = 5000
for _, template := range templates {
if len(template) >= 12 {
t.Skipf("template %q is too long for fuzz sampling", template)
}
r := rand.New(rand.NewSource(1)) // deterministic for CI
b := []rune(template)
for i := 0; i < maxSamplesPerTemplate; i++ {
r.Shuffle(len(b), func(i, j int) { b[i], b[j] = b[j], b[i] })
s := string(b)
_, err := ParseSeqQL(s, nil)
require.Errorf(t, err, "query: %s", s)
}
}
If you prefer to retain nextPerm/getPerm, add a hard cap on iterations and break after N checks.
func BenchmarkParsing(b *testing.B) { | ||
str := `service: "some service" AND level:1` | ||
for i := 0; i < b.N; i++ { | ||
exp, _ = ParseQuery(str, seq.TestMapping) | ||
} | ||
} | ||
|
||
func BenchmarkParsingLong(b *testing.B) { | ||
str := `((NOT ((((m:19 OR m:20) OR m:18) AND m:16) OR ((NOT (m:25 OR m:26)) AND m:12))) OR (((NOT m:29) AND m:22) OR (((m:31 OR m:32) AND m:14) OR (m:27 AND m:28))))` | ||
for i := 0; i < b.N; i++ { | ||
exp, _ = ParseQuery(str, seq.TestMapping) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
those benchmarks are already present in seqql_filter_test.go
} | ||
} | ||
|
||
func TestAggregationFilter(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't get the point of this test, since it tests the function which only used in tests, so I removed it. But double-check me, please
} | ||
|
||
// error messages are actually readable | ||
func TestParserErr(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
already present in seqql_filter_test.go
return string(res) | ||
} | ||
|
||
func TestParserFuzz(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
already present in seqql_filter_test.go
} | ||
} | ||
|
||
func TestParseRange(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed this test, since it seems redundant to me. Other tests in pattern
package cover Range
filter functionality
} | ||
} | ||
|
||
func TestTokenization(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
already present in seqql_filter_test.go
} | ||
} | ||
|
||
func TestTokenizationCaseSensitive(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
already present in seqql_filter_test.go
} | ||
} | ||
|
||
func TestExistsCaseSensitive(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
already present in seqql_filter_test.go
assert.Equal(t, "*", r.To.Data) | ||
} | ||
|
||
func TestPropagateNot(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
already present in seqql_filter_test.go
} | ||
} | ||
|
||
func TestWildcardText(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
already present in seqql_filter_test.go
Fixes #76
Summary by CodeRabbit