Skip to content

Conversation

moflotas
Copy link
Contributor

@moflotas moflotas commented Aug 15, 2025

Fixes #76

Summary by CodeRabbit

  • New Features
    • Expanded and standardized range filter support with inclusive/exclusive bounds.
  • Changes
    • Searches now use SeqQL exclusively.
    • Path queries must be quoted (e.g., request_uri:"/one/two").
    • Clearer errors for unindexed fields; user-defined mappings take precedence over built-ins.
  • Refactor
    • Simplified query parsing flow by consolidating on a single parser.
  • Tests
    • Reorganized and expanded range and fuzz tests; removed legacy parser tests and benchmarks.

@codecov-commenter
Copy link

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 72.30%. Comparing base (53031b9) to head (ba0024e).

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #81      +/-   ##
==========================================
- Coverage   73.44%   72.30%   -1.15%     
==========================================
  Files         193      191       -2     
  Lines       16015    15629     -386     
==========================================
- Hits        11762    11300     -462     
- Misses       3662     3745      +83     
+ Partials      591      584       -7     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link

coderabbitai bot commented Aug 15, 2025

📝 Walkthrough

Walkthrough

Removes the legacy parser (ParseQuery and tokenizer) and migrates parsing to SeqQL across the codebase. Updates gRPC search to always use ParseSeqQL and drops header-based gating. Adjusts and refactors tests: deletes old parser tests/benchmarks, revises SeqQL filter tests, and updates integration tests. Minor call-site changes follow.

Changes

Cohort / File(s) Summary
Parser deprecation/removal
parser/query_parser.go, parser/token_parser.go
Deletes legacy query parser, tokenization utilities, and public APIs (ParseQuery, ParseSingleTokenForTests, ParseAggregationFilter).
Test suite cleanup (old parser)
parser/ast_test.go, parser/bench_test.go, parser/parser_test.go, parser/process_test.go
Removes or disables historical AST/parser tests and benchmarks; retains limited helpers and TODOs.
SeqQL filter updates and tests
parser/seqql_filter.go, parser/seqql_filter_test.go
Adds builtin mapping and index resolution helper; enforces indexed fields; reorganizes and expands range tests; adds fuzz helpers; minor benchmark/test fixes.
gRPC parser path simplified
storeapi/grpc_search.go
Drops context/header gating; parseQuery now always uses ParseSeqQL and returns AST root; removes unused imports and ctx param.
Consumer migration to SeqQL
frac/processor/eval_test.go
Switches tests from ParseQuery to ParseSeqQL; updates buildEvalTree invocation to use query.Root.
Integration tests update/refactor
tests/integration_tests/integration_test.go, tests/integration_tests/single_test.go
Normalizes range syntax cases; quotes path patterns; refactors some assertions into table-driven tests; preserves expected outcomes.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related issues

Suggested reviewers

  • forshev
  • eguguchkin

Tip

🔌 Remote MCP (Model Context Protocol) integration is now available!

Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats.

✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch 76-remove-old-parser

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

  • Visit our Status Page to check the current availability of CodeRabbit.
  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🔭 Outside diff range comments (1)
tests/integration_tests/single_test.go (1)

574-579: Potential loop variable capture in parallel subtests

If SingleEnvs returns pointers, the loop variable cfg may be captured by reference across parallel subtests. Shadow cfg within the loop to avoid heisenbugs.

 for _, cfg := range suites.SingleEnvs() {
-  t.Run(cfg.Name, func(t *testing.T) {
+  cfg := cfg // capture per-iteration
+  t.Run(cfg.Name, func(t *testing.T) {
     t.Parallel()
     suite.Run(t, NewSingleTestSuite(cfg))
   })
 }
🧹 Nitpick comments (6)
parser/ast_test.go (1)

8-27: Don’t leave flaky/commented tests; use t.Skip with context or remove them

Replace the commented-out test with a minimal t.Skip stub (with an issue link) or delete the dead code to keep the suite clean. Also, when re-enabling, avoid nondeterminism (e.g., seed math/rand) to keep tests stable.

Example stub:

+func TestBuildingTree(t *testing.T) {
+    t.Skip("TODO(moflotas): investigate failure; tracked in #76")
+    // original assertions go here when fixed
+}
parser/seqql_filter.go (1)

103-124: Avoid panic path in parser; return an error instead

Panicking on unexpected index types is risky in user-facing parsing. Prefer a typed error so callers can handle it gracefully.

Proposed change:

 switch t {
 case seq.TokenizerTypeKeyword, seq.TokenizerTypePath:
   // ...
   return &ASTNode{Value: &Literal{Field: fieldName, Terms: terms}}, nil
 case seq.TokenizerTypeText:
   // ...
   return buildAndTree(tokens), nil
 default:
-  panic(fmt.Errorf("BUG: unexpected index type: %d", t))
+  return nil, fmt.Errorf("unsupported index type %d for field %q", t, fieldName)
 }
storeapi/grpc_search.go (2)

192-195: Inline return of AST root.

Minor: avoid the temporary variable to keep it tight.

Apply this diff:

-	ast := seqql.Root
-
-	return ast, nil
+	return seqql.Root, nil

32-42: Tracing attribute bug: “to” uses req.From instead of req.To.

This skews trace metadata and can mislead analysis.

Apply this diff:

-		span.AddAttributes(trace.Int64Attribute("to", req.From))
+		span.AddAttributes(trace.Int64Attribute("to", req.To))
parser/seqql_filter_test.go (2)

408-415: Use runes in getPerm to avoid breaking multi-byte sequences.

Templates are ASCII today, but using []rune makes the helper future-proof and robust with Unicode.

Apply this diff:

-func getPerm(p []int, s string) string {
-	res := []byte(s)
-	for i, v := range p {
-		res[i], res[i+v] = res[i+v], res[i]
-	}
-	return string(res)
-}
+func getPerm(p []int, s string) string {
+	res := []rune(s)
+	for i, v := range p {
+		res[i], res[i+v] = res[i+v], res[i]
+	}
+	return string(res)
+}

444-460: Commented-out stress test: consider a build tag or t.Skip.

Keeping dead code reduces clarity. If you want it around, guard it behind a dedicated build tag (e.g., //go:build stress) or convert to a test that calls t.Skip with context.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 53031b9 and ba0024e.

📒 Files selected for processing (12)
  • frac/processor/eval_test.go (1 hunks)
  • parser/ast_test.go (1 hunks)
  • parser/bench_test.go (0 hunks)
  • parser/parser_test.go (0 hunks)
  • parser/process_test.go (0 hunks)
  • parser/query_parser.go (0 hunks)
  • parser/seqql_filter.go (1 hunks)
  • parser/seqql_filter_test.go (5 hunks)
  • parser/token_parser.go (0 hunks)
  • storeapi/grpc_search.go (2 hunks)
  • tests/integration_tests/integration_test.go (2 hunks)
  • tests/integration_tests/single_test.go (4 hunks)
💤 Files with no reviewable changes (5)
  • parser/parser_test.go
  • parser/bench_test.go
  • parser/process_test.go
  • parser/query_parser.go
  • parser/token_parser.go
🧰 Additional context used
🧬 Code Graph Analysis (4)
parser/seqql_filter.go (2)
seq/tokenizer.go (6)
  • TokenizerType (20-20)
  • TokenAll (10-10)
  • TokenizerTypeKeyword (24-24)
  • TokenExists (11-11)
  • TokenIndex (12-12)
  • TokenizerTypeNoop (23-23)
seq/mapping.go (1)
  • Mapping (89-89)
frac/processor/eval_test.go (1)
parser/seqql.go (1)
  • ParseSeqQL (28-58)
storeapi/grpc_search.go (4)
storeapi/grpc_v1.go (1)
  • GrpcV1 (83-98)
parser/ast_node.go (1)
  • ASTNode (8-11)
seq/tokenizer.go (1)
  • TokenAll (10-10)
parser/seqql.go (1)
  • ParseSeqQL (28-58)
tests/integration_tests/single_test.go (1)
tests/suites/single.go (1)
  • AllFracEnvs (94-98)
🔇 Additional comments (15)
parser/seqql_filter.go (2)

16-21: Built-in tokenizer mapping looks good

Treating all, exists, and _index as keyword fields is reasonable and keeps special tokens predictable.


22-36: Confirm: Defaulting to Keyword when mapping is nil is intentional

indexType returns Keyword when userMapping is nil; parseSeqQLFieldFilter then allows any field (unless explicitly Noop). This changes error behavior vs. “field is not indexed” and effectively enables queries on arbitrary fields when no mapping is provided.

  • If this is meant to align with “Index all fields” behavior, consider tying it to config (e.g., IndexAllFields) rather than mapping == nil.
  • If not intended, return Noop here and let the caller surface “not indexed”.

Please confirm intended semantics.

Also applies to: 46-49

frac/processor/eval_test.go (1)

76-79: SeqQL migration in tests: LGTM

Switching to ParseSeqQL and feeding buildEvalTree with query.Root is correct and consistent with the new API.

Also applies to: 86-89

tests/integration_tests/single_test.go (4)

144-164: Nice table-driven rewrite for basic search cases

The consolidation improves readability and maintenance without changing semantics.


244-269: NOT-query cases table: LGTM

Good coverage across negations and combinations; concise and clear.


471-493: Wildcard symbol cases table: LGTM

The explicit quoting/escaping scenarios are well covered.


529-543: Index-all-fields assertions: LGTM

Resetting mapping and enabling IndexAllFields with restart is clear; expectations align with SeqQL behavior.

tests/integration_tests/integration_test.go (2)

1402-1412: Range-query table is clear and correct

The bracket/parenthesis semantics and wildcard upper bounds are well represented; counts match the generated dataset.


1710-1721: Path search: quoting and patterns look correct

Quoting path literals and exercising segment/dot/wildcard variants is aligned with SeqQL path semantics.

storeapi/grpc_search.go (3)

102-104: Call-site update to SeqQL-only parse looks correct.

Removing ctx from parseQuery and invoking the SeqQL path directly is consistent with the PR objective and keeps tracing around the call intact.


183-195: Single SeqQL-based parsing path: good simplification.

This consolidates logic and standardizes error handling to InvalidArgument, aligning with the migration.


183-191: Empty-query fallback (all: ) is safe — no action required.*

Quick check: seq.TokenAll == "all" and parser treats it as a built-in field. parseQuery sets seq.TokenAll + ":*", indexType consults builtinMapping (which contains TokenAll) so ParseSeqQL accepts all even when the user mapping is nil or missing that field.

Relevant locations:

  • storeapi/grpc_search.go — parseQuery sets fallback: seq.TokenAll + ":*" (around line 185)
  • seq/tokenizer.go — const TokenAll = "all" (line ~10)
  • parser/seqql_filter.go — builtinMapping contains seq.TokenAll and indexType() falls back to it (lines ~16–32)
  • parser/token_literal.go — DumpSeqQL prints "" for all: (lines ~25–27)
  • parser/seqql.go — top-level "*" is parsed into Literal with Field seq.TokenAll (around lines ~354–358)
parser/seqql_filter_test.go (3)

194-206: Nice expansion of range filter coverage.

Added cases cover wildcards, quoted/unquoted forms, whitespace, and mixed constructs. This materially improves confidence in range parsing.


471-472: Benchmark cleanup LGTM.

Using _ = query.Root avoids unused-variable churn while keeping the benchmark semantics.


484-485: Long benchmark cleanup LGTM.

Same as above; keeps the benchmark focused.

Comment on lines +398 to +406
func nextPerm(p []int) {
for i := len(p) - 1; i >= 0; i-- {
if i == 0 || p[i] < len(p)-i-1 {
p[i]++
return
}
p[i] = 0
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Factorial blow-up risk: nextPerm-driven fuzz enumerates n! permutations.

With templates up to length 10–11 (e.g., "AND OR NOT"), the loop runs millions of iterations per template. This will significantly slow or time out CI.

Consider sampling instead of exhaustive enumeration. For example, replace the factorial loop with bounded random shuffles:

// Replace the exhaustive permutation loop with bounded random samples.
const maxSamplesPerTemplate = 5000

for _, template := range templates {
	if len(template) >= 12 {
		t.Skipf("template %q is too long for fuzz sampling", template)
	}
	r := rand.New(rand.NewSource(1)) // deterministic for CI
	b := []rune(template)
	for i := 0; i < maxSamplesPerTemplate; i++ {
		r.Shuffle(len(b), func(i, j int) { b[i], b[j] = b[j], b[i] })
		s := string(b)
		_, err := ParseSeqQL(s, nil)
		require.Errorf(t, err, "query: %s", s)
	}
}

If you prefer to retain nextPerm/getPerm, add a hard cap on iterations and break after N checks.

Comment on lines -11 to -23
func BenchmarkParsing(b *testing.B) {
str := `service: "some service" AND level:1`
for i := 0; i < b.N; i++ {
exp, _ = ParseQuery(str, seq.TestMapping)
}
}

func BenchmarkParsingLong(b *testing.B) {
str := `((NOT ((((m:19 OR m:20) OR m:18) AND m:16) OR ((NOT (m:25 OR m:26)) AND m:12))) OR (((NOT m:29) AND m:22) OR (((m:31 OR m:32) AND m:14) OR (m:27 AND m:28))))`
for i := 0; i < b.N; i++ {
exp, _ = ParseQuery(str, seq.TestMapping)
}
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

those benchmarks are already present in seqql_filter_test.go

}
}

func TestAggregationFilter(t *testing.T) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't get the point of this test, since it tests the function which only used in tests, so I removed it. But double-check me, please

}

// error messages are actually readable
func TestParserErr(t *testing.T) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

already present in seqql_filter_test.go

return string(res)
}

func TestParserFuzz(t *testing.T) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

already present in seqql_filter_test.go

}
}

func TestParseRange(t *testing.T) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed this test, since it seems redundant to me. Other tests in pattern package cover Range filter functionality

}
}

func TestTokenization(t *testing.T) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

already present in seqql_filter_test.go

}
}

func TestTokenizationCaseSensitive(t *testing.T) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

already present in seqql_filter_test.go

}
}

func TestExistsCaseSensitive(t *testing.T) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

already present in seqql_filter_test.go

assert.Equal(t, "*", r.To.Data)
}

func TestPropagateNot(t *testing.T) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

already present in seqql_filter_test.go

}
}

func TestWildcardText(t *testing.T) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

already present in seqql_filter_test.go

@ssnd ssnd requested review from dkharms and eguguchkin August 18, 2025 10:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Parser: Remove old parser
2 participants