Skip to content

Commit 41d5a8d

Browse files
authored
feat: port remaining squawk rules (#508)
will have claude do the work over night closes #131 will port https://github.com/kaaveland/eugene/blob/main/eugene/src/lints/rules.rs afterwards also added a new `agentic/` dir and two new `just` commands to standardise as well as track the prompts for such work. UPDATE: - refactored some of the rules to make them easier to consume. - added a bunch of tests - refactored the test framework to allow multiple statements per file (required to test some rules) - refactored the test framework to require explicit declaring each expected diagnostic - migrated `prefer jsonb` from eugene too. the remaining ones are either covered be squawk or require a bit more updates to the file context. will do them in a follow-up. see full comparison here: #305 (reply in thread)
1 parent 8e6da9d commit 41d5a8d

File tree

163 files changed

+6105
-88
lines changed

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

163 files changed

+6105
-88
lines changed

.gitignore

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -24,3 +24,5 @@ node_modules/
2424

2525
**/dist/
2626
.claude-session-id
27+
28+
squawk/

.sqlx/query-d61f2f56ce777c99593df240b3a126cacb3c9ed5f915b7e98052d58df98d480b.json renamed to .sqlx/query-332fd0e123d2e7cc1e9abdd6c8db4718faaa1005845c8f8f14c5c78e76a258eb.json

Lines changed: 9 additions & 3 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

Cargo.lock

Lines changed: 2 additions & 0 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

agentic/port_squawk_rules.md

Lines changed: 106 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,106 @@
1+
The goal is to port all missing rules from Squawk to our analyser.
2+
3+
Our analyser lives in the `pgt_analyser` crate. There is a `CONTRIBUTING.md` guide in that crate which explains how to add new rules. Please also read existing rules to see how it all works.
4+
5+
Then, I want you to check the rules in the squawk project which I copied here for convenience. The rules are in `squawk/linter/src/rules`. The implementation should be very similar to what we have, and porting them straightforward. Here a few things to watch out for though:
6+
7+
- although both libraries are using `libpg_query` to parse the SQL, the bindings can be different. Ours is in the `pgt_query` crate of you need a reference. The `protobuf.rs` file contains the full thing.
8+
- the context for each rule is different, but you can get the same information out of it:
9+
```rust
10+
pub struct RuleContext<'a, R: Rule> {
11+
// the ast of the target statement
12+
stmt: &'a pgt_query::NodeEnum,
13+
// options for that specific rule
14+
options: &'a R::Options,
15+
// the schema cache - also includes the postgres version
16+
schema_cache: Option<&'a SchemaCache>,
17+
// the file context which contains other statements in that file in case you need them
18+
file_context: &'a AnalysedFileContext,
19+
}
20+
21+
22+
pub struct AnalysedFileContext<'a> {
23+
// all statements in this file
24+
pub stmts: &'a Vec<pgt_query::NodeEnum>,
25+
26+
pos: usize,
27+
}
28+
29+
impl<'a> AnalysedFileContext<'a> {
30+
pub fn new(stmts: &'a Vec<pgt_query::NodeEnum>) -> Self {
31+
Self { stmts, pos: 0 }
32+
}
33+
34+
// all statements before the currently analysed one
35+
pub fn previous_stmts(&self) -> &[pgt_query::NodeEnum] {
36+
&self.stmts[0..self.pos]
37+
}
38+
39+
// total count of statements in this file
40+
pub fn stmt_count(&self) -> usize {
41+
self.stmts.len()
42+
}
43+
}
44+
```
45+
46+
In squawk, you will see:
47+
```rust
48+
// all statements of that file -> our analyser goes statement by statement but has access to the files content via `file_context`
49+
tree: &[RawStmt],
50+
// the postgres version -> we store it in the schema cache
51+
_pg_version: Option<Version>,
52+
// for us, this is always true
53+
_assume_in_transaction: bool,
54+
55+
```
56+
57+
Please always write idiomatic code!
58+
Only add comments to explain WHY the code is doing something. DO NOT write comments to explain WHAT the code is doing.
59+
60+
If you learn something new that might help in porting all the rules, please update this document.
61+
62+
LEARNINGS:
63+
- Use `cargo clippy` to check your code after writing it
64+
- The `just new-lintrule` command expects severity to be "info", "warn", or "error" (not "warning")
65+
- RuleDiagnostic methods: `detail(span, msg)` takes two parameters, `note(msg)` takes only one parameter
66+
- To check Postgres version: access `ctx.schema_cache().is_some_and(|sc| sc.version.major_version)` which gives e.g. 17
67+
- NEVER skip anything, or use a subset of something. ALWAYS do the full thing. For example, copy the entire non-volatile functions list from Squawk, not just a subset.
68+
- If you are missing features from our rule context to be able to properly implement a rule, DO NOT DO IT. Instead, add that rule to the NEEDS FEATURES list below. The node enum is generated from the same source as it is in squawk, so they have feature parity.
69+
- Remember to run `just gen-lint` after creating a new rule to generate all necessary files
70+
71+
Please update the list below with the rules that we need to migrate, and the ones that are already migrated. Keep the list up-to-date.
72+
73+
NEEDS FEATURES:
74+
75+
TODO:
76+
77+
DONE:
78+
- adding_field_with_default ✓ (ported from Squawk)
79+
- adding_foreign_key_constraint ✓ (ported from Squawk)
80+
- adding_not_null_field ✓ (ported from Squawk)
81+
- adding_primary_key_constraint ✓ (ported from Squawk)
82+
- adding_required_field (already exists in pgt_analyser)
83+
- ban_char_field ✓ (ported from Squawk)
84+
- ban_concurrent_index_creation_in_transaction ✓ (ported from Squawk)
85+
- ban_drop_column (already exists in pgt_analyser)
86+
- changing_column_type ✓ (ported from Squawk)
87+
- constraint_missing_not_valid ✓ (ported from Squawk)
88+
- ban_drop_database (already exists in pgt_analyser, as bad_drop_database in squawk)
89+
- ban_drop_not_null (already exists in pgt_analyser)
90+
- ban_drop_table (already exists in pgt_analyser)
91+
- prefer_big_int ✓ (ported from Squawk)
92+
- prefer_bigint_over_int ✓ (ported from Squawk)
93+
- prefer_bigint_over_smallint ✓ (ported from Squawk)
94+
- prefer_identity ✓ (ported from Squawk)
95+
- prefer_jsonb ✓ (new rule added)
96+
- prefer_text_field ✓ (ported from Squawk)
97+
- prefer_timestamptz ✓ (ported from Squawk)
98+
- disallow_unique_constraint ✓ (ported from Squawk)
99+
- renaming_column ✓ (ported from Squawk)
100+
- renaming_table ✓ (ported from Squawk)
101+
- require_concurrent_index_creation ✓ (ported from Squawk)
102+
- require_concurrent_index_deletion ✓ (ported from Squawk)
103+
- transaction_nesting ✓ (ported from Squawk)
104+
- prefer_robust_stmts ✓ (ported from Squawk - simplified version)
105+
106+
Lines changed: 21 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,23 @@
1-
#[derive(Default)]
2-
pub struct AnalysedFileContext {}
1+
pub struct AnalysedFileContext<'a> {
2+
pub stmts: &'a Vec<pgt_query::NodeEnum>,
33

4-
impl AnalysedFileContext {
5-
#[allow(unused)]
6-
pub fn update_from(&mut self, stmt_root: &pgt_query::NodeEnum) {}
4+
pos: usize,
5+
}
6+
7+
impl<'a> AnalysedFileContext<'a> {
8+
pub fn new(stmts: &'a Vec<pgt_query::NodeEnum>) -> Self {
9+
Self { stmts, pos: 0 }
10+
}
11+
12+
pub fn previous_stmts(&self) -> &[pgt_query::NodeEnum] {
13+
&self.stmts[0..self.pos]
14+
}
15+
16+
pub fn stmt_count(&self) -> usize {
17+
self.stmts.len()
18+
}
19+
20+
pub fn next(&mut self) {
21+
self.pos += 1;
22+
}
723
}

crates/pgt_analyse/src/context.rs

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -10,7 +10,7 @@ pub struct RuleContext<'a, R: Rule> {
1010
stmt: &'a pgt_query::NodeEnum,
1111
options: &'a R::Options,
1212
schema_cache: Option<&'a SchemaCache>,
13-
file_context: &'a AnalysedFileContext,
13+
file_context: &'a AnalysedFileContext<'a>,
1414
}
1515

1616
impl<'a, R> RuleContext<'a, R>

crates/pgt_analyse/src/registry.rs

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -159,7 +159,7 @@ impl RuleRegistry {
159159
pub struct RegistryRuleParams<'a> {
160160
pub root: &'a pgt_query::NodeEnum,
161161
pub options: &'a AnalyserOptions,
162-
pub analysed_file_context: &'a AnalysedFileContext,
162+
pub analysed_file_context: &'a AnalysedFileContext<'a>,
163163
pub schema_cache: Option<&'a pgt_schema_cache::SchemaCache>,
164164
}
165165

crates/pgt_analyse/src/rule.rs

Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -291,6 +291,8 @@ impl RuleDiagnostic {
291291
pub enum RuleSource {
292292
/// Rules from [Squawk](https://squawkhq.com)
293293
Squawk(&'static str),
294+
/// Rules from [Eugene](https://github.com/kaaveland/eugene)
295+
Eugene(&'static str),
294296
}
295297

296298
impl PartialEq for RuleSource {
@@ -303,6 +305,7 @@ impl std::fmt::Display for RuleSource {
303305
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
304306
match self {
305307
Self::Squawk(_) => write!(f, "Squawk"),
308+
Self::Eugene(_) => write!(f, "Eugene"),
306309
}
307310
}
308311
}
@@ -325,18 +328,23 @@ impl RuleSource {
325328
pub fn as_rule_name(&self) -> &'static str {
326329
match self {
327330
Self::Squawk(rule_name) => rule_name,
331+
Self::Eugene(rule_name) => rule_name,
328332
}
329333
}
330334

331335
pub fn to_namespaced_rule_name(&self) -> String {
332336
match self {
333337
Self::Squawk(rule_name) => format!("squawk/{rule_name}"),
338+
Self::Eugene(rule_name) => format!("eugene/{rule_name}"),
334339
}
335340
}
336341

337342
pub fn to_rule_url(&self) -> String {
338343
match self {
339344
Self::Squawk(rule_name) => format!("https://squawkhq.com/docs/{rule_name}"),
345+
Self::Eugene(rule_name) => {
346+
format!("https://kaveland.no/eugene/hints/{rule_name}/index.html")
347+
}
340348
}
341349
}
342350

crates/pgt_analyser/Cargo.toml

Lines changed: 6 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -16,12 +16,14 @@ pgt_analyse = { workspace = true }
1616
pgt_console = { workspace = true }
1717
pgt_diagnostics = { workspace = true }
1818
pgt_query = { workspace = true }
19+
pgt_query_ext = { workspace = true }
1920
pgt_schema_cache = { workspace = true }
2021
pgt_text_size = { workspace = true }
2122
serde = { workspace = true }
2223

2324
[dev-dependencies]
24-
insta = { version = "1.42.1" }
25-
pgt_diagnostics = { workspace = true }
26-
pgt_test_macros = { workspace = true }
27-
termcolor = { workspace = true }
25+
insta = { version = "1.42.1" }
26+
pgt_diagnostics = { workspace = true }
27+
pgt_statement_splitter = { workspace = true }
28+
pgt_test_macros = { workspace = true }
29+
termcolor = { workspace = true }

crates/pgt_analyser/src/lib.rs

Lines changed: 17 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -62,25 +62,29 @@ impl<'a> Analyser<'a> {
6262
pub fn run(&self, params: AnalyserParams) -> Vec<RuleDiagnostic> {
6363
let mut diagnostics = vec![];
6464

65-
let mut file_context = AnalysedFileContext::default();
65+
let roots: Vec<pgt_query::NodeEnum> = params.stmts.iter().map(|s| s.root.clone()).collect();
66+
let mut file_context = AnalysedFileContext::new(&roots);
67+
68+
for (i, stmt) in params.stmts.into_iter().enumerate() {
69+
let stmt_diagnostics: Vec<_> = {
70+
let rule_params = RegistryRuleParams {
71+
root: &roots[i],
72+
options: self.options,
73+
analysed_file_context: &file_context,
74+
schema_cache: params.schema_cache,
75+
};
6676

67-
for stmt in params.stmts {
68-
let rule_params = RegistryRuleParams {
69-
root: &stmt.root,
70-
options: self.options,
71-
analysed_file_context: &file_context,
72-
schema_cache: params.schema_cache,
73-
};
74-
75-
diagnostics.extend(
7677
self.registry
7778
.rules
7879
.iter()
7980
.flat_map(|rule| (rule.run)(&rule_params))
80-
.map(|r| r.span(stmt.range)),
81-
);
81+
.map(|r| r.span(stmt.range))
82+
.collect()
83+
}; // end immutable borrow
84+
85+
diagnostics.extend(stmt_diagnostics);
8286

83-
file_context.update_from(&stmt.root);
87+
file_context.next();
8488
}
8589

8690
diagnostics

0 commit comments

Comments
 (0)