Skip to content

Conversation

svix-jbrown
Copy link

Summary

Fixes #286

Checklist

  • Unit and integration tests covering the common scenarios were added
  • A human-readable description of the changes was provided so that we can include it in CHANGELOG later

@slvrtrn slvrtrn requested a review from abonander September 18, 2025 08:23
@slvrtrn
Copy link
Contributor

slvrtrn commented Sep 18, 2025

I am hesitant about expanding on the client-side binding, which is included in this client for historical reasons.

Ideally, you should use server-side params:

SET param_db = 'system';
SET param_tbl = 'numbers';
SET param_col = 'number';
SET param_val = '3';

SELECT {tbl:Identifier}.{col:Identifier}
FROM {db:Identifier}.{tbl:Identifier}
WHERE {col:Identifier} < {val:UInt64}

   ┌─number─┐
1. │      02. │      13. │      2 │
   └────────┘

Curl equivalent:

curl -XPOST "http://localhost:8123?param_db=system&param_tbl=numbers&param_col=number&param_val=3" --data-binary "SELECT {tbl:Identifier}.{col:Identifier} FROM {db:Identifier}.{tbl:Identifier} WHERE {col:Identifier} < {val:UInt64} FORMAT Pretty"

   ┏━━━━━━━━┓
   ┃ number ┃
   ┡━━━━━━━━┩
1. │      0 │
   ├────────┤
2. │      1 │
   ├────────┤
3. │      2 │
   └────────┘

And, finally, Rust equivalent:

use clickhouse::{error::Result, Client};

#[tokio::main]
async fn main() -> Result<()> {
    let client = Client::default().with_url("http://localhost:8123");

    let result = client
        .query(
            "
                SELECT {tbl:Identifier}.{col:Identifier}
                FROM {db:Identifier}.{tbl:Identifier}
                WHERE {col:Identifier} < {val:UInt64}
            ",
        )
        .param("db", "system")
        .param("tbl", "numbers")
        .param("col", "number")
        .param("val", 3u64)
        .fetch_all::<u64>()
        .await?;

    println!("Parametrized query output: {:?}", result);
    Ok(())
}

Which prints:

Parametrized query output: [0, 1, 2]

@slvrtrn
Copy link
Contributor

slvrtrn commented Sep 18, 2025

I've added an example for server-side params here: #288.
Perhaps it could work instead of continuing with merging a new ScopedIdentifier type for the client-side params?

@svix-jbrown
Copy link
Author

I've added an example for server-side params here: #288. Perhaps it could work instead of continuing with merging a new ScopedIdentifier type for the client-side params?

We have some very large queries; I'm worried about how well clickhouse will do when it's binding potentially hundreds of server-side params. Those would be some big http requests!

@abonander
Copy link
Contributor

We have some very large queries; I'm worried about how well clickhouse will do when it's binding potentially hundreds of server-side params. Those would be some big http requests!

That is a valid concern since query parameters are passed in the URL: https://clickhouse.com/docs/interfaces/http#cli-queries-with-parameters

The maximum length of a request URI appears to be governed by http_max_uri_size which defaults to 1 MiB: https://clickhouse.com/docs/interfaces/http#querying

However, what I would be more concerned about is if ClickHouse is behind a cloud load balancer. Both GCP and AWS have much tighter limits on the size of request headers they will accept:

It doesn't appear to be documented, but it looks like it might be actually be possible to submit a query as a multipart/form-data request and pass parameters that way: https://github.com/ClickHouse/ClickHouse/blob/8adba6a235b093f7b468b7d9a936c4e02330d280/src/Server/HTTP/HTMLForm.cpp#L110

I'm not sure if this is designed for general use, however. The comment here just mentions "external data for query processing" https://github.com/ClickHouse/ClickHouse/blob/8adba6a235b093f7b468b7d9a936c4e02330d280/src/Server/HTTPHandler.cpp#L820-L821

@abonander
Copy link
Contributor

@svix-jbrown can you elaborate a little about your use of query parameters? Are you using them to insert data, or do you have, e.g., SELECT queries with that many different parameters?

Copy link
Contributor

@abonander abonander left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're prepared to accept this since we maybe have some limitations with server-side params that we need to figure out.

Just a couple nits.

@svix-jbrown svix-jbrown changed the title feat: add ScopedIdentifier type feat: add QualifiedIdentifier type Sep 19, 2025
Copy link
Contributor

@abonander abonander left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good now, just needs a rebase and the merge conflict in CHANGELOG.md fixed.

@slvrtrn
Copy link
Contributor

slvrtrn commented Sep 22, 2025

I double-checked the current implementation.
QualifiedIdentifier seems to be an equivalent of the following:

let data = client
    .query("SELECT number FROM ?.?")
    .bind(Identifier("system"))  // db
    .bind(Identifier("numbers")) // table
    .with_option("limit", "3")
    .fetch_all::<u64>()
    .await?;

Perhaps it can work in your use case, and we don't need to bloat the client-side binding further?

@abonander
Copy link
Contributor

I double-checked the current implementation. QualifiedIdentifier seems to be an equivalent of the following:

let data = client
    .query("SELECT number FROM ?.?")
    .bind(Identifier("system"))  // db
    .bind(Identifier("numbers")) // table
    .with_option("limit", "3")
    .fetch_all::<u64>()
    .await?;

Perhaps it can work in your use case, and we don't need to bloat the client-side binding further?

One downside with that approach is it's perhaps not the most obvious solution. I think if that's the route we end up recommending, we should add an example to Identifier showing how it can be implemented.

One benefit of QualifiedIdentifier is that it doesn't require a change to the query to use, which could be useful with a dynamically chosen identifier:

let mut query = client.query("SELECT foo FROM ?");

query = if use_external_bar {
    query.bind(QualifiedIdentifier("external_db", "bar"))
} else {
    query.bind(Identifier("bar"))
};

impl Bind for QualifiedIdentifier<'_> {
#[inline]
fn write(&self, dst: &mut impl fmt::Write) -> Result<(), String> {
if self.0.len() > 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should fix the Clippy error:

Suggested change
if self.0.len() > 0 {
if !self.0.is_empty() {

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Create a version of Identifier that handles qualified identifiers

3 participants