Skip to content

Conversation

YasuhiroHarada
Copy link

This pull request enhances the ODBC driver by improving support for specific SQL functions and adding query-building logic for new functionality. The most important changes include enabling previously unsupported SQL APIs, adding query logic for SQLSpecialColumns and SQLStatistics, and refining query filtering for catalog and table information.

Enhancements to SQL API Support:

  • Enabled support for SQL_API_SQLSPECIALCOLUMNS and SQL_API_SQLSTATISTICS by uncommenting their definitions in SQLGetFunctions. This allows the driver to declare support for these APIs.

Query Logic Additions:

  • Implemented query-building logic for SQLSpecialColumns, including a placeholder query that returns an empty result set with the expected schema. This ensures compatibility with applications expecting this API.
  • Added query-building logic for SQLStatistics to retrieve index and statistics information from system.data_skipping_indices. The query includes filters for catalog and table names, ensuring accurate results based on input parameters.

Query Filtering Improvements:

  • Enhanced SQLTables to filter results by catalog name when provided, improving the specificity of returned data.

@CLAassistant
Copy link

CLAassistant commented Jul 22, 2025

CLA assistant check
All committers have signed the CLA.

@YasuhiroHarada YasuhiroHarada marked this pull request as ready for review July 22, 2025 07:11
@slabko
Copy link
Contributor

slabko commented Jul 22, 2025

Hi @YasuhiroHarada!

We appreciate your contribution!
Would it also be possible to add tests for your changes?

@YasuhiroHarada
Copy link
Author

Regarding the test failures in GitHub Actions, I tried to address them with the following changes. Could you please take a look and let me know your thoughts?

Updates to Performance Tests:

  • Updated performance tests to verify the unimplemented status of SQLTablePrivileges instead of SQLStatistics, reflecting the new implementation. ( 1753d5d )

@YasuhiroHarada
Copy link
Author

Integration tests have also been added.

Copy link
Contributor

@slabko slabko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much, @YasuhiroHarada, for your contribution. I'm really glad you had a moment to look into the issue with MS Access.

Overall, I understand the problem and what you're trying to achieve. However, there are still a few parts I'm unsure about, and I’ve added some questions in my comments.

Could you please take a look and help clarify them?

SQLSMALLINT NameLength2,
SQLTCHAR * TableName,
SQLSMALLINT NameLength3,
SQLUSMALLINT Unique,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we return nothing if Unique is set to SQL_INDEX_UNIQUE? Data skipping indexes do not guarantee uniqueness.

query << ")";
}
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this a simplified version of the constraint above (lines 911–917)?

if (is_pattern && !is_odbc_v2) {
    if (!isMatchAnythingCatalogFnPatternArg(catalog))
        query << " AND isNotNull(TABLE_CAT) AND coalesce(TABLE_CAT, '') LIKE '" << escapeForSQL(catalog) << "'";
}
else if (CatalogName) {
    query << " AND isNotNull(TABLE_CAT) AND coalesce(TABLE_CAT, '') == '" << escapeForSQL(catalog) << "'";
}

I understand that this filter didn’t work for you. Could you please provide an example where you encountered a problem?
I believe AND database = {catalog} does not always work, because the catalog name can be a pattern (e.g., %).

ODBC_CALL_ON_STMT_THROW(hstmt, SQLExecDirect(hstmt, ptcharCast(create_index_query.data()), SQL_NTS));
ODBC_CALL_ON_STMT_THROW(hstmt, SQLFreeStmt(hstmt, SQL_CLOSE));
// Index creation succeeded
} catch (const std::exception& e) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't be the case in the test environment. That would make sense only if we're intentionally skipping the test in certain environments, but it should never happen implicitly.

}

// Test SQLStatistics with actual data skipping index
TEST_F(CatalogFunctionsTest, SQLStatisticsWithDataSkippingIndex)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit confused by this test - has it already been implemented, or is it still a work in progress?

auto table_name = reader.getData<std::string>("TABLE_NAME");
auto index_name = reader.getData<std::string>("INDEX_NAME");

if (table_name.has_value() && table_name.value() == "test_index_table") {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this part is unfinished.

rc = SQLFetch(hstmt);
EXPECT_EQ(rc, SQL_NO_DATA);

ODBC_CALL_ON_STMT_THROW(hstmt, SQLFreeStmt(hstmt, SQL_CLOSE));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not necessary ClientTestBase sets hstmt for each test.
This applies to the other tests as well.

ODBC_CALL_ON_STMT_THROW(hstmt, SQLExecDirect(hstmt, ptcharCast(drop_table_query.data()), SQL_NTS));
ODBC_CALL_ON_STMT_THROW(hstmt, SQLFreeStmt(hstmt, SQL_CLOSE));
} catch (...) {
// Ignore cleanup errors
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They should still be logged, but even in the case of cleanup, it's better for the test to fail - because that will indicate the issue early.

YasuhiroHarada and others added 4 commits August 1, 2025 17:35
Co-authored-by: Andrew Slabko <andrew.slabko@clickhouse.com>
Co-authored-by: Andrew Slabko <andrew.slabko@clickhouse.com>
Co-authored-by: Andrew Slabko <andrew.slabko@clickhouse.com>
Co-authored-by: Andrew Slabko <andrew.slabko@clickhouse.com>
@slabko
Copy link
Contributor

slabko commented Aug 1, 2025

@YasuhiroHarada
I understand it might take a lot of time to address all the suggestions, so it’s okay if you leave them as they are. I will accept the PR as long as it helps with using the driver with MS Access. These suggestions are not critical, especially in the tests.

However, I have a few questions that I hope you can answer, since I am not very familiar with this area. Specifically:

  • Quick fix for CH ODBC with MS Access #275 #516 (comment), In SQLStatistics, In SQLStatistics, I noticed that the NON_UNIQUE column is always set to SQL_FALSE. This means the index is always marked as unique. But data-skipping indexes are not unique. Should we change this value? Or did I misunderstand something important?

  • Do you think we should return an empty result when the Unique parameter of SQLStatistics is set to SQL_INDEX_UNIQUE? Since data-skipping indexes are not unique, I'm worried client applications might misunderstand the structure of the data.

  • https://github.com/ClickHouse/clickhouse-odbc/pull/516/files/6ee4efbd04bb9a5053636688a1216964c2e416d1#r2245723056, Here, I’m concerned because the filtering behavior of the SQLTables function changes. Could you please explain why you suggested this change? Maybe you found a problem that I’m not aware of.

If you can help clarify these points, I'll merge the PR and add any missing parts.

Thank you @YasuhiroHarada again for your help, I truly appreciate it!

Comment on lines +1360 to +1401

auto func = [&](Statement & statement) {
const auto catalog = (CatalogName ? toUTF8(CatalogName, NameLength1) : statement.getParent().database);
const auto schema = (SchemaName ? toUTF8(SchemaName, NameLength2) : "");
const auto table = (TableName ? toUTF8(TableName, NameLength3) : "");

// Build query to retrieve index/statistics information
// ClickHouse has indices which we can report as statistics
std::stringstream query;
query << "SELECT "
"cast(database, 'Nullable(String)') AS TABLE_CAT, "
"cast('', 'Nullable(String)') AS TABLE_SCHEM, "
"cast(table, 'String') AS TABLE_NAME, "
"cast(" + toSqlQueryValue(SQL_FALSE) + ", 'Int16') AS NON_UNIQUE, "
"cast(NULL, 'Nullable(String)') AS INDEX_QUALIFIER, "
"cast(name, 'Nullable(String)') AS INDEX_NAME, "
"cast(" + toSqlQueryValue(SQL_INDEX_OTHER) + ", 'Int16') AS TYPE, "
"cast(1, 'Int16') AS ORDINAL_POSITION, "
"cast(NULL, 'Nullable(String)') AS COLUMN_NAME, "
"cast(NULL, 'Nullable(String)') AS ASC_OR_DESC, "
"cast(NULL, 'Nullable(Int32)') AS CARDINALITY, "
"cast(NULL, 'Nullable(Int32)') AS PAGES, "
"cast(NULL, 'Nullable(String)') AS FILTER_CONDITION "
"FROM system.data_skipping_indices "
"WHERE (1 == 1)";

// Add filters based on provided parameters
if (!catalog.empty() && catalog != "%") {
query << " AND database = '" << escapeForSQL(catalog) << "'";
}

if (!table.empty() && table != "%") {
query << " AND table = '" << escapeForSQL(table) << "'";
}

query << " ORDER BY NON_UNIQUE, TYPE, INDEX_NAME, ORDINAL_POSITION";
statement.executeQuery(query.str());

return SQL_SUCCESS;
};

return CALL_WITH_TYPED_HANDLE(SQL_HANDLE_STMT, StatementHandle, func);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is how I would implement a minimum working version of this function. I still need to understand what we should show in PAGES; for now, I took the size of the table and divided it by 4096. A better version of this function would also return data for the columns used to sort the data, as well as any other indices. However, this minimal version already makes MS Access work.

Suggested change
auto func = [&](Statement & statement) {
const auto catalog = (CatalogName ? toUTF8(CatalogName, NameLength1) : statement.getParent().database);
const auto schema = (SchemaName ? toUTF8(SchemaName, NameLength2) : "");
const auto table = (TableName ? toUTF8(TableName, NameLength3) : "");
// Build query to retrieve index/statistics information
// ClickHouse has indices which we can report as statistics
std::stringstream query;
query << "SELECT "
"cast(database, 'Nullable(String)') AS TABLE_CAT, "
"cast('', 'Nullable(String)') AS TABLE_SCHEM, "
"cast(table, 'String') AS TABLE_NAME, "
"cast(" + toSqlQueryValue(SQL_FALSE) + ", 'Int16') AS NON_UNIQUE, "
"cast(NULL, 'Nullable(String)') AS INDEX_QUALIFIER, "
"cast(name, 'Nullable(String)') AS INDEX_NAME, "
"cast(" + toSqlQueryValue(SQL_INDEX_OTHER) + ", 'Int16') AS TYPE, "
"cast(1, 'Int16') AS ORDINAL_POSITION, "
"cast(NULL, 'Nullable(String)') AS COLUMN_NAME, "
"cast(NULL, 'Nullable(String)') AS ASC_OR_DESC, "
"cast(NULL, 'Nullable(Int32)') AS CARDINALITY, "
"cast(NULL, 'Nullable(Int32)') AS PAGES, "
"cast(NULL, 'Nullable(String)') AS FILTER_CONDITION "
"FROM system.data_skipping_indices "
"WHERE (1 == 1)";
// Add filters based on provided parameters
if (!catalog.empty() && catalog != "%") {
query << " AND database = '" << escapeForSQL(catalog) << "'";
}
if (!table.empty() && table != "%") {
query << " AND table = '" << escapeForSQL(table) << "'";
}
query << " ORDER BY NON_UNIQUE, TYPE, INDEX_NAME, ORDINAL_POSITION";
statement.executeQuery(query.str());
return SQL_SUCCESS;
};
return CALL_WITH_TYPED_HANDLE(SQL_HANDLE_STMT, StatementHandle, func);
auto func = [&](Statement & statement) {
const auto catalog = (CatalogName ? toUTF8(CatalogName, NameLength1) : statement.getParent().database);
const auto schema = (SchemaName ? toUTF8(SchemaName, NameLength2) : "");
const auto table = (TableName ? toUTF8(TableName, NameLength3) : "");
// The driver only provides SQL_TABLE_STAT data, at the moment no information about the indices
std::stringstream query;
query << "SELECT "
"cast(database, 'Nullable(String)') AS TABLE_CAT, "
"cast(NULL, 'Nullable(String)') AS TABLE_SCHEM, "
"cast(name, 'String') AS TABLE_NAME, "
"cast(NULL, 'Nullable(Int16)') AS NON_UNIQUE, "
"cast(NULL, 'Nullable(String)') AS INDEX_QUALIFIER, "
"cast(NULL, 'Nullable(String)') AS INDEX_NAME, "
"cast(" + toSqlQueryValue(SQL_TABLE_STAT) + ", 'Int16') AS TYPE, "
"cast(NULL, 'Nullable(Int16)') AS ORDINAL_POSITION, "
"cast(NULL, 'Nullable(String)') AS COLUMN_NAME, "
"cast(NULL, 'Nullable(String)') AS ASC_OR_DESC, "
"cast(total_rows, 'Nullable(Int32)') AS CARDINALITY, "
"cast(ceil(total_bytes / 4096), 'Nullable(Int32)') AS PAGES, "
"cast(NULL, 'Nullable(String)') AS FILTER_CONDITION "
"FROM system.tables "
"WHERE name = " + toSqlQueryValue(table) + " AND database = " + toSqlQueryValue(catalog);
statement.executeQuery(query.str());
return SQL_SUCCESS;
};
return CALL_WITH_TYPED_HANDLE(SQL_HANDLE_STMT, StatementHandle, func);

@realtimeraytracer
Copy link

Hi everyone, I have some coding experience and would like to help. How can I help to get this fix ready to add to the next release?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants