Skip to content

fix: Fix incorrect edge table state when transforming between bundled and unbundled#28

Draft
zhanglei1949 wants to merge 1 commit intomainfrom
fix-etable-trans
Draft

fix: Fix incorrect edge table state when transforming between bundled and unbundled#28
zhanglei1949 wants to merge 1 commit intomainfrom
fix-etable-trans

Conversation

@zhanglei1949
Copy link
Collaborator

@zhanglei1949 zhanglei1949 commented Mar 11, 2026

When deleting properties, a unbundled table could become bundled, a bundled table could become unbundled.
When adding properties, a bundled table could become unbundled.

Greptile Summary

This PR fixes incorrect EdgeTable state when transitioning between bundled and unbundled CSR storage by introducing a remaining_col parameter to dropAndCreateNewBundledCSR so existing edge data is preserved during the transition. It also adds missing table_idx_/capacity_ resets, a missing break in a switch, and a null-safety guard in AddEdge.

Key changes and issues found:

  • dropAndCreateNewBundledCSR(remaining_col) correctly migrates real edge data into the new bundled CSR when a column is present; the nullptr path (used from AddProperties) fills with defaults — correct.
  • Logic bug: CHECK(meta_->properties.size() == 1) inside dropAndCreateNewBundledCSR fires when called from DeleteProperties, because the production call site (property_graph.cc:601) intentionally updates the schema after calling DeleteProperties, leaving meta_->properties at its pre-deletion size.
  • Logic bug: DeleteProperties unconditionally calls dropAndCreateNewBundledCSR when one column remains, but kVarchar properties are never bundled (EdgeSchema::is_bundled() returns false for a single kVarchar property). Calling create_csr with kVarchar hits LOG(FATAL) since string types are excluded from FOR_EACH_DATA_TYPE_NO_STRING.
  • Style gap: The new batch_put_edges_with_edata helper also omits kVarchar, consistent with the existing TODO but undocumented.
  • New tests in test_edge_table.cc cover the described scenarios and are good additions, but TestDeletePropertiesTransitionFromUnbundledToBundled exercises the two logic bugs above and will fail.

Confidence Score: 2/5

  • Not safe to merge — two logic bugs cause runtime crashes (CHECK failure / LOG(FATAL)) on the primary code path this PR is fixing.
  • The CHECK(meta_->properties.size() == 1) assertion in dropAndCreateNewBundledCSR will fire whenever it is called from DeleteProperties because the schema is updated after the edge table, leaving meta_ stale. Additionally, the unconditional call to dropAndCreateNewBundledCSR for any single remaining column fails for kVarchar types, which create_csr explicitly does not support. Both paths are exercised by the new tests added in this PR.
  • Pay close attention to src/storages/graph/edge_table.cc — specifically dropAndCreateNewBundledCSR and the new DeleteProperties branch.

Important Files Changed

Filename Overview
src/storages/graph/edge_table.cc Core fix for bundled/unbundled transitions — two logic bugs: CHECK(meta_->properties.size() == 1) fires when called from DeleteProperties (meta not yet updated), and kVarchar remaining columns incorrectly route to dropAndCreateNewBundledCSR which doesn't support string types.
include/neug/storages/graph/edge_table.h Header correctly updated to reflect the new dropAndCreateNewBundledCSR signature accepting a shared_ptr<ColumnBase> parameter.
tests/storage/test_edge_table.cc New tests cover empty-to-bundled, bundled-to-unbundled, and unbundled-to-bundled transitions; however the TestDeletePropertiesTransitionFromUnbundledToBundled test will fail at runtime because of the two logic bugs in the production code it exercises.
tests/storage/test_vertex_table.cc Trivial fix — adds the missing newline at end of file.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[DeleteProperties called] --> B{meta_->is_bundled?}
    B -- yes --> C{property in col_names?}
    C -- yes --> D[dropAndCreateNewUnbundledCSR true]
    C -- no --> E[return]
    B -- no --> F[table_->delete_column for each col]
    F --> G{table_->col_num after deletion}
    G -- 0 --> H[dropAndCreateNewUnbundledCSR true\nresets table_idx_ and capacity_]
    G -- 1 --> I[remaining_col = get_column_by_id 0]
    I --> J[dropAndCreateNewBundledCSR remaining_col]
    J --> K{CHECK meta properties.size == 1\n⚠️ FAILS if meta not yet updated}
    K -- pass --> L{remaining_col != nullptr?}
    L -- yes --> M[batch_export row_id_col\nbuild remaining_data vector]
    M --> N[batch_put_edges_with_edata\n⚠️ FATAL if kVarchar]
    L -- no --> O[batch_put_edges_with_default_edata]
    N --> P[drop table, reset table_idx_ / capacity_\nswap new CSRs]
    O --> P
    G -- other --> Q[no CSR rebuild needed]
Loading

Last reviewed commit: eed8593

Greptile also left 3 inline comments on this PR.

@zhanglei1949 zhanglei1949 marked this pull request as draft March 11, 2026 07:26
void EdgeTable::dropAndCreateNewBundledCSR() {
void EdgeTable::dropAndCreateNewBundledCSR(
std::shared_ptr<ColumnBase> remaining_col) {
CHECK(meta_->properties.size() == 1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CHECK fails when called from DeleteProperties

dropAndCreateNewBundledCSR is now called from DeleteProperties (line 817), but at that call site meta_ has not yet been updated — it still reflects the pre-deletion property list. The surrounding production code in property_graph.cc (line 601–602) intentionally calls edge_table.DeleteProperties before schema_.DeleteEdgeProperties, because the edge table's delete logic depends on the old schema state.

Concretely, if an unbundled edge table has 2 properties and one is deleted (leaving 1), meta_->properties.size() is still 2 when this function is entered from DeleteProperties, so the CHECK always fires. Even if CHECK is a no-op in release mode, create_csr at line 952 would then receive the first original type (which may not match the remaining column's type), causing wrong-type CSR construction or a LOG(FATAL) if that type is kVarchar.

A minimal fix is to derive the remaining property type from remaining_col->type() instead of relying on meta_->properties[0], and to replace the strict equality CHECK with an assertion against the table column count:

// instead of:
CHECK(meta_->properties.size() == 1);
new_out_csr = create_csr(meta_->oe_mutable, meta_->oe_strategy,
                         meta_->properties[0].id());
// ...
// consider:
//   DataTypeId remaining_type = (remaining_col != nullptr)
//       ? remaining_col->type()
//       : meta_->properties[0].id();
//   new_out_csr = create_csr(meta_->oe_mutable, meta_->oe_strategy, remaining_type);

Comment on lines +813 to +818
if (table_->col_num() == 0) {
dropAndCreateNewUnbundledCSR(true);
} else if (table_->col_num() == 1) {
auto remaining_col = table_->get_column_by_id(0);
dropAndCreateNewBundledCSR(remaining_col);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incorrect transition for kVarchar remaining property

dropAndCreateNewBundledCSR is called whenever exactly one column remains after deletion, but EdgeSchema::is_bundled() (in schema.cc:241–252) explicitly returns false when the single property is kVarchar. A kVarchar edge table is always unbundled, so attempting to bundle it will hit LOG(FATAL) inside create_csr (which dispatches via FOR_EACH_DATA_TYPE_NO_STRING, deliberately excluding strings).

The guard should use the remaining column's actual type — or equivalently check whether the edge should truly become bundled — rather than unconditionally calling dropAndCreateNewBundledCSR for any single-column result:

if (table_->col_num() == 0) {
  dropAndCreateNewUnbundledCSR(true);
} else if (table_->col_num() == 1) {
  auto remaining_col = table_->get_column_by_id(0);
  // Only transition to bundled for non-varchar types
  if (remaining_col->type() != DataTypeId::kVarchar) {
    dropAndCreateNewBundledCSR(remaining_col);
  }
  // For kVarchar the table stays unbundled; no CSR rebuild needed
}

Comment on lines +161 to +197
void batch_put_edges_with_edata(const std::vector<vid_t>& src_lid,
const std::vector<vid_t>& dst_lid,
DataTypeId property_type,
const std::vector<Property>& edge_data,
CsrBase* out_csr) {
switch (property_type) {
#define TYPE_DISPATCHER(enum_val, type) \
case DataTypeId::enum_val: { \
std::vector<type> typed_data; \
typed_data.reserve(edge_data.size()); \
for (const auto& prop : edge_data) { \
typed_data.emplace_back(PropUtils<type>::to_typed(prop)); \
} \
dynamic_cast<TypedCsrBase<type>*>(out_csr)->batch_put_edges( \
src_lid, dst_lid, typed_data); \
break; \
}
TYPE_DISPATCHER(kBoolean, bool);
TYPE_DISPATCHER(kInt32, int32_t);
TYPE_DISPATCHER(kUInt32, uint32_t);
TYPE_DISPATCHER(kInt64, int64_t);
TYPE_DISPATCHER(kUInt64, uint64_t);
TYPE_DISPATCHER(kFloat, float);
TYPE_DISPATCHER(kDouble, double);
TYPE_DISPATCHER(kDate, Date);
TYPE_DISPATCHER(kTimestampMs, DateTime);
TYPE_DISPATCHER(kInterval, Interval);
#undef TYPE_DISPATCHER
case DataTypeId::kEmpty: {
dynamic_cast<TypedCsrBase<EmptyType>*>(out_csr)->batch_put_edges(
src_lid, dst_lid, {});
break;
}
default:
LOG(FATAL) << "Unsupported edge property type "
<< static_cast<int>(property_type);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kVarchar / string type not handled

batch_put_edges_with_edata uses a hand-rolled TYPE_DISPATCHER that mirrors FOR_EACH_DATA_TYPE_NO_STRING but does not include kVarchar. batch_put_edges_with_default_edata (the existing counterpart, line 137) handles the same set and documents this gap with a // TODO(zhanglei) comment.

While kVarchar is currently never stored in a bundled CSR (and dropAndCreateNewBundledCSR is the only caller), the default branch silently calls LOG(FATAL), making the failure mode hard to trace. At a minimum, the comment from batch_put_edges_with_default_edata should be replicated here, and ideally a THROW_NOT_SUPPORTED_EXCEPTION (consistent with the existing helper) could be used in place of LOG(FATAL) to give callers a chance to recover.

Louyk14 pushed a commit that referenced this pull request Mar 12, 2026
Use cibuildwheel to build wheels for all platforms and archs.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant