Skip to content

feat: add SnapshotManager#542

Open
zhjwpku wants to merge 4 commits intoapache:mainfrom
zhjwpku:add_snapshot_manager
Open

feat: add SnapshotManager#542
zhjwpku wants to merge 4 commits intoapache:mainfrom
zhjwpku:add_snapshot_manager

Conversation

@zhjwpku
Copy link
Collaborator

@zhjwpku zhjwpku commented Jan 28, 2026

The test cases are derived from Java's TestSnapshotManager.java. Since MergeAppend is not supported yet, some tests are omitted for now and can be added later.

The test cases are derived from Java's TestSnapshotManager.java.
Since MergeAppend is not supported yet, some tests are omitted for
now and can be added later.
Copy link
Contributor

@evindj evindj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM only question that I have is why is UpdateSnapshotReference and SnapShotManager both exposed. Also, is there a way we can integration test this change?

/// and tags) and commit the changes.
Result<std::shared_ptr<UpdateSnapshotReference>> NewUpdateSnapshotReference();

/// \brief Create a new SnapshotManager to manage snapshots.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you be a bit more descriptive here especially how it differs from NewUpdateSnapshotReference() I am a little confuse as to why the transaction would expose both since SnapshotManager seems to be a wrapper on top of UpdateSnapshotReference

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that UpdateSnapshotReference should be added to the Transaction's pending updates, whereas SnapshotManager should no, just as Gang noted below, SnapshotManager has its own commit logic.

///
/// \param branch Which is name of SnapshotRef of type branch
/// \return Reference to this for method chaining
auto& ToBranch(this auto& self, const std::string& branch) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we rename SetTargetBranch to ToBranch as they are equivalent?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be precise, I was suggesting to remove ToBranch by reusing SetTargetBranch because they are literally equivalent. If we think ToBranch is a better name, we should just rename it without adding a new one.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, changed to SetTargetBranch.

///
/// \param branch Which is name of SnapshotRef of type branch
/// \return Reference to this for method chaining
auto& ToBranch(this auto& self, const std::string& branch) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be precise, I was suggesting to remove ToBranch by reusing SetTargetBranch because they are literally equivalent. If we think ToBranch is a better name, we should just rename it without adding a new one.


/// \brief Roll this table's data back to a specific Snapshot identified by id.
///
/// \param snapshot_id Long id of the snapshot to roll back table data to
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// \param snapshot_id Long id of the snapshot to roll back table data to
/// \param snapshot_id id of the snapshot to roll back table data to

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same for below

/// \brief Create a new tag pointing to the given snapshot id.
///
/// \param name Tag name
/// \param snapshot_id Snapshot ID for the head of the new tag
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's be consistent with id, ID and Snapshot ID in these comments.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All changed to Snapshot ID since it's used in other places.

return *this;
}

SnapshotManager& SnapshotManager::SetCurrentSnapshot(int64_t snapshot_id) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, we need to call CommitIfRefUpdatesExist() before anything in this SetCurrentSnapshot and RollbackToXXX below because different pending updates like UpdateSnapshotReference and SetSnapshot cannot be interleaved.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the Java impl behaves this way, but I'm not entirely sure about our implementation. After calling CommitIfRefUpdatesExist(), the transaction state becomes committed, which should prevent any further commits.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can remove CommitIfRefUpdatesExist and put its logic directly into Commit.

Result<std::shared_ptr<Snapshot>> SnapshotManager::Apply() { return base().Snapshot(); }

Status SnapshotManager::Commit() {
ICEBERG_RETURN_UNEXPECTED(CheckErrors());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we call CheckErrors() in the Apply()?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apply() removed, so I think the CheckErrors() here is still needed.

return *this;
}

Result<std::shared_ptr<Snapshot>> SnapshotManager::Apply() { return base().Snapshot(); }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If nothing happens on the transaction side and nowhere calls this, perhaps we don't even need the Apply function?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reasonable, let's just remove it for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants