Avoid persisting redundant tx sets#3501
Conversation
|
note: I realized some functionality is currently missing from the PR, so please don't review it yet |
0a06b7c to
a8fc982
Compare
|
This is up-to-date now. Note that protocol-next doesn't build at the moment because we need to update XDR upsteam. I'll do it after a round of review though, just in case XDR needs to change. |
src/main/PersistentState.cpp
Outdated
There was a problem hiding this comment.
Interesting, it looks like sqlite (which is what we use for tests) doesn't actually enforce this limit, but it's now too small because we reference tx sets by hash. I got a failure on Postgres though, so am currently fixing this.
UPD: this is fixed now (added extra tests for Postgres as well)
src/herder/TxSetFrame.h
Outdated
There was a problem hiding this comment.
This is also a factory method, isn't it? I would rather call it makeFromStoredTxSet for consistency with other factory methods. Also, could you please add a short doc comment?
There was a problem hiding this comment.
good idea, done.
src/main/PersistentState.cpp
Outdated
There was a problem hiding this comment.
nit: there is no need to use the pair constructor when you're using emplace
There was a problem hiding this comment.
True, just a habit of writing things down explicitly. Updated (in the other place too)
src/main/ApplicationImpl.cpp
Outdated
There was a problem hiding this comment.
Just to double-check: this should be safe for restarts without upgrades, right?
There was a problem hiding this comment.
Yeah, when a node upgrades its version to include this change, it'll do a schema upgrade, which will replace all of the old-style rows with new ones. If there is no upgrade, it means the schema is already up-to-date, which is what restoreState expects (as it queries PersistentStateV1)
src/herder/HerderImpl.cpp
Outdated
There was a problem hiding this comment.
nit: no need to use pair constructor with emplace
src/herder/test/HerderTests.cpp
Outdated
There was a problem hiding this comment.
nit: if (expectedSCPState) is sufficient, though I suppose you could use the explicit check for expressivity, so up to you
There was a problem hiding this comment.
kept it as-is if you don't mind: has_value is used pretty commonly in the codebase, and I do like the explicitness of it
c29ea2d to
862b15c
Compare
src/database/Database.cpp
Outdated
There was a problem hiding this comment.
Can't you make the schema upgrade on sqlite instead of skipping? That way in the future (when we squash sql upgrades) we can use the same code to create tables (and the schema is the same everywhere)
There was a problem hiding this comment.
ALTER COLUMN doesn't exist in sqlite. Instead, you have to create a new table and copy its contents from the old table. It doesn't really make sense to do this, given that this upgrade is a no-op on sqlite.
There was a problem hiding this comment.
oh I see, so a sad sqlite limitation.
So maybe update the comment to something a little more specific: "sqlite does not enforce size constraints and does not support updating columns".
Also, you should just move this to upgradeSCPDataV1Format: there is no reason to split the schema upgrade into 2 parts.
src/main/PersistentState.cpp
Outdated
There was a problem hiding this comment.
This is probably not the right place for this code: it's supposed to match logic that lives right now in HerderImpl. That way we let PersistentState be a "dumb" key/value store.
There was a problem hiding this comment.
Moved it to Herder.
src/main/PersistentState.cpp
Outdated
There was a problem hiding this comment.
I don't really like having this GC timer setup independently of what Herder is doing (at a minimum this should be managed by Herder).
Do we really need to GC using a timer?
It can be done on startup and every X times we store SCP state or maybe it should be based on the number of TxSets that we have in the database (we can estimate that from herder, doest not need to be exact)?
Maybe we should just reuse the code in HerderPersistence for all this (there is quite a bit of overlap, but I understand it may not be worth it)?
There was a problem hiding this comment.
I'm not a fan of the suggested approaches, because they both make it harder to reason about what's happening:
- we don't know how many times SCP persistence is called per ledger, so we can't really reason about how often tx sets are cleaned up (same thing with the number of tx sets).
- It makes testing harder as now we have to into account cases where purging triggers at different times depending on consensus.
- I went with the timer, as it seems to be the simplest approach that avoids anything complicated where we have to map Herder's state to the database and try to estimate a good time to do the cleanup.
I suggest we keep the timer approach. I do like the idea of moving it to Herder, this way we don't need extra start/shutdown logic in PersistentState, and it'll be easy to refactor the purging function based on your other suggestion.
There was a problem hiding this comment.
actually you're right, I was thinking that we could get arbitrary number of values from other validators, so we should try to GC more often in some situations, but this is only tracking what the current validator is doing.
src/herder/HerderImpl.cpp
Outdated
There was a problem hiding this comment.
this code and GC may be able to share code (and we want to GC before loading all those txsets)
There was a problem hiding this comment.
I had the same thought, but ultimately decided to keep them separate: the functions are sufficiently different, so refactoring didn't actually simplify much.
c01b46d to
be65ec2
Compare
3e7d437 to
f13ddfe
Compare
|
r+ f13ddfe |
Resolves #3499