When scanning for outputs used in a set of incoming blocks,
we expect that some of the inputs in their transactions will
not be found in the blockchain, as they could be in previous
blocks in that set. Those outputs will be scanned there at
a later point. In this case, we add a flag to control wehther
an output not being found is expected or not.
The recent change to not keep separate track of the blockchain
height caused the reported height to jump early in the lmdb
transaction (when the block data is added to the blocks table),
rather than at the end, after everything succeeded. Since the
block data is added before the transaction data, this caused
the transaction data to be saved with a height one more than
its expected value.
Fix this by saving the block data last. This should have no
side effects.
This replaces the epee and data_loggers logging systems with
a single one, and also adds filename:line and explicit severity
levels. Categories may be defined, and logging severity set
by category (or set of categories). epee style 0-4 log level
maps to a sensible severity configuration. Log files now also
rotate when reaching 100 MB.
To select which logs to output, use the MONERO_LOGS environment
variable, with a comma separated list of categories (globs are
supported), with their requested severity level after a colon.
If a log matches more than one such setting, the last one in
the configuration string applies. A few examples:
This one is (mostly) silent, only outputting fatal errors:
MONERO_LOGS=*:FATAL
This one is very verbose:
MONERO_LOGS=*:TRACE
This one is totally silent (logwise):
MONERO_LOGS=""
This one outputs all errors and warnings, except for the
"verify" category, which prints just fatal errors (the verify
category is used for logs about incoming transactions and
blocks, and it is expected that some/many will fail to verify,
hence we don't want the spam):
MONERO_LOGS=*:WARNING,verify:FATAL
Log levels are, in decreasing order of priority:
FATAL, ERROR, WARNING, INFO, DEBUG, TRACE
Subcategories may be added using prefixes and globs. This
example will output net.p2p logs at the TRACE level, but all
other net* logs only at INFO:
MONERO_LOGS=*:ERROR,net*:INFO,net.p2p:TRACE
Logs which are intended for the user (which Monero was using
a lot through epee, but really isn't a nice way to go things)
should use the "global" category. There are a few helper macros
for using this category, eg: MGINFO("this shows up by default")
or MGINFO_RED("this is red"), to try to keep a similar look
and feel for now.
Existing epee log macros still exist, and map to the new log
levels, but since they're used as a "user facing" UI element
as much as a logging system, they often don't map well to log
severities (ie, a log level 0 log may be an error, or may be
something we want the user to see, such as an important info).
In those cases, I tried to use the new macros. In other cases,
I left the existing macros in. When modifying logs, it is
probably best to switch to the new macros with explicit levels.
The --log-level options and set_log commands now also accept
category settings, in addition to the epee style log levels.
3ff54bdd Check for correct thread before ending batch transaction (Howard Chu)
eaf8470b Must wait for previous batch to finish before starting new one (Howard Chu)
c903c554 Don't cache block height, always get from DB (Howard Chu)
eb1fb601 Tweak default db-sync-mode to fast:async:1 (Howard Chu)
0693cff9 Use batch transactions when syncing (Howard Chu)
m_num_outputs keeps track of the number of outputs, which should
be the same as the size of both the output_txs and output_amounts
databases. If one goes out of sync, we need to throw to abort
whatever it is we were doing.
Add consts in a few places where it makes sense, avoid unnecessary
memory reallocation where we know the full size needed at the outset,
simplify and avoid memory copy.
25% of the outputs are selected from the last 5 days (if possible),
in order to avoid the common case of sending recently received
outputs again. 25% and 5 days are subject to review later, since
it's just a wallet level change.
Keep the immediate direct deps at the library that depends on them,
declare deps as PUBLIC so that targets that link against that library
get the library's deps as transitive deps.
Break dep cycle between blockchain_db <-> crytonote_core.
No code refactoring, just hide cycle from cmake so that
it doesn't complain (cycles are allowed only between
static libs, not shared libs).
This is in preparation for supproting BUILD_SHARED_LIBS cmake
built-in option for building internal libs as shared.
Since this queries block heights for blocks that may or may not
exist, queries for non existing blocks would throw an exception,
and that would slow down the loop a lot. 7 seconds to go through
a 30 hash list.
Fix this by adding an optional return block height to block_exists
and using this instead. Actual errors will still throw an
exception.
This also cuts down on log exception spam.
When RingCT is enabled, outputs from coinbase transactions
are created as a single output, and stored as RingCT output,
with a fake mask. Their amount is not hidden on the blockchain
itself, but they are then able to be used as fake inputs in
a RingCT ring. Since the output amounts are hidden, their
"dustiness" is not an obstacle anymore to mixing, and this
makes the coinbase transactions a lot smaller, as well as
helping the TXO set to grow more slowly.
Also add a new "Null" type of rct signature, which decreases
the size required when no signatures are to be stored, as
in a coinbase tx.
Since these are needed at the same time as the output pubkeys,
this is a whole lot faster, and takes less space. Only outputs
of 0 amount store the commitment. When reading other outputs,
a fake commitment is regenerated on the fly. This avoids having
to rewrite the database to add space for fake commitments for
existing outputs.
This code relies on two things:
- LMDB must support fixed size records per key, rather than
per database (ie, all records on key 0 are the same size, all
records for non 0 keys are same size, but records from key 0
and non 0 keys do have different sizes).
- the commitment must be directly after the rest of the data
in outkey and output_data_t.
This constrains the number of instances of any amount
to the unlocked ones (as defined by the default unlock time
setting: outputs with non default unlock time are not
considered, so may be counted as unlocked even if they are
not actually unlocked).
It sets the max number of threads to use for a parallel job.
This is different that the number of total threads, since monero
binaries typically start a lot of them.
Also bumped DB VERSION to 1
Another significant speedup and space savings:
Get rid of global_output_indices, remove indirection from output to keys
This is the change warptangent described on irc but never got to finish.
This speeds up wallet refresh by directly retrieving a tx's amount output indices.
It removes the indirection and walking the amount output duplicate list
for every amount in each requested tx.
"tx_outputs" is used by:
Amount output indices are needed for wallet refresh.
Global output indices are needed for removing a tx.
Both amount output indices and global output indices are now stored in
an array of 64-bit unsigned ints:
tx_outputs[<tx_hash>] -> [ <a1_oi, a1_gi, a2_oi, a2_gi, ...> ]
Previously it was:
tx_outputs[<tx_hash>] -> duplicate list of <a1_gi, a2_gi, a3_gi, ...>
The amount output list had to be walked for every amount in order to
find each amount's output index, by comparing the amount's global output
index with each one in the duplicate list until a match was found.
See also d045dfa7ce
This is a list of existing output amounts along with the number
of outputs of that amount in the blockchain.
The daemon command takes:
- no parameters: all outputs with at least 3 instances
- one parameter: all outputs with at least that many instances
- two parameters: all outputs within that many instances
The default starts at 3 to avoid massive spamming of all dust
outputs in the blockchain, and is the current minimum mixin
requirement.
An optional vector of amounts may be passed, to request
histogram only for those outputs.
bdec7cb BlockchainLMDB: Use DB error helper consistently (warptangent)
c5932eb BlockchainLMDB: Add DB error to exception (warptangent)
a49c355 Blockchain: Omit verbose time stats messages by default (warptangent)
When keys are contiguous and monotonically increasing, this gets
denser page utilization (doesn't leave padding in page splits).
Can't be used for keys that are inserted in random order (e.g. hashes)
In total this only saves around 1.5% of space compared to original
DB code. The previous patch accounted for 0.8% savings on its own;
the blocks tables just aren't that big.
1995923 BlockchainLMDB: Deal with DB exceptions at block level with particularity (warptangent)
c16cc20 BlockchainLMDB: Add sanity check for inconsistent state (warptangent)
9118d0a BlockchainLMDB: Call destructor on allocated txn if setup fails (warptangent)
f5581c3 BlockchainLMDB: Replace remaining txn pointer NULLs with nullptr (warptangent)
Add another DB error exception type to distinguish failed txn setup from
general use of txn.
This keeps the error handling flow the same as before the block-level
txn setup changes that moved control up a layer to BlockchainDB.
This improves blockchain reorganization time by allowing one of the more
expensive DB lookups when popping a block to not have to seek through a
long dup list in the "output_amounts" db. This is most noticeable for
HDDs.
See ffcf6bdb95
Data should be removed in the reverse order it was added.
This matches the order of removal in
blockchain_storage::pop_transaction_from_global_index.
See f11def012f
b39aae7 Tweak 45800a25e9 (hyc)
4a5a5ff blockchain: always stop the ioservice before returning (moneromooo-monero)
78b65cf db_lmdb: safety close db at exit (moneromooo-monero)
45800a2 db_lmdb: fix a strdup/delete[] mistmatch (moneromooo-monero)
79beed2 tests: fix various tests by using parameters better suited to monero (moneromooo-monero)
d0a8362 tests: fix some double spending tests (moneromooo-monero)
2358d0d tests: use 255 as a "too high" block version (moneromooo-monero)
f33a88c blockchain: fix a few block addition bugs (moneromooo-monero)
a9ff11c blockchain: fix an off by one error in unlocked time check (moneromooo-monero)
f294be3 blockchain: reinstate double spending checks in check_tx_inputs (moneromooo-monero)
737b6d6 blockchain: make some flag twiddling code closer to the original (moneromooo-monero)
81cb0fc blockchain: fix bitflipping test with quantized block rewards (moneromooo-monero)
22ddf09 blockchain: add missing m_tx_pool.on_blockchain_dec (moneromooo-monero)
d837c0c blockchain: fix switch to alternative blockchain for more than one block (moneromooo-monero)
5cec076 blockchain: add a missing validity check to rollback_blockchain_switching (moneromooo-monero)
3cabdb5 core: catch exceptions from get_output_key (moneromooo-monero)
5eef645 db: throw when given a non txout_to_key output to add (moneromooo-monero)
The check was explicit in the original version, so it seems
safer to make it explicit here, especially as it is now done
implicitely in a different place, away from the original check.
Data should be removed in the reverse order it was added. Not doing so
breaks assumptions and can cause problems in other DB implementations.
This matches the order of tx removal in
blockchain_storage::purge_block_data_from_blockchain.
This improves blockchain reorganization time by allowing one of the more
expensive DB lookups when popping a block to not have to seek through a
long dup list in the "output_amounts" subdb. This is most noticeable for
HDDs.
As before, the dup list is still walked if necessary (but in reverse),
and the global output index still confirmed to be the one looked for.
But under proper use, the result will be found at the end of the dup
list, so we start there.
Removing an amount output index is always done in the context of popping
a block, so the global output index being looked for should be the last
one in that amount key's dup list. Even if the txs themselves aren't
removed in reverse order (supposed to be according to original
implementation), the specified amount output index will still be near
the end, because the txs are in the same block.
TEST:
Pop blocks with blockchain_import.
Blocks should be successfully removed with no errors shown.
bitmonerod should be able to start syncing from the reduced blockchain
height.
cbded43 core_tests: fix ring_signature_1 tests (moneromooo-monero)
c3d208f core_tests: bump default test fee to 0.02 monero (moneromooo-monero)
10da0a0 add a --fakechain argument for tests (moneromooo-monero)
eee44e6 unit_tests: fix block reward test using post hard fork settings (moneromooo-monero)
595893f blockchain: log block (not chain) height in "BLOCK SUCCESFULLY ADDED" (moneromooo-monero)
2369968 blockchain: fix off by one in get_blocks (moneromooo-monero)
8af913a db_lmdb: implement BlockchainLMDB::reset (moneromooo-monero)
4833f4f db_bdb: implement BlockchainBDB::reset (moneromooo-monero)
18bf06e tx_pool: fix "minumim" typo in message (moneromooo-monero)
44f1267 tests: fix a typo in test name (moneromooo-monero)
1494557 db_lmdb: create all needed directories, not just the leaf one (moneromooo-monero)
015b68a db_bdb: create all needed directories, not just the leaf one (moneromooo-monero)
f141869 tests: remove data-dir argument registration (moneromooo-monero)
When throwing an exception from being unable to begin an LMDB
transaction, include the reason.
It's often been due to a write transaction attempted within a write
transaction (batch mode), but there can be other reasons such as write
transaction attempted while database was opened read only, or
environment's map needs to be resized.
Data is only guaranteed to be valid within the lifetime of a txn.
You cannot use data returned from LMDB after the txn ends.
Also, fixed a missing txn.commit BlockchainLMDB::get_tx_unlock_time()
This is a precaution for older Berkeley DB versions.
- smooth reports an issue running with 4.7:
DB_ENV->log_set_config: DB_LOG_IN_MEMORY: method not permitted
after handle's open method
- this works just fine with 5.3
- we do not use DB_LOG_IN_MEMORY, but we use DB_LOG_AUTO_REMOVE
- libdb docs say some flags must be set before open, and some
may be set at any time, but never say some must be set after
open
- moving the call to log_set_config before open works with 5.3
Therefore, it seems best to move the call before open.
Early DB versions did not store key images for inputs if the
transaction spending them had no outputs (ie, all fee). This
is not correct, as this would allow these outputs to be double
spent. This was fixed in 533acc30ed
a few months ago, but databases having synced blocks 2021612 and
685498 with a faulty version will be missing those key images
in the spent keys database. This code checks for this, and adds
those key images if they are missing.
It looks like some of the indices passed to the DB access functions
are already bumped by 1. Moreover, the existing code was not
throwing DB errors with 0 keys, and this is unlikely if it really
was using 0 keys. Last, this patch broke sync from scratch in at
least one case. So I'm calling it bad and reverting it.
This reverts commit bfc97401ae81bb30278a318de7f048c653bf6582.
The original code removed key images from a tx from the blockchain
when an non to-key nor gen input was found in that tx. Additionally,
the remainder of the tx data was added to the blockchain only after
the double spend check passed.
It was only used by the older blockchain_storage.
We also move the code to the calling blockchain level, to avoid
replicating the code in every DB implementation. This also makes
the get_random_out method obsolete, and we delete it.
If there's no blocks in database (m_height == 0):
Don't assign incorrect block range to check.
Skip average block size check.
Test:
Run blockchain_converter with an existing source blockchain.bin and
a non-existent LMDB destination database.
The converter creates a BlockchainLMDB instance with zero height, due to
not being initialized with a genesis block, normally done by
Blockchain::init(). While different than the behavior of bitmonerod,
blockchain_import, and blockchain_export, the initialization hasn't been
strictly necessary.
The db batch size estimation normally uses an average block size, or a
default minimum block size, whichever is greater. In this case, as
there's no existing blocks to check for an average block size, the
default should be used.
Bockchain:
1. Optim: Multi-thread long-hash computation when encountering groups of blocks.
2. Optim: Cache verified txs and return result from cache instead of re-checking whenever possible.
3. Optim: Preload output-keys when encoutering groups of blocks. Sort by amount and global-index before bulk querying database and multi-thread when possible.
4. Optim: Disable double spend check on block verification, double spend is already detected when trying to add blocks.
5. Optim: Multi-thread signature computation whenever possible.
6. Patch: Disable locking (recursive mutex) on called functions from check_tx_inputs which causes slowdowns (only seems to happen on ubuntu/VMs??? Reason: TBD)
7. Optim: Removed looped full-tx hash computation when retrieving transactions from pool (???).
8. Optim: Cache difficulty/timestamps (735 blocks) for next-difficulty calculations so that only 2 db reads per new block is needed when a new block arrives (instead of 1470 reads).
Berkeley-DB:
1. Fix: 32-bit data errors causing wrong output global indices and failure to send blocks to peers (etc).
2. Fix: Unable to pop blocks on reorganize due to transaction errors.
3. Patch: Large number of transaction aborts when running multi-threaded bulk queries.
4. Patch: Insufficient locks error when running full sync.
5. Patch: Incorrect db stats when returning from an immediate exit from "pop block" operation.
6. Optim: Add bulk queries to get output global indices.
7. Optim: Modified output_keys table to store public_key+unlock_time+height for single transaction lookup (vs 3)
8. Optim: Used output_keys table retrieve public_keys instead of going through output_amounts->output_txs+output_indices->txs->output:public_key
9. Optim: Added thread-safe buffers used when multi-threading bulk queries.
10. Optim: Added support for nosync/write_nosync options for improved performance (*see --db-sync-mode option for details)
11. Mod: Added checkpoint thread and auto-remove-logs option.
12. *Now usable on 32-bit systems like RPI2.
LMDB:
1. Optim: Added custom comparison for 256-bit key tables (minor speed-up, TBD: get actual effect)
2. Optim: Modified output_keys table to store public_key+unlock_time+height for single transaction lookup (vs 3)
3. Optim: Used output_keys table retrieve public_keys instead of going through output_amounts->output_txs+output_indices->txs->output:public_key
4. Optim: Added support for sync/writemap options for improved performance (*see --db-sync-mode option for details)
5. Mod: Auto resize to +1GB instead of multiplier x1.5
ETC:
1. Minor optimizations for slow-hash for ARM (RPI2). Incomplete.
2. Fix: 32-bit saturation bug when computing next difficulty on large blocks.
[PENDING ISSUES]
1. Berkely db has a very slow "pop-block" operation. This is very noticeable on the RPI2 as it sometimes takes > 10 MINUTES to pop a block during reorganization.
This does not happen very often however, most reorgs seem to take a few seconds but it possibly depends on the number of outputs present. TBD.
2. Berkeley db, possible bug "unable to allocate memory". TBD.
[NEW OPTIONS] (*Currently all enabled for testing purposes)
1. --fast-block-sync arg=[0:1] (default: 1)
a. 0 = Compute long hash per block (may take a while depending on CPU)
b. 1 = Skip long-hash and verify blocks based on embedded known good block hashes (faster, minimal CPU dependence)
2. --db-sync-mode arg=[[safe|fast|fastest]:[sync|async]:[nblocks_per_sync]] (default: fastest:async:1000)
a. safe = fdatasync/fsync (or equivalent) per stored block. Very slow, but safest option to protect against power-out/crash conditions.
b. fast/fastest = Enables asynchronous fdatasync/fsync (or equivalent). Useful for battery operated devices or STABLE systems with UPS and/or systems with battery backed write cache/solid state cache.
Fast - Write meta-data but defer data flush.
Fastest - Defer meta-data and data flush.
Sync - Flush data after nblocks_per_sync and wait.
Async - Flush data after nblocks_per_sync but do not wait for the operation to finish.
3. --prep-blocks-threads arg=[n] (default: 4 or system max threads, whichever is lower)
Max number of threads to use when computing long-hash in groups.
4. --show-time-stats arg=[0:1] (default: 1)
Show benchmark related time stats.
5. --db-auto-remove-logs arg=[0:1] (default: 1)
For berkeley-db only. Auto remove logs if enabled.
**Note: lmdb and berkeley-db have changes to the tables and are not compatible with official git head version.
At the moment, you need a full resync to use this optimized version.
[PERFORMANCE COMPARISON]
**Some figures are approximations only.
Using a baseline machine of an i7-2600K+SSD+(with full pow computation):
1. The optimized lmdb/blockhain core can process blocks up to 585K for ~1.25 hours + download time, so it usually takes 2.5 hours to sync the full chain.
2. The current head with memory can process blocks up to 585K for ~4.2 hours + download time, so it usually takes 5.5 hours to sync the full chain.
3. The current head with lmdb can process blocks up to 585K for ~32 hours + download time and usually takes 36 hours to sync the full chain.
Averate procesing times (with full pow computation):
lmdb-optimized:
1. tx_ave = 2.5 ms / tx
2. block_ave = 5.87 ms / block
memory-official-repo:
1. tx_ave = 8.85 ms / tx
2. block_ave = 19.68 ms / block
lmdb-official-repo (0f4a036437)
1. tx_ave = 47.8 ms / tx
2. block_ave = 64.2 ms / block
**Note: The following data denotes processing times only (does not include p2p download time)
lmdb-optimized processing times (with full pow computation):
1. Desktop, Quad-core / 8-threads 2600k (8Mb) - 1.25 hours processing time (--db-sync-mode=fastest:async:1000).
2. Laptop, Dual-core / 4-threads U4200 (3Mb) - 4.90 hours processing time (--db-sync-mode=fastest:async:1000).
3. Embedded, Quad-core / 4-threads Z3735F (2x1Mb) - 12.0 hours processing time (--db-sync-mode=fastest:async:1000).
lmdb-optimized processing times (with per-block-checkpoint)
1. Desktop, Quad-core / 8-threads 2600k (8Mb) - 10 minutes processing time (--db-sync-mode=fastest:async:1000).
berkeley-db optimized processing times (with full pow computation)
1. Desktop, Quad-core / 8-threads 2600k (8Mb) - 1.8 hours processing time (--db-sync-mode=fastest:async:1000).
2. RPI2. Improved from estimated 3 months(???) into 2.5 days (*Need 2AMP supply + Clock:1Ghz + [usb+ssd] to achieve this speed) (--db-sync-mode=fastest:async:1000).
berkeley-db optimized processing times (with per-block-checkpoint)
1. RPI2. 12-15 hours (*Need 2AMP supply + Clock:1Ghz + [usb+ssd] to achieve this speed) (--db-sync-mode=fastest:async:1000).
This currently only affects blockchain_import and blockchain_converter.
When the number of blocks expected for the batch transaction is
provided, make an estimate of the DB space needed. If not enough free
space remains, resize the DB.
The estimate is made based on:
- the average size of the last 500 blocks, or if larger, a min. block
size of 4k
- a factor for the expanded size a block occupies in the DB across the
sub-dbs/tables
- a safety factor (1.7) to allow for a "reasonable" average block size
increase over the batch
Increase the DB size by whichever is greater: the estimated size needed
or a minimum increase size, currently 128 MB.
The conservative factors in the estimate help in testing that the resize
occurs when needed, and without gratuitous size increases. For common
use, the safety factor and minimum increase size could reasonably be
increased.
For testing, setting DEFAULT_MAPSIZE (blockchain_db/lmdb/db_lmdb.h) to 1
<< 27 (128 MB) and recompiling will ensure DB resizes take place sooner
and more frequently.
5680604 Replace hardcoded value with existing constant of same value (warptangent)
f37ee2f Update database resize behavior (warptangent)
f85cd8e Include database error in more error messages (warptangent)
Some filesystems (*cough* NTFS *cough*) aren't good with sparse files,
so this makes LMDB dynamically resize its mapsize as needed. Note: the
check interval is currently every 10 blocks (for testing) and will
probably need to change to 1000 or something. Default mapsize set to
1GiB.
Blockchain conversion tools using batching will probably segfault, I'll
fix that in the next commit.
There will need to be some more refactoring for these changes to be
considered complete/correct, but for now it's working.
new daemon cli argument "--db-type", works for LMDB and BerkeleyDB.
A good deal of refactoring is also present in this commit, namely
Blockchain no longer instantiates BlockchainDB, but rather is passed a
pointer to an already-instantiated BlockchainDB on init().
Everything except actually *using* BlockchainBDB is wired up, but the db
itself is not yet working. Some error about user mem not large enough.
I think I know what this error means, but I can't determine the cause.
Notes: BerkeleyDB does not allow 0-indexing in its recno type databases,
so block numbers *in the database* will be 1-indexed. Modifications
to indexing have been made as needed.