IRC meeting summary for 2017-03-30

Overview

Notes / short topics
Main topics
Slow unit tests
Dealing with abortnode / ConnectTip / DisconnectTip failures
High priority review
Comic relief
Participants
Disclaimer

Notes / short topics

BlueMatt notes 2 out of 6 of last weeks blocked and review-needed PRs got merged, which can be better. Wumpus created a high-priority for review github project page for the PRs that are mentioned to have priority in the meetings.
There have been 11 merges tagged 0.14.1 and 3 open PRs. Once those are in 0.14.1 should be good to go.

Main topics

Slow unit tests
Dealing with abortnode / ConnectTip / DisconnectTip failures
High priority review

Slow unit tests

background

Bitcoin Core provides a makefile target check that runs the project’s unit tests. The project has over time written more and more integration tests executed through the RPC interface, which are run automatically by the Travis Continuous Integration (CI) server that tests every Bitcoin Core pull request and which can manually be run by executing, qa/pull-tester/rpc-tests.py

As previously discussed in the 2017-03-16 meeting, these tests as a whole might currently take too long.

meeting comments

Wumpus made an overview of the slowest unit tests. Some of them have already been worked on or have PRs to make them faster.

We could also introduce an -extended mode for the unit tests, which does extra thorough tests that shouldn’t run every time. The extended mode should be part of the release process (and ran by gitian) and/or once a day on master.

Jonasschnelli has a build server with a nice web UI which does gitian builds every day at https://bitcoin.jonasschnelli.ch/.

Jnewbery notes the Travis CI service is currently failing because we’ve set it to run the extended tests once per day, so we’re flushing out all the extended tests that have always failed on Travis. Once PR #10114 and #10072 are merged these daily runs should pass Travis.

meeting conclusion

For the standard make check have a guideline of maximum ~1 second per test case, have an extended mode for unit tests with more extensive tests.

Dealing with abortnode / ConnectTip / DisconnectTip failures

background

Sdaftuar has an open PR (#9208) which improves performance after a chain reorg(anization), where a node discovers a new longest valid chain which excludes a block previously thought as being the longest valid chain (which will be orphaned). Currently we try to add transactions from each orphaned block back to the mempool, even though it’s likely many of those transactions will reoccur in the newly discovered blocks. #9208 stores those transactions in a separate “disconnect pool” for later processing.

meeting comments

BlueMatt brought up some edgecases in cases when ConnectTip or DisconnectTip return false where we now assert() instead of AbortNode(). Some broader discussion ensues on when to use AbortNode() and when Abort()/assert() and what’s the best way to alert the user an error has occured. AbortNode() allows us to exit with a message to inform the user, so ideally only critical errors should result in Abort()/Assert().

meeting conclusion

Rename AbortNode() to ShutdownSoon() and make sure disk corruption uses something different.

High priority review

All the PRs tagged with 0.14.1 should have priority

Sipa adds he’d like to see #9792 (FastRandomContext improvements and switch to ChaCha20) get in at some point to further remove the dependency on OpenSSL.

Gmaxwell proposes to re-open his PR #9424 which changes the logging categories to boolean flags instead of strings. This would make the use of PR’s like #10123, which allows you to exclude certain components from the debug log, easier. Cfields adds he’d like to do something simular for net messages.

Comic relief

wumpus             if BlueMatt can make it work faster that's great, but don't silently kill the program on every error
gmaxwell           wumpus: how about every other error?

9:48  BlueMatt     so maybe the solution is AbortNode gets renamed to ShutdownSoon() and use make sure disk corruption is something different?
...
9:53  BlueMatt     <BlueMatt> so maybe the solution is AbortNode gets renamed to ShutdownSoon() and use make sure disk corruption is something different?
...
9:57  BlueMatt     ok, soooo, acks on:<BlueMatt> <BlueMatt> so maybe the solution is AbortNode gets renamed to ShutdownSoon() and use make sure disk corruption is something different?
9:58  jeremyrubin  BlueMatt: maybe if you paste it again
9:58  BlueMatt     jeremyrubin: ok, <BlueMatt> ok, soooo, acks on:<BlueMatt> <BlueMatt> so maybe the solution is AbortNode gets renamed to ShutdownSoon() and use make sure disk corruption is something different?

jtimon             it seems it's time to abort the meeting
wumpus             #endmeeting
BlueMatt           wumpus: we need to change that to #abort()
gmaxwell           But I wanted to cleanly flush!

Participants

IRC nick	Name/Nym
gmaxwell	Gregory Maxwell
wumpus	Wladimir van der Laan
jonasschnelli	Jonas Schnelli
sipa	Pieter Wuille
BlueMatt	Matt Corallo
jtimon	Jorge Timón
cfields	Cory Fields
achow101	Andrew Chow
jeremyrubin	Jeremy Rubin
sdaftuar	Suhas Daftuar
MarcoFalke	Marco Falke
jnewbery	John Newbery
morcos	Alex Morcos
instagibbs	Gregory Sanders
Chris_Stewart_5	Chris Stewart

Disclaimer

This summary was compiled without input from any of the participants in the discussion, so any errors are the fault of the summary author and not the discussion participants.