v1.1-beta
Release date: 2015-12-31 03:59:43
Latest release of bft-smart/library: v1.2 (2018-09-30 23:17:11)
Latest version of the BFT-SMaRt library (v1.1 beta). Includes source code, binary, javadoc and runscripts.
This version does not provide any new features in relation to the previous one (v1.0 beta), but it does include a significant amount of bug fixes, changes in the code, and a few modifications to the replication protocol.
Protocol alterations:
- After sending a STOP message, each replica now periodically retransmits it. This is necessary for cases where a replica that recovered from a failure does not return to the system in time to receive enough STOP messages from the other replicas; without retransmission, the synchronization phase might never complete in such a scenario.
- Under CFT mode, a replica now updates its timestamp/value pair immediately after receiving a (valid) PROPOSE message (or at the end of the synchronization phase). This must be done because the original consensus algorithm requires a quorum of WRITE messages before updating this pair, but CFT mode bypasses the WRITE phase completely. Since in CFT mode replicas are expected to fail only by crashing, this does not break the correctness of the protocol.
- Replicas now stop executing consensus instances only after collecting 2f+1 STOP messages. This was done to avoid a corner case where a system with a single client would block, which can happen if:
  - There is only one client sending requests;
  - One replica is crashed;
  - One of the three correct replicas times out before being able to order the request (assuming f = 1, n = 4).
  This would not be a problem if the library did not support read-only invocations, since clients would then need only f+1 replies from replicas (in accordance with the specification of the Mod-SMaRt protocol). But with read-only invocations, clients need to wait for a Byzantine quorum of replies.
- Standard state transfer now randomly selects a replica to ask for the full state. This was implemented to deal with a corner case where a leader change might never finish if:
  - The new leader is late and needs to ask for a state transfer;
  - The timeout for requests is shorter than the state transfer timeout.
- The state transfer is now required to send a proof for the last decided consensus, so that a recovered replica can obtain a CertifiedDecision object. This is necessary to ensure that any recovered replica can send the proof for its last consensus if the synchronization phase is triggered immediately after it finishes installing the state.
  Furthermore, replicas that are asked for the state should now check whether they indeed have a proof for the requested state up to the specified consensus instance. If they do not, they should reply in the same way as if they did not have the requested state. However, a proof is never needed in CFT mode.
- Lastly, there is a small, yet important correction to the Mod-SMaRt protocol: the content of requests is now validated before they are stored and marked as pending. This prevents malicious clients from forcing all correct replicas to propose invalid requests. If all correct replicas proposed invalid requests once they became leaders, the consensus instance would never decide anything, since all correct replicas refuse to send WRITE messages for invalid content. However, it is not necessary to perform any such verification under CFT mode.
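The 2f+1 STOP threshold described above can be illustrated with a minimal sketch. The class 'StopCollector' and its methods are hypothetical names chosen for this example and are not part of the BFT-SMaRt code base; the sketch only assumes that STOP messages are counted per distinct sender, so that periodic retransmissions from the same replica do not inflate the count.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper (not a BFT-SMaRt class): tracks STOP messages for one regency. */
final class StopCollector {
    private final int f;                                   // maximum number of faulty replicas tolerated
    private final Set<Integer> senders = new HashSet<>();  // distinct replica ids that sent STOP

    StopCollector(int f) {
        this.f = f;
    }

    /** Records a STOP from the given replica; retransmissions from the same replica count once. */
    void onStop(int replicaId) {
        senders.add(replicaId);
    }

    /** True once 2f+1 distinct replicas have sent STOP, i.e. consensus execution should halt. */
    boolean shouldStopConsensus() {
        return senders.size() >= 2 * f + 1;
    }
}
```

With f = 1, two distinct STOP messages leave the replica executing consensus instances; the third one makes 'shouldStopConsensus()' return true.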
Code modifications:
- Added method 'appExecuteUnordered(...)' to 'DefaultRecoverable', 'DefaultSingleRecoverable' and 'DurabilityCoordinator'. All demos now implement this method instead of 'executeUnordered(...)' from the 'Executable' interface;
- Transferred a large portion of the code from 'TOMLayer' to a new class 'Synchronizer'. This was done because the TOMLayer class already had more code related to the synchronization phase than to the normal case (approximately 2/3 of TOMLayer's code was dedicated to the synchronization phase);
- The cryptographic proof for an ACCEPT message is now generated within a dedicated method;
- Removed a few legacy attributes from the classic state transfer protocol that were no longer necessary;
- Removed a legacy attribute from the reconfiguration protocol that was no longer necessary;
- Removed a legacy parameter from the 'decided(..)' method of the 'Consensus' class;
- Renamed class 'Round' to 'Epoch';
- Renamed class 'Consensus' to 'Decision';
- Renamed class 'Execution' to 'Consensus';
- Renamed class 'PaxosMessage' to 'ConsensusMessage';
- Renamed class 'LastEidData' to 'CertifiedDecision';
- Renamed methods and variables across all code from 'EID' (Execution ID) to 'CID' (Consensus ID);
- Redistributed the classes of all sub-packages of 'bftsmart.consensus' and 'bftsmart.tom.core', which resulted in removing 2 sub-packages that were rendered empty ('bftsmart.consensus.executionmanager' and 'bftsmart.tom.core.timer');
- The nonces of each consensus instance are now only generated when the method 'getNonces()' of a 'MessageContext' object is called. The original seed and the number of nonces are now the only information exchanged among replicas (this is all that is needed to obtain the nonces);
- 'MessageContext' objects now hold all information of the original 'TOMMessage' object and are also able to re-create the original object with the method 'recreateTOMMessage(...)';
- 'MessageContext' objects now hold the cryptographic proof for the consensus instance with which they are associated;
- Method 'noOp' from 'Recoverable' now provides the complete 'MessageContext' object associated with the consensus instance where it was triggered;
- Deleted class 'LeaderModule' and moved the few methods that were still being used to the 'ExecutionManager' class;
- Deleted classes 'Proof', 'CounterState' and 'ReceivedMessage', since they were no longer being used in any part of the code;
- Organized 'import's and fixed indentation in some classes;
- Added '@Override' annotations across all the code.
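The lazy nonce generation mentioned in the list above can be sketched as follows. This is only an illustration under stated assumptions: the class 'NonceSource' is a hypothetical stand-in for the nonce-related part of a 'MessageContext', and the use of 'java.util.Random' to expand the seed is an assumption of this sketch, not a statement about the library's implementation.

```java
import java.util.Random;

/** Hypothetical stand-in (not a BFT-SMaRt class) for the nonce-related state of a message context. */
final class NonceSource {
    private final long seed;        // seed exchanged among replicas
    private final int numOfNonces;  // number of nonce bytes, exchanged along with the seed
    private byte[] nonces;          // derived lazily, on the first call to getNonces()

    NonceSource(long seed, int numOfNonces) {
        this.seed = seed;
        this.numOfNonces = numOfNonces;
    }

    /** Derives the nonces on first use; the same seed and count yield the same bytes everywhere. */
    synchronized byte[] getNonces() {
        if (nonces == null) {
            nonces = new byte[numOfNonces];
            new Random(seed).nextBytes(nonces); // deterministic expansion of the seed (assumed scheme)
        }
        return nonces.clone(); // defensive copy so callers cannot alter the cached value
    }
}
```

Because only the seed and the count travel over the network, two replicas that construct a 'NonceSource' with the same arguments obtain identical nonces without ever exchanging the nonces themselves.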
Bug fixes:
- Setting 'useMAC' parameter to '0' will no longer throw any exception during execution;
- Fixed a bug related to nonce generation (the leader replica was not keeping this information);
- Fixed bug in initialization, which would make replicas always select replica 0 as the leader (regardless of whether it was part of the group or not);
- Fixed issue in 'DefaultSingleRecoverable' class that would make all consensus messages go to the out-of-context set;
- Recovered replicas can now correctly calculate a quorum of replies associated with the state transfer;
- Fixed a bug that happened in the absence of clients issuing requests: if a crashed replica X finished recovering and then another replica Y crashed and later asked for the latest state, replica X would send the wrong consensus ID;
- Leader change was sending a wrong message type in CFT mode (was sending a WRITE message instead of an ACCEPT);
- Replicas will now send their STOPDATA message even if they do not hold a proof for their last executed consensus;
- Fixed implementation of predicate 'sound', which was waiting for more than 'n-f' STOPDATA messages (instead of waiting for at least 'n-f');
- Fixed the timestamps associated with each consensus. They were being incremented at the SYNC message, but this must be done earlier (after receiving 2f+1 STOP messages);
- PROPOSE message is now validated in relation to the replica's current epoch (which must be 0 in order for the PROPOSE to be accepted);
- Fixed a bug that would send a timestamp/value pair with timestamp equal to -1;
- Implemented an out-of-context mechanism for the synchronization phase of the replication protocol (such a mechanism previously existed only for the normal phase);
- Made sure that upon a leader change, the protocol will use a new Epoch object with the latest timestamp, and include such timestamp in the upcoming WRITE/ACCEPT messages;
- Made sure client requests relayed within STOP messages are processed in accordance with the Mod-SMaRt protocol;
- Made sure the synchronization phase now installs the consensus proof received in the SYNC message;
- Made sure STOP messages are exchanged also in CFT mode;
- Fixed concurrency issue related to a consensus WRITESET;
- Fixed a bug in the synchronization phase which updated the replica's WRITESET before properly installing the new ETS;
- The synchronization phase now updates the consensus' ETS of delayed replicas (using the value of the current regency);
- Fixed memory leak in 'ExecutionManager' class;
- Fixed script 'smartrun.sh'.
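The off-by-one in the STOPDATA threshold of predicate 'sound' mentioned in the list above can be made concrete with a small sketch. The class and method names are hypothetical and cover only the counting condition, not the rest of the predicate.

```java
/** Hypothetical illustration (not BFT-SMaRt code) of the STOPDATA threshold check. */
final class StopDataThreshold {
    private StopDataThreshold() {}

    /** Behaviour described as buggy: strictly more than n-f messages were required. */
    static boolean enoughBuggy(int received, int n, int f) {
        return received > n - f;
    }

    /** Behaviour described as fixed: at least n-f messages suffice. */
    static boolean enough(int received, int n, int f) {
        return received >= n - f;
    }
}
```

For n = 4 and f = 1 the threshold n-f is 3. If one replica is crashed, at most three STOPDATA messages can arrive, so the strict comparison is never satisfied while the corrected one is.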
Miscellaneous:
- Added interface where developers can enforce the 'external validity' property of VP-Consensus;
- Extended 'Recoverable' interface to allow delivery of requests via the 'op(...)' method;
- 'DurabilityCoordinator' now supports the 'noOp(...)' method;
- DefaultRecoverable and DefaultSingleRecoverable (finally) store and send the MessageContext objects associated with the ordered commands during checkpoints and state transfer, respectively;
- Added debug messages for the synchronization phase of the replication protocol;
- Removed a redundant lock from the communication system;
- Moved locks of the 'run_lc_protocol()' method to a more precise part of the code (in class RequestsTimer);
- Updated 'ShutdownHookThread' to properly display the state of a replica that is shut down;
- It is now possible to specify the key size in 'RSAPairGenerator';
- Method 'computeHash(...)' from 'TOMUtil' class is now thread-safe.
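One common way to make a hashing helper such as 'computeHash(...)' thread-safe is to avoid sharing a 'MessageDigest' instance between threads. The sketch below shows that approach; the class name 'HashUtil' and the choice of SHA-256 are assumptions made for illustration and do not describe how 'TOMUtil' is actually implemented.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical utility (not the TOMUtil class): hashing without shared mutable state. */
final class HashUtil {
    private HashUtil() {}

    /** Creates a fresh MessageDigest per call, so concurrent callers never share digest state. */
    static byte[] computeHash(byte[] data) {
        try {
            // SHA-256 is an assumed algorithm choice for this sketch
            return MessageDigest.getInstance("SHA-256").digest(data);
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-256, so this is not expected
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }
}
```

An alternative with the same effect would be to keep a single digest and guard it with a lock; the per-call instance trades a small allocation cost for the absence of contention.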