Rethinking Distributed Ledger Technology
R. Kuhn, J. Voas, D. Yaga, T. Saidkhodjaev
Presented by: Rick Kuhn, US National Institute of Standards and Technology, Computer Security Division
[email protected]

What is the problem?
- Blockchain has been defined as "an open, distributed ledger that can record transactions between two parties efficiently and in a verifiable and permanent way".
- The permanence/immutability property that makes blockchain technology useful also makes it difficult to support privacy requirements.
- Privacy rules such as those of the European Union General Data Protection Regulation (GDPR) require that all information related to a particular person be deleted at that person's request.
- Personal data, defined as "any information concerning an identified or identifiable natural person", is exactly the kind of data blockchains are designed to store.
- "Personal data which have undergone pseudonymisation, which could be attributed to a natural person by the use of additional information should be considered to be information on an identifiable natural person."

What is the rationale for blockchain properties?
- Blockchain and the proof-of-work protocol were of course designed to solve the double-spending problem in cryptocurrencies.
- As with all design choices, blockchain properties have tradeoffs:
  - Proof of work provides an ordering guarantee, at the expense of enormous processing time and cost.
  - Linked hash records provide trust and an integrity guarantee, at the expense of the modification and erasure mechanisms required for privacy.

How well do blockchain properties apply to traditional data management applications?

   Cryptocurrency                 | Finance, supply chain, e-commerce, etc.
   -------------------------------+---------------------------------------------------
   1. Partial anonymity           | ID required for contracts or government regulation
   2. Public access/transparency  | Controlled access
   3. Small transaction size      | Range of message sizes, up to large documents and images
   4. Immutable records           | Changes and deletions, often required by law
   5. Proof of work               | Flexible consensus models
   6. Block ordering guarantees   | Timestamps often required
   7. Decentralization            | Same in many applications
   8. Replication                 | Same in many applications
   9. Data integrity guarantees   | Same in many applications

Can we try something else?
- Datablock matrix: uses two hash values per block instead of a linked chain
  - implemented now, as a research tool
  - needs to be tried on a practical application
- Verified time: high-resolution timestamps instead of an ordering guarantee
  - currently a research activity; a research implementation is to be released soon

Why datablock matrix for distributed ledgers?
- Blockchain provides integrity and sequencing
  - No erasure possible, by design
  - The double-spend problem is solved by distributed time-stamp/sequencing guarantees
  - Sequencing guarantees require proof-of-work algorithms
  - Proof of work is extremely slow, by design
- Datablock matrix provides integrity and erasure
  - Integrity protection guarantees for all blocks not erased
  - Verified timestamps instead of a sequencing guarantee
  - Greater range of consensus algorithms available, suitable for permissioned DL
  - Very fast consensus algorithms can be used

Changing data in blockchain vs. datablock matrix
- Blockchain
  - Initial data entry -> transaction in a block
  - Modification -> new transaction keyed to the previous one
  - Use the key to the new value; disallow use of the previous, obsolete value
  - Dependent on proof of work to ensure sequence
- Datablock matrix
  - Initial data entry -> transaction in a block
  - Modification -> delete/replace transaction by its owner
  - Use the previous key; the new value is found in the block
  - Sequence not needed, since only one value exists

Structure of a Traditional Blockchain

Why is the GDPR deletion requirement a problem for blockchains?
- In a conventional distributed ledger blockchain, a change to one block changes the hashes of all blocks; this is what provides integrity protection.
- The hashes provide assurance that the information in every other block is unchanged if one block is modified.
- If we had to delete a block, the hash values for the others would no longer be valid.
- We don't want to have to create a new chain.

What are ways of dealing with this problem?
- Don't put personal information on the blockchain
  - Pseudo-anonymized data are still considered personal, even if not directly tied to a person
  - A dynamic IP address can be considered personal if it can be indirectly tied to an individual
- Encrypt the data and destroy the key to delete
  - The data must remain secure for decades
  - We cannot be sure that future developments in cryptography will not reveal it, e.g. quantum computing puts current public-key systems at risk
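The erasure problem described above is easy to demonstrate in code. The following minimal sketch (an illustration of the standard hash-chain idea, not the NIST implementation; the class and method names are mine) chains SHA-256 hashes so that each block's hash covers the previous block's hash. Removing any block changes every hash after the deletion point, so the old head hash no longer verifies:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: each block's hash covers its payload plus the previous
// block's hash, so erasing any block invalidates every hash that follows it.
public class ChainSketch {

    // Chained hash of the final block, computed from a list of payloads.
    static byte[] headHash(List<byte[]> blocks) throws Exception {
        byte[] prev = new byte[32]; // genesis: all-zero "previous hash"
        for (byte[] data : blocks) {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            md.update(prev);
            md.update(data);
            prev = md.digest();
        }
        return prev;
    }

    public static void main(String[] args) throws Exception {
        List<byte[]> blocks = new ArrayList<>();
        for (String s : new String[] {"tx1", "tx2", "tx3", "tx4"})
            blocks.add(s.getBytes(StandardCharsets.UTF_8));

        byte[] before = headHash(blocks);
        blocks.remove(1); // "erase" block tx2
        byte[] after = headHash(blocks);

        // Every hash after the deletion point has changed; honoring an erasure
        // request would mean rebuilding the chain from that point onward.
        System.out.println(java.util.Arrays.equals(before, after)); // prints false
    }
}
```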
What are block matrix constraints and assumptions?
- Hash integrity protection must not be disrupted for blocks that are not deleted
- Deletions will be relatively rare
- Ensure auditability and accountability
- Intended for permissioned/private distributed ledger systems

New data structure solution: a datablock matrix
- A data structure that provides integrity assurance using hash-linked records while also allowing the deletion of records
- Stores hashes of each row and column
  => each block within the matrix is protected by two hashes
- Suggested for use in private/permissioned distributed ledger systems

How does this work?
Suppose we want to delete block 12, which sits at row 3, column 2 of the 5x5 matrix below. The main diagonal is null; H(i,-) is the hash of row i and H(-,j) is the hash of column j:

          col 0   col 1   col 2   col 3   col 4
  row 0     -       1       3       7      13     H0,-
  row 1     2       -       5       9      15     H1,-
  row 2     4       6       -      11      17     H2,-
  row 3     8      10      12       -      19     H3,-
  row 4    14      16      18      20       -     H4,-
           H-,0    H-,1    H-,2    H-,3    H-,4

- Deleting block 12 disrupts the hash values H3,- for row 3 and H-,2 for column 2.
- The other blocks of row 3 are included in the hashes for columns 0, 1, 3, and 4.
- The other blocks of column 2 are included in the hashes for rows 0, 1, 2, and 4.
- So every block other than block 12 remains protected by at least one undisturbed hash.
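To make the two-hash scheme concrete, here is a small sketch (an illustration under my own naming, not the project's GitHub code): a 5x5 cell array with a null diagonal, a SHA-256 hash per row and per column, and a check that erasing the block at row 3, column 2 leaves the hashes of every other row and column unchanged:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Arrays;

// Hypothetical sketch: each block is covered by exactly one row hash and one
// column hash, so erasing a block forces recomputation of only those two.
public class MatrixHashSketch {
    static final int K = 5;
    static byte[][][] cells = new byte[K][K][]; // null = empty diagonal or erased block

    static void fill() {
        for (int i = 0; i < K; i++)
            for (int j = 0; j < K; j++)
                if (i != j) // main diagonal stays null
                    cells[i][j] = ("block " + i + "," + j).getBytes(StandardCharsets.UTF_8);
    }

    // SHA-256 over the non-null blocks of row i (H i,- in the slide's notation)
    static String rowHash(int i) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        for (int j = 0; j < K; j++)
            if (cells[i][j] != null) md.update(cells[i][j]);
        return Arrays.toString(md.digest());
    }

    // SHA-256 over the non-null blocks of column j (H -,j)
    static String colHash(int j) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        for (int i = 0; i < K; i++)
            if (cells[i][j] != null) md.update(cells[i][j]);
        return Arrays.toString(md.digest());
    }

    public static void main(String[] args) throws Exception {
        fill();
        String r2 = rowHash(2), c3 = colHash(3); // hashes that do not cover (3,2)
        cells[3][2] = null;                      // erase the block at row 3, column 2
        // Only H3,- and H-,2 must be recomputed; all other hashes still verify.
        System.out.println(r2.equals(rowHash(2)) && c3.equals(colHash(3))); // prints true
    }
}
```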
Datablock Matrix Population Algorithm
- The basic algorithm is simple; many variations are possible
- Implemented as Java code (GitHub project)
- The block ordering provides desirable properties

Data Structure Properties
- Balance: the upper half (above the diagonal) contains at most one cell more than the lower half.
- Hash sequence length: the number of blocks in a row or column hash is proportional to sqrt(N) for a matrix with N blocks, by the balance property.
- Number of blocks: the total number of data blocks in the matrix is k^2 - k for k rows/columns, since the diagonal is null.
- Block dispersal: no consecutive blocks appear in the same row or column.

Consecutive block deletion
- The algorithm keeps the main diagonal null
- This allows deletion of two consecutive blocks without disrupting hashes
- Example: deleting blocks 7 and 8; without the null diagonal, this would lose hash integrity protection for blocks 4 and 9
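One way to compute the block placement shown in the 5x5 matrix earlier is the closed form below. This is my own reading of the slide's layout, not the project's code (the Java implementation on GitHub is the reference): odd-numbered blocks fill the cells above the diagonal column by column, and each even-numbered block takes the transposed cell below the diagonal, which keeps the diagonal null and ensures consecutive blocks never share a row or column:

```java
// Hypothetical sketch of the fill order, derived from the slide's 5x5 example.
public class FillOrder {

    // Returns {row, column} for block number b >= 1.
    static int[] position(int b) {
        int m = (b + 1) / 2;              // 1-based index among upper-triangle cells
        int j = 1;                        // column of the m-th upper-triangle cell:
        while (j * (j + 1) / 2 < m) j++;  // columns 1..j hold 1+2+...+j cells
        int i = m - 1 - j * (j - 1) / 2;  // row within that column
        return (b % 2 == 1) ? new int[] {i, j} : new int[] {j, i};
    }

    public static void main(String[] args) {
        // Reproduces the 5x5 layout from the slide, e.g. block 12 -> (3,2),
        // the cell deleted in the earlier example.
        for (int b = 1; b <= 20; b++) {
            int[] p = position(b);
            System.out.println("block " + b + " -> (" + p[0] + "," + p[1] + ")");
        }
    }
}
```

Because an odd block at (i, j) is immediately followed by its mirror at (j, i) with i != j, consecutive blocks land in different rows and different columns, matching the block dispersal property above.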
Verified Time
- Securely combine time from multiple national time services
- A process has been defined, incorporating multiple algorithms/protocols
- Would allow the use of timestamps instead of ordering guarantees that require proof of work
- Currently a research effort, to be published

Multidimensional Blockmatrix
- Once the size of the blockmatrix is set, it cannot be easily expanded
- Block addition and erasure complexity is O(d * s^(1/d)), where d = number of dimensions, s = number of blocks
- Can we choose the number of dimensions so that complexity is optimal for a given size?

Complexity Analysis
- Optimal complexity of O(log s) is achieved when the dimension width is e (approximately 3) and the number of dimensions is ln s
- 3-dimensional example: dimension width = 3, number of dimensions = 3, size = 3^3 = 27 cells

Indexing and Balance
- The balancing algorithm of the 2D blockmatrix is difficult to generalize, hence it is not used here
- Start at (0,0,0) and fill in breadth-first-search order
- Consecutive blocks would need to be deleted to lose integrity
- Hash sequence length is not more than 3, but each block is protected by multiple dimensions

Tests and Performance
- Starting at around 100,000 blocks, the 3D blockmatrix starts outperforming the 2D blockmatrix
- At 1,000,000 blocks, the 3D blockmatrix is 8 times faster
- Much of the time is spent in breadth-first-search indexing initialization

  1,000,000 additions (current implementation):
                       init         add
  Blockmatrix (2D):    346 ms       312,743 ms
  Blocktensor (3D):    20,839 ms    41,570 ms

Future Work
- Consider proof of work or alternate consensus schemes
- Web tool to easily see the structure
- Extension to a peer-to-peer system
- Demonstration project for access control
- Investigate performance and applications of higher-dimension structures

Summary - comparison
- Blockchain: integrity protection; transparency: global; permanence, proof of work
- New approach: integrity protection; transparency: global; erasure, timestamps

More information:
- Kuhn, R., Yaga, D. and Voas, J., 2019. Rethinking Distributed Ledger Technology. Computer, 52(2), pp. 68-72.
- Stavrou, A. and Voas, J., 2017. Verified Time. Computer, 50(3), pp. 78-82.
- Kuhn, D. R., 2018. A Data Structure for Integrity Protection with Erasure Capability. https://csrc.nist.gov/publications/detail/white-paper/2018/05/31/data-structure-for-integrity-protection-with-erasure-capability/draft
Github project: https://github.com/usnistgov/blockmatrix