Hashing

Using 1st party implementations of circuit friendly (and unfriendly) hash functions in Circom circuits

Introduction to Circom Hashing

See the Theory > Primitives > Hashing section for more of an explanation on hash functions in general, as well as the specifics of hash functions friendly to the arithmetic circuit.

Hashing

As a preface, in general you can assume more inputs to make the computation of your hash more expensive both in Zero Knowledge constraints and in EVM gas. There is also an increased cost to multiple rounds of hash function/ template calls compared to a one-shot hash with multiple inputs.

You most likely have seen the equivalent of Circom's "Hello World" on their official docs:

pragma circom 2.0.0;

/*This circuit template checks that c is the multiplication of a and b.*/  

template Multiplier2 () {  

   // Declaration of signals.  
   signal input a;  
   signal input b;  
   signal output c;  

   // Constraints.  
   c <== a * b;  
}

This circuit may seem absolutely pointless without any experience. However, as a black box hashing is little more than a irreversible function to deterministically obscure data using math.

Put even simpler, pretend that the operation a * b is our hash function.

  • a and b represent a preimage, the data being hidden or checksum'd for integrity

  • c represents the hash

Functionally, the "Hello World" circuit is meant to quickly convey that with the Circom DSL, individuals can prove knowledge of underlying data and prove that an encrypted value is the hash of that underlying data. From this point on, we can work with the hash in public or further in Zero Knowledge as a verifiable receipt of the underlying data contained within.

Example of Application

For instance, part of the decentralization in BattleZips V1 is derived from the contract storage of a hash proven to be a MiMC sponge hash of a valid board configuration. Consider the flow of (see Examples > BattleZips for all pseudo UML sequence diagrams for BattleZips):

When Alice runs newGame() or Bob runs joinGame(), the following code runs, given inputs of a board.circom proof and the board hash:

...
/// PROVE THE HASH IS OF A VALID BOARD CONFIGURATION ///
require(
    bv.verifyProof(a, [b_0, b_1], c, [_boardHash]),
    "Invalid Board Config!"
);
/// STORE THE BOARD HASH IN THE CONTRACT ///
// _game = game lobby
// .participants[1] means bob/ joinGame; [0] would be alice/ newGame
games[_game].participants[1] = _msgSender();
games[_game].boards[1] = _boardHash;
...

Logically, this can be broken down to a sort of informational escrow:

  1. using the bv (deployed) on-chain zkSNARK verifier, use the proof(a, b_0, b_1, c)to convince the smart contract that _boardHashis the MiMC7 hash of a valid battleship board, as encoded by board.circom

  2. store _boardHash for the player for the duration of that game

Once the contract has been convinced of the authenticity of the hash, we can transfer the right to input the hash for shot.circom from the end user to the smart contract. Notice that in turn(), the boardHash value is recalled from EVM storage and directly assigned. The caller has no ability to assign the 0th (first) public proof input parameter, beyond their initial commitment to the board hash when creating or starting the game.

MiMC

Full support from the auxiliary first party libraries is included for both MiMC and MiMC Sponge:

@todo: Figure out wtf the difference between MiMC and MiMC Sponge. For this reason, we will explain the rest of the MiMC section using the MiMC Sponge circomlib construct. Know that while the exact API for the MiMC template may differ slightly from the MiMC Sponge, the actual input output functionality is exactly the same and reworking the following code for determined users of mimc.circom should be quick.

Zero Knowledge

JavaScript

Solidity

Poseidon

Zero Knowledge

The use of the Poseidon hash function in Circom is quite simple. The Poseidon template is shown below:

template Poseidon(nInputs) {
    signal input inputs[nInputs];
    signal output out;

    component pEx = PoseidonEx(nInputs, 1);
    pEx.initialState <== 0;
    for (var i=0; i<nInputs; i++) {
        pEx.inputs[i] <== inputs[i];
    }
    out <== pEx.out[0];
}

Let's individually note the properties before demonstrating the component use:

  • Supplying the nInputs template parameter determines the number of signals that will be hashed together

    • Note that this is not limited to 6 like the Poseidon solidity contract

  • The component is used simply by assigning nInputs number of signals in the inputs signal array

  • Once all inputs have been assigned, the Poseidon hash of the values can be accessed with the output signal out

In practice, we can look to the BattleZips updated RollupNC repository. Specifically, lets take a look at the process of hashing an account leaf for the state tree in balance_leaf.circom (see Examples > RollupNC > Account/ State Tree for context), where each account leaf is the Poseidon hash of 5 values:

pragma circom 2.0.3;

include "../../../node_modules/circomlib/circuits/poseidon.circom";

template BalanceLeaf() {

    /// SIGNAL IO ///
    signal input pubkey[2]; // Account EdDSA pubkey [x, y]
    signal input balance; // Account token balance
    signal input nonce; // Number of transactions made by the account
    signal input tokenType; // Token registry index for token type
    signal output out; // Poseidon hash of account (account leaf)
    
    /// HASH ACCOUNT DATA INTO ACCOUNT LEAF ///
    component balanceLeaf = Poseidon(5);
    balanceLeaf.inputs[0] <== pubkey[0];
    balanceLeaf.inputs[1] <== pubkey[1];
    balanceLeaf.inputs[2] <== balance;
    balanceLeaf.inputs[3] <== nonce; 
    balanceLeaf.inputs[4] <== tokenType;

    out <== balanceLeaf.out;
}

In this case, we assign the Poseidon component's nInputs parameter to 5 to ingest the two pubkey points, the balance, the nonce, and the tokenType. These items must be assigned to the array in proper order. Once nInputs assignments are made to each index of the of the balanceLeaf.inputs[] array, we can access the Poseidon hash with balanceLeaf.out.

Lastly, you can find the first party unit test for the Poseidon circuit in the circomlib repo.

JavaScript

The circomlibjs library includes a JS component for Poseidon. Your application will very frequently use the JavaScript hasher to work with this data on the client side, including use in both Solidity and the Circom circuits (see more on browser inclusion of circomlibjs assets in Development > SnarkJS > Browser).

To employ the hash function, let's use the BattleZips updated RollupNC unit tests (edited slightly here):

const { buildPoseidon } = require('circomlibjs')

...
it('Deposit #1 (COORDINATOR ADDRESS)', async () => {
    // instantiate poseidon hasher from circomlibjs
    const poseidon = await buildPoseidon();
    ...
    // generate account leaf
    // A.K.A. Poseidon hash of the eddsa key, balance, nonce, tokenType
    const data = [...l2Pubkey, 0, 0, 0];
    const leaf = F.toObject(poseidon(data));
    ...
})
...

Poseidon is a popular hash function and you should not have trouble finding its use within the open-source code contained in Ecosystem > Projects. You can start with the circomlibjs Poseidon unit tests for the first party demonstration of use.

Solidity

Note: Poseidon is colloquially and theoretically a superior upgrade of MiMC. When dealing with 6 or less inputs to your hash function, you will most likely experience all around superior performance to MiMC. However, in the BattleZips update to RollupNC, we found that the max input size for the circomlibjs auto-generated contract was 6, meaning a transaction leaf with 8 different fields was most efficiently hashes in three rounds - hash each half of the transaction leaf separately with 4 inputs, then hash the two intermediate hashes together for the transaction leaf. This provided inferior performance to the older MiMC version of the RollupNC repository.

SHA-256

BattleZips V1 was forked (almost entirely transformed at that) from zkbattleship-circuit written by an unaffiliated GitHub user tommymsz006. In that README.md, they explained:

There have been different discussions on whether/how hashing can be made more efficient in arithmetic circuits for zkSNARKs use; see Zcash forum and ethresear.ch.

For this battleship use case, taking circom's hash as an example,Pedersen Hash is proved to be more efficent in terms of the resulting number of wires (>88% reduction) and constraints (>97% reduction), compared with the SHA-256 counterpart:

> ./zkcompile battleship_pedersen.circom 
# Wires: 3540
# Constraints: 708
# Private Inputs: 15
# Public Inputs: 4
# Outputs: 1
> ./zkcompile battleship_sha256.circom 
# Wires: 31366
# Constraints: 29785
# Private Inputs: 15
# Public Inputs: 4
# Outputs: 1

There is an immense cost to verifying a SHA-256 hash in Zero Knowledge. Unless your circuit's existence is predicated on the need to verify existing ECDSA signatures or SHA-256 hashes, you should not create new SHA-256 hashes to verify by starting out with SHA-256. As demonstrated above with the BattleZips board placement, if you only need the data authentication provided by SHA-256 and not SHA-256 itself you should implement a different hash function.

Nonetheless, there is first party support for SHA-256 in the form of the template sha256compression.circom. Being SHA-256, you can use standard solidity and javascript libraries. There are third party circom templates for keccak256 as well. If you really wanted to see ZK Battleship, and can stomach translating you can check out theClever use of GitHub search should yield you additional resources to replicate in your own work.

Last updated