Hashing in Crypto++

This post explores the hashing algorithms available in Crypto++. We will examine how to use them on their own and how to integrate them into a pipeline. First, let's review the available implementations. They are listed in the "Hash Functions" section of the documentation index. Notable examples include BLAKE2s/BLAKE2b, Keccak, and SHA3. Some algorithms, such as MD5 and MD4, are located in the Weak namespace. As the name implies, these are not secure for production use and are included only for reference.

HashTransformation Base Class

All hashing algorithms in Crypto++ implement the HashTransformation interface in some form. Crypto++ frequently passes base classes as template parameters, which can make it less obvious how a specific type implements an interface. For instance, SHA3 inherits directly from HashTransformation. BLAKE2b, on the other hand, inherits from SimpleKeyingInterfaceImpl<MessageAuthenticationCode, BLAKE2b_Info>, which is declared as:

template <class BASE, class INFO = BASE>
class SimpleKeyingInterfaceImpl : public BASE

In this case, SimpleKeyingInterfaceImpl inherits from MessageAuthenticationCode, which in turn derives from HashTransformation. While this detail is not critical for simply using a hashing algorithm, it is helpful to understand when examining Crypto++'s internal code.
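The inheritance pattern described above can be sketched with a few stand-in types. This is a minimal illustration, not Crypto++ code: HashLike, MacLike, DemoInfo, KeyingImpl and DemoHash are hypothetical names that only mirror the shape of HashTransformation, MessageAuthenticationCode, BLAKE2b_Info, SimpleKeyingInterfaceImpl and BLAKE2b.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical stand-ins; none of these are Crypto++ classes.
struct HashLike {  // plays the role of HashTransformation
  virtual ~HashLike() = default;
  virtual std::string Name() const = 0;
};

struct MacLike : HashLike {};  // plays the role of MessageAuthenticationCode

struct DemoInfo {  // plays the role of BLAKE2b_Info
  static const char *StaticName() { return "Demo"; }
};

// Same shape as SimpleKeyingInterfaceImpl: the base class is a template
// parameter, so the interface is inherited indirectly.
template <class BASE, class INFO = BASE>
struct KeyingImpl : BASE {
  std::string Name() const override { return INFO::StaticName(); }
};

// The concrete type never mentions HashLike, yet it is one.
struct DemoHash : KeyingImpl<MacLike, DemoInfo> {};

static_assert(std::is_base_of<HashLike, DemoHash>::value,
              "DemoHash reaches HashLike through its template parameter");

// Code written against the interface accepts the concrete type.
inline std::string NameOf(const HashLike &h) { return h.Name(); }

// Usage: the call goes through the interface and lands in KeyingImpl::Name.
inline void InheritanceExample() {
  DemoHash h;
  assert(NameOf(h) == "Demo");
}
```

The static_assert is the useful part when reading unfamiliar code: it lets the compiler confirm that a type reaches an interface even when the class declaration does not show it.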

The HashTransformation class is stateful. It can process large inputs in chunks and compute their hash. It provides several key public methods:

  • Update: Adds a chunk of input data. This method can be called multiple times to process input in chunks.
  • Final: Computes the hash of the input provided by one or more Update calls. It also resets the object's state.
  • Verify: Verifies the hash of the current message. To use this method, you must first load the message using one or more Update calls. Do not call Final, as it resets the object's state.
  • Restart: Clears the object's state.
  • DigestSize: Returns the size of the digest (output hash) used by the algorithm. This is useful for dynamically allocating a buffer for the result. Alternatively, some implementations provide a constant named DIGESTSIZE, which is more useful when you need the value at compile time.

Additionally, there are two convenience methods worth noting:

  • CalculateDigest: Allows you to process a single chunk of data and calculate its hash in one step. This is useful when the input data consists of a single chunk.
  • VerifyDigest: Similarly, this method allows you to verify the hash of a message in one step.
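To make the stateful contract concrete, here is a toy sketch of a class that follows the same Update/Final/Restart rules. ToyHash and HashInTwoChunks are hypothetical names, and the algorithm is FNV-1a, which is not cryptographically secure and is used here only because it fits in a few lines. It is not how any Crypto++ hash is implemented.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Toy stand-in for the Update/Final contract described above.
// FNV-1a is NOT cryptographically secure, and ToyHash is not a Crypto++ class.
class ToyHash {
public:
  static constexpr std::size_t DIGESTSIZE = 8;

  // Feeds one chunk into the running state; may be called many times.
  void Update(const std::uint8_t *input, std::size_t length) {
    for (std::size_t i = 0; i < length; ++i) {
      state_ ^= input[i];
      state_ *= 0x100000001b3ULL;  // FNV-1a 64-bit prime
    }
  }

  // Writes DIGESTSIZE bytes (big-endian) and then resets the state.
  void Final(std::uint8_t *digest) {
    for (std::size_t i = 0; i < DIGESTSIZE; ++i) {
      digest[i] =
          static_cast<std::uint8_t>(state_ >> (8 * (DIGESTSIZE - 1 - i)));
    }
    Restart();
  }

  // Discards everything fed in so far.
  void Restart() { state_ = kOffsetBasis; }

private:
  static constexpr std::uint64_t kOffsetBasis = 0xcbf29ce484222325ULL;
  std::uint64_t state_ = kOffsetBasis;
};

using ToyDigest = std::array<std::uint8_t, ToyHash::DIGESTSIZE>;

// Hashes a message that arrives as two separate chunks.
inline ToyDigest HashInTwoChunks(ToyHash &h, const std::string &a,
                                 const std::string &b) {
  ToyDigest out{};
  h.Update(reinterpret_cast<const std::uint8_t *>(a.data()), a.size());
  h.Update(reinterpret_cast<const std::uint8_t *>(b.data()), b.size());
  h.Final(out.data());
  return out;
}

// Usage: chunked and one-shot hashing agree, and the same object can be
// reused after Final because Final resets it.
inline void ToyHashExample() {
  ToyHash h;
  const ToyDigest chunked = HashInTwoChunks(h, "TEST ", "STRING");
  const ToyDigest one_shot = HashInTwoChunks(h, "TEST STRING", "");
  assert(chunked == one_shot);
}
```

How the input is split into chunks does not affect the digest, which is what makes it possible to hash data that does not fit in memory.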

Using HashTransformation on its own

Let's see how we can use a HashTransformation implementation to calculate the SHA3-512 digest of an input message. In this example we will not use a Crypto++ pipeline to compute the hash. We will only use SHA3_512, which implements the corresponding hashing algorithm and the HashTransformation interface.

Here is the code:

#include <array>
#include <cstdint>
#include <iostream>
#include <string>

#include <cryptopp/cryptlib.h>
#include <cryptopp/filters.h>
#include <cryptopp/hex.h>
#include <cryptopp/sha3.h>

int main() {
  const std::string msg{"TEST STRING"};

  auto sha3_512 = CryptoPP::SHA3_512();
  sha3_512.Update(reinterpret_cast<const uint8_t *>(msg.data()), msg.length());

  std::array<uint8_t, CryptoPP::SHA3_512::DIGESTSIZE> digest;
  sha3_512.Final(digest.data());

  // pipeline showing the hash in hex format
  std::string hash;
  (void)CryptoPP::ArraySource(
      digest.data(), digest.size(), true,
      new CryptoPP::HexEncoder(new CryptoPP::StringSink(hash)));
  std::cout << "SHA3 512: " << hash << std::endl;

  sha3_512.Update(reinterpret_cast<const uint8_t *>(msg.data()), msg.length());
  bool verification_result = sha3_512.Verify(digest.data());

  std::cout << "Verifying digest: " << verification_result << std::endl;

  return 0;
}

msg is the input string we want to hash. We instantiate a SHA3_512 object and pass the message to Update as a raw pointer plus its size. Note that in this case Update doesn't take ownership of the input. The "raw pointer means ownership is transferred" convention applies only to non-trivial types such as filters and sinks, not to plain byte buffers.

Then we call Final with a pointer to a buffer that receives the digest. Since the digest size is a known constant, we don't need to pass the size of the buffer, but we must ensure it is big enough. SHA3-512 produces a 512-bit hash, which is 64 bytes. It's good practice to avoid magic numbers, so we use the DIGESTSIZE constant for the array size.

After computing the hash, we use a simple Crypto++ pipeline to print it on screen in hex. If you need a refresher on what a Crypto++ pipeline is, have a look at A Brief Introduction to Crypto++.

Finally, we use the same object to verify the hash. Final has reset the state of the object, so we can reuse it right away. We call Update again with the input, but this time we call Verify with the digest computed in the previous step. Verify returns a bool indicating whether the digest matches.

In this example we used Update, Final and Verify, but our input was small and available as a single chunk. Let's rewrite the example with the convenience functions CalculateDigest and VerifyDigest to make it simpler.

Here is the code:

#include <array>
#include <cstdint>
#include <iostream>
#include <string>

#include <cryptopp/cryptlib.h>
#include <cryptopp/filters.h>
#include <cryptopp/hex.h>
#include <cryptopp/sha3.h>

int main() {
  const std::string msg{"TEST STRING"};
  std::array<uint8_t, CryptoPP::SHA3_512::DIGESTSIZE> digest;

  auto sha3_512 = CryptoPP::SHA3_512();
  sha3_512.CalculateDigest(digest.data(),
                           reinterpret_cast<const uint8_t *>(msg.data()),
                           msg.length());

  // pipeline showing the hash in hex format
  std::string hash;
  (void)CryptoPP::ArraySource(
      digest.data(), digest.size(), true,
      new CryptoPP::HexEncoder(new CryptoPP::StringSink(hash)));
  std::cout << "SHA3 512: " << hash << std::endl;

  bool verification_result = sha3_512.VerifyDigest(
      digest.data(), reinterpret_cast<const uint8_t *>(msg.data()),
      msg.length());

  std::cout << "Verifying digest: " << verification_result << std::endl;

  return 0;
}

Here, instead of Update and Final, we call CalculateDigest directly. It accepts a pointer to the output buffer for the digest (no size, since the digest size is a known constant) and a pointer to the input with its size. The function loads the data into the SHA3_512 object, calculates the hash and resets the object in one call. In the verification step we use VerifyDigest, which in a similar fashion loads the data and verifies that the hash of the message matches the provided one.

Using HashTransformation as part of a pipeline

Now let's do the same thing but with a Crypto++ pipeline. There are two filters working with HashTransformations to help us here:

  • HashFilter uses a HashTransformation to produce a digest.
  • HashVerificationFilter uses a HashTransformation instance to verify a digest.

HashFilter's constructor accepts a reference to a HashTransformation (the filter doesn't take ownership of the instance) and a pointer to a BufferedTransformation (the filter takes ownership of this instance), which is the next element in the pipeline. It reads all the data pumped to it and produces a digest.

HashVerificationFilter also accepts a reference to a HashTransformation instance, a pointer to a BufferedTransformation and additional flags. By default, HashVerificationFilter expects digest+message as its input. This literally means that the previous element in the pipeline should provide the digest concatenated with the corresponding message. The size of the digest is determined via the provided HashTransformation. With the default flags, HashVerificationFilter outputs a boolean (0 or 1) representing the result of the digest check.
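The input layout can be pictured with a small helper that splits a combined buffer the way the text describes. This is only a sketch of the layout, not the filter's implementation: SplitInput and DigestAndMessage are hypothetical names, and the digest size is passed in by hand instead of being taken from a HashTransformation.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper, not part of Crypto++. It only mirrors the two input
// layouts that a verification step can be told to expect.
struct DigestAndMessage {
  std::string digest;
  std::string message;
};

inline DigestAndMessage SplitInput(const std::string &combined,
                                   std::size_t digest_size,
                                   bool hash_at_begin) {
  if (combined.size() < digest_size) {
    throw std::invalid_argument("input is shorter than one digest");
  }
  if (hash_at_begin) {
    // Layout: digest | message (the default described above)
    return {combined.substr(0, digest_size), combined.substr(digest_size)};
  }
  // Layout: message | digest
  const std::size_t split = combined.size() - digest_size;
  return {combined.substr(split), combined.substr(0, split)};
}

// Usage: a 4-byte placeholder digest in front of the message "hello".
inline void SplitInputExample() {
  const DigestAndMessage parts = SplitInput("DDDDhello", 4, true);
  assert(parts.digest == "DDDD");
  assert(parts.message == "hello");
}
```

Because the split depends only on the digest size, nothing in the input marks where the digest ends. Feeding the parts in the wrong order simply makes verification fail.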

The flags passed to the constructor of HashVerificationFilter allow you to modify its behaviour. They are represented by the enum Flags, and in Crypto++ 8.9 they look like this:

/// \enum Flags
/// \brief Flags controlling filter behavior.
/// \details The flags are a bitmask and can be OR'd together.
enum Flags {
    /// \brief The hash is at the end of the message (i.e., concatenation of message+hash)
    HASH_AT_END=0,
    /// \brief The hash is at the beginning of the message (i.e., concatenation of hash+message)
    HASH_AT_BEGIN=1,
    /// \brief The message should be passed to an attached transformation
    PUT_MESSAGE=2,
    /// \brief The hash should be passed to an attached transformation
    PUT_HASH=4,
    /// \brief The result of the verification should be passed to an attached transformation
    PUT_RESULT=8,
    /// \brief The filter should throw a HashVerificationFailed if a failure is encountered
    THROW_EXCEPTION=16,
    /// \brief Default flags using HASH_AT_BEGIN and PUT_RESULT
    DEFAULT_FLAGS = HASH_AT_BEGIN | PUT_RESULT
};

DEFAULT_FLAGS represents the default behaviour when no flags are passed: the expected input is 'digest|message' and the result (a bool) is sent to the next element of the pipeline. By passing another combination of flags you can modify the behaviour of the filter. For example, you might want to put the message before the digest, or have an exception thrown if the digest is not correct. Have a look at the comments for each enumerator - they are documented pretty well.
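Since the flags form a bitmask, it may help to see how a combination is built and inspected. The sketch below uses a local copy of the values from the listing above (DemoFlags, HashIsAtBegin and Describe are hypothetical names), so it compiles without the library. Real code would use the enumerators of HashVerificationFilter and pass the combined value as the flags argument of its constructor.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Local copy of the values from the listing above, for illustration only.
enum DemoFlags : std::uint32_t {
  HASH_AT_END = 0,
  HASH_AT_BEGIN = 1,
  PUT_MESSAGE = 2,
  PUT_HASH = 4,
  PUT_RESULT = 8,
  THROW_EXCEPTION = 16,
  DEFAULT_FLAGS = HASH_AT_BEGIN | PUT_RESULT
};

// HASH_AT_END is 0, so `flags & HASH_AT_END` is always 0. The position of
// the digest has to be read from the HASH_AT_BEGIN bit instead.
inline bool HashIsAtBegin(std::uint32_t flags) {
  return (flags & HASH_AT_BEGIN) != 0;
}

// Turns a flag combination into a short human-readable summary.
inline std::string Describe(std::uint32_t flags) {
  std::string out = HashIsAtBegin(flags) ? "digest|message" : "message|digest";
  if (flags & PUT_MESSAGE) out += ", forwards message";
  if (flags & PUT_HASH) out += ", forwards hash";
  if (flags & PUT_RESULT) out += ", forwards result";
  if (flags & THROW_EXCEPTION) out += ", throws on mismatch";
  return out;
}

// Usage: message first, result forwarded, exception on a bad digest.
inline void FlagsExample() {
  const std::uint32_t flags = HASH_AT_END | PUT_RESULT | THROW_EXCEPTION;
  assert(Describe(flags) == "message|digest, forwards result, throws on mismatch");
  assert(Describe(DEFAULT_FLAGS) == "digest|message, forwards result");
}
```

Note that passing your own flags replaces the defaults rather than adding to them, so PUT_RESULT has to be included explicitly if the next element should still receive the result.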

Now let's see a sample pipeline which calculates a digest and verifies it:

#include <cstdint>
#include <iostream>
#include <string>

#include <cryptopp/cryptlib.h>
#include <cryptopp/filters.h>
#include <cryptopp/hex.h>
#include <cryptopp/sha3.h>

int main() {
  const std::string input{"TEST STRING"};
  std::string digest;
  auto sha3_512 = CryptoPP::SHA3_512();

  (void)CryptoPP::StringSource(
      input, true,
      new CryptoPP::HashFilter(sha3_512, new CryptoPP::StringSink(digest)));

  std::string hash;
  (void)CryptoPP::StringSource(
      digest, true, new CryptoPP::HexEncoder(new CryptoPP::StringSink(hash)));
  std::cout << "SHA3 512: " << hash << std::endl;

  bool result = false;  // remains false if the filter writes nothing
  (void)CryptoPP::StringSource(
      digest + input, true,
      new CryptoPP::HashVerificationFilter(
          sha3_512, new CryptoPP::ArraySink(
                        reinterpret_cast<uint8_t *>(&result), sizeof(bool))));

  std::cout << "Result: " << result << std::endl;

  return 0;
}

First we declare the input message, a string for the digest and an instance of SHA3_512. Then we create a pipeline that calculates the digest: a StringSource to read the input, a HashFilter (with the SHA3_512 instance) to calculate the digest and a StringSink to save the result to the buffer. Here we use std::string as the output buffer, but an array or a vector would work just as well.

Then we create another pipeline which converts the digest to human-readable hex. It works like the one in the previous section, except that it reads from a StringSource instead of an ArraySource.

Finally, we have the digest verification pipeline. We allocate a buffer for the result of the check, which is just a bool variable. We use a StringSource to supply the input data. Note that the input of the source is digest + input: the digest we calculated before, concatenated with the original message. The next element in the pipeline is a HashVerificationFilter instance. We pass it the SHA3_512 object and an ArraySink which writes to the bool, and we rely on the default flags. If the verification is successful, the bool is set to true (1).

Conclusion

In this post we saw how to use a hashing algorithm with Crypto++, both by calling a HashTransformation directly and via a Crypto++ pipeline. Both approaches have their strengths and weaknesses. For example, to verify a hash with a pipeline you need to provide a concatenation of the hash and the input, which usually means another copy of the input data. Using the hashing object directly can be more efficient, but you lose the convenience of the Crypto++ pipeline. Bottom line: both approaches produce the same result, so pick whatever feels more natural for your case.

I haven't talked about error handling in this post because it's usually hard to do anything wrong with hashes. If you are using HashTransformation directly, double-check that you pass valid pointers and that the sizes you pass with them are correct. For example, length() and size() of a std::string both return the number of bytes, which is what Update expects, whereas strlen stops at the first null byte and sizeof measures the string object rather than its contents. If you stick to Crypto++ pipelines, these things are usually handled for you. Don't forget to handle exceptions if you have enabled them with the THROW_EXCEPTION flag.
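A short sketch of the size pitfall when handing a buffer to Update or CalculateDigest. It uses only the standard library, and SizeForUpdate and SizeViaStrlen are hypothetical helper names introduced for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Byte count to pass alongside msg.data(). For std::string, size() and
// length() are synonyms and both count bytes.
inline std::size_t SizeForUpdate(const std::string &msg) { return msg.size(); }

// Pitfall: strlen stops at the first '\0', so binary input gets truncated.
inline std::size_t SizeViaStrlen(const std::string &msg) {
  return std::strlen(msg.c_str());
}

// Usage: a five-byte message with an embedded '\0'.
inline void SizeExample() {
  const std::string msg("AB\0CD", 5);
  assert(msg.size() == msg.length());
  assert(SizeForUpdate(msg) == 5);
  assert(SizeViaStrlen(msg) == 2);
  // sizeof(msg) would measure the std::string object, not its contents.
}
```

Hashing only the first two bytes here would still produce a valid-looking digest, just for the wrong message, which is why this kind of mistake tends to surface only when verification fails elsewhere.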

The book

This post is an excerpt from my book "Brief Introduction to Crypto++". The book should be published by the end of May 2025. You can learn more about the book in this post.
