Hashcash


Hashcash is a proof-of-work system used to limit email spam and denial-of-service attacks, and more recently has become known for its use in bitcoin as part of the mining algorithm. Hashcash was proposed in 1997 by Adam Back and described more formally in Back's 2002 paper "Hashcash - A Denial of Service Counter-Measure".

Background

The idea "...to require a user to compute a moderately hard, but not intractable function..." was proposed by Cynthia Dwork and Moni Naor in their 1992 paper "Pricing via Processing or Combatting Junk Mail".

How it works

Hashcash is a cryptographic hash-based proof-of-work algorithm that requires a selectable amount of work to compute, but the proof can be verified efficiently. For email uses, a textual encoding of a hashcash stamp is added to the header of an email to prove the sender has expended a modest amount of CPU time calculating the stamp prior to sending the email. In other words, as the sender has taken a certain amount of time to generate the stamp and send the email, it is unlikely that they are a spammer. The receiver can, at negligible computational cost, verify that the stamp is valid. However, the only known way to find a header with the necessary properties is brute force, trying random values until the answer is found; though testing an individual string is easy, satisfactory answers are rare enough that it will require a substantial number of tries to find the answer.
The hypothesis is that spammers, whose business model relies on their ability to send large numbers of emails with very little cost per message, will cease to be profitable if there is even a small cost for each spam they send. Receivers can verify whether a sender made such an investment and use the results to help filter email.

Technical details

The header line looks something like this:
X-Hashcash: 1:20:1303030600:anni@cypherspace.org::McMybZIhxKXu57jd:ckvi
The header contains:
The header contains the recipient's email address, the date of the message, and information proving that the required computation has been performed. The presence of the recipient's email address requires that a different header be computed for each recipient. The date allows the recipient to record headers received recently and to ensure that the header is unique to the email message.

Sender's side

The sender prepares a header and appends a counter value initialized to a random number. It then computes the 160-bit SHA-1 hash of the header. If the first 20 bits of the hash are all zeros, then this is an acceptable header. If not, then the sender increments the counter and tries the hash again. Out of 2160 possible hash values, there are 2140 hash values that satisfy this criterion. Thus the chance of randomly selecting a header that will have 20 zeros as the beginning of the hash is 1 in 220. The number of times that the sender needs to try to get a valid hash value is modeled by geometric distribution. Hence the sender will on average have to try 220 values to find a valid header. Given reasonable estimates of the time needed to compute the hash, this would take about one second to find. No more efficient method than this brute force approach is known to find a valid header.
A normal user on a desktop PC would not be significantly inconvenienced by the processing time required to generate the Hashcash string. However, spammers would suffer significantly due to the large number of spam messages sent by them.

Recipient's side

Technically the system is implemented with the following steps:
If the hash string passes all of these tests, it is considered a valid hash string. All of these tests take far less time and disk space than receiving the body content of the e-mail.

Required effort

The time needed to compute such a hash collision is exponential with the number of zero bits. So zero bits can be added until it is too expensive for spammers to generate valid header lines.
Confirming that the header is valid is very much faster and always takes the same amount of time, no matter how many zero bits are required for a valid header, since this requires only a single hashing operation.

Advantages and disadvantages

The Hashcash system has the advantage over micropayment proposals applying to legitimate e-mail that no real money is involved. Neither the sender nor recipient need to pay, thus the administrative issues involved with any micropayment system and moral issues related to charging for e-mail are entirely avoided.
On the other hand, as Hashcash requires potentially significant computational resources to be expended on each e-mail being sent, it is somewhat difficult to tune the ideal amount of average time one wishes clients to expend computing a valid header. This can mean sacrificing accessibility from low-end embedded systems or else running the risk of hostile hosts not being challenged enough to provide an effective filter from spam.
Hashcash is also fairly simple to implement in mail user agents and spam filters. No central server is needed. Hashcash can be incrementally deployed—the extra Hashcash header is ignored when it is received by mail clients that do not understand it.
One plausible analysis concluded that only one of the following cases is likely: either non-spam e-mail will get stuck due to lack of processing power of the sender, or spam e-mail is bound to still get through. Examples of each include, respectively, a centralized e-mail topology, in which some server is to send an enormous amount of legitimate e-mails, and botnets or cluster farms with which spammers can increase their processing power enormously.
Most of these issues may be addressed. E.g., botnets may expire faster because users notice the high CPU load and take counter-measures, and mailing list servers can be registered in white lists on the subscribers' hosts and thus be relieved from the hashcash challenges. But they represent serious obstacles to hashcash deployment that remain to be addressed.
Another projected problem is that computers continue to get faster according to Moore's law. So the difficulty of the calculations required must be increased over time. However, developing countries can be expected to use older hardware, which means that they will find it increasingly difficult to participate in the e-mail system. This also applies to lower-income individuals in developed countries who cannot afford the latest hardware.
Like hashcash, cryptocurrencies use a hash function as their proof-of-work system. The rise of cryptocurrency has created a demand for ASIC-based mining machines. Although most cryptocurrencies use the SHA-256 hash function, the same ASIC technology could be used to create hashcash solvers that are three orders of magnitude faster than a consumer CPU, reducing the computational hurdle for spammers.

Applications

Bitcoin mining

In contrast to hashcash in mail applications that relies on recipients to set manually an amount of work intended to deter malicious senders, the bitcoin cryptocurrency network employs a different hashing-based proof-of-work challenge to enable competitive bitcoin mining. A bitcoin miner runs a computer program that collects unconfirmed transactions from coin dealers in the network. With other data these can form a block and earn a payment to the miner, but a block is accepted by the network only when the miner discovers by trial and error a "nonce" number that when included in the block yields a hash with a sufficient number of leading zero bits to meet the network's difficulty target. Blocks accepted from miners form the bitcoin blockchain that is a growing ledger of every bitcoin transaction since the coin's first creation.
While hashcash uses the SHA-1 hash and requires the first 20 of 160 hash bits to be zero, bitcoin's proof of work uses two successive SHA-256 hashes and originally required at least the first 32 of 256 hash bits to be zero. However the bitcoin network periodically resets the difficulty level to keep the average rate of block creation at 6 per hour. As of January 2020 the bitcoin network has responded to deployments of ever faster hashing hardware by miners by hardening the requirement to first 74 of 256 hash bits must be zero.

Spam filters

Hashcash is used as a potential solution for false positives with automated spam filtering systems, as legitimate users will rarely be inconvenienced by the extra time it takes to mine a stamp. SpamAssassin has been able to check for Hashcash stamps since version 2.70, assigning a negative score for valid, unspent Hashcash stamps. However, although the hashcash plugin is not initially on by default, it still needs to be configured with a list of address patterns that must match against the Hashcash resource field, so it doesn't actually work by default.

Email clients

The Penny Post software project on SourceForge implements Hashcash in the Mozilla Thunderbird email client. The project is named for the historical availability of conventional mailing services that cost the sender just one penny; see Penny Post for information about such mailing services in history.

Email Postmark

Microsoft also designed and implemented a now deprecated open spec, similar to and yet incompatible with Hashcash, Email Postmark, as part of their Coordinated Spam Reduction Initiative. The Microsoft email postmark variant of Hashcash is implemented in the Microsoft mail infrastructure components Exchange, Outlook and Hotmail.
The format differences between Hashcash and Microsoft's email postmark is that postmark hashes the body in addition to the recipient, and uses a modified SHA-1 as the hash function and uses multiple sub-puzzles to reduce proof of work variance.

Blogs

Like e-mail, blogs often fall victim to comment spam.
Some blog owners have used hashcash scripts written in the JavaScript language to slow down comment spammers. Some scripts claim to implement hashcash but instead depend on JavaScript obfuscation to force the client to generate a matching key; while this does require some processing power, it does not use the hashcash algorithm or hashcash stamps.

Intellectual property

Hashcash is not patented, and the reference implementation and most of the other implementations are free software. Hashcash is included or available for many Linux distributions.
RSA has made IPR statements to the IETF about client-puzzles in the context of an RFC that described client-puzzles. The RFC included hashcash in the title and referenced hashcash, but the mechanism described in it is a known-solution interactive challenge which is more akin to Client-Puzzles; hashcash is non-interactive and therefore does not have a known solution. In any case RSA's IPR statement can not apply to hashcash because hashcash predates the client-puzzles publication and the client-puzzles patent filing US7197639.