Shamir's Secret Sharing


Shamir's Secret Sharing is an algorithm in cryptography created by Adi Shamir. It is a form of secret sharing, where a secret is divided into parts, giving each participant its own unique part.
To reconstruct the original secret, a minimum number of parts is required. In the threshold scheme this number is less than the total number of parts. Otherwise all participants are needed to reconstruct the original secret.

High-level explanation

Shamir's Secret Sharing is used to secure a secret in a distributed way, most often to secure other encryption keys. The secret is split into multiple parts, called shares. These shares are used to reconstruct the original secret.
To unlock the secret via Shamir's secret sharing, you need a minimum number of shares. This is called the threshold, and is used to denote the minimum number of shares needed to unlock the secret. Let us walk through an example:
Problem: Company XYZ needs to secure their vault's passcode. They could use something standard, such as AES, but what if the holder of the key is unavailable or dies? What if the key is compromised via a malicious hacker or the holder of the key turns rogue, and uses their power over the vault to their benefit?
This is where SSS comes in. It can be used to encrypt the vault's passcode and generate a certain number of shares, where a certain number of shares can be allocated to each executive within Company XYZ. Now, only if they pool their shares can they unlock the vault. The threshold can be appropriately set for the number of executives, so the vault is always able to be accessed by the authorized individuals. Should a share or two fall into the wrong hands, they couldn't open the passcode unless the other executives cooperated.

Mathematical definition

The aim is to divide secret into pieces of data in such a way that:
  1. Knowledge of any or more pieces makes easy to compute. That is, the complete secret can be reconstructed from any combination of pieces of data.
  2. Knowledge of any or fewer pieces leaves completely undetermined, in the sense that the possible values for seem as likely as with knowledge of pieces. Said another way, the secret cannot be reconstructed with fewer than pieces.
This scheme is called a threshold scheme.
If then every piece of the original secret is required to reconstruct the secret.

Shamir's secret sharing scheme

The essential idea of Adi Shamir's threshold scheme is that 2 points are sufficient to define a line, 3 points are sufficient to define a parabola, 4 points to define a cubic curve and so forth.
That is, it takes points to define a polynomial of degree.
Suppose we want to use a threshold scheme to share our secret, without loss of generality assumed to be an element in a finite field of size where and is a prime number.
Choose at random positive integers with, and let. Build the polynomial. Let us construct any points out of it, for instance set to retrieve. Every participant is given a point along with the prime which defines the finite field to use.
Given any subset of of these pairs, we can find the coefficients of the polynomial using interpolation. The secret is the constant term.

Usage

Example

The following example illustrates the basic idea. Note, however, that calculations in the example are done using integer arithmetic rather than using finite field arithmetic. Therefore the example below does not provide perfect secrecy and is not a true example of Shamir's scheme. So we'll explain this problem and show the right way to implement it.

Preparation

Suppose that our secret is 1234.
We wish to divide the secret into 6 parts, where any subset of 3 parts is sufficient to reconstruct the secret. At random we obtain numbers: 166 and 94.
Our polynomial to produce secret shares is therefore:
We construct six points from the polynomial:
We give each participant a different single point. Because we use instead of the points start from and not. This is necessary because is the secret.

Reconstruction

In order to reconstruct the secret any 3 points will be enough.
Consider.
We will compute Lagrange basis polynomials:
Therefore
Recall that the secret is the free coefficient, which means that, and we are done.

Computationally efficient approach

Considering that the goal of using polynomial interpolation is to find a constant in a source polynomial using Lagrange polynomials "as it is" is not efficient, since unused constants are calculated.
An optimized approach to use Lagrange polynomials to find is defined as follows:

Problem

Although the simplified version of the method demonstrated above, which uses integer arithmetic rather than finite field arithmetic, works fine, there is a security problem: Eve gains a lot of information about with every that she finds.
Suppose that she finds the 2 points and,
she still doesn't have points so in theory she shouldn't have gained any more info about.
But she combines the info from the 2 points with the public info: and she :

Solution

Geometrically this attack exploits the fact that we know the order of the polynomial and so gain insight into the paths it may take between known points. This reduces possible values of unknown points since it must lie on a smooth curve.
This problem can be fixed by using finite field arithmetic. A field of size is used. The graph shows a polynomial curve over a finite field, in contrast to the usual smooth curve it appears very disorganised and disjointed.
In practice this is only a small change, it just means that we should choose a prime that is bigger than the number of participants and every and we have to calculate the points as instead of.
Since everyone who receives a point also has to know the value of, it may be considered to be publicly known. Therefore, one should select a value for that is not too low.
Low values of are risky because Eve knows, so the lower one sets, the fewer possible values Eve has to guess from to get.
For this example we choose, so our polynomial becomes which gives the points:
This time Eve doesn't win any info when she finds a .
Suppose again that Eve finds and, this time the public info is: so she:
This time she can't stop because could be any integer so there are an infinite amount of possible values for. She knows that always decreases by 3 so if was divisible by she could conclude but because it's prime she can't even conclude that and so she didn't win any information.

Python example



"""
The following Python implementation of Shamir's Secret Sharing is
released into the Public Domain under the terms of CC0 and OWFa:
https://creativecommons.org/publicdomain/zero/1.0/
http://www.openwebfoundation.org/legal/the-owf-1-0-agreements/owfa-1-0
See the bottom few lines for usage. Tested on Python 2 and 3.
"""
from __future__ import division
from __future__ import print_function
import random
import functools
  1. 12th Mersenne Prime
_PRIME = 2 ** 127 - 1
  1. 13th Mersenne Prime is 2**521 - 1
_RINT = functools.partial
def _eval_at:
"""Evaluates polynomial at x, used to generate a
shamir pool in make_random_shares below.
"""
accum = 0
for coeff in reversed:
accum *= x
accum += coeff
accum %= prime
return accum
def make_random_shares:
"""
Generates a random shamir pool, returns the secret and the share
points.
"""
if minimum > shares:
raise ValueError
poly =
points =
return poly, points
def _extended_gcd:
"""
Division in integers modulus p means finding the inverse of the
denominator modulo p and then multiplying the numerator by this
inverse this can
be computed via extended Euclidean algorithm
http://en.wikipedia.org/wiki/Modular_multiplicative_inverse#Computation
"""
x = 0
last_x = 1
y = 1
last_y = 0
while b != 0:
quot = a // b
a, b = b, a % b
x, last_x = last_x - quot * x, x
y, last_y = last_y - quot * y, y
return last_x, last_y
def _divmod:
"""Compute num / den modulo prime p
To explain what this means, the return value will be such that
the following is true: den * _divmod % p num
"""
inv, _ = _extended_gcd
return num * inv
def _lagrange_interpolate:
"""
Find the y-value for the given x, given n points;
k points will define a polynomial of up to kth order.
"""
k = len
assert k len, "points must be distinct"
def PI: # upper-case PI -- product of inputs
accum = 1
for v in vals:
accum *= v
return accum
nums = # avoid inexact division
dens =
for i in range:
others = list
cur = others.pop
nums.append
dens.append
den = PI
num = sum
for i in range
return % p
def recover_secret:
"""
Recover the secret from share points
.
"""
if len < 2:
raise ValueError
x_s, y_s = zip
return _lagrange_interpolate
def main:
"""Main function"""
secret, shares = make_random_shares
print
print
if shares:
for share in shares:
print
print
print
if __name__ '__main__':
main

Properties

Some of the useful properties of Shamir's threshold scheme are:
  1. Secure: Information theoretic security.
  2. Minimal: The size of each piece does not exceed the size of the original data.
  3. Extensible: When is kept fixed, pieces can be dynamically added or deleted without affecting the other pieces.
  4. Dynamic: Security can be easily enhanced without changing the secret, but by changing the polynomial occasionally and constructing new shares to the participants.
  5. Flexible: In organizations where hierarchy is important, we can supply each participant different number of pieces according to their importance inside the organization. For instance, the president can unlock the safe alone, whereas 3 secretaries are required together to unlock it.
A known issue in Shamir's Secret Sharing scheme is the verification of correctness of the retrieved shares during the reconstruction process, which is known as verifiable secret sharing. Verifiable secret sharing aims at verifying that shareholders are honest and not submitting fake shares.