Abstract
Let \(u,m \ge 1\) be arbitrary integers and let \(r\ge (u-1)m\) be any multiple of m. Consider the multiset (“class”) \( \mathcal {H}^{\text {lin}}_{u,m,r}= \{ h_{a,b} \mid 0 \le a,b < r \} \) of hash functions from \(U = \{0,\ldots ,u-1\}\) to \(M = \{0,\ldots ,m-1\}\), where \( h_{a,b}(x) = \left\lfloor ((ax + b) \bmod r) / (r/m)\right\rfloor \), for \(x\in U\). In a STACS paper of 1996 it was shown that \(\mathcal {H}^{\text {lin}}_{u,m,r}\) is \(\frac{5}{4}\)-approximately 2-wise independent and (error-free) 2-wise independent if in addition r is a power of a prime number. Here, we revisit this result. We prove slightly stronger bounds (in part also shown by Woelfel (1999) with different methods) and we discuss applications that have appeared in the meantime, e.g., complexity lower bounds for integer multiplication, fine-grained complexity inside the class P, and the usefulness and limitations of these simple hash classes in combination with other applications of hashing.
You have full access to this open access chapter, Download chapter PDF
Similar content being viewed by others
Dedicated to Juraj Hromkovič on the occasion of his 60th birthday.
1 Introduction
Let \(u,m \ge 1\) be arbitrary integers and let \(r\ge (u-1)m\) be any multiple of m. Consider the “class” \( \mathcal {H}^{\text {lin}}_{u,m,r}= \{ h_{a,b} \mid 0 \le a,b < r \}\) of hash functions from \(U = [u] = \{0,\ldots ,u-1\}\) to \(M = [m]= \{0,\ldots ,m-1\}\), where
In a STACS paper [10] of 1996 it was shown that \(\mathcal {H}^{\text {lin}}_{u,m,r}\) is \(\frac{5}{4}\)-approximately 2-wise independent and (error-free) 2-wise independent if in addition r is a power of a prime number. In the present paper, we revisit the contribution of 1996, describe improvements of the results and discuss ramifications and applications that have appeared in the meantime.
1.1 Background: Two-Wise Independence and Universality
Two-wise independent random variables and hash functions have a multitude of applications. We mention just two: universal hashing for use in data structures [7, 13, 18, 26, 29, 40] and amplifying randomness and derandomization [2, 8, 11, 23, 28]. For a classical survey of applications, see [24].
Given sets U (the universe, usually finite) and M (the range, finite), we call a function \(h:U \rightarrow M\) a hash function. Elements of U are called keys. We consider “classes” (multisets) \(\mathcal {H}\) of hash functions. Such a class is called 1-universal if for h chosen uniformly at random from \(\mathcal {H}\) the collision probability is bounded by \(1/\left|M\right|\), i.e., if we have
Class \(\mathcal {H}\) is called 2-wise independent if for h chosen at random from \(\mathcal {H}\) we have that for arbitrary distinct \(x_1, x_2\in U\) the random variables \(h(x_1)\) and \(h(x_2)\) are uniformly and independently distributed in M, more precisely:
For applications, it is usually sufficient if (1) and (2) hold up to some relative error, i.e., if we require that \({\mathbf {Pr}}\left( h(x_1) = h(x_2)\right) \le c/\left|M\right|\) in (1) for some \(c\ge 1\) (such a class is called c-universal, which corresponds to the original definition of the term in [7]) or \((2-c)/\left|M\right|^2 \le {\mathbf {Pr}}\left( h(x_1) = i_1 \wedge h(x_2) = i_2\right) \le c/\left|M\right|^2\) (2) for some \(c \in [1,2)\) (then we say the class is c-approximately 2-wise independent).
By 1996, simple methods for constructing such classes had been known for quite some time. We list examples of such constructions.Footnote 1
-
(a)
\(U=[p]\), \(M=[m]\) for a prime p and some \(m\le p\), \(h\in \mathcal {H}\) is given by \(h(x)=((ax+b)\bmod p)\bmod m\), for \(a,b\in [p]\). This class is c-approximately 2-wise independent for \(c=(\left\lceil p/m\right\rceil m/p)^2 \le 1 + 3m/p\). For \(a,b\in [p]\), \(a\ne 0\), it is 1-universal [7, 18, 40].
-
(b)
\(U= \mathbb {F}\) is a finite field of prime characteristic p, \(M=\mathbb {Z}_p^\mu \) for some \(\mu \), \(\xi :\mathbb {F}\rightarrow M\) is some \(\mathbb {Z}_p\)-linear projection, \(h\in \mathcal {H}\) is given by \(h(x)=\xi (ax+b)\), for \(a,b\in \mathbb {F}\). This class is 2-wise independent.
-
(c)
\(U = \mathbb {F}^{\mu }\) for a finite field \(\mathbb {F}\), \(M = \mathbb {F}^{\nu }\), \(h\in \mathcal {H}\) given by \(h(x)= A\cdot x+b\), for \(A \in \mathbb {F}^{\nu \times \mu }\), \(b \in M = \mathbb {F}^{\nu }\). This class is 2-wise independent.
-
(d)
\(U = \mathbb {F}^{\mu }\) for a finite field \(\mathbb {F}\), \(M = \mathbb {F}^{\nu }\), \(h\in \mathcal {H}\) given by \(h(x)= a \circ x+b\), for \(a \in \mathbb {F}^{\mu + \nu - 1}\), \(b \in M = \mathbb {F}^{\nu }\), where \(\circ \) denotes convolution. This class is 2-wise independent [25].
We note that for implementing such classes we either need prime numbers or representations of the arithmetic in finite fields that have size |U| or at least size |M|, or, for (c) or (d), we have to carry out vector-matrix multiplication or a convolution over some finite field \(\mathbb {F}\), the most natural case being \(\mathbb {F}_2 = \{0,1\}\). The main purpose of [10] was to provide a construction using only plain integer arithmetic without the need for prime numbers to obtain two-wise independent hash families.
A simple first step in this direction had been made in [12]. There, for \(k \ge \mu \ge 1\) and sets \(U=[2^k]\) and \(M=[2^{\mu }]\) the class \(\mathcal {H}^{\text {mult}}_{2^k,2^\mu }\) of multiplicative functions
for \(0< a < 2^k\) odd was studied. This class is not 2-wise independent; hash values h(x) for a key x are not even uniformly distributed. However, it is 2-universal. Its big advantage is that functions from the class can be evaluated very efficiently on a computer—one only needs one multiplication and one or two shift operations. The construction in [10] is a variant of this class.
1.2 Definitions and Properties of the Class
Notation. The universe is \(U=[u]\), the range is \(M=[m]\), for positive integers u and m. We will calculate in the ring \(\mathbb {Z}_r\) of integers with operations modulo r, where \(r\ge 2\) is a suitable integer. In order to keep the notation simple, we will identify the ground set of \(\mathbb {Z}_r\) with [r]. Arithmetic operations modulo r are denoted by \(\cdot _r\), \(+_r\), \(-_r\). If a positive integer x divides an integer y, we write \(x\mid y\), otherwise \(x \not \mid y\).
Definition 1
Let u, m, and \(r \ge m\) be given, where \(m \mid r\). Let \(k=r/m\).
-
(a) For \(a,b\in \mathbb {Z}_r\) define
$$\begin{aligned} g_{a,b}(x)&=a\cdot _r x +_r b , \,\text {for}\, x\in U, \, \text {and} \\ h_{a,b}(x)&=\left\lfloor g_{a,b}(x) / k\right\rfloor ,\, \text {for}\, \text {x} \in \text {U}. \end{aligned}$$ -
(b) The class of linear hash functions from U to M modulo r is the multiset
$$\begin{aligned} \mathcal {H}^{\text {lin}}_{u,m,r}=\{h_{a,b}\mid a,b\in \mathbb {Z}_r\}. \end{aligned}$$
This class has size \(r^2\); representing (choosing) a function from the class requires \(2\left\lceil \log r\right\rceil \) (random) bits. Following [10], Woelfel [41, 42] gave several related constructions of substantially smaller subclasses that exhibit behaviour similar to that of \(\mathcal {H}^{\text {lin}}_{u,m,r}\).
The basic result of [10] and of this paper is that \(\mathcal {H}^{lin }_{u,m,r}\) is approximately 2-wise independent if r is large enough. (This will be proved in Sect. 3.)
Theorem 1
(i) If \(m,u\ge 2\) and \(r\ge (u-1)m\) is a multiple of m, then \(\mathcal {H}^{lin }_{u,m,r}\) is \(\frac{9}{8}\)-approximately 2-wise independent. More precisely, for \(i \in M\), \(x \in U\) we have
and for arbitrary \(i_1, i_2\in M\) and distinct \(x_1, x_2\in U\) we have
where \(c = c(u,m,r) = (4z^2+4z+1)/(4z^2+4z)\), for some \(z\ge \left\lfloor r/((u-1)m)\right\rfloor \).
(ii) If r and m are powers of a prime p, and \(r\ge um/p\), then \(\mathcal {H}^{lin }_{u,m,r}\) is 2-wise independent. (The most natural value for p is 2.)
Section 2 contains technical preparations involving “gap matrices”, which form the basis for the proof of the main theorem in Sect. 3. In Subsect. 4.1 the main result is extended to keys that are represented as sequences of numbers, a standard variation of hash classes, which leads to a more efficient evaluation for very long keys. In Subsect. 4.3 we show that the approach is sufficient for constructing sequences in \(M=[m]\) that are sufficiently close to two-independence for carrying out two-independent sampling in the sense of [8]. As an example, we show that for \(r\ge u^{3/2} \cdot m\) the sequence
where \(a,b \in [r]\) are chosen uniformly at random, is suitable. Subsection 4.4 deals with the problem of “collapsing the universe”: Given a subset \(S\subseteq U\) of n keys from \(U=[u]\), transform the long keys \(x\in U\) into ones (\(x'\)) of length \(O(\log \log u + \log n)\) such that this transformation is one-to-one on S. We give a construction that uses just a linear hash function from \(\mathcal {H}^{\text {lin}}_{u,m,r}\) to achieve the length \(\log r = O(\log \log u + \log n)\).
Finally, Sect. 5 discusses various applications of and observations about the hash class \(\mathcal {H}^{\text {lin}}_{u,m,r}\) that have appeared in the literature since it was first described in 1996.
2 Gap Matrices
This section provides bounds on the number of 1’s in 0–1 matrices with a certain shape.
Definition 2
Let \(\gamma , k, \ell \ge 1\) be integers. A \(k \times \ell \) matrix \(A=(a_{ij})_{0\le i<k,0\le j < \ell }\) with entries from \(\{0,1\}\) is called a gap-\(\gamma \) (diagonal) matrix if the 1’s are arranged in diagonals of (horizontal/vertical) distance \(\gamma \), i.e., if there is some \(t \in [\gamma ]\) such that
Figure 1 shows examples of gap-\(\gamma \) square matrices. By \(N\!_A\) we denote the number of 1’s in a 0–1 matrix A. In a \(k\times \ell \) matrix we expect \(N\!_A\) to be about \(k\ell /\gamma \), if \(\gamma \le k, \ell \). Even in \(k\times k\) matrices there will be deviations if \(\gamma \not \mid k\), as demonstrated in Fig. 1. We provide bounds on the relative deviation \(N\!_A/(k\ell /\gamma )\), for \(k\times \ell \) matrices A.
Proposition 1
Assume A is a gap-\(\gamma \) matrix of dimensions \(k\times \ell \) with \(k,\ell \ge \gamma \). Then we have the following:
-
(a) If \(\gamma \mid k\) or \(\gamma \mid \ell \), then \(N\!_A = k\ell /\gamma \).
-
(b) If \(z=\left\lfloor k/\gamma \right\rfloor =\left\lfloor \ell /\gamma \right\rfloor \), then
$$\begin{aligned} \frac{1}{c_z} \;\le \; \frac{N\!_A}{k^2/\gamma } \;\le \; c_z \text {, for } c_z = 1+\frac{1}{4z(z+1)} . \end{aligned}$$ -
(c) If \(y=\left\lfloor k/\gamma \right\rfloor \le z =\left\lfloor \ell /\gamma \right\rfloor \), then
$$2-c_{y,z} \;\le \; \frac{N\!_A}{k\ell /\gamma } \;\le \; c_{y,z} \text {, for } c_{y,z} = 1+\frac{1}{4y(z+1)}. $$
Remark 1
-
(i)
Note that \(1/c_z = 4z(z+1)/(4z(z+1)+1) = 1 - 1/(2z+1)^2\) and \(2-c_{y,z} = 1 - 1/(4y(z+1))\). In (b), the upper bound is \(\le \frac{9}{8}\), the lower bound \(\ge \frac{8}{9}\). Asymptotically, the upper and lower bounds in (b) and (c) are \(1 \pm O(\gamma ^2/k^2)\) and \(1 \pm O(\gamma ^2/(k\ell ))\), respectively.
-
(ii)
A bound \(\left|{N\!_A}/({k\ell /\gamma }) - 1\right| \le \gamma ^2/(4k\ell )\) for \(k \times \ell \) gap-\(\gamma \) matrices was proved in [10, Lemma 8(c)]. It depends on the concrete values of k, \(\ell \), and \(\gamma \) if that bound or the one from Proposition 1(b) and (c) is stronger. For \(\gamma \) slightly smaller than \(k=\ell \) (hence \(y=z=1\)) the new bound is smaller by essentially a factor of 2.
-
(iii)
The bounds in (b) are optimal. (See Remark 2 below.)
Proof
(of Proposition 1). (a) If \(\gamma = \ell \), each row contains exactly one 1, and hence \(N\!_{A} = k\). (For an example see Fig. 2.) If \(\gamma \mid \ell \), we may partition A into \(\ell /\gamma \) many disjoint \(k\times \gamma \) gap-\(\gamma \) matrices. So \(N\!_A= (\ell /\gamma ) \cdot k = k\ell /\gamma \). The cases \(\gamma = k\) and \(\gamma \mid k\) are similar.
(b) We can assume w.l.o.g. that \(k\le \ell \). (Otherwise consider the transpose of A.) Because of (a), we may assume that \(\gamma \not \mid k\) and \(\gamma \not \mid \ell \). Let \(k'= k \bmod \gamma \) and \(\ell ' = \ell \bmod \gamma \). With \(\delta = k'/\gamma \) and \(\varepsilon = \ell '/\gamma \) we have \(0< \delta \le \varepsilon < 1\) and \(k=(z+\delta )\gamma \) and \(\ell = (z+\varepsilon )\gamma \).
We partition A into four matrices, splitting after the \(z\gamma \)’th column and \(z\gamma \)’th row. We get a \((k-k')\times (\ell -\ell ') \) submatrix \(A_1\), a \((k-k')\times \ell '\) submatrix \(A_2\), a \(k'\times (\ell -\ell ')\) submatrix \(A_3\), and a \(k'\times \ell '\) submatrix \(A_4\). Since \(\gamma \) divides \(k-k'\) and \(\ell -\ell '\), part (a) gives that
So
We need to bound this error term.
Upper Bound. Since no row in \(A_4\) contains more than one 1, we have \(N\!_{A_4}\le k'\). Hence
Since \(\delta \le \varepsilon \), this implies that \((N\!_{A_4} - k'\ell '/\gamma )/(k\ell /\gamma ) \le \delta (1-\delta )/(z+\delta )^2\). Standard methods yield that the last fraction is maximal for \(\delta = \delta _z = z/(1+2z)\); the corresponding maximal value is \({\delta _z(1-\delta _z)}/{(z+\delta _z)^2}=1/(4z(1+z))\). With (4) we get \(N\!_A/(k\ell /\gamma ) \le 1+1/(4z(1+z))\), which is the claimed upper bound.
Lower Bound. According to (4), we must show that \((k'\ell '/\gamma - N\!_{A_4} )/(k\ell /\gamma ) \le 1/(2z+1)^2\).
Case 1: \(\delta + \varepsilon \le 1\), or equivalently \(k' + \ell ' \le \gamma \). — In this case we have
We observe that the last expression cannot be maximal if \(\delta < \varepsilon \). (Assume \(\delta < \varepsilon \). Define a mapping
Since
the value \({\delta \varepsilon }/((z+\delta )(z+\varepsilon ))\) cannot be maximal.) Thus we only have to bound \(\delta ^2/(z+\delta )^2\) over all \(\delta \le \frac{1}{2}\). Clearly, this maximum is \(\bigl (\frac{1}{2}\bigr )^2/\bigl (z+\frac{1}{2}\bigr )^2=1/(2z+1)^2\).
Case 2: \(\delta + \varepsilon > 1\), or equivalently \( k'+\ell ' > \gamma \). — In this case \(A_4\) contains at least \(k' + \ell ' -\gamma =(\delta + \varepsilon -1)\gamma \) many 1’s. Hence
As above, we observe that the last expression cannot be maximal for \(\delta < \varepsilon \). (Define \(D:[0,\varepsilon -\delta ] \ni \zeta \mapsto \ln (1-(\delta +\zeta )) + \ln (1-(\varepsilon -\zeta )) - \ln (z+(\delta +\zeta )) - \ln (z+(\varepsilon -\zeta ))\). Then
This is positive, since the function \(\tau \mapsto (1-\tau )(z+\tau )\) is decreasing for \(\tau \in (0,1)\).) Thus we only have to bound \((1-\delta )^2/(z+\delta )^2\) over all \(\delta > \frac{1}{2}\). This expression is bounded by \(\bigl (\frac{1}{2}\bigr )^2/\bigl (z+\frac{1}{2}\bigr )^2=1/(2z+1)^2\).
(c) We now turn to matrices that are not necessarily almost square: Let A be a \(k\times \ell \) gap-\(\gamma \) matrix with \(y=\left\lfloor k/\gamma \right\rfloor \le z =\left\lfloor \ell /\gamma \right\rfloor \). Since the other cases have been covered already, we may assume that \(y < z\) and that \(\gamma \not \mid k\) and \(\gamma \not \mid \ell \). Let \(\delta = k/\gamma -y\) and \(\varepsilon = \ell /\gamma - z\). Then \(0< \delta , \varepsilon <1\). We extend A to an almost square gap-\(\gamma \) matrix \(\bar{A}\) by adding \((z-y)\gamma \) rows at the bottom. Let \(\bar{k} = (z+\delta )\gamma \) be the number of rows in \(\bar{A}\). Applying part (b) yields
Further, by Proposition 1(a), we have \(N\!_{\bar{A}} - N\!_A = (z-y)\ell \).
Upper Bound. We have
Lower Bound. By (7) we have
\(\square \)
Remark 2
There are arbitrarily large \(k\times k\) matrices for which the bounds in Proposition 1(b) are optimal. For the upper bound we first consider \(z=1\). In the middle \(k\times k\) matrix of Fig. 1 we have \(\gamma =9\) and \(k=12\), and \(N\!_A = 18 = \frac{9}{8}k^2/\gamma \) many 1’s. In the same way one gets \(12h \times 12h\) gap-9h matrices for arbitrary \(h\ge 1\) that realize excess ratio \(c_1=\frac{9}{8}\). For arbitrary z, we define a \(k\times k\) matrix for \(k=2z(z+1)\), as follows. We choose \(\gamma =2z+1\) and obtain \(k' = k-\gamma z = z\). If we place 1’s on the main diagonal, the \(k'\times k'\) submatrix in the lower right corner has \(k'\) many 1’s; hence the total number of 1’s is \((k^2-k'^2)/\gamma + k'\). One easily checks that \(((k^2-k'^2)/\gamma + k')/(k^2/\gamma )\) evaluates to \(1 + 1/(4z(1+z)) = c_z\). For the lower bound we first consider \(k=9\) and \(\gamma =6\), hence \(z=1\). We obtain a matrix A with the minimum number of 1’s by placing them in the diagonals running from (0, 3) to (5, 8) and from (3, 0) to (8, 5), which gives \(N\!_A=12\), hence \(N\!_A/(k^2/\gamma ) = 12/(9^2/6) = \frac{8}{9} = 1/c_1\). Again, it is easy to find larger matrices with ratio \(1/c_1\) and examples for arbitrary z with ratio \(1/c_z\).
3 Proof of Universality Properties
In this section we show that \(\mathcal {H}^{\text {lin}}_{u,m,r}\) is approximately two-wise independent in general and two-wise independent if r is a prime power.
We will assume throughout this section that \(U = [u] = \{0, \dots , u-1\}\) for some \(u \ge 2\) and \(M = [m] =\{0, \dots , m-1\}\) for some \(m \ge 2\), and that \(r=km\) is a multiple of m.
As preparation for the proof, we provide some observations and lemmas.
Fact 1
Let \(z \in \mathbb {Z}_r\) and let \(\gamma =\gcd (z,r)\). Then for arbitrary \(t\in \mathbb {Z}_r\) the following holds:
Proof
Using the fact (“Bézout’s Lemma”) that we may write \(\gamma = \alpha r + \beta z\) for certain \(\alpha ,\beta \in \mathbb {Z}\), one sees that the range \( m_z(\mathbb {Z}_r)\) of the mapping \(m_z :\mathbb {Z}_r\ni x \mapsto x \cdot _r z \in \mathbb {Z}_r\) is \((\gamma )\), the set of all multiples of \(\gamma \) in \(\mathbb {Z}_r\). The fundamental homomorphism theorem from group theory applied to the group homomorphism \(m_z\) (with respect to the additive group structure \((\mathbb {Z}_r,+_r,0)\)) yields that \(\left| m_z^{-1}(t)\right|\) is the same for all \(t \in m_z(\mathbb {Z}_r)=(\gamma )\). Since \(\left|(\gamma )\right|= r/\gamma \), the claim follows. \(\square \)
Recall that \(h(x) = \left\lfloor g(x)/k\right\rfloor \), where \(g(x) = g_{a,b} (x) = a\cdot _r x +_r b\), for \(a, b \in \mathbb {Z}_r\) chosen uniformly at random. We make some simple observations on the distributions of g(x) and h(x).
Lemma 1
For each \(x \in U\) the following holds:
-
(a) g(x) is independent of a and uniformly distributed in \(\mathbb {Z}_r\);
-
(b) h(x) is independent of a and uniformly distributed in M.
Proof
(a) For \(s \in \mathbb {Z}_r\) and \(\alpha \in \mathbb {Z}_r\) fixed we calculate:
Thus g(x) and a are independent. Since the events \(\{a=\alpha \}\), \(\alpha \in \mathbb {Z}_r\), partition the probability space, we also get \({\mathbf {Pr}}\left( g(x)=s\right) =1/r\).
(b) This follows immediately from (a), since k divides r, and hence the operation \(s \mapsto \left\lfloor s/k\right\rfloor \) maps exactly k elements of \(\mathbb {Z}_r\) to each element i of M. \(\square \)
From here on, assume that \(x_1, x_2 \in U\) with \(x_2 < x_1\) are fixed. Let
Then \(0< z < u\) and \(1 \le \gamma < u\). Now we can describe the joint distribution of \(g(x_1)\) and \(g(x_2)\) (see Fig. 3 below).
Lemma 2
(a) For arbitrary \(t \in \mathbb {Z}_r\) we have
(b) For arbitrary \(s_1, s_2 \in \mathbb {Z}_r\) we have
Proof
(a) Clearly, for \(g = g_{a,b}\), we have
Thus, the claim follows immediately from Fact 1.
(b) Since \(g(x_1)\) and a are independent by Lemma 1(a) and \(g(x_1) -_r g(x_2) = a \cdot _r z\) is a function of a, the random variables \(g(x_1)\) and \(g(x_1) -_r g(x_2)\) are independent. Thus, using part (a) we may calculate:
(For \((*)\) we use both (a) and Lemma 1(a).) Thus (b) is proved. \(\square \)
Finally we are ready to formulate and prove the central theorem about our hash functions. Let
(The definition from [10] is changed in the clever way suggested in [41] to take the special case into account where all \(\gamma \le u-1\) that divide r also divide k.) Note that \(\varGamma _{u,m,r} \le u-1\).
Observation 1
-
(a)
If r is a power of a prime p and \(r\ge um/p\), then \(\varGamma _{u,m,r}=0\).
-
(b)
If \(r\ge (u-1)m\), then \(\varGamma _{u,m,r} \le k\).
Proof
(a) Since \(r=mk\), the numbers m and k are also powers of p. Now if \(\gamma < u\) divides r, it is a power of p itself, strictly smaller than \(u \le rp/m\). So \(\gamma \) divides \(r/m=k\). This shows that \(\varGamma _{u,m,r}=0\). (b) This is trivial. \(\square \)
Even if no assumption about r is made, we will always be able to make sure that \(\varGamma _{u,m,r} \le k\), so that the following is well-defined. With \(\varGamma = \varGamma _{u,m,r}\), let
Note that \(1 \le c_{u,m,r} \le \frac{9}{8}\) and that \(c_{u,m,r} = 1 + O(u^2/k^2)\).
Theorem 2
(Main Theorem). Let h be chosen at random from \(\mathcal {H}^{\text {lin}}_{u,m,r}\), where \(m\mid r\). Then the following holds, for arbitrary \(i_1, i_2 \in M\) and distinct \(x_1, x_2 \in U\): (a) If r is a power of a prime p and \(r \ge um/p\), then
i.e., in this case \(\mathcal {H}^{\text {lin}}_{u,m,r}\) is 2-wise independent.
(b) If \(r\ge (u-1)m\), then \({\mathbf {Pr}}\left( h(x_1) = i_1\right) = 1/m\) and
i.e., in this case \(\mathcal {H}^{\text {lin}}_{u,m,r}\) is \(c_{u,m,r}\)-approximately 2-wise independent.
Remark 3
A weaker version of the upper bound in (b), a similar lower bound, and (a) were shown in [10]. Woelfel [41] was the first to prove the upper bound in (b), with a different method.
Proof
Our aim is to find upper and lower bounds on
From the definition of \(h_{a,b}\) it is immediate that with \(I_i = \{ki, ki+1, \ldots , k(i+1) - 1\}\), for \(i\in M\), we have
As usual, let \(\gamma =\gcd (x_1-x_2,r)\). Adding the equation given in Lemma 2(b) for all \(s_1 \in I_{i_1}, s_2 \in I_{i_2}\), we obtain
where \( S_{i_1, i_2} = \{(s_1, s_2) \in I_{i_1} \times I_{i_2} \mid \gamma \text { divides } s_1 -_r s_2\}, \) hence
We can regard (the characteristic function of) \(S_{i_1, i_2}\) as being represented by a \(k \times k\) matrix with row indices \(s_1\) from \(I_{i_1}\) and column indices \(s_2\) from \(I_{i_2}\), with the entry for \((s_1, s_2)\) being 1 if \(\gamma \) divides \(s_1 -_r s_2\) and 0 otherwise. (Figure 3 shows all these matrices in one picture, for an example.) Clearly this is a gap-\(\gamma \) matrix as considered in Sect. 2, with shifted row and column numberings. So we can apply the estimates from there.
Case 1: \(\gamma \mid k\). — In this case the quotient in (9) equals 1, by Proposition 1(a). Since the hypothesis of (a) implies that \(\gamma \) divides k (as seen in Observation 1), we have proved (a).
Case 2: \(\gamma \not \mid k\). — With Proposition 1(b) we may estimate:
We have \(\gamma \le \varGamma \), hence \(\left\lfloor k/\varGamma \right\rfloor (\left\lfloor k/\varGamma \right\rfloor + 1) \le \left\lfloor k/\gamma \right\rfloor (\left\lfloor k/\gamma \right\rfloor +1)\), so (10) implies part (b). \(\square \)
Remark 4
One can utilize the construction in Remark 2 of gap-\(\gamma \) matrices that realize the upper and lower bounds for the number of 1’s to find u, m, and r for which the bounds in Theorem 2 are optimal. The details are left to the reader.
3.1 Generalization: r Not Divisible by m
We have proven the central universality properties of \(\mathcal {H}^{\text {lin}}_{u,m,r}\). However, we always had to assume that m, the size of the range, is a divisor of r, which may not always be convenient. For example, we might wish to use as r some power of 2, because computer arithmetic in \(\mathbb {Z}_r\) is particularly efficient in this case, but on the other hand wish to use some m that is not a power of 2. Another such case arises when we want to choose r randomly from some range and do not have control over the divisors of r. (See Subsect. 4.4 for such a situation.) For this reason we note here that the methods developed so far make it possible to construct approximately 2-wise independent classes also with a ring \(\mathbb {Z}_r=\{0,\ldots ,r-1\}\) where r is not a multiple of m.
Partition \(\mathbb {Z}_r\) into m intervals \(I_0, \ldots , I_{m-1}\), where \(I_0, \ldots , I_{(r\,\bmod \, m)-1}\) have length \(\overline{k} = \left\lceil r/m\right\rceil \) and \(I_{r\,\bmod \, m}, \ldots , I_{m-1}\) have length \(\underline{k} = \left\lfloor r/m\right\rfloor \). Let \(h_{a,b}(x) = i\) if \(g_{a,b}(x) \in I_i\). Arithmetically, this is expressed as follows: Precompute \(\overline{k}\) and \(\underline{k}\), as well as \(B = r \bmod m\) and \(T = B \cdot \overline{k}\). Then let
The resulting class serves all hash values with almost the same probability in the case \(r \gg m\), since
The class of all \(h_{a,b}\), \(a,b\in [r]\), is again called \(\mathcal {H}^{\text {lin}}_{u,m,r}\). This class satisfies approximate independence conditions, which can be proved in analogy to Theorem 2(b), using Proposition 1(b) for \(\ell \in \{k-1,k,k+1\}\). (Assume \(\gamma = \gcd (x_1-x_2,r)\) as in the proof of Theorem 2. Following that proof, we will have \(\left|I_{i_1}\right|, \left|I_{i_2}\right| \in \{\underline{k},\overline{k}\}\). Since the error-free formula from Proposition 1(a) applies if at least one of the numbers \(\left|I_{i_1}\right|\) and \(\left|I_{i_2}\right|\) is divisible by \(\gamma \), the quotients \(y=\left\lfloor \left|I_{i_1}\right|/\gamma \right\rfloor \) and \(z=\left\lfloor \left|I_{i_2}\right|/\gamma \right\rfloor \) can be assumed to be the same.) The reader is invited to work out the details of the proof of the following statement.
Proposition 2
Assume \(r\ge (u-1)m\), and consider the generalized version of class \(\mathcal {H}^{\text {lin}}_{u,m,r}\) as just described. Let \(x_1,x_2 \in U=[u]\) be distinct and \(i_1,i_2 \in M =[m]\) be arbitrary. Then we have, with \(c_{u,m,r}\) as defined for Theorem 2:
\(\square \)
4 Variations of the Construction
4.1 Vectors as Keys and Range Elements
If keys are very long, it may be inconvenient in practice to treat them as integers in a universe \(U=[u]\), since one has to carry out long integer multiplication. How can one deal with longer keys, e.g., given as vectors in a universe \(U'=U^\ell \), for \(U=[u]\), when the range is \(M=[m]\)? It is well known that 2-independence can be ensured by just using 2-independent hashing on the \(\ell \) components separately and taking the sum of the results modulo m. In [10] it was sketched how to proceed just utilizing the ring \(\mathbb {Z}_r\) with \(r\ge (u-1)m\) and \(k=r/m\), as before. An advantage is that fewer random bits are needed and that, as observed by Woelfel [41, 42], a parsimonious extension to a range that is also a set of vectors is possible. A hash function is specified by a coefficient vector \({\varvec{a}} = (a_0,\dots ,a_{\ell -1})\in \mathbb {Z}_r^\ell \) and \(b\in \mathbb {Z}_r\). The hash function \(h_{{\varvec{a}}, b}\) is defined by
for \({\varvec{x}}=(\xi _0,\dots ,\xi _{\ell -1})\in [u]^\ell \). (The superscript “(r)” indicates summation in \(\mathbb {Z}_r\).) The resulting class exhibits the same universality properties as \(\mathcal {H}^{\text {lin}}_{u,m,r}\), as stated in Theorem 2. The proof is very similar to that of Theorem 2. The idea is simple. Two different keys \({\varvec{x}}^{(1)}, {\varvec{x}}^{(2)}\in U^\ell \) must have one component in which they differ, \(\xi ^{(1)}_0\ne \xi ^{(2)}_0\), say. One fixes \(a_1,\dots ,a_{\ell -1}\) and studies the joint distribution of the pair
of random variables. The values \(C_1\) and \(C_2\) are regarded as constant. The analysis is practically identical to that in the proof of Theorem 2.
Example: Assume we work on a computer with fast 64-bit arithmetic, the keys are bit strings of arbitrary, fixed length given as a sequence of words, and the range is \(M=[2^\mu ]\) for some \(\mu \le 32\). Then we may let \(u=2^{32}\) and \(r=2^{64}\). A key x is split into 32-bit pieces \(\xi _0,\xi _1,\dots ,\xi _{\ell -1}\) for a suitable \(\ell \). A hash function is represented as a sequence \((a_0,\dots ,a_{\ell -1},b)\) of 64-bit integers \(a_0,\dots ,a_{\ell -1},b\). The modulo r operation is for free, since the standard hardware carries out multiplication and addition modulo \(2^{64}\). The final division by \(2^{64-\mu }\) is done by a shift.
This construction is helpful if the range M is not too big— the ring size r must be bigger than \((u-1)m\) or um / p in the case that r is a power of a prime number p. Woelfel [41, 42] gave a more general, very elegant construction to remove this restriction. It resembles the convolution construction over finite fields studied in [25], but is made to work over the ring \(\mathbb {Z}_r\). Assume the range is a set \(M'=M^\varrho = [m]^\varrho \) of vectors. Of course, we could use \(\varrho \) many independent hash functions with range M to build one with range \(M^\varrho \). However, then the number \((\ell +1)\varrho \) of required random coefficients from \(\mathbb {Z}_r\) grows with the product of \(\ell \) and \(\varrho \).
To save random bits, and storage space, Woelfel proposed using convolution (or “polynomial multiplication”) over \(\mathbb {Z}_r\). A hash function is given by vectors \({\varvec{a}} = (a_0,\dots ,a_{\ell +\varrho -2})\) and \({\varvec{b}}=(b_0,\dots ,b_{\varrho -1})\) with coefficients from \(\mathbb {Z}_r\) For a key \({\varvec{x}}=(\xi _0,\dots ,\xi _{\ell -1})\in U^\ell \) and \(0\le \kappa < \varrho \) component \(\kappa \) of the hash value \(h({\varvec{x}})\) is given by
The resulting class of hash functions from \(U'=U^\ell \) to \(M'=M^\varrho \) is called \(\mathcal {H}^{\text {conv},\ell ,\varrho }_{u,m,r}\). In the “clean” case where r is a power of a prime number p and \(r\ge um/p\) one can show that \(\mathcal {H}^{\text {conv},\ell ,\varrho }_{u,m,r}\) is 2-wise independent. If we only have that m divides r and \(r\ge (u-1)m\), then \(\mathcal {H}^{\text {conv},\ell ,\varrho }_{u,m,r}\) is \((c_{u,m,r})^\varrho \)-approximately 2-wise independent, for \(c_{u,m,r}\) the approximation constant for class \(\mathcal {H}^{\text {lin}}_{u,m,r}\) as in Theorem 2. (For details, see [41, 42]. There it is also discussed how the parameter spaces for the coefficients may be reduced.)
4.2 Higher Independence
In [10] it was proved that polynomials over \(\mathbb {Z}_r\) can be used to obtain (approximately) d-wise independent classes for arbitrary fixed \(d\ge 2\). Here, we give the relevant definitions and state the theorem, and refer the reader to [10] for the proof. Unfortunately, the construction seems to be mainly of theoretical interest. While it achieves higher independence without prime numbers or finite fields being involved (and without tabulation [37], which requires longer tables of random entries), the big disadvantage of the construction is that r has to be quite large, which makes for slow evaluation. —As before, we assume that a universe \(U=[u]\) and a range \(M=[m]\) are given.
Definition 3
Let \(d\ge 2\). A class \(\mathcal {H}\) of hash functions from U to M is called d-wise independent if for arbitrary distinct keys \(x_0,\ldots ,x_{d-1}\in U\) and arbitrary \(i_0,\ldots ,i_{d-1}\in M\) we have
for h chosen uniformly at random from \(\mathcal {H}\). For \(c\ge 1\), such a class is called c-approximately d-wise independent if for each key x the hash value h(x) is uniformly distributed in M and if for distinct \(x_0,\ldots ,x_{d-1}\in U\) and arbitrary \(i_0,\ldots ,i_{d-1}\in M\) we have
As before, we fix \(r=km\) for some positive integer k, and consider the ring \(\mathbb {Z}_r=[r]\) with arithmetic modulo r. Further, fix \(d\ge 2\).
Definition 4
For \({\varvec{a}}=(a_0,\ldots ,a_{d-1})\in \mathbb {Z}_r^d\) define (arithmetic in \(\mathbb {Z}_r\))
and
Further, let
Theorem 3
If \(r\ge m \cdot (u-1)^{\left( {\begin{array}{c}d\\ 2\end{array}}\right) }\), then \(\mathcal {H}^{deg\text{- } d}_{u,m,r}\) is c-approximately d-wise independent for \(c = c(u,m,r,d) = (1 - m (u-1)^{\left( {\begin{array}{c}d\\ 2\end{array}}\right) }/r)^{-d}\). More precisely, h(x) is uniformly distributed in [m], for each \(x\in U\), and for arbitrary distinct elements \(x_0,\ldots ,x_{d-1}\in U\) and arbitrary values \(i_0,\ldots ,i_{d-1}\in M\) we have:
-
(a) \({\displaystyle \biggl (1-\frac{m (u-1)^{\left( {\begin{array}{c}d\\ 2\end{array}}\right) }}{r}\biggr )^d\le \frac{{\mathbf {Pr}}\left( h(x_{\lambda })=i_{\lambda }, 0\le \lambda <l\right) }{1/m^d}\le \biggl (1+\frac{m (u-1)^{\left( {\begin{array}{c}d\\ 2\end{array}}\right) }}{r}\biggr )^d}\);
-
(b) if u, m, and r are powers of the same prime number \(p\ge 2\), and \(r\ge m (u/p)^{\left( {\begin{array}{c}d\\ 2\end{array}}\right) }\), we even have d-wise independence, i.e.,
$$\begin{aligned} {\mathbf {Pr}}\left( h(x_{\lambda })=i_{\lambda }, \text { for }0\le \lambda <l\right) =\frac{1}{m^d}. \end{aligned}$$\(\square \)
4.3 Two-Wise Independent Sampling
Here we describe how the function classes described before can be used for two-independent sampling in the sense of Chor and Goldreich [8]. There one has a finite set \(M=[m]\) and a set \(A\subseteq M\) of which one wants to find an arbitrary element. The only operation available regarding A is a membership test “is \(x\in A\)?” for \(x\in M\). The most obvious search method is random sampling (keep choosing random elements of M until an element of A is found); however, this method has the disadvantage that in expectation \((1/\varrho )\log \left|M\right|\) random bits are needed, where \(\varrho = \left|A\right|/\left|M\right|\) is the density of A in M (which sometimes is small). In order to save random bits, one employs a 2-wise independent sequence \(X_0,\ldots ,X_{u-1}\) of random variables, uniformly distributed in M. These elements are tested one after the other, until an element of A is found (“success”). In [8], Chor and Goldreich use finite fields based on prime numbers for constructing such sequences. They show that the probability that the first j trials all fail is bounded by \(1/(j\varrho )\). If \(\mathcal {H}\) is an arbitrary 2-wise independent class of hash functions from \(U=[u]\) to M, one can use \(X_i=h(i)\) for \(0\le i < u\). The classes studied in the present paper can be used immediately if m and u are powers of the same prime number p and \(r\ge um\). This requires \(2\log r\) random bits for the coefficients a and b.
We wish to show with a slightly refined analysis that the classes \(\mathcal {H}^{\text {lin}}_{u,m,r}\) can be used for arbitrary \(M=[m]\) and \(U=[u]\) if one chooses r sufficiently large. Let h be chosen uniformly at random from \(\mathcal {H}^{\text {lin}}_{u,m,r}\). We let
For given \(A\subseteq M\) we let
for \(0 \le i < u\). Since \(X_i\) is uniformly distributed, we have \({\mathbf {E}}\left( Y_i\right) =1/\varrho \).
For \(0\le j\le u\), the number of successes one encounters in \(X_0,\dots ,X_{j-1}\) is
Clearly, \({\mathbf {E}}\left( Z_j\right) =j\varrho \). The crux of Chor and Goldreich’s method is to note that from the two-wise independence of the \(Y_i\)’s it follows that \({\mathbf {Var}}\left( Z_j\right) \le {\mathbf {E}}\left( Z_j\right) \). The Chebychev-Cantelli inequality \({\mathbf {Pr}}\left( X \le {\mathbf {E}}\left( X\right) - t\right) \le {\mathbf {Var}}\left( X\right) /({\mathbf {Var}}\left( X\right) +t^2)\) then implies that
This is the desired bound on the failure probability of 2-wise independent sampling.
We show that for all sufficiently large r the required bound \({\mathbf {Var}}\left( Z_j\right) \le {\mathbf {E}}\left( Z_j\right) \) is true even though the random variables \(X_0,\dots ,X_{u-1}\) are only approximately 2-wise independent.
Lemma 3
If \(r \ge u^{3/2} m\), then \({\mathbf {Var}}\left( Z_j\right) \le {\mathbf {E}}\left( Z_j\right) \), for \(0\le j < u\).
Proof
We calculate:
Clearly, \({\mathbf {Var}}\left( Y_i\right) = \varrho (1 - \varrho )\) for all i, \(0\le i < j\). We analyze a summand of the second sum in (11). Fix \(i \not = i'\). For \(s \in M\), \(t \in M\), let
Then
For \(s,t\in A\) arbitrary we have
by Theorem 2(b). Since \(r\ge (u-1)m\), we have \(c_{u,m,r} - 1 \le 1/(4\left\lfloor r/((u-1)m)\right\rfloor \cdot {}\) \((\left\lfloor r/((u-1)m)\right\rfloor +1)) \le (um/r)^2\), and we get by summing up according to (11) and (12):
By the assumption \(r \ge u^{3/2}m \) we get \({\mathbf {Var}}\left( Z_j\right) \le j \varrho (1 - \varrho ) + j \varrho ^2 = j \varrho = {\mathbf {E}}\left( Z_j\right) \), as desired. \(\square \)
4.4 “Collapsing the Universe” Without Finite Fields
Assume \(U=[u]\) is very large, meaning that if we represent keys as bit strings, they are very long. Very often, in applications like dictionaries, one wishes to replace a key x by a shorter key \(x'\) in a range \(U'=[u']\), where a “collapse function” \(h :U \rightarrow U'\) maps x to \(x'\). For this to work, h must be one-to-one on the set \(S \subseteq U\) of “relevant keys”. With the results from Sect. 4 it is easy to find a good collapsing function with a description size of \(O(\log u)\) bits. Collapse functions with smaller description sizes can be constructed deterministically [19, 33], or, usually much simpler, in a randomized way [11, 35]. In the latter case the description size of h is \(O(\log n + \log \log u)\) bits, and it requires choosing a random prime of this bitlength or knowing a finite field with elements of this bitlength. Here we demonstrate that class \(\mathcal {H}^{\text {lin}}_{u,m,r}\) can be used to build collapse functions with such a small description size on the basis of modular arithmetic alone, without the need for prime numbers or finite field arithmetic.
Technically, assume \(S=\{x_0,\ldots ,x_{n-1}\}\subseteq U=[u]\) is given. We let \(m = \left\lceil n^3\log u\right\rceil \). The size r of the ring \(\mathbb {Z}_r\) is chosen uniformly at random from \([m^2,2m^2)\). Then \(\left\lfloor r/m\right\rfloor \in [m,2m)\). (Since usually r is not a multiple of m, we need to use the modified functions from Subsect. 3.1.) For the hash class \(\mathcal {H}^{\text {lin}}_{u,m,r}\) to function in a 2-wise independent way on S, we only require, to make the proof of Theorem 2 work, that r is S-good in the following sense:
If this is the case, we will have \({\mathbf {Pr}}\left( h(x_i)=h(x_j)\right) = O(1/m) = O(1/n^3)\) for all distinct \(x_i,x_j\in S\), and hence \({\mathbf {Pr}}\left( h\text { is not one-to-one on }S\right) =O(1/n)\). Clearly, for r being S-good it is sufficient that
The following lemma implies that among the numbers in \([m^2,2m^2)\) a constant fraction is S-good.
Lemma 4
Let Y be a sufficiently large integer, and let \(L \ge \ln Y \). Then (at least) a constant fraction of the numbers r in \([L^2, 2L^2)\) satisfy \(\gcd (r,Y) \le r/L\).
Proof
Let
Then \(A \subseteq B \cup C_L\), where
and
Indeed, if \(r \in [L^2,2L^2) - (B \cup C_L)\), then r has a prime factor \(p>L\) that does not divide Y, hence \(\gcd (r,Y) \le r/p < r/L\). We must estimate the sizes of B and \(C_L\). First note that Y has at most \(\ln Y/\ln L\) prime factors larger than L. Each such factor divides at most \(L^2/L=L\) elements of \([L^2,2L^2)\); thus, \(\left|B\right| \le L \cdot \ln Y/\ln L \le L^2/\ln L \).
In order to deal with \(C_L\), we give a lower bound on the size of the complementary set \(D_L=\{r \in [L^2,2L^2) \mid r \, \text {has a prime factor larger than}\, L \}\). It is well known that \(\left|D_L\right| = (\ln 2 - O(1/\log L)) \cdot L^2\) (for \(L \rightarrow \infty \)), the reason being the following: For each prime number \(p \in (L,L^2]\) the set of multiples of p in \((L^2,2L^2)\) has size at least \(\lfloor L^2/p \rfloor \ge L^2/p - 1\). If we just add these figures, numbers \(r\in [L^2,2L^2)\) with two distinct prime factors \(p_1,p_2>L\) are counted twice. Note that in this situation we must have \(p_1,p_2 < 2L\) and \(r=p_1p_2\); hence, by the prime number theorem, there are only \(O((L/\log L)^2)=O(L/(\log L)^2)\) many such numbers r. The prime number theorem also entails that there are only \(O(L^2/\log L)\) many primes in \((L,2L^2]\). So we obtain:
A well-known theorem from analytic number theory [20, p. 351, Theorem 427] says that \(\sum _{ p\le x , p \, \text {prime} } \! 1/p = \ln \ln x + B_1 + E(x)\), where \(B_1\) is a constant and \(E(x)= O(1/\log x)\). It follows that
Summing up, we get that
i.e., 69% of the r in \([L^2,2L^2)\) satisfy \(\gcd (r,Y)\le r/L\), asymptotically. \(\square \)
We apply Lemma 4 for \(L=m\) and \(Y=\prod _{ 0 \le i< j <n}\!(x_j-x_i)\). Since \(m=n^3\ln u > \ln Y\), the assumptions are satisfied, and we conclude that asymptotically at least 69% of the r in \([m^2,2m^2)\) are S-good.
Remark 5
In the context of algorithms that offer higher reliability for the running time, like the real-time dictionary from [14], the probability bounds provided by Lemma 4 are too weak. At present, all reliable collapse functions that only use \(O(\log \log u+\log n)\) random bits involve the use of prime numbers.
5 Extensions and Applications
Smaller Pure Arithmetic Classes. Starting from the construction of Sect. 3, Woelfel [41, 42] succeeded in giving smaller hash classes with similar universality properties, thus reducing the space, the number of random bits, and possibly the evaluation time. In particular, he observed that it is sufficient to choose b at random from \(k\cdot [m] = \{ik \mid 0\le i < m\}\) to achieve the results in Sect. 3. Moreover, he showed that if r is a power of some prime p, one can have u as large as r, and one still gets a 1-universal class of hash functions from [u] to [m] with \(m=r/k\), if one chooses a from \(1+p[r/p]\) and b from \(p^{\left\lceil k/2\right\rceil }\cdot [p^{\left\lfloor k/2\right\rfloor }]\) uniformly at random. This class is both very efficient and small. Woelfel’s work contains many more constructions for different universality concepts.
Lower Bounds on the Complexity of Multiplication. In [25], it is shown that 2-wise independent classes contain functions that are difficult to compute in several respects (time-space tradeoff \(T\cdot S=\varOmega (\log u \cdot \log m)\); quadratic \(A\cdot T^2\) bounds for VLSI implementation; bounds for CREW PRAMs, boolean formulas, and constant-depth circuits). All these bounds transfer to integer multiplication in binary notation, by choosing r as a power of 2. Woelfel [43] and Bollig and Woelfel [6] used linear classes for proving lower bounds on the complexity of multiplication on several versions of branching programs, notably OBDDs. The central observation was that the universality properties imply that there are functions of the form \(x\mapsto \left\lfloor ((ax+b)\bmod r)/m\right\rfloor \) with moderately large r and \(U=[u]\) only of size \(m^2\) that are surjective. Bollig, Waack, and Woelfel [5] used a similar approach for lower bounds for multiplication in more general branching programs. Using similar hash classes, Sauerhoff and Woelfel [34] prove a time-space tradeoff for unrestricted, deterministic multi-way branching programs that compute the middle bit of integer multiplication.
Error-Correction Properties. For the purpose of constructing deterministic dictionaries, Miltersen [27] and Hagerup, Miltersen, and Pagh [19] utilized error-correction properties of 2-wise independent classes, in particular of the class from [12] and the classes studied in the present paper.
Limitations for Cuckoo Hashing and Other Applications. Pǎtraşcu and Thorup [31] proved that for min-wise hashing the linear classes fail badly. Dietzfelbinger and Schellbach [15, 16] showed that using linear functions \(h(x)=((ax+b)\bmod p)\bmod m\) (for \(p=u\) prime) and \(h(x)=\lfloor ((ax+b)\bmod r)/(r/m)\rfloor \) as in this paper in the naive way for cuckoo hashing will lead to failure of the data structure. On the other hand, Dietzfelbinger and Woelfel [17] showed how to run cuckoo hashing [32] with polynomials of constant degree in combination with lookup tables of random numbers as hash functions. With Aumüller these authors showed [3] how to modify the construction so that 1-universal classes, 2-wise independent classes, and tables are sufficient, so that now cuckoo hashing can be run with the linear classes from the present paper alone.
3SUM and Fine-Grained Complexity Inside P. 3SUM over an interval [u] of integers, for sets of size n, is the following problem: Given three sets \(A,B,C\subseteq [u]\), all of size n, decide whether there are \(x\in A\), \(y\in B\), \(z\in C\) such that \(x+y=z\). Depending on the model of computation, this problem is thought to have different complexities. The obvious deterministic algorithm (involving sorting and repeated merging) takes time \(\Theta (n^2)\). If one specifies the model in more detail, one may consider the word RAM (random access machine with word length w). Here, randomization, bit-level parallelism, and other tricks make it possible to solve the problem faster, see, e.g., the seminal paper by Baran, Demaine, and Pǎtraşcu [4]. One tool in this faster (randomized) algorithm is universal hashing. The authors utilize a 1-universal hash class of functions from [u] to [m] that is “almost affine”, meaning that for every hash function h from the class there is a constant \(c_h\) such that for \(x,y,z \in [u]\) with \(x+y=z\) we have \(h(x) +_m h(y) \in \{h(x+y)+_m c_h, h(x+y)+_m c_h+_m 1\}\). (The offset \(c_h\) is known from affine functions in vector spaces.) It is not hard to see (see Wang [39], who also considers k-SUM) that our class \(\mathcal {H}^{\text {lin}}_{u,m,r}\) has this property. A second property that is needed is 1-universality, from which it follows that the number of keys mapped to overfull buckets is close to its expectation. The same kind of hash function was used in many works on low-level complexity, e.g., [1, 22, 30, 38], in arguments that prove conditional lower bounds for dynamic data structures and string and graph problems.Footnote 2
Upper Bound on Bucket Size. Knudsen [21] showed that the expected bucket sizes created by \(\mathcal {H}^{\text {lin}}_{u,m,r}\) on a set S of size \(\left|S\right|=m\) is \(O(m^{1/3})\).
Efficiency. Thorup [36] and Dahlgaard et al. [9] experimentally explore the efficiency of different hash classes. Our class \(\mathcal {H}^{\text {lin}}_{u,m,r}\), for u, m, r powers of 2, turns out to be very fast, in particular for values that combine well with the word length of the computer. One should beware, however, that the class must be used only for purposes where its suitability has been proved.
Notes
- 1.
For a natural number n, we denote the set \(\{0,1,\dots ,n-1\}\) by [n]; for a prime number p, we denote by \(\mathbb {Z}_p\) the field of size p, with ground set [p].
- 2.
It should be noted that although the class of functions \(x\mapsto (ax \bmod p)\bmod m\) for primes p is not suitable, the class of functions \(x\mapsto (ax \bmod p)/\left\lfloor p/m\right\rfloor \) is also almost affine and \((1+\frac{1}{m})\)-universal, so it could also have been used for this application.
References
Abboud, A., Williams, V.V., Weimann, O.: Consequences of faster alignment of sequences. In: Esparza, J., Fraigniaud, P., Husfeldt, T., Koutsoupias, E. (eds.) ICALP 2014. LNCS, vol. 8572, pp. 39–51. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-662-43948-7_4
Alon, N., Goldreich, O., Håstad, J., Peralta, R.: Simple constructions of almost \(k\)-wise independent random variables. Random Struct. Algorithms 3, 289–304 (1992). https://doi.org/10.1002/rsa.3240030308
Aumüller, M., Dietzfelbinger, M., Woelfel, P.: Explicit and efficient hash families suffice for cuckoo hashing with a stash. Algorithmica 70(3), 428–456 (2014). https://doi.org/10.1007/s00453-013-9840-x
Baran, I., Demaine, E.D., Pǎtraşcu, M.: Subquadratic algorithms for 3SUM. Algorithmica 50(4), 584–596 (2008). https://doi.org/10.1007/s00453-007-9036-3
Bollig, B., Waack, S., Woelfel, P.: Parity graph-driven read-once branching programs and an exponential lower bound for integer multiplication. In: Baeza-Yates, R., Montanari, U., Santoro, N. (eds.) Foundations of Information Technology in the Era of Network and Mobile Computing. ITIFIP, vol. 96, pp. 83–94. Springer, Boston, MA (2002). https://doi.org/10.1007/978-0-387-35608-2_8
Bollig, B., Woelfel, P.: A read-once branching program lower bound of \(\Omega (2^{n/4})\) for integer multiplication using universal hashing. In: Proceedings of 33rd ACM STOC, pp. 419–424 (2001). https://doi.org/10.1145/380752.380835
Carter, J.L., Wegman, M.N.: Universal classes of hash functions. J. Comp. Syst. Sci. 18, 143–154 (1979). https://doi.org/10.1016/0022-0000(79)90044-8
Chor, B., Goldreich, O.: On the power of two-point based sampling. J. Complex. 5, 96–106 (1989). https://doi.org/10.1016/0885-064X(89)90015-0
Dahlgaard, S., Knudsen, M.B.T., Thorup, M.: Practical hash functions for similarity estimation and dimensionality reduction. In: Proceedings of NIPS 30, pp. 6618–6628 (2017). http://papers.nips.cc/paper/7239-practical-hash-functions-for-similarity-estimation-and-dimensionality-reduction.pdf
Dietzfelbinger, M.: Universal hashing and k-wise independent random variables via integer arithmetic without primes. In: Puech, C., Reischuk, R. (eds.) STACS 1996. LNCS, vol. 1046, pp. 567–580. Springer, Heidelberg (1996). https://doi.org/10.1007/3-540-60922-9_46
Dietzfelbinger, M., Gil, J., Matias, Y., Pippenger, N.: Polynomial hash functions are reliable. In: Kuich, W. (ed.) ICALP 1992. LNCS, vol. 623, pp. 235–246. Springer, Heidelberg (1992). https://doi.org/10.1007/3-540-55719-9_77
Dietzfelbinger, M., Hagerup, T., Katajainen, J., Penttonen, M.: A reliable randomized algorithm for the closest-pair problem. J. Algorithms 25(1), 19–51 (1997). https://doi.org/10.1006/jagm.1997.0873
Dietzfelbinger, M., Karlin, A., Mehlhorn, K., Meyer auf der Heide, F., Rohnert, H., Tarjan, R.E.: Dynamic perfect hashing: upper and lower bounds. SIAM J. Comput. 23(4), 738–761 (1994). https://doi.org/10.1137/S0097539791194094
Dietzfelbinger, M., Meyer auf der Heide, F.: Dynamic hashing in real time. In: Buchmann, J., Ganzinger, H., Paul, W.J. (eds.) Informatik: Festschrift zum 60. Geburtstag von Günter Hotz, Teubner-Texte zur Informatik, vol. 1, pp. 95–119. B. G. Teubner, Stuttgart-Leipzig (1992). https://doi.org/10.1007/978-3-322-95233-2_7
Dietzfelbinger, M., Schellbach, U.: On risks of using cuckoo hashing with simple universal hash classes. In: Proceedings of 20th ACM-SIAM SODA, pp. 795–804 (2009). https://doi.org/10.1137/1.9781611973068.87
Dietzfelbinger, M., Schellbach, U.: Weaknesses of cuckoo hashing with a simple universal hash class: the case of large universes. In: Nielsen, M., Kučera, A., Miltersen, P.B., Palamidessi, C., Tůma, P., Valencia, F. (eds.) SOFSEM 2009. LNCS, vol. 5404, pp. 217–228. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-540-95891-8_22
Dietzfelbinger, M., Woelfel, P.: Almost random graphs with simple hash functions. In: Proceedings of 35th ACM STOC, pp. 629–638 (2003). https://doi.org/10.1145/780542.780634
Fredman, M.L., Komlós, J., Szemerédi, E.: Storing a sparse table with \(O(1)\) worst case access time. J. Assoc. Comput. Mach. 31(3), 538–544 (1984). https://doi.org/10.1145/828.1884
Hagerup, T., Miltersen, P.B., Pagh, R.: Deterministic dictionaries. J. Algorithms 41(1), 69–85 (2001). https://doi.org/10.1006/jagm.2001.1171
Hardy, G.H., Wright, E.M.: An Introduction to the Theory of Numbers. Oxford Science Publications, Clarendon Press, Oxford (1994)
Knudsen, M.B.T.: Linear hashing is awesome. CoRR arXiv:1706.02783 (2017). Technical report version of Knudsen, M.B.T.: Linear hashing is awesome. In: Proceedings of 57th IEEE FOCS, pp. 345–352 (2016). https://doi.org/10.1109/FOCS.2016.45
Kopelowitz, T., Pettie, S., Porat, E.: Higher lower bounds from the 3SUM conjecture. In: Proceedings of 27th ACM-SIAM SODA, pp. 1272–1287 (2016). https://doi.org/10.1137/1.9781611974331.ch89
Luby, M.: A simple parallel algorithm for the maximal independent set problem. SIAM J. Comput. 15, 1036–1053 (1986). https://doi.org/10.1137/0215074
Luby, M., Wigderson, A.: Pairwise independence and derandomization. In: Foundations and Trends in Theoretical Computer Science, vol. 1, no. 4 (2005). https://doi.org/10.1561/0400000009
Mansour, Y., Nisan, N., Tiwari, P.: The computational complexity of universal hashing. Theor. Comput. Sci. 107, 121–133 (1993). https://doi.org/10.1016/0304-3975(93)90257-T
Mehlhorn, K., Vishkin, U.: Randomized and deterministic simulations of PRAMs by parallel machines with restricted granularity of parallel memories. Acta Inform. 21, 339–374 (1984). https://doi.org/10.1007/BF00264615
Miltersen, P.B.: Error correcting codes, perfect hashing circuits, and deterministic dynamic dictionaries. In: Proceedings of 9th ACM-SIAM SODA, pp. 556–563 (1998). https://dl.acm.org/citation.cfm?id=314613.314845
Nisan, N.: Pseudorandom generators for space-bounded computations. Combinatorica 12(4), 449–461 (1992). https://doi.org/10.1007/BF01305237
Pagh, R.: Hash and displace: efficient evaluation of minimal perfect hash functions. In: Dehne, F., Sack, J.-R., Gupta, A., Tamassia, R. (eds.) WADS 1999. LNCS, vol. 1663, pp. 49–54. Springer, Heidelberg (1999). https://doi.org/10.1007/3-540-48447-7_5
Pǎtraşcu, M.: Towards polynomial lower bounds for dynamic problems. In: Proceedings of 42nd ACM STOC, pp. 603–610 (2010). https://doi.org/10.1145/1806689.1806772
Pǎtraşcu, M., Thorup, M.: On the \(k\)-independence required by linear probing and minwise independence. ACM Trans. Algorithms 12(1), 8:1–8:27 (2016). https://doi.org/10.1145/2716317
Pagh, R., Rodler, F.F.: Cuckoo hashing. J. Algorithms 51(2), 122–144 (2004). https://doi.org/10.1016/j.jalgor.2003.12.002
Ružić, M.: Making deterministic signatures quickly. ACM Trans. Algorithms 5(3), 26:1–26:26 (2009). https://doi.org/10.1145/1541885.1541887
Sauerhoff, M., Woelfel, P.: Time-space tradeoff lower bounds for integer multiplication and graphs of arithmetic functions. In: Proceedings of 35th ACM STOC, pp. 186–195 (2003). https://doi.org/10.1145/780542.780571
Siegel, A.: On universal classes of extremely random constant-time hash functions. SIAM J. Comput. 33(3), 505–543 (2004). https://doi.org/10.1137/S0097539701386216
Thorup, M.: Even strongly universal hashing is pretty fast. In: Proceedings of 11th ACM-SIAM SODA, pp. 496–497 (2000). https://dl.acm.org/citation.cfm?id=338219.338597
Thorup, M., Zhang, Y.: Tabulation-based 5-independent hashing with applications to linear probing and second moment estimation. SIAM J. Comput. 41(2), 293–331 (2012). https://doi.org/10.1137/100800774
Williams, V.V., Williams, R.: Finding, minimizing, and counting weighted subgraphs. SIAM J. Comput. 42(3), 831–854 (2013). https://doi.org/10.1137/09076619X
Wang, J.R.: Space-efficient randomized algorithms for K-SUM. In: Schulz, A.S., Wagner, D. (eds.) ESA 2014. LNCS, vol. 8737, pp. 810–829. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-662-44777-2_67
Wegman, M.N., Carter, J.L.: New classes and applications of hash functions. In: Proceedings of 20th IEEE FOCS, pp. 175–182 (1979). https://doi.org/10.1109/SFCS.1979.26
Woelfel, P.: Efficient strongly universal and optimally universal hashing. In: Kutyłowski, M., Pacholski, L., Wierzbicki, T. (eds.) MFCS 1999. LNCS, vol. 1672, pp. 262–272. Springer, Heidelberg (1999). https://doi.org/10.1007/3-540-48340-3_24
Woelfel, P.: Klassen universeller Hashfunktionen mit ganzzahliger Arithmetik. Diploma thesis, University of Dortmund (2000). (In German)
Woelfel, P.: New bounds on the OBDD-Size of integer multiplication via universal hashing. In: Ferreira, A., Reichel, H. (eds.) STACS 2001. LNCS, vol. 2010, pp. 563–574. Springer, Heidelberg (2001). https://doi.org/10.1007/3-540-44693-1_49
Acknowledgment
The author is grateful to Carl Pomerance for helping with the proof of Lemma 4 and to a reviewer for a careful reading of the manuscript and helpful suggestions.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2018 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Dietzfelbinger, M. (2018). Universal Hashing via Integer Arithmetic Without Primes, Revisited. In: Böckenhauer, HJ., Komm, D., Unger, W. (eds) Adventures Between Lower Bounds and Higher Altitudes. Lecture Notes in Computer Science(), vol 11011. Springer, Cham. https://doi.org/10.1007/978-3-319-98355-4_15
Download citation
DOI: https://doi.org/10.1007/978-3-319-98355-4_15
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-98354-7
Online ISBN: 978-3-319-98355-4
eBook Packages: Computer ScienceComputer Science (R0)