## 布隆过滤器的错误率推导

$$P(h_i [x] = 1) = \frac{1}{n}$$

$$P(h_i [x] = 0) = 1 - \frac{1}{n}$$

$$Pr(h_i[x] = 0, i \in [1,k]) = (1 - \frac{1}{n})^k$$

$$Pr(B[j] = 0) = (1 - \frac{1}{n})^{k|s|}$$

$$Pr(B[j] = 1) = 1 - (1 - \frac{1}{n})^{k|s|}$$

$$\lim_{n \to \infty} (1 - \frac{1}{n})^n = \frac{1}{e}$$

$$\lim_{n \to \infty} (1 - \frac{1}{n})^{k|s|} = (1 - \frac{1}{n})^{\frac{nk|s|}{n}} = \frac{1}{e}^{\frac{k|s|}{n}}$$

$$Pr(B[j] = 1) \approx 1 - e^{-\frac{k|s|}{n}}$$

$$P_{false} = (Pr(B[j] = 1))^k \approx (1 -e^{-\frac{k|s|}{n}})^k$$

## 寻找函数个数的最优值

$$P_{false} = (Pr(B[j] = 1))^k \approx (1 -e^{-\frac{k|S|}{n}})^k$$

$$P_{false} = (1-x^k) ^k$$

$$f(x) = (1-x^k) ^k$$

$$g’(x) = [ln((1-x^k)^k)]’$$

$$g’(x) = ln(1-x^k) + (1-x^k)’ \cdot \frac{1}{1-x^k} \cdot k = ln(1-x^k) - \frac{x^k \cdot ln(x^k)}{1-x^k}$$

$$g’(x) = \frac{f’(x)}{f(x)}$$

$$ln(1-x^k) - \frac{x^k \cdot ln(x^k)}{1-x^k} = 0$$

$$(1-x^k)ln(1-x^k) = x^kln(x^k)$$

$$x^k = \frac{1}{2}$$

$$e^{-\frac{k|S|}{n}} = \frac{1}{2}$$

$$k_{opt} = \frac{nln(2)}{|S|}$$

$$P_{false} \approx (1-\frac{1}{2})^{\frac{n}{|S|}ln(2)} = 0.6185 ^ {\frac{n}{|S|}}$$