Derivation of the Chernoff Bound
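Let $X$ be a random variable (r.v.). If there exists $h>0$ such that $\mathbb{E}\left[ e^{\lambda X} \right] $ exists (is finite) for every $\lambda \in \left[ 0,h \right] $, we say that $X$ has a moment generating function (MGF), denoted $M_X\left( \lambda \right) $.
(Light tail / heavy tail): a random variable $X$ is called heavy-tailed if $\mathbb{E}\left[ e^{\lambda X} \right] =\infty$ for all $\lambda >0$, and light-tailed otherwise.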
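Markov's inequality yields the following bounds.
For any fixed $t>0$, the map $x\mapsto e^{tx}$ is increasing, so:
$$
Pr\left( X\ge a \right) =Pr\left( e^{tX}\ge e^{ta} \right) \le \frac{\mathbb{E}\left( e^{tX} \right)}{e^{ta}}
$$
Since this holds for every $t>0$, the tightest bound is obtained by minimizing the right-hand side over $t$:
$$
Pr\left( X\ge a \right) \le \min_{t>0}\ \frac{\mathbb{E}\left( e^{tX} \right)}{e^{ta}}
$$
For any fixed $t<0$, the map $x\mapsto e^{tx}$ is decreasing, so the event $X\le a$ coincides with the event $e^{tX}\ge e^{ta}$:
$$
Pr\left( X\le a \right) =Pr\left( e^{tX}\ge e^{ta} \right) \le \frac{\mathbb{E}\left( e^{tX} \right)}{e^{ta}}
$$
and therefore:
$$
Pr\left( X\le a \right) \le \min_{t<0}\ \frac{\mathbb{E}\left( e^{tX} \right)}{e^{ta}}
$$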
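The recipe above (bound the tail by $\mathbb{E}(e^{tX})/e^{ta}$ and then minimize over $t>0$) can be tried numerically. The following is a minimal sketch, not a general-purpose tool: the helper name `chernoff_upper` is hypothetical, the minimization is a plain grid search, and the test distribution $X\sim \mathrm{Exponential}(1)$ is an illustrative assumption that does not appear in the text. It is chosen only because its MGF $1/(1-t)$ (for $t<1$) and its exact tail $e^{-a}$ are both known in closed form.

```python
import math

def chernoff_upper(mgf, a, t_grid):
    """Smallest value of mgf(t) / exp(t * a) over the supplied positive t values."""
    return min(mgf(t) / math.exp(t * a) for t in t_grid)

# Illustrative assumption (not from the text): X ~ Exponential(1).
# Its MGF is 1 / (1 - t) for t < 1, and Pr(X >= a) = exp(-a).
def exp_mgf(t):
    return 1.0 / (1.0 - t)

a = 5.0
t_grid = [k / 1000 for k in range(1, 1000)]  # t ranges over (0, 1)
bound = chernoff_upper(exp_mgf, a, t_grid)
exact = math.exp(-a)

print(f"Chernoff bound: {bound:.6f}")
print(f"exact tail:     {exact:.6f}")
```

For this particular distribution the minimizer is $t=1-1/a$, so the grid result should agree with $a\,e^{1-a}$; the bound is valid but noticeably looser than the exact tail.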
Chernoff bound for the one-dimensional normal distribution $X\sim \mathcal{N}\left( \mu ,\sigma ^2 \right) $:
First compute the moment generating function:
$$
M_X\left( \lambda \right) =\mathbb{E}\left[ e^{\lambda X} \right] =\int_{-\infty}^{\infty}{e^{\lambda x}\cdot \frac{1}{\sqrt{2\pi}\cdot \sigma}\cdot e^{-\frac{\left( x-\mu \right) ^2}{2\sigma ^2}}}dx
$$
$$
=\int_{-\infty}^{\infty}{\frac{1}{\sqrt{2\pi}\cdot \sigma}\cdot exp\left( \lambda x-\frac{\left( x-\mu \right) ^2}{2\sigma ^2} \right)}dx
$$
$$
=\int_{-\infty}^{\infty}{\frac{1}{\sqrt{2\pi}\cdot \sigma}\cdot exp\left( \frac{2\lambda x\sigma ^2-x^2+2x\mu -\mu ^2}{2\sigma ^2} \right)}dx
$$
$$
=\int_{-\infty}^{\infty}{\frac{1}{\sqrt{2\pi}\cdot \sigma}\cdot exp\left( -\frac{x^2-2x\left( \mu +\lambda \sigma ^2 \right) +\mu ^2}{2\sigma ^2} \right)}dx
$$
$$
=\int_{-\infty}^{\infty}{\frac{1}{\sqrt{2\pi}\cdot \sigma}\cdot exp\left( -\frac{x^2-2x\left( \mu +\lambda \sigma ^2 \right) +\left( \mu +\lambda \sigma ^2 \right) ^2+\mu ^2-\left( \mu +\lambda \sigma ^2 \right) ^2}{2\sigma ^2} \right)}dx
$$
$$
=\int_{-\infty}^{\infty}{\frac{1}{\sqrt{2\pi}\cdot \sigma}\cdot exp\left( -\frac{\left[ x-\left( \mu +\lambda \sigma ^2 \right) \right] ^2+\mu ^2-\mu ^2-2\mu \lambda \sigma ^2-\left( \lambda \sigma ^2 \right) ^2}{2\sigma ^2} \right)}dx
$$
$$
=\int_{-\infty}^{\infty}{\frac{1}{\sqrt{2\pi}\cdot \sigma}\cdot exp\left( -\frac{\left[ x-\left( \mu +\lambda \sigma ^2 \right) \right] ^2}{2\sigma ^2}+\mu \lambda +\frac{\left( \lambda \sigma ^2 \right) ^2}{2\sigma ^2} \right)}dx
$$
$$
=exp\left( \mu \lambda +\frac{\left( \lambda \sigma ^2 \right) ^2}{2\sigma ^2} \right) \int_{-\infty}^{\infty}{\frac{1}{\sqrt{2\pi}\cdot \sigma}\cdot exp\left( -\frac{\left[ x-\left( \mu +\lambda \sigma ^2 \right) \right] ^2}{2\sigma ^2} \right)}dx
$$
$$
=exp\left( \mu \lambda +\frac{\lambda ^2\sigma ^2}{2} \right)
$$
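As a sanity check on the completing-the-square computation, the closed form $\exp\left( \mu \lambda +\lambda ^2\sigma ^2/2 \right)$ can be compared with a direct numerical integration of the defining integral. This is a rough sketch using a midpoint rule; the function names, the integration window of 12 standard deviations, and the parameter values are all illustrative assumptions.

```python
import math

def normal_mgf_numeric(lam, mu, sigma, half_width=12.0, steps=20_000):
    """Midpoint-rule approximation of E[exp(lam * X)] for X ~ N(mu, sigma^2)."""
    # The integrand peaks at mu + lam * sigma^2, so center the window there.
    center = mu + lam * sigma ** 2
    lo = center - half_width * sigma
    dx = 2 * half_width * sigma / steps
    norm = 1.0 / (math.sqrt(2 * math.pi) * sigma)
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * dx
        # Combine both exponents before calling exp to avoid overflow.
        total += norm * math.exp(lam * x - (x - mu) ** 2 / (2 * sigma ** 2))
    return total * dx

def normal_mgf_closed(lam, mu, sigma):
    """Closed form exp(mu * lam + lam^2 * sigma^2 / 2)."""
    return math.exp(mu * lam + lam ** 2 * sigma ** 2 / 2)

mu, sigma = 1.0, 2.0  # illustrative parameters
results = []
for lam in (-0.5, 0.3, 0.7):
    numeric = normal_mgf_numeric(lam, mu, sigma)
    closed = normal_mgf_closed(lam, mu, sigma)
    results.append((lam, numeric, closed))
    print(f"lambda={lam:+.1f}  numeric={numeric:.8f}  closed={closed:.8f}")
```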
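Clearly $M_X\left( \lambda \right) $ is defined for every $\lambda \in \mathbb{R}$. Since $\mathbb{E}\left( e^{\lambda \left( X-\mu \right)} \right) =e^{-\lambda \mu}M_X\left( \lambda \right) =e^{\frac{\lambda ^2\sigma ^2}{2}}$, applying the Markov-based bound to $X-\mu$ gives, for $a>0$:
$$
\min_{\lambda >0}\,\,\frac{\mathbb{E}\left( e^{\lambda \left( X-\mu \right)} \right)}{e^{\lambda a}}=\min_{\lambda >0}\,\,\frac{e^{\frac{\lambda ^2\sigma ^2}{2}}}{e^{\lambda a}}=\min_{\lambda >0}\,\,e^{\frac{\lambda ^2\sigma ^2}{2}-\lambda a}
$$
Because the exponential function is increasing, it suffices to minimize the exponent:
$$
\underset{\lambda >0}{\arg\min}\ e^{\frac{\lambda ^2\sigma ^2}{2}-\lambda a}\ =\underset{\lambda >0}{\arg\min}\ \left( \frac{\lambda ^2\sigma ^2}{2}-\lambda a \right)
$$
Setting the derivative $\lambda \sigma ^2-a$ to zero gives the minimizer $\lambda ^*=\frac{a}{\sigma ^2}$, which is positive because $a>0$. Substituting it yields: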
$$
Pr\left( \left( X-\mu \right) \ge a \right) \le e^{-\frac{a^2}{2\sigma ^2}},\forall a>0
$$
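To see how tight the Gaussian bound is, it can be compared with the exact tail $Pr\left( X-\mu \ge a \right) =\frac{1}{2}\operatorname{erfc}\left( \frac{a}{\sigma \sqrt{2}} \right)$. The sketch below also confirms by grid search that $\lambda ^*=a/\sigma ^2$ minimizes the exponent. The function names and the values of $\sigma$ and $a$ are illustrative assumptions.

```python
import math

def gaussian_tail_exact(a, sigma):
    """Exact Pr(X - mu >= a) for X ~ N(mu, sigma^2)."""
    return 0.5 * math.erfc(a / (sigma * math.sqrt(2)))

def gaussian_chernoff(a, sigma):
    """Chernoff bound exp(-a^2 / (2 * sigma^2)), valid for a > 0."""
    return math.exp(-a ** 2 / (2 * sigma ** 2))

def exponent(lam, a, sigma):
    """The exponent lam^2 * sigma^2 / 2 - lam * a that lambda* minimizes."""
    return lam ** 2 * sigma ** 2 / 2 - lam * a

sigma = 2.0  # illustrative value
rows = []
for a in (0.5, 1.0, 2.0, 4.0, 8.0):
    rows.append((a, gaussian_tail_exact(a, sigma), gaussian_chernoff(a, sigma)))
    print(f"a={a:4.1f}  exact={rows[-1][1]:.3e}  bound={rows[-1][2]:.3e}")

# Grid search for the minimizing lambda at a = 4; theory predicts a / sigma^2 = 1.
a = 4.0
grid = [k / 1000 for k in range(1, 5001)]
best_lam = min(grid, key=lambda lam: exponent(lam, a, sigma))
print(f"grid minimizer: {best_lam}, predicted: {a / sigma ** 2}")
```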
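In Poisson trials the 0-1 random variables need not be identically distributed; Bernoulli trials are the special case in which all the success probabilities are equal. Let $X_1,X_2,\dots ,X_n$ be a sequence of independent Poisson trials with $Pr\left( X_i=1 \right) =p_i$, and let $X=\sum_{i=1}^n{X_i}$. Then: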
$$
\mu =\mathbb{E}\left[ X \right] =\mathbb{E}\left[ \sum_{i=1}^n{X_i} \right] =\sum_{i=1}^n{\mathbb{E}\left[ X_i \right]}=\sum_{i=1}^n{p_i}
$$
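To derive Chernoff bounds for Poisson trials, we first compute the moment generating function of $X$. The product form follows from independence, and the inequality uses $1+x\le e^x$: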
$$
M_X\left( t \right) =\prod_{i=1}^n{M_{X_i}\left( t \right)}=\prod_{i=1}^n{\mathbb{E}\left[ e^{tX_i} \right]}=\prod_{i=1}^n{\left[ p_i\cdot e^t+\left( 1-p_i \right) \right]}
$$
$$
=\prod_{i=1}^n{\left[ 1+p_i\left( e^t-1 \right) \right]}\le \prod_{i=1}^n{e^{p_i\left( e^t-1 \right)}}=e^{\mu \left( e^t-1 \right)}
$$
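The inequality $M_X\left( t \right) \le e^{\mu \left( e^t-1 \right)}$ is easy to check numerically for a concrete set of success probabilities. The sketch below does so for a handful of values of $t$, both positive and negative; the probabilities in `ps` and the function names are illustrative assumptions.

```python
import math

def mgf_exact(ps, t):
    """Exact MGF of a sum of independent Poisson trials: prod(1 + p_i (e^t - 1))."""
    return math.prod(1 + p * (math.exp(t) - 1) for p in ps)

def mgf_upper(ps, t):
    """Upper bound exp(mu * (e^t - 1)) with mu = sum of the p_i."""
    mu = sum(ps)
    return math.exp(mu * (math.exp(t) - 1))

ps = [0.05, 0.2, 0.5, 0.7, 0.9]  # illustrative success probabilities
checks = []
for t in (-2.0, -0.5, 0.0, 0.5, 2.0):
    checks.append((t, mgf_exact(ps, t), mgf_upper(ps, t)))
    print(f"t={t:+.1f}  exact={checks[-1][1]:.6f}  upper={checks[-1][2]:.6f}")
```

At $t=0$ both sides equal 1; away from $t=0$ the gap widens because $1+x\le e^x$ is applied once per factor.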
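For the upper tail we consider, for a given $\delta >0$, the tail probability $Pr\left( X\ge \left( 1+\delta \right) \mu \right) $. We now give three bounds for Poisson trials above the mean. The first is the general bound:
For any $\delta >0$: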
$$
Pr\left( X\ge \left( 1+\delta \right) \mu \right) \le \left( \frac{e^{\delta}}{\left( 1+\delta \right) ^{1+\delta}} \right) ^{\mu}
$$
Proof: By Markov's inequality, for any $t>0$:
$$
Pr\left( tX\ge t\left( 1+\delta \right) \mu \right) =Pr\left( e^{tX}\ge e^{t\left( 1+\delta \right) \mu} \right) \le \frac{\mathbb{E}\left[ e^{tX} \right]}{e^{t\left( 1+\delta \right) \mu}}\le \frac{e^{\left( e^t-1 \right) \mu}}{e^{t\left( 1+\delta \right) \mu}}
$$
Taking $t=\ln \left( 1+\delta \right) >0$, which minimizes the right-hand side, we obtain:
$$
Pr\left( X\ge \left( 1+\delta \right) \mu \right) \le \left( \frac{e^{\delta}}{\left( 1+\delta \right) ^{1+\delta}} \right) ^{\mu}
$$
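For a small number of trials the exact upper tail can be computed and compared with the general bound. The sketch below builds the exact distribution of $X$ by adding one trial at a time (a simple dynamic program); the function names, the probabilities in `ps`, and the values of $\delta$ are illustrative assumptions.

```python
import math

def poisson_trials_pmf(ps):
    """Exact pmf of X = sum of independent 0-1 trials with Pr(X_i = 1) = p_i."""
    pmf = [1.0]  # distribution of the empty sum
    for p in ps:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1 - p)   # trial fails
            nxt[k + 1] += mass * p     # trial succeeds
        pmf = nxt
    return pmf

def chernoff_general(mu, delta):
    """General upper-tail bound (e^delta / (1 + delta)^(1 + delta))^mu."""
    return (math.exp(delta) / (1 + delta) ** (1 + delta)) ** mu

ps = [0.1, 0.2, 0.3, 0.4] * 10  # illustrative: n = 40, mu = 10
mu = sum(ps)
pmf = poisson_trials_pmf(ps)

rows = []
for delta in (0.2, 0.5, 1.0, 2.0):
    # Small tolerance so rounding in mu does not drop the boundary value.
    k_min = math.ceil((1 + delta) * mu - 1e-9)
    exact = sum(pmf[k_min:])
    rows.append((delta, exact, chernoff_general(mu, delta)))
    print(f"delta={delta:.1f}  exact={exact:.3e}  bound={rows[-1][2]:.3e}")
```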
Although this bound is fairly tight, it is often awkward to state and to compute with, so we introduce the following two looser bounds.
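For $0<\delta \le 1$:
$$
Pr\left( X\ge \left( 1+\delta \right) \mu \right) \le e^{-\mu \delta ^2/3}
$$
Proof: It suffices to show that $\left( \frac{e^{\delta}}{\left( 1+\delta \right) ^{1+\delta}} \right) ^{\mu}\le e^{-\mu \delta ^2/3}$. Taking logarithms of both sides and dividing by $\mu$, this is equivalent to:
$$
f\left( \delta \right) =\delta -\left( 1+\delta \right) \ln \left( 1+\delta \right) +\frac{\delta ^2}{3}\le 0
$$
Differentiating:
$$
f'\left( \delta \right) =-\ln \left( 1+\delta \right) +\frac{2}{3}\delta
$$
$$
f''\left( \delta \right) =-\frac{1}{1+\delta}+\frac{2}{3}
$$
Since $f'''\left( \delta \right) =\frac{1}{\left( 1+\delta \right) ^2}>0$, the second derivative $f''$ is increasing and vanishes only at $\delta =\frac{1}{2}$, so $f'$ decreases on $\left[ 0,\frac{1}{2} \right]$ and increases on $\left[ \frac{1}{2},1 \right]$.
Because $f'\left( 0 \right) =0$ and $f'\left( 1 \right) =\frac{2}{3}-\ln 2<0$, we have $f'\left( \delta \right) \le 0$ on $\left[ 0,1 \right]$. Hence $f$ is non-increasing on this interval, and together with $f\left( 0 \right) =0$ this gives $f\left( \delta \right) \le 0$, which completes the proof.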
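The key inequality $f\left( \delta \right) \le 0$ on $\left( 0,1 \right]$ can be spot-checked on a grid, along with the resulting ordering of the general bound and the simplified bound. This is only a numerical illustration of the proof, not a substitute for it; the function names and the choice $\mu =10$ are assumptions.

```python
import math

def f(delta):
    """f(delta) = delta - (1 + delta) ln(1 + delta) + delta^2 / 3."""
    return delta - (1 + delta) * math.log(1 + delta) + delta ** 2 / 3

def general_bound(mu, delta):
    return (math.exp(delta) / (1 + delta) ** (1 + delta)) ** mu

def simple_bound(mu, delta):
    return math.exp(-mu * delta ** 2 / 3)

# f should stay at or below zero everywhere on (0, 1].
grid = [k / 1000 for k in range(1, 1001)]
max_f = max(f(d) for d in grid)
print(f"max of f on the grid: {max_f:.3e}")

mu = 10.0  # illustrative mean
pairs = []
for delta in (0.1, 0.5, 1.0):
    pairs.append((general_bound(mu, delta), simple_bound(mu, delta)))
    print(f"delta={delta:.1f}  general={pairs[-1][0]:.4e}  simple={pairs[-1][1]:.4e}")
```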
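The third bound holds for $\delta \ge 5$. It uses $e^{\delta}\le e^{1+\delta}$, then $1+\delta \ge 6$, and finally $\frac{e}{6}\le \frac{1}{2}$: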
$$
Pr\left( X\ge \left( 1+\delta \right) \mu \right) \le \left( \frac{e^{\delta}}{\left( 1+\delta \right) ^{1+\delta}} \right) ^{\mu}\le \left( \frac{e}{\left( 1+\delta \right)} \right) ^{\mu \left( 1+\delta \right)}\le \left( \frac{e}{6} \right) ^{\mu \left( 1+\delta \right)}\le 2^{-\mu \left( 1+\delta \right)}
$$
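Each link in this chain of inequalities can be checked numerically. Because the quantities become extremely small for large $\delta$, the sketch below compares their natural logarithms instead of the values themselves. The function name `log_chain` and the values of $\mu$ and $\delta$ are illustrative assumptions.

```python
import math

def log_chain(mu, delta):
    """Natural logs of the four right-hand expressions in the chain, in order."""
    one_plus = 1 + delta
    return [
        mu * (delta - one_plus * math.log(one_plus)),  # general bound
        mu * one_plus * (1 - math.log(one_plus)),      # (e / (1 + delta))^(mu (1 + delta))
        mu * one_plus * (1 - math.log(6)),             # (e / 6)^(mu (1 + delta))
        -mu * one_plus * math.log(2),                  # 2^(-mu (1 + delta))
    ]

mu = 3.0  # illustrative mean
chains = {delta: log_chain(mu, delta) for delta in (5.0, 6.0, 10.0, 50.0)}
for delta, logs in chains.items():
    print(f"delta={delta:5.1f}  " + "  ".join(f"{v:10.3f}" for v in logs))
```

At $\delta =5$ the second and third expressions coincide, which is why the condition $\delta \ge 5$ is exactly what the middle step needs.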
The lower tail is handled in the same way.
For $0<\delta <1$:
$$
Pr\left( X<\left( 1-\delta \right) \mu \right) \le \left( \frac{e^{-\delta}}{\left( 1-\delta \right) ^{1-\delta}} \right) ^{\mu}
$$
$$
Pr\left( X<\left( 1-\delta \right) \mu \right) \le e^{-\mu \delta ^2/2}
$$
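Combining the two tails: let $X_1,X_2,\dots ,X_n$ be a sequence of independent Poisson trials with $Pr\left( X_i=1 \right) =p_i$, and let $X=\sum_{i=1}^n{X_i}$ and $\mu =\mathbb{E}\left( X \right) $. Since $e^{-\mu \delta ^2/2}\le e^{-\mu \delta ^2/3}$, adding the upper-tail bound $e^{-\mu \delta ^2/3}$ and the lower-tail bound $e^{-\mu \delta ^2/2}$ gives, for $0<\delta <1$:
$$
Pr\left( \left| X-\mu \right|\ge \delta \mu \right) \le 2e^{-\mu \delta ^2/3}
$$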
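The lower-tail bounds and the two-sided deviation can also be compared against exact probabilities for a small instance. In the sketch below the two-sided probability is compared with the sum of the two simplified one-sided bounds, $e^{-\mu \delta ^2/3}+e^{-\mu \delta ^2/2}$, which is what a union bound over the two tails provides. The function names, the probabilities in `ps`, and the values of $\delta$ are illustrative assumptions.

```python
import math

def poisson_trials_pmf(ps):
    """Exact pmf of X = sum of independent 0-1 trials with Pr(X_i = 1) = p_i."""
    pmf = [1.0]
    for p in ps:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1 - p)
            nxt[k + 1] += mass * p
        pmf = nxt
    return pmf

def lower_general(mu, delta):
    """Lower-tail bound (e^(-delta) / (1 - delta)^(1 - delta))^mu."""
    return (math.exp(-delta) / (1 - delta) ** (1 - delta)) ** mu

ps = [0.1, 0.2, 0.3, 0.4] * 10  # illustrative: n = 40, mu = 10
mu = sum(ps)
pmf = poisson_trials_pmf(ps)
eps = 1e-9  # tolerance for rounding in mu at the boundary values

rows = []
for delta in (0.2, 0.5, 0.8):
    # Strict inequality X < (1 - delta) mu for the lower tail.
    lower_exact = sum(m for k, m in enumerate(pmf) if k < (1 - delta) * mu - eps)
    # Non-strict inequality |X - mu| >= delta mu for the two-sided event.
    two_sided_exact = sum(m for k, m in enumerate(pmf) if abs(k - mu) >= delta * mu - eps)
    lower_simple = math.exp(-mu * delta ** 2 / 2)
    union = math.exp(-mu * delta ** 2 / 3) + lower_simple
    rows.append((lower_exact, lower_general(mu, delta), lower_simple, two_sided_exact, union))
    print(f"delta={delta:.1f}  lower exact={lower_exact:.3e}  general={rows[-1][1]:.3e}  "
          f"simple={lower_simple:.3e}  two-sided exact={two_sided_exact:.3e}  union={union:.3e}")
```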