$\def\NN{\mathbb{N}}$$\def\RR{\mathbb{R}}$$\def\eps{\epsilon}$$\def\calB{\mathcal{B}}$$\def\calS{\mathcal{S}}$ $\def\calA{\mathcal{A}}$$\newcommand{\inner}[2]{\langle#1, #2\rangle}$$\newcommand{\abs}[1]{\left\vert#1\right\vert}$$\newcommand{\norm}[1]{\left\Vert#1\right\Vert}$$\newcommand{\paren}[1]{\left(#1\right)}$$\newcommand{\sqbracket}[1]{\left[#1\right]}$$\def\var{\text{Var}}$$\def\cov{\text{Cov}}$$\newcommand{\pd}[2]{\frac{\partial #1}{\partial #2}}$$\newcommand{\doublepd}[3]{\frac{\partial^2 #1}{\partial #2 \partial #3}}$
Definition. A measure space $(\Omega, \mathcal{F}, P)$ is a probability space if $P(\Omega) = 1$. An $\mathcal{F}$-measurable function $X : \Omega \to \mathbb{R}$ is called a random variable. An $\mathcal{F}$-measurable function $X = (X_1, \ldots, X_d) : \Omega \to \mathbb{R}^d$ is called a ($d$-dimensional) random vector. (Check that $X = (X_1, \ldots, X_d)$ is a random vector if and only if each $X_i$ is a random variable.)
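Example. For $A \subseteq \Omega$, the indicator function $1_A$ is a random variable if and only if $A \in \mathcal{F}$: the set $\{1_A \le x\}$ equals $\emptyset$, $A^c$, or $\Omega$ according as $x < 0$, $0 \le x < 1$, or $x \ge 1$.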
Definition. Let $X$ be a random variable. The induced probability measure on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$ is called the probability distribution of $X$, denoted $P_X$; that is,
$$P_X(A) = P(X \in A)$$
for each $A \in \mathcal{B}(\mathbb{R})$. The function
$$ F_X : \mathbb{R} \to [0, 1] : x \mapsto P(X \le x) $$
is called the cumulative distribution function (cdf) of $X$.
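Example. If $P(X = 1) = p$ and $P(X = 0) = 1 - p$ for some $p \in (0, 1)$, then $F_X(x) = 0$ for $x < 0$, $F_X(x) = 1 - p$ for $0 \le x < 1$, and $F_X(x) = 1$ for $x \ge 1$. In particular, a cdf need not be continuous, although it is always nondecreasing and right-continuous.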
Definition. Let $X = (X_1, \ldots, X_d)$ be a random vector. The induced probability measure on $(\mathbb{R}^d, \mathcal{B}(\mathbb{R}^d))$ is called the (joint) probability distribution of $X$, denoted $P_X$. The function
$$ F_X : \mathbb{R}^d \to [0, 1] : (x_1, \ldots, x_d) \mapsto P(X_1 \le x_1, \ldots, X_d \le x_d) $$
is called the joint cumulative distribution function (joint cdf) of $X$. For each $i = 1, \ldots, d$, the cdf $F_{X_i}$ and the probability distribution $P_{X_i}$ of $X_i$ are called the marginal cdf and the marginal probability distribution of $X_i$, respectively.
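Example. The marginal cdfs are recovered from the joint cdf by letting the remaining coordinates tend to infinity: by continuity from below of $P$,
$$F_{X_1}(x_1) = \lim_{x_2, \ldots, x_d \to \infty} F_X(x_1, x_2, \ldots, x_d),$$
and similarly for the other coordinates.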
Definition. Let $X$ be a random variable. The integral
$$\int_{\Omega} X dP$$
is called the expected value of $X$, denoted $EX$ or $E(X)$, provided that the integral is well-defined.
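Example. If $X = 1_A$ for some $A \in \mathcal{F}$, then $EX = P(A)$. More generally, for a simple random variable $X = \sum_{i=1}^n c_i 1_{A_i}$ with $c_i \in \RR$ and $A_i \in \mathcal{F}$, $EX = \sum_{i=1}^n c_i P(A_i)$.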
Proposition. (Change of variable.) Let $X$ be a random variable and $h : \mathbb{R} \to \mathbb{R}$ be a Borel function. Let $Y = h(X)$. (Check that $Y$ is also a random variable.) Then
(a) $\displaystyle \int_\Omega \abs{Y} dP = \int_{\RR} \abs{h(x)} P_X (dx) = \int_{\RR} \abs{y} P_Y (dy)$.
(b) If $\displaystyle \int_\Omega \abs{Y} dP < \infty$, then
$$EY = \int_\RR h(x) P_X(dx) = \int_{\RR} y P_Y(dy)$$.
Proof. The identity in (a) holds when $h$ is an indicator function, directly from the definitions of $P_X$ and $P_Y$. By linearity it extends to nonnegative simple $h$, and by the MCT to any nonnegative Borel $h$; this proves (a). For (b), write $h = h^+ - h^-$ and apply the nonnegative case to each part. $\square$
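Example. Taking $h(x) = x^2$ gives $E(X^2) = \int_\RR x^2 P_X(dx)$. For instance, if $P_X$ is the uniform distribution on $[0, 1]$ (i.e., $P_X$ has Lebesgue density $1_{[0,1]}$), then $E(X^2) = \int_0^1 x^2 \, dx = 1/3$, computed without reference to the underlying space $\Omega$.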
Definition. For any $n \in \NN$, the $n$-th moment of a random variable $X$ is defined as $\mu_n := EX^n$, provided the expectation is well-defined.
Definition. The variance of a random variable $X$ with $EX^2 < \infty$ is defined as $\var(X) := EX^2 - (EX)^2$.
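Example. Expanding the square and using linearity, $\var(X) = E(X - EX)^2$. If $P(X = 1) = p = 1 - P(X = 0)$, then $EX = EX^2 = p$, so $\var(X) = p - p^2 = p(1 - p)$.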
Definition. The moment generating function (mgf) of a random variable $X$ is defined by
$$M_X : \RR \to [0, \infty] : t \mapsto E(e^{tX}).$$
Proposition. Let $X$ be a nonnegative random variable. Then for all $t \ge 0$,
$$M_X(t) = \sum_{n=0}^\infty \frac{t^n \mu_n}{n!} .$$
Proof. Expand $e^{tX} = \sum_{n=0}^\infty \frac{(tX)^n}{n!}$, whose terms are nonnegative since $t, X \ge 0$, and interchange the sum and the expectation by the MCT. $\square$
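Example. If $X$ has density $\lambda e^{-\lambda x} 1_{(0,\infty)}(x)$ for some $\lambda > 0$, then $\mu_n = n!/\lambda^n$, and for $0 \le t < \lambda$,
$$M_X(t) = \frac{\lambda}{\lambda - t} = \sum_{n=0}^\infty \paren{\frac{t}{\lambda}}^n = \sum_{n=0}^\infty \frac{t^n \mu_n}{n!},$$
while $M_X(t) = \infty$ for $t \ge \lambda$.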
Proposition. Let $X$ be a random variable. Suppose $M_X$ is finite on some neighborhood $(-\eps, \eps)$ of zero. Then
(a) $E\abs{X}^n < \infty$ for all $n \ge 1$.
(b) $\displaystyle M_X(t) = \sum_{n=0}^\infty t^n \frac{\mu_n}{n!}$ for $\abs{t} < \eps$.
(c) $M_X$ is of class $C^\infty$ on $(-\eps, \eps)$. For each $r \in \NN$, the $r$-th derivative of $M_X$ is given by
$$ M_X^{(r)}(t) = \sum_{n=0}^\infty \frac{t^n}{n!} \mu_{n+r} = E(e^{tX} X^r) \quad (\abs{t} < \eps).$$
In particular, $M_X^{(r)}(0)= \mu_r$.
Proof. For each $n \ge 1$ and $\abs{t} < \eps$,
$$\frac{\abs{t}^n}{n!} E\abs{X}^n \le E(e^{\abs{tX}}) \le E(e^{tX}) + E(e^{-tX}) = M_X(t) + M_X(-t) < \infty,$$
so (a) follows. Next, note that
$$ \abs{\sum_{k=0}^n \frac{t^k X^k}{k!}} \le e^{\abs{tX}}$$
for each $n$, so (b) follows from the DCT. (c) is a basic property of power series. $\square$
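Example. If $X$ is standard normal, then $M_X(t) = e^{t^2/2} < \infty$ for every $t \in \RR$. Expanding $e^{t^2/2} = \sum_{k=0}^\infty \frac{t^{2k}}{2^k k!}$ and comparing coefficients with (b) gives $\mu_{2k} = \frac{(2k)!}{2^k k!}$ and $\mu_{2k+1} = 0$ for all $k \ge 0$.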
Proposition. (Markov's inequality.) Let $X$ be a random variable and $\phi : \RR_{\ge 0} \to \RR_{\ge 0}$ be a nondecreasing function. Then for any $t > 0$ with $\phi(t) > 0$,
$$P(\abs{X} \ge t) \le \frac{E(\phi(\abs{X})) }{\phi(t)}. $$
Proof. Since $\abs{X} \ge t$ implies $\phi(\abs{X}) \ge \phi(t)$, we have $P(\abs{X} \ge t) \le P(\phi(\abs{X}) \ge \phi(t))$, so it suffices to prove the case where $\phi$ is the identity map. Now
$$E(\abs{X}) = \int_\Omega \abs{X} dP \ge \int_{\abs{X} \ge t} \abs{X} dP \ge t P(\abs{X} \ge t)$$
so the proof is complete. $\square$
Corollary. Let $X$ be a random variable. For any $r, t > 0$,
$$P(X \ge t) \le P(\abs{X} \ge t) \le \frac{E\abs{X}^r}{t^r}.$$
Proposition. (Chebyshev's inequality.) Let $X$ be a random variable with $EX^2 < \infty$, $EX = \mu$, and $\var(X) = \sigma^2 > 0$. For any $k > 0$,
$$P(\abs{X - \mu} \ge k\sigma) \le \frac{1}{k^2}.$$
Proof. By the corollary above,
$$P(\abs{X - \mu} \ge k\sigma) \le \frac{E\abs{X - \mu}^2}{k^2\sigma^2} = \frac{1}{k^2}.$$
$\square$
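Example. Taking $k = 2$ gives $P(\abs{X - \mu} \ge 2\sigma) \le 1/4$: at least three quarters of the probability mass lies within two standard deviations of the mean.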
Proposition. (Jensen's inequality.) Let $X$ be a random variable with $P(a < X < b) = 1$ for some $a, b \in [-\infty, \infty]$. Let $\phi : (a, b) \to \RR$ be a convex function. Then
$$E\phi(X) \ge \phi(EX),$$
provided that $E\abs{X}, E\abs{\phi(X)} < \infty$.
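Example. Taking $\phi(x) = x^2$ recovers $EX^2 \ge (EX)^2$, i.e., $\var(X) \ge 0$. Taking $\phi(x) = 1/x$ on $(0, \infty)$ gives $E(1/X) \ge 1/EX$ for a positive random variable $X$ (provided both expectations are finite).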
Proposition. (Hölder's inequality.) Let $X, Y$ be random variables. Then for $p, q \in [1, \infty]$ satisfying $1/p + 1/q = 1$,
$$E\abs{XY} \le [E\abs{X}^p]^{1/p}[E\abs{Y}^q]^{1/q}.$$
For $p, q \in (1, \infty)$, the equality holds if and only if $c_1 \abs{X}^p = c_2 \abs{Y}^q$ a.e. for some $c_1, c_2 \ge 0$, not both zero.
Corollary. For any random variable $X$, the map $[1, \infty) \to [0, \infty] : p \mapsto [E\abs{X}^p]^{1/p}$ is nondecreasing. (For $1 \le r \le s$, apply Hölder's inequality to $\abs{X}^r \cdot 1$ with exponent $p = s/r$ and its conjugate to get $E\abs{X}^r \le [E\abs{X}^s]^{r/s}$.)
Definition. Let $X, Y$ be random variables with $E\abs{X}^2, E\abs{Y}^2 < \infty$. The covariance of $X$ and $Y$ is defined as $\cov(X, Y) := E(XY) - (EX)(EY)$.
Proposition. (Cauchy-Schwarz inequality.) Let $X, Y$ be random variables with $E\abs{X}^2, E\abs{Y}^2 < \infty$. Then
$$\abs{\cov(X, Y)} \le \sqrt{\var(X)} \sqrt{\var(Y)}.$$
If $\var(X) > 0$, then the equality holds if and only if $Y = aX + b$ a.e. for some $a, b \in \RR$.
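Example. If $\var(X), \var(Y) > 0$, the correlation coefficient $\rho(X, Y) := \cov(X, Y) / (\sqrt{\var(X)}\sqrt{\var(Y)})$ satisfies $\abs{\rho(X, Y)} \le 1$, with equality if and only if $Y$ is a.e. an affine function of $X$.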
Proposition. (Minkowski's inequality.) Let $X, Y$ be random variables. Then for $p \in [1, \infty)$,
$$[E\abs{X+Y}^p]^{1/p} \le [E\abs{X}^p]^{1/p} + [E\abs{Y}^p]^{1/p}.$$
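In other words, $X \mapsto [E\abs{X}^p]^{1/p}$ satisfies the triangle inequality, so it defines a seminorm on the space of random variables with $E\abs{X}^p < \infty$.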
Definition. Let $X = (X_1, \ldots, X_d)$ be a random vector. The product moment of order $r = (r_1, \ldots, r_d)$, where each $r_i$ is a nonnegative integer, is defined as
$$\mu_r = \mu_{r_1, \ldots, r_d} := E(X_1^{r_1} \cdots X_d^{r_d})$$
provided that the expectation is well-defined. The joint moment generating function (joint mgf) of $X$ is defined as
$$M_X : \RR^d \to [0, \infty] : (t_1, \ldots, t_d) \mapsto E(e^{t_1X_1 + \ldots + t_dX_d}).$$
Proposition. Let $X = (X_1, \ldots, X_d)$ be a random vector. Suppose $M_X$ is finite on some neighborhood $(-\eps, \eps)^d$ of zero. Then
(a) $E\abs{X_i}^n < \infty$ for all $i = 1, \ldots, d, n \ge 1$.
(b) For $t = (t_1, \ldots, t_d) \in \RR^d$ and $r = (r_1, \ldots, r_d) \in \NN^d$, define
$$t^r := \prod_{i=1}^d t_i^{r_i}, \quad r! = \prod_{i=1}^d r_i!.$$
Then for all $t \in (-\eps, \eps)^d$,
$$ M_X(t) = \sum_{r \in \NN^d} t^r \frac{\mu_r}{r!}.$$
(c) $M_X$ is of class $C^\infty$ on $(-\eps, \eps)^d$, and for each such $r$,
$$\frac{\partial^r M_X}{\partial t^r} (0) := \frac{\partial^{\abs{r}} M_X}{\partial t_1^{r_1} \cdots \partial t_d^{r_d}} (0) = \mu_r, \quad \text{where } \abs{r} := r_1 + \cdots + r_d.$$
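Example. For $d = 2$ and $r = (1, 1)$, part (c) gives $\mu_{1,1} = E(X_1 X_2) = \doublepd{M_X}{t_1}{t_2}(0)$, so
$$\cov(X_1, X_2) = \doublepd{M_X}{t_1}{t_2}(0) - \pd{M_X}{t_1}(0)\,\pd{M_X}{t_2}(0).$$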