Saturday, September 2, 2017

Ch1.2 Binomial Distribution
What is a Bernoulli trial? Here is Rouder's way to define it:
A random variable is distributed as a Bernoulli if it has two possible values: 0 (for failure) and 1 (for success).
Conventionally, one of the outcomes is called success, and the other, failure. Below is Rouder's formula.
\[ f(x; p) = \left\{ \begin{array}{ll} 1-p & \quad \text{if $x = 0$} \\ p & \quad \text{if $x = 1$} \\ 0 & \quad \text{otherwise} \end{array} \right. \]
Comparing to Kruschke's (Doing Bayesian Data Analysis, p. 78) formula: \[ p(y|\theta)=\theta^y(1-\theta)^{(1-y)} \]
If we put Rouder's symbol into Kruschke's formula: \[ p(x|p)=p^x(1-p)^{(1-x)}=f(x; p) \] \( x \) is the random variable, and \( p \) is the parameter. Sometimes, confusingly, the above formula is also called the Bernoulli distribution, or the Bernoulli (probability) density function. Basically, these names refer to an identical concept.
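The formula above can be checked numerically. Below is a minimal sketch in R, using an arbitrary example value of \( p = 0.7 \); it compares the hand-written Bernoulli density with `dbinom()` called with `size = 1` (a Bernoulli is a binomial with a single trial).

```r
# Bernoulli density evaluated directly from p^x * (1-p)^(1-x),
# and via dbinom() with size = 1; p = 0.7 is an arbitrary example.
p <- 0.7
x <- c(0, 1)
f_direct <- p^x * (1 - p)^(1 - x)
f_dbinom <- dbinom(x, size = 1, prob = p)
f_direct  # 0.3 0.7
f_dbinom  # 0.3 0.7
```

Both give 0.3 for failure and 0.7 for success, as the piecewise definition says.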
A few important concepts mentioned here are:
1. The equivalence of two Bernoulli parameters. Because a Bernoulli parameter represents a probability, if \( p_1=p_2 \), the two corresponding random variables \( X_1 \) and \( X_2 \) are identically distributed. Since a random variable is a not-yet-realised value, we cannot say for sure that two outcomes drawn from Bernoulli density functions with identical parameters will be identical. Basically, this is my understanding of what Rouder tried to explain.
2. The independence of two random variables. If two experiments do not depend on each other, they are said to be mutually independent. This applies to other types of distribution as well, not just the Bernoulli distribution.
3. iid. If the above two conditions hold together, we say the two random variables are iid (independent and identically distributed).
Some other important concepts mentioned in Kruschke's book: the equation \( p(x|p)=p^x(1-p)^{(1-x)}=f(x; p) \), considered from the perspective of the parameter, is the likelihood function for the parameter \( p \), which is also called the Bernoulli likelihood function. The same function is the probability density function when considered from the perspective of the random variable (\( x \), the datum).
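The two perspectives can be made concrete in R. Below is a small sketch (my own illustration, not from Rouder or Kruschke): fix a single datum, say \( y = 1 \) (one success), and evaluate the same formula over a grid of candidate values of \( p \); the result is the likelihood function rather than a density.

```r
# Likelihood perspective: the datum y is fixed, p is the variable.
y <- 1
p_grid <- seq(0, 1, by = 0.01)
likelihood <- p_grid^y * (1 - p_grid)^(1 - y)
# For y = 1 the likelihood reduces to p itself, maximised at p = 1.
plot(p_grid, likelihood, type = "l", xlab = "p", ylab = "Likelihood")
```

Note that, viewed this way, the curve need not integrate to 1 over \( p \); that is one reason the likelihood is not itself a probability density for the parameter.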

When Bernoulli becomes Binomial

A Bernoulli trial could be tossing a coin once. If we toss a coin several times, it becomes a Bernoulli process. If a success (e.g., landing heads) is denoted as 1, the number of successes is another random variable, which is then called binomial. That is, the binomial random variable. So now the one-parameter Bernoulli density function becomes the binomial density function, \[ f(x; p) = \left\{ \begin{array}{ll} {N \choose x} p^x(1-p)^{N-x} & \quad \text{if $x = 0,\dots,N$} \\ 0 & \quad \text{otherwise} \end{array} \right. \]
Usually, we know how many times we want to toss the coin, so N is deterministic (i.e., already decided). Thus, the binomial density function still has only one parameter, the probability of getting a success outcome (a head).
# assign 0, 1, ..., 20 to x and store them as a vector, which says we want
# to know the probability of 0 head, 1 head, 2 heads, ..., 20 heads in
# this coin tossing experiment.
x <- 0:20
# assume the parameter (probability) is .7 and we want to toss the coin
# for 20 times.
f <- dbinom(x, 20, 0.7)
plot(x, f, type = "h", xlab = "Value of X", ylab = "Probability Mass Function")
points(x, f)
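As a sanity check on the plotted mass function (my own addition, under the same N = 20, p = 0.7 setting), the probabilities should sum to 1, and the theoretical mean of a binomial(N, p) variable is \( Np \).

```r
# Sanity checks on the binomial pmf with N = 20, p = 0.7.
N <- 20
p <- 0.7
x <- 0:N
f <- dbinom(x, N, p)
sum(f)      # probabilities sum to 1
sum(x * f)  # mean = N * p = 14
```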
(Plot: probability mass function of the binomial distribution for x = 0, ..., 20 with p = 0.7)

Differentiating the parameter and its estimator

Mostly, we have to estimate the underlying parameter. In this case, Rouder calls the estimate an estimator. In some books, the two concepts are named the statistic and the parameter. An estimator (or statistic, if you like) is an educated guess, based on the data you have. In statistics texts, an estimator is usually written with the same symbol as its parameter, with a hat (caret) on top of it. So if the parameter is \( p \), then the estimator would be \( \hat{p} \).
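A quick simulation makes the distinction concrete. Below is a hypothetical illustration (the true value 0.7 and the sample size 100 are my own example choices): we simulate Bernoulli trials with a known parameter, then compute the estimator \( \hat{p} \) as the observed proportion of successes.

```r
# Simulate 100 Bernoulli trials with known parameter p = 0.7,
# then estimate p from the data as the sample proportion of 1s.
set.seed(1)  # for reproducibility
true_p <- 0.7
trials <- rbinom(100, size = 1, prob = true_p)
p_hat <- mean(trials)
p_hat  # close to, but generally not exactly, 0.7
```

The estimator \( \hat{p} \) varies from sample to sample, while the parameter \( p \) is a fixed (unknown) quantity.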