Sunday, May 21, 2017
Chapter 3.1 and 3.2 High-threshold Model
This is my note for sections 3.1 and 3.2 of Rouder's book.
Chapter 3 discusses the high-threshold model. It starts by introducing the signal-detection experiment, which produces four types of counts: hits, misses, false alarms and correct rejections. The first two add up to the total number of signal trials, and the last two add up to the total number of noise trials.
The Signal-detection Model
A tone-detection experiment is used as an example to explain the signal-detection model. Simulated data are shown in the table below. The columns indicate whether the observer announced that a tone was present or absent; the rows indicate whether a tone (signal) or only noise was presented.
| Stimulus | Tone Present | Tone Absent | Subtotal |
|---|---|---|---|
| Signal | 75 | 25 | 100 |
| Noise | 30 | 20 | 50 |
| Subtotal | 105 | 45 | 150 (Total) |
- The hit corresponds to the intersection of Tone Present and Signal: the observer announces that the tone is there, and the machine indeed presented the tone.
- The miss is the intersection of Tone Absent and Signal: the observer says that no tone was presented, but in reality the machine did present one. The subtotal of hits and misses is the total number of signals that the machine sent out. On the other hand, a signal-detection experiment usually also includes trials in which the machine presents only noise, with no signal. In this scenario, two further types of event may occur.
- The false alarm is the intersection of Tone Present and Noise.
- The correct rejection is the intersection of Tone Absent and Noise.
We can represent signal-detection data using the variables below:

1. Random variables: \( Y_h, Y_m, Y_f, Y_c \) denote the hit, miss, false alarm and correct rejection counts, respectively.
2. Data (outcomes): \( y_h, y_m, y_f, y_c \), written in lowercase.
3. Numbers of signal and noise trials: \( N_s, N_n \).
Thus, the hit and miss rates are \( y_h/N_s \) and \( y_m/N_s \); the false alarm and correct rejection rates are \( y_f/N_n \) and \( y_c/N_n \). In other words, the signal trials consist of hits and misses, and the noise trials consist of false alarms and correct rejections. Because the numbers of signal and noise trials are usually known, we can reduce the four random variables to two, \( Y_h \) and \( Y_f \).
Therefore, a simple signal-detection model can be expressed as two (stochastic) Binomial distributions, one for the hit rate and the other for the false alarm rate.

\[ \begin{array}{l} Y_h \sim \mbox{Binomial}(p_h, N_s), \\ Y_f \sim \mbox{Binomial}(p_f, N_n) \end{array} \]

\( p_h \) and \( p_f \) are the true (usually unknown) probabilities of hits and false alarms, respectively. The MLEs for \( p_h \) and \( p_f \) are \( \hat{p}_h = y_h/N_s \) and \( \hat{p}_f = y_f/N_n \). Because \( N_s = y_h + y_m \) and \( N_n = y_f + y_c \), only two parameters, \( p_h \) and \( p_f \), need to be estimated.
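Under this simple model the MLEs are just the observed rates. As a quick check (my own sketch, not from the book), we can compute them in R from the counts in the table above:

```r
# counts from the simulated table
y_h <- 75; y_m <- 25  # signal trials: hits and misses
y_f <- 30; y_c <- 20  # noise trials: false alarms and correct rejections
N_s <- y_h + y_m      # 100 signal trials
N_n <- y_f + y_c      # 50 noise trials

p_h.hat <- y_h / N_s  # MLE of the hit probability: 0.75
p_f.hat <- y_f / N_n  # MLE of the false alarm probability: 0.6
```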
The High-threshold Model
The reason to seek an alternative model for a signal-detection experiment is that the binomial model in the previous section yields two separate measures of performance: the hit rate and the false alarm rate. The high-threshold model provides a single measure of performance. The binomial model can then be re-written as:

\[ \begin{array}{l} Y_h \sim \mbox{Binomial}(d+(1-d)g, N_s), \\ Y_f \sim \mbox{Binomial}(g, N_n) \end{array} \]

\( d \) and \( g \) are the probabilities of signal detection and guessing, respectively. In the high-threshold model, two situations produce a hit. In the first, a signal is presented and the observer detects it, which happens with probability \( d \). In the second, a signal is presented but the observer does not detect it (\( 1-d \)) and simply guesses that it is there, which happens with probability \( (1-d)g \). Thus the \( p_h \) in the simple signal-detection model is replaced by \( d+(1-d)g \). By similar reasoning, \( p_f \) is replaced by \( g \): when no signal is presented, a "present" response can only be a guess.
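To make the reparameterisation concrete, a small R helper (my own sketch; the name `ht.probs` is mine) maps \( d \) and \( g \) to the hit and false-alarm probabilities:

```r
# map the high-threshold parameters (d, g) to response probabilities
ht.probs <- function(d, g) {
    c(p_h = d + (1 - d) * g,  # detect, or fail to detect but guess "present"
      p_f = g)                # noise trial: a "present" response is a guess
}

ht.probs(d = 0.375, g = 0.6)  # gives p_h = .75 and p_f = .6, the observed rates
```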
The Log-Likelihood Function
Now that we have the model, which is one form of density function, we can convert it to a likelihood function and then to a log-likelihood function.
Below are the density functions for the two random variables, \( Y_h \) and \( Y_f \), respectively.

\[ f(y_h; d, g) = \binom{N_s}{y_h} (d+(1-d)g)^{y_h}(1-(d+(1-d)g))^{N_s-y_h} \quad \mbox{hit rate function} \]

\[ f(y_f; d, g) = \binom{N_n}{y_f} g^{y_f}(1-g)^{N_n-y_f} \quad \mbox{false alarm rate function} \]
Multiplying the above two functions, we get the joint probability function:

\[ f(y_h, y_f; d, g) = \binom{N_s}{y_h} (d+(1-d)g)^{y_h}((1-d)(1-g))^{y_m} \times \binom{N_n}{y_f} g^{y_f}(1-g)^{y_c} \]
The parameters are still \( d \) and \( g \), and the random variables are \( Y_h \) and \( Y_f \). The lowercase \( y_h \) and \( y_f \) are realisations of the random variables. The likelihood function rewrites the probability function as a function of \( d \) and \( g \):

\[ \mbox{Likelihood}(d, g; y_h, y_f) = \binom{N_s}{y_h} (d+(1-d)g)^{y_h}((1-d)(1-g))^{y_m} \times \binom{N_n}{y_f} g^{y_f}(1-g)^{y_c} \]
Then, the negative log-likelihood takes the negative of the log of this function:

\[ l(d, g; y_h, y_f) = -(y_h\log(d+(1-d)g)+y_m\log((1-d)(1-g)) + y_f \log(g) + y_c \log(1-g)) \]
Note that the terms that do not involve the parameters (the binomial coefficients) have been omitted.
The negative log-likelihood function can be simplified as:

\[ \begin{array}{rcl} l(d, g; y_h, y_f) &=& -(y_h\log(p_h)+y_m\log(p_m) + y_f \log(p_f) + y_c \log(p_c)) \\ &=& -\sum_{i} y_i \log(p_i) \end{array} \]
Calculus Method
First, I removed the parentheses, multiplying each term by \( -1 \):

\[ l(d, g; y_h, y_f) = -y_h\log(d+(1-d)g)-y_m\log((1-d)(1-g)) - y_f \log(g) - y_c \log(1-g) \]
To calculate \( \hat{d} \), I took the partial derivative with respect to \( d \). Note that the third and fourth terms equal zero, because the derivative of a constant is zero. Here is what I got:

\[ \begin{array}{rcl} \frac{\partial l(d, g; y_h, y_f)}{\partial d} &=& \frac{-y_h(1-g)}{d+(1-d)g} + \frac{y_m(1-g)}{(1-d)(1-g)} - \frac{y_f}{g} \times 0 - \frac{y_c}{1-g} \times 0 \\ &=& \frac{-y_h(1-g)}{d+(1-d)g} + \frac{y_m}{1-d} \end{array} \]
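As a quick sanity check (my own sketch, not from the book), this derivative should vanish at the maximum-likelihood estimates \( \hat{d} = .375 \) and \( \hat{g} = .6 \) obtained below. A central finite difference in R confirms this for the simulated data:

```r
# negative log-likelihood for the simulated data (75, 25, 30, 20)
nll <- function(d, g) {
    -(75 * log(d + (1 - d) * g) + 25 * log((1 - d) * (1 - g)) +
      30 * log(g) + 20 * log(1 - g))
}

# central finite difference in d at d = .375, g = .6;
# the result should be (numerically) zero
eps <- 1e-6
(nll(0.375 + eps, 0.6) - nll(0.375 - eps, 0.6)) / (2 * eps)
```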
After finishing the partial derivative, I set the equation to zero and solved for \( d \); since this is an estimate, I denote it \( \hat{d} \). Note that:
- \( d+(1-d)g = y_h/N_s \)
- \( y_m = N_s - y_h \)
- \( 1-g = y_c/N_n \)
\[ \begin{array}{rcl} \frac{-y_h(1-g)}{y_h/N_s} + \frac{y_m}{1-d} &=& 0 \\ \frac{y_h(1-g)}{y_h/N_s} &=& \frac{N_s-y_h}{1-d} \\ (1-d) &=& \frac{1}{N_s} \frac{N_s-y_h}{1-g} \\ d &=& \frac{y_h-N_s}{N_s(1-g)} + 1 \\ \hat{d} &=& \frac{(y_h/N_s) - (y_f/N_n)}{1-(y_f/N_n)} \end{array} \]
To compute \( \hat{g} \), I similarly took the partial derivative with respect to \( g \). Re-arranging and replacing the terms was a bit complicated, but in the end we still get Rouder's result:

\[ \hat{g} = y_f/N_n \]
Plugging the simulated data into the calculus-derived equations, we get \( \hat{d} = .375 \) and \( \hat{g} = .6 \).
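The closed-form estimators are easy to wrap in a small R function (my own sketch; the name `est.ht` is mine):

```r
# closed-form MLEs for the high-threshold model
est.ht <- function(y_h, y_m, y_f, y_c) {
    N_s <- y_h + y_m                             # number of signal trials
    N_n <- y_f + y_c                             # number of noise trials
    g.hat <- y_f / N_n                           # guess rate
    d.hat <- (y_h / N_s - g.hat) / (1 - g.hat)   # detection rate
    c(d = d.hat, g = g.hat)
}

est.ht(75, 25, 30, 20)  # d = 0.375, g = 0.6
```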
Numerical Method
`y` is the data vector, \( (y_h, y_m, y_f, y_c) \), and `par` is the parameter vector.
```r
# negative log-likelihood function for the high-threshold model
nll.ht <- function(par, y) {
    d <- par[1]
    g <- par[2]
    # y[1] is the hit count; y[2] is the miss count; y[3] is the false
    # alarm count; y[4] is the correct rejection count
    ll <- y[1] * log(d + (1 - d) * g) + y[2] * log((1 - d) * (1 - g)) +
        y[3] * log(g) + y[4] * log(1 - g)
    return(-ll)  # return the negative log-likelihood
}
```
Previously, the negative log-likelihood function was simplified as:

\[ \begin{array}{rcl} l(d, g; y_h, y_f) &=& -(y_h\log(p_h)+y_m\log(p_m) + y_f \log(p_f) + y_c \log(p_c)) \\ &=& -\sum_{i} y_i \log(p_i) \end{array} \]
The code can thus be simplified as follows:
```r
nll.ht2 <- function(par, y) {
    d <- par[1]
    g <- par[2]
    p <- numeric(4)
    p[1] <- d + (1 - d) * g  # p_h
    p[2] <- 1 - p[1]         # p_m
    p[3] <- g                # p_f
    p[4] <- 1 - p[3]         # p_c
    return(-sum(y * log(p)))  # return the negative log-likelihood
}

y <- c(75, 25, 30, 20)  # the data
par <- c(0.5, 0.5)      # initial detection and guess rates
optim(par, nll.ht2, y = y)
```
```
## $par
## [1] 0.3751 0.5999
##
## $value
## [1] 89.88
##
## $counts
## function gradient
##       43       NA
##
## $convergence
## [1] 0
##
## $message
## NULL
```
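As a final sanity check (my own addition), evaluating the simplified negative log-likelihood at the calculus-derived estimates reproduces both `$par` and `$value` from the `optim` output:

```r
y <- c(75, 25, 30, 20)
d <- 0.375  # calculus-derived estimate of the detection rate
g <- 0.6    # calculus-derived estimate of the guess rate

p <- c(d + (1 - d) * g,    # p_h
       (1 - d) * (1 - g),  # p_m
       g,                  # p_f
       1 - g)              # p_c
-sum(y * log(p))  # about 89.88, matching $value above
```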