
Why the log-likelihood is negative

When y = 0, the cost function becomes


Cost(h_\theta(x), y) = -(1-y)\log(1-h_\theta(x))

If h_\theta(x) approaches 0, e.g. h_\theta(x) = 0.05, the result becomes

Cost(h_\theta(x), y) = -(1-y)\log(1-h_\theta(x)) = -1 \cdot \log(1-0.05) = -\log(0.95) \approx 0.05


Notice that \log(0.95) is a tiny negative number (about -0.05): the logarithm of anything between 0 and 1 is negative. That is the reason the log-likelihood cost function needs the negative sign, so that the cost comes out non-negative.
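As a quick numeric check, the following minimal Python snippet evaluates the expression above (using the natural logarithm) and shows how the leading minus sign turns the negative logarithm into a small positive cost.

```python
import math

# Plug in the example above: true label y = 0, prediction h_theta(x) = 0.05.
y = 0
h = 0.05

log_term = math.log(1 - h)        # natural log: log(0.95) ~ -0.0513, a tiny negative number
cost = -(1 - y) * log_term        # the leading minus sign flips it into a small positive cost

print(f"log(1 - h) = {log_term:.4f}")  # -0.0513
print(f"cost       = {cost:.4f}")      #  0.0513
```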

See the following figure to find out why we use the log-likelihood this way.

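The original figures are not preserved here. Assuming they plotted the two branches of the cost, -\log(h_\theta(x)) for y = 1 and -\log(1-h_\theta(x)) for y = 0, the matplotlib sketch below reproduces that kind of plot: the cost is near zero when the prediction agrees with the label and blows up as the prediction approaches the wrong extreme.

```python
import numpy as np
import matplotlib.pyplot as plt

# Assumed content of the missing figures: the two branches of the cross-entropy cost.
h = np.linspace(0.001, 0.999, 500)   # hypothesis output, kept strictly inside (0, 1)

plt.plot(h, -np.log(h), label=r"$y=1$: $-\log(h_\theta(x))$")
plt.plot(h, -np.log(1 - h), label=r"$y=0$: $-\log(1-h_\theta(x))$")
plt.xlabel(r"$h_\theta(x)$")
plt.ylabel("cost")
plt.legend()
plt.title("Cost is near 0 for a correct prediction and explodes for a confident wrong one")
plt.show()
```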

Conclusion

  • Taking the logarithm of the likelihood simplifies the computation (products of probabilities become sums), and because the logarithm is monotonically increasing, the likelihood and the log-likelihood are maximized by the same parameters
  • Since h_\theta(x) lies in the range (0, 1), its logarithm is always negative; putting a negative sign in front of the log-likelihood reverses this, giving the non-negative cross-entropy form, which we can then minimize as a cost function
  • Because we use the negative log-likelihood as our cost function and we want to find the parameter \theta, what we really need is a way to determine the value of \theta that reaches the lowest point of the cost as quickly as possible. This can be done with quasi-Newton methods, or with gradient descent, which is what we most often use in ML (a minimal sketch follows after the quote below)

    Quasi-Newton methods are methods used to either find zeroes or local maxima and minima of functions, as an alternative to Newton's method. They can be used if the Jacobian or Hessian is unavailable or is too expensive to compute at every iteration. The "full" Newton's method requires the Jacobian in order to search for zeros, or the Hessian for finding extrema.
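
To tie these points together, here is a minimal gradient-descent sketch for logistic regression that minimizes the negative log-likelihood (cross-entropy) cost. The toy data, learning rate, and step count are illustrative assumptions, not taken from the original post.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cross_entropy(theta, X, y):
    """Negative log-likelihood (cross-entropy) averaged over the training set."""
    h = sigmoid(X @ theta)
    return -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))

def gradient_descent(X, y, lr=0.1, steps=1000):
    """Plain batch gradient descent on the cross-entropy cost."""
    theta = np.zeros(X.shape[1])
    for _ in range(steps):
        h = sigmoid(X @ theta)
        grad = X.T @ (h - y) / len(y)   # gradient of the averaged cross-entropy
        theta -= lr * grad              # step downhill toward the minimum
    return theta

# Hypothetical toy data: one feature plus a bias column (illustrative only).
rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = (x + 0.2 * rng.normal(size=100) > 0).astype(float)
X = np.column_stack([np.ones_like(x), x])

theta = gradient_descent(X, y)
print("theta:", theta, " final cost:", cross_entropy(theta, X, y))
```

Gradient descent only needs the first derivative of the cost, which is why it is cheaper per iteration than the Newton-type methods described in the quote above, which also require (an approximation of) the Hessian.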