The truncated Newton method, originated in a paper by Ron Dembo and Trond Steihaug,[1] also known as Hessian-free optimization,[2] are a family of optimization algorithms designed for optimizing non-linear functions with large numbers of independent variables. A truncated Newton method consists of repeated application of an iterative optimization algorithm to approximately solve Newton's equations, to determine an update to the function's parameters. The inner solver is truncated, i.e., run for only a limited number of iterations. It follows that, for truncated Newton methods to work, the inner solver needs to produce a good approximation in a finite number of iterations;[3] conjugate gradient has been suggested and evaluated as a candidate inner loop.[2] Another prerequisite is good preconditioning for the inner algorithm.[4]


  1. ^ Dembo, Ron S.; Steihaug, Trond (1983). "Truncated-Newton algorithms for large-scale unconstrained optimization". Mathematical Programming. 26 (2). Springer: 190–212. doi:10.1007/BF02592055. S2CID 40537623.. Convergence results for this algorithm can be found in Dembo, Ron S.; Eisenstat, Stanley C.; Steihaug, Trond (1982). "Inexact newton methods". SIAM Journal on Numerical Analysis. 19 (2): 400–408. Bibcode:1982SJNA...19..400D. doi:10.1137/0719025. JSTOR 2156954..
  2. ^ a b Martens, James (2010). Deep learning via Hessian-free optimization (PDF). Proc. International Conference on Machine Learning.
  3. ^ Nash, Stephen G. (2000). "A survey of truncated-Newton methods". Journal of Computational and Applied Mathematics. 124 (1–2): 45–59. Bibcode:2000JCoAM.124...45N. doi:10.1016/S0377-0427(00)00426-X.
  4. ^ Nash, Stephen G. (1985). "Preconditioning of truncated-Newton methods" (PDF). SIAM J. Sci. Stat. Comput. 6 (3): 599–616. doi:10.1137/0906042.

Further reading