Gradient methods are experiencing a growth in methodological and theoretical developments owing to the challenges of optimization problems arising in data science. Focusing on data science applications with expensive objective function evaluations yet inexpensive gradient function evaluations, gradient methods that never make objective function evaluations are either being rejuvenated or actively developed. However, as we show, such gradient methods are all susceptible to catastrophic divergence under realistic conditions for data science applications. In light of this, gradient methods which make use of objective function evaluations become more appealing, yet, as we show, can result in an exponential increase in objective evaluations between accepted iterates. As a result, existing gradient methods are poorly suited to the needs of optimization problems arising from data science. In this work, we address this gap by developing a generic methodology that economically uses objective function evaluations in a problem-driven manner to prevent catastrophic divergence and avoid an explosion in objective evaluations between accepted iterates. Our methodology allows for specific procedures that can make use of specific step size selection methodologies or search direction strategies, and we develop a novel step size selection methodology that is well-suited to data science applications. We show that a procedure resulting from our methodology is highly competitive with standard optimization methods on CUTEst test problems. We then show a procedure resulting from our methodology is highly favorable relative to standard optimization methods on optimization problems arising in our target data science applications. Thus, we provide a novel gradient methodology that is better suited to optimization problems arising in data science.
翻译:暂无翻译