Optimization problems arising in data science have given rise to a number of new derivative-based optimization methods. Such methods often use standard smoothness assumptions -- namely, global Lipschitz continuity of the gradient -- to establish a convergence theory. In this work, however, we show that common optimization problems from data science applications are not globally Lipschitz smooth, nor do they satisfy some more recently developed smoothness conditions in the literature. Instead, we show that such optimization problems are better modeled as having locally Lipschitz continuous gradients. We then construct explicit examples satisfying this assumption on which existing classes of optimization methods either behave unreliably or suffer an explosion in evaluation complexity. In summary, we show that optimization problems arising in data science are particularly difficult to solve, and that there is a need for methods that can reliably and practically solve these problems.
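To illustrate the distinction between global and local Lipschitz smoothness, the following is a minimal worked example of our own choosing (the exponential loss and its Poisson-regression connection are assumptions for illustration, not examples taken from this abstract):

\documentclass{article}
\usepackage{amsmath, amssymb}
\begin{document}
% Illustrative sketch (assumed example, not from the paper): a loss whose
% gradient is locally, but not globally, Lipschitz continuous.
Let $f(x) = e^{x}$, a building block of Poisson regression losses. Then
\[
  f'(x) = e^{x}, \qquad \sup_{x \in \mathbb{R}} f''(x) = \infty,
\]
so no single constant $L$ satisfies $|f'(x) - f'(y)| \le L\,|x - y|$ for all
$x, y \in \mathbb{R}$; that is, $f$ is not globally Lipschitz smooth. On any
bounded interval $[-R, R]$, however, $f'$ is Lipschitz with constant $e^{R}$,
so $f$ has a locally Lipschitz continuous gradient.
\end{document}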