In Radhakrishnan et al. [2020], the authors empirically show that autoencoders trained with standard SGD methods form basins of attraction around their training data. We consider network functions whose width does not exceed the input dimension and prove that, in this situation, the basins of attraction are bounded and their complement cannot have bounded components. The conditions of these results are met in several experiments of the latter work, and we thus address a question posed therein. We also show that, under somewhat more restrictive conditions, the basins of attraction are path-connected. The tightness of the conditions in our results is demonstrated by several examples. Finally, the arguments used to prove the above results allow us to identify a root cause of why scalar-valued neural network functions that satisfy our bounded-width condition are not dense in spaces of continuous functions.
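The phenomenon at issue can be probed directly: a trained autoencoder f defines a discrete dynamical system x_{k+1} = f(x_k), and a training point lies in a basin of attraction if iterates started nearby return to it. The following is a minimal sketch, assuming PyTorch is available; the tiny architecture, the training loop, and the perturbation scale are illustrative assumptions, not the experimental setup of Radhakrishnan et al. [2020].

```python
# Illustrative sketch: overfit a small autoencoder on a few points, then
# iterate it as a map to probe whether the training data act as attractors.
import torch

torch.manual_seed(0)
d = 8                                  # input dimension
X = torch.randn(4, d)                  # a handful of "training" points

# Hidden width equal to (i.e. not exceeding) the input dimension, matching
# the bounded-width setting discussed in the abstract.
f = torch.nn.Sequential(
    torch.nn.Linear(d, d), torch.nn.Tanh(), torch.nn.Linear(d, d)
)
opt = torch.optim.Adam(f.parameters(), lr=1e-2)
for _ in range(5000):                  # overfit so that f(x) ~= x on X
    opt.zero_grad()
    loss = ((f(X) - X) ** 2).mean()
    loss.backward()
    opt.step()

# Iterate f from a perturbed training point; if the perturbed point lies in
# a basin of attraction, the iterates should return to the training datum.
with torch.no_grad():
    x = X[0] + 0.1 * torch.randn(d)
    for _ in range(200):
        x = f(x)
    print("distance to training point:", (x - X[0]).norm().item())
```

Whether the iterates in fact converge depends on the trained weights; the sketch only makes the dynamical-systems reading of the abstract concrete, it does not reproduce the cited experiments.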