To better understand complexity in neural networks, we theoretically investigate the idealised phenomenon of lossless network compressibility, whereby an identical function can be implemented with fewer hidden units. In the setting of single-hidden-layer hyperbolic tangent networks, we define the rank of a parameter as the minimum number of hidden units required to implement the same function. We give efficient formal algorithms for optimal lossless compression and for computing the rank of a parameter. Losslessly compressible parameters are atypical, but their existence has implications for nearby parameters. We define the proximate rank of a parameter as the rank of the most compressible parameter within a small L-infinity neighbourhood. We give an efficient greedy algorithm for bounding the proximate rank of a parameter, and show that the problem of tightly bounding the proximate rank is NP-complete. These results lay a foundation for future theoretical and empirical work on losslessly compressible parameters and their neighbours.
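For concreteness, the sketch below illustrates the elementary unit-level redundancies that make a single-hidden-layer tanh parameter losslessly compressible: units with zero outgoing weight, units with zero incoming weights (which compute a constant), and pairs of units with identical or negated incoming weights and biases (mergeable because tanh is odd). This is a hedged illustration under an assumed parametrisation f(x) = v · tanh(Wx + b) + c with scalar output; the function name `losslessly_compress` and the tolerance handling are illustrative, and a single pass is not the paper's optimal compression algorithm (full reduction may require iterating to a fixed point).

```python
import numpy as np

def losslessly_compress(W, b, v, c, tol=0.0):
    """One pass of elementary lossless compression for f(x) = v @ tanh(W @ x + b) + c.

    Hypothetical sketch, not the paper's algorithm: drops units whose
    contribution is zero or constant and merges units computing identical
    (or negated) activations, using tanh(-z) = -tanh(z). With tol=0 the
    comparisons are exact and the compressed network implements exactly
    the same function.
    """
    W, b, v = W.copy(), b.copy(), v.copy()
    keep = []
    for i in range(len(v)):
        # A unit with zero outgoing weight contributes nothing.
        if abs(v[i]) <= tol:
            continue
        # A unit with zero incoming weights computes the constant tanh(b_i);
        # fold its contribution into the output bias instead.
        if np.all(np.abs(W[i]) <= tol):
            c += v[i] * np.tanh(b[i])
            continue
        merged = False
        for j in keep:
            # Identical incoming weights and bias: add outgoing weights.
            if np.allclose(W[i], W[j], rtol=0.0, atol=tol) and abs(b[i] - b[j]) <= tol:
                v[j] += v[i]
                merged = True
                break
            # Negated incoming weights and bias: tanh is odd, so merge with a sign flip.
            if np.allclose(W[i], -W[j], rtol=0.0, atol=tol) and abs(b[i] + b[j]) <= tol:
                v[j] -= v[i]
                merged = True
                break
        if not merged:
            keep.append(i)
    return W[keep], b[keep], v[keep], c
```

As a usage note, merging can itself create new redundancy (an outgoing weight may cancel to zero), so one would repeat the pass until the number of units stops shrinking to reach a fully reduced parameter under these elementary operations.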