We propose a simple modification to standard deep learning architectures during their training phase (L2 normalization over feature space) that produces results competitive with state-of-the-art Out-of-Distribution (OoD) detection while requiring relatively little training time. When the L2 normalization is removed at test time, the magnitudes of feature vectors become a surprisingly good measure for OoD detection. Intuitively, In-Distribution (ID) images yield feature vectors with large magnitudes, while OoD images yield small ones, which permits a simple thresholding scheme for screening OoD images. We provide a theoretical analysis of how this simple change works. Competitive results are possible in only 60 epochs of training on a standard ResNet18.
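To make the procedure concrete, the following is a minimal PyTorch sketch of the idea as described above: penultimate features are L2-normalized before the classifier head during training, and at test time the un-normalized feature norm is thresholded as an OoD score. A torchvision ResNet18 backbone is assumed; the names L2NormNet, ood_score, and the threshold tau are illustrative, not taken from the paper.

import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import resnet18

class L2NormNet(nn.Module):
    """ResNet18 whose penultimate features are L2-normalized during
    training; at test time the raw feature norm serves as an OoD score
    (large norm -> likely ID, small norm -> likely OoD). Hypothetical
    sketch, not the paper's reference implementation."""

    def __init__(self, num_classes=10):
        super().__init__()
        backbone = resnet18(num_classes=num_classes)
        # Split off the final linear layer so we can access features.
        self.features = nn.Sequential(*list(backbone.children())[:-1])
        self.fc = backbone.fc

    def forward(self, x, normalize=True):
        z = self.features(x).flatten(1)   # penultimate feature vectors
        if normalize:                     # training: unit-norm features
            z = F.normalize(z, p=2, dim=1)
        return self.fc(z), z

@torch.no_grad()
def ood_score(model, x):
    # Test time: skip normalization and use the feature magnitude.
    _, z = model(x, normalize=False)
    return z.norm(p=2, dim=1)             # small values suggest OoD

# Usage (tau is a hypothetical threshold chosen on validation data):
#   scores = ood_score(model, batch)
#   is_ood = scores < tau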