This work proposed kernel selection approaches for probabilistic classifiers based on features produced by the convolutional encoder of a variational autoencoder. In particular, the developed methodologies allow selecting the most relevant subset of latent variables. In the proposed implementation, each latent variable was sampled from the distribution associated with a single kernel of the encoder's last convolutional layer, since an individual distribution was created for each kernel. Therefore, selecting relevant features among the sampled latent variables makes it possible to perform kernel selection, filtering out uninformative features and kernels. This leads to a reduction in the number of model parameters. Both wrapper and filter methods were evaluated for feature selection. The latter was of particular relevance, as it relies only on the distributions of the kernels. It was assessed by measuring the Kullback-Leibler divergence between all pairs of distributions, under the hypothesis that kernels whose distributions are most similar can be discarded. This hypothesis was confirmed, since the most similar kernels were observed not to convey relevant information and could be removed. As a result, the proposed methodology is suitable for developing applications for resource-constrained devices.
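As a rough illustration of the filter method, the sketch below shows one way the KL-based selection could be carried out, assuming each kernel of the last convolutional layer is summarized by a univariate Gaussian with per-kernel mean and variance estimated from the encoder outputs (e.g., averaged over a validation batch). The function names `gaussian_kl` and `select_kernels`, the symmetric KL distance, and the greedy removal of the closest pair are illustrative assumptions, not the exact procedure used in this work.

```python
import numpy as np

def gaussian_kl(mu1, var1, mu2, var2):
    """Closed-form KL( N(mu1, var1) || N(mu2, var2) ) for univariate Gaussians."""
    return 0.5 * (np.log(var2 / var1) + (var1 + (mu1 - mu2) ** 2) / var2 - 1.0)

def select_kernels(mu, var, n_keep):
    """Greedily discard kernels whose latent distributions are most similar
    to another kernel's, measured by symmetric KL divergence.

    mu, var : arrays of shape (K,) with per-kernel latent mean/variance
              (hypothetical summaries estimated from encoder outputs).
    n_keep  : number of kernels to retain.
    """
    K = len(mu)
    # Pairwise symmetric KL divergence matrix (KL itself is asymmetric).
    d = np.full((K, K), np.inf)
    for i in range(K):
        for j in range(K):
            if i != j:
                d[i, j] = (gaussian_kl(mu[i], var[i], mu[j], var[j])
                           + gaussian_kl(mu[j], var[j], mu[i], var[i]))
    # Iteratively drop one member of the closest (most redundant) pair:
    # near-duplicate distributions are assumed to carry redundant information.
    keep = set(range(K))
    while len(keep) > n_keep:
        idx = sorted(keep)
        sub = d[np.ix_(idx, idx)]
        _, j = np.unravel_index(np.argmin(sub), sub.shape)
        keep.discard(idx[j])
    return sorted(keep)
```

The symmetric form of the divergence is used here only so that the similarity between two kernels does not depend on their ordering; the retained indices could then be used to prune the corresponding kernels from the encoder, reducing the parameter count as described above.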