As machine learning models grow larger and more complex, and datasets grow in both size and dimensionality, a good feature selection (FS) method at the preprocessing stage becomes necessary to train better models while saving training time and computational resources. Feature importance (FI) matters because it is the basis of feature selection. This paper brings the probability of necessity and sufficiency (PNS) from causal inference into the quantification of feature importance and proposes three new FI measures: PN_FI, which measures how important a feature is in image recognition tasks; PS_FI, which measures how important a feature is in image generation tasks; and PNS_FI, which measures both. The main body of the paper consists of three randomized controlled trials (RCTs), whose results we use to show how PS_FI, PN_FI and PNS_FI are calculated for three features: dog nose, dog eyes and dog mouth. The resulting FI values are intervals with tight upper and lower bounds.
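To make the interval-valued FI concrete, the following minimal sketch shows one way such bounds could be computed from RCT outcomes. It assumes the standard Tian-Pearl bounds on PNS and, because treatment is randomized (exogeneity), the identities PN = PNS / P(y | x) and PS = PNS / P(y' | x'). The abstract does not state the paper's exact formulas, so this is an illustration under those assumptions rather than the authors' procedure. The function names, the `Interval` type and the numeric inputs are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    lower: float
    upper: float


def pns_bounds(p_y_do_x: float, p_y_do_not_x: float) -> Interval:
    """Tian-Pearl bounds on PNS from experimental data.

    p_y_do_x     -- P(y | do(x)), e.g. share of images recognized with the feature present
    p_y_do_not_x -- P(y | do(x')), e.g. share recognized with the feature removed
    """
    lower = max(0.0, p_y_do_x - p_y_do_not_x)
    upper = min(p_y_do_x, 1.0 - p_y_do_not_x)
    return Interval(lower, upper)


def pn_bounds(p_y_do_x: float, p_y_do_not_x: float) -> Interval:
    """Bounds on PN, assuming exogeneity: PN = PNS / P(y | x)."""
    if p_y_do_x <= 0.0:
        raise ValueError("P(y | do(x)) must be positive to bound PN")
    pns = pns_bounds(p_y_do_x, p_y_do_not_x)
    return Interval(pns.lower / p_y_do_x, pns.upper / p_y_do_x)


def ps_bounds(p_y_do_x: float, p_y_do_not_x: float) -> Interval:
    """Bounds on PS, assuming exogeneity: PS = PNS / P(y' | x')."""
    p_not_y_do_not_x = 1.0 - p_y_do_not_x
    if p_not_y_do_not_x <= 0.0:
        raise ValueError("P(y' | do(x')) must be positive to bound PS")
    pns = pns_bounds(p_y_do_x, p_y_do_not_x)
    return Interval(pns.lower / p_not_y_do_not_x, pns.upper / p_not_y_do_not_x)


# Hypothetical RCT outcome for a single feature (illustrative numbers, not results from the paper).
p_with_feature = 0.9     # P(recognized | do(feature present))
p_without_feature = 0.3  # P(recognized | do(feature removed))

print("PNS bounds:", pns_bounds(p_with_feature, p_without_feature))
print("PN bounds: ", pn_bounds(p_with_feature, p_without_feature))
print("PS bounds: ", ps_bounds(p_with_feature, p_without_feature))
```

Under these assumptions, the intervals narrow as the two experimental probabilities move apart, and the PNS interval collapses to a single point when either P(y | do(x)) = 1 or P(y | do(x')) = 0.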