A fitting algorithm for conjunctive queries (CQs) produces, given a set of positively and negatively labeled data examples, a CQ that fits these examples. In general, there may be many non-equivalent fitting CQs and thus the algorithm has some freedom in producing its output. Additional desirable properties of the produced CQ are that it generalizes well to unseen examples in the sense of PAC learning and that it is most general or most specific in the set of all fitting CQs. In this research note, we show that these desiderata are incompatible when we require PAC-style generalization from a polynomial sample: we prove that any fitting algorithm that produces a most-specific fitting CQ cannot be a sample-efficient PAC learning algorithm, and the same is true for fitting algorithms that produce a most-general fitting CQ (when it exists). Our proofs rely on a polynomial construction of relativized homomorphism dualities for path-shaped structures.
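To make the notion of fitting concrete, the following is a minimal sketch, not the construction used in this note. It assumes that Boolean CQs and data examples are both represented as sets of facts of the form (relation name, argument tuple), and that a CQ is true in an example exactly when there is a homomorphism from the CQ's canonical instance to that example. The names `has_homomorphism` and `fits` are hypothetical, and the brute-force search is exponential in the number of variables, so it is only meant for toy inputs.

```python
from itertools import product


def has_homomorphism(src, dst):
    """Brute-force check for a homomorphism from fact set src to fact set dst.

    Each fact is a pair (relation_name, tuple_of_arguments). This encoding is
    an assumption made for illustration only.
    """
    src_vals = sorted({a for _, args in src for a in args})
    dst_vals = sorted({a for _, args in dst for a in args})
    # Try every assignment of source values to destination values.
    for image in product(dst_vals, repeat=len(src_vals)):
        h = dict(zip(src_vals, image))
        if all((rel, tuple(h[a] for a in args)) in dst for rel, args in src):
            return True
    return False


def fits(query, positives, negatives):
    """A Boolean CQ fits if it holds in every positive and in no negative example."""
    return (all(has_homomorphism(query, e) for e in positives)
            and not any(has_homomorphism(query, e) for e in negatives))


# Toy data over one binary relation R (hypothetical examples).
path2 = {("R", ("x", "y")), ("R", ("y", "z"))}   # exists x,y,z: R(x,y) and R(y,z)
edge = {("R", ("x", "y"))}                       # exists x,y: R(x,y)
positive = {("R", ("a", "b")), ("R", ("b", "c"))}
negative = {("R", ("a", "b"))}

print(fits(path2, [positive], [negative]))  # path2 separates the two examples
print(fits(edge, [positive], [negative]))   # edge also holds in the negative example
```

In this toy setting, `path2` fits the labeled examples while the more general query `edge` does not, because `edge` is also true in the negative example.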