An efficient algorithm, Apriori_Goal, is proposed for constructing association rules for a relational database with a given classification. The algorithm's features are related to the specifics of the database and the method of encoding its records. The algorithm proposes five criteria that characterize the quality of the rules being constructed. Different criteria are also proposed for filtering the sets used when constructing association rules. The proposed method of encoding records allows for an efficient implementation of the basic operation underlying the computation of rule characteristics. The algorithm works with a relational database, where the columns can be of different types, both continuous and discrete. Among the columns, a target discrete column is distinguished, which defines the classification of the records. This allows the original database to be divided into $n$ subsets according to the number of categories of the target parameter. A classical example of such databases is medical databases, where the target parameter is the diagnosis established by doctors. A preprocessor, which is an important part of the algorithm, converts the properties of the objects represented by the columns of the original database into binary properties and encodes each record as a single integer. In addition to saving memory, the proposed format allows the complete preservation of information about the binary properties representing the original record. More importantly, the computationally intensive operations on records, required for calculating rule characteristics, are performed almost instantly in this format using a pair of logical operations on integers.
翻译:暂无翻译