Abstract
Most contemporary process discovery methods take only positive examples of process executions as input, which makes them one-class classification algorithms. However, we have found that negative examples are also available in industry; we therefore build on earlier work that treats process discovery as a binary classification problem. This approach opens the door to many well-established methods and metrics from machine learning, in particular to improve the distinction between what should and should not be allowed by the output model. Concretely, we (1) present a verified formalisation of process discovery as a binary classification problem; (2) provide cases with negative examples from industry, including real-life logs; (3) propose the Rejection Miner binary classification procedure, applicable to any process notation that has a suitable syntactic composition operator; (4) implement two concrete binary miners, one outputting Declare patterns, the other Dynamic Condition Response (DCR) graphs; and (5) apply these miners to real-world and synthetic logs obtained from our industry partners and the process discovery contest, showing increased output model quality in terms of accuracy and model size.
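To make the binary-classification framing concrete, the following is a minimal Python sketch and not the Rejection Miner itself. It assumes a model can be treated as a predicate that accepts or rejects a trace, that a log is a list of traces labelled positive or negative, and that composing two models means accepting only what both accept. The helper names (`response`, `absence`, `compose`, `confusion`, `accuracy`) and the toy log are hypothetical and chosen for illustration only.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Trace = Sequence[str]
Model = Callable[[Trace], bool]  # True means the model accepts the trace


def response(a: str, b: str) -> Model:
    """Hypothetical Declare-style constraint: every `a` is eventually followed by `b`."""
    def accepts(trace: Trace) -> bool:
        pending = False
        for event in trace:
            if event == a:
                pending = True
            elif event == b:
                pending = False
        return not pending
    return accepts


def absence(a: str) -> Model:
    """Hypothetical constraint: activity `a` never occurs in the trace."""
    return lambda trace: a not in trace


def compose(*constraints: Model) -> Model:
    """Assumed composition operator: accept only traces that every part accepts."""
    return lambda trace: all(c(trace) for c in constraints)


def confusion(model: Model, labelled_log: List[Tuple[Trace, bool]]) -> Dict[str, int]:
    """Count outcomes: accepting a positive is a TP, rejecting a negative is a TN."""
    counts = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for trace, is_positive in labelled_log:
        accepted = model(trace)
        if is_positive:
            counts["tp" if accepted else "fn"] += 1
        else:
            counts["fp" if accepted else "tn"] += 1
    return counts


def accuracy(counts: Dict[str, int]) -> float:
    """Fraction of traces classified in agreement with their label."""
    total = sum(counts.values())
    return (counts["tp"] + counts["tn"]) / total if total else 0.0


# Toy labelled log (invented for illustration): (trace, is_positive)
labelled_log: List[Tuple[Trace, bool]] = [
    (["A", "B"], True),
    (["B", "A", "B"], True),
    (["D"], True),
    (["A"], False),
    (["A", "B", "A"], False),
    (["A", "C", "B"], False),
]

single = response("A", "B")
combined = compose(response("A", "B"), absence("C"))

if __name__ == "__main__":
    for name, model in [("single", single), ("combined", combined)]:
        counts = confusion(model, labelled_log)
        print(name, counts, round(accuracy(counts), 3))
```

In this toy log the single constraint still accepts one negative trace, while the composed model rejects it without rejecting any positive trace. This illustrates how negative examples let standard metrics such as accuracy distinguish between candidate models.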
Original language | English |
---|---|
Article number | 102339 |
Journal | Information Systems |
Volume | 121 |
Number of pages | 20 |
ISSN | 0306-4379 |
Publication status | Published - 2024 |
Bibliographical note
Publisher Copyright: © 2023
Keywords
- Binary classification
- DisCoveR
- Dynamic condition response graphs
- Labelled event logs
- Negative examples
- Process mining