Abstract
Most contemporary process discovery methods take only positive examples of process executions as input, which makes them one-class classification algorithms. However, we have found that negative examples are also available in industry, so we build on earlier work that treats process discovery as a binary classification problem. This approach opens the door to many well-established methods and metrics from machine learning, in particular for improving the distinction between what the output model should and should not allow. Concretely, we (1) present a verified formalisation of process discovery as a binary classification problem; (2) provide cases with negative examples from industry, including real-life logs; (3) propose the Rejection Miner binary classification procedure, applicable to any process notation that has a suitable syntactic composition operator; (4) implement two concrete binary miners, one outputting Declare patterns, the other Dynamic Condition Response (DCR) graphs; and (5) apply these miners to real-world and synthetic logs obtained from our industry partners and the Process Discovery Contest, showing increased output model quality in terms of accuracy and model size.
Original language | English |
---|---|
Article number | 102339 |
Journal | Information Systems |
Volume | 121 |
Number of pages | 20 |
ISSN | 0306-4379 |
DOI | |
Status | Published - 2024 |
Bibliographical note
Funding Information: This work is supported in part by Digital Research Centre Denmark (DIREC) through the AI and Blockchains for Complex Business Processes project. We wish to express our deep gratitude to Sergio Tessaris, Chiara Di Francescomarino and Fabrizio Maria Maggi for their efforts in replicating our results and for making us aware of the discrepancies they discovered in the output of the Rejection Miner, which led to the detection and correction of an error in the evaluation of a number of patterns. We also wish to thank DCR Solutions for providing us with a data set containing the test cases for all processes stored in the DCR process portal.
Publisher Copyright:
© 2023