TY - UNPB
T1 - Posets and Bounded Probabilities for Discovering Order-inducing Features in Event Knowledge Graphs
AU - Back, Christoffer Olling
AU - Simonsen, Jakob Grue
PY - 2024
Y1 - 2024
N2 - Event knowledge graphs (EKG) extend the classical notion of a trace to capture multiple, interacting views of a process execution. In this paper, we tackle the open problem of automating EKG discovery from uncurated data through a principled, probabilistic framing based on the outcome space resulting from featured-derived partial orders on events. From this, we derive an EKG discovery algorithm based upon statistical inference rather than an ad-hoc or heuristic-based strategy, or relying on manual analysis from domain experts. This approach comes at the computational cost of exploring a large, non-convex hypothesis space. In particular, solving the maximum likelihood term involves counting the number of linear extensions of posets, which in general is #P-complete. Fortunately, bound estimates suffice for model comparison, and admit incorporation into a bespoke branch-and-bound algorithm. We show that the posterior probability as defined is antitonic w.r.t. search depth for branching rules that are monotonic w.r.t. model inclusion. This allows pruning of large portions of the search space, which we show experimentally leads to rapid convergence toward optimal solutions that are consistent with manually built EKGs.
AB - Event knowledge graphs (EKG) extend the classical notion of a trace to capture multiple, interacting views of a process execution. In this paper, we tackle the open problem of automating EKG discovery from uncurated data through a principled, probabilistic framing based on the outcome space resulting from featured-derived partial orders on events. From this, we derive an EKG discovery algorithm based upon statistical inference rather than an ad-hoc or heuristic-based strategy, or relying on manual analysis from domain experts. This approach comes at the computational cost of exploring a large, non-convex hypothesis space. In particular, solving the maximum likelihood term involves counting the number of linear extensions of posets, which in general is #P-complete. Fortunately, bound estimates suffice for model comparison, and admit incorporation into a bespoke branch-and-bound algorithm. We show that the posterior probability as defined is antitonic w.r.t. search depth for branching rules that are monotonic w.r.t. model inclusion. This allows pruning of large portions of the search space, which we show experimentally leads to rapid convergence toward optimal solutions that are consistent with manually built EKGs.
KW - cs.LG
KW - cs.AI
KW - 06-08
KW - G.3; I.2.6; I.5
KW - Bayesian inference
KW - Branch and bound
KW - Event knowledge graphs
KW - Linear extension
KW - Partial order
KW - Process mining
U2 - 10.48550/arXiv.2410.06065
DO - 10.48550/arXiv.2410.06065
M3 - Preprint
BT - Posets and Bounded Probabilities for Discovering Order-inducing Features in Event Knowledge Graphs
PB - arXiv.org
ER -