Abstract
Policy learning is an important component of many real-world learning systems. A major
challenge in policy learning is how to adapt efficiently to unseen environments or tasks.
Recently, it has been suggested to exploit invariant conditional distributions to learn models
that generalize better to unseen environments. However, assuming invariance of entire
conditional distributions (which we call full invariance) may be too strong of an assumption
in practice. In this paper, we introduce a relaxation of full invariance called effect-invariance
(e-invariance for short) and prove that it is sufficient, under suitable assumptions, for zero-shot policy generalization. We also discuss an extension that exploits e-invariance when
we have a small sample from the test environment, enabling few-shot policy generalization.
Our work does not assume an underlying causal graph or that the data are generated
by a structural causal model; instead, we develop testing procedures to test e-invariance
directly from data. We present empirical results using simulated data and a mobile health
intervention dataset to demonstrate the effectiveness of our approach.
Original language | English |
---|---|
Article number | 34 |
Journal | Journal of Machine Learning Research |
Volume | 25 |
Pages (from-to) | 1-36 |
ISSN | 1533-7928 |
Publication status | Published - 2024 |