TUPLES members’ papers at the 2023 ICAPS Conference
Learned action policies π can be used to make real-time decisions in dynamic environments; one simply evaluates the policy on the current state in order to obtain the next action. Yet, this raises obvious concerns regarding potential policy “bugs”, that is, undesirable or even fatal policy behavior in particular situations. Testing – searching for bugs – is a natural paradigm to address these concerns.
Test oracles, which are responsible for recognizing states as bugs, are a central component of action-policy testing. In the context of this paper, this means that, given a query state t, a test oracle attempts to establish that the behavior of π on t is sub-optimal. Recent work introduced metamorphic oracles, which realize this by comparing the behavior of π on state pairs where one of the states is known to be easier to solve: if π performs better on a more difficult state s than on a simpler state t, then its behavior on t must be sub-optimal, so t must be a bug.
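The comparison behind a metamorphic oracle can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function `flags_bug`, the toy policy `toy_policy_cost`, and the convention that a failed policy run costs `math.inf` are all assumptions made for illustration. The sketch assumes the caller already knows that s is at least as hard to solve as t.

```python
import math
from typing import Callable


def flags_bug(policy_cost: Callable[[int], float], t: int, s: int) -> bool:
    """Metamorphic check (illustrative): s is assumed at least as hard as t.

    If the policy is strictly cheaper on the harder state s than on the
    easier state t, its cost on t exceeds the optimal cost of t, so t is
    reported as a bug. A failed run is represented by math.inf.
    """
    return policy_cost(s) < policy_cost(t)


def toy_policy_cost(distance: int) -> float:
    """Hypothetical policy on a line world; the state is the distance to the goal."""
    if distance == 3:
        return 7.0        # takes a detour: sub-optimal behavior
    if distance == 5:
        return math.inf   # never reaches the goal
    return float(distance)


# In this toy domain a larger distance is assumed to be at least as hard.
print(flags_bug(toy_policy_cost, t=3, s=4))  # detour on the easier state
print(flags_bug(toy_policy_cost, t=2, s=4))  # consistent costs, nothing flagged
```

When the check returns False, nothing is concluded about t; the oracle can confirm bugs but cannot certify that a state is bug-free.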
This paper shows how to automatically design such oracles based on simulation relations between states. It introduces two oracle families of this kind: first, morphing query states t to obtain suitable s; second, maintaining and comparing upper bounds on the cost of optimal plans for the states encountered during testing. Experiments show that these new oracles can find bugs much more quickly than the existing (search-based) alternatives and that the combination of the new oracles with search-based ones almost consistently dominates all other oracles.
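The second oracle family, maintaining and comparing upper bounds, can be sketched as follows. This is a hedged illustration under stated assumptions rather than the authors' code: the class `UpperBoundOracle`, its method names, and the user-supplied relation `at_least_as_hard(s, t)` (standing in for a simulation relation between states) are hypothetical.

```python
import math
from typing import Callable, Dict, Hashable


class UpperBoundOracle:
    """Illustrative oracle that keeps upper bounds on optimal plan cost.

    at_least_as_hard(s, t) is an assumed relation: when it holds, any plan
    cost achieved from s is also an upper bound on the optimal cost of t.
    """

    def __init__(self, at_least_as_hard: Callable[[Hashable, Hashable], bool]):
        self.at_least_as_hard = at_least_as_hard
        self.bounds: Dict[Hashable, float] = {}

    def record(self, state: Hashable, plan_cost: float) -> None:
        # Keep the cheapest plan cost seen so far for this state.
        self.bounds[state] = min(plan_cost, self.bounds.get(state, math.inf))

    def best_bound(self, t: Hashable) -> float:
        # Tightest bound inherited from any recorded state at least as hard as t.
        return min(
            (ub for s, ub in self.bounds.items() if self.at_least_as_hard(s, t)),
            default=math.inf,
        )

    def is_bug(self, t: Hashable, policy_cost_t: float) -> bool:
        # t is a bug if a known upper bound beats the policy's cost on t.
        bug = self.best_bound(t) < policy_cost_t
        self.record(t, policy_cost_t)
        return bug


# Toy usage: states are distances to a goal; larger distance is assumed harder.
oracle = UpperBoundOracle(at_least_as_hard=lambda s, t: s >= t)
oracle.record(4, 4.0)           # the policy solved distance 4 at cost 4
print(oracle.is_bug(3, 7.0))    # cost 7 on an easier state exceeds the bound 4
print(oracle.is_bug(2, 2.0))    # cost 2 is within every known bound
```

Because the bounds accumulate across all states seen during testing, each new policy run can expose bugs in later query states without any additional search.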
The International Conference on Automated Planning and Scheduling (ICAPS) is the premier forum for exchanging news and research results on the theory and applications of intelligent and automated planning and scheduling technology. ICAPS 2023 is part of the ICAPS conference series. After three years of virtual events, ICAPS 2023 will be a physical conference again. ICAPS 2023 will be held July 8-13, 2023, in Prague, Czech Republic.