Scope and Motivation

This workshop explores the intersection of formal requirements engineering and artificial intelligence (AI), addressing a central challenge in modern software and AI system development: how to precisely specify requirements for increasingly complex, heterogeneous, and autonomous systems, and how to verify and validate that such systems meet those requirements.

As AI technologies become deeply embedded in safety- and mission-critical domains, from autonomous vehicles to medical diagnostics, financial systems to industrial automation and space exploration, the need for rigorous, formal approaches to requirements is becoming increasingly important. At the same time, traditional requirements engineering methods face new challenges when applied to systems with learning-enabled components, unpredictable behaviors, and emergent properties. The opacity of AI models and the semantic gap between high-level requirements and low-level model inputs and internals create significant impediments to verifying and validating that such systems meet their specified requirements.

The workshop welcomes extended abstract contributions on formal specification languages for AI systems, verification and validation techniques, requirements for trustworthy AI, case studies from real-world applications, and novel applications of AI to requirements engineering itself. We aim to foster dialogue between communities that have traditionally worked separately, building bridges toward more reliable, safe, and trustworthy AI systems grounded in rigorous requirements practices. Extended abstracts may summarize and cite results from recently published papers and/or present the authors' perspective.

Topics of Interest

  • How can we formally specify requirements for systems with learning-enabled components?
  • How can formal frameworks capture fairness, safety, robustness, and explainability requirements for AI systems?
  • How do we verify that AI systems meet their specified requirements?
  • What role can AI play in automating requirements elicitation, formalization, analysis, and validation?
  • How do we bridge high-level requirements and behavior of AI-enabled systems to enable traceability, safety assurance, and certification?

Remote Participation - same as NFM

While RExAI strongly encourages authors and attendees to join us in person, we recognize that travel to the United States can present challenges and that many valued members of the formal methods community reside abroad. For authors who are unable to travel, we will make every reasonable effort to support remote participation so they can present their work and fully engage with the symposium.

Location and Cost

The symposium will take place at the University of Southern California, Los Angeles, California, USA. There will be no registration fee for participants. All interested individuals are welcome to attend, listen to the talks, and participate in discussions; however, all attendees must register.

Schedule (all times in PDT)

May 4, 2026

Opening

08:20 - 08:30

Opening remarks

Session 1: Keynote Session

Session Chair: Marie Farrell
08:30 - 09:30

Safe Planning and Scheduling for Autonomous Systems

Klaus Havelund

Session 2: Paper Session

Session Chair: Nicolas F. Rouquette
09:30 - 09:45

Verifying STL Properties of Sliding-Window Neural Classifiers under Adversarial Perturbations

Xaver Fink, Borja Fernandez Adiego, Joost-Pieter Katoen

09:45 - 10:00

Developing Robot Software Models using LLMs and Formal Verification

Richard North, Pedro Ribeiro, Ana Cavalcanti, Luke Jackson, Ipek Caliskanelli

10:00 - 10:30

Coffee Break

Session 3: Paper Session

Session Chair: Divya Gopinath
10:30 - 10:45

Evaluating Generative AI Outputs as a Human-Machine Interface Between Domain Experts and Domain-Specific Languages: A Cost-of-Error Framework

Panagiotis Kalogeropoulos

10:45 - 11:00

RBT4DNN: Requirements-based Testing of Neural Networks

Nusrat Jahan Mozumder, Felipe Toledo, Swaroopa Dola, Matthew B. Dwyer

11:00 - 11:15

Adaptive Stress Testing of VLM Safety Monitors

Aleicia Zhu, Rory Lipkis, Adrian Agogino

Session 4: Keynote Session

Session Chair: Anastasia Mavridou
11:15 - 12:15

You Can't Validate What You Don't Understand: LLMs as Explanation Partners for Formal Requirements Engineering

Paola Spoletini

12:15 - 13:45

Lunch

Session 5: Paper Session

Session Chair: Hazel Taylor
13:45 - 14:00

Formal Requirements for AI-driven Satellite Image Classification of Biomass Burning in Angolan Miombo Woodlands (2015–2025)

Isau Alfredo B. Quissindo, Virgínia Quartin

14:00 - 14:15

Abstract Traffic Scenario Specifications for Improving Safety and Trustworthiness of AI-based Mobility Systems

Ishan Saxena, Dominik Grundt, Eike Möhlmann, Bernd Westphal

14:15 - 14:30

Toward a Requirements Framework for Trustworthy AI in Enterprise Cloud Deployments

Kranthi Kumar Manchikanti

14:30 - 14:45

Ambiguity Reduction in Natural Language Requirements through Autoformalization

Eunsuk Kang, Sam Procter, Teresa Fortes, Yining She

14:45 - 15:00

Gen-AI as a Formal Requirements Engineering Partner: A Case Study in Verified Parser Design

Nicolas F Rouquette

15:00 - 15:15

Can LLM Support Formal Requirements Engineers in ASMETA?

Andrea Bombarda, Silvia Bonfanti, Angelo Gargantini, Nico Pellegrinelli

15:15 - 15:45

Coffee Break

Session 6: Panel Discussion

Panel Chairs: Anastasia Mavridou and Divya Gopinath
15:45 - 16:45

Verifying What We Cannot Fully Specify

As autonomous systems are increasingly integrated into safety-critical domains — from space exploration and autonomous vehicles to medical devices and industrial automation — the question of whether they truly do what we intend has never been more urgent. Requirement-based Verification and Validation (V&V) grounds assurance in rigorously specified behaviors that can be analyzed, traced, and certified. Yet the rise of AI-enabled systems poses critical challenges due to their opacity and complexity. Unlike traditional systems whose behavior can often be tested or formally verified, AI systems exhibit emergent behaviors that resist conventional V&V approaches. This opacity is compounded by a semantic gap: requirements are typically expressed in high-level natural language (e.g., English text), while DNNs operate on low-level representations (e.g., raw pixels). This mismatch creates barriers to standard software engineering practices such as testing, debugging, runtime monitoring, and verification.

This panel brings together voices from government, industry, and academia to discuss: (1) where the field is making progress, (2) where the most significant challenges lie, and (3) what practical steps each community can take to move the needle.

Panelists: Misty Davies, Jyotirmoy Deshmukh, Dimitra Giannakopoulou, Vandi Verma, Timothy Wang

Closing

16:45 - 17:00

Closing remarks

Keynote Speakers

Klaus Havelund

Senior Research Scientist, NASA JPL

Bio: Klaus Havelund is a Senior Research Scientist at NASA’s Jet Propulsion Laboratory (JPL). Before joining JPL in 2006, he spent eight years as a researcher at NASA Ames Research Center in Silicon Valley, California, and one year at Kestrel Technology, a company focused on commercializing formal methods. He has more than four decades of experience in the research and application of formal methods. His work has covered topics such as programming language semantics, specification language design, applications of theorem proving, model checking, and runtime verification. He earned his PhD in Computer Science from the University of Copenhagen, Denmark, part of which was carried out at École Normale Supérieure in Paris, France, followed by a postdoc at École Polytechnique and Paris 6. He then spent one year as a research associate at Aalborg University in Denmark before moving to the United States in 1997.

Title: Safe Planning and Scheduling for Autonomous Systems

Abstract: Model-based planning and scheduling (P&S) is an approach to controlling autonomous systems. The approach relies on a model, here referred to as a tasknet, of the tasks that are possible and the constraints on them. From this model, a P&S system generates a plan or schedule of tasks satisfying the constraints such that their execution achieves a given goal. Programming tasknets is a complex task. We propose a domain-specific language for programming tasknets and a tool, TaskSAT, for exploring them. We check validity, meaning that a schedule exists, as well as safety, meaning that all possible schedules satisfy properties stated in linear temporal logic. Temporal logic can also be used as part of the constraints that determine which schedules are generated. Tasknets are translated into SMT problems and solved with the Z3 SMT solver. TaskSAT was developed to emulate MEXEC, a C++ P&S system in operational use at NASA/JPL, and is intended to support the verification and validation of MEXEC tasknet programming.
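As an illustrative aside (not TaskSAT itself, which translates tasknets into SMT problems solved with Z3), the two checks the abstract distinguishes can be sketched on a toy tasknet by brute-force enumeration: validity asks whether any schedule satisfies the constraints, and safety asks whether every valid schedule satisfies a temporal property. All task names and constraints below are hypothetical.

```python
# Toy sketch (hypothetical tasknet, not TaskSAT): tasks plus precedence
# constraints. Validity = some schedule satisfies the constraints;
# safety = every valid schedule satisfies a temporal property.
# TaskSAT encodes such checks as SMT problems; we brute-force a tiny instance.
from itertools import permutations

tasks = ["boot", "save", "transmit", "shutdown"]
# (a, b) means task a must occur before task b
precedence = [("boot", "save"), ("boot", "transmit"),
              ("save", "shutdown"), ("transmit", "shutdown")]

def satisfies(order, constraints):
    """Check that a candidate schedule respects all precedence constraints."""
    pos = {t: i for i, t in enumerate(order)}
    return all(pos[a] < pos[b] for a, b in constraints)

valid = [order for order in permutations(tasks)
         if satisfies(order, precedence)]

# Validity: at least one schedule exists.
assert valid, "tasknet is unsatisfiable"

# Safety: in every valid schedule, "shutdown" is the final task
# (a stand-in for a linear temporal logic property over schedules).
safe = all(order[-1] == "shutdown" for order in valid)
print(len(valid), safe)  # → 2 True
```

An SMT-based tool avoids this factorial enumeration by encoding task positions as integer variables and constraints as assertions, which is what makes the approach scale beyond toy examples.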

Paola Spoletini

Professor of Software Engineering, Kennesaw State University

Bio: Paola Spoletini is a Professor at Kennesaw State University, where she serves her college as Associate Dean for Research. She received her Ph.D. from Politecnico di Milano, where she also held a postdoctoral position, before joining the University of Insubria as a faculty member. Her research spans requirements engineering, formal methods, and human-centric computing, with a focus on how people understand and communicate requirements and work with formal specifications. She has contributed foundational work on temporal logic, model checking, and LLM-supported requirements analysis. She received the Most Influential Paper Award at IEEE RE 2020 and has served as Program Chair of both IEEE RE and REFSQ, as well as in organizing and program committee roles at ICSE and other premier venues.

Title: You Can't Validate What You Don't Understand: LLMs as Explanation Partners for Formal RE

Abstract: Formal specifications are hard to write and hard to understand. Even for professionals with formal training, reasoning about the meaning of an LTL formula is non-trivial, and mistakes in specifications have consequences. The rise of LLMs has introduced a tempting shortcut: automate the translation from natural language to logic and remove the human from the loop entirely. This talk argues that this shortcut is dangerous and can negatively affect the correctness of our systems and the preparedness of our professionals, and introduces an alternative approach: using LLMs as explanation partners. Given an LTL formula, the LLM generates a natural language explanation designed to support human understanding, not to replace it. I will present empirical results supporting this approach, including a prompt engineering study across six LLMs and a classroom evaluation.

Panelists

Misty Davies

NASA

Bio: Dr. Misty Davies is a research engineer with expertise in safety and in the behavior of complex systems. She is currently the Acting Division Chief for the Intelligent Systems Division. Her graduate degrees are from Stanford University, where she earned a Ph.D. in computational mechanics and an M.S. in aeronautics and astronautics. She is a Fellow of the American Institute of Aeronautics and Astronautics and serves on several technical and standards committees.

Jyotirmoy Deshmukh

USC

Bio: Jyotirmoy V. Deshmukh is an Associate Professor in the Thomas Lord Department of Computer Science and the Department of Electrical and Computer Engineering in the Viterbi School of Engineering at the University of Southern California in Los Angeles, USA. Before joining USC, Jyo worked as a Principal Research Engineer in Toyota Motors North America R&D. He got his Ph.D. degree from the University of Texas at Austin and was a post-doctoral fellow at the University of Pennsylvania. Jyo's research interest is in the broad area of analysis, design, security and trustworthiness, and synthesis of cyber-physical and autonomous systems, including those that use machine learning and AI-based components.

Dimitra Giannakopoulou

AWS

Bio: Dimitra Giannakopoulou is a Senior Principal Applied Scientist with Amazon Web Services (AWS). Her work uses automated reasoning to ensure the correctness and security of distributed systems and, more recently, to detect hallucinations in artificial intelligence. Prior to that, she was a Senior Research Scientist with the NASA Ames Research Center, where she developed scalable verification and validation techniques for safety-critical systems. She holds a PhD in Distributed Computing from Imperial College London. Dr. Giannakopoulou was the technical lead for two NASA open source verification tools, jpf-psycho and FRET. She is the recipient of several awards, including a NASA Group Achievement Award, a NASA Exceptional Technology Achievement Medal, and an Automated Software Engineering conference Most Influential Paper Award. She has authored over 70 peer-reviewed publications and has delivered several keynote talks at international conferences.

Vandi Verma

NASA

Bio: Vandi Verma is a JPL Principal Engineer and a Program Area Manager for Strategy and Formation at the NASA Jet Propulsion Laboratory, California Institute of Technology. She also serves as Chief Engineer of Robotic Operations for Mars 2020. She was the Assistant Section Manager of the Mobility & Robotics section, the Supervisor of the Section Staff Group, and the Supervisor of the Operable Robotics Group. Her areas of expertise include space robotics, autonomous robots, and robotic operations. She has worked on robotics and AI research and technology development tasks and has deployed robots in the Arctic, Antarctica, and the Atacama Desert. She works on new robotics capabilities from research and development, mission design and prototyping, through flight development, testing and launch, to landing and surface operations. She is also the Principal Investigator of an Operations for Autonomy research task. She has been engaged in robotic operations on Mars since 2008 with the Mars Exploration Rovers Spirit and Opportunity, the Curiosity rover, the Perseverance rover, and the Ingenuity helicopter.

Timothy Wang

RTRC

Bio: Dr. Timothy E. Wang is in the AI Assurance team at RTX Technology Research Center (RTRC). He received his PhD, MS, and BS from the Department of Aerospace Engineering at Georgia Tech. He has more than 10 years of industrial research experience in the design, verification & validation, assurance, and certification of autonomous and control systems in the aerospace & defense domains, with a focus on formal analysis & synthesis techniques, contract theory for design and verification, optimization-based control & estimation algorithms, and assurance cases. In his free time, he enjoys skiing, especially in the Sierra.

Submission TL;DR

  • Format: LNCS (same as NFM; see the Springer Proceedings Guidelines)
  • Length: 2-4 pages
  • Submission site: OpenReview
  • Review model: Single blind
  • Special issue: A post-proceedings journal special issue will be organized for extended, full-paper versions of accepted abstracts.

Submission Guidelines

We invite extended abstracts of 2-4 pages in LNCS format. All submissions should be in English and fall into one of the following categories:

  • New and Emerging Work: Presents novel research in the focus areas of the workshop. Submissions will be evaluated primarily on "novelty".
  • Summary of Recent Results: Presents existing work and highlights its contribution in terms of relevance and impact in the focus areas of the workshop. Submissions will be evaluated primarily on "impact".

Please note that:

  • No Formal Proceedings: We welcome submissions of work that has already been presented or submitted elsewhere. No copyright transfer is required; we only request permission to post accepted abstracts on the workshop website.
  • Journal Special Issue: Authors of selected accepted abstracts will be invited to submit extended versions for a journal special issue.

All submissions will be reviewed by members of the Program Committee. The review process is single-blind: author identities need not be anonymized and are visible to PC members/reviewers, but reviewer identities are not visible to the authors. No special effort is required to anonymize content in the paper (such as references to the authors' prior work).

Submission will be via OpenReview. To submit a paper on OpenReview, you must first create a profile and log in to the system.

A post-proceedings journal special issue will be organized for extended, full-paper versions of accepted abstracts.

Policy on the use of Gen AI - same as NFM: We understand the convenience afforded by generative AI-based large language models for producing text in a submitted manuscript. However, we strongly encourage authors to check generated text for factual errors and inconsistencies, and to adopt appropriate standards for citing content obtained using generative AI (such as text, tables, and graphics). Use of AI-based coding assistants is permitted, and we encourage authors to disclose the use of such tools, as the community may find this scientifically interesting.

Organizers

Program Committee