-
Notifications
You must be signed in to change notification settings - Fork 0
Challenges identified for proof scores
Adrián Riesco edited this page Nov 5, 2025
·
6 revisions
Large Language Models (LLMs) have revolutionized programming by enabling developers to generate, debug, and optimize code more efficiently than ever before. However, the domain of formal specifications has not benefited equally from this explosion, mainly because there is insufficient publicly available data online to train such models effectively. As a result, it is worthwhile to investigate how existing general-purpose models can be refined to better support the creation of formal specifications in Maude. In particular, the COMFIA project aims to pursue this goal by leveraging the Maude and Dafny languages.
There are two problems in this challenge:
- There is not enough training data for automating the development of formal specification of systems.
- Further refinements of programs by IA are not guaranteed to be equivalent.
- Refining existing models to deal with formal languages. We propose Maude, a rewriting engine, and Dafny, a theorem prover, as starting point.
- Using a formal verification and the corresponding AST as basis for programs that can be later modified by LLMs.
- On the one hand, we have LLMs, that will be refined in order to help developers with their specifications.
- On the other hand, we have the specification/verification languages Maude and Dafny, which have been chosen as case studies for improving its use via LLMs.
More information about this project is available here.