SESARLab/llm-certification

A Certification Scheme for Large Language Models-Based Applications

CC BY 4.0

Nicola Bena, Marco Anisetti, Ernesto Damiani, Alex Della Bruna, Chan Yeob Yeun, Claudio A. Ardagna

The advent of Large Language Models (LLMs) is revolutionizing the design and deployment of modern applications, enabling intelligent, adaptive, and context-aware services across a wide range of domains. These AI-driven components now coexist with legacy systems, microservices, and nanoservices, forming complex and evolving ecosystems. While LLMs offer unprecedented capabilities, they also introduce new risks related to security, privacy, and ethics, especially due to their probabilistic nature and dependence on vast, often opaque, training data. This paradigm shift calls for robust mechanisms to assess and verify the non-functional behavior of LLM-based applications. Assurance emerges as a key strategy, with certification techniques traditionally used to validate non-functional properties. However, existing certification approaches for LLMs are still in their infancy, while traditional methods prove inadequate in this context. In this paper, we propose a multi-dimensional certification scheme for LLM-based applications that initially captures the broader context and dynamic behavior introduced by LLMs and other AI components. To this end, after proposing a taxonomy of LLM-based applications and discussing the gaps in LLM assessment and verification, we define a certification model, where a hypergraph represents a specific behavior supporting the property to be certified and drives the evidence collection at the basis of the corresponding certification process. We then instantiate the proposed approach in different use cases and experimentally evaluate it using an LLM-based application that implements a recommendation task in a security-critical scenario.

Overview

This repository contains the source code, input dataset, intermediate results, and detailed results of our experimental evaluation.
