LLM-Based Metamorphic Fuzz Oracles
Abstract
Fuzz drivers are important components of greybox fuzzing; they encapsulate the target interface of the tested library, define the test space, and largely determine the outcomes of fuzzing. Fuzz drivers rely on (fuzz) oracles to detect potential bugs. As a common practice, existing fuzz drivers focus on security testing and adopt program crashes as the oracle. Although general and practical, this practice overlooks the functionality of the library under test, thereby restricting the bug-mining capability of greybox fuzzing. In this paper, we present the first study on metamorphic-based fuzz oracle enhancement (MFOE), which aims to improve existing fuzz drivers by incorporating metamorphic-based oracles. These oracles are grounded in metamorphic relations (MRs), which describe the expected relationships between multiple inputs and their corresponding outputs. However, both constructing and integrating metamorphic-based oracles require a deep understanding of the fuzz target, which makes automatic MFOE particularly challenging. Modern large language models (LLMs), with their advanced capabilities in code understanding and generation, offer a promising way to overcome this obstacle, which motivates us to develop an LLM-based MFOE framework. We call this framework MetaFOE and extensively evaluate it on fuzz drivers selected from OSS-Fuzz. To conduct a comprehensive investigation, we configure MetaFOE with three modern LLMs and five prompt strategies. In total, we generate 3,475 MRs, 77.3% of which are applicable. Building on these MRs, we implement 12,351 meta drivers (i.e., drivers incorporating metamorphic-based oracles), of which 6,228 are valid. After three hours of fuzzing, these valid meta drivers achieve an average 18.7% improvement in edge coverage and trigger 1,528 unique crashes.
Our study highlights the necessity of equipping fuzz drivers with metamorphic-based oracles and demonstrates the feasibility of LLM-based automatic MFOE, offering valuable insight to the fuzzing community.
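To make the difference between a crash oracle and a metamorphic-based oracle concrete, the following is a minimal sketch and not MetaFOE's implementation. It assumes a round-trip MR over Python's standard zlib module (decompressing a compressed input must return the original input at every compression level). The function names crash_only_driver, round_trip_holds, and meta_driver are hypothetical, and Python is used only for brevity; OSS-Fuzz drivers are often written in C or C++.

```python
import zlib


def crash_only_driver(data: bytes) -> None:
    """Conventional driver: only an unexpected crash would signal a bug."""
    try:
        zlib.decompress(data)
    except zlib.error:
        pass  # rejecting malformed input is expected behavior, not a bug


def round_trip_holds(data: bytes, compress, decompress) -> bool:
    """Assumed MR for illustration: decompress(compress(x, level)) == x.

    The relation ties together several executions (one per level), so it
    can flag wrong output even when nothing crashes.
    """
    return all(decompress(compress(data, level)) == data for level in (1, 6, 9))


def meta_driver(data: bytes) -> None:
    """Meta driver: keeps the crash oracle and adds the metamorphic oracle."""
    crash_only_driver(data)
    if not round_trip_holds(data, zlib.compress, zlib.decompress):
        # Raising turns a silent functional bug into a fuzzer-visible failure.
        raise AssertionError("metamorphic relation violated")


if __name__ == "__main__":
    meta_driver(b"example input " * 8)
```

A fuzzer would call meta_driver on each generated input. A functional bug that silently corrupts output never reaches the crash oracle, but it violates the relation and is reported as a failure.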