FlowChronicle: Synthetic Network Flow Generation through Pattern Set Mining
Published in 20th International Conference on emerging Networking EXperiments and Technologies (CoNEXT), 2024
Network traffic datasets are regularly criticized, notably for the lack of realism and diversity in their attack or benign traffic. Generating synthetic network traffic using generative machine learning techniques is a recent area of research that could complement experimental test beds and help assess the efficiency of network security tools such as network intrusion detection systems. Most methods generating synthetic network flows disregard the temporal dependencies between them, leading to unrealistic traffic. To address this issue, we introduce FlowChronicle, a novel synthetic network flow generation tool that relies on pattern mining and statistical models to preserve temporal dependencies. We empirically compare our method against state-of-the-art techniques on several criteria, namely realism, diversity, compliance, and novelty. This evaluation demonstrates the capability of FlowChronicle to achieve high-quality generation while significantly outperforming the other methods in preserving temporal dependencies between flows. Besides, in contrast to deep learning methods, the patterns identified by FlowChronicle are explainable, and experts can verify their soundness. Our work substantially advances synthetic network traffic generation, offering a method that enhances both the utility and trustworthiness of the generated network flows.
Recommended citation: Cüppers, J., Schoen, A., Blanc, G. & Gimenez, P. F., (2024, December). FlowChronicle: Synthetic Network Flow Generation through Pattern Set Mining Generation. In the 20th International Conference on emerging Networking EXperiments and Technologies (CoNEXT).