DISCOVERING SANDS: AI AND NETWORK DEFENSE SANDBOX
This project, undertaken through a CMU Practicum in partnership with 99P Labs, establishes a secure environment for investigating and mitigating emerging threats in AI-driven systems. Dubbed “SANDS,” the Security and AI Network Defense Sandbox addresses the vulnerabilities that arise as AI applications grow more sophisticated and widespread. By combining a reproducible AWS setup with controlled virtual machine deployments, the project lets researchers, engineers, and innovators safely explore attack vectors and test corresponding defense strategies without endangering real-world infrastructure.
SANDS begins with an automated AWS deployment strategy that provisions an isolated Virtual Private Cloud using Terraform. This approach, augmented by robust security configurations, creates a controlled testing environment where users can conduct experiments without direct exposure to external networks. Docker containers preloaded with essential AI and ML toolsets further streamline the process, allowing fast setup of Jupyter notebooks and other frameworks for advanced experimentation. The result is a reproducible sandbox that empowers investigators to focus on discovering vulnerabilities and defending against them, rather than grappling with time-consuming infrastructure tasks.
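The isolated-VPC approach described above can be sketched as a minimal Terraform configuration. This is an illustrative fragment only: the region, CIDR blocks, and resource names are assumptions, not the project's actual values.

```hcl
# Illustrative sketch only: region, CIDR ranges, and names are assumed.
provider "aws" {
  region = "us-east-1"
}

# Isolated VPC for the sandbox. No internet gateway is attached,
# so instances inside have no direct route to external networks.
resource "aws_vpc" "sands" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_support   = true
  enable_dns_hostnames = true

  tags = {
    Name = "sands-sandbox"
  }
}

resource "aws_subnet" "sandbox" {
  vpc_id     = aws_vpc.sands.id
  cidr_block = "10.0.1.0/24"
}

# Restrictive security group: no ingress rules at all, and egress
# limited to traffic that stays inside the VPC.
resource "aws_security_group" "isolated" {
  vpc_id = aws_vpc.sands.id

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["10.0.0.0/16"]
  }
}
```

Keeping the entire environment in configuration like this is what makes the sandbox reproducible: tearing it down and recreating it is a single `terraform apply` rather than a manual checklist.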
A core element of SANDS is its comprehensive logging and monitoring infrastructure, built on open-source solutions like Wazuh and Suricata. These tools capture detailed logs of network behavior, file system activity, and model interactions, providing critical insights into how AI models function under stress or malicious influence. By examining real-time data flows, researchers can detect unusual patterns—such as suspicious traffic or unexpected file modifications—that might indicate attacks like data poisoning or prompt injection. This rigorous visibility lays the groundwork for a proactive security posture, enabling the sandbox to serve as both a testing ground and an educational platform for those seeking to understand AI threats in depth.
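As a concrete illustration of the monitoring side, the sketch below filters Suricata's EVE JSON event stream (its default newline-delimited log format) down to alert records. The field names (`event_type`, `alert.signature`, `src_ip`, `dest_ip`) come from Suricata's standard EVE output; how SANDS actually consumes these logs via Wazuh is not shown here.

```python
import json

def extract_alerts(eve_lines):
    """Reduce Suricata EVE JSON lines to a list of alert summaries.

    Each input line is one JSON-encoded event; anything that is not
    valid JSON or not an 'alert' event is skipped.
    """
    alerts = []
    for line in eve_lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue
        if event.get("event_type") == "alert":
            alerts.append({
                "timestamp": event.get("timestamp"),
                "signature": event.get("alert", {}).get("signature"),
                "src_ip": event.get("src_ip"),
                "dest_ip": event.get("dest_ip"),
            })
    return alerts
```

In practice a loop like this would tail `eve.json` continuously and forward suspicious entries to a dashboard or alerting pipeline rather than collect them in memory.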
A defining feature of the project is the suite of “playbooks” designed to demonstrate and document real-world AI attacks. Examples include using the Fast Gradient Sign Method (FGSM) to manipulate image classifiers, jailbreaking large language models through prompt injection, and probing for vulnerabilities like Server-Side Template Injection (SSTI) within AI pipelines. These playbooks come preconfigured within the SANDS environment, enabling users to reproduce sophisticated attacks step by step, analyze model behavior, and refine threat detection methods. This blend of practical exercises and robust infrastructure helps train a new generation of AI security specialists who can anticipate novel attack techniques and harden defenses against them.
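To make the FGSM playbook concrete, here is a minimal NumPy sketch of the attack against a toy logistic-regression classifier. The weights and input are made-up values for illustration; the SANDS playbooks target real image classifiers, but the core step is the same: perturb the input in the direction of the sign of the loss gradient.

```python
import numpy as np

def fgsm_perturb(x, grad_wrt_x, epsilon=0.1):
    """FGSM: nudge each input feature by epsilon in the direction
    that increases the loss, i.e. the sign of the gradient."""
    return x + epsilon * np.sign(grad_wrt_x)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical fixed model and input (illustrative values only).
w = np.array([2.0, -1.0, 0.5])   # model weights
x = np.array([1.0, 0.0, 1.0])    # clean input
y = 1.0                          # true label

# For logistic regression with cross-entropy loss, the gradient of
# the loss with respect to the input is (p - y) * w.
p = sigmoid(w @ x)
grad_x = (p - y) * w

x_adv = fgsm_perturb(x, grad_x, epsilon=0.5)
p_adv = sigmoid(w @ x_adv)
# p_adv < p: the perturbed input lowers the model's confidence
# in the true class, even though x_adv differs from x by at most 0.5
# per feature.
```

The same one-line perturbation, applied to the pixel gradients of a neural image classifier, is what flips a confidently correct prediction to a wrong one while the image still looks unchanged to a human.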
In summary, the SANDS project offers a powerful, isolated ecosystem where AI and security intersect. By providing a reproducible sandbox with extensive monitoring capabilities, automated deployment, and guided playbooks, it stands as a critical innovation for addressing the rising complexity of AI vulnerabilities. Through hands-on exploration, interactive research, and continuous adaptation, this environment equips practitioners and learners alike to assess, counter, and stay ahead of malicious exploits in an ever-evolving AI landscape.
Stay Connected
Follow our journey on Medium and LinkedIn.