4 matches found
XekRung Technical Report
We present XekRung, a frontier large language model for cybersecurity, designed to provide comprehensive security capabilities. To achieve this, we develop diverse data synthesis pipelines tailored to the cybersecurity domain, enabling the scalable construction of high-quality training data and...
Do Agents Dream of Root Shells? Partial-Credit Evaluation of LLM Agents in Capture the Flag Challenges
Large Language Model LLM agents are increasingly proposed for autonomous cybersecurity tasks, but their capabilities in realistic offensive settings remain poorly understood. We present DeepRed, an open-source benchmark for evaluating LLM-based agents on realistic Capture The Flag CTF challenges ...
Llama-3.1-FoundationAI-SecurityLLM-Reasoning-8B Technical Report
We present Foundation-Sec-8B-Reasoning, the first open-source native reasoning model for cybersecurity. Built upon our previously released Foundation-Sec-8B base model derived from Llama-3.1-8B-Base, the model is trained through a two-stage process combining supervised fine-tuning SFT and...
Meta’s Purple Llama wants to test safety risks in AI models
Meta has announced Purple Llama, a project that aims to "bring together tools and evaluations to help the community build responsibly with open generative AI models." Generative Artificial Intelligence AI models have been around for years and their main function, compared to older AI models is th...