Automatic Cyber Threat Intelligence Knowledge Graph Construction Using Large Language Models

CTINexus is a framework that leverages optimized in-context learning (ICL) of large language models (LLMs) for automatic cyber threat intelligence (CTI) knowledge extraction and cybersecurity knowledge graph (CSKG) construction. CTINexus adapts to various cybersecurity ontologies with minimal annotated examples and provides a user-friendly web interface for instant threat intelligence analysis.

What CTINexus Does

The framework automatically processes unstructured threat intelligence reports to:

Extract cybersecurity entities (malware, vulnerabilities, tactics, IOCs)
Identify relationships between security concepts
Construct knowledge graphs with interactive visualizations
Require minimal configuration - no extensive data or parameter tuning needed

Core Components

Intelligence Extraction (IE): Automatically extracts cybersecurity entities and relationships from unstructured text using optimized prompt construction and demonstration retrieval
Hierarchical Entity Alignment: Canonicalizes extracted knowledge and removes redundancy through:
- Entity Typing (ET): Groups mentions of the same semantic type
- Entity Merging (EM): Merges mentions referring to the same entity with IOC (Indicator of Compromise) protection
Link Prediction (LP): Predicts and adds missing relationships to complete the knowledge graph
Graph Visualization: Interactive network visualization of the constructed cybersecurity knowledge graph

News

📦 [2025/09/03] CTINexus Python package released! Install with pip install ctinexus for seamless integration into your Python projects.

🌟 [2025/07/29] CTINexus now features an intuitive Gradio interface! Submit threat intelligence text and instantly visualize extracted interactive graphs.

🔥 [2025/04/21] We released the camera-ready paper on arxiv.

🔥 [2025/02/12] CTINexus is accepted at 2025 IEEE European Symposium on Security and Privacy (Euro S&P).

Quick Start

You can use CTINexus in three ways:

📦 Python Package: Python package for easy integration
⚡ Command Line: For automation and batch processing → 📖 CLI Guide
🖥️ Web Interface: User-friendly GUI for interactive analysis (follow the setup below)

Supported Models

CTINexus supports the following AI providers: OpenAI, Gemini, AWS, Ollama

All models from these providers are supported. If you would like to see additional providers integrated, please open a feature request issue here.

📦 Using as a Python Package

CTINexus can be used as a Python library for seamless integration into your projects.

Installation

pip install ctinexus

Configuration

Before using CTINexus, you need to configure API keys. Create a .env file in your project directory with your credentials. Look at the example env for reference.

Usage

from ctinexus import process_cti_report
from dotenv import load_dotenv

load_dotenv()

# Example usage
text = "Your CTI text here"

result = process_cti_report(
    text=text,
    provider="openai",  # optional: auto-detected if not specified
    model="gpt-4",      # optional: uses default if not specified
    similarity_threshold=0.6,
    output="results.json"  # optional: save results to file
)

# Access results
print(f"Graph:", result["entity_relation_graph"])
# Outputs the html file with the graph visualization.
# Open the html file on your browser to see the results.

Parameters

text (str): The threat intelligence report text to process
provider (str, optional): AI provider ("openai", "gemini", "aws", "ollama"). Auto-detected from available keys if not specified
model (str, optional): Specific model name (e.g., "gpt-4", "gemini-pro")
embedding_model (str, optional): Model for embeddings
ie_model, et_model, ea_model, lp_model (str, optional): Specific models for each pipeline component
similarity_threshold (float, default 0.6): Threshold for entity similarity matching
output (str, optional): File path to save JSON results

Return Value

Returns a dictionary containing the complete CTI analysis results:

text: The original input text
IE: Intelligence Extraction results with:
- triplets: Raw extracted subject-relation-object triplets
ET: Entity Typing results with:
- typed_triplets: Triplets with entity type classifications (Malware, Vulnerability, Infrastructure, etc.)
EA: Entity Alignment results with:
- aligned_triplets: Triplets with merged entities and canonical entity IDs
LP: Link Prediction results with:
- predicted_links: Additional predicted relationships between entities
entity_relation_graph: File path to the interactive HTML visualization

🐍 Local Development Setup

For users who want to run the web interface, use the command line interface, or contribute to the project, you'll need to clone the repository and set up the development environment.

Prerequisites

API Key from one of the supported providers: OpenAI, Gemini, AWS, or Ollama (local, free)
Python 3.11+ and pip

Step 1: Clone the Repository

git clone https://github.com/peng-gao-lab/CTINexus.git
cd CTINexus

Step 2: Configure API Keys

Create a .env file in the project root:

cp .env.example .env

Edit the .env file with your API credentials.

Note: You only need to set up one provider. If using Ollama, see the Ollama Guide.

Step 3: Setup Python Environment

# Create virtual environment
python -m venv .venv

# Activate virtual environment (macOS/Linux:)
source .venv/bin/activate

# On Windows:
# .venv\Scripts\activate

# Install package
pip install -e .

Step 4: Run the Application

ctinexus

Step 5: Access the Application

Open your browser and navigate to: http://127.0.0.1:7860

Use Ctrl+C in the terminal to stop the application.

🐳 Docker Setup

For containerized deployment or development:

Prerequisites

Docker and Docker Compose installed
API keys configured (see Local Development Setup above)

Step 1: Clone the Repository

git clone https://github.com/peng-gao-lab/CTINexus.git
cd CTINexus

Step 2: Configure API Keys

Create a .env file as described in the Local Development Setup section.

Step 3: Launch with Docker

# Build and start
docker compose up --build

# Or run in detached mode (runs in background)
docker compose up -d --build

Step 4: Access the Application

Open your browser and navigate to: http://localhost:8000

Step 5: Stop the Application

docker compose down

Web Interface and CLI Usage

After setting up the local environment, you can use CTINexus through the web interface or command line.

⚡ Command Line Interface (CLI)

For automation and batch processing:

ctinexus --input-file report.txt

📖 Complete CLI Documentation - Detailed usage examples and options.

🖥️ Web Interface (GUI)

Once the application is running:

Open your browser to the appropriate URL:
- Docker: http://localhost:8000
- Local: http://127.0.0.1:7860
Paste threat intelligence text into the input area
Select your preferred AI model from the dropdown
Click "Run" to analyze the text
View results:
- Extracted Entities: Identified cybersecurity entities
- Relationships: Discovered connections between entities
- Interactive Graph: Network visualization
- Export Options: Download results as JSON or images

Contributing

We warmly welcome contributions from the community! Whether you're interested in:

🐛 Fixing bugs or adding new features
📖 Improving documentation or adding examples
🎨 UI/UX enhancements for the web interface

Please check out our Contributing Guide for detailed information on how to get started, development setup, and submission guidelines.

Citation

@inproceedings{cheng2025ctinexusautomaticcyberthreat,
      title={CTINexus: Automatic Cyber Threat Intelligence Knowledge Graph Construction Using Large Language Models},
      author={Yutong Cheng and Osama Bajaber and Saimon Amanuel Tsegai and Dawn Song and Peng Gao},
      booktitle={2025 IEEE European Symposium on Security and Privacy (EuroS\&P)},
      year={2025},
      organization={IEEE}
}

License

The source code is licensed under the MIT License. We warmly welcome industry collaboration. If you’re interested in building on CTINexus or exploring joint initiatives, please email [email protected] or [email protected], we’d be happy to set up a brief call to discuss ideas.

Name		Name	Last commit message	Last commit date
Latest commit History 36 Commits
.github		.github
ctinexus		ctinexus
docs		docs
.env.example		.env.example
.gitignore		.gitignore
.pre-commit-config.yaml		.pre-commit-config.yaml
CONTRIBUTING.md		CONTRIBUTING.md
Dockerfile		Dockerfile
LICENSE.txt		LICENSE.txt
MANIFEST.in		MANIFEST.in
README.md		README.md
docker-compose.yml		docker-compose.yml
pyproject.toml		pyproject.toml
requirements.txt		requirements.txt

License

peng-gao-lab/ctinexus

Folders and files

Latest commit

History

Repository files navigation

Automatic Cyber Threat Intelligence Knowledge Graph Construction Using Large Language Models

What CTINexus Does

Core Components

News

Quick Start

Supported Models

📦 Using as a Python Package

Installation

Configuration

Usage

Parameters

Return Value

🐍 Local Development Setup

Prerequisites

Step 1: Clone the Repository

Step 2: Configure API Keys

Step 3: Setup Python Environment

Step 4: Run the Application

Step 5: Access the Application

🐳 Docker Setup

Prerequisites

Step 1: Clone the Repository

Step 2: Configure API Keys

Step 3: Launch with Docker

Step 4: Access the Application

Step 5: Stop the Application

Web Interface and CLI Usage

⚡ Command Line Interface (CLI)

🖥️ Web Interface (GUI)

Contributing

Citation

License

About

Topics

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Contributors 5

Uh oh!

Languages