Neptune - PyTorch integration


The Neptune-PyTorch integration simplifies experiment tracking for PyTorch by automatically logging model internals, including activations, gradients, and parameters, to Neptune.

Installation

pip install -U neptune-pytorch

Requirements

  • Neptune 3.x: A Neptune 3.x account; see the Getting Started Guide for setup instructions.
  • Python 3.10+: Minimum supported Python version
  • PyTorch 1.11+: For tensor operations and model support
  • NumPy 1.20+: For numerical computations
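
To verify your environment meets these requirements, a quick check along these lines works (a sketch; adjust to your setup):

import sys

import numpy as np
import torch

assert sys.version_info >= (3, 10), "Python 3.10+ required"
print("PyTorch:", torch.__version__)  # Expect 1.11 or newer
print("NumPy:", np.__version__)       # Expect 1.20 or newer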

Quickstart

The quickstart example below logs the following data to Neptune:

  • Model architecture: Visual diagram and summary of the neural network
  • Training metrics: Loss curves and epoch progress
  • Layer activations: Mean, std, norm, and histograms for each layer
  • Gradient analysis: Gradient statistics to detect vanishing/exploding gradients
  • Parameter tracking: Weight and bias distributions over time

import torch
import torch.nn as nn
import torch.optim as optim
from neptune_scale import Run
from neptune_pytorch import NeptuneLogger

# Initialize Neptune run
run = Run(project="your-project/experiment-tracking")

# Create your PyTorch model
model = nn.Sequential(
    nn.Linear(784, 128),
    nn.ReLU(),
    nn.Linear(128, 10)
)

# Initialize Neptune logger with model tracking
neptune_logger = NeptuneLogger(
    run=run,
    model=model,
    base_namespace="mnist_classification",  # Organizes all metrics under this folder
    log_model_diagram=True,  # Generates model architecture diagram
)

# Training setup (assumes `train_loader`, a torch DataLoader, is defined)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters())
num_epochs = 5

# Training loop with comprehensive tracking
for epoch in range(num_epochs):
    for batch_idx, (data, target) in enumerate(train_loader):
        # Forward pass
        output = model(data)
        loss = criterion(output, target)

        # Backward pass
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        # Use a step counter that doesn't reset at each epoch
        step = epoch * len(train_loader) + batch_idx

        # Log training metrics to Neptune
        run.log_metrics(
            data={
                f"{neptune_logger.base_namespace}/batch/loss": loss.item(),
                f"{neptune_logger.base_namespace}/epoch": epoch,
            },
            step=step,
        )

        # Track model internals every 10 steps
        if step % 10 == 0:
            neptune_logger.log_model_internals(
                step=step,
                prefix="train",
                track_activations=True,   # Monitor activation patterns
                track_gradients=True,     # Track gradient flow
                track_parameters=True     # Log parameter statistics
            )

Advanced configuration

The example below demonstrates the following additional features:

  • Layer filtering: Only track Conv2d and Linear layers (reduces overhead)
  • Custom statistics: Compute only mean, norm, and hist instead of all 8 statistics
  • Phase-specific tracking: Different tracking strategies for training and validation
  • Frequency control: Track every 20 steps in training and every 50 in validation

import torch
import torch.nn as nn
import torch.optim as optim
from neptune_scale import Run
from neptune_pytorch import NeptuneLogger

# Initialize Neptune run
run = Run(project="your-project/advanced-tracking")

# Create a more complex model (e.g., CNN for image classification)
class CNNModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 32, 3, padding=1)
        self.conv2 = nn.Conv2d(32, 64, 3, padding=1)
        self.fc1 = nn.Linear(64 * 8 * 8, 128)  # 8x8 feature maps after two 2x2 poolings of 32x32 input
        self.fc2 = nn.Linear(128, 10)
        self.relu = nn.ReLU()
        self.pool = nn.MaxPool2d(2)

    def forward(self, x):
        x = self.pool(self.relu(self.conv1(x)))
        x = self.pool(self.relu(self.conv2(x)))
        x = x.view(x.size(0), -1)
        x = self.relu(self.fc1(x))
        x = self.fc2(x)
        return x

model = CNNModel()

# Advanced Neptune logger configuration
neptune_logger = NeptuneLogger(
    run=run,
    model=model,
    base_namespace="cnn_experiment",  # Custom organization folder
    track_layers=[nn.Conv2d, nn.Linear],  # Only track conv and linear layers
    tensor_stats=["mean", "norm", "hist"],  # Subset of the 8 available statistics
    log_model_diagram=True,  # Log model summary and diagram
)

# Training setup (assumes `train_loader` and `val_loader` DataLoaders are defined)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters())
num_epochs = 5

# Training with phase-specific tracking
for epoch in range(num_epochs):
    # Training phase - comprehensive tracking
    model.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        # ... your training code ...
        output = model(data)
        loss = criterion(output, target)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        # Track everything during training, using a step counter that
        # doesn't reset at each epoch
        step = epoch * len(train_loader) + batch_idx
        if step % 20 == 0:  # Every 20 steps
            neptune_logger.log_model_internals(
                step=step,
                prefix="train",
                track_activations=True,   # Monitor activation patterns
                track_gradients=True,     # Track gradient flow
                track_parameters=True     # Log parameter statistics
            )

    # Validation phase - lightweight tracking
    model.eval()
    with torch.no_grad():
        for batch_idx, (data, target) in enumerate(val_loader):
            # ... your validation code ...
            output = model(data)
            val_loss = criterion(output, target)

            # Only track activations during validation (faster)
            step = epoch * len(val_loader) + batch_idx
            if step % 50 == 0:  # Every 50 steps
                neptune_logger.log_model_internals(
                    step=step,
                    prefix="validation",
                    track_activations=True,   # Monitor activation patterns
                    track_gradients=False,    # Skip gradients (no backward pass)
                    track_parameters=False    # Skip parameters (expensive)
                )

Features

Model monitoring

  • Layer activations: Track activation patterns across all layers with 8 different statistics
  • Gradient analysis: Monitor gradient flow and detect vanishing/exploding gradients (see the sketch after this list)
  • Parameter tracking: Log parameter statistics and distributions for model analysis
  • Custom statistics: Choose from mean, std, norm, min, max, var, abs_mean, and hist
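
For intuition about the gradient-analysis signal, here is a minimal pure-PyTorch sketch, independent of this integration, of the per-layer gradient norms that make vanishing or exploding gradients visible:

import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))
loss = model(torch.randn(4, 8)).sum()
loss.backward()

# Per-parameter gradient L2 norms: values collapsing toward zero suggest
# vanishing gradients; rapidly growing values suggest exploding gradients.
for name, param in model.named_parameters():
    print(name, param.grad.norm().item())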

Configuration options

  • Layer filtering: Track only specific layer types (Conv2d, Linear, etc.)
  • Phase organization: Separate tracking for training/validation phases with custom prefixes
  • Custom namespaces: Organize experiments with custom folder structures

Visualizations

  • Model architecture: Automatic model diagram generation with torchviz
  • Distribution histograms: 50-bin histograms for tracked activations, gradients, and parameters
  • Real-time monitoring: Live tracking during training with Neptune
  • Comparative analysis: Easy comparison across experiments and runs

Integration

  • Minimal setup: Simple integration with existing code
  • PyTorch native: Works with existing PyTorch workflows

Performance optimization

Since parameter logging can be expensive for large models, you can control the frequency explicitly:

for step in range(num_steps):
    # ... training code ...

    # Log lightweight metrics every step
    neptune_logger.log_model_internals(
        step=step,
        track_activations=True,
        track_gradients=True,
        track_parameters=False  # Skip expensive parameter logging
    )

    # Log expensive parameters less frequently
    if step % 100 == 0:
        neptune_logger.log_model_internals(
            step=step,
            track_activations=False,
            track_gradients=False,
            track_parameters=True
        )

Namespace structure

The integration organizes all logged data under a clear, customizable, hierarchical namespace:

{base_namespace}/                   # Optional custom top-level folder
├── batch/
│   └── loss                        # Training loss per batch (logged by the user)
├── model/
│   ├── summary                     # Model architecture (if log_model_diagram=True)
│   └── internals/                  # Model internals tracking
│       └── {prefix}/               # Optional prefix (e.g., "train", "validation")
│           ├── activations/        # Layer activations
│           │   └── {layer_name}/
│           │       ├── mean        # Mean activation value
│           │       ├── std         # Standard deviation
│           │       ├── norm        # L2 norm
│           │       ├── min         # Minimum value
│           │       ├── max         # Maximum value
│           │       ├── var         # Variance
│           │       ├── abs_mean    # Mean of absolute values
│           │       └── hist        # Histogram (50 bins)
│           ├── gradients/          # Layer gradients
│           │   └── {layer_name}/
│           │       └── {statistic} # Same statistics as activations
│           └── parameters/         # Model parameters
│               └── {layer_name}/
│                   └── {statistic} # Same statistics as activations

Example namespaces:

With base_namespace="my_experiment":

  • my_experiment/batch/loss - Training loss
  • my_experiment/model/summary - Model architecture
  • my_experiment/model/internals/activations/conv/1/mean - Mean activation (no prefix)
  • my_experiment/model/internals/train/activations/conv/1/mean - Mean activation (with "train" prefix)
  • my_experiment/model/internals/validation/gradients/linear1/norm - L2 norm of gradients (with "validation" prefix)

With base_namespace=None:

  • batch/loss - Training loss
  • model/summary - Model architecture
  • model/internals/activations/conv/1/mean - Mean activation (no prefix)
  • model/internals/train/activations/conv/1/mean - Mean activation (with "train" prefix)
  • model/internals/validation/gradients/linear1/norm - L2 norm of gradients (with "validation" prefix)

Layer name handling:

  • Dots in layer names are automatically replaced with forward slashes for proper namespace organization
  • Example: seq_model.0.weight becomes seq_model/0/weight in the namespace
  • Example: module.submodule.layer becomes module/submodule/layer in the namespace
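
A one-line sketch of that substitution (the integration applies the equivalent transformation internally; `to_namespace` is a hypothetical helper for illustration):

def to_namespace(layer_name: str) -> str:
    # Replace dots with slashes so each name segment becomes a namespace level
    return layer_name.replace(".", "/")

print(to_namespace("seq_model.0.weight"))      # seq_model/0/weight
print(to_namespace("module.submodule.layer"))  # module/submodule/layer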

Available statistics: mean, std, norm, min, max, var, abs_mean, hist

API reference

NeptuneLogger

NeptuneLogger(
    run: Run,
    model: torch.nn.Module,
    base_namespace: Optional[str] = None,
    track_layers: Optional[List[Type[nn.Module]]] = None,
    tensor_stats: Optional[List[TensorStatType]] = None,
    log_model_diagram: bool = False
)

Parameters:

  • run: Neptune run object for logging
  • model: PyTorch model to track
  • base_namespace: Optional top-level folder for organization (default: None)
  • track_layers: List of layer types to track (default: None = all layers)
  • tensor_stats: Statistics to compute (default: None, which computes ["mean", "norm", "hist"])
  • log_model_diagram: Log the model summary and diagram (default: False)
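
To illustrate the track_layers semantics, the following sketch filters a model's modules by type, an assumption consistent with the examples above rather than the library's actual internals:

import torch.nn as nn

model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU(), nn.Flatten(), nn.Linear(8, 10))
track_layers = [nn.Conv2d, nn.Linear]

# Keep only modules whose type matches an entry in track_layers;
# ReLU and Flatten are skipped.
tracked = [name for name, module in model.named_modules()
           if isinstance(module, tuple(track_layers))]
print(tracked)  # ['0', '3']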

log_model_internals()

log_model_internals(
    step: int,
    track_activations: bool = True,
    track_gradients: bool = True,
    track_parameters: bool = False,
    prefix: Optional[str] = None
)

Parameters:

  • step: Current training step for logging
  • track_activations: Track layer activations (default: True)
  • track_gradients: Track layer gradients (default: True)
  • track_parameters: Track model parameters (default: False)
  • prefix: Optional phase identifier (e.g., "train", "validation")

Available statistics

Statistic   Description                Use case
mean        Mean value                 Monitor activation levels
std         Standard deviation         Detect activation variance
norm        L2 norm                    Monitor gradient/activation magnitude
min         Minimum value              Detect dead neurons
max         Maximum value              Detect saturation
var         Variance                   Monitor activation spread
abs_mean    Mean of absolute values    Monitor activation strength
hist        50-bin histogram           Visualize distributions
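
To make the statistics concrete, this sketch computes each of them for a tensor in plain PyTorch (an approximation of what gets logged, not the integration's exact implementation):

import torch

t = torch.randn(1024)

stats = {
    "mean": t.mean().item(),            # Mean value
    "std": t.std().item(),              # Standard deviation
    "norm": t.norm().item(),            # L2 norm
    "min": t.min().item(),              # Minimum value
    "max": t.max().item(),              # Maximum value
    "var": t.var().item(),              # Variance
    "abs_mean": t.abs().mean().item(),  # Mean of absolute values
}
hist = torch.histc(t, bins=50)          # 50-bin histogram ("hist")
print(stats)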

Contributing

Contributions to neptune-pytorch are welcome. Here's how you can help:

Report issues

  • Found a bug? Open an issue
  • Include Python version, PyTorch version, and error traceback
  • Provide a minimal reproducible example

Suggest features

  • Have an idea? Create a feature request
  • Describe the use case and expected behavior
  • Check existing issues first to avoid duplicates

Contribute code

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature/amazing-feature
  3. Make your changes and add tests
  4. Run tests: pytest tests/
  5. Commit your changes: git commit -m 'Add amazing feature'
  6. Push to remote: git push origin feature/amazing-feature
  7. Open a Pull Request


License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.


Made with ❤️ by the Neptune team
