50 commits
8e1621f
archiving old docs
jgerh Jun 25, 2025
b62cd28
staging site
jgerh Jun 25, 2025
83d9f4f
updates to first draft
jgerh Jun 25, 2025
6b28a1a
revision check 1
jgerh Jun 26, 2025
2f56808
revision check 2
jgerh Jun 26, 2025
b0f3a73
revision check 2
jgerh Jun 26, 2025
1ba0b98
revision check 3
jgerh Jun 26, 2025
a583c2c
revision check 3
jgerh Jun 27, 2025
630b95f
revisions 4
jgerh Jun 27, 2025
2e6fc76
revision check 5
jgerh Jun 27, 2025
c36c81b
revision check 5
jgerh Jun 27, 2025
4c89fe3
revision check 5
jgerh Jun 27, 2025
d8bf26a
revision check
jgerh Jun 27, 2025
6b415e8
revision check 5
jgerh Jun 27, 2025
7df7b66
revision check 6
jgerh Jun 27, 2025
215ebb7
revision check 6
jgerh Jun 27, 2025
6052b36
revision check 6
jgerh Jun 27, 2025
1106124
revision check 7
jgerh Jun 27, 2025
eed2f3e
revison check 7
jgerh Jun 28, 2025
fdf549a
revision check 8
jgerh Jun 30, 2025
d4b8df4
revision check 8
jgerh Jun 30, 2025
4d4e998
revision check 8
jgerh Jun 30, 2025
792ba0c
revision check 8
jgerh Jun 30, 2025
3083d7d
revisoin check 8
jgerh Jun 30, 2025
5e00cec
added analysis
jgerh Jul 2, 2025
f5cfdf0
Merge pull request #282 from NVIDIA-NeMo/test-changes-2
jgerh Jul 2, 2025
6cef217
fix build issues
jgerh Jul 2, 2025
f643973
fixed gridcards
jgerh Jul 2, 2025
87eec41
revise makefile
jgerh Jul 2, 2025
6b686a5
content update and reorg
jgerh Jul 23, 2025
fc6bbf8
content and organizational updates
jgerh Jul 24, 2025
154dcdb
update index
jgerh Jul 24, 2025
c49ea61
import legacy articles, update extensions, add api
lbliii Aug 1, 2025
be7aaa6
Merge pull request #307 from lbliii/jgerhold/docs-refactor-staging
jgerh Aug 1, 2025
878ff6a
updated structure
jgerh Aug 7, 2025
5e655f8
revised structure and content
jgerh Aug 7, 2025
512de77
ran validation checks and added project doc
jgerh Aug 11, 2025
b234b9a
updated plan
jgerh Aug 11, 2025
7654700
updated project docs
jgerh Aug 12, 2025
6c4c2b8
checkpoint
jgerh Sep 12, 2025
4c3df6b
checkpoint
jgerh Sep 12, 2025
2c89b10
checkpoint
jgerh Sep 24, 2025
8371620
checkpoint
jgerh Sep 25, 2025
b0402c2
checkpoint
jgerh Sep 25, 2025
b7645c9
Validation check
jgerh Sep 25, 2025
2e22765
checkpoint
jgerh Sep 26, 2025
736675e
checkpoint
jgerh Sep 29, 2025
a10f7ac
checkpoint
jgerh Sep 30, 2025
f29a197
checkpoint
jgerh Sep 30, 2025
f115943
removed duplicate info
jgerh Oct 2, 2025
71 changes: 71 additions & 0 deletions .cursor/rules/docs-assist-merge-conflict-resolution.mdc
@@ -0,0 +1,71 @@
---
description: Resolve the merge conflicts using a "manual rebase" approach
globs:
alwaysApply: false
---
# LLM-Assisted Merge Conflict Resolution

When documentation branches fall behind main, use this "smart rebase" approach with LLM assistance to resolve conflicts safely and accurately.

## Prerequisites
- Identify the current branch name (`user/example-branch`) that contains your changes
- Ensure you have no uncommitted changes in your working directory

## Resolution Steps

1. Create a backup of your current branch:
```bash
git branch backup/user-branch-$(date +%Y-%m-%d) user/example-branch
```

2. Get the latest main branch:
```bash
git checkout main
git pull origin main
```

3. Create a new resolution branch:
```bash
git checkout -b conflict-resolution/example-branch
```

4. Bring in changes from your original branch:
```bash
git merge --squash user/example-branch
```

5. LLM-Assisted Conflict Resolution:
   a. For each conflict marker (`<<<<<<<`), the LLM will:
      - Analyze and explain the conflict context
      - Show the semantic differences between versions
      - Provide a recommended resolution with rationale
      - Wait for user approval before proceeding

   b. After each resolution:
      - LLM verifies the resolved content maintains technical accuracy
      - LLM checks for documentation consistency
      - LLM ensures cross-references remain valid

6. Final Validation:
   - LLM performs comprehensive review of all resolved files
   - Verifies all conflict markers are removed
   - Checks documentation structure remains intact
   - Ensures all technical content is accurate
   - Validates all internal references and links

7. Commit the resolved changes:
```bash
git add .
git commit -m "Resolve conflicts from user/example-branch

- List major conflicts resolved
- Note any significant decisions made
- Reference relevant documentation updates"
```

## Post-Resolution
You now have a clean branch based on latest main with your changes properly integrated. The backup branch can be deleted once you've verified everything is correct:
```bash
git branch -D backup/user-branch-YYYY-MM-DD
```
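
The "all conflict markers are removed" check from step 6 can also be done mechanically before committing. The following is a minimal sketch, not part of the rule itself: the function name `find_conflict_markers` and the sample text are hypothetical, and the pattern assumes Git's default seven-character markers.

```python
import re

# Git writes these markers at the start of a line in conflicted files.
# "|||||||" only appears with the diff3 conflict style.
# Assumption: markers use the default length of seven characters.
CONFLICT_MARKER = re.compile(r"^(<{7}|\|{7}|={7}|>{7})(\s|$)")


def find_conflict_markers(text):
    """Return (line_number, line) pairs for lines that look like conflict markers."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if CONFLICT_MARKER.match(line):
            hits.append((number, line))
    return hits


# Hypothetical conflicted page used only for illustration.
sample = "\n".join([
    "# Title",
    "<<<<<<< HEAD",
    "Install with pip.",
    "=======",
    "Install with uv.",
    ">>>>>>> user/example-branch",
])

for number, line in find_conflict_markers(sample):
    print(f"line {number}: {line}")
```

A line of exactly seven `=` characters is also a valid Markdown heading underline, so a hit on `=======` alone may be a false positive and is worth a manual look. `git diff --check` reports leftover conflict markers as well and can serve as a second opinion.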

13 changes: 13 additions & 0 deletions .cursor/rules/docs-bump-version.mdc
@@ -0,0 +1,13 @@
---
description: Version Bump Instructions for Docs Publishing
globs:
alwaysApply: false
---

1. Update the [versions1.json](mdc:docs/versions1.json) file with the version the user provides: add a new entry at the top and set `preferred` to `false` on the previous entry.
2. Update the `version` in [repo.toml](mdc:repo.toml) to the version the user provides.
3. Create a tag for the latest commit on the `main` branch using the format `git tag docs-v{}.{}.{}`.
4. Push the tag.
5. Recap everything you did to prepare for the release.

If a user asks you to bump the version but hasn't provided a full version number, ask for clarification on the version number.
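
The version-entry and tag-name steps can be sketched in code. This is only an illustration under stated assumptions: the helper names `docs_tag` and `add_version` are hypothetical, and the entry schema (a list of dicts with `version` and `preferred` keys, with the new entry marked preferred) is assumed rather than taken from the real `versions1.json`, which may carry additional fields.

```python
import re


def docs_tag(version):
    """Build the docs tag name, rejecting partial versions such as "1.2"."""
    if not re.fullmatch(r"\d+\.\d+\.\d+", version):
        raise ValueError(f"need a full MAJOR.MINOR.PATCH version, got {version!r}")
    return f"docs-v{version}"


def add_version(entries, version):
    """Return a new list with `version` first and every older entry non-preferred.

    Assumed schema: each entry is a dict with "version" and "preferred" keys.
    """
    docs_tag(version)  # reuse the full-version check
    older = [{**entry, "preferred": False} for entry in entries]
    return [{"version": version, "preferred": True}] + older


# Hypothetical existing entries, for illustration only.
current = [{"version": "0.4.0", "preferred": True}]
print(docs_tag("0.5.0"))
print(add_version(current, "0.5.0"))
```

Raising on a partial version mirrors the instruction to ask for clarification instead of guessing the missing components.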
8 changes: 8 additions & 0 deletions .cursor/rules/docs-check-source.mdc
@@ -0,0 +1,8 @@
---
description: Where to find source code for drafting details.
globs:
alwaysApply: false
---
- You can find the source code used to draft docs in the `nemo_run/` directory.

- When verifying this information, make sure the nmp submodule points to the `main` branch, because `main` holds the most recent code.
235 changes: 235 additions & 0 deletions .cursor/rules/docs-frontmatter.mdc
@@ -0,0 +1,235 @@
---
description:
globs:
alwaysApply: false
---
# Documentation Frontmatter Taxonomy Framework

Every markdown file in `docs/` should have frontmatter following this taxonomy framework at the very top of the page:

## Required Frontmatter Structure

```yaml
---
description: "Brief description of the content (1-2 sentences)"
categories: ["primary-category"] # Single category (required)
tags: ["tag1", "tag2", "tag3"] # 2-8 tags recommended
personas: ["persona1", "persona2"] # Target audience(s)
difficulty: "beginner|intermediate|advanced|reference"
content_type: "tutorial|concept|reference|troubleshooting|example"
modality: "text-only|image-only|video-only|multimodal|universal"
# only: not ga # Optional: content gating
---
```
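
To check that a page actually starts with a frontmatter block, the delimiters can be located with plain string handling. The sketch below is a hypothetical helper (`split_frontmatter` is not an existing function in this repository) and it only separates the block from the body; parsing the YAML itself is left to whatever YAML library the project already uses.

```python
def split_frontmatter(text):
    """Return (frontmatter, body), or (None, text) if the page has no frontmatter.

    The block must open with "---" on the first line and be closed by a
    later "---" line.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None, text
    for index in range(1, len(lines)):
        if lines[index].strip() == "---":
            frontmatter = "\n".join(lines[1:index])
            body = "\n".join(lines[index + 1:])
            return frontmatter, body
    return None, text  # opening delimiter without a closing one


# Hypothetical page used only for illustration.
page = '---\ndescription: "Quickstart"\n---\n# Title\n'
frontmatter, body = split_frontmatter(page)
print(frontmatter)
print(body)
```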

## Taxonomy Guidelines

### Categories (Choose ONE - Required)

**Primary functional domains aligned with user workflows:**

- `getting-started` - Installation, setup, quickstart guides
- `concepts-architecture` - Core concepts, fundamentals, system architecture
- `training-algorithms` - RL algorithms (GRPO, DPO, SFT), policy optimization
- `model-development` - Model integration, validation, custom architectures
- `research-advanced` - Custom algorithms, ablation studies, theory
- `deployment-operations` - Infrastructure, deployment, maintenance
- `integrations-apis` - External services, API docs, custom integrations
- `reference` - API documentation, configuration, troubleshooting

### Tags (Select 2-8 - Recommended)

**Training Techniques:**



- `reinforcement-learning` - RL algorithms and policy optimization
- `policy-optimization` - Policy gradient methods and optimization
- `model-training` - Model training and fine-tuning
- `loss-functions` - Loss function implementations and gradients
- `convergence` - Training convergence and stability
- `environments` - RL environment interfaces and implementations
- `rollouts` - Experience collection and rollout management
- `evaluation` - Model evaluation and metrics



**Technical Implementation:**

- `gpu-accelerated` - GPU-specific processing
- `distributed` - Multi-node/distributed processing
- `kubernetes` - K8s deployment content
- `slurm` - HPC/Slurm-related content
- `docker` - Container-related content

- `python-api` - Python API usage
- `configuration` - Config file management


**Data Types & Formats:**

- `webdataset` - WebDataset format handling
- `jsonl` - JSONL data format

- `parquet` - Parquet data format
- `bitext` - Parallel/bilingual text
- `code-data` - Programming code datasets

- `multimodal` - Cross-modal content

**Workflow Stages:**


- `data-loading` - Data ingestion processes
- `data-processing` - Core processing steps
- `data-export` - Output and export

- `pipeline` - End-to-end workflows
- `monitoring` - Observability and tracking


**Performance & Scale:**

- `large-scale` - Large dataset processing
- `optimization` - Performance tuning
- `memory-management` - Memory optimization

- `batch-processing` - Batch operation techniques

### Personas (Select 1+ - Required)

**Target audiences based on user roles:**

- `data-scientist-focused` - Analytics, metrics, model behavior
- `mle-focused` - Implementation details, pipelines, optimization
- `admin-focused` - Deployment, operations, maintenance
- `devops-focused` - Infrastructure, automation, monitoring

### Difficulty Levels

- `beginner` - New users, basic concepts
- `intermediate` - Some experience required
- `advanced` - Expert-level content
- `reference` - API docs, detailed specs

### Content Types

- `tutorial` - Step-by-step guides
- `concept` - Explanatory content
- `reference` - API/config documentation
- `troubleshooting` - Problem-solving guides
- `example` - Code samples and demos


### Modality Focus

- `text-only` - Text-specific content
- `image-only` - Image-specific content
- `video-only` - Video-specific content (EA-only)

- `multimodal` - Cross-modal content
- `universal` - Applies to all modalities

## Example Frontmatter

### Tutorial Example


```yaml
---
description: "Step-by-step guide to setting up distributed GRPO training for large language models"
categories: ["training-algorithms"]
tags: ["reinforcement-learning", "distributed", "large-scale", "configuration", "gpu-accelerated"]
personas: ["mle-focused", "admin-focused"]
difficulty: "intermediate"
content_type: "tutorial"
modality: "text-only"
---
```


### Concept Example

```yaml
---
description: "Core concepts behind experiment management and distributed computing for machine learning models"
categories: ["concepts-architecture"]
tags: ["loss-functions", "convergence", "multimodal", "machine-learning"]
personas: ["data-scientist-focused", "mle-focused"]
difficulty: "beginner"
content_type: "concept"
modality: "multimodal"
---
```

### Reference Example

```yaml
---
description: "Complete API reference for NeMo Run training algorithms and experiment management methods"
categories: ["reference"]
tags: ["reinforcement-learning", "python-api", "policy-optimization", "configuration"]
personas: ["mle-focused", "data-scientist-focused"]
difficulty: "reference"
content_type: "reference"
modality: "universal"
---
```

### Operations Example

```yaml
---
description: "Deploy NeMo Run on Kubernetes clusters with GPU acceleration and distributed training"
categories: ["deployment-operations"]
tags: ["kubernetes", "gpu-accelerated", "monitoring", "distributed", "docker"]
personas: ["admin-focused", "devops-focused"]
difficulty: "advanced"
content_type: "tutorial"
modality: "universal"
---
```

## Content Gating Integration

For early access or internal content, add the `only` field:

```yaml
---
description: "Video pipeline customization and advanced GPU processing techniques"
categories: ["research-advanced"]
tags: ["pipeline", "gpu-accelerated", "customization"]
personas: ["mle-focused"]
difficulty: "advanced"
content_type: "tutorial"
modality: "video-only"
only: not ga # Exclude from GA builds
---
```

## Benefits



1. **User-Centric Navigation** - Categories align with natural user workflows
2. **Flexible Discovery** - Tags enable cross-cutting content discovery
3. **Persona Adaptation** - Content serves different user types effectively
4. **Search Optimization** - Rich metadata improves search relevance
5. **Content Strategy** - Clear guidelines for authors and maintainers
6. **Scalability** - Structure accommodates future content and features

## Validation
When adding frontmatter, ensure:

- ✅ Description is 1-2 sentences and informative
- ✅ Exactly one category is specified
- ✅ 2-8 relevant tags are selected
- ✅ At least one persona is specified
- ✅ Difficulty and content type are appropriate
- ✅ Modality focus matches the content
- ✅ Content gating (`only`) is used when needed
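
Most of the checklist above can be checked mechanically. The following is a minimal sketch, assuming the frontmatter has already been parsed into a Python dict; the function name `validate_frontmatter` is hypothetical, and the allowed values are copied from the taxonomy in this document. Sentence count, tag relevance, and whether gating is needed still require human judgment.

```python
CATEGORIES = {
    "getting-started", "concepts-architecture", "training-algorithms",
    "model-development", "research-advanced", "deployment-operations",
    "integrations-apis", "reference",
}
PERSONAS = {
    "data-scientist-focused", "mle-focused", "admin-focused", "devops-focused",
}
DIFFICULTIES = {"beginner", "intermediate", "advanced", "reference"}
CONTENT_TYPES = {"tutorial", "concept", "reference", "troubleshooting", "example"}
MODALITIES = {"text-only", "image-only", "video-only", "multimodal", "universal"}


def validate_frontmatter(meta):
    """Return a list of problems; an empty list means every check passed."""
    problems = []

    if not str(meta.get("description") or "").strip():
        problems.append("description is missing or empty")

    categories = meta.get("categories") or []
    if len(categories) != 1:
        problems.append("exactly one category is required")
    elif categories[0] not in CATEGORIES:
        problems.append(f"unknown category: {categories[0]}")

    tags = meta.get("tags") or []
    if not 2 <= len(tags) <= 8:
        problems.append("2-8 tags are recommended")

    personas = meta.get("personas") or []
    if not personas:
        problems.append("at least one persona is required")
    for persona in personas:
        if persona not in PERSONAS:
            problems.append(f"unknown persona: {persona}")

    single_valued = (
        ("difficulty", DIFFICULTIES),
        ("content_type", CONTENT_TYPES),
        ("modality", MODALITIES),
    )
    for field, allowed in single_valued:
        if meta.get(field) not in allowed:
            problems.append(f"{field} must be one of: {', '.join(sorted(allowed))}")

    return problems


# The tutorial example from this document, as a parsed dict.
tutorial = {
    "description": "Step-by-step guide to setting up distributed GRPO training for large language models",
    "categories": ["training-algorithms"],
    "tags": ["reinforcement-learning", "distributed", "large-scale", "configuration", "gpu-accelerated"],
    "personas": ["mle-focused", "admin-focused"],
    "difficulty": "intermediate",
    "content_type": "tutorial",
    "modality": "text-only",
}
print(validate_frontmatter(tutorial))
```

Returning a list of messages instead of raising on the first failure lets an author fix every issue on a page in one pass.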