@@ -85,7 +85,7 @@ This project includes the semi-supervised and semi-weakly supervised ImageNet mo

"Semi-supervised" (SSL) ImageNet models are pre-trained on a subset of unlabeled YFCC100M public image dataset and fine-tuned with the ImageNet1K training dataset, as described by the semi-supervised training framework in the paper mentioned above. In this case, the high capacity teacher model was trained only with labeled examples.

"Semi-weakly" supervised (SWSL) ImageNet models are pre-trained on **940 million** public images with 1.5K hashtags matching with 1000 ImageNet1K synsets, followed by fine-tuning on ImageNet1K dataset. In this case, the associated hashtags are only used for building a better teacher model. During training the student model, those hashtags are ingored and the student model is pretrained with a subset of 64M images selected by the teacher model from the same 940 million public image dataset.
"Semi-weakly" supervised (SWSL) ImageNet models are pre-trained on **940 million** public images with 1.5K hashtags matching with 1000 ImageNet1K synsets, followed by fine-tuning on ImageNet1K dataset. In this case, the associated hashtags are only used for building a better teacher model. During training the student model, those hashtags are ignored and the student model is pretrained with a subset of 64M images selected by the teacher model from the same 940 million public image dataset.

Semi-weakly supervised ResNet and ResNext models provided in the table below significantly improve the top-1 accuracy on the ImageNet validation set compared to training from scratch or other training mechanisms introduced in the literature as of September 2019. For example, **We achieve state-of-the-art accuracy of 81.2% on ImageNet for the widely used/adopted ResNet-50 model architecture**.
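These checkpoints are distributed through torch.hub; a minimal loading sketch (assuming the `facebookresearch/semi-supervised-ImageNet1K-models` hub repo and its `resnet50_swsl` entrypoint — swap in `resnet50_ssl` for the semi-supervised variant):

```python
import torch

# Semi-weakly supervised ResNet-50 (the 81.2% top-1 model mentioned above).
# Entrypoint name assumed from the repo's hubconf.
model = torch.hub.load('facebookresearch/semi-supervised-ImageNet1K-models', 'resnet50_swsl')
model.eval()
```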

4 changes: 2 additions & 2 deletions nvidia_deeplearningexamples_efficientnet.md
@@ -108,7 +108,7 @@ for uri, result in zip(uris, results):
```

### Details
-For detailed information on model input and output, training recipies, inference and performance visit:
+For detailed information on model input and output, training recipes, inference and performance visit:
[github](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/Classification/ConvNets/efficientnet)
and/or [NGC](https://ngc.nvidia.com/catalog/resources/nvidia:efficientnet_for_pytorch)

@@ -123,4 +123,4 @@ and/or [NGC](https://ngc.nvidia.com/catalog/resources/nvidia:efficientnet_for_py
- [pretrained model on NGC (efficientnet-widese-b4)](https://ngc.nvidia.com/catalog/models/nvidia:efficientnet_widese_b4_pyt_amp)
- [pretrained, quantized model on NGC (efficientnet-widese-b0)](https://ngc.nvidia.com/catalog/models/nvidia:efficientnet_widese_b0_pyt_amp)
- [pretrained, quantized model on NGC (efficientnet-widese-b4)](https://ngc.nvidia.com/catalog/models/nvidia:efficientnet_widese_b4_pyt_amp)


8 changes: 4 additions & 4 deletions nvidia_deeplearningexamples_fastpitch.md
@@ -41,7 +41,7 @@ In the example below:
- HiFiGAN generates sound given the mel spectrogram
- the output sound is saved in an 'audio.wav' file

-To run the example you need some extra python packages installed. These are needed for preprocessing of text and audio, as well as for display and input/output handling. Finally, for better performance of FastPitch model, we download the CMU pronounciation dictionary.
+To run the example you need some extra python packages installed. These are needed for preprocessing of text and audio, as well as for display and input/output handling. Finally, for better performance of the FastPitch model, we download the CMU pronunciation dictionary.
```bash
apt-get update
apt-get install -y libsndfile1 wget
@@ -99,7 +99,7 @@ Load text processor.
tp = torch.hub.load('NVIDIA/DeepLearningExamples:torchhub', 'nvidia_textprocessing_utils', cmudict_path="cmudict-0.7b", heteronyms_path="heteronyms")
```

-Set the text to be synthetized, prepare input and set additional generation parameters.
+Set the text to be synthesized, prepare input and set additional generation parameters.
```python
text = "Say this smoothly, to prove you are not a robot."
```
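For context, the synthesis step these snippets build toward looks roughly like the sketch below, where `fastpitch`, `hifigan`, `denoiser`, and `tp` are the objects loaded earlier on the page; the keyword arguments are illustrative assumptions, not verified here.

```python
import torch

# Tokenize the text with the text processor loaded above.
batches = tp.prepare_input_sequence([text], batch_size=1)

# FastPitch: text -> mel spectrogram; HiFiGAN: mel -> waveform.
gen_kw = {'pace': 1.0, 'speaker': 0, 'pitch_tgt': None}  # assumed defaults
with torch.no_grad():
    mel, mel_lens, *_ = fastpitch(batches[0]['text'], **gen_kw)
    audios = hifigan(mel).float()
    audios = denoiser(audios.squeeze(1), denoising_strength=0.005).squeeze(1)
```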
@@ -136,7 +136,7 @@ plt.ylabel('frequency')
_=plt.title('Spectrogram')
```

-Syntesize audio.
+Synthesize audio.
```python
audio_numpy = audios[0].cpu().numpy()
Audio(audio_numpy, rate=22050)
@@ -149,7 +149,7 @@ write("audio.wav", vocoder_train_setup['sampling_rate'], audio_numpy)
```

### Details
-For detailed information on model input and output, training recipies, inference and performance visit: [github](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/SpeechSynthesis/HiFiGAN) and/or [NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/dle/resources/fastpitch_pyt)
+For detailed information on model input and output, training recipes, inference and performance visit: [github](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/SpeechSynthesis/HiFiGAN) and/or [NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/dle/resources/fastpitch_pyt)

### References

2 changes: 1 addition & 1 deletion nvidia_deeplearningexamples_gpunet.md
@@ -122,7 +122,7 @@ for uri, result in zip(uris, results):
```

### Details
-For detailed information on model input and output, training recipies, inference and performance visit:
+For detailed information on model input and output, training recipes, inference and performance visit:
[github](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/Classification/GPUNet)

### References
10 changes: 5 additions & 5 deletions nvidia_deeplearningexamples_hifigan.md
@@ -34,7 +34,7 @@ In the example below:
- HiFiGAN generates sound given the mel spectrogram
- the output sound is saved in an 'audio.wav' file

-To run the example you need some extra python packages installed. These are needed for preprocessing of text and audio, as well as for display and input/output handling. Finally, for better performance of FastPitch model, we download the CMU pronounciation dictionary.
+To run the example you need some extra python packages installed. These are needed for preprocessing of text and audio, as well as for display and input/output handling. Finally, for better performance of the FastPitch model, we download the CMU pronunciation dictionary.
```bash
pip install numpy scipy librosa unidecode inflect librosa matplotlib==3.6.3
apt-get update
@@ -92,7 +92,7 @@ Load text processor.
tp = torch.hub.load('NVIDIA/DeepLearningExamples:torchhub', 'nvidia_textprocessing_utils', cmudict_path="cmudict-0.7b", heteronyms_path="heteronyms")
```

-Set the text to be synthetized, prepare input and set additional generation parameters.
+Set the text to be synthesized, prepare input and set additional generation parameters.
```python
text = "Say this smoothly, to prove you are not a robot."
```
@@ -129,7 +129,7 @@ plt.ylabel('frequency')
_=plt.title('Spectrogram')
```

-Syntesize audio.
+Synthesize audio.
```python
audio_numpy = audios[0].cpu().numpy()
Audio(audio_numpy, rate=22050)
@@ -142,12 +142,12 @@ write("audio.wav", vocoder_train_setup['sampling_rate'], audio_numpy)
```

### Details
-For detailed information on model input and output, training recipies, inference and performance visit: [github](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/SpeechSynthesis/HiFiGAN) and/or [NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/dle/resources/hifigan_pyt)
+For detailed information on model input and output, training recipes, inference and performance visit: [github](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/SpeechSynthesis/HiFiGAN) and/or [NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/dle/resources/hifigan_pyt)

### References

- [HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis](https://arxiv.org/abs/2010.05646)
- [Original implementation](https://github.com/jik876/hifi-gan)
- [FastPitch on NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/dle/resources/fastpitch_pyt)
- [HiFi-GAN on NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/dle/resources/hifigan_pyt)
-- [FastPitch and HiFi-GAN on github](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/SpeechSynthesis/HiFi-GAN)
+- [FastPitch and HiFi-GAN on github](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/SpeechSynthesis/HiFi-GAN)
2 changes: 1 addition & 1 deletion nvidia_deeplearningexamples_resnet50.md
@@ -105,7 +105,7 @@ for uri, result in zip(uris, results):

### Details

-For detailed information on model input and output, training recipies, inference and performance visit:
+For detailed information on model input and output, training recipes, inference and performance visit:
[github](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/Classification/ConvNets/resnet50v1.5)
and/or [NGC](https://ngc.nvidia.com/catalog/resources/nvidia:resnet_50_v1_5_for_pytorch)
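The `uris`/`results` loop shown in the hunk above is the tail of the page's inference example; end to end, the flow is roughly the following sketch (entrypoint and helper names assumed from the page):

```python
import torch

# Model plus NVIDIA's pre/post-processing helpers, both via torch.hub.
resnet50 = torch.hub.load('NVIDIA/DeepLearningExamples:torchhub', 'nvidia_resnet50', pretrained=True)
utils = torch.hub.load('NVIDIA/DeepLearningExamples:torchhub', 'nvidia_convnets_processing_utils')
resnet50.eval()

uris = ['http://images.cocodataset.org/test-stuff2017/000000024309.jpg']
batch = torch.cat([utils.prepare_input_from_uri(uri) for uri in uris])
with torch.no_grad():
    output = torch.nn.functional.softmax(resnet50(batch), dim=1)
results = utils.pick_n_best(predictions=output, n=5)
```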

2 changes: 1 addition & 1 deletion nvidia_deeplearningexamples_resnext.md
@@ -107,7 +107,7 @@ for uri, result in zip(uris, results):
```

### Details
-For detailed information on model input and output, training recipies, inference and performance visit:
+For detailed information on model input and output, training recipes, inference and performance visit:
[github](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/Classification/ConvNets/resnext101-32x4d)
and/or [NGC](https://ngc.nvidia.com/catalog/resources/nvidia:resnext_for_pytorch)

2 changes: 1 addition & 1 deletion nvidia_deeplearningexamples_se-resnext.md
@@ -107,7 +107,7 @@ for uri, result in zip(uris, results):
```

### Details
-For detailed information on model input and output, training recipies, inference and performance visit:
+For detailed information on model input and output, training recipes, inference and performance visit:
[github](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/Classification/ConvNets/se-resnext101-32x4d)
and/or [NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/resources/se_resnext_for_pytorch).

2 changes: 1 addition & 1 deletion nvidia_deeplearningexamples_ssd.md
@@ -123,7 +123,7 @@ plt.show()

### Details
For detailed information on model input and output,
-training recipies, inference and performance visit:
+training recipes, inference and performance visit:
[github](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/Detection/SSD)
and/or [NGC](https://ngc.nvidia.com/catalog/resources/nvidia:ssd_for_pytorch)

2 changes: 1 addition & 1 deletion nvidia_deeplearningexamples_tacotron2.md
@@ -89,7 +89,7 @@ Audio(audio_numpy, rate=rate)
```

### Details
-For detailed information on model input and output, training recipies, inference and performance visit: [github](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/SpeechSynthesis/Tacotron2) and/or [NGC](https://ngc.nvidia.com/catalog/resources/nvidia:tacotron_2_and_waveglow_for_pytorch)
+For detailed information on model input and output, training recipes, inference and performance visit: [github](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/SpeechSynthesis/Tacotron2) and/or [NGC](https://ngc.nvidia.com/catalog/resources/nvidia:tacotron_2_and_waveglow_for_pytorch)

### References

2 changes: 1 addition & 1 deletion nvidia_deeplearningexamples_waveglow.md
@@ -91,7 +91,7 @@ Audio(audio_numpy, rate=rate)
```

### Details
-For detailed information on model input and output, training recipies, inference and performance visit: [github](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/SpeechSynthesis/Tacotron2) and/or [NGC](https://ngc.nvidia.com/catalog/resources/nvidia:tacotron_2_and_waveglow_for_pytorch)
+For detailed information on model input and output, training recipes, inference and performance visit: [github](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/SpeechSynthesis/Tacotron2) and/or [NGC](https://ngc.nvidia.com/catalog/resources/nvidia:tacotron_2_and_waveglow_for_pytorch)

### References

2 changes: 1 addition & 1 deletion pytorch_vision_deeplabv3_resnet101.md
@@ -74,7 +74,7 @@ To get the maximum prediction of each class, and then use it for a downstream ta
Here's a small snippet that plots the predictions, with each color being assigned to each class (see the visualized image on the left).

```python
-# create a color pallette, selecting a color for each class
+# create a color palette, selecting a color for each class
palette = torch.tensor([2 ** 25 - 1, 2 ** 15 - 1, 2 ** 21 - 1])
colors = torch.as_tensor([i for i in range(21)])[:, None] * palette
colors = (colors % 255).numpy().astype("uint8")
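# Hedged continuation (the diff truncates the snippet here): apply the palette
# to the per-pixel class predictions and build a displayable image.
# `output_predictions` and `input_image` are assumed to be defined earlier on the page.
r = Image.fromarray(output_predictions.byte().cpu().numpy()).resize(input_image.size)
r.putpalette(colors)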
4 changes: 2 additions & 2 deletions pytorch_vision_fcn_resnet101.md
@@ -31,7 +31,7 @@ The images have to be loaded in to a range of `[0, 1]` and then normalized using
and `std = [0.229, 0.224, 0.225]`.

The model returns an `OrderedDict` with two Tensors that are of the same height and width as the input Tensor, but with 21 classes.
-`output['out']` contains the semantic masks, and `output['aux']` contains the auxillary loss values per-pixel. In inference mode, `output['aux']` is not useful.
+`output['out']` contains the semantic masks, and `output['aux']` contains the auxiliary loss values per-pixel. In inference mode, `output['aux']` is not useful.
So, `output['out']` is of shape `(N, 21, H, W)`. More documentation can be found [here](https://pytorch.org/vision/stable/models.html#object-detection-instance-segmentation-and-person-keypoint-detection).
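In code, this amounts to something like the sketch below (assuming `model` and a normalized `input_batch` as set up earlier on the page):

```python
import torch

with torch.no_grad():
    out = model(input_batch)['out']  # (N, 21, H, W); 'aux' is ignored at inference
pred = out.argmax(dim=1)             # (N, H, W) per-pixel class indices
```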


@@ -73,7 +73,7 @@ To get the maximum prediction of each class, and then use it for a downstream ta
Here's a small snippet that plots the predictions, with each color being assigned to each class (see the visualized image on the left).

```python
-# create a color pallette, selecting a color for each class
+# create a color palette, selecting a color for each class
palette = torch.tensor([2 ** 25 - 1, 2 ** 15 - 1, 2 ** 21 - 1])
colors = torch.as_tensor([i for i in range(21)])[:, None] * palette
colors = (colors % 255).numpy().astype("uint8")
2 changes: 1 addition & 1 deletion pytorch_vision_googlenet.md
@@ -84,7 +84,7 @@ for i in range(top5_prob.size(0)):

### Model Description

-GoogLeNet was based on a deep convolutional neural network architecture codenamed "Inception", which was responsible for setting the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC 2014). The 1-crop error rates on the ImageNet dataset with a pretrained model are list below.
+GoogLeNet was based on a deep convolutional neural network architecture codenamed "Inception", which was responsible for setting the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC 2014). The 1-crop error rates on the ImageNet dataset with a pretrained model are listed below.

| Model structure | Top-1 error | Top-5 error |
| --------------- | ----------- | ----------- |
2 changes: 1 addition & 1 deletion pytorch_vision_once_for_all.md
@@ -74,7 +74,7 @@ model, image_size = ofa_specialized_get("flops@[email protected]_finetune@75", pret
model.eval()
```

-The model's prediction can be evalutaed by
+The model's prediction can be evaluated by
```python
# Download an example image from pytorch website
import urllib
4 changes: 2 additions & 2 deletions pytorch_vision_proxylessnas.md
@@ -20,7 +20,7 @@ demo-model-link: https://huggingface.co/spaces/pytorch/ProxylessNAS
```python
import torch
target_platform = "proxyless_cpu"
-# proxyless_gpu, proxyless_mobile, proxyless_mobile14 are also avaliable.
+# proxyless_gpu, proxyless_mobile, proxyless_mobile14 are also available.
model = torch.hub.load('mit-han-lab/ProxylessNAS', target_platform, pretrained=True)
model.eval()
```
@@ -87,7 +87,7 @@ for i in range(top5_prob.size(0)):

ProxylessNAS models are from the [ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware](https://arxiv.org/abs/1812.00332) paper.

-Conventionally, people tend to design *one efficient model* for *all hardware platforms*. But different hardware has different properties, for example, CPU has higher frequency and GPU is better at parallization. Therefore, instead of generalizing, we need to **specialize** CNN architectures for different hardware platforms. As shown in below, with similar accuracy, specialization offers free yet significant performance boost on all three platforms.
+Conventionally, people tend to design *one efficient model* for *all hardware platforms*. But different hardware has different properties, for example, CPU has higher frequency and GPU is better at parallelization. Therefore, instead of generalizing, we need to **specialize** CNN architectures for different hardware platforms. As shown in below, with similar accuracy, specialization offers free yet significant performance boost on all three platforms.

| Model structure | GPU Latency | CPU Latency | Mobile Latency
| --------------- | ----------- | ----------- | ----------- |
8 changes: 4 additions & 4 deletions pytorch_vision_resnext.md
@@ -2,7 +2,7 @@
layout: hub_detail
background-class: hub-background
body-class: hub
-title: ResNext
+title: ResNeXt
summary: Next generation ResNets, more efficient and accurate
category: researchers
image: resnext.png
@@ -87,9 +87,9 @@ for i in range(top5_prob.size(0)):

### Model Description

-Resnext models were proposed in [Aggregated Residual Transformations for Deep Neural Networks](https://arxiv.org/abs/1611.05431).
-Here we have the 2 versions of resnet models, which contains 50, 101 layers repspectively.
-A comparison in model archetechure between resnet50 and resnext50 can be found in Table 1.
+ResNeXt models were proposed in [Aggregated Residual Transformations for Deep Neural Networks](https://arxiv.org/abs/1611.05431).
+Here are the two versions of ResNeXt models, which contain 50 and 101 layers, respectively.
+A comparison of model architecture between ResNet-50 and ResNeXt-50 can be found in Table 1.
Their 1-crop error rates on ImageNet dataset with pretrained models are listed below.

| Model structure | Top-1 error | Top-5 error |
4 changes: 2 additions & 2 deletions sigsep_open-unmix-pytorch_umx.md
@@ -61,7 +61,7 @@ Furthermore, we provide a model for speech enhancement trained by [Sony Corporat

* __`umxse`__ speech enhancement model is trained on the 28-speaker version of the [Voicebank+DEMAND corpus](https://datashare.is.ed.ac.uk/handle/10283/1942?show=full).

-All three models are also available as spectrogram (core) models, which take magnitude spectrogram inputs and ouput separated spectrograms.
+All three models are also available as spectrogram (core) models, which take magnitude spectrogram inputs and output separated spectrograms.
These models can be loaded using `umxhq_spec`, `umx_spec` and `umxse_spec`.
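As a quick illustration, a sketch of the waveform-level usage (entrypoint name from the list above; the dummy input shape is an assumption — check the repo for exact signatures):

```python
import torch

# Load the full separator (core spectrogram model wrapped with the STFT).
separator = torch.hub.load('sigsep/open-unmix-pytorch', 'umxhq')

audio = torch.rand(1, 2, 44100 * 5)   # (batch, channels, samples), dummy stereo clip
estimates = separator(audio)          # (batch, targets, channels, samples)
```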

### Details
@@ -77,4 +77,4 @@ pip install openunmix
### References

- [Open-Unmix - A Reference Implementation for Music Source Separation](https://doi.org/10.21105/joss.01667)
-- [SigSep - Open Ressources for Music Separation](https://sigsep.github.io/)
+- [SigSep - Open Resources for Music Separation](https://sigsep.github.io/)
6 changes: 3 additions & 3 deletions test_run_python_code.py
@@ -11,7 +11,7 @@
@pytest.mark.parametrize('file_path', ALL_FILES)
def test_run_file(file_path):
if 'nvidia' in file_path:
-# FIXME: NVIDIA models checkoints are on cuda
+# FIXME: NVIDIA models checkpoints are on CUDA
pytest.skip("temporarily disabled")
if 'pytorch_fairseq_translation' in file_path:
pytest.skip("temporarily disabled")
@@ -26,11 +26,11 @@ def test_run_file(file_path):

# We just run the python files in a separate sub-process. We really want a
# subprocess here because otherwise we might run into package versions
-# issues: imagine script A that needs torchvivion 0.9 and script B that
+# issues: imagine script A that needs torchvision 0.9 and script B that
# needs torchvision 0.10. If script A is run prior to script B in the same
# process, script B will still be run with torchvision 0.9 because the only
# "import torchvision" statement that counts is the first one, and even
-# torchub sys.path shenanigans can do nothing about this. By creating
+# torchhub sys.path shenanigans can do nothing about this. By creating
# subprocesses we're sure that all file executions are fully independent.
try:
# This is inspired (and heavily simplified) from
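The body truncated above shells each file out to a fresh interpreter; the pattern the comment describes is, in simplified and hypothetical form:

```python
import subprocess
import sys

# A separate process per example file, so one script's imports can never
# pin or pollute the package versions seen by another.
subprocess.run([sys.executable, file_path], check=True)
```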