
Releases: neuralmagic/deepsparse

DeepSparse v0.1.1 Patch Release

01 Mar 19:56 · commit 4940121

This is a patch release for 0.1.0 that contains the following changes:

  • Documentation updates: revised tagline and overview, with wording updated to use "sparsification" consistently
  • Examples updated to use the new ResNet-50 pruned_quant-moderate model from the SparseZoo
  • Nightly build dependencies now match on major.minor rather than the full version
  • Benchmarking script added for reproducing ResNet-50 numbers (a rough sketch of such a benchmark follows this list)
  • Small (3-5%) performance improvement for pruned quantized ResNet-50 models at batch sizes greater than 16
  • Reduced memory footprint for networks with sparse fully connected layers
  • Improved performance on multi-socket systems when batch size is larger than 1
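
The benchmarking script itself ships in the repo; as a rough illustration of reproducing such numbers through the Python API, here is a minimal sketch, assuming a local ONNX export of the pruned-quantized ResNet-50 (the file name and the standard 3x224x224 ImageNet input shape are assumptions, not the script's actual contents):

  import time

  import numpy as np
  from deepsparse import compile_model

  batch_size = 64  # the notes cite the 3-5% gain for batch sizes above 16
  onnx_path = "resnet50_pruned_quant_moderate.onnx"  # hypothetical local path

  engine = compile_model(onnx_path, batch_size=batch_size)
  inputs = [np.random.rand(batch_size, 3, 224, 224).astype(np.float32)]

  for _ in range(5):  # warm up the engine before timing
      engine.run(inputs)

  iterations = 20
  start = time.perf_counter()
  for _ in range(iterations):
      engine.run(inputs)
  elapsed = time.perf_counter() - start
  print(f"{batch_size * iterations / elapsed:.1f} images/sec")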

DeepSparse v0.1.0 First GitHub Release

04 Feb 21:21 · commit 6036701

Welcome to our initial release on GitHub! Older release notes can be found here.

New Features:

  • Operator support enabled (a sketch for inspecting these ops in a model follows this list):
    • QLinearAdd
    • 2D QLinearMatMul when the second operand is constant
  • Multi-stream support added for concurrent requests.
  • Examples for benchmarking, classification flows, detection flows, and Flask servers added.
  • Jupyter Notebooks for classification and detection flows added.
  • Makefile flows and utilities implemented for GitHub repo structure.
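
Since QLinearMatMul support is limited to the 2D, constant-second-operand case, it can help to inspect a quantized model up front. A minimal sketch using the onnx package (the model path is hypothetical): it counts the newly supported ops and checks whether each QLinearMatMul's second operand is a graph initializer, i.e. constant:

  from collections import Counter

  import onnx

  model = onnx.load("model_quantized.onnx")  # hypothetical path

  # count occurrences of the newly supported quantized ops
  op_counts = Counter(node.op_type for node in model.graph.node)
  for op in ("QLinearAdd", "QLinearMatMul"):
      print(f"{op}: {op_counts.get(op, 0)} node(s)")

  # QLinearMatMul inputs: a, a_scale, a_zero_point, b, b_scale, b_zero_point, ...
  initializer_names = {init.name for init in model.graph.initializer}
  for node in model.graph.node:
      if node.op_type == "QLinearMatMul":
          b = node.input[3]
          print(f"QLinearMatMul operand '{b}' constant: {b in initializer_names}")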

Changes:

  • Software packaging updated for the new GitHub distribution channel, from file naming conventions to the removal of license enforcement.
  • Initial startup message updated with improved language.
  • Distribution now manylinux2014 compliant; support for Ubuntu 16.04 deprecated.
  • QuantizeLinear operations now use division instead of scaling by the reciprocal for small quantization scales (a numeric illustration follows this list).
  • Small performance improvements made on some quantized networks with nontrivial activation zero points.
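
To see why the division change matters: the reciprocal of a small scale is generally not exactly representable in float32, so multiplying by it can land on the other side of a rounding boundary than true division would. A minimal numpy illustration (the scale and inputs are arbitrary, chosen only to make the effect visible):

  import numpy as np

  scale = np.float32(1e-5)         # an arbitrary small quantization scale
  recip = np.float32(1.0) / scale  # the reciprocal is inexact in float32

  rng = np.random.default_rng(0)
  x = rng.uniform(-1.0, 1.0, 100_001).astype(np.float32)

  q_ref = np.rint(x.astype(np.float64) / np.float64(scale))  # high-precision reference
  q_div = np.rint(x / scale)       # quantize by division (new behavior)
  q_mul = np.rint(x * recip)       # quantize via reciprocal (old behavior)

  print("division mismatches vs reference:  ", np.count_nonzero(q_div != q_ref))
  print("reciprocal mismatches vs reference:", np.count_nonzero(q_mul != q_ref))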

Resolved Issues:

  • Networks with sparse quantized convolutions and nontrivial activation zero points now produce consistently correct results.
  • Fixed a crash in some models where a quantized depthwise convolution follows a non-depthwise quantized convolution.

Known Issues:

  • None