This repository was archived by the owner on Jun 3, 2025. It is now read-only.
Releases: neuralmagic/deepsparse
DeepSparse v0.1.1 Patch Release
This is a patch release for 0.1.0 that contains the following changes:
- Docs updates: tagline and overview revised, and verbiage updated to use the term "sparsification"
- Examples updated to use new ResNet-50 pruned_quant moderate model from the SparseZoo
- Nightly build dependencies now match on major.minor and not full version
- Benchmarking script added for reproducing ResNet-50 numbers
- Small (3-5%) performance improvement for pruned quantized ResNet-50 models at batch sizes greater than 16
- Reduced memory footprint for networks with sparse fully connected layers
- Improved performance on multi-socket systems when batch size is larger than 1
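
The major.minor matching described above for nightly build dependencies can be sketched as follows. This only illustrates the comparison rule. The helper name `same_major_minor` and the sample version strings are hypothetical and are not taken from the DeepSparse build scripts.

```python
def same_major_minor(installed: str, required: str) -> bool:
    """Return True when two dotted version strings share major.minor.

    Hypothetical helper for illustration only: everything after the
    second dot (patch number, nightly date suffix) is ignored.
    """
    return installed.split(".")[:2] == required.split(".")[:2]


print(same_major_minor("0.1.1", "0.1.0"))           # only the patch differs
print(same_major_minor("0.1.0.20210301", "0.1.1"))  # nightly suffix and patch differ
print(same_major_minor("0.2.0", "0.1.1"))           # the minor version differs
```

Under this rule a nightly build is still accepted after a patch release, whereas a full-version match would reject it.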
DeepSparse v0.1.0 First GitHub Release
Welcome to our initial release on GitHub! Older release notes can be found here.
New Features:
- Operator support enabled:
  - QLinearAdd
  - 2D QLinearMatMul when the second operand is constant
- Multi-stream support added for concurrent requests.
- Examples for benchmarking, classification flows, detection flows, and Flask servers added.
- Jupyter Notebooks for classification and detection flows added.
- Makefile flows and utilities implemented for GitHub repo structure.
Changes:
- Software packaging updated to reflect new GitHub distribution channel, from file naming conventions to license enforcement removal.
- Initial startup message updated with improved language.
- Distribution now manylinux2014 compliant; support for Ubuntu 16.04 deprecated.
- QuantizeLinear operations now use division instead of scaling by reciprocal for small quantization scales.
- Small performance improvements made on some quantized networks with nontrivial activation zero points.
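
The release note does not say why QuantizeLinear switched to division. As a minimal sketch of how the two approaches can disagree, the example below assumes the ONNX definition of QuantizeLinear (round `x / scale` to nearest, ties to even, add the zero point, saturate to uint8). It uses Python double-precision floats, not whatever precision the engine uses internally, so the specific values are illustrative only. The function names are hypothetical.

```python
def quantize_div(x: float, scale: float, zero_point: int = 0) -> int:
    """Quantize to uint8 by dividing by the scale (illustrative sketch)."""
    q = round(x / scale) + zero_point  # Python's round() rounds ties to even
    return max(0, min(255, q))         # saturate to the uint8 range


def quantize_recip(x: float, scale: float, zero_point: int = 0) -> int:
    """Quantize to uint8 by multiplying by the precomputed reciprocal."""
    inv_scale = 1.0 / scale            # the reciprocal carries its own rounding error
    q = round(x * inv_scale) + zero_point
    return max(0, min(255, q))


# A small scale and an input that sits exactly on a rounding tie (x / scale == 1.5).
scale = 49 * 2.0 ** -20
x = 73.5 * 2.0 ** -20

print(quantize_div(x, scale))    # 2: the tie at 1.5 rounds to the even value
print(quantize_recip(x, scale))  # 1: the product falls just below 1.5
```

The reciprocal is rounded once when it is computed and the product is rounded again, so an input that lies exactly on a quantization boundary can land on the other side of it.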
Resolved Issues:
- Networks with sparse quantized convolutions and nontrivial activation zero points now produce consistently correct results.
- Crash no longer occurs for some models where a quantized depthwise convolution follows a non-depthwise quantized convolution.
Known Issues:
- None