Releases: huggingface/text-generation-inference

v1.4.3

28 Feb 15:14
e6bb3ff

Highlights

  • Add support for Starcoder 2
  • Add support for Qwen2

What's Changed

Full Changelog: v1.4.2...v1.4.3

v1.4.2

21 Feb 13:52
9c1cb81

Highlights

  • Add support for Google Gemma models

What's Changed

Full Changelog: v1.4.1...v1.4.2

v1.4.1

16 Feb 16:53
4139054

Highlights

What's Changed

New Contributors

Full Changelog: v1.4.0...v1.4.1

v1.4.0

26 Jan 18:07
c2d4a3b

Highlights

  • OpenAI compatible API #1427
  • exllama v2 Tensor Parallel #1490
  • GPTQ support for AMD GPUs #1489
  • Phi support #1442

What's Changed

New Contributors

Full Changelog: v1.3.4...v1.4.0

v1.3.4

22 Dec 14:46

What's Changed

Full Changelog: v1.3.3...v1.3.4

v1.3.3

15 Dec 00:22

What's Changed

  • fix GPTQ params loading
  • improve decode latency for long sequences twofold
  • feat: add more latency metrics in forward by @OlivierDehaene in #1346
  • fix: max_past default value must be -1, not 0 by @OlivierDehaene in #1348

Full Changelog: v1.3.2...v1.3.3

v1.3.2

12 Dec 17:14

What's Changed

Full Changelog: v1.3.1...v1.3.2

v1.3.1

11 Dec 15:47

Hotfix Mixtral implementation

Full Changelog: v1.3.0...v1.3.1

v1.3.0

11 Dec 14:11

What's Changed

Full Changelog: v1.2.0...v1.3.0

v1.2.0

30 Nov 14:19

What's Changed

New Contributors

Full Changelog: v1.1.1...v1.2.0