We are releasing new 2-bit Mixtral models. They use a mixed HQQ 4-bit/2-bit configuration, yielding a significantly better model (perplexity 4.69 vs. 5.90) for a negligible 0.20 GB increase in VRAM.
Base: https://huggingface.co/mobiuslabsgmbh/Mixtral-8x7B-v0.1-hf-a...
Instruct: https://huggingface.co/mobiuslabsgmbh/Mixtral-8x7B-Instruct-...
Shout-out to Artem Eliseev and Denis Mazur for suggesting this idea (https://github.com/mobiusml/hqq/issues/2).
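To give a feel for what a mixed-precision setup means in practice, here is a minimal, hypothetical sketch of a per-layer bit-width map. The module names follow HF's Mixtral implementation, but the specific bit widths and group sizes per layer group are illustrative assumptions, not the exact settings of the released checkpoints:

```python
# Hypothetical mixed 4-bit/2-bit layout (illustrative only; the actual
# per-layer assignment in the released models may differ).
quant_config = {
    # Attention projections are comparatively small, so keeping them at
    # 4-bit costs little VRAM while protecting quality.
    "self_attn.q_proj": {"nbits": 4, "group_size": 64},
    "self_attn.k_proj": {"nbits": 4, "group_size": 64},
    "self_attn.v_proj": {"nbits": 4, "group_size": 64},
    "self_attn.o_proj": {"nbits": 4, "group_size": 64},
    # The MoE expert weights dominate the parameter count, so they are
    # the ones pushed down to 2-bit.
    "block_sparse_moe.experts.w1": {"nbits": 2, "group_size": 16},
    "block_sparse_moe.experts.w2": {"nbits": 2, "group_size": 16},
    "block_sparse_moe.experts.w3": {"nbits": 2, "group_size": 16},
}
```

Because the experts hold the vast majority of Mixtral's weights, quantizing only the attention projections at 4-bit explains why the quality gain comes at such a small VRAM cost.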