Tags: modelscope/dash-infer

v2.2.5

v2.2.5 tag. 

v2.2.4

tag 2.2.0 post 

v2.2.3

tag 2.2.0 post 

v2.2.2

tag 2.2.2 post 

v2.2.1

v2.2.1 release 

v2.2.0

v2.2.0 release 

v2.1.0

add high performance moe kernel; fix a16w8 compile bug for sm<80 (#67) 

v2.0.0

update sampling, prefix cache, json mode impl (#55)

- engine: stop and release the model when the engine is released; remove deprecated lock
- sampling: generate_op heavily modified; remove dependency on global tensors
- prefix cache: bug fixes; improve eviction performance
- json mode: update lmfe-cpp patch; add process_logits; sampling with top_k and top_p
- span-attention: move span_attn decoderReshape to init
- lora: add docs; fix typo
- ubuntu: add Ubuntu Dockerfile; fix install directory error
- bugfix: fix multi-batch repetition penalty bug