tk_v0.1 #2462
base: develop
Conversation
Thanks for your contribution!
@@ -530,6 +529,7 @@ def get_static_model_on_pdc(remote_path, local_path, timeout, enable_flash_devic
    Returns:
        str: path to load static model
    """
Noting a TODO: the `get_static_model_on_pdc` function can be removed later.
@@ -568,6 +566,8 @@ def __init__(self, **kwargs):
        self.use_filtered_label_loss = kwargs.pop("use_filtered_label_loss", False)
        self.loss_subbatch_seqlen = kwargs.pop("loss_subbatch_seqlen", -1)

from ..quantization.quantization_config import QuantizationConfig
https://github.com/PaddlePaddle/PaddleFormers/blob/develop/paddleformers/quantization/quantization_config.py#L116 In QuantizationConfig, this paddle dependency is only used for a check. It would be better to `try: from paddle.nn.quant.quantized_linear import _get_arch_info` and, if the import fails, simply skip the GPU-version check.
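A minimal sketch of the guarded-import pattern suggested here, assuming `_get_arch_info` lives at the path named in the comment; the helper name `_check_gpu_arch` and its fallback behavior are illustrative assumptions, not the actual QuantizationConfig code:

```python
# Illustrative guard, not the actual QuantizationConfig source.
try:
    from paddle.nn.quant.quantized_linear import _get_arch_info
except ImportError:
    _get_arch_info = None  # Paddle absent: skip the GPU-arch check below


def _check_gpu_arch(supported_archs):
    """Validate the GPU architecture only when Paddle is importable."""
    if _get_arch_info is None:
        return  # no Paddle, so no GPU-version check is performed
    if _get_arch_info() not in supported_archs:
        raise RuntimeError("Quantization is not supported on this GPU architecture.")
```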
Adapted from transformers.AutoTokenizer.from_pretrained with modifications:
1. Added get_paddleformers_tokenizer_config() to extend tokenizer_config.json download source
2. Explicitly binds PaddleTokenizerMixin to the tokenizer class before final instantiation
绑定 PaddleTokenizerMixin，如果 Paddle 可用则绑定，否则返回原类
Don't write comments in Chinese.
@@ -205,7 +245,9 @@ def from_pretrained(cls, pretrained_model_name_or_path, *inputs, **kwargs):

    if tokenizer_class is None:
        raise ValueError(f"Tokenizer class {tokenizer_class_name} is not currently imported.")
    tokenizer_class = type(tokenizer_class.__name__, (PaddleTokenizerMixin, tokenizer_class), {})

    # 绑定 PaddleTokenizerMixin
Same as above.
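For reference, a self-contained sketch of the `type()`-based mixin binding used in the diff above; the toy classes here stand in for the real PaddleTokenizerMixin and the tokenizer class resolved by `from_pretrained`:

```python
class PaddleTokenizerMixin:
    # Stand-in for the real mixin: overrides behavior of the base class.
    def save_pretrained(self, path):
        return f"mixin saved to {path}"


class BaseTokenizer:
    # Stand-in for the tokenizer class resolved at runtime.
    def save_pretrained(self, path):
        return f"base saved to {path}"


# Same pattern as the diff:
#   type(tokenizer_class.__name__, (PaddleTokenizerMixin, tokenizer_class), {})
# The mixin comes first in the MRO, so its methods take precedence.
Wrapped = type(BaseTokenizer.__name__, (PaddleTokenizerMixin, BaseTokenizer), {})

assert Wrapped.__name__ == "BaseTokenizer"  # original class name is preserved
assert Wrapped().save_pretrained("out/") == "mixin saved to out/"
```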
@@ -14,6 +14,9 @@
# limitations under the License.
import transformers as hf

from ..tokenizer_utils import warp_tokenizer
try:
Our goal is to remove the redundant Paddle dependencies inside PaddleTokenizerMixin itself, not to stop the tokenizers in paddleformers from using PaddleTokenizerMixin.
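A hedged sketch of that direction: PaddleTokenizerMixin stays in the class hierarchy, and only its Paddle-specific code paths are guarded internally. The method name and fallback below are assumptions for illustration, not the mixin's actual API:

```python
try:
    import paddle  # optional at import time
except ImportError:
    paddle = None


class PaddleTokenizerMixin:
    # Hypothetical method: convert token ids to tensors only if Paddle exists.
    def ids_to_tensors(self, ids):
        if paddle is None:
            return ids  # degrade gracefully to plain Python lists
        return paddle.to_tensor(ids)
```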
U: Tokenizer v0.1
D: 45T, 45Lite, Qwen/Qwen2.5-7B-Instruct-1M, Qwen/Qwen3-32B