Docs: return type of get_default_model_and_revision might be incorrectly documented? #35981

Open
MarcoGorelli opened this issue Jan 31, 2025 · 1 comment · May be fixed by #35982

Comments

MarcoGorelli commented Jan 31, 2025

The return type here is annotated as `Union[str, Tuple[str, str]]`:

def get_default_model_and_revision(
targeted_task: Dict, framework: Optional[str], task_options: Optional[Any]
) -> Union[str, Tuple[str, str]]:

The docstring, however, just says `str`:

`str` The model string representing the default model for this pipeline

But I think only `Tuple[str, str]` can actually be correct here.

For example, if I run

from transformers import Pipeline, pipeline
# from pair_classification import PairClassificationPipeline
from transformers import AutoModelForSequenceClassification, TFAutoModelForSequenceClassification
from transformers.pipelines import PIPELINE_REGISTRY
from transformers.utils import is_tf_available, is_torch_available
import numpy as np


def softmax(outputs):
    maxes = np.max(outputs, axis=-1, keepdims=True)
    shifted_exp = np.exp(outputs - maxes)
    return shifted_exp / shifted_exp.sum(axis=-1, keepdims=True)


class PairClassificationPipeline(Pipeline):
    def _sanitize_parameters(self, **kwargs):
        preprocess_kwargs = {}
        if "second_text" in kwargs:
            preprocess_kwargs["second_text"] = kwargs["second_text"]
        return preprocess_kwargs, {}, {}

    def preprocess(self, text, second_text=None):
        return self.tokenizer(text, text_pair=second_text, return_tensors=self.framework)

    def _forward(self, model_inputs):
        return self.model(**model_inputs)

    def postprocess(self, model_outputs):
        logits = model_outputs.logits[0].numpy()
        probabilities = softmax(logits)

        best_class = np.argmax(probabilities)
        label = self.model.config.id2label[best_class]
        score = probabilities[best_class].item()
        logits = logits.tolist()
        return {"label": label, "score": score, "logits": logits}


PIPELINE_REGISTRY.register_pipeline(
    "custom-text-classification",
    pipeline_class=PairClassificationPipeline,
    pt_model=AutoModelForSequenceClassification if is_torch_available() else None,
    tf_model=TFAutoModelForSequenceClassification if is_tf_available() else None,
    default={"pt": ("hf-internal-testing/tiny-random-distilbert", "2ef615d")},
    type="text",
)
assert "custom-text-classification" in PIPELINE_REGISTRY.get_supported_tasks()

_, task_def, _ = PIPELINE_REGISTRY.check_task("custom-text-classification")

classifier = pipeline('custom-text-classification')

then I get

ValueError                                Traceback (most recent call last)
<ipython-input-6-0cc5199a8521> in <cell line: 53>()
     51 _, task_def, _ = PIPELINE_REGISTRY.check_task("custom-text-classification")
     52 
---> 53 classifier = pipeline('custom-text-classification')

/usr/local/lib/python3.10/dist-packages/transformers/pipelines/__init__.py in pipeline(task, model, config, tokenizer, feature_extractor, image_processor, processor, framework, revision, use_fast, token, device, device_map, torch_dtype, trust_remote_code, model_kwargs, pipeline_class, **kwargs)
    898     if model is None:
    899         # At that point framework might still be undetermined
--> 900         model, default_revision = get_default_model_and_revision(targeted_task, framework, task_options)
    901         revision = revision if revision is not None else default_revision
    902         logger.warning(

ValueError: too many values to unpack (expected 2)

It looks like `pipeline` expects a `(model, revision)` tuple here, not a string: a plain string gets unpacked character by character, so any model id longer than two characters fails with exactly this ValueError.
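
For reference, a minimal sketch of why that unpacking fails when a plain string comes back, reusing the model id and revision from the example above:

# Minimal sketch of the unpacking that pipeline() performs on the return value.
try:
    model, default_revision = "hf-internal-testing/tiny-random-distilbert"
except ValueError as e:
    print(e)  # too many values to unpack (expected 2) -- a str unpacks character by character

model, default_revision = ("hf-internal-testing/tiny-random-distilbert", "2ef615d")
print(model, default_revision)  # a (model, revision) tuple unpacks as expected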


It looks like the docstring may just have been overlooked in #17667?
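
If useful, here is a rough sketch of what the corrected annotation and docstring could look like (the wording is only a suggestion, not the actual patch):

from typing import Any, Dict, Optional, Tuple

# Rough sketch of a possible fix; the summary line and Returns wording are suggestions only.
def get_default_model_and_revision(
    targeted_task: Dict, framework: Optional[str], task_options: Optional[Any]
) -> Tuple[str, str]:
    """
    Select a default model and revision for a given pipeline task.

    Returns:
        `Tuple[str, str]`: The model id and the revision of the default model for this pipeline.
    """
    ...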

MarcoGorelli (Author) commented

I can submit a PR if there's interest
