Just to be clear, this question has nothing to do with the regular expression itself: my code runs perfectly, but it does not pass mypy's strict checks.

Let's start with the basics. I have a class defined as follows:

from __future__ import annotations

import re
from typing import AnyStr


class MyClass:
    def __init__(self, regexp: AnyStr | re.Pattern[AnyStr]) -> None:
        if not isinstance(regexp, re.Pattern):
            regexp = re.compile(regexp)
        self._regexp: re.Pattern[str] | re.Pattern[bytes] = regexp

The user can build the class by passing either a compiled re pattern or an AnyStr. I want the class to store the compiled value in the private _regexp attribute, so if the user did not provide a compiled pattern, I compile it before assigning it to the attribute.

So far so good, even though I would have expected self._regexp to have type re.Pattern[AnyStr] instead of the union of the two pattern types. Anyhow, up to here mypy is happy.

Now, in some (or most) cases, the user provides the regexp string via a TOML configuration file, which is read and parsed into a dictionary. For this case I have a class method constructor defined as follows:

    @classmethod
    def from_dict(cls, d: dict[str, str]) -> MyClass:
        r = d.get('regexp')
        if r is None:
            raise KeyError('missing regexp')
        return cls(regexp=r)

The dictionary has type dict[str, str]. I have to check that it contains the right key, because get returns None when the key is missing.

I get the error:

error: Argument "regexp" to "MyClass" has incompatible type "str"; expected "AnyStr | Pattern[AnyStr]" [arg-type]

That looks bizarre, because str should be compatible with AnyStr.

Let's say that I modify the dictionary typing to dict[str, AnyStr]. Instead of fixing the problem, it multiplies it because I get two errors:

error: Argument "regexp" to "MyClass" has incompatible type "str"; expected "AnyStr | Pattern[AnyStr]" [arg-type]
error: Argument "regexp" to "MyClass" has incompatible type "bytes"; expected "AnyStr | Pattern[AnyStr]" [arg-type]

It looks like I am in a loop: when I think I have fixed something, I just moved the problem back elsewhere.

  • AnyStr is a type variable, and a type variable should appear either at least twice in a function signature, or at least once in the signature and once in the enclosing class as a type parameter. If neither applies, you are better off using a union. See mypy Playground Commented Dec 24, 2024 at 9:45
  • Thanks, I will mark it as the accepted answer if you copy it into a real answer :) Commented Dec 24, 2024 at 10:19
  • Weird... it's like mypy is mixing up a generic __init__ method and a generic class. If the class was generic in AnyStr, trying to do cls(regexp=r) would be wrong, because cls might be class Child(MyClass[bytes]), or you might do MyClass[bytes].from_dict(d). But semantically strange or not, I think this code should be valid with a generic __init__ method. Commented Dec 24, 2024 at 17:19
  • @dROOOze Please don't post answers as comments; post a real answer instead :) Commented Dec 24, 2024 at 17:51
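
The generic-class variant discussed in the comments above can be sketched roughly as follows. This is only an illustration under assumptions: making MyClass generic in AnyStr and turning from_dict into a static method that returns MyClass[str] are choices made for this sketch, not something the original code does, and the sketch has not been run through mypy here.

```python
from __future__ import annotations

import re
from typing import AnyStr, Generic


class MyClass(Generic[AnyStr]):
    def __init__(self, regexp: AnyStr | re.Pattern[AnyStr]) -> None:
        # AnyStr is now bound by the class, so it appears in the signature
        # and in the enclosing class.
        if not isinstance(regexp, re.Pattern):
            regexp = re.compile(regexp)
        self._regexp: re.Pattern[AnyStr] = regexp

    @staticmethod
    def from_dict(d: dict[str, str]) -> MyClass[str]:
        # Assumption for this sketch: a static method that always builds the
        # str specialisation, which avoids calling cls(...) on a subclass
        # that might be bound to bytes.
        r = d.get('regexp')
        if r is None:
            raise KeyError('missing regexp')
        return MyClass(regexp=r)
```

With this shape the stored attribute keeps the more precise re.Pattern[AnyStr] type, at the cost of making every use of the class generic.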

1 Answer

AnyStr is a type variable, and a type variable should appear either at least twice in a function signature, or at least once in the signature and once in the enclosing class as a type parameter. If neither applies, you are better off using a union. See mypy Playground

comment by dROOOze
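
As a minimal sketch of the union suggestion above, assuming the class otherwise stays as in the question, the constructor parameter could be annotated with an explicit union instead of AnyStr. The exact annotation below is an assumption made for illustration, not the code from the linked playground:

```python
from __future__ import annotations

import re


class MyClass:
    def __init__(
        self, regexp: str | bytes | re.Pattern[str] | re.Pattern[bytes]
    ) -> None:
        # Compile plain str/bytes input; keep an already compiled pattern.
        if not isinstance(regexp, re.Pattern):
            regexp = re.compile(regexp)
        self._regexp: re.Pattern[str] | re.Pattern[bytes] = regexp

    @classmethod
    def from_dict(cls, d: dict[str, str]) -> MyClass:
        r = d.get('regexp')
        if r is None:
            raise KeyError('missing regexp')
        # r is a plain str here, which is one member of the union above.
        return cls(regexp=r)
```

Because no type variable is left in the signature, the call cls(regexp=r) no longer requires mypy to solve for AnyStr; a str argument simply matches one arm of the union.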
