How can I check if a Python object is a string (either regular or Unicode)?
- [18] What Jason's referring to is duck typing (if it quacks like a duck it probably is a duck). In Python you often "let your code work" on any string-like object without testing whether it's a string or string subclass. For more info, see: docs.python.org/glossary.html#term-duck-typing – Ben Hoyt, commented Aug 20, 2009 at 0:12
- [5] That's what I love about SO. I usually ask a question, it isn't answered, people tell me I shouldn't be doing that anyway and why, and I grow as a programmer. =) – physicsmichael, commented Aug 20, 2009 at 17:41
- [30] +1: Just because an answer is rarely needed doesn't mean the question is invalid. Although I think it's great to have a caution here, I don't think it merits demoting the question. – Trevor, commented Mar 8, 2013 at 23:42
- [17] This is possibly the most legitimate use of type checking in Python. Strings are iterable, so distinguishing them from lists any other way is a bad idea. – ojrac, commented Mar 15, 2013 at 19:07
- [3] There are definitely cases where it is necessary to distinguish strings from other iterables. For example, see the source code for PrettyPrinter in the pprint module. – saxman01, commented Jun 10, 2014 at 14:38
15 Answers
Python 3
In Python 3.x basestring is not available anymore, as str is the sole string type (with the semantics of Python 2.x's unicode).
So the check in Python 3.x is just:
isinstance(obj_to_test, str)
This follows the fix of the official 2to3 conversion tool: converting basestring to str.
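As a quick illustration of what this check accepts in Python 3, here is a minimal sketch. The names Name and is_str are made up for the example; UserString is the wrapper class from the standard collections module.

```python
from collections import UserString


class Name(str):
    """A trivial str subclass, used only for illustration (hypothetical)."""


def is_str(obj):
    # True for str and for any subclass of str.
    return isinstance(obj, str)


print(is_str("text"))              # True: plain str
print(is_str(Name("text")))        # True: subclass of str
print(is_str(b"text"))             # False: bytes is a separate type in Python 3
print(is_str(UserString("text")))  # False: wraps a str but does not inherit from it
```

If bytes should count as a string for your purposes, isinstance(obj, (str, bytes)) is one way to widen the check.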
2 Comments
UserString, UserDict, UserList) before. Those types already predate Python 2. As those types do not inherit from the built-in types str, dict or list, the isinstance call will not work. FYI: Because of this, it is not guaranteed that those types can be used as a drop-in replacement. E.g. the regex module re does not work with UserString (at least with Python 3.8.2).
Python 2
To check if an object o is a string type or a subclass of a string type:
isinstance(o, basestring)
because both str and unicode are subclasses of basestring.
To check if the type of o is exactly str:
type(o) is str
To check if o is an instance of str or any subclass of str:
isinstance(o, str)
The above also work for Unicode strings if you replace str with unicode.
However, you may not need to do explicit type checking at all. "Duck typing" may fit your needs. See http://docs.python.org/glossary.html#term-duck-typing.
See also What’s the canonical way to check for type in python?
2 Comments
basestring in py2.
Python 2 and 3
(cross-compatible)
If you want to check with no regard for Python version (2.x vs 3.x), use six (PyPI) and its string_types attribute:
import six
if isinstance(obj, six.string_types):
    print('obj is a string!')
Within six (a very light-weight single-file module), it's simply doing this:
import sys
PY3 = sys.version_info[0] == 3
if PY3:
    string_types = str
else:
    string_types = basestring
3 Comments
basestring and then fall back to str. E.g.
def is_string(obj):
    try:
        return isinstance(obj, basestring)  # python 2
    except NameError:
        return isinstance(obj, str)  # python 3
I find this more Pythonic:
if type(aObject) is str:
    # do your stuff here
    pass
Since type objects are singletons, is can be used to compare the object's type to the str type.
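One thing to keep in mind with the exact-type comparison is that it rejects subclasses of str, while isinstance accepts them. A minimal sketch of the difference, using a made-up subclass Name and two hypothetical helper names:

```python
class Name(str):
    """A str subclass, used only for illustration (hypothetical)."""


def is_exactly_str(obj):
    # Exact type match: subclasses of str are rejected.
    return type(obj) is str


def is_str_or_subclass(obj):
    # Accepts str and anything that inherits from it.
    return isinstance(obj, str)


value = Name("alice")
print(is_exactly_str("alice"))    # True
print(is_exactly_str(value))      # False: type(value) is Name, not str
print(is_str_or_subclass(value))  # True
```

Which of the two is appropriate depends on whether subclasses should be treated as strings in your code.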
1 Comment
isinstance(obj_to_test, str) is obviously meant to test for type, and it has the advantage of using the same procedure as for other, non-str cases.
If one wants to stay away from explicit type-checking (and there are good reasons to stay away from it), probably the safest part of the string protocol to check is:
str(maybe_string) == maybe_string
It won't iterate through an iterable or iterator, it won't call a list-of-strings a string and it correctly detects a stringlike as a string.
Of course there are drawbacks. For example, str(maybe_string) may be a heavy calculation. As so often, the answer is it depends.
EDIT: As @Tcll points out in the comments, the question actually asks for a way to detect both unicode strings and bytestrings. On Python 2 this answer will fail with an exception for unicode strings that contain non-ASCII characters, and on Python 3 it will return False for all bytestrings.
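To make the behaviour described above concrete, here is a small Python 3 sketch. The helper name looks_like_string is made up for the example, and UserString from the standard collections module stands in for a "stringlike" object.

```python
from collections import UserString


def looks_like_string(maybe_string):
    # A value counts as a string if converting it to str changes nothing.
    return str(maybe_string) == maybe_string


print(looks_like_string("abc"))              # True
print(looks_like_string(UserString("abc")))  # True: compares equal to its str form
print(looks_like_string(["a", "b"]))         # False: str() gives "['a', 'b']"
print(looks_like_string(iter(["a", "b"])))   # False, and the iterator is not consumed
print(looks_like_string(b"abc"))             # False on Python 3, as noted in the edit
```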
2 Comments
b = b'test'; r = str(b) == b where b holds the same data as str(b) but (being a bytes object) does not validate as a string.
To check whether your variable is of a particular type, you could do something like:
s = 'Hello World'
if isinstance(s, str):
    # do something here
    pass
isinstance returns a boolean (True or False), so you can branch accordingly. If you are unsure which type to test for, check the type of your value first with type(s). This returns the str type, which you can then pass to isinstance.
Comments
I might deal with this in the duck-typing style, like others mention. How do I know a string is really a string? Well, obviously by converting it to a string!
def myfunc(word):
    word = unicode(word)
    ...
If the arg is already a str or unicode object, word will hold its value unmodified. If the object passed implements a __unicode__ method, that is used to get its unicode representation. If the object passed cannot be used as a string, the unicode builtin raises an exception.
Comments
isinstance(your_object, basestring)
will be True if your object is indeed a string type. Note that 'str' is the name of a built-in type.
My apologies, the correct answer is to use 'basestring' instead of 'str' so that it includes unicode strings as well, as noted above by one of the other responders.
1 Comment
This evening I ran into a situation in which I thought I was going to have to check against the str type, but it turned out I did not.
My approach to solving the problem will probably work in many situations, so I offer it below in case others reading this question are interested (Python 3 only).
# NOTE: fields is an object that COULD be any number of things, including:
# - a single string-like object
# - a string-like object that needs to be converted to a sequence of
#   string-like objects at some separator, sep
# - a sequence of string-like objects
def getfields(*fields, sep=' ', validator=lambda f: True):
    '''Take a field sequence definition and yield from a validated
    field sequence. Accepts a string, a string with separators,
    or a sequence of strings'''
    if fields:
        try:
            # single unpack in the case of a single argument
            fieldseq, = fields
            try:
                # convert to string sequence if string
                fieldseq = fieldseq.split(sep)
            except AttributeError:
                # not a string; assume other iterable
                pass
        except ValueError:
            # not a single argument and not a string
            fieldseq = fields
        invalid_fields = [field for field in fieldseq if not validator(field)]
        if invalid_fields:
            raise ValueError('One or more field names is invalid:\n'
                             '{!r}'.format(invalid_fields))
    else:
        raise ValueError('No fields were provided')
    try:
        yield from fieldseq
    except TypeError as e:
        raise ValueError('Single field argument must be a string '
                         'or an iterable') from e
Some tests:
from . import getfields

def test_getfields_novalidation():
    result = ['a', 'b']
    assert list(getfields('a b')) == result
    assert list(getfields('a,b', sep=',')) == result
    assert list(getfields('a', 'b')) == result
    assert list(getfields(['a', 'b'])) == result
Comments
I think it's safe to assume that if the final character of the output of repr() is a ' or ", then whatever it is, it ought to be considered some kind of string.
def isStr(o):
    return repr(o)[-1] in '\'"'
I'm assuming that repr won't be doing anything too heavy and that it'll return a string of at least one character. You can support an empty repr() result by using something like
repr(o)[-1:].replace('"', "'") == "'"
but that's still assuming repr returns a string at all.
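To see how this heuristic behaves in practice, here is a small Python 3 sketch. The names is_str_by_repr and Fake are made up for the example; the check is written against a tuple of the two quote characters so that an empty repr() result is rejected.

```python
def is_str_by_repr(o):
    # Treat o as string-like if its repr ends with a quote character.
    # The [-1:] slice tolerates an empty repr, which yields '' and is rejected.
    return repr(o)[-1:] in ("'", '"')


class Fake:
    """Not a string at all, but its repr ends with a quote (hypothetical)."""

    def __repr__(self):
        return "'fake'"


print(is_str_by_repr("abc"))    # True
print(is_str_by_repr("it's"))   # True: repr switches to double quotes
print(is_str_by_repr(b"abc"))   # True: repr is b'abc'
print(is_str_by_repr(["abc"]))  # False: repr ends with ]
print(is_str_by_repr(Fake()))   # True: a false positive
```

The last case shows the main limitation: any object is free to return a repr that ends with a quote.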
Comments
if type(varA) == str or type(varB) == str:
    print('string involved')
From the edX online course MITx: 6.00.1x Introduction to Computer Science and Programming Using Python
1 Comment
str!
For a nice duck-typing approach for string-likes that has the bonus of working with both Python 2.x and 3.x:
def is_string(obj):
    try:
        obj + ''
        return True
    except TypeError:
        return False
wisefish was close with the duck-typing before he switched to the isinstance approach, except that += has a different meaning for lists than + does.
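A short sketch of how this concatenation test behaves on a few inputs under Python 3. The function is repeated so the snippet runs on its own; UserString from the standard collections module is used as an example of a string-like object.

```python
from collections import UserString


def is_string(obj):
    # Duck-typing test: can an empty str be concatenated onto obj?
    try:
        obj + ''
        return True
    except TypeError:
        return False


print(is_string("abc"))              # True
print(is_string(UserString("abc")))  # True: supports + with str
print(is_string(["a", "b"]))         # False: list + str raises TypeError
print(is_string(b"abc"))             # False on Python 3: bytes + str raises TypeError
print(is_string(None))               # False
```

Note that any object whose __add__ accepts a str would also pass, so this is a test for string-like behaviour rather than for the str type.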
5 Comments
isalpha, but who knows what methods would be safe to look for?
try can be faster. If you expect it 99% of the time, maybe not. The performance difference being minimal, it's better to be idiomatic unless you profile your code and identify it as actually being slow.