Is there some string class in Python like StringBuilder in C#?
-
10This is a duplicate of Python equivalent of Java StringBuffer. CAUTION: The answers here are way out of date and have, in fact, become misleading. See that other question for answers that are more relevant to modern Python versions (certainly 2.7 and above).Jean-François Corbett– Jean-François Corbett2017-11-20 08:52:42 +00:00Commented Nov 20, 2017 at 8:52
-
It's February 8, 2025 as I write this response. Yes, with Python 3 (3.13.2 to be exact in my case), you can create a reasonable facsimile of the C# (very close to Java) StringBuilder class with the most often used capabilities. One such example can be found in my GitHub repo located at github.com/fmorriso/Python-StringBuilderIntelligenceGuidedByExperience– IntelligenceGuidedByExperience2025-02-08 15:14:48 +00:00Commented Feb 8 at 15:14
7 Answers
There is no one-to-one correlation. For a really good article please see Efficient String Concatenation in Python:
Building long strings in the Python progamming language can sometimes result in very slow running code. In this article I investigate the computational performance of various string concatenation methods.
TLDR the fastest method is below. It's extremely compact, and also pretty understandable:
def method6():
return ''.join([`num` for num in xrange(loop_count)])
5 Comments
Relying on compiler optimizations is fragile. The benchmarks linked in the accepted answer and numbers given by Antoine-tran are not to be trusted. Andrew Hare makes the mistake of including a call to repr in his methods. That slows all the methods equally but obscures the real penalty in constructing the string.
Use join. It's very fast and more robust.
$ ipython3
Python 3.5.1 (default, Mar 2 2016, 03:38:02)
IPython 4.1.2 -- An enhanced Interactive Python.
In [1]: values = [str(num) for num in range(int(1e3))]
In [2]: %%timeit
...: ''.join(values)
...:
100000 loops, best of 3: 7.37 µs per loop
In [3]: %%timeit
...: result = ''
...: for value in values:
...: result += value
...:
10000 loops, best of 3: 82.8 µs per loop
In [4]: import io
In [5]: %%timeit
...: writer = io.StringIO()
...: for value in values:
...: writer.write(value)
...: writer.getvalue()
...:
10000 loops, best of 3: 81.8 µs per loop
8 Comments
repr call dominates the runtime, but there's no need to make the mistake personal.I have used the code of Oliver Crow (link given by Andrew Hare) and adapted it a bit to tailor Python 2.7.3. (by using timeit package). I ran on my personal computer, Lenovo T61, 6GB RAM, Debian GNU/Linux 6.0.6 (squeeze).
Here is the result for 10,000 iterations:
method1: 0.0538418292999 secs process size 4800 kb method2: 0.22602891922 secs process size 4960 kb method3: 0.0605459213257 secs process size 4980 kb method4: 0.0544030666351 secs process size 5536 kb method5: 0.0551080703735 secs process size 5272 kb method6: 0.0542731285095 secs process size 5512 kb
and for 5,000,000 iterations (method 2 was ignored because it ran tooo slowly, like forever):
method1: 5.88603997231 secs process size 37976 kb method3: 8.40748500824 secs process size 38024 kb method4: 7.96380496025 secs process size 321968 kb method5: 8.03666186333 secs process size 71720 kb method6: 6.68192911148 secs process size 38240 kb
It is quite obvious that Python guys have done pretty great job to optimize string concatenation, and as Hoare said: "premature optimization is the root of all evil" :-)
3 Comments
Python has several things that fulfill similar purposes:
- One common way to build large strings from pieces is to grow a list of strings and join it when you are done. This is a frequently-used Python idiom.
- To build strings incorporating data with formatting, you would do the formatting separately.
- For insertion and deletion at a character level, you would keep a list of length-one strings. (To make this from a string, you'd call
list(your_string). You could also use aUserString.MutableStringfor this. (c)StringIO.StringIOis useful for things that would otherwise take a file, but less so for general string building.
Comments
Using method 5 from above (The Pseudo File) we can get very good perf and flexibility
from cStringIO import StringIO
class StringBuilder:
_file_str = None
def __init__(self):
self._file_str = StringIO()
def Append(self, str):
self._file_str.write(str)
def __str__(self):
return self._file_str.getvalue()
now using it
sb = StringBuilder()
sb.Append("Hello\n")
sb.Append("World")
print sb
Comments
There is no explicit analogue - i think you are expected to use string concatenations(likely optimized as said before) or third-party class(i doubt that they are a lot more efficient - lists in python are dynamic-typed so no fast-working char[] for buffer as i assume). Stringbuilder-like classes are not premature optimization because of innate feature of strings in many languages(immutability) - that allows many optimizations(for example, referencing same buffer for slices/substrings). Stringbuilder/stringbuffer/stringstream-like classes work a lot faster than concatenating strings(producing many small temporary objects that still need allocations and garbage collection) and even string formatting printf-like tools, not needing of interpreting formatting pattern overhead that is pretty consuming for a lot of format calls.