3

I've always found the lack of a built-in aggregate function that just strings together values using some user defined separator, frustrating.

The closest thing I know of is the XML hack:

select s.string as [text()] 
from strings s 
for xml path('')

But in my opinion that's a very strange way of doing it and you still have to handle the case with trailing or leading separator junk. So, I thought I'd write a CLR aggregate:

select dbo.Fold(s.string, ', ') 
from strings s

That's kind of nice. Except this thing doesn't perform very well when the number of rows reaches 5000. I don't really expect to run it against that much data but to my surprise the performance of the aggregate degrades rather rapidly as the data set grows. I didn't understand why at first, then I realized that the way the UDF is handled in SQL Server is the problem.

I'm using an UDF with a StringBuilder class, because of this I have to implement IBinarySerialize. But for reasons I do not understand SQL Server is calling the Write/Read pair between each iteration of Accumulate. Obviously this creates a performance problem when the string gets big. Is there anyway to avoid this costly serialization?

FYI, the XML hack are orders of magnitudes faster.

1 Answer 1

1

What you are doing is implementing the MySQL GROUP_CONCAT

See these for inspiration...

Personally, I use the XML trick.

Sign up to request clarification or add additional context in comments.

1 Comment

The XML trick appears to be the right choice in this matter, it's just a horrible syntax. The group_concat stuff maintains a frequency table while running the aggregate, I'm guessing it performance is just as bad but it's an interesting approach.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.