4

I have a string on which I need to run multiple search-and-replace operations to remove leading and trailing spaces inside an attribute value. The before-and-after effect is shown here (visually, and with a working JS example):

http://lloydi.com/x/re/

Now, I need to do the equivalent in C# - replace all references in a string. But I am really stuck. I know the pattern is correct, as shown in the JS version, but the syntax/escape syntax is doing my head in.

Here's what I have, but of course it doesn't work ;-)

//define the string
string xmlString = "<xml><elementName specificattribute=" 111 222 333333 " anotherattribute="something" somethingelse="winkle"><someotherelement>value of some kind</someotherelement><yetanotherelement>another value of some kind</yetanotherelement></elementName></xml>";

// here's the regExPattern - the syntax checker doesn't like this at all
string regExPattern = "/(specificattribute=)"\s*([^"]+?)\s*"/g";

// here's the replacement
string replacement = "$1\"$2\"";

Regex rgx = new Regex(regExPattern);
string result = rgx.Replace(xmlString, replacement);

Can someone tell me the error of my ways?

Many thanks!

2
  • Try putting an @ symbol in front of the regExPattern string and doubling the inner quotes, like so: string regExPattern = @"/(specificattribute=)""\s*([^""]+?)\s*""/g"; Commented Feb 4, 2010 at 23:33
  • 3
    You shouldn't use regex to parse XML. C# has powerful tools for handling XML documents. Commented Feb 4, 2010 at 23:36

3 Answers

3

Don't use regular expressions for this task. .NET has powerful tools for manipulating XML documents. Try this instead:

XDocument doc = XDocument.Load("input.xml");
foreach (XAttribute attr in doc.Descendants("elementName")
                               .Attributes("specificattribute"))
{
    attr.Value = attr.Value.Trim();
}
doc.Save("output.xml");

2

Remove the leading / and the trailing /g from regExPattern. Those are JavaScript regex-literal delimiters and flags, not part of the pattern itself. .NET's regex implementation has no global modifier; Regex.Replace replaces every match by default.

UPDATE:

I think this should work:

           //define the string
            string xmlString = "<xml><elementName specificattribute=\" 111 222 333333 \" anotherattribute=\"something\" somethingelse=\"winkle\"><someotherelement>value of some kind</someotherelement><yetanotherelement>another value of some kind</yetanotherelement></elementName></xml>";

            // here's the regExPattern: no / delimiters or g flag, inner quotes escaped,
            // and the closing quote included so the lazy group captures the whole value
            string regExPattern = "(specificattribute=)\"\\s*([^\"]+?)\\s*\"";

            // here's the replacement
            string replacement = "$1\"$2\"";

            Regex rgx = new Regex(regExPattern);
            string result = rgx.Replace(xmlString, replacement);
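
For comparison, the same pattern can be written as a C# verbatim string, which sidesteps most of the backslash escaping. This is only a minimal sketch: the class and method names (TrimAttributeDemo, TrimSpecificAttribute) and the short sample input are made up for illustration. Inside an @"..." string, backslashes are literal and a double quote is written as "".

```csharp
using System;
using System.Text.RegularExpressions;

public static class TrimAttributeDemo
{
    // Hypothetical helper, named for this illustration only.
    public static string TrimSpecificAttribute(string xml)
    {
        // Verbatim string: \s stays as-is, and "" stands for one double quote.
        // The resulting pattern is: (specificattribute=)"\s*([^"]+?)\s*"
        string pattern = @"(specificattribute=)""\s*([^""]+?)\s*""";

        // Regex.Replace rewrites every match, so no "global" flag is needed.
        return Regex.Replace(xml, pattern, "$1\"$2\"");
    }

    public static void Main()
    {
        // Made-up sample input with two padded attribute values.
        string xml = "<a specificattribute=\" 111 222 \"/><b specificattribute=\"  x \"/>";
        Console.WriteLine(TrimSpecificAttribute(xml));
        // prints: <a specificattribute="111 222"/><b specificattribute="x"/>
    }
}
```

The trailing quote in the pattern matters: without it, the lazy group is satisfied after a single character and the rest of the value is left behind.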

Although this may actually work for you, XML's nested/context-specific nature makes regular expressions ill-suited to parse it properly and efficiently. It's certainly not the best tool for the job, let's put it that way.

From the look of things, you should really use something like XPath or LINQ to XML to parse and modify these attributes.

I'm practically stealing Mark Byer's answer, but since his example works with XML files and you're doing this in memory, it should look more like this:

XDocument doc = XDocument.Parse("<xml><elementName specificattribute=\" 111 222 333333 \" anotherattribute=\"something\" somethingelse=\"winkle\"><someotherelement>value of some kind</someotherelement><yetanotherelement>another value of some kind</yetanotherelement></elementName></xml>");
foreach (XAttribute attr in doc.Descendants("elementName")
                               .Attributes("specificattribute"))
{
    attr.Value = attr.Value.Trim();
}
string result = doc.ToString();
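
One thing to be aware of: doc.ToString() indents its output by default, so the result will not be a single line like the input. If the one-line form matters, a sketch along these lines may help. The class and method names (SerializeDemo, TrimAndSerialize) and the shortened sample XML are assumptions for illustration; SaveOptions.DisableFormatting is the LINQ to XML option that turns the indentation off.

```csharp
using System;
using System.Xml.Linq;

public static class SerializeDemo
{
    // Hypothetical helper, named for this illustration only.
    public static string TrimAndSerialize(string xml)
    {
        XDocument doc = XDocument.Parse(xml);

        // Trim the value of every specificattribute on elementName elements.
        foreach (XAttribute attr in doc.Descendants("elementName")
                                       .Attributes("specificattribute"))
        {
            attr.Value = attr.Value.Trim();
        }

        // DisableFormatting serializes without added indentation or line breaks.
        return doc.ToString(SaveOptions.DisableFormatting);
    }

    public static void Main()
    {
        // Shortened, made-up version of the XML from the question.
        string xml = "<xml><elementName specificattribute=\" 111 222 333333 \" anotherattribute=\"something\"><child>v</child></elementName></xml>";
        Console.WriteLine(TrimAndSerialize(xml));
        // prints: <xml><elementName specificattribute="111 222 333333" anotherattribute="something"><child>v</child></elementName></xml>
    }
}
```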


0

Seriously, you should be using the classes in the System.Xml namespace for this. Here's another example using XPath:

    string xmlString = "<xml><elementName specificattribute=\" 111 222 333333 \" anotherattribute=\"something\" somethingelse=\"winkle\"><someotherelement>value of some kind</someotherelement><yetanotherelement>another value of some kind</yetanotherelement></elementName></xml>";

    XmlDocument xml = new XmlDocument();
    xml.LoadXml(xmlString);

    foreach (XmlAttribute el in xml.SelectNodes("//@specificattribute"))
    {
        el.Value = el.Value.Trim();
    }
