29

Is there a way to HTML encode a string (NSString) in Objective-C, something along the lines of Server.HtmlEncode in .NET?

14 Answers

43

There isn't an NSString method that does that. You'll have to write your own function that does string replacements. It is sufficient to do the following replacements:

  • '&' => "&amp;"
  • '"' => "&quot;"
  • '\'' => "&#39;"
  • '>' => "&gt;"
  • '<' => "&lt;"

Something like this should do (haven't tried):

[[[[[myStr stringByReplacingOccurrencesOfString: @"&" withString: @"&amp;"]
 stringByReplacingOccurrencesOfString: @"\"" withString: @"&quot;"]
 stringByReplacingOccurrencesOfString: @"'" withString: @"&#39;"]
 stringByReplacingOccurrencesOfString: @">" withString: @"&gt;"]
 stringByReplacingOccurrencesOfString: @"<" withString: @"&lt;"];
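The order of the replacements matters: '&' has to be handled first, otherwise the ampersands introduced by the other entities get escaped a second time. Here is a small Swift sketch of the difference. The two helper names are made up for illustration and only two of the five replacements are shown:

```swift
import Foundation

// Hypothetical helpers, not part of the answer above.
// Correct order: "&" is escaped before any entity is introduced.
func escapeAmpersandFirst(_ s: String) -> String {
    return s.replacingOccurrences(of: "&", with: "&amp;")
            .replacingOccurrences(of: "<", with: "&lt;")
}

// Wrong order: the "&" of the freshly inserted "&lt;" is escaped again.
func escapeAmpersandLast(_ s: String) -> String {
    return s.replacingOccurrences(of: "<", with: "&lt;")
            .replacingOccurrences(of: "&", with: "&amp;")
}

let input = "a < b & c"
print(escapeAmpersandFirst(input)) // a &lt; b &amp; c
print(escapeAmpersandLast(input))  // a &amp;lt; b &amp; c
```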

6 Comments

I was hoping to avoid this, but thanks for the info about them not having something built in.
This solution doesn't work in all cases. I recently ran into a bug with a UTF-8 string that included the byte sequence 0xE28099, which is the UTF-8 encoding of U+2019, the 'right single quotation mark'. Such extended characters are not handled in the above example, and caused errors in our client's server (as they were out of spec). The above example will work in most cases, but not all.
Although it does occur to me now that our client's XML Schema specifically called out ISO-8859-1, and I'm not sure if this would be a problem otherwise.
Just a quick note: if you're going to perform a lot of these escapes on a lot of XML/HTML inside one run loop iteration, don't forget to wrap each encode in an NSAutoreleasePool!
+1 This is excellent, concise code. Really helpful and saved my day. Thanks @thesamet
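For the case raised in the comments above, where the target encoding (ISO-8859-1 there) cannot represent a character such as U+2019, one option is to write every non-ASCII character as a numeric character reference. This is only a sketch in Swift; the function name is hypothetical and it assumes the consumer accepts decimal references:

```swift
import Foundation

// Hypothetical helper: escapes the five markup characters and writes every
// non-ASCII scalar as a decimal numeric character reference, in one pass.
func escapeForASCIIOutput(_ s: String) -> String {
    var out = ""
    for scalar in s.unicodeScalars {
        switch scalar {
        case "&":  out += "&amp;"
        case "\"": out += "&quot;"
        case "'":  out += "&#39;"
        case "<":  out += "&lt;"
        case ">":  out += "&gt;"
        default:
            if scalar.isASCII {
                out.unicodeScalars.append(scalar)
            } else {
                // scalar.value is the code point, printed in decimal.
                out += "&#\(scalar.value);"
            }
        }
    }
    return out
}

print(escapeForASCIIOutput("It\u{2019}s <ok>")) // It&#8217;s &lt;ok&gt;
```

Because the string is walked once, there is no ordering problem between '&' and the other replacements.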
34

I took Mike's work and turned it into a category for NSMutableString and NSString.

Make a Category for NSMutableString with:

- (NSMutableString *)xmlSimpleUnescape
{
    [self replaceOccurrencesOfString:@"&quot;" withString:@"\"" options:NSLiteralSearch range:NSMakeRange(0, [self length])];
    [self replaceOccurrencesOfString:@"&#x27;" withString:@"'"  options:NSLiteralSearch range:NSMakeRange(0, [self length])];
    [self replaceOccurrencesOfString:@"&#39;"  withString:@"'"  options:NSLiteralSearch range:NSMakeRange(0, [self length])];
    [self replaceOccurrencesOfString:@"&#x92;" withString:@"'"  options:NSLiteralSearch range:NSMakeRange(0, [self length])];
    [self replaceOccurrencesOfString:@"&#x96;" withString:@"-"  options:NSLiteralSearch range:NSMakeRange(0, [self length])];
    [self replaceOccurrencesOfString:@"&gt;"   withString:@">"  options:NSLiteralSearch range:NSMakeRange(0, [self length])];
    [self replaceOccurrencesOfString:@"&lt;"   withString:@"<"  options:NSLiteralSearch range:NSMakeRange(0, [self length])];
    // Unescape "&amp;" last so that "&amp;lt;" yields "&lt;" rather than "<".
    [self replaceOccurrencesOfString:@"&amp;"  withString:@"&"  options:NSLiteralSearch range:NSMakeRange(0, [self length])];

    return self;
}

- (NSMutableString *)xmlSimpleEscape
{
    [self replaceOccurrencesOfString:@"&"  withString:@"&amp;"  options:NSLiteralSearch range:NSMakeRange(0, [self length])];
    [self replaceOccurrencesOfString:@"\"" withString:@"&quot;" options:NSLiteralSearch range:NSMakeRange(0, [self length])];
    [self replaceOccurrencesOfString:@"'"  withString:@"&#x27;" options:NSLiteralSearch range:NSMakeRange(0, [self length])];
    [self replaceOccurrencesOfString:@">"  withString:@"&gt;"   options:NSLiteralSearch range:NSMakeRange(0, [self length])];
    [self replaceOccurrencesOfString:@"<"  withString:@"&lt;"   options:NSLiteralSearch range:NSMakeRange(0, [self length])];

    return self;
}

Make a Category for NSString with:

- (NSString *)xmlSimpleUnescapeString
{
    NSMutableString *unescapeStr = [NSMutableString stringWithString:self];

    return [unescapeStr xmlSimpleUnescape];
}


- (NSString *)xmlSimpleEscapeString
{
    NSMutableString *escapeStr = [NSMutableString stringWithString:self];

    return [escapeStr xmlSimpleEscape];
}

* A Swift 2.0 Version *

The Objective-C version is a little more efficient as it does mutable operations on the string. However, this is a swift way to do simple escaping:

extension String
{
    typealias SimpleToFromRepalceList = [(fromSubString:String,toSubString:String)]

    // See http://stackoverflow.com/questions/24200888/any-way-to-replace-characters-on-swift-string
    //
    func simpleReplace( mapList:SimpleToFromRepalceList ) -> String
    {
        var string = self

        for (fromStr, toStr) in mapList {
            let separatedList = string.componentsSeparatedByString(fromStr)
            if separatedList.count > 1 {
                string = separatedList.joinWithSeparator(toStr)
            }
        }

        return string
    }

    func xmlSimpleUnescape() -> String
    {
        let mapList : SimpleToFromRepalceList = [
            ("&quot;", "\""),
            ("&#x27;", "'"),
            ("&#39;",  "'"),
            ("&#x92;", "'"),
            ("&#x96;", "-"),
            ("&gt;",   ">"),
            ("&lt;",   "<"),
            // Unescape "&amp;" last so that "&amp;lt;" yields "&lt;" rather than "<".
            ("&amp;",  "&")]

        return self.simpleReplace(mapList)
    }

    func xmlSimpleEscape() -> String
    {
        let mapList : SimpleToFromRepalceList = [
            ("&",  "&amp;"),
            ("\"", "&quot;"),
            ("'",  "&#x27;"),
            (">",  "&gt;"),
            ("<",  "&lt;")]

        return self.simpleReplace(mapList)
    }
}

I could have used the NSString bridging capabilities to write something very similar to the NSString version, but I decided to do it more swifty.

6 Comments

Added Swift version as I'm starting to do more Swift
&#x39; should be &#39;, and &#x96; is a dash, not an apostrophe. character-code.com/punctuation-html-codes.php
@skensell thanks for the feedback. I made those changes above.
@TodCunningham, for the complete ASCII character set, see character-code.com/ascii-table.php. I think all those codes should be included for completeness, or at least all the HTML entities.
@TodCunningham At least for the five special characters (quot, amp, apos, lt, and gt), the Unescape function should handle more mappings, such as quot: &#34;, &#x22;; amp: &#38;, &#x26;; lt: &#60;, &#x3c;; and gt: &#62;, &#x3e;.
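Instead of listing every decimal and hexadecimal variant by hand, as the last comment suggests, the numeric references can be decoded generically. Below is a rough Swift sketch. The function name is hypothetical, and named entities such as &lt; are deliberately left alone:

```swift
import Foundation

// Hypothetical helper: decodes decimal (&#39;) and hexadecimal (&#x27;)
// numeric character references; everything else is copied unchanged.
func decodeNumericReferences(_ s: String) -> String {
    var out = ""
    var rest = Substring(s)
    while let start = rest.range(of: "&#") {
        out.append(contentsOf: rest[..<start.lowerBound])
        let afterMarker = rest[start.upperBound...]
        guard let semi = afterMarker.firstIndex(of: ";") else {
            // No terminating ";": keep the remaining text as it is.
            out.append(contentsOf: rest[start.lowerBound...])
            return out
        }
        var digits = afterMarker[..<semi]
        var radix = 10
        if let first = digits.first, first == "x" || first == "X" {
            radix = 16
            digits = digits.dropFirst()
        }
        if let value = UInt32(digits, radix: radix),
           let scalar = Unicode.Scalar(value) {
            out.unicodeScalars.append(scalar)
            rest = afterMarker[afterMarker.index(after: semi)...]
        } else {
            // Not a valid reference: emit "&#" literally and keep scanning.
            out.append("&#")
            rest = afterMarker
        }
    }
    out.append(contentsOf: rest)
    return out
}

print(decodeNumericReferences("&#34;Tom&#x27;s&#x22; &lt; &#60;"))
// "Tom's" &lt; <
```

If this is combined with the named-entity replacements, it should run before "&amp;" is unescaped, so that an escaped "&amp;#39;" ends up as the literal text "&#39;" and not as an apostrophe.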
28

I use Google Toolbox for Mac (works on iPhone). In particular, see the additions to NSString in GTMNSString+HTML.h and GTMNSString+XML.h.

13

For URL encoding:

NSString * encodedString = [originalString
      stringByAddingPercentEscapesUsingEncoding:NSASCIIStringEncoding];

See Apple's NSString documentation for more info.

For HTML encoding:

Check out CFXMLCreateStringByEscapingEntities, which is part of the Core Foundation XML library, but should still do the trick.

4 Comments

I believe that's URL encoding, not HTML encoding.
@JW: Yes, you are absolutely right. I didn't know there was a difference!
stringByAddingPercentEscapesUsingEncoding is what I am currently using, unfortunately it doesn't work for some of the more odd characters.
Exactly what I needed to make strings compliant to send to my server-side code in a UIWebView, thanks
6

thesamet's routine forgot the hex digit. Here's the routine I came up with that works:

- (NSString *)convertEntities:(NSString *)string
{
    NSString *returnStr = nil;

    if (string)
    {
        returnStr = [string stringByReplacingOccurrencesOfString:@"&quot;" withString:@"\""];
        returnStr = [returnStr stringByReplacingOccurrencesOfString:@"&#x27;" withString:@"'"];
        // &#39; is the decimal form of the apostrophe (&#x39; would be the digit 9).
        returnStr = [returnStr stringByReplacingOccurrencesOfString:@"&#39;" withString:@"'"];
        returnStr = [returnStr stringByReplacingOccurrencesOfString:@"&#x92;" withString:@"'"];
        // &#x96; is a dash, not an apostrophe.
        returnStr = [returnStr stringByReplacingOccurrencesOfString:@"&#x96;" withString:@"-"];
        returnStr = [returnStr stringByReplacingOccurrencesOfString:@"&gt;" withString:@">"];
        returnStr = [returnStr stringByReplacingOccurrencesOfString:@"&lt;" withString:@"<"];
        // Unescape "&amp;" last so that "&amp;lt;" yields "&lt;" rather than "<".
        returnStr = [returnStr stringByReplacingOccurrencesOfString:@"&amp;" withString:@"&"];
    }

    return returnStr;
}

1 Comment

You are creating a lot of temporary NSString objects (about eight), which is not great for memory usage.
5

Swift 4

extension String {
    var xmlEscaped: String {
        return replacingOccurrences(of: "&", with: "&amp;")
            .replacingOccurrences(of: "\"", with: "&quot;")
            .replacingOccurrences(of: "'", with: "&#39;")
            .replacingOccurrences(of: ">", with: "&gt;")
            .replacingOccurrences(of: "<", with: "&lt;")
    }
}

4

If you can use NSXMLNode (on OS X), here is the trick:

NSString *string = @"test<me>";
NSXMLNode *textNode = [NSXMLNode textWithStringValue:string];
NSString *escapedString = [textNode XMLString];

2 Comments

This is also useful if you use the KissXML library, which adds support for NSXMLNode/NSXMLElement etc.
Very useful! To also have newlines encoded for HTML, you need to simply do [[textNode XMLString] stringByReplacingOccurrencesOfString:@"\n" withString:@"<br>"]
3

Here is a more efficient implementation of this xml escape logic.

+ (NSString*) xmlSimpleEscape:(NSString*)unescapedStr
{
  if (unescapedStr == nil || [unescapedStr length] == 0) {
    return unescapedStr;
  }

  const int len = [unescapedStr length];
  int longer = ((int) (len * 0.10));
  if (longer < 5) {
    longer = 5;
  }
  longer = len + longer;
  NSMutableString *mStr = [NSMutableString stringWithCapacity:longer];

  NSRange subrange;
  subrange.location = 0;
  subrange.length = 0;

  for (int i = 0; i < len; i++) {
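    // characterAtIndex: returns a unichar; storing it in a char would truncate
    // non-ASCII characters and could make them match '&', '<', etc. by accident.
    unichar c = [unescapedStr characterAtIndex:i];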
    NSString *replaceWithStr = nil;

    if (c == '\"')
    {
      replaceWithStr = @"&quot;";
    }
    else if (c == '\'')
    {
      replaceWithStr = @"&#x27;";
    }
    else if (c == '<')
    {
      replaceWithStr = @"&lt;";
    }
    else if (c == '>')
    {
      replaceWithStr = @"&gt;";
    }
    else if (c == '&')
    {
      replaceWithStr = @"&amp;";
    }

    if (replaceWithStr == nil) {
      // The current character is not an XML escape character, increase subrange length

      subrange.length += 1;
    } else {
      // The current character will be replaced, but append any pending substring first

      if (subrange.length > 0) {
        NSString *substring = [unescapedStr substringWithRange:subrange];
        [mStr appendString:substring];
      }

      [mStr appendString:replaceWithStr];

      subrange.location = i + 1;
      subrange.length = 0;
    }
  }

  // Got to end of unescapedStr so append any pending substring, in the
  // case of no escape characters this will append the whole string.

  if (subrange.length > 0) {
    if (subrange.location == 0) {
      [mStr appendString:unescapedStr];      
    } else {
      NSString *substring = [unescapedStr substringWithRange:subrange];
      [mStr appendString:substring];
    }
  }

  return [NSString stringWithString:mStr];
}

+ (NSString*) formatSimpleNode:(NSString*)tagname value:(NSString*)value
{
  NSAssert(tagname != nil, @"tagname is nil");
  NSAssert([tagname length] > 0, @"tagname is the empty string");

  if (value == nil || [value length] == 0) {
    // Certain XML parsers don't like empty nodes like "<foo/>", use "<foo />" instead
    return [NSString stringWithFormat:@"<%@ />", tagname];
  } else {
    NSString *escapedValue = [self xmlSimpleEscape:value];
    return [NSString stringWithFormat:@"<%@>%@</%@>", tagname, escapedValue, tagname];    
  }
}

1 Comment

Definitely more efficient than many calls to stringByReplacing..., thank you!
1

Here is my swift category for html encoding/decoding:

extension String
{
    // Only one entity per character, so the escaped output is predictable.
    static let htmlEscapedDictionary = [
        "&amp;": "&",
        "&quot;" : "\"",
        "&#39;" : "'",
        "&gt;" : ">",
        "&lt;" : "<"]

    var escapedHtmlString : String {
        var newString = "\(self)"

        // Dictionary order is not guaranteed, so escape "&" before anything else.
        newString.replace("&", withString: "&amp;")
        for (key, value) in String.htmlEscapedDictionary {
            if value != "&" {
                newString.replace(value, withString: key)
            }
        }
        return newString
    }

    var unescapedHtmlString : String {
        let encodedData = self.dataUsingEncoding(NSUTF8StringEncoding)!
        let attributedOptions : [String: AnyObject] = [
            NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType,
            NSCharacterEncodingDocumentAttribute: NSUTF8StringEncoding
        ]
        let attributedString = NSAttributedString(data: encodedData, options: attributedOptions, documentAttributes: nil, error: nil)!
        return attributedString.string
    }

    mutating func replace(originalString:String, withString newString:String)
    {
        let replacedString = self.stringByReplacingOccurrencesOfString(originalString, withString: newString, options: nil, range: nil)
        self = replacedString
    }
}

I guess a reverse of htmlEscapedDictionary could've been used as well in unescapedHtmlString

Note: As MarkBao pointed out in the comment below, Swift does not guarantee the order of dictionaries, so make sure to replace & first.

2 Comments

This is great, thank you. One thing to note for the escapedHtmlString method is that sometimes & will be processed after other entities are converted (because Swift dictionaries do not guarantee order), which means that some things will get double-encoded. The fix is that & should be processed first; I did this by pulling out the & line on line 4 and putting newString.replace("&", withString: "&amp;") before the for loop.
@MarkBao very good point. Think I later ran into the same issue.
1

I put together a quick example project using Mike and Tod's answers here.

Makes the encoding/unencoding dead simple:

NSString *html = @"<p>This \"paragraph\" contains quoted & 'single' quoted stuff.</p>";
NSLog(@"Original String: %@", html);

NSString *escapedHTML = [html xmlSimpleEscapeString];
NSLog(@"Escaped String: %@", escapedHTML);

NSString *unescapedHTML = [escapedHTML xmlSimpleUnescapeString];
NSLog(@"Unescaped String: %@", unescapedHTML);

0

The easiest solution is to create a category, as below:

Here’s the category’s header file:

#import <Foundation/Foundation.h>
@interface NSString (URLEncoding)
-(NSString *)urlEncodeUsingEncoding:(NSStringEncoding)encoding;
@end

And here’s the implementation:

#import "NSString+URLEncoding.h"
@implementation NSString (URLEncoding)
-(NSString *)urlEncodeUsingEncoding:(NSStringEncoding)encoding {
    return (NSString *)CFURLCreateStringByAddingPercentEscapes(NULL,
               (CFStringRef)self,
               NULL,
               (CFStringRef)@"!*'\"();:@&=+$,/?%#[]% ",
               CFStringConvertNSStringEncodingToEncoding(encoding));
}
@end

And now we can simply do this:

NSString *raw = @"hell & brimstone + earthly/delight";
NSString *url = [NSString stringWithFormat:@"http://example.com/example?param=%@",
            [raw urlEncodeUsingEncoding:NSUTF8StringEncoding]];
NSLog(@"%@", url); // never pass a non-literal string as the format argument

Credit for this answer goes to the website below:

http://madebymany.com/blog/url-encoding-an-nsstring-on-ios

1 Comment

HTML escaping is not the same as URL encoding. The sample you provided does URL encoding.
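To make the distinction in the comment above concrete, here is a short Swift sketch that runs the same input through both kinds of encoding. The variable names are illustrative, and the allowed character set for the percent encoding (.alphanumerics) is just one deliberately strict choice:

```swift
import Foundation

let raw = "Tom & Jerry <3"

// URL (percent) encoding: for placing text inside a URL.
let urlEncoded = raw.addingPercentEncoding(withAllowedCharacters: .alphanumerics) ?? raw

// HTML escaping: for placing text inside markup.
let htmlEscaped = raw
    .replacingOccurrences(of: "&", with: "&amp;")
    .replacingOccurrences(of: "<", with: "&lt;")
    .replacingOccurrences(of: ">", with: "&gt;")

print(urlEncoded)  // Tom%20%26%20Jerry%20%3C3
print(htmlEscaped) // Tom &amp; Jerry &lt;3
```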
0

Refer to the answer below:

NSString *content = global.strPrivacyPolicy;
// Unescape "&amp;" last so that "&amp;lt;" yields "&lt;" rather than "<".
content = [[[[[content stringByReplacingOccurrencesOfString:@"&quot;" withString:@"\""]
    stringByReplacingOccurrencesOfString:@"&#39;" withString:@"'"]
    stringByReplacingOccurrencesOfString:@"&gt;" withString:@">"]
    stringByReplacingOccurrencesOfString:@"&lt;" withString:@"<"]
    stringByReplacingOccurrencesOfString:@"&amp;" withString:@"&"];
[_webViewPrivacy loadHTMLString:content baseURL:nil];

0

Use the method shown in the example below:

anyStringConverted = [anyString stringByReplacingOccurrencesOfString:@"\n" withString:@"<br>"]; 

This converts newlines to the corresponding HTML tag. To convert other symbols, you have to write the corresponding HTML character code. The complete list of HTML codes is here:

http://www.ascii.cl/htmlcodes.htm

0

This is the only way I found that uses only built-in functions (no manual parsing) and covers all cases. It requires AppKit/UIKit in addition to Foundation. The code is Swift, written as a method in a String extension, but it ports easily to Objective-C:

func encodedForHTML() -> String {

    // make a plain attributed string and then use its HTML write functionality
    let attrStr = NSAttributedString(string: self)

    // by default, the document outputs a whole HTML element
    // warning: if default apple implementation changes, this may need to be tweaked
    let options: [NSAttributedString.DocumentAttributeKey: Any] = [
            .documentType: NSAttributedString.DocumentType.html,
            .excludedElements: [
                "html",
                "head",
                "meta",
                "title",
                "style",
                "p",
                "body",
                "font",
                "span"
            ]
    ]

    // generate data and turn into string
    let data = try! attrStr.data(from: NSRange(location: 0, length: attrStr.length), documentAttributes: options)
    let str = String(data: data, encoding: .utf8)!

    // remove <?xml line
    return str.components(separatedBy: .newlines).dropFirst().first!
}
