I am using HtmlAgilityPack to read and parse an HTML page and extract some text:

static void Main(string[] args)
{
    var webGet = new HtmlWeb();
    var document = webGet.Load("http://port.ro/");

    // Select the text of every non-empty <a> element that has an href attribute.
    var programs = from program in document.DocumentNode.Descendants()
                   where program.Name == "a"
                         && program.Attributes["href"] != null
                         && program.InnerText.Trim().Length > 0
                   select program.InnerText;

    foreach (string s in programs)
    {
        Console.WriteLine(s);
    }

    Console.ReadLine();
}

My problem is that the website contains characters like à and when I print them, they are replaced by ?.

What do I need to do so that, when I print the text, the character à is either replaced by a or printed correctly as à?

2 Answers

Did you try setting the encoding that the site requires? That should give you the proper text:

var document = webGet.Load("http://port.ro/", Encoding.UTF8);//check your encoding

The overload above, which takes an Encoding, is the one on HtmlDocument.

For HtmlWeb, try this:

var web = new HtmlWeb
{
    AutoDetectEncoding = false,
    OverrideEncoding = myEncoding,
};
var doc = web.Load(myUrl);
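
The question also asks whether à can be printed as a plain a. Choosing a different encoding will not do that by itself. One common approach, shown here only as a minimal sketch, is to decompose each character into a base letter plus combining marks and then drop the marks. The helper name RemoveDiacritics is hypothetical and not part of HtmlAgilityPack; it uses only the .NET base class library.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Text;

public static class DiacriticsDemo
{
    // Hypothetical helper: strips accents by removing Unicode combining marks.
    public static string RemoveDiacritics(string text)
    {
        // FormD splits "à" (U+00E0) into "a" followed by a combining grave accent (U+0300).
        string decomposed = text.Normalize(NormalizationForm.FormD);

        // Keep everything except non-spacing marks, i.e. the accents themselves.
        var kept = decomposed.Where(
            c => CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark);

        // Recompose whatever is left into the usual composed form.
        return new string(kept.ToArray()).Normalize(NormalizationForm.FormC);
    }

    public static void Main()
    {
        // "\u00E0" is à, written as an escape so the source file encoding does not matter.
        Console.WriteLine(RemoveDiacritics("\u00E0 la carte")); // prints "a la carte"
    }
}
```

In the original program this could be applied to each string before Console.WriteLine(s). Letters that are not a base letter plus an accent (for example ß or ø) are left unchanged by this approach.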

4 Comments

Can I pass that as the second argument? I get "invalid arguments".
Now it doesn't recognize OverrideEncoding, and without it I get the same results.
That seems to be an old version. Check out the comments in the link; perhaps you should use the latest version.
Yep, it works with the latest version. By the way, do you know which encoding I should use so that I get 'a' instead of 'à'?

HtmlAgilityPack has a property for setting the stream encoding. Normally it should detect the encoding automatically, but detection may not be working for your page (wrong meta tags, etc.).
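
To illustrate why a wrongly detected encoding can turn à into ?, here is a small self-contained sketch that does not use HtmlAgilityPack at all. It assumes, purely for illustration, that the page bytes are UTF-8 and that they get decoded as ASCII; the real page and the real misdetected encoding may differ.

```csharp
using System;
using System.Text;

public static class EncodingMismatchDemo
{
    // Decodes the same bytes with a caller-chosen encoding.
    public static string Decode(byte[] raw, Encoding encoding)
    {
        return encoding.GetString(raw);
    }

    public static void Main()
    {
        // "\u00E0" is à; in UTF-8 it is stored as two bytes (0xC3 0xA0).
        byte[] raw = Encoding.UTF8.GetBytes("\u00E0 la carte");

        // Decoding with the matching encoding restores the original text.
        string right = Decode(raw, Encoding.UTF8);

        // ASCII cannot represent bytes above 0x7F, so each one becomes '?'.
        string wrong = Decode(raw, Encoding.ASCII);

        Console.WriteLine(right == "\u00E0 la carte"); // prints "True"
        Console.WriteLine(wrong);                      // prints "?? la carte"
    }
}
```

If the text is already correct in memory, the console itself may still print ? because of its own output encoding; setting Console.OutputEncoding = Encoding.UTF8 before printing is one thing worth trying in that case.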
