9

Using VB.NET or C#, how do I get the generated HTML source?

To get the HTML source of a page I can use the code below, but it won't return the generated source: it won't contain any of the HTML that was added dynamically by JavaScript in the browser. How do I get the final generated HTML source?

Thanks.

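// Requires: using System.IO; using System.Net;
// Returns the markup as served; no JavaScript runs here.
string html;
WebRequest req = WebRequest.Create("http://www.asp.net");
using (WebResponse res = req.GetResponse())
using (StreamReader sr = new StreamReader(res.GetResponseStream()))
    html = sr.ReadToEnd();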

If I try the code below, it returns the document without the content injected by JavaScript.

Public Class Form1

    Dim WB As WebBrowser = Nothing

    Private Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load

        WB = New WebBrowser()
        Me.Controls.Add(WB)
        AddHandler WB.DocumentCompleted, AddressOf WebBrowser1_DocumentCompleted


        WB.Navigate("mysite/Default.aspx")

    End Sub

    Private Sub WebBrowser1_DocumentCompleted(sender As Object, e As WebBrowserDocumentCompletedEventArgs)


        'Dim htmlcode As String = WebBrowser1.Document.Body.OuterHtml()
        Dim s As String = WB.DocumentText

    End Sub
End Class

HTML returned

<!DOCTYPE html>

<html xmlns="http://www.w3.org/1999/xhtml">
<head runat="server">
    <title></title>

</head>
<body>
    <form id="form1" runat="server">
    <div id="center_text_panel">
    //test text  this text should be here
    </div>
    </form>
</body>
</html>

    <script type="text/javascript">

        document.getElementById("center_text_panel").innerText = "test text";


    </script>
  • Use the WebBrowser control. Commented Feb 13, 2013 at 6:23
  • Do you have an example? Have you done it before? I tried that but could not get it to work. Commented Feb 13, 2013 at 6:25
  • Did you ever figure this out @Hello-World? I'm having the SAME issue and tried using the new WebView2 control from MS, but still no love! Commented Jul 4, 2020 at 0:59

3 Answers

2

You can use WebKit.NET

Look here for official tutorials

This can not only grab the source, but also process JavaScript through the page load event.

webKitBrowser1.Navigate(MyURL)

Then, handle the DocumentCompleted event, and:

Dim documentContent As String = webKitBrowser1.DocumentText

Edit - This might be the better open source WebKit option: http://code.google.com/p/open-webkit-sharp/


2 Comments

+1, also I think "need" is too strong here - built in WebBrowser or even PhantomJS can be used to do the same.
WebKit is giving me the source HTML, not the final generated HTML.
1

Just put a WebBrowser control on your form and use the following code:

 webBrowser1.Navigate("YourLink");

 private void webBrowser1_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
 {
     // Alternatively, read individual fields or elements, or use webBrowser1.DocumentText.
     string htmlcode = webBrowser1.Document.Body.InnerHtml;
 }

Edited

To also get the HTML that is generated dynamically by JavaScript, you have two options:

  1. Run the following code after the webBrowser1_DocumentCompleted event:
     StringBuilder htmlcode = new StringBuilder();
     // Note: Document.All also contains nested elements, so their markup
     // is appended more than once by this loop.
     foreach (HtmlElement item in webBrowser1.Document.All)
     {
         htmlcode.Append(item.InnerHtml);
     }
  2. Write a JavaScript function that returns document.documentElement.innerHTML and use the InvokeScript function to return the result:
     // "javascriptcode" stands for the name of that script function.
     var htmlcode = webBrowser1.Document.InvokeScript("javascriptcode");
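The two options above can be sketched in one small WinForms program. This is a minimal sketch under stated assumptions, not a definitive implementation: the class name GeneratedHtmlForm and the handler name OnDocumentCompleted are hypothetical, the URL is the placeholder from the question, and calling the script engine's eval through InvokeScript is a widely used trick whose availability may depend on the document mode the control runs in. The key point is that both reads go through the live DOM rather than DocumentText.

```csharp
using System;
using System.Diagnostics;
using System.Windows.Forms;

// Hypothetical form used only for illustration.
public class GeneratedHtmlForm : Form
{
    private readonly WebBrowser browser = new WebBrowser { Dock = DockStyle.Fill };

    public GeneratedHtmlForm()
    {
        Controls.Add(browser);
        browser.DocumentCompleted += OnDocumentCompleted;
        // Placeholder URL from the question; replace with the real page.
        browser.Navigate("http://www.asp.net");
    }

    private void OnDocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        // DocumentCompleted can fire once per frame; this common check
        // skips events that are not for the top-level page.
        if (e.Url != browser.Url) return;

        // Option 1: serialize the live DOM from the managed side.
        HtmlElementCollection roots = browser.Document.GetElementsByTagName("html");
        string fromDom = roots.Count > 0 ? roots[0].OuterHtml : string.Empty;

        // Option 2: let the page's script engine serialize the DOM.
        // Assumption: "eval" is reachable through InvokeScript for this page;
        // if it is not, the result is null.
        object result = browser.Document.InvokeScript(
            "eval", new object[] { "document.documentElement.outerHTML" });
        string fromScript = result as string;

        Debug.WriteLine(fromDom);
        Debug.WriteLine(fromScript ?? "(script call returned nothing)");
    }

    [STAThread]
    public static void Main()
    {
        Application.EnableVisualStyles();
        Application.Run(new GeneratedHtmlForm());
    }
}
```

Scripts that keep changing the page after it has loaded (timers, AJAX callbacks) may not have finished when DocumentCompleted fires, so the snapshot may need to be taken later.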

4 Comments

Thanks, that's great, but it returns the source, not the generated source.
To get dynamically generated code you must use extra JavaScript code. If you add more details about what you want to do, a solution can be shown (or some extra code added).
Hi - WebBrowser.DocumentText needs to return the generated HTML with the JavaScript-injected content in it. Do you think this might need to be done asynchronously? Thanks for your help.
The string returned is just the HTML before JavaScript runs.
0

You can use this code:

webBrowser1.Document.Body.OuterHtml
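If the page's scripts finish only after the document has loaded, reading the property immediately may still miss their changes. A hedged sketch of one workaround is to wait briefly before reading Body.OuterHtml. The class name DelayedSnapshotForm, the handler name OnSettled, and the 2000 ms delay are assumptions chosen for illustration; the right delay depends on the page.

```csharp
using System;
using System.Diagnostics;
using System.Windows.Forms;

// Hypothetical form used only for illustration.
public class DelayedSnapshotForm : Form
{
    private readonly WebBrowser browser = new WebBrowser { Dock = DockStyle.Fill };

    // Assumed settle time; tune it for the page being loaded.
    private readonly Timer settleTimer = new Timer { Interval = 2000 };

    public DelayedSnapshotForm()
    {
        Controls.Add(browser);
        settleTimer.Tick += OnSettled;

        // Restart the countdown every time a document (or frame) completes.
        browser.DocumentCompleted += (s, e) =>
        {
            settleTimer.Stop();
            settleTimer.Start();
        };

        // Placeholder URL from the question; replace with the real page.
        browser.Navigate("http://www.asp.net");
    }

    private void OnSettled(object sender, EventArgs e)
    {
        settleTimer.Stop();
        if (browser.Document == null || browser.Document.Body == null) return;

        // Reads the current DOM state of <body>, including script changes
        // made up to this moment.
        string generatedBody = browser.Document.Body.OuterHtml;
        Debug.WriteLine(generatedBody);
    }

    [STAThread]
    public static void Main()
    {
        Application.Run(new DelayedSnapshotForm());
    }
}
```

A fixed delay is a blunt instrument; it is used here only because it keeps the sketch short.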
