12

I can successfully display a web site in WebView2 in my VB.NET (Visual Studio 2017) project, but I cannot get the HTML source code. Please advise me how to get the HTML.

My code:

Private Sub testbtn_Click(sender As Object, e As EventArgs) Handles testbtn.Click
        WebView2.CoreWebView2.Navigate("https://www.microsoft.com/")
End Sub

Private Sub WebView2_NavigationCompleted(sender As Object, e As CoreWebView2NavigationCompletedEventArgs) Handles WebView2.NavigationCompleted
        Dim html As String = ?????
End Sub

Thank you in advance for your advice.

4
  • I've never used a WebView2 control and there seems to be little information around about this but I suspect that it starts here. I think the reason that it's not well documented is that it's part of Chromium. Commented Jun 17, 2020 at 15:06
  • Does this answer your question? How I get page source from WebView? Commented Jun 17, 2020 at 19:18
  • Also, stackoverflow.com/questions/29654149/… Commented Jun 17, 2020 at 19:18
  • Thank you indeed. I have read through the document but still cannot find the answer. I also tried the link "stackoverflow.com/questions/29654149/…" but unfortunately "Await myWebView.InvokeScriptAsync" is marked as an error and does not work. Commented Jun 17, 2020 at 22:56

5 Answers

35

I've only just started messing with the WebView2 earlier today as well, and was just looking for this same thing. I did manage to scrape together this solution:

Dim html As String
html = Await WebView2.ExecuteScriptAsync("document.documentElement.outerHTML;")

' The Html comes back with unicode character codes, other escaped characters, and
' wrapped in double quotes, so I'm using this code to clean it up for what I'm doing.
html = Regex.Unescape(html)
html = html.Remove(0, 1)
html = html.Remove(html.Length - 1, 1)

Converted my code from C# to VB on the fly, so hopefully didn't miss any syntax errors.
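
The cleanup step above can be tried without a WebView2 control. What follows is only a minimal sketch: `CleanScriptResult` is a hypothetical helper name, and the sample string is a made-up stand-in for what `ExecuteScriptAsync` might return. It strips the wrapping quotes only when both are present, so an unquoted result such as `null` is left alone.

```vb
Imports System
Imports System.Text.RegularExpressions

Module UnescapeSketch
    ' Hypothetical helper (not part of WebView2): unescapes the raw script result
    ' and strips the wrapping double quotes only when both are present.
    Function CleanScriptResult(raw As String) As String
        Dim cleaned As String = Regex.Unescape(raw)
        If cleaned.Length >= 2 AndAlso cleaned.StartsWith("""") AndAlso cleaned.EndsWith("""") Then
            cleaned = cleaned.Substring(1, cleaned.Length - 2)
        End If
        Return cleaned
    End Function

    Sub Main()
        ' Made-up sample of an escaped, quoted result; real output will differ.
        Dim raw As String = """\u003Chtml>\u003Cbody>Hello\u003C/body>\u003C/html>"""
        Console.WriteLine(CleanScriptResult(raw))
    End Sub
End Module
```

In a real handler, the string passed to the helper would be the value awaited from `ExecuteScriptAsync`.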


3 Comments

Fantastic, thank you indeed. I can now get the HTML source code from WebView2 with the following code. I really appreciate it.
Private Sub testbtn_Click() Handles testbtn.Click
    wv.CoreWebView2.Navigate("microsoft.com/")
End Sub
Private Async Sub wv_NavigationCompleted() Handles wv.NavigationCompleted
    Dim html As String = String.Empty
    html = Await wv.ExecuteScriptAsync("document.documentElement.outerHTML;")
    html = Regex.Unescape(html)
    html = html.Remove(0, 1)
    html = html.Remove(html.Length - 1, 1)
End Sub
But what about simply invoking the "View Page Source" command in WebView2? Can we do that? I know we can display it via a hot key, so why not "on demand"? This command would display the source in a popup window.
Your answer was exactly what I needed. I was using WebBrowser.DocumentStream to load HtmlAgilityPack.HtmlDocument. Now I am converting to WebView2 and I could not get a valid document into HtmlAgilityPack. Your answer solved the problem. Perhaps I will post my code tomorrow in case it would help someone.
4

The accepted answer is on the right track. However, it's missing one important thing:

The returned string is NOT HTMLEncoded, it's JSON!

So to do it right, you need to deserialize the JSON, which is just as simple:

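Dim html As String
html = Await WebView2.ExecuteScriptAsync("document.documentElement.outerHTML;")
' The result is a JSON string literal. Deserialize is synchronous and takes the string
' directly (DeserializeAsync expects a Stream). Requires Imports System.Text.Json.
html = JsonSerializer.Deserialize(Of String)(html)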
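
To see the JSON decoding in isolation, here is a minimal sketch that runs without a WebView2 control. It assumes System.Text.Json is available (built into .NET Core 3.0 and later, or installed as the System.Text.Json NuGet package on .NET Framework). `DecodeScriptResult` is a hypothetical helper name, and the sample string is a made-up stand-in for a real script result.

```vb
Imports System
Imports System.Text.Json

Module JsonDecodeSketch
    ' Hypothetical helper: treats the script result as a JSON string literal
    ' and returns the decoded text.
    Function DecodeScriptResult(json As String) As String
        Return JsonSerializer.Deserialize(Of String)(json)
    End Function

    Sub Main()
        ' Made-up sample: a JSON string with \u003C, \n and \" escapes.
        Dim raw As String = """\u003Chtml>\u003Cbody>Line 1\nSay \""hi\""\u003C/body>\u003C/html>"""
        Console.WriteLine(DecodeScriptResult(raw))
    End Sub
End Module
```

Decoding as JSON handles the wrapping quotes and every escape in one step, so no manual trimming is needed. If the script result is the JSON literal `null`, `Deserialize(Of String)` returns `Nothing`, so callers may want to check for that.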


3

Adding to @Xaviorq8's answer: in C# 8 or later you can slice a Span to avoid creating intermediate strings with Remove:

html = Regex.Unescape(html);
html = html.AsSpan()[1..^1].ToString();


1

I must credit @Xaviorq8; his answer was needed to solve my problem. I was successfully using .NET WebBrowser and Html Agility Pack but I wanted to replace WebBrowser with .NET WebView2.

Snippet (working code with WebBrowser):
using HAP = HtmlAgilityPack;
HAP.HtmlDocument hapHtmlDocument = null;
hapHtmlDocument = new HAP.HtmlDocument();
hapHtmlDocument.Load(webBrowser1.DocumentStream);
HtmlNodeCollection nodes = hapHtmlDocument.DocumentNode.SelectNodes("//*[@id=\"apptAndReportsTbl\"]");

Snippet (failing code with WebView2):
using HAP = HtmlAgilityPack;
HAP.HtmlDocument hapHtmlDocument = null;
string html = await webView21.ExecuteScriptAsync("document.documentElement.outerHTML");
hapHtmlDocument = new HAP.HtmlDocument();
hapHtmlDocument.LoadHtml(html);
HtmlNodeCollection nodes = hapHtmlDocument.DocumentNode.SelectNodes("//*[@id=\"apptAndReportsTbl\"]");

Success with WebView2 and Html Agility Pack

using HAP = HtmlAgilityPack;
HAP.HtmlDocument hapHtmlDocument = null;
string html = await webView21.ExecuteScriptAsync("document.documentElement.outerHTML");
// thanks to @Xaviorq8 answer (next 3 lines)
html = Regex.Unescape(html);
html = html.Remove(0, 1);
html = html.Remove(html.Length - 1, 1);
hapHtmlDocument = new HAP.HtmlDocument();
hapHtmlDocument.LoadHtml(html);
HtmlNodeCollection nodes = hapHtmlDocument.DocumentNode.SelectNodes("//*[@id=\"apptAndReportsTbl\"]");


0

Components

Form1 As Form
---------------
Button1 As Button
---------------
WV1 As WebView2
---------------
TextBox1 As TextBox
------------------------

Imports Microsoft.Web.WebView2.Core

Public Class Form1
    Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
        WV1.Source = New Uri("https://www.google.com/")
    End Sub

    Private Sub WV1_NavigationCompleted(sender As Object, e As CoreWebView2NavigationCompletedEventArgs) Handles WV1.NavigationCompleted
        Dim task = GetPage2InfoAsync()
    End Sub

    Private Async Function GetPage2InfoAsync() As Task
        Dim DateStr As String
        DateStr = Await WV1.ExecuteScriptAsync("document.documentElement.outerHTML")
        TextBox1.MaxLength = DateStr.Length + 1000
        TextBox1.Text = DateStr
    End Function
End Class

One thing I did find out: a TextBox defaults to a maximum length of 32K characters, while a lot of page source is 2 to 3 MB, so at first I set my TextBox max length to 5000000.

I then added a line to cure that problem. Under

DateStr = Await WV1.ExecuteScriptAsync("document.documentElement.outerHTML")

I added

TextBox1.MaxLength = DateStr.Length + 1000

That sets the TextBox max length to the returned length plus 1000 characters.

