0

I have a string that looks like this:

<br/><description>Using a combination of remote probes, (TCP/IP, SMB, HTTP, NTP, SNMP, etc...) it is possible to guess the name of the remote operating system in use, and sometimes its version.</description><br/><fname>os_fingerprint.nasl</fname><br/><plugin_modification_date>2012/12/01</plugin_modification_date><br/><plugin_name>OS Identification</plugin_name><br/><plugin_publication_date>2003/12/09</plugin_publication_date><br/><plugin_type>combined</plugin_type><br/><risk_factor>None</risk_factor><br/><solution>n/a</solution><br/><synopsis>It is possible to guess the remote operating system.</synopsis><br/><plugin_output><br/>Remote operating system : Microsoft Windows Server 2008 R2 Enterprise Service Pack 1<br/>Confidence Level : 99<br/>Method : MSRPC<br/><br/> <br/>The remote host is running Microsoft Windows Server 2008 R2 Enterprise Service Pack 1</plugin_output><br/>

I want to extract the value that follows "Remote operating system :", which here is "Microsoft Windows Server 2008 R2 Enterprise Service Pack 1". The relevant part of the string is:

Remote operating system : Microsoft Windows Server 2008 R2 Enterprise Service Pack 1<br/>

So I wrote this regular expression:

Pattern pattern = Pattern.compile("(?<=\\bRemote operating system :\\b).*?(?=\\b<br/>\\b)");

But my regular expression doesn't seem to work. Any idea why? Also, is this a good way to extract the operating system string, or should I do it another way? Thanks!

4 Answers

2

Try this pattern: ".*Remote operating system : (.*?)<br/>"

public static void main(String[] args) throws Exception {
    String s = "<br/><description>Using a combination of remote probes, (TCP/IP, SMB, HTTP, NTP, SNMP, etc...) it is possible to guess the name of the remote operating system in use, and sometimes its version.</description><br/><fname>os_fingerprint.nasl</fname><br/><plugin_modification_date>2012/12/01</plugin_modification_date><br/><plugin_name>OS Identification</plugin_name><br/><plugin_publication_date>2003/12/09</plugin_publication_date><br/><plugin_type>combined</plugin_type><br/><risk_factor>None</risk_factor><br/><solution>n/a</solution><br/><synopsis>It is possible to guess the remote operating system.</synopsis><br/><plugin_output><br/>Remote operating system : Microsoft Windows Server 2008 R2 Enterprise Service Pack 1<br/>Confidence Level : 99<br/>Method : MSRPC<br/><br/> <br/>The remote host is running Microsoft Windows Server 2008 R2 Enterprise Service Pack 1</plugin_output><br/>";

    Pattern pattern = Pattern.compile(".*Remote operating system : (.*?)<br/>");
    Matcher m = pattern.matcher(s);
    if (m.find()) {
        System.out.println(m.group(1));
    } else {
        System.out.println("Not found");
    }
}

0
String test = 
        "<br/><description>Using a combination of remote probes, " +
        "(TCP/IP, SMB, HTTP, NTP, SNMP, etc...) it is possible to guess " +
        "the name of the remote operating system in use, and sometimes " +
        "its version.</description><br/><fname>os_fingerprint.nasl</fname>" +
        "<br/><plugin_modification_date>2012/12/01</plugin_modification_date>" +
        "<br/><plugin_name>OS Identification</plugin_name><br/>" +
        "<plugin_publication_date>2003/12/09</plugin_publication_date><br/>" +
        "<plugin_type>combined</plugin_type><br/><risk_factor>None</risk_factor>" +
        "<br/><solution>n/a</solution><br/><synopsis>It is possible to guess the " +
        "remote operating system.</synopsis><br/><plugin_output><br/>Remote operating " +
        "system : Microsoft Windows Server 2008 R2 Enterprise Service Pack 1<br/>" +
        "Confidence Level : 99<br/>Method : MSRPC<br/><br/> <br/>The remote host is " +
        "running Microsoft Windows Server 2008 R2 Enterprise Service Pack 1" +
        "</plugin_output><br/>";
// The capturing group lazily takes everything up to the next <br/>.
Pattern pattern = Pattern.compile("Remote\\soperating\\ssystem\\s:\\s(.+?)<br/>");
Matcher matcher = pattern.matcher(test);
if (matcher.find()) {
    System.out.println(matcher.group(1));
}

Output:

Microsoft Windows Server 2008 R2 Enterprise Service Pack 1

Note that, in general, using regex against markup languages is not advised. Here, however, you are matching a specific string of text that only happens to sit inside markup, so it is acceptable.
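
If you would rather avoid regex entirely, one alternative is to parse the fragment as XML and scan the text nodes inside <plugin_output>. This is only a sketch under assumptions: the class name OsExtractor and the synthetic <root> wrapper are made up for illustration, and it assumes the fragment is well-formed once wrapped. That holds for the sample string but may not hold for every report.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class OsExtractor {
    private static final String LABEL = "Remote operating system :";

    // Returns the text that follows LABEL inside <plugin_output>, or null if absent.
    public static String extractOs(String fragment) {
        // The fragment has several top-level elements, so wrap it in a
        // hypothetical <root> element to form a single XML document.
        String xml = "<root>" + fragment + "</root>";
        Document doc;
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            doc = builder.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Fragment is not well-formed XML", e);
        }
        // Merge any adjacent text nodes so each line between <br/> tags is one node.
        doc.getDocumentElement().normalize();

        NodeList outputs = doc.getElementsByTagName("plugin_output");
        for (int i = 0; i < outputs.getLength(); i++) {
            NodeList children = outputs.item(i).getChildNodes();
            for (int j = 0; j < children.getLength(); j++) {
                Node child = children.item(j);
                if (child.getNodeType() != Node.TEXT_NODE) {
                    continue; // skip the <br/> elements
                }
                String text = child.getNodeValue().trim();
                if (text.startsWith(LABEL)) {
                    return text.substring(LABEL.length()).trim();
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String fragment = "<plugin_name>OS Identification</plugin_name><br/>"
                + "<plugin_output><br/>Remote operating system : Microsoft Windows Server 2008 R2 "
                + "Enterprise Service Pack 1<br/>Confidence Level : 99<br/></plugin_output><br/>";
        System.out.println(extractOs(fragment));
    }
}
```

The trade-off is more code in exchange for not depending on the exact position of <br/> tags around the value. If the input is not reliably well-formed, the regex answers are the simpler choice.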


0

In your regex there is no space between : and the \\b that follows it.

Try this way:

Pattern.compile("(?<=\\bRemote operating system : \\b).*?(?=\\b<br/>\\b)");
//                                               ^additional space

Without that space, \\b can't match the start of the next word (Microsoft). It can't match the end of a word at that position either, since : is not a word character.
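
To see the effect of the word boundary in isolation, here is a small sketch that runs both versions of the lookbehind pattern against a shortened copy of the input. The class name BoundaryDemo, the helper firstMatch, and the shortened sample string are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BoundaryDemo {
    // Returns the first match of regex in input, or null if there is none.
    public static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "<plugin_output><br/>Remote operating system : Microsoft Windows"
                + "<br/>Confidence Level : 99<br/>";

        // ":" and the following space are both non-word characters,
        // so \b cannot match between them and the lookbehind never succeeds.
        System.out.println(firstMatch(
                "(?<=\\bRemote operating system :\\b).*?(?=\\b<br/>\\b)", input));

        // With the space inside the lookbehind, \b sits between " " and "M".
        System.out.println(firstMatch(
                "(?<=\\bRemote operating system : \\b).*?(?=\\b<br/>\\b)", input));
    }
}
```

Note that the trailing \\b after <br/> also requires the next character to be a word character, so this pattern would stop matching if the value were followed directly by another tag or by a space. A plain capturing group, as in the other answers, does not have that restriction.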


0

Try the following:

if (str.matches("^.*Remote operating system : ([^<]*).*$")) {
    System.out.println(
        str.replaceAll("^.*Remote operating system : ([^<]*).*$", "$1")
    );
}
