I'm trying to write a Splunk query, and I need to parse out the command line arguments given to a Windows program. Specifically, I'm trying to get the name of the package that is being installed. Here are some examples of the data:
/i "package\name" test
/i "package\name" "test"
/i "package\ name" test
/i "package\ name" "test"
/i package\name test
/package package\name "test"
The package name is always preceded by "/i" or "/package" (they can be upper or lower case) and a space (although sometimes there is no space). The package name is normally in quotes, but sometimes it isn't. If it's in quotes, it can contain spaces. It is usually followed by more command line arguments, sometimes in quotes and sometimes not, but I don't really care about those. They're represented by the string test/"test". I'm basically trying to get everything between the "i" (or "package") and the command line arguments that come after the package name.
I first tried using \/([iI]|(?i)package)\s?(?<package>.*?)\s to extract the package name into a capture group. But the problem was the third and fourth test strings due to the spaces within the quotes. They would cause everything after them to get cut off, so I'd only end up with "package" instead of "package name".
So I thought maybe I could use one regex to extract everything within quotes, another to extract everything with no quotes, and then combine them.
With the following regex, I can get "package\name" or "package\ name" from the first 4 of the above strings with no issue:
\/([iI]|(?i)package)\s?"(?<package1>.*?)"
To get the last 2, I tried to get everything after i/package that didn't start with quotes:
\/([iI]|(?i)package)\s?[^"](?<package2>.*?)\s
But, using regex101.com, it seems that it matches the package name for all the test strings, not just the unquoted ones. It also cuts off the first character in the last 2, so I'd have "ackage\name". I'm not sure why either is happening.
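Both symptoms can be reproduced outside regex101 with a minimal Python sketch. It assumes Python's re module, so (?<name>...) is written (?P<name>...) and the mid-pattern (?i) becomes the scoped group (?i:package). It suggests that [^"] consumes one character: on the unquoted strings that character is the "p", and on the quoted strings the optional \s? gives up the space so that [^"] can take it instead. The VARIANT pattern and the package2() helper are hypothetical names used only for illustration; the variant swaps [^"] for a zero-width lookahead, which is one possible adjustment rather than a confirmed fix.

```python
import re

# Python adaptation of the second attempt: (?P<name>...) replaces (?<name>...),
# and the mid-pattern (?i) becomes the scoped group (?i:package).
ATTEMPT = re.compile(r'/(?:[iI]|(?i:package))\s?[^"](?P<package2>.*?)\s')

# Hypothetical variant: (?!") only checks that the next character is not a
# quote (it consumes nothing), and \S+ captures the unquoted name.
VARIANT = re.compile(r'/(?:[iI]|(?i:package))\s?(?!")(?P<package2>\S+)')


def package2(pattern, command_line):
    """Return the package2 capture, or None when the pattern does not match."""
    match = pattern.search(command_line)
    return match.group("package2") if match else None


if __name__ == "__main__":
    for line in (r'/i "package\name" test', r'/i package\name test'):
        # Print what each pattern captures for a quoted and an unquoted sample.
        print(repr(line), package2(ATTEMPT, line), package2(VARIANT, line))
```

With the original pattern the quoted sample still matches (quotes included) and the unquoted sample loses its first character; with the lookahead variant the quoted sample no longer matches at all.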
If it's possible to extract what I want with one expression, that would be the preferred solution. But being able to extract the package name from the last 2 test cases would also work. However, if this is the solution, there should be no overlap between the capture groups: package1 should match the package names in test strings 1-4, and package2 should match 5-6.
UPDATE:
I appreciate everyone's answers. I got some help from a colleague, which I was able to tweak into what I believe is a viable solution. I thought I'd share it in case anyone else finds it helpful:
(?i)(\/i)\s?(?:\"(?<package1>[^\"]*)\"|(?<package2>\S+))
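For anyone who wants to check this kind of expression against the six sample strings, here is a minimal Python sketch. It is an adaptation, not the exact expression above: Python's re module needs (?P<name>...) for named groups, and \/i is widened to (?:i|package) so the "/package" sample is covered as well. That widening, the SAMPLES list, and the extract_package() helper are assumptions made for illustration.

```python
import re

# Quoted names land in package1, unquoted names in package2.
# (?i) at the start makes /I and /PACKAGE match as well.
PACKAGE_RE = re.compile(
    r'(?i)/(?:i|package)\s?(?:"(?P<package1>[^"]*)"|(?P<package2>\S+))'
)

# The six sample command lines from the question.
SAMPLES = [
    r'/i "package\name" test',
    r'/i "package\name" "test"',
    r'/i "package\ name" test',
    r'/i "package\ name" "test"',
    r'/i package\name test',
    r'/package package\name "test"',
]


def extract_package(command_line):
    """Return the package name from whichever group matched, else None."""
    match = PACKAGE_RE.search(command_line)
    if match is None:
        return None
    quoted = match.group("package1")
    return quoted if quoted is not None else match.group("package2")


if __name__ == "__main__":
    for sample in SAMPLES:
        print(sample, "->", extract_package(sample))
```

Because the two alternatives are mutually exclusive, at most one of package1 and package2 is set for a given match, so the groups do not overlap.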
(?Ji)\/(i|package)\s*(?:"(?<package1>.*?)"|(?<package1>\S+))
(?i) is better left in the beginning in this case.
\/.+?\\\W*(?<pkg_name>\w+)