I have a binary file that contains a readable filename* bounded by 'namexx:' and 'xx:piece', where x is any digit from 0-9 in both cases.

I am working on a Mac in bash 5.

I have tried using sed:

cat filename.xxx | sed -E 's/^.*name[0-9]{2}:(.*)[0-9]{2}:piece.*$/\1/'

The problem is that the regex does not consume the whole file, so I get a lot of random stuff returned in addition to the captured filename.

I've tried prefixing sed with LC_ALL=C as I read in another answer that this will treat all binary data as 'consumable' with wildcards, but it makes no difference (and I may have misunderstood).

I have also tried removing the beginning and end anchors, but that makes no difference either.


*The file is a torrent file from which I just want to extract the filename. I have looked at bencoding and trying to extract the filename, but it seemed too complex for a trivial task.

  • Maybe all you need is sed -n -E 's/^.*name[0-9]{2}:(.*)[0-9]{2}:piece.*$/\1/p;'? Commented May 23, 2019 at 12:16
  • Try grep -m 1 -o 'name[0-9]\{2\}:\(.*\)[0-9]\{2\}:piece' filename.xxx | sed 's/^name[0-9]\{2\}://' | sed 's/[0-9]\{2\}:piece$//' Commented May 23, 2019 at 12:30

1 Answer

You may use

sed -n -E 's/^.*name[0-9]{2}:(.*)[0-9]{2}:piece.*$/\1/p;' filename.xxx

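Here, -n suppresses sed's default behaviour of printing every input line, and the p flag on the substitution prints only the lines where the replacement succeeded (that is, what remains after the replacement). Lines that do not match are no longer echoed unchanged, which is where the extra output came from.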
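To see the difference that -n and p make, here is a minimal sketch. The sample data is a hypothetical ASCII stand-in for a torrent file (the name ubuntu.iso and the junk lines are invented for illustration); a real file contains binary bytes, so prefixing sed with LC_ALL=C may still be needed there.

```shell
# Hypothetical sample: an ASCII stand-in for a torrent file that contains newlines.
sample=$(mktemp)
printf 'junk line one\nd4:infod4:name10:ubuntu.iso12:piece lengthi262144ee\nmore junk\n' > "$sample"

# Without -n, lines that do not match are printed unchanged alongside the result.
without_n=$(sed -E 's/^.*name[0-9]{2}:(.*)[0-9]{2}:piece.*$/\1/' "$sample")

# With -n and the p flag, only lines where the substitution succeeded are printed.
with_n=$(sed -n -E 's/^.*name[0-9]{2}:(.*)[0-9]{2}:piece.*$/\1/p' "$sample")

printf 'without -n:\n%s\n\nwith -n and p:\n%s\n' "$without_n" "$with_n"
rm -f "$sample"
```

The first command returns three lines (the two junk lines plus the filename), while the second returns only the captured filename.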

As an alternative, you may use something like

grep -m 1 -o 'name[0-9]\{2\}:\(.*\)[0-9]\{2\}:piece' filename.xxx | \
   sed -E 's/^name[0-9]{2}:(.*)[0-9]{2}:piece$/\1/'

Here, grep stops after the first matching line (-m 1) and prints only the matched text (-o); sed then keeps only the capturing group value from that result.
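A minimal sketch of the grep pipeline on data that grep would classify as binary. The sample content is hypothetical, and the -a option (treat the input as text) is an addition that is not part of the command above; without it, grep typically reports "Binary file ... matches" instead of printing the match. LC_ALL=C is included as a precaution against byte sequences that are invalid in the current locale.

```shell
# Hypothetical sample: one line with a NUL byte, so grep sees binary data.
sample=$(mktemp)
printf 'd4:infod4:name10:ubuntu.iso12:piece lengthi262144e6:pieces3:\000ABee\n' > "$sample"

# -a treats the file as text, -m 1 stops after the first matching line,
# -o prints only the matched part; sed then strips the two delimiters.
name=$(LC_ALL=C grep -a -m 1 -o 'name[0-9]\{2\}:\(.*\)[0-9]\{2\}:piece' "$sample" |
    sed -E 's/^name[0-9]{2}:(.*)[0-9]{2}:piece$/\1/')

printf '%s\n' "$name"
rm -f "$sample"
```

Because .* is greedy, the capture extends to the last NN:piece on the line, so a file that repeats that delimiter later on the same line would need a tighter pattern.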
