
I've been assigned some sed homework in my class and am one step away from finishing the assignment. I've racked my brain trying to come up with a solution, and nothing has worked, to the point where I'm about to give up.

Basically, in the file I've got...I'm supposed to replace this:

<b>Some text here...each bold tag has different content...</b>

with

Some text here...each bold tag has different content...

I've got it partially completed, but what I can't figure out is how to "echo" the extracted content using sed (regexp).

I manage to substitute the content out just fine, but it's when I'm trying to actually OUTPUT the content that's between the HTML tags that it goes wrong.

If that's confusing, I truly apologize. I've been at this project for a couple of hours now and am getting a bit frustrated. Basically, why does this not work?

s/<b>.*<\/b>/.*/g

I simply want to output the content WITHOUT the bold tags.

Thanks a bunch!


3 Answers

1

If you want to reference a part of your regex match in the replacement, you need to place that portion of the regex into a capturing group, and then refer to it using the group number preceded by a backslash. Try the following:

s/<b>\(.*\)<\/b>/\1/g
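To see the difference on one line, here is a minimal sketch. The sample text in `line` is made up for illustration; it only assumes `sed` and `printf` are available. It contrasts the asker's original command, where `.*` in the replacement is plain text, with the capturing-group version.

```shell
# Hypothetical sample input, not taken from the asker's file.
line='<b>Some text here</b>'

# In the replacement part, .* is literal text, not a pattern.
literal=$(printf '%s\n' "$line" | sed 's/<b>.*<\/b>/.*/g')

# \( ... \) captures the text between the tags; \1 inserts it back.
captured=$(printf '%s\n' "$line" | sed 's/<b>\(.*\)<\/b>/\1/g')

printf '%s\n' "$literal"    # prints: .*
printf '%s\n' "$captured"   # prints: Some text here
```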

1 Comment

The lazy quantifier isn't supported in sed.
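Since there is no lazy quantifier, one common workaround is a negated character class in place of `.*`. This is only a sketch: the sample `line` is invented, and it assumes the bold text never contains another tag (no `<` between `<b>` and `</b>`).

```shell
# Hypothetical input with two bold spans on the same line.
line='<b>one</b> and <b>two</b>'

# Greedy .* runs from the first <b> to the last </b> on the line,
# so the inner tags end up inside the captured group.
greedy=$(printf '%s\n' "$line" | sed 's/<b>\(.*\)<\/b>/\1/g')

# [^<]* cannot cross a '<', so each pair of tags is matched separately.
per_tag=$(printf '%s\n' "$line" | sed 's/<b>\([^<]*\)<\/b>/\1/g')

printf '%s\n' "$greedy"    # prints: one</b> and <b>two
printf '%s\n' "$per_tag"   # prints: one and two
```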
1

You need to use a capturing group, which is written with parentheses ()

So, it's just this:

s/<b>(.*)<\/b>/\1/g

Capturing groups are numbered, from left to right, starting with one, and increasing.

This syntax is the standard way to write regular expressions; sed's default syntax is slightly different, because the grouping parentheses must be escaped. The sed command is

sed 's/<b>\(.*\)<\/b>/\1/g' [file]

or

sed -r 's/<b>(.*)<\/b>/\1/g' [file]
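As a quick check that the two forms behave the same, here is a small sketch. It assumes a sed that accepts `-E` for extended regular expressions (GNU and BSD sed both do; `-r` is the older GNU spelling). The sample `line` is made up. The last command also shows that choosing a different delimiter, here `|`, avoids escaping the slash in `</b>`.

```shell
# Hypothetical sample input.
line='<b>bold text</b>'

# Basic regular expressions (sed's default): group with \( \).
bre=$(printf '%s\n' "$line" | sed 's/<b>\(.*\)<\/b>/\1/g')

# Extended regular expressions: group with plain ( ).
ere=$(printf '%s\n' "$line" | sed -E 's/<b>(.*)<\/b>/\1/g')

# Same ERE command with | as the delimiter, so / needs no escape.
alt=$(printf '%s\n' "$line" | sed -E 's|<b>(.*)</b>|\1|g')

printf '%s\n' "$bre" "$ere" "$alt"   # each line prints: bold text
```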

Of course, if you just want to remove the tags, the other solution is to replace every HTML tag with an empty string, like so:

sed 's/<\([^>]\|\("[^"]*"\)\)*>//g' [file]

(I dislike sed's need to escape everything. Unescaped, the expression is:)

s/<([^>]|("[^"]*"))*>//g
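If attribute values never contain a `>`, a shorter pattern is usually enough. This is a simplified sketch, not the expression above: it drops the quoted-string branch entirely, and the sample `line` is invented for illustration.

```shell
# Hypothetical input with several different tags.
line='<p>Hello <b>world</b>, <i>again</i></p>'

# Delete every <...> run; [^>]* stops at the first closing '>'.
stripped=$(printf '%s\n' "$line" | sed 's/<[^>]*>//g')

printf '%s\n' "$stripped"   # prints: Hello world, again
```

Note that this removes every tag on the line, not just the bold ones, which may be more than the assignment asks for.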


-1

I think this question is already answered by sed's manuals. For example: http://www.grymoire.com/Unix/Sed.html#uh-4

1 Comment

Ah, the good old rtfm. The asker just didn't know about capturing groups, or how to search for them.
