I would like to replace every _ with a - on lines starting with #| label: using PCRE2 regex within my text editor.

Example:

#| label: my_chunk_label
my_function_name <- function(x)

Should become:

#| label: my-chunk-label
my_function_name <- function(x)

In contrast to .NET regex, where one could substitute (?<=^#\| label: .+)_ with - (regex101 example), PCRE2 does not support unbounded-length lookbehind, so that regex is invalid there. So far, the only way I have found is to repeatedly substitute ^#[^_]+\K_ with - (regex101 example), but I am curious whether there is a single-pass solution.

  • Also an idea to skip the other lines: ^(?!#\| label:).+(*SKIP)(*F)|_ Commented Aug 23, 2023 at 15:00
  • @bobblebubble That is also nice :-) Commented Aug 23, 2023 at 15:09
  • @bobblebubble interesting and quite simple to understand (as well as efficient). Thanks! Commented Aug 23, 2023 at 15:35
  • Thanks jkd and 4th bird! I added this as an alternative. :) Commented Aug 23, 2023 at 17:35

2 Answers

If you are using PCRE, you can make use of \G and \K with the pattern below, and use - as the replacement:

(?:^#\|\h+label:\h+|\G(?!^))[^\r\n_]*\K_

The pattern matches:

  • (?: Non capture group for the alternatives
    • ^#\|\h+label:\h+ Match the prefix that should be at the start of the line (with the multiline flag enabled), where \h matches a horizontal whitespace character
    • | Or
    • \G(?!^) Assert the current position at the end of the previous match, but not at the start of a line
  • ) Close the non capture group
  • [^\r\n_]* Match optional characters except for newlines or _
  • \K Forget what is matched so far
  • _ Match the underscore

Regex demo
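
To make the \G and \K mechanics above more concrete, here is a rough emulation in Python. It is only a sketch: Python's standard re module supports neither \G nor \K, so the walk is spelled out by hand, [ \t] stands in for \h, and the helper names dash_label_line and dash_labels are made up for illustration.

```python
import re

# Assumption: [ \t] stands in for PCRE's \h, which Python's re lacks.
PREFIX = re.compile(r"#\|[ \t]+label:[ \t]+")
# One step of the chain: a run of non-underscore, non-newline characters, then "_".
STEP = re.compile(r"[^\r\n_]*_")


def dash_label_line(line: str) -> str:
    # Hypothetical helper: emulates the \G / \K walk on a single line.
    start = PREFIX.match(line)
    if start is None:
        return line  # not a label line, so the chain never starts
    chars = list(line)
    pos = start.end()
    while True:
        # Like \G: each step must begin exactly where the previous one ended.
        step = STEP.match(line, pos)
        if step is None:
            break
        # Like \K: only the final "_" of the step is actually replaced.
        chars[step.end() - 1] = "-"
        pos = step.end()
    return "".join(chars)


def dash_labels(text: str) -> str:
    # Apply the walk line by line, keeping the original line endings.
    return "".join(dash_label_line(l) for l in text.splitlines(keepends=True))
```

Because every step has to start where the previous one stopped, an underscore on a line without the #| label: prefix can never be reached, which is the same reason the single-pass pattern leaves those lines alone.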

3 Comments

Excellent! Thank you for the perfect and fast reply. I had not yet found out about the \G anchor.
The first | should be escaped; otherwise, great answer.
@NahuelFouilleul You are right, I have updated it.
An alternative idea is to use PCRE verbs (*SKIP)(*F) to skip the lines that you don't want.

^(?!#\| label:).+(*SKIP)(*F)|_

It's also related to The Trick. On the left side of the alternation, lines that don't start with #| label: are identified by a negative lookahead, matched by .+, and skipped, since no replacement is wanted in those lines. On the right side, underscores are matched in the remaining lines.

See this demo at regex101 (replace with whatever you like - add variable space if needed)
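
For engines without backtracking verbs, the same alternation idea can be approximated with a capture group and a replacement callback. The following is a minimal sketch in Python's standard re module, which does not support (*SKIP)(*F); it is not the PCRE pattern itself, and the name dash_labels_trick is made up for illustration.

```python
import re

# Left side: consume whole lines that do NOT start with "#| label:".
# Right side: capture an underscore anywhere else (i.e. in label lines).
TRICK = re.compile(r"^(?!#\| label:).+|(_)", re.MULTILINE)


def dash_labels_trick(text: str) -> str:
    # If group 1 matched, we hit an underscore on a label line: replace it.
    # Otherwise the left side consumed an unwanted line: put it back unchanged.
    return TRICK.sub(lambda m: "-" if m.group(1) else m.group(0), text)
```

Here the callback plays the role of (*SKIP)(*F): instead of discarding the left-hand match inside the engine, the unwanted line is matched and then written back as-is.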
