3

I'm trying to import my notes from Kindle into a Google Doc (you can view it here), and I want to strip out every occurrence of the following text (including the line break):

Read more at location 6567 • Delete this highlight
Add a note

I came up with the following search pattern and tested it on this Google Sheet to make sure my regex syntax works:

"Read more at location (\d*)   • Delete this highlight\nAdd a note"

Then I created a Google Apps Script and had it load in my document:

function onOpen() {
  DocumentApp.getUi() // Or SpreadsheetApp or FormApp.
      .createMenu('AdvancedFind&Replace')
      .addItem('Remove Kindle HTML', 'findAndReplace')
      .addToUi();
}

// In-Document Find and Replace

function findAndReplace() {
  var body = DocumentApp.getActiveDocument().getBody();
  body.replaceText("Read more at location (\d*)   • Delete this highlight\nAdd a note", "");
}

However, when I run it, it doesn't replace the text. I think the problem is the regex, because when I run this code instead, it works:

function replaceBat() {
  var body = DocumentApp.getActiveDocument().getBody();
  body.replaceText("BBat", "BBAat REPLACEMENT SUCCESSFUL");
}

Any help would be greatly appreciated, thanks!

4 Answers

2

According to the documentation, some patterns may not work:

A subset of the JavaScript regular expression features are not fully supported, such as capture groups and mode modifiers.

See the Google Docs regex specification here; it does not say that \d is supported. So try this regex:

^Read more at location [0-9]* • Delete this highlight[[:space:]]Add a note

Or

^Read more at location [0-9]*[^[:alpha:]]*Delete this highlight[[:space:]]Add a note
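To check a pattern like the second one outside of Docs, here is a rough JavaScript approximation. It is only a sketch under assumptions: JavaScript's RegExp has no POSIX bracket classes, so [^[:alpha:]] is swapped for [^A-Za-z] and [[:space:]] for \s, and the sample text is made up. It shows why the looser pattern tolerates different spacing around the bullet; it says nothing about how replaceText itself behaves in Docs.

```javascript
// Local approximation of the second pattern (not the Docs regex engine).
// [^[:alpha:]] -> [^A-Za-z], [[:space:]] -> \s; the m flag lets ^ match at each line start.
const footer = /^Read more at location [0-9]*[^A-Za-z]*Delete this highlight\sAdd a note/gm;

// Hypothetical sample: one footer with single spaces around the bullet,
// one with three spaces before it.
const sample = [
  'First highlight',
  'Read more at location 6567 • Delete this highlight',
  'Add a note',
  'Second highlight',
  'Read more at location 42   • Delete this highlight',
  'Add a note',
].join('\n');

// Both footers are removed, whatever the spacing before the bullet.
const cleaned = sample.replace(footer, '');
console.log(JSON.stringify(cleaned)); // "First highlight\n\nSecond highlight\n"
```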

2 Comments

Hi Wiktor, That did not work for me, but this did: ^Read more at location [0-9]* • Delete this highlight[[:space:]]Add a note
That is strange. However, that means you can use POSIX character classes and the * quantifier. I updated the answer with your and my new suggestion.
2

The problem was that the Docs regex doesn't support "\d" for matching any digit or "\s" for matching any whitespace character, but it does support "[[:space:]]" for matching any whitespace character.

The following syntax worked in my document:

// In-Document Find and Replace

function findAndReplace() {
  var body = DocumentApp.getActiveDocument().getBody();
  body.replaceText("^Read more at location [0-9]*   • Delete this highlight[[:space:]]Add a note", "");
}

I found the [[:space:]] syntax at https://www.google.com/support/enterprise/static/postini/docs/admin/en/admin_ee_cu/cm_regex.html
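A slightly more testable arrangement of the same script, as a sketch: the pattern lives in one constant and the helper takes the Body as a parameter. The names KINDLE_FOOTER and removeKindleFooters are mine, not part of any API; the only Apps Script calls used are the ones already shown above.

```javascript
// The pattern from this answer, kept in one place so it is easy to adjust.
const KINDLE_FOOTER =
  '^Read more at location [0-9]*   • Delete this highlight[[:space:]]Add a note';

// Hypothetical helper: receives the Body so it can be exercised with a
// stand-in object outside of Apps Script.
function removeKindleFooters(body) {
  body.replaceText(KINDLE_FOOTER, '');
}

// Menu entry point, as in the question. DocumentApp only exists inside Apps Script,
// so this function is not called when the file is loaded elsewhere.
function findAndReplace() {
  removeKindleFooters(DocumentApp.getActiveDocument().getBody());
}
```

If the spacing in your document differs, only the constant needs to change.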


2

It looks like Google Docs & Sheets now allow you to natively use regular expressions when you search for text.

Simply open the native search functionality and you'll find the option to use regex.

[Screenshot: regex search functionality in Google Docs]

You can find more info here: https://support.google.com/docs/answer/62754


0

A regex does not start and end with quotes; a string does. Your replace is looking for a literal string, not a regex.

Try:

body.replaceText(/Read more at location (\d*)   • Delete this highlight\nAdd a note/, "");

To get all instances, add the g flag:

body.replaceText(/Read more at location (\d*)   • Delete this highlight\nAdd a note/g, "");

1 Comment

Hi Mogsdad, not according to the documentation for the replaceText function: developers.google.com/apps-script/reference/document/…
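Since replaceText takes the pattern as a string, it may help to see what a JavaScript string literal actually holds once it is parsed. This is plain JavaScript behaviour, not anything specific to Docs, and if I read the replaceText reference correctly it also asks for backslashes in the pattern to be escaped. The sketch below does not show that "\\d" is accepted by Docs; the other answers sidestep that by using [0-9].

```javascript
// \d is not a recognised string escape, so the backslash is silently dropped.
const unescaped = "location (\d*)";
// \\ produces one literal backslash, so the pattern text really contains \d.
const escaped = "location (\\d*)";

console.log(unescaped); // location (d*)
console.log(escaped);   // location (\d*)

// "\n" in a string literal is already a real newline character
// by the time the pattern reaches replaceText.
console.log("highlight\nAdd a note".includes("\n")); // true
```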
