Go: Retrieve a string from between two characters or other strings

Question

Let's say for example that I have one string, like this:

<h1>Hello World!</h1>

What Go code would be able to extract Hello World! from that string? I'm still relatively new to Go. Any help is greatly appreciated!

Are you looking to parse a specific pattern or format? For example, is the text always surrounded by <h1> tags, general HTML, something else entirely? There is not enough information to answer the question so I am downvoting.
– Stephen Weinberg, Commented Nov 13, 2014 at 19:49
It's just matching strings. If I hit one matching string and then another, then give me the stuff in the middle.
– T145, Commented Nov 13, 2014 at 19:52
For manipulating HTML, look at GoQuery or golang.org/x/net/html (formerly go.net/html).
– twotwotwo, Commented Nov 14, 2014 at 20:30
A good answer to this question is stackoverflow.com/a/62555190/3415984
– ttrasn, Commented Jun 24, 2020 at 17:43

yusufmalikul · Accepted Answer · 2021-06-09 12:07:01Z

20

If the string looks like whatever;START;extract;END;whatever, you can use this function to get the text in between:

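// GetStringInBetween returns the text between the first occurrence of start
// and the next occurrence of end, or an empty string if either is missing.
// It needs the "strings" package.
func GetStringInBetween(str string, start string, end string) (result string) {
    s := strings.Index(str, start)
    if s == -1 {
        return
    }
    s += len(start)
    // e is relative to str[s:], so add s to turn it into an index into str.
    e := strings.Index(str[s:], end)
    if e == -1 {
        return
    }
    e += s
    return str[s:e]
}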

This finds the first index of START, skips past it by adding the length of START, and returns everything from there up to the first END that follows.
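The same find-start, skip-it, find-end sequence can also be sketched with strings.Cut, assuming Go 1.18 or later (the release that added strings.Cut). The helper name between and its ok result are illustrative and are not part of the answer above.

```go
package main

import (
	"fmt"
	"strings"
)

// between is a hypothetical helper: it returns the text after the first
// occurrence of start and before the next occurrence of end.
// ok is false when either delimiter is missing.
func between(s, start, end string) (result string, ok bool) {
	// Drop everything up to and including the first start.
	_, rest, found := strings.Cut(s, start)
	if !found {
		return "", false
	}
	// Keep everything before the first end in what remains.
	result, _, found = strings.Cut(rest, end)
	if !found {
		return "", false
	}
	return result, true
}

func main() {
	fmt.Println(between("whatever;START;extract;END;whatever", "START;", ";END"))
	fmt.Println(between("<h1>Hello World!</h1>", "<h1>", "</h1>"))
	fmt.Println(between("no delimiters here", "<h1>", "</h1>"))
}
```

Because strings.Cut never indexes past the end of its input, this version has no slice arithmetic that could go out of range.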

edited Jun 9, 2021 at 12:07

yusufmalikul

answered Feb 19, 2017 at 19:06

Jan Kardaš

5 Comments

schollz Over a year ago

This is the best answer but it will panic if the END is not found or if the END is also found before START, see this play link: play.golang.org/p/C2sZRYC15XN. That play link also includes the revision to fix this problem. I submitted this revision to SO which is under peer review.

sanbornm Over a year ago

@schollz is correct and provides a more correct answer. Copying and pasting this answer is dangerous as it will panic. However, thank you Jan for the original work.

ttrasn Over a year ago

An improved answer is here: stackoverflow.com/a/62555190/3415984

life4 Over a year ago

@schollz's comment is right: return str[s:s+e] is required.

kristianp Over a year ago

Panic: runtime error: slice bounds out of range [:27] with length 21
goroutine 1 [running]:
main.GetStringInBetween({0x49690b, 0x15}, {0x494327?, 0x4}, {0x4943f3, 0x5})
    /tmp/sandbox2821937282/prog.go:20 +0xf0
main.main()
    /tmp/sandbox2821937282/prog.go:24 +0x38
go.dev/play/p/ju_9M0ZZzqZ

DiBa · Accepted Answer · 2014-11-13 19:53:26Z

15

There are lots of ways to split strings in all programming languages.

Since I don't know exactly what you are asking for, here is one way to get the output you want from your sample.

package main

import "strings"
import "fmt"

func main() {
    initial := "<h1>Hello World!</h1>"

    out := strings.TrimLeft(strings.TrimRight(initial,"</h1>"),"<h1>")
    fmt.Println(out)
}

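The code above strips characters belonging to the set <h1> from the left of the string and characters belonging to the set </h1> from the right, which leaves Hello World! for this sample.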

As I said there are hundreds of ways to split specific strings and this is only a sample to get you started.

Hope it helps, Good luck with Golang :)

DB

answered Nov 13, 2014 at 19:53

DiBa

2 Comments

gondo Over a year ago

This is wrong: the trim argument is a set of characters, not a string. With initial := "<h1>hhhhhello</h1>" the result would be ello: play.golang.org/p/HkopYJEDg9F

Nigel Ainscoe Over a year ago

Ignore this answer. It works for @T145's specific case but not generally. The answer below works perfectly.
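To make gondo's point concrete, here is a small sketch comparing the cutset-based TrimLeft/TrimRight calls with strings.TrimPrefix and strings.TrimSuffix, which remove a literal string. The function names trimCutset and trimExact are made up for this comparison.

```go
package main

import (
	"fmt"
	"strings"
)

// trimCutset mirrors the answer's approach: the second argument of
// TrimLeft/TrimRight is a set of characters, not a literal string.
func trimCutset(s string) string {
	return strings.TrimLeft(strings.TrimRight(s, "</h1>"), "<h1>")
}

// trimExact removes the literal prefix and suffix, if present.
func trimExact(s string) string {
	return strings.TrimSuffix(strings.TrimPrefix(s, "<h1>"), "</h1>")
}

func main() {
	in := "<h1>hhhhhello</h1>"
	fmt.Println(trimCutset(in)) // ello
	fmt.Println(trimExact(in))  // hhhhhello
}
```

TrimPrefix and TrimSuffix only help when the tags sit at the very start and end of the string; they do not search inside it.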

ttrasn · Accepted Answer · 2020-06-24 12:36:40Z

7

I improved Jan Kardaš's answer. Now you can use start and end strings that are longer than one character.

func GetStringInBetweenTwoString(str string, startS string, endS string) (result string, found bool) {
    s := strings.Index(str, startS)
    if s == -1 {
        return result, false
    }
    newS := str[s+len(startS):]
    e := strings.Index(newS, endS)
    if e == -1 {
        return result, false
    }
    result = newS[:e]
    return result, true
}
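A short usage sketch for the function above, repeated here so the program is self-contained. It shows what the found flag is for: an empty match and a missing delimiter both give an empty result, but only the second reports false. The sample inputs are made up.

```go
package main

import (
	"fmt"
	"strings"
)

// GetStringInBetweenTwoString is copied from the answer above.
func GetStringInBetweenTwoString(str string, startS string, endS string) (result string, found bool) {
	s := strings.Index(str, startS)
	if s == -1 {
		return result, false
	}
	newS := str[s+len(startS):]
	e := strings.Index(newS, endS)
	if e == -1 {
		return result, false
	}
	result = newS[:e]
	return result, true
}

func main() {
	inputs := []string{
		"<h1>Hello World!</h1>", // normal case
		"<h1></h1>",             // delimiters present, nothing between them
		"<h1>Hello World!",      // end delimiter missing
	}
	for _, in := range inputs {
		result, found := GetStringInBetweenTwoString(in, "<h1>", "</h1>")
		fmt.Printf("%q -> %q, found=%v\n", in, result, found)
	}
}
```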

answered Jun 24, 2020 at 12:36

ttrasn

dganesh2002 · Accepted Answer · 2021-09-10 03:55:54Z

6

Here is my answer using a regular expression. I'm not sure why no one suggested this approach, which I consider the safest.

package main

import (
    "fmt"
    "regexp"
)

func main() {
    content := "<h1>Hello World!</h1>"
    re := regexp.MustCompile(`<h1>(.*)</h1>`)
    match := re.FindStringSubmatch(content)
    if len(match) > 1 {
        fmt.Println("match found -", match[1])
    } else {
        fmt.Println("match not found")
    }
}

Playground - https://play.golang.org/p/Yc61x1cbZOJ
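A possible generalization of the regex approach to arbitrary delimiters, offered as a sketch rather than a definitive version. It assumes you want the first, shortest match, so it uses the non-greedy (.*?) instead of the greedy (.*), and it escapes the delimiters with regexp.QuoteMeta. The helper name betweenRegexp is hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// betweenRegexp is a hypothetical helper that returns the text between the
// first start and the nearest following end.
func betweenRegexp(s, start, end string) (string, bool) {
	// QuoteMeta escapes regexp metacharacters in the delimiters;
	// (?s) lets . match newlines and .*? stops at the first end.
	pattern := "(?s)" + regexp.QuoteMeta(start) + "(.*?)" + regexp.QuoteMeta(end)
	re, err := regexp.Compile(pattern)
	if err != nil {
		return "", false
	}
	match := re.FindStringSubmatch(s)
	if match == nil {
		return "", false
	}
	return match[1], true
}

func main() {
	content := "<h1>Hello</h1> and <h1>World!</h1>"

	// Non-greedy: stops at the first closing tag.
	fmt.Println(betweenRegexp(content, "<h1>", "</h1>"))

	// Greedy, as in the answer above: runs to the last closing tag.
	greedy := regexp.MustCompile(`<h1>(.*)</h1>`)
	fmt.Println(greedy.FindStringSubmatch(content)[1])
}
```

With two headings in the input, the greedy pattern captures everything from the first opening tag to the last closing tag, which is usually not what is wanted.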

answered Sep 10, 2021 at 3:55

dganesh2002

miltonb · Accepted Answer · 2014-11-13 19:49:01Z

1

Read up on the strings package. Have a look into the SplitAfter function which can do something like this:

var sample = "[this][is my][string]"
t := strings.SplitAfter(sample, "[")

That should produce a slice something like: "[", "this][", "is my][", "string]". Using further functions for Trimming you should get your solution. Best of luck.
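One way the remaining step could look, as a sketch: after SplitAfter, every piece except the first begins right after a "[", so cutting each of those pieces at its first "]" leaves the bracketed text. The helper name bracketed is made up, and the sketch assumes brackets are not nested.

```go
package main

import (
	"fmt"
	"strings"
)

// bracketed is a hypothetical helper that collects the text found between
// each "[" and the next "]". It assumes brackets are not nested.
func bracketed(s string) []string {
	var out []string
	pieces := strings.SplitAfter(s, "[")
	// pieces[0] is the text up to and including the first "[", so skip it.
	for _, piece := range pieces[1:] {
		// Keep the part of the piece before its first closing bracket.
		if i := strings.Index(piece, "]"); i >= 0 {
			out = append(out, piece[:i])
		}
	}
	return out
}

func main() {
	fmt.Printf("%q\n", bracketed("[this][is my][string]"))
}
```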

answered Nov 13, 2014 at 19:49

miltonb

jmaloney · Accepted Answer · 2014-11-14 17:05:25Z

1

In the strings package you can use the Replacer to great effect.

r := strings.NewReplacer("<h1>", "", "</h1>", "")
fmt.Println(r.Replace("<h1>Hello World!</h1>"))

Go play!

answered Nov 14, 2014 at 17:05

jmaloney

2 Comments

stix Over a year ago

How does this answer the OP's question about finding the string between the tags? It only shows how to remove the tags.

jmaloney Over a year ago

My answer does exactly what the OP asked for "What Go code would be able to extract Hello World! from that string?"
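The difference the two comments are circling can be shown with a small sketch: the Replacer gives the expected result when the tags wrap the whole string, but when there is surrounding text it removes the tags and keeps everything else. The wrapper name stripTags and the second input are made up for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// stripTags removes the two tags wherever they occur, as in the answer above.
func stripTags(s string) string {
	r := strings.NewReplacer("<h1>", "", "</h1>", "")
	return r.Replace(s)
}

func main() {
	fmt.Println(stripTags("<h1>Hello World!</h1>"))             // Hello World!
	fmt.Println(stripTags("intro <h1>Hello World!</h1> outro")) // intro Hello World! outro
}
```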

Andre Romano · Accepted Answer · 2017-06-27 11:19:23Z

1

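// findInString returns the bytes between the first occurrence of start and
// the next occurrence of end. It needs the "errors" and "strings" packages.
func findInString(str, start, end string) ([]byte, error) {
    var match []byte
    index := strings.Index(str, start)

    if index == -1 {
        return match, errors.New("start not found")
    }

    index += len(start)

    // Collect bytes until the remainder of str begins with end.
    for index < len(str) {
        if strings.HasPrefix(str[index:], end) {
            return match, nil
        }

        match = append(match, str[index])
        index++
    }

    // The loop ran off the end of str without seeing end.
    return nil, errors.New("end not found")
}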

edited Jun 27, 2017 at 11:19

answered Jun 27, 2017 at 10:43

Andre Romano

kwatson · Accepted Answer · 2023-03-19 16:10:04Z

1

How about:

func SplitBetween(str, bef, aft string) string {
    sa := strings.SplitN(str, bef, 2)
    if len(sa) == 1 {
        return ""
    }
    sa = strings.SplitN(sa[1], aft, 2)
    if len(sa) == 1 {
        return ""
    }
    return sa[0]
}

Returns empty string if split is not found.

answered Mar 19, 2023 at 16:10

kwatson

gondo · Accepted Answer · 2019-04-17 16:22:42Z

func Split(str, before, after string) string {
    a := strings.SplitAfterN(str, before, 2)
    b := strings.SplitAfterN(a[len(a)-1], after, 2)
    if 1 == len(b) {
        return b[0]
    }
    return b[0][0:len(b[0])-len(after)]
}

The first call to SplitAfterN splits the original string into a slice of 2 parts at the first occurrence of before, or produces a slice with 1 part equal to the original string if before is not found.

The second call to SplitAfterN uses a[len(a)-1], the last item of a, as input: either the text following before or the original string str. That input is split into a slice of 2 parts at the first occurrence of after, or it produces a slice with 1 part equal to the input.

If after was not found, then we can simply return b[0], as it is equal to a[len(a)-1].

If after is found, it is included at the end of b[0], so you have to trim it via b[0][0:len(b[0])-len(after)].

All string comparisons are case sensitive.
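The fallbacks described above can be seen in a small sketch that runs the function, repeated here so the program is self-contained, on three made-up inputs: both delimiters present, after missing, and before missing.

```go
package main

import (
	"fmt"
	"strings"
)

// Split is copied from the answer above.
func Split(str, before, after string) string {
	a := strings.SplitAfterN(str, before, 2)
	b := strings.SplitAfterN(a[len(a)-1], after, 2)
	if 1 == len(b) {
		return b[0]
	}
	return b[0][0 : len(b[0])-len(after)]
}

func main() {
	// Both delimiters present: the text between them.
	fmt.Printf("%q\n", Split("<h1>Hello World!</h1>", "<h1>", "</h1>"))
	// after missing: everything following before.
	fmt.Printf("%q\n", Split("<h1>Hello", "<h1>", "</h1>"))
	// before missing: everything preceding after.
	fmt.Printf("%q\n", Split("Hello</h1>", "<h1>", "</h1>"))
}
```

Unlike the versions that return an empty string or a found flag, this one always returns some text, so the caller cannot tell from the result alone whether a delimiter was missing.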
