2

Suppose I have a string containing "ATGTTTGGATTAGGTAATGAAT".

I'd like to search the string for the first instance of either "TAG", "TAA", or "TGA".

To do this, I'd like to use regular expressions. I think std::regex_search will work, but I'm unsure how to write the syntax.

Any help would be greatly appreciated.

EDIT: I need to retrieve the position of the first instance of "TAG", "TAA", or "TGA" (whichever comes first).

3
  • Why would you want to do this with a regular expression when a simple 'find' would work? Or just 'strstr'? Commented Nov 14, 2013 at 1:41
  • find will find one substring, but not the first occurrence of a set of substrings Commented Nov 14, 2013 at 1:43
  • I think I just need to use something like "TAG|TAA|TGA". Commented Nov 14, 2013 at 1:54

4 Answers

3

You can try this:

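#include <iostream>
#include <regex>
#include <string>

int main() {
    std::string s("ATGTTTGGATTAGGTAATGAAT");
    std::regex r("TAG|TAA|TGA");
    std::sregex_iterator first(s.begin(), s.end(), r);
    // A default-constructed iterator means "no match"; check before dereferencing.
    if (first != std::sregex_iterator()) {
        std::cout << "position: " << first->position() << std::endl; // position: 10
    }
    return 0;
}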

The documentation is here: http://en.cppreference.com/w/cpp/regex
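If you ever need every occurrence rather than just the first, the same iterator can be advanced until it compares equal to a default-constructed one. Here is a minimal sketch of that idea; the helper name stop_codon_positions is made up for illustration and is not part of the standard library.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (name chosen for this example): returns the start index
// of every non-overlapping match of "TAG", "TAA", or "TGA", left to right.
std::vector<std::size_t> stop_codon_positions(const std::string& s) {
    static const std::regex r("TAG|TAA|TGA");
    std::vector<std::size_t> positions;
    // A default-constructed sregex_iterator is the end-of-sequence marker.
    for (std::sregex_iterator it(s.begin(), s.end(), r), end; it != end; ++it) {
        // position() is measured from the start of s, not from the previous match.
        positions.push_back(static_cast<std::size_t>(it->position()));
    }
    return positions;
}
```

For the string in the question this should give 10, 14, and 17, and the first element is the position asked for. Note that the matches are non-overlapping, so this is a scan of the raw string and not a reading-frame-aware codon search.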


0

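You can do it like this with strstr (shown here for "TAG" only):

#include <cstring>
#include <iostream>
#include <string>
using namespace std;
int main()
{
    string str="ATGTTTGGATTAGGTAATGAAT";
    string regstr="TAG";

    // strstr returns a pointer to the first occurrence of "TAG", or NULL if there is none.
    const char *show=strstr(str.c_str(), regstr.c_str());
    if (show != NULL)
        cout<<(show-str.c_str())<<endl; // index of the match: 10
    else
        cout<<"not found"<<endl;
    return 0;
}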
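As the comment under the question points out, a single substring search only handles one pattern at a time. One way around that without regular expressions is to search for each of the three strings and keep the smallest index. The sketch below uses std::string::find instead of strstr; the helper name first_of_substrings is invented for this example.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper (name chosen for this example): smallest index at which
// "TAG", "TAA", or "TGA" occurs in s, or std::string::npos if none occurs.
std::string::size_type first_of_substrings(const std::string& s) {
    const char* const patterns[] = {"TAG", "TAA", "TGA"};
    std::string::size_type best = std::string::npos;
    for (const char* p : patterns) {
        // find returns npos (the largest size_type value) on failure,
        // so std::min always prefers a real index over "not found".
        best = std::min(best, s.find(p));
    }
    return best;
}
```

This scans the string once per pattern, which is fine for three short patterns but scales poorly if the set of patterns grows.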


0

I don't know the specific call in C++ (maybe that's what you are asking about), but this is your regex:

/T(A[GA]|GA)/

That is, find a "T" followed by either (an "A" and a ["G" or "A"]) or followed by "GA".
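For the C++ side, a sketch along these lines should work with std::regex_search: pass a std::smatch and read the index from its position() member. The helper name first_match_position and the -1 "not found" convention are assumptions made for this example; the pattern is written without the surrounding slashes.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (name and -1 convention chosen for this example):
// index of the first match of T(A[GA]|GA) in s, or -1 if nothing matches.
long first_match_position(const std::string& s) {
    static const std::regex r("T(A[GA]|GA)");  // no outer slashes in C++
    std::smatch m;
    if (std::regex_search(s, m, r)) {
        // position(0) is the offset of the whole match from the start of s.
        return static_cast<long>(m.position(0));
    }
    return -1;
}
```

With the string from the question, first_match_position should return 10, the start of the first "TAG".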

3 Comments

This is definitely a start. But yeah I need to figure out the specific call and a way to get the position.
Stick my regex into the example given on the cplusplus.com reference for regex_search (NB: don't include the outer slashes).
That's actually what I'm working from but I'm not getting the results I want. I really just need the position (the index) of the first occurrence of one of the three strings. I'll keep trying.
0

For this specific problem (that is, assuming that "TAG", "TAA", and "TGA" are the strings to be searched for, and not just representatives of a more general problem), a simple search is easier:

find 'T'
  if the next character is 'A' and the character after that is 'A' or 'G', matched;
  else if the next character is 'G' and the character after that is 'A', matched;
  else go back and try the next 'T'
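The steps above translate fairly directly into a loop. This is only a sketch; the function name find_stop_codon and the choice to return std::string::npos when nothing matches are assumptions made for the example.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (name chosen for this example): index of the first
// "TAA", "TAG", or "TGA" in s, or std::string::npos if there is none.
std::size_t find_stop_codon(const std::string& s) {
    // i + 2 < s.size() guarantees that s[i + 1] and s[i + 2] exist.
    for (std::size_t i = 0; i + 2 < s.size(); ++i) {
        if (s[i] != 'T') continue;                  // find 'T'
        const char next = s[i + 1];
        const char after = s[i + 2];
        if (next == 'A' && (after == 'A' || after == 'G')) return i;  // TAA or TAG
        if (next == 'G' && after == 'A') return i;                    // TGA
        // otherwise fall through and try the next 'T'
    }
    return std::string::npos;
}
```

It makes a single pass over the string and needs no regex machinery, at the cost of hard-coding the three patterns.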
