
I want to match a string with the following criteria:

  • Match any letters, followed by a '.', followed by letters, followed by end-of-line.

For example, for the string 'www.stackoverflow.com', the regex should return 'stackoverflow.com'. I have the following code that works:

my_string = '''
    123.domain.com
    123.456.domain.com
    domain.com
    '''

>>> import re
>>> for i in my_string.split():
...     re.findall(r'[A-Za-z\.]*?([A-Za-z]+\.[a-z]+)$', i)
...
['domain.com']
['domain.com']
['domain.com']
>>>

The code snippet above works perfectly. But I'm sure there must be a more elegant way to achieve the same.

Is it possible to start the regex search/match starting from the end of the string, moving towards the start of the string? How would one code that type of regex? Or should I be using regex at all?

  • You can just simplify it by setting the i and m modifiers: poc. Commented Jun 5, 2013 at 16:47
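
The linked proof of concept is not reproduced here, so what follows is only a guess at what the comment has in mind. With the case-insensitive and multiline flags set, `$` matches at the end of every line, so one `findall` over the whole block can replace the loop. The pattern itself is an assumption, not the commenter's.

```python
import re

my_string = '''
    123.domain.com
    123.456.domain.com
    domain.com
    '''

# re.I makes [a-z] match upper case too; re.M lets $ match at the end of each line.
pattern = re.compile(r'[a-z]+\.[a-z]+$', re.I | re.M)
domains = pattern.findall(my_string)
print(domains)
```

This relies on there being no trailing whitespace after each host name, since `$` has to sit directly after the last letter.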

2 Answers


Your regex won't account for domains like domain.co.uk, so I would consider using something a little more robust. If you don't mind adding more dependencies to your script, there's a module named tldextract (pip install tldextract) that makes this pretty simple:

import tldextract

def get_domain(url):
    result = tldextract.extract(url)

    # Recent tldextract releases name this field `suffix`; older ones called it `tld`.
    return result.domain + '.' + result.suffix
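
To see why multi-part suffixes such as co.uk need special handling, here is a dependency-free sketch of the idea. The suffix set is a tiny hand-picked assumption for illustration, and `registered_domain` is a hypothetical helper, not part of tldextract; the real Public Suffix List is far larger.

```python
# Hypothetical, hand-picked suffixes for illustration only; the real
# Public Suffix List is much larger and changes over time.
KNOWN_SUFFIXES = {'com', 'org', 'co.uk', 'com.au'}

def registered_domain(host):
    """Return the label before the longest known suffix, plus that suffix."""
    labels = host.lower().split('.')
    # Candidate suffixes shrink as `start` grows, so the longest is tried first
    # and 'co.uk' is found before a shorter suffix could match.
    for start in range(1, len(labels)):
        suffix = '.'.join(labels[start:])
        if suffix in KNOWN_SUFFIXES:
            return labels[start - 1] + '.' + suffix
    return None  # no known suffix, or nothing in front of it

print(registered_domain('www.stackoverflow.com'))
print(registered_domain('forums.bbc.co.uk'))
```

Simply keeping the last two labels would turn the second host into co.uk, which is the failure described above.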


I'm not sure from your example if you're just trying to get the last two parts of the domain name, or if you're trying to remove the numbers. If you just want the last parts of the domain, you can do something like:

for i in my_string.split():
    print('.'.join(i.split('.')[-2:]))

This:

  1. splits each string on '.' into a list of parts, then
  2. joins the last two parts back into a single string, with a '.' separator.

Or, like this:

>>> my_string = ['123.domain.com', '123.456.domain.com', 'domain.com', 'www.stackoverflow.com']
>>> ['.'.join(i.split('.')[-2:]) for i in my_string]
['domain.com', 'domain.com', 'domain.com', 'stackoverflow.com']
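
On the original question of working from the end of the string: `str.rsplit` splits from the right, and its `maxsplit` argument stops it after a fixed number of splits. A minimal sketch, where `last_two_labels` is a hypothetical helper name:

```python
def last_two_labels(host):
    """Keep the last two dot-separated parts of a host name."""
    # maxsplit=2 splits at most twice, counting from the right, so anything
    # further left stays together in the first element and is then discarded.
    return '.'.join(host.rsplit('.', 2)[-2:])

hosts = ['123.domain.com', '123.456.domain.com', 'domain.com', 'www.stackoverflow.com']
results = [last_two_labels(h) for h in hosts]
print(results)
```

Like the split-and-join version, this still assumes the interesting part is exactly two labels long, so it does not handle hosts ending in something like co.uk.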

1 Comment

Thanks for the reply. The split and join methods work nicely.
