0

I have a CSV file containing the following URLs:

1;https://www.one.de 
2;https://www.two.de 
3;https://www.three.de
4;https://www.four.de
5;https://www.five.de

Then I load it into a pandas DataFrame df:

import pandas as pd
from urllib.parse import urlsplit

cols = ['nr', 'url']
df = pd.read_csv("listing.csv", sep=';', encoding="utf8", dtype=str, names=cols)

Then I would like to add another column, 'domain_name', holding the domain name that belongs to each nr:

def takedn(url):
    m = urlsplit(url)
    return m.netloc.split('.')[-2]

df['domain_name'] = takedn(df['url'].all())
print(df.head())

But every row gets the domain name of the last URL.

Output:
  nr                   url domain_name
0  1    https://www.one.de        five
1  2    https://www.two.de        five
2  3  https://www.three.de        five
3  4   https://www.four.de        five
4  5   https://www.five.de        five

I am trying this to learn vectorization, but it does not work the way I expected. The domain_name should be one in the first row, two in the second, and so on.
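One way to see what is going on: `.all()` is a reduction, so `takedn` is called only once and hands back a single string, and assigning a single string to a column repeats it in every row. The sketch below reproduces just that broadcasting step. The two-row frame and the variable name `single` are made up for illustration, and the URL is passed in by hand rather than taken from `.all()`.

```python
import pandas as pd
from urllib.parse import urlsplit

def takedn(url):
    # Second-to-last label of the host name, e.g. 'five' for 'www.five.de'
    return urlsplit(url).netloc.split(".")[-2]

# Illustrative two-row frame instead of listing.csv
df = pd.DataFrame({"url": ["https://www.one.de", "https://www.five.de"]})

# One call with one URL gives one string ...
single = takedn("https://www.five.de")

# ... and assigning a scalar to a column fills every row with it
df["domain_name"] = single
print(df)
```

So the function itself is fine; it just needs to be called once per row instead of once for the whole column.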

2 Answers

1

To operate on each element, you can use apply(), which calls the function once per value in the column.

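from urllib.parse import urlsplit

def takedn(url):
    # Second-to-last label of the host name, e.g. 'one' for 'www.one.de'
    m = urlsplit(url)
    return m.netloc.split('.')[-2]

# apply() calls takedn once for each URL in the column
df['domain_name'] = df['url'].apply(takedn)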
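Since the question is about vectorization: apply() still runs a Python function row by row. A column-wise alternative, as a rough sketch, is to use the pandas `.str` accessor. It assumes every URL has the form `scheme://host...` like the ones in the question, and the sample frame and the name `host` are only for illustration.

```python
import pandas as pd

# Illustrative frame with the same layout as in the question
df = pd.DataFrame({
    "nr": ["1", "2", "3"],
    "url": ["https://www.one.de", "https://www.two.de", "https://www.three.de"],
})

# 'https://www.one.de'.split('/') -> ['https:', '', 'www.one.de'], so index 2 is the host
host = df["url"].str.split("/").str[2]

# Second-to-last dot-separated label of the host
df["domain_name"] = host.str.split(".").str[-2]
print(df)
```

This is less robust than urlsplit for unusual URLs (for example ones without a scheme), so treat it as an illustration of the `.str` methods rather than a drop-in replacement.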

4 Comments

Thanks, perfect answer. Is there a good tutorial or explanation for vectorization? Just for understanding.
@orgen I'm not sure what you mean by vectorization, but there is a book called Python for Data Analysis that teaches how to use pandas to handle data.
@orgen That seems unrelated to pandas. But if you mainly want to apply functions to pandas objects, there are three main functions: apply, map and applymap. For the differences among them, see stackoverflow.com/questions/19798153.
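For a single column, apply and map behave the same when given a plain function. A small sketch, with the Series `urls` and the result names invented for the example:

```python
import pandas as pd
from urllib.parse import urlsplit

def takedn(url):
    # Second-to-last label of the host name
    return urlsplit(url).netloc.split(".")[-2]

urls = pd.Series(["https://www.one.de", "https://www.two.de"])

# Both call takedn once per element of the Series
via_apply = urls.apply(takedn)
via_map = urls.map(takedn)

print(via_apply.equals(via_map))
```

applymap is the DataFrame counterpart: it works element by element across all columns rather than on one Series.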
1

The third-party tldextract package has a ready-made function for this:

import tldextract
df['domain'] = df.url.map(lambda x : tldextract.extract(x).domain)
df
   nr                   url domain_name domain
0   1    https://www.one.de        five    one
1   2    https://www.two.de        five    two
2   3  https://www.three.de        five  three
3   4   https://www.four.de        five   four
4   5   https://www.five.de        five   five
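A possible reason to prefer a suffix-aware library: taking the second-to-last label with split('.')[-2] only works when the suffix is a single label such as .de. The sketch below uses only the standard library to show the failure case; the .co.uk URL is a made-up example and not part of the question's data.

```python
from urllib.parse import urlsplit

def takedn(url):
    # Same approach as in the question: second-to-last label of the host
    return urlsplit(url).netloc.split(".")[-2]

simple = takedn("https://www.one.de")
two_part = takedn("https://www.example.co.uk")

# With a two-label suffix the function returns part of the suffix instead of the name
print(simple, two_part)
```

If all URLs in the file end in .de, the split-based version is enough.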
