I am trying to parse a movie database with Python 3. How can I parse the genres of each movie into separate variables? For example:

1,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy
2,Jumanji (1995),Adventure|Children|Fantasy

The first value is movie_id, the second is the movie title, and the third is the list of genres, but I want to parse the genres as separate variables that belong to the corresponding movie. In other words, I want "|" to act as a second separator for my database. How can I achieve this? Here is my code:

import numpy as np
import pandas as pd
header = ["movie_id", "title", "genres"]
movie_db = pd.read_csv("movielens/movies.csv", sep=",", names=header)

1 Answer

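You can use a regular expression that matches both , and | as the separator, but then the first row must contain as many fields as the longest row in the file, because it determines the number of columns: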

df = pd.read_csv("movielens/movies.csv", sep="[,|]", header=None, engine='python')
print (df)
   0                 1          2          3         4       5        6
0  1  Toy Story (1995)  Adventure  Animation  Children  Comedy  Fantasy
1  2    Jumanji (1995)  Adventure   Children   Fantasy    None     None

A better approach is to create one indicator column per genre, set to 1 if the movie has that genre, with str.get_dummies (its default separator is |), and then add these columns to the original ones with join:

movie_db = pd.read_csv("movielens/movies.csv", sep=",", names=header)
df =  movie_db.join(movie_db.pop('genres').str.get_dummies())
print (df)
   movie_id             title  Adventure  Animation  Children  Comedy  Fantasy
0         1  Toy Story (1995)          1          1         1       1        1
1         2    Jumanji (1995)          1          0         1       0        1
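
As a self-contained sketch of the get_dummies approach, the snippet below reads the two sample rows from the question out of an in-memory string instead of movielens/movies.csv (that stand-in, and the names sample_csv, genre_flags, flags_df and comedy_titles, are assumptions made for illustration). It also shows one thing the indicator columns make easy: selecting movies by genre.

```python
import io

import pandas as pd

# Assumption: two sample rows from the question stand in for movielens/movies.csv.
sample_csv = io.StringIO(
    "1,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy\n"
    "2,Jumanji (1995),Adventure|Children|Fantasy\n"
)
header = ["movie_id", "title", "genres"]
movie_db = pd.read_csv(sample_csv, sep=",", names=header)

# One 0/1 column per genre; sep="|" is the default, spelled out here for clarity.
genre_flags = movie_db["genres"].str.get_dummies(sep="|")

# drop() returns a copy, so movie_db keeps its genres column (unlike pop()).
flags_df = movie_db.drop(columns="genres").join(genre_flags)

# Select titles by genre with a boolean mask on the indicator column.
comedy_titles = flags_df.loc[flags_df["Comedy"] == 1, "title"].tolist()
print(comedy_titles)
```

Using drop instead of pop leaves movie_db unchanged, which may be more convenient if you want to try several approaches on the same frame.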

If you need the genres themselves in separate columns instead, split on | with str.split:

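# pop() removes 'genres' from movie_db, so re-read the file if pop() was already called
movie_db = pd.read_csv("movielens/movies.csv", sep=",", names=header)
df = movie_db.join(movie_db.pop('genres').str.split('|', expand=True))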
print (df)
   movie_id             title          0          1         2       3        4
0         1  Toy Story (1995)  Adventure  Animation  Children  Comedy  Fantasy
1         2    Jumanji (1995)  Adventure   Children   Fantasy    None     None
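
The split columns come out with integer names (0, 1, 2, ...). Here is a minimal sketch, again on an in-memory copy of the two sample rows rather than the real file, that gives them readable names with add_prefix; the prefix "genre_" and the variable names are only illustrative choices.

```python
import io

import pandas as pd

# Assumption: two sample rows from the question stand in for movielens/movies.csv.
sample_rows = io.StringIO(
    "1,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy\n"
    "2,Jumanji (1995),Adventure|Children|Fantasy\n"
)
movies = pd.read_csv(sample_rows, sep=",", names=["movie_id", "title", "genres"])

# expand=True returns a DataFrame with one column per split position;
# rows with fewer genres are padded with missing values.
genre_cols = movies["genres"].str.split("|", expand=True)

# Rename the integer columns 0..4 to genre_0..genre_4 (hypothetical naming).
genre_cols = genre_cols.add_prefix("genre_")

split_df = movies.drop(columns="genres").join(genre_cols)
print(split_df)
```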