I have a data frame that I'm trying to plot in a bar graph but I'm facing a weird error.

import pandas
import matplotlib.pyplot as plot

.... a bunch of code that combines two different data frames to get one data frame


df = df.groupby(['title']).sum()
df.reindex()

print(df)

df.plot('bar', df['title'], df['number'])

the print statement gives:

Action         1.159667e+10
Adventure      7.086050e+09
Animation      1.159219e+10
Comedy         2.071842e+10
Crime          3.525629e+09
Drama          8.479182e+09
Family         3.705357e+09
Fantasy        3.613503e+10
History        1.261357e+09
Horror         1.034400e+09
Music          1.963180e+09
Romance        1.273498e+10
Sci-Fi         2.586427e+10
Sport          6.863091e+08
Thriller       2.245254e+10
War            1.699709e+09

but then the plot code, df.plot('bar', df['title'], df['number']), gives the following error:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
~/opt/anaconda3/lib/python3.8/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2645             try:
-> 2646                 return self._engine.get_loc(key)
   2647             except KeyError:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'Main_Genre'

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
in
     30 print(df)
     31
---> 32 df.plot('bar', df['Main_Genre'], df['worldwide_gross'])

~/opt/anaconda3/lib/python3.8/site-packages/pandas/core/frame.py in __getitem__(self, key)
   2798             if self.columns.nlevels > 1:
   2799                 return self._getitem_multilevel(key)
-> 2800             indexer = self.columns.get_loc(key)
   2801             if is_integer(indexer):
   2802                 indexer = [indexer]

~/opt/anaconda3/lib/python3.8/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2646                 return self._engine.get_loc(key)
   2647             except KeyError:
-> 2648                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   2649         indexer = self.get_indexer([key], method=method, tolerance=tolerance)
   2650         if indexer.ndim > 1 or indexer.size > 1:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

What am I doing wrong? Any help would be appreciated.

Thanks

  • Try: df.plot.bar('title', 'number') instead. Commented Sep 8, 2020 at 22:26
  • The same error happens. Commented Sep 9, 2020 at 6:30

2 Answers

I had to write code to generate the DataFrame myself. Next time, please include that in your question for convenience.

import pandas as pd

df = pd.DataFrame([
    ['Action', 1.159667e+10],
    ['Adventure', 7.086050e+09],
    ['Animation', 1.159219e+10],
    ['Comedy', 2.071842e+10],
    ['Crime', 3.525629e+09],
    ['Drama', 8.479182e+09],
    ['Family', 3.705357e+09],
    ['Fantasy', 3.613503e+10],
    ['History', 1.261357e+09],
    ['Horror', 1.034400e+09],
    ['Music', 1.963180e+09],
    ['Romance', 1.273498e+10],
    ['Sci-Fi', 2.586427e+10],
    ['Sport', 6.863091e+08],
    ['Thriller', 2.245254e+10],
    ['War', 1.699709e+09]
], columns=['title', 'number'])

There are several ways to do this. The easiest is close to what you tried, but the call should look like this:

df.plot.bar('title', 'number')

There is another way of doing the same thing, which is more explicit.

df.plot(kind='bar', x='title', y='number')

Finally, if you want to use matplotlib directly, you can plot it as follows. This is the standard approach and gives you the most flexibility, because you can adjust most elements of the figure.

import matplotlib.pyplot as plt

plt.figure(figsize=(12,8))
plt.bar(df['title'], df['number'])
plt.xticks(rotation='vertical')
plt.show()

4 Comments

Thanks a lot, Christopher. I built the data frame this way because I had to combine two other data frames and then sum the numbers for each 'title', so I can't simply write the data out manually. Sorry I didn't provide that before. Is there any way to change my current data frame so that I can plot it? Thanks
@tommy, isn't the data frame I created the same as yours? What's the difference then?
I'm fairly new to Python, but I believe it is related to the grouping and sum functions I'm using, as if the column I grouped by can't be used as an axis. When I try your approach before grouping, it plots properly. Is there anything about a grouped data frame that would cause this error? Should I make any changes to my data frame after grouping and summing?
@tommy, I think I roughly understand what you mean, but I can't help without knowing what your original data frame looks like.

The problem is that your groupby moves the "title" column into the index, so it no longer exists as a column name. Your error message shows this, using your real column names rather than the ones in your question:

KeyError: 'Main_Genre'

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
in
     30 print(df)
     31
---> 32 df.plot('bar', df['Main_Genre'], df['worldwide_gross'])

This KeyError occurs because "Main_Genre" is no longer a column; it is now the index (I assume you renamed it to "title" for the example code in your question).
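
To see the effect in isolation, here is a minimal sketch. The toy frame and its values are made up for illustration and only borrow the "title" and "number" names from your question; they are not your real data.

```python
import pandas as pd

# Hypothetical toy data standing in for the combined frame in the question.
raw = pd.DataFrame({
    "title": ["Action", "Drama", "Action", "Drama", "War"],
    "number": [1.0, 2.0, 3.0, 4.0, 5.0],
})

# Grouping on 'title' makes it the index of the result.
grouped = raw.groupby(["title"]).sum()

print(list(grouped.columns))  # 'title' is not among the columns any more
print(grouped.index.name)     # it is now the name of the index

# Looking it up as a column fails the same way as in the traceback.
lookup_error = None
try:
    grouped["title"]
except KeyError as err:
    lookup_error = err
    print("KeyError:", err)
```

Only "number" is left as a column, which is why df['title'] (or df['Main_Genre'] in your real code) cannot be found.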

Because you grouped on "title", it is now the index. To pass it by name to df.plot(), first turn it back into a column with .reset_index():

df.reset_index(drop=False, inplace=True)
df.plot(kind="bar", x="title", y="number")
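
A quick way to check that the column is back before plotting, again on made-up toy data. The as_index=False variant is my own addition rather than something from your code; it asks groupby not to move "title" into the index in the first place.

```python
import pandas as pd

# Hypothetical toy data; names follow the question, values are invented.
raw = pd.DataFrame({
    "title": ["Action", "Drama", "Action", "Drama", "War"],
    "number": [1.0, 2.0, 3.0, 4.0, 5.0],
})

# Option 1: group, then move the index back into a regular column.
fixed = raw.groupby(["title"]).sum().reset_index()

# Option 2 (an alternative, not from the question): keep 'title' as a column.
flat = raw.groupby(["title"], as_index=False).sum()

print(list(fixed.columns))
print(list(flat.columns))
```

Either frame can then be passed to df.plot(kind="bar", x="title", y="number").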

Alternatively, from reading the Pandas documentation, you can use the index as the x-axis:

df.plot(kind="bar", y="number", use_index=True)

As can be seen from the above, if you take the first approach and reset the index, you can then use any of the plotting methods in @Christopher's answer. If you want to keep "title" as the index, you can swap df["title"] for df.index in his matplotlib.pyplot example, or pass use_index=True to df.plot.
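
The two index-based options can be sketched as follows, assuming made-up toy data and the non-interactive "Agg" backend so that it runs without a display. The variable names are hypothetical, and use_index is left at its pandas default of True.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical toy data; names follow the question, values are invented.
raw = pd.DataFrame({
    "title": ["Action", "Drama", "Action", "Drama", "War"],
    "number": [1.0, 2.0, 3.0, 4.0, 5.0],
})
grouped = raw.groupby(["title"]).sum()  # 'title' is the index here

# pandas: with no x given, the index supplies the position of each bar.
ax = grouped.plot(kind="bar", y="number")

# matplotlib: pass the index where the other answer passed df['title'].
fig, ax2 = plt.subplots(figsize=(12, 8))
bars = ax2.bar(grouped.index, grouped["number"])
ax2.tick_params(axis="x", rotation=90)
```

Call plt.show() at the end when running interactively.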
