I have an existing DataFrame that is grouped by job title and by year. I want to create a nested bar graph in Bokeh from it, but I am not sure what to pass to Bokeh in order to plot it properly.

The dataframe:

                       size
fromJobtitle      year
CEO               2000   236
                  2001   479
                  2002     4
Director          2000    42
                  2001   609
                  2002   188
Employee          1998    23
                  1999   365
                  2000  2393
                  2001  5806
                  2002   817
In House Lawyer   2000     5
                  2001    54
Manager           1999     8
                  2000   979
                  2001  2173
                  2002   141
Managing Director 1998     2
                  1999    14
                  2000   130
                  2001   199
                  2002    11
President         1999    31
                  2000   202
                  2001   558
                  2002   198
Trader            1999     5
                  2000   336
                  2001   494
                  2002    61
Unknown           1999   591
                  2000  2960
                  2001  3959
                  2002   673
Vice President    1999    49
                  2000  2040
                  2001  3836
                  2002   370

An example output is: (image: example graph)

1 Answer

I assume you have a DataFrame df with three columns: fromJobtitle, year and size. If you have a MultiIndex, reset the index first (df.reset_index()). To use FactorRange from Bokeh, we need a list of tuples of two strings (this is important; floats won't work), like

[('CEO', '2000'), ('CEO', '2001'), ('CEO', '2002'), ...] 

and so on.

This can be done with

df['x'] = df[['fromJobtitle', 'year']].apply(lambda row: (row['fromJobtitle'], str(row['year'])), axis=1)
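For a DataFrame that still carries the MultiIndex shown in the question, the two preparation steps could be sketched as follows. The four sample rows are copied from the question's table only for illustration, the name `grouped` is hypothetical, and `zip` is used here as an alternative to `apply`; it is not the answer's original method.

```python
import pandas as pd

# Hypothetical sample: the first four rows of the question's table.
index = pd.MultiIndex.from_tuples(
    [("CEO", 2000), ("CEO", 2001), ("CEO", 2002), ("Director", 2000)],
    names=["fromJobtitle", "year"],
)
grouped = pd.DataFrame({"size": [236, 479, 4, 42]}, index=index)

# Move both index levels back into ordinary columns.
df = grouped.reset_index()

# Pair each job title with its year; the year is converted to a string
# because the factors need to be strings.
df["x"] = list(zip(df["fromJobtitle"], df["year"].astype(str)))

print(df["x"].tolist())
```

Either way, the x column ends up holding one (job title, year) tuple of strings per row.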

That is all the hard part. Bokeh does the rest for you.

from bokeh.plotting import show, figure, output_notebook
from bokeh.models import FactorRange
output_notebook()

p = figure(
    x_range=FactorRange(*list(df["x"])),
    width=1400
)
p.vbar(
    x="x",
    top="size",
    width=0.9,
    source=df,
)

show(p)
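Outside a notebook, the same plot could be built with an explicit ColumnDataSource instead of passing the DataFrame directly. This is a minimal sketch under a few assumptions: a recent Bokeh version that accepts `width` on `figure`, a handful of sample values copied from the question, and hypothetical names `factors` and `sizes`. The label rotation is an optional tweak and is not part of the answer above.

```python
from bokeh.models import ColumnDataSource, FactorRange
from bokeh.plotting import figure

# Hypothetical sample: (job title, year) factors with their counts.
factors = [("CEO", "2000"), ("CEO", "2001"), ("CEO", "2002"),
           ("Director", "2000"), ("Director", "2001"), ("Director", "2002")]
sizes = [236, 479, 4, 42, 609, 188]

# Explicit data source; the column names are arbitrary.
source = ColumnDataSource(data=dict(x=factors, top=sizes))

# The nested x axis comes from passing the tuples to FactorRange.
p = figure(x_range=FactorRange(*factors), width=1400)
p.vbar(x="x", top="top", width=0.9, source=source)

# Optional: rotate the year labels so they do not overlap.
p.xaxis.major_label_orientation = 1
```

Calling show(p) from bokeh.plotting then renders the figure, as in the answer's code.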

This is the generated figure: (image: bar plot)
