creating histogram from 2d array python

Question

Currently I have a matrix of 1's, 0's, and -1's where each row is a person and each column is a bill that they voted on. The 1's, 0's, and -1's in each cell denote how they voted.

The histogram I am trying to build would show, on the Y axis, the number of people with x yes votes (that is, the number of rows containing x 1's). The X axis would have ticks for 0 to N yes votes. So for example, if 30 people each cast 30 yes votes, the bar at the 30 label on the X axis would go up to 30 on the Y axis.

Here is a screenshot of these histograms that I quickly made in MATLAB (which is where my experience with such things lies): histograms built in MATLAB

My question is how to easily and effectively do this in Python. I have very little experience with Python.

The code I have:

import matplotlib.pyplot as plt

def buildHistogram(matrix):
    plt.hist(matrix, bins=30)
    plt.show()

Which yields: histograms built in Python

Please let me know how I can split these into three different histograms. Do I need to make three different arrays?

Try using pandas itself for filtering the data and then using its built-in hist: df[df.desired_column == 1].hist(bins=30), for the Yes votes of a desired_column.
– Vinícius Figueiredo, Commented Jul 12, 2017 at 23:34
Do you mean the file from which the data is pulled? It is a long text file of -1's, 1's, and 0's. @MSeifert
– cadence glorpon, Commented Jul 12, 2017 at 23:35
@ViníciusAguiar do you know if I am able to include a list of columns? I would like to see all the columns past the first 10 at once.
– cadence glorpon, Commented Jul 12, 2017 at 23:37
Hmm, I'm not sure how to do that, maybe @MSeifert knows a good way! =)
– Vinícius Figueiredo, Commented Jul 12, 2017 at 23:41
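
On the follow-up about looking at all the columns past the first 10 at once: one possible approach, sketched here with plain NumPy slicing rather than the pandas filter suggested above, is to slice the columns first and then count per row. The small `votes` matrix and the `first_col` cutoff are made up purely for illustration.

```python
import numpy as np

# Hypothetical 3-person x 12-bill vote matrix (1 = yes, 0 = nothing, -1 = no).
votes = np.array([
    [1,  1, 0, -1, 1,  0,  0,  1, -1,  1,  1, 1],
    [0,  0, 0,  1, 1,  1, -1, -1,  0,  0, -1, 1],
    [1, -1, 1, -1, 1, -1,  1, -1,  1, -1,  0, 0],
])

first_col = 10                  # skip the first 10 bills
subset = votes[:, first_col:]   # all rows, columns 10 onwards

# Count yes votes per person within the selected bills only.
yes_per_person = np.sum(subset == 1, axis=1)
print(yes_per_person)  # [2 1 0]
```

The resulting per-person counts can then be passed to a histogram call in the same way as counts taken over the full matrix.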

MSeifert · Accepted Answer · 2017-07-12 23:39:24Z


I used a random data set to reproduce it:

import numpy as np
import matplotlib.pyplot as plt
arr = np.random.randint(-1, 2, (200, 100))

Then it's just (neglecting axis labels and titles):

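fig, (ax1, ax2, ax3) = plt.subplots(1, 3)
# For each row (person), count the cells equal to a vote value, then histogram those counts
ax1.hist(np.sum(arr == -1, axis=1), bins=30)  # no
ax2.hist(np.sum(arr == 0, axis=1), bins=30)   # nothing
ax3.hist(np.sum(arr == 1, axis=1), bins=30)   # yes
plt.show()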

This gives me three side-by-side histograms (no, nothing, yes), which should be roughly what you want.
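
Since the snippet above leaves out axis labels and titles, here is a minimal sketch of how they might be added. The function name `plot_vote_histograms`, the panel titles, and the label wording are assumptions for illustration, not part of the original answer.

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_vote_histograms(arr, bins=30):
    """Draw one histogram per vote value; return the figure and axes."""
    # (vote value, panel title) pairs; the titles are illustrative only.
    panels = [(-1, "no"), (0, "nothing"), (1, "yes")]
    fig, axes = plt.subplots(1, 3, sharey=True, figsize=(12, 4))
    for ax, (value, title) in zip(axes, panels):
        # Per-person count of cells equal to this vote value.
        per_person = np.sum(arr == value, axis=1)
        ax.hist(per_person, bins=bins)
        ax.set_title(title)
        ax.set_xlabel(f"number of '{title}' votes")
    axes[0].set_ylabel("number of people")
    fig.tight_layout()
    return fig, axes

arr = np.random.randint(-1, 2, (200, 100))  # same random stand-in as above
fig, axes = plot_vote_histograms(arr)
```

Calling `plt.show()` afterwards displays the figure, and `fig.savefig(...)` writes it to a file instead.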



2 Comments

Daniel F Over a year ago

Weird how the "random" data generated such stripey graphs.

MSeifert Over a year ago

@DanielF I don't think these "empty stripes" are real. They are an artifact of my data set, where the range of values can be smaller than the number of bins. For example, in the second data set the range is ~23 to ~44, so it has 21 "filled bins" and 9 "empty bins"...
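
One way to avoid those artificial gaps, assuming the values being histogrammed are integer counts as they are here, is to give every integer its own bin instead of using a fixed `bins=30`. The helper name `integer_bin_edges` is made up for this sketch.

```python
import numpy as np

def integer_bin_edges(counts):
    """Return bin edges that give each integer in the range of counts its own bin."""
    lo, hi = int(np.min(counts)), int(np.max(counts))
    # Edges at lo-0.5, lo+0.5, ..., hi+0.5 centre every bin on an integer.
    return np.arange(lo, hi + 2) - 0.5

rng = np.random.default_rng(0)
arr = rng.integers(-1, 2, size=(200, 100))   # random stand-in vote matrix
yes_counts = np.sum(arr == 1, axis=1)        # yes votes per person

edges = integer_bin_edges(yes_counts)
heights, _ = np.histogram(yes_counts, bins=edges)
```

Passing `bins=edges` to `ax.hist` should produce the same bars as `heights`. Any bar that is still empty then corresponds to a count that no row actually has, rather than to a bin that falls between two integers.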
