I have an array that has different values, some of which are duplicates. How can I draw a histogram for them whose horizontal axis is the name of the element and whose vertical axis is the number of times it occurs in the array?
arr= ['a','a','a','b','c','b']
Note that matplotlib's hist does not play nicely with string data (see the bar/tick positions):
import matplotlib.pyplot as plt
plt.hist(arr)
It's certainly possible to fix this manually, but it's easier to use pandas or seaborn. Both use matplotlib under the hood, but they provide better default formatting.
Also:
To change the figure size, use figsize. In these examples I've set figsize=(6, 3).
To rotate the x ticks, add plt.xticks(rotation=90).
pandas value_counts and plot.bar
import pandas as pd
pd.Series(arr).value_counts().plot.bar(figsize=(6, 3))
# pd.value_counts(arr).plot.bar(figsize=(6, 3))  # top-level pd.value_counts is deprecated in recent pandas versions
seaborn histplot
import seaborn as sns
import matplotlib.pyplot as plt
fig, ax = plt.subplots(figsize=(6, 3))
sns.histplot(arr, ax=ax)
seaborn countplot
import seaborn as sns
import matplotlib.pyplot as plt
fig, ax = plt.subplots(figsize=(6, 3))
sns.countplot(x=arr, ax=ax)
collections.Counter with matplotlib bar
import matplotlib.pyplot as plt
from collections import Counter
counts = Counter(arr)
fig, ax = plt.subplots(figsize=(6, 3))
ax.bar(counts.keys(), counts.values())
numpy unique with matplotlib bar
import matplotlib.pyplot as plt
import numpy as np
uniques, counts = np.unique(arr, return_counts=True)
fig, ax = plt.subplots(figsize=(6, 3))
ax.bar(uniques, counts)
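The Counter approach can also be combined with the figure-size and tick-rotation tips above. The following is only a sketch: the helper names sorted_counts and plot_sorted_counts are made up for illustration, and sorting the bars by frequency is an extra choice that the question does not ask for.

```python
from collections import Counter


def sorted_counts(items):
    """Return (label, count) pairs, most frequent first."""
    return Counter(items).most_common()


def plot_sorted_counts(items, figsize=(6, 3), rotation=90):
    """Draw one bar per distinct item, tallest bar first (hypothetical helper)."""
    # Imported here so the counting helper above works without matplotlib.
    import matplotlib.pyplot as plt

    pairs = sorted_counts(items)
    labels = [label for label, _ in pairs]
    counts = [count for _, count in pairs]

    fig, ax = plt.subplots(figsize=figsize)
    ax.bar(labels, counts)
    # Rotate the x tick labels, useful when the element names are long.
    ax.tick_params(axis="x", labelrotation=rotation)
    ax.set_ylabel("count")
    return ax
```

Calling plot_sorted_counts(arr) followed by plt.show() should then display the chart.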
You can use the matplotlib library to plot a histogram directly from a list. The code for it goes as follows:
from matplotlib import pyplot as plt
arr= ['a','a','a','b','c','b']
plt.hist(arr)
plt.show()
You can read more about matplotlib's hist function here: https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.hist.html
You can also set the color of the histogram, change the bar alignment, and adjust many other options.
Take set(arr), create a dictionary with those elements as the keys and default values of 0, iterate over the list, update the counts, and plot a histogram from that dictionary using matplotlib.
Histograms are meant for numerical data. For string (categorical) data like yours, a bar chart is a better fit. Here's the code to generate a bar chart with matplotlib:
import matplotlib.pyplot as plt
from collections import Counter
arr= ['a', 'a', 'a', 'b', 'c', 'b']
data = Counter(arr)
plt.bar(data.keys(), data.values())
plt.show()
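Counter does the counting in the code above. The manual route mentioned earlier (set(arr), a dictionary of zeros, then a loop) could look roughly like this. It is a sketch: the names labels and heights are illustrative, and sorting the labels alphabetically is an assumption made only to get a stable bar order, since a set has no defined order.

```python
arr = ['a', 'a', 'a', 'b', 'c', 'b']

# Start every distinct element at a count of 0.
counts = {label: 0 for label in set(arr)}

# Walk the list once and bump the count of each element seen.
for item in arr:
    counts[item] += 1

# Sets are unordered, so sort the labels to get a predictable bar order.
labels = sorted(counts)
heights = [counts[label] for label in labels]

# plt.bar(labels, heights) would then draw the same chart as above.
```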
Even though the histogram and the bar chart look similar in this case, a histogram can give unexpected results, for instance if you request a specific number of bins.
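To see why the bin count matters, here is a small pure-Python sketch of equal-width binning. It assumes that the plotting library places each distinct string at an integer position in order of first appearance (a at 0, b at 1, c at 2), which is my understanding of how matplotlib treats categorical data. The function equal_width_counts is a made-up helper, not a matplotlib API.

```python
arr = ['a', 'a', 'a', 'b', 'c', 'b']

# Assumption: each distinct string sits at an integer position,
# assigned in order of first appearance (a -> 0, b -> 1, c -> 2).
positions = {label: i for i, label in enumerate(dict.fromkeys(arr))}
codes = [positions[item] for item in arr]


def equal_width_counts(values, bins):
    """Count values in `bins` equal-width intervals spanning min..max.

    Assumes at least two distinct values, so the interval width is non-zero.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The last interval is closed on the right, so max(values) lands in it.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


print(equal_width_counts(codes, bins=3))  # [3, 2, 1]: one bar per category
print(equal_width_counts(codes, bins=2))  # [3, 3]: 'b' and 'c' share a bar
```

With two bins, the counts for b and c are merged into a single bar, which is no longer a per-element count. A bar chart avoids this because it has no notion of bins.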
There are multiple steps to this problem.
Step 1: Collect the counts in a convenient structure. Going by your example, a good option is a list of counts, one per distinct element; the list's .count() method can produce these. Other methods are possible, of course.
Step 2: To display the data, you could use a library like matplotlib.pyplot. This may also take care of step 1, but that is not important.
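The two steps could be sketched as follows. This is one possible reading of the steps, not the only one: the helper name show_bar_chart and the axis labels are made up for illustration.

```python
arr = ['a', 'a', 'a', 'b', 'c', 'b']

# Step 1: distinct labels in first-seen order, then one .count() call per label.
labels = list(dict.fromkeys(arr))
counts = [arr.count(label) for label in labels]


def show_bar_chart(labels, counts):
    """Step 2: draw one bar per label (hypothetical helper name)."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    plt.bar(labels, counts)
    plt.xlabel("element")
    plt.ylabel("count")
    plt.show()
```

Calling show_bar_chart(labels, counts) opens the chart. Note that .count() rescans the whole list for every label, which is fine for short lists; for long ones, a single pass with collections.Counter is usually the better choice.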