Problems assigning color to bars in Pandas v0.20 and matplotlib

Question

I have been struggling for a while with defining colors in a bar plot using Pandas and Matplotlib. Let us imagine that we have the following dataframe:

import pandas as pd
pers1 = ["Jesús","lord",2]
pers2 = ["Mateo","apostel",1]
pers3 = ["Lucas","apostel",1]
    
dfnames = pd.DataFrame(
    [pers1,pers2, pers3],
    columns=["name","type","importance"]
)

Now I want to create a bar plot with importance as the numerical value, the names of the people as tick labels, and the type column determining the colors. I have read other questions (for example: Define bar chart colors for Pandas/Matplotlib with defined column), but the approach suggested there doesn't work for me.

So, first I have to define colors and assign them to different values:

colors = {'apostel':'blue','lord':'green'}

And finally use the .plot() function:

dfnames.plot(
    x="name",
    y="importance",
    kind="bar",
    color = dfnames['type'].map(colors)
)

Good. The only problem is that all bars are green:

Why?? I don't know... I am testing it in Spyder and Jupyter... Any help? Thanks!

What is your pandas version? I haven't tested this but if I remember correctly, there was a bug related to this. — ayhan, Jan 10 '18 at 09:40
@José I've mentioned two alternatives. Please try the second one and let me know whether it works or not, before updating. Thanks. — cs95, Jan 10 '18 at 09:52
You were right, it was a bug. I tend to think that the problem is in my code, of course. Thanks! — José, Jan 10 '18 at 09:58

score 7 · Accepted Answer · answered Jan 10 '18 at 09:51

As per GH16822, this is a regression introduced in version 0.20.3, in which only the first colour was picked from the list of colours passed. Earlier versions were not affected.

The reason, according to one of the contributors, was this:

The problem seems to be in _get_colors. I think that BarPlot should define a _get_colors that does something like

def _get_colors(self, num_colors=None, color_kwds='color'):
    color = self.kwds.get('color')
    if color is None:
        return super()._get_colors(self, num_colors=num_colors, color_kwds=color_kwds)
    else:
        num_colors = len(self.data)  # maybe? may not work for some cases
        return _get_standard_colors(color=kwds.get('color'), num_colors=num_colors)

There are a couple of options for you:

The most obvious choice would be to update to the latest version of pandas (currently v0.22)

If you need a workaround, there's one (also mentioned in the issue tracker) whereby you wrap the arguments within an extra tuple -

dfnames.plot(x="name",
             y="importance",
             kind="bar",
             color=[tuple(dfnames['type'].map(colors))])

Though, in the interest of progress, I'd recommend updating your pandas.
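Whichever option you pick, the outcome can be checked programmatically rather than by eye. The following is only a minimal sketch: `bar_hex_colors` is a hypothetical helper (not part of pandas or matplotlib), the Agg backend is selected just so the script runs without a display, and the column is plotted as a Series indexed by name, which is an assumption made here so that the colour list is applied bar by bar.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in Jupyter or Spyder
import matplotlib.colors as mcolors
import pandas as pd

dfnames = pd.DataFrame(
    [["Jesús", "lord", 2], ["Mateo", "apostel", 1], ["Lucas", "apostel", 1]],
    columns=["name", "type", "importance"],
)
colors = {"apostel": "blue", "lord": "green"}


def bar_hex_colors(ax):
    """Hypothetical helper: hex face colour of every bar patch drawn on `ax`."""
    return [mcolors.to_hex(patch.get_facecolor()) for patch in ax.patches]


# Plot the column as a Series indexed by name, passing one colour per row.
ax = dfnames.set_index("name")["importance"].plot(
    kind="bar", color=list(dfnames["type"].map(colors))
)

print("pandas", pd.__version__)
print(bar_hex_colors(ax))
```

If every entry in the printed list is the same colour, the installed pandas is probably still affected by the regression. The helper accepts any Axes, so it can also be given the object returned by your own `dfnames.plot(...)` call.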

score 2 · Answer 2 · answered Jan 10 '18 at 10:27

I found another solution to your problem, and it works!

I used the matplotlib library directly instead of the plot method of the data frame. Here is the code:

import pandas as pd
import matplotlib.pyplot as plt
# for Jupyter notebooks only; keep the magic on a line of its own
%matplotlib inline

pers1 = ["Jesús","lord",2]
pers2 = ["Mateo","apostel",1]
pers3 = ["Lucas","apostel",1]

dfnames = pd.DataFrame([pers1,pers2, pers3], columns=["name","type","importance"])

fig, ax = plt.subplots()
bars = ax.bar(dfnames.name, dfnames.importance)


colors = {'apostel':'blue','lord':'green'}

for index, bar in enumerate(bars) :
    color = colors.get(dfnames.loc[index, 'type'], 'b')  # colour for this row's type, blue if the type is unknown
    bar.set_facecolor(color)  # pass the full colour name, not just its first letter
plt.show()

And here is the result:
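A shorter variant of the same idea is to hand matplotlib the whole colour list in the `ax.bar` call, so no loop over the bars is needed. This is only a sketch under a couple of assumptions: the "gray" fallback for types missing from the dictionary is my own choice, and the Agg backend is selected just so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in Jupyter or Spyder
import matplotlib.colors as mcolors
import matplotlib.pyplot as plt
import pandas as pd

dfnames = pd.DataFrame(
    [["Jesús", "lord", 2], ["Mateo", "apostel", 1], ["Lucas", "apostel", 1]],
    columns=["name", "type", "importance"],
)
colors = {"apostel": "blue", "lord": "green"}

# One colour per row; "gray" is an assumed fallback for types not in the dict.
bar_colors = dfnames["type"].map(colors).fillna("gray").tolist()

fig, ax = plt.subplots()
# `color` accepts a sequence with one entry per bar.
bars = ax.bar(
    dfnames["name"].tolist(),
    dfnames["importance"].tolist(),
    color=bar_colors,
)

# Read the face colours back as hex strings to confirm the mapping.
print([mcolors.to_hex(bar.get_facecolor()) for bar in bars])
```

Because this bypasses the pandas plotting layer entirely, it should not depend on which pandas version is installed.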
