
Consider -

import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3]})
df

   a
0  1
1  2
2  3

I'd like to do two things:

  1. Convert the dataframe to sparse with a default fill value of False
  2. Assign a column of all False values to this sparse dataframe


Here are two seemingly similar approaches I've come up with.

First method: assign the column, then convert the result to sparse.

df.assign(newcol=False).to_sparse(fill_value=False)

   a  newcol
0  1   False
1  2   False
2  3   False
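
Since the printed frame doesn't show dtypes, I've been sanity-checking this first result with the ordinary .dtypes attribute (nothing sparse-specific; output omitted since it may vary by pandas version):

# check what dtype newcol actually ended up with in the first method
df.assign(newcol=False).to_sparse(fill_value=False).dtypes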

Second method: first convert to sparse, then assign the column.

df.to_sparse(fill_value=False).assign(newcol=False)

   a  newcol
0  1     0.0
1  2     0.0
2  3     0.0

These 0.0s threw me off.
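
The same kind of check on the second result is consistent with what's printed above: newcol comes back as a float column rather than bool. I'm only using type() and .dtypes here, nothing beyond that (output omitted):

# compare what the second method produces
sdf = df.to_sparse(fill_value=False).assign(newcol=False)
print(type(sdf))    # confirm what kind of frame assign returned
print(sdf.dtypes)   # newcol is a float dtype here, matching the 0.0s shown above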

FWIW, this variant of the second method (converting first, then assigning an explicit SparseSeries) also seems to work properly, giving False instead of 0.0 -

df = df.to_sparse(fill_value=False)
df['newcol'] = pd.SparseSeries([False] * len(df), dtype='bool_', fill_value=False)
df

   a  newcol
0  1   False
1  2   False
2  3   False

I'm confused as to why two seemingly similar approaches produce such radically different results. What's the correct way to do this, and why do the outputs differ?
