Pandas assigning column names as values based on condition

Asked Feb 02 '21 at 07:46

Active Feb 02 '21 at 07:59

I want to pick the two largest values in each row, sort them, and get their column names as values. All other values are dropped from the dataframe.

import numpy as np  # needed for np.nan below
import pandas as pd

d = {'col1': [1, 2, np.nan], 'col2': [2,3,3], 'col3': [3,6,5], 'col4': [4,9,10], 'col5': [5,1, np.nan], 'col6': [7,np.nan,2], 'col7': [np.nan, 5,6]}
df = pd.DataFrame(data=d)

I can already get the two largest values of each row, but reshaping the dataframe so that it holds column names instead of values is another task. The code below leaves the rest of the values as NaN, which is fine. But how do I reshape the result and get the column names?

lasttwo = df.stack().sort_values(ascending=True).groupby(level=0).tail(2).unstack()

Example:

---	col1	col2	col3	col4	col5	col6	col7
a	1	2	3	4	5	7	NaN
b	2	3	6	9	1	NaN	5
c	NaN	3	5	10	NaN	2	6

Intended result:

---	---	---
a	col6	col5
b	col4	col3
c	col4	col7
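
One possible way to get from the example to the intended result, as a minimal sketch rather than a definitive answer: apply `Series.nlargest` row by row and keep the index labels of what it returns. The row labels `a`, `b`, `c` are taken from the example table (the original code does not set them), and the helper name `top_two_columns` and the output column names `first` and `second` are made up for illustration. The sketch assumes every row has at least two non-NaN values.

```python
import numpy as np
import pandas as pd

d = {'col1': [1, 2, np.nan], 'col2': [2, 3, 3], 'col3': [3, 6, 5],
     'col4': [4, 9, 10], 'col5': [5, 1, np.nan], 'col6': [7, np.nan, 2],
     'col7': [np.nan, 5, 6]}
# Assumption: row labels a, b, c as in the example table.
df = pd.DataFrame(data=d, index=['a', 'b', 'c'])


def top_two_columns(row):
    """Return the names of the two largest non-NaN entries of a row (hypothetical helper)."""
    # nlargest ignores NaN and returns the values in descending order,
    # so the index of its result holds the matching column names.
    names = row.nlargest(2).index.tolist()
    # Assumption: the row has at least two non-NaN values.
    return pd.Series(names, index=['first', 'second'])


result = df.apply(top_two_columns, axis=1)
print(result)
```

Applying a Python function per row is slow on wide or long frames, so this is meant to show the idea on small data rather than to be the fastest option.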

Eraseri

Does this answer your question? [Find the column names which have top 3 largest values for each row](https://stackoverflow.com/questions/37494844/find-the-column-names-which-have-top-3-largest-values-for-each-row) – Paul Brennan Feb 02 '21 at 07:56
I'm afraid it doesn't. The problem is the NaN values: argsort doesn't work with them. – Eraseri Feb 02 '21 at 09:16
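
On the NaN issue with argsort: NumPy sorts NaN to the end, so taking the last positions of an ascending `argsort` picks up the missing entries. A hedged sketch of one workaround is to replace NaN with `-inf` first and sort in descending order. The row labels `a`, `b`, `c` come from the example table, and the variable names `values`, `order` and `result_np` are illustrative only. It assumes each row has at least two non-NaN values; otherwise a column that held NaN would be returned.

```python
import numpy as np
import pandas as pd

d = {'col1': [1, 2, np.nan], 'col2': [2, 3, 3], 'col3': [3, 6, 5],
     'col4': [4, 9, 10], 'col5': [5, 1, np.nan], 'col6': [7, np.nan, 2],
     'col7': [np.nan, 5, 6]}
# Assumption: row labels a, b, c as in the example table.
df = pd.DataFrame(data=d, index=['a', 'b', 'c'])

# Replace NaN with -inf so missing entries rank as the smallest values.
values = df.fillna(-np.inf).to_numpy()

# Sorting the negated values ascending yields column positions in
# descending order of the original values; keep the first two per row.
order = np.argsort(-values, axis=1, kind='stable')[:, :2]

# Map the column positions back to column names.
result_np = pd.DataFrame(df.columns.to_numpy()[order], index=df.index)
print(result_np)
```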

0 Answers