I want to pick the two largest values in each row, sort them, and get the corresponding column names as values. All other values should be dropped from the DataFrame.

import numpy as np
import pandas as pd
d = {'col1': [1, 2, np.nan], 'col2': [2, 3, 3], 'col3': [3, 6, 5], 'col4': [4, 9, 10], 'col5': [5, 1, np.nan], 'col6': [7, np.nan, 2], 'col7': [np.nan, 5, 6]}
df = pd.DataFrame(data=d)

I can already get the two largest values of each row, but reshaping the DataFrame so that it holds column names instead of values is a separate problem. The code below leaves the rest of the values as NaN, which is fine. But how do I reshape the result and get the column names?

lasttwo = df.stack().sort_values(ascending=True).groupby(level=0).tail(2).unstack()

Example:

--- col1 col2 col3 col4 col5 col6 col7
a 1 2 3 4 5 7 NaN
b 2 3 6 9 1 NaN 5
c NaN 3 5 10 NaN 2 6

Intended result:

--- --- ---
a col6 col5
b col4 col3
c col4 col7
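
For reference, here is a minimal sketch of one way the stacked approach could be extended to return column names. It assumes the rows are labelled `a`, `b`, `c` as in the example (the constructor above uses the default integer index), and the names `stacked`, `top2`, `rank` and `result` are only illustrative. The explicit `dropna()` is there because whether `stack()` drops NaN on its own depends on the pandas version.

```python
import numpy as np
import pandas as pd

d = {'col1': [1, 2, np.nan], 'col2': [2, 3, 3], 'col3': [3, 6, 5],
     'col4': [4, 9, 10], 'col5': [5, 1, np.nan], 'col6': [7, np.nan, 2],
     'col7': [np.nan, 5, 6]}
# Assumption: index labels taken from the example table, not from the original code
df = pd.DataFrame(data=d, index=['a', 'b', 'c'])

# Long format indexed by (row label, column name); drop NaN explicitly
stacked = df.stack().dropna()

# Sort all values descending, then keep the first two entries of each row
top2 = stacked.sort_values(ascending=False).groupby(level=0).head(2)

# Position of each kept value within its row: 0 = largest, 1 = second largest
rank = top2.groupby(level=0).cumcount()

# Rebuild a wide frame whose cells are column names instead of values
result = pd.DataFrame({
    'row': top2.index.get_level_values(0),
    'rank': rank.to_numpy(),
    'col': top2.index.get_level_values(1),
}).pivot(index='row', columns='rank', values='col')

print(result)
```

With the sample data this should give `col6, col5` for row `a`, `col4, col3` for row `b` and `col4, col7` for row `c`. A row with fewer than two non-NaN values would end up with NaN in the second column.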
Eraseri
  • Does this answer your question? [Find the column names which have top 3 largest values for each row](https://stackoverflow.com/questions/37494844/find-the-column-names-which-have-top-3-largest-values-for-each-row) – Paul Brennan Feb 02 '21 at 07:56
  • I'm afraid it doesn't. The problem is the NaN values: argsort doesn't handle them the way I need. – Eraseri Feb 02 '21 at 09:16
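
Regarding the NaN issue with argsort mentioned in the comment above: one possible workaround, sketched here under the assumption that every row has at least two non-NaN values, is to replace NaN with negative infinity before sorting so that missing entries always rank last. The variable names and the `a`, `b`, `c` index are illustrative and not part of the original code.

```python
import numpy as np
import pandas as pd

d = {'col1': [1, 2, np.nan], 'col2': [2, 3, 3], 'col3': [3, 6, 5],
     'col4': [4, 9, 10], 'col5': [5, 1, np.nan], 'col6': [7, np.nan, 2],
     'col7': [np.nan, 5, 6]}
# Assumption: index labels taken from the example table
df = pd.DataFrame(data=d, index=['a', 'b', 'c'])

values = df.to_numpy(dtype=float)

# Replace NaN with -inf so missing values can never be among the largest
filled = np.where(np.isnan(values), -np.inf, values)

# argsort on the negated array orders each row from largest to smallest;
# keep the positions of the first two columns per row
order = np.argsort(-filled, axis=1)[:, :2]

# Map column positions back to column names
result = pd.DataFrame(df.columns.to_numpy()[order], index=df.index)

print(result)
```

If a row had fewer than two non-NaN values, this sketch would still return a column name for the missing slot (one whose value is NaN), so such rows would need extra masking.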

0 Answers