I want to pick two biggest values on each row, sort them and get column names as values. Other values are dropped from dataframe.
import pandas as pd
d = {'col1': [1, 2, np.nan], 'col2': [2,3,3], 'col3': [3,6,5], 'col4': [4,9,10], 'col5': [5,1, np.nan], 'col6': [7,np.nan,2], 'col7': [np.nan, 5,6]}
df = pd.DataFrame(data=d)
I'm now able to get two biggest values of each row but reshaping dataframe based on column values is another task. Code below leaves rest of the values as Nan which is fine. But how to reshape and get column names?
lasttwo = df.stack().sort_values(ascending=True).groupby(level=0).tail(2).unstack()
Example:
| --- | col1 | col2 | col3 | col4 | col5 | col6 | col7 |
|---|---|---|---|---|---|---|---|
| a | 1 | 2 | 3 | 4 | 5 | 7 | Nan |
| b | 2 | 3 | 6 | 9 | 1 | Nan | 5 |
| c | Nan | 3 | 5 | 10 | Nan | 2 | 6 |
Intended result:
| --- | --- | --- |
|---|---|---|
| a | col6 | col5 |
| b | col4 | col3 |
| c | col4 | col7 |