How do I convert a Pandas dataframe into a NumPy array?
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        'A': [np.nan, np.nan, np.nan, 0.1, 0.1, 0.1, 0.1],
        'B': [0.2, np.nan, 0.2, 0.2, 0.2, np.nan, np.nan],
        'C': [np.nan, 0.5, 0.5, np.nan, 0.5, 0.5, np.nan],
    },
    index=[1, 2, 3, 4, 5, 6, 7],
).rename_axis('ID')
That gives this DataFrame:
      A    B    C
ID
1   NaN  0.2  NaN
2   NaN  NaN  0.5
3   NaN  0.2  0.5
4   0.1  0.2  NaN
5   0.1  0.2  0.5
6   0.1  NaN  0.5
7   0.1  NaN  NaN
I would like to convert this to a NumPy array, like so:
array([[ nan,  0.2,  nan],
       [ nan,  nan,  0.5],
       [ nan,  0.2,  0.5],
       [ 0.1,  0.2,  nan],
       [ 0.1,  0.2,  0.5],
       [ 0.1,  nan,  0.5],
       [ 0.1,  nan,  nan]])
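For reference, here is a minimal sketch of one way to get an array like the one above. It assumes pandas 0.24 or later, where `DataFrame.to_numpy` is available; the name `arr` is only illustrative.

```python
import numpy as np
import pandas as pd

# Same example frame as in the question.
df = pd.DataFrame(
    {
        'A': [np.nan, np.nan, np.nan, 0.1, 0.1, 0.1, 0.1],
        'B': [0.2, np.nan, 0.2, 0.2, 0.2, np.nan, np.nan],
        'C': [np.nan, 0.5, 0.5, np.nan, 0.5, 0.5, np.nan],
    },
    index=[1, 2, 3, 4, 5, 6, 7],
).rename_axis('ID')

# to_numpy() returns only the column values; the 'ID' index is not included.
arr = df.to_numpy()

print(arr.shape)  # (7, 3)
print(arr.dtype)  # float64
```

Because all three columns are floats, the result is a plain two-dimensional float array. The index labels are dropped, so this covers only the first of the two desired outputs.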
Also, is it possible to preserve the dtypes, like this?
array([[ 1,  nan,  0.2,  nan],
       [ 2,  nan,  nan,  0.5],
       [ 3,  nan,  0.2,  0.5],
       [ 4,  0.1,  0.2,  nan],
       [ 5,  0.1,  0.2,  0.5],
       [ 6,  0.1,  nan,  0.5],
       [ 7,  0.1,  nan,  nan]],
      dtype=[('ID', '