
Dropping NaNs From Selected Data In Pandas

Continuing on my previous question link (things are explained there), I now have obtained an array. However, I don't know how to use this array, but that is a further question. The

Solution 1:

Try

import pandas as pd

df = pd.read_csv("~/Truncated raw data hcl.csv")

data1 = df.iloc[:, [0, 1]]
cleaned_data = data1.dropna()

You were probably getting an exception like AttributeError: 'list' object has no attribute 'dropna'. That's because your data1 was not a pandas DataFrame but a list, and inside that list was a DataFrame.
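To make the difference concrete, here is a minimal sketch of the failing and the working call. The small DataFrame and its column names are made up for illustration and only stand in for the CSV from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the CSV from the question.
df = pd.DataFrame({
    "time": [0.0, 0.5, 1.0, 1.5],
    "trial 1": [23.2, 23.2, np.nan, np.nan],
    "trial 2": [23.1, 23.1, 23.1, np.nan],
})

# The outer brackets build a list that holds one DataFrame.
wrapped = [df.iloc[:, [0, 1]]]
try:
    wrapped.dropna()
    error_message = None
except AttributeError as exc:
    # A plain Python list has no dropna method.
    error_message = str(exc)

# Without the outer brackets the selection is a DataFrame, so dropna works.
data1 = df.iloc[:, [0, 1]]
cleaned_data = data1.dropna()

print(error_message)
print(cleaned_data)
```

If you already have such a list, wrapped[0] gives you the DataFrame back.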

Solution 2:

The answer has already been given, but I would like to add some thoughts.

Importing your DataFrame, using the example dataset from your earlier post:

>>> import pandas as pd
>>> df = pd.read_csv("so.csv")
>>> df
    time  1mnaoh trial 1  1mnaoh trial 2  1mnaoh trial 3       ...        5mnaoh trial 1  5mnaoh trial 2  5mnaoh trial 3  5mnaoh trial 4
0    0.0            23.2            23.1            23.1       ...                  23.3            24.3            24.1            24.1
1    0.5            23.2            23.1            23.1       ...                  23.4            24.3            24.1            24.1
2    1.0            23.2            23.1            23.1       ...                  23.5            24.3            24.1            24.1
3    1.5            23.2            23.1            23.1       ...                  23.6            24.3            24.1            24.1
4    2.0            23.3            23.2            23.2       ...                  23.7            24.5            24.7            25.1
5    2.5            24.0            23.5            23.5       ...                  23.8            27.2            26.7            28.1
6    3.0            25.4            24.4            24.1       ...                  23.9            31.4            29.8            31.3
7    3.5            26.9            25.5            25.1       ...                  23.9            35.1            33.2            34.4
8    4.0            27.8            26.5            26.2       ...                  24.0            37.7            35.9            36.8
9    4.5            28.5            27.3            27.0       ...                  24.0            39.7            38.0            38.7
10   5.0            28.9            27.9            27.7       ...                  24.0            40.9            39.6            40.2
11   5.5            29.2            28.2            28.3       ...                  24.0            41.9            40.7            41.0
12   6.0            29.4            28.5            28.6       ...                  24.1            42.5            41.6            41.2
13   6.5            29.5            28.8            28.9       ...                  24.1            43.1            42.3            41.7
14   7.0            29.6            29.0            29.1       ...                  24.1            43.4            42.8            42.3
15   7.5            29.7            29.2            29.2       ...                  24.0            43.7            43.1            42.9
16   8.0            29.8            29.3            29.3       ...                  24.2            43.8            43.3            43.3
17   8.5            29.8            29.4            29.4       ...                  27.0            43.9            43.5            43.6
18   9.0            29.9            29.5            29.5       ...                  30.8            44.0            43.6            43.8
19   9.5            29.9            29.6            29.5       ...                  33.9            44.0            43.7            44.0
20  10.0            30.0            29.7            29.6       ...                  36.2            44.0            43.7            44.1
21  10.5            30.0            29.7            29.6       ...                  37.9            44.0            43.8            44.2
22  11.0            30.0            29.7            29.6       ...                  39.3             NaN            43.8            44.3
23  11.5            30.0            29.8            29.7       ...                  40.2             NaN            43.8            44.3
24  12.0            30.0            29.8            29.7       ...                  40.9             NaN            43.9            44.3
25  12.5            30.1            29.8            29.7       ...                  41.4             NaN            43.9            44.3
26  13.0            30.1            29.8            29.8       ...                  41.8             NaN            43.9            44.4
27  13.5            30.1            29.9            29.8       ...                  42.0             NaN            43.9            44.4
28  14.0            30.1            29.9            29.8       ...                  42.1             NaN             NaN            44.4
29  14.5             NaN            29.9            29.8       ...                  42.3             NaN             NaN            44.4
30  15.0             NaN            29.9             NaN       ...                  42.4             NaN             NaN             NaN
31  15.5             NaN             NaN             NaN       ...                  42.4             NaN             NaN             NaN

However, it is good to clean the data first and then process it as you need, so dropping the NaN values right at import is useful. Note that dropna() with no arguments removes every row that has a NaN in any column, which is why only rows 0 to 21 remain below.

>>> df = pd.read_csv("so.csv").dropna()  # drop the NaN rows right at import
>>> df
    time  1mnaoh trial 1  1mnaoh trial 2  1mnaoh trial 3       ...        5mnaoh trial 1  5mnaoh trial 2  5mnaoh trial 3  5mnaoh trial 4
0    0.0            23.2            23.1            23.1       ...                  23.3            24.3            24.1            24.1
1    0.5            23.2            23.1            23.1       ...                  23.4            24.3            24.1            24.1
2    1.0            23.2            23.1            23.1       ...                  23.5            24.3            24.1            24.1
3    1.5            23.2            23.1            23.1       ...                  23.6            24.3            24.1            24.1
4    2.0            23.3            23.2            23.2       ...                  23.7            24.5            24.7            25.1
5    2.5            24.0            23.5            23.5       ...                  23.8            27.2            26.7            28.1
6    3.0            25.4            24.4            24.1       ...                  23.9            31.4            29.8            31.3
7    3.5            26.9            25.5            25.1       ...                  23.9            35.1            33.2            34.4
8    4.0            27.8            26.5            26.2       ...                  24.0            37.7            35.9            36.8
9    4.5            28.5            27.3            27.0       ...                  24.0            39.7            38.0            38.7
10   5.0            28.9            27.9            27.7       ...                  24.0            40.9            39.6            40.2
11   5.5            29.2            28.2            28.3       ...                  24.0            41.9            40.7            41.0
12   6.0            29.4            28.5            28.6       ...                  24.1            42.5            41.6            41.2
13   6.5            29.5            28.8            28.9       ...                  24.1            43.1            42.3            41.7
14   7.0            29.6            29.0            29.1       ...                  24.1            43.4            42.8            42.3
15   7.5            29.7            29.2            29.2       ...                  24.0            43.7            43.1            42.9
16   8.0            29.8            29.3            29.3       ...                  24.2            43.8            43.3            43.3
17   8.5            29.8            29.4            29.4       ...                  27.0            43.9            43.5            43.6
18   9.0            29.9            29.5            29.5       ...                  30.8            44.0            43.6            43.8
19   9.5            29.9            29.6            29.5       ...                  33.9            44.0            43.7            44.0
20  10.0            30.0            29.7            29.6       ...                  36.2            44.0            43.7            44.1
21  10.5            30.0            29.7            29.6       ...                  37.9            44.0            43.8            44.2
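A bare dropna() looks at every column, so a gap in a column you do not care about still removes the row. The sketch below contrasts that with restricting the check to the columns of interest. The inline CSV text and its column names are invented for illustration and are not the data from the question.

```python
import io
import pandas as pd

# Hypothetical miniature of the CSV: later trials run out of readings sooner.
csv_text = """time,trial 1,trial 2
0.0,23.2,24.3
0.5,23.2,
1.0,23.3,
1.5,,
"""
df = pd.read_csv(io.StringIO(csv_text))

# Keeps only rows that are complete in every column.
all_clean = df.dropna()

# Keeps all columns, but only checks "trial 1" for missing values.
subset_clean = df.dropna(subset=["trial 1"])

# Selects two columns first, then drops rows missing in either of them.
pair_clean = df[["time", "trial 1"]].dropna()

print(len(all_clean), len(subset_clean), len(pair_clean))
```

Which variant fits depends on whether the other columns are needed later. If they are not, selecting first keeps the most rows.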

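Finally, select the columns you need. Note that the outer square brackets below wrap the selection in a list, so the result is a list containing one DataFrame, as the output shows: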

>>> df = [df.iloc[:, [0, 1]]]
# new_df = [df.iloc[:, [0, 1]]]  # use a new name if you don't want to overwrite the original DataFrame
>>> df
[    time  1mnaoh trial 1
0    0.0            23.2
1    0.5            23.2
2    1.0            23.2
3    1.5            23.2
4    2.0            23.3
5    2.5            24.0
6    3.0            25.4
7    3.5            26.9
8    4.0            27.8
9    4.5            28.5
10   5.0            28.9
11   5.5            29.2
12   6.0            29.4
13   6.5            29.5
14   7.0            29.6
15   7.5            29.7
16   8.0            29.8
17   8.5            29.8
18   9.0            29.9
19   9.5            29.9
20  10.0            30.0
21  10.5            30.0]

Better Answer:

Looking at the end result, you only care about two columns, 'time' and '1mnaoh trial 1'. The ideal approach is therefore to use the usecols option of read_csv, which reduces the memory footprint because only the columns you need are loaded, and then call dropna(), which I believe gives you what you wanted. Because dropna() now sees only these two columns, rows 22 to 28 are kept as well.

>>> df = pd.read_csv("so.csv", usecols=['time', '1mnaoh trial 1']).dropna()
>>> df
    time  1mnaoh trial 1
0    0.0            23.2
1    0.5            23.2
2    1.0            23.2
3    1.5            23.2
4    2.0            23.3
5    2.5            24.0
6    3.0            25.4
7    3.5            26.9
8    4.0            27.8
9    4.5            28.5
10   5.0            28.9
11   5.5            29.2
12   6.0            29.4
13   6.5            29.5
14   7.0            29.6
15   7.5            29.7
16   8.0            29.8
17   8.5            29.8
18   9.0            29.9
19   9.5            29.9
20  10.0            30.0
21  10.5            30.0
22  11.0            30.0
23  11.5            30.0
24  12.0            30.0
25  12.5            30.1
26  13.0            30.1
27  13.5            30.1
28  14.0            30.1
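The same usecols pattern can be tried without the original file. This is a small sketch that reads from an in-memory string; the CSV text and the column names are assumptions made up for the example.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for "so.csv".
csv_text = """time,trial 1,trial 2
0.0,23.2,24.3
0.5,23.2,
1.0,23.3,
1.5,,
"""

# Only the two named columns are parsed; "trial 2" is never loaded,
# so its missing values cannot cause rows to be dropped.
df = pd.read_csv(io.StringIO(csv_text), usecols=["time", "trial 1"]).dropna()

print(df)
```

The original row labels are kept after dropna(). Call reset_index(drop=True) afterwards if you prefer a fresh 0-based index.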
