
Pandas Selecting Data


Displaying Single Column

Selecting a single column yields a Series. The attribute form car_details.Make is equivalent to car_details["Make"].

print(car_details.Make)

Output:

0    Toyota
1    Toyota
2    Nissan
3     Honda
4    Toyota
Name: Make, dtype: object
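The two access styles can be compared directly. The sketch below rebuilds a two-column subset of car_details (an assumption made only to keep the snippet self-contained) and shows that a column name containing spaces or parentheses, such as "Odometer (KM)", cannot be written with dot notation, so the bracket form is needed.

```python
import pandas as pd

# Two-column subset of car_details, rebuilt here so the snippet runs on its own.
car_details = pd.DataFrame({
    "Make": ["Toyota", "Toyota", "Nissan", "Honda", "Toyota"],
    "Odometer (KM)": [150043, 32549, 213095, 45698, 60000],
})

# For a simple column name, both forms return the same Series.
same_result = car_details.Make.equals(car_details["Make"])

# "Odometer (KM)" is not a valid Python identifier, so use brackets.
odometer = car_details["Odometer (KM)"]

# A single label gives a Series; a list of labels gives a DataFrame.
as_series = car_details["Make"]
as_frame = car_details[["Make"]]

print(same_result)
print(type(as_series).__name__, type(as_frame).__name__)
```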

Displaying Multiple Columns

print(car_details[["Make","Colour","Price"]])

Output:

     Make Colour      Price
0  Toyota  White  $4,000.00
1  Toyota   Blue  $7,000.00
2  Nissan  White  $3,500.00
3   Honda   Blue  $7,500.00
4  Toyota  White  $6,250.00

Passing a slice to [] selects rows.

print(car_details[0:3])

Output:

     Make Colour  Odometer (KM)  Doors      Price
0  Toyota  White         150043      4  $4,000.00
1  Toyota   Blue          32549      3  $7,000.00
2  Nissan  White         213095      4  $3,500.00
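As a rough check, on a DataFrame with the default 0-based index, an integer slice inside [] selects rows by position and excludes the stop value, so it should match .iloc with the same slice. The snippet rebuilds a smaller car_details (an assumption, to keep it self-contained).

```python
import pandas as pd

# Smaller copy of car_details with the default 0-4 index.
car_details = pd.DataFrame({
    "Make": ["Toyota", "Toyota", "Nissan", "Honda", "Toyota"],
    "Colour": ["White", "Blue", "White", "Blue", "White"],
})

first_three = car_details[0:3]          # rows at positions 0, 1 and 2
via_iloc = car_details.iloc[0:3]        # the same rows, selected explicitly by position

print(first_three.equals(via_iloc))
```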

When should you use .loc[] or .iloc[]?

1. loc is label-based, which means that you have to specify rows and columns based on their row and column labels.

2. iloc is integer position-based, so you have to specify rows and columns by their integer position values (0-based integer position).
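The difference is easiest to see when the row labels are integers that do not match the row positions. The following is a minimal sketch using a hypothetical prices Series (not part of the car_details data) with labels 10, 20 and 30.

```python
import pandas as pd

# Hypothetical data: integer labels that differ from the 0-based positions.
prices = pd.Series([4000, 7000, 3500], index=[10, 20, 30])

by_label = prices.loc[10]       # the row whose label is 10
by_position = prices.iloc[0]    # the first row, whatever its label is

# Position 0 exists, but there is no row labelled 0.
missing_label = False
try:
    prices.loc[0]
except KeyError:
    missing_label = True

print(by_label, by_position, missing_label)
```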

Example 1:

import pandas as pd
# Build the DataFrame with the default integer index first
car_details_default_index = pd.DataFrame({
    "index": pd.Series(["A", "B", "C", "D", "E"]),
    "Make": pd.Series(["Toyota", "Toyota", "Nissan", "Honda", "Toyota"]),
    "Colour": pd.Series(["White", "Blue", "White", "Blue", "White"]),
    "Odometer (KM)": pd.Series([150043, 32549, 213095, 45698, 60000]),
    "Doors": pd.Series([4, 3, 4, 4, 4]),
    "Price": pd.Series(["$4,000.00", "$7,000.00", "$3,500.00", "$7,500.00", "$6,250.00"]),
})
# Default index 0 - 4
print(car_details_default_index)

car_details_custom_index = car_details_default_index.set_index("index")

# Custom index added A - E
print(car_details_custom_index)

Output:

  index    Make Colour  Odometer (KM)  Doors      Price
0     A  Toyota  White         150043      4  $4,000.00
1     B  Toyota   Blue          32549      3  $7,000.00
2     C  Nissan  White         213095      4  $3,500.00
3     D   Honda   Blue          45698      4  $7,500.00
4     E  Toyota  White          60000      4  $6,250.00


         Make Colour  Odometer (KM)  Doors      Price
index
A      Toyota  White         150043      4  $4,000.00
B      Toyota   Blue          32549      3  $7,000.00
C      Nissan  White         213095      4  $3,500.00
D       Honda   Blue          45698      4  $7,500.00
E      Toyota  White          60000      4  $6,250.00

Let's try .loc to select a row by its index label (A to E):

print(car_details_custom_index.loc["A"])

Output:

Make                Toyota
Colour               White
Odometer (KM)       150043
Doors                    4
Price            $4,000.00
Name: A, dtype: object

Now try .iloc with the index label A:

print(car_details_custom_index.iloc["A"])

Output:

TypeError: Cannot index by location index with a non-integer key

You get an error because .iloc cannot index with a non-integer key. iloc is integer position-based, so you have to specify rows and columns by their 0-based integer positions, regardless of the index labels.

print(car_details_custom_index.iloc[0])

Output:

Make                Toyota
Colour               White
Odometer (KM)       150043
Doors                    4
Price            $4,000.00
Name: A, dtype: object
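When only a label is known but a position is needed, Index.get_loc can translate one into the other. This is a small sketch on a reduced, hypothetical three-row frame with the same A, B, C style labels; for a unique index, get_loc returns a plain integer.

```python
import pandas as pd

# Reduced stand-in for car_details_custom_index (three rows, two columns).
cars = pd.DataFrame(
    {"Make": ["Toyota", "Toyota", "Nissan"], "Doors": [4, 3, 4]},
    index=pd.Index(["A", "B", "C"], name="index"),
)

position = cars.index.get_loc("B")   # integer position of label "B"
row_by_position = cars.iloc[position]
row_by_label = cars.loc["B"]

print(position)
print(row_by_position.equals(row_by_label))
```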

Selecting on a multi-axis by label:

print(car_details_custom_index.loc[:,["Make","Colour"]])

Output:

         Make Colour
index
A      Toyota  White
B      Toyota   Blue
C      Nissan  White
D       Honda   Blue
E      Toyota  White

Label slicing, where both endpoints are included:

print(car_details_custom_index.loc["A":"C", ["Make","Colour"]])

Output:

         Make Colour
index
A      Toyota  White
B      Toyota   Blue
C      Nissan  White

Selecting on a multi-axis by position:

Using our car_details DataFrame with the default index:

print(car_details.iloc[:, 0:2])

Output:

     Make Colour
0  Toyota  White
1  Toyota   Blue
2  Nissan  White
3   Honda   Blue
4  Toyota  White

Position slicing, where the stop position is excluded (0:2 returns positions 0 and 1 only):

print(car_details.iloc[0:2, 0:2])

Output:

     Make Colour
0  Toyota  White
1  Toyota   Blue
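The endpoint behaviour of the two indexers can be checked side by side. In this sketch, a reduced one-column frame with labels A to E (an assumption for brevity) is sliced with .loc, which includes the stop label, and with .iloc, which excludes the stop position, as ordinary Python slices do.

```python
import pandas as pd

# Reduced stand-in for the car data, labelled A to E.
cars = pd.DataFrame(
    {"Make": ["Toyota", "Toyota", "Nissan", "Honda", "Toyota"]},
    index=["A", "B", "C", "D", "E"],
)

by_label = cars.loc["A":"C"]     # labels A, B and C: the stop label is included
by_position = cars.iloc[0:2]     # positions 0 and 1: the stop position is excluded

print(by_label.index.tolist())
print(by_position.index.tolist())
```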

Boolean indexing (Conditional)

Using a single column’s values to select data.

Displaying car details where Doors is greater than 3.

print(car_details[car_details["Doors"]>3])

Output:

     Make Colour  Odometer (KM)  Doors      Price
0  Toyota  White         150043      4  $4,000.00
2  Nissan  White         213095      4  $3,500.00
3   Honda   Blue          45698      4  $7,500.00
4  Toyota  White          60000      4  $6,250.00

Displaying car details where Doors is greater than 3 and Colour is White. Each condition must be wrapped in parentheses and combined with &, not the Python keyword and.

print(car_details[(car_details["Doors"]>3) & (car_details["Colour"]=="White")])

Output:

     Make Colour  Odometer (KM)  Doors      Price
0  Toyota  White         150043      4  $4,000.00
2  Nissan  White         213095      4  $3,500.00
4  Toyota  White          60000      4  $6,250.00
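A boolean mask can also be stored in a variable and passed to .loc, which allows filtering rows and picking columns in one step. The sketch below rebuilds car_details from the values shown in this chapter; the variable names mask and white_four_door are illustrative only.

```python
import pandas as pd

# car_details rebuilt from the values used throughout this chapter.
car_details = pd.DataFrame({
    "Make": ["Toyota", "Toyota", "Nissan", "Honda", "Toyota"],
    "Colour": ["White", "Blue", "White", "Blue", "White"],
    "Doors": [4, 3, 4, 4, 4],
    "Price": ["$4,000.00", "$7,000.00", "$3,500.00", "$7,500.00", "$6,250.00"],
})

# One boolean value per row; parentheses are required around each condition.
mask = (car_details["Doors"] > 3) & (car_details["Colour"] == "White")

# Rows where the mask is True, restricted to two columns.
white_four_door = car_details.loc[mask, ["Make", "Price"]]

print(white_four_door)
```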
