Operations

Data management operations

Pandas contains several helpful functions to manage and format numerical data (Table 1).

Table 1: Common data management functions for pandas columns.

Operation	Example	Description
Round	`df['VEI'].round(1)`	Rounds values to the specified number of decimals
Floor	`df['VEI'].apply(np.floor)`	Rounds values down to the nearest integer
Ceil	`df['VEI'].apply(np.ceil)`	Rounds values up to the nearest integer
Absolute value	`df['VEI'].abs()`	Returns the absolute value of each element
Clip	`df['VEI'].clip(lower=0, upper=5)`	Limits values to a specified range
Fill missing	`df['VEI'].fillna(0)`	Replaces missing values with a specified value
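
To see what each function in Table 1 returns, here is a minimal sketch applied to a small Series. The values are made up for illustration and are not taken from the course dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical VEI-like values, including a negative number and a missing value
vei = pd.Series([1.26, -0.5, 6.8, np.nan], name="VEI")

rounded = vei.round(1)               # keep one decimal
floored = np.floor(vei)              # round down to the nearest integer
ceiled = np.ceil(vei)                # round up to the nearest integer
absolute = vei.abs()                 # drop the sign
clipped = vei.clip(lower=0, upper=5) # values below 0 become 0, above 5 become 5
filled = vei.fillna(0)               # replace the missing value with 0

print(pd.DataFrame({
    "original": vei,
    "round": rounded,
    "floor": floored,
    "ceil": ceiled,
    "abs": absolute,
    "clip": clipped,
    "fillna": filled,
}))
```

Note that every function except .fillna leaves the missing value untouched: NaN stays NaN.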

Filling missing data

The .fillna example in Table 1 shows how to replace missing data, often referred to as NaN (Not a Number), with 0. However, Pandas provides several other strategies for filling missing values, such as forward or backward filling (.ffill, .bfill) and interpolation (.interpolate). Again, make a habit of checking the documentation of the functions you frequently use.
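
As a rough comparison of filling strategies, the sketch below applies four of them to the same Series. The values are hypothetical, and which strategy is appropriate depends on what the data represent.

```python
import numpy as np
import pandas as pd

# Hypothetical series with two gaps
s = pd.Series([1.0, np.nan, 3.0, np.nan, 5.0])

filled_zero = s.fillna(0)           # constant value
filled_mean = s.fillna(s.mean())    # mean of the non-missing values
filled_ffill = s.ffill()            # carry the previous valid value forward
filled_interp = s.interpolate()     # linear interpolation between neighbours

print(pd.DataFrame({
    "original": s,
    "zero": filled_zero,
    "mean": filled_mean,
    "ffill": filled_ffill,
    "interpolate": filled_interp,
}))
```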

Numeric operations

Let’s now see how we can manipulate and operate on data contained within our DataFrame. Table 2 lists arithmetic operators that can be applied to parts of the DataFrame, such as a single column. These are native Python arithmetic operators; the NumPy package provides additional element-wise functions (Table 3).

Listing 1 illustrates how to halve the VEI column and save the results to a new column.

Listing 1: Divide VEI by two and save the results to a new column.

df['VEI_halved'] = df['VEI'] / 2

Exercise

Longitudes are expressed as degrees E (i.e., from 0 to 180) and degrees W (i.e., from -180 to 0). Use operators to convert all longitudes to degrees E (i.e., from 0 to 360) and store the results in a column called Longitude_E. To do so:

Define a mask where longitudes are negative using logical operators
Where the mask is True (i.e., where the longitude is negative), add the longitude to 360 (which is the same as subtracting its absolute value from 360)

Define a mask

Start by defining a mask

How?

mask = df['Longitude'] < 0  # True where the longitude is negative (degrees W)

Select the values

Select the values using .loc and do the maths.

How?

360 + df.loc[mask, 'Longitude']

Store back the values

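df['Longitude_E'] = df['Longitude']  # start from the original values so non-negative longitudes are kept
df.loc[mask, 'Longitude_E'] = 360 + df.loc[mask, 'Longitude']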
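
Putting the three steps together, a minimal end-to-end sketch could look like the following. The DataFrame is a hypothetical stand-in for the course dataset, and the modulo-based column (Longitude_E_mod, a name introduced here for illustration) is an alternative approach rather than part of the exercise.

```python
import numpy as np
import pandas as pd

# Hypothetical longitudes, not taken from the course dataset
df = pd.DataFrame({"Longitude": [-155.3, 0.0, 15.0, -0.5, 179.9]})

# 1. Mask the negative longitudes (degrees W)
mask = df["Longitude"] < 0

# 2. Copy the original values, then 3. overwrite only the masked rows
df["Longitude_E"] = df["Longitude"]
df.loc[mask, "Longitude_E"] = 360 + df.loc[mask, "Longitude"]

# Alternative: the modulo operator wraps negative values into 0-360 in one step
df["Longitude_E_mod"] = df["Longitude"] % 360

print(df)
```

Both columns should agree. The mask-based version makes each step explicit, which helps when the same pattern is reused for other conditional updates.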

Table 2: Common arithmetic operations on numerical pandas columns.

Operation	Symbol	Example	Description
Addition	`+`	`df['VEI'] + 1`	Adds a value to each element
Subtraction	`-`	`df['VEI'] - 1`	Subtracts a value from each element
Multiplication	`*`	`df['VEI'] * 2`	Multiplies each element by a value
Division	`/`	`df['VEI'] / 2`	Divides each element by a value
Exponentiation	`**`	`df['VEI'] ** 2`	Raises each element to a power
Modulo	`%`	`df['VEI'] % 2`	Remainder after division for each element
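
The operators in Table 2 return a new Series, so the result can be assigned to a new column. Here is a short sketch on a hypothetical VEI column; the new column names are made up for illustration.

```python
import pandas as pd

# Hypothetical VEI values
df = pd.DataFrame({"VEI": [0, 2, 4, 5]})

df["VEI_plus_one"] = df["VEI"] + 1     # addition
df["VEI_halved"] = df["VEI"] / 2       # division always returns floats
df["VEI_squared"] = df["VEI"] ** 2     # exponentiation
df["VEI_is_odd"] = df["VEI"] % 2 == 1  # modulo combined with a comparison

print(df)
```

The last line shows that arithmetic and comparison operators can be chained to build a boolean mask.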

Table 3: Common NumPy operations on pandas columns or arrays.

Operation	Symbol	Example	Description
Exponentiation	`np.power`	`np.power(df['VEI'], 2)`	Element-wise exponentiation
Square root	`np.sqrt`	`np.sqrt(df['VEI'])`	Element-wise square root
Logarithm (base e)	`np.log`	`np.log(df['VEI'])`	Element-wise natural logarithm
Logarithm (base 10)	`np.log10`	`np.log10(df['VEI'])`	Element-wise base-10 logarithm
Exponential	`np.exp`	`np.exp(df['VEI'])`	Element-wise exponential (e^x)
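
NumPy functions can be called directly on a column and return a Series of the same length. The sketch below uses hypothetical VEI values and also shows one thing to watch for: the logarithm of 0 is minus infinity. Masking non-positive values first, as done here with .where, is one possible workaround and not the only one.

```python
import numpy as np
import pandas as pd

# Hypothetical VEI values, including a 0
vei = pd.Series([0, 1, 4, 6], name="VEI")

squared = np.power(vei, 2)   # same result as vei ** 2
root = np.sqrt(vei)          # element-wise square root

# log10(0) is -inf; silence NumPy's divide-by-zero warning for this call
with np.errstate(divide="ignore"):
    log_vei = np.log10(vei)

# Replace non-positive values with NaN before taking the logarithm
log_safe = np.log10(vei.where(vei > 0))

print(pd.DataFrame({"VEI": vei, "sqrt": root, "log10": log_vei, "log10_safe": log_safe}))
```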

String operations

Similarly, Table 4 illustrates Pandas’s string-based operators.

Table 4: Common string operations on pandas columns.

Operation	Example	Description
Concatenation	`df['Country'] + ' volcano'`	Adds a string to each element
String length	`df['Country'].str.len()`	Returns the length of each string
Uppercase	`df['Country'].str.upper()`	Converts each string to uppercase
Lowercase	`df['Country'].str.lower()`	Converts each string to lowercase
Replace	`df['Country'].str.replace('USA', 'US')`	Replaces substrings in each string
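
As with numeric operations, string operations return a new Series that can be stored in a new column. The sketch below uses a hypothetical Country column, and the new column names are made up for illustration. Passing regex=False to .str.replace is a choice made here so that the pattern is treated as plain text regardless of the Pandas version.

```python
import pandas as pd

# Hypothetical country names
df = pd.DataFrame({"Country": ["USA", "Italy", "Japan"]})

df["Label"] = df["Country"] + " volcano"         # concatenation
df["Name_length"] = df["Country"].str.len()      # number of characters
df["Country_upper"] = df["Country"].str.upper()  # uppercase
df["Country_lower"] = df["Country"].str.lower()  # lowercase
df["Country_short"] = df["Country"].str.replace("USA", "US", regex=False)

print(df)
```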