Applying a function to all rows of a column with pandas
A common operation on a pandas dataframe is to run a function on each entry (row) of a column. The example below shows how to do this using a simple custom function.
First we’ll put together a dataframe - remember that you could read your own in from file using, for example, pandas.read_csv(). Incidentally, we’ll need to import pandas and, just for this example, numpy too:
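As a minimal sketch of this step, something like the following would do. The column names "A" and "B" and their values are made up for illustration and are not from any particular dataset:

```python
import numpy as np
import pandas as pd

# Hypothetical example data: the column names and values are illustrative only
df = pd.DataFrame({
    "A": np.arange(1, 6),        # 1, 2, 3, 4, 5
    "B": np.arange(10, 60, 10),  # 10, 20, 30, 40, 50
})

print(df)
```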
Now let’s write the function that we want to apply to each entry of a specified column:
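Any function that takes a single value and returns a single value will work here. The one below, double_plus_one, is a hypothetical stand-in chosen only because its output is easy to check by hand:

```python
def double_plus_one(x):
    """Return 2*x + 1 for a single value (hypothetical example function)."""
    return 2 * x + 1


print(double_plus_one(3))
```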
The final step is to apply the function to a specific column. Remember that the result is not saved automatically: to keep the changes, you’ll need to assign it back to the dataframe (i.e. column name = whatever…). Rather than writing a loop that goes through each row, the function pandas.DataFrame.apply() will do all of the work for us:
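Here is a small self-contained sketch of that step, again using made-up data and the hypothetical double_plus_one function. Note that selecting a single column gives a Series, so the call made here is Series.apply(); DataFrame.apply() with axis=1 is the row-wise alternative when the function needs several columns at once:

```python
import pandas as pd


def double_plus_one(x):
    """Hypothetical example function: return 2*x + 1 for a single value."""
    return 2 * x + 1


# Hypothetical example data
df = pd.DataFrame({"A": [1, 2, 3], "B": [10, 20, 30]})

# Apply the function to every entry of column "A" and assign the result back,
# otherwise the dataframe is left unchanged
df["A"] = df["A"].apply(double_plus_one)

# The same pattern with an inline lambda instead of a named function
df["B"] = df["B"].apply(lambda x: x / 10)

print(df)
```

Without the assignment on the left-hand side, apply() just returns a new Series and df keeps its original values.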
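If you want to apply it to every entry of all columns, you can use the function applymap() (from pandas 2.1 onwards this is deprecated in favour of DataFrame.map(), which does the same job):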
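A short sketch of the element-wise version, using made-up data. The version check is an assumption about your environment: it picks DataFrame.map() where it exists (pandas 2.1 and later) and falls back to applymap() on older installs:

```python
import pandas as pd

# Hypothetical example data
df = pd.DataFrame({"A": [1, 2, 3], "B": [10, 20, 30]})

# Use DataFrame.map on newer pandas, applymap on older versions
elementwise = df.map if hasattr(pd.DataFrame, "map") else df.applymap

# Square every entry in every column; the result is a new dataframe
squared = elementwise(lambda x: x ** 2)

print(squared)
```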
To read more about the lambda function, have a read here.