Split a text column into two columns in Pandas DataFrame

If you want to split a text column in a pandas DataFrame into two separate columns, you can use the str.split method. This is especially useful when dealing with string columns that have a consistent delimiter.

Here's how to do it:

  • Using a Delimiter:

Let's say you have a DataFrame with a column "Name" containing full names, and you want to split it into "First_Name" and "Last_Name".

    import pandas as pd

    # Sample DataFrame
    df = pd.DataFrame({
        'Name': ['John Doe', 'Jane Smith', 'Alice Johnson']
    })

    # Split the 'Name' column into two separate columns
    df[['First_Name', 'Last_Name']] = df['Name'].str.split(' ', expand=True)

    print(df)

The expand=True argument ensures the result is a DataFrame with multiple columns, which can then be assigned back to the original DataFrame.
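
The example above relies on every name containing exactly one space. When a value contains more spaces, the split yields more than two columns and the two-column assignment no longer lines up. A minimal sketch of one way to handle this, assuming you want everything after the first space to count as the last name, is to cap the number of splits with the n parameter. The sample names here are made up for illustration:

```python
import pandas as pd

# Hypothetical sample data: one name with two spaces, one with none
df = pd.DataFrame({'Name': ['John Doe', 'Mary Ann Smith', 'Cher']})

# n=1 splits only at the first space, so at most two columns come back
parts = df['Name'].str.split(' ', n=1, expand=True)

df['First_Name'] = parts[0]
df['Last_Name'] = parts[1]  # missing where the name has no space

print(df)
```

Rows without a space end up with a missing value in Last_Name, which you can fill or filter afterwards as your data requires.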

  • Using a Fixed Position:

In cases where you want to split strings at a fixed position rather than a delimiter, you can use string slicing.

For example, suppose you have a column with 7-character codes, and you want to split it into two columns after the 3rd character.

    import pandas as pd

    df = pd.DataFrame({
        'Code': ['ABC1234', 'XYZ5678', 'PQR9012']
    })

    df['Part1'] = df['Code'].str[:3]  # Take the first 3 characters
    df['Part2'] = df['Code'].str[3:]  # Take everything after the first 3 characters

    print(df)

Remember that when splitting columns, it's essential to consider edge cases. The examples provided assume a consistent structure in the text column, so always verify the content of your data before applying such operations.
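
One way to do that verification, sketched here under the assumption that a valid name contains exactly one space, is to count the delimiter in each row and inspect the rows that do not match before splitting. The sample data and the bad_rows name are illustrative only:

```python
import pandas as pd

# Hypothetical sample data with a single-word name and a missing value
df = pd.DataFrame({'Name': ['John Doe', 'Jane Smith', 'Prince', None]})

# Count spaces per row; missing values get -1 so they are flagged too
space_count = df['Name'].str.count(' ').fillna(-1)

# Rows that would not split cleanly into exactly two parts
bad_rows = df[space_count != 1]

print(bad_rows)
```

The same idea applies to fixed-position splits: comparing df['Code'].str.len() against the expected length shows which codes are too short or too long to slice safely.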

