CSV (comma-separated values) is a plain-text file format widely used to store and transfer tabular data. CSV data can include both textual and numeric fields. Each line of the file holds one record, and the values within a record are separated by commas.
Pandas (short for "panel data") is a powerful data manipulation and processing library written for the Python programming language. It provides many advanced features for manipulating and processing data.
In this short article we will only examine how to write data from Pandas into the CSV file format. It is important to know that data in Pandas is stored in a DataFrame, which is a two-dimensional labeled data structure whose columns may have different data types. Once you have processed the data and have the resulting DataFrame, you can write it to CSV easily. Let's look at an example of how to write a DataFrame to CSV.
First, assume that we have the CSV file below, saved as "csvfile.csv", which we want to load into a DataFrame.
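To make the idea of a labeled, two-dimensional structure with differently typed columns concrete, here is a minimal sketch that builds a DataFrame by hand. The variable name "people" and the sample values are illustrative assumptions, not part of any file used in this article.

```python
import pandas as pd

# Build a small DataFrame by hand; each dictionary key becomes a column label.
# The name "people" and the values are made up for illustration.
people = pd.DataFrame({
    "Name": ["John", "Melissa"],
    "Age": [30, 25],
    "Gender": ["Male", "Female"],
})

# Each column keeps its own data type: Age holds integers, the others hold text.
print(people.dtypes)

# shape reports (number of rows, number of columns).
print(people.shape)
```

The row labels (0 and 1 here) are generated automatically and form the index of the DataFrame.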
Name,Age,Gender
John,30,Male
Melissa,25,Female
Alan,42,Male
Chelsey,40,Female
To read this CSV file into a Pandas DataFrame we will use the following Python code:
import pandas as pd
df = pd.read_csv('csvfile.csv')
print(df)
The first line in the code above imports the Python pandas module under the "pd" alias. The second line reads the data from the CSV file into the "df" DataFrame. Notice the simplicity of this line. The final line prints the DataFrame in an easy-to-read format. The output is below:
      Name  Age  Gender
0     John   30    Male
1  Melissa   25  Female
2     Alan   42    Male
3  Chelsey   40  Female
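If you want to try the read step without creating a file on disk first, the sketch below feeds the same sample data to read_csv from an in-memory string. Using io.StringIO here is an assumption made for convenience; the article itself reads from "csvfile.csv".

```python
import io

import pandas as pd

# The same sample data as in the article, held in a string instead of a file.
csv_text = """Name,Age,Gender
John,30,Male
Melissa,25,Female
Alan,42,Male
Chelsey,40,Female
"""

# read_csv accepts a file path or any file-like object, such as StringIO.
df = pd.read_csv(io.StringIO(csv_text))

print(df)

# Pandas infers the column types while reading: Age is parsed as integers.
print(df.dtypes)
```

The numbers in the leftmost column of the printed output are the row index that Pandas adds automatically; they are not part of the CSV file.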
Now let's add a line that does a quick transformation on the data we read from the CSV file and puts the resulting data into a different DataFrame.
import pandas as pd
data = pd.read_csv('csvfile.csv')
df = pd.DataFrame(data, columns=['Age','Gender'])
print(df)
As you can see, we added a call to the DataFrame constructor with the columns parameter, where we specified the fields that we want to keep in the resulting "df" DataFrame. The output of the code is shown below:
   Age  Gender
0   30    Male
1   25  Female
2   42    Male
3   40  Female
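The same column selection can also be written by indexing the DataFrame with a list of column labels, which is a common Pandas idiom. The sketch below is an alternative, not the approach used in this article; the in-memory sample data and the names "subset" and "via_constructor" are assumptions for illustration.

```python
import io

import pandas as pd

# A shortened copy of the sample data, read from a string for convenience.
data = pd.read_csv(io.StringIO(
    "Name,Age,Gender\n"
    "John,30,Male\n"
    "Melissa,25,Female\n"
))

# Select columns by passing a list of labels inside the square brackets.
subset = data[["Age", "Gender"]]

# The approach from the article, kept here for comparison.
via_constructor = pd.DataFrame(data, columns=["Age", "Gender"])

print(subset)
print(subset.equals(via_constructor))
```

Bracket selection raises a KeyError if a column name is misspelled, which can make typos easier to spot.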
Now to the final step. Let's write the contents of the "df" DataFrame to a CSV file. This is done by adding a call to the "to_csv" method, as shown below.
import pandas as pd
data = pd.read_csv('csvfile.csv')
df = pd.DataFrame(data, columns=['Age','Gender'])
print(df)
df.to_csv('out.csv', index=False)
The call to to_csv writes the data to the "out.csv" file. The index=False parameter tells Pandas not to write the row index (the row numbers) into the CSV. The resulting CSV file is shown below:
Age,Gender
30,Male
25,Female
42,Male
40,Female
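To see what index=False changes, the sketch below calls to_csv without a file path, in which case the method returns the CSV text as a string instead of writing a file. The two-row DataFrame and the variable names are assumptions for illustration; the sep parameter is shown only as one example of the other options to_csv accepts.

```python
import pandas as pd

# A small stand-in for the "df" DataFrame from the article.
df = pd.DataFrame({"Age": [30, 25], "Gender": ["Male", "Female"]})

# With no path given, to_csv returns the CSV text as a string.
with_index = df.to_csv()                      # default: row index is included
without_index = df.to_csv(index=False)        # row index is left out
semicolons = df.to_csv(index=False, sep=";")  # use a different separator

print(with_index)
print(without_index)
print(semicolons)
```

With the default settings the first column of the output holds the row numbers under an empty header, which is usually not wanted when the file is meant to be opened in a spreadsheet.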
As you can see, writing out CSV from Pandas is very easy. For a full reference of the to_csv method, refer to the Pandas documentation for DataFrame.to_csv.
CSV Quick Info | |
---|---|
Name | Comma Separated Values |
MIME Type | text/csv |
Opens with | Microsoft Excel, Apple Numbers, Google Sheets, Microsoft Office Online |