Open CSV File

Information, tips and instructions

CSV Format

CSV is short for comma-separated values. It is a text format in which each line holds a set of values delimited by commas.

CSV is a text-based file format, and which characters can be stored in a file depends on the specific CSV reader and writer implementations.

CSV is implementation specific, so files produced by one writer are not always read correctly by a different reader.

The most popular and generic CSV file format adheres to the following rules:

  • Values are separated by commas
  • Lines are separated by a line break: CR followed by LF (CRLF, ASCII codes 13 and 10)
  • Values may be enclosed in double quotes
  • Values containing line breaks (CRLF), double quotes or commas must be enclosed in double quotes for correct parsing
  • Double quotes should not appear in fields which are not enclosed in double quotes
  • If double quotes are used to enclose a field, each double quote inside the field must be escaped by a preceding double quote character
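
The quoting rules above can be sketched with the csv module from the Python standard library. This is a minimal illustration, not a reference implementation: the helper name to_csv and the sample rows are assumptions made for the example. With QUOTE_MINIMAL the writer encloses a value in double quotes only when it contains a comma, a double quote or a line break, and it doubles any double quote inside the value.

```python
import csv
import io


def to_csv(rows):
    """Serialize rows to CSV text, quoting only the values that need it."""
    buffer = io.StringIO(newline="")  # newline="" disables newline translation
    writer = csv.writer(buffer, quoting=csv.QUOTE_MINIMAL, lineterminator="\r\n")
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical sample data: one value holds a line break,
# another holds double quotes and a comma.
rows = [
    ["Name", "Job Title", "Address"],
    ["John Doe", "Designer", "Third Avenue\r\n325"],
    ["Edward Johns", "Tester", 'Pike "Street", 200'],
]

text = to_csv(rows)
print(text)
```

Only the two address values end up enclosed in double quotes; the plain values are written as they are.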

Below is an example of a CSV file with CRLF, commas and double quotes used inside values.

Name,Job Title,Address CRLF
John Doe,Designer,"Third Avenue CRLF
325" CRLF
Edward Johns,Tester,"Pike ""Street"", 200" CRLF

As you can see above, the "Third Avenue CRLF
325" value has a line break inside it. Because the value is enclosed in double quotes, most CSV parsers will interpret it correctly. The same goes for the "Pike ""Street"", 200" value. It is enclosed in double quotes, and every double quote inside the value is preceded by another double quote. It will be parsed as a single value, and the comma inside it will not be treated as a value separator.
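
To check this behavior, the example can be fed to a parser. The sketch below uses the csv module from the Python standard library; the names SAMPLE and parse_csv are assumptions made for the illustration. The sample is retyped inline with real CRLF characters and without spaces after the separating commas, because strict parsers treat such spaces as part of the value.

```python
import csv
import io

# The example file, with "\r\n" standing for each CRLF.
SAMPLE = (
    "Name,Job Title,Address\r\n"
    'John Doe,Designer,"Third Avenue\r\n325"\r\n'
    'Edward Johns,Tester,"Pike ""Street"", 200"\r\n'
)


def parse_csv(text):
    """Parse CSV text into a list of records (lists of strings)."""
    # newline="" disables newline translation, so line breaks
    # inside quoted values reach the parser unchanged.
    return list(csv.reader(io.StringIO(text, newline="")))


records = parse_csv(SAMPLE)
for record in records:
    print(record)
```

The parser returns three records of three values each. The line break stays inside the first address, and the doubled double quotes in the second address collapse to single ones.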

Technically, there are no limits on the size of a CSV file or on the number of lines or fields in it. In practice, limits can be imposed by specific writers, parsers, and available memory. If necessary, a CSV file can be split into several smaller ones.
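
One possible way to split a file is sketched below in Python. The function names split_csv and _dump and the parameter rows_per_part are hypothetical, and the sketch assumes the first record is a header that should be repeated in every part. It splits on parsed records rather than on raw lines, because a quoted value may itself contain a line break.

```python
import csv
import io


def _dump(header, rows):
    """Serialize a header and a group of rows back to CSV text."""
    out = io.StringIO(newline="")
    writer = csv.writer(out, lineterminator="\r\n")
    writer.writerow(header)
    writer.writerows(rows)
    return out.getvalue()


def split_csv(text, rows_per_part):
    """Split CSV text into parts holding at most rows_per_part data rows.

    Assumption: the first record is a header, repeated in every part.
    """
    if rows_per_part < 1:
        raise ValueError("rows_per_part must be at least 1")
    reader = csv.reader(io.StringIO(text, newline=""))
    header = next(reader, None)
    if header is None:  # empty input, nothing to split
        return []
    parts, chunk = [], []
    for row in reader:
        chunk.append(row)
        if len(chunk) == rows_per_part:
            parts.append(_dump(header, chunk))
            chunk = []
    if chunk:  # leftover rows form the last, shorter part
        parts.append(_dump(header, chunk))
    return parts


parts = split_csv("id,name\r\n1,a\r\n2,b\r\n3,c\r\n", rows_per_part=2)
for part in parts:
    print(part)
```

For a file too large to fit in memory, the same loop could read from and write to files opened with newline="" instead of in-memory strings.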

The first line of a CSV file is frequently reserved for the column names of the values below it. Having column names in the file makes it easier to understand what the file contains and how to process it correctly.
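
When a file starts with column names, a reader can use them to address values by name instead of by position. The sketch below shows this with csv.DictReader from the Python standard library; the sample data and the helper name read_records are assumptions made for the illustration.

```python
import csv
import io

# Hypothetical sample: the first line holds the column names.
SAMPLE = (
    "Name,Job Title,Address\r\n"
    "John Doe,Designer,Third Avenue 325\r\n"
)


def read_records(text):
    """Read CSV text whose first line is a header into a list of mappings."""
    reader = csv.DictReader(io.StringIO(text, newline=""))
    return list(reader)


records = read_records(SAMPLE)
print(records[0]["Job Title"])
```

Each record maps a column name to its value, so the processing code keeps working if the columns are later reordered.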

The formal specification of the CSV file format is available as RFC 4180. Many CSV readers and writers follow it, and it is a great place to start when writing code that works with CSV files.