CSV (comma-separated values) is a plain-text file format. It stores tabular data as rows and columns.

Each line of the file is a row, and each row holds column values separated by commas.

CSV file format

A CSV file contains columns of text data separated by a delimiter. The delimiter is usually a comma, but you can use other characters such as a semicolon or a hyphen.

The table spans multiple lines, and each line represents a row. The values in a row are separated by the delimiter.
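
As a rough illustration of how the delimiter decides where a line is split, here is a minimal sketch using Python's standard csv module. The sample strings and the parse_rows helper are made up for this example and are not part of any CSV specification.

```python
import csv
import io


def parse_rows(text, delimiter=","):
    """Parse CSV text and return its rows as lists of strings."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return list(reader)


# The same record written with three different delimiters (sample data).
comma_rows = parse_rows("1,Eric,5000\n")
semicolon_rows = parse_rows("1;Eric;5000\n", delimiter=";")
tab_rows = parse_rows("1\tEric\t5000\n", delimiter="\t")

# Each call yields a single row with the same three fields.
print(comma_rows)
print(semicolon_rows)
print(tab_rows)
```

The parser only needs to be told which delimiter the file uses. The data itself does not change.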

A CSV file involves the following concepts.

  • Delimiter. A delimiter is the character that separates values in a row. The most common delimiter is the comma (,). You can also use a semicolon (;), a tab (\t), or any other character.

  • Header

An optional header can be provided in the first line of a CSV file. It describes the content of each column, and its entries are separated by the delimiter. These entries are also called column names.

  • Table (rows and columns) data

CSV data takes the form of a table. Each row represents an entity, like a record in a database. The values in a row are called columns and are separated by the delimiter. Each column holds one property of the record.

  • Comments. CSV has no native or official support for comments. Whether you can include them depends on the parser you use.

For example, some parsers treat each line prefixed with # or // as a comment and skip it when reading the file.

Here is an example

Id, Name, Salary
# Comments
1, Eric, 5000
2, Mike, 3000
3, Wrik, 4000
  • The first line is the header. It contains the column names, separated by commas.
  • The second line starts with # and is a comment. Each remaining line is a row of data, with column values separated by commas.
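
The header and comment handling described above can be sketched in Python as follows. This is only one possible approach: the standard csv module has no comment support, so the hypothetical read_records helper filters out lines starting with # before parsing. It also assumes no quoted field spans multiple lines.

```python
import csv
import io

# Sample data from the example above; in practice this would be a file on disk.
SAMPLE = """Id, Name, Salary
# Comments
1, Eric, 5000
2, Mike, 3000
3, Wrik, 4000
"""


def read_records(text, comment_prefix="#"):
    """Return the data rows as dicts keyed by the header's column names."""
    # Drop comment lines first, because the csv module does not skip them.
    lines = [
        line
        for line in io.StringIO(text)
        if not line.lstrip().startswith(comment_prefix)
    ]
    # skipinitialspace=True ignores the space that follows each comma.
    reader = csv.DictReader(lines, skipinitialspace=True)
    return list(reader)


records = read_records(SAMPLE)
for record in records:
    print(record["Id"], record["Name"], record["Salary"])
```

csv.DictReader takes the column names from the first remaining line, so each record can be accessed by header name rather than by position.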

Quote characters: You need to enclose a field value in quotes if:

  • the value contains the delimiter, such as a comma or hyphen
  • the value has a space before or after it that must be kept

The enclosing character is usually a double quote ("). Some parsers also accept a single quote (').

Id, Name, Salary
# Comments
1, Eric, 5000
2, "Mike, Wranger", 3000
3, "Wrik, Rit", 4000

In this example, the value Wrik, Rit is enclosed in double quotes because it contains the comma delimiter. Mike, Wranger is enclosed in double quotes for the same reason.

Quote characters let the parser read such a value as a single field instead of splitting it at the delimiter.
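
To see the effect of quoting, here is a small sketch with Python's csv module. It uses the sample rows from above without the comment line. Because the sample puts a space after each comma, the sketch sets skipinitialspace=True. Without that option, Python's parser does not recognise a quote that follows a space as the start of a quoted field.

```python
import csv
import io

SAMPLE = '''Id, Name, Salary
1, Eric, 5000
2, "Mike, Wranger", 3000
3, "Wrik, Rit", 4000
'''

# skipinitialspace=True lets the parser see the quote that follows ", ".
rows = list(csv.reader(io.StringIO(SAMPLE), skipinitialspace=True))

# With default settings the quoted names are split at their inner commas.
naive_rows = list(csv.reader(io.StringIO(SAMPLE)))

for row in rows:
    print(row)
print(len(rows[2]), "fields vs", len(naive_rows[2]), "fields")
```

With the option set, each quoted name comes back as one field with its comma intact.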

Escape Characters:

If a field value contains the quote character itself, escape it by writing the quote character twice. The field must also be enclosed in quotes.

Id, Name, Salary
# Comments
1, Eric, 5000
2, "Mike ""Wranger", 3000
3, "Wrik, Rit", 4000

In this example, the field "Mike ""Wranger" holds the value Mike "Wranger. Inside the enclosing quotes there are two consecutive quotes: the first escapes the second, so the parser reads them as one literal quote.

Always double a quote character to escape it inside a value.
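
The doubling rule can be checked with a short Python sketch. It assumes the default dialect of the standard csv module, which encloses a field in double quotes and doubles any quote inside it. The variable names are illustrative only.

```python
import csv
import io

# A value that contains a double quote, as in the example above.
name = 'Mike "Wranger'

buffer = io.StringIO()
writer = csv.writer(buffer)  # the default dialect doubles embedded quotes
writer.writerow([2, name, 3000])
encoded = buffer.getvalue()
print(encoded)

# Reading the line back restores the original value.
decoded = next(csv.reader(io.StringIO(encoded)))
print(decoded[1])
```

Letting a CSV library write the file is usually safer than building lines by hand, because the library applies the quoting and escaping rules consistently.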

Empty values: An empty value can be written as nothing at all, or as one or more spaces, in place of the value.

Id, Name, Salary
1, Eric, 5000
2, , 6000

In the second record, the Name field is empty.
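
How an empty field is interpreted is up to the program reading the file. The following sketch, which assumes Python's csv module, converts empty or whitespace-only fields to None. That conversion is a choice made for this example, not a CSV rule.

```python
import csv
import io

SAMPLE = """Id, Name, Salary
1, Eric, 5000
2, , 6000
"""

reader = csv.DictReader(io.StringIO(SAMPLE), skipinitialspace=True)
records = []
for row in reader:
    # Treat empty or whitespace-only fields as missing (None).
    records.append({key: (value.strip() or None) for key, value in row.items()})

for record in records:
    print(record)
```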

Encoding:

Encoding determines how characters, including those of other languages and international scripts, are represented as bytes. A parser must use the same encoding the file was written in to read its characters correctly.

The recommended character encoding is UTF-8, which supports virtually all characters. Other encodings in use include ISO-8859-1 and ASCII.
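
As a minimal sketch of stating the encoding explicitly, the Python example below writes a small CSV file as UTF-8 and reads it back. The file name people.csv, the temporary directory, and the sample names are assumptions made only to keep the example self-contained.

```python
import csv
import os
import tempfile

rows = [["Id", "Name"], ["1", "José"], ["2", "Zoë"]]

# A temporary path, used only so the example can run anywhere.
path = os.path.join(tempfile.mkdtemp(), "people.csv")

# newline="" is what the csv module documentation recommends for file objects.
with open(path, "w", encoding="utf-8", newline="") as handle:
    csv.writer(handle).writerows(rows)

# Read the file back with the same encoding it was written in.
with open(path, encoding="utf-8", newline="") as handle:
    loaded = list(csv.reader(handle))

print(loaded)
```

If the file is opened with a different encoding than the one it was written in, non-ASCII characters such as é may come back garbled or raise a decoding error.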