CSV formats are best used to represent sets or sequences of records in which each record has an identical list of fields. This corresponds to a single relation in a relational database, or to data (though not calculations) in a typical spreadsheet.
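As a minimal sketch of this record model, the following Python snippet writes and re-reads a small set of records using the standard `csv` module. The field names and values (`id`, `name`, `city`) are invented for illustration and are not taken from any particular data set.

```python
import csv
import io

# Hypothetical records: every record has the same three fields.
records = [
    {"id": "1", "name": "Ada", "city": "London"},
    {"id": "2", "name": "Linus", "city": "Helsinki"},
]

# Write the records to an in-memory CSV "file" with a header line.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["id", "name", "city"], lineterminator="\n")
writer.writeheader()
writer.writerows(records)
csv_text = buffer.getvalue()

# Reading the text back yields one dict per record, in file order.
round_trip = list(csv.DictReader(io.StringIO(csv_text)))
```

Because every record shares one field list, a single header line is enough to describe the whole file.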

Records in a CSV file are, by definition, in some order. Whether the recipient preserves or uses that order varies. Thus, CSV files can represent either unordered or ordered record sequences.

CSV formats are not limited to a particular character set. They work just as well with Unicode as with ASCII (although particular programs that support CSV may have their own limitations). CSV files will normally even survive naive translation from one character set to another (unlike nearly all proprietary data formats). CSV does not, however, provide any way to indicate which character set is in use, so that must be communicated separately or, if possible, determined at the receiving end.
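The following Python sketch illustrates why the character set has to be agreed on out of band. The sample names are invented, and the helper `read_csv_bytes` is a hypothetical function written for this example. The same bytes are parsed twice: once with the encoding they were written in, and once with a wrong guess.

```python
import csv
import io

rows = [["name", "city"], ["Zoë", "Kraków"]]

# Encode the CSV text as UTF-8 bytes, as it would be stored in a file.
text = io.StringIO()
csv.writer(text, lineterminator="\n").writerows(rows)
raw = text.getvalue().encode("utf-8")

def read_csv_bytes(data, encoding):
    """Decode bytes with the given encoding and parse the result as CSV."""
    return list(csv.reader(io.StringIO(data.decode(encoding), newline="")))

right = read_csv_bytes(raw, "utf-8")    # encoding communicated correctly
wrong = read_csv_bytes(raw, "latin-1")  # receiver guessed wrong
```

With the wrong guess the record and field structure is still intact, because the commas and line breaks are plain ASCII, but the non-ASCII text comes out garbled. Nothing in the file itself signals the mistake.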

Databases that include multiple relations cannot be exported as a single CSV file as described here. At best, additional notational conventions must be introduced, for example to identify and separate the different relations. Such notations are not difficult to design or implement, but there is no consensus on them and consequently very little portability.

Similarly, CSV cannot naturally represent hierarchical or object-oriented data, because every CSV record is expected to have the same structure. CSV is therefore rarely appropriate for documents (such as those created with HTML, XML, or other markup or word-processing technologies).

Statistical databases in various fields often have a generally relation-like structure, but with some groups of fields repeatable. For example, health databases such as the Demographic and Health Survey typically repeat some questions for each child of a given parent (perhaps up to a fixed maximum number of children). Statistical analysis systems often include utilities that can “rotate” such data: for example, a “parent” record that includes information about 5 children can be split into 5 separate records, each containing (a) the information on one child, and (b) a copy of all the non-child-specific information. CSV can represent either the “vertical” or “horizontal” form of such data.
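The rotation described above can be sketched in a few lines of Python. The column names (`child1_name`, `child1_age`, and so on), the sample values, and the limit of two children are assumptions made for this example; they are not the layout of any real survey.

```python
import csv
import io

# Hypothetical "horizontal" data: one record per parent, with repeated child fields.
horizontal = """parent_id,region,child1_name,child1_age,child2_name,child2_age
P1,North,Ana,4,Ben,2
P2,South,Cy,7,,
"""

MAX_CHILDREN = 2  # assumed fixed maximum number of child slots

def to_vertical(csv_text):
    """Split each parent record into one record per child that is present."""
    out = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        for i in range(1, MAX_CHILDREN + 1):
            name = row[f"child{i}_name"]
            if not name:  # empty slot: this parent has fewer children
                continue
            out.append({
                # Non-child-specific fields are copied into every child record.
                "parent_id": row["parent_id"],
                "region": row["region"],
                "child_name": name,
                "child_age": row[f"child{i}_age"],
            })
    return out

vertical = to_vertical(horizontal)
```

Here the two parent records become three child records, and each one repeats the parent's `parent_id` and `region`.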

In a relational database, similar issues are readily handled by creating a separate relation for each such group, and connecting “child” records to the related “parent” records using a foreign key (such as an ID number or name for the parent). In markup languages such as XML, such groups are typically enclosed in a container (for example, <child>), which is then repeated as necessary. With CSV there is no widely accepted single-file solution.
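One common workaround, shown here only as a sketch, is to mirror the relational approach with one CSV file per relation and a shared key column. In the Python example below, the two in-memory "files", the `parent_id` key, and all sample values are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Two hypothetical CSV files, one per relation.
parents_csv = """parent_id,region
P1,North
P2,South
"""
children_csv = """child_id,parent_id,name,age
C1,P1,Ana,4
C2,P1,Ben,2
C3,P2,Cy,7
"""

# Index parent records by their key.
parents = {row["parent_id"]: row for row in csv.DictReader(io.StringIO(parents_csv))}

# Group child names under the parent that each foreign key points to.
children_by_parent = defaultdict(list)
for child in csv.DictReader(io.StringIO(children_csv)):
    children_by_parent[child["parent_id"]].append(child["name"])

family = {pid: children_by_parent[pid] for pid in parents}
```

The link between the two files exists only by convention: nothing in either file declares that `parent_id` in the children file refers to the parents file, so the receiving program has to know this in advance.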

Copyright © CCJK Technologies Co., Ltd. 2000-2017. All rights reserved.