You have a data file that you need to convert to a Comma Separated Values (CSV) file.
Use awk to convert the data into CSV format:
$ awk 'BEGIN { FS="\t"; OFS="\",\"" } { gsub(/"/, "\"\""); $1 = $1; printf "\"%s\"\n", $0}' tab_delimited
"Line 1","Field 2","Field 3","Field 4","Field 5 with ""internal"" double-quotes"
"Line 2","Field 2","Field 3","Field 4","Field 5 with ""internal"" double-quotes"
"Line 3","Field 2","Field 3","Field 4","Field 5 with ""internal"" double-quotes"
"Line 4","Field 2","Field 3","Field 4","Field 5 with ""internal"" double-quotes"
You can also do the same thing in Perl:
$ perl -naF'\t' -e 'chomp @F; s/"/""/g for @F; print q(").join(q(","), @F).qq("\n);' tab_delimited
"Line 1","Field 2","Field 3","Field 4","Field 5 with ""internal"" double-quotes"
"Line 2","Field 2","Field 3","Field 4","Field 5 with ""internal"" double-quotes"
"Line 3","Field 2","Field 3","Field 4","Field 5 with ""internal"" double-quotes"
"Line 4","Field 2","Field 3","Field 4","Field 5 with ""internal"" double-quotes"
First of all, it’s tricky to define exactly what CSV really means. There is no single authoritative standard (RFC 4180 describes a common format, but it is informational only), and various vendors have implemented various versions.
Our version here is very simple, and should hopefully work just about anywhere.
We place double quotes around all fields (some implementations only quote strings, or strings with internal commas), and we double internal double quotes.
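For comparison, here is a rough sketch of the other style mentioned above, where a field is quoted only when it contains a comma or a double quote. The function name tsv_to_min_csv is made up for this illustration, and the sketch assumes tab-delimited input with no embedded newlines; it is not part of the recipe itself.

```shell
# Hypothetical helper: quote only the fields that need it.
tsv_to_min_csv() {
    awk 'BEGIN { FS = "\t"; OFS = "," }
    {
        for (i = 1; i <= NF; i++) {
            if ($i ~ /[",]/) {           # field holds a comma or a double quote
                gsub(/"/, "\"\"", $i)    # double any internal double quotes
                $i = "\"" $i "\""        # then wrap the whole field in quotes
            }
        }
        $1 = $1                          # force a rebuild so OFS is always applied
        print
    }' "$@"
}

# Sample record made up for this example
printf 'plain\thas,comma\tsays "hi"\n' | tsv_to_min_csv
```

For that sample record this prints plain,"has,comma","says ""hi""". Quoting everything, as our version does, is less compact but avoids having to decide which fields need it.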
To quote every field, as our version does, we have awk split up the input fields using a tab as the field separator (FS), and set the output field separator (OFS) to "," (a double quote, a comma, and another double quote).
We then globally replace each double quote with two double quotes, assign $1 to itself so that awk rebuilds the record using OFS, and print out the record with a leading and a trailing double quote.
We have to escape double quotes in several places, which looks a little cluttered, but otherwise this is very straightforward.
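The same awk program may be easier to follow when spread over several lines with a comment on each step. This is only a sketch: the wrapper function name tsv_to_csv and the sample record are assumptions made for the illustration, while the awk statements are the ones from the one-liner.

```shell
# Hypothetical wrapper around the awk program from the solution.
tsv_to_csv() {
    awk '
    BEGIN {
        FS  = "\t"       # split input records on tabs
        OFS = "\",\""    # join output fields with quote, comma, quote
    }
    {
        gsub(/"/, "\"\"")        # double every double quote in the record
        $1 = $1                  # assigning a field makes awk rebuild $0 with OFS
        printf "\"%s\"\n", $0    # add the leading and trailing double quotes
    }' "$@"
}

# Sample record made up for this example
printf 'Line 1\tField 2\tField 5 with "internal" double-quotes\n' | tsv_to_csv
```

For that sample record the output is "Line 1","Field 2","Field 5 with ""internal"" double-quotes". Because the function passes "$@" to awk, it reads the named files if any are given and standard input otherwise.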