Hi, first of all, thank you for your wonderful work on Datamancer, it's a really nice library. This is a glitch (at least for me) that I've found.
I'm testing on Datamancer 0.3.17 / Windows 10.
If a CSV field is surrounded by double quotes, I would expect that:
This is what happens with std/parsecsv, but apparently not with Datamancer's readCsv for values (column names are handled fine).
I've enclosed a very small snippet to show the concept for point 1. One obvious drawback is that digit-only columns (like "Two" and "Four" in the example) cannot be converted to float or int without first modifying the strings. Maybe this is the intended behaviour, but it would be nice to give the user the option to choose parsecsv-like behaviour. CSV files like this may appear exotic, but they are not: if someone exports a DB table to a CSV file and is not sure about the end user's regional settings or about multi-line fields (e.g. "Note"), enclosing every field in double quotes is a robust approach.
import std/[parsecsv,strutils], datamancer
var p: CsvParser
let content = """"One","Two","Three","Four"
"a1","2","a3","4"
"b10","20","b30","40"
"c100","200","c300","400"
"""
writeFile("temp.csv", content)
p.open("temp.csv")
#p.readHeaderRow()
while p.readRow():
  echo(join(p.row, ","))
p.close()
#[ Double quotes are consumed when using std/parsecsv; echo displays this:
One,Two,Three,Four
a1,2,a3,4
b10,20,b30,40
c100,200,c300,400
]#
let csvfile = "temp.csv"
let df = readCsv(csvfile, quote = '"') #no better result with '\"'
echo(df.pretty())
#[ Double quotes are not consumed as expected; df.pretty() shows this for all values, not only for digit-only strings:
DataFrame with 4 columns and 3 rows:
Idx One Two Three Four
dtype: string string string string
0 "a1" "2" "a3" "4"
1 "b10" "20" "b30" "40"
2 "c100" "200" "c300" "400"
]#
# FURTHERMORE, if a quoted field contains a newline (i.e. \n), parsing will fail
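
As a possible stopgap, a fully quoted file can be read with std/parsecsv and the digit-only columns converted by hand. The following is only a minimal sketch under that assumption, not Datamancer's API: `readQuotedCsv` is a hypothetical helper introduced here for illustration, and the "Note" column is made up to show a quoted field that spans two lines.

```nim
import std/[parsecsv, streams, strutils, sequtils, tables]

# Hypothetical helper (not part of Datamancer): parse CSV text with
# std/parsecsv and return the columns keyed by header name.
# The parser consumes the surrounding double quotes of every field.
proc readQuotedCsv(content: string): OrderedTable[string, seq[string]] =
  var p: CsvParser
  p.open(newStringStream(content), "in-memory.csv")
  defer: p.close()
  p.readHeaderRow()
  let headers = p.headers
  for h in headers:
    result[h] = @[]
  while p.readRow():
    for h in headers:
      result[h].add p.rowEntry(h)

# Every field is quoted; the last "Note" value contains a newline.
let content = """"One","Two","Note"
"a1","2","single line"
"b10","20","first line
second line"
"""

let cols = readQuotedCsv(content)
# Digit-only columns convert directly, no string clean-up needed.
let two = cols["Two"].mapIt(parseInt(it))
echo two
# The multi-line field arrives as one value with an embedded newline.
echo cols["Note"][1].splitLines()
```

The resulting columns could presumably then be passed on to Datamancer (for example via `toDf`, assuming it accepts plain `seq`s in the installed version), at the cost of bypassing readCsv entirely.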