
csv's Issues

What do you make out of this?

I tried installing your library and it didn't work... I ran composer self-update just in case, and I still got the error, which is kind of weird; it has never happened to me before... :)


C:\Zampps\www\building-blocks\databases-flat\LeagueCSV>composer require league/csv
Using version ~6.0 for league/csv
./composer.json has been created
Loading composer repositories with package information
Updating dependencies (including require-dev)

  • Installing league/csv (6.0.0)
    Loading from cache
    Failed to download league/csv from dist: RecursiveDirectoryIterator::__construct(C:\Zampps\www\building-blocks\databases-flat\LeagueCSV\ve
    -blocks\databases-flat\LeagueCSV\vendor/league/csv): The system cannot find the path specified. (code: 3)
    Now trying to download from source
  • Installing league/csv (6.0.0)
    Cloning 7c881c2
    [UnexpectedValueException]
    RecursiveDirectoryIterator::__construct(C:\Zampps\www\building-blocks\databases-flat\LeagueCSV\vendor\league\csv,C:\Zampps\www\buildng-blocks\databases-flat\LeagueCSV\vendor\league\csv): The system cannot find the path specified. (code: 3)

Filtering out the BOM

The docs say that we should rely on the extracting methods to remove the BOM character while the CSV is read, but that doesn't seem to be the case.

Even though the BOM can be detected, it is not automatically removed; instead it is included in the first column read.

I've put together an example here to illustrate what I'm doing:
http://runnable.com/VRYJNJ-c0wNVqTc6/parsing-bom-character-for-php

Am I missing something, or is it really up to the developer to manually remove it?
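For reference, a minimal workaround sketch (plain PHP, not the library's API): fetch the rows, then strip the BOM from the first cell manually.

use League\Csv\Reader;

$reader = Reader::createFromPath('file.csv');
$rows = $reader->fetchAll();

$bom = "\xEF\xBB\xBF"; // UTF-8 BOM
if (isset($rows[0][0]) && 0 === strpos($rows[0][0], $bom)) {
    // remove the BOM bytes from the first cell of the first row
    $rows[0][0] = substr($rows[0][0], strlen($bom));
}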

Should Exceptions be more detailed to make errors more catchable?

For example, someone uploaded a CSV with a couple of empty header fields at the end of the CSV, and the following exception was thrown:

        if (! $this->isValidAssocKeys($res)) {
            throw new InvalidArgumentException(
                'Use a flat non empty array with unique string values'
            );
        }

There's not much here to differentiate this from any other InvalidArgumentException besides the message, which I assume is not covered by BC guarantees?
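For illustration, a hedged sketch of what a more catchable exception could look like. InvalidHeaderException is not part of the library; it is a hypothetical subclass that would let client code catch this specific failure without string-matching the message.

// Hypothetical exception subclass -- NOT part of league/csv.
class InvalidHeaderException extends \InvalidArgumentException
{
    /** @var array the offending header keys */
    public $invalidKeys;

    public function __construct(array $invalidKeys)
    {
        $this->invalidKeys = $invalidKeys;
        parent::__construct('Use a flat non empty array with unique string values');
    }
}

// Client code could then target this failure precisely:
// try { $res = $reader->fetchAssoc($keys); } catch (InvalidHeaderException $e) { ... }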

UTF-8 support?

Hello,

As far as I can tell from the docs, UTF-8 should be supported without problems, but I am hitting some.

I also tried adding mb_internal_encoding("UTF-8"); at the top of the file (since this sometimes helps).

Basically, the code is really simple:

        $writer = \League\Csv\Writer::createFromFileObject(new SplTempFileObject());
        $writer->setNewline("\r\n");
        $writer->setEncodingFrom("UTF-8"); // this probably is not needed?
        $headers = ["Title", "Start date", "End date", "Time", "Address", "Place", "Organizer", "Text", "Contact person", "Email", "Phone"];
        $writer->insertOne($headers);

        foreach($items as $item)
        {
            $data = [
                strip_tags($item['title']),
                format_date($item['date_at']),
                format_date($item['expires_at']),
                $item['start_time'],
                $item['address'],
                $item['place'],
                $item['organizer'],
                strip_tags($item['body']),
                $item['user']['full_name'],
                $item['user']['email'],
                $item['user']['phone'],
            ];

            $writer->insertOne($data);
        }

        header('Content-Type: text/csv; charset=UTF-8');
        $filename = 'events-'.format_date(time()).'.csv';

        $writer->output($filename);

The strange thing is that when I open the file using Sublime I see all the characters correctly, but opening the same file in Excel 2014 gives me invalid characters. Any ideas?

Thank you!
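A likely culprit: Excel does not assume UTF-8 unless the file starts with a BOM. If you are on a library version with BOM support (7.0+, if I recall correctly), a hedged sketch:

// Prepend a UTF-8 BOM so Excel detects the encoding
// (setOutputBOM()/BOM_UTF8 assume league/csv 7.0+).
$writer->setOutputBOM(\League\Csv\Writer::BOM_UTF8);
$writer->output($filename);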

library doesn't support iconv stream filter

Hi, I just pulled the latest version of the library and tried to integrate it into my application, but it fails right at the beginning when I do

$stream = Reader::createFromPath('/home/nsitbon/export_members_20150217.csv');

if ($stream->isActiveStreamFilter()) { 
    $stream->appendStreamFilter('convert.iconv.UTF-16/UTF-8//TRANSLIT');
}

it throws the following exception:

[RuntimeException]
SplFileObject::__construct(): unable to create or locate filter "convert.iconv.UTF-16"

whereas this code works perfectly

$stream = fopen('/home/nsitbon/export_members_20150217.csv', 'r');
stream_filter_append($stream, 'convert.iconv.UTF-16/UTF-8//TRANSLIT', STREAM_FILTER_READ);

any ideas?

Can't parse tab-delimited file with no enclosure

I deal with tab delimited files with no enclosures on a regular basis. There seems to be no way to configure the Reader to parse these files. If I try to setEnclosure(null) or setEnclosure(""), it throws an exception.
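For reference, a sketch of the part that does work, assuming no field contains a quote or a tab: the delimiter can be set to a tab; only the enclosure must remain a single character.

use League\Csv\Reader;

$reader = Reader::createFromPath('data.tsv');
$reader->setDelimiter("\t"); // tabs are fine as delimiters
$rows = $reader->fetchAll(); // the default enclosure (") simply goes unused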

Add delimiter detection to Reader

Hello,

I've tried out your and many other CSV libraries in the hope of finding one that automatically detects the delimiter and sadly none of the libraries I've tried have had a go at this.

I find this odd since it seems like a really useful abstraction that I as the consumer of the library don't want to worry about. The delimiter is a ;? Bakame should have my back without me having to set this manually.

Just grabbing the first line and counting occurrences of the valid delimiters for a CSV file should cover 95% of the cases. OpenOffice lists tab, comma, semicolon and space as valid delimiters for CSV.

I'm guessing that if space is the delimiter then you have to use an enclosure for it to work.
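A rough sketch of the proposed heuristic, in plain PHP: count each candidate delimiter on the first line and pick the most frequent one. This is an illustration of the idea, not library code.

function guessDelimiter($firstLine, array $candidates = [",", ";", "\t", " "])
{
    $counts = [];
    foreach ($candidates as $candidate) {
        $counts[$candidate] = substr_count($firstLine, $candidate);
    }
    arsort($counts);

    return key($counts); // the most frequent candidate wins
}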

SplFileObject Flags have no effect / empty lines cannot be ignored

I just updated from 7.0.1 to 7.1.0 and noticed that my internal tests failed due to a null value appended at the end of each imported CSV file. After some trial and error I figured out that the SplFileObject flags no longer seem to have any effect.

I'm missing the SplFileObject::SKIP_EMPTY flag in particular since it seemed to make sure that there's no null value at the end if the file terminates on a new line.

See the following test script:

<?php
use League\Csv\Reader;

include __DIR__."/vendor/autoload.php";

$path = __DIR__."/tmp.txt";
$str = "1st\n2nd\n";
$obj = new SplFileObject($path,"w+");
$obj->fwrite($str);
$obj = new SplFileObject($path,"r");
$reader = Reader::createFromFileObject($obj);

$flags = [
    "NONE" => 0,
    "READ_AHEAD" => SplFileObject::READ_AHEAD,
    "READ_AHEAD | DROP_NEW_LINE" => SplFileObject::READ_AHEAD | SplFileObject::DROP_NEW_LINE,
    "READ_AHEAD | SKIP_EMPTY" => SplFileObject::READ_AHEAD | SplFileObject::SKIP_EMPTY,
    "READ_AHEAD | DROP_NEW_LINE | SKIP_EMPTY" => SplFileObject::READ_AHEAD | SplFileObject::DROP_NEW_LINE | SplFileObject::SKIP_EMPTY,
    "DROP_NEW_LINE" => SplFileObject::DROP_NEW_LINE ,
    "DROP_NEW_LINE | SKIP_EMPTY" => SplFileObject::DROP_NEW_LINE | SplFileObject::SKIP_EMPTY,
    "SKIP_EMPTY" => SplFileObject::SKIP_EMPTY ,
];

foreach($flags as $flagName => $flag) {
    $reader->setFlags($flag);
    $lines = $reader->fetchAll();
    $vals = [];
    foreach($lines as $line){
        if($line == null){
            $vals[] = "<null>";
        }elseif(count($line)){
            $val = array_shift($line);
            if($val == null){
                $val = "<null>";
            }
            $vals[] = $val;
        }
    }
    echo count($lines). " lines [".implode(', ',$vals)."]\tfor ". $flagName."\n";
}

Output in 7.0.1

3 lines [1st, 2nd, <null>]  for NONE
3 lines [1st, 2nd, <null>]  for READ_AHEAD
3 lines [1st, 2nd, <null>]  for READ_AHEAD | DROP_NEW_LINE
2 lines [1st, 2nd]  for READ_AHEAD | SKIP_EMPTY
2 lines [1st, 2nd]  for READ_AHEAD | DROP_NEW_LINE | SKIP_EMPTY
3 lines [1st, 2nd, <null>]  for DROP_NEW_LINE
2 lines [1st, 2nd]  for DROP_NEW_LINE | SKIP_EMPTY
2 lines [1st, 2nd]  for SKIP_EMPTY

Please note the lines with SplFileObject::SKIP_EMPTY set: Those contain only 2 values (as expected).

Output in 7.1.0

3 lines [1st, 2nd, <null>]  for NONE
3 lines [1st, 2nd, <null>]  for READ_AHEAD
3 lines [1st, 2nd, <null>]  for READ_AHEAD | DROP_NEW_LINE
3 lines [1st, 2nd, <null>]  for READ_AHEAD | SKIP_EMPTY
3 lines [1st, 2nd, <null>]  for READ_AHEAD | DROP_NEW_LINE | SKIP_EMPTY
3 lines [1st, 2nd, <null>]  for DROP_NEW_LINE
3 lines [1st, 2nd, <null>]  for DROP_NEW_LINE | SKIP_EMPTY
3 lines [1st, 2nd, <null>]  for SKIP_EMPTY

Regardless of the flags, the output is always the same (3 lines, the last one null). I tried to figure out what changed between those versions but was unable to identify the cause - so this is my last desperate attempt to understand what's going on :)
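Until this is resolved, a hedged workaround sketch (assuming Reader::addFilter is available in 7.1): drop the trailing empty row with a query filter instead of relying on the flags.

// Filter out the [null] row produced by the trailing newline.
$reader->addFilter(function ($row) {
    return $row !== [null];
});
$lines = $reader->fetchAll();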

example gives error

<?php
use League\Csv\Writer;

//we fetch the info from a DB using a PDO object
$sth = $dbh->prepare(
    "SELECT firstname, lastname, email FROM users LIMIT 200"
);
//because we don't want to duplicate the data for each row
// PDO::FETCH_NUM could also have been used
$sth->setFetchMode(PDO::FETCH_ASSOC);
$sth->execute();

//we create the CSV into memory
$csv = Writer::createFromFileObject(new SplTempFileObject());

//we insert the CSV header
$csv->insertOne(['firstname', 'lastname', 'email']);

// The PDOStatement Object implements the Traversable Interface
// that's why Writer::insertAll can directly insert
// the data into the CSV
$csv->insertAll($sth);

// Because you are providing the filename you don't have to
// set the HTTP headers yourself; Writer::output can
// set them for you directly.
// The file is downloadable.
$csv->output('users.csv');

Gives this error in lumen:

[Mon Apr 20 21:19:39 2015] PHP Fatal error:  Class 'App\Http\Controllers\SplTempFileObject' not found in /Users/tim/Documents/lumen-api/API - source code/app/Http/Controllers/apiSoldiers.php on line 25
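The error message shows PHP resolving the class inside the controller's namespace, which points at a missing import rather than a library bug. A sketch of the likely fix:

// SplTempFileObject lives in the global namespace; inside a namespaced
// controller it must be prefixed with a backslash (or imported with `use`).
$csv = Writer::createFromFileObject(new \SplTempFileObject());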

Add count() wrapper method

When using offset/limit for pagination, it would be useful if the library had a count() method to quickly determine the total number of rows without having to fetchAll(), with the memory overhead that comes with it.
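In the meantime, a hedged sketch that avoids materializing the rows in an array (iterator_count() still walks the whole file, but row by row):

use League\Csv\Reader;

$reader = Reader::createFromPath('data.csv');
// query() returns an iterator over the rows, so no array is built.
$total = iterator_count($reader->query());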

Importing CSV from Windows UTF-8 return NULL

When the CSV file is plain US-ASCII (character codes roughly 33 to 127), the import works fine.
With other localisations, non-English letters like "óęłńśżźŻŹĆÓŁĘŚŃ" make every field in the affected line come back empty (null), even if only a single "strange" letter is present. After removing that letter, it works.

Code is simple:

if (move_uploaded_file($this->data['Payment']['file']['tmp_name'], $filename)) {
    $csv = Reader::createFromPath($filename);
    $headers = $csv->fetchOne(); // works on the first line
    echo debug($headers);
    $data = $csv->setOffset(0)->setLimit(2)->fetchAll(); // not working
    echo debug($data);
}

Source file is:
--------------------cut here-------------
2014-09-09;2014-09-09;PRZELEW ZEWNETRZNY PRZYCHODZACY;"OPLATA ZA KURS, MICHAL ";"ANDRZEJ JERZY UL. 11 LISTOPADA 08-110 SIEDLCE";'6575676646444666444545454545';200,00;250,67;
2014-09-09;2014-09-09;PRZELEW ZEWN TRZNY PRZYCHODZ•CY;"OP£ATA ZA KURS, JAKUB ";"ANDRZEJ JERZY UL. 11 LISTOPADA 08-110 SIEDLCE";'4435446435354657656575634534534534535';200,00;450,67;
-----------cut-here-------------
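The sample looks like a single-byte Windows code page rather than UTF-8, so the reader chokes on the bytes above 127. A hedged sketch converting while reading; CP1250 is an assumption (adjust to the file's real encoding), and it presumes your version handles iconv filter names correctly (see the iconv stream filter issue above):

$csv = Reader::createFromPath($filename);
if ($csv->isActiveStreamFilter()) {
    // convert from the source code page to UTF-8 on the fly
    $csv->appendStreamFilter('convert.iconv.CP1250/UTF-8');
}
$data = $csv->setOffset(0)->setLimit(2)->fetchAll();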

change development workflow to PRs

Currently, commits are pushed directly to master.
It would help a lot for following your work on this package if changes went through PRs, even if you are the top lead or maintainer; this is how it is done in FOSS. I respect your decision if you ignore this request, but it would be a great help to get PR notifications and to help review contributions.

Thank you

Unwanted additional line at end of CSV file

I'm trying to create a new CSV file and write it to disk so that it can be emailed. This may not be an issue with the package but may be an issue with the way I am doing things.

Code:

//create empty file to store CSV data (is there a better way of doing this?)
$handle = fopen($filepath, "w");
fwrite($handle, "");

//create new writer instance
$csv = League\Csv\Writer::createFromPath(new SplFileObject($filepath));

//insert data from array
$csv->insertAll($csv_data);

The resulting CSV has all the data in it as expected but there is an additional line with no entry on it.

This is a problem because the resulting CSV has to be uploaded into an Oracle-based system and it does not like empty lines in CSV files.
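Two hedged notes. First, createFromPath() accepts an open mode, so the manual fopen()/fwrite() step to create the file should be unnecessary:

// 'w' truncates/creates the file, replacing the fopen/fwrite dance.
$csv = League\Csv\Writer::createFromPath($filepath, 'w');
$csv->insertAll($csv_data);

Second, the blank final line comes from the newline appended after every inserted row; if the Oracle-based system rejects it, one workaround is to write rtrim((string) $csv) to the file yourself.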

Importing data using the each method

Exporting data to CSV is made easy with the Writer::insertAll method; the same is not true when importing data from CSV into another storage medium. Right now you need to take the result of one of the Reader::fetch* methods and iterate over it to import your data. This can be memory expensive, and you end up doing two iterations!

I'm introducing the Reader::each method that will help ease CSV data import by applying a callback to each CSV row:

As with all the other Reader::fetch* methods, this new method can be modified using the filter methods. A usage sketch follows below.
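A usage sketch of the proposed method; the signature is assumed from the description above (the callback receives the row and its offset, and returning false stops the iteration), and $db stands in for any hypothetical storage handle:

$nbRows = $reader->each(function ($row, $index) use ($db) {
    $db->insert($row); // import one row at a time, no intermediate array

    return true; // keep iterating
});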

Default keys for fetchAssoc

Hello,

First, thanks for sharing your code.

By default, why not use the CSV file's first line to set the keys when calling fetchAssoc without keys?

Actually, we need to do :

$csv = Reader::createFromPath(dirname(__FILE__).'/my.csv');

$header = $csv->fetchOne();
$res = $csv->addFilter(function ($row, $index) { return $index > 0; })
                  ->fetchAssoc($header);

I will propose a pull request in a few minutes.

Extra row with "null" in reader fetchAll result

Using fetchAll I get an extra row containing NULL in the result.
See the following test case:

<?php
require_once 'vendor/autoload.php';
use League\Csv\Reader;

$reader = Reader::createFromString("a,b".PHP_EOL."3,11");
$result = $reader->fetchAll();
var_export($result);// array(0 =>array (0 => 'a',1 => 'b',),1 =>array (0 => '3',1 => '11',),2 =>array (0 => NULL,),)

My environment:
Ubuntu 14.04.1 LTS
PHP 5.5.21-1+deb.sury.org~trusty+2
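The trailing [null] row comes from the final PHP_EOL. A hedged workaround sketch, on versions where the SplFileObject flags are honored (see the 7.1.0 flags issue above):

$reader->setFlags(SplFileObject::READ_AHEAD | SplFileObject::SKIP_EMPTY);
$result = $reader->fetchAll(); // the empty trailing row is skipped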

HHVM support

It would seem that HHVM supports CallbackFilterIterator since version 3.2, yet the library test suite still does not pass on HHVM. We should investigate the failing tests and fix this issue once and for all.

Iterate using foreach adds an empty element

I just discovered this library. So, after seeing I can iterate with a simple foreach, I tried with a very simple code:

<?php
require '../vendor/autoload.php';

use League\Csv\Reader;

$data = Reader::createFromPath('file.csv');

foreach ($data as $row) {
    print_r($row);
}

It works, but adds an empty element at the end of data, in the last iteration:

Array
(
    [0] => 15581
    [1] => 1
    [2] => 1
    [3] => 2
    [4] => 339
    [5] => 1400
)
Array
(
    [0] => 
)

I definitely have to read the documentation, but I think this code should just work...


Let me clarify: I have ensured that there is no cause for an empty line. The file in this example has a single line and no more.


Adding Stream Filtering capabilities to the library

As of now the library does not handle CSV encoded in different charsets well. To resolve this we can use the PHP stream filter functions. But those functions are not restricted to encoding problems, which means we can enhance the library by providing a generic way to apply PHP's stream filtering capabilities to the CSV data.

A work in progress has already landed on the stream branch. That work needs reviewing before being merged to master. This feature is slated for inclusion in 5.4.

You can see a practical use of the stream filtering capabilities by viewing the stream example source code.

Feedback is welcome.

Character replacement

Hello,
I am trying to create some CSV rows, as in this example:

$data = [
            ['"1","name", "surname"'],
            ['"2","name", "surname"'],
            ['"3","name", "surname"']
        ];

I want each row on a single line, but the output is the following:

"""1"",""name"", ""surname"""
"""2"",""name"", ""surname"""
"""3"",""name"", ""surname"""

I want the output to be the following:

"1","name","surname"
"2","name","surname"
"3","name","surname"
etc.

I get more " characters than expected.
Am I doing something wrong?

Thanks
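For reference, a sketch of what insertOne expects: one array element per field. Passing a single pre-quoted string makes the writer treat it as one field and escape its inner quotes by doubling them, which is exactly the output above.

$data = [
    ['1', 'name', 'surname'],
    ['2', 'name', 'surname'],
    ['3', 'name', 'surname'],
];

foreach ($data as $row) {
    // note: fputcsv-style writers only add enclosures when a field
    // contains the delimiter, the enclosure or a newline
    $writer->insertOne($row);
}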

Allowed Memory Size Error and Memory Limit

Using a big file of 75MB we are getting the following error:

Fatal error in module Reader:
Allowed memory size of 134217728 bytes exhausted (tried to allocate 81 bytes)

Now I am going to play with @ini_set('memory_limit', '-1'); and see what happens. Please let me know if there is something I am missing to improve my code.

Thanks for your time and happy hacking!

This is the same piece of code; it works, but only with small files:

    public function convertItemsFile($files)
    {
        $itemsFileName = 'items_file.csv';

        $items = $this->fileExists($files, 'shortname', $itemsFileName);

        if ($items) {

            $input = $items[0]['name'];

            $csvItems = Reader::createFromPath($input);

            $headers = $csvItems->fetchOne();

            $dataItems = $csvItems
                ->setEncodingFrom('UTF-8')
                ->setDelimiter(',')
                ->setOffset(1)
                ->addFilter(array($this,'filterItemsByStyleCodes'))
                ->fetchAssoc();

            $output = $this->_csvFilesPath . 'vp-' . $items[0]['shortname'];

            @unlink($output);
            touch($output);

            $csvItems = Writer::createFromPath($output);

            $csvItems->insertOne($headers);

            foreach ($dataItems as $row) {

                $csvItems->insertOne([
                    $row['company'],
                    $row['item-number'],
                    $row['description'],
                    $row['style-code']
                ]);

            }

        }
    }
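A hedged alternative to lifting memory_limit: page through the CSV in fixed-size chunks instead of one big fetchAssoc() call. This assumes the query options (offset/limit) are reset after each fetch, hence re-applied per loop.

$offset = 1; // skip the header row
$chunk  = 1000;
do {
    $rows = $csvItems->setOffset($offset)->setLimit($chunk)->fetchAll();
    foreach ($rows as $row) {
        // filter and write each row here
    }
    $offset += $chunk;
} while (count($rows) === $chunk);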

Build suddenly failing

Hi gang,

I installed last night with "league/csv": "~7.0" in my composer.json, and today my builds are failing:

  • Installing league/csv (dev-master 0d3c28b)
    Cloning 0d3c28b
    0d3c28b is gone (history was rewritten?)
    Failed to download league/csv from source: Failed to execute git checkout '0d3c28bf20ad26e1e23599e1746aaf7c680c0477' && git reset --hard '0d3c28bf20ad26e1e23599e1746aaf7c680c0477'
    fatal: reference is not a tree: 0d3c28b
    Now trying to download from dist

    • Installing league/csv (dev-master 0d3c28b)
      Downloading: connection...

    [Composer\Downloader\TransportException]
    Could not authenticate against github.com

Did I screw something up?

more iterators in the Reader class

As of now the Reader::fetch* methods all return an array. This behavior is okay when:

  • the method used is Reader::fetchOne, as it always returns one row
  • you are dealing with a small CSV
  • you have used the filtering mechanism to reduce the CSV data size.

In any other case these methods can lead to intensive memory usage.
To fix this issue we could:

  • make all these functions return an iterator instead of an array. In this respect, the Reader::query method is equivalent to the Reader::fetchAll method and the two are interchangeable, since the result of a CSV read is most often consumed in a foreach loop. IMHO this should not affect regular usage.
  • make all these functions return a generator using the yield keyword (see the sketch below). The only drawback is that generators are only available in PHP 5.5+ while the library is PHP 5.4 compliant. If this solution is implemented, how should we treat PHP 5.4 users?
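A minimal sketch of the generator variant discussed above (PHP 5.5+ only, hence the compatibility question):

function fetchAllLazy(League\Csv\Reader $reader)
{
    foreach ($reader->query() as $index => $row) {
        yield $index => $row; // one row at a time, no full array
    }
}

foreach (fetchAllLazy($reader) as $row) {
    // memory stays flat regardless of the CSV size
}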

Typo in Docs

Typo of "tree" instead of "three" in the Reading section of the docs. PR coming soon. Not particularly exciting, but every little helps!

setOffset() is not respected with toHTML()

Hello,

I have a CSV file whose first 3 lines I don't want to use; then I want to convert the rest to an HTML table.
So I used something like this:

$inputCsv = Reader::createFromString(.....);
$inputCsv->setDelimiter("\t");
$inputCsv->setEncodingFrom("UTF-8");
$inputCsv->setOffset(3);
echo $inputCsv->toHTML('table table-striped');

But the generated table contains the first 3 lines of my CSV file.

Bye,
Hervé
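A workaround sketch until setOffset() is honored by toHTML(): apply the query options through fetchAll() and build the table manually.

$rows = $inputCsv->setOffset(3)->fetchAll();

echo '<table class="table table-striped">';
foreach ($rows as $row) {
    echo '<tr><td>', implode('</td><td>', array_map('htmlspecialchars', $row)), '</td></tr>';
}
echo '</table>';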

[bug?] First cell always contains "<?php"

Good morning gentlemen,

I've been fighting with this weird bug (?) right now:

UPDATE: This seems to be a problem in the Yii framework itself, not in thephpleague/csv (as I could reproduce the bug by just using PHP's native CSV functions). I'm unsure how to handle this issue now.

Reproducible problem

When using Yii framework 2.0.3 and outputting a .csv file created with thephpleague/csv 7.0 in the simplest, cleanest way, exactly like in the examples described, I always get a perfect file, except that it has <?php in the very first cell and the cell's content is wrapped in double quotes.

Should be
test

But is
<?php"test"

My code (a standard controller in Yii2):

    public function test()
    {
        $writer = Writer::createFromFileObject(new \SplTempFileObject());
        $writer->insertOne(array("test", "", "test2")); // header line
        $writer->insertOne(array("123", "444")); // demo content line
        $writer->output('123.csv');
    }

What I've tried:

  • all the different BOM settings
  • all the different delimiter, enclosure, newline and encoding settings
  • rendering the file via header()

Note:

  • showing the data in a HTML table or as raw data works perfectly
  • I'm suspecting something in the Yii framework working "into" the CSV creation

I'm kindly asking for a short look at whether this could be a real bug or just something on my side.
This is a bug-report, not a help request :)

Cheers

Working with large files

I'm trying to read a CSV file with 150k rows. My server isn't massive, so I am trying to do this in a way that handles memory properly; I can't just return all rows as an array.

What is the best way to loop over this? I tried using a loop like this: foreach ($data as $lineIndex => $row) but ran into memory issues.

I can create a PHP loop and use ->fetchOne() one row at a time, but I believe this 'rewinds' the file after every read, so if I am grabbing row 140,000 it takes 10 seconds to return the data. Any ideas for processing large files?

I am okay with storing the last line read in my cache, and then using ->fetchOne but is there a way to prevent the long seek delay?
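A hedged sketch: query() returns an iterator, so rows stream one at a time, and setOffset() lets you resume from the cached position without the per-call rewind of repeated fetchOne() calls ($lastLineRead is your cached offset):

$reader = League\Csv\Reader::createFromPath('big.csv');
foreach ($reader->setOffset($lastLineRead)->query() as $index => $row) {
    // process $row; memory stays flat because nothing is accumulated
}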

Adding Multiple filter conditions

The IteratorQuery trait enables filtering the CSV. But as of now you can only set one filter per query. So if you want a complex filtering condition you have to register a single, very complex function.

It would be interesting to be able to set as many filtering conditions as we want, so that users can register small yet more readable filters (see the sketch after the questions below).

If we register multiple filters:

  • should the registration order matter?
  • should we be able to register/unregister the filters?
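A sketch of how stacked filters could read if multiple registrations were allowed (order assumed to matter: first registered, first applied):

$reader->addFilter(function ($row) {
    return count($row) === 5; // keep only complete rows
});
$reader->addFilter(function ($row) {
    return $row[0] !== ''; // and only rows with a non-empty first column
});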

Proposal: Use grunt for registering precommit hooks

We could use grunt to register precommit hooks that

  1. Check files for PSR-2 violations
  2. Run phpunit tests

It will allow collaborators to commit only if all the conditions pass.
Precommit hooks can, however, be overridden with --no-verify to skip the checks.

Pros

  • Enforce coding conventions. No way of saying, oops i forgot to check.
  • Ensure no test cases are broken

Cons

  • Addition of node.js / io.js to the development stack, which will make setting up the dev environment a bit lengthy.

I will submit a PR if this is approved.

Import large csv files

I'm trying to import a large CSV file with 600,000+ records; do you have an example for this?

A function to chop the file into smaller parts and process them in memory one chunk at a time, like this, would be nice:

function file_get_contents_chunked($file,$chunk_size,$callback)
{
    try
    {
        $handle = fopen($file, "r");
        $i = 0;
        while (!feof($handle))
        {
            call_user_func_array($callback,array(fread($handle,$chunk_size),&$handle,$i));
            $i++;
        }

        fclose($handle);

    }
    catch(Exception $e)
    {
         trigger_error("file_get_contents_chunked::" . $e->getMessage(),E_USER_NOTICE);
         return false;
    }

    return true;
}

 $success = file_get_contents_chunked("my/large/file",4096,function($chunk,&$handle,$iteration){
    /*
        * Do what you will with the {$chunk} here
        * {$handle} is passed in case you want to seek
        * to different parts of the file
        * {$iteration} is the section of the file that has been read, so
        * ($iteration * 4096) is your current offset within the file.
    */

});

if(!$success)
{
    //It Failed
}

Got it from this question:

http://stackoverflow.com/questions/5249279/file-get-contents-php-fatal-error-allowed-memory-exhausted/5249971#5249971

iterate over fetchAssoc

I didn't see it in the docs (maybe I missed it), but it would be nice if it were possible to iterate over rows as associative arrays keyed by the first line.

something like:

$reader = \League\Csv\Reader::createFromPath('/file.csv');
$reader->setMode(Reader::MODE_ASSOC);
foreach ($reader as $row) {
 // here $row will be an array with names from the header
}

Adding Multiple sorting conditions

The IteratorQuery trait enables sorting the CSV. But as of now you can only set one sorting condition per query.

The current IteratorQuery::setSortBy behaviour is somewhat difficult to understand, so I'm thinking this method needs an important rewrite to enable:

  • setting multiple sorting conditions
  • a simpler method call

If we register multiple sorting conditions (see the sketch below):

  • should the registration order matter?
  • should we be able to register/unregister the settings?
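A sketch of stacked sorting conditions as proposed; the method name addSortBy is assumed for illustration:

$reader->addSortBy(function ($rowA, $rowB) {
    return strcmp($rowA[0], $rowB[0]); // primary key: first column
});
$reader->addSortBy(function ($rowA, $rowB) {
    return strcmp($rowA[1], $rowB[1]); // secondary key: second column
});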

Does not handle quotes as it should

I've been using this nice class for some time to read CSV files, but I recently noticed that some issues can occur with double quotes.

Here is an example CSV file.

"Label","Login","Password","Web Site","Comments"
"Generic Bank #1","jdoe","superS3cret","https://www.genericbank.com","Checking accounts, etc"
"Generic Bank #2","jdoe","S3cret with spaces","https://www.genericbank.com","Checking accounts, etc"
"Generic Bank #2","jdoe","SpecialChars\\!"'@#$$%^&&*()@\\///\","https://www.genericbank.com","Checking accounts, etc"
"Retailer Chain #1","",""Secretstartingwithquotes","https://www.bigboxstore.com",""
"Retailer Chain #2","","'Secretstartingwithsinglequote","https://www.bigboxstore.com",""
"Retailer Chain #3","","'Twosinglequotes'","https://www.bigboxstore.com",""
"Health Care #1","jdoe","S3cretwithsinglequote'init","https://www.myhealthcare.com","Health care stuff"
"Health Care #2","jdoe","S3cretwithcomma,init","https://www.myhealthcare.com","Health care stuff"
"Health Care #3","jdoe","S3cretwithdoublequote"init","https://www.myhealthcare.com","Health care stuff"

Here is the output:

Array
(
    [0] => Array
        (
            [Label] => "Label"
            [Login] => Login
            [Password] => Password
            [Web site] => Web Site
            [Comments] => Comments
        )

    [1] => Array
        (
            [Label] => Generic Bank #1
            [Login] => jdoe
            [Password] => superS3cret
            [Web site] => https://www.genericbank.com
            [Comments] => Checking accounts, etc
        )

    [2] => Array
        (
            [Label] => Generic Bank #2
            [Login] => jdoe
            [Password] => S3cret with spaces
            [Web site] => https://www.genericbank.com
            [Comments] => Checking accounts, etc
        )

    [3] => Array
        (
            [Label] => Generic Bank #2
            [Login] => jdoe
            [Password] => SpecialChars\\!'@#$$%^&&*()@\\///\"
            [Web site] => https://www.genericbank.com
            [Comments] => Checking accounts, etc
        )

    [4] => Array
        (
            [Label] => Retailer Chain #1
            [Login] => 
            [Password] => Secretstartingwithquotes"
            [Web site] => https://www.bigboxstore.com
            [Comments] => 
        )

    [5] => Array
        (
            [Label] => Retailer Chain #2
            [Login] => 
            [Password] => 'Secretstartingwithsinglequote
            [Web site] => https://www.bigboxstore.com
            [Comments] => 
        )

    [6] => Array
        (
            [Label] => Retailer Chain #3
            [Login] => 
            [Password] => 'Twosinglequotes'
            [Web site] => https://www.bigboxstore.com
            [Comments] => 
        )

    [7] => Array
        (
            [Label] => Health Care #1
            [Login] => jdoe
            [Password] => S3cretwithsinglequote'init
            [Web site] => https://www.myhealthcare.com
            [Comments] => Health care stuff
        )

    [8] => Array
        (
            [Label] => Health Care #2
            [Login] => jdoe
            [Password] => S3cretwithcomma,init
            [Web site] => https://www.myhealthcare.com
            [Comments] => Health care stuff
        )

    [9] => Array
        (
            [Label] => Health Care #3
            [Login] => jdoe
            [Password] => S3cretwithdoublequoteinit"
            [Web site] => https://www.myhealthcare.com
            [Comments] => Health care stuff
        )

)

Take a look at the password field and the different values; you will see that passwords containing double quotes are not correctly handled.
Here is how I'm invoking the class:

            $csv = new Reader($file);
            $csv->setDelimiter(',');
            $csv->setEnclosure('"');
            $csv->setEscape('\\');
            $csv->setFlags(SplFileObject::READ_AHEAD|SplFileObject::SKIP_EMPTY);
            $res = $csv->fetchAssoc(['Label', 'Login', 'Password', 'Web site', 'Comments']);
            print_r($res);

Can you help in any way?

Allow empty enclosure and escape.

Would it be possible to allow an empty string when setting the enclosure and the escape character? A client has a special format that, for some reason, must not use any enclosing or escaping characters around the delimited values.

    public function setEnclosure($enclosure = '"')
    {
        if (1 != mb_strlen($enclosure)) {
            throw new InvalidArgumentException('The enclosure must be a single character');
        }
        $this->enclosure = $enclosure;

        return $this;
    }
    public function setEscape($escape = "\\")
    {
        if (1 != mb_strlen($escape)) {
            throw new InvalidArgumentException('The escape character must be a single character');
        }
        $this->escape = $escape;

        return $this;
    }

Get Rid Of Develop Branch

Can we not have a develop branch for working on patch releases, please? It's a bit odd, and it makes the branch alias for getting dev changes pointless.

Memory consumption when writing large files

Hey,

I came across another problem when attempting to write a large CSV file, which is somewhat related to #81.

Code to reproduce

function writeFile($file, $content){
    $spl = new SplFileObject($file,"w");
    $spl->fwrite($content);
    $spl = null;
}

$writer = Writer::createFromString("");
$cols = 100;
$rows = 1000;
$chars = 100;
$text = implode(" ", array_fill(0, $chars, "a"));
$row = array_fill(0, $cols, $text);
$lines = array_fill(0, $rows, $row);
echo "Using ".memory_get_usage()/(1024*1024)." MB after creating data\n";
$writer->insertAll($lines);
echo "Using ".memory_get_usage()/(1024*1024)." MB after inserting data\n";
$str = $writer->__toString();
echo "Using ".memory_get_usage()/(1024*1024)." MB after converting data to string\n";
writeFile(__DIR__."/test2.csv",$str);
echo "Using ".memory_get_usage()/(1024*1024)." MB after writing data to file\n";

Output

Using 0.60456848144531 MB after creating data
Using 0.60501861572266 MB after inserting data
Using 19.885452270508 MB after converting data to string
Using 19.885543823242 MB after writing data to file

As you can see, I need to somehow "get" the CSV content (using $str = $writer->__toString();) in order to write it to a .csv file.
The resulting string can be very large, which leads to a massive increase in used memory (0.6MB => 19.8MB).

Imho it should be possible to perform chunked (as in line-by-line) writing, like this:

function writeCsv($file, $lines){
    $spl = new SplFileObject($file,"w");

    foreach ($lines as $line)
    {
        $spl->fputcsv($line);
    }
    $spl = null;
}

$cols = 100;
$rows = 1000;
$chars = 100;
$text = implode(" ", array_fill(0, $chars, "a"));
$row = array_fill(0, $cols, $text);
$lines = array_fill(0, $rows, $row);
echo "Using ".memory_get_usage()/(1024*1024)." MB after creating data\n";
writeCsv(__DIR__."/test3.csv",$lines);
echo "Using ".memory_get_usage()/(1024*1024)." MB after writing data to file\n";

Output

Using 0.37704467773438 MB after creating data
Using 0.37732696533203 MB after writing data to file

Since it uses SplFileObject internally, one could also use the filter API to apply a streaming filter (e.g. for encoding conversion). I feel the Writer would greatly benefit from a Writer::writeToFile method.

Any thoughts on that?

Cheers
Pascal

Filesize equal to 0

I have a simple function which write a csv file:

public function writeCsv($csvFilename, $rows)
{
    $writer = CSV\Writer::createFromPath(new \SplFileObject($csvFilename, 'w+'), 'w');

    // header
    $writer->insertOne(array_keys($rows[0]));

    // rows
    $writer->insertAll($rows);

    $fileSize = filesize($csvFilename);

    $this->logger->info("csv file $csvFilename written size=$fileSize");
}

My problem is that $fileSize is equal to 0, even though at the end of execution the CSV file is properly flushed and contains the data.

Is there a way to get the size of the file?
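Two hedged suspects: PHP caches filesize() results, and the underlying SplFileObject may not have flushed to disk at the point of the call. A sketch:

// clear PHP's stat cache for this file before asking for its size
clearstatcache(true, $csvFilename);
$fileSize = filesize($csvFilename);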

Null Handling in Csv\Writer class

When passing an array containing null to $writer->insertOne() an InvalidArgumentException is thrown with message:

the provided data can not be transform into a single CSV data row

Example:

$writer->insertOne(["one", "two", null, "four"]);

This creates the requirement for client code to "sanitize" nulls before inserting (sketch below).
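A client-side sketch of that sanitizing step: map nulls to empty strings before inserting.

$row = ["one", "two", null, "four"];
$row = array_map(function ($value) {
    return null === $value ? '' : $value; // null becomes an empty field
}, $row);
$writer->insertOne($row); // inserts: one,two,,four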

Reader ignores first newline in value

I came across a weird behaviour while evaluating this package. When reading a CSV file or string via the Reader, the first newline inside a value on each line is dropped.

Code to reproduce:

$input = <<<EOS
"line 1 field 1 with crlf: > \r\n < second crlf: > \r\n <, line 1 field 2 with crlf: > \r\n < second crlf: > \r\n <"
"line 2 field 1 with crlf: > \r\n < second crlf: > \r\n <, line 2 field 2 with crlf: > \r\n < second crlf: > \r\n <"
EOS;

$reader = Reader::createFromString($input);
$reader->setNewline("\r\n");
foreach($reader as $r){
    var_dump($r);
}

Output

array(1) {
  [0] =>
  string(104) "line 1 field 1 with crlf: >  < second crlf: > 
 <, line 1 field 2 with crlf: > 
 < second crlf: > 
 <"
}
array(1) {
  [0] =>
  string(104) "line 2 field 1 with crlf: >  < second crlf: > 
 <, line 2 field 2 with crlf: > 
 < second crlf: > 
 <"
}

Please note the first "> <" (without \r\n) at the beginning of each CSV line in the output.

The problem only occurs when SplFileObject::DROP_NEW_LINE is defined as a flag -- unfortunately I could not find a way to "unset" this flag from outside the class, since it is automatically added in the Controls class when setting the flags:

    /**
     * Set the Flags associated to the CSV SplFileObject
     *
     * @param int $flags
     *
     * @throws \InvalidArgumentException If the argument is not a valid integer
     *
     * @return $this
     */
    public function setFlags($flags)
    {
        if (false === filter_var($flags, FILTER_VALIDATE_INT, ['options' => ['min_range' => 0]])) {
            throw new InvalidArgumentException('you should use a `SplFileObject` Constant');
        }

        $this->flags = $flags|SplFileObject::READ_CSV|SplFileObject::DROP_NEW_LINE;

        return $this;
    }

Could you confirm this behaviour? To me, this looks like a bug - although I'm not sure if I'm missing something.

Cheers
Pascal

detectDelimiterList() doesn't find the correct delimiter

I have the following CSV file (without a header); line 1 is:

Surname;Name;;;;;3316157;12360000;"Bank name";DE49123600000003316157;GENODIF1DIR;12,456

With this line detectDelimiterList() can't find the correct delimiter:

$delimiters_list = $inputCsv->detectDelimiterList();

if (!$delimiters_list) {
    //no delimiter found
} elseif (1 == count($delimiters_list)) {
    $delimiter = $delimiters_list[0]; // the found delimiter
} else {
    //inconsistent CSV 
    var_dump($delimiters_list); // all the delimiters found
}

shows:

array(2) {
  [0]=>
  string(1) ","
  [1]=>
  string(1) ";"
}

Double spaces are created when writing a file with an "empty" enclosure.

When writing a file with an "empty" enclosure (a single space), the class replaces every space in the data with two spaces. For example, a pipe-delimited file:

$fileWriter->setDelimiter("|");
$fileWriter->setEnclosure(" ");

exports this to the file (notice the doubled spaces):

Rapala - Pro Guide Electric Fillet Knife|more data

instead of the actual data:

Rapala - Pro Guide Electric Fillet Knife|more data

(Github seems to be mangling the double spaces in my example)

Option for checking row consistency

It would be a really nice feature if both the Reader and the Writer could check whether the row lengths in the file are consistent. The Writer should probably throw an exception if strict checking is enabled and someone tries to write a row that is not consistent with the file.

I've created my own CSV package, as there were no decent CSV packages at that time. You can check it here. The row consistency check is implemented.

How to create a plain Writer and save content to ".csv" file

A very common scenario for me is:

  • importing data from various sources (CSV, XML, etc.)
  • manipulating data (usually in some sort of array)
  • exporting data as CSV file

My first naive approach to accomplish this with this package would look something like this:

$file = "result.csv";
$writer = new Writer();
$data = [/*...*/];
$writer->insertAll($data );
$writer->saveToCsv();

Unfortunately this does not seem to be possible at the moment, because

  1. a Writer can only be instantiated via from... method (right?)
  2. there is no method to simply write the data to a file

My workaround looks like this:

$file = "result.csv";
Utility::writeEmptyFile($file);
$writer = Writer::createFromPath($file);
$data = [/*...*/];
$writer->insertAll($data);
$dataAsString = $writer->__toString();
Utility::writeContentToFile($file,$dataAsString);

So I'm forced to handle the file processing myself, which I'd really like to avoid. This becomes especially cumbersome when dealing with different encodings (a problem you've already solved for reading CSV files with your stream/filter implementation). Do you plan on integrating this functionality? If not, why not? :)

Cheers
Pascal

Weird encoding issue when reading .csv generated by Google AdWords

I have a .csv from Google AdWords that's actually tab-separated. This is the default in Google AdWords; I'm not sure why. But when I var_dump() the array, I get a bunch of characters like this:

[screenshot of the garbled characters]

Here is the .csv I'm trying to use:

https://dl.dropboxusercontent.com/u/30874695/Keyword%20Planner%202015-03-27%20at%2011-13-26.csv

Again, this is straight from Google AdWords and hasn't been opened/saved in Excel. I've confirmed the strings are 'ASCII'. Here's the code I'm using to read the CSV:

$reader = Reader::createFromPath($csv_file);
$reader->setFlags(\SplFileObject::READ_AHEAD|\SplFileObject::SKIP_EMPTY);

$detect_delimiter = $reader->detectDelimiterList();
$delimiter = array_shift($detect_delimiter);
$reader->setDelimiter($delimiter);

$data = $reader
    ->setOffset(1)
    ->fetchAssoc($this->csv_headers);

// Remove duplicate entries
$data = array_map('unserialize', array_unique(array_map('serialize', $data)));
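AdWords exports are typically UTF-16LE, which would explain the garbage when the bytes are interpreted as UTF-8. A hedged sketch converting while reading; the exact encoding name is an assumption, and it presumes the iconv filter name survives intact (see the iconv stream filter issue above):

$reader = Reader::createFromPath($csv_file);
if ($reader->isActiveStreamFilter()) {
    // convert from UTF-16LE to UTF-8 on the fly
    $reader->appendStreamFilter('convert.iconv.UTF-16LE/UTF-8');
}
$reader->setDelimiter("\t");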
