randomstate/camelot-php

A PHP wrapper for Camelot, the Python PDF table extraction library

Installation

composer require randomstate/camelot-php

Usage

The package closely follows the usage of the Camelot CLI. The default output is CSV, returned as a plain string. If you need to parse CSV strings, we recommend the league/csv package (https://csv.thephpleague.com/).

<?php

use RandomState\Camelot\Camelot;
use League\Csv\Reader;

$tables = Camelot::lattice('/path/to/my/file.pdf')
       ->extract();

$csv = Reader::createFromString($tables[0]);
$allRecords = $csv->getRecords();
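
If pulling in league/csv is not an option, PHP's built-in fgetcsv can parse the same kind of string. The following is a minimal sketch and not part of camelot-php: parseCsvString is a hypothetical helper, and the sample string stands in for one element of the array returned by extract().

```php
<?php

/**
 * Hypothetical helper: parse a CSV string into an array of rows
 * using only built-in functions.
 */
function parseCsvString(string $csv): array
{
    // Write the string to an in-memory stream so fgetcsv can read it,
    // which also copes with quoted fields that contain line breaks.
    $handle = fopen('php://memory', 'r+');
    fwrite($handle, $csv);
    rewind($handle);

    $rows = [];
    while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        if ($row === [null]) {
            continue; // fgetcsv reports a blank line as [null]; skip it
        }
        $rows[] = $row;
    }
    fclose($handle);

    return $rows;
}

// Sample data standing in for $tables[0].
$sample = "name,qty\n\"Widget, large\",3\n";
$rows = parseCsvString($sample);
```

Each row comes back as a list of strings, so numeric cells still need casting where that matters.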

Advanced Processing

Saving / Extracting

Note: No Camelot operations are run until one of these methods is called.

$camelot->extract(); // uses temporary files and automatically grabs the table contents from each file
$camelot->save('/path/to/my-file.csv'); // mirrors the behaviour of Camelot and saves files in the format /path/to/my-file-page-*-table-*.csv
$camelot->plot(); // useful for debugging; plots the result in a separate window (see Visual Debugging below)

Set Format

$camelot->json();
$camelot->csv();
$camelot->html();
$camelot->excel();
$camelot->sqlite();
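
Because save() writes one file per table, it can help to collect the resulting files afterwards. The sketch below assumes the naming pattern quoted in the save() comment above (my-file-page-*-table-*.csv); findSavedTables is a hypothetical helper and not part of the package.

```php
<?php

/**
 * Hypothetical helper: list the files written for a given save() target,
 * assuming the "<name>-page-*-table-*.<ext>" naming pattern.
 */
function findSavedTables(string $outputPath): array
{
    $info = pathinfo($outputPath);
    $extension = isset($info['extension']) ? '.' . $info['extension'] : '';

    $pattern = $info['dirname'] . '/' . $info['filename'] . '-page-*-table-*' . $extension;

    $files = glob($pattern) ?: [];
    natsort($files); // natural order, so page 2 sorts before page 10

    return array_values($files);
}

foreach (findSavedTables('/path/to/my-file.csv') as $file) {
    echo $file, PHP_EOL;
}
```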

Specify Page Numbers

$camelot->pages('1,2,3-4,8-end')
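
The page string uses the same syntax as the Camelot CLI: comma-separated pages and ranges, with end standing for the last page. When page selections come from application data, a small helper can assemble the string. buildPageSpec below is a hypothetical helper, shown only as a sketch.

```php
<?php

/**
 * Hypothetical helper: turn pages and [start, end] ranges into a page string.
 *
 * @param array $pages Page numbers, or two-element arrays for inclusive ranges.
 */
function buildPageSpec(array $pages): string
{
    $parts = [];
    foreach ($pages as $page) {
        // A two-element array is treated as an inclusive range.
        $parts[] = is_array($page) ? $page[0] . '-' . $page[1] : (string) $page;
    }

    return implode(',', $parts);
}

echo buildPageSpec([1, 2, [3, 4], [8, 'end']]), PHP_EOL; // prints 1,2,3-4,8-end
```

The result can then be passed straight to pages().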

Reading Encrypted PDFs

$camelot->password('my-pass')

Processing Background Lines

$camelot->stream()->processBackgroundLines()

Visual Debugging

$camelot->plot()

Specify Table Areas

<?php

use RandomState\Camelot\Camelot;
use RandomState\Camelot\Areas;

Camelot::stream('my-file.pdf')
    ->inAreas(
        Areas::from($xTopLeft, $yTopLeft, $xBottomRight, $yBottomRight)
            // ->add($xTopLeft2, $yTopLeft2, $xBottomRight2, $yBottomRight2)
            // ->add($xTopLeft3, $yTopLeft3, $xBottomRight3, $yBottomRight3)
    );

Specify Table Regions

<?php

use RandomState\Camelot\Camelot;
use RandomState\Camelot\Areas;

Camelot::stream('my-file.pdf')
    ->inRegions(
        Areas::from($xTopLeft, $yTopLeft, $xBottomRight, $yBottomRight)
            // ->add($xTopLeft2, $yTopLeft2, $xBottomRight2, $yBottomRight2)
            // ->add($xTopLeft3, $yTopLeft3, $xBottomRight3, $yBottomRight3)
    );
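
Camelot expects each area as x1,y1,x2,y2 in PDF coordinate space, where the origin is the bottom-left corner of the page. The top-left y value is therefore normally larger than the bottom-right one, as in the 25,560,574,445 area quoted in the issue report further down. The following sketch checks that ordering before the values are handed to Areas::from; assertValidArea is a hypothetical helper, not part of the package.

```php
<?php

/**
 * Hypothetical helper: validate an area given as top-left / bottom-right
 * corners in PDF coordinate space (origin at the bottom-left of the page).
 *
 * @return float[] The four coordinates in x1, y1, x2, y2 order.
 */
function assertValidArea(float $xTopLeft, float $yTopLeft, float $xBottomRight, float $yBottomRight): array
{
    if ($xTopLeft >= $xBottomRight) {
        throw new InvalidArgumentException('The left edge must lie to the left of the right edge.');
    }
    if ($yTopLeft <= $yBottomRight) {
        throw new InvalidArgumentException('The top edge must lie above the bottom edge.');
    }

    return [$xTopLeft, $yTopLeft, $xBottomRight, $yBottomRight];
}

[$x1, $y1, $x2, $y2] = assertValidArea(25, 560, 574, 445);
echo implode(',', [$x1, $y1, $x2, $y2]), PHP_EOL; // prints 25,560,574,445
```

A common source of swapped y values is measuring coordinates in an image viewer, where the origin is usually the top-left corner instead.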

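Specify Column Separators

$camelot->stream()->setColumnSeparators($x1,$x2...)

Split Text Along Separators

$camelot->split()

Flag Superscripts and Subscripts

$camelot->flagSize()

Strip Characters from Text

$camelot->strip("\n")

Improve Guessed Table Areas

$camelot->setEdgeTolerance(500)

Improve Guessed Table Rows

$camelot->setRowTolerance(15)

Detect Short Lines

$camelot->lineScale(20)

Shift Text in Spanning Cells

$camelot->shiftText('r', 'b')

Copy Text in Spanning Cells

$camelot->copyTextSpanningCells('r', 'b')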

License

MIT. Use at your own risk; we accept no liability for how this code is used.

Contributors

cimrie

camelot-php's Issues

Set format doesn't work

In Camelot's runCommand(), the value of the --format argument in the command is hardcoded as csv. I changed it to $this->getFormat() and everything works fine. I'm not sure this is the right fix, but if it is, please apply it.
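
The change described above amounts to passing the configured format into the command instead of the literal csv. The sketch below illustrates the idea only: buildCamelotCommand and its parameters are hypothetical, the argument order is copied from the command shown in the error report below, and the package's actual runCommand() may be structured differently.

```php
<?php

/**
 * Hypothetical sketch of assembling the CLI call with a configurable format.
 * This is not the package's actual implementation.
 */
function buildCamelotCommand(string $format, string $outputPath, string $mode, string $pdfPath): string
{
    return implode(' ', [
        'camelot',
        '--format', escapeshellarg($format), // the value the issue reports as hardcoded "csv"
        '--output', escapeshellarg($outputPath),
        $mode, // "lattice" or "stream"
        escapeshellarg($pdfPath),
    ]);
}

echo buildCamelotCommand('json', '/tmp/extract.txt', 'stream', '/tmp/file.pdf'), PHP_EOL;
```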

camelot: command not found

I ran the code below in Laravel:

$camelot = Camelot::stream($request->file('image')->path())
    ->inAreas(
        Areas::from(25, 560, 574, 445)
    );

The following error is returned:
Unexpected Camelot error. Command: camelot --format csv --output /tmp/956703291-0995589001626259870/extract.txt stream -T 25,560,574,445 /tmp/phpUBjIvu Output: ----------- sh: camelot: command not found
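
The sh: camelot: command not found line means the shell started by PHP could not find the camelot executable on its PATH, which commonly differs between a web server process and an interactive login shell. A quick way to check from PHP is sketched below; isCommandAvailable is a hypothetical helper and assumes a POSIX shell.

```php
<?php

/**
 * Hypothetical helper: report whether a command can be found on the PATH
 * seen by the current PHP process (POSIX shells only).
 */
function isCommandAvailable(string $command): bool
{
    $output = [];
    $exitCode = 1;
    // "command -v" exits with 0 only when the shell can resolve the command.
    exec('command -v ' . escapeshellarg($command) . ' 2>/dev/null', $output, $exitCode);

    return $exitCode === 0;
}

if (!isCommandAvailable('camelot')) {
    echo 'camelot is not on the PATH: ', getenv('PATH'), PHP_EOL;
}
```

If the check fails inside the web application but succeeds in a terminal, the Camelot CLI is most likely installed in a location that is missing from the web server's PATH.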
