
python-tika's People

Contributors

sudharsh

python-tika's Issues

Aborted (core dumped)

Here is the Tika code I run:

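import couchdb  # the import that provides `parser` is not shown in the original report

def handler(file, desc='', tags=''):
    """Store the file's Tika metadata in CouchDB; return (file URL, document id)."""
    couch = couchdb.Server('http://localhost:5984/')
    try:
        db = couch['test']
    except couchdb.http.ResourceNotFound:
        # The database does not exist yet, so create it.
        db = couch.create('test')

    tika = parser.from_buffer(file.read())
    meta = tika['metadata']
    meta['desc'] = desc
    meta['tag'] = tags

    # meta['fre'] = analysis(tika['content'])

    doc_id = db.save(meta)[0]
    file.seek(0)  # rewind: file.read() above left the stream at end-of-file
    db.put_attachment(meta, file)
    return "http://127.0.0.1:5984/test/" + doc_id + "/" + file.name, doc_id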

When I run the function above on its own, it works; when I call it from a Django project, it crashes:

A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0xb41074b6, pid=10762, tid=2213530432
#
# JRE version: 7.0_07-b30
# Java VM: OpenJDK Server VM (23.2-b09 mixed mode linux-x86 )
# Problematic frame:
# C  [_tika.so+0xe54b6]  JArray<signed char>::JArray(_object*)+0x36
#
# Core dump written. Default location: /home/googcheng/mysite/core or core.10762
#
# An error report file with more information is saved as:
# /home/googcheng/mysite/hs_err_pid10762.log
#
# If you would like to submit a bug report, please include
# instructions on how to reproduce the bug and visit:
#   https://bugs.launchpad.net/ubuntu/+source/openjdk-7/

Any help would be greatly appreciated!
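
One possible cause, not confirmed anywhere in this report: the frame `JArray<signed char>::JArray(_object*)` inside `_tika.so` suggests the binding is generated with JCC, and JCC requires every thread that calls into Java to be attached to the JVM first. Django may serve requests on threads other than the one that started the JVM. The sketch below assumes the binding module exposes JCC's usual `getVMEnv()` function; `attach_current_thread` and its `jcc_module` parameter are hypothetical names used only for illustration.

```python
def attach_current_thread(jcc_module):
    """Attach the calling thread to an already running JVM.

    `jcc_module` is assumed to be a JCC-generated module (for example the
    module behind `_tika.so`) that exposes `getVMEnv()`.
    """
    env = jcc_module.getVMEnv()
    if env is None:
        # Assumption: getVMEnv() returns None when initVM() was never called.
        raise RuntimeError(
            "JVM not started; call initVM() once at process startup"
        )
    # Attaching is needed before this thread touches any Java object.
    env.attachCurrentThread()
    return env
```

If this assumption holds, calling the helper at the top of `handler` (after the JVM has been started once at process startup) would be a cheap way to check whether thread attachment is what differs between the standalone run and the Django run.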

Gets stuck converting some doc and all PDF files

Hi,
When I use tika-python in my project, it sometimes hangs while converting some doc files, without any error output. Could you please help?
I am using tika-python on a Mac Pro. Conversion works fine with tika-app, however. Let me know if you need any additional information.
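
To narrow down which files hang, one option is to wrap each conversion in a timeout so the script can log the offending file and move on. This is a minimal sketch using only the standard library; `run_with_timeout` is a hypothetical helper, and `func` stands for whatever conversion call the project makes. A thread-based timeout cannot kill the stuck call; it only returns control to the caller.

```python
import threading

def run_with_timeout(func, args=(), timeout=30.0):
    """Run func(*args) in a daemon thread and wait at most `timeout` seconds.

    Returns ("ok", value), ("error", exception) or ("timeout", None).
    """
    result = {}

    def worker():
        try:
            result["value"] = func(*args)
        except Exception as exc:  # report the failure instead of losing it
            result["error"] = exc

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout)

    if thread.is_alive():
        # The call is still running; it is left behind in the daemon thread.
        return ("timeout", None)
    if "error" in result:
        return ("error", result["error"])
    return ("ok", result["value"])
```

Running every input file through this helper and printing the paths that return `"timeout"` would give a concrete list of documents to attach to the report.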
