
antiword's Issues

tryCatch didn't gracefully handle an antiword error

I thought others might run across a similar issue. I'm processing a large number of MS Word documents, and one of them is apparently corrupted (it's attached). Antiword produced the following error:

System call to 'antiword' failed (1): Read long 0x7a74 not possible

I tried using tryCatch(antiword(file), finally=as.character(NA)), but tryCatch didn't save me: the error still propagated. I couldn't work out how to get around this. I also tried signalCondition(), clumsily, with no luck.

So, inside the 'antiword' function there is a 'stop' command. All I really need is a warning, so to hack my way around this, I wrote this myantiword function:

is_windows <- function() {
  identical(.Platform$OS.type, "windows")
}

# Variant of antiword() that warns and returns NA on failure instead of stopping
myantiword <- function(file = NULL, format = FALSE) {
  args <- if (length(file)) {
    if (grepl("^https?://", file)) {
      # Download remote documents to a temporary file first
      tmp <- tempfile(fileext = ".doc")
      utils::download.file(file, tmp, mode = "wb")
      file <- tmp
    }
    file <- normalizePath(file, mustWork = TRUE)
    c(ifelse(isTRUE(format), "-f", "-t"),
      ifelse(is_windows(), shQuote(file), file))
  }
  # Run from the package's bin directory; restore the working directory on exit
  wd <- getwd()
  on.exit(setwd(wd))
  bindir <- system.file("bin", package = "antiword")
  setwd(bindir)
  # On Windows the binary name is suffixed with the pointer width (32 or 64)
  postfix <- if (is_windows())
    .Machine$sizeof.pointer * 8
  path <- file.path(bindir, paste0("antiword", postfix))
  # error = FALSE: return the exit status instead of raising an error
  out <- sys::exec_internal(path, args, error = FALSE)

  if (out$status == 0) {
    if (length(out$stderr))
      cat(rawToChar(out$stderr), file = stderr())
    return(rawToChar(out$stdout))
  } else {
    warning(sprintf("System call to 'antiword' failed (%d): %s",
                    out$status, rawToChar(out$stderr)))
    return(as.character(NA))
  }
}

BILL_ANALYSIS_TBL_20913.doc.zip
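
On the tryCatch attempt described in this issue: in base R, the `finally` argument only runs cleanup code and its value is discarded, so the error still propagates. A fallback value has to come from an `error` handler. Below is a minimal sketch of that pattern. `failing_reader` and `safe_read` are hypothetical names used for illustration only. `failing_reader` stands in for the real antiword() call so the example runs without the package or the corrupted file.

```r
# Hypothetical stand-in for antiword::antiword() that fails the way the
# corrupted document did; pass the real function as `reader` in practice.
failing_reader <- function(file) {
  stop("System call to 'antiword' failed (1): Read long 0x7a74 not possible")
}

# Downgrade any error from `reader` to a warning and return NA instead.
safe_read <- function(file, reader = failing_reader) {
  tryCatch(
    reader(file),
    error = function(e) {
      # Re-signal the error message as a warning, then supply the fallback
      warning(conditionMessage(e), call. = FALSE)
      NA_character_
    }
  )
}

# Warnings are suppressed here only to keep the example quiet.
result <- suppressWarnings(safe_read("corrupted.doc"))
```

With antiword installed, something like `vapply(files, safe_read, character(1), reader = antiword::antiword)` should let a batch run continue past a corrupted document without copying the package's internals.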

Error when using devtools::load_all || error: Exactly one of the DEBUG and NDEBUG flags MUST be set

Hi there,

I forked the antiword library and cloned it to my machine (macOS 10.14.1, R v3.6.3).

To install the library locally, I opened an R shell in the antiword directory and loaded devtools. I then ran devtools::load_all() (sorry if this is a silly way to do it) and got the following error:

$ devtools::load_all()
Loading antiword
Re-compiling antiword
─  installing *source* package ‘antiword’ ...
   ** using staged installation
   ** libs
   rm -f antiword.so register.o libantiword/main_u.o libantiword/asc85enc.o libantiword/blocklist.o libantiword/chartrans.o libantiword/datalist.o libantiword/depot.o libantiword/dib2eps.o libantiword/doclist.o libantiword/fail.o libantiword/finddata.o libantiword/findtext.o libantiword/fmt_text.o libantiword/fontlist.o libantiword/fonts.o libantiword/fonts_u.o libantiword/hdrftrlist.o libantiword/imgexam.o libantiword/imgtrans.o libantiword/jpeg2eps.o libantiword/listlist.o libantiword/misc.o libantiword/notes.o libantiword/options.o libantiword/out2window.o libantiword/output.o libantiword/pdf.o libantiword/pictlist.o libantiword/png2eps.o libantiword/postscript.o libantiword/prop0.o libantiword/prop2.o libantiword/prop6.o libantiword/prop8.o libantiword/properties.o libantiword/propmod.o libantiword/rowlist.o libantiword/sectlist.o libantiword/stylelist.o libantiword/stylesheet.o libantiword/summary.o libantiword/tabstop.o libantiword/text.o libantiword/unix.o libantiword/utf8.o libantiword/word2text.o libantiword/worddos.o libantiword/wordlib.o libantiword/wordmac.o libantiword/wordole.o libantiword/wordwin.o libantiword/xmalloc.o libantiword/xml.o antiword
   gcc -I"/Users/sanjeevsariya/bin/R3.6.3/Rv3.6.3/lib/R/include" -DNDEBUG   -I/usr/local/include  -fPIC  -g -O2  -UNDEBUG -Wall -pedantic -g -O0 -fdiagnostics-color=always -c libantiword/main_u.c -o libantiword/main_u.o
   In file included from libantiword/main_u.c:48:
   libantiword/antiword.h:13:2: error: Exactly one of the DEBUG and NDEBUG flags MUST be set
   #error Exactly one of the DEBUG and NDEBUG flags MUST be set
    ^
   1 error generated.
   make: *** [libantiword/main_u.o] Error 1
   ERROR: compilation failed for package ‘antiword’
─  removing ‘/private/var/folders/7w/kl6vpf596h738qtnndqqm7k00000gn/T/Rtmpi2skhK/devtools_install_56981013902c/antiword’
Error in (function (command = NULL, args = character(), error_on_status = TRUE,  : 
  System command 'R' failed, exit status: 1, stdout + stderr (last 10 lines):
E> rm -f antiword.so register.o libantiword/main_u.o libantiword/asc85enc.o libantiword/blocklist.o libantiword/chartrans.o libantiword/datalist.o libantiword/depot.o libantiword/dib2eps.o libantiword/doclist.o libantiword/fail.o libantiword/finddata.o libantiword/findtext.o libantiword/fmt_text.o libantiword/fontlist.o libantiword/fonts.o libantiword/fonts_u.o libantiword/hdrftrlist.o libantiword/imgexam.o libantiword/imgtrans.o libantiword/jpeg2eps.o libantiword/listlist.o libantiword/misc.o libantiword/notes.o libantiword/options.o libantiword/out2window.o libantiword/output.o libantiword/pdf.o libantiword/pictlist.o libantiword/png2eps.o libantiword/postscript.o libantiword/prop0.o libantiword/prop2.o libantiword/prop6.o libantiword/prop8.o libantiword/properties.o libantiword/propmod.o libantiword/rowlist.o libantiword/sectlist.
[...]
Type .Last.error.trace to see where the error occured

I compiled R locally with the following flags:
./configure --prefix=~/Rv3.6.3 --enable-R-shlib --enable-BLAS-shlib

Any pointers on getting this initial load/compile/install working would be appreciated.

Platform: x86_64-apple-darwin18.2.0 (64-bit)
Running under: macOS Mojave 10.14.1
other attached packages:
devtools_2.2.2 usethis_1.5.1 
loaded via a namespace (and not attached):
Rcpp_1.0.3        rstudioapi_0.11   magrittr_1.5      pkgload_1.0.2    
R6_2.4.1          rlang_0.4.5       fansi_0.4.1       tools_3.6.3      
pkgbuild_1.0.6    sessioninfo_1.1.1 cli_2.0.2         withr_2.1.2      
ellipsis_0.3.0    remotes_2.1.1     assertthat_0.2.1  digest_0.6.25    
rprojroot_1.3-2   crayon_1.3.4      processx_3.4.2    callr_3.4.2      
fs_1.3.2          ps_1.3.2          testthat_2.3.2    memoise_1.1.0    
glue_1.3.1        compiler_3.6.3    desc_1.2.0        backports_1.1.5  
prettyunits_1.1.1
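
A note on the gcc line in the log above: it passes -DNDEBUG first and -UNDEBUG later. The compiler applies -D and -U options in command-line order, so NDEBUG ends up undefined, DEBUG was never defined, and the check in antiword.h fires. The sketch below replays the flags from the log to show this. `defined_macros` is a hypothetical helper written for this illustration. Where the trailing `-UNDEBUG -Wall -pedantic -g -O0` group comes from is not confirmed in this thread. It looks like debug flags appended by the development tooling, which is an assumption.

```r
# Flags copied from the gcc line in the log, in command-line order.
flags <- c("-DNDEBUG", "-fPIC", "-g", "-O2",
           "-UNDEBUG", "-Wall", "-pedantic", "-g", "-O0")

# Replay -D/-U options left to right and return the macros that are
# still defined at the end.
defined_macros <- function(flags) {
  defined <- character()
  for (flag in flags) {
    if (startsWith(flag, "-D")) {
      # Strip the -D prefix and any =value part
      defined <- union(defined, sub("^-D([^=]+).*$", "\\1", flag))
    } else if (startsWith(flag, "-U")) {
      defined <- setdiff(defined, sub("^-U", "", flag))
    }
  }
  defined
}

remaining <- defined_macros(flags)
# antiword.h requires exactly one of DEBUG / NDEBUG to be defined.
n_set <- sum(c("DEBUG", "NDEBUG") %in% remaining)
```

Here `n_set` is 0 rather than the required 1. If the assumption about the tooling holds, a plain `R CMD INSTALL .` from the package directory, which uses R's configured flags, may be worth trying in place of load_all().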

Add other antiword parameters

It would be great to be able to use other parameters, for instance:

-w width
              In text mode this is the line width in characters. A value of zero puts an entire paragraph on a line, useful when the text is to be used as input for another wordprocessor.

For example, -w 0 would be helpful for extracting text in an NLP pipeline.
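
As a rough sketch of what this request could look like on the R side, the helper below builds an argument vector that includes -w. `build_antiword_args` and its `width` parameter are hypothetical and not part of the antiword package. Only the -t, -f and -w command-line options come from this thread.

```r
# Hypothetical helper, not part of the antiword package: builds the
# argument vector for the antiword binary, optionally adding -w <width>.
build_antiword_args <- function(file, format = FALSE, width = NULL) {
  args <- if (isTRUE(format)) "-f" else "-t"
  if (!is.null(width)) {
    # Width must be a single non-negative number; 0 means one paragraph per line
    stopifnot(is.numeric(width), length(width) == 1, width >= 0)
    args <- c(args, "-w", as.character(as.integer(width)))
  }
  c(args, file)
}

nlp_args <- build_antiword_args("report.doc", width = 0)
```

A vector like this could be passed to the bundled binary in the same way the package passes its existing arguments. Whether the package should expose it as a `width` argument or as a generic pass-through is a design choice for the maintainer.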
