
bibtex's Introduction

rOpenSci

Project Status: Abandoned

This repository has been archived. The former README is now in README-NOT.md.

bibtex's People

Contributors

coatless, dieghernan, katrinleinweber, kurthornik, mkoohafkan, mllg, mwmclean, romainfrancois

bibtex's Issues

oldrel testthat snapshot differences

Looks like there is an underlying change in one of the string functions used between R 4.1 and R 4.2 causing the oldrel action to error during the unit testing portion:

https://github.com/ropensci/bibtex/actions/runs/3125413081/jobs/5069755857#step:6:362

─ Failure (test-examples.R:53:3): Read graphics ───────────────────────────────
Snapshot of `bib` has changed:
old[5:18] vs new[5:18]
  "and Methods_, 1025-1041."
  ""
  "Friendly M (1982). \"Graphical methods for categorical data.\" _SAS User"
- "Group International Conference Proceedings_, 190-200."
+ "Group International Conference Proceedings_, 190-200. <URL:"
- "<http://www.math.yorku.ca/SCS/sugi/sugi17-paper.html>."
+ "http://www.math.yorku.ca/SCS/sugi/sugi17-paper.html>."
  ""
  "Meyer D, Zeileis A, Hornik K (2005). \"The strucplot framework:"
  "Visualizing multi-way contingency tables with vcd.\" Department of"
  "Statistics and Mathematics, Wirtschaftsuniversität, Wien. Report 22,"
- "Research Report Series,"
+ "Research Report Series, <URL:"
- "<http://epub.wu-wien.ac.at/dyn/openURL?id=oai:epub.wu-wien.ac.at:epub-wu-01_8a1>."
+ "http://epub.wu-wien.ac.at/dyn/openURL?id=oai:epub.wu-wien.ac.at:epub-wu-01_8a1>."
  ""
  "Chambers JM, Cleveland WS, Kleiner B, Tukey PA (1983). _Graphical"
  "Methods for Data Analysis_. Wadsworth & Brooks/Cole."

old[29:36] vs new[29:36]
  "Hobart."
  ""
  "Friendly M (1994). \"A fourfold display for 2 by 2 by k tables.\" York"
- "University, Psychology Department. Technical Report 217,"
+ "University, Psychology Department. Technical Report 217, <URL:"
- "<http://www.math.yorku.ca/SCS/Papers/4fold/4fold.ps.gz>."
+ "http://www.math.yorku.ca/SCS/Papers/4fold/4fold.ps.gz>."
  ""
  "Murrell PR (1999). \"Layouts: A mechanism for arranging plots on a"
  "page.\" _Journal of Computational and Graphical Statistics_, 121-134."

old[45:52] vs new[45:52]
  "Friendly M (1994). \"Mosaic displays for multi-way contingency tables.\""
  "_Journal of the American Statistical Association_, 190-200."
  ""
- "Friendly M (????). \"The home page of Michael Friendly.\""
+ "Friendly M (????). \"The home page of Michael Friendly.\" <URL:"
- "<http://www.math.yorku.ca/SCS/friendly.html>."
+ "http://www.math.yorku.ca/SCS/friendly.html>."
  ""
  "Cleveland WS (1985). _The elements of graphing data_. Wadsworth,"
  "Monterey, CA, USA."

old[65:72] vs new[65:72]
  "_The American Statistician_, 303-305."
  ""
  "Blanc C, Schlick C (1995). \"X-splines : A Spline Model Designed for the"
- "End User.\" In _Proceedings of SIGGRAPH 95_, 377-386."
+ "End User.\" In _Proceedings of SIGGRAPH 95_, 377-386. <URL:"
- "<http://dept-info.labri.fr/~schlick/DOC/sig1.html>."
+ "http://dept-info.labri.fr/~schlick/DOC/sig1.html>."
  ""
  "Murrell P (1998). _Investigations in Graphical Statistics_. Ph.D."
  "thesis, The University of Auckland."

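One possible way to keep such snapshots stable across R versions is to normalize the printed references before comparing them. The sketch below is only an illustration: `normalize_refs` is a hypothetical helper (it is not part of bibtex or testthat), and the two input vectors are copied from the diff above.

```r
# Hypothetical helper (not part of bibtex or testthat): flatten wrapped
# reference text and drop the "<URL:" prefix so both renderings compare equal.
normalize_refs <- function(lines) {
  txt <- paste(lines, collapse = " ")
  txt <- gsub("<URL:\\s*", "<", txt)
  gsub("\\s+", " ", trimws(txt))
}

# Two renderings of the same reference, copied from the diff above.
without_prefix <- c(
  "Group International Conference Proceedings_, 190-200.",
  "<http://www.math.yorku.ca/SCS/sugi/sugi17-paper.html>."
)
with_prefix <- c(
  "Group International Conference Proceedings_, 190-200. <URL:",
  "http://www.math.yorku.ca/SCS/sugi/sugi17-paper.html>."
)

# TRUE when the two renderings differ only in wrapping and the URL prefix.
same <- identical(normalize_refs(without_prefix), normalize_refs(with_prefix))
```

Whether normalizing is preferable to recording separate snapshots per R version is a judgment call; this only shows that the difference is limited to the prefix and the line wrapping.
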
Error message for invalid .bib files to read.bib

I don't think this issue is any bigger than perhaps "read.bib could spit out a nicer error message for invalid input". My code occasionally generates malformed .bib files, and reading them with read.bib then fails with an "unprotect_ptr: pointer not found" error. E.g.

library(bibtex)
options(error = function() traceback(2))
writeLines('Not a real bib file', 'junk.bib')
read.bib('junk.bib')
Error in read.bib("junk.bib") : unprotect_ptr: pointer not found
In addition: Warning message:
In read.bib("junk.bib") : 
junk.bib:2:0
syntax error, unexpected $end
Dropping the entry '(nil)' (starting at line 0) 
2: .External("do_read_bib", file = file, encoding = encoding, srcfile = srcfile)
1: read.bib("junk.bib")

> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                        
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] bibtex_0.3-6

> R.version
               _                           
platform       x86_64-w64-mingw32          
arch           x86_64                      
os             mingw32                     
system         x86_64, mingw32             
status                                     
major          3                           
minor          0.2                         
year           2013                        
month          09                          
day            25                          
svn rev        63987                       
language       R                           
version.string R version 3.0.2 (2013-09-25)
nickname       Frisbee Sailing  
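
Until the parser reports this more gracefully, a caller could run a cheap sanity check before handing a file to read.bib. This is a minimal sketch under stated assumptions: `looks_like_bib` is a hypothetical helper, and the check is a heuristic (it only looks for a line that opens an "@type{" object), not a validation of the BibTeX grammar.

```r
# Hypothetical pre-check, not part of bibtex: TRUE if any line starts an
# "@type{" or "@type(" object, which every BibTeX entry must do.
looks_like_bib <- function(path) {
  lines <- readLines(path, warn = FALSE)
  any(grepl("^\\s*@[A-Za-z]+\\s*[{(]", lines))
}

junk <- tempfile(fileext = ".bib")
writeLines("Not a real bib file", junk)

good <- tempfile(fileext = ".bib")
writeLines("@Misc{key, title = {The Title}, year = 2012}", good)

junk_ok <- looks_like_bib(junk)
good_ok <- looks_like_bib(good)
```

A caller could then stop with its own message, for example "junk.bib does not contain any BibTeX entries", instead of surfacing the internal error.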

`write.bib` does not write UTF-8 characters properly

Hi,

I have seen this when trying to create a bibtex file for cffr:

library(bibtex)
packageVersion("bibtex")


entry <- bibentry(
  "Article",
  doi = "10.21105/joss.03900",
  url = "https://doi.org/10.21105/joss.03900",
  year = 2021,
  publisher = "The Open Journal",
  volume = 6,
  number = 67,
  pages = 3900,
  author = person("Diego", "Hernangómez"),
  title = "cffr: Generate Citation File Format Metadata for R Packages",
  journal = "Journal of Open Source Software"
)

write.bib(entry)

But the "Rpackages.bib" file is:

@Article{,
  doi = {10.21105/joss.03900},
  url = {https://doi.org/10.21105/joss.03900},
  year = {2021},
  publisher = {The Open Journal},
  volume = {6},
  number = {67},
  pages = {3900},
  author = {Diego Hernang?mez},
  title = {cffr: Generate Citation File Format Metadata for R Packages},
  journal = {Journal of Open Source Software},
}

Note the author field: author = {Diego Hernang?mez}, which is not displayed properly.
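
As a possible workaround, the entry can be converted with base R's toBibtex() and written as raw UTF-8 bytes. This is a sketch, not what write.bib does internally: it assumes toBibtex() output is an acceptable substitute, and the key "hernangomez2021" is made up for the example.

```r
# Assumption: base R's toBibtex() stands in for the text write.bib produces.
entry <- bibentry(
  "Article",
  key = "hernangomez2021",  # hypothetical key, added for the example
  author = person("Diego", "Hernang\u00f3mez"),
  title = "cffr: Generate Citation File Format Metadata for R Packages",
  journal = "Journal of Open Source Software",
  year = 2021
)
bib_lines <- enc2utf8(unclass(toBibtex(entry)))

# Write the UTF-8 bytes unchanged instead of re-encoding to the native locale.
out <- tempfile(fileext = ".bib")
con <- file(out, open = "wb")
writeLines(bib_lines, con, useBytes = TRUE)
close(con)

# Read the file back, declaring its encoding, to check the author survived.
back <- readLines(out, encoding = "UTF-8")
```

The point of `useBytes = TRUE` is that the strings are not translated through the session's native encoding on the way out, which is where a "?" substitution would typically be introduced.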

Support reading from arbitrary connections.

From an email exchange with Kurt:

romain writes:

Le 2013-07-29 21:06, Kurt Hornik a écrit :

Romain Francois writes:

Hello,
I've seen that you've released a new version of bibtex. Thanks and oops
for the missing fclose.

Sure---was reported by someone on Windows who could not file.remove the
.bib he had just read in ...

There was a tweet from Gavin Simpson:
https://twitter.com/ucfagls/status/360621762578354176

"Anyone aware of a bibtex parser for #rstats that doesn't need to read
from files? Looking to parse strings pulled from a web API"

I guess it would not be too hard to give read.bib the ability to read
from a character vector rather than from a file. I might have a go at
this.

Great. Ideally, this would work from arbitrary connections

Best
-k

Maybe I need a fresh look at connections, but last time I tried to use
them in C(++), I could not find an api.

Some is exposed via R_ext/Connections.h (with a note that code should
test for R_CONNECTIONS_VERSION).

(Seems that this is currently not used in any CRAN package ...)

Best
-k

One way is to call back an R function to get some more characters to
feed in to the lexer, but this would not be very efficient.

Romain
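
Short of connection support in the C lexer, a file-only parser can be fed from any connection by spooling it to a temporary file first. The helper below is a hypothetical sketch, not an existing bibtex function; `reader` is whatever function accepts a file path, and readLines is used as a stand-in so the example runs without the package.

```r
# Hypothetical helper: `reader` is any function taking a file path,
# for example bibtex::read.bib where that package is installed.
read_via_tempfile <- function(con, reader) {
  tmp <- tempfile(fileext = ".bib")
  on.exit(unlink(tmp), add = TRUE)
  writeLines(readLines(con, warn = FALSE), tmp)
  reader(tmp)
}

bib_text <- c("@Misc{smith2012,", "  title = {The Title},", "  year = 2012", "}")
con <- textConnection(bib_text)
# readLines stands in for the parser so the sketch runs without bibtex.
result <- read_via_tempfile(con, readLines)
close(con)
```

This costs one extra write and read per call, which is likely acceptable for strings pulled from a web API but is not a substitute for true streaming.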

Difficulty loading bibtex in R Studio

Hi!

I am having difficulty loading bibtex in R; I realized this while trying to install and load the "plm" package. I am working from a cloud environment and have R version 3.6.3. I have been able to run plm before, so I am confused why this is now an issue. I get an almost identical error when I install from CRAN and from GitHub.
What I am simply trying to run:

install.packages("plm")
install.packages("bibtex")

Or

if(!requireNamespace("remotes", quietly = TRUE)) { install.packages("remotes") }
remotes::install_github("ropensci/bibtex")

This is the error I get:

bibparse.y: In function 'xx_simple_value':
bibparse.y:945: error: 'for' loop initial declarations are only allowed in C99 mode
bibparse.y:945: note: use option -std=c99 or -std=gnu99 to compile your code
bibparse.y: In function 'xx_expand_abbrev':
bibparse.y:1021: error: 'for' loop initial declarations are only allowed in C99 mode
bibparse.y: In function 'asVector':
bibparse.y:1145: error: 'for' loop initial declarations are only allowed in C99 mode
make: *** [bibparse.o] Error 1
ERROR: compilation failed for package 'bibtex'

I have no idea why it is treating something as a for loop. I've tried to download archived versions of bibtex as well, with the same error message above. Please let me know if I can provide any more information. Thank you so much!
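
The message suggests the package source declares a variable inside a for statement, which older gcc versions reject unless C99 mode is on; the compiler note itself names the -std=gnu99 option. A possible way to pass that flag from R is a user Makevars file. This is a sketch under assumptions: the compiler is an older gcc, and the extra flags "-g -O2" are illustrative only.

```r
# Assumption: the system compiler is an older gcc that defaults to C89.
# The flags are illustrative; adjust them to your toolchain.
makevars <- file.path(tempdir(), "Makevars")
writeLines("CFLAGS = -std=gnu99 -g -O2", makevars)

# R_MAKEVARS_USER points R at a user Makevars file for this session.
Sys.setenv(R_MAKEVARS_USER = makevars)

# Then retry the source install:
# install.packages("bibtex", type = "source")
```

Using a temporary file keeps the change local to the session; a permanent setting would usually go in ~/.R/Makevars instead.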

Direct import into EndNote is not possible

Hi,

I am using EndNote as a citation manager. Unfortunately I am not able to directly import .bib entries generated with bibtex into EndNote. Currently I import the entries into Zotero, export as .ris from there and reimport into EndNote. Apart from a minor issue when using Zotero (see #53) this works, but having a direct option to import into EndNote would of course be desirable.

Thanks!

read.bib with '?' in key

read.bib does not work if the .bib file has a question mark in the "key".

tmp <- tempfile(fileext = ".bib")  
entry <- "@Misc{key?,\n author = \"Smith, Bob\",\n title = \"The Title\",\n year = 2012, \n}"
writeLines(entry, tmp)
read.bib(tmp)
## Warning message:
## In read.bib(tmp) : 
## C:\Users\MMCLEA~1.ADS\AppData\Local\Temp\Rtmp6ZP3yZ\file86058c626ee.bib:1:0
##  syntax error, unexpected TOKEN_LITERAL, expecting TOKEN_COMMA
##  Dropping the entry `k` (starting at line 1) 
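
If the keys can be changed, one stopgap is to rewrite them before parsing. The sketch below is hypothetical (`sanitize_keys` is not a bibtex function) and the set of characters it keeps is my own conservative choice, not the set the parser actually accepts.

```r
# Hypothetical helper: replace characters outside a conservative set in the
# citation key of each "@Type{key," line. Other lines are left untouched.
sanitize_keys <- function(lines) {
  pattern <- "^(\\s*@[A-Za-z]+\\s*\\{)([^,]*),"
  hits <- regmatches(lines, regexec(pattern, lines))
  # @string, @preamble and @comment objects have no citation key.
  skip <- grepl("^\\s*@(string|preamble|comment)", lines, ignore.case = TRUE)
  for (i in seq_along(lines)) {
    if (!skip[i] && length(hits[[i]]) == 3L) {
      clean <- gsub("[^A-Za-z0-9_:.+-]", "_", hits[[i]][3])
      lines[i] <- sub(pattern, paste0("\\1", clean, ","), lines[i])
    }
  }
  lines
}

entry <- c("@Misc{key?,", " author = \"Smith, Bob\",", " title = \"The Title\",",
           " year = 2012,", "}")
cleaned <- sanitize_keys(entry)
```

Note that this changes the key itself ("key?" becomes "key_"), so any citations that refer to the old key would need the same substitution.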

rchk issues

https://raw.githubusercontent.com/kalibera/cran-checks/master/rchk/results/bibtex.out

Package bibtex version 0.4.2.3
Package built using 79226/R 4.1.0; x86_64-pc-linux-gnu; 2020-09-19 23:50:40 UTC; unix   
Checked with rchk version 36c7ad2294619ba0a81109c9acb675eea2c96e6d
More information at https://github.com/kalibera/cran-checks/blob/master/rchk/PROTECT.md

Suspicious call (two or more unprotected arguments) to Rf_setAttrib at makeSrcRef bibtex/src/bibparse.y:1178

Function asVector
  [PB] has negative depth bibtex/src/bibparse.y:1159
  [PB] has possible protection stack imbalance bibtex/src/bibparse.y:1160

Function do_read_bib
  [PB] has possible protection stack imbalance bibtex/src/bibparse.y:428

Function junk1
  [PB] has negative depth bibtex/src/bibparse.y:1081
  [PB] has possible protection stack imbalance bibtex/src/bibparse.y:1082

Function recordComment
  [PB] has negative depth bibtex/src/bibparse.y:994
  [PB] has possible protection stack imbalance bibtex/src/bibparse.y:995

Function recordInclude
  [PB] has negative depth bibtex/src/bibparse.y:986
  [PB] has possible protection stack imbalance bibtex/src/bibparse.y:987

Function recordPreamble
  [PB] has negative depth bibtex/src/bibparse.y:1008
  [PB] has possible protection stack imbalance bibtex/src/bibparse.y:1009

Function recordString
  [PB] has negative depth bibtex/src/bibparse.y:1001
  [PB] has possible protection stack imbalance bibtex/src/bibparse.y:1002

Function setToken
  [PB] has possible protection stack imbalance bibtex/src/bibparse.y:1058

Function xx_assignement
  [PB] has negative depth bibtex/src/bibparse.y:875
  [PB] has possible protection stack imbalance bibtex/src/bibparse.y:879

Function xx_assignement_list2
  [PB] has negative depth bibtex/src/bibparse.y:854
  [PB] has possible protection stack imbalance bibtex/src/bibparse.y:858

Function xx_atobject_comment
  [PB] has possible protection stack imbalance bibtex/src/bibparse.y:515

Function xx_atobject_entry
  [UP] unprotected variable object while calling allocating function Rf_allocVector bibtex/src/bibparse.y:543
  [UP] unprotected variable object while calling allocating function Rf_allocVector bibtex/src/bibparse.y:547
  [UP] unprotected variable object while calling allocating function Rf_install bibtex/src/bibparse.y:550
  [UP] unprotected variable object while calling allocating function Rf_setAttrib(V,S:entry,V) bibtex/src/bibparse.y:550
  [UP] unprotected variable object while calling allocating function Rf_install bibtex/src/bibparse.y:551
  [UP] unprotected variable object while calling allocating function Rf_setAttrib(V,S:names,V) bibtex/src/bibparse.y:551
  [UP] unprotected variable object while calling allocating function Rf_install bibtex/src/bibparse.y:552
  [UP] unprotected variable object while calling allocating function Rf_setAttrib(V,S:key,V) bibtex/src/bibparse.y:552
  [PB] has negative depth bibtex/src/bibparse.y:573
  [PB] has possible protection stack imbalance bibtex/src/bibparse.y:575

Function xx_atobject_include
  [PB] has possible protection stack imbalance bibtex/src/bibparse.y:593

Function xx_atobject_preamble
  [PB] has possible protection stack imbalance bibtex/src/bibparse.y:609

Function xx_atobject_string
  [PB] has possible protection stack imbalance bibtex/src/bibparse.y:625

Function xx_entry_head
  [PB] has negative depth bibtex/src/bibparse.y:684
  [PB] has possible protection stack imbalance bibtex/src/bibparse.y:689

Function xx_null
  [PB] has possible protection stack imbalance bibtex/src/bibparse.y:975

Function xx_object_list_2
  [PB] has negative depth bibtex/src/bibparse.y:474
  [PB] has possible protection stack imbalance bibtex/src/bibparse.y:478

Function xx_token_entry
  [PB] has negative depth bibtex/src/bibparse.y:642
  [PB] has possible protection stack imbalance bibtex/src/bibparse.y:647

Function xx_value
  [PB] has negative depth bibtex/src/bibparse.y:814
  [PB] has possible protection stack imbalance bibtex/src/bibparse.y:819

Function yydestruct
  [PB] has possible protection stack imbalance bibtex/src/bibparse.c:2734
  [PB] has negative depth bibtex/src/bibparse.y:211

Function yyparse
  [PB] has negative depth bibtex/src/bibparse.c:3409
  [PB] has negative depth bibtex/src/bibparse.c:3413
  [PB] has negative depth bibtex/src/bibparse.c:3537
  [PB] has possible protection stack imbalance bibtex/src/bibparse.c:3565

Commas added to references when using bibtex in rmarkdown

Thanks for this terrific package!

I ran into something that seems like a bug, but I can't quite figure out where it comes from. I'm using bibtex (your package) to generate full citations in the body of the text in RMarkdown. However, when I generate the final document (a pdf, but the problem appears when I try other formats like html), some citations include commas that don't exist in the bib file.

Here's a simple example.

My rmd file (inspired by an earlier stack overflow suggestion here):

---
output: pdf_document
---

`r refs <- bibtex::read.bib("minimalbib.bib")`

`r capture.output(refs["BusemeyerTober2022"])`

minimalbib.bib contains the following:

@article{BusemeyerTober2022,
	author = {Marius R. Busemeyer and Tobias Tober},
	journal = {Comparative Political Studies},
	pages = {00104140221139381},
	title = {Dealing with Technological Change: Social Policy Preferences and Institutional Context},
	year = {2022}}

The pdf (and html) renders the following:

Busemeyer MR, Tober T (2022). "Dealing with Technological Change:, Social Policy Preferences and Institutional Context." Comparative, Political Studies, 00104140221139381.

Note two new commas: one after the colon ("Change:, Social") and one in the name of the journal ("_Comparative, Political").

Yet when I check the citation in the R console, I get the correct markdown code:

[1] "Busemeyer MR, Tober T (2022). “Dealing with Technological Change: Social"
[2] "Policy Preferences and Institutional Context.” _Comparative Political"
[3] "Studies_, 00104140221139381."

I would add that it doesn't affect all references (eyeballing my file, I would say about 1 in 4). I don't know if I'm doing something wrong, if rmarkdown is causing the problem, or if there's a bug, but I was wondering if anyone had an idea of what's happening here. Thanks!
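
A guess at the cause, not verified against knitr's source here: capture.output() returns one string per wrapped line, and as far as I know inline R output joins a multi-element vector with ", ". That would put a comma at every line break and would explain why only some references (those that wrap mid-title or mid-journal) are affected. The sketch below rebuilds the entry with base R's bibentry() so it runs without the .bib file.

```r
entry <- bibentry(
  "Article",
  key = "BusemeyerTober2022",
  author = c(person("Marius R.", "Busemeyer"), person("Tobias", "Tober")),
  title = "Dealing with Technological Change: Social Policy Preferences and Institutional Context",
  journal = "Comparative Political Studies",
  pages = "00104140221139381",
  year = 2022
)

# capture.output() returns one element per wrapped line of the printed entry.
wrapped <- capture.output(print(entry))

# Joining with ", " (my understanding of what inline output does to vectors)
# plants a comma at every line break.
with_commas <- paste(wrapped, collapse = ", ")

# Joining with a space gives one clean string for inline use.
one_line <- paste(wrapped, collapse = " ")
```

In the Rmd file this would mean wrapping the inline call as paste(capture.output(refs["BusemeyerTober2022"]), collapse = " ").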

curly brackets and formatting in titles of bibentries

Dear Romain,

I came across your nice package to handle bibtex entries in R, which I
played around with over the past few days. I came across some things I
want to share with you:

  1. Sometimes curly brackets seem to confuse the parser:

To display what I mean I give some examples:

S. Lavou'{e}
or
S. Lavou'e

in the bibfile wouldn't make a difference for LaTeX, but reading a bibentry

@Article{Lavo,
author = {S. Lavou'{e}},
title = {A new species of \textit{Petrocephalus} \textsc{Marcusen}
1854 ({O}steoglossomorpha:
{M}ormyridae) from the {S}anaga {R}iver basin, {C}ameroon},
journal = {Zootaxa},
year = {2011},
volume = {2934},
pages = {20-28},
}

will give after calling

bib$author
[1] "S. Lavou' e"

,

bib$author$family
[1] "e"

and

bib$author$given
[1] "S. Lavou'"

This all is not a problem for the character é, as one can also write
'e, and LaTeX will know what to do. But for other characters this
indeed is a problem. E.g. in the following bibtex entry

@Article{Bart,
author = {L'{a}szl'{o} Bartha and Attila {Moln'ar V.} and Nicolae
Drago\c{s}
and G'{a}bor Sramk'o},
title = {Molecular evidence for reticulate speciation in
\textit{Astragalus}
(Fabaceae) as revealed by a case study from section
\textit{Dissitiflori}},
journal = {Botany},
year = {2013},
volume = {91},
pages = {702-714},
}

One couldn't substitute the \c{s}, by \cs, because LaTeX can't handle
this, but bibtex.R gives the following results for the bibentry above:

bib$author[3]
[1] "Nicolae Drago\c s"

,

bib$author[3]$given
[1] "Nicolae Drago\c"

and

bib$author[3]$family
[1] "s"

The same applies to
J. Fjelds\r{a}
In a bibentry.

  2. Further, especially in Biology it is often necessary to include
    \textit{}. From the example above (@Article{Bart,), I get the following:

bib$title
[1] "Molecular evidence for reticulate speciation in
\textit{Astragalus}\n\t(Fabaceae) as revealed by a case study from
section \textit{Dissitiflori}"

(note the \n\t in the title)

Do you see a chance, that these limitations (or do they have a reason?)
will disappear in a future release of the bibtex package?

Best wishes

Ingo
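
For reference, base R's tools package can already translate braced accent macros when the string is parsed as LaTeX rather than split on spaces. The sketch below is not how bibtex handles names; `latex_to_utf8` is a hypothetical wrapper, and I have only used the cedilla example from the tools documentation, so coverage of other macros such as \r{a} is not claimed.

```r
# Hypothetical wrapper around base R's tools functions.
latex_to_utf8 <- function(x) {
  parsed <- tools::parseLatex(x)
  tools::deparseLatex(tools::latexToUtf8(parsed), dropBraces = TRUE)
}

# "\c{c}" is the cedilla example from the tools documentation.
converted <- latex_to_utf8("fa\\c{c}ile")
```

Converting before the name is split into given and family parts would avoid the brace group being treated as a separate token.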

ASCII turned into non-ASCII

I have the following bib-file example.bib:

@Article{GirLeq2016_CSTM_EntropyBasedGOFTests,
  title = {Entropy-Based Goodness-of-Fit Tests -- a Unifying Framework. Application to DNA Replication},
  author = {Val{\'e}rie Girardin and Justine Lequesne},
  journal = {Communications in Statistics--Theory and Methods},
  pages = {62--74},
  volume = {48},
  number = {1},
  year = {2017},
  doi = {10.1080/03610926.2017.1401084},
  publisher = {Taylor \& Francis},
}

If I read this and want to proceed working with it, the bibtex-package turns my ASCII text into non-ASCII.

## no non-ASCII
tools::showNonASCII(readLines("example.bib"))

library("bibtex")
entry <- read.bib("example.bib")

## bibtex package turns Val{\'e}rie into Valérie
print(entry, style = "citation")

Is there a way to avoid that? Thanks!
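
I am not aware of a read.bib option that keeps the LaTeX escapes, so one possible route is to convert back after reading. The sketch uses tools::encoded_text_to_latex() from base R; the exact escape it produces may differ from the one in the original file, so only the ASCII-only property is checked.

```r
# A name as it might look after the LaTeX escape has been converted to UTF-8.
utf8_name <- "Val\u00e9rie Girardin"

# Convert non-ASCII characters back to LaTeX escapes.
ascii_name <- tools::encoded_text_to_latex(utf8_name, "UTF-8")

# TRUE when every character of the result is plain ASCII.
is_ascii <- all(utf8ToInt(ascii_name) < 128L)
```

tools::showNonASCII(ascii_name) should then print nothing, as it does for the original file.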

Broken reading entries from datasets package

> bib <- read.bib( package = "datasets" )
Error in FUN(c("Chambers, J. M., Cleveland, W. S., Kleiner, B.", "Tukey, P. A." : 
  Invalid name format in bibentry.
Calls: read.bib -> lapply -> FUN -> ArrangeAuthors -> lapply -> FUN

fatal flex scanner internal error--end of buffer missed

This error occurs when I read a .bib file. At first I thought it happened because the file is huge, with something like 5000 citations, so I exported only 4 citations from this set into a separate .bib file. But even this 4-citation file does not work; I get the same error.

windows and encoding

@mwmclean this test fails on windows, probably some encoding issue:

test_that("{Herm{\\`e}s International S.A.} and Katzfu{\\ss}, Matthias", {
  authors <- "{Herm{\\`e}s International S.A.} and Katzfu{\\ss}, Matthias"
  parsed <- bibtex:::ArrangeAuthors(authors)
  expect_match(parsed$family[[1]], "Hermès International S.A.")
  expect_match(parsed$family[[2]], "Katzfuß")
})

See for example this on r-hub.
https://builder.r-hub.io/status/bibtex_0.4.1.tar.gz-eb660cb5067d4363a25f20980ed19e9b#L251

I'm not sure how to fix it, short of simply disabling the test ...
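
One thing that might be worth trying before disabling the test: write the expected strings with \u escapes so the test file itself is ASCII-only and the literals are marked as UTF-8 regardless of the platform's native encoding. Whether this is enough depends on what ArrangeAuthors returns on Windows, which I have not checked.

```r
# Expected values written with Unicode escapes instead of literal characters.
expected <- c("Herm\u00e8s International S.A.", "Katzfu\u00df")

# Strings built from \u escapes are marked as UTF-8 by R.
marks <- Encoding(expected)
```

In the test, these would replace the literal "Hermès International S.A." and "Katzfuß" passed to expect_match().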

Parse single entry from string

I am fetching citations from Zotero with the BetterBibtex-interfacing {rbbt} by @paleolimbot. I would love to be able to edit a bibtex entry before it gets written to a ".bib" file, i.e. while it is still in string format.

Ideally I would do that with {RefManageR}, but currently {RefManageR} relies on {bibtex} to parse .bib files. If {bibtex} had an exposed function for reading a literal bib string, then I could use {RefManageR} to edit it. Would you consider adding a parser for literal bibtex?

my_ref <- " @book{McElreath_2020, edition={2}, 
   title={Statistical Rethinking: A Bayesian Course with Examples in R and Stan}, ISBN={978-0-429-02960-8},
   url={https://www.taylorfrancis.com/books/9780429642319}, DOI={10.1201/9780429029608}, 
   publisher={Chapman and Hall/CRC}, author={McElreath, Richard}, year={2020}, month={Mar} }"

my_bibentry <- read.bib(text=my_ref)

It needs to be vectorized, of course, i.e. accepting character vectors of length()>1.

Thank you for providing such an important infrastructure package for bibliography management in R.
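
This does not answer the parsing request, but for the editing step: once an entry exists as a base R bibentry, fields can be changed with `$<-` and the result turned back into BibTeX text with toBibtex(). The sketch builds the entry by hand from the fields above (a subset, chosen for brevity) rather than parsing the string.

```r
# Entry built by hand from some of the fields in the string above.
entry <- bibentry(
  "Book",
  key = "McElreath_2020",
  author = person("Richard", "McElreath"),
  title = "Statistical Rethinking: A Bayesian Course with Examples in R and Stan",
  publisher = "Chapman and Hall/CRC",
  edition = "2",
  year = 2020,
  month = "Mar"
)

# Edit a field in place, then regenerate the BibTeX text.
entry$month <- "mar"
bib_lines <- toBibtex(entry)
```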

Encoding error when parsing BibTeX file with multi-byte characters on Windows

Thanks for this great package. I encountered a problem when using the bibtex package to parse BibTeX files with Chinese characters on Windows:

# Get current locale info
Sys.getlocale()
#> [1] "LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252"

# Set locale to Chinese
Sys.setlocale(locale = "Chinese")
#> [1] "LC_COLLATE=Chinese (Simplified)_China.936;LC_CTYPE=Chinese (Simplified)_China.936;LC_MONETARY=Chinese (Simplified)_China.936;LC_NUMERIC=C;LC_TIME=Chinese (Simplified)_China.936"

bib_text <- "
    @misc{text,
        title = {{你好}},
        language = {zh-CN},
        author = {{你好}},
        month = jun,
        year = {2013},
        pages = {163}
    }
"
# change encoding to "UTF-8"
bib_text_utf8 <- enc2utf8(bib_text)
Encoding(bib_text_utf8)
#> [1] "UTF-8"

# make sure the saved BibTeX file is UTF-8 encoded
con <- file("test.bib", encoding = "UTF-8")
writeLines(bib_text_utf8, con)
close(con)

readLines("test.bib", encoding = "UTF-8")
#> [1] ""                                "        @misc{text,"            
#> [3] "            title = {{你好}},"   "            language = {zh-CN},"
#> [5] "            author = {{你好}},"  "            month = jun,"       
#> [7] "            year = {2013},"      "            pages = {163}"      
#> [9] "        }"                       "    "                           

read.bib could not parse the Chinese characters, regardless of whether encoding was set to "UTF-8" or not.

str(bibtex::read.bib("test.bib"))
#> List of 1
#>  $ text:Class 'bibentry'  hidden list of 1
#>   ..$ text:List of 6
#>   .. ..$ title   : chr "{浣犲ソ}"
#>   .. ..$ language: chr "zh-CN"
#>   .. ..$ author  :Class 'person'  hidden list of 1
#>   .. .. ..$ :List of 5
#>   .. .. .. ..$ given  : NULL
#>   .. .. .. ..$ family : chr "浣犲ソ"
#>   .. .. .. ..$ role   : NULL
#>   .. .. .. ..$ email  : NULL
#>   .. .. .. ..$ comment: NULL
#>   .. ..$ month   : chr "jun"
#>   .. ..$ year    : chr "2013"
#>   .. ..$ pages   : chr "163"
#>   .. ..- attr(*, "bibtype")= chr "Misc"
#>   .. ..- attr(*, "key")= chr "text"
#>  - attr(*, "class")= chr "bibentry"
#>  - attr(*, "strings")= Named chr(0) 
#>   ..- attr(*, "names")= chr(0) 

str(bibtex::read.bib("test.bib", encoding = "UTF-8"))
#> List of 1
#>  $ text:Class 'bibentry'  hidden list of 1
#>   ..$ text:List of 6
#>   .. ..$ title   : chr "{浣犲ソ}"
#>   .. ..$ language: chr "zh-CN"
#>   .. ..$ author  :Class 'person'  hidden list of 1
#>   .. .. ..$ :List of 5
#>   .. .. .. ..$ given  : NULL
#>   .. .. .. ..$ family : chr "浣犲ソ"
#>   .. .. .. ..$ role   : NULL
#>   .. .. .. ..$ email  : NULL
#>   .. .. .. ..$ comment: NULL
#>   .. ..$ month   : chr "jun"
#>   .. ..$ year    : chr "2013"
#>   .. ..$ pages   : chr "163"
#>   .. ..- attr(*, "bibtype")= chr "Misc"
#>   .. ..- attr(*, "key")= chr "text"
#>  - attr(*, "class")= chr "bibentry"
#>  - attr(*, "strings")= Named chr(0) 
#>   ..- attr(*, "names")= chr(0) 

Here is my session info

sessionInfo()
#> R version 3.5.0 (2018-04-23)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 17134)
#> 
#> Matrix products: default
#> 
#> locale:
#> [1] LC_COLLATE=English_United States.1252 
#> [2] LC_CTYPE=English_United States.1252   
#> [3] LC_MONETARY=English_United States.1252
#> [4] LC_NUMERIC=C                          
#> [5] LC_TIME=English_United States.1252    
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 

After digging a little, I found that encoding the input of make.bib.entry as "UTF-8" solves this problem. But I am not sure if this is a proper solution.

devtools::install_github("hongyuanjia/bibtex")
str(bibtex::read.bib("test.bib"))
#> List of 1
#>  $ text:Class 'bibentry'  hidden list of 1
#>   ..$ text:List of 6
#>   .. ..$ title   : chr "{你好}"
#>   .. ..$ language: chr "zh-CN"
#>   .. ..$ author  :Class 'person'  hidden list of 1
#>   .. .. ..$ :List of 5
#>   .. .. .. ..$ given  : NULL
#>   .. .. .. ..$ family : chr "你好"
#>   .. .. .. ..$ role   : NULL
#>   .. .. .. ..$ email  : NULL
#>   .. .. .. ..$ comment: NULL
#>   .. ..$ month   : chr "jun"
#>   .. ..$ year    : chr "2013"
#>   .. ..$ pages   : chr "163"
#>   .. ..- attr(*, "bibtype")= chr "Misc"
#>   .. ..- attr(*, "key")= chr "text"
#>  - attr(*, "class")= chr "bibentry"
#>  - attr(*, "strings")= Named chr(0) 
#>   ..- attr(*, "names")= chr(0) 
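
To illustrate the idea behind the fix in isolation: the bytes coming out of the parser are valid UTF-8, but they carry no encoding mark, so a session in another code page misreads them. The sketch below uses only base R and the UTF-8 bytes of the two characters in the example; it does not call bibtex.

```r
# UTF-8 bytes of the two characters in the example title.
bytes <- as.raw(c(0xe4, 0xbd, 0xa0, 0xe5, 0xa5, 0xbd))
parsed <- rawToChar(bytes)

# Declare the encoding so R stops interpreting the bytes in the native code page.
Encoding(parsed) <- "UTF-8"

# TRUE when the marked string equals the intended two characters.
matches <- identical(parsed, "\u4f60\u597d")
```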

Unable to recover after encountering two consecutive TOKEN_LBRACE "{"

As part of studying the parser more thoroughly for #40, I discovered what appears to be a critical bug. It occurs in both master and my WIP fixes for the rchk issues (which I have not yet added to the PR).

The bibtex entry that causes this:

encoding = "utf-8"

@String{{ BAZ = "Foo Bar Baz" }

And accessing it in R (I've stored the file in inst/bib locally):

f <- file.path(system.file("bib", "parseError.bib", package = "bibtex"))
bib <- read.bib(f)

The second { correctly causes an error, but the grammar and/or error handling does not appear sufficient in order to recover correctly. Afterwards, I am unable to read any other .bib files - they fail and enter recovery as well.

In addition, a second call to read.bib(f) causes the R session to crash.

mismatched keys & citations

In generating a .bib file for my installed packages, I get a mismatch part way through, where the rest of the .bib file has the key assigned to the previous package. A minimum reproducible example:

test <- c("aod", "ape", "aplpack", "aqp", "argosfilter", "bibtex", "knitr")
install.packages(test, repos = "https://cloud.r-project.org/")
junk <- bibtex::write.bib(test, "bibtex_version.bib")

The break in bibtex_version.bib is:

@Article{ape,
title = {ape 5.0: an environment for modern phylogenetics and evolutionary analyses in {R}},
author = {E. Paradis and K. Schliep},
journal = {Bioinformatics},
year = {2018},
volume = {35},
pages = {526-528},
}

@Manual{aplpack,
title = {Algorithms for quantitative pedology: A toolkit for soil scientists},
author = {D.E. [email protected] Beaudette and P. Roudier and A.T. O'Geen},
journal = {Computers & Geosciences},
year = {2013},
volume = {52},
pages = {258 - 268},
url = {http://ncss-tech.github.io/AQP/},
}

@Manual{aqp,
title = {argosfilter: Argos locations filter},
author = {Carla Freitas},
year = {2012},
note = {R package version 0.63},
url = {https://CRAN.R-project.org/package=argosfilter},
}

The warning messages are:

It is recommended to use ‘given’ instead of ‘middle’.
Converted 7 of 7 package citations to BibTeX
Writing 9 Bibtex entries ... OK
Results written to file 'bibtex_version.bib'
Warning messages:
1: In structure(x, package = package, class = c("citation", "bibentry")) :
Calling 'structure(NULL, *)' is deprecated, as NULL cannot have attributes.
Consider 'structure(list(), *)' instead.
2: In mapply(function(b, k) { :
longer argument not a multiple of length of shorter

I have tested in both R 3.5.3 with bibtex_0.4.2 and R 3.6.0 with bibtex_0.4.2. The similar knitr::write_bib() function does not generate this mismatch of keys, but it pulls very different information from the package info.

I think the problem may be in the CITATION file for aplpack, which is a quoted bibentry, not p1
p2 citHeader() citEntry() textVersion as for most other packages.

"bibentry(bibtype = Manual,"
" title = {{aplpack}: Another Plot Package (version 180312)},"
" author = {Hans Peter Wolf},"
" year = {2018},"
" url = https://cran.r-project.org/package=aplpack)"

I will contact that package's maintainers, but since that version made it past the package checkers, it might be nice to add some error trapping to this package, too.

Also, note that citation('aplpack') fails:

citation('aplpack')
Warning message:
In structure(x, package = package, class = c("citation", "bibentry")) :
Calling 'structure(NULL, *)' is deprecated, as NULL cannot have attributes.
Consider 'structure(list(), *)' instead.
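
The second warning hints at the mechanism: if one package yields no citation, the list of entries becomes shorter than the list of keys, and pairing them positionally shifts every later key by one. The sketch below reproduces that with placeholder strings (no real citations are read; NULL stands in for the failing aplpack citation) and shows one way to keep keys and entries aligned.

```r
# Placeholder citation results: NULL stands in for a package whose
# CITATION file fails to produce an entry.
cites <- list(ape = "ape entry", aplpack = NULL, aqp = "aqp entry",
              argosfilter = "argosfilter entry")
pkgs <- names(cites)
label <- function(b, k) paste0(k, ": ", b)

# Dropping NULLs from the entries but not from the keys shifts later keys.
entries <- unlist(Filter(Negate(is.null), cites), use.names = FALSE)
shifted <- suppressWarnings(mapply(label, entries, pkgs, USE.NAMES = FALSE))

# Filtering keys and entries with the same mask keeps them aligned.
keep <- !vapply(cites, is.null, logical(1))
aligned <- mapply(label, unlist(cites[keep], use.names = FALSE), pkgs[keep],
                  USE.NAMES = FALSE)
```

In the shifted result the key aplpack is paired with the aqp entry, which matches the pattern in the file above.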

fatal flex scanner internal error

Hello,
I am trying to use the bibtex package to mine a bibtex file for certain keywords.
When trying the read.bib method, I get the following error:

bib <- read.bib( package = "bibtex" )
Error in read.bib(package = "bibtex") : lex fatal error:
fatal flex scanner internal error--end of buffer missed

sessionInfo()

R version 3.1.3 (2015-03-09)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 8 x64 (build 9200)

locale:
[1] LC_COLLATE=German_Switzerland.1252  LC_CTYPE=German_Switzerland.1252    LC_MONETARY=German_Switzerland.1252
[4] LC_NUMERIC=C                        LC_TIME=German_Switzerland.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] bibtex_0.4.0

loaded via a namespace (and not attached):
[1] tools_3.1.3

Thanks

Development environment of contributors?

Hey all,

I encountered some bumps in the road unrelated to the rchk issues that I've been recently looking into, and thought it pertinent to bring up: how do existing contributors develop for bibtex locally?

For example, I'm on Ubuntu 20.04, and so my bison version is 3.5.1 and flex is 2.6.4 - this proved to be a headache when running regram.sh to rebuild the lexer and parser C source/headers.

For example, the first line of src/bibparse.c:

/* A Bison parser, made by GNU Bison 2.3.  */

Suggests that it was built from a version of Bison 15 years old!

I ended up compiling Bison 2.3 from source and modifying regram.sh to use that as a temporary workaround. Am I missing something here? How is this package maintained/developed?

protected string not correctly parsed

From Achim:

If author/editor fields contain commas in a protected string, they are not parsed correctly by bibtex::read.bib. Attached is an example for author = {{The MathWorks, Inc.}} where the current code first splits on the comma:

R> x <- read.bib("matlab.bib")
R> x$author
[1] "Inc.} {The MathWorks"

Of course, subsequent operations (printing, export to BibTeX again) also don't do the right thing or throw warnings. It would be great if you could fix this.

Thanks in advance & best wishes,
Z
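A minimal sketch of the behaviour requested above, assuming the author field has already been split into individual names on " and ". The helper `parse_one_author()` is hypothetical and not part of bibtex; it only illustrates keeping a fully braced name as a single literal instead of splitting it on the comma.

```r
# Hypothetical helper (not part of bibtex): parse one author string that
# has already been separated from its neighbours on " and ".
parse_one_author <- function(x) {
  x <- trimws(x)
  if (grepl("^\\{[^{}]*\\}$", x)) {
    # Fully braced: strip the outer braces and keep the content, commas
    # included, as one literal family name.
    return(utils::person(family = substr(x, 2L, nchar(x) - 1L)))
  }
  # Otherwise fall back to the usual name parsing.
  utils::as.person(x)
}

corporate <- parse_one_author("{The MathWorks, Inc.}")
individual <- parse_one_author("Achim Zeileis")
```

With this approach the comma inside the braces never reaches the name splitter, so `corporate$family` holds the whole protected string.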

missing imports

@mwmclean can you import these from where they are from:

ArrangeSingleAuthor: no visible global function definition for
  ‘parseLatex’
ArrangeSingleAuthor: no visible global function definition for
  ‘deparseLatex’
ArrangeSingleAuthor: no visible global function definition for
  ‘latexToUtf8’
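For reference, all three functions are exported by the `tools` package that ships with R. The sketch below calls them with an explicit `tools::` prefix, which is one way to silence the note; the other is a NAMESPACE directive such as `importFrom(tools, parseLatex, deparseLatex, latexToUtf8)`. The input string is an illustrative assumption.

```r
# parseLatex(), deparseLatex() and latexToUtf8() all live in 'tools'.
# Qualifying the calls makes their origin explicit to R CMD check.
latex_text <- "fa\\c{c}ile"  # illustrative input, not taken from bibtex

parsed <- tools::parseLatex(latex_text)
converted <- tools::deparseLatex(tools::latexToUtf8(parsed))
```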

Proposal: Improving the package

Hi

Following #28, here is a proposed roadmap for improving the package, open for discussion:

Phase 1: Current state

Add tests, ideally snapshot-based (testthat >= 3):

Tests

  • Move to testthat >= 3
  • Improve current testing, including tests that cover already-closed issues.
  • Add separate tests for advanced BibTeX features (crossref, use of STRING)

Phase 2: Alternative solution

  • Implement and Test alternative solution on separate branch

Phase 3: Agree on changes

  • Decide whether the changes should be applied, iterate
  • Check reverse dependencies {revdep}
  • Merge on main
  • Bump to a major version
  • CRAN release

Nice to have

Happy to get feedback on this

caught segfault read.bib() - macOS 10.14.6

Issue

Description

Today, suddenly I got various fatal crashes of my R sessions (e.g., while using the package 'RefManageR' and 'vitae') which I tracked down to the update of 'bibtex' to version 0.4.2.1 (released on CRAN 2019-12-20), which leads to the following error message:

*** caught segfault ***
address 0x0, cause 'memory not mapped'

Traceback:
1: doTryCatch(return(expr), name, parentenv, handler)
2: tryCatchOne(expr, names, parentenv, handlers[[1L]])
3: tryCatchList(expr, classes, parentenv, handlers)
4: tryCatch(.External("do_read_bib", file = file, encoding = encoding, srcfile = srcfile), error = function(e) { if (!any(grepl("unprotect_ptr", e))) stop(geterrmessage(), call. = FALSE) else stop("Invalid bib file", call. = FALSE)})
5: withCallingHandlers(tryCatch(.External("do_read_bib", file = file, encoding = encoding, srcfile = srcfile), error = function(e) { if (!any(grepl("unprotect_ptr", e))) stop(geterrmessage(), call. = FALSE) else stop("Invalid bib file", call. = FALSE)}), warning = function(w) { if (any(grepl("syntax error, unexpected [$]end", w))) invokeRestart("muffleWarning")})

Remarks

  • Downgrading to 'bibtex' version 0.4.2 solves the issue
  • Initially I observed the crash using R-devel, however, I could reproduce the issue also using
    pre-compiled packages and R 3.6.2 (see session info below)
  • I tried different BIB-files (downloaded from publishers, manually generated files) all crashed
  • I tried to understand what kind of change causes this new behaviour, but the GitHub repository seems to be outdated

Running Example

Example BIB-file

@article{Hawking1966,
  author = {Hawking, Stephen William  and Bondi, Hermann },
  title = {The occurrence of singularities in cosmology},
  journal = {Proceedings of the Royal Society of London. Series A. Mathematical and Physical Sciences},
  volume = {294},
  number = {1439},
  pages = {511-521},
  year = {1966},
  doi = {10.1098/rspa.1966.0221},
}

R call

bibtex::read.bib("test.bib")

Session info

R version 3.6.2 (2019-12-12)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.6

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
[1] compiler_3.6.2 magrittr_1.5   tools_3.6.2    stringi_1.4.3  stringr_1.4.0  bibtex_0.4.2.1

Update (2019-12-24)

Now I see that the CRAN tests for macOS also fail, so I guess the problem is real and not related to a particular configuration. However, help is at hand (#24) and will probably be available once CRAN is back from vacation.

Issue with "\\}$"

Hi,
thanks for developing and maintaining this package.
I am having an issue with bibtex returning no entry when reading a bib file (with more than 1000 entries), and no error message is given.

After stepping through with a debugger to find out what is going on, I found a few places with code like:

guess_eoentry <- max(grep("\\}$", entry_lines))
...
guess_eostring <- max(grep("\\}$", string_line))

This breaks the parser (with no error message) when there are spaces after the closing }, e.g. "} ".
I am not very familiar with regex in R, but the pattern should probably be something like "\\}\\s*$"
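The difference between the two patterns can be checked on a small example. The `entry_lines` vector below is made up for illustration; only the two regular expressions come from the report above.

```r
# Illustrative entry whose closing brace is followed by a trailing space.
entry_lines <- c("@article{key,", "  title = {A title},", "} ")

# Current pattern: requires "}" to be the very last character.
strict <- grep("\\}$", entry_lines)

# Suggested pattern: allows optional whitespace after the closing brace.
lenient <- grep("\\}\\s*$", entry_lines)
```

Here `strict` is empty, and `max()` of an empty vector yields `-Inf` with a warning, which would explain the end of the entry never being found. `lenient` picks up the third line.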

Replace as.personList(authors) with do.call(c, authors)

From Kurt:

Apparently your package code still uses as.personList():

bibtex/R/bibtex.R:
as.personList(authors)

Since R 2.14.0, person objects can be combined into person objects via
c(), and there is no need for a separate personList class for
collections of persons.

Can you please change your code to no longer use as.personList()?

As far as we can tell, in your case 'authors' is built as a list of
person objects, so you should be able to simply replace

as.personList(authors)

by

do.call(c, authors)
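A small sketch of the suggested replacement, using two made-up `person` objects in place of the list that bibtex builds internally:

```r
# 'authors' stands in for the list of person objects built by the parser.
authors <- list(
  utils::person("Romain", "Francois"),
  utils::person("Kurt", "Hornik")
)

# c() combines person objects into a single person vector, so no
# personList class is needed.
combined <- do.call(c, authors)
```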

write.bib chooses the wrong citation, and doesn't warn that there was an option

Some of the packages in R have multiple citations associated with them. For example knitr::write_bib("testthat") returns

@Manual{R-testthat,
title = {testthat: Unit Testing for R},
author = {Hadley Wickham},
year = {2021},
note = {R package version 3.0.2},
url = {https://CRAN.R-project.org/package=testthat},
}

@Article{testthat2011,
author = {Hadley Wickham},
title = {testthat: Get Started with Testing},
journal = {The R Journal},
year = {2011},
volume = {3},
pages = {5--10},
url = {https://journal.r-project.org/archive/2011-1/RJournal_2011-1_Wickham.pdf},
}

The first one is the citation for the package, so it's the one I want. But bibtex::write.bib("testthat") gives the second one, and doesn't warn that it's doing it.
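As a possible workaround, the entries can be filtered by type before writing. The sketch below rebuilds the two entries shown above with `utils::bibentry()` so that it runs without testthat installed; in practice the object would come from `citation("testthat")`. Selecting on `"Manual"` is an assumption about which entry is wanted, not something bibtex does itself.

```r
# Stand-in for citation("testthat"): one Manual and one Article entry.
entries <- c(
  utils::bibentry(
    bibtype = "Manual", key = "R-testthat",
    title = "testthat: Unit Testing for R",
    author = utils::person("Hadley", "Wickham"),
    year = "2021", note = "R package version 3.0.2"
  ),
  utils::bibentry(
    bibtype = "Article", key = "testthat2011",
    author = utils::person("Hadley", "Wickham"),
    title = "testthat: Get Started with Testing",
    journal = "The R Journal", year = "2011",
    volume = "3", pages = "5--10"
  )
)

# Keep only the package manual entry before exporting.
is_manual <- unlist(entries$bibtype) == "Manual"
manual <- entries[is_manual]
```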

GSOC 2021 R project

Time is slipping away from me on multiple fronts. To ease this, I'm going to add the bibtex package as part of GSOC 2021.

@sckott any issue with being a co-mentor?

DONT WRITE BACK TO YOUR BIBTEX-FILE: custom fields are imported with column-names that includes the values in the fields...!!

So I've been trying to use this function for a while, but when using it and writing the file back to disk with df2bib(), Jabref suddenly told me

[Screenshot of the JabRef message: scrot_2021-03-03_20-54-10]

I found this puzzling, and tried troubleshooting the error. While doing this, I stumbled upon something else: when using the eprint field, an entry that looked like this (very minimal)

@inproceedings{joulin2017bag,
    title = "Bag of Tricks for Efficient Text Classification",
    author = "Joulin, Armand",
    year = "2017",
    eprint={1607.01759},
}

was transformed into this

@Inproceedings{joulin2017bag,
  Author = {Joulin, Armand},
  Title = {Bag of Tricks for Efficient Text Classification},
  Year = {2017},
  Eprint..1607.01759.. = {1607.01759}
}

This problem does not appear to be related to the issue with the empty text fields that I was actually investigating. It seems to be related to the package only having a specific set of fields, so that any field not in that list is parsed with the value IN the field as part of the column name. This means that
Eprint
gets converted into
Eprint..1607.01759..

So there are two serious bugs here, and only one of them is immediately reproducible. I give up for now :)

I conclude that this package should be used with caution, and you should definitely NOT write back into a bibtex file that you need for anything reliable.
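One observation that may help with debugging, offered as a guess rather than a diagnosis: the mangled name is exactly what base R's `make.names()` produces when the whole line `Eprint={1607.01759},` is treated as a column name. Note that the eprint line is the only one in the example written without spaces around `=`.

```r
# make.names() replaces every character that is invalid in an R name
# ("=", "{", "}", ",") with a dot.
raw_field <- "Eprint={1607.01759},"
mangled <- make.names(raw_field)
```

If that is what happens, writing the field as `eprint = {1607.01759},` might avoid the problem, but this has not been verified against the package code.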

Orphaned on CRAN

I see that this package has recently become listed on CRAN as orphaned – which is a real shame as I use it often. Is this a temporary blip or has maintenance of the package been discontinued?

Importing bibtex to Zotero classifies citation as "Book"

Hi,

I noticed that when I import a .bib entry into Zotero it automatically classifies it as a book, rather than software. I am not sure if this is a Zotero issue. I guess that there may be no info on the reference type, and Zotero assumes it is a book if this info is missing. So I am wondering whether this info could be added to the .bib entry when generating a citation with the bibtex package.

Thanks!

read.bib wrongly requires year field when date is valid

biblatex advises bib entries use a date field, rather than year, but read.bib discards such entries. The following should be valid:

example.bib

@Article{newspaper,
	author = {Author Smith},
	title  = {Article title},
	date   = {2016-12-21},
	journal = {Newspaper name},
}

\documentclass{article}
\usepackage[style=authoryear]{biblatex}

\addbibresource{example.bib}

\begin{document}
\nocite{*}
\printbibliography
\end{document}
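A minimal sketch of a fallback that would accept such entries, assuming the parsed entry is available as a named list. The function `entry_year()` is hypothetical and only shows the idea of deriving the year from a biblatex-style `date` field when `year` is absent.

```r
# Hypothetical helper: prefer 'year', otherwise take the leading four
# digits of a biblatex-style 'date' field (e.g. "2016-12-21").
entry_year <- function(entry) {
  if (!is.null(entry[["year"]])) {
    return(entry[["year"]])
  }
  date <- entry[["date"]]
  if (!is.null(date) && grepl("^[0-9]{4}", date)) {
    return(substr(date, 1L, 4L))
  }
  NA_character_
}

entry <- list(
  author = "Author Smith", title = "Article title",
  date = "2016-12-21", journal = "Newspaper name"
)
year <- entry_year(entry)
```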

List of author names joined by "AND" cannot be parsed

If a list of authors is joined by AND rather than and the BibTeX entry cannot be parsed, as reported here.

@Article{10.1371/journal.pone.0109458,
author = {Liu, Yang-Yu AND Nacher, Jose C. AND Ochiai, Tomoshiro AND Martino, Mauro AND Altshuler, Yaniv},
journal = {PLOS ONE},
publisher = {Public Library of Science},
title = {Prospect Theory for Online Financial Trading},
year = {2014},
month = {10},
volume = {9},
url = {https://doi.org/10.1371/journal.pone.0109458},
pages = {1-7},
number = {10},
doi = {10.1371/journal.pone.0109458}
}
> bibtex::read.bib("/tmp/test.bib")
ignoring entry '10.1371/journal.pone.0109458' (line 1) because :
	Invalid author/editor format.

If I replace AND by and it works.

> bibtex::read.bib("/tmp/test.bib")
Liu Y, Nacher JC, Ochiai T, Martino M and Altshuler Y (2014).Prospect Theory for Online Financial Trading._PLOS ONE_,
*9*(10), pp. 1-7. doi: 10.1371/journal.pone.0109458 (URL:
http://doi.org/10.1371/journal.pone.0109458), <URL:
https://doi.org/10.1371/journal.pone.0109458>.
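Until the parser handles this, one possible workaround is to split the author field case-insensitively before handing it on. This is a sketch of the idea in plain R, not the package's own splitting code:

```r
author <- paste(
  "Liu, Yang-Yu AND Nacher, Jose C. AND Ochiai, Tomoshiro",
  "AND Martino, Mauro AND Altshuler, Yaniv"
)

# "(?i)" makes the match case-insensitive, so "and", "AND" and "And"
# are all accepted as separators; the surrounding whitespace is required.
parts <- strsplit(author, "(?i)\\s+and\\s+", perl = TRUE)[[1]]
```

Requiring whitespace on both sides keeps the pattern from matching "and" inside a name.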
