ropensci / bibtex
bibtex parser for R
Home Page: https://docs.ropensci.org/bibtex/
While generating a .bib file for my installed packages, I get a mismatch partway through: from that point on, each entry in the .bib file has the key of the previous package. A minimal reproducible example:
test <- c("aod", "ape", "aplpack", "aqp", "argosfilter", "bibtex", "knitr")
install.packages(test, repos = "https://cloud.r-project.org/")
junk <- bibtex::write.bib(test, "bibtex_version.bib")
The mismatch in bibtex_version.bib is:
@Article{ape,
title = {ape 5.0: an environment for modern phylogenetics and evolutionary analyses in {R}},
author = {E. Paradis and K. Schliep},
journal = {Bioinformatics},
year = {2018},
volume = {35},
pages = {526-528},
}@Manual{aplpack,
title = {Algorithms for quantitative pedology: A toolkit for soil scientists},
author = {D.E. [email protected] Beaudette and P. Roudier and A.T. O'Geen},
journal = {Computers & Geosciences},
year = {2013},
volume = {52},
pages = {258 - 268},
url = {http://ncss-tech.github.io/AQP/},
}@Manual{aqp,
title = {argosfilter: Argos locations filter},
author = {Carla Freitas},
year = {2012},
note = {R package version 0.63},
url = {https://CRAN.R-project.org/package=argosfilter},
}
The warning messages are:
It is recommended to use ‘given’ instead of ‘middle’.
Converted 7 of 7 package citations to BibTeX
Writing 9 Bibtex entries ... OK
Results written to file 'bibtex_version.bib'
Warning messages:
1: In structure(x, package = package, class = c("citation", "bibentry")) :
Calling 'structure(NULL, *)' is deprecated, as NULL cannot have attributes.
Consider 'structure(list(), *)' instead.
2: In mapply(function(b, k) { :
longer argument not a multiple of length of shorter
I have tested with bibtex_0.4.2 under both R 3.5.3 and R 3.6.0. The similar knitr::write_bib() function does not generate this mismatch of keys, but it pulls very different information from the package info.
I think the problem may be in the CITATION file for aplpack, which is a quoted bibentry rather than the citHeader() / citEntry() / textVersion structure used by most other packages:
"bibentry(bibtype = Manual,"
" title = {{aplpack}: Another Plot Package (version 180312)},"
" author = {Hans Peter Wolf},"
" year = {2018},"
" url = https://cran.r-project.org/package=aplpack)"
I will contact that package's maintainers, but since that version made it past the package checkers, it might be nice to add some error trapping to this package, too.
Also, note that citation('aplpack') fails:
citation('aplpack')
Warning message:
In structure(x, package = package, class = c("citation", "bibentry")) :
Calling 'structure(NULL, *)' is deprecated, as NULL cannot have attributes.
Consider 'structure(list(), *)' instead.
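The error trapping suggested above could look roughly like the following. This is only a sketch: collect_citations() is a hypothetical helper, not part of bibtex, and it assumes that a package with a broken CITATION file either raises an error or yields an empty result from utils::citation(). The point is that every key is derived from the package it belongs to, so one bad package cannot shift the keys of the packages after it.

```r
# Hypothetical helper (not part of bibtex): collect citations one package at a
# time so that each key stays tied to the package it came from.
collect_citations <- function(pkgs) {
  out <- list()
  for (pkg in pkgs) {
    cit <- tryCatch(
      suppressWarnings(utils::citation(pkg)),
      error = function(e) NULL
    )
    # Skip packages whose citation cannot be built instead of mis-aligning keys.
    if (is.null(cit) || length(cit) == 0L) {
      message("Skipping '", pkg, "': no usable citation")
      next
    }
    # One key per entry; packages with several citations get numbered keys.
    keys <- if (length(cit) == 1L) pkg else paste0(pkg, seq_along(cit))
    for (i in seq_along(cit)) {
      out[[keys[i]]] <- cit[i]
    }
  }
  out
}

res <- collect_citations(c("stats", "notARealPackage123"))
names(res)
```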
@mwmclean this test fails on windows, probably some encoding issue:
test_that("{Herm{\\`e}s International S.A.} and Katzfu{\\ss}, Matthias", {
authors <- "{Herm{\\`e}s International S.A.} and Katzfu{\\ss}, Matthias"
parsed <- bibtex:::ArrangeAuthors(authors)
expect_match(parsed$family[[1]], "Hermès International S.A.")
expect_match(parsed$family[[2]], "Katzfuß")
})
See for example this on r-hub.
https://builder.r-hub.io/status/bibtex_0.4.1.tar.gz-eb660cb5067d4363a25f20980ed19e9b#L251
I'm not sure how to fix it, short of simply disabling the test ...
Nobiliary particles (or any family name containing whitespace) are not handled correctly by arrange.single.author:
bibtex:::arrange.single.author('Van Damme, Jean-Claude')$given
## "Jean-Claude" "Van"
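For comparison, here is a minimal sketch of the result one would expect for the "Family, Given" form. split_family_given() is a hypothetical helper written for illustration, not the package's implementation, and it assumes exactly one comma in the input: everything before the comma is the family name, whitespace included.

```r
# Hypothetical helper: treat everything before the comma as the family name.
# Only the "Family, Given" form with a single comma is handled here.
split_family_given <- function(x) {
  parts <- trimws(strsplit(x, ",", fixed = TRUE)[[1]])
  utils::person(given = parts[2], family = parts[1])
}

p <- split_family_given("Van Damme, Jean-Claude")
p$family
p$given
```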
@mwmclean can you import these from where they are from:
ArrangeSingleAuthor: no visible global function definition for
‘parseLatex’
ArrangeSingleAuthor: no visible global function definition for
‘deparseLatex’
ArrangeSingleAuthor: no visible global function definition for
‘latexToUtf8’
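All three functions are exported by the tools package that ships with R, so the note should go away once they are imported or called with an explicit tools:: prefix. The sketch below shows both options; latex_to_text() is a hypothetical wrapper used only to exercise the calls, and no particular output is claimed for it.

```r
# In the package NAMESPACE this would be a single directive:
#   importFrom(tools, parseLatex, deparseLatex, latexToUtf8)
# or, with roxygen2, a tag above the function that uses them:
#   #' @importFrom tools parseLatex deparseLatex latexToUtf8

# Alternatively, qualify each call explicitly (hypothetical wrapper):
latex_to_text <- function(x) {
  tools::deparseLatex(tools::latexToUtf8(tools::parseLatex(x)))
}

out <- latex_to_text("Katzfu{\\ss}")
```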
biblatex advises that bib entries use a date field rather than year, but read.bib discards such entries. The following should be valid:
example.bib
@Article{newspaper,
author = {Author Smith},
title = {Article title},
date = {2016-12-21},
journal = {Newspaper name},
}
\documentclass{article}
\usepackage[style=authoryear]{biblatex}
\addbibresource{example.bib}
\begin{document}
\nocite{*}
\printbibliography
\end{document}
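One conceivable way to accept such entries is to derive year from date when year is missing. The sketch below is an assumption about how that could be done, not what read.bib does: add_year_from_date() is a hypothetical helper that works on a named character vector holding the fields of one entry, and it only handles dates that begin with a four-digit year.

```r
# Hypothetical helper: fill in 'year' from a biblatex-style 'date' field.
# 'fields' is a named character vector holding the fields of one entry.
add_year_from_date <- function(fields) {
  has_year <- "year" %in% names(fields)
  has_date <- "date" %in% names(fields)
  if (!has_year && has_date && grepl("^[0-9]{4}", fields[["date"]])) {
    fields[["year"]] <- substr(fields[["date"]], 1, 4)
  }
  fields
}

entry <- c(
  author  = "Author Smith",
  title   = "Article title",
  date    = "2016-12-21",
  journal = "Newspaper name"
)
with_year <- add_year_from_date(entry)
with_year[["year"]]
```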
Thanks for this great package. I encountered a problem when using bibtex package to parse BibTeX files with Chinese characters on Windows:
# Get current locale info
Sys.getlocale()
#> [1] "LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252"
# Set locale to Chinese
Sys.setlocale(locale = "Chinese")
#> [1] "LC_COLLATE=Chinese (Simplified)_China.936;LC_CTYPE=Chinese (Simplified)_China.936;LC_MONETARY=Chinese (Simplified)_China.936;LC_NUMERIC=C;LC_TIME=Chinese (Simplified)_China.936"
bib_text <- "
@misc{text,
title = {{你好}},
language = {zh-CN},
author = {{你好}},
month = jun,
year = {2013},
pages = {163}
}
"
# change encoding to "UTF-8"
bib_text_utf8 <- enc2utf8(bib_text)
Encoding(bib_text_utf8)
#> [1] "UTF-8"
# make sure the saved BibTeX file is UTF-8 encoded
con <- file("test.bib", encoding = "UTF-8")
writeLines(bib_text_utf8, con)
close(con)
readLines("test.bib", encoding = "UTF-8")
#> [1] "" " @misc{text,"
#> [3] " title = {{你好}}," " language = {zh-CN},"
#> [5] " author = {{你好}}," " month = jun,"
#> [7] " year = {2013}," " pages = {163}"
#> [9] " }" " "
read.bib could not parse the Chinese characters, whether or not encoding was set to "UTF-8".
str(bibtex::read.bib("test.bib"))
#> List of 1
#> $ text:Class 'bibentry' hidden list of 1
#> ..$ text:List of 6
#> .. ..$ title : chr "{浣犲ソ}"
#> .. ..$ language: chr "zh-CN"
#> .. ..$ author :Class 'person' hidden list of 1
#> .. .. ..$ :List of 5
#> .. .. .. ..$ given : NULL
#> .. .. .. ..$ family : chr "浣犲ソ"
#> .. .. .. ..$ role : NULL
#> .. .. .. ..$ email : NULL
#> .. .. .. ..$ comment: NULL
#> .. ..$ month : chr "jun"
#> .. ..$ year : chr "2013"
#> .. ..$ pages : chr "163"
#> .. ..- attr(*, "bibtype")= chr "Misc"
#> .. ..- attr(*, "key")= chr "text"
#> - attr(*, "class")= chr "bibentry"
#> - attr(*, "strings")= Named chr(0)
#> ..- attr(*, "names")= chr(0)
str(bibtex::read.bib("test.bib", encoding = "UTF-8"))
#> List of 1
#> $ text:Class 'bibentry' hidden list of 1
#> ..$ text:List of 6
#> .. ..$ title : chr "{浣犲ソ}"
#> .. ..$ language: chr "zh-CN"
#> .. ..$ author :Class 'person' hidden list of 1
#> .. .. ..$ :List of 5
#> .. .. .. ..$ given : NULL
#> .. .. .. ..$ family : chr "浣犲ソ"
#> .. .. .. ..$ role : NULL
#> .. .. .. ..$ email : NULL
#> .. .. .. ..$ comment: NULL
#> .. ..$ month : chr "jun"
#> .. ..$ year : chr "2013"
#> .. ..$ pages : chr "163"
#> .. ..- attr(*, "bibtype")= chr "Misc"
#> .. ..- attr(*, "key")= chr "text"
#> - attr(*, "class")= chr "bibentry"
#> - attr(*, "strings")= Named chr(0)
#> ..- attr(*, "names")= chr(0)
Here is my session info
sessionInfo()
#> R version 3.5.0 (2018-04-23)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 17134)
#>
#> Matrix products: default
#>
#> locale:
#> [1] LC_COLLATE=English_United States.1252
#> [2] LC_CTYPE=English_United States.1252
#> [3] LC_MONETARY=English_United States.1252
#> [4] LC_NUMERIC=C
#> [5] LC_TIME=English_United States.1252
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
After digging a little, I found that encoding the input of make.bib.entry as "UTF-8" solves this problem, but I am not sure whether this is a proper solution.
devtools::install_github("hongyuanjia/bibtex")
str(bibtex::read.bib("test.bib"))
#> List of 1
#> $ text:Class 'bibentry' hidden list of 1
#> ..$ text:List of 6
#> .. ..$ title : chr "{你好}"
#> .. ..$ language: chr "zh-CN"
#> .. ..$ author :Class 'person' hidden list of 1
#> .. .. ..$ :List of 5
#> .. .. .. ..$ given : NULL
#> .. .. .. ..$ family : chr "你好"
#> .. .. .. ..$ role : NULL
#> .. .. .. ..$ email : NULL
#> .. .. .. ..$ comment: NULL
#> .. ..$ month : chr "jun"
#> .. ..$ year : chr "2013"
#> .. ..$ pages : chr "163"
#> .. ..- attr(*, "bibtype")= chr "Misc"
#> .. ..- attr(*, "key")= chr "text"
#> - attr(*, "class")= chr "bibentry"
#> - attr(*, "strings")= Named chr(0)
#> ..- attr(*, "names")= chr(0)
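The garbled output earlier in this report is what the UTF-8 bytes of the title look like when they are read as native code page 936 text. The sketch below illustrates the general idea of declaring strings as UTF-8; mark_utf8() is a hypothetical helper, and whether this matches the change made in the fork is an assumption on my part.

```r
# Hypothetical helper: declare that a string's bytes are UTF-8 without
# changing the bytes themselves.
mark_utf8 <- function(x) {
  Encoding(x) <- "UTF-8"
  x
}

# The six UTF-8 bytes of the two-character title, as they might arrive from
# C code with no encoding mark.
raw_title <- rawToChar(as.raw(c(0xe4, 0xbd, 0xa0, 0xe5, 0xa5, 0xbd)))
Encoding(raw_title)

fixed <- mark_utf8(raw_title)
Encoding(fixed)
utf8ToInt(fixed)  # code points of the two characters
```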
From an email exchange with Kurt:

romain writes:
On 2013-07-29 21:06, Kurt Hornik wrote:
Romain Francois writes:
Hello,
I've seen that you've released a new version of bibtex. Thanks and oops for the missing fclose.

Sure---was reported by someone on Windows who could not file.remove the .bib he had just read in ...

There was a tweet from Gavin Simpson: https://twitter.com/ucfagls/status/360621762578354176
"Anyone aware of a bibtex parser for #rstats that doesn't need to read from files? Looking to parse strings pulled from a web API"
I guess it would not be too hard to give read.bib the ability to read from a character vector rather than from a file. I might have a go at this.

Great. Ideally, this would work from arbitrary connections
Best
-k

Maybe I need a fresh look at connections, but last time I tried to use them in C(++), I could not find an api.

Some is exposed via R_ext/Connections.h (with a note that code should test for R_CONNECTIONS_VERSION).
(Seems that this is currently not used in any CRAN package ...)
Best
-k

One way is to call back an R function to get some more characters to feed in to the lexer, but this would not be very efficient.
Romain
Hello,
I am trying to use the bibtex package to mine a bibtex file for certain keywords.
When trying the read.bib method, I get the following error:
bib <- read.bib( package = "bibtex" )
Error in read.bib(package = "bibtex") : lex fatal error:
fatal flex scanner internal error--end of buffer missed
sessionInfo()
R version 3.1.3 (2015-03-09)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 8 x64 (build 9200)
locale:
[1] LC_COLLATE=German_Switzerland.1252 LC_CTYPE=German_Switzerland.1252 LC_MONETARY=German_Switzerland.1252
[4] LC_NUMERIC=C LC_TIME=German_Switzerland.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] bibtex_0.4.0
loaded via a namespace (and not attached):
[1] tools_3.1.3
Thanks
If a list of authors is joined by AND rather than and, the BibTeX entry cannot be parsed, as reported here.
@Article{10.1371/journal.pone.0109458,
author = {Liu, Yang-Yu AND Nacher, Jose C. AND Ochiai, Tomoshiro AND Martino, Mauro AND Altshuler, Yaniv},
journal = {PLOS ONE},
publisher = {Public Library of Science},
title = {Prospect Theory for Online Financial Trading},
year = {2014},
month = {10},
volume = {9},
url = {https://doi.org/10.1371/journal.pone.0109458},
pages = {1-7},
number = {10},
doi = {10.1371/journal.pone.0109458}
}
> bibtex::read.bib("/tmp/test.bib")
ignoring entry '10.1371/journal.pone.0109458' (line 1) because :
Invalid author/editor format.
If I replace AND by and, it works.
> bibtex::read.bib("/tmp/test.bib")
Liu Y, Nacher JC, Ochiai T, Martino M and Altshuler Y (2014).
“Prospect Theory for Online Financial Trading.” _PLOS ONE_,
*9*(10), pp. 1-7. doi: 10.1371/journal.pone.0109458 (URL:
http://doi.org/10.1371/journal.pone.0109458), <URL:
https://doi.org/10.1371/journal.pone.0109458>.
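Until the parser accepts an upper-case separator, one workaround is to normalize the author field before parsing. The sketch below uses a hypothetical helper, normalize_and(), and makes a simplifying assumption: any whitespace-delimited AND is a separator, which would be wrong for a name that legitimately contains the word AND inside braces.

```r
# Hypothetical pre-processing step: lower-case the AND separators.
# Assumes no author name contains a whitespace-delimited "AND" of its own.
normalize_and <- function(author) {
  gsub("[[:space:]]+AND[[:space:]]+", " and ", author)
}

author <- "Liu, Yang-Yu AND Nacher, Jose C. AND Ochiai, Tomoshiro"
normalized <- normalize_and(author)
people <- strsplit(normalized, " and ", fixed = TRUE)[[1]]
people
```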
Some R packages have multiple citations associated with them. For example, knitr::write_bib("testthat") returns
@Manual{R-testthat,
title = {testthat: Unit Testing for R},
author = {Hadley Wickham},
year = {2021},
note = {R package version 3.0.2},
url = {https://CRAN.R-project.org/package=testthat},
}
@Article{testthat2011,
author = {Hadley Wickham},
title = {testthat: Get Started with Testing},
journal = {The R Journal},
year = {2011},
volume = {3},
pages = {5--10},
url = {https://journal.r-project.org/archive/2011-1/RJournal_2011-1_Wickham.pdf},
}
The first one is the citation for the package, so it's the one I want. But bibtex::write.bib("testthat") gives the second one, and doesn't warn that it's doing so.
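A sketch of how the package entry could be picked explicitly, with a message when more than one citation exists. The two entries are rebuilt by hand with utils::bibentry() so the example does not depend on testthat being installed. Selecting by bibtype "Manual" is my assumption about what identifies the package citation; it is not how write.bib chooses.

```r
# Rebuild the two citations from the report by hand.
cit <- c(
  utils::bibentry(
    "Manual",
    key = "R-testthat",
    title = "testthat: Unit Testing for R",
    author = utils::person("Hadley", "Wickham"),
    year = 2021
  ),
  utils::bibentry(
    "Article",
    key = "testthat2011",
    title = "testthat: Get Started with Testing",
    author = utils::person("Hadley", "Wickham"),
    journal = "The R Journal",
    year = 2011
  )
)

# Warn the user instead of silently choosing one entry.
if (length(cit) > 1L) {
  message("Found ", length(cit), " citations; keeping the @Manual entry")
}
types <- unlist(cit$bibtype)
manual <- cit[which(types == "Manual")]
unlist(manual$key)
```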
I see that this package has recently become listed on CRAN as orphaned – which is a real shame as I use it often. Is this a temporary blip or has maintenance of the package been discontinued?
Instead of NewList, GrowList and Insert
https://raw.githubusercontent.com/kalibera/cran-checks/master/rchk/results/bibtex.out
Package bibtex version 0.4.2.3
Package built using 79226/R 4.1.0; x86_64-pc-linux-gnu; 2020-09-19 23:50:40 UTC; unix
Checked with rchk version 36c7ad2294619ba0a81109c9acb675eea2c96e6d
More information at https://github.com/kalibera/cran-checks/blob/master/rchk/PROTECT.md
Suspicious call (two or more unprotected arguments) to Rf_setAttrib at makeSrcRef bibtex/src/bibparse.y:1178
Function asVector
[PB] has negative depth bibtex/src/bibparse.y:1159
[PB] has possible protection stack imbalance bibtex/src/bibparse.y:1160
Function do_read_bib
[PB] has possible protection stack imbalance bibtex/src/bibparse.y:428
Function junk1
[PB] has negative depth bibtex/src/bibparse.y:1081
[PB] has possible protection stack imbalance bibtex/src/bibparse.y:1082
Function recordComment
[PB] has negative depth bibtex/src/bibparse.y:994
[PB] has possible protection stack imbalance bibtex/src/bibparse.y:995
Function recordInclude
[PB] has negative depth bibtex/src/bibparse.y:986
[PB] has possible protection stack imbalance bibtex/src/bibparse.y:987
Function recordPreamble
[PB] has negative depth bibtex/src/bibparse.y:1008
[PB] has possible protection stack imbalance bibtex/src/bibparse.y:1009
Function recordString
[PB] has negative depth bibtex/src/bibparse.y:1001
[PB] has possible protection stack imbalance bibtex/src/bibparse.y:1002
Function setToken
[PB] has possible protection stack imbalance bibtex/src/bibparse.y:1058
Function xx_assignement
[PB] has negative depth bibtex/src/bibparse.y:875
[PB] has possible protection stack imbalance bibtex/src/bibparse.y:879
Function xx_assignement_list2
[PB] has negative depth bibtex/src/bibparse.y:854
[PB] has possible protection stack imbalance bibtex/src/bibparse.y:858
Function xx_atobject_comment
[PB] has possible protection stack imbalance bibtex/src/bibparse.y:515
Function xx_atobject_entry
[UP] unprotected variable object while calling allocating function Rf_allocVector bibtex/src/bibparse.y:543
[UP] unprotected variable object while calling allocating function Rf_allocVector bibtex/src/bibparse.y:547
[UP] unprotected variable object while calling allocating function Rf_install bibtex/src/bibparse.y:550
[UP] unprotected variable object while calling allocating function Rf_setAttrib(V,S:entry,V) bibtex/src/bibparse.y:550
[UP] unprotected variable object while calling allocating function Rf_install bibtex/src/bibparse.y:551
[UP] unprotected variable object while calling allocating function Rf_setAttrib(V,S:names,V) bibtex/src/bibparse.y:551
[UP] unprotected variable object while calling allocating function Rf_install bibtex/src/bibparse.y:552
[UP] unprotected variable object while calling allocating function Rf_setAttrib(V,S:key,V) bibtex/src/bibparse.y:552
[PB] has negative depth bibtex/src/bibparse.y:573
[PB] has possible protection stack imbalance bibtex/src/bibparse.y:575
Function xx_atobject_include
[PB] has possible protection stack imbalance bibtex/src/bibparse.y:593
Function xx_atobject_preamble
[PB] has possible protection stack imbalance bibtex/src/bibparse.y:609
Function xx_atobject_string
[PB] has possible protection stack imbalance bibtex/src/bibparse.y:625
Function xx_entry_head
[PB] has negative depth bibtex/src/bibparse.y:684
[PB] has possible protection stack imbalance bibtex/src/bibparse.y:689
Function xx_null
[PB] has possible protection stack imbalance bibtex/src/bibparse.y:975
Function xx_object_list_2
[PB] has negative depth bibtex/src/bibparse.y:474
[PB] has possible protection stack imbalance bibtex/src/bibparse.y:478
Function xx_token_entry
[PB] has negative depth bibtex/src/bibparse.y:642
[PB] has possible protection stack imbalance bibtex/src/bibparse.y:647
Function xx_value
[PB] has negative depth bibtex/src/bibparse.y:814
[PB] has possible protection stack imbalance bibtex/src/bibparse.y:819
Function yydestruct
[PB] has possible protection stack imbalance bibtex/src/bibparse.c:2734
[PB] has negative depth bibtex/src/bibparse.y:211
Function yyparse
[PB] has negative depth bibtex/src/bibparse.c:3409
[PB] has negative depth bibtex/src/bibparse.c:3413
[PB] has negative depth bibtex/src/bibparse.c:3537
[PB] has possible protection stack imbalance bibtex/src/bibparse.c:3565
Hi,
thanks for developing and maintaining this package.
I am having an issue with bibtex returning no entry when reading a bib file (with more than 1000 entries), and no error message is given.
After using a debugger to find out what is going on, I found that there are a few places with code like:
guess_eoentry <- max(grep("\\}$", entry_lines))
...
guess_eostring <- max(grep("\\}$", string_line))
This crashes the parser (with no error message) when there are spaces after the closing }, e.g. "} ".
I'm not super-familiar with regex in R, but the pattern should likely be something like "\\}\\s*$".
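The difference between the two patterns can be checked on a small example. The entry lines below are made up for illustration, and the POSIX class [[:space:]] is used in place of \\s, which should be equivalent for this purpose. With the strict pattern, grep() finds nothing, so max() is applied to an empty vector.

```r
# An entry whose closing brace is followed by trailing spaces.
entry_lines <- c("@Misc{key,", "  year = 2012", "}   ")

# Strict pattern: no line ends exactly in "}", so grep() returns integer(0)
# and max() of an empty vector is -Inf (with a warning).
strict <- grep("\\}$", entry_lines)
strict_max <- suppressWarnings(max(strict))

# Tolerant pattern: allow optional whitespace between "}" and end of line.
tolerant <- grep("\\}[[:space:]]*$", entry_lines)
guess_eoentry <- max(tolerant)
guess_eoentry
```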
Hi,
I am using EndNote as a citation manager. Unfortunately, I am not able to import .bib entries generated with bibtex directly into EndNote. Currently I import the entries into Zotero, export them as .ris from there, and reimport them into EndNote. Apart from a minor issue when using Zotero (see #53), this works, but a direct option to import into EndNote would of course be desirable.
Thanks!
I have the following bib file, example.bib:
@Article{GirLeq2016_CSTM_EntropyBasedGOFTests,
title = {Entropy-Based Goodness-of-Fit Tests -- a Unifying Framework. Application to DNA Replication},
author = {Val{\'e}rie Girardin and Justine Lequesne},
journal = {Communications in Statistics--Theory and Methods},
pages = {62--74},
volume = {48},
number = {1},
year = {2017},
doi = {10.1080/03610926.2017.1401084},
publisher = {Taylor \& Francis},
}
If I read this and want to proceed working with it, the bibtex-package turns my ASCII text into non-ASCII.
## no non-ASCII
tools::showNonASCII(readLines("example.bib"))
library("bibtex")
entry <- read.bib("example.bib")
## bibtex package turns Val{\'e}rie into Valérie
print(entry, style = "citation")
Is there a way to avoid that? Thanks!
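If the goal is ASCII-only text after reading, one possible workaround is to convert the accented characters back to LaTeX escapes with tools::encoded_text_to_latex() from base R. This is a sketch of that idea, not a feature of read.bib, and the escape it produces may not be byte-for-byte the same as the one in the original file.

```r
# A name as it looks after the accent has been converted to UTF-8.
x <- "Val\u00e9rie Girardin"

# Convert non-ASCII characters back to LaTeX escapes.
ascii <- tools::encoded_text_to_latex(x, encoding = "UTF-8")
ascii

# Prints nothing when only ASCII characters remain.
tools::showNonASCII(ascii)
```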
When the author or editor field contains braces, a trailing space is produced in the given component of the person object returned by arrange.single.author:
bibtex:::arrange.single.author('Jean-Claude {Van Damme}')$given
## "Jean-Claude "
Hi,
I have seen this when trying to create a bibtex file for cffr:
library(bibtex)
packageVersion("bibtex")
entry <- bibentry(
"Article",
doi = "10.21105/joss.03900",
url = "https://doi.org/10.21105/joss.03900",
year = 2021,
publisher = "The Open Journal",
volume = 6,
number = 67,
pages = 3900,
author = person("Diego", "Hernangómez"),
title = "cffr: Generate Citation File Format Metadata for R Packages",
journal = "Journal of Open Source Software"
)
write.bib(entry)
But the "Rpackages.bib" file is:
@Article{,
doi = {10.21105/joss.03900},
url = {https://doi.org/10.21105/joss.03900},
year = {2021},
publisher = {The Open Journal},
volume = {6},
number = {67},
pages = {3900},
author = {Diego Hernang?mez},
title = {cffr: Generate Citation File Format Metadata for R Packages},
journal = {Journal of Open Source Software},
}
Note the author field, author = {Diego Hernang?mez}, which is not properly displayed.
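A question mark in place of an accented letter usually means the text was converted to an encoding that cannot represent it while being written. The sketch below shows one way to write lines as UTF-8 regardless of the session locale: convert with enc2utf8() and write the bytes unchanged. The file name and key are made up, and this is not necessarily how write.bib should be fixed.

```r
# Lines of a .bib entry containing a non-ASCII author name (hypothetical key).
bib_lines <- c(
  "@Article{hernangomez2021,",
  "  author = {Diego Hernang\u00f3mez},",
  "}"
)

tmp <- tempfile(fileext = ".bib")
con <- file(tmp, open = "wb")
# useBytes = TRUE writes the UTF-8 bytes as they are, with no re-encoding
# into the native locale.
writeLines(enc2utf8(bib_lines), con, useBytes = TRUE)
close(con)

back <- readLines(tmp, encoding = "UTF-8")
back[2]
```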
Looks like there is an underlying change in one of the string functions used between R 4.1 and R 4.2 causing the oldrel action to error during the unit testing portion:
https://github.com/ropensci/bibtex/actions/runs/3125413081/jobs/5069755857#step:6:362
─ Failure (test-examples.R:53:3): Read graphics ───────────────────────────────
Snapshot of `bib` has changed:
old[5:18] vs new[5:18]
"and Methods_, 1025-1041."
""
"Friendly M (1982). \"Graphical methods for categorical data.\" _SAS User"
- "Group International Conference Proceedings_, 190-200."
+ "Group International Conference Proceedings_, 190-200. <URL:"
- "<http://www.math.yorku.ca/SCS/sugi/sugi17-paper.html>."
+ "http://www.math.yorku.ca/SCS/sugi/sugi17-paper.html>."
""
"Meyer D, Zeileis A, Hornik K (2005). \"The strucplot framework:"
"Visualizing multi-way contingency tables with vcd.\" Department of"
"Statistics and Mathematics, Wirtschaftsuniversität, Wien. Report 22,"
- "Research Report Series,"
+ "Research Report Series, <URL:"
- "<http://epub.wu-wien.ac.at/dyn/openURL?id=oai:epub.wu-wien.ac.at:epub-wu-01_8a1>."
+ "http://epub.wu-wien.ac.at/dyn/openURL?id=oai:epub.wu-wien.ac.at:epub-wu-01_8a1>."
""
"Chambers JM, Cleveland WS, Kleiner B, Tukey PA (1983). _Graphical"
"Methods for Data Analysis_. Wadsworth & Brooks/Cole."
old[29:36] vs new[29:36]
"Hobart."
""
"Friendly M (1994). \"A fourfold display for 2 by 2 by k tables.\" York"
- "University, Psychology Department. Technical Report 217,"
+ "University, Psychology Department. Technical Report 217, <URL:"
- "<http://www.math.yorku.ca/SCS/Papers/4fold/4fold.ps.gz>."
+ "http://www.math.yorku.ca/SCS/Papers/4fold/4fold.ps.gz>."
""
"Murrell PR (1999). \"Layouts: A mechanism for arranging plots on a"
"page.\" _Journal of Computational and Graphical Statistics_, 121-134."
old[45:52] vs new[45:52]
"Friendly M (1994). \"Mosaic displays for multi-way contingency tables.\""
"_Journal of the American Statistical Association_, 190-200."
""
- "Friendly M (????). \"The home page of Michael Friendly.\""
+ "Friendly M (????). \"The home page of Michael Friendly.\" <URL:"
- "<http://www.math.yorku.ca/SCS/friendly.html>."
+ "http://www.math.yorku.ca/SCS/friendly.html>."
""
"Cleveland WS (1985). _The elements of graphing data_. Wadsworth,"
"Monterey, CA, USA."
old[65:72] vs new[65:72]
"_The American Statistician_, 303-305."
""
"Blanc C, Schlick C (1995). \"X-splines : A Spline Model Designed for the"
- "End User.\" In _Proceedings of SIGGRAPH 95_, 377-386."
+ "End User.\" In _Proceedings of SIGGRAPH 95_, 377-386. <URL:"
- "<http://dept-info.labri.fr/~schlick/DOC/sig1.html>."
+ "http://dept-info.labri.fr/~schlick/DOC/sig1.html>."
""
"Murrell P (1998). _Investigations in Graphical Statistics_. Ph.D."
"thesis, The University of Auckland."
> bib <- read.bib( package = "datasets" )
Error in FUN(c("Chambers, J. M., Cleveland, W. S., Kleiner, B.", "Tukey, P. A." :
Invalid name format in bibentry.
Calls: read.bib -> lapply -> FUN -> ArrangeAuthors -> lapply -> FUN
I am fetching citations from Zotero with BetterBibtex-interfacing {rbbt} by @paleolimbot. I would love to be able to edit bibtex entry before it gets written to ".bib" file, i.e. while it is still in string format.
Ideally I would do that with {RefManageR}, but currently {RefManageR} relies on {bibtex} to parse .bib. If {bibtex} had an exposed function for reading a literal bib string, then I could use {RefManageR} to edit it. Would you consider adding a parser for literal bibtex?
my_ref <- " @book{McElreath_2020, edition={2},
title={Statistical Rethinking: A Bayesian Course with Examples in R and Stan}, ISBN={978-0-429-02960-8},
url={https://www.taylorfrancis.com/books/9780429642319}, DOI={10.1201/9780429029608},
publisher={Chapman and Hall/CRC}, author={McElreath, Richard}, year={2020}, month={Mar} }"
my_bibentry <- read.bib(text=my_ref)
It needs to be vectorized, of course, i.e. accept character vectors of length() > 1.
Thank you for providing such an important infrastructure package for bibliography management in R.
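Until something like a text argument exists, a temporary file can bridge the gap. The sketch below uses a hypothetical helper, read_from_text(), which writes a character vector to a temporary .bib file and hands the path to whatever reader function is supplied. It is demonstrated with readLines() so that it runs without bibtex installed.

```r
# Hypothetical helper: write 'text' (a character vector of any length) to a
# temporary file, pass the path to 'reader', and clean up afterwards.
read_from_text <- function(text, reader) {
  tmp <- tempfile(fileext = ".bib")
  on.exit(unlink(tmp), add = TRUE)
  writeLines(enc2utf8(text), tmp, useBytes = TRUE)
  reader(tmp)
}

my_ref <- c(
  "@book{McElreath_2020,",
  "  title = {Statistical Rethinking},",
  "  year = {2020}",
  "}"
)

# Demonstration with a plain line reader.
lines_back <- read_from_text(my_ref, readLines)

# With bibtex installed, its reader could be passed instead:
# entry <- read_from_text(my_ref, function(f) bibtex::read.bib(f))
```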
Hey all,
I encountered some bumps in the road unrelated to the rchk issues that I've recently been looking into, and thought it pertinent to bring up: how do existing contributors develop for bibtex locally?
For example, I'm on Ubuntu 20.04, so my bison version is 3.5.1 and flex is 2.6.4. This proved to be a headache when running regram.sh to rebuild the lexer and parser C source/headers.
For example, the first line of src/bibparse.c:
/* A Bison parser, made by GNU Bison 2.3. */
suggests that it was built with a version of Bison that is 15 years old!
I ended up compiling Bison 2.3 from source and modifying regram.sh to use that as a temporary workaround. Am I missing something here? How is this package maintained/developed?
This error occurs when I read a .bib file. At first I thought it happened because the file is huge, with something like 5000 citations, so I exported only 4 citations from this set to a .bib file. But even this 4-citation file does not work; I get the same error.
Thanks for this terrific package!
I ran into something that seems like a bug, but I can't quite figure out where it comes from. I'm using bibtex (your package) to generate full citations in the body of the text in RMarkdown. However, when I generate the final document (a pdf, but the problem appears when I try other formats like html), some citations include commas that don't exist in the bib file.
Here's a simple example.
My rmd file (inspired by an earlier stack overflow suggestion here):
---
output: pdf_document
---
`r refs <- bibtex::read.bib("minimalbib.bib")`
`r capture.output(refs["BusemeyerTober2022"])`
minimalbib.bib contains the following:
@article{BusemeyerTober2022,
author = {Marius R. Busemeyer and Tobias Tober},
journal = {Comparative Political Studies},
pages = {00104140221139381},
title = {Dealing with Technological Change: Social Policy Preferences and Institutional Context},
year = {2022}}
The pdf (and html) renders the following:
Busemeyer MR, Tober T (2022). "Dealing with Technological Change:, Social Policy Preferences and Institutional Context." Comparative, Political Studies, 00104140221139381.
Note two new commas: one after the colon ("Change:, Social") and one in the name of the journal ("_Comparative, Political").
Yet when I check the citation in the R console, I get the correct markdown code:
[1] "Busemeyer MR, Tober T (2022). “Dealing with Technological Change: Social"
[2] "Policy Preferences and Institutional Context.” _Comparative Political"
[3] "Studies_, 00104140221139381."
I would add that it doesn't affect all references (eyeballing my file, I would say about 1 in 4). I don't know if I'm doing something wrong, if rmarkdown is causing the problem, or if there's a bug, but I was wondering if anyone had an idea of what's happening here. Thanks!
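One possible explanation, offered as an assumption rather than a confirmed diagnosis: capture.output() returns one string per printed line, and as far as I understand knitr's default inline hook, a character vector in inline code is joined with ", ". The extra commas would then sit exactly where the printed citation wraps, which would also explain why only some references are affected. The sketch below reproduces the effect with plain paste(), using line breaks chosen to match the rendered output.

```r
# The citation as a multi-element character vector, one element per line.
cite_lines <- c(
  "Busemeyer MR, Tober T (2022). \"Dealing with Technological Change:",
  "Social Policy Preferences and Institutional Context.\" _Comparative",
  "Political Studies_, 00104140221139381."
)

# Joining with ", " reproduces the stray commas seen in the rendered document.
joined_with_commas <- paste(cite_lines, collapse = ", ")

# Collapsing with a space first yields a single clean string.
joined_with_spaces <- paste(cite_lines, collapse = " ")
joined_with_spaces
```

In the Rmd file, the inline expression would then become paste(capture.output(refs["BusemeyerTober2022"]), collapse = " ").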
Dear Romain,
I came across your nice package to handle bibtex entries in R, which I
played around with over the past few days. I came across some things I
want to share with you:
To display what I mean I give some examples:
S. Lavou'{e}
or
S. Lavou'e
in the bibfile wouldn't make a difference for LaTeX, but reading a bibentry
@Article{Lavo,
author = {S. Lavou'{e}},
title = {A new species of \textit{Petrocephalus} \textsc{Marcusen}
1854 ({O}steoglossomorpha:
{M}ormyridae) from the {S}anaga {R}iver basin, {C}ameroon},
journal = {Zootaxa},
year = {2011},
volume = {2934},
pages = {20-28},
}
will give after calling
bib$author
[1] "S. Lavou' e"
,
bib$author$family
[1] "e"
and
bib$author$given
[1] "S. Lavou'"
This all is not a problem for the character é, as one can also write
'e, and LaTeX will know what to do. But for other characters this
indeed is a problem. E.g. in the following bibtex entry
@Article{Bart,
author = {L'{a}szl'{o} Bartha and Attila {Moln'ar V.} and Nicolae
Drago\c{s}
and G'{a}bor Sramk'o},
title = {Molecular evidence for reticulate speciation in
\textit{Astragalus}
(Fabaceae) as revealed by a case study from section
\textit{Dissitiflori}},
journal = {Botany},
year = {2013},
volume = {91},
pages = {702-714},
}
One couldn't substitute the \c{s}, by \cs, because LaTeX can't handle
this, but bibtex.R gives the following results for the bibentry above:
bib$author[3]
[1] "Nicolae Drago\c s"
,
bib$author[3]$given
[1] "Nicolae Drago\c"
and
bib$author[3]$family
[1] "s"
The same applies to
J. Fjelds\r{a}
In a bibentry.
bib$title
[1] "Molecular evidence for reticulate speciation in
\textit{Astragalus}\n\t(Fabaceae) as revealed by a case study from
section \textit{Dissitiflori}"
(note the \n\t in the title)
Do you see a chance, that these limitations (or do they have a reason?)
will disappear in a future release of the bibtex package?
Best wishes
Ingo
Romain,
Brian Ripley fixed the gcc10 and LTO issues for you, pls merge with your master sources.
Best,
From Kurt:
Apparently your package code still uses as.personList():
bibtex/R/bibtex.R:
as.personList(authors)
Since R 2.14.0, person objects can be combined into person objects via
c(), and there is no need for a separate personList class for
collections of persons.
Can you please change your code to no longer use as.personList()?
As far as we can tell, in your case 'authors' is built as a list of
person objects, so you should be able to simply replace
as.personList(authors)
by
do.call(c, authors)
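A short check of the suggested replacement, using two made-up person objects. It relies only on utils::person() and its c() method from base R.

```r
# A list of person objects, as 'authors' is described above.
authors <- list(
  utils::person("Romain", "Francois"),
  utils::person("Kurt", "Hornik")
)

# Combine them into a single person object of length 2.
combined <- do.call(c, authors)
class(combined)
length(combined)
```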
Today, suddenly I got various fatal crashes of my R sessions (e.g., while using the package 'RefManageR' and 'vitae') which I tracked down to the update of 'bibtex' to version 0.4.2.1 (released on CRAN 2019-12-20), which leads to the following error message:
*** caught segfault ***
address 0x0, cause 'memory not mapped'
Traceback:
1: doTryCatch(return(expr), name, parentenv, handler)
2: tryCatchOne(expr, names, parentenv, handlers[[1L]])
3: tryCatchList(expr, classes, parentenv, handlers)
4: tryCatch(.External("do_read_bib", file = file, encoding = encoding, srcfile = srcfile), error = function(e) { if (!any(grepl("unprotect_ptr", e))) stop(geterrmessage(), call. = FALSE) else stop("Invalid bib file", call. = FALSE)})
5: withCallingHandlers(tryCatch(.External("do_read_bib", file = file, encoding = encoding, srcfile = srcfile), error = function(e) { if (!any(grepl("unprotect_ptr", e))) stop(geterrmessage(), call. = FALSE) else stop("Invalid bib file", call. = FALSE)}), warning = function(w) { if (any(grepl("syntax error, unexpected [$]end", w))) invokeRestart("muffleWarning")})
@article{Hawking1966,
author = {Hawking, Stephen William and Bondi, Hermann },
title = {The occurrence of singularities in cosmology},
journal = {Proceedings of the Royal Society of London. Series A. Mathematical and Physical Sciences},
volume = {294},
number = {1439},
pages = {511-521},
year = {1966},
doi = {10.1098/rspa.1966.0221},
}
bibtex::read.bib("test.bib")
R version 3.6.2 (2019-12-12)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.6
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] compiler_3.6.2 magrittr_1.5 tools_3.6.2 stringi_1.4.3 stringr_1.4.0 bibtex_0.4.2.1
Now I see that the CRAN tests for macOS also fail, so I guess the problem is real and not related to a particular configuration. However, help is at hand (#24) and will probably be available once CRAN is back from vacation.
read.bib does not work if the .bib file has a question mark in the "key".
tmp <- tempfile(fileext = ".bib")
entry <- "@Misc{key?,\n author = \"Smith, Bob\",\n title = \"The Title\",\n year = 2012, \n}"
writeLines(entry, tmp)
read.bib(tmp)
## Warning message:
## In read.bib(tmp) :
## C:\Users\MMCLEA~1.ADS\AppData\Local\Temp\Rtmp6ZP3yZ\file86058c626ee.bib:1:0
## syntax error, unexpected TOKEN_LITERAL, expecting TOKEN_COMMA
## Dropping the entry `k` (starting at line 1)
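A pre-check can at least point at keys that might trip the parser before the file is read. The sketch below uses a hypothetical helper, find_suspicious_keys(), and the set of characters it treats as safe (letters, digits, and _ : . / -) is a conservative guess of mine, not the lexer's actual rule.

```r
# Hypothetical helper: list entry keys containing characters outside a
# conservative, assumed-safe set.
find_suspicious_keys <- function(lines) {
  m <- regmatches(lines, regexec("^@[A-Za-z]+\\{([^,]*),", lines))
  keys <- vapply(
    m,
    function(x) if (length(x) == 2L) x[2] else NA_character_,
    character(1)
  )
  keys <- keys[!is.na(keys)]
  keys[grepl("[^A-Za-z0-9_:./-]", keys)]
}

bib <- c(
  "@Misc{key?,",
  "  author = \"Smith, Bob\",",
  "}",
  "@Article{10.1371/journal.pone.0109458,",
  "}"
)
suspicious <- find_suspicious_keys(bib)
suspicious
```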
Hi,
I noticed that when I import a .bib entry into Zotero, it automatically classifies it as a book rather than software. I am not sure if this is a Zotero issue. My guess is that there is no info on the reference type and Zotero assumes a book when this info is missing. So I am wondering whether this info could be added to the .bib entry when generating a citation with the bibtex package.
Thanks!
I don't think this issue is any bigger than perhaps "read.bib could spit out a nicer error message for invalid input". It seems in my code I can occasionally get errors in the .bib files I generate and this would lead to "unprotect_ptr: pointer not found" errors when I try to read the bib files with read.bib. E.g.
library(bibtex)
options(error = function() traceback(2))
writeLines('Not a real bib file', 'junk.bib')
read.bib('junk.bib')
Error in read.bib("junk.bib") : unprotect_ptr: pointer not found
In addition: Warning message:
In read.bib("junk.bib") :
junk.bib:2:0
syntax error, unexpected $end
Dropping the entry '(nil)' (starting at line 0)
2: .External("do_read_bib", file = file, encoding = encoding, srcfile = srcfile)
1: read.bib("junk.bib")
> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-w64-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] bibtex_0.3-6
> R.version
_
platform x86_64-w64-mingw32
arch x86_64
os mingw32
system x86_64, mingw32
status
major 3
minor 0.2
year 2013
month 09
day 25
svn rev 63987
language R
version.string R version 3.0.2 (2013-09-25)
nickname Frisbee Sailing
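In the meantime, a caller can get a friendlier message by checking the file before handing it to the parser. This is only a sketch of that idea: `looks_like_bib()` and `safe_read_bib()` are hypothetical wrappers, not functions from bibtex, and the check is deliberately crude (it only looks for a line that opens an entry).

```r
# Hypothetical pre-check: does any line open a BibTeX entry such as "@Misc{"?
looks_like_bib <- function(path) {
  lines <- readLines(path, warn = FALSE)
  any(grepl("^\\s*@[A-Za-z]+\\s*[{(]", lines, perl = TRUE))
}

# Hypothetical wrapper: fail early with a readable message, otherwise parse.
safe_read_bib <- function(path) {
  if (!looks_like_bib(path)) {
    stop("'", path, "' contains no BibTeX entries", call. = FALSE)
  }
  bibtex::read.bib(path)
}

junk <- tempfile(fileext = ".bib")
writeLines("Not a real bib file", junk)
msg <- tryCatch(safe_read_bib(junk), error = function(e) conditionMessage(e))
```

With the junk file above, the wrapper stops before the parser is ever called, so the "unprotect_ptr" error is not reached.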
As part of studying the parser more thoroughly for #40, I discovered what appears to be a critical bug. It occurs in both master and my WIP fixes for the rchk issues (which I have not yet added to the PR)
The bibtex entry that causes this:
encoding = "utf-8"
@String{{ BAZ = "Foo Bar Baz" }
And accessing it in R (I've stored the file in inst/bib locally):
f <- file.path(system.file("bib", "parseError.bib", package = "bibtex"))
bib <- read.bib(f)
The second { correctly causes an error, but the grammar and/or error handling does not appear sufficient in order to recover correctly. Afterwards, I am unable to read any other .bib files - they fail and enter recovery as well.
In addition, a second call to read.bib(f) causes the R session to crash.
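Since a single malformed entry can leave the parser in a bad state, a cheap pre-flight check may be worth running before read.bib on untrusted files. The sketch below simply counts braces. `brace_balance()` is a hypothetical helper, and it is naive by design: it does not treat escaped braces (`\{`) or braces inside quoted strings specially.

```r
# Hypothetical helper: number of "{" minus number of "}" across all lines.
# A non-zero result suggests the file is malformed and should not be parsed.
brace_balance <- function(lines) {
  chars <- strsplit(paste(lines, collapse = "\n"), "")[[1]]
  sum(chars == "{") - sum(chars == "}")
}

bad <- '@String{{ BAZ = "Foo Bar Baz" }'
balance <- brace_balance(bad)
if (balance != 0) {
  message("Unbalanced braces (", balance, "), skipping read.bib()")
}
```

A balanced file can still be invalid, so this only filters out one class of input, the one that triggers the crash described above.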
So I've been trying to use this function for a while, but when using it and writing the file back to disk with df2bib(), Jabref suddenly told me
I found this puzzling, and tried troubleshooting the error. While doing this, I stumbled upon a problem with the eprint field: an entry that looked like this (very minimal)
@inproceedings{joulin2017bag,
title = "Bag of Tricks for Efficient Text Classification",
author = "Joulin, Armand",
year = "2017",
eprint={1607.01759},
}
was transformed into this
@Inproceedings{joulin2017bag,
Author = {Joulin, Armand},
Title = {Bag of Tricks for Efficient Text Classification},
Year = {2017},
Eprint..1607.01759.. = {1607.01759}
}
This problem does not appear to be related to the issue with the empty text fields that I was actually investigating. It seems to be caused by the package only supporting a specific set of fields: any field not in that list is parsed with the field's value as part of the column name. This means that Eprint gets converted into Eprint..1607.01759.. in the output.
So there are two serious bugs here, and only one of them is immediately reproducible. I give up for now :)
I conclude that this package should be used with caution, and you should definitely NOT write back into a bibtex file that you need for anything reliable.
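A side observation on the mangled Eprint name: apart from capitalisation, it is exactly what base R's make.names() returns for the whole unsplit line `eprint={1607.01759},`. That suggests, as an assumption not confirmed in this thread, that the line was never split into name and value, possibly because it has no spaces around `=`. The sketch below shows the match and a hypothetical pre-processing helper, `normalise_equals()`, that adds the spaces before import.

```r
# The raw field line from the entry above, including the trailing comma
raw_field <- "eprint={1607.01759},"

# make.names() replaces every character that is invalid in an R name with "."
mangled <- make.names(raw_field)

# Hypothetical helper: rewrite "field=value" as "field = value".
# Whether this avoids the problem depends on how the importer splits lines,
# which is an assumption here.
normalise_equals <- function(lines) {
  sub("^(\\s*[A-Za-z]+)\\s*=\\s*", "\\1 = ", lines, perl = TRUE)
}

fixed <- normalise_equals(raw_field)
```

If the guess is right, running `normalise_equals()` over the file's lines before import would be a low-risk thing to try, since lines that already use ` = ` are left unchanged.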
Time is slipping away from me on multiple fronts. To ease this, I'm going to add the bibtex package as part of GSOC 2021.
@sckott any issue with being a co-mentor?
Hi
Following #28, here is a proposed roadmap for improving the package. This is to be discussed:
Add tests, ideally snapshot-based (testthat >= 3):
bibtex.R
Happy to get feedback on this
From Achim:
If author/editor fields contain commas in a protected string, they are not parsed correctly by bibtex::read.bib. Attached is an example for author = {{The MathWorks, Inc.}} where the current code first splits on the comma:
R> x <- read.bib("matlab.bib")
R> x$author
[1] "Inc.} {The MathWorks"
Of course, subsequent operations (printing, export to BibTeX again) also don't do the right thing or throw warnings. It would be great if you could fix this.
Thanks in advance & best wishes,
Z
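The behaviour being asked for amounts to ignoring commas that sit inside braces. A minimal base-R sketch of that test follows. `top_level_commas()` is a hypothetical helper for illustration, not the package's implementation, and it assumes the outer field delimiters have already been removed, so `{{The MathWorks, Inc.}}` arrives as `{The MathWorks, Inc.}`.

```r
# Hypothetical helper: positions of commas that are not enclosed in braces.
top_level_commas <- function(x) {
  chars <- strsplit(x, "")[[1]]
  # Brace depth after each character; a protected comma has depth > 0
  depth <- cumsum(chars == "{") - cumsum(chars == "}")
  which(chars == "," & depth == 0)
}

protected <- top_level_commas("{The MathWorks, Inc.}")  # no top-level comma
plain     <- top_level_commas("Smith, Bob")             # comma separates family and given name
```

Only a name with a top-level comma would then be split into "family, given" parts; a fully protected name would be kept as one unit.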
Hi!
I am having difficulty loading bibtex in R. I realized this while trying to install and load the "plm" package. I am working from a cloud environment and have R version 3.6.3. I have been able to run plm before, so I am confused why this is now an issue. I get an almost identical error when I try to install from CRAN and from GitHub.
What I am simply trying to run:
install.packages("plm")
install.packages("bibtex")
Or
if(!requireNamespace("remotes", quietly = TRUE)) { install.packages("remotes") }
remotes::install_github("ropensci/bibtex")
This is the error I get:
bibparse.y: In function ‘xx_simple_value’:
bibparse.y:945: error: ‘for’ loop initial declarations are only allowed in C99 mode
bibparse.y:945: note: use option -std=c99 or -std=gnu99 to compile your code
bibparse.y: In function ‘xx_expand_abbrev’:
bibparse.y:1021: error: ‘for’ loop initial declarations are only allowed in C99 mode
bibparse.y: In function ‘asVector’:
bibparse.y:1145: error: ‘for’ loop initial declarations are only allowed in C99 mode
make: *** [bibparse.o] Error 1
ERROR: compilation failed for package ‘bibtex’
I have no idea why it is treating something as a for loop. I've tried to download archived versions of bibtex as well, with the same error message above. Please let me know if I can provide any more information. Thank you so much!
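For context, the message is about the package's C source rather than R code: a loop written as `for (int i = 0; ...)` declares its variable inside the loop header, which older GCC versions reject unless C99 mode is enabled. Following the compiler's own hint, one thing to try is passing `-std=gnu99` through a user Makevars file. The sketch below assumes the compiler is a GCC that accepts that flag; the exact CFLAGS value is an assumption, and the temporary file is used so that no existing `~/.R/Makevars` is overwritten.

```r
# Write a throwaway Makevars that enables C99 mode (flag values are assumptions)
makevars <- tempfile("Makevars")
writeLines("CFLAGS = -std=gnu99 -g -O2", makevars)

# R_MAKEVARS_USER tells R which user Makevars file to use when compiling packages
Sys.setenv(R_MAKEVARS_USER = makevars)

# Then retry the installation in the same session:
# install.packages("bibtex")
```

The setting only lasts for the current R session. If it works, the same line can be placed in `~/.R/Makevars` to make it permanent.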