
arxiv_script's Introduction

arXiv-script v0.2

The arXiv is the most important open-access repository for preprints in various sciences, e.g. Computer Science, Mathematics and Physics. Each preprint has its unique arXiv identifier (often called arXiv number). The arXiv script (axs) is a minimal command line tool to interact with the preprint of an arXiv identifier:

  • show: print its title, authors and abstract to the terminal.
  • get: download the preprint and save it with a uniform file name.
  • bib: create a BibTeX entry for the preprint (and optionally add it to a .bib-file) to easily cite the preprint in LaTeX documents.

example

Given the arXiv identifier math/0211159, let's see what we can do with it (after the installation which is explained below). Let's take a look at the preprint via

axs show math/0211159 

This command prints the title, author(s), abstract and arXiv subject to the terminal. If we like the article, simply change show to get in the above command. The article is then downloaded (to the default directory, see below) under a file name of the convenient format AUTHOR(S)-TITLE-YEAR.pdf. In this example:

Perelman-The_entropy_formula_for_the_Ricci_flow_and_its_geometric_applications-2002.pdf

If we decide to cite this preprint in some LaTeX document using BibTeX, simply modify the command to axs bib math/0211159. This yields the following BibTeX-entry:

@article{Perelman-EntropyFormulaFor-math/0211159,
	Author = {Perelman, Grisha},
	Title = {{The} entropy formula for the {Ricci} flow and its geometric applications},
	Year = {2002},
	Note = {\href{https://arxiv.org/abs/math/0211159}{arXiv:math/0211159}}
}

We are then asked if we want to automatically add this entry to our (default) .bib-file enabling us to cite the preprint in LaTeX right away.

installation

The installation is most convenient using pip:

pip install arxiv-script

You can check the installation using

axs --help

setup

After installation it is recommended to set a default directory, which needs to exist already, where articles are downloaded to. This is done via

axs --set-directory PATH_TO_DIR

where PATH_TO_DIR is our chosen directory path. Alternatively, we can give a directory for each download, see below. To set a default .bib-file, where BibTeX-entries are added to, simply use

axs --set-bib-file PATH_TO_FILE

Here PATH_TO_FILE is our chosen default .bib-file where our BibTeX-entries will be added to. As before, we can alternatively choose a .bib-file for each BibTeX-entry individually, see below.

the commands in detail

The basic usage is the following

axs cmd flag ax_id  

where cmd is one of the commands below, flag is an (optional) flag and ax_id is an arXiv identifier. Several flags can be combined.

The flag --help provides help for each command. Note that axs --help gives quick general help.

show

This command prints the title, (some of) the authors, the abstract and the main arXiv subject of the corresponding arXiv preprint. The flag -f gives a full version, i.e. additionally all authors and the main arXiv subject.

get

Simply downloads the article to your default directory (if it was already set as explained above) under the file name AUTHOR(S)-TITLE-YEAR.pdf. Two comments on the file name:

  • For three or more authors, we use the first author and append 'et al'.
  • The title name is shortened to 15 words to prevent too long file names.
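The two file name rules above can be sketched as follows. This is a minimal illustration, not the script's actual code: the function name make_file_name, the underscore between words, the '_et_al' spelling and the way two authors are joined are assumptions made for this sketch.

```python
def make_file_name(last_names, title, year, max_words=15):
    """Build a file name of the form AUTHOR(S)-TITLE-YEAR.pdf."""
    if len(last_names) >= 3:
        # Three or more authors: first author plus 'et al'.
        authors = f"{last_names[0]}_et_al"
    else:
        # Assumption: one or two authors are joined with '_and_'.
        authors = "_and_".join(last_names)
    # Keep at most max_words words of the title, joined with underscores.
    short_title = "_".join(title.split()[:max_words])
    return f"{authors}-{short_title}-{year}.pdf"


print(make_file_name(
    ["Perelman"],
    "The entropy formula for the Ricci flow and its geometric applications",
    2002,
))
```

For the example preprint this reproduces the file name shown earlier, since its title has fewer than 15 words and there is a single author.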

Before the download, the title and author(s) are printed to the terminal and there is a short countdown so that we can still cancel if we accidentally entered the wrong arXiv identifier.

With the flag -d (or --directory) we can download the article to another directory, i.e.

axs get -d PATH_TO_DIR ax_id

downloads the article to the directory at PATH_TO_DIR. The flag -o (or --open) opens the preprint after the download.

bib

Prints a BibTeX-entry of the article to the terminal and asks if it should be added to our default .bib-file (if it has been set before). Alternatively, use the flag -a (or --add-to) combined with the path to another .bib-file to which we want to add the BibTeX-entry. Note that at the moment, the BibTeX-entry is simply added to the end of the corresponding .bib-file, so it is not (yet) sorted e.g. alphabetically.

Three comments on the BibTeX-entry:

  • The BibTeX-key, which is used to cite the preprint in a LaTeX document (Perelman-EntropyFormulaFor-math/0211159 in our example above), is created in the format AUTHOR(S)-SHORT_TITLE-AX_ID where
      • AUTHOR(S): as for the file name but without white spaces.
      • SHORT_TITLE: created from the title by removing all articles and the most common prepositions and then taking the first three words. Finally, all white spaces are removed.
      • AX_ID: the arXiv identifier, which is added to make the BibTeX-entry unique.
  • The BibTeX-key is reasonably concise but contains enough information so that it can be easily found with the auto-completion for citations in any modern LaTeX editor.
  • We only put curly braces around capital words to make it compatible with as many citation styles in LaTeX as possible (see for example this discussion).
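The key construction and the brace rule described above can be sketched like this. The stop-word list and both function names are assumptions for illustration; the list the script actually uses is not given here (the example key shows that 'for' is kept, so it is left out of the list).

```python
# Assumption: an illustrative stop-word list; the script's real list may differ.
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "at", "to"}


def make_bibtex_key(authors, title, ax_id):
    """Build a key of the form AUTHOR(S)-SHORT_TITLE-AX_ID."""
    words = [w for w in title.split() if w.lower() not in STOP_WORDS]
    # First three remaining words, each with its first letter upper-cased.
    short_title = "".join(w[:1].upper() + w[1:] for w in words[:3])
    return f"{authors.replace(' ', '')}-{short_title}-{ax_id}"


def protect_capitals(title):
    """Wrap every word that starts with a capital letter in curly braces."""
    return " ".join(
        "{" + w + "}" if w[:1].isupper() else w for w in title.split()
    )


title = "The entropy formula for the Ricci flow and its geometric applications"
print(make_bibtex_key("Perelman", title, "math/0211159"))
print(protect_capitals(title))
```

With these assumptions the sketch reproduces the key and the Title field of the example entry above.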

planned features

In the future we plan to implement the following:

  • Option to automatically download articles of different arXiv main subjects to different folders.
  • Many arXiv preprints have already been published. Give an option to search for the BibTeX-entry of the published version (e.g. in zbMATH for Mathematics).
  • More convenient installation without requiring an installed version of Python.
  • 'Browsing' arXiv in the terminal?

background

Even though there are great tools to manage scientific articles (e.g. Mendeley or Zotero), I realized - after using them for a while - that I saved way too many arXiv articles. Eventually, I manually downloaded only the important ones to one and the same directory. However, one problem, e.g. when trying to find an article, was that I did not stick to a systematic file name. So the idea for the arXiv script was born. Since I was not that satisfied with the available common BibTeX-entries of arXiv articles, I've automated them myself.

Altogether I hope that this script is useful for others as well who prefer a minimalistic management of (arXiv) articles.

main changes from v0.1 to v0.2

  • In v0.1 the syntax was quite unconventional (axs AX_ID CMD instead of axs CMD AX_ID).
  • BUG FIX: axs now works 'anywhere' in the terminal, not just in its own directory. To do so, we now use environment variables (via the dotenv package to make them last) to store the default directory and bib-file. Thanks @r-raymond!


arxiv_script's Issues

Use environment variables instead of `data`.

Currently the default directory is loaded from data.

  • Use environment variables to specify default directory
  • If there is none specified, create a random temp directory (see python's default lib)
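A minimal sketch of the two points above, using only the standard library. The variable name AXS_DEFAULT_DIRECTORY and the function name are hypothetical and chosen for this illustration only.

```python
import os
import tempfile
from pathlib import Path

# Hypothetical variable name, chosen for this sketch only.
ENV_VAR = "AXS_DEFAULT_DIRECTORY"


def default_directory():
    """Return the directory from the environment, or a new temp directory."""
    configured = os.environ.get(ENV_VAR)
    if configured:
        return Path(configured)
    # Nothing configured: mkdtemp creates a fresh, uniquely named directory.
    return Path(tempfile.mkdtemp(prefix="axs-"))
```

Note that a temporary directory created this way is not removed automatically, so downloads stay available until the user or the system cleans it up.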

Use PRs to close issues.

Branch off of main, make a PR, link it to the issue, and ask for a reviewer before merging to master. Consider enforcing this on a GitHub repo level.

Random review of a function

def bib(ax_id, add_to):

Consider using early returns. That is, instead of

if article:
    ...  # very long function that does many things

do

if not article:
    return
...  # very long function that does many things

This reduces the complexity when reading.

if add_to in ("", None) -> if not add_to:

Again I'd move this to the very beginning of the function.

def bib(ax_id, add_to):
    if not add_to:
        print("RTFM!")
        return
    # now I can be sure add_to exists

I'm also pretty sure you can make the option add-to required instead of having this error check.

file.write("{}".format(bib_entry)) -> file.write(bib_entry)

Consider using pathlib instead of os.path.
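As a rough illustration of what the pathlib suggestion could look like for appending an entry, here is a small sketch. The helper name append_entry and the suffix check are assumptions, not code from the repository.

```python
from pathlib import Path


def append_entry(bib_file, bib_entry):
    """Append a BibTeX entry to the end of a .bib file."""
    path = Path(bib_file).expanduser()
    if path.suffix != ".bib":
        raise ValueError(f"not a .bib file: {path}")
    # Mode "a" appends and creates the file if it does not exist yet.
    with path.open("a", encoding="utf-8") as file:
        file.write(bib_entry)
```

Path objects carry the suffix, parent and existence checks as attributes and methods, which replaces the separate os.path.splitext, os.path.dirname and os.path.exists calls.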

General logic of this function imho should be

def bib(ax_id, add_to):
    if something_wrong:
        print("RTFM")
        return
    if something_else_wrong:
        print("RTFM")
        return
    # happy path

Consider not having the click confirm in the function, or having a -y overwrite argument. The way it currently is I can't use it in a script, because I always need to confirm manually.

Consider using exit codes. A well-behaved script should return 0 if successful, but non-zero if something went wrong. So instead of a bare return in the snippet above, use an exit code that signals to the outside that the execution failed (check out click's documentation on how they recommend doing that; worst case, there is sys.exit(1)).
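The review points above (early returns, a -y flag to skip the confirmation, and exit codes) can be combined in one sketch. To stay dependency-free it uses argparse from the standard library rather than click, and fetch_entry is a hypothetical stand-in for the arXiv lookup, so this shows the control flow only, not the script's implementation.

```python
import argparse
import sys


def fetch_entry(ax_id):
    """Hypothetical stand-in for the arXiv lookup; None means 'not found'."""
    known = {"math/0211159": "@article{Perelman-EntropyFormulaFor-math/0211159}\n"}
    return known.get(ax_id)


def main(argv=None):
    parser = argparse.ArgumentParser(prog="bib-sketch")
    parser.add_argument("ax_id")
    parser.add_argument("-a", "--add-to", help="path to a .bib file")
    parser.add_argument("-y", "--yes", action="store_true",
                        help="skip the confirmation prompt")
    args = parser.parse_args(argv)

    entry = fetch_entry(args.ax_id)
    if entry is None:
        # Early return with a non-zero code signals failure to the caller.
        print(f"no article found for {args.ax_id}", file=sys.stderr)
        return 1
    print(entry, end="")
    if not args.add_to:
        return 0  # nothing to add to; printing the entry was the whole job
    if not args.yes and input(f"Add to {args.add_to}? [y/N] ").lower() != "y":
        return 0  # the user declined, which is not an error
    # Happy path: append the entry to the chosen file.
    with open(args.add_to, "a", encoding="utf-8") as file:
        file.write(entry)
    return 0

# In a real script, finish with: sys.exit(main())
```

Returning the code from main and passing it to sys.exit in one place keeps the function easy to call from tests, while shell scripts can still check the result via the exit status.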
