
zotero-cli's Introduction

Zotero CLI

Sort and rank your Zotero references easily from your CLI.
Ask questions about your Zotero documents locally with GPT.


This Tinyscript tool relies on pyzotero to communicate with Zotero's Web API. It can list field values, show items in tables in the CLI, and export sorted items to an Excel file.

$ pip install zotero-cli-tool

⏩ Quick Start

The first time you start the tool, it asks for your API identifier and key, then caches them in ~/.zotero/creds.txt with read/write permissions for your user only. Data is cached in ~/.zotero/cache/. If you are using a shared group library, either pass the "-g" ("--group") option in your zotero-cli command or, to set it permanently, create an empty file ~/.zotero/group (for example with touch).
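
The permission handling described above can be sketched in Python as follows. This is a minimal illustration and not the tool's actual code: the function name `save_credentials` and the one-value-per-line file layout are assumptions. Only the path ~/.zotero/creds.txt and the user-only read/write mode come from the text.

```python
import os
import stat
from pathlib import Path


def save_credentials(api_id: str, api_key: str, path: Path) -> None:
    """Write the API identifier and key to `path`, accessible by the owner only.

    Hypothetical helper: the file layout (one value per line) is an assumption.
    """
    path.parent.mkdir(parents=True, exist_ok=True)
    # Create the file with mode 0o600 from the start, so it is never world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        f.write(f"{api_id}\n{api_key}\n")
    # Enforce the mode even if the file already existed with looser permissions.
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
```

A call such as `save_credentials("1234567", "secret", Path.home() / ".zotero" / "creds.txt")` would then leave a file that only the current user can read or write.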

  • Manually update cached data
$ zotero-cli reset

Note that this can take a while, which is why the data is cached for subsequent commands.

  • Count items in a collection
$ zotero-cli count --filter "collections:biblio"
123
  • List values for a given field
$ zotero-cli list itemType

    Type             
    ----             
    computer program 
    conference paper 
    document         
    journal article  
    manuscript       
    thesis           
    webpage          
  • Show entries with the given set of fields, filtered on multiple criteria and limited to a given number of items
$ zotero-cli show year title itemType numPages --filter "collections:biblio" --filter "title:detect" --limit ">date:10"

    Year  Title                                                                                                                             Type              #Pages 
    ----  -----                                                                                                                             ----              ------ 
    2016  Classifying Packed Programs as Malicious Software Detected                                                                        conference paper  3      
    2016  Detecting Packed Executable File: Supervised or Anomaly Detection Method?                                                         conference paper  5      
    2016  Entropy analysis to classify unknown packing algorithms for malware detection                                                     conference paper  21     
    2017  Packer Detection for Multi-Layer Executables Using Entropy Analysis                                                               journal article   18     
    2018  Sensitive system calls based packed malware variants detection using principal component initialized MultiLayers neural networks  journal article   13     
    2018  Effective, efficient, and robust packing detection and classification                                                             journal article   15     
    2019  Efficient automatic original entry point detection                                                                                journal article   14     
    2019  All-in-One Framework for Detection, Unpacking, and Verification for Malware Analysis                                              journal article   16     
    2020  Experimental Comparison of Machine Learning Models in Malware Packing Detection                                                   conference paper  3      
    2020  Building a smart and automated tool for packed malware detections using machine learning                                          thesis            99     
  • Export entries
$ zotero-cli export year title itemType numPages --filter "collections:biblio" --filter "title:detect" --limit ">date:10"
$ file export.xlsx 
export.xlsx: Microsoft Excel 2007+
  • Use a predefined query
$ zotero-cli show - --query "top-50-most-relevants"

Note: "-" is used for the field positional argument to tell the tool to select the predefined list of fields included in the query.

This is equivalent to:

$ zotero-cli show year title numPages itemType --limit ">rank:50"
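
The "--filter" and "--limit" expressions used in the commands above follow a "field:value" shape, with the limit carrying a leading ">" sign. The sketch below shows one way such expressions could be parsed and applied. It is not the tool's implementation: the function names are hypothetical, and the semantics (case-insensitive substring match for filters, ">" meaning "highest values first") are assumptions made for illustration.

```python
import re
from typing import Dict, List, Tuple


def parse_filter(expr: str) -> Tuple[str, str]:
    """Split 'field:pattern' into (field, pattern)."""
    field, sep, pattern = expr.partition(":")
    if not sep or not field:
        raise ValueError(f"Bad filter expression: {expr!r}")
    return field, pattern


def parse_limit(expr: str) -> Tuple[bool, str, int]:
    """Split '>field:N' or '<field:N' into (descending, field, N)."""
    m = re.fullmatch(r"([<>])(\w+):(\d+)", expr)
    if m is None:
        raise ValueError(f"Bad limit expression: {expr!r}")
    return m.group(1) == ">", m.group(2), int(m.group(3))


def select(items: List[Dict], filters: List[str], limit: str) -> List[Dict]:
    """Keep items matching every filter, then the first N after sorting on the limit field."""
    for expr in filters:
        field, pattern = parse_filter(expr)
        # Assumed semantics: case-insensitive substring match on the field's text.
        items = [i for i in items if pattern.lower() in str(i.get(field, "")).lower()]
    descending, field, n = parse_limit(limit)
    return sorted(items, key=lambda i: i[field], reverse=descending)[:n]
```

Under these assumptions, several "--filter" options combine with a logical AND, and the limit is applied last.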

Available queries:

  • no-attachment: list of all items with no attachment ; displayed fields: title
  • no-url: list of all items with no URL ; displayed fields: year, title
  • top-10-most-relevants: top-10 best ranked items ; displayed fields: year, title, numPages, itemType
  • top-50-most-relevants: same as top-10 but with the top-50
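
A predefined query can be thought of as a named bundle of displayed fields and an optional limit. The following sketch models the list above as a small lookup table. The `Query` type and `resolve_fields` helper are hypothetical, the ">rank:10" limit is inferred by analogy with the top-50 query, and the filters behind no-attachment and no-url are left out because the text does not state them.

```python
from typing import Dict, List, NamedTuple, Optional


class Query(NamedTuple):
    fields: List[str]
    limit: Optional[str] = None


# Field lists come from the list above; ">rank:10" is an assumption
# made by analogy with the documented ">rank:50".
QUERIES: Dict[str, Query] = {
    "no-attachment": Query(["title"]),
    "no-url": Query(["year", "title"]),
    "top-10-most-relevants": Query(["year", "title", "numPages", "itemType"], ">rank:10"),
    "top-50-most-relevants": Query(["year", "title", "numPages", "itemType"], ">rank:50"),
}


def resolve_fields(fields: List[str], query_name: str) -> List[str]:
    """Return the query's own field list when the positional argument is '-'."""
    if fields == ["-"]:
        return list(QUERIES[query_name].fields)
    return fields
```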

Mark items:

$ zotero-cli mark read --filter "title:a nice paper"
$ zotero-cli mark unread --filter "title:a nice paper"

Markers:

  • read / unread: by default, items are displayed in bold ; marking an item as read makes it display in normal weight
  • irrelevant / relevant: marking an item as irrelevant excludes it from the output list of items
  • ignore / unignore: marking an item as ignored excludes it completely, including from the ranking algorithm

💻 Local GPT

This feature is based on PrivateGPT. It can be used to ingest local Zotero documents and ask questions based on a chosen GPT model.

  • Install optional dependencies
$ pip install zotero-cli-tool[gpt]
  • Install one of the following models:

    • ggml-gpt4all-j-v1.3-groovy.bin (default)
    • ggml-gpt4all-l13b-snoozy.bin
    • ggml-mpt-7b-chat.bin
    • ggml-v3-13b-hermes-q5_1.bin
    • ggml-vicuna-7b-1.1-q4_2.bin
    • ggml-vicuna-13b-1.1-q4_2.bin
    • ggml-wizardLM-7B.q4_2.bin
    • ggml-stable-vicuna-13B.q4_2.bin
    • ggml-mpt-7b-base.bin
    • ggml-nous-gpt4-vicuna-13b.bin
    • ggml-mpt-7b-instruct.bin
    • ggml-wizard-13b-uncensored.bin
$ zotero-cli install

The most recently installed model is selected for the ask command (see below).

  • Ingest your documents
$ zotero-cli ingest
  • Ask questions to your documents
$ zotero-cli ask
Using embedded DuckDB with persistence: data will be stored in: /home/morfal/.zotero/db
Found model file.
[...]
Enter a query: 

💡 Special Features

Some additional fields can be used for listing/filtering/showing/exporting data.

  • Computed fields

    • authors: the list of creators with creatorType equal to author
    • citations: the number of relations the item has to other items with a later date
    • editors: the list of creators with creatorType equal to editor
    • numAttachments: the number of child items with itemType equal to attachment
    • numAuthors: the number of creators with creatorType equal to author
    • numCreators: the number of creators
    • numEditors: the number of creators with creatorType equal to editor
    • numNotes: the number of child items with itemType equal to note
    • numPages: the (corrected) number of pages, taken either from the original field or from the pages field
    • references: the number of relations the item has to other items with an earlier date
    • year: the year coming from the datetime parsing of the date field
  • Extracted fields (from the extra field)

    • comments: custom field for adding comments
    • results: custom field for mentioning results related to the item
    • what: custom field for a short description of what the item is about
    • zscc: number of Scholar citations, computed with the Zotero Google Scholar Citations plugin
  • PageRank-based reference ranking algorithm

    • rank: computed field that ranks references in order of relevance ; it uses an algorithm similar to Google's PageRank while weighting references according to their year of publication (giving more importance to recent references, which cannot have as many citations as older ones anyway)
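
To make the idea behind rank more concrete, here is a minimal sketch of a PageRank-style score combined with a recency weight. It is an illustration under stated assumptions and not the tool's algorithm: the function signature, the damping factor of 0.85, the linear recency boost and its `recency` parameter are all choices made for this example.

```python
from typing import Dict, List


def rank(cites: Dict[str, List[str]], years: Dict[str, int],
         damping: float = 0.85, recency: float = 0.5,
         iterations: int = 50) -> Dict[str, float]:
    """PageRank over a citation graph, then boosted towards recent years.

    `cites` maps an item key to the keys it references; every referenced key
    must also appear in `years`. The recency boost is an illustrative choice.
    """
    nodes = list(years)
    n = len(nodes)
    score = {k: 1.0 / n for k in nodes}
    for _ in range(iterations):
        new = {k: (1.0 - damping) / n for k in nodes}
        for k in nodes:
            targets = cites.get(k, [])
            if targets:
                share = damping * score[k] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # Dangling node: spread its score evenly over all nodes.
                for t in nodes:
                    new[t] += damping * score[k] / n
        score = new
    # Recency weight in [1, 1 + recency], linear in the publication year.
    lo, hi = min(years.values()), max(years.values())
    span = (hi - lo) or 1
    weighted = {k: score[k] * (1.0 + recency * (years[k] - lo) / span) for k in nodes}
    total = sum(weighted.values())
    return {k: v / total for k, v in weighted.items()}
```

With `recency=0`, this reduces to plain PageRank; larger values let a recent reference with few incoming citations move up relative to an older, heavily cited one.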


zotero-cli's Issues

Failure saving results on setup/reset

% ls $HOME/.zotero/cache 
collections.json  items.json

% zotero-cli -v           
18:13:31 [DEBUG] Getting collections from zotero.org...
18:13:31 [DEBUG] Saving collections to cache '/home/user/.zotero/cache/collections.json'...
18:13:31 [DEBUG] Getting items from zotero.org...
18:14:17 [DEBUG] Saving items to cache '/home/user/.zotero/cache/items.json'...
18:14:18 [DEBUG] Getting attachments from zotero.org...
Traceback (most recent call last):
  File "/home/user/.pyenv/versions/3.10.6/bin/zotero-cli", line 114, in __init__
    with cached_file.open() as f:
  File "/home/user/.pyenv/versions/3.10.6/lib/python3.10/site-packages/pathlib2/__init__.py", line 1548, in open
    return io.open(
  File "/home/user/.pyenv/versions/3.10.6/lib/python3.10/site-packages/pathlib2/__init__.py", line 1383, in _opener
    return self._accessor.open(self, flags, mode)
  File "/home/user/.pyenv/versions/3.10.6/lib/python3.10/site-packages/pathlib2/__init__.py", line 646, in wrapped
    return strfunc(str(pathobj), *args)
FileNotFoundError: [Errno 2] No such file or directory: '/home/user/.zotero/cache/attachments.json'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/user/.pyenv/versions/3.10.6/bin/zotero-cli", line 888, in <module>
    z = ZoteroCLI(args.id, ["user", "group"][args.group or GROUP_FILE.exists()], args.key)
  File "/home/user/.pyenv/versions/3.10.6/bin/zotero-cli", line 135, in __init__
    if i['meta']['numChildren'] > 0:
KeyError: 'numChildren'

Creating the files as empty JSON works:

% echo "{}" > $HOME/.zotero/cache/attachments.json 
% echo "{}" > $HOME/.zotero/cache/notes.json 
% echo "{}" > $HOME/.zotero/cache/annotations.json 

zotero-cli -v                                     
18:28:37 [DEBUG] Getting collections from cache '/home/user/.zotero/cache/collections.json'...
18:28:37 [DEBUG] Getting items from cache '/home/user/.zotero/cache/items.json'...
18:28:37 [DEBUG] Getting attachments from cache '/home/user/.zotero/cache/attachments.json'...
18:28:37 [DEBUG] Getting notes from cache '/home/user/.zotero/cache/notes.json'...
18:28:37 [DEBUG] Getting annotations from cache '/home/user/.zotero/cache/annotations.json'...
18:28:37 [DEBUG] Opening marks from cache '/home/user/.zotero/cache/marks.json'...

Of course then the cache files are empty, and zotero-cli reset will delete the files and start from scratch again.
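
The two exceptions in this report (a missing cache file, then a missing 'numChildren' key) suggest a more defensive loading pattern. The sketch below is one possible shape for it, not the project's actual fix: `load_or_fetch` and `has_children` are hypothetical names, and `fetch` stands in for whatever call retrieves the data from zotero.org.

```python
import json
from pathlib import Path
from typing import Any, Callable, Dict


def load_or_fetch(cached_file: Path, fetch: Callable[[], Dict[str, Any]]) -> Dict[str, Any]:
    """Return cached JSON if present; otherwise call `fetch` and cache its result."""
    try:
        with cached_file.open() as f:
            return json.load(f)
    except FileNotFoundError:
        data = fetch()
        cached_file.parent.mkdir(parents=True, exist_ok=True)
        with cached_file.open("w") as f:
            json.dump(data, f)
        return data


def has_children(item: Dict[str, Any]) -> bool:
    """Treat a missing 'numChildren' entry as zero instead of raising KeyError."""
    return item.get("meta", {}).get("numChildren", 0) > 0
```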

Example code leads to errors

Not a blocker for me. In the README, there is the following example code:

zotero-cli show year title itemType numPages --filter "collections:biblio" --filter "title:detect" --limit ">date:10"

It looks like the tool has trouble handling some of the data. I get errors for pages when the value is not an integer.

19:28:30 [WARNING] Bad pages value 'S20'

Dates also fail a lot with the following error:

19:34:24 [WARNING] Bad datetime format: January 1, 2021 for item titled 'XXXX'. Using default date 1900-01-01.

Different time formats that throw errors:

July 1, 2021
2021/11
2021/07/05
2019-07-20T05:00:01
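
For reference, all four formats listed above can be parsed with the standard library by trying a list of patterns in turn. This is a sketch of that approach and not the tool's date parser: `parse_date` and `DATE_FORMATS` are hypothetical names, the last two patterns are additions for illustration, and the month-name pattern assumes an English locale.

```python
from datetime import datetime
from typing import Optional

# The first four formats come from the list above; the last two are
# extra assumptions covering plain ISO dates and bare years.
DATE_FORMATS = (
    "%B %d, %Y",           # July 1, 2021
    "%Y/%m",               # 2021/11
    "%Y/%m/%d",            # 2021/07/05
    "%Y-%m-%dT%H:%M:%S",   # 2019-07-20T05:00:01
    "%Y-%m-%d",
    "%Y",
)


def parse_date(value: Optional[str]) -> Optional[datetime]:
    """Try each known format in turn; return None if the value is missing or unrecognised."""
    if not value:
        return None
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt)
        except ValueError:
            continue
    return None
```

Returning None rather than a placeholder date leaves it to the caller to decide how undated items should be sorted or ranked.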

More errors

% zotero-cli show - --query "top-50-most-relevants"
19:28:30 [WARNING] Bad pages value '...'
19:42:12 [WARNING] Bad datetime format: .... for item titled '...'. Using default date 1900-01-01.
... a lot more warnings as the above two ...
Traceback (most recent call last):
  File "/home/user/.pyenv/versions/3.10.6/bin/zotero-cli", line 905, in <module>
    z.show(args.field, args.filter, args.sort, args.desc, args.limit)
  File "/home/user/.pyenv/versions/3.10.6/lib/python3.10/site-packages/tinyscript/preimports/log.py", line 99, in _wrapper
    return f(*args, **kwargs)
  File "/home/user/.pyenv/versions/3.10.6/lib/python3.10/site-packages/tinyscript/helpers/decorators.py", line 112, in wrapper
    return f(*args, **kwargs)
  File "/home/user/.pyenv/versions/3.10.6/bin/zotero-cli", line 705, in show
    headers, data = self._items(fields, filters, sort, desc, limit)
  File "/home/user/.pyenv/versions/3.10.6/bin/zotero-cli", line 484, in _items
    dt = set(ZoteroCLI.date(i['data']['date']).timestamp() for i in items.values()) - {dt_zero}
  File "/home/user/.pyenv/versions/3.10.6/bin/zotero-cli", line 484, in <genexpr>
    dt = set(ZoteroCLI.date(i['data']['date']).timestamp() for i in items.values()) - {dt_zero}
KeyError: 'date'

Unknown item type 'annotation'

Running into an issue downloading the attachments. Seems like annotations aren't supported yet.

zotero-cli --verbose
23:36:16 [DEBUG] Getting collections from cache '/home/user/.zotero/cache/collections.json'...
23:36:16 [DEBUG] Getting items from cache '/home/user/.zotero/cache/items.json'...
23:36:16 [DEBUG] Getting attachments from zotero.org...
Traceback (most recent call last):
  File "/home/user/.pyenv/versions/3.10.6/bin/zotero-cli", line 117, in __init__
    with cached_file.open() as f:
  File "/home/user/.pyenv/versions/3.10.6/lib/python3.10/site-packages/pathlib2/__init__.py", line 1548, in open
    return io.open(
  File "/home/user/.pyenv/versions/3.10.6/lib/python3.10/site-packages/pathlib2/__init__.py", line 1383, in _opener
    return self._accessor.open(self, flags, mode)
  File "/home/user/.pyenv/versions/3.10.6/lib/python3.10/site-packages/pathlib2/__init__.py", line 646, in wrapped
    return strfunc(str(pathobj), *args)
FileNotFoundError: [Errno 2] No such file or directory: '/home/user/.zotero/cache/attachments.json'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/user/.pyenv/versions/3.10.6/bin/zotero-cli", line 889, in <module>
    z = ZoteroCLI(args.id, ["user", "group"][args.group or GROUP_FILE.exists()], args.key)
  File "/home/user/.pyenv/versions/3.10.6/bin/zotero-cli", line 145, in __init__
    raise ValueError("Unknown item type '%s'" % c['data']['itemType'])
ValueError: Unknown item type 'annotation'
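
One way to avoid failing on an unexpected child item type is to group children by type and collect anything unrecognised in a fallback bucket. The sketch below illustrates that idea only. `group_children`, `KNOWN_CHILD_TYPES` and the "other" bucket are hypothetical, and the item layout (a 'data' dictionary with an 'itemType' key) is taken from the traceback above.

```python
from collections import defaultdict
from typing import Any, Dict, List

# Assumed set of child item types, for illustration only.
KNOWN_CHILD_TYPES = ("attachment", "note", "annotation")


def group_children(children: List[Dict[str, Any]]) -> Dict[str, List[Dict[str, Any]]]:
    """Group child items by data.itemType, collecting unexpected types under 'other'."""
    groups: Dict[str, List[Dict[str, Any]]] = defaultdict(list)
    for child in children:
        item_type = child.get("data", {}).get("itemType", "")
        bucket = item_type if item_type in KNOWN_CHILD_TYPES else "other"
        groups[bucket].append(child)
    return dict(groups)
```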

Feature request: Showcase more sort and export options

I think this tool looks really good! It would be great to have more export options (say to MD, HTML, text) and flexibility in what data you can export. However I do not know if that is outside of the focus of this.

Import .bibtex to zotero library

Hi, I'm wondering whether it is possible to import a .bibtex file into Zotero from the command line?

Something like zotero-cli import xxx.bib

Thank you!
