
tributors's Introduction

tributors

What is tributors?

Tributors is a Python library and GitHub action that helps you to pay tribute to your contributors. Tributors interacts with several well-known repository metadata files, namely those of all-contributors, Zenodo, and CodeMeta.

Each of the services above allows you to generate some kind of metadata file that lists one or more repository contributors. This file typically needs to be generated and updated manually, and this is where tributors comes in to help! Tributors allows you to create and update these files programmatically. By using a shared cache (a .tributors file that stores common identifiers), it becomes easy to update several of these metadata files at once. You can set criteria such as a contribution threshold for adding a contributor, export an ORCID token to ensure that you have ORCID iDs where needed, or use an interactive mode to make decisions as you go.

How does it work?

Tributors uses the GitHub API, Zenodo API, and Orcid API to look up shared identifiers for common metadata services like all contributors, Zenodo, and CodeMeta. The tool is available for local or container usage, and as a GitHub Action (see the examples folder). See the full documentation for getting started.

Contributors

Yaroslav Halchenko 💻 📖
Vanessasaurus 💻
Pierre Grimaud 💻
vuillaut 💻
jwodder 💻

tributors's People

Contributors

github-actions[bot], jwodder, pgrimaud, vsoch, vuillaut, yarikoptic

tributors's Issues

Create list of tests needed

hey @yarikoptic ! I've done just about all the changes / updates that I think are good for this first version, so please have at it / unleash the Yarik! Now is the right time to start the review and tell me all the things I did wrong :) In all seriousness, I can always quickly put together a first go, but then lots of tweaking / fine tuning / testing is needed. Actually, we don't technically have proper python tests, so if you want to make a list of tests to write I can go from there!

Initial discussion and possible takers?

Dear @adswa @snastase @vsoch @effigies @mih @soichih and all of the @con/i18n and the internet.

I initiated this repo since I found no existing implementation yet. README.md gives more details, but the overall point is to auto-maintain .zenodo.json based on information from allcontributors, for which there is a very convenient setup/bot available.

Immediate needs are in OBC: con/open-brain-consent#71 and datalad-osf where it is maintained "manually" ATM

Just thought to ask first whether someone has already seen an implementation or has a coding itch to cure

ideally order of new additions should be deterministic

I guess there is a set or an unordered dict in use somewhere.

In datalad we got two PRs (since we didn't react to the initial one in time) for the "same" change:
datalad/datalad#5618 and datalad/datalad#5622

I thought the diffs would be identical, so that the fix would only be needed at the workflow level, but apparently the order differs:

1:  4e1365a77 ! 1:  59a016f50 Automated deployment to update contributors 2021-04-29
    @@ Metadata
     Author: github-actions <[email protected]>
     
      ## Commit message ##
    -    Automated deployment to update contributors 2021-04-29
    +    Automated deployment to update contributors 2021-04-30
     
      ## .all-contributorsrc ##
     @@
    @@ .all-contributorsrc
     +            ]
     +        },
     +        {
    -+            "login": "mattcieslak",
    -+            "name": "Matt Cieslak",
    ++            "login": "mikapfl",
    ++            "name": "Mika Pflüger",
     +            "contributions": [
     +                "code"
     +            ],
    -+            "profile": "https://github.com/mattcieslak",
    ++            "profile": "https://github.com/mikapfl",
     +            "avatar_url": [
    -+                "https://avatars.githubusercontent.com/u/170026?v=4"
    ++                "https://avatars.githubusercontent.com/u/7226087?v=4"
     +            ]
     +        },
     +        {
    -+            "login": "mikapfl",
    -+            "name": "Mika Pflüger",
    ++            "login": "mattcieslak",
    ++            "name": "Matt Cieslak",
     +            "contributions": [
     +                "code"
     +            ],
    -+            "profile": "https://github.com/mikapfl",
    ++            "profile": "https://github.com/mattcieslak",
     +            "avatar_url": [
    -+                "https://avatars.githubusercontent.com/u/7226087?v=4"
    ++                "https://avatars.githubusercontent.com/u/170026?v=4"
     +            ]
              }
          ],
    @@ README.md: Sciences, Imaging Platform.  This work is further facilitated by the
     +    <td align="center"><a href="https://github.com/yetanothertestuser"><img src="https://avatars0.githubusercontent.com/u/19335420?v=4?s=100" width="100px;" alt=""/><br /><sub><b>yetanothertestuser</b></sub></a><br /><a href="https://github.com/datalad/datalad/commits?author=yetanothertestuser" title="Code">💻</a></td>
     +    <td align="center"><a href="https://github.com/bhanuprasad14"><img src="https://avatars3.githubusercontent.com/u/19843?v=4?s=100" width="100px;" alt=""/><br /><sub><b>bhanuprasad14</b></sub></a><br /><a href="https://github.com/datalad/datalad/commits?author=bhanuprasad14" title="Code">💻</a></td>
     +    <td align="center"><a href="https://github.com/christian-monch"><img src="https://avatars.githubusercontent.com/u/17925232?v=4?s=100" width="100px;" alt=""/><br /><sub><b>Christian Mönch</b></sub></a><br /><a href="https://github.com/datalad/datalad/commits?author=christian-monch" title="Code">💻</a></td>
    -+    <td align="center"><a href="https://github.com/mattcieslak"><img src="https://avatars.githubusercontent.com/u/170026?v=4?s=100" width="100px;" alt=""/><br /><sub><b>Matt Cieslak</b></sub></a><br /><a href="https://github.com/datalad/datalad/commits?author=mattcieslak" title="Code">💻</a></td>
     +    <td align="center"><a href="https://github.com/mikapfl"><img src="https://avatars.githubusercontent.com/u/7226087?v=4?s=100" width="100px;" alt=""/><br /><sub><b>Mika Pflüger</b></sub></a><br /><a href="https://github.com/datalad/datalad/commits?author=mikapfl" title="Code">💻</a></td>
    ++    <td align="center"><a href="https://github.com/mattcieslak"><img src="https://avatars.githubusercontent.com/u/170026?v=4?s=100" width="100px;" alt=""/><br /><sub><b>Matt Cieslak</b></sub></a><br /><a href="https://github.com/datalad/datalad/commits?author=mattcieslak" title="Code">💻</a></td>
     +  </tr>
     +</table>
     +
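
A minimal sketch of one way to make the order stable, assuming contributor entries are dicts with a "login" key as in the .all-contributorsrc excerpt above. The helper name add_new_contributors is hypothetical and not part of tributors; the point is only that new entries are sorted before being appended instead of being taken in set or API order.

```python
def add_new_contributors(existing, discovered):
    """Append newly discovered contributors in a deterministic order.

    existing:   list of dicts already in the file (their order is preserved)
    discovered: iterable of dicts found in this run (in any order)
    """
    known = {entry["login"] for entry in existing}
    new_entries = {}
    for entry in discovered:
        login = entry["login"]
        if login not in known:
            new_entries.setdefault(login, entry)  # keep the first occurrence
    # Sort case-insensitively so two runs over the same data agree.
    ordered = sorted(
        new_entries.values(),
        key=lambda e: (e["login"].lower(), e["login"]),
    )
    return list(existing) + ordered
```

With this, the two runs from the diff above would both list mattcieslak before mikapfl, regardless of the order in which they were discovered.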

HOWTO new types of "contribution"

In light of the ongoing discussion in BIDS I thought that tributors could be a nice fit! For that to happen we would need the following:

  • be able to create additional contribution categories, such as "steering-committee" or "bep032" to annotate contributors involved in a particular activity/sub-project. It seems that all-contributors already supports providing additional "contribution" types: https://allcontributors.org/docs/en/bot/configuration which is great!
  • ideally, we should get a mechanism to "sync" individuals with github Teams. In the simplest case it could be one-way, in the github -> tributors direction:
    • e.g. if someone is added to the team among https://github.com/orgs/bids-standard/teams , tributors
      • possibly initiates configuration item for allcontributors
      • adds that category in .all-contributors for that person
    • and if removed from the team
      • removes that category in .all-contributors for that person (related but another issue will follow)
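
The one-way sync described in the list above could be sketched roughly as follows. This is only an illustration: sync_team_category is a hypothetical helper, the team membership is assumed to have been fetched already (no GitHub API call is made here), and contributors are modelled as a plain mapping of login to contribution types.

```python
def sync_team_category(contributors, team_members, category):
    """Return a new login -> sorted contribution list mapping.

    contributors: dict mapping login -> list of contribution types
    team_members: iterable of logins currently on the team
    category:     contribution type tied to the team, e.g. "steering-committee"
    """
    members = set(team_members)
    synced = {}
    for login in sorted(set(contributors) | members):
        types = set(contributors.get(login, []))
        if login in members:
            types.add(category)      # added to the team: gains the category
        else:
            types.discard(category)  # removed from the team: loses it
        if types:  # assumption: drop people left with no contribution type
            synced[login] = sorted(types)
    return synced
```

Whether a person left with no contribution type should be dropped or kept is a design choice that the issue leaves open.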

Be able to limit only to contributors listed in the "source"?

Follow up to #22 (comment)

$> grep name .all-contributorsrc | nl
     1	      "name": "Yaroslav Halchenko",
     2	      "name": "Adina Wagner",
     3	      "name": "Valentina Borghesani",
     4	      "name": "Cyril Pernet",
     5	      "name": "Marcin Koculak",
     6	      "name": "Marko Havu",
     7	      "name": "John Pellman",
     8	      "name": "Chris Gorgolewski",
     9	      "name": "Stefan Appelhoff",
    10	      "name": "Satrajit Ghosh",
    11	      "name": "Robert Oostenveld",
    12	      "name": "pjtoussaint",
    13	      "name": "Marie-Luise Kieseler",
    14	      "name": "Stephan Heunis",
    15	      "name": "Chuan-Peng Hu",
    16	      "name": "Peer Herholz",
    17	      "name": "AsykaKolbacyka",
(dev3) 1 42602.....................................:Thu 02 Jul 2020 06:54:37 PM EDT:.
(git-annex)lena:~/proj/open-brain-consent[master]git
$> echo '{}' > .zenodo.json                                                                  
(dev3) 1 42603.....................................:Thu 02 Jul 2020 06:54:45 PM EDT:.
(git-annex)lena:~/proj/open-brain-consent[master]git
$> tributors update zenodo                
INFO:zenodo:Updating .zenodo.json
INFO:tributors:⭐️ new contributor yarikoptic
INFO:tributors:⭐️ new contributor vborghe
INFO:tributors:⭐️ new contributor allcontributors[bot]
INFO:tributors:⭐️ new contributor adswa
INFO:tributors:⭐️ new contributor CPernet
INFO:tributors:⭐️ new contributor jpellman
INFO:tributors:⭐️ new contributor mkoculak
INFO:tributors:⭐️ new contributor chrisgorgo
INFO:tributors:⭐️ new contributor mhavu
INFO:tributors:⭐️ new contributor sappelhoff
INFO:tributors:⭐️ new contributor AsykaKolbacyka
INFO:tributors:⭐️ new contributor hcp4715
INFO:tributors:⭐️ new contributor mlkieseler
INFO:tributors:⭐️ new contributor pjtoussaint
INFO:tributors:⭐️ new contributor PeerHerholz
INFO:tributors:⭐️ new contributor robertoostenveld
INFO:tributors:⭐️ new contributor satra
INFO:tributors:⭐️ new contributor jsheunis
INFO:tributors:⭐️ new contributor yarikoptic-private
$> grep name .zenodo.json| nl        
     1	            "name": "Yaroslav Halchenko"
     2	            "name": "vborghesani"
     3	            "name": "Adina Wagner"
     4	            "name": "Cyril Pernet"
     5	            "name": "John Pellman"
     6	            "name": "Marcin Koculak"
     7	            "name": "Chris Gorgolewski"
     8	            "name": "Marko Havu"
     9	            "name": "Stefan Appelhoff"
    10	            "name": "AsykaKolbacyka"
    11	            "name": "Chuan-Peng Hu"
    12	            "name": "Marie-Luise Kieseler"
    13	            "name": "pjtoussaint"
    14	            "name": "Peer Herholz"
    15	            "name": "Robert Oostenveld"
    16	            "name": "Satrajit Ghosh"
    17	            "name": "Stephan Heunis"
    18	            "name": "yarikoptic-private"

So tributors also picked up yarikoptic-private, which was not in .all-contributorsrc. In many other projects it would also be useful to limit the list to some 'authoritative' source of contributors. Since all-contributors made it so easy to add them, it would be that source for many projects. So I wonder if .tributors should have a separate structure to list those to ignore, or something like that?

It will be particularly important for https://github.com/datalad-datasets/ohbm2020-posters/
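
A rough sketch of what such a filter could look like, assuming the candidate logins (e.g. from the GitHub API) and the logins from the authoritative file are already available as lists. The function limit_to_source and its ignore parameter are hypothetical, not existing tributors options.

```python
def limit_to_source(candidates, source_logins, ignore=()):
    """Keep only candidates that appear in the authoritative source.

    Logins are compared case-insensitively, as GitHub logins are.
    The order of `candidates` is preserved.
    """
    allowed = {login.lower() for login in source_logins}
    ignored = {login.lower() for login in ignore}
    return [
        login
        for login in candidates
        if login.lower() in allowed and login.lower() not in ignored
    ]
```

For the example above, passing the logins from .all-contributorsrc as source_logins would leave yarikoptic-private out of .zenodo.json.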

automated pull request workflow

The template in the repo for the automated pull requests was giving me lots of failures:

https://github.com/cpp-lln-lab/CPP_ROI/actions/runs/2518984368

Fixed by using another github action to open the PR at the end of the workflow:

https://github.com/cpp-lln-lab/CPP_ROI/blob/96350f5100057c607eaf8dae16363ea229e7f08a/.github/workflows/update-contributors.yml#L40

Side note:

Using this automated PR with certain pre-commit hooks (end of file fixer?), leads to an "interesting" bot version of a tug of war

cpp-lln-lab/CPP_ROI#53

Filtering per date

Dear Authors,
This is a question rather than an issue.
Can we make a list of contributors filtered by date, i.e. with a 'since' and a 'last', for any branch/dev/etc?
Thank you in advance
B.
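
As a rough sketch of the idea using plain git rather than any existing tributors option: git log accepts --since and --until, and the %aN format prints the author name with .mailmap applied. The function names below are made up for illustration.

```python
import subprocess
from collections import Counter


def git_log_command(since=None, until=None, rev="HEAD"):
    """Build a `git log` command that prints one author name per commit."""
    cmd = ["git", "log", "--format=%aN"]
    if since:
        cmd.append(f"--since={since}")
    if until:
        cmd.append(f"--until={until}")
    cmd.append(rev)  # a branch, tag, or any other revision
    return cmd


def count_authors(log_output):
    """Count commits per author from the output of the command above."""
    names = (line.strip() for line in log_output.splitlines())
    return Counter(name for name in names if name)


def contributors_between(since=None, until=None, rev="HEAD", cwd=None):
    """Run git in `cwd` and return a Counter of author name -> commit count."""
    result = subprocess.run(
        git_log_command(since, until, rev),
        cwd=cwd, capture_output=True, text=True, check=True,
    )
    return count_authors(result.stdout)
```

For example, contributors_between("2020-01-01", "2020-12-31", rev="dev") would count commit authors on a dev branch for that year. It only sees commit authors, not other kinds of contribution.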

cumulative vs "temporal"

This is a follow-up to #54 and bids-standard/bids-specification#627 (comment). Removing "contributions" from contributors who are no longer associated with some "dynamic entity" (such as a "steering committee") is somewhat out of line with the idea of all-contributors, which is "cumulative": it records all categories a person has contributed to up to some point in time. But it opens the way for establishing "collective names" (such as "Steering committee"), whose exact membership could then be deduced from a range of commits in the repository in which .all-contributors changes, and which might be needed to adequately reflect participation in publications etc.
But maybe such an approach should be applied to the rest of the contributors as well (especially if we start collecting data on issues/discussions), so that at each particular point in time we would get a list of contributors since the previous release, and for the "cumulative" list we would just analyze all the "released states" up to that point.

An alternative could be to keep a timeline for each type of contribution, thus not relying on the history, which could be recovered from git. But IMHO it would be not only more tedious but would also grow the file with all the time periods (even if updated/populated only at release points...).

Add support for update of lookup

Follow up #22 (comment) where for vborghesani and pjtoussaint no name was added. But there is a standard git feature to have .mailmap mapping different logins to desired ones. E.g. our simplistic one in OBC:

$> cat .mailmap 
Yaroslav O. Halchenko <[email protected]>
Yaroslav O. Halchenko <[email protected]>
Valentina Borghesani <[email protected]>
Marie-Luise Kieseler <[email protected]>
Cyril Pernet <[email protected]>
Cyril Pernet <[email protected]>
Stephan Heunis <[email protected]>
P.-J. Toussaint <[email protected]>

results in

$> git shortlog -sn | nl
     1	   111	Yaroslav O. Halchenko
     2	    64	Cyril Pernet
     3	    35	Valentina Borghesani
     4	    34	allcontributors[bot]
     5	    21	Adina Wagner
     6	     6	John Pellman
     7	     5	Marcin Koculak
     8	     4	Chris Filo Gorgolewski
     9	     4	Marko Havu
    10	     2	Stefan Appelhoff
    11	     1	Anna Tovchigrechko
    12	     1	Chuan-Peng Hu
    13	     1	Marie-Luise Kieseler
    14	     1	P.-J. Toussaint
    15	     1	Peer Herholz
    16	     1	Robert Oostenveld
    17	     1	Satrajit Ghosh
    18	     1	Stephan Heunis

zenodo: ability to add grant information based on information from orcid

zenodo allows annotating a record with the funding grants which supported the project.

So for datalad we have

  "grants": [
    {"id": "10.13039/100000001::1429999"}
  ],

where IIRC 10.13039/100000001 corresponds to NSF and 1429999 is the grant number.
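
Going by the example above, the id is the funder DOI and the grant number joined by "::". A small sketch of building such entries follows; the helper names are hypothetical and nothing here validates that the funder DOI or the award actually exists.

```python
def zenodo_grant_id(funder_doi, award_number):
    """Join a funder DOI and an award number as '<funder DOI>::<award>'."""
    funder = funder_doi.strip()
    award = str(award_number).strip()
    if not funder or not award:
        raise ValueError("both a funder DOI and an award number are required")
    return f"{funder}::{award}"


def grants_block(pairs):
    """Build the value of the "grants" key from (funder DOI, award) pairs."""
    return [{"id": zenodo_grant_id(funder, award)} for funder, award in pairs]
```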

That funder identifier is a proper DOI:
$> curl -i --head https://doi.org/10.13039/100000001
HTTP/2 302 
date: Tue, 04 Aug 2020 19:56:24 GMT
content-type: text/html;charset=utf-8
content-length: 209
set-cookie: __cfduid=d0fac7759c716bcc769a51e389d3b85dc1596570983; expires=Thu, 03-Sep-20 19:56:23 GMT; path=/; domain=.doi.org; HttpOnly; SameSite=Lax; Secure
vary: Accept
location: http://data.crossref.org/fundingdata/funder/10.13039/100000001
expires: Tue, 04 Aug 2020 20:27:10 GMT
cf-cache-status: DYNAMIC
cf-request-id: 045ca4f630000073d90d805200000001
expect-ct: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
strict-transport-security: max-age=31536000; includeSubDomains; preload
server: cloudflare
cf-ray: 5bdad769ecaa73d9-IAD

ORCID records also contain grant information, see e.g. my orcid record which would have

    "fundings": {
      "last-modified-date": {
        "value": 1579231189138
      },
      "group": [
        {
          "last-modified-date": {
            "value": 1437078901426
          },
          "external-ids": {
            "external-id": [
              {
                "external-id-type": "grant_number",
                "external-id-value": "1429999",
                "external-id-normalized": null,
                "external-id-normalized-error": null,
                "external-id-url": {
                  "value": "http://www.nsf.gov/awardsearch/showAward?AWD_ID=1429999&HistoricalAwards=false"
                },
                "external-id-relationship": "self"
              }
            ]
          },

and some more...
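
A sketch of pulling the grant numbers out of a record shaped like the excerpt above. It assumes only the keys visible in that excerpt ("group", "external-ids", "external-id", "external-id-type", "external-id-value"); the function name is hypothetical and the real ORCID response contains more fields than are handled here.

```python
def grant_numbers(fundings):
    """Collect grant numbers from a dict shaped like the "fundings" excerpt."""
    found = []
    for group in fundings.get("group") or []:
        external_ids = (group.get("external-ids") or {}).get("external-id") or []
        for ext in external_ids:
            if ext.get("external-id-type") == "grant_number":
                value = ext.get("external-id-value")
                if value and value not in found:  # keep order, skip repeats
                    found.append(value)
    return found
```

The funder DOI is not part of this excerpt, so it would still have to be looked up separately.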

IIRC it is a PITA though to figure out the DOI for the funder and to make sure that the id is correct (there is no validator for zenodo, but I was told there is a sandbox where a record could be uploaded), so that zenodo doesn't just blow up later on.

If in --interactive mode tributors (when adding a new contributor, or when triggered explicitly to look for grant information) could ask whether some grants should be added (as found for a specific ORCID record or for all ORCID records, and thus also list the people who share a given grant?) or ignored (now and forever, since not pertinent to the project), that would be a great help, and funders would definitely appreciate it ;) !

minor UI issue: "parser" specific options are not listed in --help

e.g.

$> tributors init zenodo       
Please provide the zenodo doi with --doi

but

$> tributors init zenodo --help
usage: tributors init [-h] [--force] [--repo REPO]
                      [{zenodo,allcontrib,codemeta,all} [{zenodo,allcontrib,codemeta,all} ...]]

positional arguments:
  {zenodo,allcontrib,codemeta,all}
                        Metadata file parsers to update or initialize.

optional arguments:
  -h, --help            show this help message and exit
  --force               If exists, overwrite existing .all-contributorsrc
  --repo REPO           The repository URI, if not exported to GITHUB_REPOSITORY

I guess --help could list the options for all the parsers and annotate which parser each one applies to?
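
One way this could look with argparse, as a sketch only: parser-specific options are declared up front in a named argument group, so they appear in --help under a heading that says which parser they apply to. The function build_init_parser is illustrative and not how tributors currently builds its command line; the --doi option is the one mentioned in the error message above.

```python
import argparse


def build_init_parser():
    """Build an `init` parser whose --help also lists parser-specific options."""
    parser = argparse.ArgumentParser(prog="tributors init")
    parser.add_argument(
        "parsers", nargs="*",
        choices=["zenodo", "allcontrib", "codemeta", "all"],
        help="Metadata file parsers to update or initialize.",
    )
    parser.add_argument("--force", action="store_true",
                        help="Overwrite an existing file.")
    parser.add_argument("--repo",
                        help="The repository URI, if not exported to GITHUB_REPOSITORY")

    # Options that only one parser uses go into a titled group.
    zenodo = parser.add_argument_group("zenodo options")
    zenodo.add_argument("--doi", help="The Zenodo DOI (zenodo parser only).")
    return parser
```

Printing build_init_parser().format_help() then shows --doi under a "zenodo options" heading next to the general options.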

GitHub pages search not working

I pushed a fix but it doesn't seem to be live, this is a reminder to check later. GitHub pages has been pretty disappointing lately. :/

a collection based on OBC: collect emails to .tributors records, may be RF .tributors to allow for options, github for update-lookup

I liked how .all-contributorsrc allows options to be stored consistently alongside the records; maybe that should be done for .tributors too? Use case: in the majority of (well, all open) projects, contributors' emails are available in .mailmap, in git commits, and possibly as public records from github, zenodo, orcid etc. They make it possible to "identify" contributors. But maybe in some scenarios that would be undesired? Then an option within .tributors telling it not to store emails would help.

Without knowing emails I am not sure how mailmap extractor works, and it didn't do expected magic for me:

$> tributors update-lookup mailmap            
INFO:   mailmap:Updating .tributors cache from .mailmap

$> grep pjtouss .tributors .all-contributorsrc
.tributors:    "pjtoussaint": {
.tributors:        "name": "pjtoussaint"
.all-contributorsrc:      "login": "pjtoussaint",
.all-contributorsrc:      "name": "pjtoussaint",
.all-contributorsrc:      "profile": "https://github.com/pjtoussaint",

$> grep pjtouss .mailmap
P.-J. Toussaint <[email protected]>

$> git log | grep pjtouss
    Merge pull request #83 from con/all-contributors/add-pjtoussaint
    docs: add pjtoussaint as a contributor
    Added mailmap entries for jsheunis and pjtoussaint
    @pjtoussaint, Please check if introduced name is the preferable form for you.
Author: P.-J. Toussaint <[email protected]>

I expect that github would have provided the association (that the pjtoussaint contributed following commits, for which I could look up email address within repo). BTW, I guess github should be added to update-lookup?

$> tributors update-lookup github 
usage: tributors update-lookup [-h] [{mailmap,allcontrib,codemeta,zenodo,unset} [{mailmap,allcontrib,codemeta,zenodo,unset} ...]]
tributors update-lookup: error: argument files: invalid choice: 'github' (choose from 'mailmap', 'allcontrib', 'codemeta', 'zenodo', 'unset')

since it seems these are the "sources" here, not necessarily destinations.

update-lookup: crashes with "ValueError: too many values to unpack" on .mailmap parsing

(git)lena:~datalad/datalad-master[master]git
$> tributors update-lookup
INFO:allcontrib:Updating .tributors cache from .all-contributorsrc
INFO:    zenodo:Updating .tributors cache from .zenodo.json
INFO:   mailmap:Updating .tributors cache from .mailmap
Traceback (most recent call last):
  File "/home/yoh/proj/tributors/venvs/dev3/bin/tributors", line 33, in <module>
    sys.exit(load_entry_point('tributors', 'console_scripts', 'tributors')())
  File "/home/yoh/proj/tributors/tributors/client/__init__.py", line 175, in main
    main(args, extra)
  File "/home/yoh/proj/tributors/tributors/client/lookup.py", line 52, in main
    client.update_resource(resources=resources, params=extra)
  File "/home/yoh/proj/tributors/tributors/main/__init__.py", line 97, in update_resource
    resource.update_lookup()
  File "/home/yoh/proj/tributors/tributors/main/parsers/mailmap.py", line 63, in update_lookup
    self.load_data()
  File "/home/yoh/proj/tributors/tributors/main/parsers/mailmap.py", line 41, in load_data
    name, email = line.split("<")
ValueError: too many values to unpack (expected 2)
(dev3) 1 15113 ->1.....................................:Wed 18 Aug 2021 09:53:20 AM EDT:.
(git)lena:~datalad/datalad-master[master]git
$> git describe           
0.14.7-544-g379d4e2f9
changes on filesystem:                                                                                                                                                                                               
 .github/workflows/update-contributors.yml | 2 +-
(dev3) 1 15114.....................................:Wed 18 Aug 2021 09:54:31 AM EDT:.
(git)lena:~datalad/datalad-master[master]git
$> tributors --version
0.0.19
changes on filesystem:                                                                                                                                                                                               
 .github/workflows/update-contributors.yml | 2 +-
(dev3) 1 15115.....................................:Wed 18 Aug 2021 09:54:34 AM EDT:.
(git)lena:~datalad/datalad-master[master]git
$> cat .mailmap 
Alejandro de la Vega <[email protected]>
Alex Waite <[email protected]> Alex Waite <[email protected]>
Andy Connolly <[email protected]> <[email protected]>
Anisha Keshavan <[email protected]>
Benjamin Poldrack <[email protected]> <[email protected]>
Christian Mönch <[email protected]> Christian Moench <[email protected]>
Christian Olaf Häusler <[email protected]> chris <[email protected]>
Dave MacFarlane <[email protected]> Dave MacFarlane <[email protected]>
Dave MacFarlane <[email protected]> Dave MacFarlane <[email protected]>
Debanjum Singh Solanky <[email protected]> Debanjum <[email protected]> debanjum <[email protected]>
Gergana Alteva <[email protected]> Gergana Alteva <[email protected]>
Jason Gors <[email protected]>
John T. Wodder II <[email protected]> <[email protected]>
Kusti Skytén <[email protected]> <[email protected]>
Michael Hanke <[email protected]> mih <mih>
Neuroimaging Community <[email protected]> blah <[email protected]> 
Neuroimaging Community <[email protected]> <[email protected]> 
Neuroimaging Community <[email protected]> unknown <[email protected]> 
Taylor Olson <[email protected]> Taylor Olson <[email protected]>
Torsten Stoeter <[email protected]>
Vanessa Sochat <[email protected]> 
Yaroslav Halchenko <[email protected]> 
Yaroslav Halchenko <[email protected]> <[email protected]>
Yaroslav Halchenko <[email protected]> <[email protected]>
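
The traceback points at name, email = line.split("<"), which fails as soon as a line contains more than one "<", as most lines in the .mailmap above do. A possible more tolerant parser is sketched below; parse_mailmap_line is a hypothetical helper, not the tributors implementation, and the addresses used with it should be treated as placeholders.

```python
import re

# Anything between angle brackets is treated as an email-like token.
EMAIL = re.compile(r"<([^<>]*)>")


def parse_mailmap_line(line):
    """Return (canonical name, [emails]) or None for blank/comment lines.

    The canonical name is the text before the first "<"; it is an empty
    string for lines that start directly with an email.
    """
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    emails = EMAIL.findall(line)
    if not emails:
        raise ValueError(f"no <email> found in mailmap line: {line!r}")
    name = line.split("<", 1)[0].strip()  # maxsplit=1 avoids the unpack error
    return name, emails
```

This keeps every address on the line, so entries with two or three emails map all of them to the same canonical name.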

TypeError: unhashable type: 'dict'

To mitigate #76 I switched to use the most recent tag 0.0.21-node-20 . Container built fine but execution crashed

https://github.com/datalad/datalad/actions/runs/5193213913/jobs/9363452090

.all-contributorsrc already exists.
tributors update unset --thresh 1 --skip-users manbat bhanuprasad14 yetanothertestuser bobknob23987 --allcontrib-type code
INFO:allcontrib:Updating .all-contributorsrc
INFO:allcontrib:Updating .tributors cache from .all-contributorsrc
INFO:github:Alejandro de la Vega: found more than 1 result, run with --interactive mode to select.
INFO:github:Alejandro de la Vega: found more than 1 result, run with --interactive mode to select.
INFO:github:Alejandro de la Vega: found more than 1 result, run with --interactive mode to select.
INFO:github:Christian Mönch: found more than 1 result, run with --interactive mode to select.
INFO:github:Christian Mönch: found more than 1 result, run with --interactive mode to select.
INFO:github:Matt Cieslak: found more than 1 result, run with --interactive mode to select.
INFO:github:Robin Schneider: found more than 1 result, run with --interactive mode to select.
INFO:github:Robin Schneider: found more than 1 result, run with --interactive mode to select.
INFO:github:Sin Kim: found more than 1 result, run with --interactive mode to select.
INFO:github:Sin Kim: found more than 1 result, run with --interactive mode to select.
INFO:github:Michael Burgardt: found more than 1 result, run with --interactive mode to select.
INFO:github:Michał Szczepanik: found more than 1 result, run with --interactive mode to select.
INFO:github:Michał Szczepanik: found more than 1 result, run with --interactive mode to select.
INFO:github:Taylor Olson: found more than 1 result, run with --interactive mode to select.
INFO:github:Taylor Olson: found more than 1 result, run with --interactive mode to select.
INFO:github:James Kent: found more than 1 result, run with --interactive mode to select.
INFO:github:James Kent: found more than 1 result, run with --interactive mode to select.
INFO:github:Chris Lamb: found more than 1 result, run with --interactive mode to select.
INFO:github:Chris Lamb: found more than 1 result, run with --interactive mode to select.
INFO:github:Matt McCormick: found more than 1 result, run with --interactive mode to select.
INFO:github:Matt McCormick: found more than 1 result, run with --interactive mode to select.
INFO:github:Vicky C Lau: found more than 1 result, run with --interactive mode to select.
INFO:github:Vicky C Lau: found more than 1 result, run with --interactive mode to select.
INFO:github:Austin Macdonald: found more than 1 result, run with --interactive mode to select.
INFO:github:Austin Macdonald: found more than 1 result, run with --interactive mode to select.
WARNING:tributors:allcontrib does not support updating from orcids.
WARNING:tributors:allcontrib does not support updating from email.
INFO:    zenodo:Updating .zenodo.json
INFO:    zenodo:Updating .tributors cache from .zenodo.json
Traceback (most recent call last):
  File "/opt/conda/bin/tributors", line 8, in <module>
    sys.exit(main())
  File "/opt/conda/lib/python3.10/site-packages/tributors/client/__init__.py", line 179, in main
    main(args, extra)
  File "/opt/conda/lib/python3.10/site-packages/tributors/client/update.py", line 65, in main
    client.update(
  File "/opt/conda/lib/python3.10/site-packages/tributors/main/__init__.py", line 127, in update
    client.update(thresh=thresh, from_resources=resources)
  File "/opt/conda/lib/python3.10/site-packages/tributors/main/parsers/zenodo.py", line 190, in update
    self.update_cache()
  File "/opt/conda/lib/python3.10/site-packages/tributors/main/parsers/base.py", line 141, in update_cache
    self.update_lookup()
  File "/opt/conda/lib/python3.10/site-packages/tributors/main/parsers/zenodo.py", line 221, in update_lookup
    value = lookup[cache]["orcid"][field]
TypeError: unhashable type: 'dict'
Not sure why it complains about more than 1 result, since a sample name has just one entry in .tributors:
❯ git grep 'James Kent'
.all-contributorsrc:            "name": "James Kent",
.tributors:        "name": "James Kent",
CHANGELOG.md:- James Kent (@jdkent)
README.md:      <td align="center" valign="top" width="14.28%"><a href="https://jdkent.github.io/"><img src="https://avatars.githubusercontent.com/u/12564882?v=4?s=100" width="100px;" alt="James Kent"/><br /><sub><b>James Kent</b></sub></a><br /><a href="https://github.com/datalad/datalad/commits?author=jdkent" title="Code">💻</a></td>
docs/source/changelog.rst:-  James Kent (@jdkent)

Add support to update --from

As discussed in #28, we also want to add support for updating from.

The functionality that we haven't implemented yet is being able to say something like:

$ tributors update --from mailmap
$ tributors update allcontrib --from zenodo

So what we would want to do is add a --from parser that can handle some input source of emails, orcids, etc, and then the class would be able to parse the source to expose emails, other metadata, and then be able to update the initial file with them. For example it would do the following:

$ tributors update allcontrib --from zenodo
# 1. read in and parse all contrib file
# 2. read in zenodo
# 3. update initial all contrib entries with whatever is discovered via zenodo (likely just emails would be the mapping attribute)
# 4. then run a standard update for all contrib, the entries from zenodo would be added.
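
The steps above could be sketched roughly like this, assuming both files have already been parsed into lists of dicts and that email is the mapping attribute. The function update_from is hypothetical, and how unmatched source entries should be treated is an assumption made for the sake of the example.

```python
def update_from(target_entries, source_entries, key="email"):
    """Merge source entries into target entries, matching on `key`.

    Existing values in the target win; the source only fills gaps.
    Source entries with no match are appended as new entries.
    """
    merged = [dict(entry) for entry in target_entries]  # do not mutate input
    by_key = {entry[key]: entry for entry in merged if entry.get(key)}
    for source in source_entries:
        match = by_key.get(source.get(key))
        if match is None:
            new_entry = dict(source)
            merged.append(new_entry)
            if new_entry.get(key):
                by_key[new_entry[key]] = new_entry
            continue
        for field, value in source.items():
            match.setdefault(field, value)  # only add fields that are missing
    return merged
```

Keeping the merge as a separate step from the standard update is in line with the modular approach mentioned in this issue.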

I think this feature request would meet the initial use case of wanting "allcontrib2zenodo" but it was important to implement them modularly to start.

allcontributors bot and tributors workflow seem to have a different opinion on spaces

https://github.com/datalad/datalad-gooey uses both the GitHub tributors action - THANKS! - and the all-contributors bot (to easily credit non-commit contributions). https://github.com/datalad/datalad-gooey/pull/122/files seems to suggest that the bot and the action have a different taste about spaces. In practice, this causes the action to run after a manual call to the bot.

Edit: I just realized I'm not running the latest version (0.0.19) - whoops.

zenodo: do not just return empty on "unauthorized" etc while doing orcid query

I was trying to run

(git)lena:~datalad/datalad[add/tributors]git
$> tributors --log-level DEBUG update zenodo
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): api.github.com:443
DEBUG:urllib3.connectionpool:https://api.github.com:443 "GET /repos/datalad/datalad/contributors HTTP/1.1" 200 None
INFO:    zenodo:Updating .zenodo.json
INFO:    zenodo:Updating .tributors cache from .zenodo.json
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): pub.orcid.org:443
DEBUG:urllib3.connectionpool:https://pub.orcid.org:443 "GET /v3.0/search/?q=given-names:+AND+family-name:glalteva HTTP/1.1" 401 None
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): pub.orcid.org:443
DEBUG:urllib3.connectionpool:https://pub.orcid.org:443 "GET /v3.0/search/?q=given-names:glalteva+AND+family-name: HTTP/1.1" 401 None

to only realize that 401 is unauthorized, so it should have just bailed out right away announcing that I forgot to provide token.
Code has

    if response.status_code != 200:
        return

so it just returns upon any unsuccessful attempt. It should be more specific IMHO. In the future we might even want to retry on some server-side failures (5xx codes) etc.
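
A sketch of what being more specific could mean: fail loudly on an authorization error, retry a few times on 5xx, and raise on anything else. To keep it self-contained, the request itself is passed in as a callable returning (status_code, payload); the names fetch_with_policy and OrcidAuthError are made up for this example and do not exist in tributors.

```python
import time


class OrcidAuthError(RuntimeError):
    """Raised when the API rejects the request as unauthorized."""


def fetch_with_policy(fetch, retries=3, delay=1.0, sleep=time.sleep):
    """Call `fetch()` and act on the HTTP status instead of silently returning.

    `fetch` is any zero-argument callable returning (status_code, payload).
    """
    if retries < 1:
        raise ValueError("retries must be at least 1")
    for attempt in range(1, retries + 1):
        status, payload = fetch()
        if status == 200:
            return payload
        if status in (401, 403):
            # No point in retrying: the token is missing or not accepted.
            raise OrcidAuthError(
                f"HTTP {status}: request not authorized, is a token exported?"
            )
        if 500 <= status < 600 and attempt < retries:
            sleep(delay * attempt)  # simple linear backoff before retrying
            continue
        raise RuntimeError(f"HTTP {status}: giving up after {attempt} attempt(s)")
```

With this, the 401 responses in the log above would stop the run at the first query instead of being repeated for every name.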

target use case: initiate .zenodo.json from information available

I know that it is WiP, but I just wanted to create a dedicated issue to know when I should try tributors, e.g. on https://github.com/con/open-brain-consent which has .all-contributorsrc. My blind attempt to "trick" tributors resulted in an error, and I have no clue where it came from (what was not found) or what I should do (configuration?):

(git-annex)lena:~/proj/open-brain-consent[master]
$> tributors update
.zenodo.json does not exist

$> echo '{}' > .zenodo.json

$> tributors update zenodo 
INFO:zenodo:Updating .zenodo.json
Response 404: Not Found, cannot retrieve contributors.

mysterious user was added to datalad

I guess would be tricky to figure out now how/why, but in datalad/datalad@011cb15

$> git grep -i bhanupr        
.all-contributorsrc:            "login": "bhanuprasad14",
.all-contributorsrc:            "name": "bhanuprasad14",
.all-contributorsrc:            "profile": "https://github.com/bhanuprasad14",
.tributors:    "bhanuprasad14": {
.tributors:        "name": "bhanuprasad14",
.tributors:        "blog": "https://github.com/bhanuprasad14"
CONTRIBUTING.md:INFO:allcontrib:⭐️ Found new contributor bhanuprasad14 in .all-contributorsrc

was added, but that user no longer exists on GitHub, and no activity is reported in the Google cache for that login
https://webcache.googleusercontent.com/search?q=cache:yAuBXNE0jawJ:https://github.com/bhanuprasad14+&cd=1&hl=en&ct=clnk&gl=us
and I see no contributions via

$> git shortlog -sn | grep -i bhanu  

or any issues for that login as harvested from git-bug: https://github.com/datalad/datalad-git-bug-dumps

But maybe that is because this user no longer exists. Still, I wonder what "code" contribution it was, and whether I should start to worry that some code was injected somewhere with "fake" author entries (hence no git shortlog -sn output). Maybe you @vsoch still have that run/log somewhere, which could shed some light?

selectivity about updating cache

Right now we are very generous about updating and using the cache, but we should probably follow some standard practices:

  • the cache should hold shared metadata across parsers
  • but we should loop over self.contributors instead of the cache, and then look up entries in the cache only if needed
  • the user should be able to disable the cache update as well.
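The points above can be sketched roughly as follows. This is only an illustration under assumptions: `merge_with_cache` and its `update_cache` flag are hypothetical, both arguments are assumed to be plain dicts keyed by login, and letting the parser's own fields win over cached ones is an arbitrary choice.

```python
def merge_with_cache(contributors, cache, update_cache=True):
    """Fill in missing fields for known contributors from the shared cache.

    Hypothetical helper: `contributors` and `cache` both map login -> dict.
    Only logins present in `contributors` are visited.
    """
    merged = {}
    for login, meta in contributors.items():
        entry = dict(meta)
        # Look in the cache only for fields the parser does not already have.
        for key, value in cache.get(login, {}).items():
            entry.setdefault(key, value)
        merged[login] = entry
        if update_cache:
            # Write the combined entry back so other parsers can reuse it.
            cache[login] = dict(entry)
    return merged
```

Cache entries for people who are not in the parser's contributor list are never pulled in, and passing `update_cache=False` leaves the cache untouched.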

hard to start going, crashes with not informative traceback

Why not just assume the same whenever there is no ORCID_TOKEN? Or is that not the reason for

$> ORCID_ID=0000-0003-3456-2493 ORCID_SECRET=wrong tributors update
INFO:allcontrib:Updating .all-contributorsrc
Traceback (most recent call last):
  File "/home/yoh/proj/tributors/venvs/dev3/bin/tributors", line 11, in <module>
    load_entry_point('tributors', 'console_scripts', 'tributors')()
  File "/home/yoh/proj/tributors/tributors/client/__init__.py", line 145, in main
    main(args, extra)
  File "/home/yoh/proj/tributors/tributors/client/update.py", line 53, in main
    client.update(parsers=parsers, repo=args.repo, params=extra, thresh=args.thresh)
  File "/home/yoh/proj/tributors/tributors/main/__init__.py", line 107, in update
    client.update(thresh=thresh)
  File "/home/yoh/proj/tributors/tributors/main/parsers/allcontrib.py", line 120, in update
    self.lookup = {x["login"]: x for x in self.data.get("contributors", [])}
  File "/home/yoh/proj/tributors/tributors/main/parsers/allcontrib.py", line 120, in <dictcomp>
    self.lookup = {x["login"]: x for x in self.data.get("contributors", [])}
KeyError: 'login'
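The traceback suggests that at least one entry under "contributors" has no "login" key. A more forgiving variant of that dict comprehension might look like the sketch below; `build_lookup` is a hypothetical helper, not the current tributors code, and returning the skipped entries is just one way to make the problem visible.

```python
def build_lookup(data):
    """Index contributor entries by login and collect entries without one.

    Hypothetical helper: `data` is the parsed .all-contributorsrc content.
    """
    lookup = {}
    skipped = []
    for entry in data.get("contributors", []):
        login = entry.get("login")
        if login:
            lookup[login] = entry
        else:
            # Keep the malformed entry so the caller can report it.
            skipped.append(entry)
    return lookup, skipped
```

The caller could then print a message naming the file and the entries without a login instead of crashing with a bare KeyError.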

Consider Orcid to retrieve metadata (requires hosted server)

@yarikoptic I'm looking into adding Orcid now, but this is problematic because it's an OAuth2 flow, which requires logging in and then generating a client id and secret. E.g., if you go to https://orcid.org/developer-tools this is the only option - I don't see any option to just generate a one-off token to use for GitHub actions, for example. How did you have in mind this would work? Requiring the user to generate an app and then log in with it does not seem like it would work.

Does the action build the docker image every time it runs?

I just saw that allcontributors-auto-detect in datalad repository workflow failed with

Build container for action use: '/home/runner/work/_actions/con/tributors/0.0.21/Dockerfile'.
  /usr/bin/docker build -t ed866e:be60600ffa264083ab4b2b48f65541e2 -f "/home/runner/work/_actions/con/tributors/0.0.21/Dockerfile" "/home/runner/work/_actions/con/tributors/0.0.21"
  Sending build context to Docker daemon  3.915MB
  ....
  Step 1/12 : FROM node:12
  12: Pulling from library/node
  f5196cdf2518: Pulling fs layer
  9bed1e86f01e: Pulling fs layer
  f44e4bdb3a6c: Pulling fs layer
  2f75d131f406: Pulling fs layer......
 Err:10 http://deb.debian.org/debian stretch/main amd64 Packages
    404  Not Found
  Ign:11 http://deb.debian.org/debian stretch-updates/main all Packages
  Err:12 http://deb.debian.org/debian stretch-updates/main amd64 Packages
    404  Not Found
  Reading package lists...
  W: The repository 'http://security.debian.org/debian-security stretch/updates Release' does not have a Release file.
  W: The repository 'http://deb.debian.org/debian stretch Release' does not have a Release file.
  W: The repository 'http://deb.debian.org/debian stretch-updates Release' does not have a Release file.
  E: Failed to fetch http://security.debian.org/debian-security/dists/stretch/updates/main/binary-amd64/Packages  404  Not Found
  E: Failed to fetch http://deb.debian.org/debian/dists/stretch/main/binary-amd64/Packages  404  Not Found
  E: Failed to fetch http://deb.debian.org/debian/dists/stretch-updates/main/binary-amd64/Packages  404  Not Found
  E: Some index files failed to download. They have been ignored, or old ones used instead.
  The command '/bin/sh -c /bin/bash -c "apt-get update && apt-get install -y wget bzip2 ca-certificates git &&     wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh &&     bash Miniconda3-latest-Linux-x86_64.sh -b -p /opt/conda &&     rm Miniconda3-latest-Linux-x86_64.sh"' returned a non-zero code: 100
...

but I am surprised that it tried to build it at all -- shouldn't it just fetch an image?

FWIW, according to https://github.com/datalad/datalad/actions/workflows/update-contributors.yml it started to fail over a month back.

More information on when interaction with external service fails + offline mode?

$> vim .tributors  # provided a full name for a person

and expected update to just update "name", but it seems it went for a full "interaction":

$> tributors update zenodo
INFO:zenodo:Updating .zenodo.json
Response 403: rate limit exceeded, cannot retrieve contributors.
  1. It would be nice to report which service failed. We could even "hard code" some information about the most common gotchas and inform the user here ("rate limit on communication with X exceeded. There should be only up to Y interactions. To increase the limit you need to obtain an API key Z from A, or just wait B and rerun again.").
  2. update should get an option to perform only offline updates (like here)
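A rough sketch of the first point, assuming the caller knows which service it was talking to. `describe_failure` and the `HINTS` table are hypothetical, and the hint texts are placeholders rather than verified limits for any particular service.

```python
# Hypothetical table of hints for common failures, keyed by (service, status code).
HINTS = {
    ("github", 403): "The rate limit may have been exceeded; provide an API token or wait and rerun.",
    ("orcid", 401): "No valid token was provided; export one before running.",
}


def describe_failure(service, status_code, reason):
    """Build an error message that names the failing service and adds a hint."""
    message = f"{service}: response {status_code}: {reason}, cannot retrieve contributors."
    hint = HINTS.get((service, status_code))
    if hint:
        message += " " + hint
    return message
```

For example, `describe_failure("github", 403, "rate limit exceeded")` would name the service and append the matching hint, while an unknown combination simply gets the plain message.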
