brumaire-agency / howdusty

0.0 0.0 3.0 191 KB

License: MIT License

JavaScript 0.75% TypeScript 98.74% Shell 0.15% Dockerfile 0.35%

howdusty's People

Contributors

Forkers

robinstraub zoemeriet kds-js

howdusty's Issues

feat: add an onlydust module

OnlyDust possesses some data about contributors that howdusty can leverage to improve its metric and scoring systems. We need a module to centralize this logic. We have partial SQL read access to some tables.

Tasks

add an "onlydust" module
add an "onlydust.user.ts" file that defines an OnlydustUser model with an id attribute only
add an api that uses a postgresql client to connect to the onlydust database
expose a getUsers function that retrieves all onlydust users

The corresponding table is github_users.

chore: add issue templates

Add a template for new issues. The template should be inspired by open source repositories.

Tasks

include a "resolved issues" section
include a "time spent" section
do some research as to what sections should be embedded into an issue template

feat: add a "regularity" metric

Add a "regularity" metric that shows how frequently a user contributes. It should take the number of contributions per month and give more weight to more recent months.

Tasks

Add a "regularity" metric
Add a formula for calculating the metric

Note: The proposed formula uses each month's distance from the current month (e.g. the same month one year ago = 12). Divide each month's contribution count by this distance, then sum the computed values and divide by the appropriate factor.
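
The note above can be sketched as follows. This is only an illustration under stated assumptions: the function name computeRegularity is hypothetical, the input is assumed to be ordered from the most recent month to the oldest, the most recent month is assumed to have a distance of 1 (to avoid dividing by zero), and the "appropriate factor" is assumed to be the sum of the weights 1/distance.

```typescript
// Hypothetical helper, not taken from the howdusty codebase.
// monthlyCounts[0] is the month just before the current one (distance 1),
// monthlyCounts[11] is the same month one year ago (distance 12).
function computeRegularity(monthlyCounts: number[]): number {
  let weightedSum = 0;
  let weightTotal = 0;
  monthlyCounts.forEach((count, index) => {
    const distance = index + 1; // assumption: distances start at 1
    weightedSum += count / distance;
    weightTotal += 1 / distance;
  });
  // Assumption: the normalizing factor is the sum of the weights,
  // which turns the result into a weighted average of monthly counts.
  return weightTotal === 0 ? 0 : weightedSum / weightTotal;
}
```

With this choice of factor, a contributor with the same count every month gets exactly that count as a result, and recent activity weighs more than old activity.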

chore: disable husky for the version creation workflow

The version creation workflow fails (example) when there are too many commits.

This is not the intended behavior: husky should instead be disabled in the workflow. Most recommendations point towards adding a HUSKY=0 environment variable to disable the hooks installation.

fix: onlydust-import command not supporting a high amount of users

The onlydust:import command currently fails when there are more than 1000 users in the database. Fix the command for larger quantities and add logging to show the progress of the process.

Tasks

Show a progress (e.g. usernames being processed with a counter) to have a sense of progression and see where it fails
Add error handling and make the command able to be recovered

Note: rxjs is a good candidate here for managing the flow with retries; however, a loop with a retry mechanism can be sufficient for a first iteration.
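
A loop with a retry mechanism, as suggested in the note, could look like the sketch below. All names (importWithRetries, importUser, maxAttempts, log) are hypothetical, and importUser stands in for whatever the command actually does to import one user.

```typescript
// Hypothetical sketch: imports users one by one, logs a progress counter,
// retries each user a few times and collects the usernames that still fail.
async function importWithRetries(
  usernames: string[],
  importUser: (username: string) => Promise<void>,
  maxAttempts: number = 3,
  log: (message: string) => void = console.log,
): Promise<string[]> {
  const failed: string[] = [];
  for (let index = 0; index < usernames.length; index++) {
    const username = usernames[index] as string;
    log(`[${index + 1}/${usernames.length}] importing ${username}`);
    let done = false;
    for (let attempt = 1; attempt <= maxAttempts && !done; attempt++) {
      try {
        await importUser(username);
        done = true;
      } catch (error) {
        log(`attempt ${attempt}/${maxAttempts} failed for ${username}: ${String(error)}`);
      }
    }
    // A failing user does not abort the run; it is reported at the end.
    if (!done) failed.push(username);
  }
  return failed;
}
```

Returning the failed usernames is one way to make the command recoverable: a later run can be restricted to that list instead of starting over.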

feat: add a contributor entity

Context

HowDusty keeps track of contributor data. Some features need a centralized place for representing a contributor and its associated metrics, and a persistence layer for storing the related data. We selected TypeORM which has good support with NestJS.

The information to store is:

id (number)
username (string)
name (string)
avatar_url (string)

More columns will be added in subsequent issues.

Tasks

Install and configure TypeORM for NestJS
Define a contributors module and add a contributor entity
Define a contributors service that exposes a findAll function leveraging the contributors repository
Define a test using a mock repository for the findAll function
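
The service and mock repository tasks can be illustrated with a library-free sketch. In the application the repository would be provided by TypeORM and injected by NestJS; here a hypothetical ContributorRepository interface with a single find method stands in for it, and the sample data is invented for illustration.

```typescript
// Fields listed in the issue.
interface Contributor {
  id: number;
  username: string;
  name: string;
  avatar_url: string;
}

// Hypothetical minimal repository contract; the real one would come from the ORM.
interface ContributorRepository {
  find(): Promise<Contributor[]>;
}

class ContributorsService {
  private readonly repository: ContributorRepository;

  constructor(repository: ContributorRepository) {
    this.repository = repository;
  }

  // Returns every stored contributor by delegating to the repository.
  findAll(): Promise<Contributor[]> {
    return this.repository.find();
  }
}

// Mock repository returning fixed data, as a unit test would provide it.
const mockRepository: ContributorRepository = {
  find: async () => [
    { id: 1, username: 'octocat', name: 'The Octocat', avatar_url: 'https://example.com/octocat.png' },
  ],
};
```

Because the service only depends on the repository contract, a test can pass mockRepository to the constructor and assert on the result of findAll without a database.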

feat: add a "maxRank" property

Add a "maxRank" property to the contributor.

Tasks

Add a "maxRank" property to the contributor entity.
Update the scoring process to compute the maxRank for every contributor (the value should be the same)
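
A possible shape for this computation is sketched below. It assumes that maxRank is the total number of ranked contributors, which the issue does not state explicitly; the type and function names are hypothetical.

```typescript
interface ScoredContributor {
  username: string;
  score: number;
}

interface RankedContributor extends ScoredContributor {
  rank: number;
  maxRank: number;
}

// Hypothetical sketch: the highest score gets rank 1, and every contributor
// receives the same maxRank value.
function rankContributors(contributors: ScoredContributor[]): RankedContributor[] {
  const sorted = [...contributors].sort((a, b) => b.score - a.score);
  // Assumption: maxRank is the number of contributors being ranked.
  const maxRank = sorted.length;
  return sorted.map((contributor, index) => ({
    ...contributor,
    rank: index + 1,
    maxRank,
  }));
}
```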

feat: collect the number of contributions

Update the app to track the number of contributions done by a given contributor, to open source projects. An open source project is a public project with an open source license.

bug: the onlydust:import command fails

The onlydust:import command fails with an error pointing towards a concurrency issue in the onlydust.api SQL client connection.

Project management

Prepare the next iteration of HowDusty, onboard contributors, make a presentation of the latest release and much more.

feat: add contributed project count metric

Add "contributedProjectCount" metric, which tracks the number of unique projects each contributor has contributed to.
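
Counting unique projects per contributor can be sketched as follows. The Contribution shape and the function name are assumptions made for illustration; the issue does not say how contributions are represented.

```typescript
// Hypothetical input shape: one record per contribution.
interface Contribution {
  contributor: string;
  projectId: string;
}

// Returns, for each contributor, the number of distinct projects contributed to.
function contributedProjectCounts(contributions: Contribution[]): Map<string, number> {
  const projectsByContributor = new Map<string, Set<string>>();
  for (const { contributor, projectId } of contributions) {
    const projects = projectsByContributor.get(contributor) ?? new Set<string>();
    projects.add(projectId); // a Set ignores repeated contributions to the same project
    projectsByContributor.set(contributor, projects);
  }
  const counts = new Map<string, number>();
  projectsByContributor.forEach((projects, contributor) => {
    counts.set(contributor, projects.size);
  });
  return counts;
}
```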

bug: ignore github bots in onlydust import

Onlydust references github bots in their github_users table, which are imported on onlydust:import command run. Those usernames do not match github users, causing repeated errors in the synchronization process.

Tasks

Update the onlydust:import command to skip GitHub bots
The bots are named 'xxxxxx[bot]' and can be detected with a simple regex
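
The regex detection mentioned above could be as small as the sketch below. The names BOT_USERNAME_PATTERN, isGithubBot and skipBots are hypothetical; the only assumption taken from the issue is that bot usernames end with the literal suffix [bot].

```typescript
// Matches usernames ending with the literal "[bot]" suffix, e.g. "dependabot[bot]".
const BOT_USERNAME_PATTERN = /\[bot\]$/;

function isGithubBot(username: string): boolean {
  return BOT_USERNAME_PATTERN.test(username);
}

// Keeps only the usernames that do not look like bots.
function skipBots(usernames: string[]): string[] {
  return usernames.filter((username) => !isGithubBot(username));
}
```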

feat: add a POST endpoint for adding a contributor

Add a POST endpoint to allow consumers to add a new contributor to howdusty.

Tasks

Add a POST endpoint
Perform a synchronization on the added user
Perform a scoring/ranking of the userbase
Return the synced contributor

feat: add the "meanGrantPerProject" metric

Add the "meanGrantPerProject" metric, which tracks the mean of collected grants per project (total grants / project count)

Todo

Add a "meanGrantPerProject" value to the MetricName enum in the metrics module
Add a "meanGrantPerProject" function to the onlydust.api api
Update the "getMetricsForAll" function to call the function
Update the contributor entity in the contributors module to track the metric in the database
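
The computation itself (total grants / project count) is a one-liner; the sketch below mainly shows the division-by-zero guard. Returning 0 for a contributor with no project is an assumption, as the issue does not specify that case.

```typescript
// Hypothetical helper: mean of collected grants per project.
function meanGrantPerProject(totalGrants: number, projectCount: number): number {
  // Assumption: a contributor with no project gets 0 rather than NaN or Infinity.
  return projectCount === 0 ? 0 : totalGrants / projectCount;
}
```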

feat: separate the user model and its metrics in 2 tables

User data is currently stored in a single table. This simplified the user representation in the development early stages but leads to some difficulties in keeping concerns separated.

As a first effort to make the data model more scalable, separate the metrics in a specific "metrics" table.

Tasks

Create a "metrics.entity.ts" file to represent the metrics table on the ORM level (in the "metrics" module)
Update the general flow of the application to store the metrics in the correct table
Add a one-to-one relation between the 2 tables (the foreign key should be held by the metrics table)

fix: allow null names

GitHub allows (or has allowed) users to not set their name, resulting in potential null values for user names (not to be confused with usernames).

Update "contributor.entity.ts" to allow null values (add a nullable: true option to the name column's @Column({}) decorator), and update the user types accordingly.

fix: get user id instead of node_id from github

Github GraphQL API returns a node_id instead of the id, which conflicts with onlydust tracked id. Update the synchronization process so the user id is the correct value, which should be numerical.
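
One possible approach, sketched under assumptions: in the GitHub GraphQL schema, the id field of a User is the global node ID, while databaseId holds the numerical identifier. The query below requests databaseId, and the hypothetical toUser function maps the response to a structure with a numerical id. The GithubUserNode and User shapes are illustrative and may differ from the application's own types.

```typescript
// GraphQL query requesting the numerical `databaseId` rather than the node `id`.
const USER_QUERY = `
  query ($login: String!) {
    user(login: $login) {
      databaseId
      login
      name
      avatarUrl
    }
  }
`;

// Hypothetical shape of the `user` object returned by the query above.
interface GithubUserNode {
  databaseId: number | null;
  login: string;
  name: string | null;
  avatarUrl: string;
}

// Hypothetical target shape, mirroring the contributor fields.
interface User {
  id: number;
  username: string;
  name: string | null;
  avatar_url: string;
}

function toUser(node: GithubUserNode): User {
  if (node.databaseId === null) {
    throw new Error(`No numerical id returned for ${node.login}`);
  }
  return {
    id: node.databaseId,
    username: node.login,
    name: node.name,
    avatar_url: node.avatarUrl,
  };
}
```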

feat: add the "contributedRepositoryCount" metric

Add the "contributedRepositoryCount" metric to the app.

Tasks

Add a "contributedRepositoryCount" field to the contributor entity in the contributors module
Add the "contributedRepositoryCount" field to the User interface in the github module
Add a "ContributedRepositoryCount" implementation of the BaseMetric in the github module
Add the new metric to the GithubApi service
Update the tests

doc: describe the different modules in the README

Add a "modules" section to the project README, explaining what the different modules achieve

Modules to document

doc: list commands in the readme

The README.md is currently very sparse and does not detail how to interact with the application, especially through its main means: commands.

Update the README.md to explain how to run the commands. Also add a quick section to showcase the API endpoints.

build: setup a staging environment

Set up a kubernetes environment to deploy the app in staging.

Tasks

bug: the synchronization command fails past a given number of users

The synchronization command (contributors:sync) fetches information for every user in parallel. This causes a github rate-limiting error (labelled with a 403 http status code).

Update the

Tasks

~~Locate the code causing this behavior~~ (this is the "synchronizeUsers" function in the "synchronization.service.ts", L31)
Update the code to fetch user info sequentially and not in parallel
Run the synchronization command and check that it works properly, otherwise update the issue
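
The change from parallel to sequential fetching can be sketched as below. The function name fetchSequentially and the fetchUser parameter are hypothetical; fetchUser stands in for the call that retrieves one user's information.

```typescript
// Parallel version (the problematic pattern): every request starts at once.
//   const results = await Promise.all(usernames.map((u) => fetchUser(u)));

// Sequential version: each request starts only after the previous one has finished.
async function fetchSequentially<T>(
  usernames: string[],
  fetchUser: (username: string) => Promise<T>,
): Promise<T[]> {
  const results: T[] = [];
  for (const username of usernames) {
    results.push(await fetchUser(username));
  }
  return results;
}
```

The sequential version is slower, but it never has more than one request in flight, which reduces the risk of hitting the rate limit.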

Add a CI pipeline for the project

Add a github workflow that tests:

project installation
project build
code style (prettier + eslint)
tests

documentation: describe the project in the README.md

The README.md is currently lackluster and does not explain what the project achieves or how it works.

create and structure sections (introduction, features, architecture, getting started, roadmap)
add content about the vision, tech and what is here
leave room for evolutions (e.g. features will be added progressively)

feat: add a scoring command

Add a "ScoreContributorsCommand" command.

Tasks

Add a "ScoreContributorsCommand" command to the commands module.
Ensure it computes a score for every contributor and updates the contributors table

feat: setup 3 scorings

We want to have 3 different scores to allow assessing contributors based on their global metrics, their onlydust-specific metrics and their github-specific metrics.

Tasks

refactor the contributor entity to expose the following properties: githubScore, githubRank, globalScore, globalRank, onlydustScore, onlydustRank and remove the old score and rank
Update the ranking process to compute these 6 properties
Update the tests to reflect these changes

feat: add the "issuePullRequestRatio" metric

Add the "issuePullRequestRatio" metric to the app.

Tasks

Add an "issuePullRequestRatio" field to the contributor entity in the contributors module
Add the "issuePullRequestRatio" field to the User interface in the github module
Add an "IssuePullRequestRatio" implementation of the BaseMetric in the github module
Add the new metric to the GithubApi service
Update the tests

feat: add a "collectedGrant" metric

Track the total amount of grants a user has collected (in dollars). This information can be retrieved from the onlydust database.

feat: add a contributors endpoint

Add a contributor controller to allow external users to read information about contributors.

Tasks

Add a contributors controller to the contributors module
Expose a GET endpoint that returns every contributor
Add tests to ensure contributor data is returned, especially their score/rank

feat: add normalization capability

In order to compute a score based on several metrics, those metrics need to have their values normalized. A scalar metric should range from 0 to 1 so the subsequent work on computing scores with weights is not altered by metrics having different scales.

This issue is the first brick of the "scorer" module and should thus lay the foundations of the module.

Tasks

Add a "scorer" module to the app
Add the "normalization.service.ts" file exporting the "NormalizationService"
Add a "normalize" method that takes an array of contributors and returns an array of the same dimension with normalized (scaled) values
Add tests
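
A minimal sketch of such a normalize function is given below, assuming min-max scaling (the issue only requires values ranging from 0 to 1, so the scaling method is an assumption). The ContributorMetrics shape and the metricNames parameter are hypothetical.

```typescript
// Hypothetical shape: each contributor carries a map of metric name to raw value.
interface ContributorMetrics {
  username: string;
  metrics: Record<string, number>;
}

// Scales each listed metric to the [0, 1] range across all contributors (min-max scaling).
function normalize(contributors: ContributorMetrics[], metricNames: string[]): ContributorMetrics[] {
  const ranges = new Map<string, { min: number; max: number }>();
  for (const name of metricNames) {
    const values = contributors.map((contributor) => contributor.metrics[name] ?? 0);
    ranges.set(name, { min: Math.min(...values), max: Math.max(...values) });
  }
  return contributors.map((contributor) => {
    const scaled: Record<string, number> = { ...contributor.metrics };
    for (const name of metricNames) {
      const { min, max } = ranges.get(name)!;
      const value = contributor.metrics[name] ?? 0;
      // When every contributor has the same value the range is 0;
      // return 0 instead of dividing by zero (which would produce NaN).
      scaled[name] = max === min ? 0 : (value - min) / (max - min);
    }
    return { ...contributor, metrics: scaled };
  });
}
```

The input array is not mutated, and the output keeps the same length and order as the input.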

feat: create a commands module

Create a commands module that exposes commands to interact with the application. Use the commander package for defining commands. A second entrypoint to the program can be defined next to the main.ts to allow either booting the app or running a command.

Tasks

Create a new module named "commands"
Add a command for synchronizing a contributor by its username
Define a second entrypoint for the program

refactor: externalize the metrics in a specific module

Metrics have been historically tied to the github module. Now that metrics from other sources (SQL/Onlydust) are added, they should be externalized in a specific module named metrics.

test: add testing modules

The number of modules keeps growing, and so do the dependencies between them, making it increasingly difficult to bootstrap testing modules in the tests.

Add testing modules for every module, allowing tests to import them and centralizing the logic of composing them.

feat: create a synchronization module

Create a synchronization module that exposes functions to synchronize contributors. This module will be used in a subsequent PR to provide a command that allows updating a contributor info.

Tasks

Create a new module named "synchronization"
Create a synchronization.service.ts file that exposes methods to synchronize a user by their github username, fetching their data from the github module and updating the related contributor
Update the contributors module to allow creating/updating a contributor info

feat: add an endpoint for reading a contributor by its username.

Expose a GET "/contributors/:username" endpoint on the contributors controller to allow callers to read a contributor by its username.

refacto: use generic functions in all metrics

Reorganize all metrics to match the active-contribution-weeks folder:

Add a folder for the metric (put the .metric.ts and the .metric.spec.ts)
Add a types folder
Add a helpers folder
Use generic functions if necessary

Exploring and Documenting the HowDusty App

Research and understand the features and functions of the HowDusty application.
Analyze and describe the app's ability to determine the 'dustiness' or reputation of contributors, by collecting and processing their data from GitHub and OnlyDust.
Document the workings of the visualization tools provided by HowDusty, including the contributor profile, reputation leaderboard, and contributor proximity graph.
Explore and explain the underlying logic of the scoring system, which factors in parameters such as experience, ecosystem knowledge, and contribution quality and quantity.
Study the biases in the scoring system and their implications on the scoring outcome.
Understand and document the difference between metrics (scalar information) and perks (non-scalar information) in the data collected by the scoring system.
Detail the potential roadmap for HowDusty, which involves making the app functional, user-friendly, and fair by enabling user-defined reputation scoring rules.
Research and outline possible future enhancements of the app, such as new metrics, gamification features, and expansion beyond OnlyDust.
Collect, organize, and document essential resources, like GraphQL queries used within the app.
Overall, through research and documentation, contribute to the continuous evolution and improvement of the HowDusty application.

refactor: synchronize every user by default

The "contributors:sync" command currently synchronizes the contributors passed as command parameters. By default, it should synchronize every user in the database.

Todo

Add a "synchronizeUsers" method to the SynchronizationService. It should accept an array of usernames and, if none are specified, retrieve all users from the contributors service
Rename the "githubUser" method to "synchronizeUser" for more coherence
Refactor the SynchronizeContributorCommand command to match the changes

The goal is to have a central module (synchronization) for performing this kind of operation. The logic should be in the synchronization module, and the commands module should only act as an entry point. This simplifies implementing new ways to perform synchronization (e.g. through an API endpoint).
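
The default behavior described above can be sketched as follows. The SyncDependencies interface is hypothetical and only exists to keep the example self-contained; in the application, findAllUsernames would be backed by the contributors service.

```typescript
// Hypothetical dependencies, passed in so the sketch stays self-contained.
interface SyncDependencies {
  findAllUsernames: () => Promise<string[]>;
  synchronizeUser: (username: string) => Promise<void>;
}

// Synchronizes the given usernames, or every known user when none are given.
// Returns the list of usernames that were processed.
async function synchronizeUsers(deps: SyncDependencies, usernames: string[] = []): Promise<string[]> {
  const targets = usernames.length > 0 ? usernames : await deps.findAllUsernames();
  for (const username of targets) {
    await deps.synchronizeUser(username);
  }
  return targets;
}
```

A command or an API endpoint can then call the same function with or without usernames.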

feat: add the "activeContributionWeeks" metric

Add the "activeContributionWeeks" metric to the app.

Tasks

Add an "activeContributionWeeks" field to the contributor entity in the contributors module
Add the "activeContributionWeeks" field to the User interface in the github module
Add an "ActiveContributionWeeks" implementation of the BaseMetric in the github module
Add the new metric to the GithubApi service
Update the tests

chore: setup a manual release system

We need a manual release system to deliver versions of the application once its development reaches a given milestone.

Tasks

Use semantic-release to create a new release, consisting of a github tag and deployed to github (no need to push it to npm)
Update the project CHANGELOG.md and list every change for the release (commit history with conventional commits)
Provide a trigger mechanism

test: normalization.service.spec.ts assertions fail

A test is failing in the CI pipeline, here is the output:

 FAIL  src/scorer/normalization.service.spec.ts
  ● Test suite failed to run

    src/scorer/normalization.service.spec.ts:27:46 - error TS2554: Expected 2 arguments, but got 1.

    27       const normalizedContributors = service.normalize(contributors);
                                                    ~~~~~~~~~~~~~~~~~~~~~~~

      src/scorer/normalization.service.ts:29:5
        29     metrics: MetricName[],
               ~~~~~~~~~~~~~~~~~~~~~
        An argument for 'metrics' was not provided.

bug: the collectedGrant metric is inaccurate for some users

Some users who have received grants show a value of 0 for the "collectedGrant" metric. This should be investigated to see if it comes from the synchronization process or somewhere else (e.g. a different way of representing grants in the database).

feat: add an import command for onlydust users

Adding a user to howdusty currently requires running a command and specifying the usernames to track. Onlydust has over 1700 users, so this behavior is cumbersome. Add an "onlydust:import" command that imports onlydust users into howdusty.

Tasks

Add the onlydust module to the imports of the "commands" module
Add an "OnlydustImportCommand" command to the "commands" module
Write the command run function so it retrieves the users' github usernames from onlydust and calls an import on them

feat: fetch contributor info from github graphql api

Most contributor information can be retrieved from github. Create a github module that retrieves user info that can be mapped to the contributor.

This module will be leveraged in a subsequent issue to fetch data from github and save it as a contributor

Tasks

Create a new module named "github"
Add a file user.interface.ts representing the data we can pull from github (should match the data from the contributor)
Add a file github.api.ts querying the graphql API with a function to get a contributor info and transforming it to a structure matching the User interface.
Add a github.service.ts re-exposing the method from the api (apis should be private to the module; communication between modules goes through services)

refactor: add logging

NestJS provides a logger which helps handle logging with finer grain than relying on the javascript console object.

This issue is preliminary work for setting up a quality logging system with levels. The focus is on preserving or adding logs in the synchronization process, which are important for reporting progress in the synchronization command.

Tasks

Configure the Logger in the application
Refactor the logging in the synchronization command and its sub processes
Parse the codebase and ensure every reference to console.log is removed, either by refactoring it to use the Logger or by deleting it if the log does not make sense.

feat: add a scorer service

Add a scorer service to the scorer module. This service should follow a simple linear regression, accepting arbitrary weights (until a future iteration handles computing those weights) and producing a score for a contributor by factoring its metrics with the weights

Tasks

Add a "scorer.service.ts" file exporting a "Scorer" class
Define arbitrary weights for every metric
Add a method to the scorer that accepts a contributor and returns a score
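
A minimal sketch of such a scorer is shown below: the score is the weighted sum of the contributor's (already normalized) metrics. The weight values in the usage note and the metric names are arbitrary illustrations, and treating a missing metric as 0 is an assumption.

```typescript
type Weights = Record<string, number>;

// Hypothetical scorer: the score is the sum of weight * metric value.
class Scorer {
  private readonly weights: Weights;

  constructor(weights: Weights) {
    this.weights = weights;
  }

  score(metrics: Record<string, number>): number {
    return Object.entries(this.weights).reduce(
      // Assumption: a metric missing from the contributor counts as 0.
      (total, [name, weight]) => total + weight * (metrics[name] ?? 0),
      0,
    );
  }
}
```

For example, new Scorer({ commits: 0.5, reviews: 0.25 }).score({ commits: 1, reviews: 1 }) evaluates 0.5 * 1 + 0.25 * 1.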

feat: add mission count metric

Add a "missionCount" metric, which tracks the number of missions for each Onlydust contributor.

feat: add the "projectMaintainedCount" metric

Add the "projectMaintainedCount" metric to the app.

Tasks

Add a "projectMaintainedCount" field to the contributor entity in the contributors module
Add the "projectMaintainedCount" field to the User interface in the github module
Add a "ProjectMaintainedCount" implementation of the BaseMetric in the github module
Add the new metric to the GithubApi service
Update the tests

build: setup a cron service for the synchronization process

Set up a cron service for periodically performing a synchronization, leveraging the kubernetes config.

fix: the contributors:score command fails on second run

The score command cannot be run when the score and rank columns already contain values. It throws this error:
QueryFailedError: Unknown column 'NaN' in 'field list'

Running the command a second time should be possible.
