howdusty's People

Contributors

kds-js, robinstraub, semantic-release-bot, zoemeriet

howdusty's Issues

feat: add an onlydust module

OnlyDust possesses some data about contributors that howdusty can leverage to improve its metric and scoring systems. We need a module to centralize this logic. We have partial SQL read access to some tables.

Tasks

  • add an "onlydust" module
  • add an "onlydust.user.ts" that defines an OnlydustUser model with an id attribute only
  • add an api that uses a postgresql client to connect to the onlydust database
  • expose a getUsers function that retrieves all onlydust users

The corresponding table is github_users.
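
A minimal sketch of what the api could look like. The OnlydustApi class name, the Queryable interface and the id column name are assumptions made for illustration; the issue only names the table and the getUsers function. Any PostgreSQL client whose query method resolves to an object with a rows array would fit this shape.

```typescript
// Minimal shape of a PostgreSQL client (assumption: the real client exposes
// a compatible `query` method resolving to an object with a `rows` array).
interface Queryable {
  query(text: string): Promise<{ rows: Array<{ id: number | string }> }>;
}

// Model described in the tasks: an id attribute only.
interface OnlydustUser {
  id: number;
}

// Hypothetical api class; the `id` column name is an assumption.
class OnlydustApi {
  private readonly client: Queryable;

  constructor(client: Queryable) {
    this.client = client;
  }

  // Retrieves every row of the github_users table as an OnlydustUser.
  async getUsers(): Promise<OnlydustUser[]> {
    const result = await this.client.query("SELECT id FROM github_users");
    // Drivers may return large integers as strings, so coerce explicitly.
    return result.rows.map((row) => ({ id: Number(row.id) }));
  }
}
```

Injecting the client keeps the api testable with a fake object instead of a live database connection.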

chore: add issue templates

Add a template for new issues. The template should be inspired by those of open source repositories.

Tasks

  • include a "resolved issues" section
  • include a "time spent" section
  • do some research as to what sections should be embedded into an issue template

feat: add a "regularity" metric

Add a "regularity" metric that shows how frequently a user contributes. It should take the number of contributions per month, and give more weight to more recent months.

Tasks

  • Add a "regularity" metric
  • Add a formula for calculating the metric

Note: The proposed formula weights each month by its distance in months from the current month (e.g. the same month one year ago = 12): divide the given month's contribution count by this distance, then sum the computed values and divide by the appropriate factor.
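
One possible reading of this formula, as a sketch only. Two assumptions are made here that the issue does not settle: the current month is left out (its distance would be 0), and the "appropriate factor" is taken to be the sum of the weights, so that a constant monthly count yields that same count. The function name and input layout are hypothetical.

```typescript
// monthlyCounts[0] is the number of contributions one month ago,
// monthlyCounts[1] two months ago, and so on (the same month one year ago
// sits at index 11, i.e. a distance of 12).
function regularity(monthlyCounts: number[]): number {
  let weightedSum = 0;
  let weightTotal = 0;
  monthlyCounts.forEach((count, index) => {
    const distance = index + 1; // months between that month and the current one
    weightedSum += count / distance; // recent months weigh more
    weightTotal += 1 / distance;
  });
  // Assumption: normalize by the sum of the weights (0 when there is no data).
  return weightTotal === 0 ? 0 : weightedSum / weightTotal;
}
```

With this normalization, the same number of contributions scores higher when it was made recently than when it was made long ago.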

fix: onlydust-import command not supporting a high amount of users

The onlydust:import command currently fails when there are more than 1000 users in the database. Fix the command for larger quantities and add logging to show the progress of the process.

Tasks

  • Show progress (e.g. usernames being processed with a counter) to have a sense of progression and see where it fails
  • Add error handling and make the command recoverable

Note: rxjs is a good candidate here for managing flux with retries, however a loop with a retry mechanism can be sufficient for a first iteration.
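
The simpler option mentioned in the note, a loop with a retry mechanism, could look like the sketch below. The importWithRetry name, its parameters and the default of three attempts are illustrative assumptions, not part of the existing command.

```typescript
// Hypothetical helper: processes usernames one at a time, logs a progress
// counter, retries failed imports and returns the usernames that still failed.
async function importWithRetry(
  usernames: string[],
  importUser: (username: string) => Promise<void>,
  maxAttempts: number = 3,
  log: (message: string) => void = console.log,
): Promise<string[]> {
  const failed: string[] = [];
  for (let index = 0; index < usernames.length; index++) {
    const username = usernames[index];
    log(`[${index + 1}/${usernames.length}] importing ${username}`);
    let succeeded = false;
    for (let attempt = 1; attempt <= maxAttempts && !succeeded; attempt++) {
      try {
        await importUser(username);
        succeeded = true;
      } catch (error) {
        log(`attempt ${attempt}/${maxAttempts} failed for ${username}: ${String(error)}`);
      }
    }
    // Keep going after a permanent failure so one user cannot block the run.
    if (!succeeded) failed.push(username);
  }
  return failed;
}
```

Returning the failed usernames lets a later run be restricted to them, which is one way of making the command recoverable.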

feat: add a contributor entity

Context

HowDusty keeps track of contributor data. Some features need a centralized place for representing a contributor and its associated metrics, and a persistence layer for storing the related data. We selected TypeORM which has good support with NestJS.

The information to store is:

  • id (number)
  • username (string)
  • name (string)
  • avatar_url (string)

More columns will be added in subsequent issues.

Tasks

feat: add a "maxRank" property

Add a "maxRank" property to the contributor.

Tasks

  • Add a "maxRank" property to the contributor entity.
  • Update the scoring process to compute the maxRank for every contributor (the value should be the same)

feat: collect the number of contributions

Update the app to track the number of contributions made by a given contributor to open source projects. An open source project is a public project with an open source license.

Project management

Prepare the next iteration of HowDusty, onboard contributors, make a presentation of the latest release and much more.

bug: ignore github bots in onlydust import

Onlydust references github bots in their github_users table, and these are imported when the onlydust:import command runs. Those usernames do not match github users, causing repeated errors in the synchronization process.

Tasks

  • Update the onlydust:import command to skip github bots
  • The bots are named 'xxxxxx[bot]' and can be detected with a simple regex
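
A minimal sketch of such a detection, assuming the "[bot]" suffix is the only marker to look for. The function and constant names are hypothetical.

```typescript
// Matches usernames ending with the literal suffix "[bot]".
const GITHUB_BOT_PATTERN = /\[bot\]$/;

function isGithubBot(username: string): boolean {
  return GITHUB_BOT_PATTERN.test(username);
}

// Keeps only the usernames that do not look like bots.
function withoutBots(usernames: string[]): string[] {
  return usernames.filter((username) => !isGithubBot(username));
}
```

For example, withoutBots(["robinstraub", "dependabot[bot]"]) keeps only "robinstraub".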

feat: add a POST endpoint for adding a contributor

Add a POST endpoint to allow consumers to add a new contributor to howdusty.

Tasks

  • Add a POST endpoint
  • Perform a synchronization on the added user
  • Perform a scoring/ranking of the userbase
  • Return the synced contributor

feat: add the "meanGrantPerProject" metric

Add the "meanGrantPerProject" metric, which tracks the mean of collected grants per project (total grants / project count)

Todo

  • Add a "meanGrantPerProject" value to the MetricName enum in the metrics module
  • Add a "meanGrantPerProject" function to the onlydust.api api
  • Update the "getMetricsForAll" function to call the function
  • Update the contributor entity in the contributors module to track the metric in the database
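
The computation itself (total grants / project count) can be sketched as follows. The Grant shape is a hypothetical stand-in for whatever the onlydust tables actually expose, and returning 0 for a contributor without any project is an assumption.

```typescript
// Hypothetical representation of a grant received by a contributor.
interface Grant {
  projectId: string;
  amount: number;
}

// Mean of collected grants per project: total grants / project count.
function meanGrantPerProject(grants: Grant[]): number {
  const projectIds = new Set(grants.map((grant) => grant.projectId));
  if (projectIds.size === 0) {
    return 0; // assumption: no project means a metric of 0
  }
  const total = grants.reduce((sum, grant) => sum + grant.amount, 0);
  return total / projectIds.size;
}
```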

feat: separate the user model and its metrics in 2 tables

User data is currently stored in a single table. This simplified the user representation in the development early stages but leads to some difficulties in keeping concerns separated.

As a first effort to make the data model more scalable, separate the metrics in a specific "metrics" table.

Tasks

  • Create a "metrics.entity.ts" file to represent the metrics table on the ORM level (in the "metrics" module)
  • Update the general flow of the application to store the metrics in the correct table
  • Add a one-to-one relation between the 2 tables (the foreign key should be held by the metrics table)

fix: allow null names

Github allows (or has allowed) users to not set their name, resulting in potential null values for user names (not to be confused with usernames).

Update the "contributor.entity.ts" to allow null values (add a nullable: true property in the @Column({}) decorator of the name column), and update the user types accordingly.

fix: get user id instead of node_id from github

Github GraphQL API returns a node_id instead of the id, which conflicts with onlydust tracked id. Update the synchronization process so the user id is the correct value, which should be numerical.
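
One way to approach this, sketched under assumptions: the GraphQL User type exposes a numeric databaseId field next to the opaque id (the node ID), so the query can request it and the mapping can use it. The query text, the interface names and the toUser helper are illustrative and not taken from the codebase.

```typescript
// `id` is the opaque node ID; `databaseId` is the numeric identifier.
const USER_QUERY = `
  query ($login: String!) {
    user(login: $login) {
      id
      databaseId
      login
      name
      avatarUrl
    }
  }
`;

// Shape of the `user` object returned by the query above.
interface GithubGraphqlUser {
  id: string;
  databaseId: number | null;
  login: string;
  name: string | null;
  avatarUrl: string;
}

// Hypothetical target shape, following the contributor columns.
interface User {
  id: number;
  username: string;
  name: string | null;
  avatar_url: string;
}

function toUser(raw: GithubGraphqlUser): User {
  if (raw.databaseId === null) {
    throw new Error(`No numeric id available for ${raw.login}`);
  }
  return {
    id: raw.databaseId, // numeric id instead of the node ID
    username: raw.login,
    name: raw.name,
    avatar_url: raw.avatarUrl,
  };
}
```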

feat: add the "contributedRepositoryCount" metric

Add the "contributedRepositoryCount" metric to the app.

Tasks

  • Add a "contributedRepositoryCount" field to the contributor entity in the contributors module
  • Add the "contributedRepositoryCount" field to the User interface in the github module
  • Add a "ContributedRepositoryCount" implementation of the BaseMetric in the github module
  • Add the new metric to the GithubApi service
  • Update the tests

doc: list commands in the readme

The README.md is currently very sparse and does not detail how to interact with the application, especially through its main means: commands.

Update the README.md to explain how to run the commands. Also add a quick section to showcase the API endpoints.

build: setup a staging environment

Set up a kubernetes environment to deploy the app in staging.

Tasks

  • Add a Dockerfile for building a container
  • Add a deployment manifest
  • Add a service manifest
  • Add an ingress route manifest
  • Add a config manifest

bug: the synchronization command fails past a given number of users

The synchronization command (contributors:sync) fetches information for every user in parallel. This causes a github rate-limiting error (labelled with a 403 http status code).

Update the

Tasks

  • Locate the code causing this behavior (this is the "synchronizeUsers" function in the "synchronization.service.ts", L31)
  • Update the code to fetch user info sequentially and not in parallel
  • Run the synchronization command and check that it works properly, otherwise update the issue
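
The difference between the two approaches can be sketched as follows. The function name and the generic fetchUser parameter are assumptions; the actual "synchronizeUsers" implementation is not shown in this issue.

```typescript
// Fetches users one after the other so that only one request is in flight.
async function synchronizeSequentially<T>(
  usernames: string[],
  fetchUser: (username: string) => Promise<T>,
): Promise<T[]> {
  const results: T[] = [];
  for (const username of usernames) {
    // Awaiting inside the loop waits for each request before starting the
    // next, unlike Promise.all(usernames.map(fetchUser)), which starts
    // every request at once.
    results.push(await fetchUser(username));
  }
  return results;
}
```

Sequential fetching trades speed for predictability: the run takes longer, but the number of simultaneous requests no longer grows with the number of users.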

documentation: describe the project in the README.md

The README.md is currently lackluster and does not explain what the project achieves or how it works.

  • create and structure sections (introduction, features, architecture, getting started, roadmap)
  • add content about the vision, tech and what is here
  • leave room for evolutions (e.g. features will be added progressively)

feat: add a scoring command

Add a "ScoreContributorsCommand" command.

Tasks

  • Add a "ScoreContributorsCommand" command to the commands module.
  • Ensure it computes a score for every contributor and updates the contributors table

feat: setup 3 scorings

We want to have 3 different scores to allow assessing contributors based on their global metrics, their onlydust-specific metrics and their github-specific metrics.

Tasks

  • refactor the contributor entity to expose the following properties: githubScore, githubRank, globalScore, globalRank, onlydustScore, onlydustRank and remove the old score and rank
  • Update the ranking process to compute these 6 properties
  • Update the tests to reflect these changes
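
A sketch of how the six properties could be derived once the three scores are known. The interfaces are simplified stand-ins for the contributor entity, and giving equal scores the same rank is an assumption.

```typescript
// Rank 1 goes to the highest score; equal scores share a rank (assumption).
function computeRanks(scores: number[]): number[] {
  const sorted = [...scores].sort((a, b) => b - a);
  return scores.map((score) => sorted.indexOf(score) + 1);
}

interface ScoredContributor {
  username: string;
  githubScore: number;
  onlydustScore: number;
  globalScore: number;
}

interface RankedContributor extends ScoredContributor {
  githubRank: number;
  onlydustRank: number;
  globalRank: number;
}

// Ranks the contributors independently for each of the three scores.
function rankContributors(contributors: ScoredContributor[]): RankedContributor[] {
  const githubRanks = computeRanks(contributors.map((c) => c.githubScore));
  const onlydustRanks = computeRanks(contributors.map((c) => c.onlydustScore));
  const globalRanks = computeRanks(contributors.map((c) => c.globalScore));
  return contributors.map((contributor, index) => ({
    ...contributor,
    githubRank: githubRanks[index],
    onlydustRank: onlydustRanks[index],
    globalRank: globalRanks[index],
  }));
}
```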

feat: add the "issuePullRequestRatio" metric

Add the "issuePullRequestRatio" metric to the app.

Tasks

  • Add an "issuePullRequestRatio" field to the contributor entity in the contributors module
  • Add the "issuePullRequestRatio" field to the User interface in the github module
  • Add an "IssuePullRequestRatio" implementation of the BaseMetric in the github module
  • Add the new metric to the GithubApi service
  • Update the tests

feat: add a contributors endpoint

Add a contributor controller to allow external users to read information about contributors.

Tasks

  • Add a contributors controller to the contributors endpoint
  • Expose a GET endpoint that returns every contributor
  • Add tests to ensure contributor data is returned, especially their score/rank

feat: add normalization capability

In order to compute a score based on several metrics, those metrics need to have their values normalized. A scalar metric should range from 0 to 1 so the subsequent work on computing scores with weights is not altered by metrics having different scales.

This issue is the first brick of the "scorer" module and should thus set its foundations.

Tasks

  • Add a "scorer" module to the app
  • Add the "normalization.service.ts" file exporting the "NormalizationService"
  • Add a "normalize" method that takes an array of contributors and returns an array of the same dimension with normalized (scaled) values
  • Add tests
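
A sketch of the normalization step, assuming min-max scaling (the issue only requires values between 0 and 1) and a simplified representation of contributors as plain records of metric values. Mapping a metric to 0 when every contributor has the same value is also an assumption.

```typescript
// Simplified stand-in for a contributor: metric name -> raw value.
type MetricRecord = Record<string, number>;

// Returns a copy of the contributors where each listed metric is rescaled
// to the [0, 1] range with min-max scaling. The input is left untouched.
function normalize(contributors: MetricRecord[], metricNames: string[]): MetricRecord[] {
  const normalized = contributors.map((contributor) => ({ ...contributor }));
  for (const name of metricNames) {
    const values = contributors.map((contributor) => contributor[name]);
    const min = Math.min(...values);
    const max = Math.max(...values);
    const range = max - min;
    normalized.forEach((contributor, index) => {
      // Identical values would divide by zero, so they map to 0 (assumption).
      contributor[name] = range === 0 ? 0 : (values[index] - min) / range;
    });
  }
  return normalized;
}
```

For instance, raw values of 10, 20 and 30 for a metric become 0, 0.5 and 1.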

feat: create a commands module

Create a commands module that exposes commands to interact with the application. Use the commander package for defining commands. A second entrypoint to the program can be defined next to the main.ts to allow either booting the app or running a command.

Tasks

  • Create a new module named "commands"
  • Add a command for synchronizing a contributor by its username
  • Define a second entrypoint for the program

test: add testing modules

The number of modules is growing, and so are the ties between them, making it increasingly difficult to bootstrap testing modules in the tests.

Add testing modules for every module, allowing tests to import them and centralizing the logic of composing them.

feat: create a synchronization module

Create a synchronization module that exposes functions to synchronize contributors. This module will be used in a subsequent PR to provide a command that allows updating a contributor info.

Tasks

  • Create a new module named "synchronization"
  • Create a synchronization.service.ts file that exposes methods to synchronize a user by its github username, fetching its data from the github module and updating the related contributor
  • Update the contributors module to allow creating/updating a contributor info

refacto: use generic functions in all metrics

Reorganize all metrics to match the active-contribution-weeks folder:

  • Add a folder for the metric (put the .metric.ts and the .metric.spec.ts)
  • Add a types folder
  • Add a helpers folder
  • Use generic functions if necessary

Exploring and Documenting the HowDusty App

  • Research and understand the features and functions of the HowDusty application.
  • Analyze and describe the app's ability to determine the 'dustiness' or reputation of contributors, by collecting and processing their data from GitHub and OnlyDust.
  • Document the workings of the visualization tools provided by HowDusty, including the contributor profile, reputation leaderboard, and contributor proximity graph.
  • Explore and explain the underlying logic of the scoring system, which factors in parameters such as experience, ecosystem knowledge, and contribution quality and quantity.
  • Study the biases in the scoring system and their implications on the scoring outcome.
  • Understand and document the difference between metrics (scalar information) and perks (non-scalar information) in the data collected by the scoring system.
  • Detail the potential roadmap for HowDusty, which involves making the app functional, user-friendly, and fair by enabling user-defined reputation scoring rules.
  • Research and outline possible future enhancements of the app, such as new metrics, gamification features, and expansion beyond OnlyDust.
  • Collect, organize, and document essential resources, like GraphQL queries used within the app.
  • Overall, through research and documentation, contribute to the continuous evolution and improvement of the HowDusty application.

refactor: synchronize every user by default

The "contributors:sync" command currently synchronizes the contributors passed as command parameters. By default, it should synchronize every user in the database.

Todo

  • Add a "synchronizeUsers" method to the SynchronizationService. It should accept an array of usernames, and if none is specified, retrieve all users from the contributors service
  • Rename the "githubUser" method to "synchronizeUser" for more coherence
  • Refactor the SynchronizeContributorCommand command to match the changes

The goal is to have a central module (synchronization) for performing this kind of operations. The logic should be in the synchronization module, and the commands module only acts as an entry point. This behavior simplifies implementing new ways to perform synchronization (e.g. through an API endpoint)

feat: add the "activeContributionWeeks" metric

Add the "activeContributionWeeks" metric to the app.

Tasks

  • Add an "activeContributionWeeks" field to the contributor entity in the contributors module
  • Add the "activeContributionWeeks" field to the User interface in the github module
  • Add an "ActiveContributionWeeks" implementation of the BaseMetric in the github module
  • Add the new metric to the GithubApi service
  • Update the tests

chore: setup a manual release system

We need a manual release system to deliver versions of the application once its development reaches a given milestone.

Tasks

  • Use semantic-release to create a new release, constituted of a github tag and deployed to github (no need to push it to npm)
  • Update the project CHANGELOG.md and list every change for the release (commit history with conventional commits)
  • Provide a trigger mechanism

test: normalization.service.spec.ts assertions fail

A test is failing in the CI pipeline, here is the output:

 FAIL  src/scorer/normalization.service.spec.ts
  ● Test suite failed to run

    src/scorer/normalization.service.spec.ts:27:46 - error TS2554: Expected 2 arguments, but got 1.

    27       const normalizedContributors = service.normalize(contributors);
                                                    ~~~~~~~~~~~~~~~~~~~~~~~

      src/scorer/normalization.service.ts:29:5
        29     metrics: MetricName[],
               ~~~~~~~~~~~~~~~~~~~~~
        An argument for 'metrics' was not provided.


bug: the collectedGrant metric is inaccurate for some users

Some users who have received grants show a value of 0 for the "collectedGrant" metric. This should be investigated to see if it comes from the synchronization process or somewhere else (e.g. a different way of representing grants in the database)

feat: add an import command for onlydust users

Adding a user to howdusty currently requires running a command and specifying the usernames to track. Onlydust has over 1700 users so this behavior is cumbersome. Add an "onlydust:import" command that imports onlydust users into howdusty.

Tasks

  • Add the onlydust module to the imports of the "commands" module
  • Add an "OnlydustImportCommand" command to the "commands" module
  • Write the command run function so it retrieves the user github usernames from onlydust and calls an import on them

feat: fetch contributor info from github graphql api

Most contributor information can be retrieved from github. Create a github module allowing to retrieve user info that can be mapped to the contributor.

This module will be leveraged in a subsequent issue to fetch data from github and save it as a contributor

Tasks

  • Create a new module named "github"
  • Add a file user.interface.ts representing the data we can pull from github (should match the data from the contributor)
  • Add a file github.api.ts querying the graphql API with a function to get a contributor info and transforming it to a structure matching the User interface.
  • Add a github.service.ts reexposing the method from the api (apis should be private to the module, communication between modules goes through services)

refactor: add logging

NestJS provides a logger which helps handle logging with finer grain than relying on the javascript console object.

This issue is preliminary work for setting up a quality logging system with levels. A focus is put on preserving/adding logs in the synchronization process, which matters for reporting progress in the synchronization command.

Tasks

  • Configure the Logger in the application
  • Refactor the logging in the synchronization command and its sub processes
  • Parse the codebase and ensure every reference to console.log is removed, either by refactoring it to use the Logger or by deleting it if the log does not make sense.

feat: add a scorer service

Add a scorer service to the scorer module. This service should follow a simple linear regression, accepting arbitrary weights (until a future iteration handles computing those weights) and producing a score for a contributor by factoring its metrics with the weights

Tasks

  • Add a "scorer.service.ts" file exporting a "Scorer" class
  • Define arbitrary weights for every metric
  • Add a method to the scorer that accepts a contributor and returns a score
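
A sketch of such a scorer, computing a weighted sum of (normalized) metric values. The weight values, the plain-record representation of a contributor's metrics and the score method name are assumptions; only the "Scorer" class name comes from the tasks.

```typescript
// Simplified stand-in: metric name -> normalized value (or weight).
type MetricValues = Record<string, number>;

// Arbitrary example weights, to be replaced once real ones are computed.
const EXAMPLE_WEIGHTS: MetricValues = {
  activeContributionWeeks: 0.5,
  contributedRepositoryCount: 0.3,
  issuePullRequestRatio: 0.2,
};

class Scorer {
  private readonly weights: MetricValues;

  constructor(weights: MetricValues = EXAMPLE_WEIGHTS) {
    this.weights = weights;
  }

  // Sums weight * value over every weighted metric; a metric missing from
  // the contributor counts as 0 (assumption).
  score(metrics: MetricValues): number {
    let total = 0;
    for (const name of Object.keys(this.weights)) {
      total += this.weights[name] * (metrics[name] ?? 0);
    }
    return total;
  }
}
```

Because the weights are passed to the constructor, a future iteration that computes them can swap in new values without touching the scoring logic.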

feat: add the "projectMaintainedCount" metric

Add the "projectMaintainedCount" metric to the app.

Tasks

  • Add a "projectMaintainedCount" field to the contributor entity in the contributors module
  • Add the "projectMaintainedCount" field to the User interface in the github module
  • Add a "ProjectMaintainedCount" implementation of the BaseMetric in the github module
  • Add the new metric to the GithubApi service
  • Update the tests
