brumaire-agency / howdusty
License: MIT License
OnlyDust possesses some data about contributors that howdusty can leverage to improve its metric and scoring systems. We need a module to centralize this logic. We have partial SQL read access to some tables.
getUsers: retrieves all onlydust users. The corresponding table is github_users.
Add a template for new issues. The template should be inspired by those used in open source repositories.
Add a "regularity" metric that shows how frequently a user contributes. It should take the number of contributions per month, and give more weight to more recent months.
Note: The proposed formula takes, for each month, its distance in months from the current month (e.g. the same month one year ago = 12), divides that month's number of contributions by this distance, then sums the computed values and divides the result by the appropriate factor.
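The formula in the note can be sketched as follows. This is only an illustration under stated assumptions: the MonthlyContributions shape and the function names are hypothetical, the current month (distance 0) is given a weight of 1 to avoid dividing by zero, and the "appropriate factor" is taken to be the sum of the weights. The issue does not specify either of these last two choices.

```typescript
// Hypothetical shape: the contributions made during one calendar month.
interface MonthlyContributions {
  year: number;
  month: number; // 1 to 12
  count: number;
}

// Number of months separating (year, month) from the reference month.
function monthDistance(year: number, month: number, refYear: number, refMonth: number): number {
  return (refYear - year) * 12 + (refMonth - month);
}

// Weighted average of monthly contribution counts, with weight 1 / distance.
// Assumption: the current month (distance 0) gets weight 1.
// Assumption: the normalizing factor is the sum of the weights.
function regularity(months: MonthlyContributions[], refYear: number, refMonth: number): number {
  let weightedSum = 0;
  let weightTotal = 0;
  for (const m of months) {
    const distance = monthDistance(m.year, m.month, refYear, refMonth);
    if (distance < 0) continue; // ignore months after the reference month
    const weight = 1 / Math.max(distance, 1);
    weightedSum += m.count * weight;
    weightTotal += weight;
  }
  return weightTotal === 0 ? 0 : weightedSum / weightTotal;
}
```

With this weighting, a contribution made twelve months ago counts for one twelfth of a contribution made this month or last month.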
The version creation workflow fails (example) when there are too many commits.
This is not the intended behavior: husky should instead be disabled in the workflow. Most recommendations point towards adding a HUSKY=0 environment variable to disable the hooks installation.
The onlydust:import command currently fails when there are more than 1000 users in the database. Fix the command for bigger quantities and add logging to show the progress of the process.
Note: rxjs is a good candidate here for managing the flow with retries; however, a loop with a retry mechanism can be sufficient for a first iteration.
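A loop with a retry mechanism, as suggested in the note, could look roughly like the sketch below. The FetchPage type, withRetry, importInBatches and their parameters are hypothetical names introduced for illustration. Fetching by offset and limit is an assumption about how the source data can be paged, not something the issue states.

```typescript
// Hypothetical fetcher: returns one page of items, or an empty array when done.
type FetchPage<T> = (offset: number, limit: number) => Promise<T[]>;

// Calls `task` until it succeeds or `maxAttempts` attempts have failed.
async function withRetry<T>(task: () => Promise<T>, maxAttempts: number): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await task();
    } catch (error) {
      lastError = error;
    }
  }
  throw lastError;
}

// Imports every page sequentially and logs the progress after each batch.
async function importInBatches<T>(
  fetchPage: FetchPage<T>,
  save: (items: T[]) => Promise<void>,
  batchSize: number,
  maxAttempts: number,
  log: (message: string) => void,
): Promise<number> {
  let imported = 0;
  for (;;) {
    const offset = imported;
    const page = await withRetry(() => fetchPage(offset, batchSize), maxAttempts);
    if (page.length === 0) break;
    await save(page);
    imported += page.length;
    log(`Imported ${imported} users so far`);
  }
  return imported;
}
```

Only the fetch is retried here; whether saving should also be retried depends on whether the save operation is safe to repeat.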
HowDusty keeps track of contributor data. Some features need a centralized place for representing a contributor and its associated metrics, and a persistence layer for storing the related data. We selected TypeORM which has good support with NestJS.
The information to store is:
More columns will be added in subsequent issues.
Add a "maxRank" property to the contributor.
Update the app to track the number of contributions made by a given contributor to open source projects. An open source project is a public project with an open source license.
The onlydust:import command fails with an error pointing towards a concurrency issue in the onlydust.api SQL client connection.
Prepare the next iteration of HowDusty, onboard contributors, make a presentation of the latest release and much more.
Add "contributedProjectCount" metric, which tracks the number of unique projects each contributor has contributed to.
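Counting unique projects per contributor can be sketched with a set per contributor. The Contribution shape and its field names below are hypothetical; the real data would come from whichever source the metric is built on.

```typescript
// Hypothetical shape of a single contribution record.
interface Contribution {
  contributor: string;
  projectId: string;
}

// Returns, for each contributor, the number of distinct projects contributed to.
function contributedProjectCount(contributions: Contribution[]): Map<string, number> {
  const projectsByContributor = new Map<string, Set<string>>();
  for (const { contributor, projectId } of contributions) {
    const projects = projectsByContributor.get(contributor) ?? new Set<string>();
    projects.add(projectId); // a Set ignores repeated project ids
    projectsByContributor.set(contributor, projects);
  }
  const counts = new Map<string, number>();
  projectsByContributor.forEach((projects, contributor) => {
    counts.set(contributor, projects.size);
  });
  return counts;
}
```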
Onlydust references github bots in their github_users table, which are imported when the onlydust:import command runs. Those usernames do not match github users, causing repeated errors in the synchronization process.
onlydust:import command to skip github bots
Add a POST endpoint to allow consumers to add a new contributor to howdusty.
Tasks
Add the "meanGrantPerProject" metric, which tracks the mean of collected grants per project (total grants / project count)
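The computation itself is a single division. The sketch below uses a hypothetical function name and assumes that a contributor with no project should get 0 rather than NaN or Infinity, which the issue does not specify.

```typescript
// Mean of collected grants per project: total grants / project count.
function meanGrantPerProject(totalGrants: number, projectCount: number): number {
  // Assumption: return 0 when there is no project, to avoid NaN or Infinity.
  if (projectCount <= 0) return 0;
  return totalGrants / projectCount;
}
```

Guarding the zero case keeps non-finite values out of the database columns that later feed the scoring.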
MetricName enum in the metrics
volumes onlydust.api api
contributors module to track the metric in the database
User data is currently stored in a single table. This simplified the user representation in the early stages of development but leads to some difficulties in keeping concerns separated.
As a first effort to make the data model more scalable, separate the metrics in a specific "metrics" table.
GitHub allows (or has allowed) users to not set their name, resulting in potential null values for user names (not to be confused with usernames).
Update the "contributor.entity.ts" to allow null values (add a nullable: true property in the @Column({}) annotation of the name column), and update the user types accordingly.
The GitHub GraphQL API returns a node_id instead of the id, which conflicts with the id tracked by onlydust. Update the synchronization process so the user id is the correct value, which should be numerical.
Add the "activeContributionWeeks" metric to the app.
Add a "modules" section to the project README, explaining what the different modules achieve
The README.md is currently very sparse and does not detail how to interact with the application, especially through its main means: commands.
Update the README.md to explain how to run the commands. Also add a quick section to showcase the API endpoints.
Set up a kubernetes environment to deploy the app in staging.
The synchronization command (contributors:sync) fetches information for every user in parallel. This causes a github rate-limiting error (labelled with a 403 http status code).
Update the
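One way to avoid firing every request at once is to process users in consecutive batches. The sketch below is a generic helper under stated assumptions: runInBatches and its parameters are hypothetical names, and the batch size that stays under the github rate limit is left to the caller.

```typescript
// Runs `task` over `items` in consecutive batches of `batchSize`,
// so that at most `batchSize` tasks are in flight at the same time.
async function runInBatches<T, R>(
  items: T[],
  batchSize: number,
  task: (item: T) => Promise<R>,
): Promise<R[]> {
  if (batchSize < 1) throw new Error("batchSize must be at least 1");
  const results: R[] = [];
  for (let start = 0; start < items.length; start += batchSize) {
    const batch = items.slice(start, start + batchSize);
    // Wait for the whole batch before starting the next one.
    const batchResults = await Promise.all(batch.map((item) => task(item)));
    results.push(...batchResults);
  }
  return results;
}
```

A batch size of 1 makes the synchronization fully sequential, which is the simplest option if throughput is not a concern.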
Add a github workflow that tests:
The README.md is currently lackluster and does not explain what the project achieves or how it works.
Add a "ScoreContributorsCommand" command.
We want to have 3 different scores to allow assessing contributors based on their global metrics, their onlydust-specific metrics and their github-specific metrics.
githubScore, githubRank, globalScore, globalRank, onlydustScore, onlydustRank and remove the old score and rank
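Each rank can be derived from its score in the same way for the three score families. The sketch below uses a hypothetical computeRanks function and assumes that equal scores share the same rank (1, 2, 2, 4), a tie-breaking rule the issue does not define.

```typescript
// Returns the 1-based rank of each score, highest score first.
// Assumption: equal scores share the same rank (1, 2, 2, 4).
function computeRanks(scores: number[]): number[] {
  const sorted = scores.slice().sort((a, b) => b - a);
  // The rank is the position of the first occurrence of the score.
  return scores.map((score) => sorted.indexOf(score) + 1);
}
```

The same function could be applied once to the github scores, once to the onlydust scores and once to the global scores.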
Add the "issuePullRequestRatio" metric to the app.
Track the amount of grants a user has collected (in dollars). This information can be retrieved from the onlydust database.
Add a contributor controller to allow external users to read information about contributors.
In order to compute a score based on several metrics, those metrics need to have their values normalized. A scalar metric should range from 0 to 1 so the subsequent work on computing scores with weights is not altered by metrics having different scales.
This issue is the first brick of the "scorer" module and should thus set the foundations of the brick.
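A common way to bring a scalar metric into the 0 to 1 range is min-max scaling, sketched below. The issue does not name a technique, so min-max is an assumption, as are the normalizeValues name and the choice to map every value to 0 when they are all identical.

```typescript
// Min-max scaling: maps the smallest value to 0 and the largest to 1.
function normalizeValues(values: number[]): number[] {
  if (values.length === 0) return [];
  const min = Math.min(...values);
  const max = Math.max(...values);
  const range = max - min;
  // Assumption: when all values are identical, map them to 0 to avoid 0 / 0 = NaN.
  if (range === 0) return values.map(() => 0);
  return values.map((value) => (value - min) / range);
}
```

For example, the values 10, 20 and 30 become 0, 0.5 and 1, whatever the original unit of the metric.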
Create a commands module that exposes commands to interact with the application. Use the commander package for defining commands. A second entrypoint to the program can be defined next to the main.ts to allow either booting the app or running a command.
Metrics have been historically tied to the github module. Now that metrics from other sources (SQL/Onlydust) are added, they should be externalized in a specific module named metrics.
The number of modules is growing, as are the ties between them, making it increasingly difficult to bootstrap testing modules in the tests.
Add testing modules for every module, allowing tests to import them and centralizing the logic of composing them.
Create a synchronization module that exposes functions to synchronize contributors. This module will be used in a subsequent PR to provide a command that allows updating a contributor's info.
Expose a GET "/contributors/:username" endpoint on the contributors controller to allow callers to read a contributor by its username.
Reorganize all metrics to match the active-contribution-weeks folder:
.metric.ts and the .metric.spec.ts)
The "contributors:sync" command currently synchronizes the contributors passed as command parameters. By default, it should synchronize every user in the database.
The goal is to have a central module (synchronization) for performing this kind of operation. The logic should be in the synchronization module, and the commands module should only act as an entry point. This behavior simplifies implementing new ways to perform synchronization (e.g. through an API endpoint).
Add the "activeContributionWeeks" metric to the app.
We need a manual release system to deliver versions of the application once its development reaches a given milestone.
A test is failing in the CI pipeline, here is the output:
FAIL src/scorer/normalization.service.spec.ts
● Test suite failed to run
src/scorer/normalization.service.spec.ts:27:46 - error TS2554: Expected 2 arguments, but got 1.
27 const normalizedContributors = service.normalize(contributors);
~~~~~~~~~~~~~~~~~~~~~~~
src/scorer/normalization.service.ts:29:5
29 metrics: MetricName[],
~~~~~~~~~~~~~~~~~~~~~
An argument for 'metrics' was not provided.
Some users who have received grants show a value of 0 for the "collectedGrant" metric. This should be investigated to see if it comes from the synchronization process or somewhere else (e.g. a different way of representing grants in the database).
Adding a user to howdusty currently requires running a command and specifying the usernames to track. Onlydust has over 1700 users, so this behavior is cumbersome. Add an "onlydust:import" command that imports onlydust users into howdusty.
Most contributor information can be retrieved from github. Create a github module that retrieves user info which can be mapped to the contributor.
This module will be leveraged in a subsequent issue to fetch data from github and save it as a contributor.
User interface.
NestJS provides a logger which helps handle logging at a finer grain than relying on the javascript console object.
This issue is preliminary work for setting up a quality logging system with levels. The focus is on preserving or adding logs in the synchronization process, since they are important for reporting the progress of the synchronization command.
console.log is removed, either by refactoring it to use the Logger or by deleting it if the log does not make sense.
Add a scorer service to the scorer module. This service should follow a simple linear regression, accepting arbitrary weights (until a future iteration handles computing those weights) and producing a score for a contributor by factoring its metrics with the weights.
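Factoring metrics with weights amounts to a weighted sum, that is, a linear combination of the metric values. The sketch below shows that operation only. The Metrics type and the score function are hypothetical names, and treating a missing metric as 0 is an assumption.

```typescript
// Metric values or weights, keyed by metric name.
type Metrics = Record<string, number>;

// Weighted sum of the metrics: sum over each weighted metric of value * weight.
// Assumption: a metric that has a weight but no value counts as 0.
function score(metrics: Metrics, weights: Metrics): number {
  let total = 0;
  for (const name of Object.keys(weights)) {
    total += (metrics[name] ?? 0) * weights[name];
  }
  return total;
}
```

The result is only comparable across contributors if the metric values have first been brought to a common scale.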
Add the "missionCount" metric, which tracks the number of missions for each Onlydust contributor.
Add the "projectMaintainedCount" metric to the app.
Set up a cron service for periodically performing a synchronization, leveraging the kubernetes config.
It's not possible to launch the score command when there is already a value in the score and rank columns; it throws this error:
QueryFailedError: Unknown column 'NaN' in 'field list'
We have to make it possible.