Comments (8)
@andinus I am assuming that you've done a shallow clone?!
If you have and you're still seeing performance issues, I can submit a patch (PR) for this issue. Here is what I have in mind:
- A simple script that runs:
  git repack && git prune-packed && git reflog expire --expire=1.month.ago && git gc --aggressive
- A GH workflow that runs it on a weekly cron schedule.
Thoughts, @manwar?
P.S. @andinus, if upstream does not want the proposed PR, please note that you can still run this on your local clone.
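The maintenance script proposed above could look roughly like the sketch below. It only wraps the four git commands already quoted in this comment; the function name `repo_maintenance`, the helper `run`, and the `DRY_RUN` switch are hypothetical additions for illustration, not part of the repository.

```shell
#!/bin/sh
# Hypothetical sketch: run the maintenance commands from the comment above
# against one repository. Set DRY_RUN=1 to print the commands instead of
# executing them.
repo_maintenance() {
    repo=${1:-.}

    # Helper (assumed name): either echo or execute a single git subcommand.
    run() {
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "git -C $repo $*"
        else
            git -C "$repo" "$@"
        fi
    }

    # Same order as the one-liner; stop at the first failing step.
    run repack &&
        run prune-packed &&
        run reflog expire --expire=1.month.ago &&
        run gc --aggressive
}
```

Calling `DRY_RUN=1 repo_maintenance .` first shows what would run; a plain `repo_maintenance .` then runs the commands in the current clone.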
from perlweeklychallenge-club.
- Update: I've just seen that the scripts directory has attempted this already. So the solution may not be upstream.
from perlweeklychallenge-club.
I am also in favour of doing some housekeeping. I use zsh with some git integration, and the 90k files the repository has by now slow down the shell. Could the "historic" commits perhaps be squashed automatically, so that we have only a single commit per week on master?
from perlweeklychallenge-club.
@andinus I think your recommendation (i.e. deleting stale dirs that contain only README files) is the best and quickest approach. I ran a test locally, and this is what I got:
Before running script/cleanup_readme_only:
╔ eax@nix:test_perlweeklychallenge-club(issue/7358)
╚ λ time gs
Refresh index: 100% (88731/88731), done.
On branch issue/7358
Untracked files:
(use "git add <file>..." to include in what will be committed)
script/cleanup_readme_only
nothing added to commit but untracked files present (use "git add" to track)
real 0m3.350s
user 0m1.562s
sys 0m2.123s
Running script/cleanup_readme_only:
This is how long the shell script took to run. However, this may be a one-time cost, since this run cleaned up everything accumulated over the entirety of the repo's history.
╔ eax@nix:test_perlweeklychallenge-club(issue/7358)
╚ λ time bash -c script/cleanup_readme_only
real 2m1.530s
user 4m33.764s
sys 3m12.803s
It got rid of 39k files (see below), but we could do better.
╔ eax@nix:test_perlweeklychallenge-club(issue/7358)
╚ λ git diff --name-only HEAD~ | wc -l
39066
Running git status after script/cleanup_readme_only:
Untracked files:
(use "git add <file>..." to include in what will be committed)
script/cleanup_readme_only
no changes added to commit (use "git add" and/or "git commit -a")
real 0m1.082s
user 0m0.602s
sys 0m0.817s
Great improvement, but the script is too slow (even with xargs), so I rewrote it in Go! See the speed improvement below.
╔ eax@nix:test_perlweeklychallenge-club(issue/7358)
╚ λ time bin/cleanup
real 0m2.658s
user 0m2.780s
sys 0m5.675s
Night and day!!!
@manwar: let me know if this is a desirable action, and I'll submit the PR (all the code and local tests are complete). The GH Actions workflow is below:
name: Cleanup Readmes From Repository
on:
  schedule:
    - cron: '0 0 * * 0' # Run at midnight every Sunday
jobs:
  cleanup:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v2
      - name: Setup Go
        uses: actions/setup-go@v2
        with:
          go-version: 1.17
      - name: Build Go Script
        run: go build -o bin/cleanup bin/main.go
      - name: Execute Cleanup
        run: ./bin/cleanup
from perlweeklychallenge-club.
> I am assuming that you've done a shallow clone?!
> git repack && git prune-packed && git reflog expire --expire=1.month.ago && git gc --aggressive
IIRC it was still slow even after a shallow clone and running the command quoted above. @ealvar3z, can you share the script? I'll try running it and report back.
from perlweeklychallenge-club.
Please be advised that I ran this on a separate copy of the repo: cp -r perlweeklychallenge-club/ test_perlweeklychallenge-club
Here's main.go:
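package main

import (
	"fmt"
	"os"
	"path/filepath"
	"runtime"
	"sync"
)

// isReadmeOnly reports whether dir contains exactly one entry named
// README or README.md.
func isReadmeOnly(dir string) bool {
	files, err := os.ReadDir(dir)
	if err != nil {
		// Unreadable (or already removed) directories are never deleted.
		return false
	}
	if len(files) != 1 {
		return false
	}
	name := files[0].Name()
	return name == "README" || name == "README.md"
}

// cleanupReadmeOnly removes every README-only directory received on pathChan.
func cleanupReadmeOnly(wg *sync.WaitGroup, pathChan <-chan string) {
	defer wg.Done()
	for path := range pathChan {
		if isReadmeOnly(path) {
			if err := os.RemoveAll(path); err != nil {
				fmt.Println("Error:", err)
			}
		}
	}
}

func main() {
	var wg sync.WaitGroup
	ncores := runtime.NumCPU()
	pathChan := make(chan string)

	// Start one worker per CPU core.
	for i := 0; i < ncores; i++ {
		wg.Add(1)
		go cleanupReadmeOnly(&wg, pathChan)
	}

	err := filepath.WalkDir(".", func(path string, d os.DirEntry, err error) error {
		if err != nil {
			// err is set when an entry cannot be read (for example, a worker
			// already removed it); d may be nil in that case, so skip it.
			return nil
		}
		if d.IsDir() {
			pathChan <- path
		}
		return nil
	})
	if err != nil {
		fmt.Println("Error:", err)
	}

	close(pathChan)
	wg.Wait()
}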
And the bash script:
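#!/bin/bash
# Remove every directory whose only entry is README or README.md.
cleanup_readme_only() {
    local num_cores
    num_cores=$(nproc)
    # Pass each path as "$1" instead of splicing {} into the command string,
    # so directory names with spaces or quotes are handled safely.
    find . -type d -print0 | xargs -0 -n 1 -P "$num_cores" bash -c '
        contents=$(ls -A "$1")
        if [ "$contents" = "README" ] || [ "$contents" = "README.md" ]; then
            rm -rf "$1"
        fi' _
}
cleanup_readme_only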
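Since both versions delete directories right away, a dry run that only lists the candidates may be a useful first step. The sketch below is an illustration, not part of the proposed PR: the function name `list_readme_only` is hypothetical, and skipping `.git` is an assumption (the scripts above walk every directory, including `.git`).

```shell
#!/bin/sh
# Hypothetical dry-run helper: print each directory whose only entry is
# README or README.md, without deleting anything.
list_readme_only() {
    root=${1:-.}
    # Prune .git (an assumption; the original scripts do not), then test
    # every remaining directory. The path is passed as "$1" so unusual
    # names are handled safely.
    find "$root" -name .git -prune -o -type d -exec sh -c '
        contents=$(ls -A "$1")
        if [ "$contents" = "README" ] || [ "$contents" = "README.md" ]; then
            printf "%s\n" "$1"
        fi' _ {} \;
}
```

Running `list_readme_only . | wc -l` gives a count of the directories a cleanup pass would remove, which can be checked before anything is deleted.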
from perlweeklychallenge-club.
It does improve performance: previously these took 71 and 16 seconds; they now take about 8 and 4 seconds.
andinus@~/d/o/C/perlweeklychallenge-club (master)> time git status > /dev/null
Refresh index: 100% (93480/93480), done.
________________________________________________________
Executed in    8.44 secs    fish           external
   usr time    1.65 secs    0.00 micros    1.65 secs
   sys time   14.08 secs    0.00 micros   14.08 secs
andinus@~/d/o/C/perlweeklychallenge-club (master)> time git status -uno > /dev/null
Refresh index: 100% (93480/93480), done.
________________________________________________________
Executed in    4.34 secs    fish           external
   usr time    1.01 secs    0.00 micros    1.01 secs
   sys time   10.64 secs    0.00 micros   10.64 secs
from perlweeklychallenge-club.