Comments (8)

ealvar3z commented on June 25, 2024

@andinus I am assuming that you've done a shallow clone?!

If you have and you're still suffering from performance issues, I can submit a patch (PR) for this issue. The following is what I have in mind:

A simple script that runs:

git repack && git prune-packed && git reflog expire --expire=1.month.ago && git gc --aggressive

Add it to a GH workflow that crons it every week.
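
For local clones, the same sequence could also be driven from a small Go program rather than a shell one-liner. This is only a sketch: the function name maintenanceSteps is hypothetical, the git arguments are copied verbatim from the command above, and like the && chain it stops at the first step that fails.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// maintenanceSteps returns the git invocations from the one-liner above, in order.
// (Hypothetical helper name; the arguments are copied verbatim.)
func maintenanceSteps() [][]string {
	return [][]string{
		{"git", "repack"},
		{"git", "prune-packed"},
		{"git", "reflog", "expire", "--expire=1.month.ago"},
		{"git", "gc", "--aggressive"},
	}
}

func main() {
	for _, step := range maintenanceSteps() {
		cmd := exec.Command(step[0], step[1:]...)
		// Forward git's output so progress and errors stay visible.
		cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
		if err := cmd.Run(); err != nil {
			// Mirror the && semantics: stop at the first failing step.
			fmt.Fprintf(os.Stderr, "%v failed: %v\n", step, err)
			os.Exit(1)
		}
	}
}
```

Run it from the root of the clone; it only affects that local repository.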

Thoughts, @manwar?

P.S.: @andinus, if upstream does not want the proposed PR, please note that you can do this on your local clone.

from perlweeklychallenge-club.

ealvar3z commented on June 25, 2024
  • Update

I've just seen that the scripts directory has attempted this already. So the solution may not be upstream.

rcmlz commented on June 25, 2024

I am also in favour of doing some housekeeping. I use zsh with some git integration, and the roughly 90k files we have by now slow down the shell. Could the "historic" commits perhaps be squashed automatically, so that we have only a single commit per week on master?

ealvar3z commented on June 25, 2024

@andinus I think your recommendation is the best and quickest approach (i.e. deleting stale dirs that contain only README files). I ran a test locally and this is what I got:

Before running script/cleanup_readme_only

╔ eax@nix:test_perlweeklychallenge-club(issue/7358)
╚ λ time gs
Refresh index: 100% (88731/88731), done.
On branch issue/7358
Untracked files:
  (use "git add <file>..." to include in what will be committed)
        script/cleanup_readme_only

nothing added to commit but untracked files present (use "git add" to track)

real    0m3.350s
user    0m1.562s
sys     0m2.123s

Running script/cleanup_readme_only

This is how long the shell script took to run. However, this may be just a one-time cost, since it deleted the entirety of the repo's history.

╔ eax@nix:test_perlweeklychallenge-club(issue/7358)
╚ λ time bash -c script/cleanup_readme_only

real    2m1.530s
user    4m33.764s
sys     3m12.803s

It got rid of 39k files (see below), but we could do better.

╔ eax@nix:test_perlweeklychallenge-club(issue/7358)
╚ λ git diff --name-only HEAD~ | wc -l
39066

Running git status after script/cleanup_readme_only

Untracked files:
  (use "git add <file>..." to include in what will be committed)
        script/cleanup_readme_only

no changes added to commit (use "git add" and/or "git commit -a")

real    0m1.082s
user    0m0.602s
sys     0m0.817s

Great improvement, but the script is too slow (even with xargs). So I rewrote it in Go! See the speed improvement below.

╔ eax@nix:test_perlweeklychallenge-club(issue/7358)
╚ λ time bin/cleanup

real    0m2.658s
user    0m2.780s
sys     0m5.675s

Night and day!!!

@manwar: let me know if this is a desirable action, and I'll submit the PR (all the code and local tests are complete). See the GH Actions workflow below:

name: Cleanup Readmes From Repository

on:
  schedule:
    - cron: '0 0 * * 0'  # Run at midnight (UTC) every Sunday

jobs:
  cleanup:
    runs-on: ubuntu-latest

    steps:
    - name: Checkout code
      uses: actions/checkout@v2

    - name: Setup Go
      uses: actions/setup-go@v2
      with:
        go-version: '1.17'  # quoted so YAML keeps it a string

    - name: Build Go Script
      run: go build -o bin/cleanup bin/main.go

    # Note: without a later commit-and-push step, the deletions stay on the runner.
    - name: Execute Cleanup
      run: ./bin/cleanup

andinus commented on June 25, 2024

I am assuming that you've done a shallow clone?!

git repack && git prune-packed && git reflog expire --expire=1.month.ago && git gc --aggressive

IIRC, even after a shallow clone and running this ^, it was slow. @ealvar3z Can you share the script? I'll try running that and report back.

ealvar3z commented on June 25, 2024

@andinus

Please be advised that I ran this on a separate copy of the repo: cp -r perlweeklychallenge-club/ test_perlweeklychallenge-club

Here's main.go:

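package main

import (
	"fmt"
	"os"
	"path/filepath"
	"runtime"
	"sync"
)

// isReadmeOnly reports whether dir contains exactly one entry named README or README.md.
func isReadmeOnly(dir string) bool {
	files, err := os.ReadDir(dir)
	if err != nil {
		// The directory may already be gone (e.g. its parent was removed).
		return false
	}
	return len(files) == 1 && (files[0].Name() == "README" || files[0].Name() == "README.md")
}

// cleanupReadmeOnly removes every README-only directory received on pathChan.
func cleanupReadmeOnly(wg *sync.WaitGroup, pathChan <-chan string) {
	defer wg.Done()
	for path := range pathChan {
		if isReadmeOnly(path) {
			if err := os.RemoveAll(path); err != nil {
				fmt.Println("Error:", err)
			}
		}
	}
}

func main() {
	var wg sync.WaitGroup
	ncores := runtime.NumCPU()
	pathChan := make(chan string)

	// Start one worker per CPU core.
	for i := 0; i < ncores; i++ {
		wg.Add(1)
		go cleanupReadmeOnly(&wg, pathChan)
	}

	err := filepath.WalkDir(".", func(path string, d os.DirEntry, err error) error {
		if err != nil {
			// d can be nil here, and a directory may vanish while workers delete.
			return nil
		}
		if d.IsDir() {
			pathChan <- path
		}
		return nil
	})

	if err != nil {
		fmt.Println("Error:", err)
	}

	close(pathChan)
	wg.Wait()
}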

And the bash script:

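#!/bin/bash

# Remove every directory whose only entry is README or README.md.
cleanup_readme_only() {
  local num_cores
  num_cores=$(nproc)
  # Pass each directory as "$1" instead of splicing {} into the command
  # string, so names with spaces or quotes are handled safely.
  find . -type d -print0 | xargs -0 -n 1 -P "$num_cores" bash -c \
  'contents=$(ls -A -- "$1" 2>/dev/null); \
  if [ "$contents" = "README" ] || [ "$contents" = "README.md" ]; \
  then rm -rf -- "$1"; fi' _
}

cleanup_readme_only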
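
Since both versions delete directories straight away, it may be worth listing the candidates first. Below is a minimal dry-run sketch under a few assumptions: the helper names readmeOnly and findReadmeOnly are hypothetical, it applies the same "exactly one entry named README or README.md" rule as above, and skipping .git is my own precaution rather than something the scripts above do.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// readmeOnly reports whether a directory listing consists of exactly one
// entry named README or README.md (hypothetical helper, same rule as above).
func readmeOnly(names []string) bool {
	return len(names) == 1 && (names[0] == "README" || names[0] == "README.md")
}

// findReadmeOnly walks root and returns README-only directories.
// It never deletes anything.
func findReadmeOnly(root string) ([]string, error) {
	var hits []string
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if !d.IsDir() {
			return nil
		}
		if d.Name() == ".git" {
			return filepath.SkipDir // assumption: never look inside git metadata
		}
		entries, err := os.ReadDir(path)
		if err != nil {
			return err
		}
		names := make([]string, 0, len(entries))
		for _, e := range entries {
			names = append(names, e.Name())
		}
		if readmeOnly(names) {
			hits = append(hits, path)
		}
		return nil
	})
	return hits, err
}

func main() {
	hits, err := findReadmeOnly(".")
	if err != nil {
		fmt.Fprintln(os.Stderr, "Error:", err)
		os.Exit(1)
	}
	// Print one candidate directory per line for review.
	for _, dir := range hits {
		fmt.Println(dir)
	}
}
```

Piping the output through wc -l gives a quick count of how many directories a real run would remove.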

andinus commented on June 25, 2024

It does improve performance: previously these took 71 and 16 seconds. They take about 8 and 4 seconds now.

andinus@~/d/o/C/perlweeklychallenge-club (master)> time git status > /dev/null
Refresh index: 100% (93480/93480), done.

________________________________________________________
Executed in    8.44 secs    fish           external
   usr time    1.65 secs    0.00 micros    1.65 secs
   sys time   14.08 secs    0.00 micros   14.08 secs

andinus@~/d/o/C/perlweeklychallenge-club (master)> time git status -uno > /dev/null
Refresh index: 100% (93480/93480), done.

________________________________________________________
Executed in    4.34 secs    fish           external
   usr time    1.01 secs    0.00 micros    1.01 secs
   sys time   10.64 secs    0.00 micros   10.64 secs
