Comments (16)

Riccorl avatar Riccorl commented on September 26, 2024 3

Hey guys, apologies that you are seeing this behavior. To confirm, both @nzl-thu and @Adamits are deleting their data using the UI, and @Riccorl is deleting it using the API?

@Riccorl, could you send me the toy script of you trying to delete your data?

Could you guys also all send me your usernames so I can look into your accounts, as well as potentially escalate this to our engineering team?

Here is the script I used:

import argparse
import wandb

if __name__ == "__main__":
    arg_parser = argparse.ArgumentParser()
    arg_parser.add_argument("project_name", type=str, help="Name of the project")
    arg_parser.add_argument(
        "--dry_run", action="store_true", help="If true, don't delete anything"
    )
    args = arg_parser.parse_args()

    dry_run = args.dry_run

    project_name = args.project_name

    api = wandb.Api(overrides={"project": project_name, "entity": "riccorl"})
    runs = api.runs(project_name)

    print("Deleting checkpoints and models in runs")
    for run in runs:
        if run.state != "finished":
            continue
        for f in run.files():
            # Substring patterns that mark checkpoint and model files
            patterns = ("ckpt", "pt", "hf_model", "retriever", "document_index", "model")
            if any(p in f.name for p in patterns):
                print(f"DELETING {run.id}/{f.name}")
                if not dry_run:
                    f.delete()
                else:
                    print("DRY RUN: NOT DELETING")

    print("Deleting models in artifacts")
    project = api.project(project_name)
    for artifact_type in project.artifacts_types():
        for artifact_collection in artifact_type.collections():
            for version in api.artifacts(artifact_type.type, artifact_collection.name):
                if artifact_type.type == "model":
                    print(f"DELETING {version.name}")
                    if not dry_run:
                        version.delete(delete_aliases=True)
                    else:
                        print("DRY RUN: NOT DELETING")

Just FYI, I can see the prints ("DELETING ...") the first time I run the script on a project, but it doesn't print that line anymore after that.
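
A note on the file filter in the script above: a substring check such as "pt" in f.name also matches names like "scripts/prompt.txt", so it can delete more than checkpoints. The sketch below shows one stricter way to match, by file suffix or by parent directory name. The function name, the suffix set, and the directory set are assumptions chosen for illustration and are not part of the original script; it also assumes file names use forward slashes.

```python
from pathlib import PurePosixPath

# Assumed for illustration: adjust to whatever your runs actually store.
CHECKPOINT_SUFFIXES = {".ckpt", ".pt"}
CHECKPOINT_DIRS = {"hf_model", "retriever", "document_index", "model"}


def is_checkpoint_file(name, suffixes=CHECKPOINT_SUFFIXES, dir_names=CHECKPOINT_DIRS):
    """Return True if the name has a checkpoint suffix or sits under a model directory."""
    path = PurePosixPath(name)
    if path.suffix in suffixes:
        return True
    # Only parent directories are checked, not the file name itself.
    return any(part in dir_names for part in path.parts[:-1])
```

With this helper, the condition in the loop would become `if is_checkpoint_file(f.name):`. Running with --dry_run first is still the safest way to check what would be matched.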

from wandb.

ArtsiomWB avatar ArtsiomWB commented on September 26, 2024 1

@Riccorl, apologies it's taking so long to resolve this. Could you please write in to [email protected] so that we can talk privately about your account status? I can potentially help you with that there.


Adamits avatar Adamits commented on September 26, 2024

I am seeing exactly the same behavior. I would have thought that some caching mechanism could be causing this, but several days have passed and the app still says I have no more storage space.

This is a pretty large problem as it makes my account unusable.


nzl-thu avatar nzl-thu commented on September 26, 2024

I am seeing exactly the same behavior. I would have thought that some caching mechanism could be causing this, but several days have passed and the app still says I have no more storage space.

This is a pretty large problem as it makes my account unusable.

Yes! This is frustrating...


Riccorl avatar Riccorl commented on September 26, 2024

I've been deleting files through the Python API for a week but still see no changes in the web UI. I want to access my runs at some point...


ArtsiomWB avatar ArtsiomWB commented on September 26, 2024

Hey guys, apologies that you are seeing this behavior. To confirm, both @nzl-thu and @Adamits are deleting their data using the UI, and @Riccorl is deleting it using the API?

@Riccorl, could you send me the toy script of you trying to delete your data?

Could you guys also all send me your usernames so I can look into your accounts, as well as potentially escalate this to our engineering team?


Adamits avatar Adamits commented on September 26, 2024

Hi @ArtsiomWB

Mine seems to have started working sometime in the last few hours. I might still have a script inadvertently syncing model data, so I suspect I may need to mass delete artifacts again. In case it is still useful, my username is also adamits on W&B. My profile is at https://wandb.ai/adamits

Thanks!


nzl-thu avatar nzl-thu commented on September 26, 2024

Hi @ArtsiomWB

My profile is at https://wandb.ai/thu-n.
Meanwhile, could you please answer my first question as well?

Thank you!


ArtsiomWB avatar ArtsiomWB commented on September 26, 2024

Apologies for taking a long time to get back to you guys. Currently we are experiencing some unexpected behaviors regarding freeing up space, and we are sincerely sorry for the inconvenience. What happens right now is that after you free up your space the job gets added to the queue, and because of a very high number of people currently cleaning up their accounts, it takes longer than usual to update the storage that is displayed in the account.

Regarding @nzl-thu's question, I just tried it out on my side, and once you delete your run from that page, option a is what happens:

a) the entire run is removed, including the logged data (e.g., training loss).
So no metrics or artifacts + media files are saved.

Since it has been some time since I've gotten back to you guys, is everyone still seeing this behavior?


Riccorl avatar Riccorl commented on September 26, 2024

Thanks for the update!

Since it has been some time since I've gotten back to you guys, is everyone still seeing this behavior?

Yes, I still can't access my runs due to storage limits.


nzl-thu avatar nzl-thu commented on September 26, 2024

Hi @ArtsiomWB

Thank you for your response! In fact, a more urgent requirement for me is finding an efficient way to delete millions of saved images without impacting any logged data, such as training loss.

I initially considered using the web UI to quickly remove entire folders. However, this approach also deletes logged data when removing a run folder, and iterating through all images using the Python API is frustratingly slow, so I am now a little bit stuck.

Could you please suggest any possible solutions? Thank you!


ArtsiomWB avatar ArtsiomWB commented on September 26, 2024

@Riccorl, looking at your code, to confirm: you are trying to delete checkpoints and models in runs for a single project, right?


ArtsiomWB avatar ArtsiomWB commented on September 26, 2024

@nzl-thu, you could use a script like this:

import wandb

# Initialize the W&B API
api = wandb.Api()

# Replace <entity> with your actual entity name
entity = "<entity>"

# Define the file extensions you want to delete
image_extensions = [".png", ".jpg", ".jpeg", ".bmp", ".gif"]
media_extensions = [".mp4", ".mp3", ".wav", ".avi", ".mov"]
extensions_to_delete = image_extensions + media_extensions

# Iterate over all projects
for project in api.projects(entity):
    print(f"Processing project: {project.name}")
    
    # Iterate over all runs in the project
    for run in api.runs(f"{entity}/{project.name}"):
        print(f" - Processing run: {run.id}")
        
        # Get all files in the run
        files = run.files()
        
        # Delete files with the specified extensions
        for file in files:
            if any(file.name.endswith(ext) for ext in extensions_to_delete):
                print(f"   - Deleting file: {file.name}")
                file.delete()

Just be careful, because it goes over every single project in your entity and deletes all of the media files from them.
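
Given that warning, a more cautious variant is to limit the script to one project and add a dry-run mode, similar to the --dry_run flag in @Riccorl's script earlier in the thread. The following is only a sketch under those assumptions: the function names (is_media_file, delete_media, clean_project) are hypothetical, and it uses only the calls already shown above (wandb.Api(), api.runs(), run.files(), file.delete()). It still iterates file by file, so it does not address the speed concern.

```python
from typing import Iterable, List

# Same extensions as the script above, as a tuple so str.endswith can take it directly.
MEDIA_EXTENSIONS = (
    ".png", ".jpg", ".jpeg", ".bmp", ".gif",
    ".mp4", ".mp3", ".wav", ".avi", ".mov",
)


def is_media_file(name: str, extensions=MEDIA_EXTENSIONS) -> bool:
    """Return True if the file name ends with one of the extensions (case-insensitive)."""
    return name.lower().endswith(tuple(extensions))


def delete_media(runs: Iterable, dry_run: bool = True) -> List[str]:
    """Delete media files from the given runs and return the matched "run_id/file_name" paths.

    With dry_run=True (the default) nothing is deleted; the paths are only collected.
    """
    matched = []
    for run in runs:
        for f in run.files():
            if not is_media_file(f.name):
                continue
            matched.append(f"{run.id}/{f.name}")
            if not dry_run:
                f.delete()
    return matched


def clean_project(entity: str, project: str, dry_run: bool = True) -> List[str]:
    """Hypothetical entry point: process the runs of a single project only."""
    import wandb  # imported here so the helpers above work without wandb installed

    api = wandb.Api()
    return delete_media(api.runs(f"{entity}/{project}"), dry_run=dry_run)
```

Calling clean_project("<entity>", "<project>") first returns the list of files that would be removed; passing dry_run=False performs the deletion. Logged metrics are not touched, since only run files are deleted.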


Riccorl avatar Riccorl commented on September 26, 2024

@Riccorl, looking at your code, to confirm: you are trying to delete checkpoints and models in runs for a single project, right?

Yep, I confirm


Riccorl avatar Riccorl commented on September 26, 2024

Given the current issues, isn't it possible to grant access to runs in the meantime? I haven't been able to access my account for a month now.


Riccorl avatar Riccorl commented on September 26, 2024

@Riccorl, apologies it's taking so long to resolve this. Could you please write in to [email protected] so that we can talk privately about your account status? I can potentially help you with that there.

Sure, thanks for the help!
