Comments (4)
Hey,
I just want to confirm that I am having the same issue. In my case, the total number of runs is 1200 (same as in the UI), but only 1163 run IDs are distinct and 37 are duplicated, as verified with this code snippet:
import wandb

api = wandb.Api()
runs = api.runs(project_name)  # project_name is "<entity>/<project>"
distinct_run_ids = set()
duplicate_run_ids = set()
print(f"Number of runs (total): {len(runs)}")
for run in runs:
    # An ID seen before is a duplicate; otherwise record it as distinct.
    if run.id in distinct_run_ids:
        duplicate_run_ids.add(run.id)
    else:
        distinct_run_ids.add(run.id)
print(f"Number of distinct run ids: {len(distinct_run_ids)}")
print(f"Number of duplicate run ids: {len(duplicate_run_ids)}")
With output:
Number of runs (total): 1200
Number of distinct run ids: 1163
Number of duplicate run ids: 37
I also verified that, compared to the UI, the runs list is missing as many IDs as it contains duplicate IDs. I did so by downloading the .csv from the UI and comparing its IDs with the IDs in the runs object. I did this to make sure the UI itself was not already displaying duplicate IDs.
Python: 3.10.14 / wandb: 0.15.12 on macOS 14
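The CSV comparison described above could be sketched roughly as follows. This is only an illustration, not the exact script used: the helper names and the "ID" column name are assumptions, so check the header of your own export before relying on it.

```python
import csv
from collections import Counter


def read_ids_from_csv(csv_path, id_column="ID"):
    """Read run IDs from a CSV export. `id_column` is an assumed column name."""
    with open(csv_path, newline="") as f:
        return [row[id_column] for row in csv.DictReader(f)]


def compare_run_ids(api_ids, ui_ids):
    """Return (duplicated, missing) IDs of the API listing relative to the UI export."""
    counts = Counter(api_ids)
    # IDs that the API listing returned more than once.
    duplicated = {run_id for run_id, n in counts.items() if n > 1}
    # IDs present in the UI export but absent from the API listing.
    missing = set(ui_ids) - set(counts)
    return duplicated, missing


# Toy data standing in for real run IDs: "b" is returned twice, "c" never.
duplicated, missing = compare_run_ids(["a", "b", "b", "d"], ["a", "b", "c", "d"])
print(sorted(duplicated), sorted(missing))
```

With real data, api_ids would be [run.id for run in runs] and ui_ids would come from read_ids_from_csv applied to the downloaded file.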
from wandb.
Hello @kotekjedi and @PhilippBordne, thank you both for flagging this. We expect to have a fix for this issue this coming June. As a workaround, please add this to your code:
runs = api.runs(
    path="<entity>/<project>",  # replace with your own entity and project
    order="+created_at",
)
Hope this helps. Thanks!
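As an extra safeguard alongside the workaround above, repeated entries could be dropped on the client side. The sketch below is an assumption-laden illustration, not part of the official fix: unique_runs is a hypothetical helper, it relies only on each run having an id attribute, and it cannot recover runs that are missing from the listing.

```python
from types import SimpleNamespace


def unique_runs(runs):
    """Yield each run once, keeping the first occurrence of every run ID."""
    seen = set()
    for run in runs:
        if run.id not in seen:
            seen.add(run.id)
            yield run


# Stand-ins for wandb run objects; only the `id` attribute is used here.
fake_runs = [SimpleNamespace(id=i) for i in ["a", "b", "b", "c", "a"]]
unique_ids = [run.id for run in unique_runs(fake_runs)]
print(unique_ids)
```

In practice the iterable passed in would be the result of api.runs(...) rather than the stand-in objects.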
from wandb.
Hi @kotekjedi, we tried to reproduce this but did not get the same result. Are you also seeing this in the UI, with runs being duplicated?
from wandb.
Hi @JoanaMarieL, thanks for reaching out!
In the UI everything is fine: runs are neither duplicated nor missing. However, when I try to download them, I do not get all of the runs. Some are missing and some are duplicated.
from wandb.