
Comments (6)

garlick commented on September 9, 2024

Pondering a unified utility... maybe:

flux archive create ARCHIVE-KEY [--mmap] PATH ...
flux archive remove ARCHIVE-KEY
flux archive list ARCHIVE-KEY [pattern]
flux archive get ARCHIVE-KEY [pattern]

where the create subcommand would work on any rank, unless the --mmap option is specified, in which case it would work on rank 0 only.

In other words, do away with the tags used in flux filemap and use KVS keys instead. What's actually stored under the keys could be debated, but it might be OK to move at least the metadata out of rank 0 broker memory into the KVS...
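
For concreteness, here's a rough sketch of how that might be used from a job or script (the key name, path, and pattern are purely illustrative, and the interface itself is still just a proposal):

$ flux archive create mydata --mmap /scratch/input    # rank 0 only (because of --mmap)
$ flux archive list mydata                            # any rank: list the staged files
$ flux archive get mydata '*.dat'                     # any rank: fetch matching files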


garlick commented on September 9, 2024

Storing files in the KVS might be another way to go: storage would be consumed on rank 0 and the content cache would be leveraged for parallel reads. The net result is not that different from copying files to storage on rank 0, mmapping them, and then fetching them through the content cache.
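
As a rough illustration of that idea, assuming flux kvs put/get accept raw values and that "-" reads the value from stdin (key and file names are made up):

$ flux kvs put --raw mydata.input.bin=- < input.bin    # rank 0: file bytes land in the content store
$ flux kvs get --raw mydata.input.bin > input.bin      # any rank: read back through the content cache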


grondo commented on September 9, 2024

One thought is that it might be nicer for users if there were one set of commands that works in both use cases here: distributing files from rank 0 vs. from other ranks. I wonder if we could offer a command that does the Right Thing in either case?

Another thought: when running many jobs, each of which uses this facility, content store usage on rank 0 could grow quickly, and there is no way to remove the archives.

Otherwise, I think the flux kvs archive approach could work and is perhaps a handy tool nonetheless (maybe someone wants to archive results or something in the KVS for provenance, etc.).
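
If a remove subcommand like the one sketched above existed, one way to contain that growth would be to key each archive by job and drop it when the job finishes (the key name and use of FLUX_JOB_ID are illustrative):

$ flux archive create job-$FLUX_JOB_ID.files input/     # stage files for this job
$ flux archive get job-$FLUX_JOB_ID.files               # other ranks fetch during the job
$ flux archive remove job-$FLUX_JOB_ID.files            # clean up when the job completes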


garlick commented on September 9, 2024

Great points!

If we go forward, then I agree, we probably should take a look at redesigning flux filemap (possibly renamed) to incorporate this rather than tucking it away in flux kvs.

A TODO for the prototype is to figure out how to reference the content blobs so the archive would be complete on a dump/restore. I thought maybe, if the key is data.foo, we could optionally write a data.foo.blobs directory containing keys that are just the blobref strings, pointing to the actual blobs. As it is, the archive references blobs that might not be included in the dump.
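
A sketch of what that layout might look like in the KVS (blobrefs abbreviated; the .blobs convention is just the idea floated above, not an implemented format):

data.foo                             archive object that references blobs by blobref
data.foo.blobs.sha1-aaf4c61d...      key named for a blobref; its value keeps that blob in a dump
data.foo.blobs.sha1-9993e364...      one such key per blob referenced by the archive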

The caveats would need to be documented, of course, but I'm liking this because it leverages a lot of existing work.


grondo commented on September 9, 2024

Yeah, nice work!


garlick commented on September 9, 2024

I think I have an OK solution to the Cray problem, where shell 0 puts data into an archive and the other shells take it out:

$ flux archive create [-k KEY] --no-force-primary PATH ...
$ flux archive extract [-k KEY] --no-force-primary --waitcreate [PATTERN]

but it makes me wonder whether we should have a programmatic interface for shell plugins, rather than requiring a shell plugin to exec a command.

Not sure what that would look like, but I thought I'd put the idea out here for morning-me, or anybody else who wants to weigh in 😃
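
For reference, a plugin that just exec'd the CLI would effectively run something like this (key and path are illustrative):

$ flux archive create -k stage-in --no-force-primary /tmp/input        # shell rank 0
$ flux archive extract -k stage-in --no-force-primary --waitcreate     # all other shell ranks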

