Comments (36)

imeoer avatar imeoer commented on July 16, 2024 2

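Is it really necessary to add a separate FileExists interface? GetFileMeta should be enough: for example, calling GetFileMeta on an object that does not exist only needs to return the error code codes.NotFound. Also, do all the interfaces have detailed error code definitions?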

from layotto.

wenxuwan avatar wenxuwan commented on July 16, 2024 1
  // Get file with stream.
  rpc GetFile(GetFileRequest) returns (stream GetFileResponse) {}

  // Put file with stream.
  rpc PutFile(stream PutFileRequest) returns (PutFileResponse) {}

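Wouldn't this be a bit better?

Agreed, it looks like the Get request does not need to be a stream.

Also, could you add store_name to the request parameters?

Yes, it's a mistake. For the PutFile interface, though, I think returning an empty response is OK.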

fengmk2 avatar fengmk2 commented on July 16, 2024 1

It could be adapted to support minio https://github.com/minio/minio , which exposes the S3 interface.

Oh, I see the repository already has minio-related code.

fengmk2 avatar fengmk2 commented on July 16, 2024 1

https://github.com/mosn/layotto/blob/main/docs/en/api_reference/api_reference_v1.md#spec.proto.runtime.v1.ListFileRequest
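ListFile does not take enough parameters. A single directory can hold a very large number of files, so they cannot all be returned at once. It needs pagination parameters like OSS has, and ListFileResp also needs to return the pagination cursor.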

wenxuwan avatar wenxuwan commented on July 16, 2024 1

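https://github.com/mosn/layotto/blob/main/docs/en/api_reference/api_reference_v1.md#spec.proto.runtime.v1.ListFileRequest ListFile does not take enough parameters. A single directory can hold a very large number of files, so they cannot all be returned at once. It needs pagination parameters like OSS has, and ListFileResp also needs to return the pagination cursor.

  • For prefix, use the form bucketName + "/" + prefix. If bucketName is empty, the SDK passes it as empty, giving the form /prefix. By default the first segment is always the bucketName as far as OSS is concerned, and the Layotto side takes the value before the / as the bucketName.
  • Pagination is doable. The minio v6 version currently in use cannot support pagination, but v7 can, so we can upgrade.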
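
The prefix convention in the first bullet could be sketched roughly as follows. This is only an illustration, not Layotto's actual code: the helper name splitBucketPath and the BucketPath type are hypothetical, and treating a name with no "/" as a bare bucket is an assumption the thread does not settle.

```typescript
// Result of splitting a "bucketName/prefix" style name.
interface BucketPath {
  bucket: string;
  prefix: string;
}

// Hypothetical helper: everything before the first "/" is the bucket,
// everything after it is the prefix. A leading "/" yields an empty bucket.
function splitBucketPath(name: string): BucketPath {
  const idx = name.indexOf("/");
  if (idx < 0) {
    // Assumption: with no separator, the whole name is the bucket.
    return { bucket: name, prefix: "" };
  }
  return { bucket: name.slice(0, idx), prefix: name.slice(idx + 1) };
}
```

With this sketch, "bucket1/a/b" splits into bucket "bucket1" and prefix "a/b", while "/a/b" gives an empty bucket and the same prefix.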

The list response is currently just a list of file names. Should file information be added? The common fields appear to be:

  • Size: the file size
  • Name: the file name
  • LastModified: the last modified time

wenxuwan avatar wenxuwan commented on July 16, 2024 1

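@wenxuwan There is also an interface for querying file metadata or checking whether a file exists that is worth a look: https://help.aliyun.com/document_detail/31984.htm?spm=a2c4g.11186623.0.0.77af5d7aqqsyje#reference-bgh-cbw-wdb

Otherwise, the only way to tell whether a file exists right now is to open a get stream and wait for it to throw, which is a poor experience for developers. Here is a JS example.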

// Throws an error when the file does not exist
await assert.rejects(
  async () => {
    const filepath = join(tmpfileDir, name);
    const stream = await client.file.get({
      storeName,
      name,
    });
    await pipeline(
      stream,
      createWriteStream(filepath),
    );
  },
  (err: any) => {
    // console.error(err);
    assert.equal(err.code, 13);
    assert.match(err.message, /StatusCode=404, ErrorCode=NoSuchKey/);
    return true;
  }
);

Yes, this has been in design recently and the interface is already defined. Let's review it together:

// Check whether a file/directory/bucket exists
rpc FileExists(FileExistsRequest) returns (FileExistsResponse){}
message FileExistsRequest {
  //
  string store_name = 1;
  // The name of file/directory/bucket you want to check
  string name = 2;
  // The metadata for user extension.
  map<string,string> metadata = 4;
}

message FileExistsResponse{
  bool success = 1;
}

wenxuwan avatar wenxuwan commented on July 16, 2024 1

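This iteration will make the following changes to the interface and the implementation:

  • New interfaces:
1. Check whether a file exists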

	rpc FileExists(FileExistsRequest) returns (FileExistsResponse){}

	message FileExistsRequest {
	  //
	  string store_name = 1;
	  // The name of file you want to check
	  string name = 2;
	  // The metadata for user extension.
	  map<string,string> metadata = 3;
	}

	message FileExistsResponse{
		  bool success = 1;
	}

2. Get a file's metadata

	rpc GetFileMeta(FileMetaRequest) returns (FileMetaResponse){}

	message FileMetaRequest{
		FileRequest request = 1;
	}

	message FileMetaResponse{
		FileMeta response = 1;
	}

	message FileMeta{
		map<string,string> metadata = 1;
	}
  • Modified interfaces:
1. Change the list response:
 Return detailed information for each file, and also return a marker to support paged queries.
	message FileInfo {
	    // The name of file
	    string file_name = 1;
	    // The size of file
	    int64 size = 2;
	    // The modified time of file
	    string last_modified = 3;
	    // The metadata for user extension.
	    map<string,string> metadata = 4;
	}

	message ListFileResp {
	  repeated FileInfo files = 1;
	  string marker = 2;
	}
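2. Change the list request:

   Add a marker to support pagination.
    message ListFileRequest {
      FileRequest request = 1;
      int32 page_size = 2;
      string marker = 3;
    }
  • Implementation:
  1. For prefix, use the form bucketName + "/" + prefix. If bucketName is empty, the SDK passes it as empty, giving the form /prefix. By default the first segment is always the bucketName as far as OSS is concerned, and the Layotto side takes the value before the / as the bucketName. The SDK then no longer needs to pass the bucket and prefix in metadata; it passes them directly in this form, and the component does the corresponding splitting.
  2. List supports paged queries.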
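
A caller might drive the marker-based paging proposed above with a loop like the one below. This is a sketch under stated assumptions, not the Layotto SDK: the camelCase field names, the synchronous ListFile callback, and the rule that an empty returned marker means "no more pages" are all assumptions made for illustration, and makeFakeLister is a hypothetical in-memory stand-in for a real store.

```typescript
// Assumed TypeScript mirrors of the proposed proto messages.
interface FileInfo {
  fileName: string;
  size: number;
  lastModified: string;
}
interface ListFileRequest {
  pageSize: number;
  marker: string;
}
interface ListFileResp {
  files: FileInfo[];
  marker: string;
}
type ListFile = (req: ListFileRequest) => ListFileResp;

// Keeps requesting pages until the returned marker is empty
// (assumed here to mean that nothing is left to list).
function listAll(listFile: ListFile, pageSize: number): FileInfo[] {
  const all: FileInfo[] = [];
  let marker = "";
  do {
    const resp = listFile({ pageSize, marker });
    all.push(...resp.files);
    marker = resp.marker;
  } while (marker !== "");
  return all;
}

// Hypothetical in-memory lister: the marker is the last name already returned.
function makeFakeLister(names: string[]): ListFile {
  const sorted = [...names].sort();
  return ({ pageSize, marker }) => {
    const rest = sorted.filter((n) => n > marker);
    const page = rest.slice(0, pageSize);
    const more = rest.length > pageSize;
    return {
      files: page.map((n) => ({ fileName: n, size: 0, lastModified: "" })),
      marker: more ? (page[page.length - 1] ?? "") : "",
    };
  };
}
```

A real client call would be asynchronous; the loop shape stays the same, with an await on each page.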

The changes above do not affect the GET, PUT, and DEL interfaces that are already in use.

wenxuwan avatar wenxuwan commented on July 16, 2024 1

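Is it really necessary to add a separate FileExists interface? GetFileMeta should be enough: for example, calling GetFileMeta on an object that does not exist only needs to return the error code codes.NotFound. Also, do all the interfaces have detailed error code definitions?

OK, one interface can serve both. As for error codes, gRPC's built-in codes should be enough to cover file operations. Special error codes have not been defined yet; like Dapr, we currently just return an internal error and put the error details in the message.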
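
The idea of reusing GetFileMeta as an existence check could look roughly like this on the client side. It is a sketch only: GetFileMeta here is a hypothetical synchronous callback rather than the real SDK call, and RpcError, makeRpcError, and the fake store are illustrative. The one fixed fact relied on is that the gRPC status code NOT_FOUND has the numeric value 5.

```typescript
// Numeric value of the gRPC NOT_FOUND status code.
const GRPC_NOT_FOUND = 5;

// Hypothetical shape of an error surfaced by a gRPC client.
interface RpcError {
  code: number;
  message: string;
}

// Hypothetical metadata lookup: returns metadata or throws an RpcError.
type GetFileMeta = (storeName: string, name: string) => Record<string, string>;

function makeRpcError(code: number, message: string): Error & RpcError {
  const err = new Error(message) as Error & RpcError;
  err.code = code;
  return err;
}

function isRpcError(err: unknown): err is RpcError {
  return typeof err === "object" && err !== null &&
    typeof (err as RpcError).code === "number";
}

// Maps NOT_FOUND to false; every other failure is rethrown unchanged.
function fileExists(getFileMeta: GetFileMeta, storeName: string, name: string): boolean {
  try {
    getFileMeta(storeName, name);
    return true;
  } catch (err) {
    if (isRpcError(err) && err.code === GRPC_NOT_FOUND) {
      return false;
    }
    throw err;
  }
}

// In-memory stand-in for a store, for illustration only.
const fakeStore: Record<string, Record<string, string>> = { "a.txt": { size: "3" } };
const fakeGetFileMeta: GetFileMeta = (_storeName, name) => {
  const meta = fakeStore[name];
  if (meta === undefined) {
    throw makeRpcError(GRPC_NOT_FOUND, "not found: " + name);
  }
  return meta;
};
```

This only works cleanly if the server returns NOT_FOUND for a missing object instead of a generic internal error, which is the point raised in the question above.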

bokket avatar bokket commented on July 16, 2024 1

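@zach030 @bokket Could you all help review the File API changes? The current API cannot meet user needs, so @wenxuwan is changing the File API (and will refactor the File API components along the way).

For HDFS, pagination requires specifying the number of entries to list yourself. Our implementation also lists at most 100 files in a directory per call, and that size has to be defined externally; the hdfs sdk above describes this in detail.
It looks like running CI for HDFS would also need an action. We wrote one over the summer; it does not support distributed mode, but it can serve as a reference.

Personally I think the S3 standard differs from the traditional file system standard. But if everything is abstracted uniformly into CRUD operations such as Write, Read, List, Metadata, Delete, Append, Create, and CreateDir, the amount of duplicated code is actually quite large (apart from the different semantics that come with the different standards). So we used templates to define an S3-compliant, standards-compatible file system and then generated the interfaces directly, which makes them easy to fill in. The services we have tested and support also include oss, minio, and aws, but configuration file support still seems to be in progress and I am not sure whether it is finished.

So if you are considering beyondstorage, wrapping one layer under the rpc is enough :) There is nothing that one layer of wrapping cannot solve; if that fails, add two layers :)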

bokket avatar bokket commented on July 16, 2024 1

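So we used templates to define an S3-compliant, standards-compatible file system and then generated the interfaces directly, which makes them easy to fill in

@bokket You mean a set of interfaces is generated automatically? That's amazing. Where can I see it? A link to learn from would be great.

This is an early analysis of the templates. My mentor later rewrote it himself, and I am not very familiar with the refactored code yet. But if you want to implement a service right away, just add the features and requirements you want to the toml and then generate from it.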

wenxuwan avatar wenxuwan commented on July 16, 2024 1

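I just had a look: the Dapr binding API now supports AWS S3 and Alibaba Cloud OSS. We could see whether it can be reused directly so that we don't have to build it ourselves. https://docs.dapr.io/reference/components-reference/supported-bindings/s3/ https://docs.dapr.io/reference/components-reference/supported-bindings/alicloudoss/

The same goes for the redis protocol; it looks usable for caches that speak the redis protocol. https://docs.dapr.io/reference/components-reference/supported-bindings/redis/

Dapr's bindings do not support streams, so they are unusable for large files. A file interface definitely needs streaming. We asked at the time for bindings to support streams, but nothing came of it. See: dapr/dapr#3338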

wenxuwan avatar wenxuwan commented on July 16, 2024

@alpha-baby hi buddy, any comments?

alpha-baby avatar alpha-baby commented on July 16, 2024

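I also find Dapr's bindings API hard to understand. Maybe I'm just not good enough to grasp something this abstract (🐶).

InvokeBinding is a unary RPC call, so it definitely cannot support features like OSS, where large files need streaming transfer.

I agree with this point, and I think it would be hard to push Dapr to change the API.

I think option 2 is entirely feasible for a scenario like OSS. Whether the option 2 API is general enough, I can't give much advice on (not enough experience).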
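
To make the streaming argument concrete, here is a small sketch of how a large payload could be fed to a client-streaming call in bounded pieces instead of one unary message. Everything named here is hypothetical: PutFileChunk is an assumed message shape, and the send callback merely stands in for writing one message onto a stream.

```typescript
// Hypothetical shape of one message on a client-streaming PutFile call.
interface PutFileChunk {
  storeName: string;
  name: string;
  data: Uint8Array;
}

// Splits a payload into fixed-size chunks and hands each one to `send`,
// which stands in for writing a message onto a client stream.
// Returns the number of chunks sent.
function sendInChunks(
  storeName: string,
  name: string,
  payload: Uint8Array,
  chunkSize: number,
  send: (chunk: PutFileChunk) => void
): number {
  if (chunkSize <= 0) {
    throw new Error("chunkSize must be positive");
  }
  let sent = 0;
  for (let offset = 0; offset < payload.length; offset += chunkSize) {
    // subarray creates a view, so no bytes are copied per chunk.
    send({ storeName, name, data: payload.subarray(offset, offset + chunkSize) });
    sent++;
  }
  return sent;
}
```

With a unary call the whole payload has to fit in a single message; with chunks like these, the size of each message is bounded by chunkSize regardless of the file size.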

alpha-baby avatar alpha-baby commented on July 16, 2024
  // Get file with stream.
  rpc GetFile(GetFileRequest) returns (stream GetFileResponse) {}

  // Put file with stream.
  rpc PutFile(stream PutFileRequest) returns (PutFileResponse) {}

Wouldn't this be a bit better?

seeflood avatar seeflood commented on July 16, 2024
  // Get file with stream.
  rpc GetFile(GetFileRequest) returns (stream GetFileResponse) {}

  // Put file with stream.
  rpc PutFile(stream PutFileRequest) returns (PutFileResponse) {}

Wouldn't this be a bit better?

Agreed, it looks like the Get request does not need to be a stream.

Also, could you add store_name to the request parameters?

ujjboy avatar ujjboy commented on July 16, 2024

In addition to put() and get() , there are some methods such as list(), exist(), remove().

alpha-baby avatar alpha-baby commented on July 16, 2024
GetFileRequest
GetFileResponse
PutFileRequest
PutFileResponse

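Judging by the current design, these objects have rather few fields, which probably will not be enough.
We need to survey several more cloud vendors or open-source OSS projects and then produce something like a comparison table.
Similar to this issue: dapr/dapr#2988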

alpha-baby avatar alpha-baby commented on July 16, 2024

In addition to put() and get() , there are some methods such as list(), exist(), remove().

good!!

wenxuwan avatar wenxuwan commented on July 16, 2024

In addition to put() and get() , there are some methods such as list(), exist(), remove().

I think we can follow the AWS S3 protocol.

The common operations in the protocol are as follows:

Create a bucket – Create and name your own bucket in which to store your objects.

Write an object – Store data by creating or overwriting an object. When you write an object, you specify a unique key in the namespace of your bucket. This is also a good time to specify any access control you want on the object.

Read an object – Read data back. You can download the data via HTTP.

Delete an object – Delete some of your data.

List keys – List the keys contained in one of your buckets. You can filter the key list based on a prefix.

By the way, in my opinion it is more appropriate to put Create a bucket, Delete an object, and List keys in the outbinding.

wenxuwan avatar wenxuwan commented on July 16, 2024

Create a bucket, Delete an object, List keys

Add CRUD operations for files (OSS). If users want to customize their own operations, they can consider putting them in a binding later.

github-actions avatar github-actions commented on July 16, 2024

This issue has been automatically marked as stale because it has not had recent activity in the last 30 days. It will be closed in the next 7 days unless it is tagged (pinned, good first issue or help wanted) or other activity occurs. Thank you for your contributions.

fengmk2 avatar fengmk2 commented on July 16, 2024

It could be adapted to support minio https://github.com/minio/minio , which exposes the S3 interface.

seeflood avatar seeflood commented on July 16, 2024

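There are two open questions now:

  1. Should the File API be designed around a "bucket / directory / file" abstraction or a "directory / file" abstraction, or should it simply stop being called File API and be renamed OSS API (which would make targeted design easier)?
    With the "directory / file" abstraction, the user-side SDK has to join the bucket and the prefix into one string and pass it as the directory, for example in the format "bucket://prefix" (using "/" alone as the separator will not work), e.g. "bucket1://a/b/c/". The OSS component then splits the directory string into bucket and prefix.
    Personally I favor this option.

  2. Should the File API support paged queries?
    This needs some investigation. Both AWS S3 and Alibaba Cloud OSS support it.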
    https://docs.aws.amazon.com/zh_cn/AmazonS3/latest/userguide/ListingKeysUsingAPIs.html
    https://help.aliyun.com/document_detail/187544.htm?spm=a2c4g.11186623.0.0.74ee5d7aPTaJos#reference-2520881

minio also supports pagination; there is a maxKeys parameter.
https://minio-java.min.io/io/minio/ListObjectsArgs.html#maxKeys--

Personally I think we can add a pagination field. After all, even Linux lets you page with ls | less.

fengmk2 avatar fengmk2 commented on July 16, 2024

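@wenxuwan There is also an interface for querying file metadata or checking whether a file exists that is worth a look: https://help.aliyun.com/document_detail/31984.htm?spm=a2c4g.11186623.0.0.77af5d7aqqsyje#reference-bgh-cbw-wdb

Otherwise, the only way to tell whether a file exists right now is to open a get stream and wait for it to throw, which is a poor experience for developers. Here is a JS example.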

// Throws an error when the file does not exist
await assert.rejects(
  async () => {
    const filepath = join(tmpfileDir, name);
    const stream = await client.file.get({
      storeName,
      name,
    });
    await pipeline(
      stream,
      createWriteStream(filepath),
    );
  },
  (err: any) => {
    // console.error(err);
    assert.equal(err.code, 13);
    assert.match(err.message, /StatusCode=404, ErrorCode=NoSuchKey/);
    return true;
  }
);

fengmk2 avatar fengmk2 commented on July 16, 2024

@wenxuwan Nice, looking forward to the PR.

imeoer avatar imeoer commented on July 16, 2024

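Is it really necessary to add a separate FileExists interface? GetFileMeta should be enough: for example, calling GetFileMeta on an object that does not exist only needs to return the error code codes.NotFound. Also, do all the interfaces have detailed error code definitions?

OK, one interface can serve both. As for error codes, gRPC's built-in codes should be enough to cover file operations. Special error codes have not been defined yet; like Dapr, we currently just return an internal error and put the error details in the message.

Yes, the gRPC error codes are enough for now. We try not to return an internal error, and only return one when a truly unexpected exception occurs.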

wenxuwan avatar wenxuwan commented on July 16, 2024

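Is it really necessary to add a separate FileExists interface? GetFileMeta should be enough: for example, calling GetFileMeta on an object that does not exist only needs to return the error code codes.NotFound. Also, do all the interfaces have detailed error code definitions?

OK, one interface can serve both. As for error codes, gRPC's built-in codes should be enough to cover file operations. Special error codes have not been defined yet; like Dapr, we currently just return an internal error and put the error details in the message.

Yes, the gRPC error codes are enough for now. We try not to return an internal error, and only return one when a truly unexpected exception occurs.

OK. If common error codes emerge later, we will unify them then; for now we reuse gRPC's.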

wenxuwan avatar wenxuwan commented on July 16, 2024

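@imeoer @fengmk2 @seeflood The interface has been updated. The Aliyun OSS implementation is now fully modified; in my own testing it supports paged listing, prefix listing, checking whether a file exists, and fetching file metadata. Could you help check whether the interface definitions have any problems? I will go on to update the minio and AWS OSS code next.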

seeflood avatar seeflood commented on July 16, 2024

@zach030 @bokket Could you all help review the File API changes?
The current API cannot meet user needs, so @wenxuwan is changing the File API (and will refactor the File API components along the way).

zach030 avatar zach030 commented on July 16, 2024

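@zach030 @bokket Could you all help review the File API changes? The current API cannot meet user needs, so @wenxuwan is changing the File API (and will refactor the File API components along the way).

Supporting paged queries is good. Should file Copy be introduced as well? I had that need when I used object storage before.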

seeflood avatar seeflood commented on July 16, 2024

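Hmm, I'm thinking of simply porting over the file-system-related APIs from AWS S3 or the POSIX API one to one. For one thing, it saves working things out ourselves (lazy); for another, reusing an existing industry standard is far better than inventing our own.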

seeflood avatar seeflood commented on July 16, 2024

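So we used templates to define an S3-compliant, standards-compatible file system and then generated the interfaces directly, which makes them easy to fill in

@bokket You mean a set of interfaces is generated automatically? That's amazing. Where can I see it? A link to learn from would be great.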

wenxuwan avatar wenxuwan commented on July 16, 2024

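@zach030 @bokket Could you all help review the File API changes? The current API cannot meet user needs, so @wenxuwan is changing the File API (and will refactor the File API components along the way).

Supporting paged queries is good. Should file Copy be introduced as well? I had that need when I used object storage before.

A Copy interface can be added. New capabilities are easy to handle; adding them later will not affect the current interfaces. Changes to existing interfaces need more rigor, though, since they must not affect what is already in use.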

seeflood avatar seeflood commented on July 16, 2024

I just had a look: the Dapr binding API now supports AWS S3 and Alibaba Cloud OSS. We could see whether it can be reused directly so that we don't have to build it ourselves.
https://docs.dapr.io/reference/components-reference/supported-bindings/s3/
https://docs.dapr.io/reference/components-reference/supported-bindings/alicloudoss/

The same goes for the redis protocol; it looks usable for caches that speak the redis protocol.
https://docs.dapr.io/reference/components-reference/supported-bindings/redis/

github-actions avatar github-actions commented on July 16, 2024

This issue has been automatically marked as stale because it has not had recent activity in the last 30 days. It will be closed in the next 7 days unless it is tagged (pinned, good first issue or help wanted) or other activity occurs. Thank you for your contributions.

github-actions avatar github-actions commented on July 16, 2024

This issue has been automatically closed because it has not had activity in the last 37 days. If this issue is still valid, please ping a maintainer and ask them to label it as pinned, good first issue or help wanted. Thank you for your contributions.
