
s3-zip's Introduction

s3-zip


Download selected files from an Amazon S3 bucket as a zip file.

Install

npm install s3-zip

AWS Configuration

Refer to the AWS SDK documentation for authenticating to AWS before using this package.
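
Credentials are normally resolved automatically from the environment or the shared credentials file. A minimal sketch of wiring explicit credentials into a client that can then be handed to s3-zip via the s3 option (the environment variable names are the standard AWS ones; everything else is a placeholder):

const { S3Client } = require('@aws-sdk/client-s3')

// Explicit credentials; the SDK would otherwise pick these up from
// AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY or ~/.aws/credentials.
const s3 = new S3Client({
  region: 'bucket-region',
  credentials: {
    accessKeyId: process.env.AWS_ACCESS_KEY_ID,
    secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY
  }
})

// s3Zip.archive({ s3: s3, bucket: 'name-of-s3-bucket' }, ...) will then use this client.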

Usage

Zip specific files

const fs = require('fs')
const join = require('path').join
const s3Zip = require('s3-zip')

const region = 'bucket-region'
const bucket = 'name-of-s3-bucket'
const folder = 'name-of-bucket-folder/'
const file1 = 'Image A.png'
const file2 = 'Image B.png'
const file3 = 'Image C.png'
const file4 = 'Image D.png'

const output = fs.createWriteStream(join(__dirname, 'use-s3-zip.zip'))

s3Zip
  .archive({ region: region, bucket: bucket}, folder, [file1, file2, file3, file4])
  .pipe(output)

You can also pass a custom S3 client, for example if you want to zip files from S3-compatible storage:

const { S3Client } = require('@aws-sdk/client-s3')

const s3 = new S3Client({
  region: region,
  forcePathStyle: true,
  endpoint: 'http://localhost:9000'
})

s3Zip
  .archive({ s3: s3, bucket: bucket }, folder, [file1, file2])
  .pipe(output)

Zip files with AWS Lambda

Example of s3-zip in combination with AWS Lambda.
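
A minimal handler sketch (not the repository's own example), assuming AWS SDK v3 and the @aws-sdk/lib-storage Upload helper for streaming the archive back into S3; the event fields and names are placeholders:

const { S3Client } = require('@aws-sdk/client-s3')
const { Upload } = require('@aws-sdk/lib-storage')
const s3Zip = require('s3-zip')

exports.handler = async (event) => {
  // Placeholder event shape: { region, bucket, folder, files, zipFileName }
  const { region, bucket, folder, files, zipFileName } = event

  // Build the archive as a readable stream of the requested S3 objects.
  const body = s3Zip.archive({ region: region, bucket: bucket }, folder, files)

  // Stream the archive straight back into S3; the Lambda filesystem is
  // read-only outside /tmp, so avoid fs.createWriteStream here.
  const upload = new Upload({
    client: new S3Client({ region: region }),
    params: { Bucket: bucket, Key: folder + zipFileName, Body: body }
  })

  return upload.done()
}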

Zip a whole bucket folder

const fs = require('fs')
const join = require('path').join
const {
  S3Client,
  ListObjectsV2Command
} = require('@aws-sdk/client-s3')
const s3Zip = require('s3-zip')

const region = 'bucket-region'
const bucket = 'name-of-s3-bucket'
const folder = 'name-of-bucket-folder/'
const s3 = new S3Client({ region: region })
const params = {
  Bucket: bucket,
  Prefix: folder
}

s3.send(new ListObjectsV2Command(params))
  .then(function (data) {
    const filesArray = (data.Contents || []).map(function (object) {
      return object.Key.substr(folder.length)
    })
    zip(filesArray)
  })

function zip (files) {
  console.log(files)
  const output = fs.createWriteStream(join(__dirname, 'use-s3-zip.zip'))
  s3Zip
    .archive({ region: region, bucket: bucket, preserveFolderStructure: true }, folder, files)
    .pipe(output)
}
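
A single ListObjectsV2 call returns at most 1,000 keys. For larger folders you can paginate with the continuation token and call archive once with the full list; a minimal sketch, reusing the s3 client, params and zip() from above:

async function listAllKeys () {
  const keys = []
  let token
  do {
    // Fetch one page of keys, resuming from the previous continuation token.
    const data = await s3.send(new ListObjectsV2Command({ ...params, ContinuationToken: token }))
    for (const object of data.Contents || []) {
      keys.push(object.Key.substr(folder.length))
    }
    token = data.IsTruncated ? data.NextContinuationToken : undefined
  } while (token)
  return keys
}

listAllKeys().then(zip)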

Tar format support

s3Zip
  .setFormat('tar')
  .archive({ region: region, bucket: bucket }, folder, [file1, file2])
  .pipe(output)

Zip files with password protection

s3Zip
  .setRegisterFormatOptions('zip-encrypted', require("archiver-zip-encrypted"))
  .setFormat('zip-encrypted')
  .setArchiverOptions({zlib: {level: 8}, encryptionMethod: 'aes256', password: '123'})
  .archive({ region: region, bucket: bucket }, folder, [file1, file2])
  .pipe(output)
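
Password protection relies on the separate archiver-zip-encrypted package (required above), so install it alongside s3-zip:

npm install archiver-zip-encrypted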

Archiver options

We use archiver to create archives. To pass your own options to it, use the setArchiverOptions method:

s3Zip
  .setFormat('tar')
  .setArchiverOptions({ gzip: true })
  .archive({ region: region, bucket: bucket }, folder, [file1, file2])
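
With the tar format, { gzip: true } makes archiver emit a gzip-compressed tarball, so the example above can be finished by piping to a .tar.gz file:

  .pipe(fs.createWriteStream(join(__dirname, 'use-s3-zip.tar.gz')))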

Organize your archive with custom paths and permissions

You can pass an array of objects with type EntryData to organize your archive.

const files = ['flower.jpg', 'road.jpg']
const archiveFiles = [
  { name: 'newFolder/flower.jpg' },

  /* -rw------- */
  { name: 'road.jpg', mode: parseInt('0600', 8) }
];
s3Zip.archive({ region: region, bucket: bucket }, folder, files, archiveFiles)
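
The call returns a readable stream, so pipe it to a destination as in the earlier examples:

s3Zip
  .archive({ region: region, bucket: bucket }, folder, files, archiveFiles)
  .pipe(output)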

Using with ExpressJS

s3-zip works with any framework that leverages Node.js streams, including Express.

const s3Zip = require('s3-zip')

app.get('/download', (req, res) => {
  s3Zip
    .archive({ region: region, bucket: bucket }, '', 'abc.jpg')
    .pipe(res)
})

The above streams the archive out as the response to the request.
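
Browsers handle the download more reliably when the response headers announce a zip attachment; a small sketch (the header values are conventional, the filename is a placeholder):

app.get('/download', (req, res) => {
  res.set('Content-Type', 'application/zip')
  res.set('Content-Disposition', 'attachment; filename=abc.zip')
  s3Zip
    .archive({ region: region, bucket: bucket }, '', 'abc.jpg')
    .pipe(res)
})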

Debug mode

Enable debug mode to see the logs:

s3Zip.archive({ region: region, bucket: bucket, debug: true }, folder, files)

Testing

Tests are written in Node Tap; run them like this:

npm t

If you would like a fancier coverage report:

npm run coverage

s3-zip's People

Contributors

anubhav756, davisbrown723, dnewman-polar3d, fw-davisbrown, greenkeeper[bot], jmousseau, margaret, mattpmoreno, mhemrg, mohuk, negherbon, orangewise, ravibhavsar, ruciu, spruce, tarunrajput


s3-zip's Issues

Cannot include same file multiple times with a different name

Take the following scenario as an example

// files to grab from s3
const filesS3 = [
  "x/y/z/file_1.png",
  "x/y/z/file_1.png",
];

// the structure of the zip file
const filesZip = [
  "folder1/file_1_v1.png",
  "folder1/folder2/file_1_v2.png",
];

Here, I would expect the zipped folder to contain two instances of the same file, each with a different name (file_1_v1.png and file_1_v2.png). Instead, there are two instances of the same file, but they both get recognized as filesS3[0], which results in them both having the name corresponding to filesZip[0].

Is this something you would consider changing, even if it needs to be specified as an option when calling s3Zip.archive(...) to retain backwards compatibility? If so, I can submit a PR. It seems the only core change needed is adding filesS3[i] = ""; after line 58. That way each subsequent lookup of the S3 key will match the next name found in filesZip.

Need help in integrating s3-zip on [Error: append: queue closed]

Hi @orangewise,

I am using s3-zip to zip the files without downloading them, so that this can be converted to a Lambda function if needed. Please let me know whatever looks wrong below, as I am new to Node.js.

Right now I am getting the error: append: queue closed

archive error [Error: append: queue closed]

/home/<<userid>>/node_modules/s3-zip/s3-zip.js:25
    throw err
          ^
Error: append: queue closed

The code I am using to test on the command line with the node interpreter is below. MyFile.zip is not present in S3.

const AWS = require('aws-sdk');
const path = require('path');
const s3Zip = require('s3-zip');
const awsRegion = '<<MyRegion>>';
const dataBucket = "<<MyBucketNameGoesHere>>";
const srcKey = "<<Testing With a Static Path to an Object>>";
const srcFile = path.basename(srcKey);
const srcFolder = path.dirname(srcKey);

var zipFileBody = s3Zip.archive({ region: awsRegion, bucket: dataBucket}, srcFolder + '/', [srcFile]);
var zipParams = { params: { region: awsRegion, Bucket: dataBucket, Key: path.join(srcFolder,'MyFile.zip') } };
var zipFile = new AWS.S3(zipParams);

zipFile.upload({Body: zipFileBody}).on('httpUploadProgress', function (evt) {
            console.log(evt)
        }).send(function (e, r) {
            if (e) {
                var err = 'zipFile.upload error ' + e;
                console.log(err);
            }
            console.log(r);
        })

A couple of questions here:

  1. How does s3-zip add the .zip file to S3 when it is not present?
  2. Are new files presented to an existing .zip file on S3 appended?
  3. When files presented to the existing .zip file on S3 are already present, are they overwritten?

Custom paths not working

This is a great class by the way :) Thanks for sharing.

I was following the example in the documentation from "Organize your archive with custom paths and permissions", but when I passed the archiveFiles argument it was ignored.

Looking into the code in s3-zip.js, I went to see how it works. From what I can tell, you try to find the array index of the file being processed by searching the contents of filesS3 and taking the index number. But I think there is an issue with how you trim the value that you search for; let me show you.

Here are some sample debugged variables using your current code from my use case:
folder: N4KYyS4DCh0hA1T590cMDpcdbL6GBbgF
file.path: N4KYyS4DCh0hA1T590cMDpcdbL6GBbgF/Abundancia en Acción SAPI de CV (7246)/AA - Acta Constitutiva (Válido)/Acta 6118 Abundancia.pdf
file.path.startsWith(folder): true
file.path.substr(folder.length): /Abundancia en Acción SAPI de CV (7246)/AA - Acta Constitutiva (Válido)/Acta 6118 Abundancia.pdf

However filesS3.indexOf(...) returns -1, thus ignoring the instructions I passed in archiveFiles, because the value actually stored in filesS3 is:
[ 'Abundancia en Acción SAPI de CV (7246)/AA - Acta Constitutiva (Válido)/Acta 6118 Abundancia.pdf',

So what I did was increase the trimming by one, and then filesS3.indexOf(...) returned the correct index value and everything worked. Like so:
file.path.substr(folder.length + 1): Abundancia en Acción SAPI de CV (7246)/AA - Acta Constitutiva (Válido)/Acta 6118 Abundancia.pdf

Original code:
const i = filesS3.indexOf(file.path.startsWith(folder) ? file.path.substr(folder.length) : file.path)

Suggested code:
const i = filesS3.indexOf(file.path.startsWith(folder) ? file.path.substr(folder.length + 1) : file.path)

KeyStream has no events

The keyStream file list is being set, but the object has no events, which causes the createFileStream method to return null when keyStream.on is called. Where are the keyStream events set?

Express zip download issue

I am trying to create a zip and send it using Express. The following is the code I am using:

app.get('/', (req, res) => {
  res.setHeader("content-type", 'application/octet-stream');
  res.setHeader('Content-Disposition', 'attachment; filename=file.zip');
  s3Zip
    .archive({ region: 'us-east-1', bucket: 'botuploads.collectcdn.com', debug: true }, '5c5bbe28ac8153155159c3e4/44F01219-C9FE-4619-AAE6-37105E2E2CBE/', ['0_Resume (8).pdf'])
    .pipe(res);
});

I am getting a file called file.zip when I invoke the URL via the browser, but when I try to open it, it shows an error (screenshot of the error attached). What am I doing wrong? I want a direct link that can be used to download the zip file.

The zip file will be overridden

I'm using AWS Lambda to append files to a zip file, and this function is executed when someone uploads a file.
That means when someone uploads a file, it should be appended to the current zip file (creating a new one if it doesn't exist), so the zip file would contain multiple files from multiple users.
That's what I expect.
But actually, when someone uploads a file, it is appended to a zip file that overrides the existing one, so the zip file always contains only the newest uploaded file.
Could you suggest a solution?
Here is the code I used: https://github.com/orangewise/s3-zip/blob/HEAD/aws_lambda.md

NoSuchKey: The specified key does not exist

import * as AWS from 'aws-sdk';
import { success, failure } from "./libs/response-lib";
import * as s3Zip from "s3-zip";

export async function main(event) {
    const data = JSON.parse(event.body);

    const param = {
        region : data.region,
        bucket : data.bucket,
        folder : data.folder,
        files : data.files,
        zipFileName : data.zipFileName,
    };

    try {
        const body = s3Zip.archive({
            region: param.region,
            bucket: param.bucket },
            param.folder,
            param.files
        );
        const params = {params: {Bucket: param.bucket, Key: param.folder + param.zipFileName}};
        const s3 = new AWS.S3(params);
        await s3.upload( {Body: body}, (err, res) => {
            if (err) {
                return failure({ Status: false, Message: 'zipFile.upload error: ' + err });
            }
            console.log(res);
            return success({ Status: true, Body: res });
        }).promise();
    }  catch (e) {
        return failure({ Status: false, Message: 'Failure: ' + e })
    }
}

I am attempting to zip up files and upload them to S3. Any help with this issue?

REGION: "us-east-2",
BUCKET: "gv-cloud-share"

folder expects '/' at the end

I am using path.dirname, which removes the trailing '/' from the folder variable:
var body = s3Zip.archive({ region: region, bucket: bucket}, folder, files)

Can this be addressed?

DeprecationWarning

Simply importing the most recent version of s3-zip in an AWS Lambda gives me the following warning in the CloudWatch logs:
ERROR (node:7) [DEP0005] DeprecationWarning: Buffer() is deprecated due to security and usability issues. Please use the Buffer.alloc(), Buffer.allocUnsafe(), or Buffer.from() methods instead.

File not exist exception is not handled

I am downloading multiple files by passing an array of file paths. If a file from the array does not exist, it breaks execution.
It returns the following error:
Code: 'NoSuchKey',
Message: 'The specified key does not exist.'

Subfolders in Windows

Great module, but it wasn't working for me for any content in subfolders under the bucket root. I'm developing on Windows, deploying on Linux.

It seems that the full paths returned by the s3Files.createKeyStream method in s3-files.js are platform-dependent. I'm sure they should always use forward slashes (as S3 does).

I added a line to correct the slashes
paths = paths.map(path => path.replace(/\\/g, '/'))
just before the return streamify(paths) around line 36.

Not your bug, but I hope this helps someone. Perhaps mention it in the README?

TypeError: Cannot read property 'objectMode' of undefined

I am getting the error below while zipping:

s3Zip
        .archive(params, folder, file_names)
        .pipe(this.res)
node_modules/readable-stream/lib/_stream_writable.js:317
  var isBuf = !state.objectMode && _isUint8Array(chunk);
                     ^

TypeError: Cannot read property 'objectMode' of undefined

Using Across Buckets

Can you please give an example for the following:

  1. Zipping across multiple source buckets and different destination bucket.
  2. Adding folders to existing zip file specifying path & filename

Any plan to unzip ?

Hi there,

I love your module!
Is there any plan to add some unzip features? It would be great to have a complete solution with only one module.

Thanks for your answer.

S3 no such key in root bucket

Hi, I use your AWS Lambda function to zip files, but I have an issue when I try to zip a file list where all the files are in the root of s3://bucket_name/. I have tried other buckets with folders and it works fine. However, I am not sure whether the issue is the folder name; here is my code:

$payload = json_encode([
            'region'            => AWS_REGION,
            'sourceBucket'      => 'SOURCE_BUCKET',
            'destinationBucket' => ZIP_BUCKET,
            'folder'            => '/',
            'files'             => $files ,
            'zipFileName'       => 'zip_name.zip'
        ]);

I appreciate your help. Thanks

Zip multiple files from different folder

Can I zip multiple files from different folders?

For example, I want to zip file_a, file_b and file_c from this sample folder structure:

mys3bucket
- root_folder
  - folder_1
    - file_a
  - folder_2
    - file_b
  - folder_3
    - file_c

thanks!

Files unzip to .zip.cpgz format on mac

I'm able to download S3 files into a local folder successfully, but the extraction process always returns a file.zip.cpgz, which unzips to file.zip, and vice versa.

Unzipping from my command line via unzip file.zip reveals the undamaged files, so I'm wondering if there's a problem with the zipping process in this package.

In the meantime let me know if there's a workaround, thanks!

Zipping files from AWS via array of presigned URLs

I have a library of images on AWS in a protected bucket, and want to be able to serve(stream) a subset of them as a zip download in response to a GET request.

I can identify the files for the request, generate an array of preSigned URLs and a corresponding array of the file names that match, but cannot for the life of me figure out how to leverage s3-zip to fetch these files via their preSigned URLs into a Zip file and stream that back to the requestor.

Is this possible?

My understanding of the AWS example here is that it assumes the bucket is public, and you can pass the bucket, folder name etc. and fetch the files.

I'm not sure if I'm overcomplicating things by getting preSigned URLs for all the files, but I don't see any option to do that any other way.

Any guidance would be very much appreciated.

The following code is stripped to only the relevant bits:

exports.getDownloads = (req, res, next) => {

fileList = [
  'https://the-archive-name.s3.us-east-2.amazonaws.com/media/originals/588f26a61b772-raw.jpg?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIA...-Amz-Expires=300&X-Amz-Signature=...-Amz-SignedHeaders=host',
  'https://the-archive-name.s3.us-east-2.amazonaws.com/media/originals/5bb37fc4ad871-raw.jpg?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIA...-Amz-Expires=300&X-Amz-Signature=...-Amz-SignedHeaders=host',
]

The fileList is fetched and authenticated, and the above array of presigned URLs is passed to the zip stream; the above is just a sanitized example of the console.log output.

const region = 'us-east-2'
const bucket = 'the-archive-name'
const folder = 'media/originals/'
The folder seems redundant here, as the images all have full signed URLs.

const output = fs.createWriteStream(join(__dirname, 'Archive-Images.zip'))

s3Zip
  .archive({ region: region, bucket: bucket }, '', fileList)
  .pipe(output)

What I'm expecting is that the user hits the getDownloads endpoint, which fetches their images from AWS, compiles them into a single 'Archive-Images.zip' file and starts streaming that zip to their local machine without writing it into the file system or having to generate the entire zip in memory first.

Saving zip file on client-side that was streamed from server

Hello, sort of a beginner here. I am zipping a few files from S3 on the server side using the Express framework like this

router.get('/download', function (req, res, next) {
    const region = 'us-west-2';
    const bucket = 'my-bucket';
    const folder = 'my-folder/';
    const file1 = '1.jpg';
    const file2 = '2.jpg';
    const file3 = '3.jpg';

    s3Zip
        .archive({ region: region, bucket: bucket}, folder, [file1, file2, file3])
        .pipe(res)
});

and I think that part is working correctly because I am getting a response on the client side that looks like this (from the chrome console)

(screenshot of the response from the Chrome console attached)

But my goal here is to download the zip file to the client's browser. I am using Angular File Saver to do this. I have tried

var blob = new Blob([res._body], {type: "application/zip"});
FileSaver.saveAs(blob, "test-new.zip");

where res is the response corresponding to the above image. A zip file does download but when I try to open it I get the following error
screen shot 2018-12-05 at 8 05 00 pm

Can anyone see why this is happening?

net::ERR_INCOMPLETE_CHUNKED_ENCODING 200 (OK)

Hi,
When I have more than 100 files in my S3 bucket, it takes a lot of time and on the frontend I get this error: "net::ERR_INCOMPLETE_CHUNKED_ENCODING 200 (OK)". Please assist.
Thanks in advance.

Fails to complete on missing or bad key

Pass a set of keys to the archive function with a bad key purposely included.
The debug log shows everything continuing to download, but the failure prevents the file from being ended and closed.
I believe the issue is actually in your s3-files library.

A bad key could happen in the event that an object is deleted from S3 before it can be streamed.

Losing mime type and directory structure of zipped folders

Is it somehow possible to preserve the structure of folders that contain other folders and files? Also, it seems that files without a file extension lose their MIME types upon unzipping. Is there a workaround for this? Great work on the module btw. :)

Upload to a different bucket

I would like to fetch the files to zip from a different S3 bucket.
Is there a way I can specify bucketname/folder for the destination?

Add other files to zip

Is it possible to add other files to the zip that are not in S3? I have an API endpoint below that works well to fetch the files I need from S3, but I also need to send one file called 'config.json' along with the zip. Is this possible to do?

router.get('/test1', (req, res, next) => {
  Files.find().then(results => {
      let fileList = results.map(x => x.name + '.wav')
      let folderList = results.map(x => { return { name: 'files/audio/' + x.name + '.wav'}})
      let stream = s3Zip
        .archive({ s3: s3, bucket: bucket, debug: true }, '', fileList, folderList)
        .pipe(res)
  })
})

Missing Credentials in config

E:\Projects_2016\node_modules\s3-files\node_modules\aws-sdk\lib\request.js:31
throw err;
^

TimeoutError: Missing credentials in config
at ClientRequest. (E:\Projects_2016\node_modules\s3-files\node_modules\aws-sdk\lib\http\node.js:56:34)
at ClientRequest.g (events.js:286:16)
at emitNone (events.js:86:13)
at ClientRequest.emit (events.js:185:7)
at Socket.emitTimeout (_http_client.js:614:10)
at Socket.g (events.js:286:16)
at emitNone (events.js:86:13)
at Socket.emit (events.js:185:7)
at Socket._onTimeout (net.js:333:8)
at tryOnTimeout (timers.js:232:11)
at Timer.listOnTimeout (timers.js:202:5)

I pasted the sample code into my project, similar to this:
s3Zip
  .archive({ region: region, bucket: bucket }

I also logged region and bucket and both have values.
Could you help with this error?
Thank you in advance.

Add more detailed messages.

I am getting the message:
NoSuchKey: The specified key does not exist.

Not sure where this is coming from. Can you please add a little more description?

Change file names before archiving

Hi,

Thanks for your helpful library. However, I am wondering if there is a way to modify the original file names to something else before I zip the files and upload to S3.

Thank you!

Lambda function

Concerning var body = s3Zip.archive({ region: region, bucket: bucket}, folder, files) this is what the function is returning for me:
Archiver {
options:
{ highWaterMark: 1048576,
statConcurrency: 4,
comment: '',
forceUTC: false,
store: false,
zlib: { level: 1 },
forceZip64: false,
forceLocalTime: false },
_readableState:
ReadableState {
objectMode: false,
highWaterMark: 1048576,
buffer: BufferList { head: null, tail: null, length: 0 },
length: 0,
pipes: null,
pipesCount: 0,
flowing: null,
ended: false,
endEmitted: false,
reading: false,
sync: false,
needReadable: true,
emittedReadable: false,
readableListening: false,
resumeScheduled: false,
destroyed: false,
defaultEncoding: 'utf8',
awaitDrain: 0,
readingMore: false,
decoder: null,
encoding: null },
readable: true,
domain: null,
_events:
{ end: { [Function: bound onceWrapper] listener: [Function: onend] },
prefinish: { [Function: bound onceWrapper] listener: [Function] },
unpipe: [Function: onunpipe],
drain: [Function],
error: [ [Function: onerror], [Function] ],
close: { [Function: bound onceWrapper] listener: [Function: onclose] },
finish: { [Function: bound onceWrapper] listener: [Function: onfinish] } },
_eventsCount: 7,
_maxListeners: undefined,
_writableState:
WritableState {
objectMode: false,
highWaterMark: 1048576,
finalCalled: false,
needDrain: false,
ending: false,
ended: false,
finished: false,
destroyed: false,
decodeStrings: true,
defaultEncoding: 'utf8',
length: 0,
writing: false,
corked: 0,
sync: true,
bufferProcessing: false,
onwrite: [Function],
writecb: null,
writelen: 0,
bufferedRequest: null,
lastBufferedRequest: null,
pendingcb: 0,
prefinished: false,
errorEmitted: false,
bufferedRequestCount: 0,
corkedRequestsFree: CorkedRequest { next: null, entry: null, finish: [Function] } },
writable: true,
allowHalfOpen: true,
_transformState:
TransformState {
afterTransform: [Function],
needTransform: false,
transforming: false,
writecb: null,
writechunk: null,
writeencoding: null },
_format: 'zip',
_module:
Zip {
options:
{ highWaterMark: 1048576,
statConcurrency: 4,
comment: '',
forceUTC: false,
store: false,
zlib: [Object],
forceZip64: false,
forceLocalTime: false },
supports: { directory: true, symlink: true },
engine:
Transform {
options: [Object],
_readableState: [Object],
readable: true,
domain: null,
_events: [Object],
_eventsCount: 4,
_maxListeners: undefined,
_writableState: [Object],
writable: true,
allowHalfOpen: true,
_transformState: [Object],
offset: 0,
_archive: [Object],
_entry: null,
_entries: [] } },
_pending: 0,
_pointer: 0,
_entriesCount: 0,
_entriesProcessedCount: 0,
_fsEntriesTotalBytes: 0,
_fsEntriesProcessedBytes: 0,
_queue:
{ _tasks: DLL { tail: null, head: null, length: 0 },
concurrency: 1,
payload: 1,
saturated: [Function: noop],
unsaturated: [Function: noop],
buffer: 0.25,
empty: [Function: noop],
drain: [Function: bound ],
error: [Function: noop],
started: false,
paused: false,
push: [Function: push],
kill: [Function: kill],
unshift: [Function: unshift],
remove: [Function: remove],
process: [Function: process],
length: [Function: length],
running: [Function: running],
workersList: [Function: workersList],
idle: [Function: idle],
pause: [Function: pause],
resume: [Function: resume] },
_statQueue:
{ _tasks: DLL { tail: null, head: null, length: 0 },
concurrency: 4,
payload: 1,
saturated: [Function: noop],
unsaturated: [Function: noop],
buffer: 1,
empty: [Function: noop],
drain: [Function: noop],
error: [Function: noop],
started: false,
paused: false,
push: [Function: push],
kill: [Function: kill],
unshift: [Function: unshift],
remove: [Function: remove],
process: [Function: process],
length: [Function: length],
running: [Function: running],
workersList: [Function: workersList],
idle: [Function: idle],
pause: [Function: pause],
resume: [Function: resume] },
_state:
{ aborted: false,
finalize: false,
finalizing: false,
finalized: false,
modulePiped: true },
_streams: [] }

Memory concerns

Hi there! I've written an AWS Lambda function for the purpose of taking a list of files and adding them to a zip. You can see it here: https://gist.github.com/glennfu/5d95605fa56296d9dc056f5e7887beae

It's doing very little beyond the example you provided here: https://github.com/orangewise/s3-zip/blob/master/aws_lambda.md

I discovered that the memory usage is way higher than I expected. Here's some testing:

Files | Total Size | Memory Used | Zipped Size
2     | 30 MB      | 90 MB       | 138 KB
3     | 15 MB      | 90 MB       | 69 KB
5     | 75 MB      | 221 MB      | 344 KB

I was really hoping to find that it could stream/pipe the files as it loaded them, unloading old files as each gets processed. Do you have any suggestions or ideas for how to improve this?

Error : EROFS: read-only file system

Hello,
I'm trying to use s3-zip to zip some files in an S3 bucket using a Lambda function.
When I do

function zip (files) {
  console.log(files)
  var output = fs.createWriteStream(join(__dirname, 'use-s3-zip.zip'))
  s3Zip
    .archive({ region: region, bucket: bucket, preserveFolderStructure: true }, folder, files)
    .pipe(output)
}

In CloudWatch I have a successful console.log of the files, and on the next line I have:

Error: EROFS: read-only file system, open '/var/task/use-s3-zip.zip'
at Error (native)

Any help please?

how to download the .zip from the client with Q promises

Hi, I have a button on the client that requests the .zip file from a route on my server.
The route is like this:

/* Route Definition */
routes.get('/companies/:company_id/all_zipped_docs', function(req, res)
{
    
         var file_path = "companies/"+req.params.company_id+"/";
         return S3getZippedDir(file_path);
        })
    .then(function(data)
          {
           res.status(201).json({ msg : "S3 zipped file", 
                                  data: data});     
          })
   .catch(function(err) 
          { return res.status(401).json({ msg : err.msg, code: err.code })  });  
});


/* Route Execution */
function S3getZippedDir(folder)
{
 var deferred = Q.defer();
 
 var filesArray = [];
 var params = {
                Bucket: AWS_S3_BUCKETNAME,
                Prefix: folder
              };

 var files = s3.listObjects(params).createReadStream();
 
 var xml = new XmlStream(files);
 xml.collect('Key');
 xml.on('endElement: Key', function(item) 
  { 
    filesArray.push(item['$text'].substr(folder.length)); 
 })

  xml.on('end', function () 
    { zip(filesArray); })

    xml.on('error', function (err) 
    { deferred.reject(err);
    });

    function zip(files) 
    {
 // ============ IMPORTANT PART ===============
    var output = fs.createWriteStream(join(__dirname, 'use-s3-zip.zip'));  
   var file_zip =  s3Zip
        .archive({     region   : AWS_S3_REGION, 
                        bucket  : AWS_S3_BUCKETNAME, 
                        preserveFolderStructure: true ,
                        debug    : true
                    }, folder, files)
        .pipe(output);
        deferred.resolve(file_zip)
    }  
 return deferred.promise;
 // ===========================================
}

Anyway, it doesn't work. What object should I deferred.resolve()?

Can't zip a file if the S3 bucket is private

 const body = s3Zip
 .archive({ region: region, bucket: bucket}, folder, file)
 .pipe(output)

The zip file is created, but it's empty.
How can we pass AWS credentials to the s3Zip method?

zip entire S3 folder?

Hello,

Is there a way to zip an entire S3 folder without specifying individual file names?

[lambda] NoSuchKey: The specified key does not exist.

For Lambda, why use zipFile.upload without using createWriteStream or something similar?

Payload:

{ "folder": "images-resized", "files": [ "1511414734320.300", "1511415403861.600" ] }

Key:

images-resized/download.zip

On upload this error occurs:

2017-11-23T11:31:48.878Z e3434ea3-d041-11e7-a539-a50787a2ff84 NoSuchKey: The specified key does not exist.
    at Request.extractError (/var/task/node_modules/aws-sdk/lib/services/s3.js:577:35)
    at Request.callListeners (/var/task/node_modules/aws-sdk/lib/sequential_executor.js:105:20)
    at Request.emit (/var/task/node_modules/aws-sdk/lib/sequential_executor.js:77:10)
    at Request.emit (/var/task/node_modules/aws-sdk/lib/request.js:683:14)
    at Request.transition (/var/task/node_modules/aws-sdk/lib/request.js:22:10)
    at AcceptorStateMachine.runTo (/var/task/node_modules/aws-sdk/lib/state_machine.js:14:12)
    at /var/task/node_modules/aws-sdk/lib/state_machine.js:26:10
    at Request.<anonymous> (/var/task/node_modules/aws-sdk/lib/request.js:38:9)
    at Request.<anonymous> (/var/task/node_modules/aws-sdk/lib/request.js:685:12)
    at Request.callListeners (/var/task/node_modules/aws-sdk/lib/sequential_executor.js:115:18)

My Lambda handler:


const aws = require('aws-sdk')
const s3Zip = require('s3-zip')
const bucket = process.env.AWS_BUCKET_NAME;
const region = 'ap-northeast-2'

exports.handle = (event, context, callback) => {
  console.log('Received event:', JSON.stringify(event, null, 2));

  const {
    folder,
    files,
    zipFileName = 'download.zip'
  } = event

  try {
    const body = s3Zip.archive({
      region, bucket
    }, folder, files)

    const zipFile = new aws.S3({
      apiVersion: '2012-10-17',
      Bucket: bucket,
      region: region,
      Key: `${folder}/${zipFileName}`
    })

    console.log('zipFile:', zipFile)

    zipFile.upload({ Body: body })
      .on('httpUploadProgress', (event) => {
        console.log(event)
      })
      .send((err, result) => {
        if (err) {
          console.log('zipFile.upload error:', err)
          callback(err)
        } else {
          callback(null, result)
        }
      })
  } catch (err) {
    console.log('catched error:', err)
    callback(err)
  }
}

Express Example?

I am new to Node and was wondering if s3-zip works within the Express framework. Specifically, I would like to send the zipped file back to the client using Express. If this is possible, it would be great if you could provide an example. Thanks.

AWS Lambda zip file size limits

Quick question about your AWS Lambda example. I wasn't able to tell from the AWS documentation what the maximum size of the resulting zip file can be.

What is your experience with s3-zip? Will the Lambda function crash if the resulting .zip file is over 2 GB? More? Less?

Thanks!

RangeError: Array buffer allocation failed

Hi! I'm trying to zip around 400+ files averaging 28 MB each, which is probably around 7 GB total, and I'm getting this error.

Currently running this on an EC2 m5.large.

Any suggestion or workaround regarding this memory issue? TIA

throw err;
^
RangeError: Array buffer allocation failed
at new ArrayBuffer (<anonymous>)
at new Uint8Array (<anonymous>)
at new FastBuffer (internal/buffer.js:941:1)
at createUnsafeBuffer (buffer.js:115:12)
at allocate (buffer.js:308:10)
at Function.allocUnsafe (buffer.js:273:10)
at Function.concat (buffer.js:452:25)
at bufferConcat (/app/node_modules/concat-stream/index.js:117:17)
at ConcatStream.getBody (/app/node_modules/concat-stream/index.js:64:42)
at ConcatStream.<anonymous> (/app/node_modules/concat-stream/index.js:37:51)
at ConcatStream.emit (events.js:205:15)
at ConcatStream.EventEmitter.emit (domain.js:471:20)
at finishMaybe (/app/node_modules/concat-stream/node_modules/readable-stream/lib/_stream_writable.js:620:14)
at endWritable (/app/node_modules/concat-stream/node_modules/readable-stream/lib/_stream_writable.js:629:3)
at ConcatStream.Writable.end (/app/node_modules/concat-stream/node_modules/readable-stream/lib/_stream_writable.js:567:22)
at PassThrough.onend (_stream_readable.js:675:10)
at Object.onceWrapper (events.js:288:20)
at PassThrough.emit (events.js:205:15)
at PassThrough.EventEmitter.emit (domain.js:471:20)
at endReadableNT (_stream_readable.js:1154:12)
at processTicksAndRejections (internal/process/task_queues.js:84:9)

Can s3-zip be used with express.js?

Hi,

I'm trying to send the zip file as a response to an HTTP request. I see nothing in the README that hints at this being available, but I was wondering if it is possible at all, as I have found no other dependencies to achieve this.

Here is my slight adaptation to the code:

app.get('/files', (request, response) => {
  const objectIds = request.body.objectIds
  const folder = request.body.folder

  response.set('content-type', 'application/zip')

  s3Zip
    .archive({ s3: s3, bucket: aws_bucket_name }, folder, objectIds)
    .pipe(response)

  s3Zip.on('end', () => {
    console.log('ended');
    response.end();
  })
});

Is there any way to stream source files to s3-zip?

Looking at the examples, they all take an array of files to zip. Is it already possible or easy to extend so that .archive can be called multiple times as progressively more files are 'discovered' needing to be zipped up?

For example, listObjects returns at most 1,000 objects, so to include more than 1,000 files in the zip without multiple calls to .archive, we need to finish traversing S3 before beginning generation of the zip.

Am I overlooking the obvious? Is it already possible to incrementally add more files to the zip?

Thanks
Phillip

Loading forever when specified file doesn't exist

Hi there,
Thanks for your awesome library!!
I have an annoying issue: when for some reason I have a wrong filename (S3 key), it loads forever! Is there any way to catch the error or abort if something is wrong?

Thank you very much!

Question - Is it possible to set different paths from a folder source in a zipped folder structure?

Hi!

I was wondering how we should set a different destination path for files coming from a source folder that matches a certain string, for example zipping all media/video/files.ext into mp4/files.ext:

  setArchiveStructure({ mediaFiles }) {
    const archiveStructure = [];

    mediaFiles.forEach((mediaFile) => {
      let file = null;

      const filename = mediaFile.replace(/^.*[\\\/]/, "");
      const extension = filename.split(".").pop();

      if ((/video\//).test(mediaFile)) {
        file = {
          name: `${extension}/${filename}` // mp4/filename.extension
        };

        archiveStructure.push(file);
      } 
    });

    return archiveStructure;
  },

Clearer Instructions needed on how to implement AWS Lambda Example

It would be helpful if there were clearer instructions on how to implement the AWS Lambda example here:

https://github.com/orangewise/s3-zip/blob/master/aws_lambda.md

I've experimented with creating two Lambda functions (one for the function and one for the invocation) and I can't get it to work. I've tried this by manually creating the functions and also with Serverless.

My goal is to run everything from the browser by downloading the zip file from a Lambda function URL. I initially attempted to use Express (which works fine locally), but this has not been resolved even though the issue is closed:

#55

Thanks for your help.

Memory leak when the pipe response is disconnected

When using the pipe method to stream the zip into the API response object, I am facing a memory leak if the download is disconnected or canceled. It works fine when the download completes successfully.

Here is a code snippet of my usage. I am using loopback 3 for this.

authenticatedRoute.post('/downloadZip', function(req, res, next) {
    try {
      const s3Zip = require('s3-zip');
      const filename = generateFilename();
      const folder = getFolderName(req);
      const fileArray = JSON.parse(req.body.files);
      res.set('content-type', 'application/zip');
      res.set('Content-Disposition', 'attachment; filename=' + filename);
      s3Zip.archive({region: REGION, bucket: BUCKET, preserveFolderStructure: true, debug: true}, folder, fileArray)
        .pipe(res);
    } catch (e) {
      throw e;
    }
  });
  app.use(authenticatedRoute);

relevant dependency versions in use:

    "compression": "^1.7.4",
    "jwt-decode": "^2.2.0",
    "loopback": "^3.22.0",
    "loopback-boot": "^2.6.5",
    "loopback-component-explorer": "^6.2.0",
    "loopback-ssl": "^1.0.4",
    "s3-zip": "^3.1.0",
    "strong-error-handler": "^3.0.0"

What I was expecting is that if the download from the browser is canceled, the memory used by the s3Zip.archive function would be cleaned up. But that is not happening.

Subdirectory support

It doesn't look like I'm able to generate a zip file that preserves an existing directory structure. Is there a way to do that?

Process crash if inner stream throws an error

Package version: 1.0.1
Node version: 8.9.4

Facing an error when we try to download a file from S3 which doesn't exist in the bucket. A simple use case is when a user is given a choice to download the documents and they somehow enter an S3 key which doesn't exist in the bucket.

Current behaviour:
Entering an incorrect key crashes the entire process with the following trace.

 NoSuchKey: The specified key does not exist.
     at Request.extractError (/app/node_modules/aws-sdk/lib/services/s3.js:577:35)
     at Request.callListeners (/app/node_modules/aws-sdk/lib/sequential_executor.js:105:20)
     at Request.emit (/app/node_modules/aws-sdk/lib/sequential_executor.js:77:10)
     at Request.emit (/app/node_modules/aws-sdk/lib/request.js:683:14)
     at Request.transition (/app/node_modules/aws-sdk/lib/request.js:22:10)
     at AcceptorStateMachine.runTo (/app/node_modules/aws-sdk/lib/state_machine.js:14:12)
     at /app/node_modules/aws-sdk/lib/state_machine.js:26:10
     at Request.<anonymous> (/app/node_modules/aws-sdk/lib/request.js:38:9)
     at Request.<anonymous> (/app/node_modules/aws-sdk/lib/request.js:685:12)
     at Request.callListeners (/app/node_modules/aws-sdk/lib/sequential_executor.js:115:18)
     at Request.emit (/app/node_modules/aws-sdk/lib/sequential_executor.js:77:10)
     at Request.emit (/app/node_modules/aws-sdk/lib/request.js:683:14)
     at Request.transition (/app/node_modules/aws-sdk/lib/request.js:22:10)
     at AcceptorStateMachine.runTo (/app/node_modules/aws-sdk/lib/state_machine.js:14:12)
     at /app/node_modules/aws-sdk/lib/state_machine.js:26:10
     at Request.<anonymous> (/app/node_modules/aws-sdk/lib/request.js:38:9)
     at Request.<anonymous> (/app/node_modules/aws-sdk/lib/request.js:685:12)
     at Request.callListeners (/app/node_modules/aws-sdk/lib/sequential_executor.js:115:18)
     at callNextListener (/app/node_modules/aws-sdk/lib/sequential_executor.js:95:12)
     at IncomingMessage.onEnd (/app/node_modules/aws-sdk/lib/event_listeners.js:269:13)
     at emitNone (events.js:111:20)
     at IncomingMessage.emit (events.js:208:7)
 [nodemon] app crashed - waiting for file changes before starting...

Suggested fix:
The internal stream which fetches the data from S3 via aws-sdk does not have error events attached to it. Also, try/catch/throw does not work for streams; they only understand events, e.g. .on('data'). Attaching .on('error') to the internal file stream and emitting the error on the outer stream should resolve the issue.

PR coming :)
