streaming Zip file creation #2237
Comments
The thing with the Go stdlib is that it's rather a reference point, not a strict rule to follow. For example, we are moving various APIs away from Deno.Reader/Writer and using web streams instead. Also, regarding the tar module, there is a PR to switch it to web streams (#1985). I do think having this would be great. From your quick example, it seems to me that it could look like this:
async function *readFiles() {
  for await (const entry of walk(".", { includeDirs: false }))
    yield await Deno.open(entry.path)
}
const readable = readableStreamFromIterator(readFiles()).pipeThrough(ZipCompressionStream()); // we can just pass this to Response now
We should also consider how we could make this portable with the web. Maybe a similar approach to my web tar PR?
I agree with @crowlKats; your module would be a great addition to the standard library.
Thanks for the feedback. Since you're rewriting Tar, I have a few suggestions: Streams are better for optimized handling of binary data and I/O buffers. But streams of objects are just more complicated than async generators for no benefit. The input happens to be a sequence of objects (basically streams with metadata, or Of course, the output is a single binary file, so that's still a Stream. Speaking of The only compatibility problem I can see for client-zip (and any similar file bundler) is that
I forgot to mention the stage-2 proposal for Anyway, if there is indeed a general drive to make Deno modules browser-compatible, then client-zip makes more sense here than I thought. We could probably have the Zip and Tar modules share the same interface (just with different options), both browser-compatible, with a separate Deno-specific mapper for filesystem input. However, I still don't see much reason to make Zip files on the server. For anything other than sending the file to regular end users, Tar is better. And for end users… well, if you can get their browser to do it (which is the whole point of client-zip), why waste your own server CPU?
I'd say most usage will actually be unzipping rather than zipping.
I think so too. Unzipping is another beast entirely, though, and will never be part of client-zip. When zipping, you can pick just one implementation (that most unzip programs can understand) and do that well. For unzipping, you need to be compatible with all the quirky Zip files generated by lots of different programs and versions since 1989. Also, unzipping can sometimes be streamed, but you can never be sure in advance (in the case of client-zip's output, it's guaranteed not to be stream-extractable), so basically you have to store the whole Zip file somewhere with fast random access, which removes one of the good reasons to do a pure JS implementation. Given those constraints (on top of the performance issues I already talked about for zipping), I think we're better off just calling a native unzipping utility on the Zip file right after storing it locally.
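To illustrate the random-access point: in the Zip format, the authoritative index (the central directory) sits at the end of the archive, so a reader generally has to find the End of Central Directory record before it can trust anything else. The sketch below is a simplified illustration, not code from client-zip or the Deno standard library. The function name `findCentralDirectory` is made up here, and it ignores Zip64 and does not bound the backward scan to the maximum comment length.

```typescript
// Signature and minimum size of the End of Central Directory (EOCD) record.
const EOCD_SIGNATURE = 0x06054b50;
const EOCD_MIN_SIZE = 22;

interface CentralDirectoryInfo {
  entries: number; // total number of entries in the archive
  size: number; // size of the central directory in bytes
  offset: number; // where the central directory starts in the file
}

// Hypothetical helper: scans backwards from the end of the buffer, which is
// why the whole file (or at least its tail) must be available first.
function findCentralDirectory(zip: Uint8Array): CentralDirectoryInfo | null {
  const view = new DataView(zip.buffer, zip.byteOffset, zip.byteLength);
  for (let pos = zip.byteLength - EOCD_MIN_SIZE; pos >= 0; pos--) {
    // All multi-byte fields in Zip records are little-endian.
    if (view.getUint32(pos, true) === EOCD_SIGNATURE) {
      return {
        entries: view.getUint16(pos + 10, true),
        size: view.getUint32(pos + 12, true),
        offset: view.getUint32(pos + 16, true),
      };
    }
  }
  return null; // no EOCD found: not a (complete) Zip file
}
```

A robust extractor would additionally handle Zip64 records, archive comments that happen to contain the signature, and the many producer quirks mentioned above, which is part of why delegating to a native utility is attractive.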
I am the developer of client-zip, a very small, pretty fast streaming Zip file generator, written in modern JavaScript and a bit of WASM. A user suggested to me, insistently, that client-zip (or something based on it) should become part of the Deno standard library, presumably alongside the existing archive/tar module. client-zip already runs very well in Deno, by the way.
Why not?
This is not rhetorical. I actually don't think it's such a good idea, and I'm posting here to explain and see what happens.
The main reason for creating Zip archives instead of the more elegant Tar+Gzip format is that Zip enjoys universal support out of the box. I think it makes more sense to do the work client-side, but maybe I'm not seeing some good use cases.
client-zip has a very different design compared to what's in the Go library and most existing libraries. Instead of instantiating an object to represent the archive and calling methods on it to add files, client-zip exposes a single function that takes an async iterable of inputs and immediately returns an output stream (which is generated lazily from that moment on).
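The shape of that interface can be sketched as follows. This is only an illustration of "async iterable in, lazy stream out": the `Entry` type and the `bundle` function are hypothetical names, and the output framing is a toy format, not Zip and not client-zip's actual code.

```typescript
// Hypothetical input shape; not client-zip's real types.
interface Entry {
  name: string;
  data: Uint8Array;
}

// Takes an async iterable and immediately returns a stream.
// With highWaterMark 0, nothing is pulled from `entries` until a consumer reads.
function bundle(entries: AsyncIterable<Entry>): ReadableStream<Uint8Array> {
  const iterator = entries[Symbol.asyncIterator]();
  const encoder = new TextEncoder();
  return new ReadableStream<Uint8Array>(
    {
      async pull(controller) {
        const result = await iterator.next();
        if (result.done) {
          controller.close();
          return;
        }
        // Toy framing (NOT the Zip format): a name line followed by the data.
        controller.enqueue(encoder.encode(result.value.name + "\n"));
        controller.enqueue(result.value.data);
      },
      async cancel() {
        // Let the source generator run its cleanup if the consumer gives up.
        await iterator.return?.();
      },
    },
    { highWaterMark: 0 },
  );
}
```

Compared with an archive object that has an `add` method, there is no intermediate state to manage: the caller describes the inputs as a sequence and the consumer's reads drive everything.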
I like my design choice very much, and I think it meshes well with core Deno code, which also favors async generators, particularly the wonderful fs/walk module. But it's a big side-step from the guideline of sticking with the Go stdlib, and from the interface of the existing tar module. It would look like this if you wanted to zip a directory:
client-zip is designed around streaming and therefore never looks ahead at file data. That means, in general (and particularly if you use compression, when that's implemented), if you create a Zip stream to feed an HTTP Response, you won't be able to send a Content-Length with that Response. The upside, of course, is low latency, low memory usage, and no need to write temporary files.
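The Content-Length trade-off can be shown with the standard `Response` API alone. In this minimal sketch, `fakeArchive` is a hypothetical stand-in for a lazily generated archive stream (it emits dummy bytes, not real Zip data), and `buffered` shows the alternative that knows the size only because it holds everything in memory first.

```typescript
// Hypothetical stand-in for a lazily generated archive stream (dummy bytes).
function fakeArchive(): ReadableStream<Uint8Array> {
  const chunks = [new Uint8Array(3), new Uint8Array(5)];
  let i = 0;
  return new ReadableStream<Uint8Array>({
    pull(controller) {
      if (i < chunks.length) controller.enqueue(chunks[i++]);
      else controller.close();
    },
  });
}

// Streaming: the total size is unknown up front, so no Content-Length is set.
const streamed = new Response(fakeArchive(), {
  headers: { "Content-Type": "application/zip" },
});

// Buffering: the size is known, but the whole archive sits in memory
// before the first byte can be sent.
async function buffered(): Promise<Response> {
  const body = await new Response(fakeArchive()).arrayBuffer();
  return new Response(body, {
    headers: {
      "Content-Type": "application/zip",
      "Content-Length": String(body.byteLength),
    },
  });
}
```

When the body is a stream of unknown length, HTTP servers typically fall back to chunked transfer encoding, so the download still works but clients cannot show a total size or percentage.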