LAMP: How to create a .zip of large files for the user on the fly, without disk/CPU thrashing

Often a web service needs to zip up several large files for download by the client. The most obvious way to do this is to create a temporary zip file, then either echo it to the user or save it to disk and redirect (deleting it some time in the future).

However, doing things that way has drawbacks:

  • an initial phase of intensive CPU and disk thrashing, resulting in...
  • a considerable initial delay to the user while the archive is prepared
  • very high memory footprint per request
  • use of substantial temporary disk space
  • if the user cancels the download half way through, all resources used in the initial phase (CPU, memory, disk) will have been wasted

Solutions like ZipStream-PHP improve on this by shovelling the data into Apache file by file. However, the result is still high memory usage (files are loaded entirely into memory), and large, thrashy spikes in disk and CPU usage.

In contrast, consider the following bash snippet:

ls -1 | zip -@ - | cat > file.zip
  # Note -@ is not supported on MacOS

Here, zip operates in streaming mode, resulting in a low memory footprint. A pipe has an integral buffer – when the buffer is full, the OS suspends the writing program (the program on the left of the pipe). This ensures that zip works only as fast as its output can be written by cat.
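To make the streaming behaviour concrete, here is a minimal sketch that wraps the pipeline above in a function. It assumes Info-ZIP `zip` is installed; `stream_zip` is a name made up for illustration, not part of any tool.

```shell
#!/bin/sh
# stream_zip DIR: write a zip of the entries listed by `ls -1` in DIR to stdout.
# Hypothetical helper name; assumes Info-ZIP `zip` is on PATH.
stream_zip() {
    # Use a subshell so the caller's working directory is left unchanged.
    (
        cd "$1" || exit 1
        # -q  suppress progress messages
        # -@  read the list of names from stdin
        # -   write the archive to stdout instead of to a file
        ls -1 | zip -q -@ -
    )
}
```

Something like `stream_zip /some/dir | pv -L 1m > file.zip` (if `pv` happens to be installed; the path is a placeholder) should show zip's CPU and disk activity dropping to match the 1 MiB/s reader, since zip blocks whenever the pipe buffer is full.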

The optimal way, then, would be to do the same: replace cat with a web server process that streams the zip file to the user as it is being created. This would add little overhead compared to just streaming the files themselves, and would have an unproblematic, non-spiky resource profile.
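Purely to illustrate the shape I have in mind (not a LAMP/PHP solution), the same pipeline could be sketched as a plain CGI shell script. This assumes Apache runs it through mod_cgi, reading the script's stdout over a pipe, and that Info-ZIP `zip` is installed; `send_zip_response` and its arguments are hypothetical names.

```shell
#!/bin/sh
# send_zip_response DIR NAME: emit CGI headers, then stream a zip of DIR's entries.
# Hypothetical sketch; assumes Info-ZIP `zip` and a CGI-style caller reading stdout.
send_zip_response() {
    # CGI response headers, terminated by an empty line.
    # No Content-Length is sent because the archive size is not known in advance.
    printf 'Content-Type: application/zip\r\n'
    printf 'Content-Disposition: attachment; filename="%s"\r\n' "$2"
    printf '\r\n'
    # Stream the archive body straight to stdout; nothing is written to disk.
    (
        cd "$1" || exit 1
        ls -1 | zip -q -@ -
    )
}
```

A script would end with a call such as `send_zip_response /some/dir files.zip` (both values are placeholders). The open question is how to get the equivalent behaviour from within PHP.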

How can you achieve this on a LAMP stack?

asked by Benji XVI, 29 October 2016 at 21:49