Opened 10 years ago

Closed 10 years ago

#834 closed defect (duplicate)

File upload size limited by available system memory

Reported by: saul
Owned by:
Priority: major
Milestone:
Component: programming
Keywords:
Cc:
Parent Tickets:

Description

File uploads appear to be limited by the amount of memory available on the system.

In particular, I receive a MemoryError from queue_file.write(submitted_file.read()) in the mediagoblin/submit/lib.py file.

This occurs when the file I upload exceeds 1.7GB -- things worked fine for files smaller than 1.7GB. I do not have any max_file_size specified and, coincidentally, 1.7GB is the amount of unused RAM I have available on my server (including swap).

These errors occur during the upload phase of the publishing process, before any transcoding is attempted. By watching top, it can be seen that memory usage keeps growing, including swap usage, and if all memory is used up then the upload fails. If the transfer completes without running out of memory then all used memory is freed and transcoding proceeds.

This to me seems indicative of either a memory leak in the file copying process, or that file copying reads the entire file into RAM before writing it back out. Regardless of the cause, it poses a severe limitation on MediaGoblin deployments on small device servers and hosted slices, where one might be charged for RAM usage or have limits placed on it.

Note: I have added an additional gigabyte of RAM to my server and was able to upload larger files, but still encountered this bug when the file size was greater than 2.7GB.

Attachments (1)

ticket834.patch (678 bytes) - added by saul 10 years ago.
Patch to fix file size limitations (#834)


Change History (8)

comment:1 by Kevin Brubeck Unhammer, 10 years ago

Is submitted_file one of these?
http://werkzeug.pocoo.org/docs/datastructures/#werkzeug.datastructures.FileStorage
If so, I would guess that its .save() method would solve it.

If not, it probably needs a loop like this one:
http://stackoverflow.com/a/16696317/69663

comment:2 by saul, 10 years ago

According to the Python docs, a file object's read() method will read the entire file unless a size is specified.

read([size]) -- Read at most size bytes from the file (less if the read hits EOF before obtaining size bytes). If the size argument is negative or omitted, read all data until EOF is reached.

Based upon this, I made the following change which appears to have fixed the problem.

    with queue_file:
        while True:
            data = submitted_file.read(16384)
            if not data:
                break
            queue_file.write(data)

I will leave it to someone better than me to code it properly.
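
For anyone who wants to check the read() behavior quoted above without a multi-gigabyte file, here is a minimal sketch using an in-memory stream. The io.BytesIO object is only a stand-in for the uploaded file (an assumption for illustration, not what MediaGoblin actually passes around), and the sizes are arbitrary.

```python
import io

# Stand-in for an uploaded file: 40000 bytes held in memory.
source = io.BytesIO(b"x" * 40000)

# With a size argument, read() returns at most that many bytes.
first = source.read(16384)

# Without a size argument, read() returns everything up to EOF in one
# object, which is what exhausts memory for very large uploads.
rest = source.read()

# At EOF, read() returns an empty bytes object, which is what ends
# the "if not data: break" loop.
end = source.read(16384)

print(len(first), len(rest), end)
```

The empty result at EOF is what makes the chunked loop terminate, so peak memory stays near the chunk size rather than the file size.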


comment:3 by Julien Gouesse, 10 years ago

I would like to upload a big video (about 9 GB). Does it mean that I have no chance to succeed?

by saul, 10 years ago

Attachment: ticket834.patch added

Patch to fix file size limitations (#834)

in reply to:  3 ; comment:4 by saul, 10 years ago

Replying to gouessej:

I would like to upload a big video (about 9 GB). Does it mean that I have no chance to succeed?


If the admin of your server can apply the attached "ticket834.patch" then you should be able to upload files of any size.


in reply to:  4 comment:5 by Julien Gouesse, 10 years ago

Replying to saulgoode:

Replying to gouessej:

I would like to upload a big video (about 9 GB). Does it mean that I have no chance to succeed?


If the admin of your server can apply the attached "ticket834.patch" then you should be able to upload files of any size.

I think it would be better if several developers could review your patch so that it gets included in the next release.

Your patch seems good to me. You read the file in chunks of 16384 bytes and leave the while loop when there is nothing left to read.

comment:6 by Julien Gouesse, 10 years ago

Actually, you should rather use shutil.copyfileobj like in this bug report:
https://issues.mediagoblin.org/ticket/647
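
A rough sketch of what the shutil.copyfileobj suggestion could look like. The helper name copy_upload and the in-memory streams are hypothetical and only for illustration; in mediagoblin/submit/lib.py the source and destination would presumably be submitted_file and queue_file, and this is not the patch attached to either ticket.

```python
import io
import shutil


def copy_upload(src, dst, chunk_size=16384):
    """Copy src to dst in fixed-size chunks (hypothetical helper).

    shutil.copyfileobj reads at most chunk_size bytes at a time, so
    memory use does not grow with the size of the file.
    """
    shutil.copyfileobj(src, dst, chunk_size)


# In-memory stand-ins for the uploaded file and the queue file.
payload = b"abc" * 100000
src = io.BytesIO(payload)
dst = io.BytesIO()

copy_upload(src, dst)
print(dst.getvalue() == payload)
```

This does the same job as the hand-written while loop, with the chunking handled by the standard library.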

comment:7 by Christopher Allan Webber, 10 years ago

Resolution: duplicate
Status: new → closed

This is indeed a duplicate of #647 but has some useful comments. Thanks for the comments, all! I'm referencing this ticket there so we can remember to check here when resolving.
