Opened 11 years ago
Closed 11 years ago
#834 closed defect (duplicate)
File upload size limited by available system memory
Reported by: | saul | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | |
Component: | programming | Keywords: | |
Cc: | | Parent Tickets: | |
Description
File uploads appear to be limited by the amount of memory available on the system.
In particular, I receive a MemoryError from 'queue_file.write(submitted_file.read())' in the mediagoblin/submit/lib.py file.
This occurs when the file I upload exceeds 1.7GB; things worked fine for files smaller than 1.7GB. I do not have any max_file_size specified and, coincidentally, 1.7GB is the amount of unused RAM I have available on my server (including swap).
These errors occur during the upload phase of the publishing process, before any transcoding is attempted. By watching top, it can be seen that memory usage keeps growing, including swap usage, and if all memory is used up then the upload fails. If the transfer completes without running out of memory then all used memory is freed and transcoding proceeds.
To me, this seems indicative of either a memory leak in the file copying process, or file copying that reads the entire file into RAM before writing it back out. Regardless of the cause, it poses a severe limitation on MediaGoblin deployments on small device servers and hosted slices, where one might be charged for RAM usage or have limits placed on it.
Note: I have added an additional gigabyte of RAM to my server and was able to upload larger files, but still encountered this bug when the file size was greater than 2.7GB.
Attachments (1)
Change History (8)
comment:1 by , 11 years ago
comment:2 by , 11 years ago
According to the Python docs, a file object's read() method will read the entire file unless a size is specified.
read([size]) -- Read at most size bytes from the file (less if the read hits EOF before obtaining size bytes). If the size argument is negative or omitted, read all data until EOF is reached.
Based upon this, I made the following change which appears to have fixed the problem.
    with queue_file:
        while True:
            data = submitted_file.read(16384)
            if not data:
                break
            queue_file.write(data)
I will leave it to someone better than me to code it properly.
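For anyone who wants to try the chunked approach outside MediaGoblin, here is a minimal, self-contained sketch of the same idea. The function name copy_in_chunks and the in-memory io.BytesIO objects are assumptions made for illustration; they stand in for the real submitted_file and queue_file and are not part of the MediaGoblin code.

```python
import io

CHUNK_SIZE = 16384  # same chunk size as the loop in this comment


def copy_in_chunks(src, dst, chunk_size=CHUNK_SIZE):
    """Copy src to dst one chunk at a time and return the bytes copied.

    Hypothetical helper: only chunk_size bytes are held in memory at once,
    instead of the whole file as with a bare read().
    """
    total = 0
    while True:
        data = src.read(chunk_size)
        if not data:  # an empty result means EOF was reached
            break
        dst.write(data)
        total += len(data)
    return total


# In-memory stand-ins for submitted_file and queue_file (assumptions).
submitted_file = io.BytesIO(b"x" * 100000)
queue_file = io.BytesIO()
copied = copy_in_chunks(submitted_file, queue_file)
```

With real files, memory use should stay near the chunk size regardless of how large the upload is.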
follow-up: 4 comment:3 by , 11 years ago
I would like to upload a big video (about 9 GB). Does it mean that I have no chance to succeed?
follow-up: 5 comment:4 by , 11 years ago
Replying to gouessej:
I would like to upload a big video (about 9 GB). Does it mean that I have no chance to succeed?
If the admin of your server can apply the attached "ticket834.patch" then you should be able to upload files of any size.
comment:5 by , 11 years ago
Replying to saulgoode:
Replying to gouessej:
I would like to upload a big video (about 9 GB). Does it mean that I have no chance to succeed?
If the admin of your server can apply the attached "ticket834.patch" then you should be able to upload files of any size.
I think it would be better if several developers could review your patch so that it gets included in the next release.
Your patch seems good to me. You read the file in chunks of 16384 bytes and you leave the while loop when there is nothing left to read.
comment:6 by , 11 years ago
Actually, it would be better to use shutil.copyfileobj, as in this bug report:
https://issues.mediagoblin.org/ticket/647
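As a rough illustration of that suggestion, shutil.copyfileobj from the standard library copies between two file-like objects in fixed-size pieces. The sketch below is an assumption about how it could be applied here: the helper name save_upload and the in-memory objects are hypothetical, and the actual change made for #647 may differ.

```python
import io
import shutil


def save_upload(submitted_file, queue_file, buffer_size=16384):
    """Hypothetical helper: stream submitted_file into queue_file.

    shutil.copyfileobj reads and writes buffer_size bytes at a time,
    so the whole upload is never held in memory at once.
    """
    shutil.copyfileobj(submitted_file, queue_file, buffer_size)


# In-memory stand-ins for the real upload and queue files (assumptions).
payload = b"a" * 50000 + b"b" * 50000
source = io.BytesIO(payload)
target = io.BytesIO()
save_upload(source, target)
```

This replaces the hand-written while loop with a single library call while keeping the same bounded memory use.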
comment:7 by , 11 years ago
Resolution: | → duplicate |
---|---|
Status: | new → closed |
This is indeed a duplicate of #647 but has some useful comments. Thanks for the comments, all! I'm referencing this ticket there so we can remember to check here when resolving.
Is submitted_file one of these?
http://werkzeug.pocoo.org/docs/datastructures/#werkzeug.datastructures.FileStorage
If so, I would guess that its .save() method would solve it.
If not, it probably needs a loop like this one:
http://stackoverflow.com/a/16696317/69663