Opened 13 years ago

Closed 11 years ago

Last modified 11 years ago

#419 closed defect (fixed)

MediaGoblin can't handle (upload?) Large files

Reported by: Jorge Araya Navarro
Owned by:
Priority: critical
Milestone: 0.3.3
Component: programming
Keywords: upload, handle, video, audio, files
Cc: deletesoftware@…, jorgean@…, sheila@…, Stephen Compall, joar
Parent Tickets:

Description

hello!

I tried a few times on http://mg.wandborg.se to upload a couple of podcasts, but the upload failed on every attempt, returning the joking "Broken by MediaGoblin" error message.

Each podcast is about 1 hour long and the files are larger than ~250 MiB, because I have to add a "video channel" in order to upload them to YouTube.

Change History (52)

comment:1 by Christopher Allan Webber, 13 years ago

Resolution: wontfix
Status: new → closed

Hiya,

So this is interesting. I'm not sure how to handle this ticket... it might be a bug in MediaGoblin, but it might not. From this report alone there's no way to know whether MediaGoblin hit something specific to that server or a real bug.

That said, I'm going to close it for now: if someone can run a test instance and reproduce the problem, or if we can get logs from Joar, it should be reopened. (I marked it as wontfix, but if it's a real problem that we can identify in some specific way, we should fix it!) :)

comment:2 by joar, 13 years ago

Resolution: wontfix
Status: closed → reopened
2012/03/27 21:22:45 [error] 2176#0: *212266 FastCGI sent in stderr: "Error - <type 'exceptions.MemoryError'>: 
URL: http://mg.wandborg.se/submit/
File '/srv/mg.wandborg.se/mediagoblin/lib/python2.7/site-packages/Paste-1.7.5.1-py2.7.egg/paste/exceptions/errormiddleware.py', line 144 in __call__
  app_iter = self.application(environ, sr_checker)
File '/srv/mg.wandborg.se/mediagoblin/lib/python2.7/site-packages/Paste-1.7.5.1-py2.7.egg/paste/urlmap.py', line 203 in __call__
  return app(environ, start_response)
File '/srv/mg.wandborg.se/mediagoblin/lib/python2.7/site-packages/Beaker-1.6.1-py2.7.egg/beaker/middleware.py', line 155 in __call__
  return self.wrap_app(environ, session_start_response)
File '/srv/mg.wandborg.se/mediagoblin/mediagoblin/app.py', line 181 in __call__
  response = controller(request)
File '/srv/mg.wandborg.se/mediagoblin/mediagoblin/decorators.py', line 50 in new_controller_func
  return controller(request, *args, **kwargs)
File '/srv/mg.wandborg.se/mediagoblin/mediagoblin/submit/views.py', line 106 in submit_start
  queue_file.write(request.POST['file'].file.read())
MemoryError: 

In http://parlementum.net/notice/1590626 it was mentioned that the file is 245M in size. Since mg.wandborg.se has 512M of RAM in total, it is understandable that this fails.

Still, we might have better options than loading the entire file into RAM at this point, so this will be what this issue is about.
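
For illustration only, the failing write above could in principle be replaced by a chunked copy along these lines. This is a minimal, self-contained sketch, not the actual fix; copy_in_blocks and the 64 KiB block size are placeholders:

# Sketch: copy from one file-like object to another in fixed-size blocks,
# so the whole upload never has to sit in RAM at once.
import io

def copy_in_blocks(src, dest, block_size=64 * 1024):
    while True:
        block = src.read(block_size)
        if not block:
            break
        dest.write(block)

# In submit_start() this would be roughly
#     copy_in_blocks(request.POST['file'].file, queue_file)
# instead of queue_file.write(request.POST['file'].file.read()).
if __name__ == '__main__':
    src = io.BytesIO(b'x' * (1024 * 1024))   # stand-in for the uploaded file
    dest = io.BytesIO()                      # stand-in for the queue file
    copy_in_blocks(src, dest)
    assert dest.getvalue() == src.getvalue()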

comment:3 by Brett Smith, 13 years ago

It seems to me like the basic solution is to read just part of the file, write that to the new file, and repeat until done. The trick is (a) doing that consistently, and (b) figuring out how much to read at a time.

I asked a friend of mine whom I trust about (b), and here's that conversation. Briefly, he's not aware of any algorithm or common heuristic for picking a good value, but he suggests a rule of thumb of "spend equal time in the interpreter and the syscalls" and does some back-of-the-envelope math suggesting that chunks between 16KiB and 128KiB would be good for that. I might be interested in doing some testing to help pin down a more specific number.

09:52 <@brett> xentrac: Hey, I have a question I think you might know the 
               answer to.
09:53 <@brett> So there are lots of places in the MediaGoblin code where it 
               does output_file.write(input_file.read()).
09:53 <@brett> Which is fine until someone tries to upload a 256MiB video on a 
               server with 512MiB RAM.
09:54 <@brett> The naive solution to *that* problem would be a loop that calls 
               input_file.read(some_num) and keeps writing that until we're 
               done.
09:54 <@brett> But that could potentially incur a lot of unnecessary processing 
               overhead calling all those functions repeatedly.
09:54 <@brett> Is there some kind of best practice for figuring out what 
               some_num should be to strike a good balance between available 
               RAM and processing speed?
09:55 <@xentrac> well, one thing you can do is use output_file.writelines, 
                 which at least avoids one of the function calls
09:56 <@xentrac> my rule of thumb for things like that is that the parts should 
                 be about equal
09:57 <@xentrac> in this case, the time spent in the interpreter and the time 
                 spent copying data
09:57 <@xentrac> on a logarithmic scale, anywhere between 1:10 and 10:1 is 
                 "about equal"
09:57 <@xentrac> once they're equal, you can only get up to another factor of 
                 two in performance by using bigger block sizes
09:59 <@xentrac> I figure that a Python function call costs about ten 
                 microseconds, and copying memory is on the order of 10 
                 gigabytes per second
10:00 <@xentrac> so in ten microseconds you can copy 100 kilobytes
10:01 <@xentrac> similarly, the bandwidth-latency product for disks is on the 
                 order of 200 to 500 kilobytes, so smaller writes will tend to 
                 waste time seeking, and larger ones won't.  hopefully Python 
                 and the kernel take care of that for you in this case, though.
10:02 <@xentrac> so you could write
10:02 <@xentrac> def slurp_chunks(input_file, blocksize=128*1024):
10:02 <@xentrac>     while True:
10:02 <@xentrac>         data = input_file.read(blocksize)
10:03 <@xentrac>         if not data: break
10:03 <@xentrac>         yield data
10:03 <@xentrac> output_file.writelines(slurp_chunks(input_file))
10:04 <@xentrac> on the other hand you're unlikely to have anybody uploading at 
                 anything close to 10 gigabytes per second, so maybe you should 
                 use a smaller blocksize
10:06 <@xentrac> if you could get a significant speedup by using, say, a 1MB or 
                 10MB block size instead of a 128K blocksize, then the 
                 available memory might come into it.  but I suspect you can't, 
                 and in fact you'll probably be more processor-efficient if you 
                 don't blow your L2 cache on every memory access
10:06 <@brett> Yeah, I guess that's my main concern, is that there are so many 
               variables, and they can be markedly different on different 
               deployments, so I was wondering if there was some kind of 
               magical algorithm to adjust dynamically -- at least to calculate 
               an initial block size.
10:07 <@brett> And I figured if there *was* one, well, you're well-read in the 
               literature, so you were the person I knew who was most likely to 
               know.  :)
10:07 <@xentrac> I think on anything bigger than a WRT54GL you should have 
                 near-optimal performance with a block size in that 
                 neighborhood, like 16k to 128k
10:07 <@brett> I think we can assume at least SheevaPlug-type resources.
10:08 <@xentrac> how much RAM does that have?
10:08 <@xentrac> there could easily be an algorithm to adjust it dynamically 
                 but I think it's unlikely to provide significant performance 
                 gains
10:08 <@xentrac> now I need to figure out how to restart minimilter
10:08 <@brett> 512MiB according to Wikipedia.
10:09 <@xentrac> wow
10:11 <@brett> Raspberry Pi has 256MiB, maybe that should be the baseline.

comment:4 by Christopher Allan Webber, 13 years ago

Priority: critical → blocker

comment:5 by Elrond, 13 years ago

Milestone: 0.3.0

comment:6 by Brett Smith, 13 years ago

Owner: changed from somebody to Brett Smith
Status: reopened → accepted

comment:7 by Brett Smith, 13 years ago

I have a proposed fix ready for review at https://gitorious.org/~bretts/mediagoblin/brettss-mediagoblin/commits/bug419-chunk-reads . This is a pretty naive solution; it simply implements what was discussed in my IRC log above.

comment:8 by Christopher Allan Webber, 13 years ago

So this code branch looks really good. Doing it as a yield: great. :)

My main concern with this was "would it work with cloudfiles?" But it looks like you must have tested that, since I see cloudfiles code in here? Anyway, this means file-like objects provided by storage systems will have to provide .read(SIZE) and .write(SIZE) support from now on. I'm okay with that.

Sidenote: I wonder if we should make the blocksize configurable via the config file? Maybe that's overdoing customization, but it might be something someone would want to be able to set.

comment:9 by Christopher Allan Webber, 13 years ago

I'm tempted to merge, even though I haven't tested it on cloudfiles. :)

in reply to:  8 comment:10 by Brett Smith, 13 years ago

Replying to cwebber:

My main concern with this was "would it work with cloudfiles?" But it looks like you must have tested that, since I see cloudfiles code in here? Anyway, this means file-like objects provided by storage systems will have to provide .read(SIZE) and .write(SIZE) support from now on. I'm okay with that.

I have not tested it with cloudfiles. I patched it because I patched everything that followed the out_file.write(in_file.read()) idiom, and cloudfiles did too. As it stood, that code was a potential place to tickle a bug similar to what's reported here.

That said, I think you're right that I did incorrectly assume that cloudfiles storage_objects have the same API as most Python file objects. I reviewed the API documentation and pushed another patch to the branch that I think ought to work. Fortunately, these objects have a stream() method that acts much like read_file_in_blocks(), so that made life simple.

All the tests pass with this branch. Do we not have cloudfiles tests? :)

Sidenote: I wonder if we should make the blocksize configurable via the config file? Maybe that's overdoing customization, but it might be something someone would want to be able to set.

It's not a bad idea and I'm not opposed to it, but I think I would prefer to leave it as it is for now as a sort of feedback mechanism. If a site operator comes to us and says 64KiB doesn't work for them, that could be a really great opportunity to learn why not and help us make the code better for a wider variety of deployments, maybe even in ways that aren't so configurable.

comment:11 by Christopher Allan Webber, 13 years ago

Okay great, thanks for the update.

It's clear we'll have to test it more manually with cloudfiles before this can be merged.

Regarding cloudfiles tests, the answer is "no, we don't have those." I'm not really sure what the ideal way to go about testing cloudfiles is. We could potentially mock out the API and fake submitting things through the present interface, but that doesn't change the fact that the real value of testing cloudfiles would be to test against a real instance. How to do that, I have no idea.
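
For what it's worth, here is a rough sketch of the "mock out the API" idea. FakeStorageObject and copy_in_blocks are purely illustrative stand-ins (they only mimic the write()/stream() methods mentioned in this thread), not actual MediaGoblin or cloudfiles code:

# Illustrative test sketch: a fake storage object exposing only write() and
# stream(), used to check that a chunked copy hands the backend one block
# at a time instead of the whole file.
import io

class FakeStorageObject(object):
    def __init__(self):
        self.chunks = []

    def write(self, data):
        self.chunks.append(data)

    def stream(self, chunksize=64 * 1024):
        for chunk in self.chunks:
            yield chunk

def copy_in_blocks(src, dest, block_size=64 * 1024):
    # Same helper as sketched in comment 2 above.
    while True:
        block = src.read(block_size)
        if not block:
            break
        dest.write(block)

def test_copy_writes_in_blocks():
    src = io.BytesIO(b'x' * (200 * 1024))
    dest = FakeStorageObject()
    copy_in_blocks(src, dest)
    assert b''.join(dest.chunks) == src.getvalue()
    assert all(len(chunk) <= 64 * 1024 for chunk in dest.chunks)

if __name__ == '__main__':
    test_copy_writes_in_blocks()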

By the way, cloudfiles does do "buffered writing" to some extent, but on the writing end and not the reading end. See:

    def write(self, data='', verify=True, callback=None):
        [...]
        buff = data.read(4096)
        try:
            while len(buff) > 0:
                http.send(buff)
                if verify and not self._etag_override:
                    running_checksum.update(buff)
                buff = data.read(4096)
                transfered += len(buff)
                if callable(callback):
                    callback(transfered, self.size)
            response = http.getresponse()
            buff = response.read()
        except timeout, err:
            if response:
                # pylint: disable-msg=E1101
                buff = response.read()
            raise err
        else:
            if verify and not self._etag_override:
                self._etag = running_checksum.hexdigest()

comment:12 by Christopher Allan Webber, 13 years ago

Brett and I discussed this on IRC and agreed that while this is a serious bug, it's also complex, and the solution proposed in the current branch probably does not work with all cases and storage systems.

There are several complexities, but at minimum one problem is that cloudfiles doesn't support reading and writing in chunks in the way that one might expect a file-like object to do so.

Our current thought for the nicest way to be able to handle chunked writing is to have something like so:

readable_iterable = storage1.readable_iterable()
storage2.write_iteratively(filepath, readable_iterable)
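
Fleshed out a little, such an interface might look like the sketch below. Only the two method names come from the lines above; the base directory handling, block size and everything else are illustrative guesses, not MediaGoblin's actual storage API:

# Illustrative sketch of the readable_iterable / write_iteratively idea.
import os

class LocalStorage(object):
    def __init__(self, base_dir):
        self.base_dir = base_dir

    def _full_path(self, filepath):
        return os.path.join(self.base_dir, filepath)

    def readable_iterable(self, filepath, block_size=64 * 1024):
        # Yield the stored file's contents one block at a time.
        with open(self._full_path(filepath), 'rb') as source:
            while True:
                block = source.read(block_size)
                if not block:
                    break
                yield block

    def write_iteratively(self, filepath, readable_iterable):
        # Write an iterable of byte blocks without holding them all in RAM.
        with open(self._full_path(filepath), 'wb') as dest:
            for block in readable_iterable:
                dest.write(block)

# Copying between two storages then never loads the whole file into memory:
#     storage2.write_iteratively(filepath, storage1.readable_iterable(filepath))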

comment:13 by Christopher Allan Webber, 13 years ago

Milestone: 0.3.0
Priority: blocker → critical

comment:14 by Aleksej, 13 years ago

Cc: deletesoftware@… added

comment:15 by Jakob Kramer, 13 years ago

I think #448 has something to do with that.

comment:16 by Christopher Allan Webber, 13 years ago

Milestone: 0.3.1

comment:17 by Jorge Araya Navarro, 13 years ago

Cc: jorgean@… added

comment:18 by Sheila, 13 years ago

Cc: sheila@… added

comment:19 by Christopher Allan Webber, 13 years ago

There's been some delay on working on this. It would be good to see this roll forward!

comment:20 by Christopher Allan Webber, 13 years ago

Component: component1 → programming

in reply to:  8 comment:21 by Jorge Araya Navarro, 13 years ago

Replying to cwebber:

My main concern with this was "would it work with cloudfiles?" But it looks like you must have tested that, since I see cloudfiles code in here? Anyway, this means file-like objects provided by storage systems will have to provide .read(SIZE) and .write(SIZE) support from now on. I'm okay with that.

Wait, what?
Why should you care about that? The files are received by GMG and then converted to WebM, then uploaded to cloudfiles.
From my point of view, we don't need to write anything in chunks to Rackspace; the problem came only from the upload to the GMG instance.

Did you get the idea, or is my English just getting worse? :P

comment:22 by Christopher Allan Webber, 12 years ago

Shackra:

So actually you're half right. We *do* need to write things in chunks to Rackspace, because if we don't, we'll still read the entire file into memory at once when copying things over, and we could run out of RAM. However, the cloudfiles Python API actually already does this for us.

The API we're defining here for media types still makes sense so we can have a unified way to upload things in chunks for *all* storage engines.

Anyway, in a sense we were inspired by the way the cloudfiles API handled it. Does that make sense?

comment:23 by Will Kahn-Greene, 12 years ago

Bumping this to figure out where it's at. Can someone summarize the outstanding work in this issue?

Also, this missed the 0.3.1 release. Unless someone is planning to work on it, it won't make the 0.3.2 release. Is someone planning to work on this for 0.3.2 which will probably go out in October?

comment:24 by joar, 12 years ago

Milestone: 0.3.1

The issues as far as I know are related to storing uploads on a /tmp tmpfs file system, which limits the size of /tmp to < AVAILABLE_RAM.

On consumer-class VPS servers this means that a 250M file can exhaust the /tmp filesystem.

The solution is either to change the type of the /tmp filesystem or something along those lines.

An improvement has been made in #481, which changed the output format of a temporary file used to generate the spectrogram from WAV PCM to Vorbis wrapped in either WebM or Ogg. I'm not really in a good position to test the CloudFiles storage, but I'll test it with regular FileStorage + the Tears of Steel 1080p MKV and get back with the result.

comment:25 by joar, 12 years ago

http://blender-mirror.kino3d.org/mango/download.blender.org/demo/movies/ToS/tears_of_steel_1080p.mkv (> 500M) could not be uploaded on my Linode 1024 machine with a tmpfs /tmp. df -h gives the following output:

Filesystem      Size  Used Avail Use% Mounted on
...
tmpfs           199M  199M     0 100% /tmp

comment:26 by joar, 12 years ago

I think we should consider making the temporary storage location configurable. tmpfs seems to be the default filesystem type for /tmp on Debian, and tmpfs only gets 20% of total RAM by default. That is insufficient for a deployment setting, and forcing site admins to change their /tmp filesystem isn't going to be a breeze.

comment:27 by joar, 12 years ago

Milestone: 0.3.2
Owner: changed from Brett Smith to joar
Status: accepted → assigned

I should take a look at making the configurability happen.

Notes

  • Check if werkzeug's Request object may be used
  • Check if I can override webob's Request.make_tempfile() method
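
Regarding the second note: assuming webob's Request.make_tempfile() can simply be overridden in a subclass (untested), pointing the spooled request bodies at a disk-backed directory might look roughly like this; the class name and directory are placeholders:

# Untested sketch: spool large request bodies to a disk-backed directory
# instead of the default /tmp. The path and class name are placeholders.
import tempfile

from webob import Request

class DiskSpoolingRequest(Request):
    def make_tempfile(self):
        # webob calls this when it needs somewhere to spool the request body.
        return tempfile.TemporaryFile(dir='/var/spool/mediagoblin-uploads')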

comment:28 by joar, 12 years ago

There's an easy solution to this issue.

export TMPDIR=/dir/not/on/tmpfs

http://docs.python.org/library/tempfile.html
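
A quick way to check that the variable is picked up (Python's tempfile module consults TMPDIR, among others, when choosing its default directory); /srv/tmp below is just an example path:

# Run as:  TMPDIR=/srv/tmp python check_tmpdir.py
import tempfile

print(tempfile.gettempdir())      # should print /srv/tmp
with tempfile.NamedTemporaryFile() as spool:
    print(spool.name)             # created under /srv/tmp, not /tmp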

comment:29 by Brett Smith, 12 years ago

People are talking about two separate issues in this one bug.

The original report is about the fact that there are parts of the MediaGoblin code (or were at the time, at least) that read the entire file into RAM at once, by calling file.read() without any arguments, or similar. This generally works with images or even audio files, but can exhaust available memory when handling large audio or video files.

The more recent discovery is that these files *also* get written to /tmp during processing, and can fill that partition on certain common system setups, like Debian systems with a tmpfs /tmp partition.

To me, it seems worthwhile to address both of these issues, but they're separate and will require separate fixes.

in reply to:  29 comment:30 by joar, 12 years ago

Replying to brett:

People are talking about two separate issues in this one bug.

The original report is about the fact that there are parts of the MediaGoblin code (or were at the time, at least) that read the entire file into RAM at once, by calling file.read() without any arguments, or similar. This generally works with images or even audio files, but can exhaust available memory when handling large audio or video files.

The more recent discovery is that these files *also* get written to /tmp during processing, and can fill that partition on certain common system setups, like Debian systems with a tmpfs /tmp partition.

To me, it seems worthwhile to address both of these issues, but they're separate and will require separate fixes.

The /tmp exhaustion can be solved by setting TMPDIR to a location stored on disk. Processing still fails, though: when tested with a 381M file, the file shows up in the panel at 79%, but it has apparently been dropped from processing without being marked as failed, for some reason.

comment:31 by joar, 12 years ago

The issue I derailed this ticket with has been fixed. It was the VideoThumbnailer getting into a state where it just did nothing.

The original issue was

hello!

I tried a few times on http://mg.wandborg.se to upload a couple of podcasts, but the upload failed on every attempt, returning the joking "Broken by MediaGoblin" error message.

Each podcast is about 1 hour long and the files are larger than ~250 MiB, because I have to add a "video channel" in order to upload them to YouTube.

comment:32 by spaetz, 12 years ago

OK, I have read all the comments now and I am somewhat confused about which bugs should be covered here and what is already fixed. Is the following correct?

1) Incoming files might be put (by the web server?) on /tmp, which causes problems when /tmp is a tmpfs, since it is frequently too small. Or is the /tmp directory chosen by the werkzeug request handler? We need to find that out, and find out how/where to set the temp dir.

2) Incoming files are presented to us as FileStorage objects (http://werkzeug.pocoo.org/docs/datastructures/#werkzeug.datastructures.FileStorage), and when I checked with an upload, these were internally handled as cStringIO.StringO objects, so the whole upload was held in RAM by werkzeug anyway rather than being put in any temporary directory.

This is in /submit/views.py and we do:

queue_file = request.app.queue_store.get_file(
    queue_filepath, 'wb')
with queue_file:
    queue_file.write(request.files['file'].stream.read())

This means we'll keep the whole shebang in RAM twice now: once as the cStringIO object and once by read()ing it all. FileStorage offers chunked writing in its save() method, but since our "queue_store" files might be cloudfiles or whatever, that does not easily help either.
What we need to do here is implement a "copy_to_storage" function (extending our copy_local_to_storage function) that can take a file object OR a file path and work with either, chunking the reads/writes.

2b) We might need to subclass and extend werkzeug's FileStorage objects to save to disk above a certain size threshold. I know that Django does this, keeping small uploads in RAM and saving big ones to /tmp.

3) The individual media_type handlers do things like this to move files from the queue to the final storage:

with mgg.public_store.get_file(original_filepath, 'wb') \
        as original_file:
    original_file.write(queued_file.read())

(The above really reads everything into RAM.) So we should instead use the functions that were intended for this:

storage.copy_locally and storage.copy_local_to_storage
These should work in chunks, as the underlying storage sees fit.

comment:33 by spaetz, 12 years ago

P.S. Note to self: we might be able to leverage shutil.copyfileobj, which takes two file objects and copies between them in chunks.

comment:34 by spaetz, 12 years ago

P.P.S. Then again, perhaps not (see the above comment). This is the full implementation of copyfileobj:

def copyfileobj(fsrc, fdst, length=16*1024):
    """copy data from file-like object fsrc to file-like object fdst"""
    while 1:
        buf = fsrc.read(length)
        if not buf:
            break
        fdst.write(buf)
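
Tied back to the copy_to_storage idea from comment 32, a helper built on this could accept either an open file object or a local path. Everything below (name, signature, block size) is an illustrative sketch, not actual MediaGoblin code:

# Sketch: stream a source (file-like object or local path) into a destination
# file object in chunks via shutil.copyfileobj.
import shutil

def copy_to_file(source, dest_file, length=16 * 1024):
    if hasattr(source, 'read'):
        # Already a file-like object (e.g. the upload stream).
        shutil.copyfileobj(source, dest_file, length)
    else:
        # Treat it as a path on the local filesystem.
        with open(source, 'rb') as src_file:
            shutil.copyfileobj(src_file, dest_file, length)

# e.g. instead of original_file.write(queued_file.read()):
#     copy_to_file(queued_file, original_file)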

comment:35 by Christopher Allan Webber, 12 years ago

Milestone: 0.3.2 → 0.3.3

Re: the tempfile thing, also see a recent ticket I created:

http://issues.mediagoblin.org/ticket/561

re: copyfileobj, that looks very useful, and we should probably put it to use.

Not everything is a local file though, and I'm not sure cloudfiles will support the copyfileobj approach. I still tend to think that it might be even better to create a generic method for yielding and writing in chunks as described in http://issues.mediagoblin.org/ticket/419#comment:3

We could surely do a version of copyfileobj that does something similar to this yield version?

Anyway, I don't think this is going to land in 0.3.2, though it's very important... bumping to 0.3.3.

comment:36 by spaetz, 12 years ago

I have a patch here that does chunked copy_locally and copy_local_to_storage. It is in my branch WIP/large_uploads. However, looking at our Cloudfiles.write() implementation it seems chunked writes would kill bandwidth with some O(1n) performance. This has been filed as spin-off #576.

comment:37 by Christopher Allan Webber, 12 years ago

The part that was broken out into #576 is now also merged.

comment:38 by Christopher Allan Webber, 12 years ago

I have not reviewed #419 yet but it's on my queue.

comment:39 by Stephen Compall, 12 years ago

Cc: Stephen Compall added

comment:40 by Christopher Allan Webber, 12 years ago

Okay, I reviewed it.

The branch looks really good; I wasn't aware of shutil.copyfileobj(); it's cool that it can theoretically do that with any file-like object. We should totally go that route.

The main question then is: does the openstack file storage system work? You said it's untested, and I haven't tested it yet either. Testing to make sure it works is what's necessary to merge and close out this ticket.

I can try it later, but I can't immediately. There's still a chance this could be merged for 0.3.3... if someone has a chance to hop in and try spaetz's WIP/large_uploads branch with cloudfiles that would really help me out.

comment:41 by spaetz, 12 years ago

Glad you like it. I don't know about OpenStack; I don't have the required libraries installed and don't have any OpenStack instance available, so I have never tested that code path.

comment:42 by Christopher Allan Webber, 12 years ago

I have not looked at it, but gandaro has also submitted a branch here: https://gitorious.org/~gandaro/mediagoblin/gandaros-mediagoblin/commits/647-copyfileobj

It sounds like a subset of spaetz's branch though. We should certainly get this resolved ASAP, anyway. If someone can jump on reviewing the cloudfiles compatibility, that would help greatly!

comment:43 by Christopher Allan Webber, 12 years ago

comment:44 by Christopher Allan Webber, 12 years ago

I see joar committed the cloudfiles aspect of spaetz's branch? Which I guess means the rest of it can be merged?

comment:45 by Christopher Allan Webber, 12 years ago

Cc: joar added

comment:46 by Christopher Allan Webber, 12 years ago

I just added https://gitorious.org/~cwebber/mediagoblin/cwebbers-mediagoblin/commits/419_cherrypick_large_uploads

Joar, could you please test this? It applies the other bit of spaetz's large_uploads branch (one commit turned out to not be needed). I'd like to merge and push but it would be good to verify we aren't about to break cloudfiles by pushing.

comment:47 by Christopher Allan Webber, 12 years ago

I just merged cherrypick_large_uploads. I tested it with cloudfiles and it worked great. :) Thanks for all that awesome work, spaetz!

I'm closing this out. But I'm reopening #647 because gandaro has some useful patches in there.

comment:48 by Christopher Allan Webber, 12 years ago

Resolution: fixed
Status: assigned → closed

comment:49 by spaetz, 12 years ago

Cool, glad I could help out. I will be back, by the way :-)

comment:50 by kellogs, 11 years ago

Resolution: fixed
Status: closed → accepted

I'm sorry for reopening this ticket, but it doesn't seem to be fully fixed :(

I tried to upload a 350MB OGV video file on my account at http://gobblin.se and it timed out after a few minutes with:

Server Error
YEOWCH... that's an error!
Something bad happened, and things broke.
If this is not your website, you may want to alert the owner.

Sadly I can't provide any more logs, as I'm just an end user on this instance, so maybe it's a problem specific to this particular setup.

comment:51 by Christopher Allan Webber, 11 years ago

Owner: joar removed
Resolution: fixed
Status: accepted → closed

Without more info we can't be sure that it's related to this ticket though.

comment:52 by Emily O'Leary, 11 years ago

Actually, I've had the issue that kellogs likely had, but it only happens on external hosts (like my VPS). It still seems to send the file as one large upload, which then hits the server's file size limit. It's not exactly a GMG issue if that's the case, but it would be a technical constraint to consider (it could be fixed by async uploads).

This is somewhat confusing as I would have expected these fixes to handle that (unless there was another issue in uploading that they fixed; I haven't gone through all those commits).

This issue, while related to the topic in this ticket, should probably be a different ticket.
