Opened 10 years ago

Closed 9 years ago

#436 closed defect (fixed)

MediaGoblin PuSH should happen post-processing

Reported by: Christopher Allan Webber Owned by:
Priority: major Milestone: 0.3.3
Component: programming Keywords: review
Cc: Parent Tickets:

Description

Right now, if you run celery separately (so processing is not synchronous), the PuSH notification goes out *before* processing has finished. This is because the notification code lives in the view. Instead, we should move this code to the end of the processing code.

This is probably a fairly easy ticket, but I'm somewhat hesitant to mark it as "bitesized".

Change History (3)

comment:1 Changed 9 years ago by spaetz

One problem is that the celery worker doing the processing might not even be connected to the public internet, so it might not be able to PuSH.

We therefore need a) a hook that fires when media processing has finished, and b) some way to use that hook to actually push out notifications of updated feeds to the PuSH hub.

Or can we require a certain class of workers that actually is connected to the Internet? Not sure if this is a case we really need to worry about.
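
To make (a) and (b) concrete, here is a minimal sketch of what such a hook could look like. Everything in it (`ProcessingHooks`, `hooks`, `process_media`, the `transcode` argument) is a hypothetical name made up for illustration and not MediaGoblin's actual API; it only shows that callbacks registered on the hook run strictly after the processing step returns.

```python
class ProcessingHooks:
    """Hypothetical registry of callables to run after processing finishes."""

    def __init__(self):
        self._callbacks = []

    def register(self, callback):
        # Returning the callback lets this double as a decorator.
        self._callbacks.append(callback)
        return callback

    def fire(self, media_id):
        # Call every registered callback, in registration order.
        for callback in self._callbacks:
            callback(media_id)


hooks = ProcessingHooks()


def process_media(media_id, transcode):
    """Run the (possibly slow) processing step, then fire the hooks."""
    transcode(media_id)   # assumed stand-in for the real processing work
    hooks.fire(media_id)  # notifications can only happen after this point
```

A PuSH notification function registered with `hooks.register` would then run wherever `process_media` runs, which is exactly why the worker's network access matters.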

comment:2 Changed 9 years ago by spaetz

Keywords: review added
Milestone: 0.3.3

Please review my branch 436_celery_push. It fixes the issue. The commit comment is:

    Make PuSHing the PubSubHubbub server an async task (#536, #585)
    
    Notifying the PuSH servers had 3 problems. 
    
    1) It was done immediately after sending the processing task to celery.
       So if celery was run in a separate process, we would notify the PuSH
       servers before the new media was processed/visible. (#536)
    
    2) Notification code was called in submit/views.py, so submitting via the
       API never resulted in notifications. (#585)
    
    3) If notifying the PuSH server failed, we would never retry.
    
    The solution was to make the PuSH notification an asynchronous subtask. This
    way: 1) it will only be called once async processing has finished, 2) it
    is in the main processing code path, so even API calls will result in
    notifications, and 3) We retry 3 times in case of failure before giving up.
    If the server is in a separate process, we will wait 3x 2 minutes before
    retrying the notification.
    
    The only downside is that the celery server needs to have access to the internet
    to ping the PuSH server. If that is a problem, we need to make the task belong
    to a special group of celery servers that has access to the internet.
    
    As a side effect, I believe I removed the limitation that prevented us from
    upgrading celery.
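
For readers without the branch at hand, the retry behaviour described in the commit comment can be sketched in plain Python. This is not the code from 436_celery_push and it does not use celery; it is a standard-library stand-in that assumes the usual PubSubHubbub publish ping (a form-encoded POST with `hub.mode=publish` and `hub.url` set to the feed URL). The names `build_publish_ping` and `notify_hub`, and the `send`/`sleep` parameters, are assumptions made for illustration. The defaults of 3 retries and a 120 second delay are taken from the commit comment.

```python
import time
import urllib.parse
import urllib.request


def build_publish_ping(hub_url, feed_url):
    """Build the form-encoded POST that tells a PuSH hub a feed has changed."""
    body = urllib.parse.urlencode({
        "hub.mode": "publish",
        "hub.url": feed_url,
    }).encode("ascii")
    return urllib.request.Request(hub_url, data=body, method="POST")


def notify_hub(hub_url, feed_url, send=urllib.request.urlopen,
               max_retries=3, retry_delay=120, sleep=time.sleep):
    """Ping the hub once, then retry up to max_retries times on failure.

    Returns True as soon as a ping succeeds, False if every attempt failed.
    `send` and `sleep` are injectable so the logic can be tried offline.
    """
    request = build_publish_ping(hub_url, feed_url)
    for attempt in range(max_retries + 1):
        try:
            send(request)
            return True
        except OSError:
            # urllib's URLError and HTTPError are both OSError subclasses.
            if attempt < max_retries:
                sleep(retry_delay)
    return False
```

In the real branch the waiting is handled by the task queue rather than by a blocking `sleep`, so a worker is not tied up between attempts; the sketch only mirrors the "try, wait 2 minutes, try again, give up after 3 retries" logic.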

comment:3 Changed 9 years ago by Christopher Allan Webber

Resolution: fixed
Status: new → closed

Looks good. Merged and pushed!
