Opened 10 years ago
Closed 10 years ago
#614 closed defect (fixed)
Ability to swap app configuration for different unit tests
Reported by: Christopher Allan Webber
Owned by: Christopher Allan Webber
Description
We need the ability to load different app configurations for different types of tests.
This sounds easier than it is. Technically, the function in mediagoblin.tests.tools that loads the mediagoblin app (get_app) should just be able to use whatever configuration we specify. However, it isn't that easy, for SQLAlchemy reasons. We currently use SQLAlchemy's "cascading delete" feature, so if we load one "application" that has the video media type enabled, SQLAlchemy imports and registers the VideoData table. If we then create another mediagoblin app instance without the video media type enabled, SQLAlchemy will still try to delete related entries from the videodata table during a cascading delete, fail to do so, and throw an exception.
There are two possible solutions to this:
- Figure out how to fiddle with sqlalchemy's metadata information to add/remove the link manually.
- Don't use sqlalchemy's built-in cascading delete and roll our own instead. Models could register themselves in a delete registry, so that our "manual cascading" pattern knows what to clean up when a parent row is deleted.
I'd like this fixed soon so we can start writing proper tests for plugins.
Change History (14)
comment:1 by , 10 years ago
comment:2 by , 10 years ago
comment:3 by , 10 years ago
I think .clear() would be too much; it would also involve reattaching the metadata for things like tags, etc that are very core. However, I've confirmed at least that it's possible to remove tables via the .remove() method:
>>> db.MediaEntry.metadata.sorted_tables[-1]
Table('stl__mediadata', MetaData(bind=None),
      Column('media_entry', Integer(), ForeignKey('core__media_entries.id'),
             table=<stl__mediadata>, primary_key=True, nullable=False),
      Column('center_x', Float(), table=<stl__mediadata>),
      Column('center_y', Float(), table=<stl__mediadata>),
      Column('center_z', Float(), table=<stl__mediadata>),
      Column('width', Float(), table=<stl__mediadata>),
      Column('height', Float(), table=<stl__mediadata>),
      Column('depth', Float(), table=<stl__mediadata>),
      Column('file_type', String(), table=<stl__mediadata>),
      schema=None)
>>> stl_table = db.MediaEntry.metadata.sorted_tables[-1]
>>> from mediagoblin.media_types.stl import models
# StlData's table is recognized as the table above
>>> models.StlData.__table__ in db.MediaEntry.metadata.sorted_tables
True
# Remove the table
>>> db.MediaEntry.metadata.remove(stl_table)
# Table no longer in the metadata
>>> models.StlData.__table__ in db.MediaEntry.metadata.sorted_tables
False
Knowing this, what's the next step? Do we manage *all* relations manually for plugin type things? Or do we just insert extra hackery to know how to unload and reload things?
Note, I only tried removing associations, I haven't looked into adding them.
comment:4 by , 10 years ago
Elrond has posted on stackoverflow here: http://stackoverflow.com/questions/14865342/sqlalchemy-undo-metadata-removetable
I spoke to a friend who also suggested that we might try nuking the SessionMaker object (or whatever it is that everything's attached to) and reinstantiating it. I'm not sure whether that would work, but it's one more suggestion :)
comment:5 by , 10 years ago
It does not seem that .remove() by itself gets the table to forget about the cascade. I tried it, and it didn't work... the cascading delete still tried to touch the removed table and errored.
I am increasingly thinking that either we need expert advice from SQLAlchemy core devs, or we need to handle this ourselves via the plugin system. We could use SQLAlchemy-style cascading relationships for everything else, but for plugins and media type stuff I think we might have to handle it manually. Unless we get expert advice, I see no likely possibility that we will figure it out. (And since SQLAlchemy's ORM does not really seem designed to handle this, any solution we implement may rely on fragile hacks we can't depend on.)
So maybe the question is: what sort of "registry" would that look like?
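One possible shape for such a registry, as a hypothetical pure-Python sketch (none of these names exist in MediaGoblin, and the real thing would issue deletions through the database session):

```python
# Hypothetical sketch of a manual "cascading delete" registry.
# Models register a cleanup callback keyed by the parent model's name;
# deleting a parent walks the registry instead of relying on
# SQLAlchemy's ORM-level cascades.
_delete_registry = {}

def register_child_delete(parent_name, callback):
    """Register callback(parent) to run when a parent row is deleted."""
    _delete_registry.setdefault(parent_name, []).append(callback)

def cascade_delete(parent_name, parent):
    """Run every registered child deletion for this parent type, in order."""
    return [callback(parent)
            for callback in _delete_registry.get(parent_name, [])]

# A media-type plugin registers its media_data cleanup only when loaded,
# so a configuration without that media type never touches its tables.
register_child_delete('MediaEntry',
                      lambda entry: 'deleted video data for %s' % entry)
```

Since only enabled plugins register callbacks, swapping app configurations between tests can't trip over tables belonging to disabled media types.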
comment:6 by , 10 years ago
Also, it's probably not surprising, but if you comment out the backref with cascading deletion like so:
class AsciiData(Base):
    __tablename__ = "ascii__mediadata"

    # The primary key *and* reference to the main media_entry
    media_entry = Column(Integer, ForeignKey('core__media_entries.id'),
                         primary_key=True)
    # get_media_entry = relationship("MediaEntry",
    #     backref=backref(BACKREF_NAME, uselist=False,
    #                     cascade="all, delete-orphan"))
...things work. That's probably no surprise to anyone, but it does mean that if we implement our own "manual cascade" type solution, it would definitely fix the problem.
comment:7 by , 10 years ago
Further contemplations, assuming we *were* to do a manual registry for cascading deletion:
- Where should the registry go? Probably in the PluginAPI proper.
- Would we use the callback mechanism, or would this be yet another registry?
- What should the actual deletion mechanism look like? How do we do deletions efficiently?
- Right now, the media_data stuff is a one-to-one deletion. That might not be true for everything else; one-to-one deletions should be easy, but complex cascades might be hard to implement on our own.
- Is running .delete() on each object good? It would allow for some custom code, but could be a mess in its own way. It could also mean a lot more statements executed, instead of one larger SQL statement, which could be slow or painful.
- Then again, *not* doing so could mean that certain cleanup tasks, like deleting files from the file store, might not happen properly or efficiently.
- Would we want to delete before or after? And what happens regarding constraints and deletions?
comment:8 by , 10 years ago
Elrond and I discussed a few things about this. We can probably use the callback mechanisms we have, and do something like:
class MediaEntry():
    def delete(self):
        call_hook('delete:MediaEntry', self)
        # etc

class VideoData():
    @classmethod
    def child_delete(cls, media_entry):
        # query on related media entry stuff, delete
        pass

def video_plugin_init():
    register_hook('delete:MediaEntry', VideoData.child_delete)
We might even add a mixin that adds a common hook for media entries. But this would be fairly flexible: it would allow for "simple" media entry style deletions, and it would also allow for complex stuff like cleaning up files, or more careful/conditional deletions. Additionally, it would require no monkey-patching back and forth between tests... it should only run when the plugin is enabled.
It would also require that media types become bona-fide plugins, but that's in the plans already.
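The snippet above assumes call_hook / register_hook helpers; here is a minimal sketch of what such a hook mechanism might look like (the names come from the snippet, but the real MediaGoblin PluginAPI may differ):

```python
# Minimal sketch of a hook registry like the one the snippet assumes.
_hooks = {}

def register_hook(name, func):
    """Attach func to the named hook."""
    _hooks.setdefault(name, []).append(func)

def call_hook(name, *args, **kwargs):
    """Invoke every function registered under this hook name, in order."""
    return [func(*args, **kwargs) for func in _hooks.get(name, [])]

class MediaEntry(object):
    def __init__(self, entry_id):
        self.id = entry_id

    def delete(self):
        # Let plugins clean up their child rows before the entry goes away.
        call_hook('delete:MediaEntry', self)

class VideoData(object):
    deleted_for = []  # stands in for actual row deletion

    @classmethod
    def child_delete(cls, media_entry):
        # Real code would query VideoData rows for media_entry and delete them.
        cls.deleted_for.append(media_entry.id)

def video_plugin_init():
    # Runs only when the video plugin is enabled, so disabled media types
    # never hook into the delete and the cascade problem disappears.
    register_hook('delete:MediaEntry', VideoData.child_delete)

video_plugin_init()
MediaEntry(42).delete()
```

The key property is that the registration happens at plugin-init time, so an app configured without the plugin simply never runs the child deletion.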
comment:9 by , 10 years ago
comment:10 by , 10 years ago
Several things happened last night related to this:
- Elrond created a branch that seemed like it could handle some of the property removing/adding at https://gitorious.org/~elrond/mediagoblin/elronds-mediagoblin/commits/hack/disable_table ... however it apparently was not perfect, and we both agreed it was hacky. There was some concern that it might be fragile and that SQLAlchemy devs might update things and then everything would break.
- The SQLAlchemy main dev said that this isn't really possible: https://groups.google.com/d/topic/sqlalchemy/yAw6MtVwMkc/discussion That indicates to me that trying to be clever here is a bad idea.
- We had further discussion about the hook-registration idea; Elrond raised the concern that it would mean redoing a *lot* of things, including some media_data stuff he just fixed, so it would be its own kind of mini-nightmare.
- A new hope! We thought of the idea of running each test in its own process. And indeed, this very well may be possible: https://nose.readthedocs.org/en/latest/plugins/multiprocess.html
It seems like the best option at present is to give the multiprocess nosetests plugin a try, but not for parallelization... run it with just 1 process at a time.
comment:11 by , 10 years ago
Milestone: 0.3.3 → 0.3.4
This probably won't happen in time for 0.3.3... it's most important for the plugin stuff anyway.
comment:12 by , 10 years ago
Status: new → assigned
I'm claiming and working on this during the sprint!
comment:13 by , 10 years ago
So one note:
Getting rid of beaker, a la #580, is pretty important to making this non-annoying. We need to specify which config we're using per run; as long as we need beaker for sessions, we have to specify both the paste and the mediagoblin config files... not to mention that the mediagoblin config has to be referenced from the paste config!
So much duplication! I can't wait to get rid of beaker.
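For illustration only (the file names and keys here are assumptions, not the exact MediaGoblin configs), the duplication looks roughly like this:

```ini
; paste.ini (hypothetical excerpt): the WSGI app section has to name the
; mediagoblin config file, so a test run must override both files at once.
[app:mediagoblin]
use = egg:mediagoblin#app
config = %(here)s/mediagoblin.ini
```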
comment:14 by , 10 years ago
Status: assigned → closed
Okay, I finally have a solution, and I've merged it to master. The new setup allows for running applications under different configurations. For now we make use of the --boxed feature of pytest to run each test in its own process. I thought this would slow things down, but it doesn't really! Plus, we can now push tests off to multiple processes. Here I tried a number of different test-run styles, with and without -n5 (which says to run 5 tests at a time).
./bin/py.test mediagoblin/tests/ -n5 --boxed
========================= 87 passed in 138.46 seconds ==========================
./bin/py.test mediagoblin/tests/ --boxed
========================= 87 passed in 241.68 seconds ==========================
./bin/py.test mediagoblin/tests/ -n5
========================= 87 passed in 130.91 seconds ==========================
./bin/py.test mediagoblin/tests/
========================= 87 passed in 235.78 seconds ==========================
So, no significant speed difference with or without --boxed, and MUCH faster with -n5.
Nothing in this makes use of the "multiple app configuration" stuff yet, but I'll be getting to that soon, as I implement the rest of the plugin infrastructure.
Two options for fiddling with the metadata:
- .clear() (http://docs.sqlalchemy.org/en/rel_0_8/core/schema.html#sqlalchemy.schema.MetaData.clear) and then reload all models to repopulate the metadata.
- .remove(table) can be used to remove the tables that are no longer of interest: http://docs.sqlalchemy.org/en/rel_0_8/core/schema.html#sqlalchemy.schema.MetaData.remove
Also, the metadata has a list of tables: http://docs.sqlalchemy.org/en/rel_0_8/core/schema.html#sqlalchemy.schema.MetaData.sorted_tables
This should all be implemented in open.py, because it has the knowledge to reload tables and/or know which tables should exist.
Just my few bits. Haven't tested anything.