.media_data is sometimes None (and has no rows)! Is this the right thing to do?
|Reported by:||Christopher Allan Webber||Owned by:|
Elrond and I had a conversation on IRC. Basically, I was surprised to learn that some entries do not have
.media_data records and wrote some code that exploded on me, and that this was by design (it may be that I was involved in this design, but anyway, if so I forgot :)) The way things currently work is that media_data is optionally initialized when properties are set, like so:
if len(exif_all): entry.media_data_init(exif_all=exif_all) if len(gps_data): for key in list(gps_data.keys()): gps_data['gps_' + key] = gps_data.pop(key) entry.media_data_init(**gps_data)
In the above example (which is image processing code) media data is initialized if there's exif data or if there's GPS data. However, if there's neither, the
.media_data row will never be written and trying to access
.media_data will return None.
The question is: is this a good idea? The short version is that it's a battle between two different forms of minimalism: only filling the database with rows when there's something to store results in a leaner database but messier code, always having a row results in simpler code but some rows when no rows need to exist. (Thus a battle between Elrond's razor (database simplicity) and cwebber's razor (code simplicity)!)
A longer version of these points are:
- This results in code that is harder to read:
if media.media_data is not None: # do something with media data
... which is fairly confusing.
- As for writing, one could argue that it's not much trickier to do
media.media_data.p = qor instead
media.media_data_init(p=q). (Counterpoint: maybe there are some media data forms that require setting multiple fields at once in a somewhat interdependent or tricky way?) However, initializing fields is not the only thing; one also works with things above such as the
if media.media_data is not Nonewhich can be very tricky to get right. Also, if Chris Webber was confused and wrote totally broken code because of this
.media_databeing None thing, probably so will others!
- Migrations could get a lot messier. Basically, one might write a migration that assumes a query on an images' media data, and one might forget that those rows might not always be there. This could result in broken migrations. On the other hand, migrations are already one of the trickest things in mediagoblin anyway. ;)
- The big thing though is that it's less db space to do things this way. Elrond says it's a bigger concern on MySQL/MariaDB (well, we don't support that yet anyway ;)) than it is on postgres or sqlite.
- It wouldn't be less queries though, since we still need to check whether there's a row or not. (However, Elrond points out that the payload is somewhat less if we don't return anything!)
What's the right thing to do?