Opened 15 years ago

Last modified 10 years ago

#183 closed enhancement

tag clouds — at Version 7

Reported by: Caleb Davis
Owned by: Jakob Kramer
Priority: minor
Milestone:
Component: programming
Keywords:
Cc:
Parent Tickets:

Description (last modified by Christopher Allan Webber)

This is another rollover from #360, but it got left behind somehow. A tag cloud is a dict-like object containing {'tag-name':frequency-of-use,...}. It's fun to have them to see all the tags that people are using publicly on an instance.
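
As a minimal sketch of the dict-like object described above, Python's `collections.Counter` can count how many media entries use each tag. The `media_entries` list and its `tags` key are hypothetical stand-ins for illustration, not MediaGoblin's actual data model.

```python
from collections import Counter

def build_tag_cloud(media_entries):
    """Return a Counter mapping tag name -> number of entries using it."""
    cloud = Counter()
    for entry in media_entries:
        # set() so a tag repeated on a single entry is counted only once
        cloud.update(set(entry.get("tags", [])))
    return cloud

# Hypothetical sample data, for illustration only.
media_entries = [
    {"title": "a", "tags": ["bunnies", "cute"]},
    {"title": "b", "tags": ["bunnies"]},
    {"title": "c", "tags": []},
]
cloud = build_tag_cloud(media_entries)
```

Calling `cloud.most_common(n)` would then give the `n` most used tags, which is usually what a cloud display needs.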

Where would we display these?

  • instance home page - all users, processed media

  • user's profile - user's processed media

  • [BONUS] - arbitrary collection (/tags/bunnies)

Open questions:

  • Should we use MapReduce? http://cookbook.mongodb.org/patterns/count_tags/ The alternative would be to write tags to a text file and do

    sort tags_text_file | uniq -c

    or do it completely within Python.

  • Should we use Celery? Generating tag clouds shouldn't slow page renders. Thoughts:

    do it with Python if you're using MapReduce since, if MapReduce gets too slow, you just add more processors!

    if it 'takes too long', then use Celery

  • How often do we update the clouds? Thoughts included:

    not during a bulk upload

  • How do we store these tag cloud objects? If we're not rendering them on the fly, then they should be in some kind of cache. Thoughts:

    user['tag_cloud'] = dict

    associate the cloud with the route. something like - {'/':'instance_cloud.txt','/u/user1':'user1_cloud.txt','/tags/bunnies':'tags_bunnies.txt'}
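
The route-keyed storage idea above could look roughly like the following sketch. It keeps precomputed clouds in an in-memory dict keyed by route, so a page render only does a lookup. The names `CLOUD_CACHE`, `refresh_cloud` and `get_cloud` are hypothetical; a real deployment would probably refresh from a background task (for example Celery) and use a shared cache rather than a module-level dict.

```python
from collections import Counter

# Hypothetical in-memory cache: route -> {tag: frequency}
CLOUD_CACHE = {}

def refresh_cloud(route, tags):
    """Recount tags for a route; a Python analogue of `sort | uniq -c`."""
    CLOUD_CACHE[route] = dict(Counter(tags))

def get_cloud(route):
    """Cheap lookup for page renders; empty cloud if never computed."""
    return CLOUD_CACHE.get(route, {})

# Illustrative data only.
refresh_cloud("/", ["bunnies", "cute", "bunnies", "goblins"])
refresh_cloud("/u/user1", ["bunnies"])
```

Because rendering never triggers a recount, a bulk upload could skip `refresh_cloud` entirely and run it once at the end.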

Change History (10)

comment:1 by Caleb Davis, 15 years ago

I haven't gone through the code in this one yet, but the feature set is interesting: http://pypi.python.org/pypi/cs.tags

comment:2 by Christopher Allan Webber, 15 years ago

Milestone: 0.0.5 → 0.1.0

Rolling over to 0.1.0 but doubtful we'll get it done in that release :)

comment:3 by Christopher Allan Webber, 15 years ago

Milestone: 0.1.0 → 0.2.0

comment:4 by Jakob Kramer, 15 years ago

Owner: set to Jakob Kramer
Status: New → In Progress

comment:3 by Jakob Kramer, 15 years ago

I think the best approach is to use Map/Reduce; it seems very simple to implement. At first I had trouble understanding how Map/Reduce works, but it is clear to me now. The tag clouds are already "finished", but I still have to write a Celery task so that generating them doesn't slow everything down.

comment:4 by Christopher Allan Webber, 15 years ago

Milestone: 0.2.0 → 0.2.1

comment:4 by Jakob Kramer, 14 years ago

Owner: set to Jakob Kramer
Status: In Progress → New

As you may have noticed, I have been very inactive again lately. I pushed my current work to my fork, so you can review it and improve it. It is very slow at the moment!

[https://gitorious.org/gandaro/mediagoblin/gandaros-mediagoblin/commits/tag-clouds](https://gitorious.org/gandaro/mediagoblin/gandaros-mediagoblin/commits/tag-clouds)

comment:5 by Will Kahn-Greene, 14 years ago

The original url for this bug was http://bugs.foocorp.net/issues/476 .
Relations:
#212: related, #207: related, #164: blocked

comment:6 by Elrond, 14 years ago

Milestone: 0.2.1

I don't think this one is top priority for the next release.
If someone wants to take it over, that's fine of course!

comment:7 by Christopher Allan Webber, 14 years ago

Description: modified (diff)

It probably makes sense that this can wait to be a plugin, once we have plugins? I'm not sure, though.

One easy thing now that we have SQL would be to show unique tags from recent images. That wouldn't have to do anything smart about generating a cloud, and would be fast.
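
A rough sketch of the "unique tags from recent images" idea, using Python's built-in `sqlite3` with an in-memory database. The `media` and `media_tags` tables and their columns are assumptions made up for this example, not MediaGoblin's actual schema.

```python
import sqlite3

# Hypothetical schema and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE media (id INTEGER PRIMARY KEY, created INTEGER);
    CREATE TABLE media_tags (media_id INTEGER, tag TEXT);
    INSERT INTO media VALUES (1, 100), (2, 200), (3, 300);
    INSERT INTO media_tags VALUES
        (1, 'old'), (2, 'bunnies'), (3, 'bunnies'), (3, 'cute');
""")

def recent_unique_tags(conn, limit=2):
    """Distinct tags attached to the `limit` most recently created media."""
    rows = conn.execute(
        """
        SELECT DISTINCT t.tag
        FROM media_tags AS t
        JOIN (SELECT id FROM media ORDER BY created DESC LIMIT ?) AS recent
          ON recent.id = t.media_id
        ORDER BY t.tag
        """,
        (limit,),
    ).fetchall()
    return [tag for (tag,) in rows]

tags = recent_unique_tags(conn)
```

This does no frequency counting at all, which matches the comment's point: it only lists which tags appear on the newest entries.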
