Opened 11 years ago
Closed 10 years ago
#802 closed defect (wontfix)
sitemap.xml generation
Reported by: | user-A | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | |
Component: | programming | Keywords: | seo, sitemaps, search |
Cc: | Parent Tickets: |
Description
Hi, I'm participating in the open-source distributed search engine YaCy. I would therefore like to encourage you to offer web spiders a sitemap.xml file (and to reference it in robots.txt): https://en.wikipedia.org/wiki/Sitemaps
This allows the bot to simply walk through the flat list, so it doesn't have to derive the site structure from the parsed HTML pages. It keeps the search engine from missing content and speeds up the analysis process.
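To illustrate the crawler side of this, here is a minimal sketch of how a bot might read the flat list. It is not taken from YaCy; the function names and the example.org URLs are made up for illustration, and it uses only the Python standard library.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def urls_from_sitemap(xml_text):
    """Return the <loc> values of a sitemap as a flat list of URLs."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.iter("{%s}loc" % SITEMAP_NS)
        if loc.text
    ]


def sitemaps_from_robots(robots_text):
    """Return the sitemap URLs announced by 'Sitemap:' lines in robots.txt."""
    found = []
    for line in robots_text.splitlines():
        # Split at the first colon only, so the URL's own colons survive.
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "sitemap":
            found.append(value.strip())
    return found
```

A crawler would first look for `Sitemap:` lines in robots.txt, then fetch each listed file and queue the returned URLs, with no need to discover them by following links.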
Attachments (1)
Change History (4)
comment:1 by , 11 years ago
by , 11 years ago
Attachment: | 802-sitemap-generation.patch added |
---|
comment:2 by , 10 years ago
Status: | new → review |
---|
comment:3 by , 10 years ago
Resolution: | → wontfix |
---|---|
Status: | review → closed |
This iterates through all users and media, and depending on the size of the site, that could be a *ton* of users and media, enough to generate an enormous document or bring the site crashing to a halt. Of course, on smaller instances, that may be no problem... and maybe this patch is ideal for smaller instances.
Regardless, I get the sense that this would be best handled as an external plugin.
It's a good idea though! I just don't think I see a way to bring it into mediagoblin proper.
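For anyone picking this up as an external plugin, one way to soften the size concern above is to split the URL list into several sitemap files and publish a sitemap index that points at them. The sitemaps.org protocol caps a single file at 50,000 URLs, so large sites need this anyway. The sketch below is only an assumption about how that could look: the helper names, the `sitemap-N.xml` file naming, and the `base_url` parameter are hypothetical and not part of the attached patch or of mediagoblin.

```python
from itertools import islice
from xml.sax.saxutils import escape

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50000  # per-file limit from the sitemaps.org protocol


def chunked(iterable, size):
    """Yield lists of at most `size` items without loading everything at once."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def render_urlset(urls):
    """Render one sitemap file containing only <loc> entries."""
    lines = ['<?xml version="1.0" encoding="UTF-8"?>',
             '<urlset xmlns="%s">' % SITEMAP_NS]
    lines.extend("  <url><loc>%s</loc></url>" % escape(u) for u in urls)
    lines.append("</urlset>")
    return "\n".join(lines) + "\n"


def render_index(sitemap_urls):
    """Render the sitemap index that lists the individual sitemap files."""
    lines = ['<?xml version="1.0" encoding="UTF-8"?>',
             '<sitemapindex xmlns="%s">' % SITEMAP_NS]
    lines.extend("  <sitemap><loc>%s</loc></sitemap>" % escape(u)
                 for u in sitemap_urls)
    lines.append("</sitemapindex>")
    return "\n".join(lines) + "\n"


def build_sitemaps(urls, base_url, max_urls=MAX_URLS_PER_FILE):
    """Return (index_xml, {filename: urlset_xml}) for an iterable of URLs."""
    files = {}
    for n, chunk in enumerate(chunked(urls, max_urls), start=1):
        files["sitemap-%d.xml" % n] = render_urlset(chunk)
    prefix = base_url.rstrip("/") + "/"
    index = render_index(prefix + name for name in files)
    return index, files
```

This version still keeps the rendered files in a dict for simplicity. On a large instance it would probably make more sense to write each chunk to disk from a periodic background job rather than build it on every request.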
I know this feature request hasn't been accepted yet, but I have already implemented it. So far it is just a flat list of URLs to index, with no “lastmod” dates, because mediagoblin doesn't store this information. But it can already simplify the crawling process. To user-A: I don't have any experience with YaCy; could you please test whether this satisfies its needs? Just download my patch file from the attachments section of this ticket, apply it to any instance of mediagoblin, and it will add /sitemap.xml and robots.txt. Try to index it with YaCy and tell me whether this is enough, or whether there's something I should add or change.
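For readers who don't want to open the patch, the kind of output described here (a flat list of URLs without “lastmod”, plus a robots.txt that points at it) can be sketched roughly as follows. This is not the code from the attached patch; the function names and the example.org URLs are placeholders, and the permissive `User-agent` rule is an assumption.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def flat_sitemap(urls):
    """Build a sitemap with one <url><loc> entry per URL and no <lastmod>."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url  # text is XML-escaped on output
    return ET.tostring(urlset, encoding="unicode")


def robots_txt(sitemap_url):
    """Build a robots.txt that allows all crawlers and announces the sitemap."""
    return "User-agent: *\nDisallow:\n\nSitemap: %s\n" % sitemap_url


if __name__ == "__main__":
    print(flat_sitemap(["https://example.org/u/alice/",
                        "https://example.org/u/alice/m/cat/"]))
    print(robots_txt("https://example.org/sitemap.xml"))
```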