Generate sitemap #846

Merged
jvwong merged 4 commits into unstable from iss842_generate-sitemap
Oct 1, 2020
@jvwong
Member

@jvwong jvwong commented Sep 29, 2020

  • Module: sitemap exposing generateSitemap( docs ):
    • assembles a JSON version of the sitemap data
    • uses xml-js to generate the xml string
    • writes to static folder
    • Tests for sensible xml
  • web service API: /document/sitemap?apiKey=xyz

Sample output

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://biofactoid.org/document/7af64ad2-64a5-466e-b42d-acf259c3883d</loc>
    <lastmod>2020-09-28T20:19:27.063Z</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.5</priority>
  </url>
  <url>
    <loc>https://biofactoid.org/document/13a08fd5-0299-47c3-be94-0efc0f878a90</loc>
    <lastmod>2020-09-28T20:18:18.196Z</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.5</priority>
  </url>
</urlset>
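
Output of this shape could be produced roughly as follows. This is a minimal, dependency-free sketch and not the code in this PR: the PR uses xml-js for serialization, whereas the sketch builds the string by hand. The function names (`buildSitemapXml`, `docToEntry`, `entriesToXml`) and the `id` / `lastEditedDate` fields on each document are assumptions made for illustration.

```javascript
// Assumed base URL, taken from the sample output above.
const BASE_URL = 'https://biofactoid.org';

// Escape the five XML special characters so values are safe inside elements.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Map one document to a plain-object sitemap entry.
// `id` and `lastEditedDate` are hypothetical field names.
function docToEntry(doc) {
  return {
    loc: `${BASE_URL}/document/${doc.id}`,
    lastmod: new Date(doc.lastEditedDate).toISOString(),
    changefreq: 'weekly',
    priority: '0.5'
  };
}

// Serialize the entries into a <urlset> document.
function entriesToXml(entries) {
  const urls = entries.map(entry => {
    const fields = Object.keys(entry)
      .map(key => `    <${key}>${escapeXml(entry[key])}</${key}>`)
      .join('\n');
    return `  <url>\n${fields}\n  </url>`;
  });
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...urls,
    '</urlset>'
  ].join('\n');
}

// Assemble the JSON form first, then serialize it.
function buildSitemapXml(docs) {
  return entriesToXml(docs.map(docToEntry));
}
```

Keeping the JSON assembly separate from serialization mirrors the two steps listed in the description, so the serializer could be swapped for xml-js without touching `docToEntry`.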

Refs #842

- test suite for required sitemap elements
- generateSitemap writes sitemap to static folder
@jvwong jvwong requested a review from maxkfranz September 29, 2020 19:24
@maxkfranz
Member

Would it be simpler for the sitemap to be generated dynamically? It's always up-to-date that way.

@jvwong
Member Author

jvwong commented Sep 30, 2020

> Would it be simpler for the sitemap to be generated dynamically? It's always up-to-date that way.

I think I misread your 'attach it to a route' comment.

Maybe you mean when a document becomes PUBLIC, regenerate the sitemap?

@maxkfranz
Member

I meant that it's a dynamic route rather than a static file. So something like /sitemap.xml is just a dynamic route like /api/documents that returns all the current documents.
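
The dynamic-route idea might look something like the sketch below, which uses only Node's built-in `http` module. It is an illustration under assumptions, not the project's server code: `fetchPublicDocIds`, `renderSitemap`, `handleRequest`, and `startServer` are hypothetical names, and the hard-coded ids stand in for a real database query.

```javascript
const http = require('http');

const BASE_URL = 'https://biofactoid.org';

// Hypothetical stand-in for a database query returning public document ids.
async function fetchPublicDocIds() {
  return ['foo', 'bar'];
}

// Render a minimal XML sitemap; ids are assumed to be URL-safe (e.g. UUIDs).
function renderSitemap(ids) {
  const urls = ids
    .map(id => `  <url><loc>${BASE_URL}/document/${id}</loc></url>`)
    .join('\n');
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    urls,
    '</urlset>'
  ].join('\n');
}

// Build the sitemap on every request, so it always reflects current documents.
async function handleRequest(req, res, loadIds = fetchPublicDocIds) {
  if (req.method !== 'GET' || req.url !== '/sitemap.xml') {
    res.writeHead(404, { 'Content-Type': 'text/plain' });
    res.end('Not found');
    return;
  }
  try {
    const ids = await loadIds();
    res.writeHead(200, { 'Content-Type': 'application/xml' });
    res.end(renderSitemap(ids));
  } catch (err) {
    res.writeHead(500, { 'Content-Type': 'text/plain' });
    res.end('Could not build sitemap');
  }
}

// Not called here; shown only to indicate how the handler would be mounted.
function startServer(port) {
  return http.createServer((req, res) => handleRequest(req, res)).listen(port);
}
```

Because nothing is written to disk, there is no regeneration step to trigger when a document becomes public; the trade-off is a query on each crawl.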

@jvwong
Member Author

jvwong commented Sep 30, 2020

> I meant that it's a dynamic route rather than a static file. So something like /sitemap.xml is just a dynamic route like /api/documents that returns all the current documents.

Oh ok that's a better idea.

@maxkfranz
Member

The xml format may be OK, but you can also just use a simple text file with each url separated by newlines, e.g.

https://biofactoid.org/document/foo
https://biofactoid.org/document/bar
https://biofactoid.org/document/baz

Maybe the xml format has some other advantages, but the text format is really simple.
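
For comparison, the text format can be generated in a few lines. This is a sketch only; `buildTextSitemap` is a hypothetical name, and the ids are assumed to come from whatever query lists the public documents.

```javascript
const BASE_URL = 'https://biofactoid.org';

// One absolute URL per line, with no other markup.
function buildTextSitemap(ids) {
  return ids
    .map(id => `${BASE_URL}/document/${encodeURIComponent(id)}`)
    .join('\n') + '\n';
}

console.log(buildTextSitemap(['foo', 'bar', 'baz']));
```

The text format carries only URLs, so fields such as `lastmod`, `changefreq`, and `priority` from the XML sample have no equivalent here.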

@maxkfranz
Member

The robots.txt should be included in this PR to activate the sitemap. See #842 (comment)

I think that this should be enough to start:

User-agent: *
Allow: /

Sitemap: https://biofactoid.org/api/sitemap.xml

@maxkfranz
Member

Note that the sitemap should also include the homepage: https://biofactoid.org/

return loadDoc( docOpts ).then( getDocJson );
}));
} )
.then( docs2Sitemap )
Member

Maybe append the home page here?

Member Author

It's already appended in the sitemap.js code. Isn't that OK?

@jvwong jvwong merged commit 544cfd9 into unstable Oct 1, 2020
@jvwong jvwong deleted the iss842_generate-sitemap branch October 1, 2020 18:43