Explicitly exclude urls with .. in search crawling
author Magnus Hagander <magnus@hagander.net>
Wed, 8 Nov 2017 17:02:58 +0000 (12:02 -0500)
committer Magnus Hagander <magnus@hagander.net>
Wed, 8 Nov 2017 17:04:36 +0000 (12:04 -0500)
commit 4ce8184e651e318de2328f2b47d62871cc3b43c8
tree b11abb0b341b651f94d764f9818d0206b4db32ce
parent 01846cefc9be2b2f97c45636c64eb215b0d4b44d

Per-site exclusion rules were configured for this, but their regexp was
slightly wrong. In any case, we should never crawl URLs like this
unless they are normalized, so for now just add them to the hardcoded
exclusion rules.
tools/search/crawler/lib/genericsite.py
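
The diff itself is not reproduced here, so the following is only a minimal sketch of what a hardcoded exclusion for ".." could look like. The names HARDCODED_EXCLUDES and is_excluded are hypothetical and are not taken from genericsite.py, and the choice to match ".." only as a whole path segment is an assumption; the actual change may use a simpler substring check.

```python
import re

# Hypothetical hardcoded rules; the real list in genericsite.py may differ.
HARDCODED_EXCLUDES = [
    # Match ".." only when it is a complete path segment, so that a
    # filename such as "a..b.txt" is not excluded by accident.
    re.compile(r"(^|/)\.\.(/|$)"),
]


def is_excluded(path):
    """Return True if the URL path matches any hardcoded exclusion rule."""
    return any(rx.search(path) for rx in HARDCODED_EXCLUDES)
```

Under these assumptions, is_excluded("/docs/../current/") is true and is_excluded("/docs/current/") is false. The function expects only the path component of the URL; a query string containing "/../" would also match if it were passed in.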