Last modified: 2012-03-16 08:23:21 UTC
http://www.google.pl/search?&q=site:svn.wikimedia.org/doc/ It's seemingly not indexed; it would be nice to correct this.
It appears that's true for all of svn.wikimedia.org ( http://svn.wikimedia.org/robots.txt ). Is there any particular reason to disallow Google looking at viewvc? It's not like our source code is secret.
Wonder if it's just to prevent them crawling ViewVC etc...
Aside from all revisions of all files in viewvc being a problem (not sure if viewvc has implemented nofollow/noindex; we could fix that via robots.txt on that path), this /doc/ system currently uses frames, which end up ugly via search engines (i.e. navigation missing).
Unassigning default assignments. http://article.gmane.org/gmane.science.linguistics.wikipedia.technical/54734
Looks like we have to edit /srv/org/wikimedia/svn/robots.txt on formey and add:
    Allow: /doc/*
ccing Sam and Chad since they have access there.
Might make sense to allow indexing of http://svn.wikimedia.org/users.php while we're at it.
(In reply to comment #5)
> Looks like we have to edit /srv/org/wikimedia/svn/robots.txt on formey and add:
>
> Allow: /doc/*
>
> ccing Sam and Chad since they have access there.
I unfortunately don't have svnadm, so Chad or Ops would need to deal with it. Somewhat surprised this isn't in puppet, oh well.
robots.txt content should be:
    User-Agent: *
    Allow: /doc/*
    Disallow: /
(In reply to comment #6)
> Might make sense to allow indexing of http://svn.wikimedia.org/users.php while
> we're at it.
I suppose some people will want USERINFO moved over to git as well?
Instead, just disallow viewvc: https://gerrit.wikimedia.org/r/2888
(In reply to comment #9)
> (In reply to comment #6)
> > Might make sense to allow indexing of http://svn.wikimedia.org/users.php while
> > we're at it.
>
> I suppose some people will want USERINFO moved over to git as well?
Made that bug 34851.
Change deployed by ops. http://svn.wikimedia.org/robots.txt shows the new content:
---------------------------------------------------
# THIS FILE IS MANAGED BY PUPPET
#
# puppet:///files/svn/docroot/robots.txt
# https://svn.wikimedia.org/robots.txt
#
User-Agent: *
Allow: /doc/*
Disallow: /
---------------------------------------------------
Will have to wait for Google to come around now.
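For anyone wanting to sanity-check which paths the deployed rules open up, here is a minimal Python sketch. It assumes the crawler uses longest-match precedence with `*` wildcards (the behaviour Google documents and that RFC 9309 describes); other crawlers may evaluate the rules differently. The names `RULES` and `is_allowed` are made up for illustration and are not part of any Wikimedia tooling.

```python
import re

# Rules copied from the "User-Agent: *" group of the deployed robots.txt.
RULES = [("allow", "/doc/*"), ("disallow", "/")]


def _pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    Without '$', the pattern is a prefix match.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if `path` may be crawled under `rules`.

    Assumption: the longest matching pattern wins, and on a tie
    an Allow rule beats a Disallow rule.
    """
    best = None  # (pattern length, is_allow) of the best match so far
    for kind, pattern in rules:
        if _pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), kind == "allow")
            if best is None or candidate > best:
                best = candidate
    # No matching rule means the path is not restricted.
    return True if best is None else best[1]


if __name__ == "__main__":
    for p in ("/doc/html/index.html", "/viewvc/mediawiki/", "/users.php"):
        print(p, is_allowed(p))
```

Under these assumptions only paths below /doc/ come out as crawlable; viewvc and users.php stay blocked by the catch-all `Disallow: /`.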
Googlebot came by svn and indexed the doc content :-) http://www.google.pl/search?&q=site:svn.wikimedia.org/doc/