Last modified: 2011-03-13 18:05:08 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T2707, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 707 - Google search with [[Google:search term]]
Status: RESOLVED WONTFIX
Product: MediaWiki
Classification: Unclassified
Component: Parser
Version: unspecified
Hardware: All All
Importance: Lowest enhancement with 8 votes
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Keywords: patch
Duplicates: 3014
Depends on:
Blocks:

Reported: 2004-10-13 14:34 UTC by Duncan Harris
Modified: 2011-03-13 18:05 UTC (History)
CC List: 4 users



Attachments
Patch to make interwiki links use spaces instead of underscore (620 bytes, patch)
2004-10-31 10:17 UTC, Neil Barsema

Description Duncan Harris 2004-10-13 14:35:00 UTC
A Google search is possible using [[Google:searchterm]]

1. [[Google:Search term]] searches for Search_term, not search+term, i.e. the space is
replaced with an underscore.
2. The link is dark blue when, as an external link, it should be light blue.
3. Open searches should be possible, e.g. Yahoo searches, or open searches (I'm sure
they're possible somewhere).
Comment 1 Rowan Collins [IMSoP] 2004-10-13 14:56:10 UTC
The [[Google:foo]] is actually a clever use of the "interwiki" system (see
http://meta.wikimedia.org/wiki/Help:Interwiki_linking and
http://meta.wikimedia.org/wiki/Interwiki_map); as you can see from the second of
those links, it operates by simply adding the search term provided to the end of
"http://www.google.com/search?q="

1) This is an unfortunate side-effect of the way the interwiki links are
implemented: because they are created as 'normal' Title objects, they have to
conform to the rules of MediaWiki article titles: no "+", " " becomes "_", etc.
This differs from some other wiki engines, which use a simple substitution of
whatever's in the link, leaving it to the destination site to decide what
validity/conversion rules to apply. We could solve this by either: not creating
a Title object for interwiki links (might lead to complex special case code all
over the place); or special-casing Interwiki Title objects so that they don't
have to adhere to internal validity/canonicalisation constraints (would mean
rearranging Title creation code to spot interwiki prefixes before even checking
for "validity"; there may also be checks for validity outside Title.php to worry
about).

2) It shows up as light-blue to me; it doesn't get a little ext.link icon
because it's an "interwiki link" rather than an "external link"; to be honest,
I'm not sure that's a very sensible distinction to make, but it's definitely not
showing up as an *internal* link, anyway.

3) I'm not sure what you mean by "open searches" here; as for searching other
search engines, that's just a case of adding a new Intermap entry such as
"Yahoo:" with the appropriate URL to add the terms to; not much use unless/until
(1) is fixed, though, because with neither " " nor "+" available, you can only
search for single words.
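
The behaviour described in (1) can be sketched in a few lines of Python. This is a toy model, not MediaWiki's code: INTERWIKI_MAP, normalize_title and interwiki_url are hypothetical names, and title normalization is reduced to the single rule discussed here (a space becomes an underscore).

```python
# Illustrative only: a toy model of interwiki expansion, not MediaWiki's implementation.
INTERWIKI_MAP = {
    # prefix -> URL that the link target is appended to (Google entry as quoted above)
    "google": "http://www.google.com/search?q=",
}


def normalize_title(text: str) -> str:
    """Apply the one title rule discussed here: spaces become underscores."""
    return text.strip().replace(" ", "_")


def interwiki_url(link: str) -> str:
    """Expand 'Prefix:target' by appending the normalized target to the prefix URL."""
    prefix, _, target = link.partition(":")
    base = INTERWIKI_MAP[prefix.lower()]
    return base + normalize_title(target)


print(interwiki_url("Google:Search term"))
# http://www.google.com/search?q=Search_term
```

Because the target goes through title normalization before the URL is built, the destination site never sees the space the user typed.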
Comment 2 Duncan Harris 2004-10-14 08:16:00 UTC
I've found a way to get round this.  You need to use nowiki tags within the double brackets to 
allow a + sign, e.g.

[[Google:foo+bar]]

I still think there are issues with this though and it needs some attention paid to trying to 
develop it further.
Comment 3 Duncan Harris 2004-10-14 08:16:36 UTC
Sorry, I meant [[Google:foo<nowiki>+</nowiki>bar]]
Comment 4 Rowan Collins [IMSoP] 2004-10-14 15:31:20 UTC
Hmmm. I wouldn't have expected that to work, to be honest. And on the test
server, it doesn't - in fact, it looks like there's a bug in the development
version of the software. As you can see on
http://test.wikipedia.org/wiki/Bug707, the result is kind of ugly - those
random-looking strings are placeholders inserted by the software instead of the
"nowiki" section. This has bearings on bug 337, since we should decide how such
links *should* behave; in short, don't bet on this behaviour remaining available
forever.
Comment 5 Neil Barsema 2004-10-31 10:17:16 UTC
Created attachment 118 [details]
Patch to make interwiki links use spaces instead of underscore

Of course the best solution would be to use the literal text from the
user-supplied link, but that is a bit above my understanding of the source at the
moment.

But how about using the $mTextform variable of the title in the case of interwiki
links? That is a minor change. This is the version of the Title with spaces
instead of underscores. That way you can let the target of the interwiki
link take care of the parsing. This will be no problem if it is another
MediaWiki. And it will fix the Google example.

The only problem is if you actually *need* the underscores, because this way the
underscores are changed into spaces.
Comment 6 Brion Vibber 2004-10-31 23:51:57 UTC
Underscores are necessary for links to UseMod-based wikis with free links, 
for instance. If this is going to be done, it has to be a per-interwiki option. 
One might add a field to the interwiki table listing it, or perhaps some 
funky alternate replacement: currently the interwiki URLs contain $1 as a 
placeholder for the link with underscores; another placeholder for spaces 
perhaps?
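
A rough Python sketch of the "second placeholder" idea follows. Only $1 exists according to the comment above; $2 (the form with spaces), the function name expand_interwiki and the wiki.example.org URL are assumptions made for illustration.

```python
import re


def expand_interwiki(pattern: str, target: str) -> str:
    """Fill a URL pattern: $1 is the underscored form, $2 (hypothetical) the spaced form."""
    forms = {
        "$1": target.replace(" ", "_"),
        "$2": target.replace("_", " "),
    }
    # Single pass, so text coming from the target is never re-scanned for placeholders.
    return re.sub(r"\$[12]", lambda m: forms[m.group(0)], pattern)


# A site that needs underscores keeps using $1 (example URL is made up).
print(expand_interwiki("http://wiki.example.org/wiki.pl?$1", "Free link"))
# A search engine entry could opt in to the spaced form instead.
print(expand_interwiki("http://www.google.com/search?q=$2", "Search term"))
```

The spaced form would still need URL encoding before it is emitted as a link; that step is left out here to keep the two placeholders easy to compare.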
Comment 7 lɛʁi לערי ריינהארט 2004-11-09 05:28:32 UTC
Dear friends,
I am happy about the enthusiasm to improve [[Google:search term]]. Over the
last few years I have searched a lot and want to point out some other aspects:

- "search term" could be a highly complex string with special
characters from ISO 8859-1, UTF-8 or Unicode;
-- simple example: [[Google:Español]] translates to
http://www.google.com/search?q=Espa%F1ol and not to
http://www.google.com/search?q=Espa%C3%B1ol as used by Google;

- the more precise a search is, the more useful it is; what I
mean is that parameters need to be supported '''without'''
manipulation, in a format such as [[Google:search term|parameters]], where it
would have to be decided whether parameters starts with "?" (or with "&" if some
parameters are already inserted by default);
-- example: [[Google:Pug]] works but
[[Google:Pug&num=100&as_sitesearch=en.wikipedia.org]] is (at the moment)
translated to something useless:
http://www.google.com/search?q=Pug%26num%3D100%26as_sitesearch%3Den.wikipedia.org ;

I would spend a lot of time on testing and, if required, also on working out how it
should work. Please do not hesitate to contact me if you think I should do
part of the (specification) work.

Regards Reinhardt
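
The two encodings of "Español" mentioned above can be reproduced with Python's standard urllib.parse.quote. This only illustrates the difference between Latin-1 and UTF-8 percent-encoding; it says nothing about how MediaWiki itself builds the link.

```python
from urllib.parse import quote

term = "Español"

# Percent-encode the bytes of the term under each character encoding.
latin1_form = quote(term, encoding="latin-1")
utf8_form = quote(term, encoding="utf-8")

print("http://www.google.com/search?q=" + latin1_form)  # ...?q=Espa%F1ol
print("http://www.google.com/search?q=" + utf8_form)    # ...?q=Espa%C3%B1ol
```

The single byte F1 is "ñ" in Latin-1, while UTF-8 represents the same character as the two bytes C3 B1, which is why the resulting URLs differ.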
Comment 8 Brion Vibber 2004-11-09 05:34:18 UTC
If you need something really fancy, just copy and paste the URL -- it's not that hard, and it 
keeps from bloating the syntax with hard to maintain extra variants.
Comment 9 lɛʁi לערי ריינהארט 2004-11-09 05:49:59 UTC
I would not call it fancy. In a project group every person has some skills. One knows how 
to write articles, another knows where and how to search for material, and others may have the 
painful task of wikifying hundreds of articles, reformatting them, and adding links 
or URLs. And this group here knows how to develop good software.
The idea behind the maintenance list, where this issue also arose, is not to copy links or 
to teach others how to search; it is just to provide useful links with high potential.

Regards Reinhardt
Comment 10 Rowan Collins [IMSoP] 2004-11-09 17:24:03 UTC
(In reply to comment #8)
> If you need something really fancy, just copy and paste the URL -- it's not
> that hard, and it keeps from bloating the syntax with hard to maintain extra
> variants.

I believe Reinhardt's intention was that this could then be used with templates,
as an alternative to substituting the search term into the URL with an
as-yet-nonexistent {{URLESCAPE|arbitrary text}} type syntax. However, I would
tend to agree that creating an extra piece of syntax for this might be unwise.

I was thinking maybe we could just leave "&" unescaped, but then something like
[[Google:Bill & Ben]] would break. And now I think about it, you can build just
about all the options for a search engine into the query itself, so if we had
special treatment for [some] interwiki links, you could just construct things
like [[Google:some terms "a phrase" site:en.wikipedia.org]] and the link would
become
http://www.google.com/search?q=some%20terms%20%22a%20phrase%22%20site:en.wikipedia.org
(or "+" instead of "%20", makes no odds as far as Google is concerned), which
does the exact same search as
http://www.google.com/search?as_q=some+terms&as_epq=a+phrase&as_sitesearch=en.wikipedia.org
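
As a hedged illustration of the escaping discussed above, Python's standard urllib.parse functions produce both the "%20" and the "+" forms, and show what escaping does to "&". The variable names are made up for this sketch, and keeping ":" unescaped is a choice made here only to match the URL quoted above.

```python
from urllib.parse import quote, quote_plus

BASE = "http://www.google.com/search?q="
query = 'some terms "a phrase" site:en.wikipedia.org'

# Spaces as %20, with ":" left alone so that site: stays readable.
percent_form = BASE + quote(query, safe=":")
# Spaces as "+", everything else reserved is escaped (including ":").
plus_form = BASE + quote_plus(query)
# An "&" inside the search term must be escaped, or it would start a new URL parameter.
ampersand_form = BASE + quote_plus("Bill & Ben")

print(percent_form)
print(plus_form)
print(ampersand_form)
```

The last line comes out as ...?q=Bill+%26+Ben, which keeps "Bill & Ben" inside the single q parameter.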
Comment 11 lɛʁi לערי ריינהארט 2004-11-09 22:15:37 UTC
(In reply to comment #10)
Additional notes to:
> you could just construct things
> like [[Google:some terms "a phrase" site:en.wikipedia.org]] and the link would
> become
> http://www.google.com/search?q=some%20terms%20%22a%20phrase%22%20site:en.wikipedia.org
> (or "+" instead of "%20", makes no odds as far as Google is concerned), which
> does the exact same search as
> http://www.google.com/search?as_q=some+terms&as_epq=a+phrase&as_sitesearch=en.wikipedia.org
----

I would be happy having the parameters. It will work for all English
characters. Unfortunately, for non-English characters, some character
conversion matching the one used by the major search engines is required.
Example:

[[Gerhard Schröder]] translates to
http://en.wikipedia.org/wiki/Gerhard_Schr%F6der

Please note the different character code used for "ö" in Google:
http://www.google.com/search?num=100&q=%22Gerhard+Schr%C3%B6der%22+site%3Aen.wikipedia.org
If ''we'' use the same character translation as for the URLs in
en.wikipedia, the link will fail:
http://www.google.com/search?num=100&q=%22Gerhard+Schr%F6der%22+site%3Aen.wikipedia.org

Regards Reinhardt
Comment 12 lɛʁi לערי ריינהארט 2004-11-10 00:39:43 UTC
(In reply to comment #10)
What do you think about the following solution:

[[Google:some terms "a phrase" site:xx.yy]], with xx a subdomain (such as "fr") and yy a 
domain (such as wikipedia.org), would simply open the fallback search page shown when 
internal search is disabled, searching for >>some terms "a phrase"<< with the 
site: parameter as specified inside [[ ]].

This is still an improvement because [[Google:foo]] would always search 
only for pages in the actual project.
Comment 13 Rowan Collins [IMSoP] 2004-11-10 17:29:30 UTC
(In reply to comment #12)
> [[Google:some terms "a phrase" site:xx.yy]] xx a subdomain (as "fr") and yy a 
> domain (as wikipedia.org) would simply open the fallback search page when 
> internal search is disabled searching for >>some terms "a phrase"<< with the 
> site:parameter as specified inside [[ ]].

I think you may have misunderstood my earlier comments: there is no need for us
to do anything special with the "site:foo", that is something you can type into
the Search box on Google, and it will work. I was just demonstrating that we
don't need to be able to stuff things into extra parts of the Google URL, we can
just use Google's (and, I believe, most other search engines') ability to have
all the extra information specified in the search query itself.
 
> This is still an improvement because [[Google:foo]] would always search 
> only for pages in the actual project.

Are you saying that [[Google:foo]] should act as though it was actually a search
for "foo site:en.wikipedia.org", and not just "foo"? I'm not sure that's
generally what people will be wanting: if they're linking to a Google search,
it's probably intended to be a search of the whole of Google. We could have an
extra InterWiki prefix, say "[[Search: ... ]]", which linked to the internal
search (and when that's down, you'd be directed to a choice of Google and
Yahoo!, as normal), but that's essentially another issue.

Comment 14 lɛʁi לערי ריינהארט 2005-02-16 11:27:21 UTC
Dear friends!

[[en:User:Gangleri/tests/google bugzilla:707]] provides an analysis of the
differences depending on whether [[google:foo]] is used on a [[Latin-1]] or a [[UTF-8]]
wiki, and on whether foo is ASCII only, Latin-1 or UTF-8.
a) In my opinion [[google:foo]] should give the same result regardless of the wiki.
b) There are examples of various parametrisations.
b1) One showing that "-" would be the most suitable in order to get an exact match.
b2) Another showing various parametrisations made (mainly as external links).

Please feel free to contact me if this is one of your areas of interest.

Best regards Reinhardt.
Comment 15 Rowan Collins [IMSoP] 2005-02-16 14:54:16 UTC
As far as I can see, there are several problems here which need different kinds
of solution:

1) characters that are illegal in MediaWiki titles, but not elsewhere (", +, &, etc)
--> *external* interwiki links (i.e. not inter-project ones, like "meta") should
only have to match the legal characters for a URL, not a title
--> of course, they'd still have to exclude '|' and ']', but if the string
wasn't interfered with too much, you could use '%7c' and '%5d' directly

2) URLs with parameters, particularly useful for search engines
--> this is irrelevant if (1) is dealt with, because you could use
[[Foo:Bar?arg1=baz&arg2=quux]], which is no more complex than
[[Foo:Bar|arg1=baz|arg2=quux]] (and less so in that what would that second link
*display* as, if we reuse the '|' to mean something different? Compare
[[Foo:Bar?arg=baz|This = displayed text]] and [[Foo:Bar|arg=baz|Does this =
displayed text or another param?]])
--> similarly, if (3) is [also] dealt with, you could take advantage of the
ability offered by most search engines to specify all your parameters in one
query string (e.g. [[Google:word "a phrase" site:example.com allinurl:foo]] and
so on)

3) how to translate spaces
--> as Brion says, some sites do require words to be separated by underscores,
so this needs to be choosable per prefix
--> perhaps the most flexible way would be to have a field in the database
representing which character should be substituted for a space (so things like
"Google:" would have the substitution '+' or '%20', while other wikis could
retain the substitution '_'). This leaves open the possibility for yet other
representations, such as '-' (which some sites use as it is apparently more
search-engine friendly). It's also a bit more obvious to users how to use this
than a magic "$2" or whatever.

4) character encoding
--> the simplest solution is to always use UTF-8, which is what the majority of
the Wikimedia wikis now use internally anyway
--> however, as with spaces, there may be different sites that expect their URLs
in different character encodings. If so, I think again a char_encoding field, or
maybe just use_utf8 (where 'false' would mean to use ISO 8859-1 or -15 instead)
would be more transparent than something like "$2".

Of these, (1) is actually the most complex, as it requires changes to code other
than the Title class where the interwiki links are "brought to life" (since
links with illegal characters in are just rejected by the parser right now). (3)
and (4) make the database structure a little more complex, but not much, and (4)
in particular would make things more consistent than they are now. I'm actually
tempted to break this into 2 or 3 bugs for the different issues, because I
suspect they will have to be tackled separately.
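
Points (3) and (4) could look roughly like the following Python sketch. The char_encoding name comes from the comment above; space_char, InterwikiEntry, build_url and the wiki.example.org entry are hypothetical, and this is not a proposal for the actual interwiki table schema.

```python
from dataclasses import dataclass
from urllib.parse import quote


@dataclass(frozen=True)
class InterwikiEntry:
    url: str                      # URL pattern containing the $1 placeholder
    space_char: str = "_"         # what a space in the link target becomes
    char_encoding: str = "utf-8"  # encoding used when percent-encoding the target


def build_url(entry: InterwikiEntry, target: str) -> str:
    """Encode each word in the entry's encoding, then join with its space substitute."""
    words = target.split(" ")
    encoded = entry.space_char.join(
        quote(word, safe=":", encoding=entry.char_encoding) for word in words
    )
    return entry.url.replace("$1", encoded)


google = InterwikiEntry("http://www.google.com/search?q=$1", space_char="+")
# Made-up Latin-1 wiki that keeps the default underscore substitution.
latin1_wiki = InterwikiEntry("http://wiki.example.org/wiki/$1", char_encoding="latin-1")

print(build_url(google, "Gerhard Schröder"))
print(build_url(latin1_wiki, "Gerhard Schröder"))
```

With per-prefix fields like these, "Google:" gets Gerhard+Schr%C3%B6der while the Latin-1 wiki gets Gerhard_Schr%F6der from the same link text, and a literal "+" typed by the user is still escaped as %2B.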
Comment 16 Ævar Arnfjörð Bjarmason 2005-04-01 17:31:53 UTC
You can do this now with templates; make one with these contents:

"[http://www.google.com/search?q={{{1}}} Google Search for {{{1}}}]"

and use {{Template|searchword}}, which won't help if you're looking for "+" though.
Comment 17 Ævar Arnfjörð Bjarmason 2005-04-17 05:00:08 UTC
I think this should be marked as WONTFIX, if it's to be done at all it should be
done as an extension like:

<google>term</google>
Comment 18 Rowan Collins [IMSoP] 2005-04-18 17:45:50 UTC
(In reply to comment #17)
> I think this should be marked as WONTFIX, if it's to be done at all it should be
> done as an extension like:
> 
> <google>term</google>

I disagree - interwiki links (in the broad sense, as opp. "inter-project" ones)
are simply shortcuts for linking to often-referenced external sites which have
easy to guess URLs. [I know it was originally intended specifically to link
wikis, but why discriminate? For sites like Wikipedia, that distinction is
generally irrelevant] Search engines have such URLs, and are very often
referenced. The assumption in the MediaWiki code that targets of interwiki links
will behave like MediaWiki page titles is a bad one anyway, and a search query
is just on the extreme end of the variation.

Besides which, it's a link, so marking it up as a link makes more sense than
anything else. And as soon as Google is possible, all sorts of other interwiki
prefixes can be defined similarly with basically no extra effort (unlike with an
extension, which would require recoding and installation of multiple variants).
Comment 19 Ævar Arnfjörð Bjarmason 2005-04-18 17:58:02 UTC
A valid point, in fact we already use [[cache: to link to the google cache.
Comment 20 lɛʁi לערי ריינהארט 2005-04-21 00:58:07 UTC
(In reply to comment #16)
> You can do this now with templates; make one with these contents:
> "[http://www.google.com/search?q={{{1}}} Google Search for {{{1}}}]"
> and use {{Template|searchword}}, which won't help if you're looking for "+" though.

This works only with ONE word. Try the template with "Ævar Arnfjörð Bjarmason"
and you will get http://www.google.com/search?q=Ævar Arnfjörð Bjarmason as the
result. I assume you would like to search for
http://www.google.com/search?q=%C3%86var%2BArnfj%C3%B6r%C3%B0%2BBjarmason.
Please note that
1) this is not http://www.google.com/search?q=Ævar+Arnfjörð+Bjarmason
2) the translation differs between [[Latin-1]] and [[UTF-8]] wikis

Regards Reinhardt
Comment 21 Rowan Collins [IMSoP] 2005-08-02 21:00:47 UTC
*** Bug 3014 has been marked as a duplicate of this bug. ***
Comment 22 Ævar Arnfjörð Bjarmason 2005-09-11 19:42:05 UTC
severity => enhancement
Comment 23 Daniel Kinzler 2006-05-17 15:40:19 UTC
See bug 839, which is now resolved: {{urlencode}} can be used to encode text as
of r14273.
Comment 24 Tristan Miller 2006-10-21 00:47:58 UTC
Couldn't this feature be implemented using templates rather than hard-coding the
MediaWiki source?
Comment 25 Andrew Garrett 2006-10-27 09:51:43 UTC
Yes, why not have {{google|search phrase}}?
Comment 26 Muke Tever 2006-10-27 12:22:28 UTC
(In reply to comment #25)
> Yes, why not have {{google|search phrase}}?

That's only possible now, with a template using {{urlencode}}.  
It didn't used to be possible at all (cf. comment #20).  

You still can't do [[Google:search term]], but you can now do 
[[Google:search+term]] (which was not a possible workaround earlier, 
according to comment #1) so this bug is correctly now only an 
enhancement, unless somebody develops a burning need to make 
interwiki links to a site where the space " " cannot be substituted 
with either the underscore "_" or the plus "+".
Comment 27 Rob Church 2007-01-02 21:30:27 UTC
We have templates, and we have {{urlencode}}. This can be implied to be fixed,
surely.
Comment 28 Minh Nguyễn 2007-01-03 02:49:31 UTC
{{urlencode}} works... as long as you don't put any non-ASCII characters in the
query: compare [[Google:{{urlencode:moment magnitude}}]] with
[[Google:{{urlencode:với moment magnitude}}]].
Comment 29 Aryeh Gregor (not reading bugmail, please e-mail directly) 2007-09-09 23:16:15 UTC
I concur with those above who consider this adequately addressed through means like templates, without having to stretch the meaning of interwiki links.
