Last modified: 2007-05-03 01:27:44 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T11729, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 9729 - Interface to modern AI software
Status: CLOSED INVALID
Product: Wikimedia
Classification: Unclassified
Component: General/Unknown
Version: unspecified
Hardware: Other other
Importance: Normal enhancement
Assigned To: Nobody

Reported: 2007-04-27 22:38 UTC by Steve Richfield
Modified: 2007-05-03 01:27 UTC (History)

Description Steve Richfield 2007-04-27 22:38:31 UTC
Beyond the "semantic web" is a sort of machine wisdom, where you describe a 
bad situation, the computer asks a few pointed questions, and then information 
is produced that you MUST have been unaware of in order to have your problem. 
This has been working in the form of Dr. Eliza that has been demonstrated at 
the last two AI conferences in Las Vegas. This could easily be interfaced to 
Wiki projects if only some important additional information were gathered when 
people made their entries, and a database containing this information were 
then available for download. We have such a database that would make a good 
starting point.

The needed additional information includes:
1. Regular expressions that describe snippets that would likely be included in 
a statement that was stating "symptoms" (for lack of a better term) usually 
associated with the information in the entry.
2. A name for each symptom.
3. A question for each symptom that when answered would probably produce a 
snippet that was recognizable by the above regular expression.
4. A list of symptoms (or must-be-missing symptoms) for each entry. These are 
weighted according to their position in the list.
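
[Editor's illustration] The four items above can be read as a small data model. The following Python sketch is only one possible interpretation, not Dr. Eliza's actual design: the class names `Symptom` and `Entry`, the linear position weighting, the rule that a must-be-missing match zeroes the score, and the "Memory leak" example data are all assumptions made for illustration.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field


@dataclass
class Symptom:
    name: str      # item 2: a name for the symptom
    pattern: str   # item 1: regex for snippets that state the symptom
    question: str  # item 3: question whose answer should match the pattern

    def matches(self, statement: str) -> bool:
        return re.search(self.pattern, statement, re.IGNORECASE) is not None


@dataclass
class Entry:
    title: str
    expected: list[Symptom] = field(default_factory=list)         # item 4
    must_be_missing: list[Symptom] = field(default_factory=list)  # item 4

    def score(self, statement: str) -> float:
        """Fraction of position-weighted symptoms found in the statement."""
        # Assumption: any must-be-missing symptom rules the entry out.
        if any(s.matches(statement) for s in self.must_be_missing):
            return 0.0
        n = len(self.expected)
        if n == 0:
            return 0.0
        # Assumption: linear weights, earlier list positions count more.
        weights = [n - i for i in range(n)]
        hit = sum(w for w, s in zip(weights, self.expected)
                  if s.matches(statement))
        return hit / sum(weights)

    def next_question(self, statement: str) -> str | None:
        """Question for the highest-weighted symptom not yet stated."""
        for s in self.expected:
            if not s.matches(statement):
                return s.question
        return None


# Hypothetical example data, not taken from any real database.
slow = Symptom("slowdown", r"\b(slow|sluggish)",
               "Does the program get slower the longer it runs?")
memory = Symptom("memory growth", r"\bmemory\b.*\b(grow|increas|climb)",
                 "Does its memory use keep increasing?")
startup = Symptom("startup crash", r"\bcrash\w*\b.*\b(start|launch)",
                  "Does it crash right at startup?")

entry = Entry("Memory leak", expected=[slow, memory],
              must_be_missing=[startup])
statement = "My editor gets sluggish after a few hours."
print(entry.score(statement), entry.next_question(statement))
```

In this reading, the stored question is what lets the software ask a "pointed question": it is chosen from the first expected symptom the user has not yet mentioned.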

This can all be produced from a form screen, and any author who fails to fill 
in the form will simply not have their information considered by people using 
the AI interface.

The complexities of chopping long sentences into short ones, handling negation 
and tense, etc., are all taken care of by the AI software.

Dr. Eliza, with the Gracie speech front end, is now able to carry on spoken 
conversations about people's chronic illnesses, and sometimes comes up with 
new true cures even for supposedly well known conditions. With this additional 
information available, people could then discuss their life-problems with the 
world's psychological knowledge supporting the computer, political problems 
with a complete knowledge of history supporting the computer, etc., etc.

All that is needed to make all this magic now happen is just a little more 
information gathered from the authors when they make their entries.

It would appear that this new capability would supplant most existing Internet 
informational and search systems.

Thanks.

Steve Richfield
505-934-5200
Comment 1 Rob Church 2007-04-27 22:44:42 UTC
I would like to point out that BugZilla is not the sort of place to *begin*
discussions on large-scale projects of this nature; that sort of thing is best
suited to wikitech-l, or perhaps wikiresearch-l.
Comment 2 Steve Richfield 2007-04-28 07:13:00 UTC
(In reply to comment #1)
> I would like to point out that BugZilla is not the sort of place to *begin*
> discussions on large-scale projects of this nature...

Since this ONLY involves collecting some optional additional information from 
authors that is NOT presented on any non-author screens and which does NOT 
affect Wiki operation, it sure didn't look very "large" scale, at least to me.

Obviously, there must be some sort of quasi-standard as to what is a "large" 
scale project. Do you have any idea where that line should be?

I was hoping to avoid a big political debate over collecting some optional 
additional information. Asking seemed to be a no-brainer. Obviously, demanding 
such information or changing Wiki operation would be quite a different matter, 
but I am not proposing that.

Steve

Comment 3 Aryeh Gregor (not reading bugmail, please e-mail directly) 2007-04-29 01:02:30 UTC
When you say "make their entries", what do you mean?  A form shown to all users
who sign up?  Every article created?  Extra information added to articles on
medical conditions?  What sort of questions would you like asked, exactly; of
whom; and when?  A succinct statement of the exact object of your request would
be appreciated.  I'm not entirely sure what exactly you want us to do.

If I'm reading you correctly, it's something along the lines of you wanting to
make use of the Wikipedia user base to gather data for your AI.  Addition of
survey screens everywhere for everyone who would want this if we offered that to
all comers might be distracting or confusing to our users.  If I'm getting your
general thrust, you probably want to contact a Board member and not the
developers/sysadmins.  It doesn't seem so much a technical request as a request
for use of Wikimedia Foundation resources.  If I'm misunderstanding, please
correct me.
Comment 4 Steve Richfield 2007-04-29 04:03:31 UTC
(In reply to comment #3)
> When you say "make their entries", what do you mean? A form shown to all
> users who sign up?

No.

> Every article created?

Yes.

> Extra information added to articles on medical conditions?

I used medical examples because that is an area of great familiarity for me. 
What I am proposing would be applicable to a significant fraction, maybe half 
of the articles. The significant articles are those that provide details 
relevant to any subject, the knowledge of which might explain a problem that a 
user has in that area.

> What sort of questions would you like asked, exactly;

I foresee a screen akin to the advanced search screens used by several search 
engines, with wild-card, optional string, and other variables to use in 
recognizing snippets in user queries. Beyond that, the questions I outlined in 
my original posting. In addition, user name and date/time tags would be 
included in the database to help deal with duplicates, hackers, etc.
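
[Editor's illustration] As a rough sketch of how user name and date/time tags could help with duplicates, the Python below keeps only each author's most recent submission for a given article and symptom. The record layout (`Annotation`) and the "latest submission wins" rule are assumptions for illustration; the thread does not specify either.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Annotation:
    # Hypothetical record layout; field names are illustrative only.
    article: str
    symptom_name: str
    pattern: str
    user: str
    submitted: datetime


def latest_per_author(records):
    """Keep each author's newest record per (article, symptom)."""
    newest = {}
    for rec in records:
        key = (rec.article, rec.symptom_name, rec.user)
        if key not in newest or rec.submitted > newest[key].submitted:
            newest[key] = rec
    # Return in submission order so later processing is deterministic.
    return sorted(newest.values(), key=lambda r: r.submitted)


records = [
    Annotation("Memory leak", "slowdown", r"\bslow",
               "alice", datetime(2007, 4, 29, 4, 0)),
    Annotation("Memory leak", "slowdown", r"\b(slow|sluggish)",
               "alice", datetime(2007, 4, 29, 5, 0)),
    Annotation("Memory leak", "slowdown", r"\blag",
               "bob", datetime(2007, 4, 29, 4, 30)),
]
kept = latest_per_author(records)
for rec in kept:
    print(rec.user, rec.pattern)
```

Keeping the user name in the key means one author's resubmission replaces only that author's earlier attempt, so suspicious or conflicting submissions from other accounts remain visible for review.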

> of whom;

The authors when they enter or edit an article. This would be entirely 
optional.

> and when?

It is probably best to present the screen as part of the commit process.

> A succinct statement of the exact object of your request would be appreciated.

AI technology has been quietly marching along. There is >90% of enough to be 
useful in the Wikis, but crucial information is routinely absent beyond any 
conceivable automated recovery, though recovery is often possible by people 
skilled in the areas being posted. Obviously, even a less-than-the-best effort 
by authors would GREATLY help following manual cleanup efforts.

I'm not pretending to have all of the answers. Indeed, I suspect that others 
may have additional requests for information to drive their own AI engines. 
This should be opened up for debate. I suspect that the way to REALLY get 
other AI researchers' attention is to simply start gathering optional 
information, and just wait for additional requests.

My point here is that while Wiki is conspicuously unable to lead the AI field, 
it CAN produce the content for coming AI "browsers" if only it will collect 
just a little additional information. Having Wiki "lock up" its content to 
only be accessible via current dumb browsers and search engines can only 
anchor it in the present, soon to become the past.

> I'm not entirely sure what exactly you want us to do.

*I* want my Dr. Eliza engine to be able to discuss people's difficult problems 
in casual verbal conversations as it now does, but astronomically enhanced 
with Wiki's content.

I know that there are others out there with different dreams, e.g. "semantic 
web", with different requirements, e.g. key words and phrases, denial words 
and phrases, etc. I see absolutely *NO* reason not to be collecting as much of 
this information as authors will volunteer. Do you?

> If I'm reading you correctly, it's something along the lines of you wanting
> to make use of the Wikipedia user base to gather data for your AI.

Not just *my* AI, but other engines to do new and presently undreamt-of things.

> Addition of survey screens everywhere for everyone who would want this if
> we offered that to all comers might be distracting or confusing to our users.

How about just a button that they can click or skip, to take them to a survey 
screen?

> If I'm getting your general thrust, you probably want to contact a Board
> member and not the developers/sysadmins.

This doesn't appear (to me) to be a political issue at all, but I certainly 
have no objection to talking to anyone who wants to discuss this. Did you have 
anyone in mind? Contact information?

> It doesn't seem so much a technical request as a request for use of
> Wikimedia Foundation resources.

I don't see it that way at all. All I am trying to do is to unhook the present 
needless dependency on existing dumb browsers and allow client-end AI smarts 
to get a LOT more from Wiki, and in the process greatly enhance the value of 
Wiki and the Foundation.

> If I'm misunderstanding, please correct me.

Hopefully I have. Any more questions?

Steve
Comment 5 Aryeh Gregor (not reading bugmail, please e-mail directly) 2007-04-29 17:28:38 UTC
Hmm.  Well, it's not up to me to decide things like this.  I would suggest
contacting one or more board members:

http://wikimediafoundation.org/wiki/Board

Alternatively, you could also contact Brion Vibber, the CTO:

http://en.wikipedia.org/wiki/User:Brion_VIBBER
Comment 6 Brion Vibber 2007-04-30 18:51:15 UTC
This really doesn't sound like a bug report or feature request.
Comment 7 Steve Richfield 2007-05-01 05:59:46 UTC
This call obviously depends on what constitutes a "feature". This proposal 
obviously provides the interfaces needed to make AI software work with Wiki, 
and does ABSOLUTELY NOTHING for present Wiki users using present browsers, if 
indeed that is a part of the present definition of a "feature". OK, then where 
is my next stop? This is the ONLY apparent path from Wiki's present 
functionality to the AI future seen by many, so this would seem to be a 
question of when, and not if.

Alternatively, if I could get a commitment that Wiki ABSOLUTELY WILL NOT DO 
THIS, then I might be able to go elsewhere and do it there, eventually leaving 
Wiki to the dustbin of history.

Wiki's large presence coupled with an active disinterest in AI interfaces 
could be enough to put web AI into a sort of indefinite cold storage, which 
doesn't appear to be in anyone's interest.

Where in Wiki is my next stop?

Steve
Comment 8 Brion Vibber 2007-05-01 15:30:13 UTC
You are of course welcome to use any Wikipedia content in your research, but the extremely vague, 
unformed descriptions above neither invite nor require any kind of response from us at all.
Comment 9 Steve Richfield 2007-05-01 20:17:56 UTC
I did intend this as an "invitation".

While my request is admittedly open-ended, I expected a debate over exactly 
where the appropriate bounds should be for Wiki's interest and mission, 
whereupon a precise definition, including data elements, screen formats, etc., 
would be forthcoming.

Note that there is much debate in AI circles as to what the "next big thing" 
is. Some, like WWW inventor Tim Berners-Lee, think that it will be 
the "semantic web", whereas my Dr. Eliza demonstrates an even greater 
capability but with less effort to implement. I was attempting to avoid 
appearing to "grab Wiki" for a particular implementation by leaving the bounds 
open to everyone, but that very openness is what you appear to be objecting 
to. It appears that openness draws objections of vagueness, and closedness 
draws objections of attempting to grab Wiki for a particular implementation. 
Obviously (to me), you/Wiki must somehow get past this "damned if you do and 
damned if you don't" position for Wiki to ever become more than just an 
information repository that is off-limits to AI.

I will gladly send/post some articles if there is a suitable place here, but 
it isn't really practical to explain the inner workings of complex AI engines 
here, in an attempt to precisely "prove" the need for particular information.

> You are of course welcome to use any Wikipedia content in your research...

What you seem to be saying here is that Wikipedia WILL NOT make the provisions 
(e.g. adding a button) to gather additional optional information as a 
transitional step to future AI front ends. This view, if permanent, would seem 
to seal Wikipedia's fate.

Can't we establish some sort of forum to move Wikipedia, if not into the AI 
future, at least into the AI present? As CTO, this would seem to be clearly 
within your own area of responsibility.

Unfortunately, the rather negative reception received so far here does little 
to draw others in to help develop the really robust proposal that Wiki would 
need before a full implementation.

If you would open up an "official" Wiki forum, indicating Wiki's interest in 
at least exploring this area and wringing this out, then I will make the 
announcements on the various AI forums to draw people working in this area. 
Then, we should be able to develop a much more defined proposal to support a 
specific constellation of AI methodologies and/or eliminate/reduce the 
trickle/torrent of future requests for AI support.

Steve
Comment 10 Aryeh Gregor (not reading bugmail, please e-mail directly) 2007-05-01 20:29:20 UTC
CTOs manage technical affairs.  They do not necessarily direct the goals of
projects so much as their implementation.  I would again advise you to speak to
a Board member.  The Board determines the overall goals and direction of the
Wikimedia Foundation, which to my knowledge do not at present explicitly include
incorporation of machine-readable data into Wikipedia.
Comment 11 Steve Richfield 2007-05-01 23:25:49 UTC
> CTOs manage technical affairs. They do not necessarily direct the goals of
> projects so much as their implementation.

While I might argue that they are also there (in most organizations) to keep 
up on the leading edges of their technology and advise management, evidence 
here has obviously been to the contrary, so I'll grant your point, at least in 
regards to Wiki.

> I would again advise you to speak to a Board member.

OK. I chased the http://wikimediafoundation.org/wiki/Board link that you 
provided and discovered pictures, interesting biographies, etc., but nothing 
resembling contact information, other than Wiki's official address, which 
probably isn't the address of any of the board members. Do I really need to 
send snail mail to Wiki's front door to get their attention?

Thanks for your suggestions, and especially for having the patience to repeat 
them until I *finally* got the idea.

Steve
Comment 12 Aryeh Gregor (not reading bugmail, please e-mail directly) 2007-05-01 23:51:58 UTC
The profiles on that page all link to user pages on various wikis, where you can
post on their talk pages or e-mail them.  Even if not, it would not have been
very difficult to Google their contact info.  But that's neither here nor there.
I suggest you take some time to write up a cogent and thought-out proposal that
prominently explains why it's good for the Foundation's goals and explicitly
spells out what you're asking for, because no offense, but you don't seem to
have gotten your point across very clearly until now or convinced anyone that
what you're suggesting is a good idea.
Comment 13 Steve Richfield 2007-05-02 21:19:28 UTC
> I suggest you take some time to write up a cogent and thought-out proposal
> that prominently explains why it's good for the Foundation's goals

If you'll look at page 64 of the April 2007 issue of Fast Company magazine, 
you'll see an article/interview with Jimmy Wales, where he says that he wants 
to implement a Semantic Web like Wikipedia. Hence, it appears that this case 
has already been made! Now, to figure out how to contact Jimmy...

> and explicitly spells out what *you're* (emphasis added for response) asking
> for...

As explained in earlier postings, while this is absolutely NO PROBLEM for 
*me*, I fear that it could be a point of turn-off for others who want to do 
something quite different than I want to do, and the VERY LAST thing I ever 
want to do is to turn off an opening area of AI development.

Indeed, as for what *I* am looking for, rather than (trying to) perfectly 
describe a database format, I have an actual *running* (read that "debugged") 
Access .MDB database that I'll gladly send to anyone who wants to browse 
through it. EVERYTHING is in plain people-readable ASCII text with good 
internal documentation (with long field names and row comments filled in), 
where everything is quite obvious EXCEPT for how the complex AI algorithms 
roll it all together to make a conversational AI system. This includes plenty 
of working examples of "machine wisdom". Supporting this are some papers 
presented at past AI conferences.

Isn't one debugged and fully internally documented Access database, 
accompanied by peer-reviewed articles, worth more than any pile of documented 
proposals?

Thanks again for your continued patience here.

Steve
