Last modified: 2007-05-03 01:27:44 UTC
Beyond the "semantic web" is a sort of machine wisdom, where you describe a
bad situation, the computer asks a few pointed questions, and then information
is produced that you MUST have been unaware of in order to have your problem.
This already works in the form of Dr. Eliza, which has been demonstrated at
the last two AI conferences in Las Vegas. This could easily be interfaced to
Wiki projects if only some important additional information were gathered when
people made their entries, and a database containing this information were
then available for download. We have such a database that would make a good
The needed additional information includes:
1. Regular expressions that describe snippets likely to appear in a statement
of the "symptoms" (for lack of a better term) usually associated with the
information in the entry.
2. A name for each symptom.
3. A question for each symptom that, when answered, would probably produce a
snippet recognizable by the above regular expression.
4. A list of symptoms (or must-be-missing symptoms) for each entry. These are
weighted according to their position in the list.
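As a rough illustration of the four items above, here is a minimal Python sketch. The class names (Symptom, Entry), the 1/(position+1) weighting, and the sample patterns are all assumptions made up for this example; they are not taken from Dr. Eliza's actual database or algorithms, and the sketch does not attempt negation or tense handling.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Symptom:
    name: str       # item 2: a name for the symptom
    pattern: str    # item 1: regex matching snippets that state the symptom
    question: str   # item 3: question whose answer should match the pattern

    def matches(self, text: str) -> bool:
        return re.search(self.pattern, text, re.IGNORECASE) is not None


@dataclass
class Entry:
    title: str
    symptoms: list = field(default_factory=list)         # item 4, most important first
    must_be_missing: list = field(default_factory=list)  # item 4, must-be-missing list

    def score(self, statement: str) -> float:
        # Hypothetical weighting: earlier list positions count more.
        total = 0.0
        for position, symptom in enumerate(self.symptoms):
            if symptom.matches(statement):
                total += 1.0 / (position + 1)
        for position, symptom in enumerate(self.must_be_missing):
            if symptom.matches(statement):
                total -= 1.0 / (position + 1)
        return total

    def next_question(self, statement: str):
        # Ask about the highest-ranked symptom not yet recognized.
        for symptom in self.symptoms:
            if not symptom.matches(statement):
                return symptom.question
        return None


# Invented sample data, for illustration only.
fatigue = Symptom("fatigue", r"\b(tired|fatigued?|exhausted)\b", "Do you often feel tired?")
cold = Symptom("feels cold", r"\b(cold|chilly)\b", "Do you often feel cold?")
fever = Symptom("fever", r"\bfever(ish)?\b", "Do you have a fever?")

entry = Entry("Example entry", symptoms=[fatigue, cold], must_be_missing=[fever])
print(entry.score("I am always tired"))
print(entry.next_question("I am always tired"))
```

The point of the sketch is only that the requested fields are small and plain: one pattern, one name, and one question per symptom, plus an ordered list per entry.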
This can all be produced from a form screen, and any author who fails to fill
in the form will simply not have their information considered by people using
the AI interface.
The complexities of chopping long sentences into short ones, handling negation
and tense, etc., are all taken care of by the AI software.
Dr. Eliza, with the Gracie speech front end, is now able to carry on spoken
conversations about people's chronic illnesses, and sometimes comes up with
new true cures even for supposedly well known conditions. With this additional
information available, people could then discuss their life-problems with the
world's psychological knowledge supporting the computer, political problems
with a complete knowledge of history supporting the computer, etc., etc.
All that is needed to make all this magic now happen is just a little more
information gathered from the authors when they make their entries.
It would appear that this new capability would supplant most existing Internet
informational and search systems.
I would like to point out that BugZilla is not the sort of place to *begin*
discussions on large-scale projects of this nature; that sort of thing is best
suited to wikitech-l, or perhaps wikiresearch-l.
(In reply to comment #1)
> I would like to point out that BugZilla is not the sort of place to *begin*
> discussions on large-scale projects of this nature...
Since this ONLY involves collecting some optional additional information from
authors that is NOT presented on any non-author screens and which does NOT
affect Wiki operation, it sure didn't look very "large" scale, at least to me.
Obviously, there must be some sort of quasi-standard as to what is a "large"
scale project. Do you have any idea where that line should be?
I was hoping to avoid a big political debate over collecting some optional
additional information. Asking seemed to be a no-brainer. Obviously, demanding
such information or changing Wiki operation would be quite a different matter,
but I am not proposing that.
When you say "make their entries", what do you mean? A form shown to all users
who sign up? Every article created? Extra information added to articles on
medical conditions? What sort of questions would you like asked, exactly; of
whom; and when? A succinct statement of the exact object of your request would
be appreciated. I'm not entirely sure what exactly you want us to do.
If I'm reading you correctly, it's something along the lines of you wanting to
make use of the Wikipedia user base to gather data for your AI. Addition of
survey screens everywhere for everyone who would want this if we offered that to
all comers might be distracting or confusing to our users. If I'm getting your
general thrust, you probably want to contact a Board member and not the
developers/sysadmins. It doesn't seem so much a technical request as a request
for use of Wikimedia Foundation resources. If I'm misunderstanding, please
correct me.
(In reply to comment #3)
> When you say "make their entries", what do you mean? A form shown to all
> users who sign up?
> Every article created?
> Extra information added to articles on medical conditions?
I used medical examples because that is an area of great familiarity for me.
What I am proposing would be applicable to a significant fraction, maybe half
of the articles. The significant articles are those that provide details
relevant to any subject, the knowledge of which might explain a problem that a
user has in that area.
> What sort of questions would you like asked, exactly;
I foresee a screen akin to the advanced search screens used by several search
engines, with wild-card, optional string, and other variables to use in
recognizing snippets in user queries. Beyond that, the questions I outlined in
my original posting. In addition, user name and date/time tags would be
included in the database to help deal with duplicates, hackers, etc.
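A form like this might store its input roughly as follows. This is only a sketch under stated assumptions: the `*` and `[word]` form syntax, the Submission record, and the latest_per_author helper are hypothetical names invented here for illustration, not part of any existing Wiki or Dr. Eliza interface.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timezone


def form_pattern_to_regex(pattern: str) -> str:
    """Translate a hypothetical form syntax into a regular expression.

    Assumed syntax: '*' matches any run of word characters, and
    '[word]' marks an optional word. Everything else is literal.
    """
    parts = []
    # re.split with one capturing group alternates literal text (even
    # indexes) with the captured special tokens (odd indexes).
    tokens = re.split(r"(\*|\[[^\]]*\]\s*)", pattern)
    for index, token in enumerate(tokens):
        if index % 2 == 0:
            parts.append(re.escape(token))
        elif token == "*":
            parts.append(r"\w*")
        else:
            word = token.strip()[1:-1]
            parts.append(r"(?:" + re.escape(word) + r"\s+)?")
    return "".join(parts)


@dataclass(frozen=True)
class Submission:
    article: str
    symptom_name: str
    regex: str
    author: str            # user name tag
    submitted_at: datetime  # date/time tag


def latest_per_author(submissions):
    """Keep only the newest submission per (article, symptom, author)."""
    latest = {}
    for s in submissions:
        key = (s.article, s.symptom_name, s.author)
        if key not in latest or s.submitted_at > latest[key].submitted_at:
            latest[key] = s
    return list(latest.values())


rx = form_pattern_to_regex("[very] tired*")
first = Submission("Example", "fatigue", rx, "alice",
                   datetime(2007, 1, 1, tzinfo=timezone.utc))
second = Submission("Example", "fatigue", rx, "alice",
                    datetime(2007, 2, 1, tzinfo=timezone.utc))
kept = latest_per_author([first, second])
```

Keeping the author and timestamp on every row means duplicates can be resolved after the fact, without rejecting anything at submission time.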
> of whom;
The authors when they enter or edit an article. This would be entirely
> and when?
It is probably best to present the screen as part of the commit process.
> A succinct statement of the exact object of your request would be appreciated.
AI technology has been quietly marching along. More than 90% of what is needed
for it to be useful in the Wikis is already there, but crucial information is
routinely absent and beyond any conceivable automated recovery, though people
skilled in the areas being posted can often recover it. Obviously, even a
less-than-the-best effort by authors would GREATLY help subsequent manual
cleanup efforts.
I'm not pretending to have all of the answers. Indeed, I suspect that others
may have additional requests for information to drive their own AI engines.
This should be opened up for debate. I suspect that the way to REALLY get
other AI researchers' attention is to simply start gathering optional
information, and just wait for additional requests.
My point here is that while Wiki is conspicuously unable to lead the AI field,
it CAN produce the content for coming AI "browsers" if only it will collect
just a little additional information. Having Wiki "lock up" its content to
only be accessible via current dumb browsers and search engines can only
anchor it in the present, soon to become the past.
> I'm not entirely sure what exactly you want us to do.
*I* want my Dr. Eliza engine to be able to discuss people's difficult problems
in casual verbal conversations as it now does, but astronomically enhanced
with Wiki's content.
I know that there are others out there with different dreams, e.g. "semantic
web", with different requirements, e.g. key words and phrases, denial words
and phrases, etc. I see absolutely *NO* reason not to be collecting as much of
this information as authors will volunteer. Do you?
> If I'm reading you correctly, it's something along the lines of you wanting
> to make use of the Wikipedia user base to gather data for your AI.
Not just *my* AI, but other engines to do new and presently undreamt-of things.
> Addition of survey screens everywhere for everyone who would want this if
> we offered that to all comers might be distracting or confusing to our users.
How about just a button that they can click or skip, to take them to a survey
> If I'm getting your general thrust, you probably want to contact a Board
> member and not the developers/sysadmins.
This doesn't appear (to me) to be a political issue at all, but I certainly
have no objection to talking to anyone who wants to discuss this. Did you have
anyone in mind? Contact information?
> It doesn't seem so much a technical request as a request for use of
> Wikimedia Foundation resources.
I don't see it that way at all. All I am trying to do is to unhook the present
needless dependency on existing dumb browsers and allow client-end AI smarts
to get a LOT more from Wiki, and in the process greatly enhance the value of
Wiki and the Foundation.
> If I'm misunderstanding, please correct me.
Hopefully I have. Any more questions?
Hmm. Well, it's not up to me to decide things like this. I would suggest
contacting one or more board members:
Alternatively, you could also contact Brion Vibber, the CTO:
This really doesn't sound like a bug report or feature request.
This call obviously depends on what constitutes a "feature". This proposal
obviously provides the interfaces needed to make AI software work with Wiki,
and does ABSOLUTELY NOTHING for present Wiki users using present browsers, if
indeed that is a part of the present definition of a "feature". OK, then where
is my next stop? This is the ONLY apparent path from Wiki's present
functionality to the AI future seen by many, so this would seem to be a
question of when, and not if.
Alternatively, if I could get a commitment that Wiki ABSOLUTELY WILL NOT DO
THIS, then I might be able to go elsewhere and do it there, eventually leaving
Wiki to the dustbin of history.
Wiki's large presence coupled with an active disinterest in AI interfaces
could be enough to put web AI into a sort of indefinite cold storage, which
doesn't appear to be in anyone's interest.
Where in Wiki is my next stop?
You are of course welcome to use any Wikipedia content in your research, but the extremely vague,
unformed descriptions above neither invite nor require any kind of response from us at all.
I did intend this as an "invitation".
While my request is admittedly open-ended, I expected a debate over exactly
where the appropriate bounds should be for Wiki's interest and mission,
whereupon a precise definition, including data elements, screen formats, etc.,
would be forthcoming.
Note that there is much debate in AI circles as to what the "next big thing"
is. Some, like WWW inventor Tim Berners-Lee, think that it will be
the "semantic web", whereas my Dr. Eliza demonstrates an even greater
capability but with less effort to implement. I was attempting to avoid
appearing to "grab Wiki" for a particular implementation by leaving the bounds
open to everyone, but that very openness is what you appear to be objecting
to. It appears that openness draws objections of vagueness, and closedness
draws objections of attempting to grab Wiki for a particular implementation.
Obviously (to me), you/Wiki must somehow get past this "damned if you do and
damned if you don't" position for Wiki to ever become more than just an
information repository that is off-limits to AI.
I will gladly send/post some articles if there is a suitable place here, but
it isn't really practical to explain the inner workings of complex AI engines
here, in an attempt to precisely "prove" the need for particular information.
> You are of course welcome to use any Wikipedia content in your research...
What you seem to be saying here is that Wikipedia WILL NOT make the provisions
(e.g. adding a button) to gather additional optional information as a
transitional step to future AI front ends. This view, if permanent, would seem
to seal Wikipedia's fate.
Can't we establish some sort of forum to move Wikipedia, if not into the AI
future, at least into the AI present? As CTO, this would seem to be clearly
within your own area of responsibility.
Unfortunately, the rather negative reception received so far here does little
to draw others in to help develop the really robust proposal that Wiki would
need before a full implementation.
If you would open up an "official" Wiki forum, indicating Wiki's interest in
at least exploring this area and wringing this out, then I will make the
announcements on the various AI forums to draw people working in this area.
Then, we should be able to develop a much more defined proposal to support a
specific constellation of AI methodologies and/or eliminate/reduce the
trickle/torrent of future requests for AI support.
CTOs manage technical affairs. They do not necessarily direct the goals of
projects so much as their implementation. I would again advise you to speak to
a Board member. The Board determines the overall goals and direction of the
Wikimedia Foundation, which to my knowledge do not at present explicitly include
incorporation of machine-readable data into Wikipedia.
> CTOs manage technical affairs. They do not necessarily direct the goals of
> projects so much as their implementation.
While I might argue that they are also there (in most organizations) to keep
up on the leading edges of their technology and advise management, evidence
here has obviously been to the contrary, so I'll grant your point, at least in
regards to Wiki.
> I would again advise you to speak to a Board member.
OK. I chased the http://wikimediafoundation.org/wiki/Board link that you
provided and discovered pictures, interesting biographies, etc., but nothing
resembling contact information, other than Wiki's official address, which
probably isn't the address of any of the board members. Do I really need to
send snail mail to Wiki's front door to get their attention?
Thanks for your suggestions, and especially for having the patience to repeat
them until I *finally* got the idea.
The profiles on that page all link to user pages on various wikis, where you can
post on their talk pages or e-mail them. Even if not, it would not have been
very difficult to Google their contact info. But that's neither here nor there.
I suggest you take some time to write up a cogent and thought-out proposal that
prominently explains why it's good for the Foundation's goals and explicitly
spells out what you're asking for, because no offense, but you don't seem to
have gotten your point across very clearly until now or convinced anyone that
what you're suggesting is a good idea.
> I suggest you take some time to write up a cogent and thought-out proposal
> that prominently explains why it's good for the Foundation's goals
If you'll look at page 64 of the April 2007 issue of Fast Company magazine,
you'll see an article/interview with Jimmy Wales, where he says that he wants
to implement a Semantic Web like Wikipedia. Hence, it appears that this case
has already been made! Now, to figure out how to contact Jimmy...
> and explicitly spells out what *you're* (emphasis added for response) asking
> for
As explained in earlier postings, while this is absolutely NO PROBLEM for
*me*, I fear that it could be a point of turn-off for others who want to do
something quite different than I want to do, and the VERY LAST thing I ever
want to do is to turn off an opening area of AI development.
Indeed, as for what *I* am looking for, rather than (trying to) perfectly
describe a database format, I have an actual *running* (read that "debugged")
Access .MDB database that I'll gladly send to anyone who wants to browse
through it. EVERYTHING is in plain people-readable ASCII text with good
internal documentation (with long field names and row comments filled in),
where everything is quite obvious EXCEPT for how the complex AI algorithms
roll it all together to make a conversational AI system. This includes plenty
of working examples of "machine wisdom". Supporting this are some papers
presented at past AI conferences.
Isn't one debugged and fully internally documented Access database,
accompanied by peer-reviewed articles, worth more than any pile of documented
Thanks again for your continued patience here.