Last modified: 2014-11-04 22:53:25 UTC
Wikimedia's image syntax should have separate fields for the caption and for alternate text. Captions and alternate text have opposite purposes: captions only make sense if an image *is* visible, whereas alternate text is intended for display only if the image is *not* visible. So it doesn't make sense for them to use the same data. The current effect, in screenreaders and in text-only browsers, is two recitals of the same non-sequitur.
See http://meta.wikimedia.org/wiki/MediaWiki_1.3_comments_and_bug_reports#Separate_captions_and_alt_text for further comments, links to further comments, and related issues. (In particular, I think that alternate text should be stored with the image, on the image page. If that's the only place it goes then the current wiki syntax can be used to encode the caption. But this may not be the best solution.)
Unfortunately, that wouldn't work. Appropriate alternate text will almost always, and appropriate caption text will often, be different for the same image used in different articles.
There are actually three different text values, not just two: alt text (shown instead of the image by text browsers or audio browsers), title text (typically shown as a tool tip by graphical browsers), and caption text (shown below the image) should all be distinct. See [[en:Wikipedia_talk:Extended image syntax#Alt.2C_title.2C_and_caption_text_in_extended_markup]] for some discussion.

I suggest syntax like [[Image:Filename.ext|other|options|here|title=Title text|alt=Alt text|caption=Caption text]], with defaults like:

* if caption is unset, then no caption;
* if alt is unset, then copy the title or the caption, or fall back to the file name;
* if title is unset, then copy the alt text or the caption, or fall back to the file name.

For backward compatibility, allow the existing [[Image:Filename.ext|other|options|here|Text that gets used for all three purposes]].
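The defaults proposed in this comment can be sketched roughly as follows. This is only an illustration in Python, not MediaWiki code: the names `ImageText`, `first_set` and `resolve_image_text` are made up here, and the handling of the old unnamed text when it is mixed with named fields is an assumption, since the comment does not say what should happen in that case.

```python
from typing import NamedTuple, Optional


class ImageText(NamedTuple):
    alt: str
    title: str
    caption: Optional[str]  # None means "render no caption"


def first_set(*values: Optional[str]) -> Optional[str]:
    """Return the first value that is not None (an empty string counts as set)."""
    for value in values:
        if value is not None:
            return value
    return None


def resolve_image_text(filename, alt=None, title=None, caption=None,
                       legacy_text=None) -> ImageText:
    # Assumption: the old unnamed trailing text fills in any of the three
    # fields that was not named explicitly (backward compatibility).
    alt = first_set(alt, legacy_text)
    title = first_set(title, legacy_text)
    caption = first_set(caption, legacy_text)

    # alt unset: copy the title or the caption, else fall back to the file name.
    resolved_alt = first_set(alt, title, caption, filename)
    # title unset: copy the alt text or the caption, else the file name.
    resolved_title = first_set(title, alt, caption, filename)
    # caption unset: no caption at all.
    return ImageText(resolved_alt, resolved_title, caption)
```

Note that an explicitly empty alt="" is treated as "set" here, so it is not overwritten by the title or caption; whether that is desirable is exactly what the rest of this discussion is about.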
ALT is always required for an image, but caption and title should only be present if explicitly set and not generated by default. Title is least important. A better order for arguments would be: [[Image:Filename.ext|other|options|here|alt=Alt text|caption=Caption text|title=Title text]]
However, the most appropriate alt text for an image is usually nothing at all -- even on a Wiki, despite Wikis' avoidance of purely decorative images. Usually a Wiki image is either a diagram of a point that has already been put, as well as it can be in text format, in the article text already (in which case repeating it as more text is pointless), or it's an illustration of something that's marginally interesting to those viewing images (e.g. a person's appearance) but not interesting enough to feature in a text-only rendition of an article (e.g. you wouldn't bother mentioning the image if reading the article to someone over the phone). I am not saying alt should always be nothing at all, but it is more likely that the most appropriate alternate text will be nothing than that the most appropriate caption will be nothing. Therefore, I think alternate text should come after the caption, to reduce the compulsion for people to create redundant alt text (which they frequently do). I don't see the point of having a customizable title= at all, but I suppose that belongs to another bug report.
In the 1.4pre CVS branch, there are problems with image captions that contain links; the image alt text and "title" attribute of the link to the image page are set incorrectly. I think fixing this bug would be a way of fixing that problem as well.
(In reply to comment #5)
> However, the most appropriate alt text for an image is
> usually nothing at all

This is simply incorrect. This is saying that users of alternative browsers or on slow connections should be denied access to images. It's also treating some handicapped users as second-class citizens. If there's a point in putting an image into an article, then everyone who reads the article should be able to see that it's there, get an idea of what it is, and have the option of loading or downloading it.

All images on WP must have alt attributes. Many will have captions, but they aren't strictly necessary, as many images can speak for themselves (unless, of course, they have empty alt text). E.g.: a prominent portrait on a bio page, a chart with a graphical title/caption, a country's location map.

And remember that every image on WP also serves as a link to the Image: page, which may contain more information that's not in the article, important copyright info, or a more detailed text description of the image.
Firstly, note that image parameters are not order-specific, so all discussion of "which should come first" is irrelevant. The only parameter that is currently order-specific is the one currently used for alt, title and caption; it is order-specific in the sense that it is not treated as a parameter at all. (Roughly, the syntax is [[<image-name>|<zero-or-more-params>|<caption-and-alt-text>]].) If we wanted, we could define parameters of the form alt=, caption= and title= and use the last not-really-a-parameter only as a fall-back.

This leaves us with two questions: precedence, and defaults. [I'll refer to the text currently used for all three as <alltext>]

* <caption> text is only used on images using the "frame" or "thumb" parameters; it needs to default to <alltext> to avoid breaking current usage.
* <alt> text should be present on every image, but ideally be different from a visible caption; however, non-captioned images (those with neither "thumb" nor "frame") may have <alltext> deliberately chosen as a good <alt>; we could therefore fall back to <alltext> for non-captioned images, but generate something from the filename for captioned images.
* title text is the other position of <alltext> in non-captioned images, and so like <alt> should probably fall back to that; it's less clear, however, what fallback title a captioned image should have.

I propose the following: [[Image:<filename>|<options>|alt=<alt>|title=<title>|<caption>]]

* <caption> is compulsory, as now; the others are optional

For a non-captioned image:
* the alt attribute contains <alt> if set, <caption> otherwise
* the title attribute contains <title> if set, <caption> otherwise

For a captioned image:
* the alt attribute contains <alt> if set, <filename> otherwise
* the title attribute contains <title> if set; if none set, I'm not sure: <alt>? <filename>? <caption>?; note that if <alt>, we need a third fallback for when that isn't set either.
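To make the proposal above concrete, here is a rough Python sketch of the parsing and fallback rules. It is an illustration only, not MediaWiki's parser: `ImageLink`, `parse_image_link` and `resolve_attributes` are hypothetical names, the caption is assumed to be the last pipe-separated field and to contain no nested links or pipes, and the undecided title fallback for captioned images is left as None rather than guessed.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ImageLink:
    filename: str
    caption: str
    alt: Optional[str] = None
    title: Optional[str] = None
    captioned: bool = False  # True when "thumb" or "frame" is present


def parse_image_link(wikitext: str) -> ImageLink:
    text = wikitext.strip()
    if not (text.startswith("[[") and text.endswith("]]")):
        raise ValueError("not an image link")
    # Naive split: assumes the caption has no nested [[links]] or pipes.
    parts = text[2:-2].split("|")
    if len(parts) < 2:
        raise ValueError("<caption> is compulsory in this proposal")
    # Drop the namespace prefix ("Image:") from the first field.
    link = ImageLink(filename=parts[0].split(":", 1)[-1].strip(),
                     caption=parts[-1])
    for param in parts[1:-1]:  # order of parameters does not matter
        key, sep, value = param.partition("=")
        if sep and key == "alt":
            link.alt = value
        elif sep and key == "title":
            link.title = value
        elif param in ("thumb", "frame"):
            link.captioned = True
        # any other option (size, alignment, ...) is ignored in this sketch
    return link


def resolve_attributes(link: ImageLink) -> Tuple[str, Optional[str]]:
    """Return (alt, title); title is None where the proposal is undecided."""
    if link.captioned:
        alt = link.alt if link.alt is not None else link.filename
        title = link.title  # fallback left open: <alt>? <filename>? <caption>?
    else:
        alt = link.alt if link.alt is not None else link.caption
        title = link.title if link.title is not None else link.caption
    return alt, title
```

For example, `resolve_attributes(parse_image_link("[[Image:Map.png|A map]]"))` gives the caption text for both attributes, while adding `thumb` switches the alt fallback to the file name.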
(In reply to comment #6)
> In the 1.4pre CVS branch, there are problems with image captions that
> contain links; the image alt text and "title" attribute of the link to
> the image page are set incorrectly. I think fixing this bug would be a
> way of fixing that problem as well.

If we still use the current <alltext> as a fall-back for anything, we need a way of fixing that problem anyway, otherwise existing instances will remain broken. However, no non-captioned image should have links in its <alltext> anyway, so if we generate <title> and <alt> from <filename> for these cases, we can indeed avoid fixing the problem.
(In reply to comment #7)
> (In reply to comment #5)
> > However, the most appropriate alt text for an image is
> > usually nothing at all
>
> This is simply incorrect.

No, it isn't. It's a very valid viewpoint (and I don't believe alt text is an exact science, there are many different points of view), and one that I share.

> This is saying that users of alternative browsers or on slow connections
> should be denied access to images. It's also treating some
> handicapped users as second-class citizens. If there's a point in
> putting an image into an article, then everyone who reads the article
> should be able to see that it's there, get an idea of what it is, and have
> the option of loading or downloading it.

Blank alt text does not deny the image to anyone. It simply prevents the image from sometimes becoming a great annoyance. As Matthew pointed out, alt text is currently usually duplicated caption text - what's the point in that? The caption text is already in the document. Having it displayed or read out twice is pointless.

> All images on WP must have alt attributes.

Yes, but the attribute may be empty.

> Many will have captions, but they aren't strictly necessary, as many images
> can speak for themselves (unless, of course, they have empty alt text).
> E.g.: a prominent portrait on a bio page, a chart with a graphical
> title/caption, a country's location map.

For a portrait, good alt text would either be nothing or possibly a brief description of the person's appearance (especially if their appearance is specifically relevant to the article). Bad alt text is "George W. Bush", it adds nothing - but it's what a lot of portraits currently have.

For a chart, good alt text would be details about the data it shows - possibly a brief summing up of the results with a link to full information. Bad alt text is "a chart showing the growth of Wikipedia".

For a map, good alt text could be a brief description on where the country is, presuming it isn't already in the text. Bad alt text is "a map of <wherever>". Especially bad alt text is "Image:LocationUSA.png", as you currently get.

Most of the bad alt text would make an appropriate title, however.

> And remember that every image on WP also serves as a link to the
> Image: page, which may contain more information that's not in the
> article, important copyright info, or a more detailed text description of
> the image.

I'd suggest that's what the longdesc attribute is for. But caption text can also be linked to the image page.
(In reply to comment #10)
> As Matthew pointed out, alt text is currently usually duplicated caption text - what's the point in that? The
> caption text is already in the document. Having it displayed or read out twice is pointless.

Hence my suggestion that it not default to being the same as <caption>, ever. But images that don't use |frame| or |thumb| don't have automatic captions anyway, so it's a case of deciding whether the text that has been entered should be used for alt, title, or both.

> For a chart, good alt text would be details about the data it shows - possibly a brief summing up of the results
> with a link to full information. Bad alt text is "a chart showing the growth of Wikipedia".

Just a small point: as far as I know, alt attributes can't contain links; they're just a short string.

> For a map, good alt text could be a brief description on where the country is, presuming it isn't already in the
> text. Bad alt text is "a map of <wherever>". Especially bad alt text is "Image:LocationUSA.png", as you currently
> get.
>
> Most of the bad alt text would make an appropriate title, however.

The problem is how to tell which any given image's label is: a decent alt text, or a decent title text. If we default to alt="" title="<alltext>" we break any images where people *did* consider the alt text. For captioned images, though, you might be right: just fall back on "".

> > And remember that every image on WP also serves as a link to the
> > Image: page, which may contain more information that's not in the
> > article, important copyright info, or a more detailed text description of
> > the image.
>
> I'd suggest that's what the longdesc attribute is for. But caption text can also be linked to the image page.

Auto-linking caption text to the image page runs contrary to the current ability to put formatting - including links - in the caption.

I'm afraid I don't know anything about the longdesc attribute: what is its official role, and is it widely supported by UAs?

I guess it all boils down to whether we think that people who can't see images would want to know they were there anyway. The advantage of having a non-blank alt text is that it shows the user that there is something there, and allows them to easily access a description page which may have more details about it (which could include the description of results for a graph...). The disadvantage is that, since they can't see it, they may simply be frustrated at an alt text that tells them little more than that there is an image there.
(In reply to comment #10)
> (In reply to comment #7)
> > (In reply to comment #5)
> > > However, the most appropriate alt text for an image is
> > > usually nothing at all
> >
> > This is simply incorrect.
>
> No, it isn't. It's a very valid viewpoint (and I don't believe alt text is an exact science, there are many
> different points of view), and one that I share.
>
> > This is saying that users of alternative browsers or on slow connections
> > should be denied access to images. It's also treating some
> > handicapped users as second-class citizens. If there's a point in
> > putting an image into an article, then everyone who reads the article
> > should be able to see that it's there, get an idea of what it is, and have
> > the option of loading or downloading it.
>
> Blank alt text does not deny the image to anyone. It simply prevents the image from sometimes becoming a great
> annoyance.

Images with blank alt attributes, *and* the links surrounding them, are hidden and inaccessible in Lynx. I presume that audible and braille page readers would similarly omit an empty string, and possibly the existence of the link, too. There are probably other cases we haven't thought of.

> As Matthew pointed out, alt text is currently usually duplicated caption text - what's the point in that? The
> caption text is already in the document. Having it displayed or read out twice is pointless.

I agree; this is the point of this bug: a mechanism that lets us specify useful alt text, and optionally captions.

A caption is *not* a substitute for alt text. Captions are displayed next to an image, *adding to or clarifying* information that we can see by viewing the image. Alt attributes (optionally supplemented by title attributes and longdesc destinations) attempt to be a *substitute for the image*. The W3C's Web Content Accessibility Guidelines say that these should constitute "equivalent information to the visual or auditory content."

> > All images on WP must have alt attributes.
>
> Yes, but the attribute may be empty.

Yes, this is a way to *hide* a purely decorative image. The HTML 4 recommendation section 13.8 says:

<blockquote>
Do not specify irrelevant alternate text when including images intended to _format_ a page, for instance, alt="red ball" would be inappropriate for an image that adds a red ball for decorating a heading or paragraph. In such cases, the alternate text should be the empty string ("").
</blockquote>

I can't think of any instances on Wikipedia where an uploaded image is purely decorative. If the image is content, then alt text must also be content. Alt attributes with zero content are for images of zero content.

> > Many will have captions, but they aren't strictly necessary, as many images
> > can speak for themselves (unless, of course, they have empty alt text).
> > E.g.: a prominent portrait on a bio page, a chart with a graphical
> > title/caption, a country's location map.
>
> For a portrait, good alt text would either be nothing or possibly a brief description of the person's appearance
> (especially if their appearance is specifically relevant to the article). Bad alt text is "George W. Bush", it adds
> nothing - but it's what a lot of portraits currently have.

A portrait is not decorative -- it conveys information about a person's appearance, dress, and age. It can provide information about context and period (e.g. sepia-toned photo in a top-hat, or fresco with crown and sceptre). A brief description would be good, but even "portrait photo of George W. Bush" tells you what's there, and helps you decide whether to look at it.

If the alt text is empty, how does a user browsing without images know that there is a portrait available for viewing or downloading? At best, some browser *might* show that some image exists or display its filename, which is practically never an acceptable text equivalent.

> For a chart, good alt text would be details about the data it shows - possibly a brief summing up of the results
> with a link to full information. Bad alt text is "a chart showing the growth of Wikipedia".
>
> For a map, good alt text could be a brief description on where the country is, presuming it isn't already in the
> text. Bad alt text is "a map of <wherever>". Especially bad alt text is "Image:LocationUSA.png", as you currently
> get.

Agreed, but in all these cases bad alt text is better than no alt text.

> Most of the bad alt text would make an appropriate title, however.
>
> > And remember that every image on WP also serves as a link to the
> > Image: page, which may contain more information that's not in the
> > article, important copyright info, or a more detailed text description of
> > the image.
>
> I'd suggest that's what the longdesc attribute is for. But caption text can also be linked to the image page.

longdesc is one way of indicating more info about an image. Wikipedia makes it more accessible (in most current browsers) by linking the image to the full image page, which may carry additional text.

The W3C HTML 4 recommendation section 13.2 says about alt and longdesc:

<blockquote>
The alt attribute provides a short description of the image. This should be sufficient to allow users to decide whether they want to follow the link given by the longdesc attribute to the longer description...
</blockquote>

Ideally, the Image: page would include a fuller "text equivalent" of the image. E.g. a description of George Bush's appearance and circumstances of the photo.

As Wikipedia style guides are developed, we should incorporate more accessibility guidelines into them. The Wikimedia interface must also be developed with the tools to support them.
I think everyone can agree that the alt text for an image is logically separate from the caption, as Matthew and others noted. As a practical matter, a caption may contain links and be as long as reasonably necessary, while alt text should be text-only and is generally shorter, as Rowan said. So we should be able to do: [[Image:foo.jpg|alt=<alt>|<caption>]]. I agree with that part of Rowan's proposal, and hope most others do too.

The obvious question is what to do if no alt text is specified:

1) Use empty alt text
2) Use the caption

Option 2 would be most consistent with current behavior, and is what Rowan proposed. However, I would favor option 1, for the reasons Matthew and Tom gave. A caption is a long description and could contain links, and in general doesn't make good alt text, IMHO. If an image needs alt text, and it isn't specified, I think that's a deficiency in the article, and not something for which we can find a technical solution.

As Rowan noted, there is also the issue of the title attribute of the link. Unlike the alt text, I don't see any good reason to allow editors to specify the title explicitly; I think we should keep the image syntax as simple as reasonably possible. Practically speaking, the main use of the title attribute is in a "tooltip" in modern browsers. Possibilities for title include:

1) Blank
2) The caption
3) The name of the link target (always the image page, for now)
4) Whatever the alt text is

I would argue against option 2 for the same reason as above. I like option 3, which would give users with modern browsers a hint about what happens when they click on the image, and could complement a fix to bug 539. I don't feel strongly about it, though.

(In reply to comment #8)
> * title text is the other position of <alltext> in non-captioned images, and so
> like <alt> should probably fall back to that; it's less clear, however, what
> fallback title a captioned image should have

When neither "thumb" nor "frame" is specified, the "caption" text isn't used as a caption at all, so it makes sense to use it for the title. I don't think it's necessary to use the caption for the title with "thumb" or "frame", though, because then the caption is always displayed. I don't think it makes sense to repeat the caption, which the reader can presumably already see, in the tooltip.
The HTML spec's guidelines are not exhaustive. They say alt="" is appropriate for "purely decorative" images, but that does not mean alt="" is inappropriate in less "pure" cases. Even very smart people often misunderstand this. Maybe this will help:

Imagine that MediaWiki didn't allow graphics at all, just text. Would it be acceptable to litter articles with non-sequiturs like "foopy.jpg" or "Eiffel Tower viewed from the east"? Of course not. That wouldn't make any sense when listening to the article. Contributors would instead work towards the best possible text for the article.

Now imagine graphics support has been added to MediaWiki. For those using text-only UAs, has the best possible text for each article suddenly changed? No, the ideal text is still the ideal text. So there still isn't any reason to add gibberish like "foopy.jpg" or "Eiffel Tower viewed from the east" to the text. That still wouldn't make any sense when listening to the article. Anyone who listens to it can tell that any alt= text like that is wrong.

So when to use alt=? Well, occasionally a graphic makes some of that ideal text redundant. For example, a map may replace a sentence or two telling where a place is. A graph may replace a text summary of some data. A diagram may replace a list of instructions, a description of a chess move, etc. In these cases, the text that the image replaces should be moved into the alt= attribute for the image. That is the only reason for alt= to be anything other than "". It's the only reason. alt= is the *text equivalent*: the text that you honestly would have included if the image was never there, but that is redundant when the image is visible.

alt= is hard to understand because often we have to work backwards. The image is introduced before the text-only version is finished (especially on a Wiki, where the text is never finished), so we have to ask: "If we had the ideal text, would this image replace any of it?"

Portraits are an interesting example. Graphic portraits feature regularly in biographical articles, but textual portraits hardly ever do. So even if you *could* describe someone's appearance textually, the most appropriate alt= for a portrait image is still "" -- unless you can argue with a straight face that their appearance would still have been important enough to describe in the text if MediaWiki didn't allow images (which probably isn't true), *and* that the most appropriate text description is neatly made redundant by the image (which probably isn't true either).

... Apologies for turning Bugzilla into a podium.

For those concerned about breaking existing syntax for alternate text: I just visited [[Special:Randompage]] until I had encountered 50 images with custom alternate text specified. The alternate text was appropriate in ... zero of them. (That's even fewer than I was expecting.) So if the syntax is changed now, it will break very little that's not already broken, and certainly much less than it fixes.

> Images with blank alt attributes, *and* the links
> surrounding them, are hidden and inaccessible in Lynx.

Hidden? Of course, that's the whole point. Inaccessible? Not true -- see bug 371 comment 4.
(In reply to comment #14)
> so we have to ask: "If we had the ideal text, would this image
> replace any of it?"
[...]
> So even if you *could* describe someone's appearance textually, the
> most appropriate alt= for a portrait image is still ""

You're using a backwards argument to say that images have no inherent value. An image that shows *what someone looks like* isn't replacing some text in an article, *it is part of the article*. It could still have value to someone who is browsing without images turned on.

A "text equivalent" (which can consist of one or more of alt, title, and longdesc) attempts to:

1. Indicate the presence and nature of the image.
2. Convey some information which is in that image.

In an ideal world, the image's longdesc attribute links to a page that actually describes the image in detail, so an unsighted user could get an idea of what George Bush looks like. Wikipedia's not there yet, so we should be trying to improve accessibility, not work around it!

The W3C's Web Content Accessibility Guideline number 1 is "Provide equivalent alternatives to auditory and visual content". Alt="Portrait of George W. Bush" is infinitely closer to this than alt="".

> For those concerned about breaking existing syntax for alternate
> text: I just visited [[Special:Randompage]] until I had encountered
> 50 images with custom alternate text specified. The alternate text
> was appropriate in ... zero of them. (That's even fewer than I was
> expecting.) So if the syntax is changed now, it will break very
> little that's not already broken, and certainly much less than it
> fixes.

Okay. Let's build the tools that let us do it right, then show people how to use them well.

>> Images with blank alt attributes, *and* the links surrounding them,
>> are hidden and inaccessible in Lynx.
> Hidden? Of course, that's the whole point. Inaccessible? Not true --
> see bug 371 comment 4.

Okay, so if a Lynx user suspects that maybe you've hidden a link and a portrait on a page, he can type "*", guess that "bush_por.jpg" might be useful, download the file, and fire up the Gimp to view the image. Oops, turns out it's a Portuguese flowering shrub.

This doesn't prove that all non-visual browser users have access to an image with alt="". This doesn't fulfil WCAG Guideline 1. This scheme substitutes dumb luck for established accessibility techniques. It's a somewhat poorer user experience than FTP.
(In reply to comment #15)
> W3C's Web Content Accessibility Guideline number 1 is "Provide
> equivalent alternatives to auditory and visual content". Alt="Portrait
> of George W. Bush" is infinitely closer to this than alt="".

Even if there is already a caption with exactly that text?

I don't think there's any dispute that editors should be able to specify alt text separately from the caption. To me, the only question is what default to use when no alt text is given, and that is either blank, or the caption text. Beyond that, it's not a MediaWiki issue, IMHO, and you can take the discussion to [[Wikipedia talk:Alternative text for images]], or wherever, and argue for whatever policy you want.
*** Bug 8186 has been marked as a duplicate of this bug. ***
Fixed in r41364:

* [[Image:Foo.png|alt=xyz]] now works as desired, setting the alt text.
* If no alt text is specified, the alt text is the empty string, since it's useless to repeat the caption.

If the second point causes problems for users of deficient user agents that are incapable of allowing their users to interact with images without alt text, please open another bug (and link to it from this one).

longdesc support would be nice, too; it would probably be done by linking to a specially-named page like [[Image:Imagename/longdesc]] if it exists and doing nothing otherwise (but that scheme would have to handle non-local images). Again, that's another bug.

It's kind of sad that this was like a ten-line fix that anyone could have done at any point in the last four years, which could significantly increase Wikipedia's accessibility. Oh well, it's done now.
This caused parser test regressions; reverted in r41407: Revert r41364 -- broke 22 parser test cases with change of alt behavior. The caption was originally defined *as* the alt text (defaulting to the image file name if there is no alt text). Note that a separate caption text is only displayed in some display modes ('frame' and 'thumb', iirc), and not by default. Please run the parser tests and check the effect you have on them. If it's really an appropriate change, then update the test cases. If you're not sure, consider backing out pending further discussion. :) It might be appropriate to not set the 'alt' attribute for frame/thumb cases, but definitely not for inline images where we already have a way of setting the alt text which you're removing!
I tried running the parser tests, but they weren't working for me, per the e-mail I sent to Wikitech-l: they crashed, r40209 broke them. So I couldn't make sure they passed. And they're still broken on current trunk, with the same error message. So I pointed out the problem and committed without the parser tests. The behavior of using the same syntax to mean captions for thumbs/frames but alt text for inline images seems confusing and kind of broken, not at all what would be expected. It seems like it would make more sense to consistently require alt= for alt text, for all images. I'm guessing it wouldn't cause much incompatibility, because I don't think I've *ever* seen the extra parameter used for alt text on inline images. Probably in most of the cases where it's used, people were trying to make a caption anyway, so the text would be just as inappropriate as in all the other cases.
(In reply to comment #20)
> I don't think I've *ever* seen the extra parameter
> used for alt text on inline images. Probably in most of the cases where it's
> used, people were trying to make a caption anyway, so the text would be just as
> inappropriate as in all the other cases.

I have seen images with the extra parameter with text. Perhaps not written as alt text, but appropriate to be used as such. So it may be good to use the alt parameter for 'normal' images, but it should fall back to the text.
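The fallback suggested in this comment, combined with the earlier points about frame/thumb images and the file-name default, might look roughly like the Python sketch below. This is an illustration of the rule as discussed in this thread, not a description of what any particular revision actually does; `default_alt` and its parameters are hypothetical names.

```python
from typing import Optional


def default_alt(filename: str, text: Optional[str] = None,
                explicit_alt: Optional[str] = None,
                captioned: bool = False) -> str:
    """Pick the alt attribute for an image link.

    text         -- the unnamed trailing text (the caption for thumb/frame)
    explicit_alt -- the value of an alt= parameter, if one was given
    captioned    -- True for 'thumb' or 'frame' images
    """
    if explicit_alt is not None:
        return explicit_alt  # alt= always wins, even when it is empty
    if captioned:
        return ""            # the visible caption should not be repeated as alt
    if text:
        return text          # inline image: fall back to the old text
    return filename          # no text at all: fall back to the file name
```

With this rule, existing inline images that relied on the unnamed text keep their alt text, while thumbnails stop reading the caption out twice.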
Re-committed with the requested modifications in r41837.