Last modified: 2014-09-24 01:17:58 UTC
I have an extension that adds additional metadata to pages. I've modified SpecialImport and SpecialExport to handle importing/exporting this additional metadata, but it would be nice if this could be done via hooks instead of modifying core code. It doesn't look like an easy problem to solve since the importer relies on callbacks but...
Created attachment 4219 [details]
Fix against 11.0 release

Adds a hook to Export.php right before it writes the closing </revision> to allow extensions to export additional metadata for a revision.

Changes the way importing is handled slightly. When an unknown tag is encountered in the <revision> block, the importer now stores the tag in an array instead of throwing an error. After the revision is saved in the WikiRevision class, a hook is fired with the array as a parameter, allowing extensions to import the custom tags. Hopefully nothing currently relies on the error-throwing behavior.

Usage:

$wgHooks['ExportPageRevision'][] = 'efExportMetadata';
// $out is taken by reference so the appended XML reaches the exporter
function efExportMetadata( $writer, $revid, &$out ) {
    // look up additional metadata for the revision using the $revid
    $metadata = getRevisionMetadata( $revid );
    // construct additional metadata chunks and write them to $out
    foreach ( $metadata as $field => $value ) {
        $out .= " " . wfElement( $field, null, strval( $value ) ) . "\n";
    }
    return true;
}

$wgHooks['ImportPageRevision'][] = 'efImportMetadata';
function efImportMetadata( $wikirev, $revid, $metadata ) {
    // Do something with the imported metadata
    $revision = Revision::newFromId( $revid );
    $custom_metadata_handler = new CustomMetadataHandler( $revision );
    foreach ( $metadata as $name => $value ) {
        $custom_metadata_handler->handle( $name, $value );
    }
    return true;
}
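For readers who want to see the import-side idea in isolation, here is a minimal sketch of collecting unknown <revision> child tags into an array and handing them to a callback. It is not the code from the attachment: it uses PHP's built-in XMLReader rather than the MediaWiki importer, and collectUnknownRevisionTags(), the $known list, the sample XML, and the closing callback are all hypothetical names chosen for illustration.

```php
<?php
// Sketch only: standalone PHP with the built-in XMLReader, not the patch itself.
// collectUnknownRevisionTags() and the tag names below are hypothetical.

function collectUnknownRevisionTags( string $xml, array $knownTags ): array {
    $reader = new XMLReader();
    $reader->XML( $xml );

    $unknown = [];
    $revisionDepth = null; // depth of the <revision> element while inside it

    while ( $reader->read() ) {
        if ( $reader->nodeType === XMLReader::ELEMENT ) {
            if ( $reader->name === 'revision' ) {
                $revisionDepth = $reader->depth;
            } elseif (
                $revisionDepth !== null
                && $reader->depth === $revisionDepth + 1
                && !in_array( $reader->name, $knownTags, true )
            ) {
                // Direct child of <revision> that the importer does not
                // recognise: store it instead of raising an error.
                $unknown[$reader->name] = $reader->readString();
            }
        } elseif (
            $reader->nodeType === XMLReader::END_ELEMENT
            && $reader->name === 'revision'
        ) {
            $revisionDepth = null;
        }
    }

    $reader->close();
    return $unknown;
}

$xml = '<page><title>Example</title><revision><id>7</id><text>Hello</text>'
    . '<rating>5</rating><reviewer>Alice</reviewer></revision></page>';
$known = [ 'id', 'timestamp', 'contributor', 'minor', 'comment', 'text' ];

$metadata = collectUnknownRevisionTags( $xml, $known );

// Stand-in for firing the proposed import hook once the revision is saved.
$onImport = function ( array $metadata ): bool {
    foreach ( $metadata as $name => $value ) {
        echo "$name = $value\n";
    }
    return true;
};
$onImport( $metadata );
```

Only direct children of <revision> are considered, so nested elements inside a known tag such as <contributor> are not mistaken for custom metadata.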
Please post patches as attachments so they can be managed without copy/paste problems.
I don't understand. Do you mean don't select the patch checkbox when you are submitting an attachment?
I mean *attach a file*, don't paste it into a comment.
Um, I did attach the patch. It's id 4219. It's the one and only attachment on this bug. What I typed into the first comment is not a patch; it's the recommended usage of the code that was added in the patch, something that an end user would put in an extension to use the new hooks.
Created attachment 5175 [details] Patch against 1.13r39163 The import hook in this patch fires right before the revision is saved and passes the revision object so it can be manipulated.
Mass component change: <some> -> Export/Import
Is this a dupe of bug 11537?
It seems to me that this is a superset of bug 11537 in that it proposes more changes to the code. The part corresponding to bug 11537 is the patch to importOldRevision(). This patch is different from that of bug 11537 and appears at a different place, but I think it can be used for similar purposes. (The order in which hooks would be fired is different between the two bugs, but this should be of little importance.) Personally, I'd be happy if either of the two were included in one of the next releases.
My DataTable extension (http://www.mediawiki.org/wiki/Extension:DataTable) now uses the hook NewRevisionFromEditComplete (http://www.mediawiki.org/wiki/Manual:Hooks/NewRevisionFromEditComplete), so personally I don't have any further need for the hook suggested here.
So I still need this, but I'm not going to continue making patches for it until there is some indication that it is going to get added to the code. If someone with permissions feels like committing this, let me know and I'll generate a new patch against trunk.
Basically fixed in 1.17 now that Import.php and Export.php have good hooks. See:
http://www.mediawiki.org/wiki/Manual:Hooks/ImportHandleRevisionXMLTag
http://www.mediawiki.org/wiki/Manual:Hooks/ModifyExportQuery
http://www.mediawiki.org/wiki/Manual:Hooks/XmlDumpWriterWriteRevision
Oh so close! Unless I'm missing something, ImportHandleRevisionXMLTag is worthless for processing the imported XML. It doesn't pass the name of the tag being processed or the contents of the tag. Also, the tag name/contents are stored in a private member/function of the importer so you can't get to them in a hook. Can we either, 1) modify the hook to pass the tag name and tag contents or 2) make the $reader member and nodeContents function in WikiImporter public?
Christian, I am very sorry for the wait. I've cc'd the people who currently work on these dumps so they can advise you. Thank you for your contribution and sorry again for the delay in response!
Comment on attachment 5175 [details] Patch against 1.13r39163 Marking as obsolete as it is superseded in core
(In reply to comment #13)
> Oh so close!
>
> Unless I'm missing something, ImportHandleRevisionXMLTag is worthless for
> processing the imported XML. It doesn't pass the name of the tag being
> processed or the contents of the tag. Also, the tag name/contents are stored
> in a private member/function of the importer so you can't get to them in a
> hook.
>
> Can we either, 1) modify the hook to pass the tag name and tag contents or 2)
> make the $reader member and nodeContents function in WikiImporter public?

Well, making the variables directly public is a no; adding a getFunction() is fair enough. Extra parameters passed to the hook should be fine.

What do the currently passed things give you?
Created attachment 9869 [details]
Make nodeContents public and add the $tag param to various hooks

(In reply to comment #16)
> What do the currently passed things give you?

http://www.mediawiki.org/wiki/Manual:Hooks/ImportHandleRevisionXMLTag
http://www.mediawiki.org/wiki/Manual:Hooks/AfterImportPage

* $importer: The WikiImporter object
* $pageInfo: An array of xml tag names => xml tag content for the <page> object
* $revisionInfo: An array of xml tag names => xml tag contents for the <revision> object

Theoretically in the ImportHandleRevisionXMLTag hook, you would process the XML input and add data to the $pageInfo or $revisionInfo array. Then later on in the AfterImportPage hook, you could process the data and save it to the database or whatever.

The problem is, the actual data being parsed out of the XML is stored in the $tag object in the importer and that isn't passed to the hook so you can't actually see what tag is being encountered. Adding the $tag param to the hook would fix that. To get the contents of the XML tag, you need to call $importer->nodeContents() which is currently a private function. Making that function public would totally solve that.
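The two-stage workflow described above can be sketched roughly as follows. This is a toy model in plain PHP, not MediaWiki code: passing the tag name and contents to the first stage is the change being requested in this bug rather than an existing hook signature, and onRevisionTag(), onAfterImportPage(), the 'rating' tag, and the $store array are all hypothetical.

```php
<?php
// Sketch only: plain PHP functions standing in for the two hook handlers.
// Receiving $tag and $contents in stage 1 is the requested change, not an
// existing signature; every name below is hypothetical.

// Stage 1: called once per tag inside <revision>; stash what we care about.
function onRevisionTag( string $tag, string $contents, array &$revisionInfo ): bool {
    if ( $tag === 'rating' ) {
        $revisionInfo['rating'] = (int)$contents;
    }
    return true;
}

// Stage 2: called after the page is imported; persist the stashed data.
function onAfterImportPage( string $title, array $revisionInfo, array &$store ): bool {
    if ( isset( $revisionInfo['rating'] ) ) {
        $store[$title] = $revisionInfo['rating'];
    }
    return true;
}

// Toy driver playing the importer's role for a single revision.
$tags = [ 'id' => '7', 'text' => 'Hello', 'rating' => '5' ];
$revisionInfo = [];
$store = []; // stands in for a database table

foreach ( $tags as $tag => $contents ) {
    onRevisionTag( $tag, $contents, $revisionInfo );
}
onAfterImportPage( 'Example', $revisionInfo, $store );

print_r( $store );
```

Without the tag name and contents reaching stage 1, the handler has nothing to put into $revisionInfo, which is the gap this comment points out.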
(In reply to comment #17)
> Created attachment 9869 [details]
> Make nodeContents public and add the $tag param to various hooks
>
> (In reply to comment #16)
> > What do the currently passed things give you?
>
> http://www.mediawiki.org/wiki/Manual:Hooks/ImportHandleRevisionXMLTag
> http://www.mediawiki.org/wiki/Manual:Hooks/AfterImportPage
>
> * $importer: The WikiImporter object
> * $pageInfo: An array of xml tag names => xml tag content for the <page> object
> * $revisionInfo: An array of xml tag names => xml tag contents for the
> <revision> object
>
> Theoretically in the ImportHandleRevisionXMLTag hook, you would process the XML
> input and add data to the $pageInfo or $revisionInfo array. Then later on in
> the AfterImportPage hook, you could process the data and save it to the
> database or whatever.
>
> The problem is, the actual data being parsed out of the XML is stored in the
> $tag object in the importer and that isn't passed to the hook so you can't
> actually see what tag is being encountered. Adding the $tag param to the hook
> would fix that. To get the contents of the XML tag, you need to call
> $importer->nodeContents() which is currently a private function. Making that
> function public would totally solve that.

I think your patch is the wrong way round, as it shows you're making it private.

Also, please use a unified diff against a file, see https://www.mediawiki.org/wiki/Subversion#Making_a_diff
adding "reviewed"
(In reply to comment #18)
>
> I think your patch is the wrong way round, as it shows you're making it private
>
> Also, please use a unified diff against a file, see
> https://www.mediawiki.org/wiki/Subversion#Making_a_diff

Bah! I was trying to make a diff without checking anything out. I'll try again later when I have some time to mess with it.
Christian, is this still something you have an interest in? We now use Git https://www.mediawiki.org/wiki/Git/Tutorial in case you'd like to address this hook addition again. (And if the answer is "no, I'm rather tired of this issue and wash my hands of it" I completely understand.) Best wishes.
I am still interested in the capability but realistically I'm not going to be submitting a patch for it again. Thanks for following up.