Last modified: 2013-06-18 16:50:28 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T27105, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 25105 - Add all public lists to Gmane: provide .mbox lists archives
Status: RESOLVED FIXED
Product: Wikimedia
Classification: Unclassified
Component: Mailing lists
Version: unspecified
Hardware: All All
Importance: Low enhancement
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Reported: 2010-09-08 20:42 UTC by Nemo
Modified: 2013-06-18 16:50 UTC (History)
CC List: 7 users


Attachments
Excerpts from the Gmane subscription results replies (16.39 KB, text/plain)
2011-12-05 16:31 UTC, Nemo

Description Nemo 2010-09-08 20:42:51 UTC
A number of our public mailing lists are not on Gmane ([[m:Mailing_lists/Overview/Gmane]]). Gmane is very useful for searching archives, visualizing threads, etc.; nobody among our list admins should have anything against it if posting from Gmane itself is not allowed and email addresses are encrypted. If they have problems (very unlikely) they can request removal of the list from Gmane.
I would need the ok of someone from the WMF (Cary?) or a sysadmin so that the Gmane owner can add them (I've asked him and he's ok with it).

To import archives we need the .mbox archive, which is located at lists.wikimedia.org/mailman/private/<listname>.mbox/<listname>.mbox and is usually non-private (try foundation-l – warning, 128 MB or so), but is private for all those lists.
If you don't want to make them non-private, please put them in some temporary place where the Gmane owner can download them and mail me the list of URLs (I'll make the requests with the web form).
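
Editorial sketch: the URL pattern above can be expanded for a set of list names to produce the list of URLs mentioned in the description. This is only an illustration; the function names are hypothetical, and the https scheme and default host are assumptions taken from the URLs quoted elsewhere in this report.

```python
def mbox_url(listname, host="lists.wikimedia.org"):
    """Build the full-archive mbox URL for one list.

    Follows the pattern given in the description:
    <host>/mailman/private/<listname>.mbox/<listname>.mbox
    The https scheme is an assumption.
    """
    return f"https://{host}/mailman/private/{listname}.mbox/{listname}.mbox"


def mbox_urls(listnames):
    """Return the mbox URLs for several lists, in the order given."""
    return [mbox_url(name) for name in listnames]


if __name__ == "__main__":
    for url in mbox_urls(["foundation-l", "wikitech-l"]):
        print(url)
```

Whether a given URL can be fetched without logging in depends on the list's archive settings, which the sketch does not check.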
Comment 1 Tim Starling 2011-01-04 07:53:31 UTC
Mailman automatically makes files in mbox format available. They are split by month and compressed, e.g.

http://lists.wikimedia.org/pipermail/mediawiki-l/2010-December.txt.gz

Can't you just use those?
Comment 2 Nemo 2011-01-04 08:04:41 UTC
The Gmane owner wants the complete mbox archives.
Merging the gzipped texts would be a huge waste of time and I'm not sure the results would be good, because those files are often weird.
Comment 3 Mark A. Hershberger 2011-07-30 05:09:07 UTC
Merging the gzipped texts is something easily scripted up and it could
be done without corrupting the final mbox.  If you want this done,
I suggest you do that instead of trying to get something that hasn't
been provided for 6 months.

Closing as FIXED.
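
Editorial sketch: the merge suggested in comment 3 could look roughly like the following, assuming the monthly files have already been downloaded and are named like the example in comment 1 (YYYY-Month.txt.gz). The function names are hypothetical. It only decompresses and concatenates the files in chronological order; it does not try to repair any oddities inside the monthly files, which is the concern raised in comment 2.

```python
import gzip
import re
from pathlib import Path

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]
NAME_RE = re.compile(r"^(\d{4})-([A-Za-z]+)\.txt\.gz$")


def sort_key(path):
    """Return (year, month index) parsed from a name like 2010-December.txt.gz."""
    match = NAME_RE.match(Path(path).name)
    if not match or match.group(2) not in MONTHS:
        raise ValueError(f"unexpected archive name: {path}")
    return int(match.group(1)), MONTHS.index(match.group(2))


def merge_archives(paths, out_path):
    """Decompress the monthly files and concatenate them, oldest first.

    Returns the input paths in the order they were written.
    """
    ordered = sorted(paths, key=sort_key)
    with open(out_path, "wb") as out:
        for path in ordered:
            with gzip.open(path, "rb") as src:
                data = src.read()
            out.write(data)
            # Make sure the next file starts on a fresh line.
            if data and not data.endswith(b"\n"):
                out.write(b"\n")
    return ordered
```

A call such as merge_archives(Path("mediawiki-l").glob("*.txt.gz"), "mediawiki-l.mbox") would cover one list; the directory name here is only an example.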
Comment 4 MZMcBride 2011-07-30 05:21:27 UTC
(In reply to comment #3)
> Merging the gzipped texts is something easily scripted up and it could
> be done without corrupting the final mbox.  If you want this done,
> I suggest you do that instead of trying to get something that hasn't
> been provided for 6 months.
> 
> Closing as FIXED.

Re-opening.

This doesn't seem to be fixed at all. No idea where that resolution came from.

As far as I'm aware, Wikimedia lists are internally unsearchable (the subject of another bug). That is, resources such as Gmane are essential to make the mailing lists useful to people who are unable to search their own personal collections of the list. If resolving this bug would make lists searchable (both going backward and forward), it would be worth the minimal sysadmin effort (symlinking these to a public directory for a few minutes). Gmane has functionality that's critically important, so accommodating them seems reasonable. (And it seems rather unreasonable to expect anyone to download and re-upload a lot of individual files, esp. when there's a complete record available.)

This needs further discussion and consideration.
Comment 5 Mark A. Hershberger 2011-07-30 05:37:31 UTC
http://rt.wikimedia.org/Ticket/Display.html?id=1241
Comment 6 Mark A. Hershberger 2011-08-02 16:39:05 UTC
Since the mbox files are available to subscribers to the lists, Daniel Zahn asks:

> So you want the mbox files just one single time to import them as
> described on  http://gmane.org/import.php but not on a regular basis?
>
> Since subscribing the lists to Gmane means having a user/login for a
> mailing list user, couldn't you (or Lars, the Gmane guy) just use that one to
> login at  the authentication screen in order to get the mbox file ?

I've asked him to follow up here, but it seems like this would be the thing to do.
Comment 7 Daniel Zahn 2011-08-03 11:18:24 UTC
i seem to always get the "Private Archives Authentication" screen when trying to directly access mbox files with those URLs, for example:

https://lists.wikimedia.org/mailman/private/wikitech-l.mbox/wikitech-l.mbox
https://lists.wikimedia.org/mailman/private/foundation-l.mbox/foundation-l.mbox

That means,  _also_ on those that are "private = 0" in our config and/or you have listed as being on Gmane already.

After logging in with my user/pass for that specific mailing list i do get the mbox files.

Did i get you right though, that you say you can (or could) access _some_ mbox files without any login?
Or do you say you are always logging in with some user/pass but you still just get mbox files for _some_ of them?

As to adding lists to Gmane, they are just subscribed via (http://gmane.org/subscribe.php) right?
I agree the "encryption/obfuscating mail addresses" option and the "Unidirectional (no posting allowed through Gmane)" options should definitely be switched on.

So you want the mbox files just one single time to import them as described on http://gmane.org/import.php but not on a regular basis?

Since subscribing the lists to Gmane means having a user/login for a mailing list user, couldn't you (or Lars, the Gmane guy) just use that one to login at the authentication screen in order to get the mbox file ?
Comment 8 Nemo 2011-08-17 09:10:40 UTC
(In reply to comment #7)
> That means,  _also_ on those that are "private = 0" in our config and/or you
> have listed as being on Gmane already.
> 
> After logging in with my user/pass for that specific mailing list i do get the
> mbox files.

It was a year ago but if I remember correctly I wasn't able to download the mbox of some lists I was subscribed to.

> Did i get you right though, that you say you can (or could) access _some_ mbox
> files without any login?

Yes, I'm sure that I just downloaded the foundation-l one with wget without any authentication (I also wondered what the /private meant, then) and that some months later I noticed I couldn't any longer.

> Or do you say you are always logging in with some user/pass but you still just
> get mbox files for _some_ of them?
> 
> As to adding lists to Gmane, they are just subscribed via
> (http://gmane.org/subscribe.php) right?

Yes. I can do the paperwork.

> I agree the "encryption/obfuscating mail addresses" option and the
> "Unidirectional (no posting allowed through Gmane)" options should definitely
> be switched on.

Several of our lists (e.g. wikitech-l) actually allow subscribers to post via Gmane, but it's so complex and buggy that it's not worth the effort anyway.

> 
> So you want the mbox files just one single time to import them as described on
> http://gmane.org/import.php but not on a regular basis?
> 
> Since subscribing the lists to Gmane means having a user/login for a mailing
> list user, couldn't you (or Lars, the Gmane guy) just use that one to login at
> the authentication screen in order to get the mbox file ?

Yes, a single time would be enough, but he wants them to be on some public webserver he can just wget; moreover, if they're not in the standard (auto-updated) directory, good coordination would be needed so that he can import the mbox and then start receiving the new messages without losing anything in between.
Subscribing to a few dozen mailing lists and then logging in via wget (if possible) to get the mbox seems way more complex than needed and, as said above, doesn't seem to work reliably anyway; if those mboxes can't be made permanently public as they've been (at least some of them) for years, perhaps the better thing to do would be to make them public temporarily when he's ready for the import...
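
Editorial sketch: one way to check that nothing was lost between the import and the start of live delivery is to compare Message-ID headers between the imported mbox and a later snapshot of the same archive. This check is not something proposed in the thread; it is an illustration only, using Python's standard mailbox module, and the function name and file arguments are hypothetical.

```python
import mailbox


def missing_message_ids(imported_path, later_path):
    """Return the Message-IDs found in the later mbox but not in the imported one."""

    def message_ids(path):
        # create=False: fail instead of silently creating an empty mbox.
        box = mailbox.mbox(path, create=False)
        try:
            return {msg["Message-ID"].strip() for msg in box if msg["Message-ID"]}
        finally:
            box.close()

    return sorted(message_ids(later_path) - message_ids(imported_path))
```

Messages without a Message-ID header are ignored by this comparison, so an empty result is a hint rather than a guarantee.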
Comment 9 Daniel Zahn 2011-09-19 14:32:58 UTC
Regarding "if those mboxes can't be made permanently public as they've been (at least some of them) for years" i am still unsure about the answer and would like to somehow bring this up in a larger discussion. There could have been good reasons for the changes you noticed over time ("months later I noticed I couldn't any longer"), but i just can't tell.
Comment 10 Mark A. Hershberger 2011-10-04 23:35:18 UTC
(In reply to comment #9)
> Regarding "if those mboxes can't be made permanently public as they've been (at
> least some of them) for years" i am still unsure about the answer and would
> like to somehow bring this up in a larger discussion.

Did you get this discussed?  It seems like something you could bring up at the semi-weekly Ops meeting.
Comment 11 Mark A. Hershberger 2011-10-21 14:48:50 UTC
Lists now available: https://lists.wikimedia.org/mbox/public/
Comment 12 Nemo 2011-10-21 21:12:27 UTC
I've requested the addition to Gmane of those mailing lists I listed plus some more; it may take some time now. I think there are still some more lists to be added which were not on my list; I'll check later.
Comment 13 Nemo 2011-12-05 16:31:57 UTC
Created attachment 9615
Excerpts from the Gmane subscription results replies

Should be done now: all lists have been created (120 have been added to Gmane) and all archives should have been imported in the last few days (I checked only a couple of them). I notified all the lists and list owners.
