Last modified: 2014-06-11 20:50:44 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T63813, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 61813 - filearchive table not available on labs
Status: RESOLVED DUPLICATE of bug 57697
Product: Wikimedia Labs
Classification: Unclassified
Component: Infrastructure
Version: unspecified
Hardware: All All
Importance: High normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Depends on:
Blocks: labs-replication tool-missing-ts-feat
Reported: 2014-02-23 01:01 UTC by Betacommand
Modified: 2014-06-11 20:50 UTC (History)
CC List: 12 users


Description Betacommand 2014-02-23 01:01:59 UTC
Given the usefulness of having metadata for deleted files, including the SHA hash, and given that it is available on the Toolserver, it should be exposed on Labs.
Comment 1 Marc A. Pelletier 2014-02-23 03:15:23 UTC
That information is not available to normal users on the project, and therefore requires an okay by Legal to clear.  Toolserver had imperfectly sanitized replication, and there were quite a few things available there that never should have been without clearance.  :-)

Adding Luis to the bug so that they can opine.
Comment 2 Luis Villa (WMF Legal) 2014-02-23 20:25:55 UTC
Is https://www.mediawiki.org/wiki/Manual:Filearchive_table the best place to figure out what is actually in the relevant table? And do we want all fields or just some?
Comment 3 Betacommand 2014-02-23 20:40:59 UTC
I would prefer as much as possible; the only field that should contain sensitive information is fa_description.
Comment 4 Maarten Dammers 2014-02-23 20:56:35 UTC
The current toolserver view seems to be everything but fa_description and fa_sha1.
* fa_description should be left out as it might contain private info
* fa_sha1 is quite recent (1.21) so I think we just never added it at the Toolserver

mysql> describe filearchive;
+----------------------+--------------------------------------------------------------------------------------------------------+------+-----+---------+-------+
| Field                | Type                                                                                                   | Null | Key | Default | Extra |
+----------------------+--------------------------------------------------------------------------------------------------------+------+-----+---------+-------+
| fa_id                | int(11)                                                                                                | NO   |     | 0       |       |
| fa_name              | varbinary(255)                                                                                         | NO   |     |         |       |
| fa_archive_name      | varbinary(255)                                                                                         | YES  |     |         |       |
| fa_storage_group     | varbinary(16)                                                                                          | YES  |     | NULL    |       |
| fa_storage_key       | varbinary(64)                                                                                          | YES  |     |         |       |
| fa_deleted_user      | int(11)                                                                                                | YES  |     | NULL    |       |
| fa_deleted_timestamp | varbinary(14)                                                                                          | YES  |     |         |       |
| fa_deleted_reason    | blob                                                                                                   | YES  |     | NULL    |       |
| fa_size              | int(8) unsigned                                                                                        | YES  |     | 0       |       |
| fa_width             | int(5)                                                                                                 | YES  |     | 0       |       |
| fa_height            | int(5)                                                                                                 | YES  |     | 0       |       |
| fa_metadata          | mediumblob                                                                                             | YES  |     | NULL    |       |
| fa_bits              | int(3)                                                                                                 | YES  |     | 0       |       |
| fa_media_type        | enum('UNKNOWN','BITMAP','DRAWING','AUDIO','VIDEO','MULTIMEDIA','OFFICE','TEXT','EXECUTABLE','ARCHIVE') | YES  |     | NULL    |       |
| fa_major_mime        | enum('unknown','application','audio','image','text','video','message','model','multipart')             | YES  |     | unknown |       |
| fa_minor_mime        | varbinary(32)                                                                                          | YES  |     | unknown |       |
| fa_user              | int(5) unsigned                                                                                        | YES  |     | 0       |       |
| fa_user_text         | varbinary(255)                                                                                         | YES  |     |         |       |
| fa_timestamp         | varbinary(14)                                                                                          | YES  |     |         |       |
| fa_deleted           | tinyint(1) unsigned                                                                                    | NO   |     | 0       |       |
+----------------------+--------------------------------------------------------------------------------------------------------+------+-----+---------+-------+
20 rows in set (0.00 sec)
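
As a rough illustration of the kind of sanitized view described in this comment, the sketch below builds a view that exposes every column from the `describe` output while leaving fa_description out of the definition. It uses Python's built-in sqlite3 module purely as a stand-in; the real replicas run on MySQL, and the view name `filearchive_public` and the helper `build_sanitized_view` are hypothetical.

```python
import sqlite3

# Columns taken from the `describe filearchive` output above.
# fa_description is deliberately absent from this list.
EXPOSED_COLUMNS = [
    "fa_id", "fa_name", "fa_archive_name", "fa_storage_group",
    "fa_storage_key", "fa_deleted_user", "fa_deleted_timestamp",
    "fa_deleted_reason", "fa_size", "fa_width", "fa_height",
    "fa_metadata", "fa_bits", "fa_media_type", "fa_major_mime",
    "fa_minor_mime", "fa_user", "fa_user_text", "fa_timestamp",
    "fa_deleted",
]


def build_sanitized_view(conn):
    """Create a hypothetical `filearchive_public` view without fa_description."""
    columns = ", ".join(EXPOSED_COLUMNS)
    conn.execute(
        f"CREATE VIEW filearchive_public AS SELECT {columns} FROM filearchive"
    )


conn = sqlite3.connect(":memory:")
# Toy base table: the exposed columns plus the private fa_description column.
conn.execute(
    "CREATE TABLE filearchive ("
    + ", ".join(EXPOSED_COLUMNS + ["fa_description"])
    + ")"
)
build_sanitized_view(conn)

# The view's column list comes from the cursor metadata, even with no rows.
cursor = conn.execute("SELECT * FROM filearchive_public")
view_columns = [column[0] for column in cursor.description]
print(view_columns)
```

Listing the exposed columns explicitly, rather than selecting everything and subtracting, means a newly added column such as fa_sha1 stays hidden until someone decides to expose it.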
Comment 5 Luis Villa (WMF Legal) 2014-03-11 00:11:20 UTC
I know we've seen crazy things be put in filenames before - is that oversightable? Otherwise, agree that fa_sha1 should not be problematic.
Comment 6 Betacommand 2014-03-11 00:26:56 UTC
Oversight no longer exists, but pretty much anything can be rev_del'ed if that is what you are referring to. However I have never seen a case of a file name being problematic.
Comment 7 Luis Villa (WMF Legal) 2014-03-13 00:14:12 UTC
I think it was James who told me that there have been crazy file names in the past, but that may be a fever dream - James?

With regard to fa_description: is that normally publicly visible? I.e., would sensitive information in it be rev_del'd as part of normal site moderation/oversight? Because with other sensitive fields, one option is to simply respect revdel and keep the hidden content from being propagated.
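
The "respect revdel" option could look roughly like the sketch below, which nulls out fields when the matching bit is set in fa_deleted. The bit values follow MediaWiki's File::DELETED_* constants as I understand them; the mapping of DELETED_COMMENT to fa_description, the helper name `mask_suppressed_fields`, and the example row are assumptions for illustration, not the replication code actually used on Labs.

```python
# Bit values mirroring MediaWiki's File::DELETED_* constants (assumed here).
DELETED_FILE = 1
DELETED_COMMENT = 2
DELETED_USER = 4
DELETED_RESTRICTED = 8


def mask_suppressed_fields(row):
    """Return a copy of a filearchive row with revision-deleted fields nulled."""
    masked = dict(row)
    flags = row.get("fa_deleted", 0)
    if flags & DELETED_COMMENT:
        # Assumption: the comment bit covers the upload description.
        masked["fa_description"] = None
    if flags & DELETED_USER:
        masked["fa_user"] = None
        masked["fa_user_text"] = None
    return masked


# Hypothetical row whose description was hidden by an administrator.
example = {
    "fa_name": "Example.jpg",
    "fa_description": "upload comment",
    "fa_user": 42,
    "fa_user_text": "ExampleUser",
    "fa_deleted": DELETED_COMMENT,
}
public_row = mask_suppressed_fields(example)
print(public_row)
```

This only helps if problematic descriptions are reliably rev_del'd in the first place, which is the question raised above.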
Comment 8 Maarten Dammers 2014-05-10 19:46:19 UTC
Blocks toolserver migration.
Comment 9 Marc A. Pelletier 2014-05-19 14:47:01 UTC
There is, IMO, a plausible issue with the SHA, but I don't know whether it is relevant for Legal: its primary use case is (of course) to identify files that have previously been uploaded and then deleted, but it therefore necessarily allows any third party to determine whether any specific file they have the hash of has been uploaded in the past.

Could this be used by, say, a government agency to find who uploaded some files that they were displeased with?
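
To make the concern concrete, here is a minimal sketch of how someone holding a copy of a file could compute a value to compare against fa_sha1. It assumes fa_sha1 uses the same encoding MediaWiki uses for img_sha1, namely the SHA-1 digest written in base 36 and zero-padded to 31 characters; the function name `sha1_base36` and the sample bytes are hypothetical.

```python
import hashlib

BASE36_DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"


def sha1_base36(data: bytes) -> str:
    """SHA-1 of `data`, written in base 36 and left-padded to 31 characters."""
    value = int(hashlib.sha1(data).hexdigest(), 16)
    digits = []
    while value:
        value, remainder = divmod(value, 36)
        digits.append(BASE36_DIGITS[remainder])
    # A 160-bit digest needs at most 31 base-36 digits.
    return "".join(reversed(digits)).rjust(31, "0")


# Hypothetical file contents held by the third party.
candidate = sha1_base36(b"example file contents")
print(candidate)
```

With such a value in hand, a single lookup on the fa_sha1 column would confirm whether identical content was ever uploaded and deleted.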
Comment 10 Luis Villa (WMF Legal) 2014-05-20 17:10:10 UTC
Can't they already do that by simply uploading the file instead of the SHA?
Comment 11 Marc A. Pelletier 2014-05-20 17:13:47 UTC
At best they could tell that some file with the same /name/ existed; the SHA will confirm content.  AFAIK, uploading doesn't check against deleted files' SHAs.
Comment 12 James Alexander 2014-05-24 04:16:35 UTC
(In reply to Marc A. Pelletier from comment #11)
> At best they could tell that some file with the same /name/ existed; the SHA
> will confirm content.  AFAIK, uploading doesn't check against deleted files'
> SHAs.

I may be wrong, but I believe it does (and tells you that the same file is uploaded at X), and I 'think' it also tells you that one was deleted before, though I'd have to double-check that.
Comment 13 Marc A. Pelletier 2014-06-11 17:33:31 UTC

*** This bug has been marked as a duplicate of bug 57697 ***
Comment 14 Rainer Rillke @commons.wikimedia 2014-06-11 20:50:44 UTC
(In reply to Marc A. Pelletier from comment #11)
>  AFAIK, uploading doesn't check against deleted files' SHAs.
It does. And it tells you the title. From the title, look up the (public) logs and you have that user.
