Apache Subversion is unable to store SHA1 collisions Summary: ======== Subversion repositories, in the default configuration, incorrectly store a file that has the same SHA-1 checksum as another file. Known affected: =============== Servers running Apache Subversion up to and including 1.9.5. Known fixed: ============ Servers running Apache Subversion 1.9.6, with rep-sharing enabled. Servers running Apache Subversion 1.8.18, with rep-sharing enabled. After upgrading the server to 1.9.6 or 1.8.18, rep-sharing should be enabled in the file db/fsfs.conf for all FSFS repositories. Because rep-sharing is enabled by default, there is no need to edit db/fsfs.conf unless the enable-rep-sharing option in db/fsfs.conf has been explicitly set to 'false'. If rep-sharing is disabled in a repository then the server (in any version) will store SHA1 collisions. However, standard SVN clients will not be able to handle them, so the recommended setup runs with rep-sharing enabled. Clients will not receive a patch for this issue. BDB repositories are not affected (but the BDB format is deprecated). Details: ======== In February 2017 a group of researchers released two PDF files which have different content but produce the same SHA1 checksum. This was the first publicly known SHA1 collision. If both of these files are committed to an FSFS or FSX repository, content is de-duplicated based on the SHA1 checksum, and only the content of one of the files ends up being stored in the repository. The commit succeeds, but subsequent updates and commits abort with a checksum error, since those two files have different MD5 checksums. De-duplication only takes place when representation sharing is enabled. Representation sharing is enabled by default, but may have been disabled in the fsfs.conf file. Recommendations: ================ We recommend all server installations to upgrade and enable rep-sharing. Subversion 1.9.6 and 1.8.18 reject commits which create a SHA1 collision. Detecting collisions requires representation sharing to be enabled. If you have disabled representation sharing (by setting 'enable-rep-sharing = false' in the file db/fsfs.conf in the repository directory), we recommend that you set that option to 'true' after upgrading to 1.9.6 (and not before). As a stopgap measure for users unable to upgrade immediately, the following hook scripts may be used (currently only Unix versions are available): https://svn.apache.org/repos/asf/subversion/trunk/tools/hook-scripts/reject-detected-sha1-collisions.sh https://svn.apache.org/repos/asf/subversion/trunk/tools/hook-scripts/reject-known-sha1-collisions.sh If rep-sharing is disabled, the fix is ineffective and SHA1 collisions can be stored. This will not cause problems for the repository itself. However, SVN clients which retrieve and store such content in a working copy will run into similar problems: de-duplication of content stored in the working copy also relies on SHA1, and for performance reasons clients using the HTTP protocol will avoid fetching content with a SHA1 checksum which has been fetched previously. Therefore, storing content with SHA1 collisions it not a supported use case. If such content needs to be versioned for any reason, consider packing the files in a compressed archive and commit this archive instead. If such content was accidentally committed while rep-sharing was disabled, one the following remedies may be used: One solution is just to delete the second file. This will resolve this problem for normal SVN client usage, but it will not work for tools like svnsync or git-svn which try to replay every revision in the repository. The tool will run into an error on the revision where the content was committed and the tool will not be able to proceed. A second solution would be to remove the problematic revision with svnadmin. 'svnadmin dump' can be used to dump the repository up to the revision that introduced the problem. This dump file can be loaded into a new repository. If there were more commits after the problematic revision then dump and load all of these subsequent revisions as well. Another option is to create a Subversion permission rule (authz) that blocks access to the one or both of the files. This will work with tools like svnsync and git-svn as the server will not send the colliding content. For example, authz rules which block access to the files could look like: [/trunk/tests/data/shattered-1.pdf] * = [/trunk/tests/data/shattered-2.pdf] * = With such an authz rule in place, 'svnsync' can be used to create a mirror of the repository, without the collision. Remember to change the mirror's UUID with 'svnadmin setuuid' after populating it with history. Another way to create a mirror is with a dump/load cycle using 'svndumpfilter exclude' to delete one or both files in the collision. References: =========== SVN issue 4673: https://issues.apache.org/jira/browse/SVN-4673 SHA1 collision: https://shattered.io/ First known occurrence: https://bugs.webkit.org/show_bug.cgi?id=168774#c29 CVE-2005-4900: https://nvd.nist.gov/vuln/detail/CVE-2005-4900 Reported by: ============ Øyvind A. Holm The WebKit Project, https://webkit.org/ Bryan Rosander