General Overview ¶

Participating in the community ¶

Although Subversion was originally sponsored and hosted by CollabNet (https://www.collab.net/), it's a true open-source project under the Apache License, Version 2.0. A number of Subversion's developers are paid by their employers to improve Subversion, while many others are simply excellent volunteers who are interested in building a better version control system.

The community exists mainly through mailing lists and a Subversion repository. To participate:

Join the "dev", "commits", and "announce" mailing lists. The dev list, dev@subversion.apache.org, is where almost all discussion takes place. All development questions should go there, though you might want to check the list archives first. The "commits" list receives automated commit emails. See https://subversion.apache.org/mailing-lists.html for details.
Get a copy of the latest development sources from https://svn.apache.org/repos/asf/subversion/trunk/
New development always takes place on trunk. Bugfixes, enhancements, and new features are backported from there to the various release branches.

For several years, the Subversion Community met in Berlin for a Hackathon once a year: 2016, 2015, 2014, 2013, 2012, 2011 and 2010 [linked from archive.org, some media links do not work].

There are many ways to join the project, either by writing code, or by testing and/or helping to manage the bug database. If you'd like to contribute, then look at:

The bugs/issues database https://issues.apache.org/jira/projects/SVN/issues

To submit code, simply send your patches to dev@subversion.apache.org. No, wait, first read the rest of this file, then start sending patches to dev@subversion.apache.org. :-)

To help manage the issues database, read over the issue summaries, looking and testing for issues that are either invalid, or are duplicates of other issues. Both kinds are very common, the first because bugs often get unknowingly fixed as side effects of other changes in the code, and the second because people sometimes file an issue without noticing that it has already been reported. If you are not sure about an issue, post a question to dev@subversion.apache.org. ("Subversion: We're here to help you help us!")

Another way to help is to set up automated builds and test suite runs of Subversion on some platform, and have the output sent to the notifications@subversion.apache.org mailing list. See more details at the mailing lists page.

Finally, despite the online nature of the Subversion project and the human contact abstraction that results from that fact, it is important to realize that there are real people at the end of all contributions. Treat all other community members as you would expect to be treated. Review the contribution, not the contributor; don't annoy others, and don't become easily annoyed yourself.

Theory and documentation ¶

Design

A design spec was written in June 2000, and is a bit out of date. But it still gives a good theoretical introduction to the inner workings of the repository, and to Subversion's various layers.
API Documentation

See the section on the public API documentation for more information.
Delta Editors

Karl Fogel wrote a chapter for O'Reilly's 2007 book Beautiful Code: Leading Programmers Explain How They Think covering the design and use of Subversion's delta editor interface.
Network Protocols

The WebDAV Usage document is an introduction to Subversion's DAV network protocol, which is an extended dialect of HTTP and uses URLs beginning with "http://" or "https://".

The SVN Protocol document contains a formal description of Subversion's ra_svn network protocol, which is a custom protocol on port 3690 (by default), whose URLs begin with "svn://" or "svn+ssh://".
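
As a rough illustration of how the URL scheme selects the network protocol, here is a minimal sketch. The names access_method_t, classify_url and RA_SVN_DEFAULT_PORT are hypothetical and exist only for this example; they are not part of Subversion's API, and the real repository access layer does considerably more than compare prefixes.

```c
#include <assert.h>
#include <string.h>

/* Default port of the ra_svn protocol, as stated in the text above. */
#define RA_SVN_DEFAULT_PORT 3690

/* Hypothetical labels for this sketch; not Subversion API names. */
typedef enum access_method_t
{
  ACCESS_UNKNOWN,  /* scheme not covered by the two documents above */
  ACCESS_DAV,      /* DAV dialect of HTTP: "http://" or "https://" */
  ACCESS_RA_SVN    /* custom protocol: "svn://" or "svn+ssh://" */
} access_method_t;

/* Return non-zero if STR begins with PREFIX. */
static int
starts_with(const char *str, const char *prefix)
{
  return strncmp(str, prefix, strlen(prefix)) == 0;
}

/* Classify URL by its scheme prefix.  URL must not be NULL. */
access_method_t
classify_url(const char *url)
{
  assert(url != NULL);

  if (starts_with(url, "http://") || starts_with(url, "https://"))
    return ACCESS_DAV;
  if (starts_with(url, "svn://") || starts_with(url, "svn+ssh://"))
    return ACCESS_RA_SVN;
  return ACCESS_UNKNOWN;
}
```
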
User Manual

Version Control with Subversion is a book published by O'Reilly that shows in detail how to effectively use Subversion. The text of the book is free, and is actively being revised. On-line versions are available at https://svnbook.red-bean.com/. The XML source and translations to other languages are maintained in their own repository at https://sourceforge.net/projects/svnbook/.
System notes

A lot of the design ideas for particular aspects of the system have been documented in individual files in the notes/ directory.

Code to read ¶

Before you can contribute code, you'll need to familiarize yourself with the existing code base and interfaces.

Check out a copy of Subversion (anonymously, if you don't yet have an account with commit access) so that you can look at the code.

Within 'subversion/include/' are a bunch of header files with huge doc comments. If you read through these, you'll have a pretty good understanding of the implementation details. Here's a suggested perusal order:

the basic building blocks: svn_string.h, svn_error.h, svn_types.h
svn_io.h, svn_path.h, svn_hash.h, svn_xml.h
the critical interface: svn_delta.h
client-side interfaces: svn_ra.h, svn_wc.h, svn_client.h
the repository and versioned filesystem: svn_repos.h, svn_fs.h

Subversion tries to stay portable by using only the C89/C90 dialect of ANSI/ISO C and by using the Apache Portable Runtime (APR) library. APR is the portability layer used by the Apache httpd server, and more information can be found at https://apr.apache.org/.

Because Subversion depends so heavily on APR, it may be hard to understand Subversion without first glancing over certain header files in APR (look in 'apr/include/'):

memory pools: apr_pools.h
filesystem access: apr_file_io.h
hashes and arrays: apr_hash.h, apr_tables.h

Subversion also tries to deliver reliable and secure software. This can only be achieved by developers who understand secure programming in the C programming language. Please see 'notes/assurance.txt' for the full rationale behind this. Specifically, you should make it a point to carefully read David Wheeler's Secure Programming (as mentioned in 'notes/assurance.txt'). If at any point you have questions about the security implications of a change, you are urged to ask for review on the developer mailing list.

Directory layout ¶

A rough guide to the source tree:

doc/
User and Developer documentation.
tools/
Stuff that works with Subversion, but that Subversion doesn't depend on. Code in tools/ is maintained collectively by the Subversion project, and is under the same open source copyright as Subversion itself.
contrib/
Stuff that works with Subversion, but that Subversion doesn't depend on, and that is maintained by individuals who may or may not participate in Subversion development. Code in contrib/ is open source, but may have a different license or copyright holder than Subversion itself.
subversion/
Source code to Subversion itself (as opposed to external libraries).
subversion/include/
Public header files for users of Subversion libraries.
subversion/include/private/
Private header files shared internally by Subversion libraries.
subversion/libsvn_fs/
The versioning "filesystem" API.
subversion/libsvn_repos/
Repository functionality built around the `libsvn_fs' core.
subversion/libsvn_delta/
Common code for tree deltas, text deltas, and property deltas.
subversion/libsvn_wc/
Common code for working copies.
subversion/libsvn_ra/
Common code for repository access.
subversion/libsvn_client/
Common code for client operations.
subversion/svn/
The command line client.
subversion/tests/
Automated test suite.

Branching policy ¶

The Subversion project strongly prefers that active development happens in the common trunk. Changes made to trunk have the highest visibility and get the greatest amount of exercise that can be expected from unreleased code. For this to be beneficial to everyone, though, our trunk is expected at all times to be stable. It should build. It should work. It might not be release-ready, but it should certainly be test-suite ready.

We also strongly prefer to see large changes broken up into several, smaller, logical commits — each of which is expected to meet the aforementioned requirements of stability.

That said, we understand that it can be nearly impossible to apply all these policies to particularly large changes (new features, sweeping code reorganizations, etc.). It is in those situations that you might consider using a custom branch dedicated to your development task. The following are some guidelines to make your branch-based development work go smoothly.

Branch creation and management ¶

There's nothing particularly complex about branch-based development. You make a branch from the trunk (or from whatever branch best serves as both source and destination for your work), and you do your work on it. Subversion's merge tracking feature has greatly helped to reduce the sort of mental overhead required to work in this way, so making good use of that feature (by using Subversion 1.5 or newer clients, and by performing all merges to and from the roots of branches) is highly encouraged.

For our policy on log messages for your branch, please note the section on writing log messages.

Lightweight branches ¶

If you're working on a feature or bugfix in stages involving multiple commits, and some of the intermediate stages aren't stable enough to go on trunk, then create a temporary branch in /branches. There's no need to ask — just do it. It's fine to try out experimental ideas in a temporary branch, too. And all the preceding applies to partial as well as full committers. It even applies to committers of other ASF projects, but please talk to us (on dev@)—introduce yourself and the problem you plan to work on.

When you're done with the branch — when you've either merged it to trunk or given up on it — please remember to remove it.

See also the section on partial commit access for our policy on offering commit access to experimental branches.

BRANCH-README files ¶

For branches you expect to be longer-lived, we recommend the creation and regular updating of a file in the root of your branch named BRANCH-README. Such a file provides you with a great, central place to describe the following aspects of your branch:

The basic purpose of your branch: what bug it exists to fix, or feature to implement; what issue number(s) it relates to; what list discussion threads surround it; what design docs exist to describe the situation.
What style of branch management you are using: is this a feature branch that will regularly be kept in sync with its parent branch and ultimately reintegrated back to that parent branch? Is it a fork that is not expected to be merged back to its parent branch in the foreseeable future? Does it relate to any other branches?
What tasks remain for you to accomplish on your branch? Are those tasks claimed by someone? Do they need more design input? How can others help you?

Here is an example BRANCH-README file that demonstrates what we're talking about:

This branch exists for the resolution of issue #8810, per the ideas
documented in /trunk/notes/frobnobbing-feature.txt.  It is a feature
branch, receiving regular sync merges from /trunk, and expected to be
reintegrated back thereto.

TODO:

  * compose regression tests        [DONE]
  * add frob identification logic   [STARTED (fitz)]
  * add nobbing bits                []

Why all the fuss? Because this project idealizes communication and collaboration, understanding that the latter is more likely to happen when the former is a point of emphasis.

Just remember when you merge your branch back to its source to delete the BRANCH-README file.

Documentation ¶

Document Everything ¶

Every function, whether public or internal, must start out with a documentation comment that describes what the function does. The documentation should mention every parameter received by the function, every possible return value, and (if not obvious) the conditions under which the function could return an error.

For internal documentation, put the parameter names in upper case in the doc string, even when they are not upper case in the actual declaration, so that they stand out to human readers.

For public or semi-public API functions, the doc string should go above the function in the .h file where it is declared; otherwise, it goes above the function definition in the .c file.

For structure types, document each individual member of the structure as well as the structure itself.
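
The conventions above might look roughly like this for an internal helper. This is only a sketch: line_range_t and range_intersect are hypothetical names invented for the example and do not exist in Subversion. Note the upper-case parameter names in the doc string, the statement of every return value, and the comment on each structure member.

```c
#include <assert.h>
#include <stddef.h>

/* A half-open range of line numbers.  (Hypothetical type, used only
   in this sketch.) */
typedef struct line_range_t
{
  /* First line in the range, counting from 1. */
  int start;

  /* One past the last line in the range; equal to START when the
     range is empty. */
  int end;
} line_range_t;

/* Set *OVERLAP to the intersection of A and B and return 1.  If A and
   B do not intersect, return 0 and leave *OVERLAP untouched.  A, B
   and OVERLAP must not be NULL. */
int
range_intersect(line_range_t *overlap,
                const line_range_t *a,
                const line_range_t *b)
{
  int start, end;

  assert(overlap != NULL && a != NULL && b != NULL);

  /* The intersection begins at the later start and finishes at the
     earlier end; it is empty if those two cross. */
  start = a->start > b->start ? a->start : b->start;
  end = a->end < b->end ? a->end : b->end;
  if (start >= end)
    return 0;

  overlap->start = start;
  overlap->end = end;
  return 1;
}
```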

For actual source code, internally document chunks of each function, so that someone familiar with Subversion can understand the algorithm being implemented. Do not include obvious or overly verbose documentation; the comments should help understanding of the code, not hinder it.

For example:

  /*** How not to document.  Don't do this. ***/

  /* Make a foo object. */
  static foo_t *
  make_foo_object(arg1, arg2, apr_pool_t *pool)
  {
     /* Create a subpool. */
     apr_pool_t *subpool = svn_pool_create(pool);

     /* Allocate a foo object from the main pool */
     foo_t *foo = apr_palloc(pool, sizeof(*foo));
     ...
  }

Instead, document decent sized chunks of code, like this:

      /* Transmit the segment (if it's within the scope of our concern). */
      SVN_ERR(maybe_crop_and_send_segment(segment, start_rev, end_rev,
                                          receiver, receiver_baton, subpool));

      /* If we've set CURRENT_REV to SVN_INVALID_REVNUM, we're done
         (and didn't ever reach END_REV).  */
      if (! SVN_IS_VALID_REVNUM(current_rev))
        break;

      /* If there's a gap in the history, we need to report as much
         (if the gap is within the scope of our concern). */
      if (segment->range_start - current_rev > 1)
        {
          svn_location_segment_t *gap_segment;
          gap_segment = apr_pcalloc(subpool, sizeof(*gap_segment));
          gap_segment->range_end = segment->range_start - 1;
          gap_segment->range_start = current_rev + 1;
          gap_segment->path = NULL;
          SVN_ERR(maybe_crop_and_send_segment(gap_segment, start_rev, end_rev,
                                              receiver, receiver_baton,
                                              subpool));
        }
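
The gap arithmetic in that excerpt can be checked in isolation. The following is a simplified, self-contained sketch, not the real implementation: revnum_t, segment_t and find_gap are hypothetical stand-ins, and it assumes (as the excerpt suggests) that history is walked from younger to older revisions, so the revision where the walk resumes is lower than the start of the segment just sent.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a revision number in this sketch. */
typedef long revnum_t;

/* Hypothetical, simplified location segment: the inclusive revision
   range [RANGE_START, RANGE_END]. */
typedef struct segment_t
{
  revnum_t range_start;
  revnum_t range_end;
} segment_t;

/* SEGMENT_START is the oldest revision of the segment just reported,
   and OLDER_REV is the revision where the walk resumes; OLDER_REV
   must be less than SEGMENT_START.  If one or more revisions lie
   strictly between the two, store that gap in *GAP and return 1;
   otherwise return 0 and leave *GAP untouched. */
int
find_gap(segment_t *gap, revnum_t segment_start, revnum_t older_rev)
{
  assert(gap != NULL);
  assert(older_rev < segment_start);

  /* Adjacent revisions differ by exactly 1, so only a difference
     greater than 1 leaves room for a gap. */
  if (segment_start - older_rev > 1)
    {
      gap->range_start = older_rev + 1;
      gap->range_end = segment_start - 1;
      return 1;
    }

  return 0;
}
```

For instance, a segment starting at revision 10 followed by a resume point of revision 6 leaves the gap 7 through 9, whereas a resume point of revision 9 leaves no gap at all.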

Read over the Subversion code to get an overview of how documentation looks in practice; in particular, see subversion/include/*.h for doxygen examples.

Public API Documentation ¶

We use the Doxygen format for public interface documentation. This means anything that goes in a public header file. The generated documentation is published on the web site for the latest and some earlier Subversion sources.

We use only a small portion of the available Doxygen commands to mark up our source. When writing Doxygen documentation, the following conventions apply:

Use complete sentences and prose descriptions of the function, preceding parameter names with @a, and type and macro names with @c.
Use <tt>...</tt> to display multiple words and @p to display only one word in typewriter font.
Constant values, such as TRUE, FALSE and NULL should be in all caps.
When several functions are related, define a group name, and group them together using @defgroup and @{...@}.

See the Doxygen manual for a complete list of commands.
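
As a sketch of how these conventions combine, here is a small, made-up public declaration with its documentation. Everything named example_* is hypothetical and invented for illustration, and TRUE and FALSE are defined locally only so that the block stands alone.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins so that this sketch compiles on its own. */
#ifndef TRUE
#define TRUE 1
#define FALSE 0
#endif

/**
 * @defgroup example_counters Word counting (hypothetical example group)
 * @{
 */

/** A running tally, updated by example_count_words(). */
typedef struct example_tally_t
{
  /** Number of words seen so far. */
  int words;
} example_tally_t;

/**
 * Count the space-separated words in @a text and add the count to the
 * @c example_tally_t pointed to by @a tally.  For example,
 * <tt>a b  c</tt> contains three words.
 *
 * Return TRUE if at least one word was found, else FALSE.  Neither
 * @a tally nor @a text may be NULL.
 */
int
example_count_words(example_tally_t *tally, const char *text);

/** @} */

int
example_count_words(example_tally_t *tally, const char *text)
{
  int found = 0;
  int in_word = FALSE;
  const char *p;

  assert(tally != NULL && text != NULL);

  /* Count each transition from a space (or the start of TEXT) to a
     non-space character as the start of one word. */
  for (p = text; *p != '\0'; p++)
    {
      if (*p == ' ')
        in_word = FALSE;
      else if (! in_word)
        {
          in_word = TRUE;
          found++;
        }
    }

  tally->words += found;
  return found > 0 ? TRUE : FALSE;
}
```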

Patch submission guidelines ¶

Writing patches ¶

To get the latest source code, run:

svn checkout https://svn.apache.org/repos/asf/subversion/trunk/ svn-trunk

and follow the instructions in the INSTALL file. (If you do not have an svn client, download a source tarball.)

If your patch implements a new feature, or changes large amounts of code, please remember to discuss it on the dev@ list first, so that the community can raise concerns and suggest improvements to the proposed feature or its implementation as early as possible. It is always better for all parties if that feedback arrives sooner (even before any code is written) rather than later. Please allow some time for a response, as not everyone is online all the time.

If you have any questions about the patch, please feel free to ask them on dev@.

Submitting patches ¶

Mail patches to dev@subversion.apache.org, starting the subject line with [PATCH]. This helps our patch manager spot patches right away. For example:

   Subject: [PATCH] fix for rev printing bug in svn status

If the patch addresses a particular issue, include the issue number as well: "[PATCH] issue #1729: ...". Developers who are interested in that particular issue will know to read the mail.
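
To make the subject-line convention concrete, here is a minimal sketch of how a mail filter could recognize it. The functions subject_is_patch and subject_issue_number are hypothetical helpers written for this example; they are not an existing tool of the project, and the patch manager is under no obligation to work this way.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if SUBJECT starts with "[PATCH]", else 0.  SUBJECT must
   not be NULL. */
int
subject_is_patch(const char *subject)
{
  assert(subject != NULL);
  return strncmp(subject, "[PATCH]", strlen("[PATCH]")) == 0;
}

/* If SUBJECT has the form "[PATCH] issue #N: ..." with N a positive
   number, return N; otherwise return 0.  SUBJECT must not be NULL. */
long
subject_issue_number(const char *subject)
{
  long issue = 0;
  int consumed = 0;

  assert(subject != NULL);

  /* CONSUMED is only set if everything up to and including the colon
     matched, so a subject without the colon is rejected. */
  if (sscanf(subject, "[PATCH] issue #%ld:%n", &issue, &consumed) == 1
      && consumed > 0 && issue > 0)
    return issue;

  return 0;
}
```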

A patch submission should contain one logical change; please don't mix N unrelated changes in one submission — send N separate emails instead.

Generate the patch using svn diff -x-p from the top of a Subversion trunk working copy. If the file you're diffing is not under revision control, you can achieve the same effect by using diff -u.

Please include a log message with your patch. A good log message helps potential reviewers understand the changes in your patch, and increases the likelihood that it will be applied. You can put the log message in the body of the email, or at the top of the patch attachment (see below). Either way, it should follow the guidelines given in Writing log messages, and be enclosed in triple square brackets, like so:

   [[[
   Fix issue #1729: Don't crash because of a missing file.

   * subversion/libsvn_ra_ansible/get_editor.c
     (frobnicate_file): Check that file exists before frobnicating.
   ]]]

(The brackets are not actually part of the log message, they're just a way to clearly mark off the log message from its surrounding context.)
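
Because the markers are so regular, a reviewer's tooling can pull the log message out of a mail mechanically. The sketch below shows one way to do that; find_log_message is a hypothetical helper invented for this example, and it assumes the first "[[[" in the mail opens the log message.

```c
#include <assert.h>
#include <string.h>

/* Find the log message that MAIL encloses in "[[[" and "]]]".  On
   success, set *START to the first character after "[[[", set *LEN to
   the number of characters before the closing "]]]", and return 1.
   Return 0 if either marker is missing, leaving *START and *LEN
   untouched.  The text is neither copied nor trimmed.  MAIL, START
   and LEN must not be NULL. */
int
find_log_message(const char **start, size_t *len, const char *mail)
{
  const char *begin, *end;

  assert(start != NULL && len != NULL && mail != NULL);

  begin = strstr(mail, "[[[");
  if (begin == NULL)
    return 0;
  begin += 3;  /* skip past the opening marker */

  end = strstr(begin, "]]]");
  if (end == NULL)
    return 0;

  *start = begin;
  *len = (size_t)(end - begin);
  return 1;
}
```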

If possible, send the patch as an attachment with a mime-type of text/x-diff, text/x-patch, or text/plain. Most people's mailreaders can display those inline, and having the patch as an attachment allows them to extract the patch from the message conveniently. Never send patches in archived or compressed form (e.g., tar, gzip, zip, bzip2), because that prevents people from reviewing the patch directly in their mailreaders.

If you can't attach the patch with one of these mime-types, or if the patch is very short, then it's okay to include it directly in the body of your message. But watch out: some mail editors munge inline patches by inserting unasked-for line breaks in the middle of long lines. If you think your mail software might do this, then please use an attachment instead.

If the patch implements a new feature, make sure to describe the feature completely in your mail; if the patch fixes a bug, describe the bug in detail and give a reproduction recipe. An exception to these guidelines is when the patch addresses a specific issue in the issues database — in that case, just refer to the issue number in your log message, as described in Writing log messages.

It is normal for patches to undergo several rounds of feedback and change before being applied. Don't be discouraged if your patch is not accepted immediately — it doesn't mean you goofed, it just means that there are a lot of eyes looking at every code submission, and it's a rare patch that doesn't have at least a little room for improvement. After discussing people's responses to your patch, make the appropriate changes and resubmit, wait for the next round of feedback, and lather, rinse, repeat, until some committer applies it. You can avoid some iterations by reviewing and applying the project's Coding Conventions to your patch before submitting it.

If you don't get a response for a while, and don't see the patch applied, it may just mean that people are really busy. Go ahead and repost, and don't hesitate to point out that you're still waiting for a response. One way to think of it is that patch management is highly parallelizable, and we need you to shoulder your share of the management as well as the coding. Every patch needs someone to shepherd it through the process, and the person best qualified to do that is the original submitter.