Hixie's Natural Log: The Spectrum of Openness

2023-08-11 19:05 UTC The Spectrum of Openness

"Open Source" is a broad spectrum, with various axes. The following is an attempt to describe various ways to look at openness to aid project leaders in determining what they want their project to look like. I originally wrote this for my colleagues at Google, but the concepts apply widely and I figured they might be of use for others.

In practice, every project is a unique snowflake and there are exceptions to every rule. A project can be proprietary but use and contribute back to some open source library. An open source project can have undocumented proprietary protocols. A team can intend to fall in one category, but by their actions fall in another. The descriptions below should be seen merely as a high-level description of some possible ways projects can be configured, not as a comprehensive guide to the taxonomy of openness. Additionally, the examples I give below refer to the state of those products as of the time of writing. As projects evolve, these may become less accurate.

Interoperability (0-6)

One aspect of openness is how one's product interacts with others.

For the purposes of this section, APIs (Application Programming Interfaces), ABIs (Application Binary Interfaces), formats, and protocols are considered equivalent. While they serve different roles in practice, the techniques used to limit or encourage their reuse are the same.

0. Proprietary with obfuscation

The most closed approach is to leave one's protocols publicly undocumented and to design them to be actively hard to understand through reverse engineering. Patents and DRM may also be used to further restrict potential interoperability by legal means in some jurisdictions.

Examples: Kindle file format, most streaming music formats.

1. Proprietary

Most protocols that are not intended for interoperability with other systems are undocumented (at least, not documented in a manner intended for public consumption), but are otherwise not obfuscated, and a sufficiently motivated user could reverse engineer the protocol and use it.

Example: NTFS file system.

2. Licensed open standards

One can have entirely open specifications, but require payment (or other agreements) before the standard can be read or used, e.g. by the use of patent licensing.

Example: the H.264 video codec.

3. "The Code Is The Standard"

Some projects do not document their protocols, but since their source code is available, they are effectively defined by their implementation, bugs and all.

Examples abound, but since people rarely intend to be in this state, calling out any specific project as being in this category tends to be controversial.

4. Public

When it is desired that users create new products to interact with one's own, one may publicly document one's protocols. There are varying levels of completeness to such documentation; for example, whether some aspects are kept proprietary, or whether the documentation includes details for error handling and future extensions.

Examples: IntelliJ, SwiftUI, SMB protocol.

5. Open standards

The ultimate openness one can present is to submit one's protocols to a standards committee (or form a new one; the difference is largely symbolic). This is useful when the intent is to create an entire ecosystem around one's product and protocols.

Examples: the Internet's core protocols, the web.

6. Regulated standards

In the extreme, interoperability around some standards becomes so important that government agencies get involved and the protocol becomes a matter of law.

Examples: power grid standards.

Source code license (0-7)

If software is provided in binary form (e.g. client applications) then sufficiently motivated users will be able to reverse engineer it, even if the source code is not explicitly shared with the user. For the purposes of this section, we are ignoring this and focusing on the access that users have to the project's original source code.

0. Trade secret

Some source code is so secret and so important to its owner that it gets legal protection beyond copyright.

Example: The most sensitive internals of particularly special proprietary software products.

1. Proprietary

The default is for source code to be copyrighted. If one does not redistribute it, then that source code is entirely closed.

Example: The source code for the UI parts of macOS.

2. Commercially licensed

One can license one's code for use by specific downstream users, without making it public. Typically this is done for money.

Examples: Qt (in its closed-source form); Microsoft's sale of access to the Windows source code.

3. Source code that is incidentally visible

One can publish one's source code without licensing it (or while licensing it under a very restrictive license that essentially does not allow any use), typically as an incidental part of distributing one's application. This allows people to see the code, but does not allow them to use it in their own projects unless they negotiate a separate license with the distributor.

Examples: JavaScript code in web sites that don't use a minifier or compiler; script code in game data files.

4. Usage-restricted source-sharing

One can make one's source available under a license that allows some kinds of reuse by other parties but prevents others, such as commercial use, use by enterprises over a certain size, or use that competes with the original developer. This can be done either by prohibiting undesired uses outright, or by nominally allowing them but only under onerous terms.

Example: MongoDB.

Open source licenses

One can license one's code for public use, and these licenses can vary in their terms.

It's important to note that there are legally-sound open source licenses, and there are nonsense "licenses" that are the result of software engineers thinking that being a lawyer is easy. Talk to a lawyer before choosing a license. See the OSI's license page for an overview of the topic.

5. Restrictive

The most strict open source licenses significantly limit what one can do with the source code. For example, they might require that downstream developers license their modifications and any linked code with the same license, or require that downstream developers license their software such that their users can obtain their app's source code.

Examples: GPL-licensed software, such as Linux or Emacs.

6. Reciprocal

These licenses apply the restrictive terms to the code in question (typically a library) but not to code that uses it (such as an application that embeds the library).

Examples: MPL-licensed software, such as Firefox.

7. Permissive

The most liberal licenses require very little of downstream developers other than the replication of the copyright notices in software that uses the covered code (and in some cases not even that).

Examples: Apache-licensed software, such as Android or Rust; BSD-licensed software, such as Chromium or Flutter.

Copyright management

Projects that accept source code from more than one legal entity may wish to navigate the issues of copyright assignment, liability, relicensing, and so forth. The usual tools for this are Contributor License Agreements (CLAs) and Developer Certificates of Origin (DCOs). Talk to a lawyer about these options.

Development processes (0-8)

Distinct from what one does with the protocols and the code is the choice of how to design and develop the code: where conversations happen, how people are added to the project, and so forth. This is sometimes called "governance".

The sections below apply as much to one-person projects as to big projects, but are primarily focused on projects with multiple team members.

0. Proprietary development

The most closed projects have no public-facing development at all. All design, implementation, and testing happens internally.

Example: Google Search.

1. Proprietary development of open source software

As with proprietary software, all the design, implementation, and testing happens internally. However, the source code is open source in some way, and is published periodically (e.g. in conjunction with a product release). This is often referred to as "throwing the code over the wall". No attempt is made to encourage public contributions. Patches may in some cases be taken (e.g. by e-mail).

Examples: SQLite, Postfix.

2. Proprietary development, limited-access betas

A team can invite a closed set of unaffiliated users to test their software before launches.

This is a common model for commercial software.

3. Proprietary development, public betas

A team with private development can solicit feedback from a public community by providing pre-release software for any user to test.

This is a common model for commercial software.

4. Public presence, private development

A team with public tooling (e.g. bug databases, code repositories, code reviews, continuous integration), but that makes no attempt to accept public contributions (code, suggestions, etc). Public bug reports may be accepted but the development team typically does not engage with the bug reporters.

Such a team's communications channels are all or mostly internal. Commit access is typically automatic for the team, and unavailable for anyone else.

Examples: Many of Google's small open source projects, and many personal projects on GitHub, fall in this category.

5. Public clique development

A team with public tooling that nominally accepts public contributions, but where becoming an active and equal member of the team is in practice discouraged (new team members are explicitly recruited, and usually all work for the same company). Friction points exist that reduce the likelihood of contributions, for example: public tooling that is different from that typically used by open source projects; official public channels that are not typically frequented by the bulk of the team; missing or out-of-date documentation (especially about how to contribute); and most communications being held in private channels.

The team may engage with bug reporters on occasion, and may listen to suggestions for project direction, but all decisions ultimately rest with the core team.

Many projects that try to make the jump from "public presence, private development" to "public development, private governance" end up in this state because they underestimate the effort required to successfully and productively develop software in public. That said, this is a valid development model in its own right, especially for projects that need a strong driving vision such as a programming language or an open source narrative video game.

Examples abound, but since people rarely intend to be in this state, calling out any specific project as being in this category tends to be controversial.

6. Public development, private governance

A team that works in public, with public design decisions, public meetings, and public chats, but whose core leadership is accountable to a single entity whose primary purpose is not this project (e.g. a company). Typically such a project is largely funded by that single entity as well, especially in terms of employing the most active contributors and marketing the project.

Such a team typically hopes that one day contributors with other affiliations will be independently designing, implementing, reviewing, and landing code without oversight from the project leads beyond ensuring a broad alignment on strategy. As such, it actively tries to differentiate between being a member of the development team, and being a member of the primary sponsoring organization. Such a team typically has publicly-visible documentation of its processes, governance, values, contributor access policies, etc.

Example: Flutter.

7. Public development with an unelected but independent core team

A project can be entirely open in its development, with a self-appointed core team that does not answer to anyone but themselves. The term "Benevolent Dictator for Life" (BDFL) is sometimes used to describe this model when the core team is a single person (usually the project founder).

It can have the advantage of a strong vision unaffected by fleeting trends, but can also have the disadvantage of the project not being responsive to important shifts in the environment.

Example: Linux.

8. Accountable independent public development

Ultimately, the most open a project can be is for it to have entirely independent governance accountable to its community, e.g. a foundation with democratically elected core leaders.

This has all the advantages and disadvantages of democracy.

Examples: Python, Kubernetes.
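
For those who find it easier to think about this in code, here is a minimal sketch of how the three axes might be recorded for a project. The enum and class names (`Interoperability`, `SourceLicense`, `Development`, `OpennessProfile`) are my own invention for illustration and not part of any real tool. The two sample profiles only use placements given in the examples above; interoperability is left unset because the text does not place those projects on that axis.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Interoperability(IntEnum):
    """Interoperability axis (0-6), numbered as in the section above."""
    PROPRIETARY_OBFUSCATED = 0
    PROPRIETARY = 1
    LICENSED_OPEN_STANDARD = 2
    CODE_IS_THE_STANDARD = 3
    PUBLIC = 4
    OPEN_STANDARD = 5
    REGULATED_STANDARD = 6


class SourceLicense(IntEnum):
    """Source code license axis (0-7)."""
    TRADE_SECRET = 0
    PROPRIETARY = 1
    COMMERCIALLY_LICENSED = 2
    INCIDENTALLY_VISIBLE = 3
    USAGE_RESTRICTED = 4
    RESTRICTIVE = 5
    RECIPROCAL = 6
    PERMISSIVE = 7


class Development(IntEnum):
    """Development process axis (0-8)."""
    PROPRIETARY = 0
    PROPRIETARY_OPEN_SOURCE = 1
    LIMITED_BETA = 2
    PUBLIC_BETA = 3
    PUBLIC_PRESENCE = 4
    PUBLIC_CLIQUE = 5
    PUBLIC_PRIVATE_GOVERNANCE = 6
    INDEPENDENT_CORE_TEAM = 7
    ACCOUNTABLE_INDEPENDENT = 8


@dataclass(frozen=True)
class OpennessProfile:
    """Hypothetical record of where one project sits on each axis.

    The axes are independent, so they are stored separately and
    never combined into a single score.
    """
    name: str
    license: SourceLicense
    development: Development
    interoperability: Optional[Interoperability] = None  # None = not assessed

    def summary(self) -> str:
        if self.interoperability is None:
            interop = "?"
        else:
            interop = f"{int(self.interoperability)} ({self.interoperability.name})"
        return (
            f"{self.name}: interoperability={interop}, "
            f"license={int(self.license)} ({self.license.name}), "
            f"development={int(self.development)} ({self.development.name})"
        )


# Placements taken from the examples in the text, as of the time of writing.
linux = OpennessProfile(
    "Linux", SourceLicense.RESTRICTIVE, Development.INDEPENDENT_CORE_TEAM
)
flutter = OpennessProfile(
    "Flutter", SourceLicense.PERMISSIVE, Development.PUBLIC_PRIVATE_GOVERNANCE
)

if __name__ == "__main__":
    print(linux.summary())
    print(flutter.summary())
```

Keeping the axes as separate fields reflects the point that a project can sit further along one axis than another: in this sketch Flutter is ahead of Linux on the license axis but behind it on the development axis.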
