Will the Big Tent "break" OpenStack?
As fast as OpenStack moves, sometimes it's hard to remember that it's still a relatively young project -- until something needs to be fixed. First, we moved from "core" projects (theoretically those without which you can't run OpenStack) to "integrated" (defined as those projects that are part of the semi-annual release cadence), with projects applying for "incubation" in order to get into those categories.
But over the last few months, it became apparent that even this model had problems; while a project may be "blessed" for a particular purpose, that blessing may not reflect the reality of what is out in the marketplace. To solve this problem, the Technical Committee is implementing a new project restructuring known as the "Big Tent" initiative. The premise here is that projects that wish to live in the openstack code namespace should have a clear, objective set of requirements for entry, even if there's already another project that partially addresses the same problem domain.
This, of course, is a huge change, and we spoke to Mirantis Principal Technical Architect and OpenStack Technical Committee member Jay Pipes to find out what the implications actually are.
The problems with the current system
"The goal was to open the code namespaces to new projects and introduce some competition where there may be some overlap between project functionalities," Jay said.
For example, he explained, Ceilometer is the "official" OpenStack project, but the reality is that there are two other projects that also do telemetry-related things:
Stacktach predates Ceilometer, and is also Python-based. It doesn't have a REST-ful API, but it does process notifications. It's focused on processing an API session from start to finish, so it can be used as a debugger in addition to doing telemetry.
Monasca is based on Apache Storm and Kafka, and uses Scala and Java to do some similar things to Ceilometer.
All three -- Ceilometer, Stacktach and Monasca -- are viable solutions for similar problems. "We were at a crossroads," Jay explained. "We had to say, 'if you want to work on telemetry, work on Ceilometer.' And that's not conducive to growing an ecosystem."
Another example is in the area of deployment. TripleO is the official OpenStack deployment program, but when it was "blessed", the TC disregarded the fact that most of the world deploys OpenStack using Puppet, Chef, Ansible, Saltstack, and so on, rather than the image-based approach that TripleO uses. "They're all perfectly good ways to do it. What makes TripleO so much better and more deserving of being in the openstack code namespace compared to the Chef cookbooks or Puppet modules that live in Stackforge?"
Reaching the breaking point
Jay sees two things wrong with this approach. In addition to forcing the TC to bless one project over others when there wasn't any reason to, the process created a situation where graduating from incubation was the only way to "be" OpenStack, because even Stackforge is generally seen as "not really OpenStack". Only "integrated" makes the cut. "Projects had to do all this stuff, and 'hawk their wares', so to speak, in front of the TC -- the OpenStack High Court, as I call it," Jay said, "and the TC would then pass judgement."
What brought the situation to a head was the incubation process for the OpenStack Queuing project, Zaqar (formerly known as Marconi). The project underwent no fewer than three incubation rounds, during which the TC kept changing what was being asked of the project. "We were holding them to a standard no other project had to uphold," Jay said. "Basically we got bogged down in a bunch of email threads and IRC conversations that went round and round and never got anywhere."
And that led to the real issue. "Projects had no objective set of requirements for entering the openstack code namespace and being ‘part of OpenStack’."
The Big Tent
The new approach involves enabling virtually all projects to "be" OpenStack, giving them access to the same resources as everyone else, including the mailing lists, the shared infrastructure team and systems, the shared documentation team, and the same release management team.
Rather than having a binary approach where a project is either integrated or it's not, projects will have multiple "tags" that provide information about them. Some may be quite coarse-grained, such as "integrated-release" (a temporary tag to be used for the Kilo release cycle to indicate projects that are in the legacy integrated release) or "compute", "networking", or "storage". Others will be more fine-grained and focused such as "docs-api-partial" or "has-rest-api".
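To make the tag idea concrete, here is a minimal sketch in Python of how tags could replace a single integrated/not-integrated flag. The `Project` class, the `with_tags` helper, the "telemetry" tag, and the tag assignments themselves are hypothetical and purely illustrative; they are not the TC's actual data model or tooling.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    """A project described by a set of tags rather than one yes/no status."""
    name: str
    tags: set[str] = field(default_factory=set)


# Illustrative tag assignments only; real tags are defined by the TC.
PROJECTS = [
    Project("nova", {"integrated-release", "compute", "has-rest-api"}),
    Project("swift", {"integrated-release", "storage", "has-rest-api"}),
    Project("stacktach", {"telemetry"}),  # "telemetry" is a made-up tag
]


def with_tags(projects, *required):
    """Return the names of projects that carry every tag in `required`."""
    wanted = set(required)
    return [p.name for p in projects if wanted <= p.tags]
```

Calling `with_tags(PROJECTS, "storage", "has-rest-api")` selects only the projects that carry both tags, so a consumer of this information can ask several independent questions of the same project list instead of relying on one binary label.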
Of course there are still SOME barriers to entry, Jay points out. The resolution provides four examples of requirements for projects:
They align with the OpenStack Mission: the project should help further the OpenStack mission, by providing a cloud infrastructure service, or directly building on an existing OpenStack infrastructure service
They follow the OpenStack way: open source (licensing), open community (leadership chosen by the contributors to the project), open development (public reviews on Gerrit, core reviewers, gate, assigned liaisons), and open design (direction discussed at Design Summit and/or on public forums)
They ensure basic interoperability (API services should support at least Keystone)
They submit to the OpenStack Technical Committee oversight
But with a change this big, there are bound to be concerns.
Downsides to the Big Tent approach
Concerns about the Big Tent approach fall into several categories:
Cross project overloading
One concern, voiced by Doug Hellmann, was about overloading cross-project teams such as documentation, infra, and so on. These teams have already been struggling to keep up with the mass of new integrated projects; what would happen when every project was on equal footing? To solve this problem, the TC has decided to restructure those teams to enable new projects to do things for themselves, rather than do everything for them. Every project must have documented "cross-project liaisons," and must work to set up its own CI systems, write its own docs, and so on. The current cross-project teams will then focus on helping these projects become self-sufficient.
Testing
Another major -- and justifiable -- concern was that this would make OpenStack incredibly complicated to test. "But then," Jay pointed out, "it is already. Right now, all integrated projects are tested against all other integrated projects, but that doesn't make sense. Nova consumes Keystone, not the other way around. This process doesn't need to be transitive. The gate needs to be restructured around these dependencies, but a technical problem shouldn't affect policy."
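Jay's point about non-transitive testing can be sketched as a small dependency lookup. The following Python is only an illustration under stated assumptions: the `CONSUMES` map and the `gate_targets` function are hypothetical names, the map is a simplified example, and none of this reflects how the actual OpenStack gate is implemented.

```python
# Hypothetical, simplified "consumes" map: each project lists the
# services it calls directly. Illustration only.
CONSUMES = {
    "keystone": [],
    "glance": ["keystone"],
    "neutron": ["keystone"],
    "nova": ["keystone", "glance", "neutron"],
}


def gate_targets(changed, consumes):
    """Projects to test when `changed` is modified.

    Only the changed project and its direct consumers are included;
    the lookup deliberately does not follow dependencies transitively.
    """
    consumers = [name for name, deps in consumes.items() if changed in deps]
    return sorted({changed, *consumers})
```

Under this map, a change to Keystone would trigger tests for Keystone and everything that consumes it, while a change to Nova would trigger only Nova's own tests, because nothing in the map consumes Nova.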
Release cadence
But what about the (in)famous six-month release cadence? How can this possibly be maintained without defining the projects that are part of it? Swift was never on the same cadence as OpenStack and nobody died, but surely that's the exception, isn't it?
It turns out that this is less of a concern than one might think. Generally, users fall into one of two categories: release-based, and time-based.
A large contingent of OpenStack users is release-based, which means that they deploy the integrated release, and all of the pieces involved. "For them what matters is actually dependencies, but it's not what you think," Jay says. "You'd think that nova-server depends on keystone-server, but it doesn't. nova-server depends on the python keystone client, which then interacts with Keystone. That client works with a specific Keystone server API version, which requires a specific library. So it's all about dependencies." Those dependencies can be managed via the micro-tagging that all projects will get in this process, along with the normal dependency-tracking functionality that package management systems enable.
Other users are time-based; they are always X weeks behind trunk and build their own packages. For those people, the release cadence has never mattered; they have always tracked the dependencies on their own.
User overload
For some, OpenStack already has too many choices. Do you deploy Swift for Object Storage? Ceph? What will happen when there are multiple projects that can serve a specific need?
"There will be a set of tags specific to operations and deployment," Jay says. "For example, maybe you'd filter by 'mature' vs 'experimental' vs 'stable', or by the number of successful releases each project in a particular problem space has had."
"Ultimately, it's all a documentation issue."
The bottom line
Of course the main concern about adding multiple projects in a problem space is the potential for diluting the pool of potential developers. How will projects get critical mass, if everyone's doing their own thing?
"It's all about projects that are driven by motivated people," Jay said, "just as it is today."