6 Sep 2022


Initial tips for working with the open source community

Nikita Sobolev

Intro

Working with open-source software is now a regular part of our lives. Open-source software now holds a major market share in almost every field.

Verifying what is going on inside an application by yourself is especially important in DeFi, where transparency and openness are key philosophical factors. For example, any third-party security researcher can audit and verify the state of security of any open-source project. This creates a lot of trust among end users and other developers.

Everyone can help to improve the software they use if needed, which creates a connection between developers and forms a community around a piece of software.

It’s no surprise that these ideas are crucial elements of the Web3 and open-blockchain movements.


Are you or your company just starting with building an open-source project? Working with the open-source community might be very tricky! There are lots of things to learn about the community guidelines and distinct features of required tools.

Today, let’s break down these initial steps and see how they can really benefit your project and your community!

GitHub, as the main platform for open-source code, actually shows a checklist of these guidelines in every repository.

Let’s take the wemake-python-styleguide project as an example.

The GitHub checklist includes:

Description
Readme
Code of Conduct (CoC)
CONTRIBUTING.md
License
Issue templates
Pull request template.

In this article, we’ll go through all the items on the list and then discuss the important features that GitHub did not include as “mandatory” in the list above.
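If you want to check this list locally before pushing, a rough sketch like the one below can help. It is not how GitHub computes its checklist; the file names and search directories are my assumptions about the usual locations, and the repository description cannot be checked from the file system at all.

```python
from pathlib import Path

# Directories that are commonly searched for community files (assumption).
SEARCH_DIRS = ("", ".github", "docs")

# Checklist item -> candidate file names (assumed spellings, for illustration).
CHECKLIST = {
    "Readme": ("README.md",),
    "Code of Conduct": ("CODE_OF_CONDUCT.md",),
    "Contributing guide": ("CONTRIBUTING.md",),
    "License": ("LICENSE", "LICENSE.md"),
    "Pull request template": (
        "PULL_REQUEST_TEMPLATE.md",
        "pull_request_template.md",
    ),
}


def missing_items(repo_root: str) -> list[str]:
    """Return the checklist items for which no candidate file exists."""
    root = Path(repo_root)
    missing = []
    for item, names in CHECKLIST.items():
        found = any(
            (root / directory / name).is_file()
            for directory in SEARCH_DIRS
            for name in names
        )
        if not found:
            missing.append(item)
    # Issue templates live in a directory rather than in a single file.
    if not (root / ".github" / "ISSUE_TEMPLATE").is_dir():
        missing.append("Issue templates")
    return missing
```

Calling missing_items(".") from the root of a checkout returns the items that still need attention, in the order of the checklist.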

Issues and Pull Requests

Issues and Pull Requests are the very essentials of any software project.

Without these two forms of communication, you absolutely cannot build a community project.

But there are many pitfalls in how to organize this communication.

Luckily, there are features and techniques to help you!

Forms / Templates

The first problem any developer faces with issues is deciding what information a user must provide to fix a bug or start a new feature discussion.

To specify what information we need, we can use GitHub issue templates and/or forms. Here are some examples to help you get started:

Bug form and template
Rule request template

The difference is that users can ignore all the fields of a template, while a form can have required fields.
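Because a plain template cannot enforce anything, some maintainers check incoming issues themselves. Here is a minimal sketch of that idea, assuming the template uses Markdown headings for its sections. The function name and the section titles in the usage example are hypothetical.

```python
import re


def empty_sections(body: str, required: list[str]) -> list[str]:
    """Return the required section titles that are absent or left blank."""
    # Template hints are usually HTML comments, so they do not count as content.
    body = re.sub(r"<!--.*?-->", "", body, flags=re.DOTALL)

    sections: dict[str, str] = {}
    current = None
    for line in body.splitlines():
        heading = re.match(r"^#{1,6}\s+(.*\S)\s*$", line)
        if heading:
            current = heading.group(1)
            sections[current] = ""
        elif current is not None:
            sections[current] += line.strip()
    return [title for title in required if not sections.get(title)]


issue_body = """## What's wrong

The linter crashes on start.

## How it should be

<!-- Describe the expected behavior -->
"""
print(empty_sections(issue_body, ["What's wrong", "How it should be"]))
# prints ['How it should be']
```

A form with required fields makes this kind of check unnecessary, which is one good reason to prefer forms for bug reports.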

Documentation:

I want to point out that you can and should add labels to templates:

---
name: Rule request
about: Request a new rule to be checked
labels: 'rule request', 'feature'
---

Specified labels will be automatically used for the created issue. Why is this helpful?

It can help maintainers easily filter different issue types based on these labels.

You can also attach additional links to outside resources on your project’s “Create new issue” page. For example, see https://github.com/python/cpython/blob/main/.github/ISSUE_TEMPLATE/config.yml and https://github.com/python/cpython/issues/new/choose

This might be helpful if you want to highlight existing external resources like support forums, a community Discord, etc.

Similarly, you can create a template for Pull Requests — example.

It usually asks contributors to check that they did everything correctly before submitting the code itself.

It usually mirrors the practical items from CONTRIBUTING.md (which we are going to discuss below) as closely as possible.

Saved replies

The next problem that overwhelms open-source maintainers is that the same discussions arise multiple times!

A company can develop shared text responses to recurring problems:

Spam
Questions out of scope
Incomplete bug reports
Bugs that are impossible to recreate
Features that will not be implemented.

To handle such cases, the company keeps a private repository with a list of prepared and approved answers.

The algorithm is simple:

A developer notices a recurring problem.
They create an issue in a private repository.
The company signs off on it and creates a file with the ready answer, which is easy to find.
Done.

The file may also contain actions that need to be performed — for example, delete a comment, ban a user, etc.

Some frequent answers can be saved into your profile: https://docs.github.com/en/get-started/writing-on-github/working-with-saved-replies/about-saved-replies

For example, how to fight spam? My “saved reply”:

Hi, I will assume this was an accident because this entry looks like spam. Maybe you mis-posted this by accident?

In any case, I will delete this entry for now. In case this was actually spam, we will have to block you at the next attempt to post something unrelated because it will consume the unpaid time of our open-source contributors.

Use prepared, well-phrased replies to solve recurring problems and save yourself some time for things that matter.

Code Review practices

Code reviews are very hard because this is the place where engineering and empathy skills must work together — especially when members of your team come from different cultural and linguistic backgrounds.

It is easy to make some harsh comments just by accident.

I use and recommend the “semantic code review” style: https://conventionalcomments.org/

This method conveys your intent with high precision, which makes your code review comments very easy to interpret.
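To give a feeling for the format, here is a small sketch that builds a comment in the “label (decorations): subject” shape described on the Conventional Comments site. The helper name is hypothetical, and the label set is my assumption based on that site, so it may not be complete.

```python
# Labels assumed from the Conventional Comments site; extend as needed.
LABELS = {
    "praise", "nitpick", "suggestion", "issue", "todo",
    "question", "thought", "chore", "note",
}


def conventional_comment(label, subject, decorations=(), discussion=""):
    """Format a review comment as '**label (decorations):** subject'."""
    if label not in LABELS:
        raise ValueError(f"unknown label: {label}")
    suffix = f" ({', '.join(decorations)})" if decorations else ""
    text = f"**{label}{suffix}:** {subject}"
    if discussion:
        # Optional longer explanation goes into a separate paragraph.
        text += f"\n\n{discussion}"
    return text


print(conventional_comment(
    "suggestion",
    "Let's use a set here.",
    decorations=("non-blocking",),
))
# prints **suggestion (non-blocking):** Let's use a set here.
```

The label tells the author what kind of feedback this is, and the decoration tells them whether it blocks the merge. That is the part that removes most of the guesswork.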

Additionally, there’s a list of things that you should do:

Use empathy
Offer help
Say “Thank you,” be polite
Be welcoming, be someone you want to work with
Automate all the things. It is easier to get feedback from robots than from humans
Ask questions, propose alternatives
Educate and learn from others
Criticize the solution, not its author.

And things that you totally should not do:

Assume that everyone knows the same set of things as you do.
Assume that everyone thinks and does things the same way as you do.
Criticize or blame someone.

It is also important to understand that “how” you say something matters just as much as “what” you say. Good “Tone of Voice” guidelines for developers:

This set of simple rules can level up your code review practices and make people happier.

Codeowners

Sometimes new contributors face a problem with some parts of the code, but they don’t know who to ask for help or review.

If you add the .github/CODEOWNERS file to the repository, you can specify which user reviews a specific part of the application. This is especially important for extensive projects when different people or teams work on different parts of the application. Documentation is here.

You can even make the owners’ review “mandatory” — example.
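To illustrate how such a file maps paths to reviewers, here is a simplified sketch. It assumes that each line holds a pattern followed by its owners and that the last matching line wins. The matching below is a rough approximation with fnmatch, not GitHub’s real pattern rules, and the user and team names are made up.

```python
from fnmatch import fnmatchcase


def parse_codeowners(text):
    """Turn CODEOWNERS-like text into a list of (pattern, owners) rules."""
    rules = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        pattern, *owners = line.split()
        rules.append((pattern, owners))
    return rules


def _matches(path, pattern):
    """Very rough matching: directory prefixes and shell-style wildcards."""
    if pattern.endswith("/"):
        return path.startswith(pattern.lstrip("/"))
    return fnmatchcase(path, pattern.lstrip("/"))


def owners_for(path, rules):
    """Return the owners from the last rule that matches the path."""
    owners = []
    for pattern, rule_owners in rules:
        if _matches(path, pattern):
            owners = rule_owners  # later rules override earlier ones
    return owners


rules = parse_codeowners("""
# Hypothetical owners, for illustration only
*           @example-org/core
*.md        @docs-team
/src/api/   @alice @bob
""")
print(owners_for("src/api/views.py", rules))  # prints ['@alice', '@bob']
print(owners_for("README.md", rules))         # prints ['@docs-team']
```

The practical takeaway is about ordering: put the general rules first and the specific ones last.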

Security

Security problems should not be reported publicly because, otherwise, bad actors might use disclosed security problems to their advantage.

GitHub provides the minimum functionality needed for handling security issues.

The owner of the repository needs to set up:

Security policy: a file at .github/SECURITY.md that describes what to do if a user has found a critical bug in the project’s security. The classic solution is to write to a specific email address instead of publishing the report openly. Example is here.
Security advisories: you can request a CVE for known and fixed vulnerabilities.
Dependabot alerts: notifications from GitHub when one of your dependencies gets a new CVE.
Code scanning alerts: created automatically when using CodeQL.

These four simple steps can cover security needs for your project.

Also, make sure that you follow in-code security best practices and use special tools as a part of your CI.

Automation

If something can be automated, it probably should be!

GitHub Apps and Actions play a big role in how issues and PRs are managed. You can use existing automation or even write your own. For example:

You can add Stale Bot, which will close old and inactive issues and PRs (example)
You can add CLA Bot if contributors need to sign a CLA before sending the code off (example)
You can automatically add the necessary labels to issues: https://github.com/marketplace/issue-label-bot
You can automatically choose a reviewer for the PR based on certain criteria or even at random: https://github.com/marketplace/actions/reviewer-lottery
You can write automation using GitHub Actions or Danger: https://github.com/danger
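If you decide to write your own automation, the core logic is often tiny. As an illustration, here is a sketch of the decision a stale-issue bot has to make. This is not Stale Bot’s actual implementation; the 60-day threshold, the exempt labels, and the shape of the issue dictionaries are all assumptions.

```python
from datetime import datetime, timedelta, timezone


def find_stale(issues, now, max_idle_days=60, exempt_labels=("pinned", "security")):
    """Return the numbers of issues with no activity for max_idle_days."""
    cutoff = now - timedelta(days=max_idle_days)
    return [
        issue["number"]
        for issue in issues
        if issue["updated_at"] < cutoff
        # Issues carrying an exempt label are never considered stale.
        and not set(issue["labels"]) & set(exempt_labels)
    ]


now = datetime(2022, 9, 6, tzinfo=timezone.utc)
issues = [
    {"number": 1, "updated_at": datetime(2022, 8, 30, tzinfo=timezone.utc), "labels": []},
    {"number": 2, "updated_at": datetime(2022, 1, 10, tzinfo=timezone.utc), "labels": ["bug"]},
    {"number": 3, "updated_at": datetime(2021, 12, 1, tzinfo=timezone.utc), "labels": ["security"]},
]
print(find_stale(issues, now))  # prints [2]
```

A real bot would usually warn first and close later, so that people get a chance to respond.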

Unwanted behavior

Now, let’s switch our focus to people because, without users, your community project won’t be successful.

And working with people is hard!

Different problematic situations and even conflicts will happen. That’s the sad truth.

Thankfully, there are existing guides on how to handle, prevent, and resolve conflicts when they arise.

These guides are called “Code of Conduct,” and they are priceless. No surprise that most of the biggest open-source projects already have one.

Examples of a good Code of Conduct:

https://github.com/python/.github/blob/master/CODE_OF_CONDUCT.md and https://www.python.org/psf/conduct/

Scope: GitHub, mailing lists, conferences.

Why is it considered good?

Examples of undesirable behavior are written out in as much detail as possible. They reflect a wide range of historical incidents that have been incorporated into the rules.

Availability of a public list of “violations”: https://pythondev.readthedocs.io/diversity.html#python-code-of-conduct-bans

Open discussion of problems, for example: https://discuss.python.org/t/discussion-about-recent-coc-events/5778

Adaptation to various communication channels.

https://contributor-covenant.org

Why is it considered good?

It covers a wide range of problems.

It serves a wide range of open-source communities.

It draws on a lot of real-world experience and keeps improving across multiple versions.

It’s worth pointing out that you can create one CoC for all the projects in the organization — for example, https://github.com/python/.github/blob/master/CODE_OF_CONDUCT.md

Inclusive code

Another important part of making your project a more welcoming place is to use inclusive language in your codebase: variable names, comments, configuration options, etc.

Here’s a link to some good guidelines on this topic: https://chromium.googlesource.com/chromium/src/+/master/styleguide/inclusive_code.md

README.md

Your readme is an entry point to your project.

It is the first thing each person sees.

It is important to make a good first impression!

A good README.md has:

Clear purpose of the project
Demo/code example
Next steps.

An example of a good README.md: https://github.com/wemake-services/wemake-python-styleguide

There are many styles, guides, and ready-made templates on the internet. You can choose what you prefer:

I want to point out that you can also make a README.md file on the organization level, for example:

It is useful for highlighting common things: lists of important links, communication channels, featured projects, support, etc.

CONTRIBUTING.md

Each open-source project has different rules and processes for contributing.

How can a newcomer know what to do in each case?

CONTRIBUTING.md solves just this problem!

This file describes what needs to be done to send a PR. It can be very simple: https://github.com/wemake-services/wemake-python-styleguide/blob/master/CONTRIBUTING.md

Or it can be as big as a book:

What should you aim for when writing your own contributing guide? Key criteria:

Clarity of steps
Depth: how many typical problems are described there
Practicality: It needs to have commands/snippets for primary tasks: release, running tests, build, etc.

It often shows the exact way to create a PR: what to include and what not to include.

Additionally, rules for core developers are often described: how to merge branches, how to rename commits when squashing, etc. Example: https://github.com/python/mypy/blob/master/CONTRIBUTING.md

With a good CONTRIBUTING.md, it is much easier to attract new contributors.

Licenses

Licenses are complex. Choosing a license for each specific project in a specific host country requires lengthy, in-depth consultations with a specialized lawyer.

You need to pay extra attention to:

Possibility of using your code in private projects
Possibility of other people using your code in their patents
Possibility of using your code “as a service” for other companies (typical example: database-as-a-service).

Or you can use a service like https://choosealicense.com.

The most popular choices are MIT, BSD, Apache, and GPL.

Releases

After all the steps above are completed — code is written and PRs are sent, reviewed, and merged — it is time to make a release.

It is very important to store the release information in the project’s repo. Why?

To keep the community up to date with new releases. GitHub even has a special “Releases only” subscription mode for a repository. It’s useful for those who follow updates.
To describe technical changes: API/ABI/plugins/build/etc. — everything that can be interesting for developers.
Otherwise, it would be very hard for your users to understand: what is released, what is not, which version is the latest, etc.

GitHub has built-in support for git tags and Releases. Examples:

The most popular format for keeping track of such changes is a CHANGELOG.md file written in the “Keep a Changelog” text format: https://keepachangelog.com/en/1.0.0/
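To show what a release entry in this format looks like, here is a small sketch that renders one. The function name and the sample changes are hypothetical. The version heading and the section names (Added, Changed, Deprecated, Removed, Fixed, Security) follow the Keep a Changelog convention.

```python
# Change types in the order used by Keep a Changelog.
SECTION_ORDER = ("Added", "Changed", "Deprecated", "Removed", "Fixed", "Security")


def render_release(version, date, changes):
    """Render one release as a Keep a Changelog style Markdown entry."""
    lines = [f"## [{version}] - {date}"]
    for section in SECTION_ORDER:
        entries = changes.get(section, [])
        if entries:
            lines.append("")  # blank line before each section heading
            lines.append(f"### {section}")
            lines.extend(f"- {entry}" for entry in entries)
    return "\n".join(lines)


print(render_release(
    "1.1.0",
    "2022-09-06",
    {"Fixed": ["Crash on empty input"], "Added": ["New rule for nested imports"]},
))
```

The printed entry starts with “## [1.1.0] - 2022-09-06”, followed by an “Added” section and then a “Fixed” section, regardless of the order in which the changes were passed in.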

Another question is: How can you generate the next version number?

There are several versioning systems (in order of popularity/applicability):

There is an option for auto-generating CHANGELOG.md and creating a release from commit messages: https://github.com/semantic-release/semantic-release

But this is a personal preference and is totally not required.
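For the curious, the idea of deriving the next version from commit messages can be sketched in a few lines. This is only an illustration and not how semantic-release is implemented. It assumes a three-part MAJOR.MINOR.PATCH version and commit messages with “feat:” and “fix:” prefixes, where “!” or a “BREAKING CHANGE:” footer marks an incompatible change.

```python
import re

# Matches headers like "feat: ...", "fix(parser): ..." or "feat!: ...".
HEADER = re.compile(r"^(?P<type>\w+)(\([^)]*\))?(?P<breaking>!)?:")


def next_version(current, commit_messages):
    """Return the next version implied by the given commit messages."""
    major, minor, patch = (int(part) for part in current.split("."))
    level = 0  # 0 = no release, 1 = patch, 2 = minor, 3 = major
    for message in commit_messages:
        match = HEADER.match(message)
        if match is None:
            continue  # messages that do not follow the convention are ignored
        if match["breaking"] or "BREAKING CHANGE:" in message:
            level = max(level, 3)
        elif match["type"] == "feat":
            level = max(level, 2)
        elif match["type"] == "fix":
            level = max(level, 1)
    if level == 3:
        return f"{major + 1}.0.0"
    if level == 2:
        return f"{major}.{minor + 1}.0"
    if level == 1:
        return f"{major}.{minor}.{patch + 1}"
    return current


print(next_version("1.4.2", ["fix: handle empty files", "docs: fix a typo"]))
# prints 1.4.3
print(next_version("1.4.2", ["feat(rules): add a new check", "fix: typo"]))
# prints 1.5.0
```

The price of this automation is discipline: every contributor has to follow the commit message convention, which is one more thing to describe in CONTRIBUTING.md.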

GitHub Package Registry and GitHub Container Registry are also worth mentioning here. These are artifact repositories for your containers or language-specific packages. They are needed not only as “technical” places to store the cache and intermediate artifacts; they should also be considered official places for publishing releases.

Discussions

If the repository has a lot of questions about its usage, there are two strategies:

Creating your own tag on Stack Overflow (if someone has the reputation required to do so)
Enabling GitHub Discussions to hold the discussion alongside the code.

Usually, the choice is made based on the amount of activity: If the discussion has already begun on SO, it is more logical to continue there. If the discussion is on GitHub, there is no point in moving it elsewhere.

Sometimes, people use “Discussions” for support questions and new feature ideas while using “Issues” for bug reports only.

Conclusion

We have covered most of the initial tips you need when starting your open-source journey: everything from submitting ideas to releasing their implementation.

But that’s not everything!

Building and managing a healthy open-source community requires a lot of knowledge, passion, empathy, and dedication. Consult other maintainers, take part as a contributor in large projects, and learn from them!

And don’t forget to have fun.
