First, thank you for contributing to Vector! The goal of this document is to provide everything you need to start contributing to Vector. The following TOC is sorted progressively, starting with the basics and expanding into more specifics. Everyone from a first-time contributor to a Vector team member will find this document useful.
- You're familiar with GitHub and the pull request workflow.
- You've read Vector's docs.
- You know about the Vector community. Please use this for help.
- Ensure your change has an issue! Find an existing issue or open a new issue.
- This is where you can get a feel if the change will be accepted or not. Changes that are questionable will have a `needs: approval` label.
- Once approved, fork the Vector repository in your own GitHub account (only applicable to outside contributors).
- Create a new Git branch.
- Make your changes.
- Submit the branch as a pull request to the main Vector repo. A Vector team member should comment and/or review your pull request within a few days. Although, depending on the circumstances, it may take longer.
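The fork-and-branch steps above can be sketched as two small shell helpers. This is a minimal sketch, not part of Vector's tooling: `start_contribution` and `publish_branch` are hypothetical names, and the sketch assumes you have already forked the repository on GitHub.

```shell
# Hypothetical helpers, not part of Vector's tooling.

# Clone your fork and start a descriptively named branch.
# $1 = URL of your fork (assumed to exist already), $2 = branch name.
start_contribution() {
  git clone "$1" vector &&
    cd vector &&
    git checkout -b "$2"
}

# After committing your changes, publish the branch to your fork so that
# a pull request can be opened from it.
# $1 = branch name.
publish_branch() {
  git push --set-upstream origin "$1"
}
```

For example, you might call `start_contribution` with your fork's URL and a branch name such as `fix-tcp-source-timeout` (both placeholders), then `publish_branch fix-tcp-source-timeout` once your commits are ready.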
If you're thinking of contributing a new source, sink, or transform to Vector, thank you, that's way cool! The answers to the questions below are required for each newly proposed component and, depending on the answers, we may elect not to include the proposed component. If you're having trouble with any of the questions, we're available to help you.
Prior to beginning work on a new source or sink, if a GitHub Issue does not already exist, please open one to discuss the introduction of the new integration. Maintainers will review the proposal with the following checklist in mind. Try to address these points when sharing your proposal to reduce the time it takes to review. This list is not exhaustive and may be updated over time.
- Can the proposed component’s functionality be replicated by an existing component, with a specific configuration? (ex: Azure Event Hub as a `kafka` sink configuration)
- Alternatively implemented as a wrapper around an existing component. (ex. `axiom` wrapping `elasticsearch`)
- Can an existing component replicate the proposed component’s functionality, with non-breaking changes?
- Can an existing component be rewritten in a more generic fashion to cover both the existing and proposed functions?
- Is the proposed component generically usable or is it specific to a particular service?
- How established is the target of the integration, what is the relative market share of the integrated service?
- Is there sufficient demand for the component?
- If the integration can be served with a workaround or more generic component, how painful is this for users?
- Is the contribution from an individual or the organization owning the integrated service? (examples of organization backed integrations: `databend` sink, `axiom` sink)
- Is the contributor committed to maintaining the integration if it is accepted?
- What is the overall complexity of the proposed design of this integration from a technical and functional standpoint, and what is the expected ongoing maintenance burden?
- How will this integration be tested and QA’d for any changes and fixes?
- Will we have access to an account with the service if the integration is not open source?
To merge a new source, sink, or transform, the pull request is required to:
- Add tests, especially integration tests if your contribution connects to an external service.
- Add instrumentation so folks using your integration can get insight into how it's working and performing. You can see some examples of instrumentation in existing integrations.
- Add documentation. You can see examples in the `docs` directory.
When adding new integration tests, the following changes are needed in the GitHub Workflows:
- In `.github/workflows/integration.yml`, add another entry in the matrix definition for the new integration.
- In `.github/workflows/integration-comment.yml`, add another entry in the matrix definition for the new integration.
- In `.github/workflows/changes.yml`, add a new filter definition for files changed, update the `changes` job outputs to reference the filter, and finally update the outputs of `workflow_call` to include the new filter.
All changes must be made in a branch and submitted as pull requests. Vector does not adopt any type of branch naming style, but please use something descriptive of your changes.
Please ensure your commits are small and focused; they should tell a story of your change. This helps reviewers to follow your changes, especially for more complex changes.
Once your changes are ready you must submit your branch as a pull request.
The pull request title must follow the format outlined in the conventional commits spec.
Conventional commits is a standardized
format for commit messages. Vector only requires this format for commits on
the `master` branch. Because Vector squashes commits before merging
branches, only the pull request title must conform to this
format. Vector performs a pull request check to verify the pull request title
in case you forget.
A list of allowed sub-categories is defined here.
The following are all good examples of pull request titles:
feat(new sink): new `xyz` sink
feat(tcp source): add foo bar baz feature
fix(tcp source): fix foo bar baz bug
chore: improve build process
docs: fix typos
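To sanity-check a title before opening a pull request, a rough local check could look like the following. This is a simplified sketch, not the check Vector's CI actually performs: the `check_title` helper is hypothetical, and the type list in the pattern only covers the types that appear in the examples above, so the real check may accept more types and enforce the allowed sub-categories.

```shell
# Simplified pattern: type, optional "(scope)", optional "!", then ": description".
# Assumption: only the types seen in the examples above are listed here.
title_pattern='^(feat|fix|chore|docs)(\([a-z0-9 _-]+\))?!?: .+'

# Hypothetical helper: prints "ok" if the title matches the pattern,
# "invalid" otherwise.
check_title() {
  if printf '%s\n' "$1" | grep -Eq "$title_pattern"; then
    echo "ok"
  else
    echo "invalid"
  fi
}

check_title 'feat(new sink): new `xyz` sink'   # prints "ok"
check_title 'Fixed a bug'                      # prints "invalid"
```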
Pull requests require the following reviews:
- No review for cosmetic changes by a maintainer, like whitespace, typos, and spelling
- One Vector team member for minor changes or trivial changes from contributors
- Two Vector team members for major changes
- Three Vector team members for RFCs
If CODEOWNERS are assigned, a review from an individual from each of the sets of owners is required.
All pull requests are squashed and merged. We generally discourage large pull requests that are over 300-500 lines of diff. If you would like to propose a larger change, we suggest coming onto our Discord server and discussing it with one of our engineers. This way we can talk through the solution and discuss if a change that large is even needed! This will produce a quicker response to the change and likely produce code that aligns better with our process.
Currently, Vector uses GitHub Actions to run tests. The workflows are defined in `.github/workflows`.
GitHub Actions is responsible for releasing updated versions of Vector through various channels.
Tests are run for all changes except those that have the label:
ci-condition: skip
Some long-running tests are only run daily, rather than on every pull request. If needed, an administrator can kick off these tests manually via the button on the nightly build action page.
Historically, we've had some trouble with tests being flakey. If your PR does not have passing tests:
- Ensure that the test failures are unrelated to your change
- Is it failing on master?
- Does it fail if you rerun CI?
- Can you reproduce locally?
- Find or open an issue for the test failure (example)
- Link the PR in the issue for the failing test so that there are more examples
You can invoke the test harness by commenting on any pull request with:
/test -t <name>
To run tests locally, use `cargo vdev`.
Unit tests can be run by calling `cargo vdev test`.
Integration tests are not run by default when running `cargo vdev test`. Instead, they are accessible via the integration subcommand (example: `cargo vdev int test aws` runs aws-related integration tests). You can find the list of available integration tests using `cargo vdev int show`. Integration tests require docker or podman to run.
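The commands above can be combined into a single local test run. The following is a minimal sketch that assumes it is run from the root of a Vector checkout; `run_local_tests` is a hypothetical helper name, and only the `cargo vdev` subcommands already mentioned above are used.

```shell
# Hypothetical helper: run the unit tests, then optionally one integration suite.
# $1 (optional) = integration name as listed by `cargo vdev int show`, e.g. aws.
run_local_tests() {
  cargo vdev test || return 1
  if [ -n "${1:-}" ]; then
    # Integration tests need docker or podman to be available.
    cargo vdev int test "$1"
  fi
}
```

Calling `run_local_tests` with no argument runs only the unit tests, while `run_local_tests aws` also runs the aws-related integration tests.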
There are other checks that are run by CI before the PR can be merged. These should be run locally first to ensure they pass.
# Run the Clippy linter to catch common mistakes.
cargo vdev check rust
# Ensure all code is properly formatted. Code can be run through `rustfmt` using `cargo fmt` to ensure it is properly formatted.
cargo vdev check fmt
# Ensure the internal metrics that Vector emits conform to standards.
cargo vdev check events
# Ensure the `LICENSE-3rdparty.csv` file is up to date with the licenses each of Vector's dependencies are published under.
cargo vdev check licenses
# Vector's documentation for each component is generated from the comments attached to the Component structs and members.
# Running this ensures that the generated docs are up to date.
make check-component-docs
# Generate the code documentation for the Vector project.
# Run this to ensure the docs can be generated without errors (warnings are acceptable at the minute).
cd rust-doc && make docs
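The checks above can be chained so that the first failure stops the run. This is a minimal sketch that assumes it is run from the root of a Vector checkout; `run_pr_checks` is a hypothetical name, and the commands are exactly those listed above.

```shell
# Hypothetical helper: run every pre-merge check in order and stop at the
# first one that fails. The function's exit status is that of the last
# command run.
run_pr_checks() {
  cargo vdev check rust &&
    cargo vdev check fmt &&
    cargo vdev check events &&
    cargo vdev check licenses &&
    make check-component-docs &&
    # The subshell keeps the caller's working directory unchanged.
    (cd rust-doc && make docs)
}
```

A typical use would be `run_pr_checks && git push`, so that the branch is only pushed when every check passes.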
When deprecating functionality in Vector, see DEPRECATION.md.
When adding, modifying, or removing a dependency in Vector you may find that you need to update the
inventory of third-party licenses maintained in `LICENSE-3rdparty.csv`. This file is generated using
dd-rust-license-tool and can be updated using `cargo vdev build licenses`.
As discussed in the README, you should continue to the following
documents:
- DEVELOPING.md - Everything necessary to develop
- DOCUMENTING.md - Preparing your change for Vector users
- DEPRECATION.md - Deprecating functionality in Vector
To protect all users of Vector, the following legal requirements are made. If you have additional questions, please contact us.
Vector requires all contributors to sign the Contributor License Agreement (CLA). This gives Vector the right to use your contribution and ensures that you own your contributions and can use them for other purposes.
The full text of the CLA can be found at https://cla.datadoghq.com/vectordotdev/vector.
This is covered by the CLA.