[WIP] Add tracing with OpenTelemetry #932
Open
meobilivang wants to merge 18 commits into kubernetes-sigs:main from meobilivang:add-tracing-otel
Changes from 17 commits
Commits
1ad8b12 install Go otel packages (meobilivang)
f955a09 set up TracerProvider + tracer object (meobilivang)
dcfb15b add enableTracing flag (meobilivang)
49c1ea2 instrument GCP machine + GCPCluster controllers (meobilivang)
deb44dc helm charts for jaeger all-in-one + otel collector (meobilivang)
bb98f0e templates to set up dev env (meobilivang)
bc4ea9f instrument cloud/scope (meobilivang)
eb25369 instrument exp/controllers (meobilivang)
1b98f8e instrument cloud/services/compute (meobilivang)
ea54fe7 instrument cloud/services/container (meobilivang)
560fd0a add sampling rate (meobilivang)
ba6261e Tilt file (meobilivang)
173a97f debug blocking tracing connection (meobilivang)
f4a0554 helm charts for jaeger + otel collector (meobilivang)
fc2812d bump go package (meobilivang)
e4da793 bump packages (meobilivang)
085b48f Merge branch 'main' into add-tracing-otel (meobilivang)
bbc5bf5 reformat tilt file (meobilivang)
@@ -5,10 +5,12 @@ tools_bin = "./hack/tools/bin"
 kubectl_cmd = "./hack/tools/bin/kubectl"
 kind_cmd = "./hack/tools/bin/kind"

-#Add tools to path
+# Add tools to path
 os.putenv("PATH", os.getenv("PATH") + ":" + tools_bin)

-update_settings(k8s_upsert_timeout_secs = 60) # on first tilt up, often can take longer than 30 seconds
+update_settings(
+    k8s_upsert_timeout_secs=60
+)  # on first tilt up, often can take longer than 30 seconds

 # set defaults
 settings = {
@@ -26,10 +28,12 @@ settings = {
 keys = ["GCP_B64ENCODED_CREDENTIALS"]

 # global settings
-settings.update(read_json(
-    "tilt-settings.json",
-    default = {},
-))
+settings.update(
+    read_json(
+        "tilt-settings.json",
+        default={},
+    )
+)

 if settings.get("trigger_mode") == "manual":
     trigger_mode(TRIGGER_MODE_MANUAL)
@@ -40,36 +44,61 @@ if "allowed_contexts" in settings:
 if "default_registry" in settings:
     default_registry(settings.get("default_registry"))

+
 # deploy CAPI
 def deploy_capi():
     version = settings.get("capi_version")
-    capi_uri = "https://github.com/kubernetes-sigs/cluster-api/releases/download/{}/cluster-api-components.yaml".format(version)
-    cmd = "curl -sSL {} | {} | {} apply -f -".format(capi_uri, envsubst_cmd, kubectl_cmd)
-    local(cmd, quiet = True)
+    capi_uri = "https://github.com/kubernetes-sigs/cluster-api/releases/download/{}/cluster-api-components.yaml".format(
+        version
+    )
+    cmd = "curl -sSL {} | {} | {} apply -f -".format(
+        capi_uri, envsubst_cmd, kubectl_cmd
+    )
+    local(cmd, quiet=True)
     if settings.get("extra_args"):
         extra_args = settings.get("extra_args")
         if extra_args.get("core"):
             core_extra_args = extra_args.get("core")
             if core_extra_args:
                 for namespace in ["capi-system"]:
-                    patch_args_with_extra_args(namespace, "capi-controller-manager", core_extra_args)
+                    patch_args_with_extra_args(
+                        namespace, "capi-controller-manager", core_extra_args
+                    )
         if extra_args.get("kubeadm-bootstrap"):
             kb_extra_args = extra_args.get("kubeadm-bootstrap")
             if kb_extra_args:
-                patch_args_with_extra_args("capi-kubeadm-bootstrap-system", "capi-kubeadm-bootstrap-controller-manager", kb_extra_args)
+                patch_args_with_extra_args(
+                    "capi-kubeadm-bootstrap-system",
+                    "capi-kubeadm-bootstrap-controller-manager",
+                    kb_extra_args,
+                )

+
 def patch_args_with_extra_args(namespace, name, extra_args):
-    args_str = str(local("{} get deployments {} -n {} -o jsonpath={{.spec.template.spec.containers[0].args}}".format(kubectl_cmd, name, namespace)))
+    args_str = str(
+        local(
+            "{} get deployments {} -n {} -o jsonpath={{.spec.template.spec.containers[0].args}}".format(
+                kubectl_cmd, name, namespace
+            )
+        )
+    )
     args_to_add = [arg for arg in extra_args if arg not in args_str]
     if args_to_add:
         args = args_str[1:-1].split()
         args.extend(args_to_add)
-        patch = [{
-            "op": "replace",
-            "path": "/spec/template/spec/containers/0/args",
-            "value": args,
-        }]
-        local("{} patch deployment {} -n {} --type json -p='{}'".format(kubectl_cmd, name, namespace, str(encode_json(patch)).replace("\n", "")))
+        patch = [
+            {
+                "op": "replace",
+                "path": "/spec/template/spec/containers/0/args",
+                "value": args,
+            }
+        ]
+        local(
+            "{} patch deployment {} -n {} --type json -p='{}'".format(
+                kubectl_cmd, name, namespace, str(encode_json(patch)).replace("\n", "")
+            )
+        )

+
 # Users may define their own Tilt customizations in tilt.d. This directory is excluded from git and these files will
 # not be checked in to version control.
@@ -78,23 +107,37 @@ def include_user_tilt_files():
     for f in user_tiltfiles:
         include(f)

-def append_arg_for_container_in_deployment(yaml_stream, name, namespace, contains_image_name, args):
+
+def append_arg_for_container_in_deployment(
+    yaml_stream, name, namespace, contains_image_name, args
+):
     for item in yaml_stream:
-        if item["kind"] == "Deployment" and item.get("metadata").get("name") == name and item.get("metadata").get("namespace") == namespace:
+        if (
+            item["kind"] == "Deployment"
+            and item.get("metadata").get("name") == name
+            and item.get("metadata").get("namespace") == namespace
+        ):
             containers = item.get("spec").get("template").get("spec").get("containers")
             for container in containers:
                 if contains_image_name in container.get("image"):
                     container.get("args").extend(args)

+
 def fixup_yaml_empty_arrays(yaml_str):
     yaml_str = yaml_str.replace("conditions: null", "conditions: []")
     return yaml_str.replace("storedVersions: null", "storedVersions: []")

+
 def validate_auth():
     substitutions = settings.get("kustomize_substitutions", {})
     missing = [k for k in keys if k not in substitutions]
     if missing:
-        fail("missing kustomize_substitutions keys {} in tilt-settings.json".format(missing))
+        fail(
+            "missing kustomize_substitutions keys {} in tilt-settings.json".format(
+                missing
+            )
+        )

+
 tilt_helper_dockerfile_header = """
 # Tilt image
@@ -113,35 +156,58 @@ COPY --from=tilt-helper /restart.sh .
 COPY manager .
 """

+
 # Build CAPG and add feature gates
 def capg():
     # Apply the kustomized yaml for this provider
     substitutions = settings.get("kustomize_substitutions", {})
     os.environ.update(substitutions)

-    # yaml = str(kustomizesub("./hack/observability")) # build an observable kind deployment by default
-    yaml = str(kustomizesub("./config/default"))
+    yaml = str(
+        kustomizesub("./hack/observability")
+    )  # build an observable kind deployment by default
+    # TODO: consider to remove
+    # yaml = str(kustomizesub("./config/default"))

     # add extra_args if they are defined
     if settings.get("extra_args"):
         gcp_extra_args = settings.get("extra_args").get("gcp")
         if gcp_extra_args:
             yaml_dict = decode_yaml_stream(yaml)
-            append_arg_for_container_in_deployment(yaml_dict, "capg-controller-manager", "capg-system", "cluster-api-gcp-controller", gcp_extra_args)
+            append_arg_for_container_in_deployment(
+                yaml_dict,
+                "capg-controller-manager",
+                "capg-system",
+                "cluster-api-gcp-controller",
+                gcp_extra_args,
+            )
             yaml = str(encode_yaml_stream(yaml_dict))
             yaml = fixup_yaml_empty_arrays(yaml)

     # Set up a local_resource build of the provider's manager binary.
     local_resource(
         "manager",
-        cmd = 'mkdir -p .tiltbuild;CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -ldflags \'-extldflags "-static"\' -o .tiltbuild/manager',
-        deps = ["api", "cloud", "config", "controllers", "exp", "feature", "pkg", "go.mod", "go.sum", "main.go"],
+        cmd="mkdir -p .tiltbuild;CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -ldflags '-extldflags \"-static\"' -o .tiltbuild/manager",
+        deps=[
+            "api",
+            "cloud",
+            "config",
+            "controllers",
+            "exp",
+            "feature",
+            "pkg",
+            "go.mod",
+            "go.sum",
+            "main.go",
+        ],
     )

-    dockerfile_contents = "\n".join([
-        tilt_helper_dockerfile_header,
-        tilt_dockerfile_header,
-    ])
+    dockerfile_contents = "\n".join(
+        [
+            tilt_helper_dockerfile_header,
+            tilt_dockerfile_header,
+        ]
+    )

     entrypoint = ["sh", "/start.sh", "/manager"]
     extra_args = settings.get("extra_args")
@@ -151,45 +217,110 @@ def capg():
     # Set up an image build for the provider. The live update configuration syncs the output from the local_resource
     # build into the container.
     docker_build(
-        ref = "gcr.io/k8s-staging-cluster-api-gcp/cluster-api-gcp-controller",
-        context = "./.tiltbuild/",
-        dockerfile_contents = dockerfile_contents,
-        target = "tilt",
-        entrypoint = entrypoint,
-        only = "manager",
-        live_update = [
+        ref="gcr.io/k8s-staging-cluster-api-gcp/cluster-api-gcp-controller",
+        context="./.tiltbuild/",
+        dockerfile_contents=dockerfile_contents,
+        target="tilt",
+        entrypoint=entrypoint,
+        only="manager",
+        live_update=[
             sync(".tiltbuild/manager", "/manager"),
             run("sh /restart.sh"),
         ],
-        ignore = ["templates"],
+        ignore=["templates"],
     )

     k8s_yaml(blob(yaml))

+
+def observability():
[review comment] setting up OTEL Collector + Jaeger using helm chart
+    # Install the OpenTelemetry helm chart
+    gcp_project_id = os.getenv("GCP_PROJECT_ID", "")
+
+    k8s_yaml(
+        helm(
+            "./hack/observability/opentelemetry/chart",
+            name="opentelemetry-collector",
+            namespace="capg-system",
+            values=["./hack/observability/opentelemetry/values.yaml"],
+            # refer https://github.com/helm/helm/issues/1987
+            set=[
+                "extraEnvs[0].name=GCP_PROJECT_ID",
+                "extraEnvs[0].value=" + gcp_project_id,
+            ],
+        )
+    )
+
+    k8s_yaml(
+        helm(
+            "./hack/observability/jaeger/chart",
+            name="jaeger-all-in-one",
+            namespace="capg-system",
+            set=[
+                # TODO: consider to remove
+                # "crd.install=false",
+                # "rbac.create=false",
+                "resources.limits.cpu=200m",
+                "resources.limits.memory=256Mi",
+            ],
+        )
+    )
+
+    k8s_resource(
+        workload="jaeger-all-in-one",
+        new_name="traces: jaeger-all-in-one",
+        port_forwards=[
+            port_forward(16686, name="View traces", link_path="/search?service=capg")
+        ],
+        labels=["observability"],
+    )
+
+    k8s_resource(workload="opentelemetry-collector", labels=["observability"])
+
+
 def base64_encode(to_encode):
-    encode_blob = local("echo '{}' | tr -d '\n' | base64 - | tr -d '\n'".format(to_encode), quiet = True)
+    encode_blob = local(
+        "echo '{}' | tr -d '\n' | base64 - | tr -d '\n'".format(to_encode), quiet=True
+    )
     return str(encode_blob)

+
 def base64_encode_file(path_to_encode):
-    encode_blob = local("cat {} | tr -d '\n' | base64 - | tr -d '\n'".format(path_to_encode), quiet = True)
+    encode_blob = local(
+        "cat {} | tr -d '\n' | base64 - | tr -d '\n'".format(path_to_encode), quiet=True
+    )
     return str(encode_blob)

+
 def read_file_from_path(path_to_read):
-    str_blob = local("cat {} | tr -d '\n'".format(path_to_read), quiet = True)
+    str_blob = local("cat {} | tr -d '\n'".format(path_to_read), quiet=True)
     return str(str_blob)

+
 def base64_decode(to_decode):
-    decode_blob = local("echo '{}' | base64 --decode -".format(to_decode), quiet = True)
+    decode_blob = local("echo '{}' | base64 --decode -".format(to_decode), quiet=True)
     return str(decode_blob)

+
 def kustomizesub(folder):
-    yaml = local("hack/kustomize-sub.sh {}".format(folder), quiet = True)
+    yaml = local("hack/kustomize-sub.sh {}".format(folder), quiet=True)
     return yaml

+
 def waitforsystem():
-    local(kubectl_cmd + " wait --for=condition=ready --timeout=300s pod --all -n capi-kubeadm-bootstrap-system")
-    local(kubectl_cmd + " wait --for=condition=ready --timeout=300s pod --all -n capi-kubeadm-control-plane-system")
-    local(kubectl_cmd + " wait --for=condition=ready --timeout=300s pod --all -n capi-system")
+    local(
+        kubectl_cmd
+        + " wait --for=condition=ready --timeout=300s pod --all -n capi-kubeadm-bootstrap-system"
+    )
+    local(
+        kubectl_cmd
+        + " wait --for=condition=ready --timeout=300s pod --all -n capi-kubeadm-control-plane-system"
+    )
+    local(
+        kubectl_cmd
+        + " wait --for=condition=ready --timeout=300s pod --all -n capi-system"
+    )

+
 ##############################
 # Actual work happens here
@@ -208,4 +339,6 @@ deploy_capi()

 capg()

+observability()
+
 waitforsystem()
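For reviewers reading `patch_args_with_extra_args` in this diff: the arg-merging step can be sketched in plain Python as below. This is only an illustration, not the Tiltfile's code. The function name `merge_extra_args` and the sample flags are hypothetical, and the sketch checks membership per argument, whereas the Tiltfile tests each arg against the raw kubectl output string (a substring check), so the two can differ when one flag is a prefix of another.

```python
import json


def merge_extra_args(current_args, extra_args):
    """Append the extra args that are not already present.

    Returns (merged_args, patch_json); patch_json is None when nothing changes.
    Assumption: membership is checked per argument, not by substring.
    """
    args_to_add = [arg for arg in extra_args if arg not in current_args]
    if not args_to_add:
        return list(current_args), None
    merged = list(current_args) + args_to_add
    # Same JSON Patch shape as the diff: replace the first container's args.
    patch = [
        {
            "op": "replace",
            "path": "/spec/template/spec/containers/0/args",
            "value": merged,
        }
    ]
    return merged, json.dumps(patch)


# Hypothetical flags, for illustration only.
merged, patch_json = merge_extra_args(["--leader-elect"], ["--leader-elect", "--v=4"])
print(merged)
print(patch_json)
```

When every extra arg is already present, the sketch returns `None` for the patch, which mirrors the `if args_to_add:` guard that skips the `kubectl patch` call.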
My python environment has the Black linter by default so it just reformats the whole file 😗
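Since most hunks here are formatting noise, one way a reviewer might separate them from real changes is to compare syntax trees. The sketch below uses Python's standard `ast` module and assumes the snippets being compared are also valid Python, which holds for the lines quoted here but is not guaranteed for every Starlark construct in a Tiltfile. The helper name `same_python_ast` is made up for this example.

```python
import ast


def same_python_ast(old_src: str, new_src: str) -> bool:
    """Return True when both snippets parse to the same syntax tree.

    ast.dump() omits line and column attributes by default, so pure
    layout changes (line wrapping, spacing, quote style) compare equal.
    """
    return ast.dump(ast.parse(old_src)) == ast.dump(ast.parse(new_src))


# Formatting-only hunk: the keyword argument is re-wrapped, nothing else.
old_fmt = "update_settings(k8s_upsert_timeout_secs = 60)"
new_fmt = "update_settings(\n    k8s_upsert_timeout_secs=60\n)"

# Behavioral hunk: the kustomize directory passed to kustomizesub changes.
old_real = 'yaml = str(kustomizesub("./config/default"))'
new_real = 'yaml = str(\n    kustomizesub("./hack/observability")\n)'

print(same_python_ast(old_fmt, new_fmt))    # True: formatting only
print(same_python_ast(old_real, new_real))  # False: the argument changed
```

With a check like this, the hunks that still differ are the ones worth reading closely: the switch to `./hack/observability`, the new `observability()` function, and its call before `waitforsystem()`.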