build-tools's People

Contributors

alexanderwert, aneurysm9, arminru, atoulme, bogdandrutu, bryce-b, cleancut, dashpole, dependabot[bot], hughesjj, jamesmoessis, jerbly, joaopgrassi, jsuereth, justinfoote, kennykguo, lmolkova, lucacome, oberon00, pyohannes, thisthat, tigrannajaryan, trask, vincentfree, z1c0, zeitlinger

build-tools's Issues

Semantic convention generator: Clean up tests using separate input files

Many tests that use separate input files are not written very cleanly. The most important points:

  • The input YAML files contain lots of redundant stuff (copied from real semantic conventions). They should be minimized to only contain what is needed for the test.
  • The Python code of the tests could use some cleanup too, similar to what was already done for some markdown tests; see https://github.com/open-telemetry/build-tools/pull/54/files#diff-6efabfcf3a7471d21e99e83685a2d0151cbb1404c52eed86f26a06d81ef17c44 where the test Python code was cut down from 544 to 153 lines.
  • Instead of or in addition to the above point, try testing in such a way that we can use Python (triple-quoted) strings as input instead of needing separate files. This is not entirely trivial since file names do play a role (for generating links), but should nonetheless be doable. This would make the tests much, much more readable.
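
A minimal sketch of what the last point could look like, assuming the loader can read from any file-like object: the YAML lives in a triple-quoted string and is wrapped in an in-memory stream that still carries a file name for link generation. `named_stream` and `link_target` are hypothetical helpers for illustration, not part of the generator, and the file name is made up.

```python
import io
import textwrap


def named_stream(name: str, content: str) -> io.StringIO:
    """Wrap inline test input in a file-like object that carries a file name."""
    stream = io.StringIO(textwrap.dedent(content))
    stream.name = name  # plain attribute, set so name-dependent code keeps working
    return stream


def link_target(stream: io.StringIO) -> str:
    """Hypothetical stand-in for link generation: swap the extension for .md."""
    return stream.name.rsplit(".", 1)[0] + ".md"


# The whole test input is visible right next to the assertions.
SEMCONV = """
    groups:
      - id: test
        type: span
        brief: 'Minimal input for one test.'
"""

stream = named_stream("trace/test.yaml", SEMCONV)
```

With this shape a test reads top to bottom without opening a second file; whether the real parser accepts a stream instead of a path would need checking.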

Further ideas, not as important:

  • Asserting on the line in which an exception occurs could be cleaner if we could somehow specify the expected exception in the input (e.g. with a YAML/HTML comment).

Add release instructions to CONTRIBUTING.md

From #110:

There are no release guidelines in this repo, so I am not sure how to make a release.
@open-telemetry/specs-approvers, can anyone help with this and/or write a paragraph in CONTRIBUTING.md explaining how to make a release?

Semantic convention tool should generate markdown for metrics.

I think I realized there was no tracking bug for adding this feature, even though there's a PR: #79

  • We want markdown table generation for metric semantic conventions similar to tracing
  • This should account for nuances with metrics and may need additional fields compared to traces

I'm not as familiar with all the requirements we had so if someone with better memory can flesh out this ticket, please do!

[Feature proposal] Support 'attribute group' type in semantic conventions

Attributes can appear on different signals (metrics, traces, events, links, logs), but we don't have a way to describe attributes without the signal in semantic conventions.
I'd like to have a mechanism, where I'd define attributes separately and then refer to them from spans/links/events definitions.

E.g. in messaging we can have attributes set on links or on spans, maybe events:

  - id: messaging.message
    prefix: messaging
    type: attribute_group # this is the proposal. We currently use 'span' in cases like this
    brief: 'Semantic convention describing per-message attributes populated on messaging spans or links.'
    attributes:
      - id: message.id
        type: string
        brief: message id
        note:  when sending individual messages, it should be populated on span, otherwise should be populated on links

  - id: messaging.publish
    prefix: messaging
    type: span
    brief:  ...
    attributes:
      - ref: messaging.message.id

  - id: messaging.link
    prefix: messaging
    type: link # out of scope of this proposal
    brief:  ...
    attributes:
      - ref: messaging.message.id

We'd also benefit from it in span-general.yaml which essentially describes unrelated groups of attributes, but not a span per se.

any_of markdown output is misleading if each set only has one member

E.g. see https://github.com/open-telemetry/opentelemetry-specification/blob/v1.0.1/specification/trace/semantic_conventions/database.md#connection-level-attributes

Additional attribute requirements: At least one of the following sets of attributes is required:

  • net.peer.name
  • net.peer.ip

Obviously the set here is {"net.peer.name", "net.peer.ip"}, so you need both. Except: This is wrong! There are two sets, one contains only net.peer.name and the other only net.peer.ip.

I think if each set contains only one member, the text above the list MUST change to "At least one of the following attributes is required:" (without the "sets of").
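
As a rough sketch of that rule (the function name and the input shape are assumptions, not the generator's actual code), the heading could be chosen like this:

```python
from typing import Sequence


def any_of_heading(attribute_sets: Sequence[Sequence[str]]) -> str:
    """Pick the sentence printed above an any_of list."""
    # If every set has exactly one member, talking about "sets" is misleading.
    if all(len(members) == 1 for members in attribute_sets):
        return "At least one of the following attributes is required:"
    return "At least one of the following sets of attributes is required:"
```

For the database example above, `any_of_heading([["net.peer.name"], ["net.peer.ip"]])` would yield the shorter sentence.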

to_const_name generates invalid variable identifiers

Recently in semantic conventions, two new types of id were introduced:

The const names generated for those ids are ALIBABA CLOUD and 1XRTT respectively. Those generated values are not valid variable identifiers in most languages.

In opentelemetry-js, we use to_const_name to generate keys for those ids: https://github.com/open-telemetry/opentelemetry-js/blob/main/scripts/semconv/templates/SemanticAttributes.ts.j2#L57. In this case, the keys should be valid variable identifiers.

Should we change the semantic conventions or update the generator to generate valid identifiers?
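
If the generator side is the one to change, one possible post-processing step looks like the sketch below. `to_identifier` is a hypothetical helper, not the existing `to_const_name`, and the leading-underscore convention for names starting with a digit is just one option.

```python
import re


def to_identifier(const_name: str) -> str:
    """Turn a generated constant name into something usable as an identifier."""
    # Collapse every run of characters other than letters, digits and "_".
    cleaned = re.sub(r"[^0-9A-Za-z_]+", "_", const_name)
    # Most languages do not allow an identifier to start with a digit.
    if cleaned[:1].isdigit():
        cleaned = "_" + cleaned
    return cleaned
```

Applied to the two values above, this gives `ALIBABA_CLOUD` and `_1XRTT`.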

Referenced attributes are partially generated in markdown

The note listing the possible attribute values (and whether custom values are allowed) is not populated when an attribute is referenced.

E.g. I have

- id: messaging.destination
  prefix: messaging.destination
  brief: 'Semantic convention for attributes that describe messaging destination on broker'
  attributes:
    - id: kind
      type:
        allow_custom_values: true
        members:
          - id: queue
            value: "queue"
            brief: "A message sent to a queue"
          - id: topic
            value: "topic"
            brief: "A message sent to a topic"

If I generate md from it using <!-- semconv messaging.destination-->, I'd see an attribute in the table and a note describing possible values:

| `messaging.destination.kind` | string | The kind of message destination | `queue` | Recommended |
`messaging.destination.kind` has the following list of well-known values. If one of them applies, then the respective value MUST be used, otherwise a custom value MAY be used.

| Value  | Description |
|---|---|
| `queue` | A message sent to a queue |
| `topic` | A message sent to a topic |

Then if I reference this attribute

- id: messaging.producer
  prefix: messaging
  type: span
  extends: messaging
  span_kind: producer
  brief: 'Semantic convention for producers of messages sent to a messaging systems.'
  attributes:
    - ref: messaging.destination.kind
      requirement_level:
        conditionally_required: If the message destination is either a `queue` or a `topic`.

If I generate md from it using <!-- semconv messaging.producer-->, I'd expect a similar note to be populated, but it's not (see open-telemetry/opentelemetry-specification#3214).

Add metric requirement level to yaml

This stems from open-telemetry/opentelemetry-specification#2972 and https://github.com/open-telemetry/opentelemetry-specification/pull/3158/files

"The HTTP metrics semantic convention spec should provide clarity "what is considered a compliant implementation", e.g. "an implementation is considered compliant if it has implemented all of the metrics described by this spec"

The work required in build-tools is:

  • Add a field to the metrics semconv yaml (should this also be for traces?)
  • Decide where this will be rendered in markdown (most likely the metrics table makes sense) + write it into the markdown renderer

Support for "map" conventions based on prefixes

See this suggested semantic convention for HTTP headers: open-telemetry/opentelemetry-specification#1061.

Right now, I think it is not possible to specify this in the semantic convention generator. It would be useful, however, to generate a constant for the http.request.header prefix (or http.request.header. with a trailing dot), or a function such as string httpRequestHeaderName(string key) { return 'http.request.header.' + key; }.

In the markdown, this should probably be designated by adding .* to the name of the attribute. E.g. like this:

| Attribute | Type | Description | Examples | Required |
|---|---|---|---|---|
| `http.request.header.<key>` | string[] | HTTP request headers, `<key>` being the HTTP header name (case preserving), the value being the header values. | `http.request.header.Content-Type=["application/json"]`; `http.request.header.X-Forwarded-for=["1.2.3.4", "1.2.3.5"]` | No |

In the YAML source:

  • The type could be specified like type: prefix_map: string[].
  • Examples need to be handled specially here, as they need to include both the key (suffix) and a value.
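
For illustration, the generated code for such a prefix could look roughly like this in Python. Both names are made up for the sketch; the generator produces nothing like this today.

```python
# Constant for the prefix, including the trailing dot.
HTTP_REQUEST_HEADER_PREFIX = "http.request.header."


def http_request_header_name(key: str) -> str:
    """Build the full attribute name for one header, preserving the key's case."""
    return HTTP_REQUEST_HEADER_PREFIX + key
```

`http_request_header_name("Content-Type")` would then produce `http.request.header.Content-Type`.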

Semconv generated code causes doxygen warnings

Using opentelemetry-specifications version 1.16.0.

In particular, the spec for browser.platform contains, in the note:

          Note that some (but not all) of these values can overlap with values
          in the [`os.type` and `os.name` attributes](./os.md).

Using build-tools version 0.14.0

Using opentelemetry-cpp version 1.8.1, with the following generation script:

https://github.com/open-telemetry/opentelemetry-cpp/blob/main/buildscripts/semantic-convention/generate.sh

After generation, the C++ code for browser.platform looks like:

/**
 * The platform on which the browser is running
 *
 * <p>Notes:
  <ul> <li>This value is intended to be taken from the <a
href="https://wicg.github.io/ua-client-hints/#interface">UA client hints API</a> ({@code
navigator.userAgentData.platform}). If unavailable, the legacy {@code navigator.platform} API SHOULD
NOT be used instead and this attribute SHOULD be left unset in order for the values to be
consistent. The list of possible values is defined in the <a
href="https://wicg.github.io/ua-client-hints/#sec-ch-ua-platform">W3C User-Agent Client Hints
specification</a>. Note that some (but not all) of these values can overlap with values in the <a
href="./os.md">{@code os.type} and {@code os.name} attributes</a>. However, for consistency, the
values in the {@code browser.platform} attribute should capture the exact value that the user agent
provides.</li> </ul>
 */
static constexpr const char *kBrowserPlatform = "browser.platform";

https://github.com/open-telemetry/opentelemetry-cpp/blob/main/sdk/include/opentelemetry/sdk/resource/semantic_conventions.h#L38-L53

Note in particular how the original link to os.md is encoded:

Note that some (but not all) of these values can overlap with values in the <a
href="./os.md">{@code os.type} and {@code os.name} attributes</a>.

Using doxygen version 1.9.5, it reports the following warnings:

/.../semantic_conventions.h:49: warning: Illegal command @iliteral found as part of a <a>..</a> block
/.../semantic_conventions.h:49: warning: Illegal command @endiliteral found as part of a <a>..</a> block
/.../semantic_conventions.h:49: warning: Illegal command @iliteral found as part of a <a>..</a> block
/.../semantic_conventions.h:49: warning: Illegal command @endiliteral found as part of a <a>..</a> block

There are a few similar warnings for other semantic conventions.

Doxygen complains about the {@code os.type} and {@code os.name} seen inside an <a>anchor</a>

The expected result is to have a complete chain that generates clean doxygen documentation, without warnings.

About the point of failure, not sure where the root cause actually is:

  • Is the syntax used (see below) considered valid in the spec?
  • Does the generator produce the correct output for this syntax?
  • Does the template code used in opentelemetry-cpp need changes to fix this, and how?
  • Should doxygen accept @code inside an <a></a> tag?

Spec syntax:

[`os.type` and `os.name` attributes](./os.md)

Please clarify and provide guidance here, in particular on which part (spec, generator, generator template) should be fixed, with suggestions if known.

If this turns out to be a doxygen limitation, what is the best way to implement a workaround?
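
One conceivable workaround, sketched here as a post-processing step over the generated comment text, is to drop the `{@code ...}` markup only where it sits inside an anchor. This is an assumption about where a fix could live, not a statement about the generator or the opentelemetry-cpp templates, and it trades the warning for losing code formatting inside links.

```python
import re

# An <a ...>...</a> block, possibly spanning several lines.
ANCHOR = re.compile(r"(<a\b[^>]*>)(.*?)(</a>)", re.DOTALL)
# A Javadoc-style inline code tag such as {@code os.type}.
CODE_TAG = re.compile(r"\{@code\s+([^}]*)\}")


def strip_code_tags_in_anchors(comment: str) -> str:
    """Replace {@code x} with plain x, but only inside <a>...</a> blocks."""

    def fix(match: re.Match) -> str:
        opening, body, closing = match.groups()
        return opening + CODE_TAG.sub(r"\1", body) + closing

    return ANCHOR.sub(fix, comment)
```

Code tags outside of anchors are left untouched, so the rest of the comment renders as before.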

Thanks.

Allow at least some form of templating also for markdown

The code generator uses a flexible mechanism based on Jinja2 to generate output. Yet the markdown output is generated with hard-coded Python code.
Consider introducing more flexibility into markdown generation with Jinja. Ideally, completely replace the Python code for markdown with Jinja templates, which allow good structuring and are quite powerful. Only the replacement mechanism (finding the HTML comments) should stay in Python.
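
A minimal sketch of that split, assuming blocks are delimited by `<!-- semconv ID -->` and `<!-- endsemconv -->` comments (the end marker is an assumption here): Python only locates the blocks, and the `render` callable stands in for whatever a Jinja template would produce.

```python
import re
from typing import Callable

BLOCK = re.compile(
    r"(?P<start><!--\s*semconv\s+(?P<id>[^\s>]+?)\s*-->)"
    r"(?P<body>.*?)"
    r"(?P<end><!--\s*endsemconv\s*-->)",
    re.DOTALL,
)


def replace_blocks(markdown: str, render: Callable[[str], str]) -> str:
    """Swap the content of every semconv block for freshly rendered output."""

    def substitute(match: re.Match) -> str:
        rendered = render(match.group("id"))
        # Keep both markers so the file can be regenerated again later.
        return f"{match.group('start')}\n{rendered}\n{match.group('end')}"

    return BLOCK.sub(substitute, markdown)
```

Keeping the markers and replacing only what is between them means the template never has to know about the surrounding document.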

From #70 (comment)

Support clarifying shared attributes in other files

Currently, one way to refer to another attribute while clarifying its usage in a downstream semantic convention file is what aws-sdk does here:

https://github.com/open-telemetry/opentelemetry-specification/blob/412fb5774cefc32ff9b7f47ef80e9e7d245e1e2e/semantic_conventions/trace/instrumentation/aws-sdk.yml#L13

The code generator does not seem to use the base definition correctly. Perhaps this approach was too hacky, but I think we need a way to refer to conventions from other files without breaking generation.

Also see:
open-telemetry/opentelemetry-specification#1607

/cc @weyert

Establish a build-tools SIG

I think this project has enough contributions to be its own SIG that has approvers and maintainers. We should consider some of the instrumentation SIG members for the semantic convention tools as well as people that made contributions @arminru, @Oberon00, @atoulme, @lmolkova, myself for the proto docker.

@open-telemetry/technical-committee @open-telemetry/specs-approvers @open-telemetry/instr-wg thoughts on this?

Make diffs in table regeneration issues more visible

I (and apparently another engineer) had trouble seeing why exactly the build process was failing, the error message only being

File {blah} contains a table that would be reformatted.

I've since found the tool is very particular about whitespace and even the number of dashes in a table header.
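
Something along these lines would already make the failure much easier to read. It is only a sketch using the standard library's `difflib`; the function name and the way the two versions are obtained are assumptions.

```python
import difflib


def table_diff(path: str, current: str, regenerated: str) -> str:
    """Return a unified diff between the file on disk and the regenerated text."""
    diff = difflib.unified_diff(
        current.splitlines(keepends=True),
        regenerated.splitlines(keepends=True),
        fromfile=f"{path} (current)",
        tofile=f"{path} (regenerated)",
    )
    return "".join(diff)
```

Printing this next to the existing error message would show whitespace and dash-count differences line by line.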

build-protobuf crashes with segmentation fault on Apple M1

Hi! We use the otel/build-protobuf image in Tempo, it is run as part of make vendor-check:
https://github.com/grafana/tempo/blob/055573362709a02e2ca6aa98400a3de38c91c4a1/Makefile#L123

I'm the first person on the team using the Apple M1 and using this image results in an error (full log at the end):

qemu: uncaught target signal 11 (Segmentation fault) - core dumped

The recommended way to solve this is by providing multi-arch Docker images. Similar work was done in Tempo itself: this requires the pipeline to be adapted so it builds both amd64 and arm64 images and the Dockerfile is parameterised to accept a target arch.

Is this work you would consider accepting?

Full log:

$ make vendor-check

...

--
-- Gen proto --
--
docker run --rm -u 501 -v/Users/koenraad/Repositories/grafana/tempo:/Users/koenraad/Repositories/grafana/tempo -w/Users/koenraad/Repositories/grafana/tempo otel/build-protobuf:0.2.1 --proto_path=/Users/koenraad/Repositories/grafana/tempo -Ipkg/.patched-proto --gogofaster_out=plugins=grpc,paths=source_relative:./pkg/tempopb/ pkg/.patched-proto/common/v1/common.proto
WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
--gogofaster_out: protoc-gen-gogofaster: Plugin killed by signal 11.
make: *** [gen-proto] Error 1

Allow attaching metadata to attribute and group definitions

We're using the semconvgen tooling to generate documentation (and eventually code) for our internal conventions. We would like to be able to associate various metadata with individual attributes and possibly groups of attributes so that we could have all the data about our internal conventions in a single source. Some examples of data we'd like to be able to associate with attributes are:

  • Related service
  • Attribute owner/point of contact
  • Privacy annotations (for potential redaction/hashing, etc)
  • Retention policy for sensitive data

I suspect that at least some of the fields we would like to annotate on attributes and groups are unique to our internal use cases and probably do not make sense to be integrated with the semconvgen tooling. Instead I was wondering if there might be interest in adding a generic mechanism for associating arbitrary metadata with group and attributes? If so, would you have have any guidance how we might go about implementing this?

SemConv should i18N

Hi, I want to translate the OpenTelemetry specifications into Japanese, but I found that some text is hard-coded in English: here, and in the table header (perhaps only those two; Status, Type and Description are OK for me).

So I think it is better to use some i18n for those texts. If a table header is difficult because of character length, I would like to add only the "MUST be one of the ..." text to the translation.

Release 0.13.0

We need a new release to use the latest version of the schemas check tool in the spec repo before spec release 0.13.0.

There are no release guidelines in this repo, so I am not sure how to make a release. @open-telemetry/specs-approvers, can anyone help with this and/or write a paragraph in CONTRIBUTING.md explaining how to make a release?

Generator needs better support for internal and spec links.

See open-telemetry/opentelemetry-specification#1192

The markdown generator would need to be passed a URL prefix that is treated as the root, so that links under it can be made relative. Semantic conventions would contain absolute links (i.e. revert that part of open-telemetry/opentelemetry-specification#1192).

Other generators should support replacing a URL prefix, e.g. https://github.com/open-telemetry/opentelemetry-specification/tree/master/ with https://github.com/open-telemetry/opentelemetry-specification/tree/v0.6/.
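
Both behaviours can be sketched in a few lines. The function names are hypothetical, the document paths in the usage note are made up for illustration, and fragments or query strings are not handled.

```python
import posixpath


def pin_version(text: str, old_prefix: str, new_prefix: str) -> str:
    """For code generators: point absolute links at a specific version."""
    return text.replace(old_prefix, new_prefix)


def to_relative(url: str, root_prefix: str, current_doc: str) -> str:
    """For markdown: make a link under root_prefix relative to current_doc."""
    if not url.startswith(root_prefix):
        return url  # not under the root, leave it absolute
    target = url[len(root_prefix):]
    return posixpath.relpath(target, start=posixpath.dirname(current_doc))
```

For example, with the master prefix above as root, a link to `specification/trace/api.md` seen from `specification/metrics/README.md` would become `../trace/api.md`.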

[semconvgen] Feature Request: Add option to disable attribute id regex validation

Hey folks 👋 We have existing code that emits attributes with IDs that do not match the semconv regex (for example, some of them are pascal case). We want to use the semconvgen tool with a custom template, but ideally we would like to be able to disable the regex validation on the attribute IDs as an option flag for semconvgen. In our case, relabelling would be too long/costly as we have a huge codebase in multiple languages to comb through.

Semconv code generator example does not work for opentelemetry-specifications 1.14.0

Repo opentelemetry-cpp successfully used the semconv code generator already on 1.13.0 specs.
See the generate script used:

generate.sh

Now trying to upgrade to 1.14.0 opentelemetry-specification:

SEMCONV_VERSION=1.14.0

The build fails with this error message:

opentelemetry.semconv.model.exceptions.ValidationError: Semantic Convention trace-exception reference `exception.type` but it cannot be found! - @2:5

In the 1.14.0 specs, indeed, file
semantic_conventions/trace/trace-exception.yaml
uses the exception defined in
semantic_conventions/exception.yaml

Because the docker mount point used is semantic_conventions/trace/,
the docker image does not see the file exception.yaml

docker run --rm \
  -v ${SCRIPT_DIR}/opentelemetry-specification/semantic_conventions/trace:/source \
  ...

I hope this helps to narrow down the root cause.
No idea about a fix however.

Please fix and adjust the example to work with the latest semantic conventions from the specs,
as this is needed for each language SIG in general.

[resolved] Blocking for open-telemetry/opentelemetry-cpp#1671

Improve markdown table generation for metrics

(split from #79)

See https://github.com/open-telemetry/build-tools/pull/79/files#r996803040 for the previous discussion and proposals.

Two possible options would be:

  1. A table of metric names with a (then shared) table of attribute names that apply.
  2. A table of the metric where "empty" rows are created while attributes are expanded.

Both of them are currently manually applied in the hand-crafted metrics tables in the spec repo.

cc @jamesmoessis @Oberon00 @joaopgrassi @jsuereth

Update otel/build-protobuf for multi-architecture support

Unable to build the opentelemetry-go-proto release for OTLP v0.19.0 on an arm chip because the build tools are for amd64.

The release steps fail like so:

% make sync VERSION=v0.19.0
upgrading opentelemetry-proto submodule to v0.19.0
HEAD is now at 6459e1a Prepare for v0.19.0 release (#420)
rm -rf gen otlp
rm -rf ./gen/go
mkdir -p ./gen/go
docker run --rm -u 502 -v/Users/josh.macdonald/src/opentelemetry/proto-go:/Users/josh.macdonald/src/opentelemetry/proto-go -w/Users/josh.macdonald/src/opentelemetry/proto-go otel/build-protobuf:0.11.0 --proto_path="gen/proto"  --go_out=./gen/go gen/proto/opentelemetry/proto/common/v1/common.proto
Unable to find image 'otel/build-protobuf:0.11.0' locally
0.11.0: Pulling from otel/build-protobuf
Digest: sha256:a1f16b31cb70dca3e486afad20c268cf9d3ee776b0cd6656120c8e013c27a52c
Status: Downloaded newer image for otel/build-protobuf:0.11.0
WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
--go_out: protoc-gen-go: Plugin killed by signal 11.
make: *** [gen-otlp-protobuf] Error 1

Support shared enum definitions

I noticed that we have duplicated the list of well-known cloud providers to both the cloud.provider resource attribute and the faas.invoked_provider attribute.

With this we keep approaching XSD 😄

How to gen grpc-java_out?

Hi,
I use the Docker image to generate Java code, but I only get the Java bean code. How can I generate the gRPC Java client/server code?

Formalize build process for semantic convention generator

As a new contributor to opentelemetry, I had to debug some stuff in build-tools, and it was a bit vague how we go about building the docker container. It would be great if we could formalize these into a make command or similar so that it's easier to onboard new contributors.

Improvements for attributes semconv generation

open-telemetry/opentelemetry-specification#3183 declares attributes separately from traces and metrics and then reuses attributes in corresponding semconvs. There are two issues:

  1. Grandparent attributes are not populated in the table. E.g.

    • http.common group defines http.method
    • http.server group extends http.common and defines http.route
    • metric.http.server.duration extends http.server. When metric.http.server.duration(full) semconv table is rendered, it does not include http.method, but includes http.route
  2. Referenced attributes description should come from extended semconv first. E.g.

    • net.peer.name is defined in span-general semconv
    • it's redefined for HTTP in http.client.common providing long HTTP-specific description
    • it needs sampling_relevant: true for tracing (and none of it for metrics), so it has to be referenced again in trace.http.client and long HTTP-specific description should be repeated
    • Proposal: when resolving net.peer.name for trace.http.client semconv, it should prioritize properties from trace.http.client, then read them from parent (http.client.common) and finally fill in the gaps from original span-general spec.
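
The proposed lookup order maps quite naturally onto a layered dictionary. The sketch below uses `collections.ChainMap`; the function name is hypothetical and the property values are placeholders, not the real definitions.

```python
from collections import ChainMap


def resolve_attribute(*layers: dict) -> dict:
    """Merge attribute properties; earlier layers win over later ones."""
    return dict(ChainMap(*layers))


# Placeholder properties for net.peer.name at each level.
span_general = {"type": "string", "brief": "Generic description."}
http_client_common = {"brief": "Long HTTP-specific description."}
trace_http_client = {"sampling_relevant": True}

resolved = resolve_attribute(trace_http_client, http_client_common, span_general)
```

Here `resolved` takes `sampling_relevant` from `trace.http.client`, the brief from `http.client.common`, and the type from the original definition, so the long description does not have to be repeated.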

Support referencing attributes across signals in codegen

See open-telemetry/opentelemetry-specification#3299.
E.g. user_agent.original is defined within attribute_group in opentelemetry-specification/semantic_conventions folder and referenced in resource semconv here https://github.com/open-telemetry/opentelemetry-specification/blob/main/semantic_conventions/resource/browser.yaml

The code generator allows specifying a source folder and generates all semconv from there, applying the --only or --exclude flags to the source. It uses the same subset of files to resolve attributes and to generate code for them.

When the browser resource semconv is generated, we need to use all resource + cross-signal attributes to resolve attributes, but generate only those that are in resource. There is no such option today.
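
Conceptually, this means separating the set of groups used for resolution from the set used for output. A rough sketch, assuming groups are plain dictionaries and attribute ids are already fully qualified (the real model is richer, and `generate` is a made-up name):

```python
from typing import Dict, List


def generate(groups: List[dict], only_type: str) -> Dict[str, List[dict]]:
    """Resolve refs against all groups, but emit only groups of one type."""
    # Resolution index: built from every group, regardless of type.
    definitions = {
        attr["id"]: attr
        for group in groups
        for attr in group["attributes"]
        if "id" in attr
    }
    output = {}
    for group in groups:
        if group["type"] != only_type:
            continue  # available for resolution above, but not emitted
        output[group["id"]] = [
            definitions[attr["ref"]] if "ref" in attr else attr
            for attr in group["attributes"]
        ]
    return output


groups = [
    {"id": "user_agent", "type": "attribute_group",
     "attributes": [{"id": "user_agent.original", "type": "string"}]},
    {"id": "browser", "type": "resource",
     "attributes": [{"ref": "user_agent.original"},
                    {"id": "browser.platform", "type": "string"}]},
]
resource_only = generate(groups, "resource")
```

In this sketch `resource_only` contains just the `browser` group, with `user_agent.original` resolved from the attribute group that is itself not emitted.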
