skills/arize-ai/openinference/java-code-reviewer

java-code-reviewer

Installation
SKILL.md

Java Code Reviewer for OpenInference Instrumentors

Review a Java OpenInference instrumentation package against the project's established patterns and conventions. Report findings with file paths and line numbers, organized by severity (Critical / High / Medium / Low).

Workflow

Step 1: Identify the package to review

  • Ask the user which instrumentor to review if not already clear from context
  • The package lives under java/instrumentation/openinference-instrumentation-<name>/
  • Read the instrumentor source, build.gradle, and src/test/ directory

Step 2: Use the instrumented library source as ground truth

Before flagging any finding, verify it against the actual library code. Do NOT assume how the instrumented library works — read it. Do NOT present findings without having read the library source first.

  • Find the library version from java/build.gradle ext block
  • Check ~/.gradle/caches/modules-2/files-2.1/ for cached sources
  • If not cached, download the sources jar from Maven Central (repo1.maven.org). Some libraries split across multiple artifacts — check build.gradle dependency declarations and fetch all relevant ones.
  • If you cannot obtain the source through any means, explicitly tell the user you were unable to verify against the library source before presenting findings.
  • Calibrate severity by what the library actually does: a bug on a common code path is High/Critical; an edge case for a type that can't appear at runtime is Low

Step 3: Run all review sections below

Step 4: Present findings in a severity table, list what's working well, then ask the user: fix issues, run tests (./gradlew :instrumentation:...:test), or done.


Section 1: Gradle Setup

Read the instrumentor's build.gradle and the root java/build.gradle.

  • Instrumented library must be compileOnly (not implementation) — High
  • openinference-instrumentation must be api
  • Version constants should be in root ext block, not hardcoded — Medium
  • Module must be in java/settings.gradleCritical if missing
  • Run cd java && ./gradlew spotlessCheck (Palantir Java Format)

Section 2: Testing Patterns

Exhaustive attribute assertions

This is the most important testing pattern. Tests must verify ALL span attributes, not spot-check a few. The remove-and-verify pattern catches both unexpected additions and silent removals:

Map<String, Object> attributes = new HashMap<>();
span.getAttributes().forEach((key, value) -> attributes.put(key.getKey(), value));

assertThat(attributes.remove("openinference.span.kind")).isEqualTo("LLM");
assertThat(attributes.remove("llm.model_name")).isEqualTo("gpt-4");
// ... remove and assert all remaining attributes ...
assertThat(attributes).isEmpty();  // Nothing unexpected left

Missing emptiness check — High.

Other required test coverage

  • Error handling: exception -> span has StatusCode.ERROR + recorded exception — High
  • Context attribute propagation: session_id, user_id, metadata, tags
  • TraceConfig masking: verify hideInputMessages etc. actually suppress attributes
  • Missing test files entirely — Critical

Section 3: OpenInference Semantic Conventions

Read SemanticConventions.java for the full attribute catalog: java/openinference-semantic-conventions/src/main/java/com/arize/semconv/trace/SemanticConventions.java

Also read the spec files under spec/ (semantic_conventions.md, traces.md, llm_spans.md, embedding_spans.md, tool_calling.md) for expected behavior.

For the library type being reviewed, verify the instrumentor sets all applicable attributes. Key checks:

  • Every span needs OPENINFERENCE_SPAN_KIND, INPUT_VALUE + INPUT_MIME_TYPE, OUTPUT_VALUE + OUTPUT_MIME_TYPE. Missing MIME type when value is set — High
  • All attributes should use constants from SemanticConventions, not hardcoded strings — Medium
  • Read TraceConfig.java for the full list of hide flags; verify each is respected where applicable — Medium if missing

Section 4: Span Lifecycle and Hierarchy

  • span.end() must ALWAYS be called — Critical if missing
  • Scope from context.makeCurrent() must be closed (try-with-resources) — High
  • For multi-span instrumentors: verify parent-child nesting and shared traceId
  • The instrumentor should use OITracer (not raw Tracer) to get TraceConfig support

Presenting Results

Organize findings into a table:

Severity Section Finding Location
Critical 4 span.end() not called in error path SomeListener.java:142
High 2 Tests don't verify all span attributes SomeTest.java:85
... ... ... ...

Then list what's working well — positive findings help the user understand what doesn't need to change.

Weekly Installs
2
GitHub Stars
917
First Seen
5 days ago