<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Apache SkyWalking – Observability</title>
    <link>/tags/observability/</link>
    <description>Recent content in Observability on Apache SkyWalking</description>
    <generator>Hugo -- gohugo.io</generator>
    <language>en</language>
    <lastBuildDate>Sun, 05 Apr 2026 00:00:00 +0000</lastBuildDate>
    
	  <atom:link href="/tags/observability/feed.xml" rel="self" type="application/rss+xml" />
    
    
      
        
      
    
    
    <item>
      <title>Blog: Monitoring LLM Applications with SkyWalking 10.4: Insights into Performance and Cost</title>
      <link>/blog/2026-04-05-virtual-genai-monitoring/</link>
      <pubDate>Sun, 05 Apr 2026 00:00:00 +0000</pubDate>
      <guid>/blog/2026-04-05-virtual-genai-monitoring/</guid>
      <description>
        
        
        &lt;h1 id=&#34;the-problem-as-applications-consume-llms-monitoring-leaves-a-blind-spot&#34;&gt;The Problem: As Applications &amp;ldquo;Consume&amp;rdquo; LLMs, Monitoring Leaves a Blind Spot&lt;/h1&gt;
&lt;p&gt;With the deep penetration of Generative AI (GenAI) into enterprise workflows, developers face a challenging paradox: while powerful LLM capabilities are easily integrated via &lt;code&gt;Spring AI&lt;/code&gt; or &lt;code&gt;OpenAI SDKs&lt;/code&gt;, the actual performance and reliability of these calls remain largely invisible.&lt;/p&gt;
&lt;h3 id=&#34;1-the-black-box-of-cost-and-performance-is-the-expensive-model-worth-it&#34;&gt;1. The &amp;ldquo;Black Box&amp;rdquo; of Cost and Performance: Is the Expensive Model Worth It?&lt;/h3&gt;
&lt;p&gt;Facing high LLM bills, organizations often only see a total sum paid to a provider, but cannot calculate the &amp;ldquo;ROI&amp;rdquo; within the application.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Blind Upgrades:&lt;/strong&gt; You might switch to a premium flagship model for a better experience. But in your specific business scenario, does paying several times more per token actually yield lower latency or a faster &lt;strong&gt;TTFT (Time to First Token)&lt;/strong&gt;?&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Lack of Real-World Benchmarks:&lt;/strong&gt; Official benchmarks mean little without your real-world business requests. You need to know which model achieves the perfect balance between &amp;ldquo;Token/Cost Consumption&amp;rdquo; and &amp;ldquo;Response Speed&amp;rdquo; under your actual prompt lengths and concurrency levels.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id=&#34;2-the-vanishing-golden-timeout&#34;&gt;2. The Vanishing &amp;ldquo;Golden Timeout&amp;rdquo;&lt;/h3&gt;
&lt;p&gt;Many teams set timeouts for LLM calls arbitrarily (e.g., 30s or 60s).&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Too Short:&lt;/strong&gt; During peak periods or long-text generation, requests are frequently interrupted, causing business failure rates to soar.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Too Long:&lt;/strong&gt; If a provider hangs, requests pile up in memory, blocking execution threads and potentially leading to the collapse of the entire Java application or microservice cluster.
Only by mastering the &lt;strong&gt;P99/P95 Latency&lt;/strong&gt; can you set rational timeout policies based on data rather than intuition.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id=&#34;3-the-overlooked-experience-killer-ttft&#34;&gt;3. The Overlooked Experience Killer: TTFT&lt;/h3&gt;
&lt;p&gt;In GenAI scenarios, a user&amp;rsquo;s perception of speed depends less on the total duration of the conversation and more on &lt;strong&gt;&amp;ldquo;when the first word appears.&amp;rdquo;&lt;/strong&gt; * A streaming response with a 10s total duration but a &lt;strong&gt;500ms TTFT&lt;/strong&gt; feels instantaneous.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;A non-streaming response with a 5s total duration but a &lt;strong&gt;4s TTFT&lt;/strong&gt; feels &amp;ldquo;frozen.&amp;rdquo;
If your observability system only tracks total latency, you miss the core UX metric that explains why users complain about &amp;ldquo;AI slowness.&amp;rdquo;&lt;/li&gt;
&lt;/ul&gt;
&lt;hr&gt;
&lt;p&gt;&lt;strong&gt;SkyWalking 10.4: A &amp;ldquo;Digital Dashboard&amp;rdquo;&lt;/strong&gt;&lt;br&gt;
From the Application Perspective The &lt;strong&gt;Virtual GenAI&lt;/strong&gt; capability introduced in Apache SkyWalking 10.4 fills this &amp;ldquo;observability vacuum.&amp;rdquo; It avoids reliance on external gateways by using application-side probes (like the Java Agent) to collect the most authentic data from the client&amp;rsquo;s perspective.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Precise Latency Distribution:&lt;/strong&gt; Multi-dimensional metrics (P50, P90, P99) help visualize LLM fluctuations to inform dynamic timeout strategies.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Core UX Metric — TTFT Monitoring:&lt;/strong&gt; Native support for first-token latency in streaming calls.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Multi-dimensional Model Profiling:&lt;/strong&gt; Aligns token usage, estimated cost, and performance across Providers and Models, helping you choose the most cost-effective solution for your specific needs.&lt;/li&gt;
&lt;/ul&gt;
&lt;hr&gt;
&lt;h1 id=&#34;virtual-genai-observability&#34;&gt;Virtual GenAI Observability&lt;/h1&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Virtual GenAI&lt;/strong&gt; represents Generative AI service nodes detected by probe plugins. All performance metrics are based on the &lt;strong&gt;GenAI Client Perspective&lt;/strong&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;For instance, the &lt;strong&gt;Spring AI plugin&lt;/strong&gt; in the Java Agent detects the response latency of a Chat Completion request. SkyWalking then visualizes these in the dashboard:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Traffic &amp;amp; Success Rate&lt;/strong&gt; (CPM &amp;amp; SLA)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Latency &amp;amp; TTFT&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Token Usage&lt;/strong&gt; (Input/Output)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Estimated Cost&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Screenshots:&lt;/strong&gt;
&lt;img src=&#34;provider-dashboard-1.png&#34; alt=&#34;provider-dashboard-1.png&#34;&gt;
&lt;img src=&#34;provider-dashboard-2.png&#34; alt=&#34;provider-dashboard-2.png&#34;&gt;
&lt;img src=&#34;provider-dashboard-3.png&#34; alt=&#34;provider-dashboard-3.png&#34;&gt;
&lt;img src=&#34;model-dashboard-1.png&#34; alt=&#34;model-dashboard-1.png&#34;&gt;
&lt;img src=&#34;model-dashboard-2.png&#34; alt=&#34;model-dashboard-2.png&#34;&gt;
&lt;img src=&#34;model-dashboard-3.png&#34; alt=&#34;model-dashboard-3.png&#34;&gt;&lt;/p&gt;
&lt;h1 id=&#34;how-it-works&#34;&gt;How It Works&lt;/h1&gt;
&lt;p&gt;When the SkyWalking Java Agent or OTLP probes intercept calls to mainstream AI frameworks (e.g., Spring AI, OpenAI SDK), they report Trace data to the SkyWalking OAP.
The OAP aggregates and computes this data to generate performance metrics for both &lt;strong&gt;Providers&lt;/strong&gt; and &lt;strong&gt;Models&lt;/strong&gt;, which are then rendered in the built-in Virtual-GenAI dashboards.&lt;/p&gt;
&lt;h1 id=&#34;installation--configuration&#34;&gt;Installation &amp;amp; Configuration&lt;/h1&gt;
&lt;h2 id=&#34;requirements&#34;&gt;Requirements&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;SkyWalking Java Agent:&lt;/strong&gt; &amp;gt;= 9.7&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;SkyWalking OAP:&lt;/strong&gt; &amp;gt;= 10.4&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;semantic-conventions--compatibility&#34;&gt;Semantic Conventions &amp;amp; Compatibility&lt;/h2&gt;
&lt;p&gt;SkyWalking Virtual GenAI follows &lt;strong&gt;OpenTelemetry GenAI Semantic Conventions&lt;/strong&gt;. OAP identifies GenAI-related Spans based on:&lt;/p&gt;
&lt;h3 id=&#34;skywalking-java-agent&#34;&gt;SkyWalking Java Agent&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;Spans must be of type Exit, have the SpanLayer attribute set to GENAI, and contain the gen_ai.response.model tag.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id=&#34;otlp--zipkin-probes&#34;&gt;OTLP / Zipkin Probes&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;Spans must contain the &lt;code&gt;gen_ai.response.model&lt;/code&gt; tag.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For details, refer to the E2E configurations:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/apache/skywalking/blob/master/test/e2e-v2/cases/virtual-genai/docker-compose.yml&#34;&gt;SkyWalking Java Agent Reporting&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/apache/skywalking/blob/master/test/e2e-v2/cases/otlp-virtual-genai/docker-compose.yml&#34;&gt;Probe Reporting OTLP Data&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/apache/skywalking/blob/master/test/e2e-v2/cases/zipkin-virtual-genai/docker-compose.yml&#34;&gt;Probe Reporting Zipkin Data&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;hr&gt;
&lt;h1 id=&#34;genai-estimated-cost-configuration&#34;&gt;GenAI Estimated Cost Configuration&lt;/h1&gt;
&lt;h2 id=&#34;overview&#34;&gt;Overview&lt;/h2&gt;
&lt;p&gt;SkyWalking provides a built-in &lt;a href=&#34;https://github.com/apache/skywalking/blob/master/oap-server/server-starter/src/main/resources/gen-ai-config.yml&#34;&gt;GenAI Billing Configuration File&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This file defines how SkyWalking maps model names from Trace data to their corresponding providers and estimates the token cost for each LLM call. The estimated cost is displayed in the SkyWalking UI alongside trace and metric data, helping users intuitively understand the financial impact of their GenAI usage.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Important:&lt;/strong&gt; The pricing in this file is intended for cost estimation only and must not be treated as actual billing or invoice amounts. Users are advised to regularly verify the latest rates on the providers&amp;rsquo; official pricing pages.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2 id=&#34;configuration-structure&#34;&gt;Configuration Structure&lt;/h2&gt;
&lt;h3 id=&#34;top-level-fields&#34;&gt;Top-level Fields&lt;/h3&gt;
&lt;table&gt;
  &lt;thead&gt;
      &lt;tr&gt;
          &lt;th style=&#34;text-align: left&#34;&gt;Field&lt;/th&gt;
          &lt;th style=&#34;text-align: left&#34;&gt;Type&lt;/th&gt;
          &lt;th style=&#34;text-align: left&#34;&gt;Description&lt;/th&gt;
      &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;last-updated&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;date&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;The last update date of the pricing data. All prices are based on public billing standards announced by providers prior to this date.&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;providers&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;list&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;List of GenAI provider definitions. Each entry contains matching rules and specific model pricing information.&lt;/td&gt;
      &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;h3 id=&#34;provider-definition&#34;&gt;Provider Definition&lt;/h3&gt;
&lt;p&gt;Each entry under &lt;code&gt;providers&lt;/code&gt; defines a GenAI provider:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;providers&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;- &lt;span style=&#34;color:#0550ae&#34;&gt;provider&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&amp;lt;provider-name&amp;gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;prefix-match&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;- &amp;lt;prefix-1&amp;gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;- &amp;lt;prefix-2&amp;gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;models&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;- &lt;span style=&#34;color:#0550ae&#34;&gt;name&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&amp;lt;model-name&amp;gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;aliases&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;[&lt;/span&gt;&amp;lt;alias-1&amp;gt;, &amp;lt;alias-2&amp;gt;]&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;input-estimated-cost-per-m&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&amp;lt;cost&amp;gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;output-estimated-cost-per-m&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&amp;lt;cost&amp;gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;table&gt;
  &lt;thead&gt;
      &lt;tr&gt;
          &lt;th style=&#34;text-align: left&#34;&gt;Field&lt;/th&gt;
          &lt;th style=&#34;text-align: left&#34;&gt;Type&lt;/th&gt;
          &lt;th style=&#34;text-align: left&#34;&gt;Required&lt;/th&gt;
          &lt;th style=&#34;text-align: left&#34;&gt;Description&lt;/th&gt;
      &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;provider&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;string&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;Yes&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;The provider identifier (e.g., &lt;code&gt;openai&lt;/code&gt;, &lt;code&gt;anthropic&lt;/code&gt;, &lt;code&gt;gemini&lt;/code&gt;). It is displayed as the Virtual GenAI service name in SkyWalking.&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;prefix-match&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;list[string]&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;Yes&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;A list of prefixes used to match model names to this provider. If a model name in the Trace data starts with any of these prefixes, it will be mapped to this provider.&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;models&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;list[model]&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;No&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;A list of model definitions containing pricing information. If omitted, the system can still identify the provider but will not perform cost estimation.&lt;/td&gt;
      &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;h3 id=&#34;model-definition&#34;&gt;Model Definition&lt;/h3&gt;
&lt;p&gt;Each entry under &lt;code&gt;models&lt;/code&gt; defines the pricing for a specific model:&lt;/p&gt;
&lt;table&gt;
  &lt;thead&gt;
      &lt;tr&gt;
          &lt;th style=&#34;text-align: left&#34;&gt;Field&lt;/th&gt;
          &lt;th style=&#34;text-align: left&#34;&gt;Type&lt;/th&gt;
          &lt;th style=&#34;text-align: left&#34;&gt;Required&lt;/th&gt;
          &lt;th style=&#34;text-align: left&#34;&gt;Description&lt;/th&gt;
      &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;name&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;string&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;Yes&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;The standard model name used for matching.&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;aliases&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;list[string]&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;No&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;Alternative names that should resolve to the same billing entry. This is useful when providers use different naming conventions (see the &amp;ldquo;Model Aliases&amp;rdquo; section).&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;input-estimated-cost-per-m&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;float&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;No&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;Estimated cost per 1,000,000 (one million) input (Prompt) tokens. The default unit is USD.&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;output-estimated-cost-per-m&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;float&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;No&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;Estimated cost per 1,000,000 (one million) output (Completion) tokens. The default unit is USD.&lt;/td&gt;
      &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;h2 id=&#34;model-matching-mechanism&#34;&gt;Model Matching Mechanism&lt;/h2&gt;
&lt;h3 id=&#34;provider-level-prefix-matching&#34;&gt;Provider-Level Prefix Matching&lt;/h3&gt;
&lt;p&gt;When SkyWalking receives a Trace containing a GenAI call, it determines the &lt;strong&gt;Provider&lt;/strong&gt; based on the following priority order:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;gen_ai.provider.name&lt;/code&gt; tag&lt;/strong&gt;: This tag is retrieved first. It follows the latest &lt;code&gt;OpenTelemetry&lt;/code&gt; GenAI semantic conventions.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;gen_ai.system&lt;/code&gt; tag&lt;/strong&gt;: If the above tag is missing, the system falls back to this legacy tag. Note: This tag is only parsed when processing OTLP or Zipkin format data, primarily for compatibility with older versions of libraries like the Python auto-instrumentation.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Prefix Matching&lt;/strong&gt;: If neither of the above tags exists, &lt;code&gt;SkyWalking&lt;/code&gt; reads the &lt;code&gt;prefix-match&lt;/code&gt; rules defined in &lt;code&gt;gen-ai-config.yml&lt;/code&gt; and attempts to identify the provider by matching the &lt;strong&gt;Model Name&lt;/strong&gt;.&lt;/li&gt;
&lt;/ol&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;- &lt;span style=&#34;color:#0550ae&#34;&gt;provider&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;openai&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;prefix-match&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;- gpt&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Any model name starting with &lt;strong&gt;gpt&lt;/strong&gt; (such as &lt;strong&gt;gpt-4o&lt;/strong&gt;, &lt;strong&gt;gpt-4.1-mini&lt;/strong&gt;, or &lt;strong&gt;gpt-5-nano&lt;/strong&gt;) will be mapped to the &lt;strong&gt;openai&lt;/strong&gt; provider.
A single provider can have multiple prefixes:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;- &lt;span style=&#34;color:#0550ae&#34;&gt;provider&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;tencent&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;prefix-match&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;- hunyuan&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;- Tencent&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id=&#34;model-level-longest-prefix-matching&#34;&gt;Model-level Longest-Prefix Matching&lt;/h3&gt;
&lt;p&gt;Once the provider is determined, SkyWalking uses a Trie-based longest-prefix matching algorithm to find the best billing entry. This is crucial because model names returned in provider API responses often include version numbers or timestamps, differing from the base model name in the config.
Example OpenAI config:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;models&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;- &lt;span style=&#34;color:#0550ae&#34;&gt;name&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;gpt-4o&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;input-estimated-cost-per-m&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;2.5&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;output-estimated-cost-per-m&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;10.0&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;- &lt;span style=&#34;color:#0550ae&#34;&gt;name&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;gpt-4o-mini&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;input-estimated-cost-per-m&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;0.15&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;output-estimated-cost-per-m&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;0.6&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Matching behavior:&lt;/p&gt;
&lt;table&gt;
  &lt;thead&gt;
      &lt;tr&gt;
          &lt;th style=&#34;text-align: left&#34;&gt;Model Name in Trace&lt;/th&gt;
          &lt;th style=&#34;text-align: left&#34;&gt;Matched Configuration Entry&lt;/th&gt;
          &lt;th style=&#34;text-align: left&#34;&gt;Reason&lt;/th&gt;
      &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;gpt-4o&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;gpt-4o&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;Exact match&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;gpt-4o-2024-08-06&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;gpt-4o&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;Longest prefix is &lt;code&gt;gpt-4o&lt;/code&gt;&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;gpt-4o-mini&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;gpt-4o-mini&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;Exact match (Longer prefix &lt;code&gt;gpt-4o-mini&lt;/code&gt; takes priority over &lt;code&gt;gpt-4o&lt;/code&gt;)&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;gpt-4o-mini-2024-07-18&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;gpt-4o-mini&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;Longest prefix is &lt;code&gt;gpt-4o-mini&lt;/code&gt;&lt;/td&gt;
      &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;This mechanism ensures versioned API model names map to the correct pricing tier without requiring exact full names in the configuration file.&lt;/p&gt;
&lt;h3 id=&#34;model-aliases&#34;&gt;Model Aliases&lt;/h3&gt;
&lt;p&gt;Some providers use different naming conventions across API responses and documentation. For example, Anthropic&amp;rsquo;s model might appear as &lt;code&gt;claude-4-sonnet&lt;/code&gt; or &lt;code&gt;claude-sonnet-4&lt;/code&gt;. The aliases field supports both formats under a single billing entry:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;- &lt;span style=&#34;color:#0550ae&#34;&gt;name&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;claude-4-sonnet&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;aliases&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;[&lt;/span&gt;claude-sonnet-4]&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;input-estimated-cost-per-m&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;3.0&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;output-estimated-cost-per-m&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;15.0&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Under this configuration, &lt;code&gt;claude-4-sonnet&lt;/code&gt; and &lt;code&gt;claude-sonnet-4&lt;/code&gt; (as well as any versioned variants, such as &lt;code&gt;claude-sonnet-4-20250514&lt;/code&gt;) will resolve to the same &lt;strong&gt;billing entry&lt;/strong&gt;.&lt;br&gt;
&lt;strong&gt;Note:&lt;/strong&gt; Aliases also participate in &lt;strong&gt;longest prefix matching&lt;/strong&gt;. Therefore, &lt;code&gt;claude-sonnet-4-20250514&lt;/code&gt; will match the alias &lt;code&gt;claude-sonnet-4&lt;/code&gt;, which in turn resolves to the pricing information for &lt;code&gt;claude-4-sonnet&lt;/code&gt;.&lt;/p&gt;
&lt;h2 id=&#34;custom-configuration&#34;&gt;Custom Configuration&lt;/h2&gt;
&lt;h3 id=&#34;adding-a-new-provider&#34;&gt;Adding a New Provider&lt;/h3&gt;
&lt;p&gt;To add a provider that is not included in the default configuration:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;providers&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#57606a&#34;&gt;# ... Existing providers ...&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;- &lt;span style=&#34;color:#0550ae&#34;&gt;provider&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;ollama&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;prefix-match&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;- mymodel&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;models&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;- &lt;span style=&#34;color:#0550ae&#34;&gt;name&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;mymodel-large&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;input-estimated-cost-per-m&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;1.0&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;output-estimated-cost-per-m&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;5.0&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;- &lt;span style=&#34;color:#0550ae&#34;&gt;name&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;mymodel-small&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;input-estimated-cost-per-m&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;0.1&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;output-estimated-cost-per-m&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;0.5&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;For OTLP/Zipkin data, a dedicated estimated tag has been added. You can now view the cost of each GenAI call directly on the UI.
&lt;img src=&#34;otlp-estimated-tag.png&#34; alt=&#34;otlp-estimated-tag&#34;&gt;&lt;/p&gt;
&lt;h1 id=&#34;main-metrics&#34;&gt;Main Metrics&lt;/h1&gt;
&lt;h2 id=&#34;1provider-level&#34;&gt;1.Provider Level&lt;/h2&gt;
&lt;table&gt;
  &lt;thead&gt;
      &lt;tr&gt;
          &lt;th style=&#34;text-align: left&#34;&gt;Metric ID&lt;/th&gt;
          &lt;th style=&#34;text-align: left&#34;&gt;Description&lt;/th&gt;
          &lt;th style=&#34;text-align: left&#34;&gt;Meaning&lt;/th&gt;
      &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;gen_ai_provider_cpm&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;Calls Per Minute&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;Requests per minute (Throughput)&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;gen_ai_provider_sla&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;Success Rate&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;Request success rate&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;gen_ai_provider_resp_time&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;Avg Response Time&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;Average response time&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;gen_ai_provider_latency_percentile&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;Latency Percentiles&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;Response time percentiles (P50, P75, P90, P95, P99)&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;gen_ai_provider_input_tokens_sum/avg&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;Input Token Usage&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;Total and average input token usage&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;gen_ai_provider_output_tokens_sum/avg&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;Output Token Usage&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;Total and average output token usage&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;gen_ai_provider_total_estimated_cost/avg&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;Estimated Cost&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;Total estimated cost and average cost per call&lt;/td&gt;
      &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;h2 id=&#34;2-model-level&#34;&gt;2. Model Level&lt;/h2&gt;
&lt;table&gt;
  &lt;thead&gt;
      &lt;tr&gt;
          &lt;th style=&#34;text-align: left&#34;&gt;Metric ID&lt;/th&gt;
          &lt;th style=&#34;text-align: left&#34;&gt;Description&lt;/th&gt;
          &lt;th style=&#34;text-align: left&#34;&gt;Meaning&lt;/th&gt;
      &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;gen_ai_model_call_cpm&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;Calls Per Minute&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;Requests per minute for this specific model&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;gen_ai_model_sla&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;Success Rate&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;Model-specific request success rate&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;gen_ai_model_latency_avg/percentile&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;Latency&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;Average and percentiles of model response duration&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;gen_ai_model_ttft_avg/percentile&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;TTFT&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;Time to First Token (Streaming only)&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;gen_ai_model_input_tokens_sum/avg&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;Input Token Usage&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;Detailed input token consumption for the model&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;gen_ai_model_output_tokens_sum/avg&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;Output Token Usage&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;Detailed output token consumption for the model&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;gen_ai_model_total_estimated_cost/avg&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;Estimated Cost&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;Estimated total cost and average cost for the model&lt;/td&gt;
      &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;h2 id=&#34;recommended-usage-scenarios&#34;&gt;Recommended Usage Scenarios&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Performance Evaluation: Use Latency and Time to First Token (TTFT) metrics to analyze model inference efficiency and the end-user interaction experience.&lt;/li&gt;
&lt;li&gt;Token Monitoring: Real-time monitoring of Input and Output token consumption to analyze resource utilization across different business scenarios.&lt;/li&gt;
&lt;li&gt;Cost Alerting: Set alert thresholds based on Estimated Cost or token consumption to promptly detect abnormal calls and prevent budget overruns.&lt;/li&gt;
&lt;/ul&gt;

      </description>
    </item>
    
    <item>
      <title>Zh: 基于 SkyWalking 10.4 的大模型应用监控：洞察 LLM 的性能与成本</title>
      <link>/zh/2026-04-05-virtual-genai-monitoring/</link>
      <pubDate>Sun, 05 Apr 2026 00:00:00 +0000</pubDate>
      <guid>/zh/2026-04-05-virtual-genai-monitoring/</guid>
      <description>
        
        
        &lt;h1 id=&#34;问题当应用开始吞噬大模型监控却留下了盲区&#34;&gt;问题：当应用开始“吞噬”大模型，监控却留下了盲区&lt;/h1&gt;
&lt;p&gt;随着生成式 AI（GenAI）在企业业务中的深度渗透，开发者正面临一个尴尬的局面：我们在应用中通过&lt;code&gt;Spring AI&lt;/code&gt;或&lt;code&gt;OpenAI SDK&lt;/code&gt;快速集成了强大的大模型能力，但对于这些调用的实际表现却几乎一无所知。&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;成本与性能的“黑盒”：昂贵的模型真的更具性价比吗？&lt;/strong&gt;&lt;br&gt;
面对高昂的大模型账单，我们往往只知道把钱交给了某个&lt;code&gt;Provider&lt;/code&gt;，却算不清这笔账在应用内部的“投入产出比”。
盲目的选型升级：为了追求更好的体验，你可能将业务默认切换到了成本更高的旗舰模型。但在具体的业务场景下，花费数倍的 Token 成本，它真的能在真实请求中带来更低的延迟和更快的 TTFT(Time to First Token) 吗？
缺乏真实的评估基准：脱离了真实的业务请求，单纯看官网的 Benchmark 意义不大，你需要知道在实际的 Prompt 长度和并发压力下，同一&lt;code&gt;Provider&lt;/code&gt;下的哪个模型能在“Token/Cost 消耗”与“响应速度”之间达到完美的平衡。如果没有应用侧的数据支撑，你根本无从判断哪款模型才是当前业务的最优解。&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;消失的“黄金超时时间”&lt;/strong&gt;&lt;br&gt;
很多团队在代码里给 LLM 调用设置超时（Timeout）时，往往是拍脑袋决定（比如 30s 或 60s）。&lt;br&gt;
设太短：长文本生成或模型高峰期时，请求会被频繁强行中断，导致业务失败率飙升。&lt;br&gt;
设太长：如果下游供应商出现故障（卡死），大量的请求会堆积在应用内存中，阻塞执行线程，最终导致整个 Java 应用甚至微服务集群的瘫痪。
只有真正掌握了预估的整体调用延迟（P99/P95 Latency），你才能基于数据而非直觉，为不同模型设置最合理的超时策略。&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;被忽视的体验杀手：TTFT&lt;/strong&gt;&lt;br&gt;
在 GenAI 场景下，用户对“快”的感知并不完全取决于整个对话结束的总耗时，而取决于**“第一行字什么时候跳出来”**。
一个总耗时 10 秒但 TTFT 仅 500ms 的流式响应，给用户的观感是“秒回”。
一个总耗时 5 秒但 TTFT 需要 4s 的非流式响应，给用户的观感却是“卡死”。
如果你的观测系统只能看到总耗时，你就会漏掉最核心的 UX 指标，无法解释为什么用户反馈“AI 很慢”即便总耗时看起来还行。&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;strong&gt;SkyWalking 10.4：应用视角的“数字仪表盘”&lt;/strong&gt;&lt;br&gt;
Apache SkyWalking 自 10.4 版本引入的 Virtual GenAI 能力，正是为了解决应用层侧的这种“观测真空”。它不依赖任何外部网关，直接通过应用侧探针（如 Java Agent）在客户端视角采集最真实的数据。&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;精准的延迟分布（Latency Percentiles）：通过 P50、P90、P99 等多维指标，帮你勾勒出 LLM 调用的真实波动曲线，为设置“动态超时时间”提供科学依据。&lt;/li&gt;
&lt;li&gt;核心 UX 指标——TTFT 监控：原生支持流式（Streaming）调用的首字延迟统计。通过对比不同 Provider 或不同模型的 TTFT，你可以优化提示词（Prompt）策略或切换更快的模型，确保用户体验始终在线。&lt;/li&gt;
&lt;li&gt;多维度的模型“画像”分析：在 Provider 和 Model 两个维度上，将 Token 消耗、预估成本与性能指标深度对齐。这让你不再看供应商全网的“理想平均数”，而是看清你的应用在调用特定模型时的真实表现，从而在复杂的模型生态中选出最具性价比的选型方案。&lt;/li&gt;
&lt;/ul&gt;
&lt;h1 id=&#34;虚拟-genai-观测&#34;&gt;虚拟 GenAI 观测&lt;/h1&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;虚拟 GenAI&lt;/strong&gt; 代表了由探针插件检测到的生成式 AI 服务节点。GenAI 操作的性能指标均基于 &lt;strong&gt;GenAI 客户端视角&lt;/strong&gt;。&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;例如，Java 探针中的 &lt;strong&gt;Spring AI 插件&lt;/strong&gt;可以检测一次对话补全（Chat Completion）请求的响应延迟。随后，SkyWalking 将在仪表盘中展示：&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;流量与成功率&lt;/strong&gt; (CPM &amp;amp; SLA)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;响应延迟&lt;/strong&gt; (Latency &amp;amp; TTFT)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Token 消耗&lt;/strong&gt; (Input/Output)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;预估成本&lt;/strong&gt; (Estimated Cost)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;如图：
&lt;img src=&#34;provider-dashboard-1.png&#34; alt=&#34;provider-dashboard-1.png&#34;&gt;
&lt;img src=&#34;provider-dashboard-2.png&#34; alt=&#34;provider-dashboard-2.png&#34;&gt;
&lt;img src=&#34;provider-dashboard-3.png&#34; alt=&#34;provider-dashboard-3.png&#34;&gt;
&lt;img src=&#34;model-dashboard-1.png&#34; alt=&#34;model-dashboard-1.png&#34;&gt;
&lt;img src=&#34;model-dashboard-2.png&#34; alt=&#34;model-dashboard-2.png&#34;&gt;
&lt;img src=&#34;model-dashboard-3.png&#34; alt=&#34;model-dashboard-3.png&#34;&gt;&lt;/p&gt;
&lt;h1 id=&#34;原理&#34;&gt;原理&lt;/h1&gt;
&lt;p&gt;当 SkyWalking Java Agent 或 OTLP 探针拦截到主流 AI 框架（如 Spring AI、OpenAI SDK 等）的调用时，将Trace 数据上报至 SkyWalking OAP。
OAP会基于这些 Trace 自动完成数据的聚合与计算。最终会生成 Provider（服务商）与 Model（模型）两个维度的各类性能指标，并直接渲染填充至内置的 Virtual-GenAI 仪表盘中。&lt;/p&gt;
&lt;h1 id=&#34;安装配置&#34;&gt;安装配置&lt;/h1&gt;
&lt;h2 id=&#34;要求&#34;&gt;要求&lt;/h2&gt;
&lt;h3 id=&#34;版本要求&#34;&gt;版本要求&lt;/h3&gt;
&lt;p&gt;● SkyWalking Java Agent: &amp;gt;= 9.7
● SkyWalking Oap: &amp;gt;= 10.4&lt;/p&gt;
&lt;h3 id=&#34;语义规范与兼容性&#34;&gt;语义规范与兼容性&lt;/h3&gt;
&lt;p&gt;SkyWalking 虚拟 GenAI 遵循&lt;code&gt; OpenTelemetry GenAI&lt;/code&gt; 语义规范。OAP 将根据以下标准识别 GenAI 相关 Span：&lt;/p&gt;
&lt;h4 id=&#34;skywalking-java-agent&#34;&gt;SkyWalking Java Agent&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;上报的 Span 必须为 Exit 类型，其 SpanLayer 属性需设定为 GENAI,包含&lt;code&gt;gen_ai.response.model&lt;/code&gt; 标签。&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 id=&#34;输出otlp--zipkin格式数据的探针&#34;&gt;输出OTLP / Zipkin格式数据的探针&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;上报的 Span 中包含 &lt;code&gt;gen_ai.response.model&lt;/code&gt; 标签。&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;具体可以参考e2e配置&lt;br&gt;
&lt;a href=&#34;https://github.com/apache/skywalking/blob/master/test/e2e-v2/cases/virtual-genai/docker-compose.yml&#34;&gt;SkyWalking Java Agent上报数据&lt;/a&gt;&lt;br&gt;
&lt;a href=&#34;https://github.com/apache/skywalking/blob/master/test/e2e-v2/cases/otlp-virtual-genai/docker-compose.yml&#34;&gt;探针上报OTLP格式数据&lt;/a&gt;&lt;br&gt;
&lt;a href=&#34;https://github.com/apache/skywalking/blob/master/test/e2e-v2/cases/zipkin-virtual-genai/docker-compose.yml&#34;&gt;探针上报Zipkin格式数据&lt;/a&gt;&lt;/p&gt;
&lt;h1 id=&#34;genai-预估成本配置&#34;&gt;GenAI 预估成本配置&lt;/h1&gt;
&lt;h2 id=&#34;概览&#34;&gt;概览&lt;/h2&gt;
&lt;p&gt;SkyWalking 提供了一个内置的&lt;a href=&#34;https://github.com/apache/skywalking/blob/master/oap-server/server-starter/src/main/resources/gen-ai-config.yml&#34;&gt;GenAI计费配置文件&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;该配置定义了SkyWalking 如何将 Trace 数据中的模型名称映射到对应的供应商，并估算每次 LLM 调用的 Token 成本。估算成本将与 Trace 和指标数据一起显示在 SkyWalking UI 中，帮助用户直观了解 GenAI 使用带来的 预估费用影响。
重要提示: 此文件中的定价仅用于成本估算，不得视为实际账单或发票金额。建议用户定期从供应商官方定价页面核实最新费率。&lt;/p&gt;
&lt;h2 id=&#34;配置结构&#34;&gt;配置结构&lt;/h2&gt;
&lt;h3 id=&#34;top-字段&#34;&gt;Top 字段&lt;/h3&gt;
&lt;table&gt;
  &lt;thead&gt;
      &lt;tr&gt;
          &lt;th style=&#34;text-align: left&#34;&gt;字段&lt;/th&gt;
          &lt;th style=&#34;text-align: left&#34;&gt;类型&lt;/th&gt;
          &lt;th style=&#34;text-align: left&#34;&gt;描述&lt;/th&gt;
      &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;last-updated&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;date&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;定价数据的最后更新日期。所有价格均基于该日期前各厂商官网公布的公开计费标准。&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;providers&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;list&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;GenAI 厂商定义列表。每个厂商条目下包含匹配规则（matching rules）以及具体的模型计费信息（model pricing）。&lt;/td&gt;
      &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;h3 id=&#34;provider-定义&#34;&gt;provider 定义&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;providers&lt;/code&gt; 下的每个条目定义一个 GenAI 供应商：&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;providers&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;- &lt;span style=&#34;color:#0550ae&#34;&gt;provider&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&amp;lt;provider-name&amp;gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;prefix-match&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;- &amp;lt;prefix-1&amp;gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;- &amp;lt;prefix-2&amp;gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;models&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;- &lt;span style=&#34;color:#0550ae&#34;&gt;name&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&amp;lt;model-name&amp;gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;aliases&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;[&lt;/span&gt;&amp;lt;alias-1&amp;gt;, &amp;lt;alias-2&amp;gt;]&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;input-estimated-cost-per-m&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&amp;lt;cost&amp;gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;output-estimated-cost-per-m&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&amp;lt;cost&amp;gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;table&gt;
  &lt;thead&gt;
      &lt;tr&gt;
          &lt;th style=&#34;text-align: left&#34;&gt;字段 (Field)&lt;/th&gt;
          &lt;th style=&#34;text-align: left&#34;&gt;类型 (Type)&lt;/th&gt;
          &lt;th style=&#34;text-align: left&#34;&gt;必填 (Required)&lt;/th&gt;
          &lt;th style=&#34;text-align: left&#34;&gt;描述 (Description)&lt;/th&gt;
      &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;provider&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;string&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;是&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;供应商标识（如 &lt;code&gt;openai&lt;/code&gt;, &lt;code&gt;anthropic&lt;/code&gt;, &lt;code&gt;gemini&lt;/code&gt;）。在 SkyWalking 中作为虚拟 GenAI 服务名显示。&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;prefix-match&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;list[string]&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;是&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;用于将模型名称匹配到该供应商的前缀列表。如果 Trace 数据中的模型名以其中任一前缀开头，则会被映射到该供应商。&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;models&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;list[model]&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;否&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;包含定价信息的模型定义列表。如果省略，系统仍能识别供应商，但不会进行成本估算。&lt;/td&gt;
      &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;h3 id=&#34;model-定义&#34;&gt;model 定义&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;models&lt;/code&gt; 下的每个条目定义特定模型的定价：&lt;/p&gt;
&lt;table&gt;
  &lt;thead&gt;
      &lt;tr&gt;
          &lt;th style=&#34;text-align: left&#34;&gt;字段 (Field)&lt;/th&gt;
          &lt;th style=&#34;text-align: left&#34;&gt;类型 (Type)&lt;/th&gt;
          &lt;th style=&#34;text-align: left&#34;&gt;必填 (Required)&lt;/th&gt;
          &lt;th style=&#34;text-align: left&#34;&gt;描述 (Description)&lt;/th&gt;
      &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;name&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;string&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;是&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;用于匹配的标准模型名称。&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;aliases&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;list[string]&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;否&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;应解析为同一计费条目的备选名称。当供应商使用不同的命名习惯时非常有用（参见“模型别名”部分）。&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;input-estimated-cost-per-m&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;float&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;否&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;每 1,000,000（一百万）输入（Prompt）Token 的预估成本。默认单位为 USD。&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;output-estimated-cost-per-m&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;float&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;否&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;每 1,000,000（一百万）输出（Completion）Token 的预估成本。默认单位为 USD。&lt;/td&gt;
      &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;h2 id=&#34;模型匹配机制&#34;&gt;模型匹配机制&lt;/h2&gt;
&lt;h3 id=&#34;供应商级前缀匹配&#34;&gt;供应商级前缀匹配&lt;/h3&gt;
&lt;p&gt;当 SkyWalking 接收到包含 GenAI 调用的 Trace 时，会按照以下优先级顺序来确定供应商（Provider）：&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;gen_ai.provider.name&lt;/code&gt; 标签：首先检索此标签。它是&lt;code&gt;OpenTelemetry&lt;/code&gt;最新的语义规范。&lt;/li&gt;
&lt;li&gt;&lt;code&gt;gen_ai.system&lt;/code&gt; 标签：如果缺少上述标签，系统将回退到此旧版（Legacy）标签。注意：此标签仅在处理 OTLP 或 Zipkin 协议的数据时会被解析，主要用于兼容旧版的 Python 自动仪表化等库。&lt;/li&gt;
&lt;li&gt;前缀匹配 (Prefix Matching)：若上述两个标签均不存在，&lt;code&gt;SkyWalking&lt;/code&gt; 会读取 &lt;code&gt;gen-ai-config.yml&lt;/code&gt; 中定义的 prefix-match 规则，通过匹配 模型名称 (Model Name) 来尝试识别供应商。&lt;/li&gt;
&lt;/ul&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;- &lt;span style=&#34;color:#0550ae&#34;&gt;provider&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;openai&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;prefix-match&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;- gpt&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;任何以 gpt 开头的模型名称（如 gpt-4o, gpt-4.1-mini, gpt-5-nano）都会被映射到 openai 供应商。
一个供应商可以拥有多个前缀：&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;- &lt;span style=&#34;color:#0550ae&#34;&gt;provider&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;tencent&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;prefix-match&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;- hunyuan&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;- Tencent&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id=&#34;模型级最长前缀匹配-model-level-longest-prefix-matching&#34;&gt;模型级最长前缀匹配 (Model-Level Longest-Prefix Matching)&lt;/h3&gt;
&lt;p&gt;一旦确定了供应商，SkyWalking 会使用基于前缀树 (Trie) 的最长前缀匹配算法来查找最佳的模型计费条目。这至关重要，因为 LLM 供应商在 API 响应中返回的模型名称通常包含版本号或时间戳，与配置中的基础模型名称有所不同。
示例： 假设 OpenAI 的配置条目如下：&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;models&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;- &lt;span style=&#34;color:#0550ae&#34;&gt;name&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;gpt-4o&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;input-estimated-cost-per-m&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;2.5&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;output-estimated-cost-per-m&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;10.0&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;- &lt;span style=&#34;color:#0550ae&#34;&gt;name&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;gpt-4o-mini&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;input-estimated-cost-per-m&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;0.15&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;output-estimated-cost-per-m&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;0.6&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;其匹配行为如下表所示：&lt;/p&gt;
&lt;table&gt;
  &lt;thead&gt;
      &lt;tr&gt;
          &lt;th style=&#34;text-align: left&#34;&gt;Trace 中的模型名称&lt;/th&gt;
          &lt;th style=&#34;text-align: left&#34;&gt;匹配的配置条目&lt;/th&gt;
          &lt;th style=&#34;text-align: left&#34;&gt;原因&lt;/th&gt;
      &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;gpt-4o&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;gpt-4o&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;完全匹配&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;gpt-4o-2024-08-06&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;gpt-4o&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;最长前缀为 &lt;code&gt;gpt-4o&lt;/code&gt;&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;gpt-4o-mini&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;gpt-4o-mini&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;完全匹配（比 &lt;code&gt;gpt-4o&lt;/code&gt; 更长的前缀优先）&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;gpt-4o-mini-2024-07-18&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;gpt-4o-mini&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;最长前缀为 &lt;code&gt;gpt-4o-mini&lt;/code&gt;&lt;/td&gt;
      &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;这种机制确保了 API 返回的带有版本的模型名称能够被正确映射到相应的价格档位，而无需在配置文件中维护精确的全名。&lt;/p&gt;
&lt;h3 id=&#34;模型别名-model-aliases&#34;&gt;模型别名 (Model Aliases)&lt;/h3&gt;
&lt;p&gt;部分供应商在 API 响应和官方文档中会使用不同的命名规范。例如，Anthropic 的模型在 Trace 中可能显示为 claude-4-sonnet 或 claude-sonnet-4。通过 aliases 字段，可以让单个计费条目同时支持这两种配置：&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;- &lt;span style=&#34;color:#0550ae&#34;&gt;name&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;claude-4-sonnet&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;aliases&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;[&lt;/span&gt;claude-sonnet-4]&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;input-estimated-cost-per-m&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;3.0&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;output-estimated-cost-per-m&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;15.0&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;在这种配置下，&lt;code&gt;claude-4-sonnet&lt;/code&gt; 和 &lt;code&gt;claude-sonnet-4&lt;/code&gt;（以及任何带有版本的变体，如 &lt;code&gt;claude-sonnet-4-20250514&lt;/code&gt;）都会解析为同一个计费条目。&lt;br&gt;
&lt;strong&gt;注意&lt;/strong&gt;： 别名同样参与最长前缀匹配。因此，&lt;code&gt;claude-sonnet-4-20250514&lt;/code&gt; 会匹配到别名 &lt;code&gt;claude-sonnet-4&lt;/code&gt;，进而解析到 &lt;code&gt;claude-4-sonnet&lt;/code&gt; 的定价信息。&lt;/p&gt;
&lt;h2 id=&#34;自定义配置&#34;&gt;自定义配置&lt;/h2&gt;
&lt;p&gt;添加新供应商 (Adding a New Provider)
要添加默认配置中未包含的供应商：&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;providers&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#57606a&#34;&gt;# ... 现有供应商 ...&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;- &lt;span style=&#34;color:#0550ae&#34;&gt;provider&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;ollama&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;prefix-match&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;- mymodel&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;models&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;- &lt;span style=&#34;color:#0550ae&#34;&gt;name&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;mymodel-large&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;input-estimated-cost-per-m&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;1.0&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;output-estimated-cost-per-m&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;5.0&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;- &lt;span style=&#34;color:#0550ae&#34;&gt;name&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;mymodel-small&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;input-estimated-cost-per-m&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;0.1&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;output-estimated-cost-per-m&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;0.5&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;针对OTLP/zipkin的数据，新增了单独的estimated tag, 可以在UI上看到这次GenAI调用消耗的cost。&lt;br&gt;
&lt;img src=&#34;otlp-estimated-tag.png&#34; alt=&#34;otlp-estimated-tag&#34;&gt;&lt;/p&gt;
&lt;h1 id=&#34;主要指标&#34;&gt;主要指标&lt;/h1&gt;
&lt;h2 id=&#34;1-provider-level-服务商维度&#34;&gt;1. Provider Level (服务商维度)&lt;/h2&gt;
&lt;table&gt;
  &lt;thead&gt;
      &lt;tr&gt;
          &lt;th style=&#34;text-align: left&#34;&gt;指标 ID&lt;/th&gt;
          &lt;th style=&#34;text-align: left&#34;&gt;描述&lt;/th&gt;
          &lt;th style=&#34;text-align: left&#34;&gt;含义&lt;/th&gt;
      &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;gen_ai_provider_cpm&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;Calls Per Minute&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;每分钟请求数 (吞吐量)&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;gen_ai_provider_sla&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;Success Rate&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;请求成功率&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;gen_ai_provider_resp_time&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;Avg Response Time&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;平均响应耗时&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;gen_ai_provider_latency_percentile&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;Latency Percentiles&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;响应耗时百分位数 (P50, P75, P90, P95, P99)&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;gen_ai_provider_input_tokens_sum/avg&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;Input Token Usage&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;输入 Token 的总和及平均值&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;gen_ai_provider_output_tokens_sum/avg&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;Output Token Usage&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;输出 Token 的总和及平均值&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;gen_ai_provider_total_estimated_cost/avg&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;Estimated Cost&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;预估总成本及次均成本&lt;/td&gt;
      &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;h2 id=&#34;2-model-level-模型维度&#34;&gt;2. Model Level (模型维度)&lt;/h2&gt;
&lt;table&gt;
  &lt;thead&gt;
      &lt;tr&gt;
          &lt;th style=&#34;text-align: left&#34;&gt;指标 ID&lt;/th&gt;
          &lt;th style=&#34;text-align: left&#34;&gt;描述&lt;/th&gt;
          &lt;th style=&#34;text-align: left&#34;&gt;含义&lt;/th&gt;
      &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;gen_ai_model_call_cpm&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;Calls Per Minute&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;该特定模型的每分钟请求数&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;gen_ai_model_sla&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;Success Rate&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;模型请求成功率&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;gen_ai_model_latency_avg/percentile&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;Latency&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;模型响应耗时的平均值及百分位数&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;gen_ai_model_ttft_avg/percentile&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;TTFT&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;首个token响应时间 (仅限流式传输 Streaming)&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;gen_ai_model_input_tokens_sum/avg&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;Input Token Usage&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;该模型的输入 Token 消耗详情&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;gen_ai_model_output_tokens_sum/avg&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;Output Token Usage&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;该模型的输出 Token 消耗详情&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;gen_ai_model_total_estimated_cost/avg&lt;/code&gt;&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;Estimated Cost&lt;/td&gt;
          &lt;td style=&#34;text-align: left&#34;&gt;该模型的预估总成本及次均成本&lt;/td&gt;
      &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;h2 id=&#34;建议使用场景&#34;&gt;建议使用场景&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;性能评估：利用 响应延迟（Latency） 和 首字响应时间（TTFT） 指标，分析模型推理效率及终端用户交互体验。&lt;/li&gt;
&lt;li&gt;Token 监控：实时监控 输入（Input）与输出（Output）Token 的消耗，用于分析不同业务场景下的资源占用情况。&lt;/li&gt;
&lt;li&gt;成本预警：支持基于 预估成本（Cost） 或 Token 消耗量 配置告警阈值，及时发现异常调用，防止成本超支。&lt;/li&gt;
&lt;/ul&gt;

      </description>
    </item>
    
    <item>
      <title>Blog: Monitoring Envoy AI Gateway with Apache SkyWalking</title>
      <link>/blog/2026-04-02-envoy-ai-gateway-monitoring/</link>
      <pubDate>Thu, 02 Apr 2026 00:00:00 +0000</pubDate>
      <guid>/blog/2026-04-02-envoy-ai-gateway-monitoring/</guid>
      <description>
        
        
        &lt;h2 id=&#34;the-problem-flying-blind-with-llm-traffic&#34;&gt;The Problem: Flying Blind with LLM Traffic&lt;/h2&gt;
&lt;p&gt;LLM traffic is becoming a first-class citizen in production infrastructure. Teams are calling OpenAI, Anthropic,
AWS Bedrock, Azure OpenAI, Google Gemini — often multiple providers at once. But most organizations have
no unified visibility into this traffic:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Token costs spiral&lt;/strong&gt; without knowing which teams, models, or providers drive the spend.
A single misconfigured prompt template can burn through thousands of dollars before anyone notices.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Provider outages cause cascading failures.&lt;/strong&gt; When OpenAI has a bad hour, your application goes down
with it — and you have no failover visibility to understand what happened or switch providers automatically.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;No unified metrics&lt;/strong&gt; across heterogeneous LLM calls. Latency, Time to First Token (TTFT),
Time Per Output Token (TPOT), token usage, error rates — each provider reports these differently,
if at all. There is no single dashboard to compare them.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This is the same observability gap that microservices faced a decade ago. The solution then was
service meshes and API gateways with built-in telemetry. For AI workloads, the answer is an AI gateway.&lt;/p&gt;
&lt;h2 id=&#34;why-an-ai-gateway&#34;&gt;Why an AI Gateway&lt;/h2&gt;
&lt;p&gt;&lt;a href=&#34;https://aigateway.envoyproxy.io/&#34;&gt;Envoy AI Gateway&lt;/a&gt; is an open-source AI gateway built on top of
&lt;a href=&#34;https://www.envoyproxy.io/&#34;&gt;Envoy Proxy&lt;/a&gt; and &lt;a href=&#34;https://gateway.envoyproxy.io/&#34;&gt;Envoy Gateway&lt;/a&gt;.
It is not a standalone SaaS product or a Python proxy — it is infrastructure-grade software built on
the same Envoy that already handles traffic for a large portion of cloud-native deployments.&lt;/p&gt;
&lt;p&gt;Key capabilities:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Multi-provider routing&lt;/strong&gt; — supports 16+ AI providers (OpenAI, Anthropic, AWS Bedrock, Azure OpenAI,
Google Gemini, Mistral, Cohere, DeepSeek, and more) behind a unified API.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Token-based rate limiting&lt;/strong&gt; — rate limit by token consumption, not just request count.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Provider fallback&lt;/strong&gt; — automatic failover when a provider is down or slow.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Model virtualization&lt;/strong&gt; — abstract model names so applications are decoupled from specific providers.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Two-tier architecture&lt;/strong&gt; — a reference architecture with a centralized entry gateway (Tier 1) for
auth and global routing, and per-cluster gateways (Tier 2) for inference optimization.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;CNCF ecosystem native&lt;/strong&gt; — runs on Kubernetes, composes with existing Envoy filters, WASM plugins,
and standard Kubernetes Gateway API resources.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Because Envoy AI Gateway natively emits GenAI metrics and access logs via OTLP following
&lt;a href=&#34;https://opentelemetry.io/docs/specs/semconv/gen-ai/&#34;&gt;OpenTelemetry GenAI Semantic Conventions&lt;/a&gt;,
it plugs directly into any OpenTelemetry-compatible backend.&lt;/p&gt;
&lt;p&gt;Starting from SkyWalking 10.4.0, the OAP server natively receives and analyzes Envoy AI Gateway&amp;rsquo;s
OTLP metrics and access logs — no OpenTelemetry Collector needed in between.&lt;/p&gt;
&lt;h2 id=&#34;data-flow&#34;&gt;Data Flow&lt;/h2&gt;
&lt;p&gt;The AI Gateway pushes telemetry directly to SkyWalking via OTLP gRPC:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;workflow.jpg&#34; alt=&#34;Data flow&#34;&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Application&lt;/strong&gt; sends LLM API requests through the Envoy AI Gateway.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Envoy AI Gateway&lt;/strong&gt; routes requests to AI providers (or local models like Ollama)
and records GenAI metrics (token usage, latency, TTFT, TPOT) and access logs.&lt;/li&gt;
&lt;li&gt;The gateway pushes metrics and logs via &lt;strong&gt;OTLP gRPC&lt;/strong&gt; directly to &lt;strong&gt;SkyWalking OAP&lt;/strong&gt; on port 11800.&lt;/li&gt;
&lt;li&gt;SkyWalking OAP parses metrics with MAL rules and access logs with LAL rules,
then stores everything in &lt;strong&gt;BanyanDB&lt;/strong&gt;.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;No OpenTelemetry Collector is needed. SkyWalking OAP&amp;rsquo;s built-in OTLP receiver handles everything.&lt;/p&gt;
&lt;h2 id=&#34;try-it-locally&#34;&gt;Try It Locally&lt;/h2&gt;
&lt;p&gt;This demo uses &lt;a href=&#34;https://ollama.com/&#34;&gt;Ollama&lt;/a&gt; as a local LLM backend so you can try
everything without an API key. The &lt;a href=&#34;https://github.com/envoyproxy/ai-gateway/tree/main/cmd/aigw&#34;&gt;Envoy AI Gateway CLI&lt;/a&gt;
(&lt;code&gt;aigw&lt;/code&gt;) provides a standalone mode that runs outside Kubernetes — perfect for local testing.&lt;/p&gt;
&lt;h3 id=&#34;prerequisites&#34;&gt;Prerequisites&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;Docker and Docker Compose&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://ollama.com/&#34;&gt;Ollama&lt;/a&gt; installed on your host&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id=&#34;step-1-start-ollama&#34;&gt;Step 1: Start Ollama&lt;/h3&gt;
&lt;p&gt;Start Ollama on all interfaces so Docker containers can reach it:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#953800&#34;&gt;OLLAMA_HOST&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;0.0.0.0 ollama serve
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Pull a small model for testing:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;ollama pull llama3.2:1b
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id=&#34;step-2-start-the-stack&#34;&gt;Step 2: Start the Stack&lt;/h3&gt;
&lt;p&gt;Create a &lt;code&gt;docker-compose.yaml&lt;/code&gt;:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;services&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;banyandb&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;image&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;apache/skywalking-banyandb:0.10.0&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;container_name&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;banyandb&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;ports&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;17912:17912&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;command&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;standalone --stream-root-path /tmp/stream-data --measure-root-path /tmp/measure-data&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;healthcheck&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;test&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;[&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;CMD-SHELL&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;wget -qO- http://localhost:17913/api/healthz || exit 1&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;]&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;interval&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;5s&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;timeout&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;3s&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;retries&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;10&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;oap&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;image&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;apache/skywalking-oap-server:10.4.0&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;container_name&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;oap&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;depends_on&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;banyandb&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;condition&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;service_healthy&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;ports&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;11800:11800&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;12800:12800&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;environment&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;SW_STORAGE&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;banyandb&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;SW_STORAGE_BANYANDB_TARGETS&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;banyandb:17912&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;healthcheck&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;test&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;[&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;CMD-SHELL&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;bash -c &amp;#39;echo &amp;gt; /dev/tcp/localhost/12800&amp;#39; || exit 1&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;]&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;interval&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;10s&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;timeout&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;5s&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;retries&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;30&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;start_period&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;60s&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;ui&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;image&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;apache/skywalking-ui:10.4.0&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;container_name&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;ui&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;depends_on&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;oap&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;condition&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;service_healthy&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;ports&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;8080:8080&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;environment&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;SW_OAP_ADDRESS&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;http://oap:12800&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;aigw&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;image&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;envoyproxy/ai-gateway-cli:latest&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;container_name&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;aigw&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;depends_on&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;oap&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;condition&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;service_healthy&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;environment&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- OPENAI_BASE_URL=http://host.docker.internal:11434/v1&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- OPENAI_API_KEY=unused&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- OTEL_SERVICE_NAME=my-ai-gateway&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- OTEL_EXPORTER_OTLP_ENDPOINT=http://oap:11800&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- OTEL_EXPORTER_OTLP_PROTOCOL=grpc&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- OTEL_METRICS_EXPORTER=otlp&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- OTEL_LOGS_EXPORTER=otlp&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- OTEL_METRIC_EXPORT_INTERVAL=5000&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- OTEL_RESOURCE_ATTRIBUTES=job_name=envoy-ai-gateway,service.instance.id=aigw-1,service.layer=ENVOY_AI_GATEWAY&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;ports&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;1975:1975&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;extra_hosts&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;host.docker.internal:host-gateway&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;command&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;[&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;run&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;]&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Start everything:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;docker compose up -d
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Wait for all services to become healthy (BanyanDB starts first, then OAP, then UI and AI Gateway):&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;docker compose ps
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The key OTLP configuration on the &lt;code&gt;aigw&lt;/code&gt; service:&lt;/p&gt;
&lt;table&gt;
  &lt;thead&gt;
      &lt;tr&gt;
          &lt;th&gt;Env Var&lt;/th&gt;
          &lt;th&gt;Value&lt;/th&gt;
          &lt;th&gt;Purpose&lt;/th&gt;
      &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
      &lt;tr&gt;
          &lt;td&gt;&lt;code&gt;OTEL_SERVICE_NAME&lt;/code&gt;&lt;/td&gt;
          &lt;td&gt;&lt;code&gt;my-ai-gateway&lt;/code&gt;&lt;/td&gt;
          &lt;td&gt;Service name in SkyWalking&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;&lt;code&gt;OTEL_EXPORTER_OTLP_ENDPOINT&lt;/code&gt;&lt;/td&gt;
          &lt;td&gt;&lt;code&gt;http://oap:11800&lt;/code&gt;&lt;/td&gt;
          &lt;td&gt;SkyWalking OAP gRPC endpoint&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;&lt;code&gt;OTEL_EXPORTER_OTLP_PROTOCOL&lt;/code&gt;&lt;/td&gt;
          &lt;td&gt;&lt;code&gt;grpc&lt;/code&gt;&lt;/td&gt;
          &lt;td&gt;OTLP transport&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;&lt;code&gt;OTEL_METRICS_EXPORTER&lt;/code&gt;&lt;/td&gt;
          &lt;td&gt;&lt;code&gt;otlp&lt;/code&gt;&lt;/td&gt;
          &lt;td&gt;Enable metrics push&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;&lt;code&gt;OTEL_LOGS_EXPORTER&lt;/code&gt;&lt;/td&gt;
          &lt;td&gt;&lt;code&gt;otlp&lt;/code&gt;&lt;/td&gt;
          &lt;td&gt;Enable access log push&lt;/td&gt;
      &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The &lt;code&gt;OTEL_RESOURCE_ATTRIBUTES&lt;/code&gt; must include:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;job_name=envoy-ai-gateway&lt;/code&gt; — routing tag for MAL/LAL rules&lt;/li&gt;
&lt;li&gt;&lt;code&gt;service.instance.id=&amp;lt;id&amp;gt;&lt;/code&gt; — instance identity&lt;/li&gt;
&lt;li&gt;&lt;code&gt;service.layer=ENVOY_AI_GATEWAY&lt;/code&gt; — routes logs to AI Gateway LAL rules&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The MAL and LAL rules are enabled by default in SkyWalking OAP. No OAP-side configuration is needed.&lt;/p&gt;
&lt;h3 id=&#34;step-3-run-the-demo-app&#34;&gt;Step 3: Run the Demo App&lt;/h3&gt;
&lt;p&gt;Create a simple Python application that sends requests through the AI Gateway (&lt;code&gt;app.py&lt;/code&gt;).
It mixes normal requests, streaming requests (for TTFT/TPOT metrics), and error requests
(non-existent model → HTTP 404, always captured by the LAL sampling policy):&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-python&#34; data-lang=&#34;python&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#cf222e&#34;&gt;import&lt;/span&gt; &lt;span style=&#34;color:#24292e&#34;&gt;time&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#24292e&#34;&gt;random&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#24292e&#34;&gt;requests&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;GATEWAY &lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;http://localhost:1975&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;HEADERS &lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#1f2328&#34;&gt;{&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;Authorization&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;Bearer unused&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;Content-Type&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;application/json&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;questions &lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#1f2328&#34;&gt;[&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;What is Apache SkyWalking? Answer in one sentence.&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;What is Envoy Proxy used for? Answer in one sentence.&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;What are the benefits of an AI gateway? Answer in two sentences.&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;Explain observability in three sentences.&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#cf222e&#34;&gt;def&lt;/span&gt; &lt;span style=&#34;color:#6639ba&#34;&gt;chat&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;model&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; question&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; stream&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#cf222e&#34;&gt;False&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;):&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    resp &lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt; requests&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;post&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#0a3069&#34;&gt;f&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;{&lt;/span&gt;GATEWAY&lt;span style=&#34;color:#0a3069&#34;&gt;}&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;/v1/chat/completions&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        json&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;{&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;model&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt; model&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;messages&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#1f2328&#34;&gt;[{&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;role&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;user&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;content&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt; question&lt;span style=&#34;color:#1f2328&#34;&gt;}],&lt;/span&gt; &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;stream&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt; stream&lt;span style=&#34;color:#1f2328&#34;&gt;},&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        headers&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;HEADERS&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; timeout&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;60&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; stream&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;stream&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#1f2328&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#cf222e&#34;&gt;if&lt;/span&gt; stream&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        chunks &lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#1f2328&#34;&gt;[]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#cf222e&#34;&gt;for&lt;/span&gt; line &lt;span style=&#34;color:#0550ae&#34;&gt;in&lt;/span&gt; resp&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;iter_lines&lt;span style=&#34;color:#1f2328&#34;&gt;():&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            &lt;span style=&#34;color:#cf222e&#34;&gt;if&lt;/span&gt; line&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                chunks&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;append&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;line&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;decode&lt;span style=&#34;color:#1f2328&#34;&gt;())&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#cf222e&#34;&gt;return&lt;/span&gt; resp&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;status_code&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#0a3069&#34;&gt;f&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;[streamed &lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;{&lt;/span&gt;&lt;span style=&#34;color:#6639ba&#34;&gt;len&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;chunks&lt;span style=&#34;color:#1f2328&#34;&gt;)&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;}&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt; chunks]&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#cf222e&#34;&gt;return&lt;/span&gt; resp&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;status_code&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; resp&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;json&lt;span style=&#34;color:#1f2328&#34;&gt;()&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#cf222e&#34;&gt;while&lt;/span&gt; &lt;span style=&#34;color:#cf222e&#34;&gt;True&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    r &lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt; random&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;random&lt;span style=&#34;color:#1f2328&#34;&gt;()&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#cf222e&#34;&gt;if&lt;/span&gt; r &lt;span style=&#34;color:#0550ae&#34;&gt;&amp;lt;&lt;/span&gt; &lt;span style=&#34;color:#0550ae&#34;&gt;0.2&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#57606a&#34;&gt;# Error request: non-existent model triggers 404&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        status&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; body &lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt; chat&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;non-existent-model&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;hello&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#6639ba&#34;&gt;print&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;f&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;[error] model=non-existent-model status=&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;{&lt;/span&gt;status&lt;span style=&#34;color:#0a3069&#34;&gt;}&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#cf222e&#34;&gt;elif&lt;/span&gt; r &lt;span style=&#34;color:#0550ae&#34;&gt;&amp;lt;&lt;/span&gt; &lt;span style=&#34;color:#0550ae&#34;&gt;0.5&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#57606a&#34;&gt;# Streaming request — generates TTFT and TPOT metrics&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        q &lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt; random&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;choice&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;questions&lt;span style=&#34;color:#1f2328&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        status&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; info &lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt; chat&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;llama3.2:1b&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; q&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; stream&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#cf222e&#34;&gt;True&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#6639ba&#34;&gt;print&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;f&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;[stream] status=&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;{&lt;/span&gt;status&lt;span style=&#34;color:#0a3069&#34;&gt;}&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;{&lt;/span&gt;info&lt;span style=&#34;color:#0a3069&#34;&gt;}&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#cf222e&#34;&gt;else&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#57606a&#34;&gt;# Normal non-streaming request&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        q &lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt; random&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;choice&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;questions&lt;span style=&#34;color:#1f2328&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        status&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; body &lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt; chat&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;llama3.2:1b&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; q&lt;span style=&#34;color:#1f2328&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        answer &lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt; body&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;get&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;choices&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#1f2328&#34;&gt;[{}])[&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;0&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;]&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;get&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;message&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#1f2328&#34;&gt;{})&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;get&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;content&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;)[:&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;80&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        tokens &lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt; body&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;get&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;usage&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#1f2328&#34;&gt;{})&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#6639ba&#34;&gt;print&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;f&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;[ok] status=&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;{&lt;/span&gt;status&lt;span style=&#34;color:#0a3069&#34;&gt;}&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt; tokens=&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;{&lt;/span&gt;tokens&lt;span style=&#34;color:#0a3069&#34;&gt;}&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt; answer=&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;{&lt;/span&gt;answer&lt;span style=&#34;color:#0a3069&#34;&gt;}&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;...&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    time&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;sleep&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;random&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;randint&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;20&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#0550ae&#34;&gt;30&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;))&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Run it:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;pip install requests
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;python app.py
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The application talks to the AI Gateway on port 1975, which routes to Ollama.
Each request generates GenAI metrics (token usage, latency, TTFT, TPOT) and access logs
that the gateway pushes to SkyWalking via OTLP.&lt;/p&gt;
&lt;p&gt;The error requests (non-existent model → HTTP 404) are always captured by the access log
sampling policy, so you will see them in the SkyWalking log view.&lt;/p&gt;
&lt;h3 id=&#34;step-4-view-in-skywalking-ui&#34;&gt;Step 4: View in SkyWalking UI&lt;/h3&gt;
&lt;p&gt;Open &lt;a href=&#34;http://localhost:8080&#34;&gt;http://localhost:8080&lt;/a&gt; and select the &lt;strong&gt;GenAI &amp;gt; Envoy AI Gateway&lt;/strong&gt; menu.&lt;/p&gt;
&lt;p&gt;The service list shows &lt;code&gt;my-ai-gateway&lt;/code&gt; with CPM, latency, and token rates at a glance:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;screen-1.png&#34; alt=&#34;Service list&#34;&gt;&lt;/p&gt;
&lt;p&gt;Click into the service to see the full dashboard — Request CPM, Latency (average + percentiles),
Input/Output Token Rates, TTFT, and TPOT:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;screen-2.png&#34; alt=&#34;Service dashboard&#34;&gt;&lt;/p&gt;
&lt;p&gt;The &lt;strong&gt;Providers&lt;/strong&gt; tab breaks down metrics by AI provider:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;screen-3.png&#34; alt=&#34;Provider breakdown&#34;&gt;&lt;/p&gt;
&lt;p&gt;The &lt;strong&gt;Models&lt;/strong&gt; tab shows per-model metrics including TTFT and TPOT (streaming only).
Note the &lt;code&gt;unknown&lt;/code&gt; model entries — these are the error requests with non-existent models:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;screen-4.png&#34; alt=&#34;Model breakdown&#34;&gt;&lt;/p&gt;
&lt;p&gt;The &lt;strong&gt;Log&lt;/strong&gt; tab shows access logs. The sampling policy drops normal successful responses
but always captures errors (HTTP 404) and high-token requests:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;screen-5.png&#34; alt=&#34;Access logs&#34;&gt;&lt;/p&gt;
&lt;h3 id=&#34;cleanup&#34;&gt;Cleanup&lt;/h3&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;docker compose down
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id=&#34;deploying-on-kubernetes&#34;&gt;Deploying on Kubernetes&lt;/h2&gt;
&lt;p&gt;For production deployments, Envoy AI Gateway runs as a full Kubernetes controller with
Envoy Gateway as the control plane. See the
&lt;a href=&#34;https://aigateway.envoyproxy.io/docs/getting-started/&#34;&gt;Envoy AI Gateway getting started guide&lt;/a&gt;
for Kubernetes installation.&lt;/p&gt;
&lt;p&gt;The OTLP configuration is the same — set the &lt;code&gt;OTEL_*&lt;/code&gt; environment variables on the
AI Gateway&amp;rsquo;s external processor to point at SkyWalking OAP&amp;rsquo;s gRPC port (11800).
See the &lt;a href=&#34;https://skywalking.apache.org/docs/main/next/en/setup/backend/backend-envoy-ai-gateway-monitoring/&#34;&gt;SkyWalking Envoy AI Gateway Monitoring&lt;/a&gt;
documentation for details.&lt;/p&gt;
&lt;h2 id=&#34;genai-observability-without-an-ai-gateway&#34;&gt;GenAI Observability Without an AI Gateway&lt;/h2&gt;
&lt;p&gt;Not every deployment uses an AI gateway. If your applications call LLM providers directly,
SkyWalking 10.4.0 also provides GenAI observability through the
&lt;a href=&#34;https://skywalking.apache.org/docs/main/next/en/setup/service-agent/virtual-genai/&#34;&gt;Virtual GenAI&lt;/a&gt; layer.&lt;/p&gt;
&lt;p&gt;This works with any SkyWalking-instrumented, OpenTelemetry-instrumented, or Zipkin-instrumented application.
When traces carry &lt;code&gt;gen_ai.*&lt;/code&gt; tags (following
&lt;a href=&#34;https://opentelemetry.io/docs/specs/semconv/gen-ai/&#34;&gt;OpenTelemetry GenAI Semantic Conventions&lt;/a&gt;),
SkyWalking derives per-provider and per-model metrics from the client side:
latency, token usage, success rate, and estimated cost.&lt;/p&gt;
&lt;p&gt;For Java applications, the SkyWalking Java Agent (9.7+) includes a Spring AI plugin that automatically
instruments calls to 13+ providers (OpenAI, Anthropic, AWS Bedrock, Google GenAI, DeepSeek, Mistral, etc.)
with the correct &lt;code&gt;gen_ai.*&lt;/code&gt; span tags — no code changes needed.&lt;/p&gt;
&lt;p&gt;This is a different use case from the Envoy AI Gateway monitoring covered above:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Envoy AI Gateway layer&lt;/strong&gt;: infrastructure-level observability — what the gateway sees across all traffic.
Best for platform teams managing centralized AI routing.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Virtual GenAI layer&lt;/strong&gt;: application-level observability — what each instrumented app sees for its own LLM calls.
Best for teams without a centralized gateway, or for per-application cost tracking.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;references&#34;&gt;References&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://aigateway.envoyproxy.io/&#34;&gt;Envoy AI Gateway&lt;/a&gt; — project site and documentation&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/envoyproxy/ai-gateway/tree/main/cmd/aigw&#34;&gt;Envoy AI Gateway CLI&lt;/a&gt; — standalone mode for local development&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://skywalking.apache.org/docs/main/next/en/setup/backend/backend-envoy-ai-gateway-monitoring/&#34;&gt;SkyWalking Envoy AI Gateway Monitoring&lt;/a&gt; — OAP setup doc&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://skywalking.apache.org/docs/main/next/en/setup/service-agent/virtual-genai/&#34;&gt;SkyWalking Virtual GenAI&lt;/a&gt; — client-side GenAI observability&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://opentelemetry.io/docs/specs/semconv/gen-ai/&#34;&gt;OpenTelemetry GenAI Semantic Conventions&lt;/a&gt; — the metric/attribute standard both projects follow&lt;/li&gt;
&lt;/ul&gt;

      </description>
    </item>
    
    <item>
      <title>Zh: 用 Apache SkyWalking 监控 Envoy AI Gateway</title>
      <link>/zh/2026-04-02-envoy-ai-gateway-monitoring/</link>
      <pubDate>Thu, 02 Apr 2026 00:00:00 +0000</pubDate>
      <guid>/zh/2026-04-02-envoy-ai-gateway-monitoring/</guid>
      <description>
        
        
        &lt;h2 id=&#34;问题llm-流量缺乏统一观测&#34;&gt;问题：LLM 流量缺乏统一观测&lt;/h2&gt;
&lt;p&gt;LLM 流量正在成为生产基础设施中不可忽视的一部分。团队同时在调用 OpenAI、Anthropic、AWS Bedrock、Azure OpenAI、Google Gemini——往往还不止一个提供商。但大多数组织对这些流量缺乏统一的可见性：&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Token 费用失控&lt;/strong&gt;，却不知道哪个团队、哪个模型、哪个提供商在烧钱。一个配置不当的 prompt 模板就可能在无人察觉的情况下烧掉几千美元。&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;提供商故障引发连锁反应。&lt;/strong&gt; OpenAI 出问题的那一小时，你的应用也跟着挂——而你既没有故障切换的可见性，也无法自动切换提供商。&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;缺乏统一指标。&lt;/strong&gt; 延迟、首 Token 耗时（TTFT）、每 Token 输出耗时（TPOT）、Token 用量、错误率——每个提供商的报告方式都不一样，有些甚至不提供。没有一个统一的面板能做对比。&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;这和十年前微服务面临的可观测性困境如出一辙。当时的解法是服务网格和内置遥测的 API 网关。对 AI 工作负载来说，答案就是 AI 网关。&lt;/p&gt;
&lt;h2 id=&#34;为什么选择-ai-网关&#34;&gt;为什么选择 AI 网关&lt;/h2&gt;
&lt;p&gt;&lt;a href=&#34;https://aigateway.envoyproxy.io/&#34;&gt;Envoy AI Gateway&lt;/a&gt; 是一个开源 AI 网关，构建在 &lt;a href=&#34;https://www.envoyproxy.io/&#34;&gt;Envoy Proxy&lt;/a&gt; 和 &lt;a href=&#34;https://gateway.envoyproxy.io/&#34;&gt;Envoy Gateway&lt;/a&gt; 之上。底层就是云原生世界里已经广泛部署的 Envoy，天然具备基础设施级的稳定性和性能。&lt;/p&gt;
&lt;p&gt;核心能力：&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;多提供商路由&lt;/strong&gt; —— 支持 16+ AI 提供商（OpenAI、Anthropic、AWS Bedrock、Azure OpenAI、Google Gemini、Mistral、Cohere、DeepSeek 等），统一 API 接入。&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;基于 Token 的限流&lt;/strong&gt; —— 按 Token 消耗限流，而不只是按请求数。&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;提供商故障切换&lt;/strong&gt; —— 某个提供商宕机或响应慢时自动切换。&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;模型虚拟化&lt;/strong&gt; —— 抽象模型名称，让应用与具体提供商解耦。&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;两层架构&lt;/strong&gt; —— 参考架构包含一个集中入口网关（Tier 1）负责认证和全局路由，以及每集群网关（Tier 2）负责推理优化。&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;CNCF 生态原生&lt;/strong&gt; —— 运行在 Kubernetes 上，兼容现有的 Envoy Filter、WASM 插件和标准 Kubernetes Gateway API 资源。&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Envoy AI Gateway 原生支持通过 OTLP 发送 GenAI 指标和访问日志，遵循 &lt;a href=&#34;https://opentelemetry.io/docs/specs/semconv/gen-ai/&#34;&gt;OpenTelemetry GenAI 语义约定&lt;/a&gt;，可以直接接入任何兼容 OpenTelemetry 的后端。&lt;/p&gt;
&lt;p&gt;从 SkyWalking 10.4.0 开始，OAP 原生接收和分析 Envoy AI Gateway 的 OTLP 指标和访问日志——中间不需要部署 OpenTelemetry Collector。&lt;/p&gt;
&lt;h2 id=&#34;数据流&#34;&gt;数据流&lt;/h2&gt;
&lt;p&gt;AI Gateway 通过 OTLP gRPC 直接将遥测数据推送到 SkyWalking：&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;workflow.jpg&#34; alt=&#34;数据流&#34;&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;应用&lt;/strong&gt; 通过 Envoy AI Gateway 发送 LLM API 请求。&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Envoy AI Gateway&lt;/strong&gt; 将请求路由到 AI 提供商（或 Ollama 这样的本地模型），同时记录 GenAI 指标（Token 用量、延迟、TTFT、TPOT）和访问日志。&lt;/li&gt;
&lt;li&gt;网关通过 &lt;strong&gt;OTLP gRPC&lt;/strong&gt; 直接将指标和日志推送到 &lt;strong&gt;SkyWalking OAP&lt;/strong&gt; 的 11800 端口。&lt;/li&gt;
&lt;li&gt;SkyWalking OAP 用 MAL 规则解析指标、用 LAL 规则解析访问日志，然后统一存储到 &lt;strong&gt;BanyanDB&lt;/strong&gt;。&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;不需要 OpenTelemetry Collector。SkyWalking OAP 内置的 OTLP 接收器可以直接处理所有数据。&lt;/p&gt;
&lt;h2 id=&#34;本地体验&#34;&gt;本地体验&lt;/h2&gt;
&lt;p&gt;这个 Demo 使用 &lt;a href=&#34;https://ollama.com/&#34;&gt;Ollama&lt;/a&gt; 作为本地 LLM 后端，不需要任何 API Key 就能跑起来。&lt;a href=&#34;https://github.com/envoyproxy/ai-gateway/tree/main/cmd/aigw&#34;&gt;Envoy AI Gateway CLI&lt;/a&gt;（&lt;code&gt;aigw&lt;/code&gt;）提供独立运行模式，不依赖 Kubernetes，非常适合本地测试。&lt;/p&gt;
&lt;h3 id=&#34;前置条件&#34;&gt;前置条件&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;Docker 和 Docker Compose&lt;/li&gt;
&lt;li&gt;主机上已安装 &lt;a href=&#34;https://ollama.com/&#34;&gt;Ollama&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id=&#34;第一步启动-ollama&#34;&gt;第一步：启动 Ollama&lt;/h3&gt;
&lt;p&gt;让 Ollama 监听所有网络接口，以便 Docker 容器能访问到：&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#953800&#34;&gt;OLLAMA_HOST&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;0.0.0.0 ollama serve
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;拉取一个小模型用于测试：&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;ollama pull llama3.2:1b
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id=&#34;第二步启动服务栈&#34;&gt;第二步：启动服务栈&lt;/h3&gt;
&lt;p&gt;创建 &lt;code&gt;docker-compose.yaml&lt;/code&gt;：&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;services&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;banyandb&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;image&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;apache/skywalking-banyandb:0.10.0&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;container_name&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;banyandb&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;ports&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;17912:17912&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;command&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;standalone --stream-root-path /tmp/stream-data --measure-root-path /tmp/measure-data&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;healthcheck&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;test&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;[&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;CMD-SHELL&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;wget -qO- http://localhost:17913/api/healthz || exit 1&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;]&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;interval&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;5s&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;timeout&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;3s&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;retries&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;10&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;oap&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;image&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;apache/skywalking-oap-server:10.4.0&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;container_name&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;oap&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;depends_on&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;banyandb&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;condition&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;service_healthy&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;ports&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;11800:11800&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;12800:12800&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;environment&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;SW_STORAGE&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;banyandb&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;SW_STORAGE_BANYANDB_TARGETS&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;banyandb:17912&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;healthcheck&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;test&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;[&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;CMD-SHELL&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;bash -c &amp;#39;echo &amp;gt; /dev/tcp/localhost/12800&amp;#39; || exit 1&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;]&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;interval&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;10s&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;timeout&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;5s&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;retries&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;30&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;start_period&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;60s&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;ui&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;image&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;apache/skywalking-ui:10.4.0&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;container_name&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;ui&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;depends_on&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;oap&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;condition&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;service_healthy&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;ports&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;8080:8080&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;environment&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;SW_OAP_ADDRESS&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;http://oap:12800&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;aigw&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;image&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;envoyproxy/ai-gateway-cli:latest&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;container_name&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;aigw&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;depends_on&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;oap&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;condition&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;service_healthy&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;environment&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- OPENAI_BASE_URL=http://host.docker.internal:11434/v1&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- OPENAI_API_KEY=unused&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- OTEL_SERVICE_NAME=my-ai-gateway&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- OTEL_EXPORTER_OTLP_ENDPOINT=http://oap:11800&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- OTEL_EXPORTER_OTLP_PROTOCOL=grpc&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- OTEL_METRICS_EXPORTER=otlp&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- OTEL_LOGS_EXPORTER=otlp&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- OTEL_METRIC_EXPORT_INTERVAL=5000&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- OTEL_RESOURCE_ATTRIBUTES=job_name=envoy-ai-gateway,service.instance.id=aigw-1,service.layer=ENVOY_AI_GATEWAY&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;ports&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;1975:1975&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;extra_hosts&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;host.docker.internal:host-gateway&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;command&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;[&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;run&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;]&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;启动所有服务：&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;docker compose up -d
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;等待所有服务变为健康状态（BanyanDB 先启动，然后是 OAP，最后是 UI 和 AI Gateway）：&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;docker compose ps
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;&lt;code&gt;aigw&lt;/code&gt; 服务的关键 OTLP 配置：&lt;/p&gt;
&lt;table&gt;
  &lt;thead&gt;
      &lt;tr&gt;
          &lt;th&gt;环境变量&lt;/th&gt;
          &lt;th&gt;值&lt;/th&gt;
          &lt;th&gt;用途&lt;/th&gt;
      &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
      &lt;tr&gt;
          &lt;td&gt;&lt;code&gt;OTEL_SERVICE_NAME&lt;/code&gt;&lt;/td&gt;
          &lt;td&gt;&lt;code&gt;my-ai-gateway&lt;/code&gt;&lt;/td&gt;
          &lt;td&gt;SkyWalking 中的服务名&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;&lt;code&gt;OTEL_EXPORTER_OTLP_ENDPOINT&lt;/code&gt;&lt;/td&gt;
          &lt;td&gt;&lt;code&gt;http://oap:11800&lt;/code&gt;&lt;/td&gt;
          &lt;td&gt;SkyWalking OAP gRPC 端点&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;&lt;code&gt;OTEL_EXPORTER_OTLP_PROTOCOL&lt;/code&gt;&lt;/td&gt;
          &lt;td&gt;&lt;code&gt;grpc&lt;/code&gt;&lt;/td&gt;
          &lt;td&gt;OTLP 传输协议&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;&lt;code&gt;OTEL_METRICS_EXPORTER&lt;/code&gt;&lt;/td&gt;
          &lt;td&gt;&lt;code&gt;otlp&lt;/code&gt;&lt;/td&gt;
          &lt;td&gt;启用指标推送&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;&lt;code&gt;OTEL_LOGS_EXPORTER&lt;/code&gt;&lt;/td&gt;
          &lt;td&gt;&lt;code&gt;otlp&lt;/code&gt;&lt;/td&gt;
          &lt;td&gt;启用访问日志推送&lt;/td&gt;
      &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;&lt;code&gt;OTEL_RESOURCE_ATTRIBUTES&lt;/code&gt; 必须包含：&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;job_name=envoy-ai-gateway&lt;/code&gt; —— MAL/LAL 规则的路由标签&lt;/li&gt;
&lt;li&gt;&lt;code&gt;service.instance.id=&amp;lt;id&amp;gt;&lt;/code&gt; —— 实例标识&lt;/li&gt;
&lt;li&gt;&lt;code&gt;service.layer=ENVOY_AI_GATEWAY&lt;/code&gt; —— 将日志路由到 AI Gateway LAL 规则&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;MAL 和 LAL 规则在 SkyWalking OAP 中默认启用，不需要额外配置。&lt;/p&gt;
&lt;h3 id=&#34;第三步运行-demo-应用&#34;&gt;第三步：运行 Demo 应用&lt;/h3&gt;
&lt;p&gt;创建一个简单的 Python 应用，通过 AI Gateway 发送请求（&lt;code&gt;app.py&lt;/code&gt;）。
它混合了普通请求、流式请求（用于产生 TTFT/TPOT 指标）和错误请求（不存在的模型 → HTTP 404，始终会被 LAL 采样策略捕获）：&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-python&#34; data-lang=&#34;python&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#cf222e&#34;&gt;import&lt;/span&gt; &lt;span style=&#34;color:#24292e&#34;&gt;time&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#24292e&#34;&gt;random&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#24292e&#34;&gt;requests&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;GATEWAY &lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;http://localhost:1975&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;HEADERS &lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#1f2328&#34;&gt;{&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;Authorization&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;Bearer unused&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;Content-Type&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;application/json&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;questions &lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#1f2328&#34;&gt;[&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;What is Apache SkyWalking? Answer in one sentence.&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;What is Envoy Proxy used for? Answer in one sentence.&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;What are the benefits of an AI gateway? Answer in two sentences.&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;Explain observability in three sentences.&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#cf222e&#34;&gt;def&lt;/span&gt; &lt;span style=&#34;color:#6639ba&#34;&gt;chat&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;model&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; question&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; stream&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#cf222e&#34;&gt;False&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;):&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    resp &lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt; requests&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;post&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#0a3069&#34;&gt;f&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;{&lt;/span&gt;GATEWAY&lt;span style=&#34;color:#0a3069&#34;&gt;}&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;/v1/chat/completions&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        json&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;{&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;model&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt; model&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;messages&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#1f2328&#34;&gt;[{&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;role&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;user&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;content&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt; question&lt;span style=&#34;color:#1f2328&#34;&gt;}],&lt;/span&gt; &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;stream&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt; stream&lt;span style=&#34;color:#1f2328&#34;&gt;},&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        headers&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;HEADERS&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; timeout&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;60&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; stream&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;stream&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#1f2328&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#cf222e&#34;&gt;if&lt;/span&gt; stream&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        chunks &lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#1f2328&#34;&gt;[]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#cf222e&#34;&gt;for&lt;/span&gt; line &lt;span style=&#34;color:#0550ae&#34;&gt;in&lt;/span&gt; resp&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;iter_lines&lt;span style=&#34;color:#1f2328&#34;&gt;():&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            &lt;span style=&#34;color:#cf222e&#34;&gt;if&lt;/span&gt; line&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                chunks&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;append&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;line&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;decode&lt;span style=&#34;color:#1f2328&#34;&gt;())&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#cf222e&#34;&gt;return&lt;/span&gt; resp&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;status_code&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#0a3069&#34;&gt;f&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;[streamed &lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;{&lt;/span&gt;&lt;span style=&#34;color:#6639ba&#34;&gt;len&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;chunks&lt;span style=&#34;color:#1f2328&#34;&gt;)&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;}&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt; chunks]&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#cf222e&#34;&gt;return&lt;/span&gt; resp&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;status_code&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; resp&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;json&lt;span style=&#34;color:#1f2328&#34;&gt;()&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#cf222e&#34;&gt;while&lt;/span&gt; &lt;span style=&#34;color:#cf222e&#34;&gt;True&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    r &lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt; random&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;random&lt;span style=&#34;color:#1f2328&#34;&gt;()&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#cf222e&#34;&gt;if&lt;/span&gt; r &lt;span style=&#34;color:#0550ae&#34;&gt;&amp;lt;&lt;/span&gt; &lt;span style=&#34;color:#0550ae&#34;&gt;0.2&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#57606a&#34;&gt;# Error request: non-existent model triggers 404&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        status&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; body &lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt; chat&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;non-existent-model&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;hello&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#6639ba&#34;&gt;print&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;f&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;[error] model=non-existent-model status=&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;{&lt;/span&gt;status&lt;span style=&#34;color:#0a3069&#34;&gt;}&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#cf222e&#34;&gt;elif&lt;/span&gt; r &lt;span style=&#34;color:#0550ae&#34;&gt;&amp;lt;&lt;/span&gt; &lt;span style=&#34;color:#0550ae&#34;&gt;0.5&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#57606a&#34;&gt;# Streaming request — generates TTFT and TPOT metrics&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        q &lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt; random&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;choice&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;questions&lt;span style=&#34;color:#1f2328&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        status&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; info &lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt; chat&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;llama3.2:1b&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; q&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; stream&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#cf222e&#34;&gt;True&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#6639ba&#34;&gt;print&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;f&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;[stream] status=&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;{&lt;/span&gt;status&lt;span style=&#34;color:#0a3069&#34;&gt;}&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;{&lt;/span&gt;info&lt;span style=&#34;color:#0a3069&#34;&gt;}&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#cf222e&#34;&gt;else&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#57606a&#34;&gt;# Normal non-streaming request&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        q &lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt; random&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;choice&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;questions&lt;span style=&#34;color:#1f2328&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        status&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; body &lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt; chat&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;llama3.2:1b&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; q&lt;span style=&#34;color:#1f2328&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        answer &lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt; body&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;get&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;choices&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#1f2328&#34;&gt;[{}])[&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;0&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;]&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;get&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;message&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#1f2328&#34;&gt;{})&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;get&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;content&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;)[:&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;80&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        tokens &lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt; body&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;get&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;usage&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#1f2328&#34;&gt;{})&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#6639ba&#34;&gt;print&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;f&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;[ok] status=&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;{&lt;/span&gt;status&lt;span style=&#34;color:#0a3069&#34;&gt;}&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt; tokens=&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;{&lt;/span&gt;tokens&lt;span style=&#34;color:#0a3069&#34;&gt;}&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt; answer=&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;{&lt;/span&gt;answer&lt;span style=&#34;color:#0a3069&#34;&gt;}&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;...&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    time&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;sleep&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;random&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;randint&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;20&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#0550ae&#34;&gt;30&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;))&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;运行：&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;pip install requests
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;python app.py
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;应用通过 1975 端口与 AI Gateway 通信，AI Gateway 再路由到 Ollama。每次请求都会产生 GenAI 指标（Token 用量、延迟、TTFT、TPOT）和访问日志，由网关通过 OTLP 推送到 SkyWalking。&lt;/p&gt;
&lt;p&gt;错误请求（不存在的模型 → HTTP 404）始终会被访问日志采样策略捕获，所以在 SkyWalking 的日志视图中一定能看到。&lt;/p&gt;
&lt;h3 id=&#34;第四步在-skywalking-ui-中查看&#34;&gt;第四步：在 SkyWalking UI 中查看&lt;/h3&gt;
&lt;p&gt;打开 &lt;a href=&#34;http://localhost:8080&#34;&gt;http://localhost:8080&lt;/a&gt;，选择 &lt;strong&gt;GenAI &amp;gt; Envoy AI Gateway&lt;/strong&gt; 菜单。&lt;/p&gt;
&lt;p&gt;服务列表显示 &lt;code&gt;my-ai-gateway&lt;/code&gt;，可以一览 CPM、延迟和 Token 速率：&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;screen-1.png&#34; alt=&#34;服务列表&#34;&gt;&lt;/p&gt;
&lt;p&gt;点击进入服务详情，查看完整仪表盘——请求 CPM、延迟（平均值 + 百分位数）、输入/输出 Token 速率、TTFT 和 TPOT：&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;screen-2.png&#34; alt=&#34;服务仪表盘&#34;&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Providers&lt;/strong&gt; 标签页按 AI 提供商维度展示指标：&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;screen-3.png&#34; alt=&#34;提供商维度&#34;&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Models&lt;/strong&gt; 标签页展示每个模型的指标，包括 TTFT 和 TPOT（仅流式请求）。注意 &lt;code&gt;unknown&lt;/code&gt; 模型条目——这些就是使用不存在模型的错误请求：&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;screen-4.png&#34; alt=&#34;模型维度&#34;&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Log&lt;/strong&gt; 标签页展示访问日志。采样策略会丢弃正常的成功响应，但始终保留错误（HTTP 404）和高 Token 消耗的请求：&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;screen-5.png&#34; alt=&#34;访问日志&#34;&gt;&lt;/p&gt;
&lt;h3 id=&#34;清理&#34;&gt;清理&lt;/h3&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;docker compose down
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id=&#34;kubernetes-生产部署&#34;&gt;Kubernetes 生产部署&lt;/h2&gt;
&lt;p&gt;生产环境中，Envoy AI Gateway 作为完整的 Kubernetes 控制器运行，以 Envoy Gateway 作为控制面。详见 &lt;a href=&#34;https://aigateway.envoyproxy.io/docs/getting-started/&#34;&gt;Envoy AI Gateway 入门指南&lt;/a&gt;。&lt;/p&gt;
&lt;p&gt;OTLP 配置方式相同——在 AI Gateway 的 External Processor 上设置 &lt;code&gt;OTEL_*&lt;/code&gt; 环境变量，指向 SkyWalking OAP 的 gRPC 端口（11800）。详见 &lt;a href=&#34;https://skywalking.apache.org/docs/main/next/en/setup/backend/backend-envoy-ai-gateway-monitoring/&#34;&gt;SkyWalking Envoy AI Gateway 监控文档&lt;/a&gt;。&lt;/p&gt;
&lt;h2 id=&#34;不用-ai-网关也能做-genai-可观测&#34;&gt;不用 AI 网关也能做 GenAI 可观测&lt;/h2&gt;
&lt;p&gt;并非所有场景都需要 AI 网关。如果你的应用直接调用 LLM 提供商，SkyWalking 10.4.0 也提供了基于 &lt;a href=&#34;https://skywalking.apache.org/docs/main/next/en/setup/service-agent/virtual-genai/&#34;&gt;Virtual GenAI&lt;/a&gt; 层的 GenAI 可观测方案。&lt;/p&gt;
&lt;p&gt;任何接入了 SkyWalking、OpenTelemetry 或 Zipkin 探针的应用都能使用这个功能。只要 Trace 中携带 &lt;code&gt;gen_ai.*&lt;/code&gt; 标签（遵循 &lt;a href=&#34;https://opentelemetry.io/docs/specs/semconv/gen-ai/&#34;&gt;OpenTelemetry GenAI 语义约定&lt;/a&gt;），SkyWalking 就能从客户端视角推导出每提供商、每模型的指标：延迟、Token 用量、成功率和预估费用。&lt;/p&gt;
&lt;p&gt;对于 Java 应用，SkyWalking Java Agent（9.7+）内置了 Spring AI 插件，自动为 13+ 提供商（OpenAI、Anthropic、AWS Bedrock、Google GenAI、DeepSeek、Mistral 等）的调用注入正确的 &lt;code&gt;gen_ai.*&lt;/code&gt; Span 标签——不需要改代码。&lt;/p&gt;
&lt;p&gt;这与上面介绍的 Envoy AI Gateway 监控是不同的使用场景：&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Envoy AI Gateway 层&lt;/strong&gt;：基础设施级可观测——网关视角，覆盖所有流量。适合负责集中 AI 路由的平台团队。&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Virtual GenAI 层&lt;/strong&gt;：应用级可观测——每个应用自己看到的 LLM 调用情况。适合没有集中网关的团队，或者需要按应用维度跟踪费用的场景。&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;参考资料&#34;&gt;参考资料&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://aigateway.envoyproxy.io/&#34;&gt;Envoy AI Gateway&lt;/a&gt; —— 项目官网和文档&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/envoyproxy/ai-gateway/tree/main/cmd/aigw&#34;&gt;Envoy AI Gateway CLI&lt;/a&gt; —— 本地开发用的独立运行模式&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://skywalking.apache.org/docs/main/next/en/setup/backend/backend-envoy-ai-gateway-monitoring/&#34;&gt;SkyWalking Envoy AI Gateway 监控&lt;/a&gt; —— OAP 配置文档&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://skywalking.apache.org/docs/main/next/en/setup/service-agent/virtual-genai/&#34;&gt;SkyWalking Virtual GenAI&lt;/a&gt; —— 客户端侧 GenAI 可观测&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://opentelemetry.io/docs/specs/semconv/gen-ai/&#34;&gt;OpenTelemetry GenAI 语义约定&lt;/a&gt; —— 两个项目共同遵循的指标/属性标准&lt;/li&gt;
&lt;/ul&gt;

      </description>
    </item>
    
    <item>
      <title>Blog: How to run Apache SkyWalking on AWS EKS and RDS/Aurora</title>
      <link>/blog/2022-12-13-how-to-run-apache-skywalking-on-aws-eks-rds/</link>
      <pubDate>Tue, 13 Dec 2022 00:00:00 +0000</pubDate>
      <guid>/blog/2022-12-13-how-to-run-apache-skywalking-on-aws-eks-rds/</guid>
      <description>
        
        
        &lt;h2 id=&#34;introduction&#34;&gt;Introduction&lt;/h2&gt;
&lt;p&gt;Apache SkyWalking is an open source APM tool for monitoring and troubleshooting distributed systems,
especially designed for microservices, cloud native and container-based (Docker, Kubernetes, Mesos)
architectures. It provides distributed tracing, service mesh observability, metric aggregation and
visualization, and alarm.&lt;/p&gt;
&lt;p&gt;In this article, I will introduce how to quickly set up Apache SkyWalking on AWS EKS and RDS/Aurora,
as well as a couple of sample services, monitoring services to observe SkyWalking itself.&lt;/p&gt;
&lt;h2 id=&#34;prerequisites&#34;&gt;Prerequisites&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;AWS account&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html&#34;&gt;AWS CLI&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://www.terraform.io/downloads.html&#34;&gt;Terraform&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://kubernetes.io/docs/tasks/tools/#kubectl&#34;&gt;kubectl&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;We can use the AWS web console or CLI to create all resources needed in this tutorial, but it can be
too tedious and hard to debug when something goes wrong. So in this artical I will use Terraform to
create all AWS resources, deploy SkyWalking, sample services, and load generator services (Locust).&lt;/p&gt;
&lt;h2 id=&#34;architecture&#34;&gt;Architecture&lt;/h2&gt;
&lt;p&gt;The demo architecture is as follows:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code class=&#34;language-mermaid&#34; data-lang=&#34;mermaid&#34;&gt;graph LR
    subgraph AWS
        subgraph EKS
          subgraph istio-system namespace
              direction TB
              OAP[[SkyWalking OAP]]
              UI[[SkyWalking UI]]
            Istio[[istiod]]
          end
          subgraph sample namespace
              Service0[[Service0]]
              Service1[[Service1]]
              ServiceN[[Service ...]]
          end
          subgraph locust namespace
              LocustMaster[[Locust Master]]
              LocustWorkers0[[Locust Worker 0]]
              LocustWorkers1[[Locust Worker 1]]
              LocustWorkersN[[Locust Worker ...]]
          end
        end
        RDS[[RDS/Aurora]]
    end
    OAP --&amp;gt; RDS
    Service0 -. telemetry data -.-&amp;gt; OAP
    Service1 -. telemetry data -.-&amp;gt; OAP
    ServiceN -. telemetry data -.-&amp;gt; OAP
    UI --query--&amp;gt; OAP
    LocustWorkers0 -- traffic --&amp;gt; Service0
    LocustWorkers1 -- traffic --&amp;gt; Service0
    LocustWorkersN -- traffic --&amp;gt; Service0
    Service0 --&amp;gt; Service1 --&amp;gt; ServiceN
    LocustMaster --&amp;gt; LocustWorkers0
    LocustMaster --&amp;gt; LocustWorkers1
    LocustMaster --&amp;gt; LocustWorkersN
    User --&amp;gt; LocustMaster
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;As shown in the architecture diagram, we need to create the following AWS resources:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;EKS cluster&lt;/li&gt;
&lt;li&gt;RDS instance or Aurora cluster&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Sounds simple, but there are a lot of things behind the scenes, such as VPC, subnets, security groups, etc.
You have to configure them correctly to make sure the EKS cluster can connect to RDS instance/Aurora cluster
otherwise the SkyWalking won&amp;rsquo;t work. Luckily, Terraform can help us to create and destroy all these resources
automatically.&lt;/p&gt;
&lt;p&gt;I have created a Terraform module to create all AWS resources needed in this tutorial, you can find it in the
&lt;a href=&#34;https://github.com/kezhenxu94/oap-load-test/tree/main/aws&#34;&gt;GitHub repository&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;create-aws-resources&#34;&gt;Create AWS resources&lt;/h2&gt;
&lt;p&gt;First, we need to clone the GitHub repository and &lt;code&gt;cd&lt;/code&gt; into the folder:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;git clone https://github.com/kezhenxu94/oap-load-test.git
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Then, we need to create a file named &lt;code&gt;terraform.tfvars&lt;/code&gt; to specify the AWS region and other variables:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;cat &amp;gt; terraform.tfvars &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;lt;&amp;lt;EOF
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;aws_access_key = &amp;#34;&amp;#34;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;aws_secret_key = &amp;#34;&amp;#34;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;cluster_name   = &amp;#34;skywalking-on-aws&amp;#34;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;region         = &amp;#34;ap-east-1&amp;#34;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;db_type        = &amp;#34;rds-postgresql&amp;#34;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;EOF&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;If you have already configured the AWS CLI, you can skip the &lt;code&gt;aws_access_key&lt;/code&gt; and &lt;code&gt;aws_secret_key&lt;/code&gt; variables.
To install SkyWalking with RDS postgresql, set the &lt;code&gt;db_type&lt;/code&gt; to &lt;code&gt;rds-postgresql&lt;/code&gt;, to install SkyWalking with
Aurora postgresql, set the &lt;code&gt;db_type&lt;/code&gt; to &lt;code&gt;aurora-postgresql&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;There are a lot of other variables you can configure, such as tags, sample services count, replicas, etc.,
you can find them in the &lt;a href=&#34;https://github.com/kezhenxu94/oap-load-test/blob/main/aws/variables.tf&#34;&gt;variables.tf&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Then, we can run the following commands to initialize the Terraform module and download the required providers,
then create all AWS resources:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;terraform init
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;terraform apply -var-file&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;terraform.tfvars
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Type &lt;code&gt;yes&lt;/code&gt; to confirm the creation of all AWS resources, or add the &lt;code&gt;-auto-approve&lt;/code&gt; flag to the &lt;code&gt;terraform apply&lt;/code&gt;
to skip the confirmation:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;terraform apply -var-file&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;terraform.tfvars -auto-approve
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Now what you need to do is to wait for the creation of all AWS resources to complete, it may take a few minutes.
You can check the progress of the creation in the AWS web console, and check the deployment progress of the services
inside the EKS cluster.&lt;/p&gt;
&lt;h2 id=&#34;generate-traffic&#34;&gt;Generate traffic&lt;/h2&gt;
&lt;p&gt;Besides creating necessary AWS resources, the Terraform module also deploys SkyWalking, sample services, and Locust
load generator services to the EKS cluster.&lt;/p&gt;
&lt;p&gt;You can access the Locust web UI to generate traffic to the sample services:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;open http://&lt;span style=&#34;color:#cf222e&#34;&gt;$(&lt;/span&gt;kubectl get svc -n locust -l &lt;span style=&#34;color:#953800&#34;&gt;app&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;locust-master -o &lt;span style=&#34;color:#953800&#34;&gt;jsonpath&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#39;{.items[0].status.loadBalancer.ingress[0].hostname}&amp;#39;&lt;/span&gt;&lt;span style=&#34;color:#cf222e&#34;&gt;)&lt;/span&gt;:8089
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The command opens the browser to the Locust web UI, you can configure the number of users and hatch rate to generate
traffic.&lt;/p&gt;
&lt;h2 id=&#34;observe-skywalking&#34;&gt;Observe SkyWalking&lt;/h2&gt;
&lt;p&gt;You can access the SkyWalking web UI to observe the sample services.&lt;/p&gt;
&lt;p&gt;First you need to forward the SkyWalking UI port to local&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;kubectl -n istio-system port-forward &lt;span style=&#34;color:#cf222e&#34;&gt;$(&lt;/span&gt;kubectl -n istio-system get pod -l &lt;span style=&#34;color:#953800&#34;&gt;app&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;skywalking -l &lt;span style=&#34;color:#953800&#34;&gt;component&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;ui -o name&lt;span style=&#34;color:#cf222e&#34;&gt;)&lt;/span&gt; 8080:8080
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;And then open the browser to http://localhost:8080 to access the SkyWalking web UI.&lt;/p&gt;
&lt;h2 id=&#34;observe-rdsaurora&#34;&gt;Observe RDS/Aurora&lt;/h2&gt;
&lt;p&gt;You can also access the RDS/Aurora web console to observe the performance of RDS/Aurora instance/Aurora cluste.&lt;/p&gt;
&lt;h2 id=&#34;test-results&#34;&gt;Test Results&lt;/h2&gt;
&lt;h3 id=&#34;test-1-skywalking-with-eks-and-rds-postgresql&#34;&gt;Test 1: SkyWalking with EKS and RDS PostgreSQL&lt;/h3&gt;
&lt;h4 id=&#34;service-traffic&#34;&gt;Service Traffic&lt;/h4&gt;
&lt;p&gt;&lt;img src=&#34;./outputs/postgresql/test1-cpm-locust.png&#34; alt=&#34;Service Traffic Locust&#34;&gt;
&lt;img src=&#34;./outputs/postgresql/test1-cpm.png&#34; alt=&#34;Service Traffic SW&#34;&gt;&lt;/p&gt;
&lt;h4 id=&#34;rds-performance&#34;&gt;RDS Performance&lt;/h4&gt;
&lt;p&gt;&lt;img src=&#34;./outputs/postgresql/test1-postgresql-1.png&#34; alt=&#34;RDS Performance&#34;&gt;
&lt;img src=&#34;./outputs/postgresql/test1-postgresql-2.png&#34; alt=&#34;RDS Performance&#34;&gt;
&lt;img src=&#34;./outputs/postgresql/test1-postgresql-3.png&#34; alt=&#34;RDS Performance&#34;&gt;&lt;/p&gt;
&lt;h4 id=&#34;skywalking-performance&#34;&gt;SkyWalking Performance&lt;/h4&gt;
&lt;p&gt;&lt;img src=&#34;./outputs/postgresql/test1-so11y-1.png&#34; alt=&#34;SkyWalking Performance&#34;&gt;
&lt;img src=&#34;./outputs/postgresql/test1-so11y-2.png&#34; alt=&#34;SkyWalking Performance&#34;&gt;
&lt;img src=&#34;./outputs/postgresql/test1-so11y-3.png&#34; alt=&#34;SkyWalking Performance&#34;&gt;
&lt;img src=&#34;./outputs/postgresql/test1-so11y-4.png&#34; alt=&#34;SkyWalking Performance&#34;&gt;
&lt;img src=&#34;./outputs/postgresql/test1-so11y-5.png&#34; alt=&#34;SkyWalking Performance&#34;&gt;&lt;/p&gt;
&lt;h3 id=&#34;test-2-skywalking-with-eks-and-aurora-postgresql&#34;&gt;Test 2: SkyWalking with EKS and Aurora PostgreSQL&lt;/h3&gt;
&lt;h4 id=&#34;service-traffic-1&#34;&gt;Service Traffic&lt;/h4&gt;
&lt;p&gt;&lt;img src=&#34;./outputs/aurora/test1-cpm-locust.png&#34; alt=&#34;Service Traffic Locust&#34;&gt;
&lt;img src=&#34;./outputs/aurora/test1-cpm-skywalking.png&#34; alt=&#34;Service Traffic SW&#34;&gt;&lt;/p&gt;
&lt;h4 id=&#34;rds-performance-1&#34;&gt;RDS Performance&lt;/h4&gt;
&lt;p&gt;&lt;img src=&#34;./outputs/aurora/test1-postgresql-1.png&#34; alt=&#34;RDS Performance&#34;&gt;
&lt;img src=&#34;./outputs/aurora/test1-postgresql-2.png&#34; alt=&#34;RDS Performance&#34;&gt;
&lt;img src=&#34;./outputs/aurora/test1-postgresql-3.png&#34; alt=&#34;RDS Performance&#34;&gt;&lt;/p&gt;
&lt;h4 id=&#34;skywalking-performance-1&#34;&gt;SkyWalking Performance&lt;/h4&gt;
&lt;p&gt;&lt;img src=&#34;./outputs/aurora/test1-so11y-1.png&#34; alt=&#34;SkyWalking Performance&#34;&gt;
&lt;img src=&#34;./outputs/aurora/test1-so11y-2.png&#34; alt=&#34;SkyWalking Performance&#34;&gt;
&lt;img src=&#34;./outputs/aurora/test1-so11y-3.png&#34; alt=&#34;SkyWalking Performance&#34;&gt;
&lt;img src=&#34;./outputs/aurora/test1-so11y-4.png&#34; alt=&#34;SkyWalking Performance&#34;&gt;
&lt;img src=&#34;./outputs/aurora/test1-so11y-5.png&#34; alt=&#34;SkyWalking Performance&#34;&gt;&lt;/p&gt;
&lt;h2 id=&#34;clean-up&#34;&gt;Clean up&lt;/h2&gt;
&lt;p&gt;When you are done with the demo, you can run the following command to destroy all AWS resources:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;terraform destroy -var-file&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;terraform.tfvars -auto-approve
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
      </description>
    </item>
    
    <item>
      <title>Zh: 如何在 AWS EKS 和 RDS/Aurora 上运行 Apache SkyWalking</title>
      <link>/zh/2022-12-13-how-to-run-apache-skywalking-on-aws-eks-rds/</link>
      <pubDate>Tue, 13 Dec 2022 00:00:00 +0000</pubDate>
      <guid>/zh/2022-12-13-how-to-run-apache-skywalking-on-aws-eks-rds/</guid>
      <description>
        
        
        &lt;h2 id=&#34;介绍&#34;&gt;介绍&lt;/h2&gt;
&lt;p&gt;Apache SkyWalking 是一个开源的 APM 工具，用于监控分布式系统和排除故障，特别是为微服务、云原生和基于容器（Docker、Kubernetes、Mesos）的架构而设计。它提供分布式跟踪、服务网格可观测性、指标聚合和可视化以及警报。&lt;/p&gt;
&lt;p&gt;在本文中，我将介绍如何在 AWS EKS 和 RDS/Aurora 上快速设置 Apache SkyWalking，以及几个示例服务，监控服务以观察 SkyWalking 本身。&lt;/p&gt;
&lt;h2 id=&#34;先决条件&#34;&gt;先决条件&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;AWS 账号&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html&#34;&gt;AWS CLI&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://www.terraform.io/downloads.html&#34;&gt;Terraform&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://kubernetes.io/docs/tasks/tools/#kubectl&#34;&gt;kubectl&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;我们可以使用 AWS Web 控制台或 CLI 来创建本教程所需的所有资源，但是当出现问题时，它可能过于繁琐且难以调试。因此，在本文中，我将使用 Terraform 创建所有 AWS 资源、部署 SkyWalking、示例服务和负载生成器服务 (Locust)。&lt;/p&gt;
&lt;h2 id=&#34;架构&#34;&gt;架构&lt;/h2&gt;
&lt;p&gt;演示架构如下：&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code class=&#34;language-mermaid&#34; data-lang=&#34;mermaid&#34;&gt;graph LR
    subgraph AWS
        subgraph EKS
          subgraph istio-system namespace
              direction TB
              OAP[[SkyWalking OAP]]
              UI[[SkyWalking UI]]
            Istio[[istiod]]
          end
          subgraph sample namespace
              Service0[[Service0]]
              Service1[[Service1]]
              ServiceN[[Service ...]]
          end
          subgraph locust namespace
              LocustMaster[[Locust Master]]
              LocustWorkers0[[Locust Worker 0]]
              LocustWorkers1[[Locust Worker 1]]
              LocustWorkersN[[Locust Worker ...]]
          end
        end
        RDS[[RDS/Aurora]]
    end
    OAP --&amp;gt; RDS
    Service0 -. telemetry data -.-&amp;gt; OAP
    Service1 -. telemetry data -.-&amp;gt; OAP
    ServiceN -. telemetry data -.-&amp;gt; OAP
    UI --query--&amp;gt; OAP
    LocustWorkers0 -- traffic --&amp;gt; Service0
    LocustWorkers1 -- traffic --&amp;gt; Service0
    LocustWorkersN -- traffic --&amp;gt; Service0
    Service0 --&amp;gt; Service1 --&amp;gt; ServiceN
    LocustMaster --&amp;gt; LocustWorkers0
    LocustMaster --&amp;gt; LocustWorkers1
    LocustMaster --&amp;gt; LocustWorkersN
    User --&amp;gt; LocustMaster
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;如架构图所示，我们需要创建以下 AWS 资源：&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;EKS 集群&lt;/li&gt;
&lt;li&gt;RDS 实例或 Aurora 集群&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;听起来很简单，但背后有很多东西，比如 VPC、子网、安全组等。你必须正确配置它们以确保 EKS 集群可以连接到 RDS 实例 / Aurora 集群，否则 SkyWalking 不会不工作。幸运的是，Terraform 可以帮助我们自动创建和销毁所有这些资源。&lt;/p&gt;
&lt;p&gt;我创建了一个 Terraform 模块来创建本教程所需的所有 AWS 资源，您可以在 &lt;a href=&#34;https://github.com/kezhenxu94/oap-load-test/tree/main/aws&#34;&gt;GitHub 存储库&lt;/a&gt;中找到它。&lt;/p&gt;
&lt;h2 id=&#34;创建-aws-资源&#34;&gt;创建 AWS 资源&lt;/h2&gt;
&lt;p&gt;首先，我们需要将 GitHub 存储库克隆 &lt;code&gt;cd&lt;/code&gt; 到文件夹中：&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;git clone https://github.com/kezhenxu94/oap-load-test.git
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;然后，我们需要创建一个文件 &lt;code&gt;terraform.tfvars&lt;/code&gt; 来指定 AWS 区域和其他变量：&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;cat &amp;gt; terraform.tfvars &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;lt;&amp;lt;EOF
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;aws_access_key = &amp;#34;&amp;#34;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;aws_secret_key = &amp;#34;&amp;#34;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;cluster_name   = &amp;#34;skywalking-on-aws&amp;#34;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;region         = &amp;#34;ap-east-1&amp;#34;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;db_type        = &amp;#34;rds-postgresql&amp;#34;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;EOF&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;如果您已经配置了 AWS CLI，则可以跳过 &lt;code&gt;aws_access_key&lt;/code&gt; 和 &lt;code&gt;aws_secret_key&lt;/code&gt; 变量。要使用 RDS postgresql 安装 SkyWalking，请将 &lt;code&gt;db_type&lt;/code&gt; 设置为 &lt;code&gt;rds-postgresql&lt;/code&gt;，要使用 Aurora postgresql 安装 SkyWalking，请将 &lt;code&gt;db_type&lt;/code&gt; 设置为 &lt;code&gt;aurora-postgresql&lt;/code&gt;。&lt;/p&gt;
&lt;p&gt;您可以配置许多其他变量，例如标签、示例服务计数、副本等，您可以在 variables.tf 中找到&lt;a href=&#34;https://github.com/kezhenxu94/oap-load-test/blob/main/aws/variables.tf&#34;&gt;它们&lt;/a&gt;。&lt;/p&gt;
&lt;p&gt;然后，我们可以运行以下命令来初始化 Terraform 模块并下载所需的提供程序，然后创建所有 AWS 资源：&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;terraform init
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;terraform apply -var-file&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;terraform.tfvars
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;键入 &lt;code&gt;yes&lt;/code&gt; 以确认所有 AWS 资源的创建，或将标志 &lt;code&gt;-auto-approve&lt;/code&gt; 添加到 &lt;code&gt;terraform apply&lt;/code&gt; 以跳过确认：&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;terraform apply -var-file&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;terraform.tfvars -auto-approve
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;现在你需要做的就是等待所有 AWS 资源的创建完成，这可能需要几分钟的时间。您可以在 AWS Web 控制台查看创建进度，也可以查看 EKS 集群内部服务的部署进度。&lt;/p&gt;
&lt;h2 id=&#34;产生流量&#34;&gt;产生流量&lt;/h2&gt;
&lt;p&gt;除了创建必要的 AWS 资源外，Terraform 模块还将 SkyWalking、示例服务和 Locust 负载生成器服务部署到 EKS 集群。&lt;/p&gt;
&lt;p&gt;您可以访问 Locust Web UI 以生成到示例服务的流量：&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;open http://&lt;span style=&#34;color:#cf222e&#34;&gt;$(&lt;/span&gt;kubectl get svc -n locust -l &lt;span style=&#34;color:#953800&#34;&gt;app&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;locust-master -o &lt;span style=&#34;color:#953800&#34;&gt;jsonpath&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#39;{.items[0].status.loadBalancer.ingress[0].hostname}&amp;#39;&lt;/span&gt;&lt;span style=&#34;color:#cf222e&#34;&gt;)&lt;/span&gt;:8089
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;该命令将浏览器打开到 Locust web UI，您可以配置用户数量和孵化率以生成流量。&lt;/p&gt;
&lt;h2 id=&#34;观察-skywalking&#34;&gt;观察 SkyWalking&lt;/h2&gt;
&lt;p&gt;您可以访问 SkyWalking Web UI 来观察示例服务。&lt;/p&gt;
&lt;p&gt;首先需要将 SkyWalking UI 端口转发到本地：&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;kubectl -n istio-system port-forward &lt;span style=&#34;color:#cf222e&#34;&gt;$(&lt;/span&gt;kubectl -n istio-system get pod -l &lt;span style=&#34;color:#953800&#34;&gt;app&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;skywalking -l &lt;span style=&#34;color:#953800&#34;&gt;component&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;ui -o name&lt;span style=&#34;color:#cf222e&#34;&gt;)&lt;/span&gt; 8080:8080
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;然后在浏览器中打开 http://localhost:8080 访问 SkyWalking web UI。&lt;/p&gt;
&lt;h2 id=&#34;观察-rdsaurora&#34;&gt;观察 RDS/Aurora&lt;/h2&gt;
&lt;p&gt;您也可以访问 RDS/Aurora web 控制台，观察 RDS/Aurora 实例 / Aurora 集群的性能。&lt;/p&gt;
&lt;h2 id=&#34;试验结果&#34;&gt;试验结果&lt;/h2&gt;
&lt;h3 id=&#34;测试-1使用-eks-和-rds-postgresql-的-skywalking&#34;&gt;测试 1：使用 EKS 和 RDS PostgreSQL 的 SkyWalking&lt;/h3&gt;
&lt;h4 id=&#34;服务流量&#34;&gt;服务流量&lt;/h4&gt;
&lt;p&gt;&lt;img src=&#34;./outputs/postgresql/test1-cpm-locust.png&#34; alt=&#34;Service Traffic Locust&#34;&gt;
&lt;img src=&#34;./outputs/postgresql/test1-cpm.png&#34; alt=&#34;Service Traffic SW&#34;&gt;&lt;/p&gt;
&lt;h4 id=&#34;rds-性能&#34;&gt;RDS 性能&lt;/h4&gt;
&lt;p&gt;&lt;img src=&#34;./outputs/postgresql/test1-postgresql-1.png&#34; alt=&#34;RDS Performance&#34;&gt;
&lt;img src=&#34;./outputs/postgresql/test1-postgresql-2.png&#34; alt=&#34;RDS Performance&#34;&gt;
&lt;img src=&#34;./outputs/postgresql/test1-postgresql-3.png&#34; alt=&#34;RDS Performance&#34;&gt;&lt;/p&gt;
&lt;h4 id=&#34;skywalking-性能&#34;&gt;SkyWalking 性能&lt;/h4&gt;
&lt;p&gt;&lt;img src=&#34;./outputs/postgresql/test1-so11y-1.png&#34; alt=&#34;SkyWalking Performance&#34;&gt;
&lt;img src=&#34;./outputs/postgresql/test1-so11y-2.png&#34; alt=&#34;SkyWalking Performance&#34;&gt;
&lt;img src=&#34;./outputs/postgresql/test1-so11y-3.png&#34; alt=&#34;SkyWalking Performance&#34;&gt;
&lt;img src=&#34;./outputs/postgresql/test1-so11y-4.png&#34; alt=&#34;SkyWalking Performance&#34;&gt;
&lt;img src=&#34;./outputs/postgresql/test1-so11y-5.png&#34; alt=&#34;SkyWalking Performance&#34;&gt;&lt;/p&gt;
&lt;h3 id=&#34;测试-2使用-eks-和-aurora-postgresql-的-skywalking&#34;&gt;测试 2：使用 EKS 和 Aurora PostgreSQL 的 SkyWalking&lt;/h3&gt;
&lt;h4 id=&#34;服务流量-1&#34;&gt;服务流量&lt;/h4&gt;
&lt;p&gt;&lt;img src=&#34;./outputs/aurora/test1-cpm-locust.png&#34; alt=&#34;Service Traffic Locust&#34;&gt;
&lt;img src=&#34;./outputs/aurora/test1-cpm-skywalking.png&#34; alt=&#34;Service Traffic SW&#34;&gt;&lt;/p&gt;
&lt;h4 id=&#34;rds-性能-1&#34;&gt;RDS 性能&lt;/h4&gt;
&lt;p&gt;&lt;img src=&#34;./outputs/aurora/test1-postgresql-1.png&#34; alt=&#34;RDS Performance&#34;&gt;
&lt;img src=&#34;./outputs/aurora/test1-postgresql-2.png&#34; alt=&#34;RDS Performance&#34;&gt;
&lt;img src=&#34;./outputs/aurora/test1-postgresql-3.png&#34; alt=&#34;RDS Performance&#34;&gt;&lt;/p&gt;
&lt;h4 id=&#34;skywalking-性能-1&#34;&gt;SkyWalking 性能&lt;/h4&gt;
&lt;p&gt;&lt;img src=&#34;./outputs/aurora/test1-so11y-1.png&#34; alt=&#34;SkyWalking Performance&#34;&gt;
&lt;img src=&#34;./outputs/aurora/test1-so11y-2.png&#34; alt=&#34;SkyWalking Performance&#34;&gt;
&lt;img src=&#34;./outputs/aurora/test1-so11y-3.png&#34; alt=&#34;SkyWalking Performance&#34;&gt;
&lt;img src=&#34;./outputs/aurora/test1-so11y-4.png&#34; alt=&#34;SkyWalking Performance&#34;&gt;
&lt;img src=&#34;./outputs/aurora/test1-so11y-5.png&#34; alt=&#34;SkyWalking Performance&#34;&gt;&lt;/p&gt;
&lt;h2 id=&#34;清理&#34;&gt;清理&lt;/h2&gt;
&lt;p&gt;完成演示后，您可以运行以下命令销毁所有 AWS 资源：&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;terraform destroy -var-file&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;terraform.tfvars -auto-approve
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
      </description>
    </item>
    
    <item>
      <title>Blog: [Video] Distributed tracing demo using Apache SkyWalking and Kong API Gateway</title>
      <link>/blog/2022-08-11-kongcast-20-distributed-tracing-using-skywalking-kong/</link>
      <pubDate>Thu, 11 Aug 2022 00:00:00 +0000</pubDate>
      <guid>/blog/2022-08-11-kongcast-20-distributed-tracing-using-skywalking-kong/</guid>
      <description>
        
        
        &lt;p&gt;Observability essential when working with distributed systems. Built on 3 pillars of metrics, logging and
tracing, having the right tools in place to quickly identify and determine the root cause of an issue in production
is imperative. In this Kongcast interview, we explore the benefits of having observability and demo the use of
Apache SkyWalking. We walk through the capabilities that SkyWalking offers out of the box and debug a common HTTP 500
error using the tool.&lt;/p&gt;
&lt;p&gt;Andrew Kew is interviewed by Viktor Gamov, a developer advocate at &lt;a href=&#34;https://konghq.com/&#34;&gt;Kong Inc&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Andrew is a highly passionate technologist with over 16 valuable years experience in building server side and cloud
applications. Having spent the majority of his time in the Financial Services domain, his meritocratic rise to CTO of an
Algorithmic Trading firm allowed him to not only steer the business from a technology standpoint, but build robust and
scalable trading algorithms. His mantra is &amp;ldquo;right first time&amp;rdquo;, thus ensuring the projects or clients he is involved in
are left in a better place than they were before he arrived.&lt;/p&gt;
&lt;p&gt;He is the founder of a boutique software consultancy in the United Kingdom, &lt;a href=&#34;https://quadcorps.co.uk&#34;&gt;QuadCorps Ltd&lt;/a&gt;, working in the API and
Integration Ecosystem space and is currently on a residency programme at &lt;a href=&#34;https://konghq.com/&#34;&gt;Kong Inc&lt;/a&gt; as a senior field engineer and
technical account manager working across many of their enterprise strategic accounts.&lt;/p&gt;
&lt;div style=&#34;position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden;&#34;&gt;
      &lt;iframe allow=&#34;accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share; fullscreen&#34; loading=&#34;eager&#34; referrerpolicy=&#34;strict-origin-when-cross-origin&#34; src=&#34;https://www.youtube.com/embed/r8e9ib0powM?autoplay=0&amp;amp;controls=1&amp;amp;end=0&amp;amp;loop=0&amp;amp;mute=0&amp;amp;start=0&#34; style=&#34;position: absolute; top: 0; left: 0; width: 100%; height: 100%; border:0;&#34; title=&#34;YouTube video&#34;&gt;&lt;/iframe&gt;
    &lt;/div&gt;


      </description>
    </item>
    
    <item>
      <title>Blog: Pinpoint Service Mesh Critical Performance Impact by using eBPF</title>
      <link>/blog/2022-07-05-pinpoint-service-mesh-critical-performance-impact-by-using-ebpf/</link>
      <pubDate>Tue, 05 Jul 2022 00:00:00 +0000</pubDate>
      <guid>/blog/2022-07-05-pinpoint-service-mesh-critical-performance-impact-by-using-ebpf/</guid>
      <description>
        
        
        &lt;h3 id=&#34;content&#34;&gt;Content&lt;/h3&gt;
&lt;h1 id=&#34;background&#34;&gt;Background&lt;/h1&gt;
&lt;p&gt;&lt;a href=&#34;https://skywalking.apache.org/&#34;&gt;Apache SkyWalking&lt;/a&gt; observes metrics, logs, traces, and events for services deployed into the service mesh. When troubleshooting, SkyWalking error analysis can be an invaluable tool helping to pinpoint where an error occurred. However, performance problems are more difficult: It’s often impossible to locate the root cause of performance problems with pre-existing observation data. To move beyond the status quo, dynamic debugging and troubleshooting are essential service performance tools. In this article, we&amp;rsquo;ll discuss how to use eBPF technology to improve the profiling feature in SkyWalking and analyze the performance impact in the service mesh.&lt;/p&gt;
&lt;h1 id=&#34;trace-profiling-in-skywalking&#34;&gt;Trace Profiling in SkyWalking&lt;/h1&gt;
&lt;p&gt;Since SkyWalking 7.0.0, Trace Profiling has helped developers find performance problems by periodically sampling the thread stack to let developers know which lines of code take more time. However, Trace Profiling is not suitable for the following scenarios:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Thread Model&lt;/strong&gt;: Trace Profiling is most useful for profiling code that executes in a single thread. It is less useful for middleware that relies heavily on async execution models. For example Goroutines in Go or Kotlin Coroutines.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Language&lt;/strong&gt;: Currently, Trace Profiling is only supported in Java and Python, since it’s not easy to obtain the thread stack in the runtimes of some languages such as Go and Node.js.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Agent Binding&lt;/strong&gt;: Trace Profiling requires Agent installation, which can be tricky depending on the language (e.g., PHP has to rely on its C kernel; Rust and C/C++ require manual instrumentation to make install).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Trace Correlation&lt;/strong&gt;: Since Trace Profiling is only associated with a single request it can be hard to determine which request is causing the problem.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Short Lifecycle Services&lt;/strong&gt;: Trace Profiling doesn&amp;rsquo;t support short-lived services for (at least) two reasons:
&lt;ol&gt;
&lt;li&gt;It&amp;rsquo;s hard to differentiate system performance from class code manipulation in the booting stage.&lt;/li&gt;
&lt;li&gt;Trace profiling is linked to an endpoint to identify performance impact, but there is no endpoint to match these short-lived services.&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Fortunately, there are techniques that can go further than Trace Profiling in these situations.&lt;/p&gt;
&lt;h1 id=&#34;introduce-ebpf&#34;&gt;Introduce eBPF&lt;/h1&gt;
&lt;p&gt;We have found that eBPF — a technology that can run sandboxed programs in an operating system kernel and thus safely and efficiently extend the capabilities of the kernel without requiring kernel modifications or loading kernel modules — can help us fill gaps left by Trace Profiling. eBPF is a trending technology because it breaks the traditional barrier between user and kernel space. Programs can now inject bytecode that runs in the kernel, instead of having to recompile the kernel to customize it. This is naturally a good fit for observability.&lt;/p&gt;
&lt;p&gt;In the figure below, we can see that when the system executes the execve syscalls, the eBPF program is triggered, and the current process runtime information is obtained by using function calls.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;eBPF-hook-points.png&#34; alt=&#34;eBPF Hook Point&#34;&gt;&lt;/p&gt;
&lt;p&gt;Using eBPF technology, we can expand the scope of Skywalking&amp;rsquo;s profiling capabilities:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Global Performance Analysis&lt;/strong&gt;: Before eBPF, data collection was limited to what agents can observe. Since eBPF programs run in the kernel, they can observe all threads. This is especially useful when you are not sure whether a performance problem is caused by a particular request.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Data Content&lt;/strong&gt;: eBPF can dump both user and kernel space thread stacks, so if a performance issue happens in kernel space, it’s easier to find.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Agent Binding&lt;/strong&gt;: All modern Linux kernels support eBPF, so there is no need to install anything. This means it is an orchestration-free vs an agent model. This reduces friction caused by built-in software which may not have the correct agents installed, such as Envoy in a Service Mesh.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Sampling Type&lt;/strong&gt;: Unlike Trace Profiling, eBPF is event-driven and, therefore, not constrained by interval polling. For example, eBPF can trigger events and collect more data depending on a transfer size threshold. This can allow the system to triage and prioritize data collection under extreme load.&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id=&#34;ebpf-limitations&#34;&gt;eBPF Limitations&lt;/h2&gt;
&lt;p&gt;While eBPF offers significant advantages for hunting performance bottlenecks, no technology is perfect. eBPF has a number of limitations described below. Fortunately, since SkyWalking does not require eBPF, the impact is limited.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Linux Version Requirement&lt;/strong&gt;: eBPF programs require a Linux kernel version above 4.4, with later kernel versions offering more data to be collected. The BCC has &lt;a href=&#34;https://github.com/iovisor/bcc/blob/13b5563c11f7722a61a17c6ca0a1a387d2fa7788/docs/kernel-versions.md#main-features&#34;&gt;documented the features supported by different Linux kernel versions&lt;/a&gt;, with the differences between versions usually being what data can be collected with eBPF.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Privileges Required&lt;/strong&gt;: All processes that intend to load eBPF programs into the Linux kernel must be running in privileged mode. As such, bugs or other issues in such code may have a big impact.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Weak Support for Dynamic Language&lt;/strong&gt;: eBPF has weak support for JIT-based dynamic languages, such as Java. It also depends on what data you want to collect. For Profiling, eBPF does not support parsing the symbols of the program, which is why most eBPF-based profiling technologies only support static languages like C, C++, Go, and Rust. However, symbol mapping can sometimes be solved through tools provided by the language. For example, in Java, &lt;a href=&#34;https://github.com/jvm-profiling-tools/perf-map-agent#architecture&#34;&gt;perf-map-agent&lt;/a&gt; can be used to generate the symbol mapping. However, dynamic languages don&amp;rsquo;t support the attach (uprobe) functionality that would allow us to trace execution events through symbols.&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id=&#34;introducing-skywalking-rover&#34;&gt;Introducing SkyWalking Rover&lt;/h2&gt;
&lt;p&gt;&lt;a href=&#34;https://github.com/apache/skywalking-rover&#34;&gt;SkyWalking Rover&lt;/a&gt; introduces the eBPF profiling feature into the SkyWalking ecosystem. The figure below shows the overall architecture of SkyWalking Rover. SkyWalking Rover is currently supported in Kubernetes environments and must be deployed inside a Kubernetes cluster. After establishing a connection with the SkyWalking backend server, it saves information about the processes on the current machine to SkyWalking. When the user creates an eBPF profiling task via the user interface, SkyWalking Rover receives the task and executes it in the relevant C, C++, Golang, and Rust language-based programs.&lt;/p&gt;
&lt;p&gt;Other than an eBPF-capable kernel, there are no additional prerequisites for deploying SkyWalking Rover.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;architecture.png&#34; alt=&#34;architecture&#34;&gt;&lt;/p&gt;
&lt;h2 id=&#34;cpu-profiling-with-rover&#34;&gt;CPU Profiling with Rover&lt;/h2&gt;
&lt;p&gt;CPU profiling is the most intuitive way to show service performance. Inspired by &lt;a href=&#34;https://www.brendangregg.com/offcpuanalysis.html&#34;&gt;Brendan Gregg‘s blog post&lt;/a&gt;, we&amp;rsquo;ve divided CPU profiling into two types that we have implemented in Rover:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;On-CPU Profiling&lt;/strong&gt;: Where threads are spending time running on-CPU.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Off-CPU Profiling&lt;/strong&gt;: Where time is spent waiting while blocked on I/O, locks, timers, paging/swapping, etc.&lt;/li&gt;
&lt;/ol&gt;
&lt;h1 id=&#34;profiling-envoy-with-ebpf&#34;&gt;Profiling Envoy with eBPF&lt;/h1&gt;
&lt;p&gt;Envoy is a popular proxy, used as the data plane by the Istio service mesh. In a Kubernetes cluster, Istio injects Envoy into each service’s pod as a sidecar where it transparently intercepts and processes incoming and outgoing traffic. As the data plane, any performance issues in Envoy can affect all service traffic in the mesh. In this scenario, it’s more powerful to use &lt;strong&gt;eBPF profiling&lt;/strong&gt; to analyze issues in production caused by service mesh configuration.&lt;/p&gt;
&lt;h2 id=&#34;demo-environment&#34;&gt;Demo Environment&lt;/h2&gt;
&lt;p&gt;If you want to see this scenario in action, we&amp;rsquo;ve built a demo environment where we deploy an Nginx service for stress testing. Traffic is intercepted by Envoy and forwarded to Nginx. The commands to install the whole environment can be accessed through &lt;a href=&#34;https://github.com/mrproliu/skywalking-rover-profiling-demo&#34;&gt;GitHub&lt;/a&gt;.&lt;/p&gt;
&lt;h1 id=&#34;on-cpu-profiling&#34;&gt;On-CPU Profiling&lt;/h1&gt;
&lt;p&gt;On-CPU profiling is suitable for analyzing thread stacks when service CPU usage is high. If the stack is dumped more times, it means that the thread stack occupies more CPU resources.&lt;/p&gt;
&lt;p&gt;When installing Istio using the demo configuration profile, we found there are two places where we can optimize performance:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Zipkin Tracing&lt;/strong&gt;: Different Zipkin sampling percentages have a direct impact on QPS.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Access Log Format&lt;/strong&gt;: Reducing the fields of the Envoy access log can improve QPS.&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id=&#34;zipkin-tracing&#34;&gt;Zipkin Tracing&lt;/h2&gt;
&lt;h3 id=&#34;zipkin-with-100-sampling&#34;&gt;Zipkin with 100% sampling&lt;/h3&gt;
&lt;p&gt;In the default demo configuration profile, Envoy is using 100% sampling as default tracing policy. How does that impact the performance?&lt;/p&gt;
&lt;p&gt;As shown in the figure below, using the &lt;strong&gt;on-CPU profiling&lt;/strong&gt;, we found that it takes about &lt;strong&gt;16%&lt;/strong&gt; of the CPU overhead. At a fixed consumption of &lt;strong&gt;2 CPUs&lt;/strong&gt;, its QPS can reach &lt;strong&gt;5.7K&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;zipkin-sampling-100.png&#34; alt=&#34;Zipkin with 100% sampling&#34;&gt;&lt;/p&gt;
&lt;h3 id=&#34;disable-zipkin-tracing&#34;&gt;Disable Zipkin tracing&lt;/h3&gt;
&lt;p&gt;At this point, we found that if Zipkin is not necessary, the sampling percentage can be reduced or we can even disable tracing. Based on the &lt;a href=&#34;https://istio.io/latest/docs/reference/config/istio.mesh.v1alpha1/#Tracing&#34;&gt;Istio documentation&lt;/a&gt;, we can disable tracing when installing the service mesh using the following command:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-shell&#34; data-lang=&#34;shell&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;istioctl install -y --set &lt;span style=&#34;color:#953800&#34;&gt;profile&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;demo &lt;span style=&#34;color:#0a3069&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;   --set &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#39;meshConfig.enableTracing=false&amp;#39;&lt;/span&gt; &lt;span style=&#34;color:#0a3069&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;   --set &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#39;meshConfig.defaultConfig.tracing.sampling=0.0&amp;#39;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;After disabling tracing, we performed on-CPU profiling again. According to the figure below, we found that Zipkin has disappeared from the flame graph. With the same &lt;strong&gt;2 CPU&lt;/strong&gt; consumption as in the previous example, the QPS reached &lt;strong&gt;9K&lt;/strong&gt;, which is an almost &lt;strong&gt;60%&lt;/strong&gt; increase.
&lt;img src=&#34;zipkin-disable-tracing.png&#34; alt=&#34;Disable Zipkin tracing&#34;&gt;&lt;/p&gt;
&lt;h3 id=&#34;tracing-with-throughput&#34;&gt;Tracing with Throughput&lt;/h3&gt;
&lt;p&gt;With the same CPU usage, we&amp;rsquo;ve discovered that Envoy performance greatly improves when the tracing feature is disabled. Of course, this requires us to make trade-offs between the number of samples Zipkin collects and the desired performance of Envoy (QPS).&lt;/p&gt;
&lt;p&gt;The table below illustrates how different Zipkin sampling percentages under the same CPU usage affect QPS.&lt;/p&gt;
&lt;table&gt;
  &lt;thead&gt;
      &lt;tr&gt;
          &lt;th&gt;Zipkin sampling %&lt;/th&gt;
          &lt;th&gt;QPS&lt;/th&gt;
          &lt;th&gt;CPUs&lt;/th&gt;
          &lt;th&gt;Note&lt;/th&gt;
      &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
      &lt;tr&gt;
          &lt;td&gt;100% &lt;strong&gt;(default)&lt;/strong&gt;&lt;/td&gt;
          &lt;td&gt;5.7K&lt;/td&gt;
          &lt;td&gt;2&lt;/td&gt;
          &lt;td&gt;16% used by Zipkin&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;1%&lt;/td&gt;
          &lt;td&gt;8.1K&lt;/td&gt;
          &lt;td&gt;2&lt;/td&gt;
          &lt;td&gt;0.3% used by Zipkin&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;disabled&lt;/td&gt;
          &lt;td&gt;9.2K&lt;/td&gt;
          &lt;td&gt;2&lt;/td&gt;
          &lt;td&gt;0% used by Zipkin&lt;/td&gt;
      &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;h2 id=&#34;access-log-format&#34;&gt;Access Log Format&lt;/h2&gt;
&lt;h3 id=&#34;default-log-format&#34;&gt;Default Log Format&lt;/h3&gt;
&lt;p&gt;In the default demo configuration profile, &lt;a href=&#34;https://istio.io/latest/docs/tasks/observability/logs/access-log/#default-access-log-format&#34;&gt;the default Access Log format&lt;/a&gt; contains a lot of data. The flame graph below shows various functions involved in parsing the data such as request headers, response headers, and streaming the body.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;log-format-default.png&#34; alt=&#34;Default Log Format&#34;&gt;&lt;/p&gt;
&lt;h3 id=&#34;simplifying-access-log-format&#34;&gt;Simplifying Access Log Format&lt;/h3&gt;
&lt;p&gt;Typically, we don’t need all the information in the access log, so we can often simplify it to get what we need. The following command simplifies the access log format to only display basic information:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-shell&#34; data-lang=&#34;shell&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;istioctl install -y --set &lt;span style=&#34;color:#953800&#34;&gt;profile&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;demo &lt;span style=&#34;color:#0a3069&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;   --set meshConfig.accessLogFormat&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;[%START_TIME%] \&amp;#34;%REQ(:METHOD)% %REQ(X-ENVOY-ORIGINAL-PATH?:PATH)% %PROTOCOL%\&amp;#34; %RESPONSE_CODE%\n&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;After simplifying the access log format, we found that the QPS increased from &lt;strong&gt;5.7K&lt;/strong&gt; to &lt;strong&gt;5.9K&lt;/strong&gt;. When executing the on-CPU profiling again, the CPU usage of log formatting dropped from &lt;strong&gt;2.4%&lt;/strong&gt; to &lt;strong&gt;0.7%&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Simplifying the log format helped us to improve the performance.&lt;/p&gt;
&lt;h1 id=&#34;off-cpu-profiling&#34;&gt;Off-CPU Profiling&lt;/h1&gt;
&lt;p&gt;Off-CPU profiling is suitable for performance issues that are not caused by high CPU usage. For example, when there are too many threads in one service, using off-CPU profiling could reveal which threads spend more time context switching.&lt;/p&gt;
&lt;p&gt;We provide data aggregation in two dimensions:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Switch count&lt;/strong&gt;: The number of times a thread switches context. When the thread returns to the CPU, it completes one context switch. A thread stack with a higher switch count spends more time context switching.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Switch duration&lt;/strong&gt;: The time it takes a thread to switch the context. A thread stack with a higher switch duration spends more time off-CPU.&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id=&#34;write-access-log&#34;&gt;Write Access Log&lt;/h2&gt;
&lt;h3 id=&#34;enable-write&#34;&gt;Enable Write&lt;/h3&gt;
&lt;p&gt;Using the same environment and settings as before in the on-CPU test, we performed off-CPU profiling. As shown below, we found that access log writes accounted for about &lt;strong&gt;28%&lt;/strong&gt; of the total context switches. The &amp;ldquo;__write&amp;rdquo; shown below also indicates that this method is the Linux kernel method.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;access-log-write-enable.png&#34; alt=&#34;Enable Write Access Log&#34;&gt;&lt;/p&gt;
&lt;h3 id=&#34;disable-write&#34;&gt;Disable Write&lt;/h3&gt;
&lt;p&gt;SkyWalking implements Envoy&amp;rsquo;s Access Log Service (ALS) feature which allows us to send access logs to the SkyWalking Observability Analysis Platform (OAP) using the gRPC protocol. Even by disabling the access logging, we can still use ALS to capture/aggregate the logs. We&amp;rsquo;ve disabled writing to the access log using the following command:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-shell&#34; data-lang=&#34;shell&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;istioctl install -y --set &lt;span style=&#34;color:#953800&#34;&gt;profile&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;demo --set meshConfig.accessLogFile&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;After disabling the Access Log feature, we performed the off-CPU profiling. File writing entries have disappeared as shown in the figure below. Envoy throughput also increased from &lt;strong&gt;5.7K&lt;/strong&gt; to &lt;strong&gt;5.9K&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;access-log-write-disable.png&#34; alt=&#34;Disable Write Access Log&#34;&gt;&lt;/p&gt;
&lt;h1 id=&#34;conclusion&#34;&gt;Conclusion&lt;/h1&gt;
&lt;p&gt;In this article, we&amp;rsquo;ve examined the insights Apache Skywalking&amp;rsquo;s Trace Profiling can give us and how much more can be achieved with eBPF profiling. All of these features are implemented in &lt;a href=&#34;https://github.com/apache/skywalking-rover&#34;&gt;skywalking-rover&lt;/a&gt;. In addition to on- and off-CPU profiling, you will also find the following features:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Continuous profiling&lt;/strong&gt;, helps you automatically profile without manual intervention. For example, when Rover detects that the CPU exceeds a configurable threshold, it automatically executes the on-CPU profiling task.&lt;/li&gt;
&lt;li&gt;More profiling types to enrich usage scenarios, such as network, and memory profiling.&lt;/li&gt;
&lt;/ol&gt;

      </description>
    </item>
    
    <item>
      <title>Blog: Apache ShenYu(incubating) plugin implementation principles and observability practices</title>
      <link>/blog/2022-05-08-apache-shenyuincubating-integrated-skywalking-practice-observability/</link>
      <pubDate>Sun, 08 May 2022 00:00:00 +0000</pubDate>
      <guid>/blog/2022-05-08-apache-shenyuincubating-integrated-skywalking-practice-observability/</guid>
      <description>
        
        
        &lt;h3 id=&#34;content&#34;&gt;Content&lt;/h3&gt;
&lt;ol&gt;
&lt;li&gt;&lt;a href=&#34;#1.-Introduction-of-SkyWalking-and-ShenYu&#34;&gt;Introduction of SkyWalking and ShenYu&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#2.-Apache-ShenYu-plugin-implementation-principle&#34;&gt;Apache ShenYu plugin implementation principle&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#3.-Adding-generalized-call-tracking-to-the-gRPC-plugin-and-keeping-it-compatible&#34;&gt;Adding generalized call tracking to the gRPC plugin and keeping it compatible&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#4.-ShenYu-Gateway-Observability-Practice&#34;&gt;ShenYu Gateway Observability Practice&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#5.-Summary&#34;&gt;Summary&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id=&#34;1-introduction-of-skywalking-and-shenyu&#34;&gt;1. Introduction of SkyWalking and ShenYu&lt;/h2&gt;
&lt;h3 id=&#34;11-skywalking&#34;&gt;1.1 SkyWalking&lt;/h3&gt;
&lt;p&gt;&lt;a href=&#34;https://github.com/hutaishi/skywalking&#34;&gt;SkyWalking&lt;/a&gt; is an Application Performance Monitoring (APM) and Observability Analysis Platform (OAP) for microservices, distributed systems, and cloud natives,
Has powerful features that provide a multi-dimensional means of application performance analysis, including distributed topology diagrams, application performance metrics, distributed link tracing, log correlation analysis and alerts. Also has a very rich ecology. Widely used in various companies and open source projects.&lt;/p&gt;
&lt;h3 id=&#34;12-apache-shenyu-incubating&#34;&gt;1.2 Apache ShenYu (incubating)&lt;/h3&gt;
&lt;p&gt;&lt;a href=&#34;https://github.com/apache/incubator-shenyu&#34;&gt;Apache ShenYu (incubating)&lt;/a&gt;
High-performance,multi-protocol,extensible,responsive API Gateway. Compatible with a variety of mainstream framework systems, support hot plug,
users can customize the development, meet the current situation and future needs of users in a variety of scenarios, experienced the temper of large-scale scenes.
Rich protocol support: &lt;code&gt;Http&lt;/code&gt;, &lt;code&gt;Spring Cloud&lt;/code&gt;, &lt;code&gt;gRPC&lt;/code&gt;, &lt;code&gt;Dubbo&lt;/code&gt;, &lt;code&gt;SOFARPC&lt;/code&gt;, &lt;code&gt;Motan&lt;/code&gt;, &lt;code&gt;Tars&lt;/code&gt;, etc.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;shenyu-arch.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;h2 id=&#34;2-apache-shenyu-plugin-implementation-principle&#34;&gt;2. Apache ShenYu plugin implementation principle&lt;/h2&gt;
&lt;p&gt;ShenYu&amp;rsquo;s asynchrony is a little different from previous exposure to asynchrony, it is a full-link asynchrony, the execution of each plug-in is asynchronous, and thread switching is not a single fixed situation (and the individual plug-in implementation is related).
The gateway initiates service calls of various protocol types, and the existing SkyWalking plugins create ExitSpan (synchronous or asynchronous) when they initiate service calls.  The gateway receives the request and creates an asynchronous EntrySpan.
The asynchronous EntrySpan needs to be concatenated with the synchronous or asynchronous ExitSpan, otherwise the link will be broken.&lt;/p&gt;
&lt;p&gt;There are 2 types of tandem solutions：&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Snapshot Delivery&lt;/strong&gt;:&lt;br&gt;
Pass the snapshot after creating the EntrySpan to the thread that created the ExitSpan in some way.&lt;br&gt;
Currently this approach is used in the asynchronous WebClient plugin, which can receive asynchronous snapshots. shenYu proxy Http service or SpringCloud service is to achieve span concatenation through snapshot passing.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;LocalSpan transit&lt;/strong&gt;:&lt;br&gt;
Other RPC class plugins do not receive snapshots for concatenation like Asynchronous WebClient. Although you can modify other RPC plugins to receive snapshots for concatenation, it is not recommended or necessary to do so.
This can be achieved by creating a LocalSpan in the thread where the ExitSpan is created, and then connecting the asynchronous EntrySpan and LocalSpan by &lt;code&gt;snapshot passing&lt;/code&gt;. This can be done without changing the original plugin code.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The span connection is shown below:&lt;br&gt;
&lt;img src=&#34;span-connect.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;p&gt;You may ask if it is possible to create LocalSpan inside a generic plugin, instead of creating one separately for ShenYu RPC plugin?
The answer is no, because you need to ensure that LocalSpan and ExitSpan are in the same thread, and ShenYu is fully linked asynchronously. The code to create LocalSpan is reused in the implementation.&lt;/p&gt;
&lt;h2 id=&#34;3-adding-generalized-call-tracking-to-the-grpc-plugin-and-keeping-it-compatible&#34;&gt;3. Adding generalized call tracking to the gRPC plugin and keeping it compatible&lt;/h2&gt;
&lt;p&gt;The existing SkyWalking gRPC plugin only supports calls initiated by way of stubs. For the gateway there is no proto file, the gateway takes generalized calls (not through stubs), so tracing RPC requests, you will find that the link will break at the gateway node.
In this case, it is necessary to make the gRPC plugin support generalized calls, while at the same time needing to remain compatible and not affect the original tracing method. This is achieved by determining whether the request parameter is a DynamicMessage, and if it is not, then the original tracing logic through the stub is used.
If not, then the original tracing logic via stubs is used, and if not, then the generalized call tracing logic is used. The other compatibility is the difference between the old and new versions of gRPC, as well as the compatibility of various cases of obtaining server-side IP, for those interested in the source code.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;grpc-generic-call.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;h2 id=&#34;4-shenyu-gateway-observability-practice&#34;&gt;4. ShenYu Gateway Observability Practice&lt;/h2&gt;
&lt;p&gt;The above explains the principle of SkyWalking ShenYu plug-in implementation, the following deployment application to see the effect. SkyWalking powerful, in addition to the link tracking requires the development of plug-ins, other powerful features out of the box.
Here only describe the link tracking and application performance analysis part, if you want to experience the power of SkyWalking features, please refer to the &lt;a href=&#34;https://skywalking.apache.org/&#34;&gt;SkyWalking official documentation&lt;/a&gt;.&lt;br&gt;
Version description:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;skywalking-java: &lt;code&gt;8.11.0-SNAPSHOT&lt;/code&gt; source code build. Note: The shenyu plugin will be released in version 8.11.0, and will probably release it initially in May or June. the Java agent is in the regular release phase.&lt;/li&gt;
&lt;li&gt;skywalking: &lt;code&gt;9.0.0&lt;/code&gt; V9 version&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Usage instructions:&lt;br&gt;
SkyWalking is designed to be very easy to use. Please refer to the official documentation for configuring and activating the shenyu plugin.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://skywalking.apache.org/docs/main/latest/readme/&#34;&gt;SkyWalking Documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://skywalking.apache.org/docs/skywalking-java/latest/readme/&#34;&gt;SkyWalking Java Agent Documentation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id=&#34;41-sending-requests-to-the-gateway&#34;&gt;4.1 Sending requests to the gateway&lt;/h3&gt;
&lt;p&gt;Initiate various service requests to the gateway via the &lt;code&gt;postman&lt;/code&gt; client or &lt;code&gt;other means&lt;/code&gt;.&lt;/p&gt;
&lt;h3 id=&#34;42-request-topology-diagram&#34;&gt;4.2 Request Topology Diagram&lt;/h3&gt;
&lt;p&gt;&lt;img src=&#34;topology.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;&lt;img src=&#34;topology2.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h3 id=&#34;43-request-trace-in-the-case-of-grpc&#34;&gt;4.3 Request Trace (in the case of gRPC)&lt;/h3&gt;
&lt;h4 id=&#34;normal-trace&#34;&gt;Normal Trace：&lt;/h4&gt;
&lt;p&gt;&lt;img src=&#34;grpc-ok.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;h4 id=&#34;abnormal-trace&#34;&gt;Abnormal Trace：&lt;/h4&gt;
&lt;p&gt;&lt;img src=&#34;grpc-error.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;p&gt;Click on the link node to see the corresponding node information and exception information&lt;/p&gt;
&lt;h4 id=&#34;service-provider-span&#34;&gt;Service Provider Span&lt;/h4&gt;
&lt;p&gt;&lt;img src=&#34;grpc-error-span.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;h4 id=&#34;gateway-request-span&#34;&gt;Gateway request span&lt;/h4&gt;
&lt;p&gt;&lt;img src=&#34;gateway-error-span.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;h3 id=&#34;44-service-metrics-monitoring&#34;&gt;4.4 Service Metrics Monitoring&lt;/h3&gt;
&lt;p&gt;&lt;img src=&#34;overview.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;h3 id=&#34;45-gateway-background-metrics-monitoring&#34;&gt;4.5 Gateway background metrics monitoring&lt;/h3&gt;
&lt;h4 id=&#34;database-monitoring&#34;&gt;Database Monitoring:&lt;/h4&gt;
&lt;p&gt;&lt;img src=&#34;database.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;h4 id=&#34;thread-pool-and-connection-pool-monitoring&#34;&gt;Thread pool and connection pool monitoring:&lt;/h4&gt;
&lt;p&gt;&lt;img src=&#34;img.png&#34; alt=&#34;img.png&#34;&gt;&lt;/p&gt;
&lt;h3 id=&#34;46-jvm-monitoring&#34;&gt;4.6 JVM Monitoring&lt;/h3&gt;
&lt;p&gt;&lt;img src=&#34;jvm.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;h3 id=&#34;47-endpoint-analysis&#34;&gt;4.7 Endpoint Analysis&lt;/h3&gt;
&lt;p&gt;&lt;img src=&#34;endpoint.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;h3 id=&#34;48-exception-log-and-exception-link-analysis&#34;&gt;4.8 Exception log and exception link analysis&lt;/h3&gt;
&lt;p&gt;&lt;a href=&#34;https://skywalking.apache.org/docs/skywalking-java/latest/en/setup/service-agent/java-agent/application-toolkit-logback-1.x/&#34;&gt;See official documentation for log configuration&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Log monitoring
&lt;img src=&#34;log-trace.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;p&gt;Distributed link trace details corresponding to exception logs
&lt;img src=&#34;log-trace-detail.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;h2 id=&#34;5-summary&#34;&gt;5. Summary&lt;/h2&gt;
&lt;p&gt;SkyWalking has very comprehensive support for metrics, link tracing, and logging in observability, and is powerful, easy to use, and designed for large distributed systems, microservices, cloud-native, container architectures, and has a rich ecosystem.
Using SkyWalking to provide powerful observability support for Apache ShenYu (incubating) gives ShenYu a boost. Finally, if you are interested in high-performance responsive gateways, you can follow
&lt;a href=&#34;https://github.com/apache/incubator-shenyu&#34;&gt;Apache ShenYu (incubating)&lt;/a&gt;.
Also, thanks to SkyWalking such an excellent open source software to the industry contributions.&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Blog: Integrating Apache SkyWalking with source code</title>
      <link>/blog/2022-04-14-integrating-skywalking-with-source-code/</link>
      <pubDate>Thu, 14 Apr 2022 00:00:00 +0000</pubDate>
      <guid>/blog/2022-04-14-integrating-skywalking-with-source-code/</guid>
      <description>
        
        
        &lt;h2 id=&#34;introduction&#34;&gt;Introduction&lt;/h2&gt;
&lt;blockquote&gt;
&lt;p&gt;The most profound technologies are those that disappear. They weave themselves into the fabric of everyday life until they are indistinguishable from it. - Mark Weiser&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Mark Weiser prophetically argued in the late 1980s, that the most far-reaching technologies are those which vanish into thin air. According to Weiser, &amp;ldquo;Whenever people learn something sufficiently well, they cease to be aware of it.&amp;rdquo; This disappearing act, as Weiser claimed, is not limited to technology but rather human psychology. It is this very experience that allows us to escape lower-level thinking into higher-level thinking. For once we are no longer impeded by mundane details, we are then free to focus on new goals.&lt;/p&gt;
&lt;p&gt;This realization becomes more relevant as APMs become increasingly popular. As more applications are deployed with APMs, the number of abstract representations of the underlying source code also increases. While this provides great value to many non-development roles within an organization, it does pose additional challenges to those in development roles who must translate these representations into concepts they can work with (i.e. source code). Weiser sums this difficultly up rather succinctly when he states that &amp;ldquo;Programmers should no more be asked to work without access to source code than auto-mechanics should be asked to work without looking at the engine.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;Still, APMs collect more information only to produce a plethora of new abstract representations. In this article, we will introduce a new concept in &lt;a href=&#34;https://github.com/sourceplusplus/live-platform&#34;&gt;Source++&lt;/a&gt;, the open-source live-coding platform, specifically designed to allow developers to monitor production applications more intuitively.&lt;/p&gt;
&lt;h2 id=&#34;live-views&#34;&gt;Live Views&lt;/h2&gt;
&lt;blockquote&gt;
&lt;p&gt;And we really don&amp;rsquo;t understand even yet, hundreds of metrics later, what make a program easier to understand or modify or reuse or borrow. I don&amp;rsquo;t think we&amp;rsquo;ll find out by looking away from programs to their abstract interfaces. The answers are in the source code. - Mark Weiser&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;As APMs move from the &amp;ldquo;nice to have&amp;rdquo; category to the &amp;ldquo;must-have&amp;rdquo; category, there is a fundamental feature holding them back from ubiquity. They must disappear from consciousness. As developers, we should feel no impulse to open our browsers to better understand the underlying source code. The answers are literally in the source code. Instead, we should improve our tools so the source code conveniently tells us what we need to know. Think of how simple life could be if failing code always indicated how and why it failed. This is the idea behind Source++.&lt;/p&gt;
&lt;p&gt;In our last blog post, we discussed &lt;a href=&#34;https://skywalking.apache.org/blog/2021-12-06-extend-skywalking-with-nbb/&#34;&gt;Extending Apache SkyWalking with non-breaking breakpoints&lt;/a&gt;. In that post, we introduced a concept called &lt;strong&gt;Live Instruments&lt;/strong&gt;, which developers can use to easily debug live production applications without leaving their IDE. Today, we will discuss how existing SkyWalking installations can be integrated into your IDE via a new concept called &lt;strong&gt;Live Views&lt;/strong&gt;. Unlike Live Instruments, which are designed for debugging live applications, Live Views are designed for increasing application comprehension and awareness. This is accomplished through a variety of commands which are input into the Live Command Palette.&lt;/p&gt;
&lt;h3 id=&#34;live-command-palette&#34;&gt;Live Command Palette&lt;/h3&gt;
&lt;p&gt;The Live Command Palette (LCP) is a contextual command prompt, included in the &lt;a href=&#34;https://github.com/sourceplusplus/interface-jetbrains&#34;&gt;Source++ JetBrains Plugin&lt;/a&gt;, that allows developers to control and query live applications from their IDE. Opened via keyboard shortcut (&lt;code&gt;Ctrl+Shift+S&lt;/code&gt;), the LCP allows developers to easily view metrics relevant to the source code they&amp;rsquo;re currently viewing. The following Live View commands are currently supported:&lt;/p&gt;
&lt;h4 id=&#34;command-view-overviewactivitytraceslogs&#34;&gt;Command: view (overview/activity/traces/logs)&lt;/h4&gt;
&lt;p&gt;The &lt;code&gt;view&lt;/code&gt; commands display contextual popups with live operational data of the current source code. These commands allow developers to view traditional SkyWalking operational data filtered down to the relevant metrics.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;view_command.gif&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;h4 id=&#34;command-watch-log&#34;&gt;Command: watch log&lt;/h4&gt;
&lt;p&gt;The &lt;code&gt;watch log&lt;/code&gt; command allows developers to follow individual log statements of a running application in real-time. This command allows developers to negate the need for manually scrolling through the logs to find instances of a specific log statement.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;watch_log_command.gif&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;h4 id=&#34;command-showhide-quick-stats&#34;&gt;Command: (show/hide) quick stats&lt;/h4&gt;
&lt;p&gt;The &lt;code&gt;show quick stats&lt;/code&gt; command displays live endpoint metrics for a quick idea of an endpoint&amp;rsquo;s activity. Using this command, developers can quickly assess the status of an endpoint and determine if the endpoint is performing as expected.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;show_quick_stats_command.gif&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;h2 id=&#34;future-work&#34;&gt;Future Work&lt;/h2&gt;
&lt;blockquote&gt;
&lt;p&gt;A good tool is an invisible tool. By invisible, I mean that the tool does not intrude on your consciousness; you focus on the task, not the tool. Eyeglasses are a good tool &amp;ndash; you look at the world, not the eyeglasses. - Mark Weiser&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Source++ aims to extend SkyWalking in such a way that SkyWalking itself becomes invisible. To accomplish this, we plan to support custom developer commands. Developers will be able to build customized commands for themselves, as well as commands to share with their team. These commands will recognize context, types, and conditions allowing for a wide possibility of operations. As more commands are added, developers will be able to expose everything SkyWalking has to offer while focusing on what matters most, the source code.&lt;/p&gt;
&lt;p&gt;If you find these features useful, please consider giving Source++ a try. You can install the plugin directly from your JetBrains IDE, or through the &lt;a href=&#34;https://plugins.jetbrains.com/plugin/12033-source-&#34;&gt;JetBrains Marketplace&lt;/a&gt;. If you have any issues or questions, please &lt;a href=&#34;https://github.com/sourceplusplus/interface-jetbrains/issues&#34;&gt;open an issue&lt;/a&gt;. Feedback is always welcome!&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Zh: 将 Apache SkyWalking 与源代码集成</title>
      <link>/zh/2022-04-14-integrating-skywalking-with-source-code/</link>
      <pubDate>Thu, 14 Apr 2022 00:00:00 +0000</pubDate>
      <guid>/zh/2022-04-14-integrating-skywalking-with-source-code/</guid>
      <description>
        
        
        &lt;p&gt;&lt;sub&gt;Read this post in original language: &lt;a href=&#34;https://skywalking.apache.org/blog/2022-04-14-integrating-skywalking-with-source-code/&#34;&gt;English&lt;/a&gt;&lt;/sub&gt;&lt;/p&gt;
&lt;h2 id=&#34;介绍&#34;&gt;介绍&lt;/h2&gt;
&lt;blockquote&gt;
&lt;p&gt;最具影响力的技术是那些消失的技术。他们交织在日常生活中，直到二者完全相融。 - 马克韦瑟&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;马克韦瑟在 1980 年代后期预言，影响最深远的技术是那些消失在空气中的技术。&lt;/p&gt;
&lt;p&gt;“当人们足够熟知它，就不会再意识到它。”&lt;/p&gt;
&lt;p&gt;正如韦瑟所说，这种消失的现象不只源于技术，更是人类的心理。
正是这种经验使我们能够摆脱对底层的考量，进入更高层次的思考。
一旦我们不再被平凡的细枝末节所阻碍，我们就可以自如地专注于新的目标。&lt;/p&gt;
&lt;p&gt;随着 APM(应用性能管理系统) 变得越来越普遍，这种认识变得更加重要。随着更多的应用程序开始使用 APM 部署，底层源代码抽象表示的数量也在同步增加。
虽然这为组织内的许多非开发角色提供了巨大的价值，但它确实也对开发人员提出了额外的挑战 -
他们必须将这些表示转化为可操作的概念（即源代码）。
对此，韦瑟相当简洁的总结道,“就像不应要求汽车机械师在不查看引擎的情况下工作一样，我们不应要求程序员在不访问源代码的情况下工作”。&lt;/p&gt;
&lt;p&gt;尽管如此，APM 收集更多信息只是为了产生充足的新抽象表示。
在本文中，我们将介绍开源实时编码平台 &lt;a href=&#34;https://github.com/sourceplusplus/live-platform&#34;&gt;Source++&lt;/a&gt; 中的一个新概念，旨在让开发人员更直观地监控生产应用程序。&lt;/p&gt;
&lt;h2 id=&#34;实时查看&#34;&gt;实时查看&lt;/h2&gt;
&lt;blockquote&gt;
&lt;p&gt;我们尚且不理解在收集了数百个指标之后，是什么让程序更容易理解、修改、重复使用或借用。
我不认为我们能够通过原理程序本身而到它们的抽象接口中找到答案。答案就在源代码之中。 - 马克韦瑟&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;随着 APM 从“有了更好”转变为“必须拥有”，有一个基本特性阻碍了它们的普及。
它们必须从意识中消失。作为开发人员，我们不应急于打开浏览器以更好地理解底层源代码，答案就在源代码中。
相反，我们应该改进我们的工具，以便源代码直观地告诉我们需要了解的内容。
想想如果失败的代码总是表明它是如何以及为什么失败的，生活会多么简单。这就是 Source++ 背后的理念。&lt;/p&gt;
&lt;p&gt;在我们的上一篇博客中，我们讨论了不间断断点 &lt;a href=&#34;https://skywalking.apache.org/blog/2021-12-06-extend-skywalking-with-nbb/&#34;&gt;Extending Apache SkyWalking&lt;/a&gt;。
我们介绍了一个名为 &lt;strong&gt;Live Instruments&lt;/strong&gt;(实时埋点) 的概念，开发人员可以使用它轻松调试实时生产应用程序，而无需离开他们的开发环境。
而今天，我们将讨论如何通过一个名为 &lt;strong&gt;Live Views&lt;/strong&gt;（实时查看）的新概念将现有部署的 SkyWalking 集成到您的 IDE 中。
与专为调试实时应用程序而设计的 Live Instruments (实时埋点) 不同，Live Views（实时查看）旨在提高对应用程序的理解和领悟。
这将通过输入到 Live Command Palette (实时命令面板) 中的各种命令来完成。&lt;/p&gt;
&lt;h3 id=&#34;实时命令面板&#34;&gt;实时命令面板&lt;/h3&gt;
&lt;p&gt;Live Command Palette (LCP) 是一个当前上下文场景下的命令行面板，这个组件包含在 &lt;a href=&#34;https://github.com/sourceplusplus/interface-jetbrains&#34;&gt;Source++ JetBrains 插件中&lt;/a&gt;，它允许开发人员从 IDE 中直接控制和对实时应用程序发起查询。&lt;/p&gt;
&lt;p&gt;LCP 通过键盘快捷键 (&lt;code&gt;Ctrl+Shift+S&lt;/code&gt;) 打开，允许开发人员轻松了解与他们当前正在查看的源代码相关的运行指标。&lt;/p&gt;
&lt;p&gt;目前 LCP 支持以下实时查看命令：&lt;/p&gt;
&lt;h4 id=&#34;命令viewoverviewactivitytraceslogs--查看-总览活动追踪日志&#34;&gt;命令：&lt;code&gt;view&lt;/code&gt;（overview/activity/traces/Logs）- 查看 总览/活动/追踪/日志&lt;/h4&gt;
&lt;p&gt;&lt;code&gt;view&lt;/code&gt; 查看命令会展示一个与当前源码的实时运维数据关联的弹窗。
这些命令允许开发人员查看根据相关指标过滤的传统 SkyWalking 的运维数据。&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;view_command.gif&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;h4 id=&#34;命令watch-log---实时监听日志&#34;&gt;命令：&lt;code&gt;watch log&lt;/code&gt; - 实时监听日志&lt;/h4&gt;
&lt;p&gt;本日志命令允许开发人员实时跟踪正在运行的应用程序的每一条日志。
通过此命令开发人员无需手动查阅大量日志就可以查找特定日志语句的实例。&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;watch_log_command.gif&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;h4 id=&#34;命令showhide-quick-stats-显示隐藏快速统计&#34;&gt;命令：(show/hide) quick stats （显示/隐藏）快速统计&lt;/h4&gt;
&lt;p&gt;&lt;code&gt;show quick stats&lt;/code&gt; 显示快速统计命令显示实时端点指标，以便快速了解端点的活动。
使用此命令，开发人员可以快速评估端点的状态并确定端点是否按预期正常运行。&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;show_quick_stats_command.gif&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;h2 id=&#34;未来的工作&#34;&gt;未来的工作&lt;/h2&gt;
&lt;blockquote&gt;
&lt;p&gt;好工具是无形的。我所指的无形，是指这个工具不会侵入你的意识；
你专注于任务，而不是工具。
眼镜就是很好的工具——你看的是世界，而不是眼镜。 - 马克韦瑟&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Source++ 旨在扩展 SkyWalking，使 SkyWalking 本身变得无需感知。
为此，我们计划支持自定义的开发人员命令。
开发人员将能够构建自定义命令，以及与团队共享的命令。
这些命令将识别上下文、类型和条件，从而允许广泛的操作。
随着更多命令的添加，开发人员将能够洞悉 SkyWalking 所提供的所有功能，同时专注于最重要的源码。&lt;/p&gt;
&lt;p&gt;如果您觉得这些功能有用，请考虑尝试使用 Source++。
您可以通过 &lt;a href=&#34;https://plugins.jetbrains.com/plugin/12033-source-&#34;&gt;JetBrains Marketplace&lt;/a&gt;
或直接从您的 JetBrains IDE 安装插件。
如果您有任何疑问，&lt;a href=&#34;https://github.com/sourceplusplus/interface-jetbrains/issues&#34;&gt;请到这提 issue&lt;/a&gt;。&lt;/p&gt;
&lt;p&gt;欢迎随时反馈！&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Blog: Apache APISIX Integrates with SkyWalking to Create a Full Range of Log Processing</title>
      <link>/blog/2021-12-08-apisix-integrate-skywalking-plugin/apisix-integrate-skywalking-plugin/</link>
      <pubDate>Wed, 08 Dec 2021 00:00:00 +0000</pubDate>
      <guid>/blog/2021-12-08-apisix-integrate-skywalking-plugin/apisix-integrate-skywalking-plugin/</guid>
      <description>
        
        
        &lt;p&gt;In the field of observability, the three main directions of data collection and analysis, Metrics, Logger and Tracing, are usually used to achieve insight into the operational status of applications.&lt;/p&gt;
&lt;p&gt;Apache APISIX has integrated Apache SkyWaling Tracing capabilities as early as version 1.4, with features such as error logging and access log collection added in subsequent versions. Now with Apache SkyWalking&amp;rsquo;s support for Metrics, it enables Apache APISIX to implement a one-stop observable solution in integrated mode, covering both logging, metrics and call tracing.&lt;/p&gt;
&lt;h2 id=&#34;feature-development-background&#34;&gt;Feature Development Background&lt;/h2&gt;
&lt;p&gt;Those of you who are familiar with Apache APISIX should know that Apache APISIX produces two types of logs during operation, namely the access log and the error log.&lt;/p&gt;
&lt;p&gt;Access logs record detailed information about each request and are logs generated within the scope of the request, so they can be directly associated with Tracing. Error logs, on the other hand, are Apache APISIX runtime output log messages, which are application-wide logs, but cannot be 100% associated with requests.&lt;/p&gt;
&lt;p&gt;At present, Apache APISIX provides very rich log processing plug-ins, including TCP/HTTP/Kafka and other collection and reporting plug-ins, but they are weakly associated with Tracing. Take Apache SkyWalking as an example. We extract the SkyWalking Tracing Conetxt Header from the log records of Apache APISIX and export it to the file system, and then use the log processing framework (fluentbit) to convert the logs into a log format acceptable to SkyWalking. The Tracing Context is then parsed and extracted to obtain the Tracing ID to establish a connection with the Trace.&lt;/p&gt;
&lt;p&gt;Obviously, the above way of handling the process is tedious and complicated, and requires additional conversion of log formats. For this reason, in &lt;a href=&#34;https://github.com/apache/apisix/pull/5550&#34;&gt;PR#5500&lt;/a&gt; we have implemented the Apache SkyWalking access log into the Apache APISIX plug-in ecosystem to make it easier for users to collect and process logs using Apache SkyWalking in Apache APISIX.&lt;/p&gt;
&lt;h2 id=&#34;introduction-of-the-new-plugins&#34;&gt;Introduction of the New Plugins&lt;/h2&gt;
&lt;h3 id=&#34;skywalking-logger-pulgin&#34;&gt;SkyWalking Logger Pulgin&lt;/h3&gt;
&lt;p&gt;The SkyWalking Logger plugin parses the SkyWalking Tracing Context Header and prints the relevant Tracing Context information to the log, thus enabling the log to be associated with the call chain.&lt;/p&gt;
&lt;p&gt;By using this plug-in, Apache APISIX can get the SkyWalking Tracing Context and associate it with Tracing even if the SkyWalking Tracing plug-in is not turned on, if Apache SkyWalking is already integrated downstream.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;./log_content.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;p&gt;The above Content is the log content, where the Apache APISIX metadata configuration is used to collect request-related information. You can later modify the Log Format to customize the log content by Plugin Metadata, please refer to the &lt;a href=&#34;https://apisix.apache.org/docs/apisix/plugins/skywalking-logger#metadata&#34;&gt;official documentation.&lt;/a&gt;&lt;/p&gt;
&lt;h4 id=&#34;how-to-use&#34;&gt;How to Use&lt;/h4&gt;
&lt;p&gt;When using this plugin, since the SkyWalking plugin is &amp;ldquo;not enabled&amp;rdquo; by default, you need to manually modify the &lt;code&gt;plugins&lt;/code&gt; section in the &lt;code&gt;conf/default-apisix.yaml&lt;/code&gt; file to enable the plugin.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;plugins&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;  &lt;/span&gt;...&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;  &lt;/span&gt;- error-log-logger&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;  &lt;/span&gt;...&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Then you can use the SkyWalking Tracing plug-in to get the tracing data directly, so you can verify that the Logging plug-in-related features are enabled and working properly.&lt;/p&gt;
&lt;h5 id=&#34;step-1-create-a-route&#34;&gt;Step 1: Create a route&lt;/h5&gt;
&lt;p&gt;Next, create a route and bind the SkyWalking Tracing plugin and the SkyWalking Logging plugin. More details of the plugin configuration can be found in the &lt;a href=&#34;https://apisix.apache.org/docs/apisix/plugins/skywalking-logger&#34;&gt;official Apache APISIX documentation&lt;/a&gt;.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-shell&#34; data-lang=&#34;shell&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;curl -X PUT &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#39;http://192.168.0.108:9080/apisix/admin/routes/1001&amp;#39;&lt;/span&gt; &lt;span style=&#34;color:#0a3069&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;-H &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#39;X-API-KEY:  edd1c9f034335f136f87ad84b625c8f1&amp;#39;&lt;/span&gt; &lt;span style=&#34;color:#0a3069&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;-H &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#39;Content-Type: application/json&amp;#39;&lt;/span&gt; &lt;span style=&#34;color:#0a3069&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;-d &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#39;{
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;    &amp;#34;uri&amp;#34;: &amp;#34;/get&amp;#34;,
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;    &amp;#34;plugins&amp;#34;: {
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;        &amp;#34;skywalking&amp;#34;: {
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;            &amp;#34;sample_ratio&amp;#34;: 1
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;        },
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;        &amp;#34;skywalking-logger&amp;#34;: {
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;            &amp;#34;endpoint_addr&amp;#34;: &amp;#34;http://127.0.0.1:12800&amp;#34;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;        }
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;    },
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;    &amp;#34;upstream&amp;#34;: {
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;        &amp;#34;type&amp;#34;: &amp;#34;roundrobin&amp;#34;,
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;        &amp;#34;nodes&amp;#34;: {
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;            &amp;#34;httpbin.org:80&amp;#34;: 1
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;        }
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;    }
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;}&amp;#39;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h5 id=&#34;step-2-log-processing&#34;&gt;Step 2: Log Processing&lt;/h5&gt;
&lt;p&gt;On the Apache SkyWalking side, you can use LAL (Logger Analysis Language) scripts for log processing, such as Tag extraction, SkyWalking metadata correction, and so on.&lt;/p&gt;
&lt;p&gt;The main purpose of Tag extraction here is to facilitate subsequent retrieval and to add dependencies to the Metrics statistics. The following code can be used to configure the SkyWalking LAL script to complete the Tag extraction. For more information on how to use the SkyWalking LAL script, please refer to the &lt;a href=&#34;https://skywalking.apache.org/docs/main/latest/en/concepts-and-designs/lal/&#34;&gt;official Apache SkyWalking documentation&lt;/a&gt;.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#57606a&#34;&gt;# The default LAL script to save all logs, behaving like the versions before 8.5.0.&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;rules&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;  &lt;/span&gt;- &lt;span style=&#34;color:#0550ae&#34;&gt;name&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;default&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;dsl&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;|&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;      filter {
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;        json {
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;          abortOnFailure false
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;        }
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;        extractor {
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;          tag routeId: parsed.route_id
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;          tag upstream: parsed.upstream
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;          tag clientIp: parsed.client_ip
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;          tag latency: parsed.latency
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;        }
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;        sink {
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;        }
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;      }&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;After configuring the above LAL script in SkyWalking OAP Server the following log will be displayed.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;./LALscript_details.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;p&gt;Details of the expanded log are as follows.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;./expanded_log.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;p&gt;As you can see from the above, displaying &lt;code&gt;routeId&lt;/code&gt;, &lt;code&gt;upstream&lt;/code&gt; and &lt;code&gt;clientIp&lt;/code&gt; as key-value pairs is much easier than searching directly in the log body. This is because the Tag format not only supports log display format and search, but also generates information such as Metrics using MAL statistics.&lt;/p&gt;
&lt;h3 id=&#34;skywalking-error-logger-plugin&#34;&gt;SkyWalking Error Logger Plugin&lt;/h3&gt;
&lt;p&gt;The error-log-logger plug-in now supports the SkyWalking log format, and you can now use the http-error-log plug-in to quickly connect Apache APISIX error logs to Apache SkyWalking. Currently, error logs do not have access to SkyWalking Tracing Context information, and therefore cannot be directly associated with SkyWalking Tracing.&lt;/p&gt;
&lt;p&gt;The main reason for the error log to be integrated into SkyWalking is to centralize the Apache APISIX log data and to make it easier to view all observable data within SkyWalking.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;./skywalking_dashboard.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;h4 id=&#34;how-to-use-1&#34;&gt;How to Use&lt;/h4&gt;
&lt;p&gt;Since the error-log-logger plugin is &amp;ldquo;not enabled&amp;rdquo; by default, you still need to enable the plugin in the way mentioned above.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;plugins&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;  &lt;/span&gt;...&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;  &lt;/span&gt;- error-log-logger&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;  &lt;/span&gt;...&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h5 id=&#34;step-1-bind-the-route&#34;&gt;Step 1: Bind the route&lt;/h5&gt;
&lt;p&gt;After enabling, you need to bind the plugin to routes or global rules. Here we take &amp;ldquo;bind routes&amp;rdquo; as an example.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-shell&#34; data-lang=&#34;shell&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;curl -X PUT &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#39;http://192.168.0.108:9080/apisix/admin/plugin_metadata/error-log-logger&amp;#39;&lt;/span&gt; &lt;span style=&#34;color:#0a3069&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;-H &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#39;X-API-KEY:  edd1c9f034335f136f87ad84b625c8f1&amp;#39;&lt;/span&gt; &lt;span style=&#34;color:#0a3069&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;-H &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#39;Content-Type: application/json&amp;#39;&lt;/span&gt; &lt;span style=&#34;color:#0a3069&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;-d &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#39;{
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;    &amp;#34;inactive_timeout&amp;#34;: 10,
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;    &amp;#34;level&amp;#34;: &amp;#34;ERROR&amp;#34;,
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;    &amp;#34;skywalking&amp;#34;: {
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;        &amp;#34;endpoint_addr&amp;#34;: &amp;#34;http://127.0.0.1:12800/v3/logs&amp;#34;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;    }
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;}&amp;#39;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;blockquote&gt;
&lt;p&gt;Note that the &lt;code&gt;endpoint_addr&lt;/code&gt; is the SkyWalking OAP Server address and needs to have the URI (i.e. &lt;code&gt;/v3/logs&lt;/code&gt;).&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h5 id=&#34;step-2-lal-processing&#34;&gt;Step 2: LAL Processing&lt;/h5&gt;
&lt;p&gt;In much the same way as the Access Log processing, the logs are also processed by LAL when they reach SkyWalking OAP Server. Therefore, we can still use the SkyWalking LAL script to analyze and process the log messages.&lt;/p&gt;
&lt;p&gt;It is important to note that the Error Log message body is in text format. If you are extracting tags, you will need to use regular expressions to do this. Unlike Access Log, which handles the message body in a slightly different way, Acces Log uses JSON format and can directly reference the fields of the JSON object using JSON parsing, but the rest of the process is largely the same.&lt;/p&gt;
&lt;p&gt;Tags can also be used to optimize the display and retrieval for subsequent metrics calculations using SkyWalking MAL.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-json&#34; data-lang=&#34;json&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#f6f8fa;background-color:#82071e&#34;&gt;rules:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#f6f8fa;background-color:#82071e&#34;&gt;-&lt;/span&gt; &lt;span style=&#34;color:#f6f8fa;background-color:#82071e&#34;&gt;name:&lt;/span&gt; &lt;span style=&#34;color:#f6f8fa;background-color:#82071e&#34;&gt;apisix-errlog&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#f6f8fa;background-color:#82071e&#34;&gt;dsl:&lt;/span&gt; &lt;span style=&#34;color:#f6f8fa;background-color:#82071e&#34;&gt;|&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      &lt;span style=&#34;color:#f6f8fa;background-color:#82071e&#34;&gt;filter&lt;/span&gt; &lt;span style=&#34;color:#1f2328&#34;&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#f6f8fa;background-color:#82071e&#34;&gt;text&lt;/span&gt; &lt;span style=&#34;color:#f6f8fa;background-color:#82071e&#34;&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;          &lt;span style=&#34;color:#f6f8fa;background-color:#82071e&#34;&gt;regexp&lt;/span&gt; &lt;span style=&#34;color:#0550ae&#34;&gt;&amp;#34;(?&amp;lt;datetime&amp;gt;\\d{4}/\\d{2}/\\d{2} \\d{2}:\\d{2}:\\d{2}) \\[(?&amp;lt;level&amp;gt;\\w+)\\] \\d+\\#\\d+:( \\*\\d+ \\[(?&amp;lt;module&amp;gt;\\w+)\\] (?&amp;lt;position&amp;gt;.*\\.lua:\\d+): (?&amp;lt;function&amp;gt;\\w+\\(\\)):)* (?&amp;lt;msg&amp;gt;.+)&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#1f2328&#34;&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#f6f8fa;background-color:#82071e&#34;&gt;extractor&lt;/span&gt; &lt;span style=&#34;color:#1f2328&#34;&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;          &lt;span style=&#34;color:#f6f8fa;background-color:#82071e&#34;&gt;tag&lt;/span&gt; &lt;span style=&#34;color:#f6f8fa;background-color:#82071e&#34;&gt;level:&lt;/span&gt; &lt;span style=&#34;color:#f6f8fa;background-color:#82071e&#34;&gt;parsed.level&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;          &lt;span style=&#34;color:#f6f8fa;background-color:#82071e&#34;&gt;if&lt;/span&gt; &lt;span style=&#34;color:#f6f8fa;background-color:#82071e&#34;&gt;(parsed?.module)&lt;/span&gt; &lt;span style=&#34;color:#f6f8fa;background-color:#82071e&#34;&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            &lt;span style=&#34;color:#f6f8fa;background-color:#82071e&#34;&gt;tag&lt;/span&gt; &lt;span style=&#34;color:#f6f8fa;background-color:#82071e&#34;&gt;module:&lt;/span&gt; &lt;span style=&#34;color:#f6f8fa;background-color:#82071e&#34;&gt;parsed.module&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            &lt;span style=&#34;color:#f6f8fa;background-color:#82071e&#34;&gt;tag&lt;/span&gt; &lt;span style=&#34;color:#f6f8fa;background-color:#82071e&#34;&gt;position:&lt;/span&gt; &lt;span style=&#34;color:#f6f8fa;background-color:#82071e&#34;&gt;parsed.position&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            &lt;span style=&#34;color:#f6f8fa;background-color:#82071e&#34;&gt;tag&lt;/span&gt; &lt;span style=&#34;color:#f6f8fa;background-color:#82071e&#34;&gt;function:&lt;/span&gt; &lt;span style=&#34;color:#f6f8fa;background-color:#82071e&#34;&gt;parsed.function&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;          &lt;span style=&#34;color:#1f2328&#34;&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#f6f8fa;background-color:#82071e&#34;&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#f6f8fa;background-color:#82071e&#34;&gt;sink&lt;/span&gt; &lt;span style=&#34;color:#1f2328&#34;&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#1f2328&#34;&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      &lt;span style=&#34;color:#f6f8fa;background-color:#82071e&#34;&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;After the LAL script used by SkyWalking OAP Server, some of the Tags will be extracted from the logs, as shown below.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://static.apiseven.com/202108/1638781886771-b98c80de-4ea2-4cf3-ad1c-26250da231f7.png&#34; alt=&#34;Tags details&#34;&gt;&lt;/p&gt;
&lt;h2 id=&#34;summary&#34;&gt;Summary&lt;/h2&gt;
&lt;p&gt;This article introduces two logging plug-ins for Apache APISIX that integrate with SkyWalking to provide a more convenient operation and environment for logging in Apache APISIX afterwards.&lt;/p&gt;
&lt;p&gt;We hope that through this article, you will have a fuller understanding of the new features and be able to use Apache APISIX for centralized management of observable data more conveniently in the future.&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Blog: Extending Apache SkyWalking with non-breaking breakpoints</title>
      <link>/blog/2021-12-06-extend-skywalking-with-nbb/</link>
      <pubDate>Mon, 06 Dec 2021 00:00:00 +0000</pubDate>
      <guid>/blog/2021-12-06-extend-skywalking-with-nbb/</guid>
      <description>
        
        
        &lt;p&gt;Non-breaking breakpoints are breakpoints specifically designed for live production environments. With non-breaking breakpoints, reproducing production bugs locally or in staging is conveniently replaced with capturing them directly in production.&lt;/p&gt;
&lt;p&gt;Like regular breakpoints, non-breaking breakpoints can be:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;placed almost anywhere&lt;/li&gt;
&lt;li&gt;added and removed at will&lt;/li&gt;
&lt;li&gt;set to fire on specific conditions&lt;/li&gt;
&lt;li&gt;expose internal application state&lt;/li&gt;
&lt;li&gt;persist as long as desired (even between application reboots)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The last feature is especially useful given non-breaking breakpoints can be left in production for days, weeks, and even months at a time while waiting to capture behavior that happens rarely and unpredictably.&lt;/p&gt;
&lt;h4 id=&#34;how-do-non-breaking-breakpoints-work&#34;&gt;How do non-breaking breakpoints work?&lt;/h4&gt;
&lt;p&gt;If you&amp;rsquo;re familiar with general distributed tracing concepts, such as &amp;ldquo;traces&amp;rdquo; and &amp;ldquo;spans&amp;rdquo;, then you&amp;rsquo;re already broadly familiar with how non-breaking breakpoints work. Put simply, non-breaking breakpoints are small fragments of code added during runtime that, upon the proper conditions, save a portion of the application&amp;rsquo;s current state, and resume normal execution. In SkyWalking, this can be implemented by simply opening a new local span, adding some tags, and closing the local span.&lt;/p&gt;
&lt;p&gt;While this process is relatively simple, the range of functionality that can be achieved through this technique is quite impressive.
Save the current and global variables to create a non-breaking breakpoint; add the ability to format log messages to create just-in-time logging; add the ability to trigger metric telemetry to create real-time KPI monitoring. If you keep moving in this direction, you eventually enter the realm of live debugging/coding, and this is where Source++ comes in.&lt;/p&gt;
&lt;h4 id=&#34;live-coding-platform&#34;&gt;Live Coding Platform&lt;/h4&gt;
&lt;p&gt;&lt;a href=&#34;https://github.com/sourceplusplus&#34;&gt;Source++&lt;/a&gt; is an open-source live coding platform designed for production environments, powered by Apache SkyWalking. Using Source++, developers can add breakpoints, logs, metrics, and distributed tracing to live production software in real-time on-demand, right from their IDE or CLI. While capable of stand-alone deployment, the latest version of Source++ makes it easier than ever to integrate into existing Apache SkyWalking installations. This process can be completed in a few minutes and is easy to customize for your specific needs.&lt;/p&gt;
&lt;p&gt;For a better idea of how Source++ works, take a look at the following diagram:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;./enhanced_sw_setup.svg&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;p&gt;In this diagram, blue components represent existing SkyWalking architecture, black components represent new Source++ architecture, and the red arrows show how non-breaking breakpoints make their way from production to IDEs. A process that is facilitated by Source++ components: Live Probe, Live Processors, Live Platform, and Live Interface.&lt;/p&gt;
&lt;h4 id=&#34;live-probe&#34;&gt;Live Probe&lt;/h4&gt;
&lt;p&gt;The Live Probe is currently available for &lt;a href=&#34;https://github.com/sourceplusplus/probe-jvm&#34;&gt;JVM&lt;/a&gt; and &lt;a href=&#34;https://github.com/sourceplusplus/probe-python&#34;&gt;Python&lt;/a&gt; applications. It runs alongside the SkyWalking agent and is responsible for dynamically adding and removing code fragments based on valid instrumentation requests from developers. These code fragments in turn make use of the SkyWalking agent&amp;rsquo;s internal APIs to facilitate production instrumentation.&lt;/p&gt;
&lt;h4 id=&#34;live-processors&#34;&gt;Live Processors&lt;/h4&gt;
&lt;p&gt;Live Processors are responsible for finding, extracting, and transforming data found in distributed traces produced via live probes. They run alongside SkyWalking collectors and implement additional post-processing logic, such as PII redaction. Live processors work via uniquely identifiable tags (prefix &lt;code&gt;spp.&lt;/code&gt;) added previously by live probes.&lt;/p&gt;
&lt;p&gt;One could easily view a non-breaking breakpoint ready for processing using Rocketbot, however, it will look like this:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;./rocketbot_nbb.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;p&gt;Even though the above does not resemble what&amp;rsquo;s normally thought of as a breakpoint, the necessary information is there. With live processors added to your SkyWalking installation, this data is refined and may be viewed more traditionally via live interfaces.&lt;/p&gt;
&lt;h4 id=&#34;live-platform&#34;&gt;Live Platform&lt;/h4&gt;
&lt;p&gt;The &lt;a href=&#34;https://github.com/sourceplusplus/live-platform&#34;&gt;Live Platform&lt;/a&gt; is the core part of the Source++ architecture. Unlike the live probe and processors, the live platform does not have a direct correlation with SkyWalking components. It is a standalone server responsible for validating and distributing production breakpoints, logs, metrics, and traces. Each component of the Source++ architecture (probes, processors, interfaces) communicates with each other through the live platform. It is important to ensure the live platform is accessible to all of these components.&lt;/p&gt;
&lt;h4 id=&#34;live-interface&#34;&gt;Live Interface&lt;/h4&gt;
&lt;p&gt;Finally, with all the previous parts installed, we&amp;rsquo;re now at the component software developers will find the most useful. A Live Interface is what developers use to create, manage, and view non-breaking breakpoints, and so on. There are a few live interfaces available:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/sourceplusplus/interface-jetbrains&#34;&gt;JetBrains Plugin&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/sourceplusplus/interface-cli&#34;&gt;CLI&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;With the &lt;a href=&#34;https://github.com/sourceplusplus/processor-live-instrument&#34;&gt;Live Instrument Processor&lt;/a&gt; enabled, and the &lt;a href=&#34;https://github.com/sourceplusplus/interface-jetbrains&#34;&gt;JetBrains Plugin&lt;/a&gt; installed, non-breaking breakpoints appear as such:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;./ide_breakpoint.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;p&gt;The above should be a sight far more familiar to software developers. Beyond the fact that you can&amp;rsquo;t step through execution, non-breaking breakpoints look and feel just like regular breakpoints.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;For more details and complete setup instructions, please visit:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/sourceplusplus/deploy-skywalking&#34;&gt;https://github.com/sourceplusplus/deploy-skywalking&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

      </description>
    </item>
    
    <item>
      <title>Blog: End-User Tracing in a SkyWalking-Observed Browser</title>
      <link>/blog/end-user-tracing-in-a-skywalking-observed-browser/</link>
      <pubDate>Thu, 25 Mar 2021 00:00:00 +0000</pubDate>
      <guid>/blog/end-user-tracing-in-a-skywalking-observed-browser/</guid>
      <description>
        
        
        &lt;p&gt;&lt;img src=&#34;aircraft.jpg&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Origin: &lt;a href=&#34;https://thenewstack.io/end-user-tracing-in-a-skywalking-observed-browser&#34;&gt;End-User Tracing in a SkyWalking-Observed Browser - The New Stack&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;a href=&#34;https://github.com/apache/skywalking&#34;&gt;Apache SkyWalking&lt;/a&gt;: an APM (application performance monitor) system, especially
designed for microservices, cloud native, and container-based (Docker, Kubernetes, Mesos) architectures.&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://github.com/apache/skywalking-client-js&#34;&gt;skywalking-client-js&lt;/a&gt;: a lightweight client-side JavaScript exception, performance, and tracing library. It provides metrics and error collection to the SkyWalking backend. It also makes the browser the starting point for distributed tracing.&lt;/p&gt;
&lt;h2 id=&#34;background&#34;&gt;Background&lt;/h2&gt;
&lt;p&gt;Web application performance affects the retention rate of users. If a page load time is too long, the user will give up. So we need to monitor the web application to understand performance and ensure that servers are stable, available and healthy. SkyWalking is an APM tool and the skywalking-client-js extends its monitoring to include the browser, providing performance metrics and error collection to the SkyWalking backend.&lt;/p&gt;
&lt;h2 id=&#34;performance-metrics&#34;&gt;Performance Metrics&lt;/h2&gt;
&lt;p&gt;The skywalking-client-js uses [window.performance] (&lt;a href=&#34;https://developer.mozilla.org/en-US/docs/Web/API/Window/performance&#34;&gt;https://developer.mozilla.org/en-US/docs/Web/API/Window/performance&lt;/a&gt;) for performance data collection. From the MDN doc, the performance interface provides access to performance-related information for the current page. It&amp;rsquo;s part of the High Resolution Time API, but is enhanced by the &lt;a href=&#34;https://developer.mozilla.org/en-US/docs/Web/API/Performance_Timeline&#34;&gt;Performance Timeline API&lt;/a&gt;,  the &lt;a href=&#34;https://developer.mozilla.org/en-US/docs/Web/API/Navigation_timing_API&#34;&gt;Navigation Timing API&lt;/a&gt;, the &lt;a href=&#34;https://developer.mozilla.org/en-US/docs/Web/API/User_Timing_API&#34;&gt;User Timing API&lt;/a&gt;, and the &lt;a href=&#34;https://developer.mozilla.org/en-US/docs/Web/API/Resource_Timing_API&#34;&gt;Resource Timing API&lt;/a&gt;. In skywalking-client-js, all performance metrics  are calculated according to the &lt;a href=&#34;https://www.w3.org/TR/navigation-timing/?spm=a2c4g.11186623.2.12.2f495c7cmRef8Q#sec-navigation-timing-interface&#34;&gt;Navigation Timing API&lt;/a&gt; defined in the W3C specification. We can get a PerformanceTiming object describing our page using the window.performance.timing property. The PerformanceTiming interface contains properties that offer performance timing information for various events that occur during the loading and use of the current page.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;window.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;p&gt;We can better understand these attributes when we see them together in the figure below from &lt;a href=&#34;https://www.w3.org/TR/navigation-timing/?spm=a2c4g.11186623.2.14.2f495c7cmRef8Q#processing-model&#34;&gt;W3C&lt;/a&gt;:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;w3c.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;p&gt;The following table contains performance metrics in skywalking-client-js.&lt;/p&gt;
&lt;table&gt;
  &lt;thead&gt;
      &lt;tr&gt;
          &lt;th&gt;Metrics Name&lt;/th&gt;
          &lt;th&gt;Describe&lt;/th&gt;
          &lt;th&gt;Calculating Formulae&lt;/th&gt;
          &lt;th&gt;Note&lt;/th&gt;
      &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
      &lt;tr&gt;
          &lt;td&gt;redirectTime&lt;/td&gt;
          &lt;td&gt;Page redirection time&lt;/td&gt;
          &lt;td&gt;redirectEnd - redirectStart&lt;/td&gt;
          &lt;td&gt;If the current document and the document that is redirected to are not from the same &lt;a href=&#34;http://tools.ietf.org/html/rfc6454&#34;&gt;origin&lt;/a&gt;, set redirectStart, redirectEnd to 0&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;ttfbTime&lt;/td&gt;
          &lt;td&gt;Time to First Byte&lt;/td&gt;
          &lt;td&gt;responseStart - requestStart&lt;/td&gt;
          &lt;td&gt;According to &lt;a href=&#34;https://developers.google.com/web/tools/chrome-devtools/network/reference?spm=a2c4g.11186623.2.16.2f495c7cmRef8Q#timing&#34;&gt;Google Development&lt;/a&gt;&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;dnsTime&lt;/td&gt;
          &lt;td&gt;Time to DNS query&lt;/td&gt;
          &lt;td&gt;domainLookupEnd - domainLookupStart&lt;/td&gt;
          &lt;td&gt;&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;tcpTime&lt;/td&gt;
          &lt;td&gt;Time to TCP link&lt;/td&gt;
          &lt;td&gt;connectEnd - connectStart&lt;/td&gt;
          &lt;td&gt;&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;transTime&lt;/td&gt;
          &lt;td&gt;Time to content transfer&lt;/td&gt;
          &lt;td&gt;responseEnd - responseStart&lt;/td&gt;
          &lt;td&gt;&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;sslTime&lt;/td&gt;
          &lt;td&gt;Time to SSL secure connection&lt;/td&gt;
          &lt;td&gt;connectEnd - secureConnectionStart&lt;/td&gt;
          &lt;td&gt;Only supports HTTPS&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;resTime&lt;/td&gt;
          &lt;td&gt;Time to resource loading&lt;/td&gt;
          &lt;td&gt;loadEventStart - domContentLoadedEventEnd&lt;/td&gt;
          &lt;td&gt;Represents a synchronized load resource in pages&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;fmpTime&lt;/td&gt;
          &lt;td&gt;Time to First Meaningful Paint&lt;/td&gt;
          &lt;td&gt;-&lt;/td&gt;
          &lt;td&gt;Listen for changes in page elements. Traverse each new element, and calculate the total score of these elements. If the element is visible, the score is 1 * weight; if the element is not visible, the score is 0&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;domAnalysisTime&lt;/td&gt;
          &lt;td&gt;Time to DOM analysis&lt;/td&gt;
          &lt;td&gt;domInteractive - responseEnd&lt;/td&gt;
          &lt;td&gt;&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;fptTime&lt;/td&gt;
          &lt;td&gt;First Paint Time&lt;/td&gt;
          &lt;td&gt;responseEnd - fetchStart&lt;/td&gt;
          &lt;td&gt;&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;domReadyTime&lt;/td&gt;
          &lt;td&gt;Time to DOM ready&lt;/td&gt;
          &lt;td&gt;domContentLoadedEventEnd - fetchStart&lt;/td&gt;
          &lt;td&gt;&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;loadPageTime&lt;/td&gt;
          &lt;td&gt;Page full load time&lt;/td&gt;
          &lt;td&gt;loadEventStart - fetchStart&lt;/td&gt;
          &lt;td&gt;&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;ttlTime&lt;/td&gt;
          &lt;td&gt;Time to interact&lt;/td&gt;
          &lt;td&gt;domInteractive - fetchStart&lt;/td&gt;
          &lt;td&gt;&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;firstPackTime&lt;/td&gt;
          &lt;td&gt;Time to first package&lt;/td&gt;
          &lt;td&gt;responseStart - domainLookupStart&lt;/td&gt;
          &lt;td&gt;&lt;/td&gt;
      &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Skywalking-client-js collects those performance metrics and sends them to the &lt;a href=&#34;https://skywalking.apache.org/docs/main/latest/en/concepts-and-designs/backend-overview/&#34;&gt;OAP (Observability Analysis Platform) server&lt;/a&gt; , which aggregates data on the back-end side that is then shown in visualizations on the UI side. Users can optimize the page according to these data.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;performance.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;h2 id=&#34;exception-metrics&#34;&gt;Exception Metrics&lt;/h2&gt;
&lt;p&gt;There are five kinds of errors that can be caught in skywalking-client-js:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The resource loading error  is captured  by &lt;code&gt;window.addeventlistener (&#39;error &#39;, callback, true)&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;window.onerror&lt;/code&gt; catches JS execution errors&lt;/li&gt;
&lt;li&gt;&lt;code&gt;window.addEventListener(&#39;unhandledrejection&#39;, callback)&lt;/code&gt; is used to catch the promise errors&lt;/li&gt;
&lt;li&gt;the  Vue errors are captured by &lt;code&gt;Vue.config.errorHandler&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;the Ajax errors are captured by &lt;code&gt;addEventListener(&#39;error&#39;, callback); addEventListener(&#39;abort&#39;, callback); addEventListener(&#39;timeout&#39;, callback); &lt;/code&gt; in send callback.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The Skywalking-client-js traces error data to the OAP server, finally visualizing data on the UI side.  For an error overview of the App, there are several metrics for basic statistics and trends of errors, including the following metrics.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;App  Error Count, the total number of errors in the selected time period.&lt;/li&gt;
&lt;li&gt;App JS Error Rate, the proportion of PV with JS errors in a selected time period to total PV.&lt;/li&gt;
&lt;li&gt;All of  Apps Error Count, Top N Apps error count  ranking.&lt;/li&gt;
&lt;li&gt;All of Apps JS Error Rate, Top N Apps JS error rate ranking.&lt;/li&gt;
&lt;li&gt;Error Count of  Versions in the Selected App,  Top N Error Count of  Versions in the Selected App ranking.&lt;/li&gt;
&lt;li&gt;Error Rate  of  Versions in the Selected App, Top N JS Error Rate of Versions in the Selected App ranking.&lt;/li&gt;
&lt;li&gt;Error Count of the Selected App, Top N Error Count of the Selected App ranking.&lt;/li&gt;
&lt;li&gt;Error Rate  of the Selected App, Top N JS Error Rate of the Selected App ranking.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;img src=&#34;errors.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;p&gt;For pages, we use several metrics for basic statistics and trends of errors, including the following metrics:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Top Unstable Pages / Error Rate, Top N Error Count pages of the Selected version ranking.&lt;/li&gt;
&lt;li&gt;Top Unstable Pages / Error Count, Top N Error Count pages of the Selected version ranking.&lt;/li&gt;
&lt;li&gt;Page Error Count Layout, data display of different errors in a period of time.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;img src=&#34;trends-errors.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;h2 id=&#34;user-metrics&#34;&gt;User Metrics&lt;/h2&gt;
&lt;p&gt;SkyWalking browser monitoring also provides metrics about how the visitors use the monitored websites, such as PV(page views), UV(unique visitors), top N PV(page views), etc.&lt;/p&gt;
&lt;p&gt;In SPAs (single page applications), the page will be refreshed only once. The traditional method only reports PV once after the page loading, but cannot count the PV of each sub-page, and can&amp;rsquo;t make other types of logs aggregate by sub-page.&lt;/p&gt;
&lt;p&gt;SkyWalking browser monitoring provides two processing methods for SPA pages:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Enable SPA automatic parsing. This method is suitable for most single page application scenarios with URL hash as the route. In the initialized configuration item, set enableSPA to true, which will turn on the page&amp;rsquo;s hashchange event listener (trigger re reporting PV), and use URL hash as the page field in other data reporting.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Manual reporting. This method can be used in all single page application scenarios. This method can be used if the first method is not usable. The following example provides a set page method to manually update the page name when data is reported. When this method is called, the page PV will be re reported by default:&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-js&#34; data-lang=&#34;js&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;app&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;.&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;on&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#39;routeChange&amp;#39;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#cf222e&#34;&gt;function&lt;/span&gt; &lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;to&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;)&lt;/span&gt; &lt;span style=&#34;color:#1f2328&#34;&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#1f2328&#34;&gt;ClientMonitor&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;.&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;setPerformance&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;({&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#1f2328&#34;&gt;collector&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#39;http://127.0.0.1:8080&amp;#39;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#1f2328&#34;&gt;service&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#39;browser-app&amp;#39;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#1f2328&#34;&gt;serviceVersion&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#39;1.0.0&amp;#39;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#1f2328&#34;&gt;pagePath&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#1f2328&#34;&gt;to&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;.&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;path&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#1f2328&#34;&gt;autoTracePerf&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#cf222e&#34;&gt;true&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#1f2328&#34;&gt;enableSPA&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#cf222e&#34;&gt;true&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#1f2328&#34;&gt;});&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;});&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Let&amp;rsquo;s take a look at the result found in the following image. It shows the most popular applications and versions, and the changes of PV over a period of time.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;user.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;h2 id=&#34;make-the-browser-the-starting-point-for-distributed-tracing&#34;&gt;Make the browser the starting point for distributed tracing&lt;/h2&gt;
&lt;p&gt;SkyWalking browser monitoring intercepts HTTP requests to trace segments and spans. It supports tracking these following modes of HTTP requests: XMLHttpRequest and fetch. It also supports tracking libraries and tools based on XMLHttpRequest and fetch - such as Axios, SuperAgent, OpenApi, and so on.&lt;/p&gt;
&lt;p&gt;Let’s see how the SkyWalking browser monitoring intercepts HTTP requests:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;interceptor.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;p&gt;After this, use &lt;code&gt;window.addEventListener(&#39;xhrReadyStateChange&#39;, callback)&lt;/code&gt; and set the readyState value to&lt;code&gt;sw8 = xxxx&lt;/code&gt; in the request header. At the same time, reporting requests information to the back-end side. Finally, we can view trace data on the trace page. The following graphic is from the trace page:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;trace.jpg&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;p&gt;To see how we listen for fetch requests, let’s see the source code of &lt;a href=&#34;https://github.com/github/fetch/blob/90fb680c1f50181782f276122c1b1115535b1603/fetch.js#L506&#34;&gt;fetch&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;fetch.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;p&gt;As you can see, it creates a promise and a new XMLHttpRequest object. Because the code of the fetch is built into the browser, it must monitor the code execution first. Therefore, when we add listening events, we can&amp;rsquo;t monitor the code in the fetch. Just after monitoring the code execution, let&amp;rsquo;s rewrite the fetch:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-js&#34; data-lang=&#34;js&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#cf222e&#34;&gt;import&lt;/span&gt; &lt;span style=&#34;color:#1f2328&#34;&gt;{&lt;/span&gt; &lt;span style=&#34;color:#1f2328&#34;&gt;fetch&lt;/span&gt; &lt;span style=&#34;color:#1f2328&#34;&gt;}&lt;/span&gt; &lt;span style=&#34;color:#1f2328&#34;&gt;from&lt;/span&gt; &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#39;whatwg-fetch&amp;#39;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;;&lt;/span&gt; &lt;span style=&#34;color:#6639ba&#34;&gt;window&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;.&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;fetch&lt;/span&gt; &lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#1f2328&#34;&gt;fetch&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;In this way, we can intercept the fetch request through the above method.&lt;/p&gt;
&lt;h2 id=&#34;additional-resources&#34;&gt;Additional Resources&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;/blog/end-user-tracing-in-a-skywalking-observed-browser&#34;&gt;End-User Tracing in a SkyWalking-Observed Browser&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;

      </description>
    </item>
    
  </channel>
</rss>
