Distributed tracing at Pinterest with new open source tools

The Pinterest backend consists of hundreds of microservices deployed across tens of thousands of machines. A simple request to the Pinterest home feed can make hundreds of network calls to dozens of backend services. While we know how long a request takes to execute, previously we didn’t know why a request was slow. To collect this data, we built Pintrace, a distributed tracing pipeline that tracks a request as it’s processed across our Python and Java backend services. To help us debug performance issues, we also built Pintrace collector, a Spark job that reads spans from Kafka, aggregates them into traces and stores them in an Elasticsearch backend. Today, we’re open sourcing Pintrace collector on GitHub and contributing the code to the OpenZipkin community. By releasing it to the community, we hope others can benefit from the project as much as we do.

Background

Tracking down the source of latency can be very challenging because the slowness can be caused by any one of our backend services. Our metrics (OpenTSDB) and logging (ELK) infrastructure provide a coarse-grained view of how requests are processed on the backend, but before Pintrace, it was hard to pinpoint why a request was slow.

To solve this challenge, we turned to distributed tracing, which tracks latency issues and shows where a request spends its time. It provides fine-grained visibility into request execution by capturing request causality information (i.e., which downstream requests were executed as part of a request) in addition to request latency information. Pintrace is our in-house distributed tracing pipeline that tracks a request as it’s processed across our Python and Java backend services.

Before we look into the details, let’s define some terminology. Annotations are events occurring during a request; for example, sending a request to the server is an annotation. A span represents a logical operation and contains the set of annotations that belong to that operation. For example, an RPC call consisting of "client send" and "client receive" annotations is represented as a span. A trace is a graph of spans, and a request execution is usually captured as a trace. All spans in a trace share the same trace ID. For more details on how traces are structured, check out OpenTracing's introduction.
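To make the terminology concrete, here’s a minimal sketch of how an RPC call might be captured as a span inside a trace. The field names loosely follow the Zipkin v1 data model and are illustrative only, not Pintrace’s exact schema:

```python
# Hypothetical illustration of the span/trace data model described above.
# Field names loosely follow Zipkin v1; they are not Pintrace's exact schema.

root_span = {
    "trace_id": "7f46165474d11ee5",   # shared by every span in this trace
    "id": "a1b2c3d4e5f60718",          # unique per span
    "parent_id": None,                  # root span of the trace
    "name": "get_home_feed",            # the logical operation this span represents
    "annotations": [
        {"value": "cs", "timestamp": 1500000000000000},  # client send
        {"value": "cr", "timestamp": 1500000000125000},  # client receive
    ],
}

# A downstream Thrift call made while serving the request becomes a child span
# that carries the same trace_id and points at its parent span.
child_span = {
    "trace_id": "7f46165474d11ee5",
    "id": "0918273645abcdef",
    "parent_id": "a1b2c3d4e5f60718",
    "name": "followfeed.get_feed",
    "annotations": [
        {"value": "cs", "timestamp": 1500000000010000},
        {"value": "cr", "timestamp": 1500000000090000},
    ],
}
```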

Pintrace: a tracing pipeline

Building a distributed tracing infrastructure involves setting up a tracing pipeline. A tracing pipeline consists of two main components: source code instrumentation and a trace processing backend.

The source code instrumentation is responsible for tracking the request across services and generating events (spans) as the request is processed along various services. The trace processing backend is responsible for aggregating the generated spans, processing them into traces and storing them. The stored traces can be searched and visualized in a UI.

The Pintrace tracing pipeline consists of several components shown in the following diagram. The components in blue are built in-house.


[Diagram: the Pintrace tracing pipeline]
Source code instrumentation

Source code instrumentation is the bulk of the work when setting up a tracing pipeline. The instrumentation is responsible for generating spans that belong to the same request with the same trace ID. To achieve this, the instrumentation does three things: (a) propagating the trace ID along with every request, (b) generating spans for every logical operation performed by that component, and (c) sending the generated spans to the backend. The first two responsibilities are implemented by a tracer; a logger implements the third. At Pinterest, we log the spans to Kafka using an analytics tool called Singer.
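As a rough illustration of the first task, here’s a minimal sketch of carrying trace context on an outgoing HTTP call. The B3-style header names and the use of the `requests` library are assumptions for illustration, not Pintrace’s actual implementation:

```python
import random

import requests  # assumed HTTP client, for illustration only


def with_trace_headers(headers, trace_id, parent_span_id):
    """Attach Zipkin/B3-style trace headers to an outgoing request.

    The exact header names Pintrace uses are not specified in the post;
    B3 headers are shown here as a common convention.
    """
    span_id = "%016x" % random.getrandbits(64)  # new span for this outgoing call
    headers.update({
        "X-B3-TraceId": trace_id,          # same for every span in the request
        "X-B3-ParentSpanId": parent_span_id,
        "X-B3-SpanId": span_id,
    })
    return headers, span_id


def call_downstream(url, trace_id, parent_span_id):
    headers, span_id = with_trace_headers({}, trace_id, parent_span_id)
    resp = requests.get(url, headers=headers)
    # The tracer would also record a span (client send/receive annotations)
    # for this call and hand it to the logger, which ships it to Kafka.
    return resp
```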

The Pinterest backend consists of Thrift services written in Python, Java, Go, Node and C++, each using different frameworks. Tracing a request across these services requires a tracer for each language/framework combination. Pinterest has a common framework per language, and most of the services we’re interested in are written in Python and Java, so we only had to instrument the Python and Java frameworks. When we first started building Pintrace, the initial goal was to track down network request latency, so we focused solely on capturing the network activity of a request as spans in Python and Java applications.

Python instrumentation

Our Java services are fronted by a monolithic Python application called ngapi, an HTTP service that handles all incoming requests to our backend. To serve a request, ngapi talks to several Java services over Thrift. ngapi is built on a custom web framework based on the Python gevent library, and since it’s a custom framework, we implemented our own tracer to trace the requests passing through it.

We wrote a tracer for our web framework using the OpenTracing API. The tracer propagated the trace ID from the HTTP interface to the Java Thrift services using a custom Thrift wrapper provided by Finagle. In addition, the tracer generated spans for all Thrift, Memcache and HTTP requests. To report these spans, we wrote a logger that took the generated spans, converted them to Zipkin format and logged them to Kafka using Singer.
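The logger’s role can be sketched roughly as below. Singer tails log files and ships them to Kafka, so the sketch appends one Zipkin-style JSON span per line to a file; the class name, file path and field names are hypothetical stand-ins, not the real Pintrace logger:

```python
import json


class SingerSpanLogger(object):
    """Hypothetical sketch of a span logger: converts finished spans into a
    Zipkin-style JSON payload and appends them to a log file that the Singer
    agent tails and publishes to Kafka. Names and formats are illustrative."""

    def __init__(self, log_path="/var/log/pintrace_spans.log"):
        self.log_path = log_path  # assumed path watched by the Singer agent

    def report_span(self, span):
        # Convert the in-memory span (a dict here) into a Zipkin-like record.
        payload = {
            "traceId": span["trace_id"],
            "id": span["span_id"],
            "parentId": span.get("parent_span_id"),
            "name": span["operation_name"],
            "timestamp": span["start_us"],
            "duration": span["end_us"] - span["start_us"],
        }
        # One JSON span per line; Singer picks up appended lines and
        # publishes them to the spans Kafka topic for the collector.
        with open(self.log_path, "a") as f:
            f.write(json.dumps(payload) + "\n")
```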

Since ngapi is managed in a monorepo and deployed as a monolithic application, the change to enable tracing is deployed in a single step across all frontend services.

Java tracer and logger

Our Java services use an in-house framework called “service framework,” a wrapper around Twitter’s Finagle library. Since Finagle ships with a Zipkin tracer, we didn’t build a new tracer for our Java services. By default, the Finagle tracer writes the gathered spans to Scribe or a log file, but since we wanted to write our spans to Kafka using Singer, we wrote a span logger that logs the spans to Singer.

Once the tracer and logger were in place, we enabled them in our service framework. However, our service framework lacked a way to enable tracing in all apps at once. To avoid duplicating this logic in more than 100 applications, we added a global config to the service framework and enabled tracing there. Then we updated our Java apps to use this global config from the updated service framework. The hardest part of enabling tracing in Java was making the change in hundreds of apps and getting the changes deployed across the organization.

Sampler

Even though our client instrumentation adds very little latency overhead, capturing and logging spans adds double-digit computation overhead. To reduce this overhead, as well as the cost of storing the generated traces, we only trace a small fraction of the overall requests.

A sampler component decides which requests should be traced and randomly selects between 0 and 1 percent (typically 0.3 percent) of all of our backend requests. This rate can be dynamically adjusted using the decider framework. We also sample all requests on development machines so developers can use traces as part of their day-to-day activities.
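Conceptually, the sampling decision looks something like the sketch below. The `get_sample_rate` function is a stand-in for a dynamic decider lookup, and the names are illustrative rather than Pintrace’s actual code:

```python
import random


def get_sample_rate():
    """Stand-in for a dynamic decider lookup; the real framework lets the
    rate be adjusted at runtime without a deploy."""
    return 0.003  # e.g. 0.3% of requests


def should_trace(is_dev_machine=False):
    # Trace every request on development machines so engineers always
    # have traces available while working locally.
    if is_dev_machine:
        return True
    # Otherwise sample a small, dynamically configurable fraction of traffic.
    return random.random() < get_sample_rate()
```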

Trace processing backend

The trace processing backend, as shown in the diagram above, is responsible for aggregating the spans from thousands of machines in our cluster, processing them into traces and storing and visualizing them.
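Below is a heavily simplified sketch of that shape using the older Spark Streaming Kafka integration: read JSON spans from Kafka, group them by trace ID into traces, and hand each trace to a storage step. The topic name, broker address, span JSON format and the storage function are placeholders, not the open-sourced Pintrace collector:

```python
import json

from pyspark import SparkContext
from pyspark.streaming import StreamingContext
from pyspark.streaming.kafka import KafkaUtils  # Spark 1.x/2.x Kafka integration


def store_trace(trace_id, spans):
    # In the real pipeline this step would index the assembled trace into
    # Elasticsearch so it can be searched and visualized; here we just print.
    print("trace %s with %d spans" % (trace_id, len(spans)))


sc = SparkContext(appName="pintrace-collector-sketch")
ssc = StreamingContext(sc, batchDuration=60)  # aggregate spans in 60s batches

# Placeholder topic and broker; each Kafka record value is a JSON-encoded span.
stream = KafkaUtils.createDirectStream(
    ssc, ["pintrace_spans"], {"metadata.broker.list": "kafka:9092"})

(stream.map(lambda kv: json.loads(kv[1]))           # parse the span payload
       .map(lambda span: (span["traceId"], [span]))  # key each span by trace ID
       .reduceByKey(lambda a, b: a + b)              # group spans into a trace
       .foreachRDD(lambda rdd: rdd.foreach(lambda t: store_trace(t[0], t[1]))))

ssc.start()
ssc.awaitTermination()
```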

Singer-Kafka pipeline
