To understand how your application is performing and make necessary improvements, it’s crucial to know how it behaves. The need to uncover the unknown is significant in the age of information, especially with the rapid growth of technology and our reliance on applications. That’s where observability comes in, and tools like OpenTelemetry, OpenTracing, and OpenCensus can help developers gain insights into how their applications and hosting environments are working.
Observability is the need of the hour because it lets developers ensure that things are working the way they should. Many apps and programs are developed and released in phases, relying on cloud-based technologies and communicating through multiple microservices. This makes observability necessary. Nobody wants such systems to become too difficult to fix. Everyone wants a simplified solution.
For those who need a refresher or are not familiar, observability is simply the ability to ascertain and monitor a system’s internal states through external outputs, such as its performance. With designs and requirements becoming more and more complex, system failures and bugs have started to multiply. Many tools have been created to keep these issues out of the released system, where end-users would have to deal with them.
As more applications are developed following a microservice architecture, the requirements at each step of development are becoming more complex and diverse. Each microservice is a distinct, single-function module, and understanding how these modules behave gives developers the data they need to keep working on them. In this structure, many more applications are communicating with each other than before, and the margin for error increases with the level of complexity. Observability saves software companies a lot of time and money because bugs and issues are dealt with at every step of the process. It provides information on how the system is going to work.
Observability is often treated as synonymous with monitoring, and in this day and age, cloud-native telemetry is increasingly becoming the favorite among developers. It helps them find a lead and locate a bug by tracing the system’s behavior and getting to the bottom of it.
Equipping developers with tools and libraries to collect and analyze data is essential to achieving observability. The primary means of achieving this is through distributed tracing, which provides a source of metrics and logs to examine individual requests within the program, allowing for faster problem-solving. Having a good distributed tracing tool saves businesses time and energy, allowing developers to focus on improving the systems they are developing. Projects such as OpenTelemetry, OpenTracing, and OpenCensus were created specifically for this purpose. In this article, we will be delving into the two-decade-long journey to OpenTelemetry, its need, and how it evolved from its predecessors OpenTracing and OpenCensus. This way, you will also know how helpful these systems can be for you, especially for DevOps.
OpenTelemetry: A Unified Open Specification
As the name suggests, OpenTelemetry collects telemetry data (metrics, traces, and logs). It is an ecosystem of instrumentation libraries and tools used to generate, collect, process, and export that data. It does so by looking at the data from a distributed system in order to troubleshoot, debug, and manage applications and the host environments they run in. This is helpful because it allows IT and developer teams to instrument their code base for data collection and to adjust and adapt as they grow. It also lets IT professionals analyze data using any language or platform they are comfortable with, so that they are not tied to specifics in the long run. At the moment, it supports only a handful of languages, but more updates are expected soon.
It is a vendor-neutral (vendor-agnostic) project that is part of the Cloud Native Computing Foundation (CNCF). It merged two different projects, OpenCensus and OpenTracing (which we will explain later), and it has been incubating in the CNCF Sandbox since May 2019. It is slowly releasing the standards in parts, as described below. Because the project is open-source, it takes in contributions from many developers. You can find it on GitHub, where developers can transparently follow what is going on and who is contributing to it.
OpenTelemetry consists of several tools that can come in handy for a developer, mainly for observability. The project is meant to be flexible and extensible to support a broad range of open-source, commercial, and end-user solutions. It was meant to bring together two open-source projects that served almost the same purpose, collecting traces and metrics, and to standardize the process.
The unification served multiple purposes and fostered collaboration within the developers’ community, instead of producing competing, vendor-controlled products that do the same thing.
This can provide you with the following:
- A single, vendor-neutral instrumentation library per language, with support for both automatic and manual instrumentation.
- A single collector binary that you can deploy in many ways, including but not limited to as an agent or as a gateway.
- An end-to-end implementation that can help you generate, emit, collect, process, and export telemetry data.
- Complete control of data, including being able to send data to many destinations simultaneously through configuration.
- Open standard semantic conventions to make sure that data collection is always vendor-neutral.
- Support for multiple context propagation formats in parallel, which assists with migration as standards evolve.
- The ability to add on more technology protocols and formats as the technology evolves and new ways of observing data arise.
It’s important to note that OpenTelemetry should not be mistaken for back-end providers such as Prometheus or Jaeger. Instead, it supports exporting data to open-source and commercial back ends, where developers can analyze the data as they require.
Why Was OpenTelemetry Created?
The goal of creating OpenTelemetry was to bring together a myriad of technologies to form a vendor-neutral observability platform. As mentioned earlier, it is a part of the Cloud Native Computing Foundation and came from a merger of the OpenTracing and OpenCensus projects.
With growing complexity due to rapid changes in technology, new challenges have pushed for further solutions. The sole purpose was to manage the complexity and diversity of data that would become available. OpenTelemetry then came forward to consolidate and unify the environment for developers.
The purpose of the unified set of libraries and specifications was to create a platform that would be a complete telemetry system. It was designed to be suitable for monitoring microservices and many other types of modern, distributed systems, and to be compatible with most OSS and commercial backends.
Previously, there were no accurate, standardized methods of describing what a system was doing. This happened mainly because different developers used different approaches, languages, and machines in different combinations. The burden of maintaining instrumentation also fell heavily on the shoulders of the user. There was a lack of data portability as well, which made it a challenge for observability tools to stay compatible across various environments and requirements. Thus, OpenTelemetry was needed to standardize the description of what distributed systems were doing, while keeping the flexibility to use different languages, hardware systems, and APIs.
OpenTelemetry resulted from two decades worth of effort put into creating standards for observability and vendor-neutral APIs. While there were projects available that helped observability for software developers, the problem was that they had to look at different options in the market before settling for one that suited them the most. Developers had to study the pros and cons and consider which one worked better for them.
In 2016, OpenTracing was incubated in the CNCF, where it focused on vendor-neutral APIs for the consumption of traces. In 2018, OpenCensus was created by Google, and it captured traces and metrics. The approaches (explained below) were more complementary than contradictory. In the same timeframe, the World Wide Web Consortium (W3C) worked on the Trace Context and Correlation Context header specifications, which were meant explicitly for efficient communication of trace IDs over HTTP. And these weren’t the only projects available at that time.
In March 2019, Ben Sigelman announced that the two projects, OpenTracing and OpenCensus, would be merging. This was because both had a common goal of open standards focused on interoperability and a vendor-neutral observability ecosystem. The vendor-neutral approach would empower developers because they would not have to be contractually bound to a vendor to use its tools and would keep the flexibility to switch.
By bringing together two distributed tracing libraries, CNCF and Google essentially killed the competition. While competition is usually good in the market because it fosters innovation, the same could not be said for OpenTracing and OpenCensus. Both were open-source and would benefit more from collaborating than from competing with each other. While OpenTracing took care of tracing and logs, OpenCensus provided additional context while doing the same thing, as you’ll see below. OpenTelemetry thus became a unified open specification that would support, empower, and strengthen developers and their creations.
Benefits of OpenTelemetry
OpenTelemetry, while still in its early stages, is becoming a hot topic for developers to follow. The benefits that OpenTelemetry will provide to developers are as follows:
Because of its open-sourced nature, it is easy for developers to change backends without the need to change instrumentation. Apart from that, developers can work with more vendors, platforms, and projects quickly because of the single set of standards. Being bound to a vendor due to contracts is no longer a concern, as OpenTelemetry frees you from this constraint. Vendor roadmap priorities and configuration will no longer lock you in, and you will no longer have to wait for vendor updates to instrument new technologies as they emerge in the market.
OpenTelemetry saves you the time of choosing between projects and standards, including the choice between OpenTracing and OpenCensus. It covers both, so you no longer have to compare, try out, and read reviews about different platforms. This saves you time and effort so that you can focus on building reliable and fantastic software. It also simplifies telemetry data management, and you can easily export the data to a form that you can analyze quickly.
As many developers will follow a single standard, vendors will also move towards OpenTelemetry due to its flexibility. OpenTelemetry’s focus on high-quality, streamlined telemetry data makes it easy to accommodate and test a single standard.
Cross-platform and languages
OpenTelemetry already supports various languages and backends and is building up to accommodate more in the future. Being able to accommodate multiple platforms and languages provides ease for developers to capture and transmit telemetry to backends without changing existing instrumentations. Even more remarkable is the fact that OpenTelemetry’s installation and integration are as simple as adding a few lines of code into the system.
More control over your data
OpenTelemetry helps ease the burden of data collection from various sources and technologies. This aspect provides you the observability and the monitoring capabilities to focus more on analyzing the data and to better understand your applications. You no longer have to go through tedious methods of collecting data as everything will be streamlined according to your customizations. This way, you will ensure that the program you deliver can enhance user experiences and improve business outcomes.
This is a bonus benefit if you have already been using OpenCensus or OpenTracing. OpenTelemetry supports the use of its predecessors so that you can seamlessly migrate the systems.
When Did OpenTelemetry Become Available?
OpenTelemetry is in the beta stage across several languages. It has been in incubation in the Cloud Native Sandbox since May 2019.
At the moment, it has broad language support for :
It can integrate with frameworks such as:
OpenTelemetry went to the beta stage in March 2020. Currently, the OpenTelemetry Tracing Specification has reached 1.0. Metrics are expected to achieve the same status within the second half of 2021, and logs are expected to receive specifications by 2022.
Components of OpenTelemetry
OpenTelemetry consists of many components, some of which include:
- APIs: The Application Programming Interface (API) is the core component of OpenTelemetry and one of the sources of telemetry data. It is language-specific and is used to instrument code to create traces, either through code changes or through auto-instrumentation agents.
- SDKs: The Software Development Kit (SDK) is what developers use to implement the API. It helps process and export data, supports configuration, and helps with transaction sampling and request filtering as well. You can imagine it as a bridge that delivers the data gathered through the API to the exporter.
- Exporters: Developers can configure where they wish to send their telemetry data through exporters. Exporters have the capability to translate data into customized formats as required, and the processed data can then be sent to the backend.
- Collector: This is an optional part of OpenTelemetry that allows you to build a seamless telemetry solution. It can be used for filtering data, batching, aggregation, and communication with the backends. It can run either as an agent residing with the host application or as a standalone process. The collector comes in two versions: Core, which is foundational, and Contrib, which includes all the available components along with all the optional and experimental ones.
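To make the division of labor between these components more concrete, here is a minimal, stdlib-only Python sketch of the flow from API to SDK to exporters. It deliberately does not use the OpenTelemetry packages: `Record`, `BatchProcessor`, and `Tracer` are hypothetical names invented for this illustration, and the two plain lists stand in for real backends.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Record:
    """One piece of telemetry produced by application code (hypothetical)."""
    name: str
    attributes: Dict[str, str] = field(default_factory=dict)


# In this sketch an "exporter" is any callable that accepts a batch of records.
Exporter = Callable[[List[Record]], None]


class BatchProcessor:
    """SDK-side stage: buffers records and hands full batches to every exporter."""

    def __init__(self, exporters: List[Exporter], batch_size: int = 2) -> None:
        self.exporters = exporters
        self.batch_size = batch_size
        self._buffer: List[Record] = []

    def on_record(self, record: Record) -> None:
        self._buffer.append(record)
        if len(self._buffer) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        if not self._buffer:
            return
        batch, self._buffer = self._buffer, []
        for export in self.exporters:
            export(list(batch))  # every destination receives its own copy


class Tracer:
    """API-side stage: the only object the application code talks to."""

    def __init__(self, processor: BatchProcessor) -> None:
        self._processor = processor

    def record(self, name: str, **attributes: str) -> None:
        self._processor.on_record(Record(name, dict(attributes)))


# Two stand-in "backends"; real exporters would translate and send the data.
console_sink: List[Record] = []
backend_sink: List[Record] = []

processor = BatchProcessor([console_sink.extend, backend_sink.extend], batch_size=2)
tracer = Tracer(processor)

tracer.record("GET /cart", user="alice")
tracer.record("GET /checkout", user="alice")  # second record fills the batch
tracer.record("POST /pay", user="alice")      # stays buffered until flush()
processor.flush()
```

The point of the split is that application code only ever calls `tracer.record(...)`. Batching, and the choice of one or several destinations, can change in the processor and exporter layers without touching the instrumentation.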
Key Terms for OpenTelemetry
Telemetry data is the output that is required to understand the system. Observability requires the telemetry data for developers to study how their programs are working.
Commonly used terms in OpenTelemetry include:
A metric is a piece of quantifiable data that determines a component’s behavior over time. Metrics have attributes that can give you information about Service Level Agreements (SLAs), Service Level Objectives (SLOs), and Service Level Indicators (SLIs). You can use metrics to have a holistic view of the health of the system and its performance. Metrics are usually raw measurements about the service and are captured while the application is running. In OpenTelemetry, there are three metric instruments: observer, counter, and measure.
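As a rough illustration of a counter-style instrument, the following stdlib-only Python sketch keeps one running total per attribute set and derives a simple error ratio from it, the kind of figure you might track as an SLI. The `Counter` class and the metric and attribute names are assumptions made for this example; they are not the OpenTelemetry metrics API.

```python
from collections import defaultdict
from typing import Dict, Tuple

AttributeKey = Tuple[Tuple[str, str], ...]


class Counter:
    """Toy monotonic counter with one running total per attribute set."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._values: Dict[AttributeKey, int] = defaultdict(int)

    @staticmethod
    def _key(attributes: Dict[str, str]) -> AttributeKey:
        # Sort the attributes so the same set always maps to the same key.
        return tuple(sorted(attributes.items()))

    def add(self, amount: int = 1, **attributes: str) -> None:
        if amount < 0:
            raise ValueError("a counter can only increase")
        self._values[self._key(attributes)] += amount

    def value(self, **attributes: str) -> int:
        return self._values.get(self._key(attributes), 0)


requests = Counter("http.requests")
requests.add(route="/cart", status="200")
requests.add(route="/cart", status="200")
requests.add(route="/cart", status="500")

ok = requests.value(route="/cart", status="200")
failed = requests.value(route="/cart", status="500")
error_ratio = failed / (ok + failed)  # one failed request out of three
```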
Traces are a representation of the end-to-end journey that a request makes through a system. A trace tracks the request as it moves through the entire system, from the moment it is made until it is delivered. This way, you can identify at what stage of operation the request ran into an issue. Traces provide you with the context that you need for troubleshooting by keeping track of an activity from beginning to end.
A log is a record of what is happening within the application. It helps you understand what your application is doing. Logs are lines of text that can be structured, unstructured, or plain text. Logging provides details about when a piece of code had an issue, which makes the issue easier to find and therefore easier to fix.
Spans are named, timed operations within a trace, and nested spans form a trace tree. Each trace has a root span, which you can use to explain the end-to-end latency of a request, along with its sub-spans.
Each span contains a context. This is a unique identifier that represents the request that the span is a part of. It shows the data that is moving throughout the environment. It can support correlation context. This essentially helps carry user-defined properties, if required.
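The relationship between a trace, its root span, and its sub-spans can be sketched with a small stdlib-only Python model. This is an illustration under assumptions, not the OpenTelemetry data model: the `Span` class, the millisecond timestamps, and the operation names are invented here. Every span shares the trace ID of the request and records the ID of its parent, which is what lets the tree be rebuilt later.

```python
import secrets
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Span:
    """A named, timed operation that belongs to exactly one trace (toy model)."""
    name: str
    start_ms: int
    end_ms: int
    trace_id: str                      # shared by every span of the request
    parent_id: Optional[str] = None    # None marks the root span
    span_id: str = field(default_factory=lambda: secrets.token_hex(8))
    children: List["Span"] = field(default_factory=list)

    @property
    def duration_ms(self) -> int:
        return self.end_ms - self.start_ms

    def child(self, name: str, start_ms: int, end_ms: int) -> "Span":
        # A sub-span inherits the trace ID and points back at its parent.
        span = Span(name, start_ms, end_ms, self.trace_id, parent_id=self.span_id)
        self.children.append(span)
        return span


def render(span: Span, depth: int = 0) -> List[str]:
    """Flatten the trace tree into indented lines, one per span."""
    lines = [f"{'  ' * depth}{span.name}: {span.duration_ms} ms"]
    for sub_span in span.children:
        lines.extend(render(sub_span, depth + 1))
    return lines


root = Span("GET /checkout", 0, 120, trace_id=secrets.token_hex(16))
auth = root.child("auth", 5, 25)
payment = root.child("charge card", 30, 110)
db = payment.child("INSERT order", 90, 105)

print("\n".join(render(root)))
```

Reading the rendered tree from the top down shows where the end-to-end latency of the root span is actually spent, here mostly in the hypothetical "charge card" step.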
OpenTelemetry supports context propagation to bundle and communicate context between services using multiple protocols, which helps avoid issues. Context propagation is a crucial component of OpenTelemetry, and it has applications beyond tracing. Typically, HTTP headers are used to accomplish context propagation.
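One widely used header format for this is the W3C Trace Context `traceparent` header, which packs a version, a 32-character trace ID, a 16-character parent span ID, and a flags byte into a single value. The stdlib-only Python sketch below shows the idea of injecting a context into outgoing headers and extracting it on the receiving side. It is a simplification: it only accepts version `00`, ignores the companion `tracestate` header, and the `SpanContext`, `inject`, `extract`, and `start_child` names are hypothetical helpers rather than a real library API.

```python
import re
import secrets
from typing import Dict, NamedTuple, Optional

# version "00", 32 hex chars of trace ID, 16 hex chars of span ID, 2 hex chars of flags
TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


class SpanContext(NamedTuple):
    trace_id: str
    span_id: str
    sampled: bool


def inject(ctx: SpanContext, headers: Dict[str, str]) -> None:
    """Caller side: write the context into the outgoing HTTP headers."""
    flags = "01" if ctx.sampled else "00"
    headers["traceparent"] = f"00-{ctx.trace_id}-{ctx.span_id}-{flags}"


def extract(headers: Dict[str, str]) -> Optional[SpanContext]:
    """Callee side: read the context back, or None if it is missing or invalid."""
    match = TRACEPARENT.match(headers.get("traceparent", ""))
    if match is None:
        return None
    trace_id, span_id, flags = match.groups()
    if set(trace_id) == {"0"} or set(span_id) == {"0"}:
        return None  # all-zero IDs are not valid
    return SpanContext(trace_id, span_id, bool(int(flags, 16) & 0x01))


def start_child(parent: SpanContext) -> SpanContext:
    """Continue the same trace in the receiving service with a fresh span ID."""
    return SpanContext(parent.trace_id, secrets.token_hex(8), parent.sampled)


caller = SpanContext("4bf92f3577b34da6a3ce929d0e0e4736", "00f067aa0ba902b7", True)
outgoing: Dict[str, str] = {}
inject(caller, outgoing)

received = extract(outgoing)
callee = start_child(received) if received is not None else None
```

Because the trace ID survives the hop while the span ID changes, spans created in both services can later be stitched into one trace.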
What is OpenTracing?
OpenTracing was developed as a vendor-agnostic API to assist with the instrumentation of tracing in code. It became a CNCF project in 2016, backed by the goal of having a vendor-neutral specification for distributed tracing and giving developers the ability to trace a request from start to finish while instrumenting their code.
It is a set of standard APIs that consistently model and explain the behavior of your distributed systems. OpenTracing relies on three constituencies:
- Tracing Tool Maintainers
- Software developers who are responsible for building and deploying applications
- Software developers who are contributing to widely used software.
It supports developers by creating a standard, vendor-neutral framework for instrumentation through its API. Developers could try out various distributed tracing systems without the tedium of repeating the entire instrumentation process from scratch for each new system. The purpose of this API is to incorporate distributed tracing at the service level and the application level so that developers can track requests across the services that make up the application.
OpenTracing’s specifications for span management can be used for any of the supported platforms:
OpenTracing is also compatible with the following tracers:
- CNCF Jaeger
- Elastic APM
- Apache SkyWalking
- Wavefront by VMWare
The OpenTracing API is pretty straightforward. It is a standardization layer that lies between application and library code on one side and the myriad of systems that consume tracing and causality data on the other. Users of the standardization brought forth by OpenTracing could benefit from a consistent, unified, and tracer-neutral instrumentation API that could support a wide range of frameworks, programming languages, and platforms. It was able to:
- Provide an infrastructure overview that was out of the box and show what the interactions between different services were like and what they depended on.
- Provide information on the efficiency and detection of any latency issues.
- Provide smart error reporting, with spans carrying error messages and stack traces. This is valuable insight for finding the root cause of issues and system failures.
- Provide trace data that can be sent to other log processing platforms for analysis.
OpenTracing also utilizes distributed context propagation, which consists of the causal chain and breaks down the transaction from its starting point to its end.
Developers can trace what happened from the moment they initiated the request to its completion, or when an error occurred, using this.
What is OpenCensus?
Google created the open-source project OpenCensus in 2018, with its internal Census tool later becoming an open standard that included implementations for API metrics and traces. Once integrated with application code, OpenCensus can emit traces and metrics for a better understanding of the program and how it is behaving, allowing you to debug easily.
It helps you collect metrics and distributed traces, and it allows developers to capture, manipulate, and export them to their choice of backend. OpenCensus provides the core function of collecting traces and metrics from applications, displaying them, and sending them to a tool for data analysis, often referred to as a backend.
Once their code is instrumented with OpenCensus, developers arm themselves with tools to help them optimize the speed of their services, learn the exact way a request travels through the services, and gain metrics about the entire architecture. It uses context propagation, distributed trace collection, time-series metrics collection, APIs, and a myriad of integrations to support developers and their software with a lot of backend support.
There are many benefits of OpenCensus for the ecosystem:
For one, it aims to make application metrics and distributed traces accessible and available for developers in a more effortless manner than before. It provides a standard for good automatic instrumentation that helps developers know how well their code is performing.
APM vendors will have to deal with fewer issues caused by setup friction. It makes it easy for their customers to switch when needed without compatibility problems or the need for upgrades and changes. Having broader language support means more ease of integration.
It provides local debugging capabilities. This way, developers can look at the metrics and requests on their own and customize sampling rates for traces.
OpenCensus aims to increase collaboration and support between vendors and open source-based providers, giving more power to the developers to be flexible with their software environment design.
It helps service providers and developers solve customer issues better and faster.
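The idea of customizing sampling rates for traces, mentioned among the benefits above, can be sketched in a few lines of stdlib-only Python. One common approach, assumed here and not taken from OpenCensus itself, is to derive the decision from the trace ID so that every service makes the same choice for the same request. The `should_sample` helper and the use of the last 16 hex characters of the ID are assumptions for this illustration.

```python
def should_sample(trace_id: str, ratio: float) -> bool:
    """Decide deterministically whether to keep a trace, given a target ratio.

    The last 16 hex characters of the trace ID are read as a 64-bit integer
    and compared with a threshold proportional to the ratio.
    """
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0.0 and 1.0")
    threshold = int(ratio * (1 << 64))
    return int(trace_id[-16:], 16) < threshold


trace_id = "4bf92f3577b34da6a3ce929d0e0e4736"

keep_everything = should_sample(trace_id, 1.0)   # ratio 1.0 keeps every trace
keep_nothing = should_sample(trace_id, 0.0)      # ratio 0.0 drops every trace
keep_at_half = should_sample(trace_id, 0.5)
keep_at_three_quarters = should_sample(trace_id, 0.75)
```

Because the decision depends only on the trace ID and the ratio, asking again for the same trace gives the same answer, which avoids traces that are recorded in one service and dropped in the next.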
OpenCensus can support the following languages:
The platform provides observability capabilities for the following:
- Google Cloud
- Go kit
It has the following backend support as well:
- Azure Monitor
- AWS X-Ray
- Google Cloud
- New Relic
Over to You
As you can see, OpenTelemetry took OpenTracing’s tracing and distributed context propagation and OpenCensus’ time-series metrics, bringing two ambitious projects together in a collaborative, open-source library for understanding telemetry data. By working towards a unified initiative that benefits the development community, the leadership of OpenCensus and OpenTracing actively demonstrates its commitment to fostering collaboration and advancing development in the industry.
The point of the merger is to provide straightforward backward compatibility with the legacy projects using software bridges so that the transition is easy and smooth. While OpenTracing and OpenCensus will soon be in read-only mode, OpenTelemetry will take over as the specifications become more readily available. Remember, as of writing this article, it is available in some languages, the tracing specification has reached 1.0, and logs and metrics will follow suit. The primary benefit to the community and the ecosystem is the consolidation of specifications and their standardization.
By now, you have already understood how important it is to know how your application works. Several tools are available to help you, some through a vendor and some through open-source options. In a world where technological advances are demanding more complex and diverse solutions, it is only fair to assume that there is a need for a standard that can help make things easy. In the world of microservice architecture and the move towards cloud-based technologies in software development, OpenTelemetry provides the solution.
Rising from the merger between Google’s OpenCensus and CNCF’s OpenTracing, two APIs with the same purpose of increased observability and distributed tracing, OpenTelemetry will provide an easy way to collect and analyze data so that you are always one step ahead in your software development journey.
Moreover, as an open-source platform, developers can easily contribute to OpenTelemetry and be a part of its growth and development.
The CNCF welcomes and urges support and suggestions from the community. To keep yourself informed with the latest updates and contribute to the OpenTelemetry project, you can actively follow the project on GitHub.