How many APIs does your organization rely on? A 2020 study by SlashData found that 89% of developers use APIs, and the vast majority are using third-party APIs. These numbers aren’t unique to specific markets either. Regardless of whether you are a software-first company or offer a more tangible service, APIs are a vital part of modern infrastructure. Relying on APIs saves time and resources, and allows businesses to experiment in ways that would otherwise be hard to justify.
Back to our original question. How many APIs is your organization relying on? This seems like an easy question to answer, but unless you have a deliberate management plan in place there is a high likelihood that you aren’t counting them all. You aren’t alone. Many companies lack information about their own dependencies. The APIs that fall through the cracks—the ones that aren’t visible and accounted for—are what we call shadow APIs.
What are shadow APIs
Shadow APIs are the third-party APIs and services that your company uses, but doesn’t track. You may not even know they exist, or you may not be aware of their less-than-redeeming qualities. When thinking about an API in your stack it can be useful to ask yourself the following questions:
- Where does it live? Is it used in a single service, or across all parts of your business?
- Who is the business owner? Is each developer responsible for the APIs they use, or is there a centralized manager? You can extend this question to include purchasing approval.
- Is the API compliant with the requirements of your business? Will their practices put you at risk?
- What data are you sending the API? Is it receiving personally identifiable information (PII) about your users?
These questions tell us a lot about the power third-party APIs have over our applications and the dangers they present. They describe APIs that could be:
- Leaking user data without your consent or knowledge.
- Performing poorly with sporadic downtimes or unpredictable response times.
- Lacking the required security and privacy standards to keep your business in compliance.
- Causing unpredictable costs and runaway budgets that managers have stopped trying to keep a handle on.
The biggest problem with shadow APIs is the unknown. They are a black box that could fail at any time, fall short of your company’s compliance standards, and even put your users’ data at risk, all without your knowledge.
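One way to make the four questions above concrete is to treat them as fields on a catalog record, so that an unanswered question shows up as a visible gap. The sketch below is only an illustration: the `ApiRecord` class, its field names, and the sample API name are all hypothetical, not part of any particular tool.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class ApiRecord:
    """One tracked third-party API. All field names are illustrative."""
    name: str
    used_by: List[str] = field(default_factory=list)        # where does it live?
    business_owner: Optional[str] = None                    # who owns it?
    certifications: Set[str] = field(default_factory=set)   # is it compliant?
    receives_pii: Optional[bool] = None                     # None means nobody has checked

    def open_questions(self) -> List[str]:
        """Return the questions that are still unanswered for this API."""
        gaps = []
        if not self.used_by:
            gaps.append("where it is used")
        if self.business_owner is None:
            gaps.append("business owner")
        if not self.certifications:
            gaps.append("compliance status")
        if self.receives_pii is None:
            gaps.append("data sent")
        return gaps


record = ApiRecord(name="payments-api", used_by=["checkout-service"])
print(record.open_questions())
```

An API with a long list of open questions is a good candidate for the "shadow" label, even if someone on the team knows it exists.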
How to detect shadow APIs
The first step to avoiding shadow APIs is to discover the ones you currently have. There are a variety of approaches to doing so, and we’ve compiled the most common below.
Outbound proxies and API marketplaces
One solution is an outbound proxy. A handful of services offer some form of this. API proxies intercept all outgoing API calls and route them through their own service. Along the way, they catalog the APIs and log all the requests and responses. API marketplaces are similar, except that developers call each API through the marketplace’s own client, so the marketplace acts as a direct middleman.
The benefit of both solutions is that they inherently create a catalog of all API usage. This makes it easier to understand which apps consume which APIs and how they are performing. The downside is that developers need to explicitly use the proxy throughout their codebase. This can be automated to some extent, but it still requires direct changes to the code. In addition, all traffic needs to pass through the proxy, which can degrade performance. As with any additional dependency, the proxy itself becomes another point of failure for each request. This hit to both performance and reliability isn’t an appealing trade-off for detecting unknown shadow APIs.
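To show why this approach builds a catalog "for free" but only for code that opts in, here is a minimal sketch of the idea as an in-process client wrapper. It is not how any specific proxy product works. The `CatalogingClient` class and its `send` parameter are hypothetical names, and a real proxy would sit on the network rather than inside the application.

```python
from collections import Counter
from typing import Callable, Optional
from urllib.parse import urlsplit
from urllib.request import urlopen


class CatalogingClient:
    """Routes every GET through one place and counts calls per host."""

    def __init__(self, send: Optional[Callable[[str], bytes]] = None):
        # `send` is injectable so the catalog logic can be exercised offline.
        self._send = send or self._default_send
        self.calls_by_host = Counter()

    @staticmethod
    def _default_send(url: str) -> bytes:
        with urlopen(url, timeout=10) as response:
            return response.read()

    def get(self, url: str) -> bytes:
        host = urlsplit(url).hostname or "unknown"
        self.calls_by_host[host] += 1  # the catalog grows as a side effect
        return self._send(url)


# Offline demonstration with a stand-in transport.
client = CatalogingClient(send=lambda url: b"ok")
client.get("https://api.example.com/v1/users")
client.get("https://api.example.com/v1/orders")
client.get("https://maps.example.org/geocode")
print(dict(client.calls_by_host))
```

Any request that bypasses `CatalogingClient` never reaches the catalog, which is the adoption problem described above.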
Log analysis
Most companies already have a logging solution in place. This makes log analysis an appealing area to investigate. Tools in this category work by watching an application’s logs, either in real time or at set intervals, in order to detect APIs and problems. One great benefit of log analysis is that, if configured correctly, it can act as a hands-off and completely holistic approach to API monitoring. Once set up, it can detect new endpoints and APIs, surface response data, and more.
The downside is that, in order to be effective, logs of every API call, including the full payload, need to be stored. This increases storage costs as well as the processing costs of parsing and consuming those logs. It also opens you up to the risk of storing sensitive customer information. If incorrectly configured, a log analysis tool can miss valuable information or provide too much of it. Many logging solutions are centered around app-level logging, not HTTP request logging, which makes identifying important changes harder.
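As a rough illustration of log analysis, the sketch below pulls URLs out of log lines and reports hosts that are not in a known list. The log format, the host names, and the function names are assumptions made for the example. Real logs vary widely, and as noted above, many of them never record outbound HTTP requests at all.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Matches anything that looks like an http(s) URL up to the next space or quote.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")


def hosts_in_logs(lines):
    """Count how often each host appears in the given log lines."""
    hosts = Counter()
    for line in lines:
        for url in URL_PATTERN.findall(line):
            try:
                host = urlsplit(url).hostname
            except ValueError:  # malformed URL, skip it
                continue
            if host:
                hosts[host] += 1
    return hosts


def untracked_hosts(lines, known_hosts):
    """Return hosts seen in the logs that are missing from the known list."""
    return {h: n for h, n in hosts_in_logs(lines).items() if h not in known_hosts}


sample_logs = [
    "2021-03-01 12:00:01 INFO GET https://api.payments.example/v1/charges 200 132ms",
    "2021-03-01 12:00:02 INFO POST https://geo.example.net/lookup 200 48ms",
    "2021-03-01 12:00:03 INFO cache warmed",
    "2021-03-01 12:00:04 INFO GET https://geo.example.net/lookup 200 51ms",
]
print(untracked_hosts(sample_logs, {"api.payments.example"}))
```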
Live monitoring
Live monitoring inspects all API requests as they happen. This makes it a very appealing option, as it can detect newly added APIs almost immediately. It can detect problems and catch performance anomalies without the need to manually run performance tests. It even has the potential to remediate problems, provided the implementation is directly tied to how the code executes. Live monitoring can be done directly in an API gateway, or as an in-app agent. Integrating with an API gateway is often preferred for this style of monitoring, as it simplifies the setup and isolates any security concerns that may come with duplicating the monitoring functionality throughout your applications.
The downside of live monitoring is that it can have a performance impact. Unlike proxy services, which add latency to each request, live monitoring affects the application itself, because it needs to either process the captured information locally or send it off to a service to handle it. In either case, some impact will be felt. Another pitfall is that it needs to be configured across every application and service in an organization. This means that if the shadow APIs you’re trying to identify exist in a legacy application that no longer receives updates, you’ll never find them.
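The in-app agent variant of live monitoring can be sketched as a small wrapper that observes each outbound call as it happens. This is a simplified illustration rather than a description of any product. The `ApiMonitor` class, its one-second threshold, and the injectable `clock` are assumptions made for the example.

```python
import time
from urllib.parse import urlsplit


class ApiMonitor:
    """Flags calls to hosts it has not seen before and calls that run slowly."""

    def __init__(self, known_hosts, slow_after_seconds=1.0, clock=time.perf_counter):
        self.known_hosts = set(known_hosts)
        self.slow_after_seconds = slow_after_seconds
        self._clock = clock  # injectable so timing can be simulated
        self.alerts = []

    def watch(self, url, call):
        """Run `call()` on behalf of the app while recording what happened."""
        host = urlsplit(url).hostname or "unknown"
        if host not in self.known_hosts:
            self.alerts.append(f"new API host: {host}")
            self.known_hosts.add(host)
        started = self._clock()
        try:
            return call()
        finally:
            elapsed = self._clock() - started
            if elapsed > self.slow_after_seconds:
                self.alerts.append(f"slow call to {host}: {elapsed:.2f}s")


monitor = ApiMonitor(known_hosts={"api.example.com"})
monitor.watch("https://new-vendor.example.org/v1/score", lambda: "response")
print(monitor.alerts)
```

The bookkeeping inside `watch` runs on every request, which is where the performance cost mentioned above comes from.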
Code scanning
Code scanning, or static code analysis, involves scanning the source code of your applications to identify the use of APIs. It is non-invasive since it doesn’t run live in your application, and can catch APIs before they hit production. It is also holistic. Connect it to your source control system, and you’re all set. No need to manually integrate it with every application. This is also an excellent way to build an accurate, automated data flow map. In addition to helping with GDPR compliance and easing the creation of your record of processing activities (RoPA) reports, automated data flow maps will help identify the APIs your organization is using.
The downside is that it is harder to build. Code scanning tools need to support all languages in your stack, understand their HTTP request structure and dependency systems, and identify the difference between internal and external APIs. Any gap in that support can result in an incomplete system that misses important APIs. Static analysis also makes it harder to detect code that causes API issues, and because it doesn’t run in production, there is no way to detect performance-related problems with the API or to automatically performance test web services.
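A deliberately naive version of code scanning can be sketched in a few lines: look for URL string literals in source files and discard the hosts that belong to you. Everything here is an assumption made for illustration, including the function names, the file extensions, and the idea of marking internal hosts by domain suffix. Real scanners parse each language properly instead of relying on a regular expression.

```python
import re
from pathlib import Path
from urllib.parse import urlsplit

# A quoted string literal that starts with http:// or https://
URL_LITERAL = re.compile(r"""["'](https?://[^"'\s]+)["']""")
PLAIN_HOSTNAME = re.compile(r"^[a-z0-9.-]+$")


def external_hosts_in_source(source, internal_suffixes):
    """Return external API hosts referenced by URL literals in `source`."""
    hosts = set()
    for url in URL_LITERAL.findall(source):
        try:
            host = urlsplit(url).hostname
        except ValueError:
            continue
        # URLs assembled at runtime (for example with templates) are skipped.
        if not host or not PLAIN_HOSTNAME.match(host):
            continue
        if not host.endswith(tuple(internal_suffixes)):
            hosts.add(host)
    return hosts


def scan_repository(root, internal_suffixes, extensions=(".py", ".js", ".ts")):
    """Map each external host to the files under `root` that mention it."""
    found = {}
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in extensions:
            text = path.read_text(encoding="utf-8", errors="ignore")
            for host in external_hosts_in_source(text, internal_suffixes):
                found.setdefault(host, []).append(str(path))
    return found


snippet = '''
requests.get("https://api.weather.example/v2/forecast")
fetch('https://billing.internal.example.com/invoices')
'''
print(external_hosts_in_source(snippet, [".internal.example.com"]))
```

The sketch also shows the weaknesses described above: it only understands one way of writing a URL, and it cannot see hosts that are read from configuration or built at runtime.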
Avoiding unexpected API-creep
Once existing shadow APIs have been detected, the next step is to shift focus toward an ongoing prevention and detection strategy. To start, set up a system of governance.
Governance is a term most often applied to first-party (internal) APIs, but it can be just as beneficial for third-party APIs and web services. We’ve written about some benefits of applying governance to third-party APIs in the past, but they are worth repeating.
Governance allows you to create a system of procedures and a set of standards that all APIs need to meet before they can be used. These standards may include criteria such as minimum acceptable privacy and compliance certifications. This could mean an API is required to be GDPR, SOC, and ISO 27001 compliant in order to be approved. The exact requirements differ from industry to industry, but most user-privacy standards are now expected across all markets. You can even include requirements for the kind of service level agreement (SLA) an API offers.
Governance should also include an approval process and management system. Do you know the business and technical owner of each API in your stack? Governance can solve that. When a new API is proposed for use, it goes through the requirements check, a business case is made, and it is cataloged. This allows the rest of your organization to easily confirm if an API is already in use and permitted. You may be surprised to learn that it is not uncommon for multiple teams in an organization to use the same API, with different accounts.
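The requirements check and catalog lookup described above can be expressed as a small review function. This is a sketch under stated assumptions: the `ApiProposal` fields, the 99.9% uptime threshold, and the sample team names are invented for the example, and your own standards will differ.

```python
from dataclasses import dataclass

# Example standards taken from the certifications mentioned above.
REQUIRED_CERTIFICATIONS = frozenset({"GDPR", "SOC", "ISO 27001"})
MINIMUM_UPTIME_SLA = 99.9  # percent; an assumed threshold


@dataclass
class ApiProposal:
    name: str
    business_owner: str
    technical_owner: str
    certifications: frozenset = frozenset()
    uptime_sla: float = 0.0


def review(proposal, catalog):
    """Return the reasons to push back on a proposal; an empty list means approve.

    `catalog` maps already-approved API names to the team that owns them.
    """
    problems = []
    if proposal.name in catalog:
        problems.append(f"already cataloged, owned by {catalog[proposal.name]}")
    missing = REQUIRED_CERTIFICATIONS - proposal.certifications
    if missing:
        problems.append("missing certifications: " + ", ".join(sorted(missing)))
    if proposal.uptime_sla < MINIMUM_UPTIME_SLA:
        problems.append(f"SLA {proposal.uptime_sla}% is below {MINIMUM_UPTIME_SLA}%")
    return problems


catalog = {"maps-api": "logistics team"}
proposal = ApiProposal("maps-api", "sales", "web team", frozenset({"GDPR"}), 99.0)
for problem in review(proposal, catalog):
    print(problem)
```

The first check is the one that catches two teams signing up for the same API under different accounts.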
Actively monitor for new APIs
In order to keep the governance process accountable, implementing a monitoring and alerting system is crucial. Many of the approaches for detecting shadow APIs can apply here. We believe live monitoring and code analysis systems are the best approach for consistently detecting shadow APIs. To keep the process even more automatic, code analysis can be set to run as part of your existing continuous integration (CI) suite or testing platform.
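Running the check in CI can be as simple as comparing the hosts a detection step found against the approved catalog and failing the build on any difference. The sketch below assumes both lists are already available as plain collections of host names. The function names are hypothetical, and how the lists are produced depends on your tooling.

```python
import sys


def unapproved_hosts(detected, approved):
    """Return detected hosts that are missing from the approved catalog."""
    return sorted(set(detected) - set(approved))


def ci_exit_code(detected, approved):
    """Report unapproved hosts on stderr and return a process exit code."""
    offenders = unapproved_hosts(detected, approved)
    for host in offenders:
        print(f"unapproved third-party API: {host}", file=sys.stderr)
    return 1 if offenders else 0


detected = ["api.payments.example", "geo.example.net"]
approved = ["api.payments.example"]
print(ci_exit_code(detected, approved))
```

A CI step would pass the returned value to `sys.exit`, so that a non-zero code blocks the merge until the new API has gone through approval.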
Set up regular audits
The goal is to avoid errors in reporting. If the detection methods are detecting new APIs, but nobody is listening, then they aren’t providing value. Create a schedule of regular reporting and auditing. This can be automated, perhaps as part of the tooling mentioned above, or handled through manual compliance audits each year or quarter. These audits can be used to bring all the stakeholders up to speed on the dependencies within your organization. This is also a great time to discuss which APIs are no longer necessary for your business.
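An automated audit report can also feed the conversation about which APIs are no longer needed. As one possible sketch, assuming your monitoring records the date each API was last called, the function below lists the ones that have gone quiet. The 90-day window and the sample data are arbitrary choices for the example.

```python
from datetime import date, timedelta


def stale_apis(last_seen, today, stale_after_days=90):
    """Return names of APIs whose last recorded call is older than the window."""
    cutoff = today - timedelta(days=stale_after_days)
    return sorted(name for name, seen in last_seen.items() if seen < cutoff)


last_seen = {
    "sms-api": date(2021, 1, 5),
    "maps-api": date(2021, 5, 20),
}
print(stale_apis(last_seen, today=date(2021, 6, 1)))
```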
More visibility, fewer shadow APIs
The most important takeaway is this: make avoiding shadow APIs easy. The goal should be to increase visibility into your API reliance across the entire development and product organization. Create just enough friction for developers adding a new API that nothing unexpected ships, but not so much that teams get bogged down in procedures and paperwork.
APIs affect stakeholders across development, ops, marketing, product, and even HR. Bring them all together in one place. Products like Bearer can help align both technical and non-technical stakeholders so that everyone has better insight into the state of API usage within your organization.