The Rising Issue of Zombie APIs and Your Increased Attack Surface

1 Jun 2024

Offering an API to customers increases your revenue, but it also expands your attack surface. Businesses can offer an API that can be embedded into third-party applications to make development easier. For example, embedding social media into an application lets customers discuss a product without adding extensive overhead to your development team. The social media company gains traffic and visibility, and the customer gains ease of development while adding features to their application.

Although an API is good for marketing and revenue, every new endpoint is another entry point for attackers. That risk can be managed, but only if all endpoints are strictly monitored and secured. Most administrators agree that APIs must be monitored, but a fast-paced business with frequent updates and deployments can lose track of its APIs and unknowingly take on a cybersecurity risk called “zombie APIs.”

What is a Zombie API?

A zombie API is, in basic terms, a forgotten and overlooked piece of infrastructure that remains available for use even though the organization is unaware of its existence. Zombie APIs can appear in small or large environments, but they are most common where IT resources are built without strict provisioning and documentation procedures. Change control helps avoid zombie API situations, but emergency deployments or configuration changes made to fix a critical error can bypass it.

In an automated environment, cloud resources are often deployed along with application code. The benefit is that developers and operations people no longer need to remember to deploy hardware and configure it manually. Automation in software deployment reduces incidents caused by misconfiguration and avoids issues where developers forget to request the resources their applications need. The tradeoff is that automatically provisioned resources are easy to overlook once the code that created them is no longer in use.

In some environments, developers are given access to their own test servers. These servers might be accessible on the public internet so that developers can test new code. An API test server might be reachable from the public internet, and developers might assume that it won’t be found because its address was never published.

Zombie APIs can be created in numerous ways, each with its own edge cases, even under the strictest change control procedures. Whether they result from mistakes or poor guidance, zombie APIs are a form of shadow IT that can be especially dangerous to data protection. Without monitoring, an attacker could retrieve data for months with no access restrictions or rate limiting. Any probing for vulnerabilities or exploiting them would not be logged, so typical cybersecurity analytics wouldn’t notify administrators of anomalous traffic.

How Do Hackers Find Zombie APIs?

Just as a zombie API can be created in numerous ways, it can also be found in numerous ways. Hackers could find an endpoint by reverse engineering code, reviewing open-source repositories, or through a concept called “fuzzing.” Fuzzing is a type of discovery where scripts are written to make requests against common API endpoint names. For example, it’s common for an API endpoint used for authentication to be named “/login” or “/authenticate” or something similar. Scripts send requests for a list of these common names to your infrastructure and note which ones respond.
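The same technique can be turned around to audit hosts that you own. The following is a minimal sketch, not a full fuzzing tool: the wordlist, the function names, and the rule that any status other than 404 suggests an endpoint exists are all assumptions made for illustration. The HTTP call is passed in as a function so that the logic can be tried without sending real traffic.

```python
from typing import Callable, Dict, Iterable, List
from urllib.parse import urljoin

# Hypothetical wordlist; real wordlists contain thousands of names.
COMMON_PATHS = ["login", "authenticate", "users", "admin", "v1/users"]


def candidate_urls(base_url: str, paths: Iterable[str]) -> List[str]:
    """Build one candidate URL per common endpoint name."""
    base = base_url.rstrip("/") + "/"
    return [urljoin(base, path.lstrip("/")) for path in paths]


def discover(
    base_url: str,
    paths: Iterable[str],
    fetch_status: Callable[[str], int],
) -> Dict[str, int]:
    """Return candidate URLs whose status code is anything other than 404.

    Assumption: the server answers 404 for paths that do not exist.
    A 401 or 403 still counts as a hit because it shows the path is routed.
    """
    found = {}
    for url in candidate_urls(base_url, paths):
        status = fetch_status(url)
        if status != 404:
            found[url] = status
    return found
```

In practice, fetch_status would wrap an HTTP client and should only be pointed at infrastructure you are authorized to test. Any URL it reports that is missing from your documentation is worth investigating.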

Discovery from open-source repositories is common. Open-source repositories are also vulnerable to disclosure of secrets, meaning that developers might forget to remove references to private keys, authentication credentials, and other private data. References to API endpoints are also available for discovery and will be probed for any vulnerabilities. If your organization is unaware of endpoints referenced in code, then they could be probed without any mitigation or rate limiting.

A zombie API isn’t always vulnerable to bug exploits. For example, exploiting SQL injection vulnerabilities could cause data disclosure of sensitive information, but some endpoints are properly coded with resilience against threats. In a zombie API situation, the API might function normally, but it can be used to gather data without any limitations. It’s possible that the endpoint could have a business logic error that could be exploited, but without monitoring, any suspicious activity would go undetected.

Real-World Data Breaches from Zombie APIs

A good example of an API functioning normally but being used to quietly enumerate data is the JustDial incident. JustDial is one of India’s large local directories with over 100 million users. In 2019, a security researcher found that JustDial had a zombie API open to the public internet without any monitoring implemented. The API returned information like name, email, mobile number, address, and gender to anyone making a request to the endpoint. No authentication was necessary, and JustDial was not monitoring to catch the incident.

After a security researcher detected the zombie API, JustDial claimed to have remediated the incident, but the same issue was detected again in 2020. It’s unclear if any third party aside from the security researcher accessed the endpoint, but because it was open to the public internet with no monitoring in place, JustDial cannot assess the extent of the data exfiltration.

Another example involves Facebook, one of the big San Francisco Bay Area tech companies known for some of the best developers on the market. Facebook has had several instances of zombie APIs. In 2016, developers deployed a subdomain (mbasic.beta.facebook.com) to test their password reset functionality. The production version of the API was rate limited, so attackers could not brute force the six-digit passcode sent to users to reset their passwords. The beta version did not have this limitation, so a six-digit passcode could be brute forced within seconds, limited only by an internet connection, bandwidth, and the API endpoint’s backend processing speed.
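To see why the missing rate limit mattered, consider the arithmetic: a six-digit passcode has only 1,000,000 possible values. The sketch below is illustrative only and is not Facebook’s implementation. The class name, the limit of attempts per window, and the request rate used in the usage note are assumptions chosen for the example.

```python
import time
from typing import Callable, Dict, Tuple


def worst_case_seconds(digits: int, attempts_per_second: float) -> float:
    """Time needed to try every passcode of the given length."""
    return 10 ** digits / attempts_per_second


class AttemptLimiter:
    """Fixed-window limiter: at most max_attempts per account per window."""

    def __init__(
        self,
        max_attempts: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        self.clock = clock
        # account -> (window start time, attempts used in that window)
        self._windows: Dict[str, Tuple[float, int]] = {}

    def allow(self, account: str) -> bool:
        now = self.clock()
        start, count = self._windows.get(account, (now, 0))
        if now - start >= self.window_seconds:
            start, count = now, 0  # the old window expired, start a new one
        if count >= self.max_attempts:
            self._windows[account] = (start, count)
            return False
        self._windows[account] = (start, count + 1)
        return True
```

With an assumed limit of 5 attempts per hour, trying all 1,000,000 codes would take 200,000 hours. Without a limit, worst_case_seconds(6, 1000) shows that an attacker sending an assumed 1,000 requests per second exhausts the keyspace in about 17 minutes, and faster still with parallel connections.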

In 2018, Facebook suffered from another zombie API attack. The vulnerability was found in Facebook’s “View As” feature, which allowed users to view their profiles as others see them. The API endpoint for this feature was not locked down or monitored, so attackers could view other user profiles and steal their access tokens. With an access token, an attacker can then take over a user’s account and access their data. Facebook initially estimated that almost 50 million accounts were affected, and about 90 million users had to re-authenticate to ensure that their access tokens were not stolen.

A significant data exposure at a smaller company occurred in 2022 with an endpoint from Travis CI. Travis CI is a continuous integration vendor used to build, test, and deploy code. One of Travis CI’s API endpoints required no authentication and allowed requests to obtain customer log events. To make matters worse, logs were stored in plaintext, so user log data, including access keys, could be retrieved without any limitations. Security researchers who reported the issue found that more than 770 million user log records, containing access tokens, keys, and cloud credentials, were exposed.

Zombie API Discovery

Ideally, software developers document changes to infrastructure so that change control includes new API endpoints. Operations people can then add endpoints to monitoring agents, and these agents collect data so that cybersecurity and analytics monitors can let administrators know when suspicious activity is detected. A zombie API happens when endpoints aren’t documented, so monitoring agents are unaware of endpoints. Without monitoring, any requests can be sent to servers without any analysis and administrator alerts.
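The gap between what is documented and what is actually deployed can be expressed as a simple set difference. The following is a rough sketch under stated assumptions: both inventories are available as lists of (method, path) pairs, and the function names are hypothetical. Paths are compared case-sensitively, with only the trailing slash ignored.

```python
from typing import Iterable, List, Tuple

Endpoint = Tuple[str, str]  # (HTTP method, path)


def normalize(method: str, path: str) -> Endpoint:
    """Uppercase the method and drop leading/trailing slashes from the path."""
    return (method.upper(), "/" + path.strip("/"))


def undocumented_endpoints(
    documented: Iterable[Endpoint],
    deployed: Iterable[Endpoint],
) -> List[Endpoint]:
    """Return deployed endpoints that do not appear in the documentation."""
    known = {normalize(method, path) for method, path in documented}
    live = {normalize(method, path) for method, path in deployed}
    return sorted(live - known)
```

Every endpoint this returns is one that monitoring agents have probably never been told about, so it should either be documented and monitored or shut down.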

To grapple with potential zombie APIs, administrators will often install agents on the network to detect traffic. Agents collect traffic data and detect open connections on servers and other network infrastructure. The issue with this strategy is that zombie APIs often stay dormant with no traffic or requests until they are discovered. They might be discovered by developers, operations, or a third party on the internet. Only after a third party finds the endpoint will traffic be logged, but it doesn’t mean that requests will trigger alerts. A zombie API will allow for standard requests without any “hacking” or malformed queries. It’s what makes zombie APIs so dangerous for data disclosure.

Using Artificial Intelligence to Discover Zombie APIs

Instead of relying on agents to reactively detect zombie APIs, a better solution is to combine scans of your code with artificial intelligence. This strategy has two phases: scanning your repository code for references to internal APIs and using event logs to determine if the API receives any requests.

The first step is to scan code for references to APIs. These APIs could be external or internal, but you want to focus on internal APIs since this infrastructure affects your own data. The references could be in numerous repositories, both active and deprecated. You might not even be aware that the references are in your code, but scanning it will discover them so that a list can be sent to artificial intelligence (AI).
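A first pass at this scan can be done with a regular expression. The sketch below is a simplified illustration, not a production scanner: it only finds absolute http(s) URLs, the list of file extensions is an assumption, and the domain suffixes that count as “internal” are supplied by the caller.

```python
import re
from pathlib import Path
from typing import Dict, Iterable, List, Set
from urllib.parse import urlsplit

# Matches absolute URLs; stops at quotes, whitespace, and brackets.
URL_PATTERN = re.compile(
    r"https?://[A-Za-z0-9.-]+(?::\d+)?(?:/[A-Za-z0-9._~/%-]*)?"
)

# Assumed set of source and config file types worth scanning.
DEFAULT_EXTENSIONS = (".py", ".js", ".ts", ".java", ".go", ".yaml", ".yml", ".json")


def find_api_references(text: str, internal_suffixes: Iterable[str]) -> Set[str]:
    """Return URLs in text whose host belongs to one of the internal domains."""
    suffixes = list(internal_suffixes)
    refs = set()
    for match in URL_PATTERN.finditer(text):
        url = match.group(0).rstrip(".,")  # drop sentence punctuation
        host = urlsplit(url).hostname or ""
        if any(host == s or host.endswith("." + s) for s in suffixes):
            refs.add(url)
    return refs


def scan_repository(
    root: str,
    internal_suffixes: Iterable[str],
    extensions: Iterable[str] = DEFAULT_EXTENSIONS,
) -> Dict[str, List[str]]:
    """Map each file under root to the internal API URLs it references."""
    suffixes = list(internal_suffixes)
    allowed = set(extensions)
    results = {}
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in allowed:
            text = path.read_text(encoding="utf-8", errors="ignore")
            refs = find_api_references(text, suffixes)
            if refs:
                results[str(path)] = sorted(refs)
    return results
```

Running scan_repository over both active and deprecated repositories produces the list of referenced endpoints that the next phase compares against event logs. Relative paths and URLs assembled at runtime would need a more thorough approach than this pattern.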

Next, a large language model (LLM) ingests and analyzes event logs. Event logs could potentially be gigabytes or terabytes of line-by-line data. Event logs are critical for cybersecurity and for monitoring usage on infrastructure, so they should be set up for servers hosting APIs. If an API endpoint is referenced in code but has few or no traffic events, then you might have a zombie API. An API that is referenced in code and has numerous log events is in use and visible to monitoring, so it would not be considered a zombie API.
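The comparison itself can be illustrated without an LLM. The sketch below swaps in plain request counting for the LLM-based analysis described above, purely to show the decision rule. It assumes access logs in a common web server format where the request appears as "METHOD /path HTTP/x.y", and the function names and threshold are hypothetical.

```python
import re
from collections import Counter
from typing import Iterable, List

# Extracts the path from a quoted request line such as "GET /v1/users HTTP/1.1".
REQUEST_PATTERN = re.compile(
    r'"(?:GET|POST|PUT|PATCH|DELETE|HEAD|OPTIONS) (\S+) HTTP/[\d.]+"'
)


def count_requests(log_lines: Iterable[str]) -> Counter:
    """Count logged requests per path, ignoring query strings."""
    counts = Counter()
    for line in log_lines:
        match = REQUEST_PATTERN.search(line)
        if match:
            path = match.group(1).split("?", 1)[0]
            counts[path] += 1
    return counts


def zombie_candidates(
    referenced_paths: Iterable[str],
    log_lines: Iterable[str],
    max_requests: int = 0,
) -> List[str]:
    """Return referenced paths with at most max_requests logged requests."""
    counts = count_requests(log_lines)
    return sorted(p for p in set(referenced_paths) if counts[p] <= max_requests)
```

A path returned by zombie_candidates is only a candidate. It may be a retired endpoint that no longer exists, or a live one that nobody is watching, so each result still needs to be checked against the running infrastructure.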

Using LLMs to process analytics on every API endpoint reference takes time, but the results could surprise administrators unaware of active APIs in their environment. As an example, a case study on Domino’s Pizza India recently found 189 zombie APIs, with 11% of them linked to personally identifiable information (PII). The discovery came after scanning 2,063 endpoints and spending 2.2 days analyzing event logs across the entire Domino’s Pizza India environment.

Should You Monitor for Zombie APIs?

The answer is “Yes!” Zombie APIs could leave your customer data, internal data, or other critical information open to disclosure. In an environment subject to compliance regulations, this oversight could cost millions in fines. Litigation, brand damage, revenue impact, and several other negative consequences are associated with unmonitored infrastructure that leads to a data breach.

Having better visibility into an organization’s environment is critical for cybersecurity and faster incident response. As more organizations deploy infrastructure in the cloud, it’s more important than ever to ensure that you don’t have any loose ends, including zombie APIs. Documenting your infrastructure is a great first step, but it’s possible that some APIs slip through the cracks. Scanning your code continually helps identify zombie APIs, which can then be disabled or added to monitoring agents. Better visibility into your infrastructure lowers the risks of exposing sensitive data and reduces your attack surface.