API Specification Discovery

Apr 25

API Specification Discovery is a specialized cybersecurity reconnaissance process focused on identifying, locating, and retrieving machine-readable documentation files that describe the structure, logic, and endpoints of an Application Programming Interface (API). Unlike general API discovery, which may rely on analyzing network traffic or logs, specification discovery targets the API's "blueprints" (files such as OpenAPI/Swagger definitions, WSDL documents, and GraphQL schemas) that are often inadvertently exposed to the public internet.

In the context of External Attack Surface Management (EASM), finding these specifications is critical because they provide attackers with a detailed map of backend systems, effectively removing the need for manual reverse engineering.

What is API Specification Discovery?

This process involves searching for specific file types and directories that developers use to document how their APIs function. These files are intended for internal use or authorized third-party developers but are frequently left accessible on production servers, staging environments, or public code repositories.

Key targets of API Specification Discovery include:

OpenAPI/Swagger Files: Typically named swagger.json or openapi.yaml, or found at endpoints like /v2/api-docs or /swagger-ui.html.
WSDL Files: Web Services Description Language files used for SOAP APIs, often found at ?wsdl endpoints.
GraphQL Schemas: Discovered via "introspection" queries that ask the API to reveal its entire data schema.
Postman Collections: JSON files exported from the Postman collaboration tool, often leaking environment variables and API keys alongside endpoint definitions.
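
The targets above can usually be told apart by a few structural markers once a file has been retrieved. The following is a minimal sketch, assuming the body is JSON or XML (YAML specifications would need an extra parser); the function name classify_spec and the returned labels are illustrative, not part of any tool named in this article.

```python
import json
import xml.etree.ElementTree as ET

# Namespaces used by WSDL 1.1 and WSDL 2.0 root elements.
WSDL_NAMESPACES = {
    "http://schemas.xmlsoap.org/wsdl/",
    "http://www.w3.org/ns/wsdl",
}


def classify_spec(text: str) -> str:
    """Return a best-guess label for a retrieved document body (hypothetical helper)."""
    stripped = text.lstrip()

    # XML branch: look for a WSDL root element.
    if stripped.startswith("<"):
        try:
            root = ET.fromstring(stripped)
        except ET.ParseError:
            return "unknown"
        # ElementTree renders qualified tags as "{namespace}localname".
        if root.tag.startswith("{"):
            namespace, _, local = root.tag[1:].partition("}")
            if namespace in WSDL_NAMESPACES and local in ("definitions", "description"):
                return "wsdl"
        return "unknown"

    # JSON branch: OpenAPI/Swagger, GraphQL introspection result, or Postman collection.
    try:
        doc = json.loads(stripped)
    except json.JSONDecodeError:
        return "unknown"
    if not isinstance(doc, dict):
        return "unknown"

    if "openapi" in doc or "swagger" in doc:
        return "openapi"

    data = doc.get("data")
    if isinstance(data, dict) and "__schema" in data:
        return "graphql-introspection"

    info = doc.get("info")
    if isinstance(info, dict) and "schema.getpostman.com" in str(info.get("schema", "")):
        return "postman"

    return "unknown"
```

A classifier like this only labels what was found; deciding whether the exposure matters still depends on whether the API was meant to be public.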

Why is API Specification Discovery Critical?

For security teams, discovering these files is a high-priority task because they reveal the "Shadow API" attack surface—endpoints that exist but are not managed or secured by the IT department.

Reveals Hidden Endpoints: Specifications often list administrative, debugging, or deprecated ("Zombie") endpoints that represent significant security gaps.
Exposes Business Logic: The files define exactly what data parameters the API accepts, allowing attackers to identify fields susceptible to manipulation (e.g., changing a user_id to access another person's data).
Accelerates Attacks: Attackers can import these files into automated vulnerability scanners (like Burp Suite or OWASP ZAP) to instantly test every defined endpoint for SQL injection, Cross-Site Scripting (XSS), or authorization flaws.
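
To illustrate how much a single specification gives away, the sketch below walks the "paths" object of an already-parsed OpenAPI/Swagger document and lists every operation, marking deprecated ones and paths containing a few suspicious keywords. The keyword list and the function name list_operations are assumptions chosen for illustration, and only operation-level parameters are collected.

```python
# HTTP methods that may appear as keys under an OpenAPI path item.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}

# Illustrative keywords only; a real review would use a longer, tuned list.
SENSITIVE_HINTS = ("admin", "debug", "internal")


def list_operations(spec: dict) -> list[dict]:
    """Flatten the 'paths' section of a parsed spec into one record per operation."""
    operations = []
    for path, item in spec.get("paths", {}).items():
        if not isinstance(item, dict):
            continue
        for method, operation in item.items():
            # Skip non-operation keys such as "parameters" or "summary".
            if method.lower() not in HTTP_METHODS or not isinstance(operation, dict):
                continue
            parameters = [
                p["name"]
                for p in operation.get("parameters", [])
                if isinstance(p, dict) and "name" in p
            ]
            operations.append({
                "method": method.upper(),
                "path": path,
                "deprecated": bool(operation.get("deprecated", False)),
                "sensitive_hint": any(hint in path.lower() for hint in SENSITIVE_HINTS),
                "parameters": parameters,
            })
    return operations
```

Defenders can run the same enumeration against their own specifications to see which administrative or deprecated endpoints an outsider would learn about first.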

How API Specification Discovery Works

The discovery process typically utilizes open-source intelligence (OSINT) and active scanning techniques:

Search Engine Dorking: Using advanced search queries (e.g., inurl:/v2/api-docs site:target.com) to find indexed specification files.
Directory Fuzzing: Systematically guessing common directory names (like /api, /docs, /schema) to find files that are not linked from the main website.
Repository Scanning: Monitoring platforms like GitHub or GitLab for code commits that accidentally include API documentation or Postman collections.
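
The dorking and fuzzing techniques above both start from a wordlist. The following sketch builds search queries and candidate URLs from the file names and directories mentioned in this article; it deliberately sends no requests. The lists and the function names candidate_urls and dork_queries are illustrative assumptions, and real wordlists are much longer.

```python
from urllib.parse import urljoin

# File names and endpoints mentioned in this article.
SPEC_PATHS = ["swagger.json", "openapi.yaml", "v2/api-docs", "swagger-ui.html"]

# Common directories to try; "" means the site root.
COMMON_DIRS = ["", "api/", "docs/", "schema/"]


def candidate_urls(base_url: str) -> list[str]:
    """Combine each directory with each specification path, without duplicates."""
    root = base_url if base_url.endswith("/") else base_url + "/"
    seen = set()
    urls = []
    for directory in COMMON_DIRS:
        for name in SPEC_PATHS:
            url = urljoin(root, directory + name)
            if url not in seen:
                seen.add(url)
                urls.append(url)
    return urls


def dork_queries(domain: str) -> list[str]:
    """Build one search-engine query per specification path."""
    return [f"inurl:/{path} site:{domain}" for path in SPEC_PATHS]
```

For example, candidate_urls("https://target.com") begins with https://target.com/swagger.json, and dork_queries("target.com") includes the query inurl:/v2/api-docs site:target.com shown above.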

Common Questions About API Specification Discovery

Is publishing an API specification a vulnerability? Not inherently. Publishing documentation for a public API is standard practice. However, publishing specifications for internal, private, or partner-only APIs is a security risk (Information Disclosure) because it helps attackers understand the system's internal architecture.

What is the difference between API Discovery and API Specification Discovery? API Discovery is a broad term that encompasses analyzing traffic (logs, PCAP) to identify which APIs are currently in use. API Specification Discovery is a subset that focuses on finding the documentation files (schemas) that define what is possible to use, even if no traffic is currently flowing to them.

How do I prevent unauthorized API Specification Discovery? Organizations should disable specification endpoints (e.g., Swagger UI) in production environments, place any remaining documentation behind strict authentication, and ensure introspection is disabled for GraphQL endpoints. Continuous scanning should be used to detect whether these files inadvertently become public.
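
One way to verify the GraphQL recommendation is to send a very small introspection query to your own endpoint and check whether a schema comes back. This is a minimal sketch using only the Python standard library; the function names are hypothetical, and treating any HTTP error or non-JSON reply as "not confirmed" is a simplifying assumption.

```python
import json
import urllib.error
import urllib.request

# A deliberately tiny introspection query: it asks only for the root query type name.
PROBE = {"query": "{ __schema { queryType { name } } }"}


def introspection_enabled(response_body) -> bool:
    """Return True if a parsed GraphQL response contains schema data."""
    if not isinstance(response_body, dict):
        return False
    data = response_body.get("data")
    return isinstance(data, dict) and isinstance(data.get("__schema"), dict)


def probe_endpoint(url: str, timeout: float = 10.0) -> bool:
    """POST the probe to an endpoint you are authorized to test."""
    request = urllib.request.Request(
        url,
        data=json.dumps(PROBE).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return introspection_enabled(json.load(response))
    except (urllib.error.URLError, ValueError):
        # Network failure, HTTP error status, or a non-JSON body: not confirmed.
        return False
```

A result of False from probe_endpoint does not prove introspection is disabled; it only means this particular probe did not return a schema.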

Securing API Specifications with ThreatNG

ThreatNG provides a comprehensive solution for managing the risks associated with exposed API documentation. By adopting an attacker's perspective, ThreatNG identifies, assesses, and monitors these critical assets before they can be exploited.

External Discovery

ThreatNG automates the discovery of API specifications across the entire digital supply chain. It goes beyond simple website crawling to identify assets that traditional scanners often miss.

Example: ThreatNG scans an organization's cloud storage buckets and identifies a misconfigured AWS S3 bucket containing a postman_collection.json backup file. This file outlines internal API routes for a legacy payment gateway, which the security team was unaware still existed.

External Assessment

Upon discovering an API specification, ThreatNG evaluates the exposure context to determine its severity.

Example: ThreatNG finds an exposed WSDL file for a SOAP service. The external assessment module analyzes the hosting server's SSL/TLS configuration and determines that it still accepts a deprecated protocol version (TLS 1.0). It combines this with the WSDL finding to generate a high-risk rating, because exposed service logic together with weak transport security makes the API highly susceptible to Man-in-the-Middle (MitM) attacks.

Reporting

ThreatNG consolidates findings into clear, prioritized reports that bridge the gap between technical details and business risk.

Example: A security analyst receives a ThreatNG report highlighting a newly discovered GraphQL endpoint with introspection enabled. The report details the specific schema exposed, including sensitive object types like UserCreditCardInfo, and provides immediate steps to disable introspection in the production environment.

Continuous Monitoring

APIs evolve rapidly, and documentation can be exposed by a single bad code commit. ThreatNG continuously monitors the attack surface for these changes.

Example: A DevOps team accidentally removes a firewall rule, exposing a previously internal /api-docs endpoint to the public internet. ThreatNG's continuous monitoring engine detects this state change within hours and alerts the security operations center (SOC), allowing them to re-secure the endpoint before it is indexed by search engines.
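
The state change in this example can be pictured as a comparison between two scan snapshots. The sketch below is not ThreatNG's implementation; it assumes a hypothetical snapshot format that maps each probed URL to the HTTP status code observed, and reports URLs that answer successfully now but did not before.

```python
from typing import Optional


def is_public(status: Optional[int]) -> bool:
    """Treat any 2xx status as publicly reachable; None means 'not seen'."""
    return status is not None and 200 <= status < 300


def newly_exposed(previous: dict[str, int], current: dict[str, int]) -> list[str]:
    """Return URLs reachable in the current snapshot but not in the previous one."""
    return sorted(
        url
        for url, status in current.items()
        if is_public(status) and not is_public(previous.get(url))
    )
```

Running the comparison on every scan keeps the alert tied to the moment of change rather than to the first time someone happens to notice the endpoint.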

Investigation Modules

ThreatNG’s investigation modules allow for deep-dive analysis of discovered specifications to uncover secondary risks.

Example: Using the Domain Intelligence module, ThreatNG investigates a newly found API subdomain (api-dev.target.com). It correlates this domain with historical WHOIS data and finds that it was registered by a third-party contractor who no longer works with the company, highlighting a potential "orphan" asset risk.

Intelligence Repositories

ThreatNG enriches discovery data by cross-referencing findings with its proprietary intelligence repositories.

Example: ThreatNG identifies a "Shadow API" endpoint referenced in a public code snippet. It checks this endpoint against the DarCache Dark Web Intelligence Repository and finds that API keys matching the format used by this endpoint are being sold in a "stealer log" on a dark web marketplace, indicating a likely credential compromise.

Complementary Solutions

ThreatNG serves as the foundational intelligence layer that powers a broader security ecosystem. It seamlessly complements other security technologies by providing them with the "target list" they need to be effective.

Complementary Solution (DAST): ThreatNG feeds discovered Swagger/OpenAPI files directly into Dynamic Application Security Testing (DAST) tools. This allows the DAST scanner to parse the file and automatically construct valid requests for every documented endpoint, so that none is skipped during vulnerability testing.
Complementary Solution (API Gateways): When ThreatNG discovers a rogue or Shadow API endpoint, it shares this intelligence with the organization's API Gateway. The Gateway can then instantly apply a "block" policy to that specific route until it can be properly onboarded and secured.
Complementary Solution (SIEM/SOAR): ThreatNG pushes alerts regarding exposed API specifications to Security Information and Event Management (SIEM) platforms. This allows the security team to correlate exposure time with traffic logs to determine whether any external IP addresses accessed the documentation during the public window.
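
The SIEM correlation described above amounts to filtering access logs by path, time window, and source address. The following is a minimal sketch under stated assumptions: the record fields (timestamp, client_ip, path) and the function name are hypothetical and do not correspond to any specific SIEM or ThreatNG API.

```python
from datetime import datetime
from ipaddress import ip_address


def external_doc_hits(
    records: list[dict],
    doc_path: str,
    window_start: datetime,
    window_end: datetime,
) -> list[dict]:
    """Return log records for doc_path inside the exposure window from non-private IPs."""
    hits = []
    for record in records:
        in_window = window_start <= record["timestamp"] <= window_end
        # is_private is True for RFC 1918 and other special-purpose ranges.
        external = not ip_address(record["client_ip"]).is_private
        if record["path"] == doc_path and in_window and external:
            hits.append(record)
    return hits
```

An empty result supports the conclusion that nobody outside the organization fetched the documentation while it was public, provided the logs cover the whole window.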

Examples of ThreatNG Helping

Helping reduce Shadow IT: ThreatNG helps a financial institution discover a "test" API server set up by a marketing vendor that was exposing customer email patterns via a public Swagger UI. The organization was able to shut down the server immediately.
Helping secure supply chains: ThreatNG helps a software company identify that a partner's API documentation (hosted on a shared subdomain) was leaking administrative authentication tokens in the "example" fields of the specification.
Supporting compliance efforts: ThreatNG helps a healthcare provider demonstrate to auditors that no internal patient-data APIs are exposed to the public internet by providing continuous, time-stamped discovery logs of its external perimeter.
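
The supply chain example above, where tokens leaked through the "example" fields of a specification, can be approximated with a simple recursive check. This is a heuristic sketch, not ThreatNG's detection logic: the pattern (any run of 32 or more token-like characters) and the function name are assumptions, and it will produce both false positives and false negatives.

```python
import re

# Heuristic: a long unbroken run of characters commonly found in keys and tokens.
TOKEN_PATTERN = re.compile(r"[A-Za-z0-9_.\-]{32,}")

# Keys under which OpenAPI documents carry sample values.
EXAMPLE_KEYS = {"example", "examples"}


def find_token_like_examples(node, path: str = "$", in_example: bool = False) -> list[str]:
    """Return the locations of example values that look like credentials."""
    findings = []
    if isinstance(node, dict):
        for key, value in node.items():
            findings += find_token_like_examples(
                value, f"{path}.{key}", in_example or key in EXAMPLE_KEYS
            )
    elif isinstance(node, list):
        for index, value in enumerate(node):
            findings += find_token_like_examples(value, f"{path}[{index}]", in_example)
    elif in_example and isinstance(node, str) and TOKEN_PATTERN.search(node):
        findings.append(path)
    return findings
```

Each returned location points a reviewer at the exact field to inspect, which is usually faster than reading the whole specification by hand.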

Threat NG Staff
