Data Leakage Detection in cybersecurity is the proactive process of identifying and responding to the unauthorized, typically unintentional, exposure of an organization's sensitive, confidential, or proprietary information to the external world. It focuses on identifying data that has "escaped" the secure internal perimeter due to errors, misconfigurations, or negligence, before malicious actors exploit it.
This use case primarily involves scanning public-facing parts of the internet—including the surface web, deep web, and dark web—for the presence of exposed information such as:
Credentials: Usernames, passwords, API keys, or security tokens.
Intellectual Property (IP): Source code, proprietary algorithms, or internal project documents.
Personally Identifiable Information (PII): Customer or employee records like social security numbers, email addresses, or financial data.
Configuration Files: Database backups, server settings, or private encryption keys.
How ThreatNG Helps with Data Leakage Detection
ThreatNG, which incorporates Digital Risk Protection (DRP) capabilities, is designed to provide the external visibility necessary to discover and remediate these leaks.
External Discovery
ThreatNG performs purely external, unauthenticated discovery that requires no connectors. This process is crucial for finding "shadow assets" and forgotten resources that may be leaking data without the organization's knowledge.
Example: It continuously scans the internet and identifies assets that are related to the organization but are not actively managed, such as a decommissioned FTP server or an unsecured development subdomain. If a sensitive directory listing is discovered on one of these unmanaged assets, ThreatNG flags it as a leak path.
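As a rough illustration of the directory-listing check in the example above, the sketch below tests whether a fetched page looks like an auto-generated index. It is not ThreatNG's implementation. The function name and the markers are assumptions, based on the "Index of /..." title and heading that default Apache and nginx index pages typically emit.

```python
import re

# Assumed markers: default auto-index pages usually carry an
# "Index of /..." title and a matching <h1> heading.
_TITLE_RE = re.compile(r"<title>\s*Index of\s+/", re.IGNORECASE)
_H1_RE = re.compile(r"<h1>\s*Index of\s+/", re.IGNORECASE)


def looks_like_directory_listing(html: str) -> bool:
    """Return True if the page resembles an auto-generated index page."""
    return bool(_TITLE_RE.search(html) or _H1_RE.search(html))


# Hypothetical response body from an unmanaged asset.
sample_page = (
    "<html><head><title>Index of /backups</title></head>"
    "<body><h1>Index of /backups</h1></body></html>"
)
print(looks_like_directory_listing(sample_page))
```

A check like this only flags candidates; custom index templates would need additional markers.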
External Assessment
ThreatNG includes a specific assessment of the likelihood of sensitive data exposure:
Data Leak Susceptibility: This score is directly informed by intelligence related to exposed sensitive information and code.
Example: If ThreatNG detects a high volume of organization-specific compromised credentials on the dark web (which feeds this score), it will raise the Data Leak Susceptibility to indicate a high risk of account takeover that could lead to further, more targeted data exfiltration.
Reporting
ThreatNG provides reports that allow security teams to prioritize and act on leakage incidents:
Prioritized Report: A report would immediately flag an employee's hardcoded database API key found in a public GitHub repository as a Critical risk, so that the key can be revoked and the leak remediated without delay. The report would include all the necessary evidence for the incident response team.
Ransomware Susceptibility Report: This may include context on Dark Web Presence regarding compromised credentials, as exposed login information often serves as the initial entry point for ransomware attacks.
Continuous Monitoring
ThreatNG performs continuous monitoring of the external attack surface and digital risk. Because misconfigurations, a leading cause of leaks, can be introduced at any time, a one-off scan is not enough to catch them.
Example: It monitors code-sharing platforms for any new pushes of code that include sensitive keywords or file types. If a developer accidentally pushes a .env file containing production secrets to a public repository, continuous monitoring detects and alerts on this leak in near real-time, drastically reducing the window of exposure.
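The keyword and file-type check described in this example could be sketched roughly as follows. This is a minimal illustration, not ThreatNG's detection logic. The watch lists and the function name `flag_pushed_file` are hypothetical and would need tuning for each organization.

```python
from pathlib import PurePosixPath

# Hypothetical watch lists; a real deployment would tune these per organization.
SENSITIVE_NAMES = {".env", "id_rsa", "credentials.json"}
SENSITIVE_SUFFIXES = {".pem", ".key", ".sql", ".bak"}
SENSITIVE_KEYWORDS = ("password=", "secret_key", "api_key")


def flag_pushed_file(path: str, content: str = "") -> list[str]:
    """Return the reasons a pushed file looks sensitive (empty list if none)."""
    reasons = []
    p = PurePosixPath(path)
    # Exact file names that commonly hold secrets.
    if p.name in SENSITIVE_NAMES:
        reasons.append(f"sensitive file name: {p.name}")
    # File extensions that commonly hold keys or database dumps.
    if p.suffix.lower() in SENSITIVE_SUFFIXES:
        reasons.append(f"sensitive file type: {p.suffix.lower()}")
    # Case-insensitive keyword search in the file body.
    lowered = content.lower()
    for keyword in SENSITIVE_KEYWORDS:
        if keyword in lowered:
            reasons.append(f"sensitive keyword: {keyword}")
    return reasons


# Example: a .env file pushed to a public repository.
for reason in flag_pushed_file("config/.env", "DB_PASSWORD=hunter2"):
    print(reason)
```

Returning the list of reasons, rather than a bare true/false, lets an alert state why the file was flagged.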
Investigation Modules
The Investigation Modules are the core tool set for identifying and gathering evidence of data leakage:
Sensitive Code Exposure: This module actively scans public code repositories and mobile apps for leaked information.
Example: It finds a snippet of leaked source code on a public platform (like GitHub) that contains exposed Access Credentials (e.g., an AWS Access Key ID or a Stripe API Key), providing the exact location and content of the leak.
Dark Web Presence: This feature searches underground forums and marketplaces.
Example: It detects Associated Compromised Credentials for multiple company email domains being sold in a dark web forum, providing the security team with evidence that a data leak has already occurred and that accounts are at risk of takeover.
Archived Web Pages: This module identifies sensitive information that was once publicly visible and has since been removed from the live site, but that often remains in public web archives.
Example: It discovers an archived version of a subdirectory (Directories) on an old corporate website that contains an easily accessible Document File (PDF) with proprietary financial data that should have been deleted, revealing a past leak.
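For the Sensitive Code Exposure example above, a pattern-based credential scan might look something like the sketch below. It is not ThreatNG's implementation. The two regular expressions are assumptions based on commonly published key prefixes (AWS access key IDs beginning with AKIA or ASIA, Stripe live secret keys beginning with sk_live_) and are not exhaustive. The function name `find_secrets` is hypothetical.

```python
import re

# Assumed patterns based on commonly published key prefixes; not exhaustive.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),
    "stripe_live_secret_key": re.compile(r"\bsk_live_[0-9a-zA-Z]{24,}\b"),
}


def find_secrets(text: str) -> list[tuple[int, str, str]]:
    """Return (line_number, pattern_name, redacted_match) for each hit."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            for match in pattern.finditer(line):
                value = match.group(0)
                # Redact the middle so the report does not re-leak the secret.
                hits.append((lineno, name, value[:4] + "..." + value[-4:]))
    return hits


# The key below is the placeholder used in AWS documentation, not a real key.
snippet = 'region = "us-east-1"\naws_key = "AKIAIOSFODNN7EXAMPLE"\n'
for hit in find_secrets(snippet):
    print(hit)
```

Reporting the line number gives the "exact location" of the leak, while redaction keeps the finding itself from becoming a second exposure.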
Intelligence Repositories (DarCache)
ThreatNG’s intelligence repositories provide the real-time context on where and how leaks are occurring externally:
Compromised Credentials (DarCache Rupture): This database is continuously updated with vast amounts of credentials found in breaches and leaks, enabling rapid identification when an organization’s employee or customer accounts are compromised.
Dark Web (DarCache Dark Web): Provides intelligence on hacker chatter, which can give early warning of an imminent leak or breach based on organizational mentions.
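To illustrate how a compromised-credential feed can be matched against an organization's accounts, the sketch below groups breach records by email address for a set of company domains. The record shape (dicts with "email" and "source" keys) and the function name are assumptions made for illustration; they do not reflect the actual DarCache data format.

```python
from collections import defaultdict


def group_exposed_accounts(records, org_domains):
    """Map each exposed organizational email to the sorted sources it appeared in.

    `records` is an iterable of dicts with "email" and "source" keys
    (a hypothetical shape chosen for this sketch).
    """
    domains = {d.lower() for d in org_domains}
    exposed = defaultdict(set)
    for rec in records:
        email = rec.get("email", "").strip().lower()
        # rpartition yields an empty separator when no "@" is present.
        _, sep, domain = email.rpartition("@")
        if sep and domain in domains:
            exposed[email].add(rec.get("source", "unknown"))
    return {email: sorted(sources) for email, sources in exposed.items()}


# Hypothetical feed entries.
feed = [
    {"email": "Alice@Example.com", "source": "breach-a"},
    {"email": "alice@example.com", "source": "paste-b"},
    {"email": "bob@other.org", "source": "breach-a"},
    {"email": "malformed", "source": "breach-c"},
]
print(group_exposed_accounts(feed, ["example.com"]))
```

Normalizing case before matching avoids counting the same account twice when a breach dump stores addresses inconsistently.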
Working with Complementary Solutions
ThreatNG's external discovery and leakage intelligence can be used with complementary solutions for automated response and containment.
Data Loss Prevention (DLP) Tools: ThreatNG’s Sensitive Code Exposure module identifies a large block of proprietary source code that was accidentally uploaded to a public code-sharing site by a contractor. This intelligence, including the exact file location and the user identity (if discoverable), is sent to a complementary DLP tool. The DLP tool, which typically monitors internal data movement, can then use this external intelligence to identify the specific internal workstation or user account from which the data originated, allowing for immediate internal policy enforcement or access restrictions.
Security Orchestration, Automation, and Response (SOAR) Platforms: ThreatNG’s Dark Web Presence module detects that a C-suite executive's Associated Compromised Credentials (email and password) have been posted on a paste site. This high-priority alert is automatically ingested by a complementary SOAR platform. The SOAR platform's automated playbook immediately executes several actions: it forces a password reset for the executive's account, notifies the IT and incident response teams, and automatically blocks the exposed credential pair from internal network access attempts, effectively containing the potential damage from the data leak before it leads to an account takeover.
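The SOAR playbook in the last example could be sketched along these lines. This is an assumption-laden outline rather than any vendor's API: the `CredentialAlert` shape, the step names, and the `actions` mapping are all hypothetical, and a real platform would supply its own integrations for each step.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class CredentialAlert:
    email: str
    source: str    # where the credential pair was found, e.g. "paste site"
    priority: str  # "high" triggers automated containment in this sketch


# Hypothetical step names, run in this order.
CONTAINMENT_STEPS = ("force_password_reset", "notify_teams", "block_credential")


def run_containment_playbook(
    alert: CredentialAlert, actions: Dict[str, Callable[[str], None]]
) -> List[str]:
    """Run each containment step in order and return the names of the steps run."""
    if alert.priority != "high":
        return []  # lower priorities are left for manual triage in this sketch
    executed = []
    for step in CONTAINMENT_STEPS:
        actions[step](alert.email)
        executed.append(step)
    return executed


# Stub actions that only record what would have been done.
audit_log = []
stub_actions = {
    step: (lambda email, s=step: audit_log.append((s, email)))
    for step in CONTAINMENT_STEPS
}
alert = CredentialAlert("ceo@example.com", "paste site", "high")
print(run_containment_playbook(alert, stub_actions))
```

Passing the actions in as callables keeps the playbook logic separate from the identity provider, ticketing, and network tools that would actually carry out each step.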

