Securing secrets is hard. API tokens, cloud credentials, and database URLs have a bad habit of getting exposed anywhere and everywhere – from code repositories to CI job logs and Jira tickets. And let's not forget those leaks tend to happen when your security teams least expect them, usually during “out of office hours” and in assets you don’t own – think personal GitHub repos of your developers.
GitGuardian has been actively tackling this problem since 2017. We’ve been scanning every public commit on GitHub (just a little over 1 billion last year!), identifying secret leaks along the way, and publishing a yearly report, The State of Secrets Sprawl, read by thousands of security practitioners around the world. Reports are, of course, a great display of thought leadership, make for great press coverage, and help us generate tons of backlinks (of which our marketing team always demands more…). Yet we understand that reports only scratch the surface; it's time for real solutions.
Today, we’re unveiling HasMySecretLeaked, a free toolset to help security and DevOps teams verify if their organization’s secrets have leaked on public repositories, gists, and issues on GitHub projects.
HasMySecretLeaked is the fruit of GitGuardian’s research on GitHub, uncovering 3 million exposed secrets in 2020, 6 million in 2021, and 10 million in 2022. It is your gateway to a private and proprietary database, housing all leaked secrets and their locations in a secure and hashed format.
You might wonder how HasMySecretLeaked differs from traditional approaches, where you scan your source code, Docker images, and other corners of your DevOps environment for secrets. The answer is simple. We're not doing any scanning.
The focus has always been on scanning your organization's software delivery pipeline for hardcoded secrets. But what about those secrets that venture into the wild, far beyond your control? Think of the developer who inadvertently exposes a personal access token on their personal GitHub repository. Or that community open-source project where critical cloud credentials find their way into the source or binary package on PyPI.
HasMySecretLeaked brings ‘auditability’ to every secret you manage: the ones in your vaults, build pipelines, .env files, cloud provider’s built-in secrets managers, and many other places. It works its way from the source, wherever your secrets may be, to verify whether they’ve leaked, how, and where.
Here's the lowdown: You can query our database by submitting your secret in the search console. GitGuardian takes it from there, hashing your secret on the client side and looking for matches–without revealing any other secrets or their locations. It's private by design, secure, and lightning-fast.
Starting today, GitGuardian users can already tap into the power of HasMySecretLeaked directly from our command-line interface ggshield. Here’s a quick tutorial👇
If you haven't installed ggshield yet, visit https://github.com/GitGuardian/ggshield#installation and follow the setup instructions. macOS, Linux, and Windows are supported.
ggshield processes your secrets in three steps. There is one command for each step, but there's also an all-in-one command that runs all three steps at once.
If you're a first-time user, we recommend going through the step-by-step guide at least once to understand how HasMySecretLeaked works. After your initial tests, we recommend using the all-in-one ggshield hmsl check [OPTIONS] PATH command for your convenience.
Before you start, make sure you can reach the service and check whether you are authenticated by running ggshield hmsl api-status.
You can also run ggshield hmsl quota to see how many daily credits you have left. Note that authenticated users are given a larger quota.
Create a file, for instance secrets.txt, containing some secrets:
password
09f45eb5bBA9D0FAcb309bc4fd6bb7aCd36f9ccF4EfAeBADB6EA552f6AEbbB238B5730B
hunter2
Here we have two passwords and a GitGuardian API key (invalid).
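If your secrets live in a .env file rather than a hand-written list, a small helper can produce the one-secret-per-line file used in this tutorial. The following is a minimal sketch, assuming a plain KEY=VALUE layout; the function names env_values and write_secrets_file are hypothetical and are not part of ggshield.

```python
from pathlib import Path


def env_values(env_path: str) -> list[str]:
    """Collect the values of KEY=VALUE lines, skipping blanks and comments."""
    values = []
    for line in Path(env_path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may contain "=" themselves.
        _, value = line.split("=", 1)
        value = value.strip().strip("'\"")  # drop surrounding quotes, if any
        if value:
            values.append(value)
    return values


def write_secrets_file(env_path: str, out_path: str = "secrets.txt") -> int:
    """Write one secret per line to out_path and return how many were written."""
    values = env_values(env_path)
    Path(out_path).write_text("\n".join(values) + "\n", encoding="utf-8")
    return len(values)
```

Calling write_secrets_file(".env") would create a secrets.txt like the one above. The output file holds secrets in clear text, so it is worth deleting once the check is done.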
The first step is to hash the secrets and prepare them for the next steps:
$ ggshield hmsl fingerprint secrets.txt
payload.txt and mapping.txt files have been written.
Prepared 3 secrets.
This command creates two files: payload.txt and mapping.txt. The first one contains the prefixes of the hashes, and the second one is a mapping between the hashes of your secrets and their redacted values. Note that mapping.txt will never leave your environment.
Let's print their content to the console.
$ cat payload.txt
2af56
d13f5
743d9
$ cat mapping.txt
743d9fde380b7064cc6a8d3071184fc47905cf7440e5615cd46c7b6cbfb46d47:pa****rd
2af568e0f5f67b3ca2a66d6612250770d7e0c30049812fdce698483004448faa:09f45eb5bBA9***********************************************bbB238B5730B
d13f54620142ebbf972b44f626e30fdc436b50ddbe4f2d4749c8d5685942a4a9:hu***r2
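To make the split between the two files concrete, here is a rough sketch of the idea behind the fingerprint step. It is not ggshield's implementation: SHA-256 is used as a stand-in hash (the hashes shown above come from ggshield's own scheme), and the redaction rule and the names fingerprint, redact, and prepare are assumptions made for illustration.

```python
import hashlib

PREFIX_LEN = 5  # payload.txt above shows five-character prefixes


def fingerprint(secret: str) -> str:
    # Stand-in hash: SHA-256 is an assumption, not necessarily what ggshield uses.
    return hashlib.sha256(secret.encode("utf-8")).hexdigest()


def redact(secret: str, keep: int = 2) -> str:
    # Hypothetical redaction rule: keep the first and last `keep` characters.
    if len(secret) <= 2 * keep:
        return "*" * len(secret)
    return secret[:keep] + "*" * (len(secret) - 2 * keep) + secret[-keep:]


def prepare(secrets: list[str]) -> tuple[list[str], dict[str, str]]:
    """Return (payload, mapping): prefixes to send, full hashes kept locally."""
    mapping = {fingerprint(s): redact(s) for s in secrets}
    payload = sorted({h[:PREFIX_LEN] for h in mapping})
    return payload, mapping


if __name__ == "__main__":
    payload, mapping = prepare(["password", "hunter2"])
    print(payload)  # only these short prefixes would leave the machine
    print(mapping)  # full hashes and redacted values stay local
```

The point of the design is that a five-character prefix is shared by many possible hashes, so sending it reveals very little about the secret behind it.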
The second step is to query the HasMySecretLeaked database with the hash prefixes:
$ ggshield hmsl query payload.txt > results.dump
1000 credits left for today.
Audited 3 secrets.
The only input here is the payload.txt file created during the first step. Remember to redirect the results to a file, as they will need to be processed with another command later.
The previous command returns a list of potential matches: leaked secrets whose hashes share a five-character prefix with the hashes of your secrets. Each potential match is an object with two properties: the hint, which is the hash of the hash, and the payload, which is an object containing the redacted secret, the number of times it has leaked, and a URL leading to its first location. Note that each payload is encrypted with its secret's own hash!
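To picture why a prefix query returns several candidates, consider the following toy model of a prefix-indexed lookup. It is purely illustrative and says nothing about how GitGuardian's service is actually built: the in-memory index, the SHA-256 stand-in for the hint, and the names build_index and query are all assumptions.

```python
import hashlib
from collections import defaultdict


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def build_index(leaked_hashes, prefix_len=5):
    """Group leaked-secret hashes by prefix; each entry exposes only a hint."""
    index = defaultdict(list)
    for full_hash in leaked_hashes:
        # "Hash of the hash" (SHA-256 here is a stand-in, not the real scheme).
        hint = sha256_hex(bytes.fromhex(full_hash))
        index[full_hash[:prefix_len]].append(
            {"hint": hint, "payload": "<encrypted blob omitted>"}
        )
    return index


def query(index, prefixes):
    """Return every candidate that shares a prefix with the request."""
    return [entry for p in prefixes for entry in index.get(p, [])]
```

In this model the lookup only ever sees prefixes, and the response never contains a full hash, only hints and opaque payloads.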
Let's print the results.dump file to the console.
$ cat results.dump
{"hint": "ed31069a5300566cad3752a2a013dd150de5f4132b4f88453621da35c6be3428",
"payload":"sbeR1+kmi2wLq2DjFxrb5YSeXXbqFO0NRnz7n3l2mAEa5U16lFOrss1GiydudZiWhLBnhSF7RF0pzChU/R9YmXcOl//viRPheyKm6Hme0OKJvQPTGxoL6PTz5lpJsJSNDpcF90XgFrOLb1SMjHxAbZZ9aXI8WsfvTcoUnrJAtjS4AUkCPqt2supLQRzsA80jy29IRQOJgYGobnQ+XUx88BTJKauHy7iJ3xfYeC+aJKkNVc/ODxCCQ6bI1H+IQa1aty/aFo6oz7MP26AXds/Zn/QWmis="}
{"hint": "2b5270eec06c953839ee77f7aa7b167920cbfcf1d54abc7805f5e13c9732c146",
"payload":"OO9lotZqFxUyKDjAMcUu1E8IUEMbUwZIJg1B1Ohp86VrqkFXASGCMdugLt9jb+Kagt8kLwH6Bi8anTV419NrdrTsQx9z5yegoodQFhO+teW8LgZR3A4IaILaEAzJkFQG7SHpfAxZLuiXyA8hBtEtPQpgsZSyZT5ASdixSqFV9/PEtz3+sPxDTS4Juch+n2vdIAZXPktoOpcSb8adzwVUgec4LUnzlPvHPzyzrq32mcQWJEn7z9xrRdMzlQjrARHNHSR6vumj/enNv/vR38uQDnFleS8L19SCN7w="}
{...}
Thanks to the mapping.txt file created in the first step, ggshield can now filter the hints to find the exact matches and decrypt their content, using the full hash of each secret, which never leaves the client side, as the decryption key.
$ ggshield hmsl decrypt results.dump
Found 2 leaked secrets.
> Secret 1
Secret name: "hu***r2"
Secret hash: "d13f54620142ebbf972b44f626e30fdc436b50ddbe4f2d4749c8d5685942a4a9"
Distinct locations: 45
First occurrence:
URL: "<https://github.com/destinationbg/frontend-nuxt/blob/8d79317d5766d5de336338b727966101bb217adf/_server/api/auth/[...].ts#L42>"
> Secret 2
Secret name: "pa****rd"
Secret hash: "743d9fde380b7064cc6a8d3071184fc47905cf7440e5615cd46c7b6cbfb46d47"
Distinct locations: 25
First occurrence:
URL: "<https://github.com/edly-io/devstack/commit/ccfc9c2d63c2917be60a9fd2a4c36ff3a8b9bb8c#diff-e45e45baeda1c1e73482975a664062aa56f20c03dd9d64a827aba57775bed0d3L158>"
When running this command, ggshield expects to find a mapping.txt file in the current directory. If you used a prefix or moved the file, use the -m option to specify the location of the mapping file.
Options
-m --mapping the location of the mapping file
--json to output the results in JSON
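The filtering half of the decrypt step can be sketched as follows. This is a simplified illustration under stated assumptions, not ggshield's code: the hint is computed with SHA-256 as a stand-in for the real "hash of the hash", the names hint_for and exact_matches are hypothetical, and the actual decryption of the payload is left out.

```python
import hashlib


def hint_for(full_hash: str) -> str:
    # Stand-in for the "hash of the hash"; SHA-256 is an assumption.
    return hashlib.sha256(bytes.fromhex(full_hash)).hexdigest()


def exact_matches(candidates, mapping):
    """Keep candidates whose hint matches a locally known secret hash.

    candidates: list of {"hint": ..., "payload": ...} dicts from the query step
    mapping:    {full_hash: redacted_secret}, as in mapping.txt
    """
    local_hints = {hint_for(h): h for h in mapping}
    matches = []
    for candidate in candidates:
        full_hash = local_hints.get(candidate["hint"])
        if full_hash is not None:
            # In the real flow, full_hash would now serve as the decryption
            # key for candidate["payload"]; decryption is omitted here.
            matches.append((mapping[full_hash], full_hash, candidate["payload"]))
    return matches
```

Candidates that merely share a prefix with one of your secrets have a different hint, so they are dropped, and without the matching full hash their payloads cannot be decrypted.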
What's more, ggshield includes a plugin for pulling secrets from HashiCorp Vault and staging them in local environments before checking them for leaks.
$ ggshield hmsl check-secret-manager hashicorp-vault --help
Usage: ggshield hmsl check-secret-manager hashicorp-vault [OPTIONS] VAULT_PATH
Check secrets of an Hashicorp Vault instance. Only compatible with the kv secret engines (v1 or v2) for now.
Will use the VAULT_URL environment variable to get the Vault instance URL or the --url option if no environment variable is set.
Will use the VAULT_TOKEN environment variable to authenticate, except if the --use-cli-token option is set.
For more information about using ggshield to interact with HasMySecretLeaked, visit the CLI reference documentation.
And if you're a user of the GitGuardian Platform, the good news keeps coming! HasMySecretLeaked is already integrated, advising security teams if their hardcoded secrets, found in organization-owned repositories, Slack channels, or Jira projects, have accidentally leaked to public sources outside their perimeter.
GitGuardian Secrets Detection now includes a new publicly leaked tag, which you can use for filtering your incidents.
In addition, a leak status property has been added to the incident’s details view to inform you in the event of a public leak and to display the total number of times the secret was found on GitHub repositories, gists, and issues outside your GitHub organizations.
When we say private by design, we mean it. HasMySecretLeaked is designed to never read or access your secrets.
"We're leveraging concepts such as k-anonymity and zero-knowledge to determine whether your secrets have leaked in public sources–without even having to see them in the first place," says Pierre Lalanne, Lead Engineer at GitGuardian.
It is impossible for us or anyone else to see or reconstruct your secrets. If you want to learn more about the work HasMySecretLeaked performs behind the scenes, the data flows, encryption, and more, read the following post from our engineering team.
Take HasMySecretLeaked for a spin and gain peace of mind knowing your secrets are safe!
Stay tuned for more updates as we take secrets security to new heights!
*** This is a Security Bloggers Network syndicated blog from GitGuardian Blog - Automated Secrets Detection authored by Ziad Ghalleb. Read the original post at: https://blog.gitguardian.com/announcing-has-my-secret-leaked/