By Dotan Nahum
Part of the Spectral API Security Series
Collaboration is key, and not only in software development. But when it comes to collaboration on Git repositories, the word “key” takes on a whole new meaning. Whether they are API keys, passwords, or digital certificates, the secrets used to authenticate access must remain secure.
The open nature and convenience of Git repositories are often undermined by human error.
Lack of education on security practices, inattention to detail, or plain carelessness is leading to public secret exposure on an unprecedented scale. Thousands of secrets leak daily on public Git repositories, including over two million corporate secrets in 2020 alone.
As awareness of this issue grew, new tools and technologies emerged to provide additional security layers throughout the SDLC. Before listing the Git secret scanning solutions you should know, let’s first educate ourselves on what they are, exactly, and look at some cautionary tales to find out just what can happen if secret scanning is neglected.
What is Git secret scanning?
There are two modalities to Git secret scanning, each covering different phases of the CI/CD pipeline.
Prevention is key
The first modality attempts to prevent secrets from ever leaking in the first place. By integrating into the CI/CD pipeline and monitoring developers’ actions in real-time, an accidental code-commit containing a secret may be intercepted before it even has a chance to become publicly exposed.
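As a rough illustration of the prevention modality, the sketch below checks only the lines a commit would add and reports any that match a known secret pattern. The rule names and patterns are illustrative assumptions, not the rule set of any tool in this article. In a real pre-commit hook the input would typically be the output of `git diff --cached`, and the hook would exit with a non-zero status when findings exist.

```python
import re

# Illustrative patterns only (assumptions); real scanners ship far larger rule sets.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"),
    "hardcoded_password": re.compile(r"(?i)\bpassword\s*[:=]\s*['\"][^'\"]{6,}['\"]"),
}


def find_secrets_in_diff(diff_text):
    """Return (rule_name, line) pairs for added lines that match a pattern."""
    findings = []
    for line in diff_text.splitlines():
        # Only lines added by the commit matter; skip the "+++ b/file" header.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        added = line[1:]
        for name, pattern in PATTERNS.items():
            if pattern.search(added):
                findings.append((name, added.strip()))
    return findings
```

Scanning only added lines keeps the check fast enough to run on every commit, at the cost of ignoring secrets that already sit in the history.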
Timely detection averts disaster
The second modality attempts to detect secrets that may have already been exposed. Such secrets may have been missed because of lax security practices, exposed through a developer’s personal account, or only become detectable with newer scanning algorithms. Secret detection is therefore an ever-evolving process that must be updated regularly.
Detection is not limited to security solutions. Bad actors consistently use Git scanning technologies to extract secrets from public and badly configured Git repositories that may contain information useful for an exploit. Without scanning tools as powerful as the ones attackers use, you may simply be unaware that your secrets have already been leaked.
Why you need secret scanning in your SDLC
It is hard to overstate the importance of secret leakage prevention. A misplaced key or an accidentally leaked database password can become an instant crisis, often with a painful associated cost.
Top 9 secret scanning solutions for DevSecOps
Clearly, no one wants to be on the receiving end of a Git secret leak. To help you get started protecting secrets in your code, we’ve listed the top nine Git secret scanning solutions you can add to your SecOps toolbelt.
1. gitLeaks
gitLeaks is an open-source static analysis command-line tool released under the MIT license. It is used to detect hard-coded secrets like passwords, API keys, and tokens in local and GitHub repositories (private and public).
gitLeaks uses regular expressions and string entropy checks to detect secrets based on custom rules, and exports reports in JSON, SARIF, or CSV format. It can scan commit history and hook into your CI/CD pipeline.
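To make the entropy idea concrete, here is a minimal sketch of entropy-based detection. It is not gitLeaks' implementation: the token regex and the 4.0 threshold are assumptions chosen for illustration. The intuition is that randomly generated keys use many different characters with roughly equal frequency, so their Shannon entropy is higher than that of ordinary identifiers or repeated filler.

```python
import math
import re
from collections import Counter


def shannon_entropy(text):
    """Shannon entropy of a string, in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Candidate tokens: long runs of characters commonly found in keys (an assumption).
TOKEN = re.compile(r"[A-Za-z0-9+/=_\-]{20,}")


def high_entropy_tokens(line, threshold=4.0):
    """Return candidate tokens on a line whose entropy meets the threshold."""
    return [t for t in TOKEN.findall(line) if shannon_entropy(t) >= threshold]
```

A threshold that is too low floods the report with false positives such as long file paths, while one that is too high misses short or low-variety secrets, which is why scanners usually combine entropy with pattern rules.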
Pros:
gitLeaks is an open-source project that is free to use and actively developed with more than 50 contributors. gitLeaks includes integration, audit, and cloning features that are not available in most open-source projects.
Cons:
With no user interface and limited integration options, gitLeaks is mostly suitable for security professionals, researchers, or niche development projects.
2. Spectral
Spectral offers one of the most comprehensive secret scanning solutions, integrating into every facet of the build process. Whether it’s a static build, pre-commit to Git, or CI integration, Spectral offers simple integration options that can be enhanced using plugins and hooks.
Another interesting feature is Spectral’s ability to scan Git repositories not just for configuration issues and secrets lurking in the code, but also for logs, binaries, and other data in the codebase which you may not intuitively think of as a potential leak source.
Pros:
Spectral uses an intuitive user interface that makes it much more accessible and suitable for corporate management. The AI and machine learning algorithms behind Spectral’s secret scanning technology are designed so that detection rates increase and false-positive rates decrease over time as the system processes more data.
Cons:
Spectral is not well suited to small projects or single developers. It is designed for a development team collaborating on a large codebase.
3. Git-Secrets
Git-Secrets is an open-source command-line tool used to scan developer commits and “--no-ff” merges to prevent secrets from accidentally entering Git repositories. If a commit or merge matches a prohibited regular expression pattern, the commit is rejected.
Pros:
Git-Secrets can integrate into the CI/CD pipeline to monitor commits in real-time. One of Git-Secrets’ distinctive security-centric features is support for “Secret Providers”, which can prevent secrets from ever showing up in a commit.
Cons:
Git-Secrets uses fairly simple detection algorithms that rely mainly on regular expressions, which can often produce many false positives. The project is no longer maintained on a regular basis and may not be suitable for use in a professional development environment.
4. Whispers
Whispers is an open-source static code analysis tool designed to search for hardcoded credentials and dangerous functions.
It can run as a command-line tool or be integrated into your CI/CD pipeline. The tool is designed to parse structured text such as YAML, JSON, XML, npmrc, .pypirc, .htpasswd, .properties, pip.conf, conf / ini, Dockerfile, shell scripts, and Python3 (as AST), as well as declaration and assignment formats for JavaScript, Java, Go, and PHP.
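The advantage of parsing structured text rather than matching raw lines is that the scanner knows which value belongs to which key. The sketch below shows that idea for parsed JSON only. It is not Whispers' code: the function name and the list of key hints are hypothetical.

```python
import json

# Substrings that suggest a key holds a secret (illustrative assumption).
SENSITIVE_KEY_HINTS = ("password", "secret", "token", "apikey", "api_key")


def find_sensitive_values(node, path=""):
    """Walk parsed JSON and return (path, value) pairs for suspicious keys."""
    findings = []
    if isinstance(node, dict):
        for key, value in node.items():
            child_path = f"{path}.{key}" if path else key
            looks_sensitive = any(hint in key.lower() for hint in SENSITIVE_KEY_HINTS)
            if isinstance(value, str) and value and looks_sensitive:
                findings.append((child_path, value))
            else:
                # Descend into nested objects and arrays.
                findings.extend(find_sensitive_values(value, child_path))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            findings.extend(find_sensitive_values(item, f"{path}[{index}]"))
    return findings


config = json.loads(
    '{"db": {"host": "localhost", "password": "s3cr3t"},'
    ' "services": [{"api_key": "abc123"}, {"name": "x"}]}'
)
findings = find_sensitive_values(config)
```

Reporting the full path to each value makes a finding easier to locate in a large configuration file than a bare line number would.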
Pros:
Right out of the box, Whispers supports a wide range of secret detection formats, covering Passwords, AWS keys, API Tokens, Sensitive files, Dangerous functions, and more. Beyond its native capabilities, Whispers includes a plug-in system that can be used to further extend its scanning capabilities to new file formats.
Cons:
Whispers is designed to accompany other secret scanning solutions. It does not perform deep scans on actual code and mostly focuses on structured text files. Scanning rules are based on a limited combination of regular expressions, Base64, and ASCII detection.
5. GitHub Secret Scanning
When using GitHub as your public repository, GitHub makes available its own integrated secret scanning solution, capable of detecting popular API key and token structures. To scan private repositories, you are required to obtain an Advanced Security license. You can extend the detection algorithm by supplying regular expressions to detect custom secret string structures.
Pros:
Using GitHub’s user interface makes it a lot easier to visualize the scanning, configuration, and integration process. Extensive support for the API key and token formats of many popular web services is included with the service, offering a strong starting point for any security evaluation.
Cons:
Secret scanning for private repositories is currently in beta. The service as a whole has a very narrow focus, mostly targeting known string structures such as API Keys and Tokens while ignoring other secrets such as database passwords, email addresses, administrative URLs, etc.
6. Gittyleaks
Gittyleaks is a straightforward Git secrets scanner command line tool capable of scanning and cloning repositories. It attempts to discover usernames, passwords, and emails that should not be included in code or configuration files.
Pros:
Gittyleaks is a simple tool that can be used to quickly scan repositories for obvious secrets. Its simplicity helps introduce the concept of secret scanning without the more complex configuration required by other solutions.
Cons:
Due to its simplicity and fixed rules, Gittyleaks is mostly useful as an introductory tool to help educate users about secrets in code. It lacks the features and flexibility required by commercial development teams.
7. Scan
Scan is a comprehensive open-source security audit tool. It provides strong integration with popular repositories and pipelines such as Azure, BitBucket, GitHub, GitLab, Jenkins, TeamCity, and many more.
Scan also supports a broad section of popular frameworks and languages, integrates into the CI/CD pipeline to provide real-time commit protection, and provides extensive reporting capabilities.
Pros:
Due to its well-maintained open-source nature, Scan is possibly one of the most powerful and flexible DevSecOps tools you can get for free.
Cons:
While Scan is indeed powerful and flexible, its sparse user interface and complex setup ensure that only a limited number of security experts will be truly capable of extracting the best results from Scan’s feature set.
8. Git-all-secrets
Git-all-secrets is an open-source secret scanner aggregation project. This tool currently relies on two open-source secret scanning projects, truffleHog and repo-supervisor, both of which use regular expression and high-entropy secret detection algorithms. Git-all-secrets aggregates the combined results of both scanners to present a more comprehensive picture.
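The aggregation idea can be sketched as merging findings from several scanners and recording which scanners agree on each one. The `Finding` structure and scanner names below are hypothetical and do not reflect the actual output formats of truffleHog, repo-supervisor, or Git-all-secrets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """A hypothetical normalized finding; frozen so it can be used as a dict key."""
    file: str
    line: int
    secret: str


def aggregate(results_by_scanner):
    """Merge findings and map each unique one to the scanners that reported it."""
    merged = {}
    for scanner, findings in results_by_scanner.items():
        for finding in findings:
            merged.setdefault(finding, set()).add(scanner)
    return {finding: sorted(names) for finding, names in merged.items()}


a = Finding("app.py", 3, "AKIAIOSFODNN7EXAMPLE")
b = Finding(".env", 1, "hunter2hunter2")
merged = aggregate({"scanner_a": [a, b], "scanner_b": [a]})
```

A finding reported by more than one scanner is a reasonable candidate for higher priority, while one reported by a single scanner may deserve a closer look for false positives.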
Pros:
Git-all-secrets introduces an interesting concept that tries to enhance secret scanning results by not relying on a single algorithm.
Cons:
While the approach is novel, Git-all-secrets’ underlying scanning still relies on basic algorithms, and the project is no longer actively maintained. This tool is currently more of a proof of concept that other projects may build on in the future.
9. Detect-secrets
Detect-secrets is an actively maintained open-source project designed with the enterprise client in mind.
It was created to prevent new secrets from entering the code base, detect if preventions are explicitly bypassed, and provide a checklist of secrets to maintain in a secure storage. Detect-secrets works by running periodic comparisons against heuristically crafted regular expression statements to identify new secrets that may have been committed.
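The comparison step can be pictured as checking current findings against a stored baseline of already-known secrets, so that only new ones are flagged. The sketch below is an assumption-laden illustration, not detect-secrets' actual baseline format or API: the function names are hypothetical, and hashing is used here simply so the baseline does not itself contain plaintext secrets.

```python
import hashlib


def fingerprint(filename, secret):
    """Identify a known secret by file and hash, without storing the plaintext."""
    digest = hashlib.sha256(secret.encode("utf-8")).hexdigest()
    return (filename, digest)


def new_secrets(baseline, current_findings):
    """Return findings whose fingerprint is absent from the baseline set."""
    return [
        (filename, secret)
        for filename, secret in current_findings
        if fingerprint(filename, secret) not in baseline
    ]


baseline = {fingerprint("settings.py", "old-legacy-password")}
current = [("settings.py", "old-legacy-password"), ("deploy.sh", "fresh-token-123")]
fresh = new_secrets(baseline, current)
```

Because known secrets are acknowledged once in the baseline, the team can block new leaks immediately while rotating the legacy ones on its own schedule.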
Pros:
Detect-secrets’ scanning method avoids the overhead of scanning through entire git histories, as well as the need to scan the entire repository every single time. The plugin support is excellent, with 18 different plugins currently available, spanning AWS keys, Entropy Strings, Base64 encoding, Azure Keys, and many more.
Cons:
The pre-commit hook implements only basic heuristics to try and prevent obvious secrets from being committed. If secrets are split across multiple lines or do not include enough entropy, they may not be detected in real-time.
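The multi-line limitation is easy to reproduce with any line-based pattern scanner. The sketch below uses a single illustrative pattern (an assumption, not a rule taken from detect-secrets) and shows the same key being caught on one line but missed when it is split across two string literals.

```python
import re

# One illustrative rule: the documented shape of an AWS access key ID.
PATTERN = re.compile(r"AKIA[0-9A-Z]{16}")


def scan_lines(source):
    """Scan each line independently, as a simple line-based hook would."""
    return [
        match.group(0)
        for line in source.splitlines()
        for match in PATTERN.finditer(line)
    ]


single = 'key = "AKIAIOSFODNN7EXAMPLE"'
split = 'key = ("AKIAIOSF"\n       "ODNN7EXAMPLE")'

found_single = scan_lines(single)  # the key is matched
found_split = scan_lines(split)    # nothing is matched
```

Neither half of the split key satisfies the pattern on its own, which is why periodic deeper scans remain useful alongside a fast pre-commit check.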
Summary
It is blatantly obvious that actively scanning Git repositories and developer commits to prevent secrets from leaking should become a mandatory part of every company’s software development pipeline.
The examples of poorly managed code security in this article are just the tip of the iceberg. Every day, personally identifiable information and private intellectual property are leaked and exploited by malicious actors. These leaks often result from lax code security practices or simple human error.
You can mitigate many of these issues by integrating secret scanning technology right into the CI/CD pipeline and by actively scanning the Git repositories associated with these projects.