AI in AppSec
November 19, 2024

Next-Level Application Security: Leveraging LLMs in Practice

In our last blog post, “One Year of Using LLMs for Application Security: What We Learned,” I shared some of the good and bad discoveries we made while trying to use this emerging technology to solve long-standing problems in the application security space.

Today I’m going to delve into some of the persistent problems that have plagued the application security community and discuss where LLMs can help. I’ll also cover how we are leveraging them here at DryRun Security.

Release Processes and Security Reviews

The release process is designed to incorporate valuable feedback and insights from all relevant parties during the earliest stages of software development—ideally even before any code is written. This approach works reasonably well for new applications, services, or major features. However, security practitioners must also contend with security-critical changes that land multiple times a day or week.

These changes are typically smaller in scope and may not qualify for the formal release process. As a result, they bypass standard procedures, leaving security teams unaware of potential risks introduced into the system. In some cases, even significant changes that should undergo the release process are skipped for various reasons.

We've found that LLMs excel at understanding and summarizing code changes. By aggregating all changes across an organization, LLMs can identify modifications that security practitioners need to know about, even when these changes are dispersed across multiple repositories and authors.

We refer to this capability as Code Insights. Even in its initial implementation, this feature has uncovered critical and previously unknown changes within our customers' organizations. Here are some real-world examples:

  • Overhaul of SSO and conversion to AWS SSO OIDC
  • Switch to a new payment gateway provider
  • Use of 57k SSNs as test data for a new service
  • Introduction of new marketing widgets that required changes to the Content Security Policy
  • Replacement of a crypto library

Some of these changes should have gone through the release process but did not; others were deemed too minor to qualify. In all cases, being aware of these changes is crucial for the security team to adequately protect the organization.
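
To make the idea concrete, here is a minimal sketch of how a security-focused change summary might be produced with an off-the-shelf LLM client. The prompt wording, model choice, and helper names are my own illustrative assumptions, not DryRun Security's actual implementation:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set; model choice is illustrative

SYSTEM_PROMPT = (
    "You are a security analyst. Summarize this code change in one sentence, "
    "then note whether it touches auth, payments, PII, crypto, or the CSP."
)

def summarize_change(diff: str) -> str:
    """Produce a one-line, security-focused summary of a single PR diff."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": diff},
        ],
    )
    return resp.choices[0].message.content

# Aggregation step: run this over every merged PR across the organization's
# repositories and collect the summaries into a daily digest for review.
```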

Advanced Code Scanning for Complex Issues

Code scanning using LLMs is a topic of active debate. Some argue that non-deterministic analysis cannot be trusted, while others believe LLMs represent the future of code analysis. 

Regardless of your stance, it's evident that traditional Static Application Security Testing (SAST) tools often produce noisy, low-value results and lack contextual understanding. 

Given these limitations, exploring new approaches is not just justified but necessary—which is precisely what we've done.

Our thesis posits that vulnerabilities are only one marker of risk. Sometimes, vulnerabilities are too nuanced for traditional tools to detect, but there are indicators suggesting that a human should investigate further. This led us to address two key challenges:

  1. Developing a Robust Code Scanning Engine: Utilizing LLMs as a core component.
  2. Surfacing Additional Risk Markers: Analyzing the "Who, What, Where, and Why" of code changes.

By combining these efforts, we achieve a more accurate picture of the risks introduced by code changes. We call this approach Contextual Security Analysis and employ a methodology known as S.L.I.D.E.

LLMs are adept at text summarization and understanding the intent behind code changes, which helps answer the "What" and "Why." We then use deterministic techniques to ascertain the "Who" and "Where," allowing us to build a comprehensive profile of code changes and extract risk markers.
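
The S.L.I.D.E. methodology itself isn't detailed here, but as a rough sketch of the general idea, a change profile might pair deterministic SCM metadata with LLM output like this (the `pr` dictionary shape and the injected `ask_llm` callable are assumptions for illustration):

```python
from dataclasses import dataclass

@dataclass
class ChangeProfile:
    who: str    # deterministic: the PR author, from the SCM API
    where: str  # deterministic: the files and paths touched by the diff
    what: str   # LLM-derived: a summary of what the change does
    why: str    # LLM-derived: the inferred intent behind the change

def build_profile(pr: dict, ask_llm) -> ChangeProfile:
    """Combine deterministic metadata with LLM summarization into one profile."""
    diff = pr["diff"]
    return ChangeProfile(
        who=pr["author"],
        where=", ".join(pr["files"]),
        what=ask_llm(f"Summarize what this code change does:\n{diff}"),
        why=ask_llm(f"Infer the likely intent behind this change:\n{diff}"),
    )
```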

Enhancing Vulnerability Detection

When it comes to code scanning, LLMs offer significant advantages. Like a human reviewer, an LLM can analyze a piece of code and ask additional questions if necessary. There are two well-known methods to provide LLMs with the context they need:

  1. Retrieval Augmented Generation (RAG): Allows you to prefill the LLM with relevant information.
  2. Agent-Based Architecture: Allows the LLM to decide when and which agent to use for additional information.
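
As a toy illustration of the first method, RAG amounts to retrieving the most relevant snippets up front and prefilling them into the prompt. The `retrieve` callable here stands in for whatever vector or keyword index you use:

```python
def rag_prompt(question: str, retrieve) -> str:
    """Prefill the prompt with retrieved context before the LLM answers."""
    # retrieve() is any search over your code/docs index (vector DB,
    # keyword search, etc.) returning the top-k relevant snippets.
    snippets = retrieve(question, k=3)
    context = "\n\n".join(snippets)
    return (
        "Use only the context below to answer.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```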

One other note regarding the questions themselves: they shouldn’t be broad. You will often need to ask several very discrete questions to get an accurate answer. Asking “Is this code vulnerable in any way?” is far less effective than asking a chain like the following (a code sketch follows the list):

  1. Does this code use any known dangerous functions?
  2. Is user-supplied input being passed to these functions?
  3. Is this input incorporated in an unsafe manner?
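
A minimal sketch of that decomposition, with the three questions asked as a chain and an early exit when a link breaks (the exact wording and the yes/no convention are illustrative, and `ask_llm` is an injected helper):

```python
QUESTIONS = [
    "Does this code use any known dangerous functions?",
    "Is user-supplied input being passed to these functions?",
    "Is this input incorporated in an unsafe manner?",
]

def triage(code: str, ask_llm) -> list[str]:
    """Ask narrow questions in sequence, stopping early if the chain breaks."""
    findings = []
    for question in QUESTIONS:
        answer = ask_llm(
            f"{question} Answer 'yes' or 'no', then explain.\n\nCode:\n{code}"
        )
        findings.append(f"{question} -> {answer}")
        if answer.strip().lower().startswith("no"):
            break  # no dangerous sink or no tainted input: later questions are moot
    return findings
```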

Addressing Complex Authorization Flaws

Traditional SAST tools struggle to detect authorization flaws, such as Insecure Direct Object References (IDOR) or Broken Object Level Authorization (BOLA). We see great promise in using agent-based architectures to tackle these issues. For example, while both an LLM and a SAST tool can detect patterns that resemble IDOR, only an LLM with agent-based capabilities can follow up with questions like these (sketched in code after the list):

  1. Are there authorization decorators present in the changed code that contains the vulnerable pattern?
  2. What are the names of those decorators?
  3. What do those authorization decorators do? (The agent now searches the code base to extract their definitions.)
  4. Do the authorization decorators prevent the potentially vulnerable code from being exploitable?
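
A simplified sketch of what that agent loop might look like. The grep-based tool and prompt wording are invented for illustration; a real agent framework would expose the search as a tool the LLM can call on demand:

```python
import subprocess

def find_decorator_definition(name: str) -> str:
    """Tool the agent can invoke: search the code base for a decorator's definition."""
    result = subprocess.run(
        ["git", "grep", "-n", f"def {name}"], capture_output=True, text=True
    )
    return result.stdout or f"No definition found for {name}"

def assess_idor(snippet: str, ask_llm) -> str:
    """Name the decorators, fetch their definitions, then judge exploitability."""
    names = ask_llm(
        "List the names of any authorization decorators on this code, "
        f"comma-separated, or reply 'none':\n{snippet}"
    )
    if names.strip().lower() == "none":
        return "No auth decorators found; treat the IDOR pattern as likely exploitable."
    definitions = "\n".join(
        find_decorator_definition(n.strip()) for n in names.split(",")
    )
    return ask_llm(
        "Given these decorator definitions, do they prevent the IDOR pattern "
        "in the snippet from being exploitable?\n\n"
        f"Definitions:\n{definitions}\n\nSnippet:\n{snippet}"
    )
```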

This simulated intelligence allows for nuanced analysis that traditional SAST tools can’t achieve, uncovering vulnerabilities that would otherwise remain hidden.

Security Assistants for Developers

Chatbots and LLMs are all the rage these days, so it should be no surprise that they offer security staff use cases that can help optimize their work and extend their team.

While security teams encourage developers to ask questions, it's beneficial when routine queries can be addressed without direct human intervention.

Our implementation answers questions directly from the Source Code Management (SCM) system (e.g., GitHub Pull Requests or Issues), enabling developers to receive answers based on your organization's documentation—and only that documentation.

This approach is advantageous because it meets developers where they are and provides contextually relevant information. Security teams maintain tight control over the types of questions that can be answered and the information used, effectively enabling first-level triage without the need for immediate human involvement.
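
A rough sketch of the doc-grounded answering step; the webhook plumbing is omitted, and `search_docs` and `ask_llm` are placeholders for an index over the organization's security documentation and an LLM call (this is not DryRun Security's actual implementation):

```python
def answer_from_docs(question: str, search_docs, ask_llm) -> str:
    """Answer a developer's PR question using only the org's own documentation."""
    passages = search_docs(question, k=5)  # e.g., a vector index over the docs
    if not passages:
        return "No relevant documentation found; escalating to the security team."
    context = "\n\n".join(passages)
    return ask_llm(
        "Answer using ONLY the documentation below. If it does not cover the "
        "question, say so and recommend contacting the security team.\n\n"
        f"Documentation:\n{context}\n\nQuestion: {question}"
    )
```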

Conclusion

We're continuously exploring additional ways to solve customer problems, whether with LLM technology or through other means. As new features become more fully developed and validated at scale, we'll share those use cases as well.

Whether it's scaling your team, identifying risks in code changes, empowering developers, or highlighting significant security-impacting modifications, I hope this article has illustrated the transformative potential of LLMs within AppSec.

Are you interested in further discussion? Schedule some time with me and I’d be happy to chat with you about LLMs in AppSec.