AI in AppSec
November 19, 2024

Next-Level Application Security: Leveraging LLMs in Practice

In our last blog post, “One Year of Using LLMs for Application Security: What We Learned,” I shared some of the good and bad discoveries we made while trying to use this emerging technology to solve long-standing problems in the application security space.

Today I’m going to delve into some of the persistent problems that have plagued the application security community and discuss where LLMs can help. I’ll also cover how we are leveraging them here at DryRun Security.

Release Processes and Security Reviews

The release process is designed to incorporate valuable feedback and insights from all relevant parties during the earliest stages of software development—ideally even before any code is written. This approach works reasonably well for new applications, services, or major features. However, security practitioners must also contend with security-critical changes that land multiple times a day or week.

These changes are typically smaller in scope and may not qualify for the formal release process. As a result, they bypass standard procedures, leaving security teams unaware of potential risks introduced into the system. In some cases, even significant changes that should undergo the release process are skipped for various reasons.

We've found that LLMs excel at understanding and summarizing code changes. By aggregating all changes across an organization, LLMs can identify modifications that security practitioners need to know about, even when these changes are dispersed across multiple repositories and authors.

We refer to this capability as Code Insights. Even in its initial implementation, this feature has uncovered critical and previously unknown changes within our customers' organizations. Here are some real-world examples:

  • Overhaul of SSO and conversion to AWS SSO OIDC
  • Switch to a new payment gateway provider
  • Use of 57k SSNs as test data for a new service
  • Introduction of new marketing widgets that required changes to the Content Security Policy
  • Replacement of a crypto library

Some of these changes should have gone through the release process but did not; others were deemed too minor to qualify. In all cases, being aware of these changes is crucial for the security team to adequately protect the organization.
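
To make the idea concrete, here is a minimal sketch of how a security-focused change summary might be produced with an off-the-shelf LLM client. The prompt wording, model choice, and helper names are my own illustrative assumptions, not DryRun Security's actual implementation:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set; model choice is illustrative

SYSTEM_PROMPT = (
    "You are a security analyst. Summarize this code change in one sentence, "
    "then note whether it touches auth, payments, PII, crypto, or the CSP."
)

def summarize_change(diff: str) -> str:
    """Produce a one-line, security-focused summary of a single PR diff."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": diff},
        ],
    )
    return resp.choices[0].message.content

# Aggregation step: run this over every merged PR across the organization's
# repositories and collect the summaries into a daily digest for review.
```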

Advanced Code Scanning for Complex Issues

Code scanning using LLMs is a topic of active debate. Some argue that non-deterministic analysis cannot be trusted, while others believe LLMs represent the future of code analysis. 

Regardless of your stance, it's evident that traditional Static Application Security Testing (SAST) tools often produce noisy, low-value results and lack contextual understanding. 

Given these limitations, exploring new approaches is not just justified but necessary—which is precisely what we've done.

Our thesis posits that vulnerabilities are only one marker of risk. Sometimes, vulnerabilities are too nuanced for traditional tools to detect, but there are indicators suggesting that a human should investigate further. This led us to address two key challenges:

  1. Developing a Robust Code Scanning Engine: Utilizing LLMs as a core component.
  2. Surfacing Additional Risk Markers: Analyzing the "Who, What, Where, and Why" of code changes.

By combining these efforts, we achieve a more accurate picture of the risks introduced by code changes. We call this approach Contextual Security Analysis and employ a methodology known as S.L.I.D.E.

LLMs are adept at text summarization and understanding the intent behind code changes, which helps answer the "What" and "Why." We then use deterministic techniques to ascertain the "Who" and "Where," allowing us to build a comprehensive profile of code changes and extract risk markers.
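
The S.L.I.D.E. methodology itself isn't detailed here, but as a rough sketch of the general idea, a change profile might pair deterministic SCM metadata with LLM output like this (the `pr` dictionary shape and the injected `ask_llm` callable are assumptions for illustration):

```python
from dataclasses import dataclass

@dataclass
class ChangeProfile:
    who: str    # deterministic: the PR author, from the SCM API
    where: str  # deterministic: the files and paths touched by the diff
    what: str   # LLM-derived: a summary of what the change does
    why: str    # LLM-derived: the inferred intent behind the change

def build_profile(pr: dict, ask_llm) -> ChangeProfile:
    """Combine deterministic metadata with LLM summarization into one profile."""
    diff = pr["diff"]
    return ChangeProfile(
        who=pr["author"],
        where=", ".join(pr["files"]),
        what=ask_llm(f"Summarize what this code change does:\n{diff}"),
        why=ask_llm(f"Infer the likely intent behind this change:\n{diff}"),
    )
```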

Enhancing Vulnerability Detection

When it comes to code scanning, LLMs offer significant advantages. Like a human reviewer, an LLM can analyze a piece of code and ask additional questions if necessary. There are two well-known methods to provide LLMs with the context they need:

  1. Retrieval Augmented Generation (RAG): Allows you to prefill the LLM with relevant information.
  2. Agent-Based Architecture: Allows the LLM to decide when and which agent to use for additional information.
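
As a toy illustration of the first method, RAG amounts to retrieving the most relevant snippets up front and prefilling them into the prompt. The `retrieve` callable here stands in for whatever vector or keyword index you use:

```python
def rag_prompt(question: str, retrieve) -> str:
    """Prefill the prompt with retrieved context before the LLM answers."""
    # retrieve() is any search over your code/docs index (vector DB,
    # keyword search, etc.) returning the top-k relevant snippets.
    snippets = retrieve(question, k=3)
    context = "\n\n".join(snippets)
    return (
        "Use only the context below to answer.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```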

One other note regarding the questions themselves: they shouldn’t be broad. You will often need to ask several very discrete questions to get an accurate answer. Asking “Is this code vulnerable in any way?” is far less effective than asking a chain like the following (a code sketch follows the list):

  1. Does this code use any known dangerous functions?
  2. Is user-supplied input being passed to these functions?
  3. Is this input incorporated in an unsafe manner?
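
A minimal sketch of that decomposition, with the three questions asked as a chain and an early exit when a link breaks (the exact wording and the yes/no convention are illustrative, and `ask_llm` is an injected helper):

```python
QUESTIONS = [
    "Does this code use any known dangerous functions?",
    "Is user-supplied input being passed to these functions?",
    "Is this input incorporated in an unsafe manner?",
]

def triage(code: str, ask_llm) -> list[str]:
    """Ask narrow questions in sequence, stopping early if the chain breaks."""
    findings = []
    for question in QUESTIONS:
        answer = ask_llm(
            f"{question} Answer 'yes' or 'no', then explain.\n\nCode:\n{code}"
        )
        findings.append(f"{question} -> {answer}")
        if answer.strip().lower().startswith("no"):
            break  # no dangerous sink or no tainted input: later questions are moot
    return findings
```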

Addressing Complex Authorization Flaws

Traditional SAST tools struggle to detect authorization flaws, such as Insecure Direct Object References (IDOR) or Broken Object Level Authorization (BOLA). We see great promise in using agent-based architectures to tackle these issues. For example, while both an LLM and a SAST tool can detect patterns that resemble IDOR, only an LLM with agent-based capabilities can follow up with questions like these (sketched in code after the list):

  1. Are there authorization decorators present in the changed code that contains the vulnerable pattern?
  2. What are the names of those decorators?
  3. What do those authorization decorators do? (The agent now searches the code base to extract their definitions.)
  4. Do the authorization decorators prevent the potentially vulnerable code from being exploitable?
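
A simplified sketch of what that agent loop might look like. The grep-based tool and prompt wording are invented for illustration; a real agent framework would expose the search as a tool the LLM can call on demand:

```python
import subprocess

def find_decorator_definition(name: str) -> str:
    """Tool the agent can invoke: search the code base for a decorator's definition."""
    result = subprocess.run(
        ["git", "grep", "-n", f"def {name}"], capture_output=True, text=True
    )
    return result.stdout or f"No definition found for {name}"

def assess_idor(snippet: str, ask_llm) -> str:
    """Name the decorators, fetch their definitions, then judge exploitability."""
    names = ask_llm(
        "List the names of any authorization decorators on this code, "
        f"comma-separated, or reply 'none':\n{snippet}"
    )
    if names.strip().lower() == "none":
        return "No auth decorators found; treat the IDOR pattern as likely exploitable."
    definitions = "\n".join(
        find_decorator_definition(n.strip()) for n in names.split(",")
    )
    return ask_llm(
        "Given these decorator definitions, do they prevent the IDOR pattern "
        "in the snippet from being exploitable?\n\n"
        f"Definitions:\n{definitions}\n\nSnippet:\n{snippet}"
    )
```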

This simulated intelligence allows for nuanced analysis that traditional SAST tools can’t achieve, uncovering vulnerabilities that would otherwise remain hidden.

Security Assistants for Developers

Chatbots and LLMs are all the rage these days, so it should be no surprise that they offer security staff use cases that can help optimize their work and extend their team.

While security teams encourage developers to ask questions, it's beneficial when routine queries can be addressed without direct human intervention.

Our implementation answers questions directly from the Source Code Management (SCM) system (e.g., GitHub Pull Requests or Issues), enabling developers to receive answers based on your organization's documentation—and only that documentation.

This approach is advantageous because it meets developers where they are and provides contextually relevant information. Security teams maintain tight control over the types of questions that can be answered and the information used, effectively enabling first-level triage without the need for immediate human involvement.
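
A rough sketch of the doc-grounded answering step; the webhook plumbing is omitted, and `search_docs` and `ask_llm` are placeholders for an index over the organization's security documentation and an LLM call (this is not DryRun Security's actual implementation):

```python
def answer_from_docs(question: str, search_docs, ask_llm) -> str:
    """Answer a developer's PR question using only the org's own documentation."""
    passages = search_docs(question, k=5)  # e.g., a vector index over the docs
    if not passages:
        return "No relevant documentation found; escalating to the security team."
    context = "\n\n".join(passages)
    return ask_llm(
        "Answer using ONLY the documentation below. If it does not cover the "
        "question, say so and recommend contacting the security team.\n\n"
        f"Documentation:\n{context}\n\nQuestion: {question}"
    )
```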

Conclusion

We're continuously exploring additional ways to solve customer problems, whether with LLM technology or through other means. As new features become more fully developed and validated at scale, we'll share those use cases as well.

Whether it's scaling your team, identifying risks in code changes, empowering developers, or highlighting significant security-impacting modifications, I hope this article has illustrated the transformative potential of LLMs within AppSec.

Are you interested in further discussion? Schedule some time with me and I’d be happy to chat with you about LLMs in AppSec.