# Semgrep: Lightweight Static Analysis for Any Language

> Semgrep is a fast, open-source static analysis tool that finds bugs and security issues using patterns that look like source code. Write rules in a syntax similar to the code you are searching, with no complex AST queries or regex needed.

## Quick Use

```bash
# Install Semgrep
pip install semgrep
# Or: brew install semgrep

# Scan with community rules
semgrep --config auto .

# Scan for security issues
semgrep --config p/security-audit .

# Scan for OWASP Top 10
semgrep --config p/owasp-top-ten .

# Scan specific languages
semgrep --config p/python .
semgrep --config p/javascript .
```

## Introduction

Semgrep makes static analysis accessible. Traditional SAST tools require learning complex query languages or AST manipulation. Semgrep rules look like the code you are searching for: write a pattern in Python syntax to find Python bugs, or in JavaScript syntax to find JavaScript issues. If you can read code, you can write Semgrep rules.

With over 15,000 GitHub stars, Semgrep supports 30+ languages and has a community rule registry with thousands of pre-built rules for security, correctness, and best practices. It is used by Dropbox, Figma, Snowflake, and hundreds of security teams.

## What Semgrep Does

Semgrep parses source code into an AST and matches patterns against it. Patterns use a code-like syntax with metavariables (`$X`) as wildcards and the ellipsis operator (`...`) to match any sequence of code. It finds bugs, security vulnerabilities, and anti-patterns, and it enforces coding standards, all without compiling the code.

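
To make the idea of matching patterns against an AST more concrete, here is a minimal toy sketch in Python. It is not how Semgrep is implemented (the real engine is written in OCaml and handles far more syntax); it only illustrates the two ideas described above using Python's standard `ast` module. The names `find_matches`, `match`, and `META_PREFIX` are hypothetical, and the sketch assumes Python 3.9+ for `ast.unparse`.

```python
import ast

# Hypothetical stand-in: `$X` is not a legal Python name, so the toy rewrites
# it to METAVAR_X before parsing. Semgrep itself parses `$X` natively.
META_PREFIX = "METAVAR_"


def parse_pattern(pattern):
    """Parse a single-expression pattern into a Python AST node."""
    return ast.parse(pattern.replace("$", META_PREFIX), mode="eval").body


def is_ellipsis(item):
    """True for the literal `...`, used here as the 'match anything' operator."""
    return isinstance(item, ast.Constant) and item.value is Ellipsis


def match(pattern, node, bindings):
    """Return True if `node` matches `pattern`, recording metavariable bindings."""
    if isinstance(pattern, ast.Name) and pattern.id.startswith(META_PREFIX):
        # A metavariable matches any expression, but the same metavariable
        # must bind to the same source text everywhere in the pattern.
        name = "$" + pattern.id[len(META_PREFIX):]
        text = ast.unparse(node)
        return bindings.setdefault(name, text) == text
    if is_ellipsis(pattern):
        return True
    if type(pattern) is not type(node):
        return False
    return all(
        match_value(value, getattr(node, field, None), bindings)
        for field, value in ast.iter_fields(pattern)
    )


def match_value(p_value, n_value, bindings):
    """Match one AST field, which may be a list, a node, or a plain value."""
    if isinstance(p_value, list):
        return isinstance(n_value, list) and match_list(p_value, n_value, bindings)
    if isinstance(p_value, ast.AST):
        return isinstance(n_value, ast.AST) and match(p_value, n_value, bindings)
    return p_value == n_value


def match_list(p_items, n_items, bindings):
    """Match two lists of fields; `...` absorbs zero or more items."""
    if not p_items:
        return not n_items
    head, rest = p_items[0], p_items[1:]
    if is_ellipsis(head):
        for skip in range(len(n_items) + 1):
            trial = dict(bindings)  # copy so a failed attempt leaves no bindings
            if match_list(rest, n_items[skip:], trial):
                bindings.update(trial)
                return True
        return False
    if not n_items:
        return False
    return match_value(head, n_items[0], bindings) and match_list(rest, n_items[1:], bindings)


def find_matches(pattern, source):
    """Yield (line, bindings) for each expression in `source` that matches."""
    parsed = parse_pattern(pattern)
    for node in ast.walk(ast.parse(source)):
        bindings = {}
        if isinstance(node, ast.expr) and match(parsed, node, bindings):
            yield node.lineno, bindings


SAMPLE = """\
result = eval(user_input)
if total == total:
    pass
if left == right:
    pass
"""

eval_hits = list(find_matches("eval(...)", SAMPLE))
same_hits = list(find_matches("$X == $X", SAMPLE))
print(eval_hits)  # [(1, {})]
print(same_hits)  # [(2, {'$X': 'total'})]
```

The second pattern reports only line 2, because `$X` has to bind to the same expression on both sides of `==`. The toy deliberately ignores keyword arguments, statement-level patterns, and every language other than Python, all of which the real tool handles.
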
## Architecture Overview

```
            [Source Code]
       30+ languages supported
                  |
      [Semgrep Engine (OCaml)]
       Language-aware parsing
      Pattern matching on ASTs
                  |
           [Rules (YAML)]
                  |
     +------------+------------+
     |            |            |
[Community    [Custom      [Semgrep
 Rules]        Rules]       Registry]
 p/security    Your org     3,000+
 p/owasp       specific     rules from
 p/python      rules        community
                  |
         [Pattern Matching]
        $VAR = metavariable
        ... = match anything
           Taint tracking
        Constant propagation
                  |
             [Results]
          CLI, JSON, SARIF
      GitHub/GitLab integration
```

## Self-Hosting & Configuration

```yaml
# .semgrep.yml: custom rule example
rules:
  - id: sql-injection-risk
    patterns:
      - pattern: |
          cursor.execute("..." + $USER_INPUT)
    message: |
      Possible SQL injection: use parameterized queries
      instead of string concatenation.
    languages: [python]
    severity: ERROR
    metadata:
      cwe: CWE-89
      owasp: "A03:2021"

  - id: no-eval
    pattern: eval(...)
    message: Avoid eval(); it executes arbitrary code.
    languages: [python, javascript]
    severity: WARNING

  - id: use-strict-equality
    pattern: $X == $Y
    fix: $X === $Y
    message: Use strict equality (===) instead of loose equality (==).
    languages: [javascript, typescript]
    severity: INFO
```

```bash
# CI/CD integration
# GitHub Actions:
# - uses: returntocorp/semgrep-action@v1
#   with:
#     config: p/security-audit

# Run with autofix
semgrep --config .semgrep.yml --autofix .

# Output SARIF for GitHub Security
semgrep --config auto --sarif -o results.sarif .
```

## Key Features

- **Code-Like Patterns**: write rules that look like source code
- **30+ Languages**: Python, JS/TS, Java, Go, Ruby, C, PHP, and more
- **Metavariables**: `$X` matches any expression for flexible patterns
- **Taint Analysis**: track data flow from sources to sinks
- **Autofix**: automatically fix issues with fix patterns
- **Community Rules**: 3,000+ pre-built rules in the registry
- **CI/CD**: GitHub Actions, GitLab CI, and SARIF output
- **Fast**: scans large codebases in seconds (no compilation needed)

## Comparison with Similar Tools

| Feature | Semgrep | ESLint | SonarQube | CodeQL | Bandit |
|---|---|---|---|---|---|
| Languages | 30+ | JS/TS | 30+ | 10+ | Python only |
| Rule Syntax | Code-like | JS API | Java | QL language | Plugin |
| Learning Curve | Very Low | Low | Moderate | High | Low |
| Taint Analysis | Yes | No | Yes | Yes | Limited |
| Autofix | Yes | Yes | No | No | No |
| Self-Hosted | Yes | Yes | Yes | GitHub only | Yes |
| Best For | Multi-lang security | JS linting | Enterprise | Deep analysis | Python security |

## FAQ

**Q: Semgrep vs ESLint: when should I use Semgrep?**
A: Use ESLint for JavaScript/TypeScript linting and formatting rules. Use Semgrep for security scanning across multiple languages, custom organizational rules, and taint analysis.

**Q: Can Semgrep replace SonarQube?**
A: For security scanning, Semgrep is faster to set up and easier to customize. SonarQube provides more comprehensive code quality metrics, technical debt tracking, and enterprise features.

**Q: How do I write custom rules?**
A: Start at semgrep.dev/playground: paste your code, write a pattern, and test it interactively. Rules are written in YAML with code-like patterns. The docs have a tutorial that takes about 30 minutes.

**Q: What is Semgrep Pro?**
A: Semgrep Pro (cloud) adds inter-file analysis, AI-assisted triage, and team management. The open-source CLI with community rules is free and sufficient for many teams.

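
Teams that rely on the open-source CLI often post-process its results in CI. The sketch below shows one way a build could be gated on `ERROR`-level findings. It assumes the JSON report has a top-level `results` list whose entries carry `check_id`, `path`, `start.line`, and `extra.severity`; verify that shape against the output of your installed version. The report here is hand-written sample data reusing the rule IDs from the configuration above, and `summarize` is a hypothetical helper, not part of Semgrep.

```python
import json
from collections import Counter

# Hand-written sample imitating the assumed report shape; not real scan output.
SAMPLE_REPORT = json.dumps({
    "results": [
        {"check_id": "sql-injection-risk", "path": "app/db.py",
         "start": {"line": 42}, "extra": {"severity": "ERROR"}},
        {"check_id": "no-eval", "path": "app/util.py",
         "start": {"line": 7}, "extra": {"severity": "WARNING"}},
        {"check_id": "use-strict-equality", "path": "web/main.js",
         "start": {"line": 13}, "extra": {"severity": "INFO"}},
    ],
    "errors": [],
})


def summarize(report_text):
    """Return (findings per severity, formatted locations of ERROR findings)."""
    report = json.loads(report_text)
    counts = Counter()
    blocking = []
    for finding in report.get("results", []):
        severity = finding.get("extra", {}).get("severity", "UNKNOWN")
        counts[severity] += 1
        if severity == "ERROR":
            location = f'{finding["path"]}:{finding["start"]["line"]}'
            blocking.append(f'{location} {finding["check_id"]}')
    return counts, blocking


counts, blocking = summarize(SAMPLE_REPORT)
exit_code = 1 if blocking else 0  # a CI wrapper could pass this to sys.exit()
print(dict(counts))
print(blocking)
print("exit code:", exit_code)
```

In a real pipeline, the report text would come from a file written by a command such as `semgrep --config auto --json -o results.json .` rather than from an inline sample.
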
## Sources

- GitHub: https://github.com/semgrep/semgrep
- Documentation: https://semgrep.dev/docs
- Playground: https://semgrep.dev/playground
- Created by Semgrep (formerly r2c)
- License: LGPL-2.1

---

Source: https://tokrepo.com/en/workflows/876dacb1-372b-11f1-9bc6-00163e2b0d79

Author: AI Open Source