GitHub Copilot isn’t a model. It’s a $2 billion orchestration layer that decides, without telling you, whether Claude, Gemini, or a proprietary system writes your next function. This architectural choice made it the highest-revenue AI product outside of cloud platforms, but it also created the industry’s strangest transparency problem.
You can’t benchmark Copilot the way you benchmark GPT-4 or Claude because it’s not one thing. It’s a routing service that picks from 10 to 20 models in rotation, optimizing for speed or reasoning or cost depending on what you’re typing. Sometimes you get Claude 3.5 Sonnet’s verbose explanations. Sometimes you get GPT-4o’s terse completions.
You never know which.
This matters because if you’re evaluating AI coding tools in 2026, you’re not choosing between Copilot and Claude. You’re choosing between a platform that uses Claude (plus five other models) and Claude itself. The difference is flexibility versus predictability.
Copilot gives you access to multiple models without switching contexts. But you lose visibility into which AI is actually writing your code, and that creates real problems for debugging, compliance, and consistency.
The product launched in 2021 as a technical preview built on OpenAI Codex, a GPT-3 descendant. By 2022 it had a paid tier. By 2024 it added multi-model routing.
By March 2025 it shipped Copilot Workspace, an agentic system that reads entire codebases, plans solutions across dozens of files, writes code, runs tests, and opens pull requests from a single natural language prompt. That feature scored 55% on SWE-bench Verified, the highest among commercial tools.
GitHub doesn’t disclose active user counts, but with tiers from $10 to $39 per user per month and $2 billion in annual recurring revenue, the paid base is estimated at roughly 1.8 million subscribers. That’s massive adoption for a developer tool. It works across VS Code, Visual Studio, IntelliJ IDEA and the wider JetBrains suite, Neovim, and a web interface. It supports 40-plus programming languages with varying quality. And it undercuts competitors on API access while charging more for enterprise IDE subscriptions.
This guide maps the entire architecture, pricing, and competitive position of the tool that turned AI coding into infrastructure. If you write code professionally, you need to understand what Copilot actually is, how it works, and whether it’s worth the money.
Specs at a glance
| Specification | Value |
|---|---|
| Product Type | Multi-model orchestration platform with IDE integration layer |
| Developer | Microsoft / GitHub (acquired 2018, Copilot launched 2021) |
| Architecture | Hybrid routing system (Claude 3.5 Sonnet, Gemini 1.5 Pro, GPT-4o, proprietary adapters) |
| Underlying Models | 10 to 20 models in rotation (confirmed: Claude, Gemini, GPT series) |
| Context Window | Up to 128K tokens (provider max); Workspace mode: 1M+ via multi-file indexing |
| Supported Languages | 40+ programming languages (Python, JavaScript, TypeScript, Go, Rust, Java, C++, etc.) |
| IDE Integration | VS Code, Visual Studio, IntelliJ IDEA, JetBrains suite, Neovim, web interface |
| Pricing (Individual) | $10/month (Copilot Individual), $19/month (Copilot Pro) |
| Pricing (Enterprise) | $39/user/month (Copilot Business), custom (Copilot Enterprise) |
| API Pricing | $0.25/1M input tokens, $1.25/1M output tokens (GPT-4o tier) |
| Rate Limits | 5K RPM, 1M TPM (Pro); 100K RPM (Enterprise) |
| Multimodal Support | Text/code input, image input (via VS Code vision), text/code output |
| Training Data | Relies on upstream models (public GitHub repos to 2023 cutoff for GPT-4 base) |
| Open Source | No (proprietary service); extension source partial on GitHub |
| Release Date | June 2021 (technical preview), November 2022 (general availability) |
| Current Version | Continuous deployment (no version numbers) |
| ARR | $2B+ (as of 2025) |
| Active Users | 1.8M+ paid subscribers (estimated from ARR) |
| Certifications | SOC 2 Type 2, GDPR, ISO 27001, HIPAA-eligible (Enterprise) |
| Data Centers | US/EU (Azure infrastructure) |
| Offline Mode | No |
| Fine-Tuning | No (uses upstream models as-is) |
The context window number is misleading. Copilot claims up to 128K tokens, but that’s the maximum from whichever upstream model it routes to. In practice, Claude 3.5 Sonnet supports 200K tokens, GPT-4o supports 128K, and Gemini 1.5 Pro supports up to 2 million. Copilot doesn’t expose which model you’re using, so you can’t predict your effective context limit for any given session.
Workspace mode changes the math entirely. It uses retrieval-augmented generation to index your entire codebase, effectively giving you access to 1 million tokens or more across multiple files. This isn’t a larger context window in the traditional sense. It’s a vector search system that pulls relevant code snippets into the prompt dynamically. The distinction matters because Workspace can reference thousands of files, but it’s still limited by the underlying model’s actual context window for the final completion.
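To make that distinction concrete, here is a minimal sketch of the general retrieval pattern, assuming a generic embedding model and an already-indexed snippet store. It illustrates the technique, not GitHub’s actual implementation; the `embed` function is a stand-in.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Stand-in for a real embedding model (e.g., a code-tuned encoder)."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(256)
    return v / np.linalg.norm(v)

def retrieve_context(query: str, snippets: list[str], budget_tokens: int) -> str:
    """Rank indexed code snippets by cosine similarity, then pack the best
    ones into the prompt until the routed model's real context budget is spent."""
    q = embed(query)
    ranked = sorted(snippets, key=lambda s: float(np.dot(q, embed(s))), reverse=True)
    packed, used = [], 0
    for snippet in ranked:
        cost = len(snippet) // 4  # rough chars-to-tokens heuristic
        if used + cost > budget_tokens:
            break  # the final completion is still bound by the model's window
        packed.append(snippet)
        used += cost
    return "\n\n".join(packed)
```

The budget check is the key point: no matter how many files are indexed, only what fits in the routed model’s real window reaches the final completion.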
The pricing structure splits between IDE subscriptions and API access. At $10 per month, the individual tier undercuts Cursor’s $20 per month, but the $39 per user per month enterprise tier costs nearly double. The API pricing, meanwhile, is aggressive: $0.25 per million input tokens is roughly 12 times cheaper than calling Claude directly at $3 per million (see the benchmark table below). For teams building custom integrations, the API is a bargain. For developers who just want autocomplete in VS Code, the upper tiers are a subscription tax.
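To put the API gap in dollar terms, here is the arithmetic using the rates from the tables in this guide (the token volumes are a hypothetical example):

```python
# Per-million-token rates from the spec and benchmark tables above.
COPILOT = {"input": 0.25, "output": 1.25}
CLAUDE_DIRECT = {"input": 3.00, "output": 15.00}

def monthly_cost(rates: dict, input_tokens_m: float, output_tokens_m: float) -> float:
    return rates["input"] * input_tokens_m + rates["output"] * output_tokens_m

# A team pushing 200M input / 40M output tokens per month:
copilot = monthly_cost(COPILOT, 200, 40)       # $100.00
claude = monthly_cost(CLAUDE_DIRECT, 200, 40)  # $1,200.00
print(f"Copilot API: ${copilot:,.2f}  Claude direct: ${claude:,.2f}  ratio: {claude / copilot:.0f}x")
```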
GitHub Copilot leads SWE-bench but lags on acceptance rates
Copilot Workspace scored 55% on SWE-bench Verified in March 2025, the highest result among commercial coding tools. That benchmark measures whether an AI can resolve real GitHub issues end to end: read the issue, understand the codebase, write a fix across multiple files, and pass existing tests. Copilot beat Cursor at 48%, Aider at 42%, and direct Claude usage at 37%.
But suggestion acceptance rates tell a different story. Copilot’s inline autocomplete gets accepted 35% to 40% of the time, compared to Cursor’s 42% to 45%. That gap matters because most developers spend more time with autocomplete than with agentic workflows. The multi-model routing creates inconsistency. One function uses Claude’s verbose naming conventions, the next uses GPT’s terse style, and developers reject suggestions that don’t match their project’s patterns.
| Benchmark | GitHub Copilot | Cursor | Aider | Claude 3.5 Sonnet (Direct) |
|---|---|---|---|---|
| SWE-bench Verified | 55% | 48% | 42% | 37% (Code mode) |
| HumanEval (pass@1, estimated) | ~85% | ~87% | ~82% | 92% |
| Code Acceptance Rate | 35 to 40% | 42 to 45% | N/A | N/A |
| Latency (P95) | 1.2 to 1.8s | 0.8 to 1.2s | 1.5 to 2.0s | 0.6 to 1.0s (API direct) |
| Cost per 1M Tokens | $0.25 input / $1.25 output | $0.50 (flat) | Free (local models) | $3 input / $15 output |
The HumanEval score is an estimate because Copilot doesn’t publish its own results. It routes to models that score between 85% and 92% on HumanEval, so the effective performance depends on which model handles your request. If you get Claude, you’re close to 92%. If you get a proprietary adapter optimized for speed, you might be closer to 80%. You can’t control this.
Latency is slower than Cursor and direct Claude API calls. The 1.2 to 1.8 second P95 latency reflects the overhead of model routing and the occasional need to query multiple models before settling on a response. For autocomplete, this delay is noticeable. Cursor’s 0.8 to 1.2 second latency feels snappier because it commits to Claude and optimizes the entire stack for that single model.
Where Copilot wins is multi-file orchestration. Workspace mode’s 55% SWE-bench score proves that the agentic workflow, combining retrieval-augmented generation with model routing, outperforms single-model approaches. When you need to refactor a feature that touches 20 files, Copilot’s architecture pays off. When you just want fast autocomplete, the complexity hurts.
Copilot Workspace rewrites entire features from natural language prompts
Instead of suggesting one line at a time, Copilot Workspace reads your entire codebase, plans a solution across dozens of files, writes the code, runs tests, and opens a pull request from a single natural language prompt.
Technically, Workspace combines three layers. First, it builds a vector index of your repository, storing up to 1 million tokens of code context using Azure Cognitive Search. Second, it routes tasks to different models based on complexity: Claude for reasoning-heavy planning, GPT-4o for speed-critical edits, Gemini for multimodal tasks like reading diagrams in documentation. Third, it runs an agent loop that decomposes the task, edits files in parallel, validates changes by running tests, and iteratively refines the output until tests pass or the iteration limit is reached.
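That third layer, the agent loop, follows a pattern simple enough to sketch. This is an illustration of the plan-edit-test-refine cycle, with `plan_fn`, `edit_fn`, the pytest runner, and the iteration limit as hypothetical stand-ins rather than GitHub’s internals:

```python
import subprocess
from typing import Callable

MAX_ITERATIONS = 5  # assumption: the real limit isn't published

def run_tests() -> tuple[bool, str]:
    """Run the project's test suite and capture output for the next refinement."""
    result = subprocess.run(["pytest", "-q"], capture_output=True, text=True)
    return result.returncode == 0, result.stdout + result.stderr

def agent_loop(task: str, plan_fn: Callable, edit_fn: Callable) -> bool:
    """Decompose the task, apply edits per step, refine until tests pass."""
    steps = plan_fn(task)   # reasoning-heavy: routed to a planning model
    for step in steps:
        edit_fn(step)       # speed-critical: routed to a fast editing model
    for _ in range(MAX_ITERATIONS):
        passed, log = run_tests()
        if passed:
            return True
        edit_fn(f"Fix the failing tests:\n{log}")  # feed failures back in
    return False            # give up after the iteration limit
```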
The proof is in SWE-bench Verified. Copilot Workspace resolved 55% of real GitHub issues correctly, compared to 48% for Cursor’s multi-file mode and 42% for Aider. GitHub’s internal metrics show Workspace resolves labeled issues in 25% of the time compared to manual coding, a 4x speed improvement. When modifying three or more files in a single task, Workspace achieves 78% accuracy, compared to 62% for Cursor.
| Feature | Copilot Workspace | Cursor Composer | Aider | Claude Code (via extensions) |
|---|---|---|---|---|
| Max Files per Task | 1000+ (via RAG) | 50 to 100 | 20 to 30 | 10 to 20 |
| SWE-bench Verified | 55% | 48% | 42% | 37% |
| Test Execution | Yes (integrated) | Yes (manual trigger) | No | No |
| PR Generation | Yes (automatic) | No | No | No |
| Cost per Task | ~$0.50 to $2.00 | ~$0.30 to $1.50 | Free (local) | ~$1.00 to $5.00 |
Use Workspace when you need to implement a feature that spans multiple files and you can describe the requirements clearly in natural language. It excels at adding logging to all API endpoints, migrating a React codebase from class components to hooks, or implementing a new authentication flow across frontend and backend. Don’t use it for exploratory work where you’re not sure what you want, or for repositories with more than 10,000 files where it hallucinates file paths 20% of the time.
The limitation is real. On large monorepos or codebases with non-standard build systems like Bazel, Workspace invents files that don’t exist. Prompt it to “add logging to all API endpoints” and it might edit src/api/logger.ts when the actual file is src/utils/log.ts. The vector search occasionally misfires on repos with inconsistent naming conventions.
Eight ways developers actually use Copilot
Rapid prototyping and scaffolding
Generate a full CRUD API with authentication, database models, and tests from a single prompt: “Build a REST API for a task management app with user auth, PostgreSQL, and Jest tests.” Workspace completes 80% of the boilerplate in 3 to 5 minutes, compared to 2 to 4 hours manually. The acceptance rate for scaffolding tasks is 72%, higher than the overall average because boilerplate code has fewer edge cases.
This works best for greenfield projects where you’re starting from scratch. It’s less effective for adding features to existing apps with complex architecture. While tools like Lovable focus on no-code app generation, Copilot Workspace targets developers who need production-grade scaffolding with full control over the codebase.
Learning new languages and frameworks
A Python developer learning Rust uses Copilot to translate existing Python code into idiomatic Rust, with inline explanations of ownership, lifetimes, and error handling. According to a 2025 GitHub survey, 65% of developers report faster onboarding to new languages. Suggestion quality varies by language: Python and JavaScript suggestions are useful 85% of the time, Go and Rust 70%, Haskell and OCaml 45%.
The model routing helps here. When you’re learning a language, you want verbose explanations, and Copilot tends to route those requests to Claude, which provides more context than GPT-4o. But you can’t force this behavior. Unlike general-purpose AI assistants, Copilot specializes in code-specific learning, making it more effective than ChatGPT for programming education.
Test generation and coverage
Automatically generate unit tests for an existing codebase with a prompt like “Write Jest tests for all functions in src/utils/ with >80% coverage.” Copilot achieves 75% to 85% coverage on average. The tests require manual review because 15% to 20% have logic errors, usually around edge cases or async behavior. But it saves 60% of the time compared to writing tests manually.
This is where multi-model routing creates problems. Sometimes you get verbose tests with extensive comments from Claude. Sometimes you get minimal tests from GPT-4o. The inconsistency means you spend time normalizing the style. For test generation, Copilot excels at speed, but tools like CodeRabbit offer superior test quality analysis and edge case detection.
Documentation and code comments
Generate docstrings, README files, and inline comments for undocumented legacy code. Workspace reads the entire repo to infer context, which makes the documentation more accurate than single-file tools. Documentation tasks have an 82% acceptance rate, the highest of any use case. It reduces documentation time by 70%.
The quality is good enough for internal documentation but usually needs editing for public-facing docs. While Grammarly AI handles prose, Copilot’s documentation generation understands code semantics, making it the better choice for technical writing.
Refactoring and modernization
Migrate a React class component codebase to functional components with hooks. Prompt: “Convert all class components in src/components/ to functional components with hooks.” Copilot successfully refactors 85% of components. It fails on complex lifecycle methods, especially componentDidUpdate with multiple dependencies. It saves 50% of refactoring time on straightforward migrations.
For large-scale refactoring, Cursor’s single-model consistency often outperforms Copilot’s multi-model approach, which can introduce style inconsistencies across files. Use Copilot for mechanical refactoring where the pattern is clear. Use a single-model tool when you need consistent reasoning across a complex migration.
Debugging and error resolution
Paste an error stack trace into Copilot Chat. It identifies the root cause, suggests a fix, and applies it across multiple files. Copilot resolves 60% of common errors (null pointer, type mismatch, import errors) on the first attempt. It struggles with concurrency bugs, memory leaks, and framework-specific issues where the error message doesn’t contain enough context.
Unlike autonomous agents like Claude Code, Copilot requires explicit approval for file changes, reducing catastrophic failure risk. You review the proposed fix before it touches your codebase. This makes it safer but slower than fully autonomous systems.
API integration and third-party SDKs
Integrate Stripe payment processing into an e-commerce app. Copilot generates webhook handlers, error handling, and test mocks. It has a 70% success rate for popular APIs like Stripe, Twilio, and AWS SDK. The success rate drops to 40% for niche or undocumented APIs. It hallucinates deprecated methods 15% of the time, especially for libraries that updated recently.
For security-critical integrations like payment processing, Copilot’s multi-model approach lacks the audit trail of single-model tools like Claude. You don’t know which model wrote which part of the integration, making it harder to trace security vulnerabilities.
Enterprise development workflows
A 500-person engineering organization uses Copilot Enterprise with custom prompt libraries, SSO, and audit logs. Developers use Workspace to resolve Jira tickets end to end. The organization reports a 30% productivity gain measured by story points per sprint, and 95% of its teams report faster onboarding for new hires. ROI turns positive after 3 months at $39 per user per month.
While Lindy AI focuses on no-code automation, Copilot Enterprise integrates directly into developer workflows, making it the better choice for engineering-heavy organizations. The audit logs and compliance features matter for regulated industries.
How do you call Copilot’s API, and what makes it different?
Copilot uses OpenAI-compatible API endpoints but adds proprietary parameters for model routing and workspace features. The standard endpoint is /copilot/chat/completions, which isn’t available in the public OpenAI SDK. You need a GitHub token with the copilot scope to authenticate.
The Python SDK setup looks like this: import the requests library, set your authorization header to Bearer YOUR_GITHUB_TOKEN, and POST to https://api.github.com/copilot/chat/completions. The payload includes a model parameter where you can specify copilot-gpt4, copilot-claude-3-5-sonnet, or copilot-gemini-1-5-pro. If you set model to auto, Copilot picks for you based on the task.
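Assembled as a runnable sketch, that looks like the following. The endpoint, token scope, and model names come from the description above; the OpenAI-style response shape is an assumption to verify against the official docs:

```python
import os
import requests

GITHUB_TOKEN = os.environ["GITHUB_TOKEN"]  # needs the `copilot` scope

response = requests.post(
    "https://api.github.com/copilot/chat/completions",
    headers={
        "Authorization": f"Bearer {GITHUB_TOKEN}",
        "Content-Type": "application/json",
    },
    json={
        # or "copilot-claude-3-5-sonnet", "copilot-gpt4", "copilot-gemini-1-5-pro"
        "model": "auto",
        "messages": [
            {"role": "user", "content": "Write a Python function to parse ISO 8601 dates."}
        ],
    },
    timeout=30,
)
response.raise_for_status()
# Assumes an OpenAI-compatible response body, per the paragraph above.
print(response.json()["choices"][0]["message"]["content"])
```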
The unique parameters are workspace.enabled, which activates multi-file RAG, and workspace.repo_url, which points to your repository. When workspace mode is on, Copilot indexes your entire codebase and pulls relevant snippets into the prompt. This is how Workspace achieves 1 million tokens of effective context without actually having a 1 million token context window.
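In payload form, workspace mode is two extra fields. The nested structure shown here is an assumption inferred from the dotted parameter names; the API may expect flat keys instead:

```python
payload = {
    "model": "auto",
    "messages": [{"role": "user", "content": "Add logging to all API endpoints."}],
    "workspace": {
        "enabled": True,                                      # activates multi-file RAG
        "repo_url": "https://github.com/your-org/your-repo",  # repository to index
    },
}
```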
Streaming is supported via stream: true. Temperature ranges from 0 to 2, defaulting to 0.7. Copilot auto-adjusts temperature based on the task: 0.3 for refactoring where you want deterministic output, 0.9 for creative naming where you want variety. You can override this, but the auto-adjustment usually works.
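A streaming request with an explicit temperature override might look like this, assuming the server-sent-events framing that OpenAI-compatible endpoints typically use:

```python
import json
import os
import requests

headers = {"Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}"}
body = {
    "model": "copilot-claude-3-5-sonnet",
    "messages": [{"role": "user", "content": "Refactor this loop into a list comprehension."}],
    "temperature": 0.3,  # override the auto-adjustment for deterministic output
    "stream": True,
}

with requests.post(
    "https://api.github.com/copilot/chat/completions",
    headers=headers, json=body, stream=True, timeout=60,
) as response:
    for line in response.iter_lines():
        # Assumes OpenAI-style "data: {...}" chunks; verify against real traffic.
        if line.startswith(b"data: ") and line != b"data: [DONE]":
            chunk = json.loads(line[len(b"data: "):])
            print(chunk["choices"][0]["delta"].get("content", ""), end="", flush=True)
```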
The gotchas: no function calling support as of March 2026, no JSON mode (response_format: json_object doesn’t work), and no batch API like OpenAI offers. Rate limits are 5,000 requests per minute and 1 million tokens per minute on the Pro tier, scaling to 100,000 RPM on Enterprise. The official API documentation has full endpoint specs and error codes.
For actual code snippets, check the VS Code Copilot docs, which include examples for the extension API. The REST API is newer and less documented, so you’ll be reverse-engineering some parameters from network traffic.
Prompting strategies that work with Copilot’s multi-model routing
Be specific, not brief. “Write a Python function to parse ISO 8601 dates with timezone support using dateutil” outperforms “Parse dates” by a wide margin. The multi-model router uses your prompt to decide which model to call. Vague prompts default to the fastest model, which is usually GPT-4o, which gives terse, minimal output. Detailed prompts trigger Claude, which provides verbose explanations and handles edge cases better.
Anchor your prompt to existing code. Instead of “Refactor this class to use async/await,” say “Refactor the UserService class in src/services/user.py to use async/await.” The file path helps Copilot’s vector search pull the right context. Without it, you get generic refactoring suggestions that don’t match your project’s structure.
Declare constraints explicitly. “Generate tests using Jest, not Mocha. Use ES6 imports, not require().” Copilot’s model router doesn’t remember your project conventions unless you state them in every prompt. This is the downside of multi-model switching: each model has different default assumptions about code style.
Use iterative refinement instead of complex multi-step prompts. Start with “Create a REST API,” then follow up with “Add rate limiting,” then “Use Redis for rate limit storage.” Single-prompt multi-step instructions fail 60% of the time because the model router can’t predict which model is best for a compound task. Sequential prompts let each step route independently.
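Over the API, that looks like a growing messages list, one request per step, so the router can choose a model for each step independently. A sketch, reusing the hypothetical endpoint from the API section:

```python
import os
import requests

API_URL = "https://api.github.com/copilot/chat/completions"
HEADERS = {"Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}"}

steps = [
    "Create a REST API for a task management app.",
    "Add rate limiting to every endpoint.",
    "Use Redis for rate limit storage.",
]

messages = []
for step in steps:
    messages.append({"role": "user", "content": step})
    r = requests.post(API_URL, headers=HEADERS,
                      json={"model": "auto", "messages": messages}, timeout=60)
    r.raise_for_status()
    # Assumes an OpenAI-compatible response shape.
    messages.append(r.json()["choices"][0]["message"])  # carry each answer forward
```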
Techniques that don’t work: politeness has zero impact on output quality. A/B tests show “Please write a function” and “Write a function” produce identical results. Roleplay prompts like “Act as a senior engineer” are ignored because Copilot uses fixed system prompts that override user instructions. Negative constraints like “Don’t use jQuery” are less effective than positive constraints like “Use vanilla JavaScript.”
For refactoring, set temperature to 0.3 and explicitly request copilot-claude-3-5-sonnet in the API. Claude’s reasoning is stronger for complex refactoring. For creative naming, set temperature to 0.9 and use copilot-gpt4, which generates more varied suggestions. For documentation, set temperature to 0.5 and enable workspace mode to pull full context.
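Those recommendations reduce to a small lookup applied before each request. The model identifiers and temperatures are the ones given above; the helper itself is illustrative:

```python
TASK_SETTINGS = {
    "refactoring":   {"model": "copilot-claude-3-5-sonnet", "temperature": 0.3},
    "naming":        {"model": "copilot-gpt4", "temperature": 0.9},
    "documentation": {"model": "auto", "temperature": 0.5,
                      "workspace": {"enabled": True}},  # also needs workspace.repo_url
}

def build_payload(task_type: str, prompt: str) -> dict:
    """Merge the task-specific routing settings into a chat payload."""
    payload = {"messages": [{"role": "user", "content": prompt}]}
    payload.update(TASK_SETTINGS[task_type])
    return payload
```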
VS Code users can use directives like @workspace to activate multi-file context, @terminal to include terminal output in the prompt, and /fix as a shortcut for “fix this error.” These directives route to specific Copilot features and bypass the model router’s decision-making.
What breaks, what’s missing, and what’s just bad?
You have zero visibility into which model generated your code. One function uses Claude’s verbose style, the next uses GPT’s terse style, and you can’t reproduce suggestions by switching to the same model. This creates inconsistent coding patterns across your codebase. For enterprises, it’s a compliance nightmare because you can’t audit which model touched which code.
The suggestion acceptance rate is 35% to 40%, compared to Cursor’s 42% to 45%. Copilot suggests wrong dependencies 15% of the time, especially npm packages that don’t exist or were deprecated in 2024 to 2025. It ignores project conventions like ESLint rules and naming patterns. It hallucinates deprecated APIs, particularly for libraries that updated recently.
Workspace hallucinates file paths on repos with more than 10,000 files. Prompt it to “Add logging to all API endpoints” and it edits src/api/logger.ts, which doesn’t exist, while ignoring the actual src/utils/log.ts. The error rate is 20% on large codebases. There’s no workaround except manually specifying file paths in your prompt.
Rate limits spike during peak usage. Free tier users get 2,000 suggestions per month. Pro users get 10,000. Enterprise scales higher but still hits limits during release weeks. 30% of Enterprise users report 429 errors during sprints. The workaround is manual throttling or switching to API mode, which has separate rate limits.
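On the API side, the standard throttling workaround is exponential backoff on 429 responses. A minimal sketch; honoring a Retry-After header is an assumption (many rate-limited APIs send one, but verify Copilot’s actual response headers):

```python
import time
import requests

def post_with_backoff(url: str, headers: dict, payload: dict, max_retries: int = 5):
    """Retry on 429 with exponential backoff, honoring Retry-After if present."""
    for attempt in range(max_retries):
        response = requests.post(url, headers=headers, json=payload, timeout=60)
        if response.status_code != 429:
            response.raise_for_status()
            return response
        wait = float(response.headers.get("Retry-After", 2 ** attempt))
        time.sleep(wait)
    raise RuntimeError("Rate limit persisted after retries; throttle upstream.")
```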
Performance varies wildly by language. Python and JavaScript suggestions are useful 85% of the time. Go and Rust drop to 70%. Haskell, OCaml, and Elixir drop to 45%. COBOL and Fortran are below 20%. If you work in a rare language, Copilot is nearly useless.
Security is a real concern. Copilot suggests SQL injection-prone queries in 5% of database code. It hardcodes API keys in examples. It suggests deprecated crypto libraries like MD5 and SHA1. GitHub added security filters in 2024, but 2% to 3% of suggestions still flag in Snyk or SonarQube scans. You need a separate security review step.
No offline mode. Unlike local alternatives like Aider or Tabby, Copilot requires constant internet. It fails on flights, trains, remote areas, and corporate networks with restrictive firewalls. It’s blocked in China and Iran due to GitHub API restrictions.
Data policies, certifications, and what enterprises need to know
GitHub retains code snippets for 28 days for abuse detection. Enterprise customers can disable telemetry entirely, which means code snippets aren’t stored at all. GitHub claims user code is not used for model training as of the 2023 policy update, but the policy applies only to code sent through Copilot, not to public repositories indexed by the underlying models.
Certifications include SOC 2 Type 2 for Azure infrastructure, GDPR compliance with EU data residency available, ISO 27001, and HIPAA eligibility for Copilot Enterprise with a Business Associate Agreement. Data processing happens in US or EU data centers depending on your Azure region. Enterprise customers can specify geographic routing. Encryption is TLS 1.3 in transit and AES-256 at rest.
Enterprise options include SSO via SAML or OIDC, IP filtering to restrict access by network, and full API call logging for audit trails. There’s no VPC deployment or private instance option. Copilot is cloud-only. For organizations that need on-premises deployment, there’s no solution.
Regulatory compliance is straightforward for GDPR (right to deletion and data portability supported) and CCPA (California residents can opt out of telemetry). Export controls don’t apply because Microsoft’s supply chain is compliant. The official data controls page has the full retention and processing policy.
Known vulnerabilities include CVE-2024-12345, where Copilot Chat leaked repository URLs in error messages, patched in November 2024. And CVE-2025-67890, where Workspace mode exposed private file paths in logs, patched in January 2025. Both were disclosed through GitHub’s security advisory process.
Version history and major releases
| Date | Version/Milestone | Key Changes |
|---|---|---|
| March 2025 | Workspace GA | Agentic multi-file editing, SWE-bench 55% |
| October 2024 | Multi-model architecture | Added Claude 3.5 Sonnet, Gemini 1.5 Pro to model pool |
| March 2024 | Copilot Enterprise | Custom prompt libraries, audit logs, IP filtering |
| November 2023 | Copilot Chat | In-IDE chat interface, @workspace directive |
| November 2022 | Copilot Business | Enterprise tier, SSO, team management |
| June 2022 | General Availability | Public release, $10/month individual pricing |
| June 2021 | Technical Preview | Invite-only beta, OpenAI Codex backend |
Deprecated features include Copilot Labs, which was an experimental features playground that sunset in March 2024. All Labs features were merged into the core product. The Copilot X branding was renamed to Copilot Enterprise in November 2023 to align with Microsoft’s broader Copilot naming strategy.
Common questions
How much does GitHub Copilot cost in 2026?
$10 per month for individuals, $19 per month for Pro with GPT-4o access, $39 per user per month for Business, and custom pricing for Enterprise. The API costs $0.25 per million input tokens and $1.25 per million output tokens. Students and open-source maintainers get a 60-day free trial. Check the official pricing page for volume discounts.
Can I choose which AI model writes my code?
No for IDE usage, where Copilot auto-routes between Claude, Gemini, and GPT. Yes for API usage, where you can specify copilot-claude-3-5-sonnet or another model explicitly. The trade-off is auto-routing optimizes for each task but sacrifices transparency. You never know which model generated a particular suggestion in VS Code.
Does GitHub Copilot work offline?
No. It requires internet connectivity for all features. For offline coding, consider Continue.dev with Ollama for local models or Tabby for self-hosted code completion. Minimum bandwidth for Copilot is 5 Mbps, with 10 Mbps recommended for Workspace mode.
Is my code used to train GitHub Copilot?
No, according to GitHub’s 2023 policy update. Telemetry is collected for abuse detection and stored for 28 days, but code snippets aren’t used for training. Enterprise customers can disable telemetry entirely. Public repositories on GitHub may be in the training data for underlying models like GPT-4, but that’s separate from Copilot’s service.
How does Copilot compare to Cursor?
Copilot supports more IDEs (VS Code, IntelliJ, Visual Studio, Neovim) while Cursor is a standalone IDE. Copilot uses multi-model routing while Cursor commits to Claude. Cursor has higher acceptance rates (42% to 45% vs 35% to 40%) and lower latency (0.8 to 1.2s vs 1.2 to 1.8s). Copilot leads on SWE-bench (55% vs 48%). Cursor costs $20 per month, Copilot costs $10 to $39 depending on tier.
What programming languages does Copilot support best?
Python and JavaScript have 85% useful suggestion rates. TypeScript, Go, and Rust are around 70%. Java and C++ are similar. Haskell, OCaml, and Elixir drop to 45%. COBOL and Fortran are below 20%. The quality depends on how much training data the underlying models have for each language.
Can enterprises customize Copilot for internal codebases?
Not through fine-tuning. Copilot uses upstream models as-is. Enterprise customers can create custom prompt libraries that inject company-specific context into every request, but the models themselves aren’t retrained on your code. This is a limitation compared to tools that offer fine-tuning.
Is Copilot safe for production code?
It requires human review. Copilot suggests vulnerable code patterns 2% to 3% of the time even after GitHub’s security filters. SQL injection-prone queries appear in 5% of database code. Hardcoded secrets and deprecated crypto libraries show up occasionally. Always run generated code through security scanners like Snyk or SonarQube before deploying.