BYOLLM (Bring Your Own LLM)

BYOLLM (Bring Your Own LLM) lets you configure aprity to use your own large language model provider instead of the default aprity-managed AI infrastructure. This gives you full control over data sovereignty, cost management, and model selection.

info

BYOLLM is available on the Intelligence plan only — it is not included in the Trial. See Plan Comparison for the full feature matrix.

Why use BYOLLM?

Data sovereignty

By routing AI analysis through your own LLM provider, metadata and prompts are processed under a provider account that your organization controls rather than on aprity-managed AI infrastructure. This is essential for organizations with strict data residency requirements or regulatory obligations.

Cost control

Using your own LLM subscription allows you to manage AI costs through your existing cloud billing. You can leverage volume discounts, reserved capacity, or enterprise agreements with your provider.

Model selection

Choose the AI model that best fits your needs. Different models offer different tradeoffs in speed, accuracy, and cost. BYOLLM lets you select the model version and configuration that works for your organization.

Supported providers

aprity supports the following LLM providers for BYOLLM:

| Provider | Endpoint type |
| --- | --- |
| Azure OpenAI | Azure-hosted deployment endpoint |
| Anthropic (Claude) | Anthropic API |
| OpenAI | OpenAI API |
| xAI (Grok) | xAI API |

caution

aprity only supports models with a 1M-token context window or larger. Smaller-context models are rejected when the endpoint is configured. This requirement exists because aprity injects large deterministic substrates directly into its prompts rather than summarizing them.

note

Model availability and naming change as providers release new versions. Contact support for the current list of tested and supported 1M-context models.

How it works

When BYOLLM is configured:

  1. aprity sends analysis prompts to your LLM endpoint instead of the default aprity infrastructure.
  2. All metadata content processed by the LLM flows through your provider account.
  3. Responses are returned to aprity for documentation assembly.
  4. The rest of the aprity pipeline (metadata extraction, dependency computation, documentation rendering) is unchanged.

The LLM is used only for the ANALYZE_METADATA phase of the scan. All other phases (metadata extraction, dependency graph construction, document generation) are fully deterministic and do not involve any LLM calls.
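
The split between deterministic phases and the single LLM-backed phase can be sketched as below. This is an illustration only, not aprity's implementation: every phase name other than ANALYZE_METADATA, the phase order, and the `run_scan` and `llm_call` names are assumptions made for the example.

```python
from typing import Callable

# Hypothetical phase names and order; only ANALYZE_METADATA is named in the docs.
PHASES = [
    "EXTRACT_METADATA",
    "COMPUTE_DEPENDENCIES",
    "ANALYZE_METADATA",
    "RENDER_DOCUMENTATION",
]
LLM_PHASES = {"ANALYZE_METADATA"}


def run_scan(metadata: dict, llm_call: Callable[[str], str]) -> dict:
    """Walk the phases in order; only ANALYZE_METADATA invokes the LLM callable."""
    trace = []
    analysis = None
    for phase in PHASES:
        if phase in LLM_PHASES:
            # With BYOLLM, llm_call would target the customer's endpoint.
            prompt = f"Analyze metadata objects: {sorted(metadata)}"
            analysis = llm_call(prompt)
            trace.append((phase, "llm"))
        else:
            # Deterministic phases never touch the LLM.
            trace.append((phase, "deterministic"))
    return {"analysis": analysis, "trace": trace}
```

Swapping the default infrastructure for BYOLLM corresponds to passing a different `llm_call`; the other phases do not change.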

Setup process

BYOLLM is configured by an administrator directly in the aprity app. An endpoint profile is defined for your tenant and applies to all subsequent scans.

Step 1: Gather your endpoint details

Before configuring, have the following ready:

  • Your LLM provider (Azure OpenAI, Anthropic, OpenAI, or xAI).
  • Your API endpoint URL (for Azure OpenAI, this is your deployment endpoint).
  • Your API key or authentication credentials.
  • Your chosen 1M-context model name and version.

Step 2: Configure the endpoint profile

In the aprity app, an admin creates an endpoint profile that defines:

  • The provider and model to use (must be a 1M-context model).
  • The API endpoint and authentication.
  • Rate limits and timeout settings.
  • Fallback behavior if the endpoint is unavailable.
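
As a rough mental model, the endpoint profile can be pictured as a small record with the four groups of settings listed above. The sketch below is hypothetical: the field names, provider identifiers, default values, and the `fail_scan` fallback label are all invented for illustration and do not reflect the actual fields in the aprity app.

```python
from dataclasses import dataclass
from urllib.parse import urlparse

# Hypothetical identifiers for the four supported providers.
SUPPORTED_PROVIDERS = {"azure_openai", "anthropic", "openai", "xai"}


@dataclass(frozen=True)
class EndpointProfile:
    provider: str                      # which LLM provider to call
    model: str                         # must be a 1M-context model
    endpoint_url: str                  # API endpoint (deployment URL for Azure OpenAI)
    api_key: str                       # authentication credential
    requests_per_minute: int = 60      # illustrative rate limit
    timeout_seconds: float = 120.0     # illustrative per-request timeout
    on_unavailable: str = "fail_scan"  # hypothetical fallback setting

    def problems(self) -> list[str]:
        """Return human-readable issues; an empty list means the profile looks sane."""
        issues = []
        if self.provider not in SUPPORTED_PROVIDERS:
            issues.append(f"unsupported provider: {self.provider}")
        parsed = urlparse(self.endpoint_url)
        if parsed.scheme != "https" or not parsed.netloc:
            issues.append("endpoint_url must be an https URL")
        if not self.api_key:
            issues.append("api_key is empty")
        if self.requests_per_minute <= 0:
            issues.append("requests_per_minute must be positive")
        if self.timeout_seconds <= 0:
            issues.append("timeout_seconds must be positive")
        return issues
```

Checking a profile locally like this only catches obvious mistakes; the validation step that follows is what confirms the endpoint actually works.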

Step 3: Validation

aprity validates the endpoint to confirm:

  • Connectivity to your LLM endpoint.
  • The model meets the 1M-context requirement.
  • Response format compatibility.
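
The three checks above could look roughly like the following. This is a sketch under stated assumptions, not aprity's validator: the `probe` callable, the `text` field used for the format check, and the idea that the context window size is supplied by the caller are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable

MIN_CONTEXT_TOKENS = 1_000_000  # the 1M-context requirement from the docs


@dataclass
class ValidationResult:
    ok: bool
    failures: list = field(default_factory=list)


def validate_endpoint(probe: Callable[[], dict], context_window_tokens: int) -> ValidationResult:
    """Run the three checks: connectivity, context size, response format."""
    # 1. Connectivity: the probe sends one tiny request to the endpoint.
    try:
        response = probe()
    except OSError as exc:  # ConnectionError and TimeoutError subclass OSError
        return ValidationResult(False, [f"connectivity: {exc}"])

    failures = []
    # 2. Context window: smaller-context models are rejected.
    if context_window_tokens < MIN_CONTEXT_TOKENS:
        failures.append("context window below 1M tokens")
    # 3. Response format: assumed shape is a dict with a string "text" field.
    if not isinstance(response, dict) or not isinstance(response.get("text"), str):
        failures.append("response format: expected a dict with a string 'text' field")
    return ValidationResult(not failures, failures)
```

Connectivity is checked first because the other two checks are meaningless if the endpoint cannot be reached at all.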

Step 4: Activation

Once validated, BYOLLM is active for your tenant. All subsequent scans use your configured LLM endpoint.

caution

Keep your API key current. If the key expires, or is rotated without the endpoint profile being updated, scans will fail during the analysis phase. Update your credentials in the endpoint profile, or contact support if you need help.

Provider-specific notes

Azure OpenAI

  • Requires a deployed model in your Azure OpenAI resource.
  • Provide the full deployment endpoint URL (e.g., https://your-resource.openai.azure.com/openai/deployments/your-model).
  • Supports both API key and Microsoft Entra ID (formerly Azure Active Directory, AAD) authentication.
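
Before entering the deployment endpoint URL, it can help to confirm it has the shape shown in the example above, since a resource-level URL without the deployment path is an easy mistake. The helper below is a hypothetical sketch: the function name is invented, and the pattern assumes the public-cloud `openai.azure.com` host from the example, so other Azure environments would need a different pattern.

```python
import re

# Matches https://<resource>.openai.azure.com/openai/deployments/<deployment>
_DEPLOYMENT_URL = re.compile(
    r"^https://(?P<resource>[a-z0-9-]+)\.openai\.azure\.com"
    r"/openai/deployments/(?P<deployment>[^/?#]+)/?$"
)


def parse_azure_deployment_url(url: str) -> tuple[str, str]:
    """Return (resource, deployment), or raise ValueError if the shape is wrong."""
    match = _DEPLOYMENT_URL.match(url)
    if match is None:
        raise ValueError(f"not a full Azure OpenAI deployment URL: {url!r}")
    return match["resource"], match["deployment"]
```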

Anthropic (Claude)

  • Requires an active Anthropic API account.
  • Provide your API key and the desired model identifier.

OpenAI

  • Requires an active OpenAI API account.
  • Provide your API key and the desired model identifier.
  • Organization ID is optional but recommended if you belong to multiple organizations.

Monitoring and troubleshooting

  • Scan logs include indicators of which LLM endpoint was used for analysis.
  • If your endpoint is unreachable or keeps returning errors after retries, the scan will fail during the ANALYZE_METADATA phase with a descriptive error message.
  • Rate limit errors from your provider will cause the scan to slow down (with automatic retry and backoff) rather than fail immediately.
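
The retry-and-backoff behavior for rate limits can be illustrated with a generic exponential backoff loop. The docs do not specify aprity's actual retry policy, so the attempt count, delays, jitter, and the `RateLimitError` class here are assumptions chosen for the example.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical stand-in for a provider's rate limit response (e.g. HTTP 429)."""


def call_with_backoff(
    call: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry `call` on rate limits, doubling the wait each time, then give up."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # retries exhausted: surface the error to the caller
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Add up to 50% jitter so parallel workers do not retry in lockstep.
            sleep(delay + random.uniform(0, delay / 2))
    raise AssertionError("unreachable")
```

The effect matches the description above: a rate-limited scan slows down while requests wait and retry, and it only fails once the retries are exhausted.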

Frequently asked questions

Can I switch providers after initial setup?

Yes. An admin updates the endpoint profile in the aprity app. There is no downtime; the change takes effect on the next scan.

What happens if my LLM endpoint is down?

The scan will retry with backoff. If the endpoint remains unavailable, the scan will fail with a clear error message indicating the LLM endpoint could not be reached.

Does BYOLLM affect scan speed?

Scan speed depends on your LLM endpoint's latency and throughput. Endpoints with higher rate limits and lower latency will produce faster scans. The aprity team can help tune configuration for optimal performance.

Can I use different models for different scans?

Not currently. The endpoint profile is configured at the tenant level and applies to all scans.