BYOLLM (Bring Your Own LLM)

BYOLLM (Bring Your Own LLM) allows Enterprise customers to configure aprity to use their own large language model provider instead of the default aprity-managed AI infrastructure. This gives you full control over data sovereignty, cost management, and model selection.

info

BYOLLM is available on Enterprise plans only.

Why use BYOLLM?

Data sovereignty

By routing AI analysis through your own LLM provider, metadata and prompts stay within your organization's cloud infrastructure. This is essential for organizations with strict data residency requirements or regulatory obligations.

Cost control

Using your own LLM subscription allows you to manage AI costs through your existing cloud billing. You can leverage volume discounts, reserved capacity, or enterprise agreements with your provider.

Model selection

Choose the AI model that best fits your needs. Different models offer different tradeoffs in speed, accuracy, and cost. BYOLLM lets you select the model version and configuration that works for your organization.

Supported providers

aprity supports the following LLM providers for BYOLLM:

Provider             Models supported                 Endpoint type
Azure OpenAI         GPT-4o, GPT-4o-mini, GPT-4       Azure-hosted deployment endpoint
Anthropic (Claude)   Claude 4 Opus, Claude 4 Sonnet   Anthropic API
OpenAI               GPT-4o, GPT-4o-mini              OpenAI API
note

Model availability and naming may change as providers release new versions. Contact support for the latest list of tested and supported models.

How it works

When BYOLLM is configured:

  1. aprity sends analysis prompts to your LLM endpoint instead of the default aprity infrastructure.
  2. All metadata content processed by the LLM flows through your provider account.
  3. Responses are returned to aprity for documentation assembly.
  4. The rest of the aprity pipeline (metadata extraction, dependency computation, documentation rendering) is unchanged.

The LLM is used only for the ANALYZE_METADATA phase of the scan. All other phases (metadata extraction, dependency graph construction, document generation) are fully deterministic and do not involve any LLM calls.
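The flow above can be sketched as a single OpenAI-style chat-completions call. Everything in this sketch (the endpoint URL, model name, function name, and payload shape) is illustrative, not aprity's actual wire format.

```python
# Hypothetical sketch of the ANALYZE_METADATA call. The endpoint URL,
# model name, and payload shape are assumptions for illustration;
# aprity's actual request format is internal.
def build_analysis_request(endpoint: str, model: str, metadata_prompt: str) -> dict:
    """Assemble an OpenAI-compatible chat-completions request."""
    return {
        "url": f"{endpoint}/chat/completions",
        "json": {
            "model": model,
            "messages": [
                {"role": "system", "content": "You analyze extracted metadata."},
                {"role": "user", "content": metadata_prompt},
            ],
        },
    }

req = build_analysis_request("https://llm.example.com/v1", "gpt-4o", "Describe table X.")
```

The key point is step 1 above: only the destination of this request changes under BYOLLM; the prompt assembly and the downstream documentation pipeline do not.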

Setup process

BYOLLM configuration is handled by the aprity support team to ensure proper setup and validation.

Step 1: Contact support

Email support@aprity.ai with the following information:

  • Your aprity Org ID (found in Settings > Plan).
  • Your LLM provider (Azure OpenAI, Anthropic, or OpenAI).
  • Your API endpoint URL (for Azure OpenAI, this is your deployment endpoint).
  • Your API key or authentication credentials.
  • Your preferred model name and version.

Step 2: Endpoint profile configuration

The aprity team configures an endpoint profile for your tenant. An endpoint profile defines:

  • The provider and model to use.
  • The API endpoint and authentication.
  • Rate limits and timeout settings.
  • Fallback behavior if the endpoint is unavailable.
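Conceptually, an endpoint profile covering the four items above might look like the following. The field names and values are assumptions for illustration, not aprity's actual schema.

```python
# Hypothetical endpoint profile; field names mirror the four items
# listed above and are illustrative only.
endpoint_profile = {
    "provider": "azure_openai",   # provider and model to use
    "model": "gpt-4o",
    "endpoint": "https://your-resource.openai.azure.com/openai/deployments/your-model",
    "auth": {"type": "api_key", "key_ref": "secret://tenant/llm-api-key"},
    "rate_limit_rpm": 60,         # requests per minute
    "timeout_seconds": 120,
    "fallback": "fail_scan",      # behavior if the endpoint is unavailable
}
```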

Step 3: Validation

The aprity team runs a validation scan to confirm:

  • Connectivity to your LLM endpoint.
  • Response format compatibility.
  • Latency and throughput are within acceptable bounds.
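The response-format check can be approximated like this. The expected shape (an OpenAI-style `choices` list) is an assumption for illustration; the actual compatibility criteria are internal to aprity.

```python
def is_compatible_response(resp: dict) -> bool:
    """Minimal shape check: a non-empty choices list whose first
    entry carries string message content (OpenAI-style)."""
    choices = resp.get("choices")
    if not isinstance(choices, list) or not choices:
        return False
    message = choices[0].get("message", {})
    return isinstance(message.get("content"), str)
```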

Step 4: Activation

Once validated, BYOLLM is activated for your tenant. All subsequent scans will use your configured LLM endpoint.

caution

Keep your API key current. If the key expires or is rotated, scans will fail during the analysis phase. Contact support to update your credentials.

Provider-specific notes

Azure OpenAI

  • Requires a deployed model in your Azure OpenAI resource.
  • Provide the full deployment endpoint URL (e.g., https://your-resource.openai.azure.com/openai/deployments/your-model).
  • Supports both API key and Azure Active Directory (AAD) authentication.
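For key-based auth, the chat-completions URL for an Azure OpenAI deployment follows a fixed pattern built from your resource and deployment names. The names and API version below are placeholders.

```python
def azure_chat_url(resource: str, deployment: str, api_version: str = "2024-06-01") -> str:
    """Build the chat-completions URL for an Azure OpenAI deployment."""
    return (
        f"https://{resource}.openai.azure.com/openai/deployments/"
        f"{deployment}/chat/completions?api-version={api_version}"
    )

# API-key auth uses the "api-key" header; AAD auth sends a bearer token instead.
headers = {"api-key": "<your-key>"}
```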

Anthropic (Claude)

  • Requires an active Anthropic API account.
  • Provide your API key and the desired model identifier.
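A minimal Anthropic Messages API request carries the key in an `x-api-key` header plus an `anthropic-version` header. The model identifier below is a placeholder; use the identifier Anthropic documents for your chosen model.

```python
# Minimal Anthropic Messages API request shape.
request = {
    "url": "https://api.anthropic.com/v1/messages",
    "headers": {
        "x-api-key": "<your-key>",
        "anthropic-version": "2023-06-01",
        "content-type": "application/json",
    },
    "json": {
        "model": "<model-identifier>",  # placeholder
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": "Hello"}],
    },
}
```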

OpenAI

  • Requires an active OpenAI API account.
  • Provide your API key and the desired model identifier.
  • Organization ID is optional but recommended if you belong to multiple organizations.

Monitoring and troubleshooting

  • Scan logs record which LLM endpoint was used for each analysis.
  • If your endpoint is unreachable or returns errors, the scan will fail during the ANALYZE_METADATA phase with a descriptive error message.
  • Rate limit errors from your provider will cause the scan to slow down (with automatic retry and backoff) rather than fail immediately.
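The retry-and-backoff behavior on rate-limit errors can be sketched as a capped exponential delay schedule. The base delay, cap, and attempt count here are illustrative, not aprity's actual settings.

```python
def backoff_delays(attempts: int, base: float = 1.0, cap: float = 30.0) -> list:
    """Capped exponential backoff: 1s, 2s, 4s, ... up to `cap` seconds."""
    return [min(base * (2 ** i), cap) for i in range(attempts)]
```

With these illustrative defaults, six attempts wait 1, 2, 4, 8, 16, then 30 seconds, which is why a rate-limited scan slows down rather than failing immediately.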

Frequently asked questions

Can I switch providers after initial setup?

Yes. Contact support to update your endpoint profile. There is no downtime; the change takes effect on the next scan.

What happens if my LLM endpoint is down?

The scan will retry with backoff. If the endpoint remains unavailable, the scan will fail with a clear error message indicating the LLM endpoint could not be reached.

Does BYOLLM affect scan speed?

Scan speed depends on your LLM endpoint's latency and throughput. Endpoints with higher rate limits and lower latency will produce faster scans. The aprity team can help tune configuration for optimal performance.

Can I use different models for different scans?

Not currently. The endpoint profile is configured at the tenant level and applies to all scans. If you need multiple configurations, contact support to discuss options.