BYOLLM (Bring Your Own LLM)
BYOLLM (Bring Your Own LLM) allows Enterprise customers to configure aprity to use their own large language model (LLM) provider instead of the default aprity-managed AI infrastructure. This gives you full control over data sovereignty, cost management, and model selection.
BYOLLM is available on Enterprise plans only.
Why use BYOLLM?
Data sovereignty
By routing AI analysis through your own LLM provider, metadata and prompts stay within your organization's cloud infrastructure. This is essential for organizations with strict data residency requirements or regulatory obligations.
Cost control
Using your own LLM subscription allows you to manage AI costs through your existing cloud billing. You can leverage volume discounts, reserved capacity, or enterprise agreements with your provider.
Model selection
Choose the AI model that best fits your needs. Different models offer different tradeoffs in speed, accuracy, and cost. BYOLLM lets you select the model version and configuration that works for your organization.
Supported providers
aprity supports the following LLM providers for BYOLLM:
| Provider | Models supported | Endpoint type |
|---|---|---|
| Azure OpenAI | GPT-4o, GPT-4o-mini, GPT-4 | Azure-hosted deployment endpoint |
| Anthropic (Claude) | Claude 4 Opus, Claude 4 Sonnet | Anthropic API |
| OpenAI | GPT-4o, GPT-4o-mini | OpenAI API |
Model availability and naming may change as providers release new versions. Contact support for the latest list of tested and supported models.
How it works
When BYOLLM is configured:
- aprity sends analysis prompts to your LLM endpoint instead of the default aprity infrastructure.
- All metadata content processed by the LLM flows through your provider account.
- Responses are returned to aprity for documentation assembly.
- The rest of the aprity pipeline (metadata extraction, dependency computation, documentation rendering) is unchanged.
The LLM is used only for the ANALYZE_METADATA phase of the scan. All other phases (metadata extraction, dependency graph construction, document generation) are fully deterministic and do not involve any LLM calls.
Setup process
BYOLLM configuration is handled by the aprity support team to ensure proper setup and validation.
Step 1: Contact support
Email support@aprity.ai with the following information:
- Your aprity Org ID (found in Settings > Plan).
- Your LLM provider (Azure OpenAI, Anthropic, or OpenAI).
- Your API endpoint URL (for Azure OpenAI, this is your deployment endpoint).
- Your API key or authentication credentials.
- Your preferred model name and version.
Step 2: Endpoint profile configuration
The aprity team configures an endpoint profile for your tenant. An endpoint profile defines:
- The provider and model to use.
- The API endpoint and authentication.
- Rate limits and timeout settings.
- Fallback behavior if the endpoint is unavailable.
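The actual profile schema is internal to aprity and configured by the support team, but as an illustration only, the fields above might map to a structure like the following (all field names and values here are hypothetical):

```python
# Hypothetical endpoint profile -- field names are illustrative,
# NOT aprity's actual internal schema.
endpoint_profile = {
    "provider": "azure_openai",   # azure_openai | anthropic | openai
    "model": "gpt-4o",
    "endpoint": "https://your-resource.openai.azure.com/openai/deployments/gpt-4o",
    "auth": {
        "type": "api_key",                       # or AAD for Azure OpenAI
        "secret_ref": "tenant-secrets/llm-key",  # placeholder secret reference
    },
    "rate_limit_rpm": 60,     # requests per minute allowed against the endpoint
    "timeout_seconds": 120,   # per-request timeout
    "fallback": "fail_scan",  # behavior when the endpoint is unavailable
}
```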
Step 3: Validation
The aprity team runs a validation scan to confirm:
- Connectivity to your LLM endpoint.
- Response format compatibility.
- Latency and throughput within acceptable bounds.
Step 4: Activation
Once validated, BYOLLM is activated for your tenant. All subsequent scans will use your configured LLM endpoint.
Keep your API key current. If the key expires or is rotated, scans will fail during the analysis phase. Contact support to update your credentials.
Provider-specific notes
Azure OpenAI
- Requires a deployed model in your Azure OpenAI resource.
- Provide the full deployment endpoint URL (e.g., https://your-resource.openai.azure.com/openai/deployments/your-model).
- Supports both API key and Azure Active Directory (AAD) authentication.
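Before sending credentials to support, you can sanity-check the deployment yourself. The sketch below builds (but does not send) a minimal Azure OpenAI chat-completions request; the resource name, deployment name, key, and `api-version` value are placeholders, so check the Azure OpenAI documentation for the current API version:

```python
import json
import urllib.request

# Placeholder deployment endpoint -- replace with your resource and deployment.
endpoint = "https://your-resource.openai.azure.com/openai/deployments/your-model"
url = f"{endpoint}/chat/completions?api-version=2024-06-01"  # example version

req = urllib.request.Request(
    url,
    data=json.dumps({"messages": [{"role": "user", "content": "ping"}]}).encode(),
    headers={
        "api-key": "<YOUR_AZURE_OPENAI_KEY>",  # Azure OpenAI uses an api-key header
        "Content-Type": "application/json",
    },
    method="POST",
)
# urllib.request.urlopen(req)  # uncomment to actually send the request
```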
Anthropic (Claude)
- Requires an active Anthropic API account.
- Provide your API key and the desired model identifier.
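A quick way to confirm your key and model identifier is to construct a Messages API request. The model string below is only an example; the `anthropic-version` header is required by the Anthropic API:

```python
import json
import urllib.request

# Build (but do not send) a minimal Anthropic Messages API request.
# API key and model identifier are placeholders.
req = urllib.request.Request(
    "https://api.anthropic.com/v1/messages",
    data=json.dumps({
        "model": "claude-sonnet-4-20250514",  # example model identifier
        "max_tokens": 16,
        "messages": [{"role": "user", "content": "ping"}],
    }).encode(),
    headers={
        "x-api-key": "<YOUR_ANTHROPIC_KEY>",
        "anthropic-version": "2023-06-01",  # required version header
        "Content-Type": "application/json",
    },
    method="POST",
)
# urllib.request.urlopen(req)  # uncomment to actually send the request
```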
OpenAI
- Requires an active OpenAI API account.
- Provide your API key and the desired model identifier.
- Organization ID is optional but recommended if you belong to multiple organizations.
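The OpenAI API authenticates with a bearer token, and the optional organization ID goes in the `OpenAI-Organization` header. A minimal sketch (key, model, and org ID are placeholders):

```python
import json
import urllib.request

headers = {
    "Authorization": "Bearer <YOUR_OPENAI_KEY>",
    "Content-Type": "application/json",
}
# Optional, but recommended if your key belongs to multiple organizations:
headers["OpenAI-Organization"] = "<YOUR_ORG_ID>"  # placeholder

req = urllib.request.Request(
    "https://api.openai.com/v1/chat/completions",
    data=json.dumps({
        "model": "gpt-4o-mini",  # example model identifier
        "messages": [{"role": "user", "content": "ping"}],
    }).encode(),
    headers=headers,
    method="POST",
)
# urllib.request.urlopen(req)  # uncomment to actually send the request
```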
Monitoring and troubleshooting
- Scan logs record which LLM endpoint was used for analysis.
- If your endpoint is unreachable or returns errors, the scan will fail during the ANALYZE_METADATA phase with a descriptive error message.
- Rate limit errors from your provider will cause the scan to slow down (with automatic retry and backoff) rather than fail immediately.
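This is not aprity's actual retry code, but the rate-limit behavior described above follows a standard pattern: on an HTTP 429 response, wait with exponential backoff and retry rather than failing the scan immediately. A sketch of that pattern:

```python
import random
import time

# Illustrative sketch only -- NOT aprity's implementation. Retries a request
# on HTTP 429 (rate limit) with exponential backoff plus jitter; any other
# status is returned to the caller immediately.
def call_with_backoff(send_request, max_attempts=5, base_delay=1.0):
    for attempt in range(max_attempts):
        status, body = send_request()
        if status == 429:  # provider rate limit: back off, then retry
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.5)
            time.sleep(delay)
            continue
        return status, body
    raise RuntimeError("LLM endpoint still rate-limited after retries")
```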
Frequently asked questions
Can I switch providers after initial setup?
Yes. Contact support to update your endpoint profile. There is no downtime; the change takes effect on the next scan.
What happens if my LLM endpoint is down?
The scan will retry with backoff. If the endpoint remains unavailable, the scan will fail with a clear error message indicating the LLM endpoint could not be reached.
Does BYOLLM affect scan speed?
Scan speed depends on your LLM endpoint's latency and throughput. Endpoints with higher rate limits and lower latency will produce faster scans. The aprity team can help tune configuration for optimal performance.
Can I use different models for different scans?
Not currently. The endpoint profile is configured at the tenant level and applies to all scans. If you need multiple configurations, contact support to discuss options.