Commit: Merge remote-tracking branch 'origin/main' into az-cli-docs (13 changed files, 595 additions, 238 deletions)
---
title: Plural AI Architecture
description: How Plural AI Works
---

## Overview

At its core, Plural AI has three main components:

* A causal graph of the high-level objects that define your infrastructure. For example: a Plural Service owns a Kubernetes Deployment, which owns a ReplicaSet, which owns a set of Pods.
* A permission engine that ensures any set of objects within the graph is only interactable by authorized users of Plural's AI. This hardens the governance process around access to our AI's completions. The presence of Plural's agent in your Kubernetes fleet also makes querying end clusters much more secure from a networking perspective.
* Our PR automation framework, which lets us hook into SCM providers and automate code fixes in a reviewable, safe way.
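To make the causal graph concrete, here's a toy sketch of the ownership chain described above (illustrative only; the `Node` type and names are hypothetical, not Plural's actual schema):

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """One object in the causal graph, e.g. a Service or Deployment."""
    kind: str
    name: str
    children: list = field(default_factory=list)

    def owns(self, child):
        """Record an ownership edge and return the child for chaining."""
        self.children.append(child)
        return child

# Service -> Deployment -> ReplicaSet -> Pods, as in the example above
svc = Node("Service", "web")
deploy = svc.owns(Node("Deployment", "web"))
rs = deploy.owns(Node("ReplicaSet", "web-7d9f"))
pods = [rs.owns(Node("Pod", f"web-7d9f-{i}")) for i in range(3)]

def descendants(node):
    """Walk the ownership chain depth-first."""
    for child in node.children:
        yield child
        yield from descendants(child)

print([n.kind for n in descendants(svc)])
# → ['Deployment', 'ReplicaSet', 'Pod', 'Pod', 'Pod']
```

Walking this chain from a failing node is what lets the engine gather exactly the related objects it needs, rather than dumping the whole cluster state into a prompt.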
In the parlance of the AI industry, you can think of it as highly advanced RAG (retrieval-augmented generation) with agent-like behavior, since it's always on and triggered by any emergent issue in your infrastructure.

## In Detail

Here's a detailed walkthrough of how the AI engine works in the case of a Plural Service with a failing Kubernetes Deployment:

1. The engine is told the service is failing via our internal event bus.
2. The failing components of that service are collected, with the failing Deployment selected first.
3. The metadata of the service is added to the prompt (what cluster it's on, how it sources its configuration, etc.).
4. The events, ReplicaSets, and spec of the Deployment are queried and added to the prompt.
5. The failing Pods for the Deployment are selected from the ReplicaSets, and a random subset are queried individually.
6. Each failing Pod's events and spec are added to the growing prompt.
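The steps above can be sketched roughly as follows (a minimal illustration of the prompt-assembly flow; the function and field names are assumptions, not Plural's actual code):

```python
import json
import random

def build_prompt(service, deployment, replica_sets, max_pods=3):
    """Assemble a troubleshooting prompt following the numbered steps above."""
    parts = [
        # steps 1-3: the failure notice plus service metadata
        f"Service {service['name']} on cluster {service['cluster']} is failing.",
        f"Configuration source: {service['config_source']}",
        # step 4: the deployment's spec and events
        "Deployment spec: " + json.dumps(deployment["spec"]),
        "Deployment events: " + json.dumps(deployment["events"]),
    ]
    # step 5: collect failing pods across replica sets, sample a subset
    failing = [p for rs in replica_sets for p in rs["pods"] if not p["ready"]]
    for pod in random.sample(failing, min(max_pods, len(failing))):
        # step 6: each sampled pod's spec and events join the growing prompt
        parts.append(f"Pod {pod['name']} spec: " + json.dumps(pod["spec"]))
        parts.append(f"Pod {pod['name']} events: " + json.dumps(pod["events"]))
    return "\n".join(parts)

prompt = build_prompt(
    {"name": "web", "cluster": "prod", "config_source": "gitops repository"},
    {"spec": {"replicas": 3}, "events": ["BackOff: restarting failed container"]},
    [{"pods": [{"name": "web-0", "ready": False,
                "spec": {"image": "web:v2"},
                "events": ["OOMKilled"]}]}],
)
```

Sampling a bounded subset of failing Pods keeps the prompt size (and therefore inference cost) roughly constant even when a Deployment has hundreds of crashing replicas.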
This then crafts an insight for the Deployment node, which can be combined with insights from any other components to roll up into a service-level insight.

If this investigation were run again, we'd be able to reuse any non-stale cached insights and avoid rerunning inference where it's unnecessary.
---
title: Plural AI Cost Analysis
description: How much will Plural AI cost me?
---

Plural AI is built to be extremely cost-efficient. We've found it often heavily outcompetes the spend on advanced APM tools, and likely even DIY Prometheus setups. That said, AI inference is not cheap in general, and we do a number of things to work around that:

* Our causal knowledge graph caches heavily at each layer of the tree. This ensures repeated attempts to generate the same insight are deduplicated, reducing inference API calls dramatically.
* You can split the model used by use case. Insight generation can leverage cheap, fast models, whereas the tool calls that ultimately generate PRs use smarter, more advanced models but are executed less frequently, so the cost isn't felt as hard.
* We use AI sparingly. Inference is only done when we know something is wrong.
So what does that actually mean in practice?

## Basic Cost Analysis

We at Plural dogfood our own AI functionality in our own infrastructure. This includes a sandbox test fleet of over 10 clusters and a production fleet of around 5 clusters for both our main services and Plural Cloud. Plural's AI engine has run on the management clusters for each of these domains since launch, and while we might do a decent-ish job of caretaking those environments, our current OpenAI bill is ~$2.64 per day, or roughly $81 per month.
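As a quick sanity check on that arithmetic (using an average month length of 365/12 days):

```python
daily = 2.64                 # observed daily OpenAI spend in USD
monthly = daily * 365 / 12   # average month is ~30.4 days
print(f"${monthly:.2f} per month")  # → $80.30 per month
```

which lines up with the roughly-$81 figure above.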
This is staggeringly cost-effective when you consider that a Datadog bill for our equivalent infrastructure would be at minimum $10k, and even a Prometheus setup is well over $100/mo for the necessary compute, including the datastore, Grafana, Grafana's database, load balancers, and agents. Granted, some of these services will ultimately be necessary for Plural AI to reach its full potential, but we could see a world where:

```sh
OpenTelemetry + Plural AI >> Datadog/New Relic
```

as a general debugging platform, while being a minuscule fraction of the current cost.
---
title: Plural AI
description: Plural's AI Engine Removes the Gruntwork from Infrastructure
---

{% callout severity="info" %}
If you just want to skip the text and see it in action, skip to the [demo video](/ai/overview#demo-video)
{% /callout %}

Managing infrastructure is full of mind-numbing tasks, from troubleshooting the same misconfiguration for the hundredth time, to whack-a-moling Datadog alerts, to playing internal IT support for application developers who can't be bothered to learn the basics of foundational technology like Kubernetes. Plural AI allows you to outsource all those time sinks to LLMs so you can focus on building value-added platforms for your enterprise.

In particular, Plural AI has a few differentiators in its approach:

* A bring-your-own-LLM model - lets you use the LLM already approved by your enterprise, without worrying about us as a man-in-the-middle.
* An always-on troubleshooting engine - takes signals from failed Kubernetes services, failed Terraform runs, and other misfires in your infrastructure to run a consistent investigative process and summarize the results. Eliminate manual digging and just fix the issue instead.
* Automated fixes - take any insight from our troubleshooting engine and generate a fix PR automatically, powered by our ability to introspect the GitOps code defining that piece of infrastructure.
* AI explanation - complex or domain-specific pages can be explained with one click, eliminating internal support burdens for engineers.
* AI chat - any workflow above can be further refined or expanded in a full ChatGPT-like experience. Paste additional context into chats automatically, or generate PRs once you and the AI have found the fix.

# Demo Video

To see this all in action, feel free to browse our live demo video on YouTube of our GenAI integration:

{% embed url="https://youtu.be/LxurfPikps8" aspectRatio="16 / 9" /%}
---
title: Setup Plural AI
description: How to configure Plural AI
---

Plural AI can easily be configured via the `DeploymentSettings` CRD or at `/settings/global/ai-provider` in your Plural Console instance. An example `DeploymentSettings` config is below:

```yaml
apiVersion: deployments.plural.sh/v1alpha1
kind: DeploymentSettings
metadata:
  name: global
  namespace: plrl-deploy-operator
spec:
  managementRepo: pluralsh/plrl-boot-aws

  ai:
    enabled: true
    provider: OPENAI
    anthropic: # example anthropic config
      model: claude-3-5-sonnet-latest
      tokenSecretRef:
        name: ai-config
        key: anthropic

    openAI: # example openai config
      tokenSecretRef:
        name: ai-config
        key: openai

    vertex: # example VertexAI config
      project: pluralsh-test-384515
      location: us-east1
      model: gemini-1.5-pro-002
      serviceAccountJsonSecretRef:
        name: ai-config
        key: vertex
```

You can see the full schema in our [Operator API Reference](/deployments/operator/api#deploymentsettings).

In all these cases, you need to create an additional secret in the `plrl-deploy-operator` namespace to hold API keys and auth secrets. It would look something like this:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: ai-config
  namespace: plrl-deploy-operator
stringData:
  vertex: <service account json string>
  openai: <access-token>
  anthropic: <access-token>
```
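Alternatively, you can create the same secret imperatively so the tokens never land in a file at all (a sketch; this assumes `kubectl` is pointed at your management cluster, the environment variables hold your API keys, and only the providers you actually use need keys):

```sh
kubectl create secret generic ai-config \
  --namespace plrl-deploy-operator \
  --from-literal=openai="$OPENAI_API_KEY" \
  --from-literal=anthropic="$ANTHROPIC_API_KEY" \
  --from-file=vertex=./vertex-service-account.json  # GCP service account JSON
```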
{% callout severity="warn" %}
Be sure not to commit this Secret resource into your Git repository in plain text, as that would result in a Git secret exposure.

Plural provides a number of mechanisms to manage secrets, or you can use the established patterns within your engineering organization.
{% /callout %}
---
title: Contribution Program
description: Contributing to Plural's Service Catalog
---

We run a continuous contributor program to help maintain our catalog with the community. The bounties for the various tasks are as follows:

* $100 for an application update (note that many applications should auto-update)
* $250 for an application onboarding

To qualify for a bounty, submit a PR to https://github.com/pluralsh/scaffolds.git; once it's been approved and merged, DM a member of the Plural staff on Discord to receive your payout.
---
title: Creating Your Own Catalog
description: Defining your own service catalogs with Plural
---

## Overview

{% callout severity="info" %}
TLDR: skip to [Examples](/catalog/creation#examples) for a link to the GitHub repository with our full default catalog for working examples
{% /callout %}

Plural service catalogs are ultimately driven by two Kubernetes custom resources: `Catalog` and `PrAutomation`. Here are examples of both:

```yaml
apiVersion: deployments.plural.sh/v1alpha1
kind: Catalog
metadata:
  name: data-engineering
spec:
  name: data-engineering
  category: data
  icon: https://docs.plural.sh/favicon-128.png
  author: Plural
  description: |
    Sets up OSS data infrastructure using Plural
  bindings:
    create:
    - groupName: developers # controls who can spawn PRs from this catalog
```

```yaml
apiVersion: deployments.plural.sh/v1alpha1
kind: PrAutomation
metadata:
  name: airbyte
spec:
  name: airbyte
  icon: https://plural-assets.s3.us-east-2.amazonaws.com/uploads/repos/d79a69b7-dfcd-480a-a51d-518865fd6e7c/airbyte.png
  identifier: mgmt
  documentation: |
    Sets up an airbyte instance for a given cloud
  creates:
    git:
      ref: sebastian/prod-2981-set-up-catalog-pipeline # TODO set to main
      folder: catalogs/data/airbyte
    templates:
    - source: helm
      destination: helm/airbyte/{{ context.cluster }}
      external: true
    - source: services/oauth-proxy-ingress.yaml.liquid
      destination: services/apps/airbyte/oauth-proxy-ingress.yaml.liquid
      external: true
    - source: "terraform/{{ context.cloud }}"
      destination: "terraform/apps/airbyte/{{ context.cluster }}"
      external: true
    - source: airbyte-raw-servicedeployment.yaml
      destination: "bootstrap/apps/airbyte/{{ context.cluster }}/airbyte-raw-servicedeployment.yaml"
      external: true
    - source: airbyte-servicedeployment.yaml
      destination: "bootstrap/apps/airbyte/{{ context.cluster }}/airbyte-servicedeployment.yaml"
      external: true
    - source: airbyte-stack.yaml
      destination: "bootstrap/apps/airbyte/{{ context.cluster }}/airbyte-stack.yaml"
      external: true
    - source: oauth-proxy-config-servicedeployment.yaml
      destination: "bootstrap/apps/airbyte/{{ context.cluster }}/oauth-proxy-config-servicedeployment.yaml"
      external: true
    - source: README.md
      destination: documentation/airbyte/README.md
      external: true
  repositoryRef:
    name: scaffolds
  catalogRef: # <-- NOTE this references the Catalog CRD
    name: data-engineering
  scmConnectionRef:
    name: plural
  title: "Setting up airbyte on cluster {{ context.cluster }} for {{ context.cloud }}"
  message: |
    Set up airbyte on {{ context.cluster }} ({{ context.cloud }})

    Will set up an airbyte deployment, including object storage and postgres setup
  configuration:
  - name: cluster
    type: STRING
    documentation: Handle of the cluster you want to deploy airbyte to.
  - name: stackCluster
    type: STRING
    documentation: Handle of the cluster used to run Infrastructure Stacks for provisioning the infrastructure. Defaults to the management cluster.
    default: mgmt
  - name: cloud
    type: ENUM
    documentation: Cloud provider you want to deploy airbyte to.
    values:
    - aws
  - name: bucket
    type: STRING
    documentation: The name of the S3/GCS/Azure Blob bucket you'll use for airbyte logs. This must be globally unique.
  - name: hostname
    type: STRING
    documentation: The DNS name you'll host airbyte under.
  - name: region
    type: STRING
    documentation: The cloud provider region you're going to use to deploy cloud resources.
```
A catalog is a container for many `PrAutomation`s, which themselves control the code generation that accomplishes the provisioning task being implemented. In this case, we're provisioning [Airbyte](https://airbyte.com/). The real work is done in the referenced templates.
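To give a feel for what one of those templates might contain, here's a hypothetical sketch of a templated manifest (illustrative only; the real files live in the `pluralsh/scaffolds` repo, and the field values here are assumptions). Each `{{ context.* }}` placeholder is filled from the `configuration` section when the PR is generated:

```yaml
# Hypothetical template sketch -- see pluralsh/scaffolds for the real files.
apiVersion: deployments.plural.sh/v1alpha1
kind: ServiceDeployment
metadata:
  name: airbyte
spec:
  namespace: airbyte
  git:
    # points at the folder the helm template rendered into (see `destination` above)
    folder: helm/airbyte/{{ context.cluster }}
    ref: main
  clusterRef:
    kind: Cluster
    name: {{ context.cluster }}
```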
## Examples

The best way to get inspiration for writing your own templates is to look through some examples, which is why we've made our default service catalog open source. You can browse it here:

https://github.com/pluralsh/scaffolds/tree/main/setup/catalogs
---
title: Service Catalog
description: Enterprise Grade Self-Service with Plural
---

{% callout severity="info" %}
If you just want to skip the text and see it in action, skip to the [demo video](/catalog/overview#demo-video)
{% /callout %}

Plural provides a full-stack GitOps platform for provisioning resources with both IaC frameworks like Terraform and Kubernetes manifests like Helm and Kustomize. This alone is very powerful, but most enterprises want to go a step beyond and implement full self-service. This provides two main benefits:

* Reduction of manual toil and error in repeatable infrastructure provisioning paths
* Compliance with enterprise cybersecurity and reliability standards in the creation of new infrastructure, e.g. the creation of "golden paths"

Plural accomplishes this via our Catalog feature, which allows [PR Automations](/deployments/pr-automation) to be bundled according to common infrastructure provisioning use cases. We like the code generation approach for a number of reasons:

* Clear tie-in with the established review-and-approval mechanisms of the PR process
* Great customizability throughout the lifecycle
* Generality - in theory, any infrastructure provisioning task can be represented as some Terraform + GitOps code

# Demo Video

To see this all in action provisioning a relatively complex application, [Dagster](https://dagster.io/), feel free to browse our live demo video on YouTube:

{% embed url="https://youtu.be/5D6myZ7sm2k" aspectRatio="16 / 9" /%}