fix(llm): reviews 25/11 (#4067)
ldecarvalho-doc authored Dec 3, 2024
1 parent e3d32e8 commit 0c2cb60
Showing 6 changed files with 12 additions and 12 deletions.
6 changes: 3 additions & 3 deletions ai-data/managed-inference/reference-content/llama-3-70b-instruct.mdx
@@ -7,7 +7,7 @@ content:
paragraph: This page provides information on the Llama-3-70b-instruct model
tags:
dates:
-validation: 2024-05-28
+validation: 2024-12-03
posted: 2024-05-28
categories:
- ai-data
@@ -34,7 +34,7 @@ meta/llama-3-70b-instruct:fp8
## Model introduction

Meta’s Llama 3 is an iteration of the open-access Llama family.
-Llama 3 was designed to match the best proprietary models, enhanced by community feedback for greater utility and responsibly spearheading the deployment of LLMs.
+Llama 3 was designed to match the best proprietary models, enhanced by community feedback for greater utility and responsibly spearheading the deployment of LLMs.
With a commitment to open-source principles, this release marks the beginning of a multilingual, multimodal future for Llama 3, pushing the boundaries in reasoning and coding capabilities.

## Why is it useful?
@@ -77,7 +77,7 @@ Make sure to replace `<IAM API key>` and `<Deployment UUID>` with your actual [I

### Receiving Inference responses

-Upon sending the HTTP request to the public or private endpoints exposed by the server, you will receive inference responses from the Managed Inference server.
+Upon sending the HTTP request to the public or private endpoints exposed by the server, you will receive inference responses from the Managed Inference server.
Process the output data according to your application's needs. The response will contain the output generated by the LLM model based on the input provided in the request.
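
To make this request/response cycle concrete, here is a minimal Python sketch. It assumes the deployment exposes an OpenAI-compatible `/v1/chat/completions` route, and the endpoint URL shape is illustrative only: substitute the endpoint and model name shown in your deployment's details.

```python
# Minimal sketch of sending a chat request to a Managed Inference deployment
# and reading the generated text. Assumptions: an OpenAI-compatible route and
# an illustrative URL shape. Replace both placeholders with your own values.
import requests

ENDPOINT = "https://<Deployment UUID>.ifr.fr-par.scaleway.com/v1/chat/completions"  # illustrative URL shape
headers = {
    "Authorization": "Bearer <IAM API key>",
    "Content-Type": "application/json",
}
payload = {
    "model": "meta/llama-3-70b-instruct:fp8",
    "messages": [{"role": "user", "content": "Summarize what Managed Inference does."}],
    "max_tokens": 200,
}

response = requests.post(ENDPOINT, headers=headers, json=payload, timeout=60)
response.raise_for_status()  # surface HTTP errors early
# In the OpenAI-compatible schema, the generated text is in the first choice.
print(response.json()["choices"][0]["message"]["content"])
```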

<Message type="note">
10 changes: 5 additions & 5 deletions ai-data/managed-inference/reference-content/sentence-t5-xxl.mdx
@@ -7,7 +7,7 @@ content:
paragraph: This page provides information on the Sentence-t5-xxl embedding model
tags: embedding
dates:
-validation: 2024-05-22
+validation: 2024-12-03
posted: 2024-05-22
categories:
- ai-data
@@ -31,12 +31,12 @@ sentence-transformers/sentence-t5-xxl:fp32

| Instance type | Max context length |
| ------------- | ------------------ |
-| L4 | 512 (FP32) |
+| L4 | 512 (FP32) |

## Model introduction

-The Sentence-T5-XXL model represents a significant evolution in sentence embeddings, building on the robust foundation of the Text-To-Text Transfer Transformer (T5) architecture.
-Designed for performance in various language processing tasks, Sentence-T5-XXL leverages the strengths of T5's encoder-decoder structure to generate high-dimensional vectors that encapsulate rich semantic information.
+The Sentence-T5-XXL model represents a significant evolution in sentence embeddings, building on the robust foundation of the Text-To-Text Transfer Transformer (T5) architecture.
+Designed for performance in various language processing tasks, Sentence-T5-XXL leverages the strengths of T5's encoder-decoder structure to generate high-dimensional vectors that encapsulate rich semantic information.
This model has been meticulously tuned for tasks such as text classification, semantic similarity, and clustering, making it a useful tool in the RAG (Retrieval-Augmented Generation) framework. It excels in sentence similarity tasks, but its performance in semantic search tasks is less optimal.
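
As a brief illustration of the sentence-similarity use case, the sketch below compares two embedding vectors with cosine similarity. The vectors are hypothetical stand-ins for output returned by the model.

```python
# Cosine similarity between two embedding vectors, the core operation behind
# the sentence-similarity and clustering tasks mentioned above.
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # 1.0 means identical direction; values near 0 mean unrelated sentences.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Stand-in vectors for illustration; real ones come from the model.
emb_a = np.array([0.12, -0.03, 0.44, 0.08])
emb_b = np.array([0.10, -0.01, 0.40, 0.07])
print(f"similarity: {cosine_similarity(emb_a, emb_b):.3f}")
```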

## Why is it useful?
@@ -66,5 +66,5 @@ Make sure to replace `<IAM API key>` and `<Deployment UUID>` with your actual [I

### Receiving Inference responses

-Upon sending the HTTP request to the public or private endpoints exposed by the server, you will receive inference responses from the Managed Inference server.
+Upon sending the HTTP request to the public or private endpoints exposed by the server, you will receive inference responses from the Managed Inference server.
Process the output data according to your application's needs. The response will contain the output generated by the embedding model based on the input provided in the request.
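
As with the LLM example above, a minimal Python sketch of retrieving an embedding follows. The OpenAI-compatible `/v1/embeddings` route and the URL shape are assumptions for illustration; use the endpoint shown in your deployment's details.

```python
# Minimal sketch of requesting an embedding from a Managed Inference
# deployment. Assumptions: an OpenAI-compatible embeddings route and an
# illustrative URL shape. Replace both placeholders with your own values.
import requests

ENDPOINT = "https://<Deployment UUID>.ifr.fr-par.scaleway.com/v1/embeddings"  # illustrative URL shape
headers = {
    "Authorization": "Bearer <IAM API key>",
    "Content-Type": "application/json",
}
payload = {
    "model": "sentence-transformers/sentence-t5-xxl:fp32",
    "input": "Managed Inference serves models on dedicated infrastructure.",
}

response = requests.post(ENDPOINT, headers=headers, json=payload, timeout=60)
response.raise_for_status()
# In the OpenAI-compatible schema, each input maps to one vector in "data".
embedding = response.json()["data"][0]["embedding"]
print(f"dimensions: {len(embedding)}, first values: {embedding[:3]}")
```
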
2 changes: 1 addition & 1 deletion containers/kubernetes/how-to/use-nvidia-gpu-operator.mdx
@@ -7,7 +7,7 @@ content:
paragraph: This page explains how to use the NVIDIA GPU operator on Kapsule and Kosmos with GPU Instances
tags: kubernetes kubernetes-kapsule kapsule cluster gpu-operator nvidia gpu
dates:
-validation: 2024-05-22
+validation: 2024-12-03
posted: 2023-07-18
categories:
- containers
2 changes: 1 addition & 1 deletion dedibox-network/ipv6/how-to/configure-ipv6-windows.mdx
@@ -7,7 +7,7 @@ content:
paragraph: This page explains how to configure an IPv6 subnet on a Dedibox running Windows Server.
tags: dedibox ipv6 windows subnet
dates:
-validation: 2024-05-20
+validation: 2024-12-03
posted: 2021-08-03
categories:
- dedibox-network
2 changes: 1 addition & 1 deletion dedibox-network/ipv6/how-to/enable-ipv6-slaac.mdx
@@ -7,7 +7,7 @@ content:
paragraph: This page explains how to enable IPv6 SLAAC on Dedibox servers.
tags: dedibox slaac ipv6
dates:
-validation: 2024-05-20
+validation: 2024-12-03
posted: 2021-08-03
categories:
- dedibox-network
2 changes: 1 addition & 1 deletion dedibox-network/ipv6/how-to/request-prefix.mdx
@@ -7,7 +7,7 @@ content:
paragraph: This page explains how to request a free /48 IPv6 prefix for Dedibox servers.
tags: dedibox ipv6 prefix
dates:
-validation: 2024-05-20
+validation: 2024-12-03
posted: 2021-08-03
categories:
- dedibox-network
