docs: Adding language recommendation #4266

Merged
merged 6 commits into from
Jun 15, 2024
Changes from 5 commits
4 changes: 4 additions & 0 deletions docs/getting-started/architecture-and-components/README.md
@@ -1,5 +1,9 @@
# Architecture

{% content-ref url="language.md" %}
[language.md](language.md)
{% endcontent-ref %}

{% content-ref url="overview.md" %}
[overview.md](overview.md)
{% endcontent-ref %}
46 changes: 46 additions & 0 deletions docs/getting-started/architecture-and-components/language.md
@@ -0,0 +1,46 @@
# Python: The Language of Production Machine Learning

Use Python to serve your features online.


## Why should you use Python to serve features for Machine Learning?
Python has emerged as the primary language for machine learning, and this extends to feature serving. There are five main reasons Feast recommends running your feature-serving microservice in Python.

## 1. Python is the language of Machine Learning

You should meet your users where they are. Python’s popularity in the machine learning community is undeniable. Its simplicity and readability make it an ideal language for writing and understanding complex algorithms. Python boasts a rich ecosystem of libraries such as TensorFlow, PyTorch, XGBoost, and scikit-learn, which provide robust support for developing and deploying machine learning models, and we want Feast to be part of this ecosystem.

## 2. Precomputation is The Way

Precomputing features is the recommended path to ensure low-latency performance. Reducing feature serving to a lightweight database lookup is the ideal pattern, which means the marginal overhead of Python should be tolerable. Precomputation also keeps product experiences for downstream services fast, and slow user experiences are bad user experiences. Precompute and persist data as much as you can.
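
The pattern above can be illustrated with a minimal sketch. It assumes a plain in-memory dictionary as a stand-in for the online store, and the names `ONLINE_STORE`, `precompute_features`, and `get_online_features` are hypothetical and not part of the Feast API. The point is only that the expensive work happens ahead of time, so the serving path is a key lookup.

```python
from typing import Dict, List

# Hypothetical in-memory stand-in for an online store (e.g. a key-value database).
ONLINE_STORE: Dict[str, Dict[str, float]] = {}


def precompute_features(raw_orders: Dict[str, List[float]]) -> None:
    """Batch step: compute features ahead of time and persist them."""
    for user_id, amounts in raw_orders.items():
        ONLINE_STORE[user_id] = {
            "order_count": float(len(amounts)),
            "avg_order_amount": sum(amounts) / len(amounts) if amounts else 0.0,
        }


def get_online_features(user_id: str) -> Dict[str, float]:
    """Serving step: only a key lookup, with no feature computation."""
    return ONLINE_STORE.get(user_id, {})


precompute_features({"user_1": [10.0, 30.0], "user_2": []})
features = get_online_features("user_1")
```

In a real deployment the dictionary would be replaced by whichever online store you have configured, but the division of labor stays the same.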

## 3. Serving features in another language can lead to skew
Ensuring that the features used during model training (offline serving) are the same features available online in production for real-time predictions is critical. When features are initially developed, they are typically written in Python because of the convenience and efficiency of Python's data manipulation libraries. In a production environment, however, there is often interest or pressure to rewrite these features in a different language, such as Java, Go, or C++, for performance reasons. This reimplementation introduces a significant risk: training and serving skew. There will always be some minor exceptions (e.g., *Time Since Last Event* types of features), but these should not be the rule.

Training and serving skew occurs when there are discrepancies between the features used during model training and those used during prediction. This can lead to degraded model performance, unreliable predictions, and reduced velocity in releasing new features and new models. The process of rewriting features in another language is prone to errors and inconsistencies, which exacerbate this issue.
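
One way to picture how a single Python implementation avoids skew is the sketch below. It is an illustration under stated assumptions, not Feast's implementation: the function names and the two features are hypothetical. Both the offline path and the online path call the same `transform` function, so there is no second implementation that could drift.

```python
import math
from typing import Dict, List


def transform(row: Dict[str, float]) -> Dict[str, float]:
    """Single source of truth for the feature logic (hypothetical features)."""
    return {
        "log_amount": math.log1p(row["amount"]),
        "is_large_order": float(row["amount"] > 100.0),
    }


def build_training_rows(history: List[Dict[str, float]]) -> List[Dict[str, float]]:
    """Offline path: apply the transformation to historical rows."""
    return [transform(row) for row in history]


def serve_request(request: Dict[str, float]) -> Dict[str, float]:
    """Online path: apply the exact same function to one incoming request."""
    return transform(request)


history = [{"amount": 20.0}, {"amount": 250.0}]
training_rows = build_training_rows(history)
online_row = serve_request({"amount": 250.0})
```

If `transform` were rewritten in another language for serving, every later change would have to be made twice and verified to produce identical values.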

## 4. Reimplementation is Excessive

Rewriting features in another language is not only risky but also resource-intensive. It requires significant time and effort from engineers to ensure that the features are correctly translated. This process can introduce bugs and inconsistencies, further increasing the risk of training and serving skew. Additionally, maintaining two versions of the same feature codebase adds unnecessary complexity and overhead. More importantly, the opportunity cost of this work is high, since it requires twice the resourcing. Reimplementing code should only be done when the performance gains are worth the investment. Because features should largely be precomputed, latency gains from a rewrite are unlikely to be the highest-impact work your team can do.

## 5. Use existing Python Optimizations

Rather than switching languages, it is more efficient to optimize the performance of your feature store while keeping Python as the primary language. Optimization is a two-step process.

### Step 1: Quantify latency bottlenecks in your feature calculations

Use tools like [cProfile](https://docs.python.org/3/library/profile.html) to understand latency bottlenecks in your code. This will help you prioritize the biggest inefficiencies first. When we initially launched Python native transformations, [profiling the code](https://github.com/feast-dev/feast/issues/4207#issuecomment-2155754504) helped us identify that Pandas resulted in a 10x overhead due to type conversion.
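
As a rough illustration of this step, the sketch below profiles a hypothetical feature calculation with the standard-library `cProfile` and `pstats` modules. The function `slow_feature` and its deliberately wasteful type conversion are invented for the example and are not taken from the Feast codebase.

```python
import cProfile
import io
import pstats


def slow_feature(values):
    """Hypothetical feature calculation with a deliberately inefficient loop."""
    total = 0.0
    for v in values:
        total += float(str(v))  # needless type conversion on every element
    return total / len(values)


profiler = cProfile.Profile()
profiler.enable()
result = slow_feature(list(range(100_000)))
profiler.disable()

# Collect the report as a string, sorted by cumulative time, top 5 entries.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats(pstats.SortKey.CUMULATIVE).print_stats(5)
report = stream.getvalue()
print(report)
```

Sorting by cumulative time tends to surface the functions where a request spends most of its wall-clock time, which is usually the right place to start.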

### Step 2: Optimize your feature calculations

As mentioned, precomputation is the recommended path. In some cases, you may want fully synchronous writes from your data producer to your online feature store, in which case you will want your feature computations and writes to be very fast. In this case, we recommend optimizing the feature calculation code first.

You should optimize your code using libraries, tools, and caching. For example, identify whether your feature calculations can be optimized through vectorized calculations in NumPy; explore tools like Numba for faster execution; and cache frequently accessed data using tools like `functools.lru_cache`.
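
The caching suggestion can be sketched with the standard-library `functools.lru_cache` decorator. The function `merchant_risk_score` and the `CALLS` counter are hypothetical and exist only to show that repeated requests for the same key skip the expensive work.

```python
from functools import lru_cache

CALLS = {"count": 0}  # counts how often the expensive lookup really runs


@lru_cache(maxsize=1024)
def merchant_risk_score(merchant_id: str) -> float:
    """Hypothetical expensive lookup, e.g. a slow query or heavy computation."""
    CALLS["count"] += 1
    return (sum(ord(c) for c in merchant_id) % 100) / 100.0


# Four requests, but only two distinct keys, so the body runs twice.
scores = [merchant_risk_score(m) for m in ["m1", "m2", "m1", "m1"]]
info = merchant_risk_score.cache_info()
```

Cached values can go stale, so this approach fits data that changes slowly relative to how often it is read.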

Lastly, Feast will continue to optimize serving in Python and make the overall infrastructure more performant. This will better serve the community.

So we recommend focusing on optimizing your feature-specific code, reporting latency bottlenecks to the maintainers, and contributing to help the infrastructure be more performant.

By keeping features in Python and optimizing performance, you can ensure consistency between training and serving, reduce the risk of errors, and focus on launching more product experiences for your customers.

Embrace Python for feature serving, and leverage its strengths to build robust and reliable machine learning systems.
@@ -30,5 +30,9 @@ A complete Feast deployment contains the following components:
* **Offline Store:** The offline store persists batch data that has been ingested into Feast. This data is used for producing training datasets. For feature retrieval and materialization, Feast does not manage the offline store directly, but runs queries against it. However, offline stores can be configured to support writes if Feast configures logging functionality of served features.

{% hint style="info" %}
Java and Go Clients are also available for online feature retrieval.

In general, we recommend [using Python](language.md) for your Feature Store microservice.
Contributor

The problem I see with this recommendation is online feature retrieval latency. Python has higher latency than the Go or Java options. Do you think it would be better to mention the latency impact?

Member Author

I tried to address this point in this statement:

Precomputing features is the recommended optimal path to ensure low latency performance. Reducing feature serving to a lightweight database lookup is the ideal pattern, which means the marginal overhead of Python should be tolerable.

But I can be more explicit.

Collaborator

That definitely should be a factor. Even if you precompute, there will be applications out there with low-latency requirements and high enough load for which Python server performance itself might become a bottleneck. I guess we are sort of trying to address that by introducing asyncio in Python online retrieval, but even that might not be enough for some use cases.

Member Author

While that is true in theory, in practice Python works very well at quite high scale so my goal is to make it clear that we recommend Python.

Regardless, I see that this is in the overview section so I'll add this snippet in it.

franciscojavierarceo marked this conversation as resolved.

As mentioned in the document, precomputing features is the recommended optimal path to ensure low latency performance. Reducing feature serving to a lightweight database lookup is the ideal pattern, which means the marginal overhead of Python should be tolerable.
{% endhint %}
3 changes: 3 additions & 0 deletions docs/getting-started/faq.md
@@ -8,6 +8,9 @@ We encourage you to ask questions on [GitHub](https://github.com/feast-dev/feast

## Getting started

### Which programming language should I use to run Feast in a microservice architecture?
[We recommend Python](language.md).

### Do you have any examples of how Feast should be used?

The [quickstart](quickstart.md) is the easiest way to learn about Feast. For more detailed tutorials, please check out the [tutorials](../tutorials/tutorials-overview/) page.