Method to retrieve jwt token + stream logs from a Space #2667

Open
Wauplin opened this issue Nov 19, 2024 · 2 comments
Labels
enhancement New feature or request

Comments

@Wauplin
Contributor

Wauplin commented Nov 19, 2024

(started working on it while helping @abhishekkrthakur on autotrain-advanced)

Users can stream Space logs from the UI. We could introduce a method to do the same from a script, using requests with stream=True. Note: authentication is done with a JWT instead of a user auth token. Retrieving a JWT and streaming logs are two completely different topics, but we need the former to implement the latter. Here is a draft of how to do that:

import json
from typing import Literal

from huggingface_hub import constants
from huggingface_hub.utils import build_hf_headers, get_session, hf_raise_for_status


def get_space_logs_sse(space_id: str, level: Literal["build", "run"] = "run"):
    # fetch a JWT token to access the API
    jwt_url = f"{constants.ENDPOINT}/api/spaces/{space_id}/jwt"
    response = get_session().get(jwt_url, headers=build_hf_headers())
    hf_raise_for_status(response)
    jwt_token = response.json()["token"]  # works for 24h (see "exp" field)

    # fetch the logs
    logs_url = f"https://api.hf.space/v1/{space_id}/logs/{level}"

    with get_session().get(logs_url, headers=build_hf_headers(token=jwt_token), stream=True) as response:
        hf_raise_for_status(response)
        for line in response.iter_lines():
            if not line.startswith(b"data: "):
                continue
            line_data = line[len(b"data: "):]
            try:
                event = json.loads(line_data.decode())
            except json.JSONDecodeError:
                continue  # ignore payloads that are not valid JSON
            print(event["timestamp"], event["data"])

get_space_logs_sse("Wauplin/space_to_dataset_saver")
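The parsing half of the draft can be tried without any network access. Below is a minimal sketch, assuming the log stream keeps the shape used above (lines prefixed with `data: ` carrying a JSON object). The helper name `iter_sse_events` and the sample lines are hypothetical and not part of `huggingface_hub`.

```python
import json
from typing import Any, Dict, Iterable, Iterator


def iter_sse_events(lines: Iterable[bytes]) -> Iterator[Dict[str, Any]]:
    """Yield the JSON object carried by each `data: ` line, skipping everything else.

    Hypothetical helper: the name and behavior are illustrative only.
    """
    prefix = b"data: "
    for line in lines:
        if not line.startswith(prefix):
            # Blank separator lines and comments such as b": keep-alive" end up here.
            continue
        try:
            event = json.loads(line[len(prefix):].decode())
        except (UnicodeDecodeError, json.JSONDecodeError):
            continue  # payload is not valid UTF-8 or not valid JSON
        if isinstance(event, dict):
            yield event


# Made-up sample lines in the shape the draft expects (not real Space logs).
sample = [
    b": keep-alive",
    b"",
    b'data: {"timestamp": "2024-11-19T10:00:00Z", "data": "Starting app"}',
    b"data: not-json",
]
events = list(iter_sse_events(sample))
for event in events:
    print(event["timestamp"], event["data"])
```

With a helper like this, the loop in the draft would reduce to iterating over `iter_sse_events(response.iter_lines())`.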

I'm not sure we want to officially support this feature, though. I'm putting this snippet here in case it helps people. Let's implement it if we see enough interest in it. Generating a JWT and streaming logs can be done in two separate PRs.
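For the JWT side, a caller would probably want to know when to request a fresh token. Below is a minimal sketch, assuming the token is a standard three-part JWT (`header.payload.signature`) whose payload carries the "exp" field mentioned in the draft. It only decodes the payload and does not verify the signature. The helper names and the fake token are hypothetical.

```python
import base64
import json
import time
from typing import Any, Dict, Optional


def decode_jwt_payload(token: str) -> Dict[str, Any]:
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a three-part JWT (header.payload.signature)")
    payload_b64 = parts[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore stripped base64url padding
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def seconds_until_expiry(token: str, now: Optional[float] = None) -> float:
    """Return how many seconds remain before the token's `exp` timestamp."""
    exp = decode_jwt_payload(token)["exp"]
    return exp - (time.time() if now is None else now)


def _b64url(data: Dict[str, Any]) -> str:
    """Encode a dict as an unpadded base64url JSON segment (demo only)."""
    raw = json.dumps(data).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


# Fake, unsigned token built locally for illustration; not a real Space JWT.
fake_token = ".".join([_b64url({"alg": "none"}), _b64url({"exp": 1_700_086_400}), "signature"])
remaining = seconds_until_expiry(fake_token, now=1_700_000_000)
print(remaining)
```

A check like this could decide whether a cached token is still usable before opening a new log stream.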

@Wauplin Wauplin added the enhancement New feature or request label Nov 19, 2024
@abhishekkrthakur
Member

works well for me, thanks!
here's the autotrain code: https://github.com/huggingface/autotrain-advanced/blob/main/src/autotrain/app/api_routes.py#L748

@WizKnight
Contributor

Hi @Wauplin :),
Just wanted to follow up on this feature and see if you still need any assistance.

And if you're all set with that, I'm also interested in contributing further. Are there any other existing issues or features that I could help with?
I'm eager to get involved and contribute.

Thanks
