v3 rewrite #371
Comments
Wayland Support
This would require
Implementations readily available at https://github.com/H-M-H/Weylus, https://github.com/pavlobu/deskreen, and of course https://github.com/LizardByte/Sunshine.
Another alternative is https://github.com/games-on-whales/gst-wayland-display by @ABeltramo and @Drakulix. This is by using nested compositors. However, this would likely not work in conjunction with real monitors.
PipeWire as a hub for all media (Wayland video capture, V4L2 webcam stream dispensation, and PulseAudio drop-in replacement)
Switch to PipeWire for the containers and accept PipeWire audio (for either X11 or Wayland) as well as screen capture (for Wayland) directly as well as PulseAudio (stop-gap solution would be pipewire-pulse).
Moreover, an interesting capability PipeWire has is its potential to replace v4l2loopback with
It's possible to compile the PipeWire GStreamer plugin together with GStreamer if it is included as a subproject in meson.build. Else, use the pipewire-debian PPA.
I've heard from @totaam that using WebSockets (or any other TCP-based transport) for the stream itself, not only for signaling, works until there's packet loss. When there's packet loss, there would be visible defects in the stream.
QUIC / WebTransport + WebCodecs & WebAudio
For QUIC, it should (obviously) be over HTTP/3 WebTransport instead of a custom QUIC protocol to be compatible with web browsers. Even if we're talking about native clients, HTTP/3 WebTransport would offer no disadvantages.
Note that https://developer.mozilla.org/en-US/docs/Web/API/WebTransport/WebTransport#servercertificatehashes needs to be generated for self-signed certificates at every session start, as the maximum validity period of such a self-signed certificate is 14 days.
It is worth noting that because WebTransport lacks several capabilities (including the availability of reverse proxies), the technology might need time to mature.
WebCodecs should be used for decoding in any protocol other than WebRTC, falling back to WebAssembly + MSE libraries or other methods if it isn't available (since it's pretty new).
It is important to understand that WebSockets should typically not be mixed with HTTP/3 or WebTransport to obtain maximum benefit from HTTP/3. The reason WebSockets were required in WebRTC was to exchange signaling information. HTTP/3 can be used for this very purpose as well and should be.
MediaStream/getUserMedia should still be used for WebRTC. DataChannels cannot transport video or audio efficiently.
All methods for video processing in browsers: https://www.youtube.com/watch?v=0RvosCplkCc https://developer.mozilla.org/en-US/docs/Web/API/WebCodecs_API
WebSockets for both media and signaling
Also worth noting is that some restricted firewalled environments need to use WebSockets (with upgrade from HTTP/1.1 required) for both media and signaling. This might be worth offering as one of the options.
For WebSockets, WebCodecs, then falling back to WebAssembly + MSE, might work for video and audio decoding.
RTWebSocket is a pretty interesting option for the WebSocket portion of this project.
https://github.com/zenomt/rtwebsocket
https://www.rfc-editor.org/rfc/rfc7016.html
Return flow association (bi-directional information): https://www.rfc-editor.org/rfc/rfc7425.html
Along with WebTransport, the possible reason to use this is to show frames as soon as they arrive instead of going through WebRTC's internal jitter buffers, over which there is limited control (but the WebRTC approach of having separate audio and video streams should still make the frame display as fast as possible).
Two things to keep in mind about RTWebSocket when looking at the above links:
RTWebSocket uses the same primary abstraction as RTMFP: unidirectional ordered message-oriented "flows" named by arbitrary binary metadata, where each flow can have an independent priority/precedence (that you can change at any time), each message can have an arbitrary transmission deadline (that you can change at any time), and "return flow association" generalizes bidirectional communication into arbitrarily-complex trees of unidirectional flows. Note that you can revise a message's transmission deadline after it's been queued (for example, you can revise the transmission deadlines of previously-queued video messages when you get a new keyframe). The link to Section 5.3.5 of RFC 7425 above is to an illustration of multiple levels and many more than 2 flows of a bidirectional flow tree.
RFC 7016 describes RTMFP (a UDP-based transport protocol suspiciously similar to, but significantly predating, QUIC). RFC 7425 describes how to send RTMP video, audio, data, and RPC messages over RTMFP. The same method can be used to send RTMP messages over RTWebSocket, and the https://github.com/zenomt/rtwebsocket and https://github.com/zenomt/rtmfp-cpp repos have demonstrations of doing that.
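To make the idea of revising transmission deadlines after queuing more concrete, here is a minimal TypeScript sketch of a send queue. It is not the RTWebSocket or RTMFP API: `DeadlineQueue`, `QueuedMessage`, and every field name are hypothetical, and the policy of expiring all earlier frames when a keyframe arrives is just one possible assumption.

```typescript
// Hypothetical sketch: NOT the RTWebSocket/RTMFP API, only an illustration
// of per-message transmission deadlines that can be revised after queuing.
interface QueuedMessage {
  payload: Uint8Array;
  isKeyframe: boolean;
  deadlineMs: number; // absolute time after which sending is pointless
}

class DeadlineQueue {
  private messages: QueuedMessage[] = [];

  // Queue a video message. When a keyframe arrives, frames queued earlier are
  // superseded, so their deadlines are revised to "now" (already expired).
  enqueue(payload: Uint8Array, isKeyframe: boolean, nowMs: number, ttlMs: number): void {
    if (isKeyframe) {
      for (const m of this.messages) {
        m.deadlineMs = Math.min(m.deadlineMs, nowMs);
      }
    }
    this.messages.push({ payload, isKeyframe, deadlineMs: nowMs + ttlMs });
  }

  // Return the next message whose deadline has not passed, dropping expired ones.
  next(nowMs: number): QueuedMessage | undefined {
    let m = this.messages.shift();
    while (m !== undefined) {
      if (m.deadlineMs > nowMs) return m;
      m = this.messages.shift();
    }
    return undefined;
  }
}
```

With this policy, a sender that falls behind skips stale delta frames and transmits the newest keyframe first, which is the behavior the comment above describes as the reason to prefer deadline-aware flows over a plain FIFO.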
WebRTC Improvements
Investigate WHIP and WHEP, allowing unlimited users through WebRTC:
Chunked DataChannel for clipboard and other large information: https://groups.google.com/g/discuss-webrtc/c/f3dfmu3oh00
Miscellaneous
Developers must understand that all protocols and APIs are bound by web browser specifications and standards. Luckily, we have reached an era where most of the APIs required for this project's objective are supplied by web browsers. In some cases, look for an external JavaScript or WebAssembly implementation.
DASH is available through a JavaScript fallback. RTSP does not seem to be supported directly in the web browser; it is a protocol for capturing from another source accessible to the server rather than a transport protocol on the web. The same goes for QUIC; QUIC in a web browser is only available through WebTransport. MQTT is also redundant; all MQTT web client libraries use WebSockets to transport MQTT requests.
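As an illustration of the chunked DataChannel idea mentioned above for clipboard and other large payloads, here is a minimal TypeScript sketch that splits a byte buffer into numbered chunks and reassembles them on the other side. The 8-byte header layout and the function names `splitIntoChunks` and `reassemble` are assumptions made up for this example; they do not come from the linked discussion or from any specification.

```typescript
// Hypothetical wire format (an assumption, not from any spec):
// 4-byte big-endian chunk index, 4-byte big-endian chunk count, then payload.
const HEADER_BYTES = 8;

function splitIntoChunks(data: Uint8Array, maxChunkBytes: number): Uint8Array[] {
  const bodyBytes = maxChunkBytes - HEADER_BYTES;
  if (bodyBytes <= 0) throw new Error("maxChunkBytes must exceed the header size");
  // Always emit at least one chunk so that empty payloads are representable.
  const total = Math.max(1, Math.ceil(data.length / bodyBytes));
  const chunks: Uint8Array[] = [];
  for (let i = 0; i < total; i++) {
    const body = data.subarray(i * bodyBytes, (i + 1) * bodyBytes);
    const chunk = new Uint8Array(HEADER_BYTES + body.length);
    const view = new DataView(chunk.buffer);
    view.setUint32(0, i);
    view.setUint32(4, total);
    chunk.set(body, HEADER_BYTES);
    chunks.push(chunk);
  }
  return chunks;
}

function reassemble(chunks: Uint8Array[]): Uint8Array {
  const parsed = chunks.map((c) => {
    const view = new DataView(c.buffer, c.byteOffset, c.byteLength);
    return { index: view.getUint32(0), total: view.getUint32(4), body: c.subarray(HEADER_BYTES) };
  });
  // Every chunk must agree on the count, and every index must appear exactly once.
  if (parsed.length === 0 || !parsed.every((p) => p.total === parsed.length)) {
    throw new Error("missing or inconsistent chunks");
  }
  parsed.sort((a, b) => a.index - b.index);
  parsed.forEach((p, i) => {
    if (p.index !== i) throw new Error("duplicate or missing chunk index");
  });
  const out = new Uint8Array(parsed.reduce((n, p) => n + p.body.length, 0));
  let offset = 0;
  for (const p of parsed) {
    out.set(p.body, offset);
    offset += p.body.length;
  }
  return out;
}
```

Each chunk returned by `splitIntoChunks` could be passed to `RTCDataChannel.send()`, which accepts typed arrays. The chunk size to use is left to the caller because the safe maximum message size depends on the browsers involved.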
It can happen if partial decoding is implemented, but that's very hard to get right.
I included "delayed frames, stuttering and bandwidth issues" under visible (I think I meant perceivable) defects.
GPU Acceleration
Zero-copy buffers are very important for shaving off the last bits of latency.
#291 (SW:
Performance optimizations: https://git.dec05eba.com/gpu-screen-recorder-gtk/about/ (I encourage talking to the maintainer)
NVIDIA
NVIDIA Capture API (NvFBC) zero-copy framebuffer encoding for NVIDIA GPUs (may lead to great X11 performance improvements): https://github.com/CERIT-SC/gstreamer-nvimagesrc
GStreamer >= 1.22 supports Jetson (aarch64)
VA-API (AMD and Intel)
A successful VA-API pipeline which was quite redundant for me so far. I do not guarantee everything will work for all GPUs, libva version, VA drivers, etc., but I feel it's much better than the deprecated plugins including working well on
Some more pointers for GPU acceleration and high-performance streaming:
Hardware-accelerated JPEG encoding
NVIDIA and VA-API both provide hardware-accelerated JPEG encoding and are supported by most modern GPUs. https://gstreamer.freedesktop.org/documentation/nvcodec/nvjpegenc.html?gi-language=c
Miscellaneous
Note that x264 requires the screen resolution dimensions to be even numbers. It wouldn't hurt to default to that always.
GStreamer examples: https://gist.github.com/hum4n0id/2760d987a5a4b68c24256edd9db6b42b
GStreamer Portable Build
GStreamer may be built statically if using C, Rust, or Go (not for Python). Because the most prominent performance and encoding optimizations are in the latest production releases, the most recent releases must be used. Then, Neko can be deployed regardless of environment, standalone without containers.
Even if we don't do a static build, it could be useful to make shared library builds of GStreamer that are compatible with all active distros for ABI and glibc (separate GPL and LGPL builds).
Using AppImage is also a way to make the resulting application portable. Used in Sunshine and RustDesk.
Conda (https://conda-forge.org/news/2023/07/12/end-of-life-for-centos-6/), used frequently in science and technology, maintains a portable compiler toolchain and package ecosystem (based on CentOS 7) and packages recent GStreamer versions. Especially useful when Python components exist.
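On the earlier note in this comment that x264 needs even screen dimensions, a client or server could normalize any requested resolution before applying it. The TypeScript sketch below is only an illustration: the helper names are hypothetical, and rounding down (rather than up) is an assumption chosen so the result never exceeds the requested size.

```typescript
// Round a single dimension down to the nearest even number (minimum 2).
function toEvenDimension(pixels: number): number {
  const whole = Math.floor(pixels);
  return Math.max(2, whole - (whole % 2));
}

// Normalize a requested resolution so that both width and height are even.
function toEvenResolution(width: number, height: number): { width: number; height: number } {
  return { width: toEvenDimension(width), height: toEvenDimension(height) };
}
```

For example, a browser window reporting 1367x769 would be streamed at 1366x768 under this rule.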
Low-latency graphical streaming (including 3D graphics development - Teradici/NICE DCV and game development) and relative cursors
#339 (Reduce Latency by eliminating A/V sync for WebRTC or QUIC)
Host Encoder and WebRTC Web Browser Decoder Settings (eliminate all client-side latency)
selkies-project/selkies-gstreamer#34 (comment)
#344 (Relative cursors in X11 and Wayland, pointer lock)
#364 (Unicode Keysyms)
selkies-project/selkies-gstreamer#22 (Are there touch keyboards for mobile users?)
selkies-project/selkies-gstreamer#25 (URL authentication and JSON Web Token authentication?)
selkies-project/selkies-gstreamer#98:
selkies-project/selkies-gstreamer#102
selkies-project/selkies-gstreamer#110 (HiDPI management: https://wiki.archlinux.org/title/HiDPI / https://linuxreviews.org/KDE_Plasma#Workaround_For_Bugs)
And of course, a heap of information in https://docs.lizardbyte.dev/projects/sunshine/en/latest/index.html
More reference:
Gamepads/Joysticks and Wayland Input
From the web browser's perspective, the interface could utilize whatever the Gamepad API exposes. But from the server, it is not trivial to use arbitrary input devices in an unprivileged Docker or Kubernetes container, because
However, a number of workarounds are available. https://github.com/selkies-project/selkies-gstreamer/tree/main/addons/js-interposer
Additional approaches:
What's intriguing in this approach is that this workaround method may also pave ways to replace
Touchscreen and Stylus
https://github.com/H-M-H/Weylus
Multi-user GPU Sharing
The issue with this is that this cannot be shared between different Kubernetes pods. It only works across multiple containers within the same pod. This means that GPU sharing with a single X server is a bit harder. An alternative would be to use X11 through TCP instead of UNIX sockets. VirtualGL through GLX or using Wayland would also be an alternative that makes things smoother.
Multi-architecture Environments
Support
Available to build with QEMU using
Multiarch paths must not be hardcoded.
https://vitejs.dev/guide/#trying-vite-online might be a good complement to this concept. And I think React (for backward compatibility) or Svelte (for lightness and development speed) as the default interface have their merits compared to Vue (the whole reason the project needs to be rewritten).
I already started and did the first step, where I upgraded demodesk/neko to Vue 3. The next step would be to remove Vue 3 as a dependency of the core module and use Vue only in the test client to speed up testing.
We were using Vue 2 ourselves, so Vue 3, I guess, wouldn't hurt. Reference performance:
Anyone familiar with any of the concepts described above is encouraged to discuss and contribute. This is definitely a possible project. GeForce Now, XBOX Cloud, and Reemo all did it in the web browser through WebRTC. Now it's time for something open-source and not too restrictive (more permissive than GPL).
I can add a bit of additional context to the virtual input part since I've moved my implementation from Wolf into a reusable standalone library games-on-whales/inputtino. fake-udev is not a replacement for
On a separate note, I've recently managed to implement gyro, acceleration, touchpad and force feedback for a virtual PS5 gamepad using uhid because unfortunately
@ABeltramo Thank you! Will it be possible to investigate the possibility of emulating the uhid device without the
Else, this could be an optional feature enabled or disabled based on user preference.
I don't exclude that it might be possible but I think it'll be fairly hard to achieve: the created devices will be picked up directly by the Linux kernel drivers just like when you plug in a USB cable.
So that the end unprivileged container doesn't have access to uinput, udev or even
This can be further locked down by running the process inside the unprivileged container as a low-privileged user and letting the external controller exec commands as root (or a higher-privileged user) inside that container. Not a security expert, but I think this can be a fairly secure approach.
https://github.com/nestriness/nestri Cloud gaming platform using WebTransport and Media over QUIC. Developed by @wanjohiryan with input from @kixelated.
https://github.com/go-gst/go-gst Go is now a first-class citizen on GStreamer. Being a compiled language that can use C libraries and is highly legible compared to Rust, combined with the existence of great web protocol libraries such as Pion, it will hopefully work out well. This library will support dynamic property and capsfilter updates, as the C, Rust, and Python bindings do.
Update: demodesk/neko moved to https://github.com/m1k1o/neko/tree/demodesk-v3. Both of them are merged in https://github.com/m1k1o/neko/tree/v3
Since Vue 2 is deprecated (#358), we need to rewrite the client. While rewriting the client, we could take a look at the server as well and finally join m1k1o/neko and demodesk/neko.
Main pain-points that should be solved by this rewrite:
Connection:
Neko can connect to the backend using multiple channels. Therefore API users should not be exposed to WebSocket internals.
They should only care about the connection status:
And about connection type:
Media streaming
For media streaming, we implement a similar approach with the following streaming backends:
Various media streaming backends can have various features. For example, WebRTC can have a feature to send media to the server, while HTTP can only receive media from the server.
They can be selected based on the user's device capabilities, network conditions, and server capabilities.
There must be a single interface that all streaming backends satisfy, and it must be their only communication channel with the rest of the system.
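To make the single-interface requirement more concrete, here is a minimal TypeScript sketch of what such a contract and a backend selection step could look like. Everything in it is hypothetical: the names `StreamingBackend`, `Environment`, `Feature`, `selectBackend`, and `stubBackend` are invented for illustration, the two example backends are taken from the WebRTC and HTTP example above, and network conditions are left out to keep the sketch short.

```typescript
// All names below are hypothetical; the final API is still to be agreed on.
type Feature = "receive" | "send";

interface Environment {
  clientSupports: readonly string[]; // backend names the user's device can run
  serverOffers: readonly string[];   // backend names the server provides
}

// The single interface every streaming backend has to satisfy.
interface StreamingBackend {
  readonly name: string;
  readonly features: readonly Feature[];
  isAvailable(env: Environment): boolean;
  start(onStatus: (status: string) => void): void;
  stop(): void;
}

// Pick the first backend (in preference order) that is available in this
// environment and offers every required feature.
function selectBackend(
  candidates: readonly StreamingBackend[],
  env: Environment,
  required: readonly Feature[]
): StreamingBackend | undefined {
  for (const backend of candidates) {
    const hasFeatures = required.every((f) => backend.features.indexOf(f) !== -1);
    if (hasFeatures && backend.isAvailable(env)) return backend;
  }
  return undefined;
}

// Stub implementation used only to demonstrate the selection logic.
function stubBackend(name: string, features: readonly Feature[]): StreamingBackend {
  return {
    name,
    features,
    isAvailable: (env) =>
      env.clientSupports.indexOf(name) !== -1 && env.serverOffers.indexOf(name) !== -1,
    start: (onStatus) => onStatus("started"),
    stop: () => {},
  };
}
```

The rest of the client would only ever hold a `StreamingBackend` reference, so adding a new transport would mean adding one implementation and one entry in the preference list, without touching callers.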
Control (Human interface device)
The user can control the target system using various human interface devices. The user can use a keyboard, mouse, gamepad, touch screen, or any other device that can be used to control the system. Custom or virtual devices can be used as well.
Normally, in-band feedback should be provided to the user inside the media stream. But there can be cases where out-of-band feedback is required.
Control can use both underlying connections or media streaming for transmitting and receiving control data. For example, WebRTC data channels can be used for transmitting control data in real-time.
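As a rough illustration of sending control data over a real-time channel, the TypeScript sketch below encodes a pointer-move event into a compact binary message and decodes it again. The 5-byte layout, the constant `EVENT_POINTER_MOVE`, and the function names are assumptions invented for this example; they are not Neko's wire protocol.

```typescript
// Hypothetical message layout (an assumption, not Neko's wire protocol):
// byte 0: event type, bytes 1-2: x (uint16 BE), bytes 3-4: y (uint16 BE).
const EVENT_POINTER_MOVE = 1;
const POINTER_MOVE_BYTES = 5;

interface PointerMove {
  x: number;
  y: number;
}

function encodePointerMove(ev: PointerMove): Uint8Array {
  const bytes = new Uint8Array(POINTER_MOVE_BYTES);
  const view = new DataView(bytes.buffer);
  view.setUint8(0, EVENT_POINTER_MOVE);
  view.setUint16(1, ev.x);
  view.setUint16(3, ev.y);
  return bytes;
}

function decodePointerMove(bytes: Uint8Array): PointerMove {
  // Check the length first so the type byte is only read when it exists.
  if (bytes.byteLength !== POINTER_MOVE_BYTES) {
    throw new Error("not a pointer-move message");
  }
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  if (view.getUint8(0) !== EVENT_POINTER_MOVE) {
    throw new Error("not a pointer-move message");
  }
  return { x: view.getUint16(1), y: view.getUint16(3) };
}
```

The encoded bytes could be handed to `RTCDataChannel.send()` when WebRTC is the active backend, or to any other connection that carries binary messages, which keeps the control layer independent of the transport underneath it.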
Conclusion
As the first step, we should create and agree on the client library's API and define the interfaces that will be used and implemented later.
More will follow, please let me know if you have any ideas. Stay tuned.
Todo
First phase - merge demodesk/neko to m1k1o/neko with legacy driver to emulate old API.
Second phase - merge demodesk/neko-client to m1k1o/neko while upgrading to vue3 and deprecate legacy API.
m1k1o/neko GUI from scratch on vue3 while using demodesk/neko-client as core plugin.
Third phase - make the codebase modular (as mentioned above).