
Shader plugin support / Temporal processing support (Blur Busters has source code bounty (2K) for this) #35

Open
mdrejhon opened this issue Nov 17, 2023 · 19 comments
Labels
enhancement New feature or request

Comments

mdrejhon commented Nov 17, 2023

I'm Mark, the creator of Blur Busters and TestUFO.

I noticed you have a virtual display driver capable of custom refresh rates -- maybe you want to make minor modifications to meet an existing Blur Busters source code bounty?

For those not familiar with me: I'm an indie-turned-business and the Internet's biggest advocate for high refresh rates. I'm cited in over 25 peer-reviewed research papers, and I maintain an easier Coles Notes-style research portal, Area 51, on Blur Busters.

TestUFO is my free display-testing/display-demoing website used by over 500 content creators (reviewers, YouTubers, bloggers) to test displays. Two of the biggest content creators are RTINGS (which uses my invention) and LinusTechTips (which has used the pursuit camera in some tests).

I've been looking for a good open source virtual display driver as a base to produce drivers that can implement algorithms such as testufo.com/blackframes (software BFI) and testufo.com/vrr (simulated VRR on non-VRR displays), as well as hundreds of other fantastic temporal filters.
Even if not all filters are possible, a virtualized display driver with temporal-processing capabilities would unlock a lot of neat capabilities for a lot of communities.

It could also be used to add subpixel awareness for OLED displays (which have odd pixel structures) by scaling content in a custom-subpixel-pattern way. I originally posted this in the ClearType improvement thread, but I'm reposting here since I've now discovered this project:

$2K Bounty for a Virtual Display Driver capable of Temporal Filters

Which I crosspost here:

**Jason-GitH** commented May 11, 2023

Yes, that's another option as an alternative.

I mention this already at microsoft/PowerToys#25595 (comment)

A USD $2,000 bounty is already offered for this open source driver project

Blur Busters already offers a $2K bounty for this -- a subpixel-aware downscaler would allow everything to be "ClearType'd", including graphics, not just text. This can even be done as a third-party Windows Indirect Display Driver, for which I already offer a USD $2,000 source code bounty for an implementation under MIT or Apache. Contact me at mark[at]blurbusters.com

Basically, a Windows IDD that virtualizes a simulated monitor at a simulated resolution and a simulated refresh rate (which can be higher or lower Hz than the real display -- we have some important use cases for this), and runs all the frames through plug-in GPU shaders to display on an existing physical monitor.

In addition to solving this problem -- plug-in GPU shaders can enable shockingly creative stuff such as:

  • subpixel-aware downscaling, etc.
  • custom developer refresh rate testing on monitors not owned (e.g. test refresh rates that your monitor does not support)
  • easier software developer testing with displays that aren't in the QA lab, by simulating a display
  • easier hardware prototyping and engineering of future displays, by simulating a display
  • simulated interpolated VRR (www.testufo.com/vrr algorithm) on displays that don't have VRR
  • software based BFI OLED motion blur reduction (www.testufo.com/blackframes#count=3)
  • streaming / headless use cases (without needing to be in a game nor open Remote Desktop)
  • software based overdrive algorithms
  • custom spatial filters (e.g. HLSL CRT filters for retro simulation / preservation).
  • custom temporal filters (e.g. simulate a CRT electron beam, complete with rolling scan and phosphor fade simulation, by using 8 digital refresh cycles per CRT Hz on a 480Hz display that Windows "sees" as a 60Hz CRT tube), for even further improved retro simulation / preservation / education / history
  • add frame generation (e.g. reprojection https://www.youtube.com/watch?v=IvqrlgKuowE for 500fps UE5)
  • Other plug-in GPU shaders (that can modify, adjust, scale, process, merge, blend, etc history of buffered "refresh cycle" frames)
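
The software-BFI item above is a good illustration of refresh-cycle-granularity logic. A minimal Python sketch of the idea behind testufo.com/blackframes (all names here are my own illustrative choices, not part of any existing driver): the output runs at an integer multiple of the input Hz, and only one output refresh cycle per group shows the frame, the rest are black.

```python
def bfi_visible(refresh_index: int, count: int) -> bool:
    """Software BFI: with `count` output refresh cycles per input frame
    (e.g. count=3, i.e. 180Hz output for 60Hz input), show the frame on
    the first cycle of each group and black frames on the rest."""
    return refresh_index % count == 0

# Two input frames become six output refresh cycles at count=3:
pattern = [bfi_visible(i, 3) for i in range(6)]
# -> [True, False, False, True, False, False]
```

Persistence adjustability (as in the #count= parameter on TestUFO) would just change `count` or the number of visible cycles per group.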

The $2K bounty doesn't have to include these custom shaders, but it needs to include a plug-in shader system, and end users need the ability to share shaders/profiles with each other (e.g. a sharable, modular, file-based format, possibly JSON or XML) that loads settings such as the shader module (compiled or uncompiled), input Hz, output Hz, and constraints if any (outputHz = Formula(inputHz)) -- such as shaders that require the actual display output Hz to be an integer multiple of the virtualized display Hz. It could be a single-file format or a composite (.zip) that includes shaders and settings for a custom filter/processing. The bounty requirements are negotiable (contact mark [at] blurbusters.com), but it needs to meet the needs of the open source community, the retro community, the Blur Busters fan community, and the commercial community (e.g. display prototyping), as I do work with display manufacturers. As you know, I am the inventor of free display tests used both by indies and manufacturers...

Other options include integration of SweetFX/Reshade/etc into the virtual display driver, including possibly open source or third party lagless frame generation algorithms (e.g. https://www.blurbusters.com/framegen ...) or whole-desktop SmoothVideoProject, etc. So this ideally should be supported too, although it cannot cover the entire scope needed. However, the capability for output refresh rates higher AND/OR lower AND/OR variable relative to the refresh rate visible in Control Panel for the said virtualized display is required. In other words, the real Hz displayed on the real display is a processed version of the virtualized Hz visible to Windows -- e.g. 60Hz in Windows, but outputting 240Hz+BFI with persistence adjustability (similar to https://www.testufo.com/blackframes#count=4&bonusufo=1 as one example). Or variable frametimes (I can help you understand how to control VRR).

I really would like to see a driver package capable of running custom shaders at refresh cycle granularity, with enough example dummy shaders to demonstrate successful virtualization of resolution AND refresh rate (both higher and lower than the actual physical display(s) connected to the computer). INCLUDING, of course, variable refresh rate capability, which is simply asynchronous timing of Present() that may not be related to the timing interval of the refresh cycles. I am OK with the constraint of a fixed-Hz virtualized input (with just VRR output virtualization).

Although not needed for subpixel processing filters and input=output Hz filters (shader-process-only filters)... for certain temporal refresh rate filtering tasks I'll need extremely accurate timing of output-frame Present()'s, which MAY require a separate CPU thread that busyloops right before the Present() of the composited frame to the actual output display. So an ultra-high-precision presentation timing thread will be mandatory here. I can help assist you -- I have lots of experience via [Tearline Jedi](https://forums.blurbusters.com/viewtopic.php?t=4213) raster-interrupt-style beam racing of VSYNC OFF tearlines.
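
The "busyloop right before Present()" technique can be sketched in a few lines. This is a minimal Python illustration of the hybrid sleep-then-spin pattern (the function name and the 2 ms spin margin are my own assumptions, not from any existing driver): sleep coarsely to save CPU, then busy-spin on a high-resolution clock for the final fraction of a millisecond.

```python
import time

def precise_wait_until(target: float, spin_margin: float = 0.002) -> None:
    """Sleep coarsely until ~spin_margin seconds before the target
    perf_counter timestamp, then busy-spin for sub-millisecond accuracy,
    as you'd want right before presenting a composited frame."""
    while True:
        remaining = target - time.perf_counter()
        if remaining <= 0:
            return
        if remaining > spin_margin:
            time.sleep(remaining - spin_margin)  # coarse, low-CPU OS sleep
        # else: tight loop (busy-spin) until the deadline passes

start = time.perf_counter()
precise_wait_until(start + 0.01)  # wait ~10 ms
elapsed = time.perf_counter() - start
```

A real driver would do this on a dedicated high-priority thread in native code; the structure (coarse sleep + short spin) is the same.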

Keep in mind that this is a refresh-cycle-granularity processor, not a frame-granularity processor. For many reasons, it has to process every refresh cycle independently of the underlying framerate of the application running within the desktop on the virtualized display. Note that frame presentation timestamps do need to be made available to the shader run once every refresh cycle (for certain shaders, like VRR-simulating shaders, which require refresh-cycle-granularity processing while also separately knowing the frame presentation timestamps of the underlying application).
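
To make the refresh-cycle vs. frame distinction concrete, here is a minimal Python sketch (my own illustrative code, not from any existing project) of the lookup a VRR-simulating shader needs every refresh cycle: given the app's Present() timestamps, each refresh cycle shows the most recent frame presented at or before that cycle's time, independently of the app's frame rate.

```python
import bisect

def frame_for_refresh_cycle(present_times: list[float], cycle_time: float) -> int:
    """Return the index of the most recently presented frame as of
    cycle_time, or -1 if nothing has been presented yet. The shader runs
    once per refresh cycle and consults frame timestamps separately."""
    return bisect.bisect_right(present_times, cycle_time) - 1

presents = [0.000, 0.021, 0.049]        # app presented 3 frames, VRR-ish cadence
cycles = [0.0, 1 / 60, 2 / 60, 3 / 60]  # four 60Hz refresh-cycle timestamps
shown = [frame_for_refresh_cycle(presents, t) for t in cycles]
# -> [0, 0, 1, 2]: cycle 1 repeats frame 0 because frame 1 arrived after it
```

This is why the processor must run at Hz granularity: the decision of what to show happens per refresh cycle, not per app frame.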

The difference is that it would work on everything in Windows, not just in a test. Reshade/SweetFX/SpecialK/etc do something similar in some ways, but not for many use cases. Those tools have the problem of being frame-granularity (as frame Present() hooks), which means they are not refresh cycle granularity. Many algorithms need to process at Hz granularity, independently of the underlying frame rate, for successful operation. And being a Windows IDD means desktop applications are included, not just games.

For qualifying for the bounty, some changes needed may include:

  • Support replacing an existing monitor with the virtualized monitor (or some compromise), ala Feature Request: Disable Other Monitors #19
  • Adding support for a loadable settings/shader system (to allow the temporal processing)
  • A relicensed fork (if that's an option) to Apache/MIT.
    (legal if you're the only developer, or if all devs/contributors of 100% of the source code retroactively agree)
  • Bounty requirements can be negotiable (contact me privately), with requirements set in stone before agreeing, if necessary;

Example (of hundreds) of possible benefits....

  • End users who want to reduce display motion blur by adding software-based BFI
  • End users who don't have VRR that want to add software-based VRR
  • Adding software-based superior LCD overdrive to cheap monitors (that performs better than manufacturer 17x17 LCD OD LUTs), like superior versions of the old "ATI Radeon Overdrive" system from 15 years ago;
  • Game developers stuck at 60Hz, can finally QA-test on higher refresh rates (virtualized), with the extra Hz's being blended together one way or another (alphablend, or simul-vrr, or tiled, or other).
  • Manufacturers can prototype future-Hz displays like 1000Hz displays
  • Improved SDR/HDR adjustability (e.g. using HDR for SDR content as a brightness booster for BFI like Retrotink 5K).
  • Future CRT simulators for retro community (electron beam simulators in a shader, using 8 refresh cycles on upcoming 480Hz OLED to create 1 simulated 60Hz CRT refresh cycle)
  • etc.
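
The CRT-simulator item above (8 refresh cycles per simulated 60Hz CRT refresh) can be sketched as per-refresh-cycle math. A minimal Python illustration, with my own assumed parameter names and a simple exponential phosphor-fade model: on each of the 8 subframes, one horizontal band is freshly excited by the "beam" while earlier bands fade.

```python
def rolling_scan_intensity(band: int, subframe: int, decay: float = 0.5) -> float:
    """Simulated CRT electron beam: on output subframe k, horizontal band k
    is freshly excited (intensity 1.0); bands scanned on earlier subframes
    fade by `decay` per subframe; bands not yet scanned are dark."""
    age = subframe - band
    if age < 0:
        return 0.0       # beam hasn't reached this band yet this CRT refresh
    return decay ** age  # phosphor fade

# Intensities of bands 0..3 when the beam is on subframe 2 (of 8):
row = [rolling_scan_intensity(b, subframe=2) for b in range(4)]
# -> [0.25, 0.5, 1.0, 0.0]
```

The real shader would apply this per scanline rather than per coarse band, but the per-refresh-cycle structure is the same.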

Some possible catches

  • It may not be able to play DRM content (e.g. Netflix)

Question: Are you the only developer, @MolotovCherry?
Note: It could even be a fork of this project if this is too dramatically different. The original project can remain GPL3, just the fork qualifying for the bounty would need to be one of the permissive licenses to support both the indies and businesses, as per above. Can even backport the Apache/MIT version to your GPL3 fork (one-way compatibility), if you want -- all good.

@mdrejhon mdrejhon added the invalid This doesn't seem right label Nov 17, 2023
@mdrejhon mdrejhon changed the title Blur Busters has a USD $2000 bounty for a similar project Blur Busters has an open source bounty (2K) for a virtual display driver doing temporal filtering. Could this be it? Nov 17, 2023
@mdrejhon mdrejhon changed the title Blur Busters has an open source bounty (2K) for a virtual display driver doing temporal filtering. Could this be it? Blur Busters has an open source bounty (2K) for a virtual display driver doing temporal filtering tasks in a shader. Could this be it? Nov 17, 2023
@mdrejhon mdrejhon changed the title Blur Busters has an open source bounty (2K) for a virtual display driver doing temporal filtering tasks in a shader. Could this be it? Shader plugin support / Temporal processing support (Blur Busters has source code bounty (2K) for temporal filtering tasks). Nov 18, 2023
@mdrejhon mdrejhon changed the title Shader plugin support / Temporal processing support (Blur Busters has source code bounty (2K) for temporal filtering tasks). Shader plugin support / Temporal processing support (Blur Busters has source code bounty (2K) for this) Nov 18, 2023
mdrejhon commented Nov 18, 2023

I'm having difficulty with the labels; it's not letting me label this as "enhancement".

P.S. I have an EV Code Signing Certificate, which I volunteer to use for betas/public-releases of this feature.

EDIT: I sponsored @MolotovCherry $100 to show goodwill -- with no commitment to follow through. I'm very happy people are at least playing more often with virtualized display drivers (which the world badly needs, for the reasons above); it will inspire more and more indies to do the same. We indies want to produce stuff that enhances displays beyond what the manufacturers intended them to do.

You can see how I overlap indies and business -- as a passionate indie helping manufacturers, I assist with some elements of the refresh rate race. Especially where 120-vs-240 is much more visible to mainstream users on OLED than on LCD, but with drawbacks (e.g. strobing/BFI removed from OLED, plus bad ClearType, etc). Some of these pick-your-poisons can be partially solved by a virtualized display driver with temporal filtering/reprocessing capabilities -- which also helps convince manufacturers to add features to the actual hardware! (While, concurrently, niche features continue to be added DIY by end users.)

P.S. Anybody who wants to see the best content I've blogged: see Area 51 on Blur Busters for the curated best articles. A more manageable read than Google Scholar.

@MolotovCherry MolotovCherry added enhancement New feature or request and removed invalid This doesn't seem right labels Nov 18, 2023
MolotovCherry commented Nov 18, 2023

Hi!

Thanks for the feature request! Also thanks for the goodwill sponsorship!

This sounds very interesting! I never imagined that GPU shader plugin support in something like this would be a much desired feature.

I will give this a serious lookover when I have some time and see if it's possible for me to implement such a shader system.1 I'm still not sure at the moment how many of those items are possible with my current skill-set, but I will look into it. For non-sensitive discussion, I would like to let communication reside in this open issue, if that's alright with you.

Question: Are you the only developer, @MolotovCherry?

Yes, I'm the sole developer and owner of this project (all the code was made by me, so there are no loose threads)

Note: It could even be a fork of this project if this is too dramatically different. The original project can remain GPL3, just the fork qualifying for the bounty would need to be one of the permissive licenses to support both the indies and businesses, as per above. Can even backport the Apache/MIT version to your GPL3 fork (one-way compatibility), if you want -- all good.

I'm not opposed to changing my license on the project to a more friendly one, or offering a special re-licensed version which is favorable to the interested party, on a case-by-case basis, if circumstances are favorable to me (of which this is an aforementioned favorable circumstance). My only real concern with my current driver was that I didn't want it to be taken and used with zero credit / sharing of source code, since I put so much hard work into making it.

Footnotes

  1. It would fit well within the GUI I've been developing:
     (screenshot of the GUI)

mdrejhon commented Nov 20, 2023

Thank you for the follow up!
I'll post a few messages to add more information; to help you (and other readers) think.

Purpose / Precedents

Why does the world need this?

Lots of people are doing lots of work in this sphere already! But most of them are FRAME-BASED PROCESSING.

But there are a lot of algorithms that require REFRESH-CYCLE-BASED PROCESSING: processing that runs on every refresh cycle, independently of the underlying frame rate, and that also works outside of games...

Existing Precedents of FRAME-BASED processing Systems

  1. The BFI support now built into SpecialK (but it only works inside games as Present() hook), using a custom workflow they've created to allow a refresh-cycle-based system to work with a frame-based system;
  2. Frame based filter systems like ReShade/SweetFX/NVIDIA FreeStyle/etc! But they are not suitable for stuff that must be refresh cycle granularity, and/or need to run outside games;
  3. Gamescope built into Steam for Linux, does some frame-based processing too!
  4. Although not a processor (But a monitor controller), some software such as DisplayFusion allows you to use hotkeys or command line options to enable/disable displays from appearing in Control Panel, so there's precedent.

Not suitable for REFRESH-CYCLE-BASED processing

Refresh cycle based processing (independent of underlying frame rate) requires a virtual video driver -- like yours. That's why I posted here.

While many could serve as plugins for a virtual display driver (to allow filters to be used outside of a game, for the whole Windows desktop) -- unfortunately, the majority of these don't support the algorithms that only work with REFRESH-CYCLE-BASED processing (independently of frame rate).

There are also some drivers for PCVR headsets (some of which actually connect as displays, but their specialized drivers "hide" the displays from becoming visible as regular monitors). The Oculus Rift DK1 showed up as a regular display, but by the time of the Oculus Rift CV1, they were hidden displays that could only be accessed through Oculus APIs.

I realize you don't have time now, and maybe someone else needs to take the baton, but your software is the closest thing to a code skeleton that could be used for this sort of stuff!

mdrejhon commented Nov 20, 2023

Interim "Specs" Discussion / Brainstorm For Temporal Processing

I realize a lot of things need to be clarified, and I'm happy to flesh them out. I don't have the skills to write a Windows driver (I'm not an expert there), but I am full of display-processing ideas! (That's where I am an expert.)

Some of these things MAY not be possible; this needs to be verified.

Bounty Requirements need to be Spec'd/Simplified

I realize that a lot of thinking is needed to see what is the simplest way to proceed, because we have a bunch of potential input/output settings. Most of which should be possible, and these values should be accessible to the shader processor.

For example, simulated VRR might be removed from the requirements if VRR is too difficult to implement, with alternative capabilities implemented instead;

Tentative/Unconfirmed Requirements

Loadable temporal shader filter modules

  • Loadable filter modules should be an easily importable/exportable package, preferably 100% text based for easy customizability (ala JSON or INI or TXT or CFG), but I am open to binary formats if easier to implement.
  • Loadable filter module should be able to have the following:
    The filter profile could be text-based to allow easy customizing by advanced users:
    • Title (e.g. "Software BFI")
    • Optional prerequisites/constraints on mode metrics
      • Custom warning-league constraints (filter runs degraded, such as stutters)
      • Custom fail-league constraints (can't run filter at all)
      • Example: "Output Hz must be an integer multiple of input Hz" or "hzOutput==2*hzInput"
    • Settings
      • Data type [boolean,integer,float], allowed range of data, and default value
      • Custom data constraint [if possible]
      • Examples: "Black frame count, defined as unsigned integer", "Phosphor decay simulation, defined as float, range 0.0 to 1.0", etc.
    • Shader text file (that driver compiles if not already compiled)
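
As a strawman for the loadable filter module above, here is a hypothetical JSON-based profile and a few lines of Python that validate a user setting against its declared range. Every field name here is illustrative only -- the actual schema is exactly what the bounty would need to define.

```python
import json

# Hypothetical profile layout; field names are illustrative, not a spec.
profile_text = """
{
  "title": "Software BFI",
  "constraints": {
    "fail": "output_hz % input_hz != 0",
    "warn": "output_hz < 120"
  },
  "settings": {
    "black_frame_count": {"type": "integer", "range": [1, 7], "default": 3}
  },
  "shader": "bfi.hlsl"
}
"""

profile = json.loads(profile_text)

def validate_setting(profile: dict, name: str, value) -> bool:
    """Check a user-supplied setting value against its declared range."""
    lo, hi = profile["settings"][name]["range"]
    return lo <= value <= hi

ok = validate_setting(profile, "black_frame_count", 3)   # within [1, 7]
bad = validate_setting(profile, "black_frame_count", 9)  # out of range
```

Being plain text, such a profile stays easy for advanced users to edit and share, per the requirement above.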

Possible Static Shader Variables

  • [boolean] Whether input is fixed-Hz or variable (e.g. VRR / VSYNC OFF)
  • [boolean] Whether output is fixed-Hz or variable (e.g. VRR / VSYNC OFF)
  • [float/double] Input refresh rate (aspirational if variable)
  • [float/double] Output refresh rate (aspirational if variable)
  • [enum?] Actual sync technology of actual FSE output (VSYNC ON, VSYNC OFF, VRR)
  • [integer] The input horizontal resolution
  • [integer] The input vertical resolution
  • [integer] The output horizontal resolution
  • [integer] The output vertical resolution
  • [?] The input format (16, 24, 32, FP16 HDR, FP48 HDR10, etc)
  • [?] The output format (16, 24, 32, FP16 HDR, FP48 HDR10, etc), typically same but can be different
  • Other accessible settings may be needed, if easy to include (e.g. multimonitor coordinate of the upper-left corner of virtualized screen)

Possible Dynamic Shader Variables

Basically, shader variables that change once a frame / once a refresh cycle.

  • [framebuffers/textures] Previous unreprocessed "refresh cycle" (1 by default, configurable)
  • [framebuffers/textures] Current unreprocessed "refresh cycle"
  • [framebuffers/textures] Previous post-processed "refresh cycle" (1 by default, configurable)
  • [framebuffers/textures] Current post-processed "refresh cycle"
  • [numeric] (if possible grabbing timestamps from a presentation hook for refresh cycle processor) Current frame presentation time (e.g. timestamp of last Present event); this will require a barebones presentation hook to capture the frame presentation time
  • [framebuffers/textures] Previous unreprocessed frames and their frame presentation time (independent of refresh cycles), for processing frames/framerates independently of refresh cycle.
  • [integer] Monotonically increasing frame counter (since load of driver)
  • [integer] Monotonically increasing refresh cycle counter (since load of driver)
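
The dynamic variables above amount to a small per-cycle state block plus a bounded history of previous refresh cycles. A minimal Python sketch (names and the history depth of 4 are my own assumptions) of what the driver would refresh and hand to the shader every cycle:

```python
from collections import deque
from dataclasses import dataclass, field

@dataclass
class RefreshCycleInputs:
    """Hypothetical per-refresh-cycle state handed to the shader."""
    refresh_counter: int = 0        # monotonically increasing since driver load
    frame_counter: int = 0          # increments only when the app Present()s
    last_present_time: float = 0.0  # timestamp of the app's last Present()
    # Ring buffer standing in for the configurable number of previous
    # "refresh cycle" framebuffers/textures:
    history: deque = field(default_factory=lambda: deque(maxlen=4))

state = RefreshCycleInputs()
for cycle in range(6):
    state.history.append(cycle)  # stand-in for that cycle's framebuffer
    state.refresh_counter += 1

kept = list(state.history)  # only the most recent 4 cycles survive
```

In a real driver these would live in a GPU constant buffer plus texture array, updated once per refresh cycle without recompiling the shader.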

Output should support Full Screen Exclusive (FSE)

  • FSE also prevents the output and input refresh rate from throttling each other.
  • This also lowers input latency of filter processing too.
  • The output mode should have its sync technology configurable (VSYNC ON, VSYNC OFF, VRR).

ADVANCED NOTE (doesn't have to be part of bounty at first): If you have a Present() hook too and concurrently support both frame processing and refresh cycle processing, it is possible to have virtually lagless processing + VSYNC OFF mirroring! (VSYNC OFF on input Hz, VSYNC OFF on output Hz). In this case, the only lag is how fast the filter executes.

Configurable constraints on a per-filter basis

  • A temporal processing module (settings + shader) could specify what it needs / what it does not need
    • An overdrive lookup table system will require access to previous refresh cycles (but not frames or frame presentation times)
    • Virtualized VRR on non-VRR will require access to frame presentation times and previous frames (preferably all frames since previous refresh cycle, up to a max)
  • Ideally I'd like HDR and SDR to be able to diverge too.
    • An HDR nits-booster shader for SDR, will require input to be SDR and output to be HDR
    • An HDR analysis shader for SDR (e.g. heatmap debug), will require input to be HDR and output to be SDR
  • Optional exact constraints like "input Hz is always half output Hz", or "output Hz is always twice input Hz" or "output horizontal resolution is twice input horizontal resolution".
  • Some algorithms such as BFI may require "output Hz is always an integer multiple of input Hz".
    • Maybe a startup-time-executable math formula that is eval()'d somehow upon driver startup or loadable-module startup, or maybe a separate startup-only shader function that executes -- whichever is easiest for implementing a very customizable settings-constraints system for a specific filter.
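
The eval()'d-formula idea can be sketched directly. A minimal Python illustration (variable names like `input_hz`/`output_hz` are my own assumptions) that evaluates a profile-supplied constraint at module-load time, with builtins disabled so the formula can only see the mode variables; a production driver would likely want a real expression parser instead of eval():

```python
def check_constraint(formula: str, **mode) -> bool:
    """Evaluate a profile-supplied constraint formula (e.g.
    'output_hz == 2 * input_hz') against the current mode metrics,
    exposing only the mode variables to the expression."""
    return bool(eval(formula, {"__builtins__": {}}, dict(mode)))

# BFI-style constraint: output Hz must be an integer multiple of input Hz.
ok = check_constraint("output_hz % input_hz == 0", input_hz=60, output_hz=240)
# Stricter exact-double constraint fails for 60 -> 240:
bad = check_constraint("output_hz == 2 * input_hz", input_hz=60, output_hz=240)
```

The same mechanism could power the warning-league vs. fail-league split: two formulas per profile, evaluated at startup.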

Input data for shader processing

  • Input shader data/variables considered "static" can be initialized on startup or during mode changes (e.g. resolution changes, refresh rate changes). For these cases, anytime you need to change these variables (such as screen mode changes), it is generally ok to essentially "reboot" the driver (e.g. reloading the shader module from scratch).
  • If possible, support for dynamic (once-a-Hz) variable updates would be desired. Input shader variables considered "dynamic" need to be refreshed every refresh cycle (e.g. frame presentation time). This is needed for some algorithms like virtualized software VRR ala https://www.testufo.com/vrr running on a fixed-Hz display. This may require creativity to pull off without shader recompiles (I'm not 100% familiar with what will force a shader recompile).

API for driver, for a settings application, and for screwup-recovery, etc.

  • Some filters may accidentally make the display unintelligible, so an API that provides a recovery mechanism may be needed.
  • An API should be built into the driver to allow applications to control the driver, such as switch filters. This would make unit testing easier, as well as loading data, as well as messup-recovery, as well as a settings applet / system tray / etc.
  • It would make it possible to have a system tray app with a hotkey that reverts to simple mirroring operation after a shader messup (e.g. a shader making the display unreadable), or some other reasonable recovery mechanism. With a driver supporting an API of some kind, a system tray app can monitor a hotkey and send an API call to put the driver into a safe state. It could also prevent things like the refresh rate getting too low (e.g. <1 Hz), or other problematic conditions that would otherwise require the user to hard-poweroff the computer.
  • API should be ideally able to enable/disable the virtual monitor, reverting to non-virtual display (eg. Feature Request: Disable Other Monitors #19). Sort of like switching between two DisplayFusion profiles with two separate hotkeys, which can disable one monitor while enabling a different monitor. I could use DisplayFusion to do this instead, if not possible.

Settings applet of some sort (whether in Windows CP or system tray, etc)

  • Settings application could handle the import/export responsibility for loadable filter modules
  • Existence of API could allow a separate settings application / control panel application to make configuring the driver much more user friendly. Could be part of an existing Windows Control Panel workflow, or could be a launchable system tray style application.
  • Ideally, the settings should be able to be dynamically adjustable (within constraints) via the API mentioned above. This would allow a settings application to see real time results as the user adjusts. If not practical, understandable; but this would be ideal. Like you can do for a gamma setting for a gamma-enhancing filter (of other software), I would be able to realtime adjust an overdrive setting, or a phosphor decay setting, or another temporal-filter setting.

This is just a proposal

Other possible workflows might be better and merit discussion; this is how I've visualized this so far;

External filter support

You can try testing SmoothVideoProject attachment support, maybe it already works out-of-the-box, because it can (in theory) attach to any DirectX framebuffer (Direct3D, DirectShow). If not enough compute, just use the crappy laggy interpolation as a validation test -- if that works, then the massively better AI-based interpolation might work okay.

  • ReShade
  • SmoothVideoProject

Multiple GPU support

Also, given that some filters (e.g. RIFE, or future 10:1 reprojection algorithms) can be more compute-heavy than usual, we might need to support 2-GPU systems -- so one GPU can be used for the game (primary) and a secondary GPU is used for the compute-heavy filter workloads. So architectural decisions (long term) may need to (eventually) allow a "Select Preferred GPU to run Shaders/Filters On", even one different from the rendering GPU.

Multiple Video Output Support (Refresh Rate Combining):

This will be useful for refresh rate multiplication experiments. A virtual display driver can also support refresh rate multiplication (a Blur Busters breakthrough, for which I'm still formulating a white paper) through two methods:

  • Two strobed monitors in a beam splitter mirror;
  • Multiple strobed projectors pointing to the same projection screen;

So you can produce a 960Hz screen by combining four 240Hz, eight 120Hz or sixteen 60Hz projectors pointing to the same screen.

A virtual Windows display driver virtualizes a higher refresh rate, and outputs the frames round-robin to multiple lower-Hz GPU outputs.
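
The round-robin routing is simple modulo arithmetic. A minimal Python sketch (my own illustrative code): frame i of the virtualized high-Hz stream goes to physical output i mod N, so four 240Hz projectors pointed at one screen combine into an effective 960Hz presentation.

```python
def assign_output(frame_index: int, num_outputs: int) -> int:
    """Round-robin: route frame i of the virtualized high-Hz stream to
    physical output i % num_outputs. With four 240Hz outputs, each output
    carries every 4th frame of a 960Hz virtual refresh rate."""
    return frame_index % num_outputs

routes = [assign_output(i, 4) for i in range(8)]
# frames 0..7 -> outputs [0, 1, 2, 3, 0, 1, 2, 3]; each output sees 960/4 = 240Hz
```

The harder part (offsetting the VBIs so the outputs interleave in time) is the genlock trick described below, not shown here.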

There are some additional complicated techniques to projection-map and/or offset the VBIs, which involve simulating fixed Hz via VRR and using VRR as a software genlock to slew offsetted blanking intervals (but leave those details to me). To make it easier for multiple projectors pointing to the same screen, even a custom user-written shader could do projection mapping with settings adjustments (keystone/distortion correction) and specify which output the next framebuffer should go to (or something like that) -- but leave those details to the shader.

What I need is a filter-supporting driver framework that makes it all possible at the end; which is part of the reason of the bounty.

Epilogue

If only a subset can be supported, that's negotiable for the bounty (you can communicate with me privately at mark [at] blurbusters.com first, if necessary, before discussing publicly here).

Some of the above is easy, and some of the above is hard. A task breakdown might need to be brainstormed (stage 1, stage 2, stage 3...)

Just putting some known factors down, to help brainstorming, task-breakdown, and architecting... You are welcome to continue to implement subsets of these ideas without doing a bounty. Basically, you're welcome to use a subset of my ideas (less than the bounty) if the full scope isn't feasible -- at least I will have helped improve a project. But a full shader-customizable driver, if possible, is what the bounty is for.

mdrejhon commented Nov 20, 2023

External Processors (As additional support other than plug-ins)

Although it is desired to have built-in shader plugins, nominally the framework should be able to support external processors (mostly spatial processors); this does not necessarily have to be part of the bounty. But it is desirable to try to include support for popular external filter processors, which usually means simple API calls:

Another purpose: add optional support for SmoothVideoProject filters -- there are some custom interpolation engines (e.g. RIFE 4.6 NN) that do near-flawless AI-based interpolation for retrogaming.

Currently it requires an HDMI capture card like an Elecard, but a virtual graphics driver supporting SmoothVideoProject and its most advanced interpolation plugins would eliminate the capture card.

Interpolation is not well liked by most gamers, but an RTX 4090 is able to run RIFE 4.6 NN in realtime to create 1080p/120 that looks really "native" (relatively artifact-free and low latency) out of 1080p/60 content, while still having enough compute left over to render the game itself.

I think something (roughly like this) is already in the works by a few people I know on Discord!

Now... a lot of things have improved since this thread was created -- especially AI-based interpolation.

For absolutely stunning, near-flawless interpolation on 2D games (like RTS-style), I recommend using an Elecard capture card + the RIFE 4.6 NCNN interpolation engine + an RTX 4000 series card. It can do some of the world's best NN-AI-based interpolation to 1080p/120 in real time, using a compute-heavy interpolation/extrapolation algorithm that requires the latest RTX cards.

It's a convoluted setup, but it makes retro games look like true native 120fps, with only rare artifacts.

The NN-AI "learns" the backgrounds of games while you play them, and uses them for background-reveal or parallax-infill (sometimes pixel-perfectly, sometimes not, depending on the game you try). So it's a more flawless interpolation for retrogaming, platformers, and RTS games. For an interpolation "black box in the middle" type of setup that is not integrated into the engine, it is the most native-looking realtime interpolation today's compute can get you.

There's input lag, but it's not too shabby. In theory only +1 frame! (excluding capture card overhead).

Long term, I want to see somebody implement this into a Windows Virtual Display Driver so that you can omit the capture card (Elecard).

For more information about artificial-intelligence (NN) based interpolation, which is the state of the art, see https://www.svp-team.com/wiki/RIFE_AI_interpolation -- it's essentially supercomputing-league interpolation, light years in quality beyond any interpolation algorithm I've ever seen. Recommended if you /must/ use interpolation in retrogaming.

It's nigh native-looking on most retrogaming content! Sadly, it has to be piped through an Elecard capture card (lossless HDMI video input). I hope this changes so we can do it natively as a filter plugin (e.g. ReShade/SweetFX style, though a virtualized graphics driver would be even better, since it would work on any content).

So from this come possible additional ideas -- not all universally part of the bounty, but nominal support (e.g. multiple-output support) that makes a shader-filter programming task possible -- which I've edited into the previous posts:

External filter support

You can try testing SmoothVideoProject attachment support; maybe it already works out-of-the-box, because it can (in theory) attach to any DirectX framebuffer (Direct3D, DirectShow). If there isn't enough compute, just use the crappy laggy interpolation as a validation test -- if that works, then the massively better AI-based interpolation might work okay.

  • ReShade
  • SmoothVideoProject

Multiple GPU support

Also, given that some filters (e.g. RIFE, or future 10:1 reprojection algorithms) can be more compute-heavy than usual, the driver might need to support 2-GPU systems -- one GPU is used for the game (primary) and a secondary GPU for the compute-heavy filter workloads. So architectural decisions may (eventually) need to allow a "Select Preferred GPU to run Shaders/Filters On", even one different from the rendering GPU.

Multiple Video Output Support (Refresh Rate Combining):

This will be useful for refresh rate multiplication experiments. A virtual display driver can also support refresh rate multiplication (a Blur Busters breakthrough, for which I'm still formulating a white paper) through two methods:

  • Two strobed monitors in a beam splitter mirror;
  • Multiple strobed projectors pointing to the same projection screen;

So you can produce a 960Hz screen by combining four 240Hz, eight 120Hz or sixteen 60Hz projectors pointing to the same screen.

A virtual Windows display driver virtualizes a higher refresh rate, and outputs the frames round-robin to multiple lower-Hz GPU outputs.
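The round-robin idea above can be sketched in a few lines. This is an illustrative model only (not driver code); the function name `route_frames` and its signature are hypothetical, chosen just to show the frame-to-output arithmetic:

```python
# Illustrative sketch (not driver code): round-robin frame routing for
# refresh rate combining. A 960 Hz virtual display driven by four 240 Hz
# outputs sends frame k to output k % 4; each output then refreshes at
# 240 Hz, but consecutive virtual frames are only 1/960 s apart overall.

def route_frames(virtual_hz, num_outputs, num_frames):
    """Return (physical_hz, [(output_index, present_time_sec), ...])."""
    physical_hz = virtual_hz / num_outputs   # rate each physical output runs at
    schedule = []
    for k in range(num_frames):
        output = k % num_outputs             # round-robin output selection
        t = k / virtual_hz                   # frame k's target present time
        schedule.append((output, t))
    return physical_hz, schedule

physical_hz, schedule = route_frames(960, 4, 8)
# physical_hz is 240.0; each output receives every 4th frame, spaced
# 1/240 s apart on that output, phase-offset by 1/960 s between outputs.
```

The phase offset between outputs is what the offsetted-VBI trickery below has to enforce in the actual video timing.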

There are some additional complicated techniques to projection-map and/or offset the VBIs -- simulating fixed Hz via VRR, and using VRR as a software genlock to slew the offsetted blanking intervals (but leave those details to me). To make multiple projectors pointing at the same screen easier, even a custom user-written shader could do the projection mapping with settings adjustments (keystone/distortion correction) and specify which output the next framebuffer should go to (or something like that) -- but leave those details to the shader.

What I need is a filter-supporting driver framework that makes it all possible in the end; that's part of the reason for the bounty.

@daiaji

daiaji commented Feb 18, 2024

https://eligao.com/enabling-freesync-on-unsupported-displays-f90ce7e8089346d2bbbe9275b21ba3ca

By design, FreeSync does not require any special hardware as G-Sync does. Wouldn't it be nice to get it to work on my other non-FreeSync monitors?

VRR seems to only need to modify the EDID?
I use LookingGlass to play games on a virtual machine, but my graphics card RX550 does not support HDMI VRR (AMD said this graphics card does not support it), so I only have one DP interface that supports VRR, so VRR support on a virtual monitor is very useful.

@mdrejhon
Author

mdrejhon commented Feb 19, 2024

VRR seems to only need to modify the EDID?

Depends. It requires the panel to be VRR-tolerant.

I've seen VRR forced over DVI and VGA and old-HDMI versions before, through this ToastyX trick on AMD GPUs, as it's simply a variable-size blanking interval on a standard video signal, to vary the interval between refresh cycles (unchanged horizontal scanrate, varying vertical refresh rate).
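The "variable-size blanking interval at an unchanged horizontal scanrate" mechanism can be shown with a little arithmetic. The numbers below are illustrative example timings, not a real panel's EDID values:

```python
# Illustrative numbers only: VRR varies the refresh interval by stretching
# the vertical blanking interval while the horizontal scanrate stays fixed.
# vertical_total = active lines + blanking lines per refresh cycle, and
# refresh_hz = horizontal_scanrate / vertical_total.

def vertical_total_for_hz(h_scanrate_khz, target_hz):
    """Lines per refresh cycle needed to hit target_hz at a fixed scanrate."""
    return (h_scanrate_khz * 1000) / target_hz

# Example: a 1080p signal scanned at a 135 kHz horizontal scanrate.
vt_120 = vertical_total_for_hz(135, 120)   # 1125.0 lines (1080 active + 45 blank)
vt_60 = vertical_total_for_hz(135, 60)     # 2250.0 lines: the extra ~1125
# lines are all blanking, i.e. a longer pause between refresh cycles.
```

Any refresh rate between those extremes is just an intermediate vertical total, which is why a VRR-tolerant panel needs no special hardware to follow it.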

There are many panels that are intolerant of VRR forced upon it (goes blank).

Also, panels with generic adaptive-sync and no compensations will do a poor job on VRR quality; they usually won't have good overdrive compensation for VRR. This means you may get LCD ghosting artifacts at certain framerates/refresh rates, which is more problematic for motion quality than with the high-level FreeSync Premium & G-SYNC Native certifications.

Example of an asymmetric LCD ghosting artifact; this can appear/disappear and worsen/improve at different VRR frame rates:
[image: asymmetric LCD ghosting example]

Motion-quality enthusiasts usually have a preference to pay extra for better/tested/premium certified VRR motion quality.

But, yes, it's also nice to be able to add VRR as a bonus feature to a non-VRR panel; as long as expectations are tempered.

I use LookingGlass to play games on a virtual machine, but my graphics card RX550 does not support HDMI VRR (AMD said this graphics card does not support it), so I only have one DP interface that supports VRR, so VRR support on a virtual monitor is very useful.

Unfortunately, virtual machines usually are incapable of VRR.
However, there is a possible trick, in three steps, to get VMs to work with VRR:

  1. Force the virtual machine to run in tearing VSYNC OFF mode (make tearing appear whenever VRR is disabled on host OS and monitor)
  2. Force the host OS in VRR mode (whether native or ToastyX trick).
  3. Turn VRR ON in the monitor menus.
    Then sometimes games running inside virtual machines will work fine with VRR.

In other words, if you have to force VRR, you have to do it at the native-host-OS level. For example, a Windows VM on a Windows host, would work with this trick. As long as you successfully got the VM to show tearing in step 1, then you can trick the VM to convert the tearing into successfully working VRR. But would not work with a Windows VM on a Mac, even if you were able to create a custom EDID via SwitchResX (the Mac equivalent of ToastyX), because the M1/M2/M3 doesn't support tearing during VSYNC OFF.

Technical Reason Why It Works: The first step, tearing, is a good indicator that VRR will work once VRR is enabled at the host OS + monitor level. Random tearing (as a temporary visual debugging method) means you've successfully disconnected frame-present timing from the fixed-Hz refresh rate. Tearing is simply a mid-refresh-cycle splice of a new frame, indicating that asynchronous delivery of new frames to the display is now functioning. Once that's solved, making the refresh rate float (enabling VRR in host drivers/host OS/host monitor) seamlessly "converts" the VSYNC OFF into successfully working VRR, even if the VM software is VRR-unaware. The VRR-unaware software is simply configured as VSYNC OFF, something that existed long before VRR, as a backwards-compatibility technique. That also happens to be how pre-VRR game software (e.g. 2004's Half Life 2 and the like) successfully worked with raster VRR (which arrived in the 2010s): you simply had to use VSYNC OFF on the game side to get VRR working on a VRR-supported OS+GPU+display. The same is true for VRR-unaware VM software that still supports VSYNC OFF with visible tearing artifacts whenever VRR is turned off.

Sorry about the sidetrack -- Blur Busters (which I founded) just knows this stuff, and I wanted to follow up.

</Temporary Side Track>

@daiaji

daiaji commented Feb 19, 2024

Looking Glass

An extremely low latency KVMFR (KVM FrameRelay) implementation for guests with VGA PCI Passthrough.

Basically I use a virtual machine with GPU passthrough to play games and have been using VRR for a while, so the issue doesn't seem to be there.

@mdrejhon
Author

Looking Glass
An extremely low latency KVMFR (KVM FrameRelay) implementation for guests with VGA PCI Passthrough.

Basically I use a virtual machine with GPU passthrough to play games and have been using VRR for a while, so the issue doesn't seem to be there.

Ah, GPU passthrough is different! That's easier.

@daiaji

daiaji commented Feb 19, 2024

Looking Glass
An extremely low latency KVMFR (KVM FrameRelay) implementation for guests with VGA PCI Passthrough.

Basically I use a virtual machine with GPU passthrough to play games and have been using VRR for a while, so the issue doesn't seem to be there.

Ah, GPU passthrough is different! That's easier.

So for virtual displays, to achieve VRR, just edit EDID?

@mdrejhon
Author

mdrejhon commented Mar 4, 2024

So for virtual displays, to achieve VRR, just edit EDID?

In theory. There could be some gotchas, as virtual display drivers may do other behaviors that prevent VRR from working.
But now we're getting offtopic; I wanna focus on the github item.

@mdrejhon
Author

mdrejhon commented Mar 4, 2024

Hi!
Thanks for the feature request! Also thanks for the goodwill sponsorship!

Any update? I'd love to at least get access to a Apache/MIT codebase of a virtual display driver this year.

May you contact me at [email protected] and we can negotiate -- I might have third party resources that may be able to help and I may be able to submit some improvements in due time.

@MolotovCherry
Owner

MolotovCherry commented Mar 4, 2024

I'd love to at least get access to a Apache/MIT codebase of a virtual display driver this year.

That's fine, I'll switch the license to a more permissive one soon.

May you contact me at [email protected] and we can negotiate

I assume you were referring to the license when you said this, right? (So that should be covered by my above statement?) Or did you have something else in mind?

I might have third party resources that may be able to help and I may be able to submit some improvements in due time.

That does sound great! Contributions are most welcome.

@mdrejhon
Author

mdrejhon commented Mar 4, 2024

I assume you were referring to the license when you said this, right? (So that should be covered by my above statement?) Or did you have something else in mind?

Either or both.

I need maximum flexibility to figure out how to improve the codebase to meet multiple needs. As well as submit it back to the open source community (but also be able to use it in future proprietary Blur Busters software too).

I'm open to many kinds of arrangements, but simply switching licenses would also be the simplest if you have no time for the feature requests -- at least I'd have more options, via both funded-developer and volunteer-developer routes.

I'd still want to contribute a bunch of changes back somehow (even if I used a funded software developer to do it), or make a forked open source project.

Also, custom blur-busting software is a very niche field with niche skills required, so additional options are welcome! Obviously, if I am unable to find developer resources, it may be a while, but the sooner it is Apache/MIT, the sooner I can find non-unobtainium developer options within the next several months.

For those who want to contact me: for the actual work, I may have a modicum of funding available for an open source software developer who's allowed to recontribute it back to any relevant open-source project (but only if it's on an MIT/Apache codebase -- I have simultaneous open-source & proprietary needs that I'm trying to make compatible).

@MolotovCherry
Owner

MolotovCherry commented Mar 4, 2024

I need maximum flexibility to figure out how to improve the codebase to meet multiple needs. As well as submit it back to the open source community (but also be able to use it in future proprietary Blur Busters software too).

I'm open to many kinds of arrangement, but simply switching licenses would also be the simplest if you have no time to do the feature requests -- at least I'd have more options, whether via both funded developer and volunteer developer routes.

You can operate under the assumption that it's switched to MIT already (I'll get around to it soon, by the time you find anything, it'll already be switched, so it's not an issue).

@MolotovCherry
Owner

@mdrejhon It's switched now to MIT

@mdrejhon
Author

mdrejhon commented Mar 8, 2024

@mdrejhon It's switched now to MIT

Fantastique! This just opened up maybe three times as many developer options -- in a very difficult "nichengineering" project. Thank you!

Keep in mind this is a self-funded passion project, so a bit of time may pass before I successfully move forward on getting the chess board set up. I will inform you when I do.

@gnif

gnif commented Nov 28, 2024

I was just made aware of this bounty.

Please be aware that any indirect display driver (IDD), by the design of the API, is taking captures of the desktop, which does, without a doubt, impact latency and overall system performance of 3D workloads.

See: IddCxSwapChainReleaseAndAcquireBuffer and IddCxSwapChainFinishedProcessingFrame. It is impossible to avoid overheads that the IDD brings with it if all you want to do is fake a display.

Edit: As long as the goal isn't benchmarking, this should be fine, but just throwing this out there, as we see so many in the Looking Glass (also a capture application, but in a VM) community using variations on the Microsoft IDD Sample Driver (which is what this is) and not understanding why performance tanks, or is unstable.

@mdrejhon
Author

mdrejhon commented Nov 29, 2024

Edit: As long as the goal isn't for bench-marking, this should be fine, but just throwing this out there as we see so many in the Looking Glass (also a capture application, but in a VM) community using variations on the Microsoft IDD Sample Driver (which is what this is) and not understanding why the performance tanks, or is unstable.

Disambiguation:

  • Performance benchmarking for maximum performance = no
  • Benchmarking for motion quality improvements = yes
  • Evaluating refresh rates that don't exist yet = yes

Some of the applications for this are black frame insertion for retro material (any emulator you wish to run), so that's low-performance content. Motion blur reduction via software means, ala beta.testufo.com/blackframes, except working desktop-wide in a way much more reliable (less erratic flicker) than the DesktopBFI app.
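For readers unfamiliar with software BFI, here is a minimal sketch of the scheduling it requires. The function name `bfi_sequence` is hypothetical; the point is that the on/black decision is made per refresh cycle, not per content frame, which is why it needs refresh-cycle-granularity hooks in the driver:

```python
# Sketch of software black frame insertion (BFI): at a display refresh rate
# that is an integer multiple of the content frame rate, show each content
# frame for one refresh cycle and black frames for the remaining cycles.

def bfi_sequence(display_hz, content_fps, num_refreshes):
    """Return per-refresh flags: True = show content frame, False = black."""
    assert display_hz % content_fps == 0, "display Hz must be a multiple of fps"
    cycles_per_frame = display_hz // content_fps
    return [r % cycles_per_frame == 0 for r in range(num_refreshes)]

# 120 Hz display with 60 fps content: alternate visible/black every refresh,
# roughly halving sample-and-hold motion blur at the cost of brightness.
seq = bfi_sequence(120, 60, 8)  # [True, False, True, False, ...]
```

Higher multiples (e.g. 240 Hz over 60 fps with a 1-visible/3-black pattern) trade more brightness for more blur reduction.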

Another is to improve motion quality of low-CPU/low-GPU workloads on LCDs by adding IDD-based software overdrive (a clone of ATI Radeon Overdrive from the early '00s). This includes even mere browser scrolling, since adding software-based overdrive to overdriveless laptop LCDs actually cuts LCD GtG by more than half (yes, you can do it in a software shader!). The catch is that it mandatorily requires refresh-cycle precision independent of the frame rate of the underlying content, and needs to work on office/text/browser scrolling to improve motion quality benchmarks, so it's not practical to do with frame-injection.

There are 1000+ (very niche) use cases, but these are two examples. More mainstream use cases would be filter injection (SweetFX, ReShade, NVIDIA Freestyle), but those can be done by frame injection instead of refresh-cycle injection. On the other hand, refresh-cycle injection means you can apply effects Windows-wide, and in video playback, not just to videogame frame buffers.

For many esoteric use cases:

  • People are willing to sacrifice application framerate in order to ensure every single refresh cycle is reprocessed and VSYNC'd properly in the injector (as long as the injected filter uses only a fraction of the refreshtime).
  • People are willing to upgrade GPUs to add headroom for this injection overhead.

Essentially, this is an "every refresh cycle must be done and vetted by us" situation (much like VR, where the compositor automatically reprojects if original frames are late). This may require extremely high thread priorities within the reprocessor. Degrading framerate is the lesser evil in many specific use cases, such as software-based custom overdrive curves for overdriveless mobile LCDs to improve motion clarity during LCD scrolling (scrolling often uses only 10% CPU and GPU anyway), things like that.

I successfully implemented overdrive in a private TestUFO test, which I'll probably publish in 2025, so there's some internal precedent (newer than ATI Radeon Overdrive from twenty years ago). It's simply an A[B]=C thing, where A is the original subpixel greyshade, B is the new subpixel greyshade, and C is the overdrive color -- like using 220 to speed a pixel transition from 100 to 200 (dark grey to light grey, computed on a per-RGB-channel basis). Easy shader work with a 256x256 overdrive lookup table. That's what display firmware does (they actually cheap out with interpolated 17x17 overdrive lookup tables), but it also works GPU-side. It's only done when a pixel changes, and only for one refresh cycle after -- which means it's gotta be Hz-granularity, independent of underlying framerate. It can halve scrolling motion blur on the worst laptop LCDs (e.g. running the TestUFO overdrive test on slow laptop LCDs like MacBook Pros or DELL/HP laptops). Shame I can't make this crossplatform; it even works in JavaScript at low overhead, so an IDD approach on PC should be doable.
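The A[B]=C lookup described above can be sketched as follows. This is an illustrative toy curve with a made-up linear `gain` parameter -- real panels use measured, per-panel nonlinear tables -- but it reproduces the 100-to-200-drives-220 example from the text:

```python
# Sketch of a 256x256 overdrive lookup table: indexed by (previous, target)
# greyshade, it returns an overdriven drive value that pushes the LCD pixel
# past its target for one refresh cycle, speeding up the GtG transition.
# The linear gain here is illustrative, not a tuned panel curve; firmware
# typically stores a sparse measured grid (e.g. 17x17) and interpolates.

def build_overdrive_lut(gain=0.2):
    """lut[prev][target] = target overshot in the transition direction."""
    lut = []
    for prev in range(256):
        row = []
        for target in range(256):
            # Overshoot proportionally to the size of the transition,
            # clamped to the valid 0..255 drive range.
            drive = target + gain * (target - prev)
            row.append(max(0, min(255, round(drive))))
        lut.append(row)
    return lut

lut = build_overdrive_lut()
# Transition from 100 to 200: drive 220 for one refresh cycle, then 200.
drive = lut[100][200]  # 220
```

In a shader this becomes one texture fetch per subpixel per refresh cycle, applied only where the previous and current refresh cycles differ.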

NOTE: While the earlier bounty has expired (I had a Dec 31, 2023 deadline posted somewhere else, but extended it into 2024), I am willing to extend it further, by invitation (e.g. a taker), for some delivery date in 2025. Please contact me, mark [at] blurbusters.com, to negotiate a possible new bounty. Or simply donate the code, if you can implement only a subset. Either way, the biggest mandatory requirement is an Apache/MIT or similar hobbyist+commercial permissive license, since this is too niche to be fully noncommercial, and too niche to be fully commercial. It's something I've been trying to make happen for years. But it unlocks a lot of crazy-neat stuff!
