
data missing from 16-bit PNG depth map #2

Open · AugmentedRealityCat opened this issue Mar 22, 2023 · 12 comments

@AugmentedRealityCat

There seems to be a problem with the 16-bit PNG depth map you can download from the first tab, and even the preview seems to have similar problems.

Here is the picture used to create a depth map:

[image: 00027]

And here is the preview we get of the resulting depth map. You can clearly see that some parts are completely black, as if they were missing or clipped out of the range covered by the depth data.

[image: preview_8bpc]

As for the actual 16-bit PNG, it's almost completely black; on the curves histogram you can see all the data is squeezed to the left.

[image: a]

I have the impression the raw data coming from the depth analysis is good (at first glance it seems to be interpreted correctly by both the 3D model and the 3D panorama mesh extraction processes), but it needs to be interpreted differently to create useful PNGs and to provide the user with a representative preview of the complete data.

If you try to rescale the depth data in the 16-bit PNG over a wider color range, it will not work: there is some data there, but only a couple of bits' worth! If you map them using brightness curve adjustments in Photoshop, you get this:

[image: b]

As you can see, there is heavy color banding because there is not enough data in the image to produce a smoother gradient.

I can provide examples and tests to illustrate the problem, but I'm guessing it's probably just a simple math mistake in converting the raw depth values to 16 bits.

I would check the following line:

raw_depth = Image.fromarray((depth*256).astype('uint16'))

I have changed it myself to 8192 (65,536 / 8) and it almost seems to work! But instead of going from 0 to 255 (if we were to interpret the image in 8 bits), the values go from approximately 50 to 155. So there seems to be both an offset to compensate for and the need for a larger multiplier than 8192.

That gives you something like this: a very decent depth map, and it might be a properly scaled one as well.

[image: d]

And if you adjust the brightness curve to pseudo-normalize it (extend it manually to cover the full range from black to white), you get a nice-looking depth map, without any banding or clipping.

[image: e]

If I change the multiplier to 16384, then I can see there is an overflow, and the furthest distances are indicated with a gradient that restarts from black up to a dark grey.

[image: c]

Again, I interpret this as an offset that we have to compensate for somehow, which I tried to do manually using curves in Photoshop. It's a dirty solution, but it shows that there is indeed an offset that can be compensated for.

[image: f]
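
Taken together, these observations are consistent with plain range arithmetic. A small sketch (the metric depth values below are hypothetical, chosen only to reproduce the approximate 50-155 band and the overflow described above, on the assumption that ZoeDepth outputs metric depth):

import numpy as np

# Hypothetical metric depths (metres) for a scene like the one above.
depth = np.array([1.5, 3.0, 4.8])

print(depth * 256)    # [ 384.   768.  1228.8] -> squeezed into the bottom ~2% of 0..65535, so the PNG looks black
print(depth * 8192)   # [12288.  24576.  39321.6] -> roughly the 50..155 band (in 8-bit terms) reported above
print((depth * 16384).astype('uint16'))  # [24576 49152 13107] -> the far end wraps past 65535, restarting from dark

The nearest object sitting around 1.5 m would also explain the apparent offset: nothing in the scene maps to the bottom of the range.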

We should also consider the colorspace used to encode the file itself, and check whether that colorspace and its associated gamma are declared in the PNG's metadata. This could lead to problems as well - I had to apply a counter-gamma of 0.4545 to interpret that depth map in 3D in a way that made sense.
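
For what it's worth, the counter-gamma step described above amounts to something like this (a minimal sketch; depth_01 is a hypothetical depth map already normalized to 0..1):

import numpy as np

# Hypothetical normalized depth map (0..1); 0.4545 is approximately 1/2.2.
depth_01 = np.linspace(0.0, 1.0, 5)
counter_gamma = np.power(depth_01, 0.4545)  # the counter-gamma applied before interpreting the map in 3D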

And, finally, there is also the fact that Photoshop, and many libraries and the software that uses them, do not save 16-bit PNG files correctly - they only use 15 bits for some reason. This is probably not related to what I have documented here, but we should keep the possibility in mind if nothing else works.

@vslash0

vslash0 commented Mar 23, 2023

I'm sorry, I am an ultra mega noob and I don't know if this is appropriate, but I've been looking through your previous comments on generating 3D models from images. I'm curious whether you're still pursuing that these days, what's the best method you've found, and whether ZoeDepth is a game changer.

@AugmentedRealityCat
Author

AugmentedRealityCat commented Mar 23, 2023

This ZoeDepth extension is AMAZING! The image-to-3D-panorama model is just unbelievable. It does in seconds what took me hours, and it does it better. The depth data extracted by the ZoeDepth algorithm is a big advantage over previous solutions for extracting 3D models, because its measurements are more accurate with regard to the real distances they represent - at least that's my impression so far.
The problem I've documented above is just a little detail. I've put plenty of pictures because it was hard to explain without showing what I meant.

Also, the hacks I've applied reflect my lack of any kind of programming skill - they are NOT the solution, just a way to demonstrate that the ZoeDepth data itself seems to be there and accurate, and that it's simply not interpreted correctly in the PNG file you can download from the first tab.

Besides this little detail, everything in this extension seems to work perfectly well - try it, it's impressive!

@vslash0

vslash0 commented Mar 23, 2023

That all does sound impressive. I have a feeling I'm not supposed to have random discussions in GitHub comments like this, but I'm super curious whether you're still generating depth masks and doing 3D inpainting like with the code donlinglok has available. Could the depth data from ZoeDepth be used in the CVPR 2020 code to get even better results? I've been trying to tweak it with different MiDaS models, but the mesh.ply meshes I get aren't quite the best.

@AugmentedRealityCat
Author

I am convinced 3D inpainting + ZoeDepth will be amazing!

@Lumabrik

Lumabrik commented Mar 28, 2023

I've found that if you want a more automated way of adjusting the 16-bit PNG depth map, in PS go into Image > Adjustments > HDR Toning and use either 'Equalize Histogram' or 'Local Adaptation'. This works well for some use cases, but may not for others that require high-accuracy interpretation of depth. I've been applying the maps as displacement maps on subdivided planes in 3D software, and they work well with limited camera movements. Compared to other depth map generation methods, ZoeDepth is a bit better for this.
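
For a scriptable rough analogue of the 'Equalize Histogram' step, something like the sketch below should work on a 16-bit map (an approximation only; Photoshop's 'Local Adaptation' operator has no simple equivalent):

import numpy as np

# Histogram-equalize a 16-bit depth map: remap each pixel through the
# normalized cumulative histogram so the output spreads evenly over the range.
def equalize_uint16(img: np.ndarray) -> np.ndarray:
    hist, _ = np.histogram(img, bins=65536, range=(0, 65536))
    cdf = hist.cumsum().astype(np.float64)
    cdf /= cdf[-1]                                  # normalize the CDF to 0..1
    return (cdf[img] * 65535).astype(np.uint16)     # remap pixels through the CDF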

@AugmentedRealityCat
Author

> PS go into Image > Adjustments > HDR Toning and use either 'Equalize Histogram' or 'Local Adaptation'.

This works when you have a good source, but in the example above there isn't enough data to extrapolate from.

All the other levels adjustments I'm showing after the first one are just to demonstrate that if you change the code, you can actually get a 16-bit PNG with all the data. I just wish I had the skill to make the right modification!

Right now the better solution is to create the mesh directly in the extension, since that seems to use all the data at full precision, but I do hope that one day the code will get fixed so we can use those depth maps in other software while maintaining all the precision of the original depth estimation.

@Lumabrik

As I'm not a coder, can you explain how to modify the line:

raw_depth = Image.fromarray((depth*256).astype('uint16'))

for the extra bit depth? Thanks.

@AugmentedRealityCat
Author

raw_depth = Image.fromarray((depth*8192).astype('uint16'))

And don't forget to keep the indentation (all the empty spaces before the actual code).
This is just a hack, though, and I am not at all sure the result is metrically and mathematically correct. It's mostly a wild guess: I know the data is there somewhere - it is used properly to generate the 3D models directly in the extension - so the problem has to be in how that data is transcribed into a 16-bit PNG picture.

@Ratinod

Ratinod commented Apr 6, 2023

Found a better solution. With it, the output seems to use the full depth range.
Change the line
raw_depth = Image.fromarray((depth*256).astype('uint16'))
to
raw_depth = Image.fromarray((65535*(depth - depth.min())/depth.ptp()).astype('uint16'), mode="I;16")
But to use it in Blender, you still have to invert the colors in GIMP or Photoshop.
PIL.ImageOps.invert(raw_depth) doesn't work because raw_depth is not in RGB or L mode.
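
(A possible workaround, sketched and untested: do the inversion through numpy instead, assuming raw_depth is the "I;16" image produced by the line above.)

import numpy as np
from PIL import Image

# Invert a 16-bit image without PIL.ImageOps.invert, which rejects I;16 mode.
inverted = Image.fromarray(65535 - np.asarray(raw_depth, dtype=np.uint16), mode="I;16")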

@AugmentedRealityCat
Author

AugmentedRealityCat commented Apr 6, 2023

I will try that, and I do not mind having to invert the image in Photoshop, Gimp or Krita. Thanks a lot for sharing your code.

I'm not a programmer, so I might be getting all of this completely wrong, but can you explain a little bit what your code is doing? Here is what I understand from it at first glance:
65535 = the factor by which we extend our 0-to-1 data into a 0-to-65535 range.
(depth - depth.min())/depth.ptp() = some kind of normalization to better use the whole range?
uint16 = the type of data, a 16-bit integer.

Thanks again for your help !

@Ratinod

Ratinod commented Apr 6, 2023

I found a solution. Literally. Here. (And here is another place where the same line of code is used.)

I hope this helps find answers.
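
(To spell out the question asked above, the line can be read step by step like this; depth.ptp() is numpy's "peak-to-peak", i.e. depth.max() - depth.min():)

normalized = (depth - depth.min()) / depth.ptp()   # rescale to 0.0 .. 1.0
scaled = (65535 * normalized).astype('uint16')     # stretch across the full 16-bit range
raw_depth = Image.fromarray(scaled, mode="I;16")   # store as a true 16-bit grayscale image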

@Ratinod

Ratinod commented Apr 7, 2023

A 16-bit depth map that doesn't need to be inverted in an external editor, plus a better image preview (still 8-bit!!!).

gradio_depth_pred.py

Change the line
colored_depth = colorize(depth, cmap='gray_r')
to (i.e., comment it out)
#colored_depth = colorize(depth, cmap='gray_r')

and also change the line
raw_depth = Image.fromarray((depth*256).astype('uint16'))
to

        raw_depth = Image.fromarray((65535*(depth - depth.min())/depth.ptp()).astype('uint16'), mode="I;16")  # normalize to the full 16-bit range
        width, height = raw_depth.size
        temp_image = Image.new('I;16', (width, height), (65535))  # solid white 16-bit image
        buffer1 = np.asarray(raw_depth)
        buffer2 = np.asarray(temp_image)
        buffer3 = buffer2 - buffer1  # invert: 65535 minus each pixel
        raw_depth = Image.fromarray(buffer3)
        colored_depth = Image.fromarray((255*(depth - depth.min())/depth.ptp()).astype('uint8'), mode="L")  # 8-bit preview, same normalization
        temp_image = Image.new('L', (width, height), (255))  # solid white 8-bit image
        buffer1 = np.asarray(colored_depth)
        buffer2 = np.asarray(temp_image)
        buffer3 = buffer2 - buffer1  # invert: 255 minus each pixel
        colored_depth = Image.fromarray(buffer3)
        raw_depth.save(tmp.name)
        del buffer1  # release the temporary numpy views
        del buffer2
        del buffer3
        del temp_image

Perhaps there is a much more elegant solution to how it all needs to be done. But I don't care. It just works.
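
(A possibly more elegant equivalent, sketched and untested against the extension: fold the inversion into the normalization by flipping the ramp, so no temporary images or buffer subtractions are needed. It assumes the same surrounding imports and the tmp variable from the original file.)

        # Flip the ramp: nearest points map to 65535 (white), farthest to 0.
        raw_depth = Image.fromarray(
            (65535*(depth.max() - depth)/depth.ptp()).astype('uint16'), mode="I;16")
        colored_depth = Image.fromarray(
            (255*(depth.max() - depth)/depth.ptp()).astype('uint8'), mode="L")
        raw_depth.save(tmp.name)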

AugmentedRealityCat added a commit to AugmentedRealityCat/a1111-sd-zoe-depth-16bit that referenced this issue Jul 24, 2023
Modified using the code written by Ratinod over here:
sanmeow#2 (comment)