Best practice for null-terminated strings #85

Open
fleimgruber opened this issue Sep 22, 2021 · 4 comments

@fleimgruber

fleimgruber commented Sep 22, 2021

I am using a third-party API via a DLL. Some struct fields are wrapped as CBinding.Carray{Int8, 100, 100}; they represent null-terminated strings in the API and are often printed with many trailing null characters in Julia (implicitly converted by string, I guess). Consider, for example:

using CBinding
println(string(Carray{Int8,5,5}(120, 121, 122, 0, 0)))

which prints

"xyz\0\0"

Is it possible to handle these null-terminated strings automatically (especially for nested structs) in the conversion to Julia, e.g. by truncating? Or is there an obvious way to do this that I am missing? Are there any other recommendations for handling these cases?
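
For reference, truncating at the first NUL can be sketched in plain Julia, independent of CBinding.jl. The helper name cstring_truncated is hypothetical, and it takes an ordinary vector of Int8 or UInt8, so getting the bytes out of a Carray first is left as an assumption:

```julia
# Hypothetical helper, not part of CBinding.jl: build a String from raw C chars,
# stopping at the first NUL byte (or using the whole buffer if there is none).
function cstring_truncated(chars::AbstractVector{<:Union{Int8,UInt8}})
    n = findfirst(iszero, chars)
    stop = n === nothing ? lastindex(chars) : n - 1
    # `% UInt8` reinterprets signed chars as bytes without range errors.
    return String(UInt8[c % UInt8 for c in chars[firstindex(chars):stop]])
end

buf = Int8[120, 121, 122, 0, 0]
cstring_truncated(buf)  # "xyz"
```

This only covers the single-string case; a buffer with no terminator is returned in full.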

@krrutkow
Member

It is probably not possible to truncate the strings automatically, since C APIs use char arrays in so many ways (for example, several null-separated strings within the same char array) that automatic truncation would be problematic. Would it help if a keyword argument were available to enable the desired behavior? Something like this:

str = String(Carray{Int8,5,5}(120, 121, 122, 0, 0), truncate = true)
# "xyz"
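
To illustrate why automatic truncation is risky, here is a minimal sketch of the null-separated case mentioned above. The buffer contents are made up for illustration and no CBinding.jl API is used:

```julia
# A made-up char buffer holding two NUL-separated strings plus NUL padding.
buf = Int8[97, 98, 0, 99, 100, 0, 0, 0]

# Reinterpret the signed chars as bytes and keep every NUL in the String.
s = String(UInt8[c % UInt8 for c in buf])

# Truncating at the first NUL silently drops the second entry.
first_only = first(split(s, '\0'))          # "ab"

# Splitting on NUL and dropping empty pieces recovers both entries.
parts = split(s, '\0'; keepempty = false)   # ["ab", "cd"]
```

Which of the two results is correct depends on the C API, which is why an opt-in keyword seems safer than a default.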

@fleimgruber
Author

fleimgruber commented Sep 24, 2021

Thanks for the feedback. Yes, after more consideration, automatic truncation is indeed problematic for C APIs in the wild. To mimic it for user (API) defined types and fields, I was thinking of two approaches on the user side:

1. getproperty and getfield

using CBinding

c`-std=c99 -Wall`

struct S
    name::Carray
end

name = Carray{Int8,5,5}(120, 121, 122, 0, 0)
c = S(name)

function Base.getproperty(c::S, v::Symbol)
    if v == :name
        normalized = replace(string(getfield(c, v)), r"\0+$" => s"")
        return normalized
    else
        return getfield(c, v)
    end
end

println(c.name)

which still prints

"xyz\0\0"

while

println(replace("xyz\0\0", r"\0+" => s""))

gives the desired

"xyz"

If that worked, no changes to CBinding.jl would be required and users could manage their fields explicitly. Another issue I saw with this approach arises with structs that are not pure Julia but defined via c"...", where we would get, e.g.:

  type (c"struct DWChannel") has no field array_size
  Stacktrace:
   [1] getproperty(::var"(c\"struct DWChannel\")", ::Symbol)

2. Explicitly wrap field access and modification in methods on C API wrapped types

Currently, I like this approach better, as it is more idiomatic and explicit, and it works well with the rest of CBinding.jl without changes to its code or to lower-level mechanisms (which approach 1 relies on).

function name(c::DWChannel)
    replace(String(c.name), r"\0+$" => s"")
end
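
A self-contained sketch of the same idea in pure Julia, covering both access and modification. DemoChannel, with_name and the 8-byte buffer size are hypothetical stand-ins, not the real DWChannel type or any CBinding.jl API:

```julia
# Hypothetical stand-in for a wrapped C struct with a fixed-size char buffer.
struct DemoChannel
    name::NTuple{8,Int8}
end

# Getter: decode the bytes that come before the first NUL.
function name(c::DemoChannel)
    bytes = UInt8[b % UInt8 for b in Iterators.takewhile(!iszero, c.name)]
    return String(bytes)
end

# "Setter" for an immutable struct: build a new value with a NUL-padded buffer,
# always leaving room for at least one terminating NUL.
function with_name(::Type{DemoChannel}, s::String)
    units = codeunits(s)
    length(units) < 8 || throw(ArgumentError("name must fit in 7 bytes plus a NUL"))
    return DemoChannel(ntuple(i -> i <= length(units) ? units[i] % Int8 : Int8(0), 8))
end

ch = with_name(DemoChannel, "xyz")
name(ch)  # "xyz"
```

Keeping the encoding and decoding in one pair of functions means the rest of the user code never sees the padding.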

This might also be something for the docs - just to give an example of possible ways to deal with such cases.

@krrutkow
Member

The problem with how you are hijacking getproperty is that you are using getfield. Native Julia structs cannot fully represent C aggregate types, so getproperty is already used in a very generic way within CBinding. Therefore, if you invoke the more abstract getproperty that exists, your approach should work without issue:

function Base.getproperty(c::S, v::Symbol)
    x = invoke(getproperty, Tuple{supertype(S), Symbol}, c, v)
    return v == :name ? replace(String(x), r"\0+$" => s"") : x
end
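
The same invoke pattern can be tried without CBinding.jl. In this sketch AbstractRecord and Record are hypothetical types, and the generic getproperty method on the abstract type only stands in for whatever CBinding defines on its own supertypes:

```julia
abstract type AbstractRecord end

struct Record <: AbstractRecord
    name::Vector{UInt8}   # NUL-padded bytes, as a C char array would hold
    id::Int
end

# Generic accessor on the supertype (stand-in for the library's method).
Base.getproperty(r::AbstractRecord, v::Symbol) = getfield(r, v)

# Specialized accessor: delegate to the generic method via invoke,
# then post-process only the :name field.
function Base.getproperty(r::Record, v::Symbol)
    x = invoke(getproperty, Tuple{AbstractRecord, Symbol}, r, v)
    # copy(x) because String(::Vector{UInt8}) takes ownership of its argument.
    return v === :name ? replace(String(copy(x)), r"\0+$" => "") : x
end

r = Record(UInt8[0x78, 0x79, 0x7a, 0x00, 0x00], 7)
r.name  # "xyz"
r.id    # 7
```

The raw bytes remain reachable through getfield(r, :name) if the untruncated buffer is ever needed.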

A keyword argument for null-terminated string conversion should still be added to the package though, which would make that last line return v == :name ? String(x, truncate = true) : x.

As a side note, I didn't even realize this level of "customization" was still available for CBinding-generated types, but it does allow for some interesting use cases, like dynamically creating Julian constructs (String, IPAddr, etc.) when accessing the C API objects.

@fleimgruber
Author

Thanks for the corrected snippet, it works nicely! Glad this gave rise to new ideas on use cases. The issue is solved for me, so please feel free to close it or keep it open as a reminder for implementing the null-terminated string conversion.
