Using virtual nodes in Cypher #536

Open
rickardoberg opened this issue Nov 16, 2023 · 6 comments
Labels
bug Something isn't working

Comments

@rickardoberg

Using virtual nodes in Cypher

It appears that using virtual nodes in Cypher queries is either unsupported or buggy. If a procedure/function returns an APOC VirtualNode (or any Node implementation, for that matter), I would expect it to be usable like any other node.

Neo4j Version: 5.12.0
Operating System: Windows 11
API: Cypher

Steps to reproduce

Run this Cypher query:

WITH apoc.create.vNode(['vnode'], {name:'one'}) AS one
RETURN one.name

Expected behavior

Should return "one"

Actual behavior

Returns null

The reason I need this to work is to be able to implement field-level access control, which is not supported by Neo4j itself.

@rickardoberg rickardoberg added the bug Something isn't working label Nov 16, 2023
@mnd999 mnd999 transferred this issue from neo4j/neo4j Nov 16, 2023
@InverseFalcon
Contributor

InverseFalcon commented Nov 16, 2023

This one isn't a bug, but a limitation in virtual node property access, since they aren't actually real nodes.

You can use the apoc.any.property() function to access a property of any object, including a virtual node or relationship. Use it in your RETURN clause in place of direct property access to get the virtual property value.

...
RETURN apoc.any.property(one, 'name') as name

That said, I don't see this function or mention of it in the APOC virtual nodes documentation, and that really needs a mention there as well as a link to the function.

@rickardoberg
Author

It may not be a bug, but having "Map<String,Object>" being better supported than "Node" is a bit odd, wouldn't you say? Like.. why?

@mnd999
Contributor

mnd999 commented Nov 17, 2023

I'd be interested to hear what your needs for field-level access control are as well. That's something that we're thinking about at the moment.

@rickardoberg
Author

It's for an HR analytics service. For each node and each field we need to calculate whether the user making the query is allowed to access the value of that particular field, using the same rules as the HR system itself. The rules are in the graph as well, so in a sense it's back to the original reason Neo4j was created in the first place, but on a much more granular level.

In the end the model I ended up going with was to translate nodes into Map<String, Object> as the Cypher engine knows what to do with that. So when converting the node to that structure I can apply the rules, and then allow that data to be used for aggregations and output, behind a GraphQL engine which is what the user/UI uses to access the database.
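A minimal sketch of that conversion, in plain Java with no Neo4j dependency. `FieldProjection`, `project`, and the `canRead` predicate are hypothetical names, not taken from the service described here; in a real procedure the input map would presumably come from `Node.getAllProperties()` and the rule would be evaluated against the graph.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiPredicate;

final class FieldProjection {
    // Copies only the properties the given user may read into a plain map,
    // which is a structure the Cypher engine can work with directly.
    static Map<String, Object> project(Map<String, Object> properties,
                                       String user,
                                       BiPredicate<String, String> canRead) {
        Map<String, Object> visible = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : properties.entrySet()) {
            if (canRead.test(user, e.getKey())) {
                visible.put(e.getKey(), e.getValue());
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        Map<String, Object> employee = new LinkedHashMap<>();
        employee.put("name", "one");
        employee.put("salary", 1000);

        // Hypothetical rule: only the "hr" user may read the salary field.
        BiPredicate<String, String> rule =
                (user, field) -> !field.equals("salary") || user.equals("hr");

        System.out.println(project(employee, "analyst", rule)); // {name=one}
        System.out.println(project(employee, "hr", rule));      // {name=one, salary=1000}
    }
}
```

Dropping the hidden field entirely, rather than mapping it to null, is a design choice of this sketch; either is compatible with the description above.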

We're also doing tenant separation of graphs by applying tenant id to all nodes as a label, and then using a custom AccessMode implementation which checks that each accessed node has the tenant label of the user, and with tenant aware compound indexes to make lookups fast.

All of this is with embedded Neo4j, using event sourcing projections to scale it.

On top of all the above, each entity has full history, because we have all events with metadata of how state was changed, when, by whom, and why. So we can also do time queries: any query can be run with a timestamp selecting which state of the data should be used (includes both properties and relationships). A time series, for example, is simply the same query run many times with different timestamps.
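One way such timestamped queries could work is to rebuild state from the event log up to the requested time. The thread does not say how the service actually does it, so the sketch below is an assumption throughout: `TimeTravel`, `PropertySet`, `stateAt`, and `series` are hypothetical names, and only property changes are modelled, not relationships.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class TimeTravel {
    // Hypothetical event shape: a property of an entity was set at a point in time.
    static final class PropertySet {
        final long timestamp;
        final String entityId;
        final String key;
        final Object value;

        PropertySet(long timestamp, String entityId, String key, Object value) {
            this.timestamp = timestamp;
            this.entityId = entityId;
            this.key = key;
            this.value = value;
        }
    }

    // Rebuilds entity state as of `asOf` by applying events in timestamp order
    // and ignoring everything that happened later.
    static Map<String, Map<String, Object>> stateAt(List<PropertySet> events, long asOf) {
        Map<String, Map<String, Object>> state = new LinkedHashMap<>();
        events.stream()
              .filter(e -> e.timestamp <= asOf)
              .sorted(Comparator.comparingLong((PropertySet e) -> e.timestamp))
              .forEach(e -> state.computeIfAbsent(e.entityId, id -> new LinkedHashMap<>())
                                 .put(e.key, e.value));
        return state;
    }

    // A time series is the same lookup repeated for several timestamps.
    static List<Object> series(List<PropertySet> events, String entityId,
                               String key, long... timestamps) {
        List<Object> values = new ArrayList<>();
        for (long t : timestamps) {
            Map<String, Object> entity =
                    stateAt(events, t).getOrDefault(entityId, Collections.emptyMap());
            values.add(entity.get(key)); // null if the property was not yet set
        }
        return values;
    }

    public static void main(String[] args) {
        List<PropertySet> events = List.of(
                new PropertySet(10, "e1", "title", "Engineer"),
                new PropertySet(20, "e1", "title", "Manager"));
        System.out.println(series(events, "e1", "title", 5, 15, 25));
        // [null, Engineer, Manager]
    }
}
```

Replaying the full log for every timestamp is the simplest possible form; a projection-based system would more likely keep materialized state and avoid the repeated replay.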

@mnd999
Contributor

mnd999 commented Nov 17, 2023

Thanks. I'm not sure if you're on Community or Enterprise, but either way we didn't make the native RBAC available through public APIs for embedded, so it's interesting to hear about a use case for it. I think what we're initially thinking about is comparisons with static values rather than dynamic lookups against the graph, at least for now.

@rickardoberg
Author

This is on Community. I should make it clear that I have no desire to have the database implement any of this. It's all very application specific. What I want, essentially, is to have the database not get in my way and help me get it done on top of it.

In this case, specifically, if a procedure/function returns a Node it should be usable for the rest of the query. That would allow me to take a raw Node and put a wrapper with the extra logic on top.

Also, if a procedure/function returns a Map<String,Object> it would be nice if it wasn't immediately copied. Since Nodes didn't work, I was hoping to return a custom Map<String,Object> implementation that does the access check in get(), but that also didn't work, because all properties are copied immediately rather than just the ones used. Instead I now have to pre-calculate which fields the GraphQL query is going to use, create an access-control-checked Map<String, Object> with those fields, and then return that to be used by Cypher. Feels like an unnecessary hassle.
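To illustrate why a lazy check in get() is defeated by an eager copy, here is a small self-contained sketch. It does not use Neo4j: the engine's copy is simulated with the `HashMap` copy constructor, which is an assumption about how the copy behaves, and `LazyVsEager`, `CheckedMap`, and `precompute` are hypothetical names.

```java
import java.util.AbstractMap;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

final class LazyVsEager {
    // A map view that runs the access check only when a key is read via get().
    static final class CheckedMap extends AbstractMap<String, Object> {
        private final Map<String, Object> raw;
        private final Predicate<String> canRead;
        int checks = 0; // how many access checks actually ran

        CheckedMap(Map<String, Object> raw, Predicate<String> canRead) {
            this.raw = raw;
            this.canRead = canRead;
        }

        @Override
        public Object get(Object key) {
            checks++;
            return canRead.test(String.valueOf(key)) ? raw.get(key) : null;
        }

        // A consumer that copies the map walks entrySet() and never calls get().
        @Override
        public Set<Map.Entry<String, Object>> entrySet() {
            return raw.entrySet();
        }
    }

    // The workaround: check only the fields the query will use, up front.
    static Map<String, Object> precompute(Map<String, Object> raw,
                                          Set<String> requested,
                                          Predicate<String> canRead) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (String field : requested) {
            if (raw.containsKey(field) && canRead.test(field)) {
                out.put(field, raw.get(field));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> raw = new HashMap<>();
        raw.put("name", "one");
        raw.put("salary", 1000);
        Predicate<String> canRead = field -> !field.equals("salary");

        CheckedMap lazy = new CheckedMap(raw, canRead);
        Map<String, Object> copied = new HashMap<>(lazy); // simulated eager copy

        System.out.println(lazy.checks);                  // 0
        System.out.println(copied.containsKey("salary")); // true
        System.out.println(precompute(raw, Set.of("name"), canRead)); // {name=one}
    }
}
```

In this sketch the copy never triggers a single check, so the only safe option left is to decide up front which fields are needed, which is exactly the extra step described above.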

Then again I understand from other tickets that embedded is not a priority, at all, so I can see how it's perhaps a bit too niche.

@vgazdag vgazdag mentioned this issue Nov 20, 2023
3 participants