Use DataGraphs in ttn_svd and other changes #175

Open · wants to merge 14 commits into main
Conversation

emstoudenmire (Contributor):

This PR contains mostly minor but helpful improvements to the ttn_svd function and its subfunctions.

Here is a list of changes, from most to least significant:

  • Use DataGraphs to hold the bond coefficients and the sparse/symbolic vertex tensors returned from make_symbolic_ttn (see the sketch after this list).
  • Reduce the number of arguments going into make_symbolic_ttn, svd_bond_coefs, and compress_ttn. (They weren't really sharing as much state as the code made it appear, even compared to the current code following the last PR.)
  • Significantly simplify the svd_bond_coefs logic to just loop over all edges (forming and SVD'ing each edge's coefficient matrix is a trivially parallel task), replacing the previous, more complicated logic involving vertices and graph analysis.
  • Remove other unneeded dictionaries.
  • Use better graph analysis (vertex_path) to check which terms cross a given vertex. We could use steiner_tree, but it seemed to have an issue with some disconnected components being left in the output.
  • Remove the coefficient_type function barrier in ttn_svd, since that function now just calls out to other functions anyway.
  • Simplify the MatElem type, keeping only the code used for OpSum-to-TTN conversion. We will probably delete MatElem soon anyway in favor of SparseArray.
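For context on the first bullet, here is a minimal sketch of the kind of container DataGraphs provides. It assumes the DataGraph{VertexData,EdgeData}(g) constructor and the dg[v] / dg[v => w] indexing style shown in the DataGraphs.jl README (exact constructor signatures may differ between versions), and it stores placeholder arrays rather than the symbolic tensors and coefficient matrices the real code uses:

using Graphs: path_graph, edges, src, dst, vertices
using DataGraphs: DataGraph

# A toy data graph over a 4-vertex path; the stored values are placeholders
# standing in for the symbolic vertex tensors (per vertex) and the bond
# coefficients (per edge).
g = path_graph(4)
dg = DataGraph{Vector{Float64},Matrix{Float64}}(g)

for v in vertices(dg)
  dg[v] = zeros(2)                      # placeholder "vertex tensor" data
end
for e in edges(dg)
  dg[src(e) => dst(e)] = zeros(2, 2)    # placeholder "bond coefficient" data
end

The appeal over separate dictionaries is that the vertex data, edge data, and graph structure travel together in one object.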

idxs::MVector{N,Int}

struct QNArrElem{T}
  qn_idxs::Vector{QN}
  idxs::Vector{Int}
emstoudenmire (Contributor, Author):

This change made it possible for a dictionary of vectors of QNArrElems to have a concrete eltype (for each vector). Otherwise the value type was just Vector.
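To illustrate what "concrete eltype" means here, a self-contained sketch with simplified stand-in structs (these are not the actual ITensorNetworks definitions): with an MVector{N,Int} field the length N becomes part of the element's type, so elements built from terms of different lengths do not share one concrete type, whereas a plain Vector{Int} field gives every element the same concrete type:

using StaticArrays: MVector

# Simplified stand-ins for the old and new field layouts (hypothetical names).
struct OldElem{T,N}
  coef::T
  idxs::MVector{N,Int}
end

struct NewElem{T}
  coef::T
  idxs::Vector{Int}
end

# Terms of different lengths produce different OldElem types, so a vector of
# them only has the non-concrete eltype OldElem{Float64}:
old = [OldElem(1.0, MVector{2}(1, 2)), OldElem(2.0, MVector{3}(1, 2, 3))]
@show eltype(old)                  # OldElem{Float64} (not concrete)
@show isconcretetype(eltype(old))  # false

# With plain Vector{Int} fields, every element shares one concrete type:
new = [NewElem(1.0, [1, 2]), NewElem(2.0, [1, 2, 3])]
@show eltype(new)                  # NewElem{Float64} (concrete)
@show isconcretetype(eltype(new))  # true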

mtfishman (Member):

I don't understand that, could you explain that more?

v_crossed = (v ∈ op_sites) || any(crosses_v, partition(op_sites, 2, 1))
# TODO: seems we could make this just
# v_crossed = (v ∈ vertices(steiner_tree(sites, op_sites)))
# but it is not working - steiner_tree seems to have spurious extra vertices?
mtfishman (Member), May 12, 2024:

Can we define our own tree_steiner_tree function that is specific to tree graphs and calls Graphs.a_star/NamedGraphs.GraphsExtensions.vertex_path between each pair of specified vertices?
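For concreteness, a minimal sketch of the kind of function being suggested here (the name tree_steiner_vertices and the overall structure are illustrative, not part of the PR); it assumes the graph is a tree, so the Graphs.a_star path between two vertices is the unique path:

using Graphs: AbstractGraph, a_star, src, dst

# Union of the unique paths between every pair of terminal vertices of a tree.
# On a tree, this vertex set induces the minimal subtree spanning the terminals.
function tree_steiner_vertices(g::AbstractGraph, terminals)
  ts = collect(terminals)
  vs = Set(ts)
  for i in eachindex(ts), j in (i + 1):lastindex(ts)
    for e in a_star(g, ts[i], ts[j])  # path edges between one pair of terminals
      push!(vs, src(e))
      push!(vs, dst(e))
    end
  end
  return vs
end

Running a_star between every pair of terminals is quadratic in the number of terminals, so this is only the "simple first thing to try"; the later comments discuss variants with better scaling.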

mtfishman (Member):

I believe @b-kloss also reported that steiner_tree didn't always give the minimal Steiner tree, even on trees. Graphs.jl should probably have a special code path for when the graph is a tree; that would make for a good PR. Also, reporting an issue to Graphs.jl with an example where steiner_tree fails would be a good start.

b-kloss (Contributor):

Please check whether this is a performance regression. The previous code was an optimization to remove a significant performance bottleneck when using vertex_path logic. Note that the test cases in ITensorNetworks/test are not sufficiently hard to verify this. The bottleneck was evident for multi-orbital impurity problems with star-geometry on a fork graph with O(100-1000) sites on each leg of the fork.

mtfishman (Member):

I see NamedGraphs.GraphsExtensions.vertex_path may not be implemented in the best way: https://github.com/mtfishman/NamedGraphs.jl/blob/v0.6.0/src/lib/GraphsExtensions/src/abstractgraph.jl#L436-L449. Is it possible that optimizing that (say by using a_star) might fix the performance issue you were seeing on large graphs?

Additionally, I wonder if we can rewrite the code in a nicer way using vertex_path/steiner_tree and then fix any performance issues with a general solution like memoization, caching paths that have already been computed, i.e. find a way to get the best of both worlds: simple code and good performance.
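As a rough, hypothetical illustration of that memoization idea (PathCache, path_cache, and cached_vertex_path are made-up names; vertex_path is assumed to take (graph, src, dst) and return the vertices along the path, per the NamedGraphs.GraphsExtensions code linked above):

using NamedGraphs.GraphsExtensions: vertex_path

# Hypothetical cache of already-computed paths: each (src, dst) pair is
# computed once per graph and reused by later crossing checks.
struct PathCache{G,V}
  graph::G
  paths::Dict{Tuple{V,V},Vector{V}}
end

# Build a cache for a graph whose vertices have type `V` (e.g. Int or Tuple{Int,Int}).
path_cache(graph, V::Type) = PathCache(graph, Dict{Tuple{V,V},Vector{V}}())

function cached_vertex_path(cache::PathCache, s, t)
  return get!(cache.paths, (s, t)) do
    vertex_path(cache.graph, s, t)
  end
end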

b-kloss (Contributor):

Concretely, I think the previous code was O(n_vertices^2 + ...) and O(n_terms) * O(average_n_sites_per_term).

mtfishman (Member):

Thanks. So to summarize your comment: if we had an implementation of steiner_tree(graph, vertices) that scaled at worst as O(nv(steiner_tree(graph, vertices))) when is_tree(graph), and then used that function here, that could help make the code simpler while likely also being efficient enough in the large-tree limit.

I agree that my suggestion of repeatedly using a_star between the specified vertices may not be the absolute best way to implement it; I was suggesting it more as a simple first thing to try. The algorithm you suggest sounds good.

b-kloss (Contributor), May 12, 2024:

Yes, that summarizes it.
However, this PR still uses _which_incident_edge, so it has already constructed an object that makes it possible to determine crosses_vertex at cost O(n_sites_in_term).
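As a sketch of that idea (the function below is illustrative, not the PR's code; it assumes a precomputed map from each site to the incident edge of v behind which that site lies): a term crosses v exactly when v is one of its sites, or when its sites fall behind at least two different incident edges of v.

# `which_incident_edge` maps every site of the tree to the incident edge of `v`
# through which that site is reached (precomputed once per vertex `v`).
function crosses_vertex(which_incident_edge, v, op_sites)
  v ∈ op_sites && return true
  # Collect the distinct branches of `v` that the term's sites live on.
  branches = Set(which_incident_edge[s] for s in op_sites)
  return length(branches) > 1
end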

mtfishman (Member):

Our goal right now is mostly to simplify the code as much as possible, while keeping in mind which parts may be bottlenecks and need to be optimized; once the code is simpler, we will go back and optimize. It's part of a bigger rewrite based on #117.

mtfishman (Member), May 13, 2024:

But it is helpful to hear about parts you found could end up as bottlenecks if they are not written properly.

emstoudenmire (Contributor, Author):

Thanks, Benedikt. That's really helpful background information, as well as feedback about which graph algorithms we might want to use versus avoid. I'm on a learning curve with some of the graph mathematics. Based on this discussion, I'm going to create some performance benchmarks to run over time (is there already a collection of these somewhere?) and to check against the version before the previous PR. I definitely believe you that some of these changes are performance regressions, so we plan to get that performance back as we iterate on the design.

Comment on lines 125 to 133:

subgraphs = split_at_vertex(sites, v)
_boundary_edges = [
  only(boundary_edges(underlying_graph(sites), subgraph)) for subgraph in subgraphs
]
_boundary_edges = align_edges(_boundary_edges, edges)
_boundary_edges = [only(boundary_edges(sites, subgraph)) for subgraph in subgraphs]
_boundary_edges = align_edges(_boundary_edges, v_edges)
which_incident_edge = Dict(
  Iterators.flatten([
    subgraphs[i] .=> ((_boundary_edges[i]),) for i in eachindex(subgraphs)
  ]),
)

b-kloss (Contributor), May 13, 2024:

Given the goal of simplifying the code structure, I suggest removing this and reverting the dependent implementation below (i.e., the logic to filter the constituents of a term according to whether they are incoming/outgoing, etc.) to the state of #116. With an efficient implementation of vertex_path, the original implementation using vertex/edge_path is unlikely to cause significant overhead.

emstoudenmire (Contributor, Author):

Thanks for the link to #116. I will consider that, and a good plan here could be:

  • revert the parts using which_incident_edge to use the older code, which looks quite nice and readable
  • keep this code as a performance reference, i.e. we have to get back to at least this performance level in the future to consider the refactor 'done'
  • try to find ways to speed up calls involving edge_path, say by using a different graph algorithm inside edge_path, or to replace edge_path with another algorithm altogether

We may end up making some new graph functions in the process that repeat the which_incident_edge logic in the end, but the upshot is that it would be behind a more "self-documenting" interface and properly pulled out as a separate function, etc.
