Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ROOTFrameData leaks collection buffers that are not requested by the Frame #500

Closed
tmadlener opened this issue Oct 7, 2023 · 2 comments · Fixed by #502
Closed

ROOTFrameData leaks collection buffers that are not requested by the Frame #500

tmadlener opened this issue Oct 7, 2023 · 2 comments · Fixed by #502
Assignees
Labels

Comments

@tmadlener
Copy link
Collaborator

tmadlener commented Oct 7, 2023

When reading files via the ROOTFrameReader memory increases steadily. (This most likely also applies to other readers). The minimal reproducer looks something like this

#include "podio/ROOTFrameReader.h"

void demo() {
  auto reader = podio::ROOTFrameReader();
  reader.openFile("example_frame.root");
    
  for (int i = 0; i < reader.getEntries("events"); ++i) {
     auto data = reader.readNextEntry("events"); 
  }  
}

A quick run to valgrind seems to indicate that some of the data that is allocated in createBuffers becomes unreachable somewhere in the chain from creating it to constructing CollectionData from the buffers. Currently trying to figure out where things go wrong and whether this is a ROOT exclusive problem or whether this is a general podio issue (which I assume currently).

@tmadlener tmadlener added the bug label Oct 7, 2023
@tmadlener tmadlener self-assigned this Oct 7, 2023
@tmadlener
Copy link
Collaborator Author

I am not sure if this was introduced with the CollectionBufferFactory (#394) or whether this was pre-existing and simply not caught up until now. The underlying reason is that these allocations here:

readBuffers.references = new podio::CollRefCollection(nRefs);

readBuffers.vectorMembers = new podio::VectorMembersInfo();

are leaked. This should be fairly straight forward to fix by simply cleaning them up in the CollectionData constructor:

{{ class_type }}::{{ class_type }}(podio::CollectionReadBuffers buffers, bool isSubsetColl) :
{% for relation in OneToManyRelations + OneToOneRelations %}
m_rel_{{ relation.name }}(new std::vector<{{ relation.namespace }}::{{ relation.bare_type }}>()),
{% endfor %}
m_refCollections(std::move(*buffers.references)),
m_vecmem_info(std::move(*buffers.vectorMembers)) {
// For subset collections we are done, for proper collections we still have to
// populate the data and vector members
if (!isSubsetColl) {
m_data.reset(buffers.dataAsVector<{{ class.full_type }}Data>());
{% for member in VectorMembers %}
m_vec_{{ member.name }}.reset(podio::CollectionReadBuffers::asVector<{{ member.full_type }}>(m_vecmem_info[{{ loop.index0 }}].second));
{% endfor %}
}
}

However, this will only fix half of the problem as there might be collection (buffers) that are read but from which no collection is every constructed (simply by the way the whole Frame concept works) and the fact that we will always read the complete event (until #499 is fixed). CollectionBuffers that are read, but from which no collection is constructed will leak (pretty much) all their data.

@tmadlener tmadlener changed the title Memory leaked when reading from files Memory leaked when reading from root files Oct 7, 2023
@tmadlener
Copy link
Collaborator Author

The slightly counterintuitive workaround until this is properly fixed is to read all collections from the event, i.e. in python:

from podio import root_io

reader = root_io.Reader("example_frame.root")
for event in reader.get("events"):
  # read all collections to work around AIDASoft/podio#500
  for c in event.collections:
    event.get(c)
  # continue with actual event loop

or in c++:

#include "podio/Frame.h"
#include "podio/ROOTFrameReader.h"

void demo() {
  auto reader = podio::ROOTFrameReader();
  reader.openFile("example_frame.root");
    
  for (int i = 0; i < reader.getEntries("events"); ++i) {
    auto event = podio::Frame(reader.readNextEntry("events"));
    // read all collections to work around AIDASoft/podio#500
    for (const auto& name :  event.getAvailableCollections()) {
      event.get(name);
    }
    // Continue with the actual event loop
  }  
}

@tmadlener tmadlener changed the title Memory leaked when reading from root files ROOTFrameData leaks collection buffers that are not requested by the Frame Oct 10, 2023
tmadlener added a commit to tmadlener/podio that referenced this issue Nov 28, 2024
Reading only part of the collections stored in a frame should not lead
to leaked memory (see e.g. AIDASoft#500). The tests that are added in this
commit are targetting that, specifically by running them with sanitizers
enabled.
tmadlener added a commit to tmadlener/podio that referenced this issue Nov 28, 2024
Reading only part of the collections stored in a frame should not lead
to leaked memory (see e.g. AIDASoft#500). The tests that are added in this
commit are targetting that, specifically by running them with sanitizers
enabled.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant