Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cam6_3_140: Separate history tapes into hXi and hXa #903

Merged
merged 31 commits into from
Dec 7, 2023

Conversation

peverwhee
Copy link
Collaborator

@peverwhee peverwhee commented Oct 13, 2023

Overview

Per #159 - this PR changes the "time" variable in a history file with averaged (and/or min/max/stdev quantities) to be the midpoint of the averaging period (basically, the average of the time_bounds variable).

Side affects of the above/related changes in this PR (all implemented):

  • Each history tape can now result in TWO files - hXa and hXi. The 'i' file is ALWAYS created and will, at minimum, contain time-dependent variables (like solar forcing data, etc). If only the instantaneous file is created, it will still carry the 'i' flag in the filename
  • Changed 'time_bnds' variable name to 'time_bounds'
  • Ensured the 'cell_methods' metadata field always includes "time: XXXXX" (where XXXXX refers to the avgflag) for history fields so that it's always clear what operation is being done on/for that field
  • Any user-specified history file name or template will also have "a" or "i" appended to it and may result in two files
  • time, date, datesec are all calculated based on the midpoint time for the accumulated file

Implementation

  • amending the active_entry DDT "File" property to be an array of files called "Files" and adding a num_files attribute to determine if we have one or two files
  • iterating over Files array and adding all necessary metadata and variables to both "i" and "a" files when necessary

Testing

  • All regression tests run successfully (except for known failures)
  • Wrote a script to compare the results vs baselines; confirmed no fields were lost in the transition and no answers changed (except the expected midpoint time variables on the accumulated file)

Related to: ESCOMP/CTSM#2019

closes #159
closes #554
Supersedes #869

@peverwhee peverwhee self-assigned this Oct 13, 2023
Copy link
Collaborator

@cacraigucar cacraigucar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Asking a basic question to aid my understanding during the review process

src/control/cam_history.F90 Show resolved Hide resolved
@cacraigucar cacraigucar self-requested a review October 18, 2023 19:46
Copy link
Collaborator

@cacraigucar cacraigucar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My brain is officially fried and I'm just doing the review!! I am impressed with the changes you've made!

First batch of review comments - more may be coming tomorrow

src/control/cam_history.F90 Outdated Show resolved Hide resolved
src/control/cam_history.F90 Outdated Show resolved Hide resolved
src/control/cam_history.F90 Outdated Show resolved Hide resolved
src/control/cam_history.F90 Outdated Show resolved Hide resolved
src/control/cam_history.F90 Outdated Show resolved Hide resolved
Comment on lines 4032 to 4042
if (f == 1) then
! this is the accumulated file - skip instantaneous fields
if (tape(t)%hlist(fld)%avgflag == 'I') then
cycle
end if
else
! this is the instantaneous file - skip accumulated fields
if (tape(t)%hlist(fld)%avgflag /= 'I') then
cycle
end if
end if
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a little fragile - assuming that the first file is accumulated and the second is instantaneous. With this current implementation, that is the case, but if someone flipped them down at line 5650, then it would no longer work. Since this history is nearing end-of-life and the likelihood of it getting broken is low, I won't require this be fixed, but if it is easy to determine which is which, I'd suggest that be used instead.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe assign a parameter to the instantaneous file?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree @cacraigucar that this is fragile. Good point.

@gold2718 what do you mean by "assign a parameter to the instantaneous file"?

My initial thought is to add "accumulate_file_index" and "instantaneous_file_index" parameters and use those instead of "1" and "2" throughout the code. But I'm open to other options (I just don't yet understand yours...)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is exactly what I was talking about, just using real language :)
I think those parameter names are fine except that I still don't really like 'instantaneous' because that does not describe which instant (I like last value because you get the last sample you happened to talk before writing the file).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would encourage you to keep "instantaneous" as this is what the ultimate file is called. If we were to change to "last value", we'd need to overhaul everything (including changing the "I" designator) and coordinate with CTSM. I believe "I" is what we are currently stuck with.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gold2718, I think what we're trying to convey in the filename is something about the temporal characteristics of the data. Focusing on the implementation of collecting the time samples (e.g., by overwritting the history buffer each time a sample is collected until it's time to write the buffer) ignores that other implementations are possible, like not putting anything in the buffer until the desired timestep is reached. The sample collection implementation doesn't describe the temporal characteristics of the data. Instantaneous does.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the new instantaneous files, does the time variable always match the time step of the sample?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.

It has always been the case that the time coordinate matches the end of the time step producing the data for an instantaneous field. From the code I've looked at so far the time coordinate in the hXi file is the same as it's always been for instantaneous fields. I am running tests as I review this PR to check these details.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think you answered my question. You are saying that the time that shows up in the file is the same as the time step active when the write occurs. I'm asking if the time that shows up in the file matches the time step that was active when the outfld buffer was captured for each and every field in the file. For instance, what if a particular diagnostic is only called every other time step and the write step is one where it was not captured.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Take 2. Yes, the time variable always matches the time step of an instantaneous sample assuming that outfld was called during that time step.

You are correct to point out that outfld isn't called every time step for all fields that are output. Radiation fields are a prime example. To analyze fields like that one needs to look at the details of when the outfield calls are made, and at what happens during the time step(s) when outfld is not called. In the case of radiation, the heating rates that are applied during time steps when radiation isn't run (and outfld calls are not made) are from the most recent calculation (they are saved in the physics buffer). So even though the outfld call for the instantaneous field was not made at the time indicated by the time coordinate, the field is in fact valid at that time. I'm not claiming this to be the case for all fields that fall into this category. But I do claim that all fields in this category are instantaneous, even if there is a mismatch in the associated time coordinate.

src/control/cam_history.F90 Outdated Show resolved Hide resolved
src/control/cam_history.F90 Outdated Show resolved Hide resolved
src/control/cam_history.F90 Outdated Show resolved Hide resolved
src/control/cam_history.F90 Outdated Show resolved Hide resolved
src/control/cam_history.F90 Outdated Show resolved Hide resolved
Copy link
Collaborator

@cacraigucar cacraigucar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whew - I finished my review and I've resolved the conversations which are fixed. Again, good job on these changes!

src/control/cam_history.F90 Outdated Show resolved Hide resolved
src/control/cam_history.F90 Outdated Show resolved Hide resolved
Copy link
Collaborator

@cacraigucar cacraigucar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding those new parameters. It helped me find some areas which I believe are bugs.

src/control/cam_history.F90 Outdated Show resolved Hide resolved
src/control/cam_history.F90 Outdated Show resolved Hide resolved
src/control/cam_history.F90 Outdated Show resolved Hide resolved
Copy link
Collaborator

@nusbaume nusbaume left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work @peverwhee! I mostly just have some minor clean-up suggestions.

src/utils/cam_grid_support.F90 Outdated Show resolved Hide resolved
src/control/cam_history.F90 Outdated Show resolved Hide resolved
src/control/cam_history.F90 Outdated Show resolved Hide resolved
src/control/cam_history.F90 Outdated Show resolved Hide resolved
src/control/cam_history.F90 Outdated Show resolved Hide resolved
src/control/cam_history.F90 Outdated Show resolved Hide resolved
src/control/cam_history.F90 Outdated Show resolved Hide resolved
if ( t /= f )then
write(iulog,*)'hfilename_spec(', f, ') = ', hfilename_spec(f)
duplicate = .false.
do f = 1, t
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we only be looping to t here, or should we be looping through all of the tapes (ptapes), which is what it looks like it was doing in the original version?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed it to t since this is a loop nested within a loop over ptapes, but now I'm questioning how that will work with threading, so i've changed it back to ptapes

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I seriously doubt wshist (if I got the right subroutine) can ever be threaded.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh, I think I understand now! I am fine then with either ptapes or t (and your original t method is very likely more performant, at least for the average CAM configuration).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed back to t

src/control/cam_history.F90 Outdated Show resolved Hide resolved
src/control/cam_history.F90 Outdated Show resolved Hide resolved
@peverwhee peverwhee requested a review from nusbaume November 1, 2023 22:59
Copy link
Collaborator

@nusbaume nusbaume left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for dealing with my concerns! I am happy with this PR now.

Copy link
Collaborator

@cacraigucar cacraigucar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left my one conversation unresolved as I'm not sure if the side discussion was finished or not. But my concern was addressed to my satisfaction

@peverwhee peverwhee requested a review from brian-eaton December 6, 2023 21:06
@cacraigucar cacraigucar changed the title Separate history tapes into hXi and hXa cam6_3_140: Separate history tapes into hXi and hXa Dec 6, 2023
Copy link
Collaborator

@brian-eaton brian-eaton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Nice work!

@peverwhee peverwhee merged commit 32f2772 into ESCOMP:cam_development Dec 7, 2023
@ekluzek
Copy link

ekluzek commented Dec 7, 2023

@peverwhee nice to see this in. We are going to study it in CTSM and figure out what we need to do using this as strong inspiration. Thanks so much.

@billsacks
Copy link
Member

Thank you very much for your work on this, @peverwhee ! It's exciting to see this completed!!!

gold2718 pushed a commit to gold2718/CAM that referenced this pull request May 2, 2024
Merge pull request ESCOMP#903 from peverwhee/history-split

cam6_3_140: Separate history tapes into hXi and hXa

ESCOMP commit: 32f2772
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Tag
Development

Successfully merging this pull request may close these issues.

7 participants