Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simulation output utils: only alive species? #112

Open
2 tasks done
Tracked by #137
alaindanet opened this issue Feb 16, 2023 · 4 comments
Open
2 tasks done
Tracked by #137

Simulation output utils: only alive species? #112

alaindanet opened this issue Feb 16, 2023 · 4 comments
Labels
help wanted Extra attention is needed

Comments

@alaindanet
Copy link
Contributor

alaindanet commented Feb 16, 2023

Threshold

Currently, the functions work with a threshold argument that enables the user to decide a cut-off to consider species as extinct. But discussing with @ismael-lajaaiti, it appears more consistent to rely on the threshold defined in simulate().

  • Set threshold=0 as default instead of eps() (as ExtinctionCallBack() set extinct species to 0)

Should we be more strict and remove the possibility to set another threshold, i.e. should we remove threshold argument from all output analysis functions?

Determine alive and extinct species

We can determine which species is alive or extinct with two methods, either with the Dict from simulate() output or we can infer it from the solution matrix (Refer to #110 )

Method 1: from simulate()'s Dict

get_extinct_species(sol) = sol.prob.p.extinct_sp

function get_alive_species(sol)
  extinct_sp = collect(keys(get_extinct_species(sol)))
  p = get_parameters(sol)
  S = richness(p.network)
  [i for i in 1:S if i  extinct_sp]
end

Method 2: from the solution matrix

function get_extinction_timesteps(solution; idxs = nothing, threshold = 0)
    idxs = process_idxs(solution, idxs;)
    sp = get_parameters(solution).network.species[idxs]

    ext_t = findfirst.(x -> x <=0, eachrow(solution[idxs,:]))
    extinct = ext_t .!== nothing

    (
     species = sp[extinct],
     idxs = idxs[extinct],
     extinction_timestep =  ext_t[extinct]
    )
end

function get_alive_species(solution; idxs = nothing, threshold = 0)
    idxs = process_idxs(solution, idxs;)
    sp = get_parameters(solution).network.species[idxs]

    alive_idxs = get_alive_species(solution[idxs, end]; threshold = threshold)

    (
     species = sp[alive_idxs],
     idxs = idxs[alive_idxs]
    )
end

I feel that the method 2 is a bit more robust because it relies on the definition of extinct species given by the solution object, i.e. biomass == 0.0 set by the ExtinctionCallBack() (it is very unlikely to change).
while the second relates on the structure of the output get_extinct_species(sol)
What do you think?

julia> foodweb = FoodWeb([0 0 0 0; 1 0 0 0; 0 1 1 0; 1 0 0 1]; quiet = true)
FoodWeb of 4 species:
  A: sparse matrix with 5 links
  M: [1.0, 1.0, ..., 1.0, 1.0]
  metabolic_class: 1 producers, 3 invertebrates, 0 vertebrates
  method: unspecified
  species: [s1, s2, ..., s3, s4]

julia> params = ModelParameters(foodweb);

julia> sim = simulate(params, [0.5, 0.5, 0.5, .5]);
┌ Info: Species [4] went extinct at time t = 76.42512383931599. 
└ 1 out of 4 species are extinct.
┌ Info: Species [3] went extinct at time t = 91.00012798705222. 
└ 2 out of 4 species are extinct.

julia> EcologicalNetworksDynamics.get_alive_species(sim;)
(species = ["s1", "s2"], idxs = [1, 2])

julia> EcologicalNetworksDynamics.get_extinction_timesteps(sim)
(species = ["s3", "s4"], idxs = [3, 4], extinction_timestep = [38, 34])

Exclude extinct species?

Coefficient of variation

Should the temporal coefficient of variation only consider alive species at the end of the simulation ? (Currently consider species that have an average biomass different from 0 to prevent NaN)

Average trophic level

  • Average trophic structure over last timesteps (trophic_structure()) should exclude extinct species of the network at each timestep and compute the metric.

Warning

If we consider that ouput analysis should be done with only alive species:

  • Should the users be warned if they extract timesteps containing species not yet extinct?

If yes, a suggestion:

function check_last_extinction(solution; idxs = nothing, last)
    ext = get_extinction_timesteps(solution; idxs)
    ext_t = ext.extinction_timestep

    extinct = ext_t .!== nothing
    if any(extinct)
        check_last = findall(x -> x > last, ext_t[extinct])

        isempty(check_last) || @warn "With `last` = $last, a table has been extracted with \
        the extinct species $(ext.species[extinct][check_last]), \
        which went extinct at timesteps = $(ext_t[extinct][check_last]). Set `last` >= \
        $(maximum(ext_t[extinct])) to get rid of them."
    end
end
julia> foodweb = FoodWeb([0 0 0 0; 1 0 0 0; 0 1 1 0; 1 0 0 1]; quiet = true)
FoodWeb of 4 species:
  A: sparse matrix with 5 links
  M: [1.0, 1.0, ..., 1.0, 1.0]
  metabolic_class: 1 producers, 3 invertebrates, 0 vertebrates
  method: unspecified
  species: [s1, s2, ..., s3, s4]

julia> params = ModelParameters(foodweb);

julia> sim = simulate(params, [0.5, 0.5, 0.5, .5]);
┌ Info: Species [4] went extinct at time t = 76.42512383931599. 
└ 1 out of 4 species are extinct.
┌ Info: Species [3] went extinct at time t = 91.00012798705222. 
└ 2 out of 4 species are extinct.

julia> EcologicalNetworksDynamics.check_last_extinction(sim; last = 10)
┌ Warning: With `last` = 10, a table has been extracted with the extinct species ["s3", "s4"], which went extinct at timesteps = [38, 34]. Set `last` >= 38 to get rid of them.
└ @ EcologicalNetworksDynamics ~/Documents/post-these/sheffield/befwm2_dev/BEFWM2/src/measures/utils.jl:148

julia> EcologicalNetworksDynamics.check_last_extinction(sim; last = 38)
true
@alaindanet alaindanet added the help wanted Extra attention is needed label Feb 16, 2023
@alaindanet
Copy link
Contributor Author

Questions that raised by discussing with @ismael-lajaaiti !
Happy to have your feedback on it @ismael-lajaaiti and @iago-lito.

@ismael-lajaaiti
Copy link
Collaborator

ismael-lajaaiti commented Feb 16, 2023

Thanks for writing this all of this, it's clear and it's good a basis to discuss all of these tricky choices.
Here are my thoughts:

  • about the method to determine extinct species, we expect both to return the same results, right? Then, IMO the best is to use method 1 as it works on the minimal object needed (the dictionary of extinction events), but we should add test to ensure that we would obtain the same results by working on the output of simulate()
  • for the CV I'm not sure, I find it 'weird' to compute a CV on a system that undergoes extinctions. Maybe I'm wrong (I'm not use to work with this metric) but I see the CV as a metric of stability of a given community (composition kept constant). So I would be in favor to throw at least a warning or at worst an error, if there is an extinction in the last time steps. However, if you think that it can make sense to compute the CV on trajectories that contain extinctions, I still think that we should remove extinct species time steps per time steps and not filter for every time steps the extinct species of the final time step.
  • I really like this warning, and I think it can be very useful e.g. for the coefficient of variation.

@iago-lito
Copy link
Collaborator

iago-lito commented Feb 21, 2023

Hi @alaindanet and sorry for the late response.

Regarding 1., I agree with @ismael-lajaaiti here that method 1 is definitely the way to go. The extinct_species dict is carefully crafted to keep track of species becoming extinct, and is our small state-of-the-art regarding the suprising complexity of detecting species becoming extinct during simulation ^ ^"

I feel that the method 2 is a bit more robust because it relies on the definition of extinct species given by the solution object, i.e. biomass == 0.0 set by the ExtinctionCallBack() (it is very unlikely to change).

@alaindanet Maybe there is a misunderstanding here that the extinct_species dict is build to exactly reflect the outcome of ExtinctionCallback(). Our first approach was to just rely on the numbers in the trajectory and checking biomasses against a naive threshold, and this has led to #65. extinct_species is meant to exactly contain the species extinct during simulation. If it does not, then it's a bug in the simulation process :)

we should add test to ensure that we would obtain the same results by working on the output of simulate()

@ismael-lajaaiti I would not even be surprised that such a test would fail one day and yet extinct_species be still correct ^ ^" But yes, I can't argue that testing more could ever be a bad idea: go for it, the output of simulate() should be consistent :D

Regarding 2. I'm sorry I can't participate in the discussion because I'm not even sure what CV really is or mean. Maybe @skefi, @andbeck, @evadelmas would answer your concern best?

Regarding 3. I'm afraid the answer would depend on 2. But in any case, you can help us read your suggested code by colorizing it a little 0:) Did you know that instead of:

```
function this_is_julia_code(a, b = 4)
   # comment
end
```

Yielding:

function this_is_julia_code(a, b = 4)
   # comment
end

You could replace the upper ``` code fence with ```julia instead and get:

function this_is_julia_code(a, b = 4)
   # comment
end

^ ^

@alaindanet
Copy link
Contributor Author

alaindanet commented Mar 9, 2023

Regarding this discussion,
I put a warning if species went extinct in the extraction of the timesteps, I have set last = 1 as a default (regarding @ismael-lajaaiti suggestion) for all the functions expect the temporal CV related functions for which I set last = "10%".

Furthermore, I kept only the species that have an average biomass above 0 over the last timesteps to prevent NaN values.

I hope that it makes sense!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Extra attention is needed
Projects
None yet
Development

No branches or pull requests

3 participants