Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JOSS review comments #31

Closed
gdalle opened this issue May 4, 2024 · 18 comments
Closed

JOSS review comments #31

gdalle opened this issue May 4, 2024 · 18 comments

Comments

@gdalle
Copy link
Contributor

gdalle commented May 4, 2024

Hi there, and congrats on the package and JOSS submission!

I'm one of the reviewers and I'll put my remarks and questions below.

@gdalle
Copy link
Contributor Author

gdalle commented May 4, 2024

Paper

Main points

The paper is very clear, detailed and pleasant to read. The review of the state of the art, both for the algorithms and for their implementations, seems exhaustive.
My main issue is that the authors do not convey what is fundamentally new / better / faster about CBXPy and ConsensusBasedX.jl. The only clues we have in the Statement of need are

CBXPy is a significant extension of the latter.

that package has now been deprecated in favour of ConsensusBasedX.jl, which offers additional CBX methods and a far more general interface.

I wanna know how generic we're talking:

  • What can we do now that we couldn't do with the alternatives?
  • For the methods that were already available, are there improvements in speed / accuracy / etc. ?

Side remarks

  • L54 (Summary): explain that the diffusion is scaled so that points further from the consensus are bolder in their exploration
  • Figure 1: add a colormap for the dot colors and the background functoin, or little arrows to indicate the durection of optimization?
  • L85 and below (Statement of need): give the package names in addition to the authors
  • L104 and below (Features): a code example for both libraries would be a welcome addition, similar to the one in the README
  • L115: which covariance matrix are you talking about?
  • References: There seems to be one missing DOI according to the editorial bot

@gdalle
Copy link
Contributor Author

gdalle commented May 4, 2024

Documentation

Main points

The documentation is REALLY well-written and extensive, kudos on that!

There is a single thing I don't like, namely the fact that users need to click to see the full code examples. In addition, these examples are run during tests but their output is not checked against the expected output, so we're unsure of their behavior. The same goes for the code examples inside docstrings.

Some solutions for this include:

I have used Literate.jl in my recent JOSS submission HiddenMarkovModels.jl to include examples in both the test suite and the docs, you can check it out if it helps.

Side remarks

Some typos are fixed in #32

I read the documentation in order, so some of the questions that pop up early might be answered later on, but it's useful to know that they do pop up in the first place.

Mathematical background

  • On one page you present only the SDE and on the other page only the numerical scheme, is there a reason?
  • In the formula for $V_\alpha$, the use of the Kronecker product $\otimes$ confuses me a little. Shouldn't it just be $(x - c)(x - c)^\top$?

Basic usage

  • The documentation should specify what the output type of each function is:
    • What does minimise return?
    • What does sample return?
  • Is there any kind of parallelization
    • Between particles?
    • Between ensembles?
  • What is the unit of max_time? Seconds or nanoseconds?

Mid level usage

  • There's a TODO on the "Extended output" page, and it focuses only on minimise but this option is also available for sample
  • The auxiliary package ConsensusBasedXPlots.jl would also make sense as a package extension
  • In "Performance and benchmarking", you should recommend the use of BenchmarkTools.jl, which is more precise than the built-in @time. Internally, you may also want to use it, since it does exactly what you do: create necessary cache before starting the timer, and run a few executions to get rid of compilation overhead.
  • I wonder if we could get rid of the extra allocations in the symmetric matrix sqrt?
  • In "Parallelisation", can you specify the parallelisation kind (likely multithreading)? Would it be possible to apply it at the particle level, with libraries like OhMyThreads.jl providing reduction utilities to compute the consensus point?

Advanced usage

I didn't read that section in detail but you may be interested in DocStringExtensions.jl to document structures with numerous fields

@gdalle gdalle mentioned this issue May 4, 2024
@gdalle
Copy link
Contributor Author

gdalle commented May 4, 2024

Tests

I took a quick look at the test folder and I didn't find a file where the output of minimization / sampling is tested against the reference output, e.g. for the provided default objectives. Did I miss it?

To account for stochasticity in the tests, you can

  • use StableRNGs.jl
  • evaluate statistics of the resulting population

@TimRoith
Copy link
Member

Hi, thanks for your review!

Regarding the plot in the paper, I just modified it and the new plot would look like this:

JOSS

I think this makes the behavior much clearer. If you agree, I would go ahead and replace the figure in the paper.

@gdalle
Copy link
Contributor Author

gdalle commented May 15, 2024

Better indeed, although I would recommend adding labels to the legend and colorbar, something like "particle evolution" and "objective value"

@TimRoith
Copy link
Member

Like this?
JOSS

@TimRoith
Copy link
Member

I wanna know how generic we're talking:

  • What can we do now that we couldn't do with the alternatives?
  • For the methods that were already available, are there improvements in speed / accuracy / etc. ?

Regarding this: The previous packages were developed by me (Python) and @rafaelbailo (Julia) and are basically deprecated. For the Python case, the previous one was tailored to the polarized CBO variant. Broadly speaking, CBXPy offers more generality to define other CBO methods and also simplifies the user interface. I made a remark on this in the paper.

For Julia, I guess @rafaelbailo can answer the question better, but as far as I understand the Julia implementation, CBX.jl is just a significant improvement to the previous consensus.jl, which only offered basic CBO functionality.

@TimRoith
Copy link
Member

TimRoith commented May 15, 2024

  • L85 and below (Statement of need): give the package names in addition to the authors
  • L104 and below (Features): a code example for both libraries would be a welcome addition, similar to the one in the README

We included both suggestions in the current version of the paper

@TimRoith
Copy link
Member

  • References: There seems to be one missing DOI according to the editorial bot

I added a lot of DOIs, I'll check if there are still some missing,.

@TimRoith
Copy link
Member

  • L115: which covariance matrix are you talking about?

It is a weighted covariance matrix of the ensemble, we clarified it in the paper.

@rafaelbailo
Copy link
Member

Hi @gdalle, thank you very much for the thorough review! I'm working through the missing points, post by post.

Regarding the Paper, I've made a specification of the improvements of the new package with regards to my old Consensus.jl. In short, the old package implemented only CBO, and wasn't necessarily allocation free, whereas the new one is free of allocations, type stable, and implements CBS as well. I think the rest of the comments have already been answered by @TimRoith.

I will follow up shortly with the Documentation and Tests parts.

@rafaelbailo
Copy link
Member

Regarding tests, I have adapted a test of R. Pinnau, C. Totzeck, O. Tse, and S. Martin (2017). We are now checking that the consensus point (or distribution mean when relevant) is indeed close to the global minimiser, in high dimension (20 for CBO, 5 for CBS for performance reasons), for several objective functions (quadratic, Ackley, and Rastrigin). This is similar to the tests that was added to the CBXpy.

@rafaelbailo
Copy link
Member

rafaelbailo commented Jun 7, 2024

Regarding the Documentation:

There is a single thing I don't like, namely the fact that users need to click to see the full code examples. In addition, these examples are run during tests but their output is not checked against the expected output, so we're unsure of their behavior. The same goes for the code examples inside docstrings.

This has been addressed now, as discussed in the other review issue

  • On one page you present only the SDE and on the other page only the numerical scheme, is there a reason?

We were just following the original CBS paper. I've added the SDE for completeness.

  • In the formula for $V_\alpha$, the use of the Kronecker product $\otimes$ confuses me a little. Shouldn't it just be $(x - c)(x - c)^\top$?

Both notations mean the same to me! Again, we use $\otimes$ because we followed the original SDE paper.

  • The documentation should specify what the output type of each function is:
    • What does minimise return?
    • What does sample return?

This was sort of explained in "Extended output". I've added comments in the "Function minimisation" and "Distribution sampling" pages, linking back to "Extended output" for the full details. "Extended output" has also been expanded and clarified.

  • Is there any kind of parallelization
    • Between particles?
    • Between ensembles?

See the later parallelisation comment.

  • What is the unit of max_time? Seconds or nanoseconds?

It's dimensionless time; it's the SDE solution time, not the CPU time! This wasn't very clear in the docs, I've hopefully clarified it.

  • There's a TODO on the "Extended output" page, and it focuses only on minimise but this option is also available for sample

This has been expanded

  • The auxiliary package ConsensusBasedXPlots.jl would also make sense as a package extension

Yes it would! I was not aware of the extension functionality, thank you for the tip. I have an upcoming intern who is going to work on extending the plotting package (what's there right now is extremely basic), I think we will use that chance to include the package as an extension.

  • In "Performance and benchmarking", you should recommend the use of BenchmarkTools.jl, which is more precise than the built-in @time. Internally, you may also want to use it, since it does exactly what you do: create necessary cache before starting the timer, and run a few executions to get rid of compilation overhead.

I've added a recommendation in the docs to use BenchmarkTools.jl to test the implementation of the objective function before using it with our pacakge.

For the internal benchmark mode, I think it makes sense to keep using @time since the only objective is to check for memory allocations.

  • I wonder if we could get rid of the extra allocations in the symmetric matrix sqrt?

I wasn't able to find a non-allocating matrix square root, which is rather frustrating!

  • In "Parallelisation", can you specify the parallelisation kind (likely multithreading)? Would it be possible to apply it at the particle level, with libraries like OhMyThreads.jl providing reduction utilities to compute the consensus point?

Indeed it's multithreading, this is now stated clearly.

It is certainly possible to parallelise most of the routines at the particle level. There was a lengthy discussion among the authors on whether or not this should be pursued, since we expect that real-world use of this package is to optimise very expensive objective functions, whose implementations will already be multithreaded. In the end, we decided not to implement particle parallelism, though I would be open to implementing this in the future if there is user demand.

I didn't read that section in detail but you may be interested in DocStringExtensions.jl to document structures with numerous fields

I was not aware of this either, thank you. I think the current docstrings are quite complete, but I will certainly use this next time I have to make changes to them!

@rafaelbailo
Copy link
Member

@gdalle I think I've addressed all the points, but do let me know if I've missed anything. Thanks again for the review, it's been very helpful!

@gdalle
Copy link
Contributor Author

gdalle commented Jun 17, 2024

Thanks for taking the review into account! I'm okay with the changes to the docs and tests, proofreading the revised article now

@gdalle
Copy link
Contributor Author

gdalle commented Jun 17, 2024

Last remarks on the paper:

  • L17: as well as for
  • Figure 1: a bit blurry, export as svg or pdf instead
  • L109: Optim.jl
  • L121: lightweight

Otherwise it's good to go!

@rafaelbailo
Copy link
Member

I've just pushed a fix for those typos. @TimRoith could you update the figure please?

@rafaelbailo
Copy link
Member

Closing this as resolved :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants