JOSS review comments #31

gdalle · 2024-05-04T09:18:29Z

Hi there, and congrats on the package and JOSS submission!

I'm one of the reviewers and I'll put my remarks and questions below.

gdalle · 2024-05-04T09:34:49Z

Paper

Main points

The paper is very clear, detailed and pleasant to read. The review of the state of the art, both for the algorithms and for their implementations, seems exhaustive.
My main issue is that the authors do not convey what is fundamentally new / better / faster about CBXPy and ConsensusBasedX.jl. The only clues we have in the Statement of need are

CBXPy is a significant extension of the latter.

that package has now been deprecated in favour of ConsensusBasedX.jl, which offers additional CBX methods and a far more general interface.

I wanna know how generic we're talking:

What can we do now that we couldn't do with the alternatives?
For the methods that were already available, are there improvements in speed / accuracy / etc. ?

Side remarks

L54 (Summary): explain that the diffusion is scaled so that points further from the consensus are bolder in their exploration
Figure 1: add a colormap for the dot colors and the background functoin, or little arrows to indicate the durection of optimization?
L85 and below (Statement of need): give the package names in addition to the authors
L104 and below (Features): a code example for both libraries would be a welcome addition, similar to the one in the README
L115: which covariance matrix are you talking about?
References: There seems to be one missing DOI according to the editorial bot

gdalle · 2024-05-04T09:44:41Z

Documentation

Main points

The documentation is REALLY well-written and extensive, kudos on that!

There is a single thing I don't like, namely the fact that users need to click to see the full code examples. In addition, these examples are run during tests but their output is not checked against the expected output, so we're unsure of their behavior. The same goes for the code examples inside docstrings.

Some solutions for this include:

Using doctests from Documenter.jl (https://documenter.juliadocs.org/stable/man/doctests/) for the docstrings
Parsing Julia files to Markdown with Literate.jl for the documentation

I have used Literate.jl in my recent JOSS submission HiddenMarkovModels.jl to include examples in both the test suite and the docs, you can check it out if it helps.

Side remarks

Some typos are fixed in #32

I read the documentation in order, so some of the questions that pop up early might be answered later on, but it's useful to know that they do pop up in the first place.

Mathematical background

On one page you present only the SDE and on the other page only the numerical scheme, is there a reason?
In the formula for $V_\alpha$, the use of the Kronecker product $\otimes$ confuses me a little. Shouldn't it just be $(x - c)(x - c)^\top$?

Basic usage

The documentation should specify what the output type of each function is:
- What does minimise return?
- What does sample return?
Is there any kind of parallelization
- Between particles?
- Between ensembles?
What is the unit of max_time? Seconds or nanoseconds?

Mid level usage

There's a TODO on the "Extended output" page, and it focuses only on minimise but this option is also available for sample
The auxiliary package ConsensusBasedXPlots.jl would also make sense as a package extension
In "Performance and benchmarking", you should recommend the use of BenchmarkTools.jl, which is more precise than the built-in @time. Internally, you may also want to use it, since it does exactly what you do: create necessary cache before starting the timer, and run a few executions to get rid of compilation overhead.
I wonder if we could get rid of the extra allocations in the symmetric matrix sqrt?
In "Parallelisation", can you specify the parallelisation kind (likely multithreading)? Would it be possible to apply it at the particle level, with libraries like OhMyThreads.jl providing reduction utilities to compute the consensus point?

Advanced usage

I didn't read that section in detail but you may be interested in DocStringExtensions.jl to document structures with numerous fields

gdalle · 2024-05-04T10:19:32Z

Tests

I took a quick look at the test folder and I didn't find a file where the output of minimization / sampling is tested against the reference output, e.g. for the provided default objectives. Did I miss it?

To account for stochasticity in the tests, you can

use StableRNGs.jl
evaluate statistics of the resulting population

TimRoith · 2024-05-15T07:56:08Z

Hi, thanks for your review!

Regarding the plot in the paper, I just modified it and the new plot would look like this:

I think this makes the behavior much clearer. If you agree, I would go ahead and replace the figure in the paper.

gdalle · 2024-05-15T08:06:21Z

Better indeed, although I would recommend adding labels to the legend and colorbar, something like "particle evolution" and "objective value"

TimRoith · 2024-05-15T08:17:41Z

Like this?

TimRoith · 2024-05-15T16:16:43Z

I wanna know how generic we're talking:

What can we do now that we couldn't do with the alternatives?

For the methods that were already available, are there improvements in speed / accuracy / etc. ?

Regarding this: The previous packages were developed by me (Python) and @rafaelbailo (Julia) and are basically deprecated. For the Python case, the previous one was tailored to the polarized CBO variant. Broadly speaking, CBXPy offers more generality to define other CBO methods and also simplifies the user interface. I made a remark on this in the paper.

For Julia, I guess @rafaelbailo can answer the question better, but as far as I understand the Julia implementation, CBX.jl is just a significant improvement to the previous consensus.jl, which only offered basic CBO functionality.

TimRoith · 2024-05-15T16:17:52Z

L85 and below (Statement of need): give the package names in addition to the authors

L104 and below (Features): a code example for both libraries would be a welcome addition, similar to the one in the README

We included both suggestions in the current version of the paper

TimRoith · 2024-05-15T16:19:47Z

References: There seems to be one missing DOI according to the editorial bot

I added a lot of DOIs, I'll check if there are still some missing,.

TimRoith · 2024-05-15T16:44:19Z

L115: which covariance matrix are you talking about?

It is a weighted covariance matrix of the ensemble, we clarified it in the paper.

rafaelbailo · 2024-06-07T08:28:43Z

Hi @gdalle, thank you very much for the thorough review! I'm working through the missing points, post by post.

Regarding the Paper, I've made a specification of the improvements of the new package with regards to my old Consensus.jl. In short, the old package implemented only CBO, and wasn't necessarily allocation free, whereas the new one is free of allocations, type stable, and implements CBS as well. I think the rest of the comments have already been answered by @TimRoith.

I will follow up shortly with the Documentation and Tests parts.

rafaelbailo · 2024-06-07T10:09:28Z

Regarding tests, I have adapted a test of R. Pinnau, C. Totzeck, O. Tse, and S. Martin (2017). We are now checking that the consensus point (or distribution mean when relevant) is indeed close to the global minimiser, in high dimension (20 for CBO, 5 for CBS for performance reasons), for several objective functions (quadratic, Ackley, and Rastrigin). This is similar to the tests that was added to the CBXpy.

rafaelbailo · 2024-06-07T13:50:43Z

Regarding the Documentation:

There is a single thing I don't like, namely the fact that users need to click to see the full code examples. In addition, these examples are run during tests but their output is not checked against the expected output, so we're unsure of their behavior. The same goes for the code examples inside docstrings.

This has been addressed now, as discussed in the other review issue

On one page you present only the SDE and on the other page only the numerical scheme, is there a reason?

We were just following the original CBS paper. I've added the SDE for completeness.

In the formula for $V_\alpha$, the use of the Kronecker product $\otimes$ confuses me a little. Shouldn't it just be $(x - c)(x - c)^\top$?

Both notations mean the same to me! Again, we use $\otimes$ because we followed the original SDE paper.

The documentation should specify what the output type of each function is:

What does minimise return?

What does sample return?

This was sort of explained in "Extended output". I've added comments in the "Function minimisation" and "Distribution sampling" pages, linking back to "Extended output" for the full details. "Extended output" has also been expanded and clarified.

Is there any kind of parallelization

Between particles?

Between ensembles?

See the later parallelisation comment.

What is the unit of max_time? Seconds or nanoseconds?

It's dimensionless time; it's the SDE solution time, not the CPU time! This wasn't very clear in the docs, I've hopefully clarified it.

There's a TODO on the "Extended output" page, and it focuses only on minimise but this option is also available for sample

This has been expanded

The auxiliary package ConsensusBasedXPlots.jl would also make sense as a package extension

Yes it would! I was not aware of the extension functionality, thank you for the tip. I have an upcoming intern who is going to work on extending the plotting package (what's there right now is extremely basic), I think we will use that chance to include the package as an extension.

In "Performance and benchmarking", you should recommend the use of BenchmarkTools.jl, which is more precise than the built-in @time. Internally, you may also want to use it, since it does exactly what you do: create necessary cache before starting the timer, and run a few executions to get rid of compilation overhead.

I've added a recommendation in the docs to use BenchmarkTools.jl to test the implementation of the objective function before using it with our pacakge.

For the internal benchmark mode, I think it makes sense to keep using @time since the only objective is to check for memory allocations.

I wonder if we could get rid of the extra allocations in the symmetric matrix sqrt?

I wasn't able to find a non-allocating matrix square root, which is rather frustrating!

In "Parallelisation", can you specify the parallelisation kind (likely multithreading)? Would it be possible to apply it at the particle level, with libraries like OhMyThreads.jl providing reduction utilities to compute the consensus point?

Indeed it's multithreading, this is now stated clearly.

It is certainly possible to parallelise most of the routines at the particle level. There was a lengthy discussion among the authors on whether or not this should be pursued, since we expect that real-world use of this package is to optimise very expensive objective functions, whose implementations will already be multithreaded. In the end, we decided not to implement particle parallelism, though I would be open to implementing this in the future if there is user demand.

I didn't read that section in detail but you may be interested in DocStringExtensions.jl to document structures with numerous fields

I was not aware of this either, thank you. I think the current docstrings are quite complete, but I will certainly use this next time I have to make changes to them!

rafaelbailo · 2024-06-07T13:51:57Z

@gdalle I think I've addressed all the points, but do let me know if I've missed anything. Thanks again for the review, it's been very helpful!

gdalle · 2024-06-17T17:18:50Z

Thanks for taking the review into account! I'm okay with the changes to the docs and tests, proofreading the revised article now

gdalle · 2024-06-17T17:26:36Z

Last remarks on the paper:

L17: as well as for
Figure 1: a bit blurry, export as svg or pdf instead
L109: Optim.jl
L121: lightweight

Otherwise it's good to go!

rafaelbailo · 2024-06-17T17:32:48Z

I've just pushed a fix for those typos. @TimRoith could you update the figure please?

rafaelbailo · 2024-06-19T15:32:47Z

Closing this as resolved :)

gdalle mentioned this issue May 4, 2024

Typos in docs #32

Merged

gdalle mentioned this issue May 4, 2024

[REVIEW]: CBX: Python and Julia Packages for Consensus-Based Interacting Particle Methods openjournals/joss-reviews#6611

Closed

gdalle mentioned this issue May 14, 2024

Review comments PdIPS/CBX#3

Closed

10 tasks

rafaelbailo closed this as completed Jun 19, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

JOSS review comments #31

JOSS review comments #31

gdalle commented May 4, 2024

gdalle commented May 4, 2024 •

edited

Loading

gdalle commented May 4, 2024 •

edited

Loading

gdalle commented May 4, 2024

TimRoith commented May 15, 2024

gdalle commented May 15, 2024

TimRoith commented May 15, 2024

TimRoith commented May 15, 2024

TimRoith commented May 15, 2024 •

edited

Loading

TimRoith commented May 15, 2024

TimRoith commented May 15, 2024

rafaelbailo commented Jun 7, 2024

rafaelbailo commented Jun 7, 2024

rafaelbailo commented Jun 7, 2024 •

edited

Loading

rafaelbailo commented Jun 7, 2024

gdalle commented Jun 17, 2024 •

edited

Loading

gdalle commented Jun 17, 2024

rafaelbailo commented Jun 17, 2024

rafaelbailo commented Jun 19, 2024

JOSS review comments #31

JOSS review comments #31

Comments

gdalle commented May 4, 2024

gdalle commented May 4, 2024 • edited Loading

Paper

gdalle commented May 4, 2024 • edited Loading

Documentation

gdalle commented May 4, 2024

Tests

TimRoith commented May 15, 2024

gdalle commented May 15, 2024

TimRoith commented May 15, 2024

TimRoith commented May 15, 2024

TimRoith commented May 15, 2024 • edited Loading

TimRoith commented May 15, 2024

TimRoith commented May 15, 2024

rafaelbailo commented Jun 7, 2024

rafaelbailo commented Jun 7, 2024

rafaelbailo commented Jun 7, 2024 • edited Loading

rafaelbailo commented Jun 7, 2024

gdalle commented Jun 17, 2024 • edited Loading

gdalle commented Jun 17, 2024

rafaelbailo commented Jun 17, 2024

rafaelbailo commented Jun 19, 2024

gdalle commented May 4, 2024 •

edited

Loading

gdalle commented May 4, 2024 •

edited

Loading

TimRoith commented May 15, 2024 •

edited

Loading

rafaelbailo commented Jun 7, 2024 •

edited

Loading

gdalle commented Jun 17, 2024 •

edited

Loading