
Tool Submission Discussion #3

Open · ChristopherBrix opened this issue Jun 9, 2023 · 32 comments

@ChristopherBrix commented Jun 9, 2023

At this point, you should be updating your tool so it can quickly verify as many benchmarks as you can. Note that the benchmark instances will change based on a new random seed for the final evaluation. We will follow a similar workflow to last year, where tool authors provide shell scripts to install their tool, prepare instances (e.g., convert models to a different format), and finally verify an instance; a sketch of the interface follows below. The detailed instructions for this are available in 2021's git repo.
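As a rough illustration (not the official interface; the exact arguments and their order are defined in the 2021 repo linked above), the per-instance script might look like this:

```bash
#!/bin/bash
# run_instance.sh -- illustrative sketch only; argument names and order
# are assumptions, the authoritative interface is in the 2021 repo.
CATEGORY=$1   # benchmark category, e.g. "cifar2020"
ONNX=$2       # path to the .onnx network
VNNLIB=$3     # path to the .vnnlib property
RESULTS=$4    # file to write the result (sat/unsat/timeout/...) to
TIMEOUT=$5    # per-instance timeout in seconds

# "my_verifier" is a placeholder for the tool's actual entry point.
./my_verifier --model "$ONNX" --spec "$VNNLIB" \
              --timeout "$TIMEOUT" --out "$RESULTS"
```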

You will be able to run and debug your toolkit on the submitted benchmarks online at this link. There, you first need to register. Your registration will be manually activated by the organizers; you'll receive a notification once that's done. Afterwards, you can log in and start with your first submission.

The process is similar to the submission of benchmarks, with a small change compared to last year: you need to specify a public git URL and commit hash, as well as the location of a .yaml config file. There, you can specify parameters for your toolkit evaluation; by making those settings part of the repository, they are preserved for future reference.
You can also define a post-installation script to set up any licenses. A hypothetical config layout is sketched below.
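For illustration only; the field names below are hypothetical, not the submission site's actual schema:

```yaml
# Hypothetical vnncomp config -- field names are illustrative only.
tool_name: my_tool
instance_type: m5.4xlarge            # AWS instance type to request
scripts:
  install: vnncomp/install_tool.sh
  prepare: vnncomp/prepare_instance.sh
  run: vnncomp/run_instance.sh
post_install: vnncomp/post_install_script.sh   # e.g. write license files
```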

Once submitted, you're placed in a queue until the chosen AWS instance can be created, at which point your installation and evaluation scripts will be run. You'll see the output of each step and can abort the evaluation early in case there are any issues. Once a submission has terminated, you can use it to populate the submission form for the next iteration, so you don't have to retype everything.

Important: We currently have no limitation on how often you can submit your tool for testing purposes, but we will monitor usage closely and may impose limits if necessary. Please be mindful of the costs (approx. $3 per hour) each submission incurs. To save costs, you should debug your code locally and then use the website to confirm the results match your expectations.

We strongly encourage tool participants to at least register and have some test submissions on the toolkit website well ahead of the deadline.

@ttj commented Jun 26, 2023

@ChristopherBrix We are currently trying to get our NNV/Matlab setup to execute on this. The last time we did this, I set everything up, but that was before the infrastructure used last year, and I had to do quite a few things manually. @mldiego is leading this now, and we may need to do a few things to make it work, given how Matlab can currently be set up for execution on AWS: either a custom AMI based on the reference architecture (e.g., https://github.com/mathworks-ref-arch/matlab-on-aws#deployment-steps), running inside Docker (maybe easiest), manually installing/configuring some things, or something else. So I wanted to make you aware, as we will likely need your help with the execution system to make this work.

First, which AWS region specifically? The website says Amazon (Oregon region); which region is that exactly (us-west-1, or something else)?

Further, we may have some questions about the startup/batch execution scripts, as we are currently sorting out whether to keep things running in between instances: if running inside Docker, the container may take some time to start up, and Matlab itself may take some time to start as well. We may also have questions about model conversion, as Matlab's ONNX support is unfortunately rather poor right now, and we need to see whether preprocessing of the models is necessary to get them into Matlab.

@ChristopherBrix (Author) commented

I'm using the us-west-2 region.

Let me know if you have any questions I can help you with!

@ChristopherBrix (Author) commented

@ALL: Please give the online submission tool a try as soon as possible! Make sure your tool supports the automatic setup on AWS. If you need any assistance, I'm happy to help.

You can submit as often as you'd like, so you can debug your setup even while you're still working on some benchmarks.

@mldiego commented Jun 28, 2023

@ChristopherBrix @ttj
Is there a way to have AWS instances with a predefined MAC address? Something like using an ENI to fix the MAC address, and then using that one for NNV? (See: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-eni.html)
If we can get that, we may not need a matlab-aws target architecture, as we can tie the (MATLAB) licensing to that specific MAC address and then install everything via install_tool.sh.

Otherwise, we may need help setting up one of the matlab-on-aws instances (R2022b), setting up the licensing there manually, and installing support packages. Once that is set, we can install NNV and its MATLAB requirements with the install_tool.sh script (it should be very similar to what @ttj prepared for the 2021 submission: https://github.com/mldiego/nnv/blob/master/code/nnv/examples/Submission/VNN_COMP2021/README-AWS.md).

Docker is not ideal (competition infrastructure, overhead...), but for licensing it may be the best/easiest option for us. If we take this route, it would help if we could run Docker only once (I understand the execution scripts may need to change slightly for this) to avoid starting a container and Matlab for each instance. Essentially, install_tool.sh could automatically install Docker, build the image, and do some hacky things for support packages (the ONNX importer); then, after the properties have been generated with the random seed, we would copy the benchmark files from the instance (local) into the container and call prepare_instance.sh and run_instance.sh much as if we were running locally (outside Docker). A rough sketch of this flow is below.
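A minimal sketch of that idea (the image and container names are made up, and the apt-based install assumes an Ubuntu AMI):

```bash
#!/bin/bash
# install_tool.sh -- sketch of the "run Docker only once" route.
set -e
sudo apt-get update && sudo apt-get install -y docker.io   # assumes Ubuntu
sudo docker build -t nnv-matlab .                          # bakes MATLAB + NNV into the image
# Keep one container alive so MATLAB only starts once, not per instance:
sudo docker run -d --name nnv --entrypoint sleep nnv-matlab infinity
```

prepare_instance.sh and run_instance.sh could then `docker cp` the seeded benchmark files into the running container and `docker exec` the corresponding in-container scripts.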

Please let me know if either of these options can be considered and added to the automated framework. I'll be happy to help with any of this as well, @ChristopherBrix.

@aliabigdeli commented

I don't know whether any other team has submitted their code yet, but I couldn't run mine on the VNNCOMP website. Although I install Conda, create an environment, and use the environment's Python path to run the toolkit in run_instance.sh, it can't find the modules installed in the environment, which is strange, because I don't face this issue on my local machine or on other cloud platforms like CloudLab. Has anybody successfully submitted and run their code?
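One common cause of this symptom (an assumption, not confirmed from the logs): run_instance.sh executes in a fresh non-interactive shell, so ~/.bashrc is never sourced and the conda environment isn't actually active. Pinning the environment's absolute interpreter path sidesteps this; the env name and miniconda path below are examples:

```bash
#!/bin/bash
# run_instance.sh -- sketch; "toolenv" and the miniconda path are examples.
# Option A: initialize conda explicitly before activating.
source "$HOME/miniconda3/etc/profile.d/conda.sh"
conda activate toolenv
# Option B: skip activation and call the env's interpreter directly.
"$HOME/miniconda3/envs/toolenv/bin/python" my_tool.py "$@"
```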

@ChristopherBrix (Author) commented

@mldiego To clarify: if I supported ENI, you could have the same MAC address every time, and thus your licensing issues would be solved?

For Gurobi, the current process is like this (a sketch of step 3 follows below):

  1. Tools install everything they need, including Gurobi, using the install_tool.sh script.
  2. As the last step, they print some information about the AWS instance that's needed to create a license.
  3. In a manual step, a license file is generated. Its content can be copied into post_install_script.sh via the website, so it will be created on the AWS instance.
  4. After this manual step, the setup is done and the evaluation can begin.

Is licensing more involved for you, or could a similar process work?
Note that this requires Gurobi users to get a new license file each time they submit their tool. However, Gurobi licenses are free for academic use, so that's not an issue. Is it different for Matlab?
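A minimal sketch of step 3, assuming the default location Gurobi checks for a license (~/gurobi.lic); the license body itself comes from the manual step:

```bash
#!/bin/bash
# post_install_script.sh -- sketch; writes the manually generated,
# machine-specific license where Gurobi looks for it by default.
cat > "$HOME/gurobi.lic" <<'EOF'
# ... paste the license content generated for this AWS instance ...
EOF
```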

@aliabigdeli I will look into this.

@ChristopherBrix (Author) commented

@mldiego I should be able to support ENI by tomorrow. I'll ping you then.

@aliabigdeli Please try again with a larger instance type (e.g., m5). Your machine ran out of RAM during the setup (there's an error code in the logs if you scroll up a bit). I tested it with m5, and there it seemed to work.

@mldiego commented Jun 28, 2023

@ChristopherBrix
That should be enough. Once we have that MAC address, I will create the license, as well as the installer_input.txt and activation key (all necessary to install MATLAB "offline", i.e., non-interactively), and include all of that in install_tool.sh; then we should be good to go, although installing everything will take a while.

https://www.mathworks.com/help/install/ug/install-noninteractively-silent-installation.html

About the current process for Gurobi licensing: something similar may work for us too. As far as I understand, Matlab licenses can only be downloaded from their website (specifying the target OS, username, and MAC address), so we'd have to do that manually every time to get the license; then we could follow a process similar to the one explained earlier (non-interactive installation, sketched below).
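For reference, a sketch of such a non-interactive install, with the key names taken from the MathWorks installer_input.txt template and all values as placeholders (the real installation key and license come from the manual licensing step):

```bash
#!/bin/bash
# Sketch of a silent MATLAB install; paths, key, and product list are
# placeholders -- the real values come from the manual licensing step.
cat > installer_input.txt <<'EOF'
destinationFolder=/usr/local/MATLAB/R2022b
fileInstallationKey=XXXXX-XXXXX-XXXXX-XXXXX-XXXXX
agreeToLicense=yes
licensePath=/home/ubuntu/license.lic
product.MATLAB
product.Deep_Learning_Toolbox
EOF
sudo ./install -inputFile "$PWD/installer_input.txt"
```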

It would also work if we got an AWS instance with MATLAB already installed (matlab-on-aws), followed the steps listed there for licensing and setup, and then simplified the installation script (it would only have to handle the NNV installation, which is much faster).

@ChristopherBrix (Author) commented

I'm on mobile so I cannot check right now, but isn't that a description for a desktop client?

If there's an AMI that includes MATLAB support, that would be trivial to support.

@mldiego commented Jun 28, 2023

I just assumed we could do the same non-interactive installation within the AWS instance, as long as no GUI is needed.

There is some information here about creating an AMI with MATLAB support, but I am not very familiar with it, so I'm not sure how helpful this will be:
https://www.mathworks.com/help/cloudcenter/ug/create-and-discover-clusters.html
https://www.mathworks.com/help/cloudcenter/ug/create-a-custom-amazon-machine-image-ami.html

@aliabigdeli commented

> @mldiego I should be able to support ENI by tomorrow. I'll ping you then.
>
> @aliabigdeli Please try again with a larger instance type (e.g., m5). Your machine ran out of RAM during the setup (there's an error code in the logs if you scroll up a bit). I tested it with m5, and there it seemed to work.

It works now. Thanks.

@ChristopherBrix (Author) commented

@mldiego I can now assign an ENI to a running instance. However, I'm not sure how that helps. Does the ENI need to be associated with a public IP?

Do you have access to an AWS account? If so, could you try setting up the instance the way you need it, and let me know the steps to reproduce? That is, what needs to be done with the ENI to support your use case? Then I can add it more easily.
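For what it's worth, a sketch of the ENI route with the AWS CLI (all IDs are placeholders): create the interface once so its MAC address stays fixed, then attach it to each fresh evaluation instance:

```bash
# Create a network interface once; its MAC address never changes.
aws ec2 create-network-interface \
    --subnet-id subnet-0123456789abcdef0 \
    --description "fixed MAC for MATLAB licensing"

# Attach it to each newly created evaluation instance.
aws ec2 attach-network-interface \
    --network-interface-id eni-0123456789abcdef0 \
    --instance-id i-0123456789abcdef0 \
    --device-index 1
```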

@mldiego commented Jun 29, 2023

@ChristopherBrix

What about the AMI with MATLAB support? Would that be easier to add, since all the other ones are set up this way too?

@ChristopherBrix (Author) commented

That would be much simpler. If you find a suitable AMI, let me know!

@mldiego commented Jun 29, 2023

Thanks! Working on it. I was following the instructions here: https://www.mathworks.com/help/cloudcenter/ug/create-a-custom-amazon-machine-image-ami.html, but my license is not authorized to create clusters (MATLAB Parallel Server), so I am trying to find another way.

I'll take a closer look at the ENI method. My understanding is that we would need a fixed MAC address (so we can set up licensing once and reuse it the next time we run NNV, if possible), and then we should be able to use it, but I have not used AWS before, so I may be completely wrong about this...

@mldiego commented Jun 29, 2023

@ChristopherBrix

I found some AMIs in the AWS AMI catalog that may be useful for us, if we can use them in the automated framework. Would something like this work?

aws_ec2_r2022b_ami

@ChristopherBrix (Author) commented

I've added that one, please give it a try!

@mldiego commented Jun 29, 2023

Thank you!

@mldiego commented Jun 30, 2023

@ChristopherBrix

Could you also add this one: ami-02fbb965167e007cb?

I'd like to test the installation/execution process with both.

AMI info:
R2022b matlab_linux
ami-02fbb965167e007cb

  • Published: 2023-06-16T08:11:11.000Z
  • Architecture: x86_64
  • Virtualization: hvm
  • Root device type: ebs
  • ENA Enabled: Yes

@ChristopherBrix (Author) commented

@mldiego Done!

@ChristopherBrix (Author) commented Jun 30, 2023

To give everyone enough time to adapt their tools to the submission system and support as many benchmarks as possible, we will extend the submission deadline to July 7 EOD AOE.

There will be no further extensions.

@mldiego commented Jul 2, 2023

@ChristopherBrix

I got some errors last night with my submissions (some server message, but I don't remember it now). I submitted again this morning (this one is working well), and it looks like the previous two submissions are still pending (they show positions 2 and 3 in the queue; these numbers have not moved since last night). I am not sure whether they are still in the queue or it's just a website error, but please kill them if they're still in the queue (submissions 837 and 838).

@jferlez commented Jul 3, 2023

@ChristopherBrix

I likewise have two runs that are listed as queued but are not advancing (numbers 839 and 840). You can terminate both, though, as I think I have fully debugged the deployment of my tool submission.

@ChristopherBrix (Author) commented

@mldiego @jferlez Thank you for letting me know - I've stopped those submissions. I'm not entirely sure why that happened, but tools submitted later were processed fine. I will monitor this, but please let me know if it happens again!

@wu-haoze commented Jul 5, 2023

@ChristopherBrix It seems that the dist_shift benchmark is missing on the test web page. I'm wondering whether this can be fixed so we can test on that benchmark set? Thanks!

@aliabigdeli commented Jul 5, 2023

@ChristopherBrix
I have two questions. First: is there any runtime cap on the VNNComp website? When I wanted to run all instances, the run aborted after one hour in the middle of executing the benchmarks. For example, in the run with id=877, the run aborted after the first benchmark when the total runtime reached "1 hours, 1 minutes", although I expected it to keep running on the other benchmarks.

Secondly, I think there is a problem with the vnnlib format of the "collins_rul_cnn" benchmark: in its output constraints there is (and(>= Y_0 438.0107727050781)), but I think there should be a space after 'and', so it should be (and (>= Y_0 438.0107727050781)). Am I right, or should we support parsing that vnnlib format as well?
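(For what it's worth, in SMT-LIB-style syntax parentheses are self-delimiting tokens, so (and(>= ...)) is technically tokenizable without the space; still, a one-line normalization pass makes such files digestible for simpler whitespace-splitting parsers. A sketch, with the file name as an example:)

```bash
# Normalization sketch: insert the missing space after "and"/"or" when it
# is immediately followed by "(". The file name is an example.
sed -E -i 's/\((and|or)\(/(\1 (/g' collins_rul_cnn_property.vnnlib
```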

@ChristopherBrix (Author) commented

@anwu1219 Thanks, I've added it.

@aliabigdeli Yes, the timeout was 1 hour for now. I've increased it to 12 hours. Please don't run those tests too often; they get quite expensive if everyone runs them repeatedly (approx. $3 per hour).

I think the vnnlib file should be fixed, @regkirov

@mldiego commented Jul 7, 2023

@ChristopherBrix

I am getting the following error during the Initialization phase now:

```
1.vnnlib.gz': No such file or directory
mv: cannot move './benchmarks/cifar2020/vnnlib/cifar10_spec_idx_39_eps_0.00784_n1.vnnlib.gz' to '../../benchmarks/./benchmarks/cifar2020/vnnlib/cifar10_spec_idx_39_eps_0.00784_n1.vnnlib.gz': No such file or directory
 .
 .
 .
mv: cannot move './benchmarks/rl_benchmarks/vnnlib/dubinsrejoin_case_unsafe_59.vnnlib.gz' to '../../benchmarks/./benchmarks/rl_benchmarks/vnnlib/dubinsrejoin_case_unsafe_59.vnnlib.gz': No such file or directory
mv: cannot move './benchmarks/rl_benchmarks/vnnlib/cartpole_case_unsafe_13.vnnlib.gz' to '../../benchmarks/./benchmarks/rl_benchmarks/vnnlib/cartpole_case_unsafe_13.vnnlib.gz': No such file or directory
rm: cannot remove 'large_models': No such file or directory
rm: cannot remove 'large_models.zip': No such file or directory
gzip: benchmarks/*/onnx/*.gz: No such file or directory
gzip: benchmarks/*/vnnlib/*.gz: No such file or directory
benchmarks/vggnet16_2022/onnx/vgg16-7.onnx: No such file or directory
+ curl --retry 100 --retry-connrefused https://vnncomp.christopher-brix.de/update/977/failure
```

@ChristopherBrix (Author) commented

@mldiego I'm looking into it.

@ChristopherBrix (Author) commented

@mldiego I can see that you got this error, and this isn't something your code could have influenced - but if I start exactly the same submission once again, it initializes without issues.

Please try again and let me know if it continues to fail. Maybe either GitHub or the AWS instance experienced some temporary issue.

@mldiego commented Jul 7, 2023

@ChristopherBrix Yes, it is working now. Thanks for looking into it!

@ChristopherBrix (Author) commented

All teams: You should have received an email from me (sent to the email address you used for your account on the submission page). Please double-check that it lists the correct instance type, commit hash, and benchmark list. If you spot any issues, please let me know.

If you did not receive an email, please contact me as soon as possible!
