Analyzing the Time x Energy Relation in C++ Solutions Mined from a Programming Contest Site

What is this?

This repository contains the source code of C++ solutions mined from the Code Submission Evaluation System (CSES).

It contains folders related to 15 different CSES problems, where each folder contains around 100 different solutions (70 C++, 30 Java). The complete list of CSES problems is available in CSES_problems.

Folder Structure

The main folder contains 18 subfolders:

15 subfolders (one for each CSES problem).
A subfolder RAPL, which contains the code of the time/energy measurement framework. In this version, the running time of a program is measured with a wall-clock timer.
A subfolder RAPL-time, which contains the same time/energy measurement framework, except that the running time of a program is measured with a CPU timer.
A subfolder analysis-script, which contains Python scripts to analyze the data generated by the time/energy measuring tool.

The code in subfolders RAPL and RAPL-time is mostly based on the code provided by the Green Software Lab.

Each folder related to a CSES problem has a Makefile, a test subfolder with the input files for the problem, a c++ folder with the C++ solutions, and a java folder with the Java solutions.

Each c++ folder has the subfolders slow, fast, rand, rand30 and control, with the following configuration:

slow: the 10 slowest C++ solutions.
fast: the 10 fastest C++ solutions.
rand: 10 C++ solutions chosen at random.
rand30: 30 C++ solutions (different from rand) chosen at random.
control: 10 C++ solutions (different from rand and rand30) chosen at random.

Each java folder has only the subfolder rand30, with 30 Java solutions chosen at random.

During our experiments, we considered the following datasets:

SFR C++: the C++ solutions in subfolders slow, fast and rand.
Rand30 C++: the C++ solutions in subfolder rand30.
Control: the C++ solutions in subfolder control.
Rand30 Java: the Java solutions in subfolder rand30.
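
The mapping between datasets and subfolders can be written down as a small lookup table. The sketch below is only an illustration and is not part of the repository: the names DATASETS and dataset_dirs are hypothetical, and it assumes the c++/ and java/ layout described above.

```python
from pathlib import Path

# Dataset name -> (language folder, subfolders), as listed in this README.
DATASETS = {
    "SFR C++": ("c++", ("slow", "fast", "rand")),
    "Rand30 C++": ("c++", ("rand30",)),
    "Control": ("c++", ("control",)),
    "Rand30 Java": ("java", ("rand30",)),
}


def dataset_dirs(problem_dir, dataset):
    """Return the solution directories that make up one dataset (hypothetical helper)."""
    language, subfolders = DATASETS[dataset]
    return [Path(problem_dir) / language / sub for sub in subfolders]


for directory in dataset_dirs("1084-Apartments", "SFR C++"):
    print(directory)
```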

The folder of each CSES problem contains the .csv files with the energy measurements for sections 4.1.1, 4.2 and 4.3. The names of these files have the following structure: PROBLEM_NUMBER-MACHINE-DATASET-TIME_MEASUREMENT.csv.

Below, we present some examples of these files:

1621-HPELITE-control-time.csv: data for CSES problem 1621, measured on machine HPELITE for the control dataset, using a CPU-based timer.
1082-HPTHINK-slow-clock.csv: data for CSES problem 1082, measured on machine HPTHINK for the slow dataset, using a wall-clock timer.
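
A file name of this form can be split back into its fields. The following is a minimal sketch, not code from the repository: MeasurementFile and parse_measurement_name are hypothetical names, and the sketch assumes that none of the four fields contains a hyphen.

```python
from pathlib import Path
from typing import NamedTuple


class MeasurementFile(NamedTuple):
    problem: str  # CSES problem number, e.g. "1621"
    machine: str  # machine label, e.g. "HPELITE"
    dataset: str  # dataset or subfolder label, e.g. "control"
    timer: str    # "time" (CPU-based timer) or "clock" (wall-clock timer)


def parse_measurement_name(filename):
    """Split PROBLEM_NUMBER-MACHINE-DATASET-TIME_MEASUREMENT.csv into its fields."""
    stem = Path(filename).stem  # drop the ".csv" extension
    parts = stem.split("-")
    if len(parts) != 4:
        raise ValueError(f"unexpected file name: {filename!r}")
    return MeasurementFile(*parts)


print(parse_measurement_name("1621-HPELITE-control-time.csv"))
```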

The measurements related to section 4.1.2 are in folder results, where there is a subfolder for each machine (elite, think and xeon), and then a subfolder for each measurement framework (perf and rapl).

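The names of the files in these folders have the following structure: PROBLEM_NUMBER-PROBLEM_NAME-LANGUAGE-CORES-FRAMEWORK-MACHINE-DATASET-DATE-TIME.csv, where the trailing numeric fields appear to be the date and time of the measurement. Below, we present some examples of these files: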

1643-Maximum_Subarray_Sum-java-mult-rapl-elite-rand30-24-07-2024-18-11.csv
2185-Prime_Multiples-c++-sing-perf-xeon-rand30-26-07-2024-19-22.csv

Compiling/Running the C++ Solutions

The files in this repository were compiled in a Linux/Ubuntu environment using versions of the g++ compiler that support the C++17 standard.

To start measuring the time/energy of the CSES solutions, the first step is to compile the measurement framework. You should enter the RAPL and RAPL-time folders and type make.

After this, you should enter a folder related to a CSES problem (e.g., cses-1084_Apartments) and edit the Makefile to select the subfolders with C++ solutions that will be measured.

Then, you should log in as root (this is necessary both for increasing the amount of memory that a program can use and for the energy measurements), and type ./faztudo at the command line. This will compile and run all solutions for the given problem in the selected subfolders against each test file of the corresponding test subfolder. By default, each solution is run ten times against the corresponding test set.
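
To get a feeling for how many executions this produces, the run plan can be enumerated as every combination of solution, test file and repetition. This is only a sketch of the idea and not the logic of the faztudo script, which is not reproduced here; plan_runs and the file names in the example are hypothetical.

```python
from itertools import product


def plan_runs(solutions, test_files, repetitions=10):
    """Return every (solution, test file, repetition) triple to execute."""
    return list(product(solutions, test_files, range(1, repetitions + 1)))


# Hypothetical names, used only to illustrate the size of the plan.
runs = plan_runs(["sol_a", "sol_b"], ["t1.in", "t2.in", "t3.in"])
print(len(runs))  # 2 solutions * 3 test files * 10 repetitions
```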

The energy measurements of the solutions for a given problem are stored in a .csv file whose name is set by the variable PROBLEM in the first line of the corresponding outermost Makefile. For example, if PROBLEM is set to "1084", the measurements are stored in the file 1084.csv.

Each line of the .csv file contains six columns with the following information:

Name of the executable file, PKG (J), CPU (J), GPU (J), DRAM (J), Time (ms)

RAPL will always report values for the columns PKG and CPU, but the measurements for GPU and DRAM may not be available on some machines.
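
As an illustration of how such a file can be consumed, the sketch below reads the six columns and averages PKG energy and time per executable. It is not the repository's analysis script. It assumes comma-separated rows, treats any empty or non-numeric field (such as an unavailable GPU value) as missing, and skips rows without numeric PKG and time values; the column keys, function names and sample rows are all made up for the example.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical keys for the six columns described above.
COLUMNS = ("name", "pkg_j", "cpu_j", "gpu_j", "dram_j", "time_ms")


def to_float(text):
    """Return a float, or None when the field is empty or not numeric."""
    try:
        return float(text)
    except ValueError:
        return None


def read_measurements(handle):
    """Yield one dict per row that has numeric PKG and time values."""
    for row in csv.reader(handle):
        if len(row) != len(COLUMNS):
            continue  # not a six-column measurement row
        cells = [cell.strip() for cell in row]
        record = {"name": cells[0]}
        for key, cell in zip(COLUMNS[1:], cells[1:]):
            record[key] = to_float(cell)
        if record["pkg_j"] is None or record["time_ms"] is None:
            continue  # header line or malformed row
        yield record


def mean_by_executable(records):
    """Map each executable to (mean PKG energy in J, mean time in ms)."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["name"]].append((record["pkg_j"], record["time_ms"]))
    return {
        name: (mean(p for p, _ in runs), mean(t for _, t in runs))
        for name, runs in grouped.items()
    }


# Made-up rows, not real measurements; the GPU column is left empty.
SAMPLE = """\
sol_a , 1.0 , 0.5 , , 0.1 , 100
sol_a , 3.0 , 1.5 , , 0.3 , 300
sol_b , 2.0 , 1.0 , , 0.2 , 200
"""

summary = mean_by_executable(read_measurements(io.StringIO(SAMPLE)))
for name, (pkg, time_ms) in summary.items():
    print(name, pkg, time_ms)
```

To read a real file, pass an open file handle instead of the io.StringIO object.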

Using the Analysis Script

To use the analysis script, enter the folder of a problem and type the following command:

python ../analysis-script/analisacsvs.py <file>.csv

You can also provide multiple .csv files to the analysis script. In this case, you should provide an even number 2 * N of files. The script assumes that the first N files belong to one machine and the remaining N files belong to a different machine, and it compares the outliers in files 1, 2, ..., N with the outliers in files N+1, N+2, ..., 2 * N.
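
The 2 * N convention can be pictured as cutting the argument list in half. The sketch below only illustrates this convention and is not taken from analisacsvs.py; split_by_machine is a hypothetical name, and the file names merely follow the naming pattern used in this repository.

```python
def split_by_machine(files):
    """Split 2 * N file names into (first machine, second machine) groups."""
    if len(files) < 2 or len(files) % 2 != 0:
        raise ValueError("expected an even number (2 * N) of .csv files")
    n = len(files) // 2
    return files[:n], files[n:]


# Illustrative file names; N = 2 here.
first, second = split_by_machine([
    "1084-HPELITE-slow-clock.csv",
    "1084-HPELITE-fast-clock.csv",
    "1084-HPTHINK-slow-clock.csv",
    "1084-HPTHINK-fast-clock.csv",
])
print(first)
print(second)
```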

The analysis script creates a subfolder analysis_results, where it stores several auxiliary files generated during the analysis.

Contact

You can contact @Sérgio Medeiros and @Marcelo Nogueira about this repository.

sqmedeiros/sblp2023-time-vs-energy
