Online material for the paper "Balancing Consumer and Business Value of Recommender Systems: A Simulation-based Analysis"
- Summary
- General model workflow
- Requirements
- Installation
- Running the model
- File structure
- Rating dataset
- Configuration file
- Results
This agent-based simulation model demonstrates the consequences of various recommendation strategies for different stakeholders: Focusing only on satisfying consumers when delivering the recommendations may affect other stakeholders' interests, in particular the short-term profit of the service provider. Likewise, delivering recommendations only to maximize profit may negatively affect the consumers' trust in the service provider.
Two types of agents are used in the model:
- Recommendation service provider: Prepares and sends personalized recommendations to the consumers
- Consumer: Receives the recommendations and makes further decisions
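The interplay between the two agent types can be sketched in plain Python. This is only a minimal illustration and not the model's actual code: the class layout, the `profit_weight` parameter, the trust update rule, and the item values are all assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Item:
    name: str
    utility: float  # value for the consumer, in [0, 1] (made-up numbers)
    profit: float   # margin for the provider, in [0, 1] (made-up numbers)


@dataclass
class ServiceProvider:
    # Hypothetical knob: 0 = purely consumer-centric, 1 = purely profit-centric.
    profit_weight: float

    def recommend(self, items):
        # Score each item as a weighted mix of consumer utility and provider profit.
        def score(item):
            return (1 - self.profit_weight) * item.utility + self.profit_weight * item.profit

        return max(items, key=score)


@dataclass
class Consumer:
    trust: float = 0.5

    def receive(self, item):
        # Assumed rule: trust moves a small step toward the experienced utility.
        self.trust += 0.1 * (item.utility - self.trust)
        return self.trust


items = [Item("A", utility=0.9, profit=0.2), Item("B", utility=0.3, profit=0.8)]
consumer_centric = ServiceProvider(profit_weight=0.0)
profit_centric = ServiceProvider(profit_weight=1.0)
```

With these made-up numbers, the consumer-centric provider recommends item A and the profit-centric provider recommends item B; a consumer who receives item B ends up with lower trust than before, which mirrors the trade-off described above.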
General model workflow based on Experience goods
We tested the code on a machine with MS Windows 10, Python 3.8, 16 GB of RAM, and an Intel Core 7 CPU. The code was also tested on a machine with Docker, Ubuntu 20.04.2 LTS x86_64, 30 GB of RAM, and an Intel Xeon E5645 (12) @ 2.4 GHz processor.
For installation without Docker, we recommend installing the latest version of Anaconda, which comes with Python 3 and supports scientific packages.
The packages used in our model are listed in the file requirements.txt.
Download and install Anaconda (Individual Edition)
Create a virtual environment
conda create -n myenv python=3.8
Activate the virtual environment
conda activate myenv
More commands for managing virtual environments can be found in the conda documentation.
Install the required packages by running:
pip install -r requirements.txt
If you face errors when installing the surprise package on MS Windows, run:
conda install -c conda-forge scikit-surprise
We provide a Docker image on Docker Hub. To pull the image, use the following command:
docker pull nadadocker/simulation
To run the simulation without Docker:
cd src
python run.py
When using Docker:
Since the simulation saves data to disk at the end, an output directory has to be provided to the Docker container. The docker run command given further below starts a new container of the simulation and saves the output in the "results" directory. Before running the Docker container, create a directory named results on the host machine by executing the following commands:
git clone <git_repo>
cd simulation
mkdir results
Run the Docker container.
docker run -dit --rm -v ${PWD}/results:/results --name <my_container> <nadadocker/simulation>
- <my_container>: A name for the container
- ${PWD}: The current working directory
- -v ${PWD}/results:/results: Sets up a bind mount that links the /results directory inside the container to the directory ${PWD}/results on the host machine. Docker uses ':' to separate the host path from the container path, and the host path always comes first
- <nadadocker/simulation>: The Docker image that is used to run the container
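The same command can also be assembled from Python, which makes the host-first ordering of the bind mount explicit. The following is a minimal sketch: the helper name `docker_run_command` and the container name `my_container` are made up for illustration, and the sketch only builds the argument list without starting a container.

```python
from pathlib import Path


def docker_run_command(host_dir, container_name, image="nadadocker/simulation"):
    # Bind mounts need an absolute host path; the host side comes before the ':'.
    host_path = Path(host_dir).resolve()
    volume = f"{host_path}:/results"
    return ["docker", "run", "-dit", "--rm", "-v", volume, "--name", container_name, image]


# "my_container" is a placeholder name chosen for this example.
command = docker_run_command("results", "my_container")
print(" ".join(command))
```

To actually start the container, the list could be passed to `subprocess.run(command, check=True)` on a machine where Docker is installed.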
The simulation is built with the help of Mesa, an agent-based simulation framework in Python.
├── data/
│ ├── dataset <- MovieLens dataset
│ │ ├── movies.csv
│ │ └── ratings.csv
│ ├── recdata/ <- Recommendation algorithm output saved in pickle format
│ │ ├── consumers_items_utilities_predictions.p
│ │ ├── consumers_items_utilities_predictions_popular.p
│ │ └── SVDmodel.p
│ └── trust/ <- Initial data for consumer trust
│ └── beta_initials.p
├── Dockerfile
├── figures/ <- Figures that show simulation results
│ ├── modelgeneralflow.png
│ ├── time-consumption_probability.png
│ ├── time-total_profit.png
│ └── time-trust.png
├── README.md
├── requirements.txt
├── results-analysis(R)/ <- R code to analyze the model output; the output itself is stored in the "results" folder, which we keep on a Seafile service
├── src/
├── __init__.py
├── config.yml <- Simulation settings
├── consumer.py <- Contains all properties and behaviors of consumer agents
├── mesa_utils/
│ ├── __init__.py
│ ├── datacollection.py
│ └── schedule.py
├── model.py <- Contains the model class, which manages agent creation, data sharing, and simulation output collection
├── plots.py <- Plotting module for data analysis
├── read_config.py
├── run.py <- Launches the simulation
├── service_provider.py <- Contains all properties and behavior of the service provider agent
├── test.py
└── utils.py <- An auxiliary module
We use the small version (1 MB) of the MovieLens dataset, which contains movie ratings from multiple consumers. The following shows an excerpt of ratings.csv.
userId | movieId | rating | timestamp |
---|---|---|---|
1 | 1 | 4 | 964982703 |
1 | 3 | 4 | 964981247 |
1 | 6 | 4 | 964982224 |
1 | 47 | 5 | 964983815 |
1 | 50 | 5 | 964982931 |
The dataset is used to predict the utility of each item for each consumer and to initialize the model.
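As a quick way to get familiar with the file layout, the rating rows can be read with Python's standard csv module. The sketch below assumes a comma-separated file with the header shown in the table; the per-user mean it computes is only an illustration of working with the data and is not the utility prediction used in the model.

```python
import csv
import io
from collections import defaultdict

# The five rows from the table, inlined so the example runs without the dataset.
SAMPLE = """userId,movieId,rating,timestamp
1,1,4,964982703
1,3,4,964981247
1,6,4,964982224
1,47,5,964983815
1,50,5,964982931
"""


def mean_rating_per_user(handle):
    # Accumulate [sum of ratings, number of ratings] for every user id.
    totals = defaultdict(lambda: [0.0, 0])
    for row in csv.DictReader(handle):
        entry = totals[int(row["userId"])]
        entry[0] += float(row["rating"])
        entry[1] += 1
    return {user: total / count for user, (total, count) in totals.items()}


means = mean_rating_per_user(io.StringIO(SAMPLE))
```

To run this on the real file, replace the `io.StringIO(SAMPLE)` argument with an open file handle, for example `open("data/dataset/ratings.csv", newline="")`.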
config.yml includes all the required parameters to set up the model.
Note: Running the code may take a long time (e.g. one hour), depending on the number of time steps and replications set in the configuration.
Each execution of the model generates a unique folder inside the results folder. The collected data from the simulation contains various CSV files, a summary of the simulated strategies in a file named scenarios.json, and plots in the PNG format.
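A run folder can then be inspected with a few lines of Python. This is a hedged sketch: the helper names `latest_run` and `load_run` are hypothetical, it assumes the newest run can be identified by modification time, and the demo uses a throwaway directory with made-up contents (`run_1`, `output.csv`, and a placeholder JSON object) because the actual folder naming and file contents are not described here.

```python
import json
import tempfile
from pathlib import Path


def latest_run(results_dir):
    # Pick the most recently modified run folder inside the results directory.
    runs = [p for p in Path(results_dir).iterdir() if p.is_dir()]
    return max(runs, key=lambda p: p.stat().st_mtime) if runs else None


def load_run(run_dir):
    # Read the scenario summary and list the CSV files of one run.
    run_dir = Path(run_dir)
    with open(run_dir / "scenarios.json") as handle:
        scenarios = json.load(handle)
    csv_files = sorted(p.name for p in run_dir.glob("*.csv"))
    return scenarios, csv_files


# Demo on a temporary directory; all names and contents here are placeholders.
with tempfile.TemporaryDirectory() as tmp:
    run = Path(tmp) / "run_1"
    run.mkdir()
    (run / "scenarios.json").write_text(json.dumps({"example": "placeholder"}))
    (run / "output.csv").write_text("step,value\n0,1\n")
    scenarios, csv_files = load_run(latest_run(tmp))
```

On a real run, `latest_run("results")` would be called from the directory that contains the results folder.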
The following is part of the results generated from running the simulation for 1000 time steps and 3 replications. The simulation comprises one service provider and 610 consumers, and consumers can share their experiences on social media.
Consumption probability | Profit per step | Cumulative profit |