Distributed Deep Joint Source-Channel Coding with Decoder-Only Side Information (ICMLCN 2024) [Paper]
This repository contains the implementation of the paper "Distributed Deep Joint Source-Channel Coding with Decoder-Only Side Information".
Please cite the paper if this code or the paper has been useful to you:
@INPROCEEDINGS{yilmaz2024decoderonly,
author={Yilmaz, Selim F. and Ozyilkan, Ezgi and Gündüz, Deniz and Erkip, Elza},
booktitle={2024 IEEE International Conference on Machine Learning for Communication and Networking (ICMLCN)},
title={Distributed Deep Joint Source-Channel Coding with Decoder-Only Side Information},
year={2024},
volume={},
number={},
pages={139-144},
doi={10.1109/ICMLCN59089.2024.10625214}}
We consider low-latency image transmission over a noisy wireless channel when correlated side information is present only at the receiver side (the Wyner-Ziv scenario). In particular, we are interested in developing practical schemes using a data-driven joint source-channel coding approach, which has previously been shown to outperform conventional separation-based approaches in the practical finite blocklength regime and to provide graceful degradation with channel quality. We propose a novel neural network architecture that incorporates the decoder-only side information at multiple stages at the receiver side. Our results demonstrate that the proposed method succeeds in integrating the side information, yielding improved performance under all channel conditions in terms of the various quality measures considered here, especially at low channel signal-to-noise ratios and small bandwidth ratios. We have made the source code of the proposed method public to enable further research and the reproducibility of the results.
To install the environment, run the following commands after installing Python 3.10 or above and torch>=1.10.0 for your specific training device (see the PyTorch installation guide):
git clone https://github.com/ipc-lab/deepjscc-wz.git
cd deepjscc-wz
pip install -r requirements.txt
Then, place the KittiStereo and Cityscape datasets in the data
folder. The datasets can be downloaded from the following links:
- Cityscape
- Download the image pairs from here. After downloading leftImg8bit_trainvaltest.zip and rightImg8bit_trainvaltest.zip to the data folder, run the following commands:
mkdir cityscape_dataset
unzip leftImg8bit_trainvaltest.zip
mv leftImg8bit cityscape_dataset
unzip rightImg8bit_trainvaltest.zip
mv rightImg8bit cityscape_dataset
- The folder structure needs to be as follows: data/cityscape_dataset/leftImg8bit/train (for training), data/cityscape_dataset/leftImg8bit/val (for validation) and data/cityscape_dataset/leftImg8bit/test (for testing). Subfolders are named after cities, such as berlin and munich, and contain the images.
- KittiStereo
- Download the necessary image pairs from KITTI 2012 and KITTI 2015. After obtaining data_stereo_flow_multiview.zip and data_scene_flow_multiview.zip in the data folder, run the following commands:
unzip data_stereo_flow_multiview.zip  # KITTI 2012
mkdir data_stereo_flow_multiview
mv training data_stereo_flow_multiview
mv testing data_stereo_flow_multiview
unzip data_scene_flow_multiview.zip  # KITTI 2015
mkdir data_scene_flow_multiview
mv training data_scene_flow_multiview
mv testing data_scene_flow_multiview
- The folder structure needs to be as follows: data/data_scene_flow_multiview/training, data/data_scene_flow_multiview/testing, data/data_stereo_flow_multiview/training and data/data_stereo_flow_multiview/testing (for training, validation and testing), where subfolders are named like image_2 and image_3. Text files containing the image paths are already included in this repository (data/KITTI_stereo_train.txt, data/KITTI_stereo_val.txt, data/KITTI_stereo_test.txt). A quick check of the expected layout for both datasets is shown after this list.
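As a quick sanity check of the layout described above (these ls commands are only illustrative; the exact subfolders depend on your download), you can run:
ls data/cityscape_dataset/leftImg8bit    # expect train, val and test subfolders
ls data/data_stereo_flow_multiview       # expect training and testing subfolders
ls data/data_scene_flow_multiview        # expect training and testing subfolders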
To train a single model on a single dataset and bandwidth ratio, the following command can be used:
cd src
python train.py trainer.devices=[0] model=base_deepjscc model.N=256 model.snr=undefined model.csi.min=-5.0 model.csi.max=5.0 model/loss=mse_lpips model.loss.lpips_weight=0.001 data.name=<data_name> model._target_=<model_class> model.bw_factor=<bw_factor>
The parameters used are described as follows (a full example command with the placeholders filled in is given after this list):
- trainer.devices: The device to be used for training. Here, trainer.devices=[0] indicates the GPU device (e.g., the 0th GPU) to be used for training.
- model: The name of the model config to be used for training. Here, model=base_deepjscc indicates that the base DeepJSCC class is to be used for training. The config is taken from the config/model/base_deepjscc.yaml file.
- model.N: The number of convolutional filters in the middle layers of the encoder and decoder (default: 256).
- model.snr: This is disregarded since we specify the minimum and maximum CSI values, but it is required for the code to run. Just use undefined as the value.
- model.csi.min and model.csi.max: The minimum and maximum values of the channel signal-to-noise ratio (CSI) to be used for training. During training, the CSI value is sampled uniformly from the interval [min, max].
- model/loss: The loss function to be used for training. Here, model/loss=mse_lpips indicates that the loss is a combination of the mean squared error (MSE) and the learned perceptual image patch similarity (LPIPS) loss. The config is taken from the config/model/loss/mse_lpips.yaml file.
- model.loss.lpips_weight: The weight of the LPIPS loss in the combined loss function. Here, the default model.loss.lpips_weight=0.001 sets the LPIPS weight to 0.001.
- data.name: The name of the dataset to be used for training. Here, <data_name> should be replaced with the name of the dataset to be used. Possible values are KittiStereo and Cityscape.
- model.bw_factor: The bandwidth factor of the model. Here, <bw_factor> should be replaced with the desired value.
- model._target_: The model class to be used for training. Here, <model_class> should be replaced with one of the following:
  - src.models.wz.deepjscc_wz_baseline2.DeepJSCCWZBaseline2: the DeepJSCC-WZ model in the paper (see the result figures/tables in the paper).
  - src.models.wz.deepjscc_wz.DeepJSCCWZ: the DeepJSCC-WZ-sm model in the paper.
  - src.models.wz.deepjscc_wz_baseline.DeepJSCCWZBaseline: the DeepJSCC model in the paper.
  - src.models.wz.deepjscc_wz_joint2.DeepJSCCWZJoint2: the DeepJSCC-Cond model in the paper.
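For example, the following hypothetical invocation (run from the src directory) would train the DeepJSCC-WZ model on KittiStereo; the bandwidth factor of 16 is only an illustrative value:
python train.py trainer.devices=[0] model=base_deepjscc model.N=256 model.snr=undefined model.csi.min=-5.0 model.csi.max=5.0 model/loss=mse_lpips model.loss.lpips_weight=0.001 data.name=KittiStereo model._target_=src.models.wz.deepjscc_wz_baseline2.DeepJSCCWZBaseline2 model.bw_factor=16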
To reproduce the trainings of the methods in the paper for all figures, the following command can be used:
cd src
python train.py trainer.devices=[0] experiment=wz/wz_train_bw
This command will train all the models in the paper and log the best checkpoints to the logs folder.
To evaluate at different signal-to-noise values, the following command can be used:
python eval.py trainer.devices=[0] experiment=wz/wz_eval ckpt_path=<saved checkpoint>
Here, <saved checkpoint> is the path to the saved checkpoint of the model to be evaluated, which is stored in the logs folder after training. This parameter can also be multiple paths separated by commas, each of which will be evaluated sequentially.
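For example, to evaluate two checkpoints one after the other (the checkpoint paths below are placeholders; use the actual paths produced under the logs folder by your own training runs):
python eval.py trainer.devices=[0] experiment=wz/wz_eval ckpt_path=<first checkpoint path>,<second checkpoint path>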
To facilitate further research and reproducibility, we share the checkpoints on Google Drive. You can download the checkpoints and place them in the checkpoints folder.
The checkpoints are stored in .ckpt files, whose names have the following structure: <dataset>_<1/rho value>_{model}.ckpt, along with corresponding config files named <dataset>_<1/rho value>_{model}_cfg.pt. Here, <dataset> is the dataset used for training (either KittiStereo or Cityscape), <1/rho value> is the inverse of the bandwidth ratio (either 16 or 32), and {model} is the name of the model used for training. The possible values for {model} are listed below along with the corresponding model in the paper; an example following this naming convention is given after the list:
- DeepJSCCWZBaseline2: the DeepJSCC-WZ model in the paper (see the result figures/tables in the paper).
- DeepJSCCWZ: the DeepJSCC-WZ-sm model in the paper.
- DeepJSCCWZBaseline: the DeepJSCC model in the paper.
- DeepJSCCWZJoint2: the DeepJSCC-Cond model in the paper.
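For instance, following this naming convention, the DeepJSCC-WZ model trained on KittiStereo with 1/rho = 16 would be stored as checkpoints/KittiStereo_16_DeepJSCCWZBaseline2.ckpt (the name is derived from the convention above; verify it against the downloaded files, and adjust the relative path depending on the directory you run eval.py from):
python eval.py trainer.devices=[0] experiment=wz/wz_eval ckpt_path=checkpoints/KittiStereo_16_DeepJSCCWZBaseline2.ckpt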
We share the evaluation results of the models in the paper in the results folder, containing psnr, mssim and lpips values for each evaluated signal-to-noise value. The results are stored in .csv files, where each row corresponds to a different signal-to-noise value. The file names have the following structure: <dataset>_<1/rho value>_{model}.csv. Here, <dataset> is the dataset used for evaluation (either KittiStereo or Cityscape), <1/rho value> is the inverse of the bandwidth ratio (either 16 or 32), and {model} is the name of the model used for evaluation. The possible values for {model} are listed below along with the corresponding model in the paper (an example is given after the list):
- DeepJSCC-Cond: <dataset>_<1/rho value>_DeepJSCCWZJoint2.csv
- DeepJSCC-WZ: <dataset>_<1/rho value>_DeepJSCCWZBaseline2.csv
- DeepJSCC-WZ-sm: <dataset>_<1/rho value>_DeepJSCCWZ.csv
- DeepJSCC: <dataset>_<1/rho value>_DeepJSCCBaseline.csv
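For example, under this convention the DeepJSCC-WZ results for Cityscape at 1/rho = 32 would be in results/Cityscape_32_DeepJSCCWZBaseline2.csv (this file name is derived from the convention above; check it against the actual contents of the results folder), which can be inspected with:
head results/Cityscape_32_DeepJSCCWZBaseline2.csv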
This code is based on PyTorch Lightning and Hydra, and uses the Lightning Hydra Template as our base code. For more details on the template, please see its README.md.