The following demo illustrates a side-by-side comparison of the framework performing GTAV → Cityscapes (left) and GTAV → Mapillary Vistas (right) at 1280x720 (maximum game settings). The footage was recorded with OBS Studio while the game was rendered on the same GPU. The system used an RTX 4090, an Intel i7 14700F CPU, and 64GB of DDR4 system memory, without any inference optimization (e.g., TensorRT). The full video is included in the demos directory.
The following demos illustrate the framework performing GTAV → Cityscapes at 1280x720 (maximum game settings) and CARLA → KITTI at 960x540 with the simulator capped at 20 FPS in synchronous mode. Both run on a system with an RTX 4070 Super 12GB, an Intel i7 13700KF CPU, and 32GB of DDR4 system memory, without any inference optimization (e.g., TensorRT).
Photorealism is an important aspect of modern video games, since it shapes the player experience and simultaneously impacts immersion, narrative engagement, and visual fidelity. Although recent hardware breakthroughs, along with state-of-the-art rendering technologies, have significantly improved the visual realism of video games, achieving true photorealism in dynamic environments at real-time frame rates remains a major challenge due to the tradeoff between visual quality and performance. In this short paper, we present a novel approach for enhancing the photorealism of rendered game frames using generative adversarial networks. To this end, we propose Real-time photorealism Enhancement in Games via a dual-stage gEnerative Network framework (REGEN), which employs a robust unpaired image-to-image (Im2Im) translation model to produce semantically consistent photorealistic frames, transforming the problem into a simpler paired Im2Im translation task. This enables training a lightweight method that achieves real-time inference without compromising visual quality. We demonstrate the effectiveness of our framework on Grand Theft Auto V, showing that it achieves visual results comparable to those produced by the robust unpaired Im2Im method while improving inference speed by a factor of 32.14. Our findings also indicate that the results outperform the photorealism-enhanced frames produced by directly training a lightweight unpaired Im2Im translation method to translate the video game frames towards the visual characteristics of real-world images.
If you use the REGEN framework or any of the pretrained models from this repository in a scientific publication, we would appreciate it if you used the following citation:
@misc{pasios2025regenrealtimephotorealismenhancement,
  title={REGEN: Real-Time Photorealism Enhancement in Games via a Dual-Stage Generative Network Framework},
  author={Stefanos Pasios and Nikos Nikolaidis},
  year={2025},
  eprint={2508.17061},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2508.17061},
}
📝 Note: This repository uses code from the Pix2PixHD repository.
conda create -n REGEN python=3.9
conda activate REGEN
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu128
pip install carla==0.9.15
pip install opencv-python
pip install dominate
pip install scipy
pip install mss
pip install pywin32
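Before moving on, it may be worth verifying that the CUDA wheel installed correctly. A minimal sanity check (not part of the official setup):

```python
import torch

# Should print True and your GPU name; if it prints False, the models
# will fall back to CPU and real-time inference is unlikely.
print(torch.cuda.is_available())
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
```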
To train the model, you need access to a synthetic dataset generated by a game or simulator, along with the corresponding images photorealism-enhanced by a robust unpaired image-to-image translation method such as Enhancing Photorealism Enhancement (EPE).
To train a model that enhances the photorealism of the CARLA simulator towards the characteristics of real-world datasets (Mapillary Vistas, Cityscapes, and KITTI), we already provide both the original rendered frames and the results of EPE here.
To train a model that enhances the photorealism of GTAV towards the characteristics of real-world datasets (Mapillary Vistas and Cityscapes), the results of EPE are already provided by the authors in the official repository. The initial rendered GTAV frames originate from the Playing for Data dataset, which can be downloaded here.
After collecting the required datasets, place the training and test sets of the game/simulator dataset into code/data/train_A and code/data/test_A, respectively, and the corresponding photorealism-enhanced images into the code/data/train_B and code/data/test_B directories, as sketched below.
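The expected layout therefore looks like this (only the four subdirectories are required; the comments describe their contents):

```
code/data/
├── train_A/   # rendered game/simulator frames (training)
├── train_B/   # corresponding EPE-enhanced frames (training)
├── test_A/    # rendered frames to be translated (testing)
└── test_B/    # corresponding EPE-enhanced frames (testing)
```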
To start training, execute the following command:

python train.py --dataroot ./data --name REGEN --label_nc 0 --no_instance --gpu_id 0

To test the framework, we provide pretrained models for GTAV → Cityscapes, CARLA → Cityscapes, and CARLA → KITTI. Download the models from Google Drive and transfer them into code/checkpoints/REGEN/. Finally, place the images to be inferred in the code/data/test_A directory and execute the following command:
python test.py --dataroot ./data --name REGEN --label_nc 0 --no_instance --gpu_id 0

The resulting images will be saved in the code/results/REGEN/images/ directory.
📝 Note: We have already provided some sample screenshots for testing purposes that also include the UI of the game.
We additionally provide two sample scripts for testing the models under real-time conditions. The pretrained models should be placed in the same directory as in the testing step above.
To test the model on CARLA, download the UE4 executable of the simulator from the official repository. In particular, the code was tested with CARLA version 0.9.15. After running the simulator and initializing the world, execute the following command:
python carla_test.py --dataroot ./data --name REGEN --label_nc 0 --no_instance --gpu_id 0
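For orientation, the following is a minimal sketch of what the client side of such a setup typically looks like with the standard CARLA Python API. It is not the actual contents of carla_test.py; the host/port, camera placement, and the 20 FPS fixed timestep are assumptions matching the demo description above:

```python
import carla

# Connect to a locally running CARLA 0.9.15 server (default port 2000).
client = carla.Client('localhost', 2000)
client.set_timeout(10.0)
world = client.get_world()

# Synchronous mode with a fixed timestep produces the 20 FPS cap
# mentioned in the demo: the server only advances on world.tick().
settings = world.get_settings()
settings.synchronous_mode = True
settings.fixed_delta_seconds = 1.0 / 20
world.apply_settings(settings)

# Spawn a static RGB camera at the demo resolution (960x540).
bp = world.get_blueprint_library().find('sensor.camera.rgb')
bp.set_attribute('image_size_x', '960')
bp.set_attribute('image_size_y', '540')
camera = world.spawn_actor(bp, carla.Transform(carla.Location(x=1.5, z=2.0)))

frames = []
camera.listen(frames.append)  # one carla.Image arrives per tick

for _ in range(100):
    world.tick()  # advance the simulation one fixed step

camera.destroy()
```

Each carla.Image exposes its raw BGRA bytes via image.raw_data, which can be reshaped into a NumPy array before being passed through the generator.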
To test the model on GTA V, first download and run the game. Since the script captures the game window in real time, set the game to windowed mode at a resolution lower than that of the monitor (a dual-monitor setup would be ideal). In addition, cap the frame rate at 30 FPS through the game settings to reduce the GPU load. Then execute the following script:
python gta_test.py --dataroot ./data --name REGEN --label_nc 0 --no_instance --gpu_id 0
⚠️ Warning: You may need to modify the offsets in line 60 of gta_test.py in order to perfectly crop the game window while capturing.
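For reference, a minimal sketch of the capture step with mss and pywin32 is shown below; the window title, constants, and variable names are illustrative assumptions, not the actual code of gta_test.py:

```python
import mss
import numpy as np
import win32gui

# Locate the game window by its title (assumed; FindWindow returns 0 on failure).
hwnd = win32gui.FindWindow(None, 'Grand Theft Auto V')
left, top, right, bottom = win32gui.GetWindowRect(hwnd)

# Illustrative offsets trimming the title bar and borders; these play the
# role of the offsets on line 60 of gta_test.py and depend on the Windows
# theme and DPI scaling, hence the warning above.
BORDER, TITLE_BAR = 8, 31
region = {
    'left': left + BORDER,
    'top': top + TITLE_BAR,
    'width': (right - left) - 2 * BORDER,
    'height': (bottom - top) - TITLE_BAR - BORDER,
}

with mss.mss() as sct:
    shot = sct.grab(region)           # BGRA screenshot of the cropped region
    frame = np.array(shot)[:, :, :3]  # drop alpha -> BGR frame for the model
```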
📝 Note: For the best results, it is recommended to download the ScriptHook and Hood Camera mods, as the PFD dataset used for training is mainly limited to the hood-view perspective.
📝 Note: All the available parameters of the model (e.g., for changing the resolution of the resulting images) can be found in code/options/.
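For instance, since the code builds on Pix2PixHD, output resolution is typically controlled through options such as --resize_or_crop and --loadSize. Assuming the upstream option names are unchanged (check code/options/ for the authoritative list), a test run at a wider resolution might look like:

python test.py --dataroot ./data --name REGEN --label_nc 0 --no_instance --gpu_id 0 --resize_or_crop scale_width --loadSize 1280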