A Novel Video Super-Resolution Enhancement Method Based on Residual Learning Using Hidden Markov Random Fields and a New Deep Learning Network Architecture

Document Type: Research Paper

Authors

Department of Electrical Engineering, Mashhad Branch, Islamic Azad University, Mashhad, Iran.

Abstract

Improving the quality and clarity of video has become increasingly important, particularly in surveillance, medicine, and imaging technologies. Traditional super-resolution methods focus primarily on reconstructing entire video frames, which makes it difficult to preserve fine details and complex structures. This paper introduces a novel approach based on parallel deep networks that enhances video quality by processing video frames through three separate input branches: raw images, outputs based on Hidden Markov Random Fields (HMRF), and temporal images. The method combines residual learning and random patching within a unified framework that integrates spatial segmentation (HMRF) with temporal information. This integration allows the model to better capture spatial and temporal dependencies, leading to more accurate and efficient reconstruction of video frames. Residual learning is employed to focus on high-frequency details and to mitigate the vanishing gradient problem: the network estimates only the additional details needed to reconstruct the high-resolution image. Random patching, in turn, directs the training process toward critical features and intricate textures. Experimental results show that the proposed method achieves an SSIM of 0.92857 and a PSNR of 34.8617 dB, offering superior clarity in video reconstruction.
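The residual learning, random patching, and PSNR evaluation described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the nearest-neighbour upsampling, the 2x block-average downsampling, the patch size, and all function names (`upsample_nearest`, `residual_target`, `random_patches`, `psnr`) are assumptions chosen for illustration, and the network itself is omitted, so the "predicted" residual here is simply the ideal one.

```python
import numpy as np


def upsample_nearest(img, scale):
    """Nearest-neighbour upsampling (assumed stand-in for the paper's interpolation)."""
    return np.repeat(np.repeat(img, scale, axis=0), scale, axis=1)


def residual_target(hr, lr, scale):
    """Residual a network would learn: high-resolution frame minus upsampled input."""
    return hr - upsample_nearest(lr, scale)


def random_patches(image, patch_size, count, rng):
    """Crop `count` square patches at uniformly random positions."""
    height, width = image.shape
    tops = rng.integers(0, height - patch_size + 1, size=count)
    lefts = rng.integers(0, width - patch_size + 1, size=count)
    return np.stack([
        image[t:t + patch_size, l:l + patch_size]
        for t, l in zip(tops, lefts)
    ])


def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB for images with values in [0, peak]."""
    mse = np.mean((reference - estimate) ** 2)
    return float("inf") if mse == 0 else float(10.0 * np.log10(peak ** 2 / mse))


rng = np.random.default_rng(0)
hr = rng.random((32, 32))                         # synthetic high-resolution frame
lr = hr.reshape(16, 2, 16, 2).mean(axis=(1, 3))   # assumed 2x block-average downsampling

residual = residual_target(hr, lr, scale=2)        # what the network would estimate
reconstructed = upsample_nearest(lr, 2) + residual # coarse frame plus residual details
patches = random_patches(residual, patch_size=8, count=4, rng=rng)

baseline_psnr = psnr(hr, upsample_nearest(lr, 2))  # quality without any residual
```

Because the residual holds only what the upsampled input lacks, adding it back recovers the high-resolution frame exactly in this idealised case. A trained network would supply an approximation of that residual, and PSNR would then measure how close the approximation comes.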
