Compressed-Domain Deep Learning with Application to Image Recognition and Universal Adversarial Attack

Description
Researchers have shown that the predictions of a deep neural network (DNN) for an image set can be severely distorted by a single image-agnostic perturbation, or universal perturbation, usually with an empirically fixed threshold in the spatial domain to restrict its perceptibility. However, current universal perturbations have limited attack ability and, more importantly, limiting the perturbation's norm in the spatial domain may not be a suitable way to restrict the perceptibility of universal adversarial perturbations. Moreover, the effects of such attacks on DNN-based texture recognition have yet to be explored.

Learning-based image compression has been shown to achieve performance competitive with state-of-the-art transform-based codecs. This has motivated the development of learning-based image compression systems targeting both humans and machines. In addition, learning-based compressed-domain representations can be used to perform computer vision tasks directly in the compressed domain.

In the context of universal attacks, a novel method is proposed to compute more effective universal perturbations via enhanced projected gradient descent on targeted classifiers. The perturbation is optimized by consecutively accumulating small updates computed on perturbed images. Performance results show that the proposed adversarial attack method achieves much higher fooling rates than state-of-the-art universal attack methods. To reduce the perceptibility of universal attacks without compromising their effectiveness, a frequency-tuned universal attack framework is proposed that adopts just noticeable difference (JND) thresholds to guide the perceptibility of universal adversarial perturbations. The proposed frequency-tuned attack method achieves state-of-the-art quantitative results and a good balance between perceptibility and effectiveness, in terms of fooling rate, on both natural and texture image datasets.

In the context of compressed-domain image recognition, a novel feature adaptation module integrating a lightweight attention model is proposed to adaptively emphasize and enhance the key features within the extracted channel-wise information. An adaptation training strategy is also designed to make use of pretrained pixel-domain weights. The performance results show that the proposed compressed-domain classification model clearly outperforms existing compressed-domain classifiers, and that it achieves accuracy similar to that of pixel-domain models trained on decoded images at a much higher computational efficiency.
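
As an illustration of the general idea of a universal perturbation built by accumulating small projected-gradient updates, the following is a minimal sketch and not the method proposed in the dissertation. It assumes a toy linear softmax classifier with weights `W`, an L-infinity budget `eps`, and a step size `step`; the function names and all parameter values are hypothetical.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def universal_perturbation(X, y, W, eps=0.5, step=0.05, epochs=10):
    """Accumulate sign-gradient updates into one shared perturbation.

    X: (n, d) inputs, y: (n,) integer labels, W: (d, k) weights of a
    toy linear softmax classifier (an assumption made for this sketch).
    """
    n, d = X.shape
    k = W.shape[1]
    delta = np.zeros(d)
    onehot = np.eye(k)[y]
    for _ in range(epochs):
        for i in range(n):
            # Gradient of the cross-entropy loss w.r.t. the perturbed input.
            p = softmax((X[i:i + 1] + delta) @ W)
            grad = (p - onehot[i:i + 1]) @ W.T
            # Ascend the loss, then project back onto the L-infinity ball.
            delta = np.clip(delta + step * np.sign(grad[0]), -eps, eps)
    return delta

def fooling_rate(X, W, delta):
    """Fraction of inputs whose predicted label changes under delta."""
    clean = (X @ W).argmax(axis=1)
    adv = ((X + delta) @ W).argmax(axis=1)
    return float((clean != adv).mean())
```

The clipping step is what keeps the shared perturbation inside the fixed spatial-domain threshold mentioned above; a frequency-tuned variant would replace that single bound with per-frequency limits.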
Date Created
2023

Robust Object Detection under Varying Illuminations and Distortions

Description
Object detection is a computer vision area concerned with detecting object instances that belong to specific classes of interest and with localizing these instances in images and/or videos. Object detection serves as a vital module in many computer-vision-based applications. This work focuses on the development of object detection methods that exhibit increased robustness to varying illumination and image quality, and it presents two such methods.

In the context of varying illumination, this work focuses on robust generic obstacle detection and collision warning in Advanced Driver Assistance Systems (ADAS) under varying illumination conditions. The highlight of the first method is the ability to detect all obstacles without prior knowledge and detect partially occluded obstacles including the obstacles that have not completely appeared in the frame (truncated obstacles). It is first shown that the angular distortion in the Inverse Perspective Mapping (IPM) domain belonging to obstacle edges varies as a function of their corresponding 2D location in the camera plane. This information is used to generate object proposals. A novel proposal assessment method based on fusing statistical properties from both the IPM image and the camera image to perform robust outlier elimination and false positive reduction is also proposed.

In the context of image quality, this work focuses on robust multiple-class object detection using deep neural networks for images of varying quality. The use of Generative Adversarial Networks (GANs) is proposed in a novel generative framework to generate features that provide robustness for object detection on reduced-quality images. The proposed GAN-based Detection of Objects (GAN-DO) framework is not restricted to any particular architecture and can be generalized to several deep neural network (DNN) based architectures. The resulting deep neural network keeps exactly the same architecture as the selected baseline model, without increasing the number of model parameters or the inference time. Performance results obtained with GAN-DO on object detection datasets establish an improved robustness to varying image quality and a higher object detection and classification accuracy compared to existing approaches.
Date Created
2020

New Signal Processing Methods for Blur Detection and Applications

Description
The depth richness of a scene translates into a spatially variable defocus blur in the acquired image. Blurring can mislead computational image understanding; therefore, blur detection can be used for selective image enhancement of blurred regions and the application of image understanding algorithms to sharp regions. This work focuses on blur detection and its application to image enhancement.

This work proposes a spatially varying defocus blur detection method based on the quotient of spectral bands. Additionally, to avoid computationally intensive algorithms for segmenting foreground and background regions, a global threshold defined using weakly textured regions of the input image is proposed. Quantitative results expressed in the precision-recall space, as well as qualitative results, show that the method outperforms current state-of-the-art algorithms while keeping the computational requirements at competitive levels.
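
To make the notion of a quotient of spectral bands concrete, the sketch below computes, for a single image patch, the ratio of high-band to low-band spectral energy. It is only an illustration of the general idea: the band split at a radial frequency `cutoff`, the function names, and the box blur used to produce a blurred example are assumptions, not the formulation used in this work.

```python
import numpy as np

def band_energy_quotient(patch, cutoff=0.25):
    """Ratio of high-band to low-band spectral energy for one 2D patch.

    cutoff is a radial frequency in cycles/pixel separating the two
    bands; its value here is an arbitrary assumption.
    """
    patch = patch - patch.mean()              # drop the DC component
    power = np.abs(np.fft.fft2(patch)) ** 2
    fy = np.fft.fftfreq(patch.shape[0])[:, None]
    fx = np.fft.fftfreq(patch.shape[1])[None, :]
    radius = np.hypot(fy, fx)
    high = power[radius >= cutoff].sum()
    low = power[radius < cutoff].sum()
    return high / (low + 1e-12)

def box_blur(img, k=5):
    """Circular k x k box blur, used only to create a blurred example."""
    out = np.zeros_like(img, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += np.roll(img, (dy - k // 2, dx - k // 2), axis=(0, 1))
    return out / (k * k)

# Example: a textured patch versus the same patch after blurring.
rng = np.random.default_rng(0)
sharp = rng.normal(size=(32, 32))
blurred = box_blur(sharp)
q_sharp = band_energy_quotient(sharp)
q_blurred = band_energy_quotient(blurred)
```

Blurring removes high-frequency energy, so `q_blurred` comes out smaller than `q_sharp`; thresholding such a per-patch quotient is one simple way to obtain a blur map.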

Imperfections in the curvature of lenses can lead to image radial distortion (IRD). Computer vision applications can be drastically affected by IRD. This work proposes a novel robust radial distortion correction algorithm based on alternate optimization using two cost functions tailored for the estimation of the center of distortion and radial distortion coefficients. Qualitative and quantitative results show the competitiveness of the proposed algorithm.

Blur is one of the causes of visual discomfort in stereopsis. Sharpening with traditional algorithms can produce an interdifference that causes eyestrain and visual fatigue for the viewer. A sharpness enhancement method for stereo images that incorporates binocular vision cues and depth information is presented. Perceptual evaluation and quantitative results based on the interdifference deviation metric are reported; the results of the proposed algorithm are competitive with state-of-the-art stereo algorithms.

Digital images and videos are produced every day in astonishing amounts. Consequently, the market-driven demand for higher-quality content is constantly increasing, which leads to the need for image quality assessment (IQA) methods. A training-free, no-reference image sharpness assessment method based on the singular value decomposition of perceptually weighted normalized gradients of relevant pixels in the input image is proposed. Results over six subject-rated publicly available databases show competitive performance when compared with state-of-the-art algorithms.
Date Created
2019

Subjective and objective evaluation of visual attention models

Description
Visual attention (VA) is the study of mechanisms that allow the human visual system (HVS) to selectively process relevant visual information. This work focuses on the subjective and objective evaluation of computational VA models for the distortion-free case as well as in the presence of image distortions.

Existing VA models are traditionally evaluated using VA metrics that quantify the match between predicted saliency and fixation data obtained from eye-tracking experiments on human observers. Although a considerable number of objective VA metrics exist, no study has validated that these metrics are adequate for the evaluation of VA models. This work constructs a VA Quality (VAQ) Database by subjectively assessing the prediction performance of VA models on distortion-free images. Additionally, shortcomings of existing metrics are discussed through illustrative examples, and a new metric that uses local weights based on fixation density and overcomes these flaws is proposed. The proposed VA metric outperforms all other popular existing metrics in terms of correlation with subjective ratings.

In practice, image quality is affected by a host of factors at several stages of the image processing pipeline, such as acquisition, compression, and transmission. However, none of the existing studies has discussed the subjective and objective evaluation of visual saliency models in the presence of distortion. In this work, a Distortion-based Visual Attention Quality (DVAQ) subjective database is constructed to evaluate the quality of VA maps for images in the presence of distortions. To create this database, saliency maps obtained from images subjected to various types of distortion (including blur, noise, and compression) at varying levels of severity are rated by human observers in terms of their visual resemblance to the corresponding ground-truth fixation density maps. The performance of traditionally used as well as recently proposed VA metrics is evaluated by correlating their scores with the human subjective ratings. In addition, an objective evaluation of 20 state-of-the-art VA models is performed using the top-performing VA metrics, together with a study of how the VA models’ prediction performance changes with different types and levels of distortion.
Date Created
2016

Visual quality with a focus on 3D blur discrimination and texture granularity

Description
Blur is an important attribute in the study and modeling of the human visual system. In this work, 3D blur discrimination experiments are conducted to measure the just noticeable additional blur required to differentiate a target blur from the reference blur level. Past studies on blur discrimination have measured the sensitivity of the human visual system to blur using 2D test patterns. In this dissertation, subjective tests are performed to measure blur discrimination thresholds using stereoscopic 3D test patterns. The results of this study indicate that, in the symmetric stereo viewing case, binocular disparity does not affect the blur discrimination thresholds for the selected 3D test patterns. In the asymmetric viewing case, the blur discrimination thresholds decrease, and the decrease in threshold values is found to be dominated by the eye observing the higher blur.

The second part of the dissertation focuses on texture granularity in the context of 2D images. A texture granularity database, referred to as GranTEX and consisting of textures with varying granularity levels, is constructed. A subjective study is conducted to measure the perceived granularity level of the textures in the GranTEX database. An objective index that automatically measures the perceived granularity level of textures is also presented. It is shown that the proposed granularity metric correlates well with the subjective granularity scores and outperforms the other methods presented in the literature.

A subjective study is conducted to assess the effect of compression on textures with varying degrees of granularity. A logarithmic function model is proposed as a fit to the subjective test data. It is demonstrated that the proposed model can be used for rate-distortion control by allowing the automatic selection of the needed compression ratio for a target visual quality. The proposed model can also be used for visual quality assessment by providing a measure of the visual quality for a target compression ratio.
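
The two uses of such a model, predicting quality for a given compression ratio and selecting the compression ratio for a target quality, can be sketched as follows. The functional form quality = a + b * ln(ratio), the function names, and the data values are assumptions chosen for illustration; the abstract does not state the exact form of the fitted model, and the numbers are synthetic rather than results from the study.

```python
import numpy as np

def fit_log_model(ratio, quality):
    """Least-squares fit of quality = a + b * ln(ratio)."""
    # polyfit returns coefficients from highest degree down: slope, intercept.
    b, a = np.polyfit(np.log(ratio), quality, 1)
    return a, b

def predict_quality(a, b, ratio):
    """Visual quality predicted for a given compression ratio."""
    return a + b * np.log(ratio)

def ratio_for_quality(a, b, target):
    """Invert the model: compression ratio that yields the target quality."""
    return np.exp((target - a) / b)

# Synthetic (made-up) subjective scores that follow a log law exactly.
ratios = np.array([2.0, 4.0, 8.0, 16.0, 32.0])
scores = 5.0 - 0.8 * np.log(ratios)

a, b = fit_log_model(ratios, scores)
needed_ratio = ratio_for_quality(a, b, target=3.5)
```

Because the model is monotonic in the compression ratio, the inversion in `ratio_for_quality` is unique, which is what makes it usable for rate-distortion control.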

The effect of texture granularity on the quality of synthesized textures is studied. A subjective study is presented to assess the quality of synthesized textures with varying levels of texture granularity using different types of texture synthesis methods. This work also proposes a reduced-reference visual quality index referred to as delta texture granularity index for assessing the visual quality of synthesized textures.
Date Created
2015

Temporal coding of cortical neural signals and camera motion estimation in target tracking

Description
This dissertation has two parts. The first discusses robust signal processing algorithms that deliver consistent performance under perturbation or uncertainty in video target tracking applications. Projective distortion degrades the quality of long-sequence mosaicking, which results in the loss of important target information. Some correction techniques require prior information. A new algorithm is proposed in this dissertation to address this very issue. Optimization and parameter tuning of a robust camera motion estimation method, as well as implementation details, are discussed for a real-time application using an ordinary general-purpose computer. Performance evaluations on real-world unmanned air vehicle (UAV) videos demonstrate the robustness of the proposed algorithms.

The second part of the dissertation addresses neural signal analysis and modeling. Neural waveforms were recorded from rats' motor cortical areas while the rats performed a learning control task. Prior to analysis and modeling, the recorded neural signals are processed to detect neural action potentials, which are considered the basic computational unit of the brain. Most detection algorithms rely on simple thresholding, which can be subjective. This dissertation proposes a new detection algorithm: an automatic procedure based on the signal-to-noise ratio (SNR) of the neural waveforms. For spike sorting, this dissertation proposes a classification algorithm based on spike features in the frequency domain and an adaptive clustering method such as the self-organizing map (SOM). Another major contribution of the dissertation is the study of the functional interconnectivity of neurons in an ensemble. These functional correlations among neurons reveal spatial and temporal statistical dependencies, which in turn contribute to the understanding of the neuronal substrate of meaningful behaviors.

This dissertation also proposes a new generalized yet simple method to study the adaptation of neural ensemble activity in a rat's motor cortical areas during its cognitive learning process. Results reveal interesting temporal firing patterns underlying the behavioral learning process.
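
To illustrate what an automatic, noise-referenced spike detector can look like, the sketch below sets the threshold as a multiple of a robust noise estimate instead of a hand-picked value. This is a commonly used stand-in and not the SNR-based algorithm proposed in the dissertation; the multiplier `k`, the refractory window, and the function name are assumptions.

```python
import numpy as np

def detect_spikes(signal, k=5.0, refractory=30):
    """Return indices where |signal| exceeds k times the noise level.

    The noise level is the robust estimate median(|x|) / 0.6745; k and
    refractory (in samples) are illustrative assumptions.
    """
    sigma = np.median(np.abs(signal)) / 0.6745
    threshold = k * sigma
    above = np.flatnonzero(np.abs(signal) > threshold)
    spikes, last = [], -refractory
    for idx in above:
        if idx - last >= refractory:      # skip samples of the same spike
            spikes.append(int(idx))
            last = idx
    return np.array(spikes, dtype=int), threshold

# Synthetic trace: unit-variance noise with three injected spikes.
rng = np.random.default_rng(0)
trace = rng.normal(size=5000)
true_positions = [500, 2000, 3500]
trace[true_positions] += 15.0
found, thr = detect_spikes(trace, k=6.0)
```

Because the threshold scales with the estimated noise level of each recording, the same `k` can be reused across channels with different noise floors, which removes the subjective step of choosing a threshold by eye.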
Date Created
2012