Investigation and Analysis of Music Genre Identification via Machine Learning

Description
Modern audio datasets and machine learning software tools have given researchers deep insight into Music Information Retrieval (MIR) applications. In this paper, we investigate the accuracy and viability of a machine-learning-based approach to music genre recognition using the Free Music Archive (FMA) dataset. We compare the classification accuracy of popular machine learning models, implement various tuning techniques including principal component analysis (PCA), and analyze the effect of feature-space noise on classification accuracy.
Date Created
2019-05
Agent

Advances in Motion Estimators for Applications in Computer Vision

Description
Motion estimation is a core task in computer vision and many applications utilize optical flow methods as fundamental tools to analyze motion in images and videos. Optical flow is the apparent motion of objects in image sequences that results from relative motion between the objects and the imaging perspective. Today, optical flow fields are utilized to solve problems in various areas such as object detection and tracking, interpolation, visual odometry, etc. In this dissertation, three problems from different areas of computer vision and the solutions that make use of modified optical flow methods are explained.

The contributions of this dissertation are approaches and frameworks that introduce i) a new optical flow-based interpolation method to achieve minimally divergent velocimetry data, ii) a framework that improves the accuracy of change detection algorithms in synthetic aperture radar (SAR) images, and iii) a set of new methods to integrate Proton Magnetic Resonance Spectroscopy (1H-MRSI) data into three-dimensional (3D) neuronavigation systems for tumor biopsies.

In the first application an optical flow-based approach for the interpolation of minimally divergent velocimetry data is proposed. The velocimetry data of incompressible fluids contain signals that describe the flow velocity. The approach uses the additional flow velocity information to guide the interpolation process towards reduced divergence in the interpolated data.
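For an incompressible fluid the velocity field should have zero divergence, which is the quantity the interpolation aims to keep small. The following is only a minimal sketch of how divergence can be measured on a regular 2D grid with central differences; the function name, the grid layout, and the synthetic fields are assumptions made for illustration and are not taken from the dissertation.

```python
def divergence(u, v, dx=1.0, dy=1.0):
    """Central-difference estimate of du/dx + dv/dy at interior grid points.

    u[j][i] and v[j][i] are the x- and y-velocity components at row j, column i
    (a hypothetical layout chosen for this sketch).
    """
    rows, cols = len(u), len(u[0])
    div = [[0.0] * (cols - 2) for _ in range(rows - 2)]
    for j in range(1, rows - 1):
        for i in range(1, cols - 1):
            du_dx = (u[j][i + 1] - u[j][i - 1]) / (2.0 * dx)
            dv_dy = (v[j + 1][i] - v[j - 1][i]) / (2.0 * dy)
            div[j - 1][i - 1] = du_dx + dv_dy
    return div


# Synthetic fields on a 5x5 grid (illustrative only).
n = 5
u_field = [[float(i) for i in range(n)] for j in range(n)]    # u = x
v_free = [[-float(j) for i in range(n)] for j in range(n)]    # v = -y, divergence-free pair
v_source = [[float(j) for i in range(n)] for j in range(n)]   # v = y, not divergence-free

max_div_free = max(abs(d) for row in divergence(u_field, v_free) for d in row)
max_div_source = max(abs(d) for row in divergence(u_field, v_source) for d in row)
```

A measure like `max_div_free` could serve as a sanity check on interpolated velocimetry data: values near zero are consistent with incompressible flow, while larger values indicate divergence introduced by the interpolation.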

In the second application, a framework consisting mainly of optical flow methods, together with other image processing and computer vision techniques, is proposed to improve object extraction from synthetic aperture radar images. The framework distinguishes actual motion from motion detected because of misregistration in SAR image sets, which can lead to more accurate and meaningful change detection and improve object extraction from SAR datasets.

In the third application, a set of new methods is proposed that aims to improve upon the current state of the art in neuronavigation through the use of detailed three-dimensional (3D) 1H-MRSI data. The result is a progressive form of online MRSI-guided neuronavigation that is demonstrated through phantom validation and clinical application.
Date Created
2018
Agent

Computational Modeling and Analysis of Symmetry in Human Movements

Description
Human movement is a complex process influenced by physiological and psychological factors. The execution of movement varies from person to person, and the number of possible strategies for completing a specific movement task is almost infinite. Different choices of strategies can be perceived by humans as having different degrees of quality, and the quality can be defined with regard to aesthetic, athletic, or health-related ratings. It is useful to measure and track the quality of a person's movements for various applications, especially given the prevalence of low-cost and portable cameras and sensors today. Furthermore, based on such measurements, feedback systems can be designed for people to practice their movements towards certain goals. In this dissertation, I introduce symmetry as a family of measures for movement quality, and utilize recent advances in computer vision and differential geometry to model and analyze different types of symmetry in human movements. Movements are modeled as trajectories on different types of manifolds, according to the representations of movements from sensor data. The benefit of such a universal framework is that it can accommodate different existing and future features that describe human movements. The theory and tools developed in this dissertation will also be useful in other scientific areas to analyze symmetry from high-dimensional signals.
Date Created
2018
Agent

Data-Driven Representation Learning in Multimodal Feature Fusion

Description
Modern machine learning systems leverage data and features from multiple modalities to gain more predictive power. In most scenarios, the modalities are vastly different and the acquired data are heterogeneous in nature. Consequently, building highly effective fusion algorithms is central to achieving improved model robustness and inference performance. This dissertation focuses on representation learning approaches as the fusion strategy. Specifically, the objective is to learn a shared latent representation that jointly exploits the structural information encoded in all modalities, such that a straightforward learning model can be adopted to obtain the prediction.

We first consider sensor fusion, a typical multimodal fusion problem critical to building a pervasive computing platform. A systematic fusion technique is described to support both multiple sensors and multiple descriptors for activity recognition. Multiple Kernel Learning (MKL) algorithms, which aim to learn the optimal combination of kernels, have been successfully applied to numerous fusion problems in computer vision and other areas. Utilizing the MKL formulation, we next describe an auto-context algorithm for learning image context via fusion with low-level descriptors. Furthermore, a principled fusion algorithm using deep learning to optimize kernel machines is developed. By bridging deep architectures with kernel optimization, this approach leverages the benefits of both paradigms and is applied to a wide variety of fusion problems.

In many real-world applications, the modalities exhibit highly specific data structures, such as time sequences and graphs, and consequently, special design of the learning architecture is needed. In order to improve the temporal modeling of multivariate sequences, we developed two architectures centered around attention models. A novel clinical time series analysis model is proposed for several critical problems in healthcare. Another model, coupled with a triplet ranking loss as a metric learning framework, is described to better solve speaker diarization. Compared to state-of-the-art recurrent networks, these attention-based multivariate analysis tools achieve improved performance while having a lower computational complexity. Finally, in order to perform community detection on multilayer graphs, a fusion algorithm is described that derives node embeddings from word embedding techniques and also exploits the complementary relational information contained in each layer of the graph.
Date Created
2018
Agent

Process Control Applications in Microbial Fuel Cells (MFC)

Description
Microbial fuel cells (MFC) use micro-organisms called anode-respiring bacteria (ARB) to convert chemical energy into electrical energy. This process can not only treat wastewater but can also produce hydrogen peroxide (H2O2) as a useful byproduct. Process variables like anode potential and pH play an important role in MFC operation, and the focus of this dissertation is on the pH and potential control problems.

Most adaptive pH control solutions use signal-based norms as cost functions, but their strong dependency on excitation signal properties makes them sensitive to noise, disturbances, and modeling errors. System-based-norm (H-infinity) cost functions provide a viable alternative for the adaptation, as they are less susceptible to the signal properties. Two variants of adaptive pH control algorithms that use approximate H-infinity frequency loop-shaping (FLS) cost metrics are proposed in this dissertation.

A pH neutralization process with high retention time is studied using lab-scale experiments, and the experimental setup is used as a basis to develop a first-principles model. The analysis of this model shows that only the gain of the process varies significantly with operating conditions and with buffering capacity. Consequently, adapting the controller gain (a single parameter) is sufficient to compensate for the variation in process gain, and the proposed algorithms focus on adapting the PI controller gain. Computer simulations and lab-scale experiments are used to study the tracking, disturbance rejection, and adaptation performance of these algorithms under different excitation conditions. Results show that the proposed algorithm produces an optimum that is less dependent on the excitation than that of a commonly used L2-cost-function-based algorithm, and that it tracks set-points reasonably well under practical conditions. The proposed direct pH control algorithm is integrated with the combined activated sludge anaerobic digestion model (CASADM) of an MFC, and it is shown that pH control improves its performance.
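The claim that adapting a single controller gain can compensate for a varying process gain can be illustrated with a toy simulation. The sketch below assumes a first-order process and a known process gain, and simply scales the PI gain inversely with it; it does not implement the H-infinity loop-shaping adaptation proposed in the dissertation, and all names and numerical values are hypothetical.

```python
def simulate_pi(process_gain, controller_gain, integral_time=1.0,
                time_constant=1.0, setpoint=1.0, dt=0.01, steps=2000):
    """Euler simulation of a first-order process under PI control.

    Assumed process:  time_constant * dy/dt = -y + process_gain * u
    PI controller:    u = controller_gain * (e + integral(e) / integral_time)
    """
    y, integral, trajectory = 0.0, 0.0, []
    for _ in range(steps):
        error = setpoint - y
        integral += error * dt
        u = controller_gain * (error + integral / integral_time)
        y += dt * (-y + process_gain * u) / time_constant
        trajectory.append(y)
    return trajectory


# Hypothetical target for the product process_gain * controller_gain.
nominal_loop_gain = 2.0

# Two operating points with different process gains; the controller gain is
# scaled inversely so that the loop gain stays the same.
low = simulate_pi(process_gain=0.5, controller_gain=nominal_loop_gain / 0.5)
high = simulate_pi(process_gain=4.0, controller_gain=nominal_loop_gain / 4.0)

# Same high-gain process, but with the controller left at the low-gain tuning.
untuned = simulate_pi(process_gain=4.0, controller_gain=nominal_loop_gain / 0.5)

gap_adapted = max(abs(a - b) for a, b in zip(low, high))
gap_untuned = max(abs(a - b) for a, b in zip(low, untuned))
```

In this toy model the adapted responses coincide (`gap_adapted` is essentially zero) because the closed loop depends only on the product of process gain and controller gain, whereas the untuned controller produces a visibly different response. In practice the process gain is unknown, which is why an adaptation law is needed.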

Analytical-grade potentiostats are commonly used in MFC potential control, but their high cost (>$6000) and large size make them nonviable for field use. This dissertation proposes an alternative low-cost ($200) portable potentiostat solution. This potentiostat is tested using a ferricyanide reactor, and results show that its performance is close to that of an analytical-grade potentiostat.
Date Created
2018
Agent

Evaluation of an Original Design for a Cost-Effective Wheel-Mounted Dynamometer for Road Vehicles

Description
This thesis evaluates the viability of an original design for a cost-effective wheel-mounted dynamometer for road vehicles. The goal is to show whether or not a device that generates torque and horsepower curves by processing accelerometer data collected at the edge of a wheel can yield results that are comparable to results obtained using a conventional chassis dynamometer. Torque curves were generated via the experimental method under a variety of circumstances and also obtained professionally by a precision engine testing company. Metrics were created to measure the precision of the experimental device's ability to consistently generate torque curves and also to compare the similarity of these curves to the professionally obtained torque curves. The results revealed that although the test device does not quite provide the same level of precision as the professional chassis dynamometer, it does create torque curves that closely resemble the chassis dynamometer torque curves and exhibit a consistency between trials comparable to the professional results, even on rough road surfaces. The results suggest that the test device provides enough accuracy and precision to satisfy the needs of most consumers interested in measuring their vehicle's engine performance but probably lacks the level of accuracy and precision needed to appeal to professionals.
Date Created
2018-05
Agent

Robust distributed parameter estimation in wireless sensor networks

Description
Fully distributed wireless sensor networks (WSNs) without a fusion center have advantages such as scalability in network size and energy efficiency in communications. Each sensor shares its data only with neighbors and then achieves global consensus quantities by in-network processing. This dissertation considers robust distributed parameter estimation methods, seeking global consensus on parameters of adaptive learning algorithms and statistical quantities.

A diffusion adaptation strategy with nonlinear transmission is proposed. The nonlinearity is motivated by the necessity for bounded transmit power, as sensors need to communicate with each other iteratively and energy-efficiently. Despite the nonlinearity, it is shown that the algorithm performs close to the linear case with the added advantage of power savings. This dissertation also discusses convergence properties of the algorithm in the mean and the mean-square sense.

Often, the average is used to measure the central tendency of sensed data over a network. When there are outliers in the data, however, the average can be highly biased. Alternative metrics that are robust against outliers are the median, the mode, and the trimmed mean. Quantiles generalize the median, and they can also be used to compute the trimmed mean. A consensus-based distributed quantile estimation algorithm is proposed and applied, through simulation, to finding trimmed-mean, median, maximum, or minimum values and to identifying outliers. It is shown that the estimated quantities are asymptotically unbiased and converge toward the sample quantile in the mean-square sense. Step-size sequences with proper decay rates are also discussed for convergence analysis.
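The idea of consensus-based quantile estimation with a decaying step size can be sketched as follows. This is an illustrative stochastic-approximation scheme on a ring network, written under assumed choices (equal-weight neighbor averaging, a step size proportional to 1/t, one measurement per node); it is not claimed to be the algorithm analyzed in the dissertation.

```python
def consensus_quantile(measurements, p, iterations=20000, step_scale=10.0):
    """Each node i keeps an estimate theta[i] of the p-quantile of all data.

    Per iteration a node averages with its two ring neighbours (consensus step)
    and then nudges the result using only its own measurement (local update).
    """
    n = len(measurements)
    theta = [float(x) for x in measurements]  # start from the local measurement
    for t in range(iterations):
        step = step_scale / (t + 1)  # decaying step size (assumed schedule)
        mixed = [(theta[i - 1] + theta[i] + theta[(i + 1) % n]) / 3.0
                 for i in range(n)]
        theta = [
            mixed[i] + step * (p - (1.0 if measurements[i] <= theta[i] else 0.0))
            for i in range(n)
        ]
    return theta


# Ten sensors, one of which reports an outlier (illustrative data).
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
average = sum(data) / len(data)                # pulled upward by the outlier
median_estimates = consensus_quantile(data, p=0.5)
```

With these settings the node estimates should settle close to the sample median rather than the outlier-biased average, and changing `p` targets other quantiles.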

Another measure of central tendency is the mode, which represents the most probable value and can also be robust to outliers and other contamination in the data. The proposed distributed mode estimation algorithm achieves a global mode by recursively shifting the conditional mean of the measurement data until it converges to a stationary point of the estimated density function. It is also possible to estimate the mode by using a grid vector together with a kernel density estimator. The densities are estimated at each grid point, while the points are updated until they converge to a global mode.
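Recursively shifting a kernel-weighted mean until it reaches a stationary point of the estimated density is the idea behind mean-shift mode seeking. The sketch below is a centralized, one-dimensional version with a Gaussian kernel; the bandwidth, the starting point, and the data are assumptions for illustration, and the distributed (in-network) aspect of the proposed algorithm is not reproduced here.

```python
import math


def mean_shift_mode(samples, start, bandwidth=1.0, tol=1e-9, max_iter=1000):
    """Shift x to the Gaussian-weighted mean of the samples until it stops moving."""
    x = float(start)
    for _ in range(max_iter):
        weights = [math.exp(-0.5 * ((s - x) / bandwidth) ** 2) for s in samples]
        shifted = sum(w * s for w, s in zip(weights, samples)) / sum(weights)
        if abs(shifted - x) < tol:
            return shifted
        x = shifted
    return x


# Measurements clustered near 5 with two large outliers (illustrative data).
samples = [4.8, 4.9, 5.0, 5.1, 5.2, 12.0, 13.0]
arithmetic_mean = sum(samples) / len(samples)
start = sorted(samples)[len(samples) // 2]   # begin the search at the median
mode_estimate = mean_shift_mode(samples, start)
```

Here the arithmetic mean is dragged toward the outliers, while the mode estimate stays with the dense cluster. The result depends on the bandwidth and the starting point, since mean shift converges to a local stationary point.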
Date Created
2017
Agent

Diversity Promoting Online Sampling for Streaming Video Summarization

Description
Video summarization is gaining popularity in the technological culture, where positioning the mouse pointer on top of a video results in a quick overview of what the video is about. The algorithm usually selects frames in a time sequence through systematic sampling. There are also other applications, such as video surveillance, web-based video surfing, and video archival, which can benefit from efficient and concise video summaries. In this project, we explored several clustering algorithms and how these can be combined and deconstructed to make the summarization algorithm more efficient and relevant. We focused on two objectives: reducing error and reducing redundancy in the summary. To reduce the error, an online k-means clustering algorithm was used; to reduce redundancy, we applied two different methods: the volume of convex hulls and the true diversity measure that is usually used in biological disciplines. The algorithm was efficient and computationally cost-effective due to its online nature. Diversity maximization (or redundancy reduction) using the convex hull volume technique showed better results than other conventional methods on 50 different videos. For the true diversity measure, not much work has been done on the nature of the measure in the context of video summarization. When we applied it, the algorithm stalled because the true diversity saturated as a result of the inherent initialization present in the algorithm. We explored the nature of this measure to better understand how it can help make summarization more intuitive and give the user a handle to customize the summary.
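The online nature of the clustering step can be illustrated with sequential k-means, where each incoming frame descriptor updates only its nearest centroid. This is a generic sketch, assuming frames are represented as small feature vectors and that the first k frames seed the centroids; it is not the project's implementation and omits the diversity terms.

```python
def online_kmeans(stream, k):
    """Sequential k-means: one pass over the stream, constant memory per centroid."""
    centroids, counts = [], []
    for x in stream:
        if len(centroids) < k:
            # Seed the centroids with the first k items (an assumed initialization).
            centroids.append([float(a) for a in x])
            counts.append(1)
            continue
        nearest = min(
            range(k),
            key=lambda c: sum((a - b) ** 2 for a, b in zip(centroids[c], x)),
        )
        counts[nearest] += 1
        rate = 1.0 / counts[nearest]
        # Running-mean update of the winning centroid.
        centroids[nearest] = [c + rate * (a - c) for c, a in zip(centroids[nearest], x)]
    return centroids


# Hypothetical 2-D frame descriptors arriving one at a time from two scenes.
frames = [(0, 0), (10, 10), (0, 1), (10, 11), (1, 0), (11, 10)]
centroids = online_kmeans(frames, k=2)
```

Each frame is seen once and then discarded, which is what makes the approach suitable for streaming video. A summary could then be formed, for example, by keeping the frame closest to each centroid.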
Date Created
2017-05
Agent

Measuring Glide Reflection Symmetry in Human Movements

Description
Many studies on human walking patterns assume that adult gait is characterized by bilaterally symmetrical behavior. It is well understood that maintaining symmetry in walking patterns increases energetic efficiency. We present a framework to provide a quantitative assessment of human walking patterns, especially assessments related to symmetric and asymmetric gait patterns, based purely on glide reflection. A gliding symmetry score is calculated from data obtained from a motion capture (MoCap) system. Six primary joints (shoulder, elbow, palm, hip, knee, foot) are considered in this study. Two different abnormalities were chosen and studied carefully. Both abnormal gaits were mimicked in a controlled environment. The proposed framework clearly distinguished the abnormal gaits from ordinary walking patterns. This framework can be widely used by doctors and physical therapists for kinematic analysis, biomechanics, motion capture research, sports medicine, and physical therapy, including human gait analysis and injury rehabilitation.
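A glide reflection combines a mirror reflection with a translation; for gait, this corresponds to the right side repeating the mirrored motion of the left side half a cycle later. The abstract does not define the score, so the following is a purely hypothetical score built on that idea: the mean distance between one side's joint trajectory and the mirrored, half-cycle-shifted trajectory of the other side. The coordinate convention and the synthetic data are assumptions.

```python
import math


def glide_symmetry_score(left, right, half_period):
    """Hypothetical glide-reflection score; 0 means perfectly glide-symmetric.

    left, right: lists of (x, y) joint positions over one full gait cycle,
    with x the lateral coordinate measured from the body's midline.
    """
    n = len(left)
    total = 0.0
    for t in range(n):
        lx, ly = left[(t + half_period) % n]   # left side, half a cycle later
        rx, ry = right[t]
        total += math.hypot(rx - (-lx), ry - ly)  # compare with the mirrored point
    return total / n


period = 20
left_knee = [(0.2, math.sin(2 * math.pi * t / period)) for t in range(period)]

# Right knee of a symmetric gait: mirrored and shifted by half a cycle.
symmetric_right = [
    (-0.2, math.sin(2 * math.pi * ((t + period // 2) % period) / period))
    for t in range(period)
]
# Right knee with reduced range of motion, standing in for an abnormal gait.
limping_right = [(x, 0.5 * y) for x, y in symmetric_right]

score_symmetric = glide_symmetry_score(left_knee, symmetric_right, period // 2)
score_limping = glide_symmetry_score(left_knee, limping_right, period // 2)
```

In this toy setting the symmetric gait scores zero and the reduced-motion gait scores noticeably higher. A real MoCap pipeline would also need cycle segmentation and 3-D joint data, which are not covered here.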
Date Created
2017
Agent

Locally Adaptive Stereo Vision Based 3D Visual Reconstruction

Description
Using stereo vision for 3D reconstruction and depth estimation has become a popular and promising research area, as it has a simple setup with passive cameras and a relatively efficient processing procedure. The work in this dissertation focuses on locally adaptive stereo vision methods and applications to different imaging setups and image scenes.

Solder ball height and substrate coplanarity inspection is essential to the detection of potential connectivity issues in semiconductor units. Current ball height and substrate coplanarity inspection tools are expensive and slow, which makes them difficult to use in a real-time manufacturing setting. In this dissertation, an automatic, stereo-vision-based, in-line ball height and coplanarity inspection method is presented. The proposed method includes an imaging setup together with a computer vision algorithm for reliable, in-line ball height measurement. The imaging setup and calibration, ball height estimation, and substrate coplanarity calculation are presented with novel stereo vision methods. The results of the proposed method are evaluated in a measurement capability analysis (MCA) procedure and compared with the ground truth obtained by an existing laser scanning tool and an existing confocal inspection tool. The proposed system outperforms existing inspection tools in terms of accuracy and stability.

In a rectified stereo vision system, stereo matching methods can be categorized into global methods and local methods. Local stereo methods are more suitable for real-time processing and offer accuracy competitive with global methods. This work proposes a stereo matching method based on sparse locally adaptive cost aggregation. In order to reduce outlier disparity values that correspond to mismatches, a novel sparse disparity subset selection method is proposed that assigns a significance status to candidate disparity values and selects the significant disparity values adaptively. An adaptive guided filtering method that uses the disparity subset for refined cost aggregation and disparity calculation is demonstrated. The proposed stereo matching algorithm is tested on the Middlebury and the KITTI stereo evaluation benchmark images. A performance analysis of the proposed method in terms of the l0 norm of the disparity subset is presented to demonstrate the achieved efficiency and accuracy.
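For context, a local stereo method in its simplest form aggregates a matching cost over a small window for each candidate disparity and picks the lowest-cost candidate. The sketch below shows this baseline on a single rectified scanline using a sum of absolute differences; it is a generic winner-take-all matcher, not the sparse locally adaptive aggregation or guided filtering proposed in the dissertation, and the scanline data are synthetic.

```python
def best_disparity(left, right, x, max_disparity, radius=1):
    """Winner-take-all disparity for pixel x of the left scanline.

    The cost of candidate d is the sum of absolute differences between the
    window around x in the left line and the window around x - d in the right.
    """
    best_d, best_cost = None, float("inf")
    lo, hi = x - radius, x + radius
    if lo < 0 or hi >= len(left):
        return None  # window does not fit inside the scanline
    for d in range(max_disparity + 1):
        if lo - d < 0:
            continue  # shifted window would leave the right scanline
        cost = sum(abs(left[i] - right[i - d]) for i in range(lo, hi + 1))
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


# Synthetic rectified scanlines: the right view is the left view shifted by 2 pixels.
left_line = [0, 0, 0, 5, 9, 5, 0, 0, 0, 0]
right_line = [0, 5, 9, 5, 0, 0, 0, 0, 0, 0]

disparity_at_peak = best_disparity(left_line, right_line, x=4, max_disparity=3)
disparities = [best_disparity(left_line, right_line, x, max_disparity=3) for x in (3, 4, 5)]
```

Restricting the candidate loop to a smaller subset of disparities, as the abstract describes, would reduce the number of cost evaluations per pixel; how that subset is chosen is specific to the proposed method and is not modeled here.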
Date Created
2017
Agent