161220-Thumbnail Image.png
Description

Classification in machine learning is quite crucial to solve many problems that the world is presented with today. Therefore, it is key to understand one’s problem and develop an efficient model to achieve a solution. One technique to achieve greater

Classification in machine learning is quite crucial to solve many problems that the world is presented with today. Therefore, it is key to understand one’s problem and develop an efficient model to achieve a solution. One technique to achieve greater model selection and thus further ease in problem solving is estimation of the Bayes Error Rate. This paper provides the development and analysis of two methods used to estimate the Bayes Error Rate on a given set of data to evaluate performance. The first method takes a “global” approach, looking at the data as a whole, and the second is more “local”—partitioning the data at the outset and then building up to a Bayes Error Estimation of the whole. It is found that one of the methods provides an accurate estimation of the true Bayes Error Rate when the dataset is at high dimension, while the other method provides accurate estimation at large sample size. This second conclusion, in particular, can have significant ramifications on “big data” problems, as one would be able to clarify the distribution with an accurate estimation of the Bayes Error Rate by using this method.

Reuse Permissions


  • Download restricted.
    Restrictions Statement

    Barrett Honors College theses and creative projects are restricted to ASU community members.

    Details

    Title
    • Characterizing the Performance of Machine Learning Algorithms: A Study and Novel Techniques
    Contributors
    Date Created
    2021-12
    Resource Type
  • Text
  • Machine-readable links