    Capreomycin resistance prediction in two species of Mycobacterium using a stacked ensemble method

    File
    khaledian19.pdf (517.4Kb)
    Date
    2019
    Author
    Chowdhury, Abu Sayed
    Khaledian, Ehdieh
    Broschat, Shira L.
    Abstract
    Aims: Predicting bacterial resistance provides valuable information that can assist in clinical decisions. With recent advances in whole genome sequencing technology, the detection of antibiotic resistance (AR) proteins directly from genomic data is becoming feasible. AR genes/proteins can be identified using best‐hit methods that work by comparing candidate sequences with known AR genes in public databases. However, these approaches may fail to detect resistance genes with sequences that differ significantly from known sequences. Our goal is to develop a machine learning technique to accurately predict capreomycin resistance in Mycobacteria with low false discovery rates.

    Methods and Results: We present a stacked ensemble learning model as an alternative to traditional DNA sequence alignment‐based methods using optimal features generated from the physicochemical, evolutionary and secondary structure properties of protein sequences. We train logistic regression, C5.0 and support vector machine (SVM) algorithms as our base classifiers, and our stacked ensemble predictors combine the results from the base classifiers to achieve higher accuracy. Compared with our most accurate base classifier (SVM), our most accurate stacked ensemble predictor increases training accuracy by 2·43%. Our stacked ensemble predictors achieve test accuracy up to 81·25%.

    Conclusions: We developed a stacked ensemble model to predict capreomycin resistance for Mycobacteria with an accuracy >80% using protein sequences with sequence similarity ranging between 10% and 70%. This performance cannot be achieved with best‐hit methods due to differences in sequence similarity.

    Significance and Impact of the Study: Today an estimated one‐half million cases of multidrug‐resistant (MDR) and extensively drug‐resistant (XDR) tuberculosis (TB) occur annually worldwide at a great cost. Because capreomycin is a second‐line drug used to treat drug‐resistant TB, the ability to use a machine learning approach to classify capreomycin‐resistant TB in a timely manner is crucial for the successful treatment of MDR or XDR TB.
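
    The stacking idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scikit-learn is available, uses synthetic data from make_classification in place of the protein-sequence features, and substitutes a CART decision tree for C5.0, which scikit-learn does not provide. The feature counts, hyperparameters and variable names are all illustrative assumptions. Three base classifiers are trained, a logistic regression meta-learner combines their cross-validated outputs, and accuracy and false discovery rate are computed on a held-out split.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (NOT the study's data): 1 = resistant, 0 = susceptible.
X, y = make_classification(
    n_samples=200, n_features=20, n_informative=8, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Base classifiers. The decision tree is an assumed stand-in for C5.0.
base_classifiers = [
    ("logreg", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
    ("tree", DecisionTreeClassifier(max_depth=4, random_state=0)),
    ("svm", make_pipeline(StandardScaler(), SVC(kernel="rbf"))),
]

# The meta-learner is fitted on out-of-fold outputs of the base classifiers
# (5-fold cross-validation), so it never sees predictions made on data the
# base models were trained on.
stack = StackingClassifier(
    estimators=base_classifiers,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
stack.fit(X_train, y_train)
y_pred = stack.predict(X_test)

# Accuracy and false discovery rate, FDR = FP / (FP + TP).
accuracy = accuracy_score(y_test, y_pred)
tn, fp, fn, tp = confusion_matrix(y_test, y_pred, labels=[0, 1]).ravel()
fdr = fp / (fp + tp) if (fp + tp) else 0.0

print(f"test accuracy: {accuracy:.4f}")
print(f"false discovery rate: {fdr:.4f}")
```

    The numbers printed by this sketch depend entirely on the synthetic data and say nothing about the results reported in the abstract.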
    URI
    http://hdl.handle.net/2376/17921
    Collections
    • Broschat, Shira