Abstract
This paper analyzes boosting in unscaled versions of ROC spaces, also referred to as PN spaces. It analyzes a minor revision to AdaBoost's reweighting strategy that allows the algorithm to be reformulated in terms of stratification and the boosting process to be visualized in nested PN spaces, as known from divide-and-conquer rule learning. The analyzed confidence-rated algorithm is proven to take more advantage of its base models in each iteration, although it also searches a space of linear combinations of discrete base classifiers. The algorithm reduces the training error more quickly without losing any of the advantages of the original AdaBoost. The PN space interpretation allows a lower bound to be derived for the area under the ROC curve (AUC) of the resulting ensembles, based on the AUC after reweighting. The theoretical findings are complemented by an empirical evaluation on benchmark datasets.
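To make the notion of a PN space concrete, the following minimal sketch traces a scored ranking through an unscaled ROC space, where the axes are absolute counts of covered negatives and covered positives rather than rates, and then normalizes the area under that curve by P * N to obtain the AUC. It is an illustration of the general concept only, not the algorithm analyzed in the paper; the function names `pn_curve` and `auc_from_pn` and the toy data are assumptions introduced here.

```python
from typing import List, Sequence, Tuple


def pn_curve(scores: Sequence[float], labels: Sequence[int]) -> List[Tuple[int, int]]:
    """Trace a ranking through PN space (hypothetical helper, not from the paper).

    Each point is (covered negatives, covered positives) in absolute counts,
    obtained by lowering the score threshold one distinct score at a time.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0, 0)]
    neg = pos = 0
    i = 0
    while i < len(ranked):
        j = i
        # Examples with tied scores are covered together, giving one diagonal step.
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1] == 1:
                pos += 1
            else:
                neg += 1
            j += 1
        points.append((neg, pos))
        i = j
    return points


def auc_from_pn(points: Sequence[Tuple[int, int]]) -> float:
    """Area under the PN curve (trapezoidal rule), rescaled by P * N."""
    total_neg, total_pos = points[-1]
    if total_neg == 0 or total_pos == 0:
        raise ValueError("AUC is undefined without both classes")
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area / (total_pos * total_neg)


# Toy data (assumed for illustration): label 1 is positive, 0 is negative.
curve = pn_curve([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
auc = auc_from_pn(curve)
```

Because the PN curve differs from the ROC curve only by the scaling of its axes, dividing the area by P * N recovers the usual AUC.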
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Scholz, M. (2006). Boosting in PN Spaces. In: Fürnkranz, J., Scheffer, T., Spiliopoulou, M. (eds) Machine Learning: ECML 2006. ECML 2006. Lecture Notes in Computer Science, vol 4212. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11871842_37
DOI: https://doi.org/10.1007/11871842_37
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-45375-8
Online ISBN: 978-3-540-46056-5
eBook Packages: Computer Science, Computer Science (R0)