Please use this identifier to cite or link to this item:
https://www.um.edu.mt/library/oar/handle/123456789/108328

| Title: | P2P loan repayment prediction with imbalanced training sets |
| Authors: | Mizzi, Bernard (2022) |
| Keywords: | Peer-to-peer loans; Default (Finance) -- Forecasting; Machine learning; Logistic regression analysis |
| Issue Date: | 2022 |
| Citation: | Mizzi, B. (2022). P2P loan repayment prediction with imbalanced training sets (Master's dissertation). |
| Abstract: | Loan defaulting was one of the major causes leading to the Great Recession of 2008-2009. Having systems which correctly identify loan defaulters is essential to the financial markets to avoid major losses which might negatively impact the economy. Recent advancements in technology have resulted in the creation of online platforms on which people can apply for loans. These platforms are known as online Peer-to-Peer (P2P) lending platforms. Loans issued through these types of platforms are normally unsecured, and, thus, it is crucial to correctly identify loan defaulters so that lenders avoid losses. Using the data obtained from a P2P lending platform based in the USA, we apply machine learning techniques to predict defaulted loans in the P2P lending environment. We investigate the role of data preparation and training-testing selection techniques to improve the predictive capability of a classifier. Due to having a disproportionate number of defaulted loans, such environments suffer from the class imbalance problem. Hence, we also include sampling techniques to tackle class imbalance. We also treat the problem as the Maximum Diversity Problem (MDP) to extract the most diverse set so that it can be used for training. Furthermore, we also adopted a strategy to group the data separately for both training and testing according to certain feature values. Finally, we also included a method which constantly updates a classifier with the new and latest data to be able to cope with concept drift. We discovered that applying the dynamic approach is effective in such environments. We combine this dynamic approach with existing classifiers which outperform the traditional machine learning techniques. The results showed that Neural Networks combined with the proposed dynamic approach outperform the traditional Logistic Regression and Random Forest. The hypothesis tests which we performed also indicated that dynamic models outperform static ones. Such a dynamic approach can be implemented by these online P2P lending platforms so that losses incurred due to defaulted loans are lowered, while also increasing their platform rating. |
| Description: | M.Sc.(Melit.) |
| URI: | https://www.um.edu.mt/library/oar/handle/123456789/108328 |
| Appears in Collections: | Dissertations - FacICT - 2022; Dissertations - FacICTAI - 2022 |
Files in This Item:
| File | Description | Size | Format | |
|---|---|---|---|---|
| 2219ICTICS520000008496_1.PDF | | 5.4 MB | Adobe PDF | View/Open |
Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.
