Stroke Prediction
| | Dataset | GitHub | Report | Slides | Video |
Abstract
Stroke remains one of the leading causes of mortality worldwide, accounting for nearly 11% of all deaths. Early detection and prevention are crucial in mitigating its impact. This project presents a comprehensive stroke prediction system that leverages multiple machine learning models to identify individuals at high risk of stroke based on health and lifestyle parameters. The system is built using a real-world healthcare dataset containing patient attributes such as age, gender, hypertension, heart disease, marital status, work type, residence, average glucose level, BMI, and smoking status.
To enhance prediction accuracy and robustness, we implemented and compared several machine learning algorithms: K-Nearest Neighbors (KNN), Logistic Regression, Naive Bayes, a Decision Tree Classifier, and a Neural Network. These models were evaluated using accuracy, precision, recall, and F1-score. Because stroke cases form a small minority of the dataset, we addressed class imbalance with techniques such as class weighting.
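As a rough illustration of the class weighting and evaluation metrics mentioned above, the sketch below computes "balanced" class weights and the four metrics in plain Python. The helper names (`balanced_class_weights`, `binary_metrics`) and the toy labels are hypothetical and are not taken from the project code. The weighting formula is the common `n_samples / (n_classes * class_count)` heuristic, which may differ from the scheme actually used in the project.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels for illustration only (not the project's data): 1 = stroke.
y_true = [0] * 18 + [1] * 2
# A degenerate model that always predicts "no stroke".
y_pred_majority = [0] * 20

weights = balanced_class_weights(y_true)
metrics = binary_metrics(y_true, y_pred_majority)
print(weights)
print(metrics)
```

On these toy labels, the always-negative predictor reaches high accuracy while its recall on the stroke class is zero, which is why accuracy alone is a weak signal on imbalanced data and why the minority class receives the larger weight.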
Furthermore, dimensionality reduction techniques were explored to improve model performance and training efficiency. Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) were used to reduce the feature space while preserving essential variance and class-separating information, respectively. The impact of these techniques was analyzed by comparing model accuracy before and after the transformation.
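The difference between the two reductions can be sketched with NumPy on synthetic data. This is a minimal illustration under stated assumptions, not the project's pipeline: the function names (`pca_project`, `fisher_lda_direction`), the random data, and the choice of two principal components are all hypothetical. PCA ignores the labels and keeps directions of maximum variance, while LDA uses the labels. With two classes (stroke / no stroke), LDA yields at most one discriminant direction.

```python
import numpy as np


def pca_project(X, n_components):
    """Project X onto its top principal components.

    Returns the projected data and the explained-variance ratio
    of each kept component.
    """
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing singular value.
    _, s, Vt = np.linalg.svd(X_centered, full_matrices=False)
    explained = (s ** 2) / np.sum(s ** 2)
    return X_centered @ Vt[:n_components].T, explained[:n_components]


def fisher_lda_direction(X, y):
    """Two-class Fisher discriminant: w is proportional to Sw^{-1} (mu1 - mu0)."""
    X0, X1 = X[y == 0], X[y == 1]
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    # Within-class scatter matrix, summed over both classes.
    Sw = (X0 - mu0).T @ (X0 - mu0) + (X1 - mu1).T @ (X1 - mu1)
    w = np.linalg.solve(Sw, mu1 - mu0)
    return w / np.linalg.norm(w)


# Synthetic data for illustration only: 4 features, classes shifted on feature 0.
rng = np.random.default_rng(0)
X_neg = rng.normal(0.0, 1.0, size=(200, 4))
X_pos = rng.normal(0.0, 1.0, size=(200, 4)) + np.array([3.0, 0.0, 0.0, 0.0])
X = np.vstack([X_neg, X_pos])
y = np.array([0] * 200 + [1] * 200)

Z_pca, ratio = pca_project(X, n_components=2)  # unsupervised: labels unused
w = fisher_lda_direction(X, y)                 # supervised: labels used
z_lda = X @ w                                  # one-dimensional projection

print("PCA output shape:", Z_pca.shape)
print("Explained variance ratio:", ratio)
print("LDA direction:", w)
```

A classifier can then be trained on `Z_pca` or `z_lda` instead of `X`, and its accuracy compared with the unreduced baseline.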
Our findings highlight the effectiveness of combining classical machine learning models with dimensionality reduction strategies for early stroke prediction. The project underscores the potential of data-driven approaches in supporting preventive healthcare and aiding timely medical intervention for stroke-prone individuals.
Performance Metrics of Different Models
| Model | Accuracy | Accuracy (with PCA) | Accuracy (with LDA) |
|---|---|---|---|
| Naive Bayes | 71.61% | - | 61.61% |
| Logistic Regression | - | 69.76% | 72.79% |
| KNN | 75.25% | - | - |
| Decision Tree | 96% to 91% (varies with tree depth) | - | - |
| Neural Network | 95.11% | - | - |
Team Members
Acknowledgement
We would like to express our sincere gratitude to our course instructor, Anand Mishra, for his invaluable guidance, support, and feedback throughout this project. His expertise and insights were instrumental in shaping our understanding of machine learning concepts and their practical applications in the healthcare domain.
We would also like to thank the open-source community for developing and maintaining the libraries and datasets that were crucial to the implementation and analysis of our stroke prediction project. Specifically, we acknowledge the use of the Stroke Prediction Dataset in our project.
Furthermore, we would like to thank our peers and colleagues for their constructive discussions, feedback, and suggestions, which significantly improved our project and deepened our grasp of the subject matter.