This project classifies news articles as real or fake using four classical machine learning algorithms (Random Forest, Support Vector Machine, Naive Bayes, and Gradient Boosting) and a Long Short-Term Memory (LSTM) neural network model.
The dataset used in this project contains news articles along with their headlines, bodies, and labels indicating whether they are real or fake. The dataset was preprocessed by combining the headlines and bodies, filtering out samples with empty text, and encoding the labels.
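The preprocessing steps described above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the field names `headline`, `body`, and `label`, the label strings `real`/`fake`, and the integer mapping in `LABEL_TO_ID` are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

# Assumed label encoding; the project does not state which class maps to which integer.
LABEL_TO_ID: Dict[str, int] = {"real": 0, "fake": 1}


def preprocess(rows: List[Dict[str, str]]) -> Tuple[List[str], List[int]]:
    """Combine headline and body, drop empty samples, and encode labels."""
    texts: List[str] = []
    labels: List[int] = []
    for row in rows:
        # Join headline and body into one text field; missing values become "".
        headline = (row.get("headline") or "").strip()
        body = (row.get("body") or "").strip()
        text = f"{headline} {body}".strip()
        if not text:
            continue  # filter out samples with empty text
        texts.append(text)
        # Normalize the label string before looking up its integer id.
        labels.append(LABEL_TO_ID[row["label"].strip().lower()])
    return texts, labels


# Hypothetical sample rows, invented for illustration only.
sample_rows = [
    {"headline": "Council approves budget", "body": "The vote was 7 to 2.", "label": "real"},
    {"headline": "", "body": "   ", "label": "fake"},  # empty text, dropped
    {"headline": "Miracle cure found", "body": "", "label": "Fake"},
]
texts, labels = preprocess(sample_rows)
print(texts)
print(labels)
```

Keeping the text and label lists aligned by index makes it straightforward to pass them to a train/test split afterwards.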
LSTM Model:
- Accuracy: 98.37% on the test data, after training for 5 epochs.

Random Forest:
- Accuracy: 91.10% on the test data.

Support Vector Machine (SVM):
- Accuracy: 74.81% on the test data.

Naive Bayes:
- Accuracy: 62.66% on the test data.

Gradient Boosting:
- Accuracy: 90.10% on the test data.
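For reference, accuracy here is the fraction of test samples whose predicted label matches the true label. The sketch below shows that calculation on made-up toy labels and then ranks the accuracies reported above; the helper names `accuracy` and `ranked` and the toy data are hypothetical and not taken from the project code.

```python
from typing import Dict, List, Sequence, Tuple


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("cannot compute accuracy on an empty test set")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def ranked(results: Dict[str, float]) -> List[Tuple[str, float]]:
    """Sort (model, accuracy) pairs from highest to lowest accuracy."""
    return sorted(results.items(), key=lambda kv: kv[1], reverse=True)


# Test accuracies as reported above, in percent.
REPORTED: Dict[str, float] = {
    "LSTM": 98.37,
    "Random Forest": 91.10,
    "SVM": 74.81,
    "Naive Bayes": 62.66,
    "Gradient Boosting": 90.10,
}

# Toy labels, invented for illustration: 4 of 5 predictions are correct.
toy_true = [0, 1, 1, 0, 1]
toy_pred = [0, 1, 0, 0, 1]
print(f"toy accuracy: {accuracy(toy_true, toy_pred):.2%}")

for name, acc in ranked(REPORTED):
    print(f"{name:<18} {acc:.2f}%")
```

Accuracy alone can be misleading if the real and fake classes are imbalanced, so precision and recall per class may be worth reporting alongside it.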
The LSTM model outperformed all four traditional machine learning algorithms, reaching 98.37% test accuracy, 7.27 percentage points above the next best model (Random Forest, 91.10%). This suggests that deep learning models such as LSTMs can be effective for text classification tasks like fake news detection. However, model complexity, training time, and interpretability should also be weighed when choosing a model for a given task.
- Nimra Waqar
The dataset used in this project can be found at link.