Fake News Detection: Benchmarking Machine Learning and Deep Learning Approaches

Manan Buddhadev; Virtee Parekh

ESP Journal of Engineering & Technology Advancements

Volume 5 Issue 2

Year of Publication : 2025

Authors : Manan Buddhadev, Virtee Parekh

DOI : 10.56472/25832646/JETA-V5I2P106

Citation:

Manan Buddhadev, Virtee Parekh, 2025. "Fake News Detection: Benchmarking Machine Learning and Deep Learning Approaches", ESP Journal of Engineering & Technology Advancements 5(2): 41-48.

Abstract:

Fraudulent articles have proliferated across the web and spread rapidly. They include falsified facts, pseudoscientific claims, discriminatory pieces, satire, and misleading articles aimed at demeaning groups or individuals. Containing such articles is imperative because they create chaos and lead to unwise decision-making. In this project, machine learning and deep learning approaches are used to flag fake news items. One part of the dataset was manually scraped from the web and the other part is publicly available. Feature extraction techniques such as Bag of Words, TF-IDF, and N-grams, along with word embeddings such as GloVe, are explored. Of the combinations of feature extraction techniques and models evaluated, CNN-LSTM with GloVe embeddings gave the best results, with a test accuracy of 91%.
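
To make the feature extraction step concrete, the following is a minimal sketch of TF-IDF weighting over word unigrams and bigrams, written in plain Python. It is not the authors' implementation: the whitespace tokenizer, the `log(N / df)` form of the inverse document frequency, the function names `ngrams` and `tfidf`, and the two toy headlines are all assumptions made for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n_max=2):
    """Return all contiguous n-grams of length 1..n_max as space-joined strings."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


def tfidf(docs, n_max=2):
    """Map each document to a {term: tf * idf} dictionary over word n-grams.

    Assumed weighting (one of several common variants):
      tf  = term count / total terms in the document
      idf = log(number of documents / documents containing the term)
    """
    # Bag of words/n-grams per document (naive lowercase + whitespace split).
    bags = [Counter(ngrams(doc.lower().split(), n_max)) for doc in docs]
    # Document frequency: iterating a Counter yields each distinct term once.
    df = Counter(term for bag in bags for term in bag)
    n_docs = len(docs)
    vectors = []
    for bag in bags:
        total = sum(bag.values())
        vectors.append(
            {term: (count / total) * math.log(n_docs / df[term])
             for term, count in bag.items()}
        )
    return vectors


# Hypothetical toy headlines, not taken from the paper's dataset.
docs = [
    "aliens endorse local mayor",
    "council votes on local budget",
]
vectors = tfidf(docs)
```

Under this weighting, a term that occurs in every document (here "local") receives a weight of zero, while terms unique to one document are weighted up. Sparse vectors of this kind would typically feed a classical classifier, whereas the GloVe-based models mentioned above operate on dense pretrained embeddings instead.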

Keywords:

Fake News Detection; Deep Learning; Text Classification; GloVe Embedding; Machine Learning; CNN-LSTM.

ISSN : 2583-2646