Multilingual Cyber Threat Detection in Tweets/X Using ML, DL, and LLM: A Comparative Analysis

Document Type

Article

Publication Date

4-1-2026

School

Computing Sciences and Computer Engineering

Abstract

Cyber threat detection has become an important area of focus in today's digital age because of the growing spread of fake information and harmful content on social media platforms such as Twitter (now “X”). These cyber threats, often disguised within tweets, pose significant risks to individuals, communities, and even nations, emphasizing the need for effective detection systems. While previous research has explored tweet-based threats, much of the work is limited to specific languages, domains, or locations, or relies on single-model approaches, which reduces its applicability to diverse real-world scenarios. To address these gaps, our study focuses on multilingual cyber threat detection in tweets using a variety of advanced models. The research was conducted in three stages: 1) we collected and labeled tweet datasets in four languages (English, Chinese, Russian, and Arabic), employing both manual and polarity-based labeling methods to ensure high-quality annotations; 2) we analyzed each dataset individually using machine learning (ML) and deep learning (DL) models to assess their performance on distinct languages; and 3) we combined all four datasets into a single multilingual dataset and applied DL and large language model (LLM) architectures to evaluate their efficacy in identifying cyber threats across various languages. Our results show that among ML models, random forest (RF) attained the highest performance, and the bidirectional long short-term memory (BiLSTM) architecture consistently surpassed other DL and LLM architectures across all datasets. These findings underline the effectiveness of BiLSTM in multilingual cyber threat detection.
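
The labeling and merging stages described in the abstract can be illustrated with a minimal sketch. The abstract does not say how polarity was computed or how the datasets were merged, so everything below is an assumption made for illustration: the toy lexicon, the threshold of 0.0, the whitespace tokenizer (which would not work for Chinese), and the function names `polarity`, `label`, and `combine` are all hypothetical and are not the authors' method.

```python
# Hypothetical toy lexicon: word -> polarity weight (negative = threatening).
# The source does not publish a lexicon; these entries are illustrative only.
TOY_LEXICON = {
    "attack": -1.0,
    "malware": -1.0,
    "атака": -1.0,   # Russian for "attack"
    "safe": 1.0,
    "patched": 1.0,
}


def polarity(text, lexicon=TOY_LEXICON):
    """Mean lexicon weight of the known words in text; 0.0 if none are known."""
    # Whitespace tokenization is a simplification assumed for this sketch.
    scores = [lexicon[w] for w in text.lower().split() if w in lexicon]
    return sum(scores) / len(scores) if scores else 0.0


def label(text, threshold=0.0):
    """Label a tweet 'threat' when its polarity falls below the assumed threshold."""
    return "threat" if polarity(text) < threshold else "benign"


def combine(datasets):
    """Merge per-language tweet lists into one list of (lang, text, label) rows."""
    return [
        (lang, text, label(text))
        for lang, tweets in datasets.items()
        for text in tweets
    ]


# Toy per-language datasets standing in for the collected tweets.
datasets = {
    "en": ["new malware attack reported", "systems patched and safe"],
    "ru": ["атака на сервер"],
}

rows = combine(datasets)
for row in rows:
    print(row)
```

Keeping the language tag on every row of the merged list makes it possible to report results per language after training on the combined data, which matches the per-language and combined comparisons the abstract describes.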

Publication Title

IEEE Transactions on Computational Social Systems

Volume

13

Issue

2

First Page

1758

Last Page

1772
