Shelby Hall Graduate Research Forum Posters
Description
Stock market data are complex, and predicting stock prices in real time requires an efficient big data processing framework. This paper compares two distributed big data processing frameworks, Apache Flink and Apache Spark, for forecasting stock prices using Graph Neural Networks (GNNs). I analyze monthly data for the last 5 years from Yahoo Finance for 70 publicly traded companies, ranked by price-to-earnings (P/E) ratio. Companies in the dataset may be connected or similar to one another, and this can lead to similar stock price behavior. GNN models represent these interfirm relationships, and their output is processed by Flink and Spark. For the comparison, I used mean squared error (MSE) to measure prediction accuracy; a lower MSE means a lower prediction error. Similarly, I measured efficiency as processing time in seconds. Based on learning and literature, I suggest that Apache Flink is superior to Apache Spark in both prediction accuracy (MSE: 0.038 vs. 0.045) and efficiency (12.5 s vs. 18.2 s). This ongoing study addresses a gap in financial forecasting comparisons, offering insights for algorithmic trading.
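The two evaluation measures named above, MSE and processing time in seconds, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `mean_squared_error` and `timed` are hypothetical, the price values are made up, and none of it is the poster's actual Flink or Spark pipeline.

```python
import time


def mean_squared_error(actual, predicted):
    """Average of squared differences between actual and predicted values."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


# Made-up illustrative values, NOT data from the poster.
actual = [1.0, 2.0, 3.0]
predicted = [1.0, 2.5, 2.0]

mse, seconds = timed(mean_squared_error, actual, predicted)
print(f"MSE: {mse:.4f}, elapsed: {seconds:.6f} s")
```

In a framework comparison, the same timing wrapper would be applied to each framework's full processing job, so that the reported seconds cover comparable work.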
Publication Date
3-2026
Department
Computer Science
Disciplines
Databases and Information Systems | Programming Languages and Compilers
Recommended Citation
Pal, Vibhor, "Stock Market Price Prediction Using Big Data Models Comparison Analysis" (2026). Shelby Hall Graduate Research Forum Posters. 48.
https://jagworks.southalabama.edu/southalabama-shgrf-posters/48