Shelby Hall Graduate Research Forum Posters

Description

The stock market generates complex financial datasets, and real-time stock price prediction requires an efficient big data processing framework. This paper compares two distributed big data processing frameworks, Apache Flink and Apache Spark, for forecasting stock prices using Graph Neural Networks (GNNs). I analyze the last 5 years of monthly data from Yahoo Finance for 70 publicly traded companies, ranked by Price-to-Earnings (P/E) ratio. Companies in the dataset may be connected or similar to one another, which can lead to similar stock price behavior. These interfirm relationships are captured by GNN models, and their output is processed by Flink and Spark. For the comparison, I used mean squared error (MSE) to measure prediction accuracy; a lower MSE means a lower prediction error. Similarly, I measured efficiency as processing time in seconds. Based on my learning and the literature, I suggest that Apache Flink is superior to Apache Spark in both prediction accuracy (MSE: 0.038 vs. 0.045) and efficiency (12.5 s vs. 18.2 s). This ongoing study addresses a gap in financial forecasting comparisons, offering insights for algorithmic trading.
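The two evaluation measures named in the description, MSE for accuracy and elapsed seconds for efficiency, can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the study's code: the helper names `mean_squared_error` and `timed` and the sample price values are hypothetical, and no Flink or Spark job is actually run here.

```python
import time


def mean_squared_error(actual, predicted):
    """Average of squared differences between actual and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("inputs must be non-empty")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


# Hypothetical sample values, used only to exercise the helpers.
actual = [1.0, 2.0, 3.0]
predicted = [1.5, 2.0, 2.0]

# In a real comparison, the timed call would wrap each framework's
# processing job; here it wraps the metric itself for illustration.
mse, seconds = timed(mean_squared_error, actual, predicted)
print(f"MSE = {mse:.4f}, elapsed = {seconds:.6f} s")
```

A lower `mse` indicates predictions closer to the observed prices, and a smaller `seconds` value indicates faster processing, which matches how the two frameworks are ranked in the description.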

Publication Date

3-2026

Department

Computer Science

Disciplines

Databases and Information Systems | Programming Languages and Compilers

Stock Market Price Prediction Using Big Data Models Comparison Analysis
