"Analytics Insights from Text: Machine Learning, AI, and Sentiment Anal" by Charlie Smith
 

Theses and Dissertations

Date of Award

5-2025

Document Type

Dissertation

Degree Name

Ph.D.

Department

Business Administration

Committee Chair

Co-Chairs: Joshua Lambert and Ermanno Affuso

Abstract

Business analytics is about drawing actionable insights from data. These distinct but connected essays take a novel approach to exploring how advances in natural language processing (NLP) and machine learning can transform unstructured text data into actionable conclusions. Essay 1 provides a broad framework. Essay 2 strengthens the sentiment analysis with recent artificial intelligence methods for capturing nuanced sentiment in complex texts. Essay 3 applies those insights to forecast recessions using topics that can be readily interpreted and applied.

The research demonstrates how these methods can each deepen understanding of a single dataset, the Beige Books. Published by the Federal Reserve eight times a year, the documents provide anecdotal impressions of current economic conditions from stakeholders representing diverse sectors and geographic regions. NLP quantifies the rich and timely information contained in those perspectives.

Each essay approaches the task from a distinct analytical angle (code and data for each essay are available at https://github.com/ces2222/bbFinal). Essay 1, “Textual Analysis of Beige Books to Predict Regional Macroeconomic Changes,” calculates sentiment measures using a lexical method and identifies key topics using an unsupervised process. Sentiment features are used in a random forest model to predict growth in a region during a month, as measured by the State Coincident Index. Results show an AUC of 0.79, indicating that Beige Book sentiment carries promising predictive relevance.
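As a rough illustration of the lexical step and of what an AUC measures, the sketch below scores text against a tiny word list and computes AUC as the share of (growth, no-growth) pairs that the score ranks correctly. The word lists, function names, and sample sentences are hypothetical. The raw sentiment score stands in for the random forest's predicted probability, so this is not the dissertation's model or lexicon.

```python
import re

# Hypothetical mini-lexicon; the dissertation's actual lexicon is not given in the abstract.
POSITIVE = {"growth", "strong", "increased", "improved", "robust"}
NEGATIVE = {"weak", "declined", "slowed", "decreased", "soft"}


def lexical_sentiment(text):
    """Return (positive - negative) / matched words, or 0.0 if nothing matches."""
    words = re.findall(r"[a-z]+", text.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return (pos - neg) / total if total else 0.0


def auc(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up snippets with made-up labels (1 = growth, 0 = no growth).
docs = [
    ("Manufacturing activity increased and hiring was strong.", 1),
    ("Retail sales declined and demand was weak.", 0),
    ("Tourism improved, although construction slowed.", 1),
    ("Lending slowed and loan demand was soft.", 0),
]
scores = [lexical_sentiment(text) for text, _ in docs]
labels = [label for _, label in docs]
print(scores, auc(scores, labels))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported 0.79 should be read.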

Essay 2, “Large Language Models Predictions of Economic Sentiment Based on Beige Books,” draws on large language models (LLMs), an emerging AI tool that can interpret text in a human-like way and may outperform earlier sentiment analysis methods. A BERT (Bidirectional Encoder Representations from Transformers) model is fine-tuned on human-labeled data to classify tone in a Beige Book. The resulting model, dubbed BeigeSage, compares favorably with leading LLMs such as GPT and Llama in its ability to classify Beige Book sentiment. The essay also compares the performance and efficiency of fine-tuned, closed-source, and open-source LLMs, highlighting potential low-cost applications for non-proprietary AI models.
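One way such a comparison could be scored is to check each model's predicted tone against the human labels. The abstract does not say which metrics the essay uses, so accuracy and macro-averaged F1 below are assumptions, and the label names, model names, and predictions are invented for illustration.

```python
def accuracy(gold, predicted):
    """Share of predictions that match the human label."""
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


def macro_f1(gold, predicted):
    """Unweighted mean of per-class F1, so rare classes count as much as common ones."""
    labels = sorted(set(gold) | set(predicted))
    f1_scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, predicted))
        fp = sum(g != label and p == label for g, p in zip(gold, predicted))
        fn = sum(g == label and p != label for g, p in zip(gold, predicted))
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Invented human labels and invented predictions from two hypothetical models.
gold = ["positive", "negative", "neutral", "positive", "negative"]
predictions = {
    "model_a": ["positive", "negative", "neutral", "positive", "neutral"],
    "model_b": ["positive", "positive", "neutral", "neutral", "negative"],
}
for name, predicted in predictions.items():
    print(name, accuracy(gold, predicted), macro_f1(gold, predicted))
```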

Essay 3, “Economic Forecasting with Interpretable Topic Models: Evidence from Beige Books,” aims to overcome a common problem with topic modeling in business research: unsupervised methods create topics that reflect statistical relationships between words, but their lack of interpretability limits practical application. To address this issue, topics are pre-selected based on theory about their importance to the economy and then labeled in Beige Books by the researcher. A fine-tuned BERT model learns from the human annotations to assign labels to the entire Beige Book corpus. A probit regression then forecasts the likelihood of a recession based on counts of 11 economic topics. Findings demonstrate significant value for nowcasting and forecasting recessions with Beige Books. Discussions related to capital spending plans and employment levels hold particular forward-looking importance.
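The probit step can be sketched as follows: the recession probability is the standard normal CDF applied to a linear combination of topic counts. The topic keys, the coefficient values, and the intercept are hypothetical placeholders chosen only to make the example run. They are not estimates from the dissertation, and only two of the 11 topics are shown.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, the link function of a probit model."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def recession_probability(intercept, coefficients, topic_counts):
    """P(recession) = Phi(intercept + sum_k coefficient_k * count_k)."""
    index = intercept + sum(
        coefficients[topic] * topic_counts.get(topic, 0) for topic in coefficients
    )
    return normal_cdf(index)


# Hypothetical coefficients and counts; signs and magnitudes are arbitrary.
coefficients = {"capital_spending": -0.10, "employment": -0.05}
counts = {"capital_spending": 4, "employment": 6}
print(recession_probability(0.0, coefficients, counts))
```

In practice the coefficients would be estimated by maximum likelihood from historical recession indicators rather than set by hand.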
