Date of Award
Aviv Segev, Ph.D.
The increasing complexity of artificial intelligence models has given rise to extensive work toward understanding the inner workings of neural networks. Much of that work, however, has focused on manipulating the input data fed to the network to assess the effects on network output, or on pruning model components after the often extensive and time-consuming training. It is postulated in this study that the understanding of neural networks can benefit from simplification of model structure. In turn, it is shown that model simplification can benefit from investigating the evolving trends, during training, of network nodes, the most fundamental units of neural networks. Whereas studies on simplification of model structure have mostly required repeated model training at prohibitive time costs, assessing evolving trends in node weights toward model stabilization may circumvent that limitation.
Node positional and magnitude stabilities were the central constructs used in this study to investigate neuronal patterns over time and to determine node influence on model predictive ability. Positional stability was defined as the number of epochs in which nodes held their location compared to those from the stable model, defined in this study as a model with accuracy >0.90. Node magnitude stability was defined as the number of epochs in which node weights retained their magnitude, within a tolerance value, when compared to the stable model. To test evolving trends, a manipulated data set, a contrived data set, and two life science data sets were used. The data sets were run through convolutional neural network (CNN) and deep neural network (DNN) models. Experiments were conducted to test neural network training for patterns as a predicate for investigating evolving trends in nodes. It was postulated that highly stable nodes were most influential in determining model prediction, measured by accuracy. Furthermore, this study suggested that the addition of influential nodes to the model during training followed a biological growth curve.
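The two stability measures defined above can be illustrated with a minimal sketch. It is not the implementation used in the study: the function names, the per-node weight lists, and the default tolerance are hypothetical, and "location" is assumed here to mean a node's rank by absolute weight within its layer, which the abstract does not specify.

```python
def magnitude_stability(weight_history, stable_weights, tol=0.05):
    """Per node, count the epochs whose weight is within `tol` of the stable model's weight.

    weight_history: list of epochs, each a list of node weights (hypothetical layout).
    stable_weights: node weights taken from the stable model.
    """
    counts = [0] * len(stable_weights)
    for epoch_weights in weight_history:
        for node, (w, w_stable) in enumerate(zip(epoch_weights, stable_weights)):
            if abs(w - w_stable) <= tol:
                counts[node] += 1
    return counts


def _ranks(weights):
    """Rank of each node by absolute weight (0 = smallest)."""
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    ranks = [0] * len(weights)
    for rank, node in enumerate(order):
        ranks[node] = rank
    return ranks


def positional_stability(weight_history, stable_weights):
    """Per node, count the epochs in which its rank matches its rank in the stable model.

    ASSUMPTION: "location" is interpreted as rank by absolute weight.
    """
    stable_ranks = _ranks(stable_weights)
    counts = [0] * len(stable_weights)
    for epoch_weights in weight_history:
        for node, (r, r_stable) in enumerate(zip(_ranks(epoch_weights), stable_ranks)):
            if r == r_stable:
                counts[node] += 1
    return counts


# Three epochs of weights for three nodes; the last epoch is the stable model.
history = [[0.1, 0.9, 0.5], [0.4, 0.8, 0.5], [0.5, 0.8, 0.2]]
stable = history[-1]
magnitude_counts = magnitude_stability(history, stable, tol=0.15)
positional_counts = positional_stability(history, stable)
```

Under this reading, nodes with high counts on both measures would be the candidates for "highly stable" nodes; how the study combined or thresholded the two measures is not stated in the abstract.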
Findings indicated that neural network weight assignment, weight spatial structure, and progression through time were not random and were strongly influenced by the choice of model and of data set. Moreover, progress toward stability differed by model, with CNNs adding influential nodes more evenly during training. The CNN model runs generally followed a biological growth curve covering an entire life, whereas for DNN model runs, the growth curve shape was more characteristic of an organism during its early life or of a population unconstrained by resources, where growth tends to be exponential.
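The contrast between the two curve shapes can be made concrete with a short sketch. It assumes that a "growth curve covering an entire life" is modeled by the standard logistic function and that unconstrained growth is modeled by a plain exponential; the abstract names neither form, and the parameter names and values below are hypothetical.

```python
import math


def logistic_growth(t, capacity, initial, rate):
    """Logistic curve: near-exponential early, then levelling off at `capacity`."""
    return capacity / (1.0 + ((capacity - initial) / initial) * math.exp(-rate * t))


def exponential_growth(t, initial, rate):
    """Exponential curve: growth with no resource limit."""
    return initial * math.exp(rate * t)


# Hypothetical values: 1 influential node at epoch 0, a ceiling of 100 nodes.
early_gap = abs(logistic_growth(1, 100, 1, 0.5) - exponential_growth(1, 1, 0.5))
late_logistic = logistic_growth(40, 100, 1, 0.5)
late_exponential = exponential_growth(40, 1, 0.5)
```

With these values the two curves are nearly indistinguishable in the first epochs and diverge later, which is why a run observed only in its early phase can look exponential.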
The stability approach of this study showed superior time efficiency when compared to competing methods. The contributions of this work may assist in making AI models more transparent and easier to understand for all stakeholders, adding to the benefits of AI technologies by minimizing and dispelling the fears associated with the adoption of black-box automation approaches in science and industry.
Riedel, Ralf P., "Explainable Artificial Intelligence: Approaching it From the Lowest Level" (2023). Theses and Dissertations. 164.