Description
There have been multiple attempts at coupling neural networks with external memory components for sequence learning problems. Such architectures have demonstrated success in algorithmic, sequence transduction, question-answering and reinforcement learning tasks. The most notable of these attempts is the Neural Turing Machine (NTM), an implementation of the Turing Machine with a neural network controller that interacts with a continuous memory. Although the architecture is Turing complete, and hence universally computational, it has seen limited success on complex real-world tasks.
In this thesis, I introduce an extension of the Neural Turing Machine, the Neural Harvard Machine, which implements a fully differentiable Harvard Machine framework with a feed-forward neural network controller. Unlike the NTM, it has two distinct memories: a read-only program memory and a read-write data memory. A sufficiently complex task is divided into smaller, simpler sub-tasks, and the program memory stores the parameters of networks pre-trained on these sub-tasks. The controller reads inputs from an input tape, uses the data memory to store useful intermediate signals, and writes the correct symbols to an output tape. The output symbols are a function of the outputs of each sub-network and the state of the data memory. Hence, the controller learns to load the weights of the appropriate program network to generate the output symbols.
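To make these mechanics concrete, the following is a minimal, illustrative sketch (not the thesis implementation) of a single Neural Harvard Machine step in NumPy: a feed-forward controller softly selects one of the pre-trained programs held in a read-only program memory, reads from and writes to a read-write data memory, and emits an output symbol as a function of the sub-network output and the memory state. All names, shapes, and the single-layer "sub-networks" are assumptions made for this example.

```python
# Minimal sketch of one Neural Harvard Machine step (illustrative assumptions only).
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)

P, W_DIM = 3, 8 * 4          # number of stored programs, flattened weight size
N, M = 16, 4                  # data memory: N slots of width M
IN, HID = 6, 32               # input symbol width, controller hidden size

program_memory = rng.standard_normal((P, W_DIM))   # read-only, pre-trained sub-network weights
data_memory = np.zeros((N, M))                      # read-write data memory

# Controller parameters (learned end-to-end in the actual model).
W1 = rng.standard_normal((HID, IN + M)) * 0.1
W2 = rng.standard_normal((P + N + M, HID)) * 0.1

def step(x, read_vec):
    """One differentiable step: softly pick a program, run it, update data memory."""
    h = np.tanh(W1 @ np.concatenate([x, read_vec]))
    out = W2 @ h
    prog_attn = softmax(out[:P])            # soft choice over pre-trained sub-networks
    addr_attn = softmax(out[P:P + N])       # soft address into the data memory
    write_vec = np.tanh(out[P + N:])        # value to write into memory

    # "Load" weights as an attention-weighted blend of the stored programs,
    # then run the selected sub-network (here just a single linear layer).
    w = (prog_attn @ program_memory).reshape(8, 4)
    sub_out = np.tanh(w @ write_vec)

    # Memory write and read with the same soft address.
    data_memory[:] = (1 - addr_attn[:, None]) * data_memory \
                     + addr_attn[:, None] * write_vec
    new_read = addr_attn @ data_memory

    # Output symbol distribution as a function of sub-network output and memory state.
    y = softmax(sub_out)
    return y, new_read

read_vec = np.zeros(M)
for t in range(5):
    x = rng.standard_normal(IN)             # stand-in for the next input-tape symbol
    y, read_vec = step(x, read_vec)
    print(t, y.round(3))
```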
A wide range of experiments demonstrates that the Harvard Machine framework learns faster and performs better than the NTM and recurrent networks such as the LSTM as task complexity increases.
Details
Title
- Differentiable Harvard Machine Architecture with Neural Network Controller
Contributors
- Bhatt, Manthan Bharat (Author)
- Ben Amor, Hani (Thesis advisor)
- Zhang, Yu (Committee member)
- Yang, Yezhou (Committee member)
- Arizona State University (Publisher)
Date Created
2020
Note
- Masters Thesis, Computer Science, 2020