ABSTRACT

We address the problem of processing a context-sensitive language with a recurrent neural network (RN). So far, the language processing capabilities of RNs have only been investigated for regular and context-free languages. We present an extremely simple RN with only one parameter z for its two hidden nodes that can perform a prediction task on sequences of symbols from the language {(bak)n | k ≥ 0, n > 0}, a language that is context-sensitive but not context-free. The input to the RN consists of any string of the language, one symbol at a time. The network should then, at all times, predict the symbol that should follow. This means that the network must be able to count the number of a's in the first subsequence and to retain this number for future use. We present a value for the parameter z for which our RN can solve the task for k = 1 up to k = 120. As we do not give any method to find a good value for z, this does not say anything about the learning capabilities of our network. It does, however, show that context-sensitive information (the count of a's) can be represented by the network; we analyse in detail how this is done. Hence our work shows that, at least from a representational point of view, connectionist architectures can handle more complex formal languages than was previously known.