Transforming the transformers
Natural language processing is the term used in applied computing for making sense of human (“natural”) languages. It has developed largely through artificial neural networks that take in words and compare them with words in collections they have already processed; depending on how the incoming words compare with those already received, the network produces a response or does not.
But this is “supervised” learning: a neural network cannot respond reliably until it has been trained on a set of words that have already been labelled, so that it has something to compare against. Labelling can be done manually (slow) or (semi-)automatically (much faster), but it must be done.
Once training is complete, a new set of words (the “test set”) can be fed to the neural network, and its responses should provide useful information to the human users of natural language processing. Good – in principle.
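To make the supervised workflow concrete, here is a minimal sketch in plain Python: a toy classifier that “learns” word counts from a hand-labelled training set and is then evaluated on unseen test sentences. The sentences, labels, and counting method are invented for illustration and are far simpler than a real neural network, but the train/label/test pattern is the same.

```python
from collections import Counter

# Hand-labelled training set: (sentence, label) pairs.
# Producing these labels by hand is the slow step described above.
train = [
    ("great film loved it", "pos"),
    ("wonderful acting great story", "pos"),
    ("terrible plot boring film", "neg"),
    ("awful boring waste", "neg"),
]

# "Training": count how often each word appears under each label.
counts = {"pos": Counter(), "neg": Counter()}
for sentence, label in train:
    counts[label].update(sentence.split())

def classify(sentence):
    """Label a new sentence by comparing its words with those already seen."""
    scores = {label: sum(c[w] for w in sentence.split())
              for label, c in counts.items()}
    return max(scores, key=scores.get)

# The "test set": unseen sentences the trained model must now label.
print(classify("loved the story"))   # pos
print(classify("boring and awful"))  # neg
```

Words never seen during training (such as “and” here) contribute nothing to either score, which is exactly the limitation the next paragraphs turn to.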
We as human speakers and writers don’t limit our language to make it easier for computers. Some neural network architectures (recurrent neural networks, e.g. as implemented in TensorFlow) can step through ongoing natural language input to process it. But how does this scale when the text gets really long, and how does it deal with context?
This is where transformers come in: novel architectures for natural language processing that are more flexible; that take the context of natural language into account better; that may scale better by comparison; and that may be more amenable to the automated labelling methods needed for supervised learning. Machine learning platforms such as Keras and Hugging Face provide libraries that support transformers.
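One way to see why transformers handle context differently from step-by-step recurrent networks is the self-attention operation at their core: every word’s representation is recomputed as a weighted average over all words in the input at once, rather than by marching left to right. The sketch below, in plain Python with tiny two-dimensional word vectors invented for illustration, shows the idea; real models use learned, high-dimensional projections and many attention heads.

```python
import math

def softmax(xs):
    # Turn raw similarity scores into weights that sum to 1.
    exps = [math.exp(x) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Each output vector is a weighted average of ALL input vectors,
    weighted by dot-product similarity -- no left-to-right stepping."""
    outputs = []
    for q in vectors:
        scores = [sum(a * b for a, b in zip(q, k)) for k in vectors]
        weights = softmax(scores)
        out = [sum(w * v[i] for w, v in zip(weights, vectors))
               for i in range(len(q))]
        outputs.append(out)
    return outputs

# Toy vectors standing in for three words (values invented).
words = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(self_attention(words))
```

Because each word attends to every other word directly, distant context is one step away instead of many recurrent steps away; the cost is that the number of word-pair comparisons grows quadratically with input length.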
Application domains? They are huge, because the use of human language is so diverse. For example, visit the GrApH AI Working Group, which is seeking to use NLP to process clinical notes for better use of medical knowledge.