
Language Loop


The language loop remains unfinished and is currently the most complicated and computationally demanding portion of the system. Language information is stored in a formatted directed graph, termed a ‘semantic web’.

The loop takes plaintext as input and, optionally, a speaker ID. It is initialized from languageloop.py via the language_loop class. language_loop possesses three critical functions: update_knowledge_base(), which takes no arguments and reads the text files stored in the knowledge base (a collection of text files); read_key_input(), which takes no arguments and allows debug access to the loop; and feed(spk, txt), which passes a speaker ID and plaintext into the loop. Within languageloop.py, a textflow is initialized. The textflow is the coroutine that handles the passage of information between the other coroutines and dictates the progression of the analysis of the input. Upon initialization of the textflow class, a serialization of the semantic web (discussed later) is loaded, and the Wernicke loop and Sage loop (and, later, Q_GAN) are initialized as well. A textflow coroutine cleans the inbound text and then passes it to the Wernicke loop.
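A minimal sketch of these entry points follows. The names languageloop.py, language_loop, textflow, update_knowledge_base(), read_key_input(), and feed(spk, txt) come from the description above; the method bodies, the knowledge-base directory path, and the textflow interface are illustrative assumptions.

```python
import os

class textflow:
    """Sketch of the coroutine hub: the real class loads a serialized
    semantic web and initializes the Wernicke and Sage loops (and Q_GAN)."""
    def __init__(self):
        self.web = {}  # stand-in for the deserialized semantic web

    def process(self, spk, txt):
        txt = " ".join(txt.split())  # clean the inbound text (assumed step)
        # ...hand off to the Wernicke loop here...
        return spk, txt

class language_loop:
    def __init__(self, kb_dir="knowledge_base"):  # directory name assumed
        self.kb_dir = kb_dir
        self.flow = textflow()

    def update_knowledge_base(self):
        """Read every text file stored in the knowledge base."""
        for name in os.listdir(self.kb_dir):
            with open(os.path.join(self.kb_dir, name)) as f:
                self.flow.process("kb", f.read())

    def read_key_input(self):
        """Debug access: feed typed lines directly into the loop."""
        while True:
            self.feed("debug", input("> "))

    def feed(self, spk, txt):
        """Pass a speaker ID and plaintext into the loop."""
        return self.flow.process(spk, txt)
```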

Wernicke Loop

The Wernicke loop handles the majority of the detailed NLP performed on inbound text. It begins by classifying the sentence as declarative, imperative, or interrogative. This is done via the STGen coroutine, a simple neural network that remains loaded in memory. Next, the text is passed to spGen, the spaCy generator, and finally to the adjective corpex, which classifies each adjective as positive or negative and forms a sentiment bit vector. The spaCy generator returns a sentence ‘frame’, which contains the type prediction, the emotional charge vector, the plaintext, a list of entities, a list of tokens, a list of noun-chunks, and the speaker.
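A hedged sketch of these stages is below. The Frame fields mirror the list above; classify_type and ADJ_POLARITY are toy stand-ins for STGen and the adjective corpex, not the real models.

```python
from dataclasses import dataclass
import spacy

nlp = spacy.load("en_core_web_sm")  # spGen equivalent, kept loaded
ADJ_POLARITY = {"good": 1, "great": 1, "bad": 0, "awful": 0}  # toy corpex

def classify_type(txt):
    """Stand-in for STGen's declarative/imperative/interrogative network."""
    return "interrogative" if txt.rstrip().endswith("?") else "declarative"

@dataclass
class Frame:
    type_prediction: str
    charge_vector: list   # sentiment bits, one per adjective
    plaintext: str
    entities: list
    tokens: list
    noun_chunks: list
    speaker: str

def wernicke(spk, txt):
    doc = nlp(txt)
    bits = [ADJ_POLARITY.get(t.lemma_.lower(), 0)
            for t in doc if t.pos_ == "ADJ"]  # sentiment bit vector
    return Frame(classify_type(txt), bits, txt,
                 [e.text for e in doc.ents],
                 [t.text for t in doc],
                 [c.text for c in doc.noun_chunks],
                 spk)
```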

After processing in the Wernicke loop, the frame is routed by the logic portion of the textflow routine. If the sentence is declarative or imperative, the frame is immediately sent to the symbolic engine, which, ideally, is responsible for swapping the speaker ID for a link to a profile class and for resolving common language that can be simplified into a logical, symbolic representation. The frame is then passed to the semantic web and stored there via sentenceEncounter(), which takes the frame as its argument.
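In code, the routing might look like the sketch below. symbolic_engine and sentenceEncounter are named above, but their signatures, and the stub body of symbolic_engine, are assumptions.

```python
def symbolic_engine(frame):
    """Placeholder: would swap the speaker ID for a profile link and
    simplify common language into a symbolic representation."""
    return frame

def route(frame, web, sage):
    if frame.type_prediction in ("declarative", "imperative"):
        web.sentenceEncounter(symbolic_engine(frame))  # store in the web
        return None
    answer, confidence = sage.ask(frame)  # interrogative: query the Sage loop
    return answer, confidence
```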

If the sentence is interrogative, the frame is instead first passed to the Sage loop, which contains a Google ELECTRA question-answering model. The critical semantic components of the question are traced to other stored statements possessing the same components, and those statements form the context for the QA model. The answer, along with the model’s confidence in it, is then passed back to the textflow.
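A minimal sketch of the Sage step, using the Hugging Face transformers pipeline with a SQuAD-finetuned ELECTRA checkpoint (deepset/electra-base-squad2 is an assumption; the project’s actual checkpoint is not stated). Context is assembled from stored frames that share noun-chunks with the question.

```python
from transformers import pipeline

qa = pipeline("question-answering", model="deepset/electra-base-squad2")

def sage_answer(frame, stored_frames):
    keys = set(frame.noun_chunks)
    # Statements sharing semantic components with the question form the context.
    context = " ".join(f.plaintext for f in stored_frames
                       if keys & set(f.noun_chunks))
    if not context:
        return None, 0.0
    out = qa(question=frame.plaintext, context=context)
    return out["answer"], out["score"]  # answer plus confidence
```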

Semantic Web

The web consists of semantic vectors, which are in turn built from semantic nodes and edges; every node and edge has both an x and a y value. The x represents the location within a sentence, and the y is a temporal indicator. Edges travel along the x axis, while ‘traces’ travel along the y axis, connecting instances of nodes that share the same semantic hash across different conversations.

Semantic ‘Cube’

Another data structure option, which uses x for in-sentence location, y for time, and z for the semantic hash number.
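A toy illustration of the (x, y) placement scheme described above; the hash-keyed dictionary plays the role that the z axis would play in the ‘cube’ variant, and the use of Python’s built-in hash() as the semantic hash is purely illustrative.

```python
web_by_hash = {}  # semantic hash -> list of (x, y) placements

def place_sentence(words, y):
    """Place one sentence at time index y; return its edges and new traces."""
    traces = []
    for x, word in enumerate(words):
        h = hash(word)  # stand-in for the real semantic hash
        for px, py in web_by_hash.get(h, []):
            traces.append(((px, py), (x, y)))  # trace along the y axis
        web_by_hash.setdefault(h, []).append((x, y))
    edges = [((x, y), (x + 1, y)) for x in range(len(words) - 1)]  # x axis
    return edges, traces
```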

Semantic Vector

The semantic vector tracks metadata about the sentence itself for later retrieval: the time it was initialized, the speaker, the type of sentence, the track (a list consisting of node and edge objects), and a plaintext representation of the sentence.

Noid

A corruption of ‘void’ and ‘node’, this is an object that concretely marks the end of a semantic vector.

Semantic Edge

The semantic edge tracks the running summation of the sentiment (calculation TBI), a negation float denoting the presence of a prior negation, the type of the edge (TBI via the symbolic engine), and a weight (likewise TBI).

Semantic Trace

The semantic trace is simply an object holding two (x, y) pairs, one for each of the two nodes it connects.

Semantic Node

The semantic node is the representation of a word on the graph. It tracks its inbound and outbound edges, its x and y coordinates, the plaintext, the tokens, and the hash. It also holds a list of its own individual trace (sem_trace) objects and, in the future, a pointer to a profile class.
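The graph objects above might be expressed as the dataclasses below. Only sem_trace is named in the text; the other class names and all field types are assumptions drawn from the field lists given.

```python
from dataclasses import dataclass, field
import time

@dataclass
class sem_edge:
    sentiment_sum: float = 0.0  # running sentiment (calculation TBI)
    negation: float = 0.0       # presence of a prior negation
    edge_type: str = ""         # TBI via the symbolic engine
    weight: float = 1.0         # TBI

@dataclass
class sem_trace:
    x1: int
    y1: int
    x2: int
    y2: int  # the two (x, y) pairs of the nodes it connects

@dataclass
class sem_node:
    x: int
    y: int
    plaintext: str
    tokens: list
    sem_hash: int
    inbound: list = field(default_factory=list)   # inbound edges
    outbound: list = field(default_factory=list)  # outbound edges
    traces: list = field(default_factory=list)    # sem_trace objects
    profile: object = None                        # future: profile pointer

class noid:
    """Concretely marks the end of a semantic vector."""

@dataclass
class sem_vector:
    speaker: str
    sent_type: str
    plaintext: str
    created: float = field(default_factory=time.time)
    track: list = field(default_factory=list)  # nodes and edges, ending in a noid
```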

Additionally, a ‘curiosity engine’ has been explored, but only on pen and paper. The curiosity engine is a feedback loop in which a GAN generates questions about loosely connected portions of the graph, and the confidence value from the Sage loop is used to fine-tune the system’s ability to ask itself questions that elaborate on encountered concepts. A suitably confident answer can then be added to the semantic web as an original ‘thought’ or ‘conclusion’ regarding a topic, and used later as if it had been encountered elsewhere.
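Since the engine exists only on paper, the sketch below merely makes the proposed feedback loop explicit; every name in it (q_gan, loosely_connected, the 0.9 threshold) is hypothetical.

```python
def curiosity_step(web, sage, q_gan, loosely_connected, threshold=0.9):
    region = loosely_connected(web)    # a weakly linked portion of the graph
    question = q_gan.generate(region)  # GAN-generated question about it
    answer, confidence = sage.ask(question)
    if confidence >= threshold:
        web.sentenceEncounter(answer)  # store as an original 'conclusion'
    return question, answer, confidence
```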
