Universal Semantic Code

The Universal Semantic Code (USC) is a semantic language for knowledge representation. It is based on a set of semantic primitives, which are normalized to represent knowledge unambiguously and are organized in a regular structure. USC patterns have an interpretation within the formal algebra described below. The formal nature of USC, together with its regularity and its unambiguous, explicit expression of semantics, allows it to describe virtually any application domain.

USC can be viewed as an action-oriented analysis. The structure related to an action is defined as the basic USC semantic pattern. The number of base actions is limited by the permutations of the structural action elements: there are 108 base actions. Other actions, expressed by action verbs outside the set of USC base patterns, are treated as one of the following:

  1. Complex actions, which can be expressed through two or more USC patterns
  2. Synonymous actions, which are equivalent to one of the base USC patterns

The selection of the key action verbs for USC, together with the complex and synonymous actions, covers up to 80-90% of existing text and almost 100% of texts specific to technology and business, which makes USC a semantically complete and universal coding system for the analysis and design of applications and systems. The base USC principles are as follows:

  • Every USC statement corresponds to only one meaning and every USC transformation corresponds to only one meaning transformation.
  • Knowledge is represented procedurally: for any object of the system, the only criterion considered is the function the object performs.
  • All key USC action verbs have a symbolic representation and a canonized natural-language representation; all other verbs refer to the key verbs.
  • Nouns are defined on the basis of the corresponding verbs with modal characteristics being added to each noun.
  • The rules of reading USC strings in any natural language are set up according to formal characteristics.

The USC and related work

Fundamental research on semantic languages carried out before USC has both strengths and shortcomings.

Formal grammars built on N. Chomsky’s concepts can be used to create computer programming languages. However, because they lack a semantic interpretation, they cannot serve as knowledge representation languages with a one-to-one correspondence between syntactic and semantic elements.

S. Amarel and R. Schank gave priority to Knowledge Representation (KR) in intellectual problem solving. Natural language in its non-canonized form cannot meet all KR requirements. A formal system with full semantic interpretation is embodied in R. Montague’s grammar. However, Montague semantics is defined at the level of the natural-language word, without exposing the word’s inner semantic structure.

Two projects attempted to create a language with formalized semantics: the model of conceptual dependency proposed by R. C. Schank and the “meaning-text” model of I. A. Melchuk. Both rest on primitives (semantic elements) that form the semantic notation of an expression: primitive actions in Schank’s model and lexical functions in Melchuk’s. Because these primitives were elaborated empirically, they do not claim to be complete, independent or consistent in the strict sense. USC is the first embodiment of a deductive theory of a knowledge representation language. Artificial intelligence can be effective only with a formal representation and transformation of sense, since only under these conditions is the computer modeling of mental processes ensured.

USC is in agreement with these principles. It has two additional fundamental characteristics: special means for representing information and modality (these categories are represented by traditional ternary strings of elementary symbols), and the explication of USC as a certain algebra. There is a set of axioms within the scope of this algebra. Each axiom represents a regular transformation of sense in explicit form, which is a requirement for any artificial intelligence system (the substitution of strict concepts for intuitive ones).

Discussion

It is natural for a human to reach the following conclusion: the engineer has seen the device before, and that is why he would recognize it. In general form: X has seen Y ⇒ X would recognize Y. An intellectual system should know how to draw this conclusion; that is, the designers of an AI system must know how to teach a computer to draw this kind of conclusion.

Another instance of deduction: he has already played Rossini’s “Tarantella”, and that is why he would play it. In more general form: X has played Y ⇒ X would play Y (X can possibly play Y). This deduction can be reduced to the modal logic rule P → ◊P.

The right part of the first statement reads “X would recognize Y”. This sentence signifies a possible result of the action represented in the left part (X can possibly recognize Y). After this transformation, the right part coincides with the same part of the preceding statement and with the rule above. In the left part, the verb “to see” may be interpreted as “to receive information” or “to get to know”. The whole then reads as follows: X has known Y → X can possibly know (recognize) Y. This leads to the same modal logic rule: P → ◊P.

To identify P in the left and right parts of the conclusion, we have to add semantics to the formalism of representation. The formal representation and the canonized language establish the identity of P in the left and right parts of the conclusion and distinguish the modal operator in the right part. Semantic code is the only means of formalizing human “thinking”. It can be implemented in a system with a specialized dictionary for translating natural language phrases into semantic notation (as USC does).
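The two deductions above can be sketched in a few lines. This is only an illustration, assuming a hypothetical mini-dictionary that maps natural-language verbs to a canonical verb; the dictionary entries and the names `canonize`, `possibly` and `infer` are not part of USC.

```python
# Hypothetical mini-dictionary: natural-language verb -> canonical verb.
# "see" and "recognize" are both read as "know", as in the discussion above.
CANONICAL = {
    "see": "know",
    "recognize": "know",
    "know": "know",
    "play": "play",
}

def canonize(subject, verb, obj):
    """Return the canonical proposition P as a plain tuple."""
    return (subject, CANONICAL[verb], obj)

def possibly(p):
    """Wrap a proposition in the modal operator; the result stands for ◊P."""
    return ("possibly", p)

def infer(p):
    """Apply the modal rule P → ◊P."""
    return possibly(p)

# "X has seen Y" and "X would recognize Y" share the same canonical P.
premise = canonize("X", "see", "Y")
conclusion = possibly(canonize("X", "recognize", "Y"))
```

Once both sides are canonized, `infer(premise)` and `conclusion` are the same object structure, which is exactly the identity of P that the rule needs.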

Formalization of lexical semantics alone is not powerful enough, because natural language is vague: the complexity of its syntactic structures does not match that of its semantic structures. This discrepancy is due to omissions in natural language phrases. For example, the phrase “Your child eats with his hands” is reconstructed in full as “Your child eats with his (mouth, holding food with his) hands”.

Comparing the phrases “John beats Jim” and “John expects Jim”, we see that despite their full syntactic coincidence they have important semantic distinctions. Asking the question “What does John do to Jim?”, we get a regular answer in the first case, “He beats him”, and a meaningless answer in the second, “He expects him”. In fact, the phrase “John beats Jim” has a basic semantic structure, while the phrase “John expects Jim” has a complex one. The semantic reconstruction of the second phrase is: “John is where he expects Jim to be soon”.

Let’s compare the semantic-syntactic representation in USC of two phrases: “John protects Jim from something” and “John guides Jim through something”. The symbolic representation of the first phrase is ((XY)Z)(Z(WY)), which reads in canonized natural language as “John by means of Y protects Jim” (i.e., by means of Y he preserves him from something; John acts so that Jim still exists). The symbolic representation of the second phrase is ((XY)Z)((ZZ)Y), which reads in canonized natural language as “X by means of Y guides Z” (i.e., by means of Y he lets Jim go through Y). It is easy to see that Y performs a different function in each phrase. “Protects, preserves by means of Y” suggests that Y is some medicine. “Guides through by means of Y” shows that Y is some entrance or exit (like a door or a hatch). The difference between the syntactic structures of the phrases is isomorphic to the difference between their semantic structures.

USC has an advantage in solving intellectual problems because it brings the problem description down to the base semantic level. The user only has to do a mapping of the following type: who (“X”), by means of what (“Y”), acting on what (“Z”), gets what (“W”).
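As a rough illustration of this mapping, the four roles can be held in a small record and read back as a canonized phrase. The class name `Roles`, the function `read_roles` and the example values are assumptions made for this sketch only.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Roles:
    """The four roles of the mapping described above (names are illustrative)."""
    x: str  # who acts
    y: str  # by means of what
    z: str  # acting on what
    w: str  # what is obtained

def read_roles(r):
    """Read the mapping back in the 'who / by means of what / on what / gets what' form."""
    return f"{r.x} by means of {r.y} acting on {r.z} gets {r.w}"

# Illustrative values only; they are not taken from a USC knowledge base.
example = Roles(x="the master", y="a set of tools", z="the picture", w="a restored picture")
```

Calling `read_roles(example)` yields one sentence in which every role is explicit, which is the point of bringing a problem description down to the base semantic level.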

Having assigned the names of the actors and of the devices being used to the variables, the user determines the initial problem state. The USC base system shows the possible options for moving from the initial state to other possible states according to the user’s goals. The system Knowledge Base is formed by the axioms of the USC algebra and is an oriented (directed) graph. The nodes of the graph are USC patterns. The arcs are axiom-based transitions between possible actions. The solution of a problem can be realized as a path, that is, a succession of arcs. The problem-solving algorithm follows this path step by step from the initial problem state to the target state.
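The text does not specify which search procedure is used, so the following is only a sketch that assumes a plain breadth-first search over a tiny, hand-made fragment of such a graph. The arcs in `GRAPH` are illustrative: two bracket shifts and one step that spreads the second element to the third position; the real Knowledge Base is not reproduced here.

```python
from collections import deque

# Hypothetical fragment of the graph: nodes are patterns, arcs are transitions.
GRAPH = {
    "(ZW)Y": ["Z(WY)", "(ZW)W"],  # bracket shift; spread W to the third position
    "(ZW)W": ["Z(WW)"],           # bracket shift
    "Z(WY)": [],
    "Z(WW)": [],
}

def find_path(graph, start, goal):
    """Breadth-first search; returns the list of nodes from start to goal, or None."""
    if start == goal:
        return [start]
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt in parents:
                continue  # already reached by a path that is no longer
            parents[nxt] = node
            if nxt == goal:
                # Walk the parent links back to the start, then reverse.
                path = [nxt]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return path[::-1]
            queue.append(nxt)
    return None
```

With this fragment, `find_path(GRAPH, "(ZW)Y", "Z(WW)")` walks from the initial pattern to the target pattern through `(ZW)W`, and a search in the opposite direction returns `None` because the arcs are oriented.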

Apart from the axioms, the USC system contains a semantic vocabulary of the most commonly used verbs. Each verb is either defined in USC directly, or refers to a synonym with a similar definition, or is a complex verb expressed through the base USC patterns. The statements of the system Knowledge Base are formed with verbs selected by the user. By referring to the verb vocabulary and using the USC pattern definitions, the user fills in the names of the relevant objects required by the USC pattern structure.
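The three kinds of vocabulary entry can be sketched as a small lookup. The dictionary layout, the field names and the function `resolve` are assumptions for illustration; the entries themselves reuse verbs and patterns that appear elsewhere in this text.

```python
# Hypothetical vocabulary layout: each verb is a base pattern, a reference
# to another verb, or a complex verb made of several base verbs.
VOCABULARY = {
    "create":   {"kind": "base", "pattern": "(ZW)Y"},
    "preserve": {"kind": "base", "pattern": "Z(WY”)"},
    "lead in":  {"kind": "base", "pattern": "(ZZ)W"},
    "restore":  {"kind": "synonym", "refers_to": "preserve"},
    "pump up":  {"kind": "complex", "parts": ["lead in", "lead in"]},
}

def resolve(verb, vocabulary=VOCABULARY):
    """Return the list of base patterns that a verb stands for."""
    entry = vocabulary[verb]
    if entry["kind"] == "base":
        return [entry["pattern"]]
    if entry["kind"] == "synonym":
        return resolve(entry["refers_to"], vocabulary)
    # Complex verb: concatenate the patterns of its parts, in order.
    patterns = []
    for part in entry["parts"]:
        patterns.extend(resolve(part, vocabulary))
    return patterns
```

A reference entry resolves to the pattern of the verb it points to, and a complex verb resolves to a sequence of base patterns.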

USC axioms

The formal part of USC has the status of an algebra (A) of USC pattern representation and transformation, defined as:

A = < M, →, ~ >

where

  • M is a set of elements
  • → is a binary, non-commutative and non-associative operation on the given set (the operation of implication)
  • ~ is the unary operation on the given set (the operation of negation).

This kind of universal algebra corresponds to the Lukaszewicz variant of the Lindenbaum algebra:

  • Lindenbaum: A = < M, ∪, ∩, →, ~ > (disjunction, conjunction, implication, negation)
  • Lukaszewicz: A = < M, →, ~ > (implication, negation); ∪ and ∩ can be expressed through → and ~, whereas → and ~ are independent.
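The dependence of disjunction and conjunction on implication and negation can be checked mechanically. The sketch below assumes the ordinary two-valued reading of → and ~, which is a simplification: it ignores the ranks that USC adds to both operations. The definitions used are the standard ones, A or B as ~A → B and A and B as ~(A → ~B).

```python
from itertools import product

def implies(a, b):
    """Two-valued implication a → b."""
    return (not a) or b

def neg(a):
    """Two-valued negation ~a."""
    return not a

def disj(a, b):
    """Disjunction expressed through → and ~ only: ~a → b."""
    return implies(neg(a), b)

def conj(a, b):
    """Conjunction expressed through → and ~ only: ~(a → ~b)."""
    return neg(implies(a, neg(b)))

def definable_from_implication_and_negation():
    """Check both definitions against the built-in 'or' and 'and' on all inputs."""
    return all(
        disj(a, b) == (a or b) and conj(a, b) == (a and b)
        for a, b in product([False, True], repeat=2)
    )
```

The same functions give concrete witnesses for the properties stated above: `implies(True, False)` differs from `implies(False, True)`, so → is not commutative, and `implies(implies(False, False), False)` differs from `implies(False, implies(False, False))`, so it is not associative.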

USC goes further: its algebra also includes rules for ranking implication and negation in ascending order of magnitude (three grades of rank):

  • [→] = [→₁, →₂, →₃]: if … then = start of influence ⇒ influence ⇒ end of influence, respectively;
  • [~] = [~₁, ~₂, ~₃]: not = inside ~ superficially ~ outside.

There is a set of axioms within the scope of this “semantic” algebra.

I. Axioms of generation

  • Axiom of application. If X and Y are elements of the set M, then (X → Y) is also an element of M. The set elements are:
  1. X → Y – kernel pattern
  2. (X→Y) → Z – extended pattern
  3. ((X→Y)→Z) → ((…→…)→…) – complex pattern
  • Axiom of canonization 1. The left part of the complex pattern takes the following canonized forms:
  1. (X→Y) → Z (complex physical pattern)
  2. (X→Y) → X (complex information pattern)
  • Axiom of canonization 2. The starting element of the right part of the complex pattern is identical to the final element of the left part:
  1. ((X→Y)→Z) → ((Z→…)→…)
  2. ((X→Y)→X) → ((X→…)→…)

Given this, only the right part of the complex pattern needs to be recorded. The implication signs can also be omitted:

(X→Y) → Z ≡ (XY)Z, etc.

  • Axiom of fixation. The unary operation can be executed only on the element in the final position of the complex pattern.

The axioms of generation make it possible to construct a classifier of words and ideas on the basis of semantic modeling.
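The axioms of generation can be made concrete with nested pairs. This is a sketch under one assumption: a pattern X → Y is stored as the Python tuple `(X, Y)`. The helper names (`imp`, `render`, `compact`, `satisfies_canonization_2`) are invented for the example.

```python
def imp(a, b):
    """Axiom of application: from elements a and b, build the element a → b."""
    return (a, b)

def render(p, arrow="→", top=True):
    """Write a pattern with explicit arrows; inner patterns get brackets."""
    if isinstance(p, str):
        return p
    a, b = p
    body = f"{render(a, arrow, False)}{arrow}{render(b, arrow, False)}"
    return body if top else f"({body})"

def compact(p):
    """Same pattern with the implication signs omitted, e.g. (XY)Z."""
    return render(p, arrow="")

def first_element(p):
    """Leftmost elementary symbol of a pattern."""
    while not isinstance(p, str):
        p = p[0]
    return p

def last_element(p):
    """Rightmost elementary symbol of a pattern."""
    while not isinstance(p, str):
        p = p[1]
    return p

def satisfies_canonization_2(complex_pattern):
    """The right part must start with the element that ends the left part."""
    left, right = complex_pattern
    return first_element(right) == last_element(left)

kernel = imp("X", "Y")                                     # X→Y
extended = imp(kernel, "Z")                                # (X→Y)→Z
complex_pattern = imp(extended, imp(imp("Z", "W"), "Y"))   # ((XY)Z)((ZW)Y)
```

Here `compact(complex_pattern)` produces the string form ((XY)Z)((ZW)Y), and the canonization check passes because the left part ends in Z and the right part starts with Z.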

II. Axioms of transformation

  • Axiom of transposition. The right part of the complex pattern can be transformed by changing the sequence of operations (bracket shift):

(ZY)W → Z(YW) etc.

  • Axiom of diffusion. The right part of the complex pattern can be transformed by spreading the element in the first or second position to the second or third position.

  • Axiom of divergence. The left part of the complex pattern can be transformed by spreading the element in the first position to the third one:

(XY)Z → (XY)X

  • Axiom of correlation. Each X of the complex information pattern, except for X in the first position, correlates with Z in the complex physical pattern. Each Y of the complex information pattern, except for Y in the second position, correlates with W in the complex physical pattern.

((XY)X)((XY)Y) ↔ ((XY)Z)((ZW)W) etc.
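Three of these transformations can be reproduced on the examples given above. The sketch assumes that a pattern is stored as nested Python tuples for transposition and divergence, and as a compact string for correlation; the function names are illustrative, and symbols other than X and Y are left unchanged by `correlate` because the axiom says nothing about them.

```python
def compact(p, top=True):
    """Write a nested-tuple pattern without implication signs, e.g. Z(YW)."""
    if isinstance(p, str):
        return p
    body = "".join(compact(part, top=False) for part in p)
    return body if top else f"({body})"

def transpose(p):
    """Axiom of transposition (bracket shift): (ab)c becomes a(bc)."""
    (a, b), c = p
    return (a, (b, c))

def diverge(left):
    """Axiom of divergence: spread the first element to the third position."""
    (a, b), _ = left
    return ((a, b), a)

def correlate(info):
    """Axiom of correlation on a compact string: X becomes Z except in the
    first position, Y becomes W except in the second position."""
    out, position = [], 0
    for ch in info:
        if ch.isalpha():
            position += 1
            if ch == "X" and position != 1:
                ch = "Z"
            elif ch == "Y" and position != 2:
                ch = "W"
        out.append(ch)
    return "".join(out)
```

Applied to the examples in the text, `transpose` turns (ZY)W into Z(YW), `diverge` turns (XY)Z into (XY)X, and `correlate` maps ((XY)X)((XY)Y) to ((XY)Z)((ZW)W).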

The following tables list the USC patterns.

Physical Patterns (actions)

Active        Pattern   Passive        Pattern
create        (ZW)Y     originate      Z(WY)
construct     (ZW)Y’    form           Z(WY’)
restore       (ZW)Y”    preserve       Z(WY”)
erase         (ZY)W     vanish         Z(YW)
destroy       (ZY)W’    deform         Z(YW’)
damage        (ZY)W”    waste          Z(YW”)
compress      (ZW)W     merge          Z(WW)
join          (ZW)W’    fix            Z(WW’)
concentrate   (ZW)W”    group          Z(WW”)
decompress    (ZY)Y     dissolve       Z(YY)
separate      (ZY)Y’    detach         Z(YY’)
disperse      (ZY)Y”    scatter        Z(YY”)
lead in       (ZZ)W     go in          Z(ZW)
bring         (ZZ)W’    come           Z(ZW’)
target        (ZZ)W”    arrange        Z(ZW”)
bring out     (ZZ)Y     go out         Z(ZY)
move away     (ZZ)Y’    move aside     Z(ZY’)
depart        (ZZ)Y”    diverge        Z(ZY”)
hang          (ZW)Z     dangle         Z(WZ)
lift          (ZW)Z’    rise           Z(WZ’)
take off      (ZW)Z”    leave          Z(WZ”)
drop          (ZY)Z     fall           Z(YZ)
lower         (ZY)Z’    descend        Z(YZ’)
put           (ZY)Z”    stay           Z(YZ”)
rotate        (ZZ)Z     turn           Z(ZZ)
circulate     (ZZ)Z’    encircle       Z(ZZ’)
move          (ZZ)Z”    displace       Z(ZZ”)

Information Patterns (actions)

Active        Pattern   Passive        Pattern
order         (XW)Y     submit         X(WY)
convince      (XW)Y’    agree          X(WY’)
allow         (XW)Y”    act            X(WY”)
abolish       (XY)W     resist         X(YW)
dissuade      (XY)W’    insist         X(YW’)
forbid        (XY)W”    do nothing     X(YW”)
instruct      (XW)W     perceive       X(WW)
teach         (XW)W’    fix            X(WW’)
enlighten     (XW)W”    remember       X(WW”)
break off     (XY)Y     lose           X(YY)
confuse       (XY)Y’    forget         X(YY’)
deceive       (XY)Y”    trust          X(YY”)
inform        (XX)W     know           X(XW)
declare       (XX)W’    understand     X(XW’)
tell          (XX)W”    apprehend      X(XW”)
hide          (XX)Y     don’t know     X(XY)
conceal       (XX)Y’    guess          X(XY’)
keep silent   (XX)Y”    err            X(XY”)
raise         (XW)X     be proud       X(WX)
inspire       (XW)X’    triumph        X(WX’)
love          (XW)X”    enjoy          X(WX”)
humiliate     (XY)X     be distressed  X(YX)
insult        (XY)X’    take offense   X(YX’)
detest        (XY)X”    suffer         X(YX”)
think         (XX)X     live           X(XX)
want          (XX)X’    plan out       X(XX’)
be able       (XX)X”    suppose        X(XX”)

Note: The quote marks ( ’ and ” ) mark action stages, e.g., drop ( ) → lower ( ’ ) → put ( ” ).
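The figure of 108 base actions given earlier is consistent with the layout of the two tables: two kinds of pattern (physical, starting with Z, and information, starting with X), two bracketings (active and passive), nine combinations of second and third element, and three stages. This decomposition is an observation about the tables rather than a derivation stated in the text, and the function name `base_patterns` is made up for the sketch.

```python
from itertools import product

# No mark, single mark and double mark: the three action stages.
STAGES = ["", "’", "”"]

def base_patterns():
    """Enumerate (kind, voice, pattern) for every row and column of the tables."""
    patterns = []
    for subject, kind in (("Z", "physical"), ("X", "information")):
        elements = ["W", "Y", subject]
        for second, third, stage in product(elements, elements, STAGES):
            # Active form: (ab)c with the stage mark at the end.
            patterns.append((kind, "active", f"({subject}{second}){third}{stage}"))
            # Passive form: a(bc) with the stage mark inside the bracket.
            patterns.append((kind, "passive", f"{subject}({second}{third}{stage})"))
    return patterns
```

The enumeration yields 2 × 2 × 9 × 3 = 108 distinct strings, including (ZW)Y for “create” and X(XX”) for “suppose”.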

USC Semantic Inference

The USC semantic inference is based on the axioms of the USC algebra and is formed as an oriented (directed) graph. The nodes of the graph are USC primitives, and the arcs are the USC algebra axioms. The solution of a semantic problem can be realized as a path formed by a sequence of connected arcs. The problem-solving algorithm follows this path from the initial state to the target one.

As a result of the inference search, three types of patterns can be selected:

  • Unary pattern: ZZW, “lead in”
  • Double pattern: ZZW ZZW, “lead in” + “lead in” = “pump up”
  • Confluence of two patterns: ZZW ZWZ, “lead in” + “hold” = “put in”

As an example of defining a domain Knowledge Base: the vocabulary item “to create” is described by the complex string ((XY)Z)((ZW)Y), which reads: X by means of Y affects Z so that ZW exists. The mapping of objects to the USC string is: who (“X”), by means of what (“Y”), affects what (“Z”), creates what (“ZW”).

The primitives for the initial and target states are placed on the relevant nodes of the graph, defining a path between them. The transfer algorithm reduces to correlating the initial and target states and can be defined as a one-step inference, which takes the form of a complex string.

Take the sentence “The master restores a picture”. The item “to restore” is a reference item that points to “to preserve”. The following structure is suggested in the item: ((XY)Z)((ZW)Y”), which reads: “X” by means of “Y” affects “Z” so that “Z” is preserved.

After mapping items to symbols, the sentence reads: “The master by means of some set of tools affects the picture so that it is preserved”. The inference then follows the axiom ZWY → ZWW (preserve ⇒ concentrate, join, compress; combine, contract, squeeze).

The USC algorithm then searches the list of typical effects in the knowledge base for tools for preserving and restoring pictures.

Typical effects are complex strings that form the basis for classifying physical, chemical, geometrical, informational and other effects. The difference between axioms and typical effects is that axioms are true irrespective of the values of the variables composing them, whereas typical effects are not. For example, the physical effect of magnetic separation depends entirely on the values of variables such as the magnetic field and the metal, whereas the USC string for the action “separate” can be filled with a variety of domain-independent values, including the values of this effect.
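The distinction can be sketched as two small record types: an axiom applies to any assignment of values, while a typical effect carries a condition on the values. Everything here is hypothetical, including the class names, the keys `means` and `object`, and the condition chosen for magnetic separation; the text does not say which USC variable holds the magnetic field or the metal.

```python
from dataclasses import dataclass
from typing import Callable, Dict

Bindings = Dict[str, str]

@dataclass(frozen=True)
class Axiom:
    """A transformation that holds whatever values the variables take."""
    source: str
    target: str

    def applies(self, bindings: Bindings) -> bool:
        return True

@dataclass(frozen=True)
class TypicalEffect:
    """A complex string that holds only for particular variable values."""
    name: str
    pattern: str
    condition: Callable[[Bindings], bool]

    def applies(self, bindings: Bindings) -> bool:
        return self.condition(bindings)

# Bracket shift from the axioms of transformation.
transposition = Axiom(source="(ZY)W", target="Z(YW)")

# Assumed condition, for illustration only.
magnetic_separation = TypicalEffect(
    name="magnetic separation",
    pattern="(ZY)Y’",  # "separate" in the table of physical patterns
    condition=lambda b: b.get("means") == "magnetic field" and b.get("object") == "metal",
)
```

The axiom accepts any bindings, whereas `magnetic_separation.applies` returns `True` only when the means is a magnetic field and the object is metal.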

© SEMPL.net, 2001
