Growing activity in semantic research and development rests on an understanding of semantics borrowed from linguistics and applied intuitively in new fields. Many of these fields, systems development for example, are distinctly different from the linguistic domain. This calls for rethinking the semantic terms and defining them for the new purpose.

Meanwhile, two things are happening:

  • All existing development has been done in the absence of a clear, guiding definition of semantics for the domain of software engineering.
  • A number of definitions are in use, applied in a broad and chaotic manner; they contradict one another and grow fuzzier over time, with total loss of meaning as the end state. The same has happened to many popular terms (for example, the term "technology"), which eventually lose all informative value in most contexts.

Surprisingly, the loss of a well-defined meaning makes a word so easy and convenient to use in any context, without thinking, that it reinforces the practice of generating nonsense and firmly keeps the term in the vocabulary of frequent words.

To soften this negative influence and to ensure progress in Semantic Technology, clear definitions of the semantic terms are required.

What is the nature of semantics in software?

Meaning is contextual and has two sources:

  • an unknown context that cannot be established
  • the fundamental reflection of space and time in language

We will discuss the second source as the one more suitable for machine processing; for the first, it is hard to predict what context a person has in mind.

So, the nature of semantics can be traced to time and space relationships and represented by predicates. The dominant significance of predicates for semantics is recognized in computational linguistics.

This is a fundamental conclusion, and it runs against the deeply embedded view of the hierarchy of ontological classes as the main tool for representing semantics. The relationship, or predicate, and not only the class, is a core semantic component of knowledge representation.

Complexity

The combinatorial diversity of relationships is a main source of complexity in systems. The number of possible pairwise relationships grows quadratically with the number of entities, and there can be more than one relationship per pair of objects, so a complex system can end up with an overwhelming number of relationships, far greater than the number of entities. Modeling each relationship as unique is a source of complexity, so classification of relationships is a strong requirement for knowledge representation.
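
As a rough illustration of this growth, the sketch below counts the upper bound on relationships, assuming undirected pairwise relationships and a fixed number of relationship kinds per pair. The function name and the parameter kinds_per_pair are hypothetical, introduced only for this example.

```python
from math import comb


def max_relationships(entities: int, kinds_per_pair: int = 1) -> int:
    """Upper bound on relationships among `entities` objects.

    Assumes undirected pairwise relationships: n * (n - 1) / 2 pairs,
    each carrying up to `kinds_per_pair` distinct relationships.
    """
    return comb(entities, 2) * kinds_per_pair


for n in (10, 100, 1000):
    # entities, pairs with one relationship each, pairs with three each
    print(n, max_relationships(n), max_relationships(n, kinds_per_pair=3))
```

With 1,000 entities the bound is already close to half a million pairs, which is why modeling every relationship as unique does not scale.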

Natural language addresses this problem with a proportion of verbs to nouns that is the reverse of the combinatorial one: the number of nouns is far greater than the number of potential predicates represented by verbs (by more than an order of magnitude), but the overall number of verbs is still huge (about 25,000 in WordNet [1]). Synonymy reduces this number substantially, and linguistic research shows that predicates can be grouped into about 200 classes [2]. This is a major result that can be used to reduce complexity.

Besides, the modeling of relationships is itself not normalized and differs from system to system.

Basic Semantic Definitions

We see a possibility of encoding meaning by matching it to a category from a limited number of predefined predicate types, or classes of predicates. These classes map the meanings and provide the basis for expressing semantics in software systems.

This brings us to the definitions in software development:

Semantics is a form of representing knowledge through data classes, predicates (properties), axiomatized relationships between classes, and classes of predicates (properties). Semantics is a formal representation of knowledge.

Semantics resides in specially constructed data structures. This allows new knowledge and facts to be derived from the structures themselves, regardless of the data instances populating these structures (thus abstracting semantics and providing a good separation of semantics from data and code). The main idea of semantic technology is the separation of semantics from code, as was done with logic (rules) in the past.

Semantic Coding is the process of mapping identified semantic structures (patterns) to a Semantic Code, numeric or alphanumeric, that corresponds to an enumerated list of predicate classes.

The major problem is to establish a systematic list of Semantic Codes for developers of interoperable software.
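
To make the idea of Semantic Coding concrete, here is a minimal sketch of an enumerated list of predicate classes and a lookup from verbs to codes. Everything in it is an assumption made for illustration: the class names, the codes "P01" to "P03", and the verb table are placeholders, not the systematic list the text calls for and not taken from any of the cited sources.

```python
from enum import Enum
from typing import Optional


class PredicateClass(Enum):
    """Hypothetical enumerated predicate classes with placeholder codes."""
    TRANSFER = "P01"
    MOTION = "P02"
    CREATION = "P03"


# Illustrative table: many verbs collapse into a few predicate classes.
VERB_TO_CLASS = {
    "give": PredicateClass.TRANSFER,
    "send": PredicateClass.TRANSFER,
    "move": PredicateClass.MOTION,
    "go": PredicateClass.MOTION,
    "build": PredicateClass.CREATION,
    "write": PredicateClass.CREATION,
}


def semantic_code(verb: str) -> Optional[str]:
    """Return the placeholder code for a verb, or None if it is unmapped."""
    predicate_class = VERB_TO_CLASS.get(verb.lower())
    return predicate_class.value if predicate_class else None


print(semantic_code("send"))   # same code as "give"
print(semantic_code("sleep"))  # unmapped verb
```

Two systems that agree on such a list can exchange codes instead of free-form relationship names, which is the interoperability point made above.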

Semantic Category is a coded structure that can be canonized in natural language and includes: the Main Agent, the Instrument of the action (where the action is a predicate represented by a verb), the Object of the action or Peer Agent, and the Result, which is lexically derived and can be an Event or a Content Object. A specific Semantic Category represented by this structure is a Semantic Pattern.
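
One possible reading of this structure as a data type is sketched below. The field names, the code "P03", and the sentence template in canonical() are assumptions chosen for the example; the text does not prescribe a concrete layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SemanticPattern:
    """A coded structure with the four roles named in the definition."""
    code: str        # hypothetical Semantic Code of the predicate class
    action: str      # verb that expresses the predicate
    main_agent: str  # Main Agent
    instrument: str  # Instrument of the action
    target: str      # Object of the action or Peer Agent
    result: str      # Result: an Event or a Content Object

    def canonical(self) -> str:
        """Render the pattern as a canonical natural-language sentence."""
        return (f"{self.main_agent} {self.action} {self.target} "
                f"with {self.instrument}, producing {self.result}")


pattern = SemanticPattern(
    code="P03", action="writes", main_agent="author",
    instrument="editor", target="article", result="publication",
)
print(pattern.canonical())
```

The frozen dataclass keeps a pattern immutable, so it can be used as a dictionary key or stored in a set when patterns are compared or indexed.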

Semantic Patterns make it possible to define abstract axiomatic relationships between categories, based on the nature of the predicates, and they reflect the representation of time and space in the particular category. These axioms are universal for any domain and are part of the Semantic Inference Engine (Semantic Processor).

Semantic Processing is the inference of implications over a specific, semantically encoded set of data, using universal semantic codes and semantic axioms. It can cover a diverse set of tasks, such as semantic process control, semantic query understanding, and unstructured text analysis: in principle any task, since the representation is highly abstract and domain-independent.
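
The following sketch shows, under strong simplifying assumptions, how an axiom attached to a semantic code rather than to a domain could drive such inference. The code "P01", the transfer axiom, and the "has" relation are hypothetical placeholders; a real Semantic Processor would work from an agreed list of codes and axioms.

```python
from typing import Callable, Dict, List, NamedTuple, Tuple


class Fact(NamedTuple):
    """A semantically encoded statement: code plus the entities involved."""
    code: str   # hypothetical Semantic Code of the predicate class
    agent: str  # Main Agent
    obj: str    # Object of the action
    peer: str   # Peer Agent


Implication = Tuple[str, str, str]


def transfer_axiom(fact: Fact) -> List[Implication]:
    """Assumed axiom: after a transfer, the peer holds the object."""
    return [("has", fact.peer, fact.obj)]


# Axioms are keyed by code, so they apply to any domain that uses the code.
AXIOMS: Dict[str, Callable[[Fact], List[Implication]]] = {
    "P01": transfer_axiom,
}


def infer(facts: List[Fact]) -> List[Implication]:
    """Apply the axiom registered for each fact's code, if there is one."""
    implications: List[Implication] = []
    for fact in facts:
        axiom = AXIOMS.get(fact.code)
        if axiom is not None:
            implications.extend(axiom(fact))
    return implications


facts = [
    Fact("P01", "bank", "payment", "customer"),   # banking domain
    Fact("P01", "warehouse", "parcel", "shop"),   # logistics domain
    Fact("P99", "user", "report", "printer"),     # no axiom registered
]
print(infer(facts))
```

The banking and logistics facts share one code and therefore one axiom, which is the sense in which the axioms are independent of the domain.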

In addition, there are tasks of semantic transcoding to and from other models of representation.

The semantic code relationship allows an exhaustive representation of an entity as a code in any act of communication. So, in Semantic Processing the meaning of an entity is always represented explicitly and in context, which makes it possible to generate an ontology. This confirms the secondary role of ontology in semantic representation.

The Universal Semantic Code [3], which reflects time and space, is empirically derived. We are working on an approach to deriving the code formally and, in the meantime, prefer to call the established semantic formulas Semantic Patterns.

Bibliography

  1. G. A. Miller, WordNet lexical database of English, 2008
  2. B. Levin, English verb classes and alternations: a preliminary investigation, 1993
  3. V. Martynov, Foundations of Semantic Coding, 2001

© SemPL.net, 2009
