KM 5433 Blog/Joe Colannino

A blog discussing knowledge management and library science issues.

Saturday, October 28, 2006

Post Script: Semantics and Ontologies/J. Colannino

In the Ding blog below, I commented on ontologies and semantic models. I have added this postscript because it now occurs to me that online translators are working examples of semantic models. Photocopy machines are examples of transcription without semantics. One way to test a transcription error rate is to make copies of copies seriatim. What starts out as perfectly rectangular grids and circular coordinates quickly degrades, because each generation of copy amplifies the distortions of the last. In the same way, one can note deficiencies in online translators by translating a text from source to receptor language, switching the source and receptor languages and retranslating, and repeating the cycle until reaching a stable source-receptor phrase couple.

As I am familiar with French, I will choose English and French as the starting source and receptor languages. For example, I would normally use “Ça va?” to capture the equivalent of the English idiom “How’s it going?” To be precise, “Ça va?” means literally, “It goes?” Indeed, the native French speaker could interpret this phrase in different ways depending on the context. If used as a greeting, it would be interpreted as “How’s it going?” On the other hand, in different contexts the phrase could also mean “Is that acceptable?” or “Is that enough?” or “Is that a problem?”

Putting “How’s it going?” into Babel Fish gives “Comment va-t-il?” This is a perfectly acceptable translation, and actually more exact, meaning “How is it going with you?” (Actually, the literal word order is “How goes you it,” which is improper syntax in English but proper in French. So I give Babel Fish good marks for using the appropriate receptor language syntax.) Putting “Comment va-t-il?” back into the translator and switching the source and receptor languages, Babel Fish back-translates this to “How is it?” Subsequently, Babel Fish re-translates this to “Comment est il?” At this point, English and French happen to share the same syntax. Once arriving at an identical syntax, no further change occurs; thus we have arrived at a stable French-English couple.
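The round trip described above amounts to iterating until the phrase couple stops changing. Here is a minimal Python sketch of that loop. It does not call Babel Fish or any real translation service: the two lookup tables are hypothetical stand-ins that only replay the outputs reported in this post (the last French-to-English entry is inferred from the statement that no further change occurs), and the function name round_trip is my own.

```python
# Toy stand-ins for an online translator. These tables only replay the
# Babel Fish outputs reported in this post; they are not a real API.
EN_TO_FR = {
    "How's it going?": "Comment va-t-il?",
    "How is it?": "Comment est il?",
}
FR_TO_EN = {
    "Comment va-t-il?": "How is it?",
    # Inferred from the post: the couple is stable at this point.
    "Comment est il?": "How is it?",
}


def round_trip(english, max_cycles=10):
    """Translate back and forth until the (English, French) couple repeats."""
    history = []
    for _ in range(max_cycles):
        french = EN_TO_FR[english]
        couple = (english, french)
        if history and history[-1] == couple:
            # Same couple twice in a row: a stable source-receptor couple.
            return couple, history
        history.append(couple)
        english = FR_TO_EN[french]
    raise RuntimeError("no stable couple within max_cycles")


stable, history = round_trip("How's it going?")
print("stable couple:", stable)
for en, fr in history:
    print(en, "->", fr)
```

With a real translator, the two dictionary lookups would be replaced by calls to the service, and the max_cycles guard matters more, since nothing guarantees that an arbitrary phrase settles quickly.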

Stability aside, the couple does not really match the semantics of the original question. In English, “How is it?” might be a question to solicit an opinion of taste or meeting of expectation, but it would never suffice as a greeting. That is, “How is it?” expressed as a greeting would probably elicit the response “How’s what?” – a plea for clarification.

This leads me to believe that we can better enable semantics at the expense of syntax in automated search. In other words, if we sacrifice syntactical variety we would immediately generate the stable French-English couple “How goes you it?” ↔ “Comment va-t-il?” If you apply French syntax to the English phrase it makes sense; otherwise it is gibberish.

The beauty of this observation is that stable syntax is something computers rely upon. Perhaps what we need to enable the semantic web is not an “ontology” at all but a syntactical map whereby similar semantic constructs are associated in syntactically identical ways. Then the focus would be on identifying parts of speech and ordering them uniformly. For our example: “How’s it going?”, “How are things?” and “Is it going well?” are all mapped to an identical syntax – “How goes you it?” or whatever. In effect, we would be creating a new language with uniform syntactical structure for the sole purpose of associating semantic constructs. Well, maybe this has already been done, but I haven’t come across it. If you have, perhaps you would be gracious enough to inform me by comment.
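To make the idea of a syntactical map concrete, here is a toy Python sketch using the three phrases from the example above. Everything in it is an assumption for illustration: the map is written by hand, the names SYNTAX_MAP, canonicalize, and same_meaning are hypothetical, and the hard part (identifying parts of speech and reordering them uniformly) is not attempted at all.

```python
# Canonical form taken from the post's own example ("or whatever").
CANONICAL_GREETING = "How goes you it?"

# Hand-written toy map. A real system would derive these entries by
# tagging parts of speech and reordering them, which this sketch skips.
SYNTAX_MAP = {
    "how's it going?": CANONICAL_GREETING,
    "how are things?": CANONICAL_GREETING,
    "is it going well?": CANONICAL_GREETING,
}


def canonicalize(phrase):
    """Return the canonical form of a known phrase, or None if unmapped."""
    # Normalize case, surrounding spaces, and curly apostrophes.
    key = phrase.strip().lower().replace("\u2019", "'")
    return SYNTAX_MAP.get(key)


def same_meaning(a, b):
    """Two phrases match when both map to the same canonical form."""
    ca, cb = canonicalize(a), canonicalize(b)
    return ca is not None and ca == cb


print(same_meaning("How's it going?", "How are things?"))
print(same_meaning("How's it going?", "How is it?"))
```

The point of the sketch is only that a search over canonical forms becomes an exact string comparison, which is the kind of stable syntax computers handle well.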

Thanks,

Joe Colannino