© Viatcheslav Yatsko, 2005 COMPUTATIONAL LINGUISTICS LABORATORY

LINGUISTICS AND INFORMATICS

This Web site has been created by V. Yatsko (last name also spelt "Iatsko"), head of the Computational Linguistics Laboratory (CLL), Professor in the Department of Information Technologies and Systems at Katanov State University of Khakasia (KSU) located in Abakan, Russia.
The CLL at KSU was founded in 2002 to conduct work in the following areas.
1. Applied linguistics research and development of computer systems for language teaching, including foreign language teaching (FLT). Four such systems have been created so far (see below).

2. Automatic text summarization research. V. Yatsko is the author of the symmetric summarization concept that underlies PASS and ETS and makes it possible to produce coherent and adequate summaries. For details see [1-4].
3. Evaluation of Internet information retrieval systems. V. Yatsko is the author of the depth of user's search concept described in a paper submitted to Dialog-2006 (http://www.dialog-21.ru/default.asp). A reference dictionary concept is being developed to evaluate automatic text summarization systems as well as Internet information retrieval systems.

4. Discourse analysis. The integrational discourse analysis concept suggested by V. Yatsko [5-7] distinguishes between surface and deep levels of discourse structure. Currently we are investigating various types of possessive discourse and the linguistic features of possessive relations, differentiating between alienable and inalienable possession [8].

5. Computer learner corpora research project. This ongoing project is aimed at 1) creating corpora of texts (dictations, expositions, compositions, etc.) produced by Russian-speaking learners of English; 2) creating tools for error tagging and automatic analysis of these corpora; 3) contrastive analysis of Russian learner corpora and corpora produced by speakers of other languages. The project is in line with research done by Granger et al. [9].

OUR PRODUCTS

Four products are available so far:
– Compare – a program to be used in comparative linguistics;
– PASS – a semi-automatic network text summarization system to be used in FLT;
– TITE – a bilingual network translation system to be used in FLT;
– ETS – an event-tracking summarizer.

Compare allows the user to determine the degree of correspondence between the basic dictionaries of two languages and then, using M. Swadesh’s formula, to estimate the time depth separating two genetically related languages, i.e. to find out when the two languages separated from their proto-language.
When working with Compare the user is required to 1) compile basic dictionaries for two languages or use/modify existing ones (Compare contains a basic dictionary for English compiled by Swadesh); 2) set correspondences between symbols in pairs of words using laws of historical language development (e.g. Grimm’s law and Verner’s law for Germanic languages); for example, the initial consonants of English drink and German trinken can be safely identified with each other; 3) compute the time depth by Swadesh’s formula. Compare is linked with the Scientific WorkPlace mathematical editor.
Compare can work with any pair of languages that use the Latin or Cyrillic alphabet.
System requirements: Windows 98/NT/2000/XP; 0.5 MB free disc space, 300 MHz processor.
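The time-depth computation can be illustrated with the standard glottochronological formula t = ln(c) / (2 ln(r)), where c is the share of basic-dictionary pairs judged to correspond and r is the share of basic vocabulary retained per millennium. This is only a sketch: the function names are hypothetical, and the default r = 0.86 (a value commonly cited for the 100-word list) is an assumption, since the page does not say which constant Compare uses.

```python
import math


def cognate_share(judgements):
    """Share of word pairs judged to correspond (True) in two basic dictionaries."""
    return sum(judgements) / len(judgements)


def time_depth(shared, retention_rate=0.86):
    """Estimated separation time in millennia: t = ln(c) / (2 * ln(r)).

    retention_rate=0.86 is an assumed constant, not taken from Compare.
    """
    if not 0.0 < shared <= 1.0:
        raise ValueError("shared must be in (0, 1]")
    if not 0.0 < retention_rate < 1.0:
        raise ValueError("retention_rate must be in (0, 1)")
    return math.log(shared) / (2.0 * math.log(retention_rate))
```

With these assumptions, two languages sharing about 74% of the basic dictionary come out at roughly one millennium of separation.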

PASS and TITE are network computer systems designed to facilitate interaction between the teacher and the student. Unlike local systems distributed on CDs, these systems can be used to teach an unlimited number of students simultaneously. They can be easily integrated into existing foreign language curricula and adjusted to any level of language proficiency. The systems generate statistical data about students' mistakes, thus providing the teacher with important feedback about students' progress.

PASS (partially automated symmetric summarization) allows the user to summarize scientific and newspaper English texts of any size. It is based on the symmetric summarization methodology [2]. The general idea underlying PASS is as follows. To summarize a text, an FLT student is given the following assignments.
1) Make up a dictionary of specialist terms pertaining to the subject field of the paper. The summarization process won't start until the student enters correct dictionary terms into the system. Since the student makes an effort to compile a domain dictionary, he or she is expected to memorize the terms well enough to use them in speech.
2) Change the sequence of sentences in the summary.
Based on the dictionary provided by the student, PASS produces a summary. Sentences in the summary appear in random order. The student must place the sentences in the same order as in the original.
3) Make the summary coherent.
Sentences in the summary may contain markers of connections with other sentences that were not selected during summarization. The student uses a number of transformation procedures (such as deletion, insertion, modification) to make the summary coherent, thus better memorizing syntactic structures and the means of speech coherence. The resulting coherent summary is then submitted to the teacher.
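The first two assignments can be sketched in a few lines of code. This is not the symmetric summarization algorithm described in [2]; it is a simplified stand-in that scores sentences by how often the dictionary terms occur in them, and all function names are hypothetical.

```python
import random
import re


def split_sentences(text):
    # Naive splitter: breaks after '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extract_summary(text, terms, size):
    """Pick the `size` sentences with the most term hits.

    Returns (original_index, sentence) pairs in original order.
    """
    lowered = [t.lower() for t in terms]
    scored = []
    for idx, sentence in enumerate(split_sentences(text)):
        hits = sum(sentence.lower().count(t) for t in lowered)
        if hits:
            scored.append((hits, idx, sentence))
    best = sorted(scored, key=lambda item: (-item[0], item[1]))[:size]
    return sorted((idx, s) for _, idx, s in best)


def shuffled_for_student(summary, seed=None):
    """Return the summary sentences in random order (assignment 2)."""
    items = list(summary)
    random.Random(seed).shuffle(items)
    return items


def order_is_restored(student_order):
    """True if the student put the sentences back in their original order."""
    indices = [idx for idx, _ in student_order]
    return indices == sorted(indices)
```

Keeping the original sentence index next to each sentence is a design choice made here so that the student's ordering can be checked without comparing texts.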
PASS is a distributed system that comprises Student’s Module and Teacher’s Module linked by a specialized server. It doesn’t require the installation of an SQL server.
By means of Teacher’s Module the teacher can 1) load texts for summarization and reference dictionaries; 2) assign texts to particular students; 3) set the summary size; 4) get statistics about students’ mistakes (how many students entered one and the same word incorrectly).
By means of Student’s Module the student can 1) get connected to the server; 2) enter dictionary words and summarize a given text; 3) edit the summary using transformation procedures; 4) submit the resulting summary to the teacher.

A local version of PASS can be used as an event-tracking summarizer (ETS). Suppose your teacher has assigned you Theodore Dreiser's novel "Jennie Gerhardt". Instead of reading the whole novel, type "Jennie" and "Gerhardt" in the dictionary section of PASS, and it will provide a summary of the main events that happen to the character in the whole book or in a separate chapter or chapters. To get a fuller view of the book, type "Jennie", "Gerhardt", "Lester", "Kane". This function can also be of use to the teacher: you won't have to read all the books that your students selected for individual reading. Just summarize them with PASS and check the students' progress.
ETS can summarize texts of all genres and lengths.
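Event tracking of this kind can be approximated by keeping only the sentences that mention one of the names typed into the dictionary section, chapter by chapter. The sketch below is an illustration under that assumption, not the ETS implementation; the function name and the chapter dictionary layout are hypothetical.

```python
import re


def track_events(chapters, names, only=None):
    """Collect sentences that mention any of `names`, grouped by chapter.

    chapters: dict mapping a chapter title to its text.
    only: optional collection of chapter titles to restrict the search to.
    """
    wanted = [n.lower() for n in names]
    result = {}
    for title, text in chapters.items():
        if only is not None and title not in only:
            continue
        # Naive sentence split after '.', '!' or '?' followed by whitespace.
        sentences = re.split(r"(?<=[.!?])\s+", text.strip())
        hits = [s for s in sentences if any(n in s.lower() for n in wanted)]
        if hits:
            result[title] = hits
    return result
```

Passing more names widens the set of tracked events, and `only` limits the summary to a separate chapter or chapters.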

System requirements for PASS and ETS: Windows 98/NT/2000/XP; 1 MB free disc space, 500 MHz processor.

TITE (translation in teaching English) is a bilingual translation system to be used in FLT during out-of-school activities. The general idea underlying TITE is as follows. At the beginning of the term a student is given a number of texts to be translated from his or her native tongue into the target language or vice versa. These texts fall into two groups: limited and unlimited. A limited text has a time limit: the student must translate it in the time allotted by the teacher (for example, 15 minutes). If the student fails to keep within the time limit, all the text he or she has translated is deleted. The translation stops as soon as the student makes a mistake (enters an incorrect symbol). Unlimited texts are larger than limited ones and don’t have a time limit; the student can save the results and proceed with the translation next time. As soon as the student finishes translating a text, a record is made in the log on the server so that the teacher can check the student’s progress. By the end of the term the student must translate all the texts assigned to him or her in order to get credit.
TITE is a distributed system that comprises Student’s Module and Teacher’s Module linked by an SQL server.
By means of Teacher’s Module the teacher can 1) assign passwords and network names to students; 2) enter information about students, such as first and last name, group number, year of study; 3) assign texts to students for translation and load reference translations; 4) set text characteristics, such as the direction of translation, limited or unlimited type, the time allotted for translation of limited texts, and the sequence of texts to be translated; 5) select options and functions that modify the complexity of translation: Synonyms, Punctuation, Dictionary+, Dictionary-; 6) get statistics about the number of students’ mistakes in specific words.
The "Synonyms" option, when checked, allows the student to read synonyms of the word that is being translated at the moment.
When the "Punctuation" option is checked by the teacher, punctuation marks as well as white spaces separating words appear automatically; the student doesn't have to enter them.
The Dictionary+ function allows the teacher to make up a dictionary of words that must be translated by the student; the rest of the words in the text will appear automatically.
The Dictionary- function allows the teacher to make up a dictionary of words that the student doesn't have to translate; they will appear automatically.
These options and functions allow the teacher to adjust the translation process to any level of foreign language proficiency.
By means of Student’s Module the student can 1) get connected to the SQL server; 2) enter a translation (TITE stops and signals at each incorrect symbol); 3) save unlimited texts; 4) get synonym help if this option was turned on by the teacher. Synonyms of the word that is currently being translated appear in the appropriate section of the program’s window.
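The symbol-by-symbol checking and the "Punctuation" option can be sketched as a comparison of the student's input with the reference translation. The code is an illustration only: the function name is hypothetical, and treating "punctuation" as ASCII punctuation plus whitespace is an assumption, not a description of how TITE is implemented.

```python
import string


def check_translation(reference, typed, auto_punctuation=False):
    """Compare typed input with the reference translation symbol by symbol.

    Returns (accepted_text, error_index); error_index is the position in
    `typed` of the first incorrect symbol, or None if no mistake was found.
    With auto_punctuation=True, punctuation marks and white spaces are
    filled in automatically and must not be typed by the student.
    """
    accepted = []
    pos = 0  # position of the next unread symbol in `typed`
    for ch in reference:
        if auto_punctuation and (ch in string.punctuation or ch.isspace()):
            accepted.append(ch)  # appears automatically
            continue
        if pos >= len(typed):
            break  # the student has not typed this far yet
        if typed[pos] != ch:
            return "".join(accepted), pos  # stop at the incorrect symbol
        accepted.append(ch)
        pos += 1
    if pos < len(typed):
        return "".join(accepted), pos  # extra symbols after the reference
    return "".join(accepted), None
```

The Dictionary+ and Dictionary- functions could be handled the same way, by auto-filling whole words instead of single punctuation marks.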
TITE can process any texts in any language.
System requirements: the database is installed on SQL Server 2000; Teacher's Application and Student's Application can be installed on Windows 98/NT/2000/XP; they require 2 MB free disc space, 700 MHz processor.

References

1. Яцко В.А. Симметричное реферирование: теоретические основы и методика [Yatsko V.A. Symmetric summarization: theoretical foundations and methods] // Научно-техническая информация. Сер. 2. 2002. № 5 (in Russian)
2. Iatsko V. Linguistic aspects of summarization In: Philologie im Netz. 2001. No 18.
www.fu-berlin.de/phin/phin18/p18i.htm
3. Yatsko V., Shilov S., Vishniakov T. A Semi-automatic Text Summarization System In: Proceedings of the 10th International Conference on Speech and Computer. Patras, 2005. P. 283-288
4. Yatsko V., Shilov S., Vishniakov T. Semi-automatic Text Summarization and Foreign Language Teaching In: Philologie im Netz. 2005. No 34. P. 48-59
http://www.fu-berlin.de/phin/phin34/p34i.htm
5. Iatsko V. Deep structure of proposition and deep structure of discourse In: Linguistics in Potsdam. 1998. No 4.
6. Iatsko V. Textual deep structure In: Text, Speech, Dialogue. Proceedings of the First Workshop on Text, Speech, Dialogue (TSD'98). Brno, Czech Republic, September 23-26, 1998
7. Яцко В.А. Рассуждение как тип научной речи [Yatsko V.A. Reasoning as a type of scientific discourse]. Абакан: Изд-во Хакасского гос. ун-та [Abakan: Khakas State University Press]. 1998 (in Russian)
8. Iatsko V. Possessive and existential sentences in Russian and in English In: Philologie im Netz. 2000. No 14.
www.fu-berlin.de/phin/phin14/p14i.htm
9. Iatsko V. A review of Granger, Sylviane, Joseph Hung and Stephanie Petch-Tyson ed. (2002) Computer Learner Corpora, Second Language Acquisition and Foreign Language Teaching In: LINGUIST List 14.1098 Apr 14 2003
http://linguistlist.org/issues/14/14-1098.html#1

CONTACTS

Laboratory's address: Lenin Street 90, Abakan, Russia, 655017
Tel/fax:++(3902)243364
E-mail: iatsko@gmail.com, slavay@khsu.ru
You are also invited to visit other Web pages created by me (V.Yatsko).


Literary Web page (in English and Russian)

Educational Web page (in Russian)

Scientific Web page (in English)

Find my picture here

Demo versions and publications on my University's site