Terminated in academic year 2014/2015

Documentographic Information Systems

Type of study Follow-up Master
Language of instruction Czech
Code 460-4008/01
Abbreviation DOK
Course title Documentographic Information Systems
Credits 4
Coordinating department Department of Computer Science
Course coordinator prof. RNDr. Václav Snášel, CSc.

Subject syllabus

Lectures:
Introduction to Information Systems.
History and evolution of text searching.
Differences between factographic and documentographic IS.
Algorithms for exact text searching.
Naive algorithm.
Algorithms for forward search.
Knuth-Morris-Pratt algorithm.
Aho-Corasick algorithm.
Searching for regular expressions using finite automata.
Boyer-Moore algorithm.
Commentz-Walter algorithm.
Information retrieval models.
vector model
probabilistic strategies
extended Boolean logic
latent semantic indexing
neural networks
genetic algorithms
fuzzy sets
Bibliographic Information Systems.
Boolean model.
Vector model.

Signature methods.
Automatic indexing of documents.
Selection of index terms.
Implementation of indexing systems.
Hypertext systems.
Text and multimedia systems.
Semi-structured documents (SGML, HTML, XML).
Indexing of multimedia data, feature extraction.
Searching on the Web.
Agent-based searching.

The project topic will be assigned at the beginning of the semester.


Projects:
The project objectives are as follows:
1) survey of the state of the art
2) implementation of the selected problem
3) experiments
4) evaluation of the experiments.
The deliverables are the project documentation, a presentation, and the data sources on which the simulation experiments were conducted.

Literature

R. Baeza-Yates, B. Ribeiro-Neto. Modern Information Retrieval. Addison-Wesley 1999.

Recommended literature

I. H. Witten, A. Moffat, T. C. Bell. Managing Gigabytes: Compressing and Indexing Documents and Images. Van Nostrand Reinhold 1994.