Seminars of the Rough Set Technology Laboratory

  1. An insight into some aspects of rough-neurocomputing
    SPEAKER:	Marcin Szczuka, PhD
    		Warsaw University, Poland
    		CS Dept. Visiting Scholar
    DATE:		Wednesday, February 15, 2006
    TIME:		2:30pm
    PLACE:		CL 408
    
    Abstract
    The presentation aims to bring together several ideas that have emerged on the boundary between the theories of rough sets and neurocomputing. Starting with the first attempts that coupled rough set methods with ANNs for classification purposes, we will then present the ideas that strive toward incorporating rough-set-specific notions, such as approximation, into the fabric of neural networks. Finally, we will present our recent findings, which attempt to use the overall neurocomputing paradigm to construct extended classification systems. Such systems make use of extensions of notions that originated in rough set theory.

  2. Rough Set based 1-v-1 and 1-v-r Approaches to Support Vector Machine Multi-classification
    SPEAKER:   Pawan Lingras 
               Department of Math and Computer Science 
               Saint Mary's University        
    DATE:      Monday, June 20, 2005
    TIME:      11:00 am - 12:00 pm
    PLACE:     CL 435
    
    ABSTRACT
    Support vector machines (SVMs) are essentially binary classifiers. To improve their applicability, several methods have been suggested for extending SVMs for multi-classification, including one versus one (1-v-1), one versus rest (1-v-r) and DAGSVM.

    In this seminar, we first describe how binary classification with SVMs can be interpreted using rough sets. A rough set approach to SVM classification removes the necessity of exact classification and is especially useful when dealing with noisy data. Next, by utilizing the boundary region in rough sets, we suggest two new approaches, extensions of 1-v-r and 1-v-1, to SVM multi-classification that allow for an error rate. We explicitly demonstrate how our extended 1-v-r may shorten the training time of the conventional 1-v-r approach. In addition, we show that our 1-v-1 approach may have reduced storage requirements compared to the conventional 1-v-1 and DAGSVM techniques. Our techniques also provide better semantic interpretations of the classification process.
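    A toy sketch of the extended 1-v-r idea with a boundary region (an illustrative rendering in Python with scikit-learn, not the speaker's implementation; the margin threshold tau is a hypothetical parameter):

      # One binary SVM per class; examples whose best decision value falls
      # below tau land in a boundary region instead of a forced class.
      import numpy as np
      from sklearn import svm, datasets

      X, y = datasets.load_iris(return_X_y=True)
      models = {c: svm.SVC(kernel="linear").fit(X, y == c) for c in np.unique(y)}

      def classify(x, tau=0.2):
          """Return a class label, or None for the boundary region."""
          scores = {c: m.decision_function(x.reshape(1, -1))[0]
                    for c, m in models.items()}
          best = max(scores, key=scores.get)
          # Rough-set reading: a small margin means the example is not
          # certainly in the positive region of any class.
          return best if scores[best] > tau else None

      print(classify(X[0]))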

  3. From Business Objectives to Data Mining: Towards a Systematic Way of Data Mining Project Development
    SPEAKER:     Ernestina Menasalvas
                 Facultad de Informática
                 Universidad Politecnica de Madrid
                 Spain
    	     emenasalvas@fi.upm.es
    DATE:	Tuesday, Nov. 30, 2004
    TIME:	4:00-5:00pm
    PLACE:	CL 418
    
    ABSTRACT
    Confronted with a confusing array of techniques and of ways to transform business problems into data mining problems, data miners need proof that a particular technique is better than another. Despite the existence of data mining standards such as CRISP-DM, SEMMA and PMML, to date data mining projects have been developed more as an art than as a science.

    The process depends completely on the expertise of the data miner, since no method is available to make it systematic and automatic. This is due to a lack of conceptualization of the data mining problem. In this sense, a deep understanding is needed of the data to be analyzed and the application domain of the results, as well as of the data mining functions. Knowing the meaning of the data to be analyzed (the facts they represent, the constraints and context under which they were captured) and the constraints underlying the data mining functions to be applied will make it possible to find out whether the business goals are feasible. However, to date there is no formal method to describe these elements in such a way that the quality of the results can be assured. On the other hand, such a setting is a step towards a methodology for data mining project development that will be, in itself, the main basis for automating the process.

    There is also a need to prove that the improvements are really due to the actions taken after a data mining discovery and not to any other factor or action carried out in the company. In particular, results are assumed to improve benefits. It is surprising, though, that none of the obvious claims taken for granted as the starting point of a data mining project has ever been systematically tested.

    Here is where experimentation plays its role. Experiments are crucial to establish whether the impact of the deployment is really positive or negative. Experimentation will also show whether this impact is an effect of actions taken as a result of the data mining project or merely of other environmental factors.

    In this talk, we present the approach of the Data Mining group at Facultad de Informática, Universidad Politecnica de Madrid, to data mining project development.

  4. The Rough Set Exploration System
    SPEAKER:	Dr. Marcin Szczuka
    		Assistant Professor
    		The University of Warsaw
    		Warsaw, Poland
    DATE:		Thursday, Sept 2, 2004
    TIME:		13:00-14:00
    PLACE:		ED 122
    
    ABSTRACT
    The Rough Set Exploration System (RSES 2.1) is a free software tool developed at Warsaw University. Its main purpose is to give users the ability to perform advanced data analysis drawing on the results of our research, in particular in the field of rough sets. The tool has been under development for several years and has reached a degree of maturity. The presentation will contain general information about the usage of RSES 2.1, the possible fields for its application, and an overview of the inventory of data analysis methods provided within this software system.

  5. Current research at Group of Logic, Warsaw University, Poland (Prof. A. Skowron's group)
    SPEAKER:	Dr. Marcin Szczuka
    		Assistant Professor
    		The University of Warsaw
    		Warsaw, Poland
    DATE:		Wednesday, Sept 1, 2004
    TIME:		10:00-11:00
    PLACE:		ED 122
    
    ABSTRACT
    The current major research directions of the Group of Logic, Warsaw University, will be sketched. In particular, we will show the current activities that take place within the framework of the ongoing national research grant "Classifier networks". The presentation will focus on perspectives and research fields that we perceive as promising in the longer term.

  6. Feedforward classifier networks
    SPEAKER:        Dr. Marcin Szczuka
    		Assistant Professor
    		The University of Warsaw
    		Warsaw, Poland
    DATE:		Tuesday, August 31, 2004
    TIME:		10:00-11:00
    PLACE:		ED 122
    
    ABSTRACT
    The problem of approximating compound concepts and making use of them through composition, classification and comparison is addressed. To capture the complex and multi-faceted nature of such compound constructs, we propose an approach based on so-called concept networks. These networks represent the simple-to-compound construction of a final concept from simpler, more basic ones. To achieve these goals, several techniques from probabilistic reasoning, rough set theory and neural networks are employed. The presentation will outline the problem we start with and present our views on possible solutions, including some proposed algorithms and solution methods.

  7. Rough Set based Initiative Data Mining
    SPEAKER:     Prof. Guoyin Wang
    DATE:        Friday, August 13, 2004
    TIME:        11:00 a.m.
    PLACE:       CL 418 
    
    ABSTRACT
    Rough set theory is emerging as a new tool for dealing with fuzzy and uncertain data. In this paper, a theory is developed to express, measure and process uncertain information and uncertain knowledge, based on our results on the uncertainty measure of decision tables and decision rule systems. Based on Skowron's propositional default rule generation algorithm, we develop an initiative learning model with a rough set based initiative rule generation algorithm. Simulation results illustrate its efficiency.

    2003

  9. Using LERS for Knowledge Discovery From Real-Life Data
    SPEAKER:     Professor Jerzy W. Grzymala-Busse
    DATE:        Tuesday, November 25, 2003
    TIME:        1:00 p.m.
    PLACE:       Screening Room "C"
    
    ABSTRACT
    The data mining system LERS (Learning from Examples based on Rough Sets), developed at the University of Kansas, induces a set of rules from examples and classifies new, unseen examples using the induced set of rules. LERS is equipped with a number of tools. First, a family of programs may be used to pre-process data with errors, with missing attribute values, and with numerical attributes. If the input data file is inconsistent, LERS computes lower and upper approximations of all concepts.

    The classification system of LERS is a modification of the bucket brigade algorithm. The decision to which concept an example belongs is made on the basis of four factors: strength, specificity, matching factor, and support. LERS is also equipped with a tool for multiple-fold cross validation. The system has been used in the medical area, nursing, global warming, environmental protection, natural language, data transmission, etc. LERS may process big data sets and frequently outperforms not only other data mining systems but also human experts.
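    To make the voting concrete, here is a toy reconstruction of LERS-style classification in Python (illustrative only, not the LERS code; the rules and weights are made up). Fully matched rules vote with strength times specificity; partially matched rules are additionally scaled by the matching factor, and the concept with the highest total support wins:

      from collections import defaultdict

      # Each rule: (conditions, concept, strength, specificity); all made up.
      rules = [
          ({"temp": "high", "headache": "yes"}, "flu", 12, 2),
          ({"temp": "normal"}, "healthy", 20, 1),
      ]

      def classify(example):
          support = defaultdict(float)
          for conds, concept, strength, specificity in rules:
              matched = sum(example.get(a) == v for a, v in conds.items())
              matching_factor = matched / len(conds)   # 1.0 = complete match
              if matching_factor > 0:
                  support[concept] += strength * specificity * matching_factor
          return max(support, key=support.get) if support else None

      print(classify({"temp": "high", "headache": "yes"}))   # -> flu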

    Biography of Dr. Grzymala-Busse:
    Dr. Jerzy W. Grzymala-Busse has been a Professor of Electrical Engineering and Computer Science at the University of Kansas since August 1993. His research interests include data mining, knowledge discovery from databases, machine learning, expert systems, reasoning under uncertainty and rough set theory. Recently he participated, as a co-principal investigator, in the project "Informatics Techniques for Medical Knowledge Building", together with the Medical Center of Duke University. The project was funded by the National Institutes of Health. Currently he is a co-investigator in the project "CISE Research Infrastructure: Ambient Computational Environments", funded by the National Science Foundation. He has published three books and over 180 articles in the above areas, mostly in data mining. Dr. Grzymala-Busse received his M.S. in Electrical Engineering from the Technical University of Poznan, Poland, in 1964; an M.S. in Mathematics from the University of Wroclaw, Poland, in 1967; a Ph.D. in Engineering from the Technical University of Poznan, Poland, in 1969; and a Doctor habilitatus in Engineering from the Technical University of Warsaw, Poland, in 1972.

  10. Rough Set Model Based on Database Operations (CS836)
    SPEAKER:  Waqar Ahsan
    DATE:     Thursday, Nov 13, 2003
    TIME:     2:30 pm
    PLACE:    CL 251
    
    ABSTRACT
    In this presentation I discuss the rough set model based on database theory, which takes advantage of efficient set-oriented database operations. A drawback of rough set theory is its computational inefficiency, which limits its suitability for large data sets. In order to find the reducts, core and dispensable attributes, classical rough set methods need to construct all the equivalence classes based on the values of the condition and decision attributes. This is a very time consuming process and does not scale to the large data sets common in data mining applications. I discuss a new set of algorithms to calculate the core and reducts based on the database rough set model.
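    As a flavour of the set-oriented approach (a minimal illustrative sketch in Python/pandas with a made-up table, not the algorithms presented in the talk): an attribute belongs to the core exactly when dropping it makes some equivalence class on the remaining condition attributes span more than one decision value, which a single GROUP BY can detect without materializing the classes:

      # Attribute 'a' is in the core iff dropping it makes some group on the
      # remaining condition attributes inconsistent, i.e. the group spans
      # more than one decision value. The table below is made up.
      import pandas as pd

      df = pd.DataFrame({"a": [0, 0, 1, 1],
                         "b": [0, 1, 0, 1],
                         "d": [0, 1, 0, 1]})   # decision d depends on b only
      conditions, decision = ["a", "b"], "d"

      def is_core(attr):
          rest = [c for c in conditions if c != attr]
          # One GROUP BY replaces explicit construction of equivalence classes.
          return bool((df.groupby(rest)[decision].nunique() > 1).any())

      print([a for a in conditions if is_core(a)])   # -> ['b']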

  11. Models of Concurrency
    SPEAKER:  Prof. Ryszard Janicki,
              Dept. Computer Science, 
              McMaster University 
    DATE:     Tuesday, September 16, 2003
    TIME:     2:30 pm
    PLACE:    CL 312
    
    ABSTRACT
    Concurrent systems are abundant in human experience, but their full conceptualization and understanding still elude us. Concurrency theory is more than 30 years old, and although many problems are still far from a satisfactory solution, many formal techniques have been developed, including a sophisticated use of partial orders, automata, composition and decomposition operators, etc. There are two major, different (and often incompatible) attitudes towards abstracting non-sequential behaviour: one based on interleaving abstraction, the other based on partially ordered causality. The interleaving models, mainly in the form of process algebras, are very structured and compositional, but have difficulty in dealing with topics like fairness, confusion, etc. The partial order models, like Petri nets, handle these problems better but are less compositional, although recent results make that distance much smaller. Nevertheless, some aspects of concurrent behaviour are difficult or almost impossible to tackle with both process algebras and models based on partially ordered causality; for example, the specification of priorities, error recovery, time testing, and the proper treatment of simultaneity are in some circumstances problematic. New models, where causality is represented by two relations, can handle those problems but are often too complex for practical use. Temporal logic, invented long before the first computers, is the logic of choice for formulating many problems occurring in concurrent systems. The talk will cover all the issues mentioned above.

  12. Feature and Concept Extraction in KDD
    SPEAKER:	Dr. Jakub Wroblewski 
    	Polish-Japanese Institute of Information Technology
    DATE:		September 5, 2003
    TIME:		4:00 pm - 5:00 pm
    PLACE:		ED 621
    
    ABSTRACT
    One of the most important parts of the KDD (Knowledge Discovery in Databases) process is feature extraction and selection. This step is usually done after preprocessing of the working data set (or is considered part of it). It precedes the DM (Data Mining) step. During the seminar, I will present some theoretical issues and practical examples (incl. algorithms) of the feature extraction process, treated as a data-based induction of new concepts.

    In particular, I would like to address the following:
    - How to evaluate the quality of a new concept;
    - How to construct new attributes in the case of numerical (continuous) values;
    - How to construct new attributes as discrete concepts based on continuous values (a toy sketch follows this list);
    - How to construct attributes for complex structures, in relational and temporal databases.
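    For the discretization point above, a minimal illustrative sketch in Python (a toy cut-selection heuristic with made-up data, not the algorithms of the talk): candidate cuts are placed between consecutive attribute values whose decisions differ, and the new discrete attribute is the interval index:

      # Toy cut-based discretization of a continuous attribute.
      values = sorted([(1.2, 0), (1.9, 0), (2.5, 1), (3.1, 1)])  # (value, decision)

      cuts = [(a + b) / 2
              for (a, da), (b, db) in zip(values, values[1:]) if da != db]

      def discretize(x):
          """Map a continuous value to the interval index induced by the cuts."""
          return sum(x > c for c in cuts)

      print(cuts)             # [2.2]
      print(discretize(1.5))  # 0
      print(discretize(2.8))  # 1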

  13. Evolutionary Computation in Rough Sets and KDD
    SPEAKER:	Dr. Jakub Wroblewski 
    	Polish-Japanese Institute of Information Technology
    DATE:		August 29, 2003
    TIME:		2:00 pm -3:00 pm
    PLACE:		CL 408
    
    ABSTRACT
    During the seminar I would like to address several issues concerning Evolutionary Computation (EC) methods in some data analysis problems and applications. The approximate plan of my talk is as follows:
    - Evolutionary methods: from classical genetic algorithms to genetic programming;
    - Optimization problems, especially concerned with the rough set based methods;
    - Hybrid EC solutions in KDD (examples).

    I will start with a short presentation of EC for participants less familiar with the topic.

  14. Variable Precision Rough Set Inductive Logic Programming and Statistical Relational Learning
    SPEAKER:	Arul Siromoney 
    		School of Computer Science and Engineering
    		Anna University, Chennai 600 025, India
    DATE:		Wednesday, August 6, 2003
    TIME:		11:00 a.m. - 12:00 p.m.
    PLACE:		CL 408
    
    ABSTRACT
    The generic Rough Set Inductive Logic Programming (gRS-ILP) model combines Rough Set Theory (RST) and Inductive Logic Programming (ILP). Variable Precision Rough Set theory is used to extend gRS-ILP to the Variable Precision Rough Set Inductive Logic Programming (VPRSILP) model.

    Statistical Relational Learning (SRL) explores approaches to learning statistical models from relational data. One of the approaches in SRL is stochastic logic programs. A stochastic logic program (SLP) is a probabilistic extension of a normal logic program that has been proposed as a flexible way of representing complex probabilistic knowledge.

    The VPRSILP model is presented in the context of SRL and SLP. A preliminary experiment using the Predictive Toxicology Evaluation (PTE) Challenge dataset is presented. The PTE Challenge dataset is based on the rodent carcinogenicity tests conducted within the US National Toxicology Program.

  15. Rough Set Approach to Approximating Compound Decisions
    SPEAKER:	Dr. Dominik Slezak
    DATE:		May 8, 2003
    TIME:		1:30 PM
    PLACE:		ED 621
    
    ABSTRACT
    The theory of rough sets provides tools for extracting knowledge from incomplete, data-based information. The rough set approximations make it possible to describe the decision classes, regarded as the sets of objects satisfying some predefined conditions, by means of indiscernibility relations that group into classes the objects with the same (similar) values of the considered attributes. Moreover, the rough set reduction algorithms make it possible to approximate the decision classes using possibly large and simplified patterns. This corresponds to the well-known Occam's Razor principle, as well as, e.g., to the statistical Minimum Description Length principle.
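    As background for the notation above (the standard Pawlak definitions, not specific to this talk): given an indiscernibility relation R on a universe U, with [x]_R denoting the class of an object x, a decision class X ⊆ U is approximated by

      \underline{R}(X) = \{\, x \in U : [x]_R \subseteq X \,\}, \qquad
      \overline{R}(X)  = \{\, x \in U : [x]_R \cap X \neq \emptyset \,\}

    where the lower approximation collects the objects certainly in X and the upper approximation those possibly in X.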
    In many approaches, especially those dedicated to (strongly) inconsistent data tables, where the decision class approximations cannot be determined to a satisfactory degree, decisions can take more complex forms, e.g. probability distributions (rough membership functions and distributions) over the original decision values. In the same way, one could consider, e.g., statistical estimates, plots, etc., definable using the original attributes, in a way appropriate for a particular decision problem. One should then develop methods aiming at optimal approximation of such decision structures, possibly similar to the classical reduction techniques.
    Complex values occur very often in the medical domain, when analyzing heterogeneous data gathering series of measurements, images, texts, etc. In this paper, we analyze data about the medical treatment of patients with head and neck cancer. The data table, collected over the years by the Medical Center of Postgraduate Education in Warsaw, Poland, consists of 557 patient records described by 29 attributes. The most important attributes are well-defined qualitative features. The decision problem, however, requires approximation of a specially designed complex decision attribute, corresponding to the needs of survival analysis. It seems to be a perfect case study for learning how complex decision semantics can influence the algorithmic framework and the results of its performance. It illustrates that even quite unusual structures can still be handled using only slightly modified rough set algorithms. Therefore, one may conclude that the proposed methodology is applicable not only to the presented case study but also to other medical problems, as well as to, e.g., multimedia or robotics problems.

  16. Bayesian extension of VPRS
    SPEAKER:	Dr. Dominik Slezak
    DATE:		Monday, May 5, 2003
    TIME:		1:30PM
    PLACE:		ED 621
    
    ABSTRACT
    The variable precision rough set (VPRS) model introduces definitions of approximate positive and negative regions of a set, which depend on the settings of model parameters defining the permissible levels of uncertainty associated with each of the rough regions. Using model parameters to define the approximation regions is often not required. For that reason, we introduce a non-parametric modification of the VPRS model, where the prior probability of the event is used as a benchmark value against which the quality of available information about objects of the universe of interest can be measured. We consider three possible scenarios in that respect:
    1. The acquired information increases our perception of the likelihood that the event of interest would happen
    2. The acquired information increases the assessment of the probability that the event would not happen
    3. The acquired information has no effect at all
    Such a categorization of the universe leads to the Bayesian Rough Set (BRS) model, which seems more appropriate for application problems concerned with achieving any certainty gain in decision making or prediction processes, rather than with meeting specific certainty goals.
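    Formally, the three scenarios can be read as comparing the conditional probability of the event X within an indiscernibility class [x] against the prior P(X); a standard rendering of the resulting BRS regions (the talk's exact formulation may differ) is

      \mathrm{POS}(X) = \{\, x : P(X \mid [x]) > P(X) \,\}, \quad
      \mathrm{NEG}(X) = \{\, x : P(X \mid [x]) < P(X) \,\}, \quad
      \mathrm{BND}(X) = \{\, x : P(X \mid [x]) = P(X) \,\}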
    The next step is to think of the BRS model as a special case of some more general, parametric approach, just as the classical rough set model is a special case of the VPRS model. We present the Variable Precision Bayesian Rough Set (VPBRS) model, where the set approximations correspond to the following situations:
    1. The acquired information sufficiently increases our perception of the likelihood that the event of interest would happen
    2. The acquired information sufficiently increases the assessment of the probability that the event would not happen
    3. The acquired information has almost no effect at all
    The adverbs "sufficiently" and "almost" are expressed in terms of mathematical constraints parameterized by appropriately tuned thresholds. Besides developing the theoretical foundations of VPBRS, an important issue is to investigate its properties in terms of being able to capture the quality of attributes (columns, features) in the analysis of real-life data -- and hence its applicability to feature selection, extraction and reduction problems. We concentrate on the issue of attribute reduction, which is addressed in the theory of rough sets in terms of (approximate) decision reducts. We adapt the relative probabilistic gain function to evaluate the global average information gain associated with a subset of features. We also formulate criteria for maintaining the level of probabilistic gain during the process of attribute reduction. Finally, we draw a connection between those criteria and the reduction principles based on discernibility between the set approximation regions.

  17. A Better Driver Assessment using the Variable Precision Rough Set Methodology
    SPEAKER: Kwei Aryeetey
    DATE:   March 31, 2003
    TIME:   1:00 PM
    PLACE:  285 Riddell Centre
    
    ABSTRACT
    In most jurisdictions in North America, traffic violation and accident experience of drivers are closely monitored by jurisdictional licensing agencies. Studies conducted so far on driver behaviour have also concluded that past accidents are better predictors of future accidents than past violations.
    Accidents are enormously costly, both socially and economically, to drivers and insurance companies. In order to avoid these costs, jurisdictions must be able to identify potentially unsafe drivers and intervene through education or remedial actions. These measures could take the form of warnings, stiff penalties, suspensions from driving for specific periods of time, or the imposition of high annual insurance premiums for these drivers.
    This presentation focuses on a different approach to analyzing driver behaviour, using variable precision rough sets to model the relationship between a driver's socio-economic, demographic, traffic conviction and accident history, and other characteristics, and the future probability of being involved in an at-fault car accident.

  18. Acquisition of Control Algorithms Using Rough Sets
    SPEAKERS: Fulian Shang and Peng Yao
              Department of Computer Science
    DATE:     March 24, 2003
    TIME:     1:00 PM
    PLACE:    Riddell Centre 285
    
    ABSTRACT
    The seminar will include two short presentations of two projects concerned with applications of rough set theory to control. First, a multi-input multi-output (MIMO) data-acquired controller using a system of hierarchical decision tables for a simulated vehicle driving control problem will be presented. The simulator incorporates a dynamic mathematical model of a vehicle driving on a track. Sensor readings and expert driver control actions are accumulated to derive the vehicle control model. Sensor readings include random error to reflect realistic data acquisition conditions. The methodology of rough sets is used to process the data and to automatically derive the control algorithm.

    The second project investigates the application of rough set theory to the automatic acquisition of control algorithms from operation log files acquired when concurrent processes are controlled by human operators. Our objective is to develop a methodology for eliminating the mathematical modeling and manual programming stages in the development of a control system for such systems. The approach includes a training stage followed by automatic generation of a decision algorithm. The research will use rough set based data modelling techniques to obtain a control algorithm from an operator's history log and to apply this control algorithm in controlling concurrently moving objects in real time.

  19. Variable Precision Rough Sets in Modeling from Data (2)
    SPEAKER:	Dr. Wojciech Ziarko
    	Department of Computer Science
    DATE:	March 17, 2003		
    TIME:	1:00 PM	
    PLACE:	285 Riddell Centre	
    
    ABSTRACT
    The presentation is a continuation of the introduction to the Variable Precision Rough Set Model (VPRSM). Two main topics will be discussed.
    The first is the derivation of hierarchical structures of decision tables in the context of the VPRSM, including the presentation of two methods for forming such structures. The structures constitute predictive models derived from data, applicable to a variety of problems in data mining, pattern recognition, control, etc.
    The second is the discussion of different evaluative measures to assess the quality of the decision table-based models. A number of measures will be presented, some of which are generalizations of original measures introduced by Pawlak.
    Finally, work in progress on the application of the presented methodologies to the analysis of a car insurance company database will be discussed.

  20. An Introduction to the Variable Precision Rough Set Model
    
    SPEAKER:	Dr. Wojciech Ziarko
    	Department of Computer Science
    DATE:	March 10, 2003		
    TIME:	1:00 PM	
    PLACE:	285 Riddell Centre	
    
    ABSTRACT
    The Variable Precision Rough Set Model (VPRSM) was introduced in the early nineties as a probabilistic extension of the original Rough Set Theory (RST). The VPRSM generalizes RST by adopting a partial inclusion relation for the definitions of the rough approximation regions, that is, the positive region, boundary region and negative region of a set. The generalization is motivated by the frequent absence of positive or negative regions in application problems dealt with using the original RST. In other words, in many practical problems in data mining, machine learning, pattern classification, etc., it is not possible to identify deterministic rules or patterns in data, but probabilistic patterns can be found. The VPRSM helps in applying the results and methodologies of RST to the analysis, identification and optimization of probabilistic regularities in data. The presentation will introduce the basics of and motivations behind the VPRSM and its relationship to the original RST.
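    For reference, one common (asymmetric) parameterization of the VPRSM regions uses limits 0 ≤ l < u ≤ 1 on the conditional probability of the target set X within each indiscernibility class E (a textbook rendering, not necessarily the exact notation of the talk):

      \mathrm{POS}_{u}(X) = \bigcup \{\, E : P(X \mid E) \ge u \,\}, \quad
      \mathrm{NEG}_{l}(X) = \bigcup \{\, E : P(X \mid E) \le l \,\}, \quad
      \mathrm{BND}_{l,u}(X) = \bigcup \{\, E : l < P(X \mid E) < u \,\}

    Setting u = 1 and l = 0 recovers the classical regions of RST.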

  21. On Generalizing Rough Set Theory (II)
    SPEAKER:        Dr. Yiyu Yao 
                    Department of Computer Science
    DATE:           Monday, March 3, 2003
    TIME:           1:00 PM
    PLACE:          Riddell Centre 285
    
    ABSTRACT
    I will continue with last week's talk.

  22. On Generalizing Rough Set Theory (I)
    SPEAKER:        Dr. Yiyu Yao 
                    Department of Computer Science
    DATE:           Monday, February 24, 2003
    TIME:           1:00 PM
    PLACE:          Riddell Centre 285
    
    ABSTRACT
    This talk summarizes various formulations of the standard rough set theory. It demonstrates how those formulations can be adopted to develop different generalized rough set theories. The relationships between rough set theory and other theories are discussed.

  23. Toposes and rough set theory (III)
    SPEAKER:        Dr. Jonathon Funk
                    Department of Mathematics & Statistics
    DATE:           Monday, February 10, 2003
    TIME:           1:00 PM
    PLACE:          Riddell Centre 285
    
    ABSTRACT
    We continue our analysis of the category K of generalized upper approximations associated with a Pawlak relational system. We discuss the Yoneda embedding, constant objects, global sections, finite limits, exponential objects, and the subobject classifier.

  24. Toposes and rough set theory (II)
    SPEAKER:        Dr. Jonathon Funk
                    Department of Mathematics & Statistics
    DATE:           Monday, February 3, 2003
    TIME:           1:00 PM
    PLACE:          Riddell Centre 285
    
    ABSTRACT
    Previously, we defined the category K (of generalized upper approximations) associated with a Pawlak relational system. We next explain the properties of K (K is a topos), and consider other examples of objects that live in this category.

  25. Toposes and rough set theory
    SPEAKER:        Dr. Jonathon Funk
                    Department of Mathematics & Statistics
    DATE:           Monday, Jan. 27, 2003 
    TIME:           1:00 PM
    PLACE:          Riddell Centre 285
    
    ABSTRACT
    A natural connection between topos theory and rough set theory can be found by considering a variation over equivalence relations. I will begin with some basic category theory needed to explain this connection.

    2002

  26. LOGIC: An Ex-logician perspective (III)
    SPEAKER:        Dr. Anita Wasilewska
    DATE:           September 23, 2002
    TIME:           10:30 AM
    PLACE:          LB 235
    
    ABSTRACT
    An algebraic approach to classical and non-classical logics. History, techniques and results. Logics that are defined only algebraically. Case studies and connection with Rough Sets.

  27. LOGIC: An Ex-logician perspective (II)
    SPEAKER:        Dr. Anita Wasilewska
    DATE:           September 20, 2002
    TIME:           1:30 PM
    PLACE:          LB 268
    
    ABSTRACT
    Automated proof systems. Definition, types, history. Constructive and non-constructive proofs of completeness theorem. Gentzen and Resolution. Case study: First constructive proof of completeness theorem for classical predicate logic.

  28. LOGIC: An Ex-logician perspective (I)
    SPEAKER:        Dr. Anita Wasilewska
    		Computer Science Department
    		State University of New York
    		Stony Brook, NY
    DATE:           September 19, 2002
    TIME:           1:00 PM
    PLACE:          CL 232
    
    ABSTRACT
    What is LOGIC: distinctions, classifications and definition. Distinctions: philosophical logic, mathematical logic, and logics for computer science. Classifications (some): propositional, predicate, classical, non-classical, logics that extend to and from classical logic, logics that extend to and from intuitionistic logic, and others. Examples (and history). Syntax and semantics: general definition and distinctions. Techniques for a proof of completeness theorem. Theories based on a (given) logic: Consistency, incompleteness theorem. Foundations of Mathematics, logic as foundation of AI.

  29. Entropy Based Approximate Reducts and Networks
    SPEAKER:        Dr. Dominik Slezak
    DATE:           Sept 17, 2002
    TIME:           1:00 PM
    PLACE:          CL 232
    
    ABSTRACT
    Information entropy measures are widely applied to evaluate the degree of probabilistic dependencies between random variables. At the level of data analysis, one can efficiently apply entropy to selection, extraction and reduction of features providing optimal data models. We discuss two applications of information entropy to modeling data dependencies:

    The first application is related to the rough set approach to the construction of classification models. It is based on the paradigm of reducing attributes that are irrelevant with respect to determining a distinguished decision attribute. The degree of such irrelevance can be expressed in probabilistic terms by using entropy. Hence, one can consider a kind of approximate reduction principle, claiming that attributes should be removed during the reduction process if and only if the entropy of the model remains at approximately the same level. We discuss various specifications of this principle, as well as the computational complexity of optimization problems concerning the search for entropy based approximate decision reducts.
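    A hedged sketch of the approximate reduction principle in Python (a greedy toy with a made-up table, not the authors' algorithms): an attribute is dropped only if the conditional entropy H(d | B) of the decision stays within a tolerance eps of its value on the full attribute set:

      import math
      from collections import Counter

      # Made-up decision table: ((condition attribute values), decision).
      rows = [((0, 0, 1), 0), ((0, 1, 1), 1), ((1, 0, 0), 1), ((1, 1, 0), 1)]

      def cond_entropy(attrs):
          """H(decision | attrs) estimated from the table."""
          groups, joint = Counter(), Counter()
          for x, d in rows:
              key = tuple(x[i] for i in attrs)
              groups[key] += 1
              joint[key, d] += 1
          n = len(rows)
          return -sum((c / n) * math.log2(c / groups[k])
                      for (k, d), c in joint.items())

      full = [0, 1, 2]
      base, eps = cond_entropy(full), 0.01
      reduct = full[:]
      for a in full:                       # greedily try to drop each attribute
          trial = [b for b in reduct if b != a]
          if trial and cond_entropy(trial) <= base + eps:
              reduct = trial
      print(reduct)                        # -> [1, 2]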

    The second application is related to the notion of an approximate Bayesian network, capable of encoding statements about approximate conditional independence between random variables. The usage of information entropy to approximate the notion of probabilistic independence is a natural consequence of its fundamental properties. The notion of an entropy based approximate decision reduct turns out to correspond to the notion of an approximate Markov boundary: an irreducible subset of random variables that makes a distinguished variable approximately independent of the rest of them. We discuss the advantages of dealing with approximate conditional independence statements and approximate Bayesian networks when analyzing real-life data. We also show mathematical foundations for generalizing results concerning classical Bayesian networks to the entropy based approximate case.

  30. Data Based Approximation of Complex Concepts
    SPEAKER:        Dr. Dominik Slezak
    DATE:           Sept 10, 2002
    TIME:           1:00 PM
    PLACE:          CL 232
    
    ABSTRACT
    The theory of rough sets provides a clear and efficient approach to concept approximation tasks. The most common application is the data based construction of decision models, where the concepts correspond to the values of a distinguished decision feature. In many decision problems there is an issue of data inconsistency, which makes the construction of deterministic models impossible. This problem is addressed by introducing, e.g., set approximations, generalized decision functions and rough membership functions. In specific applications, the decision can be expressed as a continuous value, a function plot, a probability distribution, etc. Then there is a need to measure how close two decision values are. Such measures may be devised in a manner supporting the particular goal we want to achieve.

    The above issues are illustrated by an example application of rough set based tools to post-surgery survival analysis. The decision problem is in this case defined over data related to head and neck cancer cases, for two types of medical surgeries. The task is to express the differences between the expected results of these surgeries and to search for rules discerning different survival tendencies. The need to consider complex decision values, such as plots of Kaplan-Meier product estimates and contingency-like cross distributions of the success ratio against the type of surgery, is discussed.

    Data based concept approximation may also be applied to the construction of decision models in a hierarchical way. In many situations decision states cannot be expressed by means of simple decision rules. A possible approach is to follow the layered learning paradigm. It corresponds to the theory of rough mereology, where at each level of the learning hierarchy one tries to approximate more complex concepts based on those from previous levels. This principle is illustrated by applications to the synthesis of decision models and multi-layered feature extraction, as well as others.

  31. AUTOMATIC CLASSIFICATION OF MUSICAL INSTRUMENT SOUNDS BASED ON WAVELETS AND NEURAL NETWORKS

    SPEAKER:        Dr. Bozena Kostek
    DATE:           July 25, 2002 
    TIME:           2:00 PM 
    PLACE:          CL 345

    ABSTRACT
    A study on the classification of musical instruments by means of wavelet analysis and artificial neural networks will be shown. A short discussion of pitch detection methods for musical sounds will be presented. Then some details of the engineered pitch detection method will be shown. Several analyses exemplifying problems related to the automatic pitch tracking process are included. Principles of the wavelet-based parameterization of musical instrument sounds will be presented, and a set of parameters resulting from the parameterization process will be shown. Artificial neural networks were used for classification purposes. Example results obtained in the investigations carried out will be presented and discussed.

  32. NEURAL NETWORK-BASED IDENTIFICATION OF SOUND SOURCE POSITION AND MUSICAL SOUND PITCH

    SPEAKER:        Dr. Andrzej Czyzewski
    DATE:		July 25, 2002
    TIME:		3:00 PM
    PLACE:		CL 345
    

    ABSTRACT
    Sound source position identification systems are used in many telecommunication areas, and numerous approaches to this task have been developed. Usually such systems are based on digital signal processing technology and are computationally intensive. This paper presents an alternative method implementing an intelligent neural network-based decision module. The method's effectiveness was tested with various types and structures of multilayer neural networks. The obtained results will be presented and discussed.

    A new approach to musical signal pitch prediction based on musical knowledge modeling will be discussed in the second part of the presentation. First, the signal is partitioned into segments roughly analogous to consecutive notes. Thereafter, for each segment an autocorrelation function is calculated. The autocorrelation function values are then altered using the pitch predictor output. A music predictor based on artificial neural networks was introduced for this task. A description of the proposed pitch estimation enhancement method is included, and some details concerning music prediction and recognition are discussed in the paper.
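    To illustrate the autocorrelation step on a single segment, a minimal numpy sketch (synthetic signal and parameters are made up; the talk's method additionally corrects the autocorrelation values with the knowledge-based predictor):

      # Toy autocorrelation pitch estimate for one segment: the lag of the
      # strongest non-trivial autocorrelation peak gives the period.
      import numpy as np

      sr = 16000
      t = np.arange(1024) / sr
      segment = np.sin(2 * np.pi * 220 * t)        # synthetic 220 Hz tone

      ac = np.correlate(segment, segment, mode="full")[len(segment) - 1:]
      lag = np.argmax(ac[20:]) + 20                # skip tiny lags near zero
      print(sr / lag)                              # ~220 Hz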

  33. Studies in Rough Sets and Variable Precision Rough Sets
    SPEAKER:        Dr. Arul Siromoney
    DATE:           July 22, 2002
    TIME:           1:30 PM 
    PLACE:          CL 305
    

    ABSTRACT

    This talk presents a survey of the author's research studies in Rough Set Theory (RST) and Variable Precision Rough Sets (VPRS).

    One area of research is the intersection of Rough Set Theory and Inductive Logic Programming (ILP). ILP uses positive examples, negative examples and background knowledge to induce a hypothesis that, together with the background knowledge, describes the positive examples (completeness) without describing the negative examples (consistency).

    The examples, background knowledge and hypothesis are all expressed as Prolog clauses. The notions of consistency and completeness in ILP are studied in RST, in a finite universe and in an extension to future test cases. RST is extended to ILP, and elementary sets are defined such that any induced logic program, for the given background knowledge and declarative bias, cannot distinguish between the elements of an elementary set. VPRS is then extended to ILP (VPRSILP). The VPRS and VPRSILP models are also defined for universes that include future test cases.

    VPRS is also used after a domain-relevant preprocessing stage. Domain-relevant techniques are used to generate a small number of attributes, which are then used in VPRS for classification.

    Illustrative examples include the identification of transmembrane domains and classification of web usage sessions.