Research Colloquium: Howard Hamilton, "Evaluation of Interestingness Measures: 10 Years Later," September 12, 3:30 pm, CL 410

Evaluation of Interestingness Measures: 10 Years Later

 

Dr. Howard J. Hamilton, Department of Computer Science, University of Regina

 

September 12, 2011, 3:30 pm, CL 410

 

Abstract:

 

Each year, the Steering Committee for the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD) reviews the papers presented at the conference 10 years earlier to recognize one that has been well received by the knowledge discovery research community. From all the papers published at PAKDD 2001, the Steering Committee chose “Evaluation of Interestingness Measures for Ranking Discovered Knowledge,” by Robert J. Hilderman and Howard J. Hamilton. The award for the PAKDD 2001 Most Influential Paper was presented to them at PAKDD 2011 in Shenzhen, China on Wednesday, May 26th, 2011. In this seminar, Dr. Hamilton will give the paper's original presentation again and then discuss the effect the paper has had on subsequent research. The original abstract for the paper follows.

 

When mining a large database, the number of patterns discovered can easily exceed the capabilities of a human user to identify interesting results. To address this problem, various techniques have been suggested to reduce and/or order the patterns prior to presenting them to the user. In this presentation, our focus is on ranking summaries generated from a single dataset, where attributes can be generalized in many different ways and to many levels of granularity according to taxonomic hierarchies. We theoretically and empirically evaluate thirteen diversity measures used as heuristic measures of interestingness for ranking summaries generated from databases. The thirteen diversity measures have previously been utilized in various disciplines, such as information theory, statistics, ecology, and economics. We describe five principles that any measure must satisfy to be considered useful for ranking summaries. Theoretical results show that only four of the thirteen diversity measures satisfy all of the principles. We then analyze the distribution of the index values generated by each of the thirteen diversity measures. Empirical results, obtained using synthetic data, show that the distribution of index values generated tends to be highly skewed about the mean, median, and middle index values. The objective of this work is to gain some insight into the behaviour that can be expected from each of the measures in practice.
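To make the idea of ranking summaries by a diversity measure concrete, the following is a minimal sketch in Python. It is not taken from the paper: the abstract does not name the thirteen measures, so two well-known diversity indices (Shannon entropy from information theory and the Simpson concentration index from ecology) are used purely as illustrations. The function names, the sample summaries, and their tuple counts are all hypothetical, and whether a high or a low index value should count as "more interesting" is left as a parameter rather than assumed.

```python
import math


def proportions(counts):
    """Convert the tuple counts of a summary into proportions that sum to 1."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("a summary must contain at least one tuple")
    return [c / total for c in counts]


def shannon_entropy(counts):
    """Shannon entropy in bits: largest when the counts are evenly spread."""
    return -sum(p * math.log2(p) for p in proportions(counts) if p > 0)


def simpson_concentration(counts):
    """Simpson index (sum of squared proportions): largest when one row dominates."""
    return sum(p * p for p in proportions(counts))


def rank_summaries(summaries, measure, descending=True):
    """Return summary names ordered by the index value the measure assigns."""
    return sorted(summaries, key=lambda name: measure(summaries[name]),
                  reverse=descending)


# Hypothetical summaries: each list holds the tuple counts of one summary's rows.
summaries = {
    "uniform": [25, 25, 25, 25],
    "mixed": [50, 30, 15, 5],
    "skewed": [85, 5, 5, 5],
}

by_entropy = rank_summaries(summaries, shannon_entropy)
by_simpson = rank_summaries(summaries, simpson_concentration)
print(by_entropy)
print(by_simpson)
```

On this sample data the two indices order the summaries in opposite directions, which illustrates why the choice of measure, and the principles a measure is required to satisfy, affect what the user is shown first.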

 

The original paper is available at:

http://www2.cs.uregina.ca/~hamilton/papers/conferences/hhPAKDD2001.pdf

