
DBMiner:

A data mining tool for large relational databases
Try DBMiner


Project Overview

Major functional modules

Implementation of DBMiner

Further Development of DBMiner

Research Project Team

Research Funding

Publications

Back to Database Lab Page
LogicBase DBMiner GeoMiner WebMiner

DBMiner, a data mining system for interactive mining of multiple-level knowledge in large relational databases, has been developed based on our years of research. The system implements a wide spectrum of data mining functions, including generalization, characterization, discrimination, association, classification, and prediction. By incorporating several interesting data mining techniques, including attribute-oriented induction, progressive deepening for mining multiple-level rules, and meta-rule guided knowledge mining, the system provides a user-friendly, interactive data mining environment with good performance.


Project Overview

A data mining system, DBMiner, has been developed for interactive mining of multiple-level knowledge in large relational databases. It is based on our studies of data mining techniques and our experience in the development of an early system prototype, DBLearn. The system implements a wide spectrum of data mining functions, including generalization, characterization, association, classification, and prediction. By incorporating several interesting data mining techniques, including attribute-oriented induction, statistical analysis, progressive deepening for mining multiple-level knowledge, and meta-rule guided mining, the system provides a user-friendly, interactive data mining environment with good performance.


Project Description

 
Figure:   General architecture of DBMiner

The system has the following distinct features:


Major functional modules:

 
Figure:  Knowledge discovery modules of DBMiner

DBMiner characterizer
The characterizer generalizes a set of task-relevant data into a generalized relation which can then be viewed at multiple concept levels from different angles. In particular, it derives a set of characteristic rules which summarize the general characteristics of a set of user-specified data (called the target class). For example, the symptoms of a specific disease can be summarized by a characteristic rule.
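
As a rough illustration of the generalization step described above, the following Python sketch replaces attribute values by higher-level concepts and merges identical generalized tuples, keeping a count for each. The concept hierarchy, the attribute names, and the patient records are hypothetical, and the code is not taken from DBMiner itself.

```python
from collections import Counter

# Hypothetical concept hierarchy: each entry maps a raw value to a
# higher-level concept for one attribute.
HIERARCHY = {
    "age": lambda v: "young" if v < 30 else "middle-aged" if v < 60 else "senior",
    "symptom": {"cough": "respiratory", "wheezing": "respiratory",
                "nausea": "digestive"}.get,
}

def generalize(rows, hierarchy):
    """Replace each attribute value by its higher-level concept and
    merge identical generalized tuples, keeping a count for each."""
    counts = Counter()
    for row in rows:
        key = tuple(sorted((attr, hierarchy[attr](val)) for attr, val in row.items()))
        counts[key] += 1
    return counts

# Hypothetical task-relevant data (the target class).
patients = [
    {"age": 24, "symptom": "cough"},
    {"age": 27, "symptom": "wheezing"},
    {"age": 65, "symptom": "nausea"},
]
generalized = generalize(patients, HIERARCHY)
for generalized_tuple, count in generalized.items():
    print(dict(generalized_tuple), count)
```

Each entry of the resulting generalized relation can be read as a candidate characteristic rule, with the count indicating how many original tuples it covers.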

DBMiner discriminator
A discriminator discovers a set of discriminant rules which summarize the features that distinguish the class being examined (the target class) from other classes (called contrasting classes). For example, to distinguish one disease from others, a discriminant rule summarizes the symptoms that discriminate this disease from others.
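
One simple way to quantify how strongly a feature distinguishes the target class is the fraction of its occurrences that fall in the target class. The Python sketch below computes that fraction; the function name, the measure as written here, and the toy symptom tuples are illustrative assumptions, not DBMiner's actual discriminator.

```python
from collections import Counter

def discrimination_weights(target, contrasting):
    """For each feature tuple seen in the target class, return the fraction
    of its occurrences that fall in the target class
    (1.0 means it was seen only in the target class)."""
    t, c = Counter(target), Counter(contrasting)
    return {feat: t[feat] / (t[feat] + c[feat]) for feat in t}

# Hypothetical generalized tuples for a target class and one contrasting class.
target_class = [("fever", "cough"), ("fever", "cough"), ("fever", "rash")]
contrasting_class = [("fever", "rash"), ("fever", "rash"), ("fever", "rash")]

weights = discrimination_weights(target_class, contrasting_class)
for feature, weight in weights.items():
    print(feature, weight)
```

Features with a weight close to 1.0 are good candidates for a discriminant rule, while features with a low weight are shared with the contrasting classes.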

DBMiner association rule finder
An association rule finder discovers a set of association rules (in the form of "") at multiple concept levels from the relevant set(s) of data in a database. For example, one may discover a set of symptoms frequently occurring together with certain kinds of diseases and further study the reasons behind them.
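
Association rules are commonly evaluated by their support and confidence. The following Python sketch computes both for a single candidate rule at one concept level; the record contents and function names are hypothetical, and the sketch does not reproduce DBMiner's multiple-level mining procedure.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Support of the whole rule divided by the support of its antecedent.
    Assumes the antecedent occurs in at least one transaction."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)

# Hypothetical patient records, each a set of observed items.
records = [
    {"fever", "cough", "flu"},
    {"fever", "cough", "flu"},
    {"fever", "rash"},
    {"cough"},
]

rule_support = support(records, {"fever", "cough", "flu"})
rule_confidence = confidence(records, {"fever", "cough"}, {"flu"})
print(rule_support, rule_confidence)
```

To examine a higher concept level, the items in each record would first be replaced by their ancestors in a concept hierarchy before the same two measures are computed.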

DBMiner data classifier
A classifier analyzes a set of training data (i.e., a set of objects whose class label is known) and constructs a model for each class based on the features in the data. A set of classification rules is generated by such a classification process, which can be used to classify future data and develop a better understanding of each class in the database. For example, one may classify diseases and provide the symptoms which describe each class or subclass of diseases.
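
To make the idea of a per-class model concrete, the Python sketch below builds a simple frequency profile for each class and assigns a new object to the class whose profile it overlaps most. This is a deliberately naive stand-in chosen for brevity, not the classification method used in DBMiner, and the training data is hypothetical.

```python
from collections import Counter, defaultdict

def build_profiles(training):
    """Count, per class label, how often each (attribute, value) pair occurs."""
    profiles = defaultdict(Counter)
    for features, label in training:
        profiles[label].update(features.items())
    return profiles

def classify(profiles, features):
    """Pick the class whose profile has seen the object's
    (attribute, value) pairs most often."""
    def score(label):
        return sum(profiles[label][item] for item in features.items())
    return max(profiles, key=score)

# Hypothetical training data: (features, known class label).
training = [
    ({"fever": "high", "rash": "no"}, "class A"),
    ({"fever": "high", "rash": "no"}, "class A"),
    ({"fever": "mild", "rash": "yes"}, "class B"),
]

profiles = build_profiles(training)
predicted = classify(profiles, {"fever": "mild", "rash": "yes"})
print(predicted)
```

Because the score is a raw count, this sketch favors classes with more training objects; a real classifier would normalize for class size.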

DBMiner predictor
A predictor predicts the possible values of some missing data or the value distribution of certain attributes in a set of objects. This involves finding the set of attributes relevant to the attribute of interest (by some statistical analysis) and predicting the value distribution based on the set of data similar to the selected object(s). For example, an employee's potential salary can be predicted based on the salary distribution of similar employees in the company.
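
The salary example can be sketched in Python as follows. The set of relevant attributes is supplied by hand here, whereas the text describes selecting it by statistical analysis; the attribute names and employee records are hypothetical.

```python
from collections import Counter

def predict_distribution(records, query, relevant, target):
    """Return the relative frequency of each target value among the records
    that match the query on every relevant attribute."""
    similar = [r for r in records if all(r[a] == query[a] for a in relevant)]
    counts = Counter(r[target] for r in similar)
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()} if total else {}

# Hypothetical employee table.
employees = [
    {"dept": "sales", "level": "junior", "salary": "30-40K"},
    {"dept": "sales", "level": "junior", "salary": "30-40K"},
    {"dept": "sales", "level": "junior", "salary": "40-50K"},
    {"dept": "sales", "level": "senior", "salary": "60-70K"},
]

dist = predict_distribution(
    employees,
    query={"dept": "sales", "level": "junior"},
    relevant=["dept", "level"],
    target="salary",
)
print(dist)
```

The result is a value distribution rather than a single guess, which matches the description of predicting "the value distribution of certain attributes".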

DBMiner meta-rule guided miner
A meta-rule guided miner is a data mining mechanism which takes a user-specified meta-rule form, such as "" as a pattern to confine the search for desired rules. For example, one may specify the discovered rules to be in the form of "" in order to find the relationships between a student's major and his/her GPA in a university database.

DBMiner evolution evaluator
A data evolution evaluator evaluates the data evolution regularities for certain objects whose behavior changes over time. This may include characterization, classification, association, or clustering of time-related data. For example, one may find the general characteristics of the companies whose stock price has gone up over 20% last year or evaluate the trend or particular growth patterns of certain stocks.

DBMiner deviation evaluator
A deviation evaluator evaluates the deviation patterns for a set of task-relevant data in the database. For example, one may discover and evaluate a set of stocks whose behavior deviates from the trend of the majority of stocks during a certain period of time. The module contains the following three functions:
  1. recognizes or identifies the general trend and/or behavior for data in the database,
  2. detects the set of data which deviates from such a trend or behavior, and
  3. summarizes the general characteristics of deviation data.
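
The three functions above can be sketched in Python as follows. The median is used as the general trend and a fixed threshold decides what counts as a deviation; both choices, along with the stock symbols and return figures, are assumptions made for illustration only.

```python
from statistics import median

def find_deviations(values, threshold):
    """1. Take the median as the general trend.
    2. Collect the entries that differ from the trend by more than threshold.
    3. Summarize the deviating entries by direction."""
    trend = median(values.values())
    deviating = {k: v for k, v in values.items() if abs(v - trend) > threshold}
    return {
        "trend": trend,
        "count": len(deviating),
        "above": sorted(k for k, v in deviating.items() if v > trend),
        "below": sorted(k for k, v in deviating.items() if v < trend),
    }

# Hypothetical returns of five stocks over some period.
returns = {"AAA": 0.04, "BBB": 0.05, "CCC": 0.06, "DDD": 0.30, "EEE": -0.20}

result = find_deviations(returns, threshold=0.10)
print(result)
```

A more robust trend estimate or a data-driven threshold could be substituted without changing the overall three-step structure.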

DBMiner user interfaces
Three user interfaces (UNIX-based, Windows/NT-based, and WWW/Netscape-based GUIs) have been developed to allow users to interactively discover multiple-level knowledge in large relational databases. The system integrates well with existing commercial database systems, delivers high performance, and is robust at handling noise and exceptional data.


Implementation of DBMiner


Refer to the KDD publication of DBLab, SFU.


Further Development of DBMiner

The DBMiner system is currently being extended in several directions, as illustrated below.


Data Mining Research Project Team,
Database Systems Research Laboratory

  1. Jiawei Han.
    Ph.D., University of Wisconsin-Madison, 1985.
    Professor of the School of Computing Science and Director of Database Systems Research Laboratory, Simon Fraser University, Canada.

    He has conducted research in the areas of knowledge discovery in databases, deductive databases, object-oriented databases, spatial databases, multimedia databases, and logic programming, with over 100 journal and conference publications. He is known for his work on knowledge discovery in databases and has been invited to give talks or tutorials at several international conferences (including SSD'93, ICDE'95, CIKM'95, and SIGMOD'96), universities, and industry firms in many countries. His research has been supported by the Natural Sciences and Engineering Research Council (NSERC) of Canada (1988-present), the Networks of Centres of Excellence of Canada (IRIS-2 Project Leader for the project "data mining and knowledge discovery in large databases", 1994-1998), Hughes Research Laboratories (1995-1996), B.C. Science Council, MPR Teltech Ltd., and other funding agencies.

    He has served as a program committee member for over 20 international conferences and workshops, including ICDE'95 (PC vice-chairman), DOOD'95, VLDB'96, SIGMOD'96, and SSD'97. He is currently the program committee co-chairman of the Second Int'l Conf. on Knowledge Discovery and Data Mining (KDD'96), the workshop co-organizer of the SIGMOD'96 workshop on Research Issues on Data Mining and Knowledge Discovery (DMKD'96), and the guest co-editor of the special issue on data mining and knowledge discovery for the IEEE Transactions on Knowledge and Data Engineering. He is also an editor for IEEE Transactions on Knowledge and Data Engineering, Journal of Intelligent Information Systems, and Journal of Data Mining and Knowledge Discovery.

  2. Sonny Chee. (NSERC postgraduate scholarship holder)
    Ph.D. student, Computing Science, Simon Fraser University.
    He plans to work on the development of the new modules for mining unstructured data in the DBMiner system.

  3. Shan Chen.
    M.Sc. student, Computing Science, Simon Fraser University. She has been working on the application of statistical techniques in data mining, and, in particular, the DBMiner deviation evaluator.

  4. Jenny Chiang. (NSERC postgraduate scholarship holder)
    M.Sc. student, Computing Science, Simon Fraser University.
    She is working on the cube-based (multi-dimensional database) DBMiner and on performance improvements of the DBMiner system.

  5. Yongjian Fu. (BC Science Council GREAT award scholarship holder; CATA'95 (Canadian Advanced Technology Association) scholarship awardee)
    Ph.D. student, Computing Science, Simon Fraser University.
    He is the major implementor of DBMiner version 1.0 and is currently working on multiple-level mining of association rules and meta-rule guided data mining.

  6. Wan Gong.
    M.Sc. student, Computing Science, Simon Fraser University.
    She is working on the classifier of the DBMiner system.

  7. Micheline Kamber. (NSERC postgraduate scholarship awardee)
    Ph.D. student, Computing Science, Simon Fraser University. She has been working on interestingness measurements for discovered rules and is planning to work on meta-rule guided mining of different kinds of rules.

  8. Krzysztof Koperski.
    Ph.D. student, Computing Science, Simon Fraser University.
    He is working on spatial data mining and the GeoMiner project. He is also interested in spatial reasoning and spatial object-oriented databases. He has also been working on the testing of the DBMiner System.

  9. Deyi Li.
    Visiting Professor, Computing Science, Simon Fraser University. Ph.D. in Computing Science (1983), University of Edinburgh, U.K. Author of a few scientific books including "A Prolog Database System" and "A Fuzzy Prolog Database System".
    His major research interests include database and knowledge-base systems, knowledge discovery in databases, deductive databases, logic programming, and artificial intelligence.

  10. Yijun Lu.
    M.Sc. student, Computing Science, Simon Fraser University. He holds an M.Sc. degree, Mathematics and Statistics, Simon Fraser University.
    He is working on concept hierarchy generation, specification, and adjustment in the DBMiner system.

  11. Amynmohamed Rajan.
    M.Sc. student, Computing Science, Simon Fraser University.
    He is working on the development of the user-interfaces in the DBMiner system.

  12. Nebojsa Stefanovic.
    M.Sc. student, Computing Science, Simon Fraser University.
    He is working on data mining in spatial database systems and the GeoMiner project. He has also been developing the association and classification visualization for the DBMiner system.

  13. Wei Wang.
    M.Sc. student, Computing Science, Simon Fraser University.
    He has been implementing the GUI for the DBMiner system on PCs and is working on the predictor of the DBMiner system.

  14. Lara Winstone. (NSERC postgraduate scholarship holder)
    M.Sc. student, Computing Science, Simon Fraser University.
    She plans to work on the development of the classification techniques in the DBMiner system.

  15. Betty Xia.
    M.Sc. student, Computing Science, Simon Fraser University.
    She holds an M.Sc. degree, Computer Science, Jilin University, China. She is working on the development of the PC interface of the DBMiner system.

  16. Osmar R. Zaïane. (Quebec postgraduate scholarship holder)
    Ph.D. student, Computing Science, Simon Fraser University.
    His current research focuses on resource and knowledge discovery in global network information systems (inter/intranet), and the WebMiner project.
    He developed the WWW/Netscape-based DBMiner user interface.
    He is also working on the TeleLearning project, designing and implementing a multimedia database.

Research Funding

The research has been supported by the following funding agencies and industry partners.

  1. Natural Sciences and Engineering Research Council of Canada (NSERC)
  2. Networks of Centres of Excellence of Canada (IRIS-II:HMI-5, IC-2), administered by PRECARN Associates, Inc.
  3. British Columbia Science Council.
  4. MPR Teltech Ltd.
  5. Hughes Research Laboratories, USA.
  6. Centre for Systems Science, Simon Fraser University.

Selected Publications



Last updated: June 11, 1996. Page maintained by Osmar R. Zaïane (zaiane@cs.sfu.ca)