NSERC’s Awards Database
Award Details

Efficient mining of constrained patterns

Research Details
Application Id: 298317-2007
Competition Year: 2007
Fiscal Year: 2009-2010
Project Lead Name: Leung, CarsonKaiSang
Institution: University of Manitoba
Department: Computer Science
Province: Manitoba
Award Amount: $24,000
Installment: 3 - 5
Program: Discovery Grants Program - Individual
Selection Committee: Computing and Information Sciences - A
Research Subject: Database management
Area of Application: Information systems and technology
Co-Researchers: No Co-Researcher
Partners: No Partners
Award Summary

With advances in technology, a flood of data can be produced in many applications such as wireless sensor networks. Consequently, we are drowning in data but starving for knowledge. To be able to "drink from a fire hose" (i.e., to make sense of the flood of data), methods for extracting useful information from that data are in demand. This calls for data mining, which refers to the search for implicit, previously unknown, and potentially useful knowledge (such as frequent patterns and exceptional patterns) that might be embedded in the data.
Over the past few years, I have developed interactive algorithms for finding frequent patterns that satisfy a certain class of constraints. The algorithms are enhanced with optimizations such as a lightweight structure that provides sharper bounds on the frequency of frequent patterns. Furthermore, I have developed a novel tree structure for effectively capturing and updating the contents of the database in an incremental environment.
Along this direction, I propose to build a more efficient, user-friendly, and powerful mining system that (i) incorporates users' preferences, (ii) allows users to visualize the data, (iii) permits users to change the mining parameters and/or constraints during the mining process, (iv) provides users with comprehensible feedback in a "real-time" fashion, (v) discovers and exploits any unknown properties of constraints to avoid unnecessary computation and to further speed up performance, and (vi) keeps a good fusion of theory and practice via the exploration of real-life applications (e.g., mining from market basket data, Web click streams, agricultural/meteorological data, and medical/biomedical data). The discovered frequent patterns reveal common trends; the discovered exceptional patterns trigger alarm bells for the prevention of outbreaks or disasters.
In the long term, this proposal can also be extended to handle various types of data, ranging from traditional alphanumeric data to non-traditional multimedia data, from structured data to semi-structured XML data, and from traditional "static" transactional data to "dynamic" streams of continuous data.
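
To illustrate what mining frequent patterns under a constraint can look like, the following is a minimal sketch and not the algorithms or tree structure developed under this award. It assumes a generic Apriori-style level-wise search and an anti-monotone constraint (if an itemset violates it, every superset does too), so violating candidates can be dropped before their frequency is counted. The function name mine_constrained, the basket data, the prices, and the budget of 8 are all hypothetical.

```python
from itertools import combinations


def mine_constrained(transactions, min_support, constraint):
    """Return {itemset: support} for itemsets that are frequent and satisfy
    an anti-monotone constraint (assumed: violations carry over to supersets)."""
    transactions = [frozenset(t) for t in transactions]

    def support(itemset):
        # Number of transactions that contain every item of the itemset.
        return sum(1 for t in transactions if itemset <= t)

    items = sorted({item for t in transactions for item in t})
    level = [frozenset([item]) for item in items]
    result = {}
    while level:
        kept = []
        for candidate in level:
            # Check the constraint first: a violating candidate is pruned
            # without scanning the transactions at all.
            if not constraint(candidate):
                continue
            count = support(candidate)
            if count >= min_support:
                result[candidate] = count
                kept.append(candidate)
        # Build (k+1)-itemsets whose k-subsets all survived this level.
        kept_set = set(kept)
        next_level = set()
        for a, b in combinations(kept, 2):
            union = a | b
            if len(union) == len(a) + 1 and all(
                frozenset(sub) in kept_set
                for sub in combinations(union, len(a))
            ):
                next_level.add(union)
        level = sorted(next_level, key=sorted)
    return result


# Hypothetical market basket data and item prices.
PRICES = {"bread": 2, "milk": 3, "cheese": 6, "wine": 12}
BASKETS = [
    {"bread", "milk"},
    {"bread", "milk", "cheese"},
    {"bread", "cheese", "wine"},
    {"milk", "cheese"},
    {"bread", "milk", "wine"},
]


def within_budget(itemset):
    # Anti-monotone because prices are positive: adding items never lowers the total.
    return sum(PRICES[item] for item in itemset) <= 8


patterns = mine_constrained(BASKETS, min_support=2, constraint=within_budget)
for itemset, count in sorted(patterns.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), count)
```

In this toy run, "wine" is discarded at the first level because its price alone exceeds the budget, and {"milk", "cheese"} is discarded at the second level for the same reason, so neither is ever counted against the baskets. Constraints that are not anti-monotone would need a different pruning strategy.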