CS497-KCC

Meta-Research in Information Access and Data Management

Supporting Ranked Queries for Data Retrieval

Call number: 01616, Credit: 1 unit

Fall 2003
Tuesday/Thursday, 5:00 - 6:15pm
 3211 DCL (note: location changed)



Instructor:

Kevin C. Chang

Administration:


About the Course

Research (in system engineering) is an activity of identifying problems and finding solutions, which eventually leads to the creation of new knowledge. While research is challenging in all aspects, often the decisive questions facing a researcher are:

"What research directions should I pursue? What problems should I solve?"

To learn to answer these "meta" questions, this course will study meta-research, or research about research, in the area of information access and data management. In particular, we will research what research to pursue, a critical activity for any successful research effort.

To establish a concrete technical focus, we will situate ourselves specifically in the area of supporting ranked top-k queries for data retrieval. As database systems face new challenges in non-traditional fuzzy retrieval scenarios, top-k or ranked queries are crucial for matching data by "soft" conditions, such as similarity, relevance, or preference. A multimedia database may rank objects by their "similarity" to an example image. A text search engine orders documents by their "relevance" to query terms. An e-commerce service may sort its products according to a user's "preference" criteria to facilitate purchase decisions. For these applications, Boolean queries (e.g., as in SQL) can be too restrictive because they do not capture partial matching and result ranking, which top-k queries specifically support.
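To make the contrast concrete, here is a minimal Python sketch of a Boolean filter versus a top-k query over a tiny product catalog. Everything in it is an assumption made for illustration: the catalog data, the function names, and the scoring formula (an equal-weight average of price closeness and normalized rating) do not come from the course readings. It scores every item, so it illustrates only what a top-k query returns, not how the algorithms in the readings evaluate such queries efficiently.

```python
import heapq

# Hypothetical catalog: (name, price in dollars, user rating from 0 to 5).
PRODUCTS = [
    ("laptop-a", 950, 4.5),
    ("laptop-b", 1200, 4.8),
    ("laptop-c", 700, 3.9),
    ("laptop-d", 1050, 4.1),
]


def boolean_query(products, max_price, min_rating):
    """Hard filter: an item either satisfies every condition or is dropped."""
    return [p[0] for p in products if p[1] <= max_price and p[2] >= min_rating]


def preference_score(product, target_price=900):
    """Soft score in [0, 1]: average of price closeness and normalized rating.

    The formula is an illustrative assumption, not a method from the readings.
    """
    _, price, rating = product
    price_closeness = max(0.0, 1.0 - abs(price - target_price) / target_price)
    return 0.5 * price_closeness + 0.5 * (rating / 5.0)


def top_k(products, k, score=preference_score):
    """Return the names of the k best-scoring products, best first."""
    return [p[0] for p in heapq.nlargest(k, products, key=score)]


if __name__ == "__main__":
    # No product is both at most $900 and rated at least 4.6, so this is empty.
    print(boolean_query(PRODUCTS, max_price=900, min_rating=4.6))
    # The ranked query still returns the two closest partial matches.
    print(top_k(PRODUCTS, k=2))
```

With this data, the Boolean query returns nothing because no item meets both hard conditions, while the top-k query still returns the best partial matches in ranked order.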

Thus, this class will take all of us, the instructor and students together, to identify and create a specific research agenda in the general area of top-k querying. We will research what research has been done, and we will research what research to undertake. Toward this end, this class consists of three stages, CORE, CONTEXT, and CREATION, which are described under Format.

Meanwhile, in parallel with the technical focus, we will also investigate some key "meta-questions" of research: How have research topics evolved? How does research in industrial labs compare with academic research? Or any other interesting questions you can think of.

Prerequisites


Format

This class is essentially a "working group," in which we will together study the literature and develop new ideas.  (So, no, this is not a lecture & exam-type class. And, no, it is not simply a "seminar" class.) We will learn by doing (not by lecturing, not by exams), in a rather informal setting. The main activities, in the various stages, are as follows:

The CORE Stage:

We will read, present, and discuss selected recent research literature to establish our core knowledge of the state of the art. Depending on enrollment, each student is expected to present a research paper in one of the class meetings.

For each such "core" paper (well, only 8 of them), students will read the paper before class and send a Discussion Question Suggestion (DQS) to the presenter by 9pm on the day before the class meeting. A DQS should be brief, no more than six sentences: one sentence that suggests a discussion topic and at most five more that provide some initial thoughts. Hopefully, students will suggest discussion questions that are interesting and thought-provoking. These DQS submissions will help form the basis of our class discussion.

DQS Submission:

The CONTEXT Stage:

We will decide on a set of context questions to survey (e.g., "What query paradigms have been proposed for database querying?"; "How has ranking been used in Computer Science?"; "How has ranking been used in social choice?"). Each student (or group of students) will survey one question by studying the literature over about four weeks. Students will then document their findings and report to the class in one of the meetings.

The CREATION Stage:

As our ultimate goal, we will identify and develop a set of promising research topics. Starting from mid-semester, each student (or group of students) will propose a research topic and develop its insight and promise.

As the deliverable, the student will write a 5-10 page mini-proposal to define and justify the problem and speculate on a solution. During the semester, we will discuss how to write a research proposal and study some successful examples.

Meta-Research Fun Investigation:

As a fun exercise, students will investigate some "meta-research" questions of their choosing (e.g., "How have research topics evolved?"; "How do academic researchers play a role in DB research?") and give a brief report of their findings to the class. You come up with your own question and find an answer!

Evaluation:

Final grades in the class will be determined approximately in the following way:


Schedule

The tentative schedule is as follows.  We may change the schedule as needed.

Most of the class readings are linked to their Electronic Edition (EE), where you can download online copies provided by the ACM Digital Library, the SIGMOD Anthology, or the authors. Note that the ACM DL is freely accessible from the UIUC domain (see Resources).

Important Dates

| Week | Date | Activities | Presenter |
| --- | --- | --- | --- |
| 1 | 08/28 | Introduction and Administrative Matters | Kevin |
| 2 | 09/02 | Ronald Fagin: Combining Fuzzy Information from Multiple Systems. PODS 1996: 216-226 [EE]. (Ref Only) Edward L. Wimmers, Laura M. Haas, Mary Tork Roth, Christoph Braendli: Using Fagin's Algorithm for Merging Ranked Results in Multimedia Middleware. CoopIS 1999: 267-278 [EE] | Kevin |
| 2 | 09/04 | Rakesh Agrawal, Edward L. Wimmers: A Framework for Expressing and Combining Preferences. SIGMOD Conference 2000: 297-306 [EE] [slides] | Zhen (zhang2@) |
| 3 | 09/09 | "Context" Getting Started: Introduction of the Context Survey Issues | Kevin |
| 3 | 09/11 | Surya Nepal, M. V. Ramakrishna: Query Processing Issues in Image (Multimedia) Databases. ICDE 1999: 22-29 [EE] [slides]. (Ref Only) Ulrich Güntzer, Wolf-Tilo Balke, Werner Kießling: Optimizing Multi-Feature Queries for Image Databases. VLDB 2000: 419-428 [EE] | Lars (leolson1@) |
| 4 | 09/16 | Ronald Fagin, Amnon Lotem, Moni Naor: Optimal Aggregation Algorithms for Middleware. PODS 2001 [EE]. Note: Class meeting rescheduled to 2501 DCL, Wed. 9/17, 5-6:15pm | Seung-won (shwang5@) |
| 4 | 09/18 | Kevin Chen-Chuan Chang, Seung-won Hwang: Minimal Probing: Supporting Expensive Predicates for Top-k Queries. SIGMOD Conference 2002: 346-357 [EE] [slides] | Sumin (ssong4@) |
| 5 | 09/23 | Michael J. Carey, Donald Kossmann: On Saying "Enough Already!" in SQL. SIGMOD Conference 1997: 219-230 [EE] [slides] | Ulrich (ukadow@) |
| 5 | 09/25 | Surajit Chaudhuri, Luis Gravano: Evaluating Top-k Selection Queries. VLDB 1999: 397-410 [EE] | Chengkai (cli@) |
| 6 | 09/30 | Werner Kießling: Foundations of Preferences in Database Systems. VLDB 2002: 311-322 [EE] [slides] | Johannes (kirschni@) |
| 6 | 10/02 | How to Write a Research Proposal? Also: Meta-Research Getting Started Introduction [slides]. (Ref) Projects in the NSF Information and Data Management Program | Kevin |
| 7 | 10/07 | "Creation" Getting Started: Report of Collective Thinking of New Research Agenda (Group A) [slides] | Group A & Kevin |
| 7 | 10/09 | "Creation" Getting Started: Report of Collective Thinking of New Research Agenda (Group B) [slides] | Group B & Kevin |
| 8 | 10/14 | Context: Q01: Sumin [slides]; Q02: Ava [slides] | |
| 8 | 10/16 | Context: Q03: Zhen [slides] | |
| 9 | 10/21 | Context: Q04: Lars [slides], Guixian [slides] | |
| 9 | 10/23 | Context: Q05: Johannes [slides] | |
| 10 | 10/28 | Context: Q06: Ulrich [slides] | |
| 10 | 10/30 | Context: Q07: William [slides] | |
| 11 | 11/04 | Context: Q08: Alex [slides] | |
| 11 | 11/06 | Context: Q09: Bin [slides] | |
| 12 | 11/11 | Context: Q10: Chengkai [slides] | |
| 12 | 11/13 | Context: Q11: Seung-won [slides] | |
| 13 | 11/18 | No class: Instead, working group meetings for proposing creation agenda in September/October | |
| 13 | 11/20 | Same as above | |
| 14 | 11/25 | No Class: Thanksgiving Vacation | |
| 14 | 11/27 | No Class: Thanksgiving Vacation | |
| 15 | 12/02 | Meta-Research Free-Investigation Report (1) [Ava], [Lars], [Sumin], [Ulrich], [Zhen], [Alex] | Group B |
| 15 | 12/04 | Meta-Research Free-Investigation Report (2) [Chengkai], [Seung-won], [Johannes], [William], [Guixian] | Group A |
| 16 | 12/09 | Creation-1: Future Agenda Discussion: Student Report [Alex], [Chengkai], [Seung-won], [Johannes], [William], [Guixian], [Ava], [Lars], [Sumin], [Ulrich], [Zhen] | |
| 16 | 12/11 | Creation-2: Future Agenda Discussion: Kevin's Report | |

University calendar:  Academic Year 2003-2004


Resources

Finding Papers Online:


Kevin C. Chang, kcchang@cs.uiuc.edu