Information Discovery on Domain Data Graphs

An increasing amount of data is stored in an interconnected manner. Such data range from the Web (hyperlinked pages) to bibliographic data (citation graphs), biological data (associations among proteins, genes, and publications), and clinical data (associations among patients, hospitalizations, exams, and diagnoses).

Leveraging these data requires effective information discovery: given a question (query), find the pieces of data, or the associations between them, in the data graph that are “good” (relevant, authoritative, and specific) for the query, and rank them by their “goodness”. Submitting such queries should not require knowledge of a complex query language (e.g., SQL) or of the details of the data (e.g., the schema). Unfortunately, little has been done to provide high-quality information discovery on data graphs in domains other than the Web, where search engines have been successful.

This project will facilitate effective information discovery on domain data (biological, clinical, patent, e-commerce, and spatial), which can lead to cost savings and increased research productivity in these domains.

This project is sponsored by NSF (IIS 0811922, 2008-2011).

People

1.      Vagelis Hristidis, PI, Assistant Professor, FIU

2.      Theodoros Chondrogiannis, PhD student

3.      Fernando Farfán, PhD student

4.      Eduardo Ruiz, PhD student

Alumni

5.      Ramakrishna Varadarajan (PostDoc, University of Wisconsin, Madison)

6.      Alejandro Hernandez, FIU B.S. graduate

Publications

Books

1.    Vagelis Hristidis. Information Discovery on Electronic Health Records. Under preparation. To be published by CRC - Taylor & Francis, 2009 (also co-authored 4 chapters)

Conferences/Workshops/Journals

2.      Abhijith Kashyap, Vagelis Hristidis, Michalis Petropoulos, and Sotiria Tavoulari. Exploring Biomedical Databases with BioNav. Demo Paper, ACM SIGMOD Conference 2009 (acceptance rate 37%)

3.      Ramakrishna Varadarajan, Vagelis Hristidis, Louiqa Raschid, Maria-Esther Vidal, Luis Ibanez, and Hector Rodriguez-Drumond. Flexible and Efficient Querying and Ranking on Hyperlinked Data Sources. EDBT 2009 (full paper, acceptance rate 33%)

4.      Fernando Farfán, Vagelis Hristidis, Anand Ranganathan, and Michael Weiner. XOntoRank: Ontology-Aware Search of Electronic Medical Records. IEEE International Conference on Data Engineering (ICDE) 2009 (long paper, acceptance rate 17%)

5.      Abhijith Kashyap, Vagelis Hristidis, Michalis Petropoulos, and Sotiria Tavoulari. BioNav: Effective Navigation on Query Results of Biomedical Databases. IEEE International Conference on Data Engineering (ICDE) 2009 (short paper, acceptance rate 27%)

6.      Vagelis Hristidis, Oscar Valdivia, Michail Vlachos, and Philip S. Yu. Information Discovery across Multiple Streams. Elsevier Information Sciences, 2009

7.      Ian De Felipe, Vagelis Hristidis, Naphtali Rishe. Keyword Search on Spatial Databases. IEEE ICDE 2008 (full paper, long presentation, acceptance rate 12%)

8.      Ramakrishna Varadarajan, Vagelis Hristidis, Louiqa Raschid. Explaining and Reformulating Authority Flow Queries. IEEE ICDE 2008 (full paper, short presentation, acceptance rate 19%)

9.      Fernando Farfán, Vagelis Hristidis, Anand Ranganathan, Redmond P. Burke. Ontology-Aware Search on XML-based Electronic Medical Records. Poster Paper, IEEE ICDE 2008 (acceptance rate 31%)

Outreach Activities

High School Outreach Program