Abstract: Organizations today are awash in data. The number of datasets keeps growing as software produces both more and larger datasets. The challenge for many organizations is how to locate relevant data, how to combine it with other sources, and how to keep existing sources up to date. In this talk I will present our work on machine learning techniques and graph algorithms for automatically understanding the semantics of data sources. I will also describe how the resulting semantic descriptions can be used to locate relevant data, integrate it with other sources, and find errors and out-of-date information in existing sources. This research is implemented in a set of software tools that are freely available for download and use.
Bio: Craig Knoblock is the Keston Executive Director of the Information Sciences Institute and a Research Professor of both Computer Science and Spatial Sciences at the University of Southern California. He received his Ph.D. in computer science from Carnegie Mellon University. His research focuses on techniques for describing, acquiring, and exploiting the semantics of data. He has worked extensively on source modeling, schema and ontology alignment, entity and record linkage, data cleaning and normalization, extracting data from the web, and combining all of these techniques to build knowledge graphs. Dr. Knoblock is a Fellow of the Association for the Advancement of Artificial Intelligence (AAAI), the Association for Computing Machinery (ACM), and the Institute of Electrical and Electronics Engineers (IEEE).