Abstract: A tremendous amount of text content is available in the form of documents, microblogs, scientific articles, etc., and it keeps growing exponentially over time with the arrival of new data from multiple sources. Scanning through such a large volume of data requires efficient text-mining techniques. Summarization techniques have become popular for extracting relevant information from huge amounts of data. Moreover, developing supervised techniques requires a huge amount of labeled data, and annotating data for supervised information extraction systems is time-consuming and costly. In summarization, the aim is to generate compressed, relevant, and concise information from the available data. Different facets of summarization, such as document summarization, figure summarization, microblog summarization, and multi-modal microblog summarization, will be discussed in the talk.
The task of summarization is posed as a multiobjective optimization problem in which multiple quality measures, such as cohesion, readability, and anti-redundancy, are simultaneously optimized. To compute these quality measures, different semantic similarity measures and textual entailment concepts are utilized. Techniques like differential evolution are used as the underlying optimization strategy. Some new genetic operators inspired by the concepts of the self-organizing map are also incorporated in the optimization process. Extensive experiments have verified that all our proposed methods outperform many other state-of-the-art methods when tested on task-related datasets.
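As a rough illustration of the formulation described above, the following minimal sketch treats extractive summarization as a two-objective problem solved with a basic differential evolution loop (DE/rand/1/bin) and Pareto dominance. It is not the speaker's method. The function names (`de_summarize`, `decode`, `objectives`), the fixed summary length `k`, and the parameter values are all hypothetical. Jaccard token overlap stands in for the semantic similarity and textual entailment measures mentioned in the abstract, and the self-organizing-map-inspired operators are not included.

```python
import random
from itertools import combinations

def jaccard(a, b):
    """Token-overlap similarity; an assumed stand-in for a semantic similarity measure."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def decode(vector, k):
    """Map a real-valued vector to the indices of its k highest-scoring sentences."""
    ranked = sorted(range(len(vector)), key=lambda i: vector[i], reverse=True)
    return sorted(ranked[:k])

def objectives(selected, sents):
    """Return (relevance, anti_redundancy); both are maximized."""
    doc = set().union(*sents)
    relevance = sum(jaccard(sents[i], doc) for i in selected) / len(selected)
    pairs = list(combinations(selected, 2))
    overlap = (sum(jaccard(sents[i], sents[j]) for i, j in pairs) / len(pairs)
               if pairs else 0.0)
    return relevance, 1.0 - overlap

def dominates(f, g):
    """Pareto dominance for maximization: f is no worse everywhere and better somewhere."""
    return all(x >= y for x, y in zip(f, g)) and any(x > y for x, y in zip(f, g))

def de_summarize(sentences, k=2, pop_size=12, generations=40, F=0.5, CR=0.9, seed=0):
    """Return a dict mapping non-dominated sentence selections to their objective values."""
    rng = random.Random(seed)
    sents = [set(s.lower().split()) for s in sentences]
    n = len(sents)
    pop = [[rng.random() for _ in range(n)] for _ in range(pop_size)]
    fit = [objectives(decode(v, k), sents) for v in pop]
    for _ in range(generations):
        for t in range(pop_size):
            # DE/rand/1 mutation from three distinct individuals other than the target
            a, b, c = rng.sample([i for i in range(pop_size) if i != t], 3)
            j_rand = rng.randrange(n)  # guarantees at least one mutated component
            trial = [
                pop[a][j] + F * (pop[b][j] - pop[c][j])
                if (rng.random() < CR or j == j_rand) else pop[t][j]
                for j in range(n)
            ]
            trial_fit = objectives(decode(trial, k), sents)
            # Replace the target only when the trial Pareto-dominates it
            if dominates(trial_fit, fit[t]):
                pop[t], fit[t] = trial, trial_fit
    front = {}
    for v, f in zip(pop, fit):
        if not any(dominates(g, f) for g in fit):
            front[tuple(decode(v, k))] = f
    return front

# Toy input, invented purely for illustration
demo_sentences = [
    "text summarization extracts relevant information",
    "summarization extracts relevant information from text",
    "differential evolution is an optimization strategy",
    "microblogs grow quickly over time",
]
for selection, (relevance, anti_redundancy) in de_summarize(demo_sentences).items():
    print(selection, round(relevance, 3), round(anti_redundancy, 3))
```

The result is a set of trade-off summaries rather than a single answer. A final summary would still have to be picked from this set, for example by a preference over the objectives. That selection step is left out of the sketch.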
Bio: Dr. Sriparna Saha is currently an Associate Professor in the Department of Computer Science and Engineering, Indian Institute of Technology Patna, India. She is the author of a book published by Springer-Verlag and has authored or coauthored more than 290 papers. Her current research interests include deep learning, natural language processing, machine learning, information extraction, text mining, bioinformatics, and multiobjective optimization. Her h-index is 28 and the total citation count of her papers is 4938 (according to Google Scholar). She is also a senior member of IEEE. Her name is included in the list of eight leading women scientists in the area of AI in India published by INDIAai, the National AI Portal of India. INDIAai is a joint initiative of MeitY, NeGD, and NASSCOM and a central hub for everything AI in India and beyond; the website aims to be the trusted content powerhouse in the backdrop of India's journey to global prominence in artificial intelligence. Her name is also included in the list of the top 2% of scientists in their main subfield discipline (Artificial Intelligence and Image Processing), among those who have published at least five papers (a survey conducted by Stanford University). She is an Associate Editor of IEEE/ACM Transactions on Computational Biology and Bioinformatics, Expert Systems with Applications, PLOS ONE, and IEEE Internet Computing. She is the recipient of the Lt Rashi Roy Memorial Gold Medal from the Indian Statistical Institute for outstanding performance in MTech (computer science). She is also the recipient of the Google India Women in Engineering Award 2008, NASI Young Scientist Platinum Jubilee Award 2016, BIRD Award 2016, IEI Young Engineers' Award 2016, SERB Women in Excellence Award 2018, and SERB Early Career Research Award 2018. In addition, she has received the DUO-India fellowship 2020, the Humboldt Research Fellowship, the Indo-U.S. Fellowship for Women in STEMM (WISTEMM) Women Overseas Fellowship program 2018, and a CNRS fellowship.