Stream-Based Active Mining at Scale: Non-Linear Non-Submodular Maximization

Recent decades have transformed intelligent data analysis, driven by datasets of unprecedented scale. Analysis of big data is computationally demanding, resource hungry, and far more complex than analysis of conventional datasets. In recent emerging applications, most of the objective functions studied have been shown to be non-submodular or non-linear. Additionally, because billion-scale datasets change over time, scalable, stream-based adaptive algorithms that can quickly update solutions instead of recomputing them from scratch must be investigated. Together, these issues call for scalable, stream-based active mining techniques that can cope with the many applications of non-submodular maximization in the era of big data.
This project develops a theoretical framework, together with highly scalable approximation algorithms that carry tight theoretical performance guarantees, for the class of non-submodular and non-linear optimization problems. In particular, the project lays the foundation for novel data mining techniques suited to the era of big data and its emerging applications, and advances the research front of stochastic and stream-based algorithm design, with several key innovations: 1) Rigorous mathematical techniques for analyzing and designing highly scalable approximation algorithms for the class of non-monotone, non-submodular maximization problems, which underlies many emerging applications. 2) A new research direction that bridges non-linear optimization and combinatorial optimization, bringing new angles to the study of non-submodular optimization and a deeper understanding of the problem structures. 3) Novel stream-based active mining at scale for multiple applications, focused on two general models that unify many optimization problems in the domains of online social networks and privacy.