Paper Group ANR 81
On the convex geometry of blind deconvolution and matrix completion
Title | On the convex geometry of blind deconvolution and matrix completion |
Authors | Felix Krahmer, Dominik Stöger |
Abstract | Low-rank matrix recovery from structured measurements has been a topic of intense study in the last decade and many important problems like matrix completion and blind deconvolution have been formulated in this framework. An important benchmark method to solve these problems is to minimize the nuclear norm, a convex proxy for the rank. A common approach to establish recovery guarantees for this convex program relies on the construction of a so-called approximate dual certificate. However, this approach provides only limited insight in various respects. Most prominently, the noise bounds exhibit seemingly suboptimal dimension factors. In this paper we take a novel, more geometric viewpoint to analyze both the matrix completion and the blind deconvolution scenario. We find that for both these applications the dimension factors in the noise bounds are not an artifact of the proof, but the problems are intrinsically badly conditioned. We show, however, that bad conditioning only arises for very small noise levels: Under mild assumptions that include many realistic noise levels we derive near-optimal error estimates for blind deconvolution under adversarial noise. |
Tasks | Matrix Completion |
Published | 2019-02-28 |
URL | http://arxiv.org/abs/1902.11156v2 |
PDF | http://arxiv.org/pdf/1902.11156v2.pdf |
PWC | https://paperswithcode.com/paper/on-the-convex-geometry-of-blind-deconvolution |
Repo | |
Framework | |
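For reference, the nuclear norm minimization program mentioned in the abstract is usually written in the following form. The notation is assumed here and not taken from the abstract: $\mathcal{A}$ is the linear measurement operator, $y$ the noisy observations, $\eta$ a bound on the noise level, and $\sigma_i(X)$ the singular values of $X$.

$$
\hat{X} \in \arg\min_{X \in \mathbb{C}^{n_1 \times n_2}} \|X\|_* \quad \text{subject to} \quad \|\mathcal{A}(X) - y\|_2 \le \eta, \qquad \text{where } \|X\|_* = \sum_i \sigma_i(X).
$$

The noise bounds discussed in the abstract concern how the error $\|\hat{X} - X_0\|_F$ of this program scales with $\eta$ and the dimensions.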
A Knowledge Transfer Framework for Differentially Private Sparse Learning
Title | A Knowledge Transfer Framework for Differentially Private Sparse Learning |
Authors | Lingxiao Wang, Quanquan Gu |
Abstract | We study the problem of estimating high dimensional models with underlying sparse structures while preserving the privacy of each training example. We develop a differentially private high-dimensional sparse learning framework using the idea of knowledge transfer. More specifically, we propose to distill the knowledge from a “teacher” estimator trained on a private dataset, by creating a new dataset from auxiliary features, and then train a differentially private “student” estimator using this new dataset. In addition, we establish the linear convergence rate as well as the utility guarantee for our proposed method. For sparse linear regression and sparse logistic regression, our method achieves improved utility guarantees compared with the best known results (Kifer et al., 2012; Wang and Gu, 2019). We further demonstrate the superiority of our framework through both synthetic and real-world data experiments. |
Tasks | Sparse Learning, Transfer Learning |
Published | 2019-09-13 |
URL | https://arxiv.org/abs/1909.06322v1 |
PDF | https://arxiv.org/pdf/1909.06322v1.pdf |
PWC | https://paperswithcode.com/paper/a-knowledge-transfer-framework-for |
Repo | |
Framework | |
Scalable Semi-Supervised SVM via Triply Stochastic Gradients
Title | Scalable Semi-Supervised SVM via Triply Stochastic Gradients |
Authors | Xiang Geng, Bin Gu, Xiang Li, Wanli Shi, Guansheng Zheng, Heng Huang |
Abstract | Semi-supervised learning (SSL) plays an increasingly important role in the big data era because a large number of unlabeled samples can be used effectively to improve the performance of the classifier. Semi-supervised support vector machine (S$^3$VM) is one of the most appealing methods for SSL, but scaling up S$^3$VM for kernel learning is still an open problem. Recently, a doubly stochastic gradient (DSG) algorithm has been proposed to achieve efficient and scalable training for kernel methods. However, the algorithm and theoretical analysis of DSG are developed based on the convexity assumption, which makes them unsuitable for non-convex problems such as S$^3$VM. To address this problem, in this paper, we propose a triply stochastic gradient algorithm for S$^3$VM, called TSGS$^3$VM. Specifically, to handle the two types of data instances involved in S$^3$VM, TSGS$^3$VM samples a labeled instance and an unlabeled instance, together with random features, in each iteration to compute a triply stochastic gradient. We use the approximated gradient to update the solution. More importantly, we establish a new theoretical analysis for TSGS$^3$VM which guarantees that TSGS$^3$VM can converge to a stationary point. Extensive experimental results on a variety of datasets demonstrate that TSGS$^3$VM is much more efficient and scalable than existing S$^3$VM algorithms. |
Tasks | |
Published | 2019-07-26 |
URL | https://arxiv.org/abs/1907.11584v1 |
PDF | https://arxiv.org/pdf/1907.11584v1.pdf |
PWC | https://paperswithcode.com/paper/scalable-semi-supervised-svm-via-triply |
Repo | |
Framework | |
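The sampling scheme described in the abstract (one labeled instance, one unlabeled instance, and one random feature per iteration) can be sketched roughly as below. This is an illustrative sketch under stated assumptions, not the authors' TSGS$^3$VM implementation: the losses (hinge on labeled data, `max(0, 1 - |f(x)|)` on unlabeled data), the RBF random Fourier features, the step-size schedule, and all function and parameter names are assumptions made for the example.

```python
import math
import random


def random_feature(rng, dim, sigma):
    """Draw one random Fourier feature (omega, phase) for an RBF kernel of width sigma."""
    omega = [rng.gauss(0.0, 1.0 / sigma) for _ in range(dim)]
    phase = rng.uniform(0.0, 2.0 * math.pi)
    return omega, phase


def feature_value(feature, x):
    """Evaluate sqrt(2) * cos(omega . x + phase) at the point x."""
    omega, phase = feature
    return math.sqrt(2.0) * math.cos(sum(w * v for w, v in zip(omega, x)) + phase)


def predict(model, x):
    """f(x) as a weighted sum of the random features collected so far."""
    return sum(alpha * feature_value(feat, x) for alpha, feat in model)


def train_tsg(labeled, unlabeled, steps=300, step_size=0.5, lam=0.01,
              unlabeled_weight=0.1, sigma=1.0, seed=0):
    """Hypothetical triply stochastic update loop (assumed losses, see lead-in)."""
    rng = random.Random(seed)
    dim = len(unlabeled[0])
    model = []  # list of (coefficient, random feature) pairs
    for t in range(1, steps + 1):
        gamma = step_size / math.sqrt(t)
        x_l, y = rng.choice(labeled)            # source of randomness 1: labeled instance
        x_u = rng.choice(unlabeled)             # source of randomness 2: unlabeled instance
        feat = random_feature(rng, dim, sigma)  # source of randomness 3: random feature
        f_l, f_u = predict(model, x_l), predict(model, x_u)
        # Subgradient of the hinge loss on the labeled point.
        g_l = -y if y * f_l < 1.0 else 0.0
        # Subgradient of the (non-convex) unlabeled loss max(0, 1 - |f|).
        sign_u = (f_u > 0) - (f_u < 0)
        g_u = -unlabeled_weight * sign_u if abs(f_u) < 1.0 else 0.0
        # Regularization shrinks old coefficients; the new feature gets the gradient step.
        model = [(alpha * (1.0 - gamma * lam), f) for alpha, f in model]
        new_alpha = -gamma * (g_l * feature_value(feat, x_l) + g_u * feature_value(feat, x_u))
        model.append((new_alpha, feat))
    return model
```

Here `labeled` is a list of `(x, y)` pairs with `y` in `{-1, +1}` and `unlabeled` is a list of points `x`; the model grows by one random feature per iteration, which is the usual trade-off of this family of methods between memory and kernel approximation quality.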
Pipelines for Procedural Information Extraction from Scientific Literature: Towards Recipes using Machine Learning and Data Science
Title | Pipelines for Procedural Information Extraction from Scientific Literature: Towards Recipes using Machine Learning and Data Science |
Authors | Huichen Yang, Carlos A. Aguirre, Maria F. De La Torre, Derek Christensen, Luis Bobadilla, Emily Davich, Jordan Roth, Lei Luo, Yihong Theis, Alice Lam, T. Yong-Jin Han, David Buttler, William H. Hsu |
Abstract | This paper describes a machine learning and data science pipeline for structured information extraction from documents, implemented as a suite of open-source tools and extensions to existing tools. It centers around a methodology for extracting procedural information in the form of recipes, stepwise procedures for creating an artifact (in this case synthesizing a nanomaterial), from published scientific literature. From our overall goal of producing recipes from free text, we derive the technical objectives of a system consisting of pipeline stages: document acquisition and filtering, payload extraction, recipe step extraction as a relationship extraction task, recipe assembly, and presentation through an information retrieval interface with question answering (QA) functionality. This system meets computational information and knowledge management (CIKM) requirements of metadata-driven payload extraction, named entity extraction, and relationship extraction from text. Functional contributions described in this paper include semi-supervised machine learning methods for PDF filtering and payload extraction tasks, followed by structured extraction and data transformation tasks beginning with section extraction, recipe steps as information tuples, and finally assembled recipes. Measurable objective criteria for extraction quality include precision and recall of recipe steps, ordering constraints, and QA accuracy, precision, and recall. Results, key novel contributions, and significant open problems derived from this work center around the attribution of these holistic quality measures to specific machine learning and inference stages of the pipeline, each with their performance measures. The desired recipes contain identified preconditions, material inputs, and operations, and constitute the overall output generated by our computational information and knowledge management (CIKM) system. |
Tasks | Entity Extraction, Information Retrieval, Question Answering |
Published | 2019-12-16 |
URL | https://arxiv.org/abs/1912.07747v1 |
PDF | https://arxiv.org/pdf/1912.07747v1.pdf |
PWC | https://paperswithcode.com/paper/pipelines-for-procedural-information |
Repo | |
Framework | |
Detecting AI Trojans Using Meta Neural Analysis
Title | Detecting AI Trojans Using Meta Neural Analysis |
Authors | Xiaojun Xu, Qi Wang, Huichen Li, Nikita Borisov, Carl A. Gunter, Bo Li |
Abstract | Machine learning models, especially neural networks (NNs), have achieved outstanding performance on diverse and complex applications. However, recent work has found that they are vulnerable to Trojan attacks where an adversary trains a corrupted model with poisoned data or directly manipulates its parameters in a stealthy way. Such Trojaned models can obtain good performance on normal data during test time while predicting incorrectly on the adversarially manipulated data samples. This paper aims to develop ways to detect Trojaned models. We mainly explore the idea of meta neural analysis, a technique involving training a meta NN model that can be used to predict whether or not a target NN model has certain properties. We develop a novel pipeline, the Meta Neural Trojaned model Detection (MNTD) system, to predict whether a given NN is Trojaned via meta neural analysis on a set of trained shadow models. We propose two ways to train the meta-classifier without knowing the Trojan attacker’s strategies. The first one, one-class learning, fits a novelty detection meta-classifier using only benign neural networks. The second one, called jumbo learning, approximates a general distribution of Trojaned models and samples a “jumbo” set of Trojaned models to train the meta-classifier, which is then evaluated on unseen Trojan strategies. Extensive experiments demonstrate the effectiveness of MNTD in detecting different Trojan attacks in diverse areas such as vision, speech, tabular data, and natural language processing. We show that MNTD reaches an average of 97% detection AUC (Area Under the ROC Curve) score and outperforms existing approaches. Furthermore, we design and evaluate the MNTD system against strong adaptive attackers who have exact knowledge of the detection method, which demonstrates the robustness of MNTD. |
Tasks | |
Published | 2019-10-08 |
URL | https://arxiv.org/abs/1910.03137v2 |
PDF | https://arxiv.org/pdf/1910.03137v2.pdf |
PWC | https://paperswithcode.com/paper/detecting-ai-trojans-using-meta-neural |
Repo | |
Framework | |
Identification of primary angle-closure on AS-OCT images with Convolutional Neural Networks
Title | Identification of primary angle-closure on AS-OCT images with Convolutional Neural Networks |
Authors | Chenglang Yuan, Cheng Bian, Hongjian Kang, Shu Liang, Kai Ma, Yefeng Zheng |
Abstract | Primary angle-closure disease (PACD) is a severe retinal disease, which might cause irreversible vision loss. In clinical practice, accurate identification of angle-closure and localization of the scleral spur’s position on anterior segment optical coherence tomography (AS-OCT) is essential for the diagnosis of PACD. However, manual delineation can suffer from low accuracy and low efficiency. In this paper, we propose an efficient and accurate end-to-end architecture for angle-closure classification and scleral spur localization. Specifically, we utilize a revised ResNet152 as our backbone to improve the accuracy of the angle-closure identification. For scleral spur localization, we adopt EfficientNet as the encoder because of its powerful feature extraction potential. By combining the skip-connect module and pyramid pooling module, the network is able to collect semantic cues in feature maps from multiple dimensions and scales. Afterward, we propose a novel keypoint registration loss to constrain the model’s attention to the intensity and location of the scleral spur area. We conduct extensive experiments to evaluate our method on the angle-closure glaucoma evaluation (AGE) Challenge dataset. The results show that our proposed architecture ranks first in the classification task on the test dataset and achieves an average Euclidean distance error of 12.00 pixels in the scleral spur localization task. |
Tasks | |
Published | 2019-10-23 |
URL | https://arxiv.org/abs/1910.10414v1 |
PDF | https://arxiv.org/pdf/1910.10414v1.pdf |
PWC | https://paperswithcode.com/paper/identification-of-primary-angle-closure-on-as |
Repo | |
Framework | |
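The localization metric quoted in the abstract, average Euclidean distance error in pixels, can be computed as in the minimal sketch below. The function name and the `(x, y)` pixel-coordinate convention are assumptions for illustration; the challenge's official evaluation script may differ in details.

```python
import math


def mean_euclidean_error(predicted, ground_truth):
    """Average pixel distance between predicted and annotated keypoints."""
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("need two non-empty point lists of equal length")
    distances = [
        math.hypot(px - gx, py - gy)
        for (px, py), (gx, gy) in zip(predicted, ground_truth)
    ]
    return sum(distances) / len(distances)
```

For example, `mean_euclidean_error([(0, 0), (3, 4)], [(0, 0), (0, 0)])` averages distances of 0 and 5 pixels.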
Scene Understanding for Autonomous Manipulation with Deep Learning
Title | Scene Understanding for Autonomous Manipulation with Deep Learning |
Authors | Anh Nguyen |
Abstract | Over the past few years, deep learning techniques have achieved tremendous success in many visual understanding tasks such as object detection, image segmentation, and caption generation. Despite this success in computer vision and natural language processing, deep learning has not yet shown significant impact in robotics. Due to the gap between theory and application, there are many challenges when applying the results of deep learning to real robotic systems. In this study, our long-term goal is to bridge the gap between computer vision and robotics by developing visual methods that can be used in real robots. In particular, this work tackles two fundamental visual problems for autonomous robotic manipulation: affordance detection and fine-grained action understanding. Theoretically, we propose different deep architectures to further improve the state of the art in each problem. Empirically, we show that the outcomes of our proposed methods can be applied in real robots and allow them to perform useful manipulation tasks. |
Tasks | Object Detection, Scene Understanding, Semantic Segmentation |
Published | 2019-03-23 |
URL | http://arxiv.org/abs/1903.09761v1 |
PDF | http://arxiv.org/pdf/1903.09761v1.pdf |
PWC | https://paperswithcode.com/paper/scene-understanding-for-autonomous |
Repo | |
Framework | |
Visualizing Movement Control Optimization Landscapes
Title | Visualizing Movement Control Optimization Landscapes |
Authors | Perttu Hämäläinen, Juuso Toikka, C. Karen Liu |
Abstract | A large body of animation research focuses on optimization of movement control, either as action sequences or policy parameters. However, as closed-form expressions of the objective functions are often not available, our understanding of the optimization problems is limited. Building on recent work on analyzing neural network training, we contribute novel visualizations of high-dimensional control optimization landscapes; this yields insights into why control optimization is hard and why common practices like early termination and spline-based action parameterizations make optimization easier. For example, our experiments show how trajectory optimization can become increasingly ill-conditioned with longer trajectories, but parameterizing control as partial target states - e.g., target angles converted to torques using a PD-controller - can act as an efficient preconditioner. Both our visualizations and quantitative empirical data also indicate that neural network policy optimization scales better than trajectory optimization for long planning horizons. Our work advances the understanding of movement optimization and our visualizations should also provide value in educational use. |
Tasks | |
Published | 2019-09-17 |
URL | https://arxiv.org/abs/1909.07869v2 |
PDF | https://arxiv.org/pdf/1909.07869v2.pdf |
PWC | https://paperswithcode.com/paper/visualizing-movement-control-optimization |
Repo | |
Framework | |
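The parameterization mentioned in the abstract, target angles converted to torques by a PD-controller, can be illustrated with the minimal sketch below. The gains, the single frictionless hinge joint, and the integration scheme are assumptions chosen for the example and are not taken from the paper.

```python
def pd_torque(target_angle, angle, angular_velocity, kp=50.0, kd=5.0):
    """Proportional-derivative torque that pulls the joint toward target_angle."""
    return kp * (target_angle - angle) - kd * angular_velocity


def simulate_joint(targets, dt=0.01, inertia=1.0, kp=50.0, kd=5.0):
    """Roll out one hinge joint whose control input is a sequence of target angles."""
    angle, velocity = 0.0, 0.0
    trajectory = []
    for target in targets:
        torque = pd_torque(target, angle, velocity, kp, kd)
        velocity += (torque / inertia) * dt  # semi-implicit Euler: velocity first
        angle += velocity * dt               # then position with the new velocity
        trajectory.append(angle)
    return trajectory
```

With this parameterization the optimizer searches over `targets` rather than raw torques, so a small change in one control value moves the joint toward a nearby pose instead of accumulating into a large drift over the trajectory.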
Learning Longer-term Dependencies via Grouped Distributor Unit
Title | Learning Longer-term Dependencies via Grouped Distributor Unit |
Authors | Wei Luo, Feng Yu |
Abstract | Learning long-term dependencies remains difficult for recurrent neural networks (RNNs) despite their recent success in sequence modeling. In this paper, we propose a novel gated RNN structure, which contains only one gate. Hidden states in the proposed grouped distributor unit (GDU) are partitioned into groups. For each group, the proportion of memory to be overwritten in each state transition is limited to a constant and is adaptively distributed to each group member. In other words, every separate group has a fixed overall update rate, yet all units are allowed to have different paces. Information is therefore forced to be latched in a flexible way, which helps the model to capture long-term dependencies in data. Besides having a simpler structure, GDU is demonstrated experimentally to outperform LSTM and GRU on tasks including both pathological problems and natural data sets. |
Tasks | |
Published | 2019-04-29 |
URL | https://arxiv.org/abs/1906.08856v1 |
PDF | https://arxiv.org/pdf/1906.08856v1.pdf |
PWC | https://paperswithcode.com/paper/learning-longer-term-dependencies-via-grouped |
Repo | |
Framework | |
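One way to realize the constraint described in the abstract, a fixed total update rate per group that is distributed adaptively among the group's units, is a softmax over each group's gate logits. The sketch below assumes that form; the function names are hypothetical, and the candidate state and gate logits are passed in directly rather than computed from learned weights.

```python
import math


def group_softmax(logits, group_size):
    """Softmax applied separately to each consecutive group of logits."""
    gates = []
    for start in range(0, len(logits), group_size):
        group = logits[start:start + group_size]
        peak = max(group)  # subtract the max for numerical stability
        exps = [math.exp(v - peak) for v in group]
        total = sum(exps)
        gates.extend(e / total for e in exps)
    return gates


def gdu_step(h_prev, candidate, gate_logits, group_size):
    """Single-gate state update: each group's gates sum to one (assumed constant)."""
    gates = group_softmax(gate_logits, group_size)
    return [(1.0 - u) * h + u * c for u, h, c in zip(gates, h_prev, candidate)]
```

Because the gates in a group sum to a constant, a unit can only be overwritten quickly if its group mates are overwritten slowly, which is the latching behaviour the abstract describes.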
Unifying Heterogeneous Classifiers with Distillation
Title | Unifying Heterogeneous Classifiers with Distillation |
Authors | Jayakorn Vongkulbhisal, Phongtharin Vinayavekhin, Marco Visentini-Scarzanella |
Abstract | In this paper, we study the problem of unifying knowledge from a set of classifiers with different architectures and target classes into a single classifier, given only a generic set of unlabelled data. We call this problem Unifying Heterogeneous Classifiers (UHC). This problem is motivated by scenarios where data is collected from multiple sources, but the sources cannot share their data, e.g., due to privacy concerns, and only privately trained models can be shared. In addition, each source may not be able to gather data to train all classes due to data availability at each source, and may not be able to train the same classification model due to different computational resources. To tackle this problem, we propose a generalisation of knowledge distillation to merge HCs. We derive a probabilistic relation between the outputs of HCs and the probability over all classes. Based on this relation, we propose two classes of methods based on cross-entropy minimisation and matrix factorisation, which allow us to estimate soft labels over all classes from unlabelled samples and use them in lieu of ground truth labels to train a unified classifier. Our extensive experiments on ImageNet, LSUN, and Places365 datasets show that our approaches significantly outperform a naive extension of distillation and can achieve almost the same accuracy as classifiers that are trained in a centralised, supervised manner. |
Tasks | |
Published | 2019-04-12 |
URL | http://arxiv.org/abs/1904.06062v1 |
PDF | http://arxiv.org/pdf/1904.06062v1.pdf |
PWC | https://paperswithcode.com/paper/unifying-heterogeneous-classifiers-with |
Repo | |
Framework | |
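The abstract does not spell out the probabilistic relation between each classifier's outputs and the probability over all classes. A natural form, assumed here for illustration, is that a classifier trained on a subset of classes outputs the unified distribution renormalized over that subset; the cross-entropy objective then compares each classifier's soft output with this renormalized distribution. Function names and the dictionary representation are hypothetical.

```python
import math


def restrict_to_classes(q, class_subset):
    """Renormalize the unified distribution q over one classifier's classes."""
    mass = sum(q[c] for c in class_subset)
    return {c: q[c] / mass for c in class_subset}


def distillation_loss(q, classifier_outputs):
    """Sum of cross-entropies between each classifier's soft output and restricted q."""
    loss = 0.0
    for output in classifier_outputs:  # output maps class name -> probability
        restricted = restrict_to_classes(q, list(output))
        loss -= sum(p * math.log(restricted[c]) for c, p in output.items() if p > 0.0)
    return loss
```

Minimizing `distillation_loss` over `q` for each unlabelled sample would give soft labels over all classes, which can then stand in for ground-truth labels when training the unified classifier.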
ROAM: Recurrently Optimizing Tracking Model
Title | ROAM: Recurrently Optimizing Tracking Model |
Authors | Tianyu Yang, Pengfei Xu, Runbo Hu, Hua Chai, Antoni B. Chan |
Abstract | In this paper, we design a tracking model consisting of response generation and bounding box regression, where the first component produces a heat map to indicate the presence of the object at different positions and the second component regresses the relative bounding box shifts to anchors mounted on sliding-window locations. Thanks to the resizable convolutional filters used in both components to adapt to the shape changes of objects, our tracking model does not need to enumerate different sized anchors, thus saving model parameters. To effectively adapt the model to appearance variations, we propose to train offline a recurrent neural optimizer that updates the tracking model in a meta-learning setting and can converge the model in a few gradient steps. This improves the convergence speed of updating the tracking model while achieving better performance. We extensively evaluate our trackers, ROAM and ROAM++, on the OTB, VOT, LaSOT, GOT-10K and TrackingNet benchmarks, and our methods perform favorably against state-of-the-art algorithms. |
Tasks | Meta-Learning |
Published | 2019-07-28 |
URL | https://arxiv.org/abs/1907.12006v3 |
PDF | https://arxiv.org/pdf/1907.12006v3.pdf |
PWC | https://paperswithcode.com/paper/roam-recurrently-optimizing-tracking-model |
Repo | |
Framework | |
The Futility of Bias-Free Learning and Search
Title | The Futility of Bias-Free Learning and Search |
Authors | George D. Montanez, Jonathan Hayase, Julius Lauw, Dominique Macias, Akshay Trikha, Julia Vendemiatti |
Abstract | Building on the view of machine learning as search, we demonstrate the necessity of bias in learning, quantifying the role of bias (measured relative to a collection of possible datasets, or more generally, information resources) in increasing the probability of success. For a given degree of bias towards a fixed target, we show that the proportion of favorable information resources is strictly bounded from above. Furthermore, we demonstrate that bias is a conserved quantity, such that no algorithm can be favorably biased towards many distinct targets simultaneously. Thus bias encodes trade-offs. The probability of success for a task can also be measured geometrically, as the angle of agreement between what holds for the actual task and what is assumed by the algorithm, represented in its bias. Lastly, finding a favorably biasing distribution over a fixed set of information resources is provably difficult, unless the set of resources itself is already favorable with respect to the given task and algorithm. |
Tasks | |
Published | 2019-07-13 |
URL | https://arxiv.org/abs/1907.06010v1 |
PDF | https://arxiv.org/pdf/1907.06010v1.pdf |
PWC | https://paperswithcode.com/paper/the-futility-of-bias-free-learning-and-search |
Repo | |
Framework | |
Reasoning about disclosure in data integration in the presence of source constraints
Title | Reasoning about disclosure in data integration in the presence of source constraints |
Authors | Michael Benedikt, Pierre Bourhis, Louis Jachiet, Michaël Thomazo |
Abstract | Data integration systems allow users to access data sitting in multiple sources by means of queries over a global schema, related to the sources via mappings. Data sources often contain sensitive information, and thus an analysis is needed to verify that a schema satisfies a privacy policy, given as a set of queries whose answers should not be accessible to users. Such an analysis should take into account not only knowledge that an attacker may have about the mappings, but also what they may know about the semantics of the sources. In this paper, we show that source constraints can have a dramatic impact on disclosure analysis. We study the problem of determining whether a given data integration system discloses a source query to an attacker in the presence of constraints, providing both lower and upper bounds on source-aware disclosure analysis. |
Tasks | |
Published | 2019-06-03 |
URL | https://arxiv.org/abs/1906.00624v1 |
PDF | https://arxiv.org/pdf/1906.00624v1.pdf |
PWC | https://paperswithcode.com/paper/190600624 |
Repo | |
Framework | |
Learning Symmetric and Asymmetric Steganography via Adversarial Training
Title | Learning Symmetric and Asymmetric Steganography via Adversarial Training |
Authors | Zheng Li, Ge Han, Yunqing Wei, Shanqing Guo |
Abstract | Steganography refers to the art of concealing secret messages within multiple media carriers so that an eavesdropper is unable to detect the presence and content of the hidden messages. In this paper, we first propose a novel key-dependent steganographic scheme that achieves steganographic objectives with adversarial training. Symmetric (secret-key) and asymmetric (public-key) steganographic schemes are proposed separately, and each scheme is successfully designed and implemented. We show that the encodings produced by our scheme improve invisibility by 20% over previous deep-learning-based work and achieve undetectability 25% better than classic steganographic algorithms. Finally, we simulated our scheme in a real situation where the decoder recovered the original message with an accuracy of more than 98%. |
Tasks | |
Published | 2019-03-13 |
URL | https://arxiv.org/abs/1903.05297v2 |
PDF | https://arxiv.org/pdf/1903.05297v2.pdf |
PWC | https://paperswithcode.com/paper/learning-symmetric-and-asymmetric |
Repo | |
Framework | |
Improved low-count quantitative PET reconstruction with an iterative neural network
Title | Improved low-count quantitative PET reconstruction with an iterative neural network |
Authors | Hongki Lim, Il Yong Chun, Yuni K. Dewaraja, Jeffrey A. Fessler |
Abstract | Image reconstruction in low-count PET is particularly challenging because gammas from natural radioactivity in Lu-based crystals cause high random fractions that lower the measurement signal-to-noise-ratio (SNR). In model-based image reconstruction (MBIR), using more iterations of an unregularized method may increase the noise, so incorporating regularization into the image reconstruction is desirable to control the noise. New regularization methods based on learned convolutional operators are emerging in MBIR. We modify the architecture of an iterative neural network, BCD-Net, for PET MBIR, and demonstrate the efficacy of the trained BCD-Net using XCAT phantom data that simulates the low true coincidence count-rates with high random fractions typical for Y-90 PET patient imaging after Y-90 microsphere radioembolization. Numerical results show that the proposed BCD-Net significantly improves PET reconstruction performance compared to MBIR methods using non-trained regularizers, total variation (TV) and non-local means (NLM). BCD-Net significantly improved CNR and RMSE compared to TV (NLM) regularized MBIR. Moreover, BCD-Net successfully generalizes to data that differs from training data. Improvements were also demonstrated for the clinically relevant phantom measurement data where we used training and testing datasets having very different activity distributions and count-levels. |
Tasks | Image Reconstruction |
Published | 2019-06-05 |
URL | https://arxiv.org/abs/1906.02327v2 |
PDF | https://arxiv.org/pdf/1906.02327v2.pdf |
PWC | https://paperswithcode.com/paper/improved-low-count-quantitative-pet |
Repo | |
Framework | |