Paper Group ANR 728
Fast Approach to Build an Automatic Sentiment Annotator for Legal Domain using Transfer Learning. Quantum Machine Learning Tensor Network States. Adaptive Semantic Segmentation with a Strategic Curriculum of Proxy Labels. An Unsupervised Multivariate Time Series Kernel Approach for Identifying Patients with Surgical Site Infection from Blood Sample …
Fast Approach to Build an Automatic Sentiment Annotator for Legal Domain using Transfer Learning
Title | Fast Approach to Build an Automatic Sentiment Annotator for Legal Domain using Transfer Learning |
Authors | Viraj Gamage, Menuka Warushavithana, Nisansa de Silva, Amal Shehan Perera, Gathika Ratnayaka, Thejan Rupasinghe |
Abstract | This study proposes a novel way of identifying the sentiment of the phrases used in the legal domain. The added complexity of the language used in law, and the inability of the existing systems to accurately predict the sentiments of words in law are the main motivations behind this study. This is a transfer learning approach, which can be used for other domain adaptation tasks as well. The proposed methodology achieves an improvement of over 6% compared to the source model’s accuracy in the legal domain. |
Tasks | Domain Adaptation, Transfer Learning |
Published | 2018-10-03 |
URL | http://arxiv.org/abs/1810.01912v1 |
http://arxiv.org/pdf/1810.01912v1.pdf | |
PWC | https://paperswithcode.com/paper/fast-approach-to-build-an-automatic-sentiment |
Repo | |
Framework | |
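The paper describes transfer learning for legal-domain sentiment but gives no implementation details above; the snippet below is only a generic sketch of the fine-tuning pattern it alludes to: train a sentiment classifier on plentiful source-domain data, then continue training on a small labelled legal set. The vectorizer, model, example phrases, and labels are all illustrative assumptions, not the authors' code.

```python
# Generic domain-transfer sketch (NOT the paper's method): fit a linear
# sentiment classifier on source-domain text, then continue training
# (fine-tune) on a small set of labelled legal-domain phrases.
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import SGDClassifier

vec = HashingVectorizer(n_features=2**18, alternate_sign=False)
clf = SGDClassifier(random_state=0)

# Source domain: plentiful general-purpose sentiment data (hypothetical examples).
src_texts = ["great product", "terrible service", "really enjoyable", "awful experience"]
src_labels = [1, 0, 1, 0]
clf.partial_fit(vec.transform(src_texts), src_labels, classes=[0, 1])

# Target domain: a small amount of labelled legal-domain phrases (hypothetical examples).
legal_texts = ["the motion is granted", "the appeal is dismissed with costs"]
legal_labels = [1, 0]
for _ in range(5):                      # a few passes over the tiny target set
    clf.partial_fit(vec.transform(legal_texts), legal_labels)

print(clf.predict(vec.transform(["the petition is denied"])))
```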
Quantum Machine Learning Tensor Network States
Title | Quantum Machine Learning Tensor Network States |
Authors | Jacob Biamonte, Andrey Kardashin, Alexey Uvarov |
Abstract | Tensor network states minimize correlations to compress the classical data representing quantum states. Tensor network algorithms and similar tools, called tensor network methods, form the backbone of modern numerical methods used to simulate many-body physics and have a further range of applications in machine learning. Finding tensor network states is in general a computationally challenging task, one which quantum computers might be used to accelerate. We present a quantum algorithm which returns a classical description of a $k$-rank tensor network state satisfying an area law and approximating an eigenvector, given black-box access to a unitary matrix. Each iteration of the optimization requires $O(n\cdot k^2)$ quantum gates. |
Tasks | Quantum Machine Learning |
Published | 2018-04-06 |
URL | https://arxiv.org/abs/1804.02398v2 |
https://arxiv.org/pdf/1804.02398v2.pdf | |
PWC | https://paperswithcode.com/paper/quantum-machine-learning-matrix-product |
Repo | |
Framework | |
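As purely classical background for the tensor-network terminology above, the sketch below builds a random bond-dimension-$k$ matrix product state (a simple tensor network obeying an area law) and contracts it into the full state vector. It is an illustration of the data structure only, not the paper's quantum algorithm; sizes are arbitrary.

```python
# Build a random bond-dimension-k matrix product state (MPS) over n qubits
# and contract it into the full 2**n state vector (classical illustration only).
import numpy as np

def random_mps(n, k, seed=0):
    rng = np.random.default_rng(seed)
    # Site tensors have shape (left_bond, physical=2, right_bond);
    # the boundary bonds have dimension 1.
    dims = [1] + [k] * (n - 1) + [1]
    return [rng.normal(size=(dims[i], 2, dims[i + 1])) for i in range(n)]

def contract_mps(tensors):
    state = tensors[0]                          # shape (1, 2, k)
    for A in tensors[1:]:
        # Sum over the shared bond index, merging the physical indices.
        state = np.einsum("apb,bqc->apqc", state, A)
        state = state.reshape(state.shape[0], -1, state.shape[-1])
    vec = state.reshape(-1)
    return vec / np.linalg.norm(vec)

n, k = 6, 3
psi = contract_mps(random_mps(n, k))
print(psi.shape)          # (64,): full state vector of 6 qubits
```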
Adaptive Semantic Segmentation with a Strategic Curriculum of Proxy Labels
Title | Adaptive Semantic Segmentation with a Strategic Curriculum of Proxy Labels |
Authors | Kashyap Chitta, Jianwei Feng, Martial Hebert |
Abstract | Training deep networks for semantic segmentation requires annotation of large amounts of data, which can be time-consuming and expensive. Unfortunately, these trained networks still generalize poorly when tested in domains not consistent with the training data. In this paper, we show that by carefully presenting a mixture of labeled source domain and proxy-labeled target domain data to a network, we can achieve state-of-the-art unsupervised domain adaptation results. With our design, the network progressively learns features specific to the target domain using annotation from only the source domain. We generate proxy labels for the target domain using the network’s own predictions. Our architecture then allows selective mining of easy samples from this set of proxy labels, and hard samples from the annotated source domain. We conduct a series of experiments with the GTA5, Cityscapes and BDD100k datasets on synthetic-to-real domain adaptation and geographic domain adaptation, showing the advantages of our method over baselines and existing approaches. |
Tasks | Domain Adaptation, Semantic Segmentation, Unsupervised Domain Adaptation |
Published | 2018-11-08 |
URL | http://arxiv.org/abs/1811.03542v1 |
http://arxiv.org/pdf/1811.03542v1.pdf | |
PWC | https://paperswithcode.com/paper/adaptive-semantic-segmentation-with-a |
Repo | |
Framework | |
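The core mechanism described in the abstract, generating proxy (pseudo) labels from the network's own predictions and keeping only the confident ones, can be sketched as below. This is a generic confidence-thresholding routine under assumed tensor shapes, not the authors' released code; the curriculum and the easy/hard sample mining are omitted.

```python
# Generic proxy-label generation for segmentation (illustrative sketch):
# keep the predicted class where the network is confident, and mark the
# rest with an ignore index so it contributes no loss during self-training.
import torch

def make_proxy_labels(logits, threshold=0.9, ignore_index=255):
    """logits: (batch, num_classes, H, W) raw network outputs on target-domain images."""
    probs = torch.softmax(logits, dim=1)
    confidence, labels = probs.max(dim=1)            # both (batch, H, W)
    labels[confidence < threshold] = ignore_index    # discard uncertain pixels
    return labels

logits = torch.randn(2, 19, 64, 64)                  # e.g. 19 Cityscapes classes
proxy = make_proxy_labels(logits, threshold=0.15)    # low threshold only so this random demo keeps pixels
kept = (proxy != 255).float().mean()
print(proxy.shape, f"{kept:.2%} of pixels kept")
```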
An Unsupervised Multivariate Time Series Kernel Approach for Identifying Patients with Surgical Site Infection from Blood Samples
Title | An Unsupervised Multivariate Time Series Kernel Approach for Identifying Patients with Surgical Site Infection from Blood Samples |
Authors | Karl Øyvind Mikalsen, Cristina Soguero-Ruiz, Filippo Maria Bianchi, Arthur Revhaug, Robert Jenssen |
Abstract | A large fraction of electronic health records consists of clinical measurements collected over time, such as blood tests, which provide important information about the health status of a patient. These sequences of clinical measurements are naturally represented as time series, characterized by multiple variables and the presence of missing data, which complicate analysis. In this work, we propose a surgical site infection detection framework for patients undergoing colorectal cancer surgery that is completely unsupervised, hence alleviating the problem of getting access to labelled training data. The framework is based on powerful kernels for multivariate time series that account for missing data when computing similarities. Our approach shows superior performance compared to baselines that have to resort to imputation techniques and performs comparably to a supervised classification baseline. |
Tasks | Imputation, Time Series |
Published | 2018-03-21 |
URL | http://arxiv.org/abs/1803.07879v1 |
http://arxiv.org/pdf/1803.07879v1.pdf | |
PWC | https://paperswithcode.com/paper/an-unsupervised-multivariate-time-series |
Repo | |
Framework | |
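The framework above rests on kernels for multivariate time series that handle missing values; the specific kernel is not reproduced here. As a simplified, assumption-laden stand-in, the sketch below computes Gaussian similarities from distances evaluated only over observed (non-missing) entries and clusters patients without labels. It is not the kernel used in the paper.

```python
# Simplified stand-in (NOT the paper's kernel): masked pairwise distances over
# multivariate time series with missing values, turned into a Gaussian
# similarity matrix and clustered without any labels.
import numpy as np
from sklearn.cluster import SpectralClustering

def masked_sq_dist(a, b):
    """a, b: (timesteps, variables) arrays that may contain NaNs."""
    mask = ~np.isnan(a) & ~np.isnan(b)
    if not mask.any():
        return np.inf
    return np.mean((a[mask] - b[mask]) ** 2)

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 10, 3))                  # 40 patients, 10 timesteps, 3 blood tests (synthetic)
X[rng.random(X.shape) < 0.2] = np.nan             # 20% missing values

D = np.array([[masked_sq_dist(x, y) for y in X] for x in X])
K = np.exp(-D / np.median(D[D > 0]))              # Gaussian similarity
labels = SpectralClustering(n_clusters=2, affinity="precomputed",
                            random_state=0).fit_predict(K)
print(labels)
```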
Technical Report: When Does Machine Learning FAIL? Generalized Transferability for Evasion and Poisoning Attacks
Title | Technical Report: When Does Machine Learning FAIL? Generalized Transferability for Evasion and Poisoning Attacks |
Authors | Octavian Suciu, Radu Mărginean, Yiğitcan Kaya, Hal Daumé III, Tudor Dumitraş |
Abstract | Recent results suggest that attacks against supervised machine learning systems are quite effective, while defenses are easily bypassed by new attacks. However, the specifications for machine learning systems currently lack precise adversary definitions, and the existing attacks make diverse, potentially unrealistic assumptions about the strength of the adversary who launches them. We propose the FAIL attacker model, which describes the adversary’s knowledge and control along four dimensions. Our model allows us to consider a wide range of weaker adversaries who have limited control and incomplete knowledge of the features, learning algorithms and training instances utilized. To evaluate the utility of the FAIL model, we consider the problem of conducting targeted poisoning attacks in a realistic setting: the crafted poison samples must have clean labels, must be individually and collectively inconspicuous, and must exhibit a generalized form of transferability, defined by the FAIL model. By taking these constraints into account, we design StingRay, a targeted poisoning attack that is practical against 4 machine learning applications, which use 3 different learning algorithms, and can bypass 2 existing defenses. Conversely, we show that a prior evasion attack is less effective under generalized transferability. Such attack evaluations, under the FAIL adversary model, may also suggest promising directions for future defenses. |
Tasks | |
Published | 2018-03-19 |
URL | http://arxiv.org/abs/1803.06975v2 |
http://arxiv.org/pdf/1803.06975v2.pdf | |
PWC | https://paperswithcode.com/paper/when-does-machine-learning-fail-generalized |
Repo | |
Framework | |
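StingRay itself is not reproduced here; the following toy shows only the basic mechanics of a targeted poisoning attack, flipping the model's prediction on one chosen instance by injecting a few attacker-chosen training points, using a k-NN model and synthetic data as assumptions. The clean-label and inconspicuousness constraints that the FAIL analysis emphasizes are not modelled in this sketch.

```python
# Toy targeted poisoning (illustrative only; NOT StingRay): flip the model's
# prediction on one chosen target instance by adding a handful of poisoned
# training points near it, leaving behaviour on other points largely unchanged.
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.neighbors import KNeighborsClassifier

X, y = make_blobs(n_samples=200, centers=[[-2, 0], [2, 0]], random_state=0)
target = X[y == 0][0]                                    # victim instance, true class 0

clean = KNeighborsClassifier(n_neighbors=5).fit(X, y)
print("before:", clean.predict([target]))                # -> [0]

rng = np.random.default_rng(0)
poisons = target + 0.05 * rng.normal(size=(8, 2))        # 8 points hugging the target
Xp = np.vstack([X, poisons])
yp = np.concatenate([y, np.ones(8, dtype=int)])          # attacker's desired label

poisoned = KNeighborsClassifier(n_neighbors=5).fit(Xp, yp)
print("after: ", poisoned.predict([target]))             # -> [1]: the target's neighbours are now poisons
print("accuracy elsewhere:", poisoned.score(X, y))       # collateral damage stays small
```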
Three Dimensional Reconstruction of Botanical Trees with Simulatable Geometry
Title | Three Dimensional Reconstruction of Botanical Trees with Simulatable Geometry |
Authors | Ed Quigley, Winnie Lin, Yilin Zhu, Ronald Fedkiw |
Abstract | We tackle the challenging problem of creating full and accurate three dimensional reconstructions of botanical trees with the topological and geometric accuracy required for subsequent physical simulation, e.g. in response to wind forces. Although certain aspects of our approach would benefit from various improvements, our results exceed the state of the art especially in geometric and topological complexity and accuracy. Starting with two dimensional RGB image data acquired from cameras attached to drones, we create point clouds, textured triangle meshes, and a simulatable and skinned cylindrical articulated rigid body model. We discuss the pros and cons of each step of our pipeline, and in order to stimulate future research we make the raw and processed data from every step of the pipeline as well as the final geometric reconstructions publicly available. |
Tasks | |
Published | 2018-12-20 |
URL | http://arxiv.org/abs/1812.08849v1 |
http://arxiv.org/pdf/1812.08849v1.pdf | |
PWC | https://paperswithcode.com/paper/three-dimensional-reconstruction-of-botanical |
Repo | |
Framework | |
Deep Generative Model using Unregularized Score for Anomaly Detection with Heterogeneous Complexity
Title | Deep Generative Model using Unregularized Score for Anomaly Detection with Heterogeneous Complexity |
Authors | Takashi Matsubara, Kenta Hama, Ryosuke Tachibana, Kuniaki Uehara |
Abstract | Accurate and automated detection of anomalous samples in a natural image dataset can be accomplished with a probabilistic model for end-to-end modeling of images. Such images have heterogeneous complexity, however, and a probabilistic model overlooks simply shaped objects with small anomalies. This is because the probabilistic model assigns undesirably lower likelihoods to complexly shaped objects that are nevertheless consistent with set standards. To overcome this difficulty, we propose an unregularized score for deep generative models (DGMs), which are generative models leveraging deep neural networks. We found that the regularization terms of the DGMs considerably influence the anomaly score depending on the complexity of the samples. By removing these terms, we obtain an unregularized score, which we evaluated on a toy dataset and real-world manufacturing datasets. Empirical results demonstrate that the unregularized score is robust to the inherent complexity of samples and can be used to better detect anomalies. |
Tasks | Anomaly Detection |
Published | 2018-07-16 |
URL | http://arxiv.org/abs/1807.05800v2 |
http://arxiv.org/pdf/1807.05800v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-generative-model-using-unregularized |
Repo | |
Framework | |
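The abstract's central idea, dropping the regularization term from a deep generative model's anomaly score, can be illustrated for a VAE: the usual score is the negative ELBO (reconstruction term plus KL regularizer), and the unregularized variant keeps only the reconstruction term. Below is a minimal PyTorch sketch under assumed Gaussian-encoder and Bernoulli-decoder choices; it is schematic, not the paper's implementation.

```python
# Schematic VAE anomaly scores (illustrative): the standard score is the
# negative ELBO; the "unregularized" score drops the KL term and keeps only
# the reconstruction log-likelihood.
import torch
import torch.nn.functional as F

def vae_scores(x, encoder, decoder):
    """x: (batch, dim) in [0, 1]; encoder returns (mu, logvar); decoder returns logits."""
    mu, logvar = encoder(x)
    z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)   # reparameterization trick
    logits = decoder(z)

    # Reconstruction term: Bernoulli log-likelihood of x under the decoder.
    recon = -F.binary_cross_entropy_with_logits(logits, x, reduction="none").sum(dim=1)
    # Regularization term: KL(q(z|x) || N(0, I)).
    kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(dim=1)

    regularized_score = -(recon - kl)     # negative ELBO (higher = more anomalous)
    unregularized_score = -recon          # reconstruction-only score
    return regularized_score, unregularized_score

# Tiny demo with untrained linear maps (checks shapes only).
dim, latent = 20, 4
enc_lin = torch.nn.Linear(dim, 2 * latent)
dec_lin = torch.nn.Linear(latent, dim)
encoder = lambda x: enc_lin(x).chunk(2, dim=1)
reg, unreg = vae_scores(torch.rand(8, dim), encoder, dec_lin)
print(reg.shape, unreg.shape)
```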
Progress and Tradeoffs in Neural Language Models
Title | Progress and Tradeoffs in Neural Language Models |
Authors | Raphael Tang, Jimmy Lin |
Abstract | In recent years, we have witnessed a dramatic shift towards techniques driven by neural networks for a variety of NLP tasks. Undoubtedly, neural language models (NLMs) have reduced perplexity by impressive amounts. This progress, however, comes at a substantial cost in performance, in terms of inference latency and energy consumption, which is particularly of concern in deployments on mobile devices. This paper examines the quality-performance tradeoff of various language modeling techniques and is, to our knowledge, the first to make this observation. We compare state-of-the-art NLMs with “classic” Kneser-Ney (KN) LMs in terms of energy usage, latency, perplexity, and prediction accuracy using two standard benchmarks. On a Raspberry Pi, we find that orders-of-magnitude increases in latency and energy usage correspond to comparatively small changes in perplexity, while the difference is much less pronounced on a desktop. |
Tasks | Language Modelling |
Published | 2018-11-02 |
URL | http://arxiv.org/abs/1811.00942v1 |
http://arxiv.org/pdf/1811.00942v1.pdf | |
PWC | https://paperswithcode.com/paper/progress-and-tradeoffs-in-neural-language |
Repo | |
Framework | |
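The study above is a measurement paper; the snippet below is merely a generic latency-measurement harness of the kind such an evaluation needs (wall-clock time per prediction for any callable model). The model and queries are placeholders, and the energy measurements the paper also reports are not covered here.

```python
# Generic inference-latency harness (illustrative): time per-query next-word
# prediction for any callable language model; energy measurement is omitted.
import time
import statistics

def benchmark(predict, queries, warmup=10, repeats=100):
    for q in queries[:warmup]:            # warm up caches / lazy initialization
        predict(q)
    times = []
    for i in range(repeats):
        q = queries[i % len(queries)]
        t0 = time.perf_counter()
        predict(q)
        times.append(time.perf_counter() - t0)
    return statistics.median(times), statistics.quantiles(times, n=100)[98]

# Placeholder "model": returns a dummy next word; swap in an NLM or a KN LM.
dummy_lm = lambda prefix: "the"
median_s, p99_s = benchmark(dummy_lm, ["a quick brown", "over the lazy"] * 10)
print(f"median {median_s*1e6:.1f} us, p99 {p99_s*1e6:.1f} us per query")
```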
Learning Multilingual Embeddings for Cross-Lingual Information Retrieval in the Presence of Topically Aligned Corpora
Title | Learning Multilingual Embeddings for Cross-Lingual Information Retrieval in the Presence of Topically Aligned Corpora |
Authors | Mitodru Niyogi, Kripabandhu Ghosh, Arnab Bhattacharya |
Abstract | Cross-lingual information retrieval is a challenging task in the absence of aligned parallel corpora. In this paper, we address this problem by considering topically aligned corpora designed for evaluating an IR setup. To emphasize, we use neither sentence-aligned nor document-aligned corpora, nor any language-specific resources such as a dictionary, thesaurus, or grammar rules. Instead, we embed words into a common space and learn word correspondences directly from it. We test our proposed approach for bilingual IR on standard FIRE datasets for Bangla, Hindi and English. The proposed method is superior to the state-of-the-art method not only on IR evaluation measures but also in terms of time requirements. We extend our method successfully to the trilingual setting. |
Tasks | Information Retrieval |
Published | 2018-04-12 |
URL | http://arxiv.org/abs/1804.04475v1 |
http://arxiv.org/pdf/1804.04475v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-multilingual-embeddings-for-cross |
Repo | |
Framework | |
Dissipativity Theory for Accelerating Stochastic Variance Reduction: A Unified Analysis of SVRG and Katyusha Using Semidefinite Programs
Title | Dissipativity Theory for Accelerating Stochastic Variance Reduction: A Unified Analysis of SVRG and Katyusha Using Semidefinite Programs |
Authors | Bin Hu, Stephen Wright, Laurent Lessard |
Abstract | Techniques for reducing the variance of gradient estimates used in stochastic programming algorithms for convex finite-sum problems have received a great deal of attention in recent years. By leveraging dissipativity theory from control, we provide a new perspective on two important variance-reduction algorithms: SVRG and its direct accelerated variant Katyusha. Our perspective provides a physically intuitive understanding of the behavior of SVRG-like methods via a principle of energy conservation. The tools discussed here allow us to automate the convergence analysis of SVRG-like methods by capturing their essential properties in small semidefinite programs amenable to standard analysis and computational techniques. Our approach recovers existing convergence results for SVRG and Katyusha and generalizes the theory to alternative parameter choices. We also discuss how our approach complements the linear coupling technique. Our combination of perspectives leads to a better understanding of accelerated variance-reduced stochastic methods for finite-sum problems. |
Tasks | |
Published | 2018-06-10 |
URL | http://arxiv.org/abs/1806.03677v1 |
http://arxiv.org/pdf/1806.03677v1.pdf | |
PWC | https://paperswithcode.com/paper/dissipativity-theory-for-accelerating |
Repo | |
Framework | |
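The SDP-based analysis in the abstract is not reproduced here; as background on the algorithm being analyzed, the sketch below is a plain NumPy implementation of the SVRG update on a least-squares finite-sum problem. The problem size, step size, and epoch counts are arbitrary illustrative choices.

```python
# Plain SVRG on a least-squares finite-sum problem (background illustration for
# the algorithm analyzed above; the dissipativity/SDP analysis itself is not shown).
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 10
A = rng.normal(size=(n, d))
b = A @ rng.normal(size=d) + 0.1 * rng.normal(size=n)

def grad_i(x, i):                         # gradient of the i-th summand f_i
    return A[i] * (A[i] @ x - b[i])

def full_grad(x):                         # gradient of (1/2n) * ||Ax - b||^2
    return A.T @ (A @ x - b) / n

x = np.zeros(d)
step, epochs, m = 0.02, 30, 2 * n         # m inner iterations per outer loop
for _ in range(epochs):
    x_snapshot = x.copy()
    g_snapshot = full_grad(x_snapshot)    # full gradient at the snapshot point
    for _ in range(m):
        i = rng.integers(n)
        # Variance-reduced stochastic gradient.
        v = grad_i(x, i) - grad_i(x_snapshot, i) + g_snapshot
        x = x - step * v

x_star = np.linalg.lstsq(A, b, rcond=None)[0]
print("distance to least-squares solution:", np.linalg.norm(x - x_star))
```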
Adversarial Machine Learning And Speech Emotion Recognition: Utilizing Generative Adversarial Networks For Robustness
Title | Adversarial Machine Learning And Speech Emotion Recognition: Utilizing Generative Adversarial Networks For Robustness |
Authors | Siddique Latif, Rajib Rana, Junaid Qadir |
Abstract | Deep learning has undoubtedly offered tremendous improvements in the performance of state-of-the-art speech emotion recognition (SER) systems. However, recent research on adversarial examples poses enormous challenges to the robustness of SER systems, showing that deep neural networks are susceptible to adversarial examples crafted from small, imperceptible perturbations. In this study, we evaluate how adversarial examples can be used to attack SER systems and propose the first black-box adversarial attack on SER systems. We also explore potential defenses, including adversarial training and generative adversarial networks (GANs), to enhance robustness. Experimental evaluations suggest several ways in which adversarial examples can be used effectively to improve the robustness of SER systems, opening up opportunities for researchers to further innovate in this space. |
Tasks | Adversarial Attack, Emotion Recognition, Speech Emotion Recognition |
Published | 2018-11-28 |
URL | http://arxiv.org/abs/1811.11402v2 |
http://arxiv.org/pdf/1811.11402v2.pdf | |
PWC | https://paperswithcode.com/paper/adversarial-machine-learning-and-speech |
Repo | |
Framework | |
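Of the defenses mentioned, adversarial training is the most standard; the sketch below shows a generic FGSM-based adversarial training step in PyTorch on placeholder feature vectors. It is not the paper's SER pipeline or its GAN-based defense; the model, data shapes, and epsilon are illustrative assumptions.

```python
# Generic FGSM adversarial-training step (illustrative; not the paper's SER
# system or its GAN-based defense): perturb inputs along the gradient sign,
# then train on a mix of clean and adversarial examples.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(40, 64), nn.ReLU(), nn.Linear(64, 4))  # 4 emotion classes
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

def fgsm(x, y, epsilon=0.05):
    x_adv = x.clone().requires_grad_(True)
    loss_fn(model(x_adv), y).backward()
    return (x_adv + epsilon * x_adv.grad.sign()).detach()

x = torch.randn(32, 40)                      # placeholder acoustic features
y = torch.randint(0, 4, (32,))               # placeholder emotion labels
for _ in range(10):                          # a few adversarial-training steps
    x_adv = fgsm(x, y)
    opt.zero_grad()
    loss = 0.5 * loss_fn(model(x), y) + 0.5 * loss_fn(model(x_adv), y)
    loss.backward()
    opt.step()
print(float(loss))
```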
InverseRenderNet: Learning single image inverse rendering
Title | InverseRenderNet: Learning single image inverse rendering |
Authors | Ye Yu, William A. P. Smith |
Abstract | We show how to train a fully convolutional neural network to perform inverse rendering from a single, uncontrolled image. The network takes an RGB image as input and regresses albedo and normal maps, from which we compute lighting coefficients. Our network is trained using large uncontrolled image collections without ground truth. By incorporating a differentiable renderer, our network can learn from self-supervision. Since the problem is ill-posed, we introduce additional supervision: (1) we learn a statistical natural illumination prior; (2) our key insight is to perform offline multiview stereo (MVS) on images containing rich illumination variation. From the MVS pose and depth maps, we can cross project between overlapping views such that Siamese training can be used to ensure consistent estimation of photometric invariants. MVS depth also provides direct coarse supervision for normal map estimation. We believe this is the first attempt to use MVS supervision for learning inverse rendering. |
Tasks | |
Published | 2018-11-29 |
URL | http://arxiv.org/abs/1811.12328v1 |
http://arxiv.org/pdf/1811.12328v1.pdf | |
PWC | https://paperswithcode.com/paper/inverserendernet-learning-single-image |
Repo | |
Framework | |
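The rendering model implied above (albedo and normals combined with low-order lighting coefficients) is commonly a Lambertian model with second-order spherical harmonics; the NumPy sketch below shows that standard shading equation as an assumed illustration of a differentiable renderer's forward pass, not as the paper's network.

```python
# Lambertian shading with 2nd-order spherical-harmonic (SH) lighting: a common
# forward rendering model for single-image inverse rendering (shapes and the
# per-channel lighting layout here are illustrative assumptions).
import numpy as np

def sh_basis(normals):
    """normals: (..., 3) unit vectors -> (..., 9) SH basis values."""
    x, y, z = normals[..., 0], normals[..., 1], normals[..., 2]
    return np.stack([
        0.282095 * np.ones_like(x),
        0.488603 * y, 0.488603 * z, 0.488603 * x,
        1.092548 * x * y, 1.092548 * y * z,
        0.315392 * (3.0 * z**2 - 1.0),
        1.092548 * x * z, 0.546274 * (x**2 - y**2),
    ], axis=-1)

def render(albedo, normals, lighting):
    """albedo: (H, W, 3); normals: (H, W, 3) unit; lighting: (9, 3) SH coefficients per channel."""
    shading = sh_basis(normals) @ lighting          # (H, W, 3)
    return albedo * shading

H = W = 4
albedo = np.full((H, W, 3), 0.5)
normals = np.zeros((H, W, 3)); normals[..., 2] = 1.0   # flat surface facing the camera
lighting = np.zeros((9, 3)); lighting[0] = 1.0          # ambient-only illumination
print(render(albedo, normals, lighting)[0, 0])          # constant shaded colour
```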
Disentangling Propagation and Generation for Video Prediction
Title | Disentangling Propagation and Generation for Video Prediction |
Authors | Hang Gao, Huazhe Xu, Qi-Zhi Cai, Ruth Wang, Fisher Yu, Trevor Darrell |
Abstract | A dynamic scene has two types of elements: those that move fluidly and can be predicted from previous frames, and those which are disoccluded (exposed) and cannot be extrapolated. Prior approaches to video prediction typically learn either to warp or to hallucinate future pixels, but not both. In this paper, we describe a computational model for high-fidelity video prediction which disentangles motion-specific propagation from motion-agnostic generation. We introduce a confidence-aware warping operator which gates the output of pixel predictions from a flow predictor for non-occluded regions and from a context encoder for occluded regions. Moreover, in contrast to prior works where confidence is jointly learned with flow and appearance using a single network, we compute confidence after a warping step, and employ a separate network to inpaint exposed regions. Empirical results on both synthetic and real datasets show that our disentangling approach provides better occlusion maps and produces both sharper and more realistic predictions compared to strong baselines. |
Tasks | Predict Future Video Frames, Video Prediction |
Published | 2018-12-02 |
URL | https://arxiv.org/abs/1812.00452v2 |
https://arxiv.org/pdf/1812.00452v2.pdf | |
PWC | https://paperswithcode.com/paper/disentangling-propagation-and-generation-for |
Repo | |
Framework | |
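The confidence-aware warping operator described above can be sketched as: warp the previous frame with predicted flow, then blend it with a generated frame using a per-pixel confidence map. The PyTorch snippet below shows that gating with grid_sample-based backward warping; the shapes, flow convention, and the networks producing flow, confidence, and the generated frame are assumptions for illustration.

```python
# Confidence-gated blend of flow-warped and generated pixels (illustrative
# sketch of the gating idea; the component networks are replaced by random tensors).
import torch
import torch.nn.functional as F

def warp(frame, flow):
    """frame: (B, C, H, W); flow: (B, 2, H, W) in pixels, backward flow."""
    B, _, H, W = frame.shape
    ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    base = torch.stack([xs, ys], dim=0).float().unsqueeze(0).expand(B, -1, -1, -1)
    coords = base + flow                                  # sample locations in pixels
    # Normalize to [-1, 1] for grid_sample; grid layout is (B, H, W, 2) as (x, y).
    grid_x = 2.0 * coords[:, 0] / (W - 1) - 1.0
    grid_y = 2.0 * coords[:, 1] / (H - 1) - 1.0
    grid = torch.stack([grid_x, grid_y], dim=-1)
    return F.grid_sample(frame, grid, align_corners=True)

B, C, H, W = 1, 3, 32, 32
prev_frame = torch.rand(B, C, H, W)
flow = torch.randn(B, 2, H, W)            # would come from a flow predictor
generated = torch.rand(B, C, H, W)        # would come from a context encoder / inpainter
confidence = torch.rand(B, 1, H, W)       # computed after warping in the paper's design

prediction = confidence * warp(prev_frame, flow) + (1.0 - confidence) * generated
print(prediction.shape)
```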
Interval Estimation of Individual-Level Causal Effects Under Unobserved Confounding
Title | Interval Estimation of Individual-Level Causal Effects Under Unobserved Confounding |
Authors | Nathan Kallus, Xiaojie Mao, Angela Zhou |
Abstract | We study the problem of learning conditional average treatment effects (CATE) from observational data with unobserved confounders. The CATE function maps baseline covariates to individual causal effect predictions and is key for personalized assessments. Recent work has focused on how to learn CATE under unconfoundedness, i.e., when there are no unobserved confounders. Since CATE may not be identified when unconfoundedness is violated, we develop a functional interval estimator that predicts bounds on the individual causal effects under realistic violations of unconfoundedness. Our estimator takes the form of a weighted kernel estimator with weights that vary adversarially. We prove that our estimator is sharp in that it converges exactly to the tightest bounds possible on CATE when there may be unobserved confounders. Further, we study personalized decision rules derived from our estimator and prove that they achieve optimal minimax regret asymptotically. We assess our approach in a simulation study as well as demonstrate its application in the case of hormone replacement therapy by comparing conclusions from a real observational study and clinical trial. |
Tasks | |
Published | 2018-10-05 |
URL | http://arxiv.org/abs/1810.02894v1 |
http://arxiv.org/pdf/1810.02894v1.pdf | |
PWC | https://paperswithcode.com/paper/interval-estimation-of-individual-level |
Repo | |
Framework | |
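As a simplified numerical illustration of the kind of interval the abstract describes, the snippet below computes sharp upper and lower bounds on a weighted outcome mean when each weight is only known to lie in a box (for example, nominal inverse-propensity weights perturbed by a sensitivity factor gamma). The sort-and-threshold solution is standard for this one-dimensional subproblem; it is not the paper's kernel-based functional interval estimator.

```python
# Sharp bounds on a weighted outcome mean when each weight may vary in a box
# [w/gamma, w*gamma] (a toy sensitivity-analysis illustration, NOT the paper's
# adversarially weighted kernel estimator).
import numpy as np

def weighted_mean_bounds(y, w, gamma):
    """Sharp (lower, upper) bounds on sum(w'*y)/sum(w') over w' in [w/gamma, w*gamma]."""
    order = np.argsort(y)
    y, lo, hi = y[order], (w / gamma)[order], (w * gamma)[order]
    uppers, lowers = [], []
    for k in range(len(y) + 1):
        w_up = np.concatenate([lo[:k], hi[k:]])   # large weights on large outcomes
        w_dn = np.concatenate([hi[:k], lo[k:]])   # large weights on small outcomes
        uppers.append(np.sum(w_up * y) / np.sum(w_up))
        lowers.append(np.sum(w_dn * y) / np.sum(w_dn))
    return min(lowers), max(uppers)

rng = np.random.default_rng(0)
y = rng.normal(size=50)                     # outcomes of, say, the treated units
w = 1.0 / rng.uniform(0.2, 0.8, size=50)    # nominal inverse-propensity weights
for gamma in (1.0, 1.5, 2.0):               # gamma = 1 recovers a point estimate
    print(gamma, weighted_mean_bounds(y, w, gamma))
```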
High-Accuracy Inference in Neuromorphic Circuits using Hardware-Aware Training
Title | High-Accuracy Inference in Neuromorphic Circuits using Hardware-Aware Training |
Authors | Borna Obradovic, Titash Rakshit, Ryan Hatcher, Jorge A. Kittl, Mark S. Rodder |
Abstract | Neuromorphic Multiply-And-Accumulate (MAC) circuits utilizing synaptic weight elements based on SRAM or novel Non-Volatile Memories (NVMs) provide a promising approach for highly efficient hardware representations of neural networks. NVM density and robustness requirements suggest that off-line training is the right choice for “edge” devices, since the requirements for synapse precision are much less stringent. However, off-line training using ideal mathematical weights and activations can result in significant loss of inference accuracy when applied to non-ideal hardware. Non-idealities such as multi-bit quantization of weights and activations, non-linearity of weights, finite max/min ratios of NVM elements, and asymmetry of positive and negative weight components all result in degraded inference accuracy. In this work, it is demonstrated that non-ideal Multi-Layer Perceptron (MLP) architectures using low bitwidth weights and activations can be trained with negligible loss of inference accuracy relative to their Floating Point-trained counterparts using a proposed off-line, continuously differentiable HW-aware training algorithm. The proposed algorithm is applicable to a wide range of hardware models, and uses only standard neural network training methods. The algorithm is demonstrated on the MNIST and EMNIST datasets, using standard MLPs. |
Tasks | Quantization |
Published | 2018-09-13 |
URL | http://arxiv.org/abs/1809.04982v1 |
http://arxiv.org/pdf/1809.04982v1.pdf | |
PWC | https://paperswithcode.com/paper/high-accuracy-inference-in-neuromorphic |
Repo | |
Framework | |
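The abstract's "continuously differentiable HW-aware training" is not specified above; as a generic illustration of training through low-bitwidth weights, the sketch below uses a smooth tanh/sigmoid-style soft quantizer whose temperature controls how closely it approximates a hard staircase, so gradients remain well defined. The specific soft quantizer and its parameters are assumptions, not the authors' algorithm.

```python
# Smooth, differentiable soft quantizer (generic illustration; NOT the paper's
# specific HW-aware algorithm): a sum of steep sigmoids approximates the hard
# rounding staircase while keeping gradients well defined for training.
import torch

def soft_quantize(w, n_bits=3, temperature=20.0):
    """Map weights in [-1, 1] to a smooth approximation of 2**n_bits levels."""
    levels = 2 ** n_bits
    x = (w.clamp(-1.0, 1.0) + 1.0) / 2.0 * (levels - 1)       # rescale to [0, levels-1]
    steps = torch.arange(0.5, levels - 1, 1.0)                # staircase thresholds
    soft = torch.sigmoid(temperature * (x.unsqueeze(-1) - steps)).sum(dim=-1)
    return soft / (levels - 1) * 2.0 - 1.0                    # back to [-1, 1]

w = torch.linspace(-1, 1, 9, requires_grad=True)
q = soft_quantize(w)
q.sum().backward()                      # gradients flow through the soft staircase
print(q.detach())
print(w.grad)                           # small but nonzero, unlike a hard round()
```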