Paper Group ANR 143
A total variation based regularizer promoting piecewise-Lipschitz reconstructions
Title | A total variation based regularizer promoting piecewise-Lipschitz reconstructions |
Authors | Martin Burger, Yury Korolev, Carola-Bibiane Schönlieb, Christiane Stollenwerk |
Abstract | We introduce a new regularizer in the total variation family that promotes reconstructions with a given Lipschitz constant (which can also vary spatially). We prove regularizing properties of this functional and investigate its connections to total variation and infimal convolution type regularizers TVLp and, in particular, establish topological equivalence. Our numerical experiments show that the proposed regularizer can achieve similar performance as total generalized variation while having the advantage of a very intuitive interpretation of its free parameter, which is just a local estimate of the norm of the gradient. It also provides a natural approach to spatially adaptive regularization. |
Tasks | |
Published | 2019-03-12 |
URL | http://arxiv.org/abs/1903.05079v1 |
http://arxiv.org/pdf/1903.05079v1.pdf | |
PWC | https://paperswithcode.com/paper/a-total-variation-based-regularizer-promoting |
Repo | |
Framework | |
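As a point of reference for the abstract above, classical total variation and one illustrative way to bias it toward a prescribed local Lipschitz bound can be written as follows. The second functional is a sketch of the general idea only, not necessarily the paper's exact definition; the spatially varying bound gamma plays the role of the "free parameter" the abstract mentions.

```latex
% Classical (isotropic) total variation of u on a domain \Omega:
TV(u) = \int_\Omega |\nabla u(x)| \,\mathrm{d}x .

% One illustrative way to penalize only the part of the gradient that
% exceeds a local Lipschitz bound \gamma(x) (assumed form, for intuition):
R_\gamma(u) = \int_\Omega \bigl( |\nabla u(x)| - \gamma(x) \bigr)_+ \,\mathrm{d}x ,
\qquad (t)_+ := \max\{t,\, 0\} .
```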
Language models and Automated Essay Scoring
Title | Language models and Automated Essay Scoring |
Authors | Pedro Uria Rodriguez, Amir Jafari, Christopher M. Ormerod |
Abstract | In this paper, we present a new comparative study on automatic essay scoring (AES). The current state-of-the-art natural language processing (NLP) neural network architectures are used in this work to achieve above human-level accuracy on the publicly available Kaggle AES dataset. We compare two powerful language models, BERT and XLNet, and describe all the layers and network architectures in these models. We elucidate the network architectures of BERT and XLNet using clear notation and diagrams and explain the advantages of transformer architectures over traditional recurrent neural network architectures. Linear algebra notation is used to clarify the functions of transformers and attention mechanisms. We compare the results with more traditional methods, such as bag of words (BOW) and long short term memory (LSTM) networks. |
Tasks | |
Published | 2019-09-18 |
URL | https://arxiv.org/abs/1909.09482v1 |
https://arxiv.org/pdf/1909.09482v1.pdf | |
PWC | https://paperswithcode.com/paper/language-models-and-automated-essay-scoring |
Repo | |
Framework | |
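The scaled dot-product attention that the abstract describes in linear algebra notation can be stated very compactly. Below is a minimal NumPy sketch of that single operation (toy shapes and random inputs; this is the core mechanism, not BERT or XLNet themselves):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V -- the core operation of transformers."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)               # (n_q, n_k) similarity matrix
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                            # weighted average of values

# toy self-attention: 3 tokens, d_k = 4
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 4))
out = scaled_dot_product_attention(X, X, X)
print(out.shape)  # (3, 4)
```

With all-zero queries and keys the attention weights become uniform, so every output row is simply the mean of the value rows.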
PAN: Projective Adversarial Network for Medical Image Segmentation
Title | PAN: Projective Adversarial Network for Medical Image Segmentation |
Authors | Naji Khosravan, Aliasghar Mortazi, Michael Wallace, Ulas Bagci |
Abstract | Adversarial learning has been proven to be effective for capturing long-range and high-level label consistencies in semantic segmentation. Unique to medical imaging, capturing 3D semantics in an effective yet computationally efficient way remains an open problem. In this study, we address this computational burden by proposing a novel projective adversarial network, called PAN, which incorporates high-level 3D information through 2D projections. Furthermore, we introduce an attention module into our framework that enables selective integration of global information directly from our segmentor into our adversarial network. As a clinical application we chose pancreas segmentation from CT scans. Our proposed framework achieved state-of-the-art performance without adding to the complexity of the segmentor. |
Tasks | Medical Image Segmentation, Pancreas Segmentation, Semantic Segmentation |
Published | 2019-06-11 |
URL | https://arxiv.org/abs/1906.04378v1 |
https://arxiv.org/pdf/1906.04378v1.pdf | |
PWC | https://paperswithcode.com/paper/pan-projective-adversarial-network-for |
Repo | |
Framework | |
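The key cost-saving idea, incorporating 3D information through 2D projections, can be illustrated with plain NumPy. The toy volume and the choice of max-projections below are ours, for illustration, not the paper's exact operator:

```python
import numpy as np

# A 3D binary "segmentation" volume: depth x height x width.
vol = np.zeros((8, 16, 16), dtype=np.float32)
vol[2:6, 4:12, 5:11] = 1.0   # a block-shaped stand-in for an organ

# 2D projections along each axis -- the kind of cheap 3D summary
# a projection-based adversarial critic could consume instead of the
# full volume.
axial    = vol.max(axis=0)   # (16, 16)
coronal  = vol.max(axis=1)   # (8, 16)
sagittal = vol.max(axis=2)   # (8, 16)

print(axial.shape, coronal.shape, sagittal.shape)
```

Three 2D images replace one 3D volume, which is where the computational saving comes from.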
Universality and individuality in neural dynamics across large populations of recurrent networks
Title | Universality and individuality in neural dynamics across large populations of recurrent networks |
Authors | Niru Maheswaranathan, Alex H. Williams, Matthew D. Golub, Surya Ganguli, David Sussillo |
Abstract | Task-based modeling with recurrent neural networks (RNNs) has emerged as a popular way to infer the computational function of different brain regions. These models are quantitatively assessed by comparing the low-dimensional neural representations of the model with the brain, for example using canonical correlation analysis (CCA). However, the nature of the detailed neurobiological inferences one can draw from such efforts remains elusive. For example, to what extent does training neural networks to solve common tasks uniquely determine the network dynamics, independent of modeling architectural choices? Or alternatively, are the learned dynamics highly sensitive to different model choices? Knowing the answer to these questions has strong implications for whether and how we should use task-based RNN modeling to understand brain dynamics. To address these foundational questions, we study populations of thousands of networks, with commonly used RNN architectures, trained to solve neuroscientifically motivated tasks and characterize their nonlinear dynamics. We find the geometry of the RNN representations can be highly sensitive to different network architectures, yielding a cautionary tale for measures of similarity that rely on representational geometry, such as CCA. Moreover, we find that while the geometry of neural dynamics can vary greatly across architectures, the underlying computational scaffold—the topological structure of fixed points, transitions between them, limit cycles, and linearized dynamics—often appears universal across all architectures. |
Tasks | |
Published | 2019-07-19 |
URL | https://arxiv.org/abs/1907.08549v2 |
https://arxiv.org/pdf/1907.08549v2.pdf | |
PWC | https://paperswithcode.com/paper/universality-and-individuality-in-neural |
Repo | |
Framework | |
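CCA, the similarity measure the abstract cautions about, reduces to the singular values of a product of orthonormal bases for the two representation spaces. Here is a minimal NumPy sketch; the QR-based whitening and the toy data are illustrative assumptions, not the paper's pipeline:

```python
import numpy as np

def cca_similarity(X, Y):
    """Mean canonical correlation between two representation matrices
    (n_samples x n_features), via SVD of the whitened cross-covariance."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    Qx, _ = np.linalg.qr(X)   # orthonormal basis for the column space of X
    Qy, _ = np.linalg.qr(Y)
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)  # canonical correlations
    return s.mean()

rng = np.random.default_rng(1)
A = rng.normal(size=(100, 5))
B = A @ rng.normal(size=(5, 5)) + 0.01 * rng.normal(size=(100, 5))  # linear map of A
C = rng.normal(size=(100, 5))                                       # unrelated

print(cca_similarity(A, B))  # near 1: same geometry up to a linear map
print(cca_similarity(A, C))  # much lower
```

Because CCA is invariant to invertible linear maps, it scores `A` and `B` as nearly identical; the paper's point is that such geometry-based scores can still disagree sharply across architectures even when the underlying fixed-point structure matches.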
Grape detection, segmentation and tracking using deep neural networks and three-dimensional association
Title | Grape detection, segmentation and tracking using deep neural networks and three-dimensional association |
Authors | Thiago T. Santos, Leonardo L. de Souza, Andreza A. dos Santos, Sandra Avila |
Abstract | Agricultural applications such as yield prediction, precision agriculture and automated harvesting need systems able to infer the crop state from low-cost sensing devices. Proximal sensing using affordable cameras combined with computer vision has emerged as a promising alternative, strengthened by the advent of convolutional neural networks (CNNs) for challenging pattern recognition problems in natural images. In fruit growing monitoring and automation, a fundamental problem is the detection, segmentation and counting of individual fruits in orchards. Here we show that for wine grapes, a crop presenting large variability in shape, color, size and compactness, grape clusters can be successfully detected, segmented and tracked using state-of-the-art CNNs. On a test set containing 408 grape clusters from images taken in a trellis-based vineyard, we reach an F1-score of up to 0.91 for instance segmentation, a fine separation of each cluster from other structures in the image that allows a more accurate assessment of fruit size and shape. We also show how clusters can be identified and tracked along video sequences recording orchard rows. We also present a public dataset containing grape clusters properly annotated in 300 images and a novel annotation methodology for segmentation of complex objects in natural images. The presented pipeline for annotation, training, evaluation and tracking of agricultural patterns in images can be replicated for different crops and production systems, and can be employed in the development of sensing components for several agricultural and environmental applications. |
Tasks | Instance Segmentation, Semantic Segmentation |
Published | 2019-07-26 |
URL | https://arxiv.org/abs/1907.11819v3 |
https://arxiv.org/pdf/1907.11819v3.pdf | |
PWC | https://paperswithcode.com/paper/grape-detection-segmentation-and-tracking |
Repo | |
Framework | |
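The reported F1-score is the harmonic mean of precision and recall over detected clusters. The sketch below shows the computation; the true/false-positive counts are hypothetical, chosen only to illustrate a score near 0.91:

```python
def f1_score(tp, fp, fn):
    """F1 = harmonic mean of precision and recall, as reported for
    instance segmentation of grape clusters."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# hypothetical counts over a 408-cluster test set (illustrative only)
print(f1_score(tp=370, fp=35, fn=38))  # roughly 0.91
```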
Empirical Bayesian Mixture Models for Medical Image Translation
Title | Empirical Bayesian Mixture Models for Medical Image Translation |
Authors | Mikael Brudfors, John Ashburner, Parashkev Nachev, Yael Balbastre |
Abstract | Automatically generating one medical imaging modality from another is known as medical image translation, and has numerous interesting applications. This paper presents an interpretable generative modelling approach to medical image translation. By allowing a common model for group-wise normalisation and segmentation of brain scans to handle missing data, the model allows for predicting entirely missing modalities from one, or a few, MR contrasts. Furthermore, the model can be trained on a fairly small number of subjects. The proposed model is validated on three clinically relevant scenarios. Results appear promising and show that a principled, probabilistic model of the relationship between multi-channel signal intensities can be used to infer missing modalities – both MR contrasts and CT images. |
Tasks | |
Published | 2019-08-16 |
URL | https://arxiv.org/abs/1908.05926v1 |
https://arxiv.org/pdf/1908.05926v1.pdf | |
PWC | https://paperswithcode.com/paper/empirical-bayesian-mixture-models-for-medical |
Repo | |
Framework | |
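The core inference step, predicting a missing channel from observed ones under a probabilistic intensity model, can be sketched with a single bivariate Gaussian. The paper uses a richer mixture model; the means and covariance below are invented for illustration:

```python
import numpy as np

# Joint Gaussian over two "modalities" (say, one MR contrast and CT);
# a single Gaussian component stands in for the paper's mixture model.
mu = np.array([2.0, 10.0])
Sigma = np.array([[1.0, 0.8],
                  [0.8, 2.0]])

def predict_missing(x1):
    """Conditional mean E[x2 | x1] = mu2 + S21 / S11 * (x1 - mu1)."""
    return mu[1] + Sigma[1, 0] / Sigma[0, 0] * (x1 - mu[0])

print(predict_missing(2.0))  # at the mean of channel 1 -> mean of channel 2
print(predict_missing(3.0))  # 10.8: shifted along the regression line
```

A mixture model applies this same conditional-Gaussian step per component and averages with the posterior component responsibilities.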
Towards Privacy and Security of Deep Learning Systems: A Survey
Title | Towards Privacy and Security of Deep Learning Systems: A Survey |
Authors | Yingzhe He, Guozhu Meng, Kai Chen, Xingbo Hu, Jinwen He |
Abstract | Deep learning has gained tremendous success and great popularity in the past few years. However, recent research has found that it suffers from several inherent weaknesses that can threaten the security and privacy of its stakeholders. Deep learning’s wide use further magnifies the resulting consequences. To this end, a great deal of research has been conducted with the purpose of exhaustively identifying intrinsic weaknesses and subsequently proposing feasible mitigations. Yet little is clear about how these weaknesses arise and how effective these attack approaches are in assaulting deep learning. In order to unveil the security weaknesses and aid in the development of robust deep learning systems, we undertake a comprehensive investigation of attacks on deep learning and extensively evaluate these attacks from multiple views. In particular, we focus on four types of attacks associated with the security and privacy of deep learning: model extraction attacks, model inversion attacks, poisoning attacks and adversarial attacks. For each type of attack, we describe its essential workflow as well as the adversary’s capabilities and attack goals. We devise many pivotal metrics for evaluating the attack approaches, by which we perform a quantitative and qualitative analysis. From the analysis, we identify significant and indispensable factors in an attack vector, e.g., how to reduce queries to target models, or what distance metric to use for measuring perturbations. We spotlight 17 findings covering these approaches’ merits and demerits, success probability, deployment complexity and prospects. Moreover, we discuss other potential security weaknesses and possible mitigations that can inspire relevant researchers in this area. |
Tasks | Adversarial Attack |
Published | 2019-11-28 |
URL | https://arxiv.org/abs/1911.12562v1 |
https://arxiv.org/pdf/1911.12562v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-privacy-and-security-of-deep-learning |
Repo | |
Framework | |
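As a concrete instance of the "adversarial attack" category the survey covers, here is a minimal fast-gradient-sign sketch against a logistic classifier. The model, weights and epsilon are illustrative, not taken from the survey:

```python
import numpy as np

def fgsm(x, w, b, y, eps):
    """Fast gradient sign attack on a logistic classifier:
    perturb x by eps * sign(grad_x of the cross-entropy loss)."""
    z = w @ x + b
    p = 1.0 / (1.0 + np.exp(-z))   # sigmoid probability of class 1
    grad_x = (p - y) * w           # d(cross-entropy)/dx for logistic regression
    return x + eps * np.sign(grad_x)

w = np.array([1.0, -2.0]); b = 0.0
x = np.array([0.5, -0.5]); y = 1.0      # correctly classified: logit = 1.5 > 0
x_adv = fgsm(x, w, b, y, eps=1.0)
print(w @ x + b, w @ x_adv + b)         # the logit flips sign after the attack
```

The survey's query-reduction and perturbation-distance factors correspond here to how `grad_x` is obtained (white-box vs. estimated) and to the norm in which `eps` bounds the perturbation.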
Multitask Hopfield Networks
Title | Multitask Hopfield Networks |
Authors | Marco Frasca, Giuliano Grossi, Giorgio Valentini |
Abstract | Multitask algorithms typically use task similarity information as a bias to speed up and improve the performance of learning processes. Tasks are learned jointly, sharing information across them, in order to construct models more accurate than those learned separately over single tasks. In this contribution, we present the first multitask model, to our knowledge, based on Hopfield Networks (HNs), named HoMTask. We show that by appropriately building a unique HN embedding all tasks, a more robust and effective classification model can be learned. HoMTask is a transductive semi-supervised parametric HN that minimizes an energy function extended to all nodes and all tasks under study. We provide theoretical evidence that the optimal parameters automatically estimated by HoMTask make the model coherent with the prior knowledge (connection weights and node labels). The convergence properties of HNs are preserved, and the fixed point reached by the network dynamics gives rise to the prediction of unlabeled nodes. The proposed model improves the classification abilities of single-task HNs on a preliminary benchmark comparison, and achieves competitive performance with state-of-the-art semi-supervised graph-based algorithms. |
Tasks | |
Published | 2019-04-10 |
URL | http://arxiv.org/abs/1904.05098v1 |
http://arxiv.org/pdf/1904.05098v1.pdf | |
PWC | https://paperswithcode.com/paper/multitask-hopfield-networks |
Repo | |
Framework | |
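For background, the classical single-task Hopfield dynamics that HoMTask extends can be sketched in a few lines. This is the standard textbook model with Hebbian storage of one pattern, not HoMTask itself:

```python
import numpy as np

def hopfield_recall(W, state, steps=10):
    """Synchronous Hopfield updates: state <- sign(W @ state).
    Each update is guaranteed not to increase the network energy."""
    for _ in range(steps):
        state = np.where(W @ state >= 0, 1, -1)
    return state

# Store one pattern with the Hebbian rule W = p p^T (zero diagonal).
p = np.array([1, -1, 1, 1, -1])
W = np.outer(p, p).astype(float)
np.fill_diagonal(W, 0)

noisy = p.copy()
noisy[0] *= -1                       # corrupt one bit
print(hopfield_recall(W, noisy))     # converges back to the stored pattern
```

HoMTask keeps this fixed-point behaviour but couples several such networks, one per task, inside one energy function.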
SGD Converges to Global Minimum in Deep Learning via Star-convex Path
Title | SGD Converges to Global Minimum in Deep Learning via Star-convex Path |
Authors | Yi Zhou, Junjie Yang, Huishuai Zhang, Yingbin Liang, Vahid Tarokh |
Abstract | Stochastic gradient descent (SGD) has been found to be surprisingly effective in training a variety of deep neural networks. However, there is still a lack of understanding of how and why SGD can train these complex networks towards a global minimum. In this study, we establish the convergence of SGD to a global minimum for nonconvex optimization problems that are commonly encountered in neural network training. Our argument exploits the following two important properties: 1) the training loss can achieve zero value (approximately), which has been widely observed in deep learning; 2) SGD follows a star-convex path, which is verified by various experiments in this paper. In such a context, our analysis shows that SGD, although long considered a randomized algorithm, converges in an intrinsically deterministic manner to a global minimum. |
Tasks | |
Published | 2019-01-02 |
URL | http://arxiv.org/abs/1901.00451v1 |
http://arxiv.org/pdf/1901.00451v1.pdf | |
PWC | https://paperswithcode.com/paper/sgd-converges-to-global-minimum-in-deep |
Repo | |
Framework | |
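Star-convexity with respect to a minimizer x* can be checked numerically straight from its definition, f(l*x* + (1-l)*x) <= l*f(x*) + (1-l)*f(x) for l in [0, 1]. The sketch below verifies it for a toy function and shows a counterexample; both functions are ours, not the paper's training losses:

```python
import numpy as np

def is_star_convex(f, x, x_star, lambdas):
    """Check f(l*x_star + (1-l)*x) <= l*f(x_star) + (1-l)*f(x) for each l."""
    return all(
        f(l * x_star + (1 - l) * x) <= l * f(x_star) + (1 - l) * f(x) + 1e-12
        for l in lambdas
    )

lambdas = np.linspace(0.0, 1.0, 101)
x, x_star = 3.0, 0.0

print(is_star_convex(lambda v: v ** 2, x, x_star, lambdas))          # True
print(is_star_convex(lambda v: abs(v) ** 0.5, x, x_star, lambdas))   # False
```

The paper's empirical claim is that the iterates SGD actually visits satisfy this inequality along the path to the minimum, even though the loss surface as a whole is nonconvex.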
Seven Myths in Machine Learning Research
Title | Seven Myths in Machine Learning Research |
Authors | Oscar Chang, Hod Lipson |
Abstract | We present seven myths commonly believed to be true in machine learning research, circa Feb 2019. This is an archival copy of the blog post at https://crazyoscarchang.github.io/2019/02/16/seven-myths-in-machine-learning-research/ Myth 1: TensorFlow is a Tensor manipulation library Myth 2: Image datasets are representative of real images found in the wild Myth 3: Machine Learning researchers do not use the test set for validation Myth 4: Every datapoint is used in training a neural network Myth 5: We need (batch) normalization to train very deep residual networks Myth 6: Attention > Convolution Myth 7: Saliency maps are robust ways to interpret neural networks |
Tasks | |
Published | 2019-02-18 |
URL | http://arxiv.org/abs/1902.06789v2 |
http://arxiv.org/pdf/1902.06789v2.pdf | |
PWC | https://paperswithcode.com/paper/seven-myths-in-machine-learning-research |
Repo | |
Framework | |
Discrete Argument Representation Learning for Interactive Argument Pair Identification
Title | Discrete Argument Representation Learning for Interactive Argument Pair Identification |
Authors | Lu Ji, Zhongyu Wei, Jing Li, Qi Zhang, Xuanjing Huang |
Abstract | In this paper, we focus on extracting interactive argument pairs from two posts with opposite stances to a certain topic. Considering opinions are exchanged from different perspectives of the discussing topic, we study the discrete representations for arguments to capture varying aspects in argumentation languages (e.g., the debate focus and the participant behavior). Moreover, we utilize hierarchical structure to model post-wise information incorporating contextual knowledge. Experimental results on the large-scale dataset collected from CMV show that our proposed framework can significantly outperform the competitive baselines. Further analyses reveal why our model yields superior performance and prove the usefulness of our learned representations. |
Tasks | Representation Learning |
Published | 2019-11-05 |
URL | https://arxiv.org/abs/1911.01621v1 |
https://arxiv.org/pdf/1911.01621v1.pdf | |
PWC | https://paperswithcode.com/paper/discrete-argument-representation-learning-for |
Repo | |
Framework | |
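The "discrete representations" in the abstract are, at their core, assignments of continuous vectors to a finite codebook. Below is a minimal vector-quantization sketch; the codebook and vectors are invented for illustration and stand in for the paper's learned argument aspects:

```python
import numpy as np

def quantize(vectors, codebook):
    """Map each continuous vector to the index of its nearest codebook
    entry -- the basic operation behind VQ-style discrete representations."""
    # pairwise squared distances, shape (n_vectors, n_codes)
    d = ((vectors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    return d.argmin(axis=1)

codebook = np.array([[0.0, 0.0],   # e.g. "debate focus" aspect
                     [1.0, 1.0],   # e.g. "participant behavior" aspect
                     [0.0, 1.0]])
reps = np.array([[0.1, -0.1], [0.9, 1.2], [0.2, 0.8]])
print(quantize(reps, codebook))    # [0 1 2]
```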
Time-aware Gradient Attack on Dynamic Network Link Prediction
Title | Time-aware Gradient Attack on Dynamic Network Link Prediction |
Authors | Jinyin Chen, Jian Zhang, Zhi Chen, Min Du, Feifei Li, Qi Xuan |
Abstract | In network link prediction, it is possible to hide a target link from being predicted with a small perturbation on network structure. This observation may be exploited in many real world scenarios, for example, to preserve privacy, or to exploit financial security. There have been many recent studies to generate adversarial examples to mislead deep learning models on graph data. However, none of the previous work has considered the dynamic nature of real-world systems. In this work, we present the first study of adversarial attack on dynamic network link prediction (DNLP). The proposed attack method, namely time-aware gradient attack (TGA), utilizes the gradient information generated by deep dynamic network embedding (DDNE) across different snapshots to rewire a few links, so as to make DDNE fail to predict target links. We implement TGA in two ways: one is based on traversal search, namely TGA-Tra; and the other is simplified with greedy search for efficiency, namely TGA-Gre. We conduct comprehensive experiments which show the outstanding performance of TGA in attacking DNLP algorithms. |
Tasks | Adversarial Attack, Link Prediction, Network Embedding |
Published | 2019-11-24 |
URL | https://arxiv.org/abs/1911.10561v1 |
https://arxiv.org/pdf/1911.10561v1.pdf | |
PWC | https://paperswithcode.com/paper/time-aware-gradient-attack-on-dynamic-network |
Repo | |
Framework | |
Learning to Identify Object Instances by Touch: Tactile Recognition via Multimodal Matching
Title | Learning to Identify Object Instances by Touch: Tactile Recognition via Multimodal Matching |
Authors | Justin Lin, Roberto Calandra, Sergey Levine |
Abstract | Much of the literature on robotic perception focuses on the visual modality. Vision provides a global observation of a scene, making it broadly useful. However, in the domain of robotic manipulation, vision alone can sometimes prove inadequate: in the presence of occlusions or poor lighting, visual object identification might be difficult. The sense of touch can provide robots with an alternative mechanism for recognizing objects. In this paper, we study the problem of touch-based instance recognition. We propose a novel framing of the problem as multi-modal recognition: the goal of our system is to recognize, given a visual and tactile observation, whether or not these observations correspond to the same object. To our knowledge, our work is the first to address this type of multi-modal instance recognition problem at such a large scale, with our analysis spanning 98 different objects. We employ a robot equipped with two GelSight touch sensors, one on each finger, and a self-supervised, autonomous data collection procedure to collect a dataset of tactile observations and images. Our experimental results show that it is possible to accurately recognize object instances by touch alone, including instances of novel objects that were never seen during training. Our learned model outperforms other methods on this complex task, including that of human volunteers. |
Tasks | |
Published | 2019-03-08 |
URL | http://arxiv.org/abs/1903.03591v1 |
http://arxiv.org/pdf/1903.03591v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-identify-object-instances-by |
Repo | |
Framework | |
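The matching formulation, deciding whether a tactile and a visual observation correspond to the same object, can be sketched as thresholded similarity between embeddings. The embeddings, threshold and cosine metric here are illustrative assumptions, not the paper's learned model:

```python
import numpy as np

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def match(tactile_emb, visual_embs, threshold=0.8):
    """Indices of visual embeddings judged to be the same object as the
    tactile embedding (cosine similarity above a threshold)."""
    return [i for i, v in enumerate(visual_embs)
            if cosine(tactile_emb, v) > threshold]

touch = np.array([1.0, 0.0, 1.0])
visuals = [np.array([0.9, 0.1, 1.1]),    # same object, slightly different view
           np.array([-1.0, 1.0, 0.0])]   # different object
print(match(touch, visuals))             # [0]
```

In the paper this verification ("same object or not?") is learned end-to-end; the point of the sketch is only the pairwise-matching framing.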
An Application of CNNs to Time Sequenced One Dimensional Data in Radiation Detection
Title | An Application of CNNs to Time Sequenced One Dimensional Data in Radiation Detection |
Authors | Eric T. Moore, William P. Ford, Emma J. Hague, Johanna Turk |
Abstract | A Convolutional Neural Network architecture was used to classify various isotopes from time-sequenced gamma-ray spectra, a typical output of a radiation detection system of a type commonly fielded for security or environmental measurement purposes. A two-dimensional surface (waterfall plot) in time-energy space is interpreted as a monochromatic image and standard image-based CNN techniques are applied. This allows the time-sequenced aspects of features in the data to be discovered by the network, as opposed to standard algorithms, which arbitrarily time-bin the data to satisfy the intuition of a human spectroscopist. The CNN architecture and the results of this novel application of image-processing techniques to radiation data are presented, along with a comparison to more conventional adaptive methods. |
Tasks | |
Published | 2019-08-28 |
URL | https://arxiv.org/abs/1908.10887v1 |
https://arxiv.org/pdf/1908.10887v1.pdf | |
PWC | https://paperswithcode.com/paper/an-application-of-cnns-to-time-sequenced-one |
Repo | |
Framework | |
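The waterfall construction the abstract describes, stacking time-sequenced 1-D spectra into a 2-D time-energy image that a standard image CNN can consume, is straightforward. The Poisson-simulated spectra below are a stand-in for real detector output:

```python
import numpy as np

# Simulated time-sequenced gamma-ray spectra: one 1-D spectrum per time step.
rng = np.random.default_rng(42)
n_steps, n_channels = 32, 128
spectra = rng.poisson(lam=5.0, size=(n_steps, n_channels)).astype(np.float32)

# Stacking the spectra row by row yields the "waterfall" image in
# time-energy space; the leading axes make it a (batch, channel, H, W)
# tensor as expected by standard image-based CNN layers.
waterfall = spectra[None, None, :, :]
print(waterfall.shape)  # (1, 1, 32, 128)
```

No arbitrary time-binning is needed: the network sees the full time axis and can learn temporal features itself, which is the paper's central point.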
Preventing Adversarial Use of Datasets through Fair Core-Set Construction
Title | Preventing Adversarial Use of Datasets through Fair Core-Set Construction |
Authors | Benjamin Spector, Ravi Kumar, Andrew Tomkins |
Abstract | We propose improving the privacy properties of a dataset by publishing only a strategically chosen “core-set” of the data containing a subset of the instances. The core-set allows strong performance on primary tasks, but forces poor performance on unwanted tasks. We give methods for both linear models and neural networks and demonstrate their efficacy on data. |
Tasks | |
Published | 2019-10-24 |
URL | https://arxiv.org/abs/1910.10871v1 |
https://arxiv.org/pdf/1910.10871v1.pdf | |
PWC | https://paperswithcode.com/paper/preventing-adversarial-use-of-datasets |
Repo | |
Framework | |