Paper Group ANR 470
Channel Gating Neural Networks
Title | Channel Gating Neural Networks |
Authors | Weizhe Hua, Yuan Zhou, Christopher De Sa, Zhiru Zhang, G. Edward Suh |
Abstract | This paper introduces channel gating, a dynamic, fine-grained, and hardware-efficient pruning scheme to reduce the computation cost for convolutional neural networks (CNNs). Channel gating identifies regions in the features that contribute less to the classification result, and skips the computation on a subset of the input channels for these ineffective regions. Unlike static network pruning, channel gating optimizes CNN inference at run-time by exploiting input-specific characteristics, which allows substantially reducing the compute cost with almost no accuracy loss. We experimentally show that applying channel gating in state-of-the-art networks achieves 2.7-8.0$\times$ reduction in floating-point operations (FLOPs) and 2.0-4.4$\times$ reduction in off-chip memory accesses with a minimal accuracy loss on CIFAR-10. Combining our method with knowledge distillation reduces the compute cost of ResNet-18 by 2.6$\times$ without accuracy drop on ImageNet. We further demonstrate that channel gating can be realized in hardware efficiently. Our approach exhibits sparsity patterns that are well-suited to dense systolic arrays with minimal additional hardware. We have designed an accelerator for channel gating networks, which can be implemented using either FPGAs or ASICs. Running a quantized ResNet-18 model for ImageNet, our accelerator achieves an encouraging speedup of 2.4$\times$ on average, with a theoretical FLOP reduction of 2.8$\times$. |
Tasks | Network Pruning |
Published | 2018-05-29 |
URL | https://arxiv.org/abs/1805.12549v2 |
https://arxiv.org/pdf/1805.12549v2.pdf | |
PWC | https://paperswithcode.com/paper/channel-gating-neural-networks |
Repo | |
Framework | |
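The abstract above says channel gating skips a subset of input channels in regions that contribute little to the result. A rough back-of-the-envelope sketch of what that means for compute follows. The function names, the layer shape, and the two fractions are illustrative assumptions, not figures from the paper, and the cost of making the gating decision itself is ignored.

```python
def conv_macs(height, width, c_in, c_out, kernel):
    """Multiply-accumulates of a dense, stride-1, same-padded convolution."""
    return height * width * c_in * c_out * kernel * kernel


def gated_fraction(gated_channel_frac, skipped_region_frac):
    """Fraction of the dense cost left when `gated_channel_frac` of the input
    channels is skipped on `skipped_region_frac` of the spatial positions.
    Both arguments are hypothetical knobs for this sketch."""
    return 1.0 - gated_channel_frac * skipped_region_frac


# Hypothetical layer: 32x32 feature map, 64 -> 64 channels, 3x3 kernel.
dense = conv_macs(32, 32, 64, 64, 3)
remaining = dense * gated_fraction(gated_channel_frac=0.75, skipped_region_frac=0.8)
reduction = dense / remaining  # roughly 2.5x under these assumed fractions
print(f"dense MACs: {dense}, reduction: {reduction:.2f}x")
```

Under this simple model the reduction depends only on the product of the two fractions, which is one way to read why a theoretical FLOP reduction and a measured hardware speedup can differ.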
Party Matters: Enhancing Legislative Embeddings with Author Attributes for Vote Prediction
Title | Party Matters: Enhancing Legislative Embeddings with Author Attributes for Vote Prediction |
Authors | Anastassia Kornilova, Daniel Argyle, Vlad Eidelman |
Abstract | Predicting how Congressional legislators will vote is important for understanding their past and future behavior. However, previous work on roll-call prediction has been limited to single session settings, thus did not consider generalization across sessions. In this paper, we show that metadata is crucial for modeling voting outcomes in new contexts, as changes between sessions lead to changes in the underlying data generation process. We show how augmenting bill text with the sponsors’ ideologies in a neural network model can achieve an average of a 4% boost in accuracy over the previous state-of-the-art. |
Tasks | |
Published | 2018-05-21 |
URL | http://arxiv.org/abs/1805.08182v1 |
http://arxiv.org/pdf/1805.08182v1.pdf | |
PWC | https://paperswithcode.com/paper/party-matters-enhancing-legislative |
Repo | |
Framework | |
Nonlinear regression based on a hybrid quantum computer
Title | Nonlinear regression based on a hybrid quantum computer |
Authors | Dan-Bo Zhang, Shi-Liang Zhu, Z. D. Wang |
Abstract | Incorporating nonlinearity into quantum machine learning is essential for learning a complicated input-output mapping. We here propose quantum algorithms for nonlinear regression, where nonlinearity is introduced with feature maps when loading classical data into quantum states. Our implementation is based on a hybrid quantum computer, exploiting both discrete and continuous variables, for their capacity to encode novel features and efficiency of processing information. We propose encoding schemes that can realize well-known polynomial and Gaussian kernel ridge regressions, with exponential speed-up with respect to the number of samples. |
Tasks | Quantum Machine Learning |
Published | 2018-08-29 |
URL | http://arxiv.org/abs/1808.09607v1 |
http://arxiv.org/pdf/1808.09607v1.pdf | |
PWC | https://paperswithcode.com/paper/nonlinear-regression-based-on-a-hybrid |
Repo | |
Framework | |
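For orientation, the classical polynomial and Gaussian kernel ridge regressions that the abstract above refers to can be sketched as follows. This is the standard classical formulation in NumPy, not the quantum algorithm from the paper; the function names, the hyperparameters (`gamma`, `degree`, `coef0`, `lam`), and the toy data are all assumptions made for illustration.

```python
import numpy as np


def gaussian_kernel(a, b, gamma=1.0):
    # Pairwise squared Euclidean distances between rows of a and rows of b.
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def polynomial_kernel(a, b, degree=2, coef0=1.0):
    return (a @ b.T + coef0) ** degree


def fit_kernel_ridge(x, y, kernel, lam=1e-3):
    # Dual coefficients alpha solve (K + lam * I) alpha = y.
    k = kernel(x, x)
    return np.linalg.solve(k + lam * np.eye(len(x)), y)


def predict(x_train, alpha, kernel, x_new):
    # Prediction is a kernel-weighted sum over the training samples.
    return kernel(x_new, x_train) @ alpha


# Toy nonlinear target on an evenly spaced grid (illustrative data only).
x_train = np.linspace(-3.0, 3.0, 40)[:, None]
y_train = np.sin(x_train[:, 0])
alpha = fit_kernel_ridge(x_train, y_train, gaussian_kernel)

x_test = np.array([[0.5], [1.5]])
y_pred = predict(x_train, alpha, gaussian_kernel, x_test)
print(y_pred, np.sin(x_test[:, 0]))
```

The classical solve scales with the number of samples through the kernel matrix, which is the dependence the abstract's speed-up claim is about.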
Deep learning based 2.5D flow field estimation for maximum intensity projections of 4D optical coherence tomography
Title | Deep learning based 2.5D flow field estimation for maximum intensity projections of 4D optical coherence tomography |
Authors | Max-Heinrich Laves, Lüder A. Kahrs, Tobias Ortmaier |
Abstract | In microsurgery, lasers have emerged as precise tools for bone ablation. A challenge is automatic control of laser bone ablation with 4D optical coherence tomography (OCT). OCT, as a high-resolution imaging modality, provides volumetric images of tissue and yields information on bone position and orientation (pose) as well as thickness. However, existing approaches for OCT based laser ablation control rely on external tracking systems or invasively ablated artificial landmarks for tracking the pose of the OCT probe relative to the tissue. This can be superseded by estimating the scene flow caused by relative movement between the OCT-based laser ablation system and the patient. Therefore, this paper deals with 2.5D scene flow estimation of volumetric OCT images for application in laser ablation. We present a semi-supervised convolutional neural network based tracking scheme for subsequent 3D OCT volumes and apply it to a realistic semi-synthetic data set of ex vivo human temporal bone specimen. The scene flow is estimated in a two-stage approach. In the first stage, 2D lateral scene flow is computed on census-transformed en-face arguments-of-maximum intensity projections. Subsequent to this, the projections are warped by predicted lateral flow and 1D depth flow is estimated. The neural network is trained semi-supervised by combining error to ground truth and the reconstruction error of warped images with assumptions of spatial flow smoothness. Quantitative evaluation reveals a mean endpoint error of $(4.7 \pm 3.5)$ voxels or $(27.5 \pm 20.5)\,\mu\mathrm{m}$ for scene flow estimation caused by simulated relative movement between the OCT probe and bone. The scene flow estimation for 4D OCT enables its use for markerless tracking of mastoid bone structures for image guidance in general, and automated laser ablation control. |
Tasks | Object Tracking, Optical Flow Estimation, Scene Flow Estimation |
Published | 2018-10-26 |
URL | http://arxiv.org/abs/1810.11205v2 |
http://arxiv.org/pdf/1810.11205v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-based-25d-flow-field-estimation |
Repo | |
Framework | |
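The endpoint error quoted in the abstract above is conventionally the Euclidean distance between predicted and reference flow vectors, averaged over positions. A minimal sketch of that metric follows; the array shapes and the toy values are assumptions, and the conversion factor is only what the reported numbers imply if the voxel spacing is isotropic.

```python
import numpy as np


def endpoint_error(flow_pred, flow_true):
    """Mean and standard deviation of the per-position Euclidean distance
    between two flow fields whose last axis holds the flow components."""
    err = np.linalg.norm(flow_pred - flow_true, axis=-1)
    return err.mean(), err.std()


# Toy 2x2x2 volume with 3 flow components per voxel (illustrative only).
flow_true = np.zeros((2, 2, 2, 3))
flow_pred = np.broadcast_to(np.array([3.0, 4.0, 0.0]), flow_true.shape)
mean_epe, std_epe = endpoint_error(flow_pred, flow_true)

# Ratio implied by the abstract's figures (27.5 um for 4.7 voxels), about 5.85.
implied_um_per_voxel = 27.5 / 4.7
print(mean_epe, std_epe, mean_epe * implied_um_per_voxel)
```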
Generating Bilingual Pragmatic Color References
Title | Generating Bilingual Pragmatic Color References |
Authors | Will Monroe, Jennifer Hu, Andrew Jong, Christopher Potts |
Abstract | Contextual influences on language often exhibit substantial cross-lingual regularities; for example, we are more verbose in situations that require finer distinctions. However, these regularities are sometimes obscured by semantic and syntactic differences. Using a newly-collected dataset of color reference games in Mandarin Chinese (which we release to the public), we confirm that a variety of constructions display the same sensitivity to contextual difficulty in Chinese and English. We then show that a neural speaker agent trained on bilingual data with a simple multitask learning approach displays more human-like patterns of context dependence and is more pragmatically informative than its monolingual Chinese counterpart. Moreover, this is not at the expense of language-specific semantic understanding: the resulting speaker model learns the different basic color term systems of English and Chinese (with noteworthy cross-lingual influences), and it can identify synonyms between the two languages using vector analogy operations on its output layer, despite having no exposure to parallel data. |
Tasks | |
Published | 2018-03-11 |
URL | http://arxiv.org/abs/1803.03917v2 |
http://arxiv.org/pdf/1803.03917v2.pdf | |
PWC | https://paperswithcode.com/paper/generating-bilingual-pragmatic-color |
Repo | |
Framework | |
Quantized Single-Ion-Channel Hodgkin-Huxley Model for Quantum Neurons
Title | Quantized Single-Ion-Channel Hodgkin-Huxley Model for Quantum Neurons |
Authors | Tasio Gonzalez-Raya, Xiao-Hang Cheng, Iñigo L. Egusquiza, Xi Chen, Mikel Sanz, Enrique Solano |
Abstract | The Hodgkin-Huxley model describes the behavior of the cell membrane in neurons, treating each part of it as an electric circuit element, namely capacitors, memristors, and voltage sources. We focus on the activation channel of potassium ions, due to its simplicity, while keeping most of the features displayed by the original model. This reduced version is essentially a classical memristor, a resistor whose resistance depends on the history of electric signals that have crossed it, coupled to a voltage source and a capacitor. Here, we will consider a quantized Hodgkin-Huxley model based on a quantum memristor formalism. We compare the behavior of the membrane voltage and the potassium channel conductance, when the circuit is subjected to AC sources, in both classical and quantum realms. Numerical simulations show an expected adaptation of the considered channel conductance depending on the signal history in all regimes. Remarkably, the computation of higher moments of the voltage manifests purely quantum features related to the circuit zero-point energy. Finally, we study the implementation of the Hodgkin-Huxley quantum memristor as an asymmetric rf SQUID in superconducting circuits. This study may allow the construction of quantum neuron networks inspired by brain function, as well as the design of neuromorphic quantum architectures for quantum machine learning. |
Tasks | Quantum Machine Learning |
Published | 2018-07-27 |
URL | https://arxiv.org/abs/1807.10698v4 |
https://arxiv.org/pdf/1807.10698v4.pdf | |
PWC | https://paperswithcode.com/paper/quantized-hodgkin-huxley-model-for-quantum |
Repo | |
Framework | |
Information-theoretic Limits for Community Detection in Network Models
Title | Information-theoretic Limits for Community Detection in Network Models |
Authors | Chuyang Ke, Jean Honorio |
Abstract | We analyze the information-theoretic limits for the recovery of node labels in several network models. This includes the Stochastic Block Model, the Exponential Random Graph Model, the Latent Space Model, the Directed Preferential Attachment Model, and the Directed Small-world Model. For the Stochastic Block Model, the non-recoverability condition depends on the probabilities of having edges inside a community, and between different communities. For the Latent Space Model, the non-recoverability condition depends on the dimension of the latent space, and on how far apart and how spread out the communities are in the latent space. For the Directed Preferential Attachment Model and the Directed Small-world Model, the non-recoverability condition depends on the ratio between homophily and neighborhood size. We also consider dynamic versions of the Stochastic Block Model and the Latent Space Model. |
Tasks | Community Detection |
Published | 2018-02-16 |
URL | http://arxiv.org/abs/1802.06104v2 |
http://arxiv.org/pdf/1802.06104v2.pdf | |
PWC | https://paperswithcode.com/paper/information-theoretic-limits-for-community |
Repo | |
Framework | |
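To make the two Stochastic Block Model probabilities mentioned in the abstract above concrete, here is a small sampler for an undirected SBM. The names `p_in` and `p_out`, the function name, and the community sizes are assumptions for this sketch; it illustrates the generative model only, not the paper's recoverability analysis.

```python
import numpy as np


def sample_sbm(labels, p_in, p_out, rng):
    """Sample a symmetric 0/1 adjacency matrix without self-loops.
    Nodes with equal labels connect with probability p_in, others with p_out."""
    labels = np.asarray(labels)
    same = labels[:, None] == labels[None, :]
    probs = np.where(same, p_in, p_out)
    # Draw each unordered pair once (strict upper triangle), then mirror it.
    upper = np.triu(rng.random(probs.shape) < probs, k=1)
    return (upper | upper.T).astype(int)


rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 5)  # two hypothetical communities of five nodes
adj = sample_sbm(labels, p_in=0.8, p_out=0.1, rng=rng)
print(adj)
```

When `p_in` and `p_out` are close, the sampled graph carries little information about `labels`, which is the intuition behind a non-recoverability condition stated in terms of these two probabilities.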
Growing and Retaining AI Talent for the United States Government
Title | Growing and Retaining AI Talent for the United States Government |
Authors | Edward Raff |
Abstract | Artificial Intelligence and Machine Learning have become transformative to a number of industries, and as such, many industries' need for AI talent is increasing the demand for individuals with these skills. This continues to exacerbate the difficulty of acquiring and retaining talent for the United States Federal Government, both for its direct employees and for the companies that support it. We take the position that by focusing on growing and retaining current talent through a number of cultural changes, the government can work to remediate this problem today. |
Tasks | |
Published | 2018-09-27 |
URL | http://arxiv.org/abs/1809.10276v1 |
http://arxiv.org/pdf/1809.10276v1.pdf | |
PWC | https://paperswithcode.com/paper/growing-and-retaining-ai-talent-for-the |
Repo | |
Framework | |
Quantum classification of the MNIST dataset via Slow Feature Analysis
Title | Quantum classification of the MNIST dataset via Slow Feature Analysis |
Authors | Iordanis Kerenidis, Alessandro Luongo |
Abstract | Quantum machine learning carries the promise to revolutionize information and communication technologies. While a number of quantum algorithms with potential exponential speedups have been proposed already, it is quite difficult to provide convincing evidence that quantum computers with quantum memories will be in fact useful to solve real-world problems. Our work makes considerable progress towards this goal. We design quantum techniques for Dimensionality Reduction and for Classification, and combine them to provide an efficient and high accuracy quantum classifier that we test on the MNIST dataset. More precisely, we propose a quantum version of Slow Feature Analysis (QSFA), a dimensionality reduction technique that maps the dataset in a lower dimensional space where we can apply a novel quantum classification procedure, the Quantum Frobenius Distance (QFD). We simulate the quantum classifier (including errors) and show that it can provide classification of the MNIST handwritten digit dataset, a widely used dataset for benchmarking classification algorithms, with $98.5\%$ accuracy, similar to the classical case. The running time of the quantum classifier is polylogarithmic in the dimension and number of data points. We also provide evidence that the other parameters on which the running time depends (condition number, Frobenius norm, error threshold, etc.) scale favorably in practice, thus ascertaining the efficiency of our algorithm. |
Tasks | Dimensionality Reduction, Quantum Machine Learning |
Published | 2018-05-22 |
URL | http://arxiv.org/abs/1805.08837v2 |
http://arxiv.org/pdf/1805.08837v2.pdf | |
PWC | https://paperswithcode.com/paper/quantum-classification-of-the-mnist-dataset |
Repo | |
Framework | |
Detecting Adversarial Perturbations Through Spatial Behavior in Activation Spaces
Title | Detecting Adversarial Perturbations Through Spatial Behavior in Activation Spaces |
Authors | Ziv Katzir, Yuval Elovici |
Abstract | Neural network based classifiers are still prone to manipulation through adversarial perturbations. State of the art attacks can overcome most of the defense or detection mechanisms suggested so far, and adversaries have the upper hand in this arms race. Adversarial examples are designed to resemble the normal input from which they were constructed, while triggering an incorrect classification. This basic design goal leads to a characteristic spatial behavior within the context of Activation Spaces, a term coined by the authors to refer to the hyperspaces formed by the activation values of the network’s layers. Within the output of the first layers of the network, an adversarial example is likely to resemble normal instances of the source class, while in the final layers such examples will diverge towards the adversary’s target class. The steps below enable us to leverage this inherent shift from one class to another in order to form a novel adversarial example detector. We construct Euclidean spaces out of the activation values of each of the deep neural network layers. Then, we induce a set of k-nearest neighbor classifiers (k-NN), one per activation space of each neural network layer, using the non-adversarial examples. We leverage those classifiers to produce a sequence of class labels for each non-perturbed input sample and estimate the a priori probability for a class label change between one activation space and another. During the detection phase we compute a sequence of classification labels for each input using the trained classifiers. We then estimate the likelihood of those classification sequences and show that adversarial sequences are far less likely than normal ones. We evaluated our detection method against the state of the art C&W attack method, using two image classification datasets (MNIST, CIFAR-10) reaching an AUC of 0.95 for the CIFAR-10 dataset. |
Tasks | Image Classification |
Published | 2018-11-22 |
URL | http://arxiv.org/abs/1811.09043v2 |
http://arxiv.org/pdf/1811.09043v2.pdf | |
PWC | https://paperswithcode.com/paper/detecting-adversarial-perturbations-through |
Repo | |
Framework | |
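The detection steps listed in the abstract above (one k-NN classifier per activation space, label sequences, label-change probabilities, sequence likelihood) can be sketched roughly as below. This is an illustrative reading, not the authors' implementation: the first-order transition model with additive smoothing, the separate calibration set, the function names, and the toy 2-D "activations" are all assumptions.

```python
import numpy as np


def knn_predict(train_x, train_y, query_x, k, n_classes):
    # Brute-force k-NN with majority vote inside one activation space.
    d = np.linalg.norm(query_x[:, None, :] - train_x[None, :, :], axis=2)
    votes = train_y[np.argsort(d, axis=1)[:, :k]]
    counts = np.stack([np.bincount(v, minlength=n_classes) for v in votes])
    return counts.argmax(axis=1)


def label_sequences(train_acts, train_y, query_acts, k, n_classes):
    # One classifier per layer; column l holds the labels assigned at layer l.
    return np.stack([knn_predict(tr, train_y, q, k, n_classes)
                     for tr, q in zip(train_acts, query_acts)], axis=1)


def fit_transitions(seqs, n_classes, smoothing=1.0):
    # trans[l, a, b] estimates P(label b at layer l+1 | label a at layer l).
    trans = np.full((seqs.shape[1] - 1, n_classes, n_classes), smoothing)
    for l in range(trans.shape[0]):
        np.add.at(trans[l], (seqs[:, l], seqs[:, l + 1]), 1.0)
    return trans / trans.sum(axis=2, keepdims=True)


def sequence_log_likelihood(seqs, trans):
    ll = np.zeros(len(seqs))
    for l in range(trans.shape[0]):
        ll += np.log(trans[l, seqs[:, l], seqs[:, l + 1]])
    return ll


# Toy stand-in for layer activations: two well-separated classes in 2-D.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [5.0, 5.0]])
n_layers, n_classes, k = 3, 2, 3


def toy_acts(layer_labels):
    # layer_labels holds one label array per layer.
    return [centers[y] + 0.3 * rng.standard_normal((len(y), 2)) for y in layer_labels]


train_y = np.repeat([0, 1], 20)
train_acts = toy_acts([train_y] * n_layers)
calib_acts = toy_acts([np.repeat([0, 1], 20)] * n_layers)
trans = fit_transitions(
    label_sequences(train_acts, train_y, calib_acts, k, n_classes), n_classes)

clean_acts = toy_acts([np.repeat([0, 1], 5)] * n_layers)
# Simulated adversarial inputs: class 0 in early layers, class 1 in the last.
adv_acts = toy_acts([np.zeros(5, int), np.zeros(5, int), np.ones(5, int)])

clean_ll = sequence_log_likelihood(
    label_sequences(train_acts, train_y, clean_acts, k, n_classes), trans)
adv_ll = sequence_log_likelihood(
    label_sequences(train_acts, train_y, adv_acts, k, n_classes), trans)
print(clean_ll.min(), adv_ll.max())
```

Thresholding the log-likelihood then gives a detector; in this toy setup the label switch in the last layer makes the simulated adversarial sequences score lower than the clean ones.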
Quantum generative adversarial networks
Title | Quantum generative adversarial networks |
Authors | Pierre-Luc Dallaire-Demers, Nathan Killoran |
Abstract | Quantum machine learning is expected to be one of the first potential general-purpose applications of near-term quantum devices. A major recent breakthrough in classical machine learning is the notion of generative adversarial training, where the gradients of a discriminator model are used to train a separate generative model. In this work and a companion paper, we extend adversarial training to the quantum domain and show how to construct generative adversarial networks using quantum circuits. Furthermore, we also show how to compute gradients – a key element in generative adversarial network training – using another quantum circuit. We give an example of a simple practical circuit ansatz to parametrize quantum machine learning models and perform a simple numerical experiment to demonstrate that quantum generative adversarial networks can be trained successfully. |
Tasks | Quantum Machine Learning |
Published | 2018-04-23 |
URL | http://arxiv.org/abs/1804.08641v2 |
http://arxiv.org/pdf/1804.08641v2.pdf | |
PWC | https://paperswithcode.com/paper/quantum-generative-adversarial-networks |
Repo | |
Framework | |
Universal discriminative quantum neural networks
Title | Universal discriminative quantum neural networks |
Authors | Hongxiang Chen, Leonard Wossnig, Simone Severini, Hartmut Neven, Masoud Mohseni |
Abstract | Quantum mechanics fundamentally forbids deterministic discrimination of quantum states and processes. However, the ability to optimally distinguish various classes of quantum data is an important primitive in quantum information science. In this work, we train near-term quantum circuits to classify data represented by non-orthogonal quantum probability distributions using the Adam stochastic optimization algorithm. This is achieved by iterative interactions of a classical device with a quantum processor to discover the parameters of an unknown non-unitary quantum circuit. This circuit learns to simulate the unknown structure of a generalized quantum measurement, or Positive-Operator-Valued Measure (POVM), that is required to optimally distinguish possible distributions of quantum inputs. Notably we use universal circuit topologies, with a theoretically motivated circuit design, which guarantees that our circuits can in principle learn to perform arbitrary input-output mappings. Our numerical simulations show that shallow quantum circuits could be trained to discriminate among various pure and mixed quantum states exhibiting a trade-off between minimizing erroneous and inconclusive outcomes with comparable performance to theoretically optimal POVMs. We train the circuit on different classes of quantum data and evaluate the generalization error on unseen mixed quantum states. This generalization power hence distinguishes our work from standard circuit optimization and provides an example of quantum machine learning for a task that has inherently no classical analogue. |
Tasks | Quantum Machine Learning, Stochastic Optimization |
Published | 2018-05-22 |
URL | http://arxiv.org/abs/1805.08654v1 |
http://arxiv.org/pdf/1805.08654v1.pdf | |
PWC | https://paperswithcode.com/paper/universal-discriminative-quantum-neural |
Repo | |
Framework | |
Ro-SOS: Metric Expression Network (MEnet) for Robust Salient Object Segmentation
Title | Ro-SOS: Metric Expression Network (MEnet) for Robust Salient Object Segmentation |
Authors | Delu Zeng, Yixuan He, Li Liu, Zhihong Chen, Jiabin Huang, Jie Chen, John Paisley |
Abstract | Although deep CNNs have brought significant improvement to image saliency detection, most CNN based models are sensitive to distortion such as compression and noise. In this paper, we propose an end-to-end generic salient object segmentation model called Metric Expression Network (MEnet) to deal with saliency detection with the tolerance of distortion. Within MEnet, a new topological metric space is constructed, whose implicit metric is determined by the deep network. As a result, we manage to group all the pixels in the observed image semantically within this latent space into two regions: a salient region and a non-salient region. With this architecture, all feature extractions are carried out at the pixel level, enabling fine granularity of output boundaries of the salient objects. What’s more, we try to give a general analysis for the noise robustness of the network in the sense of Lipschitz and Jacobian literature. Experiments demonstrate that robust salient maps facilitating object segmentation can be generated by the proposed metric. Tests on several public benchmarks show that MEnet has achieved desirable performance. Furthermore, by direct computation and measuring the robustness, the proposed method outperforms previous CNN-based methods on distorted inputs. |
Tasks | Saliency Detection, Semantic Segmentation |
Published | 2018-05-15 |
URL | https://arxiv.org/abs/1805.05638v3 |
https://arxiv.org/pdf/1805.05638v3.pdf | |
PWC | https://paperswithcode.com/paper/menet-a-metric-expression-network-for-salient |
Repo | |
Framework | |
A multilayer backpropagation saliency detection algorithm and its applications
Title | A multilayer backpropagation saliency detection algorithm and its applications |
Authors | Chunbiao Zhu, Ge Li |
Abstract | Saliency detection is an active topic in the multimedia field. Most previous works on saliency detection focus on 2D images. However, these methods are not robust against complex scenes which contain multiple objects or complex backgrounds. Recently, depth information has supplied a powerful cue for saliency detection. In this paper, we propose a multilayer backpropagation saliency detection algorithm based on depth mining, by which we exploit depth cues from three different layers of images. The proposed algorithm shows good performance and maintains robustness in complex situations. Experimental results show that the proposed framework is superior to other existing saliency approaches. Besides, we give two innovative applications of this algorithm: scene reconstruction from multiple images and small target object detection in video. |
Tasks | Object Detection, Saliency Detection |
Published | 2018-03-26 |
URL | http://arxiv.org/abs/1803.09659v1 |
http://arxiv.org/pdf/1803.09659v1.pdf | |
PWC | https://paperswithcode.com/paper/a-multilayer-backpropagation-saliency |
Repo | |
Framework | |
Guaranteed Recovery of One-Hidden-Layer Neural Networks via Cross Entropy
Title | Guaranteed Recovery of One-Hidden-Layer Neural Networks via Cross Entropy |
Authors | Haoyu Fu, Yuejie Chi, Yingbin Liang |
Abstract | We study model recovery for data classification, where the training labels are generated from a one-hidden-layer neural network with sigmoid activations, and the goal is to recover the weights of the neural network. We consider two network models, the fully-connected network (FCN) and the non-overlapping convolutional neural network (CNN). We prove that with Gaussian inputs, the empirical risk based on cross entropy exhibits strong convexity and smoothness uniformly in a local neighborhood of the ground truth, as soon as the sample complexity is sufficiently large. This implies that if initialized in this neighborhood, gradient descent converges linearly to a critical point that is provably close to the ground truth. Furthermore, we show such an initialization can be obtained via the tensor method. This establishes the global convergence guarantee for empirical risk minimization using cross entropy via gradient descent for learning one-hidden-layer neural networks, at the near-optimal sample and computational complexity with respect to the network input dimension without unrealistic assumptions such as requiring a fresh set of samples at each iteration. |
Tasks | |
Published | 2018-02-18 |
URL | http://arxiv.org/abs/1802.06463v2 |
http://arxiv.org/pdf/1802.06463v2.pdf | |
PWC | https://paperswithcode.com/paper/guaranteed-recovery-of-one-hidden-layer |
Repo | |
Framework | |