January 31, 2020


Paper Group ANR 138


Online Mixed-Integer Optimization in Milliseconds


Title	Online Mixed-Integer Optimization in Milliseconds
Authors	Dimitris Bertsimas, Bartolomeo Stellato
Abstract	We propose a method to solve online mixed-integer optimization (MIO) problems at very high speed using machine learning. By exploiting the repetitive nature of online optimization, we are able to greatly reduce the solution time. Our approach encodes the optimal solution into a small amount of information, denoted as a strategy, using the Voice of Optimization framework proposed in [BS18]. In this way the core part of the optimization algorithm becomes a multiclass classification problem, which can be solved very quickly. In this work we extend that framework to real-time and high-speed applications, focusing on parametric mixed-integer quadratic optimization (MIQO). We propose an extremely fast online optimization algorithm consisting of a feedforward neural network (NN) evaluation and a linear system solution where the matrix has already been factorized. Therefore, this online approach does not require any solver or iterative algorithm. We show the speed of the proposed method both in terms of total computations required and measured execution time. We estimate the number of floating point operations (flops) required to completely recover the optimal solution as a function of the problem dimensions. Compared to state-of-the-art MIO routines, the online running time of our method is very predictable and can be lower than a single matrix factorization time. We benchmark our method against the state-of-the-art solver Gurobi, obtaining speedups of two to three orders of magnitude on benchmarks with real-world data.
Tasks
Published	2019-07-04
URL	https://arxiv.org/abs/1907.02206v1
PDF	https://arxiv.org/pdf/1907.02206v1.pdf
PWC	https://paperswithcode.com/paper/online-mixed-integer-optimization-in
Repo
Framework
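
The online step described in the abstract above (pick a strategy by classification, then solve a linear system whose matrix was factorized offline) can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation: the function names are hypothetical, the `scores` list stands in for the neural network output, and a Cholesky factorization of a symmetric positive definite matrix is assumed for simplicity, whereas the systems arising in MIQO are generally indefinite and would need a different factorization.

```python
import math


def cholesky(a):
    """Offline step: return lower-triangular L with L * L^T = a (a must be SPD)."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(l[i][p] * l[j][p] for p in range(j))
            if i == j:
                l[i][j] = math.sqrt(a[i][i] - s)
            else:
                l[i][j] = (a[i][j] - s) / l[j][j]
    return l


def solve_with_factor(l, b):
    """Online step: solve L * L^T * x = b with two triangular solves."""
    n = len(b)
    # Forward substitution: L * y = b.
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(l[i][p] * y[p] for p in range(i))) / l[i][i]
    # Back substitution: L^T * x = y.
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(l[p][i] * x[p] for p in range(i + 1, n))) / l[i][i]
    return x


def online_solve(scores, factors, rhs):
    """Pick the highest-scoring strategy and reuse its stored factorization."""
    strategy = max(range(len(scores)), key=lambda s: scores[s])
    return strategy, solve_with_factor(factors[strategy], rhs)


# Offline: one factorization per strategy (toy 2x2 matrices, assumed data).
factors = [
    cholesky([[1.0, 0.0], [0.0, 1.0]]),
    cholesky([[4.0, 2.0], [2.0, 3.0]]),
]
# Online: classifier scores select strategy 1; only triangular solves remain.
strategy, x = online_solve([0.1, 0.9], factors, [2.0, 1.0])
print(strategy, x)
```

The point of the split is that the cubic-cost factorization happens once per strategy offline, so the online cost is a classifier evaluation plus two quadratic-cost triangular solves.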

Intrinsic Weight Learning Approach for Multi-view Clustering


Title	Intrinsic Weight Learning Approach for Multi-view Clustering
Authors	Feiping Nie, Jing Li, Xuelong Li
Abstract	Exploiting different representations, or views, of the same object for better clustering has become very popular; this is conventionally called multi-view clustering. Generally, it is essential to measure the importance of each individual view, because views differ in noise level and in inherent descriptive capacity. Many previous works model the view importance as a weight, which is simple but empirically effective. In this paper, instead of following the traditional approach, we propose a new weight learning paradigm in the context of multi-view clustering based on the idea of the re-weighted approach, and we theoretically analyze its working mechanism. Meanwhile, as a carefully constructed example, all of the views are connected by exploring a unified Laplacian rank constrained graph, which serves as a representative method to compare with other weight learning approaches in experiments. Furthermore, the proposed weight learning strategy is well suited to multi-view data, and it can be naturally integrated with many existing clustering learners. According to the numerical experiments, the proposed intrinsic weight learning approach is effective and practical to use in multi-view clustering.
Tasks
Published	2019-06-21
URL	https://arxiv.org/abs/1906.08905v1
PDF	https://arxiv.org/pdf/1906.08905v1.pdf
PWC	https://paperswithcode.com/paper/intrinsic-weight-learning-approach-for-multi
Repo
Framework

Derivation of the Variational Bayes Equations


Title	Derivation of the Variational Bayes Equations
Authors	Alianna J. Maren
Abstract	The derivation of key equations for the variational Bayes approach is well-known in certain circles. However, translating the fundamental derivations (e.g., as found in Beal (2003)) to the notation of Friston (2013, 2015) is somewhat delicate. Further, the notion of using variational Bayes in the context of a system with Markov blankets requires special attention. This Technical Report presents the derivation in detail. It further illustrates how the variational Bayes method provides a framework for a new computational engine, incorporating the 2-D cluster variation method (CVM), which provides a necessary free energy equation that can be minimized across both the external and representational systems’ states, respectively.
Tasks
Published	2019-06-20
URL	https://arxiv.org/abs/1906.08804v4
PDF	https://arxiv.org/pdf/1906.08804v4.pdf
PWC	https://paperswithcode.com/paper/derivation-of-the-variational-bayes-equations
Repo
Framework

Intra- and Inter-epoch Temporal Context Network (IITNet) for Automatic Sleep Stage Scoring


Title	Intra- and Inter-epoch Temporal Context Network (IITNet) for Automatic Sleep Stage Scoring
Authors	Seunghyeok Back, Seongju Lee, Hogeon Seo, Deokhwan Park, Tae Kim, Kyoobin Lee
Abstract	This study proposes a novel deep learning model, called IITNet, to learn intra- and inter-epoch temporal contexts from a raw single-channel electroencephalogram (EEG) for automatic sleep stage scoring. When sleep experts identify the sleep stage of a 30-second segment of polysomnography (PSG) data, called an epoch, they investigate sleep-related events such as sleep spindles, K-complexes, and frequency components from local segments of an epoch (sub-epochs) and consider the relations between sleep-related events of successive epochs to follow the transition rules. Inspired by this, IITNet learns how to encode each sub-epoch into a representative feature via a deep residual network, then captures contextual information in the sequence of representative features via BiLSTM. Thus, IITNet can extract features at the sub-epoch level and consider temporal context not only between epochs but also within an epoch. IITNet is an end-to-end architecture and does not need any preprocessing, handcrafted feature design, balanced sampling, pre-training, or fine-tuning. Our model was trained and evaluated on the Sleep-EDF and MASS datasets and outperformed other state-of-the-art results on both datasets, with overall accuracy (ACC) of 84.0% and 86.6%, macro F1-score (MF1) of 77.7 and 80.8, and Cohen’s kappa of 0.78 and 0.80 on Sleep-EDF and MASS, respectively.
Tasks	EEG, Sleep Stage Detection
Published	2019-02-18
URL	http://arxiv.org/abs/1902.06562v1
PDF	http://arxiv.org/pdf/1902.06562v1.pdf
PWC	https://paperswithcode.com/paper/intra-and-inter-epoch-temporal-context
Repo
Framework

Memorized Sparse Backpropagation


Title	Memorized Sparse Backpropagation
Authors	Zhiyuan Zhang, Pengcheng Yang, Xuancheng Ren, Xu Sun
Abstract	Neural network learning is typically slow since backpropagation needs to compute full gradients and backpropagate them across multiple layers. Despite the success of existing work in accelerating propagation through sparseness, the relevant theoretical characteristics remain unexplored, and we empirically find that existing methods suffer from the loss of information contained in unpropagated gradients. To tackle these problems, in this work, we present a unified sparse backpropagation framework and provide a detailed analysis of its theoretical characteristics. Analysis reveals that when applied to a multilayer perceptron, our framework essentially performs gradient descent using an estimated gradient similar enough to the true gradient, resulting in convergence in probability under certain conditions. Furthermore, a simple yet effective algorithm named memorized sparse backpropagation (MSBP) is proposed to remedy the problem of information loss by storing unpropagated gradients in memory for the next learning step. The experiments demonstrate that the proposed MSBP is able to effectively alleviate the information loss in traditional sparse backpropagation while achieving comparable acceleration.
Tasks
Published	2019-05-24
URL	https://arxiv.org/abs/1905.10194v2
PDF	https://arxiv.org/pdf/1905.10194v2.pdf
PWC	https://paperswithcode.com/paper/memorized-sparse-backpropagation
Repo
Framework
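
The memory mechanism described in the abstract above can be illustrated with a small sketch. It assumes top-k selection by magnitude as the sparsification rule, which the abstract does not specify, and the function name `memorized_sparsify` is hypothetical; it is meant only to show how unpropagated gradient components can be carried over to the next step rather than discarded.

```python
def memorized_sparsify(grad, memory, k):
    """Return (sparse gradient to propagate, updated memory).

    Assumption: the k largest-magnitude entries of grad + memory are
    propagated; everything else is stored for the next step.
    """
    combined = [g + m for g, m in zip(grad, memory)]
    # Indices of the k entries with the largest absolute value.
    order = sorted(range(len(combined)), key=lambda i: abs(combined[i]), reverse=True)
    keep = set(order[:k])
    sparse = [c if i in keep else 0.0 for i, c in enumerate(combined)]
    new_memory = [0.0 if i in keep else c for i, c in enumerate(combined)]
    return sparse, new_memory


memory = [0.0, 0.0, 0.0]
# Step 1: only the largest component is propagated, the rest is remembered.
sparse1, memory = memorized_sparsify([0.5, -2.0, 0.1], memory, k=1)
# Step 2: the remembered 0.5 adds to the new 0.5, so index 0 is now selected.
sparse2, memory = memorized_sparsify([0.5, 0.1, 0.1], memory, k=1)
print(sparse1, sparse2, memory)
```

In this toy run, the first component would never be propagated by plain top-1 sparsification in step 1, but its accumulated value wins selection in step 2, which is the information-preserving effect the abstract refers to.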

Automatic Synthesis of Totally Self-Checking Circuits


Title	Automatic Synthesis of Totally Self-Checking Circuits
Authors	Michael Garvie, Phil Husbands
Abstract	Totally self-checking (TSC) circuits are synthesised with a grid of computers running a distributed population based stochastic optimisation algorithm. The presented method is the first to automatically synthesise TSC circuits from arbitrary logic as all previous methods fail to guarantee the checker is self-testing (ST) for circuits with limited output codespaces. The circuits synthesised by the presented method have significantly lower overhead than the previously reported best for every one of a set of 11 frequently used benchmarks. Average overhead across the entire set is 23% of duplication and comparison overhead, compared with an average of 69% for the previous best reported values across the set. The methodology presented represents a breakthrough in concurrent error detection (CED). The highly efficient, novel designs produced are tailored to each circuit’s function, rather than being constrained by a particular modular CED design methodology. Results are synthesised using two-input gates and are TSC with respect to all gate input and output stuck-at faults. The method can be used to add CED with or without modifications to the original logic, and can be generalised to any implementation technology and fault model. An example circuit is analysed and rigorously proven to be TSC.
Tasks
Published	2019-01-21
URL	http://arxiv.org/abs/1901.07023v1
PDF	http://arxiv.org/pdf/1901.07023v1.pdf
PWC	https://paperswithcode.com/paper/automatic-synthesis-of-totally-self-checking
Repo
Framework

HEIDL: Learning Linguistic Expressions with Deep Learning and Human-in-the-Loop


Title	HEIDL: Learning Linguistic Expressions with Deep Learning and Human-in-the-Loop
Authors	Yiwei Yang, Eser Kandogan, Yunyao Li, Walter S. Lasecki, Prithviraj Sen
Abstract	While the role of humans is increasingly recognized in the machine learning community, the representation of and interaction with models in current human-in-the-loop machine learning (HITL-ML) approaches are too low-level and far removed from humans’ conceptual models. We demonstrate HEIDL, a prototype HITL-ML system that exposes the machine-learned model through high-level, explainable linguistic expressions formed of predicates representing the semantic structure of text. In HEIDL, the human’s role is elevated from simply evaluating model predictions to interpreting and even updating the model logic directly by enabling interaction with the rule predicates themselves. Raising the currency of interaction to such semantic levels calls for new interaction paradigms between humans and machines that result in improved productivity for the text analytics model development process. Moreover, by involving humans in the process, the human-machine co-created models generalize better to unseen data, as domain experts are able to instill their expertise by extrapolating from what has been learned by automated algorithms from a small amount of labelled data.
Tasks
Published	2019-07-25
URL	https://arxiv.org/abs/1907.11184v1
PDF	https://arxiv.org/pdf/1907.11184v1.pdf
PWC	https://paperswithcode.com/paper/heidl-learning-linguistic-expressions-with
Repo
Framework

A Dictionary-Based Generalization of Robust PCA Part I: Study of Theoretical Properties


Title	A Dictionary-Based Generalization of Robust PCA Part I: Study of Theoretical Properties
Authors	Sirisha Rambhatla, Xingguo Li, Jineng Ren, Jarvis Haupt
Abstract	We consider the decomposition of a data matrix assumed to be a superposition of a low-rank matrix and a component which is sparse in a known dictionary, using a convex demixing method. We consider two sparsity structures for the sparse factor of the dictionary sparse component, namely entry-wise and column-wise sparsity, and provide a unified analysis, encompassing both the undercomplete and the overcomplete dictionary cases, to show that the constituent matrices can be successfully recovered under some relatively mild conditions on incoherence, sparsity, and rank. We corroborate our theoretical results by presenting empirical evaluations in terms of phase transitions in rank and sparsity, in comparison to related techniques. Investigation of a specific application in hyperspectral imaging is included in an accompanying paper.
Tasks
Published	2019-02-21
URL	http://arxiv.org/abs/1902.08304v1
PDF	http://arxiv.org/pdf/1902.08304v1.pdf
PWC	https://paperswithcode.com/paper/a-dictionary-based-generalization-of-robust-2
Repo
Framework

Context-Aware Emotion Recognition Networks


Title	Context-Aware Emotion Recognition Networks
Authors	Jiyoung Lee, Seungryong Kim, Sunok Kim, Jungin Park, Kwanghoon Sohn
Abstract	Traditional techniques for emotion recognition have focused on facial expression analysis only, thus providing limited ability to encode context that comprehensively represents the emotional responses. We present deep networks for context-aware emotion recognition, called CAER-Net, that exploit not only human facial expression but also context information in a joint and boosting manner. The key idea is to hide human faces in a visual scene and seek other contexts based on an attention mechanism. Our networks consist of two sub-networks: two-stream encoding networks that separately extract the features of face and context regions, and adaptive fusion networks that fuse such features in an adaptive fashion. We also introduce a novel benchmark for context-aware emotion recognition, called CAER, that is more appropriate than existing benchmarks both qualitatively and quantitatively. On several benchmarks, CAER-Net demonstrates the effect of context for emotion recognition. Our dataset is available at http://caer-dataset.github.io.
Tasks	Emotion Recognition
Published	2019-08-16
URL	https://arxiv.org/abs/1908.05913v1
PDF	https://arxiv.org/pdf/1908.05913v1.pdf
PWC	https://paperswithcode.com/paper/context-aware-emotion-recognition-networks
Repo
Framework

EmBench: Quantifying Performance Variations of Deep Neural Networks across Modern Commodity Devices


Title	EmBench: Quantifying Performance Variations of Deep Neural Networks across Modern Commodity Devices
Authors	Mario Almeida, Stefanos Laskaridis, Ilias Leontiadis, Stylianos I. Venieris, Nicholas D. Lane
Abstract	In recent years, advances in deep learning have resulted in unprecedented leaps in diverse tasks spanning from speech and object recognition to context awareness and health monitoring. As a result, an increasing number of AI-enabled applications are being developed targeting ubiquitous and mobile devices. While deep neural networks (DNNs) are getting bigger and more complex, they also impose a heavy computational and energy burden on the host devices, which has led to the integration of various specialized processors in commodity devices. Given the broad range of competing DNN architectures and the heterogeneity of the target hardware, there is an emerging need to understand the compatibility between DNN-platform pairs and the expected performance benefits on each platform. This work attempts to demystify this landscape by systematically evaluating a collection of state-of-the-art DNNs on a wide variety of commodity devices. In this respect, we identify potential bottlenecks in each architecture and provide important guidelines that can assist the community in the co-design of more efficient DNNs and accelerators.
Tasks	Object Recognition
Published	2019-05-17
URL	https://arxiv.org/abs/1905.07346v1
PDF	https://arxiv.org/pdf/1905.07346v1.pdf
PWC	https://paperswithcode.com/paper/embench-quantifying-performance-variations-of
Repo
Framework

Intentional Attention Mask Transformation for Robust CNN Classification


Title	Intentional Attention Mask Transformation for Robust CNN Classification
Authors	Masanari Kimura, Masayuki Tanaka
Abstract	Convolutional Neural Networks have achieved impressive results in various tasks, but interpreting their internal mechanism is a challenging problem. To tackle this problem, we exploit a multi-channel attention mechanism in feature space. Our network architecture allows us to obtain an attention mask for each feature, while existing CNN visualization methods provide only a common attention mask for all features. We apply the proposed multi-channel attention mechanism to a multi-attribute recognition task. We can obtain a different attention mask for each feature and for each attribute. Those analyses give us deeper insight into the feature space of CNNs. Furthermore, our proposed attention mechanism naturally leads to a method for improving the robustness of CNNs. From the observation of feature space based on the proposed attention mask, we demonstrate that we can obtain robust CNNs by intentionally emphasizing features that are important for attributes. The experimental results on the benchmark dataset show that the proposed method gives high human interpretability while accurately grasping the attributes of the data, and improves network robustness.
Tasks
Published	2019-05-07
URL	https://arxiv.org/abs/1905.02719v2
PDF	https://arxiv.org/pdf/1905.02719v2.pdf
PWC	https://paperswithcode.com/paper/interpretation-of-feature-space-using-multi-1
Repo
Framework

Auto-encoding a Knowledge Graph Using a Deep Belief Network: A Random Fields Perspective


Title	Auto-encoding a Knowledge Graph Using a Deep Belief Network: A Random Fields Perspective
Authors	Robert A. Murphy
Abstract	We start with a knowledge graph of connected entities and descriptive properties of those entities, from which a hierarchical representation of the knowledge graph is derived. Using a graphical, energy-based neural network, we are able to show that the structure of the hierarchy can be internally captured by the neural network, which allows for efficient output of the underlying equilibrium distribution from which the data are drawn.
Tasks
Published	2019-11-14
URL	https://arxiv.org/abs/1911.06322v4
PDF	https://arxiv.org/pdf/1911.06322v4.pdf
PWC	https://paperswithcode.com/paper/auto-encoding-a-knowledge-graph-using-a-deep
Repo
Framework

Location Field Descriptors: Single Image 3D Model Retrieval in the Wild


Title	Location Field Descriptors: Single Image 3D Model Retrieval in the Wild
Authors	Alexander Grabner, Peter M. Roth, Vincent Lepetit
Abstract	We present Location Field Descriptors, a novel approach for single image 3D model retrieval in the wild. In contrast to previous methods that directly map 3D models and RGB images to an embedding space, we establish a common low-level representation in the form of location fields from which we compute pose invariant 3D shape descriptors. Location fields encode correspondences between 2D pixels and 3D surface coordinates and, thus, explicitly capture 3D shape and 3D pose information without appearance variations which are irrelevant for the task. This early fusion of 3D models and RGB images results in three main advantages: First, the bottleneck location field prediction acts as a regularizer during training. Second, major parts of the system benefit from training on a virtually infinite amount of synthetic data. Finally, the predicted location fields are visually interpretable and unblackbox the system. We evaluate our proposed approach on three challenging real-world datasets (Pix3D, Comp, and Stanford) with different object categories and significantly outperform the state-of-the-art by up to 20% absolute in multiple 3D retrieval metrics.
Tasks
Published	2019-08-07
URL	https://arxiv.org/abs/1908.02853v1
PDF	https://arxiv.org/pdf/1908.02853v1.pdf
PWC	https://paperswithcode.com/paper/location-field-descriptors-single-image-3d
Repo
Framework

Seeker based Adaptive Guidance via Reinforcement Meta-Learning Applied to Asteroid Close Proximity Operations


Title	Seeker based Adaptive Guidance via Reinforcement Meta-Learning Applied to Asteroid Close Proximity Operations
Authors	Brian Gaudet, Richard Linares, Roberto Furfaro
Abstract	Current practice for asteroid close proximity maneuvers requires extremely accurate characterization of the environmental dynamics and precise spacecraft positioning prior to the maneuver. This creates a delay of several months between the spacecraft’s arrival and the ability to safely complete close proximity maneuvers. In this work we develop an adaptive integrated guidance, navigation, and control system that can complete these maneuvers in environments with unknown dynamics, with initial conditions spanning a large deployment region, and without a shape model of the asteroid. The system is implemented as a policy optimized using reinforcement meta-learning. The spacecraft is equipped with an optical seeker that locks to either a terrain feature, back-scattered light from a targeting laser, or an active beacon, and the policy maps observations consisting of seeker angles and LIDAR range readings directly to engine thrust commands. The policy implements a recurrent network layer that allows the deployed policy to adapt in real time to both environmental forces acting on the agent and internal disturbances such as actuator failure and center of mass variation. We validate the guidance system through simulated landing maneuvers in a six degrees-of-freedom simulator. The simulator randomizes the asteroid’s characteristics such as solar radiation pressure, density, spin rate, and nutation angle, requiring the guidance and control system to adapt to the environment. We also demonstrate robustness to actuator failure, sensor bias, and changes in the spacecraft’s center of mass and inertia tensor. Finally, we suggest a concept of operations for asteroid close proximity maneuvers that is compatible with the guidance system.
Tasks	Meta-Learning
Published	2019-07-13
URL	https://arxiv.org/abs/1907.06098v1
PDF	https://arxiv.org/pdf/1907.06098v1.pdf
PWC	https://paperswithcode.com/paper/seeker-based-adaptive-guidance-via
Repo
Framework

Transformer Dissection: A Unified Understanding of Transformer’s Attention via the Lens of Kernel


Title	Transformer Dissection: A Unified Understanding of Transformer’s Attention via the Lens of Kernel
Authors	Yao-Hung Hubert Tsai, Shaojie Bai, Makoto Yamada, Louis-Philippe Morency, Ruslan Salakhutdinov
Abstract	Transformer is a powerful architecture that achieves superior performance on various sequence learning tasks, including neural machine translation, language understanding, and sequence prediction. At the core of the Transformer is the attention mechanism, which concurrently processes all inputs in the streams. In this paper, we present a new formulation of attention via the lens of the kernel. To be more precise, we realize that attention can be seen as applying a kernel smoother over the inputs, with the kernel scores being the similarities between inputs. This new formulation gives us a better way to understand individual components of the Transformer’s attention, such as a better way to integrate the positional embedding. Another important advantage of our kernel-based formulation is that it paves the way to a larger space of composing Transformer’s attention. As an example, we propose a new variant of Transformer’s attention which models the input as a product of symmetric kernels. This approach achieves competitive performance to the current state-of-the-art model with less computation. In our experiments, we empirically study different kernel construction strategies on two widely used tasks: neural machine translation and sequence prediction.
Tasks	Machine Translation
Published	2019-08-30
URL	https://arxiv.org/abs/1908.11775v4
PDF	https://arxiv.org/pdf/1908.11775v4.pdf
PWC	https://paperswithcode.com/paper/transformer-dissection-an-unified
Repo
Framework
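
The kernel-smoother view of attention mentioned in the abstract above can be made concrete with a short sketch. It assumes the exponential scaled dot-product kernel, which reproduces standard softmax attention; the paper's proposed kernel variants are not shown, and the function name `kernel_smoother_attention` is hypothetical.

```python
import math


def kernel_smoother_attention(queries, keys, values):
    """For each query, return a kernel-weighted average of the values.

    Assumed kernel: k(q, key) = exp(<q, key> / sqrt(d)), i.e. softmax attention.
    """
    d = len(keys[0])
    outputs = []
    for q in queries:
        logits = [sum(a * b for a, b in zip(q, key)) / math.sqrt(d) for key in keys]
        # Subtract the max logit for numerical stability; weights are unchanged.
        m = max(logits)
        scores = [math.exp(v - m) for v in logits]
        total = sum(scores)
        weights = [s / total for s in scores]
        # Smoother output: sum_j weight_j * value_j, per output dimension.
        outputs.append(
            [sum(w * v[i] for w, v in zip(weights, values)) for i in range(len(values[0]))]
        )
    return outputs


# Identical keys give equal kernel scores, so the output is the plain mean.
print(kernel_smoother_attention([[1.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]], [[1.0], [3.0]]))
```

Swapping the exponential dot-product kernel for another similarity function changes the weights but leaves the smoothing structure (normalized weighted average of values) intact, which is the design space the abstract alludes to.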