Paper Group ANR 138
Online Mixed-Integer Optimization in Milliseconds
Title | Online Mixed-Integer Optimization in Milliseconds |
Authors | Dimitris Bertsimas, Bartolomeo Stellato |
Abstract | We propose a method to solve online mixed-integer optimization (MIO) problems at very high speed using machine learning. By exploiting the repetitive nature of online optimization, we are able to greatly speed up the solution time. Our approach encodes the optimal solution into a small amount of information, denoted as the strategy, using the Voice of Optimization framework proposed in [BS18]. In this way, the core part of the optimization algorithm becomes a multiclass classification problem which can be solved very quickly. In this work we extend that framework to real-time and high-speed applications, focusing on parametric mixed-integer quadratic optimization (MIQO). We propose an extremely fast online optimization algorithm consisting of a feedforward neural network (NN) evaluation and a linear system solution where the matrix has already been factorized. Therefore, this online approach does not require any solver or iterative algorithm. We show the speed of the proposed method both in terms of total computations required and measured execution time. We estimate the number of floating point operations (flops) required to completely recover the optimal solution as a function of the problem dimensions. Compared to state-of-the-art MIO routines, the online running time of our method is very predictable and can be lower than a single matrix factorization time. We benchmark our method against the state-of-the-art solver Gurobi, obtaining speedups of two to three orders of magnitude on benchmarks with real-world data. (A short illustrative sketch of the online phase follows this entry.) |
Tasks | |
Published | 2019-07-04 |
URL | https://arxiv.org/abs/1907.02206v1 |
https://arxiv.org/pdf/1907.02206v1.pdf | |
PWC | https://paperswithcode.com/paper/online-mixed-integer-optimization-in |
Repo | |
Framework | |
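The abstract describes the online phase as a neural-network strategy classification followed by a linear system solve with an already factorized matrix. The sketch below illustrates that pipeline in broad strokes only; the class `OnlineMIQO`, its constructor arguments, and the KKT assembly are hypothetical and not the authors' implementation.

```python
# Minimal sketch of the online phase described in the abstract (hypothetical names;
# the strategy classifier, KKT assembly, and factorization caching are assumptions).
import numpy as np
from scipy.linalg import lu_factor, lu_solve

class OnlineMIQO:
    def __init__(self, strategy_classifier, kkt_matrices, kkt_rhs_builders):
        # One prefactorized KKT matrix per strategy (tight constraints + integer values).
        self.clf = strategy_classifier
        self.factors = {s: lu_factor(K) for s, K in kkt_matrices.items()}
        self.rhs = kkt_rhs_builders  # callable: parameters theta -> right-hand side

    def solve(self, theta):
        s = self.clf(theta)                  # NN forward pass: pick a strategy
        b = self.rhs[s](theta)               # assemble RHS from the parameters
        z = lu_solve(self.factors[s], b)     # back-substitution only, no new factorization
        return s, z                          # strategy and primal/dual solution
```

Because the expensive factorizations are computed offline, the online cost reduces to one NN forward pass plus one back-substitution, which is what makes millisecond-scale solution times plausible.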
Intrinsic Weight Learning Approach for Multi-view Clustering
Title | Intrinsic Weight Learning Approach for Multi-view Clustering |
Authors | Feiping Nie, Jing Li, Xuelong Li |
Abstract | Exploiting different representations, or views, of the same object for better clustering, conventionally called multi-view clustering, has become very popular. Generally, it is essential to measure the importance of each individual view, due to noise or inherent differences in descriptive capacity. Many previous works model view importance as a weight, which is simple but empirically effective. In this paper, instead of following this traditional line of thought, we propose a new weight learning paradigm for multi-view clustering based on the idea of the re-weighted approach, and we theoretically analyze its working mechanism. As a carefully worked example, all of the views are connected by exploring a unified Laplacian rank-constrained graph, which serves as a representative method for comparison with other weight learning approaches in the experiments. Furthermore, the proposed weight learning strategy is well suited to multi-view data and can be naturally integrated with many existing clustering learners. According to the numerical experiments, the proposed intrinsic weight learning approach proves effective and practical for multi-view clustering. (A short illustrative sketch of the re-weighting loop follows this entry.) |
Tasks | |
Published | 2019-06-21 |
URL | https://arxiv.org/abs/1906.08905v1 |
https://arxiv.org/pdf/1906.08905v1.pdf | |
PWC | https://paperswithcode.com/paper/intrinsic-weight-learning-approach-for-multi |
Repo | |
Framework | |
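The abstract attributes the method to the idea of the re-weighted approach. The sketch below shows one common auto-weighting pattern from that family, alternating between clustering with fixed view weights and re-weighting views by their current losses; the update rule, `fit_clustering`, and `view_losses` are placeholders, not the paper's exact algorithm.

```python
# Illustrative re-weighted update for view weights (a common auto-weighting scheme;
# the exact update in the paper may differ, and `fit_clustering` is a placeholder).
import numpy as np

def reweight_views(view_losses, eps=1e-12):
    """Each view's weight is inversely related to its current loss."""
    w = 1.0 / (2.0 * np.sqrt(np.asarray(view_losses) + eps))
    return w / w.sum()  # normalize so the weights sum to one

def alternate(views, fit_clustering, n_iter=20):
    w = np.full(len(views), 1.0 / len(views))
    for _ in range(n_iter):
        model, losses = fit_clustering(views, w)   # solve clustering with fixed weights
        w = reweight_views(losses)                 # re-weight views with the model fixed
    return model, w
```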
Derivation of the Variational Bayes Equations
Title | Derivation of the Variational Bayes Equations |
Authors | Alianna J. Maren |
Abstract | The derivation of key equations for the variational Bayes approach is well known in certain circles. However, translating the fundamental derivations (e.g., as found in Beal (2003)) to the notation of Friston (2013, 2015) is somewhat delicate. Further, the notion of using variational Bayes in the context of a system with Markov blankets requires special attention. This Technical Report presents the derivation in detail. It further illustrates how the variational Bayes method provides a framework for a new computational engine, incorporating the 2-D cluster variation method (CVM), which provides a free energy equation that can be minimized across both the external and representational systems’ states. (The core decomposition is restated after this entry.) |
Tasks | |
Published | 2019-06-20 |
URL | https://arxiv.org/abs/1906.08804v4 |
https://arxiv.org/pdf/1906.08804v4.pdf | |
PWC | https://paperswithcode.com/paper/derivation-of-the-variational-bayes-equations |
Repo | |
Framework | |
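For reference, the standard variational Bayes decomposition that such derivations build on can be written as below; the notation is generic and may not match the Friston conventions used in the report.

```latex
% Standard variational Bayes decomposition (q(z) is the variational density over
% hidden states z, x the data; notation here is generic, not the report's):
\ln p(x) = \underbrace{\mathbb{E}_{q(z)}\!\left[\ln p(x, z) - \ln q(z)\right]}_{\text{variational free energy } \mathcal{F}(q)}
         + \operatorname{KL}\!\left(q(z)\,\|\,p(z \mid x)\right),
\qquad \operatorname{KL} \ge 0 \;\Rightarrow\; \ln p(x) \ge \mathcal{F}(q).
```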
Intra- and Inter-epoch Temporal Context Network (IITNet) for Automatic Sleep Stage Scoring
Title | Intra- and Inter-epoch Temporal Context Network (IITNet) for Automatic Sleep Stage Scoring |
Authors | Seunghyeok Back, Seongju Lee, Hogeon Seo, Deokhwan Park, Tae Kim, Kyoobin Lee |
Abstract | This study proposes a novel deep learning model, called IITNet, to learn intra- and inter-epoch temporal contexts from a raw single-channel electroencephalogram (EEG) for automatic sleep stage scoring. When sleep experts identify the sleep stage of 30 seconds of polysomnography (PSG) data, called an epoch, they investigate sleep-related events such as sleep spindles, K-complexes, and frequency components in local segments of the epoch (sub-epochs) and consider the relations between sleep-related events of successive epochs to follow the transition rules. Inspired by this, IITNet learns to encode each sub-epoch into a representative feature via a deep residual network and then captures contextual information in the sequence of representative features via a BiLSTM. Thus, IITNet can extract features at the sub-epoch level and consider temporal context not only between epochs but also within an epoch. IITNet is an end-to-end architecture and does not need any preprocessing, handcrafted feature design, balanced sampling, pre-training, or fine-tuning. Our model was trained and evaluated on the Sleep-EDF and MASS datasets and outperformed other state-of-the-art results on both datasets, with an overall accuracy (ACC) of 84.0% and 86.6%, macro F1-score (MF1) of 77.7 and 80.8, and Cohen’s kappa of 0.78 and 0.80 on Sleep-EDF and MASS, respectively. (A simplified architectural sketch follows this entry.) |
Tasks | EEG, Sleep Stage Detection |
Published | 2019-02-18 |
URL | http://arxiv.org/abs/1902.06562v1 |
http://arxiv.org/pdf/1902.06562v1.pdf | |
PWC | https://paperswithcode.com/paper/intra-and-inter-epoch-temporal-context |
Repo | |
Framework | |
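A minimal sketch of the intra-/inter-epoch structure described above: a shared encoder maps each raw sub-epoch to a representative feature, and a BiLSTM aggregates the resulting sequence. The convolutional encoder here is a simplified stand-in for the paper's deep residual network, and the layer sizes and five-class output are assumptions.

```python
# Simplified intra-/inter-epoch model (not the paper's exact IITNet architecture).
import torch
import torch.nn as nn

class IITNetSketch(nn.Module):
    def __init__(self, n_classes=5, feat_dim=128, hidden=64):
        super().__init__()
        # Intra-epoch: encode each raw EEG sub-epoch into a representative feature.
        self.encoder = nn.Sequential(
            nn.Conv1d(1, 32, kernel_size=7, stride=2, padding=3), nn.ReLU(),
            nn.Conv1d(32, feat_dim, kernel_size=7, stride=2, padding=3), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        # Inter-epoch: capture temporal context over the sequence of features.
        self.bilstm = nn.LSTM(feat_dim, hidden, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):                      # x: (batch, n_subepochs, samples)
        b, s, t = x.shape
        f = self.encoder(x.reshape(b * s, 1, t)).squeeze(-1)   # (b*s, feat_dim)
        f = f.reshape(b, s, -1)                                 # sequence of sub-epoch features
        out, _ = self.bilstm(f)
        return self.head(out[:, -1])           # score the sleep stage of the epoch
```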
Memorized Sparse Backpropagation
Title | Memorized Sparse Backpropagation |
Authors | Zhiyuan Zhang, Pengcheng Yang, Xuancheng Ren, Xu Sun |
Abstract | Neural network learning is typically slow since backpropagation needs to compute full gradients and backpropagate them across multiple layers. Despite the success of existing work in accelerating propagation through sparseness, the relevant theoretical characteristics remain unexplored, and we empirically find that such methods suffer from the loss of information contained in unpropagated gradients. To tackle these problems, in this work we present a unified sparse backpropagation framework and provide a detailed analysis of its theoretical characteristics. The analysis reveals that, when applied to a multilayer perceptron, our framework essentially performs gradient descent using an estimated gradient similar enough to the true gradient, resulting in convergence in probability under certain conditions. Furthermore, a simple yet effective algorithm named memorized sparse backpropagation (MSBP) is proposed to remedy the problem of information loss by storing unpropagated gradients in memory for the next learning step. The experiments demonstrate that the proposed MSBP effectively alleviates the information loss of traditional sparse backpropagation while achieving comparable acceleration. (A short sketch of the memorization idea follows this entry.) |
Tasks | |
Published | 2019-05-24 |
URL | https://arxiv.org/abs/1905.10194v2 |
https://arxiv.org/pdf/1905.10194v2.pdf | |
PWC | https://paperswithcode.com/paper/memorized-sparse-backpropagation |
Repo | |
Framework | |
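A sketch of the memorization idea: propagate only the largest gradient entries and keep the remainder in memory so it is added back at the next step. The top-k selection and the `MemorizedSparseBackprop` interface are assumptions about how the described bookkeeping might look, not the paper's exact algorithm.

```python
# Illustrative memorized sparse backprop for one layer's gradient vector
# (hypothetical interface; the paper's exact bookkeeping may differ).
import numpy as np

class MemorizedSparseBackprop:
    def __init__(self, dim, k):
        self.k = k
        self.memory = np.zeros(dim)          # unpropagated gradient mass

    def sparsify(self, grad):
        g = grad + self.memory               # recover information dropped earlier
        idx = np.argpartition(np.abs(g), -self.k)[-self.k:]
        sparse = np.zeros_like(g)
        sparse[idx] = g[idx]                 # propagate only the k largest entries
        self.memory = g - sparse             # remember what was not propagated
        return sparse
```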
Automatic Synthesis of Totally Self-Checking Circuits
Title | Automatic Synthesis of Totally Self-Checking Circuits |
Authors | Michael Garvie, Phil Husbands |
Abstract | Totally self-checking (TSC) circuits are synthesised with a grid of computers running a distributed, population-based stochastic optimisation algorithm. The presented method is the first to automatically synthesise TSC circuits from arbitrary logic, as all previous methods fail to guarantee that the checker is self-testing (ST) for circuits with limited output codespaces. The circuits synthesised by the presented method have significantly lower overhead than the previously reported best for every one of a set of 11 frequently used benchmarks. Average overhead across the entire set is 23% of duplication-and-comparison overhead, compared with an average of 69% for the previous best reported values across the set. The methodology presented represents a breakthrough in concurrent error detection (CED). The highly efficient, novel designs produced are tailored to each circuit’s function, rather than being constrained by a particular modular CED design methodology. Results are synthesised using two-input gates and are TSC with respect to all gate input and output stuck-at faults. The method can be used to add CED with or without modifications to the original logic, and can be generalised to any implementation technology and fault model. An example circuit is analysed and rigorously proven to be TSC. |
Tasks | |
Published | 2019-01-21 |
URL | http://arxiv.org/abs/1901.07023v1 |
http://arxiv.org/pdf/1901.07023v1.pdf | |
PWC | https://paperswithcode.com/paper/automatic-synthesis-of-totally-self-checking |
Repo | |
Framework | |
HEIDL: Learning Linguistic Expressions with Deep Learning and Human-in-the-Loop
Title | HEIDL: Learning Linguistic Expressions with Deep Learning and Human-in-the-Loop |
Authors | Yiwei Yang, Eser Kandogan, Yunyao Li, Walter S. Lasecki, Prithviraj Sen |
Abstract | While the role of humans is increasingly recognized in the machine learning community, the representation of and interaction with models in current human-in-the-loop machine learning (HITL-ML) approaches are too low-level and far removed from humans’ conceptual models. We demonstrate HEIDL, a prototype HITL-ML system that exposes the machine-learned model through high-level, explainable linguistic expressions formed of predicates representing the semantic structure of text. In HEIDL, the human’s role is elevated from simply evaluating model predictions to interpreting and even updating the model logic directly by enabling interaction with the rule predicates themselves. Raising the currency of interaction to such semantic levels calls for new interaction paradigms between humans and machines that result in improved productivity for the text analytics model development process. Moreover, by involving humans in the process, the human-machine co-created models generalize better to unseen data, as domain experts are able to instill their expertise by extrapolating from what has been learned by automated algorithms from a small amount of labelled data. |
Tasks | |
Published | 2019-07-25 |
URL | https://arxiv.org/abs/1907.11184v1 |
https://arxiv.org/pdf/1907.11184v1.pdf | |
PWC | https://paperswithcode.com/paper/heidl-learning-linguistic-expressions-with |
Repo | |
Framework | |
A Dictionary-Based Generalization of Robust PCA Part I: Study of Theoretical Properties
Title | A Dictionary-Based Generalization of Robust PCA Part I: Study of Theoretical Properties |
Authors | Sirisha Rambhatla, Xingguo Li, Jineng Ren, Jarvis Haupt |
Abstract | We consider the decomposition of a data matrix assumed to be a superposition of a low-rank matrix and a component which is sparse in a known dictionary, using a convex demixing method. We consider two sparsity structures for the sparse factor of the dictionary-sparse component, namely entry-wise and column-wise sparsity, and provide a unified analysis, encompassing both the undercomplete and overcomplete dictionary cases, to show that the constituent matrices can be successfully recovered under some relatively mild conditions on incoherence, sparsity, and rank. We corroborate our theoretical results by presenting empirical evaluations in terms of phase transitions in rank and sparsity, in comparison to related techniques. Investigation of a specific application in hyperspectral imaging is included in an accompanying paper. (A plausible form of the demixing program is sketched after this entry.) |
Tasks | |
Published | 2019-02-21 |
URL | http://arxiv.org/abs/1902.08304v1 |
http://arxiv.org/pdf/1902.08304v1.pdf | |
PWC | https://paperswithcode.com/paper/a-dictionary-based-generalization-of-robust-2 |
Repo | |
Framework | |
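A plausible form of the convex demixing program described in the abstract, written for a data matrix M, known dictionary R, low-rank component L, and sparse coefficient matrix S; the specific regularizers and constraint form are assumptions rather than the paper's exact program.

```latex
% Illustrative dictionary-based robust PCA program (notation is an assumption):
\min_{L,\,S}\; \|L\|_{*} + \lambda \, \|S\|_{1}
\quad \text{s.t.} \quad M = L + R S ,
\qquad \text{with } \|S\|_{1} \text{ replaced by } \|S\|_{1,2} = \sum_{j} \|S_{:,j}\|_{2}
\text{ for the column-wise sparsity case.}
```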
Context-Aware Emotion Recognition Networks
Title | Context-Aware Emotion Recognition Networks |
Authors | Jiyoung Lee, Seungryong Kim, Sunok Kim, Jungin Park, Kwanghoon Sohn |
Abstract | Traditional techniques for emotion recognition have focused on facial expression analysis alone, thus providing limited ability to encode context that comprehensively represents emotional responses. We present deep networks for context-aware emotion recognition, called CAER-Net, that exploit not only the human facial expression but also context information in a joint and boosting manner. The key idea is to hide human faces in a visual scene and seek other contexts based on an attention mechanism. Our networks consist of two sub-networks: two-stream encoding networks to separately extract the features of face and context regions, and adaptive fusion networks to fuse such features in an adaptive fashion. We also introduce a novel benchmark for context-aware emotion recognition, called CAER, that is more appropriate than existing benchmarks both qualitatively and quantitatively. On several benchmarks, CAER-Net demonstrates the effect of context for emotion recognition. Our dataset is available at http://caer-dataset.github.io. (A two-stream sketch follows this entry.) |
Tasks | Emotion Recognition |
Published | 2019-08-16 |
URL | https://arxiv.org/abs/1908.05913v1 |
https://arxiv.org/pdf/1908.05913v1.pdf | |
PWC | https://paperswithcode.com/paper/context-aware-emotion-recognition-networks |
Repo | |
Framework | |
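A two-stream sketch of the context-aware idea: one encoder sees the face crop, the other sees the scene with the face region hidden, and a small gate fuses the two features adaptively. The tiny CNNs, the softmax gate, and the seven-class output are assumptions, not the CAER-Net architecture itself.

```python
# Simplified two-stream emotion model with face hiding and adaptive fusion
# (an illustration of the idea, not the CAER-Net design).
import torch
import torch.nn as nn

def small_cnn(out_dim=128):
    return nn.Sequential(
        nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
        nn.Conv2d(32, out_dim, 3, stride=2, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    )

class CAERSketch(nn.Module):
    def __init__(self, n_emotions=7, d=128):
        super().__init__()
        self.face_stream = small_cnn(d)
        self.context_stream = small_cnn(d)
        self.gate = nn.Sequential(nn.Linear(2 * d, 2), nn.Softmax(dim=-1))
        self.head = nn.Linear(d, n_emotions)

    def forward(self, face, scene, face_mask):
        ctx = scene * (1.0 - face_mask)          # hide the face in the visual scene
        f, c = self.face_stream(face), self.context_stream(ctx)
        w = self.gate(torch.cat([f, c], dim=-1)) # adaptive fusion weights
        fused = w[:, :1] * f + w[:, 1:] * c
        return self.head(fused)
```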
EmBench: Quantifying Performance Variations of Deep Neural Networks across Modern Commodity Devices
Title | EmBench: Quantifying Performance Variations of Deep Neural Networks across Modern Commodity Devices |
Authors | Mario Almeida, Stefanos Laskaridis, Ilias Leontiadis, Stylianos I. Venieris, Nicholas D. Lane |
Abstract | In recent years, advances in deep learning have resulted in unprecedented leaps in diverse tasks spanning from speech and object recognition to context awareness and health monitoring. As a result, an increasing number of AI-enabled applications are being developed targeting ubiquitous and mobile devices. While deep neural networks (DNNs) are getting bigger and more complex, they also impose a heavy computational and energy burden on the host devices, which has led to the integration of various specialized processors in commodity devices. Given the broad range of competing DNN architectures and the heterogeneity of the target hardware, there is an emerging need to understand the compatibility between DNN-platform pairs and the expected performance benefits on each platform. This work attempts to demystify this landscape by systematically evaluating a collection of state-of-the-art DNNs on a wide variety of commodity devices. In this respect, we identify potential bottlenecks in each architecture and provide important guidelines that can assist the community in the co-design of more efficient DNNs and accelerators. |
Tasks | Object Recognition |
Published | 2019-05-17 |
URL | https://arxiv.org/abs/1905.07346v1 |
https://arxiv.org/pdf/1905.07346v1.pdf | |
PWC | https://paperswithcode.com/paper/embench-quantifying-performance-variations-of |
Repo | |
Framework | |
Intentional Attention Mask Transformation for Robust CNN Classification
Title | Intentional Attention Mask Transformation for Robust CNN Classification |
Authors | Masanari Kimura, Masayuki Tanaka |
Abstract | Convolutional Neural Networks have achieved impressive results in various tasks, but interpreting their internal mechanism is a challenging problem. To tackle this problem, we exploit a multi-channel attention mechanism in feature space. Our network architecture allows us to obtain an attention mask for each feature, while existing CNN visualization methods provide only a common attention mask for all features. We apply the proposed multi-channel attention mechanism to a multi-attribute recognition task. We can obtain a different attention mask for each feature and for each attribute. These analyses give us deeper insight into the feature space of CNNs. Furthermore, our proposed attention mechanism naturally yields a method for improving the robustness of CNNs. From the observation of feature space based on the proposed attention mask, we demonstrate that we can obtain robust CNNs by intentionally emphasizing features that are important for the attributes. The experimental results on the benchmark dataset show that the proposed method offers high human interpretability while accurately grasping the attributes of the data, and improves network robustness. (A per-channel masking sketch follows this entry.) |
Tasks | |
Published | 2019-05-07 |
URL | https://arxiv.org/abs/1905.02719v2 |
https://arxiv.org/pdf/1905.02719v2.pdf | |
PWC | https://paperswithcode.com/paper/interpretation-of-feature-space-using-multi-1 |
Repo | |
Framework | |
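A minimal sketch of a per-feature (multi-channel) attention mask: a depthwise convolution produces one spatial mask per channel, so each feature gets its own mask rather than a shared one. The mask generator and the absence of any training objective here are simplifications, not the paper's design.

```python
# Per-channel attention masks applied feature-wise (illustrative only).
import torch
import torch.nn as nn

class ChannelwiseAttention(nn.Module):
    def __init__(self, channels):
        super().__init__()
        # Depthwise conv produces an independent spatial mask for every feature channel.
        self.mask_conv = nn.Conv2d(channels, channels, kernel_size=3,
                                   padding=1, groups=channels)

    def forward(self, feats):                  # feats: (batch, channels, H, W)
        masks = torch.sigmoid(self.mask_conv(feats))
        return feats * masks, masks            # attended features and the masks themselves
```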
Auto-encoding a Knowledge Graph Using a Deep Belief Network: A Random Fields Perspective
Title | Auto-encoding a Knowledge Graph Using a Deep Belief Network: A Random Fields Perspective |
Authors | Robert A. Murphy |
Abstract | We start with a knowledge graph of connected entities and descriptive properties of those entities, from which a hierarchical representation of the knowledge graph is derived. Using a graphical, energy-based neural network, we show that the structure of the hierarchy can be internally captured by the neural network, which allows for efficient output of the underlying equilibrium distribution from which the data are drawn. |
Tasks | |
Published | 2019-11-14 |
URL | https://arxiv.org/abs/1911.06322v4 |
https://arxiv.org/pdf/1911.06322v4.pdf | |
PWC | https://paperswithcode.com/paper/auto-encoding-a-knowledge-graph-using-a-deep |
Repo | |
Framework | |
Location Field Descriptors: Single Image 3D Model Retrieval in the Wild
Title | Location Field Descriptors: Single Image 3D Model Retrieval in the Wild |
Authors | Alexander Grabner, Peter M. Roth, Vincent Lepetit |
Abstract | We present Location Field Descriptors, a novel approach for single image 3D model retrieval in the wild. In contrast to previous methods that directly map 3D models and RGB images to an embedding space, we establish a common low-level representation in the form of location fields from which we compute pose invariant 3D shape descriptors. Location fields encode correspondences between 2D pixels and 3D surface coordinates and, thus, explicitly capture 3D shape and 3D pose information without appearance variations which are irrelevant for the task. This early fusion of 3D models and RGB images results in three main advantages: First, the bottleneck location field prediction acts as a regularizer during training. Second, major parts of the system benefit from training on a virtually infinite amount of synthetic data. Finally, the predicted location fields are visually interpretable and unblackbox the system. We evaluate our proposed approach on three challenging real-world datasets (Pix3D, Comp, and Stanford) with different object categories and significantly outperform the state-of-the-art by up to 20% absolute in multiple 3D retrieval metrics. |
Tasks | |
Published | 2019-08-07 |
URL | https://arxiv.org/abs/1908.02853v1 |
https://arxiv.org/pdf/1908.02853v1.pdf | |
PWC | https://paperswithcode.com/paper/location-field-descriptors-single-image-3d |
Repo | |
Framework | |
Seeker based Adaptive Guidance via Reinforcement Meta-Learning Applied to Asteroid Close Proximity Operations
Title | Seeker based Adaptive Guidance via Reinforcement Meta-Learning Applied to Asteroid Close Proximity Operations |
Authors | Brian Gaudet, Richard Linares, Roberto Furfaro |
Abstract | Current practice for asteroid close proximity maneuvers requires extremely accurate characterization of the environmental dynamics and precise spacecraft positioning prior to the maneuver. This creates a delay of several months between the spacecraft’s arrival and the ability to safely complete close proximity maneuvers. In this work we develop an adaptive integrated guidance, navigation, and control system that can complete these maneuvers in environments with unknown dynamics, with initial conditions spanning a large deployment region, and without a shape model of the asteroid. The system is implemented as a policy optimized using reinforcement meta-learning. The spacecraft is equipped with an optical seeker that locks to either a terrain feature, back-scattered light from a targeting laser, or an active beacon, and the policy maps observations consisting of seeker angles and LIDAR range readings directly to engine thrust commands. The policy implements a recurrent network layer that allows the deployed policy to adapt in real time to both environmental forces acting on the agent and internal disturbances such as actuator failure and center-of-mass variation. We validate the guidance system through simulated landing maneuvers in a six-degrees-of-freedom simulator. The simulator randomizes the asteroid’s characteristics, such as solar radiation pressure, density, spin rate, and nutation angle, requiring the guidance and control system to adapt to the environment. We also demonstrate robustness to actuator failure, sensor bias, and changes in the spacecraft’s center of mass and inertia tensor. Finally, we suggest a concept of operations for asteroid close proximity maneuvers that is compatible with the guidance system. (A recurrent policy sketch follows this entry.) |
Tasks | Meta-Learning |
Published | 2019-07-13 |
URL | https://arxiv.org/abs/1907.06098v1 |
https://arxiv.org/pdf/1907.06098v1.pdf | |
PWC | https://paperswithcode.com/paper/seeker-based-adaptive-guidance-via |
Repo | |
Framework | |
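A sketch of a recurrent guidance policy in the spirit described above: seeker angles and a LIDAR range reading go in, normalized thrust commands come out, and a GRU hidden state carries the online adaptation. The observation and action sizes, and the use of a GRU, are assumptions; the actual policy is obtained via reinforcement meta-learning, which is not reproduced here.

```python
# Illustrative recurrent policy for seeker-based guidance (sizes are assumptions).
import torch
import torch.nn as nn

class RecurrentGuidancePolicy(nn.Module):
    def __init__(self, obs_dim=3, act_dim=3, hidden=64):
        super().__init__()
        self.gru = nn.GRU(obs_dim, hidden, batch_first=True)
        self.mu = nn.Linear(hidden, act_dim)

    def forward(self, obs_seq, h=None):        # obs_seq: (batch, time, obs_dim)
        out, h = self.gru(obs_seq, h)          # hidden state adapts to the dynamics online
        thrust = torch.tanh(self.mu(out))      # normalized thrust commands in [-1, 1]
        return thrust, h
```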
Transformer Dissection: A Unified Understanding of Transformer’s Attention via the Lens of Kernel
Title | Transformer Dissection: A Unified Understanding of Transformer’s Attention via the Lens of Kernel |
Authors | Yao-Hung Hubert Tsai, Shaojie Bai, Makoto Yamada, Louis-Philippe Morency, Ruslan Salakhutdinov |
Abstract | The Transformer is a powerful architecture that achieves superior performance on various sequence learning tasks, including neural machine translation, language understanding, and sequence prediction. At the core of the Transformer is the attention mechanism, which concurrently processes all inputs in the streams. In this paper, we present a new formulation of attention via the lens of the kernel. More precisely, we show that attention can be seen as applying a kernel smoother over the inputs, with the kernel scores being the similarities between inputs. This new formulation gives us a better way to understand individual components of the Transformer’s attention, such as better ways to integrate the positional embedding. Another important advantage of our kernel-based formulation is that it paves the way to a larger space of composing Transformer’s attention. As an example, we propose a new variant of Transformer’s attention which models the input as a product of symmetric kernels. This approach achieves performance competitive with the current state-of-the-art model with less computation. In our experiments, we empirically study different kernel construction strategies on two widely used tasks: neural machine translation and sequence prediction. (The kernel-smoother view is sketched after this entry.) |
Tasks | Machine Translation |
Published | 2019-08-30 |
URL | https://arxiv.org/abs/1908.11775v4 |
https://arxiv.org/pdf/1908.11775v4.pdf | |
PWC | https://paperswithcode.com/paper/transformer-dissection-an-unified |
Repo | |
Framework | |
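The kernel-smoother view stated in the abstract can be written down directly: the output at each query is a kernel-weighted average of the values. With an exponential kernel on scaled dot products, this reduces to standard softmax attention; the function below is a generic illustration, not the paper's proposed symmetric-kernel variant.

```python
# Attention as a kernel smoother over the inputs (generic illustration).
import numpy as np

def kernel_smoother_attention(Q, K, V, kernel=None):
    """Q: (n_q, d), K: (n_k, d), V: (n_k, d_v)."""
    if kernel is None:
        # Exponential kernel on scaled dot products recovers softmax attention.
        kernel = lambda q, k: np.exp(q @ k.T / np.sqrt(q.shape[-1]))
    scores = kernel(Q, K)                       # pairwise similarities k(x_q, x_k)
    weights = scores / scores.sum(axis=1, keepdims=True)
    return weights @ V                          # kernel-weighted average of the values
```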