Paper Group ANR 1327
Viterbi Extraction tutorial with Hidden Markov Toolkit. Breaking the Spatio-Angular Trade-off for Light Field Super-Resolution via LSTM Modelling on Epipolar Plane Images. A Forest from the Trees: Generation through Neighborhoods. Selective prediction-set models with coverage guarantees. Escaping Saddle Points for Zeroth-order Nonconvex Optimizatio …
Viterbi Extraction tutorial with Hidden Markov Toolkit
Title | Viterbi Extraction tutorial with Hidden Markov Toolkit |
Authors | Zulkarnaen Hatala, Victor Puturuhu |
Abstract | An algorithm used to extract HMM parameters is revisited. Most parts of the extraction process are taken from implemented Hidden Markov Toolkit (HTK) program under name HInit. The algorithm itself shows a few variations compared to another domain of implementations. The HMM model is introduced briefly based on the theory of Discrete Time Markov Chain. We schematically outline the Viterbi method implemented in HTK. Iterative definition of the method which is ready to be implemented in computer programs is reviewed. We also illustrate the method calculation precisely using manual calculation and extensive graphical illustration. The distribution of observation probability used is simply independent Gaussians r.v.s. The purpose of the content is not to justify the performance or accuracy of the method applied in a specific area. This writing merely to describe how the algorithm is performed. The whole content should enlighten the audience the insight of the Viterbi Extraction method used by HTK. |
Tasks | |
Published | 2019-08-07 |
URL | https://arxiv.org/abs/1908.03143v1 |
https://arxiv.org/pdf/1908.03143v1.pdf | |
PWC | https://paperswithcode.com/paper/viterbi-extraction-tutorial-with-hidden |
Repo | |
Framework | |
Breaking the Spatio-Angular Trade-off for Light Field Super-Resolution via LSTM Modelling on Epipolar Plane Images
Title | Breaking the Spatio-Angular Trade-off for Light Field Super-Resolution via LSTM Modelling on Epipolar Plane Images |
Authors | Hao Zhu, Mantang Guo, Hongdong Li, Qing Wang, Antonio Robles-Kelly |
Abstract | Light-field cameras (LFC) have received increasing attention due to their wide-spread applications. However, current LFCs suffer from the well-known spatio-angular trade-off, which is considered as an inherent and fundamental limit for LFC designs. In this paper, by doing a detailed geometrical optical analysis of the sampling process in an LFC, we show that the effective sampling resolution is generally higher than the number of micro-lenses. This contribution makes it theoretically possible to break the resolution trade-off. Our second contribution is an epipolar plane image (EPI) based super-resolution method, which can super-resolve the spatial and angular dimensions simultaneously. We prove that the light field is a 2D series, thus, a specifically designed CNN-LSTM network is proposed to capture the continuity property of the EPI. Rather than leveraging semantic information, our network focuses on extracting geometric continuity in the EPI. This gives our method an improved generalization ability and makes it applicable to a wide range of previously unseen scenes. Experiments on both synthetic and real light fields demonstrate the improvements over state-of-the-art, especially in large disparity areas. |
Tasks | Super-Resolution |
Published | 2019-02-15 |
URL | http://arxiv.org/abs/1902.05672v1 |
http://arxiv.org/pdf/1902.05672v1.pdf | |
PWC | https://paperswithcode.com/paper/breaking-the-spatio-angular-trade-off-for |
Repo | |
Framework | |
A Forest from the Trees: Generation through Neighborhoods
Title | A Forest from the Trees: Generation through Neighborhoods |
Authors | Yang Li, Tianxiang Gao, Junier B. Oliva |
Abstract | In this work, we propose to learn a generative model using both learned features (through a latent space) and memories (through neighbors). Although human learning makes seamless use of both learned perceptual features and instance recall, current generative learning paradigms only make use of one of these two components. Take, for instance, flow models, which learn a latent space of invertible features that follow a simple distribution. Conversely, kernel density techniques use instances to shift a simple distribution into an aggregate mixture model. Here we propose multiple methods to enhance the latent space of a flow model with neighborhood information. Not only does our proposed framework represent a more human-like approach by leveraging both learned features and memories, but it may also be viewed as a step forward in non-parametric methods. The efficacy of our model is shown empirically with standard image datasets. We observe compelling results and a significant improvement over baselines. |
Tasks | |
Published | 2019-02-04 |
URL | https://arxiv.org/abs/1902.01435v2 |
https://arxiv.org/pdf/1902.01435v2.pdf | |
PWC | https://paperswithcode.com/paper/a-forest-from-the-trees-generation-through |
Repo | |
Framework | |
Selective prediction-set models with coverage guarantees
Title | Selective prediction-set models with coverage guarantees |
Authors | Jean Feng, Arjun Sondhi, Jessica Perry, Noah Simon |
Abstract | Though black-box predictors are state-of-the-art for many complex tasks, they often fail to properly quantify predictive uncertainty and may provide inappropriate predictions for unfamiliar data. Instead, we can learn more reliable models by letting them either output a prediction set or abstain when the uncertainty is high. We propose training these selective prediction-set models using an uncertainty-aware loss minimization framework, which unifies ideas from decision theory and robust maximum likelihood. Moreover, since black-box methods are not guaranteed to output well-calibrated prediction sets, we show how to calculate point estimates and confidence intervals for the true coverage of any selective prediction-set model, as well as a uniform mixture of K set models obtained from K-fold sample-splitting. When applied to predicting in-hospital mortality and length-of-stay for ICU patients, our model outperforms existing approaches on both in-sample and out-of-sample age groups, and our recalibration method provides accurate inference for prediction set coverage. |
Tasks | |
Published | 2019-06-13 |
URL | https://arxiv.org/abs/1906.05473v1 |
https://arxiv.org/pdf/1906.05473v1.pdf | |
PWC | https://paperswithcode.com/paper/selective-prediction-set-models-with-coverage |
Repo | |
Framework | |
Escaping Saddle Points for Zeroth-order Nonconvex Optimization using Estimated Gradient Descent
Title | Escaping Saddle Points for Zeroth-order Nonconvex Optimization using Estimated Gradient Descent |
Authors | Qinbo Bai, Mridul Agarwal, Vaneet Aggarwal |
Abstract | Gradient descent and its variants are widely used in machine learning. However, oracle access of gradient may not be available in many applications, limiting the direct use of gradient descent. This paper proposes a method of estimating gradient to perform gradient descent, that converges to a stationary point for general non-convex optimization problems. Beyond the first-order stationary properties, the second-order stationary properties are important in machine learning applications to achieve better performance. We show that the proposed model-free non-convex optimization algorithm returns an $\epsilon$-second-order stationary point with $\widetilde{O}(\frac{d^{2+\frac{\theta}{2}}}{\epsilon^{8+\theta}})$ queries of the function for any arbitrary $\theta>0$. |
Tasks | |
Published | 2019-10-03 |
URL | https://arxiv.org/abs/1910.01277v1 |
https://arxiv.org/pdf/1910.01277v1.pdf | |
PWC | https://paperswithcode.com/paper/escaping-saddle-points-for-zeroth-order |
Repo | |
Framework | |
The Web is missing an essential part of infrastructure: an Open Web Index
Title | The Web is missing an essential part of infrastructure: an Open Web Index |
Authors | Dirk Lewandowski |
Abstract | A proposal for building an index of the Web that separates the infrastructure part of the search engine - the index - from the services part that will form the basis for myriad search engines and other services utilizing Web data on top of a public infrastructure open to everyone. |
Tasks | |
Published | 2019-03-09 |
URL | http://arxiv.org/abs/1903.03846v1 |
http://arxiv.org/pdf/1903.03846v1.pdf | |
PWC | https://paperswithcode.com/paper/the-web-is-missing-an-essential-part-of |
Repo | |
Framework | |
XferNAS: Transfer Neural Architecture Search
Title | XferNAS: Transfer Neural Architecture Search |
Authors | Martin Wistuba |
Abstract | The term Neural Architecture Search (NAS) refers to the automatic optimization of network architectures for a new, previously unknown task. Since testing an architecture is computationally very expensive, many optimizers need days or even weeks to find suitable architectures. However, this search time can be significantly reduced if knowledge from previous searches on different tasks is reused. In this work, we propose a generally applicable framework that introduces only minor changes to existing optimizers to leverage this feature. As an example, we select an existing optimizer and demonstrate the complexity of the integration of the framework as well as its impact. In experiments on CIFAR-10 and CIFAR-100, we observe a reduction in the search time from 200 to only 6 GPU days, a speed up by a factor of 33. In addition, we observe new records of 1.99 and 14.06 for NAS optimizers on the CIFAR benchmarks, respectively. In a separate study, we analyze the impact of the amount of source and target data. Empirically, we demonstrate that the proposed framework generally gives better results and, in the worst case, is just as good as the unmodified optimizer. |
Tasks | Neural Architecture Search |
Published | 2019-07-18 |
URL | https://arxiv.org/abs/1907.08307v1 |
https://arxiv.org/pdf/1907.08307v1.pdf | |
PWC | https://paperswithcode.com/paper/xfernas-transfer-neural-architecture-search |
Repo | |
Framework | |
A Note on Reasoning on $\textit{DL-Lite}_{\cal R}$ with Defeasibility
Title | A Note on Reasoning on $\textit{DL-Lite}_{\cal R}$ with Defeasibility |
Authors | Loris Bozzato, Thomas Eiter, Luciano Serafini |
Abstract | Representation of defeasible information is of interest in description logics, as it is related to the need of accommodating exceptional instances in knowledge bases. In this direction, in our previous works we presented a datalog translation for reasoning on (contextualized) OWL RL knowledge bases with a notion of justified exceptions on defeasible axioms. While it covers a relevant fragment of OWL, the resulting reasoning process needs a complex encoding in order to capture reasoning on negative information. In this paper, we consider the case of knowledge bases in $\textit{DL-Lite}{\cal R}$, i.e. the language underlying OWL QL. We provide a definition for $\textit{DL-Lite}{\cal R}$ knowledge bases with defeasible axioms and study their properties. The limited form of $\textit{DL-Lite}{\cal R}$ axioms allows us to formulate a simpler encoding into datalog (under answer set semantics) with direct rules for reasoning on negative information. The resulting materialization method gives rise to a complete reasoning procedure for instance checking in $\textit{DL-Lite}{\cal R}$ with defeasible axioms. |
Tasks | |
Published | 2019-05-22 |
URL | https://arxiv.org/abs/1905.09221v1 |
https://arxiv.org/pdf/1905.09221v1.pdf | |
PWC | https://paperswithcode.com/paper/a-note-on-reasoning-on-textitdl-lite_cal-r |
Repo | |
Framework | |
Online and Bandit Algorithms for Nonstationary Stochastic Saddle-Point Optimization
Title | Online and Bandit Algorithms for Nonstationary Stochastic Saddle-Point Optimization |
Authors | Abhishek Roy, Yifang Chen, Krishnakumar Balasubramanian, Prasant Mohapatra |
Abstract | Saddle-point optimization problems are an important class of optimization problems with applications to game theory, multi-agent reinforcement learning and machine learning. A majority of the rich literature available for saddle-point optimization has focused on the offline setting. In this paper, we study nonstationary versions of stochastic, smooth, strongly-convex and strongly-concave saddle-point optimization problem, in both online (or first-order) and multi-point bandit (or zeroth-order) settings. We first propose natural notions of regret for such nonstationary saddle-point optimization problems. We then analyze extragradient and Frank-Wolfe algorithms, for the unconstrained and constrained settings respectively, for the above class of nonstationary saddle-point optimization problems. We establish sub-linear regret bounds on the proposed notions of regret in both the online and bandit setting. |
Tasks | Multi-agent Reinforcement Learning |
Published | 2019-12-03 |
URL | https://arxiv.org/abs/1912.01698v1 |
https://arxiv.org/pdf/1912.01698v1.pdf | |
PWC | https://paperswithcode.com/paper/online-and-bandit-algorithms-for |
Repo | |
Framework | |
Data Sanity Check for Deep Learning Systems via Learnt Assertions
Title | Data Sanity Check for Deep Learning Systems via Learnt Assertions |
Authors | Haochuan Lu, Huanlin Xu, Nana Liu, Yangfan Zhou, Xin Wang |
Abstract | Reliability is a critical consideration to DL-based systems. But the statistical nature of DL makes it quite vulnerable to invalid inputs, i.e., those cases that are not considered in the training phase of a DL model. This paper proposes to perform data sanity check to identify invalid inputs, so as to enhance the reliability of DL-based systems. We design and implement a tool to detect behavior deviation of a DL model when processing an input case. This tool extracts the data flow footprints and conducts an assertion-based validation mechanism. The assertions are built automatically, which are specifically-tailored for DL model data flow analysis. Our experiments conducted with real-world scenarios demonstrate that such an assertion-based data sanity check mechanism is effective in identifying invalid input cases. |
Tasks | |
Published | 2019-09-06 |
URL | https://arxiv.org/abs/1909.03835v3 |
https://arxiv.org/pdf/1909.03835v3.pdf | |
PWC | https://paperswithcode.com/paper/data-sanity-check-for-deep-learning-systems |
Repo | |
Framework | |
Imputing Missing Events in Continuous-Time Event Streams
Title | Imputing Missing Events in Continuous-Time Event Streams |
Authors | Hongyuan Mei, Guanghui Qin, Jason Eisner |
Abstract | Events in the world may be caused by other, unobserved events. We consider sequences of events in continuous time. Given a probability model of complete sequences, we propose particle smoothing—a form of sequential importance sampling—to impute the missing events in an incomplete sequence. We develop a trainable family of proposal distributions based on a type of bidirectional continuous-time LSTM: Bidirectionality lets the proposals condition on future observations, not just on the past as in particle filtering. Our method can sample an ensemble of possible complete sequences (particles), from which we form a single consensus prediction that has low Bayes risk under our chosen loss metric. We experiment in multiple synthetic and real domains, using different missingness mechanisms, and modeling the complete sequences in each domain with a neural Hawkes process (Mei & Eisner 2017). On held-out incomplete sequences, our method is effective at inferring the ground-truth unobserved events, with particle smoothing consistently improving upon particle filtering. |
Tasks | |
Published | 2019-05-14 |
URL | https://arxiv.org/abs/1905.05570v1 |
https://arxiv.org/pdf/1905.05570v1.pdf | |
PWC | https://paperswithcode.com/paper/imputing-missing-events-in-continuous-time |
Repo | |
Framework | |
Terminal Brain Damage: Exposing the Graceless Degradation in Deep Neural Networks Under Hardware Fault Attacks
Title | Terminal Brain Damage: Exposing the Graceless Degradation in Deep Neural Networks Under Hardware Fault Attacks |
Authors | Sanghyun Hong, Pietro Frigo, Yiğitcan Kaya, Cristiano Giuffrida, Tudor Dumitraş |
Abstract | Deep neural networks (DNNs) have been shown to tolerate “brain damage”: cumulative changes to the network’s parameters (e.g., pruning, numerical perturbations) typically result in a graceful degradation of classification accuracy. However, the limits of this natural resilience are not well understood in the presence of small adversarial changes to the DNN parameters’ underlying memory representation, such as bit-flips that may be induced by hardware fault attacks. We study the effects of bitwise corruptions on 19 DNN models—six architectures on three image classification tasks—and we show that most models have at least one parameter that, after a specific bit-flip in their bitwise representation, causes an accuracy loss of over 90%. We employ simple heuristics to efficiently identify the parameters likely to be vulnerable. We estimate that 40-50% of the parameters in a model might lead to an accuracy drop greater than 10% when individually subjected to such single-bit perturbations. To demonstrate how an adversary could take advantage of this vulnerability, we study the impact of an exemplary hardware fault attack, Rowhammer, on DNNs. Specifically, we show that a Rowhammer enabled attacker co-located in the same physical machine can inflict significant accuracy drops (up to 99%) even with single bit-flip corruptions and no knowledge of the model. Our results expose the limits of DNNs’ resilience against parameter perturbations induced by real-world fault attacks. We conclude by discussing possible mitigations and future research directions towards fault attack-resilient DNNs. |
Tasks | Image Classification |
Published | 2019-06-03 |
URL | https://arxiv.org/abs/1906.01017v1 |
https://arxiv.org/pdf/1906.01017v1.pdf | |
PWC | https://paperswithcode.com/paper/terminal-brain-damage-exposing-the-graceless |
Repo | |
Framework | |
Deep Cytometry: Deep learning with Real-time Inference in Cell Sorting and Flow Cytometry
Title | Deep Cytometry: Deep learning with Real-time Inference in Cell Sorting and Flow Cytometry |
Authors | Yueqin Li, Ata Mahjoubfar, Claire Lifan Chen, Kayvan Reza Niazi, Li Pei, Bahram Jalali |
Abstract | Deep learning has achieved spectacular performance in image and speech recognition and synthesis. It outperforms other machine learning algorithms in problems where large amounts of data are available. In the area of measurement technology, instruments based on the photonic time stretch have established record real-time measurement throughput in spectroscopy, optical coherence tomography, and imaging flow cytometry. These extreme-throughput instruments generate approximately 1 Tbit/s of continuous measurement data and have led to the discovery of rare phenomena in nonlinear and complex systems as well as new types of biomedical instruments. Owing to the abundance of data they generate, time-stretch instruments are a natural fit to deep learning classification. Previously we had shown that high-throughput label-free cell classification with high accuracy can be achieved through a combination of time-stretch microscopy, image processing and feature extraction, followed by deep learning for finding cancer cells in the blood. Such a technology holds promise for early detection of primary cancer or metastasis. Here we describe a new deep learning pipeline, which entirely avoids the slow and computationally costly signal processing and feature extraction steps by a convolutional neural network that directly operates on the measured signals. The improvement in computational efficiency enables low-latency inference and makes this pipeline suitable for cell sorting via deep learning. Our neural network takes less than a few milliseconds to classify the cells, fast enough to provide a decision to a cell sorter for real-time separation of individual target cells. We demonstrate the applicability of our new method in the classification of OT-II white blood cells and SW-480 epithelial cancer cells with more than 95% accuracy in a label-free fashion. |
Tasks | Speech Recognition |
Published | 2019-04-09 |
URL | https://arxiv.org/abs/1904.09233v2 |
https://arxiv.org/pdf/1904.09233v2.pdf | |
PWC | https://paperswithcode.com/paper/190409233 |
Repo | |
Framework | |
Solving graph compression via optimal transport
Title | Solving graph compression via optimal transport |
Authors | Vikas K. Garg, Tommi Jaakkola |
Abstract | We propose a new approach to graph compression by appeal to optimal transport. The transport problem is seeded with prior information about node importance, attributes, and edges in the graph. The transport formulation can be setup for either directed or undirected graphs, and its dual characterization is cast in terms of distributions over the nodes. The compression pertains to the support of node distributions and makes the problem challenging to solve directly. To this end, we introduce Boolean relaxations and specify conditions under which these relaxations are exact. The relaxations admit algorithms with provably fast convergence. Moreover, we provide an exact $O(d \log d)$ algorithm for the subproblem of projecting a $d$-dimensional vector to transformed simplex constraints. Our method outperforms state-of-the-art compression methods on graph classification. |
Tasks | Graph Classification |
Published | 2019-05-29 |
URL | https://arxiv.org/abs/1905.12158v1 |
https://arxiv.org/pdf/1905.12158v1.pdf | |
PWC | https://paperswithcode.com/paper/solving-graph-compression-via-optimal |
Repo | |
Framework | |
Provably Efficient Exploration in Policy Optimization
Title | Provably Efficient Exploration in Policy Optimization |
Authors | Qi Cai, Zhuoran Yang, Chi Jin, Zhaoran Wang |
Abstract | While policy-based reinforcement learning (RL) achieves tremendous successes in practice, it is significantly less understood in theory, especially compared with value-based RL. In particular, it remains elusive how to design a provably efficient policy optimization algorithm that incorporates exploration. To bridge such a gap, this paper proposes an Optimistic variant of the Proximal Policy Optimization algorithm (OPPO), which follows an “optimistic version” of the policy gradient direction. This paper proves that, in the problem of episodic Markov decision process with linear function approximation, unknown transition, and adversarial reward with full-information feedback, OPPO achieves $\tilde{O}(\sqrt{d^3 H^3 T})$ regret. Here $d$ is the feature dimension, $H$ is the episode horizon, and $T$ is the total number of steps. To the best of our knowledge, OPPO is the first provably efficient policy optimization algorithm that explores. |
Tasks | Efficient Exploration |
Published | 2019-12-12 |
URL | https://arxiv.org/abs/1912.05830v1 |
https://arxiv.org/pdf/1912.05830v1.pdf | |
PWC | https://paperswithcode.com/paper/provably-efficient-exploration-in-policy |
Repo | |
Framework | |