Paper Group ANR 504
Image Super-Resolution via Deterministic-Stochastic Synthesis and Local Statistical Rectification. Towards an understanding of CNNs: analysing the recovery of activation pathways via Deep Convolutional Sparse Coding. Deep MR Image Super-Resolution Using Structural Priors. Sequence-based Multi-lingual Low Resource Speech Recognition. Learning Optima …
Image Super-Resolution via Deterministic-Stochastic Synthesis and Local Statistical Rectification
Title | Image Super-Resolution via Deterministic-Stochastic Synthesis and Local Statistical Rectification |
Authors | Weifeng Ge, Bingchen Gong, Yizhou Yu |
Abstract | Single image super-resolution has been a popular research topic over the last two decades and has recently received a new wave of interest due to deep neural networks. In this paper, we approach this problem from a different perspective. With respect to a downsampled low resolution image, we model a high resolution image as a combination of two components, a deterministic component and a stochastic component. The deterministic component can be recovered from the low-frequency signals in the downsampled image. The stochastic component, on the other hand, contains the signals that have little correlation with the low resolution image. We adopt two complementary methods for generating these two components. While generative adversarial networks are used for the stochastic component, deterministic component reconstruction is formulated as a regression problem solved using deep neural networks. Since the deterministic component exhibits clearer local orientations, we design novel loss functions tailored for such properties for training the deep regression network. These two methods are first applied to the entire input image to produce two distinct high-resolution images. Afterwards, these two images are fused together using another deep neural network that also performs local statistical rectification, which makes the local statistics of the fused image match those of the ground-truth image. Quantitative results and a user study indicate that the proposed method outperforms existing state-of-the-art algorithms by a clear margin. |
Tasks | Image Super-Resolution, Super-Resolution |
Published | 2018-09-18 |
URL | http://arxiv.org/abs/1809.06557v1 |
http://arxiv.org/pdf/1809.06557v1.pdf | |
PWC | https://paperswithcode.com/paper/image-super-resolution-via-deterministic |
Repo | |
Framework | |
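The abstract above describes three pieces: a regression CNN for the deterministic component, a GAN generator for the stochastic component, and a third network that fuses the two high-resolution candidates. The PyTorch sketch below only illustrates that three-network layout; all module names, layer sizes, and the 4x scale factor are illustrative placeholders, not the authors' architecture.

```python
import torch
import torch.nn as nn

class RegressionSR(nn.Module):
    """Deterministic branch: recover the low-frequency content by regression."""
    def __init__(self, scale=4):
        super().__init__()
        self.body = nn.Sequential(
            nn.Upsample(scale_factor=scale, mode="bicubic", align_corners=False),
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 3, 3, padding=1),
        )
    def forward(self, lr):
        return self.body(lr)

class GanGeneratorSR(nn.Module):
    """Stochastic branch: a generator that hallucinates high-frequency detail."""
    def __init__(self, scale=4):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
            nn.Upsample(scale_factor=scale, mode="nearest"),
            nn.Conv2d(64, 3, 3, padding=1),
        )
    def forward(self, lr):
        return self.body(lr)

class FusionNet(nn.Module):
    """Fuse the two high-resolution candidates into a single output image."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(6, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 3, 3, padding=1),
        )
    def forward(self, deterministic_hr, stochastic_hr):
        return self.body(torch.cat([deterministic_hr, stochastic_hr], dim=1))

lr = torch.randn(1, 3, 32, 32)                         # toy low-resolution input
sr = FusionNet()(RegressionSR()(lr), GanGeneratorSR()(lr))
print(sr.shape)                                        # torch.Size([1, 3, 128, 128])
```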
Towards an understanding of CNNs: analysing the recovery of activation pathways via Deep Convolutional Sparse Coding
Title | Towards an understanding of CNNs: analysing the recovery of activation pathways via Deep Convolutional Sparse Coding |
Authors | Michael Murray, Jared Tanner |
Abstract | Deep Convolutional Sparse Coding (D-CSC) is a framework reminiscent of deep convolutional neural networks (DCNNs), but by omitting the learning of the dictionaries one can more transparently analyse the role of the activation function and its ability to recover activation paths through the layers. Papyan, Romano, and Elad conducted an analysis of such an architecture, demonstrated the relationship with DCNNs and proved conditions under which the D-CSC is guaranteed to recover specific activation paths. A technical innovation of their work highlights that one can view the efficacy of the ReLU nonlinear activation function of a DCNN through a new variant of the tensor's sparsity, referred to as stripe-sparsity. Using this they proved that representations with an activation density proportional to the ambient dimension of the data are recoverable. We extend their uniform guarantees to a modified model and prove that, with high probability, the true activations can typically be recovered at a greater density of activations per layer. Our extension follows from incorporating the prior work on one-step thresholding by Schnass and Vandergheynst. |
Tasks | |
Published | 2018-06-26 |
URL | http://arxiv.org/abs/1806.09888v1 |
http://arxiv.org/pdf/1806.09888v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-an-understanding-of-cnns-analysing |
Repo | |
Framework | |
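The recovery analysed above proceeds layer by layer: each layer's activations are estimated by correlating the previous layer's signal with a fixed dictionary and thresholding the result. A toy numpy sketch of layered hard thresholding, using random dense dictionaries rather than the convolutional ones studied in the paper:

```python
import numpy as np

def hard_threshold(z, k):
    """Keep the k largest-magnitude entries of z, zero out the rest."""
    out = np.zeros_like(z)
    idx = np.argsort(np.abs(z))[-k:]
    out[idx] = z[idx]
    return out

def layered_thresholding(x, dictionaries, sparsities):
    """One-step thresholding per layer: gamma_i = T_k(D_i^T gamma_{i-1})."""
    gamma = x
    for D, k in zip(dictionaries, sparsities):
        gamma = hard_threshold(D.T @ gamma, k)
    return gamma

rng = np.random.default_rng(0)
x = rng.standard_normal(64)                        # input signal
dicts = [rng.standard_normal((64, 128)),           # layer-1 dictionary
         rng.standard_normal((128, 256))]          # layer-2 dictionary
print(layered_thresholding(x, dicts, sparsities=[20, 10]).shape)   # (256,)
```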
Deep MR Image Super-Resolution Using Structural Priors
Title | Deep MR Image Super-Resolution Using Structural Priors |
Authors | Venkateswararao Cherukuri, Tiantong Guo, Steven J. Schiff, Vishal Monga |
Abstract | High resolution magnetic resonance (MR) images are desired for accurate diagnostics. In practice, image resolution is restricted by factors like hardware, cost and processing constraints. Recently, deep learning methods have been shown to produce compelling state-of-the-art results for image super-resolution. Paying particular attention to the desired high-resolution MR image structure, we propose a new regularized network that exploits image priors, namely a low-rank prior and a sharpness prior, to enhance deep MR image super-resolution. Our contribution is to incorporate these priors in an analytically tractable fashion into the learning of a convolutional neural network (CNN) that accomplishes the super-resolution task. This is particularly challenging for the low-rank prior, since the rank is not a differentiable function of the image matrix (and hence of the network parameters), an issue we address by pursuing differentiable approximations of the rank. Sharpness is emphasized by the variance of the Laplacian, which we show can be implemented by a fixed {\em feedback} layer at the output of the network. Experiments performed on two publicly available MR brain image databases exhibit promising results, particularly when training imagery is limited. |
Tasks | Image Super-Resolution, Super-Resolution |
Published | 2018-09-10 |
URL | http://arxiv.org/abs/1809.03140v1 |
http://arxiv.org/pdf/1809.03140v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-mr-image-super-resolution-using |
Repo | |
Framework | |
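The sharpness prior mentioned above is the variance of the image Laplacian, which can be computed with a fixed, non-trainable convolution at the network output. A minimal PyTorch sketch of that term; the kernel, the sign convention, and the weighting `lam` are illustrative choices, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

# Fixed 3x3 Laplacian kernel, acting as a non-trainable "feedback" filter.
LAPLACIAN = torch.tensor([[0., 1., 0.],
                          [1., -4., 1.],
                          [0., 1., 0.]]).view(1, 1, 3, 3)

def sharpness(img):
    """Variance of the Laplacian response; larger means sharper edges."""
    lap = F.conv2d(img, LAPLACIAN, padding=1)
    return lap.var()

def sr_loss(pred, target, lam=1e-3):
    """Reconstruction loss minus a weighted sharpness reward on the prediction."""
    return F.mse_loss(pred, target) - lam * sharpness(pred)

pred = torch.rand(1, 1, 64, 64, requires_grad=True)
target = torch.rand(1, 1, 64, 64)
sr_loss(pred, target).backward()       # gradients flow through the fixed filter
```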
Sequence-based Multi-lingual Low Resource Speech Recognition
Title | Sequence-based Multi-lingual Low Resource Speech Recognition |
Authors | Siddharth Dalmia, Ramon Sanabria, Florian Metze, Alan W. Black |
Abstract | Techniques for multi-lingual and cross-lingual speech recognition can help in low resource scenarios, to bootstrap systems and enable analysis of new languages and domains. End-to-end approaches, in particular sequence-based techniques, are attractive because of their simplicity and elegance. While it is possible to integrate traditional multi-lingual bottleneck feature extractors as front-ends, we show that end-to-end multi-lingual training of sequence models is effective on context-independent models trained using Connectionist Temporal Classification (CTC) loss. We show that our model improves performance on Babel languages by over 6% absolute in terms of word/phoneme error rate when compared to mono-lingual systems built in the same setting for these languages. We also show that the trained model can be adapted cross-lingually to an unseen language using just 25% of the target data. We show that training on multiple languages is important for very low resource cross-lingual target scenarios, but not for multi-lingual testing scenarios. Here, it appears beneficial to include large, well-prepared datasets. |
Tasks | Speech Recognition |
Published | 2018-02-21 |
URL | http://arxiv.org/abs/1802.07420v2 |
http://arxiv.org/pdf/1802.07420v2.pdf | |
PWC | https://paperswithcode.com/paper/sequence-based-multi-lingual-low-resource |
Repo | |
Framework | |
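The models above are context-independent phoneme models trained with the CTC loss over a shared multilingual target set. The PyTorch sketch below shows only the CTC objective wiring; the feature dimension, encoder, and the size of the shared phoneme inventory are placeholders, not the configuration used in the paper.

```python
import torch
import torch.nn as nn

T, N, C = 50, 4, 42        # time steps, batch size, shared phoneme set size (incl. blank)
encoder = nn.LSTM(input_size=40, hidden_size=128)
proj = nn.Linear(128, C)
ctc = nn.CTCLoss(blank=0)

feats = torch.randn(T, N, 40)                         # e.g. filterbank features
log_probs = proj(encoder(feats)[0]).log_softmax(-1)   # (T, N, C)

targets = torch.randint(1, C, (N, 12))                # padded phoneme label sequences
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.full((N,), 12, dtype=torch.long)

loss = ctc(log_probs, targets, input_lengths, target_lengths)
loss.backward()
```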
Learning Optimal Fair Policies
Title | Learning Optimal Fair Policies |
Authors | Razieh Nabi, Daniel Malinsky, Ilya Shpitser |
Abstract | Systematic discriminatory biases present in our society influence the way data is collected and stored, the way variables are defined, and the way scientific findings are put into practice as policy. Automated decision procedures and learning algorithms applied to such data may serve to perpetuate existing injustice or unfairness in our society. In this paper, we consider how to make optimal but fair decisions, which “break the cycle of injustice” by correcting for the unfair dependence of both decisions and outcomes on sensitive features (e.g., variables that correspond to gender, race, disability, or other protected attributes). We use methods from causal inference and constrained optimization to learn optimal policies in a way that addresses multiple potential biases which afflict data analysis in sensitive contexts, extending the approach of (Nabi and Shpitser 2018). Our proposal comes equipped with the theoretical guarantee that the chosen fair policy will induce a joint distribution for new instances that satisfies given fairness constraints. We illustrate our approach with both synthetic data and real criminal justice data. |
Tasks | Causal Inference, Decision Making |
Published | 2018-09-06 |
URL | https://arxiv.org/abs/1809.02244v3 |
https://arxiv.org/pdf/1809.02244v3.pdf | |
PWC | https://paperswithcode.com/paper/learning-optimal-fair-policies |
Repo | |
Framework | |
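As a very loose illustration of the constrained-optimization framing only (it does not reproduce the causal, path-specific machinery of the paper), the toy below searches over threshold policies for the one that maximizes expected utility subject to an upper bound on how much the decision rate may differ across groups. All variables and the 0.05 bound are synthetic assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
group = rng.integers(0, 2, n)                  # sensitive feature (toy)
score = rng.normal(loc=group * 0.3, size=n)    # score correlated with the group
utility = score + rng.normal(scale=0.5, size=n)

def evaluate(threshold):
    decide = score > threshold
    value = utility[decide].sum() / n                          # expected utility of the policy
    gap = abs(decide[group == 0].mean() - decide[group == 1].mean())
    return value, gap

best = None
for t in np.linspace(-2, 2, 401):              # grid search over threshold policies
    value, gap = evaluate(t)
    if gap <= 0.05 and (best is None or value > best[1]):      # fairness constraint
        best = (t, value)

print("chosen threshold %.2f, expected utility %.3f" % best)
```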
Automatic quantification of the LV function and mass: a deep learning approach for cardiovascular MRI
Title | Automatic quantification of the LV function and mass: a deep learning approach for cardiovascular MRI |
Authors | Ariel H. Curiale, Flavio D. Colavecchia, German Mato |
Abstract | Objective: This paper proposes a novel approach for automatic left ventricle (LV) quantification using convolutional neural networks (CNN). Methods: The general framework consists of one CNN for detecting the LV, and another for tissue classification. Also, three new deep learning architectures were proposed for LV quantification. These new CNNs introduce the ideas of sparsity and depthwise separable convolution into the U-net architecture, as well as a level-to-level residual learning strategy. To this end, we extend the classical U-net architecture and use the generalized Jaccard distance as the optimization objective function. Results: The CNNs were trained and evaluated with 140 patients from two public cardiovascular magnetic resonance datasets (Sunnybrook and Cardiac Atlas Project) by using a 5-fold cross-validation strategy. Our results demonstrate a suitable accuracy for myocardial segmentation ($\sim$0.9 Dice's coefficient), and a strong correlation with the most relevant physiological measures: 0.99 for end-diastolic and end-systolic volume, 0.97 for the left myocardial mass, 0.95 for the ejection fraction and 0.93 for the stroke volume and cardiac output. Conclusion: Our simulation and clinical evaluation results demonstrate the capability and merits of the proposed CNN to estimate different structural and functional features such as LV mass and EF which are commonly used for both diagnosis and treatment of different pathologies. Significance: This paper suggests a new approach for automatic LV quantification based on deep learning where errors are comparable to the inter- and intra-operator ranges for manual contouring. Also, this approach may have important applications in motion quantification. |
Tasks | |
Published | 2018-12-14 |
URL | http://arxiv.org/abs/1812.06061v1 |
http://arxiv.org/pdf/1812.06061v1.pdf | |
PWC | https://paperswithcode.com/paper/automatic-quantification-of-the-lv-function |
Repo | |
Framework | |
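The optimization objective named above is the generalized Jaccard distance between the predicted probability map and the ground-truth mask. A compact sketch of a soft Jaccard loss, assuming per-pixel sigmoid outputs; the batching and epsilon are illustrative.

```python
import torch

def soft_jaccard_loss(pred, target, eps=1e-6):
    """Generalized Jaccard distance: 1 - sum(min(p,t)) / sum(max(p,t)) per image."""
    intersection = torch.minimum(pred, target).sum(dim=(1, 2, 3))
    union = torch.maximum(pred, target).sum(dim=(1, 2, 3))
    return (1.0 - (intersection + eps) / (union + eps)).mean()

pred = torch.rand(2, 1, 128, 128)        # sigmoid probabilities from the network
mask = (torch.rand(2, 1, 128, 128) > 0.5).float()
print(soft_jaccard_loss(pred, mask))
```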
Multi-Source Neural Variational Inference
Title | Multi-Source Neural Variational Inference |
Authors | Richard Kurle, Stephan Günnemann, Patrick van der Smagt |
Abstract | Learning from multiple sources of information is an important problem in machine-learning research. The key challenges are learning representations and formulating inference methods that take into account the complementarity and redundancy of various information sources. In this paper we formulate a variational autoencoder based multi-source learning framework in which each encoder is conditioned on a different information source. This allows us to relate the sources via the shared latent variables by computing divergence measures between the individual sources' posterior approximations. We explore a variety of options to learn these encoders and to integrate the beliefs they compute into a consistent posterior approximation. We visualise learned beliefs on a toy dataset and evaluate our methods for learning shared representations and structured output prediction, showing trade-offs of learning separate encoders for each information source. Furthermore, we demonstrate how conflict detection and redundancy can increase robustness of inference in a multi-source setting. |
Tasks | |
Published | 2018-11-11 |
URL | http://arxiv.org/abs/1811.04451v2 |
http://arxiv.org/pdf/1811.04451v2.pdf | |
PWC | https://paperswithcode.com/paper/multi-source-neural-variational-inference |
Repo | |
Framework | |
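One common way to integrate per-source Gaussian posterior approximations is a precision-weighted product of experts, and the KL divergence between two source posteriors is a natural disagreement measure of the kind the abstract refers to. The numpy sketch below shows these two standard ingredients; it is a generic fusion rule, not necessarily the exact integration scheme used in the paper.

```python
import numpy as np

def product_of_gaussians(mus, variances):
    """Precision-weighted fusion of diagonal Gaussian posteriors q_i(z)."""
    precisions = [1.0 / v for v in variances]
    var = 1.0 / np.sum(precisions, axis=0)
    mu = var * np.sum([p * m for p, m in zip(precisions, mus)], axis=0)
    return mu, var

def kl_diag_gaussians(mu0, var0, mu1, var1):
    """KL(q0 || q1) for diagonal Gaussians, summed over latent dimensions."""
    return 0.5 * np.sum(np.log(var1 / var0) + (var0 + (mu0 - mu1) ** 2) / var1 - 1.0)

# Two encoders, each producing a belief over a 4-dimensional latent code.
mu_a, var_a = np.array([0.0, 1.0, 0.5, -1.0]), np.full(4, 0.5)
mu_b, var_b = np.array([0.2, 0.8, 0.4, -1.2]), np.full(4, 1.0)

print(product_of_gaussians([mu_a, mu_b], [var_a, var_b]))
print("disagreement:", kl_diag_gaussians(mu_a, var_a, mu_b, var_b))
```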
Artificial Intelligence-aided OFDM Receiver: Design and Experimental Results
Title | Artificial Intelligence-aided OFDM Receiver: Design and Experimental Results |
Authors | Peiwen Jiang, Tianqi Wang, Bin Han, Xuanxuan Gao, Jing Zhang, Chao-Kai Wen, Shi Jin, Geoffrey Ye Li |
Abstract | Orthogonal frequency division multiplexing (OFDM) is one of the key technologies that are widely applied in current communication systems. Recently, artificial intelligence (AI)-aided OFDM receivers have been brought to the forefront to break the bottleneck of traditional OFDM systems. In this paper, we investigate two AI-aided OFDM receivers: a data-driven fully connected deep neural network (FC-DNN) receiver and a model-driven ComNet receiver. We first study their performance under different channel models through simulation and then establish a real-time video transmission system using a 5G rapid prototyping (RaPro) system for over-the-air (OTA) testing. To address the performance gap between the simulation and the OTA test caused by the discrepancy between the channel model used for offline training and real environments, we develop a novel online training strategy, called the SwitchNet receiver. The SwitchNet receiver has a flexible and extendable architecture and can adapt to real channels by training a single parameter online. The OTA test verifies its feasibility and robustness to real environments and indicates its potential for future communication systems. At the end of this paper, we discuss some challenges to inspire future research. |
Tasks | |
Published | 2018-12-17 |
URL | http://arxiv.org/abs/1812.06638v2 |
http://arxiv.org/pdf/1812.06638v2.pdf | |
PWC | https://paperswithcode.com/paper/artificial-intelligence-aided-ofdm-receiver |
Repo | |
Framework | |
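As a toy sketch of the data-driven FC-DNN receiver idea: a fully connected network maps the real and imaginary parts of a received pilot block and a received data block directly to per-bit estimates. The number of subcarriers, layer widths, and QPSK assumption below are illustrative, not the configuration used in the paper.

```python
import torch
import torch.nn as nn

n_subcarriers = 64
n_bits = 2 * n_subcarriers          # e.g. QPSK: 2 bits per data subcarrier

# Input: real and imaginary parts of one pilot block and one data block.
receiver = nn.Sequential(
    nn.Linear(4 * n_subcarriers, 512), nn.ReLU(),
    nn.Linear(512, 256), nn.ReLU(),
    nn.Linear(256, n_bits), nn.Sigmoid(),    # per-bit probabilities
)

rx = torch.randn(8, 4 * n_subcarriers)       # a batch of received blocks (toy data)
bits = torch.randint(0, 2, (8, n_bits)).float()
loss = nn.functional.binary_cross_entropy(receiver(rx), bits)
loss.backward()
```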
Learning Scripts as Hidden Markov Models
Title | Learning Scripts as Hidden Markov Models |
Authors | J. Walker Orr, Prasad Tadepalli, Janardhan Rao Doppa, Xiaoli Fern, Thomas G. Dietterich |
Abstract | Scripts have been proposed to model the stereotypical event sequences found in narratives. They can be applied to make a variety of inferences including filling gaps in the narratives and resolving ambiguous references. This paper proposes the first formal framework for scripts based on Hidden Markov Models (HMMs). Our framework supports robust inference and learning algorithms, which are lacking in previous clustering models. We develop an algorithm for structure and parameter learning based on Expectation Maximization and evaluate it on a number of natural datasets. The results show that our algorithm is superior to several informed baselines for predicting missing events in partial observation sequences. |
Tasks | |
Published | 2018-09-11 |
URL | http://arxiv.org/abs/1809.03680v1 |
http://arxiv.org/pdf/1809.03680v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-scripts-as-hidden-markov-models |
Repo | |
Framework | |
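The inference used above is standard HMM machinery; the numpy sketch below implements the forward algorithm, which computes the probability of an event sequence and is the kind of quantity used to score candidate completions of a partial narrative. The parameters are toy values, not learned script models.

```python
import numpy as np

def forward_prob(obs, start, trans, emit):
    """P(observations) for a discrete HMM via the forward algorithm."""
    alpha = start * emit[:, obs[0]]
    for o in obs[1:]:
        alpha = (alpha @ trans) * emit[:, o]
    return alpha.sum()

start = np.array([0.6, 0.4])                        # 2 hidden "event slot" states
trans = np.array([[0.7, 0.3],
                  [0.2, 0.8]])
emit = np.array([[0.5, 0.4, 0.1],                   # 3 observable event types
                 [0.1, 0.3, 0.6]])

# Score two candidate completions of a partial event sequence.
print(forward_prob([0, 1, 2], start, trans, emit))
print(forward_prob([0, 2, 2], start, trans, emit))
```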
FPGA Implementations of 3D-SIMD Processor Architecture for Deep Neural Networks Using Relative Indexed Compressed Sparse Filter Encoding Format and Stacked Filters Stationary Flow
Title | FPGA Implementations of 3D-SIMD Processor Architecture for Deep Neural Networks Using Relative Indexed Compressed Sparse Filter Encoding Format and Stacked Filters Stationary Flow |
Authors | Yuechao Gao, Nianhong Liu, Sheng Zhang |
Abstract | It is a challenging task to deploy computationally and memory intensive state-of-the-art deep neural networks (DNNs) on embedded systems with limited hardware resources and power budgets. Recently developed techniques like Deep Compression make it possible to fit large DNNs, such as AlexNet and VGGNet, fully in on-chip SRAM. But sparse networks compressed using existing encoding formats, like CSR or CSC, complicate the computation at runtime due to their irregular memory access characteristics. In [1], we introduce a computation dataflow, the stacked filters stationary dataflow (SFS), and a corresponding data encoding format, the relative indexed compressed sparse filter format (CSF), to make the best use of data sparsity and simplify data handling at execution time. In this paper we present FPGA implementations of these methods. We implement several compact streaming fully connected (FC) and convolutional (CONV) neural network processors to show their efficiency. Compared with the state-of-the-art results [2,3,4], our methods achieve at least 2x improvement in computation efficiency per PE on most layers. In particular, our methods achieve 8x improvement on AlexNet layer CONV4 with 384 filters, and 11x improvement on VGG16 layer CONV5-3 with 512 filters. |
Tasks | |
Published | 2018-03-28 |
URL | http://arxiv.org/abs/1803.10548v3 |
http://arxiv.org/pdf/1803.10548v3.pdf | |
PWC | https://paperswithcode.com/paper/fpga-implementations-of-3d-simd-processor |
Repo | |
Framework | |
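A small sketch of what a relative-indexed compressed sparse encoding looks like: the nonzero weights of a flattened filter are stored as (offset-from-previous-nonzero, value) pairs, which keeps indices small and memory access sequential. The exact field widths and the filter stacking of the paper's CSF format are not modelled here.

```python
import numpy as np

def encode_relative_sparse(filter_weights):
    """Encode nonzeros of a flattened filter as (relative index, value) pairs."""
    flat = filter_weights.ravel()
    pairs, last = [], -1
    for i in np.flatnonzero(flat):
        pairs.append((int(i - last), float(flat[i])))    # offset from previous nonzero
        last = i
    return pairs

def decode_relative_sparse(pairs, shape):
    flat = np.zeros(int(np.prod(shape)))
    pos = -1
    for offset, value in pairs:
        pos += offset
        flat[pos] = value
    return flat.reshape(shape)

w = np.array([[0.0, 0.5, 0.0], [0.0, 0.0, -1.2], [0.3, 0.0, 0.0]])
enc = encode_relative_sparse(w)
print(enc)                                               # [(2, 0.5), (4, -1.2), (1, 0.3)]
assert np.allclose(decode_relative_sparse(enc, w.shape), w)
```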
Learning Low Precision Deep Neural Networks through Regularization
Title | Learning Low Precision Deep Neural Networks through Regularization |
Authors | Yoojin Choi, Mostafa El-Khamy, Jungwon Lee |
Abstract | We consider the quantization of deep neural networks (DNNs) to produce low-precision models for efficient inference of fixed-point operations. Compared to previous approaches to training quantized DNNs directly under the constraints of low-precision weights and activations, we learn the quantization of DNNs with minimal quantization loss through regularization. In particular, we introduce the learnable regularization coefficient to find accurate low-precision models efficiently in training. In our experiments, the proposed scheme yields the state-of-the-art low-precision models of AlexNet and ResNet-18, which have better accuracy than their previously available low-precision models. We also examine our quantization method to produce low-precision DNNs for image super resolution. We observe only $0.5$~dB peak signal-to-noise ratio (PSNR) loss when using binary weights and 8-bit activations. The proposed scheme can be used to train low-precision models from scratch or to fine-tune a well-trained high-precision model to converge to a low-precision model. Finally, we discuss how a similar regularization method can be adopted in DNN weight pruning and compression, and show that $401\times$ compression is achieved for LeNet-5. |
Tasks | Image Super-Resolution, Quantization, Super-Resolution |
Published | 2018-09-01 |
URL | http://arxiv.org/abs/1809.00095v1 |
http://arxiv.org/pdf/1809.00095v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-low-precision-deep-neural-networks |
Repo | |
Framework | |
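A sketch of the regularization idea described above: add a penalty pulling each weight towards its nearest quantization level, with a learnable coefficient so the penalty can tighten as training proceeds. The uniform quantizer, the `-log(alpha)` term, and the optimizer settings below are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

def quantize(w, n_bits=2):
    """Uniform symmetric quantizer: snap weights to the nearest level."""
    levels = 2 ** (n_bits - 1) - 1                  # e.g. 2 bits -> {-1, 0, +1}
    scale = w.abs().max() / max(levels, 1)
    return torch.round(w / scale).clamp(-levels, levels) * scale

model = nn.Linear(16, 4)
log_alpha = torch.zeros(1, requires_grad=True)      # learnable regularization coefficient
opt = torch.optim.SGD(list(model.parameters()) + [log_alpha], lr=0.01)

x, y = torch.randn(32, 16), torch.randn(32, 4)
for _ in range(100):
    opt.zero_grad()
    task_loss = nn.functional.mse_loss(model(x), y)
    quant_loss = sum(((p - quantize(p.detach())) ** 2).sum() for p in model.parameters())
    # Larger alpha enforces quantization harder; -log_alpha keeps it from collapsing to 0,
    # so alpha grows as the quantization error shrinks.
    loss = task_loss + log_alpha.exp() * quant_loss - log_alpha
    loss.backward()
    opt.step()
```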
Can automated smoothing significantly improve benchmark time series classification algorithms?
Title | Can automated smoothing significantly improve benchmark time series classification algorithms? |
Authors | James Large, Paul Southam, Anthony Bagnall |
Abstract | tl;dr: no, it cannot, at least not on average on the standard archive problems. We assess whether using six smoothing algorithms (moving average, exponential smoothing, Gaussian filter, Savitzky-Golay filter, Fourier approximation and a recursive median sieve) could be automatically applied to time series classification problems as a preprocessing step to improve the performance of three benchmark classifiers (1-Nearest Neighbour with Euclidean and Dynamic Time Warping distances, and Rotation Forest). We found no significant improvement over unsmoothed data even when we set the smoothing parameter through cross validation. We are not claiming smoothing has no worth. It has an important role in exploratory analysis and helps with specific classification problems where domain knowledge can be exploited. What we observe is that the automatic application does not help and that we cannot explain the improvement of other time series classification algorithms over the baseline classifiers simply as a function of the absence of smoothing. |
Tasks | Time Series, Time Series Classification |
Published | 2018-11-01 |
URL | http://arxiv.org/abs/1811.00894v1 |
http://arxiv.org/pdf/1811.00894v1.pdf | |
PWC | https://paperswithcode.com/paper/can-automated-smoothing-significantly-improve |
Repo | |
Framework | |
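A sketch of the experimental setup described above: smooth each series with a moving average whose window is chosen on held-out training data (a stand-in for the cross-validation in the paper), then classify with 1-nearest-neighbour under Euclidean distance. The synthetic data, candidate windows, and split sizes are illustrative; the paper uses the standard archive problems, more smoothers, and more classifiers.

```python
import numpy as np

def moving_average(series, window):
    """Centred moving-average smoothing of a single series."""
    if window <= 1:
        return series
    return np.convolve(series, np.ones(window) / window, mode="same")

def one_nn_accuracy(train_x, train_y, test_x, test_y):
    """1-nearest-neighbour classification accuracy under Euclidean distance."""
    preds = [train_y[np.argmin(np.linalg.norm(train_x - x, axis=1))] for x in test_x]
    return float(np.mean(np.array(preds) == test_y))

rng = np.random.default_rng(1)
n, length = 60, 100
labels = rng.integers(0, 2, n)
base = np.sin(np.linspace(0, 6, length))
data = np.where(labels[:, None] == 0, base, np.roll(base, 10)) + rng.normal(scale=0.8, size=(n, length))
train, test = slice(0, 40), slice(40, n)

def validation_accuracy(window):
    """Hold out the last 10 training series to score a candidate smoothing window."""
    sm = np.apply_along_axis(moving_average, 1, data[train], window)
    return one_nn_accuracy(sm[:30], labels[train][:30], sm[30:], labels[train][30:])

best_w = max([1, 3, 5, 9, 15], key=validation_accuracy)
smoothed = np.apply_along_axis(moving_average, 1, data, best_w)
print("window:", best_w, "test accuracy:",
      one_nn_accuracy(smoothed[train], labels[train], smoothed[test], labels[test]))
```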
A Sensorimotor Perspective on Grounding the Semantic of Simple Visual Features
Title | A Sensorimotor Perspective on Grounding the Semantic of Simple Visual Features |
Authors | Alban Laflaquière |
Abstract | In Machine Learning and Robotics, the semantic content of visual features is usually provided to the system by a human who interprets its content. On the contrary, strictly unsupervised approaches have difficulty relating the statistics of sensory inputs to their semantic content without also relying on prior knowledge introduced into the system. In this paper, we propose to tackle this problem from a sensorimotor perspective. In line with the Sensorimotor Contingencies Theory, we make the fundamental assumption that the semantic content of sensory inputs at least partially stems from the way an agent can actively transform them. We illustrate our approach by formalizing how simple visual features can induce invariants in a naive agent's sensorimotor experience, and evaluate it on a simple simulated visual system. Without any a priori knowledge about the way its sensorimotor information is encoded, we show how an agent can characterize the uniformity and edge-ness of the visual features it interacts with. |
Tasks | |
Published | 2018-05-11 |
URL | http://arxiv.org/abs/1805.04396v1 |
http://arxiv.org/pdf/1805.04396v1.pdf | |
PWC | https://paperswithcode.com/paper/a-sensorimotor-perspective-on-grounding-the |
Repo | |
Framework | |
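As a very rough illustration of the "invariants in sensorimotor experience" idea (a toy, not the paper's formalism or simulated visual system): a 1D retina sweeping over a uniform patch sees the same sensory state for every motor action, whereas an edge breaks that invariance, so counting distinct sensory states under motion already distinguishes uniformity from edge-ness.

```python
import numpy as np

def sensory_states_under_motion(scene, positions, window=3):
    """Collect the distinct readings a 1D retina of given width gets across motor shifts."""
    return {tuple(scene[p:p + window]) for p in positions}

uniform_patch = np.ones(12)                              # a uniform visual feature
edge_patch = np.concatenate([np.zeros(6), np.ones(6)])   # an edge

motor_positions = range(0, 9)                            # the agent's possible retina shifts
print(len(sensory_states_under_motion(uniform_patch, motor_positions)))  # 1: invariant under motion
print(len(sensory_states_under_motion(edge_patch, motor_positions)))     # >1: the edge breaks invariance
```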
Eye movement simulation and detector creation to reduce laborious parameter adjustments
Title | Eye movement simulation and detector creation to reduce laborious parameter adjustments |
Authors | Wolfgang Fuhl, Thiago Santini, Thomas Kuebler, Nora Castner, Wolfgang Rosenstiel, Enkelejda Kasneci |
Abstract | Eye movements hold information about human perception, intention and cognitive state. Various algorithms have been proposed to identify and distinguish eye movements, particularly fixations, saccades, and smooth pursuits. A major drawback of existing algorithms is that they rely on accurate and constant sampling rates, impeding straightforward adaptation to new movements such as micro saccades. We propose a novel eye movement simulator that i) probabilistically simulates saccade movements as gamma distributions considering different peak velocities and ii) models smooth pursuit onsets with the sigmoid function. This simulator is combined with a machine learning approach to create detectors for general and specific velocity profiles. Additionally, our approach is capable of using any sampling rate, even with fluctuations. The machine learning approach consists of different binary patterns combined using conditional distributions. The simulation is evaluated against publicly available real data using a squared error, and the detectors are evaluated against state-of-the-art algorithms. |
Tasks | |
Published | 2018-03-28 |
URL | http://arxiv.org/abs/1804.00970v1 |
http://arxiv.org/pdf/1804.00970v1.pdf | |
PWC | https://paperswithcode.com/paper/eye-movement-simulation-and-detector-creation |
Repo | |
Framework | |
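A sketch of the two simulator ingredients named above: a saccade velocity profile drawn as a gamma-shaped curve scaled to a chosen peak velocity, and a smooth-pursuit onset modelled with a sigmoid. All parameter values (sampling rate, peak velocity, latency, steepness) are placeholders, not values fitted to the data used in the paper.

```python
import numpy as np
from scipy.stats import gamma

fs = 1000.0                                    # sampling rate in Hz
t = np.arange(0, 0.08, 1.0 / fs)               # 80 ms window

def saccade_velocity(t, peak_velocity=400.0, shape=3.0, scale=0.01):
    """Gamma-shaped velocity profile, rescaled so its maximum equals peak_velocity."""
    profile = gamma.pdf(t, a=shape, scale=scale)
    return peak_velocity * profile / profile.max()

def pursuit_onset(t, target_velocity=20.0, latency=0.02, steepness=200.0):
    """Sigmoidal ramp from 0 deg/s up to the pursuit target velocity."""
    return target_velocity / (1.0 + np.exp(-steepness * (t - latency)))

print(saccade_velocity(t).max())               # ~400 deg/s peak
print(pursuit_onset(t)[[0, -1]])               # near 0 at onset, ~20 deg/s at the end
```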
Experimental Design for Cost-Aware Learning of Causal Graphs
Title | Experimental Design for Cost-Aware Learning of Causal Graphs |
Authors | Erik M. Lindgren, Murat Kocaoglu, Alexandros G. Dimakis, Sriram Vishwanath |
Abstract | We consider the minimum cost intervention design problem: Given the essential graph of a causal graph and a cost to intervene on a variable, identify the set of interventions with minimum total cost that can learn any causal graph with the given essential graph. We first show that this problem is NP-hard. We then prove that we can achieve a constant factor approximation to this problem with a greedy algorithm. We then constrain the sparsity of each intervention. We develop an algorithm that returns an intervention design that is nearly optimal in terms of size for sparse graphs with sparse interventions and we discuss how to use it when there are costs on the vertices. |
Tasks | |
Published | 2018-10-28 |
URL | http://arxiv.org/abs/1810.11867v1 |
http://arxiv.org/pdf/1810.11867v1.pdf | |
PWC | https://paperswithcode.com/paper/experimental-design-for-cost-aware-learning |
Repo | |
Framework | |
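As a heavily simplified illustration of the greedy flavour of the approach (the actual algorithm operates on the essential graph and carries the approximation guarantee proved in the paper): interventions are picked by cost-effectiveness until every pair of variables that still needs to be distinguished is separated by some chosen intervention, i.e. some intervention contains exactly one endpoint of the pair. Variable names, costs, and the size-2 candidate cap are toy assumptions.

```python
from itertools import combinations

variables = ["A", "B", "C", "D"]
cost = {"A": 1.0, "B": 3.0, "C": 1.0, "D": 2.0}       # per-variable intervention cost

# Pairs that still need to be separated by some intervention.
uncovered = set(combinations(variables, 2))

candidates = [frozenset(s) for r in range(1, 3)        # candidate interventions of size <= 2
              for s in combinations(variables, r)]

chosen = []
while uncovered:
    def gain(s):
        """Newly separated pairs per unit cost for candidate intervention s."""
        covered = {p for p in uncovered if len(s & set(p)) == 1}
        return len(covered) / sum(cost[v] for v in s)
    best = max(candidates, key=gain)
    chosen.append(sorted(best))
    uncovered = {p for p in uncovered if len(best & set(p)) != 1}

print(chosen)
```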