Paper Group ANR 198
Patch-Ordering as a Regularization for Inverse Problems in Image Processing
Title | Patch-Ordering as a Regularization for Inverse Problems in Image Processing |
Authors | Gregory Vaksman, Michael Zibulevsky, Michael Elad |
Abstract | Recent work in image processing suggests that operating on (overlapping) patches in an image may lead to state-of-the-art results. This has been demonstrated for a variety of problems including denoising, inpainting, deblurring, and super-resolution. The work reported in [1,2] takes an extra step forward by showing that ordering these patches to form an approximate shortest path can be leveraged for better processing. The core idea is to apply a simple filter on the resulting 1D smoothed signal obtained after the patch-permutation. This idea has also been explored in combination with a wavelet pyramid, leading eventually to a sophisticated and highly effective regularizer for inverse problems in imaging. In this work we further study the patch-permutation concept, and harness it to propose a new simple yet effective regularization for image restoration problems. Our approach builds on the classic Maximum A Posteriori probability (MAP), with a penalty function consisting of a regular log-likelihood term and a novel permutation-based regularization term. Using a plain 1D Laplacian, the proposed regularization forces robust smoothness (L1) on the permuted pixels. Since the permutation originates from patch-ordering, we propose to accumulate the smoothness terms over all the patches’ pixels. Furthermore, we take into account the found distances between adjacent patches in the ordering, by weighting the Laplacian outcome. We demonstrate the proposed scheme on a diverse set of problems: (i) severe Poisson image denoising, (ii) Gaussian image denoising, (iii) image deblurring, and (iv) single image super-resolution. In all these cases, we use recent methods that handle these problems as initialization to our scheme. This is followed by an L-BFGS optimization of the above-described penalty function, leading to state-of-the-art results, and especially so for highly ill-posed cases. |
Tasks | Deblurring, Denoising, Image Denoising, Image Restoration, Image Super-Resolution, Super-Resolution |
Published | 2016-02-26 |
URL | http://arxiv.org/abs/1602.08510v1 |
http://arxiv.org/pdf/1602.08510v1.pdf | |
PWC | https://paperswithcode.com/paper/patch-ordering-as-a-regularization-for |
Repo | |
Framework | |
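The regularization term described in the abstract above (an L1 penalty on a weighted 1D Laplacian of the permuted pixels) can be illustrated with a minimal sketch. The function name `permuted_laplacian_l1`, the toy image, and the two orderings are hypothetical; the paper's accumulation over all patch pixels and its distance-based weights are not reproduced here, and uniform weights are assumed by default.

```python
import numpy as np

def permuted_laplacian_l1(image, order, weights=None):
    """Weighted L1 norm of a plain 1D Laplacian applied to permuted pixels.

    `order` is a permutation of the flattened pixel indices (assumed given;
    in the paper it would come from patch ordering).
    """
    signal = image.ravel()[order]  # pixels reordered along the 1D path
    # 1D Laplacian [-1, 2, -1] evaluated at the interior samples only
    lap = 2.0 * signal[1:-1] - signal[:-2] - signal[2:]
    if weights is None:
        weights = np.ones_like(lap)  # assumption: uniform weights
    return float(np.sum(weights * np.abs(lap)))

image = np.array([[0.0, 1.0], [2.0, 3.0]])
identity = np.arange(image.size)      # ordering that yields a smooth 1D signal
shuffled = np.array([0, 3, 1, 2])     # ordering that yields a rough 1D signal
smooth_penalty = permuted_laplacian_l1(image, identity)
rough_penalty = permuted_laplacian_l1(image, shuffled)
```

An ordering that places similar pixels next to each other gives a smaller penalty than one that does not, which is the intuition behind using the permutation as a regularizer.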
Tensor Decomposition via Variational Auto-Encoder
Title | Tensor Decomposition via Variational Auto-Encoder |
Authors | Bin Liu, Zenglin Xu, Yingming Li |
Abstract | Tensor decomposition is an important technique for capturing the high-order interactions among multiway data. Multi-linear tensor decomposition methods, such as the Tucker decomposition and the CANDECOMP/PARAFAC (CP), assume that the complex interactions among objects are multi-linear, and are thus insufficient to represent nonlinear relationships in data. Another assumption of these methods is that a predefined rank should be known. However, the rank of tensors is hard to estimate, especially for cases with missing values. To address these issues, we design a Bayesian generative model for tensor decomposition. Unlike traditional Bayesian methods, the high-order interactions of tensor entries are modeled with a variational auto-encoder. The proposed model takes advantage of Neural Networks and nonparametric Bayesian models, by replacing the multi-linear product in traditional Bayesian tensor decomposition with a complex nonlinear function (via Neural Networks) whose parameters can be learned from data. Experimental results on synthetic data and real-world chemometrics tensor data have demonstrated that our new model can achieve significantly higher prediction performance than the state-of-the-art tensor decomposition approaches. |
Tasks | |
Published | 2016-11-03 |
URL | http://arxiv.org/abs/1611.00866v1 |
http://arxiv.org/pdf/1611.00866v1.pdf | |
PWC | https://paperswithcode.com/paper/tensor-decomposition-via-variational-auto |
Repo | |
Framework | |
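For context, the multi-linear product that the abstract above proposes to replace can be written out for the CP model, where each entry is x[i, j, k] = sum over r of A[i, r] * B[j, r] * C[k, r]. The sketch below shows only this standard CP reconstruction, not the paper's variational auto-encoder; the factor shapes, the rank, and the name `cp_reconstruct` are illustrative assumptions.

```python
import numpy as np

def cp_reconstruct(A, B, C):
    """Rebuild a 3-way tensor from CP factors sharing the same rank.

    Entry (i, j, k) is the sum over r of A[i, r] * B[j, r] * C[k, r].
    """
    return np.einsum("ir,jr,kr->ijk", A, B, C)

rng = np.random.default_rng(0)
rank = 2  # assumed known in advance, which is the limitation the abstract notes
A = rng.normal(size=(3, rank))
B = rng.normal(size=(4, rank))
C = rng.normal(size=(5, rank))
X = cp_reconstruct(A, B, C)
```

Every entry of `X` is multi-linear in the factor rows, which is why such models cannot capture nonlinear interactions without a further nonlinear map.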
Fast Bilateral Filtering of Vector-Valued Images
Title | Fast Bilateral Filtering of Vector-Valued Images |
Authors | Sanjay Ghosh, Kunal N. Chaudhury |
Abstract | In this paper, we consider a natural extension of the edge-preserving bilateral filter for vector-valued images. The direct computation of this non-linear filter is slow in practice. We demonstrate how a fast algorithm can be obtained by first approximating the Gaussian kernel of the bilateral filter using raised-cosines, and then using Monte Carlo sampling. We present simulation results on color images to demonstrate the accuracy of the algorithm and the speedup over the direct implementation. |
Tasks | |
Published | 2016-05-07 |
URL | http://arxiv.org/abs/1605.02164v1 |
http://arxiv.org/pdf/1605.02164v1.pdf | |
PWC | https://paperswithcode.com/paper/fast-bilateral-filtering-of-vector-valued |
Repo | |
Framework | |
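The slow direct computation mentioned in the abstract above can be sketched as follows. This is a brute-force baseline under assumed conventions (Gaussian spatial and range kernels, with the range kernel applied to the squared Euclidean distance between pixel vectors); it is not the paper's fast raised-cosine and Monte Carlo algorithm. The function name and default parameters are hypothetical.

```python
import numpy as np

def bilateral_filter_direct(img, radius=2, sigma_s=2.0, sigma_r=0.1):
    """Brute-force bilateral filter for an (H, W, C) vector-valued image."""
    h, w, _ = img.shape
    out = np.zeros_like(img, dtype=float)
    for y in range(h):
        for x in range(w):
            # Clip the window at the image border
            y0, y1 = max(0, y - radius), min(h, y + radius + 1)
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            window = img[y0:y1, x0:x1].astype(float)
            yy, xx = np.mgrid[y0:y1, x0:x1]
            # Spatial Gaussian on pixel distance
            spatial = np.exp(-((yy - y) ** 2 + (xx - x) ** 2) / (2 * sigma_s ** 2))
            # Range Gaussian on the distance between pixel vectors
            diff = window - img[y, x]
            range_w = np.exp(-np.sum(diff ** 2, axis=-1) / (2 * sigma_r ** 2))
            weights = spatial * range_w
            out[y, x] = np.sum(weights[..., None] * window, axis=(0, 1)) / np.sum(weights)
    return out

flat = np.full((4, 4, 3), 0.5)
filtered = bilateral_filter_direct(flat)
```

The cost grows with the window area for every pixel, which is the overhead a fast approximation aims to avoid.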
Parallelizing Word2Vec in Shared and Distributed Memory
Title | Parallelizing Word2Vec in Shared and Distributed Memory |
Authors | Shihao Ji, Nadathur Satish, Sheng Li, Pradeep Dubey |
Abstract | Word2Vec is a widely used algorithm for extracting low-dimensional vector representations of words. It has recently generated considerable excitement in the machine learning and natural language processing (NLP) communities due to its exceptional performance in many NLP applications such as named entity recognition, sentiment analysis, machine translation and question answering. State-of-the-art algorithms including those by Mikolov et al. have been parallelized for multi-core CPU architectures but are based on vector-vector operations that are memory-bandwidth intensive and do not efficiently use computational resources. In this paper, we improve reuse of various data structures in the algorithm through the use of minibatching, hence allowing us to express the problem using matrix multiply operations. We also explore different techniques to distribute word2vec computation across nodes in a compute cluster, and demonstrate good strong scalability up to 32 nodes. In combination, these techniques allow us to scale up the computation near linearly across cores and nodes, and process hundreds of millions of words per second, which is the fastest word2vec implementation to the best of our knowledge. |
Tasks | Machine Translation, Named Entity Recognition, Question Answering, Sentiment Analysis |
Published | 2016-04-15 |
URL | http://arxiv.org/abs/1604.04661v2 |
http://arxiv.org/pdf/1604.04661v2.pdf | |
PWC | https://paperswithcode.com/paper/parallelizing-word2vec-in-shared-and |
Repo | |
Framework | |
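The shift from vector-vector operations to matrix multiplies described in the abstract above can be illustrated with a small sketch. It assumes, for illustration only, that a minibatch of input word vectors is scored against one shared set of target vectors; the array names and sizes are hypothetical and the paper's actual batching scheme may differ.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n_inputs, n_targets = 8, 4, 6
inputs = rng.normal(size=(n_inputs, dim))    # input word vectors in one minibatch
targets = rng.normal(size=(n_targets, dim))  # target vectors shared by the batch (assumption)

# Vector-vector formulation: one dot product per (input, target) pair
scores_loop = np.empty((n_inputs, n_targets))
for i in range(n_inputs):
    for j in range(n_targets):
        scores_loop[i, j] = np.dot(inputs[i], targets[j])

# Matrix-matrix formulation: the same scores from a single multiply
scores_gemm = inputs @ targets.T
```

Both formulations give the same scores, but the single multiply reuses each loaded vector across many products, which is what reduces pressure on memory bandwidth.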
Audio Word2Vec: Unsupervised Learning of Audio Segment Representations using Sequence-to-sequence Autoencoder
Title | Audio Word2Vec: Unsupervised Learning of Audio Segment Representations using Sequence-to-sequence Autoencoder |
Authors | Yu-An Chung, Chao-Chung Wu, Chia-Hao Shen, Hung-Yi Lee, Lin-Shan Lee |
Abstract | The vector representations of fixed dimensionality for words (in text) offered by Word2Vec have been shown to be very useful in many application scenarios, in particular due to the semantic information they carry. This paper proposes a parallel version, the Audio Word2Vec. It offers the vector representations of fixed dimensionality for variable-length audio segments. These vector representations are shown to describe the sequential phonetic structures of the audio segments to a good degree, with very attractive real world applications such as query-by-example Spoken Term Detection (STD). In this STD application, the proposed approach significantly outperformed the conventional Dynamic Time Warping (DTW) based approaches at significantly lower computation requirements. We propose unsupervised learning of Audio Word2Vec from audio data without human annotation using Sequence-to-sequence Autoencoder (SA). SA consists of two RNNs equipped with Long Short-Term Memory (LSTM) units: the first RNN (encoder) maps the input audio sequence into a vector representation of fixed dimensionality, and the second RNN (decoder) maps the representation back to the input audio sequence. The two RNNs are jointly trained by minimizing the reconstruction error. Denoising Sequence-to-sequence Autoencoder (DSA) is further proposed, offering more robust learning. |
Tasks | Denoising |
Published | 2016-03-03 |
URL | http://arxiv.org/abs/1603.00982v4 |
http://arxiv.org/pdf/1603.00982v4.pdf | |
PWC | https://paperswithcode.com/paper/audio-word2vec-unsupervised-learning-of-audio |
Repo | |
Framework | |
Two-stage Convolutional Part Heatmap Regression for the 1st 3D Face Alignment in the Wild (3DFAW) Challenge
Title | Two-stage Convolutional Part Heatmap Regression for the 1st 3D Face Alignment in the Wild (3DFAW) Challenge |
Authors | Adrian Bulat, Georgios Tzimiropoulos |
Abstract | This paper describes our submission to the 1st 3D Face Alignment in the Wild (3DFAW) Challenge. Our method builds upon the idea of convolutional part heatmap regression [1], extending it for 3D face alignment. Our method decomposes the problem into two parts: (a) X,Y (2D) estimation and (b) Z (depth) estimation. At the first stage, our method estimates the X,Y coordinates of the facial landmarks by producing a set of 2D heatmaps, one for each landmark, using convolutional part heatmap regression. Then, these heatmaps, alongside the input RGB image, are used as input to a very deep subnetwork trained via residual learning for regressing the Z coordinate. Our method ranked 1st in the 3DFAW Challenge, surpassing the second best result by more than 22%. |
Tasks | Depth Estimation, Face Alignment |
Published | 2016-09-29 |
URL | http://arxiv.org/abs/1609.09545v1 |
http://arxiv.org/pdf/1609.09545v1.pdf | |
PWC | https://paperswithcode.com/paper/two-stage-convolutional-part-heatmap |
Repo | |
Framework | |
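The first stage in the abstract above produces one 2D heatmap per landmark. A common way to read X,Y coordinates from such heatmaps is to take the location of each peak; the sketch below assumes that convention, which the abstract does not spell out. The function name `heatmaps_to_xy` and the toy heatmaps are hypothetical.

```python
import numpy as np

def heatmaps_to_xy(heatmaps):
    """Return the (x, y) peak location of each heatmap.

    `heatmaps` has shape (num_landmarks, height, width).
    """
    n, h, w = heatmaps.shape
    flat_idx = heatmaps.reshape(n, -1).argmax(axis=1)  # peak index per landmark
    ys, xs = np.unravel_index(flat_idx, (h, w))        # back to row/column
    return np.stack([xs, ys], axis=1)

heatmaps = np.zeros((2, 5, 6))
heatmaps[0, 1, 4] = 1.0  # landmark 0 peaks at x=4, y=1
heatmaps[1, 3, 2] = 1.0  # landmark 1 peaks at x=2, y=3
coords = heatmaps_to_xy(heatmaps)
```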
Recurrent Fully Convolutional Networks for Video Segmentation
Title | Recurrent Fully Convolutional Networks for Video Segmentation |
Authors | Sepehr Valipour, Mennatullah Siam, Martin Jagersand, Nilanjan Ray |
Abstract | Image segmentation is an important step in most visual tasks. While convolutional neural networks have been shown to perform well on single image segmentation, to our knowledge, no study has been done on leveraging recurrent gated architectures for video segmentation. Accordingly, we propose a novel method for online segmentation of video sequences that incorporates temporal data. The network is built from a fully convolutional element and a recurrent unit that work on a sliding window over the temporal data. We also introduce a novel convolutional gated recurrent unit that preserves the spatial information and reduces the parameters learned. Our method has the advantage that it can work in an online fashion instead of operating over the whole input batch of video frames. The network is tested on the change detection dataset, and is shown to have a 5.5% improvement in F-measure over a plain fully convolutional network for per-frame segmentation. It was also shown to have an improvement of 1.4% in F-measure compared to our baseline network that we call FCN 12s. |
Tasks | Semantic Segmentation, Video Semantic Segmentation |
Published | 2016-06-01 |
URL | http://arxiv.org/abs/1606.00487v3 |
http://arxiv.org/pdf/1606.00487v3.pdf | |
PWC | https://paperswithcode.com/paper/recurrent-fully-convolutional-networks-for |
Repo | |
Framework | |
Conditional Generation and Snapshot Learning in Neural Dialogue Systems
Title | Conditional Generation and Snapshot Learning in Neural Dialogue Systems |
Authors | Tsung-Hsien Wen, Milica Gasic, Nikola Mrksic, Lina M. Rojas-Barahona, Pei-Hao Su, Stefan Ultes, David Vandyke, Steve Young |
Abstract | Recently a variety of LSTM-based conditional language models (LM) have been applied across a range of language generation tasks. In this work we study various model architectures and different ways to represent and aggregate the source information in an end-to-end neural dialogue system framework. A method called snapshot learning is also proposed to facilitate learning from supervised sequential signals by applying a companion cross-entropy objective function to the conditioning vector. The experimental and analytical results demonstrate firstly that competition occurs between the conditioning vector and the LM, and the differing architectures provide different trade-offs between the two. Secondly, the discriminative power and transparency of the conditioning vector is key to providing both model interpretability and better performance. Thirdly, snapshot learning leads to consistent performance improvements independent of which architecture is used. |
Tasks | Text Generation |
Published | 2016-06-10 |
URL | http://arxiv.org/abs/1606.03352v1 |
http://arxiv.org/pdf/1606.03352v1.pdf | |
PWC | https://paperswithcode.com/paper/conditional-generation-and-snapshot-learning |
Repo | |
Framework | |
Efficient Distributed Estimation of Inverse Covariance Matrices
Title | Efficient Distributed Estimation of Inverse Covariance Matrices |
Authors | Jesús Arroyo, Elizabeth Hou |
Abstract | In distributed systems, communication is a major concern due to issues such as its vulnerability or efficiency. In this paper, we are interested in estimating sparse inverse covariance matrices when samples are distributed into different machines. We address communication efficiency by proposing a method where, in a single round of communication, each machine transfers a small subset of the entries of the inverse covariance matrix. We show that, with this efficient distributed method, the error rates can be comparable with estimation in a non-distributed setting, and correct model selection is still possible. Practical performance is shown through simulations. |
Tasks | Model Selection |
Published | 2016-05-03 |
URL | http://arxiv.org/abs/1605.00758v1 |
http://arxiv.org/pdf/1605.00758v1.pdf | |
PWC | https://paperswithcode.com/paper/efficient-distributed-estimation-of-inverse |
Repo | |
Framework | |
Gland Instance Segmentation by Deep Multichannel Side Supervision
Title | Gland Instance Segmentation by Deep Multichannel Side Supervision |
Authors | Yan Xu, Yang Li, Mingyuan Liu, Yipei Wang, Maode Lai, Eric I-Chao Chang |
Abstract | In this paper, we propose a new image instance segmentation method that segments individual glands (instances) in colon histology images. This is a task called instance segmentation that has recently become increasingly important. The problem is challenging since not only do the glands need to be segmented from the complex background, they are also required to be individually identified. Here we leverage the idea of image-to-image prediction in recent deep learning by building a framework that automatically exploits and fuses complex multichannel information, regional and boundary patterns, with side supervision (deep supervision on side responses) in gland histology images. Our proposed system, deep multichannel side supervision (DMCS), alleviates heavy feature design due to the use of convolutional neural networks guided by side supervision. Compared to methods reported in the 2015 MICCAI Gland Segmentation Challenge, we observe state-of-the-art results based on a number of evaluation metrics. |
Tasks | Instance Segmentation, Semantic Segmentation |
Published | 2016-07-12 |
URL | http://arxiv.org/abs/1607.03222v2 |
http://arxiv.org/pdf/1607.03222v2.pdf | |
PWC | https://paperswithcode.com/paper/gland-instance-segmentation-by-deep |
Repo | |
Framework | |
Network of Bandits insure Privacy of end-users
Title | Network of Bandits insure Privacy of end-users |
Authors | Raphaël Féraud |
Abstract | In order to distribute the best arm identification task as close as possible to the user’s devices, on the edge of the Radio Access Network, we propose a new problem setting, where distributed players collaborate to find the best arm. This architecture guarantees privacy to end-users since no events are stored. The only thing that can be observed by an adversary through the core network is aggregated information across users. We provide a first algorithm, Distributed Median Elimination, which is optimal in terms of number of transmitted bits and near optimal in terms of speed-up factor with respect to an optimal algorithm run independently on each player. In practice, this first algorithm cannot handle the trade-off between the communication cost and the speed-up factor, and requires some knowledge about the distribution of players. Extended Distributed Median Elimination overcomes these limitations, by playing in parallel different instances of Distributed Median Elimination and selecting the best one. Experiments illustrate and complete the analysis. According to the analysis, in comparison to Median Elimination performed on each player, the proposed algorithm shows significant practical improvements. |
Tasks | |
Published | 2016-02-11 |
URL | http://arxiv.org/abs/1602.03779v14 |
http://arxiv.org/pdf/1602.03779v14.pdf | |
PWC | https://paperswithcode.com/paper/network-of-bandits-insure-privacy-of-end |
Repo | |
Framework | |
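The baseline named in the abstract above, Median Elimination run on a single player, can be sketched as follows. This is the classic single-player procedure as commonly stated, not the distributed variants the paper proposes. The `pull` callback, the default `eps` and `delta`, and the choice to keep the top half by empirical mean (a tie-safe stand-in for dropping arms below the median) are assumptions made for illustration.

```python
import math
import random

def median_elimination(pull, n_arms, eps=0.5, delta=0.1):
    """Return an arm that is eps-optimal with probability at least 1 - delta.

    `pull(arm)` must return a reward in [0, 1].
    """
    arms = list(range(n_arms))
    eps_l, delta_l = eps / 4.0, delta / 2.0
    while len(arms) > 1:
        # Sample every surviving arm the same number of times
        n_pulls = math.ceil(math.log(3.0 / delta_l) / (eps_l / 2.0) ** 2)
        means = {a: sum(pull(a) for _ in range(n_pulls)) / n_pulls for a in arms}
        # Keep the better half (assumption: tie-safe variant of the median cut)
        arms.sort(key=lambda a: means[a], reverse=True)
        arms = arms[: math.ceil(len(arms) / 2)]
        eps_l, delta_l = 0.75 * eps_l, delta_l / 2.0
    return arms[0]

probs = [0.0, 0.0, 1.0, 0.0]  # toy Bernoulli arms with degenerate means
rng = random.Random(0)
best = median_elimination(lambda a: 1.0 if rng.random() < probs[a] else 0.0, len(probs))
```

Each round halves the set of candidate arms, so the number of rounds is logarithmic in the number of arms.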
Automated timetabling for small colleges and high schools using huge integer programs
Title | Automated timetabling for small colleges and high schools using huge integer programs |
Authors | Joshua S. Friedman |
Abstract | We formulate an integer program to solve a highly constrained academic timetabling problem at the United States Merchant Marine Academy. The IP instance that results from our real case study has approximately 170,000 rows and 170,000 columns and solves to optimality in 4–24 hours using a commercial solver on a portable computer (near optimal feasible solutions were often found in 4–12 hours). Our model is applicable to both high schools and small colleges that wish to deviate from group scheduling. We also solve a necessary preprocessing student subgrouping problem, which breaks up big groups of students into small groups so they can optimally fit into small capacity classes. |
Tasks | |
Published | 2016-12-28 |
URL | http://arxiv.org/abs/1612.08777v2 |
http://arxiv.org/pdf/1612.08777v2.pdf | |
PWC | https://paperswithcode.com/paper/automated-timetabling-for-small-colleges-and |
Repo | |
Framework | |
Self-calibration-based Approach to Critical Motion Sequences of Rolling-shutter Structure from Motion
Title | Self-calibration-based Approach to Critical Motion Sequences of Rolling-shutter Structure from Motion |
Authors | Eisuke Ito, Takayuki Okatani |
Abstract | In this paper we consider critical motion sequences (CMSs) of rolling-shutter (RS) SfM. Employing an RS camera model with linearized pure rotation, we show that the RS distortion can be approximately expressed by two internal parameters of an “imaginary” camera plus a one-parameter nonlinear transformation similar to lens distortion. We then reformulate the problem as self-calibration of the imaginary camera, in which its skew and aspect ratio are unknown and varying in the image sequence. In the formulation, we derive a general representation of CMSs. We also show that our method can explain the CMS that was recently reported in the literature, and then present a new remedy to deal with the degeneracy. Our theoretical results agree well with experimental results; they explain degeneracies observed when we employ naive bundle adjustment, and how they are resolved by our method. |
Tasks | Calibration |
Published | 2016-11-16 |
URL | http://arxiv.org/abs/1611.05476v1 |
http://arxiv.org/pdf/1611.05476v1.pdf | |
PWC | https://paperswithcode.com/paper/self-calibration-based-approach-to-critical |
Repo | |
Framework | |
The red one!: On learning to refer to things based on their discriminative properties
Title | The red one!: On learning to refer to things based on their discriminative properties |
Authors | Angeliki Lazaridou, Nghia The Pham, Marco Baroni |
Abstract | As a first step towards agents learning to communicate about their visual environment, we propose a system that, given visual representations of a referent (cat) and a context (sofa), identifies their discriminative attributes, i.e., properties that distinguish them (has_tail). Moreover, despite the lack of direct supervision at the attribute level, the model learns to assign plausible attributes to objects (sofa-has_cushion). Finally, we present a preliminary experiment confirming the referential success of the predicted discriminative attributes. |
Tasks | |
Published | 2016-03-08 |
URL | http://arxiv.org/abs/1603.02618v2 |
http://arxiv.org/pdf/1603.02618v2.pdf | |
PWC | https://paperswithcode.com/paper/the-red-one-on-learning-to-refer-to-things-1 |
Repo | |
Framework | |
A Riemannian gossip approach to decentralized matrix completion
Title | A Riemannian gossip approach to decentralized matrix completion |
Authors | Bamdev Mishra, Hiroyuki Kasai, Atul Saroop |
Abstract | In this paper, we propose novel gossip algorithms for the low-rank decentralized matrix completion problem. The proposed approach is on the Riemannian Grassmann manifold that allows local matrix completion by different agents while achieving asymptotic consensus on the global low-rank factors. The resulting approach is scalable and parallelizable. Our numerical experiments show the good performance of the proposed algorithms on various benchmarks. |
Tasks | Matrix Completion |
Published | 2016-05-23 |
URL | http://arxiv.org/abs/1605.06968v1 |
http://arxiv.org/pdf/1605.06968v1.pdf | |
PWC | https://paperswithcode.com/paper/a-riemannian-gossip-approach-to-decentralized |
Repo | |
Framework | |