July 26, 2019

Paper Group ANR 789

Adapting Sequence Models for Sentence Correction. Scalable Relaxations of Sparse Packing Constraints: Optimal Biocontrol in Predator-Prey Network. A New Probabilistic Algorithm for Approximate Model Counting. Online Convolutional Dictionary Learning for Multimodal Imaging. Multi-Scale Video Frame-Synthesis Network with Transitive Consistency Loss. …

Adapting Sequence Models for Sentence Correction


Title	Adapting Sequence Models for Sentence Correction
Authors	Allen Schmaltz, Yoon Kim, Alexander M. Rush, Stuart M. Shieber
Abstract	In a controlled experiment of sequence-to-sequence approaches for the task of sentence correction, we find that character-based models are generally more effective than word-based models and models that encode subword information via convolutions, and that modeling the output data as a series of diffs improves effectiveness over standard approaches. Our strongest sequence-to-sequence model improves over our strongest phrase-based statistical machine translation model, with access to the same data, by 6 M2 (0.5 GLEU) points. Additionally, in the data environment of the standard CoNLL-2014 setup, we demonstrate that modeling (and tuning against) diffs yields similar or better M2 scores with simpler models and/or significantly less data than previous sequence-to-sequence approaches.
Tasks	Machine Translation
Published	2017-07-27
URL	http://arxiv.org/abs/1707.09067v1
PDF	http://arxiv.org/pdf/1707.09067v1.pdf
PWC	https://paperswithcode.com/paper/adapting-sequence-models-for-sentence
Repo
Framework
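
The abstract's idea of modeling the output as a series of diffs can be illustrated with a minimal sketch. The `<del>`/`<ins>` tag names, the token-level granularity, and the function names `to_diff_target` and `apply_diff` are assumptions made for illustration; the abstract does not specify the paper's exact diff format.

```python
import difflib


def to_diff_target(source_tokens, target_tokens):
    """Encode a corrected sentence as the source plus inline edit tags.

    Unchanged tokens are copied; changed spans are wrapped in
    <del>...</del> (removed from the source) and <ins>...</ins> (added).
    The tag names are hypothetical, chosen only for this sketch.
    """
    out = []
    matcher = difflib.SequenceMatcher(a=source_tokens, b=target_tokens, autojunk=False)
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "equal":
            out.extend(source_tokens[i1:i2])
            continue
        if i2 > i1:  # tokens deleted (or replaced) in the source
            out += ["<del>"] + source_tokens[i1:i2] + ["</del>"]
        if j2 > j1:  # tokens inserted (or substituted) in the target
            out += ["<ins>"] + target_tokens[j1:j2] + ["</ins>"]
    return out


def apply_diff(diff_tokens):
    """Recover the corrected sentence from a tagged diff sequence."""
    out, deleting = [], False
    for tok in diff_tokens:
        if tok == "<del>":
            deleting = True
        elif tok == "</del>":
            deleting = False
        elif tok in ("<ins>", "</ins>"):
            continue
        elif not deleting:
            out.append(tok)
    return out


source = "He go to school".split()
target = "He goes to school".split()
diff = to_diff_target(source, target)
print(" ".join(diff))
print(" ".join(apply_diff(diff)))
```

One appeal of such a representation is that most of the output is a copy of the input, so only the tagged spans carry the correction itself.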

Scalable Relaxations of Sparse Packing Constraints: Optimal Biocontrol in Predator-Prey Network


Title	Scalable Relaxations of Sparse Packing Constraints: Optimal Biocontrol in Predator-Prey Network
Authors	Johan Bjorck, Yiwei Bai, Xiaojian Wu, Yexiang Xue, Mark C. Whitmore, Carla Gomes
Abstract	Cascades represent rapid changes in networks. A cascading phenomenon of ecological and economic impact is the spread of invasive species in geographic landscapes. The most promising management strategy is often biocontrol, which entails introducing a natural predator able to control the invading population, a setting that can be treated as two interacting cascades of predator and prey populations. We formulate and study a nonlinear problem of optimal biocontrol: optimally seeding the predator cascade over time to minimize the harmful prey population. Recurring budgets, which conservation organizations typically face, naturally lead to sparse constraints that make the problem amenable to approximation algorithms. Available methods based on continuous relaxations scale poorly; to remedy this, we develop a novel and scalable randomized algorithm based on a width relaxation, applicable to a broad class of combinatorial optimization problems. We evaluate our contributions in the context of biocontrol for the insect pest Hemlock Woolly Adelgid (HWA) in eastern North America. Our algorithm outperforms competing methods in terms of scalability and solution quality, and finds near-optimal strategies for the control of the HWA for fine-grained networks – an important problem in computational sustainability.
Tasks	Combinatorial Optimization
Published	2017-11-18
URL	http://arxiv.org/abs/1711.06800v3
PDF	http://arxiv.org/pdf/1711.06800v3.pdf
PWC	https://paperswithcode.com/paper/scalable-relaxations-of-sparse-packing
Repo
Framework

A New Probabilistic Algorithm for Approximate Model Counting


Title	A New Probabilistic Algorithm for Approximate Model Counting
Authors	Cunjing Ge, Feifei Ma, Tian Liu, Jian Zhang
Abstract	Constrained counting is important in domains ranging from artificial intelligence to software analysis. There are already a few approaches for counting models over various types of constraints. Recently, hashing-based approaches achieve both theoretical guarantees and scalability, but still rely on solution enumeration. In this paper, a new probabilistic polynomial time approximate model counter is proposed, which is also a hashing-based universal framework, but with only satisfiability queries. A variant with a dynamic stopping criterion is also presented. Empirical evaluation over benchmarks on propositional logic formulas and SMT(BV) formulas shows that the approach is promising.
Tasks
Published	2017-06-13
URL	http://arxiv.org/abs/1706.03906v1
PDF	http://arxiv.org/pdf/1706.03906v1.pdf
PWC	https://paperswithcode.com/paper/a-new-probabilistic-algorithm-for-approximate
Repo
Framework

Online Convolutional Dictionary Learning for Multimodal Imaging


Title	Online Convolutional Dictionary Learning for Multimodal Imaging
Authors	Kevin Degraux, Ulugbek S. Kamilov, Petros T. Boufounos, Dehong Liu
Abstract	Computational imaging methods that can exploit multiple modalities have the potential to enhance the capabilities of traditional sensing systems. In this paper, we propose a new method that reconstructs multimodal images from their linear measurements by exploiting redundancies across different modalities. Our method combines a convolutional group-sparse representation of images with total variation (TV) regularization for high-quality multimodal imaging. We develop an online algorithm that enables the unsupervised learning of convolutional dictionaries on large-scale datasets that are typical in such applications. We illustrate the benefit of our approach in the context of joint intensity-depth imaging.
Tasks	Dictionary Learning
Published	2017-06-13
URL	http://arxiv.org/abs/1706.04256v1
PDF	http://arxiv.org/pdf/1706.04256v1.pdf
PWC	https://paperswithcode.com/paper/online-convolutional-dictionary-learning-for
Repo
Framework

Multi-Scale Video Frame-Synthesis Network with Transitive Consistency Loss


Title	Multi-Scale Video Frame-Synthesis Network with Transitive Consistency Loss
Authors	Zhe Hu, Yinglan Ma, Lizhuang Ma
Abstract	Traditional approaches to interpolate/extrapolate frames in a video sequence require accurate pixel correspondences between images, e.g., using optical flow. Their results depend on the accuracy of optical flow estimation, and they can generate heavy artifacts when flow estimation fails. Recently, methods using auto-encoders have shown impressive progress; however, they are usually trained for specific interpolation/extrapolation settings and lack flexibility. In order to reduce these limitations, we propose a unified network to parameterize the interest frame position and therefore infer interpolate/extrapolate frames within the same framework. To achieve this, we introduce a transitive consistency loss to better regularize the network. We adopt a multi-scale structure for the network so that the parameters can be shared across multi-layers. Our approach avoids expensive global optimization of optical flow methods, and is efficient and flexible for video interpolation/extrapolation applications. Experimental results have shown that our method performs favorably against state-of-the-art methods.
Tasks	Optical Flow Estimation
Published	2017-12-07
URL	http://arxiv.org/abs/1712.02874v2
PDF	http://arxiv.org/pdf/1712.02874v2.pdf
PWC	https://paperswithcode.com/paper/multi-scale-video-frame-synthesis-network
Repo
Framework

On the Discrimination-Generalization Tradeoff in GANs


Title	On the Discrimination-Generalization Tradeoff in GANs
Authors	Pengchuan Zhang, Qiang Liu, Dengyong Zhou, Tao Xu, Xiaodong He
Abstract	Generative adversarial training can be generally understood as minimizing a certain moment-matching loss defined by a set of discriminator functions, typically neural networks. The discriminator set should be large enough to be able to uniquely identify the true distribution (discriminative), and also be small enough to go beyond memorizing samples (generalizable). In this paper, we show that a discriminator set is guaranteed to be discriminative whenever its linear span is dense in the set of bounded continuous functions. This is a very mild condition satisfied even by neural networks with a single neuron. Further, we develop generalization bounds between the learned distribution and true distribution under different evaluation metrics. When evaluated with neural distance, our bounds show that generalization is guaranteed as long as the discriminator set is small enough, regardless of the size of the generator or hypothesis set. When evaluated with KL divergence, our bound provides an explanation of the counter-intuitive behaviors of testing likelihood in GAN training. Our analysis sheds light on understanding the practical performance of GANs.
Tasks
Published	2017-11-07
URL	http://arxiv.org/abs/1711.02771v2
PDF	http://arxiv.org/pdf/1711.02771v2.pdf
PWC	https://paperswithcode.com/paper/on-the-discrimination-generalization-tradeoff
Repo
Framework

Machine Vision System for 3D Plant Phenotyping


Title	Machine Vision System for 3D Plant Phenotyping
Authors	Ayan Chaudhury, Christopher Ward, Ali Talasaz, Alexander G. Ivanov, Mark Brophy, Bernard Grodzinski, Norman P. A. Huner, Rajni V. Patel, John L. Barron
Abstract	Machine vision for plant phenotyping is an emerging research area for producing high throughput in agriculture and crop science applications. Since 2D-based approaches have their inherent limitations, 3D plant analysis is becoming state of the art for current phenotyping technologies. We present an automated system for analyzing plant growth in indoor conditions. A gantry robot system is used to perform scanning tasks in an automated manner throughout the lifetime of the plant. A 3D laser scanner mounted as the robot’s payload captures the surface point cloud data of the plant from multiple views. The plant is monitored from the vegetative to reproductive stages in light/dark cycles inside a controllable growth chamber. An efficient 3D reconstruction algorithm is used, by which multiple scans are aligned together to obtain a 3D mesh of the plant, followed by surface area and volume computations. The whole system, including the programmable growth chamber, robot, scanner, data transfer and analysis, is fully automated in such a way that a naive user can, in theory, start the system with a mouse click and get back the growth analysis results at the end of the lifetime of the plant with no intermediate intervention. As evidence of its functionality, we show and analyze quantitative results of the rhythmic growth patterns of the dicot Arabidopsis thaliana (L.) and the monocot barley (Hordeum vulgare L.) plants under their diurnal light/dark cycles.
Tasks	3D Reconstruction
Published	2017-04-28
URL	http://arxiv.org/abs/1705.00540v1
PDF	http://arxiv.org/pdf/1705.00540v1.pdf
PWC	https://paperswithcode.com/paper/machine-vision-system-for-3d-plant
Repo
Framework

ActionVLAD: Learning spatio-temporal aggregation for action classification


Title	ActionVLAD: Learning spatio-temporal aggregation for action classification
Authors	Rohit Girdhar, Deva Ramanan, Abhinav Gupta, Josef Sivic, Bryan Russell
Abstract	In this work, we introduce a new video representation for action classification that aggregates local convolutional features across the entire spatio-temporal extent of the video. We do so by integrating state-of-the-art two-stream networks with learnable spatio-temporal feature aggregation. The resulting architecture is end-to-end trainable for whole-video classification. We investigate different strategies for pooling across space and time and combining signals from the different streams. We find that: (i) it is important to pool jointly across space and time, but (ii) appearance and motion streams are best aggregated into their own separate representations. Finally, we show that our representation outperforms the two-stream base architecture by a large margin (13% relative) and also outperforms other baselines with comparable base architectures on HMDB51, UCF101, and Charades video classification benchmarks.
Tasks	Action Classification, Video Classification
Published	2017-04-10
URL	http://arxiv.org/abs/1704.02895v1
PDF	http://arxiv.org/pdf/1704.02895v1.pdf
PWC	https://paperswithcode.com/paper/actionvlad-learning-spatio-temporal
Repo
Framework
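
To make "aggregating local features across the entire spatio-temporal extent" concrete, here is a minimal sketch of VLAD-style pooling with soft assignment. It assumes the commonly used formulation in which each local descriptor contributes its residual to every cluster center, weighted by a softmax over negative squared distances; the function name `vlad_aggregate`, the `alpha` sharpness parameter, and the omission of the usual normalization steps are choices made for this illustration, not details taken from the abstract.

```python
import math


def vlad_aggregate(descriptors, centers, alpha=1.0):
    """Pool local descriptors into one K x D matrix of weighted residuals.

    descriptors: list of D-dimensional feature vectors gathered from all
                 spatial positions and all frames of a video (joint pooling).
    centers:     list of K cluster centers ("action words"), each D-dim.
    alpha:       assumed sharpness of the soft assignment.
    """
    k, d = len(centers), len(centers[0])
    vlad = [[0.0] * d for _ in range(k)]
    for x in descriptors:
        # Soft-assignment logits: closer centers receive larger weights.
        logits = [-alpha * sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c in centers]
        peak = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(v - peak) for v in logits]
        total = sum(exps)
        for j, c in enumerate(centers):
            weight = exps[j] / total
            for t in range(d):
                vlad[j][t] += weight * (x[t] - c[t])
    return vlad


# Two descriptors, one center: the result is the plain sum of residuals.
print(vlad_aggregate([[1.0, 2.0], [3.0, 4.0]], [[1.0, 1.0]]))
```

Because the soft assignment is differentiable in both the descriptors and the centers, a layer of this kind can in principle be trained end to end with the rest of a network.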

A Unified View-Graph Selection Framework for Structure from Motion


Title	A Unified View-Graph Selection Framework for Structure from Motion
Authors	Rajvi Shah, Visesh Chari, P J Narayanan
Abstract	View-graph is an essential input to large-scale structure from motion (SfM) pipelines. Accuracy and efficiency of large-scale SfM is crucially dependent on the input view-graph. Inconsistent or inaccurate edges can lead to inferior or wrong reconstruction. Most SfM methods remove ‘undesirable’ images and pairs using several, fixed heuristic criteria, and propose tailor-made solutions to achieve specific reconstruction objectives such as efficiency, accuracy, or disambiguation. In contrast to these disparate solutions, we propose a single optimization framework that can be used to achieve these different reconstruction objectives with task-specific cost modeling. We also construct a very efficient network-flow based formulation for its approximate solution. The abstraction brought on by this selection mechanism separates the challenges specific to datasets and reconstruction objectives from the standard SfM pipeline and improves its generalization. This paper demonstrates the application of the proposed view-graph framework with standard SfM pipeline for two particular use-cases, (i) accurate and ghost-free reconstructions of highly ambiguous datasets using costs based on disambiguation priors, and (ii) accurate and efficient reconstruction of large-scale Internet datasets using costs based on commonly used priors.
Tasks
Published	2017-08-03
URL	http://arxiv.org/abs/1708.01125v2
PDF	http://arxiv.org/pdf/1708.01125v2.pdf
PWC	https://paperswithcode.com/paper/a-unified-view-graph-selection-framework-for
Repo
Framework

Massively-Parallel Feature Selection for Big Data


Title	Massively-Parallel Feature Selection for Big Data
Authors	Ioannis Tsamardinos, Giorgos Borboudakis, Pavlos Katsogridakis, Polyvios Pratikakis, Vassilis Christophides
Abstract	We present the Parallel, Forward-Backward with Pruning (PFBP) algorithm for feature selection (FS) in Big Data settings (high dimensionality and/or sample size). To tackle the challenges of Big Data FS, PFBP partitions the data matrix both in terms of rows (samples, training examples) and columns (features). By employing the concepts of $p$-values of conditional independence tests and meta-analysis techniques, PFBP manages to rely only on computations local to a partition while minimizing communication costs. Then, it employs powerful and safe (asymptotically sound) heuristics to make early, approximate decisions, such as Early Dropping of features from consideration in subsequent iterations, Early Stopping of consideration of features within the same iteration, or Early Return of the winner in each iteration. PFBP provides asymptotic guarantees of optimality for data distributions faithfully representable by a causal network (Bayesian network or maximal ancestral graph). Our empirical analysis confirms a super-linear speedup of the algorithm with increasing sample size, linear scalability with respect to the number of features and processing cores, while dominating other competitive algorithms in its class.
Tasks	Feature Selection
Published	2017-08-23
URL	http://arxiv.org/abs/1708.07178v1
PDF	http://arxiv.org/pdf/1708.07178v1.pdf
PWC	https://paperswithcode.com/paper/massively-parallel-feature-selection-for-big
Repo
Framework
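
The abstract mentions combining $p$-values computed locally on each data partition through meta-analysis. As a hedged illustration, the sketch below uses Fisher's combined probability test, one standard meta-analysis technique; whether PFBP uses this exact rule is an assumption, and the function name `fisher_combined_p_value` is hypothetical. The tests on the partitions are assumed to be independent.

```python
import math


def fisher_combined_p_value(p_values):
    """Combine independent p-values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null hypothesis. For even degrees
    of freedom the survival function has the closed form
        exp(-X/2) * sum_{j=0}^{k-1} (X/2)^j / j!
    so no external statistics library is needed.
    """
    if not p_values:
        raise ValueError("at least one p-value is required")
    half_stat = -sum(math.log(p) for p in p_values)  # X / 2
    term, total = 1.0, 1.0
    for j in range(1, len(p_values)):
        term *= half_stat / j  # (X/2)^j / j!
        total += term
    return math.exp(-half_stat) * total


# p-values for the same feature, each computed on a different row partition
local_p_values = [0.04, 0.10, 0.07]
print(fisher_combined_p_value(local_p_values))
```

Each partition only needs to communicate a single number per candidate feature, which is in the spirit of keeping computations local and communication costs low.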

TricorNet: A Hybrid Temporal Convolutional and Recurrent Network for Video Action Segmentation


Title	TricorNet: A Hybrid Temporal Convolutional and Recurrent Network for Video Action Segmentation
Authors	Li Ding, Chenliang Xu
Abstract	Action segmentation, as a milestone towards building automatic systems to understand untrimmed videos, has received considerable attention in recent years. It is typically modeled as a sequence labeling problem but contains intrinsic and sufficient differences from text parsing or speech processing. In this paper, we introduce a novel hybrid temporal convolutional and recurrent network (TricorNet), which has an encoder-decoder architecture: the encoder consists of a hierarchy of temporal convolutional kernels that capture the local motion changes of different actions; the decoder is a hierarchy of recurrent neural networks that are able to learn and memorize long-term action dependencies after the encoding stage. Our model is simple but extremely effective in terms of video sequence labeling. The experimental results on three public action segmentation datasets have shown that the proposed model achieves superior performance over the state of the art.
Tasks	Action Segmentation
Published	2017-05-22
URL	http://arxiv.org/abs/1705.07818v1
PDF	http://arxiv.org/pdf/1705.07818v1.pdf
PWC	https://paperswithcode.com/paper/tricornet-a-hybrid-temporal-convolutional-and
Repo
Framework

Online to Offline Conversions, Universality and Adaptive Minibatch Sizes


Title	Online to Offline Conversions, Universality and Adaptive Minibatch Sizes
Authors	Kfir Y. Levy
Abstract	We present an approach towards convex optimization that relies on a novel scheme which converts online adaptive algorithms into offline methods. In the offline optimization setting, our derived methods are shown to obtain favourable adaptive guarantees which depend on the harmonic sum of the queried gradients. We further show that our methods implicitly adapt to the objective’s structure: in the smooth case fast convergence rates are ensured without any prior knowledge of the smoothness parameter, while still maintaining guarantees in the non-smooth setting. Our approach has a natural extension to the stochastic setting, resulting in a lazy version of SGD (stochastic GD), where minibatches are chosen adaptively depending on the magnitude of the gradients, thus providing a principled approach towards choosing minibatch sizes.
Tasks
Published	2017-05-30
URL	http://arxiv.org/abs/1705.10499v2
PDF	http://arxiv.org/pdf/1705.10499v2.pdf
PWC	https://paperswithcode.com/paper/online-to-offline-conversions-universality
Repo
Framework

Neural Probabilistic Model for Non-projective MST Parsing


Title	Neural Probabilistic Model for Non-projective MST Parsing
Authors	Xuezhe Ma, Eduard Hovy
Abstract	In this paper, we propose a probabilistic parsing model, which defines a proper conditional probability distribution over non-projective dependency trees for a given sentence, using neural representations as inputs. The neural network architecture is based on bi-directional LSTM-CNNs, which benefit from both word- and character-level representations automatically, by using a combination of bidirectional LSTM and CNN. On top of the neural network, we introduce a probabilistic structured layer, defining a conditional log-linear model over non-projective trees. We evaluate our model on 17 different datasets, across 14 different languages. By exploiting Kirchhoff’s Matrix-Tree Theorem (Tutte, 1984), the partition functions and marginals can be computed efficiently, leading to a straightforward end-to-end model training procedure via back-propagation. Our parser achieves state-of-the-art parsing performance on nine datasets.
Tasks
Published	2017-01-04
URL	http://arxiv.org/abs/1701.00874v4
PDF	http://arxiv.org/pdf/1701.00874v4.pdf
PWC	https://paperswithcode.com/paper/neural-probabilistic-model-for-non-projective
Repo
Framework
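
The role of the Matrix-Tree Theorem here can be shown with a small sketch: the sum of weights over all non-projective dependency trees equals the determinant of a Laplacian-style matrix built from the arc weights. The sketch below assumes the variant in which the artificial root (index 0) may have several children, and it uses a plain Gaussian-elimination determinant; the function names are hypothetical and the paper's exact construction may differ.

```python
def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def tree_partition_function(weights):
    """Sum over all dependency trees of the product of their arc weights.

    weights[h][m] is the non-negative weight (e.g. exp(score)) of the arc
    from head h to modifier m; index 0 is the artificial root and indices
    1..n are the words. Diagonal entries and arcs into the root are ignored.
    """
    n = len(weights) - 1
    lap = [[0.0] * n for _ in range(n)]
    for m in range(1, n + 1):
        for h in range(n + 1):
            if h == m:
                continue
            lap[m - 1][m - 1] += weights[h][m]  # total incoming weight of m
            if h >= 1:
                lap[h - 1][m - 1] -= weights[h][m]
    return determinant(lap)


# Two words: arcs root->1 (1), root->2 (2), 1->2 (3), 2->1 (4).
# The three trees have weights 1*2, 1*3 and 2*4.
weights = [
    [0.0, 1.0, 2.0],
    [0.0, 0.0, 3.0],
    [0.0, 4.0, 0.0],
]
print(tree_partition_function(weights))
```

Since the determinant is differentiable in the arc weights, the log-partition function can be used directly in a likelihood objective and trained with back-propagation.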

SquishedNets: Squishing SqueezeNet further for edge device scenarios via deep evolutionary synthesis


Title	SquishedNets: Squishing SqueezeNet further for edge device scenarios via deep evolutionary synthesis
Authors	Mohammad Javad Shafiee, Francis Li, Brendan Chwyl, Alexander Wong
Abstract	While deep neural networks have been shown in recent years to outperform other machine learning methods in a wide range of applications, one of the biggest challenges with enabling deep neural networks for widespread deployment on edge devices such as mobile and other consumer devices is high computational and memory requirements. Recently, there has been greater exploration into small deep neural network architectures that are more suitable for edge devices, with one of the most popular architectures being SqueezeNet, with an incredibly small model size of 4.8MB. Taking further advantage of the notion that many applications of machine learning on edge devices are often characterized by a low number of target classes, this study explores the utility of combining architectural modifications and an evolutionary synthesis strategy for synthesizing even smaller deep neural architectures based on the more recent SqueezeNet v1.1 macroarchitecture for applications with fewer target classes. In particular, architectural modifications are first made to SqueezeNet v1.1 to accommodate a 10-class ImageNet-10 dataset, and then an evolutionary synthesis strategy is leveraged to synthesize more efficient deep neural networks based on this modified macroarchitecture. The resulting SquishedNets possess model sizes ranging from 2.4MB to 0.95MB (~5.17X smaller than SqueezeNet v1.1, or 253X smaller than AlexNet). Furthermore, the SquishedNets are still able to achieve accuracies ranging from 81.2% to 77%, and are able to process at speeds of 156 images/sec to as much as 256 images/sec on an Nvidia Jetson TX1 embedded chip. These preliminary results show that a combination of architectural modifications and an evolutionary synthesis strategy can be a useful tool for producing very small deep neural network architectures that are well-suited for edge device scenarios.
Tasks
Published	2017-11-20
URL	http://arxiv.org/abs/1711.07459v1
PDF	http://arxiv.org/pdf/1711.07459v1.pdf
PWC	https://paperswithcode.com/paper/squishednets-squishing-squeezenet-further-for
Repo
Framework

The Role of Minimal Complexity Functions in Unsupervised Learning of Semantic Mappings


Title	The Role of Minimal Complexity Functions in Unsupervised Learning of Semantic Mappings
Authors	Tomer Galanti, Lior Wolf, Sagie Benaim
Abstract	We discuss the feasibility of the following learning problem: given unmatched samples from two domains and nothing else, learn a mapping between the two, which preserves semantics. Due to the lack of paired samples and without any definition of the semantic information, the problem might seem ill-posed. Specifically, in typical cases, it seems possible to build infinitely many alternative mappings from every target mapping. This apparent ambiguity stands in sharp contrast to the recent empirical success in solving this problem. We identify the abstract notion of aligning two domains in a semantic way with concrete terms of minimal relative complexity. A theoretical framework for measuring the complexity of compositions of functions is developed in order to show that it is reasonable to expect the minimal complexity mapping to be unique. The measured complexity used is directly related to the depth of the neural networks being learned and a semantically aligned mapping could then be captured simply by learning using architectures that are not much bigger than the minimal architecture. Various predictions are made based on the hypothesis that semantic alignment can be captured by the minimal mapping. These are verified extensively. In addition, a new mapping algorithm is proposed and shown to lead to better mapping results.
Tasks
Published	2017-08-31
URL	https://arxiv.org/abs/1709.00074v3
PDF	https://arxiv.org/pdf/1709.00074v3.pdf
PWC	https://paperswithcode.com/paper/the-role-of-minimal-complexity-functions-in
Repo
Framework