Paper Group ANR 1012
To each route its own ETA: A generative modeling framework for ETA prediction. A Novel Monocular Disparity Estimation Network with Domain Transformation and Ambiguity Learning. SeCoST: Sequential Co-Supervision for Weakly Labeled Audio Event Detection. Model selection for high-dimensional linear regression with dependent observations. EAST: Encoding-Aware Sparse Training for Deep Memory Compression of ConvNets. Sublinear Subwindow Search. Speaker Adaptation for End-to-End CTC Models. Unsupervised Fault Detection in Varying Operating Conditions. Hybrid Probabilistic Inference with Logical Constraints: Tractability and Message Passing. TensorFlow.js: Machine Learning for the Web and Beyond. On Controlled DeEntanglement for Natural Language Processing. Road Damage Detection Based on Unsupervised Disparity Map Segmentation. Relating lp regularization and reweighted l1 regularization. A Novel Hierarchical Binary Tagging Framework for Joint Extraction of Entities and Relations. Global Convergence of Gradient Descent for Deep Linear Residual Networks.
To each route its own ETA: A generative modeling framework for ETA prediction
Title | To each route its own ETA: A generative modeling framework for ETA prediction |
Authors | Charul, Pravesh Biyani |
Abstract | Accurate expected time of arrival (ETA) information is crucial for maintaining the quality of service of public transit. Recent advances in artificial intelligence (AI) have led to more effective models for ETA estimation that rely heavily on large GPS datasets. More importantly, these are mainly cab-based datasets, which may not be suitable for bus-based public transport. Consequently, the latest methods may not be applicable for ETA estimation in cities that lack large training datasets. On the other hand, the ETA estimation problem in many cities needs to be solved with small datasets that contain outliers and anomalies and may be incomplete. This work presents a simple but robust model for ETA estimation on a bus route that relies only on the historical data of that particular route. We propose a system that generates ETA information for a trip and updates it as the trip progresses, based on real-time information. We train a deep-learning-based generative model that learns the probability distribution of ETA data across trips and, conditioned on the current trip information, updates the ETA on the go. Our plug-and-play model not only captures the non-linearity of the task well but can also be used by any transit agency without any external data source. Experiments on data collected from three routes in the city of Delhi illustrate the promise of our approach. |
Tasks | |
Published | 2019-06-24 |
URL | https://arxiv.org/abs/1906.09925v1 |
https://arxiv.org/pdf/1906.09925v1.pdf | |
PWC | https://paperswithcode.com/paper/to-each-route-its-own-eta-a-generative |
Repo | |
Framework | |
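The abstract above does not describe a concrete architecture, so the following is a hedged, minimal sketch of what a conditional generative ETA model could look like: a small PyTorch network that learns p(remaining trip time | trip progress) as a Gaussian and is trained by negative log-likelihood. The feature set and all names are illustrative assumptions, not the authors' design.

```python
# Minimal sketch, assuming a Gaussian conditional generative model of ETA;
# not the authors' published code.
import torch
import torch.nn as nn

class ConditionalETAModel(nn.Module):
    def __init__(self, n_features: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, hidden), nn.ReLU(),
            nn.Linear(hidden, 2),  # outputs: mean and log-variance
        )

    def forward(self, x):
        mu, log_var = self.net(x).chunk(2, dim=-1)
        return mu, log_var

def nll_loss(mu, log_var, y):
    # Negative log-likelihood of y under N(mu, exp(log_var)).
    return 0.5 * (log_var + (y - mu) ** 2 / log_var.exp()).mean()

# Toy usage: features could be (fraction of route covered, elapsed time,
# hour of day); the target is the remaining travel time in minutes.
model = ConditionalETAModel(n_features=3)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.rand(256, 3)
y = 30 * (1 - x[:, :1]) + torch.randn(256, 1)  # synthetic remaining time
for _ in range(200):
    mu, log_var = model(x)
    loss = nll_loss(mu, log_var, y)
    opt.zero_grad(); loss.backward(); opt.step()
```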
A Novel Monocular Disparity Estimation Network with Domain Transformation and Ambiguity Learning
Title | A Novel Monocular Disparity Estimation Network with Domain Transformation and Ambiguity Learning |
Authors | Juan Luis Gonzalez Bello, Munchurl Kim |
Abstract | Convolutional neural networks (CNNs) have shown state-of-the-art results for low-level computer vision problems such as stereo and monocular disparity estimation, but still have much room to further improve their performance in terms of accuracy, number of parameters, etc. Recent works have uncovered the advantages of using an unsupervised scheme to train CNNs to estimate monocular disparity, where only the relatively-easy-to-obtain stereo images are needed for training. We propose a novel encoder-decoder architecture that outperforms previous unsupervised monocular depth estimation networks by (i) taking ambiguities into account, (ii) efficiently fusing encoder and decoder features with rectangular convolutions, and (iii) applying domain transformations between encoder and decoder. Our architecture outperforms the Monodepth baseline in all metrics, even with a considerable reduction in parameters. Furthermore, our architecture is capable of estimating a full disparity map in a single forward pass, whereas the baseline needs two passes. We perform extensive experiments to verify the effectiveness of our method on the KITTI dataset. |
Tasks | Depth Estimation, Disparity Estimation, Monocular Depth Estimation |
Published | 2019-03-20 |
URL | http://arxiv.org/abs/1903.08514v1 |
http://arxiv.org/pdf/1903.08514v1.pdf | |
PWC | https://paperswithcode.com/paper/a-novel-monocular-disparity-estimation |
Repo | |
Framework | |
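As a hedged illustration of item (ii) in the abstract, the block below sketches one plausible encoder-decoder fusion module built from rectangular (1×k and k×1) convolutions in PyTorch. It is a sketch under the assumption that "rectangular convolutions" means separable 1D kernels; the paper's actual block design is not reproduced here.

```python
# Sketch of skip-connection fusion with rectangular convolutions; an
# assumed reading of the abstract, not the paper's exact module.
import torch
import torch.nn as nn

class RectFusion(nn.Module):
    def __init__(self, enc_ch: int, dec_ch: int, out_ch: int, k: int = 5):
        super().__init__()
        ch = enc_ch + dec_ch
        # A pair of rectangular convolutions approximates a kxk kernel with
        # fewer parameters: roughly 2*k instead of k*k weights per channel pair.
        self.fuse = nn.Sequential(
            nn.Conv2d(ch, out_ch, kernel_size=(1, k), padding=(0, k // 2)),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, kernel_size=(k, 1), padding=(k // 2, 0)),
            nn.ReLU(inplace=True),
        )

    def forward(self, enc_feat, dec_feat):
        return self.fuse(torch.cat([enc_feat, dec_feat], dim=1))

# Usage: fuse a 64-channel encoder skip with 32-channel decoder features.
block = RectFusion(enc_ch=64, dec_ch=32, out_ch=64)
out = block(torch.randn(1, 64, 32, 64), torch.randn(1, 32, 32, 64))
```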
SeCoST: Sequential Co-Supervision for Weakly Labeled Audio Event Detection
Title | SeCoST: Sequential Co-Supervision for Weakly Labeled Audio Event Detection |
Authors | Anurag Kumar, Vamsi Krishna Ithapu |
Abstract | Weakly supervised learning algorithms are critical for scaling audio event detection to several hundreds of sound categories. Such learning models should not only disambiguate sound events efficiently with minimal class-specific annotation but also be robust to label noise, which is more prevalent with weak labels than with strong annotations. In this work, we propose a new framework for designing learning models with weak supervision by bridging ideas from sequential learning and knowledge distillation. We refer to the proposed methodology as SeCoST (pronounced Sequest) – Sequential Co-supervision for training generations of Students. SeCoST incrementally builds a cascade of student-teacher pairs via a novel knowledge transfer method. Our evaluations on Audioset (the largest weakly labeled dataset available) show that SeCoST achieves a mean average precision of 0.383, outperforming the prior state of the art by a considerable margin. |
Tasks | Transfer Learning |
Published | 2019-10-25 |
URL | https://arxiv.org/abs/1910.11789v2 |
https://arxiv.org/pdf/1910.11789v2.pdf | |
PWC | https://paperswithcode.com/paper/secost-sequential-co-supervision-for-weakly |
Repo | |
Framework | |
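A hedged sketch of the sequential co-supervision idea: each student generation trains on a blend of the original weak labels and the previous generation's predictions. The convex blending below is an assumed stand-in for the paper's knowledge-transfer method, and `make_model` and `loader` are hypothetical names.

```python
# Sketch, assuming soft-target blending as the teacher-to-student transfer;
# the paper introduces its own transfer scheme.
import torch
import torch.nn as nn

def train_generation(student, loader, teacher=None, alpha=0.7, epochs=1):
    bce = nn.BCEWithLogitsLoss()
    opt = torch.optim.Adam(student.parameters(), lr=1e-3)
    for _ in range(epochs):
        for x, weak_y in loader:  # weak_y: clip-level multi-label targets
            target = weak_y
            if teacher is not None:
                with torch.no_grad():
                    target = alpha * weak_y + (1 - alpha) * torch.sigmoid(teacher(x))
            loss = bce(student(x), target)
            opt.zero_grad(); loss.backward(); opt.step()
    return student

# Cascade: generation 0 learns from weak labels alone; later generations
# also receive supervision from their predecessor.
# teacher = None
# for gen in range(3):
#     student = make_model()  # hypothetical model factory
#     teacher = train_generation(student, loader, teacher)
```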
Model selection for high-dimensional linear regression with dependent observations
Title | Model selection for high-dimensional linear regression with dependent observations |
Authors | Ching-Kang Ing |
Abstract | We investigate the prediction capability of the orthogonal greedy algorithm (OGA) in high-dimensional regression models with dependent observations. The rates of convergence of the prediction error of OGA are obtained under a variety of sparsity conditions. To prevent OGA from overfitting, we introduce a high-dimensional Akaike’s information criterion (HDAIC) to determine the number of OGA iterations. A key contribution of this work is to show that OGA, used in conjunction with HDAIC, can achieve the optimal convergence rate without knowledge of how sparse the underlying high-dimensional model is. |
Tasks | Model Selection |
Published | 2019-06-18 |
URL | https://arxiv.org/abs/1906.07395v1 |
https://arxiv.org/pdf/1906.07395v1.pdf | |
PWC | https://paperswithcode.com/paper/model-selection-for-high-dimensional-linear |
Repo | |
Framework | |
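For concreteness, here is a hedged NumPy sketch of OGA (equivalently, orthogonal matching pursuit: greedily select the predictor most correlated with the residual, then refit by least squares) with an HDAIC-style stopping rule. The criterion form sigma²_m · (1 + C·m·log(p)/n) is an assumption about HDAIC's shape; consult the paper for the exact definition and constants.

```python
# Sketch of OGA with an assumed HDAIC-style criterion; not the paper's code.
import numpy as np

def oga_hdaic(X, y, max_iter=50, C=2.0):
    n, p = X.shape
    support, residual = [], y.copy()
    best_support, best_crit = [], np.inf
    for m in range(1, max_iter + 1):
        # Pick the predictor most correlated with the current residual.
        scores = np.abs(X.T @ residual) / np.linalg.norm(X, axis=0)
        scores[support] = -np.inf
        support.append(int(np.argmax(scores)))
        # Refit by least squares on the support (the "orthogonal" step).
        beta, *_ = np.linalg.lstsq(X[:, support], y, rcond=None)
        residual = y - X[:, support] @ beta
        sigma2 = residual @ residual / n
        crit = sigma2 * (1 + C * m * np.log(p) / n)  # assumed HDAIC form
        if crit < best_crit:
            best_crit, best_support = crit, list(support)
    return best_support

# Toy usage: a sparse linear model with n < p.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 500))
y = 3 * X[:, 7] - 2 * X[:, 42] + 0.1 * rng.standard_normal(100)
print(oga_hdaic(X, y))  # should recover {7, 42}
```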
EAST: Encoding-Aware Sparse Training for Deep Memory Compression of ConvNets
Title | EAST: Encoding-Aware Sparse Training for Deep Memory Compression of ConvNets |
Authors | Matteo Grimaldi, Valentino Peluso, Andrea Calimera |
Abstract | The implementation of Deep Convolutional Neural Networks (ConvNets) on tiny end-nodes with limited non-volatile memory space calls for smart compression strategies capable of shrinking the footprint yet preserving predictive accuracy. There exist a number of strategies for this purpose, from those that play with the topology of the model or the arithmetic precision, e.g. pruning and quantization, to those that apply model-agnostic compression, e.g. weight encoding. The tighter the memory constraint, the higher the probability that these techniques alone cannot meet the requirement; hence more awareness and cooperation across the different optimizations become mandatory. This work addresses the issue by introducing EAST, Encoding-Aware Sparse Training, a novel memory-constrained training procedure that leads quantized ConvNets towards deep memory compression. EAST implements adaptive group pruning designed to maximize the compression rate of the weight encoding scheme (the LZ4 algorithm in this work). Compared to existing methods, EAST meets the memory constraint with lower sparsity, hence ensuring higher accuracy. Experiments conducted on a state-of-the-art ConvNet (ResNet-9) deployed on a low-power microcontroller (ARM Cortex-M4) validate the proposal. |
Tasks | Quantization |
Published | 2019-12-20 |
URL | https://arxiv.org/abs/1912.10087v1 |
https://arxiv.org/pdf/1912.10087v1.pdf | |
PWC | https://paperswithcode.com/paper/east-encoding-aware-sparse-training-for-deep |
Repo | |
Framework | |
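The interaction between pruning and encoding can be made concrete with a hedged, post-hoc sketch: zero groups of quantized weights in order of increasing magnitude until the LZ4-compressed payload fits a byte budget. EAST does this adaptively during training; the sketch only illustrates the compression-driven stopping rule, and it assumes the `lz4` pip package is installed.

```python
# Sketch of encoding-aware group pruning against an LZ4 byte budget;
# a post-hoc illustration, not EAST's training-time procedure.
import numpy as np
import lz4.frame

def prune_to_budget(weights: np.ndarray, budget_bytes: int, group: int = 16):
    w = weights.astype(np.int8).ravel().copy()
    pad = (-len(w)) % group
    w = np.pad(w, (0, pad))
    groups = w.reshape(-1, group)
    # Zero groups in ascending order of L1 norm; long zero runs compress well.
    order = np.argsort(np.abs(groups.astype(np.int32)).sum(axis=1))
    for idx in order:
        compressed = lz4.frame.compress(groups.tobytes())
        if len(compressed) <= budget_bytes:
            break
        groups[idx] = 0
    return groups.ravel()[:weights.size].reshape(weights.shape)

# Toy usage on random int8 "weights" with a 4 KiB budget.
w = np.random.randint(-128, 128, size=(64, 64), dtype=np.int8)
pruned = prune_to_budget(w, budget_bytes=4096)
```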
Sublinear Subwindow Search
Title | Sublinear Subwindow Search |
Authors | Max Reuter, Gheorghe-Teodor Bercea |
Abstract | We propose an efficient approximation algorithm for subwindow search that runs in sublinear time and memory. Applied to object localization, this algorithm significantly reduces running time and memory usage while maintaining accuracy competitive with the state-of-the-art. The algorithm's accuracy also scales with both the size and the spatial coherence (nearby-element similarity) of the matrix. It is thus well-suited for real-time applications and for a wide range of matrices in general. |
Tasks | Object Localization |
Published | 2019-07-31 |
URL | https://arxiv.org/abs/1908.00140v1 |
https://arxiv.org/pdf/1908.00140v1.pdf | |
PWC | https://paperswithcode.com/paper/sublinear-subwindow-search |
Repo | |
Framework | |
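The abstract gives no algorithmic details, so the toy below is emphatically not the authors' method: it is a generic Monte Carlo subwindow search that scores random candidate windows from a small sample of their entries. Spatial coherence (nearby entries being similar) is exactly what makes such sampled estimates reliable, which motivates why sublinear approximation is plausible here.

```python
# Toy Monte Carlo subwindow search; an illustrative stand-in only.
import numpy as np

def approx_best_window(M, win_h, win_w, n_windows=200, n_samples=64, seed=0):
    rng = np.random.default_rng(seed)
    H, W = M.shape
    best, best_score = None, -np.inf
    for _ in range(n_windows):  # random candidate windows
        top = rng.integers(0, H - win_h + 1)
        left = rng.integers(0, W - win_w + 1)
        rows = rng.integers(top, top + win_h, size=n_samples)
        cols = rng.integers(left, left + win_w, size=n_samples)
        # Estimate the window sum from a sample, scaled to the full area.
        score = M[rows, cols].mean() * win_h * win_w
        if score > best_score:
            best_score, best = score, (int(top), int(left))
    return best, best_score

# Usage: locate an approximately best 20x20 window in a smooth matrix.
x = np.linspace(-3, 3, 500)
M = np.exp(-(x[:, None] ** 2 + x[None, :] ** 2))  # one smooth bump
print(approx_best_window(M, 20, 20))
```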
Speaker Adaptation for End-to-End CTC Models
Title | Speaker Adaptation for End-to-End CTC Models |
Authors | Ke Li, Jinyu Li, Yong Zhao, Kshitiz Kumar, Yifan Gong |
Abstract | We propose two approaches for speaker adaptation in end-to-end (E2E) automatic speech recognition systems. One is Kullback-Leibler divergence (KLD) regularization and the other is multi-task learning (MTL). Both approaches aim to address the data sparsity issue of speaker adaptation in E2E systems, especially the sparsity of output targets. The KLD regularization adapts a model by forcing the output distribution from the adapted model to be close to the unadapted one. The MTL utilizes a jointly trained auxiliary task to improve the performance of the main task. We investigated our approaches on E2E connectionist temporal classification (CTC) models with three different types of output units. Experiments on the Microsoft short message dictation task demonstrated that MTL outperforms KLD regularization. In particular, the MTL adaptation obtained 8.8% and 4.0% relative word error rate reductions (WERRs) for supervised and unsupervised adaptation of the word CTC model, and 9.6% and 3.8% relative WERRs for the mix-unit CTC model, respectively. |
Tasks | Multi-Task Learning, Speech Recognition |
Published | 2019-01-04 |
URL | http://arxiv.org/abs/1901.01239v1 |
http://arxiv.org/pdf/1901.01239v1.pdf | |
PWC | https://paperswithcode.com/paper/speaker-adaptation-for-end-to-end-ctc-models |
Repo | |
Framework | |
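A hedged sketch of the KLD-regularized adaptation loss described above, for a CTC model in PyTorch: the adapted model's per-frame output distribution is pulled toward that of the frozen speaker-independent (unadapted) model. The interpolation weight `rho` and the exact combination of terms are assumptions, not the paper's published recipe.

```python
# Sketch of KLD-regularized CTC adaptation; assumed loss combination.
import torch
import torch.nn.functional as F

def kld_adapt_loss(adapted_logits, si_logits, targets,
                   in_lens, tgt_lens, rho=0.5):
    # adapted_logits, si_logits: (T, batch, vocab) frame-level outputs.
    log_p_adapt = F.log_softmax(adapted_logits, dim=-1)
    ctc = F.ctc_loss(log_p_adapt, targets, in_lens, tgt_lens, blank=0)
    with torch.no_grad():
        p_si = F.softmax(si_logits, dim=-1)  # frozen SI model's distribution
    # KL(p_si || p_adapted), averaged over frames, keeps the adapted model
    # close to the unadapted one.
    kld = F.kl_div(log_p_adapt, p_si, reduction="batchmean")
    return (1 - rho) * ctc + rho * kld
```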
Unsupervised Fault Detection in Varying Operating Conditions
Title | Unsupervised Fault Detection in Varying Operating Conditions |
Authors | Gabriel Michau, Olga Fink |
Abstract | Training data-driven approaches for complex industrial system health monitoring is challenging. When data on faulty conditions are rare or not available, the training has to be performed in an unsupervised manner. In addition, when the observation period used for training is kept short, so that the system can be monitored early in its life, the training data might not be representative of all of the system's normal operating conditions. In this paper, we propose five approaches to perform fault detection in this context. Two approaches rely only on data from the unit to be monitored: the baseline is trained on the early life of the unit, while an incremental learning procedure tries to learn new operating conditions as they arise. Three other approaches take advantage of data from other similar units within a fleet. In two cases, units are directly compared to each other with similarity measures, and the data from similar units are combined in the training set. In the third case, we propose a new deep-learning methodology that first performs a feature alignment of different units with an Unsupervised Feature Alignment Network (UFAN) and then combines the features of both units in the training set of the fault-detection neural network. The approaches are tested on a fleet comprising 112 units, observed over one year. All approaches proposed here improve on the baseline, which is trained with only two months of data. As units in the fleet are found to be very dissimilar, the new UFAN architecture, which aligns units in the feature space, outperforms the others. |
Tasks | Fault Detection |
Published | 2019-07-15 |
URL | https://arxiv.org/abs/1907.06481v1 |
https://arxiv.org/pdf/1907.06481v1.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-fault-detection-in-varying |
Repo | |
Framework | |
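As a hedged stand-in for the single-unit baseline described above (a detector trained only on the unit's early, healthy life), the sketch below fits an autoencoder on healthy condition-monitoring data and flags test points whose reconstruction error exceeds a threshold. The paper's actual detector architecture is not reproduced here.

```python
# Sketch of an early-life, reconstruction-error fault-detection baseline;
# the autoencoder and threshold rule are illustrative assumptions.
import torch
import torch.nn as nn

def fit_baseline(healthy: torch.Tensor, epochs=200):
    d = healthy.shape[1]
    ae = nn.Sequential(nn.Linear(d, 8), nn.ReLU(), nn.Linear(8, d))
    opt = torch.optim.Adam(ae.parameters(), lr=1e-3)
    for _ in range(epochs):
        loss = ((ae(healthy) - healthy) ** 2).mean()
        opt.zero_grad(); loss.backward(); opt.step()
    with torch.no_grad():
        err = ((ae(healthy) - healthy) ** 2).mean(dim=1)
    threshold = err.max() * 1.1  # small safety margin over training errors
    return ae, threshold

def is_faulty(ae, threshold, x: torch.Tensor) -> torch.Tensor:
    with torch.no_grad():
        return ((ae(x) - x) ** 2).mean(dim=1) > threshold

# Usage: two months of healthy features from 32 sensors, then new data.
healthy = torch.randn(1000, 32)
ae, thr = fit_baseline(healthy)
alarms = is_faulty(ae, thr, torch.randn(50, 32) * 3)  # shifted data -> alarms
```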
Hybrid Probabilistic Inference with Logical Constraints: Tractability and Message Passing
Title | Hybrid Probabilistic Inference with Logical Constraints: Tractability and Message Passing |
Authors | Zhe Zeng, Fanqi Yan, Paolo Morettin, Antonio Vergari, Guy Van den Broeck |
Abstract | Weighted model integration (WMI) is a very appealing framework for probabilistic inference: it makes it possible to express the complex dependencies of real-world hybrid scenarios, where variables are heterogeneous in nature (both continuous and discrete), via the language of Satisfiability Modulo Theories (SMT), and to compute probabilistic queries with arbitrarily complex logical constraints. Recent work has shown WMI inference to be reducible to a model integration (MI) problem under some assumptions, thus effectively allowing hybrid probabilistic reasoning by volume computations. In this paper, we introduce a novel formulation of MI via a message-passing scheme that efficiently computes the marginal densities and statistical moments of all the variables in linear time. As such, we are able to amortize inference for arbitrarily rich MI queries when they conform to the problem structure, here represented as the primal graph associated with the SMT formula. Furthermore, we theoretically trace the tractability boundaries of exact MI: we prove that the structural requirements on the primal graph that make our MI algorithm tractable, namely bounded diameter and treewidth, are not only sufficient but also necessary for tractable inference via MI. |
Tasks | |
Published | 2019-09-20 |
URL | https://arxiv.org/abs/1909.09362v2 |
https://arxiv.org/pdf/1909.09362v2.pdf | |
PWC | https://paperswithcode.com/paper/hybrid-probabilistic-inference-with-logical |
Repo | |
Framework | |
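For readers new to the framework, here is the standard definition of WMI from the literature; it is background, not a formula reproduced from this paper.

```latex
% Standard WMI definition: for an SMT formula \varphi over continuous
% variables x and Boolean variables b, and a weight function w,
\[
  \mathrm{WMI}(\varphi, w)
  = \sum_{b \in \{0,1\}^m} \int_{\{x \,:\, (x, b) \models \varphi\}} w(x, b)\, \mathrm{d}x .
\]
% Plain model integration (MI) is the unweighted special case w \equiv 1,
% which is why WMI queries reduce to volume computations.
```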
TensorFlow.js: Machine Learning for the Web and Beyond
Title | TensorFlow.js: Machine Learning for the Web and Beyond |
Authors | Daniel Smilkov, Nikhil Thorat, Yannick Assogba, Ann Yuan, Nick Kreeger, Ping Yu, Kangyi Zhang, Shanqing Cai, Eric Nielsen, David Soergel, Stan Bileschi, Michael Terry, Charles Nicholson, Sandeep N. Gupta, Sarah Sirajuddin, D. Sculley, Rajat Monga, Greg Corrado, Fernanda B. Viégas, Martin Wattenberg |
Abstract | TensorFlow.js is a library for building and executing machine learning algorithms in JavaScript. TensorFlow.js models run in a web browser and in the Node.js environment. The library is part of the TensorFlow ecosystem, providing a set of APIs that are compatible with those in Python, allowing models to be ported between the Python and JavaScript ecosystems. TensorFlow.js has empowered a new set of developers from the extensive JavaScript community to build and deploy machine learning models and enabled new classes of on-device computation. This paper describes the design, API, and implementation of TensorFlow.js, and highlights some of the impactful use cases. |
Tasks | |
Published | 2019-01-16 |
URL | http://arxiv.org/abs/1901.05350v2 |
http://arxiv.org/pdf/1901.05350v2.pdf | |
PWC | https://paperswithcode.com/paper/tensorflowjs-machine-learning-for-the-web-and |
Repo | |
Framework | |
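A brief sketch of the Python-to-JavaScript porting path the abstract mentions: a Keras model built in Python is exported to the TensorFlow.js web format with the `tensorflowjs` pip package (assumed installed), after which it can be loaded in the browser. The model itself is a throwaway example.

```python
# Sketch: export a Keras model for TensorFlow.js consumption.
import tensorflow as tf
import tensorflowjs as tfjs

model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(4,)),
    tf.keras.layers.Dense(3, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Writes model.json plus binary weight shards consumable by TensorFlow.js.
tfjs.converters.save_keras_model(model, "tfjs_model")
# In JavaScript:
#   const model = await tf.loadLayersModel('tfjs_model/model.json');
```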
On Controlled DeEntanglement for Natural Language Processing
Title | On Controlled DeEntanglement for Natural Language Processing |
Authors | SaiKrishna Rallabandi |
Abstract | The latest addition to the toolbox of the human species is Artificial Intelligence (AI). Thus far, AI has made significant progress in low-stakes, low-risk scenarios such as playing Go, and we are currently in a transition toward medium-stakes scenarios such as Visual Dialog. In my thesis, I argue that we need to incorporate controlled de-entanglement as a first-class object to succeed in this transition. I present mathematical analysis from information theory to show that employing stochasticity leads to controlled de-entanglement of relevant factors of variation at various levels. Based on this, I highlight results from initial experiments that show the efficacy of the proposed framework. I conclude this write-up with a roadmap of experiments that show the applicability of this framework to scalability, flexibility and interpretability. |
Tasks | Visual Dialog |
Published | 2019-09-22 |
URL | https://arxiv.org/abs/1909.09964v1 |
https://arxiv.org/pdf/1909.09964v1.pdf | |
PWC | https://paperswithcode.com/paper/190909964 |
Repo | |
Framework | |
Road Damage Detection Based on Unsupervised Disparity Map Segmentation
Title | Road Damage Detection Based on Unsupervised Disparity Map Segmentation |
Authors | Rui Fan, Ming Liu |
Abstract | This paper presents a novel road damage detection algorithm based on unsupervised disparity map segmentation. Firstly, a disparity map is transformed by minimizing an energy function with respect to the stereo rig roll angle and the road disparity projection model. Instead of solving this energy minimization problem using non-linear optimization techniques, we directly find its numerical solution. The transformed disparity map is then segmented using Otsu's thresholding method, and the damaged road areas can be extracted. The proposed algorithm requires no parameters when detecting road damage. The experimental results illustrate that our proposed algorithm performs both accurately and efficiently. The pixel-level road damage detection accuracy is approximately 97.56%. |
Tasks | Road Damage Detection |
Published | 2019-10-11 |
URL | https://arxiv.org/abs/1910.04988v1 |
https://arxiv.org/pdf/1910.04988v1.pdf | |
PWC | https://paperswithcode.com/paper/road-damage-detection-based-on-unsupervised |
Repo | |
Framework | |
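A hedged sketch of the segmentation step only: Otsu's thresholding applied to a synthetic stand-in for a transformed disparity map, separating damage-like regions from the road surface. The disparity transformation itself (roll-angle and projection-model fitting) is the paper's contribution and is not shown.

```python
# Sketch: Otsu thresholding on a synthetic transformed disparity map.
import numpy as np
from skimage.filters import threshold_otsu

# Synthetic stand-in: flat road (~0 deviation) with a pothole region
# of larger deviation.
disparity = np.abs(np.random.normal(0.0, 0.5, (240, 320)))
disparity[100:140, 150:200] += 5.0

t = threshold_otsu(disparity)
damage_mask = disparity > t  # True where damage-like deviation is detected
print(f"threshold={t:.2f}, damaged pixels={damage_mask.sum()}")
```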
Relating lp regularization and reweighted l1 regularization
Title | Relating lp regularization and reweighted l1 regularization |
Authors | Hao Wang, Hao Zeng, Jiashan Wang |
Abstract | We propose a general framework of iteratively reweighted l1 methods for solving lp regularization problems. We prove that, after some iteration k, the iterates generated by the proposed methods have the same support and sign as the limit points and are bounded away from 0, so that the algorithm behaves as if it were solving a smooth problem in the reduced space. As a result, global convergence can be easily obtained, and we propose an update strategy for the smoothing parameter that automatically terminates the updates for zero components. We show that lp regularization problems are locally equivalent to a weighted l1 regularization problem and that every optimal point corresponds to a Maximum A Posteriori estimate under independently but non-identically distributed Laplace priors. Numerical experiments exhibit the behavior and efficiency of our proposed methods. |
Tasks | |
Published | 2019-12-02 |
URL | https://arxiv.org/abs/1912.00723v1 |
https://arxiv.org/pdf/1912.00723v1.pdf | |
PWC | https://paperswithcode.com/paper/relating-lp-regularization-and-reweighted-l1 |
Repo | |
Framework | |
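To make the reweighting concrete, here is a hedged NumPy sketch of an iteratively reweighted l1 method for min_x 0.5·||Ax - b||² + lam·Σ|x_i|^p with 0 < p < 1. Each outer iteration linearizes |x|^p around the current iterate, giving weights w_i = p·(|x_i| + eps)^(p-1), and solves the weighted-l1 subproblem with a few ISTA steps. The geometric shrinking of eps is a simplification of the paper's smoothing-parameter strategy.

```python
# Sketch of IRL1 for lp regularization; the eps schedule is an assumption.
import numpy as np

def irl1_lp(A, b, lam=0.1, p=0.5, eps=1.0, outer=30, inner=100):
    n = A.shape[1]
    x = np.zeros(n)
    step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1/L for the smooth part
    for _ in range(outer):
        w = p * (np.abs(x) + eps) ** (p - 1)  # reweighting from |x|^p
        for _ in range(inner):  # ISTA on the weighted-l1 subproblem
            g = x - step * A.T @ (A @ x - b)
            x = np.sign(g) * np.maximum(np.abs(g) - step * lam * w, 0.0)
        eps = max(eps * 0.7, 1e-8)  # shrink the smoothing parameter
    return x

# Toy usage: recover a 3-sparse vector from 40 noisy measurements.
rng = np.random.default_rng(1)
A = rng.standard_normal((40, 100))
x_true = np.zeros(100); x_true[[5, 17, 60]] = [2.0, -1.5, 1.0]
b = A @ x_true + 0.01 * rng.standard_normal(40)
print(np.flatnonzero(np.round(irl1_lp(A, b), 2)))  # expect ~[5, 17, 60]
```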
A Novel Hierarchical Binary Tagging Framework for Joint Extraction of Entities and Relations
Title | A Novel Hierarchical Binary Tagging Framework for Joint Extraction of Entities and Relations |
Authors | Zhepei Wei, Jianlin Su, Yue Wang, Yuan Tian, Yi Chang |
Abstract | Extracting relational triples from unstructured text is crucial for large-scale knowledge graph construction. However, few existing works excel at solving the overlapping triple problem, where multiple relational triples in the same sentence share the same entities. We propose a novel Hierarchical Binary Tagging (HBT) framework derived from a principled problem formulation. Instead of treating relations as discrete labels as in previous works, our new framework models relations as functions that map subjects to objects in a sentence, which naturally handles overlapping triples. Experiments show that the proposed framework already outperforms state-of-the-art methods even when its encoder module uses a randomly initialized BERT encoder, showing the power of the new tagging framework. It enjoys a further performance boost when employing a pretrained BERT encoder, outperforming the strongest baseline by 25.6 and 45.9 absolute points in F1-score on the two public datasets NYT and WebNLG, respectively. In-depth analysis of different types of overlapping triples shows that the method delivers consistent performance gains in all scenarios. |
Tasks | Relation Extraction |
Published | 2019-09-07 |
URL | https://arxiv.org/abs/1909.03227v1 |
https://arxiv.org/pdf/1909.03227v1.pdf | |
PWC | https://paperswithcode.com/paper/a-novel-hierarchical-binary-tagging-framework |
Repo | |
Framework | |
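A hedged PyTorch sketch of the hierarchical binary tagging idea: a subject tagger marks start/end token positions with sigmoids, and, for each detected subject, relation-specific object taggers mark object spans, so each relation acts as a function from subjects to objects. The encoder choice and the mean-pooled subject conditioning are simplifying assumptions, not the paper's exact design.

```python
# Sketch of HBT-style tagging heads; conditioning scheme is assumed.
import torch
import torch.nn as nn

class HBTHeads(nn.Module):
    def __init__(self, hidden: int, n_relations: int):
        super().__init__()
        self.subj_head = nn.Linear(hidden, 2)               # subject start/end
        self.obj_head = nn.Linear(hidden, 2 * n_relations)  # per-relation start/end

    def forward(self, h, subj_span=None):
        # h: (batch, seq_len, hidden) contextual token encodings (e.g. BERT).
        subj_probs = torch.sigmoid(self.subj_head(h))       # (batch, seq, 2)
        obj_probs = None
        if subj_span is not None:
            s, e = subj_span
            subj_vec = h[:, s:e + 1].mean(dim=1, keepdim=True)  # pooled subject
            logits = self.obj_head(h + subj_vec)                # condition on subject
            obj_probs = torch.sigmoid(logits).view(*h.shape[:2], -1, 2)
        return subj_probs, obj_probs

# Usage: score subject spans, then object spans for one candidate subject.
heads = HBTHeads(hidden=768, n_relations=24)
h = torch.randn(2, 40, 768)
subj_probs, obj_probs = heads(h, subj_span=(3, 5))  # obj_probs: (2, 40, 24, 2)
```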
Global Convergence of Gradient Descent for Deep Linear Residual Networks
Title | Global Convergence of Gradient Descent for Deep Linear Residual Networks |
Authors | Lei Wu, Qingcan Wang, Chao Ma |
Abstract | We analyze the global convergence of gradient descent for deep linear residual networks by proposing a new initialization: zero-asymmetric (ZAS) initialization. It is motivated by avoiding stable manifolds of saddle points. We prove that under the ZAS initialization, for an arbitrary target matrix, gradient descent converges to an $\varepsilon$-optimal point in $O(L^3 \log(1/\varepsilon))$ iterations, which scales polynomially with the network depth $L$. Our result and the $\exp(\Omega(L))$ convergence time for the standard initialization (Xavier or near-identity) [Shamir, 2018] together demonstrate the importance of the residual structure and the initialization in the optimization for deep linear neural networks, especially when $L$ is large. |
Tasks | |
Published | 2019-11-02 |
URL | https://arxiv.org/abs/1911.00645v1 |
https://arxiv.org/pdf/1911.00645v1.pdf | |
PWC | https://paperswithcode.com/paper/global-convergence-of-gradient-descent-for |
Repo | |
Framework | |
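For readers who want the setting written out, here is one standard formulation of the deep linear residual objective the abstract describes; the paper's population loss may include input covariance terms, so treat this as a hedged restatement rather than the paper's exact statement.

```latex
% A depth-L linear residual network maps x to
% f(x) = (I + W_L)(I + W_{L-1}) \cdots (I + W_1)\, x,
% and gradient descent minimizes the squared loss against a target matrix \Phi:
\[
  \min_{W_1, \dots, W_L} \; L(W) \;=\;
  \tfrac{1}{2}\, \big\| (I + W_L) \cdots (I + W_1) - \Phi \big\|_F^2 .
\]
% The abstract's claim: under ZAS initialization, gradient descent reaches an
% \varepsilon-optimal point in O(L^3 \log(1/\varepsilon)) iterations, versus
% \exp(\Omega(L)) time for standard (Xavier or near-identity) initialization.
```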