Paper Group ANR 1012
To each route its own ETA: A generative modeling framework for ETA prediction. A Novel Monocular Disparity Estimation Network with Domain Transformation and Ambiguity Learning. SeCoST: Sequential Co-Supervision for Weakly Labeled Audio Event Detection. Model selection for high-dimensional linear regression with dependent observations. EAST: Encoding-Aware Sparse Training for Deep Memory Compression of ConvNets. Sublinear Subwindow Search. Speaker Adaptation for End-to-End CTC Models. Unsupervised Fault Detection in Varying Operating Conditions. Hybrid Probabilistic Inference with Logical Constraints: Tractability and Message Passing. TensorFlow.js: Machine Learning for the Web and Beyond. On Controlled DeEntanglement for Natural Language Processing. Road Damage Detection Based on Unsupervised Disparity Map Segmentation. Relating lp regularization and reweighted l1 regularization. A Novel Hierarchical Binary Tagging Framework for Joint Extraction of Entities and Relations. Global Convergence of Gradient Descent for Deep Linear Residual Networks.
To each route its own ETA: A generative modeling framework for ETA prediction
Title | To each route its own ETA: A generative modeling framework for ETA prediction |
Authors | Charul, Pravesh Biyani |
Abstract | Accurate expected time of arrival (ETA) information is crucial for maintaining the quality of service of public transit. Recent advances in artificial intelligence (AI) have led to more effective models for ETA estimation that rely heavily on large GPS datasets. More importantly, these are mainly cab-based datasets, which may not be suitable for bus-based public transport. Consequently, the latest methods may not be applicable for ETA estimation in cities that lack large training datasets. On the other hand, the ETA estimation problem in many cities needs to be solved with small datasets that contain outliers and anomalies and may be incomplete. This work presents a simple but robust model for ETA estimation on a bus route that relies only on the historical data of that particular route. We propose a system that generates ETA information for a trip and updates it as the trip progresses, based on real-time information. We train a deep-learning-based generative model that learns the probability distribution of ETA data across trips and, conditioned on the current trip information, updates the ETA on the go. Our plug-and-play model not only captures the non-linearity of the task well but can also be used by any transit agency without any external data source. Experiments on data collected from three routes in the city of Delhi illustrate the promise of our approach. |
Tasks | |
Published | 2019-06-24 |
URL | https://arxiv.org/abs/1906.09925v1 |
https://arxiv.org/pdf/1906.09925v1.pdf | |
PWC | https://paperswithcode.com/paper/to-each-route-its-own-eta-a-generative |
Repo | |
Framework | |
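The abstract above does not describe a concrete architecture, so the following is a hedged, minimal sketch of what a conditional generative ETA model could look like: a small PyTorch network that learns p(remaining trip time | trip progress) as a Gaussian and is trained by negative log-likelihood. The feature set and all names are illustrative assumptions, not the authors' design.

```python
# Minimal sketch, assuming a Gaussian conditional generative model of ETA;
# not the authors' published code.
import torch
import torch.nn as nn

class ConditionalETAModel(nn.Module):
    def __init__(self, n_features: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, hidden), nn.ReLU(),
            nn.Linear(hidden, 2),  # outputs: mean and log-variance
        )

    def forward(self, x):
        mu, log_var = self.net(x).chunk(2, dim=-1)
        return mu, log_var

def nll_loss(mu, log_var, y):
    # Negative log-likelihood of y under N(mu, exp(log_var)).
    return 0.5 * (log_var + (y - mu) ** 2 / log_var.exp()).mean()

# Toy usage: features could be (fraction of route covered, elapsed time,
# hour of day); the target is the remaining travel time in minutes.
model = ConditionalETAModel(n_features=3)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.rand(256, 3)
y = 30 * (1 - x[:, :1]) + torch.randn(256, 1)  # synthetic remaining time
for _ in range(200):
    mu, log_var = model(x)
    loss = nll_loss(mu, log_var, y)
    opt.zero_grad(); loss.backward(); opt.step()
```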
A Novel Monocular Disparity Estimation Network with Domain Transformation and Ambiguity Learning
Title | A Novel Monocular Disparity Estimation Network with Domain Transformation and Ambiguity Learning |
Authors | Juan Luis Gonzalez Bello, Munchurl Kim |
Abstract | Convolutional neural networks (CNNs) have shown state-of-the-art results for low-level computer vision problems such as stereo and monocular disparity estimation, but still have much room to further improve their performance in terms of accuracy, number of parameters, etc. Recent works have uncovered the advantages of using an unsupervised scheme to train CNNs to estimate monocular disparity, where only the relatively-easy-to-obtain stereo images are needed for training. We propose a novel encoder-decoder architecture that outperforms previous unsupervised monocular depth estimation networks by (i) taking ambiguities into account, (ii) efficiently fusing encoder and decoder features with rectangular convolutions, and (iii) applying domain transformations between encoder and decoder. Our architecture outperforms the Monodepth baseline in all metrics, even with a considerable reduction in parameters. Furthermore, our architecture is capable of estimating a full disparity map in a single forward pass, whereas the baseline needs two passes. We perform extensive experiments to verify the effectiveness of our method on the KITTI dataset. |
Tasks | Depth Estimation, Disparity Estimation, Monocular Depth Estimation |
Published | 2019-03-20 |
URL | http://arxiv.org/abs/1903.08514v1 |
http://arxiv.org/pdf/1903.08514v1.pdf | |
PWC | https://paperswithcode.com/paper/a-novel-monocular-disparity-estimation |
Repo | |
Framework | |
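As a hedged illustration of item (ii) in the abstract, the block below sketches one plausible encoder-decoder fusion module built from rectangular (1×k and k×1) convolutions in PyTorch. It is a sketch under the assumption that "rectangular convolutions" means separable 1D kernels; the paper's actual block design is not reproduced here.

```python
# Sketch of skip-connection fusion with rectangular convolutions; an
# assumed reading of the abstract, not the paper's exact module.
import torch
import torch.nn as nn

class RectFusion(nn.Module):
    def __init__(self, enc_ch: int, dec_ch: int, out_ch: int, k: int = 5):
        super().__init__()
        ch = enc_ch + dec_ch
        # A pair of rectangular convolutions approximates a kxk kernel with
        # fewer parameters: roughly 2*k instead of k*k weights per channel pair.
        self.fuse = nn.Sequential(
            nn.Conv2d(ch, out_ch, kernel_size=(1, k), padding=(0, k // 2)),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, kernel_size=(k, 1), padding=(k // 2, 0)),
            nn.ReLU(inplace=True),
        )

    def forward(self, enc_feat, dec_feat):
        return self.fuse(torch.cat([enc_feat, dec_feat], dim=1))

# Usage: fuse a 64-channel encoder skip with 32-channel decoder features.
block = RectFusion(enc_ch=64, dec_ch=32, out_ch=64)
out = block(torch.randn(1, 64, 32, 64), torch.randn(1, 32, 32, 64))
```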
SeCoST: Sequential Co-Supervision for Weakly Labeled Audio Event Detection
Title | SeCoST: Sequential Co-Supervision for Weakly Labeled Audio Event Detection |
Authors | Anurag Kumar, Vamsi Krishna Ithapu |
Abstract | Weakly supervised learning algorithms are critical for scaling audio event detection to several hundreds of sound categories. Such learning models should not only disambiguate sound events efficiently with minimal class-specific annotation but also be robust to label noise, which is more prevalent with weak labels than with strong annotations. In this work, we propose a new framework for designing learning models with weak supervision by bridging ideas from sequential learning and knowledge distillation. We refer to the proposed methodology as SeCoST (pronounced Sequest) – Sequential Co-supervision for training generations of Students. SeCoST incrementally builds a cascade of student-teacher pairs via a novel knowledge transfer method. Our evaluations on Audioset (the largest weakly labeled dataset available) show that SeCoST achieves a mean average precision of 0.383, outperforming the prior state of the art by a considerable margin. |
Tasks | Transfer Learning |
Published | 2019-10-25 |
URL | https://arxiv.org/abs/1910.11789v2 |
https://arxiv.org/pdf/1910.11789v2.pdf | |
PWC | https://paperswithcode.com/paper/secost-sequential-co-supervision-for-weakly |
Repo | |
Framework | |
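A hedged sketch of the sequential co-supervision idea: each student generation trains on a blend of the original weak labels and the previous generation's predictions. The convex blending below is an assumed stand-in for the paper's knowledge-transfer method, and `make_model` and `loader` are hypothetical names.

```python
# Sketch, assuming soft-target blending as the teacher-to-student transfer;
# the paper introduces its own transfer scheme.
import torch
import torch.nn as nn

def train_generation(student, loader, teacher=None, alpha=0.7, epochs=1):
    bce = nn.BCEWithLogitsLoss()
    opt = torch.optim.Adam(student.parameters(), lr=1e-3)
    for _ in range(epochs):
        for x, weak_y in loader:  # weak_y: clip-level multi-label targets
            target = weak_y
            if teacher is not None:
                with torch.no_grad():
                    target = alpha * weak_y + (1 - alpha) * torch.sigmoid(teacher(x))
            loss = bce(student(x), target)
            opt.zero_grad(); loss.backward(); opt.step()
    return student

# Cascade: generation 0 learns from weak labels alone; later generations
# also receive supervision from their predecessor.
# teacher = None
# for gen in range(3):
#     student = make_model()  # hypothetical model factory
#     teacher = train_generation(student, loader, teacher)
```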
Model selection for high-dimensional linear regression with dependent observations
Title | Model selection for high-dimensional linear regression with dependent observations |
Authors | Ching-Kang Ing |
Abstract | We investigate the prediction capability of the orthogonal greedy algorithm (OGA) in high-dimensional regression models with dependent observations. The rates of convergence of the prediction error of OGA are obtained under a variety of sparsity conditions. To prevent OGA from overfitting, we introduce a high-dimensional Akaike’s information criterion (HDAIC) to determine the number of OGA iterations. A key contribution of this work is to show that OGA, used in conjunction with HDAIC, can achieve the optimal convergence rate without knowledge of how sparse the underlying high-dimensional model is. |
Tasks | Model Selection |
Published | 2019-06-18 |
URL | https://arxiv.org/abs/1906.07395v1 |
https://arxiv.org/pdf/1906.07395v1.pdf | |
PWC | https://paperswithcode.com/paper/model-selection-for-high-dimensional-linear |
Repo | |
Framework | |
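For concreteness, here is a hedged NumPy sketch of OGA (equivalently, orthogonal matching pursuit: greedily select the predictor most correlated with the residual, then refit by least squares) with an HDAIC-style stopping rule. The criterion form sigma²_m · (1 + C·m·log(p)/n) is an assumption about HDAIC's shape; consult the paper for the exact definition and constants.

```python
# Sketch of OGA with an assumed HDAIC-style criterion; not the paper's code.
import numpy as np

def oga_hdaic(X, y, max_iter=50, C=2.0):
    n, p = X.shape
    support, residual = [], y.copy()
    best_support, best_crit = [], np.inf
    for m in range(1, max_iter + 1):
        # Pick the predictor most correlated with the current residual.
        scores = np.abs(X.T @ residual) / np.linalg.norm(X, axis=0)
        scores[support] = -np.inf
        support.append(int(np.argmax(scores)))
        # Refit by least squares on the support (the "orthogonal" step).
        beta, *_ = np.linalg.lstsq(X[:, support], y, rcond=None)
        residual = y - X[:, support] @ beta
        sigma2 = residual @ residual / n
        crit = sigma2 * (1 + C * m * np.log(p) / n)  # assumed HDAIC form
        if crit < best_crit:
            best_crit, best_support = crit, list(support)
    return best_support

# Toy usage: a sparse linear model with n < p.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 500))
y = 3 * X[:, 7] - 2 * X[:, 42] + 0.1 * rng.standard_normal(100)
print(oga_hdaic(X, y))  # should recover {7, 42}
```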
EAST: Encoding-Aware Sparse Training for Deep Memory Compression of ConvNets
Title | EAST: Encoding-Aware Sparse Training for Deep Memory Compression of ConvNets |
Authors | Matteo Grimaldi, Valentino Peluso, Andrea Calimera |
Abstract | The implementation of Deep Convolutional Neural Networks (ConvNets) on tiny end-nodes with limited non-volatile memory space calls for smart compression strategies capable of shrinking the footprint yet preserving predictive accuracy. There exist a number of strategies for this purpose, from those that play with the topology of the model or the arithmetic precision, e.g. pruning and quantization, to those that apply model-agnostic compression, e.g. weight encoding. The tighter the memory constraint, the higher the probability that these techniques alone cannot meet the requirement; hence more awareness and cooperation across the different optimizations become mandatory. This work addresses the issue by introducing EAST, Encoding-Aware Sparse Training, a novel memory-constrained training procedure that leads quantized ConvNets towards deep memory compression. EAST implements adaptive group pruning designed to maximize the compression rate of the weight encoding scheme (the LZ4 algorithm in this work). Compared to existing methods, EAST meets the memory constraint with lower sparsity, hence ensuring higher accuracy. Experiments conducted on a state-of-the-art ConvNet (ResNet-9) deployed on a low-power microcontroller (ARM Cortex-M4) validate the proposal. |
Tasks | Quantization |
Published | 2019-12-20 |
URL | https://arxiv.org/abs/1912.10087v1 |
https://arxiv.org/pdf/1912.10087v1.pdf | |
PWC | https://paperswithcode.com/paper/east-encoding-aware-sparse-training-for-deep |
Repo | |
Framework | |
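The interaction between pruning and encoding can be made concrete with a hedged, post-hoc sketch: zero groups of quantized weights in order of increasing magnitude until the LZ4-compressed payload fits a byte budget. EAST does this adaptively during training; the sketch only illustrates the compression-driven stopping rule, and it assumes the `lz4` pip package is installed.

```python
# Sketch of encoding-aware group pruning against an LZ4 byte budget;
# a post-hoc illustration, not EAST's training-time procedure.
import numpy as np
import lz4.frame

def prune_to_budget(weights: np.ndarray, budget_bytes: int, group: int = 16):
    w = weights.astype(np.int8).ravel().copy()
    pad = (-len(w)) % group
    w = np.pad(w, (0, pad))
    groups = w.reshape(-1, group)
    # Zero groups in ascending order of L1 norm; long zero runs compress well.
    order = np.argsort(np.abs(groups.astype(np.int32)).sum(axis=1))
    for idx in order:
        compressed = lz4.frame.compress(groups.tobytes())
        if len(compressed) <= budget_bytes:
            break
        groups[idx] = 0
    return groups.ravel()[:weights.size].reshape(weights.shape)

# Toy usage on random int8 "weights" with a 4 KiB budget.
w = np.random.randint(-128, 128, size=(64, 64), dtype=np.int8)
pruned = prune_to_budget(w, budget_bytes=4096)
```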
Sublinear Subwindow Search
Title | Sublinear Subwindow Search |
Authors | Max Reuter, Gheorghe-Teodor Bercea |
Abstract | We propose an efficient approximation algorithm for subwindow search that runs in sublinear time and memory. Applied to object localization, this algorithm significantly reduces running time and memory usage while maintaining accuracy competitive with the state-of-the-art. The algorithm's accuracy also scales with both the size and the spatial coherence (nearby-element similarity) of the matrix. It is thus well-suited for real-time applications and for a wide range of matrices in general. |
Tasks | Object Localization |
Published | 2019-07-31 |
URL | https://arxiv.org/abs/1908.00140v1 |
https://arxiv.org/pdf/1908.00140v1.pdf | |
PWC | https://paperswithcode.com/paper/sublinear-subwindow-search |
Repo | |
Framework | |
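The abstract gives no algorithmic details, so the toy below is emphatically not the authors' method: it is a generic Monte Carlo subwindow search that scores random candidate windows from a small sample of their entries. Spatial coherence (nearby entries being similar) is exactly what makes such sampled estimates reliable, which motivates why sublinear approximation is plausible here.

```python
# Toy Monte Carlo subwindow search; an illustrative stand-in only.
import numpy as np

def approx_best_window(M, win_h, win_w, n_windows=200, n_samples=64, seed=0):
    rng = np.random.default_rng(seed)
    H, W = M.shape
    best, best_score = None, -np.inf
    for _ in range(n_windows):  # random candidate windows
        top = rng.integers(0, H - win_h + 1)
        left = rng.integers(0, W - win_w + 1)
        rows = rng.integers(top, top + win_h, size=n_samples)
        cols = rng.integers(left, left + win_w, size=n_samples)
        # Estimate the window sum from a sample, scaled to the full area.
        score = M[rows, cols].mean() * win_h * win_w
        if score > best_score:
            best_score, best = score, (int(top), int(left))
    return best, best_score

# Usage: locate an approximately best 20x20 window in a smooth matrix.
x = np.linspace(-3, 3, 500)
M = np.exp(-(x[:, None] ** 2 + x[None, :] ** 2))  # one smooth bump
print(approx_best_window(M, 20, 20))
```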
Speaker Adaptation for End-to-End CTC Models
Title | Speaker Adaptation for End-to-End CTC Models |
Authors | Ke Li, Jinyu Li, Yong Zhao, Kshitiz Kumar, Yifan Gong |
Abstract | We propose two approaches for speaker adaptation in end-to-end (E2E) automatic speech recognition systems. One is Kullback-Leibler divergence (KLD) regularization and the other is multi-task learning (MTL). Both approaches aim to address the data sparsity issue of speaker adaptation in E2E systems, especially the sparsity of output targets. The KLD regularization adapts a model by forcing the output distribution from the adapted model to be close to the unadapted one. The MTL utilizes a jointly trained auxiliary task to improve the performance of the main task. We investigated our approaches on E2E connectionist temporal classification (CTC) models with three different types of output units. Experiments on the Microsoft short message dictation task demonstrated that MTL outperforms KLD regularization. In particular, the MTL adaptation obtained 8.8% and 4.0% relative word error rate reductions (WERRs) for supervised and unsupervised adaptation of the word CTC model, and 9.6% and 3.8% relative WERRs for the mix-unit CTC model, respectively. |
Tasks | Multi-Task Learning, Speech Recognition |
Published | 2019-01-04 |
URL | http://arxiv.org/abs/1901.01239v1 |
http://arxiv.org/pdf/1901.01239v1.pdf | |
PWC | https://paperswithcode.com/paper/speaker-adaptation-for-end-to-end-ctc-models |
Repo | |
Framework | |
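A hedged sketch of the KLD-regularized adaptation loss described above, for a CTC model in PyTorch: the adapted model's per-frame output distribution is pulled toward that of the frozen speaker-independent (unadapted) model. The interpolation weight `rho` and the exact combination of terms are assumptions, not the paper's published recipe.

```python
# Sketch of KLD-regularized CTC adaptation; assumed loss combination.
import torch
import torch.nn.functional as F

def kld_adapt_loss(adapted_logits, si_logits, targets,
                   in_lens, tgt_lens, rho=0.5):
    # adapted_logits, si_logits: (T, batch, vocab) frame-level outputs.
    log_p_adapt = F.log_softmax(adapted_logits, dim=-1)
    ctc = F.ctc_loss(log_p_adapt, targets, in_lens, tgt_lens, blank=0)
    with torch.no_grad():
        p_si = F.softmax(si_logits, dim=-1)  # frozen SI model's distribution
    # KL(p_si || p_adapted), averaged over frames, keeps the adapted model
    # close to the unadapted one.
    kld = F.kl_div(log_p_adapt, p_si, reduction="batchmean")
    return (1 - rho) * ctc + rho * kld
```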
Unsupervised Fault Detection in Varying Operating Conditions
Title | Unsupervised Fault Detection in Varying Operating Conditions |
Authors | Gabriel Michau, Olga Fink |
Abstract | Training data-driven approaches for complex industrial system health monitoring is challenging. When data on faulty conditions are rare or not available, the training has to be performed in an unsupervised manner. In addition, when the observation period used for training is kept short, so that the system can be monitored early in its life, the training data might not be representative of all of the system's normal operating conditions. In this paper, we propose five approaches to perform fault detection in this context. Two approaches rely only on data from the unit to be monitored: the baseline is trained on the early life of the unit, while an incremental learning procedure tries to learn new operating conditions as they arise. Three other approaches take advantage of data from other similar units within a fleet. In two cases, units are directly compared to each other with similarity measures, and the data from similar units are combined in the training set. In the third case, we propose a new deep-learning methodology that first performs a feature alignment of different units with an Unsupervised Feature Alignment Network (UFAN) and then combines the features of both units in the training set of the fault-detection neural network. The approaches are tested on a fleet comprising 112 units, observed over one year. All approaches proposed here improve on the baseline, which is trained with only two months of data. As units in the fleet are found to be very dissimilar, the new UFAN architecture, which aligns units in the feature space, outperforms the others. |
Tasks | Fault Detection |
Published | 2019-07-15 |
URL | https://arxiv.org/abs/1907.06481v1 |
https://arxiv.org/pdf/1907.06481v1.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-fault-detection-in-varying |
Repo | |
Framework | |
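As a hedged stand-in for the single-unit baseline described above (a detector trained only on the unit's early, healthy life), the sketch below fits an autoencoder on healthy condition-monitoring data and flags test points whose reconstruction error exceeds a threshold. The paper's actual detector architecture is not reproduced here.

```python
# Sketch of an early-life, reconstruction-error fault-detection baseline;
# the autoencoder and threshold rule are illustrative assumptions.
import torch
import torch.nn as nn

def fit_baseline(healthy: torch.Tensor, epochs=200):
    d = healthy.shape[1]
    ae = nn.Sequential(nn.Linear(d, 8), nn.ReLU(), nn.Linear(8, d))
    opt = torch.optim.Adam(ae.parameters(), lr=1e-3)
    for _ in range(epochs):
        loss = ((ae(healthy) - healthy) ** 2).mean()
        opt.zero_grad(); loss.backward(); opt.step()
    with torch.no_grad():
        err = ((ae(healthy) - healthy) ** 2).mean(dim=1)
    threshold = err.max() * 1.1  # small safety margin over training errors
    return ae, threshold

def is_faulty(ae, threshold, x: torch.Tensor) -> torch.Tensor:
    with torch.no_grad():
        return ((ae(x) - x) ** 2).mean(dim=1) > threshold

# Usage: two months of healthy features from 32 sensors, then new data.
healthy = torch.randn(1000, 32)
ae, thr = fit_baseline(healthy)
alarms = is_faulty(ae, thr, torch.randn(50, 32) * 3)  # shifted data -> alarms
```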
Hybrid Probabilistic Inference with Logical Constraints: Tractability and Message Passing
Title | Hybrid Probabilistic Inference with Logical Constraints: Tractability and Message Passing |
Authors | Zhe Zeng, Fanqi Yan, Paolo Morettin, Antonio Vergari, Guy Van den Broeck |
Abstract | Weighted model integration (WMI) is a very appealing framework for probabilistic inference: it makes it possible to express the complex dependencies of real-world hybrid scenarios, where variables are heterogeneous in nature (both continuous and discrete), via the language of Satisfiability Modulo Theories (SMT), and to compute probabilistic queries with arbitrarily complex logical constraints. Recent work has shown WMI inference to be reducible to a model integration (MI) problem under some assumptions, thus effectively allowing hybrid probabilistic reasoning by volume computations. In this paper, we introduce a novel formulation of MI via a message-passing scheme that efficiently computes the marginal densities and statistical moments of all the variables in linear time. As such, we are able to amortize inference for arbitrarily rich MI queries when they conform to the problem structure, here represented as the primal graph associated with the SMT formula. Furthermore, we theoretically trace the tractability boundaries of exact MI: we prove that the structural requirements on the primal graph that make our MI algorithm tractable, namely bounded diameter and treewidth, are not only sufficient but also necessary for tractable inference via MI. |
Tasks | |
Published | 2019-09-20 |
URL | https://arxiv.org/abs/1909.09362v2 |
https://arxiv.org/pdf/1909.09362v2.pdf | |
PWC | https://paperswithcode.com/paper/hybrid-probabilistic-inference-with-logical |
Repo | |
Framework | |
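For readers new to the framework, here is the standard definition of WMI from the literature; it is background, not a formula reproduced from this paper.

```latex
% Standard WMI definition: for an SMT formula \varphi over continuous
% variables x and Boolean variables b, and a weight function w,
\[
  \mathrm{WMI}(\varphi, w)
  = \sum_{b \in \{0,1\}^m} \int_{\{x \,:\, (x, b) \models \varphi\}} w(x, b)\, \mathrm{d}x .
\]
% Plain model integration (MI) is the unweighted special case w \equiv 1,
% which is why WMI queries reduce to volume computations.
```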
TensorFlow.js: Machine Learning for the Web and Beyond
Title | TensorFlow.js: Machine Learning for the Web and Beyond |
Authors | Daniel Smilkov, Nikhil Thorat, Yannick Assogba, Ann Yuan, Nick Kreeger, Ping Yu, Kangyi Zhang, Shanqing Cai, Eric Nielsen, David Soergel, Stan Bileschi, Michael Terry, Charles Nicholson, Sandeep N. Gupta, Sarah Sirajuddin, D. Sculley, Rajat Monga, Greg Corrado, Fernanda B. Viégas, Martin Wattenberg |
Abstract | TensorFlow.js is a library for building and executing machine learning algorithms in JavaScript. TensorFlow.js models run in a web browser and in the Node.js environment. The library is part of the TensorFlow ecosystem, providing a set of APIs that are compatible with those in Python, allowing models to be ported between the Python and JavaScript ecosystems. TensorFlow.js has empowered a new set of developers from the extensive JavaScript community to build and deploy machine learning models and enabled new classes of on-device computation. This paper describes the design, API, and implementation of TensorFlow.js, and highlights some of the impactful use cases. |
Tasks | |
Published | 2019-01-16 |
URL | http://arxiv.org/abs/1901.05350v2 |
http://arxiv.org/pdf/1901.05350v2.pdf | |
PWC | https://paperswithcode.com/paper/tensorflowjs-machine-learning-for-the-web-and |
Repo | |
Framework | |
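A brief sketch of the Python-to-JavaScript porting path the abstract mentions: a Keras model built in Python is exported to the TensorFlow.js web format with the `tensorflowjs` pip package (assumed installed), after which it can be loaded in the browser. The model itself is a throwaway example.

```python
# Sketch: export a Keras model for TensorFlow.js consumption.
import tensorflow as tf
import tensorflowjs as tfjs

model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(4,)),
    tf.keras.layers.Dense(3, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Writes model.json plus binary weight shards consumable by TensorFlow.js.
tfjs.converters.save_keras_model(model, "tfjs_model")
# In JavaScript:
#   const model = await tf.loadLayersModel('tfjs_model/model.json');
```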
On Controlled DeEntanglement for Natural Language Processing
Title | On Controlled DeEntanglement for Natural Language Processing |
Authors | SaiKrishna Rallabandi |
Abstract | The latest addition to the toolbox of the human species is Artificial Intelligence (AI). Thus far, AI has made significant progress in low-stakes, low-risk scenarios such as playing Go, and we are currently in a transition toward medium-stakes scenarios such as Visual Dialog. In my thesis, I argue that we need to incorporate controlled de-entanglement as a first-class object to succeed in this transition. I present mathematical analysis from information theory to show that employing stochasticity leads to controlled de-entanglement of relevant factors of variation at various levels. Based on this, I highlight results from initial experiments that show the efficacy of the proposed framework. I conclude this write-up with a roadmap of experiments that show the applicability of this framework to scalability, flexibility and interpretability. |
Tasks | Visual Dialog |
Published | 2019-09-22 |
URL | https://arxiv.org/abs/1909.09964v1 |
https://arxiv.org/pdf/1909.09964v1.pdf | |
PWC | https://paperswithcode.com/paper/190909964 |
Repo | |
Framework | |
Road Damage Detection Based on Unsupervised Disparity Map Segmentation
Title | Road Damage Detection Based on Unsupervised Disparity Map Segmentation |
Authors | Rui Fan, Ming Liu |
Abstract | This paper presents a novel road damage detection algorithm based on unsupervised disparity map segmentation. Firstly, a disparity map is transformed by minimizing an energy function with respect to the stereo rig roll angle and the road disparity projection model. Instead of solving this energy minimization problem using non-linear optimization techniques, we directly find its numerical solution. The transformed disparity map is then segmented using Otsu's thresholding method, and the damaged road areas can be extracted. The proposed algorithm requires no parameters when detecting road damage. The experimental results illustrate that our proposed algorithm performs both accurately and efficiently. The pixel-level road damage detection accuracy is approximately 97.56%. |
Tasks | Road Damage Detection |
Published | 2019-10-11 |
URL | https://arxiv.org/abs/1910.04988v1 |
https://arxiv.org/pdf/1910.04988v1.pdf | |
PWC | https://paperswithcode.com/paper/road-damage-detection-based-on-unsupervised |
Repo | |
Framework | |
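A hedged sketch of the segmentation step only: Otsu's thresholding applied to a synthetic stand-in for a transformed disparity map, separating damage-like regions from the road surface. The disparity transformation itself (roll-angle and projection-model fitting) is the paper's contribution and is not shown.

```python
# Sketch: Otsu thresholding on a synthetic transformed disparity map.
import numpy as np
from skimage.filters import threshold_otsu

# Synthetic stand-in: flat road (~0 deviation) with a pothole region
# of larger deviation.
disparity = np.abs(np.random.normal(0.0, 0.5, (240, 320)))
disparity[100:140, 150:200] += 5.0

t = threshold_otsu(disparity)
damage_mask = disparity > t  # True where damage-like deviation is detected
print(f"threshold={t:.2f}, damaged pixels={damage_mask.sum()}")
```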
Relating lp regularization and reweighted l1 regularization
Title | Relating lp regularization and reweighted l1 regularization |
Authors | Hao Wang, Hao Zeng, Jiashan Wang |
Abstract | We propose a general framework of iteratively reweighted l1 methods for solving lp regularization problems. We prove that, after some iteration k, the iterates generated by the proposed methods have the same support and sign as the limit points and are bounded away from 0, so that the algorithm behaves as if it were solving a smooth problem in the reduced space. As a result, global convergence can be easily obtained, and we propose an update strategy for the smoothing parameter that automatically terminates the updates for zero components. We show that lp regularization problems are locally equivalent to a weighted l1 regularization problem and that every optimal point corresponds to a Maximum A Posteriori estimate under independently but non-identically distributed Laplace priors. Numerical experiments exhibit the behavior and efficiency of our proposed methods. |
Tasks | |
Published | 2019-12-02 |
URL | https://arxiv.org/abs/1912.00723v1 |
https://arxiv.org/pdf/1912.00723v1.pdf | |
PWC | https://paperswithcode.com/paper/relating-lp-regularization-and-reweighted-l1 |
Repo | |
Framework | |
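To make the reweighting concrete, here is a hedged NumPy sketch of an iteratively reweighted l1 method for min_x 0.5·||Ax - b||² + lam·Σ|x_i|^p with 0 < p < 1. Each outer iteration linearizes |x|^p around the current iterate, giving weights w_i = p·(|x_i| + eps)^(p-1), and solves the weighted-l1 subproblem with a few ISTA steps. The geometric shrinking of eps is a simplification of the paper's smoothing-parameter strategy.

```python
# Sketch of IRL1 for lp regularization; the eps schedule is an assumption.
import numpy as np

def irl1_lp(A, b, lam=0.1, p=0.5, eps=1.0, outer=30, inner=100):
    n = A.shape[1]
    x = np.zeros(n)
    step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1/L for the smooth part
    for _ in range(outer):
        w = p * (np.abs(x) + eps) ** (p - 1)  # reweighting from |x|^p
        for _ in range(inner):  # ISTA on the weighted-l1 subproblem
            g = x - step * A.T @ (A @ x - b)
            x = np.sign(g) * np.maximum(np.abs(g) - step * lam * w, 0.0)
        eps = max(eps * 0.7, 1e-8)  # shrink the smoothing parameter
    return x

# Toy usage: recover a 3-sparse vector from 40 noisy measurements.
rng = np.random.default_rng(1)
A = rng.standard_normal((40, 100))
x_true = np.zeros(100); x_true[[5, 17, 60]] = [2.0, -1.5, 1.0]
b = A @ x_true + 0.01 * rng.standard_normal(40)
print(np.flatnonzero(np.round(irl1_lp(A, b), 2)))  # expect ~[5, 17, 60]
```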
A Novel Hierarchical Binary Tagging Framework for Joint Extraction of Entities and Relations
Title | A Novel Hierarchical Binary Tagging Framework for Joint Extraction of Entities and Relations |
Authors | Zhepei Wei, Jianlin Su, Yue Wang, Yuan Tian, Yi Chang |
Abstract | Extracting relational triples from unstructured text is crucial for large-scale knowledge graph construction. However, few existing works excel at solving the overlapping triple problem, where multiple relational triples in the same sentence share the same entities. We propose a novel Hierarchical Binary Tagging (HBT) framework derived from a principled problem formulation. Instead of treating relations as discrete labels as in previous works, our new framework models relations as functions that map subjects to objects in a sentence, which naturally handles overlapping triples. Experiments show that the proposed framework already outperforms state-of-the-art methods even when its encoder module uses a randomly initialized BERT encoder, showing the power of the new tagging framework. It enjoys a further performance boost when employing a pretrained BERT encoder, outperforming the strongest baseline by 25.6 and 45.9 absolute points in F1-score on the two public datasets NYT and WebNLG, respectively. In-depth analysis of different types of overlapping triples shows that the method delivers consistent performance gains in all scenarios. |
Tasks | Relation Extraction |
Published | 2019-09-07 |
URL | https://arxiv.org/abs/1909.03227v1 |
https://arxiv.org/pdf/1909.03227v1.pdf | |
PWC | https://paperswithcode.com/paper/a-novel-hierarchical-binary-tagging-framework |
Repo | |
Framework | |
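A hedged PyTorch sketch of the hierarchical binary tagging idea: a subject tagger marks start/end token positions with sigmoids, and, for each detected subject, relation-specific object taggers mark object spans, so each relation acts as a function from subjects to objects. The encoder choice and the mean-pooled subject conditioning are simplifying assumptions, not the paper's exact design.

```python
# Sketch of HBT-style tagging heads; conditioning scheme is assumed.
import torch
import torch.nn as nn

class HBTHeads(nn.Module):
    def __init__(self, hidden: int, n_relations: int):
        super().__init__()
        self.subj_head = nn.Linear(hidden, 2)               # subject start/end
        self.obj_head = nn.Linear(hidden, 2 * n_relations)  # per-relation start/end

    def forward(self, h, subj_span=None):
        # h: (batch, seq_len, hidden) contextual token encodings (e.g. BERT).
        subj_probs = torch.sigmoid(self.subj_head(h))       # (batch, seq, 2)
        obj_probs = None
        if subj_span is not None:
            s, e = subj_span
            subj_vec = h[:, s:e + 1].mean(dim=1, keepdim=True)  # pooled subject
            logits = self.obj_head(h + subj_vec)                # condition on subject
            obj_probs = torch.sigmoid(logits).view(*h.shape[:2], -1, 2)
        return subj_probs, obj_probs

# Usage: score subject spans, then object spans for one candidate subject.
heads = HBTHeads(hidden=768, n_relations=24)
h = torch.randn(2, 40, 768)
subj_probs, obj_probs = heads(h, subj_span=(3, 5))  # obj_probs: (2, 40, 24, 2)
```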
Global Convergence of Gradient Descent for Deep Linear Residual Networks
Title | Global Convergence of Gradient Descent for Deep Linear Residual Networks |
Authors | Lei Wu, Qingcan Wang, Chao Ma |
Abstract | We analyze the global convergence of gradient descent for deep linear residual networks by proposing a new initialization: zero-asymmetric (ZAS) initialization. It is motivated by avoiding stable manifolds of saddle points. We prove that under the ZAS initialization, for an arbitrary target matrix, gradient descent converges to an $\varepsilon$-optimal point in $O(L^3 \log(1/\varepsilon))$ iterations, which scales polynomially with the network depth $L$. Our result and the $\exp(\Omega(L))$ convergence time for the standard initialization (Xavier or near-identity) [Shamir, 2018] together demonstrate the importance of the residual structure and the initialization in the optimization for deep linear neural networks, especially when $L$ is large. |
Tasks | |
Published | 2019-11-02 |
URL | https://arxiv.org/abs/1911.00645v1 |
https://arxiv.org/pdf/1911.00645v1.pdf | |
PWC | https://paperswithcode.com/paper/global-convergence-of-gradient-descent-for |
Repo | |
Framework | |
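For readers who want the setting written out, here is one standard formulation of the deep linear residual objective the abstract describes; the paper's population loss may include input covariance terms, so treat this as a hedged restatement rather than the paper's exact statement.

```latex
% A depth-L linear residual network maps x to
% f(x) = (I + W_L)(I + W_{L-1}) \cdots (I + W_1)\, x,
% and gradient descent minimizes the squared loss against a target matrix \Phi:
\[
  \min_{W_1, \dots, W_L} \; L(W) \;=\;
  \tfrac{1}{2}\, \big\| (I + W_L) \cdots (I + W_1) - \Phi \big\|_F^2 .
\]
% The abstract's claim: under ZAS initialization, gradient descent reaches an
% \varepsilon-optimal point in O(L^3 \log(1/\varepsilon)) iterations, versus
% \exp(\Omega(L)) time for standard (Xavier or near-identity) initialization.
```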