Paper Group ANR 27
Weighted AdaGrad with Unified Momentum
Title | Weighted AdaGrad with Unified Momentum |
Authors | Fangyu Zou, Li Shen, Zequn Jie, Ju Sun, Wei Liu |
Abstract | Integrating adaptive learning rate and momentum techniques into SGD leads to a large class of efficiently accelerated adaptive stochastic algorithms, such as Nadam, AccAdaGrad, etc. In spite of their effectiveness in practice, there is still a large gap in their convergence theory, especially in the difficult non-convex stochastic setting. To fill this gap, we propose weighted AdaGrad with unified momentum, dubbed AdaUSM, whose main characteristics are that (1) it incorporates a unified momentum scheme which covers both the heavy ball momentum and the Nesterov accelerated gradient momentum; (2) it adopts a novel weighted adaptive learning rate that can unify the learning rates of AdaGrad, AccAdaGrad, Adam, and RMSProp. Moreover, when we take polynomially growing weights in AdaUSM, we obtain its $\mathcal{O}(\log(T)/\sqrt{T})$ convergence rate in the non-convex stochastic setting. We also show that the adaptive learning rates of Adam and RMSProp correspond to taking exponentially growing weights in AdaUSM, which thereby provides a new perspective for understanding Adam and RMSProp. Lastly, comparative experiments of AdaUSM against SGD with momentum, AdaGrad, AdaEMA, Adam, and AMSGrad on various deep learning models and datasets are also provided. |
Tasks | Stochastic Optimization |
Published | 2018-08-10 |
URL | https://arxiv.org/abs/1808.03408v3 |
https://arxiv.org/pdf/1808.03408v3.pdf | |
PWC | https://paperswithcode.com/paper/on-the-convergence-of-weighted-adagrad-with |
Repo | |
Framework | |
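The abstract's remark that Adam and RMSProp correspond to exponentially growing weights can be checked numerically. The sketch below is not the authors' AdaUSM implementation; it assumes the weighted second-moment estimate takes the form sum(a_i * g_i^2) / sum(a_i), and the function names, gradient values, and `beta` are hypothetical. Under that assumption, the weights a_i = beta^(-i) reproduce Adam's bias-corrected moving average of squared gradients.

```python
import math

def weighted_second_moment(grads, weights):
    """Weighted average of squared gradients: sum(a_i * g_i^2) / sum(a_i)."""
    numerator = sum(a * g * g for a, g in zip(weights, grads))
    return numerator / sum(weights)

def adam_second_moment(grads, beta):
    """Bias-corrected exponential moving average of squared gradients (Adam-style)."""
    v = 0.0
    for g in grads:
        v = beta * v + (1.0 - beta) * g * g
    return v / (1.0 - beta ** len(grads))

# Hypothetical scalar gradient history and decay rate, for illustration only.
grads = [0.5, -1.2, 0.3, 2.0, -0.7]
beta = 0.9

# Exponentially growing weights a_i = beta^(-i), i = 1..t.
exp_weights = [beta ** (-i) for i in range(1, len(grads) + 1)]
# Polynomially growing weights a_i = i, the regime the convergence rate is stated for.
poly_weights = [float(i) for i in range(1, len(grads) + 1)]

v_exp = weighted_second_moment(grads, exp_weights)
v_adam = adam_second_moment(grads, beta)
v_poly = weighted_second_moment(grads, poly_weights)
```

Here `v_exp` and `v_adam` agree up to floating-point error, while `v_poly` gives a different estimate that discounts old gradients more slowly.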
An Intelligent Extraversion Analysis Scheme from Crowd Trajectories for Surveillance
Title | An Intelligent Extraversion Analysis Scheme from Crowd Trajectories for Surveillance |
Authors | Wenxi Liu, Yuanlong Yu, Chun-Yang Zhang, Genggeng Liu, Naixue Xiong |
Abstract | In recent years, crowd analysis has become important for applications such as smart cities, intelligent transportation systems, customer behavior prediction, and visual surveillance. Understanding the characteristics of individual motion in a crowd can benefit social event detection and abnormality detection, but it has rarely been studied. In this paper, we focus on measuring the extraversion of individual motions in crowds based on trajectory data. Extraversion is a typical personality trait that is often observed in human crowd behaviors, and it can reflect not only the characteristics of the individual motion but also those of the holistic crowd motion. To the best of our knowledge, this is the first attempt to analyze individual extraversion of crowd motions based on trajectories. To accomplish this, we first present an effective composite motion descriptor, which integrates basic individual motion information and social metrics, to describe the extraversion of each individual in a crowd. The social metrics consider both the distribution of neighbors and their interaction patterns. Since our major goal is to learn a universal scoring function that can measure the degree of extraversion across varied crowd scenes, we incorporate and adapt the active learning technique to the relative attribute approach. Specifically, we assume that the social groups in any crowd contain individuals with a similar degree of extraversion. Based on this assumption, we significantly reduce the computation cost by clustering and ranking the trajectories actively. Finally, we demonstrate the performance of our proposed method by measuring the degree of extraversion for real individual trajectories in crowds and analyzing crowd scenes from a real-world dataset. |
Tasks | Active Learning |
Published | 2018-09-27 |
URL | https://arxiv.org/abs/1809.10398v2 |
https://arxiv.org/pdf/1809.10398v2.pdf | |
PWC | https://paperswithcode.com/paper/an-intelligent-extraversion-analysis-scheme |
Repo | |
Framework | |
Task-Driven Super Resolution: Object Detection in Low-resolution Images
Title | Task-Driven Super Resolution: Object Detection in Low-resolution Images |
Authors | Muhammad Haris, Greg Shakhnarovich, Norimichi Ukita |
Abstract | We consider how image super resolution (SR) can contribute to an object detection task in low-resolution images. Intuitively, SR should have a positive impact on the object detection task. While several previous works have demonstrated that this intuition is correct, the SR model and the detector are optimized independently in these works. This paper proposes a novel framework to train a deep neural network where the SR sub-network explicitly incorporates a detection loss in its training objective, via a tradeoff with a traditional detection loss. This end-to-end training procedure allows us to train SR preprocessing for any differentiable detector. We demonstrate that our task-driven SR consistently and significantly improves the accuracy of an object detector on low-resolution images for a variety of conditions and scaling factors. |
Tasks | Image Super-Resolution, Object Detection, Super-Resolution |
Published | 2018-03-30 |
URL | http://arxiv.org/abs/1803.11316v1 |
http://arxiv.org/pdf/1803.11316v1.pdf | |
PWC | https://paperswithcode.com/paper/task-driven-super-resolution-object-detection |
Repo | |
Framework | |
Clustering-based Anomaly Detection for microservices
Title | Clustering-based Anomaly Detection for microservices |
Authors | Roman Nikiforov |
Abstract | Anomaly detection is an important step in the management and monitoring of data centers and cloud computing platforms. The ability to detect anomalous virtual machines before real failures occur results in reduced downtime while operations engineers urgently recover malfunctioning virtual machines, more efficient root cause analysis, and improved customer optics in the event that said malfunction leads to an outage. Virtual machines can fail at any time, whether in a lab or a production system. If there is no anomaly detection system and a virtual machine in a lab environment fails, the QA and DEV teams have to switch to another environment while the OPS team fixes the failure. Failing to detect anomalous virtual machines can have financial ramifications, both when developing new features and when servicing existing ones. This paper presents a model that can efficiently detect anomalous virtual machines in both production and testing environments. |
Tasks | Anomaly Detection |
Published | 2018-10-04 |
URL | http://arxiv.org/abs/1810.02762v1 |
http://arxiv.org/pdf/1810.02762v1.pdf | |
PWC | https://paperswithcode.com/paper/clustering-based-anomaly-detection-for |
Repo | |
Framework | |
Fusion of hyperspectral and ground penetrating radar to estimate soil moisture
Title | Fusion of hyperspectral and ground penetrating radar to estimate soil moisture |
Authors | Felix M. Riese, Sina Keller |
Abstract | In this contribution, we investigate the potential of hyperspectral data combined with either simulated ground penetrating radar (GPR) or simulated (sensor-like) soil-moisture data to estimate soil moisture. We propose two simulation approaches to extend a given multi-sensor dataset which contains sparse GPR data. In the first approach, simulated GPR data is generated either by an interpolation along the time axis or by a machine learning model. The second approach includes the simulation of soil moisture along the GPR profile. The soil-moisture estimation is improved significantly by the fusion of hyperspectral and GPR data. In contrast, the combination of simulated, sensor-like soil-moisture values and hyperspectral data achieves the worst regression performance. In conclusion, the estimation of soil moisture with hyperspectral and GPR data warrants further investigation. |
Tasks | |
Published | 2018-04-14 |
URL | http://arxiv.org/abs/1804.05273v3 |
http://arxiv.org/pdf/1804.05273v3.pdf | |
PWC | https://paperswithcode.com/paper/fusion-of-hyperspectral-and-ground |
Repo | |
Framework | |
Equivalence of approximation by convolutional neural networks and fully-connected networks
Title | Equivalence of approximation by convolutional neural networks and fully-connected networks |
Authors | Philipp Petersen, Felix Voigtlaender |
Abstract | Convolutional neural networks are the most widely used type of neural networks in applications. In mathematical analysis, however, mostly fully-connected networks are studied. In this paper, we establish a connection between both network architectures. Using this connection, we show that all upper and lower bounds concerning approximation rates of fully-connected neural networks for functions $f \in \mathcal{C}$—for an arbitrary function class $\mathcal{C}$—translate to essentially the same bounds on approximation rates of convolutional neural networks for functions $f \in {\mathcal{C}^{equi}}$, with the class $\mathcal{C}^{equi}$ consisting of all translation equivariant functions whose first coordinate belongs to $\mathcal{C}$. |
Tasks | |
Published | 2018-09-04 |
URL | http://arxiv.org/abs/1809.00973v2 |
http://arxiv.org/pdf/1809.00973v2.pdf | |
PWC | https://paperswithcode.com/paper/equivalence-of-approximation-by-convolutional |
Repo | |
Framework | |
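The class $\mathcal{C}^{equi}$ in the abstract is built from translation equivariant functions, meaning that translating the input translates the output by the same amount. The sketch below illustrates that property for a single convolutional layer. It assumes circular (periodic) convolution on a 1-D signal, and the kernel, signal, and function names are hypothetical, not taken from the paper.

```python
def circular_conv(x, kernel):
    """Circular (periodic) convolution of a 1-D signal with a kernel."""
    n = len(x)
    return [sum(kernel[j] * x[(i - j) % n] for j in range(len(kernel)))
            for i in range(n)]

def shift(x, s):
    """Cyclically translate a signal by s positions."""
    n = len(x)
    return [x[(i - s) % n] for i in range(n)]

def relu(x):
    """Elementwise ReLU, which commutes with translations."""
    return [max(0.0, v) for v in x]

def tiny_cnn(x):
    """One convolutional layer followed by a ReLU (hypothetical weights)."""
    return relu(circular_conv(x, [0.5, -1.0, 0.25]))

signal = [1.0, -2.0, 3.0, 0.5, -1.5, 2.0]

# Equivariance: translating the input first or the output afterwards
# gives the same result.
shift_then_net = tiny_cnn(shift(signal, 2))
net_then_shift = shift(tiny_cnn(signal), 2)
```

The two lists `shift_then_net` and `net_then_shift` coincide. A generic fully-connected layer has no such constraint, which is why the abstract restricts the convolutional side to equivariant target functions.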
High Dimensional Data Enrichment: Interpretable, Fast, and Data-Efficient
Title | High Dimensional Data Enrichment: Interpretable, Fast, and Data-Efficient |
Authors | Amir Asiaee, Samet Oymak, Kevin R. Coombes, Arindam Banerjee |
Abstract | The high dimensional structured data enriched model describes groups of observations by shared and per-group individual parameters, each with its own structure such as sparsity or group sparsity. In this paper, we consider the general form of data enrichment where data comes in a fixed but arbitrary number of groups G. Any convex function, e.g., a norm, can characterize the structure of both shared and individual parameters. We propose an estimator for the high dimensional data enriched model and provide conditions under which it consistently estimates both shared and individual parameters. We also delineate the sample complexity of the estimator and present a high probability non-asymptotic bound on the estimation error of all parameters. Interestingly, the sample complexity of our estimator translates to conditions on both the per-group sample sizes and the total number of samples. We propose an iterative estimation algorithm with a linear convergence rate and supplement our theoretical analysis with synthetic and real experimental results. In particular, we show the predictive power of the data-enriched model along with its interpretable results in anticancer drug sensitivity analysis. |
Tasks | |
Published | 2018-06-11 |
URL | http://arxiv.org/abs/1806.04047v3 |
http://arxiv.org/pdf/1806.04047v3.pdf | |
PWC | https://paperswithcode.com/paper/high-dimensional-data-enrichment |
Repo | |
Framework | |
Risk-Sensitive Generative Adversarial Imitation Learning
Title | Risk-Sensitive Generative Adversarial Imitation Learning |
Authors | Jonathan Lacotte, Mohammad Ghavamzadeh, Yinlam Chow, Marco Pavone |
Abstract | We study risk-sensitive imitation learning where the agent’s goal is to perform at least as well as the expert in terms of a risk profile. We first formulate our risk-sensitive imitation learning setting. We consider the generative adversarial approach to imitation learning (GAIL) and derive an optimization problem for our formulation, which we call risk-sensitive GAIL (RS-GAIL). We then derive two different versions of our RS-GAIL optimization problem that aim at matching the risk profiles of the agent and the expert w.r.t. Jensen-Shannon (JS) divergence and Wasserstein distance, and develop risk-sensitive generative adversarial imitation learning algorithms based on these optimization problems. We evaluate the performance of our algorithms and compare them with GAIL and the risk-averse imitation learning (RAIL) algorithms in two MuJoCo and two OpenAI classical control tasks. |
Tasks | Imitation Learning |
Published | 2018-08-13 |
URL | http://arxiv.org/abs/1808.04468v2 |
http://arxiv.org/pdf/1808.04468v2.pdf | |
PWC | https://paperswithcode.com/paper/risk-sensitive-generative-adversarial |
Repo | |
Framework | |
Estimating Information Flow in Deep Neural Networks
Title | Estimating Information Flow in Deep Neural Networks |
Authors | Ziv Goldfeld, Ewout van den Berg, Kristjan Greenewald, Igor Melnyk, Nam Nguyen, Brian Kingsbury, Yury Polyanskiy |
Abstract | We study the flow of information and the evolution of internal representations during deep neural network (DNN) training, aiming to demystify the compression aspect of the information bottleneck theory. The theory suggests that DNN training comprises a rapid fitting phase followed by a slower compression phase, in which the mutual information $I(X;T)$ between the input $X$ and internal representations $T$ decreases. Several papers observe compression of estimated mutual information on different DNN models, but the true $I(X;T)$ over these networks is provably either constant (discrete $X$) or infinite (continuous $X$). This work explains the discrepancy between theory and experiments, and clarifies what was actually measured by these past works. To this end, we introduce an auxiliary (noisy) DNN framework for which $I(X;T)$ is a meaningful quantity that depends on the network’s parameters. This noisy framework is shown to be a good proxy for the original (deterministic) DNN both in terms of performance and the learned representations. We then develop a rigorous estimator for $I(X;T)$ in noisy DNNs and observe compression in various models. By relating $I(X;T)$ in the noisy DNN to an information-theoretic communication problem, we show that compression is driven by the progressive clustering of hidden representations of inputs from the same class. Several methods to directly monitor clustering of hidden representations, both in noisy and deterministic DNNs, are used to show that meaningful clusters form in the $T$ space. Finally, we return to the estimator of $I(X;T)$ employed in past works, and demonstrate that while it fails to capture the true (vacuous) mutual information, it does serve as a measure for clustering. This clarifies the past observations of compression and isolates the geometric clustering of hidden representations as the true phenomenon of interest. |
Tasks | |
Published | 2018-10-12 |
URL | https://arxiv.org/abs/1810.05728v4 |
https://arxiv.org/pdf/1810.05728v4.pdf | |
PWC | https://paperswithcode.com/paper/estimating-information-flow-in-neural |
Repo | |
Framework | |
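The abstract's point that the true $I(X;T)$ is constant for discrete $X$ in a deterministic network can be illustrated with a toy calculation. The sketch below assumes a uniform discrete input and an injective deterministic map standing in for the network; the helper names and the two maps are hypothetical. Under those assumptions $I(X;T)$ equals $H(X)$ regardless of the map's parameters.

```python
import math

def mutual_information(joint):
    """I(X;T) in nats from a dict {(x, t): probability}."""
    px, pt = {}, {}
    for (x, t), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        pt[t] = pt.get(t, 0.0) + p
    return sum(p * math.log(p / (px[x] * pt[t]))
               for (x, t), p in joint.items() if p > 0)

def joint_for_deterministic_map(inputs, f):
    """Joint distribution of (X, f(X)) for X uniform over `inputs`."""
    p = 1.0 / len(inputs)
    return {(x, f(x)): p for x in inputs}

inputs = [0, 1, 2, 3]

# Two hypothetical "networks" with different parameters; both are injective.
mi_a = mutual_information(joint_for_deterministic_map(inputs, lambda x: 2.0 * x + 1.0))
mi_b = mutual_information(joint_for_deterministic_map(inputs, lambda x: -0.3 * x))
entropy_x = math.log(len(inputs))
```

Both `mi_a` and `mi_b` equal `entropy_x`, so changing the parameters leaves the quantity unchanged. This is consistent with the abstract's motivation for introducing noise to make $I(X;T)$ depend on the network's parameters.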
A personal model of trumpery: Deception detection in a real-world high-stakes setting
Title | A personal model of trumpery: Deception detection in a real-world high-stakes setting |
Authors | Sophie van der Zee, Ronald Poppe, Alice Havrileck, Aurelien Baillon |
Abstract | Language use reveals information about who we are and how we feel [1-3]. One of the pioneers in text analysis, Walter Weintraub, manually counted which types of words people used in medical interviews and showed that the frequency of first-person singular pronouns (i.e., I, me, my) was a reliable indicator of depression, with depressed people using I more often than people who are not depressed [4]. Several studies have demonstrated that language use also differs between truthful and deceptive statements [5-7], but not all differences are consistent across people and contexts, making prediction difficult [8]. Here we show how well linguistic deception detection performs at the individual level by developing a model tailored to a single individual: the current US president. Using tweets fact-checked by an independent third party (Washington Post), we found substantial linguistic differences between factually correct and incorrect tweets and developed a quantitative model based on these differences. Next, we predicted whether out-of-sample tweets were either factually correct or incorrect and achieved a 73% overall accuracy. Our results demonstrate the power of linguistic analysis in real-world deception research when applied at the individual level and provide evidence that factually incorrect tweets are not random mistakes of the sender. |
Tasks | Deception Detection |
Published | 2018-11-05 |
URL | http://arxiv.org/abs/1811.01938v1 |
http://arxiv.org/pdf/1811.01938v1.pdf | |
PWC | https://paperswithcode.com/paper/a-personal-model-of-trumpery-deception |
Repo | |
Framework | |
METCC: METric learning for Confounder Control Making distance matter in high dimensional biological analysis
Title | METCC: METric learning for Confounder Control Making distance matter in high dimensional biological analysis |
Authors | Kabir Manghnani, Adam Drake, Nathan Wan, Imran Haque |
Abstract | High-dimensional data acquired from biological experiments such as next generation sequencing are subject to a number of confounding effects. These include technical effects, such as variation across batches from instrument noise or sample processing and institution-specific differences in sample acquisition and physical handling, as well as biological effects arising from true but irrelevant differences in the biology of each sample, such as age biases in diseases. Prior work has used linear methods to adjust for such batch effects. Here, we apply contrastive metric learning by a non-linear triplet network to optimize the ability to distinguish biologically distinct sample classes in the presence of irrelevant technical and biological variation. Using whole-genome cell-free DNA data from 817 patients, we demonstrate that our approach, METric learning for Confounder Control (METCC), is able to match or exceed the classification performance achieved using a best-in-class linear method (HCP) or no normalization. Critically, results from METCC appear less confounded by irrelevant technical variables like institution and batch than those from other methods, even without access to the high quality metadata required by many existing techniques, offering hope for improved generalization. |
Tasks | Metric Learning |
Published | 2018-12-07 |
URL | http://arxiv.org/abs/1812.03188v1 |
http://arxiv.org/pdf/1812.03188v1.pdf | |
PWC | https://paperswithcode.com/paper/metcc-metric-learning-for-confounder-control |
Repo | |
Framework | |
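The abstract mentions a triplet network but does not state its loss. The sketch below assumes the commonly used margin-based triplet loss with Euclidean distance; the function names, margin value, and example embeddings are hypothetical and may differ from what METCC actually optimizes.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Margin-based triplet loss: max(0, d(a, p) - d(a, n) + margin).

    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

# Hypothetical 2-D embeddings: same-class sample close, other-class sample far.
well_separated = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])
# Same points with the roles of positive and negative swapped.
poorly_separated = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0])
```

In this setup `well_separated` is 0.0 and `poorly_separated` is 5.0, so gradient updates would only be driven by triplets whose class structure is not yet reflected in the embedding distances.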
Bayesian Models for Unit Discovery on a Very Low Resource Language
Title | Bayesian Models for Unit Discovery on a Very Low Resource Language |
Authors | Lucas Ondel, Pierre Godard, Laurent Besacier, Elin Larsen, Mark Hasegawa-Johnson, Odette Scharenborg, Emmanuel Dupoux, Lukas Burget, François Yvon, Sanjeev Khudanpur |
Abstract | Developing speech technologies for low-resource languages has become a very active research field over the last decade. Among others, Bayesian models have shown some promising results on artificial examples but still lack in situ experiments. Our work applies state-of-the-art Bayesian models to unsupervised Acoustic Unit Discovery (AUD) in a real low-resource language scenario. We also show that Bayesian models can naturally integrate information from other, resourceful languages by means of an informative prior, leading to more consistent discovered units. Finally, discovered acoustic units are used, either as the 1-best sequence or as a lattice, to perform word segmentation. Word segmentation results show that this Bayesian approach clearly outperforms a Segmental-DTW baseline on the same corpus. |
Tasks | |
Published | 2018-02-16 |
URL | http://arxiv.org/abs/1802.06053v2 |
http://arxiv.org/pdf/1802.06053v2.pdf | |
PWC | https://paperswithcode.com/paper/bayesian-models-for-unit-discovery-on-a-very |
Repo | |
Framework | |
Computational complexity lower bounds of certain discrete Radon transform approximations
Title | Computational complexity lower bounds of certain discrete Radon transform approximations |
Authors | Timur M. Khanipov |
Abstract | For the computational model where only additions are allowed, an $\Omega(n^2\log n)$ lower bound on the operation count with respect to the image size $n\times n$ is obtained for two types of discrete Radon transform implementations: the fast Hough transform and a generic strip pattern class which includes the classical Hough transform. This implies the asymptotic optimality of the fast Hough transform algorithm. The proofs are based on a specific result from Boolean circuit complexity theory and are generalized to the case of the Boolean $\vee$ binary operation. |
Tasks | |
Published | 2018-01-03 |
URL | http://arxiv.org/abs/1801.01054v1 |
http://arxiv.org/pdf/1801.01054v1.pdf | |
PWC | https://paperswithcode.com/paper/computational-complexity-lower-bounds-of |
Repo | |
Framework | |
Situation Assessment for Planning Lane Changes: Combining Recurrent Models and Prediction
Title | Situation Assessment for Planning Lane Changes: Combining Recurrent Models and Prediction |
Authors | Oliver Scheel, Loren Schwarz, Nassir Navab, Federico Tombari |
Abstract | One of the greatest challenges towards fully autonomous cars is the understanding of complex and dynamic scenes. Such understanding is needed for planning of maneuvers, especially those that are particularly frequent such as lane changes. While in recent years advanced driver-assistance systems have made driving safer and more comfortable, these have mostly focused on car following scenarios, and less on maneuvers involving lane changes. In this work we propose a situation assessment algorithm for classifying driving situations with respect to their suitability for lane changing. For this, we propose a deep learning architecture based on a Bidirectional Recurrent Neural Network, which uses Long Short-Term Memory units, and integrates a prediction component in the form of the Intelligent Driver Model. We prove the feasibility of our algorithm on the publicly available NGSIM datasets, where we outperform existing methods. |
Tasks | |
Published | 2018-05-17 |
URL | http://arxiv.org/abs/1805.06776v1 |
http://arxiv.org/pdf/1805.06776v1.pdf | |
PWC | https://paperswithcode.com/paper/situation-assessment-for-planning-lane |
Repo | |
Framework | |
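The prediction component named in the abstract, the Intelligent Driver Model (IDM), is a standard car-following model. The sketch below implements its usual acceleration formula. The parameter defaults are illustrative assumptions rather than values from the paper, and clamping the dynamic part of the desired gap at zero is a common variant, not something the abstract specifies.

```python
import math

def idm_acceleration(v, gap, dv, v0=30.0, T=1.5, a_max=1.0, b=2.0, s0=2.0, delta=4.0):
    """Acceleration of the following vehicle under the Intelligent Driver Model.

    v     : own speed (m/s)
    gap   : bumper-to-bumper distance to the leader (m), must be > 0
    dv    : approach rate, own speed minus leader speed (m/s)
    v0    : desired speed, T: desired time headway, a_max: maximum acceleration,
    b     : comfortable deceleration, s0: minimum gap, delta: acceleration exponent
    (all defaults are hypothetical illustration values)
    """
    # Desired dynamic gap; the interaction term is clamped at zero (common variant).
    s_star = s0 + max(0.0, v * T + v * dv / (2.0 * math.sqrt(a_max * b)))
    return a_max * (1.0 - (v / v0) ** delta - (s_star / gap) ** 2)

# Standing start on an empty road: close to the maximum acceleration.
free_start = idm_acceleration(v=0.0, gap=1e9, dv=0.0)
# Cruising at the desired speed on an empty road: close to zero acceleration.
free_cruise = idm_acceleration(v=30.0, gap=1e9, dv=0.0)
# Closing in on a slower leader at short range: strong braking.
closing_in = idm_acceleration(v=20.0, gap=10.0, dv=5.0)
```

A lane-change assessor could roll such a model forward to predict how surrounding vehicles react to a candidate maneuver. How the paper couples it with the recurrent network is not described in the abstract.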
Phase-only Image Based Kernel Estimation for Single-image Blind Deblurring
Title | Phase-only Image Based Kernel Estimation for Single-image Blind Deblurring |
Authors | Liyuan Pan, Richard Hartley, Miaomiao Liu, Yuchao Dai |
Abstract | The image blurring process is generally modelled as the convolution of a blur kernel with a latent image. Therefore, the estimation of the blur kernel is essential for blind image deblurring. Unlike existing approaches, which enforce various priors on the blur kernel and the latent image, we aim to obtain a high-quality blur kernel directly by studying the problem in the frequency domain. We show that the auto-correlation of the absolute phase-only image can provide faithful information about the motion that caused the blur (e.g., its direction and magnitude, which we call the motion pattern in this paper), leading to a new and efficient blur kernel estimation approach. The blur kernel is then refined and the sharp image is estimated by solving an optimization problem that enforces a regularization on the blur kernel and the latent image. We further extend our approach to handle non-uniform blur, which involves spatially varying blur kernels. Our approach is evaluated extensively on synthetic and real data and shows good results compared to the state-of-the-art deblurring approaches. |
Tasks | Blind Image Deblurring, Deblurring, Single-Image Blind Deblurring |
Published | 2018-11-26 |
URL | http://arxiv.org/abs/1811.10185v3 |
http://arxiv.org/pdf/1811.10185v3.pdf | |
PWC | https://paperswithcode.com/paper/phase-only-image-based-kernel-estimation-for |
Repo | |
Framework | |