Paper Group ANR 79
Can Deep Neural Networks Match the Related Objects?: A Survey on ImageNet-trained Classification Models
Title | Can Deep Neural Networks Match the Related Objects?: A Survey on ImageNet-trained Classification Models |
Authors | Han S. Lee, Heechul Jung, Alex A. Agarwal, Junmo Kim |
Abstract | Deep neural networks (DNNs) have shown state-of-the-art performance in a wide range of complicated tasks. In recent years, studies have been actively conducted to analyze the black-box characteristics of DNNs and to grasp their learning behaviours, tendencies, and limitations. In this paper, we investigate a limitation of DNNs in the image classification task and verify it with a method inspired by cognitive psychology. By analyzing failure cases of the ImageNet classification task, we hypothesize that DNNs do not sufficiently learn to associate related classes of objects. To verify how DNNs understand the relatedness between object classes, we conducted experiments on an image database from cognitive psychology. We applied ImageNet-trained DNNs to the database, which consists of pairs of related and unrelated object images, to compare feature similarities and determine whether the pairs match each other. In the experiments, we observed that the DNNs show limited performance in determining relatedness between object classes. In addition, the DNNs perform somewhat better at discovering relatedness based on similarity, but worse at discovering relatedness based on association. These experiments provide a novel analysis of the learning behaviour of DNNs and point to a limitation that needs to be overcome. |
Tasks | Image Classification |
Published | 2017-09-12 |
URL | http://arxiv.org/abs/1709.03806v1 |
PDF | http://arxiv.org/pdf/1709.03806v1.pdf |
PWC | https://paperswithcode.com/paper/can-deep-neural-networks-match-the-related |
Repo | |
Framework | |
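The pair-matching step described in the abstract above (comparing feature similarities of two images and deciding whether they match) can be sketched as follows. This is a minimal illustration, not the authors' procedure: the abstract does not name the similarity measure or the decision rule, so cosine similarity, the `threshold` value, and the function names are all assumptions, and the feature vectors stand in for whatever an ImageNet-trained network would produce.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors (assumed measure)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def pairs_match(feat_a, feat_b, threshold=0.5):
    """Declare a pair 'related' when similarity reaches a hypothetical threshold."""
    return cosine_similarity(feat_a, feat_b) >= threshold
```

In practice the threshold would have to be chosen on held-out pairs; the value here is only a placeholder.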
Identifiability of Kronecker-structured Dictionaries for Tensor Data
Title | Identifiability of Kronecker-structured Dictionaries for Tensor Data |
Authors | Zahra Shakeri, Anand D. Sarwate, Waheed U. Bajwa |
Abstract | This paper derives sufficient conditions for local recovery of coordinate dictionaries comprising a Kronecker-structured dictionary that is used for representing $K$th-order tensor data. Tensor observations are assumed to be generated from a Kronecker-structured dictionary multiplied by sparse coefficient tensors that follow the separable sparsity model. This work provides sufficient conditions on the underlying coordinate dictionaries, coefficient and noise distributions, and number of samples that guarantee recovery of the individual coordinate dictionaries up to a specified error, as a local minimum of the objective function, with high probability. In particular, the sample complexity to recover $K$ coordinate dictionaries with dimensions $m_k \times p_k$ up to estimation error $\varepsilon_k$ is shown to be $\max_{k \in [K]}\mathcal{O}(m_kp_k^3\varepsilon_k^{-2})$. |
Tasks | |
Published | 2017-12-10 |
URL | http://arxiv.org/abs/1712.03471v3 |
PDF | http://arxiv.org/pdf/1712.03471v3.pdf |
PWC | https://paperswithcode.com/paper/identifiability-of-kronecker-structured |
Repo | |
Framework | |
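To make the sample-complexity expression in the abstract above concrete, the sketch below evaluates the scaling term $m_k p_k^3 \varepsilon_k^{-2}$ for each coordinate dictionary and takes the maximum over $k$. It ignores the constants hidden in the $\mathcal{O}(\cdot)$ notation, so the result indicates growth only, not an actual required number of samples. The function name and argument layout are assumptions made for illustration.

```python
def sample_complexity_order(dims, eps):
    """Return max_k m_k * p_k**3 / eps_k**2, ignoring big-O constants.

    dims: list of (m_k, p_k) pairs, one per coordinate dictionary.
    eps:  list of target estimation errors eps_k, same length as dims.
    """
    return max(m * p**3 / e**2 for (m, p), e in zip(dims, eps))

# Two coordinate dictionaries of sizes 2x3 and 4x2 with different error targets.
order = sample_complexity_order([(2, 3), (4, 2)], [0.1, 0.5])
```

The dictionary with the largest $m_k p_k^3 \varepsilon_k^{-2}$ dominates, so tightening the error target of a single coordinate dictionary can determine the overall requirement.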
Framework for evaluation of sound event detection in web videos
Title | Framework for evaluation of sound event detection in web videos |
Authors | Rohan Badlani, Ankit Shah, Benjamin Elizalde, Anurag Kumar, Bhiksha Raj |
Abstract | The largest source of sound events is web videos. Most videos lack sound event labels at the segment level; however, a significant number of them do respond to text queries, through matches that search engines find using metadata. In this paper we explore the extent to which a search query can be used as the true label for detection of sound events in videos. We present a framework for large-scale sound event recognition on web videos. The framework crawls videos using search queries corresponding to 78 sound event labels drawn from three datasets. The datasets are used to train three classifiers, and we obtain predictions on 3.7 million web video segments. We evaluate performance using the search query as the true label and compare it with human labeling. Both types of ground truth exhibited close performance, to within 10%, and a similar performance trend with an increasing number of evaluated segments. Hence, our experiments show potential for using the search query as a preliminary true label for sound event recognition in web videos. |
Tasks | Sound Event Detection |
Published | 2017-11-02 |
URL | http://arxiv.org/abs/1711.00804v2 |
PDF | http://arxiv.org/pdf/1711.00804v2.pdf |
PWC | https://paperswithcode.com/paper/framework-for-evaluation-of-sound-event |
Repo | |
Framework | |
A Bayesian algorithm for distributed network localization using distance and direction data
Title | A Bayesian algorithm for distributed network localization using distance and direction data |
Authors | Hassan Naseri, Visa Koivunen |
Abstract | A reliable, accurate, and affordable positioning service is highly required in wireless networks. In this paper, the novel Message Passing Hybrid Localization (MPHL) algorithm is proposed to solve the problem of cooperative distributed localization using distance and direction estimates. This hybrid approach combines two sensing modalities to reduce the uncertainty in localizing the network nodes. A statistical model is formulated for the problem, and approximate minimum mean square error (MMSE) estimates of the node locations are computed. The proposed MPHL is a distributed algorithm based on belief propagation (BP) and Markov chain Monte Carlo (MCMC) sampling. It improves the identifiability of the localization problem and reduces its sensitivity to the anchor node geometry, compared to distance-only or direction-only localization techniques. For example, the unknown location of a node can be found if it has only a single neighbor; and a whole network can be localized using only a single anchor node. Numerical results are presented showing that the average localization error is significantly reduced in almost every simulation scenario, about 50% in most cases, compared to the competing algorithms. |
Tasks | |
Published | 2017-04-06 |
URL | http://arxiv.org/abs/1704.01918v2 |
PDF | http://arxiv.org/pdf/1704.01918v2.pdf |
PWC | https://paperswithcode.com/paper/a-bayesian-algorithm-for-distributed-network |
Repo | |
Framework | |
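The claim in the abstract above that a node can be located from a single neighbor follows from simple geometry once both distance and direction are available. The sketch below shows only that noise-free geometric step in 2D; it is not the MPHL algorithm, which works with uncertain estimates via belief propagation and MCMC sampling. The function name and the bearing convention (angle of the node as seen from the neighbor, counterclockwise from the x-axis, in radians) are assumptions.

```python
import math

def locate_from_neighbor(neighbor_xy, distance, bearing_rad):
    """Place a node given one neighbor's position, a distance, and a direction.

    Assumes exact measurements and a bearing measured at the neighbor,
    counterclockwise from the positive x-axis.
    """
    x, y = neighbor_xy
    return (x + distance * math.cos(bearing_rad),
            y + distance * math.sin(bearing_rad))
```

With distance alone the node could lie anywhere on a circle around the neighbor, and with direction alone anywhere on a ray; combining the two removes the ambiguity.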
Mining Worse and Better Opinions. Unsupervised and Agnostic Aggregation of Online Reviews
Title | Mining Worse and Better Opinions. Unsupervised and Agnostic Aggregation of Online Reviews |
Authors | Michela Fazzolari, Marinella Petrocchi, Alessandro Tommasi, Cesare Zavattari |
Abstract | In this paper, we propose a novel approach for aggregating online reviews, according to the opinions they express. Our methodology is unsupervised - due to the fact that it does not rely on pre-labeled reviews - and it is agnostic - since it does not make any assumption about the domain or the language of the review content. We measure the adherence of a review content to the domain terminology extracted from a review set. First, we demonstrate the informativeness of the adherence metric with respect to the score associated with a review. Then, we exploit the metric values to group reviews, according to the opinions they express. Our experimental campaign has been carried out on two large datasets collected from Booking and Amazon, respectively. |
Tasks | |
Published | 2017-04-18 |
URL | http://arxiv.org/abs/1704.05393v1 |
PDF | http://arxiv.org/pdf/1704.05393v1.pdf |
PWC | https://paperswithcode.com/paper/mining-worse-and-better-opinions-unsupervised |
Repo | |
Framework | |
InverseNet: Solving Inverse Problems with Splitting Networks
Title | InverseNet: Solving Inverse Problems with Splitting Networks |
Authors | Kai Fan, Qi Wei, Wenlin Wang, Amit Chakraborty, Katherine Heller |
Abstract | We propose a new method that uses deep learning techniques to solve inverse problems. The inverse problem is cast as learning an end-to-end mapping from observed data to the ground truth. Inspired by the splitting strategy widely used in regularized iterative algorithms for inverse problems, the mapping is decomposed into two networks: one handles the inversion of the physical forward model associated with the data term, and the other handles the denoising of the output of the former network, i.e., the inverted version, associated with the prior/regularization term. The two networks are trained jointly to learn the end-to-end mapping, which avoids two-step training. The training acts as annealing, since the intermediate variable between the two networks bridges the gap between the input (the degraded version of the output) and the output, and progressively approaches the ground truth. The proposed network, referred to as InverseNet, is flexible in the sense that most existing end-to-end network structures can be used in the first network and most existing denoising network structures can be used in the second. Extensive experiments on both synthetic and real datasets, on the tasks of motion deblurring, super-resolution, and colorization, demonstrate the efficiency and accuracy of the proposed method compared with other image processing algorithms. |
Tasks | Colorization, Deblurring, Denoising, Super-Resolution |
Published | 2017-12-01 |
URL | http://arxiv.org/abs/1712.00202v1 |
PDF | http://arxiv.org/pdf/1712.00202v1.pdf |
PWC | https://paperswithcode.com/paper/inversenet-solving-inverse-problems-with |
Repo | |
Framework | |
Effective Multi-Query Expansions: Collaborative Deep Networks for Robust Landmark Retrieval
Title | Effective Multi-Query Expansions: Collaborative Deep Networks for Robust Landmark Retrieval |
Authors | Yang Wang, Xuemin Lin, Lin Wu, Wenjie Zhang |
Abstract | Given a query photo issued by a user (q-user), the landmark retrieval is to return a set of photos with their landmarks similar to those of the query, while the existing studies on the landmark retrieval focus on exploiting geometries of landmarks for similarity matches between candidate photos and a query photo. We observe that the same landmarks provided by different users over social media community may convey different geometry information depending on the viewpoints and/or angles, and may subsequently yield very different results. In fact, dealing with the landmarks with ill-shapes caused by the photography of q-users is often nontrivial and has seldom been studied. In this paper we propose a novel framework, namely multi-query expansions, to retrieve semantically robust landmarks by two steps. Firstly, we identify the top-$k$ photos regarding the latent topics of a query landmark to construct multi-query set so as to remedy its possible ill-shape. For this purpose, we significantly extend the techniques of Latent Dirichlet Allocation. Then, motivated by the typical collaborative filtering methods, we propose to learn a collaborative deep networks based semantically, nonlinear and high-level features over the latent factor for landmark photo as the training set, which is formed by matrix factorization over collaborative user-photo matrix regarding the multi-query set. The learned deep network is further applied to generate the features for all the other photos, meanwhile resulting into a compact multi-query set within such space. Extensive experiments are conducted on real-world social media data with both landmark photos together with their user information to show the superior performance over the existing methods. |
Tasks | |
Published | 2017-01-18 |
URL | http://arxiv.org/abs/1701.05003v1 |
PDF | http://arxiv.org/pdf/1701.05003v1.pdf |
PWC | https://paperswithcode.com/paper/effective-multi-query-expansions |
Repo | |
Framework | |
SMARTies: Sentiment Models for Arabic Target Entities
Title | SMARTies: Sentiment Models for Arabic Target Entities |
Authors | Noura Farra, Kathleen McKeown |
Abstract | We consider entity-level sentiment analysis in Arabic, a morphologically rich language with increasing resources. We present a system that is applied to complex posts written in response to Arabic newspaper articles. Our goal is to identify important entity “targets” within the post along with the polarity expressed about each target. We achieve significant improvements over multiple baselines, demonstrating that the use of specific morphological representations improves the performance of identifying both important targets and their sentiment, and that the use of distributional semantic clusters further boosts performances for these representations, especially when richer linguistic resources are not available. |
Tasks | Sentiment Analysis |
Published | 2017-01-12 |
URL | http://arxiv.org/abs/1701.03434v1 |
PDF | http://arxiv.org/pdf/1701.03434v1.pdf |
PWC | https://paperswithcode.com/paper/smarties-sentiment-models-for-arabic-target |
Repo | |
Framework | |
Accelerating Dependency Graph Learning from Heterogeneous Categorical Event Streams via Knowledge Transfer
Title | Accelerating Dependency Graph Learning from Heterogeneous Categorical Event Streams via Knowledge Transfer |
Authors | Chen Luo, Zhengzhang Chen, Lu-An Tang, Anshumali Shrivastava, Zhichun Li |
Abstract | A dependency graph, a heterogeneous graph representing the intrinsic relationships between different pairs of system entities, is essential to many data analysis applications, such as root cause diagnosis and intrusion detection. Given a well-trained dependency graph from a source domain and an immature dependency graph from a target domain, how can we extract the entity and dependency knowledge from the source to enhance the target? One way is to directly apply a mature dependency graph learned from a source domain to the target domain. However, due to the domain variety problem, directly using the source dependency graph often cannot achieve good performance. Traditional transfer learning methods mainly focus on numerical data and are not applicable. In this paper, we propose ACRET, a knowledge-transfer-based model for accelerating dependency graph learning from heterogeneous categorical event streams. In particular, we first propose an entity estimation model to filter out irrelevant entities from the source domain based on entity embedding and manifold learning. Only the entities with statistically high correlations are transferred to the target domain. On the surviving entities, we propose a dependency construction model for constructing the unbiased dependency relationships by solving a two-constraint optimization problem. The experimental results on synthetic and real-world datasets demonstrate the effectiveness and efficiency of ACRET. We also apply ACRET to a real enterprise security system for intrusion detection. Our method achieves superior detection performance, with a lead time of at least 20 days and more than 70% accuracy. |
Tasks | Intrusion Detection, Transfer Learning |
Published | 2017-08-25 |
URL | http://arxiv.org/abs/1708.07867v1 |
PDF | http://arxiv.org/pdf/1708.07867v1.pdf |
PWC | https://paperswithcode.com/paper/accelerating-dependency-graph-learning-from |
Repo | |
Framework | |
Pathological OCT Retinal Layer Segmentation using Branch Residual U-shape Networks
Title | Pathological OCT Retinal Layer Segmentation using Branch Residual U-shape Networks |
Authors | Stefanos Apostolopoulos, Sandro De Zanet, Carlos Ciller, Sebastian Wolf, Raphael Sznitman |
Abstract | The automatic segmentation of retinal layer structures enables clinically-relevant quantification and monitoring of eye disorders over time in OCT imaging. Eyes with late-stage diseases are particularly challenging to segment, as their shape is highly warped due to pathological biomarkers. In this context, we propose a novel fully Convolutional Neural Network (CNN) architecture which combines dilated residual blocks in an asymmetric U-shape configuration, and can segment multiple layers of highly pathological eyes in one shot. We validate our approach on a dataset of late-stage AMD patients and demonstrate lower computational costs and higher performance compared to other state-of-the-art methods. |
Tasks | |
Published | 2017-07-16 |
URL | http://arxiv.org/abs/1707.04931v1 |
PDF | http://arxiv.org/pdf/1707.04931v1.pdf |
PWC | https://paperswithcode.com/paper/pathological-oct-retinal-layer-segmentation |
Repo | |
Framework | |
OpenNMT: Open-source Toolkit for Neural Machine Translation
Title | OpenNMT: Open-source Toolkit for Neural Machine Translation |
Authors | Guillaume Klein, Yoon Kim, Yuntian Deng, Josep Crego, Jean Senellart, Alexander M. Rush |
Abstract | We introduce an open-source toolkit for neural machine translation (NMT) to support research into model architectures, feature representations, and source modalities, while maintaining competitive performance, modularity and reasonable training requirements. |
Tasks | Machine Translation |
Published | 2017-09-12 |
URL | http://arxiv.org/abs/1709.03815v1 |
PDF | http://arxiv.org/pdf/1709.03815v1.pdf |
PWC | https://paperswithcode.com/paper/opennmt-open-source-toolkit-for-neural-1 |
Repo | |
Framework | |
Neural-based Context Representation Learning for Dialog Act Classification
Title | Neural-based Context Representation Learning for Dialog Act Classification |
Authors | Daniel Ortega, Ngoc Thang Vu |
Abstract | We explore context representation learning methods in neural-based models for dialog act classification. We propose and compare extensively different methods which combine recurrent neural network architectures and attention mechanisms (AMs) at different context levels. Our experimental results on two benchmark datasets show consistent improvements compared to the models without contextual information and reveal that the most suitable AM in the architecture depends on the nature of the dataset. |
Tasks | Dialog Act Classification, Representation Learning |
Published | 2017-08-08 |
URL | http://arxiv.org/abs/1708.02561v1 |
PDF | http://arxiv.org/pdf/1708.02561v1.pdf |
PWC | https://paperswithcode.com/paper/neural-based-context-representation-learning |
Repo | |
Framework | |
Non-Adaptive Randomized Algorithm for Group Testing
Title | Non-Adaptive Randomized Algorithm for Group Testing |
Authors | Nader H. Bshouty, Nuha Diab, Shada R. Kawar, Robert J. Shahla |
Abstract | We study the problem of group testing with a non-adaptive randomized algorithm in the random incidence design (RID) model, where each entry in the test is chosen randomly and independently from $\{0,1\}$ with a fixed probability $p$. The property that is sufficient and necessary for unique decoding is the separability of the tests, but unfortunately no linear-time algorithm is known for such tests. In order to achieve linear-time decodable tests, the algorithms in the literature use the disjunction property, which gives an almost optimal number of tests. We define a new property for the tests, which we call the semi-disjunction property. We show that there is a linear-time decoding for such tests and that, for $d\to \infty$, the number of tests converges to the number of tests with the separability property and is therefore optimal (in the RID model). Our analysis shows that, in the RID model, the number of tests in our algorithm is better than the one with the disjunction property even for small $d$. |
Tasks | |
Published | 2017-08-09 |
URL | http://arxiv.org/abs/1708.02787v1 |
PDF | http://arxiv.org/pdf/1708.02787v1.pdf |
PWC | https://paperswithcode.com/paper/non-adaptive-randomized-algorithm-for-group |
Repo | |
Framework | |
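The random incidence design from the abstract above can be illustrated with a small simulation. The sketch draws each entry of the test matrix independently, runs the tests, and decodes with the standard rule that clears any item appearing in a negative test (often called COMP). That decoder is swapped in for illustration and is not the paper's semi-disjunction decoder, which the abstract does not specify. The function names are assumptions, as is the reading that an entry equals 1 with probability `p`.

```python
import numpy as np

def rid_design(num_tests, num_items, p, rng):
    """Boolean test matrix; entry (t, i) is True when item i is in test t."""
    return rng.random((num_tests, num_items)) < p

def run_tests(design, defective):
    """A test is positive when it contains at least one defective item."""
    return (design & defective).any(axis=1)

def comp_decode(design, outcomes):
    """Clear every item that appears in some negative test; keep the rest."""
    cleared = design[~outcomes].any(axis=0)
    return ~cleared

# Hypothetical instance: 20 items, 2 defectives, 40 random tests.
rng = np.random.default_rng(0)
design = rid_design(40, 20, 0.2, rng)
defective = np.zeros(20, dtype=bool)
defective[[3, 11]] = True
decoded = comp_decode(design, run_tests(design, defective))
```

This rule never clears a true defective, because a defective item makes every test it joins positive; it may still leave some non-defective items uncleared when too few tests are used.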
End-to-end representation learning for Correlation Filter based tracking
Title | End-to-end representation learning for Correlation Filter based tracking |
Authors | Jack Valmadre, Luca Bertinetto, João F. Henriques, Andrea Vedaldi, Philip H. S. Torr |
Abstract | The Correlation Filter is an algorithm that trains a linear template to discriminate between images and their translations. It is well suited to object tracking because its formulation in the Fourier domain provides a fast solution, enabling the detector to be re-trained once per frame. Previous works that use the Correlation Filter, however, have adopted features that were either manually designed or trained for a different task. This work is the first to overcome this limitation by interpreting the Correlation Filter learner, which has a closed-form solution, as a differentiable layer in a deep neural network. This enables learning deep features that are tightly coupled to the Correlation Filter. Experiments illustrate that our method has the important practical benefit of allowing lightweight architectures to achieve state-of-the-art performance at high framerates. |
Tasks | Object Tracking, Representation Learning |
Published | 2017-04-20 |
URL | http://arxiv.org/abs/1704.06036v1 |
PDF | http://arxiv.org/pdf/1704.06036v1.pdf |
PWC | https://paperswithcode.com/paper/end-to-end-representation-learning-for |
Repo | |
Framework | |
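The closed-form Fourier-domain solution mentioned in the abstract above can be sketched for the simplest case: a single-channel, one-dimensional linear correlation filter trained by ridge regression over all circular shifts of one signal. This is a generic textbook-style illustration, not the paper's differentiable layer or its network; the function names, the regularizer `lam`, and the 1D setting are assumptions.

```python
import numpy as np

def train_filter(x, y, lam=1e-2):
    """Closed-form ridge solution over all circular shifts of x.

    Returns the filter in the Fourier domain, chosen so that
    respond(filter, x) approximates the desired response y.
    """
    X = np.fft.fft(x)
    Y = np.fft.fft(y)
    return np.conj(X) * Y / (np.abs(X) ** 2 + lam)

def respond(filt_hat, z):
    """Response of the learned filter on a new signal z."""
    return np.real(np.fft.ifft(filt_hat * np.fft.fft(z)))

# Train on a signal with the desired peak at index 0, then shift the signal.
rng = np.random.default_rng(1)
x = rng.standard_normal(32)
y = np.zeros(32)
y[0] = 1.0
filt_hat = train_filter(x, y, lam=1e-3)
peak = int(np.argmax(respond(filt_hat, np.roll(x, 5))))
```

Because training reduces to element-wise operations on FFTs, refitting the filter costs only a few transforms, which is why it can be re-trained once per frame; the response peak moves with the circular shift of the input.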
Direct Load Control of Thermostatically Controlled Loads Based on Sparse Observations Using Deep Reinforcement Learning
Title | Direct Load Control of Thermostatically Controlled Loads Based on Sparse Observations Using Deep Reinforcement Learning |
Authors | Frederik Ruelens, Bert J. Claessens, Peter Vrancx, Fred Spiessens, Geert Deconinck |
Abstract | This paper considers a demand response agent that must find a near-optimal sequence of decisions based on sparse observations of its environment. Extracting a relevant set of features from these observations is a challenging task and may require substantial domain knowledge. One way to tackle this problem is to store sequences of past observations and actions in the state vector, making it high dimensional, and apply techniques from deep learning. This paper investigates the capabilities of different deep learning techniques, such as convolutional neural networks and recurrent neural networks, to extract relevant features for finding near-optimal policies for a residential heating system and electric water heater that are hindered by sparse observations. Our simulation results indicate that in this specific scenario, feeding sequences of time-series to an LSTM network, which is a specific type of recurrent neural network, achieved a higher performance than stacking these time-series in the input of a convolutional neural network or deep neural network. |
Tasks | Time Series |
Published | 2017-07-26 |
URL | http://arxiv.org/abs/1707.08553v1 |
PDF | http://arxiv.org/pdf/1707.08553v1.pdf |
PWC | https://paperswithcode.com/paper/direct-load-control-of-thermostatically |
Repo | |
Framework | |