Paper Group ANR 286
Examining CNN Representations with respect to Dataset Bias. Crop Planning using Stochastic Visual Optimization. Improving Sparsity in Kernel Adaptive Filters Using a Unit-Norm Dictionary. Collaborative Summarization of Topic-Related Videos. Surfacing contextual hate speech words within social media. Face-to-BMI: Using Computer Vision to Infer Body …
Examining CNN Representations with respect to Dataset Bias
Title | Examining CNN Representations with respect to Dataset Bias |
Authors | Quanshi Zhang, Wenguan Wang, Song-Chun Zhu |
Abstract | Given a pre-trained CNN without any testing samples, this paper proposes a simple yet effective method to diagnose feature representations of the CNN. We aim to discover representation flaws caused by potential dataset bias. More specifically, when the CNN is trained to estimate image attributes, we mine latent relationships between representations of different attributes inside the CNN. Then, we compare the mined attribute relationships with ground-truth attribute relationships to discover the CNN’s blind spots and failure modes due to dataset bias. In fact, representation flaws caused by dataset bias cannot be examined by conventional evaluation strategies based on testing images, because testing images may also have a similar bias. Experiments have demonstrated the effectiveness of our method. |
Tasks | |
Published | 2017-10-29 |
URL | http://arxiv.org/abs/1710.10577v2 |
http://arxiv.org/pdf/1710.10577v2.pdf | |
PWC | https://paperswithcode.com/paper/examining-cnn-representations-with-respect-to |
Repo | |
Framework | |
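To make the diagnostic idea above concrete, a minimal sketch follows: mine the attribute relationships implied by the CNN's own scores on unlabeled images, mine the ground-truth relationships from annotated data, and flag attribute pairs where the two disagree. The correlation-based comparison and the function name are illustrative assumptions, not the authors' exact mining procedure.

```python
import numpy as np

def flag_suspect_attribute_pairs(pred_scores, gt_labels, names, gap=0.5):
    """Compare attribute relationships mined from CNN predictions with
    ground-truth relationships and flag pairs that disagree strongly.

    pred_scores : (N, A) array of CNN attribute scores on unlabeled images.
    gt_labels   : (M, A) array of ground-truth attribute annotations used
                  as the reference relationships.
    """
    mined = np.corrcoef(pred_scores, rowvar=False)   # relationships inside the CNN
    truth = np.corrcoef(gt_labels, rowvar=False)     # ground-truth relationships
    suspects = []
    A = len(names)
    for i in range(A):
        for j in range(i + 1, A):
            if abs(mined[i, j] - truth[i, j]) > gap:  # large disagreement -> possible bias
                suspects.append((names[i], names[j], mined[i, j], truth[i, j]))
    return suspects

# Toy usage: three attributes, random stand-ins for real scores/labels.
rng = np.random.default_rng(0)
pairs = flag_suspect_attribute_pairs(rng.normal(size=(200, 3)),
                                     rng.integers(0, 2, size=(500, 3)),
                                     ["smiling", "lipstick", "male"])
```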
Crop Planning using Stochastic Visual Optimization
Title | Crop Planning using Stochastic Visual Optimization |
Authors | Gunjan Sehgal, Bindu Gupta, Kaushal Paneri, Karamjit Singh, Geetika Sharma, Gautam Shroff |
Abstract | As the world population increases and arable land decreases, it becomes vital to improve the productivity of the agricultural land available. Given the weather and soil properties, farmers need to take critical decisions such as which seed variety to plant and in what proportion in order to maximize productivity. These decisions are irreversible, and any unusual behavior of external factors, such as weather, can have a catastrophic impact on crop productivity. A variety which is highly desirable to a farmer might be unavailable or in short supply; therefore, it is critical to evaluate which variety or varieties are more likely to be chosen by farmers from a growing region in order to meet demand. In this paper, we present our visual analytics tool, ViSeed, showcased on the data given in the Syngenta 2016 crop data challenge. This tool helps predict the optimal soybean seed variety, or mix of varieties in appropriate proportions, that is most likely to be chosen by farmers from a growing region. It also allows users to analyse the solutions generated by our approach and aids the decision-making process by providing insightful visualizations. |
Tasks | Decision Making |
Published | 2017-10-25 |
URL | http://arxiv.org/abs/1710.09077v1 |
http://arxiv.org/pdf/1710.09077v1.pdf | |
PWC | https://paperswithcode.com/paper/crop-planning-using-stochastic-visual |
Repo | |
Framework | |
Improving Sparsity in Kernel Adaptive Filters Using a Unit-Norm Dictionary
Title | Improving Sparsity in Kernel Adaptive Filters Using a Unit-Norm Dictionary |
Authors | Felipe Tobar |
Abstract | Kernel adaptive filters, a class of adaptive nonlinear time-series models, are known for their ability to learn expressive autoregressive patterns from sequential data. However, for trivial monotonic signals, they struggle to produce accurate predictions while keeping computational complexity within desired bounds. This is because new observations are incorporated into the dictionary when they are far from what the algorithm has seen in the past. We propose a novel approach to kernel adaptive filtering that compares new observations against dictionary samples in terms of their unit-norm (normalised) versions, meaning that new observations that look like previous samples but have a different magnitude are not added to the dictionary. We achieve this by proposing the unit-norm Gaussian kernel and defining a sparsification criterion for this novel kernel. The methodology is validated on two real-world datasets against standard kernel adaptive filters (KAF) in terms of the normalised mean square error and the dictionary size. |
Tasks | Time Series |
Published | 2017-07-13 |
URL | http://arxiv.org/abs/1707.04236v1 |
http://arxiv.org/pdf/1707.04236v1.pdf | |
PWC | https://paperswithcode.com/paper/improving-sparsity-in-kernel-adaptive-filters |
Repo | |
Framework | |
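A minimal sketch of the sparsification idea described above: evaluate a Gaussian kernel on the unit-norm (normalised) versions of the inputs, and admit a new sample into the dictionary only if its normalised similarity to every existing entry stays below a threshold. The admission threshold and function names are illustrative assumptions rather than the paper's exact criterion.

```python
import numpy as np

def unit_norm_gaussian_kernel(x, y, sigma=1.0):
    """Gaussian kernel evaluated on the unit-norm versions of x and y,
    so samples that differ only in magnitude look identical."""
    xn = x / (np.linalg.norm(x) + 1e-12)
    yn = y / (np.linalg.norm(y) + 1e-12)
    return np.exp(-np.sum((xn - yn) ** 2) / (2.0 * sigma ** 2))

def maybe_add_to_dictionary(dictionary, x_new, threshold=0.95, sigma=1.0):
    """Admit x_new only if it is not too similar (in the unit-norm sense)
    to any existing dictionary sample."""
    for d in dictionary:
        if unit_norm_gaussian_kernel(x_new, d, sigma) > threshold:
            return dictionary                  # looks like a rescaled old sample: skip
    return dictionary + [x_new]                # genuinely new direction: keep it

# Toy usage: the second sample is a rescaled copy of the first and is rejected.
D = maybe_add_to_dictionary([np.array([1.0, 2.0])], np.array([2.0, 4.0]))
assert len(D) == 1
```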
Collaborative Summarization of Topic-Related Videos
Title | Collaborative Summarization of Topic-Related Videos |
Authors | Rameswar Panda, Amit K. Roy-Chowdhury |
Abstract | Large collections of videos are grouped into clusters by a topic keyword, such as Eiffel Tower or Surfing, with many important visual concepts repeating across them. Such a topically close set of videos has mutual influence among its members, which can be exploited to summarize one of them using information from the others in the set. We build on this intuition to develop a novel approach that extracts a summary simultaneously capturing the important particularities of the given video and the generalities identified across the set of videos. The topic-related videos provide visual context to identify the important parts of the video being summarized. We achieve this by developing a collaborative sparse optimization method which can be efficiently solved by a half-quadratic minimization algorithm. Our work builds upon the idea of collaborative techniques from information retrieval and natural language processing, which typically use the attributes of other similar objects to predict the attribute of a given object. Experiments on two challenging and diverse datasets demonstrate the efficacy of our approach over state-of-the-art methods. |
Tasks | Information Retrieval |
Published | 2017-06-09 |
URL | http://arxiv.org/abs/1706.03114v1 |
http://arxiv.org/pdf/1706.03114v1.pdf | |
PWC | https://paperswithcode.com/paper/collaborative-summarization-of-topic-related |
Repo | |
Framework | |
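The collaborative selection step described above can be sketched as a row-sparse reconstruction problem: frames of the target video must reconstruct both the video itself and the topic-related videos, and the row-sparsity penalty is minimised with a simple half-quadratic (iteratively reweighted) loop. The objective and solver below are a generic sketch under these assumptions, not the paper's exact formulation.

```python
import numpy as np

def select_representatives(X, Y, lam=1.0, iters=30, eps=1e-8):
    """Pick representative frames of the target video via a row-sparse
    reconstruction, solved with a simple half-quadratic (IRLS) loop.

    X : (d, n) features of the target video's frames (candidate representatives).
    Y : (d, n+m) features of the target video plus topic-related videos,
        so selected frames must also explain the related content.
    Returns per-frame importance scores (row norms of the coefficient matrix).
    """
    n = X.shape[1]
    w = np.ones(n)                                    # initial half-quadratic weights
    for _ in range(iters):
        Z = np.linalg.solve(X.T @ X + lam * np.diag(w), X.T @ Y)
        # Half-quadratic step: reweight each row by 1 / (2 * ||z_i||).
        w = 1.0 / (2.0 * np.linalg.norm(Z, axis=1) + eps)
    return np.linalg.norm(Z, axis=1)                  # high score -> frame kept in the summary

# Toy usage with random features standing in for CNN frame descriptors.
rng = np.random.default_rng(0)
scores = select_representatives(rng.normal(size=(128, 40)), rng.normal(size=(128, 100)))
top_frames = np.argsort(scores)[::-1][:5]
```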
Surfacing contextual hate speech words within social media
Title | Surfacing contextual hate speech words within social media |
Authors | Jherez Taylor, Melvyn Peignon, Yi-Shin Chen |
Abstract | Social media platforms have recently seen an increase in the occurrence of hate speech discourse, which has led to calls for improved detection methods. Most of these rely on annotated data, keywords, and a classification technique. While this approach provides good coverage, it can fall short when dealing with new terms produced by online extremist communities, which act as original sources of words with alternate hate speech meanings. These code words (which can be both created and adopted words) are designed to evade automatic detection and often have benign meanings in regular discourse. As an example, “skypes”, “googles”, and “yahoos” are all instances of words that have an alternate meaning usable for hate speech. This overlap introduces additional challenges when relying on keywords both for collecting data that is specific to hate speech and for downstream classification. In this work, we develop a community detection approach for finding extremist hate speech communities and collecting data from their members. We also develop a word embedding model that learns the alternate hate speech meaning of words, and demonstrate the candidacy of our code words with several annotation experiments designed to determine whether it is possible to recognize a word as being used for hate speech without knowing its alternate meaning. We report inter-annotator agreement rates of K=0.871 for data drawn from our extremist community and K=0.676 for the keyword approach, supporting our claim that hate speech detection is a contextual task and does not depend on a fixed list of keywords. Our goal is to advance the domain by providing a high-quality hate speech dataset, in addition to learned code words that can be fed into existing classification approaches, thus improving the accuracy of automated detection. |
Tasks | Community Detection, Hate Speech Detection |
Published | 2017-11-28 |
URL | http://arxiv.org/abs/1711.10093v1 |
http://arxiv.org/pdf/1711.10093v1.pdf | |
PWC | https://paperswithcode.com/paper/surfacing-contextual-hate-speech-words-within |
Repo | |
Framework | |
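One way to surface candidate code words in the spirit of the abstract above: train separate word embeddings on text from the suspected community and on a general-purpose baseline corpus, then score each word by how much its nearest-neighbour set shifts between the two. The sketch below assumes gensim >= 4 for the Word2Vec keyword names; the Jaccard-style shift score is an illustrative choice, not the paper's model.

```python
from gensim.models import Word2Vec

def neighbour_shift(word, community_sents, baseline_sents, topn=10):
    """Train one embedding per corpus and measure how much the word's
    nearest-neighbour set changes between them (0 = identical, 1 = disjoint)."""
    m_comm = Word2Vec(sentences=community_sents, vector_size=100, window=5,
                      min_count=1, epochs=20, seed=1)
    m_base = Word2Vec(sentences=baseline_sents, vector_size=100, window=5,
                      min_count=1, epochs=20, seed=1)
    nn_comm = {w for w, _ in m_comm.wv.most_similar(word, topn=topn)}
    nn_base = {w for w, _ in m_base.wv.most_similar(word, topn=topn)}
    overlap = len(nn_comm & nn_base) / len(nn_comm | nn_base)
    return 1.0 - overlap   # high value -> likely code word with an alternate meaning

# Toy usage: tokenised sentences from the extremist community and a baseline corpus.
community = [["ban", "all", "skypes"], ["skypes", "are", "ruining", "everything"]]
baseline = [["call", "me", "on", "skypes"], ["skypes", "video", "chat", "works"]]
score = neighbour_shift("skypes", community, baseline, topn=3)
```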
Face-to-BMI: Using Computer Vision to Infer Body Mass Index on Social Media
Title | Face-to-BMI: Using Computer Vision to Infer Body Mass Index on Social Media |
Authors | Enes Kocabey, Mustafa Camurcu, Ferda Ofli, Yusuf Aytar, Javier Marin, Antonio Torralba, Ingmar Weber |
Abstract | A person’s weight status can have profound implications on their life, ranging from mental health, to longevity, to financial income. At the societal level, “fat shaming” and other forms of “sizeism” are a growing concern, while increasing obesity rates are linked to ever-rising healthcare costs. For these reasons, researchers from a variety of backgrounds are interested in studying obesity from all angles. To obtain data, traditionally, a person would have to accurately self-report their body mass index (BMI) or see a doctor to have it measured. In this paper, we show how computer vision can be used to infer a person’s BMI from social media images. We hope that our tool, which we release, helps to advance the study of social aspects related to body weight. |
Tasks | |
Published | 2017-03-09 |
URL | http://arxiv.org/abs/1703.03156v1 |
http://arxiv.org/pdf/1703.03156v1.pdf | |
PWC | https://paperswithcode.com/paper/face-to-bmi-using-computer-vision-to-infer |
Repo | |
Framework | |
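The pipeline implied by the abstract above, pretrained deep face features feeding a regressor trained on (face, BMI) pairs, can be sketched as follows. The ResNet-18 backbone is a stand-in assumption (the paper works with deep face-recognition features), and a recent torchvision (>= 0.13) is assumed for the weights API.

```python
import torch
import torchvision.models as models
from sklearn.svm import SVR

# Pretrained backbone used as a fixed feature extractor (ResNet-18 is a stand-in;
# the paper relies on face-specific deep features).
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()              # drop the classification head
backbone.eval()

@torch.no_grad()
def face_features(face_batch):
    """face_batch: (N, 3, 224, 224) tensor of cropped, normalised face images."""
    return backbone(face_batch).numpy()

# Fit a support-vector regressor from face features to BMI values.
# X_faces / y_bmi below are random stand-ins for a real annotated dataset.
X_faces = torch.randn(32, 3, 224, 224)
y_bmi = torch.randint(18, 40, (32,)).float().numpy()
reg = SVR(kernel="rbf").fit(face_features(X_faces), y_bmi)
predicted_bmi = reg.predict(face_features(torch.randn(4, 3, 224, 224)))
```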
An Improved Naive Bayes Classifier-based Noise Detection Technique for Classifying User Phone Call Behavior
Title | An Improved Naive Bayes Classifier-based Noise Detection Technique for Classifying User Phone Call Behavior |
Authors | Iqbal H. Sarker, Muhammad Ashad Kabir, Alan Colman, Jun Han |
Abstract | The presence of noisy instances in mobile phone data is a fundamental issue for classifying user phone call behavior (i.e., accept, reject, missed, and outgoing), with many potential negative consequences. Classification accuracy may decrease and the complexity of the classifiers may increase due to the number of redundant training samples. To detect such noisy instances in a training dataset, researchers use the naive Bayes classifier (NBC), as it identifies misclassified instances by taking into account the independence assumption and the conditional probabilities of the attributes. However, some of these misclassified instances might reflect the usage behavioral patterns of individual mobile phone users. Existing naive Bayes classifier based noise detection techniques have not considered this issue and thus fall short in classification accuracy. In this paper, we propose an improved noise detection technique based on the naive Bayes classifier for effectively classifying users’ phone call behaviors. To improve classification accuracy, we identify noisy instances in the training dataset by analyzing the behavioral patterns of individuals. We dynamically determine a noise threshold according to an individual’s unique behavioral patterns using both the naive Bayes classifier and the Laplace estimator, and use this threshold to identify noisy instances. To measure the effectiveness of our technique in classifying user phone call behavior, we employ a popular classification algorithm (a decision tree). Experimental results on a real phone call log dataset show that our proposed technique identifies noisy instances from the training datasets more accurately, which leads to better classification accuracy. |
Tasks | |
Published | 2017-10-12 |
URL | http://arxiv.org/abs/1710.04461v2 |
http://arxiv.org/pdf/1710.04461v2.pdf | |
PWC | https://paperswithcode.com/paper/an-improved-naive-bayes-classifier-based |
Repo | |
Framework | |
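A minimal sketch of the noise-detection recipe above: fit a naive Bayes classifier with Laplace smoothing on a user's call log and flag training instances whose true class receives unusually low posterior probability, using a threshold derived from that user's class distribution. The specific threshold rule below is an illustrative assumption, not the authors' formula.

```python
import numpy as np
from sklearn.naive_bayes import CategoricalNB

def find_noisy_instances(X, y, margin=0.5):
    """X: (N, F) integer-encoded contextual features (e.g. time-of-day, location).
    y: (N,) phone-call behaviour labels (accept / reject / missed / outgoing).
    Returns indices of training instances flagged as noise."""
    nb = CategoricalNB(alpha=1.0)                  # alpha=1.0 is Laplace smoothing
    nb.fit(X, y)
    proba = nb.predict_proba(X)
    true_class_proba = proba[np.arange(len(y)), np.searchsorted(nb.classes_, y)]
    # Per-user dynamic threshold: a fraction of the majority-class frequency
    # (an assumption standing in for the paper's Laplace-estimator-based rule).
    threshold = margin * np.bincount(y).max() / len(y)
    return np.where(true_class_proba < threshold)[0]

# Toy usage with random categorical features and four behaviour classes.
rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(200, 4))
y = rng.integers(0, 4, size=200)
noisy_idx = find_noisy_instances(X, y)
```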
Generalization of Deep Neural Networks for Chest Pathology Classification in X-Rays Using Generative Adversarial Networks
Title | Generalization of Deep Neural Networks for Chest Pathology Classification in X-Rays Using Generative Adversarial Networks |
Authors | Hojjat Salehinejad, Shahrokh Valaee, Tim Dowdell, Errol Colak, Joseph Barfett |
Abstract | Medical datasets are often highly imbalanced, with over-representation of common medical problems and a paucity of data from rare conditions. We propose simulating pathology in images to overcome these limitations. Using chest X-rays as a model medical image, we implement a generative adversarial network (GAN) to create artificial images based upon a modest-sized labeled dataset. We employ a combination of real and artificial images to train a deep convolutional neural network (DCNN) to detect pathology across five classes of chest X-rays. Furthermore, we demonstrate that augmenting the original imbalanced dataset with GAN-generated images improves the performance of chest pathology classification using the proposed DCNN, in comparison to the same DCNN trained with the original dataset alone. This improved performance is largely attributed to balancing of the dataset using GAN-generated images, where image classes that are lacking in example images are preferentially augmented. |
Tasks | |
Published | 2017-11-08 |
URL | http://arxiv.org/abs/1712.01636v2 |
http://arxiv.org/pdf/1712.01636v2.pdf | |
PWC | https://paperswithcode.com/paper/generalization-of-deep-neural-networks-for |
Repo | |
Framework | |
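The dataset-balancing step described above, preferentially topping up under-represented classes with GAN-generated images before training the DCNN, can be sketched independently of the GAN itself. `generate_images` below is a hypothetical hook standing in for sampling from a trained class-conditional generator.

```python
import numpy as np

def balance_with_synthetic(images_by_class, generate_images):
    """images_by_class: dict mapping class name -> (N_c, H, W) array of real X-rays.
    generate_images(cls, k): hypothetical hook returning k GAN-generated images
    for class `cls`. Classes with few real images receive more synthetic ones."""
    target = max(len(imgs) for imgs in images_by_class.values())
    balanced = {}
    for cls, imgs in images_by_class.items():
        deficit = target - len(imgs)
        if deficit > 0:
            imgs = np.concatenate([imgs, generate_images(cls, deficit)], axis=0)
        balanced[cls] = imgs
    return balanced   # feed the balanced set to the DCNN trainer

# Toy usage: "nodule" is under-represented and gets topped up with synthetic samples.
fake = lambda cls, k: np.zeros((k, 64, 64))          # stand-in for a trained GAN
real = {"normal": np.zeros((100, 64, 64)), "nodule": np.zeros((20, 64, 64))}
balanced = balance_with_synthetic(real, fake)
assert len(balanced["nodule"]) == 100
```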
Automatic Image Cropping for Visual Aesthetic Enhancement Using Deep Neural Networks and Cascaded Regression
Title | Automatic Image Cropping for Visual Aesthetic Enhancement Using Deep Neural Networks and Cascaded Regression |
Authors | Guanjun Guo, Hanzi Wang, Chunhua Shen, Yan Yan, Hong-Yuan Mark Liao |
Abstract | Despite recent progress, computational visual aesthetics remains challenging. Image cropping, which refers to the removal of unwanted scene areas, is an important step in improving the aesthetic quality of an image. However, it is challenging to evaluate whether cropping leads to aesthetically pleasing results because the assessment is typically subjective. In this paper, we propose a novel cascaded cropping regression (CCR) method that performs image cropping by learning from professional photographers. The proposed CCR method improves the convergence speed of the cascaded method, which directly uses random-ferns regressors. In addition, a two-step learning strategy is proposed and used in the CCR method to address the problem of limited labelled cropping data. Specifically, a deep convolutional neural network (CNN) classifier is first trained on large-scale visual aesthetic datasets. The deep CNN model is then used to extract features from several image cropping datasets, upon which the cropping bounding boxes are predicted by the proposed CCR method. Experimental results on public image cropping datasets demonstrate that the proposed method significantly outperforms several state-of-the-art image cropping methods. |
Tasks | Image Cropping |
Published | 2017-12-25 |
URL | http://arxiv.org/abs/1712.09048v2 |
http://arxiv.org/pdf/1712.09048v2.pdf | |
PWC | https://paperswithcode.com/paper/automatic-image-cropping-for-visual-aesthetic |
Repo | |
Framework | |
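The cascade described above refines a crop box stage by stage: each stage regresses an update to the current box from features of the currently cropped region, so later stages correct the residual error of earlier ones. The sketch below substitutes gradient-boosted regressors for random ferns and uses a hypothetical `crop_features` extractor; both are assumptions for illustration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.multioutput import MultiOutputRegressor

def train_cascade(features_fn, images, gt_boxes, stages=3):
    """Each stage learns the offset from the current box to the ground-truth box;
    boxes are then updated and the next stage is trained on the refined boxes."""
    boxes = np.tile([0.0, 0.0, 1.0, 1.0], (len(images), 1))   # start from the full image
    cascade = []
    for _ in range(stages):
        X = np.array([features_fn(img, box) for img, box in zip(images, boxes)])
        reg = MultiOutputRegressor(GradientBoostingRegressor(n_estimators=50))
        reg.fit(X, gt_boxes - boxes)       # regress the residual box offset
        cascade.append(reg)
        boxes = boxes + reg.predict(X)     # refined boxes feed the next stage
    return cascade

# Toy usage: random vectors stand in for deep-CNN features of the cropped region.
rng = np.random.default_rng(0)
crop_features = lambda img, box: rng.normal(size=16)          # hypothetical extractor
imgs = [None] * 50
gt = rng.uniform(0.0, 1.0, size=(50, 4))
model = train_cascade(crop_features, imgs, gt)
```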
GazeGAN - Unpaired Adversarial Image Generation for Gaze Estimation
Title | GazeGAN - Unpaired Adversarial Image Generation for Gaze Estimation |
Authors | Matan Sela, Pingmei Xu, Junfeng He, Vidhya Navalpakkam, Dmitry Lagun |
Abstract | Recent research has demonstrated the ability to estimate gaze on mobile devices by performing inference on the image from the phone’s front-facing camera, without requiring specialized hardware. While this offers wide potential applications such as human-computer interaction, medical diagnosis, and accessibility (e.g., hands-free gaze as input for patients with motor disorders), current methods are limited as they rely on collecting data from real users, a tedious and expensive process that is hard to scale across devices. There have been some attempts to synthesize eye-region data using 3D models that can simulate various head poses and camera settings; however, these lack realism. In this paper, we improve upon a recently suggested method and propose a generative adversarial framework to generate a large dataset of high-resolution color images with high diversity (e.g., in subjects, head pose, camera settings) and realism, while simultaneously preserving the accuracy of gaze labels. The proposed approach operates on extended regions of the eye and even completes missing parts of the image. Using this rich synthesized dataset, and without using any additional training data from real users, we demonstrate improvements over the state of the art for estimating 2D gaze position on mobile devices. We further demonstrate cross-device generalization of model performance, as well as improved robustness to diverse head poses, blur, and distance. |
Tasks | Gaze Estimation, Image Generation, Medical Diagnosis |
Published | 2017-11-27 |
URL | http://arxiv.org/abs/1711.09767v1 |
http://arxiv.org/pdf/1711.09767v1.pdf | |
PWC | https://paperswithcode.com/paper/gazegan-unpaired-adversarial-image-generation |
Repo | |
Framework | |
An End-to-End Trainable Neural Network Model with Belief Tracking for Task-Oriented Dialog
Title | An End-to-End Trainable Neural Network Model with Belief Tracking for Task-Oriented Dialog |
Authors | Bing Liu, Ian Lane |
Abstract | We present a novel end-to-end trainable neural network model for task-oriented dialog systems. The model is able to track dialog state, issue API calls to a knowledge base (KB), and incorporate structured KB query results into system responses to successfully complete task-oriented dialogs. The proposed model produces well-structured system responses by jointly learning belief tracking and KB result processing, conditioning on the dialog history. We evaluate the model in a restaurant search domain using a dataset converted from the second Dialog State Tracking Challenge (DSTC2) corpus. Experimental results show that the proposed model can robustly track dialog state given the dialog history. Moreover, our model demonstrates promising results in producing appropriate system responses, outperforming prior end-to-end trainable neural network models on per-response accuracy evaluation metrics. |
Tasks | |
Published | 2017-08-20 |
URL | http://arxiv.org/abs/1708.05956v1 |
http://arxiv.org/pdf/1708.05956v1.pdf | |
PWC | https://paperswithcode.com/paper/an-end-to-end-trainable-neural-network-model |
Repo | |
Framework | |
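A rough outline of the architecture described above: a recurrent encoder over the dialog history, one belief head per informable slot, and a response head conditioned on the dialog state together with a summary of the KB query result. Layer sizes, slot structure, and the exact conditioning are assumptions, not the paper's specification.

```python
import torch
import torch.nn as nn

class BeliefTrackingDialogModel(nn.Module):
    """Minimal sketch: encode the dialog history with an LSTM, predict a belief
    distribution per slot, and condition response scoring on the dialog state
    plus a summary of the KB query result."""
    def __init__(self, vocab_size, slot_sizes, hidden=128, kb_feat=8):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.encoder = nn.LSTM(hidden, hidden, batch_first=True)
        # One belief head per informable slot (e.g. food, area, price range).
        self.belief_heads = nn.ModuleList([nn.Linear(hidden, n) for n in slot_sizes])
        self.response_head = nn.Linear(hidden + kb_feat, vocab_size)

    def forward(self, history_tokens, kb_summary):
        _, (h, _) = self.encoder(self.embed(history_tokens))
        state = h[-1]                                       # (B, hidden) dialog state
        beliefs = [head(state).softmax(-1) for head in self.belief_heads]
        scores = self.response_head(torch.cat([state, kb_summary], dim=-1))
        return beliefs, scores                              # slot beliefs + response scores

# Toy usage: batch of 2 dialog histories, 3 slots, stand-in KB summary features.
model = BeliefTrackingDialogModel(vocab_size=1000, slot_sizes=[10, 5, 4])
beliefs, scores = model(torch.randint(0, 1000, (2, 20)), torch.zeros(2, 8))
```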
Mining Non-Redundant Local Process Models From Sequence Databases
Title | Mining Non-Redundant Local Process Models From Sequence Databases |
Authors | Niek Tax, Marlon Dumas |
Abstract | Sequential pattern mining techniques extract patterns corresponding to frequent subsequences from a sequence database. A practical limitation of these techniques is that they overload the user with too many patterns. Local Process Model (LPM) mining is an alternative approach coming from the field of process mining. While in traditional sequential pattern mining a pattern describes one subsequence, an LPM captures a set of subsequences. Also, while traditional sequential patterns only match subsequences that are observed in the sequence database, an LPM may capture subsequences that are not explicitly observed but that are related to observed subsequences. In other words, LPMs generalize the behavior observed in the sequence database. These properties make it possible for a set of LPMs to cover the behavior of a much larger set of sequential patterns. Yet, existing LPM mining techniques still suffer from the pattern explosion problem because they produce sets of redundant LPMs. In this paper, we propose several heuristics to mine a set of non-redundant LPMs either from a set of redundant LPMs or from a set of sequential patterns. We empirically compare the proposed heuristics with each other and against existing (local) process mining techniques in terms of coverage, redundancy, and complexity of the produced sets of LPMs. |
Tasks | Sequential Pattern Mining |
Published | 2017-12-12 |
URL | http://arxiv.org/abs/1712.04159v2 |
http://arxiv.org/pdf/1712.04159v2.pdf | |
PWC | https://paperswithcode.com/paper/mining-non-redundant-local-process-models |
Repo | |
Framework | |
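The coverage/redundancy trade-off mentioned above can be illustrated with a greedy heuristic: keep a pattern only if enough of the sequences it covers are not already covered by the patterns selected so far. For brevity the sketch treats patterns as plain (gapped) subsequences, which is a simplification of LPMs; the overlap threshold is likewise an assumption.

```python
def covers(pattern, sequence):
    """True if `pattern` occurs as a (possibly gapped) subsequence of `sequence`."""
    it = iter(sequence)
    return all(sym in it for sym in pattern)

def greedy_non_redundant(patterns, database, max_overlap=0.5):
    """Greedy coverage heuristic: repeatedly keep the pattern that covers the most
    sequences, and drop patterns whose coverage mostly duplicates what the
    already-selected patterns explain."""
    coverage = {p: {i for i, s in enumerate(database) if covers(p, s)} for p in patterns}
    selected, covered = [], set()
    for p in sorted(patterns, key=lambda p: -len(coverage[p])):
        new = coverage[p] - covered
        if coverage[p] and len(new) / len(coverage[p]) >= 1.0 - max_overlap:
            selected.append(p)
            covered |= coverage[p]
    return selected

# Toy usage on an event-sequence database: ("a", "b", "c") is dropped as redundant.
db = [list("abcd"), list("abce"), list("xbyd"), list("xyz")]
kept = greedy_non_redundant([("a", "b"), ("a", "b", "c"), ("x", "y")], db)
```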
LocDyn: Robust Distributed Localization for Mobile Underwater Networks
Title | LocDyn: Robust Distributed Localization for Mobile Underwater Networks |
Authors | Cláudia Soares, João Gomes, Beatriz Ferreira, João Paulo Costeira |
Abstract | How can large teams of underwater nodes self-localize using only noisy range measurements? How can this be done in a distributed way that incorporates dynamics into the problem? How can outliers be rejected to produce trustworthy position estimates? The stringent acoustic communication channel and the accuracy needs of our geophysical survey application demand faster and more accurate localization methods. We approach dynamic localization as a MAP estimation problem where the prior encodes dynamics, and we devise a convex relaxation method that takes advantage of previous estimates at each measurement acquisition step; the algorithm converges at an optimal rate for first-order methods. LocDyn is distributed: there is no fusion center responsible for processing acquired data, and the same simple computations are performed for each node. LocDyn is accurate: experiments attest to a smaller positioning error than a comparable Kalman filter. LocDyn is robust: it rejects outlier noise, while competing methods break down in terms of positioning error. |
Tasks | |
Published | 2017-01-27 |
URL | http://arxiv.org/abs/1701.08027v1 |
http://arxiv.org/pdf/1701.08027v1.pdf | |
PWC | https://paperswithcode.com/paper/locdyn-robust-distributed-localization-for |
Repo | |
Framework | |
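A rough way to write down the MAP problem described above is the following; the exact likelihood, dynamics prior, and convex relaxation used by LocDyn are not spelled out in the abstract, so this formulation is only illustrative.

```latex
\min_{\{x_i\}} \;\sum_{(i,j)\in\mathcal{E}} \rho\!\bigl(\lVert x_i - x_j\rVert - d_{ij}\bigr)
\;+\; \mu \sum_{i} \lVert x_i - \hat{x}_i \rVert^{2}
```

Here x_i is the current position of node i, d_ij the noisy range measured on edge (i, j), and \hat{x}_i the position predicted from the previous estimate by a motion model (the dynamics prior). A robust penalty ρ (e.g. Huber) on the range residual is one way to obtain the outlier rejection mentioned in the abstract.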
Should You Derive, Or Let the Data Drive? An Optimization Framework for Hybrid First-Principles Data-Driven Modeling
Title | Should You Derive, Or Let the Data Drive? An Optimization Framework for Hybrid First-Principles Data-Driven Modeling |
Authors | Remi R. Lam, Lior Horesh, Haim Avron, Karen E. Willcox |
Abstract | Mathematical models are used extensively for diverse tasks including analysis, optimization, and decision making. Frequently, those models are principled but imperfect representations of reality. This is either due to an incomplete physical description of the underlying phenomenon (simplified governing equations, defective boundary conditions, etc.) or due to numerical approximations (discretization, linearization, round-off error, etc.). Model misspecification can lead to erroneous model predictions and, consequently, suboptimal decisions for the intended end-goal task. To mitigate this effect, one can amend the available model using limited data produced by experiments or higher-fidelity models. A large body of research has focused on estimating explicit model parameters. This work takes a different perspective and targets the construction of a correction model operator with implicit attributes. We investigate the case where the end goal is inversion and illustrate how appropriate choices of properties imposed upon the correction and corrected operator lead to improved end-goal insights. |
Tasks | Decision Making |
Published | 2017-11-12 |
URL | http://arxiv.org/abs/1711.04374v1 |
http://arxiv.org/pdf/1711.04374v1.pdf | |
PWC | https://paperswithcode.com/paper/should-you-derive-or-let-the-data-drive-an |
Repo | |
Framework | |
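In the inversion setting mentioned above, the contrast with explicit parameter estimation can be written schematically as learning a correction operator; the form of the correction and the properties imposed on it are the paper's subject and are not reproduced here, so the following is only an orienting sketch.

```latex
\delta A^{\star} \;=\; \operatorname*{arg\,min}_{\delta A \in \mathcal{C}}
\;\sum_{k} \bigl\lVert (A + \delta A)^{-1} d_k - x_k \bigr\rVert^{2}
```

Here A is the imperfect first-principles forward operator, (x_k, d_k) are the limited state/observation pairs obtained from experiments or higher-fidelity models, and \mathcal{C} encodes the structural properties imposed on the correction so that inversion with A + \delta A is better behaved.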
Multi-view pose estimation with mixtures-of-parts and adaptive viewpoint selection
Title | Multi-view pose estimation with mixtures-of-parts and adaptive viewpoint selection |
Authors | Emre Dogan, Gonen Eren, Christian Wolf, Eric Lombardi, Atilla Baskurt |
Abstract | We propose a new method for human pose estimation which leverages information from multiple views to impose a strong prior on articulated pose. The novelty of the method concerns the types of coherence modelled. Consistency is maximised over the different views through terms modelling classical geometric information (coherence of the resulting poses) as well as appearance information, which is modelled as latent variables in the global energy function. Moreover, the adequacy of each view is assessed and its contribution is adjusted accordingly. Experiments on the HumanEva and UMPM datasets show that the proposed method significantly decreases the estimation error compared to single-view results. |
Tasks | Pose Estimation |
Published | 2017-09-25 |
URL | http://arxiv.org/abs/1709.08527v1 |
http://arxiv.org/pdf/1709.08527v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-view-pose-estimation-with-mixtures-of |
Repo | |
Framework | |
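Schematically, the global energy described above couples weighted per-view appearance terms (with latent variables) and cross-view geometric coherence terms; the exact terms below are an assumption on our part, intended only to fix notation.

```latex
E\bigl(\{p_v\},\{z_v\}\bigr) \;=\; \sum_{v=1}^{V} \alpha_v\,\phi_{\mathrm{app}}\bigl(p_v, z_v \mid I_v\bigr)
\;+\; \sum_{v<v'} \psi_{\mathrm{geom}}\bigl(p_v, p_{v'}\bigr)
```

Here p_v is the articulated pose hypothesised in view v, z_v the latent appearance variables, I_v the image from that view, \alpha_v a weight reflecting the assessed adequacy of the view, and \psi_geom enforces coherence of the resulting poses across views.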