July 27, 2019


Paper Group ANR 524

Simultaneous Feature Aggregating and Hashing for Large-scale Image Search


Title	Simultaneous Feature Aggregating and Hashing for Large-scale Image Search
Authors	Thanh-Toan Do, Dang-Khoa Le Tan, Trung T. Pham, Ngai-Man Cheung
Abstract	In most state-of-the-art hashing-based visual search systems, local image descriptors of an image are first aggregated as a single feature vector. This feature vector is then subjected to a hashing function that produces a binary hash code. In previous work, the aggregating and the hashing processes are designed independently. In this paper, we propose a novel framework where feature aggregating and hashing are designed simultaneously and optimized jointly. Specifically, our joint optimization produces aggregated representations that can be better reconstructed by some binary codes. This leads to more discriminative binary hash codes and improved retrieval accuracy. In addition, we also propose a fast version of the recently-proposed Binary Autoencoder to be used in our proposed framework. We perform extensive retrieval experiments on several benchmark datasets with both SIFT and convolutional features. Our results suggest that the proposed framework achieves significant improvements over the state of the art.
Tasks	Image Retrieval
Published	2017-04-04
URL	http://arxiv.org/abs/1704.00860v1
PDF	http://arxiv.org/pdf/1704.00860v1.pdf
PWC	https://paperswithcode.com/paper/simultaneous-feature-aggregating-and-hashing
Repo
Framework
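
The abstract contrasts its joint design with the usual two-stage pipeline, in which local descriptors are first aggregated and the result is then hashed independently. The following is a minimal sketch of that two-stage baseline only, not of the paper's jointly optimized method. Mean pooling as the aggregator and a random sign projection as the hashing function are assumptions chosen for brevity, and all function names are hypothetical.

```python
import random

def aggregate(descriptors):
    """Mean-pool a list of equal-length local descriptors into one vector."""
    n = len(descriptors)
    dim = len(descriptors[0])
    return [sum(d[i] for d in descriptors) / n for i in range(dim)]

def make_projection(dim, n_bits, seed=0):
    """Draw a fixed random Gaussian projection (a stand-in hashing function)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

def hash_code(vector, projection):
    """One bit per projection row: 1 if the dot product is non-negative."""
    return [1 if sum(w * x for w, x in zip(row, vector)) >= 0 else 0
            for row in projection]

def hamming(a, b):
    """Number of positions where two binary codes differ."""
    return sum(x != y for x, y in zip(a, b))

# Toy usage: two "images", each with two 4-D local descriptors.
image_a = [[1.0, 0.0, 2.0, 1.0], [3.0, 2.0, 0.0, 1.0]]
image_b = [[-1.0, 0.5, -2.0, 0.0], [0.0, -1.0, 1.0, -3.0]]
proj = make_projection(dim=4, n_bits=16)
code_a = hash_code(aggregate(image_a), proj)
code_b = hash_code(aggregate(image_b), proj)
print(hamming(code_a, code_b))  # smaller distance means more similar codes
```

Retrieval then amounts to ranking database codes by Hamming distance to the query code. Because the aggregator here never sees the hashing function, the sketch shows exactly the independence that the paper sets out to remove.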

Nonlinear Kalman Filtering with Divergence Minimization


Title	Nonlinear Kalman Filtering with Divergence Minimization
Authors	San Gultekin, John Paisley
Abstract	We consider the nonlinear Kalman filtering problem using Kullback-Leibler (KL) and $\alpha$-divergence measures as optimization criteria. Unlike linear Kalman filters, nonlinear Kalman filters do not have closed form Gaussian posteriors because of a lack of conjugacy due to the nonlinearity in the likelihood. In this paper we propose novel algorithms to optimize the forward and reverse forms of the KL divergence, as well as the $\alpha$-divergence which contains these two as limiting cases. Unlike previous approaches, our algorithms do not make approximations to the divergences being optimized, but use Monte Carlo integration techniques to derive unbiased algorithms for direct optimization. We assess performance on radar and sensor tracking, and options pricing problems, showing general improvement over the UKF and EKF, as well as competitive performance with particle filtering.
Tasks
Published	2017-05-01
URL	http://arxiv.org/abs/1705.00722v1
PDF	http://arxiv.org/pdf/1705.00722v1.pdf
PWC	https://paperswithcode.com/paper/nonlinear-kalman-filtering-with-divergence
Repo
Framework
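
The abstract relies on Monte Carlo integration to obtain unbiased estimates of divergences. As a rough illustration of that single ingredient, and not of the filtering algorithms themselves, the sketch below estimates the KL divergence between two one-dimensional Gaussians from samples and compares it with the known closed form. The choice of distributions, the sample size, and the function names are all assumptions made for the example.

```python
import math
import random

def log_normal_pdf(x, mean, std):
    """Log density of a 1-D Gaussian with the given mean and standard deviation."""
    return (-0.5 * math.log(2.0 * math.pi * std * std)
            - (x - mean) ** 2 / (2.0 * std * std))

def mc_kl(p, q, n_samples=100_000, seed=0):
    """Unbiased Monte Carlo estimate of KL(p || q) using samples drawn from p.

    p and q are (mean, std) pairs.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(p[0], p[1])
        total += log_normal_pdf(x, *p) - log_normal_pdf(x, *q)
    return total / n_samples

def exact_kl(p, q):
    """Closed-form KL(p || q) for 1-D Gaussians, used here as a reference."""
    (m1, s1), (m2, s2) = p, q
    return math.log(s2 / s1) + (s1 * s1 + (m1 - m2) ** 2) / (2.0 * s2 * s2) - 0.5

p, q = (0.0, 1.0), (1.0, 2.0)
print(mc_kl(p, q), exact_kl(p, q))  # one direction of the divergence
print(mc_kl(q, p), exact_kl(q, p))  # the other direction; KL is not symmetric
```

Swapping the arguments gives the other direction of the KL divergence. In the filtering setting the target posterior has no closed form, which is why a sampling-based estimate is needed in the first place.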

Revisiting Perceptron: Efficient and Label-Optimal Learning of Halfspaces


Title	Revisiting Perceptron: Efficient and Label-Optimal Learning of Halfspaces
Authors	Songbai Yan, Chicheng Zhang
Abstract	It has been a long-standing problem to efficiently learn a halfspace using as few labels as possible in the presence of noise. In this work, we propose an efficient Perceptron-based algorithm for actively learning homogeneous halfspaces under the uniform distribution over the unit sphere. Under the bounded noise condition~\cite{MN06}, where each label is flipped with probability at most $\eta < \frac 1 2$, our algorithm achieves a near-optimal label complexity of $\tilde{O}\left(\frac{d}{(1-2\eta)^2}\ln\frac{1}{\epsilon}\right)$ in time $\tilde{O}\left(\frac{d^2}{\epsilon(1-2\eta)^3}\right)$. Under the adversarial noise condition~\cite{ABL14, KLS09, KKMS08}, where at most a $\tilde \Omega(\epsilon)$ fraction of labels can be flipped, our algorithm achieves a near-optimal label complexity of $\tilde{O}\left(d\ln\frac{1}{\epsilon}\right)$ in time $\tilde{O}\left(\frac{d^2}{\epsilon}\right)$. Furthermore, we show that our active learning algorithm can be converted to an efficient passive learning algorithm that has near-optimal sample complexities with respect to $\epsilon$ and $d$.
Tasks	Active Learning
Published	2017-02-18
URL	http://arxiv.org/abs/1702.05581v2
PDF	http://arxiv.org/pdf/1702.05581v2.pdf
PWC	https://paperswithcode.com/paper/revisiting-perceptron-efficient-and-label
Repo
Framework
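
For orientation, here is a minimal sketch of the classic passive Perceptron that the paper builds on, applied to a homogeneous halfspace with points drawn uniformly from the unit sphere. It is not the active, noise-tolerant algorithm from the abstract: labels are noise-free, every label is used, and a margin is enforced on the toy data (an assumption) so that convergence is guaranteed within a small number of updates.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def unit_vector(rng, d):
    """Uniform point on the unit sphere: normalize a standard Gaussian draw."""
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def make_separable_data(n, d, margin, seed=0):
    """Sample n unit vectors labeled by a hidden halfspace, keeping a margin."""
    rng = random.Random(seed)
    w_star = unit_vector(rng, d)
    points, labels = [], []
    while len(points) < n:
        x = unit_vector(rng, d)
        score = dot(w_star, x)
        if abs(score) >= margin:  # reject points too close to the boundary
            points.append(x)
            labels.append(1 if score > 0 else -1)
    return points, labels, w_star

def predict(w, x):
    return 1 if dot(w, x) >= 0 else -1

def train_perceptron(points, labels, max_epochs=200):
    """Classic Perceptron: add y * x to w whenever (x, y) is misclassified."""
    w = [0.0] * len(points[0])
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(points, labels):
            if y * dot(w, x) <= 0:
                w = [wi + y * xi for wi, xi in zip(w, x)]
                mistakes += 1
        if mistakes == 0:
            break
    return w

points, labels, w_star = make_separable_data(n=200, d=3, margin=0.1)
w = train_perceptron(points, labels)
print(sum(predict(w, x) != y for x, y in zip(points, labels)))  # training errors
```

With unit-norm points and margin 0.1, the standard Perceptron bound limits the number of updates to 100, so the 200-epoch cap is never the binding constraint on this data.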

LD-SDS: Towards an Expressive Spoken Dialogue System based on Linked-Data


Title	LD-SDS: Towards an Expressive Spoken Dialogue System based on Linked-Data
Authors	Alexandros Papangelis, Panagiotis Papadakos, Margarita Kotti, Yannis Stylianou, Yannis Tzitzikas, Dimitris Plexousakis
Abstract	In this work we discuss the related challenges and describe an approach towards the fusion of state-of-the-art technologies from the Spoken Dialogue Systems (SDS) and the Semantic Web and Information Retrieval domains. We envision a dialogue system named LD-SDS that will support advanced, expressive, and engaging user requests, over multiple, complex, rich, and open-domain data sources that will leverage the wealth of the available Linked Data. Specifically, we focus on: a) improving the identification, disambiguation and linking of entities occurring in data sources and user input; b) offering advanced query services for exploiting the semantics of the data, with reasoning and exploratory capabilities; and c) expanding the typical information seeking dialogue model (slot filling) to better reflect real-world conversational search scenarios.
Tasks	Information Retrieval, Slot Filling, Spoken Dialogue Systems
Published	2017-10-09
URL	http://arxiv.org/abs/1710.02973v1
PDF	http://arxiv.org/pdf/1710.02973v1.pdf
PWC	https://paperswithcode.com/paper/ld-sds-towards-an-expressive-spoken-dialogue
Repo
Framework

Phrase Pair Mappings for Hindi-English Statistical Machine Translation


Title	Phrase Pair Mappings for Hindi-English Statistical Machine Translation
Authors	Sreelekha S, Pushpak Bhattacharyya
Abstract	In this paper, we present our work on the creation of lexical resources for machine translation between English and Hindi. We describe the development of phrase pair mappings for our experiments and the comparative performance evaluation of different trained models on top of the baseline Statistical Machine Translation system. We focused on augmenting the parallel corpus with more vocabulary as well as with various inflected forms, exploring different ways of doing so. We have augmented the training corpus with various lexical resources such as lexical words, synset words, function words and verb phrases. We describe the case studies, automatic and subjective evaluations, and detailed error analysis for both the English to Hindi and Hindi to English machine translation systems. We further observed an incremental improvement in the quality of machine translation as more lexical resources were used. Thus lexical resources do help uplift the translation quality of resource-poor languages.
Tasks	Machine Translation
Published	2017-10-05
URL	http://arxiv.org/abs/1710.02100v3
PDF	http://arxiv.org/pdf/1710.02100v3.pdf
PWC	https://paperswithcode.com/paper/phrase-pair-mappings-for-hindi-english
Repo
Framework

A Method of Generating Random Weights and Biases in Feedforward Neural Networks with Random Hidden Nodes


Title	A Method of Generating Random Weights and Biases in Feedforward Neural Networks with Random Hidden Nodes
Authors	Grzegorz Dudek
Abstract	Neural networks with random hidden nodes have gained increasing interest from researchers and in practical applications. This is due to their unique features such as very fast training and the universal approximation property. In these networks the weights and biases of hidden nodes determining the nonlinear feature mapping are set randomly and are not learned. Appropriate selection of the intervals from which weights and biases are selected is extremely important. This topic has not yet been sufficiently explored in the literature. In this work a method of generating random weights and biases is proposed. This method generates the parameters of the hidden nodes in such a way that nonlinear fragments of the activation functions are located in the input space regions with data and can be used to construct the surface approximating a nonlinear target function. The weights and biases are dependent on the input data range and activation function type. The proposed method allows us to control the generalization degree of the model. Together, these lead to an improvement in the approximation performance of the network. Several experiments show very promising results.
Tasks
Published	2017-10-13
URL	http://arxiv.org/abs/1710.04874v1
PDF	http://arxiv.org/pdf/1710.04874v1.pdf
PWC	https://paperswithcode.com/paper/a-method-of-generating-random-weights-and
Repo
Framework
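
One way to read the idea of placing the nonlinear fragments of the activation functions inside the data region is sketched below for a single input and a logistic sigmoid. Each hidden node gets a random slope, and its bias is then chosen so that the sigmoid's inflection point falls at a random location within the data range. This is an illustrative interpretation under those assumptions, not the author's exact procedure; the slope interval and all names are hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def random_hidden_nodes(x_min, x_max, n_nodes, slope_range=(1.0, 10.0), seed=0):
    """Return (weight, bias) pairs whose sigmoids are steepest inside [x_min, x_max]."""
    rng = random.Random(seed)
    nodes = []
    for _ in range(n_nodes):
        # Random slope magnitude and sign (hypothetical interval).
        a = rng.uniform(*slope_range) * rng.choice((-1.0, 1.0))
        # Location in the data range where this node should be most nonlinear.
        x0 = rng.uniform(x_min, x_max)
        # The logistic sigmoid has its inflection point where a * x + b == 0.
        b = -a * x0
        nodes.append((a, b))
    return nodes

def hidden_features(x, nodes):
    """Hidden-layer outputs for a scalar input x."""
    return [sigmoid(a * x + b) for a, b in nodes]

nodes = random_hidden_nodes(x_min=2.0, x_max=5.0, n_nodes=5)
for a, b in nodes:
    print(round(-b / a, 3))  # inflection points, all inside [2.0, 5.0]
```

Drawing the bias independently of the data range, for example from a fixed interval around zero, can leave many nodes saturated over the whole input region. Tying the bias to the data range as above avoids that, which is the dependence on input data range that the abstract points to.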

Algorithm guided outlining of 105 pancreatic cancer liver metastases in Ultrasound


Title	Algorithm guided outlining of 105 pancreatic cancer liver metastases in Ultrasound
Authors	Alexander Hann, Lucas Bettac, Mark M. Haenle, Tilmann Graeter, Andreas W. Berger, Jens Dreyhaupt, Dieter Schmalstieg, Wolfram G. Zoller, Jan Egger
Abstract	Manual segmentation of hepatic metastases in ultrasound images acquired from patients suffering from pancreatic cancer is common practice. Semiautomatic measurements promising assistance in this process are often assessed using a small number of lesions performed by examiners who already know the algorithm. In this work, we present the application of an algorithm for the segmentation of liver metastases due to pancreatic cancer using a set of 105 different images of metastases. The algorithm and the two examiners had never assessed the images before. The examiners first performed a manual segmentation and, after five weeks, a semiautomatic segmentation using the algorithm. They were satisfied in up to 90% of the cases with the semiautomatic segmentation results. Using the algorithm was significantly faster and resulted in a median Dice similarity score of over 80%. Estimation of the inter-operator variability by using the intra class correlation coefficient was good with 0.8. In conclusion, the algorithm facilitates fast and accurate segmentation of liver metastases, comparable to the current gold standard of manual segmentation.
Tasks
Published	2017-10-09
URL	http://arxiv.org/abs/1710.02984v1
PDF	http://arxiv.org/pdf/1710.02984v1.pdf
PWC	https://paperswithcode.com/paper/algorithm-guided-outlining-of-105-pancreatic
Repo
Framework
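
The Dice similarity score reported in the abstract measures the overlap between two segmentations as twice the size of their intersection divided by the sum of their sizes. A minimal sketch follows, assuming each segmentation is given as a collection of pixel coordinates; the function name and the convention for two empty masks are assumptions.

```python
def dice(mask_a, mask_b):
    """Dice similarity between two segmentations given as pixel collections."""
    a, b = set(mask_a), set(mask_b)
    if not a and not b:
        return 1.0  # convention assumed here: two empty masks agree fully
    return 2.0 * len(a & b) / (len(a) + len(b))

manual = {(0, 0), (0, 1), (1, 1)}
semiautomatic = {(0, 1), (1, 1), (1, 2)}
print(round(dice(manual, semiautomatic), 3))  # 2 * 2 / (3 + 3) -> 0.667
```

A score of 1 means the two outlines cover exactly the same pixels and 0 means they do not overlap at all, so a median above 80% indicates substantial agreement between the semiautomatic and manual segmentations.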

The MacGyver Test - A Framework for Evaluating Machine Resourcefulness and Creative Problem Solving


Title	The MacGyver Test - A Framework for Evaluating Machine Resourcefulness and Creative Problem Solving
Authors	Vasanth Sarathy, Matthias Scheutz
Abstract	Current measures of machine intelligence are either difficult to evaluate or lack the ability to test a robot’s problem-solving capacity in open worlds. We propose a novel evaluation framework based on the formal notion of the MacGyver Test, which provides a practical way of assessing the resilience and resourcefulness of artificial agents.
Tasks
Published	2017-04-26
URL	http://arxiv.org/abs/1704.08350v1
PDF	http://arxiv.org/pdf/1704.08350v1.pdf
PWC	https://paperswithcode.com/paper/the-macgyver-test-a-framework-for-evaluating
Repo
Framework

Stereo DSO: Large-Scale Direct Sparse Visual Odometry with Stereo Cameras


Title	Stereo DSO: Large-Scale Direct Sparse Visual Odometry with Stereo Cameras
Authors	Rui Wang, Martin Schwörer, Daniel Cremers
Abstract	We propose Stereo Direct Sparse Odometry (Stereo DSO) as a novel method for highly accurate real-time visual odometry estimation of large-scale environments from stereo cameras. It jointly optimizes for all the model parameters within the active window, including the intrinsic/extrinsic camera parameters of all keyframes and the depth values of all selected pixels. In particular, we propose a novel approach to integrate constraints from static stereo into the bundle adjustment pipeline of temporal multi-view stereo. Real-time optimization is realized by sampling pixels uniformly from image regions with sufficient intensity gradient. Fixed-baseline stereo resolves scale drift. It also reduces the sensitivities to large optical flow and to rolling shutter effect which are known shortcomings of direct image alignment methods. Quantitative evaluation demonstrates that the proposed Stereo DSO outperforms existing state-of-the-art visual odometry methods both in terms of tracking accuracy and robustness. Moreover, our method delivers a more precise metric 3D reconstruction than previous dense/semi-dense direct approaches while providing a higher reconstruction density than feature-based methods.
Tasks	3D Reconstruction, Optical Flow Estimation, Visual Odometry
Published	2017-08-25
URL	http://arxiv.org/abs/1708.07878v1
PDF	http://arxiv.org/pdf/1708.07878v1.pdf
PWC	https://paperswithcode.com/paper/stereo-dso-large-scale-direct-sparse-visual
Repo
Framework

Ethical Questions in NLP Research: The (Mis)-Use of Forensic Linguistics


Title	Ethical Questions in NLP Research: The (Mis)-Use of Forensic Linguistics
Authors	Anil Kumar Singh, Akhilesh Sudhakar
Abstract	Ideas from forensic linguistics are now being used frequently in Natural Language Processing (NLP), using machine learning techniques. While the role of forensic linguistics was more benign earlier, it is now being used for purposes which are questionable. Certain methods from forensic linguistics are employed without considering their scientific limitations and ethical concerns. While we take the specific case of forensic linguistics as an example of such trends in NLP and machine learning, the issue is a larger one and is present in many other scientific and data-driven domains. We suggest that such trends indicate that some of the applied sciences are exceeding their legal and scientific briefs. We highlight how carelessly implemented practices are serving to short-circuit the due processes of law as well as breach ethical codes.
Tasks
Published	2017-12-20
URL	http://arxiv.org/abs/1712.07512v1
PDF	http://arxiv.org/pdf/1712.07512v1.pdf
PWC	https://paperswithcode.com/paper/ethical-questions-in-nlp-research-the-mis-use
Repo
Framework

Age Group and Gender Estimation in the Wild with Deep RoR Architecture


Title	Age Group and Gender Estimation in the Wild with Deep RoR Architecture
Authors	Ke Zhang, Ce Gao, Liru Guo, Miao Sun, Xingfang Yuan, Tony X. Han, Zhenbing Zhao, Baogang Li
Abstract	Automatically predicting age group and gender from face images acquired in unconstrained conditions is an important and challenging task in many real-world applications. Nevertheless, conventional methods with manually-designed features perform unsatisfactorily on in-the-wild benchmarks because of their inability to handle large variations in unconstrained images. This difficulty is alleviated to some degree by Convolutional Neural Networks (CNN), thanks to their powerful feature representation. In this paper, we propose a new CNN based method for age group and gender estimation leveraging Residual Networks of Residual Networks (RoR), which exhibits better optimization ability for age group and gender classification than other CNN architectures. Moreover, two modest mechanisms based on observation of the characteristics of age group are presented to further improve the performance of age estimation. To further improve performance and alleviate over-fitting, the RoR model is first pre-trained on ImageNet, then fine-tuned on the IMDB-WIKI-101 data set to further learn the features of face images, and finally fine-tuned on the Adience data set. Our experiments illustrate the effectiveness of the RoR method for age and gender estimation in the wild, where it achieves better performance than other CNN methods. Finally, the RoR-152+IMDB-WIKI-101 with two mechanisms achieves new state-of-the-art results on the Adience benchmark.
Tasks
Published	2017-10-09
URL	http://arxiv.org/abs/1710.02985v1
PDF	http://arxiv.org/pdf/1710.02985v1.pdf
PWC	https://paperswithcode.com/paper/age-group-and-gender-estimation-in-the-wild
Repo
Framework

Page Stream Segmentation with Convolutional Neural Nets Combining Textual and Visual Features


Title	Page Stream Segmentation with Convolutional Neural Nets Combining Textual and Visual Features
Authors	Gregor Wiedemann, Gerhard Heyer
Abstract	In recent years, (retro-)digitizing paper-based files became a major undertaking for private and public archives as well as an important task in electronic mailroom applications. As a first step, the workflow involves scanning and Optical Character Recognition (OCR) of documents. Preservation of document contexts of single page scans is a major requirement in this context. To facilitate workflows involving very large amounts of paper scans, page stream segmentation (PSS) is the task of automatically separating a stream of scanned images into multi-page documents. In a digitization project together with a German federal archive, we developed a novel approach based on convolutional neural networks (CNN) combining image and text features to achieve optimal document separation results. Evaluation shows that our PSS architecture achieves an accuracy of up to 93%, which can be regarded as a new state-of-the-art for this task.
Tasks	Optical Character Recognition
Published	2017-10-09
URL	http://arxiv.org/abs/1710.03006v3
PDF	http://arxiv.org/pdf/1710.03006v3.pdf
PWC	https://paperswithcode.com/paper/page-stream-segmentation-with-convolutional
Repo
Framework
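
Page stream segmentation can be framed as deciding, for each scanned page, whether it starts a new document. Assuming such per-page decisions are already available (in the paper's setting they would come from a CNN over image and text features; here they are simply given as booleans), the sketch below shows how a page stream is then cut into multi-page documents. The function name and the per-page framing are assumptions made for illustration.

```python
def split_stream(pages, starts_new_document):
    """Group a stream of pages into documents using per-page boundary flags."""
    documents = []
    for page, is_start in zip(pages, starts_new_document):
        # The very first page always opens a document, whatever its flag says.
        if is_start or not documents:
            documents.append([])
        documents[-1].append(page)
    return documents

stream = ["p1", "p2", "p3", "p4", "p5"]
flags = [True, False, True, False, False]
print(split_stream(stream, flags))  # [['p1', 'p2'], ['p3', 'p4', 'p5']]
```

Under this framing, accuracy can be measured per page on the boundary decisions, and a single wrong flag either merges two documents or splits one in two.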

Temporal Multimodal Fusion for Video Emotion Classification in the Wild


Title	Temporal Multimodal Fusion for Video Emotion Classification in the Wild
Authors	Valentin Vielzeuf, Stéphane Pateux, Frédéric Jurie
Abstract	This paper addresses the question of emotion classification. The task consists in predicting emotion labels (taken among a set of possible labels) best describing the emotions contained in short video clips. Building on a standard framework – lying in describing videos by audio and visual features used by a supervised classifier to infer the labels – this paper investigates several novel directions. First of all, improved face descriptors based on 2D and 3D Convolutional Neural Networks are proposed. Second, the paper explores several fusion methods, temporal and multimodal, including a novel hierarchical method combining features and scores. In addition, we carefully reviewed the different stages of the pipeline and designed a CNN architecture adapted to the task; this is important as the size of the training set is small compared to the difficulty of the problem, making generalization difficult. The so-obtained model ranked 4th at the 2017 Emotion in the Wild challenge with the accuracy of 58.8%.
Tasks	Emotion Classification
Published	2017-09-21
URL	http://arxiv.org/abs/1709.07200v1
PDF	http://arxiv.org/pdf/1709.07200v1.pdf
PWC	https://paperswithcode.com/paper/temporal-multimodal-fusion-for-video-emotion
Repo
Framework

A Branch-and-Bound Algorithm for Checkerboard Extraction in Camera-Laser Calibration


Title	A Branch-and-Bound Algorithm for Checkerboard Extraction in Camera-Laser Calibration
Authors	Alireza Khosravian, Tat-Jun Chin, Ian Reid
Abstract	We address the problem of camera-to-laser-scanner calibration using a checkerboard and multiple image-laser scan pairs. Distinguishing which laser points measure the checkerboard and which lie on the background is essential to any such system. We formulate the checkerboard extraction as a combinatorial optimization problem with a clear cut objective function. We propose a branch-and-bound technique that deterministically and globally optimizes the objective. Unlike what is available in the literature, the proposed method is not heuristic and does not require assumptions such as constraints on the background or relying on discontinuity of the range measurements to partition the data into line segments. The proposed approach is generic and can be applied to both 3D or 2D laser scanners as well as the cases where multiple checkerboards are present. We demonstrate the effectiveness of the proposed approach by providing numerical simulations as well as experimental results.
Tasks	Calibration, Combinatorial Optimization
Published	2017-04-04
URL	http://arxiv.org/abs/1704.00887v1
PDF	http://arxiv.org/pdf/1704.00887v1.pdf
PWC	https://paperswithcode.com/paper/a-branch-and-bound-algorithm-for-checkerboard
Repo
Framework

Auto Analysis of Customer Feedback using CNN and GRU Network


Title	Auto Analysis of Customer Feedback using CNN and GRU Network
Authors	Deepak Gupta, Pabitra Lenka, Harsimran Bedi, Asif Ekbal, Pushpak Bhattacharyya
Abstract	Analyzing customer feedback is the best way to channel the data into new marketing strategies that benefit entrepreneurs as well as customers. Therefore an automated system which can analyze customer behavior is in great demand. Users may write feedback in any language, and hence mining appropriate information often becomes intractable. Especially in a traditional feature-based supervised model, it is difficult to build a generic system as one has to understand the concerned language for finding the relevant features. In order to overcome this, we propose deep Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN) based approaches that do not require handcrafting of features. We evaluate these techniques for analyzing customer feedback sentences in four languages, namely English, French, Japanese and Spanish. Our empirical analysis shows that our models perform well in all four languages on the setups of the IJCNLP Shared Task on Customer Feedback Analysis. Our model achieved second rank in French, with an accuracy of 71.75%, and third rank for all the other languages.
Tasks
Published	2017-10-12
URL	http://arxiv.org/abs/1710.04600v1
PDF	http://arxiv.org/pdf/1710.04600v1.pdf
PWC	https://paperswithcode.com/paper/auto-analysis-of-customer-feedback-using-cnn
Repo
Framework