Paper Group ANR 731
Constrained Classification and Ranking via Quantiles
Title | Constrained Classification and Ranking via Quantiles |
Authors | Alan Mackey, Xiyang Luo, Elad Eban |
Abstract | In most machine learning applications, classification accuracy is not the primary metric of interest. Binary classifiers which face class imbalance are often evaluated by the $F_\beta$ score, area under the precision-recall curve, Precision at K, and more. The maximization of many of these metrics can be expressed as a constrained optimization problem, where the constraint is a function of the classifier’s predictions. In this paper we propose a novel framework for learning with constraints that can be expressed as a predicted positive rate (or negative rate) on a subset of the training data. We explicitly model the threshold at which a classifier must operate to satisfy the constraint, yielding a surrogate loss function which avoids the complexity of constrained optimization. The method is model-agnostic and only marginally more expensive than minimization of the unconstrained loss. Experiments on a variety of benchmarks show competitive performance relative to existing baselines. |
Tasks | |
Published | 2018-02-28 |
URL | http://arxiv.org/abs/1803.00067v1 |
http://arxiv.org/pdf/1803.00067v1.pdf | |
PWC | https://paperswithcode.com/paper/constrained-classification-and-ranking-via |
Repo | |
Framework | |
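The abstract describes modeling the threshold at which a classifier must operate to meet a predicted-positive-rate constraint. The sketch below is not the paper's surrogate loss. It only illustrates the underlying idea that such a threshold is an empirical quantile of the classifier scores. The function names and the toy scores are hypothetical.

```python
import math


def threshold_for_positive_rate(scores, rate):
    """Return the score threshold at which roughly `rate` of the examples
    are predicted positive (an empirical quantile of the scores).

    Assumption: a prediction is positive when score >= threshold, so tied
    scores at the threshold can push the realized rate above `rate`.
    """
    if not 0.0 < rate <= 1.0:
        raise ValueError("rate must be in (0, 1]")
    if not scores:
        raise ValueError("scores must be non-empty")
    ranked = sorted(scores, reverse=True)
    # Number of examples that should be predicted positive (at least one).
    k = max(1, math.ceil(rate * len(ranked)))
    return ranked[k - 1]


def predict(scores, threshold):
    """Binary predictions at a fixed operating threshold."""
    return [s >= threshold for s in scores]


# Hypothetical classifier scores for eight examples.
scores = [0.91, 0.15, 0.62, 0.48, 0.77, 0.05, 0.33, 0.58]
threshold = threshold_for_positive_rate(scores, rate=0.25)
predictions = predict(scores, threshold)
```

With a threshold defined this way, a metric such as Precision at K can be evaluated at the operating point that the constraint implies, rather than at a fixed cutoff such as 0.5.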
Demo of Sanskrit-Hindi SMT System
Title | Demo of Sanskrit-Hindi SMT System |
Authors | Rajneesh Pandey, Atul Kr. Ojha, Girish Nath Jha |
Abstract | The demo proposal presents a Phrase-based Sanskrit-Hindi (SaHiT) Statistical Machine Translation system. The system has been developed on Moses. A Sanskrit-Hindi parallel corpus of 43k sentences and a monolingual corpus of 56k sentences in the target language (Hindi) have been used. The system achieves a BLEU score of 57. |
Tasks | Machine Translation |
Published | 2018-04-13 |
URL | http://arxiv.org/abs/1804.06716v1 |
http://arxiv.org/pdf/1804.06716v1.pdf | |
PWC | https://paperswithcode.com/paper/demo-of-sanskrit-hindi-smt-system |
Repo | |
Framework | |
Deep Learning with Inaccurate Training Data for Image Restoration
Title | Deep Learning with Inaccurate Training Data for Image Restoration |
Authors | Bolin Liu, Xiao Shu, Xiaolin Wu |
Abstract | In many applications of deep learning, particularly those in image restoration, it is either very difficult, prohibitively expensive, or outright impossible to obtain paired training data precisely as in the real world. In such cases, one is forced to use synthesized paired data to train the deep convolutional neural network (DCNN). However, due to the unavoidable generalization error in statistical learning, the synthetically trained DCNN often performs poorly on real world data. To overcome this problem, we propose a new general training method that can compensate for, to a large extent, the generalization errors of synthetically trained DCNNs. |
Tasks | Image Restoration |
Published | 2018-11-18 |
URL | http://arxiv.org/abs/1811.07268v1 |
http://arxiv.org/pdf/1811.07268v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-with-inaccurate-training-data |
Repo | |
Framework | |
Liquid Time-constant Recurrent Neural Networks as Universal Approximators
Title | Liquid Time-constant Recurrent Neural Networks as Universal Approximators |
Authors | Ramin M. Hasani, Mathias Lechner, Alexander Amini, Daniela Rus, Radu Grosu |
Abstract | In this paper, we introduce the notion of liquid time-constant (LTC) recurrent neural networks (RNNs), a subclass of continuous-time RNNs, with varying neuronal time-constant realized by their nonlinear synaptic transmission model. This feature is inspired by the communication principles in the nervous system of small species. It enables the model to approximate continuous mapping with a small number of computational units. We show that any finite trajectory of an $n$-dimensional continuous dynamical system can be approximated by the internal state of the hidden units and $n$ output units of an LTC network. Here, we also theoretically find bounds on their neuronal states and varying time-constant. |
Tasks | |
Published | 2018-11-01 |
URL | http://arxiv.org/abs/1811.00321v1 |
http://arxiv.org/pdf/1811.00321v1.pdf | |
PWC | https://paperswithcode.com/paper/liquid-time-constant-recurrent-neural |
Repo | |
Framework | |
Distilling Knowledge Using Parallel Data for Far-field Speech Recognition
Title | Distilling Knowledge Using Parallel Data for Far-field Speech Recognition |
Authors | Jiangyan Yi, Jianhua Tao, Zhengqi Wen, Bin Liu |
Abstract | In order to improve the performance for far-field speech recognition, this paper proposes to distill knowledge from the close-talking model to the far-field model using parallel data. The close-talking model is called the teacher model. The far-field model is called the student model. The student model is trained to imitate the output distributions of the teacher model. This constraint can be realized by minimizing the Kullback-Leibler (KL) divergence between the output distribution of the student model and the teacher model. Experimental results on AMI corpus show that the best student model achieves up to 4.7% absolute word error rate (WER) reduction when compared with the conventionally-trained baseline models. |
Tasks | Speech Recognition |
Published | 2018-02-20 |
URL | http://arxiv.org/abs/1802.06941v1 |
http://arxiv.org/pdf/1802.06941v1.pdf | |
PWC | https://paperswithcode.com/paper/distilling-knowledge-using-parallel-data-for |
Repo | |
Framework | |
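The abstract states that the student is trained by minimizing the Kullback-Leibler divergence between the teacher and student output distributions on parallel data. A minimal sketch of such a loss follows. The direction KL(teacher || student), the per-frame averaging, and all function names are assumptions made for illustration, not details confirmed by the abstract.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0.0
    )


def distillation_loss(teacher_logits, student_logits):
    """Mean per-frame KL(teacher || student).

    Assumption: the two inputs are aligned frame by frame, with the teacher
    fed close-talking audio and the student fed the parallel far-field audio.
    """
    if len(teacher_logits) != len(student_logits):
        raise ValueError("teacher and student must cover the same frames")
    losses = [
        kl_divergence(softmax(t), softmax(s))
        for t, s in zip(teacher_logits, student_logits)
    ]
    return sum(losses) / len(losses)


# Hypothetical logits for two frames and three output classes.
teacher = [[2.0, 0.5, -1.0], [0.1, 1.5, 0.3]]
student = [[1.0, 0.7, -0.2], [0.0, 0.9, 0.8]]
loss = distillation_loss(teacher, student)
```

The loss is zero only when the student reproduces the teacher distribution on every frame, which is the imitation constraint the abstract describes.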
Quantum Semantic Correlations in Hate and Non-Hate Speeches
Title | Quantum Semantic Correlations in Hate and Non-Hate Speeches |
Authors | Francesco Galofaro, Zeno Toffano, Bich-Liên Doan |
Abstract | This paper aims to apply the notions of quantum geometry and correlation to the typification of semantic relations between pairs of keywords in different documents. In particular, we analysed texts classified as hate / non-hate speech, containing the keywords “women”, “white”, and “black”. The paper compares this approach to cosine similarity, a classical methodology, to cast light on the notion of “similar meaning”. |
Tasks | |
Published | 2018-11-08 |
URL | http://arxiv.org/abs/1811.03275v1 |
http://arxiv.org/pdf/1811.03275v1.pdf | |
PWC | https://paperswithcode.com/paper/quantum-semantic-correlations-in-hate-and-non |
Repo | |
Framework | |
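The abstract uses cosine similarity as its classical point of comparison. For reference, a minimal sketch of that baseline is given below. The keyword vectors are made-up toy values, not data from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical co-occurrence counts of two keywords over four context terms.
keyword_a = [3.0, 0.0, 1.0, 2.0]
keyword_b = [1.0, 1.0, 0.0, 2.0]
similarity = cosine_similarity(keyword_a, keyword_b)
```

Cosine similarity depends only on the angle between the two vectors, so rescaling either vector (for example, a longer document with proportionally higher counts) leaves the value unchanged.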
Learning Personas from Dialogue with Attentive Memory Networks
Title | Learning Personas from Dialogue with Attentive Memory Networks |
Authors | Eric Chu, Prashanth Vijayaraghavan, Deb Roy |
Abstract | The ability to infer persona from dialogue can have applications in areas ranging from computational narrative analysis to personalized dialogue generation. We introduce neural models to learn persona embeddings in a supervised character trope classification task. The models encode dialogue snippets from IMDB into representations that can capture the various categories of film characters. The best-performing models use a multi-level attention mechanism over a set of utterances. We also utilize prior knowledge in the form of textual descriptions of the different tropes. We apply the learned embeddings to find similar characters across different movies, and cluster movies according to the distribution of the embeddings. The use of short conversational text as input, and the ability to learn from prior knowledge using memory, suggests these methods could be applied to other domains. |
Tasks | Dialogue Generation |
Published | 2018-10-19 |
URL | http://arxiv.org/abs/1810.08717v1 |
http://arxiv.org/pdf/1810.08717v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-personas-from-dialogue-with |
Repo | |
Framework | |
Attentive Interaction Model: Modeling Changes in View in Argumentation
Title | Attentive Interaction Model: Modeling Changes in View in Argumentation |
Authors | Yohan Jo, Shivani Poddar, Byungsoo Jeon, Qinlan Shen, Carolyn P. Rose, Graham Neubig |
Abstract | We present a neural architecture for modeling argumentative dialogue that explicitly models the interplay between an Opinion Holder’s (OH’s) reasoning and a challenger’s argument, with the goal of predicting if the argument successfully changes the OH’s view. The model has two components: (1) vulnerable region detection, an attention model that identifies parts of the OH’s reasoning that are amenable to change, and (2) interaction encoding, which identifies the relationship between the content of the OH’s reasoning and that of the challenger’s argument. Based on evaluation on discussions from the Change My View forum on Reddit, the two components work together to predict an OH’s change in view, outperforming several baselines. A posthoc analysis suggests that sentences picked out by the attention model are addressed more frequently by successful arguments than by unsuccessful ones. |
Tasks | |
Published | 2018-03-30 |
URL | http://arxiv.org/abs/1804.00065v2 |
http://arxiv.org/pdf/1804.00065v2.pdf | |
PWC | https://paperswithcode.com/paper/attentive-interaction-model-modeling-changes |
Repo | |
Framework | |
Generation of Virtual Dual Energy Images from Standard Single-Shot Radiographs using Multi-scale and Conditional Adversarial Network
Title | Generation of Virtual Dual Energy Images from Standard Single-Shot Radiographs using Multi-scale and Conditional Adversarial Network |
Authors | Bo Zhou, Xunyu Lin, Brendan Eck, Jun Hou, David Wilson |
Abstract | Dual-energy (DE) chest radiographs provide greater diagnostic information than standard radiographs by separating the image into bone and soft tissue, revealing suspicious lesions which may otherwise be obstructed from view. However, acquisition of DE images requires two physical scans, necessitating specialized hardware and processing, and images are prone to motion artifacts. Generation of virtual DE images from standard, single-shot chest radiographs would expand the diagnostic value of standard radiographs without changing the acquisition procedure. We present a Multi-scale Conditional Adversarial Network (MCA-Net) which produces high-resolution virtual DE bone images from standard, single-shot chest radiographs. Our proposed MCA-Net is trained using an adversarial network so that it learns sharp details for the production of high-quality bone images. Then, the virtual DE soft tissue image is generated by processing the standard radiograph with the virtual bone image using a cross projection transformation. Experimental results from 210 patient DE chest radiographs demonstrated that the algorithm can produce high-quality virtual DE chest radiographs. Important structures were preserved, such as coronary calcium in bone images and lung lesions in soft tissue images. The average structural similarity index and peak signal-to-noise ratio of the produced bone images in testing data were 96.4 and 41.5, respectively, which are significantly better than results from previous methods. Furthermore, our clinical evaluation on the publicly available dataset indicates the clinical value of our algorithm. Thus, our algorithm can produce high-quality DE images that are potentially useful for radiologists, computer-aided diagnostics, and other diagnostic tasks. |
Tasks | |
Published | 2018-10-22 |
URL | http://arxiv.org/abs/1810.09354v1 |
http://arxiv.org/pdf/1810.09354v1.pdf | |
PWC | https://paperswithcode.com/paper/generation-of-virtual-dual-energy-images-from |
Repo | |
Framework | |
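The abstract reports image quality as a peak signal-to-noise ratio. As a reminder of how that metric is commonly computed, here is a minimal sketch over flattened pixel lists. The 8-bit peak value of 255 and the toy pixel values are assumptions, and the paper's exact evaluation protocol is not given in the abstract.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in decibels between two images,
    each given as a flat list of pixel intensities.

    Assumption: `max_value` is the largest possible intensity
    (255 for 8-bit images).
    """
    if len(reference) != len(estimate) or not reference:
        raise ValueError("images must be non-empty and the same size")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 2x2 image patches, flattened row by row.
reference_patch = [52.0, 55.0, 61.0, 59.0]
estimated_patch = [54.0, 55.0, 60.0, 57.0]
quality_db = psnr(reference_patch, estimated_patch)
```

Higher values indicate a smaller mean squared error relative to the peak intensity, which is why the metric is usually reported alongside a structural measure such as SSIM.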
Disentangling Controllable and Uncontrollable Factors of Variation by Interacting with the World
Title | Disentangling Controllable and Uncontrollable Factors of Variation by Interacting with the World |
Authors | Yoshihide Sawada |
Abstract | We introduce a method to disentangle controllable and uncontrollable factors of variation by interacting with the world. Disentanglement leads to good representations and is important when applying deep neural networks (DNNs) in fields where explanations are required. This study attempts to improve an existing reinforcement learning (RL) approach to disentangle controllable and uncontrollable factors of variation, because the method lacks a mechanism to represent uncontrollable obstacles. To address this problem, we train two DNNs simultaneously: one that represents the controllable object and another that represents uncontrollable obstacles. For stable training, we applied a pretraining approach using a model robust against uncontrollable obstacles. Simulation experiments demonstrate that the proposed model can disentangle independently controllable and uncontrollable factors without annotated data. |
Tasks | |
Published | 2018-04-19 |
URL | http://arxiv.org/abs/1804.06955v2 |
http://arxiv.org/pdf/1804.06955v2.pdf | |
PWC | https://paperswithcode.com/paper/disentangling-controllable-and-uncontrollable |
Repo | |
Framework | |
Estimation of Variance and Spatial Correlation Width for Fine-scale Measurement Error in Digital Elevation Model
Title | Estimation of Variance and Spatial Correlation Width for Fine-scale Measurement Error in Digital Elevation Model |
Authors | Mykhail Uss, Benoit Vozel, Vladimir Lukin, Kacem Chehdi |
Abstract | In this paper, we borrow from the blind noise parameter estimation (BNPE) methodology, developed earlier in the image processing field, an original no-reference approach for estimating Digital Elevation Model (DEM) vertical error parameters without resorting to a reference DEM. The challenges associated with the proposed approach, related to the physical nature of the error and its multifactor structure in DEM, are discussed in detail. A suitable multivariate method is then developed for estimating the error in gridded DEM. It is built on a recently proposed vectorial BNPE method for estimating spatially correlated noise using Noise Informative areas and Fractal Brownian Motion. The new multivariate method is derived to estimate the effect of the stacking procedure and that of the epipolar line error on local (fine-scale) standard deviation and autocorrelation function width of photogrammetric DEM measurement error. Applying the new estimator to ASTER GDEM2 and ALOS World 3D DEMs shows good agreement between the derived estimates and results available in the literature. In future work, the proposed no-reference method for analyzing DEM error can be extended to a larger number of predictors to account for other factors influencing remote sensing (RS) DEM accuracy. |
Tasks | |
Published | 2018-01-23 |
URL | http://arxiv.org/abs/1801.07740v1 |
http://arxiv.org/pdf/1801.07740v1.pdf | |
PWC | https://paperswithcode.com/paper/estimation-of-variance-and-spatial |
Repo | |
Framework | |
Disentangled Dynamic Representations from Unordered Data
Title | Disentangled Dynamic Representations from Unordered Data |
Authors | Leonhard Helminger, Abdelaziz Djelouah, Markus Gross, Romann M. Weber |
Abstract | We present a deep generative model that learns disentangled static and dynamic representations of data from unordered input. Our approach exploits regularities in sequential data that exist regardless of the order in which the data is viewed. The result of our factorized graphical model is a well-organized and coherent latent space for data dynamics. We demonstrate our method on several synthetic dynamic datasets and real video data featuring various facial expressions and head poses. |
Tasks | |
Published | 2018-12-10 |
URL | http://arxiv.org/abs/1812.03962v1 |
http://arxiv.org/pdf/1812.03962v1.pdf | |
PWC | https://paperswithcode.com/paper/disentangled-dynamic-representations-from |
Repo | |
Framework | |
Data-Driven Debugging for Functional Side Channels
Title | Data-Driven Debugging for Functional Side Channels |
Authors | Saeid Tizpaz-Niari, Pavol Cerny, Ashutosh Trivedi |
Abstract | Information leaks through side channels are a pervasive problem, even in security-critical applications. Functional side channels arise when an attacker knows that a secret value of a server stays fixed for a certain time. Then, the attacker can observe the server executions on a sequence of different public inputs, each paired with the same secret input. Thus for each secret, the attacker observes a function from public inputs to execution time, for instance, and she can compare these functions for different secrets. First, we introduce a notion of noninterference for functional side channels. We focus on the case of noisy observations, where we demonstrate with examples that there is a practical functional side channel in programs that would be deemed information-leak-free or be underestimated using the standard definition. Second, we develop a framework and techniques for debugging programs for functional side channels. We extend evolutionary fuzzing techniques to generate inputs that exploit functional dependencies of response times on public inputs. We adapt existing results and algorithms in functional data analysis to model the functions and discover the existence of side channels. We use a functional extension of standard decision tree learning to pinpoint the code fragments causing a side channel if there is one. We empirically evaluate the performance of our tool FUCHSIA on a series of micro-benchmarks and realistic Java programs. On the set of benchmarks, we show that FUCHSIA outperforms the state-of-the-art techniques in detecting side channel classes. On the realistic programs, we show the scalability of FUCHSIA in analyzing functional side channels in Java programs with thousands of methods. Also, we show the usefulness of FUCHSIA in finding side channels including a zero-day vulnerability in OpenJDK and another vulnerability in Jetty that was since fixed by the developers. |
Tasks | |
Published | 2018-08-30 |
URL | https://arxiv.org/abs/1808.10502v2 |
https://arxiv.org/pdf/1808.10502v2.pdf | |
PWC | https://paperswithcode.com/paper/data-driven-debugging-for-functional-side |
Repo | |
Framework | |
Autonomous discovery of the goal space to learn a parameterized skill
Title | Autonomous discovery of the goal space to learn a parameterized skill |
Authors | Emilio Cartoni, Gianluca Baldassarre |
Abstract | A parameterized skill is a mapping from multiple goals/task parameters to the policy parameters to accomplish them. Existing works in the literature show how a parameterized skill can be learned given a task space that defines all the possible achievable goals. In this work, we focus on tasks defined in terms of final states (goals), and we address the challenge in which the agent aims to autonomously acquire a parameterized skill to manipulate an initially unknown environment. In this case, the task space is not known a priori and the agent has to autonomously discover it. The agent may posit as a task space its whole sensory space (i.e. the space of all possible sensor readings), as the achievable goals will certainly be a subset of this space. However, the space of achievable goals may be a very small subspace of the whole sensory space, so directly using the sensor space as the task space exposes the agent to the curse of dimensionality and makes existing autonomous skill acquisition algorithms inefficient. In this work we present an algorithm that actively discovers the manifold of the achievable goals within the sensor space. We validate the algorithm by employing it in multiple different simulated scenarios where the agent actions achieve different types of goals: moving a redundant arm, pushing an object, and changing the color of an object. |
Tasks | |
Published | 2018-05-19 |
URL | http://arxiv.org/abs/1805.07547v1 |
http://arxiv.org/pdf/1805.07547v1.pdf | |
PWC | https://paperswithcode.com/paper/autonomous-discovery-of-the-goal-space-to |
Repo | |
Framework | |
Interactive dimensionality reduction using similarity projections
Title | Interactive dimensionality reduction using similarity projections |
Authors | Dimitris Spathis, Nikolaos Passalis, Anastasios Tefas |
Abstract | Recent advances in machine learning allow us to analyze and describe the content of high-dimensional data like text, audio, images or other signals. In order to visualize that data in 2D or 3D, Dimensionality Reduction (DR) techniques are usually employed. Most of these techniques, e.g., PCA or t-SNE, produce static projections without taking into account corrections from humans or other data exploration scenarios. In this work, we propose the interactive Similarity Projection (iSP), a novel interactive DR framework based on similarity embeddings, where we form a differentiable objective based on the user interactions and perform learning using gradient descent, with an end-to-end trainable architecture. Two interaction scenarios are evaluated. First, a common methodology in multidimensional projection is to project a subset of data, arrange them in classes or clusters, and project the remaining unseen data based on that manipulation, in a kind of semi-supervised interpolation. We report results that outperform competitive baselines in a wide range of metrics and datasets. Second, we explore the scenario of manipulating some classes, while enriching the optimization with high-dimensional neighbor information. Apart from improving classification precision and clustering on images and text documents, the newly emerging structure of the projection unveils semantic manifolds. For example, on the Head Pose dataset, by just dragging the faces looking far left to the left and those looking far right to the right, all faces are re-arranged on a continuum even on the vertical axis (face up and down). This end-to-end framework can be used for fast, visual semi-supervised learning, manifold exploration, interactive domain adaptation of neural embeddings and transfer learning. |
Tasks | Dimensionality Reduction, Domain Adaptation, Transfer Learning |
Published | 2018-11-13 |
URL | http://arxiv.org/abs/1811.05531v1 |
http://arxiv.org/pdf/1811.05531v1.pdf | |
PWC | https://paperswithcode.com/paper/interactive-dimensionality-reduction-using |
Repo | |
Framework | |