July 30, 2019

3002 words 15 mins read

Paper Group AWR 38


AON: Towards Arbitrarily-Oriented Text Recognition

Title AON: Towards Arbitrarily-Oriented Text Recognition
Authors Zhanzhan Cheng, Yangliu Xu, Fan Bai, Yi Niu, Shiliang Pu, Shuigeng Zhou
Abstract Recognizing text from natural images is a hot research topic in computer vision due to its various applications. Despite several decades of research on optical character recognition (OCR), recognizing text from natural images remains a challenging task. This is because scene text is often in irregular (e.g., curved, arbitrarily oriented, or seriously distorted) arrangements, which have not yet been well addressed in the literature. Existing methods for text recognition mainly work with regular (horizontal and frontal) text and cannot be trivially generalized to handle irregular text. In this paper, we develop the arbitrary orientation network (AON) to directly capture the deep features of irregular text, which are combined into an attention-based decoder to generate character sequences. The whole network can be trained end-to-end using only images and word-level annotations. Extensive experiments on various benchmarks, including the CUTE80, SVT-Perspective, IIIT5k, SVT and ICDAR datasets, show that the proposed AON-based method achieves state-of-the-art performance on irregular datasets and is comparable to major existing methods on regular datasets.
Tasks Optical Character Recognition
Published 2017-11-12
URL http://arxiv.org/abs/1711.04226v2
PDF http://arxiv.org/pdf/1711.04226v2.pdf
PWC https://paperswithcode.com/paper/aon-towards-arbitrarily-oriented-text
Repo https://github.com/kallianisawesome/AON-pytorch
Framework pytorch
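
The recognition stage of AON feeds direction-aware image features into an attention-based decoder that emits one character per step. Below is a minimal sketch of such a decoder in PyTorch; all names and dimensions are our own illustrative choices, and the paper's four-direction feature extraction and filter gate are omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttnDecoder(nn.Module):
    """Minimal attention decoder over a sequence of image features.

    `feats` plays the role of AON's combined (direction-aware) feature
    sequence; sizes are illustrative only, not taken from the paper.
    """
    def __init__(self, feat_dim=256, hidden_dim=256, num_classes=38):
        super().__init__()
        self.rnn = nn.GRUCell(feat_dim + num_classes, hidden_dim)
        self.attn = nn.Linear(hidden_dim + feat_dim, 1)
        self.out = nn.Linear(hidden_dim, num_classes)
        self.num_classes = num_classes

    def forward(self, feats, max_len=25):
        # feats: (batch, seq, feat_dim)
        b, t, d = feats.shape
        h = feats.new_zeros(b, self.rnn.hidden_size)
        y = feats.new_zeros(b, self.num_classes)  # previous char, one-hot
        logits = []
        for _ in range(max_len):
            # additive attention: score every feature against the state
            scores = self.attn(torch.cat([h.unsqueeze(1).expand(b, t, -1),
                                          feats], dim=-1)).squeeze(-1)
            alpha = F.softmax(scores, dim=-1)           # (b, t)
            ctx = (alpha.unsqueeze(-1) * feats).sum(1)  # (b, feat_dim)
            h = self.rnn(torch.cat([ctx, y], dim=-1), h)
            step = self.out(h)
            y = F.one_hot(step.argmax(-1), self.num_classes).float()
            logits.append(step)
        return torch.stack(logits, dim=1)  # (b, max_len, num_classes)
```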

Laplacian-Steered Neural Style Transfer

Title Laplacian-Steered Neural Style Transfer
Authors Shaohua Li, Xinxing Xu, Liqiang Nie, Tat-Seng Chua
Abstract Neural Style Transfer based on Convolutional Neural Networks (CNN) aims to synthesize a new image that retains the high-level structure of a content image, rendered in the low-level texture of a style image. This is achieved by constraining the new image to have high-level CNN features similar to the content image, and lower-level CNN features similar to the style image. However, in the traditional optimization objective, low-level features of the content image are absent, and the low-level features of the style image dominate the low-level detail structures of the new image. Hence, in the synthesized image, many details of the content image are lost, and many inconsistent and unpleasant artifacts appear. As a remedy, we propose to steer image synthesis with a novel loss function: the Laplacian loss. The Laplacian matrix (“Laplacian” in short), produced by a Laplacian operator, is widely used in computer vision to detect edges and contours. The Laplacian loss measures the difference between the Laplacians, and correspondingly the difference between the detail structures, of the content image and the new image. It is flexible and compatible with the traditional style transfer constraints. By incorporating the Laplacian loss, we obtain a new optimization objective for neural style transfer named Lapstyle. Minimizing this objective produces a stylized image that better preserves the detail structures of the content image and eliminates artifacts. Experiments show that Lapstyle produces more appealing stylized images with fewer artifacts, without compromising their “stylishness”.
Tasks Image Generation, Style Transfer
Published 2017-07-05
URL http://arxiv.org/abs/1707.01253v2
PDF http://arxiv.org/pdf/1707.01253v2.pdf
PWC https://paperswithcode.com/paper/laplacian-steered-neural-style-transfer
Repo https://github.com/askerlee/lapstyle
Framework tf
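
The paper's central addition is easy to state in code: filter both the content image and the synthesized image with a Laplacian kernel and penalize the difference. A hedged PyTorch sketch follows (the authors' implementation uses TensorFlow, and details such as pooling before the operator may differ):

```python
import torch
import torch.nn.functional as F

# 3x3 discrete Laplacian kernel, applied per channel.
LAP = torch.tensor([[0., 1., 0.],
                    [1., -4., 1.],
                    [0., 1., 0.]]).view(1, 1, 3, 3)

def laplacian(img):
    # img: (batch, channels, H, W); depthwise conv with the Laplacian kernel
    c = img.shape[1]
    return F.conv2d(img, LAP.expand(c, 1, 3, 3), padding=1, groups=c)

def laplacian_loss(content, synthesized):
    """Squared difference of the two images' Laplacians, i.e. of their
    detail structures; added to the usual content and style losses."""
    return F.mse_loss(laplacian(synthesized), laplacian(content))
```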

Interpretable Policies for Reinforcement Learning by Genetic Programming

Title Interpretable Policies for Reinforcement Learning by Genetic Programming
Authors Daniel Hein, Steffen Udluft, Thomas A. Runkler
Abstract The search for interpretable reinforcement learning policies is of high academic and industrial interest. Especially for industrial systems, domain experts are more likely to deploy autonomously learned controllers if they are understandable and convenient to evaluate. Basic algebraic equations are expected to meet these requirements, as long as they are restricted to an adequate complexity. Here we introduce the genetic programming for reinforcement learning (GPRL) approach, based on model-based batch reinforcement learning and genetic programming, which autonomously learns policy equations from pre-existing default state-action trajectory samples. GPRL is compared to a straightforward method that uses genetic programming for symbolic regression, yielding policies that imitate an existing well-performing but non-interpretable policy. Experiments on three reinforcement learning benchmarks, i.e., mountain car, cart-pole balancing, and the industrial benchmark, demonstrate the superiority of our GPRL approach over the symbolic regression method. GPRL is capable of producing well-performing, interpretable reinforcement learning policies from pre-existing default trajectory data.
Tasks
Published 2017-12-12
URL http://arxiv.org/abs/1712.04170v2
PDF http://arxiv.org/pdf/1712.04170v2.pdf
PWC https://paperswithcode.com/paper/interpretable-policies-for-reinforcement
Repo https://github.com/siemens/industrialbenchmark
Framework none
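
The key representational choice in GPRL is that a policy is a small algebraic expression over state variables, so it stays human-readable. The sketch below shows one way to encode and evaluate such expression-tree policies in Python; the GP loop itself (selection, crossover, mutation, and fitness evaluation on a learned dynamics model) is omitted, and the operator set and constants are our own assumptions.

```python
import random, operator

# A tiny expression-tree policy, in the spirit of GPRL: policies are basic
# algebraic equations over state variables, so they stay human-readable.
OPS = {'+': operator.add, '-': operator.sub, '*': operator.mul}

def random_expr(depth, n_vars):
    if depth == 0 or random.random() < 0.3:
        return random.choice([('var', random.randrange(n_vars)),
                              ('const', round(random.uniform(-2, 2), 2))])
    op = random.choice(list(OPS))
    return (op, random_expr(depth - 1, n_vars), random_expr(depth - 1, n_vars))

def evaluate(expr, state):
    tag = expr[0]
    if tag == 'var':
        return state[expr[1]]
    if tag == 'const':
        return expr[1]
    return OPS[tag](evaluate(expr[1], state), evaluate(expr[2], state))

def to_str(expr):
    tag = expr[0]
    if tag == 'var':
        return f"s{expr[1]}"
    if tag == 'const':
        return str(expr[1])
    return f"({to_str(expr[1])} {tag} {to_str(expr[2])})"

# e.g. a random candidate mountain-car policy over (position, velocity):
pol = random_expr(depth=3, n_vars=2)
print(to_str(pol), "->", evaluate(pol, [-0.5, 0.01]))
```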

Combining Residual Networks with LSTMs for Lipreading

Title Combining Residual Networks with LSTMs for Lipreading
Authors Themos Stafylakis, Georgios Tzimiropoulos
Abstract We propose an end-to-end deep learning architecture for word-level visual speech recognition. The system is a combination of spatiotemporal convolutional, residual and bidirectional Long Short-Term Memory networks. We train and evaluate it on the Lipreading In-The-Wild benchmark, a challenging database of 500 target words consisting of 1.28-second video excerpts from BBC TV broadcasts. The proposed network attains a word accuracy of 83.0%, a 6.8% absolute improvement over the current state of the art, without using information about word boundaries during training or testing.
Tasks Lipreading, Speech Recognition, Visual Speech Recognition
Published 2017-03-12
URL http://arxiv.org/abs/1703.04105v4
PDF http://arxiv.org/pdf/1703.04105v4.pdf
PWC https://paperswithcode.com/paper/combining-residual-networks-with-lstms-for
Repo https://github.com/mpc001/end-to-end-Lipreading
Framework pytorch
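
The architecture is a three-stage pipeline: a spatiotemporal convolutional front-end, a 2D ResNet applied per frame, and a bidirectional LSTM over the frame features. A hedged PyTorch sketch of the shape of this pipeline (layer sizes, pooling, and the temporal aggregation are illustrative, not the paper's exact configuration):

```python
import torch
import torch.nn as nn
from torchvision.models import resnet34

class LipNetRes(nn.Module):
    """Sketch of the spatiotemporal-conv + ResNet + BiLSTM pipeline.

    Layer sizes are illustrative; the paper's exact configuration differs.
    """
    def __init__(self, num_words=500, hidden=256):
        super().__init__()
        # spatiotemporal front-end over grayscale mouth crops
        self.front = nn.Sequential(
            nn.Conv3d(1, 64, kernel_size=(5, 7, 7), stride=(1, 2, 2),
                      padding=(2, 3, 3)),
            nn.BatchNorm3d(64), nn.ReLU(),
            nn.MaxPool3d((1, 3, 3), stride=(1, 2, 2), padding=(0, 1, 1)))
        trunk = resnet34()
        trunk.conv1 = nn.Conv2d(64, 64, 7, 2, 3)  # accept front-end channels
        trunk.fc = nn.Identity()                   # keep 512-d features
        self.trunk = trunk
        self.lstm = nn.LSTM(512, hidden, num_layers=2,
                            bidirectional=True, batch_first=True)
        self.head = nn.Linear(2 * hidden, num_words)

    def forward(self, x):          # x: (batch, 1, frames, H, W)
        f = self.front(x)          # (batch, 64, frames, h, w)
        b, c, t, h, w = f.shape
        f = f.transpose(1, 2).reshape(b * t, c, h, w)
        f = self.trunk(f).view(b, t, -1)   # per-frame 512-d features
        out, _ = self.lstm(f)
        return self.head(out.mean(1))      # average over time, then classify
```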

Distance-based classifier by data transformation for high-dimension, strongly spiked eigenvalue models

Title Distance-based classifier by data transformation for high-dimension, strongly spiked eigenvalue models
Authors Makoto Aoshima, Kazuyoshi Yata
Abstract We consider classifiers for high-dimensional data under the strongly spiked eigenvalue (SSE) model. We first show that high-dimensional data often follow the SSE model. We then consider a distance-based classifier that uses eigenstructures of the SSE model, applying the noise-reduction methodology to estimate the eigenvalues and eigenvectors. We create a new distance-based classifier by transforming data from the SSE model to the non-SSE model. We present simulation studies and discuss the performance of the new classifier. Finally, we demonstrate the new classifier on microarray data sets.
Tasks
Published 2017-10-30
URL http://arxiv.org/abs/1710.10768v1
PDF http://arxiv.org/pdf/1710.10768v1.pdf
PWC https://paperswithcode.com/paper/distance-based-classifier-by-data
Repo https://github.com/keisuke6616/Distance-based-classifier-by-data-transformation-for-high-dimension-strongly-spiked-eigenvalue-mode
Framework none
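
The transformation idea can be illustrated crudely: estimate the leading (strongly spiked) eigenvectors and project them out, so that a plain distance-to-class-mean classifier behaves as in a non-SSE model. The NumPy sketch below is a deliberately simplified stand-in; the paper's actual method uses noise-reduction estimators of the eigenvalues and eigenvectors rather than raw sample eigenvectors.

```python
import numpy as np

def transform_out_spikes(X, k):
    """Project out the k leading sample eigenvectors (an illustrative
    stand-in for the paper's eigenstructure-based data transformation)."""
    Xc = X - X.mean(0)
    # leading right singular vectors = leading covariance eigenvectors
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = np.eye(X.shape[1]) - Vt[:k].T @ Vt[:k]
    return X @ P, P

def distance_classify(x, means):
    """Assign x to the class with the nearest transformed mean."""
    return int(np.argmin([np.sum((x - m) ** 2) for m in means]))

# usage: transform training data, compute class means, transform test points
X0, X1 = np.random.randn(20, 1000), np.random.randn(25, 1000) + 0.3
Xall, P = transform_out_spikes(np.vstack([X0, X1]), k=2)
means = [Xall[:20].mean(0), Xall[20:].mean(0)]
print(distance_classify(np.random.randn(1000) @ P, means))
```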

GXNOR-Net: Training deep neural networks with ternary weights and activations without full-precision memory under a unified discretization framework

Title GXNOR-Net: Training deep neural networks with ternary weights and activations without full-precision memory under a unified discretization framework
Authors Lei Deng, Peng Jiao, Jing Pei, Zhenzhi Wu, Guoqi Li
Abstract There is a pressing need to build an architecture that could subsume discretized networks, such as binary and ternary networks, under a unified framework that achieves both higher performance and less overhead. To this end, two fundamental issues are yet to be addressed. The first is how to implement back propagation when neuronal activations are discrete. The second is how to remove the full-precision hidden weights in the training phase to break the memory/computation bottlenecks. To address the first issue, we present a multi-step neuronal activation discretization method and a derivative approximation technique that enable implementing the back propagation algorithm on discrete DNNs. For the second issue, we propose a discrete state transition (DST) methodology to constrain the weights in a discrete space without saving the hidden weights. In this way, we build a unified framework that subsumes binary and ternary networks as special cases, and under which a heuristic algorithm is provided at https://github.com/AcrossV/Gated-XNOR. In particular, we find that when both the weights and activations become ternary, the DNNs can be reduced to sparse binary networks, termed gated XNOR networks (GXNOR-Nets), since only the event of a non-zero weight coinciding with a non-zero activation enables the control gate to start the XNOR logic operations of the original binary networks. This promises event-driven hardware design for efficient mobile intelligence. We achieve advanced performance compared with state-of-the-art algorithms. Furthermore, the computational sparsity and the number of states in the discrete space can be flexibly modified to suit various hardware platforms.
Tasks
Published 2017-05-25
URL http://arxiv.org/abs/1705.09283v5
PDF http://arxiv.org/pdf/1705.09283v5.pdf
PWC https://paperswithcode.com/paper/gxnor-net-training-deep-neural-networks-with
Repo https://github.com/AcrossV/Gated-XNOR
Framework none
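
A common way to illustrate training with discrete activations is a straight-through estimator: ternarize in the forward pass and pass a clipped gradient through in the backward pass. The PyTorch sketch below shows that idea only; it is not the paper's multi-step discretization or its DST weight update, and the threshold is our own choice.

```python
import torch

class TernarySTE(torch.autograd.Function):
    """Ternarize to {-1, 0, +1} in the forward pass; approximate the
    derivative with a straight-through estimator in the backward pass
    (a simpler stand-in for the paper's derivative approximation)."""
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        delta = 0.5  # ternarization threshold (our choice)
        return torch.where(x > delta, torch.ones_like(x),
               torch.where(x < -delta, -torch.ones_like(x),
                           torch.zeros_like(x)))

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        # pass gradients through only where |x| <= 1
        return grad_out * (x.abs() <= 1).float()

x = torch.randn(4, requires_grad=True)
y = TernarySTE.apply(x)
y.sum().backward()
print(y, x.grad)
```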

EmoTxt: A Toolkit for Emotion Recognition from Text

Title EmoTxt: A Toolkit for Emotion Recognition from Text
Authors Fabio Calefato, Filippo Lanubile, Nicole Novielli
Abstract We present EmoTxt, a toolkit for emotion recognition from text, trained and tested on a gold standard of about 9K questions, answers, and comments from online interactions. We provide empirical evidence of the performance of EmoTxt. To the best of our knowledge, EmoTxt is the first open-source toolkit supporting both emotion recognition from text and the training of custom emotion classification models.
Tasks Emotion Classification, Emotion Recognition
Published 2017-08-13
URL http://arxiv.org/abs/1708.03892v2
PDF http://arxiv.org/pdf/1708.03892v2.pdf
PWC https://paperswithcode.com/paper/emotxt-a-toolkit-for-emotion-recognition-from
Repo https://github.com/brandonserna/sentiment2emoji
Framework none
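
For readers who just want the shape of the task, here is a toy text-classification pipeline in scikit-learn; EmoTxt's actual features, learner, and 9K-example gold standard differ, and the texts and labels below are fabricated.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Fabricated toy training data, only to show the fit/predict shape of
# training a custom emotion classification model.
texts = ["this crash is driving me crazy", "thanks, that fixed it!",
         "why does this keep failing", "great answer, works perfectly"]
labels = ["anger", "joy", "anger", "joy"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                    LogisticRegression(max_iter=1000))
clf.fit(texts, labels)
print(clf.predict(["this finally works, amazing"]))
```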

Learning Features for Offline Handwritten Signature Verification using Deep Convolutional Neural Networks

Title Learning Features for Offline Handwritten Signature Verification using Deep Convolutional Neural Networks
Authors Luiz G. Hafemann, Robert Sabourin, Luiz S. Oliveira
Abstract Verifying the identity of a person using handwritten signatures is challenging in the presence of skilled forgeries, where a forger has access to a person’s signature and deliberately attempts to imitate it. In offline (static) signature verification, the dynamic information of the signature writing process is lost, and it is difficult to design good feature extractors that can distinguish genuine signatures from skilled forgeries. This is reflected in relatively poor performance, with verification errors around 7% for the best systems in the literature. To address both the difficulty of obtaining good features and the need to improve system performance, we propose learning the representations from signature images, in a writer-independent format, using Convolutional Neural Networks. In particular, we propose a novel formulation of the problem that includes knowledge of skilled forgeries from a subset of users in the feature learning process, aiming to capture visual cues that distinguish genuine signatures from forgeries regardless of the user. Extensive experiments were conducted on four datasets: GPDS, MCYT, CEDAR and Brazilian PUC-PR. On GPDS-160, we obtained a large improvement in state-of-the-art performance, achieving a 1.72% Equal Error Rate compared to 6.97% in the literature. We also verified that the features generalize beyond the GPDS dataset, surpassing state-of-the-art performance on the other datasets without requiring the representation to be fine-tuned to each particular dataset.
Tasks
Published 2017-05-16
URL http://arxiv.org/abs/1705.05787v1
PDF http://arxiv.org/pdf/1705.05787v1.pdf
PWC https://paperswithcode.com/paper/learning-features-for-offline-handwritten
Repo https://github.com/nathayush/Sigver
Framework pytorch
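
One way to read the paper's formulation is as multi-task feature learning: a shared convolutional trunk with a writer-classification head and a genuine-vs-forgery head, so the learned features must separate forgeries regardless of the user. The PyTorch sketch below captures only that structure; all layer sizes and the loss weighting are our own assumptions.

```python
import torch
import torch.nn as nn

class SigNetSketch(nn.Module):
    """Writer-independent feature learning sketch: a shared CNN with a
    user-classification head and a forgery head, echoing the paper's idea
    of injecting forgery knowledge into feature learning. Sizes are ours."""
    def __init__(self, n_users=100, feat_dim=512):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, 5, 2, 2), nn.ReLU(),
            nn.Conv2d(32, 64, 3, 2, 1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            nn.Linear(64 * 16, feat_dim), nn.ReLU())
        self.user_head = nn.Linear(feat_dim, n_users)   # which writer?
        self.forgery_head = nn.Linear(feat_dim, 1)      # genuine vs. forged

    def forward(self, x):                # x: (batch, 1, H, W) signature crops
        f = self.features(x)
        return self.user_head(f), self.forgery_head(f)

# training would combine both objectives, e.g.
# loss = ce(user_logits, user_ids) + lam * bce(forgery_logit, is_forgery)
```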

HolStep: A Machine Learning Dataset for Higher-order Logic Theorem Proving

Title HolStep: A Machine Learning Dataset for Higher-order Logic Theorem Proving
Authors Cezary Kaliszyk, François Chollet, Christian Szegedy
Abstract Large computer-understandable proofs consist of millions of intermediate logical steps. The vast majority of such steps originate from manually selected and manually guided heuristics applied to intermediate goals. So far, machine learning has generally not been used to filter or generate these steps. In this paper, we introduce a new dataset based on Higher-Order Logic (HOL) proofs, for the purpose of developing new machine learning-based theorem-proving strategies. We make this dataset publicly available under the BSD license. We propose various machine learning tasks that can be performed on this dataset, and discuss their significance for theorem proving. We also benchmark a set of simple baseline machine learning models suited for the tasks (including logistic regression, convolutional neural networks and recurrent neural networks). The results of our baseline models show the promise of applying machine learning to HOL theorem proving.
Tasks Automated Theorem Proving
Published 2017-03-01
URL http://arxiv.org/abs/1703.00426v1
PDF http://arxiv.org/pdf/1703.00426v1.pdf
PWC https://paperswithcode.com/paper/holstep-a-machine-learning-dataset-for-higher
Repo https://github.com/tensorflow/deepmath/tree/master/deepmath/holstep_baselines
Framework tf
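
To make the baseline setting concrete, here is a toy version of a statement-usefulness classifier using bag-of-tokens logistic regression in scikit-learn; the statements and labels below are fabricated, and the real dataset is parsed from HolStep's own file format.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Toy stand-in for a HolStep-style task: each example is a tokenized HOL
# statement, labeled useful or not for the final proof.
steps = ["|- a = a", "|- !x. x + 0 = x", "|- T", "|- x * 0 = 0"]
useful = [0, 1, 0, 1]   # fabricated toy labels, not from the dataset

vec = CountVectorizer(token_pattern=r"\S+")  # split HOL tokens on whitespace
clf = LogisticRegression().fit(vec.fit_transform(steps), useful)
print(clf.predict(vec.transform(["|- !y. y + 0 = y"])))
```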

From Word Segmentation to POS Tagging for Vietnamese

Title From Word Segmentation to POS Tagging for Vietnamese
Authors Dat Quoc Nguyen, Thanh Vu, Dai Quoc Nguyen, Mark Dras, Mark Johnson
Abstract This paper presents an empirical comparison of two strategies for Vietnamese Part-of-Speech (POS) tagging from unsegmented text: (i) a pipeline strategy where we consider the output of a word segmenter as the input of a POS tagger, and (ii) a joint strategy where we predict a combined segmentation and POS tag for each syllable. We also make a comparison between state-of-the-art (SOTA) feature-based and neural network-based models. On the benchmark Vietnamese treebank (Nguyen et al., 2009), experimental results show that the pipeline strategy produces better POS tagging scores from unsegmented text than the joint strategy, and the highest accuracy is obtained by using a feature-based model.
Tasks Part-Of-Speech Tagging
Published 2017-11-14
URL http://arxiv.org/abs/1711.04951v1
PDF http://arxiv.org/pdf/1711.04951v1.pdf
PWC https://paperswithcode.com/paper/from-word-segmentation-to-pos-tagging-for
Repo https://github.com/datquocnguyen/VnMarMoT
Framework none
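
The joint strategy can be made concrete with a small example: each syllable receives a combined segmentation-plus-POS label, and one sequence tagger predicts both at once. The sketch below shows the label scheme and how to decode it back into (word, tag) pairs; the tags are illustrative, not the treebank's exact tagset.

```python
# Joint strategy sketch: each syllable gets a combined segmentation + POS
# label, so one sequence tagger does both jobs. Labels here are illustrative.
syllables = ["Hà", "Nội", "là", "thủ", "đô"]
joint_tags = ["B-Np", "I-Np", "B-V", "B-N", "I-N"]

def decode(syllables, joint_tags):
    """Recover (word, pos) pairs from B-/I- prefixed joint tags."""
    words = []
    for syl, tag in zip(syllables, joint_tags):
        bound, pos = tag.split("-", 1)
        if bound == "B" or not words:
            words.append([syl, pos])
        else:
            words[-1][0] += "_" + syl  # Vietnamese convention: join with _
    return [tuple(w) for w in words]

print(decode(syllables, joint_tags))
# [('Hà_Nội', 'Np'), ('là', 'V'), ('thủ_đô', 'N')]
```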

“Zero-Shot” Super-Resolution using Deep Internal Learning

Title “Zero-Shot” Super-Resolution using Deep Internal Learning
Authors Assaf Shocher, Nadav Cohen, Michal Irani
Abstract Deep Learning has led to a dramatic leap in Super-Resolution (SR) performance in the past few years. However, being supervised, these SR methods are restricted to specific training data, where the acquisition of the low-resolution (LR) images from their high-resolution (HR) counterparts is predetermined (e.g., bicubic downscaling), without any distracting artifacts (e.g., sensor noise, image compression, non-ideal PSF, etc). Real LR images, however, rarely obey these restrictions, resulting in poor SR results by SotA (State of the Art) methods. In this paper we introduce “Zero-Shot” SR, which exploits the power of Deep Learning, but does not rely on prior training. We exploit the internal recurrence of information inside a single image, and train a small image-specific CNN at test time, on examples extracted solely from the input image itself. As such, it can adapt itself to different settings per image. This makes it possible to perform SR on real old photos, noisy images, biological data, and other images where the acquisition process is unknown or non-ideal. On such images, our method outperforms SotA CNN-based SR methods, as well as previous unsupervised SR methods. To the best of our knowledge, this is the first unsupervised CNN-based SR method.
Tasks Image Compression, Image Super-Resolution, Super-Resolution
Published 2017-12-17
URL http://arxiv.org/abs/1712.06087v1
PDF http://arxiv.org/pdf/1712.06087v1.pdf
PWC https://paperswithcode.com/paper/zero-shot-super-resolution-using-deep
Repo https://github.com/assafshocher/ZSSR
Framework none
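
The method is compact enough to sketch end to end: downscale the test image again to build training pairs from the image itself, train a small CNN on them, then apply it once at the target scale. The PyTorch sketch below follows that recipe; the network size, loss, step count, and the absence of the paper's crop/augmentation sampling are our simplifications, not its settings.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def zssr_sketch(lr_img, scale=2, steps=200):
    """Zero-shot SR sketch: train a small CNN at test time on pairs made
    by further downscaling the single input image, then apply it once.
    lr_img: (1, C, H, W) tensor in [0, 1]. Hyperparameters are ours."""
    net = nn.Sequential(
        nn.Conv2d(lr_img.shape[1], 64, 3, padding=1), nn.ReLU(),
        nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
        nn.Conv2d(64, lr_img.shape[1], 3, padding=1))
    opt = torch.optim.Adam(net.parameters(), lr=1e-3)
    for _ in range(steps):
        # "son": the input further downscaled; "father": the input itself
        son = F.interpolate(lr_img, scale_factor=1 / scale, mode='bicubic')
        son_up = F.interpolate(son, size=lr_img.shape[-2:], mode='bicubic')
        loss = F.l1_loss(net(son_up) + son_up, lr_img)  # residual learning
        opt.zero_grad(); loss.backward(); opt.step()
    up = F.interpolate(lr_img, scale_factor=scale, mode='bicubic')
    return (net(up) + up).clamp(0, 1)   # SR output

# sr = zssr_sketch(torch.rand(1, 3, 64, 64))
```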

Which Encoding is the Best for Text Classification in Chinese, English, Japanese and Korean?

Title Which Encoding is the Best for Text Classification in Chinese, English, Japanese and Korean?
Authors Xiang Zhang, Yann LeCun
Abstract This article offers an empirical study on the different ways of encoding Chinese, Japanese, Korean (CJK) and English languages for text classification. Different encoding levels are studied, including UTF-8 bytes, characters, words, romanized characters and romanized words. For all encoding levels, whenever applicable, we provide comparisons with linear models, fastText and convolutional networks. For convolutional networks, we compare between encoding mechanisms using character glyph images, one-hot (or one-of-n) encoding, and embedding. In total there are 473 models, using 14 large-scale text classification datasets in 4 languages: Chinese, English, Japanese and Korean. Some conclusions from these results are that byte-level one-hot encoding based on UTF-8 consistently produces competitive results for convolutional networks, that word-level n-gram linear models are competitive even without perfect word segmentation, and that fastText provides the best result using character-level n-gram encoding but can overfit when the features are overly rich.
Tasks Text Classification
Published 2017-08-08
URL http://arxiv.org/abs/1708.02657v2
PDF http://arxiv.org/pdf/1708.02657v2.pdf
PWC https://paperswithcode.com/paper/which-encoding-is-the-best-for-text
Repo https://github.com/zhangxiangxiao/glyph
Framework none
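
The study's most robust finding, byte-level one-hot encoding of UTF-8 text, is simple to implement and needs no segmentation in any of the four languages. A minimal sketch (the sequence length of 256 is an arbitrary choice here):

```python
import numpy as np

def byte_onehot(text, max_len=256):
    """Byte-level one-hot encoding of UTF-8 text: each byte (0-255) becomes
    a 256-dim one-hot row, truncated or zero-padded to max_len. Works
    uniformly for Chinese, English, Japanese and Korean."""
    data = text.encode('utf-8')[:max_len]
    out = np.zeros((max_len, 256), dtype=np.float32)
    for i, b in enumerate(data):
        out[i, b] = 1.0
    return out

x = byte_onehot("你好 hello 안녕 こんにちは")
print(x.shape, x.sum())   # (256, 256); one 1 per encoded byte
```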

Multiwinner Voting with Fairness Constraints

Title Multiwinner Voting with Fairness Constraints
Authors L. Elisa Celis, Lingxiao Huang, Nisheeth K. Vishnoi
Abstract Multiwinner voting rules are used to select a small representative subset of candidates or items from a larger set given the preferences of voters. However, if candidates have sensitive attributes such as gender or ethnicity (when selecting a committee), or specified types such as political leaning (when selecting a subset of news items), an algorithm that chooses a subset by optimizing a multiwinner voting rule may be unbalanced in its selection – it may under- or over-represent a particular gender or political orientation in the examples above. We introduce an algorithmic framework for multiwinner voting problems when there is an additional requirement that the selected subset should be “fair” with respect to a given set of attributes. Our framework provides the flexibility to (1) specify fairness with respect to multiple, non-disjoint attributes (e.g., ethnicity and gender) and (2) specify a score function. We study the computational complexity of this constrained multiwinner voting problem for monotone and submodular score functions and present several approximation algorithms and matching hardness-of-approximation results for various attribute-group structures and types of score functions. We also present simulations that suggest that adding fairness constraints may not affect the scores significantly when compared to the unconstrained case.
Tasks
Published 2017-10-27
URL http://arxiv.org/abs/1710.10057v2
PDF http://arxiv.org/pdf/1710.10057v2.pdf
PWC https://paperswithcode.com/paper/multiwinner-voting-with-fairness-constraints
Repo https://github.com/huanglx12/Balanced-Committee-Election
Framework none
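
The constrained problem itself is easy to state in code: maximize a score over k-subsets subject to lower and upper bounds per attribute group. The brute-force sketch below is only for intuition; it is exponential in the number of candidates, which is exactly why the paper develops approximation algorithms for the general (submodular) case.

```python
from itertools import combinations

def fair_committee(candidates, attrs, score, k, lower, upper):
    """Brute-force sketch: pick the k-subset maximizing `score` subject to
    per-attribute lower/upper bounds on how many selected candidates carry
    each attribute. Attributes may overlap (non-disjoint groups)."""
    best, best_val = None, float('-inf')
    for S in combinations(candidates, k):
        counts = {}
        for c in S:
            for a in attrs[c]:
                counts[a] = counts.get(a, 0) + 1
        if all(lower.get(a, 0) <= counts.get(a, 0) <= upper.get(a, k)
               for a in set(lower) | set(upper)):
            val = score(S)
            if val > best_val:
                best, best_val = S, val
    return best

# toy usage: at least 1 and at most 2 of each gender in a committee of 3
attrs = {0: {'F'}, 1: {'F'}, 2: {'M'}, 3: {'M'}, 4: {'M'}}
approval = {0: 9, 1: 4, 2: 7, 3: 6, 4: 2}   # e.g. approval counts
pick = fair_committee(range(5), attrs, lambda S: sum(approval[c] for c in S),
                      k=3, lower={'F': 1, 'M': 1}, upper={'F': 2, 'M': 2})
print(pick)   # (0, 2, 3)
```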

Function Assistant: A Tool for NL Querying of APIs

Title Function Assistant: A Tool for NL Querying of APIs
Authors Kyle Richardson, Jonas Kuhn
Abstract In this paper, we describe Function Assistant, a lightweight Python-based toolkit for querying and exploring source code repositories using natural language. The toolkit is designed to help end-users of a target API quickly find information about functions through high-level natural language queries and descriptions. For a given text query and background API, the tool finds candidate functions by performing a translation from the text to known representations in the API using the semantic parsing approach of Richardson and Kuhn (2017). Translations are automatically learned from example text-code pairs in example APIs. The toolkit includes features for building translation pipelines and query engines for arbitrary source code projects. To explore this last feature, we perform new experiments on 27 well-known Python projects hosted on GitHub.
Tasks Semantic Parsing
Published 2017-06-01
URL http://arxiv.org/abs/1706.00468v2
PDF http://arxiv.org/pdf/1706.00468v2.pdf
PWC https://paperswithcode.com/paper/function-assistant-a-tool-for-nl-querying-of
Repo https://github.com/yakazimir/Code-Datasets
Framework none
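
The user-facing contract is: text query in, ranked candidate functions out. The sketch below deliberately replaces the paper's learned semantic-parsing translation with naive docstring word overlap, just to show that interface on a real module:

```python
# Not the paper's semantic-parsing translation model: a deliberately naive
# bag-of-words retrieval over docstrings, only to illustrate the
# query-to-function interface a tool like Function Assistant exposes.
import inspect, math

def find_candidates(query, module, top_k=3):
    """Rank a module's functions by word overlap with the query."""
    q = set(query.lower().split())
    scored = []
    for name, fn in inspect.getmembers(module, inspect.isroutine):
        doc = (inspect.getdoc(fn) or "").lower()
        scored.append((len(q & set(doc.split())), name))
    return [n for s, n in sorted(scored, reverse=True)[:top_k] if s > 0]

print(find_candidates("return the square root of a number", math))
# likely ['sqrt', ...]
```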

Determining Semantic Textual Similarity using Natural Deduction Proofs

Title Determining Semantic Textual Similarity using Natural Deduction Proofs
Authors Hitomi Yanaka, Koji Mineshima, Pascual Martinez-Gomez, Daisuke Bekki
Abstract Determining semantic textual similarity is a core research subject in natural language processing. Since vector-based models for sentence representation often use shallow information, capturing accurate semantics is difficult. By contrast, logical semantic representations capture deeper levels of sentence semantics, but their symbolic nature does not offer graded notions of textual similarity. We propose a method for determining semantic textual similarity by combining shallow features with features extracted from natural deduction proofs of bidirectional entailment relations between sentence pairs. For the natural deduction proofs, we use ccg2lambda, a higher-order automatic inference system, which converts Combinatory Categorial Grammar (CCG) derivation trees into semantic representations and conducts natural deduction proofs. Experiments show that our system was able to outperform other logic-based systems and that features derived from the proofs are effective for learning textual similarity.
Tasks Semantic Textual Similarity
Published 2017-07-27
URL http://arxiv.org/abs/1707.08713v1
PDF http://arxiv.org/pdf/1707.08713v1.pdf
PWC https://paperswithcode.com/paper/determining-semantic-textual-similarity-using
Repo https://github.com/mynlp/ccg2lambda
Framework none
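
The modeling recipe is feature concatenation: shallow surface features plus features read off the bidirectional natural-deduction proof attempts, fed to a regressor. The sketch below uses made-up proof statistics and toy data as stand-ins for ccg2lambda's output; only the overall shape is faithful.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def shallow_feats(s1, s2):
    """Simple surface features: word-overlap ratio and length difference."""
    w1, w2 = set(s1.split()), set(s2.split())
    return [len(w1 & w2) / max(len(w1 | w2), 1), abs(len(w1) - len(w2))]

def proof_feats(proof_fwd, proof_bwd):
    """Features from proof attempts in both entailment directions:
    success flags and proof lengths (stand-ins for ccg2lambda's output)."""
    return [float(proof_fwd['proved']), float(proof_bwd['proved']),
            proof_fwd['steps'], proof_bwd['steps']]

# toy data: (sentence pair, proof results, gold similarity in [0, 5])
pairs = [("a man is running", "a person runs",
          {'proved': True, 'steps': 12}, {'proved': True, 'steps': 14}, 4.6),
         ("a man is running", "a dog sleeps",
          {'proved': False, 'steps': 40}, {'proved': False, 'steps': 38}, 0.5)]
X = [shallow_feats(a, b) + proof_feats(f, g) for a, b, f, g, _ in pairs]
y = [gold for *_, gold in pairs]
model = RandomForestRegressor(n_estimators=50).fit(X, y)
print(model.predict([shallow_feats("a man runs", "a person is running") +
                     proof_feats({'proved': True, 'steps': 10},
                                 {'proved': True, 'steps': 11})]))
```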