Paper Group ANR 74
Marmara Turkish Coreference Corpus and Coreference Resolution Baseline
Title | Marmara Turkish Coreference Corpus and Coreference Resolution Baseline |
Authors | Peter Schüller, Kübra Cıngıllı, Ferit Tunçer, Barış Gün Sürmeli, Ayşegül Pekel, Ayşe Hande Karatay, Hacer Ezgi Karakaş |
Abstract | We describe the Marmara Turkish Coreference Corpus, which is an annotation of the whole METU-Sabanci Turkish Treebank with mentions and coreference chains. Collecting eight or more independent annotations for each document allowed for fully automatic adjudication. We provide a baseline system for Turkish mention detection and coreference resolution and evaluate it on the corpus. |
Tasks | Coreference Resolution |
Published | 2017-06-06 |
URL | http://arxiv.org/abs/1706.01863v2 |
PDF | http://arxiv.org/pdf/1706.01863v2.pdf |
PWC | https://paperswithcode.com/paper/marmara-turkish-coreference-corpus-and |
Repo | |
Framework | |
Pose-Invariant Face Alignment with a Single CNN
Title | Pose-Invariant Face Alignment with a Single CNN |
Authors | Amin Jourabloo, Mao Ye, Xiaoming Liu, Liu Ren |
Abstract | Face alignment has witnessed substantial progress in the last decade. One of the recent focuses has been aligning a dense 3D face shape to face images with large head poses. The dominant technology used is based on the cascade of regressors, e.g., CNN, which has shown promising results. Nonetheless, the cascade of CNNs suffers from several drawbacks, e.g., lack of end-to-end training, hand-crafted features and slow training speed. To address these issues, we propose a new layer, named visualization layer, that can be integrated into the CNN architecture and enables joint optimization with different loss functions. Extensive evaluation of the proposed method on multiple datasets demonstrates state-of-the-art accuracy, while reducing the training time by more than half compared to the typical cascade of CNNs. In addition, we compare multiple CNN architectures with the visualization layer to further demonstrate the advantage of its utilization. |
Tasks | Face Alignment, Facial Landmark Detection |
Published | 2017-07-19 |
URL | http://arxiv.org/abs/1707.06286v1 |
PDF | http://arxiv.org/pdf/1707.06286v1.pdf |
PWC | https://paperswithcode.com/paper/pose-invariant-face-alignment-with-a-single |
Repo | |
Framework | |
A Data Prism: Semi-Verified Learning in the Small-Alpha Regime
Title | A Data Prism: Semi-Verified Learning in the Small-Alpha Regime |
Authors | Michela Meister, Gregory Valiant |
Abstract | We consider a model of unreliable or crowdsourced data where there is an underlying set of $n$ binary variables, each evaluator contributes a (possibly unreliable or adversarial) estimate of the values of some subset of $r$ of the variables, and the learner is given the true value of a constant number of variables. We show that, provided an $\alpha$-fraction of the evaluators are “good” (either correct, or with independent noise rate $p < 1/2$), then the true values of a $(1-\epsilon)$ fraction of the $n$ underlying variables can be deduced as long as $\alpha > 1/(2-2p)^r$. This setting can be viewed as an instance of the semi-verified learning model introduced in [CSV17], which explores the tradeoff between the number of items evaluated by each worker and the fraction of good evaluators. Our results require the number of evaluators to be extremely large, $>n^r$, although our algorithm runs in linear time, $O_{r,\epsilon}(n)$, given query access to the large dataset of evaluations. This setting and results can also be viewed as examining a general class of semi-adversarial CSPs with a planted assignment. This parameter regime, where the fraction of reliable data is small, is relevant to a number of practical settings. For example, consider settings where one has a large dataset of customer preferences, with each customer specifying preferences for a small (constant) number of items, and the goal is to ascertain the preferences of a specific demographic of interest. Our results show that this large dataset (which lacks demographic information) can be leveraged together with the preferences of the demographic of interest for a constant number of randomly selected items, to recover an accurate estimate of the entire set of preferences. In this sense, our results can be viewed as a “data prism” allowing one to extract the behavior of specific cohorts from a large, mixed dataset. |
Tasks | |
Published | 2017-08-09 |
URL | http://arxiv.org/abs/1708.02740v1 |
PDF | http://arxiv.org/pdf/1708.02740v1.pdf |
PWC | https://paperswithcode.com/paper/a-data-prism-semi-verified-learning-in-the |
Repo | |
Framework | |
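The recovery condition quoted in the abstract is a closed-form threshold, and a few lines make its behavior concrete. This is an illustration of the stated bound only, not of the paper's algorithm:

```python
def alpha_threshold(p: float, r: int) -> float:
    """Minimum fraction of "good" evaluators needed for recovery,
    per the bound alpha > 1/(2 - 2p)^r quoted in the abstract
    (p = noise rate of good evaluators, r = items per evaluator)."""
    return 1.0 / (2.0 - 2.0 * p) ** r

# Noiseless good evaluators (p = 0): the threshold decays as 2^(-r),
# so a small reliable fraction suffices once r is moderately large.
print(alpha_threshold(0.0, 3))    # 0.125
# Noisier good evaluators demand a larger reliable fraction:
print(alpha_threshold(0.25, 3))   # ~0.296
```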
3D Reconstruction of Temples in the Special Region of Yogyakarta By Using Close-Range Photogrammetry
Title | 3D Reconstruction of Temples in the Special Region of Yogyakarta By Using Close-Range Photogrammetry |
Authors | Adityo Priyandito Utomo, Canggih Puspo Wibowo |
Abstract | Object reconstruction is one of the main problems in cultural heritage preservation, owing to a lack of documentation data. Thus in this research we present a method of 3D reconstruction using close-range photogrammetry. We collected 1319 photos from five temples in Yogyakarta. Using the A-KAZE algorithm, keypoints of each image were obtained. Then we employed LIOP to create feature descriptors from them. After performing feature matching, L1RA was utilized to create sparse point clouds. In order to generate the geometry shape, MVS was used. Finally, FSSR and Large Scale Texturing were employed to deal with the surface and texture of the object. The quality of the reconstructed 3D model was measured by comparing 3D images of the model with the original photos using SSIM. The results showed that in terms of quality, our method was on par with commercial methods such as PhotoModeler and PhotoScan. |
Tasks | 3D Reconstruction, Object Reconstruction |
Published | 2017-02-22 |
URL | http://arxiv.org/abs/1702.06722v1 |
PDF | http://arxiv.org/pdf/1702.06722v1.pdf |
PWC | https://paperswithcode.com/paper/3d-reconstruction-of-temples-in-the-special |
Repo | |
Framework | |
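The evaluation compares renders of the reconstructed model against the original photos with SSIM. A simplified, single-window version of the metric (global statistics rather than the usual sliding Gaussian window of production SSIM implementations) sketches what is being computed:

```python
import numpy as np

def ssim_global(x: np.ndarray, y: np.ndarray, data_range: float = 255.0) -> float:
    """Single-window SSIM over whole images: luminance, contrast and
    structure terms computed from global means, variances and the
    covariance. Real SSIM averages this over local windows; this
    sketch keeps only the formula."""
    c1, c2 = (0.01 * data_range) ** 2, (0.03 * data_range) ** 2
    x, y = x.astype(np.float64), y.astype(np.float64)
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
```

Identical images score 1.0, and any noise or distortion pulls the score below it, which is what makes the metric usable as a render-vs-photo quality score.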
Linear Convergence of a Frank-Wolfe Type Algorithm over Trace-Norm Balls
Title | Linear Convergence of a Frank-Wolfe Type Algorithm over Trace-Norm Balls |
Authors | Zeyuan Allen-Zhu, Elad Hazan, Wei Hu, Yuanzhi Li |
Abstract | We propose a rank-$k$ variant of the classical Frank-Wolfe algorithm to solve convex optimization over a trace-norm ball. Our algorithm replaces the top singular-vector computation ($1$-SVD) in Frank-Wolfe with a top-$k$ singular-vector computation ($k$-SVD), which can be done by repeatedly applying $1$-SVD $k$ times. Alternatively, our algorithm can be viewed as a rank-$k$ restricted version of projected gradient descent. We show that our algorithm has a linear convergence rate when the objective function is smooth and strongly convex, and the optimal solution has rank at most $k$. This improves the convergence rate and the total time complexity of the Frank-Wolfe method and its variants. |
Tasks | |
Published | 2017-08-07 |
URL | http://arxiv.org/abs/1708.02105v3 |
PDF | http://arxiv.org/pdf/1708.02105v3.pdf |
PWC | https://paperswithcode.com/paper/linear-convergence-of-a-frank-wolfe-type |
Repo | |
Framework | |
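The 1-SVD step the paper generalizes is the classical Frank-Wolfe linear minimization over the trace-norm ball; a schematic single step (our own sketch, not the paper's rank-$k$ algorithm) shows the structure:

```python
import numpy as np

def fw_step(X, grad, theta, eta):
    """One classical Frank-Wolfe step over the trace-norm ball
    {X : ||X||_* <= theta}. The linear minimizer over the ball is
    -theta * u1 v1^T from the top singular pair of the gradient --
    the 1-SVD that the paper's variant replaces with a k-SVD."""
    U, _, Vt = np.linalg.svd(grad, full_matrices=False)
    S = -theta * np.outer(U[:, 0], Vt[0, :])   # extreme point of the ball
    return (1.0 - eta) * X + eta * S           # stays feasible by convexity

# Minimizing f(X) = 0.5 * ||X - M||_F^2 over the ball, standard step sizes:
rng = np.random.default_rng(1)
M = rng.standard_normal((8, 6))
X, theta = np.zeros((8, 6)), 5.0
for t in range(50):
    X = fw_step(X, X - M, theta, 2.0 / (t + 2))
```

Every iterate is a convex combination of points inside the ball, so the trace-norm constraint holds without any projection; that projection-freeness is what makes Frank-Wolfe attractive here.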
New Techniques for Inferring L-Systems Using Genetic Algorithm
Title | New Techniques for Inferring L-Systems Using Genetic Algorithm |
Authors | Jason Bernard, Ian McQuillan |
Abstract | Lindenmayer systems (L-systems) are formal grammar systems that iteratively rewrite all symbols of a string in parallel. When visualized with a graphical interpretation, the images have self-similar shapes that appear frequently in nature, and they have been particularly successful as a concise, reusable technique for simulating plants. The L-system inference problem is to find an L-system that simulates a given plant. This is currently done mainly by experts, but the process is limited by the availability of experts, the complexity humans can manage, and time. This paper introduces the Plant Model Inference Tool (PMIT), which infers deterministic context-free L-systems from an initial sequence of strings generated by the system, using a genetic algorithm. PMIT is able to infer more complex systems than existing approaches: while existing approaches are limited to L-systems with a total of 20 combined symbols in the productions, PMIT can infer almost all L-systems tested where the total is 140 symbols. This was validated using a test bed of 28 previously developed L-system models, in addition to models created artificially by bootstrapping larger models. |
Tasks | |
Published | 2017-12-01 |
URL | http://arxiv.org/abs/1712.00180v2 |
PDF | http://arxiv.org/pdf/1712.00180v2.pdf |
PWC | https://paperswithcode.com/paper/new-techniques-for-inferring-l-systems-using |
Repo | |
Framework | |
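The parallel-rewriting operation that defines a deterministic context-free L-system fits in a few lines, which also makes clear what PMIT must invert (the example system is a textbook one, not a model from the paper):

```python
def rewrite(axiom: str, rules: dict[str, str], n: int) -> str:
    """Rewrite *all* symbols of the string in parallel for n
    iterations -- the defining operation of a deterministic
    context-free L-system. Symbols with no production map to
    themselves."""
    s = axiom
    for _ in range(n):
        s = "".join(rules.get(c, c) for c in s)
    return s

# Lindenmayer's classic algae system:
print(rewrite("A", {"A": "AB", "B": "A"}, 4))  # ABAABABA
```

The inference problem reverses this: given the sequence of strings, recover the productions, which is what PMIT searches for with a genetic algorithm.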
Learning to Compose Skills
Title | Learning to Compose Skills |
Authors | Himanshu Sahni, Saurabh Kumar, Farhan Tejani, Charles Isbell |
Abstract | We present a differentiable framework capable of learning a wide variety of compositions of simple policies that we call skills. By recursively composing skills with themselves, we can create hierarchies that display complex behavior. Skill networks are trained to generate skill-state embeddings that are provided as inputs to a trainable composition function, which in turn outputs a policy for the overall task. Our experiments on an environment consisting of multiple collect and evade tasks show that this architecture is able to quickly build complex skills from simpler ones. Furthermore, the learned composition function displays some transfer to unseen combinations of skills, allowing for zero-shot generalizations. |
Tasks | |
Published | 2017-11-30 |
URL | http://arxiv.org/abs/1711.11289v1 |
PDF | http://arxiv.org/pdf/1711.11289v1.pdf |
PWC | https://paperswithcode.com/paper/learning-to-compose-skills |
Repo | |
Framework | |
Question-Answering with Grammatically-Interpretable Representations
Title | Question-Answering with Grammatically-Interpretable Representations |
Authors | Hamid Palangi, Paul Smolensky, Xiaodong He, Li Deng |
Abstract | We introduce an architecture, the Tensor Product Recurrent Network (TPRN). In our application of TPRN, internal representations learned by end-to-end optimization in a deep neural network performing a textual question-answering (QA) task can be interpreted using basic concepts from linguistic theory. No performance penalty need be paid for this increased interpretability: the proposed model performs comparably to a state-of-the-art system on the SQuAD QA task. The internal representation which is interpreted is a Tensor Product Representation: for each input word, the model selects a symbol to encode the word, and a role in which to place the symbol, and binds the two together. The selection is via soft attention. The overall interpretation is built from interpretations of the symbols, as recruited by the trained model, and interpretations of the roles as used by the model. We find support for our initial hypothesis that symbols can be interpreted as lexical-semantic word meanings, while roles can be interpreted as approximations of grammatical roles (or categories) such as subject, wh-word, determiner, etc. Fine-grained analysis reveals specific correspondences between the learned roles and parts of speech as assigned by a standard tagger (Toutanova et al. 2003), and finds several discrepancies in the model’s favor. In this sense, the model learns significant aspects of grammar, after having been exposed solely to linguistically unannotated text, questions, and answers: no prior linguistic knowledge is given to the model. What is given is the means to build representations using symbols and roles, with an inductive bias favoring use of these in an approximately discrete manner. |
Tasks | Question Answering |
Published | 2017-05-23 |
URL | http://arxiv.org/abs/1705.08432v2 |
PDF | http://arxiv.org/pdf/1705.08432v2.pdf |
PWC | https://paperswithcode.com/paper/question-answering-with-grammatically |
Repo | |
Framework | |
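The symbol/role binding described in the abstract is an outer product under soft attention; a schematic version (names, shapes, and the hard-attention example are our assumptions, not the paper's architecture) makes the mechanism concrete:

```python
import numpy as np

def tpr_bind(symbols, roles, sym_attn, role_attn):
    """Tensor Product Representation for one word: soft attention
    selects a symbol vector and a role vector, and the two are
    bound with an outer product s (x) r, as the abstract sketches."""
    s = sym_attn @ symbols    # (n_symbols,) @ (n_symbols, d_s) -> (d_s,)
    r = role_attn @ roles     # (n_roles,)   @ (n_roles, d_r)   -> (d_r,)
    return np.outer(s, r)     # the word's filler-role binding, (d_s, d_r)

rng = np.random.default_rng(0)
symbols = rng.standard_normal((10, 4))   # 10 candidate symbols, dim 4
roles = rng.standard_normal((5, 3))      # 5 candidate roles, dim 3
one_hot_s = np.eye(10)[2]                # hard attention: pick symbol 2
one_hot_r = np.eye(5)[1]                 # ...and role 1
B = tpr_bind(symbols, roles, one_hot_s, one_hot_r)
print(B.shape)   # (4, 3)
```

With one-hot attention the binding reduces to a plain outer product of one symbol and one role; the model's trained soft attention interpolates between such discrete bindings.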
Learning K-way D-dimensional Discrete Code For Compact Embedding Representations
Title | Learning K-way D-dimensional Discrete Code For Compact Embedding Representations |
Authors | Ting Chen, Martin Renqiang Min, Yizhou Sun |
Abstract | Embedding methods such as word embedding have become pillars for many applications containing discrete structures. Conventional embedding methods directly associate each symbol with a continuous embedding vector, which is equivalent to applying a linear transformation based on a “one-hot” encoding of the discrete symbols. Despite its simplicity, such an approach yields a number of parameters that grows linearly with the vocabulary size and can lead to overfitting. In this work we propose a much more compact K-way D-dimensional discrete encoding scheme to replace the “one-hot” encoding. In “KD encoding”, each symbol is represented by a $D$-dimensional code, and each of its dimensions has a cardinality of $K$. The final symbol embedding vector can be generated by composing the code embedding vectors. To learn semantically meaningful codes, we derive a relaxed discrete optimization technique based on stochastic gradient descent. By adopting the new coding system, the efficiency of parameterization can be significantly improved (from linear to logarithmic), which also mitigates the overfitting problem. In our experiments with language modeling, the number of embedding parameters can be reduced by 97% while achieving similar or better performance. |
Tasks | Language Modelling |
Published | 2017-11-08 |
URL | http://arxiv.org/abs/1711.03067v3 |
PDF | http://arxiv.org/pdf/1711.03067v3.pdf |
PWC | https://paperswithcode.com/paper/learning-k-way-d-dimensional-discrete-code |
Repo | |
Framework | |
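The parameter saving of KD encoding is easy to see in a minimal sketch. Summation is one simple choice of composition; the paper learns both the codes and the composition function end to end, and all sizes below are illustrative assumptions:

```python
import numpy as np

K, D, d = 16, 4, 8                    # K-way, D-dimensional codes, embed dim d
rng = np.random.default_rng(0)
code_books = rng.standard_normal((D, K, d))   # D*K*d = 512 parameters total

def kd_embed(code):
    """Compose a symbol's embedding from its discrete code: dimension j
    with value c_j looks up row c_j of code book j, and the D vectors
    are summed."""
    return sum(code_books[j][c] for j, c in enumerate(code))

vec = kd_embed((3, 0, 12, 7))         # one symbol's 4-dimensional code
print(vec.shape)                      # (8,)
# K^D = 16^4 = 65,536 distinct codes: enough for a 50k vocabulary with
# 512 embedding parameters in place of the one-hot table's 50,000 * 8.
```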
Personalized and Private Peer-to-Peer Machine Learning
Title | Personalized and Private Peer-to-Peer Machine Learning |
Authors | Aurélien Bellet, Rachid Guerraoui, Mahsa Taziki, Marc Tommasi |
Abstract | The rise of connected personal devices together with privacy concerns call for machine learning algorithms capable of leveraging the data of a large number of agents to learn personalized models under strong privacy requirements. In this paper, we introduce an efficient algorithm to address the above problem in a fully decentralized (peer-to-peer) and asynchronous fashion, with provable convergence rate. We show how to make the algorithm differentially private to protect against the disclosure of information about the personal datasets, and formally analyze the trade-off between utility and privacy. Our experiments show that our approach dramatically outperforms previous work in the non-private case, and that under privacy constraints, we can significantly improve over models learned in isolation. |
Tasks | |
Published | 2017-05-23 |
URL | http://arxiv.org/abs/1705.08435v2 |
PDF | http://arxiv.org/pdf/1705.08435v2.pdf |
PWC | https://paperswithcode.com/paper/personalized-and-private-peer-to-peer-machine |
Repo | |
Framework | |
Measuring Offensive Speech in Online Political Discourse
Title | Measuring Offensive Speech in Online Political Discourse |
Authors | Rishab Nithyanand, Brian Schaffner, Phillipa Gill |
Abstract | The Internet and online forums such as Reddit have become an increasingly popular medium for citizens to engage in political conversations. However, the online disinhibition effect resulting from the ability to use pseudonymous identities may manifest in the form of offensive speech, consequently making political discussions more aggressive and polarizing than they already are. Such environments may result in harassment and self-censorship from their targets. In this paper, we present preliminary results from a large-scale temporal measurement aimed at quantifying offensiveness in online political discussions. To enable our measurements, we develop and evaluate an offensive speech classifier. We then use this classifier to quantify and compare offensiveness in the political and general contexts. We perform our study using a database of over 168M Reddit comments made by over 7M pseudonyms between January 2015 and January 2017 – a period covering several divisive political events, including the 2016 US presidential elections. |
Tasks | |
Published | 2017-06-06 |
URL | http://arxiv.org/abs/1706.01875v2 |
PDF | http://arxiv.org/pdf/1706.01875v2.pdf |
PWC | https://paperswithcode.com/paper/measuring-offensive-speech-in-online |
Repo | |
Framework | |
BoxCars: Improving Fine-Grained Recognition of Vehicles using 3-D Bounding Boxes in Traffic Surveillance
Title | BoxCars: Improving Fine-Grained Recognition of Vehicles using 3-D Bounding Boxes in Traffic Surveillance |
Authors | Jakub Sochor, Jakub Špaňhel, Adam Herout |
Abstract | In this paper, we focus on fine-grained recognition of vehicles, mainly in traffic surveillance applications. We propose an approach that is orthogonal to recent advancements in fine-grained recognition (automatic part discovery and bilinear pooling). In addition, in contrast to other methods focused on fine-grained recognition of vehicles, we do not limit ourselves to a frontal/rear viewpoint, but allow the vehicles to be seen from any viewpoint. Our approach is based on 3-D bounding boxes built around the vehicles. The bounding box can be automatically constructed from traffic surveillance data. For scenarios where precise construction is not possible, we propose a method for estimating the 3-D bounding box. The 3-D bounding box is used to normalize the image viewpoint by “unpacking” the image into a plane. We also propose to randomly alter the color of the image and add a rectangle with random noise to a random position in the image during the training of convolutional neural networks (CNNs). We have collected a large fine-grained vehicle data set, BoxCars116k, with 116k images of vehicles from various viewpoints taken by numerous surveillance cameras. We performed a number of experiments, which show that our proposed method significantly improves CNN classification accuracy (the accuracy is increased by up to 12 percentage points and the error is reduced by up to 50% compared with CNNs without the proposed modifications). We also show that our method outperforms the state-of-the-art methods for fine-grained recognition. |
Tasks | |
Published | 2017-03-02 |
URL | http://arxiv.org/abs/1703.00686v3 |
PDF | http://arxiv.org/pdf/1703.00686v3.pdf |
PWC | https://paperswithcode.com/paper/boxcars-improving-fine-grained-recognition-of |
Repo | |
Framework | |
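The two training-time augmentations the abstract describes are simple image operations; a sketch (all jitter ranges and rectangle sizes here are our assumptions, not the paper's settings) shows the idea:

```python
import numpy as np

def augment(img: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Randomly alter the image color (per-channel scaling) and paste
    a random-noise rectangle at a random position, as the abstract
    describes for CNN training."""
    out = img.astype(np.float32)
    out *= rng.uniform(0.6, 1.4, size=(1, 1, 3))           # per-channel color jitter
    h, w = img.shape[:2]
    rh = int(rng.integers(h // 8, h // 2))                 # rectangle size
    rw = int(rng.integers(w // 8, w // 2))
    y, x = int(rng.integers(0, h - rh)), int(rng.integers(0, w - rw))
    out[y:y + rh, x:x + rw] = rng.integers(0, 256, size=(rh, rw, 3))
    return np.clip(out, 0, 255).astype(np.uint8)

img = np.full((64, 64, 3), 128, dtype=np.uint8)
aug = augment(img, np.random.default_rng(0))
print(aug.shape, aug.dtype)   # (64, 64, 3) uint8
```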
Statistically Optimal and Computationally Efficient Low Rank Tensor Completion from Noisy Entries
Title | Statistically Optimal and Computationally Efficient Low Rank Tensor Completion from Noisy Entries |
Authors | Dong Xia, Ming Yuan, Cun-Hui Zhang |
Abstract | In this article, we develop methods for estimating a low rank tensor from noisy observations on a subset of its entries to achieve both statistical and computational efficiencies. There has been a lot of recent interest in this problem of noisy tensor completion. Much of the attention has been focused on the fundamental computational challenges often associated with problems involving higher order tensors, yet very little is known about their statistical performance. To fill this void, we characterize the fundamental statistical limits of noisy tensor completion by establishing minimax optimal rates of convergence for estimating a $k$th order low rank tensor under the general $\ell_p$ ($1\le p\le 2$) norm, which suggest significant room for improvement over the existing approaches. Furthermore, we propose a polynomial-time computable estimating procedure based upon power iteration and a second-order spectral initialization that achieves the optimal rates of convergence. Our method is fairly easy to implement, and numerical experiments are presented to further demonstrate the practical merits of our estimator. |
Tasks | |
Published | 2017-11-14 |
URL | http://arxiv.org/abs/1711.04934v2 |
PDF | http://arxiv.org/pdf/1711.04934v2.pdf |
PWC | https://paperswithcode.com/paper/statistically-optimal-and-computationally |
Repo | |
Framework | |
End-to-End Optimized Speech Coding with Deep Neural Networks
Title | End-to-End Optimized Speech Coding with Deep Neural Networks |
Authors | Srihari Kankanahalli |
Abstract | Modern compression algorithms are often the result of laborious domain-specific research; industry standards such as MP3, JPEG, and AMR-WB took years to develop and were largely hand-designed. We present a deep neural network model which optimizes all the steps of a wideband speech coding pipeline (compression, quantization, entropy coding, and decompression) end-to-end directly from raw speech data – no manual feature engineering necessary, and it trains in hours. In testing, our DNN-based coder performs on par with the AMR-WB standard at a variety of bitrates (~9 kbps up to ~24 kbps). It also runs in real time on a 3.8 GHz Intel CPU. |
Tasks | Feature Engineering, Quantization |
Published | 2017-10-25 |
URL | https://arxiv.org/abs/1710.09064v2 |
PDF | https://arxiv.org/pdf/1710.09064v2.pdf |
PWC | https://paperswithcode.com/paper/end-to-end-optimized-speech-coding-with-deep |
Repo | |
Framework | |
Spectral Norm Regularization for Improving the Generalizability of Deep Learning
Title | Spectral Norm Regularization for Improving the Generalizability of Deep Learning |
Authors | Yuichi Yoshida, Takeru Miyato |
Abstract | We investigate the generalizability of deep learning based on its sensitivity to input perturbation. We hypothesize that high sensitivity to perturbations of the data degrades performance on that data. To reduce the sensitivity to perturbation, we propose a simple and effective regularization method, referred to as spectral norm regularization, which penalizes the high spectral norm of weight matrices in neural networks. We provide supportive evidence for this hypothesis by experimentally confirming that models trained using spectral norm regularization exhibit better generalizability than other baseline methods. |
Tasks | |
Published | 2017-05-31 |
URL | http://arxiv.org/abs/1705.10941v1 |
PDF | http://arxiv.org/pdf/1705.10941v1.pdf |
PWC | https://paperswithcode.com/paper/spectral-norm-regularization-for-improving |
Repo | |
Framework | |
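The quantity being penalized is the largest singular value of each weight matrix, which is cheap to estimate with power iteration. A sketch of the estimator only (the regularizer itself adds such terms to the training loss, which this snippet does not show):

```python
import numpy as np

def spectral_norm(W: np.ndarray, n_iter: int = 100) -> float:
    """Estimate the largest singular value of a weight matrix by
    power iteration on W^T W, alternating u = Wv and v = W^T u
    with normalization; the converged u^T W v is sigma_max(W)."""
    v = np.random.default_rng(0).standard_normal(W.shape[1])
    for _ in range(n_iter):
        u = W @ v
        u /= np.linalg.norm(u)
        v = W.T @ u
        v /= np.linalg.norm(v)
    return float(u @ W @ v)

W = np.diag([3.0, 1.0, 0.5])
print(spectral_norm(W))   # ~3.0
```

Power iteration is what makes the penalty practical during training: a single matrix-vector pair per step, warm-started across updates, rather than a full SVD.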