Paper Group AWR 112
Detailed, accurate, human shape estimation from clothed 3D scan sequences
Title | Detailed, accurate, human shape estimation from clothed 3D scan sequences |
Authors | Chao Zhang, Sergi Pujades, Michael Black, Gerard Pons-Moll |
Abstract | We address the problem of estimating human pose and body shape from 3D scans over time. Reliable estimation of 3D body shape is necessary for many applications including virtual try-on, health monitoring, and avatar creation for virtual reality. Scanning bodies in minimal clothing, however, presents a practical barrier to these applications. We address this problem by estimating body shape under clothing from a sequence of 3D scans. Previous methods that have exploited body models produce smooth shapes lacking personalized details. We contribute a new approach to recover a personalized shape of the person. The estimated shape deviates from a parametric model to fit the 3D scans. We demonstrate the method using high quality 4D data as well as sequences of visual hulls extracted from multi-view images. We also make available BUFF, a new 4D dataset that enables quantitative evaluation (http://buff.is.tue.mpg.de). Our method outperforms the state of the art in both pose estimation and shape estimation, qualitatively and quantitatively. |
Tasks | Pose Estimation |
Published | 2017-03-13 |
URL | http://arxiv.org/abs/1703.04454v2, http://arxiv.org/pdf/1703.04454v2.pdf |
PWC | https://paperswithcode.com/paper/detailed-accurate-human-shape-estimation-from |
Repo | https://github.com/maria-korosteleva/Body-Shape-Estimation |
Framework | none |
Wing Loss for Robust Facial Landmark Localisation with Convolutional Neural Networks
Title | Wing Loss for Robust Facial Landmark Localisation with Convolutional Neural Networks |
Authors | Zhen-Hua Feng, Josef Kittler, Muhammad Awais, Patrik Huber, Xiao-Jun Wu |
Abstract | We present a new loss function, namely Wing loss, for robust facial landmark localisation with Convolutional Neural Networks (CNNs). We first compare and analyse different loss functions including L2, L1 and smooth L1. The analysis of these loss functions suggests that, for the training of a CNN-based localisation model, more attention should be paid to small and medium range errors. To this end, we design a piece-wise loss function. The new loss amplifies the impact of errors from the interval (-w, w) by switching from L1 loss to a modified logarithm function. To address the problem of under-representation of samples with large out-of-plane head rotations in the training set, we propose a simple but effective boosting strategy, referred to as pose-based data balancing. In particular, we deal with the data imbalance problem by duplicating the minority training samples and perturbing them by injecting random image rotation, bounding box translation and other data augmentation approaches. Last, the proposed approach is extended to create a two-stage framework for robust facial landmark localisation. The experimental results obtained on AFLW and 300W demonstrate the merits of the Wing loss function, and prove the superiority of the proposed method over the state-of-the-art approaches. |
Tasks | Data Augmentation, Face Alignment |
Published | 2017-11-17 |
URL | http://arxiv.org/abs/1711.06753v5, http://arxiv.org/pdf/1711.06753v5.pdf |
PWC | https://paperswithcode.com/paper/wing-loss-for-robust-facial-landmark |
Repo | https://github.com/xialuxi/arcface-caffe |
Framework | none |
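The piece-wise loss described in the Wing loss abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract only says that the loss switches between L1 and a modified logarithm on the interval (-w, w), so the exact logarithmic form, the offset constant `c`, and the default values `w=10.0` and `eps=2.0` are assumptions based on the commonly cited form of the loss.

```python
import numpy as np

def wing(x, w=10.0, eps=2.0):
    """Element-wise Wing loss of a landmark error x (assumed form).

    Inside (-w, w) the loss is logarithmic, which amplifies small and
    medium errors; outside it behaves like L1 shifted by a constant.
    """
    abs_x = np.abs(np.asarray(x, dtype=float))
    # Offset chosen so the two pieces meet at |x| = w.
    c = w - w * np.log(1.0 + w / eps)
    return np.where(abs_x < w, w * np.log(1.0 + abs_x / eps), abs_x - c)

# Example: per-coordinate errors between predicted and true landmarks.
errors = np.array([-0.5, 0.0, 3.0, 25.0])
losses = wing(errors)
```

Near zero the gradient of the logarithmic piece is roughly `w / eps`, which is larger than the unit gradient of L1, and that is how the small-error range gets more weight under these assumed settings.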
Learning to Rank Question-Answer Pairs using Hierarchical Recurrent Encoder with Latent Topic Clustering
Title | Learning to Rank Question-Answer Pairs using Hierarchical Recurrent Encoder with Latent Topic Clustering |
Authors | Seunghyun Yoon, Joongbo Shin, Kyomin Jung |
Abstract | In this paper, we propose a novel end-to-end neural architecture for ranking candidate answers that adapts a hierarchical recurrent neural network and a latent topic clustering module. With our proposed model, a text is encoded to a vector representation from a word-level to a chunk-level to effectively capture the entire meaning. In particular, by adapting the hierarchical structure, our model shows very small performance degradations in longer text comprehension while other state-of-the-art recurrent neural network models suffer from it. Additionally, the latent topic clustering module extracts semantic information from target samples. This clustering module is useful for any text-related task by allowing each data sample to find its nearest topic cluster, thus helping the neural network model analyze the entire data. We evaluate our models on the Ubuntu Dialogue Corpus and a consumer electronic domain question answering dataset, which is related to Samsung products. The proposed model shows state-of-the-art results for ranking question-answer pairs. |
Tasks | Answer Selection, Learning-To-Rank, Question Answering |
Published | 2017-10-10 |
URL | http://arxiv.org/abs/1710.03430v3, http://arxiv.org/pdf/1710.03430v3.pdf |
PWC | https://paperswithcode.com/paper/learning-to-rank-question-answer-pairs-using |
Repo | https://github.com/david-yoon/QA_HRDE_LTC |
Framework | tf |
On the Role of Text Preprocessing in Neural Network Architectures: An Evaluation Study on Text Categorization and Sentiment Analysis
Title | On the Role of Text Preprocessing in Neural Network Architectures: An Evaluation Study on Text Categorization and Sentiment Analysis |
Authors | Jose Camacho-Collados, Mohammad Taher Pilehvar |
Abstract | Text preprocessing is often the first step in the pipeline of a Natural Language Processing (NLP) system, with potential impact in its final performance. Despite its importance, text preprocessing has not received much attention in the deep learning literature. In this paper we investigate the impact of simple text preprocessing decisions (particularly tokenizing, lemmatizing, lowercasing and multiword grouping) on the performance of a standard neural text classifier. We perform an extensive evaluation on standard benchmarks from text categorization and sentiment analysis. While our experiments show that a simple tokenization of input text is generally adequate, they also highlight significant degrees of variability across preprocessing techniques. This reveals the importance of paying attention to this usually-overlooked step in the pipeline, particularly when comparing different models. Finally, our evaluation provides insights into the best preprocessing practices for training word embeddings. |
Tasks | Sentiment Analysis, Text Categorization, Text Classification, Tokenization, Word Embeddings |
Published | 2017-07-06 |
URL | http://arxiv.org/abs/1707.01780v3, http://arxiv.org/pdf/1707.01780v3.pdf |
PWC | https://paperswithcode.com/paper/on-the-role-of-text-preprocessing-in-neural |
Repo | https://github.com/changji2069/Scope-Project |
Framework | none |
Improving Discourse Relation Projection to Build Discourse Annotated Corpora
Title | Improving Discourse Relation Projection to Build Discourse Annotated Corpora |
Authors | Majid Laali, Leila Kosseim |
Abstract | The naive approach to annotation projection is not effective to project discourse annotations from one language to another because implicit discourse relations are often changed to explicit ones and vice-versa in the translation. In this paper, we propose a novel approach based on the intersection between statistical word-alignment models to identify unsupported discourse annotations. This approach identified 65% of the unsupported annotations in the English-French parallel sentences from Europarl. By filtering out these unsupported annotations, we induced the first PDTB-style discourse annotated corpus for French from Europarl. We then used this corpus to train a classifier to identify the discourse-usage of French discourse connectives and show a 15% improvement of F1-score compared to the classifier trained on the non-filtered annotations. |
Tasks | Word Alignment |
Published | 2017-07-20 |
URL | http://arxiv.org/abs/1707.06357v1, http://arxiv.org/pdf/1707.06357v1.pdf |
PWC | https://paperswithcode.com/paper/improving-discourse-relation-projection-to |
Repo | https://github.com/mjlaali/Europarl-ConcoDisco |
Framework | none |
Facial Key Points Detection using Deep Convolutional Neural Network - NaimishNet
Title | Facial Key Points Detection using Deep Convolutional Neural Network - NaimishNet |
Authors | Naimish Agarwal, Artus Krohn-Grimberghe, Ranjana Vyas |
Abstract | Facial Key Points (FKPs) Detection is an important and challenging problem in the fields of computer vision and machine learning. It involves predicting the co-ordinates of the FKPs, e.g. nose tip, center of eyes, etc., for a given face. In this paper, we propose a LeNet-adapted Deep CNN model - NaimishNet, to operate on facial key points data and compare our model’s performance against existing state-of-the-art approaches. |
Tasks | |
Published | 2017-10-03 |
URL | http://arxiv.org/abs/1710.00977v1, http://arxiv.org/pdf/1710.00977v1.pdf |
PWC | https://paperswithcode.com/paper/facial-key-points-detection-using-deep |
Repo | https://github.com/anitagold/30-days-of-Udacity |
Framework | pytorch |
Evolution Strategies as a Scalable Alternative to Reinforcement Learning
Title | Evolution Strategies as a Scalable Alternative to Reinforcement Learning |
Authors | Tim Salimans, Jonathan Ho, Xi Chen, Szymon Sidor, Ilya Sutskever |
Abstract | We explore the use of Evolution Strategies (ES), a class of black box optimization algorithms, as an alternative to popular MDP-based RL techniques such as Q-learning and Policy Gradients. Experiments on MuJoCo and Atari show that ES is a viable solution strategy that scales extremely well with the number of CPUs available: By using a novel communication strategy based on common random numbers, our ES implementation only needs to communicate scalars, making it possible to scale to over a thousand parallel workers. This allows us to solve 3D humanoid walking in 10 minutes and obtain competitive results on most Atari games after one hour of training. In addition, we highlight several advantages of ES as a black box optimization technique: it is invariant to action frequency and delayed rewards, tolerant of extremely long horizons, and does not need temporal discounting or value function approximation. |
Tasks | Atari Games, Q-Learning |
Published | 2017-03-10 |
URL | http://arxiv.org/abs/1703.03864v2, http://arxiv.org/pdf/1703.03864v2.pdf |
PWC | https://paperswithcode.com/paper/evolution-strategies-as-a-scalable |
Repo | https://github.com/aspk/space_battle |
Framework | none |
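The black-box optimization loop that the Evolution Strategies abstract refers to can be sketched in a single process as follows. This is a simplified illustration under stated assumptions, not the authors' distributed implementation: the function name `evolution_strategy`, the hyperparameter defaults, and the reward standardization step are choices made here for the example. In a parallel setting, each worker could regenerate the perturbations from a shared random seed, so that only scalar rewards need to be exchanged, which is the idea the abstract describes as common random numbers.

```python
import numpy as np

def evolution_strategy(f, theta0, sigma=0.1, lr=0.01,
                       population=100, iterations=200, seed=0):
    """Maximize a black-box function f with Gaussian perturbations."""
    rng = np.random.default_rng(seed)
    theta = np.array(theta0, dtype=float)
    for _ in range(iterations):
        # One Gaussian perturbation per population member.
        noise = rng.standard_normal((population, theta.size))
        rewards = np.array([f(theta + sigma * eps) for eps in noise])
        # Standardize rewards so the step size does not depend on reward scale
        # (an assumption of this sketch).
        advantages = (rewards - rewards.mean()) / (rewards.std() + 1e-8)
        # Score-function estimate of the gradient of the expected reward.
        gradient = noise.T @ advantages / (population * sigma)
        theta += lr * gradient
    return theta

# Toy usage: recover a hypothetical target vector from rewards alone.
target = np.array([1.0, -2.0])
solution = evolution_strategy(lambda th: -np.sum((th - target) ** 2), np.zeros(2))
```

Only function evaluations are used, so no backpropagation through `f` is needed; this is what makes the approach indifferent to how the reward is produced.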
Evolving Deep Convolutional Neural Networks for Image Classification
Title | Evolving Deep Convolutional Neural Networks for Image Classification |
Authors | Yanan Sun, Bing Xue, Mengjie Zhang, Gary G. Yen |
Abstract | Evolutionary computation methods have been successfully applied to neural networks for two decades, but they do not scale well to modern deep neural networks because of the complicated architectures and large quantities of connection weights. In this paper, we propose a new method using genetic algorithms for evolving the architectures and connection weight initialization values of a deep convolutional neural network to address image classification problems. In the proposed algorithm, an efficient variable-length gene encoding strategy is designed to represent the different building blocks and the unpredictable optimal depth in convolutional neural networks. In addition, a new representation scheme is developed for effectively initializing connection weights of deep convolutional neural networks, which is expected to keep networks from getting stuck in local minima, typically a major issue in backward gradient-based optimization. Furthermore, a novel fitness evaluation method is proposed to speed up the heuristic search with substantially less computational resource. The proposed algorithm is examined and compared with 22 existing algorithms, including the state-of-the-art methods, on nine widely used image classification tasks. The experimental results demonstrate the remarkable superiority of the proposed algorithm over the state-of-the-art algorithms in terms of classification error rate and the number of parameters (weights). |
Tasks | Image Classification |
Published | 2017-10-30 |
URL | http://arxiv.org/abs/1710.10741v3, http://arxiv.org/pdf/1710.10741v3.pdf |
PWC | https://paperswithcode.com/paper/evolving-deep-convolutional-neural-networks |
Repo | https://github.com/MagnusCaligo/EvoCNN |
Framework | none |
Target contrastive pessimistic risk for robust domain adaptation
Title | Target contrastive pessimistic risk for robust domain adaptation |
Authors | Wouter M. Kouw, Marco Loog |
Abstract | In domain adaptation, classifiers with information from a source domain adapt to generalize to a target domain. However, an adaptive classifier can perform worse than a non-adaptive classifier due to invalid assumptions, increased sensitivity to estimation errors or model misspecification. Our goal is to develop a domain-adaptive classifier that is robust in the sense that it does not rely on restrictive assumptions on how the source and target domains relate to each other and that it does not perform worse than the non-adaptive classifier. We formulate a conservative parameter estimator that only deviates from the source classifier when a lower risk is guaranteed for all possible labellings of the given target samples. We derive the classical least-squares and discriminant analysis cases and show that these perform on par with state-of-the-art domain adaptive classifiers in sample selection bias settings, while outperforming them in more general domain adaptation settings. |
Tasks | Domain Adaptation |
Published | 2017-06-25 |
URL | http://arxiv.org/abs/1706.08082v1, http://arxiv.org/pdf/1706.08082v1.pdf |
PWC | https://paperswithcode.com/paper/target-contrastive-pessimistic-risk-for |
Repo | https://github.com/wmkouw/libTLDA |
Framework | none |
Structured Attention Networks
Title | Structured Attention Networks |
Authors | Yoon Kim, Carl Denton, Luong Hoang, Alexander M. Rush |
Abstract | Attention networks have proven to be an effective approach for embedding categorical inference within a deep neural network. However, for many tasks we may want to model richer structural dependencies without abandoning end-to-end training. In this work, we experiment with incorporating richer structural distributions, encoded using graphical models, within deep networks. We show that these structured attention networks are simple extensions of the basic attention procedure, and that they allow for extending attention beyond the standard soft-selection approach, such as attending to partial segmentations or to subtrees. We experiment with two different classes of structured attention networks: a linear-chain conditional random field and a graph-based parsing model, and describe how these models can be practically implemented as neural network layers. Experiments show that this approach is effective for incorporating structural biases, and structured attention networks outperform baseline attention models on a variety of synthetic and real tasks: tree transduction, neural machine translation, question answering, and natural language inference. We further find that models trained in this way learn interesting unsupervised hidden representations that generalize simple attention. |
Tasks | Machine Translation, Natural Language Inference, Question Answering |
Published | 2017-02-03 |
URL | http://arxiv.org/abs/1702.00887v3, http://arxiv.org/pdf/1702.00887v3.pdf |
PWC | https://paperswithcode.com/paper/structured-attention-networks |
Repo | https://github.com/harvardnlp/struct-attn |
Framework | none |
Not All Pixels Are Equal: Difficulty-aware Semantic Segmentation via Deep Layer Cascade
Title | Not All Pixels Are Equal: Difficulty-aware Semantic Segmentation via Deep Layer Cascade |
Authors | Xiaoxiao Li, Ziwei Liu, Ping Luo, Chen Change Loy, Xiaoou Tang |
Abstract | We propose a novel deep layer cascade (LC) method to improve the accuracy and speed of semantic segmentation. Unlike the conventional model cascade (MC) that is composed of multiple independent models, LC treats a single deep model as a cascade of several sub-models. Earlier sub-models are trained to handle easy and confident regions, and they progressively feed-forward harder regions to the next sub-model for processing. Convolutions are only calculated on these regions to reduce computations. The proposed method possesses several advantages. First, LC classifies most of the easy regions in the shallow stage and makes the deeper stage focus on a few hard regions. Such adaptive and ‘difficulty-aware’ learning improves segmentation performance. Second, LC accelerates both training and testing of the deep network thanks to early decisions in the shallow stage. Third, in comparison to MC, LC is an end-to-end trainable framework, allowing joint learning of all sub-models. We evaluate our method on PASCAL VOC and Cityscapes datasets, achieving state-of-the-art performance and fast speed. |
Tasks | Semantic Segmentation |
Published | 2017-04-05 |
URL | http://arxiv.org/abs/1704.01344v1, http://arxiv.org/pdf/1704.01344v1.pdf |
PWC | https://paperswithcode.com/paper/not-all-pixels-are-equal-difficulty-aware |
Repo | https://github.com/liuziwei7/region-conv |
Framework | none |
Max-value Entropy Search for Efficient Bayesian Optimization
Title | Max-value Entropy Search for Efficient Bayesian Optimization |
Authors | Zi Wang, Stefanie Jegelka |
Abstract | Entropy Search (ES) and Predictive Entropy Search (PES) are popular and empirically successful Bayesian Optimization techniques. Both rely on a compelling information-theoretic motivation, and maximize the information gained about the $\arg\max$ of the unknown function; yet, both are plagued by the expensive computation for estimating entropies. We propose a new criterion, Max-value Entropy Search (MES), that instead uses the information about the maximum function value. We show relations of MES to other Bayesian optimization methods, and establish a regret bound. We observe that MES maintains or improves the good empirical performance of ES/PES, while tremendously lightening the computational burden. In particular, MES is much more robust to the number of samples used for computing the entropy, and hence more efficient for higher dimensional problems. |
Tasks | |
Published | 2017-03-06 |
URL | http://arxiv.org/abs/1703.01968v3, http://arxiv.org/pdf/1703.01968v3.pdf |
PWC | https://paperswithcode.com/paper/max-value-entropy-search-for-efficient |
Repo | https://github.com/zi-w/Max-value-Entropy-Search |
Framework | none |
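To make the idea of using information about the maximum function value concrete, the sketch below evaluates an MES-style acquisition value at a single candidate point. The abstract does not state the formula, so the closed form used here (an average over sampled maximum values of a truncated-Gaussian entropy reduction) is an assumption based on the commonly cited expression for a Gaussian posterior. The function name `mes_acquisition` and its arguments are hypothetical, and the sampling of the maximum values themselves is left out.

```python
import math

def mes_acquisition(mu, sigma, max_samples):
    """Assumed MES acquisition at one point.

    mu, sigma   : posterior mean and standard deviation at the candidate point
    max_samples : sampled values of the global maximum y* (how they are
                  obtained is outside the scope of this sketch)
    """
    total = 0.0
    for y_star in max_samples:
        gamma = (y_star - mu) / sigma
        pdf = math.exp(-0.5 * gamma * gamma) / math.sqrt(2.0 * math.pi)
        cdf = 0.5 * (1.0 + math.erf(gamma / math.sqrt(2.0)))
        # Note: cdf can underflow for very negative gamma; not handled here.
        total += gamma * pdf / (2.0 * cdf) - math.log(cdf)
    return total / len(max_samples)

# A more uncertain point scores higher when its mean is below the sampled maxima.
confident = mes_acquisition(mu=0.0, sigma=1.0, max_samples=[1.0, 1.2])
uncertain = mes_acquisition(mu=0.0, sigma=2.0, max_samples=[1.0, 1.2])
```

Because the criterion only involves the one-dimensional maximum value rather than the location of the maximizer, each evaluation reduces to a few scalar Gaussian computations per sample.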
Completion of High Order Tensor Data with Missing Entries via Tensor-train Decomposition
Title | Completion of High Order Tensor Data with Missing Entries via Tensor-train Decomposition |
Authors | Longhao Yuan, Qibin Zhao, Jianting Cao |
Abstract | In this paper, we address the completion problem of high order tensor data with missing entries. The existing tensor factorization and completion methods suffer from the curse of dimensionality when the order of the tensor $N \gg 3$. To overcome this problem, we propose an efficient algorithm called TT-WOPT (Tensor-train Weighted OPTimization) to find the latent core tensors of tensor data and recover the missing entries. Tensor-train decomposition, which has powerful representation ability with linear scalability to tensor order, is employed in our algorithm. The experimental results on synthetic data and natural image completion demonstrate that our method significantly outperforms the other related methods. Especially when the missing rate of data is very high, e.g., 85% to 99%, our algorithm can achieve much better performance than other state-of-the-art algorithms. |
Tasks | |
Published | 2017-09-08 |
URL | http://arxiv.org/abs/1709.02641v2, http://arxiv.org/pdf/1709.02641v2.pdf |
PWC | https://paperswithcode.com/paper/completion-of-high-order-tensor-data-with |
Repo | https://github.com/yuanlonghao/T3C_tensor_completion |
Framework | none |
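The two ingredients named in the TT-WOPT abstract, a tensor-train representation and a weighted objective over observed entries, can be illustrated with the sketch below. It is not the authors' algorithm: the helper names `tt_to_tensor` and `weighted_loss`, the core shapes, and the squared-error form of the objective are assumptions made for the example, and the optimization over the cores is omitted.

```python
import numpy as np

def tt_to_tensor(cores):
    """Contract tensor-train cores of shape (r_prev, n_k, r_next) into a full tensor.

    The first and last ranks are assumed to be 1.
    """
    result = cores[0]
    for core in cores[1:]:
        # Contract the trailing rank index with the next core's leading rank index.
        result = np.tensordot(result, core, axes=([-1], [0]))
    return result.squeeze(axis=(0, -1))

def weighted_loss(observed, cores, mask):
    """Squared error restricted to observed entries (mask is 1 where observed)."""
    residual = mask * (observed - tt_to_tensor(cores))
    return 0.5 * np.sum(residual ** 2)

# Toy example: a 3 x 4 x 5 tensor stored in three small cores.
rng = np.random.default_rng(0)
cores = [rng.standard_normal(s) for s in [(1, 3, 2), (2, 4, 3), (3, 5, 1)]]
full = tt_to_tensor(cores)
mask = (rng.random(full.shape) > 0.9).astype(float)  # roughly 90% missing
loss_at_truth = weighted_loss(full, cores, mask)
```

The cores hold 6 + 24 + 15 = 45 numbers here versus 60 for the full tensor; the gap grows quickly with tensor order because storage is linear in the number of cores.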
A study of Thompson Sampling with Parameter h
Title | A study of Thompson Sampling with Parameter h |
Authors | Qiang Ha |
Abstract | The Thompson Sampling algorithm is a well-known Bayesian algorithm for solving the stochastic multi-armed bandit problem. At each time step the algorithm chooses each arm with probability proportional to the probability of it being the current best arm. We modify the strategy by introducing a parameter h which alters the importance of the probability of an arm being the current best arm. We show that the optimality of Thompson sampling is robust to this perturbation within a range of parameter values for two-armed bandits. |
Tasks | |
Published | 2017-10-05 |
URL | http://arxiv.org/abs/1710.02174v1, http://arxiv.org/pdf/1710.02174v1.pdf |
PWC | https://paperswithcode.com/paper/a-study-of-thompson-sampling-with-parameter-h |
Repo | https://github.com/qiangha/thompson_sampling |
Framework | none |
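The selection rule in the Thompson Sampling abstract can be sketched for Bernoulli rewards with Beta posteriors. The abstract does not say how the parameter h enters, so the version below is one hypothetical reading: the estimated probability of each arm being best is raised to the power `h` and renormalized, which reduces to ordinary Thompson sampling (up to Monte Carlo error) when `h = 1`. The function names, the Beta(1, 1) prior, and the Monte Carlo estimate are all assumptions of this sketch.

```python
import numpy as np

def best_arm_probabilities(successes, failures, rng, n_draws=500):
    """Monte Carlo estimate of the posterior probability that each arm is best."""
    draws = rng.beta(successes, failures, size=(n_draws, len(successes)))
    winners = np.argmax(draws, axis=1)
    return np.bincount(winners, minlength=len(successes)) / n_draws

def thompson_with_h(true_means, h=1.0, horizon=1000, seed=0):
    """Bernoulli bandit where arm i is pulled with probability proportional to p_i ** h."""
    rng = np.random.default_rng(seed)
    k = len(true_means)
    successes = np.ones(k)  # Beta(1, 1) prior on each arm
    failures = np.ones(k)
    pulls = np.zeros(k, dtype=int)
    for _ in range(horizon):
        p_best = best_arm_probabilities(successes, failures, rng)
        weights = p_best ** h  # hypothetical role of h
        arm = rng.choice(k, p=weights / weights.sum())
        reward = float(rng.random() < true_means[arm])
        successes[arm] += reward
        failures[arm] += 1.0 - reward
        pulls[arm] += 1
    return pulls

pulls = thompson_with_h([0.2, 0.8], h=1.0)
```

Under this reading, values of `h` above 1 make the rule greedier and values below 1 make it explore more, which is the kind of perturbation whose effect on optimality the abstract studies.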
Conditional Variance Penalties and Domain Shift Robustness
Title | Conditional Variance Penalties and Domain Shift Robustness |
Authors | Christina Heinze-Deml, Nicolai Meinshausen |
Abstract | When training a deep neural network for image classification, one can broadly distinguish between two types of latent features of images that will drive the classification. We can divide latent features into (i) “core” or “conditionally invariant” features $X^\text{core}$ whose distribution $X^\text{core}\vert Y$, conditional on the class $Y$, does not change substantially across domains and (ii) “style” features $X^{\text{style}}$ whose distribution $X^{\text{style}} \vert Y$ can change substantially across domains. Examples for style features include position, rotation, image quality or brightness but also more complex ones like hair color, image quality or posture for images of persons. Our goal is to minimize a loss that is robust under changes in the distribution of these style features. In contrast to previous work, we assume that the domain itself is not observed and hence a latent variable. We do assume that we can sometimes observe a typically discrete identifier or “$\mathrm{ID}$ variable”. In some applications we know, for example, that two images show the same person, and $\mathrm{ID}$ then refers to the identity of the person. The proposed method requires only a small fraction of images to have $\mathrm{ID}$ information. We group observations if they share the same class and identifier $(Y,\mathrm{ID})=(y,\mathrm{id})$ and penalize the conditional variance of the prediction or the loss if we condition on $(Y,\mathrm{ID})$. Using a causal framework, this conditional variance regularization (CoRe) is shown to protect asymptotically against shifts in the distribution of the style variables. Empirically, we show that the CoRe penalty improves predictive accuracy substantially in settings where domain changes occur in terms of image quality, brightness and color while we also look at more complex changes such as changes in movement and posture. |
Tasks | Image Classification |
Published | 2017-10-31 |
URL | http://arxiv.org/abs/1710.11469v5, http://arxiv.org/pdf/1710.11469v5.pdf |
PWC | https://paperswithcode.com/paper/conditional-variance-penalties-and-domain |
Repo | https://github.com/christinaheinze/core |
Framework | tf |
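The grouping step described in the CoRe abstract, where observations sharing the same class and identifier are pooled and the variance of the prediction within each group is penalized, can be sketched as follows. This is a minimal illustration with scalar predictions. The function name `conditional_variance_penalty`, the use of `None` for observations without an identifier, and the penalty weight `lam` in the usage lines are assumptions, not the authors' code.

```python
from collections import defaultdict

import numpy as np

def conditional_variance_penalty(predictions, labels, ids):
    """Average within-group variance of predictions over groups sharing (Y, ID).

    Observations whose identifier is None are ignored, reflecting that only a
    fraction of the data needs to carry ID information.
    """
    groups = defaultdict(list)
    for pred, label, identifier in zip(predictions, labels, ids):
        if identifier is not None:
            groups[(label, identifier)].append(pred)
    # Groups with a single member carry no variance information.
    variances = [np.var(g) for g in groups.values() if len(g) > 1]
    return float(np.mean(variances)) if variances else 0.0

# Two images of person "a" (class 0) get different predictions; person "b" is stable.
predictions = [1.0, 3.0, 5.0, 2.0, 2.0]
labels = [0, 0, 1, 1, 1]
ids = ["a", "a", None, "b", "b"]
penalty = conditional_variance_penalty(predictions, labels, ids)

# Hypothetical combined objective: base loss plus a weighted penalty.
lam = 0.1
base_loss = 0.7  # placeholder value for illustration only
objective = base_loss + lam * penalty
```

A prediction that changes between two images of the same person and class can only be responding to style features, so driving this variance down discourages reliance on them.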