October 21, 2019


Paper Group AWR 114



Neural Wavetable: a playable wavetable synthesizer using neural networks

Title Neural Wavetable: a playable wavetable synthesizer using neural networks
Authors Lamtharn Hantrakul, Li-Chia Yang
Abstract We present Neural Wavetable, a proof-of-concept wavetable synthesizer that uses neural networks to generate playable wavetables. The system can produce new, distinct waveforms through the interpolation of traditional wavetables in an autoencoder’s latent space. It is available as a VST/AU plugin for use in a Digital Audio Workstation.
Tasks
Published 2018-11-13
URL http://arxiv.org/abs/1811.05550v2
PDF http://arxiv.org/pdf/1811.05550v2.pdf
PWC https://paperswithcode.com/paper/neural-wavetable-a-playable-wavetable
Repo https://github.com/RichardYang40148/Neural_Wavetable_Synthesizer
Framework tf
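
A minimal sketch of the latent-space interpolation idea described in the abstract. It assumes a plain linear blend between two latent codes; the function name `interpolate_latents` and the example vectors are hypothetical placeholders, not outputs of the authors' autoencoder, and decoding the blended code back into a wavetable is left out.

```python
def interpolate_latents(z_a, z_b, alpha):
    """Linear blend of two latent vectors: alpha=0 returns z_a, alpha=1 returns z_b."""
    if len(z_a) != len(z_b):
        raise ValueError("latent vectors must have the same length")
    return [(1.0 - alpha) * a + alpha * b for a, b in zip(z_a, z_b)]


# Hypothetical latent codes for two wavetables (illustrative values only).
z_sine = [0.2, -1.0, 0.5]
z_saw = [1.0, 0.0, -0.5]

# A point halfway between the two codes; a decoder would turn it into a new waveform.
z_mid = interpolate_latents(z_sine, z_saw, 0.5)
```

Sweeping `alpha` from 0 to 1 would trace a path between the two source wavetables, which is one way a playable control could be mapped onto the latent space.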

Complementary-Label Learning for Arbitrary Losses and Models

Title Complementary-Label Learning for Arbitrary Losses and Models
Authors Takashi Ishida, Gang Niu, Aditya Krishna Menon, Masashi Sugiyama
Abstract In contrast to the standard classification paradigm where the true class is given to each training pattern, complementary-label learning only uses training patterns each equipped with a complementary label, which only specifies one of the classes that the pattern does not belong to. The goal of this paper is to derive a novel framework of complementary-label learning with an unbiased estimator of the classification risk, for arbitrary losses and models—all existing methods have failed to achieve this goal. Not only is this beneficial for the learning stage, it also makes model/hyper-parameter selection (through cross-validation) possible without the need of any ordinarily labeled validation data, while using any linear/non-linear models or convex/non-convex loss functions. We further improve the risk estimator by a non-negative correction and gradient ascent trick, and demonstrate its superiority through experiments.
Tasks
Published 2018-10-10
URL https://arxiv.org/abs/1810.04327v4
PDF https://arxiv.org/pdf/1810.04327v4.pdf
PWC https://paperswithcode.com/paper/complementary-label-learning-for-arbitrary
Repo https://github.com/takashiishida/comp
Framework pytorch

Look into Person: Joint Body Parsing & Pose Estimation Network and A New Benchmark

Title Look into Person: Joint Body Parsing & Pose Estimation Network and A New Benchmark
Authors Xiaodan Liang, Ke Gong, Xiaohui Shen, Liang Lin
Abstract Human parsing and pose estimation have recently received considerable interest due to their substantial application potentials. However, the existing datasets have limited numbers of images and annotations and lack a variety of human appearances and coverage of challenging cases in unconstrained environments. In this paper, we introduce a new benchmark named “Look into Person (LIP)” that provides a significant advancement in terms of scalability, diversity, and difficulty, which are crucial for future developments in human-centric analysis. This comprehensive dataset contains over 50,000 elaborately annotated images with 19 semantic part labels and 16 body joints, which are captured from a broad range of viewpoints, occlusions, and background complexities. Using these rich annotations, we perform detailed analyses of the leading human parsing and pose estimation approaches, thereby obtaining insights into the successes and failures of these methods. To further explore and take advantage of the semantic correlation of these two tasks, we propose a novel joint human parsing and pose estimation network to explore efficient context modeling, which can simultaneously predict parsing and pose with extremely high quality. Furthermore, we simplify the network to solve human parsing by exploring a novel self-supervised structure-sensitive learning approach, which imposes human pose structures into the parsing results without resorting to extra supervision. The dataset, code and models are available at http://www.sysu-hcp.net/lip/.
Tasks Human Parsing, Pose Estimation, Semantic Segmentation
Published 2018-04-05
URL http://arxiv.org/abs/1804.01984v1
PDF http://arxiv.org/pdf/1804.01984v1.pdf
PWC https://paperswithcode.com/paper/look-into-person-joint-body-parsing-pose
Repo https://github.com/andrewjong/SwapNet
Framework pytorch

Dimensionality’s Blessing: Clustering Images by Underlying Distribution

Title Dimensionality’s Blessing: Clustering Images by Underlying Distribution
Authors Wen-Yan Lin, Siying Liu, Jian-Huang Lai, Yasuyuki Matsushita
Abstract Many high dimensional vector distances tend to a constant. This is typically considered a negative “contrast-loss” phenomenon that hinders clustering and other machine learning techniques. We reinterpret “contrast-loss” as a blessing. Re-deriving “contrast-loss” using the law of large numbers, we show it results in a distribution’s instances concentrating on a thin “hyper-shell”. The hollow center means apparently chaotically overlapping distributions are actually intrinsically separable. We use this to develop distribution-clustering, an elegant algorithm for grouping data points by their (unknown) underlying distribution. Distribution-clustering creates notably clean clusters from raw unlabeled data, estimates the number of clusters for itself and is inherently robust to “outliers”, which form their own clusters. This enables trawling for patterns in unorganized data and may be the key to enabling machine intelligence.
Tasks
Published 2018-04-08
URL http://arxiv.org/abs/1804.02624v1
PDF http://arxiv.org/pdf/1804.02624v1.pdf
PWC https://paperswithcode.com/paper/dimensionalitys-blessing-clustering-images-by
Repo https://github.com/EricElmoznino/distribution_clustering
Framework pytorch
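
The “hyper-shell” concentration described in the abstract can be checked numerically. The sketch below is not the authors' clustering algorithm; it only samples from a standard Gaussian (an assumed example distribution) and measures how tightly the sample norms cluster around their mean as the dimension grows. The helper name `norm_spread` is hypothetical.

```python
import math
import random


def norm_spread(dim, n_samples=200, seed=0):
    """Return (mean norm, std of norms / mean norm) for standard Gaussian samples."""
    rng = random.Random(seed)
    norms = []
    for _ in range(n_samples):
        squared = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(dim))
        norms.append(math.sqrt(squared))
    mean = sum(norms) / n_samples
    variance = sum((r - mean) ** 2 for r in norms) / n_samples
    return mean, math.sqrt(variance) / mean


# In 2 dimensions the norms vary widely relative to their mean;
# in 2000 dimensions they sit in a thin shell of radius close to sqrt(2000).
low_mean, low_rel_spread = norm_spread(2)
high_mean, high_rel_spread = norm_spread(2000)
```

Comparing `low_rel_spread` with `high_rel_spread` shows the shell becoming relatively thinner in high dimensions, which is the property the paper exploits to separate distributions.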

Learning Semantic Sentence Embeddings using Sequential Pair-wise Discriminator

Title Learning Semantic Sentence Embeddings using Sequential Pair-wise Discriminator
Authors Badri N. Patro, Vinod K. Kurmi, Sandeep Kumar, Vinay P. Namboodiri
Abstract While the problem of obtaining word-level embeddings is very well studied, we propose a novel method for obtaining sentence-level embeddings. The embeddings are obtained by a simple method in the context of solving the paraphrase generation task. If we use a sequential encoder-decoder model for generating paraphrases, we would like the generated paraphrase to be semantically close to the original sentence. One way to ensure this is by adding constraints that keep true paraphrase embeddings close and unrelated paraphrase candidate sentence embeddings far apart. This is ensured by using a sequential pair-wise discriminator that shares weights with the encoder and is trained with a suitable loss function. Our loss function penalizes paraphrase sentence embedding distances for being too large. This loss is used in combination with a sequential encoder-decoder network. We also validate our method by evaluating the obtained embeddings on a sentiment analysis task. The proposed method results in semantic embeddings and outperforms the state-of-the-art on the paraphrase generation and sentiment analysis tasks on standard datasets. These results are also shown to be statistically significant.
Tasks Paraphrase Generation, Sentence Embedding, Sentence Embeddings, Sentiment Analysis
Published 2018-06-03
URL http://arxiv.org/abs/1806.00807v5
PDF http://arxiv.org/pdf/1806.00807v5.pdf
PWC https://paperswithcode.com/paper/learning-semantic-sentence-embeddings-using-1
Repo https://github.com/vinodkkurmi/PQG
Framework pytorch

Training wide residual networks for deployment using a single bit for each weight

Title Training wide residual networks for deployment using a single bit for each weight
Authors Mark D. McDonnell
Abstract For fast and energy-efficient deployment of trained deep neural networks on resource-constrained embedded hardware, each learned weight parameter should ideally be represented and stored using a single bit. Error-rates usually increase when this requirement is imposed. Here, we report large improvements in error rates on multiple datasets, for deep convolutional neural networks deployed with 1-bit-per-weight. Using wide residual networks as our main baseline, our approach simplifies existing methods that binarize weights by applying the sign function in training; we apply scaling factors for each layer with constant unlearned values equal to the layer-specific standard deviations used for initialization. For CIFAR-10, CIFAR-100 and ImageNet, and models with 1-bit-per-weight requiring less than 10 MB of parameter memory, we achieve error rates of 3.9%, 18.5% and 26.0% / 8.5% (Top-1 / Top-5) respectively. We also considered MNIST, SVHN and ImageNet32, achieving 1-bit-per-weight test results of 0.27%, 1.9%, and 41.3% / 19.1% respectively. For CIFAR, our error rates halve previously reported values, and are within about 1% of our error-rates for the same network with full-precision weights. For networks that overfit, we also show significant improvements in error rate by not learning batch normalization scale and offset parameters. This applies to both full precision and 1-bit-per-weight networks. Using a warm-restart learning-rate schedule, we found that training for 1-bit-per-weight is just as fast as full-precision networks, with better accuracy than standard schedules, and achieved about 98%-99% of peak performance in just 62 training epochs for CIFAR-10/100. For full training code and trained models in MATLAB, Keras and PyTorch see https://github.com/McDonnell-Lab/1-bit-per-weight/ .
Tasks
Published 2018-02-23
URL http://arxiv.org/abs/1802.08530v1
PDF http://arxiv.org/pdf/1802.08530v1.pdf
PWC https://paperswithcode.com/paper/training-wide-residual-networks-for
Repo https://github.com/McDonnell-Lab/1-bit-per-weight
Framework none
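
A minimal sketch of the weight binarization described in the abstract: apply the sign function and multiply by a constant, unlearned per-layer scale. The abstract says the scale equals the layer's initialization standard deviation; the specific formula sqrt(2 / fan_in) used here is an assumption (He-style initialization), and `binarize_layer` is a hypothetical helper, not code from the linked repository.

```python
import math


def binarize_layer(weights, fan_in):
    """Map every weight to +scale or -scale, where scale is fixed per layer.

    Assumption: scale = sqrt(2 / fan_in), the standard deviation of
    He-style initialization. Zero is mapped to +scale so that every
    weight is representable with a single bit.
    """
    scale = math.sqrt(2.0 / fan_in)
    return [[scale if w >= 0 else -scale for w in row] for row in weights]


# Toy full-precision weights for a layer with fan_in = 8.
full_precision = [[0.3, -0.7], [0.0, -0.01]]
one_bit = binarize_layer(full_precision, fan_in=8)
```

Because the scale is a constant known from the layer shape, only the sign of each weight needs to be stored at deployment time.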

Learning General Purpose Distributed Sentence Representations via Large Scale Multi-task Learning

Title Learning General Purpose Distributed Sentence Representations via Large Scale Multi-task Learning
Authors Sandeep Subramanian, Adam Trischler, Yoshua Bengio, Christopher J Pal
Abstract A lot of the recent success in natural language processing (NLP) has been driven by distributed vector representations of words trained on large amounts of text in an unsupervised manner. These representations are typically used as general purpose features for words across a range of NLP problems. However, extending this success to learning representations of sequences of words, such as sentences, remains an open problem. Recent work has explored unsupervised as well as supervised learning techniques with different training objectives to learn general purpose fixed-length sentence representations. In this work, we present a simple, effective multi-task learning framework for sentence representations that combines the inductive biases of diverse training objectives in a single model. We train this model on several data sources with multiple training objectives on over 100 million sentences. Extensive experiments demonstrate that sharing a single recurrent sentence encoder across weakly related tasks leads to consistent improvements over previous methods. We present substantial improvements in the context of transfer learning and low-resource settings using our learned general-purpose representations.
Tasks Multi-Task Learning, Natural Language Inference, Paraphrase Identification, Semantic Textual Similarity, Transfer Learning
Published 2018-03-30
URL http://arxiv.org/abs/1804.00079v1
PDF http://arxiv.org/pdf/1804.00079v1.pdf
PWC https://paperswithcode.com/paper/learning-general-purpose-distributed-sentence
Repo https://github.com/facebookresearch/InferSent
Framework pytorch

Recurrently Exploring Class-wise Attention in A Hybrid Convolutional and Bidirectional LSTM Network for Multi-label Aerial Image Classification

Title Recurrently Exploring Class-wise Attention in A Hybrid Convolutional and Bidirectional LSTM Network for Multi-label Aerial Image Classification
Authors Yuansheng Hua, Lichao Mou, Xiao Xiang Zhu
Abstract Aerial image classification is of great significance in the remote sensing community, and much research has been conducted over the past few years. Most of these studies focus on categorizing an image into one semantic label, while in the real world, an aerial image is often associated with multiple labels, e.g., multiple object-level labels in our case. Besides, a comprehensive picture of present objects in a given high resolution aerial image can provide more in-depth understanding of the studied region. For these reasons, aerial image multi-label classification has been attracting increasing attention. However, one common limitation shared by existing methods in the community is that the co-occurrence relationship of various classes, the so-called class dependency, is underexplored, which leads to ill-considered decisions. In this paper, we propose a novel end-to-end network, namely class-wise attention-based convolutional and bidirectional LSTM network (CA-Conv-BiLSTM), for this task. The proposed network consists of three indispensable components: 1) a feature extraction module, 2) a class attention learning layer, and 3) a bidirectional LSTM-based sub-network. Particularly, the feature extraction module is designed for extracting fine-grained semantic feature maps, while the class attention learning layer aims at capturing discriminative class-specific features. As the most important part, the bidirectional LSTM-based sub-network models the underlying class dependency in both directions and produces structured multiple object labels. Experimental results on the UCM multi-label dataset and the DFC15 multi-label dataset validate the effectiveness of our model quantitatively and qualitatively.
Tasks Image Classification, Multi-Label Classification
Published 2018-07-30
URL https://arxiv.org/abs/1807.11245v2
PDF https://arxiv.org/pdf/1807.11245v2.pdf
PWC https://paperswithcode.com/paper/recurrently-exploring-class-wise-attention-in
Repo https://github.com/EricYangsw/Multi-Label-Classification
Framework tf

A Unified Model for Extractive and Abstractive Summarization using Inconsistency Loss

Title A Unified Model for Extractive and Abstractive Summarization using Inconsistency Loss
Authors Wan-Ting Hsu, Chieh-Kai Lin, Ming-Ying Lee, Kerui Min, Jing Tang, Min Sun
Abstract We propose a unified model combining the strengths of extractive and abstractive summarization. On the one hand, a simple extractive model can obtain sentence-level attention with high ROUGE scores but is less readable. On the other hand, a more complicated abstractive model can obtain word-level dynamic attention to generate a more readable paragraph. In our model, sentence-level attention is used to modulate the word-level attention such that words in less attended sentences are less likely to be generated. Moreover, a novel inconsistency loss function is introduced to penalize the inconsistency between the two levels of attention. By end-to-end training our model with the inconsistency loss and the original losses of the extractive and abstractive models, we achieve state-of-the-art ROUGE scores while producing the most informative and readable summaries on the CNN/Daily Mail dataset in a solid human evaluation.
Tasks Abstractive Text Summarization
Published 2018-05-16
URL http://arxiv.org/abs/1805.06266v2
PDF http://arxiv.org/pdf/1805.06266v2.pdf
PWC https://paperswithcode.com/paper/a-unified-model-for-extractive-and
Repo https://github.com/HsuWanTing/unified-summarization
Framework tf
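
The modulation of word-level attention by sentence-level attention can be sketched as follows. This assumes a simple multiply-and-renormalize scheme, which matches the abstract's description (words in less attended sentences become less likely) but may differ in detail from the paper; the function and argument names are hypothetical, and the inconsistency loss itself is not shown.

```python
def modulate_word_attention(word_attn, sent_attn, sent_of_word):
    """Scale each word's attention by the attention of its sentence, then renormalize.

    word_attn:    attention weight per source word
    sent_attn:    attention weight per source sentence
    sent_of_word: index of the sentence each word belongs to
    """
    scaled = [a * sent_attn[sent_of_word[i]] for i, a in enumerate(word_attn)]
    total = sum(scaled)
    return [s / total for s in scaled]


# Four words with equal word-level attention; the first two belong to a
# strongly attended sentence, the last two to a weakly attended one.
word_attn = [0.25, 0.25, 0.25, 0.25]
sent_attn = [0.9, 0.1]
sent_of_word = [0, 0, 1, 1]
updated = modulate_word_attention(word_attn, sent_attn, sent_of_word)
```

After modulation, the words from the weakly attended sentence carry much less weight, even though the original word-level attention treated all four words equally.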

Adversarial training for multi-context joint entity and relation extraction

Title Adversarial training for multi-context joint entity and relation extraction
Authors Giannis Bekoulis, Johannes Deleu, Thomas Demeester, Chris Develder
Abstract Adversarial training (AT) is a regularization method that can be used to improve the robustness of neural network methods by adding small perturbations to the training data. We show how to use AT for the tasks of entity recognition and relation extraction. In particular, we demonstrate that applying AT to a general purpose baseline model for jointly extracting entities and relations improves the state-of-the-art effectiveness on several datasets in different contexts (i.e., news, biomedical, and real estate data) and for different languages (English and Dutch).
Tasks Joint Entity and Relation Extraction, Relation Extraction
Published 2018-08-21
URL http://arxiv.org/abs/1808.06876v3
PDF http://arxiv.org/pdf/1808.06876v3.pdf
PWC https://paperswithcode.com/paper/adversarial-training-for-multi-context-joint
Repo https://github.com/bekou/multihead_joint_entity_relation_extraction
Framework tf
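
A minimal sketch of how an adversarial perturbation can be built, assuming a toy logistic-regression model in place of the paper's joint extraction network. The perturbation follows the loss gradient with respect to the input and is rescaled to a fixed norm `eps`; the model, the function names, and the choice of an L2-normalized step are all illustrative assumptions.

```python
import math


def logistic_loss(w, x, y):
    """Binary cross-entropy for label y in {0, 1} under a linear model."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    p = 1.0 / (1.0 + math.exp(-z))
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def input_gradient(w, x, y):
    """Gradient of the loss with respect to the input x: (p - y) * w."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    p = 1.0 / (1.0 + math.exp(-z))
    return [(p - y) * wi for wi in w]


def adversarial_example(w, x, y, eps):
    """Move x by a step of L2 norm eps in the direction that increases the loss."""
    g = input_gradient(w, x, y)
    norm = math.sqrt(sum(gi * gi for gi in g))
    if norm == 0.0:
        return list(x)
    return [xi + eps * gi / norm for xi, gi in zip(x, g)]


w = [1.0, -2.0]
x = [0.5, 0.25]
x_adv = adversarial_example(w, x, y=1, eps=0.1)
```

During training, the loss on `x_adv` would typically be added to the loss on the clean input `x`, so the model is penalized for being sensitive to small input changes.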

T-GCN: A Temporal Graph Convolutional Network for Traffic Prediction

Title T-GCN: A Temporal Graph Convolutional Network for Traffic Prediction
Authors Ling Zhao, Yujiao Song, Chao Zhang, Yu Liu, Pu Wang, Tao Lin, Min Deng, Haifeng Li
Abstract Accurate and real-time traffic forecasting plays an important role in the Intelligent Traffic System and is of great significance for urban traffic planning, traffic management, and traffic control. However, traffic forecasting has always been considered an open scientific issue, owing to the constraints of urban road network topological structure and the law of dynamic change with time, namely, spatial dependence and temporal dependence. To capture the spatial and temporal dependence simultaneously, we propose a novel neural network-based traffic forecasting method, the temporal graph convolutional network (T-GCN) model, which combines the graph convolutional network (GCN) and the gated recurrent unit (GRU). Specifically, the GCN is used to learn complex topological structures to capture spatial dependence and the gated recurrent unit is used to learn dynamic changes of traffic data to capture temporal dependence. Then, the T-GCN model is applied to traffic forecasting based on the urban road network. Experiments demonstrate that our T-GCN model can obtain the spatio-temporal correlation from traffic data and the predictions outperform state-of-the-art baselines on real-world traffic datasets. Our TensorFlow implementation of the T-GCN is available at https://github.com/lehaifeng/T-GCN.
Tasks Traffic Prediction
Published 2018-11-12
URL http://arxiv.org/abs/1811.05320v3
PDF http://arxiv.org/pdf/1811.05320v3.pdf
PWC https://paperswithcode.com/paper/t-gcn-a-temporal-graph-convolutionalnetwork
Repo https://github.com/R4h4/AIforSEA_Traffic_Management
Framework none
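
The spatial half of the model can be sketched with one graph-convolution propagation step over a road network. The sketch uses the commonly used symmetric normalization of the adjacency matrix with self-loops; learned weights, the nonlinearity, and the GRU that would process the resulting sequence over time are omitted, and the function names are hypothetical.

```python
import math


def normalized_adjacency(adj):
    """Compute D^(-1/2) (A + I) D^(-1/2) for a square adjacency matrix A."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    degree = [sum(row) for row in a_hat]
    return [
        [a_hat[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
        for i in range(n)
    ]


def graph_conv(adj, features):
    """One propagation step: each road mixes its features with its neighbours'."""
    norm = normalized_adjacency(adj)
    n, d = len(features), len(features[0])
    return [
        [sum(norm[i][k] * features[k][c] for k in range(n)) for c in range(d)]
        for i in range(n)
    ]


# Two connected road segments with one feature each (e.g. current speed).
adjacency = [[0.0, 1.0], [1.0, 0.0]]
speeds = [[1.0], [3.0]]
smoothed = graph_conv(adjacency, speeds)
```

In a full model, the output of this step at each time stamp would be fed to a recurrent unit so that temporal dynamics are captured alongside the road topology.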

Spectral Feature Transformation for Person Re-identification

Title Spectral Feature Transformation for Person Re-identification
Authors Chuanchen Luo, Yuntao Chen, Naiyan Wang, Zhaoxiang Zhang
Abstract With the surge of deep learning techniques, the field of person re-identification has witnessed rapid progress in recent years. Deep learning based methods focus on learning a feature space where samples are clustered compactly according to their corresponding identities. Most existing methods rely on powerful CNNs to transform the samples individually. In contrast, we propose to consider the sample relations in the transformation. To achieve this goal, we incorporate a spectral clustering technique into the CNN. We derive a novel module named Spectral Feature Transformation and seamlessly integrate it into the existing CNN pipeline with negligible cost, which lets our method enjoy the best of both worlds. Empirical studies show that the proposed approach outperforms previous state-of-the-art methods on four public benchmarks by a considerable margin without bells and whistles.
Tasks Person Re-Identification
Published 2018-11-28
URL http://arxiv.org/abs/1811.11405v1
PDF http://arxiv.org/pdf/1811.11405v1.pdf
PWC https://paperswithcode.com/paper/spectral-feature-transformation-for-person-re
Repo https://github.com/xuxu116/pytorch-reid-lite
Framework pytorch

A Meaning-based Statistical English Math Word Problem Solver

Title A Meaning-based Statistical English Math Word Problem Solver
Authors Chao-Chun Liang, Yu-Shiang Wong, Yi-Chung Lin, Keh-Yih Su
Abstract We introduce MeSys, a meaning-based approach for solving English math word problems (MWPs) via understanding and reasoning. It first analyzes the text, transforms both body and question parts into their corresponding logic forms, and then performs inference on them. The associated context of each quantity is represented with proposed role-tags (e.g., nsubj, verb, etc.), which provide the flexibility for annotating an extracted math quantity with its associated context information (i.e., the physical meaning of this quantity). Statistical models are proposed to select the operator and operands. A noisy dataset is designed to assess whether a solver solves MWPs mainly via understanding or mechanical pattern matching. Experimental results show that our approach outperforms existing systems on both benchmark datasets and the noisy dataset, which demonstrates that the proposed approach better understands the meaning of each quantity in the text.
Tasks
Published 2018-03-16
URL http://arxiv.org/abs/1803.06064v2
PDF http://arxiv.org/pdf/1803.06064v2.pdf
PWC https://paperswithcode.com/paper/a-meaning-based-statistical-english-math-word
Repo https://github.com/chaochun/nlu-mwp-noise-dataset
Framework none

Attention-Based LSTM for Psychological Stress Detection from Spoken Language Using Distant Supervision

Title Attention-Based LSTM for Psychological Stress Detection from Spoken Language Using Distant Supervision
Authors Genta Indra Winata, Onno Pepijn Kampman, Pascale Fung
Abstract We propose a Long Short-Term Memory (LSTM) with attention mechanism to classify psychological stress from self-conducted interview transcriptions. We apply distant supervision by automatically labeling tweets based on their hashtag content, which complements and expands the size of our corpus. This additional data is used to initialize the model parameters, and the model is then fine-tuned using the interview data. This improves the model’s robustness, especially by expanding the vocabulary size. The bidirectional LSTM model with attention is found to be the best model in terms of accuracy (74.1%) and f-score (74.3%). Furthermore, we show that distant supervision fine-tuning enhances the model’s performance by 1.6% accuracy and 2.1% f-score. The attention mechanism helps the model to select informative words.
Tasks 3D Shape Analysis
Published 2018-05-31
URL http://arxiv.org/abs/1805.12307v1
PDF http://arxiv.org/pdf/1805.12307v1.pdf
PWC https://paperswithcode.com/paper/attention-based-lstm-for-psychological-stress
Repo https://github.com/gentaiscool/lstm-attention
Framework none
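
A minimal sketch of attention pooling over a sequence of hidden states, to illustrate how an attention mechanism can pick out informative words. The dot-product scoring against a context vector is an assumption (the abstract does not specify the scoring function), the hidden states below are made-up numbers rather than LSTM outputs, and `attention_pool` is a hypothetical helper.

```python
import math


def attention_pool(hidden_states, context):
    """Return (attention weights, weighted sum of hidden states).

    Scores are dot products with a context vector, turned into weights by a softmax.
    """
    scores = [sum(h * c for h, c in zip(state, context)) for state in hidden_states]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(hidden_states[0])
    pooled = [
        sum(w * state[d] for w, state in zip(weights, hidden_states))
        for d in range(dim)
    ]
    return weights, pooled


# Three time steps with 2-dimensional hidden states (illustrative values).
states = [[1.0, 0.0], [0.0, 1.0], [3.0, 0.0]]
context = [1.0, 0.0]
weights, pooled = attention_pool(states, context)
```

The time step whose hidden state aligns most strongly with the context vector receives the largest weight, so the pooled vector is dominated by that step; inspecting `weights` is how such models are usually read for informative words.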

GenAttack: Practical Black-box Attacks with Gradient-Free Optimization

Title GenAttack: Practical Black-box Attacks with Gradient-Free Optimization
Authors Moustafa Alzantot, Yash Sharma, Supriyo Chakraborty, Huan Zhang, Cho-Jui Hsieh, Mani Srivastava
Abstract Deep neural networks are vulnerable to adversarial examples, even in the black-box setting, where the attacker is restricted solely to query access. Existing black-box approaches to generating adversarial examples typically require a significant number of queries, either for training a substitute network or performing gradient estimation. We introduce GenAttack, a gradient-free optimization technique that uses genetic algorithms for synthesizing adversarial examples in the black-box setting. Our experiments on different datasets (MNIST, CIFAR-10, and ImageNet) show that GenAttack can successfully generate visually imperceptible adversarial examples against state-of-the-art image recognition models with orders of magnitude fewer queries than previous approaches. Against MNIST and CIFAR-10 models, GenAttack required roughly 2,126 and 2,568 times fewer queries respectively, than ZOO, the prior state-of-the-art black-box attack. In order to scale up the attack to large-scale high-dimensional ImageNet models, we perform a series of optimizations that further improve the query efficiency of our attack leading to 237 times fewer queries against the Inception-v3 model than ZOO. Furthermore, we show that GenAttack can successfully attack some state-of-the-art ImageNet defenses, including ensemble adversarial training and non-differentiable or randomized input transformations. Our results suggest that evolutionary algorithms open up a promising area of research into effective black-box attacks.
Tasks
Published 2018-05-28
URL https://arxiv.org/abs/1805.11090v3
PDF https://arxiv.org/pdf/1805.11090v3.pdf
PWC https://paperswithcode.com/paper/genattack-practical-black-box-attacks-with
Repo https://github.com/tuscan-chicken-wrap/ECE260FinalProject
Framework tf
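
A simplified sketch of a gradient-free, genetic-algorithm attack in the spirit of the abstract. It is not the authors' GenAttack procedure: it assumes truncation selection, uniform crossover, and uniform mutation, and it treats the model as a black-box `score_fn` that returns a positive value once the input is misclassified. All names and hyperparameters are hypothetical.

```python
import random


def genetic_attack(score_fn, x, eps, pop_size=20, generations=50, mut_rate=0.3, seed=0):
    """Search for a perturbation with every coordinate in [-eps, eps] that
    maximizes score_fn, using only queries to score_fn (no gradients)."""
    rng = random.Random(seed)
    dim = len(x)

    def clip(delta):
        return [max(-eps, min(eps, d)) for d in delta]

    def fitness(delta):
        return score_fn([xi + di for xi, di in zip(x, delta)])

    population = [[rng.uniform(-eps, eps) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        elite = ranked[0]
        if fitness(elite) > 0:  # positive score means the attack succeeded
            break
        parents = ranked[: pop_size // 2]  # keep the fitter half as parents
        children = [elite]  # carry the best candidate over unchanged
        while len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            child = [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]
            child = [
                c + rng.uniform(-eps, eps) if rng.random() < mut_rate else c
                for c in child
            ]
            children.append(clip(child))
        population = children
    best = max(population, key=fitness)
    return [xi + di for xi, di in zip(x, best)]


# Toy black-box score: positive once the two coordinates sum to more than 0.05.
def toy_score(v):
    return v[0] + v[1] - 0.05


x_adv = genetic_attack(toy_score, [0.0, 0.0], eps=0.1)
```

Each fitness evaluation corresponds to one model query, so the query count the abstract emphasizes is governed by the population size and the number of generations.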