January 25, 2020

2822 words 14 mins read

Paper Group NAWR 25

Partitioning Structure Learning for Segmented Linear Regression Trees. Yes, we can! Mining Arguments in 50 Years of US Presidential Campaign Debates. Approximate Feature Collisions in Neural Nets. A Streamlined Method for Sourcing Discourse-level Argumentation Annotations from the Crowd. Optimal Sparsity-Sensitive Bounds for Distributed Mean Estima …

Partitioning Structure Learning for Segmented Linear Regression Trees

Title Partitioning Structure Learning for Segmented Linear Regression Trees
Authors Xiangyu Zheng, Song Xi Chen
Abstract This paper proposes a partitioning structure learning method for segmented linear regression trees (SLRT), which assigns linear predictors over the terminal nodes. The recursive partitioning process is driven by an adaptive split selection algorithm that maximizes, at each node, a criterion function based on a conditional Kendall’s τ statistic that measures the rank dependence between the regressors and the fitted linear residuals. Theoretical analysis shows that the split selection algorithm permits consistent identification and estimation of the unknown segments. A sufficiently large tree is induced by applying the split selection algorithm recursively; the minimal cost-complexity tree pruning procedure is then applied to attain the right-sized tree, which ensures (i) the nested structure of pruned subtrees and (ii) consistent estimation of the number of segments. Using the SLRT as the built-in base predictor, we obtain ensemble predictors via random forests (RF) and the proposed weighted random forests (WRF). The practical performance of the SLRT and its ensemble versions is evaluated via numerical simulations and empirical studies, the latter showing advantageous predictive performance over a set of state-of-the-art tree-based models on well-studied public datasets.
Tasks
Published 2019-12-01
URL http://papers.nips.cc/paper/8494-partitioning-structure-learning-for-segmented-linear-regression-trees
PDF http://papers.nips.cc/paper/8494-partitioning-structure-learning-for-segmented-linear-regression-trees.pdf
PWC https://paperswithcode.com/paper/partitioning-structure-learning-for-segmented
Repo https://github.com/xy-zheng/Segmented-Linear-Regression-Tree
Framework none
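
To make the split criterion concrete, here is a minimal sketch of a conditional-Kendall's-τ split score as we read the abstract; the paper's exact criterion differs in detail, and the reference implementation lives in the linked repository. Function names and the toy data are ours.

```python
# Sketch (not the authors' implementation) of a conditional Kendall's-tau
# split score for one regressor in a segmented linear regression tree.
import numpy as np
from scipy.stats import kendalltau

def split_score(x, y, threshold):
    """Fit one line at the node; on each side of the candidate split,
    measure the rank dependence between x and the fitted residuals.
    At a true breakpoint both children are single segments, so the
    residuals are monotone in x on each side and the score is large."""
    coef = np.polyfit(x, y, deg=1)               # node-level linear fit
    resid = y - np.polyval(coef, x)
    score = 0.0
    for mask in (x <= threshold, x > threshold):
        if mask.sum() < 5:
            return -np.inf                        # too few points in a child
        tau, _ = kendalltau(x[mask], resid[mask])
        score += (mask.sum() / x.size) * abs(tau)
    return score

# Toy segmented data: the score should peak near the true breakpoint at 0.4.
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 1, 300))
y = np.where(x < 0.4, 2.0 * x, 0.8 - 1.0 * (x - 0.4)) + 0.02 * rng.standard_normal(300)
cands = np.quantile(x, np.linspace(0.1, 0.9, 33))
print(round(cands[np.argmax([split_score(x, y, c) for c in cands])], 2))
```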

Yes, we can! Mining Arguments in 50 Years of US Presidential Campaign Debates

Title Yes, we can! Mining Arguments in 50 Years of US Presidential Campaign Debates
Authors Shohreh Haddadan, Elena Cabrio, Serena Villata
Abstract Political debates offer a rare opportunity for citizens to compare the candidates’ positions on the most controversial topics of the campaign. Thus they represent a natural application scenario for Argument Mining. As existing research lacks solid empirical investigation of the typology of argument components in political debates, we fill this gap by proposing an Argument Mining approach to political debates. We address this task in an empirical manner by annotating 39 political debates from the last 50 years of US presidential campaigns, creating a new corpus of 29k argument components, labeled as premises and claims. We then propose two tasks: (1) identifying the argumentative components in such debates, and (2) classifying them as premises and claims. We show that feature-rich SVM learners and Neural Network architectures outperform standard baselines in Argument Mining over such complex data. We release the new corpus USElecDeb60To16 and the accompanying software under free licenses to the research community.
Tasks Argument Mining
Published 2019-07-01
URL https://www.aclweb.org/anthology/P19-1463/
PDF https://www.aclweb.org/anthology/P19-1463
PWC https://paperswithcode.com/paper/yes-we-can-mining-arguments-in-50-years-of-us
Repo https://github.com/atreyasha/sentiment-argument-mining
Framework none
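
The abstract credits feature-rich SVM learners as one of the strong approaches; as a hedged illustration (our own deliberately minimal feature set, not the paper's), a sentence-level claim/premise classifier in scikit-learn could be sketched as:

```python
# Toy SVM claim/premise classifier in the spirit of the paper's
# feature-rich baseline; the features and examples here are ours,
# and the real training data is the USElecDeb60To16 corpus.
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

sentences = ["We will cut taxes for the middle class.",
             "Unemployment fell by two percent last year."]
labels = ["Claim", "Premise"]

clf = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2), lowercase=True)),
    ("svm", LinearSVC()),
])
clf.fit(sentences, labels)
print(clf.predict(["Crime rates dropped in every major city."]))
```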

Approximate Feature Collisions in Neural Nets

Title Approximate Feature Collisions in Neural Nets
Authors Ke Li, Tianhao Zhang, Jitendra Malik
Abstract Work on adversarial examples has shown that neural nets are surprisingly sensitive to adversarially chosen changes of small magnitude. In this paper, we show the opposite: neural nets could be surprisingly insensitive to adversarially chosen changes of large magnitude. We observe that this phenomenon can arise from the intrinsic properties of the ReLU activation function. As a result, two very different examples could share the same feature activation and therefore the same classification decision. We refer to this phenomenon as feature collision and the corresponding examples as colliding examples. We find that colliding examples are quite abundant: we empirically demonstrate the existence of polytopes of approximately colliding examples in the neighbourhood of practically any example.
Tasks
Published 2019-12-01
URL http://papers.nips.cc/paper/9713-approximate-feature-collisions-in-neural-nets
PDF http://papers.nips.cc/paper/9713-approximate-feature-collisions-in-neural-nets.pdf
PWC https://paperswithcode.com/paper/approximate-feature-collisions-in-neural-nets
Repo https://github.com/zth667/Approximate-Feature-Collisions-in-Neural-Nets
Framework tf
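
The ReLU mechanism behind these collisions is easy to demonstrate directly: any input perturbation that only moves pre-activations around on their negative side leaves the post-ReLU features unchanged. Below is a toy NumPy construction of an exact one-layer collision (ours, not the paper's procedure for finding polytopes of approximately colliding examples).

```python
# Toy illustration of a ReLU feature collision: two different inputs
# whose pre-activations differ only where ReLU already outputs zero.
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 8))      # 4 ReLU units, 8-dim input
x1 = rng.standard_normal(8)
pre1 = W @ x1
active = pre1 > 0

# Perturbations in the null space of the active rows leave the active
# pre-activations exactly unchanged; with 8 input dims and at most 4
# active units, such directions always exist.
_, _, Vt = np.linalg.svd(W[active])
delta = Vt[active.sum()]             # one null-space direction

# Scale the step so inactive units stay negative (hence still zeroed).
change = W @ delta
inactive = ~active
t = 1.0
if inactive.any():
    t = 0.5 * (-pre1[inactive]).min() / (np.abs(change[inactive]).max() + 1e-12)

x2 = x1 + t * delta                  # a different input...
relu = lambda z: np.maximum(z, 0.0)
print(np.allclose(relu(W @ x1), relu(W @ x2)))   # ...identical ReLU features
```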

A Streamlined Method for Sourcing Discourse-level Argumentation Annotations from the Crowd

Title A Streamlined Method for Sourcing Discourse-level Argumentation Annotations from the Crowd
Authors Tristan Miller, Maria Sukhareva, Iryna Gurevych
Abstract The study of argumentation and the development of argument mining tools depend on the availability of annotated data, which is challenging to obtain in sufficient quantity and quality. We present a method that breaks down a popular but relatively complex discourse-level argument annotation scheme into a simpler, iterative procedure that can be applied even by untrained annotators. We apply this method in a crowdsourcing setup and report on the reliability of the annotations obtained. The source code for a tool implementing our annotation method, as well as the sample data we obtained (4909 gold-standard annotations across 982 documents), are freely released to the research community. These are intended to serve the needs of qualitative research into argumentation, as well as of data-driven approaches to argument mining.
Tasks Argument Mining
Published 2019-06-01
URL https://www.aclweb.org/anthology/N19-1177/
PDF https://www.aclweb.org/anthology/N19-1177
PWC https://paperswithcode.com/paper/a-streamlined-method-for-sourcing-discourse
Repo https://github.com/UKPLab/naacl2019-argument-annotations
Framework none

Optimal Sparsity-Sensitive Bounds for Distributed Mean Estimation

Title Optimal Sparsity-Sensitive Bounds for Distributed Mean Estimation
Authors Zengfeng Huang, Ziyue Huang, Yilei Wang, Ke Yi
Abstract We consider the problem of estimating the mean of a set of vectors, which are stored in a distributed system. This is a fundamental task with applications in distributed SGD and many other distributed problems, where communication is a main bottleneck for scaling up computations. We propose a new sparsity-aware algorithm, which improves previous results both theoretically and empirically. The communication cost of our algorithm is characterized by Hoyer’s measure of sparseness. Moreover, we prove that the communication cost of our algorithm is information-theoretically optimal up to a constant factor in all sparseness regimes. We have also conducted experimental studies, which demonstrate the advantages of our method and confirm our theoretical findings.
Tasks
Published 2019-12-01
URL http://papers.nips.cc/paper/8866-optimal-sparsity-sensitive-bounds-for-distributed-mean-estimation
PDF http://papers.nips.cc/paper/8866-optimal-sparsity-sensitive-bounds-for-distributed-mean-estimation.pdf
PWC https://paperswithcode.com/paper/optimal-sparsity-sensitive-bounds-for
Repo https://github.com/ZiyueHuang/DME
Framework none
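
Hoyer's measure of sparseness, which characterizes the communication cost, has a simple closed form: for a nonzero x in R^n it is (√n − ‖x‖₁/‖x‖₂)/(√n − 1), ranging from 0 (all entries equal in magnitude) to 1 (a single nonzero). A quick sketch:

```python
# Hoyer's sparseness measure (Hoyer, 2004).
import numpy as np

def hoyer_sparseness(x):
    x = np.asarray(x, dtype=float)
    n = x.size
    l1, l2 = np.abs(x).sum(), np.linalg.norm(x)
    return (np.sqrt(n) - l1 / l2) / (np.sqrt(n) - 1)

print(hoyer_sparseness([0, 0, 0, 5]))   # 1.0  (maximally sparse)
print(hoyer_sparseness([1, 1, 1, 1]))   # 0.0  (maximally dense)
```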

Thompson Sampling for Multinomial Logit Contextual Bandits

Title Thompson Sampling for Multinomial Logit Contextual Bandits
Authors Min-Hwan Oh, Garud Iyengar
Abstract We consider a dynamic assortment selection problem where the goal is to offer a sequence of assortments that maximizes the expected cumulative revenue or, alternatively, minimizes the expected regret. The feedback here is the item that the user picks from the assortment. The distinguishing feature in this work is that this feedback has a multinomial logistic distribution. The utility of each item is a dynamic function of contextual information of both the item and the user. We propose two Thompson sampling algorithms for this multinomial logit contextual bandit. Our first algorithm maintains a posterior distribution of the true parameter and establishes $\tilde{O}(d\sqrt{T})$ Bayesian regret over $T$ rounds with a $d$-dimensional context vector. The worst-case computational complexity of this algorithm could be high when the prior distribution is not conjugate. The second algorithm approximates the posterior by a Gaussian distribution and uses a new optimistic sampling procedure to address the issues that arise in worst-case regret analysis. This algorithm achieves a $\tilde{O}(d^{3/2}\sqrt{T})$ worst-case (frequentist) regret bound. The numerical experiments show that the practical performance of both methods is in line with the theoretical guarantees.
Tasks Multi-Armed Bandits
Published 2019-12-01
URL http://papers.nips.cc/paper/8578-thompson-sampling-for-multinomial-logit-contextual-bandits
PDF http://papers.nips.cc/paper/8578-thompson-sampling-for-multinomial-logit-contextual-bandits.pdf
PWC https://paperswithcode.com/paper/thompson-sampling-for-multinomial-logit
Repo https://github.com/minhwanoh/Thompson-sampling-for-MNL-contextual-bandits
Framework none
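
As a hedged sketch of one round of the algorithm's structure (sample a parameter from a Gaussian posterior approximation, then offer the revenue-maximizing assortment under that sample): the names, item revenues, and brute-force assortment search below are ours, and the paper's optimistic sampling and posterior updates are not reproduced.

```python
# One round of Thompson sampling for an MNL assortment bandit (sketch).
from itertools import combinations
import numpy as np

rng = np.random.default_rng(1)
d, n_items, K = 5, 20, 4                 # context dim, catalogue, assortment size

mu, Sigma = np.zeros(d), np.eye(d)       # Gaussian approx. of the posterior
X = rng.standard_normal((n_items, d))    # per-item contexts this round
r = rng.uniform(0.5, 1.5, n_items)       # item revenues (hypothetical)

theta = rng.multivariate_normal(mu, Sigma)   # 1) sample a parameter

# 2) offer the assortment maximizing expected revenue under theta: with
#    MNL utilities v_i = exp(x_i . theta), an assortment S earns
#    sum_{i in S} r_i v_i / (1 + sum_{j in S} v_j).
v = np.exp(X @ theta)
best_S, best_rev = None, -np.inf
for S in combinations(range(n_items), K):    # brute force for the sketch
    S = list(S)
    rev = (r[S] * v[S]).sum() / (1.0 + v[S].sum())
    if rev > best_rev:
        best_S, best_rev = S, rev

# 3) observe the user's (no-)purchase and update (mu, Sigma); the paper's
#    exact posterior update is beyond this sketch.
print("offer:", best_S, "expected revenue:", round(best_rev, 3))
```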

Staying up to Date with Online Content Changes Using Reinforcement Learning for Scheduling

Title Staying up to Date with Online Content Changes Using Reinforcement Learning for Scheduling
Authors Andrey Kolobov, Yuval Peres, Cheng Lu, Eric J. Horvitz
Abstract From traditional Web search engines to virtual assistants and Web accelerators, services that rely on online information need to continually keep track of remote content changes by explicitly requesting content updates from remote sources (e.g., web pages). We propose a novel optimization objective for this setting that has several practically desirable properties, and efficient algorithms for it with optimality guarantees even in the face of mixed content change observability and initially unknown change model parameters. Experiments on 18.5M URLs crawled daily for 14 weeks show significant advantages of this approach over prior art.
Tasks
Published 2019-12-01
URL http://papers.nips.cc/paper/8348-staying-up-to-date-with-online-content-changes-using-reinforcement-learning-for-scheduling
PDF http://papers.nips.cc/paper/8348-staying-up-to-date-with-online-content-changes-using-reinforcement-learning-for-scheduling.pdf
PWC https://paperswithcode.com/paper/staying-up-to-date-with-online-content
Repo https://github.com/microsoft/Optimal-Freshness-Crawl-Scheduling
Framework none
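
The abstract does not state the objective, but freshness crawling is conventionally modeled with per-source Poisson change processes; as a loosely related illustration (our model, not the paper's objective), the staleness probability that such schedulers trade off against crawl bandwidth is:

```python
# Staleness under a Poisson change model: P(source changed since the
# last crawl) = 1 - exp(-rate * interval). Illustrative only.
import numpy as np

def p_stale(change_rate, crawl_interval):
    """Probability the source changed since it was last crawled."""
    return 1.0 - np.exp(-change_rate * crawl_interval)

# A fast-changing page crawled daily vs. a slow page crawled weekly.
print(round(p_stale(change_rate=2.0, crawl_interval=1.0), 3))  # 0.865
print(round(p_stale(change_rate=0.1, crawl_interval=7.0), 3))  # 0.503
```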

STGAT: Modeling Spatial-Temporal Interactions for Human Trajectory Prediction

Title STGAT: Modeling Spatial-Temporal Interactions for Human Trajectory Prediction
Authors Yingfan Huang, Huikun Bi, Zhaoxin Li, Tianlu Mao, Zhaoqi Wang
Abstract Human trajectory prediction is challenging and critical in various applications (e.g., autonomous vehicles and social robots). Because pedestrian movement is continuous and anticipatory, pedestrians in crowded spaces consider both spatial and temporal interactions to avoid future collisions. However, most existing methods ignore the temporal correlations of interactions with other pedestrians involved in a scene. In this work, we propose a Spatial-Temporal Graph Attention network (STGAT), based on a sequence-to-sequence architecture, to predict future trajectories of pedestrians. Besides the spatial interactions captured by the graph attention mechanism at each time-step, we adopt an extra LSTM to encode the temporal correlations of interactions. Through comparisons with state-of-the-art methods, our model achieves superior performance on two publicly available crowd datasets (ETH and UCY) and produces more “socially” plausible trajectories for pedestrians.
Tasks Autonomous Vehicles, Trajectory Prediction
Published 2019-10-01
URL http://openaccess.thecvf.com/content_ICCV_2019/html/Huang_STGAT_Modeling_Spatial-Temporal_Interactions_for_Human_Trajectory_Prediction_ICCV_2019_paper.html
PDF http://openaccess.thecvf.com/content_ICCV_2019/papers/Huang_STGAT_Modeling_Spatial-Temporal_Interactions_for_Human_Trajectory_Prediction_ICCV_2019_paper.pdf
PWC https://paperswithcode.com/paper/stgat-modeling-spatial-temporal-interactions
Repo https://github.com/huang-xx/STGAT
Framework pytorch
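
A minimal sketch of the per-time-step spatial module in the spirit of STGAT: GAT-style attention over the pedestrians present in a scene. The dimensions and scoring function are ours, not the paper's exact architecture (see the linked repo for the real one).

```python
# GAT-style spatial attention over pedestrians at one time-step (sketch).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpatialAttention(nn.Module):
    def __init__(self, in_dim=32, out_dim=32):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)
        self.a = nn.Linear(2 * out_dim, 1, bias=False)  # pairwise scorer

    def forward(self, h):                 # h: (N, in_dim) for N pedestrians
        z = self.W(h)                     # (N, out_dim)
        N = z.size(0)
        pairs = torch.cat([z.unsqueeze(1).expand(N, N, -1),
                           z.unsqueeze(0).expand(N, N, -1)], dim=-1)
        e = F.leaky_relu(self.a(pairs).squeeze(-1))     # (N, N) scores
        alpha = torch.softmax(e, dim=-1)                # attention weights
        return alpha @ z                  # each pedestrian attends to all others

h = torch.randn(6, 32)                    # hidden states of 6 pedestrians
out = SpatialAttention()(h)               # (6, 32) interaction-aware features
# STGAT then feeds the per-step outputs into an extra LSTM to capture the
# temporal correlation of these interactions.
print(out.shape)
```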

Deep Structured Prediction for Facial Landmark Detection

Title Deep Structured Prediction for Facial Landmark Detection
Authors Lisha Chen, Hui Su, Qiang Ji
Abstract Existing deep learning based facial landmark detection methods have achieved excellent performance. These methods, however, do not explicitly embed the structural dependencies among landmark points. They hence cannot preserve the geometric relationships between landmark points or generalize well to challenging conditions or unseen data. This paper proposes a method for deep structured facial landmark detection based on combining a deep Convolutional Network with a Conditional Random Field. We demonstrate its superior performance to existing state-of-the-art techniques in facial landmark detection, especially a better generalization ability on challenging datasets that include large pose and occlusion.
Tasks Facial Landmark Detection, Structured Prediction
Published 2019-12-01
URL http://papers.nips.cc/paper/8515-deep-structured-prediction-for-facial-landmark-detection
PDF http://papers.nips.cc/paper/8515-deep-structured-prediction-for-facial-landmark-detection.pdf
PWC https://paperswithcode.com/paper/deep-structured-prediction-for-facial
Repo https://github.com/lisha-chen/Deep-structured-facial-landmark-detection
Framework none

Ghost-free multi exposure image fusion technique using dense SIFT descriptor and guided filter

Title Ghost-free multi exposure image fusion technique using dense SIFT descriptor and guided filter
Authors Naila Hayat, Muhammad Imran
Abstract A ghost-free multi-exposure image fusion technique using the dense SIFT descriptor and the guided filter is proposed in this paper. The results suggest that the presented scheme produces high-quality images using ordinary cameras, without ghosting artifacts. To do so, the dense SIFT descriptor is used to extract local contrast information from the source images, while for dynamic scenes, histogram equalization and median filtering are used to compute a color dissimilarity feature. Three weighting terms (local contrast, brightness, and color dissimilarity) are used to estimate the initial weights. Since the estimated initial weights contain discontinuities, the guided filter is used to remove the noise and discontinuities in them. Finally, the fusion is performed using a pyramid decomposition method. Experimental results demonstrate the superiority of the proposed technique over existing state-of-the-art methods in terms of both subjective and objective evaluation.
Tasks
Published 2019-07-01
URL https://www.sciencedirect.com/science/article/pii/S1047320319301750
PDF https://www.sciencedirect.com/science/article/pii/S1047320319301750
PWC https://paperswithcode.com/paper/ghost-free-multi-exposure-image-fusion
Repo https://github.com/ImranNust/Source-Code
Framework none
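
A hedged sketch of the weighting pipeline for the static-scene case: per-exposure initial weights from local contrast and well-exposedness, then guided-filter refinement. This is a simplified reading, not the authors' code; the dense SIFT and color-dissimilarity terms for dynamic scenes are omitted, and cv2.ximgproc requires opencv-contrib-python.

```python
# Simplified multi-exposure weight estimation + guided-filter refinement.
import cv2
import numpy as np

def initial_weights(stack):          # stack: float32 gray images in [0, 1]
    weights = []
    for img in stack:
        contrast = np.abs(cv2.Laplacian(img, cv2.CV_32F))        # local contrast
        exposedness = np.exp(-((img - 0.5) ** 2) / (2 * 0.2**2)) # well-exposedness
        weights.append(contrast * exposedness + 1e-12)
    w = np.stack(weights)
    return w / w.sum(axis=0)          # normalize across exposures

def refine(weights, stack, radius=8, eps=1e-3):
    # Guided filtering removes noise/discontinuities in the initial
    # weights, using each source image as its own guide.
    return [cv2.ximgproc.guidedFilter(g, w, radius, eps)
            for g, w in zip(stack, weights)]

# Synthetic stand-in stack of three exposures; the final fusion step
# would blend stack and weights with a Laplacian-pyramid decomposition.
rng = np.random.default_rng(0)
base = rng.random((64, 64)).astype(np.float32)
stack = [np.clip(base * s, 0, 1).astype(np.float32) for s in (0.3, 0.7, 1.0)]
w = refine(list(initial_weights(stack)), stack)
print(len(w), w[0].shape)
```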

Neural Shuffle-Exchange Networks - Sequence Processing in O(n log n) Time

Title Neural Shuffle-Exchange Networks - Sequence Processing in O(n log n) Time
Authors Karlis Freivalds, Emīls Ozoliņš, Agris Šostaks
Abstract A key requirement in sequence-to-sequence processing is the modeling of long-range dependencies. To this end, the vast majority of state-of-the-art models use an attention mechanism, which has O(n^2) complexity and leads to slow execution for long sequences. We introduce a new Shuffle-Exchange neural network model for sequence-to-sequence tasks which has O(log n) depth and O(n log n) total complexity. We show that this model is powerful enough to infer efficient algorithms for common algorithmic benchmarks including sorting, addition and multiplication. We evaluate our architecture on the challenging LAMBADA question answering dataset and compare it with state-of-the-art models that use attention. Our model achieves competitive accuracy and scales to sequences with more than a hundred thousand elements. We are confident that the proposed model has the potential for building more efficient architectures for processing large interrelated data in language modeling, music generation and other application domains.
Tasks Language Modelling, Music Generation, Question Answering
Published 2019-12-01
URL http://papers.nips.cc/paper/8889-neural-shuffle-exchange-networks-sequence-processing-in-on-log-n-time
PDF http://papers.nips.cc/paper/8889-neural-shuffle-exchange-networks-sequence-processing-in-on-log-n-time.pdf
PWC https://paperswithcode.com/paper/neural-shuffle-exchange-networks-sequence-1
Repo https://github.com/LUMII-Syslab/shuffle-exchange
Framework tf
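
The "shuffle" here is the classic perfect-shuffle permutation, a one-bit rotation of the index; a quick sketch of the index map for sequences of length 2^k:

```python
# Perfect-shuffle permutation: rotate the k-bit index left by one bit.
def riffle(i, k):
    return ((i << 1) & (2**k - 1)) | (i >> (k - 1))

k = 3
print([riffle(i, k) for i in range(2**k)])   # [0, 2, 4, 6, 1, 3, 5, 7]
```

Applying the shuffle k times rotates the index bits all the way around and returns every element home; interleaving k shuffle layers with learned two-element "switch" (exchange) units gives every input a log-depth path to every output, which is the Omega/Beneš-network connectivity behind the O(n log n) bound.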

BERT is Not an Interlingua and the Bias of Tokenization

Title BERT is Not an Interlingua and the Bias of Tokenization
Authors Jasdeep Singh, Bryan McCann, Richard Socher, Caiming Xiong
Abstract Multilingual transfer learning can benefit both high- and low-resource languages, but the source of these improvements is not well understood. Canonical Correlation Analysis (CCA) of the internal representations of a pre-trained, multilingual BERT model reveals that the model partitions representations for each language rather than using a common, shared, interlingual space. This effect is magnified at deeper layers, suggesting that the model does not progressively abstract semantic content while disregarding languages. Hierarchical clustering based on the CCA similarity scores between languages reveals a tree structure that mirrors the phylogenetic trees hand-designed by linguists. The subword tokenization employed by BERT provides a stronger bias towards such structure than character- and word-level tokenizations. We release a subset of the XNLI dataset translated into an additional 14 languages at https://www.github.com/salesforce/xnli_extension to assist further research into multilingual representations.
Tasks Tokenization, Transfer Learning
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-6106/
PDF https://www.aclweb.org/anthology/D19-6106
PWC https://paperswithcode.com/paper/bert-is-not-an-interlingua-and-the-bias-of
Repo https://github.com/salesforce/xnli_extension
Framework none
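
A sketch of the CCA-based similarity analysis: given layer activations for parallel sentences in two languages, the mean canonical correlation measures how shared the representation space is. The random arrays below are stand-ins for real per-sentence BERT activations, and the analysis details differ from the paper's.

```python
# Cross-lingual representation similarity via CCA (sketch).
import numpy as np
from sklearn.cross_decomposition import CCA

n_sent, dim, k = 500, 64, 10
rng = np.random.default_rng(0)
H_en = rng.standard_normal((n_sent, dim))   # layer-l activations, English
H_de = rng.standard_normal((n_sent, dim))   # same sentences, German

cca = CCA(n_components=k).fit(H_en, H_de)
U, V = cca.transform(H_en, H_de)
corrs = [np.corrcoef(U[:, i], V[:, i])[0, 1] for i in range(k)]
similarity = float(np.mean(corrs))          # mean canonical correlation
# High similarity would suggest a shared interlingual space; the paper
# finds low cross-language similarity, i.e. partitioned representations.
print(round(similarity, 3))
```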

Wide-Context Semantic Image Extrapolation

Title Wide-Context Semantic Image Extrapolation
Authors Yi Wang, Xin Tao, Xiaoyong Shen, Jiaya Jia
Abstract This paper studies the fundamental problem of extrapolating visual context using deep generative models, i.e., extending image borders with plausible structure and details. This seemingly easy task actually faces many crucial technical challenges and has its unique properties. The two major issues are size expansion and one-side constraints. We propose a semantic regeneration network with several special contributions and use multiple spatial related losses to address these issues. Our results contain consistent structures and high-quality textures. Extensive experiments are conducted on various possible alternatives and related methods. We also explore the potential of our method for various interesting applications that can benefit research in a variety of fields.
Tasks Image Inpainting, Image Outpainting
Published 2019-06-01
URL http://openaccess.thecvf.com/content_CVPR_2019/html/Wang_Wide-Context_Semantic_Image_Extrapolation_CVPR_2019_paper.html
PDF http://openaccess.thecvf.com/content_CVPR_2019/papers/Wang_Wide-Context_Semantic_Image_Extrapolation_CVPR_2019_paper.pdf
PWC https://paperswithcode.com/paper/wide-context-semantic-image-extrapolation
Repo https://github.com/shepnerd/outpainting_srn
Framework tf

A Local Block Coordinate Descent Algorithm for the CSC Model

Title A Local Block Coordinate Descent Algorithm for the CSC Model
Authors Ev Zisselman, Jeremias Sulam, Michael Elad
Abstract The Convolutional Sparse Coding (CSC) model has recently gained considerable traction in the signal and image processing communities. By providing a global, yet tractable, model that operates on the whole image, the CSC was shown to overcome several limitations of the patch-based sparse model while achieving superior performance in various applications. Contemporary methods for sparse pursuit and for learning the CSC dictionary often rely on the Alternating Direction Method of Multipliers (ADMM) in the Fourier domain, for the computational convenience of convolutions, while ignoring the local characterization of the image. In this work we propose a new and simple approach that adopts a localized strategy, based on the Block Coordinate Descent algorithm. The proposed method, termed Local Block Coordinate Descent (LoBCoD), operates locally on image patches. Furthermore, we introduce a novel stochastic gradient descent version of LoBCoD for training the convolutional filters. This Stochastic-LoBCoD leverages the benefits of online learning, while being applicable even to a single training image. We demonstrate the advantages of the proposed algorithms for image inpainting and multi-focus image fusion, achieving state-of-the-art results.
Tasks Image Inpainting
Published 2019-06-01
URL http://openaccess.thecvf.com/content_CVPR_2019/html/Zisselman_A_Local_Block_Coordinate_Descent_Algorithm_for_the_CSC_Model_CVPR_2019_paper.html
PDF http://openaccess.thecvf.com/content_CVPR_2019/papers/Zisselman_A_Local_Block_Coordinate_Descent_Algorithm_for_the_CSC_Model_CVPR_2019_paper.pdf
PWC https://paperswithcode.com/paper/a-local-block-coordinate-descent-algorithm-1
Repo https://github.com/EvZissel/LoBCoD
Framework none
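
As a hedged illustration of the locality that LoBCoD exploits, here is one proximal (ISTA-style) update of a single patch's sparse code under a local dictionary. This shows the flavor of a local block-coordinate step, not the authors' exact algorithm (see the linked repo for that).

```python
# One local sparse-coding block update: minimize
# 0.5*||y - D a||^2 + lam*||a||_1 over a patch's code a.
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def local_step(D, y, a, lam):
    """One proximal-gradient update of the patch's sparse code."""
    L = np.linalg.norm(D, ord=2) ** 2        # Lipschitz constant of the gradient
    grad = D.T @ (D @ a - y)
    return soft_threshold(a - grad / L, lam / L)

rng = np.random.default_rng(0)
D = rng.standard_normal((16, 32))            # local dictionary (patch x atoms)
y = rng.standard_normal(16)                  # one image patch
a = np.zeros(32)
for _ in range(50):                          # iterate the local updates
    a = local_step(D, y, a, lam=0.5)
print("nonzeros in the code:", int((a != 0).sum()))
```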

Neural Taskonomy: Inferring the Similarity of Task-Derived Representations from Brain Activity

Title Neural Taskonomy: Inferring the Similarity of Task-Derived Representations from Brain Activity
Authors Yuan Wang, Michael Tarr, Leila Wehbe
Abstract Convolutional neural networks (CNNs) trained for object classification have been widely used to account for visually-driven neural responses in both human and primate brains. However, because of the generality and complexity of object classification, despite the effectiveness of CNNs in predicting brain activity, it is difficult to draw specific inferences about neural information processing using CNN-derived representations. To address this problem, we used learned representations drawn from 21 computer vision tasks to construct encoding models for predicting brain responses from BOLD5000—a large-scale dataset comprising fMRI scans collected while observers viewed over 5000 naturalistic scene and object images. Encoding models based on task features predict activity in different regions across the whole brain. Features from 3D tasks such as keypoint/edge detection explain greater variance compared to 2D tasks—a pattern observed across the whole brain. Using results across all 21 task representations, we constructed a “task graph” based on the spatial layout of well-predicted brain areas from each task. A comparison of this brain-derived task structure to the task structure derived from transfer learning accuracy demonstrates that tasks with higher transferability make similar predictions for brain responses from different regions. These results—arising out of state-of-the-art computer vision methods—help reveal the task-specific architecture of the human visual system.
Tasks Edge Detection, Object Classification, Transfer Learning
Published 2019-12-01
URL http://papers.nips.cc/paper/9683-neural-taskonomy-inferring-the-similarity-of-task-derived-representations-from-brain-activity
PDF http://papers.nips.cc/paper/9683-neural-taskonomy-inferring-the-similarity-of-task-derived-representations-from-brain-activity.pdf
PWC https://paperswithcode.com/paper/neural-taskonomy-inferring-the-similarity-of
Repo https://github.com/ariaaay/NeuralTaskonomy
Framework pytorch
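
A sketch of the encoding-model recipe in the abstract: ridge-regress per-voxel responses on task-specific image features, then score held-out predictions voxel by voxel. The arrays below are random stand-ins for BOLD5000 betas and taskonomy features, and the variable names are ours.

```python
# Voxelwise encoding model: task features -> brain responses (sketch).
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import train_test_split

n_imgs, n_feat, n_vox = 1000, 128, 50
rng = np.random.default_rng(0)
F = rng.standard_normal((n_imgs, n_feat))    # features from one of 21 tasks
Y = rng.standard_normal((n_imgs, n_vox))     # per-voxel responses to the images

F_tr, F_te, Y_tr, Y_te = train_test_split(F, Y, test_size=0.2, random_state=0)
model = RidgeCV(alphas=np.logspace(-2, 4, 7)).fit(F_tr, Y_tr)
pred = model.predict(F_te)

# Per-voxel prediction quality; comparing this map across the 21 task
# feature spaces is what yields the brain-derived task graph.
r = [np.corrcoef(pred[:, v], Y_te[:, v])[0, 1] for v in range(n_vox)]
print("median voxel correlation:", round(float(np.median(r)), 3))
```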