February 1, 2020

2996 words 15 mins read

Paper Group AWR 106

Tracking without bells and whistles

Title Tracking without bells and whistles
Authors Philipp Bergmann, Tim Meinhardt, Laura Leal-Taixe
Abstract The problem of tracking multiple objects in a video sequence poses several challenging tasks. For tracking-by-detection, these include object re-identification, motion prediction and dealing with occlusions. We present a tracker (without bells and whistles) that accomplishes tracking without specifically targeting any of these tasks; in particular, we perform no training or optimization on tracking data. To this end, we exploit the bounding box regression of an object detector to predict the position of an object in the next frame, thereby converting a detector into a Tracktor. We demonstrate the potential of Tracktor and provide a new state-of-the-art on three multi-object tracking benchmarks by extending it with straightforward re-identification and camera motion compensation. We then analyze the performance and failure cases of several state-of-the-art tracking methods in comparison to our Tracktor. Surprisingly, none of the dedicated tracking methods is considerably better at dealing with complex tracking scenarios, namely small and occluded objects or missing detections. However, our approach tackles most of the easy tracking scenarios. Therefore, we motivate our approach as a new tracking paradigm and point out promising future research directions. Overall, Tracktor yields tracking performance superior to that of any current tracking method, and our analysis exposes remaining, unsolved tracking challenges to inspire future research directions.
Tasks Motion Compensation, motion prediction, Multi-Object Tracking, Object Tracking
Published 2019-03-13
URL https://arxiv.org/abs/1903.05625v3
PDF https://arxiv.org/pdf/1903.05625v3.pdf
PWC https://paperswithcode.com/paper/tracking-without-bells-and-whistles
Repo https://github.com/zhanxinrui/tracking_wo_bnw_fork
Framework pytorch
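
The core trick, converting a detector into a tracker by regressing the previous frame's boxes onto the current frame, is compact enough to sketch. The snippet below is a minimal illustration, with a hypothetical `regress_and_score` stub standing in for a two-stage detector's RoI head (the linked repo uses Faster R-CNN for this):

```python
import torch

def regress_and_score(frame: torch.Tensor, boxes: torch.Tensor):
    """Hypothetical detector hook: refine `boxes` on `frame` and score them.
    A stub here; a real implementation would call the RoI head of a
    two-stage detector such as Faster R-CNN."""
    return boxes, torch.full((boxes.shape[0],), 0.9)

def tracktor_step(frame, active_tracks, sigma_active=0.5):
    """One tracking step: regress each previous box onto the new frame and
    kill tracks whose classification score falls below the threshold."""
    if not active_tracks:
        return []
    prev_boxes = torch.stack([t["box"] for t in active_tracks])
    new_boxes, scores = regress_and_score(frame, prev_boxes)
    return [{"id": t["id"], "box": b}
            for t, b, s in zip(active_tracks, new_boxes, scores)
            if s >= sigma_active]
```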

Neural Collective Entity Linking Based on Recurrent Random Walk Network Learning

Title Neural Collective Entity Linking Based on Recurrent Random Walk Network Learning
Authors Mengge Xue, Weiming Cai, Jinsong Su, Linfeng Song, Yubin Ge, Yubao Liu, Bin Wang
Abstract Benefiting from the excellent ability of neural networks to learn semantic representations, existing studies on entity linking (EL) have resorted to neural networks to exploit both the local mention-to-entity compatibility and the global interdependence between different EL decisions for target entity disambiguation. However, most neural collective EL methods depend entirely upon neural networks to automatically model the semantic dependencies between different EL decisions, and thus lack guidance from external knowledge. In this paper, we propose a novel end-to-end neural network with recurrent random-walk layers for collective EL, which introduces external knowledge to model the semantic interdependence between different EL decisions. Specifically, we first establish a model based on local context features, and then stack random-walk layers to reinforce the evidence for related EL decisions into high-probability decisions, where the semantic interdependence between candidate entities is mainly induced from an external knowledge base. Finally, a semantic regularizer that preserves the consistency of collective EL decisions is incorporated into the conventional objective function, so that the external knowledge base can be fully exploited in collective EL decisions. Experimental results and in-depth analysis on various datasets show that our model achieves better performance than other state-of-the-art models. Our code and data are released at \url{https://github.com/DeepLearnXMU/RRWEL}.
Tasks Entity Disambiguation, Entity Linking, Learning Semantic Representations
Published 2019-06-20
URL https://arxiv.org/abs/1906.09320v1
PDF https://arxiv.org/pdf/1906.09320v1.pdf
PWC https://paperswithcode.com/paper/neural-collective-entity-linking-based-on
Repo https://github.com/DeepLearnXMU/RRWEL
Framework pytorch
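
A rough sketch of the stacked random-walk propagation described above, assuming the transition matrix `T` has already been derived from knowledge-base entity relatedness (the interpolation form and names are illustrative, not the paper's exact formulation):

```python
import torch

def random_walk_layers(p0: torch.Tensor, T: torch.Tensor,
                       lam: float = 0.2, n_layers: int = 3) -> torch.Tensor:
    """Stack random-walk layers over local EL decisions.
    p0: (n_mentions, n_candidates) probabilities from the local model.
    T:  (n_candidates, n_candidates) row-normalized entity relatedness
        induced from an external knowledge base.
    Each layer mixes propagated evidence back with the local prior."""
    p = p0
    for _ in range(n_layers):
        p = (1 - lam) * p @ T + lam * p0
    return p
```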

IoU-uniform R-CNN: Breaking Through the Limitations of RPN

Title IoU-uniform R-CNN: Breaking Through the Limitations of RPN
Authors Li Zhu, Zihao Xie, Liman Liu, Bo Tao, Wenbing Tao
Abstract The Region Proposal Network (RPN) is the cornerstone of two-stage object detectors; it generates a sparse set of object proposals and alleviates the extreme foreground-background class imbalance problem during training. However, we find that the potential of the detector has not been fully exploited, due to the IoU distribution imbalance and the inadequate quantity of the training samples generated by RPN. With increasing intersection over union (IoU), the exponentially smaller number of positive samples skews the distribution towards lower IoUs, which hinders the optimization of the detector at high IoU levels. In this paper, to break through the limitations of RPN, we propose IoU-Uniform R-CNN, a simple but effective method that directly generates training samples with a uniform IoU distribution for the regression branch as well as the IoU prediction branch. Besides, we improve the performance of the IoU prediction branch by eliminating the feature offsets of RoIs at inference, which helps the NMS procedure preserve accurately localized bounding boxes. Extensive experiments on the PASCAL VOC and MS COCO datasets show the effectiveness of our method, as well as its compatibility and adaptivity to many object detection architectures. The code is made publicly available at https://github.com/zl1994/IoU-Uniform-R-CNN.
Tasks Object Detection
Published 2019-12-11
URL https://arxiv.org/abs/1912.05190v1
PDF https://arxiv.org/pdf/1912.05190v1.pdf
PWC https://paperswithcode.com/paper/iou-uniform-r-cnn-breaking-through-the
Repo https://github.com/zl1994/IoU-Uniform-R-CNN
Framework pytorch
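
One way to realize the uniform-IoU sampling described above is rejection sampling: jitter each ground-truth box and keep candidates until every IoU bin is filled. This is a simplified sketch; the paper's exact generation procedure may differ:

```python
import torch

def box_iou(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    """IoU between two (4,) boxes in (x1, y1, x2, y2) format."""
    x1, y1 = torch.max(a[0], b[0]), torch.max(a[1], b[1])
    x2, y2 = torch.min(a[2], b[2]), torch.min(a[3], b[3])
    inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)
    area = lambda t: (t[2] - t[0]) * (t[3] - t[1])
    return inter / (area(a) + area(b) - inter)

def uniform_iou_samples(gt: torch.Tensor, per_bin: int = 8,
                        bins=((0.5, 0.6), (0.6, 0.7), (0.7, 0.8), (0.8, 0.9)),
                        max_tries: int = 2000):
    """Jitter the GT box until each IoU bin holds `per_bin` regression samples."""
    w, h = gt[2] - gt[0], gt[3] - gt[1]
    scale = torch.tensor([w, h, w, h])
    samples = {b: [] for b in bins}
    for _ in range(max_tries):
        cand = gt + torch.randn(4) * 0.15 * scale  # random box perturbation
        iou = box_iou(gt, cand).item()
        for b in bins:
            if b[0] <= iou < b[1] and len(samples[b]) < per_bin:
                samples[b].append(cand)
    return samples
```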

MoGA: Searching Beyond MobileNetV3

Title MoGA: Searching Beyond MobileNetV3
Authors Xiangxiang Chu, Bo Zhang, Ruijun Xu
Abstract The evolution of MobileNets has laid a solid foundation for neural network applications on mobile devices. With the latest MobileNetV3, neural architecture search again claimed its supremacy in network design. Unfortunately, to date, mobile methods have mainly focused on CPU latency instead of GPU latency, although the latter is much preferred in practice for its faster speed, lower overhead, and less interference. Bearing the target hardware in mind, we propose the first Mobile GPU-Aware (MoGA) neural architecture search, precisely tailored for real-world applications. Further, the ultimate objective in devising a mobile network is to achieve better performance while maximizing the utilization of bounded resources. Pushing for higher capability while restraining time consumption creates an inherent tension, which we alleviate with weighted evolution techniques. Moreover, we encourage increasing the number of parameters for higher representational power. With 200x fewer GPU days than MnasNet, we obtain a series of models that outperform MobileNetV3 under similar latency constraints: MoGA-A achieves 75.9% top-1 accuracy on ImageNet, and MoGA-B reaches 75.5% while costing only 0.5 ms more on a mobile GPU. MoGA-C best attests GPU-awareness by reaching 75.3% while being slower on CPU but faster on GPU. The models and test code are made available at https://github.com/xiaomi-automl/MoGA.
Tasks AutoML, Image Classification, Neural Architecture Search
Published 2019-08-04
URL https://arxiv.org/abs/1908.01314v4
PDF https://arxiv.org/pdf/1908.01314v4.pdf
PWC https://paperswithcode.com/paper/moga-searching-beyond-mobilenetv3
Repo https://github.com/xiaomi-automl/MoGA
Framework pytorch
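
The weighted-evolution trade-off can be illustrated with a toy fitness function that rewards accuracy and parameter count while penalizing measured mobile-GPU latency; the functional form and weights below are assumptions, not the paper's objective:

```python
def moga_fitness(acc: float, gpu_latency_ms: float, params_m: float,
                 w_lat: float = 0.02, w_par: float = 0.01) -> float:
    """Toy multi-objective fitness: maximize accuracy, penalize mobile-GPU
    latency, and mildly reward parameter count (representational power)."""
    return acc - w_lat * gpu_latency_ms + w_par * params_m

# e.g. rank a population of candidate architectures during evolution
population = [
    {"acc": 75.9, "lat": 12.0, "params": 5.1},
    {"acc": 75.5, "lat": 11.5, "params": 3.9},
]
best = max(population, key=lambda c: moga_fitness(c["acc"], c["lat"], c["params"]))
```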

Unsupervised Latent Tree Induction with Deep Inside-Outside Recursive Autoencoders

Title Unsupervised Latent Tree Induction with Deep Inside-Outside Recursive Autoencoders
Authors Andrew Drozdov, Pat Verga, Mohit Yadav, Mohit Iyyer, Andrew McCallum
Abstract We introduce deep inside-outside recursive autoencoders (DIORA), a fully unsupervised method for discovering syntax that simultaneously learns representations for the constituents within the induced tree. Our approach predicts each word in an input sentence conditioned on the rest of the sentence, and uses inside-outside dynamic programming to consider all possible binary trees over the sentence. At test time, the CKY algorithm extracts the highest-scoring parse. DIORA achieves a new state-of-the-art F1 in unsupervised binary constituency parsing (unlabeled) on two benchmark datasets, WSJ and MultiNLI.
Tasks Constituency Parsing
Published 2019-04-03
URL http://arxiv.org/abs/1904.02142v2
PDF http://arxiv.org/pdf/1904.02142v2.pdf
PWC https://paperswithcode.com/paper/unsupervised-latent-tree-induction-with-deep
Repo https://github.com/iesl/diora
Framework pytorch
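
A stripped-down sketch of the inside pass over all binary trees, using a simple MLP as the composition function (DIORA's actual compose and score functions are richer, and it pairs this with a matching outside pass):

```python
import torch
import torch.nn as nn

class InsidePass(nn.Module):
    """Chart parser-style inside pass: the vector for span (i, j) is a
    score-weighted sum, over all split points k, of composed child vectors."""
    def __init__(self, dim: int):
        super().__init__()
        self.compose = nn.Sequential(nn.Linear(2 * dim, dim), nn.Tanh())
        self.score = nn.Linear(dim, 1)

    def forward(self, words: torch.Tensor):  # (seq_len, dim) word vectors
        n = words.shape[0]
        chart = {(i, i): words[i] for i in range(n)}
        for length in range(2, n + 1):
            for i in range(n - length + 1):
                j = i + length - 1
                cands = torch.stack([
                    self.compose(torch.cat([chart[(i, k)], chart[(k + 1, j)]]))
                    for k in range(i, j)  # every binary split of (i, j)
                ])
                w = torch.softmax(self.score(cands).squeeze(-1), dim=0)
                chart[(i, j)] = (w.unsqueeze(-1) * cands).sum(0)
        return chart[(0, n - 1)]  # root representation
```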

BERT has a Mouth, and It Must Speak: BERT as a Markov Random Field Language Model

Title BERT has a Mouth, and It Must Speak: BERT as a Markov Random Field Language Model
Authors Alex Wang, Kyunghyun Cho
Abstract We show that BERT (Devlin et al., 2018) is a Markov random field language model. This formulation gives way to a natural procedure to sample sentences from BERT. We generate from BERT and find that it can produce high-quality, fluent generations. Compared to the generations of a traditional left-to-right language model, BERT generates sentences that are more diverse but of slightly worse quality.
Tasks Language Modelling
Published 2019-02-11
URL http://arxiv.org/abs/1902.04094v2
PDF http://arxiv.org/pdf/1902.04094v2.pdf
PWC https://paperswithcode.com/paper/bert-has-a-mouth-and-it-must-speak-bert-as-a
Repo https://github.com/nyu-dl/bert-gen
Framework pytorch
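
The sampling procedure implied by the MRF view is essentially Gibbs sampling: repeatedly mask a position and resample it from BERT's conditional distribution. A minimal version using the Hugging Face transformers API (the `.logits` attribute assumes a recent library version; the authors' own implementation is in the linked repo):

```python
import torch
from transformers import BertTokenizer, BertForMaskedLM

tok = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased").eval()

@torch.no_grad()
def gibbs_step(ids: torch.Tensor) -> torch.Tensor:
    """Mask one random (non-special) position and resample it from BERT."""
    pos = torch.randint(1, ids.shape[1] - 1, (1,)).item()  # skip [CLS]/[SEP]
    masked = ids.clone()
    masked[0, pos] = tok.mask_token_id
    logits = model(input_ids=masked).logits[0, pos]
    ids[0, pos] = torch.multinomial(torch.softmax(logits, dim=-1), 1).item()
    return ids

# start from an arbitrary seed sentence and iterate the chain
ids = tok("the " * 10, return_tensors="pt").input_ids
for _ in range(200):
    ids = gibbs_step(ids)
print(tok.decode(ids[0], skip_special_tokens=True))
```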

StructureFlow: Image Inpainting via Structure-aware Appearance Flow

Title StructureFlow: Image Inpainting via Structure-aware Appearance Flow
Authors Yurui Ren, Xiaoming Yu, Ruonan Zhang, Thomas H. Li, Shan Liu, Ge Li
Abstract Image inpainting techniques have recently shown significant improvements by using deep neural networks. However, most of them may fail either to reconstruct reasonable structures or to restore fine-grained textures. To solve this problem, in this paper we propose a two-stage model which splits the inpainting task into two parts: structure reconstruction and texture generation. In the first stage, edge-preserved smooth images are employed to train a structure reconstructor which completes the missing structures of the inputs. In the second stage, based on the reconstructed structures, a texture generator using appearance flow is designed to yield image details. Experiments on multiple publicly available datasets show the superior performance of the proposed network.
Tasks Image Inpainting, Texture Synthesis
Published 2019-08-11
URL https://arxiv.org/abs/1908.03852v1
PDF https://arxiv.org/pdf/1908.03852v1.pdf
PWC https://paperswithcode.com/paper/structureflow-image-inpainting-via-structure
Repo https://github.com/RenYurui/StructureFlow
Framework pytorch
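
The two-stage split can be summarized as a pipeline; `structure_net` and `texture_net` below are placeholder modules (the real networks use appearance-flow warping, which this sketch omits):

```python
import torch
import torch.nn as nn

class TwoStageInpainter(nn.Module):
    """Stage 1 reconstructs an edge-preserved smooth 'structure' image;
    stage 2 generates texture details conditioned on that structure."""
    def __init__(self, structure_net: nn.Module, texture_net: nn.Module):
        super().__init__()
        self.structure_net = structure_net
        self.texture_net = texture_net

    def forward(self, image: torch.Tensor, mask: torch.Tensor):
        masked = image * (1 - mask)  # zero out the missing region
        structure = self.structure_net(torch.cat([masked, mask], dim=1))
        output = self.texture_net(torch.cat([masked, structure], dim=1))
        return structure, output
```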

Deep Social Collaborative Filtering

Title Deep Social Collaborative Filtering
Authors Wenqi Fan, Yao Ma, Dawei Yin, Jianping Wang, Jiliang Tang, Qing Li
Abstract Recommender systems are crucial to alleviate the information overload problem in online worlds. Most modern recommender systems capture users' preferences towards items via their interactions, based on collaborative filtering techniques. In addition to the user-item interactions, social networks can also provide useful information for understanding users' preferences, as suggested by social theories such as homophily and influence. Recently, deep neural networks have been utilized for social recommendation, leveraging both the user-item interactions and the social network information. However, most of these models cannot take full advantage of the social network information: they only use information from direct neighbors, even though distant neighbors can also provide helpful information. Meanwhile, most of these models treat neighbors' information equally, without conditioning on the specific recommendation; for a specific recommendation case, the information relevant to that item would be most helpful. Besides, most of these models do not explicitly capture neighbors' opinions of items, although different opinions could affect the user differently. In this paper, to address the aforementioned challenges, we propose DSCF, a Deep Social Collaborative Filtering framework, which can exploit social relations in various aspects for recommender systems. Comprehensive experiments on two real-world datasets show the effectiveness of the proposed framework.
Tasks Recommendation Systems
Published 2019-07-16
URL https://arxiv.org/abs/1907.06853v1
PDF https://arxiv.org/pdf/1907.06853v1.pdf
PWC https://paperswithcode.com/paper/deep-social-collaborative-filtering
Repo https://github.com/wenqifan03/GraphRec-WWW19
Framework pytorch

NoduleNet: Decoupled False Positive Reduction for Pulmonary Nodule Detection and Segmentation

Title NoduleNet: Decoupled False Positive Reduction for Pulmonary Nodule Detection and Segmentation
Authors Hao Tang, Chupeng Zhang, Xiaohui Xie
Abstract Pulmonary nodule detection, false positive reduction and segmentation represent three of the most common tasks in the computer-aided analysis of chest CT images. Methods have been proposed for each task, with deep learning based methods heavily favored recently. However, training deep learning models to solve each task separately may be sub-optimal: it is resource intensive and forgoes the benefit of feature sharing. Here, we propose a new end-to-end 3D deep convolutional neural net (DCNN), called NoduleNet, to solve nodule detection, false positive reduction and nodule segmentation jointly in a multi-task fashion. To avoid friction between different tasks and encourage feature diversification, we incorporate two major design tricks: 1) decoupled feature maps for nodule detection and false positive reduction, and 2) a segmentation refinement subnet for increasing the precision of nodule segmentation. Extensive experiments on the large-scale LIDC dataset demonstrate that the multi-task training is highly beneficial, improving nodule detection accuracy by 10.27% compared to a baseline model trained to solve only the nodule detection task. We also carry out systematic ablation studies to highlight the contributions of each of the added components. Code is available at https://github.com/uci-cbcl/NoduleNet.
Tasks
Published 2019-07-25
URL https://arxiv.org/abs/1907.11320v1
PDF https://arxiv.org/pdf/1907.11320v1.pdf
PWC https://paperswithcode.com/paper/nodulenet-decoupled-false-positive
Repo https://github.com/uci-cbcl/NoduleNet
Framework pytorch
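
The two design tricks can be pictured as a shared backbone with decoupled necks and heads; all submodules below are placeholders, and a real implementation would route detected proposals into the false-positive-reduction and segmentation heads:

```python
import torch.nn as nn

class NoduleNetSketch(nn.Module):
    """Schematic of the decoupled multi-task layout, not the authors' net."""
    def __init__(self, backbone, det_neck, fpr_neck, det_head, fpr_head, seg_head):
        super().__init__()
        self.backbone = backbone                            # shared 3D features
        self.det_neck, self.det_head = det_neck, det_head   # nodule detection
        self.fpr_neck, self.fpr_head = fpr_neck, fpr_head   # decoupled FP reduction
        self.seg_head = seg_head                            # segmentation refinement

    def forward(self, volume):
        feats = self.backbone(volume)
        proposals = self.det_head(self.det_neck(feats))
        keep_scores = self.fpr_head(self.fpr_neck(feats))   # separate feature map
        masks = self.seg_head(feats)
        return proposals, keep_scores, masks
```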

Text Matters but Speech Influences: A Computational Analysis of Syntactic Ambiguity Resolution

Title Text Matters but Speech Influences: A Computational Analysis of Syntactic Ambiguity Resolution
Authors Won Ik Cho, Jeonghwa Cho, Woo Hyun Kang, Nam Soo Kim
Abstract Analyzing how human beings resolve syntactic ambiguity has long been an issue of interest in the field of linguistics. It is, at the same time, one of the most challenging issues for spoken language understanding (SLU) systems as well. As syntactic ambiguity is intertwined with issues regarding prosody and semantics, the computational approach toward speech intention identification is expected to benefit from observations of the human language processing mechanism. In this regard, we address the task with attentive recurrent neural networks that exploit acoustic and textual features simultaneously, and reveal how the modalities interact with each other to derive sentence meaning. Utilizing a speech corpus recorded on Korean scripts of syntactically ambiguous utterances, we show that co-attention frameworks, namely multi-hop attention and cross-attention, deliver significantly superior performance in disambiguating speech intention. With further analysis, we demonstrate that the computational models reflect the internal relationship between auditory and linguistic processes.
Tasks Spoken Language Understanding
Published 2019-10-21
URL https://arxiv.org/abs/1910.09275v2
PDF https://arxiv.org/pdf/1910.09275v2.pdf
PWC https://paperswithcode.com/paper/disambiguating-speech-intention-via-audio
Repo https://github.com/warnikchow/coaudiotext
Framework tf
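
A minimal cross-attention block over audio and text sequences, using PyTorch's built-in multi-head attention (dimensions, pooling, and class count are assumptions; the authors' TensorFlow implementation is in the linked repo):

```python
import torch
import torch.nn as nn

class CrossModalIntent(nn.Module):
    """Audio attends to text and vice versa; pooled contexts are fused
    for speech-intention classification."""
    def __init__(self, dim: int = 128, heads: int = 4, n_classes: int = 7):
        super().__init__()
        self.a2t = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.t2a = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.cls = nn.Linear(2 * dim, n_classes)

    def forward(self, audio: torch.Tensor, text: torch.Tensor):
        # audio: (B, T_audio, dim), text: (B, T_text, dim)
        a_ctx, _ = self.a2t(audio, text, text)   # audio queries attend to text
        t_ctx, _ = self.t2a(text, audio, audio)  # text queries attend to audio
        fused = torch.cat([a_ctx.mean(dim=1), t_ctx.mean(dim=1)], dim=-1)
        return self.cls(fused)
```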

Torchmeta: A Meta-Learning library for PyTorch

Title Torchmeta: A Meta-Learning library for PyTorch
Authors Tristan Deleu, Tobias Würfl, Mandana Samiei, Joseph Paul Cohen, Yoshua Bengio
Abstract The constant introduction of standardized benchmarks in the literature has helped accelerate the recent advances in meta-learning research. They offer a way to get a fair comparison between different algorithms, and the wide range of datasets available allows full control over the complexity of this evaluation. However, for a large majority of code available online, the data pipeline is often specific to one dataset, and testing on another dataset requires significant rework. We introduce Torchmeta, a library built on top of PyTorch that enables seamless and consistent evaluation of meta-learning algorithms on multiple datasets, by providing data loaders for most of the standard benchmarks in few-shot classification and regression, with a new meta-dataset abstraction. It also features some extensions for PyTorch to simplify the development of models compatible with meta-learning algorithms. The code is available here: https://github.com/tristandeleu/pytorch-meta
Tasks Meta-Learning
Published 2019-09-14
URL https://arxiv.org/abs/1909.06576v1
PDF https://arxiv.org/pdf/1909.06576v1.pdf
PWC https://paperswithcode.com/paper/torchmeta-a-meta-learning-library-for-pytorch
Repo https://github.com/tristandeleu/pytorch-meta
Framework pytorch
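
Typical usage looks like the following, adapted from the library's README (argument names may differ slightly across Torchmeta versions):

```python
from torchmeta.datasets.helpers import omniglot
from torchmeta.utils.data import BatchMetaDataLoader

# 5-way, 5-shot episodes from Omniglot with 15 query examples per class
dataset = omniglot("data", ways=5, shots=5, test_shots=15,
                   meta_train=True, download=True)
dataloader = BatchMetaDataLoader(dataset, batch_size=16, num_workers=4)

for batch in dataloader:
    train_inputs, train_targets = batch["train"]  # support set per task
    test_inputs, test_targets = batch["test"]     # query set per task
    break
```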

Multi-task Learning for Target-dependent Sentiment Classification

Title Multi-task Learning for Target-dependent Sentiment Classification
Authors Divam Gupta, Kushagra Singh, Soumen Chakrabarti, Tanmoy Chakraborty
Abstract Detecting and aggregating sentiments toward people, organizations, and events expressed in unstructured social media have become critical text mining operations. Early systems detected sentiments over whole passages, whereas more recently, target-specific sentiments have been of greater interest. In this paper, we present MTTDSC, a multi-task target-dependent sentiment classification system that is informed by feature representation learnt for the related auxiliary task of passage-level sentiment classification. The auxiliary task uses a gated recurrent unit (GRU) and pools GRU states, followed by an auxiliary fully-connected layer that outputs passage-level predictions. In the main task, these GRUs contribute auxiliary per-token representations over and above word embeddings. The main task has its own, separate GRUs. The auxiliary and main GRUs send their states to a different fully connected layer, trained for the main task. Extensive experiments using two auxiliary datasets and three benchmark datasets (of which one is new, introduced by us) for the main task demonstrate that MTTDSC outperforms state-of-the-art baselines. Using word-level sensitivity analysis, we present anecdotal evidence that prior systems can make incorrect target-specific predictions because they miss sentiments expressed by words independent of target.
Tasks Multi-Task Learning, Sentiment Analysis, Word Embeddings
Published 2019-02-08
URL http://arxiv.org/abs/1902.02930v1
PDF http://arxiv.org/pdf/1902.02930v1.pdf
PWC https://paperswithcode.com/paper/multi-task-learning-for-target-dependent
Repo https://github.com/16631140828/Paper-list
Framework none
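
A condensed sketch of the described architecture: an auxiliary sentiment GRU supplies per-token features that are concatenated with word embeddings and fed to the main target-dependent GRU (layer sizes and mean pooling are placeholders):

```python
import torch
import torch.nn as nn

class MTTDSCSketch(nn.Module):
    def __init__(self, emb_dim: int = 300, hid: int = 128, n_classes: int = 3):
        super().__init__()
        self.aux_gru = nn.GRU(emb_dim, hid, batch_first=True, bidirectional=True)
        self.aux_out = nn.Linear(2 * hid, n_classes)   # passage-level sentiment
        self.main_gru = nn.GRU(emb_dim + 2 * hid, hid, batch_first=True,
                               bidirectional=True)
        self.main_out = nn.Linear(2 * hid, n_classes)  # target-dependent sentiment

    def forward(self, embeds: torch.Tensor):           # (B, T, emb_dim)
        aux_states, _ = self.aux_gru(embeds)
        aux_pred = self.aux_out(aux_states.mean(dim=1))      # pooled GRU states
        main_in = torch.cat([embeds, aux_states], dim=-1)    # extra per-token feats
        main_states, _ = self.main_gru(main_in)
        main_pred = self.main_out(main_states.mean(dim=1))
        return aux_pred, main_pred
```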

Deep Learning for Symbolic Mathematics

Title Deep Learning for Symbolic Mathematics
Authors Guillaume Lample, François Charton
Abstract Neural networks have a reputation for being better at solving statistical or approximate problems than at performing calculations or working with symbolic data. In this paper, we show that they can be surprisingly good at more elaborate tasks in mathematics, such as symbolic integration and solving differential equations. We propose a syntax for representing mathematical problems, and methods for generating large datasets that can be used to train sequence-to-sequence models. We achieve results that outperform commercial Computer Algebra Systems such as Matlab or Mathematica.
Tasks
Published 2019-12-02
URL https://arxiv.org/abs/1912.01412v1
PDF https://arxiv.org/pdf/1912.01412v1.pdf
PWC https://paperswithcode.com/paper/deep-learning-for-symbolic-mathematics-1
Repo https://github.com/janeyoung2018/symbolic-math
Framework none
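
The proposed syntax serializes expression trees into prefix (Polish) token sequences that sequence-to-sequence models consume directly. A minimal version of that encoding, with an assumed tuple-based tree format:

```python
def to_prefix(node) -> list:
    """Serialize an expression tree to prefix tokens.
    node: a leaf string, or a tuple ('op', child, ...)."""
    if isinstance(node, str):
        return [node]
    op, *children = node
    tokens = [op]
    for child in children:
        tokens += to_prefix(child)
    return tokens

expr = ("add", ("pow", "x", "2"), ("cos", "x"))  # x^2 + cos(x)
print(to_prefix(expr))  # ['add', 'pow', 'x', '2', 'cos', 'x']
```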

VERIFAI: A Toolkit for the Design and Analysis of Artificial Intelligence-Based Systems

Title VERIFAI: A Toolkit for the Design and Analysis of Artificial Intelligence-Based Systems
Authors Tommaso Dreossi, Daniel J. Fremont, Shromona Ghosh, Edward Kim, Hadi Ravanbakhsh, Marcell Vazquez-Chanlatte, Sanjit A. Seshia
Abstract We present VERIFAI, a software toolkit for the formal design and analysis of systems that include artificial intelligence (AI) and machine learning (ML) components. VERIFAI particularly seeks to address challenges with applying formal methods to perception and ML components, including those based on neural networks, and to model and analyze system behavior in the presence of environment uncertainty. We describe the initial version of VERIFAI which centers on simulation guided by formal models and specifications. Several use cases are illustrated with examples, including temporal-logic falsification, model-based systematic fuzz testing, parameter synthesis, counterexample analysis, and data set augmentation.
Tasks
Published 2019-02-12
URL http://arxiv.org/abs/1902.04245v2
PDF http://arxiv.org/pdf/1902.04245v2.pdf
PWC https://paperswithcode.com/paper/verifai-a-toolkit-for-the-design-and-analysis
Repo https://github.com/BerkeleyLearnVerify/VerifAI
Framework tf
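
The simulation-guided falsification loop at the core of such a toolkit can be sketched generically; this is not VERIFAI's actual API, just the shape of the idea (sample an environment, simulate, check a specification's robustness score):

```python
import random

def falsify(sample_env, simulate, spec_score, budget=1000):
    """Search the environment-parameter space for specification violations.
    spec_score(trace) < 0 means the (e.g. temporal-logic) property failed."""
    counterexamples = []
    for _ in range(budget):
        params = sample_env()        # sample an environment configuration
        trace = simulate(params)     # run the AI-based system in simulation
        if spec_score(trace) < 0:
            counterexamples.append(params)
    return counterexamples

# toy example: fuzz a braking scenario over speed and obstacle distance
cexs = falsify(
    sample_env=lambda: {"speed": random.uniform(0, 30),
                        "distance": random.uniform(5, 100)},
    simulate=lambda p: p,                                   # stub simulator
    spec_score=lambda t: t["distance"] - 2.0 * t["speed"],  # toy robustness
)
```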

Communication-efficient distributed SGD with Sketching

Title Communication-efficient distributed SGD with Sketching
Authors Nikita Ivkin, Daniel Rothchild, Enayat Ullah, Vladimir Braverman, Ion Stoica, Raman Arora
Abstract Large-scale distributed training of neural networks is often limited by network bandwidth, wherein the communication time overwhelms the local computation time. Motivated by the success of sketching methods in sub-linear/streaming algorithms, we introduce Sketched SGD, an algorithm for carrying out distributed SGD by communicating sketches instead of full gradients. We show that Sketched SGD has favorable convergence rates on several classes of functions. When considering all communication – both of gradients and of updated model weights – Sketched SGD reduces the amount of communication required compared to other gradient compression methods from $\mathcal{O}(d)$ or $\mathcal{O}(W)$ to $\mathcal{O}(\log d)$, where $d$ is the number of model parameters and $W$ is the number of workers participating in training. We run experiments on a transformer model, an LSTM, and a residual network, demonstrating up to a 40x reduction in total communication cost with no loss in final model performance. We also show experimentally that Sketched SGD scales to at least 256 workers without increasing communication cost or degrading model performance.
Tasks
Published 2019-03-12
URL https://arxiv.org/abs/1903.04488v3
PDF https://arxiv.org/pdf/1903.04488v3.pdf
PWC https://paperswithcode.com/paper/communication-efficient-distributed-sgd-with
Repo https://github.com/sunahhlee/TopHCS
Framework pytorch
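
A minimal Count Sketch for gradient compression, in the spirit of (though far simpler than) the paper's scheme; decoding takes the median of the signed counter estimates, and the table size is fixed regardless of the gradient dimension:

```python
import torch

class CountSketch:
    """Compress a d-dimensional gradient into a small rows x cols table."""
    def __init__(self, d: int, rows: int = 5, cols: int = 10_000, seed: int = 0):
        g = torch.Generator().manual_seed(seed)  # shared seed across workers
        self.idx = torch.randint(0, cols, (rows, d), generator=g)
        self.sign = (torch.randint(0, 2, (rows, d), generator=g) * 2 - 1).float()
        self.rows, self.cols = rows, cols

    def encode(self, grad: torch.Tensor) -> torch.Tensor:
        table = torch.zeros(self.rows, self.cols)
        for r in range(self.rows):
            table[r].index_add_(0, self.idx[r], self.sign[r] * grad)
        return table  # this table, not the full gradient, is communicated

    def decode(self, table: torch.Tensor) -> torch.Tensor:
        est = torch.stack([table[r, self.idx[r]] * self.sign[r]
                           for r in range(self.rows)])
        return est.median(dim=0).values  # robust per-coordinate estimate

sketch = CountSketch(d=100_000)
grad = torch.randn(100_000)
approx = sketch.decode(sketch.encode(grad))  # heavy coordinates recovered well
```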