Paper Group AWR 106
Tracking without bells and whistles. Neural Collective Entity Linking Based on Recurrent Random Walk Network Learning. IoU-uniform R-CNN: Breaking Through the Limitations of RPN. MoGA: Searching Beyond MobileNetV3. Unsupervised Latent Tree Induction with Deep Inside-Outside Recursive Autoencoders. BERT has a Mouth, and It Must Speak: BERT as a Mark …
Tracking without bells and whistles
Title | Tracking without bells and whistles |
Authors | Philipp Bergmann, Tim Meinhardt, Laura Leal-Taixe |
Abstract | The problem of tracking multiple objects in a video sequence poses several challenging tasks. For tracking-by-detection, these include object re-identification, motion prediction and dealing with occlusions. We present a tracker (without bells and whistles) that accomplishes tracking without specifically targeting any of these tasks; in particular, we perform no training or optimization on tracking data. To this end, we exploit the bounding box regression of an object detector to predict the position of an object in the next frame, thereby converting a detector into a Tracktor. We demonstrate the potential of Tracktor and provide a new state-of-the-art on three multi-object tracking benchmarks by extending it with a straightforward re-identification and camera motion compensation. We then perform an analysis on the performance and failure cases of several state-of-the-art tracking methods in comparison to our Tracktor. Surprisingly, none of the dedicated tracking methods are considerably better in dealing with complex tracking scenarios, namely, small and occluded objects or missing detections. However, our approach tackles most of the easy tracking scenarios. Therefore, we motivate our approach as a new tracking paradigm and point out promising future research directions. Overall, Tracktor yields better tracking performance than any current tracking method, and our analysis exposes remaining and unsolved tracking challenges to inspire future research directions. |
Tasks | Motion Compensation, motion prediction, Multi-Object Tracking, Object Tracking |
Published | 2019-03-13 |
URL | https://arxiv.org/abs/1903.05625v3, https://arxiv.org/pdf/1903.05625v3.pdf |
PWC | https://paperswithcode.com/paper/tracking-without-bells-and-whistles |
Repo | https://github.com/zhanxinrui/tracking_wo_bnw_fork |
Framework | pytorch |
Neural Collective Entity Linking Based on Recurrent Random Walk Network Learning
Title | Neural Collective Entity Linking Based on Recurrent Random Walk Network Learning |
Authors | Mengge Xue, Weiming Cai, Jinsong Su, Linfeng Song, Yubin Ge, Yubao Liu, Bin Wang |
Abstract | Benefiting from the excellent ability of neural networks to learn semantic representations, existing studies for entity linking (EL) have resorted to neural networks to exploit both the local mention-to-entity compatibility and the global interdependence between different EL decisions for target entity disambiguation. However, most neural collective EL methods depend entirely upon neural networks to automatically model the semantic dependencies between different EL decisions, and thus lack guidance from external knowledge. In this paper, we propose a novel end-to-end neural network with recurrent random-walk layers for collective EL, which introduces external knowledge to model the semantic interdependence between different EL decisions. Specifically, we first establish a model based on local context features, and then stack random-walk layers to reinforce the evidence for related EL decisions into high-probability decisions, where the semantic interdependence between candidate entities is mainly induced from an external knowledge base. Finally, a semantic regularizer that preserves the consistency of collective EL decisions is incorporated into the conventional objective function, so that the external knowledge base can be fully exploited in collective EL decisions. Experimental results and in-depth analysis on various datasets show that our model achieves better performance than other state-of-the-art models. Our code and data are released at https://github.com/DeepLearnXMU/RRWEL. |
Tasks | Entity Disambiguation, Entity Linking, Learning Semantic Representations |
Published | 2019-06-20 |
URL | https://arxiv.org/abs/1906.09320v1, https://arxiv.org/pdf/1906.09320v1.pdf |
PWC | https://paperswithcode.com/paper/neural-collective-entity-linking-based-on |
Repo | https://github.com/DeepLearnXMU/RRWEL |
Framework | pytorch |
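The random-walk idea in the abstract above can be illustrated with a small sketch. This is not the paper's formulation: the function name `random_walk_layers`, the `restart` mixing weight, and the toy relatedness matrix are all assumptions made for illustration. Each layer lets every candidate entity pass its current score to related candidates, then mixes the result with the original local scores.

```python
def random_walk_layers(local, relatedness, layers=3, restart=0.5):
    """Propagate local candidate scores along a relatedness graph.

    local:       local mention-to-entity scores, one per candidate (hypothetical input)
    relatedness: square matrix of non-negative relatedness weights, e.g. derived
                 from a knowledge base (hypothetical input)
    restart:     weight given to the original local scores at every layer (assumed)
    """
    n = len(local)
    # Row-normalise relatedness into transition probabilities.
    trans = []
    for row in relatedness:
        total = sum(row)
        trans.append([v / total if total else 0.0 for v in row])
    scores = list(local)
    for _ in range(layers):
        # One walk step: candidate j collects score mass from its neighbours.
        walked = [sum(scores[i] * trans[i][j] for i in range(n)) for j in range(n)]
        scores = [restart * l + (1.0 - restart) * w for l, w in zip(local, walked)]
    return scores


# Candidates a1 and a2 compete for one mention; b1 is the sole candidate of another.
# Only a1 is related to b1, so a1 should end up ahead of a2.
local_scores = [0.5, 0.5, 1.0]
relatedness = [[0, 0, 1],
               [0, 0, 0],
               [1, 0, 0]]
result = random_walk_layers(local_scores, relatedness)
```

In this toy case the two competing candidates start with equal local scores, and only the one connected to a confident decision elsewhere in the document gains score.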
IoU-uniform R-CNN: Breaking Through the Limitations of RPN
Title | IoU-uniform R-CNN: Breaking Through the Limitations of RPN |
Authors | Li Zhu, Zihao Xie, Liman Liu, Bo Tao, Wenbing Tao |
Abstract | Region Proposal Network (RPN) is the cornerstone of two-stage object detectors; it generates a sparse set of object proposals and alleviates the extreme foreground-background class imbalance problem during training. However, we find that the potential of the detector has not been fully exploited due to the IoU distribution imbalance and inadequate quantity of the training samples generated by RPN. With the increasing intersection over union (IoU), the exponentially smaller numbers of positive samples would lead to the distribution skewed towards lower IoUs, which hinders the optimization of the detector at high IoU levels. In this paper, to break through the limitations of RPN, we propose IoU-Uniform R-CNN, a simple but effective method that directly generates training samples with uniform IoU distribution for the regression branch as well as the IoU prediction branch. Besides, we improve the performance of the IoU prediction branch by eliminating the feature offsets of RoIs at inference, which helps the NMS procedure by preserving accurately localized bounding boxes. Extensive experiments on the PASCAL VOC and MS COCO datasets show the effectiveness of our method, as well as its compatibility and adaptivity to many object detection architectures. The code is made publicly available at https://github.com/zl1994/IoU-Uniform-R-CNN. |
Tasks | Object Detection |
Published | 2019-12-11 |
URL | https://arxiv.org/abs/1912.05190v1, https://arxiv.org/pdf/1912.05190v1.pdf |
PWC | https://paperswithcode.com/paper/iou-uniform-r-cnn-breaking-through-the |
Repo | https://github.com/zl1994/IoU-Uniform-R-CNN |
Framework | pytorch |
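One way to picture "training samples with uniform IoU distribution" is a rejection sampler that jitters a ground-truth box and keeps candidates until every IoU bin holds the same number of boxes. The sketch below is only that: the function `sample_iou_uniform`, its bin layout, and the jitter scheme are assumptions for illustration, and the paper's actual generation procedure may differ.

```python
import random


def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def sample_iou_uniform(gt, per_bin=4, lo=0.5, n_bins=5,
                       max_jitter=0.4, seed=0, max_tries=100000):
    """Return n_bins lists of boxes, each holding up to per_bin boxes whose
    IoU with gt falls in that bin of [lo, 1.0). All parameters are illustrative."""
    rng = random.Random(seed)
    w, h = gt[2] - gt[0], gt[3] - gt[1]
    width = (1.0 - lo) / n_bins
    bins = [[] for _ in range(n_bins)]
    for _ in range(max_tries):
        if all(len(b) == per_bin for b in bins):
            break
        # Draw a jitter scale first so that small and large offsets are both common.
        s = rng.uniform(0.0, max_jitter)
        box = (gt[0] + rng.uniform(-s, s) * w, gt[1] + rng.uniform(-s, s) * h,
               gt[2] + rng.uniform(-s, s) * w, gt[3] + rng.uniform(-s, s) * h)
        if box[2] <= box[0] or box[3] <= box[1]:
            continue  # degenerate box
        v = iou(box, gt)
        if v < lo or v >= 1.0:
            continue
        k = min(int((v - lo) / width), n_bins - 1)
        if len(bins[k]) < per_bin:
            bins[k].append(box)  # reject once a bin is full
    return bins


gt_box = (10.0, 10.0, 50.0, 40.0)
bins = sample_iou_uniform(gt_box)
```

Because full bins reject further candidates, the high-IoU bins end up as well populated as the low-IoU ones, which is the balance the abstract argues RPN proposals lack.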
MoGA: Searching Beyond MobileNetV3
Title | MoGA: Searching Beyond MobileNetV3 |
Authors | Xiangxiang Chu, Bo Zhang, Ruijun Xu |
Abstract | The evolution of MobileNets has laid a solid foundation for neural network applications on mobile end. With the latest MobileNetV3, neural architecture search again claimed its supremacy in network design. Unfortunately, till today all mobile methods mainly focus on CPU latencies instead of GPU; the latter, however, is much preferred in practice for it has faster speed, lower overhead and less interference. Bearing the target hardware in mind, we propose the first Mobile GPU-Aware (MoGA) neural architecture search in order to be precisely tailored for real-world applications. Further, the ultimate objective to devise a mobile network lies in achieving better performance by maximizing the utilization of bounded resources. Urging higher capability while restraining time consumption is not reconcilable. We alleviate the tension by weighted evolution techniques. Moreover, we encourage increasing the number of parameters for higher representational power. With 200x fewer GPU days than MnasNet, we obtain a series of models that outperform MobileNetV3 under similar latency constraints, i.e., MoGA-A achieves 75.9% top-1 accuracy on ImageNet, MoGA-B meets 75.5% which costs only 0.5 ms more on mobile GPU. MoGA-C best attests GPU-awareness by reaching 75.3% and being slower on CPU but faster on GPU. The models and test code are made available here: https://github.com/xiaomi-automl/MoGA. |
Tasks | AutoML, Image Classification, Neural Architecture Search |
Published | 2019-08-04 |
URL | https://arxiv.org/abs/1908.01314v4, https://arxiv.org/pdf/1908.01314v4.pdf |
PWC | https://paperswithcode.com/paper/moga-searching-beyond-mobilenetv3 |
Repo | https://github.com/xiaomi-automl/MoGA |
Framework | pytorch |
Unsupervised Latent Tree Induction with Deep Inside-Outside Recursive Autoencoders
Title | Unsupervised Latent Tree Induction with Deep Inside-Outside Recursive Autoencoders |
Authors | Andrew Drozdov, Pat Verga, Mohit Yadav, Mohit Iyyer, Andrew McCallum |
Abstract | We introduce deep inside-outside recursive autoencoders (DIORA), a fully-unsupervised method for discovering syntax that simultaneously learns representations for constituents within the induced tree. Our approach predicts each word in an input sentence conditioned on the rest of the sentence and uses inside-outside dynamic programming to consider all possible binary trees over the sentence. At test time the CKY algorithm extracts the highest scoring parse. DIORA achieves a new state-of-the-art F1 in unsupervised binary constituency parsing (unlabeled) in two benchmark datasets, WSJ and MultiNLI. |
Tasks | Constituency Parsing |
Published | 2019-04-03 |
URL | http://arxiv.org/abs/1904.02142v2, http://arxiv.org/pdf/1904.02142v2.pdf |
PWC | https://paperswithcode.com/paper/unsupervised-latent-tree-induction-with-deep |
Repo | https://github.com/iesl/diora |
Framework | pytorch |
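The abstract above states that CKY extracts the highest-scoring parse at test time. A minimal sketch of that extraction step follows, assuming each span already has a scalar score; the `span_score` callback and the toy scores are hypothetical stand-ins, not DIORA's learned scoring.

```python
def cky_best_tree(words, span_score):
    """Return (score, tree) for the highest-scoring binary tree over `words`.

    span_score(i, j) scores the span words[i:j]; a tree's score is the sum of
    the scores of its spans of length >= 2 (single words score 0 here).
    """
    n = len(words)
    best = {}
    for i in range(n):
        best[(i, i + 1)] = (0.0, words[i])
    # Fill the chart bottom-up, from short spans to the full sentence.
    for length in range(2, n + 1):
        for i in range(0, n - length + 1):
            j = i + length
            candidates = []
            for k in range(i + 1, j):  # every split point of the span
                left_score, left_tree = best[(i, k)]
                right_score, right_tree = best[(k, j)]
                candidates.append((left_score + right_score, left_tree, right_tree))
            top, left_tree, right_tree = max(candidates, key=lambda c: c[0])
            best[(i, j)] = (top + span_score(i, j), (left_tree, right_tree))
    return best[(0, n)]


# Toy scores (hypothetical): the span "the cat" is a strong constituent.
words = ["the", "cat", "sat"]
scores = {(0, 2): 2.0, (1, 3): 0.5, (0, 3): 1.0}
score, tree = cky_best_tree(words, lambda i, j: scores.get((i, j), 0.0))
```

With these scores the chart prefers grouping "the cat" first, so the returned tree is `(("the", "cat"), "sat")`.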
BERT has a Mouth, and It Must Speak: BERT as a Markov Random Field Language Model
Title | BERT has a Mouth, and It Must Speak: BERT as a Markov Random Field Language Model |
Authors | Alex Wang, Kyunghyun Cho |
Abstract | We show that BERT (Devlin et al., 2018) is a Markov random field language model. This formulation gives way to a natural procedure to sample sentences from BERT. We generate from BERT and find that it can produce high-quality, fluent generations. Compared to the generations of a traditional left-to-right language model, BERT generates sentences that are more diverse but of slightly worse quality. |
Tasks | Language Modelling |
Published | 2019-02-11 |
URL | http://arxiv.org/abs/1902.04094v2, http://arxiv.org/pdf/1902.04094v2.pdf |
PWC | https://paperswithcode.com/paper/bert-has-a-mouth-and-it-must-speak-bert-as-a |
Repo | https://github.com/nyu-dl/bert-gen |
Framework | pytorch |
StructureFlow: Image Inpainting via Structure-aware Appearance Flow
Title | StructureFlow: Image Inpainting via Structure-aware Appearance Flow |
Authors | Yurui Ren, Xiaoming Yu, Ruonan Zhang, Thomas H. Li, Shan Liu, Ge Li |
Abstract | Image inpainting techniques have shown significant improvements by using deep neural networks recently. However, most of them may either fail to reconstruct reasonable structures or restore fine-grained textures. In order to solve this problem, in this paper, we propose a two-stage model which splits the inpainting task into two parts: structure reconstruction and texture generation. In the first stage, edge-preserved smooth images are employed to train a structure reconstructor which completes the missing structures of the inputs. In the second stage, based on the reconstructed structures, a texture generator using appearance flow is designed to yield image details. Experiments on multiple publicly available datasets show the superior performance of the proposed network. |
Tasks | Image Inpainting, Texture Synthesis |
Published | 2019-08-11 |
URL | https://arxiv.org/abs/1908.03852v1, https://arxiv.org/pdf/1908.03852v1.pdf |
PWC | https://paperswithcode.com/paper/structureflow-image-inpainting-via-structure |
Repo | https://github.com/RenYurui/StructureFlow |
Framework | pytorch |
Deep Social Collaborative Filtering
Title | Deep Social Collaborative Filtering |
Authors | Wenqi Fan, Yao Ma, Dawei Yin, Jianping Wang, Jiliang Tang, Qing Li |
Abstract | Recommender systems are crucial to alleviate the information overload problem in online worlds. Most of the modern recommender systems capture users’ preference towards items via their interactions based on collaborative filtering techniques. In addition to the user-item interactions, social networks can also provide useful information to understand users’ preference as suggested by the social theories such as homophily and influence. Recently, deep neural networks have been utilized for social recommendations, which facilitate both the user-item interactions and the social network information. However, most of these models cannot take full advantage of the social network information. They only use information from direct neighbors, but distant neighbors can also provide helpful information. Meanwhile, most of these models treat neighbors’ information equally without considering the specific recommendations. However, for a specific recommendation case, the information relevant to the specific item would be helpful. Besides, most of these models do not explicitly capture the neighbor’s opinions to items for social recommendations, while different opinions could affect the user differently. In this paper, to address the aforementioned challenges, we propose DSCF, a Deep Social Collaborative Filtering framework, which can exploit the social relations with various aspects for recommender systems. Comprehensive experiments on two real-world datasets show the effectiveness of the proposed framework. |
Tasks | Recommendation Systems |
Published | 2019-07-16 |
URL | https://arxiv.org/abs/1907.06853v1, https://arxiv.org/pdf/1907.06853v1.pdf |
PWC | https://paperswithcode.com/paper/deep-social-collaborative-filtering |
Repo | https://github.com/wenqifan03/GraphRec-WWW19 |
Framework | pytorch |
NoduleNet: Decoupled False Positive Reduction for Pulmonary Nodule Detection and Segmentation
Title | NoduleNet: Decoupled False Positive Reduction for Pulmonary Nodule Detection and Segmentation |
Authors | Hao Tang, Chupeng Zhang, Xiaohui Xie |
Abstract | Pulmonary nodule detection, false positive reduction and segmentation represent three of the most common tasks in the computer-aided analysis of chest CT images. Methods have been proposed for each task, with deep learning based methods heavily favored recently. However, training deep learning models to solve each task separately may be sub-optimal: resource intensive and without the benefit of feature sharing. Here, we propose a new end-to-end 3D deep convolutional neural net (DCNN), called NoduleNet, to solve nodule detection, false positive reduction and nodule segmentation jointly in a multi-task fashion. To avoid friction between different tasks and encourage feature diversification, we incorporate two major design tricks: 1) decoupled feature maps for nodule detection and false positive reduction, and 2) a segmentation refinement subnet for increasing the precision of nodule segmentation. Extensive experiments on the large-scale LIDC dataset demonstrate that the multi-task training is highly beneficial, improving the nodule detection accuracy by 10.27%, compared to the baseline model trained to only solve the nodule detection task. We also carry out systematic ablation studies to highlight contributions from each of the added components. Code is available at https://github.com/uci-cbcl/NoduleNet. |
Tasks | |
Published | 2019-07-25 |
URL | https://arxiv.org/abs/1907.11320v1, https://arxiv.org/pdf/1907.11320v1.pdf |
PWC | https://paperswithcode.com/paper/nodulenet-decoupled-false-positive |
Repo | https://github.com/uci-cbcl/NoduleNet |
Framework | pytorch |
Text Matters but Speech Influences: A Computational Analysis of Syntactic Ambiguity Resolution
Title | Text Matters but Speech Influences: A Computational Analysis of Syntactic Ambiguity Resolution |
Authors | Won Ik Cho, Jeonghwa Cho, Woo Hyun Kang, Nam Soo Kim |
Abstract | Analyzing how human beings resolve syntactic ambiguity has long been an issue of interest in the field of linguistics. It is, at the same time, one of the most challenging issues for spoken language understanding (SLU) systems as well. As syntactic ambiguity is intertwined with issues regarding prosody and semantics, the computational approach toward speech intention identification is expected to benefit from the observations of the human language processing mechanism. In this regard, we address the task with attentive recurrent neural networks that exploit acoustic and textual features simultaneously and reveal how the modalities interact with each other to derive sentence meaning. Utilizing a speech corpus recorded on Korean scripts of syntactically ambiguous utterances, we revealed that co-attention frameworks, namely multi-hop attention and cross-attention, show significantly superior performance in disambiguating speech intention. With further analysis, we demonstrate that the computational models reflect the internal relationship between auditory and linguistic processes. |
Tasks | Spoken Language Understanding |
Published | 2019-10-21 |
URL | https://arxiv.org/abs/1910.09275v2, https://arxiv.org/pdf/1910.09275v2.pdf |
PWC | https://paperswithcode.com/paper/disambiguating-speech-intention-via-audio |
Repo | https://github.com/warnikchow/coaudiotext |
Framework | tf |
Torchmeta: A Meta-Learning library for PyTorch
Title | Torchmeta: A Meta-Learning library for PyTorch |
Authors | Tristan Deleu, Tobias Würfl, Mandana Samiei, Joseph Paul Cohen, Yoshua Bengio |
Abstract | The constant introduction of standardized benchmarks in the literature has helped accelerate the recent advances in meta-learning research. They offer a way to get a fair comparison between different algorithms, and the wide range of datasets available allows full control over the complexity of this evaluation. However, for a large majority of code available online, the data pipeline is often specific to one dataset, and testing on another dataset requires significant rework. We introduce Torchmeta, a library built on top of PyTorch that enables seamless and consistent evaluation of meta-learning algorithms on multiple datasets, by providing data-loaders for most of the standard benchmarks in few-shot classification and regression, with a new meta-dataset abstraction. It also features some extensions for PyTorch to simplify the development of models compatible with meta-learning algorithms. The code is available here: https://github.com/tristandeleu/pytorch-meta |
Tasks | Meta-Learning |
Published | 2019-09-14 |
URL | https://arxiv.org/abs/1909.06576v1, https://arxiv.org/pdf/1909.06576v1.pdf |
PWC | https://paperswithcode.com/paper/torchmeta-a-meta-learning-library-for-pytorch |
Repo | https://github.com/tristandeleu/pytorch-meta |
Framework | pytorch |
Multi-task Learning for Target-dependent Sentiment Classification
Title | Multi-task Learning for Target-dependent Sentiment Classification |
Authors | Divam Gupta, Kushagra Singh, Soumen Chakrabarti, Tanmoy Chakraborty |
Abstract | Detecting and aggregating sentiments toward people, organizations, and events expressed in unstructured social media have become critical text mining operations. Early systems detected sentiments over whole passages, whereas more recently, target-specific sentiments have been of greater interest. In this paper, we present MTTDSC, a multi-task target-dependent sentiment classification system that is informed by feature representation learnt for the related auxiliary task of passage-level sentiment classification. The auxiliary task uses a gated recurrent unit (GRU) and pools GRU states, followed by an auxiliary fully-connected layer that outputs passage-level predictions. In the main task, these GRUs contribute auxiliary per-token representations over and above word embeddings. The main task has its own, separate GRUs. The auxiliary and main GRUs send their states to a different fully connected layer, trained for the main task. Extensive experiments using two auxiliary datasets and three benchmark datasets (of which one is new, introduced by us) for the main task demonstrate that MTTDSC outperforms state-of-the-art baselines. Using word-level sensitivity analysis, we present anecdotal evidence that prior systems can make incorrect target-specific predictions because they miss sentiments expressed by words independent of target. |
Tasks | Multi-Task Learning, Sentiment Analysis, Word Embeddings |
Published | 2019-02-08 |
URL | http://arxiv.org/abs/1902.02930v1, http://arxiv.org/pdf/1902.02930v1.pdf |
PWC | https://paperswithcode.com/paper/multi-task-learning-for-target-dependent |
Repo | https://github.com/16631140828/Paper-list |
Framework | none |
Deep Learning for Symbolic Mathematics
Title | Deep Learning for Symbolic Mathematics |
Authors | Guillaume Lample, François Charton |
Abstract | Neural networks have a reputation for being better at solving statistical or approximate problems than at performing calculations or working with symbolic data. In this paper, we show that they can be surprisingly good at more elaborated tasks in mathematics, such as symbolic integration and solving differential equations. We propose a syntax for representing mathematical problems, and methods for generating large datasets that can be used to train sequence-to-sequence models. We achieve results that outperform commercial Computer Algebra Systems such as Matlab or Mathematica. |
Tasks | |
Published | 2019-12-02 |
URL | https://arxiv.org/abs/1912.01412v1, https://arxiv.org/pdf/1912.01412v1.pdf |
PWC | https://paperswithcode.com/paper/deep-learning-for-symbolic-mathematics-1 |
Repo | https://github.com/janeyoung2018/symbolic-math |
Framework | none |
VERIFAI: A Toolkit for the Design and Analysis of Artificial Intelligence-Based Systems
Title | VERIFAI: A Toolkit for the Design and Analysis of Artificial Intelligence-Based Systems |
Authors | Tommaso Dreossi, Daniel J. Fremont, Shromona Ghosh, Edward Kim, Hadi Ravanbakhsh, Marcell Vazquez-Chanlatte, Sanjit A. Seshia |
Abstract | We present VERIFAI, a software toolkit for the formal design and analysis of systems that include artificial intelligence (AI) and machine learning (ML) components. VERIFAI particularly seeks to address challenges with applying formal methods to perception and ML components, including those based on neural networks, and to model and analyze system behavior in the presence of environment uncertainty. We describe the initial version of VERIFAI which centers on simulation guided by formal models and specifications. Several use cases are illustrated with examples, including temporal-logic falsification, model-based systematic fuzz testing, parameter synthesis, counterexample analysis, and data set augmentation. |
Tasks | |
Published | 2019-02-12 |
URL | http://arxiv.org/abs/1902.04245v2, http://arxiv.org/pdf/1902.04245v2.pdf |
PWC | https://paperswithcode.com/paper/verifai-a-toolkit-for-the-design-and-analysis |
Repo | https://github.com/BerkeleyLearnVerify/VerifAI |
Framework | tf |
Communication-efficient distributed SGD with Sketching
Title | Communication-efficient distributed SGD with Sketching |
Authors | Nikita Ivkin, Daniel Rothchild, Enayat Ullah, Vladimir Braverman, Ion Stoica, Raman Arora |
Abstract | Large-scale distributed training of neural networks is often limited by network bandwidth, wherein the communication time overwhelms the local computation time. Motivated by the success of sketching methods in sub-linear/streaming algorithms, we introduce Sketched SGD, an algorithm for carrying out distributed SGD by communicating sketches instead of full gradients. We show that Sketched SGD has favorable convergence rates on several classes of functions. When considering all communication – both of gradients and of updated model weights – Sketched SGD reduces the amount of communication required compared to other gradient compression methods from $\mathcal{O}(d)$ or $\mathcal{O}(W)$ to $\mathcal{O}(\log d)$, where $d$ is the number of model parameters and $W$ is the number of workers participating in training. We run experiments on a transformer model, an LSTM, and a residual network, demonstrating up to a 40x reduction in total communication cost with no loss in final model performance. We also show experimentally that Sketched SGD scales to at least 256 workers without increasing communication cost or degrading model performance. |
Tasks | |
Published | 2019-03-12 |
URL | https://arxiv.org/abs/1903.04488v3, https://arxiv.org/pdf/1903.04488v3.pdf |
PWC | https://paperswithcode.com/paper/communication-efficient-distributed-sgd-with |
Repo | https://github.com/sunahhlee/TopHCS |
Framework | pytorch |
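To make "communicating sketches instead of full gradients" concrete, the sketch below assumes a Count Sketch as the data structure (the abstract does not name one). The class `CountSketch`, its parameters, and the toy gradients are illustrative assumptions and not the authors' implementation. The key property is linearity: summing the workers' sketches equals sketching the summed gradient, so a server can recover the largest coordinates from the merged sketch alone.

```python
import random
from statistics import median


class CountSketch:
    """Toy Count Sketch of a length-`dim` vector using rows x cols counters."""

    def __init__(self, dim, rows=7, cols=512, seed=0):
        # All workers must share `seed` so their sketches use identical hashes.
        # Explicit lookup tables keep the example short; a practical version
        # would use hash functions so memory does not grow with `dim`.
        rng = random.Random(seed)
        self.dim, self.rows, self.cols = dim, rows, cols
        self.bucket = [[rng.randrange(cols) for _ in range(dim)] for _ in range(rows)]
        self.sign = [[rng.choice((-1.0, 1.0)) for _ in range(dim)] for _ in range(rows)]
        self.table = [[0.0] * cols for _ in range(rows)]

    def add(self, vec):
        """Accumulate a dense vector into the counters."""
        for r in range(self.rows):
            for i, v in enumerate(vec):
                if v:
                    self.table[r][self.bucket[r][i]] += self.sign[r][i] * v

    def merge(self, other):
        """Add another sketch built with the same seed (sketches are linear)."""
        for r in range(self.rows):
            for c in range(self.cols):
                self.table[r][c] += other.table[r][c]

    def estimate(self, i):
        """Median-of-rows estimate of coordinate i."""
        return median(self.sign[r][i] * self.table[r][self.bucket[r][i]]
                      for r in range(self.rows))

    def top_k(self, k):
        """Indices of the k coordinates with the largest estimated magnitude."""
        return sorted(range(self.dim), key=lambda i: abs(self.estimate(i)),
                      reverse=True)[:k]


# Two workers with sparse toy gradients (hypothetical values).
dim = 10000
g1 = [0.0] * dim
g1[7] = 10.0
g2 = [0.0] * dim
g2[7] = 5.0
g2[42] = -3.0

w1, w2 = CountSketch(dim), CountSketch(dim)
w1.add(g1)
w2.add(g2)
w1.merge(w2)          # the server only ever sees the two sketches
heavy = w1.top_k(2)   # coordinates to update from the merged sketch
```

Each worker here sends 7 x 512 counters rather than 10,000 gradient entries; estimates are approximate in general, since unrelated coordinates can collide in a bucket, and the median over rows is what keeps such collisions from dominating.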