February 1, 2020


Paper Group AWR 162


DiffEqFlux.jl - A Julia Library for Neural Differential Equations


Title	DiffEqFlux.jl - A Julia Library for Neural Differential Equations
Authors	Chris Rackauckas, Mike Innes, Yingbo Ma, Jesse Bettencourt, Lyndon White, Vaibhav Dixit
Abstract	DiffEqFlux.jl is a library for fusing neural networks and differential equations. In this work we describe differential equations from the viewpoint of data science and discuss the complementary nature between machine learning models and differential equations. We demonstrate the ability to incorporate DifferentialEquations.jl-defined differential equation problems into a Flux-defined neural network, and vice versa. The advantages of being able to use the entire DifferentialEquations.jl suite for this purpose are demonstrated by counterexamples where simple integration strategies fail, but the sophisticated integration strategies provided by the DifferentialEquations.jl library succeed. This is followed by a demonstration of delay differential equations and stochastic differential equations inside of neural networks. We show high-level functionality for defining neural ordinary differential equations (neural networks embedded into the differential equation) and describe the extra models in the Flux model zoo which includes neural stochastic differential equations. We conclude by discussing the various adjoint methods used for backpropagation of the differential equation solvers. DiffEqFlux.jl is an important contribution to the area, as it allows the full weight of the differential equation solvers developed from decades of research in the scientific computing field to be readily applied to the challenges posed by machine learning and data science.
Tasks
Published	2019-02-06
URL	http://arxiv.org/abs/1902.02376v1
PDF	http://arxiv.org/pdf/1902.02376v1.pdf
PWC	https://paperswithcode.com/paper/diffeqfluxjl-a-julia-library-for-neural
Repo	https://github.com/UnofficialJuliaMirrorSnapshots/DiffEqFlux.jl-aae7a2af-3d4f-5e19-a356-7da93b79d9d0
Framework	none
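
The abstract's point that simple integration strategies can fail where more sophisticated ones succeed can be illustrated on a stiff test equation. The following is a minimal Python sketch, not the paper's Julia code and not one of its counterexamples: the equation y' = lam * y, the step size, and both function names are assumptions chosen for illustration.

```python
def explicit_euler(lam, y0, h, steps):
    """Forward Euler for y' = lam * y: y_{n+1} = (1 + h * lam) * y_n."""
    y = y0
    for _ in range(steps):
        y = y + h * lam * y
    return y


def implicit_euler(lam, y0, h, steps):
    """Backward Euler for y' = lam * y: y_{n+1} = y_n / (1 - h * lam)."""
    y = y0
    for _ in range(steps):
        y = y / (1.0 - h * lam)
    return y


# Hypothetical stiff problem: the exact solution exp(lam * t) decays to ~0.
lam, y0, h, steps = -50.0, 1.0, 0.1, 20
y_explicit = explicit_euler(lam, y0, h, steps)  # amplification factor -4 per step
y_implicit = implicit_euler(lam, y0, h, steps)  # amplification factor 1/6 per step
print(abs(y_explicit) > 1e6, abs(y_implicit) < 1e-6)
```

With this step size the explicit scheme multiplies the state by -4 at every step and blows up, while the implicit scheme stays stable. This is the kind of gap that motivates access to a full solver suite.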

Equalized odds postprocessing under imperfect group information


Title	Equalized odds postprocessing under imperfect group information
Authors	Pranjal Awasthi, Matthäus Kleindessner, Jamie Morgenstern
Abstract	Most approaches aiming to ensure a model’s fairness with respect to a protected attribute (such as gender or race) assume that the true value of the attribute is known for every data point. In this paper, we ask to what extent fairness interventions can be effective even when only imperfect information about the protected attribute is available. In particular, we study the prominent equalized odds postprocessing method of Hardt et al. (2016) under a perturbation of the attribute. We identify conditions on the perturbation that guarantee that the bias of a classifier is reduced even by running equalized odds with the perturbed attribute. We also study the error of the resulting classifier. We empirically observe that under our identified conditions most often the error does not suffer from a perturbation of the protected attribute. For a special case, we formally prove this observation to be true.
Tasks
Published	2019-06-07
URL	https://arxiv.org/abs/1906.03284v3
PDF	https://arxiv.org/pdf/1906.03284v3.pdf
PWC	https://paperswithcode.com/paper/effectiveness-of-equalized-odds-for-fair
Repo	https://github.com/matthklein/equalized_odds_with_perturbed_attribute
Framework	none
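
Equalized odds asks that true-positive and false-positive rates match across groups, so the bias discussed in the abstract can be measured as the gap in those rates. The sketch below only computes that gap, once with the true attribute and once with a perturbed one. It does not implement the postprocessing method of Hardt et al., and the function names and toy data are assumptions made for illustration.

```python
def group_rates(y_true, y_pred, group, g):
    """Return (TPR, FPR) of binary predictions restricted to group g.

    Assumes group g contains at least one positive and one negative example.
    """
    tp = fp = pos = neg = 0
    for yt, yp, a in zip(y_true, y_pred, group):
        if a != g:
            continue
        if yt == 1:
            pos += 1
            tp += yp
        else:
            neg += 1
            fp += yp
    return tp / pos, fp / neg


def equalized_odds_gaps(y_true, y_pred, group):
    """Absolute (TPR gap, FPR gap) between groups 0 and 1."""
    tpr0, fpr0 = group_rates(y_true, y_pred, group, 0)
    tpr1, fpr1 = group_rates(y_true, y_pred, group, 1)
    return abs(tpr0 - tpr1), abs(fpr0 - fpr1)


# Hypothetical toy data: eight points, binary labels and predictions.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
true_group = [0, 0, 0, 0, 1, 1, 1, 1]
noisy_group = [0, 0, 0, 0, 1, 1, 0, 1]  # one attribute value flipped

gaps_true = equalized_odds_gaps(y_true, y_pred, true_group)
gaps_noisy = equalized_odds_gaps(y_true, y_pred, noisy_group)
print(gaps_true, gaps_noisy)
```

Comparing the two results shows how a perturbed attribute changes the measured gap even though the classifier itself is unchanged.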

Pre-Training with Whole Word Masking for Chinese BERT


Title	Pre-Training with Whole Word Masking for Chinese BERT
Authors	Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Ziqing Yang, Shijin Wang, Guoping Hu
Abstract	Bidirectional Encoder Representations from Transformers (BERT) has shown marvelous improvements across various NLP tasks. Recently, an upgraded version of BERT has been released with Whole Word Masking (WWM), which mitigates the drawbacks of masking partial WordPiece tokens in pre-training BERT. In this technical report, we adapt whole word masking to Chinese text, that is, masking the whole word instead of individual Chinese characters, which could bring another challenge in the Masked Language Model (MLM) pre-training task. The proposed models are verified on various NLP tasks, from sentence level to document level, including machine reading comprehension (CMRC 2018, DRCD, CJRC), natural language inference (XNLI), sentiment classification (ChnSentiCorp), sentence pair matching (LCQMC, BQ Corpus), and document classification (THUCNews). Experimental results on these datasets show that whole word masking could bring another significant gain. Moreover, we also examine the effectiveness of the Chinese pre-trained models: BERT, ERNIE, BERT-wwm, BERT-wwm-ext, RoBERTa-wwm-ext, and RoBERTa-wwm-ext-large. We release all the pre-trained models: https://github.com/ymcui/Chinese-BERT-wwm
Tasks	Document Classification, Language Modelling, Machine Reading Comprehension, Named Entity Recognition, Natural Language Inference, Reading Comprehension, Sentiment Analysis
Published	2019-06-19
URL	https://arxiv.org/abs/1906.08101v2
PDF	https://arxiv.org/pdf/1906.08101v2.pdf
PWC	https://paperswithcode.com/paper/pre-training-with-whole-word-masking-for
Repo	https://github.com/ymcui/Chinese-BERT-wwm
Framework	pytorch
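
The difference between character-level and whole word masking can be sketched in a few lines. This is a simplified illustration, not the released pre-training code. It assumes the sentence has already been split into words by some segmenter (not shown), and it always substitutes a mask token, omitting the random-replacement and keep-original cases of BERT's masking recipe. The function name and parameters are hypothetical.

```python
import random


def whole_word_mask(words, mask_prob=0.15, mask_token="[MASK]", seed=0):
    """Mask every character of a selected word together.

    `words` is a pre-segmented sentence (a list of words); the output is a
    flat list of character-level tokens, as a character-based vocabulary
    would see them.
    """
    rng = random.Random(seed)
    tokens = []
    for word in words:
        if rng.random() < mask_prob:
            # The whole word is selected: mask all of its characters.
            tokens.extend([mask_token] * len(word))
        else:
            tokens.extend(list(word))
    return tokens


# Hypothetical segmented sentence: three two-character words.
sentence = ["使用", "语言", "模型"]
print(whole_word_mask(sentence, mask_prob=0.5, seed=1))
```

Because the selection happens per word, the characters of one word are either all masked or all kept, which is the property that distinguishes this scheme from masking characters independently.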

Pan-tilt-zoom SLAM for Sports Videos


Title	Pan-tilt-zoom SLAM for Sports Videos
Authors	Jikai Lu, Jianhui Chen, James J. Little
Abstract	We present an online SLAM system specifically designed to track pan-tilt-zoom (PTZ) cameras in highly dynamic sports such as basketball and soccer games. In these games, PTZ cameras rotate very fast and players cover large image areas. To overcome these challenges, we propose to use a novel camera model for tracking and to use rays as landmarks in mapping. Rays overcome the missing depth in pure-rotation cameras. We also develop an online pan-tilt forest for mapping and introduce moving objects (players) detection to mitigate negative impacts from foreground objects. We test our method on both synthetic and real datasets. The experimental results show the superior performance of our method over previous methods for online PTZ camera pose estimation.
Tasks	Pose Estimation
Published	2019-07-20
URL	https://arxiv.org/abs/1907.08816v1
PDF	https://arxiv.org/pdf/1907.08816v1.pdf
PWC	https://paperswithcode.com/paper/pan-tilt-zoom-slam-for-sports-videos
Repo	https://github.com/lulufa390/Pan-tilt-zoom-SLAM
Framework	none

DenseNet Models for Tiny ImageNet Classification


Title	DenseNet Models for Tiny ImageNet Classification
Authors	Zoheb Abai, Nishad Rajmalwar
Abstract	In this paper, we present two image classification models on the Tiny ImageNet dataset. We built two very different networks from scratch based on the idea of Densely Connected Convolution Networks. The architecture of the networks is designed based on the image resolution of this specific dataset and by calculating the Receptive Field of the convolution layers. We also used some non-conventional techniques related to image augmentation and Cyclical Learning Rate to improve the accuracy of our models. The networks are trained under high constraints and low computation resources. We aimed to achieve top-1 validation accuracy of 60%; the results and error analysis are also presented.
Tasks	Image Augmentation, Image Classification
Published	2019-04-23
URL	http://arxiv.org/abs/1904.10429v1
PDF	http://arxiv.org/pdf/1904.10429v1.pdf
PWC	https://paperswithcode.com/paper/densenet-models-for-tiny-imagenet
Repo	https://github.com/ZohebAbai/Tiny-ImageNet-Challenge
Framework	none
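
The abstract mentions sizing the networks by calculating the receptive field of the convolution layers. A minimal sketch of that calculation follows, using the standard recurrence for stacked convolution and pooling layers without dilation. The layer list is a hypothetical example, not the architecture from the paper.

```python
def receptive_field(layers):
    """Receptive field (in input pixels) of one unit after the given layers.

    `layers` is a list of (kernel_size, stride) pairs applied in order.
    Recurrence: rf += (kernel - 1) * jump; jump *= stride, where `jump` is
    the distance in input pixels between adjacent units of the current layer.
    """
    rf, jump = 1, 1
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
    return rf


# Hypothetical stack: two 3x3 convs, a 2x2 stride-2 pooling, one more 3x3 conv.
example_layers = [(3, 1), (3, 1), (2, 2), (3, 1)]
print(receptive_field(example_layers))
```

Comparing the result with the input resolution indicates how many layers are needed before a unit can see the whole image.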

Unsupervised Embedding Learning via Invariant and Spreading Instance Feature


Title	Unsupervised Embedding Learning via Invariant and Spreading Instance Feature
Authors	Mang Ye, Xu Zhang, Pong C. Yuen, Shih-Fu Chang
Abstract	This paper studies the unsupervised embedding learning problem, which requires an effective similarity measurement between samples in low-dimensional embedding space. Motivated by the positive concentrated and negative separated properties observed from category-wise supervised learning, we propose to utilize the instance-wise supervision to approximate these properties, which aims at learning data augmentation invariant and instance spread-out features. To achieve this goal, we propose a novel instance based softmax embedding method, which directly optimizes the 'real' instance features on top of the softmax function. It achieves significantly faster learning speed and higher accuracy than all existing methods. The proposed method performs well for both seen and unseen testing categories with cosine similarity. It also achieves competitive performance even without pre-trained network over samples from fine-grained categories.
Tasks	Data Augmentation
Published	2019-04-06
URL	http://arxiv.org/abs/1904.03436v1
PDF	http://arxiv.org/pdf/1904.03436v1.pdf
PWC	https://paperswithcode.com/paper/unsupervised-embedding-learning-via-invariant
Repo	https://github.com/mangye16/Unsupervised_Embedding_Learning
Framework	pytorch
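
The idea of a softmax over instance features with cosine similarity can be sketched as follows. This is an illustrative simplification and not the paper's exact objective: the temperature value, the function names, and the loss, which only rewards matching each augmented sample to its own instance, are assumptions.

```python
import math


def normalize(v):
    """Scale a vector to unit L2 norm so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def instance_softmax(features, aug_features, temperature=0.1):
    """probs[i][k]: probability that augmented sample i is recognised as instance k."""
    f = [normalize(v) for v in features]
    g = [normalize(v) for v in aug_features]
    probs = []
    for gi in g:
        logits = [sum(a * b for a, b in zip(gi, fk)) / temperature for fk in f]
        m = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(l - m) for l in logits]
        total = sum(exps)
        probs.append([e / total for e in exps])
    return probs


def instance_loss(probs):
    """Mean negative log-probability of matching each sample to its own instance."""
    return -sum(math.log(row[i]) for i, row in enumerate(probs)) / len(probs)


# Hypothetical 2-D features for three instances and their augmented views.
features = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
aug_features = [[0.9, 0.1], [0.1, 0.9], [-0.9, 0.1]]
probs = instance_softmax(features, aug_features)
print(instance_loss(probs))
```

Minimising such a loss pulls an augmented view toward its own instance (invariance) and away from all other instances (spreading).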

Divide and Conquer: A Deep CASA Approach to Talker-independent Monaural Speaker Separation


Title	Divide and Conquer: A Deep CASA Approach to Talker-independent Monaural Speaker Separation
Authors	Yuzhou Liu, DeLiang Wang
Abstract	We address talker-independent monaural speaker separation from the perspectives of deep learning and computational auditory scene analysis (CASA). Specifically, we decompose the multi-speaker separation task into the stages of simultaneous grouping and sequential grouping. Simultaneous grouping is first performed in each time frame by separating the spectra of different speakers with a permutation-invariantly trained neural network. In the second stage, the frame-level separated spectra are sequentially grouped to different speakers by a clustering network. The proposed deep CASA approach optimizes frame-level separation and speaker tracking in turn, and produces excellent results for both objectives. Experimental results on the benchmark WSJ0-2mix database show that the new approach achieves the state-of-the-art results with a modest model size.
Tasks	Speaker Separation, Speech Separation
Published	2019-04-25
URL	http://arxiv.org/abs/1904.11148v1
PDF	http://arxiv.org/pdf/1904.11148v1.pdf
PWC	https://paperswithcode.com/paper/divide-and-conquer-a-deep-casa-approach-to
Repo	https://github.com/yuzhou-git/deep-casa
Framework	tf
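
Permutation-invariant training, which the abstract uses for the simultaneous grouping stage, scores the network outputs against every assignment of outputs to speakers and keeps the best one. The sketch below shows that idea for generic vectors with a mean-squared-error criterion. The criterion, the function names, and the toy data are assumptions, and the sketch does not reproduce the paper's frame-level formulation.

```python
from itertools import permutations


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def pit_loss(estimates, targets):
    """Return (loss, perm) minimising the mean error over output-target assignments.

    perm[i] is the index of the target assigned to estimate i.
    """
    best = None
    for perm in permutations(range(len(targets))):
        loss = sum(mse(estimates[i], targets[p]) for i, p in enumerate(perm)) / len(perm)
        if best is None or loss < best[0]:
            best = (loss, perm)
    return best


# Hypothetical spectra: the estimates are correct but in swapped order.
targets = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
estimates = [[0.0, 1.0, 1.0], [1.0, 0.0, 0.0]]
loss, perm = pit_loss(estimates, targets)
print(loss, perm)
```

The search is factorial in the number of speakers, which is acceptable for the two-speaker case evaluated on WSJ0-2mix.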


Benchmarking Classic and Learned Navigation in Complex 3D Environments


Title	Benchmarking Classic and Learned Navigation in Complex 3D Environments
Authors	Dmytro Mishkin, Alexey Dosovitskiy, Vladlen Koltun
Abstract	Navigation research is attracting renewed interest with the advent of learning-based methods. However, this new line of work is largely disconnected from well-established classic navigation approaches. In this paper, we take a step towards coordinating these two directions of research. We set up classic and learning-based navigation systems in common simulated environments and thoroughly evaluate them in indoor spaces of varying complexity, with access to different sensory modalities. Additionally, we measure human performance in the same environments. We find that a classic pipeline, when properly tuned, can perform very well in complex cluttered environments. On the other hand, learned systems can operate more robustly with a limited sensor suite. Overall, both approaches are still far from human-level performance.
Tasks
Published	2019-01-30
URL	http://arxiv.org/abs/1901.10915v2
PDF	http://arxiv.org/pdf/1901.10915v2.pdf
PWC	https://paperswithcode.com/paper/benchmarking-classic-and-learned-navigation
Repo	https://github.com/ducha-aiki/navigation-benchmark
Framework	pytorch

MAP Inference via L2-Sphere Linear Program Reformulation


Title	MAP Inference via L2-Sphere Linear Program Reformulation
Authors	Baoyuan Wu, Li Shen, Tong Zhang, Bernard Ghanem
Abstract	Maximum a posteriori (MAP) inference is an important task for graphical models. Due to complex dependencies among variables in realistic models, finding an exact solution for MAP inference is often intractable. Thus, many approximation methods have been developed, among which the linear programming (LP) relaxation based methods show promising performance. However, one major drawback of LP relaxation is that it may yield fractional solutions. Instead of presenting a tighter relaxation, in this work we propose a continuous but equivalent reformulation of the original MAP inference problem, called LS-LP. We add the L2-sphere constraint onto the original LP relaxation, leading to an intersected space with the local marginal polytope that is equivalent to the space of all valid integer label configurations. Thus, LS-LP is equivalent to the original MAP inference problem. We propose a perturbed alternating direction method of multipliers (ADMM) algorithm to optimize the LS-LP problem, by adding a sufficiently small perturbation epsilon onto the objective function and constraints. We prove that the perturbed ADMM algorithm globally converges to the epsilon-Karush-Kuhn-Tucker (epsilon-KKT) point of the LS-LP problem. The convergence rate is also analyzed. Experiments on several benchmark datasets from Probabilistic Inference Challenge (PIC 2011) and OpenGM 2 show competitive performance of our proposed method against state-of-the-art MAP inference methods.
Tasks
Published	2019-05-09
URL	https://arxiv.org/abs/1905.03433v3
PDF	https://arxiv.org/pdf/1905.03433v3.pdf
PWC	https://paperswithcode.com/paper/190503433
Repo	https://github.com/wubaoyuan/Lpbox-ADMM
Framework	none
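
Why a sphere constraint can remove fractional solutions is easy to check numerically. The sketch below assumes the sphere is centred at (1/2, ..., 1/2) with radius sqrt(n)/2, which is an assumption about the exact parametrisation and not taken from the abstract. It also uses a plain box [0, 1]^n where the paper uses the local marginal polytope. Inside the box every term (x_i - 1/2)^2 is at most 1/4, with equality only when x_i is 0 or 1, so the sum reaches n/4 only at binary points.

```python
def in_box(x):
    """True if every coordinate lies in [0, 1] (the relaxed feasible region)."""
    return all(0.0 <= xi <= 1.0 for xi in x)


def on_sphere(x, tol=1e-12):
    """True if x lies on the sphere centred at (1/2, ..., 1/2) with radius sqrt(n)/2."""
    n = len(x)
    return abs(sum((xi - 0.5) ** 2 for xi in x) - n / 4.0) <= tol


def is_integral(x):
    """A point of the box is a valid 0/1 configuration iff it is also on the sphere."""
    return in_box(x) and on_sphere(x)


print(is_integral([0.0, 1.0, 1.0]))  # binary point
print(is_integral([0.5, 1.0, 0.0]))  # fractional point inside the box
```

The constraint set stays continuous, yet its feasible points coincide with the integer configurations, which is the sense in which the reformulation is equivalent to the original problem.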

Pyramid Feature Attention Network for Saliency detection


Title	Pyramid Feature Attention Network for Saliency detection
Authors	Ting Zhao, Xiangqian Wu
Abstract	Saliency detection is one of the basic challenges in computer vision. How to extract effective features is a critical point for saliency detection. Recent methods mainly integrate multi-scale convolutional features indiscriminately. However, not all features are useful for saliency detection and some even cause interference. To solve this problem, we propose Pyramid Feature Attention network to focus on effective high-level context features and low-level spatial structural features. First, we design a Context-aware Pyramid Feature Extraction (CPFE) module for multi-scale high-level feature maps to capture rich context features. Second, we adopt channel-wise attention (CA) after CPFE feature maps and spatial attention (SA) after low-level feature maps, then fuse the outputs of CA and SA together. Finally, we propose an edge preservation loss to guide the network to learn more detailed information in boundary localization. Extensive evaluations on five benchmark datasets demonstrate that the proposed method outperforms the state-of-the-art approaches under different evaluation metrics.
Tasks	Saliency Detection
Published	2019-03-01
URL	http://arxiv.org/abs/1903.00179v2
PDF	http://arxiv.org/pdf/1903.00179v2.pdf
PWC	https://paperswithcode.com/paper/pyramid-feature-selective-network-for
Repo	https://github.com/CaitinZhao/cvpr2019_Pyramid-Feature-Attention-Network-for-Saliency-detection
Framework	tf

Open-domain Event Extraction and Embedding for Natural Gas Market Prediction


Title	Open-domain Event Extraction and Embedding for Natural Gas Market Prediction
Authors	Minh Triet Chau, Diego Esteves, Jens Lehmann
Abstract	We propose an approach to predict the natural gas price several days ahead using historical price data and events extracted from news headlines. Most previous methods treat price as an extrapolatable time series; those that analyze the relation between prices and news either trim their price data to match a public news dataset, manually annotate headlines, or use off-the-shelf tools. In comparison to off-the-shelf tools, our event extraction method detects not only the occurrence of phenomena but also the changes in attribution and characteristics from public sources. Instead of using sentence embedding as a feature, we use every word of the extracted events, and encode and organize them before feeding them to the learning models. Empirical results are favorable in terms of prediction performance, money saved and scalability.
Tasks	Sentence Embedding, Time Series
Published	2019-12-08
URL	https://arxiv.org/abs/1912.11334v1
PDF	https://arxiv.org/pdf/1912.11334v1.pdf
PWC	https://paperswithcode.com/paper/open-domain-event-extraction-and-embedding
Repo	https://github.com/minhtriet/gas_market
Framework	none

Variance Reduced Local SGD with Lower Communication Complexity


Title	Variance Reduced Local SGD with Lower Communication Complexity
Authors	Xianfeng Liang, Shuheng Shen, Jingchang Liu, Zhen Pan, Enhong Chen, Yifei Cheng
Abstract	To accelerate the training of machine learning models, distributed stochastic gradient descent (SGD) and its variants have been widely adopted, which apply multiple workers in parallel to speed up training. Among them, Local SGD has gained much attention due to its lower communication cost. Nevertheless, when the data distribution on workers is non-identical, Local SGD requires $O(T^{\frac{3}{4}} N^{\frac{3}{4}})$ communications to maintain its linear iteration speedup property, where $T$ is the total number of iterations and $N$ is the number of workers. In this paper, we propose Variance Reduced Local SGD (VRL-SGD) to further reduce the communication complexity. Benefiting from eliminating the dependency on the gradient variance among workers, we theoretically prove that VRL-SGD achieves a linear iteration speedup with a lower communication complexity $O(T^{\frac{1}{2}} N^{\frac{3}{2}})$ even if workers access non-identical datasets. We conduct experiments on three machine learning tasks, and the experimental results demonstrate that VRL-SGD performs impressively better than Local SGD when the data among workers are quite diverse.
Tasks
Published	2019-12-30
URL	https://arxiv.org/abs/1912.12844v1
PDF	https://arxiv.org/pdf/1912.12844v1.pdf
PWC	https://paperswithcode.com/paper/variance-reduced-local-sgd-with-lower-1
Repo	https://github.com/zerolxf/VRL-SGD
Framework	pytorch
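
The two communication bounds in the abstract can be compared with simple arithmetic. Their ratio is T^(1/4) / N^(3/4), so the VRL-SGD bound is the smaller one once T exceeds N^3. The sketch below evaluates both expressions with all hidden constants set to 1, which is an assumption, since big-O notation does not fix them. The function names and the values of T and N are hypothetical.

```python
def local_sgd_comms(T, N):
    """Order of communications for Local SGD on non-identical data: T^(3/4) * N^(3/4)."""
    return T ** 0.75 * N ** 0.75


def vrl_sgd_comms(T, N):
    """Order of communications for VRL-SGD: T^(1/2) * N^(3/2)."""
    return T ** 0.5 * N ** 1.5


N = 8  # hypothetical number of workers, so N**3 == 512
for T in (100, 10 ** 6):  # one value below N**3, one far above it
    print(T, round(local_sgd_comms(T, N)), round(vrl_sgd_comms(T, N)))
```

For long training runs on a moderate number of workers the second bound is markedly smaller, which matches the regime the abstract targets.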

From voxels to pixels and back: Self-supervision in natural-image reconstruction from fMRI


Title	From voxels to pixels and back: Self-supervision in natural-image reconstruction from fMRI
Authors	Roman Beliy, Guy Gaziv, Assaf Hoogi, Francesca Strappini, Tal Golan, Michal Irani
Abstract	Reconstructing observed images from fMRI brain recordings is challenging. Unfortunately, acquiring sufficient “labeled” pairs of {Image, fMRI} (i.e., images with their corresponding fMRI responses) to span the huge space of natural images is prohibitive for many reasons. We present a novel approach which, in addition to the scarce labeled data (training pairs), also allows training fMRI-to-image reconstruction networks on “unlabeled” data (i.e., images without fMRI recording, and fMRI recording without images). The proposed model utilizes both an Encoder network (image-to-fMRI) and a Decoder network (fMRI-to-image). Concatenating these two networks back-to-back (Encoder-Decoder & Decoder-Encoder) allows augmenting the training with both types of unlabeled data. Importantly, it allows training on the unlabeled test-fMRI data. This self-supervision adapts the reconstruction network to the new input test-data, despite its deviation from the statistics of the scarce training data.
Tasks	Image Reconstruction
Published	2019-07-03
URL	https://arxiv.org/abs/1907.02431v1
PDF	https://arxiv.org/pdf/1907.02431v1.pdf
PWC	https://paperswithcode.com/paper/from-voxels-to-pixels-and-back-self
Repo	https://github.com/WeizmannVision/ssfmri2im
Framework	none
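
The back-to-back use of the two networks can be sketched as three loss terms: a supervised term on labeled pairs, an Encoder-Decoder cycle on unlabeled images, and a Decoder-Encoder cycle on unlabeled fMRI. The sketch below is a schematic under stated assumptions and not the paper's training code. The squared-error criterion, the equal weighting of the terms, and the toy linear stand-ins for the networks are all assumptions.

```python
def sq_err(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def total_loss(encoder, decoder, labeled_pairs, unlabeled_images, unlabeled_fmri):
    """Supervised decoding loss plus the two self-supervised cycle losses."""
    # Labeled {image, fMRI} pairs: decode the recording, compare to the image.
    supervised = sum(sq_err(decoder(r), x) for x, r in labeled_pairs)
    # Images without fMRI: image -> fMRI -> image should return the image.
    image_cycle = sum(sq_err(decoder(encoder(x)), x) for x in unlabeled_images)
    # fMRI without images: fMRI -> image -> fMRI should return the recording.
    fmri_cycle = sum(sq_err(encoder(decoder(r)), r) for r in unlabeled_fmri)
    return supervised + image_cycle + fmri_cycle


def toy_encoder(x):
    """Hypothetical stand-in for the image-to-fMRI network."""
    return [2.0 * v for v in x]


def toy_decoder(r):
    """Hypothetical stand-in for the fMRI-to-image network (exact inverse)."""
    return [v / 2.0 for v in r]


pairs = [([1.0, 2.0], [2.0, 4.0])]
loss_value = total_loss(toy_encoder, toy_decoder, pairs, [[3.0, 1.0]], [[6.0, 8.0]])
print(loss_value)
```

Because the fMRI cycle term needs no images, it can also be evaluated on test recordings, which is how the abstract's adaptation to test data becomes possible.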

MAVNet: an Effective Semantic Segmentation Micro-Network for MAV-based Tasks


Title	MAVNet: an Effective Semantic Segmentation Micro-Network for MAV-based Tasks
Authors	Ty Nguyen, Shreyas S. Shivakumar, Ian D. Miller, James Keller, Elijah S. Lee, Alex Zhou, Tolga Ozaslan, Giuseppe Loianno, Joseph H. Harwood, Jennifer Wozencraft, Camillo J. Taylor, Vijay Kumar
Abstract	Real-time semantic image segmentation on platforms subject to size, weight and power (SWaP) constraints is a key area of interest for air surveillance and inspection. In this work, we propose MAVNet: a small, light-weight, deep neural network for real-time semantic segmentation on Micro Aerial Vehicles (MAVs). MAVNet, inspired by ERFNet, features 400 times fewer parameters and achieves comparable performance with some reference models in empirical experiments. Our model strikes a trade-off between speed and accuracy, reaching up to 48 FPS on an NVIDIA 1080Ti and 9 FPS on the NVIDIA Jetson Xavier when processing high resolution imagery. Additionally, we provide two novel datasets that represent challenges in semantic segmentation for real-time MAV tracking and infrastructure inspection tasks and verify MAVNet on these datasets. Our algorithm and datasets are made publicly available.
Tasks	Real-Time Semantic Segmentation, Semantic Segmentation, Visual Odometry
Published	2019-04-03
URL	https://arxiv.org/abs/1904.01795v2
PDF	https://arxiv.org/pdf/1904.01795v2.pdf
PWC	https://paperswithcode.com/paper/mavnet-an-effective-semantic-segmentation
Repo	https://github.com/tynguyen/MAVNet
Framework	none

Neural Architectures for Nested NER through Linearization


Title	Neural Architectures for Nested NER through Linearization
Authors	Jana Straková, Milan Straka, Jan Hajič
Abstract	We propose two neural network architectures for nested named entity recognition (NER), a setting in which named entities may overlap and also be labeled with more than one label. We encode the nested labels using a linearized scheme. In our first proposed approach, the nested labels are modeled as multilabels corresponding to the Cartesian product of the nested labels in a standard LSTM-CRF architecture. In the second one, the nested NER is viewed as a sequence-to-sequence problem, in which the input sequence consists of the tokens and output sequence of the labels, using hard attention on the word whose label is being predicted. The proposed methods outperform the nested NER state of the art on four corpora: ACE-2004, ACE-2005, GENIA and Czech CNEC. We also enrich our architectures with the recently published contextual embeddings: ELMo, BERT and Flair, reaching further improvements for the four nested entity corpora. In addition, we report flat NER state-of-the-art results for CoNLL-2002 Dutch and Spanish and for CoNLL-2003 English.
Tasks	Named Entity Recognition, Nested Mention Recognition, Nested Named Entity Recognition
Published	2019-08-19
URL	https://arxiv.org/abs/1908.06926v1
PDF	https://arxiv.org/pdf/1908.06926v1.pdf
PWC	https://paperswithcode.com/paper/neural-architectures-for-nested-ner-through-1
Repo	https://github.com/ufal/acl2019_nested_ner
Framework	none
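
The linearized encoding of nested labels can be illustrated by collapsing all labels that cover a token into one multilabel string. The sketch below is a hedged illustration and not the released code: the BILOU prefixes, the "|" separator, the outermost-first ordering, and the function name are assumptions.

```python
def linearize(tokens, entities, sep="|"):
    """Return one multilabel tag per token for possibly nested entities.

    `entities` is a list of (start, end_exclusive, label) spans. Labels of
    overlapping spans are concatenated, outermost first.
    """
    tags = [[] for _ in tokens]
    # Sort by start position, then longer spans first, so outer labels come first.
    for start, end, label in sorted(entities, key=lambda e: (e[0], -(e[1] - e[0]))):
        for i in range(start, end):
            if end - start == 1:
                prefix = "U"  # unit-length entity
            elif i == start:
                prefix = "B"  # beginning
            elif i == end - 1:
                prefix = "L"  # last
            else:
                prefix = "I"  # inside
            tags[i].append(f"{prefix}-{label}")
    return [sep.join(t) if t else "O" for t in tags]


# Hypothetical example: a location nested inside an organisation name.
tokens = ["Bank", "of", "China", "opened"]
entities = [(0, 3, "ORG"), (2, 3, "LOC")]
print(linearize(tokens, entities))
```

Each distinct multilabel string can then be treated as a single class by a standard sequence tagger, or emitted label by label by a sequence-to-sequence decoder, which corresponds to the two architectures described in the abstract.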