Paper Group AWR 162
DiffEqFlux.jl - A Julia Library for Neural Differential Equations
Title | DiffEqFlux.jl - A Julia Library for Neural Differential Equations |
Authors | Chris Rackauckas, Mike Innes, Yingbo Ma, Jesse Bettencourt, Lyndon White, Vaibhav Dixit |
Abstract | DiffEqFlux.jl is a library for fusing neural networks and differential equations. In this work we describe differential equations from the viewpoint of data science and discuss the complementary nature of machine learning models and differential equations. We demonstrate the ability to incorporate DifferentialEquations.jl-defined differential equation problems into a Flux-defined neural network, and vice versa. The advantages of being able to use the entire DifferentialEquations.jl suite for this purpose are demonstrated by counterexamples where simple integration strategies fail, but the sophisticated integration strategies provided by the DifferentialEquations.jl library succeed. This is followed by a demonstration of delay differential equations and stochastic differential equations inside of neural networks. We show high-level functionality for defining neural ordinary differential equations (neural networks embedded into the differential equation) and describe the extra models in the Flux model zoo, which include neural stochastic differential equations. We conclude by discussing the various adjoint methods used for backpropagation through the differential equation solvers. DiffEqFlux.jl is an important contribution to the area, as it allows the full weight of the differential equation solvers developed from decades of research in the scientific computing field to be readily applied to the challenges posed by machine learning and data science. |
Tasks | |
Published | 2019-02-06 |
URL | http://arxiv.org/abs/1902.02376v1 |
http://arxiv.org/pdf/1902.02376v1.pdf | |
PWC | https://paperswithcode.com/paper/diffeqfluxjl-a-julia-library-for-neural |
Repo | https://github.com/UnofficialJuliaMirrorSnapshots/DiffEqFlux.jl-aae7a2af-3d4f-5e19-a356-7da93b79d9d0 |
Framework | none |
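The DiffEqFlux.jl abstract above refers to counterexamples where simple integration strategies fail and more sophisticated ones succeed. The sketch below is not taken from the paper. It uses a textbook stiff test equation, y' = lam * y with lam = -50, and the function names and step size are chosen here purely for illustration. With step h = 0.1, forward Euler multiplies the state by 1 + h*lam = -4 at every step and blows up, while backward Euler multiplies by 1 / (1 - h*lam) = 1/6 and decays like the true solution.

```python
def explicit_euler(lam, y0, h, steps):
    """Integrate y' = lam * y with the explicit (forward) Euler method."""
    y = y0
    for _ in range(steps):
        y = y + h * lam * y
    return y


def implicit_euler(lam, y0, h, steps):
    """Integrate y' = lam * y with the implicit (backward) Euler method.

    For this linear problem the implicit update y_new = y + h * lam * y_new
    can be solved in closed form: y_new = y / (1 - h * lam).
    """
    y = y0
    for _ in range(steps):
        y = y / (1.0 - h * lam)
    return y


# Illustrative values (assumptions, not from the paper).
lam, y0, h, steps = -50.0, 1.0, 0.1, 20

print(explicit_euler(lam, y0, h, steps))  # magnitude grows: unstable for this step size
print(implicit_euler(lam, y0, h, steps))  # decays toward 0, like exp(lam * t)
```

The true solution at t = 2 is exp(-100), which is essentially zero, so only the implicit result is qualitatively right at this step size.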
Equalized odds postprocessing under imperfect group information
Title | Equalized odds postprocessing under imperfect group information |
Authors | Pranjal Awasthi, Matthäus Kleindessner, Jamie Morgenstern |
Abstract | Most approaches aiming to ensure a model’s fairness with respect to a protected attribute (such as gender or race) assume that the true value of the attribute is known for every data point. In this paper, we ask to what extent fairness interventions can be effective even when only imperfect information about the protected attribute is available. In particular, we study the prominent equalized odds postprocessing method of Hardt et al. (2016) under a perturbation of the attribute. We identify conditions on the perturbation that guarantee that the bias of a classifier is reduced even by running equalized odds with the perturbed attribute. We also study the error of the resulting classifier. We empirically observe that, under our identified conditions, the error most often does not suffer from a perturbation of the protected attribute. For a special case, we formally prove this observation to be true. |
Tasks | |
Published | 2019-06-07 |
URL | https://arxiv.org/abs/1906.03284v3 |
https://arxiv.org/pdf/1906.03284v3.pdf | |
PWC | https://paperswithcode.com/paper/effectiveness-of-equalized-odds-for-fair |
Repo | https://github.com/matthklein/equalized_odds_with_perturbed_attribute |
Framework | none |
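Equalized odds asks that the true positive rate and the false positive rate be equal across the groups defined by the protected attribute. The sketch below only measures those two gaps for a fixed binary classifier; it does not implement the postprocessing method of Hardt et al. The function names and the toy data are assumptions made for illustration, and the code assumes every group contains at least one positive and one negative example.

```python
def group_rates(y_true, y_pred, groups, g):
    """Return (TPR, FPR) of binary predictions restricted to group g."""
    tp = fp = pos = neg = 0
    for y, p, a in zip(y_true, y_pred, groups):
        if a != g:
            continue
        if y == 1:
            pos += 1
            tp += p
        else:
            neg += 1
            fp += p
    # Assumes pos > 0 and neg > 0 for the group.
    return tp / pos, fp / neg


def equalized_odds_gaps(y_true, y_pred, groups):
    """Return (|TPR gap|, |FPR gap|) between group 0 and group 1."""
    tpr0, fpr0 = group_rates(y_true, y_pred, groups, 0)
    tpr1, fpr1 = group_rates(y_true, y_pred, groups, 1)
    return abs(tpr0 - tpr1), abs(fpr0 - fpr1)


# Toy data (hypothetical).
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 1, 0]
true_attr = [0, 0, 0, 0, 1, 1, 1, 1]
noisy_attr = [0, 0, 0, 1, 0, 1, 1, 1]  # two entries flipped

print(equalized_odds_gaps(y_true, y_pred, true_attr))
print(equalized_odds_gaps(y_true, y_pred, noisy_attr))
```

The two printed pairs differ, which shows in miniature why the paper asks when an intervention driven by a perturbed attribute still reduces bias with respect to the true one.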
Pre-Training with Whole Word Masking for Chinese BERT
Title | Pre-Training with Whole Word Masking for Chinese BERT |
Authors | Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Ziqing Yang, Shijin Wang, Guoping Hu |
Abstract | Bidirectional Encoder Representations from Transformers (BERT) has shown marvelous improvements across various NLP tasks. Recently, an upgraded version of BERT has been released with Whole Word Masking (WWM), which mitigates the drawbacks of masking partial WordPiece tokens in pre-training BERT. In this technical report, we adapt whole word masking to Chinese text, masking the whole word instead of individual Chinese characters, which could bring another challenge to the Masked Language Model (MLM) pre-training task. The proposed models are verified on various NLP tasks, from sentence level to document level, including machine reading comprehension (CMRC 2018, DRCD, CJRC), natural language inference (XNLI), sentiment classification (ChnSentiCorp), sentence pair matching (LCQMC, BQ Corpus), and document classification (THUCNews). Experimental results on these datasets show that whole word masking could bring another significant gain. Moreover, we also examine the effectiveness of the Chinese pre-trained models: BERT, ERNIE, BERT-wwm, BERT-wwm-ext, RoBERTa-wwm-ext, and RoBERTa-wwm-ext-large. We release all the pre-trained models: https://github.com/ymcui/Chinese-BERT-wwm |
Tasks | Document Classification, Language Modelling, Machine Reading Comprehension, Named Entity Recognition, Natural Language Inference, Reading Comprehension, Sentiment Analysis |
Published | 2019-06-19 |
URL | https://arxiv.org/abs/1906.08101v2 |
https://arxiv.org/pdf/1906.08101v2.pdf | |
PWC | https://paperswithcode.com/paper/pre-training-with-whole-word-masking-for |
Repo | https://github.com/ymcui/Chinese-BERT-wwm |
Framework | pytorch |
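The core idea in the abstract above is that when a word is selected for masking, every token belonging to it is masked together. The sketch below illustrates just that step. It assumes the sentence has already been split into words by some external segmenter, with each word given as a list of its tokens (single characters here), and it takes the indices of the words to mask as input rather than sampling them. The function name and example sentence are hypothetical, and real pre-training pipelines involve further details that are not shown.

```python
def whole_word_mask(segmented, chosen, mask_token="[MASK]"):
    """Mask every token of each chosen word.

    segmented: list of words, each a list of its tokens
    chosen:    set of word indices to mask
    Returns the flat token sequence after masking.
    """
    out = []
    for i, word in enumerate(segmented):
        if i in chosen:
            # Replace all tokens of the word, not just one of them.
            out.extend([mask_token] * len(word))
        else:
            out.extend(word)
    return out


# Hypothetical pre-segmented sentence: five words, nine characters.
sentence = [["使", "用"], ["语", "言"], ["模", "型"], ["来"], ["预", "测"]]
print(whole_word_mask(sentence, {2}))
```

With character-level masking, only one of the two characters of the third word might be hidden; here both are replaced, so the model cannot recover the word from its other half.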
Pan-tilt-zoom SLAM for Sports Videos
Title | Pan-tilt-zoom SLAM for Sports Videos |
Authors | Jikai Lu, Jianhui Chen, James J. Little |
Abstract | We present an online SLAM system specifically designed to track pan-tilt-zoom (PTZ) cameras in highly dynamic sports such as basketball and soccer games. In these games, PTZ cameras rotate very fast and players cover large image areas. To overcome these challenges, we propose to use a novel camera model for tracking and to use rays as landmarks in mapping. Rays overcome the missing depth in pure-rotation cameras. We also develop an online pan-tilt forest for mapping and introduce moving objects (players) detection to mitigate negative impacts from foreground objects. We test our method on both synthetic and real datasets. The experimental results show the superior performance of our method over previous methods for online PTZ camera pose estimation. |
Tasks | Pose Estimation |
Published | 2019-07-20 |
URL | https://arxiv.org/abs/1907.08816v1 |
https://arxiv.org/pdf/1907.08816v1.pdf | |
PWC | https://paperswithcode.com/paper/pan-tilt-zoom-slam-for-sports-videos |
Repo | https://github.com/lulufa390/Pan-tilt-zoom-SLAM |
Framework | none |
DenseNet Models for Tiny ImageNet Classification
Title | DenseNet Models for Tiny ImageNet Classification |
Authors | Zoheb Abai, Nishad Rajmalwar |
Abstract | In this paper, we present two image classification models on the Tiny ImageNet dataset. We built two very different networks from scratch based on the idea of Densely Connected Convolution Networks. The architecture of the networks is designed based on the image resolution of this specific dataset and by calculating the Receptive Field of the convolution layers. We also used some non-conventional techniques related to image augmentation and Cyclical Learning Rate to improve the accuracy of our models. The networks are trained under high constraints and low computation resources. We aimed to achieve top-1 validation accuracy of 60%; the results and error analysis are also presented. |
Tasks | Image Augmentation, Image Classification |
Published | 2019-04-23 |
URL | http://arxiv.org/abs/1904.10429v1 |
http://arxiv.org/pdf/1904.10429v1.pdf | |
PWC | https://paperswithcode.com/paper/densenet-models-for-tiny-imagenet |
Repo | https://github.com/ZohebAbai/Tiny-ImageNet-Challenge |
Framework | none |
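The abstract above says the architectures were designed by calculating the receptive field of the convolution layers. A minimal sketch of the standard recurrence for a stack of convolution or pooling layers follows: each layer with kernel size k grows the receptive field by (k - 1) times the current stride product, and then multiplies that product by its own stride. The layer stack shown is hypothetical, not the authors' network, and dilation is ignored.

```python
def receptive_field(layers):
    """Receptive field (in input pixels) of one output unit.

    layers: list of (kernel_size, stride) tuples, in order from input to output.
    """
    rf, jump = 1, 1  # jump = distance in input pixels between adjacent units
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
    return rf


# Hypothetical stack: 3x3 conv, 3x3 conv, 2x2 pool (stride 2), 3x3 conv.
layers = [(3, 1), (3, 1), (2, 2), (3, 1)]
print(receptive_field(layers))
```

Tiny ImageNet images are 64 x 64 pixels, so a calculation like this makes it easy to check when the deepest layers already see the whole image.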
Unsupervised Embedding Learning via Invariant and Spreading Instance Feature
Title | Unsupervised Embedding Learning via Invariant and Spreading Instance Feature |
Authors | Mang Ye, Xu Zhang, Pong C. Yuen, Shih-Fu Chang |
Abstract | This paper studies the unsupervised embedding learning problem, which requires an effective similarity measurement between samples in low-dimensional embedding space. Motivated by the positive concentrated and negative separated properties observed from category-wise supervised learning, we propose to utilize the instance-wise supervision to approximate these properties, which aims at learning data augmentation invariant and instance spread-out features. To achieve this goal, we propose a novel instance based softmax embedding method, which directly optimizes the 'real' instance features on top of the softmax function. It achieves significantly faster learning speed and higher accuracy than all existing methods. The proposed method performs well for both seen and unseen testing categories with cosine similarity. It also achieves competitive performance even without pre-trained network over samples from fine-grained categories. |
Tasks | Data Augmentation |
Published | 2019-04-06 |
URL | http://arxiv.org/abs/1904.03436v1 |
http://arxiv.org/pdf/1904.03436v1.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-embedding-learning-via-invariant |
Repo | https://github.com/mangye16/Unsupervised_Embedding_Learning |
Framework | pytorch |
Divide and Conquer: A Deep CASA Approach to Talker-independent Monaural Speaker Separation
Title | Divide and Conquer: A Deep CASA Approach to Talker-independent Monaural Speaker Separation |
Authors | Yuzhou Liu, DeLiang Wang |
Abstract | We address talker-independent monaural speaker separation from the perspectives of deep learning and computational auditory scene analysis (CASA). Specifically, we decompose the multi-speaker separation task into the stages of simultaneous grouping and sequential grouping. Simultaneous grouping is first performed in each time frame by separating the spectra of different speakers with a permutation-invariantly trained neural network. In the second stage, the frame-level separated spectra are sequentially grouped to different speakers by a clustering network. The proposed deep CASA approach optimizes frame-level separation and speaker tracking in turn, and produces excellent results for both objectives. Experimental results on the benchmark WSJ0-2mix database show that the new approach achieves state-of-the-art results with a modest model size. |
Tasks | Speaker Separation, Speech Separation |
Published | 2019-04-25 |
URL | http://arxiv.org/abs/1904.11148v1 |
http://arxiv.org/pdf/1904.11148v1.pdf | |
PWC | https://paperswithcode.com/paper/divide-and-conquer-a-deep-casa-approach-to |
Repo | https://github.com/yuzhou-git/deep-casa |
Framework | tf |
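The first stage described above relies on permutation-invariant training: since the network's outputs have no fixed speaker order, the loss is evaluated under every assignment of outputs to reference speakers and the smallest value is used. The sketch below shows this for a single frame. Mean squared error is used as the per-pair loss purely as an assumption, and the function names and toy spectra are hypothetical; the paper's actual objective may differ.

```python
from itertools import permutations


def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def pit_loss(estimates, targets):
    """Permutation-invariant loss for one frame.

    Returns (loss, perm), where perm[i] is the index of the target
    assigned to estimate i under the best (lowest-loss) assignment.
    """
    best = None
    for perm in permutations(range(len(targets))):
        loss = sum(mse(estimates[i], targets[p]) for i, p in enumerate(perm))
        loss /= len(perm)
        if best is None or loss < best[0]:
            best = (loss, perm)
    return best


# Toy frame: the network emitted the two speakers in swapped order.
estimates = [[0.9, 0.1, 0.0], [0.0, 1.0, 1.0]]
targets = [[0.0, 1.0, 1.0], [1.0, 0.0, 0.0]]

loss, perm = pit_loss(estimates, targets)
print(loss, perm)
```

Because the best permutation can change from frame to frame, a second step is still needed to decide which frame-level outputs belong to the same speaker over time, which is the role the abstract assigns to the clustering network.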
Benchmarking Classic and Learned Navigation in Complex 3D Environments
Title | Benchmarking Classic and Learned Navigation in Complex 3D Environments |
Authors | Dmytro Mishkin, Alexey Dosovitskiy, Vladlen Koltun |
Abstract | Navigation research is attracting renewed interest with the advent of learning-based methods. However, this new line of work is largely disconnected from well-established classic navigation approaches. In this paper, we take a step towards coordinating these two directions of research. We set up classic and learning-based navigation systems in common simulated environments and thoroughly evaluate them in indoor spaces of varying complexity, with access to different sensory modalities. Additionally, we measure human performance in the same environments. We find that a classic pipeline, when properly tuned, can perform very well in complex cluttered environments. On the other hand, learned systems can operate more robustly with a limited sensor suite. Overall, both approaches are still far from human-level performance. |
Tasks | |
Published | 2019-01-30 |
URL | http://arxiv.org/abs/1901.10915v2 |
http://arxiv.org/pdf/1901.10915v2.pdf | |
PWC | https://paperswithcode.com/paper/benchmarking-classic-and-learned-navigation |
Repo | https://github.com/ducha-aiki/navigation-benchmark |
Framework | pytorch |
MAP Inference via L2-Sphere Linear Program Reformulation
Title | MAP Inference via L2-Sphere Linear Program Reformulation |
Authors | Baoyuan Wu, Li Shen, Tong Zhang, Bernard Ghanem |
Abstract | Maximum a posteriori (MAP) inference is an important task for graphical models. Due to complex dependencies among variables in realistic models, finding an exact solution for MAP inference is often intractable. Thus, many approximation methods have been developed, among which the linear programming (LP) relaxation based methods show promising performance. However, one major drawback of LP relaxation is that it may give fractional solutions. Instead of presenting a tighter relaxation, in this work we propose a continuous but equivalent reformulation of the original MAP inference problem, called LS-LP. We add the L2-sphere constraint onto the original LP relaxation, leading to an intersected space with the local marginal polytope that is equivalent to the space of all valid integer label configurations. Thus, LS-LP is equivalent to the original MAP inference problem. We propose a perturbed alternating direction method of multipliers (ADMM) algorithm to optimize the LS-LP problem, by adding a sufficiently small perturbation epsilon onto the objective function and constraints. We prove that the perturbed ADMM algorithm globally converges to the epsilon-Karush-Kuhn-Tucker (epsilon-KKT) point of the LS-LP problem. The convergence rate is also analyzed. Experiments on several benchmark datasets from Probabilistic Inference Challenge (PIC 2011) and OpenGM 2 show competitive performance of our proposed method against state-of-the-art MAP inference methods. |
Tasks | |
Published | 2019-05-09 |
URL | https://arxiv.org/abs/1905.03433v3 |
https://arxiv.org/pdf/1905.03433v3.pdf | |
PWC | https://paperswithcode.com/paper/190503433 |
Repo | https://github.com/wubaoyuan/Lpbox-ADMM |
Framework | none |
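The abstract above claims that intersecting an LP relaxation with an L2 sphere leaves only integer solutions. The short derivation below illustrates why this can hold in the simplest setting, a vector constrained to the unit box. The particular sphere used here (center at the box midpoint, squared radius n/4) is an assumption for illustration; the paper applies its constraint to the variables of the local marginal polytope, and its exact form is not given in the abstract.

```latex
% Assumed setting: x \in [0,1]^n, sphere centered at \tfrac{1}{2}\mathbf{1}.
\begin{align*}
  x_i \in [0,1]
    &\;\Longrightarrow\; \left(x_i - \tfrac{1}{2}\right)^2 \le \tfrac{1}{4},
    \quad \text{with equality iff } x_i \in \{0,1\}, \\
  \left\lVert x - \tfrac{1}{2}\mathbf{1} \right\rVert_2^2
    &= \sum_{i=1}^{n} \left(x_i - \tfrac{1}{2}\right)^2 \;\le\; \tfrac{n}{4}, \\
  \left\lVert x - \tfrac{1}{2}\mathbf{1} \right\rVert_2^2 = \tfrac{n}{4}
    &\;\Longleftrightarrow\; \left(x_i - \tfrac{1}{2}\right)^2 = \tfrac{1}{4}
       \ \text{for all } i
    \;\Longleftrightarrow\; x \in \{0,1\}^n .
\end{align*}
```

So the sphere touches the box only at its vertices: any fractional coordinate makes the sum strictly smaller than n/4, and the continuous constraint set coincides with the binary one.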
Pyramid Feature Attention Network for Saliency detection
Title | Pyramid Feature Attention Network for Saliency detection |
Authors | Ting Zhao, Xiangqian Wu |
Abstract | Saliency detection is one of the basic challenges in computer vision. How to extract effective features is a critical point for saliency detection. Recent methods mainly integrate multi-scale convolutional features indiscriminately. However, not all features are useful for saliency detection and some even cause interference. To solve this problem, we propose the Pyramid Feature Attention network to focus on effective high-level context features and low-level spatial structural features. First, we design a Context-aware Pyramid Feature Extraction (CPFE) module for multi-scale high-level feature maps to capture rich context features. Second, we adopt channel-wise attention (CA) after CPFE feature maps and spatial attention (SA) after low-level feature maps, then fuse the outputs of CA and SA together. Finally, we propose an edge preservation loss to guide the network to learn more detailed information in boundary localization. Extensive evaluations on five benchmark datasets demonstrate that the proposed method outperforms the state-of-the-art approaches under different evaluation metrics. |
Tasks | Saliency Detection |
Published | 2019-03-01 |
URL | http://arxiv.org/abs/1903.00179v2 |
http://arxiv.org/pdf/1903.00179v2.pdf | |
PWC | https://paperswithcode.com/paper/pyramid-feature-selective-network-for |
Repo | https://github.com/CaitinZhao/cvpr2019_Pyramid-Feature-Attention-Network-for-Saliency-detection |
Framework | tf |
Open-domain Event Extraction and Embedding for Natural Gas Market Prediction
Title | Open-domain Event Extraction and Embedding for Natural Gas Market Prediction |
Authors | Minh Triet Chau, Diego Esteves, Jens Lehmann |
Abstract | We propose an approach to predict the natural gas price several days ahead using historical price data and events extracted from news headlines. Most previous methods treat price as an extrapolatable time series; those that analyze the relation between prices and news either trim their price data to match a public news dataset, manually annotate headlines, or use off-the-shelf tools. In comparison to off-the-shelf tools, our event extraction method detects not only the occurrence of phenomena but also the changes in attribution and characteristics from public sources. Instead of using sentence embedding as a feature, we use every word of the extracted events, encoding and organizing them before feeding them to the learning models. Empirical results are favorable in terms of prediction performance, money saved and scalability. |
Tasks | Sentence Embedding, Time Series |
Published | 2019-12-08 |
URL | https://arxiv.org/abs/1912.11334v1 |
https://arxiv.org/pdf/1912.11334v1.pdf | |
PWC | https://paperswithcode.com/paper/open-domain-event-extraction-and-embedding |
Repo | https://github.com/minhtriet/gas_market |
Framework | none |
Variance Reduced Local SGD with Lower Communication Complexity
Title | Variance Reduced Local SGD with Lower Communication Complexity |
Authors | Xianfeng Liang, Shuheng Shen, Jingchang Liu, Zhen Pan, Enhong Chen, Yifei Cheng |
Abstract | To accelerate the training of machine learning models, distributed stochastic gradient descent (SGD) and its variants have been widely adopted, which apply multiple workers in parallel to speed up training. Among them, Local SGD has gained much attention due to its lower communication cost. Nevertheless, when the data distribution on workers is non-identical, Local SGD requires $O(T^{\frac{3}{4}} N^{\frac{3}{4}})$ communications to maintain its linear iteration speedup property, where $T$ is the total number of iterations and $N$ is the number of workers. In this paper, we propose Variance Reduced Local SGD (VRL-SGD) to further reduce the communication complexity. Benefiting from eliminating the dependency on the gradient variance among workers, we theoretically prove that VRL-SGD achieves a linear iteration speedup with a lower communication complexity $O(T^{\frac{1}{2}} N^{\frac{3}{2}})$ even if workers access non-identical datasets. We conduct experiments on three machine learning tasks, and the experimental results demonstrate that VRL-SGD performs impressively better than Local SGD when the data among workers are quite diverse. |
Tasks | |
Published | 2019-12-30 |
URL | https://arxiv.org/abs/1912.12844v1 |
https://arxiv.org/pdf/1912.12844v1.pdf | |
PWC | https://paperswithcode.com/paper/variance-reduced-local-sgd-with-lower-1 |
Repo | https://github.com/zerolxf/VRL-SGD |
Framework | pytorch |
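The two communication bounds quoted above can be compared directly. Dividing the Local SGD bound by the VRL-SGD bound gives T^(1/4) / N^(3/4), which exceeds 1 exactly when T > N^3, so the VRL-SGD bound is the smaller one once the number of iterations is large relative to the cube of the number of workers. The sketch below evaluates both expressions with all constants hidden by the O-notation set to 1, which is an assumption; the values of T and N are illustrative only.

```python
def local_sgd_comms(T, N):
    """Leading term of the Local SGD bound: T^(3/4) * N^(3/4)."""
    return T ** 0.75 * N ** 0.75


def vrl_sgd_comms(T, N):
    """Leading term of the VRL-SGD bound: T^(1/2) * N^(3/2)."""
    return T ** 0.5 * N ** 1.5


N = 8  # N^3 = 512
for T in (100, 10**6):
    print(T, local_sgd_comms(T, N), vrl_sgd_comms(T, N))
```

For N = 8 the crossover sits at T = 512: below it the Local SGD expression is smaller, above it the VRL-SGD expression is.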
From voxels to pixels and back: Self-supervision in natural-image reconstruction from fMRI
Title | From voxels to pixels and back: Self-supervision in natural-image reconstruction from fMRI |
Authors | Roman Beliy, Guy Gaziv, Assaf Hoogi, Francesca Strappini, Tal Golan, Michal Irani |
Abstract | Reconstructing observed images from fMRI brain recordings is challenging. Unfortunately, acquiring sufficient “labeled” pairs of {Image, fMRI} (i.e., images with their corresponding fMRI responses) to span the huge space of natural images is prohibitive for many reasons. We present a novel approach which, in addition to the scarce labeled data (training pairs), allows training fMRI-to-image reconstruction networks on “unlabeled” data as well (i.e., images without fMRI recordings, and fMRI recordings without images). The proposed model utilizes both an Encoder network (image-to-fMRI) and a Decoder network (fMRI-to-image). Concatenating these two networks back-to-back (Encoder-Decoder & Decoder-Encoder) allows augmenting the training with both types of unlabeled data. Importantly, it allows training on the unlabeled test-fMRI data. This self-supervision adapts the reconstruction network to the new input test-data, despite its deviation from the statistics of the scarce training data. |
Tasks | Image Reconstruction |
Published | 2019-07-03 |
URL | https://arxiv.org/abs/1907.02431v1 |
https://arxiv.org/pdf/1907.02431v1.pdf | |
PWC | https://paperswithcode.com/paper/from-voxels-to-pixels-and-back-self |
Repo | https://github.com/WeizmannVision/ssfmri2im |
Framework | none |
MAVNet: an Effective Semantic Segmentation Micro-Network for MAV-based Tasks
Title | MAVNet: an Effective Semantic Segmentation Micro-Network for MAV-based Tasks |
Authors | Ty Nguyen, Shreyas S. Shivakumar, Ian D. Miller, James Keller, Elijah S. Lee, Alex Zhou, Tolga Ozaslan, Giuseppe Loianno, Joseph H. Harwood, Jennifer Wozencraft, Camillo J. Taylor, Vijay Kumar |
Abstract | Real-time semantic image segmentation on platforms subject to size, weight and power (SWaP) constraints is a key area of interest for air surveillance and inspection. In this work, we propose MAVNet: a small, light-weight, deep neural network for real-time semantic segmentation on Micro Aerial Vehicles (MAVs). MAVNet, inspired by ERFNet, features 400 times fewer parameters and achieves comparable performance with some reference models in empirical experiments. Our model achieves a trade-off between speed and accuracy, reaching up to 48 FPS on an NVIDIA 1080Ti and 9 FPS on the NVIDIA Jetson Xavier when processing high resolution imagery. Additionally, we provide two novel datasets that represent challenges in semantic segmentation for real-time MAV tracking and infrastructure inspection tasks and verify MAVNet on these datasets. Our algorithm and datasets are made publicly available. |
Tasks | Real-Time Semantic Segmentation, Semantic Segmentation, Visual Odometry |
Published | 2019-04-03 |
URL | https://arxiv.org/abs/1904.01795v2 |
https://arxiv.org/pdf/1904.01795v2.pdf | |
PWC | https://paperswithcode.com/paper/mavnet-an-effective-semantic-segmentation |
Repo | https://github.com/tynguyen/MAVNet |
Framework | none |
Neural Architectures for Nested NER through Linearization
Title | Neural Architectures for Nested NER through Linearization |
Authors | Jana Straková, Milan Straka, Jan Hajič |
Abstract | We propose two neural network architectures for nested named entity recognition (NER), a setting in which named entities may overlap and also be labeled with more than one label. We encode the nested labels using a linearized scheme. In our first proposed approach, the nested labels are modeled as multilabels corresponding to the Cartesian product of the nested labels in a standard LSTM-CRF architecture. In the second one, the nested NER is viewed as a sequence-to-sequence problem, in which the input sequence consists of the tokens and output sequence of the labels, using hard attention on the word whose label is being predicted. The proposed methods outperform the nested NER state of the art on four corpora: ACE-2004, ACE-2005, GENIA and Czech CNEC. We also enrich our architectures with the recently published contextual embeddings: ELMo, BERT and Flair, reaching further improvements for the four nested entity corpora. In addition, we report flat NER state-of-the-art results for CoNLL-2002 Dutch and Spanish and for CoNLL-2003 English. |
Tasks | Named Entity Recognition, Nested Mention Recognition, Nested Named Entity Recognition |
Published | 2019-08-19 |
URL | https://arxiv.org/abs/1908.06926v1 |
https://arxiv.org/pdf/1908.06926v1.pdf | |
PWC | https://paperswithcode.com/paper/neural-architectures-for-nested-ner-through-1 |
Repo | https://github.com/ufal/acl2019_nested_ner |
Framework | none |
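The first approach in the abstract above turns nested entity labels into a single multilabel per token, so that a standard sequence tagger can predict them. The sketch below shows one way such a linearization could look. The BIO prefixes, the "|" separator, and the outermost-first ordering (longer spans first) are all assumptions made for illustration, as are the function name and the example sentence; the paper's exact encoding may differ.

```python
def linearize(n_tokens, entities, sep="|"):
    """Encode nested entities as one multilabel string per token.

    entities: list of (start, end_exclusive, label) spans, possibly nested.
    Returns a list of n_tokens strings such as "I-ORG|B-LOC", or "O".
    """
    # Outermost first: for nested spans, the enclosing span is the longer one.
    ordered = sorted(entities, key=lambda e: (-(e[1] - e[0]), e[0]))
    tags = [[] for _ in range(n_tokens)]
    for start, end, label in ordered:
        for i in range(start, end):
            prefix = "B" if i == start else "I"
            tags[i].append(f"{prefix}-{label}")
    return [sep.join(t) if t else "O" for t in tags]


# Hypothetical example: a location nested inside an organization.
tokens = ["University", "of", "Chicago", "opened"]
entities = [(0, 3, "ORG"), (2, 3, "LOC")]

for token, tag in zip(tokens, linearize(len(tokens), entities)):
    print(token, tag)
```

Each distinct combined string then acts as one class for the tagger, which is how a nested problem can be reduced to a flat sequence-labeling one.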