Paper Group ANR 476
A Simple Differentiable Programming Language. MixUp as Directional Adversarial Training. TinBiNN: Tiny Binarized Neural Network Overlay in about 5,000 4-LUTs and 5mW. Removing Stripes, Scratches, and Curtaining with Non-Recoverable Compressed Sensing. Pruning from Scratch. Towards Automatic Embryo Staging in 3D+T Microscopy Images using Convolutional …
A Simple Differentiable Programming Language
Title | A Simple Differentiable Programming Language |
Authors | Martin Abadi, Gordon D. Plotkin |
Abstract | Automatic differentiation plays a prominent role in scientific computing and in modern machine learning, often in the context of powerful programming systems. The relation of the various embodiments of automatic differentiation to the mathematical notion of derivative is not always entirely clear—discrepancies can arise, sometimes inadvertently. In order to study automatic differentiation in such programming contexts, we define a small but expressive programming language that includes a construct for reverse-mode differentiation. We give operational and denotational semantics for this language. The operational semantics employs popular implementation techniques, while the denotational semantics employs notions of differentiation familiar from real analysis. We establish that these semantics coincide. |
Tasks | |
Published | 2019-11-11 |
URL | https://arxiv.org/abs/1911.04523v4 |
PDF | https://arxiv.org/pdf/1911.04523v4.pdf |
PWC | https://paperswithcode.com/paper/a-simple-differentiable-programming-language |
Repo | |
Framework | |
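The paper's operational semantics formalizes the trace-based ("tape") technique common in reverse-mode AD implementations. As background, here is a minimal Python sketch of taped reverse mode, an illustration of the general technique rather than the paper's language or semantics: a global tape records operations in creation order, and a reverse sweep accumulates adjoints.

```python
TAPE = []  # records every Var in creation order (a topological order)

class Var:
    def __init__(self, value, parents=()):
        self.value = value      # primal value
        self.grad = 0.0         # adjoint, filled in by backward()
        self.parents = parents  # (parent, local_derivative) pairs
        TAPE.append(self)

    def __add__(self, other):
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

def backward(out):
    """Sweep the tape in reverse, accumulating adjoints into parents."""
    out.grad = 1.0
    for node in reversed(TAPE):
        for parent, local in node.parents:
            parent.grad += local * node.grad

x, y = Var(3.0), Var(2.0)
z = x * y + x            # z = x*y + x
backward(z)
print(x.grad, y.grad)    # 3.0 (= y + 1), 3.0 (= x)
```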
MixUp as Directional Adversarial Training
Title | MixUp as Directional Adversarial Training |
Authors | Guillaume P. Archambault, Yongyi Mao, Hongyu Guo, Richong Zhang |
Abstract | In this work, we explain the working mechanism of MixUp in terms of adversarial training. We introduce a new class of adversarial training schemes, which we refer to as directional adversarial training, or DAT. In a nutshell, a DAT scheme perturbs a training example in the direction of another example but keeps its original label as the training target. We prove that MixUp is equivalent to a special subclass of DAT, in that it has the same expected loss function and corresponds to the same optimization problem asymptotically. This understanding not only serves to explain the effectiveness of MixUp, but also reveals a more general family of MixUp schemes, which we call Untied MixUp. We prove that the family of Untied MixUp schemes is equivalent to the entire class of DAT schemes. We establish empirically the existence of Untied MixUp schemes that improve upon MixUp. |
Tasks | |
Published | 2019-06-17 |
URL | https://arxiv.org/abs/1906.06875v1 |
PDF | https://arxiv.org/pdf/1906.06875v1.pdf |
PWC | https://paperswithcode.com/paper/mixup-as-directional-adversarial-training |
Repo | |
Framework | |
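The distinction the abstract draws is easy to state in code. A hedged sketch contrasting standard MixUp with a DAT-style perturbation; the Beta(alpha, alpha) mixing distribution is the conventional MixUp choice, not the paper's exact policy, and Untied MixUp (not shown) additionally decouples the input weighting from the loss weighting.

```python
import numpy as np

rng = np.random.default_rng(0)

def mixup(x1, y1, x2, y2, alpha=1.0):
    """Standard MixUp: interpolate both the inputs and the labels."""
    lam = rng.beta(alpha, alpha)
    return lam * x1 + (1 - lam) * x2, lam * y1 + (1 - lam) * y2

def dat_perturb(x1, y1, x2, alpha=1.0):
    """DAT as described in the abstract: move x1 toward x2,
    but keep the original label y1 as the training target."""
    lam = rng.beta(alpha, alpha)
    x = x1 + (1 - lam) * (x2 - x1)   # = lam*x1 + (1-lam)*x2
    return x, y1                      # label is NOT mixed

# toy usage on one-hot labels
x1, x2 = rng.normal(size=8), rng.normal(size=8)
y1, y2 = np.eye(3)[0], np.eye(3)[1]
print(mixup(x1, y1, x2, y2)[1])      # soft mixed label
print(dat_perturb(x1, y1, x2)[1])    # hard original label
```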
TinBiNN: Tiny Binarized Neural Network Overlay in about 5,000 4-LUTs and 5mW
Title | TinBiNN: Tiny Binarized Neural Network Overlay in about 5,000 4-LUTs and 5mW |
Authors | Guy G. F. Lemieux, Joe Edwards, Joel Vandergriendt, Aaron Severance, Ryan De Iaco, Abdullah Raouf, Hussein Osman, Tom Watzka, Satwant Singh |
Abstract | Reduced-precision arithmetic improves the size, cost, power and performance of neural networks in digital logic. In convolutional neural networks, the use of 1b weights can achieve state-of-the-art error rates while eliminating multiplication, reducing storage and improving power efficiency. The BinaryConnect binary-weighted system, for example, achieves 9.9% error using floating-point activations on the CIFAR-10 dataset. In this paper, we introduce TinBiNN, a lightweight vector processor overlay for accelerating inference computations with 1b weights and 8b activations. The overlay is very small – it uses about 5,000 4-input LUTs and fits into a low cost iCE40 UltraPlus FPGA from Lattice Semiconductor. To show this can be useful, we build two embedded ‘person detector’ systems by shrinking the original BinaryConnect network. The first is a 10-category classifier with an 89% smaller network that runs in 1,315ms and achieves 13.6% error. The other is a 1-category classifier that is even smaller, runs in 195ms, and has only 0.4% error. In both classifiers, the error can be attributed entirely to training and not reduced precision. |
Tasks | |
Published | 2019-03-05 |
URL | http://arxiv.org/abs/1903.06630v1 |
PDF | http://arxiv.org/pdf/1903.06630v1.pdf |
PWC | https://paperswithcode.com/paper/tinbinn-tiny-binarized-neural-network-overlay |
Repo | |
Framework | |
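The key arithmetic claim, that 1b weights eliminate multiplication, is visible in a toy sketch: with weights in {-1, +1}, a dot product reduces to signed accumulation. This illustrates the principle only, not TinBiNN's vector-processor implementation.

```python
import numpy as np

def binary_dot(w_sign, a):
    """Dot product with 1b weights in {-1, +1} and 8b activations:
    every 'multiply' becomes an add or a subtract."""
    acc = 0
    for w, x in zip(w_sign, a):
        acc += x if w > 0 else -x
    return acc

rng = np.random.default_rng(0)
w = rng.choice([-1, 1], size=16)       # 1b weights
a = rng.integers(0, 256, size=16)      # 8b activations
assert binary_dot(w, a) == int(np.dot(w, a))
print(binary_dot(w, a))
```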
Removing Stripes, Scratches, and Curtaining with Non-Recoverable Compressed Sensing
Title | Removing Stripes, Scratches, and Curtaining with Non-Recoverable Compressed Sensing |
Authors | Jonathan Schwartz, Yi Jiang, Yongjie Wang, Anthony Aiello, Pallab Bhattacharya, Hui Yuan, Zetian Mi, Nabil Bassim, Robert Hovden |
Abstract | Highly-directional image artifacts such as ion mill curtaining, mechanical scratches, or image striping from beam instability degrade the interpretability of micrographs. These unwanted, aperiodic features extend the image along a primary direction and occupy a small wedge of information in Fourier space. Deleting this wedge of data replaces stripes, scratches, or curtaining with more complex streaking and blurring artifacts, known within the tomography community as missing wedge artifacts. Here, we overcome this problem by recovering the missing region using total variation minimization, which leverages image-sparsity-based reconstruction techniques, colloquially referred to as compressed sensing, to reliably restore images corrupted by stripe-like features. Our approach removes beam instability, ion mill curtaining, mechanical scratches, or any stripe features and remains robust at low signal-to-noise ratios. The success of this approach is achieved by exploiting compressed sensing's inability to recover directional structures that are highly localized and missing in Fourier space. |
Tasks | |
Published | 2019-01-23 |
URL | http://arxiv.org/abs/1901.08001v1 |
PDF | http://arxiv.org/pdf/1901.08001v1.pdf |
PWC | https://paperswithcode.com/paper/removing-stripes-scratches-and-curtaining |
Repo | |
Framework | |
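A hedged sketch of the pipeline the abstract describes: mask a narrow wedge of Fourier space along the stripe direction, then restore the missing region by total variation minimization while holding the measured Fourier coefficients fixed. The wedge angle, step size, and iteration count are illustrative choices, not the paper's.

```python
import numpy as np

def wedge_mask(shape, half_angle_deg=5.0):
    """Boolean mask deleting a narrow wedge of Fourier space around the
    vertical frequency axis, where horizontal stripes concentrate."""
    h, w = shape
    ky = np.abs(np.fft.fftfreq(h))[:, None]
    kx = np.abs(np.fft.fftfreq(w))[None, :]
    angle = np.degrees(np.arctan2(kx, ky + 1e-12))
    keep = angle > half_angle_deg
    keep[0, 0] = True                    # always keep the DC term
    return keep

def tv_grad(u, eps=1e-3):
    """Gradient of the smoothed total variation of u (periodic BCs)."""
    ux = np.roll(u, -1, axis=1) - u
    uy = np.roll(u, -1, axis=0) - u
    mag = np.sqrt(ux**2 + uy**2 + eps)
    px, py = ux / mag, uy / mag
    return (np.roll(px, 1, axis=1) - px) + (np.roll(py, 1, axis=0) - py)

def destripe(img, iters=200, step=0.2):
    """Delete the stripe wedge, then inpaint it by TV minimization,
    re-imposing the measured Fourier data after every step."""
    mask = wedge_mask(img.shape)
    F = np.fft.fft2(img) * mask
    u = np.fft.ifft2(F).real
    for _ in range(iters):
        u = u - step * tv_grad(u)
        U = np.fft.fft2(u)
        U[mask] = F[mask]                # known coefficients stay fixed
        u = np.fft.ifft2(U).real
    return u

img = np.zeros((64, 64))
img[16:48, 16:48] = 1.0                                  # object
img += 0.3 * np.sin(np.arange(64) * 2.0)[:, None]        # horizontal stripes
clean = destripe(img)
```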
Pruning from Scratch
Title | Pruning from Scratch |
Authors | Yulong Wang, Xiaolu Zhang, Lingxi Xie, Jun Zhou, Hang Su, Bo Zhang, Xiaolin Hu |
Abstract | Network pruning is an important research field aiming at reducing the computational costs of neural networks. Conventional approaches follow a fixed paradigm which first trains a large and redundant network and then determines which units (e.g., channels) are less important and thus can be removed. In this work, we find that pre-training an over-parameterized model is not necessary for obtaining the target pruned structure. In fact, a fully-trained over-parameterized model will reduce the search space for the pruned structure. We empirically show that more diverse pruned structures can be directly pruned from randomly initialized weights, including potential models with better performance. Therefore, we propose a novel network pruning pipeline which allows pruning from scratch. In experiments compressing classification models on the CIFAR-10 and ImageNet datasets, our approach not only greatly reduces the pre-training burden of traditional pruning methods but also achieves similar or even higher accuracy under the same computation budgets. Our results encourage the community to rethink the effectiveness of existing techniques used for network pruning. |
Tasks | Network Pruning |
Published | 2019-09-27 |
URL | https://arxiv.org/abs/1909.12579v1 |
PDF | https://arxiv.org/pdf/1909.12579v1.pdf |
PWC | https://paperswithcode.com/paper/pruning-from-scratch |
Repo | |
Framework | |
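A minimal PyTorch sketch of the pruning-from-scratch idea: freeze randomly initialized weights, learn per-channel gates under an L1 sparsity penalty, and read the pruned structure off the gates. The gate formulation, threshold, and penalty weight are assumptions for illustration; the paper's pipeline and selection criteria are its own.

```python
import torch
import torch.nn as nn

class GatedConv(nn.Module):
    """Conv layer with a learnable per-channel gate. The weights stay at
    their random initialization; only the gates are optimized."""
    def __init__(self, c_in, c_out):
        super().__init__()
        self.conv = nn.Conv2d(c_in, c_out, 3, padding=1)
        self.conv.weight.requires_grad_(False)   # frozen random weights
        self.conv.bias.requires_grad_(False)
        self.gate = nn.Parameter(torch.ones(c_out))

    def forward(self, x):
        return self.conv(x) * self.gate.view(1, -1, 1, 1)

net = nn.Sequential(GatedConv(3, 32), nn.ReLU(), GatedConv(32, 64))
opt = torch.optim.SGD([p for p in net.parameters() if p.requires_grad], lr=0.1)

x = torch.randn(8, 3, 16, 16)
task_loss = net(x).pow(2).mean()               # placeholder task loss
gates = [m.gate for m in net.modules() if isinstance(m, GatedConv)]
l1 = sum(g.abs().sum() for g in gates)         # sparsity pressure on gates
(task_loss + 1e-3 * l1).backward()
opt.step()

# After gate training, channels with large |gate| define the pruned
# structure, which is then trained from scratch.
keep = (gates[0].detach().abs() > 0.5).nonzero().squeeze(1)
```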
Towards Automatic Embryo Staging in 3D+T Microscopy Images using Convolutional Neural Networks and PointNets
Title | Towards Automatic Embryo Staging in 3D+T Microscopy Images using Convolutional Neural Networks and PointNets |
Authors | Manuel Traub, Johannes Stegmaier |
Abstract | Automatic analyses and comparisons of different stages of embryonic development largely depend on a highly accurate spatio-temporal alignment of the investigated data sets. In this contribution, we compare multiple approaches to perform automatic staging of developing embryos that were imaged with time-resolved 3D light-sheet microscopy. The methods comprise image-based convolutional neural networks as well as an approach based on the PointNet architecture that directly operates on 3D point clouds of detected cell nuclei centroids. The proof-of-concept experiments with four wild-type zebrafish embryos render both approaches suitable for automatic staging with average deviations of 0.45 - 0.57 hours. |
Tasks | |
Published | 2019-10-01 |
URL | https://arxiv.org/abs/1910.00443v1 |
PDF | https://arxiv.org/pdf/1910.00443v1.pdf |
PWC | https://paperswithcode.com/paper/towards-automatic-embryo-staging-in-3dt |
Repo | |
Framework | |
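Of the two approaches compared, the point-cloud one is the less conventional; a minimal sketch shows its core property, a shared per-point MLP followed by a permutation-invariant max-pool over nuclei centroids. Layer sizes and the regression head are illustrative choices, not the paper's architecture.

```python
import torch
import torch.nn as nn

class TinyPointNet(nn.Module):
    """Minimal PointNet-style regressor: a per-point MLP with shared
    weights, then a symmetric max-pool, mapping centroid clouds to a
    developmental stage. A sketch only; training details differ."""
    def __init__(self):
        super().__init__()
        self.point_mlp = nn.Sequential(
            nn.Linear(3, 64), nn.ReLU(),
            nn.Linear(64, 128), nn.ReLU())
        self.head = nn.Linear(128, 1)           # regress stage (hours)

    def forward(self, pts):                     # pts: (batch, n_points, 3)
        feats = self.point_mlp(pts)             # shared across points
        global_feat = feats.max(dim=1).values   # permutation-invariant
        return self.head(global_feat)

model = TinyPointNet()
centroids = torch.randn(2, 500, 3)              # detected nuclei positions
print(model(centroids).shape)                   # torch.Size([2, 1])
```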
Estimating Fingertip Forces, Torques, and Local Curvatures from Fingernail Images
Title | Estimating Fingertip Forces, Torques, and Local Curvatures from Fingernail Images |
Authors | Nutan Chen, Göran Westling, Benoni B. Edin, Patrick van der Smagt |
Abstract | The study of dexterous manipulation has provided important insights into human sensorimotor control as well as inspiration for manipulation strategies in robotic hands. Previous work focused on restricted experimental environments. Here we describe a method that uses the deformation and color distribution of the fingernail and its surrounding skin to estimate the fingertip forces, torques and contact surface curvatures for various objects, including the shape and material of the contact surfaces and the weight of the objects. The proposed method circumvents limitations associated with sensorized objects, gloves or fixed contact surface types. In addition, compared with previous single-finger estimation in an experimental environment, we extend the approach to multi-finger force estimation, which can be used for applications such as human grasping analysis. Four algorithms are used, namely Gaussian processes (GP), Convolutional Neural Networks (CNN), Neural Networks with Fast Dropout (NN-FD) and Recurrent Neural Networks with Fast Dropout (RNN-FD), to model a mapping from images to the corresponding labels. The results further show that the proposed method predicts force, torque and contact surface curvature with high accuracy. |
Tasks | |
Published | 2019-09-09 |
URL | https://arxiv.org/abs/1909.05659v1 |
PDF | https://arxiv.org/pdf/1909.05659v1.pdf |
PWC | https://paperswithcode.com/paper/estimating-fingertip-forces-torques-and-local |
Repo | |
Framework | |
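The GP variant of the image-to-force mapping is straightforward to set up; the sketch below uses scikit-learn with synthetic stand-in features and labels, since the paper's fingernail features, force labels, and GP configuration are not reproduced here.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Hypothetical stand-in data: feature vectors extracted from fingernail
# images, mapped to a fingertip force magnitude.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 16))                       # per-image features
force = X @ rng.normal(size=16) + 0.05 * rng.normal(size=200)

gp = GaussianProcessRegressor(kernel=RBF(length_scale=4.0), alpha=1e-2)
gp.fit(X[:150], force[:150])
pred, std = gp.predict(X[150:], return_std=True)     # mean + uncertainty
print(pred.shape, std.shape)                         # (50,), (50,)
```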
Neural Network Pruning with Residual-Connections and Limited-Data
Title | Neural Network Pruning with Residual-Connections and Limited-Data |
Authors | Jian-Hao Luo, Jianxin Wu |
Abstract | Filter level pruning is an effective method to accelerate the inference speed of deep CNN models. Although numerous pruning algorithms have been proposed, there are still two open issues. The first problem is how to prune residual connections. Most previous filter level pruning algorithms only prune channels inside residual blocks, leaving the number of output channels unchanged. We show that pruning both channels inside and outside the residual connections is crucial to achieve better performance. The second issue is pruning with limited data. We observe an interesting phenomenon: directly pruning on a small dataset is usually worse than fine-tuning a small model which is pruned or trained from scratch on the large dataset. In this paper, we propose a novel method, namely Compression Using Residual-connections and Limited-data (CURL), to tackle these two challenges. Experiments on a large-scale dataset demonstrate the effectiveness of CURL. CURL significantly outperforms previous state-of-the-art methods on ImageNet. More importantly, when pruning on small datasets, CURL achieves comparable or much better performance than fine-tuning a pretrained small model. |
Tasks | Network Pruning |
Published | 2019-11-19 |
URL | https://arxiv.org/abs/1911.08114v2 |
PDF | https://arxiv.org/pdf/1911.08114v2.pdf |
PWC | https://paperswithcode.com/paper/neural-network-pruning-with-residual |
Repo | |
Framework | |
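The residual-connection issue can be made concrete: pruning the output channels of a residual block only works if the identity path is sliced with the same kept-channel indices as the residual branch, otherwise the addition misaligns. A sketch with a hypothetical kept-channel set; CURL's actual channel-selection criterion is not shown.

```python
import torch
import torch.nn as nn

class PrunedResidualBlock(nn.Module):
    """Residual block pruned both inside (mid channels) and outside
    (output channels). The skip path is indexed with the same kept
    channels as the residual branch so the addition stays aligned."""
    def __init__(self, c_in, c_mid_kept, keep_out):
        super().__init__()
        self.keep_out = keep_out                         # kept output channels
        self.conv1 = nn.Conv2d(c_in, c_mid_kept, 3, padding=1)   # inside pruning
        self.conv2 = nn.Conv2d(c_mid_kept, len(keep_out), 3, padding=1)

    def forward(self, x):
        shortcut = x[:, self.keep_out]                   # prune identity path too
        return shortcut + self.conv2(torch.relu(self.conv1(x)))

keep = torch.tensor([0, 2, 5, 7])                        # hypothetical kept set
block = PrunedResidualBlock(c_in=8, c_mid_kept=4, keep_out=keep)
print(block(torch.randn(1, 8, 16, 16)).shape)            # (1, 4, 16, 16)
```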
Adaptive Gradient-Based Meta-Learning Methods
Title | Adaptive Gradient-Based Meta-Learning Methods |
Authors | Mikhail Khodak, Maria-Florina Balcan, Ameet Talwalkar |
Abstract | We build a theoretical framework for designing and understanding practical meta-learning methods that integrates sophisticated formalizations of task-similarity with the extensive literature on online convex optimization and sequential prediction algorithms. Our approach enables the task-similarity to be learned adaptively, provides sharper transfer-risk bounds in the setting of statistical learning-to-learn, and leads to straightforward derivations of average-case regret bounds for efficient algorithms in settings where the task-environment changes dynamically or the tasks share a certain geometric structure. We use our theory to modify several popular meta-learning algorithms and improve their meta-test-time performance on standard problems in few-shot learning and federated learning. |
Tasks | Few-Shot Learning, Meta-Learning |
Published | 2019-06-06 |
URL | https://arxiv.org/abs/1906.02717v3 |
PDF | https://arxiv.org/pdf/1906.02717v3.pdf |
PWC | https://paperswithcode.com/paper/adaptive-gradient-based-meta-learning-methods |
Repo | |
Framework | |
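As a toy illustration of the adaptivity the abstract claims (not the paper's actual updates, which are derived from online convex optimization), the sketch below learns an initialization plus per-coordinate step sizes from how far each coordinate travels within tasks; coordinates on which tasks differ more end up with larger steps.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sketch: meta-learn an initialization and adaptive per-coordinate
# step sizes across tasks. All update rules are illustrative stand-ins.
dim, n_tasks, inner_steps = 5, 200, 20
init = np.zeros(dim)                       # meta-learned initialization
disp = np.full(dim, 1e-8)                  # running squared travel per coord

for t in range(1, n_tasks + 1):
    # tasks differ strongly in the first three coords, barely in the rest
    target = rng.normal(scale=[1.0, 1.0, 1.0, 0.01, 0.01])
    eta = np.maximum(np.sqrt(disp / t), 0.05)   # adaptive per-coord step
    w = init.copy()
    for _ in range(inner_steps):
        w -= eta * (w - target)            # grad of 0.5*||w - target||^2
    disp += (w - init) ** 2
    init += (w - init) / t                 # running-mean meta-update

print(np.round(np.sqrt(disp / n_tasks), 2))   # larger steps where tasks vary
```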
UDFNet: Unsupervised Disparity Fusion with Adversarial Networks
Title | UDFNet: Unsupervised Disparity Fusion with Adversarial Networks |
Authors | Can Pu, Robert B. Fisher |
Abstract | Existing disparity fusion methods based on deep learning achieve state-of-the-art performance, but they require ground truth disparity data to train. To the best of our knowledge, this is the first unsupervised disparity fusion method that does not use ground truth disparity data. In this paper, a mathematical model for disparity fusion is proposed to guide an adversarial network to train effectively without ground truth disparity data. The initial disparity maps from the left view, together with auxiliary information (gradient, left and right intensity images), are input into the refiner, which is trained to output the refined disparity map registered on the left view. The refined left disparity map and left intensity image are used to reconstruct a fake right intensity image. Finally, the fake and real right intensity images (from the right stereo vision camera) are fed into the discriminator. In the model, the refiner is trained to output a refined disparity value close to the weighted sum of the disparity inputs for global initialisation. Then, three refinement principles are adopted to refine the results further. (1) The reconstructed intensity error between the fake and real right intensity image is minimised. (2) The similarities between the fake and real right image in different receptive fields are maximised. (3) The refined disparity map is smoothed based on the corresponding intensity image. The adversarial network architecture is effective for the fusion task, and the fusion time using the proposed network is small. The network achieves 90 fps using an Nvidia GeForce GTX 1080Ti on the KITTI 2015 dataset when the input resolution is 1242 × 375 (width × height) without downsampling and cropping. The accuracy of this work is equal to (or better than) the state-of-the-art supervised methods. |
Tasks | |
Published | 2019-04-22 |
URL | http://arxiv.org/abs/1904.10044v1 |
PDF | http://arxiv.org/pdf/1904.10044v1.pdf |
PWC | https://paperswithcode.com/paper/udfnet-unsupervised-disparity-fusion-with |
Repo | |
Framework | |
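The supervision signal without ground truth rests on view reconstruction: warp the left image by the refined disparity to synthesize a fake right image, then compare it photometrically with the real one. A nearest-neighbor forward-warping sketch (practical systems use differentiable bilinear sampling instead):

```python
import numpy as np

def warp_to_right(left, disparity):
    """Synthesize a fake right view by shifting each left-image pixel by
    its left-registered disparity. Collisions at image borders are
    resolved naively; this is a sketch, not a production warper."""
    h, w = left.shape
    right = np.zeros_like(left)
    xs = np.arange(w)
    for y in range(h):
        x_r = np.clip(np.round(xs - disparity[y]).astype(int), 0, w - 1)
        right[y, x_r] = left[y, xs]
    return right

left = np.tile(np.arange(8.0), (4, 1))
disp = np.full((4, 8), 2.0)                  # uniform 2-pixel disparity
fake_right = warp_to_right(left, disp)
# In training, a photometric loss |fake_right - real_right| supervises
# the refiner, so no ground truth disparity is needed.
print(fake_right)                            # columns shifted left by 2
```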
A Unified Framework for Tuning Hyperparameters in Clustering Problems
Title | A Unified Framework for Tuning Hyperparameters in Clustering Problems |
Authors | Xinjie Fan, Yuguang Yue, Purnamrita Sarkar, Y. X. Rachel Wang |
Abstract | Selecting hyperparameters for unsupervised learning problems is challenging in general due to the lack of ground truth for validation. Despite the prevalence of this issue in statistics and machine learning, especially in clustering problems, there are not many methods for tuning these hyperparameters with theoretical guarantees. In this paper, we provide a framework with provable guarantees for selecting hyperparameters in a number of distinct models. We consider both the subgaussian mixture model and network models to serve as examples of i.i.d. and non-i.i.d. data. We demonstrate that the same framework can be used to choose the Lagrange multipliers of penalty terms in semi-definite programming (SDP) relaxations for community detection, and the bandwidth parameter for constructing kernel similarity matrices for spectral clustering. By incorporating a cross-validation procedure, we show the framework can also do consistent model selection for network models. Using a variety of simulated and real data examples, we show that our framework outperforms other widely used tuning procedures in a broad range of parameter settings. |
Tasks | Community Detection, Model Selection |
Published | 2019-10-17 |
URL | https://arxiv.org/abs/1910.08018v2 |
PDF | https://arxiv.org/pdf/1910.08018v2.pdf |
PWC | https://paperswithcode.com/paper/a-unified-framework-for-tuning |
Repo | |
Framework | |
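The overall tuning loop for the bandwidth example has the shape below; note the selection criterion used here (silhouette score) is a common baseline placeholder, not the paper's provable goodness-of-fit statistic.

```python
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.metrics import silhouette_score

# Grid search over the kernel bandwidth for spectral clustering on a
# toy two-cluster dataset; swap in the framework's criterion for the
# placeholder score to reproduce the paper's procedure.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (50, 2)), rng.normal(2, 0.3, (50, 2))])

best = None
for bw in [0.05, 0.1, 0.5, 1.0, 5.0]:
    labels = SpectralClustering(n_clusters=2, affinity="rbf",
                                gamma=1.0 / (2 * bw**2),
                                random_state=0).fit_predict(X)
    score = silhouette_score(X, labels)      # placeholder criterion
    if best is None or score > best[0]:
        best = (score, bw)
print("selected bandwidth:", best[1])
```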
Fair Adversarial Gradient Tree Boosting
Title | Fair Adversarial Gradient Tree Boosting |
Authors | Vincent Grari, Boris Ruf, Sylvain Lamprier, Marcin Detyniecki |
Abstract | Fair classification has become an important topic in machine learning research. While most bias mitigation strategies focus on neural networks, we noticed a lack of work on fair classifiers based on decision trees even though they have proven very efficient. In an up-to-date comparison of state-of-the-art classification algorithms on tabular data, tree boosting outperforms deep learning. For this reason, we have developed a novel adversarial gradient tree boosting approach. The objective of the algorithm is to predict the output $Y$ with gradient tree boosting while minimizing the ability of an adversarial neural network to predict the sensitive attribute $S$. The approach incorporates at each iteration the gradient of the neural network directly into the gradient tree boosting. We empirically assess our approach on 4 popular data sets and compare it against state-of-the-art algorithms. The results show that our algorithm achieves higher accuracy while obtaining the same level of fairness, as measured using a set of common fairness definitions. |
Tasks | |
Published | 2019-11-13 |
URL | https://arxiv.org/abs/1911.05369v2 |
PDF | https://arxiv.org/pdf/1911.05369v2.pdf |
PWC | https://paperswithcode.com/paper/fair-adversarial-gradient-tree-boosting |
Repo | |
Framework | |
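The core update is boosting on a combined gradient: each round's trees fit the task-loss gradient minus $\lambda$ times the gradient of an adversary that predicts $S$ from the boosted score. The sketch below replaces the paper's adversarial neural network with a logistic adversary for brevity; data, $\lambda$, and the learning rate are illustrative.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.linear_model import LogisticRegression

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
n = 1000
S = rng.integers(0, 2, n)                       # sensitive attribute
X = rng.normal(size=(n, 5)) + 0.5 * S[:, None]  # features correlate with S
y = (X[:, 0] + rng.normal(scale=0.5, size=n) > 0).astype(float)

F = np.zeros(n)                                 # boosted score
lam, lr = 1.0, 0.1
for _ in range(50):
    p = sigmoid(F)
    adv = LogisticRegression().fit(F.reshape(-1, 1), S)   # adversary on score
    p_s = adv.predict_proba(F.reshape(-1, 1))[:, 1]
    grad_task = p - y                           # d(logloss)/dF
    grad_adv = (p_s - S) * adv.coef_[0, 0]      # d(adv logloss)/dF
    residual = -(grad_task - lam * grad_adv)    # lower task loss, raise adv loss
    tree = DecisionTreeRegressor(max_depth=3).fit(X, residual)
    F += lr * tree.predict(X)
print("accuracy:", ((sigmoid(F) > 0.5) == y).mean())
```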
Hard-Aware Fashion Attribute Classification
Title | Hard-Aware Fashion Attribute Classification |
Authors | Yun Ye, Yixin Li, Bo Wu, Wei Zhang, Lingyu Duan, Tao Mei |
Abstract | Fashion attribute classification is of great importance to many high-level tasks such as fashion item search, fashion trend analysis, fashion recommendation, etc. The task is challenging due to the extremely imbalanced data distribution, particularly for the attributes with only a few positive samples. In this paper, we introduce a hard-aware pipeline to make full use of “hard” samples/attributes. We first propose Hard-Aware BackPropagation (HABP) to efficiently and adaptively focus training on “hard” data. Then, for the identified hard labels, we propose to synthesize more complementary samples for training. To stabilize training, we extend the semi-supervised GAN by directly deactivating outputs for synthetic complementary samples (Deact). In general, our method is more effective in addressing “hard” cases: HABP weights “hard” samples more heavily, and for “hard” attributes with insufficient training data, Deact provides more stable synthetic samples and further improves performance. Our method is verified on a large-scale fashion dataset, outperforming other state-of-the-art methods without any additional supervision. |
Tasks | |
Published | 2019-07-25 |
URL | https://arxiv.org/abs/1907.10839v1 |
PDF | https://arxiv.org/pdf/1907.10839v1.pdf |
PWC | https://paperswithcode.com/paper/hard-aware-fashion-attribute-classification |
Repo | |
Framework | |
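A hedged sketch of the hard-aware weighting idea: scale each label's BCE term by its current difficulty, so harder samples/attributes dominate the gradient. The weighting function and `gamma` are illustrative assumptions; HABP's exact scheme is defined in the paper.

```python
import torch

def hard_aware_bce(logits, targets, gamma=2.0):
    """Weight per-sample, per-attribute BCE by current difficulty:
    larger loss implies a larger weight (hypothetical weighting)."""
    bce = torch.nn.functional.binary_cross_entropy_with_logits(
        logits, targets, reduction="none")
    weights = (bce.detach() / bce.detach().mean()).clamp(max=5.0) ** gamma
    return (weights * bce).mean()

logits = torch.randn(4, 10)                 # 10 binary fashion attributes
targets = torch.randint(0, 2, (4, 10)).float()
print(hard_aware_bce(logits, targets))
```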
Multi-Module System for Open Domain Chinese Question Answering over Knowledge Base
Title | Multi-Module System for Open Domain Chinese Question Answering over Knowledge Base |
Authors | Yiying Yang, Xiahui He, Kaijie Zhou, Zhongyu Wei |
Abstract | For the task of open domain Knowledge Based Question Answering in CCKS2019, we propose a method combining information retrieval and semantic parsing. This multi-module system extracts the topic entity and the most related relation predicate from a question and transforms the question into a SPARQL query statement. Our method obtained an F1 score of 70.45% on the test data. |
Tasks | Information Retrieval, Question Answering, Semantic Parsing |
Published | 2019-10-28 |
URL | https://arxiv.org/abs/1910.12477v1 |
PDF | https://arxiv.org/pdf/1910.12477v1.pdf |
PWC | https://paperswithcode.com/paper/multi-module-system-for-open-domain-chinese |
Repo | |
Framework | |
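The pipeline reduces to two trained modules feeding a query template; a stub sketch (entity linking and relation ranking replaced by placeholders) shows the final SPARQL-generation step.

```python
# Sketch of the pipeline in the abstract: extract a topic entity and the
# best-matching relation predicate, then emit a SPARQL query. The two
# extraction callables are stubs; the system uses trained modules.
def build_sparql(question, entity_linker, relation_ranker):
    entity = entity_linker(question)               # e.g. "<姚明>"
    predicate = relation_ranker(question, entity)  # e.g. "<身高>"
    return f"SELECT ?x WHERE {{ {entity} {predicate} ?x . }}"

q = "姚明的身高是多少？"  # "What is Yao Ming's height?"
print(build_sparql(q, lambda q: "<姚明>", lambda q, e: "<身高>"))
# SELECT ?x WHERE { <姚明> <身高> ?x . }
```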
Privacy Preserving Stochastic Channel-Based Federated Learning with Neural Network Pruning
Title | Privacy Preserving Stochastic Channel-Based Federated Learning with Neural Network Pruning |
Authors | Rulin Shao, Hui Liu, Dianbo Liu |
Abstract | Artificial neural networks have achieved unprecedented success in a wide variety of domains such as classifying, predicting and recognizing objects. This success depends on the availability of big data, since the training process requires massive and representative data sets. However, data collection is often prevented by privacy concerns, and people want to take control over their sensitive information during both training and inference. To address this problem, we propose a privacy-preserving method for distributed systems, Stochastic Channel-Based Federated Learning (SCBF), which enables the participants to train a high-performance model cooperatively without sharing their inputs. We design, implement and evaluate a channel-based update algorithm for the central server in a distributed system, which selects the channels corresponding to the most active features in a training loop and uploads them as learned information from local datasets. A pruning process based on the validation set is applied to the algorithm and serves as a model accelerator. In our experiments, the model matches the performance of, and saturates faster than, the Federated Averaging method, which reveals all parameters of local models to the server when updating. We also demonstrate that the convergence rate can be increased by introducing the pruning process. |
Tasks | Network Pruning |
Published | 2019-10-04 |
URL | https://arxiv.org/abs/1910.02115v1 |
PDF | https://arxiv.org/pdf/1910.02115v1.pdf |
PWC | https://paperswithcode.com/paper/privacy-preserving-stochastic-channel-based |
Repo | |
Framework | |
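A sketch of the channel-selection step described in the abstract: each client uploads only the k channels whose features were most active in the last training loop, instead of the full model. The activity measure used here (mean absolute activation) is an assumption for illustration; the paper defines its own criterion.

```python
import numpy as np

def select_active_channels(update, activations, k):
    """Return only the k most active channels of a local weight update,
    keyed by channel index, for upload to the federated server."""
    activity = np.abs(activations).mean(axis=(0, 2, 3))   # per-channel score
    top = np.argsort(activity)[-k:]                        # most active k
    return {int(c): update[c] for c in top}

rng = np.random.default_rng(0)
delta = rng.normal(size=(32, 16, 3, 3))      # local conv-weight update
acts = rng.normal(size=(8, 32, 14, 14))      # activations for 32 channels
upload = select_active_channels(delta, acts, k=8)
print(sorted(upload))                        # the 8 channel indices sent
```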