October 16, 2019

Paper Group ANR 1121

Fundamental Resource Trade-offs for Encoded Distributed Optimization


Title	Fundamental Resource Trade-offs for Encoded Distributed Optimization
Authors	A. Salman Avestimehr, Seyed Mohammadreza Mousavi Kalan, Mahdi Soltanolkotabi
Abstract	Dealing with the sheer size and complexity of today’s massive data sets requires computational platforms that can analyze data in a parallelized and distributed fashion. A major bottleneck that arises in such modern distributed computing environments is that some of the worker nodes may run slow. These nodes, a.k.a. stragglers, can significantly slow down computation as the slowest node may dictate the overall computational time. A recent computational framework, called encoded optimization, creates redundancy in the data to mitigate the effect of stragglers. In this paper we develop novel mathematical understanding for this framework demonstrating its effectiveness in much broader settings than was previously understood. We also analyze the convergence behavior of iterative encoded optimization algorithms, allowing us to characterize fundamental trade-offs between convergence rate, size of data set, accuracy, computational load (or data redundancy), and straggler toleration in this framework.
Tasks	Distributed Optimization
Published	2018-03-31
URL	http://arxiv.org/abs/1804.00217v2
PDF	http://arxiv.org/pdf/1804.00217v2.pdf
PWC	https://paperswithcode.com/paper/fundamental-resource-trade-offs-for-encoded
Repo
Framework
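
The idea of creating redundancy in the data so that slow workers can be ignored can be illustrated with a toy least-squares problem. The sketch below is not the paper's algorithm: the dense Gaussian encoding matrix, the round-robin sharding, the step size, and all function names are assumptions chosen for illustration. Each iteration uses the partial gradients of only the `wait_for` fastest workers, simulated here by a random choice.

```python
import math
import random

def encode(rows, S):
    """Return the matrix product S @ rows for list-of-lists matrices."""
    return [[sum(s_ij * r[c] for s_ij, r in zip(s_row, rows))
             for c in range(len(rows[0]))] for s_row in S]

def encoded_gradient_descent(X, y, n_workers=4, redundancy=2, wait_for=3,
                             n_iters=500, seed=0):
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    m = redundancy * n  # number of encoded rows (m > n gives redundancy)
    # Assumed encoding: a dense Gaussian matrix with more rows than data points.
    S = [[rng.gauss(0.0, 1.0) / math.sqrt(m) for _ in range(n)] for _ in range(m)]
    Xe = encode(X, S)
    ye = [row[0] for row in encode([[v] for v in y], S)]
    # Deal the encoded rows to the workers round-robin.
    shards = [list(range(k, m, n_workers)) for k in range(n_workers)]
    # Conservative step size: the trace bounds the largest eigenvalue of Xe^T Xe.
    step = 1.0 / sum(v * v for row in Xe for v in row)
    w = [0.0] * d
    for _ in range(n_iters):
        # Simulate stragglers: only `wait_for` workers respond in this round.
        fastest = rng.sample(range(n_workers), wait_for)
        grad = [0.0] * d
        for k in fastest:
            for i in shards[k]:
                resid = sum(a * b for a, b in zip(Xe[i], w)) - ye[i]
                for c in range(d):
                    grad[c] += resid * Xe[i][c]
        w = [wc - step * gc for wc, gc in zip(w, grad)]
    return w

# Noise-free toy data, so every subset of workers agrees on the minimizer.
X = [[math.cos(j), math.sin(j)] for j in range(8)]
w_true = [2.0, -1.0]
y = [sum(a * b for a, b in zip(row, w_true)) for row in X]
w_hat = encoded_gradient_descent(X, y)
```

Because the toy data are noise-free, the iterate approaches `w_true` even though a different worker is dropped in each round. With noisy data, the accuracy would depend on the encoding and the amount of redundancy, which is the kind of trade-off the abstract refers to.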

Learning Unsupervised Word Mapping by Maximizing Mean Discrepancy


Title	Learning Unsupervised Word Mapping by Maximizing Mean Discrepancy
Authors	Pengcheng Yang, Fuli Luo, Shuangzhi Wu, Jingjing Xu, Dongdong Zhang, Xu Sun
Abstract	Cross-lingual word embeddings aim to capture common linguistic regularities of different languages, which benefit various downstream tasks ranging from machine translation to transfer learning. Recently, it has been shown that these embeddings can be effectively learned by aligning two disjoint monolingual vector spaces through a linear transformation (word mapping). In this work, we focus on learning such a word mapping without any supervision signal. Most previous work on this task adopts parametric metrics to measure distribution differences, which typically requires a sophisticated alternate optimization process, either in the form of a minmax game or intermediate density estimation. This alternate optimization process is relatively hard and unstable. In order to avoid such sophisticated alternate optimization, we propose to learn unsupervised word mapping by directly maximizing the mean discrepancy between the distribution of transferred embedding and target embedding. Extensive experimental results show that our proposed model outperforms competitive baselines by a large margin.
Tasks	Density Estimation, Machine Translation, Transfer Learning, Word Embeddings
Published	2018-11-01
URL	http://arxiv.org/abs/1811.00275v1
PDF	http://arxiv.org/pdf/1811.00275v1.pdf
PWC	https://paperswithcode.com/paper/learning-unsupervised-word-mapping-by
Repo
Framework
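
To make the notion of a mean discrepancy between mapped source embeddings and target embeddings concrete, here is a minimal sketch of a kernel-based estimate. It assumes a Gaussian kernel with a fixed bandwidth and the biased (V-statistic) estimator of the squared maximum mean discrepancy; the function names and the toy 2-D "embeddings" are hypothetical, and the paper's actual objective and training procedure may differ.

```python
import math

def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))

def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of the squared maximum mean discrepancy between two samples."""
    def mean_kernel(a, b):
        total = sum(gaussian_kernel(u, v, bandwidth) for u in a for v in b)
        return total / (len(a) * len(b))
    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)

def apply_map(W, xs):
    """Apply the linear word mapping W to every vector in xs."""
    return [[sum(w * v for w, v in zip(row, x)) for row in W] for x in xs]

# Toy data: the target space is the source space rotated by 90 degrees.
source = [[1.0, 0.0], [0.0, 2.0], [-1.0, 1.0], [2.0, 2.0]]
rotation = [[0.0, -1.0], [1.0, 0.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
target = apply_map(rotation, source)

aligned = mmd_squared(apply_map(rotation, source), target)
unaligned = mmd_squared(apply_map(identity, source), target)
```

In this toy setting the correct mapping drives the discrepancy estimate to zero, while the identity mapping leaves it clearly positive, so the estimate can serve as a training signal for the mapping without a discriminator or an explicit density model.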

Improve the performance of transfer learning without fine-tuning using dissimilarity-based multi-view learning for breast cancer histology images


Title	Improve the performance of transfer learning without fine-tuning using dissimilarity-based multi-view learning for breast cancer histology images
Authors	Hongliu Cao, Simon Bernard, Laurent Heutte, Robert Sabourin
Abstract	Breast cancer is one of the most common types of cancer and one of the leading causes of cancer-related death for women. In the context of the ICIAR 2018 Grand Challenge on Breast Cancer Histology Images, we compare one handcrafted feature extractor and five transfer learning feature extractors based on deep learning. We find that the deep learning networks pretrained on ImageNet perform better than the popular handcrafted features used for breast cancer histology images. The best feature extractor achieves an average accuracy of 79.30%. To improve the classification performance, a random forest dissimilarity based integration method is used to combine different feature groups. When the five deep learning feature groups are combined, the average accuracy is improved to 82.90% (best accuracy 85.00%). When handcrafted features are combined with the five deep learning feature groups, the average accuracy is improved to 87.10% (best accuracy 93.00%).
Tasks	Multi-View Learning, Transfer Learning
Published	2018-03-29
URL	http://arxiv.org/abs/1803.11241v1
PDF	http://arxiv.org/pdf/1803.11241v1.pdf
PWC	https://paperswithcode.com/paper/improve-the-performance-of-transfer-learning
Repo
Framework

Generating Talking Face Landmarks from Speech


Title	Generating Talking Face Landmarks from Speech
Authors	Sefik Emre Eskimez, Ross K Maddox, Chenliang Xu, Zhiyao Duan
Abstract	The presence of a corresponding talking face has been shown to significantly improve speech intelligibility in noisy conditions and for the hearing-impaired population. In this paper, we present a system that can generate landmark points of a talking face from acoustic speech in real time. The system uses a long short-term memory (LSTM) network and is trained on frontal videos of 27 different speakers with automatically extracted face landmarks. After training, it can produce talking face landmarks from the acoustic speech of unseen speakers and utterances. The training phase contains three key steps. We first transform landmarks of the first video frame to pin the two eye points into two predefined locations and apply the same transformation on all of the following video frames. We then remove the identity information by transforming the landmarks into a mean face shape across the entire training dataset. Finally, we train an LSTM network that takes the first- and second-order temporal differences of the log-mel spectrogram as input to predict face landmarks in each frame. We evaluate our system using the mean-squared error (MSE) loss of landmarks of lips between predicted and ground-truth landmarks as well as their first- and second-order temporal differences. We further evaluate our system by conducting subjective tests, where the subjects try to distinguish the real and fake videos of talking face landmarks. Both tests show promising results.
Tasks
Published	2018-03-26
URL	http://arxiv.org/abs/1803.09803v2
PDF	http://arxiv.org/pdf/1803.09803v2.pdf
PWC	https://paperswithcode.com/paper/generating-talking-face-landmarks-from-speech
Repo
Framework
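
The network input described in the abstract, first- and second-order temporal differences of the log-mel spectrogram, can be sketched as follows. The exact difference formula and the boundary handling are not given in the abstract, so this sketch assumes simple frame-to-frame differences with a zero delta for the first frame; the function names are hypothetical.

```python
def temporal_difference(frames):
    """Frame-to-frame difference along time; the first frame gets a zero delta (assumed padding)."""
    first = [[0.0] * len(frames[0])]
    rest = [[cur - prev for prev, cur in zip(a, b)] for a, b in zip(frames, frames[1:])]
    return first + rest

def delta_features(log_mel):
    """Concatenate first- and second-order temporal differences for each frame."""
    d1 = temporal_difference(log_mel)
    d2 = temporal_difference(d1)
    return [f1 + f2 for f1, f2 in zip(d1, d2)]

# Three frames with two mel bands each (toy values, not real audio).
log_mel = [[0.0, 1.0], [1.0, 3.0], [3.0, 6.0]]
features = delta_features(log_mel)
# features == [[0.0, 0.0, 0.0, 0.0], [1.0, 2.0, 1.0, 2.0], [2.0, 3.0, 1.0, 1.0]]
```

Each output row has twice as many entries as there are mel bands and would be fed to the LSTM one frame at a time.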

Unsupervised Object Matching for Relational Data


Title	Unsupervised Object Matching for Relational Data
Authors	Tomoharu Iwata, Naonori Ueda
Abstract	We propose an unsupervised object matching method for relational data, which finds matchings between objects in different relational datasets without correspondence information. For example, the proposed method matches documents in different languages in multi-lingual document-word networks without dictionaries or alignment information. The proposed method assumes that each object has latent vectors, and the probability of neighbor objects is modeled by the inner-product of the latent vectors, where the neighbors are generated by short random walks over the relations. The latent vectors are estimated by maximizing the likelihood of the neighbors for each dataset. The estimated latent vectors contain hidden structural information of each object in the given relational dataset. Then, the proposed method linearly projects the latent vectors for all the datasets onto a common latent space shared across all datasets by matching the distributions while preserving the structural information. The projection matrix is estimated by minimizing the distance between the latent vector distributions with an orthogonality regularizer. To represent the distributions effectively, we use the kernel embedding of distributions, which holds high-order moment information about a distribution as an element in a reproducing kernel Hilbert space and enables us to calculate the distance between the distributions without density estimation. The structural information encoded in the latent vectors is preserved by using the orthogonality regularizer. We demonstrate the effectiveness of the proposed method with experiments using real-world multi-lingual document-word relational datasets and multiple user-item relational datasets.
Tasks	Density Estimation
Published	2018-10-09
URL	http://arxiv.org/abs/1810.03770v3
PDF	http://arxiv.org/pdf/1810.03770v3.pdf
PWC	https://paperswithcode.com/paper/unsupervised-object-matching-for-relational
Repo
Framework

Out-of-sample extension of graph adjacency spectral embedding


Title	Out-of-sample extension of graph adjacency spectral embedding
Authors	Keith Levin, Farbod Roosta-Khorasani, Michael W. Mahoney, Carey E. Priebe
Abstract	Many popular dimensionality reduction procedures have out-of-sample extensions, which allow a practitioner to apply a learned embedding to observations not seen in the initial training sample. In this work, we consider the problem of obtaining an out-of-sample extension for the adjacency spectral embedding, a procedure for embedding the vertices of a graph into Euclidean space. We present two different approaches to this problem, one based on a least-squares objective and the other based on a maximum-likelihood formulation. We show that if the graph of interest is drawn according to a certain latent position model called a random dot product graph, then both of these out-of-sample extensions estimate the true latent position of the out-of-sample vertex with the same error rate. Further, we prove a central limit theorem for the least-squares-based extension, showing that the estimate is asymptotically normal about the truth in the large-graph limit.
Tasks	Dimensionality Reduction
Published	2018-02-17
URL	http://arxiv.org/abs/1802.06307v1
PDF	http://arxiv.org/pdf/1802.06307v1.pdf
PWC	https://paperswithcode.com/paper/out-of-sample-extension-of-graph-adjacency
Repo
Framework
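
A minimal sketch of the least-squares approach mentioned in the abstract: given an in-sample embedding and the adjacency vector of a new vertex, choose the position whose inner products with the embedded vertices best match the observed edges. The sketch assumes the in-sample embedding has already been computed and is two-dimensional, so the normal equations can be solved in closed form; the function name and toy numbers are hypothetical.

```python
def out_of_sample_ls(X_hat, a):
    """Solve argmin_w sum_i (a_i - <x_i, w>)^2 for a 2-D embedding via the normal equations."""
    s11 = sum(x[0] * x[0] for x in X_hat)
    s12 = sum(x[0] * x[1] for x in X_hat)
    s22 = sum(x[1] * x[1] for x in X_hat)
    b1 = sum(x[0] * ai for x, ai in zip(X_hat, a))
    b2 = sum(x[1] * ai for x, ai in zip(X_hat, a))
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("embedding columns are linearly dependent")
    return [(s22 * b1 - s12 * b2) / det, (s11 * b2 - s12 * b1) / det]

# Toy in-sample latent positions (assumed given) and a new vertex at w_true.
X_hat = [[0.8, 0.1], [0.6, 0.3], [0.2, 0.7], [0.4, 0.4]]
w_true = [0.5, 0.5]
# Using expected edge probabilities instead of 0/1 edges makes recovery exact.
expected_edges = [x[0] * w_true[0] + x[1] * w_true[1] for x in X_hat]
w_hat = out_of_sample_ls(X_hat, expected_edges)
```

With an observed 0/1 adjacency vector in place of `expected_edges`, the same call returns a noisy estimate of the latent position; the error-rate and asymptotic-normality results in the abstract concern that noisy case.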

Expert System for Diagnosis of Chest Diseases Using Neural Networks


Title	Expert System for Diagnosis of Chest Diseases Using Neural Networks
Authors	Ismail Kayali
Abstract	This article represents one of the contemporary trends in applying the latest methods of information and communication technology to medicine: an expert system that helps the doctor diagnose some chest diseases. This is important because chest diseases are widespread nowadays and their symptoms overlap, which makes a correct diagnosis difficult for doctors. The system uses several algorithms: Forward Chaining, Backward Chaining, and a Neural Network (Back Propagation). This system cannot replace the doctor, but it can help the doctor avoid wrong diagnoses and treatments. It can also be developed further to help novice doctors.
Tasks
Published	2018-02-19
URL	http://arxiv.org/abs/1802.06866v1
PDF	http://arxiv.org/pdf/1802.06866v1.pdf
PWC	https://paperswithcode.com/paper/expert-system-for-diagnosis-of-chest-diseases
Repo
Framework

Deep Diffeomorphic Normalizing Flows


Title	Deep Diffeomorphic Normalizing Flows
Authors	Hadi Salman, Payman Yadollahpour, Tom Fletcher, Kayhan Batmanghelich
Abstract	The Normalizing Flow (NF) models a general probability density by estimating an invertible transformation applied on samples drawn from a known distribution. We introduce a new type of NF, called Deep Diffeomorphic Normalizing Flow (DDNF). A diffeomorphic flow is an invertible function where both the function and its inverse are smooth. We construct the flow using an ordinary differential equation (ODE) governed by a time-varying smooth vector field. We use a neural network to parametrize the smooth vector field and a recursive neural network (RNN) for approximating the solution of the ODE. Each cell in the RNN is a residual network implementing one Euler integration step. The architecture of our flow enables efficient likelihood evaluation, straightforward flow inversion, and results in highly flexible density estimation. An end-to-end trained DDNF achieves competitive results with state-of-the-art methods on a suite of density estimation and variational inference tasks. Finally, our method brings concepts from Riemannian geometry that, we believe, can open a new research direction for neural density estimation.
Tasks	Density Estimation
Published	2018-10-08
URL	http://arxiv.org/abs/1810.03256v2
PDF	http://arxiv.org/pdf/1810.03256v2.pdf
PWC	https://paperswithcode.com/paper/deep-diffeomorphic-normalizing-flows
Repo
Framework
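
The construction in the abstract, a flow obtained by integrating an ODE with Euler steps where each step has the residual form x + h·v(x, t), can be sketched in one dimension. The vector field below is a hand-written stand-in for the learned network, and the inversion simply runs the steps backwards with the sign flipped, which is only approximate for a finite step size; both are assumptions for illustration, not the paper's implementation.

```python
import math

def velocity(x, t):
    """Hypothetical smooth, time-varying vector field standing in for the learned network."""
    return (1.0 - 0.5 * t) * math.tanh(x)

def flow_forward(x, n_steps=100, t_end=1.0):
    """Integrate dx/dt = v(x, t) from t = 0 to t_end with explicit Euler steps."""
    h = t_end / n_steps
    for k in range(n_steps):
        x = x + h * velocity(x, k * h)  # one residual cell = one Euler step
    return x

def flow_inverse(y, n_steps=100, t_end=1.0):
    """Approximately undo flow_forward by stepping backwards through the same time grid."""
    h = t_end / n_steps
    for k in reversed(range(n_steps)):
        y = y - h * velocity(y, k * h)
    return y

z = 0.7
x = flow_forward(z)
z_back = flow_inverse(x)
```

The round trip `z_back` differs from `z` only by a discretization error that shrinks as `n_steps` grows, which illustrates why a smooth vector field and small steps make inversion straightforward.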

Accelerating Stochastic Gradient Descent Using Antithetic Sampling


Title	Accelerating Stochastic Gradient Descent Using Antithetic Sampling
Authors	Jingchang Liu, Linli Xu
Abstract	(Mini-batch) Stochastic Gradient Descent is a popular optimization method which has been applied to many machine learning applications. But the rather high variance introduced by the stochastic gradient in each step may slow down the convergence. In this paper, we propose the antithetic sampling strategy to reduce the variance by taking advantage of the internal structure in the dataset. Under this new strategy, stochastic gradients in a mini-batch are no longer independent but negatively correlated as much as possible, while the mini-batch stochastic gradient is still an unbiased estimator of the full gradient. For binary classification problems, we just need to calculate the antithetic samples in advance, and reuse the result in each iteration, which is practical. Experiments are provided to confirm the effectiveness of the proposed method.
Tasks
Published	2018-10-07
URL	http://arxiv.org/abs/1810.03124v1
PDF	http://arxiv.org/pdf/1810.03124v1.pdf
PWC	https://paperswithcode.com/paper/accelerating-stochastic-gradient-descent
Repo
Framework
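
The variance-reduction principle behind antithetic sampling, replacing independent samples with negatively correlated ones while keeping the estimator unbiased, can be seen in the classic Monte Carlo setting. This sketch estimates E[exp(U)] for U uniform on [0, 1]; it illustrates the general idea only and is not the paper's procedure for choosing antithetic samples within a mini-batch.

```python
import math
import random
import statistics

def pair_mean_independent(rng):
    """Average of exp(U) over two independent uniform draws."""
    return 0.5 * (math.exp(rng.random()) + math.exp(rng.random()))

def pair_mean_antithetic(rng):
    """Average of exp(U) over the negatively correlated pair (U, 1 - U)."""
    u = rng.random()
    return 0.5 * (math.exp(u) + math.exp(1.0 - u))

rng = random.Random(0)
independent = [pair_mean_independent(rng) for _ in range(5000)]
antithetic = [pair_mean_antithetic(rng) for _ in range(5000)]

var_independent = statistics.variance(independent)
var_antithetic = statistics.variance(antithetic)
mean_antithetic = statistics.mean(antithetic)  # both estimators target e - 1
```

Both estimators are unbiased for e − 1, but the antithetic pairs have a much smaller variance because exp is monotone, so the two terms in each pair err in opposite directions.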

Economics of Human-AI Ecosystem: Value Bias and Lost Utility in Multi-Dimensional Gaps


Title	Economics of Human-AI Ecosystem: Value Bias and Lost Utility in Multi-Dimensional Gaps
Authors	Daniel Muller
Abstract	In recent years, artificial intelligence (AI) decision-making and autonomous systems have become an integral part of the economy, industry, and society. The evolving economy of the human-AI ecosystem raises concerns regarding the risks and values inherent in AI systems. This paper investigates the dynamics of creation and exchange of values and points out gaps in the perception of cost-value, knowledge, space and time dimensions. It shows aspects of value bias in human perception of achievements and costs that are encoded in AI systems. It also proposes rethinking hard goal definitions and cost-optimal problem-solving principles through the lens of effectiveness and efficiency in the development of trusted machines. The paper suggests a value-driven, cost-aware strategy and principles for problem-solving and planning of effective research progress to address real-world problems that involve diverse forms of achievements, investments, and survival scenarios.
Tasks	Decision Making
Published	2018-11-15
URL	http://arxiv.org/abs/1811.06606v2
PDF	http://arxiv.org/pdf/1811.06606v2.pdf
PWC	https://paperswithcode.com/paper/economics-of-human-ai-ecosystem-value-bias
Repo
Framework

Improved Style Transfer by Respecting Inter-layer Correlations


Title	Improved Style Transfer by Respecting Inter-layer Correlations
Authors	Mao-Chuang Yeh, Shuai Tang
Abstract	A popular series of style transfer methods apply a style to a content image by controlling mean and covariance of values in early layers of a feature stack. This is insufficient for transferring styles that have strong structure across spatial scales, e.g., textures where dots lie on long curves. This paper demonstrates that controlling inter-layer correlations yields visible improvements in style transfer methods. We achieve this control by computing cross-layer, rather than within-layer, gram matrices. We find that (a) cross-layer gram matrices are sufficient to control within-layer statistics. Inter-layer correlations improve style transfer and texture synthesis. The paper shows numerous examples on “hard” real style transfer problems (e.g. long scale and hierarchical patterns); (b) a fast approximate style transfer method can control cross-layer gram matrices; (c) we demonstrate that multiplicative, rather than additive, style and content loss results in very good style transfer. Multiplicative loss produces a visible emphasis on boundaries, and means that one hyper-parameter can be eliminated.
Tasks	Style Transfer, Texture Synthesis
Published	2018-01-05
URL	http://arxiv.org/abs/1801.01933v1
PDF	http://arxiv.org/pdf/1801.01933v1.pdf
PWC	https://paperswithcode.com/paper/improved-style-transfer-by-respecting-inter
Repo
Framework
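
The difference between within-layer and cross-layer gram matrices can be shown with a small sketch. It assumes the feature maps of the two layers have already been brought to the same spatial size and flattened to channels × positions, and it normalizes by the number of positions; the resampling step, the normalization, and the function name are assumptions rather than details taken from the paper.

```python
def cross_layer_gram(feats_a, feats_b):
    """Gram matrix between two layers: entry (i, j) averages channel i of A times channel j of B."""
    n = len(feats_a[0])
    if any(len(row) != n for row in feats_a + feats_b):
        raise ValueError("both layers must have the same number of spatial positions")
    return [[sum(p * q for p, q in zip(row_a, row_b)) / n for row_b in feats_b]
            for row_a in feats_a]

# Toy features: layer A has 2 channels, layer B has 3 channels, 2 spatial positions each.
layer_a = [[1.0, 2.0], [0.0, 1.0]]
layer_b = [[1.0, 1.0], [2.0, 0.0], [0.0, 4.0]]

cross = cross_layer_gram(layer_a, layer_b)   # 2 x 3, correlations across layers
within = cross_layer_gram(layer_a, layer_a)  # 2 x 2, the usual within-layer gram matrix
# cross == [[1.5, 1.0, 4.0], [0.5, 0.0, 2.0]]
```

A style loss built on `cross` penalizes mismatches in how channels of one layer co-occur with channels of another, whereas `within` only captures co-occurrence inside a single layer.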

Discovering Context Specific Causal Relationships


Title	Discovering Context Specific Causal Relationships
Authors	Saisai Ma, Jiuyong Li, Lin Liu, Thuc Duy Le
Abstract	With the increasing need for personalised decision making, such as personalised medicine and online recommendations, growing attention has been paid to the discovery of the context and heterogeneity of causal relationships. Most existing methods, however, assume a known cause (e.g. a new drug) and focus on identifying from data the contexts of heterogeneous effects of the cause (e.g. patient groups with different responses to the new drug). There is no approach for efficiently detecting context specific causal relationships directly from observational data, i.e. discovering the causes and their contexts simultaneously. In this paper, by taking advantage of highly efficient decision tree induction and the well established causal inference framework, we propose the Tree based Context Causal rule discovery (TCC) method for efficient exploration of context specific causal relationships from data. Experiments with both synthetic and real world data sets show that TCC can effectively discover context specific causal rules from the data.
Tasks	Causal Inference, Decision Making, Efficient Exploration
Published	2018-08-20
URL	http://arxiv.org/abs/1808.06316v1
PDF	http://arxiv.org/pdf/1808.06316v1.pdf
PWC	https://paperswithcode.com/paper/discovering-context-specific-causal
Repo
Framework

Removing Malicious Nodes from Networks


Title	Removing Malicious Nodes from Networks
Authors	Sixie Yu, Yevgeniy Vorobeychik
Abstract	A fundamental challenge in networked systems is detection and removal of suspected malicious nodes. In reality, detection is always imperfect, and the decision about which potentially malicious nodes to remove must trade off false positives (erroneously removing benign nodes) and false negatives (mistakenly failing to remove malicious nodes). However, in network settings this conventional tradeoff must now account for node connectivity. In particular, malicious nodes may exert malicious influence, so that mistakenly leaving some of these in the network may cause damage to spread. On the other hand, removing benign nodes causes direct harm to these, and indirect harm to their benign neighbors who would wish to communicate with them. We formalize the problem of removing potentially malicious nodes from a network under uncertainty through an objective that takes connectivity into account. We show that optimally solving the resulting problem is NP-Hard. We then propose a tractable solution approach based on a convex relaxation of the objective. Finally, we experimentally demonstrate that our approach significantly outperforms both a simple baseline that ignores network structure, as well as a state-of-the-art approach for a related problem, on both synthetic and real-world datasets.
Tasks
Published	2018-12-30
URL	http://arxiv.org/abs/1812.11448v6
PDF	http://arxiv.org/pdf/1812.11448v6.pdf
PWC	https://paperswithcode.com/paper/removing-malicious-nodes-from-networks
Repo
Framework

Manifold Adversarial Learning


Title	Manifold Adversarial Learning
Authors	Shufei Zhang, Kaizhu Huang, Jianke Zhu, Yang Liu
Abstract	Recently proposed adversarial training methods show robustness to both adversarial and original examples and achieve state-of-the-art results in supervised and semi-supervised learning. All the existing adversarial training methods consider only how the worst perturbed examples (i.e., adversarial examples) could affect the model output. Despite their success, we argue that such a setting may lack generalization, since the output space (or label space) is apparently less informative. In this paper, we propose a novel method, called Manifold Adversarial Training (MAT). MAT manages to build an adversarial framework based on how the worst perturbation could affect the distributional manifold rather than the output space. Particularly, a latent data space with the Gaussian Mixture Model (GMM) will be first derived. On one hand, MAT tries to perturb the input samples in the way that would roughen the distributional manifold the most. On the other hand, the deep learning model is trained to promote manifold smoothness in the latent space, measured by the variation of Gaussian mixtures (given the local perturbation around the data point). Importantly, since the latent space is more informative than the output space, the proposed MAT can better learn a robust and compact data representation, leading to further performance improvement. The proposed MAT is important in that it can be considered as a superset of one recently-proposed discriminative feature learning approach called center loss. We conducted a series of experiments in both supervised and semi-supervised learning on three benchmark data sets, showing that the proposed MAT can achieve remarkable performance, much better than those of the state-of-the-art adversarial approaches. We also present a series of visualizations which could generate further understanding or explanation of adversarial examples.
Tasks
Published	2018-07-16
URL	https://arxiv.org/abs/1807.05832v2
PDF	https://arxiv.org/pdf/1807.05832v2.pdf
PWC	https://paperswithcode.com/paper/manifold-adversarial-learning
Repo
Framework

Dendritic-Inspired Processing Enables Bio-Plausible STDP in Compound Binary Synapses


Title	Dendritic-Inspired Processing Enables Bio-Plausible STDP in Compound Binary Synapses
Authors	Xinyu Wu, Vishal Saxena
Abstract	Brain-inspired learning mechanisms, e.g. spike timing dependent plasticity (STDP), enable agile and fast on-the-fly adaptation capability in a spiking neural network. When incorporating emerging nanoscale resistive non-volatile memory (NVM) devices, with ultra-low power consumption and high-density integration capability, spiking neural network hardware would achieve several orders of magnitude reduction in energy consumption at a very small form factor and potentially herald autonomous learning machines. However, actual memory devices have been shown to be intrinsically binary with stochastic switching, and thus impede the realization of ideal STDP with continuous analog values. In this work, a dendritic-inspired processing architecture is proposed in addition to novel CMOS neuron circuits. The utilization of spike attenuations and delays transforms the traditionally undesired stochastic behavior of binary NVMs into a useful leverage that enables biologically-plausible STDP learning. As a result, this work paves a pathway to adopt practical binary emerging NVM devices in brain-inspired neuromorphic computing.
Tasks
Published	2018-01-09
URL	http://arxiv.org/abs/1801.02797v1
PDF	http://arxiv.org/pdf/1801.02797v1.pdf
PWC	https://paperswithcode.com/paper/dendritic-inspired-processing-enables-bio
Repo
Framework