July 29, 2019

2808 words 14 mins read

Paper Group ANR 116

Limitations on Variance-Reduction and Acceleration Schemes for Finite Sum Optimization. Efficient Inferencing of Compressed Deep Neural Networks. Evidence Updating for Stream-Processing in Big-Data: Robust Conditioning in Soft and Hard Fusion Environments. A feasibility study for predicting optimal radiation therapy dose distributions of prostate c …

Limitations on Variance-Reduction and Acceleration Schemes for Finite Sum Optimization

Title Limitations on Variance-Reduction and Acceleration Schemes for Finite Sum Optimization
Authors Yossi Arjevani
Abstract We study the conditions under which one is able to efficiently apply variance-reduction and acceleration schemes on finite sum optimization problems. First, we show that, perhaps surprisingly, the finite sum structure by itself is not sufficient for obtaining a complexity bound of $\tilde{\mathcal{O}}((n+L/\mu)\ln(1/\epsilon))$ for $L$-smooth and $\mu$-strongly convex individual functions - one must also know which individual function is being referred to by the oracle at each iteration. Next, we show that for a broad class of first-order and coordinate-descent finite sum algorithms (including, e.g., SDCA, SVRG, SAG), it is not possible to get an 'accelerated' complexity bound of $\tilde{\mathcal{O}}((n+\sqrt{n L/\mu})\ln(1/\epsilon))$, unless the strong convexity parameter is given explicitly. Lastly, we show that when this class of algorithms is used for minimizing $L$-smooth and convex finite sums, the optimal complexity bound is $\tilde{\mathcal{O}}(n+L/\epsilon)$, assuming that (on average) the same update rule is used in every iteration, and $\tilde{\mathcal{O}}(n+\sqrt{nL/\epsilon})$, otherwise.
Tasks
Published 2017-06-06
URL http://arxiv.org/abs/1706.01686v2
PDF http://arxiv.org/pdf/1706.01686v2.pdf
PWC https://paperswithcode.com/paper/limitations-on-variance-reduction-and
Repo
Framework
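
The abstract's first result turns on whether the oracle reveals which individual function $f_i$ it queried. A minimal SVRG sketch on a toy least-squares finite sum makes that structure concrete (problem sizes, step size, and epoch counts below are illustrative choices, not from the paper):

```python
import numpy as np

# Toy finite sum: f(w) = (1/n) * sum_i 0.5 * (a_i @ w - b_i)^2
rng = np.random.default_rng(0)
n, d = 50, 5
A = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
b = A @ w_true

def grad_i(w, i):
    # Gradient of the i-th summand: a_i * (a_i @ w - b_i)
    return A[i] * (A[i] @ w - b[i])

def svrg(epochs=100, m=100, lr=0.01):
    w = np.zeros(d)
    for _ in range(epochs):
        w_snap = w.copy()
        full_grad = (A.T @ (A @ w_snap - b)) / n  # gradient at snapshot
        for _ in range(m):
            i = rng.integers(n)
            # Variance-reduced stochastic gradient: the update explicitly
            # uses the revealed index i -- the knowledge the paper shows
            # is necessary for the fast linear rate
            g = grad_i(w, i) - grad_i(w_snap, i) + full_grad
            w = w - lr * g
    return w

w_hat = svrg()
print(np.linalg.norm(w_hat - w_true))
```

Without the index $i$, the correction term `grad_i(w_snap, i)` could not be formed, which is the intuition behind the paper's lower bound.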

Efficient Inferencing of Compressed Deep Neural Networks

Title Efficient Inferencing of Compressed Deep Neural Networks
Authors Dharma Teja Vooturi, Saurabh Goyal, Anamitra R. Choudhury, Yogish Sabharwal, Ashish Verma
Abstract The large number of weights in deep neural networks makes the models difficult to deploy in low-memory environments such as mobile phones, IoT edge devices, and "inferencing as a service" environments in the cloud. Prior work has considered reducing the size of the models through compression techniques like pruning, quantization, and Huffman encoding. However, efficient inferencing using the compressed models has received little attention, especially with Huffman encoding in place. In this paper, we propose efficient parallel algorithms for inferencing on single images and batches under various memory constraints. Our experimental results show that our approach of using a variable batch size for inferencing achieves a 15-25% improvement in inference throughput for AlexNet, while maintaining memory and latency constraints.
Tasks Quantization
Published 2017-11-01
URL http://arxiv.org/abs/1711.00244v1
PDF http://arxiv.org/pdf/1711.00244v1.pdf
PWC https://paperswithcode.com/paper/efficient-inferencing-of-compressed-deep
Repo
Framework
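
To see why inferencing "with the Huffman encoding in place" is nontrivial, here is a stdlib-only sketch that Huffman-codes a quantized weight vector, decodes it, and runs a toy dot product; the codec is generic and stands in for the paper's parallel algorithms, which are not reproduced here:

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    # Build a Huffman code from symbol frequencies
    heap = [(f, i, (s,)) for i, (s, f) in enumerate(Counter(symbols).items())]
    heapq.heapify(heap)
    code = {s: "" for s in set(symbols)}
    i = len(heap)
    while len(heap) > 1:
        f1, _, g1 = heapq.heappop(heap)
        f2, _, g2 = heapq.heappop(heap)
        for s in g1: code[s] = "0" + code[s]
        for s in g2: code[s] = "1" + code[s]
        heapq.heappush(heap, (f1 + f2, i, g1 + g2)); i += 1
    return code

def decode(bits, inv):
    # Sequentially match prefix-free codewords
    out, buf = [], ""
    for bch in bits:
        buf += bch
        if buf in inv:
            out.append(inv[buf]); buf = ""
    return out

weights = [0.5, 0.5, -0.25, 0.5, 0.0, -0.25]  # toy quantized values
code = huffman_code(weights)
bits = "".join(code[w] for w in weights)
inv = {v: k for k, v in code.items()}
decoded = decode(bits, inv)

x = [1.0] * len(weights)
out = sum(w * xi for w, xi in zip(decoded, x))  # toy "inference"
print(out)
```

The sequential decode loop is exactly what makes parallel inference over Huffman-coded weights a design problem: codeword boundaries are data-dependent.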

Evidence Updating for Stream-Processing in Big-Data: Robust Conditioning in Soft and Hard Fusion Environments

Title Evidence Updating for Stream-Processing in Big-Data: Robust Conditioning in Soft and Hard Fusion Environments
Authors Thanuka Wickramarathne
Abstract Robust belief revision methods are crucial in streaming data situations for updating existing knowledge or beliefs with new incoming evidence. Bayes conditioning is the primary mechanism in use for belief revision in data fusion systems that use probabilistic inference. However, traditional conditioning methods face several challenges due to inherent data/source imperfections in big-data environments that harness soft (i.e., human or human-based) sources in addition to hard (i.e., physics-based) sensors. The objective of this paper is to investigate the most natural extension of Bayes conditioning that is suitable for evidence updating in the presence of such uncertainties. By viewing the evidence updating process as a thought experiment, an elegant strategy is derived for robust evidence updating in the presence of extreme uncertainties that are characteristic of big-data environments. In particular, utilizing the Fagin-Halpern conditional notions, a natural extension to Bayes conditioning is derived for evidence that takes the form of a general belief function. The presented work differs fundamentally from the Conditional Update Equation (CUE) and the authors' own extensions of it. An overview of this development is provided via illustrative examples. Furthermore, insights into parameter selection under various fusion contexts are also provided.
Tasks
Published 2017-03-20
URL http://arxiv.org/abs/1703.06565v2
PDF http://arxiv.org/pdf/1703.06565v2.pdf
PWC https://paperswithcode.com/paper/evidence-updating-for-stream-processing-in
Repo
Framework
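
As background for the Fagin-Halpern conditional notions the abstract invokes, here is a small sketch computing the standard Fagin-Halpern lower/upper conditional probabilities from a mass assignment over a toy frame (the masses and events are made up for illustration):

```python
FRAME = frozenset({"a", "b", "c"})
mass = {frozenset({"a"}): 0.4,
        frozenset({"b"}): 0.3,
        frozenset({"a", "b", "c"}): 0.3}  # mass on the whole frame

def bel(A):
    # Belief: total mass of focal sets contained in A
    return sum(m for S, m in mass.items() if S <= A)

def pl(A):
    # Plausibility: total mass of focal sets intersecting A
    return sum(m for S, m in mass.items() if S & A)

def fh_conditional(A, B):
    # Fagin-Halpern conditional bounds:
    #   Bel(A|B) = Bel(A&B) / (Bel(A&B) + Pl(Ac&B))
    #   Pl(A|B)  = Pl(A&B)  / (Pl(A&B)  + Bel(Ac&B))
    Ac = FRAME - A
    lower = bel(A & B) / (bel(A & B) + pl(Ac & B))
    upper = pl(A & B) / (pl(A & B) + bel(Ac & B))
    return lower, upper

lo_p, up_p = fh_conditional(frozenset({"a"}), frozenset({"a", "b"}))
print(lo_p, up_p)
```

When all mass sits on singletons this collapses to ordinary Bayes conditioning, which is why it is a natural extension.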

A feasibility study for predicting optimal radiation therapy dose distributions of prostate cancer patients from patient anatomy using deep learning

Title A feasibility study for predicting optimal radiation therapy dose distributions of prostate cancer patients from patient anatomy using deep learning
Authors Dan Nguyen, Troy Long, Xun Jia, Weiguo Lu, Xuejun Gu, Zohaib Iqbal, Steve Jiang
Abstract With the advancement of treatment modalities in radiation therapy for cancer patients, outcomes have improved, but at the cost of increased treatment plan complexity and planning time. The accurate prediction of dose distributions would alleviate this issue by guiding clinical plan optimization to save time and maintain high quality plans. We have modified a convolutional deep network model, U-net (originally designed for segmentation purposes), for predicting dose from patient image contours of the planning target volume (PTV) and organs at risk (OAR). We show that, as an example, we are able to accurately predict the dose of intensity-modulated radiation therapy (IMRT) for prostate cancer patients, where the average Dice similarity coefficient is 0.91 when comparing the predicted vs. true isodose volumes between 0% and 100% of the prescription dose. The average value of the absolute differences in [max, mean] dose is found to be under 5% of the prescription dose; the per-structure [max, mean] differences are [1.80%, 1.03%], [1.94%, 4.22%], [1.80%, 0.48%], [3.87%, 1.79%] (L Femoral Head), [5.07%, 2.55%] (R Femoral Head), and [1.26%, 1.62%] of the prescription dose. We thus managed to map a desired radiation dose distribution from a patient's PTV and OAR contours. As an additional advantage, relatively little data was used in the techniques and models described in this paper.
Tasks
Published 2017-09-26
URL http://arxiv.org/abs/1709.09233v4
PDF http://arxiv.org/pdf/1709.09233v4.pdf
PWC https://paperswithcode.com/paper/a-feasibility-study-for-predicting-optimal
Repo
Framework
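
The Dice similarity coefficient quoted in the abstract is straightforward to compute; a minimal version on toy binary masks (the masks here are illustrative, not dose data):

```python
import numpy as np

def dice(pred, true):
    # Dice = 2|P & T| / (|P| + |T|) for binary volumes
    pred, true = pred.astype(bool), true.astype(bool)
    inter = np.logical_and(pred, true).sum()
    denom = pred.sum() + true.sum()
    return 2.0 * inter / denom if denom else 1.0

true_mask = np.zeros((8, 8), dtype=bool); true_mask[2:6, 2:6] = True
pred_mask = np.zeros((8, 8), dtype=bool); pred_mask[3:7, 2:6] = True
print(dice(pred_mask, true_mask))  # overlap 12 of 16+16 -> 0.75
```

In the paper this is applied to isodose volumes, i.e., the mask of voxels at or above a given fraction of the prescription dose.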

Predicting Causes of Reformulation in Intelligent Assistants

Title Predicting Causes of Reformulation in Intelligent Assistants
Authors Shumpei Sano, Nobuhiro Kaji, Manabu Sassano
Abstract Intelligent assistants (IAs) such as Siri and Cortana conversationally interact with users and execute a wide range of actions (e.g., searching the Web, setting alarms, and chatting). IAs can support these actions through the combination of various components such as automatic speech recognition, natural language understanding, and language generation. However, the complexity of these components hinders developers from determining which component causes an error. To remove this hindrance, we focus on reformulation, which is a useful signal of user dissatisfaction, and propose a method to predict the reformulation causes. We evaluate the method using the user logs of a commercial IA. The experimental results have demonstrated that features designed to detect the error of a specific component improve the performance of reformulation cause detection.
Tasks Speech Recognition, Text Generation
Published 2017-07-13
URL http://arxiv.org/abs/1707.03968v1
PDF http://arxiv.org/pdf/1707.03968v1.pdf
PWC https://paperswithcode.com/paper/predicting-causes-of-reformulation-in
Repo
Framework

Feature overwriting as a finite mixture process: Evidence from comprehension data

Title Feature overwriting as a finite mixture process: Evidence from comprehension data
Authors Shravan Vasishth, Lena A. Jäger, Bruno Nicenboim
Abstract The ungrammatical sentence “The key to the cabinets are on the table” is known to lead to an illusion of grammaticality. As discussed in the meta-analysis by Jaeger et al., 2017, faster reading times are observed at the verb are in the agreement-attraction sentence above compared to the equally ungrammatical sentence “The key to the cabinet are on the table”. One explanation for this facilitation effect is the feature percolation account: the plural feature on cabinets percolates up to the head noun key, leading to the illusion. An alternative account is in terms of cue-based retrieval (Lewis & Vasishth, 2005), which assumes that the non-subject noun cabinets is misretrieved due to a partial feature-match when a dependency completion process at the auxiliary initiates a memory access for a subject with plural marking. We present evidence for yet another explanation for the observed facilitation. Because the second sentence has two nouns with identical number, it is possible that these are, in some proportion of trials, more difficult to keep distinct, leading to slower reading times at the verb in the first sentence above; this is the feature overwriting account of Nairne, 1990. We show that the feature overwriting proposal can be implemented as a finite mixture process. We reanalysed ten published data-sets, fitting hierarchical Bayesian mixture models to these data assuming a two-mixture distribution. We show that in nine out of the ten studies, a mixture distribution corresponding to feature overwriting furnishes a superior fit over both the feature percolation and the cue-based retrieval accounts.
Tasks
Published 2017-03-12
URL http://arxiv.org/abs/1703.04081v2
PDF http://arxiv.org/pdf/1703.04081v2.pdf
PWC https://paperswithcode.com/paper/feature-overwriting-as-a-finite-mixture
Repo
Framework
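
The feature-overwriting account is implemented as a two-component mixture over reading times. A hedged sketch of the likelihood such a model evaluates, using a lognormal mixture with made-up parameters and data (the paper fits hierarchical Bayesian versions, not this toy):

```python
import math

def lognorm_pdf(x, mu, sigma):
    # Lognormal density, a common choice for reading-time data
    return math.exp(-(math.log(x) - mu) ** 2 / (2 * sigma ** 2)) / (
        x * sigma * math.sqrt(2 * math.pi))

def mixture_loglik(times, theta, mu1, mu2, sigma):
    # theta: mixing probability of the slowed component (mu2),
    # i.e., trials where features were overwritten
    ll = 0.0
    for t in times:
        ll += math.log((1 - theta) * lognorm_pdf(t, mu1, sigma)
                       + theta * lognorm_pdf(t, mu2, sigma))
    return ll

times = [320, 350, 410, 460, 700, 820]  # toy reading times (ms)
# A mixture with a slowed second component should fit the long tail
# better than the single-component model (theta = 0)
ll_mix = mixture_loglik(times, 0.3, math.log(380), math.log(750), 0.2)
ll_one = mixture_loglik(times, 0.0, math.log(380), math.log(750), 0.2)
print(ll_mix > ll_one)
```

Model comparison in the paper proceeds along these lines: if the mixture component is doing real work, it furnishes a superior fit.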

Deep Epitome for Unravelling Generalized Hamming Network: A Fuzzy Logic Interpretation of Deep Learning

Title Deep Epitome for Unravelling Generalized Hamming Network: A Fuzzy Logic Interpretation of Deep Learning
Authors Lixin Fan
Abstract This paper gives a rigorous analysis of trained Generalized Hamming Networks (GHNs) proposed by Fan (2017) and discloses an interesting finding about GHNs: stacked convolution layers in a GHN are equivalent to a single yet wide convolution layer. The revealed equivalence, on the theoretical side, can be regarded as a constructive manifestation of the universal approximation theorem of Cybenko (1989) and Hornik (1991). In practice, it has profound and multi-fold implications. For network visualization, the constructed deep epitomes at each layer provide a visualization of the network's internal representation that does not rely on the input data. Moreover, deep epitomes allow the direct extraction of features in just one step, without resorting to the regularized optimizations used in existing visualization tools.
Tasks
Published 2017-11-15
URL http://arxiv.org/abs/1711.05397v1
PDF http://arxiv.org/pdf/1711.05397v1.pdf
PWC https://paperswithcode.com/paper/deep-epitome-for-unravelling-generalized
Repo
Framework
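
The collapse of stacked convolution layers into one wide layer echoes a basic fact about linear convolution: composing two kernels is the same as convolving once with their convolution. A quick numpy check of that underlying identity (this ignores GHN-specific bias and nonlinearity handling, which the paper treats):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=32)   # toy 1-D signal
k1 = rng.normal(size=3)   # first "layer" kernel
k2 = rng.normal(size=5)   # second "layer" kernel

# Two stacked linear conv layers...
two_layers = np.convolve(np.convolve(x, k1), k2)
# ...equal one layer with the composed ("epitome"-like) kernel
one_layer = np.convolve(x, np.convolve(k1, k2))
print(np.allclose(two_layers, one_layer))  # True
```

The paper's contribution is showing that this composition survives the generalized-hamming structure, yielding input-independent deep epitomes per layer.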

Learning to Partition using Score Based Compatibilities

Title Learning to Partition using Score Based Compatibilities
Authors Arun Rajkumar, Koyel Mukherjee, Theja Tulabandhula
Abstract We study the problem of learning to partition users into groups, where one must learn the compatibilities between the users to achieve optimal groupings. We define four natural objectives that optimize for average and worst case compatibilities and propose new algorithms for adaptively learning optimal groupings. When we do not impose any structure on the compatibilities, we show that the group formation objectives considered are $NP$ hard to solve and we either give approximation guarantees or prove inapproximability results. We then introduce an elegant structure, namely that of \textit{intrinsic scores}, that makes many of these problems polynomial time solvable. We explicitly characterize the optimal groupings under this structure and show that the optimal solutions are related to \emph{homophilous} and \emph{heterophilous} partitions, well-studied in the psychology literature. For one of the four objectives, we show $NP$ hardness under the score structure and give a $\frac{1}{2}$ approximation algorithm for which no constant approximation was known thus far. Finally, under the score structure, we propose an online low sample complexity PAC algorithm for learning the optimal partition. We demonstrate the efficacy of the proposed algorithm on synthetic and real world datasets.
Tasks
Published 2017-03-22
URL http://arxiv.org/abs/1703.07807v1
PDF http://arxiv.org/pdf/1703.07807v1.pdf
PWC https://paperswithcode.com/paper/learning-to-partition-using-score-based
Repo
Framework
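
To make the homophilous/heterophilous distinction concrete, here is a hedged sketch assuming a pairwise compatibility derived from intrinsic scores as c(i, j) = s_i * s_j (this particular form and the numbers are illustrative choices, not the paper's definition), comparing two partitions on average compatibility:

```python
from itertools import combinations

scores = [0.9, 0.8, 0.2, 0.1]  # toy per-user intrinsic scores

def avg_compat(group):
    # Average pairwise compatibility within one group
    pairs = list(combinations(group, 2))
    return sum(scores[i] * scores[j] for i, j in pairs) / len(pairs)

def partition_value(partition):
    # Average of per-group average compatibilities
    return sum(avg_compat(g) for g in partition) / len(partition)

homophilous = [(0, 1), (2, 3)]    # similar scores grouped together
heterophilous = [(0, 3), (1, 2)]  # high scores paired with low
print(partition_value(homophilous), partition_value(heterophilous))
```

Under a multiplicative compatibility like this, the homophilous split scores higher on the average objective, which is the flavor of characterization the paper makes precise for its four objectives.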

A Hybrid Data Association Framework for Robust Online Multi-Object Tracking

Title A Hybrid Data Association Framework for Robust Online Multi-Object Tracking
Authors Min Yang, Yuwei Wu, Yunde Jia
Abstract Global optimization algorithms have shown impressive performance in data-association based multi-object tracking, but handling online data remains a difficult hurdle to overcome. In this paper, we present a hybrid data association framework with a min-cost multi-commodity network flow for robust online multi-object tracking. We build local target-specific models interleaved with global optimization of the optimal data association over multiple video frames. More specifically, in the min-cost multi-commodity network flow, the target-specific similarities are online learned to enforce the local consistency for reducing the complexity of the global data association. Meanwhile, the global data association taking multiple video frames into account alleviates irrecoverable errors caused by the local data association between adjacent frames. To ensure the efficiency of online tracking, we give an efficient near-optimal solution to the proposed min-cost multi-commodity flow problem, and provide the empirical proof of its sub-optimality. The comprehensive experiments on real data demonstrate the superior tracking performance of our approach in various challenging situations.
Tasks Multi-Object Tracking, Object Tracking, Online Multi-Object Tracking
Published 2017-03-31
URL http://arxiv.org/abs/1703.10764v1
PDF http://arxiv.org/pdf/1703.10764v1.pdf
PWC https://paperswithcode.com/paper/a-hybrid-data-association-framework-for
Repo
Framework
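
The local data association between adjacent frames that the framework interleaves with global flow optimization can be pictured as a small assignment problem. A stdlib-only sketch with toy costs (brute-force enumeration stands in for a real min-cost solver; the costs are invented appearance/motion distances, not the paper's learned similarities):

```python
from itertools import permutations

cost = [
    [0.1, 0.9, 0.8],   # track 0 vs detections 0..2
    [0.7, 0.2, 0.9],   # track 1
    [0.8, 0.9, 0.3],   # track 2
]

def best_assignment(cost):
    # Exhaustively search one-to-one track-detection assignments
    n = len(cost)
    best, best_perm = float("inf"), None
    for perm in permutations(range(n)):
        c = sum(cost[i][perm[i]] for i in range(n))
        if c < best:
            best, best_perm = c, perm
    return best_perm, best

perm, total = best_assignment(cost)
print(perm, total)  # identity assignment is cheapest here
```

The paper's point is that such purely local, frame-pair decisions can lock in irrecoverable errors, which is why it couples them with global multi-frame optimization.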

Stance detection in online discussions

Title Stance detection in online discussions
Authors Peter Krejzl, Barbora Hourová, Josef Steinberger
Abstract This paper describes our system created to detect stance in online discussions. The goal is to identify whether the author of a comment is in favor of the given target or against. Our approach is based on a maximum entropy classifier, which uses surface-level, sentiment and domain-specific features. The system was originally developed to detect stance in English tweets. We adapted it to process Czech news commentaries.
Tasks Stance Detection
Published 2017-01-02
URL http://arxiv.org/abs/1701.00504v1
PDF http://arxiv.org/pdf/1701.00504v1.pdf
PWC https://paperswithcode.com/paper/stance-detection-in-online-discussions
Repo
Framework
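
A maximum entropy classifier over binary features is equivalent to logistic regression; a tiny hand-rolled sketch on made-up stance features (the feature values and data are illustrative, not the paper's surface/sentiment/domain features):

```python
import math

# Each row: binary feature vector; label 1 = in favor, 0 = against
X = [[1, 0, 1], [1, 1, 1], [0, 1, 0], [0, 0, 0]]
y = [1, 1, 0, 0]

w = [0.0, 0.0, 0.0]
b = 0.0
lr = 0.5
for _ in range(500):
    for xi, yi in zip(X, y):
        # Logistic (maxent) probability of "in favor"
        z = sum(wj * xj for wj, xj in zip(w, xi)) + b
        p = 1.0 / (1.0 + math.exp(-z))
        # Gradient ascent on the log-likelihood
        for j in range(3):
            w[j] += lr * (yi - p) * xi[j]
        b += lr * (yi - p)

def predict(xi):
    z = sum(wj * xj for wj, xj in zip(w, xi)) + b
    return 1 if z > 0 else 0

print([predict(xi) for xi in X])  # separable toy data -> [1, 1, 0, 0]
```

Real systems like the one described add regularization and far richer feature templates, but the decision rule has this shape.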

Joining Hands: Exploiting Monolingual Treebanks for Parsing of Code-mixing Data

Title Joining Hands: Exploiting Monolingual Treebanks for Parsing of Code-mixing Data
Authors Irshad Ahmad Bhat, Riyaz Ahmad Bhat, Manish Shrivastava, Dipti Misra Sharma
Abstract In this paper, we propose efficient and less resource-intensive strategies for parsing of code-mixed data. These strategies are not constrained by in-domain annotations; rather, they leverage pre-existing monolingual annotated resources for training. We show that these methods can produce significantly better results compared to an informed baseline. We also present a data set of 450 Hindi and English code-mixed tweets of Hindi multilingual speakers for evaluation. The data set is manually annotated with Universal Dependencies.
Tasks
Published 2017-03-31
URL http://arxiv.org/abs/1703.10772v1
PDF http://arxiv.org/pdf/1703.10772v1.pdf
PWC https://paperswithcode.com/paper/joining-hands-exploiting-monolingual
Repo
Framework

Informational Neurobayesian Approach to Neural Networks Training. Opportunities and Prospects

Title Informational Neurobayesian Approach to Neural Networks Training. Opportunities and Prospects
Authors Artem Artemov, Eugeny Lutsenko, Edward Ayunts, Ivan Bolokhov
Abstract A study of the classification problem in the context of information theory is presented in the paper. Current research in that field is focused on optimisation and the Bayesian approach. Although these give satisfying results, they require a vast amount of data and computation to train on. The authors propose a new concept named the Informational Neurobayesian Approach (INA), which allows solving the same problems while requiring significantly less training data and computational power. Experiments were conducted to compare its performance with the traditional approach, and the results showed that the capacity of the INA is quite promising.
Tasks
Published 2017-10-19
URL http://arxiv.org/abs/1710.07264v3
PDF http://arxiv.org/pdf/1710.07264v3.pdf
PWC https://paperswithcode.com/paper/informational-neurobayesian-approach-to
Repo
Framework

A Deterministic Nonsmooth Frank Wolfe Algorithm with Coreset Guarantees

Title A Deterministic Nonsmooth Frank Wolfe Algorithm with Coreset Guarantees
Authors Sathya N. Ravi, Maxwell D. Collins, Vikas Singh
Abstract We present a new Frank-Wolfe (FW) type algorithm that is applicable to minimization problems with a nonsmooth convex objective. We provide convergence bounds and show that the scheme yields so-called coreset results for various Machine Learning problems including 1-median, Balanced Development, Sparse PCA, Graph Cuts, and the $\ell_1$-norm-regularized Support Vector Machine (SVM) among others. This means that the algorithm provides approximate solutions to these problems in time complexity bounds that are not dependent on the size of the input problem. Our framework, motivated by a growing body of work on sublinear algorithms for various data analysis problems, is entirely deterministic and makes no use of smoothing or proximal operators. Apart from these theoretical results, we show experimentally that the algorithm is very practical and in some cases also offers significant computational advantages on large problem instances. We provide an open source implementation that can be adapted for other problems that fit the overall structure.
Tasks
Published 2017-08-22
URL http://arxiv.org/abs/1708.06714v1
PDF http://arxiv.org/pdf/1708.06714v1.pdf
PWC https://paperswithcode.com/paper/a-deterministic-nonsmooth-frank-wolfe
Repo
Framework
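
For context on the algorithm family, here is the classical smooth Frank-Wolfe iteration on the probability simplex (a sketch of the standard scheme only; the paper's deterministic nonsmooth variant and its coreset machinery are not reproduced, and the objective below is an illustrative quadratic):

```python
import numpy as np

d = 5
target = np.array([0.4, 0.3, 0.1, 0.1, 0.1])  # point inside the simplex

def grad(w):
    # f(w) = 0.5 * ||w - target||^2
    return w - target

w = np.ones(d) / d  # start at the simplex center
for t in range(200):
    g = grad(w)
    # Linear minimization oracle over the simplex returns a vertex
    s = np.zeros(d); s[np.argmin(g)] = 1.0
    gamma = 2.0 / (t + 2)          # standard step-size schedule
    w = (1 - gamma) * w + gamma * s
print(np.abs(w - target).max())
```

Because each iterate is a convex combination of at most t + 1 vertices, the iterate itself is a sparse certificate, which is the mechanism behind coreset-style guarantees.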

Warp: a method for neural network interpretability applied to gene expression profiles

Title Warp: a method for neural network interpretability applied to gene expression profiles
Authors Trofimov Assya, Lemieux Sebastien, Perreault Claude
Abstract We show a proof of principle for warping, a method to interpret the inner workings of neural networks in the context of gene expression analysis. Warping is an efficient way to gain insight into the inner workings of neural nets and make them more interpretable. We demonstrate the ability of warping to recover meaningful information for a given class on a sample-specific, individual basis. We found warping works well on both linearly and nonlinearly separable datasets. These encouraging results show that warping has the potential to be the answer to neural network interpretability in computational biology.
Tasks
Published 2017-08-16
URL http://arxiv.org/abs/1708.04988v1
PDF http://arxiv.org/pdf/1708.04988v1.pdf
PWC https://paperswithcode.com/paper/warp-a-method-for-neural-network
Repo
Framework

The Multi-layer Information Bottleneck Problem

Title The Multi-layer Information Bottleneck Problem
Authors Qianqian Yang, Pablo Piantanida, Deniz Gündüz
Abstract The multi-layer information bottleneck (IB) problem, where information is propagated (or successively refined) from layer to layer, is considered. Based on information forwarded by the preceding layer, each stage of the network is required to preserve a certain level of relevance with regard to a specific hidden variable, quantified by the mutual information. The hidden variables and the source can be arbitrarily correlated. The optimal trade-off between rates of relevance and compression (or complexity) is obtained through a single-letter characterization, referred to as the rate-relevance region. Conditions for successive refinability are given. A binary source with BSC hidden variables and a binary source with BSC/BEC mixed hidden variables are both proved to be successively refinable. We further extend our result to Gaussian models. A counterexample to successive refinability is also provided.
Tasks
Published 2017-11-14
URL http://arxiv.org/abs/1711.05102v1
PDF http://arxiv.org/pdf/1711.05102v1.pdf
PWC https://paperswithcode.com/paper/the-multi-layer-information-bottleneck
Repo
Framework
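
Relevance here is measured by mutual information. As a concrete anchor for the BSC case the abstract mentions: for a uniform binary source X observed through a BSC(p), the relevance is I(X; Y) = 1 - H2(p), computable in a few lines (the specific p values below are arbitrary examples):

```python
import math

def h2(p):
    # Binary entropy in bits
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_mutual_information(p):
    # Uniform input: I(X;Y) = H(Y) - H(Y|X) = 1 - H2(p)
    return 1.0 - h2(p)

for p in (0.0, 0.11, 0.5):
    print(round(bsc_mutual_information(p), 3))
```

A noiseless channel (p = 0) preserves the full bit, while p = 0.5 destroys all relevance; the rate-relevance region characterizes what each layer must spend to keep a target level of I.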