January 31, 2020

3069 words 15 mins read

Paper Group ANR 198


Solving Large-Scale 0-1 Knapsack Problems and its Application to Point Cloud Resampling

Title Solving Large-Scale 0-1 Knapsack Problems and its Application to Point Cloud Resampling
Authors Duanshun Li, Jing Liu, Noseong Park, Dongeun Lee, Giridhar Ramachandran, Ali Seyedmazloom, Kookjin Lee, Chen Feng, Vadim Sokolov, Rajesh Ganesan
Abstract The 0-1 knapsack problem is of fundamental importance in computer science, business, operations research, etc. In this paper, we present a deep learning based method to solve large-scale 0-1 knapsack problems where the number of products (items) is large and/or the values of products are not necessarily predetermined but decided by an external value assignment function during the optimization process. Our solution is greatly inspired by the method of Lagrange multipliers and by some recent adoptions of game theory in deep learning. After formally defining our proposed method on this basis, we develop an adaptive gradient ascent method to stabilize its optimization process. In our experiments, the presented method solves all the large-scale benchmark KP instances within a minute, whereas existing methods show fluctuating runtimes. We also show that our method can be used for other applications, including but not limited to point cloud resampling.
Tasks
Published 2019-06-11
URL https://arxiv.org/abs/1906.05929v1
PDF https://arxiv.org/pdf/1906.05929v1.pdf
PWC https://paperswithcode.com/paper/solving-large-scale-0-1-knapsack-problems-and
Repo
Framework
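
The paper's actual solver is a deep learning model, but the idea it cites as inspiration - relaxing the capacity constraint with a Lagrange multiplier and updating that multiplier by gradient ascent - fits in a few lines. The sketch below is a minimal NumPy illustration of that Lagrangian-relaxation idea for a single-constraint 0-1 knapsack; the function name, step size, and toy instance are illustrative, not the authors' method.

```python
import numpy as np

def knapsack_lagrangian(values, weights, capacity, lr=0.05, iters=500):
    """Lagrangian relaxation of a 0-1 knapsack.

    Relax the capacity constraint with a multiplier lam >= 0; for a fixed
    lam, the best relaxed solution picks every item whose penalized value
    values[i] - lam * weights[i] is positive. The multiplier is then
    adjusted by a projected (sub)gradient step.
    """
    lam = 0.0
    for _ in range(iters):
        x = (values - lam * weights > 0).astype(float)  # relaxed primal response
        violation = weights @ x - capacity              # > 0 when over capacity
        lam = max(0.0, lam + lr * violation)            # raise lam if violated
    x = (values - lam * weights > 0).astype(float)
    return x, lam

# toy instance (illustrative only)
rng = np.random.default_rng(0)
v, w = rng.uniform(1, 10, 1000), rng.uniform(1, 10, 1000)
x, lam = knapsack_lagrangian(v, w, capacity=0.3 * w.sum())
print(f"picked {int(x.sum())} items, weight ratio {w @ x / w.sum():.2f}")
```

The paper layers an adaptive gradient ascent scheme and a learned value-assignment function on top of this kind of relaxation; neither is reproduced here.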

k-Same-Siamese-GAN: k-Same Algorithm with Generative Adversarial Network for Facial Image De-identification with Hyperparameter Tuning and Mixed Precision Training

Title k-Same-Siamese-GAN: k-Same Algorithm with Generative Adversarial Network for Facial Image De-identification with Hyperparameter Tuning and Mixed Precision Training
Authors Yi-Lun Pan, Min-Jhih Huang, Kuo-Teng Ding, Ja-Ling Wu, Jyh-Shing Jang
Abstract Consider a data holder, such as a hospital or a government entity, with a privately held collection of personal data whose disclosure and/or processing of personally identifiable information is restricted or prohibited by law. Then, “how can we ensure the data holder conceals the identity of each individual in the imagery of personal data while still preserving certain useful aspects of the data after de-identification?” becomes a challenging issue. In this work, we propose an approach towards high-resolution facial image de-identification, called k-Same-Siamese-GAN (kSS-GAN), which leverages the k-Same-Anonymity mechanism, Generative Adversarial Networks, and hyperparameter tuning methods. Moreover, to speed up model training and reduce memory consumption, mixed precision training is applied so that kSS-GAN provides guarantees regarding privacy protection on close-form identities and can also be trained much more efficiently. Finally, to validate its applicability, the proposed work has been applied to actual datasets - RafD and CelebA - for performance testing. Besides protecting the privacy of high-resolution facial images, the proposed system is also shown to automate parameter tuning and to break through the limitation on the number of adjustable parameters.
Tasks
Published 2019-03-27
URL https://arxiv.org/abs/1904.00816v2
PDF https://arxiv.org/pdf/1904.00816v2.pdf
PWC https://paperswithcode.com/paper/k-same-siamese-gan-k-same-algorithm-with
Repo
Framework
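
The abstract credits mixed precision training with faster training and lower memory consumption. The snippet below is a generic sketch of how automatic mixed precision is commonly wired into a PyTorch training step (via torch.cuda.amp); it is not the authors' kSS-GAN code, and model, loss_fn, and the data are placeholders.

```python
import torch

def train_step(model, batch, labels, optimizer, loss_fn, scaler):
    """One mixed-precision step: run the forward pass in float16 where it is
    numerically safe, and scale the loss so float16 gradients do not underflow."""
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():       # casts eligible ops to float16
        loss = loss_fn(model(batch), labels)
    scaler.scale(loss).backward()         # backward on the scaled loss
    scaler.step(optimizer)                # unscales gradients, then steps
    scaler.update()                       # adapts the loss-scale factor
    return loss.item()

# created once, outside the training loop:
# scaler = torch.cuda.amp.GradScaler()
```

For a GAN, both the generator and the discriminator updates would each go through a step of this form.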

A Fast Algorithm for Cosine Transform Based Tensor Singular Value Decomposition

Title A Fast Algorithm for Cosine Transform Based Tensor Singular Value Decomposition
Authors Wen-Hao Xu, Xi-Le Zhao, Michael Ng
Abstract Recently, there has been a lot of research into tensor singular value decomposition (t-SVD) using the discrete Fourier transform (DFT) matrix. The main aims of this paper are to propose and study a tensor singular value decomposition based on the discrete cosine transform (DCT) matrix. The advantages of using the DCT are that (i) no complex arithmetic is involved in the cosine-transform-based tensor singular value decomposition, so computational cost is saved; and (ii) an intrinsic reflexive boundary condition along the tubes in the third dimension of the tensor is employed, so its performance is better than that obtained with the periodic boundary condition implied by the DFT. We demonstrate that the tensor product between two tensors under the DCT is equivalent to the multiplication of a block Toeplitz-plus-Hankel matrix with a block vector. Numerical examples of low-rank tensor completion further illustrate that using the DCT is about two times faster than using the DFT, and that the errors of video and multispectral image completion with the DCT are smaller than those with the DFT.
Tasks
Published 2019-02-08
URL http://arxiv.org/abs/1902.03070v1
PDF http://arxiv.org/pdf/1902.03070v1.pdf
PWC https://paperswithcode.com/paper/a-fast-algorithm-for-cosine-transform-based
Repo
Framework
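
The construction described in the abstract swaps the DFT along the third (tube) dimension for a DCT and then takes ordinary matrix SVDs of the transformed frontal slices. Below is a minimal NumPy/SciPy sketch of that pipeline, assuming an orthonormal DCT-II along axis 2; it illustrates the idea and is not the authors' implementation.

```python
import numpy as np
from scipy.fft import dct, idct

def tsvd_dct(X):
    """Cosine-transform-based t-SVD of a 3-way tensor X of shape (n1, n2, n3):
    DCT along the tubes, an SVD of every transformed frontal slice, and an
    inverse DCT to bring the factors back to the original domain."""
    n1, n2, n3 = X.shape
    Xc = dct(X, type=2, axis=2, norm='ortho')        # transform the tubes
    U = np.zeros((n1, n1, n3))
    S = np.zeros((n1, n2, n3))
    V = np.zeros((n2, n2, n3))
    m = min(n1, n2)
    for k in range(n3):                              # slice-wise matrix SVDs
        u, s, vt = np.linalg.svd(Xc[:, :, k])
        U[:, :, k], V[:, :, k] = u, vt.T
        S[np.arange(m), np.arange(m), k] = s
    U, S, V = (idct(F, type=2, axis=2, norm='ortho') for F in (U, S, V))
    return U, S, V
```

Because the DCT is a real transform, every step stays in real arithmetic, which is the source of the speedup the abstract reports over the complex-valued DFT version.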

End-to-End Motion Planning of Quadrotors Using Deep Reinforcement Learning

Title End-to-End Motion Planning of Quadrotors Using Deep Reinforcement Learning
Authors Efe Camci, Erdal Kayacan
Abstract In this work, a novel, end-to-end motion planning method is proposed for quadrotor navigation in cluttered environments. The proposed method circumvents the explicit sensing-reconstructing-planning pipeline of conventional navigation algorithms. It uses raw depth images obtained from a front-facing camera and directly generates local motion plans, in the form of smooth motion primitives, that move a quadrotor to a goal while avoiding obstacles. Promising training and testing results are presented for both AirSim simulations and real flights with a DJI F330 quadrotor equipped with an Intel RealSense D435. The proposed system in action can be seen at https://youtu.be/pYvKhc8wrTM.
Tasks Motion Planning
Published 2019-09-30
URL https://arxiv.org/abs/1909.13599v2
PDF https://arxiv.org/pdf/1909.13599v2.pdf
PWC https://paperswithcode.com/paper/end-to-end-motion-planning-of-quadrotors
Repo
Framework

A practical two-stage training strategy for multi-stream end-to-end speech recognition

Title A practical two-stage training strategy for multi-stream end-to-end speech recognition
Authors Ruizhi Li, Gregory Sell, Xiaofei Wang, Shinji Watanabe, Hynek Hermansky
Abstract The multi-stream paradigm of audio processing, in which several sources are considered simultaneously, has been an active research area for information fusion. Our previous study offered a promising direction within end-to-end automatic speech recognition, where parallel encoders aim to capture diverse information and are followed by a stream-level fusion based on attention mechanisms to combine the different views. However, as the number of streams grows, so does the number of encoders, and the previous approach can require substantial memory and massive amounts of parallel data for joint training. In this work, we propose a practical two-stage training scheme. Stage 1 trains a Universal Feature Extractor (UFE), whose encoder outputs are produced by a single-stream model trained on all the data. Stage 2 formulates a multi-stream scheme that trains only the attention fusion module, using the UFE features and pretrained components from Stage 1. Experiments have been conducted on two datasets, DIRHA and AMI, in a multi-stream scenario. Compared with our previous method, this strategy achieves relative word error rate reductions of 8.2–32.4%, while consistently outperforming several conventional combination methods.
Tasks End-To-End Speech Recognition, Speech Recognition
Published 2019-10-23
URL https://arxiv.org/abs/1910.10671v1
PDF https://arxiv.org/pdf/1910.10671v1.pdf
PWC https://paperswithcode.com/paper/a-practical-two-stage-training-strategy-for
Repo
Framework
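
Stage 2 of the scheme trains only the attention fusion module on top of frozen per-stream encoders. As a rough illustration of what a stream-level attention fusion can look like, the PyTorch sketch below scores each stream per frame with a small MLP and mixes the streams with a softmax; the paper's actual system uses hierarchical attention inside a joint end-to-end ASR model, so treat this purely as a schematic.

```python
import torch
import torch.nn as nn

class StreamAttentionFusion(nn.Module):
    """Fuse per-stream encoder outputs with content-based attention.
    Each stream provides a (batch, time, dim) encoding; a small MLP scores
    every stream at every frame and a softmax over streams mixes them."""
    def __init__(self, dim, hidden=128):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(dim, hidden), nn.Tanh(),
                                   nn.Linear(hidden, 1))

    def forward(self, streams):                  # list of (B, T, D) tensors
        h = torch.stack(streams, dim=1)          # (B, S, T, D)
        w = torch.softmax(self.score(h), dim=1)  # (B, S, T, 1) weights over streams
        return (w * h).sum(dim=1)                # (B, T, D) fused encoding

# Stage-2 idea: freeze the pretrained UFE encoders and train only the fusion.
# for p in pretrained_encoder.parameters():
#     p.requires_grad = False
```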

Iterative Deep Learning Based Unbiased Stereology With Human-in-the-Loop

Title Iterative Deep Learning Based Unbiased Stereology With Human-in-the-Loop
Authors Saeed S. Alahmari, Dmitry Goldgof, Lawrence O. Hall, Palak Dave, Hady Ahmady Phoulady, Peter R. Mouton
Abstract A lack of sufficient labeled data is a major problem in building machine learning based models when manual annotation (labeling) is error-prone, expensive, tedious, and time-consuming. In this paper, we introduce an iterative deep learning based method to improve segmentation and counting of cells based on unbiased stereology applied to regions of interest of extended depth of field (EDF) images. The method uses an existing machine learning algorithm, the adaptive segmentation algorithm (ASA), to generate masks (verified by a user) for EDF images to train deep learning models. An iterative deep learning approach then feeds newly predicted and accepted deep learning masks/images (verified by a user) back into the training set of the deep learning model. The error rate in the unbiased stereology count of cells on an unseen test set was reduced from about 3% to less than 1% after 5 iterations of the iterative deep learning based unbiased stereology process.
Tasks
Published 2019-01-14
URL http://arxiv.org/abs/1901.04355v1
PDF http://arxiv.org/pdf/1901.04355v1.pdf
PWC https://paperswithcode.com/paper/iterative-deep-learning-based-unbiased
Repo
Framework
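
The procedure in the abstract is essentially a human-in-the-loop self-training loop: train on the accepted masks, predict masks for new EDF images, let an expert accept or reject them, and retrain. A schematic Python version of that loop is sketched below; segmenter, human_verify, and the dataset objects are placeholders rather than the authors' code.

```python
def iterative_training(segmenter, labeled, unlabeled, human_verify, rounds=5):
    """Human-in-the-loop iterative training.

    Each round: (1) retrain on all currently accepted (image, mask) pairs,
    (2) predict masks for the remaining unlabeled images, (3) keep only the
    masks the expert accepts, (4) fold them into the training pool.
    """
    train_pool = list(labeled)                     # accepted (image, mask) pairs
    for _ in range(rounds):
        segmenter.fit(train_pool)                  # retrain from the current pool
        accepted, remaining = [], []
        for image in unlabeled:
            mask = segmenter.predict(image)
            if human_verify(image, mask):          # expert accepts or rejects
                accepted.append((image, mask))
            else:
                remaining.append(image)
        train_pool += accepted
        unlabeled = remaining
    return segmenter
```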

Non-Cooperative Game Theory Based Rate Adaptation for Dynamic Video Streaming over HTTP

Title Non-Cooperative Game Theory Based Rate Adaptation for Dynamic Video Streaming over HTTP
Authors Hui Yuan, Huayong Fu, Ju Liu, Junhui Hou, Sam Kwong
Abstract Dynamic Adaptive Streaming over HTTP (DASH) has been demonstrated to be an emerging and promising multimedia streaming technique, owing to its capability of dealing with the variability of networks. The rate adaptation mechanism, a challenging and open issue, plays an important role in DASH-based systems since it affects users' Quality of Experience (QoE), network utilization, etc. In this paper, based on non-cooperative game theory, we propose a novel algorithm to optimally allocate the limited export bandwidth of the server to multiple users so as to maximize their QoE with fairness guaranteed. The proposed algorithm is proxy-free. Specifically, a novel user QoE model is derived by taking a variety of factors into account, such as the received video quality, the reference buffer length, and the users' accumulated buffer lengths. The bandwidth competition problem is then formulated as a non-cooperative game, and the existence of a Nash Equilibrium is theoretically proven. Finally, a distributed iterative algorithm with stability analysis is proposed to find the Nash Equilibrium. Compared with state-of-the-art methods, extensive experimental results on both simulated and realistic networking scenarios demonstrate that the proposed algorithm produces higher QoE, and the actual buffer lengths of all users stay in nearly optimal states, i.e., they move around the reference buffer length all the time. Besides, the proposed algorithm produces no playback interruptions.
Tasks
Published 2019-12-27
URL https://arxiv.org/abs/1912.11954v1
PDF https://arxiv.org/pdf/1912.11954v1.pdf
PWC https://paperswithcode.com/paper/non-cooperative-game-theory-based-rate
Repo
Framework
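
The distributed iterative algorithm referred to in the abstract repeatedly lets each client choose a best-response rate given the other clients' demands until the Nash Equilibrium is reached. The toy sketch below runs such best-response dynamics for an assumed log-utility QoE minus a congestion cost; the utility function and its closed-form best response are invented for illustration and are not the paper's QoE model.

```python
import numpy as np

def best_response_rates(prices, capacity, iters=100, tol=1e-6):
    """Toy best-response dynamics for bandwidth competition.

    Assumed per-user utility: log(1 + r_i) - prices[i] * r_i / capacity.
    Setting the derivative to zero gives the closed-form best response
    r_i = capacity / prices[i] - 1, clipped to the bandwidth left over by
    the other users. Iterate until the rates stop changing.
    """
    n = len(prices)
    r = np.full(n, capacity / n)
    for _ in range(iters):
        r_old = r.copy()
        for i in range(n):
            leftover = capacity - (r.sum() - r[i])
            r[i] = np.clip(capacity / prices[i] - 1.0, 0.0, leftover)
        if np.abs(r - r_old).max() < tol:      # fixed point = Nash Equilibrium
            break
    return r

print(best_response_rates(np.array([1.0, 1.5, 2.0]), capacity=10.0))
```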

Phenotypic Profiling of High Throughput Imaging Screens with Generic Deep Convolutional Features

Title Phenotypic Profiling of High Throughput Imaging Screens with Generic Deep Convolutional Features
Authors Philip T. Jackson, Yinhai Wang, Sinead Knight, Hongming Chen, Thierry Dorval, Martin Brown, Claus Bendtsen, Boguslaw Obara
Abstract While deep learning has seen many recent applications to drug discovery, most have focused on predicting activity or toxicity directly from chemical structure. Phenotypic changes exhibited in cellular images are also indications of the mechanism of action (MoA) of chemical compounds. In this paper, we show how pre-trained convolutional image features can be used to assist scientists in discovering interesting chemical clusters for further investigation. Our method reduces the dimensionality of raw fluorescent stained images from a high throughput imaging (HTI) screen, producing an embedding space that groups together images with similar cellular phenotypes. Running standard unsupervised clustering on this embedding space yields a set of distinct phenotypic clusters, allowing scientists to select and focus on interesting clusters for downstream analyses. We validate the consistency of our embedding space qualitatively with t-SNE visualizations, and quantitatively by measuring embedding variance among images that are known to be similar. The results suggest that the proposed workflow, combining deep learning features with clustering, is useful and can lead to robust HTI screening and compound triage.
Tasks Drug Discovery
Published 2019-03-15
URL http://arxiv.org/abs/1903.06516v1
PDF http://arxiv.org/pdf/1903.06516v1.pdf
PWC https://paperswithcode.com/paper/phenotypic-profiling-of-high-throughput
Repo
Framework
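
The workflow in the abstract - generic pretrained convolutional features followed by unsupervised clustering of the embedding space - can be sketched with off-the-shelf components. The snippet below uses a torchvision ResNet-50 as a stand-in feature extractor and k-means from scikit-learn; the backbone choice, the number of clusters, and the elided data loading are assumptions, not the authors' exact pipeline.

```python
import torch
import torch.nn as nn
from torchvision import models
from sklearn.cluster import KMeans

# generic pretrained feature extractor: ResNet-50 with its classifier removed
backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
backbone.fc = nn.Identity()
backbone.eval()

@torch.no_grad()
def embed(images):
    """Map a (N, 3, H, W) image batch to 2048-d embeddings with the frozen backbone."""
    return backbone(images).cpu().numpy()

# cluster the embeddings into candidate phenotypic groups
# features = embed(image_batch)        # image_batch prepared elsewhere
# clusters = KMeans(n_clusters=20, n_init=10).fit_predict(features)
```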

From Senones to Chenones: Tied Context-Dependent Graphemes for Hybrid Speech Recognition

Title From Senones to Chenones: Tied Context-Dependent Graphemes for Hybrid Speech Recognition
Authors Duc Le, Xiaohui Zhang, Weiyi Zheng, Christian Fügen, Geoffrey Zweig, Michael L. Seltzer
Abstract There is an implicit assumption that traditional hybrid approaches for automatic speech recognition (ASR) cannot directly model graphemes and need to rely on phonetic lexicons to get competitive performance, especially on English which has poor grapheme-phoneme correspondence. In this work, we show for the first time that, on English, hybrid ASR systems can in fact model graphemes effectively by leveraging tied context-dependent graphemes, i.e., chenones. Our chenone-based systems significantly outperform equivalent senone baselines by 4.5% to 11.1% relative on three different English datasets. Our results on LibriSpeech are state-of-the-art compared to other hybrid approaches and competitive with previously published end-to-end numbers. Further analysis shows that chenones can better utilize powerful acoustic models and large training data, and require context- and position-dependent modeling to work well. Chenone-based systems also outperform senone baselines on proper noun and rare word recognition, an area where the latter is traditionally thought to have an advantage. Our work provides an alternative for end-to-end ASR and establishes that hybrid systems can be improved by dropping the reliance on phonetic knowledge.
Tasks End-To-End Speech Recognition, Speech Recognition
Published 2019-10-02
URL https://arxiv.org/abs/1910.01493v2
PDF https://arxiv.org/pdf/1910.01493v2.pdf
PWC https://paperswithcode.com/paper/from-senones-to-chenones-tied-context
Repo
Framework

Non-Stationary Streaming PCA

Title Non-Stationary Streaming PCA
Authors Daniel Bienstock, Apurv Shukla, SeYoung Yun
Abstract We consider the problem of streaming principal component analysis (PCA) when the observations are noisy and generated in a non-stationary environment. Given $T$ noisy $p$-dimensional observations sampled from a non-stationary variant of the spiked covariance model, our goal is to construct the best linear $k$-dimensional subspace of the terminal observations. We study the effect of non-stationarity by establishing a lower bound on the number of samples and the corresponding recovery error obtained by any algorithm. We establish the convergence behaviour of the noisy power method using a novel proof technique which may be of independent interest. We conclude that the recovery guarantee of the noisy power method matches this fundamental limit, thereby generalizing existing results on streaming PCA to a non-stationary setting.
Tasks
Published 2019-02-08
URL http://arxiv.org/abs/1902.03223v2
PDF http://arxiv.org/pdf/1902.03223v2.pdf
PWC https://paperswithcode.com/paper/non-stationary-streaming-pca
Repo
Framework
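
The noisy power method analyzed in the abstract is a short iteration: multiply the current subspace estimate by a noisy covariance estimate built from the latest batch of observations, then re-orthonormalize. A compact NumPy sketch of that iteration follows; the non-stationary drift handling and the sample-complexity bookkeeping from the paper are omitted.

```python
import numpy as np

def noisy_power_method(stream, p, k, iters=50, seed=0):
    """Estimate the top-k subspace from a stream of noisy sample batches.

    `stream(t)` returns an (n_t, p) batch of observations at step t. Each
    iteration applies the batch covariance to the current estimate (without
    ever forming the p x p matrix) and re-orthonormalizes with a QR step.
    """
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(rng.standard_normal((p, k)))   # random orthonormal start
    for t in range(iters):
        X = stream(t)                                  # (n_t, p) noisy samples
        G = X.T @ (X @ Q) / X.shape[0]                 # (p, k) covariance action
        Q, _ = np.linalg.qr(G)                         # re-orthonormalize
    return Q
```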

SemEval-2013 Task 4: Free Paraphrases of Noun Compounds

Title SemEval-2013 Task 4: Free Paraphrases of Noun Compounds
Authors Iris Hendrickx, Preslav Nakov, Stan Szpakowicz, Zornitsa Kozareva, Diarmuid Ó Séaghdha, Tony Veale
Abstract In this paper, we describe SemEval-2013 Task 4: the definition, the data, the evaluation and the results. The task is to capture some of the meaning of English noun compounds via paraphrasing. Given a two-word noun compound, the participating system is asked to produce an explicitly ranked list of its free-form paraphrases. The list is automatically compared and evaluated against a similarly ranked list of paraphrases proposed by human annotators, recruited and managed through Amazon’s Mechanical Turk. The comparison of raw paraphrases is sensitive to syntactic and morphological variation. The “gold” ranking is based on the relative popularity of paraphrases among annotators. To make the ranking more reliable, highly similar paraphrases are grouped, so as to downplay superficial differences in syntax and morphology. Three systems participated in the task. They all beat a simple baseline on one of the two evaluation measures, but not on both measures. This shows that the task is difficult.
Tasks
Published 2019-11-23
URL https://arxiv.org/abs/1911.10421v1
PDF https://arxiv.org/pdf/1911.10421v1.pdf
PWC https://paperswithcode.com/paper/semeval-2013-task-4-free-paraphrases-of-noun-1
Repo
Framework

Algorithmic Extremism: Examining YouTube’s Rabbit Hole of Radicalization

Title Algorithmic Extremism: Examining YouTube’s Rabbit Hole of Radicalization
Authors Mark Ledwich, Anna Zaitsev
Abstract The role that YouTube and its behind-the-scenes recommendation algorithm play in encouraging online radicalization has been suggested by journalists and academics alike. This study directly quantifies these claims by examining the role that YouTube’s algorithm plays in suggesting radicalized content. After categorizing nearly 800 political channels, we were able to differentiate between political schemas in order to analyze the algorithm’s traffic flows out of and between each group. After conducting a detailed analysis of the recommendations received by each channel type, we refute the popular radicalization claims. To the contrary, these data suggest that YouTube’s recommendation algorithm actively discourages viewers from visiting radicalizing or extremist content. Instead, the algorithm is shown to favor mainstream media and cable news content over independent YouTube channels, with a slant toward left-leaning or politically neutral channels. Our study thus suggests that YouTube’s recommendation algorithm fails to promote inflammatory or radicalized content, as previously claimed by several outlets.
Tasks
Published 2019-12-24
URL https://arxiv.org/abs/1912.11211v1
PDF https://arxiv.org/pdf/1912.11211v1.pdf
PWC https://paperswithcode.com/paper/algorithmic-extremism-examining-youtubes
Repo
Framework

Supervised learning algorithms resilient to discriminatory data perturbations

Title Supervised learning algorithms resilient to discriminatory data perturbations
Authors Przemyslaw A. Grabowicz, Kenta Takatsu, Luis F. Lafuerza
Abstract The actions of individuals can be discriminatory with respect to certain \textit{protected} attributes, such as race or gender. Recently, discrimination has become a focal concern in supervised learning algorithms that augment human decision-making. These systems are trained using historical data, which may have been tainted by discrimination, and may learn biases against the protected groups. An important question is how to train models without propagating discrimination. Such discrimination can be either direct, when one or more of the protected attributes are used in the decision-making directly, or indirect, when other attributes correlated with the protected attributes are used in an unjustified manner. In this work, we i) model discrimination as a perturbation of the data-generating process; ii) introduce a measure of the resilience of a supervised learning algorithm to potentially discriminatory data perturbations; and iii) propose a novel supervised learning method that is more resilient to such discriminatory perturbations than state-of-the-art learning algorithms that address discrimination. The proposed method can be used with general supervised learning algorithms, prevents direct discrimination, and avoids inducing indirect discrimination, while maximizing model accuracy.
Tasks Decision Making
Published 2019-12-17
URL https://arxiv.org/abs/1912.08189v2
PDF https://arxiv.org/pdf/1912.08189v2.pdf
PWC https://paperswithcode.com/paper/supervised-learning-algorithms-resilient-to
Repo
Framework

Visual Deprojection: Probabilistic Recovery of Collapsed Dimensions

Title Visual Deprojection: Probabilistic Recovery of Collapsed Dimensions
Authors Guha Balakrishnan, Adrian V. Dalca, Amy Zhao, John V. Guttag, Fredo Durand, William T. Freeman
Abstract We introduce visual deprojection: the task of recovering an image or video that has been collapsed along a dimension. Projections arise in various contexts, such as long-exposure photography, where a dynamic scene is collapsed in time to produce a motion-blurred image, and corner cameras, where reflected light from a scene is collapsed along a spatial dimension because of an edge occluder to yield a 1D video. Deprojection is ill-posed: often there are many plausible solutions for a given input. We first propose a probabilistic model capturing the ambiguity of the task. We then present a variational inference strategy using convolutional neural networks as function approximators. Sampling from the inference network at test time yields plausible candidates from the distribution of original signals that are consistent with a given input projection. We evaluate the method on several datasets for both spatial and temporal deprojection tasks. We first demonstrate that the method can recover human gait videos and face images from spatial projections, and then show that it can recover videos of moving digits from dramatically motion-blurred images obtained via temporal projection.
Tasks
Published 2019-09-01
URL https://arxiv.org/abs/1909.00475v1
PDF https://arxiv.org/pdf/1909.00475v1.pdf
PWC https://paperswithcode.com/paper/visual-deprojection-probabilistic-recovery-of
Repo
Framework
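
The method described in the abstract couples an inference network (projection to a distribution over latents) with a decoder (latent plus projection to a candidate original signal), so that sampling different latents at test time yields different plausible deprojections. The heavily simplified, fully connected PyTorch sketch below only illustrates that structure; the real model is convolutional, and the dimensions here are placeholders.

```python
import torch
import torch.nn as nn

class DeprojectionVAE(nn.Module):
    """Toy conditional VAE for deprojection: infer a latent from a 1D
    projection, then decode latent + projection into a 2D image."""
    def __init__(self, proj_dim=28, img_dim=28 * 28, z_dim=16, hidden=256):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(proj_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 2 * z_dim))
        self.dec = nn.Sequential(nn.Linear(z_dim + proj_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, img_dim), nn.Sigmoid())

    def forward(self, proj):
        mu, logvar = self.enc(proj).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterize
        return self.dec(torch.cat([z, proj], dim=-1)), mu, logvar

# at test time, draw several z for the same projection to obtain multiple
# plausible reconstructions of the collapsed dimension.
```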

Proceedings of the 2nd Symposium on Problem-solving, Creativity and Spatial Reasoning in Cognitive Systems, ProSocrates 2017

Title Proceedings of the 2nd Symposium on Problem-solving, Creativity and Spatial Reasoning in Cognitive Systems, ProSocrates 2017
Authors Ana-Maria Olteteanu, Zoe Falomir
Abstract This book contains the accepted papers of the ProSocrates 2017 Symposium: Problem-solving, Creativity and Spatial Reasoning in Cognitive Systems. The ProSocrates 2017 symposium was held at the Hanse-Wissenschaftskolleg (HWK) for Advanced Studies in Delmenhorst, 20-21 July 2017. This was the second edition of the symposium, which aims to bring together researchers interested in spatial reasoning, problem solving and creativity.
Tasks
Published 2019-01-14
URL http://arxiv.org/abs/1901.04199v1
PDF http://arxiv.org/pdf/1901.04199v1.pdf
PWC https://paperswithcode.com/paper/proceedings-of-the-2nd-symposium-on-problem
Repo
Framework