October 20, 2019

Paper Group AWR 180

Investigating Speech Features for Continuous Turn-Taking Prediction Using LSTMs. A Cross-Architecture Instruction Embedding Model for Natural Language Processing-Inspired Binary Code Analysis. On Visual Hallmarks of Robustness to Adversarial Malware. Copy the Old or Paint Anew? An Adversarial Framework for (non-) Parametric Image Stylization. Utili …

Investigating Speech Features for Continuous Turn-Taking Prediction Using LSTMs


Title	Investigating Speech Features for Continuous Turn-Taking Prediction Using LSTMs
Authors	Matthew Roddy, Gabriel Skantze, Naomi Harte
Abstract	For spoken dialog systems to conduct fluid conversational interactions with users, the systems must be sensitive to turn-taking cues produced by a user. Models should be designed so that effective decisions can be made as to when it is appropriate, or not, for the system to speak. Traditional end-of-turn models, where decisions are made at utterance end-points, are limited in their ability to model fast turn-switches and overlap. A more flexible approach is to model turn-taking in a continuous manner using RNNs, where the system predicts speech probability scores for discrete frames within a future window. The continuous predictions represent generalized turn-taking behaviors observed in the training data and can be applied to make decisions that are not just limited to end-of-turn detection. In this paper, we investigate optimal speech-related feature sets for making predictions at pauses and overlaps in conversation. We find that while traditional acoustic features perform well, part-of-speech features generally perform worse than word features. We show that our current models outperform previously reported baselines.
Tasks
Published	2018-06-29
URL	http://arxiv.org/abs/1806.11461v1
PDF	http://arxiv.org/pdf/1806.11461v1.pdf
PWC	https://paperswithcode.com/paper/investigating-speech-features-for-continuous
Repo	https://github.com/mattroddy/lstm_turn_taking_prediction
Framework	pytorch

A Cross-Architecture Instruction Embedding Model for Natural Language Processing-Inspired Binary Code Analysis


Title	A Cross-Architecture Instruction Embedding Model for Natural Language Processing-Inspired Binary Code Analysis
Authors	Kimberly Redmond, Lannan Luo, Qiang Zeng
Abstract	Given a closed-source program, such as most proprietary software and viruses, binary code analysis is indispensable for many tasks, such as code plagiarism detection and malware analysis. Today, source code is very often compiled for various architectures, making cross-architecture binary code analysis increasingly important. A binary, after being disassembled, is expressed in an assembly language. Thus, recent work has started exploring Natural Language Processing (NLP) inspired binary code analysis. In NLP, words are usually represented as high-dimensional vectors (i.e., embeddings) to facilitate further processing, which is one of the most common and critical steps in many NLP tasks. We regard instructions as words in NLP-inspired binary code analysis, and aim to represent instructions as embeddings as well. To facilitate cross-architecture binary code analysis, our goal is that similar instructions, regardless of their architectures, have embeddings close to each other. To this end, we propose a joint learning approach to generating instruction embeddings that capture not only the semantics of instructions within an architecture, but also their semantic relationships across architectures. To the best of our knowledge, this is the first work on building a cross-architecture instruction embedding model. As a showcase, we apply the model to resolving one of the most fundamental problems for binary code similarity comparison: semantics-based basic block comparison, where the solution outperforms the code statistics based approach. This demonstrates that it is promising to apply the model to other cross-architecture binary code analysis tasks.
Tasks
Published	2018-12-23
URL	http://arxiv.org/abs/1812.09652v1
PDF	http://arxiv.org/pdf/1812.09652v1.pdf
PWC	https://paperswithcode.com/paper/a-cross-architecture-instruction-embedding
Repo	https://github.com/nlp-code-analysis/cross-arch-instr-model
Framework	none

On Visual Hallmarks of Robustness to Adversarial Malware


Title	On Visual Hallmarks of Robustness to Adversarial Malware
Authors	Alex Huang, Abdullah Al-Dujaili, Erik Hemberg, Una-May O’Reilly
Abstract	A central challenge of adversarial learning is to interpret the resulting hardened model. In this contribution, we ask how robust generalization can be visually discerned and whether a concise view of the interactions between a hardened decision map and input samples is possible. We first provide a means of visually comparing a hardened model’s loss behavior with respect to the adversarial variants generated during training versus loss behavior with respect to adversarial variants generated from other sources. This allows us to confirm that the association of observed flatness of a loss landscape with generalization that is seen with naturally trained models extends to adversarially hardened models and robust generalization. To complement these means of interpreting model parameter robustness we also use self-organizing maps to provide a visual means of superimposing adversarial and natural variants on a model’s decision space, thus allowing the model’s global robustness to be comprehensively examined.
Tasks
Published	2018-05-09
URL	http://arxiv.org/abs/1805.03553v1
PDF	http://arxiv.org/pdf/1805.03553v1.pdf
PWC	https://paperswithcode.com/paper/on-visual-hallmarks-of-robustness-to
Repo	https://github.com/ALFA-group/robust-adv-malware-detection
Framework	pytorch

Copy the Old or Paint Anew? An Adversarial Framework for (non-) Parametric Image Stylization


Title	Copy the Old or Paint Anew? An Adversarial Framework for (non-) Parametric Image Stylization
Authors	Nikolay Jetchev, Urs Bergmann, Gokhan Yildirim
Abstract	Parametric generative deep models are state-of-the-art for photo and non-photo realistic image stylization. However, learning complicated image representations requires compute-intense models parametrized by a huge number of weights, which in turn requires large datasets to make learning successful. Non-parametric exemplar-based generation is a technique that works well to reproduce style from small datasets, but is also compute-intensive. These aspects are a drawback for the practice of digital AI artists: typically one wants to use a small set of stylization images, and needs a fast flexible model in order to experiment with it. With this motivation, our work has these contributions: (i) a novel stylization method called Fully Adversarial Mosaics (FAMOS) that combines the strengths of both parametric and non-parametric approaches; (ii) multiple ablations and image examples that analyze the method and show its capabilities; (iii) source code that will empower artists and machine learning researchers to use and modify FAMOS.
Tasks	Image Stylization
Published	2018-11-22
URL	http://arxiv.org/abs/1811.09236v1
PDF	http://arxiv.org/pdf/1811.09236v1.pdf
PWC	https://paperswithcode.com/paper/copy-the-old-or-paint-anew-an-adversarial
Repo	https://github.com/MQSchleich/SatelliteGAN
Framework	none

Utilizing Class Information for Deep Network Representation Shaping


Title	Utilizing Class Information for Deep Network Representation Shaping
Authors	Daeyoung Choi, Wonjong Rhee
Abstract	Statistical characteristics of deep network representations, such as sparsity and correlation, are known to be relevant to the performance and interpretability of deep learning. When a statistical characteristic is desired, often an adequate regularizer can be designed and applied during the training phase. Typically, such a regularizer aims to manipulate a statistical characteristic over all classes together. For classification tasks, however, it might be advantageous to enforce the desired characteristic per class such that different classes can be better distinguished. Motivated by this idea, we design two class-wise regularizers that explicitly utilize class information: class-wise Covariance Regularizer (cw-CR) and class-wise Variance Regularizer (cw-VR). cw-CR aims to reduce the covariance of representations calculated from the same class samples to encourage feature independence. cw-VR is similar, but targets variance instead of covariance to improve feature compactness. For the sake of completeness, their counterparts without using class information, Covariance Regularizer (CR) and Variance Regularizer (VR), are considered together. The four regularizers are conceptually simple and computationally very efficient, and the visualization shows that the regularizers indeed perform distinct representation shaping. In terms of classification performance, significant improvements over the baseline and L1/L2 weight regularization methods were found for 21 out of 22 tasks over popular benchmark datasets. In particular, cw-VR achieved the best performance for 13 tasks including ResNet-32/110.
Tasks
Published	2018-09-25
URL	http://arxiv.org/abs/1809.09307v2
PDF	http://arxiv.org/pdf/1809.09307v2.pdf
PWC	https://paperswithcode.com/paper/utilizing-class-information-for-deep-network
Repo	https://github.com/snu-adsl/class_wise_regularizer
Framework	tf
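
The class-wise variance idea described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name, the averaging over classes and units, and the use of the biased (divide-by-n) variance are all assumptions made for illustration. It computes, for each class in a batch, the variance of every hidden unit over the samples of that class, and averages the result.

```python
from collections import defaultdict


def class_wise_variance_penalty(activations, labels):
    """Mean per-class, per-unit variance of a batch of hidden activations.

    activations: list of equal-length lists of floats, one row per sample.
    labels: list of class ids, one per sample.
    The normalization used here is an assumption, not taken from the paper.
    """
    # Group the activation rows by class label.
    groups = defaultdict(list)
    for row, label in zip(activations, labels):
        groups[label].append(row)

    total, count = 0.0, 0
    for rows in groups.values():
        n = len(rows)
        for unit in zip(*rows):  # one tuple of activations per hidden unit
            mean = sum(unit) / n
            total += sum((a - mean) ** 2 for a in unit) / n
            count += 1
    return total / count if count else 0.0


acts = [[1.0, 0.0], [3.0, 0.0], [10.0, 5.0], [10.0, 5.0]]
per_class = class_wise_variance_penalty(acts, [0, 0, 1, 1])
all_classes = class_wise_variance_penalty(acts, [0, 0, 0, 0])
```

Passing a single label for every sample reduces the sketch to a class-agnostic variance penalty, which is larger here because the two classes sit far apart. In training, such a term would typically be added to the classification loss with a weighting coefficient, which is omitted from this sketch.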

SPI-Optimizer: an integral-Separated PI Controller for Stochastic Optimization


Title	SPI-Optimizer: an integral-Separated PI Controller for Stochastic Optimization
Authors	Dan Wang, Mengqi Ji, Yong Wang, Haoqian Wang, Lu Fang
Abstract	To overcome the oscillation problem in the classical momentum-based optimizer, recent work associates it with the proportional-integral (PI) controller and artificially adds a D term, producing a PID controller. This suppresses oscillation at the cost of introducing an extra hyperparameter. In this paper, we start by asking why momentum-based methods oscillate about the optimal point, and answer that the fluctuation problem relates to the lag effect of the integral (I) term. Inspired by the conditional integration idea in the classical control community, we propose SPI-Optimizer, an integral-Separated PI controller based optimizer that introduces no extra hyperparameter. It separates the momentum term adaptively when the current and historical gradient directions are inconsistent. Extensive experiments demonstrate that SPI-Optimizer generalizes well on popular network architectures to eliminate the oscillation, and achieves competitive performance with faster convergence (up to 40% epochs reduction ratio) and more accurate classification results on MNIST, CIFAR10, and CIFAR100 (up to 27.5% error reduction ratio) than the state-of-the-art methods.
Tasks	Stochastic Optimization
Published	2018-12-29
URL	http://arxiv.org/abs/1812.11305v2
PDF	http://arxiv.org/pdf/1812.11305v2.pdf
PWC	https://paperswithcode.com/paper/spi-optimizer-an-integral-separated-pi
Repo	https://github.com/sgflower66/SPI-Optimizer
Framework	pytorch
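
As a rough illustration of the integral-separation idea in the abstract above, the following sketch applies a momentum update per coordinate and discards the accumulated momentum wherever its sign disagrees with the current gradient. The function name, the sign test, and the exact form of the update are assumptions for illustration; the paper and the linked repository define the actual rule.

```python
def spi_step(params, grads, velocity, lr=0.1, momentum=0.9):
    """One hypothetical integral-separated momentum step on flat lists.

    velocity accumulates past gradients (the "integral" term). Where the
    current gradient and the accumulated velocity point in opposite
    directions, the accumulated part is dropped for that coordinate.
    """
    new_params, new_velocity = [], []
    for p, g, v in zip(params, grads, velocity):
        if g * v >= 0:
            v = momentum * v + g  # directions agree: accumulate as usual
        else:
            v = g                 # directions disagree: keep only the current gradient
        new_velocity.append(v)
        new_params.append(p - lr * v)
    return new_params, new_velocity


# Coordinate 0 agrees with its history, coordinate 1 does not.
params, velocity = spi_step([1.0, 1.0], [2.0, 2.0], [1.0, -1.0])
```

In this sketch no hyperparameter is added beyond the learning rate and momentum coefficient of plain momentum SGD, which mirrors the claim in the abstract.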

Scalable Robust Kidney Exchange


Title	Scalable Robust Kidney Exchange
Authors	Duncan C McElfresh, Hoda Bidkhori, John P Dickerson
Abstract	In barter exchanges, participants directly trade their endowed goods in a constrained economic setting without money. Transactions in barter exchanges are often facilitated via a central clearinghouse that must match participants even in the face of uncertainty—over participants, existence and quality of potential trades, and so on. Leveraging robust combinatorial optimization techniques, we address uncertainty in kidney exchange, a real-world barter market where patients swap (in)compatible paired donors. We provide two scalable robust methods to handle two distinct types of uncertainty in kidney exchange—over the quality and the existence of a potential match. The latter case directly addresses a weakness in all stochastic-optimization-based methods to the kidney exchange clearing problem, which all necessarily require explicit estimates of the probability of a transaction existing—a still-unsolved problem in this nascent market. We also propose a novel, scalable kidney exchange formulation that eliminates the need for an exponential-time constraint generation process in competing formulations, maintains provable optimality, and serves as a subsolver for our robust approach. For each type of uncertainty we demonstrate the benefits of robustness on real data from a large, fielded kidney exchange in the United States. We conclude by drawing parallels between robustness and notions of fairness in the kidney exchange setting.
Tasks	Combinatorial Optimization, Stochastic Optimization
Published	2018-11-08
URL	http://arxiv.org/abs/1811.03532v1
PDF	http://arxiv.org/pdf/1811.03532v1.pdf
PWC	https://paperswithcode.com/paper/scalable-robust-kidney-exchange
Repo	https://github.com/duncanmcelfresh/RobustKidneyExchange
Framework	none

Approximate Inference for Constructing Astronomical Catalogs from Images


Title	Approximate Inference for Constructing Astronomical Catalogs from Images
Authors	Jeffrey Regier, Andrew C. Miller, David Schlegel, Ryan P. Adams, Jon D. McAuliffe, Prabhat
Abstract	We present a new, fully generative model for constructing astronomical catalogs from optical telescope image sets. Each pixel intensity is treated as a random variable with parameters that depend on the latent properties of stars and galaxies. These latent properties are themselves modeled as random. We compare two procedures for posterior inference. One procedure is based on Markov chain Monte Carlo (MCMC) while the other is based on variational inference (VI). The MCMC procedure excels at quantifying uncertainty, while the VI procedure is 1000 times faster. On a supercomputer, the VI procedure efficiently uses 665,000 CPU cores to construct an astronomical catalog from 50 terabytes of images in 14.6 minutes, demonstrating the scaling characteristics necessary to construct catalogs for upcoming astronomical surveys.
Tasks
Published	2018-02-28
URL	http://arxiv.org/abs/1803.00113v3
PDF	http://arxiv.org/pdf/1803.00113v3.pdf
PWC	https://paperswithcode.com/paper/approximate-inference-for-constructing
Repo	https://github.com/jeff-regier/Celeste.jl
Framework	none

Beyond Pixels: Leveraging Geometry and Shape Cues for Online Multi-Object Tracking


Title	Beyond Pixels: Leveraging Geometry and Shape Cues for Online Multi-Object Tracking
Authors	Sarthak Sharma, Junaid Ahmed Ansari, J. Krishna Murthy, K. Madhava Krishna
Abstract	This paper introduces geometry and object shape and pose costs for multi-object tracking in urban driving scenarios. Using images from a monocular camera alone, we devise pairwise costs for object tracks, based on several 3D cues such as object pose, shape, and motion. The proposed costs are agnostic to the data association method and can be incorporated into any optimization framework to output the pairwise data associations. These costs are easy to implement, can be computed in real-time, and complement each other to account for possible errors in a tracking-by-detection framework. We perform an extensive analysis of the designed costs and empirically demonstrate consistent improvement over the state-of-the-art under varying conditions that employ a range of object detectors, exhibit a variety in camera and object motions, and, more importantly, are not reliant on the choice of the association framework. We also show that, by using the simplest of association frameworks (two-frame Hungarian assignment), we surpass the state-of-the-art in multi-object tracking on road scenes. More qualitative and quantitative results can be found at the following URL: https://junaidcs032.github.io/Geometry_ObjectShape_MOT/.
Tasks	Multi-Object Tracking, Object Tracking, Online Multi-Object Tracking
Published	2018-02-26
URL	http://arxiv.org/abs/1802.09298v2
PDF	http://arxiv.org/pdf/1802.09298v2.pdf
PWC	https://paperswithcode.com/paper/beyond-pixels-leveraging-geometry-and-shape
Repo	https://github.com/JunaidCS032/MOTBeyondPixels
Framework	none
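
To make the two-frame association step mentioned in the abstract above concrete, here is a small sketch that matches tracks from the previous frame to detections in the current frame given a matrix of pairwise costs. It uses an exhaustive search over permutations as a stand-in for the Hungarian algorithm, which returns an assignment of the same minimum total cost but scales far better. The function name and the cost values are hypothetical, and the matrix is assumed to be square.

```python
from itertools import permutations


def associate(cost):
    """Minimum-cost one-to-one matching for a small square cost matrix.

    cost[i][j] is the cost of matching track i (previous frame) with
    detection j (current frame). Brute force: only suitable for small n.
    """
    n = len(cost)
    best = min(
        permutations(range(n)),
        key=lambda perm: sum(cost[i][perm[i]] for i in range(n)),
    )
    # Return (track index, detection index) pairs.
    return list(enumerate(best))


# Hypothetical pairwise costs combining pose, shape, and motion cues.
cost = [
    [0.9, 0.1, 0.8],
    [0.2, 0.7, 0.9],
    [0.8, 0.9, 0.3],
]
matches = associate(cost)
```

Because the costs are independent of the matching procedure, the same matrix could be passed to any other assignment solver without changing how the costs are computed.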

A New Cervical Cytology Dataset for Nucleus Detection and Image Classification (Cervix93) and Methods for Cervical Nucleus Detection


Title	A New Cervical Cytology Dataset for Nucleus Detection and Image Classification (Cervix93) and Methods for Cervical Nucleus Detection
Authors	Hady Ahmady Phoulady, Peter R. Mouton
Abstract	Analyzing Pap cytology slides is an important task in detecting and grading precancerous and cancerous cervical cancer stages. Processing cytology images usually involves segmenting nuclei and overlapping cells. We introduce a cervical cytology dataset that can be used to evaluate nucleus detection, as well as image classification methods in the cytology image processing area. This dataset contains 93 real image stacks with their grade labels and manually annotated nuclei within images. We also present two methods: a baseline method based on a previously proposed approach, and a deep learning method, and compare their results with other state-of-the-art methods. Both the baseline method and the deep learning method outperform other state-of-the-art methods by significant margins. Along with the dataset, we make the evaluation code and the baseline method publicly available to download for further benchmarking.
Tasks	Cervical Nucleus Detection, Image Classification
Published	2018-11-23
URL	http://arxiv.org/abs/1811.09651v1
PDF	http://arxiv.org/pdf/1811.09651v1.pdf
PWC	https://paperswithcode.com/paper/a-new-cervical-cytology-dataset-for-nucleus
Repo	https://github.com/parham-ap/cytology_dataset
Framework	none

Sparse Unsupervised Capsules Generalize Better


Title	Sparse Unsupervised Capsules Generalize Better
Authors	David Rawlinson, Abdelrahman Ahmed, Gideon Kowadlo
Abstract	We show that unsupervised training of latent capsule layers using only the reconstruction loss, without masking to select the correct output class, causes a loss of equivariances and other desirable capsule qualities. This implies that supervised capsules networks can’t be very deep. Unsupervised sparsening of latent capsule layer activity both restores these qualities and appears to generalize better than supervised masking, while potentially enabling deeper capsules networks. We train a sparse, unsupervised capsules network of similar geometry to Sabour et al (2017) on MNIST, and then test classification accuracy on affNIST using an SVM layer. Accuracy is improved from benchmark 79% to 90%.
Tasks
Published	2018-04-17
URL	http://arxiv.org/abs/1804.06094v1
PDF	http://arxiv.org/pdf/1804.06094v1.pdf
PWC	https://paperswithcode.com/paper/sparse-unsupervised-capsules-generalize
Repo	https://github.com/ProjectAGI/sparse-unsupervised-capsules
Framework	tf

Decoupling Direction and Norm for Efficient Gradient-Based L2 Adversarial Attacks and Defenses


Title	Decoupling Direction and Norm for Efficient Gradient-Based L2 Adversarial Attacks and Defenses
Authors	Jérôme Rony, Luiz G. Hafemann, Luiz S. Oliveira, Ismail Ben Ayed, Robert Sabourin, Eric Granger
Abstract	Research on adversarial examples in computer vision tasks has shown that small, often imperceptible changes to an image can induce misclassification, which has security implications for a wide range of image processing systems. Considering $L_2$ norm distortions, the Carlini and Wagner attack is presently the most effective white-box attack in the literature. However, this method is slow since it performs a line-search for one of the optimization terms, and often requires thousands of iterations. In this paper, an efficient approach is proposed to generate gradient-based attacks that induce misclassifications with low $L_2$ norm, by decoupling the direction and the norm of the adversarial perturbation that is added to the image. Experiments conducted on the MNIST, CIFAR-10 and ImageNet datasets indicate that our attack achieves comparable results to the state-of-the-art (in terms of $L_2$ norm) with considerably fewer iterations (as few as 100 iterations), which opens the possibility of using these attacks for adversarial training. Models trained with our attack achieve state-of-the-art robustness against white-box gradient-based $L_2$ attacks on the MNIST and CIFAR-10 datasets, outperforming the Madry defense when the attacks are limited to a maximum norm.
Tasks
Published	2018-11-23
URL	http://arxiv.org/abs/1811.09600v3
PDF	http://arxiv.org/pdf/1811.09600v3.pdf
PWC	https://paperswithcode.com/paper/decoupling-direction-and-norm-for-efficient
Repo	https://github.com/jeromerony/fast_adversarial
Framework	pytorch
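
The decoupling of direction and norm described in the abstract above can be sketched as a single attack iteration. This is a simplified illustration under assumptions, not the authors' code: the function name, the parameters `alpha` and `gamma`, and the multiplicative norm schedule are hypothetical choices for this sketch, and clipping to the valid pixel range and any step-size schedule are omitted. The direction is updated with a normalized gradient step, while the norm is adjusted separately depending on whether the current perturbed input is already misclassified.

```python
import math


def ddn_step(delta, grad, eps, is_adversarial, alpha=1.0, gamma=0.05):
    """One hypothetical iteration of a direction/norm-decoupled L2 attack.

    delta: current perturbation (flat list of floats).
    grad: gradient of the loss with respect to the input (flat list).
    eps: current target L2 norm of the perturbation.
    is_adversarial: whether the current perturbed input is misclassified.
    """
    # Direction: take a step along the normalized gradient.
    g_norm = math.sqrt(sum(g * g for g in grad)) or 1.0
    delta = [d + alpha * g / g_norm for d, g in zip(delta, grad)]

    # Norm: shrink if already adversarial, otherwise grow.
    eps = eps * (1 - gamma) if is_adversarial else eps * (1 + gamma)

    # Rescale the perturbation so that its L2 norm equals eps.
    d_norm = math.sqrt(sum(d * d for d in delta)) or 1.0
    delta = [eps * d / d_norm for d in delta]
    return delta, eps


delta, eps = ddn_step([0.0, 0.0], [3.0, 4.0], eps=1.0, is_adversarial=False)
```

Keeping the norm as its own variable means each iteration needs only one gradient evaluation, which fits the abstract's point about running with considerably fewer iterations.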

Emergence and Evolution of Hierarchical Structure in Complex Systems


Title	Emergence and Evolution of Hierarchical Structure in Complex Systems
Authors	Payam Siyari, Bistra Dilkina, Constantine Dovrolis
Abstract	It is well known that many complex systems, both in technology and nature, exhibit hierarchical modularity: smaller modules, each of them providing a certain function, are used within larger modules that perform more complex functions. What is not well understood, however, is how this hierarchical structure (which is fundamentally a network property) emerges, and how it evolves over time. We propose a modeling framework, referred to as Evo-Lexis, that provides insight into some fundamental questions about evolving hierarchical systems. Evo-Lexis models the most elementary modules of the system as symbols (“sources”) and the modules at the highest level of the hierarchy as sequences of those symbols (“targets”). Evo-Lexis computes the optimized adjustment of a given hierarchy when the set of targets changes over time by additions and removals (a process referred to as “incremental design”). In this paper we use computation modeling to show that: (i) low-cost and deep hierarchies emerge when the population of target sequences evolves through tinkering and mutation; (ii) strong selection on the cost of new candidate targets results in reuse of more complex (longer) nodes in an optimized hierarchy; (iii) the bias towards reuse of complex nodes results in an “hourglass architecture” (i.e., few intermediate nodes that cover almost all source-target paths); (iv) with such bias, the core nodes are conserved for relatively long time periods although still being vulnerable to major transitions and punctuated equilibria. Finally, we analyze the differences in terms of cost and structure between incrementally designed hierarchies and the corresponding “clean-slate” hierarchies which result when the system is designed from scratch after a change.
Tasks
Published	2018-05-13
URL	http://arxiv.org/abs/1805.04924v2
PDF	http://arxiv.org/pdf/1805.04924v2.pdf
PWC	https://paperswithcode.com/paper/emergence-and-evolution-of-hierarchical
Repo	https://github.com/payamsiyari/Evo-Lexis
Framework	none

One Deep Music Representation to Rule Them All? : A comparative analysis of different representation learning strategies


Title	One Deep Music Representation to Rule Them All? : A comparative analysis of different representation learning strategies
Authors	Jaehun Kim, Julián Urbano, Cynthia C. S. Liem, Alan Hanjalic
Abstract	Inspired by the success of deploying deep learning in the fields of Computer Vision and Natural Language Processing, this learning paradigm has also found its way into the field of Music Information Retrieval. In order to benefit from deep learning in an effective, but also efficient manner, deep transfer learning has become a common approach. In this approach, it is possible to reuse the output of a pre-trained neural network as the basis for a new learning task. The underlying hypothesis is that if the initial and new learning tasks show commonalities and are applied to the same type of input data (e.g. music audio), the generated deep representation of the data is also informative for the new task. Since, however, most of the networks used to generate deep representations are trained using a single initial learning source, their representation is unlikely to be informative for all possible future tasks. In this paper, we present the results of our investigation of what are the most important factors to generate deep representations for the data and learning tasks in the music domain. We conducted this investigation via an extensive empirical study that involves multiple learning sources, as well as multiple deep learning architectures with varying levels of information sharing between sources, in order to learn music representations. We then validate these representations considering multiple target datasets for evaluation. The results of our experiments yield several insights on how to approach the design of methods for learning widely deployable deep data representations in the music domain.
Tasks	Information Retrieval, Music Information Retrieval, Representation Learning, Transfer Learning
Published	2018-02-12
URL	http://arxiv.org/abs/1802.04051v4
PDF	http://arxiv.org/pdf/1802.04051v4.pdf
PWC	https://paperswithcode.com/paper/one-deep-music-representation-to-rule-them
Repo	https://github.com/eldrin/MTLMusicRepresentation-PyTorch
Framework	pytorch

BLeSS: Bio-inspired Low-level Spatiochromatic Similarity Assisted Image Quality Assessment


Title	BLeSS: Bio-inspired Low-level Spatiochromatic Similarity Assisted Image Quality Assessment
Authors	Dogancan Temel, Ghassan AlRegib
Abstract	This paper proposes a biologically-inspired low-level spatiochromatic-model-based similarity method (BLeSS) to assist full-reference image-quality estimators that originally oversimplify color perception processes. More specifically, the spatiochromatic model is based on spatial frequency, spatial orientation, and surround contrast effects. The assistant similarity method is used to complement image-quality estimators based on phase congruency, gradient magnitude, and spectral residual. The effectiveness of BLeSS is validated using FSIM, FSIMc and SR-SIM methods on LIVE, Multiply Distorted LIVE, and TID 2013 databases. In terms of Spearman correlation, BLeSS enhances the performance of all quality estimators in color-based degradations and the enhancement is at 100% for both feature- and spectral residual-based similarity methods. Moreover, BLeSS significantly enhances the performance of SR-SIM and FSIM in the full TID 2013 database.
Tasks	Image Quality Assessment
Published	2018-11-16
URL	http://arxiv.org/abs/1811.07044v1
PDF	http://arxiv.org/pdf/1811.07044v1.pdf
PWC	https://paperswithcode.com/paper/bless-bio-inspired-low-level-spatiochromatic
Repo	https://github.com/olivesgatech/BLeSS
Framework	none