January 28, 2020

3273 words 16 mins read

Paper Group ANR 776

Cortical-inspired Wilson-Cowan-type equations for orientation-dependent contrast perception modelling. Lower bounds for testing graphical models: colorings and antiferromagnetic Ising models. Compacting, Picking and Growing for Unforgetting Continual Learning. Intelligent Solution System towards Parts Logistics Optimization. An Ensemble SVM-based A …

Cortical-inspired Wilson-Cowan-type equations for orientation-dependent contrast perception modelling

Title Cortical-inspired Wilson-Cowan-type equations for orientation-dependent contrast perception modelling
Authors Marcelo Bertalmío, Luca Calatroni, Valentina Franceschi, Benedetta Franceschiello, Dario Prandi
Abstract We consider the evolution model proposed in [9, 6] to describe illusory contrast perception phenomena induced by surrounding orientations. Firstly, we highlight its analogies and differences with the widely used Wilson-Cowan equations [48], mainly in terms of efficient representation properties. Then, in order to explicitly encode local directional information, we exploit the model of the primary visual cortex V1 proposed in [20] and widely used in recent years for several image processing problems [24,38,28]. The resulting model is capable of describing assimilation and contrast visual biases at the same time, the main novelty being its explicit dependence on local image orientation. We report several numerical tests showing the ability of the model to explain, in particular, orientation-dependent phenomena such as grating induction and a modified version of the Poggendorff illusion. For this latter example, we empirically show the existence of a set of threshold parameters separating inpainting-type from perception-type reconstructions, describing long-range connectivity between different hypercolumns in the primary visual cortex.
Tasks
Published 2019-10-15
URL https://arxiv.org/abs/1910.06808v1
PDF https://arxiv.org/pdf/1910.06808v1.pdf
PWC https://paperswithcode.com/paper/cortical-inspired-wilson-cowan-type-equations
Repo
Framework
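
For orientation, a generic Wilson-Cowan-type evolution over the orientation-augmented domain can be written as below, where a is the cortical activity at spatial position x and local orientation θ, h is the feed-forward input computed from the image, ω is a lateral connectivity kernel between hypercolumns, and σ is a sigmoid nonlinearity. This is only a sketch of the standard form; the exact input term, kernel, sign conventions and normalization used in the paper may differ (those differences with the classical Wilson-Cowan equations are part of what the paper discusses).

```latex
\frac{\partial a}{\partial t}(x,\theta,t)
  = -\alpha\, a(x,\theta,t)
  + \sigma\!\left( \beta\, h(x,\theta)
  + \mu \int_{\mathbb{R}^2 \times [0,\pi)}
      \omega\big(x,\theta \,\|\, x',\theta'\big)\, a(x',\theta',t)\, dx'\, d\theta' \right)
```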

Lower bounds for testing graphical models: colorings and antiferromagnetic Ising models

Title Lower bounds for testing graphical models: colorings and antiferromagnetic Ising models
Authors Ivona Bezakova, Antonio Blanca, Zongchen Chen, Daniel Štefankovič, Eric Vigoda
Abstract We study the identity testing problem in the context of spin systems or undirected graphical models, where it takes the following form: given the parameter specification of the model $M$ and a sampling oracle for the distribution $\mu_{\hat{M}}$ of an unknown model $\hat{M}$, can we efficiently determine if the two models $M$ and $\hat{M}$ are the same? We consider identity testing for both soft-constraint and hard-constraint systems. In particular, we prove hardness results in two prototypical cases, the Ising model and proper colorings, and explore whether identity testing is any easier than structure learning. For the ferromagnetic (attractive) Ising model, Daskalakis et al. (2018) presented a polynomial time algorithm for identity testing. We prove hardness results in the antiferromagnetic (repulsive) setting in the same regime of parameters where structure learning is known to require a super-polynomial number of samples. In particular, for $n$-vertex graphs of maximum degree $d$, we prove that if $\beta d = \omega(\log{n})$ (where $\beta$ is the inverse temperature parameter), then there is no polynomial running time identity testing algorithm unless $RP=NP$. We also establish computational lower bounds for a broader set of parameters under the (randomized) exponential time hypothesis. Our proofs utilize insights into the design of gadgets using random graphs in recent works concerning the hardness of approximate counting by Sly (2010). In the hard-constraint setting, we present hardness results for identity testing for proper colorings. Our results are based on the presumed hardness of #BIS, the problem of (approximately) counting independent sets in bipartite graphs. In particular, we prove that identity testing is hard in the same range of parameters where structure learning is known to be hard.
Tasks
Published 2019-01-22
URL https://arxiv.org/abs/1901.07361v2
PDF https://arxiv.org/pdf/1901.07361v2.pdf
PWC https://paperswithcode.com/paper/lower-bounds-for-testing-graphical-models
Repo
Framework
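
For reference, the Ising model on an $n$-vertex graph $G=(V,E)$ assigns to each spin configuration $\sigma \in \{+1,-1\}^V$ the Gibbs probability below (one standard parameterization; the paper's sign convention for the antiferromagnetic regime may differ). With $\beta > 0$ the model is ferromagnetic (attractive), with $\beta < 0$ antiferromagnetic (repulsive); identity testing asks whether samples from the oracle are consistent with the specified $(G, \beta)$.

```latex
\mu_{G,\beta}(\sigma) \;=\; \frac{1}{Z_{G,\beta}}
  \exp\!\Big( \beta \sum_{\{u,v\} \in E} \sigma_u \sigma_v \Big),
\qquad
Z_{G,\beta} \;=\; \sum_{\tau \in \{+1,-1\}^V}
  \exp\!\Big( \beta \sum_{\{u,v\} \in E} \tau_u \tau_v \Big)
```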

Compacting, Picking and Growing for Unforgetting Continual Learning

Title Compacting, Picking and Growing for Unforgetting Continual Learning
Authors Steven C. Y. Hung, Cheng-Hao Tu, Cheng-En Wu, Chien-Hung Chen, Yi-Ming Chan, Chu-Song Chen
Abstract Continual lifelong learning is essential to many applications. In this paper, we propose a simple but effective approach to continual deep learning. Our approach leverages the principles of deep model compression, critical weights selection, and progressive networks expansion. By enforcing their integration in an iterative manner, we introduce an incremental learning method that is scalable to the number of sequential tasks in a continual learning process. Our approach is easy to implement and has several favorable characteristics. First, it can avoid forgetting (i.e., learn new tasks while remembering all previous tasks). Second, it allows model expansion but can maintain model compactness when handling sequential tasks. Besides, through our compaction and selection/expansion mechanism, we show that the knowledge accumulated through learning previous tasks is helpful in building a better model for the new tasks, compared to training the models independently on each task. Experimental results show that our approach can incrementally learn a deep model tackling multiple tasks without forgetting, while maintaining model compactness and achieving better performance than individual task training.
Tasks Continual Learning, Model Compression
Published 2019-10-15
URL https://arxiv.org/abs/1910.06562v3
PDF https://arxiv.org/pdf/1910.06562v3.pdf
PWC https://paperswithcode.com/paper/compacting-picking-and-growing-for
Repo
Framework
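
A hypothetical sketch of the compact / pick / grow loop described in the abstract, not the authors' released code: weights are plain NumPy arrays, "compacting" is magnitude pruning, "picking" is a binary mask over the frozen weights of earlier tasks, and "growing" adds fresh capacity per task. The pruning criterion, mask learning, and expansion policy are placeholders.

```python
# Hypothetical compact / pick / grow loop (illustrative only).
import numpy as np

def compact(weights, keep_ratio=0.5):
    """Magnitude-prune a weight matrix, keeping the largest `keep_ratio` fraction."""
    threshold = np.quantile(np.abs(weights), 1.0 - keep_ratio)
    return weights * (np.abs(weights) >= threshold)

def pick(frozen, usefulness):
    """Mask previously frozen weights, keeping those judged useful for the new task."""
    return frozen * (usefulness > usefulness.mean())

def grow(shape, rng, scale=0.01):
    """Freshly initialized capacity added for the new task."""
    return rng.standard_normal(shape) * scale

rng = np.random.default_rng(0)
frozen = np.zeros((0, 16))                       # accumulated weights of all previous tasks
for task_id in range(3):
    new = grow((4, 16), rng)                     # grow: expand the model for the new task
    reused = pick(frozen, rng.random(frozen.shape)) if frozen.size else frozen
    # ... here `new` and the picking mask would be trained on the new task's data ...
    new = compact(new)                           # compact: compress the new task's weights
    frozen = np.vstack([frozen, new])            # freeze them; earlier tasks stay intact
    print(f"task {task_id}: reused {int((reused != 0).sum())} weights, "
          f"frozen bank is now {frozen.shape}")
```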

Intelligent Solution System towards Parts Logistics Optimization

Title Intelligent Solution System towards Parts Logistics Optimization
Authors Yaoting Huang, Boyu Chen, Wenlian Lu, Zhong-Xiao Jin, Ren Zheng
Abstract Due to the complexity of the problem, intelligent algorithms show great power in solving the parts logistics optimization problem related to the vehicle routing problem (VRP). However, most of the existing research on VRP is incomprehensive and fails to solve real-world parts logistics problems. In this work, for the SAIC logistics problem, we propose a systematic solution to this 2-Dimensional Loading Capacitated Multi-Depot Heterogeneous VRP with Time Windows by integrating diverse types of intelligent algorithms, including a heuristic algorithm to initialize feasible logistics planning schemes by imitating manual planning, the core Tabu Search algorithm for global optimization accelerated by a novel bundle technique, heuristic algorithms for the associated routing, packing and queuing, and a heuristic post-optimization process to promote the optimal solution. Based on these algorithms, SAIC Motor has successfully established an intelligent management system providing a systematic solution for parts logistics planning, superior to manual planning in performance, customizability and expandability.
Tasks
Published 2019-03-18
URL http://arxiv.org/abs/1903.07260v1
PDF http://arxiv.org/pdf/1903.07260v1.pdf
PWC https://paperswithcode.com/paper/intelligent-solution-system-towards-parts
Repo
Framework
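
A generic tabu-search skeleton for a routing objective, shown only to illustrate the core optimization loop; the neighborhood move, cost function, and the paper's bundle acceleration and loading/time-window constraints are placeholders, not SAIC's system.

```python
# Generic tabu search over route permutations (illustrative; dummy cost and 2-swap moves).
import itertools
import random

def cost(route, dist):
    return sum(dist[route[i]][route[i + 1]] for i in range(len(route) - 1))

def tabu_search(dist, iters=200, tabu_len=20, seed=0):
    rng = random.Random(seed)
    n = len(dist)
    current = list(range(n))
    rng.shuffle(current)
    best, best_cost = current[:], cost(current, dist)
    tabu = []                                    # recently applied swaps
    for _ in range(iters):
        candidates = []
        for i, j in itertools.combinations(range(n), 2):
            if (i, j) in tabu:
                continue                         # skip tabu moves
            neigh = current[:]
            neigh[i], neigh[j] = neigh[j], neigh[i]
            candidates.append((cost(neigh, dist), (i, j), neigh))
        c, move, neigh = min(candidates)         # best non-tabu neighbor
        current = neigh
        tabu = (tabu + [move])[-tabu_len:]       # fixed-length tabu list
        if c < best_cost:
            best, best_cost = neigh, c
    return best, best_cost

dist = [[abs(i - j) for j in range(8)] for i in range(8)]   # toy distance matrix
print(tabu_search(dist))
```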

An Ensemble SVM-based Approach for Voice Activity Detection

Title An Ensemble SVM-based Approach for Voice Activity Detection
Authors Jayanta Dey, Md Sanzid Bin Hossain, Mohammad Ariful Haque
Abstract Voice activity detection (VAD), used as the front end of speech enhancement, speech and speaker recognition algorithms, determines the overall accuracy and efficiency of the algorithms. Therefore, a VAD with low complexity and high accuracy is highly desirable for speech processing applications. In this paper, we propose a novel training method on a large dataset for a supervised learning-based VAD system using support vector machines (SVMs). Despite the high classification accuracy of SVMs, a single SVM is not suitable for classifying the large datasets needed for a good VAD system because of its high training complexity. To overcome this problem, a novel ensemble-based approach using SVMs is proposed in this paper. The performance of the proposed ensemble structure has been compared with a feedforward neural network (NN). Although the NN performs better than a single SVM-based VAD trained on a small portion of the training data, the ensemble SVM gives accuracy comparable to the neural network-based VAD. The ensemble SVM and NN give 88.74% and 86.28% accuracy respectively, whereas the stand-alone SVM shows 57.05% accuracy on average on the test dataset.
Tasks Action Detection, Activity Detection, Speaker Recognition, Speech Enhancement
Published 2019-02-05
URL http://arxiv.org/abs/1902.01544v1
PDF http://arxiv.org/pdf/1902.01544v1.pdf
PWC https://paperswithcode.com/paper/an-ensemble-svm-based-approach-for-voice
Repo
Framework
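
A minimal sketch of the ensemble idea using scikit-learn: partition the large training set into chunks, train one SVM per chunk, and combine predictions by majority vote. The synthetic data and hyperparameters are illustrative, not the paper's features or dataset.

```python
# Ensemble of SVMs, each trained on a disjoint chunk of a large training set,
# combined by majority vote. Synthetic binary-classification data only.
import numpy as np
from sklearn.svm import SVC
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=20000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

n_chunks = 10
models = []
for Xc, yc in zip(np.array_split(X_tr, n_chunks), np.array_split(y_tr, n_chunks)):
    models.append(SVC(kernel="rbf", C=1.0).fit(Xc, yc))   # one SVM per chunk

votes = np.stack([m.predict(X_te) for m in models])        # shape (n_chunks, n_test)
pred = (votes.mean(axis=0) > 0.5).astype(int)               # majority vote over binary labels
print("ensemble accuracy:", (pred == y_te).mean())
```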

Image-based marker tracking and registration for intraoperative 3D image-guided interventions using augmented reality

Title Image-based marker tracking and registration for intraoperative 3D image-guided interventions using augmented reality
Authors Andong Cao, Ali Dhanaliwala, Jianbo Shi, Terence Gade, Brian Park
Abstract Augmented reality has the potential to improve operating room workflow by allowing physicians to “see” inside a patient through the projection of imaging directly onto the surgical field. For this to be useful, the acquired imaging must be quickly and accurately registered with the patient, and the registration must be maintained. Here we describe a method for projecting a CT scan with the Microsoft HoloLens and then aligning that projection to a set of fiducial markers. Radio-opaque stickers with unique QR-codes are placed on an object prior to acquiring a CT scan. The locations of the markers in the CT scan are extracted and the CT scan is converted into a 3D surface object. The 3D object is then projected using the HoloLens onto a table on which the same markers are placed. We designed an algorithm that aligns the markers on the 3D object with the markers on the table. Extracting the markers and converting the CT into a 3D object took less than 5 seconds. To align three markers, it took $0.9 \pm 0.2$ seconds to achieve an accuracy of $5 \pm 2$ mm. These findings show that it is feasible to use a combined radio-opaque optical marker, placed on a patient prior to a CT scan, to subsequently align the acquired CT scan with the patient.
Tasks
Published 2019-08-08
URL https://arxiv.org/abs/1908.03237v1
PDF https://arxiv.org/pdf/1908.03237v1.pdf
PWC https://paperswithcode.com/paper/image-based-marker-tracking-and-registration
Repo
Framework
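
The marker alignment step can be illustrated with a standard least-squares rigid registration (the Kabsch algorithm) between two sets of corresponding 3D marker coordinates. This is a generic sketch with made-up coordinates; the HoloLens projection, QR-code detection, and the authors' specific alignment algorithm are not shown.

```python
# Least-squares rigid alignment (Kabsch) between corresponding 3-D marker sets,
# e.g. CT-space markers vs. table-space markers. Illustrative values only.
import numpy as np

def rigid_align(src, dst):
    """Return rotation R and translation t such that R @ src[i] + t ≈ dst[i]."""
    src_c, dst_c = src - src.mean(0), dst - dst.mean(0)
    U, _, Vt = np.linalg.svd(src_c.T @ dst_c)
    d = np.sign(np.linalg.det(Vt.T @ U.T))          # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst.mean(0) - R @ src.mean(0)
    return R, t

ct_markers = np.array([[0.0, 0, 0], [10, 0, 0], [0, 10, 0]])       # markers in CT space (mm)
R_true = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 1]], float)
table_markers = ct_markers @ R_true.T + np.array([5.0, 2, 1])       # same markers on the table

R, t = rigid_align(ct_markers, table_markers)
err = np.linalg.norm(ct_markers @ R.T + t - table_markers, axis=1)
print("per-marker registration error (mm):", np.round(err, 6))
```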

GlyphGAN: Style-Consistent Font Generation Based on Generative Adversarial Networks

Title GlyphGAN: Style-Consistent Font Generation Based on Generative Adversarial Networks
Authors Hideaki Hayashi, Kohtaro Abe, Seiichi Uchida
Abstract In this paper, we propose GlyphGAN: style-consistent font generation based on generative adversarial networks (GANs). GANs are a framework for learning a generative model using a system of two neural networks competing with each other. One network generates synthetic images from random input vectors, and the other discriminates between synthetic and real images. The motivation of this study is to create new fonts using the GAN framework while maintaining style consistency over all characters. In GlyphGAN, the input vector for the generator network consists of two vectors: a character class vector and a style vector. The former is a one-hot vector and is associated with the character class of each sample image during training. The latter is a uniform random vector without supervised information. In this way, GlyphGAN can generate an infinite variety of fonts with the character and style independently controlled. Experimental results showed that fonts generated by GlyphGAN have style consistency and a diversity distinct from the training images, without losing legibility.
Tasks
Published 2019-05-29
URL https://arxiv.org/abs/1905.12502v2
PDF https://arxiv.org/pdf/1905.12502v2.pdf
PWC https://paperswithcode.com/paper/glyphgan-style-consistent-font-generation
Repo
Framework
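
A small sketch of how the generator input described above can be assembled: a one-hot character-class vector concatenated with a uniform random style vector, with the style vector shared across characters to keep a font style-consistent. The dimensions and the generator itself are placeholders, not the paper's architecture.

```python
# GlyphGAN-style generator input: one-hot character class + uniform style code.
import numpy as np

n_classes, style_dim = 26, 100                        # e.g. A-Z plus a 100-d style code

def generator_input(char_index, style=None, rng=np.random.default_rng(0)):
    onehot = np.zeros(n_classes)
    onehot[char_index] = 1.0                          # character class vector (supervised)
    if style is None:
        style = rng.uniform(-1.0, 1.0, style_dim)     # style vector (no supervision)
    return np.concatenate([onehot, style]), style

# Reusing the same style vector for every character yields a style-consistent font.
_, shared_style = generator_input(0)
font_inputs = [generator_input(c, style=shared_style)[0] for c in range(n_classes)]
print(len(font_inputs), font_inputs[0].shape)          # 26 inputs of size 126
```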

Improved Detection of Adversarial Attacks via Penetration Distortion Maximization

Title Improved Detection of Adversarial Attacks via Penetration Distortion Maximization
Authors Shai Rozenberg, Gal Elidan, Ran El-Yaniv
Abstract This paper is concerned with the defense of deep models against adversarial attacks. We develop an adversarial detection method, which is inspired by the certificate defense approach, and captures the idea of separating class clusters in the embedding space to increase the margin. The resulting defense is intuitive, effective, scalable, and can be integrated into any given neural classification model. Our method demonstrates state-of-the-art (detection) performance under all threat models.
Tasks
Published 2019-11-03
URL https://arxiv.org/abs/1911.00870v1
PDF https://arxiv.org/pdf/1911.00870v1.pdf
PWC https://paperswithcode.com/paper/improved-detection-of-adversarial-attacks-via-1
Repo
Framework

A Unifying Framework of Bilinear LSTMs

Title A Unifying Framework of Bilinear LSTMs
Authors Mohit Rajpal, Bryan Kian Hsiang Low
Abstract This paper presents a novel unifying framework of bilinear LSTMs that can represent and utilize the nonlinear interaction of the input features present in sequence datasets to achieve superior performance over a linear LSTM, without incurring more parameters to be learned. To realize this, our unifying framework allows the expressivity of the linear vs. bilinear terms to be balanced by trading off between the hidden state vector size and the approximation quality of the weight matrix in the bilinear term, so as to optimize the performance of our bilinear LSTM while keeping the number of learned parameters fixed. We empirically evaluate the performance of our bilinear LSTM on several language-based sequence learning tasks to demonstrate its general applicability.
Tasks
Published 2019-10-23
URL https://arxiv.org/abs/1910.10294v1
PDF https://arxiv.org/pdf/1910.10294v1.pdf
PWC https://paperswithcode.com/paper/a-unifying-framework-of-bilinear-lstms
Repo
Framework
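
One common way to write such a bilinear input-hidden interaction is sketched below; this is an illustration, not necessarily the paper's exact parameterization. Each gate pre-activation gets, in addition to the usual linear terms, a bilinear term driven by a third-order tensor $\mathcal{T}$ (one matrix $\mathcal{T}_i$ per output unit), and each $\mathcal{T}_i$ is restricted to low rank $r$ so the extra parameters can be traded off against the hidden-state size.

```latex
g_t \;=\; \sigma\!\big( W x_t + U h_{t-1} + b + x_t^{\top} \mathcal{T}\, h_{t-1} \big),
\qquad
\big[ x_t^{\top} \mathcal{T}\, h_{t-1} \big]_i \;=\; x_t^{\top} \mathcal{T}_i\, h_{t-1},
\qquad
\mathcal{T}_i \;\approx\; A_i B_i^{\top},\ \ A_i \in \mathbb{R}^{d_x \times r},\ B_i \in \mathbb{R}^{d_h \times r}
```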

Application of Autoencoder-Assisted Recurrent Neural Networks to Prevent Cases of Sudden Infant Death Syndrome

Title Application of Autoencoder-Assisted Recurrent Neural Networks to Prevent Cases of Sudden Infant Death Syndrome
Authors Maximilian Du
Abstract This project develops and trains a Recurrent Neural Network (RNN) that monitors sleeping infants from an auxiliary microphone for cases of Sudden Infant Death Syndrome (SIDS), manifested in sudden or gradual respiratory arrest. To minimize invasiveness and maximize economic viability, an electret microphone and parabolic concentrator, paired with a specially designed and tuned amplifier circuit, were used as a very sensitive audio monitoring device, which fed data to the RNN model. This RNN was trained and operated in the frequency domain, where respiratory activity is most distinct from noise. In both training and operation, a Fourier transform and an autoencoder compression were applied to the raw audio, and this transformed audio data was fed into the model in 1/8-second time steps. In operation, the model flagged each perceived breath, and the time between breaths was analyzed through a statistical t-test for slope, which detected dangerous trends. The entire model achieved 92.5% accuracy on continuous data and had an 11.25-second response time on data that emulated total respiratory arrest. Because the trained model is compatible with many off-the-shelf devices such as Android phones and Raspberry Pis, deployment on free-standing processing hardware is a very feasible future goal.
Tasks
Published 2019-04-28
URL https://arxiv.org/abs/1904.12386v2
PDF https://arxiv.org/pdf/1904.12386v2.pdf
PWC https://paperswithcode.com/paper/application-of-autoencoder-assisted-recurrent
Repo
Framework
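
The two quantitative pieces of the abstract can be sketched as follows: framing audio into 1/8-second windows and taking FFT magnitudes as frequency-domain features, and testing the slope of inter-breath intervals for a dangerous lengthening trend. The sample rate, breath times, and thresholds are assumptions; the microphone pipeline, autoencoder, and RNN are not shown.

```python
# Illustrative preprocessing and trend test (not the author's exact pipeline).
import numpy as np
from scipy import stats

sr = 8000                                        # assumed sample rate (Hz)
frame = sr // 8                                  # 1/8-second frames
audio = np.random.randn(sr * 10)                 # placeholder for microphone audio

frames = audio[: len(audio) // frame * frame].reshape(-1, frame)
spectra = np.abs(np.fft.rfft(frames, axis=1))    # frequency-domain features per frame
# ... `spectra` would be compressed by the autoencoder and fed to the RNN here ...

breath_times = np.array([0.0, 3.1, 6.4, 10.0, 14.2, 19.1])    # example breath timestamps (s)
intervals = np.diff(breath_times)                               # time between breaths
res = stats.linregress(np.arange(len(intervals)), intervals)   # slope of the interval trend
if res.slope > 0 and res.pvalue < 0.05:
    print(f"warning: breath intervals lengthening (slope={res.slope:.2f} s, p={res.pvalue:.3f})")
```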

AnonymousNet: Natural Face De-Identification with Measurable Privacy

Title AnonymousNet: Natural Face De-Identification with Measurable Privacy
Authors Tao Li, Lei Lin
Abstract With billions of personal images being generated from social media and cameras of all sorts on a daily basis, security and privacy are unprecedentedly challenged. Although extensive attempts have been made, existing face image de-identification techniques are either insufficient in photo-reality or incapable of balancing privacy and usability qualitatively and quantitatively, i.e., they fail to answer counterfactual questions such as “is it private now?”, “how private is it?”, and “can it be more private?” In this paper, we propose a novel framework called AnonymousNet, with an effort to address these issues systematically, balance usability, and enhance privacy in a natural and measurable manner. The framework encompasses four stages: facial attribute estimation, privacy-metric-oriented face obfuscation, directed natural image synthesis, and adversarial perturbation. Not only do we achieve the state of the art in terms of image quality and attribute prediction accuracy, but we are also the first to show that facial privacy is measurable, can be factorized, and can accordingly be manipulated in a photo-realistic fashion to fulfill different requirements and application scenarios. Experiments further demonstrate the effectiveness of the proposed framework.
Tasks Image Generation
Published 2019-04-19
URL http://arxiv.org/abs/1904.12620v1
PDF http://arxiv.org/pdf/1904.12620v1.pdf
PWC https://paperswithcode.com/paper/190412620
Repo
Framework

WIQA: A dataset for “What if…” reasoning over procedural text

Title WIQA: A dataset for “What if…” reasoning over procedural text
Authors Niket Tandon, Bhavana Dalvi Mishra, Keisuke Sakaguchi, Antoine Bosselut, Peter Clark
Abstract We introduce WIQA, the first large-scale dataset of “What if…” questions over procedural text. WIQA contains three parts: a collection of paragraphs each describing a process, e.g., beach erosion; a set of crowdsourced influence graphs for each paragraph, describing how one change affects another; and a large (40k) collection of “What if…?” multiple-choice questions derived from the graphs. For example, given a paragraph about beach erosion, would stormy weather result in more or less erosion (or have no effect)? The task is to answer the questions, given their associated paragraph. WIQA contains three kinds of questions: perturbations to steps mentioned in the paragraph; external (out-of-paragraph) perturbations requiring commonsense knowledge; and irrelevant (no effect) perturbations. We find that state-of-the-art models achieve 73.8% accuracy, well below the human performance of 96.3%. We analyze the challenges, in particular tracking chains of influences, and present the dataset as an open challenge to the community.
Tasks
Published 2019-09-10
URL https://arxiv.org/abs/1909.04739v1
PDF https://arxiv.org/pdf/1909.04739v1.pdf
PWC https://paperswithcode.com/paper/wiqa-a-dataset-for-what-if-reasoning-over
Repo
Framework

Knowledge Refinement via Rule Selection

Title Knowledge Refinement via Rule Selection
Authors Phokion G. Kolaitis, Lucian Popa, Kun Qian
Abstract In several different applications, including data transformation and entity resolution, rules are used to capture aspects of knowledge about the application at hand. Often, a large set of such rules is generated automatically or semi-automatically, and the challenge is to refine the encapsulated knowledge by selecting a subset of rules based on the expected operational behavior of the rules on available data. In this paper, we carry out a systematic complexity-theoretic investigation of the following rule selection problem: given a set of rules specified by Horn formulas, and a pair of an input database and an output database, find a subset of the rules that minimizes the total error, that is, the number of false positive and false negative errors arising from the selected rules. We first establish computational hardness results for the decision problems underlying this minimization problem, as well as upper and lower bounds for its approximability. We then investigate a bi-objective optimization version of the rule selection problem in which both the total error and the size of the selected rules are taken into account. We show that testing for membership in the Pareto front of this bi-objective optimization problem is DP-complete. Finally, we show that a similar DP-completeness result holds for a bi-level optimization version of the rule selection problem, where one minimizes first the total error and then the size.
Tasks Entity Resolution
Published 2019-01-29
URL http://arxiv.org/abs/1901.10051v1
PDF http://arxiv.org/pdf/1901.10051v1.pdf
PWC https://paperswithcode.com/paper/knowledge-refinement-via-rule-selection
Repo
Framework
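
Written out, the total-error objective described in the abstract is the following (notation is illustrative): given a set $R$ of Horn rules, an input database $I$, an output database $O$, and writing $r(I)$ for the facts derived by applying rule $r$ to $I$, choose $S \subseteq R$ minimizing

```latex
\operatorname{err}(S) \;=\;
\underbrace{\Big|\, \textstyle\bigcup_{r \in S} r(I) \;\setminus\; O \,\Big|}_{\text{false positives}}
\;+\;
\underbrace{\Big|\, O \;\setminus\; \textstyle\bigcup_{r \in S} r(I) \,\Big|}_{\text{false negatives}}
```

The bi-objective version additionally trades $\operatorname{err}(S)$ off against $|S|$, and the bi-level version minimizes $\operatorname{err}(S)$ first and $|S|$ second.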

Feature engineering workflow for activity recognition from synchronized inertial measurement units

Title Feature engineering workflow for activity recognition from synchronized inertial measurement units
Authors Andreas W. Kempa-Liehr, Jonty Oram, Andrew Wong, Mark Finch, Thor Besier
Abstract The ubiquitous availability of wearable sensors is responsible for driving the Internet-of-Things but is also making an impact on sport sciences and precision medicine. While human activity recognition from smartphone data or other types of inertial measurement units (IMU) has evolved into one of the most prominent daily-life examples of machine learning, the underlying process of time-series feature engineering still seems to be time-consuming. This lengthy process inhibits the development of IMU-based machine learning applications in sport science and precision medicine. This contribution discusses a feature engineering workflow, which automates the extraction of time-series features based on the FRESH algorithm (FeatuRe Extraction based on Scalable Hypothesis tests) to identify statistically significant features from synchronized IMU sensors (IMeasureU Ltd, NZ). The feature engineering workflow has five main steps: time-series engineering, automated time-series feature extraction, optimized feature extraction, fitting of a specialized classifier, and deployment of the optimized machine learning pipeline. The workflow is discussed for the case of a user-specific running-walking classification, and its generalization to a multi-user multi-activity classification is demonstrated.
Tasks Activity Recognition, Feature Engineering, Human Activity Recognition, Time Series
Published 2019-12-18
URL https://arxiv.org/abs/1912.08394v1
PDF https://arxiv.org/pdf/1912.08394v1.pdf
PWC https://paperswithcode.com/paper/feature-engineering-workflow-for-activity
Repo
Framework
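
The FRESH algorithm is implemented in the open-source tsfresh package; a minimal sketch of automated, hypothesis-test-based feature extraction on a long-format sensor table is shown below. The column names and the toy sine-wave data are illustrative, not the paper's IMU dataset or workflow.

```python
# Automated time-series feature extraction with the FRESH algorithm via tsfresh.
import numpy as np
import pandas as pd
from tsfresh import extract_relevant_features

rng = np.random.default_rng(0)
rows = []
for sample_id in range(20):
    label = sample_id % 2                      # 0 = walking, 1 = running (toy labels)
    t = np.arange(100)
    accel = np.sin(t * (0.1 + 0.2 * label)) + 0.1 * rng.standard_normal(100)
    rows.append(pd.DataFrame({"id": sample_id, "time": t, "accel": accel}))

timeseries = pd.concat(rows, ignore_index=True)           # long-format sensor table
y = pd.Series([i % 2 for i in range(20)], index=range(20))

# Extract hundreds of candidate features, then keep only those passing the
# scalable hypothesis tests (the FRESH selection step).
features = extract_relevant_features(timeseries, y, column_id="id", column_sort="time")
print(features.shape)
```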

Bi-Directional Domain Translation for Zero-Shot Sketch-Based Image Retrieval

Title Bi-Directional Domain Translation for Zero-Shot Sketch-Based Image Retrieval
Authors Jiangtong Li, Zhixin Ling, Li Niu, Liqing Zhang
Abstract The goal of Sketch-Based Image Retrieval (SBIR) is to use free-hand sketches to retrieve images of the same category from a natural image gallery. However, SBIR requires all categories to be seen during training, which cannot be guaranteed in real-world applications. We therefore investigate the more challenging Zero-Shot SBIR (ZS-SBIR) setting, in which test categories do not appear in the training stage. Traditional SBIR methods tend to perform category-based retrieval and cannot generalize well from seen categories to unseen ones. In contrast, we disentangle image features into structure features and appearance features to facilitate structure-based retrieval. To assist feature disentanglement and take full advantage of the disentangled information, we propose a Bi-directional Domain Translation (BDT) framework for ZS-SBIR, in which the image domain and sketch domain can be translated to each other through disentangled structure and appearance features. Finally, we perform retrieval in both the structure feature space and the image feature space. Extensive experiments demonstrate that our proposed approach remarkably outperforms state-of-the-art approaches by about 8% on the Sketchy dataset and over 5% on the TU-Berlin dataset.
Tasks Image Retrieval, Sketch-Based Image Retrieval
Published 2019-11-29
URL https://arxiv.org/abs/1911.13251v1
PDF https://arxiv.org/pdf/1911.13251v1.pdf
PWC https://paperswithcode.com/paper/bi-directional-domain-translation-for-zero
Repo
Framework
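
The retrieval step itself reduces to nearest-neighbor search in the learned feature space; a generic sketch is shown below, ranking gallery images by cosine similarity between a sketch's structure embedding and each image's structure embedding. The random embeddings stand in for the outputs of the BDT encoders, which are not shown.

```python
# Zero-shot retrieval by cosine similarity over "structure" embeddings (illustrative only).
import numpy as np

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

rng = np.random.default_rng(0)
gallery_structure = rng.standard_normal((1000, 128))   # stand-in image structure features
query_structure = rng.standard_normal(128)              # stand-in sketch structure feature

scores = np.array([cosine(query_structure, g) for g in gallery_structure])
top10 = np.argsort(-scores)[:10]                         # indices of the retrieved images
print("top-10 gallery indices:", top10)
```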