January 27, 2020

Paper Group ANR 1300


G-TAD: Sub-Graph Localization for Temporal Action Detection


Title	G-TAD: Sub-Graph Localization for Temporal Action Detection
Authors	Mengmeng Xu, Chen Zhao, David S. Rojas, Ali Thabet, Bernard Ghanem
Abstract	Temporal action detection is a fundamental yet challenging task in video understanding. Video context is a critical cue to effectively detect actions, but current works mainly focus on temporal context, while neglecting semantic context as well as other important context properties. In this work, we propose a graph convolutional network (GCN) model to adaptively incorporate multi-level semantic context into video features and cast temporal action detection as a sub-graph localization problem. Specifically, we formulate video snippets as graph nodes, snippet-snippet correlations as edges, and actions associated with context as target sub-graphs. With graph convolution as the basic operation, we design a GCN block called GCNeXt, which learns the features of each node by aggregating its context and dynamically updates the edges in the graph. To localize each sub-graph, we also design an SGAlign layer to embed each sub-graph into the Euclidean space. Extensive experiments show that G-TAD is capable of finding effective video context without extra supervision and achieves state-of-the-art performance on two detection benchmarks. On ActivityNet-1.3, we obtain an average mAP of 34.09%; on THUMOS14, we obtain 40.16% in mAP@0.5, beating all the other one-stage methods.
Tasks	Action Detection, Video Understanding
Published	2019-11-26
URL	https://arxiv.org/abs/1911.11462v1
PDF	https://arxiv.org/pdf/1911.11462v1.pdf
PWC	https://paperswithcode.com/paper/g-tad-sub-graph-localization-for-temporal
Repo
Framework
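The graph formulation in the abstract above (snippets as nodes, snippet-snippet correlations as edges, features updated by aggregating context) can be illustrated with a small NumPy sketch. This is not the authors' GCNeXt block: the cosine-similarity k-nearest-neighbour edges, the mean aggregation, the random `weight` matrix and the toy sizes are all assumptions chosen only to make the idea concrete.

```python
import numpy as np

def knn_edges(features, k):
    """For each node, return the indices of its k most correlated other nodes."""
    # Cosine similarity between every pair of snippet features.
    norms = np.linalg.norm(features, axis=1, keepdims=True) + 1e-12
    normed = features / norms
    sim = normed @ normed.T
    np.fill_diagonal(sim, -np.inf)  # exclude self-loops
    # Sort each row by decreasing similarity and keep the first k columns.
    return np.argsort(-sim, axis=1)[:, :k]

def aggregate(features, neighbors, weight):
    """One graph-convolution-style step: mix each node with the mean of its neighbours."""
    context = features[neighbors].mean(axis=1)            # (num_nodes, dim)
    mixed = np.concatenate([features, context], axis=1)   # (num_nodes, 2 * dim)
    return np.maximum(mixed @ weight, 0.0)                # linear map followed by ReLU

rng = np.random.default_rng(0)
snippets = rng.normal(size=(8, 4))   # 8 video snippets with 4-d features (toy values)
weight = rng.normal(size=(8, 4))     # stand-in for learned weights (random here)

neighbors = knn_edges(snippets, k=3)
updated = aggregate(snippets, neighbors, weight)
# Recomputing the edges from the updated features mimics a dynamic edge update.
new_neighbors = knn_edges(updated, k=3)
```

Rebuilding the neighbour lists after each aggregation step is one simple way to read "dynamically updates the edges"; the paper may do this differently.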

A Mobile Cloud Collaboration Fall Detection System Based on Ensemble Learning


Title	A Mobile Cloud Collaboration Fall Detection System Based on Ensemble Learning
Authors	Tong Wu, Yang Gu, Yiqiang Chen, Yunlong Xiao, Jiwei Wang
Abstract	Falls are one of the major causes of accidental or unintentional injury death worldwide. Therefore, this paper presents a reliable fall detection algorithm and a mobile cloud collaboration system for fall detection. The algorithm is an ensemble learning method based on decision trees, named Falldetection Ensemble Decision Tree (FEDT). The mobile cloud collaboration system can be divided into three stages: 1) mobile stage: use a lightweight threshold method to filter out activities of daily living (ADLs); 2) collaboration stage: transmit data to the cloud and extract features in the cloud; 3) cloud stage: deploy the model trained by FEDT to give the final detection result from the extracted features. Experiments show that the proposed FEDT outperforms the other methods by 1-3% in both sensitivity and specificity and, more importantly, that the system can provide reliable fall detection in practical scenarios.
Tasks
Published	2019-07-05
URL	https://arxiv.org/abs/1907.04788v1
PDF	https://arxiv.org/pdf/1907.04788v1.pdf
PWC	https://paperswithcode.com/paper/a-mobile-cloud-collaboration-fall-detection
Repo
Framework
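The three-stage flow described in the abstract above can be sketched in plain Python. Everything numeric here is hypothetical: the thresholds, the three summary features and the hand-written majority vote over decision stumps are placeholders standing in for the trained FEDT model, which the abstract does not specify.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2

def magnitude(sample):
    """Euclidean norm of one (x, y, z) accelerometer sample."""
    return math.sqrt(sum(axis * axis for axis in sample))

def mobile_stage(window, threshold=2.0 * G):
    """Stage 1 (on device): forward a window only if its peak magnitude exceeds a threshold."""
    return max(magnitude(s) for s in window) > threshold

def extract_features(window):
    """Stage 2 (collaboration): summary features computed from the transmitted window."""
    mags = [magnitude(s) for s in window]
    mean = sum(mags) / len(mags)
    variance = sum((m - mean) ** 2 for m in mags) / len(mags)
    return {"peak": max(mags), "min": min(mags), "variance": variance}

# Stage 3 (cloud): a hand-written majority vote over three decision stumps,
# used here only as a stand-in for a trained tree ensemble.
STUMPS = [
    lambda f: f["peak"] > 2.5 * G,
    lambda f: f["min"] < 0.5 * G,
    lambda f: f["variance"] > 20.0,
]

def cloud_stage(features):
    votes = sum(1 for stump in STUMPS if stump(features))
    return votes >= 2

def detect_fall(window):
    if not mobile_stage(window):
        return False  # filtered out on the device as an activity of daily living
    return cloud_stage(extract_features(window))

walking = [(0.0, 0.0, 9.8), (0.5, 0.2, 10.5), (0.3, 0.1, 9.2)]
fall = [(0.0, 0.0, 9.8), (0.1, 0.0, 2.0), (12.0, 8.0, 25.0), (0.0, 0.0, 9.8)]
print(detect_fall(walking), detect_fall(fall))
```

The cheap threshold runs first so that most everyday windows never leave the device, which is the point of splitting the work between mobile and cloud.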

A Co-analysis Framework for Exploring Multivariate Scientific Data


Title	A Co-analysis Framework for Exploring Multivariate Scientific Data
Authors	Xiangyang He, Yubo Tao, Qirui Wang, Hai Lin
Abstract	In complex multivariate data sets, different features usually include diverse associations with different variables, and different variables are associated within different regions. Therefore, exploring the associations between variables and voxels locally becomes necessary to better understand the underlying phenomena. In this paper, we propose a co-analysis framework based on biclusters, which are two subsets of variables and voxels with close scalar-value relationships, to guide the process of visually exploring multivariate data. We first automatically extract all meaningful biclusters, each of which only contains voxels with a similar scalar-value pattern over a subset of variables. These biclusters are organized according to their variable sets, and biclusters in each variable set are further grouped by a similarity metric to reduce redundancy and support diversity during visual exploration. Biclusters are visually represented in coordinated views to facilitate interactive exploration of multivariate data based on the similarity between biclusters and the correlation of scalar values with different variables. Experiments on several representative multivariate scientific data sets demonstrate the effectiveness of our framework in exploring local relationships among variables, biclusters and scalar values in the data.
Tasks
Published	2019-08-19
URL	https://arxiv.org/abs/1908.06576v1
PDF	https://arxiv.org/pdf/1908.06576v1.pdf
PWC	https://paperswithcode.com/paper/a-co-analysis-framework-for-exploring
Repo
Framework

Topological Navigation Graph


Title	Topological Navigation Graph
Authors	Povilas Daniusis, Shubham Juneja, Lukas Valatka, Linas Petkevicius
Abstract	In this article, we focus on the utilisation of reactive trajectory imitation controllers for goal-directed mobile robot navigation. We propose a topological navigation graph (TNG) - an imitation-learning-based framework for navigating through environments with intersecting trajectories. The TNG framework represents the environment as a directed graph composed of deep neural networks. Each vertex of the graph corresponds to a trajectory and is represented by a trajectory identification classifier and a trajectory imitation controller. For trajectory following, we propose the novel use of neural object detection architectures. The edges of TNG correspond to intersections between trajectories and are all represented by a classifier. We provide empirical evaluation of the proposed navigation framework and its components in simulated and real-world environments, demonstrating that TNG allows us to utilise non-goal-directed, imitation-learning methods for goal-directed autonomous navigation.
Tasks	Autonomous Navigation, Imitation Learning, Object Detection, Robot Navigation
Published	2019-10-15
URL	https://arxiv.org/abs/1910.06658v1
PDF	https://arxiv.org/pdf/1910.06658v1.pdf
PWC	https://paperswithcode.com/paper/topological-navigation-graph
Repo
Framework

Stabilizing DARTS with Amended Gradient Estimation on Architectural Parameters


Title	Stabilizing DARTS with Amended Gradient Estimation on Architectural Parameters
Authors	Kaifeng Bi, Changping Hu, Lingxi Xie, Xin Chen, Longhui Wei, Qi Tian
Abstract	DARTS is a popular algorithm for neural architecture search (NAS). Despite its great advantage in search efficiency, DARTS often suffers from weak stability, which is reflected in the large variation among individual trials as well as the sensitivity to the hyper-parameters of the search process. This paper attributes such instability to an optimization gap between the super-network and its sub-networks; namely, improving the validation accuracy of the super-network does not necessarily lead to a higher expected performance of the sampled sub-networks. We then point out that the gap is due to the inaccurate estimation of the architectural gradients, based on which we propose an amended estimation method. Mathematically, our method guarantees a bounded error from the true gradients while the original estimation does not. Our approach bridges the gap from two aspects, namely, amending the estimation of the architectural gradients, and unifying the hyper-parameter settings in the search and re-training stages. Experiments on CIFAR10 and ImageNet demonstrate that our approach largely improves search stability and, more importantly, enables DARTS-based approaches to explore much larger search spaces that have not been investigated before.
Tasks	Neural Architecture Search
Published	2019-10-25
URL	https://arxiv.org/abs/1910.11831v4
PDF	https://arxiv.org/pdf/1910.11831v4.pdf
PWC	https://paperswithcode.com/paper/stabilizing-darts-with-amended-gradient
Repo
Framework

An End-to-End Solution for Effectively Demoting Watermarked Images in Image Search


Title	An End-to-End Solution for Effectively Demoting Watermarked Images in Image Search
Authors	Ning Ma, Xin Zhao, Mark Bolin
Abstract	We propose an end-to-end solution, from watermark feature generation to metric design, for effectively demoting watermarked images surfaced by a real-world image search engine. We use a few fundamental techniques to obtain effective watermark features of images in the image search index, and utilize the signals in a commercial search engine to improve the image search quality. We collect a large and diverse set (about 1M) of images with human labels indicating whether the image contains a visible watermark. We train a few deep convolutional neural networks to extract watermark information from the raw images. The deep CNN classifiers we trained can achieve high accuracy on the watermark test data set. We also analyze the images based on their domains to get watermark information from a domain-based watermark classifier. We design a novel hybrid metric which combines relevance, image attractiveness and watermark information. We demonstrate that using these watermark signals together with the new metric in the image search ranker can significantly demote watermarked images during online image ranking.
Tasks	Image Retrieval
Published	2019-01-28
URL	https://arxiv.org/abs/1901.09473v2
PDF	https://arxiv.org/pdf/1901.09473v2.pdf
PWC	https://paperswithcode.com/paper/watermark-signal-detection-and-its
Repo
Framework

Reversible Adversarial Example based on Reversible Image Transformation


Title	Reversible Adversarial Example based on Reversible Image Transformation
Authors	Zhaoxia Yin, Hua Wang, Weiming Zhang
Abstract	At present, many companies use the most advanced Deep Neural Networks (DNNs) to classify and analyze the photos we upload to social networks or the cloud. To prevent users' privacy from leaking, the attack characteristics of adversarial examples can be exploited to make these models misjudge. In this paper, we take advantage of reversible image transformation to construct a reversible adversarial example, which is still an adversarial example to DNNs. It not only causes DNNs to extract the wrong information, but can also be recovered to its original image without any distortion. Experimental results show that reversible adversarial examples obtained by our method have higher attack success rates while ensuring that the reversible image quality is still high. Moreover, the proposed method is easy to operate and suitable for practical applications.
Tasks
Published	2019-11-06
URL	https://arxiv.org/abs/1911.02360v3
PDF	https://arxiv.org/pdf/1911.02360v3.pdf
PWC	https://paperswithcode.com/paper/reversible-adversarial-examples-based-on
Repo
Framework

Multi-Channel Volumetric Neural Network for Knee Cartilage Segmentation in Cone-beam CT


Title	Multi-Channel Volumetric Neural Network for Knee Cartilage Segmentation in Cone-beam CT
Authors	Jennifer Maier, Luis Carlos Rivera Monroy, Christopher Syben, Yejin Jeon, Jang-Hwan Choi, Mary Elizabeth Hall, Marc Levenston, Garry Gold, Rebecca Fahrig, Andreas Maier
Abstract	Analyzing knee cartilage thickness and strain under load can help to further the understanding of the effects of diseases like osteoarthritis. A precise segmentation of the cartilage is a necessary prerequisite for this analysis. This segmentation task has mainly been addressed in Magnetic Resonance Imaging, and has rarely been investigated on contrast-enhanced Computed Tomography, where a contrast agent visualizes the border between femoral and tibial cartilage. To overcome the main drawback of manual segmentation, namely its high time investment, we propose to use a 3D Convolutional Neural Network for this task. The presented architecture consists of a V-Net with SELU activation and a Tversky loss function. Due to the high imbalance between very few cartilage pixels and many background pixels, a high false positive rate is to be expected. To reduce this rate, the two largest segmented point clouds are extracted using a connected component analysis, since they most likely represent the medial and lateral tibial cartilage surfaces. The resulting segmentations are compared to manual segmentations and achieve on average a recall of 0.69, which confirms the feasibility of this approach.
Tasks
Published	2019-12-03
URL	https://arxiv.org/abs/1912.01362v1
PDF	https://arxiv.org/pdf/1912.01362v1.pdf
PWC	https://paperswithcode.com/paper/multi-channel-volumetric-neural-network-for
Repo
Framework
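The Tversky loss named in the abstract above is commonly written as one minus the Tversky index, TP / (TP + alpha * FP + beta * FN). The NumPy sketch below follows that standard definition. The `alpha` and `beta` values and the smoothing term `eps` are assumptions, since the abstract does not state the settings used in the paper.

```python
import numpy as np

def tversky_loss(pred, target, alpha=0.5, beta=0.5, eps=1e-7):
    """1 - Tversky index for soft or binary masks with values in [0, 1].

    alpha weights false positives and beta weights false negatives;
    alpha = beta = 0.5 reduces the index to the Dice coefficient.
    """
    pred = np.asarray(pred, dtype=float).ravel()
    target = np.asarray(target, dtype=float).ravel()
    tp = np.sum(pred * target)
    fp = np.sum(pred * (1.0 - target))
    fn = np.sum((1.0 - pred) * target)
    index = (tp + eps) / (tp + alpha * fp + beta * fn + eps)
    return 1.0 - index

# Toy masks: the prediction misses two of the three foreground voxels.
target = [1, 1, 1, 0]
pred = [1, 0, 0, 0]
dice_like = tversky_loss(pred, target, alpha=0.5, beta=0.5)
recall_weighted = tversky_loss(pred, target, alpha=0.3, beta=0.7)
print(dice_like, recall_weighted)
```

Raising `beta` above `alpha` penalizes missed foreground more than spurious foreground, which is one reason this loss is often chosen for heavily imbalanced segmentation problems like the one described.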

One-Stage Inpainting with Bilateral Attention and Pyramid Filling Block


Title	One-Stage Inpainting with Bilateral Attention and Pyramid Filling Block
Authors	Hongyu Liu, Bin Jiang, Wei Huang, Chao Yang
Abstract	Recent deep-learning-based image inpainting methods that utilize contextual information and a two-stage architecture have exhibited remarkable performance. However, the two-stage architecture is time-consuming, and the contextual information lacks high-level semantics and ignores both the semantic relevance and the distance information of the hole’s feature patches. These limitations result in blurry textures and distorted structures in the final result. Motivated by these observations, we propose a new deep generative model-based approach, which trains a shared network twice with different targets and utilizes a single network during the testing phase, so that we can effectively save inference time. Specifically, the targets of the two training steps are structure reconstruction and texture generation, respectively. During the second training step, we first propose a Pyramid Filling Block (PF-block) that uses the high-level features in which the hole regions have already been filled to progressively guide the filling of low-level features, so that the missing content is filled from deep to shallow in a pyramid fashion. Then, inspired by the classical bilateral filter [30], we propose the Bilateral Attention layer (BA-layer) to optimize the filled feature map. It synthesizes feature patches at each position by computing weighted sums of the surrounding feature patches, with weights derived from both the distance and the value relationships between feature patches, thus producing visually plausible inpainting results. Finally, experiments on multiple publicly available datasets show the superior performance of our approach.
Tasks	Image Inpainting, Texture Synthesis
Published	2019-12-18
URL	https://arxiv.org/abs/1912.08642v1
PDF	https://arxiv.org/pdf/1912.08642v1.pdf
PWC	https://paperswithcode.com/paper/one-stage-inpainting-with-bilateral-attention
Repo
Framework
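The classical bilateral filter that the abstract above cites as inspiration weights each neighbour by both its spatial distance and its value difference. The sketch below shows that weighting on a 1-D signal with NumPy. It illustrates the classical filter only, not the paper's BA-layer, and the `radius`, `sigma_space` and `sigma_value` settings are arbitrary assumptions.

```python
import numpy as np

def bilateral_filter_1d(signal, radius=2, sigma_space=1.0, sigma_value=0.5):
    """Smooth a 1-D signal with weights that depend on distance and on value similarity."""
    signal = np.asarray(signal, dtype=float)
    out = np.empty_like(signal)
    for i in range(len(signal)):
        lo, hi = max(0, i - radius), min(len(signal), i + radius + 1)
        idx = np.arange(lo, hi)
        window = signal[lo:hi]
        # Distance term: nearby positions count more.
        w_space = np.exp(-((idx - i) ** 2) / (2 * sigma_space ** 2))
        # Value term: samples with similar values count more.
        w_value = np.exp(-((window - signal[i]) ** 2) / (2 * sigma_value ** 2))
        weights = w_space * w_value
        out[i] = np.sum(weights * window) / np.sum(weights)
    return out

step = np.array([0.0, 0.0, 0.0, 10.0, 10.0, 10.0])
smoothed = bilateral_filter_1d(step)
print(smoothed)
```

Because samples across the jump receive a near-zero value weight, the step edge survives the smoothing, which is the property that makes this weighting attractive for filling in structured content.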

Multi-view Characterization of Stories from Narratives and Reviews using Multi-label Ranking


Title	Multi-view Characterization of Stories from Narratives and Reviews using Multi-label Ranking
Authors	Sudipta Kar, Gustavo Aguilar, Thamar Solorio
Abstract	This paper considers the problem of characterizing stories by inferring attributes like theme and genre using the written narrative and user reviews. We experiment with a multi-label dataset of narratives representing the story of movies and a tagset representing various attributes of stories. To identify the story attributes, we propose a hierarchical representation of narratives that improves over the traditional feature-based machine learning methods as well as sequential representation approaches. Finally, we demonstrate a multi-view method for discovering story attributes from user opinions in reviews that are complementary to the gold standard data set.
Tasks
Published	2019-08-24
URL	https://arxiv.org/abs/1908.09083v1
PDF	https://arxiv.org/pdf/1908.09083v1.pdf
PWC	https://paperswithcode.com/paper/multi-view-characterization-of-stories-from
Repo
Framework

Double descent in the condition number


Title	Double descent in the condition number
Authors	Tomaso Poggio, Gil Kur, Andrzej Banburski
Abstract	In solving a system of $n$ linear equations in $d$ variables $Ax=b$, the condition number of the $n \times d$ matrix $A$ measures how much errors in the data $b$ affect the solution $x$. Bounds of this type are important in many inverse problems. An example is machine learning, where the key task is to estimate an underlying function from a set of measurements at random points in a high-dimensional space and where low sensitivity to error in the data is a requirement for good predictive performance. Here we discuss the simple observation, which is well known but surprisingly little quoted, that when the columns of $A$ are random vectors, the condition number of $A$ is highest if $d=n$, that is, when the inverse of $A$ exists. An overdetermined system ($n>d$) as well as an underdetermined system ($n<d$), for which the pseudoinverse must be used instead of the inverse, typically have significantly better, that is lower, condition numbers. Thus the condition number of $A$ plotted as a function of $d$ shows a double descent behavior with a peak at $d=n$.
Tasks
Published	2019-12-12
URL	https://arxiv.org/abs/1912.06190v2
PDF	https://arxiv.org/pdf/1912.06190v2.pdf
PWC	https://paperswithcode.com/paper/double-descent-in-the-condition-number
Repo
Framework
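The observation in the abstract above is easy to probe numerically. The sketch below draws random matrices with i.i.d. standard normal entries (an assumption; the abstract only says the columns are random vectors) and reports the median 2-norm condition number, the ratio of the largest to the smallest singular value, for several widths `d` at fixed `n`. The sizes, trial count and seed are arbitrary choices.

```python
import numpy as np

def median_condition_number(n, d, trials=50, seed=0):
    """Median 2-norm condition number of random n-by-d Gaussian matrices."""
    rng = np.random.default_rng(seed)
    # np.linalg.cond with its default norm uses the singular values,
    # so it also applies to rectangular matrices.
    values = [np.linalg.cond(rng.standard_normal((n, d))) for _ in range(trials)]
    return float(np.median(values))

n = 40
for d in (10, 20, 40, 80, 160):
    print(d, round(median_condition_number(n, d), 1))
```

Under these assumptions the printed values should peak around `d = n` and fall off on both sides, matching the overdetermined and underdetermined regimes described in the abstract.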

A cryptographic approach to black box adversarial machine learning


Title	A cryptographic approach to black box adversarial machine learning
Authors	Kevin Shi, Daniel Hsu, Allison Bishop
Abstract	We propose a new randomized ensemble technique with a provable security guarantee against black-box transfer attacks. Our proof constructs a new security problem for random binary classifiers, which is easier to verify empirically, together with a reduction from the security of this new model to the security of the ensemble classifier. We provide experimental evidence of the security of our random binary classifiers, as well as empirical results on the adversarial accuracy of the overall ensemble against black-box attacks. Our construction crucially leverages hidden randomness in the multiclass-to-binary reduction.
Tasks
Published	2019-06-07
URL	https://arxiv.org/abs/1906.03231v2
PDF	https://arxiv.org/pdf/1906.03231v2.pdf
PWC	https://paperswithcode.com/paper/a-cryptographic-approach-to-black-box
Repo
Framework

Can Unconditional Language Models Recover Arbitrary Sentences?


Title	Can Unconditional Language Models Recover Arbitrary Sentences?
Authors	Nishant Subramani, Samuel R. Bowman, Kyunghyun Cho
Abstract	Neural network-based generative language models like ELMo and BERT can work effectively as general purpose sentence encoders in text classification without further fine-tuning. Is it possible to adapt them in a similar way for use as general-purpose decoders? For this to be possible, it would need to be the case that for any target sentence of interest, there is some continuous representation that can be passed to the language model to cause it to reproduce that sentence. We set aside the difficult problem of designing an encoder that can produce such representations and, instead, ask directly whether such representations exist at all. To do this, we introduce a pair of effective, complementary methods for feeding representations into pretrained unconditional language models and a corresponding set of methods to map sentences into and out of this representation space, the reparametrized sentence space. We then investigate the conditions under which a language model can be made to generate a sentence through the identification of a point in such a space and find that it is possible to recover arbitrary sentences nearly perfectly with language models and representations of moderate size without modifying any model parameters.
Tasks	Language Modelling, Text Classification
Published	2019-07-10
URL	https://arxiv.org/abs/1907.04944v2
PDF	https://arxiv.org/pdf/1907.04944v2.pdf
PWC	https://paperswithcode.com/paper/can-unconditional-language-models-recover
Repo
Framework

Platoon trajectories generation: A unidirectional interconnected LSTM-based car following model


Title	Platoon trajectories generation: A unidirectional interconnected LSTM-based car following model
Authors	Yangxin Lin, Ping Wang, Yang Zhou, Fan Ding, Chen Wang, Huachun Tan
Abstract	Car-following models have been widely applied and have achieved remarkable results in traffic engineering. However, the traffic micro-simulation accuracy of car-following models at the platoon level, especially during traffic oscillations, still needs to be enhanced. Rather than using traditional individual car-following models, we propose a new trajectory generation approach that generates platoon-level trajectories given the first leading vehicle’s trajectory. In this paper, we discuss the temporal and spatial error propagation issue of the traditional approach using a car-following block diagram representation. Based on this analysis, we point out that the error comes from the training method and the model structure. To fix this, we make two improvements to the traditional LSTM-based car-following model. We use a scheduled sampling technique during training to address error propagation in the temporal dimension. Furthermore, we develop a unidirectional interconnected LSTM model structure to extract trajectory features from the perspective of the platoon. As indicated by systematic empirical experiments, the proposed structure efficiently reduces temporal and spatial error propagation. Compared with the traditional LSTM-based car-following model, the proposed model has almost 40% less error. The findings will benefit the design and analysis of micro-simulation for platoon-level car-following models.
Tasks
Published	2019-10-25
URL	https://arxiv.org/abs/1910.11843v1
PDF	https://arxiv.org/pdf/1910.11843v1.pdf
PWC	https://paperswithcode.com/paper/platoon-trajectories-generation-a
Repo
Framework
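Scheduled sampling, which the abstract above uses against temporal error propagation, replaces some ground-truth inputs with the model's own predictions during training so that the model learns to recover from its own errors. The sketch below shows only the input-selection logic in plain Python. The linear decay schedule and the `toy_model` one-step predictor are hypothetical stand-ins, not the paper's LSTM or its schedule.

```python
import random

def teacher_forcing_probability(epoch, total_epochs):
    """Linearly decay the chance of feeding ground truth from 1.0 down to 0.0."""
    return max(0.0, 1.0 - epoch / total_epochs)

def rollout(model, truth, p_truth, rng):
    """Predict truth[1:] step by step, choosing each next input by a coin flip."""
    predictions = []
    previous = truth[0]
    for t in range(1, len(truth)):
        prediction = model(previous)
        predictions.append(prediction)
        # Scheduled sampling: the next input is the ground truth with
        # probability p_truth, otherwise the model's own output.
        previous = truth[t] if rng.random() < p_truth else prediction
    return predictions

def toy_model(x):
    """Hypothetical one-step predictor that undershoots the true step of 1.0 by 0.1."""
    return x + 0.9

truth = [float(t) for t in range(6)]
rng = random.Random(0)
with_truth = rollout(toy_model, truth, p_truth=1.0, rng=rng)    # always ground truth
free_running = rollout(toy_model, truth, p_truth=0.0, rng=rng)  # always own output
print(with_truth[-1], free_running[-1])
```

With ground-truth inputs the per-step error stays constant, while in the free-running rollout it accumulates over time. Training on a mix of the two, with `p_truth` lowered across epochs, exposes the model to the accumulated-error regime it will face at simulation time.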

A Comparative Evaluation of SGM Variants (including a New Variant, tMGM) for Dense Stereo Matching


Title	A Comparative Evaluation of SGM Variants (including a New Variant, tMGM) for Dense Stereo Matching
Authors	Sonali Patil, Tanmay Prakash, Bharath Comandur, Avinash Kak
Abstract	Our goal here is threefold: [1] to present a new dense-stereo matching algorithm, tMGM, that by combining the hierarchical logic of tSGM with the support structure of MGM achieves 6-8% performance improvement over the baseline SGM (these performance numbers are posted under tMGM-16 in the Middlebury Benchmark V3); [2] through an exhaustive quantitative and qualitative comparative study, to compare how the major variants of the SGM approach to dense stereo matching, including the new tMGM, perform in the presence of: (a) illumination variations and shadows, (b) untextured or weakly textured regions, (c) repetitive patterns in the scene in the presence of large stereo rectification errors; and [3] to present a novel DEM-Sculpting approach for estimating initial disparity search bounds for multi-date satellite stereo pairs. Based on our study, we have found that tMGM generally performs best with respect to all these data conditions. Both tSGM and MGM improve the density of stereo disparity maps, and combining the two in tMGM makes it possible to accurately estimate the disparities at a significant number of pixels that would otherwise be declared invalid by SGM. The datasets we have used in our comparative evaluation include the Middlebury2014, KITTI2015, and ETH3D datasets and the satellite images over the San Fernando area from the MVS Challenge dataset.
Tasks	Stereo Matching
Published	2019-11-22
URL	https://arxiv.org/abs/1911.09800v1
PDF	https://arxiv.org/pdf/1911.09800v1.pdf
PWC	https://paperswithcode.com/paper/a-comparative-evaluation-of-sgm-variants
Repo
Framework