Paper Group ANR 127
Self-Supervised Video Representation Learning with Space-Time Cubic Puzzles
Title | Self-Supervised Video Representation Learning with Space-Time Cubic Puzzles |
Authors | Dahun Kim, Donghyeon Cho, In So Kweon |
Abstract | Self-supervised tasks such as colorization, inpainting and jigsaw puzzles have been used for visual representation learning on still images when labeled images are scarce or entirely absent. Recently, this line of work has extended to the video domain, where human labeling is even more expensive. However, most existing methods are still based on 2D CNN architectures that cannot directly capture spatio-temporal information for video applications. In this paper, we introduce a new self-supervised task called Space-Time Cubic Puzzles to train 3D CNNs on a large-scale video dataset. This task requires a network to arrange permuted 3D spatio-temporal crops. By completing Space-Time Cubic Puzzles, the network learns both the spatial appearance and the temporal relations of video frames, which is our final goal. In experiments, we demonstrate that our learned 3D representation transfers well to action recognition tasks and outperforms state-of-the-art 2D CNN-based competitors on the UCF101 and HMDB51 datasets. |
Tasks | Colorization, Representation Learning, Temporal Action Localization |
Published | 2018-11-24 |
URL | http://arxiv.org/abs/1811.09795v1 |
PDF | http://arxiv.org/pdf/1811.09795v1.pdf |
PWC | https://paperswithcode.com/paper/self-supervised-video-representation-learning |
Repo | |
Framework | |
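
The puzzle construction described in the abstract above (cut a video into 3D crops, permute them, ask the network for the arrangement) can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the grid size, the clip shape, and the names `cuboid_bounds` and `make_puzzle` are assumptions made for this example, and the crops are represented only by their index ranges.

```python
import random
from itertools import product


def cuboid_bounds(shape, grid):
    """Split a (T, H, W) volume into a grid of equal cuboids and return,
    for each cuboid, its (start, stop) index range along every axis."""
    steps = [dim // n for dim, n in zip(shape, grid)]
    bounds = []
    for idx in product(*(range(n) for n in grid)):
        bounds.append(tuple((i * s, (i + 1) * s) for i, s in zip(idx, steps)))
    return bounds


def make_puzzle(shape, grid, rng):
    """Shuffle the cuboids; the permutation serves as the self-supervised label."""
    bounds = cuboid_bounds(shape, grid)
    order = list(range(len(bounds)))
    rng.shuffle(order)
    shuffled = [bounds[i] for i in order]
    return shuffled, order


# Hypothetical clip of 16 frames at 112x112, cut into 2 (time) x 2 (height) x 1 (width) crops.
rng = random.Random(0)
shuffled, order = make_puzzle(shape=(16, 112, 112), grid=(2, 2, 1), rng=rng)
print(order)     # the permutation a network would be trained to predict
print(shuffled)  # index ranges of the crops in shuffled order
```

A network trained on such pairs would receive the shuffled crops as input and the permutation (or its index among a fixed set of permutations) as the classification target.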
Learning Sums of Independent Random Variables with Sparse Collective Support
Title | Learning Sums of Independent Random Variables with Sparse Collective Support |
Authors | Anindya De, Philip M. Long, Rocco Servedio |
Abstract | We study the learnability of sums of independent integer random variables given a bound on the size of the union of their supports. For $\mathcal{A} \subset \mathbf{Z}_{+}$, a sum of independent random variables with collective support $\mathcal{A}$ (called an $\mathcal{A}$-sum in this paper) is a distribution $\mathbf{S} = \mathbf{X}_1 + \cdots + \mathbf{X}_N$ where the $\mathbf{X}_i$'s are mutually independent (but not necessarily identically distributed) integer random variables with $\cup_i \mathsf{supp}(\mathbf{X}_i) \subseteq \mathcal{A}.$ We give two main algorithmic results for learning such distributions: 1. For the case $\lvert\mathcal{A}\rvert = 3$, we give an algorithm for learning $\mathcal{A}$-sums to accuracy $\epsilon$ that uses $\mathsf{poly}(1/\epsilon)$ samples and runs in time $\mathsf{poly}(1/\epsilon)$, independent of $N$ and of the elements of $\mathcal{A}$. 2. For an arbitrary constant $k \geq 4$, if $\mathcal{A} = \{a_1, \ldots, a_k\}$ with $0 \leq a_1 < \cdots < a_k$, we give an algorithm that uses $\mathsf{poly}(1/\epsilon) \cdot \log \log a_k$ samples (independent of $N$) and runs in time $\mathsf{poly}(1/\epsilon, \log a_k).$ We prove an essentially matching lower bound: if $\lvert\mathcal{A}\rvert = 4$, then any algorithm must use $\Omega(\log \log a_4)$ samples even for learning to constant accuracy. We also give similar-in-spirit (but quantitatively very different) algorithmic results, and essentially matching lower bounds, for the case in which $\mathcal{A}$ is not known to the learner. |
Tasks | |
Published | 2018-07-18 |
URL | http://arxiv.org/abs/1807.07013v1 |
PDF | http://arxiv.org/pdf/1807.07013v1.pdf |
PWC | https://paperswithcode.com/paper/learning-sums-of-independent-random-variables |
Repo | |
Framework | |
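
To make the definition of an $\mathcal{A}$-sum in the abstract above concrete, the sketch below draws samples from one and tabulates their empirical distribution. It only illustrates the object being learned and is not the paper's learning algorithm; the support `[0, 3, 10]`, the number of variables, and the name `sample_a_sum` are assumptions chosen for the example.

```python
import random
from collections import Counter


def sample_a_sum(support, weights_per_variable, rng):
    """Draw one sample of S = X_1 + ... + X_N, where every X_i takes values
    in `support` with its own (unnormalized) weights."""
    return sum(rng.choices(support, weights=w, k=1)[0] for w in weights_per_variable)


rng = random.Random(0)
support = [0, 3, 10]   # hypothetical collective support A with |A| = 3
n_variables = 50       # N

# Independent but not identically distributed: each X_i gets its own weights over A.
weights = [[rng.random() + 1e-9 for _ in support] for _ in range(n_variables)]

samples = [sample_a_sum(support, weights, rng) for _ in range(2000)]
counts = Counter(samples)
empirical_pmf = {value: count / len(samples) for value, count in sorted(counts.items())}
print(len(empirical_pmf), "distinct values observed")
```

The naive histogram above needs more samples as $N$ and the elements of $\mathcal{A}$ grow, which is the dependence the abstract's sample bounds are stated against.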
Stability of the Stochastic Gradient Method for an Approximated Large Scale Kernel Machine
Title | Stability of the Stochastic Gradient Method for an Approximated Large Scale Kernel Machine |
Authors | Aven Samareh, Mahshid Salemi Parizi |
Abstract | In this paper we measured the stability of the stochastic gradient method (SGM) for learning an approximated Fourier primal support vector machine. The stability of an algorithm is assessed by measuring the generalization error, defined as the absolute difference between the test and the training error. Our problem is to learn an approximated kernel function using random Fourier features for a binary classification problem in an online convex optimization setting. For a convex, Lipschitz continuous and smooth loss function, the stochastic gradient method is stable given a reasonable number of iterations. We showed that, with high probability, SGM generalizes well for an approximated kernel under the given assumptions. We empirically verified the theoretical findings for different parameters using several data sets. |
Tasks | |
Published | 2018-04-21 |
URL | http://arxiv.org/abs/1804.08003v1 |
PDF | http://arxiv.org/pdf/1804.08003v1.pdf |
PWC | https://paperswithcode.com/paper/stability-of-the-stochastic-gradient-method |
Repo | |
Framework | |
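
The quantity studied in the abstract above, the absolute gap between test and training error of a model trained by SGM on random Fourier features, can be measured on toy data as sketched below. This is an illustration under stated assumptions and not the authors' experiment: the logistic loss stands in for a convex, smooth loss, the RBF kernel width, step size, iteration count, and synthetic data are arbitrary choices, and all function names are hypothetical.

```python
import math
import random


def make_rff(dim, n_features, gamma, rng):
    """Random Fourier feature map approximating the RBF kernel exp(-gamma * ||x - y||^2)."""
    sigma = math.sqrt(2.0 * gamma)
    ws = [[rng.gauss(0.0, sigma) for _ in range(dim)] for _ in range(n_features)]
    bs = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    scale = math.sqrt(2.0 / n_features)

    def phi(x):
        return [scale * math.cos(sum(wi * xi for wi, xi in zip(w, x)) + b)
                for w, b in zip(ws, bs)]
    return phi


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def sgd_logistic(data, n_features, steps, lr, rng):
    """Plain SGD on the logistic loss, one uniformly sampled example per step."""
    theta = [0.0] * n_features
    for _ in range(steps):
        z, y = rng.choice(data)
        margin = y * sum(t * zi for t, zi in zip(theta, z))
        coeff = -y * sigmoid(-margin)  # gradient of log(1 + exp(-margin)) is coeff * z
        theta = [t - lr * coeff * zi for t, zi in zip(theta, z)]
    return theta


def error_rate(theta, data):
    wrong = sum(1 for z, y in data if y * sum(t * zi for t, zi in zip(theta, z)) <= 0)
    return wrong / len(data)


def sample_point(rng):
    """Synthetic binary task: label +1 inside a disc, -1 outside."""
    x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
    return x, (1 if x[0] ** 2 + x[1] ** 2 < 0.5 else -1)


rng = random.Random(0)
phi = make_rff(dim=2, n_features=50, gamma=1.0, rng=rng)
train = [(phi(x), y) for x, y in (sample_point(rng) for _ in range(200))]
test = [(phi(x), y) for x, y in (sample_point(rng) for _ in range(200))]
theta = sgd_logistic(train, n_features=50, steps=2000, lr=0.5, rng=rng)
gap = abs(error_rate(theta, test) - error_rate(theta, train))
print("generalization gap:", gap)
```

Rerunning with different values of `steps` or `lr` gives a rough feel for how the gap depends on the number of iterations, which is the dependence the stability analysis is concerned with.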
Point Cloud Colorization Based on Densely Annotated 3D Shape Dataset
Title | Point Cloud Colorization Based on Densely Annotated 3D Shape Dataset |
Authors | Xu Cao, Katashi Nagao |
Abstract | This paper introduces DensePoint, a densely sampled and annotated point cloud dataset containing over 10,000 single objects across 16 categories, built by merging different kinds of information from two existing datasets. Each point cloud in DensePoint contains 40,000 points, and each point is associated with two sorts of information: RGB value and part annotation. In addition, we propose a method for point cloud colorization using Generative Adversarial Networks (GANs). The network makes it possible to generate colours for point clouds of single objects given only the point cloud itself. Experiments on DensePoint show that there exist clear boundaries in point clouds between different parts of an object, suggesting that the proposed network is able to generate reasonably good colours. Our dataset is publicly available on the project page. |
Tasks | Colorization |
Published | 2018-10-12 |
URL | http://arxiv.org/abs/1810.05396v1 |
PDF | http://arxiv.org/pdf/1810.05396v1.pdf |
PWC | https://paperswithcode.com/paper/point-cloud-colorization-based-on-densely |
Repo | |
Framework | |
NICT’s Neural and Statistical Machine Translation Systems for the WMT18 News Translation Task
Title | NICT’s Neural and Statistical Machine Translation Systems for the WMT18 News Translation Task |
Authors | Benjamin Marie, Rui Wang, Atsushi Fujita, Masao Utiyama, Eiichiro Sumita |
Abstract | This paper presents NICT’s participation in the WMT18 shared news translation task. We participated in the eight translation directions of four language pairs: Estonian-English, Finnish-English, Turkish-English and Chinese-English. For each translation direction, we prepared state-of-the-art statistical (SMT) and neural (NMT) machine translation systems. Our NMT systems were trained with the transformer architecture using the provided parallel data enlarged with a large quantity of back-translated monolingual data that we generated with a new incremental training framework. Our primary submissions to the task are the result of a simple combination of our SMT and NMT systems. Our systems are ranked first for the Estonian-English and Finnish-English language pairs (constrained) according to BLEU-cased. |
Tasks | Machine Translation |
Published | 2018-09-19 |
URL | http://arxiv.org/abs/1809.07037v2 |
PDF | http://arxiv.org/pdf/1809.07037v2.pdf |
PWC | https://paperswithcode.com/paper/nicts-neural-and-statistical-machine |
Repo | |
Framework | |
Open-Ended Content-Style Recombination Via Leakage Filtering
Title | Open-Ended Content-Style Recombination Via Leakage Filtering |
Authors | Karl Ridgeway, Michael C. Mozer |
Abstract | We consider visual domains in which a class label specifies the content of an image, and class-irrelevant properties that differentiate instances constitute the style. We present a domain-independent method that permits the open-ended recombination of the style of one image with the content of another. Open-ended simply means that the method generalizes to style and content not present in the training data. The method starts by constructing a content embedding using an existing deep metric-learning technique. This trained content encoder is incorporated into a variational autoencoder (VAE), paired with a to-be-trained style encoder. The VAE reconstruction loss alone is inadequate to ensure a decomposition of the latent representation into style and content. Our method thus includes an auxiliary loss, leakage filtering, which ensures that no style information remaining in the content representation is used for reconstruction and vice versa. We synthesize novel images by decoding the style representation obtained from one image with the content representation from another. Using this method for data-set augmentation, we obtain state-of-the-art performance on few-shot learning tasks. |
Tasks | Few-Shot Learning, Metric Learning |
Published | 2018-09-28 |
URL | http://arxiv.org/abs/1810.00110v1 |
PDF | http://arxiv.org/pdf/1810.00110v1.pdf |
PWC | https://paperswithcode.com/paper/open-ended-content-style-recombination-via |
Repo | |
Framework | |
Fighting Redundancy and Model Decay with Embeddings
Title | Fighting Redundancy and Model Decay with Embeddings |
Authors | Dan Shiebler, Luca Belli, Jay Baxter, Hanchen Xiong, Abhishek Tayal |
Abstract | Every day, hundreds of millions of new Tweets containing over 40 languages of ever-shifting vernacular flow through Twitter. Models that attempt to extract insight from this firehose of information must face the torrential covariate shift that is endemic to the Twitter platform. While regularly-retrained algorithms can maintain performance in the face of this shift, fixed model features that fail to represent new trends and tokens can quickly become stale, resulting in performance degradation. To mitigate this problem we employ learned features, or embedding models, that can efficiently represent the most relevant aspects of a data distribution. Sharing these embedding models across teams can also reduce redundancy and multiplicatively increase cross-team modeling productivity. In this paper, we detail the commoditized tools, algorithms and pipelines that we have developed and are developing at Twitter to regularly generate high quality, up-to-date embeddings and share them broadly across the company. |
Tasks | |
Published | 2018-09-18 |
URL | http://arxiv.org/abs/1809.07703v1 |
PDF | http://arxiv.org/pdf/1809.07703v1.pdf |
PWC | https://paperswithcode.com/paper/fighting-redundancy-and-model-decay-with |
Repo | |
Framework | |
Bayesian Outdoor Defect Detection
Title | Bayesian Outdoor Defect Detection |
Authors | Fei Jiang, Guosheng Yin |
Abstract | We introduce a Bayesian defect detector to facilitate defect detection in motion-blurred images of rough-textured surfaces. To enhance the accuracy of Bayesian detection in removing non-defect pixels, we develop a class of reflected non-local prior distributions, which are constructed by using the mode of a distribution to subtract its density. The reflected non-local priors force the Bayesian detector to approach 0 at the non-defect locations. We conduct experimental studies to demonstrate the superior performance of the Bayesian detector in eliminating the non-defect points. We apply the Bayesian detector to motion-blurred drone images, in which the detector successfully identifies hail damage on the rough surface and substantially enhances the accuracy of the entire defect detection pipeline. |
Tasks | |
Published | 2018-08-30 |
URL | http://arxiv.org/abs/1809.01000v1 |
PDF | http://arxiv.org/pdf/1809.01000v1.pdf |
PWC | https://paperswithcode.com/paper/bayesian-outdoor-defect-detection |
Repo | |
Framework | |
Information Theoretic Co-Training
Title | Information Theoretic Co-Training |
Authors | David McAllester |
Abstract | This paper introduces an information theoretic co-training objective for unsupervised learning. We consider the problem of predicting the future. Rather than predict future sensations (image pixels or sound waves) we predict “hypotheses” to be confirmed by future sensations. More formally, we assume a population distribution on pairs $(x,y)$ where we can think of $x$ as a past sensation and $y$ as a future sensation. We train both a predictor model $P_\Phi(z \mid x)$ and a confirmation model $P_\Psi(z \mid y)$ where we view $z$ as hypotheses (when predicted) or facts (when confirmed). For a population distribution on pairs $(x,y)$ we focus on the problem of measuring the mutual information between $x$ and $y$. By the data processing inequality this mutual information is at least as large as the mutual information between $x$ and $z$ under the distribution on triples $(x,z,y)$ defined by the confirmation model $P_\Psi(z \mid y)$. The information theoretic training objective for $P_\Phi(z \mid x)$ and $P_\Psi(z \mid y)$ can be viewed as a form of co-training where we want the prediction from $x$ to match the confirmation from $y$. |
Tasks | |
Published | 2018-02-21 |
URL | http://arxiv.org/abs/1802.07572v2 |
PDF | http://arxiv.org/pdf/1802.07572v2.pdf |
PWC | https://paperswithcode.com/paper/information-theoretic-co-training |
Repo | |
Framework | |
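
The data processing step in the abstract above can be written out briefly. The sketch assumes, as the abstract states, that $z$ is drawn from the confirmation model given $y$ alone, so that $x \to y \to z$ forms a Markov chain; the notation $p(x, y)$ for the population distribution is introduced here for illustration.

```latex
% Assumption: (x, y) is drawn from the population distribution p(x, y),
% and z is drawn from the confirmation model P_\Psi(z | y), so x -> y -> z.
\begin{align*}
p(x, z, y) &= p(x, y)\, P_\Psi(z \mid y), \\
I(x; y) &\ge I(x; z).
\end{align*}
```

Any lower bound on $I(x; z)$ obtained under this joint distribution is therefore also a lower bound on $I(x; y)$.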
3D non-rigid registration using color: Color Coherent Point Drift
Title | 3D non-rigid registration using color: Color Coherent Point Drift |
Authors | Marcelo Saval-Calvo, Jorge Azorin-Lopez, Andres Fuster-Guillo, Victor Villena-Martinez, Robert B. Fisher |
Abstract | Research into object deformations using computer vision techniques has been under intense study in recent years. A widely used technique is 3D non-rigid registration to estimate the transformation between two instances of a deforming structure. Despite many previous developments on this topic, it remains a challenging problem. In this paper we propose a novel approach to non-rigid registration combining two data spaces in order to robustly calculate the correspondences and transformation between two data sets. In particular, we use point color as well as 3D location, as these are the common outputs of RGB-D cameras. We propose the Color Coherent Point Drift (CCPD) algorithm (an extension of the CPD method [1]). Evaluation is performed using synthetic and real data. The synthetic data includes easy shapes that allow evaluation of the effect of noise, outliers and missing data. Moreover, an evaluation of realistic figures obtained using Blensor is carried out. Real data acquired using a general purpose Primesense Carmine sensor is used to validate the CCPD for real shapes. For all tests, the proposed method is compared to the original CPD, showing better registration accuracy in most cases. |
Tasks | |
Published | 2018-02-05 |
URL | http://arxiv.org/abs/1802.01516v1 |
PDF | http://arxiv.org/pdf/1802.01516v1.pdf |
PWC | https://paperswithcode.com/paper/3d-non-rigid-registration-using-color-color |
Repo | |
Framework | |
Experimenting with robotic intra-logistics domains
Title | Experimenting with robotic intra-logistics domains |
Authors | Martin Gebser, Philipp Obermeier, Thomas Otto, Torsten Schaub, Orkunt Sabuncu, Van Nguyen, Tran Cao Son |
Abstract | We introduce the asprilo [1] framework to facilitate experimental studies of approaches addressing complex dynamic applications. For this purpose, we have chosen the domain of robotic intra-logistics. This domain is not only highly relevant in the context of today’s fourth industrial revolution but it moreover combines a multitude of challenging issues within a single uniform framework. This includes multi-agent planning, reasoning about action, change, resources, strategies, etc. In return, asprilo allows users to study alternative solutions as regards effectiveness and scalability. Although asprilo relies on Answer Set Programming and Python, it is readily usable by any system complying with its fact-oriented interface format. This makes it attractive for benchmarking and teaching well beyond logic programming. More precisely, asprilo consists of a versatile benchmark generator, solution checker and visualizer as well as a bunch of reference encodings featuring various ASP techniques. Importantly, the visualizer’s animation capabilities are indispensable for complex scenarios like intra-logistics in order to inspect valid as well as invalid solution candidates. Also, it allows for graphically editing benchmark layouts that can be used as a basis for generating benchmark suites. [1] asprilo stands for Answer Set Programming for robotic intra-logistics |
Tasks | |
Published | 2018-04-26 |
URL | http://arxiv.org/abs/1804.10247v1 |
PDF | http://arxiv.org/pdf/1804.10247v1.pdf |
PWC | https://paperswithcode.com/paper/experimenting-with-robotic-intra-logistics |
Repo | |
Framework | |
Entity-Aware Language Model as an Unsupervised Reranker
Title | Entity-Aware Language Model as an Unsupervised Reranker |
Authors | Mohammad Sadegh Rasooli, Sarangarajan Parthasarathy |
Abstract | In language modeling, it is difficult to incorporate entity relationships from a knowledge-base. One solution is to use a reranker trained with global features, in which global features are derived from n-best lists. However, training such a reranker requires manually annotated n-best lists, which are expensive to obtain. We propose a method based on contrastive estimation that alleviates the need for such data. Experiments in the music domain demonstrate that global features, as well as features extracted from an external knowledge-base, can be incorporated into our reranker. Our final model, a simple ensemble of a language model and reranker, achieves a 0.44% absolute word error rate improvement over an LSTM language model on the blind test data. |
Tasks | Language Modelling |
Published | 2018-03-12 |
URL | http://arxiv.org/abs/1803.04291v2 |
PDF | http://arxiv.org/pdf/1803.04291v2.pdf |
PWC | https://paperswithcode.com/paper/entity-aware-language-model-as-an |
Repo | |
Framework | |
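
The final model in the abstract above is described as a simple ensemble of a language model and a reranker applied to n-best lists. One way such an ensemble could work is a weighted sum of the two scores, sketched below. The linear interpolation, the weight, the scores, and the example hypotheses are all assumptions made for illustration; the abstract does not specify how the two models are combined.

```python
def rerank(nbest, lm_weight=0.5):
    """Pick the best hypothesis from an n-best list.

    nbest: list of (hypothesis, lm_score, reranker_score), higher scores are better.
    The linear interpolation used here is an assumed combination rule.
    """
    def combined(item):
        _, lm_score, reranker_score = item
        return lm_weight * lm_score + (1.0 - lm_weight) * reranker_score

    return max(nbest, key=combined)[0]


# Hypothetical n-best list from a music-domain recognizer, with made-up scores.
nbest = [
    ("play songs by the beetles", -4.1, -6.0),
    ("play songs by the beatles", -4.3, -2.5),
    ("play songs buy the beatles", -5.0, -3.0),
]

print(rerank(nbest))                  # ensemble of both scores
print(rerank(nbest, lm_weight=1.0))   # language model score alone
```

In this toy list the language model alone prefers the misspelled entity, while the combined score selects the hypothesis that an entity-aware reranker would favour.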
Controlling Personality-Based Stylistic Variation with Neural Natural Language Generators
Title | Controlling Personality-Based Stylistic Variation with Neural Natural Language Generators |
Authors | Shereen Oraby, Lena Reed, Shubhangi Tandon, T. S. Sharath, Stephanie Lukin, Marilyn Walker |
Abstract | Natural language generators for task-oriented dialogue must effectively realize system dialogue actions and their associated semantics. In many applications, it is also desirable for generators to control the style of an utterance. To date, work on task-oriented neural generation has primarily focused on semantic fidelity rather than achieving stylistic goals, while work on style has been done in contexts where it is difficult to measure content preservation. Here we present three different sequence-to-sequence models and carefully test how well they disentangle content and style. We use a statistical generator, Personage, to synthesize a new corpus of over 88,000 restaurant domain utterances whose style varies according to models of personality, giving us total control over both the semantic content and the stylistic variation in the training data. We then vary the amount of explicit stylistic supervision given to the three models. We show that our most explicit model can simultaneously achieve high fidelity to both semantic and stylistic goals: this model adds a context vector of 36 stylistic parameters as input to the hidden state of the encoder at each time step, showing the benefits of explicit stylistic supervision, even when the amount of training data is large. |
Tasks | |
Published | 2018-05-22 |
URL | http://arxiv.org/abs/1805.08352v1 |
PDF | http://arxiv.org/pdf/1805.08352v1.pdf |
PWC | https://paperswithcode.com/paper/controlling-personality-based-stylistic |
Repo | |
Framework | |
Fast 5DOF Needle Tracking in iOCT
Title | Fast 5DOF Needle Tracking in iOCT |
Authors | Jakob Weiss, Nicola Rieke, Mohammad Ali Nasseri, Mathias Maier, Abouzar Eslami, Nassir Navab |
Abstract | Purpose. Intraoperative Optical Coherence Tomography (iOCT) is an increasingly available imaging technique for ophthalmic microsurgery that provides high-resolution cross-sectional information of the surgical scene. We propose to build on its desirable qualities and present a method for tracking the orientation and location of a surgical needle. Thereby, we enable analysis of instrument-tissue interaction directly in OCT space without the complex multimodal calibration that would be required with traditional instrument tracking methods. Method. The intersection of the needle with the iOCT scan is detected by a peculiar multi-step ellipse fitting that takes advantage of the directionality of the modality. The geometric modelling allows us to feed the ellipse parameters into a latency-aware estimator to infer the 5DOF pose during needle movement. Results. Experiments on phantom data and ex-vivo porcine eyes indicate that the algorithm retains angular precision, especially during lateral needle movement, and provides a more robust and consistent estimation than baseline methods. Conclusion. Using solely cross-sectional iOCT information, we are able to successfully and robustly estimate a 5DOF pose of the instrument in less than 5.5 ms on a CPU. |
Tasks | Calibration |
Published | 2018-02-18 |
URL | http://arxiv.org/abs/1802.06446v1 |
PDF | http://arxiv.org/pdf/1802.06446v1.pdf |
PWC | https://paperswithcode.com/paper/fast-5dof-needle-tracking-in-ioct |
Repo | |
Framework | |
Is One Hyperparameter Optimizer Enough?
Title | Is One Hyperparameter Optimizer Enough? |
Authors | Huy Tu, Vivek Nair |
Abstract | Hyperparameter tuning is the black art of automatically finding a good combination of control parameters for a data miner. While widely applied in empirical Software Engineering, there has not been much discussion on which hyperparameter tuner is best for software analytics. To address this gap in the literature, this paper applied a range of hyperparameter optimizers (grid search, random search, differential evolution, and Bayesian optimization) to the defect prediction problem. Surprisingly, no hyperparameter optimizer was observed to be 'best' and, for one of the two evaluation measures studied here (F-measure), hyperparameter optimization was, in 50% of cases, no better than using default configurations. We conclude that hyperparameter optimization is more nuanced than previously believed. While such optimization can certainly lead to large improvements in the performance of classifiers used in software analytics, it remains to be seen which specific optimizers should be applied to a new dataset. |
Tasks | Hyperparameter Optimization |
Published | 2018-07-29 |
URL | http://arxiv.org/abs/1807.11112v4 |
PDF | http://arxiv.org/pdf/1807.11112v4.pdf |
PWC | https://paperswithcode.com/paper/is-one-hyperparameter-optimizer-enough |
Repo | |
Framework | |
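
Two of the optimizers named in the abstract above, grid search and random search, can be compared against a default configuration in a few lines. The sketch below is a toy illustration and not the paper's experimental setup: `toy_loss` is a hypothetical stand-in for a cross-validated error, and the parameter names, ranges, grid, and default values are assumptions.

```python
import itertools
import random


def toy_loss(params):
    """Hypothetical stand-in for a validation error; lower is better."""
    return (params["depth"] - 7) ** 2 + (params["rate"] - 0.3) ** 2


def grid_search(loss, grid):
    """Evaluate every combination on a fixed grid and return the best one."""
    names = sorted(grid)
    candidates = [dict(zip(names, values))
                  for values in itertools.product(*(grid[n] for n in names))]
    return min(candidates, key=loss)


def random_search(loss, bounds, budget, rng):
    """Evaluate `budget` uniformly sampled configurations and return the best one."""
    candidates = [{"depth": rng.randint(*bounds["depth"]),
                   "rate": rng.uniform(*bounds["rate"])}
                  for _ in range(budget)]
    return min(candidates, key=loss)


rng = random.Random(0)
default = {"depth": 6, "rate": 0.3}
best_grid = grid_search(toy_loss, {"depth": [1, 5, 10], "rate": [0.1, 0.3, 0.9]})
best_random = random_search(toy_loss, {"depth": (1, 10), "rate": (0.0, 1.0)},
                            budget=9, rng=rng)

for name, cfg in [("default", default), ("grid", best_grid), ("random", best_random)]:
    print(name, cfg, toy_loss(cfg))
```

With this coarse grid the default configuration scores better than the grid search result, a small reminder that the outcome of tuning depends on the search space and budget as much as on the optimizer.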