Paper Group NANR 212
MixLasso: Generalized Mixed Regression via Convex Atomic-Norm Regularization
Title | MixLasso: Generalized Mixed Regression via Convex Atomic-Norm Regularization |
Authors | Ian En-Hsu Yen, Wei-Cheng Lee, Kai Zhong, Sung-En Chang, Pradeep K. Ravikumar, Shou-De Lin |
Abstract | We consider a generalization of mixed regression where the response is an additive combination of several mixture components. Standard mixed regression is the special case where each response is generated from exactly one component. Typical approaches to the mixture regression problem employ local search methods such as Expectation Maximization (EM) that are prone to spurious local optima. On the other hand, a number of recent, theoretically motivated *tensor-based methods* either have high sample complexity or require knowledge of the input distribution, which is not available in most practical situations. In this work, we study *MixLasso*, a novel convex estimator for generalized mixed regression, based on an atomic norm specifically constructed to regularize the number of mixture components. Our estimator admits a risk bound that trades off prediction accuracy against model sparsity without imposing stringent assumptions on the input/output distribution, and it can be easily adapted to the case of non-linear functions. In our numerical experiments on mixtures of linear as well as non-linear regressions, the proposed method yields high-quality solutions in a wider range of settings than existing approaches. |
Tasks | |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/8284-mixlasso-generalized-mixed-regression-via-convex-atomic-norm-regularization |
http://papers.nips.cc/paper/8284-mixlasso-generalized-mixed-regression-via-convex-atomic-norm-regularization.pdf | |
PWC | https://paperswithcode.com/paper/mixlasso-generalized-mixed-regression-via |
Repo | |
Framework | |
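To make the problem setup concrete, here is a minimal sketch of the generalized mixed regression model the paper targets, together with the kind of EM-style alternating local search it contrasts against (not the MixLasso estimator itself). All sizes and variable names are illustrative assumptions.

```python
# Generalized mixed regression: y_i = sum_k z_ik * <w_k, x_i> + noise, with
# binary memberships z_ik (standard mixed regression would force exactly one
# 1 per row). Below: data generation plus a naive alternating-minimization
# baseline of the local-search flavor the paper argues against.
import numpy as np
from itertools import product

rng = np.random.default_rng(0)
n, d, K = 500, 10, 3                     # samples, features, mixture components

W = rng.normal(size=(K, d))              # true component weight vectors
X = rng.normal(size=(n, d))
Z = rng.integers(0, 2, size=(n, K))      # additive memberships
y = np.einsum('nk,kd,nd->n', Z, W, X) + 0.1 * rng.normal(size=n)

masks = np.array(list(product([0, 1], repeat=K)))   # all 2^K membership patterns

W_hat = rng.normal(size=(K, d))
for _ in range(50):
    # Z-step: for each sample, pick the additive pattern with smallest residual.
    cand = (X @ W_hat.T) @ masks.T                  # (n, 2^K) candidate predictions
    Z_hat = masks[np.argmin((y[:, None] - cand) ** 2, axis=1)]
    # W-step: with memberships fixed, y_i = <z_i (x) x_i, vec(W)> is least squares.
    Phi = np.einsum('nk,nd->nkd', Z_hat, X).reshape(n, K * d)
    W_hat = np.linalg.lstsq(Phi, y, rcond=None)[0].reshape(K, d)
```

This alternation can stall in spurious local optima, which is exactly the failure mode the convex atomic-norm estimator is designed to avoid.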
Deep Fundamental Matrix Estimation
Title | Deep Fundamental Matrix Estimation |
Authors | Rene Ranftl, Vladlen Koltun |
Abstract | We present an approach to robust estimation of fundamental matrices from noisy data contaminated by outliers. The problem is cast as a series of weighted homogeneous least-squares problems, where robust weights are estimated using deep networks. The presented formulation acts directly on putative correspondences and thus fits into standard 3D vision pipelines that perform feature extraction, matching, and model fitting. The approach can be trained end-to-end and yields computationally efficient robust estimators. Our experiments indicate that the presented approach is able to train robust estimators that outperform classic approaches on real data by a significant margin. |
Tasks | |
Published | 2018-09-01 |
URL | http://openaccess.thecvf.com/content_ECCV_2018/html/Rene_Ranftl_Deep_Fundamental_Matrix_ECCV_2018_paper.html |
http://openaccess.thecvf.com/content_ECCV_2018/papers/Rene_Ranftl_Deep_Fundamental_Matrix_ECCV_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/deep-fundamental-matrix-estimation |
Repo | |
Framework | |
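The core computational step of this formulation is easy to sketch: given putative correspondences and per-correspondence weights (produced by a deep network in the paper; here simply an input array, which is our substitution), the fundamental matrix is the solution of a weighted homogeneous least-squares problem. A minimal numpy version, with Hartley coordinate normalization omitted for brevity:

```python
import numpy as np

def weighted_eight_point(x1, x2, w):
    """x1, x2: (n, 2) matched pixel coords; w: (n,) robust weights."""
    n = x1.shape[0]
    h1 = np.hstack([x1, np.ones((n, 1))])           # homogeneous coordinates
    h2 = np.hstack([x2, np.ones((n, 1))])
    # Each correspondence gives one equation h2^T F h1 = 0, linear in vec(F);
    # the weights down-weight rows believed to be outliers.
    A = w[:, None] * np.einsum('ni,nj->nij', h2, h1).reshape(n, 9)
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)                        # smallest right singular vector
    # Project onto the rank-2 manifold (a valid F satisfies det F = 0).
    U, S, Vt = np.linalg.svd(F)
    S[2] = 0.0
    return U @ np.diag(S) @ Vt
```

Because the solve is a differentiable function of the weights, the whole pipeline can be trained end-to-end, as the abstract describes.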
Learnability of Learned Neural Networks
Title | Learnability of Learned Neural Networks |
Authors | Rahul Anand Sharma, Navin Goyal, Monojit Choudhury, Praneeth Netrapalli |
Abstract | This paper explores the simplicity of learned neural networks under various settings: learned on real vs. random data, with varying size/architecture, and with large vs. small minibatch sizes. The notion of simplicity used here is learnability, i.e., how accurately the prediction function of a neural network can be learned from labeled samples drawn from it. While learnability is different from (and in fact often higher than) test accuracy, the results herein suggest a strong correlation between small generalization error and high learnability. This work also shows that there are significant qualitative differences between shallow networks and popular deep networks. More broadly, this paper extends, in a new direction, previous work on understanding the properties of learned neural networks. Our hope is that such an empirical study of learned neural networks might shed light on the right assumptions to make in a theoretical study of deep learning. |
Tasks | |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=rJ1RPJWAW |
https://openreview.net/pdf?id=rJ1RPJWAW | |
PWC | https://paperswithcode.com/paper/learnability-of-learned-neural-networks |
Repo | |
Framework | |
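A minimal sketch of the paper's central measurement: relabel fresh inputs with a trained "teacher" network and check how well a freshly trained "student" recovers its prediction function. sklearn stands in for the paper's deep architectures, and the toy concept and sizes are our assumptions.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(4000, 20))
y = (X[:, 0] * X[:, 1] > 0).astype(int)          # some ground-truth concept

teacher = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500)
teacher.fit(X[:2000], y[:2000])

# Relabel fresh inputs with the teacher, then train a student on those labels.
X_probe = rng.normal(size=(4000, 20))
y_teacher = teacher.predict(X_probe)
student = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500)
student.fit(X_probe[:2000], y_teacher[:2000])

# Learnability = student/teacher agreement on held-out probe inputs.
learnability = (student.predict(X_probe[2000:]) == y_teacher[2000:]).mean()
print(f"learnability: {learnability:.3f}")
```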
Proceedings of the Twelfth Workshop on Graph-Based Methods for Natural Language Processing (TextGraphs-12)
Title | Proceedings of the Twelfth Workshop on Graph-Based Methods for Natural Language Processing (TextGraphs-12) |
Authors | |
Abstract | |
Tasks | |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/W18-1700/ |
https://www.aclweb.org/anthology/W18-1700 | |
PWC | https://paperswithcode.com/paper/proceedings-of-the-twelfth-workshop-on-graph |
Repo | |
Framework | |
Discourse Coherence Through the Lens of an Annotated Text Corpus: A Case Study
Title | Discourse Coherence Through the Lens of an Annotated Text Corpus: A Case Study |
Authors | Eva Hajičová, Jiří Mírovský |
Abstract | |
Tasks | |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1259/ |
https://www.aclweb.org/anthology/L18-1259 | |
PWC | https://paperswithcode.com/paper/discourse-coherence-through-the-lens-of-an |
Repo | |
Framework | |
A Multilingual Test Collection for the Semantic Search of Entity Categories
Title | A Multilingual Test Collection for the Semantic Search of Entity Categories |
Authors | Juliano Efson Sales, Siamak Barzegar, Wellington Franco, Bernhard Bermeitinger, Tiago Cunha, Brian Davis, André Freitas, Siegfried Handschuh |
Abstract | |
Tasks | Information Retrieval |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1398/ |
https://www.aclweb.org/anthology/L18-1398 | |
PWC | https://paperswithcode.com/paper/a-multilingual-test-collection-for-the |
Repo | |
Framework | |
Simple Semantic Annotation and Situation Frames: Two Approaches to Basic Text Understanding in LORELEI
Title | Simple Semantic Annotation and Situation Frames: Two Approaches to Basic Text Understanding in LORELEI |
Authors | Kira Griffitt, Jennifer Tracey, Ann Bies, Stephanie Strassel |
Abstract | |
Tasks | Transfer Learning |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1265/ |
https://www.aclweb.org/anthology/L18-1265 | |
PWC | https://paperswithcode.com/paper/simple-semantic-annotation-and-situation |
Repo | |
Framework | |
Zewen at SemEval-2018 Task 1: An Ensemble Model for Affect Prediction in Tweets
Title | Zewen at SemEval-2018 Task 1: An Ensemble Model for Affect Prediction in Tweets |
Authors | Zewen Chi, Heyan Huang, Jiangui Chen, Hao Wu, Ran Wei |
Abstract | This paper presents our system for the SemEval-2018 Affect in Tweets task, which asks systems to automatically determine the intensity of emotions and the intensity of sentiment in tweets. The term affect refers to emotion-related categories such as anger, fear, etc. Emotion intensities must be quantified as real-valued scores in [0, 1]. We propose an ensemble system of four different deep learning models: a CNN, a Bidirectional LSTM (BLSTM), an LSTM-CNN, and a CNN-based Attention model (CA). Our system achieves an average Pearson correlation score of 0.682 in subtask EI-reg and 0.784 in subtask V-reg, ranking 17th among 48 systems in EI-reg and 19th among 38 systems in V-reg. |
Tasks | Sentence Classification, Sentiment Analysis |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/S18-1046/ |
https://www.aclweb.org/anthology/S18-1046 | |
PWC | https://paperswithcode.com/paper/zewen-at-semeval-2018-task-1-an-ensemble |
Repo | |
Framework | |
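A minimal sketch of the ensembling and scoring described above: average the real-valued intensity predictions of several models and evaluate with the task's Pearson correlation metric. The noisy stand-in predictors below replace the actual CNN/BLSTM/LSTM-CNN/CA models (a substitution purely for illustration).

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
y_true = rng.uniform(0, 1, size=200)                # gold intensities in [0, 1]
# Pretend model outputs: gold plus model-specific noise.
model_preds = [np.clip(y_true + rng.normal(0, s, 200), 0, 1)
               for s in (0.10, 0.12, 0.15, 0.20)]

ensemble = np.mean(model_preds, axis=0)             # simple averaging ensemble
for i, p in enumerate(model_preds):
    print(f"model {i}: r = {pearsonr(p, y_true)[0]:.3f}")
print(f"ensemble: r = {pearsonr(ensemble, y_true)[0]:.3f}")
```

Averaging de-correlates the individual models' errors, which is why the ensemble's Pearson score typically exceeds each member's.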
Recovering 3D Planes from a Single Image via Convolutional Neural Networks
Title | Recovering 3D Planes from a Single Image via Convolutional Neural Networks |
Authors | Fengting Yang, Zihan Zhou |
Abstract | In this paper, we study the problem of recovering 3D planar surfaces from a single image of a man-made environment. We show that it is possible to directly train a deep neural network to achieve this goal. A novel plane structure-induced loss is proposed to train the network to simultaneously predict a plane segmentation map and the parameters of the 3D planes. Further, to avoid a tedious manual labeling process, we show how to leverage an existing large-scale RGB-D dataset to train our network without explicit 3D plane annotations, and how to take advantage of the semantic labels that come with the dataset for accurate planar vs. non-planar classification. Experimental results demonstrate that our method significantly outperforms existing methods, both qualitatively and quantitatively. The recovered planes could potentially benefit many important visual tasks such as vision-based navigation and human-robot interaction. |
Tasks | |
Published | 2018-09-01 |
URL | http://openaccess.thecvf.com/content_ECCV_2018/html/Fengting_Yang_Recovering_3D_Planes_ECCV_2018_paper.html |
http://openaccess.thecvf.com/content_ECCV_2018/papers/Fengting_Yang_Recovering_3D_Planes_ECCV_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/recovering-3d-planes-from-a-single-image-via |
Repo | |
Framework | |
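A minimal sketch of what a plane structure-induced loss can look like: the network's per-pixel segmentation softly assigns pixels to predicted planes, and each assignment is penalized by how badly that plane explains the pixel's 3D point (recovered from RGB-D depth), so no explicit plane annotations are needed. The n·Q = 1 plane parameterization and all shapes are our assumptions, not necessarily the paper's exact loss.

```python
import torch

def plane_loss(seg_logits, planes, points):
    """seg_logits: (K, H, W) segmentation scores; planes: (K, 3) params n_k
    with plane equation n_k . Q = 1; points: (3, H, W) 3D points from depth."""
    prob = torch.softmax(seg_logits, dim=0)                 # soft assignment
    # Residual of every pixel against every plane: |n_k . Q - 1|.
    resid = torch.abs(torch.einsum('kc,chw->khw', planes, points) - 1.0)
    return (prob * resid).mean()

K, H, W = 5, 32, 32
loss = plane_loss(torch.randn(K, H, W, requires_grad=True),
                  torch.randn(K, 3, requires_grad=True),
                  torch.randn(3, H, W))
loss.backward()                   # gradients flow to both segmentation and params
```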
Lifelong Word Embedding via Meta-Learning
Title | Lifelong Word Embedding via Meta-Learning |
Authors | Hu Xu, Bing Liu, Lei Shu, Philip S. Yu |
Abstract | Learning high-quality word embeddings is of significant importance for achieving better performance in many downstream learning tasks. On one hand, traditional word embeddings are trained on a large-scale corpus for general-purpose tasks, which often makes them sub-optimal for domain-specific tasks. On the other hand, many domain-specific tasks do not have a large enough domain corpus to obtain high-quality embeddings. We observe that domains are not isolated, and a small domain corpus can leverage the knowledge learned from many past domains to augment that corpus and generate high-quality embeddings. In this paper, we formulate the learning of word embeddings as a lifelong learning process. Given knowledge learned from many previous domains and a small new domain corpus, the proposed method can effectively generate new domain embeddings by leveraging a simple but effective algorithm and a meta-learner, where the meta-learner provides word context similarity information at the domain level. Experimental results demonstrate that the proposed method can effectively learn new domain embeddings from a small corpus and past domain knowledge (we will release the code after final revisions). We also demonstrate that general-purpose embeddings trained on a large-scale corpus are sub-optimal for domain-specific tasks. |
Tasks | Meta-Learning, Word Embeddings |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=H1BO9M-0Z |
https://openreview.net/pdf?id=H1BO9M-0Z | |
PWC | https://paperswithcode.com/paper/lifelong-word-embedding-via-meta-learning |
Repo | |
Framework | |
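A minimal sketch of the augmentation idea in the abstract: pool co-occurrence evidence from past domain corpora into the small new-domain corpus, but only for words whose context usage looks similar across domains. The cosine gate below is a crude stand-in for the paper's learned meta-learner, and the shared vocabulary and all shapes are our assumptions.

```python
import numpy as np

def augment_cooccurrence(C_new, C_past_list, threshold=0.8):
    """C_new: (V, V) new-domain co-occurrence counts; C_past_list: list of
    (V, V) counts from past domains, over a shared vocabulary of size V."""
    C_aug = C_new.astype(float).copy()
    norm_new = np.linalg.norm(C_new, axis=1) + 1e-8
    for C_past in C_past_list:
        norm_past = np.linalg.norm(C_past, axis=1) + 1e-8
        # Per-word context similarity between new-domain and past-domain usage.
        sim = (C_new * C_past).sum(axis=1) / (norm_new * norm_past)
        keep = sim > threshold                      # meta-learner stand-in
        C_aug[keep] += C_past[keep]                 # borrow evidence only here
    return C_aug   # feed to any count-based embedding method (e.g. SVD/GloVe-style)
```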
Through-Wall Human Pose Estimation Using Radio Signals
Title | Through-Wall Human Pose Estimation Using Radio Signals |
Authors | Mingmin Zhao, Tianhong Li, Mohammad Abu Alsheikh, Yonglong Tian, Hang Zhao, Antonio Torralba, Dina Katabi |
Abstract | This paper demonstrates accurate human pose estimation through walls and occlusions. We leverage the fact that wireless signals in the WiFi frequencies traverse walls and reflect off the human body. We introduce a deep neural network approach that parses such radio signals to estimate 2D poses. Since humans cannot annotate radio signals, we use a state-of-the-art vision model to provide cross-modal supervision. Specifically, during training the system uses synchronized wireless and visual inputs, extracts pose information from the visual stream, and uses it to guide the training process. Once trained, the network uses only the wireless signal for pose estimation. We show that, when tested on visible scenes, the radio-based system is almost as accurate as the vision-based system used to train it. Yet, unlike vision-based pose estimation, the radio-based system can estimate 2D poses through walls, despite never being trained on such scenarios. Demo videos are available at our website (http://rfpose.csail.mit.edu). |
Tasks | Pose Estimation, RF-based Pose Estimation |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Zhao_Through-Wall_Human_Pose_CVPR_2018_paper.html |
http://openaccess.thecvf.com/content_cvpr_2018/papers/Zhao_Through-Wall_Human_Pose_CVPR_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/through-wall-human-pose-estimation-using |
Repo | |
Framework | |
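A minimal sketch of the cross-modal supervision scheme: a frozen vision pose model labels the synchronized camera stream, and the RF network is trained to predict the same keypoints from radio input alone, so the camera is only needed at training time. Both network stubs and all tensor shapes below are our assumptions.

```python
import torch
import torch.nn as nn

rf_net = nn.Sequential(nn.Flatten(), nn.Linear(2 * 64 * 64, 256),
                       nn.ReLU(), nn.Linear(256, 14 * 2))   # 14 keypoints (x, y)
opt = torch.optim.Adam(rf_net.parameters(), lr=1e-3)

def vision_teacher(frames):       # stand-in for the frozen vision pose model
    with torch.no_grad():
        return torch.rand(frames.shape[0], 14 * 2)

for step in range(100):
    rf = torch.randn(8, 2, 64, 64)          # synchronized radio heatmaps (fake)
    frames = torch.randn(8, 3, 64, 64)      # synchronized camera frames (fake)
    target = vision_teacher(frames)         # pseudo-labels from the visual stream
    loss = nn.functional.mse_loss(rf_net(rf), target)
    opt.zero_grad(); loss.backward(); opt.step()
# At test time only rf_net is used; the wireless signal alone suffices.
```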
Tools for The Production of Analogical Grids and a Resource of N-gram Analogical Grids in 11 Languages
Title | Tools for The Production of Analogical Grids and a Resource of N-gram Analogical Grids in 11 Languages |
Authors | Rashel Fam, Yves Lepage |
Abstract | |
Tasks | |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1171/ |
https://www.aclweb.org/anthology/L18-1171 | |
PWC | https://paperswithcode.com/paper/tools-for-the-production-of-analogical-grids |
Repo | |
Framework | |
One-shot and few-shot learning of word embeddings
Title | One-shot and few-shot learning of word embeddings |
Authors | Andrew Kyle Lampinen, James Lloyd McClelland |
Abstract | Standard deep learning systems require thousands or millions of examples to learn a concept, and cannot integrate new concepts easily. By contrast, humans have an incredible ability to do one-shot or few-shot learning. For instance, from just hearing a word used in a sentence, humans can infer a great deal about it by leveraging what the syntax and semantics of the surrounding words tell us. Here, we draw inspiration from this to highlight a simple technique by which deep recurrent networks can similarly exploit their prior knowledge to learn a useful representation for a new word from little data. This could make natural language processing systems much more flexible, by allowing them to learn continually from the new words they encounter. |
Tasks | Few-Shot Learning, Word Embeddings |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=rkYgAJWCZ |
https://openreview.net/pdf?id=rkYgAJWCZ | |
PWC | https://paperswithcode.com/paper/one-shot-and-few-shot-learning-of-word-1 |
Repo | |
Framework | |
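A minimal sketch of the highlighted technique: keep the trained network frozen and fit only the new word's embedding row by gradient descent on the single sentence that contains it. The tiny LSTM language model and vocabulary below are our assumptions; the paper applies the idea inside a trained recurrent network.

```python
import torch
import torch.nn as nn

V, D = 100, 32
emb = nn.Embedding(V + 1, D)                      # index V is the new word
lstm = nn.LSTM(D, D, batch_first=True)
out = nn.Linear(D, V + 1)
for p in list(lstm.parameters()) + list(out.parameters()):
    p.requires_grad_(False)                       # the trained network stays frozen

opt = torch.optim.SGD([emb.weight], lr=0.5)
sent = torch.tensor([[3, 7, V, 12, 5]])           # the one sentence with the new word
for _ in range(25):
    h, _ = lstm(emb(sent[:, :-1]))                # next-word prediction objective
    loss = nn.functional.cross_entropy(out(h).reshape(-1, V + 1),
                                       sent[:, 1:].reshape(-1))
    opt.zero_grad()
    loss.backward()
    emb.weight.grad[:V] = 0.0                     # update only the new word's row
    opt.step()
```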
On the Use of Word Embeddings Alone to Represent Natural Language Sequences
Title | On the Use of Word Embeddings Alone to Represent Natural Language Sequences |
Authors | Dinghan Shen, Guoyin Wang, Wenlin Wang, Martin Renqiang Min, Qinliang Su, Yizhe Zhang, Ricardo Henao, Lawrence Carin |
Abstract | To construct representations for natural language sequences, information from two main sources needs to be captured: (i) semantic meaning of individual words, and (ii) their compositionality. These two types of information are usually represented in the form of word embeddings and compositional functions, respectively. For the latter, Recurrent Neural Networks (RNNs) and Convolutional Neural Networks (CNNs) have been considered. There has not been a rigorous evaluation regarding the relative importance of each component to different text-representation-based tasks; i.e., how important is the modeling capacity of word embeddings alone, relative to the added value of a compositional function? In this paper, we conduct an extensive comparative study between Simple Word Embeddings-based Models (SWEMs), with no compositional parameters, relative to employing word embeddings within RNN/CNN-based models. Surprisingly, SWEMs exhibit comparable or even superior performance in the majority of cases considered. Moreover, in a new SWEM setup, we propose to employ a max-pooling operation over the learned word-embedding matrix of a given sentence. This approach is demonstrated to extract complementary features relative to the averaging operation standard to SWEMs, while endowing our model with better interpretability. To further validate our observations, we examine the information utilized by different models to make predictions, revealing interesting properties of word embeddings. |
Tasks | Word Embeddings |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=Sy5OAyZC- |
https://openreview.net/pdf?id=Sy5OAyZC- | |
PWC | https://paperswithcode.com/paper/on-the-use-of-word-embeddings-alone-to |
Repo | |
Framework | |
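The two parameter-free compositions under study are simple enough to state in a few lines: SWEM-aver (mean over the sentence's word embeddings) and the proposed SWEM-max (per-dimension max-pooling over the sentence's word-embedding matrix). Random embeddings below stand in for pretrained ones (our assumption).

```python
import numpy as np

rng = np.random.default_rng(0)
V, D = 1000, 300
E = rng.normal(size=(V, D))                       # word-embedding matrix

def swem_aver(token_ids):
    return E[token_ids].mean(axis=0)              # (D,) averaged representation

def swem_max(token_ids):
    return E[token_ids].max(axis=0)               # (D,) per-dimension max-pool

sentence = [5, 42, 17, 993]                       # token ids of one sentence
# The paper finds the two poolings extract complementary features,
# so concatenating them is a natural combined representation.
features = np.concatenate([swem_aver(sentence), swem_max(sentence)])
```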
Implicit Regularization in Nonconvex Statistical Estimation: Gradient Descent Converges Linearly for Phase Retrieval and Matrix Completion
Title | Implicit Regularization in Nonconvex Statistical Estimation: Gradient Descent Converges Linearly for Phase Retrieval and Matrix Completion |
Authors | Cong Ma, Kaizheng Wang, Yuejie Chi, Yuxin Chen |
Abstract | Recent years have seen a flurry of activities in designing provably efficient nonconvex optimization procedures for solving statistical estimation problems. For various problems like phase retrieval or low-rank matrix completion, state-of-the-art nonconvex procedures require proper regularization (e.g. trimming, regularized cost, projection) in order to guarantee fast convergence. When it comes to vanilla procedures such as gradient descent, however, prior theory either recommends highly conservative learning rates to avoid overshooting, or completely lacks performance guarantees. This paper uncovers a striking phenomenon in several nonconvex problems: even in the absence of explicit regularization, gradient descent follows a trajectory staying within a basin that enjoys nice geometry, consisting of points incoherent with the sampling mechanism. This “implicit regularization” feature allows gradient descent to proceed in a far more aggressive fashion without overshooting, which in turn results in substantial computational savings. Focusing on two statistical estimation problems, i.e. solving random quadratic systems of equations and low-rank matrix completion, we establish that gradient descent achieves near-optimal statistical and computational guarantees without explicit regularization. As a byproduct, for noisy matrix completion, we demonstrate that gradient descent enables optimal control of both entrywise and spectral-norm errors. |
Tasks | Low-Rank Matrix Completion, Matrix Completion |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=1893 |
http://proceedings.mlr.press/v80/ma18c/ma18c.pdf | |
PWC | https://paperswithcode.com/paper/implicit-regularization-in-nonconvex-1 |
Repo | |
Framework | |
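A minimal sketch of the vanilla procedure analyzed for random quadratic systems (real-valued phase retrieval): plain gradient descent on f(x) = (1/4m) Σᵢ ((aᵢᵀx)² − yᵢ)² from a spectral initialization, with a constant, fairly aggressive step size and no trimming or explicit regularization. Problem sizes and the step size are our assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 50, 600
x_star = rng.normal(size=n)
x_star /= np.linalg.norm(x_star)
A = rng.normal(size=(m, n))
y = (A @ x_star) ** 2                             # quadratic measurements

# Spectral initialization: top eigenvector of (1/m) sum_i y_i a_i a_i^T.
Y = (A * y[:, None]).T @ A / m
w, V = np.linalg.eigh(Y)
x = V[:, -1] * np.sqrt(y.mean())

eta = 0.1                                         # constant, aggressive step size
for t in range(300):
    r = (A @ x) ** 2 - y
    x -= eta * (A.T @ (r * (A @ x))) / m          # gradient of f at x

# The sign of x is unidentifiable, so measure distance to the closer of ±x*.
dist = min(np.linalg.norm(x - x_star), np.linalg.norm(x + x_star))
print(f"distance to ±x*: {dist:.2e}")
```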