Google at CVPR 2018, the Full Roundup: 45 Papers, Plus Ian Goodfellow's GAN Talk Slides for Download

Sources: Google, iangoodfellow.com, 新智元 (AI Era)

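Today, the 2018 Conference on Computer Vision and Pattern Recognition (CVPR 2018) is under way in Salt Lake City. It is the premier annual academic conference in computer vision, comprising the main conference and a number of workshops and tutorials. As a Diamond Sponsor, Google again has a strong presence at this year's CVPR: more than 200 Googlers will present papers or give invited talks, and Google is also organizing and taking part in several workshops.

According to the official Google blog, Google has 45 papers accepted at CVPR 2018. These papers cover the latest machine learning techniques for next-generation intelligent systems and machine perception, including the technology behind portrait mode on the Pixel 2 and Pixel 2 XL smartphones and V4 of the Open Images dataset, among others.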

Google at CVPR 2018

Organizers

Finance Chair: Ramin Zabih

Area Chairs: Sameer Agarwal, Aseem Agarwala, Jon Barron, Abhinav Shrivastava, Carl Vondrick, Ming-Hsuan Yang

Papers

Orals/Spotlights

Unsupervised Discovery of Object Landmarks as Structural Representations

Yuting Zhang, Yijie Guo, Yixin Jin, Yijun Luo, Zhiyuan He, Honglak Lee

DoubleFusion: Real-time Capture of Human Performances with Inner Body Shapes from a Single Depth Sensor

Tao Yu, Zerong Zheng, Kaiwen Guo, Jianhui Zhao, Qionghai Dai, Hao Li, Gerard Pons-Moll, Yebin Liu

Neural Kinematic Networks for Unsupervised Motion Retargetting

Ruben Villegas, Jimei Yang, Duygu Ceylan, Honglak Lee

Burst Denoising with Kernel Prediction Networks

Ben Mildenhall, Jiawen Chen, Jonathan Barron, Robert Carroll, Dillon Sharlet, Ren Ng

Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference

Benoit Jacob, Skirmantas Kligys, Bo Chen, Matthew Tang, Menglong Zhu, Andrew Howard, Dmitry Kalenichenko, Hartwig Adam

AVA: A Video Dataset of Spatio-temporally Localized Atomic Visual Actions

Chunhui Gu, Chen Sun, David Ross, Carl Vondrick, Caroline Pantofaru, Yeqing Li, Sudheendra Vijayanarasimhan, George Toderici, Susanna Ricco, Rahul Sukthankar, Cordelia Schmid, Jitendra Malik

Focal Visual-Text Attention for Visual Question Answering

Junwei Liang, Lu Jiang, Liangliang Cao, Li-Jia Li, Alexander G. Hauptmann

Inferring Light Fields from Shadows

Manel Baradad, Vickie Ye, Adam Yedidia, Fredo Durand, William Freeman, Gregory Wornell, Antonio Torralba

Modifying Non-Local Variations Across Multiple Views

Tal Tlusty, Tomer Michaeli, Tali Dekel, Lihi Zelnik-Manor

Iterative Visual Reasoning Beyond Convolutions

Xinlei Chen, Li-jia Li, Fei-Fei Li, Abhinav Gupta

Unsupervised Training for 3D Morphable Model Regression

Kyle Genova, Forrester Cole, Aaron Maschinot, Daniel Vlasic, Aaron Sarna, William Freeman

Learning Transferable Architectures for Scalable Image Recognition

Barret Zoph, Vijay Vasudevan, Jonathon Shlens, Quoc Le

The iNaturalist Species Classification and Detection Dataset

Grant van Horn, Oisin Mac Aodha, Yang Song, Yin Cui, Chen Sun, Alex Shepard, Hartwig Adam, Pietro Perona, Serge Belongie

Learning Intrinsic Image Decomposition from Watching the World

Zhengqi Li, Noah Snavely

Learning Intelligent Dialogs for Bounding Box Annotation

Ksenia Konyushkova, Jasper Uijlings, Christoph Lampert, Vittorio Ferrari

Posters

Revisiting Knowledge Transfer for Training Object Class Detectors

Jasper Uijlings, Stefan Popov, Vittorio Ferrari

Rethinking the Faster R-CNN Architecture for Temporal Action Localization

Yu-Wei Chao, Sudheendra Vijayanarasimhan, Bryan Seybold, David Ross, Jia Deng, Rahul Sukthankar

Hierarchical Novelty Detection for Visual Object Recognition

Kibok Lee, Kimin Lee, Kyle Min, Yuting Zhang, Jinwoo Shin, Honglak Lee

COCO-Stuff: Thing and Stuff Classes in Context

Holger Caesar, Jasper Uijlings, Vittorio Ferrari

Appearance-and-Relation Networks for Video Classification

Limin Wang, Wei Li, Wen Li, Luc Van Gool

MorphNet: Fast & Simple Resource-Constrained Structure Learning of Deep Networks

Ariel Gordon, Elad Eban, Bo Chen, Ofir Nachum, Tien-Ju Yang, Edward Choi

Deformable Shape Completion with Graph Convolutional Autoencoders

Or Litany, Alex Bronstein, Michael Bronstein, Ameesh Makadia

MegaDepth: Learning Single-View Depth Prediction from Internet Photos

Zhengqi Li, Noah Snavely

Unsupervised Discovery of Object Landmarks as Structural Representations

Yuting Zhang, Yijie Guo, Yixin Jin, Yijun Luo, Zhiyuan He, Honglak Lee

Burst Denoising with Kernel Prediction Networks

Ben Mildenhall, Jiawen Chen, Jonathan Barron, Robert Carroll, Dillon Sharlet, Ren Ng

Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference

Benoit Jacob, Skirmantas Kligys, Bo Chen, Matthew Tang, Menglong Zhu, Andrew Howard, Dmitry Kalenichenko, Hartwig Adam

Pix3D: Dataset and Methods for Single-Image 3D Shape Modeling

Xingyuan Sun, Jiajun Wu, Xiuming Zhang, Zhoutong Zhang, Tianfan Xue, Joshua Tenenbaum, William Freeman

Sparse, Smart Contours to Represent and Edit Images

Tali Dekel, Dilip Krishnan, Chuang Gan, Ce Liu, William Freeman

MaskLab: Instance Segmentation by Refining Object Detection with Semantic and Direction Features

Liang-Chieh Chen, Alexander Hermans, George Papandreou, Florian Schroff, Peng Wang, Hartwig Adam

Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning

Yin Cui, Yang Song, Chen Sun, Andrew Howard, Serge Belongie

Improved Lossy Image Compression with Priming and Spatially Adaptive Bit Rates for Recurrent Networks

Nick Johnston, Damien Vincent, David Minnen, Michele Covell, Saurabh Singh, Sung Jin Hwang, George Toderici, Troy Chinen, Joel Shor

MobileNetV2: Inverted Residuals and Linear Bottlenecks

Mark Sandler, Andrew Howard, Menglong Zhu, Andrey Zhmoginov, Liang-Chieh Chen

ScanComplete: Large-Scale Scene Completion and Semantic Segmentation for 3D Scans

Angela Dai, Daniel Ritchie, Martin Bokeloh, Scott Reed, Juergen Sturm, Matthias Nießner

Sim2Real View Invariant Visual Servoing by Recurrent Control

Fereshteh Sadeghi, Alexander Toshev, Eric Jang, Sergey Levine

Alternating-Stereo VINS: Observability Analysis and Performance Evaluation

Mrinal Kanti Paul, Stergios Roumeliotis

Soccer on Your Tabletop

Konstantinos Rematas, Ira Kemelmacher, Brian Curless, Steve Seitz

Unsupervised Learning of Depth and Ego-Motion from Monocular Video Using 3D Geometric Constraints

Reza Mahjourian, Martin Wicke, Anelia Angelova

AVA: A Video Dataset of Spatio-temporally Localized Atomic Visual Actions

Chunhui Gu, Chen Sun, David Ross, Carl Vondrick, Caroline Pantofaru, Yeqing Li, Sudheendra Vijayanarasimhan, George Toderici, Susanna Ricco, Rahul Sukthankar, Cordelia Schmid, Jitendra Malik

Inferring Light Fields from Shadows

Manel Baradad, Vickie Ye, Adam Yedidia, Fredo Durand, William Freeman, Gregory Wornell, Antonio Torralba

Modifying Non-Local Variations Across Multiple Views

Tal Tlusty, Tomer Michaeli, Tali Dekel, Lihi Zelnik-Manor

Aperture Supervision for Monocular Depth Estimation

Pratul Srinivasan, Rahul Garg, Neal Wadhwa, Ren Ng, Jonathan Barron

Instance Embedding Transfer to Unsupervised Video Object Segmentation

Siyang Li, Bryan Seybold, Alexey Vorobyov, Alireza Fathi, Qin Huang, C.-C. Jay Kuo

Frame-Recurrent Video Super-Resolution

Mehdi S. M. Sajjadi, Raviteja Vemulapalli, Matthew Brown

Weakly Supervised Action Localization by Sparse Temporal Pooling Network

Phuc Nguyen, Ting Liu, Gautam Prasad, Bohyung Han

Iterative Visual Reasoning Beyond Convolutions

Xinlei Chen, Li-jia Li, Fei-Fei Li, Abhinav Gupta

Learning and Using the Arrow of Time

Donglai Wei, Andrew Zisserman, William Freeman, Joseph Lim

HydraNets: Specialized Dynamic Architectures for Efficient Inference

Ravi Teja Mullapudi, Noam Shazeer, William Mark, Kayvon Fatahalian

Thoracic Disease Identification and Localization with Limited Supervision

Zhe Li, Chong Wang, Mei Han, Yuan Xue, Wei Wei, Li-jia Li, Fei-Fei Li

Inferring Semantic Layout for Hierarchical Text-to-Image Synthesis

Seunghoon Hong, Dingdong Yang, Jongwook Choi, Honglak Lee

Deep Semantic Face Deblurring

Ziyi Shen, Wei-Sheng Lai, Tingfa Xu, Jan Kautz, Ming-Hsuan Yang

Unsupervised Training for 3D Morphable Model Regression

Kyle Genova, Forrester Cole, Aaron Maschinot, Daniel Vlasic, Aaron Sarna, William Freeman

Learning Transferable Architectures for Scalable Image Recognition

Barret Zoph, Vijay Vasudevan, Jonathon Shlens, Quoc Le

Learning Intrinsic Image Decomposition from Watching the World

Zhengqi Li, Noah Snavely

PiCANet: Learning Pixel-wise Contextual Attention for Saliency Detection

Nian Liu, Junwei Han, Ming-Hsuan Yang

Tutorials

Computer Vision for Robotics and Driving

Anelia Angelova, Sanja Fidler

Unsupervised Visual Learning

Pierre Sermanet, Anelia Angelova

UltraFast 3D Sensing, Reconstruction and Understanding of People, Objects and Environments

Sean Fanello, Julien Valentin, Jonathan Taylor, Christoph Rhemann, Adarsh Kowdle, Jürgen Sturm, Christine Kaeser-Chen, Pavel Pidlypenskyi, Rohit Pandey, Andrea Tagliasacchi, Sameh Khamis, David Kim, Mingsong Dou, Kaiwen Guo, Danhang Tang, Shahram Izadi

Generative Adversarial Networks

Jun-Yan Zhu, Taesung Park, Mihaela Rosca, Phillip Isola, Ian Goodfellow

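Ian Goodfellow: Generative Adversarial Networks (35 slides)

Generative modeling: density estimation

Training data, density function

Generative modeling: sample generation

Training data (CelebA), sample generation

The adversarial network framework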

Self-Attention GAN

State-of-the-art FID on ImageNet: 1000 classes, 128x128 pixels

Self-Play

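What can you do with GANs?

Simulated environments and training data

Missing data

Semi-supervised learning

Multiple correct answers

Realistic generation tasks

Model-based optimization

Automated customization

Domain adaptation

Autonomous driving dataset

GANs for simulated training data

GANs for missing data

What can you see in this image?

A GAN model reveals that it is a face

GANs for semi-supervised learning

Supervised discriminator for semi-supervised learning

Semi-supervised classification

MNIST: 100 training labels -> 80 test errors

SVHN: 1000 training labels -> 4.3% test error

CIFAR-10: 4000 labels -> 14.4% test error

GANs for next-frame video prediction

GANs for realistic generation tasks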

iGAN

Image-to-image translation

Unsupervised image-to-image translation

CycleGAN

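Text-to-image synthesis

GANs for model-based optimization

Designing DNA to optimize protein binding

GANs for automated customization

Personalized GANufacturing

GANs for domain adaptation

Domain-adversarial networks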

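Some tips and tricks for GANs

Use spectral normalization (Miyato et al 2017) in both the discriminator and the generator (Zhang et al 2018)

Use different learning rates for the generator and the discriminator (Heusel et al 2017)

There is no need to run the discriminator more often than the generator (Zhang et al 2018)

Many different loss functions all work well (Lucic et al 2017); spend more time tuning hyperparameters rather than trying different loss functions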
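To make the first tip above concrete, here is a minimal NumPy sketch of spectral normalization: a weight matrix is divided by an estimate of its largest singular value, obtained by power iteration. This is an illustration under stated assumptions, not code from the slides. The function name `spectral_normalize` and its parameters are hypothetical, and the example runs many power-iteration steps from scratch for clarity, whereas training implementations typically run a single step per update and carry the vector `u` over between updates.

```python
import numpy as np


def spectral_normalize(weight, n_iters=50, eps=1e-12, seed=0):
    """Divide `weight` by a power-iteration estimate of its spectral norm.

    Hypothetical helper for illustration only. Returns the normalized
    weight (same shape as the input) and the estimated largest singular value.
    """
    rng = np.random.default_rng(seed)
    # Treat conv-style kernels as a 2-D matrix: (out_channels, everything else).
    w = weight.reshape(weight.shape[0], -1)

    # Random unit vector as the starting guess for the left singular vector.
    u = rng.standard_normal(w.shape[0])
    u /= np.linalg.norm(u) + eps

    for _ in range(n_iters):
        # Alternate between right (v) and left (u) singular vector estimates.
        v = w.T @ u
        v /= np.linalg.norm(v) + eps
        u = w @ v
        u /= np.linalg.norm(u) + eps

    # With converged unit vectors, u^T W v approximates the top singular value.
    sigma = u @ w @ v
    return weight / sigma, sigma


# Usage: normalize a random "discriminator layer" weight matrix.
layer_weight = np.random.default_rng(42).standard_normal((16, 8))
normalized_weight, sigma_estimate = spectral_normalize(layer_weight)
```

After normalization the layer's largest singular value is approximately 1, which bounds how much the layer can stretch its input. The same function could be applied to generator weights as well, in line with the tip attributed to Zhang et al 2018.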

Links: https://ai.googleblog.com

http://www.iangoodfellow.com/slides/2018-06-18.pdf
