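Two-Stream Inflated 3D ConvNet (I3D)
Results: HMDB-51: 80.9%, UCF-101: 98.0%. Built on Inception-v1 and pre-trained on Kinetics.

ConvNet+LSTM: either extract features from every frame and pool them over the whole video, or extract per-frame features and feed them to an LSTM. Drawback: temporal information is ignored, so actions that differ only in direction, such as opening and closing a door, get confused.

C3D (the improvement): has more parameters than a 2D ConvNet. Drawbacks: the parameter count is large, ImageNet pre-training cannot be used, and the network is hard to train from scratch. Input: 16 frames.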
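
The two drawbacks in these notes can be illustrated with a small, self-contained sketch. It is not code from I3D or C3D. The 1-D "door angle" feature, the helper names `mean_pool` and `conv_weight_count`, and the 64-channel layer size are all assumptions chosen for illustration. The first part shows that averaging per-frame features over the video gives the same result for a clip and its time-reversed copy, which is one way to see why open and close door get confused. The second part counts the weights of a single 3x3 2D convolution against a 3x3x3 3D convolution with the same channels.

```python
def mean_pool(frame_features):
    """Average per-frame feature vectors over the time axis."""
    n = len(frame_features)
    dim = len(frame_features[0])
    return [sum(f[d] for f in frame_features) / n for d in range(dim)]


def conv_weight_count(in_ch, out_ch, kernel):
    """Number of weights (bias excluded) in one conv layer.

    `kernel` is a tuple of kernel sizes: (3, 3) for 2D, (3, 3, 3) for 3D.
    """
    count = in_ch * out_ch
    for k in kernel:
        count *= k
    return count


# Hypothetical per-frame feature: the door angle in degrees (illustrative only).
opening = [[0.0], [30.0], [60.0], [90.0]]
closing = list(reversed(opening))  # same frames, opposite order

pooled_open = mean_pool(opening)
pooled_close = mean_pool(closing)
print(pooled_open, pooled_close)  # [45.0] [45.0]: frame order is lost

# Assumed layer size: 64 input and 64 output channels.
params_2d = conv_weight_count(64, 64, (3, 3))
params_3d = conv_weight_count(64, 64, (3, 3, 3))
print(params_2d, params_3d)  # 36864 110592: the 3D kernel has 3x the weights
```

The pooled features are identical for both clips, so any classifier on top of them cannot separate the two actions. The weight count also hints at why ImageNet weights do not drop into C3D directly: a 2D kernel has no time dimension, so its shape does not match the 3D kernel.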