ELU:
 
Gradient descent optimization methods:
 
  1. GradientDescentOptimizer
    This one is sensitive to the problem: you can run into anything from getting stuck in saddle points to oscillating around the minimum and slow convergence. I found it useful for Word2Vec, CBOW and feed-forward architectures in general, but Momentum is also good.
  2. AdadeltaOptimizer 
    Adadelta addresses the issues of using a constant or linearly decaying learning rate. For recurrent networks it is among the fastest.
  3. MomentumOptimizer
    If you are training a regression and find your loss function oscillating, switching from SGD to Momentum may be the right solution.
  4. AdamOptimizer
    Adaptive momentum in addition to the Adadelta features.
  5. FtrlOptimizer
    I haven’t used it myself, but from the paper I see that it’s better suited for online learning on large sparse datasets, like recommendation systems.
  6. RMSPropOptimizer
    This is a variant of Adadelta that serves the same purpose: dynamic decay of the learning-rate multiplier.
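The update rules behind these optimizers can be sketched in plain NumPy on a toy quadratic; the hyperparameter values below are illustrative assumptions, not TensorFlow's defaults:

```python
import numpy as np

def sgd_step(w, grad, lr=0.1):
    # Plain gradient descent: step directly against the gradient.
    return w - lr * grad

def momentum_step(w, v, grad, lr=0.1, mu=0.9):
    # Momentum accumulates a velocity, which damps oscillation.
    v = mu * v - lr * grad
    return w + v, v

def rmsprop_step(w, cache, grad, lr=0.05, decay=0.9, eps=1e-8):
    # RMSProp keeps a decaying average of squared gradients and
    # normalizes the step by its square root.
    cache = decay * cache + (1 - decay) * grad ** 2
    return w - lr * grad / (np.sqrt(cache) + eps), cache

# Minimize f(w) = w^2 (gradient 2w) starting from w = 5.
w_s = 5.0
w_m, v = 5.0, 0.0
w_r, c = 5.0, 0.0
for _ in range(300):
    w_s = sgd_step(w_s, 2 * w_s)
    w_m, v = momentum_step(w_m, v, 2 * w_m)
    w_r, c = rmsprop_step(w_r, c, 2 * w_r)
```

Note that RMSProp's normalized steps keep a roughly constant size near the minimum, so without an additional learning-rate decay it settles into a small oscillation rather than converging as tightly as the other two.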
 
Some tricky aspects of CNNs:
Summary:
1. Parameter initialization suited to ReLU: w = np.random.randn(n) * sqrt(2.0/n) # current recommendation
2. LR: in practice, if you see that you have stopped making progress on the validation set, divide the LR by 2 (or by 5) and keep going, which might give you a surprise. Personally verified: this works.
3. On the learning rate:
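The ReLU-friendly initialization in point 1 (He initialization) can be checked numerically: it keeps the activation variance roughly constant across layers, whereas a naive small-constant scale makes activations collapse toward zero. A minimal sketch (the layer width 512 and depth 10 are arbitrary choices for the demo):

```python
import numpy as np

np.random.seed(0)

def relu_layer(x, n_in, n_out):
    # He initialization: scale weights by sqrt(2/n_in) so ReLU
    # outputs keep roughly unit variance layer after layer.
    w = np.random.randn(n_in, n_out) * np.sqrt(2.0 / n_in)
    return np.maximum(0.0, x @ w)

x = np.random.randn(1000, 512)

h = x
for _ in range(10):
    h = relu_layer(h, 512, 512)   # std stays near 1

h2 = x
for _ in range(10):
    w = np.random.randn(512, 512) * 0.01  # naive small init
    h2 = np.maximum(0.0, h2 @ w)          # activations shrink toward 0
```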
RNN learning:
 
FCN: http://blog.csdn.net/happyer88/article/details/47205839 (notes on "Fully Convolutional Networks for Semantic Segmentation")
Advantages:
1. Training an end-to-end FCN model exploits the strong learning capacity of convolutional networks and gives fairly accurate results; earlier CNN-based methods had to pre-process the input or post-process the output to reach a final result.

2. Existing CNN backbones such as AlexNet, VGG16 or GoogLeNet can be used directly; only an upsampling stage is appended at the end, and the parameters are still learned through the CNN's ordinary backpropagation: "whole image training is effective and efficient."

3. The input image size is unconstrained, and the images in a dataset need not all share the same size; the final upsampling just scales back by the ratio by which the original image was subsampled, so the network always outputs a dense prediction map the same size as the original image.
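The size restoration in point 3 can be sketched in NumPy: a score map subsampled by a factor s is upsampled back by the same factor. Nearest-neighbor repetition is used here for brevity; the actual FCN uses a learnable deconvolution layer, typically initialized to bilinear interpolation.

```python
import numpy as np

def upsample_nearest(fmap, s):
    # Repeat each spatial element s times along height and width,
    # restoring an (H/s, W/s) map to (H, W).
    return fmap.repeat(s, axis=0).repeat(s, axis=1)

# A 4x4 score map produced after subsampling a 32x32 image by 8.
scores = np.random.randn(4, 4)
dense = upsample_nearest(scores, 8)
print(dense.shape)  # (32, 32), matching the original image size
```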
 
Good articles for understanding the details of DL:
 
 
If you run into the situation where all the final outputs are identical, possible fixes are as follows:
Hey, I had a similar issue with my own (hand-coded) CNN trying to get some results with the CIFAR-10 dataset. What I found was that I had forgotten to normalize the input images to some range that made sense with my weight scales. Try something like X = X / max(abs(X)) to put values between -1 and 1.
Another possibility is your weight initialization is causing many ReLU units to die. I usually initialize all weights with a small number times a normal Gaussian distribution. For wx+ b, b being the biases, you can try that + a small positive constant. I.e. b = weight_scale*random.randn(num, 1) + 0.1
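The first two suggestions can be sketched together: scale the inputs to match the weight scale, and give the biases a small positive offset so ReLU units start active rather than dead. The image shape and layer width below are arbitrary assumptions for the demo.

```python
import numpy as np

np.random.seed(1)

# Scale raw pixel inputs into [0, 1] so they match the weight scale.
X = np.random.randint(0, 256, size=(5, 3, 32, 32)).astype(float)
X = X / np.max(np.abs(X))

# Small-Gaussian weights, plus a slightly positive bias so that
# pre-activations start mostly above zero (fewer dead ReLUs).
weight_scale = 1e-2
num = 100
W = weight_scale * np.random.randn(num, 3 * 32 * 32)
b = weight_scale * np.random.randn(num, 1) + 0.1

h = np.maximum(0.0, W @ X.reshape(5, -1).T + b)  # shape (num, 5)
active_fraction = float(np.mean(h > 0))
```

With the positive bias offset, well over a third of the units fire on the first forward pass; with `b = 0` and unnormalized inputs, that fraction can be far lower.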
Another idea — your sigmoid unit might be squashing your responses too much. They’re fairly uncommon in CNNs from what I understand, maybe just stick to ReLUs.
Last point — try testing on a small training batch (say 10–20 images) and just train until you overfit with 100% accuracy. That’s one way of knowing that your network is capable of doing something. I think these smaller tests are very important before investing hours or days into proper training, which is what these networks often require.
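The overfit-a-tiny-batch sanity check can be illustrated with a stand-in model; a logistic-regression "network" on 10 linearly separable toy points is used here (the data, learning rate, and step count are all arbitrary assumptions), trained until it reaches 100% training accuracy:

```python
import numpy as np

np.random.seed(0)

# 10 linearly separable "images" (flattened to 20 features); the
# label is simply the sign of the first feature.
X = np.random.randn(10, 20)
y = (X[:, 0] > 0).astype(float)

w = np.zeros(20)
b = 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # sigmoid outputs
    g = p - y                               # cross-entropy gradient w.r.t. logits
    w -= 0.5 * (X.T @ g) / len(y)
    b -= 0.5 * g.mean()

p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
train_acc = float(np.mean((p > 0.5) == (y == 1)))
```

If a model cannot drive training accuracy to 100% on a handful of examples, there is a bug or a scaling problem worth finding before any long training run.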
My eventual fix was to add batch normalization, though I never pinned down the exact cause.
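For reference, the batch-norm forward pass (training mode) is just a per-feature normalization over the batch followed by a learnable scale and shift; a minimal NumPy sketch:

```python
import numpy as np

def batchnorm_forward(x, gamma, beta, eps=1e-5):
    # Normalize each feature over the batch, then rescale (gamma)
    # and shift (beta); gamma and beta are learnable parameters.
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mu) / np.sqrt(var + eps)
    return gamma * x_hat + beta

np.random.seed(0)
x = 50.0 + 10.0 * np.random.randn(64, 8)  # badly scaled activations
out = batchnorm_forward(x, np.ones(8), np.zeros(8))
# out now has roughly zero mean and unit variance per feature,
# regardless of the input's original scale and offset.
```

At test time the batch statistics are replaced by running averages accumulated during training, which this sketch omits.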
 
 
GAN resources:
