Soft Q-Learning


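GitHub project deployment

The newer soft q learning (sql) branch replaces rllab with garage. rllab is recommended here, which means using the sql branch identified as 9634. The Git command for switching branches is: git checkout.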
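The switch described above could be wrapped as follows. This is only a sketch: the function name `checkout_revision` and the clone directory `./softqlearning` in the usage comment are hypothetical, and `9634` is simply the identifier given in the text.

```shell
# Sketch: switch an existing local clone to a given branch or commit.
# Assumption: "$1" is the path to the clone, "$2" is the revision to check out.
checkout_revision() {
    repo_dir="$1"
    revision="$2"
    # Refuse to continue if the directory is not a Git work tree.
    git -C "$repo_dir" rev-parse --is-inside-work-tree >/dev/null 2>&1 || {
        echo "not a git repository: $repo_dir" >&2
        return 1
    }
    git -C "$repo_dir" checkout "$revision"
}

# Example (hypothetical clone directory):
# checkout_revision ./softqlearning 9634
```

Checking the work tree first keeps the command from silently acting on an unrelated repository when the path is mistyped.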


Running the sql algorithm


source activate sql
cd rllab/
export PYTHONPATH=$(pwd):${PYTHONPATH}
cd ..
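A slightly more defensive variant of the PYTHONPATH step is sketched below. It assumes the same layout as above (an rllab/ directory inside the current directory); the helper name `prepend_pythonpath` is hypothetical. Unlike the plain export, it avoids duplicate entries and does not leave a trailing colon when PYTHONPATH starts out empty.

```shell
# Sketch: prepend a directory to PYTHONPATH only if it is not already listed.
prepend_pythonpath() {
    dir="$1"
    case ":${PYTHONPATH:-}:" in
        *":$dir:"*) ;;  # already present, leave PYTHONPATH unchanged
        *) PYTHONPATH="$dir${PYTHONPATH:+:$PYTHONPATH}" ;;
    esac
    export PYTHONPATH
}

# Example, run from the directory that contains rllab/:
# prepend_pythonpath "$(pwd)/rllab"
# python -c "import rllab"   # quick check that the package can be found
```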


Output of the sql algorithm


2019-11-15 22:26:15.343800 CST | return-average             5.70493
2019-11-15 22:26:15.343866 CST | return-min                 5.70493
2019-11-15 22:26:15.343924 CST | return-max                 5.70493
2019-11-15 22:26:15.343980 CST | return-std                 0
2019-11-15 22:26:15.344034 CST | episode-length-avg      1000
2019-11-15 22:26:15.344085 CST | episode-length-min      1000
2019-11-15 22:26:15.344137 CST | episode-length-max      1000
2019-11-15 22:26:15.344188 CST | episode-length-std         0
2019-11-15 22:26:15.344239 CST | AverageForwardProgress     0.373091
2019-11-15 22:26:15.344291 CST | MaxForwardProgress         0.373091
2019-11-15 22:26:15.344343 CST | MinForwardProgress         0.373091
2019-11-15 22:26:15.344393 CST | StdForwardProgress         0
2019-11-15 22:26:15.344445 CST | qf-avg                    15.8138
2019-11-15 22:26:15.344496 CST | qf-std                     8.811
2019-11-15 22:26:15.344546 CST | mean-sq-bellman-error      0.580646
2019-11-15 22:26:15.344597 CST | time-train                22.2653
2019-11-15 22:26:15.344647 CST | time-eval                  1.7079
2019-11-15 22:26:15.344698 CST | time-sample                1.23221
2019-11-15 22:26:15.344748 CST | time-total               175.816
2019-11-15 22:26:15.344799 CST | epoch                      7
2019-11-15 22:26:15.344850 CST | max-path-return           17.365
2019-11-15 22:26:15.344900 CST | last-path-return           1.82404
2019-11-15 22:26:15.344951 CST | pool-size               8000
2019-11-15 22:26:15.345002 CST | episodes                   8
2019-11-15 22:26:15.345053 CST | total-samples           8000




pip installation error

Collecting cma==1.1.06
  Using cached https://files.pythonhosted.org/packages/c5/92/00c83a5e76b8941426b2bab66a035870b94fd83bb42db3d083d273a7fbdb/cma-1.1.06.tar.gz
Could not import setuptools which is required to install from a source distribution.
Traceback (most recent call last):
  File "/home/inksci/miniconda3/envs/sql4/lib/python3.6/site-packages/pip/req/req_install.py", line 387, in setup_py
    import setuptools  # noqa
  File "/home/inksci/miniconda3/envs/sql4/lib/python3.6/site-packages/setuptools/__init__.py", line 13, in <module>
    from ._deprecation_warning import SetuptoolsDeprecationWarning
ModuleNotFoundError: No module named 'setuptools._deprecation_warning'


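Solution:

Use pip install --upgrade to install the latest versions of pip, setuptools, and wheel, for example: pip install --upgrade pip setuptools wheel.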
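The upgrade can be sketched as a small helper. It assumes that `python` on the PATH is the interpreter of the affected environment (sql4 in the traceback above). The function name and the `DRY_RUN` variable are hypothetical and only serve to preview the command without running it.

```shell
# Sketch: upgrade the packaging tools inside the active environment.
# Assumption: `python` resolves to the interpreter of the environment to repair.
upgrade_packaging_tools() {
    cmd="python -m pip install --upgrade pip setuptools wheel"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$cmd"      # only show what would be run
        return 0
    fi
    # Run the upgrade, then confirm that setuptools can be imported again.
    $cmd && python -c "import setuptools; print(setuptools.__version__)"
}

# Example:
# DRY_RUN=1 upgrade_packaging_tools   # preview the command
# upgrade_packaging_tools             # perform the upgrade
```

Calling pip through `python -m pip` helps ensure the upgrade lands in the same environment whose setuptools is broken, rather than in another pip found on the PATH.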


墨之科技, all rights reserved © Copyright 2017-2027

湘ICP备14012786号     Email: ai@inksci.com