(env36) [scfan@fdm tools]$ dask-scheduler
distributed.scheduler - INFO - -----------------------------------------------
distributed.dashboard.proxy - INFO - To route to workers diagnostics web server please install jupyter-server-proxy: pip install jupyter-server-proxy
distributed.scheduler - INFO - Local Directory: /tmp/scheduler-bdk4b7li
distributed.scheduler - INFO - -----------------------------------------------
distributed.scheduler - INFO - Clear task state
distributed.scheduler - INFO - Scheduler at: tcp://10.0.2.14:8786
distributed.scheduler - INFO - dashboard at: :8787
distributed.scheduler - INFO - Register tcp://10.0.2.14:30547
distributed.scheduler - INFO - Starting worker compute stream, tcp://10.0.2.14:30547
distributed.core - INFO - Starting established connection
distributed.scheduler - INFO - Register tcp://10.0.2.14:9190
distributed.scheduler - INFO - Starting worker compute stream, tcp://10.0.2.14:9190
distributed.core - INFO - Starting established connection
Dask-Scheduler dashboard
Dask-Worker
Starting a Worker
(env36) [scfan@fdm tools]$ dask-worker 10.0.2.14:8786
distributed.nanny - INFO - Start Nanny at: 'tcp://10.0.2.14:12075'
distributed.diskutils - INFO - Found stale lock file and directory '/home/scfan/project/FISAMS/branches/branch_scfan/src/server/fdm/tools/worker-yyz2l21f', purging
distributed.dashboard.proxy - INFO - To route to workers diagnostics web server please install jupyter-server-proxy: pip install jupyter-server-proxy
distributed.worker - INFO - Start worker at: tcp://10.0.2.14:17181
distributed.worker - INFO - Listening to: tcp://10.0.2.14:17181
distributed.worker - INFO - dashboard at: 10.0.2.14:36300
distributed.worker - INFO - Waiting to connect to: tcp://10.0.2.14:8786
distributed.worker - INFO - -------------------------------------------------
distributed.worker - INFO - Threads: 4
distributed.worker - INFO - Memory: 10.32 GB
distributed.worker - INFO - Local Directory: /home/scfan/project/FISAMS/branches/branch_scfan/src/server/fdm/tools/worker-5304u4tp
distributed.worker - INFO - -------------------------------------------------
distributed.worker - INFO - Registered to: tcp://10.0.2.14:8786
distributed.worker - INFO - -------------------------------------------------
distributed.core - INFO - Starting established connection
Dask-Worker dashboard
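Once a scheduler and at least one worker are up, a client can submit work to them. The sketch below is only an illustration: `square` and `run_demo` are hypothetical names, not part of Dask. So that it runs without the machines shown in the logs, it falls back to an in-process `LocalCluster` when no address is given. Passing `"tcp://10.0.2.14:8786"` (the scheduler address from the log above) should attach to the running scheduler instead, assuming it is reachable.

```python
from dask.distributed import Client, LocalCluster


def square(x):
    """Toy task used only for this illustration."""
    return x * x


def run_demo(address=None):
    """Submit a few tasks and return their results as a list.

    address=None starts a throwaway in-process cluster (an assumption made
    so the sketch is self-contained); otherwise connect to an existing
    scheduler such as "tcp://10.0.2.14:8786".
    """
    cluster = None
    if address is None:
        cluster = LocalCluster(
            n_workers=1,
            threads_per_worker=2,
            processes=False,         # keep scheduler and worker in this process
            dashboard_address=None,  # do not start a dashboard for the demo
        )
        client = Client(cluster)
    else:
        client = Client(address)
    try:
        futures = client.map(square, range(5))  # one task per input value
        return client.gather(futures)           # block until all results arrive
    finally:
        client.close()
        if cluster is not None:
            cluster.close()


if __name__ == "__main__":
    print(run_demo())
```

While tasks run against a real scheduler, they should show up in the scheduler dashboard on port 8787.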
Dask comparison
Dask drawbacks
dataframe
Dask DataFrame has no SQL query support; data can still be loaded from a SQL database with dask.dataframe.read_sql_table
Supported data formats
Tabular: Parquet, ORC, CSV, Line Delimited JSON, Avro, text