SparkFlow: A Library for Running TensorFlow on Spark

Knowledge, 04-19

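SparkFlow is an implementation of TensorFlow on Spark that aims to provide a simple, easy-to-understand interface for using TensorFlow on Spark. With SparkFlow, developers can easily integrate deep learning models into a Spark ML Pipeline. SparkFlow trains TensorFlow networks in a distributed fashion using a parameter server, and its API lets users choose the training style: Hogwild (lock-free) or asynchronous with locking.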

Why use SparkFlow

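Although many libraries can run TensorFlow on Apache Spark, SparkFlow's goal is to work with ML Pipelines, provide a simple interface for training TensorFlow graphs, and offer basic abstractions for rapid development. For training, SparkFlow uses a parameter server that lives on the driver and allows asynchronous training. This provides faster training times when training on large datasets.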
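The parameter-server idea described above, together with the two update styles (Hogwild versus locked), can be illustrated with a small toy. This is only a sketch under loud assumptions: `ToyParameterServer`, `worker`, and `train` are hypothetical names that are not part of SparkFlow, threads in one process stand in for Spark executors, and a two-parameter linear model stands in for a TensorFlow graph.

```python
import random
import threading


class ToyParameterServer:
    """Holds the shared weights; workers pull them and push gradients back."""

    def __init__(self, n_weights, learning_rate, use_lock):
        self.weights = [0.0] * n_weights
        self.learning_rate = learning_rate
        # use_lock=True serializes updates; use_lock=False is Hogwild-style.
        self._lock = threading.Lock() if use_lock else None

    def pull(self):
        # A copy of the current weights; it may be stale by the time it is used.
        return list(self.weights)

    def push(self, gradient):
        if self._lock is None:
            self._apply(gradient)      # lock-free: updates may interleave
        else:
            with self._lock:
                self._apply(gradient)  # one update at a time

    def _apply(self, gradient):
        for i, g in enumerate(gradient):
            self.weights[i] -= self.learning_rate * g


def worker(server, partition, steps):
    """Asynchronous worker: it never waits for the other workers."""
    for step in range(steps):
        x, y = partition[step % len(partition)]
        w, b = server.pull()
        error = (w * x + b) - y
        # Gradient of the squared error with respect to (w, b).
        server.push([2.0 * error * x, 2.0 * error])


def train(use_lock, n_workers=4, steps=2000, seed=0):
    rng = random.Random(seed)
    # Synthetic data for y = 3x + 1, split into one partition per worker.
    xs = [rng.uniform(-1.0, 1.0) for _ in range(200)]
    data = [(x, 3.0 * x + 1.0) for x in xs]
    partitions = [data[i::n_workers] for i in range(n_workers)]
    server = ToyParameterServer(2, learning_rate=0.05, use_lock=use_lock)
    threads = [
        threading.Thread(target=worker, args=(server, part, steps))
        for part in partitions
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return server.weights
```

Calling `train(use_lock=True)` and `train(use_lock=False)` should both end up close to w = 3 and b = 1. The locked variant trades some throughput for updates that never overlap, while the Hogwild variant accepts occasional overlapping writes in exchange for never blocking.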

GitHub:

https://github.com/lifeomic/sparkflow

Installation

Install via pip: pip install sparkflow

Requirements: Apache Spark version >= 2.0, with TensorFlow installed

Example

A simple MNIST deep learning example:

from sparkflow.graph_utils import build_graph
from sparkflow.tensorflow_async import SparkAsyncDL
import tensorflow as tf
from pyspark.ml.feature import VectorAssembler, OneHotEncoder
from pyspark.ml.pipeline import Pipeline
from pyspark.sql import SparkSession

# Reuse the active SparkSession (e.g. in the pyspark shell) or create one
spark = SparkSession.builder.getOrCreate()


# Simple TensorFlow network (TensorFlow 1.x API)
def small_model():
    x = tf.placeholder(tf.float32, shape=[None, 784], name="x")
    y = tf.placeholder(tf.float32, shape=[None, 10], name="y")
    layer1 = tf.layers.dense(x, 256, activation=tf.nn.relu)
    layer2 = tf.layers.dense(layer1, 256, activation=tf.nn.relu)
    out = tf.layers.dense(layer2, 10)
    # Named "out" so it can be referenced below as tfOutput="out:0"
    z = tf.argmax(out, 1, name="out")
    loss = tf.losses.softmax_cross_entropy(y, out)
    return loss


# Column _c0 is used as the label; columns 1..784 are used as features
df = spark.read.option("inferSchema", "true").csv("mnist_train.csv")
mg = build_graph(small_model)

# Assemble and one-hot encode
va = VectorAssembler(inputCols=df.columns[1:785], outputCol="features")
encoded = OneHotEncoder(inputCol="_c0", outputCol="labels", dropLast=False)

spark_model = SparkAsyncDL(
    inputCol="features",
    tensorflowGraph=mg,
    tfInput="x:0",
    tfLabel="y:0",
    tfOutput="out:0",
    tfLearningRate=.001,
    iters=1,
    predictionCol="predicted",
    labelCol="labels",
    verbose=1
)

# Fit the whole pipeline, then save the fitted model
p = Pipeline(stages=[va, encoded, spark_model]).fit(df)
p.write().overwrite().save("location")
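To make the two preprocessing stages in the example more concrete, here is a plain-Python sketch of what they do to a single CSV row. The helpers `assemble_features` and `one_hot` are hypothetical stand-ins written for illustration (they are not Spark or SparkFlow functions), and the sample row is made up.

```python
def assemble_features(row):
    """Mimics VectorAssembler over columns 1..784: gather the pixels into one vector."""
    return [float(v) for v in row[1:785]]


def one_hot(label, num_classes=10):
    """Mimics OneHotEncoder with dropLast=False: keep one slot for every class."""
    vec = [0.0] * num_classes
    vec[int(label)] = 1.0
    return vec


# Hypothetical CSV row: label 7 followed by 784 pixel values (all zero here).
row = [7] + [0] * 784
features = assemble_features(row)
labels = one_hot(row[0])

print(len(features))  # 784
print(labels)         # [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
```

The `dropLast=False` setting matters here: Spark's OneHotEncoder drops the last category by default, which would give a 9-element label vector that no longer matches the `[None, 10]` shape of the `y` placeholder.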
