December 23, 2016

Looking at the results of the Service Fabric Performance and Scalability samples

This is day 24 of the Azure Advent Calendar. This time I ran the Service Fabric Performance and Scalability samples and want to take a look at the results.

Service Fabric Performance and Scalability samples

This is one of the Service Fabric samples, and it came out recently. It can measure read and write times for both Reliable Dictionary and Actors. This time I measure Reliable Dictionary.

The measurement flows like this: the client sends a measurement request to a service called LoadDriver, LoadDriver runs the measurement by talking to the StatefulService, and the client receives the result.

What I found interesting in this sample is that the FabricClient talks to the service using WcfCommunicationListener. When I first heard of WcfCommunicationListener I thought "WCF, at this point...?", but I see, so this is how you use it.

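Measurement environment

I measured in two environments, local and cloud, with the local one in two partition configurations.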

	SF Version	CPU	Memory	Nodes	Machines	Partitions
Local1	5.4.145.9494	Intel® Core™ i5-6500 @ 3.2 GHz	16GB	5	1	1
Local2	5.4.145.9494	Intel® Core™ i5-6500 @ 3.2 GHz	16GB	5	1	4
Cloud	5.4.145.9494	Intel® Xeon™ E5-2673 v3 @ 2.4 GHz	4GB	5	5	6

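The Cloud cluster uses the recently released A2_v2 instances. Locally, I measured with two Reliable Dictionary partition counts: 1, and 4 to match the number of cores (I was bothered that the CPU stays pinned at 100% once a measurement starts).

Cloud is left at the default of 6.

The parameters were set as follows.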

Parameter	Value	Description
NumWriteOperationsTotal	262144	Total number of write operations to be performed on the service.
NumOutstandingWriteOperations	64	Number of write operations sent to the service at any given time.
NumReadOperationsTotal	524288	Total number of read operations to be performed on the service.
NumOutstandingReadOperations	16	Number of read operations sent to the service at any given time.
OperationDataSizeInBytes	1024	Size in bytes of the data associated with each operation (i.e. read or write) performed on the service.
NumItems	2048	Number of items (e.g. number of rows in a table) that the operations are distributed across in the service.
NumClients	1	Number of clients used to perform the operations on the service.

When NumOutstandingWriteOperations is 64, 64 tasks are created and the operation is executed NumWriteOperationsTotal times in total, concurrently. The same goes for reads.
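To make the "outstanding operations" idea concrete, here is a minimal Python sketch (the sample itself is not written in Python). It assumes the total number of operations is shared among the concurrent tasks, which is my reading of the parameter descriptions rather than something checked against the sample's source. `fake_write`, `run_load` and the in-memory `store` are hypothetical stand-ins, not part of the sample or of any Service Fabric API.

```python
import asyncio
import time


async def fake_write(store: dict, key: int, payload: bytes) -> None:
    """Hypothetical stand-in for a single write against the service."""
    await asyncio.sleep(0)  # yield to the event loop, as a real I/O call would
    store[key] = payload


async def run_load(total_ops: int, outstanding: int, num_items: int, data_size: int):
    """Run total_ops writes using `outstanding` concurrent worker tasks."""
    store: dict = {}
    payload = bytes(data_size)        # data_size zero bytes per operation
    pending = iter(range(total_ops))  # shared iterator: each operation index is taken once

    async def worker() -> int:
        done = 0
        for op in pending:
            # Spread the operations across num_items keys.
            await fake_write(store, op % num_items, payload)
            done += 1
        return done

    started = time.perf_counter()
    per_worker = await asyncio.gather(*(worker() for _ in range(outstanding)))
    elapsed = time.perf_counter() - started
    return sum(per_worker), len(store), elapsed


# Scaled-down run: 4096 writes, 64 outstanding, 2048 items, 1024-byte payloads.
done, items, elapsed = asyncio.run(run_load(4096, 64, 2048, 1024))
print(done, items, f"{elapsed:.3f}s")
```

With a real service behind `fake_write`, the number of requests in flight at any moment is capped by the number of worker tasks, which is what NumOutstandingWriteOperations and NumOutstandingReadOperations control.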

Results

	Test Case	Time	Operations	Avg. Operations [ops/sec]	Avg. Operation Latency [ms]
Local1	Write	00:01:45.9455085	262144	2474.32858373604	25.8481560119629
Local1	Read	00:01:04.9925570	524288	8066.89295206527	1.9822887878418
Local2	Write	00:02:33.0721881	262144	1712.5515	37.3323948471069
Local2	Read	00:01:27.1474273	524288	6016.10416100029	2.65598070163727
Cloud	Write	00:04:04.4822671	262144	1072.24136584424	59.5640092216492
Cloud	Read	00:03:24.7382677	524288	2560.7718864176	6.24372196731567
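As a sanity check on how the columns relate, here is a small Python sketch using the Local1 Write row. The throughput is simply Operations divided by Time. The latency estimate assumes Little's law (outstanding operations divided by throughput); the reported numbers are consistent with that, but the sample may compute latency differently. The function names are mine.

```python
def elapsed_seconds(text: str) -> float:
    """Convert an 'hh:mm:ss.fffffff' value from the Time column to seconds."""
    hours, minutes, seconds = text.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def ops_per_second(operations: int, seconds: float) -> float:
    """Average throughput: total operations over total elapsed time."""
    return operations / seconds


def estimated_latency_ms(outstanding: int, throughput: float) -> float:
    """Little's law estimate (an assumption): in-flight operations / throughput."""
    return outstanding / throughput * 1000.0


# Local1 Write: 262144 operations, 64 outstanding writes.
local1_write = ops_per_second(262144, elapsed_seconds("00:01:45.9455085"))
local1_latency = estimated_latency_ms(64, local1_write)
print(f"{local1_write:.2f} ops/sec, about {local1_latency:.1f} ms per operation")
```

The estimate lands within a few hundredths of a millisecond of the reported latency for this row, so the latency column is mostly a restatement of throughput at a fixed number of outstanding operations.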

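Discussion

I'm off to a hot pot party with friends, so I'll keep this short! Local2 taking longer than Local1 is, I think, because splitting into partitions on a single machine doesn't help. Overall, writes are slow, as expected. When I measured before, the time seemed to go into the up-front overhead, and writing items one at a time felt quite slow. This sample doesn't report how long the very first operation takes, and the cluster VM size is different, so a straight comparison isn't possible, but 100k items took 32880 ms back then, so scaled to this run's 262144 writes:

32880 ms * 2.62144 = 86192.9472 ms = 00:01:26.1929472

or thereabouts. Considering that this time the requests go through a reverse proxy and so on, the result doesn't look much different from what I got last time.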
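The scaling above is a plain linear extrapolation. As a Python sketch, assuming write time grows proportionally with the number of operations (a simplification that ignores warm-up and batching effects):

```python
from datetime import timedelta

PREVIOUS_OPS = 100_000   # operations in the earlier measurement
PREVIOUS_MS = 32_880     # elapsed time of the earlier measurement, in ms
CURRENT_OPS = 262_144    # NumWriteOperationsTotal in this run


def scale_elapsed_ms(prev_ms: float, prev_ops: int, new_ops: int) -> float:
    """Linearly scale an elapsed time to a different operation count."""
    return prev_ms * new_ops / prev_ops


projected_ms = scale_elapsed_ms(PREVIOUS_MS, PREVIOUS_OPS, CURRENT_OPS)
print(projected_ms, timedelta(milliseconds=projected_ms))
```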

The Cloud results look really bad. The Av2 CPU is the same as Dv2, so maybe it's memory or the network. As in the Local case, it could also be that there are too many partitions.

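I hear the people on the inside are hard at work rewriting the Reliable Collections internals right now (really?), so these days I'd like to have a nice measurement service ready so I can re-measure every time the SDK is updated.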

That's all! It may feel like "you just posted the results!", but that was day 24!