This wiki is obsolete, see the NorduGrid web pages for up to date information.
Storage System Performance Plots
This page is created to house storage performance plots that will be used in an upcoming article about the developed storage system.
Folder Creation and Stating
We use the folder creation plots to show how much time it takes to add a file to the system when the file transfer itself, i.e. the actual movement of bytes, is excluded. We should demonstrate how the system performs in both depth and width.
Folder creation and stating in Depth vs Time
The plot should show how much time it takes to create a folder at a certain depth. To smooth out fluctuations, 5 folders should be created on each level and the average of those 5 should be taken. The test should be done on a fully secure system.
Plots will be made by both Jon and Salman using their own machines. Services will be deployed on separate machines.
- X-axis: Depth, up to 30
- Y-axis: Time in seconds
Script to produce data for plot: depth_vs_time.py
Gnuplot script to produce plot: depth_vs_time.plot
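The measurement loop in depth_vs_time.py presumably resembles the following sketch. Here `make_collection` and `stat_entry` are hypothetical stand-ins for the real client calls (simulated with a short sleep), not the actual NorduGrid client API:

```python
import time
import statistics


def make_collection(path):
    """Hypothetical stand-in for the remote collection-creation call."""
    time.sleep(0.001)  # simulate a network round trip


def stat_entry(path):
    """Hypothetical stand-in for stat'ing the newly created entry."""
    time.sleep(0.001)


def depth_vs_time(max_depth, samples=5):
    """Create `samples` folders at each depth, time creation + stat,
    and return a list of (depth, mean time) pairs for plotting."""
    results = []
    path = ""
    for depth in range(1, max_depth + 1):
        times = []
        for i in range(samples):
            sub = "%s/level%d_%d" % (path, depth, i)
            start = time.monotonic()
            make_collection(sub)
            stat_entry(sub)
            times.append(time.monotonic() - start)
        results.append((depth, statistics.mean(times)))
        path = "%s/level%d_0" % (path, depth)  # descend into the first folder
    return results
```

The (depth, time) pairs can then be written out as a two-column data file for the gnuplot script.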
Depth vs time, depth of 30, 5 samples averaged, secure, dingdang cluster, TCP_NODELAY, four machines on same LAN (A-Hash, Librarian, Bartender and client):
File:Timings depth vs time 30x5 secure dingdang.png
Data file used in plot: File:Timings depth vs time 30x5.txt (rename to .dat to use it with depth_vs_time.plot)
File:Depth WAN ReplicatedAHash.png
Data file used in plot: File:Timings depth vs time 30x5 replicatedAHash WAN.txt
Folder creation and stating in Width vs Time
The plot should show how much time it takes to create a subfolder at a fixed level (root or one level down) and stat the folder. The test should be done on a fully secure system.
- X-axis: Width, up to 1000
- Y-axis: Time in seconds
Script to produce plot: width_vs_time.py
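The width test differs from the depth test only in keeping the parent fixed; a sketch, again with `make_collection` and `stat_entry` as hypothetical stand-ins for the real client calls:

```python
import time
import statistics


def make_collection(path):
    """Hypothetical stand-in for the remote collection-creation call."""
    time.sleep(0.001)  # simulate a network round trip


def stat_entry(path):
    """Hypothetical stand-in for stat'ing the newly created entry."""
    time.sleep(0.001)


def width_vs_time(max_width, samples=5):
    """Create subfolders under one fixed parent, averaging every
    `samples` consecutive creations into one (width, mean time) point."""
    results = []
    times = []
    for n in range(1, max_width + 1):
        sub = "/parent/sub%d" % n
        start = time.monotonic()
        make_collection(sub)
        stat_entry(sub)
        times.append(time.monotonic() - start)
        if n % samples == 0:
            results.append((n, statistics.mean(times)))
            times = []
    return results
```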
Width vs. time, up to 1000 entries, 5 samples averaged, secure, dingding cluster, TCP_NODELAY, four machines on same LAN ((central) A-Hash, Librarian, Bartender and client): File:Width vs time 1000x5 secure dingdong.png
Data file used in plot: File:Timings width vs time 1000x5.txt
File:Width WAN ReplicatedAHash.png
Data file used in plot: File:Timings width vs time 1000x5 replicatedAHash WAN.txt
Many clients in the system
These plots should show how the system behaves when handling several user requests simultaneously.
Number of clients vs. time taken to create 50 collections
The plot should show the minimum, average and maximum time it takes to create 50 collections as a function of an increasing number of simultaneous users of the system.
- X-axis: Parallel clients (60 to 100)
- Y-axis: Time for clients to create 50 folders
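A minimal sketch of such a multi-client run, using threads as parallel clients; `create_collection` is a hypothetical stand-in for one creation request, not the real client API:

```python
import threading
import time
import statistics


def create_collection(client_id, i):
    """Hypothetical stand-in for one collection-creation request."""
    time.sleep(0.001)  # simulate a network round trip


def run_client(client_id, n_collections, durations, lock):
    """One client: create `n_collections` collections, record total time."""
    start = time.monotonic()
    for i in range(n_collections):
        create_collection(client_id, i)
    elapsed = time.monotonic() - start
    with lock:
        durations.append(elapsed)


def multi_client_test(n_clients, n_collections=50):
    """Run `n_clients` clients in parallel; return (min, avg, max) time."""
    durations = []
    lock = threading.Lock()
    threads = [
        threading.Thread(target=run_client,
                         args=(c, n_collections, durations, lock))
        for c in range(n_clients)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return min(durations), statistics.mean(durations), max(durations)
```

Repeating this for each client count gives one (clients, min, avg, max) row per plotted point.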
File:MultiClient LAN CentralAHash.png
Data file used in plot: File:Final multiclient LAN central.txt
File:MultiClient WAN ReplicatedAHash.png
Data file used in plot: File:Final multiclient WAN replicated.txt
Average time taken to create a folder
The plot should show the average time it takes to create a collection, i.e. the maximum client time divided by the number of folders created.
- X-axis: Parallel clients (up to 50)
- Y-axis: Per-collection creation time
Long term stability
The plots should show the long-term stability of the system, particularly of the services. They should be robust and must not leak memory, which would eventually make them crash (e.g. after 14 days).
Service memory usage
The services should run for weeks, months or years without crashing due to memory leaks. The plot should show the memory consumption of each service during one week of continuous operation.
- X-axis: Time
- Y-axis: Used memory
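On Linux, a service's memory use can be sampled from /proc by pid; this is a sketch of such a monitor (a Linux-only assumption, not the actual monitoring script used for the plots):

```python
import os
import sys
import time


def rss_kb(pid):
    """Resident set size in kB, read from /proc/<pid>/status (Linux-only)."""
    with open("/proc/%d/status" % pid) as f:
        for line in f:
            if line.startswith("VmRSS:"):
                return int(line.split()[1])
    raise RuntimeError("VmRSS not found for pid %d" % pid)


def monitor(pid, interval=60, out=sys.stdout):
    """Emit '<unix time> <rss kB>' lines, one per `interval` seconds,
    suitable as a two-column gnuplot data file."""
    while True:
        out.write("%d %d\n" % (time.time(), rss_kb(pid)))
        out.flush()
        time.sleep(interval)


# Example: one sample of this process's own memory usage
print(rss_kb(os.getpid()))
```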
Service CPU consumption
The plot should show the CPU consumption of the services under continuous operation. We suggest one loop which creates a continuous small load on the system, and another which regularly (e.g. every 6 hours) creates a heavy load.
- X-axis: Time
- Y-axis: CPU usage
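The two load loops could be sketched as follows; `StubClient` is a hypothetical stand-in for a real client, and the `cycles` parameter exists only to keep the sketch finite (the real loops would run unbounded):

```python
import time


class StubClient:
    """Hypothetical stand-in for a real service client; counts requests."""
    def __init__(self):
        self.calls = 0

    def make_collection(self, path):
        self.calls += 1


def light_load(client, period=5.0, cycles=None):
    """Continuous small load: one request every `period` seconds."""
    n = 0
    while cycles is None or n < cycles:
        client.make_collection("/load/small-%d" % n)
        time.sleep(period)
        n += 1


def heavy_load(client, every=6 * 3600, burst=500, cycles=None):
    """Every `every` seconds (6 hours by default), fire a burst of
    `burst` back-to-back requests to create a heavy load spike."""
    n = 0
    while cycles is None or n < cycles:
        time.sleep(every)
        for i in range(burst):
            client.make_collection("/load/burst-%d-%d" % (n, i))
        n += 1
```

CPU usage of the service processes would be sampled separately (e.g. from ps or /proc) while these loops run.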
A-Hash performance and consistency
The tables should show how the A-Hash performs in different environments.
Getting
For getting from the A-Hash we have 4 levels, and for each level three numbers: min, avg, max. The 4 levels:
- centralized
- replicated, all nodes are stable
- replicated, one of the nodes is unstable
- replicated, all the nodes are unstable
Scripts to reproduce tables: tests/ahash
Setup:
Centralized: A-Hash and client on two separate machines, same LAN
Replicated: 3 A-Hash replicas on separate machines, 1 client on a 4th machine, same LAN
Test method:
- The client repeatedly gets an entry from the A-Hash.
- For unstable tests, one replica is restarted every 60 seconds.
- For "Replicated, unstable clients" only A-Hash clients are restarted.
- For "Replicated, all nodes unstable" the A-Hash master may be restarted as well.
- For "Replicated, always kill master" only the A-Hash master is restarted.
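The timing loop behind these numbers might look like the following sketch; `ahash_get` is a hypothetical stand-in for a real A-Hash get request, not the actual client API:

```python
import time
import statistics


def ahash_get(entry_id):
    """Hypothetical stand-in for one A-Hash get request."""
    time.sleep(0.001)  # simulate a network round trip
    return {"entry": entry_id}


def measure_gets(entry_id, n_requests):
    """Repeatedly get one entry; return (avg, min, max, request count),
    matching the columns of the tables below."""
    times = []
    for _ in range(n_requests):
        start = time.monotonic()
        ahash_get(entry_id)
        times.append(time.monotonic() - start)
    return statistics.mean(times), min(times), max(times), len(times)
```

The set test uses the same loop with a write request in place of the get.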
Level | Average (s) | Min. (s) | Max. (s) | No. of requests |
---|---|---|---|---|
Centralized, no wait | 0.003780 | 0.003399 | 0.013441 | 153126 |
Replicated, stable, no wait | 0.003738 | 0.003453 | 0.013261 | 154798 |
Replicated, unstable clients, no wait | 0.003754 | 0.003412 | 0.289535 | 154201 |
Replicated, all nodes unstable, no wait | 0.003789 | 0.003465 | 0.324507 | 152752 |
Replicated, always kill master | 0.003763 | 0.003402 | 1.971131 | 153748 |
Setting
For setting to the A-Hash we have 4 levels, and for each level three numbers: min, avg, max. The 4 levels:
- centralized
- replicated, all nodes are stable
- replicated, one of the nodes is unstable
- replicated, all the nodes are unstable
Scripts to reproduce tables: tests/ahash
Setup:
Centralized: A-Hash and client on two separate machines, same LAN
Replicated: 3 A-Hash replicas on separate machines, 1 client on a 4th machine, same LAN
Test method:
- The client repeatedly writes an entry to the A-Hash.
- For unstable tests, one replica is restarted every 60 seconds.
- For "Replicated, unstable clients" only A-Hash clients are restarted.
- For "Replicated, all nodes unstable" the A-Hash master may be restarted as well.
- For "Replicated, always kill master" only the A-Hash master is restarted.
Level | Average (s) | Min. (s) | Max. (s) | No. of requests |
---|---|---|---|---|
Centralized, no wait | 0.004260 | 0.003828 | 0.014459 | 51388 |
Replicated, stable, no wait | 0.033902 | 0.016866 | 1.057602 | 10905 |
Replicated, unstable clients, no wait | 0.034239 | 0.016434 | 1.131142 | 10727 |
Replicated, all nodes unstable, no wait | 0.035428 | 0.016517 | 10.459225 | 10639 |
Replicated, always kill master | 0.044868 | 0.016293 | 60.902862 | 9156 |