
storscore's Introduction

StorScore: A test framework to evaluate SSDs and HDDs.

StorScore is a component-level evaluation tool for testing storage devices. When run with default settings it should give realistic metrics similar to what can be expected by a Windows application developer.

Background

We were motivated to write StorScore because most existing solutions had some problems:

  1.  Difficult to automate (GUI tools)
    
  2.  Don't properly measure SSDs (history effect, endurance)
    
  3.  Linux-centric
    

StorScore is driven by a "recipe" file, which, like all good things, is just another Perl script. The recipe is simply a series of steps to be followed.

By default, StorScore will run the "turkey test", which is the recipe used by Microsoft to evaluate HDDs and SSDs for potential cloud deployments. Take a look in the recipes subdirectory to see other examples.
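
For illustration, a minimal recipe is just Perl code that calls the built-in step functions. The sketch below reuses the test() parameters that appear in the recipe examples later on this page; the specific values, and the assumption that the timing fields are in seconds, are illustrative only:

    # minimal_example.rcp -- a single 4K random read test (illustrative values)
    test(
        description      => "4K Random Reads",
        write_percentage => 0,          # 0 => 100% reads
        access_pattern   => 'random',
        block_size       => '4K',
        queue_depth      => 1,
        warmup_time      => 60,         # assumed to be seconds
        run_time         => 3600,       # assumed to be seconds
    );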

The only required command line option is --target. This can specify an existing file, a volume, or a \\.\PHYSICALDRIVE number. There are other command line parameters that may be useful, but documentation has not yet been written. Take a look at lib\GlobalConfig.pm to see them all.

Be aware that StorScore can easily be used in a data-destructive manner. Be careful with the --target option.
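
For example, a typical invocation against the second physical drive looks like the following (this will destroy the data on that drive, so double-check the drive number first):

    StorScore.cmd --target=\\.\PHYSICALDRIVE1 --recipe=recipes\turkey_test.rcp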

When running, StorScore will create a bunch of files in the results directory. We rarely look at these directly. Instead, we typically gather many results directories, from a cohort of comparable devices, and pass them to the parse_results.cmd script, which generates a nice Excel XLSX file. The Excel file is structured to facilitate use of pivot charts.
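
For example, one invocation pattern (the same one that appears in a report later on this page) passes a glob of results directories and an output file name:

    parse_results.cmd results* results.xlsx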

The Excel file has the usual raw metrics (throughput, latency, etc.) but also contains the result of our scoring system, which we designed to help summarize what would otherwise be far too much data (hence the name: StorScore).

Lecture Video

Laura and Mark gave a web presentation in August 2014 to the Microsoft MVP storage community. The talk was recorded, and provides a general overview and a demo of StorScore:

https://www.youtube.com/watch?v=gJZGu-Y3uXE

Dependencies

StorScore depends on some "external" software components.

You must download and install the following or StorScore will not work:

A Windows Perl interpreter:
    ActiveState: http://www.activestate.com/activeperl
    Strawberry: http://strawberryperl.com/

The Visual Studio 2013 C++ runtime libraries for x86 & x64:
    http://www.microsoft.com/en-us/download/details.aspx?id=40784

The Visual Studio 2015 C++ runtime libraries for x86 & x64:
    https://www.microsoft.com/en-us/download/details.aspx?id=48145

StorScore will work without these components, but some features will be disabled:

SmartCtl.exe, from SmartMonTools:
    http://www.smartmontools.org/

Ipmiutil.exe, from the IPMI Management Utilities:
    http://ipmiutil.sourceforge.net/

You can use StorScore to run tests and parse their data without these components, but you will need them to edit and compile the StorageTool:

Windows Driver Kit (WDK):
    https://developer.microsoft.com/en-us/windows/hardware/windows-driver-kit

Windows Software Development Kit (SDK):
    https://developer.microsoft.com/en-US/windows/downloads/windows-10-sdk

StorScore includes the following components "in the box." We would like to thank the authors and acknowledge their contribution:

The excellent Perl library, Excel::Writer::XLSX, by John McNamara.
    http://search.cpan.org/~jmcnamara/Excel-Writer-XLSX/lib/Excel/Writer/XLSX.pm        

DiskSpd.exe: an IO generator from the Microsoft Windows team.
    http://aka.ms/diskspd
    https://github.com/microsoft/diskspd

SQLIO2.exe: an IO generator from the Microsoft SQL Server team.

Feedback?

Questions, comments, bug reports, and especially accolades may be directed to the developers: Laura Caulfield ([email protected]), Mark Santaniello ([email protected]), and Bikash Sharma ([email protected]).

Open Source Code of Conduct

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.


storscore's Issues

Support a "--percent_full" distinct from "--active_range"

Today our active_range is always identical to percent_full. We cannot vary these independently. Client scenarios might be interested in something like 99% full, 10% active range. In order to support this, we would most likely want to create two test files, initialize each of them, but then only run the IO generator targeting one of them.
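
A hypothetical invocation, purely to illustrate the request (today only --active_range exists, and it also determines the fill level; a separate --percent_full flag does not exist yet):

    rem Hypothetical: drive 99% full, but only 10% of it actively accessed
    StorScore.cmd --target=\\.\PHYSICALDRIVE1 --recipe=recipes\4k_RR.rcp --percent_full=99 --active_range=10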

Log errors from bg_exec

Currently, there's no record of errors occurring in the process executed in the background (through "bg_exec" in the recipes). This can be a problem if you run one of the targeted tests designed to check that the performance of a workload is unchanged by background activity (like write_impact or smart_check). If the bg process silently errors, the fg process will have the same performance and the drive could incorrectly pass the test.

One solution is to capture the stderr from the bg process (see line 830 of Util.pm). A better option is to fail the whole recipe if the bg process errors.

A testing scenario where I've seen the bg process error is running write_impact in demo mode on a slow thumb drive on a Windows 10 client. Something about this combination causes the bg process to target a drive letter that no longer exists.
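
A hedged sketch of the idea only, not the actual bg_exec code in Util.pm: wrap the background command so its stderr lands in a log file and a non-zero exit leaves a flag file that the recipe can check later. The command string and file names below are invented for illustration.

    my $bg_cmd  = 'some_background_tool.exe --flags';   # hypothetical command
    my $wrapped =
        qq{cmd /c "($bg_cmd) 2> bg_stderr.log || echo failed > bg_failed.flag"};

    unlink 'bg_failed.flag';    # clear any stale flag from a previous run

    # ... launch $wrapped in the background the same way bg_exec does today ...

    # Later (e.g. at bg_killall time), fail the whole recipe loudly:
    die "Background step failed; see bg_stderr.log\n" if -e 'bg_failed.flag';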

Add OS Version

Add the OS version to the data collection & the data parsing. Could use "wmic os get Caption."
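
A quick Perl sketch of that suggestion (assuming WMIC is present on the system):

    # Capture the OS caption string via WMIC and keep only the data line
    my ($os_caption) =
        grep { /\S/ && !/^Caption/ }    # drop the "Caption" header and blank lines
        map  { s/\s+$//r }              # strip trailing whitespace/CR (Perl 5.14+)
        `wmic os get Caption`;

    print "OS version: $os_caption\n";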

DiskSpd returned non-zero errorlevel / Out of memory!

I was trying to run the "recipes\SIT\SIT_SSD_Sanity.rcp" recipe on a 480GB SSD and got the error message in the title when it got to 4K seq 100% read QD 128.

Event Viewer has the following:
Faulting application name: DiskSpd.exe, version: 0.0.0.0, time stamp: 0x57449527
Faulting module name: ucrtbase.DLL, version: 10.0.10586.9, time stamp: 0x5642c48d
Exception code: 0xc0000409
Fault offset: 0x00000000000698fe
Faulting process id: 0xe30
Faulting application start time: 0x01d2bc18470f6f2b
Faulting application path: C:\storscore\bin\DiskSpd.exe
Faulting module path: C:\Windows\SYSTEM32\ucrtbase.DLL
Report Id: ed649a57-280c-11e7-80d1-e41d2dece230
Faulting package full name:
Faulting package-relative application ID:

Checking of --interactive / --nointeractive doesn't seem to work

I specified a target and recipe file, and got the following with ActiveState Perl:

Error "Can't locate object method "interactive" via package "CommandLine" at C:\Users\A
dministrator.MICRONSTORAGE\Desktop\StorScore-master\StorScore.cmd line 208."

If I add --interactive or --nointeractive, I get back unknown option text like the following:
StorScore.cmd --target=.\physicaldrive1 --recipe=recipes\turkey_test.rcp --interactive returns "Unknown option: interactive"

Add FG_exec to Recipe.pm

Create a new test kind "fg_exec." This would allow you to add any command to the recipe as a step. Similar to bg_exec, but later steps block on the completion of the fg_exec.
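
Purely to illustrate, a hypothetical recipe fragment (the argument style is a guess, modeled loosely on the existing step kinds, and the command itself is arbitrary):

    fg_exec( command => 'collect_some_state.exe > before.txt' );
    # ... subsequent test() steps would not start until the command returns ...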

Support a 512B access size

It's reasonable to want to test a sector-sized access. DiskSpd supports this, but SQLIO2 and Precondition.exe do not, as of yet.

Add a targeted FUA test

It's possible to coerce StorAHCI into converting FILE_FLAG_WRITE_THROUGH to FUA. This is not default, but could be enabled through the registry:

 HKLM\System\CurrentControlSet\Services\storahci\Parameters\Device\EnableFuaSupport
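
A hedged example of flipping that setting from an elevated command prompt; this assumes EnableFuaSupport is a REG_DWORD value under the Device key, which should be verified against the StorAHCI documentation before use:

    reg add HKLM\System\CurrentControlSet\Services\storahci\Parameters\Device /v EnableFuaSupport /t REG_DWORD /d 1 /f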

Attempting to run StorScore.cmd against a USB drive fails

Hello;

I am trying to demo StorScore with a plugged-in USB enclosure containing a SATA disk, and also with a SanDisk USB flash drive, since I cannot afford to trash my laptop's filesystem.

The precondition step fails with the following error:

Cleaning disk...
Creating new filesystem...
Creating test file...
Syncing target volume...
SmartCtl missing or broken. SMART capture disabled.
Ipmiutil missing or broken. Power measurement disabled.
Access is denied.
Precondition returned non-zero errorlevel at C:/cygwin64/home/utz_j/GIT/StorScore\lib/PreconditionRunner.pm line 121.

I get this same problem in both cases. Any suggestions? Your help would be appreciated.

Complete capture follows:

C:\cygwin64\home\utz_j\GIT\StorScore>StorScore.cmd --target .\PHYSICALDRIVE1 recipes\flush_check.rcp
Targeting HDD: SanDisk SanDisk Ultra USB Device

Loaded recipes\corners.rcp (24 tests, 24 steps)

    Run time will be >= 1.02 days after target init (1 overwrite).

    Warning!
    Detected System Center Endpoint Protection (MsMpEng.exe)
    This can delay IOs and cause bogus latency results.
    You have the following options:
        - Run on a machine without SCEP
        - Disable SCEP real-time protection
        - Exclude testfile.dat from SCEP scan.

    Warning!
    This will destroy \\.\PHYSICALDRIVE1

Do you wish to continue? [Y/N]

Cleaning disk...
Creating new filesystem...
Creating test file...
Syncing target volume...
SmartCtl missing or broken. SMART capture disabled.
Ipmiutil missing or broken. Power measurement disabled.
Access is denied.
Precondition returned non-zero errorlevel at C:/cygwin64/home/utz_j/GIT/StorScore\lib/PreconditionRunner.pm line 121.

C:\cygwin64\home\utz_j\GIT\StorScore>

Support "mix" tests where read/write streams have different characteristics

The typical request is: what does the latency of small random reads look like when there is a simultaneous large-block sequential write stream at low QD/throughput?

It's unclear how best to implement this. The load generators and preconditioner might need direct support. Another possibility is simply to arrange for everyone's CreateFile call to pass the most permissive dwShareMode, thus making it possible for StorScore to spawn multiple copies of the IO generator simultaneously targeting the same file.

Need Help with Maxpower recipe

Somehow I have to run a command like

storscore.cmd -recipe=recipes\max_power.rcp --raw

but there is no such recipe in the recipes directory of this tool.

Any idea where I can find it? Please kindly advise.

StorScore fails with error - invalid parameter under certain conditions

StorScore seems to have an assumption that the drive(s) under test are using a small sector size.

As an example, storscore tries to execute this command, and it fails to start threads due to error 87 (invalid parameters)
DiskSpd.exe -w100 -si -b1K -t4 -o64 -a0,2,4,6,8,10,12,14,16,18,20,22,1,3,5,7,9,11,13,15,17,19,21,23 -L -h -d300 -Z20M,"C:\StorScore\entropy\0_pct_comp.bin" D:\testfile.dat

When I investigated the scenario, the logical sector size was 4K, but DiskSpd was using a stripe size of 1K, which would not have worked. Changing the command line to use a block size and stripe size of 4K permitted the test to run.

Can anything be done to validate the parameters against the test device's logical and physical characteristics, to prevent logs reporting failures?
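
A hedged Perl sketch of one way such a check could look: query the logical sector size up front and refuse to build a DiskSpd command line whose block size is smaller than a sector. The WMI query and the variables below are illustrative assumptions, not existing StorScore code.

    my $drive_index = 1;    # the \\.\PHYSICALDRIVE number under test

    my ($bytes_per_sector) =
        grep { /^\d+$/ }
        map  { s/\s+//gr }
        `wmic diskdrive where Index=$drive_index get BytesPerSector`;

    my $block_size_bytes = 1024;    # e.g. the 1K from the failing command above

    die "Block size $block_size_bytes is smaller than the " .
        "logical sector size $bytes_per_sector\n"
        if $block_size_bytes < $bytes_per_sector;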

4k_RR Recipe inconsistency

I've been running the 4k_RR StorScore recipe on various drives and I have come across this issue:

[results screenshot omitted]

For most drives that I have tested, the test description reads "4K_Random_Reads-step-4"; however, for a few other drives the test description reads "4K_Random_Reads-step-2". I selected the 4K_RR recipe each time. Is there a reason for this inconsistency?

4K Read and Write recipes

I've been running the StorScore 4K Read and Write recipes on various drives and comparing the results to those generated by FIO. The 4K Write IOPS results produced by StorScore and FIO tend to be similar to each other, and both are close to the published performance data I have for the drives I am benchmarking. However, the 4K Read IOPS results from StorScore tend to be significantly lower than those produced by FIO or found in published data for the same drives. Is there any obvious reason for this discrepancy? The 4K Read recipe itself does not seem to show anything that would explain the significant difference in results I am seeing.

Make it possible to modify scoring policies without editing parse_results.cmd

Today the parser is not very modular. In particular, you have to hack the source to change the scoring policy code. We could create a system akin to the DeviceDB, perhaps in a "score_policy" subdirectory, where we would ship BingPolicy.pm and AzurePolicy.pm by default. Users could create their own policies (conforming to the standard interface) by adding them to this directory.
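
A sketch of what such a drop-in policy module might look like; the module name, directory, and score() interface are all invented here for illustration, since nothing like this exists in the parser today:

    # score_policy/ExamplePolicy.pm -- hypothetical, for illustration only
    package ExamplePolicy;

    use strict;
    use warnings;

    # Take a hashref of parsed per-test metrics, return a single score
    sub score
    {
        my ( $metrics ) = @_;

        # Example weighting: reward random-read IOPS, penalize tail latency
        return $metrics->{'rand_read_iops'} /
               ( 1 + $metrics->{'latency_5nines_ms'} );
    }

    1;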

Allow recipe to specify IO_Generator_Args

Add another optional property to the test definition that lets the user specify additional arguments for the IO generator per-test. For example, the following test would run a 4k random read test with IOs aligned to 8k boundaries:

test(
    description      => "4k Random Reads",
    write_percentage => 0,
    access_pattern   => 'random',
    block_size       => '4K',
    queue_depth      => 1,
    warmup_time      => 60,
    run_time         => 3600,
    io_gen_args      => '-r8k ',
);

Currently, the only way to loop through multiple values for a generator parameter is to build a script that calls StorScore with different values for --io_generator_args. This makes it difficult to name each test and manage the results as a set.
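
For reference, that workaround looks roughly like the hedged sketch below: an outer Perl script that re-invokes StorScore once per argument value. The target, recipe, and alignment values are illustrative.

    use strict;
    use warnings;

    foreach my $align ( '-r4k', '-r8k', '-r16k' )
    {
        my $cmd = "StorScore.cmd --target=\\\\.\\PHYSICALDRIVE1 "
                . "--recipe=recipes\\4k_RR.rcp "
                . "--io_generator_args=\"$align\"";

        system( $cmd ) == 0
            or warn "Run with io_generator_args '$align' failed\n";
    }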

Drive workloads at a fixed throughput rather than fixed QD

This is the so-called open-loop versus closed-loop issue.

Real world applications don't typically control for constant queue depth. Rather, they have some typical (average) throughput, which means that any "hiccups" in storage latency cause many IOs to "back up" and spike QD.

If we tested this way, it would be more realistic. The high-percentile latency outlier metrics (things like 5-nines latency) might be more representative of what a real app could expect.

First we'd have to run some kind of "probe" to determine the peak IOPS of the device, and then divide that range into discrete tests (something like 10%, 20%, etc.). The IO generator and preconditioner would need rate-limiting support as an alternative to the "-o" queue depth argument.
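
A hypothetical recipe-level sketch of that idea: probe the device's peak IOPS once, then run a ladder of fixed-rate tests at 10%, 20%, ... 100% of that peak. Both the probe_peak_iops() helper and the rate_limit_iops parameter are invented here to illustrate the shape of the feature.

    my $peak_iops = probe_peak_iops();    # hypothetical helper

    foreach my $pct ( map { $_ * 10 } 1 .. 10 )
    {
        test(
            description      => "4K Random Reads at ${pct}% of peak",
            write_percentage => 0,
            access_pattern   => 'random',
            block_size       => '4K',
            warmup_time      => 60,
            run_time         => 600,
            rate_limit_iops  => int( $peak_iops * $pct / 100 ),  # hypothetical
        );
    }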

Enable Multiple Targets

This will require some thought, but I want to get the conversation started early in my thinking. I would like StorScore to support multiple targets. This is important for evaluating hardware for the datacenter, where multiple users share the same drive. It's also timely with all the recent activity on streaming and open channel drives.

For example, I should be able to verify that a streaming drive with one sequentially-written file mapped to each stream will get a very low WAF. Whereas this same workload will appear essentially random to a standard block mode drive, yielding a higher WAF.

Here are a few areas that will need attention:

  • preconditioner -- the preconditioner should run for each target, and exit only when all targets have reached steady state. Preconditioner will need to take a "done" signal.
  • recipe -- it will need to allow a different IO pattern for each target, and a way to signal which tests are run sequentially and which are run simultaneously.
  • test setup -- We can start with having the test operator set up the targets manually, and pass the file names. Later improvements can automate this.
  • parsing -- the raw data could have another column to let us view the stream's performance individually. I will need to think about how/if the scoring will change.

Precondition.exe hangs when run on >2TB targets

In precondition.cpp:
The following line numbers and variable names need to change from int to int64_t:
Line 39: outstandingIOs
Line 419: postedIOs
Line 420: completedIOs
Line 428: TOTAL_BLOCKS

Implement generalized workload argument for initialize()

Essentially, turn initialize() into SNIA's workload-independent preconditioning. This means we could do something like the following to test "128K sequential reads @ QD1 after 4K random writes":

purge();
initialize( block_size => '4K', access_pattern => 'random' ); # Always 100% write, high queue depth
test(
    write_percentage => 0,
    access_pattern   => 'sequential',
    block_size       => '128K',
    queue_depth      => 1,
    warmup_time      => 60,
    run_time         => 3600,
    purge            => 0,
    initialize       => 0,
);

Multi-thread Precondition.exe

Right now it's very likely that we become CPU-limited in some cases and do not attain the requested queue depth. The solution is to multi-thread the preconditioner. The main hurdle will be to design a thread-safe version of steady_state_detector.h.

StorScore drive prep & preconditioning

This is more of a request for information than an issue. I was wondering if someone could break down how StorScore preps and preconditions a drive before a test is run. Just looking at the output, I can see that the drive is purged, initialized, and then preconditioned. What does each of these steps consist of?

Thank you

Download file code issue

Hi~

Could you please share the "max_power.rcp" recipe code? I can't find it on the website. Thanks.

BR

Break parse_results.cmd into modules

The parse_results.cmd is too large, and is not well-factored like StorScore.cmd. We should modularize it for better maintainability, but also to facilitate a 2-phase approach where we parse and score separately. It would be nice to parse results only once, and merely run the scoring steps multiple times when comparing different sets of devices.

warn_illegal_args fix

Hey Mark -- I have a question about the code in lib\Recipe.pm. I'd like to add a warning based on properties of a test. I believe "warn_illegal_args" is the right place to do that. It should be called when StorScore processes the recipe ("Phase 1" in the BUILD sub), but it isn't. I've traced it down to the anonymous function on line 375. The foreach loop runs for each kind of step, but a print statement inside the anonymous function doesn't run. Can you help me understand why? I suspect the anonymous function needs to be called, but it's not clear to me where that should happen, or why it's happening for phase 2 but not for phase 1.

For example, this code:

    # Install our handler for each kind of step 
    print("warn? $recipe_warnings\n");
    foreach my $kind ( @step_kinds )
    {
        my $sym = "${package}::$kind";
        print("in foreach loop $kind\n");
        *$sym = sub
        {
            print("here\n");
            return $self->handle_step(
                $callback,
                $recipe_warnings,
                $kind,
                @_
            )
        };
    }
    print("done\n");

prints the following:
warn? 1
in foreach loop test
in foreach loop purge
in foreach loop initialize
in foreach loop precondition
in foreach loop bg_exec
in foreach loop fg_exec
in foreach loop bg_killall
in foreach loop idle
done

Not generating proper results

Hi,
Initially I started by running 4k_RR.rcp on my SSD and got the results in the Excel sheet with all the details. But later, when I ran the StorScore command with 4K_RW.rcp or any other recipe, only the first sheet was filled with data. The Final and Score details look empty even after many runs.

Commands used to run:
StorScore.cmd -recipe=recipes\4k_RW.rcp --target=f:
parse_results.cmd results* results.xlsx

Can anyone guide me/tell me why I am not getting the full result sheet with all the data even after parsing?

Increase number of Queues

Hi everyone,
In a recipe, how can I increase the number of queues to the device? I see that the queue depth can be changed, but is there a way to create multiple queues (threads) and have all of them pump reads/writes to the device?
Thanks
Sowmya

Make cmd_line writable from Recipe

I would like to define some command line options from within a recipe. For example:

$cmd_line->keep_logman_raw = 1;

test(
    description      => "4k Random Reads",
    write_percentage => 0,
    access_pattern   => 'random',
    block_size       => '4K',
    queue_depth      => 1,
    warmup_time      => 60,
    run_time         => 3600,
);

Currently, the first line fails.

Enable purge for a striped volume

StorScore has the ability to test any volume, which can be handy for testing many drives in a system as a striped volume. Unfortunately, when you target a volume, StorScore is not able to purge and wipe the history between tests.

One possible solution is to pass several drives to StorScore (e.g., "target=1,3,2,5"). Just as StorScore creates a volume after purging a single drive, it could create a striped volume with the whole set of targets before each test.

It could also have a flag to let the test operator choose between these two options:

  1. Stripe all the targets
  2. Pass the list of targets to diskspd (so they're tested individually & simultaneously)

Note: If the operator uses the --raw flag, StorScore will need to use option 2.

question about initialize step

If the purpose of the initialize step is to write the full capacity sequentially, does it make sure that data is written sequentially when MDTS is smaller than 1M, the block size of the initialize step? As I understand it, if MDTS is smaller than 1M, the device driver, stornvme.sys, splits the I/O into several commands. In this case, does the device driver guarantee that the data is written sequentially? If not, I suggest a smaller "DEFAULT_IO_SIZE", such as 128K, in precondition.h, or adding a block-size parameter in write_num_pass.pm line 78 (e.g. $cmd .= "-b128k ") to use a smaller block size.

StorScore Corners recipe creates disk corruption messages in eventvwr during purge phase

With the latest June StorScore version, I am seeing an issue on Windows Server 2016 where the Corners.rcp recipe causes drive-corruption messages to pop up in Event Viewer. I am using a drive that supports Secure Erase. I see this issue if the StorScore partition is already created and the secure erase / diskpart clean runs during the purge phase. I wanted to check whether this is a known issue.

Thanks
Deepak
