But a wish, whether good or evil, does not always come true as we desire.
Introduction
As far as I know, there are two ways to benchmark HBase. One is the well-known Yahoo Cloud Serving Benchmark (YCSB), which provides a framework and a common set of workloads for evaluating the performance of different “key-value” and “cloud” serving stores, HBase included. The other is embedded in HBase and, of course, serves HBase only: PerformanceEvaluation. It also contains several workloads that can be used to evaluate HBase performance and scalability. I will try to cover both at the source code level.
PerformanceEvaluation
There are two modes for running the evaluation: a MapReduce mode, in which each mapper runs a single client, and a multi-threaded mode, in which each thread runs a single client. By default, each test client handles about 1 GB of data. Here are the workloads (based on the most recent version of HBase at the time of writing); a minimal launch example follows the table.
Workloads
| Workload | Description |
|---|---|
| AsyncRandomReadTest | Async random read test |
| AsyncRandomWriteTest | Async random write test |
| AsyncSequentialReadTest | Async sequential read test |
| AsyncSequentialWriteTest | Async sequential write test |
| AsyncScanTest | Async scan test (read every row) |
| RandomReadTest | Random read test |
| RandomSeekScanTest | Random seek and scan 100 test |
| RandomScanWithRange10Test | Random seek scan with both start and stop row (max 10 rows) |
| RandomScanWithRange100Test | Random seek scan with both start and stop row (max 100 rows) |
| RandomScanWithRange1000Test | Random seek scan with both start and stop row (max 1000 rows) |
| RandomScanWithRange10000Test | Random seek scan with both start and stop row (max 10000 rows) |
| RandomWriteTest | Random write test |
| SequentialReadTest | Sequential read test |
| SequentialWriteTest | Sequential write test |
| ScanTest | Scan test (read every row) |
| FilteredScanTest | Scan test using a filter to find a specific row based on its value |
| IncrementTest | Increment on each row |
| AppendTest | Append on each row |
| CheckAndMutateTest | CheckAndMutate on each row |
| CheckAndPutTest | CheckAndPut on each row |
| CheckAndDeleteTest | CheckAndDelete on each row |
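PerformanceEvaluation is normally launched from the shell (for example via hbase pe), but it can also be driven programmatically because it implements the Hadoop Tool interface. Below is a minimal sketch, assuming the Tool-style constructor and the "<options> <command> <nclients>" argument form of recent HBase versions; the randomWrite command and the flags come from the tables in this post.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.PerformanceEvaluation;
import org.apache.hadoop.util.ToolRunner;

public class PeLauncher {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    // Multi-threaded mode (--nomapred): 4 client threads, 100000 rows each,
    // running the randomWrite workload; drop --nomapred to use the MapReduce mode.
    int exitCode = ToolRunner.run(conf, new PerformanceEvaluation(conf),
        new String[] { "--nomapred", "--rows=100000", "randomWrite", "4" });
    System.exit(exitCode);
  }
}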
Code
All tests are based on the class TestBase, which defines the framework of a test (the Template Method pattern). The core of this framework is the test() method. Note that the /**…*/ doc comments below are my own annotations explaining the code; a standalone sketch of what a concrete test boils down to follows the two listings.
long test() throws IOException, InterruptedException {
  /**
   * In testSetup(), each test defines how it creates its connection and what it
   * does on startup by implementing createConnection() and onStartup().
   */
  testSetup();
  LOG.info("Timed test starting in thread " + Thread.currentThread().getName());
  final long startTime = System.nanoTime();
  try {
    testTimed();
  } finally {
    /**
     * Symmetric to testSetup(): each test defines how it acts on teardown and
     * closes its connection by implementing onTakedown() and closeConnection().
     */
    testTakedown();
  }
  return (System.nanoTime() - startTime) / 1000000;
}
In the testTimed() method:
void testTimed() throws IOException, InterruptedException {
  int startRow = getStartRow();
  int lastRow = getLastRow();
  TraceUtil.addSampler(traceSampler);
  // Report on completion of 1/10th of total.
  for (int ii = 0; ii < opts.cycles; ii++) {
    /**
     * cycles defines how many times this test should run;
     * by default it runs only once.
     */
    if (opts.cycles > 1) LOG.info("Cycle=" + ii + " of " + opts.cycles);
    for (int i = startRow; i < lastRow; i++) {
      /**
       * everyN is the sampling interval derived from sampleRate; the default is 1,
       * meaning every row is executed. If it were 5, only one row in every five
       * would be executed.
       */
      if (i % everyN != 0) continue;
      long startTime = System.nanoTime();
      try (TraceScope scope = TraceUtil.createTrace("test row");) {
        /**
         * testRow(i) performs the test's actual action, such as a scan, put, or get.
         */
        testRow(i);
      }
      if ((i - startRow) > opts.measureAfter) {
        // If multiget is enabled, say set to 10, testRow() returns immediately first 9 times
        // and sends the actual get request in the 10th iteration. We should only set latency
        // when actual request is sent because otherwise it turns out to be 0.
        if (opts.multiGet == 0 || (i - startRow + 1) % opts.multiGet == 0) {
          latencyHistogram.update((System.nanoTime() - startTime) / 1000);
        }
        if (status != null && i > 0 && (i % getReportingPeriod()) == 0) {
          status.setStatus(generateStatus(startRow, i, lastRow));
        }
      }
    }
  }
}
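To make the template concrete, here is a standalone sketch of what a random-read test boils down to once TestBase's hooks are filled in: open a connection (testSetup), time a per-row action (testTimed calling testRow), then clean up (testTakedown). It uses only the standard HBase client API; the table name, row count, and row-key format are illustrative assumptions, not the exact ones PerformanceEvaluation uses.

import java.io.IOException;
import java.util.concurrent.ThreadLocalRandom;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class MiniRandomReadTest {
  public static void main(String[] args) throws IOException {
    Configuration conf = HBaseConfiguration.create();
    // "testSetup": create the connection and grab a table handle.
    try (Connection connection = ConnectionFactory.createConnection(conf);
         Table table = connection.getTable(TableName.valueOf("TestTable"))) {
      final int rows = 100_000;
      final long start = System.nanoTime();
      // "testTimed": loop over rows, calling the per-row action.
      for (int i = 0; i < rows; i++) {
        // The per-row action of a random-read test: one Get on a random key.
        int row = ThreadLocalRandom.current().nextInt(rows);
        table.get(new Get(Bytes.toBytes(String.format("%010d", row))));
      }
      long elapsedMs = (System.nanoTime() - start) / 1_000_000;
      System.out.println("random reads took " + elapsedMs + " ms");
    } // "testTakedown": try-with-resources closes the table and connection.
  }
}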
Parameters
| Parameter | Default / Options | Description |
|---|---|---|
| nomapred | false | Run multiple clients using threads rather than MapReduce |
| filterAll | false | Filter out all rows on the server side so nothing is returned to the client |
| startRow | 0 | Row to start at |
| size | 1.0f | Total size in GiB |
| perClientRunRows | 1048576 | Rows each client runs |
| numClientThreads | 1 | Number of client threads |
| totalRows | 1048576 | Total rows for test |
| measureAfter | 0 | Start measuring latency only after this many rows have been processed |
| sampleRate | 1.0f | Execute test on a sample of total rows. Only supported by randomRead |
| traceRate | 0.0 | Enable HTrace spans. Initiate tracing every N rows |
| tableName | TestTable | Alternate table name |
| flushCommits | true | Used to determine if the test should flush the table |
| writeToWAL | true | Set writeToWAL on puts |
| autoFlush | false | Set autoFlush on htable |
| oneCon | false | All the threads share the same connection |
| useTags | false | Writes tags along with KVs. Use with HFile V3 |
| noOfTags | 1 | Number of tags to write (used with useTags) |
| reportLatency | false | Set to report operation latencies |
| multiGet | 0 | Batch gets together into groups of N |
| randomSleep | 0 | Do a random sleep before each get between 0 and entered value |
| inMemoryCF | false | Tries to keep the HFiles of the CF inmemory as far as possible |
| presplitRegions | 0 | Create a pre-split table with this many regions |
| replicas | 1 | Enable region replica testing |
| splitPolicy | null | Specify a custom RegionSplitPolicy for the table |
| compression | NONE, LZO, GZ, SNAPPY, LZ4, BZIP2, ZSTD | Compression type to use |
| bloomType | NONE, ROW, ROWCOL | Bloom filter type |
| blockSize | 65536 | Blocksize to use when writing out hfiles |
| blockEncoding | NONE, PREFIX, DIFF, FAST_DIFF, ROW_INDEX_V1 | Block encoding to use |
| valueRandom | false | Set if we should vary value size between 0 and valueSize |
| valueZipf | false | Set if we should vary value size between 0 and valueSize in zipf |
| valueSize | 1000 | Value size to use |
| period | perClientRunRows / 10 | Report every period rows |
| cycles | 1 | How many times to cycle the test |
| columns | 1 | Columns to write per row |
| caching | 30 | Scan caching to use |
| addColumns | true | Adds columns to scans/gets explicitly |
| inMemoryCompaction | NONE, BASIC, EAGER, ADAPTIVE | Makes the column family perform in-memory flushes/compactions |
| asyncPrefetch | false | Enable asyncPrefetch for scan |
| cacheBlocks | true | Set the cacheBlocks option for scan |
| scanReadType | DEFAULT, STREAM, PREAD | Set the readType option for scan |
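These parameters are passed as --name=value (or --name for booleans) flags before the workload name and client count. As a sketch, the argument array in the launcher above could be swapped for something like the following; note that a few flag spellings differ from the field names in the table (for example, presplitRegions is passed as --presplit) and may vary between HBase versions.

String[] peArgs = {
    "--nomapred",           // threads instead of MapReduce
    "--rows=1000000",       // perClientRunRows
    "--presplit=16",        // presplitRegions, passed as --presplit
    "--compress=SNAPPY",    // compression
    "--valueSize=200",      // valueSize
    "sequentialWrite", "8"  // workload, then number of clients
};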
YCSB
YCSB mainly consists of two components: the Client and the DB layer. The Client runs a specific workload against a database. DB defines the common interface operations, such as read, scan, update, insert, and delete, and each specific database binding (HBase, Cassandra, and so on) implements them.
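As a sketch of what such a binding looks like, here is a skeleton that extends DB and stubs out the five operations. The method signatures follow recent YCSB releases under the site.ycsb package; older releases use com.yahoo.ycsb and slightly different parameter types.

import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.Vector;
import site.ycsb.ByteIterator;
import site.ycsb.DB;
import site.ycsb.Status;

public class SketchDB extends DB {
  @Override
  public Status read(String table, String key, Set<String> fields,
      Map<String, ByteIterator> result) {
    // A real HBase binding would issue a Get here and copy the cells into result.
    return Status.OK;
  }

  @Override
  public Status scan(String table, String startkey, int recordcount,
      Set<String> fields, Vector<HashMap<String, ByteIterator>> result) {
    // A real binding would run a range scan of recordcount rows starting at startkey.
    return Status.OK;
  }

  @Override
  public Status update(String table, String key, Map<String, ByteIterator> values) {
    // Apply the given field/value pairs to an existing record.
    return Status.OK;
  }

  @Override
  public Status insert(String table, String key, Map<String, ByteIterator> values) {
    // Write a new record with the given field/value pairs.
    return Status.OK;
  }

  @Override
  public Status delete(String table, String key) {
    // Remove the record.
    return Status.OK;
  }
}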
Workloads
| Workload | Brief | Distribution | Description |
|---|---|---|---|
| A | Update Heavy | zipfian | 50 percent reads and 50 percent updates |
| B | Read Heavy | zipfian | 95 percent reads and 5 percent updates |
| C | Read Only | zipfian | 100 percent reads |
| D | Read Latest | latest | 95 percent reads and 5 percent inserts |
| E | Short Ranges | uniform | 95 percent scans and 5 percent inserts |
| F | Read-modify-write | zipfian | 50 percent reads and 50 percent read-modify-writes |
Distribution
The distribution determines the probability that a given record is operated on (insert, update, read, scan). The following values are supported:
- Uniform: choose a record at random; all records are equally likely to be chosen.
- Zipfian: some records are very popular, while most are unpopular.
- Latest: similar to Zipfian, but the most recently inserted records are the most popular.
- Sequential: records are picked sequentially within a range [start, end).
- Hotspot: divides records into two parts, a hot set and a cold set, with a configurable probability of operating on the hot set.
- Exponential: records follow an exponential distribution; records with lower numbers are more popular.
Parameters
- HBase Configurations:
| Parameter | Default / Options | Description |
|---|---|---|
| clientbuffering | false | Whether to buffer mutations on the client |
| writebuffersize | null | Write buffer size for the client |
| durability | USE_DEFAULT, SKIP_WAL, ASYNC_WAL, SYNC_WAL, FSYNC_WAL | Durability of a mutation |
| kerberos | SIMPLE, KERBEROS | Client authentication method |
| principal | null | If Kerberos is enabled, the client principal |
| hbase.security.authentication | null | If Kerberos is enabled, the location of the principal |
| table | usertable | Table name |
| debug | null (true/false) | Print debug messages |
| hbase.usepagefilter | true | Whether to use a page filter |
| columnfamily | (required) | Column family of the table |
- Core Workload Configurations:
| Parameter | Default / Options | Description |
|---|---|---|
| recordcount | (required) | Number of records to insert into the table during the load phase |
| table | usertable | Table name |
| fieldcount | 10 | Number of fields in a record |
| fieldlengthdistribution | constant, uniform, zipfian, histogram | Field length distribution |
| fieldlength | 100 | Length of a field in bytes |
| fieldlengthhistogram | hist.txt | Filename containing the field length histogram |
| readallfields | true | Whether to read one field (false) or all fields (true) of a record |
| writeallfields | false | Whether to write one field (false) or all fields (true) |
| dataintegrity | false | Whether to check the integrity of all returned data |
| readproportion | 0.95 | Proportion of transactions that are reads |
| updateproportion | 0.05 | Proportion of transactions that are updates |
| insertproportion | 0.0 | Proportion of transactions that are inserts |
| scanproportion | 0.0 | Proportion of transactions that are scans |
| readmodifywriteproportion | 0.0 | Proportion of transactions that are read-modify-writes |
| requestdistribution | (see Distribution above) | Distribution of requests across the keyspace |
| zeropadding | 1 | Zero padding applied to record numbers in order to match string sort order |
| maxscanlength | 1000 | Maximum scan length |
| scanlengthdistribution | uniform, zipfian | Scan length distribution |
| insertorder | hashed, ordered | Order in which records are inserted |
| hotspotdatafraction | 0.2 | Fraction of data items that constitute the hot set |
| hotspotopnfraction | 0.8 | Fraction of operations that access the hot set |
| core_workload_insertion_retry_limit | 0 | Number of times to retry when an insertion to a DB fails |
| core_workload_insertion_retry_interval | 3 | Seconds to wait between retries |
| operationcount | 3000000 | Number of operations to use during the run phase |
| insertstart | 0 | Offset of the first insertion |
| measurementtype | histogram | How latency measurements are presented |
| histogram.buckets | 1000 | Range of latencies to track in the histogram (milliseconds) |
| timeseries.granularity | 1000 | Granularity of the time series (in milliseconds) |
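In practice these settings usually live in a .properties workload file passed to the ycsb launcher with -P, with per-run overrides via -p. As a sketch, a custom workload combining a few entries from the two tables above could be generated like this; the workload property points at CoreWorkload, whose package is site.ycsb.workloads in recent releases (com.yahoo.ycsb.workloads in older ones).

import java.io.FileWriter;
import java.util.Properties;

public class WorkloadSketch {
  public static void main(String[] args) throws Exception {
    Properties p = new Properties();
    p.setProperty("workload", "site.ycsb.workloads.CoreWorkload"); // core workload implementation
    p.setProperty("columnfamily", "family");      // HBase binding: required column family
    p.setProperty("clientbuffering", "true");     // HBase binding: buffer mutations on the client
    p.setProperty("recordcount", "1000000");      // records to load
    p.setProperty("operationcount", "3000000");   // operations in the run phase
    p.setProperty("readproportion", "0.7");
    p.setProperty("updateproportion", "0.3");
    p.setProperty("requestdistribution", "zipfian");
    p.setProperty("fieldcount", "10");
    p.setProperty("fieldlength", "100");
    // Write the property set out as a workload file for "ycsb run hbase20 -P myworkload.properties".
    try (FileWriter out = new FileWriter("myworkload.properties")) {
      p.store(out, "custom YCSB workload for HBase");
    }
  }
}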
Overall
I tried both. YCSB is more authoritative in its test cases and more informative in its result reports, while PerformanceEvaluation is more like a functional test; reading its source can teach you how to write an effective HBase client program.