Ilkyu Song - SPO600: SPO600 Project

For stage 1 of my project I chose Redis (Remote Dictionary Server). Redis is open-source software developed by Salvatore Sanfilippo. It is a key-value store that keeps and manages its data in memory, so it is volatile by nature, but it can also persist that data. Let's look at the benefits and data types of Redis.

1. The advantages of Redis

Advantage	Description
Specialized for processing data in lists and arrays.	• Values support several data types: string, list, set, sorted set, and hash. • Inserting and deleting list-type data is about 10 times faster than in MySQL.
Redis transactions are atomic.	Atomic processing prevents data mismatches when several processes request an update of the same key at the same time.
Persistent data preservation while utilizing memory.	• Data is not deleted unless it is explicitly deleted with a command or an expiry is set. • The snapshot function saves the contents of memory as a *.rdb file, which can be used to restore the data to that point in time.
Multiple server configurations.	Consistent hashing or a master-slave configuration.
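
The "consistent hashing" entry above can be made concrete with a minimal sketch. The class name `HashRing`, the node names, and the choice of SHA-256 with 100 virtual points per node are all assumptions for illustration; this is not taken from Redis or from any particular Redis client.

```python
import bisect
import hashlib


def ring_hash(key: str) -> int:
    """Map a string to a ring position (first 8 bytes of its SHA-256 digest)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Toy consistent-hash ring (hypothetical helper, not a Redis API)."""

    def __init__(self, nodes, replicas=100):
        self.replicas = replicas   # virtual points per node
        self._points = []          # sorted ring positions
        self._owner = {}           # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        # Place several virtual points per node to spread keys more evenly.
        for i in range(self.replicas):
            point = ring_hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def remove_node(self, node):
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def node_for(self, key):
        # The owner is the first ring point clockwise from the key's hash,
        # wrapping around to the start of the ring.
        idx = bisect.bisect_right(self._points, ring_hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


# Hypothetical server names, used only to exercise the ring.
ring = HashRing(["redis-a", "redis-b", "redis-c"])
print(ring.node_for("user:42"))
```

The point of the ring is that adding or removing one server only remaps the keys that land on that server, instead of reshuffling every key as a plain `hash(key) % server_count` scheme would.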

2. Redis provides five data types, and each data type has many commands of its own.

Data Type	Description
String	• A string value is not limited to text: • binary data can also be saved (note that Redis has no separate integer or real-number types). • The maximum size of the data stored under one key is 512 MB.
List	• We can think of it as an array. • The maximum number of elements in a key is 4,294,967,295. • Small lists are stored compactly as a zip list; once a list grows beyond the limits set in the configuration file, it is encoded as a linked list.
Set	• An unordered collection type with no duplicate data within a key. • Adding, removing, and checking for existence take constant time regardless of the number of elements in the set. • The maximum number of elements in a key is 4,294,967,295.
Sorted sets	• Sorted sets are often called the most advanced Redis data type. • Adding, removing, and updating elements is very fast: the time is proportional to the logarithm of the number of elements. • They can be used in ranking systems. • Each element has a real-valued score, and elements are sorted in ascending order by score. • A key holds no duplicate elements, but score values can be duplicated.
Hashes	• Similar to lists, but made up of a series of "field name" and "field value" pairs. • The maximum number of field-value pairs in a key is 4,294,967,295.
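
To illustrate the sorted-set rules in the table (unique members, scores that may repeat, ascending order by score), here is a small pure-Python model. `ToySortedSet` and its method names are hypothetical, and it only imitates the behaviour: this list-based toy has linear-time inserts, so it does not reproduce the logarithmic update cost described above.

```python
import bisect


class ToySortedSet:
    """Pure-Python model of sorted-set semantics, not Redis's implementation."""

    def __init__(self):
        self._score = {}   # member -> score (members are unique)
        self._order = []   # sorted list of (score, member) pairs

    def add(self, member, score):
        """Insert a member, or move it if it already exists with another score."""
        if member in self._score:
            old = (self._score[member], member)
            self._order.pop(bisect.bisect_left(self._order, old))
        self._score[member] = score
        bisect.insort(self._order, (score, member))

    def remove(self, member):
        score = self._score.pop(member)
        self._order.pop(bisect.bisect_left(self._order, (score, member)))

    def rank(self, member):
        """Zero-based position of a member in ascending score order."""
        return bisect.bisect_left(self._order, (self._score[member], member))

    def range(self, start, stop):
        """Members from rank start to rank stop (inclusive), ascending by score."""
        return [m for _, m in self._order[start:stop + 1]]


# A leaderboard-style example with made-up players and scores.
board = ToySortedSet()
board.add("alice", 120)
board.add("bob", 95)
board.add("carol", 120)   # a duplicate score is allowed
board.add("bob", 130)     # re-adding a member updates its score
print(board.range(0, 2))  # ['alice', 'carol', 'bob']
```

Members with equal scores are ordered by name here because Python compares the `(score, member)` tuples element by element.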

I compiled the Redis benchmark program with different compile options and ran it on the aarchie and x86 servers with different command counts. The results below are the number of commands executed per second. The aarchie server is a bit faster than the x86 server, although the results did not differ much from the test in stage 1. However, the specifications of the two servers are so different that a simple comparison is difficult. Some developers and architects tend to judge performance from the code alone, without considering hardware specs. Yet the first thing to consider when tuning a database or optimizing code is the hardware specification.

1. aarchie

2. x86

Moreover, I ran the benchmark again in stage 2. This time I chose a source file from the Redis library and benchmarked it, then used inline assembly (asm) to optimize the code. However, assembly does not guarantee better optimization than C, so it is better to try optimizing in C first. The two figures below show the results for the original source and for the inline-assembly version. The two results are very similar.

I am now working on stage 3 and thinking about code optimization again. Code optimization is a program transformation technique that improves code so that it consumes fewer resources (i.e., CPU and memory) and produces faster machine code. I think I should keep this meaning in mind. At first I tried to optimize by changing only the code, using the little I knew. I thought that rewriting the code alone would speed up execution, and that changing the compile options would speed up the program. However, for a simple program the difference is small.

I have to keep a few things in mind for code optimization. First, I need to know exactly the OS or platform environment where my program will run. (In fact, the library I chose in stage 2 did not run on x86.) Second, I should know the specs of the machine on which my program will run, so that I can give users the minimum recommended specification for my program. Finally, I should benchmark repeatedly: to make a good program, I have to test it many times. If I follow these three things, I will be able to develop a program that is close to optimal.

As I worked through this course project, I gained not only knowledge of code optimization but also experience. In programming, experience is as important as coding skill, and this experience will be very beneficial to me. This project also taught me how to work with an upstream project. Finally, code optimization and portability are not simply a matter of changing the program code: I have to be knowledgeable about operating systems, platforms, and hardware.
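
The advice above to benchmark repeatedly can be sketched as a small timing harness. This is only an illustration under stated assumptions: the function names are hypothetical, and `fake_set` is a stand-in dictionary write, not a real Redis command or the benchmark I actually ran.

```python
import statistics
import time


def ops_per_second(func, n_ops):
    """Run func n_ops times and return the measured operations per second."""
    start = time.perf_counter()
    for _ in range(n_ops):
        func()
    elapsed = time.perf_counter() - start
    return n_ops / elapsed


def repeated_benchmark(func, n_ops=10_000, repeats=5):
    """Repeat the measurement and summarize it, since a single run is noisy."""
    samples = [ops_per_second(func, n_ops) for _ in range(repeats)]
    return {
        "median": statistics.median(samples),
        "min": min(samples),
        "max": max(samples),
        "stdev": statistics.stdev(samples) if repeats > 1 else 0.0,
    }


# Hypothetical stand-in workload: a dictionary write instead of a Redis command.
store = {}


def fake_set():
    store["key"] = "value"


summary = repeated_benchmark(fake_set)
print(summary)
```

Reporting the median together with the spread makes it easier to tell whether a small difference between two builds, or between two servers, is real or just run-to-run noise.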


Sunday, 22 April 2018

SPO600 Project - Stage 3
