WepSIM

Authors: Alejandro Calderón Mateos & Felix García Carballeira
License: GPL v3.0

Laboratory 1: assembly, power-consumption and security

The main goal of the proposed laboratory is to understand the concepts related to assembly programming, but also the impact of power-consumption in security. For this purpose we are going to use the RISC-V assembler. In order to become familiar with assembly programming and simulator used, it is recommended to solve the examples available in the simulator.

This laboratory consists of 3 exercises that you can develop and test in WepSIM.

Exercise 1

The goal of this exercise is to develop in RISC-V_imf32 assembler a function called "string_compare" to work with strings:

Function string_compare ( char A[], char B[] ) ;

This function allows to compare two strings so that it is possible to know if the strings stored in memory are the same or not.

This function receives the following arguments in the order given:

Argument 1: (A) starting address of a string that ends with the ASCII code 0.
Argument 2: (B) starting address of a string ending in end of string (ASCII code 0).

The function returns a single integer that takes one of these three possible values:

On error, it returns the integer -1.
If the two strings are the same, it returns the integer 1 (one).
If the two strings are different, it returns the integer 0 (zero).

The two possible errors to be considered in this function are that address A is null (zero value) and that address B is null (zero value).
In the "string_compare" function, in case of finding an error, it must return only the value -1 without doing anything else. In the same way, the function must compare character by character and as soon as it is a different one it must return 0.

The following examples illustrate...

...when the content of two strings is different:	...when the content of two strings that are equals:	...when an error ocurrs:
.data # 'u', 'n', 'o', '\0' A: .string "uno" # 'd', 'o', 's', '\0' B: .string "dos" .text string_compare: ... jr ra main: la a0 A la a1 B jal ra string_compare # <- a0 must be 0 # at this point # because are # differente string ...	.data # 'u', 'n', 'o', '\0' A: .string "uno" # 'u', 'n', 'o', '\0' B: .string "uno" .text string_compare: ... jr ra main: la a0 A la a1 B jal ra string_compare # <- a0 must be 1 # at this point # because are # the same string ...	.data # 'u', 'n', 'o', '\0' A: .string "uno" # 'u', 'n', 'o', '\0' B: .string "dos" .text string_compare: ... jr ra main: li a0 0 move a1 x0 jal ra string_compare # <- a0 must be -1 # at this point # because # an error happens ...

...when the content of two strings is different:

...when the content of two strings that are equals:

...when an error ocurrs:

.data
      # 'u', 'n', 'o', '\0'
  A: .string "uno"
      # 'd', 'o', 's', '\0'
  B: .string "dos"

.text

 string_compare:
    ...
    jr ra

 main: 
    la a0 A
    la a1 B
    jal ra string_compare

    #  <- a0 must be 0
    #     at this point 
    #     because are 
    #     differente string
    ...

.data
      # 'u', 'n', 'o', '\0'
  A: .string "uno" 
      # 'u', 'n', 'o', '\0'
  B: .string "uno"

.text

 string_compare:
    ...
    jr ra

 main: 
    la a0 A
    la a1 B
    jal ra string_compare

    #  <- a0 must be 1
    #     at this point 
    #     because are 
    #     the same string
    ...

.data
      # 'u', 'n', 'o', '\0'
  A: .string "uno"
      # 'u', 'n', 'o', '\0'
  B: .string "dos"

.text

 string_compare:
    ...
    jr ra

 main: 
    li   a0 0
    move a1 x0
    jal ra string_compare
    #  <- a0 must be -1
    #     at this point 
    #     because 
    #     an error happens
    ...

Click here for some help on this exercise...

Tip: this subrutine is part of the WepSIM example 21

.text
strcmp2:   # save stack
           addi sp sp -20
           sw s0 16(sp)
           sw s1 12(sp)
           sw s2  8(sp)
           sw s3  4(sp)
           sw s4  0(sp)

           # s3 -> 1 (found) | 0 (not found)
           # s4 -> # iterations
           li s3 0
           li s4 0

       l3: # s0 <- a0[s4]
           add  s0 a0 s4
           lb   s0 0(s0)
           # s1 <- a1[s4]
           add  s1 a1 s4
           lb   s1 0(s1)
           # (a0 == a1) ?
           bne  s0 s1 no3
           beq  s0 x0 yes3
           addi s4 s4 1
           beq  x0 x0 l3

     yes3: li a0 1
           beq x0 x0 e3
      no3: li a0 0
           beq x0 x0 e3

       e3: # restore stack
           lw  s4  0(sp)
           lw  s3  4(sp)
           lw  s2  8(sp)
           lw  s1 12(sp)
           lw  s0 16(sp)
           addi sp sp 20

           # return
           jr ra

Exercise 2

The goal of this exercise is to study the impact of number of clock cycles when using the string_compare function, especially its use for a possible side-channel attack:

Because more clock cycles needed by a function should be directed related to the power consumption in a normal CPU, we can know the input, output and the related power consumption for string_compare.
In this program, the rdcycle rd RISC-V pseudoinstruction is used.

This instruction stores in "rd" the number of cycles that have been executed so far.

Imagine that a security company uses the string_compare function of Exercise 1 to check if an user's stored key ("stored_key") is the same as the key that the user enters on the keyboard to authenticate.
To use string_compare the code used by this company performs the following steps:

The user introduces the key by keyboard and it is stored in "user_key".
The string_compare function is called to check that the key "user_key" entered by keyboard is valid by comparing it with the user's key that is stored in "stored_key", so if it returns 1 it is valid and if it returns 0 it is an invalid key.
A valid or invalid message is printed on the screen.

As an internship student of this company, we suspect that when comparing the key "one" with a string "a" the function only makes one single comparison between letters ('o' versus 'a'), but when comparing "one" with the string "o" the function makes two comparisons ('o' is equal to 'o' and 'n' is not equal to the end of the string). This causes the number of cycles executed and, therefore, the power consumption when the first letter is matched to be higher.

To demonstrate this hypothesis, we initially propose the following program skeleton:

.data
    valid_msg:      .string   ":Valid:"
    no_valid_msg:   .string   ":Invalid:"

    stored_key:     .string  "one"
    possible_key:   .string  "a"

.text
main:  ...
       # Obtain the cycles executed so far
       rdcycle	t0

       # call ro string_compare (stored_key, possible_key)
       ...
       jal ra string_compare
       mv  t1 a0              # t1: 1 ok y 0 not valid

       # Obtain the cycles executed so far
       rdcycle  t2
       sub t0 t2 t0           # t0: Number of elapsed cycles 

       # Print messages
       beq zero t1 p_no
p_yes: la  a0 valid_msg
       j end
p_no:  la  a0 no_valid_msg
end:   li  a7 4
       ecall

       # Print the number of cycles 
       mv a0 t0
       li a7 1
       ecall
       ...

The objective of this exercise is to modify the previous program to calculate the number of cycles (power consumption) for letters 'a' to 'z' as the first character of the key. That is, it is to find out the power consumption (number of cycles) for the following values of possible_key: 'a', 'b' ... 'z'. Only the 27 lowercase letters of the alphabet.

Note, that possible_key is a string, i.e. when you want to test with the letter 'k', the string will have the value "k", which is equivalent to having two characters in the string: one the ASCII code of the 'k' and then the code 0 indicating the end of the string. It is recommended to study how the characters 'a'...'z' are stored using the ASCII code.

For this purpose, the function has to be developed:

Function study_energy ( char password[], char dummy[] ) ;

This function prints in each line, for each of the letters, the following information:

letter: number of cycles.

For example, a possible output would be this:

a: 10
b: 10
c: 10
d: 20
e: 10
.
.
z: 10

Note that in this case the key starts with the letter d, since the number of cycles in this case is higher.

This function receives the following arguments in the given order:

Argument 1: (password) starting address of a string with the key to be discovered. Like all strings, it ends with the ASCII code 0 used as the end of the string.
Argument 2: (dummy) starting address of a string in memory. This string will always include a single character (in memory it will occupy two bytes: the ASCII code of the letter and the ASCII code 0 used as the end of the string).

The function does not return any value, it simply prints on the console a line for each letter of the alphabet with the format indicated above:

letter: number of cycles

It is mandatory to correctly follow the parameter passing convention described in the course for RISC-V, as well as to respect the function signature (name, number of parameters, order of parameters and value to be returned). Similarly, the study_energy function must use the string_compare function of the first exercise (it must call this function).

It will not be considered valid to develop the code of the functions directly in the main function. You must write the code corresponding to the requested function as a separate function.

Exercise 3

The goal of this exercise is to implement a simple function that performs a side-channel attack.
Consider that the keys are at most 8 characters for the study.
Through a brute-force attack, we would have to test up to 27⁸ (282,429,536,481) possible keys.

But with a side-channel attack, it would be possible to detect the first letter of the key by using the energy consumption (number of cycles executed), and once this character is fixed, it would be possible to study the 27 possible characters for the second letter and so on. In other words, possible combinations are reduced to 27*8 (216 possibilities), in case the key is just 8 characters long.

In Exercise 3, we have to implement in assembler RISC-V_32IMF a function called "attack" to discover a possible key of up to 8 characters:

Function attack ( char password[], char dummy[] ) ;

This function receives the following arguments in the provided order:

Argument 1: (password) starting address of the user key to be discovered. This string ends with the ASCII code 0 used as the end of the string.
Argument 2: (dummy) starting address of a string in memory where the detected key will be stored. This string must have space for 9 bytes (8 of the string plus the end of the string).

The function does not return any value, it simply stores in dummy memory address the value of the discovered key.

For the development of this exercise, you can start with the ideas of Exercise 2 and add everything necessary to perform the attack.
Note that this is an exercise, in a real scenario the contents of the user's password would never be known. However, the principle used side-channel attack is the same that has inspired attacks on current processors such as spectre.

Click here for some help on this exercise...

Tip: this subrutine is part of the WepSIM example 21

.data
   guessing: .byte   0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0
   password: .string "srt"
   nopass:   .string "x"

.text
strcmp2: ... # see exercise 1

         # return
         jr ra

main:    # save stack
         addi sp sp -12
         sw   ra 8(sp)
         sw   s0 4(sp)
         sw   s1 0(sp)
         
         # reference value to s1
         la   a0 nopass
         la   a1 password
         rdcycle  s0
         jal  ra strcmp2
         rdcycle  s1
         sub  s1 s1 s0        

         # loop to guess keyword
         la  t0 guessing
         li  t3 -1
         li  t4 1
     l1: # for (offset=0; t3 != 1; offset++) {
         beq t3 t4 e1
       
         li  t1 'a'   
         li  t2 'z'
     l2: # for (t1='a'; t1<'z'; t1++) {
         beq t1 t2 e2
       
         # guessing[offset] = t1
         sb  t1 0(t0)
       
         # t3 = strcpy(guessing, pass)
         la   a0 guessing
         la   a1 password
         sub  t5 t0 a0
         mv   a0 t0
         add  a1 a1 t5
         rdcycle  s0
         jal  ra strcmp2
         rdcycle  t3
         sub  t3 t3 s0
       
         # if (a0 == 1) return
         bne a0 x0 e1
         bgt t3 s1 e2 # 2+ loops
       
         # }
         addi t1 t1 1
         beq  x0 x0 l2
      
     e2: # }
         addi t0 t0 1
         beq  x0 x0 l1
       
     e1: # print(guessing)
         la a0 guessing
         li a7 4
         ecall

         # restore stack
         lw   s0 4(sp)
         lw   s0 4(sp)
         lw   ra 8(sp)
         addi sp sp 12

         # return
         jr ra

Extra material

Save a checkpoint with WepSIM

WepSIM allows you to save the entire working session into a single file. This session can include the requested microcode, the assembly code, the states at different execution points, and a recording of the work session. In this way it is more agile to continue the work or share that work among members of the laboratory group.

To do this you have to use what in the WepSIM simulator is called checkpoint. The following are the steps to save a checkpoint:

In the run mode menu select the "Checkpoint" option.
Enter the file name in the "File name:" field, then press the "Save" button to save the file.
Check that the file has been saved correctly and contains everything ordered in the statement.