Uppercase .S vs Lowercase .s File Extensions in GAS Syntax Assembly

October 18, 2020

If you have ever written assembly for the GNU Assembler (GAS), you may have noticed that files sometimes have an .S extension and sometimes .s. This is not a meaningless distinction, and you could have a frustrating time if you accidentally use the wrong one.

The uppercase .S indicates that the file contents should be run through the C preprocessor before being assembled, while the lowercase .s indicates that the file contents should be assembled directly. It is the gcc driver, which we use below, that acts on the extension. Let’s take a look at an example in x86 assembly.

Note: GAS uses AT&T syntax by default for x86 assembly, which is what we use below. You can instruct the assembler to use Intel syntax with the .intel_syntax directive.

lowercase.s

.global _start

.text
_start:
    mov     $1, %rax
    mov     $1, %rdi
    mov     $message, %rsi
    mov     $10, %rdx
    syscall

    mov     $60, %rax
    xor     %rdi, %rdi
    syscall
message:
    .ascii  "lowercase\n"

What does this program do? It is a basic “Hello World” program in x86-64 assembly for Linux. The first section of _start calls the write syscall by loading the syscall number ($1) into the 64-bit general purpose accumulator register (%rax), the first argument to the syscall ($1, the file descriptor for standard output) into the destination index register (%rdi), the address of our .ascii text into the source index register (%rsi), and finally the length of our message in bytes into the data register (%rdx). Then, we invoke the syscall. The subsequent section does a similar step, but this time with the exit syscall ($60), using the xor of %rdi with itself (which always yields 0) to set an exit status of 0 and let the system know we exited successfully.
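The length loaded into %rdx has to match the number of bytes in the string. One quick way to check that 10 is right is to count the bytes with standard shell tools. This is only a sketch, and the variable name len is ours, not part of the program:

```shell
# printf turns \n into a single newline byte, just as .ascii does.
# tr strips the padding that some wc implementations add.
len=$(printf 'lowercase\n' | wc -c | tr -d ' ')
echo "$len"
```

"lowercase" is nine characters plus one newline, and "uppercase" has the same length, which is why the same $10 works for both strings used in this post.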

We can assemble, link, and run this program with the following commands:

gcc -c lowercase.s && ld lowercase.o && ./a.out

Output:

lowercase

We didn’t use any C preprocessor directives, so this worked as expected. Let’s look at a variation of this program, this time with the uppercase .S extension.

uppercase.S

#define UPPER

.global _start

.text
_start:
    mov     $1, %rax
    mov     $1, %rdi
    mov     $message, %rsi
    mov     $10, %rdx
    syscall

    mov     $60, %rax
    xor     %rdi, %rdi
    syscall

message:
#ifdef UPPER
    .ascii  "uppercase\n"
#else
    .ascii  "lowercase\n"
#endif

Now we have defined UPPER and added conditional behavior based on it being defined. Let’s run it and see what happens:

gcc -c uppercase.S && ld uppercase.o && ./a.out

Output:

uppercase

Now remove the #define UPPER directive and run again.

Output:

lowercase
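Instead of editing the file, the macro can also be supplied on the command line with -D, and the preprocessor output can be inspected with -E. The following is a minimal sketch, assuming gcc is installed. The scratch file cond.S and the variable names are hypothetical, and the file contains only the conditional part of the program:

```shell
# Hypothetical scratch file: the conditional from uppercase.S,
# with no #define UPPER inside it.
workdir=$(mktemp -d)
cat > "$workdir/cond.S" <<'EOF'
message:
#ifdef UPPER
    .ascii  "uppercase\n"
#else
    .ascii  "lowercase\n"
#endif
EOF

# -E stops after preprocessing, -P omits line markers,
# -DUPPER defines the macro as if the file contained #define UPPER.
with_define=$(gcc -E -P -DUPPER "$workdir/cond.S")
without_define=$(gcc -E -P "$workdir/cond.S")

echo "$with_define"
echo "$without_define"
```

The first result should keep only the "uppercase" line and the second only the "lowercase" line, which mirrors the two runs above.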

Lastly, let’s change the file name from uppercase.S to uppercase.s and try to execute it.

gcc -c uppercase.s && ld uppercase.o && ./a.out

Output:

uppercase

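Though we don’t have #define UPPER in the program, we still get the uppercase output. Because of the lowercase .s extension, the file did not go through the preprocessor, so the #ifdef, #else, and #endif directives were not honored. The assembler treats lines beginning with # as comments, so both .ascii directives were assembled one after the other, and since our write call only prints the first 10 bytes at message, only the first string appears.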
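If renaming the file is not an option, gcc can be told to preprocess it anyway with -x assembler-with-cpp. The sketch below assumes gcc is installed. The scratch file cond.s and the variable names are hypothetical, and only the conditional part of the program is included:

```shell
# Hypothetical scratch file: the conditional from the program, saved with
# a lowercase .s extension and with no #define UPPER.
workdir=$(mktemp -d)
cat > "$workdir/cond.s" <<'EOF'
message:
#ifdef UPPER
    .ascii  "uppercase\n"
#else
    .ascii  "lowercase\n"
#endif
EOF

# -x assembler-with-cpp makes gcc treat the input as assembly that needs
# preprocessing, whatever its extension. -E -P prints the preprocessed text.
forced=$(gcc -x assembler-with-cpp -E -P "$workdir/cond.s")
echo "$forced"
```

With preprocessing forced, only the "lowercase" line should survive, as it did for the .S file. The same flag can be passed in a full build, for example gcc -x assembler-with-cpp -c uppercase.s.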

Send me a message @hasheddan on Twitter for any questions or comments!