a blog by @captainsafia
Looking into ls
Oh hey there! It looks like I’m on this little streak where I get way into implementations for command line functions. In my last few blog posts, I dove into the code for sudo and cd. Today, I thought I would look into another oft-used command in my development toolkit: ls.
In my last blog post, I learned that cd was a built-in command and relied on the chdir Unix function to change the current working directory. As it turns out, ls is not a built-in command in the Bash shell. I confirmed this by doing the following.
$ which ls
/bin/ls
So the ls command is implemented as an actual binary. The source for this binary is located in the coreutils package within the GNU ecosystem. Some Googling and searching reveals that the source for the ls command is located here.
Side note: I know the “Googling and searching” bit glosses over a lot of things. For the most part, I’m very literally searching on Google for things like “ls source code” or searching through Git repositories for queries like “ls.” If you want more information about my “Google for code exploration” practices, let me know.
Since this is a C file, the tried and true place to start exploring this codebase is to read the main function. The main function is the entry point for C programs, so when we run ls, we start executing the code defined in the main function. The first couple of lines in the function are responsible for basic variable initialization.
int i;
struct pending *thispend;
int n_files;

initialize_main (&argc, &argv);
set_program_name (argv[0]);
setlocale (LC_ALL, "");
bindtextdomain (PACKAGE, LOCALEDIR);
textdomain (PACKAGE);

initialize_exit_failure (LS_FAILURE);
atexit (close_stdout);

assert (ARRAY_CARDINALITY (color_indicator) + 1
        == ARRAY_CARDINALITY (indicator_name));

exit_status = EXIT_SUCCESS;
print_dir_name = true;
pending_dirs = NULL;

current_time.tv_sec = TYPE_MINIMUM (time_t);
current_time.tv_nsec = -1;

i = decode_switches (argc, argv);
One of the more interesting-looking function calls in the code above is the initialize_main function call. As it turns out, it’s not that interesting (well, depends on your definition of interesting)! I tried to find the definition of initialize_main but was rather disappointed when I did find it.
/* Redirection and wildcarding when done by the utility itself.
   Generally a noop, but used in particular for native VMS. */
#ifndef initialize_main
# define initialize_main(ac, av)
#endif
I have no idea what the heck a native VMS is (I don’t think this has anything to do with VMs as in virtual machines). I’ll have to look into this later.
The second function that caught my eye was the decode_switches function. I was particularly interested in this function because, like the initialize_main function, it takes the arguments of the main function (that is, the arguments you pass it at the command line) as its parameters.
Instead of reading the code for the decode_switches function, I just read the comment above the function definition to determine what it did.
/* Set all the option flags according to the switches specified.
   Return the index of the first non-option argument. */
static int
decode_switches (int argc, char **argv)
So it looks like that function is responsible for setting option flags. These option flags are things like the sort order and the display format that you would like the ls command to have.
At this point, it would be good to mention that as I read through the code for the ls command, I realized that the bulk of it is responsible for formatting and rendering the output of the command. The entire source file is about 5,300 lines long, but a majority of it defines functions that handle things like “how do I print out a list of X words in Y columns?” or “how do I print this string with the correct separator?”
This is my way of saying I’m getting bored of reading all this C code and that I’ll end the blog post here. In summary, when you run the ls command on your Unix machine, you’re running the coreutils implementation of the ls command. This implementation is written in C and is responsible for parsing the arguments that you provide and appropriately rendering the results of the ls command in the pristine format you see in your command line.