\input texinfo
@c -*- mode: texinfo; -*-

@c %**start of header
@setfilename gash.info
@documentencoding UTF-8
@settitle Gash Reference Manual
@c %**end of header

@include version.texi

@copying
Copyright @copyright{} 2018 Rutger EW van Beusekom@*
Copyright @copyright{} 2018 Jan (janneke) Nieuwenhuizen@*
Copyright @copyright{} 2019 Timothy Sample@*

Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with no
Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.  A
copy of the license is included in the section entitled ``GNU Free
Documentation License''.
@end copying

@dircategory Basics
@direntry
* Gash: (gash).       Guile As SHell.
* gash: (gash)Invoking gash.       Running Gash, a shell written in Scheme.
@end direntry

@titlepage
@title Gash Reference Manual
@subtitle A POSIX-compatible shell written in Guile Scheme.
@author The Gash developers

@page
@vskip 0pt plus 1filll
Edition @value{EDITION} @*
@value{UPDATED} @*

@insertcopying
@end titlepage

@contents

@c *********************************************************************
@node Top
@top Gash

This document describes Gash version @value{VERSION}, a
POSIX-compatible shell written in Guile Scheme.

@menu
* Introduction::                What is Gash about?
* Using Gash::                  How to use Gash as a shell.
* Hacking with Gash::           How to use Gash as a Scheme library.
* GNU Free Documentation License::  The license of this manual.
@end menu

@c *********************************************************************
@node Introduction
@chapter Introduction

Gash is a POSIX-compatible shell written in Guile Scheme.  It has two
goals: bootstrapping GNU Bash and exposing the shell to Scheme.  In
the following sections we will unpack that definition and explain all
of its constituent parts.


@node What is a POSIX-compatible shell?
@section What is a POSIX-compatible shell?

When talking about operating systems, a @dfn{shell} refers to the
day-to-day user interface that a person would use to interact with the
system.  That's what you might learn in computer science school, but
in popular usage ``shell'' refers to a very particular type of
interface---the one that developed along with the Unix operating
system.  The Unix shell is a command-line interface designed mostly
for invoking programs and connecting programs together.  Although we
provide a whirlwind tour of shell features in @ref{Using Gash}, this
manual is not a good general introduction to using a Unix-like shell.
For that, try
@url{http://write.flossmanuals.net/command-line/introduction/,
Introduction to the Command Line}.

So now we know what a @emph{Unix} shell is, but what is a
@emph{POSIX-compatible} shell?  Over time, Unix inspired many similar
systems, all of which were mostly the same, but not quite compatible
with one another.  What worked on one Unix-like operating system might
not work on another.  To make it easier to write portable software,
everyone got together and described a minimal set of features that all
Unix-like systems should provide.  They called it the @dfn{Portable
Operating System Interface}, or @dfn{POSIX} for short.  Being a
@emph{POSIX-compatible} shell means that Gash should have at least all
of the features required by POSIX.@footnote{Note that Gash is merely
POSIX-compatible and @emph{not} POSIX-certified.  This means we try
our best, but we have not gotten an official stamp of approval from
The Open Group, who owns the POSIX trademark and administers a
certification program.}


@node What is Guile Scheme?
@section What is Guile Scheme?

@dfn{Scheme} is a cool programming language from the 70's.  It is a
very conceptually simple dialect of @dfn{Lisp}---a cool programming
language from the 50's.  Why are we interested in using such an old
programming language?  There are two reasons.  The first is that even
by modern standards, Scheme is a very expressive, high-level language.
Writing in such a language is a pleasure.  The second reason is that
Scheme's simplicity makes it well-suited to the bootstrapping task
(@pxref{What of Bash and its bootstraps?}).

@dfn{Guile} is a particular implementation of Scheme.  It is the
official extension language of the GNU project.  There are many
implementations of Scheme, but Guile is our choice mainly out of a
desire to integrate nicely with the GNU Guix project, which is also
written in Guile.


@node What of Bash and its bootstraps?
@section What of Bash and its bootstraps?

One of the most important functions of Gash is to bootstrap GNU Bash
(@pxref{What is Bash?,,, bash, Bash Reference Manual}).
@dfn{Bootstrapping} is, briefly, going from having no software to
having software.  What makes it hard is that you usually use software
to build software, and that snake has a habit of eating its own tail.
For example, Bash uses shell scripts to describe how it should be
built, so you need a shell to build a shell.  What if this other shell
needs shell scripts too?  Then you have a problem.  Normally this
problem goes unnoticed, because you already have a shell that you got
from somebody else, and that somebody got their shell from somebody,
and this process works its way back to the dawn of time---er, the dawn
of Unix time, which is the early 70's---when the first shell emerged
from the primordial soup.

Where you really run headlong into this problem is when you have a
complete, fine-grained description of a system's software dependency
graph, like they have over at the GNU Guix project.  Then, if you
follow the arrows back as far as you can, you can't miss the fact that
the first step in building the system is ``download many megabytes of
inscrutable binary programs from the Internet.''  There are a handful
of people working to change this, and Gash is a small part of this
larger project.  For all the details, see the
@url{https://bootstrappable.org/, Bootstrappable Builds website}.


@node What does Scheme want from the shell?
@section What does Scheme want from the shell?

There is a long history of mixing Scheme and the shell.  Most
famously, there is @url{https://scsh.net/, Scsh}, which is a hybrid
language that tries to combine the brevity of the shell with the
flexibility of Scheme.  There are a few other projects that mix Scheme
or Lisp with shell syntax, and many projects mixing the shell with
other popular programming languages.  A common goal of many of these
projects is to provide a way forward for shell scripts that have
become so large they are difficult to maintain.  Having a language
that has a similar syntax to shell scripts but provides more ways to
structure code would allow these scripts to continue to grow cleanly.

On the other hand, it can be useful to provide a shell-like interface
for Scheme environments.  For instance, the Guix System boots into
Guile and if anything goes wrong, it presents the user with a Scheme
REPL.  It would be nice to be able to drive Guile using shell syntax,
since most of what you need to do in that context is manipulate files
and execute external utilities.  Another Guix example is adding
build-system phases that are just shell scripts.@footnote{This one is
a little heretical, but I imagine it being attractive as Guix becomes
more popular.}

Ultimately, the future of Gash is yet to be explored, but there are
plenty of spaces it could gracefully move into.  Patches are welcome!

@c *********************************************************************
@node Using Gash
@chapter Using Gash

Gash is a mostly complete POSIX-compatible shell, so it should generally
work as you would expect.  However, it is missing a few POSIX features
(@pxref{Missing features}), and does not really have any features
beyond POSIX.  Specifically, many of the ``Bashisms'' that you may be
familiar with are not available.@footnote{@xref{Major Differences From
The Bourne Shell,,, bash, Bash Reference Manual}.}

In the following we cover the major features that work very well in
Gash.  You should feel confident using these features as you would in
any other POSIX-compatible shell.

All kinds of command invocation should just work.  This includes
setting temporary variables, redirects, and here-documents.  All of
the POSIX-specified ``special'' built-ins are available except for
@code{times}.  Only a handful of other built-ins are provided (like
@code{cd} and @code{pwd}).
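
The following sketch exercises a temporary variable assignment, a
here-document, and input and output redirects together.  It uses only
POSIX constructs, so it should behave the same way in any
POSIX-compatible shell.  The file name under @file{/tmp} and the
variable names are made up for illustration.

@example
# The assignment is visible to this one command only.
greeting=$(GREETING=hello sh -c 'echo "$GREETING"')
echo "$greeting"

# A here-document supplies standard input; ">" sends output to a file.
tmp=/tmp/gash-example.txt
cat <<EOF > "$tmp"
one
two
EOF

# "<" reads standard input from the file.
lines=$(wc -l < "$tmp")
rm -f "$tmp"
echo $lines
@end example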

All of the control structures are available and full-featured.  This
means @code{for}, @code{while}, @code{until}, @code{if}, and
@code{case}.  Functions and subshells also work fine.
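
As a small illustration, the sketch below combines a @code{for} loop,
a @code{case} statement, and a subshell.  The words being classified
and the variable names are invented for the example.

@example
# Classify each word with a for loop and a case statement.
summary=
for word in apple 42 banana
do
    case $word in
        [0-9]*) kind=number ;;
        *)      kind=word ;;
    esac
    summary="$summary$kind "
done
echo "$summary"

# A subshell keeps its change of directory to itself.
start=$(pwd)
(cd /; echo "inside the subshell: $(pwd)")
echo "back outside: $(pwd)"
@end example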

Field splitting and quoting are well-tested and should work even in
exceptional cases, like @code{"$@@"} and when @code{$IFS} is
manipulated.
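
Changing @code{$IFS} changes how unquoted expansions are split.  A
minimal sketch, with made-up variable names, that splits a
colon-separated list into the positional parameters and then restores
the original value:

@example
list=usr:local:bin

# Split on colons for this one expansion, then put IFS back.
old_ifs=$IFS
IFS=:
set -- $list
IFS=$old_ifs

count=$#
second=$2
echo "$count fields; the second is $second"
@end example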

You can set variables and mark them read-only or exported.  Many
special variables are available, and about half of the variable
operators (like @code{$@{VARIABLE+alternate@}}) work.

Both types of command substitution work (that is, @code{$(...)} and
@code{`...`}), and can even be nested.
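
For example, both of the following assignments nest one substitution
inside another.  The path is arbitrary and is never accessed, since
@command{dirname} and @command{basename} only manipulate strings.

@example
# With $(...), the inner substitution needs no special treatment.
parent=$(basename "$(dirname /usr/local/bin)")
echo "$parent"

# With backquotes, the inner pair has to be escaped.
same=`basename \`dirname /usr/local/bin\``
echo "$same"
@end example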

In general, Gash should not surprise you too much, and should run most
POSIX-compatible shell scripts.


@node Invoking Gash
@section Invoking Gash

The @command{gash} command is the sh interpreter.

@example
gash @var{option}@dots{} @file{FILE}
@end example

The @var{option}s can be among the following:

@table @code

@item -c@r{, }--command=@var{string}
Evaluate @var{string}, and then exit.

@item -e@r{, }--errexit
Exit immediately on any error.

@item -h@r{, }--help
Display help on invoking Gash, and then exit.

@item -p@r{, }--parse
Rather than evaluate a script, parse it and output its syntax tree.

@item -v@r{, }--version
Display the current version of Gash, and then exit.

@item -x@r{, }--xtrace
Print each command that is executed.

@end table
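
As a rough sketch of how these options might be used from another
shell, consider the following.  It assumes that @option{-e} and
@option{-c} can be combined on one command line, which is not stated
above.  The fallback to @command{sh} is only there so that the sketch
runs on a machine without Gash; @option{-c} and @option{-e} have the
same meaning in POSIX @command{sh}.

@example
# Use gash when it is installed, otherwise fall back to sh.
shell=gash
command -v gash > /dev/null 2>&1 || shell=sh

# -c evaluates the string and then exits.
hello=$("$shell" -c 'echo "Hello from Gash"')
echo "$hello"

# With -e, the failing command ends the script before the echo runs.
"$shell" -e -c 'false; echo "not reached"'
status=$?
echo "exit status: $status"
@end example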


@node Using Gash from the Guile REPL
@section Using Gash from the Guile REPL

Gash defines a language specification that extends Guile, allowing you
to use shell syntax from the REPL.  This is accomplished by using the
@code{language} REPL command:

@example
scheme@atchar{}(guile-user)> ,language sh
Happy hacking with Guile as Shell!  To switch back, type `,L scheme'.
sh@atchar{}(guile-user)> echo "Hello Gash!"
Hello Gash!
$1 = 0
@end example


@node Missing features
@section Missing features

In this section, we cover some of the features that are currently
missing in Gash.  Particularly, we look at the features specified by
POSIX that have not yet been implemented.  This list is not
exhaustive, but covers the most glaring omissions.

@itemize @bullet

@item
Arithmetic substitution.

@item
Asynchronous commands and job control.

@item
Alias creation and substitution.

@item
Certain special variables.  This includes @code{$ENV}; all the
localization variables like @code{$LANG}, @code{$LC_*}, and
@code{$NLSPATH}; the parent process ID variable (@code{$PPID}); and
the prompt variables (@code{$PS*}).

@item
Tilde expansion.

@item
Variable pattern operators and assertion operators.  This means that
@code{$@{FOO%pattern@}} and the like do not work, and neither does
@code{$@{FOO?@}}.

@item
Multi-line commands from the readline interface.  If you press
@key{Enter} in the middle of a command (e.g., from within an
unfinished quotation) Gash will not ask for more input but rather
treat it as a syntax error and exit with an error message.

@end itemize

These features will be added to a future version of Gash.  If you
would like to contribute to Gash by implementing one of these
features, please get in touch by writing to
@url{mailto:gash-devel@@nongnu.org, gash-devel@@nongnu.org}.
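
In the meantime, a script that has to run under Gash can often use
standard utilities in place of some of these features.  The sketch
below shows one possible approach, not something Gash prescribes.  It
assumes that @command{expr} and @command{basename} are installed, and
the variable names are illustrative.

@example
# In place of arithmetic substitution, such as $((count + 1)):
count=41
count=$(expr "$count" + 1)
echo "$count"

# In place of the pattern operator that strips a suffix:
file=archive.tar
stem=$(basename "$file" .tar)
echo "$stem"

# In place of tilde expansion:
config="$HOME/.config"
echo "$config"
@end example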

@c *********************************************************************
@node Hacking with Gash
@chapter Hacking with Gash

Have you ever wanted to write Scheme code that can understand and
manipulate shell scripts on the fly?  Have you ever wanted your Scheme
scripts to have the breezy convenience of a shell script?  You've come
to the right place!  In this chapter we will look at all the tools
that Gash offers for exploring and illuminating the dark canyon that
separates Scheme and the shell.


@node Parsing shell scripts
@section Parsing shell scripts

If you're reading this, you probably have some familiarity with
Scheme, and you have probably heard of Scheme's @code{read} procedure.
Gash exports a very similar function from the @code{(gash parser)}
module.  It's called @code{read-sh}, and as you may have guessed, it
reads shell code instead of Scheme code.

With Scheme code, the internal and external representations of the
code are basically the same, but this is not true with shell code.
You will have to learn and understand Gash's particular internal
representation of shell code to use it effectively.  Fortunately, it
is designed to be familiar to a Scheme programmer, and we provide many
examples here to get you up to speed quickly.  At the end of this
section, a formal definition of the internal representation is given
for reference.

@defun read-sh [port]
Read a complete shell command from the input port @var{port} if it is
specified, or from the current input port if @var{port} is not
specified.  A @dfn{complete command} is essentially something an
interactive shell would be able to execute immediately without
prompting for more input.  Technically, it corresponds to the
@code{complete_command} production rule defined in the POSIX
specification.
@end defun

For convenience, we also export the following procedure.

@defun read-sh-all [port]
Read all complete shell commands from the input port @var{port} if it
is specified, or from the current input port if @var{port} is not
specified.  This works much like writing your own loop using
@code{read-sh}, but it should be more efficient.  As a bonus, it
ensures that the result is always a list of commands.
@end defun


@node Internal representation examples
@subsection Internal representation examples

The easiest way to understand Gash's internal representation of shell
is through examples, so let's take a look at a few.  To simplify the
examples, we are going to abuse notation a little bit.  Instead of
writing

@example
(call-with-input-string "@var{SHELL-EXAMPLE}" read-sh)
@result{}
@var{RESULT}
@end example

we will simply write

@example
@var{SHELL-EXAMPLE}
@result{}
@var{RESULT}
@end example

The major benefit of doing this is that we do not have to escape the
shell examples to be proper Scheme strings.


@subsubheading Simple commands

The parser can read simple commands and split them into their
constituent parts:

@example
echo foo
@result{}
(<sh-exec> "echo" "foo")

echo $FOO
@result{}
(<sh-exec> "echo" (<sh-ref> "FOO"))

echo foo$BAR baz
@result{}
(<sh-exec> "echo" ("foo" (<sh-ref> "BAR")) "baz")
@end example

It preserves quotes, since they are significant for field splitting
and escaping special characters:

@example
echo "$FOO"
@result{}
(<sh-exec> "echo" (<sh-quote> (<sh-ref> "FOO")))

echo *
@result{}
(<sh-exec> "echo" "*")

echo '*'
@result{}
(<sh-exec> "echo" (<sh-quote> "*"))
@end example

It represents redirects and temporary variable assignments with a nod
to Scheme:

@example
CC=gcc CFLAGS='-O2 -g' make
@result{}
(<sh-exec-let> (("CC" "gcc")
                ("CFLAGS" (<sh-quote> "-O2 -g")))
  "make")

wc < gash.texi
@result{}
(<sh-with-redirects> ((< 0 "gash.texi"))
  (<sh-exec> "wc"))

POSIXLY_CORRECT=y sed N < file
@result{}
(<sh-with-redirects> ((< 0 "file"))
  (<sh-exec-let> (("POSIXLY_CORRECT" "y"))
    "sed" "N"))

;; Watch out for this one, though!
echo foo >| file
@result{}
(<sh-with-redirects> ((>! 1 "file"))
  (<sh-exec> "echo" "foo"))
@end example

Here documents get processed by the parser, so they only have one
form:

@example
cat <<EOF > hello.scm
(display "Hello world!\n")
EOF
@result{}
(<sh-with-redirects> ((<< 0 (<sh-quote> "(display \"Hello world!\\n\")\n"))
                      (> 1 "hello.scm"))
  (<sh-exec> "cat"))

cat <<EOF > hello.sh
hi=Howdy; echo $hi world!
EOF
@result{}
(<sh-with-redirects> ((<< 0 (<sh-quote>
                             ("hi=Howdy; echo " (<sh-ref> "hi") " world!\n")))
                      (> 1 "hello.sh"))
  (<sh-exec> "cat"))

;; That's not what we meant!  Let's try again.
cat <<\EOF > hello.sh
hi=Howdy; echo $hi world!
EOF
@result{}
(<sh-with-redirects> ((<< 0 (<sh-quote> "hi=Howdy; echo $hi world!\n"))
                      (> 1 "hello.sh"))
  (<sh-exec> "cat"))
@end example
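
To see why the quoting of the delimiter matters when the script runs,
the two forms can be compared in any POSIX-compatible shell.  This is
real shell code rather than the abbreviated notation used above, and
the variable names are illustrative.

@example
hi=Howdy

# Unquoted delimiter: the body is expanded.
expanded=$(cat <<EOF
$hi world!
EOF
)
echo "$expanded"

# Quoted delimiter: the body is taken literally.
literal=$(cat <<\EOF
$hi world!
EOF
)
echo "$literal"
@end example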


@subsubheading Compound commands

Loops look like they do in the shell:

@example
while test $GASH = confusing
do
    read-the-manual
done
@result{}
(<sh-while> (<sh-exec> "test" (<sh-ref> "GASH") "=" "confusing")
  (<sh-exec> "read-the-manual"))

until cows come home; do wait; done
@result{}
(<sh-until> (<sh-exec> "cows" "come" "home")
  (<sh-exec> "wait"))

for cat in Murray Flash Boots
do
    pet $cat
done
@result{}
(<sh-for> ("cat" ("Murray" "Flash" "Boots"))
  (<sh-exec> "pet" (<sh-ref> "cat")))
@end example

If-statements and case-statements are a bit more like Scheme:

@example
if test $FLAVOR = chocolate
then
    echo hooray!
elif test $FLAVOR = vanilla
then
    echo yum!
else
    echo too fancy!
fi
@result{}
(<sh-cond>
 ((<sh-exec> "test" (<sh-ref> "FLAVOR") "=" "chocolate")
  (<sh-exec> "echo" "hooray!"))
 ((<sh-exec> "test" (<sh-ref> "FLAVOR") "=" "vanilla")
  (<sh-exec> "echo" "yum!"))
 (<sh-else>
  (<sh-exec> "echo" "too" "fancy!")))

case $FLAVOR
in
    *mint*) echo blech! ;;
    *peanut*) echo so good! ;;
    *maple*) echo maybe! ;;
    *) echo sure why not! ;;
esac
@result{}
(<sh-case> (<sh-ref> "FLAVOR")
  (("*mint*") (<sh-exec> "echo" "blech!"))
  (("*peanut*") (<sh-exec> "echo" "so" "good!"))
  (("*maple*") (<sh-exec> "echo" "maybe!"))
  (("*") (<sh-exec> "echo" "sure" "why" "not!")))
@end example

Brace groups and subshells are pretty straightforward:

@example
@{ echo jump to the left;
  echo step to the right; @}
@result{}
(<sh-begin>
 (<sh-exec> "echo" "jump" "to" "the" "left")
 (<sh-exec> "echo" "step" "to" "the" "right"))

(cd tmp; cat ephemera.txt)
@result{}
(<sh-subshell>
 (<sh-exec> "cd" "tmp")
 (<sh-exec> "cat" "ephemera.txt"))
@end example

@subsubheading Function definitions

Function definitions look more like Lisp:

@example
dance() @{
  echo swing your hips now!
@}
@result{}
(<sh-defun> "dance"
  (<sh-exec> "echo" "swing" "your" "hips" "now!"))

;; Even this weird thing works!
dance_to_the_beat() @{
  sed 's/beat/swing hips/g'
@} < beats.txt
@result{}
(<sh-defun> "dance_to_the_beat"
  (<sh-with-redirects> ((< 0 "beats.txt"))
    (<sh-exec> "sed" (<sh-quote> "s/beat/swing hips/g"))))
@end example


@subsubheading Pipelines and asynchronous commands

Pipelines collect commands into a list:

@example
source | sink
@result{}
(<sh-pipeline> (<sh-exec> "source") (<sh-exec> "sink"))

cut -d ' ' -f 4 < ice-cream.txt \
  | grep chocolate \
  | sed 's/mint/peanut butter/g'
@result{}
(<sh-pipeline>
 (<sh-with-redirects> ((< 0 "ice-cream.txt"))
   (<sh-exec> "cut" "-d" (<sh-quote> " ") "-f" "4"))
 (<sh-exec> "grep" "chocolate")
 (<sh-exec> "sed" (<sh-quote> "s/mint/peanut butter/g")))
@end example

Asynchronous commands are simply regular commands wrapped with
@code{<sh-async>}:

@example
helpful-daemon &
@result{}
(<sh-async> (<sh-exec> "helpful-daemon"))

;; So-so dancing.
swing-hips ; flail-arms
@result{}
(<sh-begin>
 (<sh-exec> "swing-hips")
 (<sh-exec> "flail-arms"))

;; Great dancing.
swing-hips & flail-arms
@result{}
(<sh-begin>
 (<sh-async> (<sh-exec> "swing-hips"))
 (<sh-exec> "flail-arms"))
@end example


@subsubheading Boolean expressions

The logical ``and'' and ``or'' operators are binary, left-associative
and have equal precedence:

@example
foo && bar || baz
@result{}
(<sh-or> (<sh-and> (<sh-exec> "foo")
                   (<sh-exec> "bar"))
         (<sh-exec> "baz"))

foo || bar && baz
@result{}
(<sh-and> (<sh-or> (<sh-exec> "foo")
                   (<sh-exec> "bar"))
          (<sh-exec> "baz"))
@end example
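
The same grouping can be observed when such a list runs.  The
following is real shell code rather than the abbreviated notation
used above, and it should behave the same way in any POSIX-compatible
shell.  The variable names are illustrative.

@example
# Grouped as (false && echo and) || echo or.
first=$(false && echo and || echo or)
echo "$first"

# Grouped as (true || echo or) && echo and.  If && bound more tightly
# than ||, as it does in C, nothing would be printed here.
second=$(true || echo or && echo and)
echo "$second"
@end example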

The logical ``not'' operator wraps a command or pipeline:

@example
! false
@result{}
(<sh-not> (<sh-exec> "false"))

! cat flavors.txt | grep mint
@result{}
(<sh-not> (<sh-pipeline>
           (<sh-exec> "cat" "flavors.txt")
           (<sh-exec> "grep" "mint")))
@end example


@subsubheading Variables and command substitution

Variable operators have names inspired by Scheme:

@example
echo $@{FOO+bar@}
@result{}
(<sh-exec> "echo" (<sh-ref-and> "FOO" "bar"))

echo $@{FOO:+bar@}
@result{}
(<sh-exec> "echo" (<sh-ref-and*> "FOO" "bar"))

echo $@{FOO-bar@}
@result{}
(<sh-exec> "echo" (<sh-ref-or> "FOO" "bar"))

echo $@{FOO=bar@}
@result{}
(<sh-exec> "echo" (<sh-ref-or!> "FOO" "bar"))
@end example

Command substitution is a simple wrapper:

@example
echo $(cat message.txt)
@result{}
(<sh-exec> "echo" (<sh-cmd-sub> (<sh-exec> "cat" "message.txt")))
@end example

That's it for examples.  By this point, you should have a pretty good
sense of what Gash is going to give you.  If you need to know more,
you can always do your own experiments or read the formal definition
in the next section.


@node Internal representation definition
@subsection Internal representation definition

The internal representation of shell code in Gash has the following
form.  (If Gash ever returns something that does not have this form,
please file a bug!)

@example
sync  ::= exp
        | ('<sh-async> exp)

exp   ::= pipe
        | ('<sh-and> exp1 exp2)
        | ('<sh-or> exp1 exp2)
        | ('<sh-not> pipe)

pipe  ::= cmd*
        | ('<sh-pipeline> cmd* ...)

cmd*  ::= cmd
        | ('<sh-defun> name sync ...)

cmd   ::= ('<sh-exec> word ...)
        | ('<sh-exec-let> ((var var-word) ...) word ...)
        | ('<sh-with-redirects> (redir ...) [cmd])
        | ('<sh-set!> (var word) ...)
        | ('<sh-subshell> sync ...)
        | ('<sh-for> (name (word ...)) sync ...)
        | ('<sh-case> word ((pattern-word ...) sync ...) ...)
        | ('<sh-cond> (list sync ...) ... [('<sh-else> sync ...)])
        | ('<sh-while> list sync ...)
        | ('<sh-until> list sync ...)
        | ('<sh-begin> sync ...)

redir ::= ('> fdes word)
        | ('< fdes word)
        | ('>& fdes word)
        | ('<& fdes word)
        | ('>> fdes word)
        | ('<> fdes word)
        | ('>! fdes word)
        | ('<< fdes word)

word  ::= string
        | (word ...)
        | ('<sh-quote> word)
        | ('<sh-cmd-sub> sync ...)
        | ('<sh-ref> var)
        | ('<sh-ref-or> var [word])
        | ('<sh-ref-or*> var [word])
        | ('<sh-ref-or!> var [word])
        | ('<sh-ref-or!*> var [word])
        | ('<sh-ref-assert> var [word])
        | ('<sh-ref-assert*> var [word])
        | ('<sh-ref-and> var [word])
        | ('<sh-ref-and*> var [word])
        | ('<sh-ref-except-min> var [word])
        | ('<sh-ref-except-max> var [word])
        | ('<sh-ref-skip-min> var [word])
        | ('<sh-ref-skip-max> var [word])
        | ('<sh-ref-length> var)

var   ::= string
        | ("LINENO" integer)
@end example


@node Using shell-like constructs from Scheme
@section Using shell-like constructs from Scheme

Gash also has some features for writing Scheme code with shell-like
semantics.  Indeed, this is how Gash works internally.  It translates
shell code into calls to its shell interface.  The two modules that
provide this interface are @code{(gash shell)} and @code{(gash
environment)}.  The shell module provides control structures with
shell semantics.  For instance, you can use @code{sh:for} to update a
given shell variable and run a thunk for each element of a list of
strings.  The environment module allows you to manipulate shell
variables and functions, as well as a number of other things.

Truthfully, these interfaces are not ready for public consumption yet.
They are in need of some careful redesign to be useful outside of Gash
itself.  If this sounds interesting to you, please get in touch by
writing to @url{mailto:gash-devel@@nongnu.org,
gash-devel@@nongnu.org}.


@c *********************************************************************
@node GNU Free Documentation License
@appendix GNU Free Documentation License
@include fdl-1.3.texi

@bye

@c Local Variables:
@c ispell-local-dictionary: "american";
@c End: