diff --git a/concepts/README.md b/concepts/README.md index 6b9c92e4..5c92c4bd 100644 --- a/concepts/README.md +++ b/concepts/README.md @@ -2,9 +2,24 @@ The [plan](http://forum.exercism.org/t/bash-syllabus-planning/11952) -The suggested concept flow: - -[![bash syllabus concept flowchart](https://glennj.github.io/img/bash.syllabus.flow.png)](http://forum.exercism.org/t/bash-syllabus-flow/15038) +## Concept Flow: + +```mermaid +erDiagram +"Commands and Arguments" ||--|| Variables : "" +Variables ||--|| "The Importance of Quoting" : "" +"The Importance of Quoting" ||--|| Conditionals : "" +"The Importance of Quoting" ||--|| Arrays : "" +"The Importance of Quoting" ||--|| "Pipelines and Command Lists" : "" +Conditionals ||--|| Arithmetic : "" +Conditionals ||--|| Looping : "" +Arrays ||--|| "More About Arrays" : "" +"Pipelines and Command Lists" ||--|| Functions : "" +Functions ||--|| Redirection : "" +Redirection ||..|| "Command Substitution" : TODO +Redirection ||--|| "Here Documents" : "" +"Command Substitution" ||..|| "Process Substitution" : TODO +``` 1. Basic syntax: commands and arguments @@ -95,23 +110,26 @@ The suggested concept flow: ``` - sublist syntax `${ary[@]:offset:length}` -11. I/O +11. Redirection - file descriptors, stdin, stdout, stderr - redirection + +12. Here Documents - here-docs and here-strings + +## More Concepts to Add + +- I/O continued - command substitution - capturing stdout and stderr - capturing stdout and stderr **into separate variables** - - `exec` and redirections - process substitutions -## More Concepts - - brace expansions and how it's different from patterns `/path/to/{foo,bar,baz}.txt` -x. option parsing with getopts +- option parsing with getopts -x. 
`set` command and "strict mode" +- `set` command and "strict mode" - pros and cons of - `set -e` diff --git a/concepts/heredocs/.meta/config.json b/concepts/heredocs/.meta/config.json new file mode 100644 index 00000000..af23605b --- /dev/null +++ b/concepts/heredocs/.meta/config.json @@ -0,0 +1,10 @@ +{ + "authors": [ + "glennj" + ], + "contributors": [ + "IsaacG", + "kotp" + ], + "blurb": "Here Documents redirect an inline document to the standard input of a command." +} diff --git a/concepts/heredocs/about.md b/concepts/heredocs/about.md new file mode 100644 index 00000000..b64141bb --- /dev/null +++ b/concepts/heredocs/about.md @@ -0,0 +1,313 @@ +# About Here Documents + +In Bash scripting, a "here document" (or "heredoc") redirects multiple lines of input to a command or program, as if you were typing them directly into the terminal. +It's a powerful tool for embedding multi-line text within your scripts without needing external files or complex string manipulation. + +## Key Features and Syntax + +1. Delimiter: a heredoc starts with the `<<` operator followed by a delimiter word (often called the "marker" or "terminator"). + This delimiter can be any word you choose, but it's common to use something like `EOF`, `END`, or `TEXT` for clarity. + For more readable code, you can use something descriptive as the delimiter, for example `END_INSTALLATION_INSTRUCTIONS`. + +1. Content: after the initial `<< DELIMITER`, you write the content you want to redirect. + This can be multiple lines of text, code, or anything else. + +1. Termination: the heredoc ends when the delimiter word appears again on a line by itself, with no leading or trailing whitespace. + +## Basic Syntax + +```bash +command << DELIMITER + Content line 1 + Content line 2 + ... + Content line N +DELIMITER +``` + +## How it Works + +* Bash reads all the lines between the starting `<< DELIMITER` and the ending `DELIMITER`. +* Bash connects this content to the command's standard input. 
+* The command processes this input as if it were coming from the keyboard. + +### Example 1: Simple Text Output + +```bash +cat << EOF +This is the first line. +This is the second line. +This is the third line. +EOF +``` + +Output: + +```plaintext +This is the first line. +This is the second line. +This is the third line. +``` + +In this example: + +* `cat` is the command. +* `<< EOF` starts the heredoc with `EOF` as the delimiter. +* The three lines of text are the content. +* `EOF` on its own line ends the heredoc. +* `cat` then outputs the content it received. + +### Example 2: Using with wc (Word Count) + +```bash +wc -l << END +Line 1 +Line 2 +Line 3 +END +``` + +Output: + +```plaintext +3 +``` + +Here, `wc -l` counts the number of lines. +The heredoc provides the three lines as input. + +### Example 3: Passing data to a script + +The script: + +```bash +#!/usr/bin/env bash + +# Script to process input +while IFS= read -r line; do + echo "Processing: $line" +done +``` + +Call the script from an interactive bash prompt with a heredoc: + +```bash +./your_script << MY_DATA +Item 1 +Item 2 +Item 3 +MY_DATA +``` + +Output: + +```plaintext +Processing: Item 1 +Processing: Item 2 +Processing: Item 3 +``` + +## Variations and Advanced Features + +### Literal Content + +Bash performs variable expansion, command substitution, and arithmetic expansion within a heredoc. +In this sense, heredocs act like double quoted strings. + +```bash +cat << EOF +The value of HOME is $HOME +The current date is $(date) +Two plus two is $((2 + 2)) +EOF +``` + +Output: + +```plaintext +The value of HOME is /home/glennj +The current date is Thu Apr 24 13:47:32 EDT 2025 +Two plus two is 4 +``` + +When the delimiter is quoted (using single or double quotes), these expansions are prevented. +The content is taken literally. +This is like single quoted strings. + +```bash +cat << 'EOF' +The value of $HOME is not expanded here. +The result of $(date) is not executed.
+Two plus two is calculated by $((2 + 2)) +EOF +``` + +Output: + +```plaintext +The value of $HOME is not expanded here. +The result of $(date) is not executed. +Two plus two is calculated by $((2 + 2)) +``` + +### Stripping Leading Tabs + +If you use `<<-` (with a trailing hyphen) instead of `<<`, Bash will strip any leading _tab characters_ from each line of the heredoc. +This is useful for indenting the heredoc content within your script without affecting the output. + +```bash +# Note, the leading whitespace is tab characters only, not spaces! +# The ending delimiter can have leading tabs as well. +cat <<- END + This line has 1 leading tab. + This line has a leading tab and some spaces. + This line has 2 leading tabs. + END +``` + +The output is printed with all the leading tabs removed: + +```plaintext +This line has 1 leading tab. + This line has a leading tab and some spaces. +This line has 2 leading tabs. +``` + +~~~~exercism/caution +The author doesn't recommend this usage. +While it can improve the readability of the script, + +1. it's easy to accidentally replace the tab characters with spaces (your editor may do this automatically), and +1. it's hard to spot the difference between spaces and tabs. +~~~~ + +## When to Use Here Documents + +* Multi-line input: when you need to provide multiple lines of text to a command. +* Configuration files: embedding small configuration snippets within a script. +* Generating code: creating code on the fly within a script. +* Scripting interactions: simulating user input for interactive programs. +* Avoiding external files: when you want to avoid creating temporary files. + +A typical usage might be to provide some help text: + +```bash +#!/usr/bin/env bash + +usage() { + cat << END_USAGE +Refresh database tables. + +usage: ${0##*/} [-h|--help] [-A|--no-archive] + +where: --no-archive flag will _skip_ the archive jobs +END_USAGE +} + +# ... parsing command line options here ...
+ +if [[ $flag_help == "true" ]]; then + usage + exit 0 +fi +``` + +## Possible Drawbacks + +* Large embedded documents can make your code harder to read. + It can be better to deploy your script with documentation in separate files. +* Here documents can break the flow of the code. + You might be in a deeply nested section of code when you want to pass some text to a program. + The heredoc's indentation can look jarring compared to the surrounding code. + +## Here Strings + +Like here documents, _here strings_ (or "herestrings") provide input to a command. +However, while heredocs are given as a block of text, herestrings are given as a single string of text. +Here strings use the `<<< "text"` syntax. + +```bash +tr 'a-z' 'A-Z' <<< "upper case this string" +``` + +Output: + +```plaintext +UPPER CASE THIS STRING +``` + +Unlike heredocs, no ending delimiter is required. + +### Why Use Here Strings? + +A pipeline can be used instead of a here string: + +```bash +echo "upper case this string" | tr 'a-z' 'A-Z' +``` + +So why use a here string? + +Consider the case where you get the string as output from a long-running computation, and you want to feed the result to two separate commands. +Using pipelines, you have to execute the computation twice: + +```bash +some_long_running_calculation | first_command +some_long_running_calculation | second_command +``` + +A more efficient approach is to capture the output of the computation (using command substitution), and use here strings to provide input to the two subsequent commands: + +```bash +result=$( some_long_running_calculation ) +first_command <<< "$result" +second_command <<< "$result" +``` + +Here's a real-world application of that example: + +* capture the JSON response to a REST API query (that is paginated), +* provide the JSON data to a jq program to parse the results and output that to a file, and then +* provide the JSON data to another jq program to determine the URL of the next query.
+ +```bash +# initialize the output CSV file +echo "ID,VALUE" > data.csv + +url='https://example.com/api/query?page=1' + +while true; do + json=$( curl "$url" ) + + # convert the results part of the response into CSV + jq -r '.results[] | [.id, .value] | @csv' <<< "$json" + + # get the URL for the next page + url=$( jq -r '.next_url // ""' <<< "$json" ) + if [[ "$url" == "" ]]; then + break + fi +done >> data.csv +``` + +Note the position of the output redirection. +All output from the while loop will be appended to the file `data.csv`. + +## Heredocs and Herestrings as Redirection + +Because these are just forms of redirection, they can be combined with other redirection operations: + +```bash +cat << END_OF_TEXT > output.txt +This is my important text. +END_OF_TEXT + +awk '...' <<< "$my_var" >> result.csv +``` + +## In Summary + +Here documents (or "heredocs") are a flexible and convenient way to manage multi-line input in Bash scripts. +They simplify the process of embedding text and data directly within your scripts, making them more self-contained and easier to read. + +Here strings (or "herestrings") are like here documents, but offer a simpler, more dynamic syntax. diff --git a/concepts/heredocs/introduction.md b/concepts/heredocs/introduction.md new file mode 100644 index 00000000..5a2446a8 --- /dev/null +++ b/concepts/heredocs/introduction.md @@ -0,0 +1,313 @@ +# Introduction to Here Documents + +In Bash scripting, a "here document" (or "heredoc") redirects multiple lines of input to a command or program, as if you were typing them directly into the terminal. +It's a powerful tool for embedding multi-line text within your scripts without needing external files or complex string manipulation. + +## Key Features and Syntax + +1. Delimiter: a heredoc starts with the `<<` operator followed by a delimiter word (often called the "marker" or "terminator"). 
+ This delimiter can be any word you choose, but it's common to use something like `EOF`, `END`, or `TEXT` for clarity. + For more readable code, you can use something descriptive as the delimiter, for example `END_INSTALLATION_INSTRUCTIONS`. + +1. Content: after the initial `<< DELIMITER`, you write the content you want to redirect. + This can be multiple lines of text, code, or anything else. + +1. Termination: the heredoc ends when the delimiter word appears again on a line by itself, with no leading or trailing whitespace. + +## Basic Syntax + +```bash +command << DELIMITER + Content line 1 + Content line 2 + ... + Content line N +DELIMITER +``` + +## How it Works + +* Bash reads all the lines between the starting `<< DELIMITER` and the ending `DELIMITER`. +* Bash connects this content to the command's standard input. +* The command processes this input as if it were coming from the keyboard. + +### Example 1: Simple Text Output + +```bash +cat << EOF +This is the first line. +This is the second line. +This is the third line. +EOF +``` + +Output: + +```plaintext +This is the first line. +This is the second line. +This is the third line. +``` + +In this example: + +* `cat` is the command. +* `<< EOF` starts the heredoc with `EOF` as the delimiter. +* The three lines of text are the content. +* `EOF` on its own line ends the heredoc. +* `cat` then outputs the content it received. + +### Example 2: Using with wc (Word Count) + +```bash +wc -l << END +Line 1 +Line 2 +Line 3 +END +``` + +Output: + +```plaintext +3 +``` + +Here, `wc -l` counts the number of lines. +The heredoc provides the three lines as input. 
+ +### Example 3: Passing data to a script + +The script: + +```bash +#!/usr/bin/env bash + +# Script to process input +while IFS= read -r line; do + echo "Processing: $line" +done +``` + +Call the script from an interactive bash prompt with a heredoc: + +```bash +./your_script << MY_DATA +Item 1 +Item 2 +Item 3 +MY_DATA +``` + +Output: + +```plaintext +Processing: Item 1 +Processing: Item 2 +Processing: Item 3 +``` + +## Variations and Advanced Features + +### Literal Content + +Bash performs variable expansion, command substitution, and arithmetic expansion within a heredoc. +In this sense, heredocs act like double quoted strings. + +```bash +cat << EOF +The value of HOME is $HOME +The current date is $(date) +Two plus two is $((2 + 2)) +EOF +``` + +Output: + +```plaintext +The value of HOME is /home/glennj +The current date is Thu Apr 24 13:47:32 EDT 2025 +Two plus two is 4 +``` + +When the delimiter is quoted (using single or double quotes), these expansions are prevented. +The content is taken literally. +This is like single quoted strings. + +```bash +cat << 'EOF' +The value of $HOME is not expanded here. +The result of $(date) is not executed. +Two plus two is calculated by $((2 + 2)) +EOF +``` + +Output: + +```plaintext +The value of $HOME is not expanded here. +The result of $(date) is not executed. +Two plus two is calculated by $((2 + 2)) +``` + +### Stripping Leading Tabs + +If you use `<<-` (with a trailing hyphen) instead of `<<`, Bash will strip any leading _tab characters_ from each line of the heredoc. +This is useful for indenting the heredoc content within your script without affecting the output. + +```bash +# Note, the leading whitespace is tab characters only, not spaces! +# The ending delimiter can have leading tabs as well. +cat <<- END + This line has 1 leading tab. + This line has a leading tab and some spaces. + This line has 2 leading tabs. 
+ END +``` + +The output is printed with all the leading tabs removed: + +```plaintext +This line has 1 leading tab. + This line has a leading tab and some spaces. +This line has 2 leading tabs. +``` + +~~~~exercism/caution +The author doesn't recommend this usage. +While it can improve the readability of the script, + +1. it's easy to accidentally replace the tab characters with spaces (your editor may do this automatically), and +1. it's hard to spot the difference between spaces and tabs. +~~~~ + +## When to Use Here Documents + +* Multi-line input: when you need to provide multiple lines of text to a command. +* Configuration files: embedding small configuration snippets within a script. +* Generating code: creating code on the fly within a script. +* Scripting interactions: simulating user input for interactive programs. +* Avoiding external files: when you want to avoid creating temporary files. + +A typical usage might be to provide some help text: + +```bash +#!/usr/bin/env bash + +usage() { + cat << END_USAGE +Refresh database tables. + +usage: ${0##*/} [-h|--help] [-A|--no-archive] + +where: --no-archive flag will _skip_ the archive jobs +END_USAGE +} + +# ... parsing command line options here ... + +if [[ $flag_help == "true" ]]; then + usage + exit 0 +fi +``` + +## Possible Drawbacks + +* Large embedded documents can make your code harder to read. + It can be better to deploy your script with documentation in separate files. +* Here documents can break the flow of the code. + You might be in a deeply nested section of code when you want to pass some text to a program. + The heredoc's indentation can look jarring compared to the surrounding code. + +## Here Strings + +Like here documents, _here strings_ (or "herestrings") provide input to a command. +However, while heredocs are given as a block of text, herestrings are given as a single string of text. +Here strings use the `<<< "text"` syntax. 
+ +```bash +tr 'a-z' 'A-Z' <<< "upper case this string" +``` + +Output: + +```plaintext +UPPER CASE THIS STRING +``` + +Unlike heredocs, no ending delimiter is required. + +### Why Use Here Strings? + +A pipeline can be used instead of a here string: + +```bash +echo "upper case this string" | tr 'a-z' 'A-Z' +``` + +So why use a here string? + +Consider the case where you get the string as output from a long-running computation, and you want to feed the result to two separate commands. +Using pipelines, you have to execute the computation twice: + +```bash +some_long_running_calculation | first_command +some_long_running_calculation | second_command +``` + +A more efficient approach is to capture the output of the computation (using command substitution), and use here strings to provide input to the two subsequent commands: + +```bash +result=$( some_long_running_calculation ) +first_command <<< "$result" +second_command <<< "$result" +``` + +Here's a real-world application of that example: + +* capture the JSON response to a REST API query (that is paginated), +* provide the JSON data to a jq program to parse the results and output that to a file, and then +* provide the JSON data to another jq program to determine the URL of the next query. + +```bash +# initialize the output CSV file +echo "ID,VALUE" > data.csv + +url='https://example.com/api/query?page=1' + +while true; do + json=$( curl "$url" ) + + # convert the results part of the response into CSV + jq -r '.results[] | [.id, .value] | @csv' <<< "$json" + + # get the URL for the next page + url=$( jq -r '.next_url // ""' <<< "$json" ) + if [[ "$url" == "" ]]; then + break + fi +done >> data.csv +``` + +Note the position of the output redirection. +All output from the while loop will be appended to the file `data.csv`. 
+ +## Heredocs and Herestrings as Redirection + +Because these are just forms of redirection, they can be combined with other redirection operations: + +```bash +cat << END_OF_TEXT > output.txt +This is my important text. +END_OF_TEXT + +awk '...' <<< "$my_var" >> result.csv +``` + +## In Summary + +Here documents (or "heredocs") are a flexible and convenient way to manage multi-line input in Bash scripts. +They simplify the process of embedding text and data directly within your scripts, making them more self-contained and easier to read. + +Here strings (or "herestrings") are like here documents, but offer a simpler, more dynamic syntax. diff --git a/concepts/heredocs/links.json b/concepts/heredocs/links.json new file mode 100644 index 00000000..2a4d4742 --- /dev/null +++ b/concepts/heredocs/links.json @@ -0,0 +1,14 @@ +[ + { + "url": "https://www.gnu.org/software/bash/manual/bash.html#Here-Documents", + "description": "Here Documents in the manual" + }, + { + "url": "https://www.gnu.org/software/bash/manual/bash.html#Here-Strings", + "description": "Here Strings in the manual" + }, + { + "url": "https://mywiki.wooledge.org/BashGuide/InputAndOutput#Heredocs_And_Herestrings", + "description": "Heredocs in the Bash Guide" + } +] diff --git a/config.json b/config.json index 6887ea8e..a8a904c7 100644 --- a/config.json +++ b/config.json @@ -1294,6 +1294,11 @@ "uuid": "fb7effaa-9642-4365-8766-6b2876067302", "slug": "redirection", "name": "Redirection" + }, + { + "uuid": "701334b1-a9ad-4774-8c05-f681b2319b53", + "slug": "heredocs", + "name": "Here Documents" } ], "key_features": [ diff --git a/docs/SYLLABUS.md b/docs/SYLLABUS.md index 570c765c..3474ec1c 100644 --- a/docs/SYLLABUS.md +++ b/docs/SYLLABUS.md @@ -16,6 +16,7 @@ While the learning exercises are still incomplete, most of the concept documenta * [Pipelines and Command Lists](https://exercism.org/tracks/bash/concepts/pipelines) * [Functions](https://exercism.org/tracks/bash/concepts/functions) * 
[Redirection](https://exercism.org/tracks/bash/concepts/redirection) + * [Here Documents](https://exercism.org/tracks/bash/concepts/heredocs) There will be more. Check the "**What's going on with Bash**" section on the [Bash track page](https://exercism.org/tracks/bash) periodically.