Variable values depend on LC_CTYPE
\037\213\010 is not valid text when decoded as UTF-8, because the byte \213 (0x8B) is a continuation byte with no lead byte before it. If your locale is UTF-8, you can only use text that is a valid UTF-8 encoding of a Unicode string.
If you set LC_CTYPE to C (without encoding), you can only use ASCII characters.
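As a small illustration of the point above, the three bytes can be dumped in hex with POSIX tools only. This is a sketch; the variable name `bytes_hex` is just for this example.

```shell
# Dump the bytes produced by the octal escapes \037 \213 \010 in hex.
# od -An suppresses the offset column, -tx1 prints one byte per hex pair;
# tr removes the padding spaces and the trailing newline.
bytes_hex=$(printf '\037\213\010' | od -An -tx1 | tr -d ' \n')
echo "$bytes_hex"
```

The middle byte is 0x8B (binary 10001011). In UTF-8, a byte of the form 10xxxxxx may only follow a lead byte, and 0x1F is a plain ASCII control character, so the sequence cannot be decoded as UTF-8 text.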
You cannot change the encoding during the execution of a shell script. This is required by POSIX.
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_05_03
Changing the value of LC_CTYPE after the shell has started shall not affect the lexical processing of shell commands in the current shell execution environment or its subshells. Invoking a shell script or performing exec sh subjects the new shell to the changes in LC_CTYPE.
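One possible reading of the quoted paragraph is that a script can still run its byte-sensitive part in a newly started shell, where the changed locale does apply. The following is a minimal sketch of that idea, assuming `sh` on the PATH is the shell in question and that setting `LC_ALL=C` for a child process is acceptable. The variable names `result` and `magic` are made up for this example.

```shell
# Start a fresh shell with the C locale and do the byte handling there.
# LC_ALL overrides LC_CTYPE and LANG, and it is set for the child process only.
result=$(LC_ALL=C sh -c '
    # Store three raw bytes in a variable inside the child shell.
    magic=$(printf "\037\213\010")
    # Print the variable back out and dump it in hex.
    printf "%s" "$magic" | od -An -tx1 | tr -d " \n"
')
echo "$result"
```

If the variable kept all three bytes, `result` is the hex string of those bytes. This does not help code that is sourced into an already running shell, since that shell's locale handling was fixed at startup.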
It seems that variables cannot hold arbitrary byte values unless LC_CTYPE=C, and that a script has no way to change this for itself.
Consider this script test.sh:
When executing it in dash, bash or busybox ash, it will not print anything. But with yash it prints:
It seems that variables can only hold 7-bit ASCII values.
Now, if I set LC_CTYPE=C when spawning yash, it does pass:
But if I set LC_CTYPE=C within the spawned shell, it still fails:
This effectively means that it is impossible to write or execute portable shell scripts for yash that use 8-bit bytes in variables. A portable script has no way to control what locale the user has set in their yashrc.
This happens on Alpine Linux, which uses musl libc, and I believe the default locale in musl is C.utf8.
This was discovered when debugging yash for Alpine Linux's tiny-cloud script: https://gitlab.alpinelinux.org/alpine/cloud/tiny-cloud/-/blob/fda9a350a1dfb4a33e9e4bf9e5272d5b4f74f541/lib/tiny-cloud/init-main#L69