Comments (4)
Fish uses our widecharwidth script, which parses the Unicode data files to come up with a width representation.
Notably this includes "EastAsianWidth.txt", which lists:
2600..2604;N # So [5] BLACK SUN WITH RAYS..COMET
That means anything from U+2600 to U+2604 is classified as "N", which stands for "Neutral", which means they are "narrow" and occupy one cell. (tbh I don't really understand why "neutral" even exists, and TR11 is unhelpful here)
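For illustration, here is a minimal Python sketch of how a line like that can be read. It is not the widecharwidth script itself; the function name parse_eaw_line is made up for this example, and the regex only assumes the "first..last;class # comment" layout shown above (plus single-codepoint lines without a "..last" part).

```python
import re

# Matches data lines such as "2600..2604;N # So [5] BLACK SUN WITH RAYS..COMET".
# The "..last" part is optional so single-codepoint lines also match.
_LINE = re.compile(r"^([0-9A-Fa-f]+)(?:\.\.([0-9A-Fa-f]+))?\s*;\s*(\w+)")


def parse_eaw_line(line):
    """Return (first, last, width_class), or None for comments and blank lines.

    Hypothetical helper for illustration, not part of fish or widecharwidth.
    """
    m = _LINE.match(line.strip())
    if m is None:
        return None
    first = int(m.group(1), 16)
    last = int(m.group(2), 16) if m.group(2) else first
    return first, last, m.group(3)


first, last, width_class = parse_eaw_line(
    "2600..2604;N # So [5] BLACK SUN WITH RAYS..COMET"
)
print(f"U+{first:04X}..U+{last:04X} -> {width_class}")  # U+2600..U+2604 -> N
print(first <= 0x2601 <= last)  # True: U+2601 falls inside the "N" range
```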
It is also listed as "Emoji" in emoji-data.txt, but not as "Emoji_Presentation". That means you can use the U+FE0F emoji variation selector to change it from text presentation (narrow because EastAsianWidth.txt applies) to emoji presentation (wide).
Fish will treat this combination as being two cells wide. This is also what I get when I copy your ☁️ here from the browser into my terminal - not just U+2601, but U+2601 followed by U+FE0F. This is also what I get from starship's default config (but again via a browser, it's always possible the U+FE0F was inserted there).
You can see what your fish thinks the width is with string length -V:
string length -V ☁️ \U2601 \U2601\UFE0F
This should print "2", "1" and "2" - if that cloud glyph is U+2601 U+FE0F. If it prints "1" "1" "2" the glyph is just U+2601.
This seems correct to me from what I know of Unicode and TR11 specifically. It's not a great document, and it's not meant for terminals specifically, but it is the best we have and fish's interpretation seems reasonable to me.
Fish's width handling isn't perfect either (notably it does not handle full grapheme clusters but only codepoint-by-codepoint with some hacks like for variation selectors), but the easy and medium difficulty cases usually work, and I would call this a medium case.
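To show what "codepoint-by-codepoint with a hack for variation selectors" can look like, here is a rough Python sketch. This is not fish's implementation: the names string_width and codepoint_width are invented, the base widths come from Python's unicodedata module rather than from generated tables, and TEXT_DEFAULT_EMOJI is a one-entry stand-in for the "Emoji" property in emoji-data.txt.

```python
import unicodedata

VS16 = "\uFE0F"  # emoji presentation selector

# Assumption: a hand-written stand-in for "has Emoji but not Emoji_Presentation".
# A real implementation would generate this set from emoji-data.txt.
TEXT_DEFAULT_EMOJI = {"\u2601"}


def codepoint_width(ch):
    """Width of one codepoint on its own, ignoring its neighbours."""
    # Nonspacing/enclosing marks and format characters take no cell.
    if unicodedata.category(ch) in ("Mn", "Me", "Cf"):
        return 0
    # East Asian Wide and Fullwidth take two cells, everything else one.
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2
    return 1


def string_width(s):
    """Sum widths codepoint by codepoint, with one lookahead for VS16."""
    total = 0
    i = 0
    while i < len(s):
        ch = s[i]
        # The "hack": base + VS16 is an emoji presentation sequence, so wide.
        if ch in TEXT_DEFAULT_EMOJI and i + 1 < len(s) and s[i + 1] == VS16:
            total += 2
            i += 2
            continue
        total += codepoint_width(ch)
        i += 1
    return total


print(string_width("\u2601"))        # 1: U+2601 alone is Neutral, so narrow
print(string_width("\u2601\uFE0F"))  # 2: U+2601 U+FE0F is wide
```

The single lookahead is what keeps this from being real grapheme cluster handling: anything longer than base plus selector is still summed piece by piece.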
Note: The names of $fish_emoji_width and $fish_ambiguous_width are perhaps too simple. They affect very specific things:
- fish_ambiguous_width affects all codepoints EastAsianWidth.txt classifies as "Ambiguous" width
- fish_emoji_width is about a change that happened in Unicode 9: It declared that all codepoints with "Emoji_Presentation" should default to wide. $fish_emoji_width defines whether fish should honor that (because the terminal does). It therefore only affects codepoints that have Emoji_Presentation by default (listed in emoji-data.txt) and were introduced before Unicode 9, because a program that doesn't support Unicode 9 won't support anything newer anyway. It is not of much use anymore because Unicode 9 is pretty old by now.
Neither applies to U+2601 because it is classified as neutral and not Emoji_Presentation.
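To make the scope of the two variables concrete, here is a small Python sketch of the rules as described above. It is not fish's code: the function name configured_width and the pre_unicode9 flag are hypothetical, and the properties are passed in by hand instead of being looked up in the data files.

```python
def configured_width(eaw_class, emoji_presentation, pre_unicode9,
                     ambiguous_width, emoji_width):
    """Hypothetical helper: apply the two settings to one codepoint's properties.

    eaw_class          -- class from EastAsianWidth.txt ("N", "A", "W", "F", ...)
    emoji_presentation -- True if the codepoint has Emoji_Presentation
    pre_unicode9       -- True if the codepoint was introduced before Unicode 9
    """
    # fish_ambiguous_width only touches the "Ambiguous" class.
    if eaw_class == "A":
        return ambiguous_width
    # fish_emoji_width only touches pre-Unicode-9 Emoji_Presentation codepoints.
    if emoji_presentation and pre_unicode9:
        return emoji_width
    return 2 if eaw_class in ("W", "F") else 1


# U+2601 is Neutral and not Emoji_Presentation, so no combination of the
# two settings changes its width.
for amb in (1, 2):
    for emo in (1, 2):
        assert configured_width("N", False, True, amb, emo) == 1

print(configured_width("A", False, True, ambiguous_width=2, emoji_width=1))  # 2
print(configured_width("W", True, True, ambiguous_width=1, emoji_width=1))   # 1
```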
So, in summary:
Terminal.app and iTerm obviously misrender the text. They show ☁️ (which I assume is U+2601 with the emoji selector) as being two cells wide but then don't use the second cell.
Warp apparently ignores the emoji selector and draws U+2601 with the emoji selector as occupying one cell. I believe that is wrong and would therefore call this a bug in Warp.
Note that the zsh comparison doesn't help much because zsh doesn't care about the width at all here, it just prints the prompt without repositioning. The issue comes up because fish repositions the cursor to do syntax highlighting, suggestions, right prompt handling, etc.
Our guidance: Please draw U+2601 alone into one cell, and U+2601 U+FE0F into two cells.
from fish-shell.
So, since I just tested this in 6 more terminals and 5 failed the same test (with various failure modes), here are some quotes from Unicode TRs to support our reading:
From TR51:
default emoji presentation character — A character that, by default, should appear with an emoji presentation, rather than a text presentation.
[...]
These characters have the Emoji_Presentation property.
[...]
default text presentation character — A character that, by default, should appear with a text presentation, rather than an emoji presentation.
[...]
These characters do not have the Emoji_Presentation property; that is, their Emoji_Presentation property value is No.
So a character either has default emoji or text presentation. U+2601 does not have Emoji_Presentation so it defaults to text presentation.
emoji presentation selector — The character U+FE0F VARIATION SELECTOR-16 (VS16), used to request an emoji presentation for an emoji character.
So U+FE0F can be used to get emoji presentation for something that otherwise has text presentation.
emoji presentation sequence — A variation sequence consisting of an emoji character followed by a emoji presentation selector.
So U+2601 U+FE0F is an emoji presentation sequence (it's also listed in the corresponding file).
From TR11:
East Asian Wide (W): [...] This category includes [...] characters that have the [UTS51] property Emoji_Presentation, with the exception of characters that have the [UCD] property Regional_Indicator
So anything that has Emoji_Presentation is "wide".
emoji presentation sequences behave as though they were East Asian Wide, regardless of their assigned East_Asian_Width property value.
So U+2601 U+FE0F is wide.
Neutral (Not East Asian): All other characters. Neutral characters do not occur in legacy East Asian character sets.
So, since U+2601 is text presentation by default, the emoji stuff doesn't apply, so the "Neutral" from EastAsianWidth.txt applies.
I think that should answer your question?
from fish-shell.
Updates:
- Got a better handle on how variation selectors work 👍
- unicode-width, the crate we use for helping calculate widths for Unicode characters, very recently added emoji presentation support (literally 3 days ago - unicode-rs/unicode-width#41 haha) 🔥.
- Not in a release cut yet but I confirmed this fixes the width calculation for "\u{2601}\u{FE0F}" (previously 1, now gives correct result of 2).
- Figured out a path forward on the Warp-side with how we should handle the full-width character followed by zero-width character to ultimately lead to a double-width character (on the rendering + spacing calculations side). We've got a rough prototype working here - I'm gonna continue working on this, which should actually help us better support Unicode emojis more broadly in Warp (outside of the fish use case too) 🙌
- There's both the rendering + width calculations piece of this for us to tackle correctly, as you mentioned.
Really appreciate your help on this @faho and I might have some follow-ups!
Also, curious: which of the 6 terminals you tested above handled this correctly haha? Hopefully Warp will be another one to add soon 😄 !!
EDIT: for anyone wondering, I believe the terminal that handles this correctly is Kitty! But Warp is coming soon!
from fish-shell.
Awesome - thank you @faho, and thanks for digging into this! Just wanted to ack these comments in the interim - I'm working through reading up on this world and understanding this more deeply 😅.
I'll likely have some follow-up questions, but this is super helpful! 🙌
from fish-shell.