[css-text] Prevent line breaking after explicit hyphens #3434

Open
hftf opened this issue Dec 12, 2018 · 21 comments
Labels
css-text-4, i18n-tracker, Tracked in DoC

Comments

@hftf

hftf commented Dec 12, 2018

I want to prevent hyphenation/line breaking of words containing hyphens (hyphenated compounds). Some common examples of hyphenated compounds in English are T-shirt, long-term, and so-called.

Why?

This behavior is important for documents that prioritize legibility over aesthetics, such as:

  • highly technical documents¹
  • documents that will primarily be read out loud
  • pedagogical documents

These documents may want to apply the behavior to the entire document, not on a case-by-case basis.

¹ This very spec once faced a similar issue in which unwanted hyphenation led to confusion: #2307

Current status

CSS Text Module Level 3 does not define what hyphenation opportunities are:

Hyphenation occurs when the line breaks at a valid hyphenation opportunity…. CSS Text Level 3 does not define the exact rules for hyphenation….

Under hyphens, it says:

none
Words are not hyphenated, even if characters inside the word explicitly define hyphenation opportunities.

However, hyphens: none does not give the expected result in most browsers.
For example, see how Chrome 70/Mac renders this JSFiddle:

I am concerned that the current spec gives authors very little control over a simple display requirement. For example, it never mentions the behavior of U+002D HYPHEN-MINUS, the most widespread hyphen character by far, even once.

Some workarounds

Wrap all hyphenated compounds in <span style="white-space:nowrap;"> or <nobr> (nonstandard).

Cons:

  • Adds lots of unnecessary, presentational markup
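A minimal sketch of this workaround as a preprocessing step, assuming the text is already HTML-escaped and that compounds consist of ASCII letters and digits joined by U+002D. The name `wrapCompounds` and the pattern are illustrative assumptions, not part of any spec or library.

```typescript
// Runs of ASCII letters/digits joined by one or more U+002D hyphens,
// e.g. "T-shirt" or "cul-de-sac". ASCII-only is an assumption made for brevity.
const COMPOUND = /[A-Za-z0-9]+(?:-[A-Za-z0-9]+)+/g;

// Hypothetical helper: wraps each hyphenated compound in a nowrap span.
// Expects plain text that has already been HTML-escaped.
function wrapCompounds(escapedText: string): string {
  return escapedText.replace(
    COMPOUND,
    (compound: string) => `<span style="white-space:nowrap">${compound}</span>`
  );
}

console.log(wrapCompounds("A long-term plan"));
// A <span style="white-space:nowrap">long-term</span> plan
```

The output shows the cost named above: every compound gains a purely presentational wrapper element.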

Surround all hyphens with U+2060 WORD JOINER.

Cons:

  • Affects searching and copying

Replace all hyphens with U+2011 NON-BREAKING HYPHEN.

Cons:

  • Affects searching and copying
  • Affects display; may look very ugly depending on the font (or fallback font)
    • Few fonts have a glyph for U+2011 (379 out of 3426 on my system do).
      Even if they do, U+2011 may not look identical to U+002D or U+2010.

The Unicode Line Breaking Algorithm (UAX 14) recommends this method, but it seems rare in practice:

2011 NON-BREAKING HYPHEN
This is the preferred character to use where words need to be hyphenated but may not be broken at the hyphen. Because of its use as a substitute for ordinary hyphen, the appearance of this character should match that of U+2010 HYPHEN.
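The two character-level workarounds above can be sketched as one preprocessing function, together with a folding step that a search feature could apply to undo them. This is only a sketch: `protectHyphens`, `foldForSearch`, and the ASCII-only pattern are assumptions made for illustration.

```typescript
type Mode = "word-joiner" | "non-breaking-hyphen";

const WORD_JOINER = "\u2060"; // U+2060 WORD JOINER
const NB_HYPHEN = "\u2011";   // U+2011 NON-BREAKING HYPHEN

// U+002D between two ASCII letters or digits (ASCII-only is an assumption).
const INNER_HYPHEN = /([A-Za-z0-9])-(?=[A-Za-z0-9])/g;

// Hypothetical helper: applies one of the two substitutions to every
// hyphen that sits inside a word.
function protectHyphens(text: string, mode: Mode): string {
  const replacement =
    mode === "word-joiner" ? `$1${WORD_JOINER}-${WORD_JOINER}` : `$1${NB_HYPHEN}`;
  return text.replace(INNER_HYPHEN, replacement);
}

// Folds U+2010/U+2011 back to U+002D and drops word joiners, so that a
// plain-text search for "so-called" still matches the protected text.
function foldForSearch(text: string): string {
  return text.replace(/[\u2010\u2011]/g, "-").replace(/\u2060/g, "");
}

const shown = protectHyphens("so-called", "non-breaking-hyphen");
console.log(shown === "so-called");                // false
console.log(foldForSearch(shown) === "so-called"); // true
```

The first comparison illustrates the "affects searching and copying" drawback: the stored text no longer equals what a user would type.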

Possible solutions

  1. Define “hyphenation opportunities”
  2. Re(de)fine hyphens: none
  3. Add a brand new value, such as hyphens: none-including-hyphens

Questions

  1. Does this affect CSS Text Module Level 3 or Level 4, or neither?
  2. Will it impact the Last Call?

I can’t find any previous discussion on this topic; sorry if it’s a duplicate.

Thank you for considering this feedback – it’s my first time engaging with the standards process.

Further reading

@frivoal
Collaborator

frivoal commented Dec 12, 2018

I believe the intent of the spec with hyphens: none is to do what you expect, and that the fact that browsers don't suppress the wrapping opportunity around U+002D is a bug.

At the same time, I can see how the definition about hyphens: none only talks about suppressing hyphenation opportunities, which isn't defined, and could be interpreted as only suppressing things that inject a hyphen, not suppressing wrapping opportunities around hyphens that are already there.

I think we should clarify that definition to make it more clear (possibly by defining hyphenation opportunity, possibly by rephrasing the whole thing).

@frivoal frivoal added the css-text-3 Current Work label Dec 12, 2018
@frivoal
Collaborator

frivoal commented Dec 12, 2018

Does this affect CSS Text Module Level 3 or Level 4, or neither?

Level 4 will eventually include everything that's in 3, but for now it focuses on additions. If we solve this by clarifying that this is already the expected behavior, this is a level 3 thing. If we solve it by adding a new control, it is a level 4. I'll start by tagging as level 3, under the assumption that what we want is a clarification.

Will it impact the Last Call?

We'll deal with all issues, including this one, before going to CR.

@fantasai
Collaborator

Usage data https://developer.microsoft.com/en-us/microsoft-edge/platform/usage/css/hyphens/
None is pretty high on the list, so I'm a bit concerned about changing its behavior.

@kojiishi
Contributor

kojiishi commented Dec 13, 2018

U+002D is a break opportunity, not a hyphenation opportunity. A hyphenation opportunity inserts a hyphen if the line was broken there. U+002D doesn't do it. &shy; is a hyphenation opportunity.

So maybe you want to tweak one of break opportunity properties.
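The distinction can be illustrated with a toy model, assuming only two characters matter: U+00AD SOFT HYPHEN as a hyphenation opportunity and U+002D as a break opportunity. The function names are made up for this sketch, and it is not a layout algorithm.

```typescript
const SOFT_HYPHEN = "\u00AD"; // what &shy; produces

type Kind = "hyphenation" | "break" | "none";

function classify(ch: string): Kind {
  if (ch === SOFT_HYPHEN) return "hyphenation"; // a hyphen appears only if the line breaks here
  if (ch === "-") return "break";               // the hyphen is content and is always visible
  return "none";
}

// Visible text when no break is taken: soft hyphens render as nothing.
function renderUnbroken(text: string): string {
  return text.split(SOFT_HYPHEN).join("");
}

// Visible lines when the opportunity at `index` is taken.
function renderBroken(text: string, index: number): [string, string] {
  if (classify(text.charAt(index)) === "none") {
    throw new Error("no opportunity at this index");
  }
  const head = renderUnbroken(text.slice(0, index));
  const tail = renderUnbroken(text.slice(index + 1));
  // Either way the first line ends in a hyphen; what differs is whether
  // that hyphen was already in the content or was inserted by the break.
  return [head + "-", tail];
}

console.log(renderUnbroken("hy\u00ADphen")); // hyphen
console.log(renderUnbroken("long-term"));    // long-term
```

Both kinds look the same once a line is broken there; they differ only when the line is not broken.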

@litherum
Contributor

@hftf Wow, this issue is incredibly well researched and detailed. Thank you so much for taking the time to gather all this information and write up the issue!

@css-meeting-bot
Member

The CSS Working Group just discussed Prevent line breaking after explicit hyphens, and agreed to the following:

  • RESOLVED: add a clarifying note to L3 and discuss a potential new value in L4
The full IRC log of that discussion
<dael> Topic: Prevent line breaking after explicit hyphens
<dael> github: https://github.com//issues/3434
<fantasai> tantek, so does D
<tantek> fantasai, you're proving my point that appendices are in general informative
<fantasai> tantek, E doesn't say anything -- it's normative
<fantasai> tantek, F says it's informative explicitly
<fantasai> etc.
<dael> florian: Not entirely clear to me at least and at least one other impl if hyphens none is meant for only suppressing invisible but leave existing hyphens alone or if it's meant to turn off wrapping opportunity at regular hyphens
<dael> florian: koji and I think fantasai understood to not doing anything to normal hyphens. But I read that it's no breaking at hyphens. either way we should clarify. If we clarify to say it's not suppressed maybe explore a control in the next level
<dael> florian: Current spec is not clear so we should clarify
<dael> florian: Spec says suppresses hyphenation opportunities. Doesn't define a hyphenation opportunity.
<dael> florian: That's where ambig comes from
<dael> dbaron: You'd think hypenation opp is different then breaking opp.
<fantasai> https://drafts.csswg.org/css-text-3/#hyphenation
<dael> florian: Different from wrapping opp. Wrapping opp that's right after a hyphen is different. I can see it's reasonable for the spec to mean it
<dael> AmeliaBR: As an author, I've come across places where I want to suppress break at hypens. But you can do that by turning off wrapping
<dael> florian: You can do it with extra mark up.
<dael> astearns: Need extra to signal intent. It's not breaking at any breaking opp. in that string. If you have a reg hyphen and words on either side with breaking opp. you don't want those to break either
<myles> ++astearns
<dael> florian: You mean you would allow in other places as well? The automatic hypenation. But in this no auto, go wrapping.
<dael> astearns: You want markup on the special words where you don't want entire term to break
<dael> florian: Not opposed to saying hyphens:none does not disable wrapping at reg hyphens. I think we could use a little clarification
<dael> florian: It does seem that's what spec intends, but was no obvious
<dael> AmeliaBR: I like clarifying hyphenation opp is opp to insert a hyphen. Breaking opp is second to that.
<dael> florian: And we can re-open an issue on L4 to expore if we want an automatic way of doing this.
<dael> dauwhe: We have general rule where we don't want to hyphenate hyphenated phrases and I'd love some css control for that that doesn't involve preprocessing thousands of words. But that's L4
<dael> dauwhe: We don't want to hyphenate words with intrinsic hypens
<dael> florian: That's dealt with. This is a different case.
<dael> dauwhe: I'm phrasing in a differnet way. But yes we don't want line breaks anywhere in those phases as astearns said
<dael> fantasai: Seems really weird that you'd take hyphenated phrase like one in issue, forbid breaking in a long term is unusually strict. I can see not breaking at a point that's not the hyphen, but breaking at hyphen I don't imagine you'd want to suppress.
<bradk> “E-Mail” should not wrap
<dael> fantasai: Case here is someone that doesn't want hyphenation and is getting breaks at hyphens. And they thought they turned off hyphens and think hyphens and breaking is analogous
<dael> AmeliaBR: I think there is a property for hyphen w/o break
<AmeliaBR> s/property/Unicode character/
<dael> fantasai: I think it makes sense suppress hyphens at breaks through hypens prop. Given current state of impl none does not suppress those breaks. Might mean we add a value in L4 that means no really don't break at hyphens or hypenation points
<dael> myles: What's the exampl?
<AmeliaBR> @bradk, I think that would also be covered by hyphenate-limit-chars https://drafts.csswg.org/css-text-4/#hyphenate-char-limits
<dael> florian: bradk 's IRC example. You'd never want e-mail to break
<dael> myles: This is about things like long-term and t-shirt
<dauwhe> https://www.princexml.com/doc-refs/#prop-prince-hyphenate-before
<dael> fantasai: t-shirt could be a case where you don't break if it's less then 2 char on other side. You can control that for hyphenation in L4.
<dael> fantasai: long-term breaking there is less likely to be because each half is too short
<dael> florian: Seems like stylistic choice in this case
<dael> florian: Can we resolve for L3 to clarify as AmeliaBR and I spoke of and leave it open for L4 and hash it out there? Seem reasonable?
<dael> myles: One more thought, in section it says it affects searching and copying. I think affecting copying is a feature not a bug.
<dael> florian: Possibly. Searching is more annoying
<dael> myles: Have to look at searching. Searching is more complex
<dael> fantasai: [missed]
<fantasai> NFK normalization
<dael> florian: There was spec by i18n to help browsers figure out what to do when searching
<dael> myles: Like curly " match striaght "
<dael> myles: We use ICU u search facilities
<dael> florian: I think we're a little off topic. L3 hyphens:none doesn't do what we talked about, open issue in L4?
<dael> fantasai: I think we wouldn't change meaning of none in L4
<dael> florian: We might add a value
<dael> fantasai: fine with me
<bradk> 👍
<dael> Rossen: Any objections to add a clarifying note to L3 and discuss a potential new value in L4?
<dael> RESOLVED: add a clarifying note to L3 and discuss a potential new value in L4

frivoal added a commit that referenced this issue Dec 19, 2018
@frivoal
Collaborator

frivoal commented Dec 19, 2018

So, based on the teleconf discussion minuted above, we've agreed that the behavior currently observed in browsers (hyphens:none does not suppress wrapping after U+002D or U+2010) was what the spec intended, and I just clarified that in 806cd4e

Now, as to whether we should have a separate control for this in level 4, everybody agreed that this was a useful thing to do in certain occasions, and we looked at different scenarios:

  • using a U+2011 NON-BREAKING HYPHEN can be appropriate for things that shouldn't break for semantic reasons, and should survive copy&paste. (Also, the UA should be smart about search).
  • using a wrapping element styled with white-space:nowrap is appropriate when there's a semantic justification for having a wrapping element
  • For words like e-mail or T-shirt, the UA is already allowed to be smart enough and decide that hyphens after a single letter do not introduce a breaking opportunity, or some similar heuristic.

The remaining question is whether there are cases where you want to suppress wrapping at hyphens that don't fall into any of these.

I think the answer is yes. Doing <code style="white-space:nowrap">grid-template-rows</code> works fine to suppress the wrapping opportunity in that single word, but <code style="white-space:nowrap">grid-template-rows: 1fr min-content 2fr</code> would also suppress wrapping at spaces, which isn't necessarily desired, and there's no particular justification for having wrapper elements around each token in a code sample. Similarly, code samples are often styled with white-space: pre-wrap, and short of adding wrapper elements around every token, we cannot prevent wraps after hyphens.

Another example could be turning off wrapping after hyphens on people's names, to make sure there's no confusion about whether the hyphen is part of the name or inserted at layout time. But names can also contain spaces, and we don't want to suppress that wrapping opportunity.

Any other situation where the semantics of the phrase/paragraph/block (any piece of text that might contain spaces as well), rather than the semantics of the individual word, call for turning off wrapping after hyphens cannot be addressed without adding an explicit control. So I think we should have it.

Bikeshedding time? hyphens: nowrap ?

@hftf
Author

hftf commented Dec 20, 2018

Thank you for investigating my issue!

I just clarified that in 806cd4e

I think this commit contains a typo: a visually indication. I am also somewhat concerned that this only begins to define what a “hyphenation opportunity” is not, rather than what it is. Maybe explicitly contrasting the “hyphenation” mechanism with line breaking behavior would be a helpful addition here, as well as an explicit, unambiguous sentence that literally begins “A hyphenation opportunity is a….”


  • For words like e-mail or T-shirt, the UA is already allowed to be smart enough and decide that hyphens after a single letter do not introduce a breaking opportunity, or some similar heuristic.

To clarify, is this independent of hyphenate-limit-chars, which was suggested in the teleconf? From what I gather, that property only controls “hyphenation opportunities,” and so, as newly redefined, only could affect whether email gets hyphenated to e- and mail, but not whether e-mail breaks apart.

<dael> fantasai: Seems really weird that you'd take hyphenated phrase like one in issue, forbid breaking in a long term is unusually strict. I can see not breaking at a point that's not the hyphen, but breaking at hyphen I don't imagine you'd want to suppress.

<dael> fantasai: t-shirt could be a case where you don't break if it's less then 2 char on other side. You can control that for hyphenation in L4.
<dael> fantasai: long-term breaking there is less likely to be because each half is too short

etc.

It’s not about whether the parts of a hyphenated compound are shorter than a threshold. In general, most hyphenated compounds are conceived as a single unit (and inflected atomically when spoken out loud). Personal names and hyphenated keywords in code are two good examples to add to my meager sample of three frequent English words. But I think the use case is much broader and is not “weird” or “unusual.” Keeping compounds together avoids interrupting or misleading the reader (a miscue or false scent or garden path) and increases legibility. Some style advice is collected below (not all relevant):

Links to selected style manuals and authorities

The Canadian Style: 2. Hyphenation: Compounding and Word Division

In many cases only one syllable in the compound is stressed. The trend over the years has been for the English compound to begin as two separate words, then be hyphenated and finally, if there is no structural impediment to union, become a single word written without a space or hyphen.

2.17: Word division

In order to ensure clear, unambiguous presentation, avoid dividing words at the end of a line as much as possible. If word division is necessary, text comprehension and readability should be your guides. The accepted practice is summarized below:

(i) Avoid misleading breaks that might cause the reader to confuse one word with another, as in read-just and reap-pear.
(j) Divide compounds only at the hyphen, if possible (court-martial, not court-mar-tial). A compound written as one word should be divided between its elements (hot-house, sail-boat).


Chicago Manual of Style, 17th edition

2: Manuscript Preparation, Manuscript Editing, and Proofreading

2.13: Hyphenation

The hyphenation function on your word processor should be turned off. The only hyphens that should appear in the manuscript are hyphens that would appear regardless of where they appeared on the page (e.g., in compound forms). Do not worry if such a hyphen happens to fall at the end of a line or if the right-hand margin is extremely ragged.

2.96: Marking dashes and hyphens

End-of-line hyphens should be marked to distinguish between soft (i.e., conditional or optional) and hard hyphens. Soft hyphens are those hyphens that are invoked only to break a word at the end of a line; hard hyphens are permanent (such as those in cul-de-sac) and must remain no matter where the hyphenated word or term appears.

2.112: Proofreading for word breaks

When it is a question of an intelligible but nonstandard word break for a line that would otherwise be too loose or too tight, the nonstandard break (such as the hyphenation of an already hyphenated term) may be preferred.

7: Spelling, Distinctive Treatment of Words, and Compounds

7.36–7.47: Word Division

7.40: Dividing compounds, prefixes, and suffixes

Hyphenated or closed compounds and words with prefixes or suffixes are best divided at the natural breaks.

poverty- / stricken (rather than pov- / erty-stricken)

7.42: Dividing proper nouns and personal names

Proper nouns of more than one element, especially personal names, should be broken, if possible, between the elements rather than within any of the elements.

Heitor Villa- / Lobos (or, better, Heitor / Villa-Lobos)

7.81–7.89: Compounds and Hyphenation

7.81: To hyphenate or not to hyphenate
7.82: Compounds defined

7.83: The trend toward closed compounds

With frequent use, open or hyphenated compounds tend to become closed (on line to on-line to online).

7.84: Hyphens and readability

A hyphen can make for easier reading by showing structure and, often, pronunciation. Words that might otherwise be misread, such as re-creation or co-op, should be hyphenated. Hyphens can also eliminate ambiguity. For example, the hyphen in much-needed clothing shows that the clothing is greatly needed rather than abundant and needed.


Microsoft Manual of Style: 7. Practical issues of style: Line breaks

  • Try to keep headings on one line. If a two-line heading is unavoidable, break the lines so that the first line is longer. Do not break headings by hyphenating words, and avoid breaking a heading between the parts of a hyphenated word. It does not matter whether the line breaks before or after a conjunction, but avoid breaking between two words that are part of a verb phrase.

    Microsoft style
    Bookmarks, cross-references,
    and captions
    Not Microsoft style
    Bookmarks, cross-
    references, and captions
  • Try to avoid breaking function names and parameters. If hyphenating is necessary, break these names between the words that make up the function or parameter, not within a word itself.


Garner’s Modern English Usage: Headlinese: Peculiar Use Of

Second, line breaks should reflect logical and grammatical breaks as closely as possible. Sometimes a break can create a miscue. … Even when no miscue is possible, it’s best not to split a preposition from its object between lines, for example, or to from the verb in an infinitive.
In a similar vein, the importance of hyphenating phrasal adjectives becomes apparent in the close quarters of a headline.


Technical Communication: Appendix: Reference Handbook: Part C: Editing and Proofreading Your Documents: Hyphens

  1. Use hyphens to divide a word at the end of a line.

    We will meet in the pavil-
    ion in one hour.

    Whenever possible, however, avoid such line breaks; they slow the reader down.

Illustrator CC: Visual QuickStart Guide: 20. Style & Edit Type: Applying hyphenation

Regardless of the current hyphenation settings, you will need to “eyeball” your hyphenated text and, if necessary, correct any awkward breaks using a soft return or the No Break command. To prevent a particular word from breaking at the end of a line, such as a compound word (e.g., “single-line”), to reunite an awkwardly hyphenated word (e.g., “sex-tuplet”)…


GPO Style Manual: 6. Compounding Rules

A compound word is a union of two or more words, either with or without a hyphen. It conveys a unit idea that is not as clearly or quickly conveyed by the component words in unconnected succession. The hyphen is a mark of punctuation that not only unites but also separates the component words; it facilitates understanding, aids readability, and ensures correct pronunciation. When compound words must be divided at the end of a line, such division should be made leaving prefixes and combining forms of more than one syllable intact.

To Hyphenate or Not to Hyphenate

StackOverflow: English Language & Usage: Should hyphenated compound words be permitted to break across lines?

StackOverflow: English Language & Usage: Is it normal to separate hyphenated words on different lines? [duplicate]

@frivoal
Collaborator

frivoal commented Dec 20, 2018

this commit contains a typo

Fixed in ecc7db8. Thanks.

I am also somewhat concerned that this only begins to define what a “hyphenation opportunity” is not, rather than what it is.

Right, I agree the current text is still a bit hand-wavy about hyphenation opportunities. I thought it was worth lifting this particular ambiguity anyway, even if we later want to revisit to make the whole definition clearer, as that's where confusion has been. If you've got a suggestion for a complete rephrasing, I'm all ears.


I think cases like T-shirt or e-mail can be covered by the UA with the specification as it is, since the specification does not mandate that the break-after (BA) Unicode line-breaking class always be respected, and so the UA could decide that (in English?) hyphen-minus should not introduce a wrapping opportunity when it is preceded by only one letter. Currently, Chrome and Safari do not do that, but Firefox does, as can be seen in this little demo: https://jsbin.com/hoqufobeje/edit?html,css,output.
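One way such a heuristic could look is sketched here. It is purely an illustration and not a description of any browser's actual rule: `breakAllowedAfterHyphen`, the default threshold of two letters, and the ASCII-only letter test are all assumptions.

```typescript
// Hypothetical heuristic: a U+002D offers a wrapping opportunity only if
// at least `minLetters` ASCII letters immediately precede it.
function breakAllowedAfterHyphen(
  word: string,
  hyphenIndex: number,
  minLetters = 2
): boolean {
  if (word.charAt(hyphenIndex) !== "-") return false;
  let letters = 0;
  // Count letters backwards from the hyphen until a non-letter is reached.
  for (let i = hyphenIndex - 1; i >= 0 && /[A-Za-z]/.test(word.charAt(i)); i--) {
    letters++;
  }
  return letters >= minLetters;
}

console.log(breakAllowedAfterHyphen("e-mail", 1));    // false
console.log(breakAllowedAfterHyphen("long-term", 4)); // true
```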

Maybe we want to mandate that behavior specifically, but I suspect not, as such rules probably need to be language-dependent (not to mention full of corner cases), and researching line-breaking best practices in all of the world's languages is beyond the scope of what we can hope to do in this specification. Maybe a note? Or, given that the spec already expects UAs to "do the right thing" for each language, even if the rules aren't spelled out explicitly, maybe the spec as it is is sufficient to consider that disallowing a break after the hyphen in “e-mail” is already a must-do, and I can write a wpt test to check if it is being done.


As for the rest, yes, I'm with you. The links/quotes you gave show that there are situations where it's not desired to break after a hyphen. Since it's not universal, I don't think it would be the default, but a hyphens: nowrap value, which would turn off hyphenation AND wrapping opportunities after hyphens, seems to be a very reasonable possibility to me.
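The difference between the clarified hyphens: none and the proposed hyphens: nowrap can be sketched with a toy model that considers only spaces and U+002D. Note that "nowrap" is just the value floated in this thread, and `wrapOpportunities` is a made-up helper, not the CSS line-breaking algorithm.

```typescript
type HyphensMode = "none" | "nowrap"; // "nowrap" is only a proposal here

// Returns the indices of characters after which a line may wrap.
function wrapOpportunities(text: string, mode: HyphensMode): number[] {
  const result: number[] = [];
  for (let i = 0; i < text.length - 1; i++) {
    const ch = text.charAt(i);
    if (ch === " ") {
      result.push(i); // spaces stay breakable in both modes
    } else if (ch === "-" && mode === "none") {
      result.push(i); // "none" leaves breaks after explicit hyphens alone
    }
  }
  return result;
}

console.log(wrapOpportunities("min-content 2fr", "none"));   // [ 3, 11 ]
console.log(wrapOpportunities("min-content 2fr", "nowrap")); // [ 11 ]
```

Under the proposed value the space remains a wrapping opportunity, which is what white-space: nowrap on the whole phrase cannot offer.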

@hftf
Author

hftf commented Dec 20, 2018

Thank you, I agree with all of that. I don’t have enough expertise to suggest a better rephrasing though.

I don’t think anything needs to be mandated for cases like T-shirt or e-mail (at least, once there is an explicit control like hyphens: nowrap for it). It could be informative to note that UAs can experiment.

@fantasai
Collaborator

fantasai commented Dec 31, 2018

Pushed some rephrasing in 723a74e

Note the section quoted in #3463 (comment) is also relevant here wrt breaking “e-mail” and other words with similarly short segments.

Remaining things we could do:

  • Recommend that if a word contains hyphens, those breakpoints take priority over automatic hyphenation, similar to what we require for soft hyphens. Possibly note that if a segment without hyphens is particularly long, the UA can still hyphenate if needed. (L3 or L4)
  • Define that hyphenate-limit-chars also applies to breaks at explicit hyphens. (L4)
  • Add a nowrap value to hyphens in L4.

All three of these make sense to me. Agenda+ for WG discussion/resolution.
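The second item could be sketched as follows, treating a break at an explicit U+002D like a hyphenation point and testing it against the three hyphenate-limit-chars values. How characters should be counted is an assumption here (the hyphen counts toward the total, and before/after are measured to the word edges), and `explicitBreakAllowed` and the sample limits are illustrative only.

```typescript
interface LimitChars {
  total: number;  // minimum characters in the whole word
  before: number; // minimum characters before the break
  after: number;  // minimum characters after the break
}

// Hypothetical check for a break after the explicit hyphen at `hyphenIndex`.
function explicitBreakAllowed(
  word: string,
  hyphenIndex: number,
  limits: LimitChars
): boolean {
  if (word.charAt(hyphenIndex) !== "-") return false;
  const before = hyphenIndex;                  // characters preceding the hyphen
  const after = word.length - hyphenIndex - 1; // characters following the hyphen
  return (
    word.length >= limits.total &&
    before >= limits.before &&
    after >= limits.after
  );
}

const sample: LimitChars = { total: 5, before: 2, after: 2 }; // example values only
console.log(explicitBreakAllowed("e-mail", 1, sample));    // false
console.log(explicitBreakAllowed("long-term", 4, sample)); // true
```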

@css-meeting-bot
Member

The CSS Working Group just discussed Prevent line breaking after explicit hyphens, and agreed to the following:

  • RESOLVED: Accept text for hyphenation in L3
The full IRC log of that discussion
<dael> Topic: Prevent line breaking after explicit hyphens
<dael> github: https://github.com//issues/3434#issuecomment-450610535
<dael> fantasai: I committed a set of changes to clarify what hyphenation is and when it's invoked.
<dael> fantasai: There were specific things we can do. Rec if a word contains a hyphen that breakpoint takes priority over auto hypen points. Could add no hyphens to L4. All this makes sense to me and wanted to ask WG what makes sense to do
<dael> astearns: Argument to put first into L3 instead of 4?
<TabAtkins> Whoops, sorry, I'm on IRC.
<dael> fantasai: Just spec a particular behavior. It wouldn't increase scope of l3. We can also not spec in L3
<TabAtkins> I can jump on phone to discuss the @charset thing.
<dael> florian: Adding no-wrap to hyphen in L4 would be helpful. Since I've done a talk on linebreaking people have asked for this feature.
<dael> florian: Priority on an actual hyphen over hpyenation I'm generally in support. There was a nuance brought up where in things like German you have [longword]-[longword] someone pointed out in some cases you might want to break middle of other words at high priority as well
<florian> https://github.com//issues/618#issuecomment-255135593
<dael> fantasai: So you prefer to break at another word and not hyphen if the break is close?
<dael> florian: Here's the comment^
<dael> fantasai: I can imagine if you have 2 long words you would allow hyphenation in them. but if auto hyphen point is 2 char from explicit hyphen you would want explicit hyphen. I'm not saying forbid the hyphen elsewhere, but encourage UA to use that break
<dael> florian: Trying to find some way around this for UA to do something smarter, either by keeping vague or we make it a must rule but if a-b is in the dictionary it can override and do what it wants
<dael> astearns: I think having a preference for the explicit hyphen but allow at other points, for a UA to make a decision it has to consider hyphenation points againt something else like a desired line break. A greedy linebreak algo will jsut pick the highest priority we spec for the longest line
<dael> fantasai: Greedy means fill the line as much as possible. Doesn't mean you can't say you prioritize breaks in that. Spaces win but a hypehn in 2 char of break works
<myles_> q+
<florian> q+
<dael> AmeliaBR: The desire is to keep some vagueness in rule for prioritization because it does end up around how many characters you will end up short. Spec has avoided strict hyphen algo so far
<astearns> ack myles_
<dael> myles_: The smarter we get on hyphenation and line breaking the more it seems to fit the text wrap multiline type thing
<dael> astearns: Given that we are discussing pro and con of spec explicit hyphen is desired I'm included to push to L4
<dael> florian: Need to say something in L3.
<myles_> what happened to text-wrap:multi-line? it isn't in the spec any more, but there are references to it if you search-in-page
<dael> fantasai: L3 spec if you break at punct it's rec you preform prioritization among your breaks. I don't think L3 needs to say anything more. It's not only allowed, but encouraged
<astearns> ack florian
<dael> fantasai: Happy to push to L4. If we want something in the spec I'll write it
<dauwhe> q+
<dael> florian: There's multiple ways. There's prioritization. There's also if you have a hyphen and disallow the rest. Even looking at German in the example this is allowed but not nice. Makes me think it's akin to line break where there's strict and loose.
<dael> fantasai: Okay
<dael> dauwhe: I'm fine with prioritization as fantasai wrote. I think that expresses all other things being equal we prefer to break at hyphen that's there, but there are other things algo need to consider
<dael> astearns: I think we need a resolution to accpe the current change. fantasai did you want a feeling of the group if those 3 items should be worked on in L4?
<dael> fantasai: Yeah. If we want to add to L4 I can edit those in
<florian> +1 for current change
<dael> astearns: First is accepting the changes in L3. Any objections to the current hyphenation text in L3?
<dael> RESOLVED: Accept text for hyphenation in L3
<florian> +1
<dael> astearns: More explicit rule on where to hyphenation when there is a n explicit hyphen. Objections to adding that to L4?
<dael> [silence]
<dael> astearns: So work on that
<AmeliaBR> aka the e-mail/T-shirt rule
<florian> +0 for hyphenate-limit-chars (no disagreement, just haven't thought about it, but go explore)
<dael> fantasai: There's don't hyphenate if there will be this many char before or after and proposal is to apply that to hyphens. You can't break if there's one character before or after explicit hyphen
<TabAtkins> e-
<TabAtkins> mail
<dael> astearns: Obj to add something in L4 around e-mail/t-shirt rule?
<dael> astearns: Hearing none, let's work on that.
<myles_> https://github.com/w3c/csswg-drafts/commit/a0c27afa0a50c462584511e617a20b687eb892af#diff-94819ad75aa15ba8049b412f93d8cc04
<florian> +1 for nowrap in L4
<dael> fantasai: Adding no-wrap to hyphens. None says don't do hyphenation but you can break at explicit hyphens. No-wrap says don'tbreak at explicit hyphens either
<dael> astearns: Obj to dealing with not wrapping at explicit hyphens in L4?
<dael> astearns: Let's work on that too.
<fantasai> https://drafts.csswg.org/css-text-4/#hyphenate-char-limits
<dael> astearns: One additional thing when talking about char limit. Does it make sense to have char limimt applyt o each segment in between explicit hyphens?
<dael> fantasai: Three values, required min for total char to hyphenate, min for char before hyphen, min for char after
<dael> astearns: Min for 3 char, you have 3 char, explicit hyphen, 2 char, hyphenation break. Is that allowed?
<dael> fantasai: Need to check
<dael> astearns: Not sure if that should be a thing or not. There are more then enough char before hyphen
<dael> fantasai: Yes, but if there wasn't a hyphen seems weird to break there
<dael> astearns: True. Maybe line length consideration needs the char
<dael> fantasai: THen you would allow to break after 2 char.
<dael> astearns: That's prob enough on this
<fantasai> s/char/char afte a space, too/

@kojiishi
Contributor

Minor point, but for nowrap, I have a mild preference to add it to one of the existing (or new) line breaking properties, so that we could add line breaking control for characters other than U+002D in the future.

@fantasai fantasai removed the css-text-3 Current Work label Jan 30, 2019
@faceless2

Regarding breaks after explicit hyphens: what about adding an optional fourth value to hyphenate-limit-chars, which is the minimum distance to the nearest hyphen, either explicit or automatically inserted?

This would let authors disallow hyphenation entirely in a word containing an explicit hyphen (which I believe is common practice), or optionally allow it only for long words. See discussion of this at https://www.princexml.com/forum/topic/3316/hyphenation-of-overlong-words

Note they state:

Oxford University Press (Hart's Rules) is to allow hyphenating words in the phrase, but no closer than 6 letters from the closest in-text hyphen...

So there is precedent for this sort of logic.
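A sketch of how such a fourth value might filter automatic hyphenation points. It is simplified to measure distance from explicit U+002D characters only, and `filterByHyphenDistance` and its index convention are assumptions made for this example.

```typescript
// Hypothetical filter: keeps only those automatic hyphenation points that
// have at least `minDistance` letters between them and every explicit
// U+002D in the word. A candidate index i means "break between
// word[i - 1] and word[i]".
function filterByHyphenDistance(
  word: string,
  candidates: number[],
  minDistance: number
): number[] {
  const explicit: number[] = [];
  for (let i = 0; i < word.length; i++) {
    if (word.charAt(i) === "-") explicit.push(i);
  }
  return candidates.filter((c) =>
    explicit.every((h) => {
      // Letters lying between the candidate break and the explicit hyphen.
      const letters = c <= h ? h - c : c - h - 1;
      return letters >= minDistance;
    })
  );
}

// Candidates: pov|erty-stricken (3) and poverty-strick|en (14).
console.log(filterByHyphenDistance("poverty-stricken", [3, 14], 6)); // [ 14 ]
```

With a minimum distance of 6, the break in "pov-erty" (four letters from the hyphen) is dropped, while the one in "strick-en" (six letters from it) survives.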

@valtlai
Contributor

valtlai commented Jun 15, 2019

Chromium issue: https://crbug.com/974470

@r12a r12a added the i18n-tracker label Feb 10, 2021
@aghArdeshir

Hi. I came across this thread as I need to transform this:
image

To this:
image

And I am not willing to use any of the workarounds (like adding extra HTML wrapper/element or using non-breaking hyphen), as my texts are coming from JSON translation files, and I'm not considering them as HTML.

My first go-to thought was word-break property.

@codewizard13

Links to selected style manuals and authorities

I have to commend you @hftf for well and thoroughly researched responses! I thought I was concise and detailed, but your prose in this forum is certainly a cut above and a model for others to imitate. Bravo!

@codewizard13

Hi. I came across this thread as I need to transform this: image

To this: image

And I am not willing to use any of the workarounds (like adding extra HTML wrapper/element or using non-breaking hyphen), as my texts are coming from JSON translation files, and I'm not considering them as HTML.

My first go-to thought was word-break property.

Excellent illustration of what I believe @hftf is talking about. This is also exactly what I am trying to accomplish. I tried white-space: nowrap, but it doesn't do what I expected. Then I found this post and reading the whole thread explained a lot. Thanks for the graphic!

@davidwebca

This is a major issue with composed names like cities and first names. Ex.: Jean-François, Saint-Pamphile, etc.

Lots of writing and style guides say never to split a name like this. And just as @aghArdeshir mentioned, most of the time I don't have control over the content itself, so replacing the character manually or parsing the content before outputting it to HTML is very much unviable.

The best solution would be a new value to the word-break property.

@JoelDSmith

I personally would like the option to disable wrapping of hyphens-in-content, because of email addresses (e.g. an-email@example.com) that I need to print to the page within a sentence, but have no control over the surrounding HTML (only CSS).

Given the high likelihood of the email address being copy-pasted, I cannot just replace the hyphen with a non-wrapping, visually-identical variant, and the enclosing sentence must still wrap in general, so I can't use white-space: nowrap, either (so in my specific case, I'd expect the automatic wrapping to happen before the email address).

@simevidas
Contributor

I noticed that word-break: keep-all works in Safari and Firefox, but not in Chrome.

Test page: https://jsbin.com/zakacav/edit?html,css,output

Screenshot of test page
