How to Suppress Em Dashes in ChatGPT Responses Using Logit Bias
Are you frustrated by ChatGPT’s persistent use of em dashes in its outputs? I was too. Despite trying custom instructions, system memories, and various prompt tweaks, the dreaded em dash kept sneaking in—until I discovered a more direct approach: leveraging the logit_bias parameter in the API.
The Challenge with Em Dashes
Instruction-based attempts to stop em dashes usually fall short: even when told explicitly to avoid the character, ChatGPT may still produce it. The em dash exists in the model's vocabulary as distinct tokens, so the reliable fix is to intervene at the token level rather than the instruction level.
Using Logit Bias to “Ban” Em Dashes
The logit_bias parameter allows you to assign biases to specific tokens, with values ranging from -100 to +100. Setting a token’s bias to -100 effectively suppresses its likelihood of appearing. My initial goal was to identify the token ID for the em dash (—) and prevent it from showing up.
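A minimal sketch of the request construction, using the real `logit_bias` field of the Chat Completions API. The token ID below is a placeholder, not a real ID: em dash token IDs vary by tokenizer, so you would look up the actual values for your target model (for example with the `tiktoken` library) before sending the request.

```python
# Placeholder ID for the em dash token -- look up the real ID for "\u2014"
# with tiktoken for your model, since IDs differ between tokenizers.
EM_DASH_TOKEN_ID = 2001  # hypothetical value for illustration

def build_request(prompt, banned_token_ids):
    """Return a Chat Completions payload with each banned token biased to -100.

    The API expects logit_bias as a mapping of token ID -> bias in [-100, 100];
    -100 effectively removes the token from consideration.
    """
    return {
        "model": "gpt-4o",  # any chat model; adjust as needed
        "messages": [{"role": "user", "content": prompt}],
        "logit_bias": {str(tid): -100 for tid in banned_token_ids},
    }

payload = build_request("Give me a hot take.", [EM_DASH_TOKEN_ID])
# This payload would then be sent via the OpenAI client, e.g.
# client.chat.completions.create(**payload)
```

With only this single ID banned, the model can still reach for en dashes, hyphens, and space-padded variants, which is what the next section addresses.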
However, the process is more intricate than targeting a single token. Variations like the en dash (–) and the hyphen (-) map to different token IDs, and dash characters also appear embedded in multi-character tokens (for example, a dash with surrounding spaces). To suppress em dashes comprehensively, I incrementally widened the ban from the bare em dash token to these related tokens: versions with spaces, hyphens, and other hyphen-containing tokens.
The Experiment: Suppressing 106 Tokens
In the end I had to set 106 tokens to a bias of -100 to reliably eliminate em dashes and their look-alikes from ChatGPT responses. Here's the progression:
- Initially, targeting only the direct em dash token (—).
- Expanding to include tokens with spaces around the dash.
- Extending to en dashes and hyphens, especially those used as substitutes.
- Finally, suppressing every remaining token that could be combined into an em dash or dash-like punctuation.
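The expansion steps above amount to scanning the tokenizer's vocabulary for any token whose text contains a dash character. This is my own sketch, not the author's exact script: with `tiktoken` you would pass `lambda i: enc.decode([i])` and `enc.n_vocab`; here a tiny stand-in vocabulary keeps the example self-contained.

```python
# Dash characters whose tokens we want to suppress: em dash, en dash, hyphen.
DASH_CHARS = {"\u2014", "\u2013", "-"}

def find_dash_tokens(decode, vocab_size):
    """Collect every token ID whose decoded text contains a dash character.

    This catches the bare dash as well as embedded forms like " \u2014 ",
    "\u2014,", and "--" that instruction-level bans miss.
    """
    return {
        i for i in range(vocab_size)
        if any(ch in decode(i) for ch in DASH_CHARS)
    }

# Stand-in vocabulary for illustration only (real IDs come from the tokenizer).
toy_vocab = {0: "the", 1: "\u2014", 2: " \u2014 ", 3: "--", 4: "word", 5: "\u2013"}

dash_ids = find_dash_tokens(toy_vocab.get, len(toy_vocab))
bias = {str(i): -100 for i in dash_ids}  # ready to drop into logit_bias
```

Note that the Chat Completions API caps `logit_bias` at 300 entries per request, so a 106-token ban fits comfortably within the limit.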
Results and Observations
Despite a common concern that disabling these tokens might impair response quality, I found that overall coherence and style remained intact. Here are some example prompts with two responses each: one standard, one with the bias applied.
Prompt: Provide your most provocative ‘hot take’ in paragraph form.
(A) Normal Response
“Here’s a hot take: The
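To verify that a biased response really is dash-free, a small check along the lines the title promises can be run over each completion. This is my own sketch of such a "no dash" test, not the author's exact code.

```python
# Characters the bias is meant to ban: em dash and en dash.
# Add "-" to the tuple if you also want to reject plain hyphens.
DASHES = ("\u2014", "\u2013")

def assert_no_dashes(text):
    """Raise AssertionError if the response contains a banned dash character."""
    for ch in DASHES:
        assert ch not in text, f"found banned character {ch!r} in response"
    return True

# Example: a clean response passes the check.
assert_no_dashes("Here is a hot take: pineapple belongs on pizza.")
```

Running this over every completion in a test suite makes regressions obvious if a future model or tokenizer change lets a dash variant slip back through.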