-6

Background

I'm working on processing a comma-separated username list (for an ACL whitelist optimization in my project) and need to normalize whitespace around commas, as well as trim leading/trailing whitespace from the string.

Code & Issue

I used this regex replacement to clean up the string:

const input = "a,b,c ";
const result = input.replace(/\s*,\s*|^\s*|\s*$/g, ',');
console.log(result); // Outputs ",a,b,c,," (one leading comma and two trailing commas)

"c ".replace(/(\s*$)/g, ','); // Outputs "c,," (two trailing commas)

function checkByIndexOf(commaStr, target) {
  const wrappedStr = `,${commaStr},`;
  const wrappedTarget = `,${target},`;
  return wrappedStr.indexOf(wrappedTarget) !== -1;
}

/**
 * High-performance check: indexOf + boundary validation (supports spaces/dots/no special chars)
 * @param {string} commaStr - Comma-separated string (may contain spaces, dots)
 * @param {string} target - Target item (may contain dots)
 * @returns {boolean} Whether the target is included as a standalone item
 */
function checkByIndexOfWithBoundary(commaStr, target) {
  const targetLen = target.length;
  const strLen = commaStr.length;
  let pos = commaStr.indexOf(target);

  // Return false immediately if target is not found
  if (pos === -1) return false;

  // Loop through all matching positions (avoid missing matches, e.g., duplicate items)
  while (pos !== -1) {
    // Check front boundary: start of string / previous char is comma/space
    const prevOk = pos === 0 || /[, ]/.test(commaStr[pos - 1]);
    // Check rear boundary: end of string / next char is comma/space
    const nextOk = (pos + targetLen) === strLen || /[, ]/.test(commaStr[pos + targetLen]);

    // Return true if both boundaries match (target is a standalone item)
    if (prevOk && nextOk) return true;

    // Find next matching position (avoid re-matching the same position)
    pos = commaStr.indexOf(target, pos + 1);
  }

  // All matching positions fail boundary validation
  return false;
}

/**
 * Check if a comma-separated string contains a specified standalone item
 * @param {string} commaStr - Original comma-separated string (e.g. "apple,banana,orange")
 * @param {string} target - Target string to check (e.g. "banana")
 * @returns {boolean} Whether the target item is included as a standalone entry
 */
function checkCommaStrInclude(commaStr, target) {
  // Escape regex special characters in the target string (. * + ? ^ $ { } ( ) | [ ] \)
  const escapedTarget = target.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  
  // Build regex pattern: match (start of string | comma) + escaped target + (comma | end of string)
  // Ensures the target is a standalone item (avoids partial matches)
  // No 'g' flag: test() only needs to find a single match
  const regex = new RegExp(`(^|,)${escapedTarget}(,|$)`);
  
  // Test if the regex matches the comma-separated string
  return regex.test(commaStr);
}

Problem

The expected output is "a,b,c" (no trailing commas, normalized commas), but the current code produces two trailing commas instead. I don't understand why the regex is matching in a way that adds extra commas at the end.

What I've Tried

  • I checked the regex pattern /\s*,\s*|^\s*|\s*$/g and understand it's meant to match:
    • Whitespace around commas (\s*,\s*)
    • Leading whitespace (^\s*)
    • Trailing whitespace (\s*$)
  • I replaced every match with a comma (,), but the trailing space in the input seems to trigger two replacements, which produces the double comma.

Question

  1. Why does this regex produce two trailing commas for the input "a,b,c "?
  2. How can I adjust the regex (or use a better approach) to get the clean output "a,b,c" for comma-separated strings with extra whitespace/commas?
5
  • 1
    I would use \s+ instead of \s*, because it only makes sense to replace when you've matched at least one space. Commented 6 hours ago
  • 4
    It produces ,a,b,c,, and not a,b,c, but that's probably the LLM's fault... Reminder that ChatGPT-generated content is not allowed. Commented 6 hours ago
  • 2
    ",,,a, b , c , ,".split(/\s|,/).filter((part) => part.length).join() Commented 6 hours ago
  • 1
    I’m voting to close this question because I find it is👎 Commented 6 hours ago
  • 3
    @samm - It's your question and you can delete it. The down votes will be removed from your reputation. Commented 5 hours ago

2 Answers

2

Why not use the built-in methods like split(), map(), trim(), filter(), and join()? Maybe they are not as fast, but the code is more readable.

const input = "a,b,c ";
const result = input.split(',')
                 .map(val => val.trim())
                 .filter(val => !!val)
                 .join(',');

console.log(result);

1 Comment

The downside is that this can also remove commas, namely when an entry is empty, as in "a,,b". That effect is not described or asked for in the question.
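To make that difference concrete, here is a minimal sketch comparing the chain with and without the filter() step. The names withFilter and withoutFilter are illustrative only and do not come from the answer.

```javascript
const input = " a,,b ";

// Same chain as in the answer: empty entries are dropped by filter().
const withFilter = input
  .split(",")
  .map((val) => val.trim())
  .filter((val) => !!val)
  .join(",");

// Hypothetical variant without filter(): empty entries (and their commas) survive.
const withoutFilter = input
  .split(",")
  .map((val) => val.trim())
  .join(",");

console.log(withFilter);    // "a,b"
console.log(withoutFilter); // "a,,b"
```

Which behavior is wanted depends on whether an empty entry is meaningful in the list; the question does not say.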
1

Why does this regex produce two trailing commas for the input "a,b,c "?

It does so because your regex has three alternatives, and only the first one matches a comma. When the first alternative matches, the inserted comma simply replaces the one that was matched. When the match comes from one of the other two alternatives (either ^\s* or \s*$), no comma is matched, so the comma that is inserted is an extra comma that did not occur in the input.

Additionally, after the trailing spaces have matched, there is one more match with an empty string at the very end of the input (\s*$ can match zero characters), which gives the second match that appends a comma to your output.
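To see where each comma comes from, it can help to list the individual matches. The following is a minimal sketch using String.prototype.matchAll; the variable names are only illustrative.

```javascript
const input = "a,b,c ";
const pattern = /\s*,\s*|^\s*|\s*$/g;

// Collect the start index and matched text of every match.
const matches = [...input.matchAll(pattern)].map((m) => ({
  index: m.index,
  text: m[0],
}));

for (const { index, text } of matches) {
  console.log(index, JSON.stringify(text));
}
// 0 ""   (^\s* matches the empty string at the start)
// 1 ","
// 3 ","
// 5 " "  (\s*$ matches the trailing space)
// 6 ""   (\s*$ matches the empty string at the end)

// Each of the five matches is replaced by a comma.
const replaced = input.replace(pattern, ",");
console.log(replaced); // ",a,b,c,,"
```

Three of the five matches contain no comma, so each of them adds one: one at the start and two at the end.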

How can I adjust the regex?

One way to solve this is to capture the comma in a capture group (using parentheses), and then reproduce what was captured with $1 in the replacement. If the second or third pattern matches, the capture group does not participate, so $1 is replaced by an empty string and no comma is inserted where none occurred in the match:

const input = "a,b,c ";
const result = input.replace(/\s*(,)\s*|^\s+|\s+$/g, '$1');
console.log(result); // Outputs "a,b,c"

NB: I also replaced \s* with \s+ in the second and third patterns, as there is no need to replace an empty string.

Another way is to not match any comma, and not insert one either. For that you can use look-around assertions:

const input = "a,b,c ";
const result = input.replace(/\s+(?=,|$)|(?<=,|^)\s+/g, '');
console.log(result); // Outputs "a,b,c"
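As a quick sanity check, the two patterns above can be run side by side on a few more inputs. This is only a sketch: the helper names viaCapture and viaLookaround and the sample strings are made up for illustration, and the look-behind version assumes an engine with ES2018 look-behind support.

```javascript
// Capture-group approach: re-insert the comma only when one was matched.
const viaCapture = (s) => s.replace(/\s*(,)\s*|^\s+|\s+$/g, "$1");

// Look-around approach: match only the whitespace, never the comma.
const viaLookaround = (s) => s.replace(/\s+(?=,|$)|(?<=,|^)\s+/g, "");

const samples = ["a,b,c ", "  a , b ,c  ", "a, ,b", "john smith , jane"];

const results = samples.map((s) => [viaCapture(s), viaLookaround(s)]);

samples.forEach((s, i) => {
  console.log(JSON.stringify(s), "->", results[i].map((r) => JSON.stringify(r)).join(" | "));
});
// "a,b,c " -> "a,b,c" | "a,b,c"
// "  a , b ,c  " -> "a,b,c" | "a,b,c"
// "a, ,b" -> "a,,b" | "a,,b"
// "john smith , jane" -> "john smith,jane" | "john smith,jane"
```

Both versions keep empty entries and whitespace inside an entry; they only strip whitespace next to a comma or at either end of the string.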
