Input:
Aliquam ipsum ex, tempus ornare semper ac, varius vitae nibh.
Output:
A i e, t o s a, v v n.
I need a JavaScript function to solve this.
I'm trying something like this:
function short_verse(verse) {
  let result = [];
  verse.split(' ').map(word => word.charAt(0) != '' ? result.push(word.charAt(0)) : '');
  return result.join(" ");
}
let input = "Aliquam ipsum ex, tempus ornare semper ac, varius vitae nibh.",
output = short_verse(input);
console.log(output);
The story: They say you can memorize texts this way. :) So I'm building an application that will include this feature, too.
It should work for non-ASCII characters, too. Example:
Input: Aliqușam țipsum ex, tempăs ornâre semper ac, varius vitae îbh.
Output: A ț e, t o s a, v v î.
Note: In my case, Romanian diacritics would be enough: ăâîșțĂÂÎȘȚ.
We can use a regex replacement approach here:
var input = "Aliquam ipsum ex, tempus ornare semper ac, varius vitae nibh.";
var output = input.replace(/(\w)\w*/g, "$1");
console.log(output);
If you are using only word characters, you can keep the first character and remove the rest of the word characters. \B matches a non-word boundary and \w+ matches one or more word characters:
const s = "Aliquam ipsum ex, tempus ornare semper ac, varius vitae nibh.";
console.log(s.replace(/\B\w+/g, ""));
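One caveat worth checking before relying on either \w-based pattern above: in JavaScript, \w matches only [A-Za-z0-9_], so letters such as ă or î are treated like punctuation and split a word in two. A minimal sketch of that behavior (the helper name firstCharsAscii is made up for illustration):

```javascript
// Hypothetical helper wrapping the \B\w+ replacement shown above.
function firstCharsAscii(text) {
  // \w is ASCII-only ([A-Za-z0-9_]), so a diacritic acts like punctuation
  // and the letters after it start a new "word".
  return text.replace(/\B\w+/g, "");
}

console.log(firstCharsAscii("Aliquam ipsum ex, tempus."));
// → "A i e, t."
console.log(firstCharsAscii("tempăs ornâre îbh."));
// → "tăs oâr îb."
```

The second call shows why plain \w is not enough once diacritics appear in the input.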
For the updated question, you can capture any leading characters other than letters or whitespace, followed by a single letter, in group 1. Then match the optional remaining letters of the word, which get removed, and use capture group 1 in the replacement. Note that \p{L} only works when the regex has the u flag.
([^\p{L}\s]*\p{L})\p{L}*
[
  "Dumnezeu a zis: „Să fie o întindere între ape, și ea să despartă apele de ape.”",
  "Aliqușam țipsum ex, tempăs ornâre semper ac, varius vitae îbh.",
  "Aliquam ipsum ex, tempus ornare semper ac, varius vitae nibh."
].forEach(s =>
  console.log(s.replace(/([^\p{L}\s]*\p{L})\p{L}*/gu, "$1"))
);
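To reuse the pattern above inside an application, it could be wrapped in a small function. The sketch below is one possible shape, not the only one: the name shortVerse and the NFC normalization step are assumptions added here. Normalizing matters only if the text may arrive in decomposed form, where a diacritic is a separate combining mark that \p{L} does not match.

```javascript
// Hypothetical wrapper around the \p{L}-based replacement.
function shortVerse(verse) {
  return verse
    // Assumption: compose "s" + combining comma into a single "ș" first,
    // because a combining mark is \p{M}, not \p{L}.
    .normalize("NFC")
    // Keep leading non-letter, non-space characters plus the first letter
    // (group 1); drop the remaining letters of the word.
    .replace(/([^\p{L}\s]*\p{L})\p{L}*/gu, "$1");
}

console.log(shortVerse("Aliqușam țipsum ex, tempăs ornâre semper ac, varius vitae îbh."));
// → "A ț e, t o s a, v v î."
```

Digits are not letters, so a standalone number such as 123 passes through unchanged.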
The following function handles letters, whitespace, and symbols. The magic is in the regex: [a-zA-ZÀ-ÿăâîșțĂÂÎȘȚ]+ matches each run of ASCII and Latin-1 letters plus the Romanian diacritics (as per the question's request); \s matches each whitespace character, since we want to preserve the spacing; and finally [^\w\s] matches each remaining non-word, non-space character, a.k.a. symbols. Digits and underscores match none of the three alternatives, so they are dropped from the result:
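function short_verse(verse) {
  // Tokens: letter runs, single whitespace characters, and single symbols.
  // match() returns null when nothing matches (e.g. an empty string), so fall back to [].
  const tokens = verse.match(/([a-zA-ZÀ-ÿăâîșțĂÂÎȘȚ]+)|(\s)|[^\w\s]/g) || [];
  // Keep only the first character of each token.
  const firstChars = tokens.map((token) => token.charAt(0));
  return firstChars.join('');
}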
let input1 = "Aliquam ipsum ex, tempus ornare semper ac, varius vitae nibh.";
console.log(short_verse(input1));
let input2 = "Să fie o întindere între ape, și ea să despartă apele de ape.";
console.log(short_verse(input2));
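If the tokenizing approach should also cover digits and letters outside the hard-coded character class, one option is to swap the class for Unicode property escapes. This is a variation on the function above, not the answer's original code; the name shortVerseUnicode is made up for illustration, and the property escapes it relies on need an ES2018-capable engine.

```javascript
// Hypothetical variant: tokenize with Unicode property escapes.
function shortVerseUnicode(verse) {
  // Three kinds of token: a run of letters/digits, one whitespace
  // character, or one other character (punctuation, symbols).
  const tokens = verse.match(/[\p{L}\p{N}]+|\s|[^\p{L}\p{N}\s]/gu) || [];
  // Array.from splits by code point, so the first element is a whole
  // character even outside the Basic Multilingual Plane.
  return tokens.map((token) => Array.from(token)[0]).join("");
}

console.log(shortVerseUnicode("Să fie o întindere între ape, și ea să despartă apele de ape."));
// → "S f o î î a, ș e s d a d a."
console.log(shortVerseUnicode("Psalm 119, verse 105."));
// → "P 1, v 1."
```

Unlike the version above, this keeps the first digit of each number instead of dropping digits.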