twiliodeved / message-segment-calculator

JS-based tool to calculate and display message segmentation data for a given message body

Home Page: https://twiliodeved.github.io/message-segment-calculator/

License: MIT License

JavaScript 33.49% TypeScript 66.51%

sms encoding twilio-sms

message-segment-calculator's Introduction

SMS Segment Calculator

This repo contains a package for an SMS segments calculator. The package is released as a nodeJS package as well as a browser script. A browser demo for this package is available at https://twiliodeved.github.io/message-segment-calculator/

Usage

nodeJS

The package can be installed using:

npm install --save sms-segments-calculator

Sample usage:

const { SegmentedMessage } = require('sms-segments-calculator');

const segmentedMessage = new SegmentedMessage('Hello World');

console.log(segmentedMessage.encodingName); // "GSM-7"
console.log(segmentedMessage.segmentsCount); // 1

Browser

You can add the library to your page using the CDN file:

<script src="https://cdn.jsdelivr.net/gh/TwilioDevEd/message-segment-calculator/docs/scripts/segmentsCalculator.js" integrity="sha256-wXuHVlXNhEWNzRKozzB87Qyi9/3p6LKskjDXFHIMInw=" crossorigin="anonymous"></script>

Alternatively, you can use the file segmentsCalculator.js provided in docs/scripts/ and reference it from your page:

<script type="text/javascript" src="scripts/segmentsCalculator.js"></script>

An example of usage can be found in docs/index.html

Documentation

`SegmentedMessage` class

This is the main class exposed by the package.

`constructor(message, encoding)`

Arguments:

message: Body of the SMS
encoding: Optional. One of GSM-7, UCS-2, or auto. Default value: auto

`encodingName`

Returns the name of the calculated encoding for the message: GSM-7 or UCS-2

`totalSize`

Total size of the message in bits (including User Data Header if present)

`messageSize`

Total size of the message in bits (excluding User Data Header if present)
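
The two sizes differ only by the User Data Header. As a minimal sketch of the arithmetic, assuming the usual SMS figures (1120 bits of payload per segment, a 48-bit header on each part of a concatenated message, 7 bits per GSM-7 septet and 16 bits per UCS-2 code unit), the relationship could be written as below. `estimateSizes` is a hypothetical helper, not part of the package, and it ignores the rule that a multi-unit character cannot be split across segments.

```javascript
// Hypothetical helper for illustration; not the library's implementation.
const SEGMENT_BITS = 1120; // 140 bytes of payload per SMS
const UDH_BITS = 48;       // 6-byte User Data Header on each concatenated part

function estimateSizes(unitCount, bitsPerUnit) {
  const messageSize = unitCount * bitsPerUnit; // excludes headers
  if (messageSize <= SEGMENT_BITS) {
    // A single segment carries no User Data Header.
    return { segments: 1, messageSize, totalSize: messageSize };
  }
  // Each concatenated segment loses 48 bits to the header.
  const unitsPerSegment = Math.floor((SEGMENT_BITS - UDH_BITS) / bitsPerUnit);
  const segments = Math.ceil(unitCount / unitsPerSegment);
  return { segments, messageSize, totalSize: messageSize + segments * UDH_BITS };
}

console.log(estimateSizes(160, 7));  // { segments: 1, messageSize: 1120, totalSize: 1120 }
console.log(estimateSizes(161, 7));  // { segments: 2, messageSize: 1127, totalSize: 1223 }
console.log(estimateSizes(71, 16));  // { segments: 2, messageSize: 1136, totalSize: 1232 }
```
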

`getNonGsmCharacters()`

Returns an array with the non GSM-7 characters in the body. It can be used to replace those characters and reduce the number of segments.
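
To illustrate the idea, here is a self-contained sketch that checks each character against the GSM 03.38 basic and extension tables. `findNonGsmCharacters` is a hypothetical name used only for this example; the package's own method may differ in how it groups characters.

```javascript
// GSM 03.38 basic character set (the escape code itself is omitted).
const GSM7_BASIC =
  '@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ !"#¤%&\'()*+,-./0123456789:;<=>?' +
  '¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§¿abcdefghijklmnopqrstuvwxyzäöñüà';
// Extension table: each of these costs two septets (escape + character).
const GSM7_EXTENSION = '\f^{}\\[~]|€';
const GSM7_CHARS = new Set(GSM7_BASIC + GSM7_EXTENSION);

// Hypothetical helper: returns every character that forces UCS-2.
function findNonGsmCharacters(body) {
  // Spread iterates by code point, so an emoji is one entry, not two surrogates.
  return [...body].filter((ch) => !GSM7_CHARS.has(ch));
}

console.log(findNonGsmCharacters('Hello World'));        // []
console.log(findNonGsmCharacters('\u201CHi\u201D \u{1F44B}')); // [ '“', '”', '👋' ]
```
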

Try the library

If you want to test the library you can use the script provided in playground/index.js. Install the dependencies (npm install) and then run:

$ node playground/index.js "👋 Hello World 🌍"

Contributing

This code is open source and welcomes contributions. All contributions are subject to our Code of Conduct.

The source code for the library is all contained in the src folder. Before submitting a PR:

Run the linter using the npm run lint command and make sure there are no linter errors
Compile the code using the npm run build command and make sure there are no errors
Execute the tests using npm test and make sure all tests pass
Transpile the code using npm run webpack and test the web page in docs/index.html

License

MIT

Disclaimer

No warranty expressed or implied. Software is as is.

message-segment-calculator's Issues

Can't force encoding to "UCS-2"

Hello,

The following code returns a GSM-7 encoded message:

const message = new SegmentedMessage("abc", 'UCS-2')

Would it be possible to add the possibility to force encoding?

Example of the problem encountered: the following two expressions return the same value:

new SegmentedMessage(']]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]', 'UCS-2').segmentsCount
new SegmentedMessage(']]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]', 'GSM-7').segmentsCount

Thank you

Add method to return which character(s) require UCS-2 encoding

This method can be used to swap characters and keep the number of segments limited.

A possible implementation of this method can be found here: https://github.com/Melusyn/text-message-utils/blob/0eda75e5cf1e9f0d29598a40950f0084fd6efaac/lib/models.js#L309

Incorrect segment counting

I spotted a similar issue to this one: #18.

When providing a message like this one:
]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]

Segment calculator shows it's 3 segments.

It's 153 scalars counted as 306 characters. That's 2142 bits for characters + 96 bits for 2 headers. That should add up to 2238 bits so still within 2 segments.

But when adding the last character of this string another header is added and the message takes up 2286 bits: 3 segments.

There are other calculators that show this as 2 segments (https://sakari.io/sms-length-calculator/)

But when sending the message on Twilio it is sent as 3 segments. So I have been wondering which is correct and what is causing this behaviour.
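
One possible explanation for the three-segment result can be sketched as follows. It assumes that an extension-table character such as `]` costs two septets (escape plus character) and that this pair cannot be split across segments. A concatenated segment holds 153 septets, an odd number, so only 76 such characters fit and one septet is left unused. With 153 brackets this gives 76 + 76 + 1, that is 3 segments, and 2142 + 3 × 48 = 2286 bits, which matches the figure quoted above. `packSegments` is a hypothetical function, not the library's code.

```javascript
const SINGLE_SEGMENT_SEPTETS = 160;      // 1120 / 7, no header
const CONCAT_SEGMENT_SEPTETS = 153;      // floor((1120 - 48) / 7)

// septetCosts: the septet cost (1 or 2) of each character, in order.
function packSegments(septetCosts) {
  const total = septetCosts.reduce((sum, n) => sum + n, 0);
  if (total <= SINGLE_SEGMENT_SEPTETS) return 1;
  let segments = 1;
  let used = 0;
  for (const cost of septetCosts) {
    if (used + cost > CONCAT_SEGMENT_SEPTETS) {
      segments += 1; // the whole character moves to a new segment
      used = 0;
    }
    used += cost;
  }
  return segments;
}

const brackets = (n) => new Array(n).fill(2); // every "]" costs 2 septets
console.log(packSegments(brackets(152))); // 2
console.log(packSegments(brackets(153))); // 3
```
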

Add Twilio smart encoding

Add an option to use Twilio smart encoding to encode message body
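
As a rough sketch of what such an option might do, the snippet below swaps a few common Unicode characters for GSM-7 look-alikes before the segments are counted. The replacement map is an illustrative subset chosen for this example and is not Twilio's actual table; `smartEncode` is a hypothetical name.

```javascript
// Illustrative subset only; NOT Twilio's real smart-encoding table.
const REPLACEMENTS = new Map([
  ['\u2018', "'"],   // left single quotation mark
  ['\u2019', "'"],   // right single quotation mark
  ['\u201C', '"'],   // left double quotation mark
  ['\u201D', '"'],   // right double quotation mark
  ['\u2013', '-'],   // en dash
  ['\u2014', '-'],   // em dash
  ['\u2026', '...'], // horizontal ellipsis
  ['\u00A0', ' '],   // no-break space
]);

// Hypothetical helper: replace mapped characters, keep everything else.
function smartEncode(body) {
  return [...body]
    .map((ch) => (REPLACEMENTS.has(ch) ? REPLACEMENTS.get(ch) : ch))
    .join('');
}

console.log(smartEncode('\u201CHi\u201D \u2014 it\u2019s\u2026')); // "Hi" - it's...
```
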

Dockerfile/Ability to run this in a pipeline

I work for a non-profit healthcare organization and we are using Twilio as our SMS provider. In general, our business users do not understand the nuances of SMS segments and we often send messages in languages other than English, which can make it difficult to balance the queuing/rate at which we send out messages with any given message that we want to send.

What I would like:
When the business requests a new SMS message sent out, I want them to submit the message as code into our git repository, and then use that to run the message through this tool to let us know how many segments it has (and therefore how quickly we can send messages out given our phone number/subscription with Twilio and whatever population we are sending the message to)

A docker image would work really well for my use case, and I think would help provide opportunities for groups that do not have solid node experience. (ie #2 ) I have a fork where I have created a Dockerfile and Makefile that I used to run the playground/index.js file with the message passed in via the cmd line which worked very well for me. I am happy to submit a PR, let me know your thoughts on this.

I had to move the devDependencies into the Dependencies property of the package.json file to get it to run and I am not sure of the effects of that. Also what is the difference between playground and dist? I feel like ideal state the Makefile/Dockerfile should be able to build/run the full website/app in addition to the cmd line/playground/index.js invocation, but I could not get any response when I ran node dist/index.js and did not spend much time digging into it any further because the playground/index.js script does what I need atm.

Create a library for the segment calculator

The proposal is to create a JS library for the segment calculator functionality. This library can then be reused in scripts / serverless / webapp to provide information about encoding of a message. The idea is then to distribute the script through npm / CDN.

In addition to that, it would be nice to have the lib in different languages as well.

@HugoCrd has already a very good implementation of that at https://www.npmjs.com/package/@setkeeper/text-message-utils

License for this Code?

This is a super useful tool when sending messages with Twilio, thanks for open sourcing it!

I was wondering if Twilio would consider adding a license to this code, as unlicensed code is generally still considered proprietary. We have a similar segment counting tool in our service that uses Twilio, and it would be awesome to be able to include this authoritative source in some way in our UI.

It looks like most of the other projects under the DevEd organization are MIT licensed, but just wanted to check.

Missing 's' in sample usage

const { SegmentedMessage } = require("sms-segments-calculator");

Demo page disagrees with library on message segment count.

Possibly the library is out of date at https://twiliodeved.github.io/message-segment-calculator/ , it seems like characters that take 4 blocks aren't getting properly calculated to figure out the number of message segments.

Example message: Testing Hand 🤟🏻🤟🏻🤟🏻🤟🏻🤟🏻🤟🏻🤟🏻🤟🏻🤟🏻🤟🏻🤟🏻🤟🏻🤟🏻🤟🏻 🤟🏻

I've pulled down and tested the playground and the demo webpage and with current commit of c74a6b2 everything seems to compute as expected.

Does not handle combining characters correctly

Awesome tool. Managed to find one issue:

If you enter a character built with combining characters, the tool will correctly recognize that the message must be UCS2, and correctly count the number of unicode scalars, but will not give a correct message size or segment count (if you happen to be on the border of a new segment).

Example string: "é́́aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"

This has an "e" with 3 accents added to it. (it does not display properly in GitHub, but it will if you paste it into the tool)
There are 70 characters, and 73 unicode scalars. If you actually send this message via Twilio (or any of its competitors I've tested), it will take 2 segments.

However, this tool will claim it will only take 1 segment, and claim the message size is only 1120 bits. It will also only show the code point for the plain letter "e" in the "segments" section below the entry field.

I think the tool should be updated to handle combining characters the same way Twilio actually does in practice. The triple-accented "e" is a bit of an absurd example, but single accent combined characters are a plausible use case, even if there are usually single code point alternatives. (ex. "é" (two code points) vs "é" (one code point))
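
The mismatch described above can be reproduced with a short sketch. It assumes the usual UCS-2 limits (70 code units in a single segment, 67 per concatenated segment) and rebuilds the example string from an "e", three combining acute accents (U+0301) and 69 "a" characters. Counting by grapheme gives 70 units and 1120 bits, while counting by code unit gives 73 units and two segments.

```javascript
// "e" + 3 combining acute accents + 69 "a": 70 visible characters.
const body = 'e' + '\u0301'.repeat(3) + 'a'.repeat(69);

const segmenter = new Intl.Segmenter('en', { granularity: 'grapheme' });
const graphemes = [...segmenter.segment(body)].length; // user-perceived characters
const scalars = [...body].length;                      // Unicode scalar values
const codeUnits = body.length;                         // UTF-16 code units

const UCS2_SINGLE = 70;
const UCS2_CONCAT = 67;
const segmentsFor = (units) =>
  units <= UCS2_SINGLE ? 1 : Math.ceil(units / UCS2_CONCAT);

console.log(graphemes, scalars, codeUnits);               // 70 73 73
console.log(graphemes * 16, segmentsFor(graphemes));      // 1120 1
console.log(codeUnits * 16, segmentsFor(codeUnits));      // 1168 2
```
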

Support Segment Calculation for MMS

Is it possible to support MMS segments calculation?
For example: `constructor(message, encoding, type)`, where type = "sms" | "mms"

Add checkbox for US Cellular special case

US Cellular has special segmentation rules when sending concatenated messages (150 characters instead of 160 per segment).

So ideally there should be a "US Cellular" checkbox on the calculator that applies these special rules.

Calculator says encoding is UCS-2, but it looks like UTF-16?

Great work! This is the best SMS segment calculator I have found on the web!

A small point:
When I tried a few emojis like 😋 (U+1F60B) I found they seemed to encode correctly, but it looks like UTF-16.
My understanding was that UCS-2 is a fixed-width 16-bit encoding, so it can't encode emojis.
https://stackoverflow.com/questions/54771549/emojis-not-correctly-encode-into-hexadecimal#answer-54771717
http://www.differencebetween.net/technology/software-technology/difference-between-ucs-2-and-utf-16/

The calculator says it's using UCS-2, but is it actually using UTF-16?

(happy to submit pull requests, just asking for clarification before taking any actions)
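
The observation is easy to check in JavaScript, whose strings are sequences of UTF-16 code units. The sketch below prints the code units for U+1F60B; the emoji appears as a surrogate pair, which strict UCS-2 cannot represent, and it occupies two 16-bit units (32 bits) in the message.

```javascript
const emoji = '\u{1F60B}'; // 😋

// Collect the UTF-16 code units as uppercase hex strings.
const units = [];
for (let i = 0; i < emoji.length; i++) {
  units.push(emoji.charCodeAt(i).toString(16).toUpperCase());
}

console.log(emoji.codePointAt(0).toString(16).toUpperCase()); // 1F60B
console.log(units);                                           // [ 'D83D', 'DE0B' ]
console.log(units.length * 16);                               // 32
```
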

UCS-2 highlighting isn't reflecting these characters correctly

I̳F̳ Y̳O̳U̳ P̳R̳E̳F̳E̳R̳

The above text requires UCS-2, but the Segments section of the tool does not highlight the characters that cause UCS-2 to be used instead of GSM-7.

The tool does correctly report that the encoding is UCS-2.

Incorrect segment counting

Hello! Just wanted to let you know that the segment counting doesn't seem to be done correctly. I was building my own implementation of a segment counter and stumbled upon an inaccuracy.

Take this ridiculous string for example:
this is a ]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]😘

The tool says this is 3 segments.

However, my understanding is that it should be 2 segments. The length of the string is 86, with the emoji triggering UCS-2 encoding. However once switched to UCS-2, the special GSM-7 bracket should now only count as one character with a maximum per segment of 67 characters.

I believe the tool may be double counting the special GSM-7 counting, then not adjusting once UCS-2 is triggered.

I sent that string to Twilio and it does indeed count as two segments, not three.

Anyways, not terribly urgent, just figured I'd let you all know. Thanks for the tool! Really helped me wrap my head around the topic.
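
The suspected double counting can be sketched with plain arithmetic. The example assumes the string contains 74 brackets, which gives the stated length of 86 UTF-16 code units, and uses the usual UCS-2 limits of 70 units for a single segment and 67 per concatenated segment. Counting each bracket once gives 2 segments; counting it twice, as GSM-7 would, gives 3.

```javascript
// Assumption: 74 brackets, so that body.length is 86 as stated in the report.
const body = 'this is a ' + ']'.repeat(74) + '\u{1F618}'; // ends with 😘
// Simplified GSM-7 extension table (form feed omitted).
const EXTENSION = new Set('^{}\\[~]|€');

// Correct UCS-2 count: every UTF-16 code unit counts once.
const ucs2Units = body.length;

// Suspected bug: extension characters still counted as two units.
const doubleCounted = [...body].reduce(
  (sum, ch) => sum + (EXTENSION.has(ch) ? 2 : ch.length),
  0
);

const segmentsFor = (units) => (units <= 70 ? 1 : Math.ceil(units / 67));

console.log(ucs2Units, segmentsFor(ucs2Units));         // 86 2
console.log(doubleCounted, segmentsFor(doubleCounted)); // 160 3
```
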

SQL version of this calculator

This calculator is extremely helpful. Is there by chance a version of this in SQL? Would it be more work than it was worth to try to recreate this in SQL?

Being able to run segment counts on messages in our database would be extremely helpful.