How to Use Bitwise Operations in JavaScript to Convert a String to Uppercase


Ever wondered how the toUpperCase method is implemented at a lower level? Could you implement it yourself without using any libraries?

Let’s dive deeper today and implement the toUpperCase method ourselves.

Imagine you have the character a. How do you convert it to A?

Understand ASCII

The first thing we need to understand is that the character a you see on your screen is not stored as the English letter a. Computers only work with numbers, so the character a is stored in memory as a number. An encoding standard defines which number represents which character. ASCII is one such standard, and an ASCII code is the numerical representation of a character.

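In JavaScript, we can use the method String.prototype.charCodeAt to convert a character to its code, and String.fromCharCode to convert the code back to a character. Strictly speaking, charCodeAt returns a UTF-16 code unit, but for the characters in the ASCII range (0 to 127) this is the same number as the ASCII code.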

For example:

'a'.charCodeAt(0);

The above statement gives 97, which is the ASCII code for this character. Now we can take a look at the ASCII code chart:

Lowercase Decimal Uppercase Decimal
a 97 A 65
b 98 B 66
c 99 C 67

Do you see the pattern here? In ASCII, the code of each uppercase letter is exactly 32 less than the code of its lowercase counterpart. Therefore, to convert a character to uppercase, can we not simply convert it to its ASCII code, subtract 32, and convert it back to a character?
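The table only lists three letters, so here is a minimal sketch that checks the same offset for the whole range a to z. The built-in toUpperCase is used purely as a reference for comparison, and the variable names are only illustrative.

```javascript
// Collect (lowercase code - uppercase code) for every letter from a to z.
const offsets = [];
for (let code = 'a'.charCodeAt(0); code <= 'z'.charCodeAt(0); code++) {
  const lower = String.fromCharCode(code);
  const upper = lower.toUpperCase(); // built-in, used only as the reference
  offsets.push(lower.charCodeAt(0) - upper.charCodeAt(0));
}

// True when the difference is 32 for all 26 letters.
const allThirtyTwo = offsets.every((d) => d === 32);
console.log(offsets.length, allThirtyTwo); // 26 true
```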

Let’s try this:

String.fromCharCode('a'.charCodeAt(0) - 32);

Yes, it works! This gives us the uppercase character A, and it works for any other lowercase letter too. Now let’s wrap this in a function:

var myToUpperCase = function(str) {
  var ret = "";
  // Declare i with var so it does not leak as an implicit global.
  for (var i = 0; i < str.length; i++) {
    ret += String.fromCharCode(str.charCodeAt(i) - 32);
  }
  return ret;
};

Does this work?


Mmmm… It kind of does, but not really. For a pure lowercase input such as chenyumin it works well. For inputs that mix in spaces, uppercase letters, and symbols, it does not.

What Went Wrong?

The problem with -32 is that it shifts every character blindly, so it does not work for spaces, uppercase letters, numbers, or other symbols. But why? It still comes down to the ASCII codes. -32 fails for spaces because the ASCII code for a space is already 32, and 32 minus 32 is 0, which is the unprintable NUL character \u0000. Take the character ‘C’ as another example. Because it is already uppercase (67), subtracting 32 turns it into ‘#’ (35).
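These failures are easy to reproduce. The sketch below restates the subtract-32 approach under a hypothetical name, naiveToUpperCase, and runs it on the cases just described. JSON.stringify is used only to make the invisible NUL character show up in the output.

```javascript
// Naive approach: subtract 32 from every character, with no checks at all.
// The name naiveToUpperCase is hypothetical, chosen for this illustration.
function naiveToUpperCase(str) {
  let ret = '';
  for (let i = 0; i < str.length; i++) {
    ret += String.fromCharCode(str.charCodeAt(i) - 32);
  }
  return ret;
}

console.log(naiveToUpperCase('chenyumin'));           // CHENYUMIN
console.log(JSON.stringify(naiveToUpperCase('a b'))); // "A\u0000B" (space became NUL)
console.log(naiveToUpperCase('C'));                   // # (67 - 32 = 35)
```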

To stop letters that are already uppercase from being mangled, we’ll have to take a look at the binary representation.

Lowercase Binary Uppercase Binary
a 1100001 A 1000001
b 1100010 B 1000010
c 1100011 C 1000011

See the pattern here? The only difference is the 6th bit, counting from the right, which is the bit worth 32. For letters, this bit decides whether the letter is lowercase (1) or uppercase (0). So we can simply set the 6th bit to 0.
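To see this directly in JavaScript, here is a small sketch that prints the binary form of both codes with Number.prototype.toString(2) and uses XOR to isolate the bits that differ. The variable names are only illustrative.

```javascript
const lowerCode = 'a'.charCodeAt(0); // 97
const upperCode = 'A'.charCodeAt(0); // 65

console.log(lowerCode.toString(2)); // 1100001
console.log(upperCode.toString(2)); // 1000001

// XOR keeps only the bits where the two numbers differ.
const differingBits = lowerCode ^ upperCode;
console.log(differingBits, differingBits.toString(2)); // 32 '100000'
```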

To do this, we can use a bitwise operation. If we AND the lowercase letter’s code with 11011111, the 6th bit is set to 0 while the rest of the bits are retained.

Therefore, the solution becomes & 0b11011111. Modern JavaScript (ES2015 and later) accepts binary literals with the 0b prefix, so the mask can be written as is. Its decimal value is 223, which gives the equivalent form & 223.
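If you want to double-check where 223 comes from, this sketch derives it three ways: from a binary literal, from parsing the bit string, and by inverting the single case bit. The constant names CASE_BIT and MASK are assumptions made for this example.

```javascript
const CASE_BIT = 0b00100000; // the 6th bit, worth 32
const MASK = 0b11011111;     // every low bit set except the 6th

console.log(MASK);                     // 223
console.log(parseInt('11011111', 2));  // 223
// Invert the case bit, then keep only the lowest 8 bits.
console.log(~CASE_BIT & 0xff);         // 223
```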

Let’s give it a try:

String.fromCharCode('a'.charCodeAt(0) & 223); // => 'A'
String.fromCharCode('_'.charCodeAt(0) & 223); // => '_'
String.fromCharCode('P'.charCodeAt(0) & 223); // => 'P'

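Yes, this works perfectly for letters! Lowercase letters become uppercase, and uppercase letters are left alone. One caveat: the mask clears the 6th bit of any character, so it should only be applied to the letters a to z. A space (32) or a digit (48 to 57) also has that bit set and would be changed into a control character. With that in mind, here we have it: our simple custom toUpperCase building block that converts lowercase to uppercase!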
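Putting the pieces together for whole strings, one possible version guards the mask with a range check so that only the codes 97 to 122 (a to z) are touched. This is a minimal sketch for ASCII input, not how any JavaScript engine implements toUpperCase, and the name myToUpperCaseSafe is hypothetical.

```javascript
// Uppercase ASCII letters with a bit mask; leave every other character as is.
// The name myToUpperCaseSafe is hypothetical, chosen for this illustration.
function myToUpperCaseSafe(str) {
  let ret = '';
  for (let i = 0; i < str.length; i++) {
    const code = str.charCodeAt(i);
    const isLowercaseLetter = code >= 97 && code <= 122; // 'a'..'z'
    // Clear the 6th bit (32) only for lowercase letters.
    ret += String.fromCharCode(isLowercaseLetter ? code & 223 : code);
  }
  return ret;
}

console.log(myToUpperCaseSafe('Hello, World 42!')); // HELLO, WORLD 42!

// Without the guard, a space is turned into the NUL character:
console.log(' '.charCodeAt(0) & 223); // 0
```

The range check costs one comparison per character, and in exchange spaces, digits, and punctuation pass through unchanged.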

Chen Yumin

Hi, my name is Chen Yumin.
I am the author of the stuff you're reading right now.
