Implementing osu!mania from scratch


Old osu!mania screenshot

I used to play a lot of games. I was never the competitive type, so most of my time went into single-player games or casual LoL matches. But the genre that intrigued me most was rhythm games.

Rhythm games are games where you follow the rhythm of a song, sometimes by mimicking the instruments (Guitar Hero, DJ Hero, Rock Band) and sometimes in rather arbitrary ways (Rhythm Heaven, Sound Voltex). They hit the sweet spot where you don’t have to struggle to learn an instrument, yet can still follow the song.

osu! was one of my favorite games in high school, mostly because I used to spend a lot of time watching anime back then. It was fast, rewarding and still casual enough that you could play it just for the sake of listening to your favorite song. It’s been quite a while since I stopped playing, but I still remember the glory of reaching country rank #100 and the frustration of grinding a song until I FC’d it.

For old times’ sake, I’ll implement osu!mania for my website.

.osz format

osu! songs (or maps in osu terms) are stored in .osz files. To test my implementation, I’ll use this random song: https://osu.ppy.sh/beatmapsets/2134137#mania/4560851

It’s a 7K osu!mania song, which means it’s played like a piano (like DDR), and there are 7 keys total. Default keybindings for them are S, D, F, Space, J, K, L.

A quick Google search shows that .osz files are just renamed zip files. Here are the contents after unzipping:

-rw-rw-r--@  1 zet  staff    86K May 31 10:30 Akiri Feat. InabaYap - Tonight We Fly (Blocko) [Extreme].osu
-rw-rw-r--@  1 zet  staff    96K May 31 10:31 Akiri Feat. InabaYap - Tonight We Fly (Blocko) [Soaring].osu
-rw-rw-r--@  1 zet  staff    62K May 31 10:29 Akiri feat. InabaYap - Tonight We Fly (Blocko) [Another].osu
-rw-rw-r--@  1 zet  staff    13K May 31 10:29 Akiri feat. InabaYap - Tonight We Fly (Blocko) [Easy].osu
-rw-rw-r--@  1 zet  staff    73K May 31 10:30 Akiri feat. InabaYap - Tonight We Fly (Blocko) [Extra].osu
-rw-rw-r--@  1 zet  staff    37K May 31 10:29 Akiri feat. InabaYap - Tonight We Fly (Blocko) [Hard].osu
-rw-rw-r--@  1 zet  staff    50K May 31 10:29 Akiri feat. InabaYap - Tonight We Fly (Blocko) [Insane].osu
-rw-rw-r--@  1 zet  staff    22K May 31 10:29 Akiri feat. InabaYap - Tonight We Fly (Blocko) [Normal].osu
-rw-rw-r--@  1 zet  staff   3.5M Feb 12 00:18 audio.mp3
-rw-rw-r--@  1 zet  staff   3.0M Feb 12 00:18 bg.jpg
-rw-rw-r--@  1 zet  staff    41K Jul 19  2023 soft-hitnormal.wav

Files ending with .osu are level files, bg.jpg is the background for the level, audio.mp3 is the music file and soft-hitnormal.wav is the sound played when you hit a note in the game.

It turns out that .osu is not a distinct file type either; it is a plain text file in disguise. When I open Akiri feat. InabaYap - Tonight We Fly (Blocko) [Easy].osu, this is what’s inside:

osu file format v14

[General]
AudioFilename: audio.mp3
AudioLeadIn: 0
PreviewTime: 73050
Countdown: 0
SampleSet: Soft
StackLeniency: 0.7
Mode: 3
LetterboxInBreaks: 0
SpecialStyle: 0
WidescreenStoryboard: 1

[Editor]
DistanceSpacing: 1.5
BeatDivisor: 1
GridSize: 4
TimelineZoom: 0.6999998

[Metadata]
Title:Tonight We Fly
TitleUnicode:Tonight We Fly
Artist:Akiri feat. InabaYap
ArtistUnicode:あきり feat. InabaYap
Creator:Blocko
Version:Easy
Source:osu!mania 7K World Cup 2024
Tags:MWC 2024 World Cup MWC2024 MWC7K2024 finals featured artist fa mappers' guild mpg mg melodic vocals electronic english drum and bass dnb d'n'b high-tech high tech technical hybrid trap hardcore neurofunk underjoy _underjoy collab
BeatmapID:4560766
BeatmapSetID:2134137

[Difficulty]
HPDrainRate:6
CircleSize:7
OverallDifficulty:6
ApproachRate:6.5
SliderMultiplier:1.4
SliderTickRate:1

[Events]
//Background and Video events
0,0,"bg.jpg",0,0
//Break Periods
//Storyboard Layer 0 (Background)
//Storyboard Layer 1 (Fail)
//Storyboard Layer 2 (Pass)
//Storyboard Layer 3 (Foreground)
//Storyboard Layer 4 (Overlay)
//Storyboard Sound Samples

[TimingPoints]
3012,372.670807453416,4,2,1,100,1,0
74564,-100,4,2,1,100,0,1
84999,-100,4,2,1,100,0,0
86490,-100,4,2,1,100,0,1
108850,-100,4,2,1,100,0,0
110341,-100,4,2,1,100,0,1
122266,-100,4,2,1,100,0,0


[HitObjects]
329,192,26117,5,0,0:0:0:0:
182,192,26490,128,0,26862:0:0:0:0:
256,192,26862,1,0,0:0:0:0:
36,192,26862,128,0,27980:0:0:0:0:
475,192,27235,1,0,0:0:0:0:
475,192,27980,1,0,0:0:0:0:
109,192,28167,128,0,29471:0:0:0:0:
256,192,28353,1,0,0:0:0:0:
402,192,28726,1,0,0:0:0:0:
402,192,29471,1,0,0:0:0:0:
182,192,29657,128,0,30589:0:0:0:0:
256,192,29844,1,0,0:0:0:0:
329,192,30216,1,0,0:0:0:0:
...

Nothing too surprising: information about the song, the location of the music file, some metadata, and game settings like HP drain (i.e. how easy it is to fail the level, on a scale of 0 to 10).

And there are events, timing points and a lot of hit objects.
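To make the layout concrete, here is a minimal sketch of splitting the text into its sections. The function name parseOsuSections and the shape of its return value are my own assumptions, not something osu! defines; it only treats lines that look like Key: Value as pairs and keeps every other line raw.

```javascript
// Hypothetical helper: splits .osu text into { SectionName: { pairs, lines } }.
function parseOsuSections(text) {
    const sections = {};
    let current = null;

    for (const rawLine of text.split("\n")) {
        const line = rawLine.trim();
        // Skip blank lines and // comments (used inside [Events])
        if (line.length === 0 || line.startsWith("//")) continue;

        const header = line.match(/^\[(.+)\]$/);
        if (header) {
            current = { pairs: {}, lines: [] };
            sections[header[1]] = current;
            continue;
        }

        // Lines before the first section, e.g. "osu file format v14"
        if (current === null) continue;

        current.lines.push(line);
        // Only "Key: Value" lines become pairs; comma-separated rows do not match
        const pair = line.match(/^(\w+)\s*:\s*(.*)$/);
        if (pair) current.pairs[pair[1]] = pair[2];
    }
    return sections;
}

const sample = [
    "osu file format v14",
    "",
    "[General]",
    "AudioFilename: audio.mp3",
    "Mode: 3",
    "",
    "[Metadata]",
    "Title:Tonight We Fly",
    "",
    "[HitObjects]",
    "329,192,26117,5,0,0:0:0:0:",
].join("\n");

const sections = parseOsuSections(sample);
console.log(sections.General.pairs.AudioFilename);
console.log(sections.HitObjects.lines.length);
```

Keeping the raw lines around means the comma-separated sections ([TimingPoints], [HitObjects]) can be parsed later by dedicated functions.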

I guess events are breakpoints in the song where the background, theme or sound of the level changes. Some maps use them as an aesthetic feature, but this one only has one background from start to finish.

I don’t know what timing points are. I’m sure someone experienced in mapping songs would know about them, but I’m quite clueless.

Hit objects are the crucial part: they are the notes we’ll try to hit while playing the game.

The objective is to figure out the pace of the song, figure out which HitObject corresponds to which of the 7 keys, and place them in a view.

Timing Points

Reading through the osu! docs, I found this page: https://osu.ppy.sh/wiki/en/Client/Beatmap_editor/Timing

Here it says:

In mapping, a timing point, colloquially called an offset, is a way to apply common settings, such as timing, slider velocity multipliers, or hitsounds and their respective volumes, to a specific section of a beatmap.

OK, this is what I have to figure out in order to understand when to place notes.

The cool part about osu! is that it is open source, so to understand how it works I can just go to https://github.com/ppy/osu

There I found the part of the code that handles timing points (or used to handle them, since it has Legacy in its name):

private void handleTimingPoint(string line)
{
    string[] split = line.Split(',');

    double time = getOffsetTime(Parsing.ParseDouble(split[0].Trim()));

    // beatLength is allowed to be NaN to handle an edge case in which some beatmaps use NaN slider velocity to disable slider tick generation (see LegacyDifficultyControlPoint).
    double beatLength = Parsing.ParseDouble(split[1].Trim(), allowNaN: true);

    // If beatLength is NaN, speedMultiplier should still be 1 because all comparisons against NaN are false.
    double speedMultiplier = beatLength < 0 ? 100.0 / -beatLength : 1;

    TimeSignature timeSignature = TimeSignature.SimpleQuadruple;
    if (split.Length >= 3)
        timeSignature = split[2][0] == '0' ? TimeSignature.SimpleQuadruple : new TimeSignature(Parsing.ParseInt(split[2]));

    LegacySampleBank sampleSet = defaultSampleBank;
    if (split.Length >= 4)
        sampleSet = (LegacySampleBank)Parsing.ParseInt(split[3]);

    int customSampleBank = 0;
    if (split.Length >= 5)
        customSampleBank = Parsing.ParseInt(split[4]);

    int sampleVolume = defaultSampleVolume;
    if (split.Length >= 6)
        sampleVolume = Parsing.ParseInt(split[5]);

    bool timingChange = true;
    if (split.Length >= 7)
        timingChange = split[6][0] == '1';
    
    bool kiaiMode = false;
    bool omitFirstBarSignature = false;

    if (split.Length >= 8)
    {
        LegacyEffectFlags effectFlags = (LegacyEffectFlags)Parsing.ParseInt(split[7]);
        kiaiMode = effectFlags.HasFlagFast(LegacyEffectFlags.Kiai);
        omitFirstBarSignature = effectFlags.HasFlagFast(LegacyEffectFlags.OmitFirstBarLine);
    }
    
    // (continued below)

There is still some stuff to unpack, but roughly, each line is the following tuple: (time,beatLength,timeSignature,sampleSet,customSampleBank,sampleVolume,timingChange,effectFlags)

For example, for the first TimingPoint, the relevant values would be:

# 3012,372.670807453416,4,2,1,100,1,0
- time: 3012
- beatLength: 372.670807453416
- speedMultiplier: 1
- timeSignature: TimeSignature(4)
- sampleSet: 2
- customSampleBank: 1
- sampleVolume: 100
- timingChange: true
- effectFlags: 0
- kiaiMode: false
- omitFirstBarSignature: false

For this song, only the first timing point has timingChange set to true. For the rest of the timing points it is false.

Some timing points have -100 for beatLength, which makes speedMultiplier 100.0 / -beatLength = 100 / 100, which is again 1.

Some timing points have 1 for effectFlags. Searching for LegacyEffectFlags, I found this enum, which means a value of 1 enables Kiai mode (whatever that is):

namespace osu.Game.Beatmaps.Legacy
{
    [Flags]
    internal enum LegacyEffectFlags
    {
        None = 0,
        Kiai = 1,
        OmitFirstBarLine = 8
    }
}

Other than that, most of the timing points have the same values at different times.

It’s safe to say that the first timing point sets up the song, and the rest toggle Kiai mode on and off.
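As a sanity check on the reading so far, here is a rough JavaScript port of the fields I care about. This is only a sketch: parseTimingPoint and its return shape are my own names, and it skips the sample-related fields and the NaN edge case that the C# decoder handles.

```javascript
// Flag values taken from the LegacyEffectFlags enum shown above
const KIAI = 1;
const OMIT_FIRST_BAR_LINE = 8;

// Hypothetical helper: parses one [TimingPoints] line into an object.
function parseTimingPoint(line) {
    const split = line.split(",").map((part) => part.trim());

    const time = parseFloat(split[0]);
    const beatLength = parseFloat(split[1]);
    // Negative beatLength encodes a speed multiplier, as in handleTimingPoint
    const speedMultiplier = beatLength < 0 ? 100 / -beatLength : 1;

    // Same defaults as the decoder when trailing fields are missing
    const timingChange = split.length >= 7 ? split[6][0] === "1" : true;
    const effectFlags = split.length >= 8 ? parseInt(split[7], 10) : 0;

    return {
        time,
        beatLength,
        speedMultiplier,
        timingChange,
        kiaiMode: (effectFlags & KIAI) !== 0,
        omitFirstBarLine: (effectFlags & OMIT_FIRST_BAR_LINE) !== 0,
    };
}

const first = parseTimingPoint("3012,372.670807453416,4,2,1,100,1,0");
const second = parseTimingPoint("74564,-100,4,2,1,100,0,1");
console.log(first.timingChange, first.kiaiMode);
console.log(second.timingChange, second.kiaiMode, second.speedMultiplier);
```

For the two lines from this map, the first is a real timing change without Kiai, and the second only switches Kiai on.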

Let’s continue reading the decoder code:

    string stringSampleSet = sampleSet.ToString().ToLowerInvariant();
    if (stringSampleSet == @"none")
        stringSampleSet = HitSampleInfo.BANK_NORMAL;

    if (timingChange)
    {
        if (double.IsNaN(beatLength))
            throw new InvalidDataException("Beat length cannot be NaN in a timing control point");

        var controlPoint = CreateTimingControlPoint();

        controlPoint.BeatLength = beatLength;
        controlPoint.TimeSignature = timeSignature;
        controlPoint.OmitFirstBarLine = omitFirstBarSignature;

        addControlPoint(time, controlPoint, true);
    }

    int onlineRulesetID = beatmap.BeatmapInfo.Ruleset.OnlineID;

    addControlPoint(time, new DifficultyControlPoint
    {
        GenerateTicks = !double.IsNaN(beatLength),
        SliderVelocity = speedMultiplier,
    }, timingChange);

    var effectPoint = new EffectControlPoint
    {
        KiaiMode = kiaiMode,
    };

    // osu!taiko and osu!mania use effect points rather than difficulty points for scroll speed adjustments.
    if (onlineRulesetID == 1 || onlineRulesetID == 3)
        effectPoint.ScrollSpeed = speedMultiplier;

    addControlPoint(time, effectPoint, timingChange);

    addControlPoint(time, new LegacySampleControlPoint
    {
        SampleBank = stringSampleSet,
        SampleVolume = sampleVolume,
        CustomSampleBank = customSampleBank,
    }, timingChange);
} // End of `handleTimingPoint`

It seems like the rest of the function adds control points depending on timing changes and Kiai mode changes. Let’s try to understand how control points work.

Control points

Reading further, we can find addControlPoint and flushPendingPoints functions:

private readonly List<ControlPoint> pendingControlPoints = new List<ControlPoint>();
private readonly HashSet<Type> pendingControlPointTypes = new HashSet<Type>();
private double pendingControlPointsTime;
private bool hasApproachRate;

private void addControlPoint(double time, ControlPoint point, bool timingChange)
{
    if (time != pendingControlPointsTime)
        flushPendingPoints();

    if (timingChange)
        pendingControlPoints.Insert(0, point);
    else
        pendingControlPoints.Add(point);

    pendingControlPointsTime = time;
}

private void flushPendingPoints()
{
    // Changes from non-timing-points are added to the end of the list (see addControlPoint()) and should override any changes from timing-points (added to the start of the list).
    for (int i = pendingControlPoints.Count - 1; i >= 0; i--)
    {
        var type = pendingControlPoints[i].GetType();
        if (!pendingControlPointTypes.Add(type))
            continue;

        beatmap.ControlPointInfo.Add(pendingControlPointsTime, pendingControlPoints[i]);
    }

    pendingControlPoints.Clear();
    pendingControlPointTypes.Clear();
}

Here, we also see the first modification to the in-memory beatmap representation. Simply put, control points are grouped by time, and when the next control point’s time is different the pending ones are flushed into the beatmap’s control points.

If we go to Beatmap.cs, we’ll see that ControlPointInfo is an instance of the ControlPointInfo class located in ControlPointInfo.cs. Easy stuff.
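To convince myself of the override rule in flushPendingPoints, here is a small JavaScript sketch of the same buffering idea. The names createControlPointBuffer, kind and onFlush are made up for illustration; the C# code keys on the control point’s class, while this sketch keys on a kind string.

```javascript
// Hypothetical re-creation of addControlPoint / flushPendingPoints.
function createControlPointBuffer(onFlush) {
    let pending = [];
    let pendingTime = null;

    function flush() {
        const seenKinds = new Set();
        // Walk backwards: points from non-timing lines sit at the end of the
        // list and win over points of the same kind from timing lines.
        for (let i = pending.length - 1; i >= 0; i--) {
            const point = pending[i];
            if (seenKinds.has(point.kind)) continue;
            seenKinds.add(point.kind);
            onFlush(pendingTime, point);
        }
        pending = [];
    }

    function add(time, point, timingChange) {
        if (time !== pendingTime) flush();
        if (timingChange) pending.unshift(point);
        else pending.push(point);
        pendingTime = time;
    }

    return { add, flush };
}

const flushed = [];
const buffer = createControlPointBuffer((time, point) => flushed.push({ time, point }));

buffer.add(100, { kind: "effect", kiai: false }, true);
buffer.add(100, { kind: "effect", kiai: true }, false); // same time, overrides
buffer.add(100, { kind: "timing", beatLength: 500 }, true);
buffer.add(200, { kind: "effect", kiai: false }, false);
buffer.flush(); // flush whatever is left at the end of the section

console.log(flushed.length);
```

Two effect points share time 100, but only the one from the non-timing line survives the flush.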

There, we can find:

[NotNull]
public EffectControlPoint EffectPointAt(double time) => BinarySearchWithFallback(EffectPoints, time, EffectControlPoint.DEFAULT);

[NotNull]
public TimingControlPoint TimingPointAt(double time) => BinarySearchWithFallback(TimingPoints, time, TimingPoints.Count > 0 ? TimingPoints[0] : TimingControlPoint.DEFAULT);

public bool Add(double time, ControlPoint controlPoint)
{
    if (CheckAlreadyExisting(time, controlPoint))
        return false;

    GroupAt(time, true).Add(controlPoint);
    return true;
}

public ControlPointGroup GroupAt(double time, bool addIfNotExisting = false)
{
    var newGroup = new ControlPointGroup(time);

    int i = groups.BinarySearch(newGroup);

    if (i >= 0)
        return groups[i];

    if (addIfNotExisting)
    {
        newGroup.ItemAdded += GroupItemAdded;
        newGroup.ItemChanged += raiseControlPointsChanged;
        newGroup.ItemRemoved += GroupItemRemoved;

        groups.Insert(~i, newGroup);
        return newGroup;
    }

    return null;
}

We see that ControlPoints are grouped by time and can be accessed through the TimingPointAt and EffectPointAt methods.

We know that a timing change creates a TimingControlPoint, and that TimingPointAt falls back to the first TimingControlPoint by default. Inside TimingControlPoint, we find the BeatLength, TimeSignature, OmitFirstBarLine and BPM properties. BPM is simply 60,000 divided by BeatLength (which is in milliseconds). In our case the BPM is 161, because 60000 / 372.670807453416 ≈ 161.

That’s enough about control points for now. Let’s look into HitObjects.
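Both ideas are small enough to sketch in JavaScript. The function names below are mine, and pointAt is only an approximation of what BinarySearchWithFallback appears to do: return the last point at or before the given time, or the fallback if there is none.

```javascript
// BPM from the beat length in milliseconds
function bpmFromBeatLength(beatLength) {
    return 60000 / beatLength;
}

// Hypothetical lookup: `points` must be sorted by ascending time.
function pointAt(points, time, fallback) {
    let lo = 0;
    let hi = points.length - 1;
    let found = fallback;

    while (lo <= hi) {
        const mid = (lo + hi) >> 1;
        if (points[mid].time <= time) {
            found = points[mid]; // candidate, but a later one may also qualify
            lo = mid + 1;
        } else {
            hi = mid - 1;
        }
    }
    return found;
}

// Kiai toggles taken from the first few [TimingPoints] lines of this map
const effectPoints = [
    { time: 3012, kiai: false },
    { time: 74564, kiai: true },
    { time: 84999, kiai: false },
];
const DEFAULT_EFFECT = { time: 0, kiai: false };

console.log(Math.round(bpmFromBeatLength(372.670807453416)));
console.log(pointAt(effectPoints, 80000, DEFAULT_EFFECT).kiai);
```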

Hit Objects

The LegacyBeatmapDecoder.cs code we looked at earlier has another function called handleHitObject, which should be enough to understand how hit objects are read.

private void handleHitObject(string line)
{
    // If the ruleset wasn't specified, assume the osu!standard ruleset.
    parser ??= new Rulesets.Objects.Legacy.Osu.ConvertHitObjectParser(getOffsetTime(), FormatVersion);

    var obj = parser.Parse(line);

    if (obj != null)
    {
        obj.ApplyDefaults(beatmap.ControlPointInfo, beatmap.Difficulty);

        beatmap.HitObjects.Add(obj);
    }
}

Hmm, interesting. Unlike timing point handling, this depends on the game mode. Going by the namespace of the default value, the Mania parser can be found under Rulesets.Objects.Legacy.Mania.ConvertHitObjectParser, and the Parse function it relies on is defined in the shared ConvertHitObjectParser class.

Here are the relevant parts:

  1. ConvertHitObjectParser parses the line
public override HitObject Parse(string text)
{
    string[] split = text.Split(',');

    Vector2 pos = new Vector2((int)Parsing.ParseFloat(split[0], Parsing.MAX_COORDINATE_VALUE), (int)Parsing.ParseFloat(split[1], Parsing.MAX_COORDINATE_VALUE));

    double startTime = Parsing.ParseDouble(split[2]) + Offset;
    LegacyHitObjectType type = (LegacyHitObjectType)Parsing.ParseInt(split[3]);

    int comboOffset = (int)(type & LegacyHitObjectType.ComboOffset) >> 4;
    type &= ~LegacyHitObjectType.ComboOffset;

    bool combo = type.HasFlagFast(LegacyHitObjectType.NewCombo);
    type &= ~LegacyHitObjectType.NewCombo;

    var soundType = (LegacyHitSoundType)Parsing.ParseInt(split[4]);
    var bankInfo = new SampleBankInfo();

// (continued below)
  2. Depending on the type, functions like CreateHit, CreateSlider or CreateHold are called on the child class. A hit is a single note, and a slider is supposed to be a note where you need to hold a button, but I don’t understand how it’s different from a hold.
if (type.HasFlagFast(LegacyHitObjectType.Circle))
{
    // Call CreateHit
}
else if (type.HasFlagFast(LegacyHitObjectType.Slider))
{
    // Call CreateSlider
}
// ...

Now that I look at it, inside the slider branch there’s this line:

else if (type.HasFlagFast(LegacyHitObjectType.Slider))
{
    double? length = null;

    int repeatCount = Parsing.ParseInt(split[6]);

    if (repeatCount > 9000)
        throw new FormatException(@"Repeat count is way too high");
    
    // ...

When I look at the HitObjects in my .osu file, no line has 7 items, so none of them can be a slider (the slider branch reads split[6]). It means that in Mania, the notes you hold down are hold notes rather than sliders.

Now let’s parse an example line:

# 329,192,26117,5,0,0:0:0:0:
- pos: Vector2(329, 192)
- startTime: 26117 # + Offset
- type: 5
- soundType: 0

The 0:0:0:0: part is the representation of the sample banks, but I’m not interested in that for now.

Looking at the position values within HitObjects, it’s easy to observe that each pos can be associated with one of the 7 keys. And since we’re not dealing with the standard osu! mode, we can skip the Y values and match the X values with keys.
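The type field is a bitmask, which is why the parser keeps masking bits off it. Here is a sketch of decoding it in JavaScript. The bit values are my reading of the LegacyHitObjectType flags and the file format docs (treat them as an assumption), and decodeHitObjectType is a made-up name.

```javascript
// Assumed bit layout of the hit object type field
const HIT_TYPE = Object.freeze({
    CIRCLE: 1,
    SLIDER: 2,
    NEW_COMBO: 4,
    SPINNER: 8,
    COMBO_OFFSET: 16 | 32 | 64,
    HOLD: 128,
});

// Hypothetical helper: turns the raw integer into readable booleans.
function decodeHitObjectType(rawType) {
    return {
        comboOffset: (rawType & HIT_TYPE.COMBO_OFFSET) >> 4,
        newCombo: (rawType & HIT_TYPE.NEW_COMBO) !== 0,
        circle: (rawType & HIT_TYPE.CIRCLE) !== 0,
        slider: (rawType & HIT_TYPE.SLIDER) !== 0,
        spinner: (rawType & HIT_TYPE.SPINNER) !== 0,
        hold: (rawType & HIT_TYPE.HOLD) !== 0,
    };
}

console.log(decodeHitObjectType(5));   // the first hit object of this map
console.log(decodeHitObjectType(128)); // the second one
```

Under this layout, type 5 is a circle that also starts a new combo, and type 128 is a hold note.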
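As far as I can tell from the file format docs, the column is derived from x with floor(x * columnCount / 512). I’m treating that formula as an assumption here, and xToColumn is my own helper name, but it agrees with every x value in this map.

```javascript
// Assumed playfield width used by the formula
const PLAYFIELD_WIDTH = 512;

// Hypothetical helper: maps a hit object's x coordinate to a column index.
function xToColumn(x, columnCount) {
    const column = Math.floor((x * columnCount) / PLAYFIELD_WIDTH);
    // Clamp so that out-of-range x values still land in a valid column
    return Math.min(Math.max(column, 0), columnCount - 1);
}

// x values that appear in the [HitObjects] section above
const xs = [36, 109, 182, 256, 329, 402, 475];
console.log(xs.map((x) => xToColumn(x, 7)));
```

Compared to looking x up in a fixed list, this also works for key counts other than 7.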

At this point, we have a fairly simple plan to play a song from start to finish:

  1. Unzip the .osz file
  2. Parse the .osu file to extract timing points and hit objects
  3. Set up the BPM of the song using the timing points
  4. Spawn hit objects according to their startTime property

For simplicity’s sake, I’ll ignore Kiai mode since I don’t know what it’s supposed to do. Without looking too much into it, it seems like it increases note density (?).

Implementing the basic game

I’ll go and start a new Vite project, because we depend on zip.js to do the extraction, and in vanilla JS it is very cumbersome to use the minified version of it.

yarn create vite creates a new project; I’ll select the vanilla option to get a plain HTML and JS project. Then I’ll install zip.js with yarn add @zip.js/zip.js (the scoped package name the import below uses).

For the HTML part, I only need a canvas to display the notes and a div to put buttons so that we can select levels:

<!DOCTYPE html>
<html lang="en">
    <head>
        <meta charset="UTF-8" />
        <meta name="viewport" content="width=device-width, initial-scale=1.0" />
        <title>osumania</title>
    </head>
    <body>
        <div id="entries"></div>
        <canvas
            id="game"
            width="500"
            height="500"
            style="border: 1px solid black"
        >
        </canvas>

        <!-- Javascript -->
        <script type="module" src="/js/main.js"></script>
    </body>
</html>

Drag and drop files

I want to be able to drop an .osz file onto the canvas and have it opened. To do that, we need to add event listeners to the canvas.

Additionally, I’ll add some text in the center of the canvas to indicate where to drop the file.

// /js/main.js
const CANVAS = document.querySelector('#game');
const CTX_2D = CANVAS.getContext("2d");

/** Draws a text to the center of canvas */
function drawText(ctx, text, color = "black") {
    const { width, height } = ctx.canvas;
    ctx.font = "30px Arial";
    ctx.textAlign = "center";
    ctx.fillStyle = color;
    ctx.fillText(text, width / 2, height / 2);
}

async function parseOszFile(e) {
    // TODO
}

CANVAS.addEventListener("dragover", (e) => e.preventDefault());
CANVAS.addEventListener("dragleave", (e) => e.preventDefault());
CANVAS.addEventListener("drop", (e) => {
    e.preventDefault();
    parseOszFile(e).catch((err) => {
        console.error(err);
        drawText(CTX_2D, "Error loading file");
    }); 
});
drawText(CTX_2D, "Drop .osz file here");

It will look like this:

Step 1

Now we can implement the extraction logic inside the parseOszFile function.

import { BlobReader, ZipReader, BlobWriter, TextWriter } from "@zip.js/zip.js";

// ...

async function extractOszFile(file) {
    const fileReader = new BlobReader(file);
    const zipReader = new ZipReader(fileReader);
    const entries = await zipReader.getEntries();

    const beatmaps = entries.filter((entry) => entry.filename.endsWith(".osu"));
    const audio = entries.find((entry) => entry.filename.endsWith(".mp3"));
    const background = entries.find(
        (entry) =>
            entry.filename.endsWith(".jpg") || entry.filename.endsWith(".png")
    );
    return { beatmaps, audio, background };
}

async function parseOszFile(e) {
    const file = e.dataTransfer.files[0];
    if (file == undefined) return alert("No file dropped");

    const { beatmaps, audio, background } = await extractOszFile(file);

    // Load audio
    if (audio == undefined) return alert("No audio found");
    const audioWriter = new BlobWriter("audio/mp3");
    const audioData = await audio.getData(audioWriter);
    const audioUrl = URL.createObjectURL(audioData);
    const audioElement = new Audio(audioUrl);

    // Set background (skipped when the archive has no image)
    if (background != undefined) {
        const isPng = background.filename.endsWith(".png");
        const backgroundWriter = new BlobWriter(isPng ? "image/png" : "image/jpeg");
        const backgroundData = await background.getData(backgroundWriter);
        const backgroundUrl = URL.createObjectURL(backgroundData);
        CANVAS.style.backgroundSize = `auto ${CANVAS.height}px`;
        CANVAS.style.backgroundImage = `url(${backgroundUrl})`;
    }


    // The button container is the <div id="entries"> from the HTML above
    const beatmapsNode = document.querySelector("#entries");
    for (const beatmap of beatmaps) {
        const beatmapSelectButton = document.createElement("button");
        beatmapSelectButton.style.display = "block";
        beatmapSelectButton.textContent = beatmap.filename;
        beatmapSelectButton.addEventListener("click", () => startGame(beatmap));
        beatmapsNode.appendChild(beatmapSelectButton);
    }

    clear(CTX_2D);
    drawText(CTX_2D, "Select a difficulty", "white");
}

async function startGame(beatmap) {
    // TODO: Parse and play beatmap
}

Now, once we drag and drop the .osz file, the background will change accordingly.
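One thing extractOszFile glosses over is that it takes the first .mp3 it finds, while the [General] section already names the audio file through AudioFilename. A possible refinement, sketched here on plain filename strings, is to look the file up by that name first. findAudioEntry is a hypothetical helper, and the fallback to .mp3 or .ogg is my own choice.

```javascript
// Hypothetical helper: picks the audio file from a list of archive filenames.
function findAudioEntry(filenames, audioFilename) {
    const wanted = audioFilename.trim().toLowerCase();

    // Prefer the file the beatmap itself points at (case-insensitive)
    const exact = filenames.find((name) => name.toLowerCase() === wanted);
    if (exact !== undefined) return exact;

    // Otherwise fall back to the first file that looks like a music track
    return filenames.find((name) => /\.(mp3|ogg)$/i.test(name));
}

const archive = [
    "Akiri feat. InabaYap - Tonight We Fly (Blocko) [Easy].osu",
    "audio.mp3",
    "bg.jpg",
    "soft-hitnormal.wav",
];
console.log(findAudioEntry(archive, "audio.mp3"));
```

With zip.js entries, the same comparison would run against entry.filename instead of plain strings.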

Step 2

Game Loop

Let’s implement the main game loop for a 7K game.

We’ll start by parsing the .osu file and extracting the hit objects. I noticed that for a simple POC, timing points aren’t even needed; I can use the times on the hit objects.

const POS_7K = [36, 109, 182, 256, 329, 402, 475];
const CIRCLE_FLAG = 1;
const HOLD_FLAG = 128;

async function parseOsuFile(beatmap) {
    const textWriter = new TextWriter();
    const text = await beatmap.getData(textWriter);

    const hitObjects = [];

    const lines = text.split("\n");
    let parseSection = false;
    let section = "";

    for (let line of lines) {
        line = line.trim();
        if (line.length == 0) continue;

        if (line.startsWith("osu file format")) {
            continue;
        } else if (line.startsWith("[")) {
            // Extract <SECTION> from [<SECTION>]
            section = line.substring(1, line.length - 1);
            parseSection = section == "HitObjects";
            continue;
        } else if (!parseSection) {
            continue;
        }

        if (section == "HitObjects") {
            const split = line.split(",");
            // Only x matters in mania; y (split[1]) is ignored
            const x = parseFloat(split[0]);

            const idx = POS_7K.indexOf(x);
            if (idx == -1) {
                console.error(`Invalid x position: ${x}`);
                continue;
            }

            const startTime = parseFloat(split[2]);
            const type = parseInt(split[3]);

            if (type & CIRCLE_FLAG) {
                hitObjects.push({ type: CIRCLE_FLAG, idx, startTime });
            } else if (type & HOLD_FLAG) {
                // split[5] looks like "26862:0:0:0:0:"; parseFloat stops at the first ":"
                const endTime = parseFloat(split[5]);
                hitObjects.push({ type: HOLD_FLAG, idx, startTime, endTime });
            }
        }
    }

    // Group hitObjects by time
    const hitObjectsByTime = [];

    let pendingHit = null;
    for (const hitObject of hitObjects) {
        if (pendingHit == null) {
            pendingHit = {
                startTime: hitObject.startTime,
                hitObjects: [hitObject],
            };
        } else {
            if (pendingHit.startTime == hitObject.startTime) {
                pendingHit.hitObjects.push(hitObject);
            } else {
                hitObjectsByTime.push(pendingHit);
                pendingHit = {
                    startTime: hitObject.startTime,
                    hitObjects: [hitObject],
                };
            }
        }
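    }
    // Flush the last group, which the loop above never pushes
    if (pendingHit != null) hitObjectsByTime.push(pendingHit);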

    return { hitObjectsByTime };
}

This function first loads the .osu data into memory using TextWriter from zip.js, then parses hit objects by the rules we talked about before and groups them by time.

I skipped this while going over hit object parsing, but if a note is a hold note, the 6th element of the split starts with its endTime.
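The grouping step is easy to get subtly wrong at the end of the list, so here is a compact variant as a standalone sketch. groupByStartTime is my own name; it pushes each group as soon as it is created, so the final group cannot be left behind. It assumes the hit objects are already sorted by startTime, as they are in the file.

```javascript
// Hypothetical helper: groups consecutive hit objects sharing a startTime.
function groupByStartTime(hitObjects) {
    const groups = [];
    for (const hitObject of hitObjects) {
        const last = groups[groups.length - 1];
        if (last !== undefined && last.startTime === hitObject.startTime) {
            last.hitObjects.push(hitObject);
        } else {
            groups.push({ startTime: hitObject.startTime, hitObjects: [hitObject] });
        }
    }
    return groups;
}

// First five hit objects of the Easy map, reduced to the fields used here
const parsed = [
    { idx: 4, startTime: 26117 },
    { idx: 2, startTime: 26490 },
    { idx: 3, startTime: 26862 },
    { idx: 0, startTime: 26862 },
    { idx: 6, startTime: 27235 },
];
const groups = groupByStartTime(parsed);
console.log(groups.map((group) => group.hitObjects.length));
```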

Rendering notes

To render notes, let’s define some constants first; this keeps most of the rendering configurable.

const NOTE_RADIUS = 15;
const COL_WIDTH = 30;
const COL_SPACING = 10;
const MOVE_RATE = 400; // pixels per second
const LINE_HEIGHT = CANVAS.height * 0.8;
const LINE_ERROR = 40;

There’ll be 7 columns for notes to appear in. Each note will be a circle with radius NOTE_RADIUS, and each note will move 400 pixels per second.

LINE_HEIGHT is the y coordinate of the line where we hit the notes, and LINE_ERROR is the error margin allowed when hitting them.

To render notes, I’ll add two functions: one for circle notes and one for hold notes. Circle notes are straightforward, but hold notes have a few edge cases, because if we try to render a rect that’s too big for the canvas it won’t be rendered, so we have to work out how much of it fits on the screen.
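Because the speed is constant, a note’s y position can also be computed directly from the song time. The sketch below restates the constants so it runs on its own; noteY is a hypothetical helper, and CANVAS_HEIGHT stands in for CANVAS.height (500 in the HTML above).

```javascript
const MOVE_RATE = 400; // pixels per second
const CANVAS_HEIGHT = 500; // assumption: same as the canvas height attribute
const LINE_HEIGHT = CANVAS_HEIGHT * 0.8;

// Hypothetical helper: y coordinate of a note at a given song time.
// Both times are in milliseconds since the start of the song.
function noteY(songTime, startTime) {
    const msUntilHit = startTime - songTime;
    return LINE_HEIGHT - (msUntilHit * MOVE_RATE) / 1000;
}

console.log(noteY(26117, 26117)); // on the hit line
console.log(noteY(25617, 26117)); // half a second earlier, 200px above it
```

Deriving y from the clock like this means a slow frame cannot make a note fall behind its own timestamp.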

function drawCircleNote(ctx, x, y, r) {
    ctx.beginPath();
    ctx.arc(x, y, r, 0, 2 * Math.PI);
    ctx.fillStyle = "white";
    ctx.fill();
}

function drawHoldNote(ctx, x, y, r, h) {
    if (y < 0) return;

    const maxHeight = CANVAS.height;

    // Adjust height if it exceeds canvas height
    if (y > maxHeight) {
        // Subtract off-screen height from the height of the column
        h -= y - maxHeight;
        y = maxHeight;
    }

    // If height is more than y coordinate, render up to y coordinate only
    if (y - h < 0) {
        h = y;
    }

    ctx.beginPath();
    // Positive h values render the rect downwards.
    ctx.rect(x - r, y, 2 * r, -h);
    ctx.fillStyle = "white";
    ctx.fill();

    // Draw a circle at the start
    ctx.beginPath();
    ctx.arc(x, y, r, 0, 2 * Math.PI);
    ctx.fillStyle = "white";
    ctx.fill();

    // Draw a circle at the end
    ctx.beginPath();
    ctx.arc(x, y - h, r, 0, 2 * Math.PI);
    ctx.fillStyle = "white";
    ctx.fill();
}

With these two functions we can set up the game loop.

const colOffset = COL_WIDTH + COL_SPACING;
const upperBound = 2 * NOTE_RADIUS;
const lowerBound = 2 * NOTE_RADIUS + CANVAS.height;
const xStart = CANVAS.width / 2 - COL_WIDTH * 3.5 - COL_SPACING * 3;
const appearTimeOffset = ((LINE_HEIGHT + upperBound) * 1000) / MOVE_RATE;

const screenItems = [];
const addScreenItem = (item) => {
    screenItems.push({
        x: xStart + item.idx * colOffset,
        y: -upperBound,
        ...item,
    });
};

let lastTime = performance.now();
let time = 0;
let hitObjectIdx = 0;

Notes appear at a y coordinate of -upperBound and disappear at lowerBound. The column a note appears in depends on xStart and the idx of the note.

To keep track of items on the screen, the screenItems array will be used. lastTime holds the timestamp of the previous frame, time holds the time elapsed since the song started, and hitObjectIdx holds the index of the next hit object group to spawn.

while (true) {
    // Update UI by unblocking the event loop
    await new Promise((resolve) => setTimeout(resolve, 0));

    const delta = performance.now() - lastTime;
    const moveRate = (MOVE_RATE * delta) / 1000;

    const nextHitObject = hitObjectsByTime[hitObjectIdx];
    if (nextHitObject != undefined) {
        // If it's time to render the next hit object add it to the screen
        const nextTime = nextHitObject.startTime;
        if (time >= nextTime - appearTimeOffset) {
            for (const hitObject of nextHitObject.hitObjects) {
                addScreenItem(hitObject);
            }
            hitObjectIdx++;
        }
    } else if (screenItems.length == 0) {
        // All hit objects are rendered, stop the song
        audio.pause();
        audio.currentTime = 0;
        break;
    }

    clear(CTX_2D);
    drawLine(CTX_2D, LINE_HEIGHT);

    // Draw columns
    for (let i = 0; i < 7; i++) {
        const colColor =
            i == 3 ? "rgba(0,255,255,0.3)" : "rgba(0,0,255,0.3)";
        drawColumn(CTX_2D, xStart + i * colOffset, COL_WIDTH, colColor);
    }

    // Draw notes
    for (let i = 0; i < screenItems.length; i++) {
        const screenItem = screenItems[i];
        screenItem.y += moveRate;
        const { x, y, type } = screenItem;

        if (type & CIRCLE_FLAG) {
            drawCircleNote(CTX_2D, x, y, NOTE_RADIUS);
            // Remove notes that are out of bounds
            if (y > lowerBound) {
                screenItems.splice(i, 1);
                i--;
            }
        } else if (type & HOLD_FLAG) {
            const startTime = screenItem.startTime;
            const endTime = screenItem.endTime;
            const holdHeight = ((endTime - startTime) * MOVE_RATE) / 1000;

            drawHoldNote(CTX_2D, x, y, NOTE_RADIUS, holdHeight);
            // Remove notes that are out of bounds, including the hold note
            if (y - holdHeight > lowerBound) {
                screenItems.splice(i, 1);
                i--;
            }
        }
    }

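    // Advance both clocks by the delta that moved the notes. Reading
    // performance.now() again here would drop the time spent rendering.
    time += delta;
    lastTime += delta;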
}

In every loop iteration, I calculate how much time has passed with delta and update screenItems accordingly. If a screen item passes lowerBound, I delete it. That’s pretty much the gist of the rendering logic.

One crucial part of this logic is this line:
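The loop calls clear, drawLine and drawColumn, which are not shown here. A minimal guess at what they could look like follows; these bodies are assumptions, not the originals, and they only use standard Canvas 2D calls. I center each column on x so that it lines up with the circles, which are drawn around the same x.

```javascript
// Hypothetical helper: wipes the whole canvas.
function clear(ctx) {
    ctx.clearRect(0, 0, ctx.canvas.width, ctx.canvas.height);
}

// Hypothetical helper: horizontal hit line across the canvas at height y.
function drawLine(ctx, y, color = "white") {
    ctx.beginPath();
    ctx.moveTo(0, y);
    ctx.lineTo(ctx.canvas.width, y);
    ctx.strokeStyle = color;
    ctx.stroke();
}

// Hypothetical helper: full-height column centered on x.
function drawColumn(ctx, x, width, color) {
    ctx.fillStyle = color;
    ctx.fillRect(x - width / 2, 0, width, ctx.canvas.height);
}
```

In the page these would be called with CTX_2D, exactly as the loop above does.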

await new Promise((resolve) => setTimeout(resolve, 0));

Without this line, our while loop would block the browser’s event loop and we wouldn’t see the rendered result in the UI. The line schedules a new task and waits for it, so any pending UI change gets handled by the browser in the meantime.

Now, once we select a difficulty, the audio starts and notes begin to flow.
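A possible alternative, which I have not switched to here, is to wait for the browser’s next repaint with requestAnimationFrame instead of a zero-delay timeout. The nextFrame helper below is a hypothetical sketch; it falls back to setTimeout where requestAnimationFrame is not available.

```javascript
// Hypothetical helper: resolves on the next animation frame with a timestamp.
function nextFrame() {
    return new Promise((resolve) => {
        if (typeof requestAnimationFrame === "function") {
            // The browser passes a high-resolution timestamp to the callback
            requestAnimationFrame(resolve);
        } else {
            // Fallback for environments without requestAnimationFrame
            setTimeout(() => resolve(performance.now()), 0);
        }
    });
}

// Usage inside an async loop: `await nextFrame();`
const pendingFrame = nextFrame();
```

The trade-off is that requestAnimationFrame ties the loop to the display’s refresh rate and is throttled in background tabs, so the song clock should not depend on a fixed number of iterations.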

Step 3

Pressing notes

This part was much harder than I imagined. I thought it would be simpler, because for single notes the logic is pretty easy.

Single note hit demonstration

The error is abs(T1-T2); if it’s smaller than our tolerance, the hit is successful. Otherwise it is unsuccessful.

We can calculate our hit time by keeping track of how much time has passed since the start of the song, and compare it to the data in the beatmap.
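As a sketch, the single-note check fits in a few lines. judgeHit and the 100 ms tolerance in the usage lines are assumptions for illustration; both times are milliseconds since the start of the song.

```javascript
// Hypothetical helper: compares a key press against a note's startTime.
function judgeHit(pressTime, noteTime, toleranceMs) {
    const error = Math.abs(pressTime - noteTime);
    return { hit: error < toleranceMs, error };
}

console.log(judgeHit(26100, 26117, 100)); // slightly early
console.log(judgeHit(26300, 26117, 100)); // too late
```

Returning the error alongside the boolean leaves room for grading hits later, for example giving more points to smaller errors.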

But when it comes to hold notes, it becomes more complicated.

Hold note hit demonstration

For a successful hit, the press must be successful, the release must be successful, and the note must be held until it is released.

And if the first press is missed, it’s not necessarily a failure. In osu!mania we can miss the first press and start holding late, or make the first press and let go early, and still be rewarded points (though fewer than for a successful hold).

The following graph explains this logic further:

Graph of the previous logic
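In code, the outcome of a hold note can be sketched as a small decision function. The result names, the judgeHold helper and its four boolean inputs are my own simplification of the rules described above, not how osu! actually scores holds.

```javascript
const HOLD_RESULT = Object.freeze({ FULL: "full", PARTIAL: "partial", MISS: "miss" });

// Hypothetical helper: classifies a finished hold note.
//   pressedOnTime  - the initial press landed within tolerance
//   heldAtAll      - the key was down during some part of the note body
//   heldToEnd      - the key stayed down until the release
//   releasedOnTime - the release landed within tolerance
function judgeHold({ pressedOnTime, heldAtAll, heldToEnd, releasedOnTime }) {
    if (pressedOnTime && heldToEnd && releasedOnTime) return HOLD_RESULT.FULL;
    // Late start or early release still earns something
    if (pressedOnTime || heldAtAll) return HOLD_RESULT.PARTIAL;
    return HOLD_RESULT.MISS;
}

console.log(judgeHold({ pressedOnTime: true, heldAtAll: true, heldToEnd: true, releasedOnTime: true }));
console.log(judgeHold({ pressedOnTime: false, heldAtAll: true, heldToEnd: true, releasedOnTime: true }));
console.log(judgeHold({ pressedOnTime: false, heldAtAll: false, heldToEnd: false, releasedOnTime: false }));
```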

In order to implement this logic, I’ll start with KEY_STATE and HOLD_STATE enums. Key state is the state of a keyboard key, and hold state is the state of a hold note.

const KEY_STATE = Object.freeze({ PRESSED: 0, RELEASED: 1 });
const HOLD_STATE = Object.freeze({ NONE: 0, HOLDING: 1, MISSED: 2 });

For each key in the game, I’ll store a key state.

const KEY_LIST = ["s", "d", "f", " ", "j", "k", "l"];
const KEY_INDICES = KEY_LIST.reduce((acc, key, i) => {
    acc[key] = i;
    return acc;
}, {});
const keysPressed = KEY_LIST.map(() => null);
const makeKey = (key, state) => ({ key, state, time: performance.now() });

Then to keep track of them, I’ll have keydown and keyup events alongside a custom updateKeys function that’ll update their state.

document.addEventListener("keydown", (e) => {
    const { key } = e;
    const idx = KEY_INDICES[key];

    if (idx != undefined) {
        // Only prevent the default action for keys we're interested in
        e.preventDefault();

        // If already pressed, ignore
        if (keysPressed[idx] != null) return;

        keysPressed[idx] = makeKey(key, KEY_STATE.PRESSED);
    }
});

document.addEventListener("keyup", (e) => {
    const { key } = e;
    const idx = KEY_INDICES[key];

    if (idx != undefined) {
        e.preventDefault();

        if (keysPressed[idx] != null) {
            keysPressed[idx] = makeKey(key, KEY_STATE.RELEASED);
        }
    }
});

const KEY_TTL = 50;

function updateKeys(now) {
    for (let i = 0; i < keysPressed.length; i++) {
        const keyPressed = keysPressed[i];
        if (keyPressed == null) continue;

        const { time, state } = keyPressed;
        if (state == KEY_STATE.RELEASED && now - KEY_TTL > time) {
            keysPressed[i] = null;
        }
    }
}
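
To see what updateKeys does with concrete timestamps, here is a self-contained sketch of the same expiry rule. The function expireReleasedKeys and its parameters are hypothetical names; it takes the key array and the TTL as arguments so it can run outside the browser.

```javascript
const KEY_STATE = Object.freeze({ PRESSED: 0, RELEASED: 1 });

// Hypothetical pure variant of updateKeys: clears RELEASED entries that are
// older than ttl milliseconds and leaves everything else untouched.
function expireReleasedKeys(keys, now, ttl) {
    return keys.map((entry) => {
        if (entry == null) return null;
        const expired =
            entry.state === KEY_STATE.RELEASED && now - ttl > entry.time;
        return expired ? null : entry;
    });
}

const keys = [
    { key: "s", state: KEY_STATE.PRESSED, time: 0 },    // still held down
    { key: "d", state: KEY_STATE.RELEASED, time: 100 }, // released 60 ms ago
    { key: "f", state: KEY_STATE.RELEASED, time: 140 }, // released 20 ms ago
];

const result = expireReleasedKeys(keys, 160, 50);
console.log(result.map((k) => (k ? k.key : null))); // [ 's', null, 'f' ]
```

A pressed key never expires on its own. Only a release older than the TTL is cleared, which leaves the release visible to the game loop for a short while before it disappears.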

Now, when we render a note, we can also check whether the key for that note is pressed. If it is, we’ll remove circle notes and update the bottom bound of hold notes.

const SCORE = document.querySelector("#score");
let score = 0;

function drawHoldNote(ctx, x, y, r, h, holding) {
    // ...

    // Updated here
    // Adjust height if it exceeds canvas height
    const maxHeight = holding ? LINE_HEIGHT : CANVAS.height;

    // ...
}

// Inside game loop

    // Draw notes
    for (let i = 0; i < screenItems.length; i++) {
        const screenItem = screenItems[i];
        screenItem.y += moveRate;
        const { x, y, idx, type } = screenItem;

        // Render
        if (type & CIRCLE_FLAG) {
            drawCircleNote(CTX_2D, x, y, NOTE_RADIUS);
        } else if (type & HOLD_FLAG) {
            const startTime = screenItem.startTime;
            const endTime = screenItem.endTime;
            const holdHeight = ((endTime - startTime) * MOVE_RATE) / 1000;
            const holdState = screenItem.holdState;

            drawHoldNote(CTX_2D, x, y, NOTE_RADIUS, holdHeight, holdState != HOLD_STATE.NONE);
        }

        // Check presses
        if (type & CIRCLE_FLAG) {
            const startTime = screenItem.startTime;
            const keyPress = keysPressed[idx];

            // If note is pressed, remove it from the screen
            if (keyPress != null) {
                const keyPressTime = keyPress.time - songStartAt;
                const timeDiff = Math.abs(keyPressTime - startTime);
                if (timeDiff < KEY_TOLERANCE) {
                    // Remove note from screen
                    screenItems.splice(i, 1);
                    i--;
                    // Remove key press
                    keysPressed[idx] = null;

                    SCORE.textContent = `Score: ${++score}`;
                    // Note is already removed, so skip the out-of-bounds check below
                    continue;
                }
            }

            // Remove notes that are out of bounds
            if (y > lowerBound) {
                screenItems.splice(i, 1);
                i--;
            }
        } else if (type & HOLD_FLAG) {
            const startTime = screenItem.startTime;
            const endTime = screenItem.endTime;
            const holdHeight = ((endTime - startTime) * MOVE_RATE) / 1000;
            const holdState = screenItem.holdState;

            switch (holdState) {
                case HOLD_STATE.NONE: {
                    const keyPress = keysPressed[idx];
                    if (keyPress == undefined) break;
                    const keyPressTime = keyPress.time - songStartAt;
                    const pressStart = startTime - KEY_TOLERANCE;
                    const timeDiff = Math.abs(keyPressTime - startTime);
                    if (time > pressStart) {
                        if (timeDiff < KEY_TOLERANCE) {
                            SCORE.textContent = `Score: ${++score}`;
                            screenItems[i].holdState = HOLD_STATE.HOLDING;
                        } else if (keyPress.state == KEY_STATE.PRESSED) {
                            screenItems[i].holdState = HOLD_STATE.MISSED;
                        }
                    }
                    break;
                }
                case HOLD_STATE.HOLDING: {
                    const keyPress = keysPressed[idx];
                    if (keyPress == undefined) {
                        screenItems[i].holdState = HOLD_STATE.MISSED;
                        break;
                    }
                    // Use the time of this key's event (keysPressed is the whole array)
                    const keyPressTime = keyPress.time - songStartAt;
                    const releaseStart = endTime - KEY_TOLERANCE;
                    const timeDiff = Math.abs(keyPressTime - endTime);

                    if (time > releaseStart) {
                        if (
                            keyPress.state == KEY_STATE.RELEASED &&
                            timeDiff < KEY_TOLERANCE
                        ) {
                            screenItems.splice(i, 1);
                            i--;
                            keysPressed[idx] = null;
                            SCORE.textContent = `Score: ${++score}`;
                            // Note is already removed, so skip the out-of-bounds check below
                            continue;
                        }
                    }
                    break;
                }
            }

            const lowerBound =
                holdState != HOLD_STATE.NONE
                    ? LINE_HEIGHT
                    : 2 * NOTE_RADIUS + CANVAS.height;

            // Remove notes that are out of bounds, including the hold note
            if (y - holdHeight > lowerBound) {
                screenItems.splice(i, 1);
                i--;
            }
        }
    }

requestAnimationFrame() for the game loop

I’ve talked about how we can use setTimeout to fit UI events into the game loop so that our UI stays up-to-date. There’s another function called requestAnimationFrame that serves the same purpose: it asks the browser to call our function before the next repaint.

That way we can change our game loop from this:

while (true) {
    await new Promise((res) => setTimeout(res, 0));
    // ...
}

to this:

// requestAnimationFrame method will pass timestamp to the gameLoop
function gameLoop(timestamp) {
    // ...
    requestAnimationFrame(gameLoop);
}
requestAnimationFrame(gameLoop);
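
The timestamp passed to the callback can also drive the movement directly, so notes move at the same speed regardless of the frame rate. This is a rough sketch and not the code used in this post: createLoop, step, and the injectable raf parameter are hypothetical names, and raf only exists so the loop can be driven by hand outside a browser. The MOVE_RATE value is assumed too.

```javascript
// Hypothetical helper: builds a loop that calls step(deltaMs) once per frame.
// raf defaults to the browser's requestAnimationFrame; it is a parameter only
// so the loop can be exercised without a browser.
function createLoop(step, raf = (cb) => requestAnimationFrame(cb)) {
    let last = null;
    let running = true;

    function frame(timestamp) {
        if (!running) return;
        // The first frame has no previous timestamp, so its delta is 0
        const deltaMs = last === null ? 0 : timestamp - last;
        last = timestamp;
        step(deltaMs);
        raf(frame);
    }

    raf(frame);
    // Returns a function that stops the loop
    return () => {
        running = false;
    };
}

// Driving the loop by hand with a fake raf that just records the callback
const MOVE_RATE = 500; // pixels per second, assumed value
let pending = null;
const fakeRaf = (cb) => {
    pending = cb;
};

let y = 0;
const stop = createLoop((deltaMs) => {
    y += (MOVE_RATE * deltaMs) / 1000;
}, fakeRaf);

pending(0);  // first frame, delta is 0
pending(16); // 16 ms later, moves 8 pixels
pending(32); // another 16 ms, moves 8 more
console.log(y); // 16
```

In a browser, calling createLoop with only the step function falls back to the real requestAnimationFrame.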

To be fair, I like my first approach better, so I’ll stick with it.

At this point, we can finally play a song from start to finish. You can see that I don’t calculate scores the way osu!mania does and instead just increment a score variable. That and many other improvements are possible with this code, but so far I’m satisfied with the result.

Gameplay

That’s it, a working osu!mania game in JavaScript. There were a lot of parts that turned out more complex than I anticipated, but it was very fun regardless.

Hope you enjoyed reading it as well. You can also play the game here. (You need to find beatmaps on the osu! website.)