Why is HypeHype building their own engine instead of using Unreal/Unity?

We got this question at the Supercell TIMEOUT event yesterday. This is a super relevant question. And there are a lot of wrong answers to it. Let's discuss in a thread...
HypeHype is a game making platform for touchscreen mobile devices and web (HTML5 + WebGL 2.0). There's no PC client. Games are created inside the app, and the app also allows users to upload their own assets to our cloud servers. Games are instant loading (<1 sec loading time).
Let's start with data management. Traditional engines are built with the assumption that the developer uses a PC to build their game. The PC hosts the source assets and cooks the standalone builds. All data converters run on the PC. This code is part of the PC editor.
In HypeHype the data management runs in cloud server, and assets are shared between multiple games created by multiple authors. Everybody can use assets created by other people in their games. They don't need to download or install these assets. The game streams them.
The data versioning and cooking run in the cloud server. The games have low resolution mips/LODs baked in to ensure <1 sec loading times. The app itself cooks these game bundles and sends them to the server. The server can refresh them and run patchers on them.
The app can also run data patchers itself, as we want to avoid patching all games on the server every time something changes in the data model. There are already 250,000 games and the platform is only in early access in the Philippines and Finland. Distributing the cost works better.
Since we are planning to maintain this system for 10+ years, we care a lot about our data integrity. Public game engines are not designed to be 100% data compatible over a 10+ year time period. We can't afford data compatibility issues when updating an engine over the years.
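The patcher approach above can be sketched as forward-only migration steps keyed on a version number stored in every cooked bundle. This is a minimal illustration with hypothetical names, not HypeHype's actual data model:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical cooked bundle: every blob stores the data model
// version it was written with.
struct Bundle {
    uint32_t version;             // bumped on every data model change
    std::vector<uint8_t> payload;
};

constexpr uint32_t kCurrentVersion = 3;

// Forward-only patchers: each step upgrades exactly one version.
// Old steps are never deleted, so any historical bundle can be
// replayed up to the current version.
void patch_1_to_2(Bundle& b) { /* e.g. split material fields */ b.version = 2; }
void patch_2_to_3(Bundle& b) { /* e.g. add a LOD table      */ b.version = 3; }

void migrate(Bundle& b) {
    while (b.version < kCurrentVersion) {
        switch (b.version) {
            case 1: patch_1_to_2(b); break;
            case 2: patch_2_to_3(b); break;
            default: assert(!"unknown bundle version"); return;
        }
    }
}
```

Because every old step is kept, the same migration chain can run either on the server (batch refresh) or lazily in the app when an old bundle is streamed in.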
The other big topic is the tools. People license a third party engine because it gives them great tools for building their levels. This gives them a big productivity boost and allows their artists to start building the levels early.
All content in HypeHype is built in the touchscreen app. Our team is not building the levels. The users are. Traditional engines don't offer any level creation tools that would compile on touchscreen Android/iOS devices. Their tools are designed for PC.
Ubisoft/RedLynx's Trials game series and Frogmind's Badland game series are both physics based games with in-game level creation tools. These user facing tools are great, since the team used them to build their levels too. Dogfooding is the best way to build quality tools.
The level designers working on these game projects loved the level creation tools. The iteration time was super good. You could instantly switch between game<->edit mode on the device. All Badland levels were built on iPad and Trials levels on Xbox consoles.
HypeHype has collaborative online editing on the device. Multiple people can build together in real time. People can spectate creators building the game and chat with the creators. The server ensures the data validity and prevents multiple people modifying the same objects.
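The "prevents multiple people modifying the same objects" part could be as small as a per-object lock table on the server. A sketch under that assumption (hypothetical API, not HypeHype's actual protocol):

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical server-side edit locks: each object may be edited by
// at most one user at a time; conflicting edits are rejected.
class EditLocks {
public:
    bool tryAcquire(uint64_t objectId, uint32_t userId) {
        auto result = owners_.try_emplace(objectId, userId);
        // Acquired now, or already held by the same user.
        return result.second || result.first->second == userId;
    }
    void release(uint64_t objectId, uint32_t userId) {
        auto it = owners_.find(objectId);
        if (it != owners_.end() && it->second == userId)
            owners_.erase(it);
    }
private:
    std::unordered_map<uint64_t, uint32_t> owners_;  // objectId -> userId
};
```

A real implementation would also expire locks on disconnect and broadcast lock state to spectators, but the core validity rule is this simple.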
Serialization speed: Have you ever seen a game shipped with a public game engine load in <1 second? This is a hard requirement for us. RedLynx Trials games had <3s loading times, and <0.3s restarts and game<->edit mode transitions. This is possible with a custom engine.
Super fast serialization makes iteration time super good, which makes the team more productive. The same goes for offline data cooking, lighting bakes and deploying levels to the device. We don't have any of these costs. All our sample levels are also created on the device.
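Sub-second loading typically comes from serializing data in its final in-memory layout, so "loading" is one read plus a cast rather than field-by-field parsing. A rough sketch of that general technique (assumed here, not HypeHype's actual format):

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// POD structs whose on-disk layout equals the in-memory layout:
// loading becomes one read (or mmap) plus a cast, not per-field parsing.
struct Transform {
    float pos[3];
    float rot[4];
    float scale;
};
static_assert(std::is_trivially_copyable<Transform>::value, "must be memcpy-safe");

struct LevelHeader {
    uint32_t objectCount;  // 'objectCount' Transforms follow the header
};

// "Loading" the level is just reinterpreting the raw bytes.
const LevelHeader* loadLevel(const std::vector<uint8_t>& bytes) {
    assert(bytes.size() >= sizeof(LevelHeader));
    return reinterpret_cast<const LevelHeader*>(bytes.data());
}
```

A real loader would validate a magic number, version and size before trusting the cast; the cooking step produces exactly this layout so no work is left for load time.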
We want to allow our users to extract their games as standalone HTML5 + WebGL 2.0 (+WebGPU) packages that they can deploy to their own web server, or even sell in web game marketplaces. They could wrap them in an HTML5 player for App Store or Google Play.
The licensing terms of public game engines don't allow us to do this. Our users would need to license these 3rd party engines and pay licensing fees to them. We don't want this. We want our users to freely distribute their work, and sell it too if they want.
This is why we have to be extra careful when licensing 3rd party technology in HypeHype. We don't want technical choices to limit our business choices. All tech we use must be freely usable by our customers in their HTML5 games, without users having to sign contracts or pay fees.
Let's talk about technical factors next. People often say that performance is the reason they build their own engine. Or that they NEED a custom renderer. These are often bad reasons to build your own engine. You can work around many perf issues, and renderers are customizable.
At Ubisoft it was common to keep using the old engine while throwing away the renderer. People did this many times between console generations. A renderer alone is not a good enough reason to build a whole new engine. Just build a renderer. Or modify an existing one.
Claybook used UE4. We had our own SDF volume based scene, our own SDF ray-tracer (running in async compute) and our own GPGPU physics engine (fluids and clay). We modified the UE4 renderer heavily and optimized the renderer backends. Everything else in UE4 was fine for us.
If you write your own engine, you have to write all the bits that you don't really care about. Usually a game has some unique features that you really want to showcase. Better to spend your focus on those features instead of rewriting everything. Stock solutions work for most cases.
Traditionally big public engines were not great for massive simulations. Recently UE5 Mass and Unity DOTS have been introduced. The downside of these technologies is that they are not yet production ready. It's a risk to commit to them. But building your own isn't trivial either.
Nanite is also solving the massively kit bashed unoptimized user generated content case pretty well. But it doesn't scale down to mobiles. And I doubt it ever will scale to 43 GFLOP/s low end Android devices.
Professional game projects have technical artists designing the workflows. You can merge meshes together, put textures in the same atlases, create special content for far away geometry, hand place occluders / portals / probes, etc. User generated content is a different ball game.
At RedLynx/Ubisoft we created GPU-driven rendering to ensure that we can render poorly optimized kit bashed user generated content efficiently. We had 100% virtual texturing and streaming for everything to ensure the memory never runs out. Special needs require special tech.
Recap: If you are building a traditional game using traditional PC based workflow, you should consider licensing a public engine. Don't rewrite everything. Focus on the tech that makes your game unique. Do you really want to rewrite file system, asset management and audio system?
A custom engine is a good call if your business needs can't be covered by the engine licensing terms, all content is created inside your app, a cloud server deploys/cooks/converts/versions all data, and you need instant loading, web distribution, and 10+ years of data persistence...
Generic solutions always use more memory and run slower than a dedicated solution. The difference isn't massive, but it's certainly relevant for low end mobile devices, especially if you can't ensure that your content is optimally made. User generated content requires special care.

• • •


Thread by Sebastian Aaltonen

Sebastian Aaltonen Profile picture

Stay in touch and get notified when new unrolls are available from this author!

Read all threads

This Thread may be Removed Anytime!

PDF

Twitter may remove this content at anytime! Save it as PDF for later use!

Try unrolling a thread yourself!

how to unroll video
  1. Follow @ThreadReaderApp to mention us!

  2. From a Twitter thread mention us with a keyword "unroll"
@threadreaderapp unroll

Practice here first or read more on our help page!

More from @SebAaltonen

May 5
WebGL 2.0 doesn't support map/unmap for buffers:
khronos.org/registry/webgl…

--> There's no persistent mapping. Is there any way to efficiently sub-update big buffers in WebGL 2.0?

gl.bufferSubData exists, but I would assume it creates a shadow copy of the big buffer or stalls?
In our Vulkan and Metal backends I will be using big persistently mapped uniform buffers (tens of megabytes) and sub-allocating them. CPU only writes to locations that GPU hasn't used in the past 2 frames. Only a small subset of the data changes between frames.
I am wondering whether WebGL 2.0 developers call gl.bufferData for each draw call to fill the uniform buffer per draw. Or do you call gl.bufferData once, updating all data to a big uniform buffer and then call gl.bindBufferRange per draw?
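One common pattern behind the persistently mapped scheme described above is splitting one big mapped buffer into per-frame regions, so the CPU only ever writes memory the GPU finished reading two frames ago. A simplified bump-allocator sketch (illustrative, not the actual backend code):

```cpp
#include <cassert>
#include <cstdint>

// Simplified per-frame sub-allocator over one big persistently mapped
// buffer: the buffer is split into kFramesInFlight linear regions and
// each frame bump-allocates from its own region, so the CPU never
// writes memory the GPU might still be reading from an earlier frame.
class UniformRing {
public:
    static constexpr uint32_t kFramesInFlight = 2;

    UniformRing(uint32_t totalSize, uint32_t alignment)
        : regionSize_(totalSize / kFramesInFlight), align_(alignment) {}

    void beginFrame(uint64_t frameIndex) {
        base_ = uint32_t(frameIndex % kFramesInFlight) * regionSize_;
        head_ = 0;
    }

    // Returns a byte offset into the mapped buffer, rounded up to the
    // device's min UBO offset alignment, or UINT32_MAX when full.
    uint32_t alloc(uint32_t bytes) {
        uint32_t offset = (head_ + align_ - 1) & ~(align_ - 1);
        if (offset + bytes > regionSize_) return UINT32_MAX;
        head_ = offset + bytes;
        return base_ + offset;
    }

private:
    uint32_t regionSize_;
    uint32_t align_;
    uint32_t base_ = 0;
    uint32_t head_ = 0;
};
```

In Vulkan the returned offset would feed dynamic offsets at descriptor bind time; WebGL 2.0 has no persistent mapping, so the closest equivalent is one gl.bufferSubData per frame region plus gl.bindBufferRange per draw.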
May 2
Considering implementing my own minimal CPU and GPU profiling brackets (scoped) and a visualization tool. Our customers are developing games on phones and tablets. They would also benefit from having performance visualizations. It also makes my life easier when browsing live games.
The one thing I missed in the Unity profiler was a GPU timeline next to the CPU thread timelines. We could have both and make them available on the phone/tablet (activate in debug menu).
I am confident that I could make a good touchscreen UI designed for phones/tablets. Scroll / pinch zoom the data. Super efficient 60/120 fps rendering. Zero frame drops. Should be better than PC for visualizing profile capture data.
Apr 30
The rumoured AMD Phoenix laptop APU looks good. 8x Zen4 CPU cores + 24 RDNA2 (or 3) CUs. That's more flops than PS4 Pro, and not that far behind the 36 CUs of PS5. Should be close to M1 Pro in GPU performance, assuming they solve the memory bandwidth in some way.
It would be great to see an APU with a big GPU combined with the new 3D cache. That combination should offer good performance and be super energy efficient at the same time. With right clock rates it could offer similar full day battery life as the M1 Pro/Max do.
Too bad the Phoenix APU is not yet available. Even though the current RDNA2 iGPU has 12 CUs and beats previous gen iGPUs by almost 2x, it's still too slow for laptop vendors. There are zero 6800/6900HS (35W) laptop models without a discrete GPU.
Apr 29
Spent a couple of hours investigating ARM Mali G57 performance with instancing.

Managed to render 34816 instanced objects (up from 13000) with some interesting changes...
Tested different instance counts (per draw), with an identical number of rendered objects in each test. The numbers (instances per draw: frame time) are strange...

1: 39.22ms
2: 95.25ms
4: 162.41ms
8: 178.87ms
16: 146.07ms
32: 96.96ms
64: 49.72ms
128: 33.08ms
My draw calls are single triangles, with random position in the screen. When N=1 one draw hits one tile. When N increases we start hitting more and more tiles. But we render only one triangle to each tile. When N increases more, we render more triangles to each tile.
Apr 28
Nvidia is the only mobile GPU vendor (Shield, Pixel C) requiring more than 64 byte alignment for UBO offsets.

I would love to use the same persistent big UBO for both instancing and individual draws. But instancing requires tight packing while individual draws require align...
Alignment of 64 is fine. I could just ensure that all object data is padded to float4x4 boundaries. But 256 is iffy for instanced draws. Too much wasted memory and worse cache line utilization.
I don't want to set up any object data in the draw loop. I only set up the offset (based on culling results). All object data (whole scene) is already in the big UBO persistently. For instanced draws I use a separate UBO with offsets (visible indices) and index based on these.
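The padding waste from the previous tweet can be made concrete with two small helpers (illustrative):

```cpp
#include <cassert>
#include <cstdint>

// Round a UBO offset up to the device's required alignment
// (alignment must be a power of two).
constexpr uint32_t alignUp(uint32_t offset, uint32_t alignment) {
    return (offset + alignment - 1) & ~(alignment - 1);
}

// Bytes wasted per object when each individually bound object record
// (e.g. one float4x4 = 64 bytes) must start at an aligned offset.
constexpr uint32_t paddingPerObject(uint32_t objectSize, uint32_t alignment) {
    return alignUp(objectSize, alignment) - objectSize;
}
```

At 64-byte alignment a 64-byte float4x4 record packs tightly with zero padding; at 256-byte alignment, three quarters of every record's footprint is padding, which is the memory and cache line waste mentioned above.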
Apr 27
First draw call performance test. 50% draws to G-buffer = 4xMRT and 50% additive alpha transparencies.

Mali G57 MP1: 19000 draw calls at 30 fps
PowerVR GE 8320: 20000 draw calls at 30 fps

These $99 phones seem pretty competitive in Vulkan draw call performance.
No textures yet. One descriptor set which also has the framebuffer MRT bindings. Descriptor set has one UBO which is offset bound per draw. Persistently mapped UBO with bump allocated object positions (float4) filled every frame.
Tomorrow I will test moving the UBO into its own descriptor set to avoid rebinding the descriptor set containing the textures for each draw.
