Master the fundamentals of hardware: Augmented Reality (Part 2)

This article is Part 2 of “Fundamentals of AR”, an ongoing series based on what I have learned developing AR and VR applications. Catch up on Part 1.

Why AR?

Augmented reality captures our imagination like no other technology. Advancements in AR, from gaming to entertainment, are changing the way many industries work.

By 2025, healthcare revenue from augmented and virtual reality is projected to reach around $5 billion, and AR is already used in hospitals and doctors’ offices to help find a vein for IV placement. The travel industry also has a lot to gain from the AR boom: 84% of consumers worldwide say they would be interested in using AR as part of their travel experiences, and 42% believe AR is the future of tourism.

As the technology grows, it is important to get comfortable with the emerging AR landscape and with how an AR app is made.

When thinking about AR apps, it is easy to focus on the design and experience and overlook the hardware and software. However, many different pieces make an AR app work, and the underlying hardware and software are an essential part of the final outcome.

How Does a Computer Even Work?

Before going too deep, it is important to first understand how a mobile device interacts with AR content and is able to run it in the first place.

To start with the basics: all computers have a processor, which is essentially the brain of the device. The processor is what lets the different parts of your phone work by performing mathematical operations.

Processors come in the form of chips, and these chips contain several hardware units. The hardware unit that reads and performs a program instruction is called a core. A CPU (Central Processing Unit) is a collection of cores put together, and it processes information in a step-by-step, sequential format.

If we were to put a few thousand cores together, we would have a GPU (Graphics Processing Unit). A GPU processes data in many parallel groups at the same time, rendering a graphic in several small parts rather than in one piece. This parallelism increases frame rates and makes visuals look smoother in general.
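To make the step-by-step versus parallel distinction concrete, here is a rough Python sketch. It is only an analogy (a thread pool standing in for the thousands of hardware cores a real GPU has), not actual GPU code.

```python
# Sequential (CPU-style) vs. parallel (GPU-style) processing, illustrated.
from concurrent.futures import ThreadPoolExecutor

pixels = list(range(16))  # pretend these are 16 pixel values

def brighten(p):
    """One small 'instruction' applied to a single pixel."""
    return p + 10

# CPU-style: one core steps through the pixels one at a time.
sequential = [brighten(p) for p in pixels]

# GPU-style: many workers each handle a pixel at the same time.
with ThreadPoolExecutor(max_workers=8) as pool:
    parallel = list(pool.map(brighten, pixels))

assert sequential == parallel  # same result, different execution model
```

The point is not speed here (Python threads won't beat the loop on work this small) but the shape of the computation: one worker marching through a list versus many workers each taking a piece.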

Mobile devices also have what’s called an SoC (System on a Chip): a single chip that packs and condenses the major parts of a desktop computer together. For example, CPU + GPU = SoC.

How does a Computer see?

The process of a computer acquiring, analyzing, and processing data from the real world is called computer vision. The most common way to capture that information is through a camera, but the most accurate way is to use sensors. A sensor is a piece of technology that detects information from its surroundings and responds with data.

Examples of sensors:

  • Depth Sensor: Calculates depth and distance.
  • Magnetometer: Detects magnetic fields; essentially a compass that can always tell where north is.
  • Gyroscope: Detects rotation and the orientation of your phone.
  • Proximity Sensor: Measures how close or far an object is.
  • Accelerometer: Detects acceleration, i.e., changes in velocity and movement.
  • Light Sensor: Measures light intensity and brightness.
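As a small illustration of turning raw sensor data into something useful, here is a Python sketch that derives a phone's tilt from hypothetical accelerometer readings. The axis convention and sample values are assumptions for illustration, not a real device API.

```python
# Computing a device's tilt from the direction of gravity in
# (hypothetical) accelerometer readings, in m/s^2.
import math

def tilt_degrees(ax, ay, az):
    """Angle between the device and the horizontal plane."""
    return math.degrees(math.atan2(ay, math.sqrt(ax * ax + az * az)))

# Device lying flat: gravity is entirely on the z axis -> 0 degrees.
print(round(tilt_degrees(0.0, 0.0, 9.81), 1))  # 0.0
# Device standing upright: gravity is on the y axis -> 90 degrees.
print(round(tilt_degrees(0.0, 9.81, 0.0), 1))  # 90.0
```

Real AR frameworks fuse several of the sensors above (accelerometer, gyroscope, camera) to track the device far more robustly than any single reading can.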

How can a Computer even understand what it’s looking at?

It’s easy for a computer to gather data by looking at different things in the real world; the more meaningful part is how the computer takes that data and makes it purposeful. For example, calculating the distance between a phone and a door is easy with a sensor, but identifying the door is extremely difficult: the computer has to know the difference between a door, a wall, and a window. And because not every door looks the same, the computer must also recognize the different types of doors it is likely to encounter.

The different technologies that can be used to train a computer in this way are artificial intelligence (AI), machine learning (ML), deep learning (DL), and artificial neural networks (ANN).

An open-source software library like TensorFlow by Google makes it fast and easy to train a computer to recognize what a common object like a door is.

  • Artificial Intelligence (AI): Making a computer perform a task the way a human would.
  • Machine Learning (ML): The process of learning and improving by performing tasks, enabling better AI.
  • Deep Learning (DL): A subset of ML that learns, corrects itself, and comes to conclusions on its own.
  • Artificial Neural Networks (ANN): Networks of algorithms, loosely modeled on the brain, that perform the tasks enabling deep learning.
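To give a tiny, concrete taste of these ideas, here is a minimal Python sketch of a perceptron, the simplest artificial neural network, learning to tell "doors" from "windows" using two made-up features (height and width). The data is invented for illustration; a real system trained with a library like TensorFlow would learn from thousands of labeled images instead.

```python
# A perceptron learning to separate two toy classes.
# (height m, width m) -> 1 for door, 0 for window (invented data)
samples = [((2.0, 0.9), 1), ((2.1, 1.0), 1),
           ((1.0, 1.2), 0), ((1.2, 1.5), 0)]

w = [0.0, 0.0]  # one weight per feature
b = 0.0         # bias term
LR = 0.1        # learning rate

def predict(x):
    """Fire (1) if the weighted features cross the threshold."""
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0

# Perceptron learning rule: nudge the weights toward each mistake.
for _ in range(10):  # a few passes over the data suffice here
    for x, label in samples:
        error = label - predict(x)
        w[0] += LR * error * x[0]
        w[1] += LR * error * x[1]
        b += LR * error

assert all(predict(x) == label for x, label in samples)
```

After a handful of passes the weights settle, and the network classifies every sample correctly. This is "learning by improving at a task" in its smallest possible form.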

What is some AR Software Terminology?

There are four main terms you must understand if you want to develop in the AR space.

The first is the platform, which is simply the operating system an app is specifically built for.
e.g., Code written for an iOS app cannot be shared with an Android app.

The engine is the software that converts, powers, and renders different types of data into content.
e.g., Creating a sphere in Unity and giving it the ability to bounce.

A framework is a collection of predefined code that enables quicker development.
e.g., Rather than writing lighting code from scratch, SceneKit for iOS offers built-in lights.

One of the most important is the SDK (Software Development Kit): a collection of third-party tools and frameworks that supports or adds new functionality to an app.
e.g., Adding voice support using Twilio.

Different types of Engines

Engines can come in several forms. A gaming engine like Unity has its own interface for real-time 3D authoring, but it can also be embedded inside an application. A rendering engine like V-Ray comes as a plug-in for other 3D applications. Software can also use multiple engines depending on the users’ needs. Keeping a sphere in mind, there are four common engines that could be used to add functionality to it:

The Graphics Engine is the primary technology that draws images on the screen.
e.g., Draws the sphere’s shape.

The Rendering Engine is still a graphics engine, but one specialized for transforming 3D models into stylized or photorealistic images and video.
e.g., Makes the sphere look hyper-realistic.

The Physics Engine simulates how objects react under real-world constraints such as gravity and collisions.
e.g., Makes the sphere bounce.

The Gaming Engine is a software environment that bundles components like a physics engine together with support for particles, audio, logic, and AI.
e.g., Makes the sphere change colors when moved up or down.
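As a rough sketch of what a physics engine does under the hood, the following Python snippet steps a sphere under gravity and bounces it off the floor. The constants and time step are illustrative choices, not any real engine's values.

```python
# A minimal physics-engine step: gravity plus a floor bounce.
GRAVITY = -9.81    # m/s^2
RESTITUTION = 0.8  # fraction of speed kept after a bounce
DT = 0.01          # simulation time step in seconds

def step(y, vy):
    """Advance the sphere's height and vertical velocity by one tick."""
    vy += GRAVITY * DT
    y += vy * DT
    if y <= 0.0:                # hit the floor
        y = 0.0
        vy = -vy * RESTITUTION  # bounce back up, losing some energy
    return y, vy

y, vy = 2.0, 0.0        # drop the sphere from 2 m at rest
for _ in range(1000):   # simulate 10 seconds
    y, vy = step(y, vy)

assert y >= 0.0  # the sphere never falls through the floor
```

A real physics engine runs a loop like this every frame for every object, while also resolving collisions between objects, friction, and much more.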

Common SDKs and Frameworks

These tools make it so that a developer doesn’t have to code an application from scratch each time. However, certain frameworks may require specialized knowledge to code for crucial features.

There are tons of AR SDKs and frameworks on the market at the moment. Some kits accommodate several platforms, whereas others are made for a single platform and/or device only (for example, only Apple or only Android devices).

  • Vuforia: Works on iOS, Android, and UWP (Universal Windows Platform).
  • ARKit: Developed by Apple, iOS only.
  • ARCore: Developed by Google (successor to Project Tango), Android only.

Feel free to check out some of the projects I have developed using these SDKs: the Roll-A-Ball Game and the AR Construction App. Stay tuned for more projects, videos, and article tutorials.

What’s Next

The percentage of users with a mobile phone is forecast to keep growing, reaching about 67 percent by 2019. It only makes sense to continue designing for mobile, as this is one of the best ways to make sure an experience gets into the hands of as many users as possible. Not everyone has smart glasses right now, but nearly everyone has a phone.

Unlike phones, glasses are a more seamless way of experiencing an environment, since they lock the content to your peripheral vision. Glasses also make it easier to use gestures and freely move your hands, which introduces a new set of interactions that designers have already started to explore. There is a race for consumer-friendly AR headsets, and the number of manufacturers is increasing rapidly, which means it’s only a matter of time before an affordable headset makes waves in the consumer market.

With so much happening in this space, it is safe to say that design and experience will continue to be the differentiator between these products. And in order to make the largest impact and stay up to date with this growth, we can all start by developing AR and VR applications for mobile.

Please be aware that this is Part 2 of a two-part series. If you enjoyed this series, feel free to connect with me over LinkedIn.

If you enjoyed reading this article, please press the 👏 button, and follow me to stay updated on my future articles. Also, feel free to share this article with others!

I’m a developer & innovator who enjoys building products and researching ways we can use AI, Blockchain & robotics to solve problems in healthcare and energy!